DPDK patches and discussions
* [dpdk-dev] [PATCH v1 0/6] fix distributor synchronization issues
       [not found] <CGME20200915193456eucas1p1a38a0bb16e9c81ed587f916aeb8c41e5@eucas1p1.samsung.com>
@ 2020-09-15 19:34 ` Lukasz Wojciechowski
       [not found]   ` <CGME20200915193457eucas1p2adbe25c41a0e4ef16c029e7bff104503@eucas1p2.samsung.com>
                     ` (6 more replies)
  0 siblings, 7 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-09-15 19:34 UTC (permalink / raw)
  Cc: dev, l.wojciechow

During review and verification of the patch created by Sarosh Arif,
"test_distributor: prevent memory leakages from the pool", I found out
that running the distributor unit tests multiple times in a row causes
failures. So I investigated all the issues I found.

There are a few synchronization issues that might cause deadlocks
or corrupted data. They are fixed by this set of patches, covering both
the tests and the librte_distributor library.

Lukasz Wojciechowski (6):
  app/test: fix deadlock in distributor test
  app/test: synchronize statistics between lcores
  app/test: fix freeing mbufs in distributor tests
  app/test: collect return mbufs in distributor test
  distributor: fix missing handshake synchronization
  distributor: fix handshake deadlock

 app/test/test_distributor.c              | 98 ++++++++++++++----------
 lib/librte_distributor/rte_distributor.c | 23 +++++-
 2 files changed, 79 insertions(+), 42 deletions(-)

-- 
2.17.1



* [dpdk-dev] [PATCH v1 1/6] app/test: fix deadlock in distributor test
       [not found]   ` <CGME20200915193457eucas1p2adbe25c41a0e4ef16c029e7bff104503@eucas1p2.samsung.com>
@ 2020-09-15 19:34     ` Lukasz Wojciechowski
  2020-09-17 11:21       ` David Hunt
  0 siblings, 1 reply; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-09-15 19:34 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

The sanity test with worker shutdown delegates all bufs
to be processed by a single lcore worker, then it freezes
one of the lcore workers and continues to send more bufs.

The problem occurred if the frozen lcore was the same as the one
processing the mbufs. The lcore processing the mbufs
might be different every time the test is launched.
This is caused by keeping the value of the wkr static variable
in the rte_distributor_process function between test case runs.

The test always froze the lcore with id 0. The patch avoids
a possible collision by freezing the lcore given by zero_idx instead.
The lcore that receives the data updates zero_idx, so it is never
frozen itself.

To reproduce the issue fixed by this patch, run the
distributor_autotest command in the test app several times in a row.
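
For example (a sketch; the path to the test binary depends on your
build configuration):

    $ ./build/app/test/dpdk-test
    RTE>>distributor_autotest
    RTE>>distributor_autotest

Repeating the command enough times in one session eventually hangs
the test without this fix.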

Fixes: c3eabff124e6 ("distributor: add unit tests")
Cc: bruce.richardson@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
---
 app/test/test_distributor.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index ba1f81cf8..35b25463a 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -28,6 +28,7 @@ struct worker_params worker_params;
 static volatile int quit;      /**< general quit variable for all threads */
 static volatile int zero_quit; /**< var for when we just want thr0 to quit*/
 static volatile unsigned worker_idx;
+static volatile unsigned zero_idx;
 
 struct worker_stats {
 	volatile unsigned handled_packets;
@@ -346,27 +347,43 @@ handle_work_for_shutdown_test(void *arg)
 	unsigned int total = 0;
 	unsigned int i;
 	unsigned int returned = 0;
+	unsigned int zero_id = 0;
 	const unsigned int id = __atomic_fetch_add(&worker_idx, 1,
 			__ATOMIC_RELAXED);
 
 	num = rte_distributor_get_pkt(d, id, buf, buf, num);
 
+	zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
+	if (id == zero_id && num > 0) {
+		zero_id = (zero_id + 1) %  __atomic_load_n(&worker_idx,
+			__ATOMIC_ACQUIRE);
+		__atomic_store_n(&zero_idx, zero_id, __ATOMIC_RELEASE);
+	}
+
 	/* wait for quit single globally, or for worker zero, wait
 	 * for zero_quit */
-	while (!quit && !(id == 0 && zero_quit)) {
+	while (!quit && !(id == zero_id && zero_quit)) {
 		worker_stats[id].handled_packets += num;
 		count += num;
 		for (i = 0; i < num; i++)
 			rte_pktmbuf_free(buf[i]);
 		num = rte_distributor_get_pkt(d,
 				id, buf, buf, num);
+
+		zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
+		if (id == zero_id && num > 0) {
+			zero_id = (zero_id + 1) %  __atomic_load_n(&worker_idx,
+				__ATOMIC_ACQUIRE);
+			__atomic_store_n(&zero_idx, zero_id, __ATOMIC_RELEASE);
+		}
+
 		total += num;
 	}
 	worker_stats[id].handled_packets += num;
 	count += num;
 	returned = rte_distributor_return_pkt(d, id, buf, num);
 
-	if (id == 0) {
+	if (id == zero_id) {
 		/* for worker zero, allow it to restart to pick up last packet
 		 * when all workers are shutting down.
 		 */
@@ -586,6 +603,7 @@ quit_workers(struct worker_params *wp, struct rte_mempool *p)
 	rte_eal_mp_wait_lcore();
 	quit = 0;
 	worker_idx = 0;
+	zero_idx = 0;
 }
 
 static int
-- 
2.17.1



* [dpdk-dev] [PATCH v1 2/6] app/test: synchronize statistics between lcores
       [not found]   ` <CGME20200915193457eucas1p2321d28b6abf69f244cd7c1e61ed0620e@eucas1p2.samsung.com>
@ 2020-09-15 19:34     ` Lukasz Wojciechowski
  2020-09-17 11:50       ` David Hunt
  0 siblings, 1 reply; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-09-15 19:34 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

Statistics of handled packets are cleared and read on the main lcore,
while they are incremented in the workers' handlers on different lcores.

Without synchronization, they occasionally showed invalid values.
This patch uses atomic acquire/release mechanisms to synchronize access.
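
The synchronization pattern, extracted from the diff below into a
minimal sketch:

    /* worker lcore: publish the updated counter */
    __atomic_fetch_add(&worker_stats[id].handled_packets, num,
            __ATOMIC_ACQ_REL);

    /* main lcore: read the counter with matching acquire semantics */
    count += __atomic_load_n(&worker_stats[i].handled_packets,
            __ATOMIC_ACQUIRE);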

Fixes: c3eabff124e6 ("distributor: add unit tests")
Cc: bruce.richardson@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
---
 app/test/test_distributor.c | 39 ++++++++++++++++++++++++-------------
 1 file changed, 26 insertions(+), 13 deletions(-)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index 35b25463a..0e49e3714 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -43,7 +43,8 @@ total_packet_count(void)
 {
 	unsigned i, count = 0;
 	for (i = 0; i < worker_idx; i++)
-		count += worker_stats[i].handled_packets;
+		count += __atomic_load_n(&worker_stats[i].handled_packets,
+				__ATOMIC_ACQUIRE);
 	return count;
 }
 
@@ -52,6 +53,7 @@ static inline void
 clear_packet_count(void)
 {
 	memset(&worker_stats, 0, sizeof(worker_stats));
+	rte_atomic_thread_fence(__ATOMIC_RELEASE);
 }
 
 /* this is the basic worker function for sanity test
@@ -72,13 +74,13 @@ handle_work(void *arg)
 	num = rte_distributor_get_pkt(db, id, buf, buf, num);
 	while (!quit) {
 		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
-				__ATOMIC_RELAXED);
+				__ATOMIC_ACQ_REL);
 		count += num;
 		num = rte_distributor_get_pkt(db, id,
 				buf, buf, num);
 	}
 	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
-			__ATOMIC_RELAXED);
+			__ATOMIC_ACQ_REL);
 	count += num;
 	rte_distributor_return_pkt(db, id, buf, num);
 	return 0;
@@ -134,7 +136,8 @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
 
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
-				worker_stats[i].handled_packets);
+			__atomic_load_n(&worker_stats[i].handled_packets,
+					__ATOMIC_ACQUIRE));
 	printf("Sanity test with all zero hashes done.\n");
 
 	/* pick two flows and check they go correctly */
@@ -159,7 +162,9 @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
 
 		for (i = 0; i < rte_lcore_count() - 1; i++)
 			printf("Worker %u handled %u packets\n", i,
-					worker_stats[i].handled_packets);
+				__atomic_load_n(
+					&worker_stats[i].handled_packets,
+					__ATOMIC_ACQUIRE));
 		printf("Sanity test with two hash values done\n");
 	}
 
@@ -185,7 +190,8 @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
 
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
-				worker_stats[i].handled_packets);
+			__atomic_load_n(&worker_stats[i].handled_packets,
+					__ATOMIC_ACQUIRE));
 	printf("Sanity test with non-zero hashes done\n");
 
 	rte_mempool_put_bulk(p, (void *)bufs, BURST);
@@ -280,15 +286,17 @@ handle_work_with_free_mbufs(void *arg)
 		buf[i] = NULL;
 	num = rte_distributor_get_pkt(d, id, buf, buf, num);
 	while (!quit) {
-		worker_stats[id].handled_packets += num;
 		count += num;
+		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
+				__ATOMIC_ACQ_REL);
 		for (i = 0; i < num; i++)
 			rte_pktmbuf_free(buf[i]);
 		num = rte_distributor_get_pkt(d,
 				id, buf, buf, num);
 	}
-	worker_stats[id].handled_packets += num;
 	count += num;
+	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
+			__ATOMIC_ACQ_REL);
 	rte_distributor_return_pkt(d, id, buf, num);
 	return 0;
 }
@@ -363,8 +371,9 @@ handle_work_for_shutdown_test(void *arg)
 	/* wait for quit single globally, or for worker zero, wait
 	 * for zero_quit */
 	while (!quit && !(id == zero_id && zero_quit)) {
-		worker_stats[id].handled_packets += num;
 		count += num;
+		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
+				__ATOMIC_ACQ_REL);
 		for (i = 0; i < num; i++)
 			rte_pktmbuf_free(buf[i]);
 		num = rte_distributor_get_pkt(d,
@@ -379,10 +388,11 @@ handle_work_for_shutdown_test(void *arg)
 
 		total += num;
 	}
-	worker_stats[id].handled_packets += num;
 	count += num;
 	returned = rte_distributor_return_pkt(d, id, buf, num);
 
+	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
+			__ATOMIC_ACQ_REL);
 	if (id == zero_id) {
 		/* for worker zero, allow it to restart to pick up last packet
 		 * when all workers are shutting down.
@@ -394,10 +404,11 @@ handle_work_for_shutdown_test(void *arg)
 				id, buf, buf, num);
 
 		while (!quit) {
-			worker_stats[id].handled_packets += num;
 			count += num;
 			rte_pktmbuf_free(pkt);
 			num = rte_distributor_get_pkt(d, id, buf, buf, num);
+			__atomic_fetch_add(&worker_stats[id].handled_packets,
+					num, __ATOMIC_ACQ_REL);
 		}
 		returned = rte_distributor_return_pkt(d,
 				id, buf, num);
@@ -461,7 +472,8 @@ sanity_test_with_worker_shutdown(struct worker_params *wp,
 
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
-				worker_stats[i].handled_packets);
+			__atomic_load_n(&worker_stats[i].handled_packets,
+					__ATOMIC_ACQUIRE));
 
 	if (total_packet_count() != BURST * 2) {
 		printf("Line %d: Error, not all packets flushed. "
@@ -514,7 +526,8 @@ test_flush_with_worker_shutdown(struct worker_params *wp,
 	zero_quit = 0;
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
-				worker_stats[i].handled_packets);
+			__atomic_load_n(&worker_stats[i].handled_packets,
+					__ATOMIC_ACQUIRE));
 
 	if (total_packet_count() != BURST) {
 		printf("Line %d: Error, not all packets flushed. "
-- 
2.17.1



* [dpdk-dev] [PATCH v1 3/6] app/test: fix freeing mbufs in distributor tests
       [not found]   ` <CGME20200915193458eucas1p1d9308e63063eda28f96eedba3a361a2b@eucas1p1.samsung.com>
@ 2020-09-15 19:34     ` Lukasz Wojciechowski
  2020-09-17 12:34       ` David Hunt
  2020-09-22 12:42       ` David Marchand
  0 siblings, 2 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-09-15 19:34 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

The sanity tests with mbuf alloc and the shutdown tests assume that
mbufs passed to worker cores are freed in the handlers.
Such packets should not be returned to the distributor's main
core. The only packets that should be returned are those
sent after completion of the tests in the quit_workers function.

This patch fixes the freeing of mbufs, stops returning them
to the distributor's core, and cleans up unused variables.

Fixes: c0de0eb82e40 ("distributor: switch over to new API")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
---
 app/test/test_distributor.c | 37 +++++++++++--------------------------
 1 file changed, 11 insertions(+), 26 deletions(-)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index 0e49e3714..da13a9a3f 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -277,24 +277,21 @@ handle_work_with_free_mbufs(void *arg)
 	struct rte_mbuf *buf[8] __rte_cache_aligned;
 	struct worker_params *wp = arg;
 	struct rte_distributor *d = wp->dist;
-	unsigned int count = 0;
 	unsigned int i;
 	unsigned int num = 0;
 	unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED);
 
 	for (i = 0; i < 8; i++)
 		buf[i] = NULL;
-	num = rte_distributor_get_pkt(d, id, buf, buf, num);
+	num = rte_distributor_get_pkt(d, id, buf, buf, 0);
 	while (!quit) {
-		count += num;
 		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
 				__ATOMIC_ACQ_REL);
 		for (i = 0; i < num; i++)
 			rte_pktmbuf_free(buf[i]);
 		num = rte_distributor_get_pkt(d,
-				id, buf, buf, num);
+				id, buf, buf, 0);
 	}
-	count += num;
 	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
 			__ATOMIC_ACQ_REL);
 	rte_distributor_return_pkt(d, id, buf, num);
@@ -322,7 +319,6 @@ sanity_test_with_mbuf_alloc(struct worker_params *wp, struct rte_mempool *p)
 			rte_distributor_process(d, NULL, 0);
 		for (j = 0; j < BURST; j++) {
 			bufs[j]->hash.usr = (i+j) << 1;
-			rte_mbuf_refcnt_set(bufs[j], 1);
 		}
 
 		rte_distributor_process(d, bufs, BURST);
@@ -346,20 +342,15 @@ sanity_test_with_mbuf_alloc(struct worker_params *wp, struct rte_mempool *p)
 static int
 handle_work_for_shutdown_test(void *arg)
 {
-	struct rte_mbuf *pkt = NULL;
 	struct rte_mbuf *buf[8] __rte_cache_aligned;
 	struct worker_params *wp = arg;
 	struct rte_distributor *d = wp->dist;
-	unsigned int count = 0;
 	unsigned int num = 0;
-	unsigned int total = 0;
 	unsigned int i;
-	unsigned int returned = 0;
 	unsigned int zero_id = 0;
 	const unsigned int id = __atomic_fetch_add(&worker_idx, 1,
 			__ATOMIC_RELAXED);
-
-	num = rte_distributor_get_pkt(d, id, buf, buf, num);
+	num = rte_distributor_get_pkt(d, id, buf, buf, 0);
 
 	zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
 	if (id == zero_id && num > 0) {
@@ -371,13 +362,12 @@ handle_work_for_shutdown_test(void *arg)
 	/* wait for quit single globally, or for worker zero, wait
 	 * for zero_quit */
 	while (!quit && !(id == zero_id && zero_quit)) {
-		count += num;
 		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
 				__ATOMIC_ACQ_REL);
 		for (i = 0; i < num; i++)
 			rte_pktmbuf_free(buf[i]);
 		num = rte_distributor_get_pkt(d,
-				id, buf, buf, num);
+				id, buf, buf, 0);
 
 		zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
 		if (id == zero_id && num > 0) {
@@ -385,12 +375,7 @@ handle_work_for_shutdown_test(void *arg)
 				__ATOMIC_ACQUIRE);
 			__atomic_store_n(&zero_idx, zero_id, __ATOMIC_RELEASE);
 		}
-
-		total += num;
 	}
-	count += num;
-	returned = rte_distributor_return_pkt(d, id, buf, num);
-
 	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
 			__ATOMIC_ACQ_REL);
 	if (id == zero_id) {
@@ -400,20 +385,20 @@ handle_work_for_shutdown_test(void *arg)
 		while (zero_quit)
 			usleep(100);
 
+		for (i = 0; i < num; i++)
+			rte_pktmbuf_free(buf[i]);
 		num = rte_distributor_get_pkt(d,
-				id, buf, buf, num);
+				id, buf, buf, 0);
 
 		while (!quit) {
-			count += num;
-			rte_pktmbuf_free(pkt);
-			num = rte_distributor_get_pkt(d, id, buf, buf, num);
 			__atomic_fetch_add(&worker_stats[id].handled_packets,
 					num, __ATOMIC_ACQ_REL);
+			for (i = 0; i < num; i++)
+				rte_pktmbuf_free(buf[i]);
+			num = rte_distributor_get_pkt(d, id, buf, buf, 0);
 		}
-		returned = rte_distributor_return_pkt(d,
-				id, buf, num);
-		printf("Num returned = %d\n", returned);
 	}
+	rte_distributor_return_pkt(d, id, buf, num);
 	return 0;
 }
 
-- 
2.17.1



* [dpdk-dev] [PATCH v1 4/6] app/test: collect return mbufs in distributor test
       [not found]   ` <CGME20200915193459eucas1p19f5d1cbea87d7dc3bbd2638cdb96a31b@eucas1p1.samsung.com>
@ 2020-09-15 19:34     ` Lukasz Wojciechowski
  2020-09-17 12:37       ` David Hunt
  0 siblings, 1 reply; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-09-15 19:34 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

During the quit_workers function, the distributor's main core processes
some packets to wake up pending worker cores so they can quit.
As quit_workers also acts as a cleanup procedure for the next test
case, it should collect the packets returned by the workers'
handlers, so that the distributor's cyclic buffer of returned packets
remains empty.
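
A minimal sketch of the drain loop added below (d and bufs as in the
test code):

    /* empty the distributor's returns ring before the next test case */
    while (rte_distributor_returned_pkts(d, bufs, RTE_MAX_LCORE) > 0)
        ;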

Fixes: c3eabff124e6 ("distributor: add unit tests")
Cc: bruce.richardson@intel.com
Fixes: c0de0eb82e40 ("distributor: switch over to new API")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
---
 app/test/test_distributor.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index da13a9a3f..13c6397cc 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -599,6 +599,10 @@ quit_workers(struct worker_params *wp, struct rte_mempool *p)
 	rte_distributor_process(d, NULL, 0);
 	rte_distributor_flush(d);
 	rte_eal_mp_wait_lcore();
+
+	while (rte_distributor_returned_pkts(d, bufs, RTE_MAX_LCORE))
+		;
+
 	quit = 0;
 	worker_idx = 0;
 	zero_idx = 0;
-- 
2.17.1



* [dpdk-dev] [PATCH v1 5/6] distributor: fix missing handshake synchronization
       [not found]   ` <CGME20200915193500eucas1p2b079e1dcfd2d54e01a5630609b82b370@eucas1p2.samsung.com>
@ 2020-09-15 19:34     ` Lukasz Wojciechowski
  2020-09-17 13:22       ` David Hunt
  0 siblings, 1 reply; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-09-15 19:34 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

The rte_distributor_return_pkt function, which runs on worker cores,
must wait for the distributor core to clear the handshake on retptr64
before using those buffers. While the handshake is set, the distributor
core controls the buffers, and any operation on the worker side might
overwrite buffers that have not been read yet.
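
The fix below makes the worker spin until the flag is cleared, with a
short TSC-based backoff between polls; the pattern in isolation, as a
sketch using the same primitives as the diff (retptr64 points at
buf->retptr64[0], as in the patch):

    /* spin until the distributor clears RTE_DISTRIB_GET_BUF */
    while (__atomic_load_n(retptr64, __ATOMIC_ACQUIRE)
            & RTE_DISTRIB_GET_BUF) {
        uint64_t t = rte_rdtsc() + 100; /* back off for ~100 cycles */

        while (rte_rdtsc() < t)
            rte_pause();
    }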

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
---
 lib/librte_distributor/rte_distributor.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 1c047f065..89493c331 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -160,6 +160,7 @@ rte_distributor_return_pkt(struct rte_distributor *d,
 {
 	struct rte_distributor_buffer *buf = &d->bufs[worker_id];
 	unsigned int i;
+	volatile int64_t *retptr64;
 
 	if (unlikely(d->alg_type == RTE_DIST_ALG_SINGLE)) {
 		if (num == 1)
@@ -169,6 +170,19 @@ rte_distributor_return_pkt(struct rte_distributor *d,
 			return -EINVAL;
 	}
 
+	retptr64 = &(buf->retptr64[0]);
+	/* Spin while handshake bits are set (scheduler clears it).
+	 * Sync with worker on GET_BUF flag.
+	 */
+	while (unlikely(__atomic_load_n(retptr64, __ATOMIC_ACQUIRE)
+			& RTE_DISTRIB_GET_BUF)) {
+		rte_pause();
+		uint64_t t = rte_rdtsc()+100;
+
+		while (rte_rdtsc() < t)
+			rte_pause();
+	}
+
 	/* Sync with distributor to acquire retptrs */
 	__atomic_thread_fence(__ATOMIC_ACQUIRE);
 	for (i = 0; i < RTE_DIST_BURST_SIZE; i++)
-- 
2.17.1



* [dpdk-dev] [PATCH v1 6/6] distributor: fix handshake deadlock
       [not found]   ` <CGME20200915193501eucas1p2333f0b08077c06ba04b89ce192072f9a@eucas1p2.samsung.com>
@ 2020-09-15 19:34     ` Lukasz Wojciechowski
  2020-09-17 13:28       ` David Hunt
  0 siblings, 1 reply; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-09-15 19:34 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

Synchronization of data exchange between the distributor and worker
cores is based on two handshakes: retptr64 for returning mbufs from
workers to the distributor, and bufptr64 for passing mbufs to workers.

Without verifying those two handshakes in the proper order, a deadlock
may occur: the worker core wants to return mbufs and waits for the
retptr handshake to be cleared, while the distributor core waits on
bufptr to send mbufs to the worker.

This can happen because the worker core first returns mbufs to the
distributor and only then gets new ones, while the distributor first
releases mbufs to the worker and only then handles the returns.

This patch removes the possibility of the deadlock by always handling
returned packets first on the distributor side, and by handling returns
while waiting to release new packets.
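
A sketch of the circular wait, simplified from the code paths involved
(worker side as added by the previous patch, distributor side from
release()):

    /* worker core, in rte_distributor_return_pkt(): waits for the
     * distributor to clear the handshake on retptr64
     */
    while (__atomic_load_n(retptr64, __ATOMIC_ACQUIRE)
            & RTE_DISTRIB_GET_BUF)
        rte_pause();

    /* distributor core, in release(): waits for the worker to set the
     * handshake on bufptr64 -- which the worker above cannot do until
     * the distributor runs handle_returns()
     */
    while (!(__atomic_load_n(&(d->bufs[wkr].bufptr64[0]), __ATOMIC_ACQUIRE)
            & RTE_DISTRIB_GET_BUF))
        rte_pause();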

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
---
 lib/librte_distributor/rte_distributor.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 89493c331..12b3db33c 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -321,12 +321,14 @@ release(struct rte_distributor *d, unsigned int wkr)
 	struct rte_distributor_buffer *buf = &(d->bufs[wkr]);
 	unsigned int i;
 
+	handle_returns(d, wkr);
+
 	/* Sync with worker on GET_BUF flag */
 	while (!(__atomic_load_n(&(d->bufs[wkr].bufptr64[0]), __ATOMIC_ACQUIRE)
-		& RTE_DISTRIB_GET_BUF))
+		& RTE_DISTRIB_GET_BUF)) {
+		handle_returns(d, wkr);
 		rte_pause();
-
-	handle_returns(d, wkr);
+	}
 
 	buf->count = 0;
 
@@ -376,6 +378,7 @@ rte_distributor_process(struct rte_distributor *d,
 		/* Flush out all non-full cache-lines to workers. */
 		for (wid = 0 ; wid < d->num_workers; wid++) {
 			/* Sync with worker on GET_BUF flag. */
+			handle_returns(d, wid);
 			if (__atomic_load_n(&(d->bufs[wid].bufptr64[0]),
 				__ATOMIC_ACQUIRE) & RTE_DISTRIB_GET_BUF) {
 				release(d, wid);
-- 
2.17.1



* Re: [dpdk-dev] [PATCH v1 1/6] app/test: fix deadlock in distributor test
  2020-09-15 19:34     ` [dpdk-dev] [PATCH v1 1/6] app/test: fix deadlock in distributor test Lukasz Wojciechowski
@ 2020-09-17 11:21       ` David Hunt
  2020-09-17 14:01         ` Lukasz Wojciechowski
  0 siblings, 1 reply; 164+ messages in thread
From: David Hunt @ 2020-09-17 11:21 UTC (permalink / raw)
  To: Lukasz Wojciechowski, Bruce Richardson; +Cc: dev, stable

Hi Lukasz,

On 15/9/2020 8:34 PM, Lukasz Wojciechowski wrote:
> The sanity test with worker shutdown delegates all bufs
> to be processed by a single lcore worker, then it freezes
> one of the lcore workers and continues to send more bufs.
>
> The problem occurred if the frozen lcore was the same as the one
> processing the mbufs. The lcore processing the mbufs
> might be different every time the test is launched.
> This is caused by keeping the value of the wkr static variable
> in the rte_distributor_process function between test case runs.
>
> The test always froze the lcore with id 0. The patch avoids
> a possible collision by freezing the lcore given by zero_idx instead.
> The lcore that receives the data updates zero_idx, so it is never
> frozen itself.
>
> To reproduce the issue fixed by this patch, run the
> distributor_autotest command in the test app several times in a row.
>
> Fixes: c3eabff124e6 ("distributor: add unit tests")
> Cc: bruce.richardson@intel.com
> Cc: stable@dpdk.org
>
> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
> ---
>   app/test/test_distributor.c | 22 ++++++++++++++++++++--
>   1 file changed, 20 insertions(+), 2 deletions(-)
>
> diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
> index ba1f81cf8..35b25463a 100644
> --- a/app/test/test_distributor.c
> +++ b/app/test/test_distributor.c
> @@ -28,6 +28,7 @@ struct worker_params worker_params;
>   static volatile int quit;      /**< general quit variable for all threads */
>   static volatile int zero_quit; /**< var for when we just want thr0 to quit*/
>   static volatile unsigned worker_idx;
> +static volatile unsigned zero_idx;
>   
>   struct worker_stats {
>   	volatile unsigned handled_packets;
> @@ -346,27 +347,43 @@ handle_work_for_shutdown_test(void *arg)
>   	unsigned int total = 0;
>   	unsigned int i;
>   	unsigned int returned = 0;
> +	unsigned int zero_id = 0;
>   	const unsigned int id = __atomic_fetch_add(&worker_idx, 1,
>   			__ATOMIC_RELAXED);
>   
>   	num = rte_distributor_get_pkt(d, id, buf, buf, num);
>   
> +	zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
> +	if (id == zero_id && num > 0) {
> +		zero_id = (zero_id + 1) %  __atomic_load_n(&worker_idx,
> +			__ATOMIC_ACQUIRE);
> +		__atomic_store_n(&zero_idx, zero_id, __ATOMIC_RELEASE);
> +	}
> +
>   	/* wait for quit single globally, or for worker zero, wait
>   	 * for zero_quit */
> -	while (!quit && !(id == 0 && zero_quit)) {
> +	while (!quit && !(id == zero_id && zero_quit)) {
>   		worker_stats[id].handled_packets += num;
>   		count += num;
>   		for (i = 0; i < num; i++)
>   			rte_pktmbuf_free(buf[i]);
>   		num = rte_distributor_get_pkt(d,
>   				id, buf, buf, num);
> +
> +		zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
> +		if (id == zero_id && num > 0) {
> +			zero_id = (zero_id + 1) %  __atomic_load_n(&worker_idx,
> +				__ATOMIC_ACQUIRE);
> +			__atomic_store_n(&zero_idx, zero_id, __ATOMIC_RELEASE);
> +		}
> +
>   		total += num;
>   	}
>   	worker_stats[id].handled_packets += num;
>   	count += num;
>   	returned = rte_distributor_return_pkt(d, id, buf, num);
>   
> -	if (id == 0) {
> +	if (id == zero_id) {
>   		/* for worker zero, allow it to restart to pick up last packet
>   		 * when all workers are shutting down.
>   		 */
> @@ -586,6 +603,7 @@ quit_workers(struct worker_params *wp, struct rte_mempool *p)
>   	rte_eal_mp_wait_lcore();
>   	quit = 0;
>   	worker_idx = 0;
> +	zero_idx = 0;
>   }
>   
>   static int


The lockup is reproducible if you run the distributor_autotest 19 times
in succession. I was able to run the test many more times than that with
the patch applied. Thanks.

Tested-by: David Hunt <david.hunt@intel.com>






* Re: [dpdk-dev] [PATCH v1 2/6] app/test: synchronize statistics between lcores
  2020-09-15 19:34     ` [dpdk-dev] [PATCH v1 2/6] app/test: synchronize statistics between lcores Lukasz Wojciechowski
@ 2020-09-17 11:50       ` David Hunt
  0 siblings, 0 replies; 164+ messages in thread
From: David Hunt @ 2020-09-17 11:50 UTC (permalink / raw)
  To: Lukasz Wojciechowski, Bruce Richardson; +Cc: dev, stable


On 15/9/2020 8:34 PM, Lukasz Wojciechowski wrote:
> Statistics of handled packets are cleared and read on the main lcore,
> while they are incremented in the workers' handlers on different lcores.
>
> Without synchronization, they occasionally showed invalid values.
> This patch uses atomic acquire/release mechanisms to synchronize access.
>
> Fixes: c3eabff124e6 ("distributor: add unit tests")
> Cc: bruce.richardson@intel.com
> Cc: stable@dpdk.org
>
> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
> ---
>   app/test/test_distributor.c | 39 ++++++++++++++++++++++++-------------
>   1 file changed, 26 insertions(+), 13 deletions(-)
>
> diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
> index 35b25463a..0e49e3714 100644
> --- a/app/test/test_distributor.c
> +++ b/app/test/test_distributor.c
> @@ -43,7 +43,8 @@ total_packet_count(void)
>   {
>   	unsigned i, count = 0;
>   	for (i = 0; i < worker_idx; i++)
> -		count += worker_stats[i].handled_packets;
> +		count += __atomic_load_n(&worker_stats[i].handled_packets,
> +				__ATOMIC_ACQUIRE);
>   	return count;
>   }
>   
> @@ -52,6 +53,7 @@ static inline void
>   clear_packet_count(void)
>   {
>   	memset(&worker_stats, 0, sizeof(worker_stats));
> +	rte_atomic_thread_fence(__ATOMIC_RELEASE);
>   }
>   
>   /* this is the basic worker function for sanity test
> @@ -72,13 +74,13 @@ handle_work(void *arg)
>   	num = rte_distributor_get_pkt(db, id, buf, buf, num);
>   	while (!quit) {
>   		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
> -				__ATOMIC_RELAXED);
> +				__ATOMIC_ACQ_REL);
>   		count += num;
>   		num = rte_distributor_get_pkt(db, id,
>   				buf, buf, num);
>   	}
>   	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
> -			__ATOMIC_RELAXED);
> +			__ATOMIC_ACQ_REL);
>   	count += num;
>   	rte_distributor_return_pkt(db, id, buf, num);
>   	return 0;
> @@ -134,7 +136,8 @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
>   
>   	for (i = 0; i < rte_lcore_count() - 1; i++)
>   		printf("Worker %u handled %u packets\n", i,
> -				worker_stats[i].handled_packets);
> +			__atomic_load_n(&worker_stats[i].handled_packets,
> +					__ATOMIC_ACQUIRE));
>   	printf("Sanity test with all zero hashes done.\n");
>   
>   	/* pick two flows and check they go correctly */
> @@ -159,7 +162,9 @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
>   
>   		for (i = 0; i < rte_lcore_count() - 1; i++)
>   			printf("Worker %u handled %u packets\n", i,
> -					worker_stats[i].handled_packets);
> +				__atomic_load_n(
> +					&worker_stats[i].handled_packets,
> +					__ATOMIC_ACQUIRE));
>   		printf("Sanity test with two hash values done\n");
>   	}
>   
> @@ -185,7 +190,8 @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
>   
>   	for (i = 0; i < rte_lcore_count() - 1; i++)
>   		printf("Worker %u handled %u packets\n", i,
> -				worker_stats[i].handled_packets);
> +			__atomic_load_n(&worker_stats[i].handled_packets,
> +					__ATOMIC_ACQUIRE));
>   	printf("Sanity test with non-zero hashes done\n");
>   
>   	rte_mempool_put_bulk(p, (void *)bufs, BURST);
> @@ -280,15 +286,17 @@ handle_work_with_free_mbufs(void *arg)
>   		buf[i] = NULL;
>   	num = rte_distributor_get_pkt(d, id, buf, buf, num);
>   	while (!quit) {
> -		worker_stats[id].handled_packets += num;
>   		count += num;
> +		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
> +				__ATOMIC_ACQ_REL);
>   		for (i = 0; i < num; i++)
>   			rte_pktmbuf_free(buf[i]);
>   		num = rte_distributor_get_pkt(d,
>   				id, buf, buf, num);
>   	}
> -	worker_stats[id].handled_packets += num;
>   	count += num;
> +	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
> +			__ATOMIC_ACQ_REL);
>   	rte_distributor_return_pkt(d, id, buf, num);
>   	return 0;
>   }
> @@ -363,8 +371,9 @@ handle_work_for_shutdown_test(void *arg)
>   	/* wait for quit single globally, or for worker zero, wait
>   	 * for zero_quit */
>   	while (!quit && !(id == zero_id && zero_quit)) {
> -		worker_stats[id].handled_packets += num;
>   		count += num;
> +		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
> +				__ATOMIC_ACQ_REL);
>   		for (i = 0; i < num; i++)
>   			rte_pktmbuf_free(buf[i]);
>   		num = rte_distributor_get_pkt(d,
> @@ -379,10 +388,11 @@ handle_work_for_shutdown_test(void *arg)
>   
>   		total += num;
>   	}
> -	worker_stats[id].handled_packets += num;
>   	count += num;
>   	returned = rte_distributor_return_pkt(d, id, buf, num);
>   
> +	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
> +			__ATOMIC_ACQ_REL);
>   	if (id == zero_id) {
>   		/* for worker zero, allow it to restart to pick up last packet
>   		 * when all workers are shutting down.
> @@ -394,10 +404,11 @@ handle_work_for_shutdown_test(void *arg)
>   				id, buf, buf, num);
>   
>   		while (!quit) {
> -			worker_stats[id].handled_packets += num;
>   			count += num;
>   			rte_pktmbuf_free(pkt);
>   			num = rte_distributor_get_pkt(d, id, buf, buf, num);
> +			__atomic_fetch_add(&worker_stats[id].handled_packets,
> +					num, __ATOMIC_ACQ_REL);
>   		}
>   		returned = rte_distributor_return_pkt(d,
>   				id, buf, num);
> @@ -461,7 +472,8 @@ sanity_test_with_worker_shutdown(struct worker_params *wp,
>   
>   	for (i = 0; i < rte_lcore_count() - 1; i++)
>   		printf("Worker %u handled %u packets\n", i,
> -				worker_stats[i].handled_packets);
> +			__atomic_load_n(&worker_stats[i].handled_packets,
> +					__ATOMIC_ACQUIRE));
>   
>   	if (total_packet_count() != BURST * 2) {
>   		printf("Line %d: Error, not all packets flushed. "
> @@ -514,7 +526,8 @@ test_flush_with_worker_shutdown(struct worker_params *wp,
>   	zero_quit = 0;
>   	for (i = 0; i < rte_lcore_count() - 1; i++)
>   		printf("Worker %u handled %u packets\n", i,
> -				worker_stats[i].handled_packets);
> +			__atomic_load_n(&worker_stats[i].handled_packets,
> +					__ATOMIC_ACQUIRE));
>   
>   	if (total_packet_count() != BURST) {
>   		printf("Line %d: Error, not all packets flushed. "


Thanks.

Acked-by: David Hunt <david.hunt@intel.com>




* Re: [dpdk-dev] [PATCH v1 3/6] app/test: fix freeing mbufs in distributor tests
  2020-09-15 19:34     ` [dpdk-dev] [PATCH v1 3/6] app/test: fix freeing mbufs in distributor tests Lukasz Wojciechowski
@ 2020-09-17 12:34       ` David Hunt
  2020-09-22 12:42       ` David Marchand
  1 sibling, 0 replies; 164+ messages in thread
From: David Hunt @ 2020-09-17 12:34 UTC (permalink / raw)
  To: Lukasz Wojciechowski, Bruce Richardson; +Cc: dev, stable

Hi Lukasz,

On 15/9/2020 8:34 PM, Lukasz Wojciechowski wrote:
> The sanity tests with mbuf alloc and the shutdown tests assume that
> mbufs passed to worker cores are freed in the handlers.
> Such packets should not be returned to the distributor's main
> core. The only packets that should be returned are those
> sent after completion of the tests in the quit_workers function.
>
> This patch fixes the freeing of mbufs, stops returning them
> to the distributor's core, and cleans up unused variables.
>
> Fixes: c0de0eb82e40 ("distributor: switch over to new API")
> Cc: david.hunt@intel.com
> Cc: stable@dpdk.org
>
> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
> ---
>   app/test/test_distributor.c | 37 +++++++++++--------------------------
>   1 file changed, 11 insertions(+), 26 deletions(-)
>
> diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
> index 0e49e3714..da13a9a3f 100644
> --- a/app/test/test_distributor.c
> +++ b/app/test/test_distributor.c
> @@ -277,24 +277,21 @@ handle_work_with_free_mbufs(void *arg)
>   	struct rte_mbuf *buf[8] __rte_cache_aligned;
>   	struct worker_params *wp = arg;
>   	struct rte_distributor *d = wp->dist;
> -	unsigned int count = 0;
>   	unsigned int i;
>   	unsigned int num = 0;
>   	unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED);
>   
>   	for (i = 0; i < 8; i++)
>   		buf[i] = NULL;
> -	num = rte_distributor_get_pkt(d, id, buf, buf, num);
> +	num = rte_distributor_get_pkt(d, id, buf, buf, 0);
>   	while (!quit) {
> -		count += num;
>   		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
>   				__ATOMIC_ACQ_REL);
>   		for (i = 0; i < num; i++)
>   			rte_pktmbuf_free(buf[i]);
>   		num = rte_distributor_get_pkt(d,
> -				id, buf, buf, num);
> +				id, buf, buf, 0);
>   	}
> -	count += num;
>   	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
>   			__ATOMIC_ACQ_REL);
>   	rte_distributor_return_pkt(d, id, buf, num);
> @@ -322,7 +319,6 @@ sanity_test_with_mbuf_alloc(struct worker_params *wp, struct rte_mempool *p)
>   			rte_distributor_process(d, NULL, 0);
>   		for (j = 0; j < BURST; j++) {
>   			bufs[j]->hash.usr = (i+j) << 1;
> -			rte_mbuf_refcnt_set(bufs[j], 1);
>   		}
>   
>   		rte_distributor_process(d, bufs, BURST);
> @@ -346,20 +342,15 @@ sanity_test_with_mbuf_alloc(struct worker_params *wp, struct rte_mempool *p)
>   static int
>   handle_work_for_shutdown_test(void *arg)
>   {
> -	struct rte_mbuf *pkt = NULL;
>   	struct rte_mbuf *buf[8] __rte_cache_aligned;
>   	struct worker_params *wp = arg;
>   	struct rte_distributor *d = wp->dist;
> -	unsigned int count = 0;
>   	unsigned int num = 0;
> -	unsigned int total = 0;
>   	unsigned int i;
> -	unsigned int returned = 0;
>   	unsigned int zero_id = 0;
>   	const unsigned int id = __atomic_fetch_add(&worker_idx, 1,
>   			__ATOMIC_RELAXED);
> -
> -	num = rte_distributor_get_pkt(d, id, buf, buf, num);
> +	num = rte_distributor_get_pkt(d, id, buf, buf, 0);
>   
>   	zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
>   	if (id == zero_id && num > 0) {
> @@ -371,13 +362,12 @@ handle_work_for_shutdown_test(void *arg)
>   	/* wait for quit single globally, or for worker zero, wait
>   	 * for zero_quit */
>   	while (!quit && !(id == zero_id && zero_quit)) {
> -		count += num;
>   		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
>   				__ATOMIC_ACQ_REL);
>   		for (i = 0; i < num; i++)
>   			rte_pktmbuf_free(buf[i]);
>   		num = rte_distributor_get_pkt(d,
> -				id, buf, buf, num);
> +				id, buf, buf, 0);
>   
>   		zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
>   		if (id == zero_id && num > 0) {
> @@ -385,12 +375,7 @@ handle_work_for_shutdown_test(void *arg)
>   				__ATOMIC_ACQUIRE);
>   			__atomic_store_n(&zero_idx, zero_id, __ATOMIC_RELEASE);
>   		}
> -
> -		total += num;
>   	}
> -	count += num;
> -	returned = rte_distributor_return_pkt(d, id, buf, num);
> -
>   	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
>   			__ATOMIC_ACQ_REL);
>   	if (id == zero_id) {
> @@ -400,20 +385,20 @@ handle_work_for_shutdown_test(void *arg)
>   		while (zero_quit)
>   			usleep(100);
>   
> +		for (i = 0; i < num; i++)
> +			rte_pktmbuf_free(buf[i]);
>   		num = rte_distributor_get_pkt(d,
> -				id, buf, buf, num);
> +				id, buf, buf, 0);
>   
>   		while (!quit) {
> -			count += num;
> -			rte_pktmbuf_free(pkt);
> -			num = rte_distributor_get_pkt(d, id, buf, buf, num);
>   			__atomic_fetch_add(&worker_stats[id].handled_packets,
>   					num, __ATOMIC_ACQ_REL);
> +			for (i = 0; i < num; i++)
> +				rte_pktmbuf_free(buf[i]);
> +			num = rte_distributor_get_pkt(d, id, buf, buf, 0);
>   		}
> -		returned = rte_distributor_return_pkt(d,
> -				id, buf, num);
> -		printf("Num returned = %d\n", returned);
>   	}
> +	rte_distributor_return_pkt(d, id, buf, num);
>   	return 0;
>   }
>   

Nice cleanup, Thanks.

Acked-by: David Hunt <david.hunt@intel.com>







* Re: [dpdk-dev] [PATCH v1 4/6] app/test: collect return mbufs in distributor test
  2020-09-15 19:34     ` [dpdk-dev] [PATCH v1 4/6] app/test: collect return mbufs in distributor test Lukasz Wojciechowski
@ 2020-09-17 12:37       ` David Hunt
  0 siblings, 0 replies; 164+ messages in thread
From: David Hunt @ 2020-09-17 12:37 UTC (permalink / raw)
  To: Lukasz Wojciechowski, Bruce Richardson; +Cc: dev, stable

Hi Lukasz,

On 15/9/2020 8:34 PM, Lukasz Wojciechowski wrote:
> During the quit_workers function, the distributor's main core processes
> some packets to wake up pending worker cores so they can quit.
> As quit_workers also acts as a cleanup procedure for the next test
> case, it should collect the packets returned by the workers'
> handlers, so that the distributor's cyclic buffer of returned packets
> remains empty.
>
> Fixes: c3eabff124e6 ("distributor: add unit tests")
> Cc: bruce.richardson@intel.com
> Fixes: c0de0eb82e40 ("distributor: switch over to new API")
> Cc: david.hunt@intel.com
> Cc: stable@dpdk.org
>
> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
> ---
>   app/test/test_distributor.c | 4 ++++
>   1 file changed, 4 insertions(+)
>
> diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
> index da13a9a3f..13c6397cc 100644
> --- a/app/test/test_distributor.c
> +++ b/app/test/test_distributor.c
> @@ -599,6 +599,10 @@ quit_workers(struct worker_params *wp, struct rte_mempool *p)
>   	rte_distributor_process(d, NULL, 0);
>   	rte_distributor_flush(d);
>   	rte_eal_mp_wait_lcore();
> +
> +	while (rte_distributor_returned_pkts(d, bufs, RTE_MAX_LCORE))
> +		;
> +
>   	quit = 0;
>   	worker_idx = 0;
>   	zero_idx = 0;


Acked-by: David Hunt <david.hunt@intel.com>





* Re: [dpdk-dev] [PATCH v1 5/6] distributor: fix missing handshake synchronization
  2020-09-15 19:34     ` [dpdk-dev] [PATCH v1 5/6] distributor: fix missing handshake synchronization Lukasz Wojciechowski
@ 2020-09-17 13:22       ` David Hunt
  0 siblings, 0 replies; 164+ messages in thread
From: David Hunt @ 2020-09-17 13:22 UTC (permalink / raw)
  To: Lukasz Wojciechowski, Bruce Richardson; +Cc: dev, stable

Hi Lukasz,

On 15/9/2020 8:34 PM, Lukasz Wojciechowski wrote:
> The rte_distributor_return_pkt function, which runs on worker cores,
> must wait for the distributor core to clear the handshake on retptr64
> before using those buffers. While the handshake is set, the distributor
> core controls the buffers, and any operation on the worker side might
> overwrite buffers that have not been read yet.
>
> Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
> Cc: david.hunt@intel.com
> Cc: stable@dpdk.org
>
> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
> ---
>   lib/librte_distributor/rte_distributor.c | 14 ++++++++++++++
>   1 file changed, 14 insertions(+)
>
> diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
> index 1c047f065..89493c331 100644
> --- a/lib/librte_distributor/rte_distributor.c
> +++ b/lib/librte_distributor/rte_distributor.c
> @@ -160,6 +160,7 @@ rte_distributor_return_pkt(struct rte_distributor *d,
>   {
>   	struct rte_distributor_buffer *buf = &d->bufs[worker_id];
>   	unsigned int i;
> +	volatile int64_t *retptr64;
>   
>   	if (unlikely(d->alg_type == RTE_DIST_ALG_SINGLE)) {
>   		if (num == 1)
> @@ -169,6 +170,19 @@ rte_distributor_return_pkt(struct rte_distributor *d,
>   			return -EINVAL;
>   	}
>   
> +	retptr64 = &(buf->retptr64[0]);
> +	/* Spin while handshake bits are set (scheduler clears it).
> +	 * Sync with worker on GET_BUF flag.
> +	 */
> +	while (unlikely(__atomic_load_n(retptr64, __ATOMIC_ACQUIRE)
> +			& RTE_DISTRIB_GET_BUF)) {
> +		rte_pause();
> +		uint64_t t = rte_rdtsc()+100;
> +
> +		while (rte_rdtsc() < t)
> +			rte_pause();
> +	}
> +


The 'unlikely' is appropriate, but when it does occur, this looks to be 
a necessary addition.
And I've confirmed no loss in performance on my system.

Acked-by: David Hunt <david.hunt@intel.com>






* Re: [dpdk-dev] [PATCH v1 6/6] distributor: fix handshake deadlock
  2020-09-15 19:34     ` [dpdk-dev] [PATCH v1 6/6] distributor: fix handshake deadlock Lukasz Wojciechowski
@ 2020-09-17 13:28       ` David Hunt
  0 siblings, 0 replies; 164+ messages in thread
From: David Hunt @ 2020-09-17 13:28 UTC (permalink / raw)
  To: Lukasz Wojciechowski, Bruce Richardson; +Cc: dev, stable

Hi Lukasz,

On 15/9/2020 8:34 PM, Lukasz Wojciechowski wrote:
> Synchronization of data exchange between the distributor and worker
> cores is based on two handshakes: retptr64 for returning mbufs from
> workers to the distributor, and bufptr64 for passing mbufs to workers.
>
> Without verifying those two handshakes in the proper order, a deadlock
> may occur: the worker core wants to return mbufs and waits for the
> retptr handshake to be cleared, while the distributor core waits on
> bufptr to send mbufs to the worker.
>
> This can happen because the worker core first returns mbufs to the
> distributor and only then gets new ones, while the distributor first
> releases mbufs to the worker and only then handles the returns.
>
> This patch removes the possibility of the deadlock by always handling
> returned packets first on the distributor side, and by handling returns
> while waiting to release new packets.
>
> Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
> Cc: david.hunt@intel.com
> Cc: stable@dpdk.org
>
> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
> ---
>   lib/librte_distributor/rte_distributor.c | 9 ++++++---
>   1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
> index 89493c331..12b3db33c 100644
> --- a/lib/librte_distributor/rte_distributor.c
> +++ b/lib/librte_distributor/rte_distributor.c
> @@ -321,12 +321,14 @@ release(struct rte_distributor *d, unsigned int wkr)
>   	struct rte_distributor_buffer *buf = &(d->bufs[wkr]);
>   	unsigned int i;
>   
> +	handle_returns(d, wkr);
> +
>   	/* Sync with worker on GET_BUF flag */
>   	while (!(__atomic_load_n(&(d->bufs[wkr].bufptr64[0]), __ATOMIC_ACQUIRE)
> -		& RTE_DISTRIB_GET_BUF))
> +		& RTE_DISTRIB_GET_BUF)) {
> +		handle_returns(d, wkr);
>   		rte_pause();
> -
> -	handle_returns(d, wkr);
> +	}
>   
>   	buf->count = 0;
>   
> @@ -376,6 +378,7 @@ rte_distributor_process(struct rte_distributor *d,
>   		/* Flush out all non-full cache-lines to workers. */
>   		for (wid = 0 ; wid < d->num_workers; wid++) {
>   			/* Sync with worker on GET_BUF flag. */
> +			handle_returns(d, wid);
>   			if (__atomic_load_n(&(d->bufs[wid].bufptr64[0]),
>   				__ATOMIC_ACQUIRE) & RTE_DISTRIB_GET_BUF) {
>   				release(d, wid);

Makes sense. Thanks for the series.  Again, no degradation in 
performance on my systems.

Acked-by: David Hunt <david.hunt@intel.com>





* Re: [dpdk-dev] [PATCH v1 1/6] app/test: fix deadlock in distributor test
  2020-09-17 11:21       ` David Hunt
@ 2020-09-17 14:01         ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-09-17 14:01 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson
  Cc: dev, stable, "'Lukasz Wojciechowski'",

Hi David,

W dniu 17.09.2020 o 13:21, David Hunt pisze:
> Hi Lukasz,
>
> On 15/9/2020 8:34 PM, Lukasz Wojciechowski wrote:
>> The sanity test with worker shutdown delegates all bufs
>> to be processed by a single lcore worker, then it freezes
>> one of the lcore workers and continues to send more bufs.
>>
>> The problem occurred if the frozen lcore was the same as the one
>> processing the mbufs. The lcore processing the mbufs
>> might be different every time the test is launched.
>> This is caused by keeping the value of the wkr static variable
>> in the rte_distributor_process function between test case runs.
>>
>> The test always froze the lcore with id 0. The patch avoids
>> a possible collision by freezing the lcore given by zero_idx instead.
>> The lcore that receives the data updates zero_idx, so it is never
>> frozen itself.
>>
>> To reproduce the issue fixed by this patch, run the
>> distributor_autotest command in the test app several times in a row.
>>
>> Fixes: c3eabff124e6 ("distributor: add unit tests")
>> Cc: bruce.richardson@intel.com
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
>> ---
>>   app/test/test_distributor.c | 22 ++++++++++++++++++++--
>>   1 file changed, 20 insertions(+), 2 deletions(-)
>>
>> diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
>> index ba1f81cf8..35b25463a 100644
>> --- a/app/test/test_distributor.c
>> +++ b/app/test/test_distributor.c
>> @@ -28,6 +28,7 @@ struct worker_params worker_params;
>>   static volatile int quit;      /**< general quit variable for all 
>> threads */
>>   static volatile int zero_quit; /**< var for when we just want thr0 
>> to quit*/
>>   static volatile unsigned worker_idx;
>> +static volatile unsigned zero_idx;
>>     struct worker_stats {
>>       volatile unsigned handled_packets;
>> @@ -346,27 +347,43 @@ handle_work_for_shutdown_test(void *arg)
>>       unsigned int total = 0;
>>       unsigned int i;
>>       unsigned int returned = 0;
>> +    unsigned int zero_id = 0;
>>       const unsigned int id = __atomic_fetch_add(&worker_idx, 1,
>>               __ATOMIC_RELAXED);
>>         num = rte_distributor_get_pkt(d, id, buf, buf, num);
>>   +    zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
>> +    if (id == zero_id && num > 0) {
>> +        zero_id = (zero_id + 1) % __atomic_load_n(&worker_idx,
>> +            __ATOMIC_ACQUIRE);
>> +        __atomic_store_n(&zero_idx, zero_id, __ATOMIC_RELEASE);
>> +    }
>> +
>>       /* wait for quit single globally, or for worker zero, wait
>>        * for zero_quit */
>> -    while (!quit && !(id == 0 && zero_quit)) {
>> +    while (!quit && !(id == zero_id && zero_quit)) {
>>           worker_stats[id].handled_packets += num;
>>           count += num;
>>           for (i = 0; i < num; i++)
>>               rte_pktmbuf_free(buf[i]);
>>           num = rte_distributor_get_pkt(d,
>>                   id, buf, buf, num);
>> +
>> +        zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
>> +        if (id == zero_id && num > 0) {
>> +            zero_id = (zero_id + 1) % __atomic_load_n(&worker_idx,
>> +                __ATOMIC_ACQUIRE);
>> +            __atomic_store_n(&zero_idx, zero_id, __ATOMIC_RELEASE);
>> +        }
>> +
>>           total += num;
>>       }
>>       worker_stats[id].handled_packets += num;
>>       count += num;
>>       returned = rte_distributor_return_pkt(d, id, buf, num);
>>   -    if (id == 0) {
>> +    if (id == zero_id) {
>>           /* for worker zero, allow it to restart to pick up last packet
>>            * when all workers are shutting down.
>>            */
>> @@ -586,6 +603,7 @@ quit_workers(struct worker_params *wp, struct 
>> rte_mempool *p)
>>       rte_eal_mp_wait_lcore();
>>       quit = 0;
>>       worker_idx = 0;
>> +    zero_idx = 0;
>>   }
>>     static int
>
>
> The lockup is reproducible if you run the distributor_autotest 19 times
> in succession. I was able to run the test many more times than that
> with the patch applied. Thanks.
The number depends on the number of lcores in your test environment.
>
> Tested-by: David Hunt <david.hunt@intel.com>


Thank you very much for reviewing and testing whole series.

>
>
>
-- 
Lukasz Wojciechowski
Principal Software Engineer

Samsung R&D Institute Poland
Samsung Electronics
Office +48 22 377 88 25
l.wojciechow@partner.samsung.com



* Re: [dpdk-dev] [PATCH v1 3/6] app/test: fix freeing mbufs in distributor tests
  2020-09-15 19:34     ` [dpdk-dev] [PATCH v1 3/6] app/test: fix freeing mbufs in distributor tests Lukasz Wojciechowski
  2020-09-17 12:34       ` David Hunt
@ 2020-09-22 12:42       ` David Marchand
  2020-09-23  1:55         ` Lukasz Wojciechowski
  1 sibling, 1 reply; 164+ messages in thread
From: David Marchand @ 2020-09-22 12:42 UTC (permalink / raw)
  To: Lukasz Wojciechowski, David Hunt; +Cc: Bruce Richardson, dev, dpdk stable

Hello Lukasz, David,


On Tue, Sep 15, 2020 at 9:35 PM Lukasz Wojciechowski
<l.wojciechow@partner.samsung.com> wrote:
> diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
> index 0e49e3714..da13a9a3f 100644
> --- a/app/test/test_distributor.c
> +++ b/app/test/test_distributor.c
> @@ -277,24 +277,21 @@ handle_work_with_free_mbufs(void *arg)
>         struct rte_mbuf *buf[8] __rte_cache_aligned;
>         struct worker_params *wp = arg;
>         struct rte_distributor *d = wp->dist;
> -       unsigned int count = 0;
>         unsigned int i;
>         unsigned int num = 0;
>         unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED);
>
>         for (i = 0; i < 8; i++)
>                 buf[i] = NULL;
> -       num = rte_distributor_get_pkt(d, id, buf, buf, num);
> +       num = rte_distributor_get_pkt(d, id, buf, buf, 0);

For my understanding, we pass an array even if we return 0 packets. Is
this necessary?


>         while (!quit) {
> -               count += num;
>                 __atomic_fetch_add(&worker_stats[id].handled_packets, num,
>                                 __ATOMIC_ACQ_REL);
>                 for (i = 0; i < num; i++)
>                         rte_pktmbuf_free(buf[i]);
>                 num = rte_distributor_get_pkt(d,
> -                               id, buf, buf, num);
> +                               id, buf, buf, 0);

Here, it gives the impression we have some potential use-after-free on
buf[] content.
And trying to pass NULL, I can see the distributor library
dereferences oldpkt[] without checking retcount != 0.


-- 
David Marchand



* [dpdk-dev] [PATCH v2 0/8] fix distributor synchronization issues
       [not found]   ` <CGME20200923014717eucas1p18699ad84d206e786a84f20dab9b65c33@eucas1p1.samsung.com>
@ 2020-09-23  1:47     ` Lukasz Wojciechowski
       [not found]       ` <CGME20200923014718eucas1p11fdcd774fef7b9e077e14e01c9f951d5@eucas1p1.samsung.com>
                         ` (10 more replies)
  0 siblings, 11 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-09-23  1:47 UTC (permalink / raw)
  Cc: dev, l.wojciechow

During review and verification of the patch created by Sarosh Arif,
"test_distributor: prevent memory leakages from the pool", I found out
that running the distributor unit tests multiple times in a row causes
failures. So I investigated all the issues I found.

There are a few synchronization issues that might cause deadlocks
or corrupted data. They are fixed by this set of patches, covering both
the tests and the librte_distributor library.

---
v2:
* assign NULL to freed mbufs in distributor test
* fix handshake check on legacy single distributor
     rte_distributor_return_pkt_single()
* add patch 7 passing NULL to legacy API calls if no bufs are returned
* add patch 8 fixing API documentation

Lukasz Wojciechowski (8):
  app/test: fix deadlock in distributor test
  app/test: synchronize statistics between lcores
  app/test: fix freeing mbufs in distributor tests
  app/test: collect return mbufs in distributor test
  distributor: fix missing handshake synchronization
  distributor: fix handshake deadlock
  distributor: do not use oldpkt when not needed
  distributor: align API documentation with code

 app/test/test_distributor.c                   | 113 +++++++++++-------
 lib/librte_distributor/rte_distributor.c      |  27 ++++-
 lib/librte_distributor/rte_distributor.h      |  23 ++--
 .../rte_distributor_single.c                  |   4 +
 4 files changed, 110 insertions(+), 57 deletions(-)

-- 
2.17.1



* [dpdk-dev] [PATCH v2 1/8] app/test: fix deadlock in distributor test
       [not found]       ` <CGME20200923014718eucas1p11fdcd774fef7b9e077e14e01c9f951d5@eucas1p1.samsung.com>
@ 2020-09-23  1:47         ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-09-23  1:47 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

The sanity test with worker shutdown delegates all bufs
to be processed by a single lcore worker, then it freezes
one of the lcore workers and continues to send more bufs.

The problem occurred if the frozen lcore was the same as the one
processing the mbufs. The lcore processing the mbufs
might be different every time the test is launched.
This is caused by keeping the value of the wkr static variable
in the rte_distributor_process function between test case runs.

The test always froze the lcore with id 0. The patch avoids
a possible collision by freezing the lcore given by zero_idx instead.
The lcore that receives the data updates zero_idx, so it is never
frozen itself.

To reproduce the issue fixed by this patch, run the
distributor_autotest command in the test app several times in a row.

Fixes: c3eabff124e6 ("distributor: add unit tests")
Cc: bruce.richardson@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
---
 app/test/test_distributor.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index ba1f81cf8..35b25463a 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -28,6 +28,7 @@ struct worker_params worker_params;
 static volatile int quit;      /**< general quit variable for all threads */
 static volatile int zero_quit; /**< var for when we just want thr0 to quit*/
 static volatile unsigned worker_idx;
+static volatile unsigned zero_idx;
 
 struct worker_stats {
 	volatile unsigned handled_packets;
@@ -346,27 +347,43 @@ handle_work_for_shutdown_test(void *arg)
 	unsigned int total = 0;
 	unsigned int i;
 	unsigned int returned = 0;
+	unsigned int zero_id = 0;
 	const unsigned int id = __atomic_fetch_add(&worker_idx, 1,
 			__ATOMIC_RELAXED);
 
 	num = rte_distributor_get_pkt(d, id, buf, buf, num);
 
+	zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
+	if (id == zero_id && num > 0) {
+		zero_id = (zero_id + 1) %  __atomic_load_n(&worker_idx,
+			__ATOMIC_ACQUIRE);
+		__atomic_store_n(&zero_idx, zero_id, __ATOMIC_RELEASE);
+	}
+
 	/* wait for quit single globally, or for worker zero, wait
 	 * for zero_quit */
-	while (!quit && !(id == 0 && zero_quit)) {
+	while (!quit && !(id == zero_id && zero_quit)) {
 		worker_stats[id].handled_packets += num;
 		count += num;
 		for (i = 0; i < num; i++)
 			rte_pktmbuf_free(buf[i]);
 		num = rte_distributor_get_pkt(d,
 				id, buf, buf, num);
+
+		zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
+		if (id == zero_id && num > 0) {
+			zero_id = (zero_id + 1) %  __atomic_load_n(&worker_idx,
+				__ATOMIC_ACQUIRE);
+			__atomic_store_n(&zero_idx, zero_id, __ATOMIC_RELEASE);
+		}
+
 		total += num;
 	}
 	worker_stats[id].handled_packets += num;
 	count += num;
 	returned = rte_distributor_return_pkt(d, id, buf, num);
 
-	if (id == 0) {
+	if (id == zero_id) {
 		/* for worker zero, allow it to restart to pick up last packet
 		 * when all workers are shutting down.
 		 */
@@ -586,6 +603,7 @@ quit_workers(struct worker_params *wp, struct rte_mempool *p)
 	rte_eal_mp_wait_lcore();
 	quit = 0;
 	worker_idx = 0;
+	zero_idx = 0;
 }
 
 static int
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v2 2/8] app/test: synchronize statistics between lcores
       [not found]       ` <CGME20200923014719eucas1p2f26000109e86a649796e902c30e58bf0@eucas1p2.samsung.com>
@ 2020-09-23  1:47         ` Lukasz Wojciechowski
  2020-09-23  4:30           ` Honnappa Nagarahalli
  0 siblings, 1 reply; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-09-23  1:47 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

Statistics of handled packets are cleared and read on the main lcore,
while they are increased in the workers' handlers on different lcores.

Without synchronization, reads occasionally returned invalid values.
This patch uses atomic acquire/release mechanisms to synchronize.
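
For illustration, a minimal sketch of the pattern applied below (it
uses the same GCC __atomic builtins as the diff; the counter name is
only for the example):

	/* shared between lcores; a plain "+=" is a non-atomic
	 * read-modify-write and can lose concurrent updates
	 */
	static unsigned int handled;

	/* writer side (worker lcore) */
	__atomic_fetch_add(&handled, num, __ATOMIC_ACQ_REL);

	/* reader side (main lcore) */
	unsigned int count = __atomic_load_n(&handled, __ATOMIC_ACQUIRE);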

Fixes: c3eabff124e6 ("distributor: add unit tests")
Cc: bruce.richardson@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
---
 app/test/test_distributor.c | 39 ++++++++++++++++++++++++-------------
 1 file changed, 26 insertions(+), 13 deletions(-)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index 35b25463a..0e49e3714 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -43,7 +43,8 @@ total_packet_count(void)
 {
 	unsigned i, count = 0;
 	for (i = 0; i < worker_idx; i++)
-		count += worker_stats[i].handled_packets;
+		count += __atomic_load_n(&worker_stats[i].handled_packets,
+				__ATOMIC_ACQUIRE);
 	return count;
 }
 
@@ -52,6 +53,7 @@ static inline void
 clear_packet_count(void)
 {
 	memset(&worker_stats, 0, sizeof(worker_stats));
+	rte_atomic_thread_fence(__ATOMIC_RELEASE);
 }
 
 /* this is the basic worker function for sanity test
@@ -72,13 +74,13 @@ handle_work(void *arg)
 	num = rte_distributor_get_pkt(db, id, buf, buf, num);
 	while (!quit) {
 		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
-				__ATOMIC_RELAXED);
+				__ATOMIC_ACQ_REL);
 		count += num;
 		num = rte_distributor_get_pkt(db, id,
 				buf, buf, num);
 	}
 	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
-			__ATOMIC_RELAXED);
+			__ATOMIC_ACQ_REL);
 	count += num;
 	rte_distributor_return_pkt(db, id, buf, num);
 	return 0;
@@ -134,7 +136,8 @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
 
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
-				worker_stats[i].handled_packets);
+			__atomic_load_n(&worker_stats[i].handled_packets,
+					__ATOMIC_ACQUIRE));
 	printf("Sanity test with all zero hashes done.\n");
 
 	/* pick two flows and check they go correctly */
@@ -159,7 +162,9 @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
 
 		for (i = 0; i < rte_lcore_count() - 1; i++)
 			printf("Worker %u handled %u packets\n", i,
-					worker_stats[i].handled_packets);
+				__atomic_load_n(
+					&worker_stats[i].handled_packets,
+					__ATOMIC_ACQUIRE));
 		printf("Sanity test with two hash values done\n");
 	}
 
@@ -185,7 +190,8 @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
 
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
-				worker_stats[i].handled_packets);
+			__atomic_load_n(&worker_stats[i].handled_packets,
+					__ATOMIC_ACQUIRE));
 	printf("Sanity test with non-zero hashes done\n");
 
 	rte_mempool_put_bulk(p, (void *)bufs, BURST);
@@ -280,15 +286,17 @@ handle_work_with_free_mbufs(void *arg)
 		buf[i] = NULL;
 	num = rte_distributor_get_pkt(d, id, buf, buf, num);
 	while (!quit) {
-		worker_stats[id].handled_packets += num;
 		count += num;
+		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
+				__ATOMIC_ACQ_REL);
 		for (i = 0; i < num; i++)
 			rte_pktmbuf_free(buf[i]);
 		num = rte_distributor_get_pkt(d,
 				id, buf, buf, num);
 	}
-	worker_stats[id].handled_packets += num;
 	count += num;
+	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
+			__ATOMIC_ACQ_REL);
 	rte_distributor_return_pkt(d, id, buf, num);
 	return 0;
 }
@@ -363,8 +371,9 @@ handle_work_for_shutdown_test(void *arg)
 	/* wait for quit single globally, or for worker zero, wait
 	 * for zero_quit */
 	while (!quit && !(id == zero_id && zero_quit)) {
-		worker_stats[id].handled_packets += num;
 		count += num;
+		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
+				__ATOMIC_ACQ_REL);
 		for (i = 0; i < num; i++)
 			rte_pktmbuf_free(buf[i]);
 		num = rte_distributor_get_pkt(d,
@@ -379,10 +388,11 @@ handle_work_for_shutdown_test(void *arg)
 
 		total += num;
 	}
-	worker_stats[id].handled_packets += num;
 	count += num;
 	returned = rte_distributor_return_pkt(d, id, buf, num);
 
+	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
+			__ATOMIC_ACQ_REL);
 	if (id == zero_id) {
 		/* for worker zero, allow it to restart to pick up last packet
 		 * when all workers are shutting down.
@@ -394,10 +404,11 @@ handle_work_for_shutdown_test(void *arg)
 				id, buf, buf, num);
 
 		while (!quit) {
-			worker_stats[id].handled_packets += num;
 			count += num;
 			rte_pktmbuf_free(pkt);
 			num = rte_distributor_get_pkt(d, id, buf, buf, num);
+			__atomic_fetch_add(&worker_stats[id].handled_packets,
+					num, __ATOMIC_ACQ_REL);
 		}
 		returned = rte_distributor_return_pkt(d,
 				id, buf, num);
@@ -461,7 +472,8 @@ sanity_test_with_worker_shutdown(struct worker_params *wp,
 
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
-				worker_stats[i].handled_packets);
+			__atomic_load_n(&worker_stats[i].handled_packets,
+					__ATOMIC_ACQUIRE));
 
 	if (total_packet_count() != BURST * 2) {
 		printf("Line %d: Error, not all packets flushed. "
@@ -514,7 +526,8 @@ test_flush_with_worker_shutdown(struct worker_params *wp,
 	zero_quit = 0;
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
-				worker_stats[i].handled_packets);
+			__atomic_load_n(&worker_stats[i].handled_packets,
+					__ATOMIC_ACQUIRE));
 
 	if (total_packet_count() != BURST) {
 		printf("Line %d: Error, not all packets flushed. "
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v2 3/8] app/test: fix freeing mbufs in distributor tests
       [not found]       ` <CGME20200923014719eucas1p165c419cff4f265cff8add8cc818210ff@eucas1p1.samsung.com>
@ 2020-09-23  1:47         ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-09-23  1:47 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

The sanity tests with mbuf alloc and the shutdown tests assume that
mbufs passed to worker cores are freed in the handlers. Such packets
should not be returned to the distributor's main core. The only
packets that should be returned are the ones sent after completion of
the tests, in the quit_workers function.

This patch fixes the freeing of mbufs, stops returning them to the
distributor's core and cleans up unused variables.
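
The handler-side pattern this patch settles on (condensed from the
diff below):

	for (i = 0; i < num; i++) {
		rte_pktmbuf_free(buf[i]);
		buf[i] = NULL;	/* never reuse a freed mbuf */
	}
	num = rte_distributor_get_pkt(d, id, buf, buf, 0);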

Fixes: c0de0eb82e40 ("distributor: switch over to new API")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
---
 app/test/test_distributor.c | 50 +++++++++++++++++--------------------
 1 file changed, 23 insertions(+), 27 deletions(-)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index 0e49e3714..94b65b382 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -277,24 +277,23 @@ handle_work_with_free_mbufs(void *arg)
 	struct rte_mbuf *buf[8] __rte_cache_aligned;
 	struct worker_params *wp = arg;
 	struct rte_distributor *d = wp->dist;
-	unsigned int count = 0;
 	unsigned int i;
 	unsigned int num = 0;
 	unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED);
 
 	for (i = 0; i < 8; i++)
 		buf[i] = NULL;
-	num = rte_distributor_get_pkt(d, id, buf, buf, num);
+	num = rte_distributor_get_pkt(d, id, buf, buf, 0);
 	while (!quit) {
-		count += num;
 		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
 				__ATOMIC_ACQ_REL);
-		for (i = 0; i < num; i++)
+		for (i = 0; i < num; i++) {
 			rte_pktmbuf_free(buf[i]);
+			buf[i] = NULL;
+		}
 		num = rte_distributor_get_pkt(d,
-				id, buf, buf, num);
+				id, buf, buf, 0);
 	}
-	count += num;
 	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
 			__ATOMIC_ACQ_REL);
 	rte_distributor_return_pkt(d, id, buf, num);
@@ -322,7 +321,6 @@ sanity_test_with_mbuf_alloc(struct worker_params *wp, struct rte_mempool *p)
 			rte_distributor_process(d, NULL, 0);
 		for (j = 0; j < BURST; j++) {
 			bufs[j]->hash.usr = (i+j) << 1;
-			rte_mbuf_refcnt_set(bufs[j], 1);
 		}
 
 		rte_distributor_process(d, bufs, BURST);
@@ -346,20 +344,18 @@ sanity_test_with_mbuf_alloc(struct worker_params *wp, struct rte_mempool *p)
 static int
 handle_work_for_shutdown_test(void *arg)
 {
-	struct rte_mbuf *pkt = NULL;
 	struct rte_mbuf *buf[8] __rte_cache_aligned;
 	struct worker_params *wp = arg;
 	struct rte_distributor *d = wp->dist;
-	unsigned int count = 0;
 	unsigned int num = 0;
-	unsigned int total = 0;
 	unsigned int i;
-	unsigned int returned = 0;
 	unsigned int zero_id = 0;
 	const unsigned int id = __atomic_fetch_add(&worker_idx, 1,
 			__ATOMIC_RELAXED);
+	for (i = 0; i < 8; i++)
+		buf[i] = NULL;
 
-	num = rte_distributor_get_pkt(d, id, buf, buf, num);
+	num = rte_distributor_get_pkt(d, id, buf, buf, 0);
 
 	zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
 	if (id == zero_id && num > 0) {
@@ -371,13 +367,14 @@ handle_work_for_shutdown_test(void *arg)
 	/* wait for quit single globally, or for worker zero, wait
 	 * for zero_quit */
 	while (!quit && !(id == zero_id && zero_quit)) {
-		count += num;
 		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
 				__ATOMIC_ACQ_REL);
-		for (i = 0; i < num; i++)
+		for (i = 0; i < num; i++) {
 			rte_pktmbuf_free(buf[i]);
+			buf[i] = NULL;
+		}
 		num = rte_distributor_get_pkt(d,
-				id, buf, buf, num);
+				id, buf, buf, 0);
 
 		zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
 		if (id == zero_id && num > 0) {
@@ -385,12 +382,7 @@ handle_work_for_shutdown_test(void *arg)
 				__ATOMIC_ACQUIRE);
 			__atomic_store_n(&zero_idx, zero_id, __ATOMIC_RELEASE);
 		}
-
-		total += num;
 	}
-	count += num;
-	returned = rte_distributor_return_pkt(d, id, buf, num);
-
 	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
 			__ATOMIC_ACQ_REL);
 	if (id == zero_id) {
@@ -400,20 +392,24 @@ handle_work_for_shutdown_test(void *arg)
 		while (zero_quit)
 			usleep(100);
 
+		for (i = 0; i < num; i++) {
+			rte_pktmbuf_free(buf[i]);
+			buf[i] = NULL;
+		}
 		num = rte_distributor_get_pkt(d,
-				id, buf, buf, num);
+				id, buf, buf, 0);
 
 		while (!quit) {
-			count += num;
-			rte_pktmbuf_free(pkt);
-			num = rte_distributor_get_pkt(d, id, buf, buf, num);
 			__atomic_fetch_add(&worker_stats[id].handled_packets,
 					num, __ATOMIC_ACQ_REL);
+			for (i = 0; i < num; i++) {
+				rte_pktmbuf_free(buf[i]);
+				buf[i] = NULL;
+			}
+			num = rte_distributor_get_pkt(d, id, buf, buf, 0);
 		}
-		returned = rte_distributor_return_pkt(d,
-				id, buf, num);
-		printf("Num returned = %d\n", returned);
 	}
+	rte_distributor_return_pkt(d, id, buf, num);
 	return 0;
 }
 
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v2 4/8] app/test: collect return mbufs in distributor test
       [not found]       ` <CGME20200923014720eucas1p2bd5887c96c24839f364810a1bbe840da@eucas1p2.samsung.com>
@ 2020-09-23  1:47         ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-09-23  1:47 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

During the quit_workers function the distributor's main core processes
some packets to wake up pending worker cores so they can quit.
As quit_workers also acts as a cleanup procedure for the next test
case, it should collect these packets returned by the workers'
handlers, so that the cyclic buffer with returned packets
in the distributor remains empty.
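
Condensed, the drain added to quit_workers (bufs is the array already
used there; see the diff below):

	/* empty the returns ring so the next test case starts clean */
	while (rte_distributor_returned_pkts(d, bufs, RTE_MAX_LCORE))
		;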

Fixes: c3eabff124e6 ("distributor: add unit tests")
Cc: bruce.richardson@intel.com
Fixes: c0de0eb82e40 ("distributor: switch over to new API")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
---
 app/test/test_distributor.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index 94b65b382..f31b54edf 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -610,6 +610,10 @@ quit_workers(struct worker_params *wp, struct rte_mempool *p)
 	rte_distributor_process(d, NULL, 0);
 	rte_distributor_flush(d);
 	rte_eal_mp_wait_lcore();
+
+	while (rte_distributor_returned_pkts(d, bufs, RTE_MAX_LCORE))
+		;
+
 	quit = 0;
 	worker_idx = 0;
 	zero_idx = 0;
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v2 5/8] distributor: fix missing handshake synchronization
       [not found]       ` <CGME20200923014721eucas1p1d22ac56c9b9e4fb49ac73d72d51a7a23@eucas1p1.samsung.com>
@ 2020-09-23  1:47         ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-09-23  1:47 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

The rte_distributor_return_pkt function, which is run on worker cores,
must wait for the distributor core to clear the handshake on retptr64
before using those buffers. While the handshake is set, the
distributor core controls the buffers, and any operation on the worker
side might overwrite buffers which have not been read yet.
The same situation appears in the legacy single distributor. The
rte_distributor_return_pkt_single function shouldn't modify bufptr64
until the handshake on it is cleared by the distributor lcore.
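
A condensed sketch of the worker-side wait this patch adds (retptr64
and RTE_DISTRIB_GET_BUF as in the diff below; the rdtsc backoff is
left out here):

	/* spin until the distributor clears the handshake bits;
	 * while they are set, the distributor still owns retptr64[]
	 * and writing to it would clobber unread returns
	 */
	while (__atomic_load_n(&buf->retptr64[0], __ATOMIC_ACQUIRE)
			& RTE_DISTRIB_GET_BUF)
		rte_pause();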

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
---
 lib/librte_distributor/rte_distributor.c        | 14 ++++++++++++++
 lib/librte_distributor/rte_distributor_single.c |  4 ++++
 2 files changed, 18 insertions(+)

diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 1c047f065..89493c331 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -160,6 +160,7 @@ rte_distributor_return_pkt(struct rte_distributor *d,
 {
 	struct rte_distributor_buffer *buf = &d->bufs[worker_id];
 	unsigned int i;
+	volatile int64_t *retptr64;
 
 	if (unlikely(d->alg_type == RTE_DIST_ALG_SINGLE)) {
 		if (num == 1)
@@ -169,6 +170,19 @@ rte_distributor_return_pkt(struct rte_distributor *d,
 			return -EINVAL;
 	}
 
+	retptr64 = &(buf->retptr64[0]);
+	/* Spin while handshake bits are set (scheduler clears it).
+	 * Sync with worker on GET_BUF flag.
+	 */
+	while (unlikely(__atomic_load_n(retptr64, __ATOMIC_ACQUIRE)
+			& RTE_DISTRIB_GET_BUF)) {
+		rte_pause();
+		uint64_t t = rte_rdtsc()+100;
+
+		while (rte_rdtsc() < t)
+			rte_pause();
+	}
+
 	/* Sync with distributor to acquire retptrs */
 	__atomic_thread_fence(__ATOMIC_ACQUIRE);
 	for (i = 0; i < RTE_DIST_BURST_SIZE; i++)
diff --git a/lib/librte_distributor/rte_distributor_single.c b/lib/librte_distributor/rte_distributor_single.c
index abaf7730c..f4725b1d0 100644
--- a/lib/librte_distributor/rte_distributor_single.c
+++ b/lib/librte_distributor/rte_distributor_single.c
@@ -74,6 +74,10 @@ rte_distributor_return_pkt_single(struct rte_distributor_single *d,
 	union rte_distributor_buffer_single *buf = &d->bufs[worker_id];
 	uint64_t req = (((int64_t)(uintptr_t)oldpkt) << RTE_DISTRIB_FLAG_BITS)
 			| RTE_DISTRIB_RETURN_BUF;
+	while (unlikely(__atomic_load_n(&buf->bufptr64, __ATOMIC_RELAXED)
+			& RTE_DISTRIB_FLAGS_MASK))
+		rte_pause();
+
 	/* Sync with distributor on RETURN_BUF flag. */
 	__atomic_store_n(&(buf->bufptr64), req, __ATOMIC_RELEASE);
 	return 0;
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v2 6/8] distributor: fix handshake deadlock
       [not found]       ` <CGME20200923014722eucas1p2c2ef63759f4b800c1b5a80094e07e384@eucas1p2.samsung.com>
@ 2020-09-23  1:47         ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-09-23  1:47 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

Synchronization of the data exchange between the distributor and
worker cores is based on 2 handshakes: retptr64 for returning mbufs
from workers to the distributor, and bufptr64 for passing mbufs to
workers.

Without a proper order of verifying those 2 handshakes a deadlock may
occur: a worker core wants to return mbufs and waits for the retptr
handshake to be cleared, while the distributor core waits on bufptr
to send mbufs to the worker.

This can happen because a worker core first returns mbufs to the
distributor and only later gets new mbufs, while the distributor
first releases mbufs to the worker and only later handles the
returned packets.

This patch removes the possibility of the deadlock by always taking
care of the returned packets first on the distributor side, and by
handling returns while waiting to release new packets.
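
Condensed, the fixed wait loop in release() looks as follows (see the
diff for the exact context); draining returns inside the loop is what
breaks the circular wait:

	handle_returns(d, wkr);
	/* Sync with worker on GET_BUF flag */
	while (!(__atomic_load_n(&d->bufs[wkr].bufptr64[0], __ATOMIC_ACQUIRE)
		& RTE_DISTRIB_GET_BUF)) {
		handle_returns(d, wkr);	/* keep servicing retptr64 here */
		rte_pause();
	}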

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
---
 lib/librte_distributor/rte_distributor.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 89493c331..12b3db33c 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -321,12 +321,14 @@ release(struct rte_distributor *d, unsigned int wkr)
 	struct rte_distributor_buffer *buf = &(d->bufs[wkr]);
 	unsigned int i;
 
+	handle_returns(d, wkr);
+
 	/* Sync with worker on GET_BUF flag */
 	while (!(__atomic_load_n(&(d->bufs[wkr].bufptr64[0]), __ATOMIC_ACQUIRE)
-		& RTE_DISTRIB_GET_BUF))
+		& RTE_DISTRIB_GET_BUF)) {
+		handle_returns(d, wkr);
 		rte_pause();
-
-	handle_returns(d, wkr);
+	}
 
 	buf->count = 0;
 
@@ -376,6 +378,7 @@ rte_distributor_process(struct rte_distributor *d,
 		/* Flush out all non-full cache-lines to workers. */
 		for (wid = 0 ; wid < d->num_workers; wid++) {
 			/* Sync with worker on GET_BUF flag. */
+			handle_returns(d, wid);
 			if (__atomic_load_n(&(d->bufs[wid].bufptr64[0]),
 				__ATOMIC_ACQUIRE) & RTE_DISTRIB_GET_BUF) {
 				release(d, wid);
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v2 7/8] distributor: do not use oldpkt when not needed
       [not found]       ` <CGME20200923014723eucas1p2a7c7210a55289b3739faff4f5ed72e30@eucas1p2.samsung.com>
@ 2020-09-23  1:47         ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-09-23  1:47 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

rte_distributor_request_pkt and rte_distributor_get_pkt dereferenced
the oldpkt parameter in RTE_DIST_ALG_SINGLE mode even if the number
of buffers returned from the worker to the distributor was 0.

This patch passes NULL to the legacy API when the number of returned
buffers is 0, which allows callers to pass NULL as the oldpkt
parameter.
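
A usage sketch of what this enables on the worker side (assuming d and
worker_id are set up as usual, and pkts is sized for a burst):

	struct rte_mbuf *pkts[8];
	int num;

	/* nothing to hand back yet, so oldpkt may now be NULL */
	num = rte_distributor_get_pkt(d, worker_id, pkts, NULL, 0);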

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
---
 lib/librte_distributor/rte_distributor.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 12b3db33c..b720abe03 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -42,7 +42,7 @@ rte_distributor_request_pkt(struct rte_distributor *d,
 
 	if (unlikely(d->alg_type == RTE_DIST_ALG_SINGLE)) {
 		rte_distributor_request_pkt_single(d->d_single,
-			worker_id, oldpkt[0]);
+			worker_id, count ? oldpkt[0] : NULL);
 		return;
 	}
 
@@ -134,7 +134,7 @@ rte_distributor_get_pkt(struct rte_distributor *d,
 	if (unlikely(d->alg_type == RTE_DIST_ALG_SINGLE)) {
 		if (return_count <= 1) {
 			pkts[0] = rte_distributor_get_pkt_single(d->d_single,
-				worker_id, oldpkt[0]);
+				worker_id, return_count ? oldpkt[0] : NULL);
 			return (pkts[0]) ? 1 : 0;
 		} else
 			return -EINVAL;
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v2 8/8] distributor: align API documentation with code
       [not found]       ` <CGME20200923014724eucas1p13d3c0428a15bea26def7a4343251e4e4@eucas1p1.samsung.com>
@ 2020-09-23  1:47         ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-09-23  1:47 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

After introducing the burst API, some artefacts from the legacy
single API remained in the API documentation.
Also, the documented return values of the rte_distributor_poll_pkt()
function did not match the implementation.
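
In other words, a caller has to treat "no packets yet" differently per
algorithm; a hedged sketch of handling both conventions documented
below:

	int num = rte_distributor_poll_pkt(d, worker_id, pkts);
	if (num == -1) {
		/* burst API (RTE_DIST_ALG_BURST): request not fulfilled yet */
	} else if (num == 0) {
		/* legacy single API (RTE_DIST_ALG_SINGLE): no packet yet */
	}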

Fixes: c0de0eb82e40 ("distributor: switch over to new API")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
---
 lib/librte_distributor/rte_distributor.h | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/lib/librte_distributor/rte_distributor.h b/lib/librte_distributor/rte_distributor.h
index 327c0c4ab..a073e6461 100644
--- a/lib/librte_distributor/rte_distributor.h
+++ b/lib/librte_distributor/rte_distributor.h
@@ -155,7 +155,7 @@ rte_distributor_clear_returns(struct rte_distributor *d);
  * @param pkts
  *   The mbufs pointer array to be filled in (up to 8 packets)
  * @param oldpkt
- *   The previous packet, if any, being processed by the worker
+ *   The previous packets, if any, being processed by the worker
  * @param retcount
  *   The number of packets being returned
  *
@@ -187,15 +187,15 @@ rte_distributor_return_pkt(struct rte_distributor *d,
 
 /**
  * API called by a worker to request a new packet to process.
- * Any previous packet given to the worker is assumed to have completed
+ * Any previous packets given to the worker are assumed to have completed
  * processing, and may be optionally returned to the distributor via
  * the oldpkt parameter.
- * Unlike rte_distributor_get_pkt_burst(), this function does not wait for a
- * new packet to be provided by the distributor.
+ * Unlike rte_distributor_get_pkt(), this function does not wait for
+ * new packets to be provided by the distributor.
  *
- * NOTE: after calling this function, rte_distributor_poll_pkt_burst() should
- * be used to poll for the packet requested. The rte_distributor_get_pkt_burst()
- * API should *not* be used to try and retrieve the new packet.
+ * NOTE: after calling this function, rte_distributor_poll_pkt() should
+ * be used to poll for the packets requested. The rte_distributor_get_pkt()
+ * API should *not* be used to try and retrieve the new packets.
  *
  * @param d
  *   The distributor instance to be used
@@ -213,9 +213,9 @@ rte_distributor_request_pkt(struct rte_distributor *d,
 		unsigned int count);
 
 /**
- * API called by a worker to check for a new packet that was previously
+ * API called by a worker to check for new packets that were previously
  * requested by a call to rte_distributor_request_pkt(). It does not wait
- * for the new packet to be available, but returns NULL if the request has
+ * for the new packets to be available, but returns if the request has
  * not yet been fulfilled by the distributor.
  *
  * @param d
@@ -227,8 +227,9 @@ rte_distributor_request_pkt(struct rte_distributor *d,
  *   The array of mbufs being given to the worker
  *
  * @return
- *   The number of packets being given to the worker thread, zero if no
- *   packet is yet available.
+ *   The number of packets being given to the worker thread,
+ *   -1 if no packets are yet available (burst API - RTE_DIST_ALG_BURST)
+ *   0 if no packets are yet available (legacy single API - RTE_DIST_ALG_SINGLE)
  */
 int
 rte_distributor_poll_pkt(struct rte_distributor *d,
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v1 3/6] app/test: fix freeing mbufs in distributor tests
  2020-09-22 12:42       ` David Marchand
@ 2020-09-23  1:55         ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-09-23  1:55 UTC (permalink / raw)
  To: David Marchand, David Hunt; +Cc: Bruce Richardson, dev, dpdk stable

Hello David,

On 22.09.2020 at 14:42, David Marchand wrote:
> Hello Lukasz, David,
>
>
> On Tue, Sep 15, 2020 at 9:35 PM Lukasz Wojciechowski
> <l.wojciechow@partner.samsung.com> wrote:
>> diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
>> index 0e49e3714..da13a9a3f 100644
>> --- a/app/test/test_distributor.c
>> +++ b/app/test/test_distributor.c
>> @@ -277,24 +277,21 @@ handle_work_with_free_mbufs(void *arg)
>>          struct rte_mbuf *buf[8] __rte_cache_aligned;
>>          struct worker_params *wp = arg;
>>          struct rte_distributor *d = wp->dist;
>> -       unsigned int count = 0;
>>          unsigned int i;
>>          unsigned int num = 0;
>>          unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED);
>>
>>          for (i = 0; i < 8; i++)
>>                  buf[i] = NULL;
>> -       num = rte_distributor_get_pkt(d, id, buf, buf, num);
>> +       num = rte_distributor_get_pkt(d, id, buf, buf, 0);
> For my understanding, we pass an array even if we return 0 packets. Is
> this necessary?

The short answer is: yes.

That's because, when using the old legacy API (single distributor), it
is required to pass a pointer to an mbuf (which may be NULL, however).
The new burst API functions call the old API, dereferencing the first
element of the passed array. So there must be a valid array containing
at least 1 element.
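
i.e. the worker handlers end up with the pattern visible in the diffs
of this series:

	struct rte_mbuf *buf[8] __rte_cache_aligned;
	unsigned int i, num;

	for (i = 0; i < 8; i++)	/* valid array even with nothing to return */
		buf[i] = NULL;
	num = rte_distributor_get_pkt(d, id, buf, buf, 0);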

I pushed the v2 version of the patchset, which contains 2 new patches.
Patch #7 fixes this issue in librte_distributor by passing a NULL mbuf
pointer to the legacy API if the number of returned buffers is zero.

>
>
>>          while (!quit) {
>> -               count += num;
>>                  __atomic_fetch_add(&worker_stats[id].handled_packets, num,
>>                                  __ATOMIC_ACQ_REL);
>>                  for (i = 0; i < num; i++)
>>                          rte_pktmbuf_free(buf[i]);
>>                  num = rte_distributor_get_pkt(d,
>> -                               id, buf, buf, num);
>> +                               id, buf, buf, 0);
> Here, it gives the impression we have some potential use-after-free on
> buf[] content.
Nice catch! I missed it.
I fixed it in v2 by assigning NULL values to the bufs, so they won't
be used after being freed.
> And trying to pass NULL, I can see the distributor library
> dereferences oldpkt[] without checking retcount != 0.

That's fixed in new patch v2 7/8


Best regards

Lukasz

>
>
-- 
Lukasz Wojciechowski
Principal Software Engineer

Samsung R&D Institute Poland
Samsung Electronics
Office +48 22 377 88 25
l.wojciechow@partner.samsung.com


^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/8] app/test: synchronize statistics between lcores
  2020-09-23  1:47         ` [dpdk-dev] [PATCH v2 2/8] app/test: synchronize statistics between lcores Lukasz Wojciechowski
@ 2020-09-23  4:30           ` Honnappa Nagarahalli
  2020-09-23 12:47             ` Lukasz Wojciechowski
  0 siblings, 1 reply; 164+ messages in thread
From: Honnappa Nagarahalli @ 2020-09-23  4:30 UTC (permalink / raw)
  To: Lukasz Wojciechowski, David Hunt, Bruce Richardson
  Cc: dev, stable, nd, Honnappa Nagarahalli, nd

<snip>

> 
> Statistics of handled packets are cleared and read on the main lcore,
> while they are increased in the workers' handlers on different lcores.
>
> Without synchronization, reads occasionally returned invalid values.
What exactly do you mean by invalid values? Can you elaborate?

> This patch uses atomic acquire/release mechanisms to synchronize.
> 
> Fixes: c3eabff124e6 ("distributor: add unit tests")
> Cc: bruce.richardson@intel.com
> Cc: stable@dpdk.org
> 
> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
> ---
>  app/test/test_distributor.c | 39 ++++++++++++++++++++++++-------------
>  1 file changed, 26 insertions(+), 13 deletions(-)
> 
> diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c index
> 35b25463a..0e49e3714 100644
> --- a/app/test/test_distributor.c
> +++ b/app/test/test_distributor.c
> @@ -43,7 +43,8 @@ total_packet_count(void)  {
>  	unsigned i, count = 0;
>  	for (i = 0; i < worker_idx; i++)
> -		count += worker_stats[i].handled_packets;
> +		count +=
> __atomic_load_n(&worker_stats[i].handled_packets,
> +				__ATOMIC_ACQUIRE);
>  	return count;
>  }
> 
> @@ -52,6 +53,7 @@ static inline void
>  clear_packet_count(void)
>  {
>  	memset(&worker_stats, 0, sizeof(worker_stats));
> +	rte_atomic_thread_fence(__ATOMIC_RELEASE);
>  }
> 
>  /* this is the basic worker function for sanity test @@ -72,13 +74,13 @@
> handle_work(void *arg)
>  	num = rte_distributor_get_pkt(db, id, buf, buf, num);
>  	while (!quit) {
>  		__atomic_fetch_add(&worker_stats[id].handled_packets,
> num,
> -				__ATOMIC_RELAXED);
> +				__ATOMIC_ACQ_REL);
>  		count += num;
>  		num = rte_distributor_get_pkt(db, id,
>  				buf, buf, num);
>  	}
>  	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
> -			__ATOMIC_RELAXED);
> +			__ATOMIC_ACQ_REL);
>  	count += num;
>  	rte_distributor_return_pkt(db, id, buf, num);
>  	return 0;
> @@ -134,7 +136,8 @@ sanity_test(struct worker_params *wp, struct
> rte_mempool *p)
> 
>  	for (i = 0; i < rte_lcore_count() - 1; i++)
>  		printf("Worker %u handled %u packets\n", i,
> -				worker_stats[i].handled_packets);
> +			__atomic_load_n(&worker_stats[i].handled_packets,
> +					__ATOMIC_ACQUIRE));
>  	printf("Sanity test with all zero hashes done.\n");
> 
>  	/* pick two flows and check they go correctly */ @@ -159,7 +162,9
> @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
> 
>  		for (i = 0; i < rte_lcore_count() - 1; i++)
>  			printf("Worker %u handled %u packets\n", i,
> -					worker_stats[i].handled_packets);
> +				__atomic_load_n(
> +					&worker_stats[i].handled_packets,
> +					__ATOMIC_ACQUIRE));
>  		printf("Sanity test with two hash values done\n");
>  	}
> 
> @@ -185,7 +190,8 @@ sanity_test(struct worker_params *wp, struct
> rte_mempool *p)
> 
>  	for (i = 0; i < rte_lcore_count() - 1; i++)
>  		printf("Worker %u handled %u packets\n", i,
> -				worker_stats[i].handled_packets);
> +			__atomic_load_n(&worker_stats[i].handled_packets,
> +					__ATOMIC_ACQUIRE));
>  	printf("Sanity test with non-zero hashes done\n");
> 
>  	rte_mempool_put_bulk(p, (void *)bufs, BURST); @@ -280,15
> +286,17 @@ handle_work_with_free_mbufs(void *arg)
>  		buf[i] = NULL;
>  	num = rte_distributor_get_pkt(d, id, buf, buf, num);
>  	while (!quit) {
> -		worker_stats[id].handled_packets += num;
>  		count += num;
> +		__atomic_fetch_add(&worker_stats[id].handled_packets,
> num,
> +				__ATOMIC_ACQ_REL);
>  		for (i = 0; i < num; i++)
>  			rte_pktmbuf_free(buf[i]);
>  		num = rte_distributor_get_pkt(d,
>  				id, buf, buf, num);
>  	}
> -	worker_stats[id].handled_packets += num;
>  	count += num;
> +	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
> +			__ATOMIC_ACQ_REL);
>  	rte_distributor_return_pkt(d, id, buf, num);
>  	return 0;
>  }
> @@ -363,8 +371,9 @@ handle_work_for_shutdown_test(void *arg)
>  	/* wait for quit single globally, or for worker zero, wait
>  	 * for zero_quit */
>  	while (!quit && !(id == zero_id && zero_quit)) {
> -		worker_stats[id].handled_packets += num;
>  		count += num;
> +		__atomic_fetch_add(&worker_stats[id].handled_packets,
> num,
> +				__ATOMIC_ACQ_REL);
>  		for (i = 0; i < num; i++)
>  			rte_pktmbuf_free(buf[i]);
>  		num = rte_distributor_get_pkt(d,
> @@ -379,10 +388,11 @@ handle_work_for_shutdown_test(void *arg)
> 
>  		total += num;
>  	}
> -	worker_stats[id].handled_packets += num;
>  	count += num;
>  	returned = rte_distributor_return_pkt(d, id, buf, num);
> 
> +	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
> +			__ATOMIC_ACQ_REL);
>  	if (id == zero_id) {
>  		/* for worker zero, allow it to restart to pick up last packet
>  		 * when all workers are shutting down.
> @@ -394,10 +404,11 @@ handle_work_for_shutdown_test(void *arg)
>  				id, buf, buf, num);
> 
>  		while (!quit) {
> -			worker_stats[id].handled_packets += num;
>  			count += num;
>  			rte_pktmbuf_free(pkt);
>  			num = rte_distributor_get_pkt(d, id, buf, buf, num);
> +
> 	__atomic_fetch_add(&worker_stats[id].handled_packets,
> +					num, __ATOMIC_ACQ_REL);
>  		}
>  		returned = rte_distributor_return_pkt(d,
>  				id, buf, num);
> @@ -461,7 +472,8 @@ sanity_test_with_worker_shutdown(struct
> worker_params *wp,
> 
>  	for (i = 0; i < rte_lcore_count() - 1; i++)
>  		printf("Worker %u handled %u packets\n", i,
> -				worker_stats[i].handled_packets);
> +			__atomic_load_n(&worker_stats[i].handled_packets,
> +					__ATOMIC_ACQUIRE));
> 
>  	if (total_packet_count() != BURST * 2) {
>  		printf("Line %d: Error, not all packets flushed. "
> @@ -514,7 +526,8 @@ test_flush_with_worker_shutdown(struct
> worker_params *wp,
>  	zero_quit = 0;
>  	for (i = 0; i < rte_lcore_count() - 1; i++)
>  		printf("Worker %u handled %u packets\n", i,
> -				worker_stats[i].handled_packets);
> +			__atomic_load_n(&worker_stats[i].handled_packets,
> +					__ATOMIC_ACQUIRE));
> 
>  	if (total_packet_count() != BURST) {
>  		printf("Line %d: Error, not all packets flushed. "
> --
> 2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v2 0/8] fix distributor synchronization issues
  2020-09-23  1:47     ` [dpdk-dev] [PATCH v2 0/8] fix distributor synchronization issues Lukasz Wojciechowski
                         ` (7 preceding siblings ...)
       [not found]       ` <CGME20200923014724eucas1p13d3c0428a15bea26def7a4343251e4e4@eucas1p1.samsung.com>
@ 2020-09-23  8:46       ` David Hunt
  2020-09-23 14:03         ` Lukasz Wojciechowski
  2020-09-23  8:47       ` David Hunt
       [not found]       ` <CGME20200923132544eucas1p29470697e7cb6621cc65e6e676c3e5d69@eucas1p2.samsung.com>
  10 siblings, 1 reply; 164+ messages in thread
From: David Hunt @ 2020-09-23  8:46 UTC (permalink / raw)
  To: dev

Hi Lukasz,


On 23/9/2020 2:47 AM, Lukasz Wojciechowski wrote:
> During review and verification of the patch created by Sarosh Arif:
> "test_distributor: prevent memory leakages from the pool" I found out
> that running distributor unit tests multiple times in a row causes failures.
> So I investigated all the issues I found.
>
> There are a few synchronization issues that might cause deadlocks
> or corrupted data. They are fixed with this set of patches for both
> the tests and the librte_distributor library.
>
> ---
> v2:
> * assign NULL to freed mbufs in distributor test
> * fix handshake check on legacy single distributor
>       rte_distributor_return_pkt_single()
> * add patch 7 passing NULL to legacy API calls if no bufs are returned
> * add patch 8 fixing API documentation
>
>

Please include any Acked-by or Tested-by tags from previous versions.

Rgds,
Dave.




^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v2 0/8] fix distributor synchronization issues
  2020-09-23  1:47     ` [dpdk-dev] [PATCH v2 0/8] fix distributor synchronization issues Lukasz Wojciechowski
                         ` (8 preceding siblings ...)
  2020-09-23  8:46       ` [dpdk-dev] [PATCH v2 0/8] fix distributor synchronization issues David Hunt
@ 2020-09-23  8:47       ` David Hunt
       [not found]       ` <CGME20200923132544eucas1p29470697e7cb6621cc65e6e676c3e5d69@eucas1p2.samsung.com>
  10 siblings, 0 replies; 164+ messages in thread
From: David Hunt @ 2020-09-23  8:47 UTC (permalink / raw)
  To: Lukasz Wojciechowski; +Cc: dev

Hi Lukasz,

On 23/9/2020 2:47 AM, Lukasz Wojciechowski wrote:
> During review and verification of the patch created by Sarosh Arif:
> "test_distributor: prevent memory leakages from the pool" I found out
> that running distributor unit tests multiple times in a row causes failures.
> So I investigated all the issues I found.
>
> There are a few synchronization issues that might cause deadlocks
> or corrupted data. They are fixed with this set of patches for both
> the tests and the librte_distributor library.
>
> ---
> v2:
> * assign NULL to freed mbufs in distributor test
> * fix handshake check on legacy single distributor
>       rte_distributor_return_pkt_single()
> * add patch 7 passing NULL to legacy API calls if no bufs are returned
> * add patch 8 fixing API documentation
>

Please include any Acked-by or Tested-by tags from previous versions.

Rgds,
Dave.


^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v2 2/8] app/test: synchronize statistics between lcores
  2020-09-23  4:30           ` Honnappa Nagarahalli
@ 2020-09-23 12:47             ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-09-23 12:47 UTC (permalink / raw)
  To: Honnappa Nagarahalli
  Cc: David Hunt, Bruce Richardson, dev, stable, nd,
	"'Lukasz Wojciechowski'",


On 23.09.2020 at 06:30, Honnappa Nagarahalli wrote:
> <snip>
>
>> Statistics of handled packets are cleared and read on the main lcore,
>> while they are increased in the workers' handlers on different lcores.
>>
>> Without synchronization, reads occasionally returned invalid values.
> What exactly do you mean by invalid values? Can you elaborate?

I mean values that shouldn't be there, which are obviously not related
to the number of packets handled by the workers.

I reverted the patch and ran stress tests. Failures without this patch
look like these:

=== Test flush fn with worker shutdown (burst) ===
Worker 0 handled 0 packets
Worker 1 handled 0 packets
Worker 2 handled 0 packets
Worker 3 handled 0 packets
Worker 4 handled 32 packets
Worker 5 handled 0 packets
Worker 6 handled 6 packets
Line 519: Error, not all packets flushed. Expected 32, got 38
Test Failed

or:

=== Sanity test of worker shutdown ===
Worker 0 handled 0 packets
Worker 1 handled 0 packets
Worker 2 handled 0 packets
Worker 3 handled 0 packets
Worker 4 handled 0 packets
Worker 5 handled 64 packets
Worker 6 handled 149792 packets
Line 466: Error, not all packets flushed. Expected 64, got 149856
Test Failed

The 6 or 149792 packets reported by worker 6 were never sent to or 
processed by the workers.

>> This patch uses atomic acquire/release mechanisms to synchronize.
>>
>> Fixes: c3eabff124e6 ("distributor: add unit tests")
>> Cc: bruce.richardson@intel.com
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
>> ---
>>   app/test/test_distributor.c | 39 ++++++++++++++++++++++++-------------
>>   1 file changed, 26 insertions(+), 13 deletions(-)
>>
>> diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c index
>> 35b25463a..0e49e3714 100644
>> --- a/app/test/test_distributor.c
>> +++ b/app/test/test_distributor.c
>> @@ -43,7 +43,8 @@ total_packet_count(void)  {
>>   	unsigned i, count = 0;
>>   	for (i = 0; i < worker_idx; i++)
>> -		count += worker_stats[i].handled_packets;
>> +		count +=
>> __atomic_load_n(&worker_stats[i].handled_packets,
>> +				__ATOMIC_ACQUIRE);
>>   	return count;
>>   }
>>
>> @@ -52,6 +53,7 @@ static inline void
>>   clear_packet_count(void)
>>   {
>>   	memset(&worker_stats, 0, sizeof(worker_stats));
>> +	rte_atomic_thread_fence(__ATOMIC_RELEASE);
>>   }
>>
>>   /* this is the basic worker function for sanity test @@ -72,13 +74,13 @@
>> handle_work(void *arg)
>>   	num = rte_distributor_get_pkt(db, id, buf, buf, num);
>>   	while (!quit) {
>>   		__atomic_fetch_add(&worker_stats[id].handled_packets,
>> num,
>> -				__ATOMIC_RELAXED);
>> +				__ATOMIC_ACQ_REL);
>>   		count += num;
>>   		num = rte_distributor_get_pkt(db, id,
>>   				buf, buf, num);
>>   	}
>>   	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
>> -			__ATOMIC_RELAXED);
>> +			__ATOMIC_ACQ_REL);
>>   	count += num;
>>   	rte_distributor_return_pkt(db, id, buf, num);
>>   	return 0;
>> @@ -134,7 +136,8 @@ sanity_test(struct worker_params *wp, struct
>> rte_mempool *p)
>>
>>   	for (i = 0; i < rte_lcore_count() - 1; i++)
>>   		printf("Worker %u handled %u packets\n", i,
>> -				worker_stats[i].handled_packets);
>> +			__atomic_load_n(&worker_stats[i].handled_packets,
>> +					__ATOMIC_ACQUIRE));
>>   	printf("Sanity test with all zero hashes done.\n");
>>
>>   	/* pick two flows and check they go correctly */ @@ -159,7 +162,9
>> @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
>>
>>   		for (i = 0; i < rte_lcore_count() - 1; i++)
>>   			printf("Worker %u handled %u packets\n", i,
>> -					worker_stats[i].handled_packets);
>> +				__atomic_load_n(
>> +					&worker_stats[i].handled_packets,
>> +					__ATOMIC_ACQUIRE));
>>   		printf("Sanity test with two hash values done\n");
>>   	}
>>
>> @@ -185,7 +190,8 @@ sanity_test(struct worker_params *wp, struct
>> rte_mempool *p)
>>
>>   	for (i = 0; i < rte_lcore_count() - 1; i++)
>>   		printf("Worker %u handled %u packets\n", i,
>> -				worker_stats[i].handled_packets);
>> +			__atomic_load_n(&worker_stats[i].handled_packets,
>> +					__ATOMIC_ACQUIRE));
>>   	printf("Sanity test with non-zero hashes done\n");
>>
>>   	rte_mempool_put_bulk(p, (void *)bufs, BURST); @@ -280,15
>> +286,17 @@ handle_work_with_free_mbufs(void *arg)
>>   		buf[i] = NULL;
>>   	num = rte_distributor_get_pkt(d, id, buf, buf, num);
>>   	while (!quit) {
>> -		worker_stats[id].handled_packets += num;
>>   		count += num;
>> +		__atomic_fetch_add(&worker_stats[id].handled_packets,
>> num,
>> +				__ATOMIC_ACQ_REL);
>>   		for (i = 0; i < num; i++)
>>   			rte_pktmbuf_free(buf[i]);
>>   		num = rte_distributor_get_pkt(d,
>>   				id, buf, buf, num);
>>   	}
>> -	worker_stats[id].handled_packets += num;
>>   	count += num;
>> +	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
>> +			__ATOMIC_ACQ_REL);
>>   	rte_distributor_return_pkt(d, id, buf, num);
>>   	return 0;
>>   }
>> @@ -363,8 +371,9 @@ handle_work_for_shutdown_test(void *arg)
>>   	/* wait for quit single globally, or for worker zero, wait
>>   	 * for zero_quit */
>>   	while (!quit && !(id == zero_id && zero_quit)) {
>> -		worker_stats[id].handled_packets += num;
>>   		count += num;
>> +		__atomic_fetch_add(&worker_stats[id].handled_packets,
>> num,
>> +				__ATOMIC_ACQ_REL);
>>   		for (i = 0; i < num; i++)
>>   			rte_pktmbuf_free(buf[i]);
>>   		num = rte_distributor_get_pkt(d,
>> @@ -379,10 +388,11 @@ handle_work_for_shutdown_test(void *arg)
>>
>>   		total += num;
>>   	}
>> -	worker_stats[id].handled_packets += num;
>>   	count += num;
>>   	returned = rte_distributor_return_pkt(d, id, buf, num);
>>
>> +	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
>> +			__ATOMIC_ACQ_REL);
>>   	if (id == zero_id) {
>>   		/* for worker zero, allow it to restart to pick up last packet
>>   		 * when all workers are shutting down.
>> @@ -394,10 +404,11 @@ handle_work_for_shutdown_test(void *arg)
>>   				id, buf, buf, num);
>>
>>   		while (!quit) {
>> -			worker_stats[id].handled_packets += num;
>>   			count += num;
>>   			rte_pktmbuf_free(pkt);
>>   			num = rte_distributor_get_pkt(d, id, buf, buf, num);
>> +
>> 	__atomic_fetch_add(&worker_stats[id].handled_packets,
>> +					num, __ATOMIC_ACQ_REL);
>>   		}
>>   		returned = rte_distributor_return_pkt(d,
>>   				id, buf, num);
>> @@ -461,7 +472,8 @@ sanity_test_with_worker_shutdown(struct
>> worker_params *wp,
>>
>>   	for (i = 0; i < rte_lcore_count() - 1; i++)
>>   		printf("Worker %u handled %u packets\n", i,
>> -				worker_stats[i].handled_packets);
>> +			__atomic_load_n(&worker_stats[i].handled_packets,
>> +					__ATOMIC_ACQUIRE));
>>
>>   	if (total_packet_count() != BURST * 2) {
>>   		printf("Line %d: Error, not all packets flushed. "
>> @@ -514,7 +526,8 @@ test_flush_with_worker_shutdown(struct
>> worker_params *wp,
>>   	zero_quit = 0;
>>   	for (i = 0; i < rte_lcore_count() - 1; i++)
>>   		printf("Worker %u handled %u packets\n", i,
>> -				worker_stats[i].handled_packets);
>> +			__atomic_load_n(&worker_stats[i].handled_packets,
>> +					__ATOMIC_ACQUIRE));
>>
>>   	if (total_packet_count() != BURST) {
>>   		printf("Line %d: Error, not all packets flushed. "
>> --
>> 2.17.1

-- 
Lukasz Wojciechowski
Principal Software Engineer

Samsung R&D Institute Poland
Samsung Electronics
Office +48 22 377 88 25
l.wojciechow@partner.samsung.com


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v3 0/8] fix distributor synchronization issues
       [not found]       ` <CGME20200923132544eucas1p29470697e7cb6621cc65e6e676c3e5d69@eucas1p2.samsung.com>
@ 2020-09-23 13:25         ` Lukasz Wojciechowski
       [not found]           ` <CGME20200923132545eucas1p10db12d91121c9afdbab338bb60c8ed37@eucas1p1.samsung.com>
                             ` (9 more replies)
  0 siblings, 10 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-09-23 13:25 UTC (permalink / raw)
  Cc: dev, l.wojciechow

During review and verification of the patch created by Sarosh Arif:
"test_distributor: prevent memory leakages from the pool" I found out
that running distributor unit tests multiple times in a row causes failures.
So I investigated all the issues I found.

There are a few synchronization issues that might cause deadlocks
or corrupted data. They are fixed with this set of patches for both
the tests and the librte_distributor library.

---
v3:
* add missing Acked-by and Tested-by statements from v1

v2:
* assign NULL to freed mbufs in distributor test
* fix handshake check on legacy single distributor
     rte_distributor_return_pkt_single()
* add patch 7 passing NULL to legacy API calls if no bufs are returned
* add patch 8 fixing API documentation

Lukasz Wojciechowski (8):
  app/test: fix deadlock in distributor test
  app/test: synchronize statistics between lcores
  app/test: fix freeing mbufs in distributor tests
  app/test: collect return mbufs in distributor test
  distributor: fix missing handshake synchronization
  distributor: fix handshake deadlock
  distributor: do not use oldpkt when not needed
  distributor: align API documentation with code

 app/test/test_distributor.c                   | 113 +++++++++++-------
 lib/librte_distributor/rte_distributor.c      |  27 ++++-
 lib/librte_distributor/rte_distributor.h      |  23 ++--
 .../rte_distributor_single.c                  |   4 +
 4 files changed, 110 insertions(+), 57 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v3 1/8] app/test: fix deadlock in distributor test
       [not found]           ` <CGME20200923132545eucas1p10db12d91121c9afdbab338bb60c8ed37@eucas1p1.samsung.com>
@ 2020-09-23 13:25             ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-09-23 13:25 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

The sanity test with worker shutdown delegates all bufs
to be processed by a single lcore worker, then it freezes
one of the lcore workers and continues to send more bufs.

The problem occurred if the frozen lcore was the same as the one
processing the mbufs. The lcore processing the mbufs might be
different every time the test is launched, because the value of the
wkr static variable in the rte_distributor_process function is kept
between test cases.

The test always froze the lcore with id 0. This patch avoids the
possible collision by freezing the lcore indicated by zero_idx
instead. The lcore that receives the data advances zero_idx, so it
is never the frozen one itself.

To reproduce the issue fixed by this patch, please run
distributor_autotest command in test app several times in a row.

Fixes: c3eabff124e6 ("distributor: add unit tests")
Cc: bruce.richardson@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Tested-by: David Hunt <david.hunt@intel.com>
---
 app/test/test_distributor.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index ba1f81cf8..35b25463a 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -28,6 +28,7 @@ struct worker_params worker_params;
 static volatile int quit;      /**< general quit variable for all threads */
 static volatile int zero_quit; /**< var for when we just want thr0 to quit*/
 static volatile unsigned worker_idx;
+static volatile unsigned zero_idx;
 
 struct worker_stats {
 	volatile unsigned handled_packets;
@@ -346,27 +347,43 @@ handle_work_for_shutdown_test(void *arg)
 	unsigned int total = 0;
 	unsigned int i;
 	unsigned int returned = 0;
+	unsigned int zero_id = 0;
 	const unsigned int id = __atomic_fetch_add(&worker_idx, 1,
 			__ATOMIC_RELAXED);
 
 	num = rte_distributor_get_pkt(d, id, buf, buf, num);
 
+	zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
+	if (id == zero_id && num > 0) {
+		zero_id = (zero_id + 1) %  __atomic_load_n(&worker_idx,
+			__ATOMIC_ACQUIRE);
+		__atomic_store_n(&zero_idx, zero_id, __ATOMIC_RELEASE);
+	}
+
 	/* wait for quit single globally, or for worker zero, wait
 	 * for zero_quit */
-	while (!quit && !(id == 0 && zero_quit)) {
+	while (!quit && !(id == zero_id && zero_quit)) {
 		worker_stats[id].handled_packets += num;
 		count += num;
 		for (i = 0; i < num; i++)
 			rte_pktmbuf_free(buf[i]);
 		num = rte_distributor_get_pkt(d,
 				id, buf, buf, num);
+
+		zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
+		if (id == zero_id && num > 0) {
+			zero_id = (zero_id + 1) %  __atomic_load_n(&worker_idx,
+				__ATOMIC_ACQUIRE);
+			__atomic_store_n(&zero_idx, zero_id, __ATOMIC_RELEASE);
+		}
+
 		total += num;
 	}
 	worker_stats[id].handled_packets += num;
 	count += num;
 	returned = rte_distributor_return_pkt(d, id, buf, num);
 
-	if (id == 0) {
+	if (id == zero_id) {
 		/* for worker zero, allow it to restart to pick up last packet
 		 * when all workers are shutting down.
 		 */
@@ -586,6 +603,7 @@ quit_workers(struct worker_params *wp, struct rte_mempool *p)
 	rte_eal_mp_wait_lcore();
 	quit = 0;
 	worker_idx = 0;
+	zero_idx = 0;
 }
 
 static int
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v3 2/8] app/test: synchronize statistics between lcores
       [not found]           ` <CGME20200923132546eucas1p212b6eede801514b544d82d41f5b7e4b8@eucas1p2.samsung.com>
@ 2020-09-23 13:25             ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-09-23 13:25 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

Statistics of handled packets are cleared and read on the main lcore,
while they are increased in the workers' handlers on different lcores.

Without synchronization, reads occasionally returned invalid values.
This patch uses atomic acquire/release mechanisms to synchronize.

Fixes: c3eabff124e6 ("distributor: add unit tests")
Cc: bruce.richardson@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 app/test/test_distributor.c | 39 ++++++++++++++++++++++++-------------
 1 file changed, 26 insertions(+), 13 deletions(-)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index 35b25463a..0e49e3714 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -43,7 +43,8 @@ total_packet_count(void)
 {
 	unsigned i, count = 0;
 	for (i = 0; i < worker_idx; i++)
-		count += worker_stats[i].handled_packets;
+		count += __atomic_load_n(&worker_stats[i].handled_packets,
+				__ATOMIC_ACQUIRE);
 	return count;
 }
 
@@ -52,6 +53,7 @@ static inline void
 clear_packet_count(void)
 {
 	memset(&worker_stats, 0, sizeof(worker_stats));
+	rte_atomic_thread_fence(__ATOMIC_RELEASE);
 }
 
 /* this is the basic worker function for sanity test
@@ -72,13 +74,13 @@ handle_work(void *arg)
 	num = rte_distributor_get_pkt(db, id, buf, buf, num);
 	while (!quit) {
 		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
-				__ATOMIC_RELAXED);
+				__ATOMIC_ACQ_REL);
 		count += num;
 		num = rte_distributor_get_pkt(db, id,
 				buf, buf, num);
 	}
 	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
-			__ATOMIC_RELAXED);
+			__ATOMIC_ACQ_REL);
 	count += num;
 	rte_distributor_return_pkt(db, id, buf, num);
 	return 0;
@@ -134,7 +136,8 @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
 
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
-				worker_stats[i].handled_packets);
+			__atomic_load_n(&worker_stats[i].handled_packets,
+					__ATOMIC_ACQUIRE));
 	printf("Sanity test with all zero hashes done.\n");
 
 	/* pick two flows and check they go correctly */
@@ -159,7 +162,9 @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
 
 		for (i = 0; i < rte_lcore_count() - 1; i++)
 			printf("Worker %u handled %u packets\n", i,
-					worker_stats[i].handled_packets);
+				__atomic_load_n(
+					&worker_stats[i].handled_packets,
+					__ATOMIC_ACQUIRE));
 		printf("Sanity test with two hash values done\n");
 	}
 
@@ -185,7 +190,8 @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
 
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
-				worker_stats[i].handled_packets);
+			__atomic_load_n(&worker_stats[i].handled_packets,
+					__ATOMIC_ACQUIRE));
 	printf("Sanity test with non-zero hashes done\n");
 
 	rte_mempool_put_bulk(p, (void *)bufs, BURST);
@@ -280,15 +286,17 @@ handle_work_with_free_mbufs(void *arg)
 		buf[i] = NULL;
 	num = rte_distributor_get_pkt(d, id, buf, buf, num);
 	while (!quit) {
-		worker_stats[id].handled_packets += num;
 		count += num;
+		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
+				__ATOMIC_ACQ_REL);
 		for (i = 0; i < num; i++)
 			rte_pktmbuf_free(buf[i]);
 		num = rte_distributor_get_pkt(d,
 				id, buf, buf, num);
 	}
-	worker_stats[id].handled_packets += num;
 	count += num;
+	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
+			__ATOMIC_ACQ_REL);
 	rte_distributor_return_pkt(d, id, buf, num);
 	return 0;
 }
@@ -363,8 +371,9 @@ handle_work_for_shutdown_test(void *arg)
 	/* wait for quit single globally, or for worker zero, wait
 	 * for zero_quit */
 	while (!quit && !(id == zero_id && zero_quit)) {
-		worker_stats[id].handled_packets += num;
 		count += num;
+		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
+				__ATOMIC_ACQ_REL);
 		for (i = 0; i < num; i++)
 			rte_pktmbuf_free(buf[i]);
 		num = rte_distributor_get_pkt(d,
@@ -379,10 +388,11 @@ handle_work_for_shutdown_test(void *arg)
 
 		total += num;
 	}
-	worker_stats[id].handled_packets += num;
 	count += num;
 	returned = rte_distributor_return_pkt(d, id, buf, num);
 
+	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
+			__ATOMIC_ACQ_REL);
 	if (id == zero_id) {
 		/* for worker zero, allow it to restart to pick up last packet
 		 * when all workers are shutting down.
@@ -394,10 +404,11 @@ handle_work_for_shutdown_test(void *arg)
 				id, buf, buf, num);
 
 		while (!quit) {
-			worker_stats[id].handled_packets += num;
 			count += num;
 			rte_pktmbuf_free(pkt);
 			num = rte_distributor_get_pkt(d, id, buf, buf, num);
+			__atomic_fetch_add(&worker_stats[id].handled_packets,
+					num, __ATOMIC_ACQ_REL);
 		}
 		returned = rte_distributor_return_pkt(d,
 				id, buf, num);
@@ -461,7 +472,8 @@ sanity_test_with_worker_shutdown(struct worker_params *wp,
 
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
-				worker_stats[i].handled_packets);
+			__atomic_load_n(&worker_stats[i].handled_packets,
+					__ATOMIC_ACQUIRE));
 
 	if (total_packet_count() != BURST * 2) {
 		printf("Line %d: Error, not all packets flushed. "
@@ -514,7 +526,8 @@ test_flush_with_worker_shutdown(struct worker_params *wp,
 	zero_quit = 0;
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
-				worker_stats[i].handled_packets);
+			__atomic_load_n(&worker_stats[i].handled_packets,
+					__ATOMIC_ACQUIRE));
 
 	if (total_packet_count() != BURST) {
 		printf("Line %d: Error, not all packets flushed. "
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v3 3/8] app/test: fix freeing mbufs in distributor tests
       [not found]           ` <CGME20200923132547eucas1p130620b0d5f3080a7a57234838a992e0e@eucas1p1.samsung.com>
@ 2020-09-23 13:25             ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-09-23 13:25 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

The sanity test with mbuf alloc and the shutdown tests assume that
mbufs passed to worker cores are freed in the handlers.
Such packets should not be returned to the distributor's main
core. The only packets that should be returned are the packets
sent after completion of the tests in the quit_workers function.

This patch fixes freeing mbufs, stops returning them
to the distributor's core and cleans up unused variables.
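
A minimal sketch of the resulting handler shape, assuming the DPDK
distributor and mbuf APIs, with d, id and the quit flag taken from the
handler context as in the test:

    struct rte_mbuf *buf[8] __rte_cache_aligned;
    unsigned int i, num;

    /* 0 returned buffers: nothing is handed back to the main core */
    num = rte_distributor_get_pkt(d, id, buf, buf, 0);
    while (!quit) {
        for (i = 0; i < num; i++) {
            rte_pktmbuf_free(buf[i]); /* consumed in the handler */
            buf[i] = NULL;            /* never returned */
        }
        num = rte_distributor_get_pkt(d, id, buf, buf, 0);
    }
    rte_distributor_return_pkt(d, id, buf, num);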

Fixes: c0de0eb82e40 ("distributor: switch over to new API")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 app/test/test_distributor.c | 50 +++++++++++++++++--------------------
 1 file changed, 23 insertions(+), 27 deletions(-)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index 0e49e3714..94b65b382 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -277,24 +277,23 @@ handle_work_with_free_mbufs(void *arg)
 	struct rte_mbuf *buf[8] __rte_cache_aligned;
 	struct worker_params *wp = arg;
 	struct rte_distributor *d = wp->dist;
-	unsigned int count = 0;
 	unsigned int i;
 	unsigned int num = 0;
 	unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED);
 
 	for (i = 0; i < 8; i++)
 		buf[i] = NULL;
-	num = rte_distributor_get_pkt(d, id, buf, buf, num);
+	num = rte_distributor_get_pkt(d, id, buf, buf, 0);
 	while (!quit) {
-		count += num;
 		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
 				__ATOMIC_ACQ_REL);
-		for (i = 0; i < num; i++)
+		for (i = 0; i < num; i++) {
 			rte_pktmbuf_free(buf[i]);
+			buf[i] = NULL;
+		}
 		num = rte_distributor_get_pkt(d,
-				id, buf, buf, num);
+				id, buf, buf, 0);
 	}
-	count += num;
 	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
 			__ATOMIC_ACQ_REL);
 	rte_distributor_return_pkt(d, id, buf, num);
@@ -322,7 +321,6 @@ sanity_test_with_mbuf_alloc(struct worker_params *wp, struct rte_mempool *p)
 			rte_distributor_process(d, NULL, 0);
 		for (j = 0; j < BURST; j++) {
 			bufs[j]->hash.usr = (i+j) << 1;
-			rte_mbuf_refcnt_set(bufs[j], 1);
 		}
 
 		rte_distributor_process(d, bufs, BURST);
@@ -346,20 +344,18 @@ sanity_test_with_mbuf_alloc(struct worker_params *wp, struct rte_mempool *p)
 static int
 handle_work_for_shutdown_test(void *arg)
 {
-	struct rte_mbuf *pkt = NULL;
 	struct rte_mbuf *buf[8] __rte_cache_aligned;
 	struct worker_params *wp = arg;
 	struct rte_distributor *d = wp->dist;
-	unsigned int count = 0;
 	unsigned int num = 0;
-	unsigned int total = 0;
 	unsigned int i;
-	unsigned int returned = 0;
 	unsigned int zero_id = 0;
 	const unsigned int id = __atomic_fetch_add(&worker_idx, 1,
 			__ATOMIC_RELAXED);
+	for (i = 0; i < 8; i++)
+		buf[i] = NULL;
 
-	num = rte_distributor_get_pkt(d, id, buf, buf, num);
+	num = rte_distributor_get_pkt(d, id, buf, buf, 0);
 
 	zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
 	if (id == zero_id && num > 0) {
@@ -371,13 +367,14 @@ handle_work_for_shutdown_test(void *arg)
 	/* wait for quit single globally, or for worker zero, wait
 	 * for zero_quit */
 	while (!quit && !(id == zero_id && zero_quit)) {
-		count += num;
 		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
 				__ATOMIC_ACQ_REL);
-		for (i = 0; i < num; i++)
+		for (i = 0; i < num; i++) {
 			rte_pktmbuf_free(buf[i]);
+			buf[i] = NULL;
+		}
 		num = rte_distributor_get_pkt(d,
-				id, buf, buf, num);
+				id, buf, buf, 0);
 
 		zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
 		if (id == zero_id && num > 0) {
@@ -385,12 +382,7 @@ handle_work_for_shutdown_test(void *arg)
 				__ATOMIC_ACQUIRE);
 			__atomic_store_n(&zero_idx, zero_id, __ATOMIC_RELEASE);
 		}
-
-		total += num;
 	}
-	count += num;
-	returned = rte_distributor_return_pkt(d, id, buf, num);
-
 	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
 			__ATOMIC_ACQ_REL);
 	if (id == zero_id) {
@@ -400,20 +392,24 @@ handle_work_for_shutdown_test(void *arg)
 		while (zero_quit)
 			usleep(100);
 
+		for (i = 0; i < num; i++) {
+			rte_pktmbuf_free(buf[i]);
+			buf[i] = NULL;
+		}
 		num = rte_distributor_get_pkt(d,
-				id, buf, buf, num);
+				id, buf, buf, 0);
 
 		while (!quit) {
-			count += num;
-			rte_pktmbuf_free(pkt);
-			num = rte_distributor_get_pkt(d, id, buf, buf, num);
 			__atomic_fetch_add(&worker_stats[id].handled_packets,
 					num, __ATOMIC_ACQ_REL);
+			for (i = 0; i < num; i++) {
+				rte_pktmbuf_free(buf[i]);
+				buf[i] = NULL;
+			}
+			num = rte_distributor_get_pkt(d, id, buf, buf, 0);
 		}
-		returned = rte_distributor_return_pkt(d,
-				id, buf, num);
-		printf("Num returned = %d\n", returned);
 	}
+	rte_distributor_return_pkt(d, id, buf, num);
 	return 0;
 }
 
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v3 4/8] app/test: collect return mbufs in distributor test
       [not found]           ` <CGME20200923132548eucas1p2a54328cddb79ae5e876eb104217d585f@eucas1p2.samsung.com>
@ 2020-09-23 13:25             ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-09-23 13:25 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

During the quit_workers function the distributor's main core processes
some packets to wake up pending worker cores so they can quit.
As quit_workers also acts as a cleanup procedure for the next test
case, it should also collect the packets returned by the workers'
handlers, so that the cyclic buffer with returned packets
in the distributor remains empty.
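
A minimal sketch of the added cleanup step, reusing the test's bufs
array (assumed large enough to hold one burst of returns):

    /* drain the distributor's cyclic return buffer before reuse */
    rte_distributor_flush(d);
    while (rte_distributor_returned_pkts(d, bufs, RTE_MAX_LCORE) > 0)
        ;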

Fixes: c3eabff124e6 ("distributor: add unit tests")
Cc: bruce.richardson@intel.com
Fixes: c0de0eb82e40 ("distributor: switch over to new API")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 app/test/test_distributor.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index 94b65b382..f31b54edf 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -610,6 +610,10 @@ quit_workers(struct worker_params *wp, struct rte_mempool *p)
 	rte_distributor_process(d, NULL, 0);
 	rte_distributor_flush(d);
 	rte_eal_mp_wait_lcore();
+
+	while (rte_distributor_returned_pkts(d, bufs, RTE_MAX_LCORE))
+		;
+
 	quit = 0;
 	worker_idx = 0;
 	zero_idx = 0;
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v3 5/8] distributor: fix missing handshake synchronization
       [not found]           ` <CGME20200923132549eucas1p29fc391c3f236fa704ff800774ab851f0@eucas1p2.samsung.com>
@ 2020-09-23 13:25             ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-09-23 13:25 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

The rte_distributor_return_pkt function, which runs on worker cores,
must wait for the distributor core to clear the handshake on retptr64
before using those buffers. While the handshake is set, the distributor
core controls the buffers, and any operation on the worker side might
overwrite buffers which have not been read yet.
The same situation appears in the legacy single distributor. The
rte_distributor_return_pkt_single function shouldn't modify bufptr64
until the handshake on it is cleared by the distributor lcore.
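
A simplified view of the worker-side fix (field names as in the burst
distributor; the diff below is authoritative):

    /* do not touch retptr64 while the distributor still owns it */
    while (__atomic_load_n(&buf->retptr64[0], __ATOMIC_ACQUIRE)
            & RTE_DISTRIB_GET_BUF)
        rte_pause();
    /* only now is it safe to fill in the return slots */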

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_distributor/rte_distributor.c        | 14 ++++++++++++++
 lib/librte_distributor/rte_distributor_single.c |  4 ++++
 2 files changed, 18 insertions(+)

diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 1c047f065..89493c331 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -160,6 +160,7 @@ rte_distributor_return_pkt(struct rte_distributor *d,
 {
 	struct rte_distributor_buffer *buf = &d->bufs[worker_id];
 	unsigned int i;
+	volatile int64_t *retptr64;
 
 	if (unlikely(d->alg_type == RTE_DIST_ALG_SINGLE)) {
 		if (num == 1)
@@ -169,6 +170,19 @@ rte_distributor_return_pkt(struct rte_distributor *d,
 			return -EINVAL;
 	}
 
+	retptr64 = &(buf->retptr64[0]);
+	/* Spin while handshake bits are set (scheduler clears it).
+	 * Sync with worker on GET_BUF flag.
+	 */
+	while (unlikely(__atomic_load_n(retptr64, __ATOMIC_ACQUIRE)
+			& RTE_DISTRIB_GET_BUF)) {
+		rte_pause();
+		uint64_t t = rte_rdtsc()+100;
+
+		while (rte_rdtsc() < t)
+			rte_pause();
+	}
+
 	/* Sync with distributor to acquire retptrs */
 	__atomic_thread_fence(__ATOMIC_ACQUIRE);
 	for (i = 0; i < RTE_DIST_BURST_SIZE; i++)
diff --git a/lib/librte_distributor/rte_distributor_single.c b/lib/librte_distributor/rte_distributor_single.c
index abaf7730c..f4725b1d0 100644
--- a/lib/librte_distributor/rte_distributor_single.c
+++ b/lib/librte_distributor/rte_distributor_single.c
@@ -74,6 +74,10 @@ rte_distributor_return_pkt_single(struct rte_distributor_single *d,
 	union rte_distributor_buffer_single *buf = &d->bufs[worker_id];
 	uint64_t req = (((int64_t)(uintptr_t)oldpkt) << RTE_DISTRIB_FLAG_BITS)
 			| RTE_DISTRIB_RETURN_BUF;
+	while (unlikely(__atomic_load_n(&buf->bufptr64, __ATOMIC_RELAXED)
+			& RTE_DISTRIB_FLAGS_MASK))
+		rte_pause();
+
 	/* Sync with distributor on RETURN_BUF flag. */
 	__atomic_store_n(&(buf->bufptr64), req, __ATOMIC_RELEASE);
 	return 0;
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v3 6/8] distributor: fix handshake deadlock
       [not found]           ` <CGME20200923132550eucas1p2ce158dd81ccc04abcab4130d8cb391f4@eucas1p2.samsung.com>
@ 2020-09-23 13:25             ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-09-23 13:25 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

Synchronization of the data exchange between distributor and worker
cores is based on 2 handshakes: retptr64 for returning mbufs from
workers to the distributor and bufptr64 for passing mbufs to workers.

Without a proper order of verifying those 2 handshakes a deadlock may
occur: the worker core wants to return mbufs and waits for the retptr
handshake to be cleared, while the distributor core waits for bufptr
to send mbufs to the worker.

This can happen because the worker core first returns mbufs to the
distributor and only later gets new mbufs, while the distributor first
releases mbufs to the worker and only later handles returned packets.

This patch fixes the possibility of the deadlock by always taking care
of returned packets first on the distributor side, and by handling
returns while waiting to release new packets.
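
A simplified view of the fixed wait on the distributor side (the diff
below is authoritative); servicing returns inside the wait loop means a
worker that is itself blocked on the retptr64 handshake can always be
unblocked:

    handle_returns(d, wkr);
    /* wait until the worker is ready for new packets */
    while (!(__atomic_load_n(&d->bufs[wkr].bufptr64[0], __ATOMIC_ACQUIRE)
            & RTE_DISTRIB_GET_BUF)) {
        handle_returns(d, wkr); /* keep draining worker returns */
        rte_pause();
    }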

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_distributor/rte_distributor.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 89493c331..12b3db33c 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -321,12 +321,14 @@ release(struct rte_distributor *d, unsigned int wkr)
 	struct rte_distributor_buffer *buf = &(d->bufs[wkr]);
 	unsigned int i;
 
+	handle_returns(d, wkr);
+
 	/* Sync with worker on GET_BUF flag */
 	while (!(__atomic_load_n(&(d->bufs[wkr].bufptr64[0]), __ATOMIC_ACQUIRE)
-		& RTE_DISTRIB_GET_BUF))
+		& RTE_DISTRIB_GET_BUF)) {
+		handle_returns(d, wkr);
 		rte_pause();
-
-	handle_returns(d, wkr);
+	}
 
 	buf->count = 0;
 
@@ -376,6 +378,7 @@ rte_distributor_process(struct rte_distributor *d,
 		/* Flush out all non-full cache-lines to workers. */
 		for (wid = 0 ; wid < d->num_workers; wid++) {
 			/* Sync with worker on GET_BUF flag. */
+			handle_returns(d, wid);
 			if (__atomic_load_n(&(d->bufs[wid].bufptr64[0]),
 				__ATOMIC_ACQUIRE) & RTE_DISTRIB_GET_BUF) {
 				release(d, wid);
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v3 7/8] distributor: do not use oldpkt when not needed
       [not found]           ` <CGME20200923132550eucas1p1ce21011562d0a00cccfd4ae3f0be4ff9@eucas1p1.samsung.com>
@ 2020-09-23 13:25             ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-09-23 13:25 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

rte_distributor_request_pkt and rte_distributor_get_pkt dereferenced
the oldpkt parameter when in RTE_DIST_ALG_SINGLE mode even if the
number of buffers returned from worker to distributor was 0.

This patch passes NULL to the legacy API when the number of returned
buffers is 0, which allows passing NULL as the oldpkt parameter.
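
With this fix a worker that has nothing to return no longer needs a
dummy array just to satisfy the API, e.g. (a sketch of the caller side):

    /* request new packets, returning none */
    num = rte_distributor_get_pkt(d, id, buf, NULL, 0);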

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
---
 lib/librte_distributor/rte_distributor.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 12b3db33c..b720abe03 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -42,7 +42,7 @@ rte_distributor_request_pkt(struct rte_distributor *d,
 
 	if (unlikely(d->alg_type == RTE_DIST_ALG_SINGLE)) {
 		rte_distributor_request_pkt_single(d->d_single,
-			worker_id, oldpkt[0]);
+			worker_id, count ? oldpkt[0] : NULL);
 		return;
 	}
 
@@ -134,7 +134,7 @@ rte_distributor_get_pkt(struct rte_distributor *d,
 	if (unlikely(d->alg_type == RTE_DIST_ALG_SINGLE)) {
 		if (return_count <= 1) {
 			pkts[0] = rte_distributor_get_pkt_single(d->d_single,
-				worker_id, oldpkt[0]);
+				worker_id, return_count ? oldpkt[0] : NULL);
 			return (pkts[0]) ? 1 : 0;
 		} else
 			return -EINVAL;
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v3 8/8] distributor: align API documentation with code
       [not found]           ` <CGME20200923132551eucas1p214a5f78c61e891c5e7b6cddc038d0e2e@eucas1p2.samsung.com>
@ 2020-09-23 13:25             ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-09-23 13:25 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

After introducing the burst API, some artefacts from the legacy
single API remained in the API documentation.
Also, the documented return values of the rte_distributor_poll_pkt()
function mismatched the implementation.
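
A sketch of a worker poll loop matching the corrected documentation,
assuming the NULL-oldpkt fix from earlier in this series (the two
algorithms use different "not ready" return values):

    struct rte_mbuf *buf[8];
    int num;

    rte_distributor_request_pkt(d, id, NULL, 0);
    do {
        num = rte_distributor_poll_pkt(d, id, buf);
        /* -1: burst API not ready; 0: legacy single API not ready */
    } while (num <= 0);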

Fixes: c0de0eb82e40 ("distributor: switch over to new API")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
---
 lib/librte_distributor/rte_distributor.h | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/lib/librte_distributor/rte_distributor.h b/lib/librte_distributor/rte_distributor.h
index 327c0c4ab..a073e6461 100644
--- a/lib/librte_distributor/rte_distributor.h
+++ b/lib/librte_distributor/rte_distributor.h
@@ -155,7 +155,7 @@ rte_distributor_clear_returns(struct rte_distributor *d);
  * @param pkts
  *   The mbufs pointer array to be filled in (up to 8 packets)
  * @param oldpkt
- *   The previous packet, if any, being processed by the worker
+ *   The previous packets, if any, being processed by the worker
  * @param retcount
  *   The number of packets being returned
  *
@@ -187,15 +187,15 @@ rte_distributor_return_pkt(struct rte_distributor *d,
 
 /**
  * API called by a worker to request a new packet to process.
- * Any previous packet given to the worker is assumed to have completed
+ * Any previous packets given to the worker are assumed to have completed
  * processing, and may be optionally returned to the distributor via
  * the oldpkt parameter.
- * Unlike rte_distributor_get_pkt_burst(), this function does not wait for a
- * new packet to be provided by the distributor.
+ * Unlike rte_distributor_get_pkt(), this function does not wait for
+ * new packets to be provided by the distributor.
  *
- * NOTE: after calling this function, rte_distributor_poll_pkt_burst() should
- * be used to poll for the packet requested. The rte_distributor_get_pkt_burst()
- * API should *not* be used to try and retrieve the new packet.
+ * NOTE: after calling this function, rte_distributor_poll_pkt() should
+ * be used to poll for the packets requested. The rte_distributor_get_pkt()
+ * API should *not* be used to try and retrieve the new packets.
  *
  * @param d
  *   The distributor instance to be used
@@ -213,9 +213,9 @@ rte_distributor_request_pkt(struct rte_distributor *d,
 		unsigned int count);
 
 /**
- * API called by a worker to check for a new packet that was previously
+ * API called by a worker to check for new packets that were previously
  * requested by a call to rte_distributor_request_pkt(). It does not wait
- * for the new packet to be available, but returns NULL if the request has
+ * for the new packets to be available, but returns if the request has
  * not yet been fulfilled by the distributor.
  *
  * @param d
@@ -227,8 +227,9 @@ rte_distributor_request_pkt(struct rte_distributor *d,
  *   The array of mbufs being given to the worker
  *
  * @return
- *   The number of packets being given to the worker thread, zero if no
- *   packet is yet available.
+ *   The number of packets being given to the worker thread,
+ *   -1 if no packets are yet available (burst API - RTE_DIST_ALG_BURST)
+ *   0 if no packets are yet available (legacy single API - RTE_DIST_ALG_SINGLE)
  */
 int
 rte_distributor_poll_pkt(struct rte_distributor *d,
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v2 0/8] fix distributor synchronization issues
  2020-09-23  8:46       ` [dpdk-dev] [PATCH v2 0/8] fix distributor synchronization issues David Hunt
@ 2020-09-23 14:03         ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-09-23 14:03 UTC (permalink / raw)
  To: David Hunt, dev; +Cc: "'Lukasz Wojciechowski'",


On 23.09.2020 at 10:46, David Hunt wrote:
> Hi Lukasz,
>
>
> On 23/9/2020 2:47 AM, Lukasz Wojciechowski wrote:
>> During review and verification of the patch created by Sarosh Arif:
>> "test_distributor: prevent memory leakages from the pool" I found out
>> that running distributor unit tests multiple times in a row causes 
>> fails.
>> So I investigated all the issues I found.
>>
>> There are few synchronization issues that might cause deadlocks
>> or corrupted data. They are fixed with this set of patches for both 
>> tests
>> and librte_distributor library.
>>
>> ---
>> v2:
>> * assign NULL to freed mbufs in distributor test
>> * fix handshake check on legacy single distributor
>>       rte_distributor_return_pkt_single()
>> * add patch 7 passing NULL to legacy API calls if no bufs are returned
>> * add patch 8 fixing API documentation
>>
>>
>
> Please include any Acked-by or Tested-by tags from previous versions.
I added them in v3. I'm very sorry for not including them in v2.
>
> Rgds,
> Dave.
>
Best regards

Lukasz

>
>
>
-- 
Lukasz Wojciechowski
Principal Software Engineer

Samsung R&D Institute Poland
Samsung Electronics
Office +48 22 377 88 25
l.wojciechow@partner.samsung.com


^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v3 0/8] fix distributor synchronization issues
  2020-09-23 13:25         ` [dpdk-dev] [PATCH v3 " Lukasz Wojciechowski
                             ` (7 preceding siblings ...)
       [not found]           ` <CGME20200923132551eucas1p214a5f78c61e891c5e7b6cddc038d0e2e@eucas1p2.samsung.com>
@ 2020-09-25 12:31           ` David Marchand
  2020-09-25 22:42             ` Lukasz Wojciechowski
       [not found]           ` <CGME20200925224216eucas1p28b73b1b56c6f3372db4fcaba0890522f@eucas1p2.samsung.com>
  9 siblings, 1 reply; 164+ messages in thread
From: David Marchand @ 2020-09-25 12:31 UTC (permalink / raw)
  To: Lukasz Wojciechowski; +Cc: dev, David Hunt

Hello Lukasz,

On Wed, Sep 23, 2020 at 3:25 PM Lukasz Wojciechowski
<l.wojciechow@partner.samsung.com> wrote:
>
> During review and verification of the patch created by Sarosh Arif:
> "test_distributor: prevent memory leakages from the pool" I found out
> that running distributor unit tests multiple times in a row causes fails.
> So I investigated all the issues I found.
>
> There are few synchronization issues that might cause deadlocks
> or corrupted data. They are fixed with this set of patches for both tests
> and librte_distributor library.
>
> ---
> v3:
> * add missing acked and tested by statements from v1
>
> v2:
> * assign NULL to freed mbufs in distributor test
> * fix handshake check on legacy single distributor
>      rte_distributor_return_pkt_single()
> * add patch 7 passing NULL to legacy API calls if no bufs are returned
> * add patch 8 fixing API documentation
>
> Lukasz Wojciechowski (8):
>   app/test: fix deadlock in distributor test
>   app/test: synchronize statistics between lcores
>   app/test: fix freeing mbufs in distributor tests
>   app/test: collect return mbufs in distributor test

For these patches, we can use the "test/distributor: " prefix, and we
then avoid repeating "in distributor test".


>   distributor: fix missing handshake synchronization
>   distributor: fix handshake deadlock
>   distributor: do not use oldpkt when not needed
>   distributor: align API documentation with code

Thanks for working on those fixes !

Here is a suggestion:

- we can move this new patch 7 before patch 3 in the series, and
update the unit test:
 * by passing NULL to the first call to rte_distributor_get_pkt(),
there is no need for buf[] array init in handle_work(),
handle_work_with_free_mbufs() and handle_work_for_shutdown_test(),
 * at all points of those functions the buf[] array then contains only
[0, num[ valid entries,
 * bonus point, this makes the UT check passing NULL oldpkt,

- the former patch 3 is then easier to do since there is no need for
buf[] array clearing,

This gives the following diff, wdyt?

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index f31b54edf3..b7ab93ecbe 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -65,13 +65,10 @@ handle_work(void *arg)
        struct rte_mbuf *buf[8] __rte_cache_aligned;
        struct worker_params *wp = arg;
        struct rte_distributor *db = wp->dist;
-       unsigned int count = 0, num = 0;
+       unsigned int count = 0, num;
        unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED);
-       int i;

-       for (i = 0; i < 8; i++)
-               buf[i] = NULL;
-       num = rte_distributor_get_pkt(db, id, buf, buf, num);
+       num = rte_distributor_get_pkt(db, id, buf, NULL, 0);
        while (!quit) {
                __atomic_fetch_add(&worker_stats[id].handled_packets, num,
                                __ATOMIC_ACQ_REL);
@@ -277,22 +274,17 @@ handle_work_with_free_mbufs(void *arg)
        struct rte_mbuf *buf[8] __rte_cache_aligned;
        struct worker_params *wp = arg;
        struct rte_distributor *d = wp->dist;
+       unsigned int num;
        unsigned int i;
-       unsigned int num = 0;
        unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED);

-       for (i = 0; i < 8; i++)
-               buf[i] = NULL;
-       num = rte_distributor_get_pkt(d, id, buf, buf, 0);
+       num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
        while (!quit) {
                __atomic_fetch_add(&worker_stats[id].handled_packets, num,
                                __ATOMIC_ACQ_REL);
-               for (i = 0; i < num; i++) {
+               for (i = 0; i < num; i++)
                        rte_pktmbuf_free(buf[i]);
-                       buf[i] = NULL;
-               }
-               num = rte_distributor_get_pkt(d,
-                               id, buf, buf, 0);
+               num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
        }
        __atomic_fetch_add(&worker_stats[id].handled_packets, num,
                        __ATOMIC_ACQ_REL);
@@ -347,15 +339,13 @@ handle_work_for_shutdown_test(void *arg)
        struct rte_mbuf *buf[8] __rte_cache_aligned;
        struct worker_params *wp = arg;
        struct rte_distributor *d = wp->dist;
-       unsigned int num = 0;
+       unsigned int num;
        unsigned int i;
        unsigned int zero_id = 0;
        const unsigned int id = __atomic_fetch_add(&worker_idx, 1,
                        __ATOMIC_RELAXED);
-       for (i = 0; i < 8; i++)
-               buf[i] = NULL;

-       num = rte_distributor_get_pkt(d, id, buf, buf, 0);
+       num = rte_distributor_get_pkt(d, id, buf, NULL, 0);

        zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
        if (id == zero_id && num > 0) {
@@ -369,12 +359,9 @@ handle_work_for_shutdown_test(void *arg)
        while (!quit && !(id == zero_id && zero_quit)) {
                __atomic_fetch_add(&worker_stats[id].handled_packets, num,
                                __ATOMIC_ACQ_REL);
-               for (i = 0; i < num; i++) {
+               for (i = 0; i < num; i++)
                        rte_pktmbuf_free(buf[i]);
-                       buf[i] = NULL;
-               }
-               num = rte_distributor_get_pkt(d,
-                               id, buf, buf, 0);
+               num = rte_distributor_get_pkt(d, id, buf, NULL, 0);

                zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
                if (id == zero_id && num > 0) {
@@ -392,21 +379,16 @@ handle_work_for_shutdown_test(void *arg)
                while (zero_quit)
                        usleep(100);

-               for (i = 0; i < num; i++) {
+               for (i = 0; i < num; i++)
                        rte_pktmbuf_free(buf[i]);
-                       buf[i] = NULL;
-               }
-               num = rte_distributor_get_pkt(d,
-                               id, buf, buf, 0);
+               num = rte_distributor_get_pkt(d, id, buf, NULL, 0);

                while (!quit) {
                        __atomic_fetch_add(&worker_stats[id].handled_packets,
                                        num, __ATOMIC_ACQ_REL);
-                       for (i = 0; i < num; i++) {
+                       for (i = 0; i < num; i++)
                                rte_pktmbuf_free(buf[i]);
-                               buf[i] = NULL;
-                       }
-                       num = rte_distributor_get_pkt(d, id, buf, buf, 0);
+                       num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
                }
        }
        rte_distributor_return_pkt(d, id, buf, num);


-- 
David Marchand


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v4 0/8] fix distributor synchronization issues
       [not found]           ` <CGME20200925224216eucas1p28b73b1b56c6f3372db4fcaba0890522f@eucas1p2.samsung.com>
@ 2020-09-25 22:42             ` Lukasz Wojciechowski
       [not found]               ` <CGME20200925224216eucas1p1e8e1d0ecab4bbbf6e43b117c1d210649@eucas1p1.samsung.com>
                                 ` (8 more replies)
  0 siblings, 9 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-09-25 22:42 UTC (permalink / raw)
  Cc: dev, l.wojciechow

During review and verification of the patch created by Sarosh Arif:
"test_distributor: prevent memory leakages from the pool" I found out
that running distributor unit tests multiple times in a row causes fails.
So I investigated all the issues I found.

There are a few synchronization issues that might cause deadlocks
or corrupted data. They are fixed with this set of patches for both tests
and librte_distributor library.

---
v4:
* adjust commit name prefixes app/test -> test/distributor:
* reorder patches
* use NULL oldpkt in rte_distributor_get_pkt() calls in tests

v3:
* add missing acked and tested by statements from v1

v2:
* assign NULL to freed mbufs in distributor test
* fix handshake check on legacy single distributor
     rte_distributor_return_pkt_single()
* add patch 7 passing NULL to legacy API calls if no bufs are returned
* add patch 8 fixing API documentation

Lukasz Wojciechowski (8):
  test/distributor: fix deadlock with frozen worker
  test/distributor: synchronize lcores statistics
  distributor: do not use oldpkt when not needed
  test/distributor: fix freeing mbufs
  test/distributor: collect return mbufs
  distributor: fix missing handshake synchronization
  distributor: fix handshake deadlock
  distributor: align API documentation with code

 app/test/test_distributor.c                   | 117 ++++++++++--------
 lib/librte_distributor/rte_distributor.c      |  27 +++-
 lib/librte_distributor/rte_distributor.h      |  23 ++--
 .../rte_distributor_single.c                  |   4 +
 4 files changed, 102 insertions(+), 69 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v4 1/8] test/distributor: fix deadlock with frozen worker
       [not found]               ` <CGME20200925224216eucas1p1e8e1d0ecab4bbbf6e43b117c1d210649@eucas1p1.samsung.com>
@ 2020-09-25 22:42                 ` Lukasz Wojciechowski
  2020-09-27 23:34                   ` Honnappa Nagarahalli
  0 siblings, 1 reply; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-09-25 22:42 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

The sanity test with worker shutdown delegates all bufs
to be processed by a single lcore worker, then it freezes
one of the lcore workers and continues to send more bufs.

A problem occurred if the frozen lcore was the same as the one
that was processing the mbufs. The lcore processing the mbufs
might be different every time the test is launched.
This is caused by keeping the value of the wkr static variable
in the rte_distributor_process function between running test cases.

The test always froze the lcore with id 0. The patch avoids a
possible collision by freezing the lcore with id zero_idx. The lcore
that receives the data updates zero_idx, so it is not frozen
itself.

To reproduce the issue fixed by this patch, please run
distributor_autotest command in test app several times in a row.

Fixes: c3eabff124e6 ("distributor: add unit tests")
Cc: bruce.richardson@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Tested-by: David Hunt <david.hunt@intel.com>
---
 app/test/test_distributor.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index ba1f81cf8..35b25463a 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -28,6 +28,7 @@ struct worker_params worker_params;
 static volatile int quit;      /**< general quit variable for all threads */
 static volatile int zero_quit; /**< var for when we just want thr0 to quit*/
 static volatile unsigned worker_idx;
+static volatile unsigned zero_idx;
 
 struct worker_stats {
 	volatile unsigned handled_packets;
@@ -346,27 +347,43 @@ handle_work_for_shutdown_test(void *arg)
 	unsigned int total = 0;
 	unsigned int i;
 	unsigned int returned = 0;
+	unsigned int zero_id = 0;
 	const unsigned int id = __atomic_fetch_add(&worker_idx, 1,
 			__ATOMIC_RELAXED);
 
 	num = rte_distributor_get_pkt(d, id, buf, buf, num);
 
+	zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
+	if (id == zero_id && num > 0) {
+		zero_id = (zero_id + 1) %  __atomic_load_n(&worker_idx,
+			__ATOMIC_ACQUIRE);
+		__atomic_store_n(&zero_idx, zero_id, __ATOMIC_RELEASE);
+	}
+
 	/* wait for quit single globally, or for worker zero, wait
 	 * for zero_quit */
-	while (!quit && !(id == 0 && zero_quit)) {
+	while (!quit && !(id == zero_id && zero_quit)) {
 		worker_stats[id].handled_packets += num;
 		count += num;
 		for (i = 0; i < num; i++)
 			rte_pktmbuf_free(buf[i]);
 		num = rte_distributor_get_pkt(d,
 				id, buf, buf, num);
+
+		zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
+		if (id == zero_id && num > 0) {
+			zero_id = (zero_id + 1) %  __atomic_load_n(&worker_idx,
+				__ATOMIC_ACQUIRE);
+			__atomic_store_n(&zero_idx, zero_id, __ATOMIC_RELEASE);
+		}
+
 		total += num;
 	}
 	worker_stats[id].handled_packets += num;
 	count += num;
 	returned = rte_distributor_return_pkt(d, id, buf, num);
 
-	if (id == 0) {
+	if (id == zero_id) {
 		/* for worker zero, allow it to restart to pick up last packet
 		 * when all workers are shutting down.
 		 */
@@ -586,6 +603,7 @@ quit_workers(struct worker_params *wp, struct rte_mempool *p)
 	rte_eal_mp_wait_lcore();
 	quit = 0;
 	worker_idx = 0;
+	zero_idx = 0;
 }
 
 static int
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v4 2/8] test/distributor: synchronize lcores statistics
       [not found]               ` <CGME20200925224217eucas1p1bb5f73109b4aeed8f2badf311fa8dfb5@eucas1p1.samsung.com>
@ 2020-09-25 22:42                 ` Lukasz Wojciechowski
  2020-09-29  5:49                   ` Honnappa Nagarahalli
  0 siblings, 1 reply; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-09-25 22:42 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

Statistics of handled packets are cleared and read on the main lcore,
while they are increased in the workers' handlers on different lcores.

Without synchronization the counters occasionally showed invalid
values. This patch uses atomic acquire/release mechanisms to
synchronize access to the counters.

Fixes: c3eabff124e6 ("distributor: add unit tests")
Cc: bruce.richardson@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 app/test/test_distributor.c | 39 ++++++++++++++++++++++++-------------
 1 file changed, 26 insertions(+), 13 deletions(-)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index 35b25463a..0e49e3714 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -43,7 +43,8 @@ total_packet_count(void)
 {
 	unsigned i, count = 0;
 	for (i = 0; i < worker_idx; i++)
-		count += worker_stats[i].handled_packets;
+		count += __atomic_load_n(&worker_stats[i].handled_packets,
+				__ATOMIC_ACQUIRE);
 	return count;
 }
 
@@ -52,6 +53,7 @@ static inline void
 clear_packet_count(void)
 {
 	memset(&worker_stats, 0, sizeof(worker_stats));
+	rte_atomic_thread_fence(__ATOMIC_RELEASE);
 }
 
 /* this is the basic worker function for sanity test
@@ -72,13 +74,13 @@ handle_work(void *arg)
 	num = rte_distributor_get_pkt(db, id, buf, buf, num);
 	while (!quit) {
 		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
-				__ATOMIC_RELAXED);
+				__ATOMIC_ACQ_REL);
 		count += num;
 		num = rte_distributor_get_pkt(db, id,
 				buf, buf, num);
 	}
 	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
-			__ATOMIC_RELAXED);
+			__ATOMIC_ACQ_REL);
 	count += num;
 	rte_distributor_return_pkt(db, id, buf, num);
 	return 0;
@@ -134,7 +136,8 @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
 
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
-				worker_stats[i].handled_packets);
+			__atomic_load_n(&worker_stats[i].handled_packets,
+					__ATOMIC_ACQUIRE));
 	printf("Sanity test with all zero hashes done.\n");
 
 	/* pick two flows and check they go correctly */
@@ -159,7 +162,9 @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
 
 		for (i = 0; i < rte_lcore_count() - 1; i++)
 			printf("Worker %u handled %u packets\n", i,
-					worker_stats[i].handled_packets);
+				__atomic_load_n(
+					&worker_stats[i].handled_packets,
+					__ATOMIC_ACQUIRE));
 		printf("Sanity test with two hash values done\n");
 	}
 
@@ -185,7 +190,8 @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
 
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
-				worker_stats[i].handled_packets);
+			__atomic_load_n(&worker_stats[i].handled_packets,
+					__ATOMIC_ACQUIRE));
 	printf("Sanity test with non-zero hashes done\n");
 
 	rte_mempool_put_bulk(p, (void *)bufs, BURST);
@@ -280,15 +286,17 @@ handle_work_with_free_mbufs(void *arg)
 		buf[i] = NULL;
 	num = rte_distributor_get_pkt(d, id, buf, buf, num);
 	while (!quit) {
-		worker_stats[id].handled_packets += num;
 		count += num;
+		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
+				__ATOMIC_ACQ_REL);
 		for (i = 0; i < num; i++)
 			rte_pktmbuf_free(buf[i]);
 		num = rte_distributor_get_pkt(d,
 				id, buf, buf, num);
 	}
-	worker_stats[id].handled_packets += num;
 	count += num;
+	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
+			__ATOMIC_ACQ_REL);
 	rte_distributor_return_pkt(d, id, buf, num);
 	return 0;
 }
@@ -363,8 +371,9 @@ handle_work_for_shutdown_test(void *arg)
 	/* wait for quit single globally, or for worker zero, wait
 	 * for zero_quit */
 	while (!quit && !(id == zero_id && zero_quit)) {
-		worker_stats[id].handled_packets += num;
 		count += num;
+		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
+				__ATOMIC_ACQ_REL);
 		for (i = 0; i < num; i++)
 			rte_pktmbuf_free(buf[i]);
 		num = rte_distributor_get_pkt(d,
@@ -379,10 +388,11 @@ handle_work_for_shutdown_test(void *arg)
 
 		total += num;
 	}
-	worker_stats[id].handled_packets += num;
 	count += num;
 	returned = rte_distributor_return_pkt(d, id, buf, num);
 
+	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
+			__ATOMIC_ACQ_REL);
 	if (id == zero_id) {
 		/* for worker zero, allow it to restart to pick up last packet
 		 * when all workers are shutting down.
@@ -394,10 +404,11 @@ handle_work_for_shutdown_test(void *arg)
 				id, buf, buf, num);
 
 		while (!quit) {
-			worker_stats[id].handled_packets += num;
 			count += num;
 			rte_pktmbuf_free(pkt);
 			num = rte_distributor_get_pkt(d, id, buf, buf, num);
+			__atomic_fetch_add(&worker_stats[id].handled_packets,
+					num, __ATOMIC_ACQ_REL);
 		}
 		returned = rte_distributor_return_pkt(d,
 				id, buf, num);
@@ -461,7 +472,8 @@ sanity_test_with_worker_shutdown(struct worker_params *wp,
 
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
-				worker_stats[i].handled_packets);
+			__atomic_load_n(&worker_stats[i].handled_packets,
+					__ATOMIC_ACQUIRE));
 
 	if (total_packet_count() != BURST * 2) {
 		printf("Line %d: Error, not all packets flushed. "
@@ -514,7 +526,8 @@ test_flush_with_worker_shutdown(struct worker_params *wp,
 	zero_quit = 0;
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
-				worker_stats[i].handled_packets);
+			__atomic_load_n(&worker_stats[i].handled_packets,
+					__ATOMIC_ACQUIRE));
 
 	if (total_packet_count() != BURST) {
 		printf("Line %d: Error, not all packets flushed. "
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v4 3/8] distributor: do not use oldpkt when not needed
       [not found]               ` <CGME20200925224218eucas1p2383ff0ebdaee18b581f5f731476f05ab@eucas1p2.samsung.com>
@ 2020-09-25 22:42                 ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-09-25 22:42 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

rte_distributor_request_pkt and rte_distributor_get_pkt dereferenced
the oldpkt parameter when in RTE_DIST_ALG_SINGLE mode even if the
number of buffers returned from worker to distributor was 0.

This patch passes NULL to the legacy API when the number of returned
buffers is 0, which allows passing NULL as the oldpkt parameter.

Distributor tests were also updated to pass NULL as oldpkt and
0 as the number of returned packets where packets are not returned.

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
---
 app/test/test_distributor.c              | 28 +++++++++---------------
 lib/librte_distributor/rte_distributor.c |  4 ++--
 2 files changed, 12 insertions(+), 20 deletions(-)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index 0e49e3714..0e3ab0c4f 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -65,13 +65,10 @@ handle_work(void *arg)
 	struct rte_mbuf *buf[8] __rte_cache_aligned;
 	struct worker_params *wp = arg;
 	struct rte_distributor *db = wp->dist;
-	unsigned int count = 0, num = 0;
+	unsigned int count = 0, num;
 	unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED);
-	int i;
 
-	for (i = 0; i < 8; i++)
-		buf[i] = NULL;
-	num = rte_distributor_get_pkt(db, id, buf, buf, num);
+	num = rte_distributor_get_pkt(db, id, buf, NULL, 0);
 	while (!quit) {
 		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
 				__ATOMIC_ACQ_REL);
@@ -279,20 +276,17 @@ handle_work_with_free_mbufs(void *arg)
 	struct rte_distributor *d = wp->dist;
 	unsigned int count = 0;
 	unsigned int i;
-	unsigned int num = 0;
+	unsigned int num;
 	unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED);
 
-	for (i = 0; i < 8; i++)
-		buf[i] = NULL;
-	num = rte_distributor_get_pkt(d, id, buf, buf, num);
+	num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 	while (!quit) {
 		count += num;
 		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
 				__ATOMIC_ACQ_REL);
 		for (i = 0; i < num; i++)
 			rte_pktmbuf_free(buf[i]);
-		num = rte_distributor_get_pkt(d,
-				id, buf, buf, num);
+		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 	}
 	count += num;
 	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
@@ -351,7 +345,7 @@ handle_work_for_shutdown_test(void *arg)
 	struct worker_params *wp = arg;
 	struct rte_distributor *d = wp->dist;
 	unsigned int count = 0;
-	unsigned int num = 0;
+	unsigned int num;
 	unsigned int total = 0;
 	unsigned int i;
 	unsigned int returned = 0;
@@ -359,7 +353,7 @@ handle_work_for_shutdown_test(void *arg)
 	const unsigned int id = __atomic_fetch_add(&worker_idx, 1,
 			__ATOMIC_RELAXED);
 
-	num = rte_distributor_get_pkt(d, id, buf, buf, num);
+	num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 
 	zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
 	if (id == zero_id && num > 0) {
@@ -376,8 +370,7 @@ handle_work_for_shutdown_test(void *arg)
 				__ATOMIC_ACQ_REL);
 		for (i = 0; i < num; i++)
 			rte_pktmbuf_free(buf[i]);
-		num = rte_distributor_get_pkt(d,
-				id, buf, buf, num);
+		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 
 		zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
 		if (id == zero_id && num > 0) {
@@ -400,13 +393,12 @@ handle_work_for_shutdown_test(void *arg)
 		while (zero_quit)
 			usleep(100);
 
-		num = rte_distributor_get_pkt(d,
-				id, buf, buf, num);
+		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 
 		while (!quit) {
 			count += num;
 			rte_pktmbuf_free(pkt);
-			num = rte_distributor_get_pkt(d, id, buf, buf, num);
+			num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 			__atomic_fetch_add(&worker_stats[id].handled_packets,
 					num, __ATOMIC_ACQ_REL);
 		}
diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 1c047f065..8a12bf856 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -42,7 +42,7 @@ rte_distributor_request_pkt(struct rte_distributor *d,
 
 	if (unlikely(d->alg_type == RTE_DIST_ALG_SINGLE)) {
 		rte_distributor_request_pkt_single(d->d_single,
-			worker_id, oldpkt[0]);
+			worker_id, count ? oldpkt[0] : NULL);
 		return;
 	}
 
@@ -134,7 +134,7 @@ rte_distributor_get_pkt(struct rte_distributor *d,
 	if (unlikely(d->alg_type == RTE_DIST_ALG_SINGLE)) {
 		if (return_count <= 1) {
 			pkts[0] = rte_distributor_get_pkt_single(d->d_single,
-				worker_id, oldpkt[0]);
+				worker_id, return_count ? oldpkt[0] : NULL);
 			return (pkts[0]) ? 1 : 0;
 		} else
 			return -EINVAL;
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v4 4/8] test/distributor: fix freeing mbufs
       [not found]               ` <CGME20200925224218eucas1p221c1af87b0e4547547106503cd336afd@eucas1p2.samsung.com>
@ 2020-09-25 22:42                 ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-09-25 22:42 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

The sanity test with mbuf alloc and the shutdown tests assume that
mbufs passed to worker cores are freed in the handlers.
Such packets should not be returned to the distributor's main
core. The only packets that should be returned are the packets
sent after completion of the tests in the quit_workers function.

This patch fixes freeing mbufs, stops returning them
to the distributor's core and cleans up unused variables.

Fixes: c0de0eb82e40 ("distributor: switch over to new API")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 app/test/test_distributor.c | 30 +++++++-----------------------
 1 file changed, 7 insertions(+), 23 deletions(-)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index 0e3ab0c4f..b302ed118 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -65,20 +65,18 @@ handle_work(void *arg)
 	struct rte_mbuf *buf[8] __rte_cache_aligned;
 	struct worker_params *wp = arg;
 	struct rte_distributor *db = wp->dist;
-	unsigned int count = 0, num;
+	unsigned int num;
 	unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED);
 
 	num = rte_distributor_get_pkt(db, id, buf, NULL, 0);
 	while (!quit) {
 		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
 				__ATOMIC_ACQ_REL);
-		count += num;
 		num = rte_distributor_get_pkt(db, id,
 				buf, buf, num);
 	}
 	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
 			__ATOMIC_ACQ_REL);
-	count += num;
 	rte_distributor_return_pkt(db, id, buf, num);
 	return 0;
 }
@@ -274,21 +272,18 @@ handle_work_with_free_mbufs(void *arg)
 	struct rte_mbuf *buf[8] __rte_cache_aligned;
 	struct worker_params *wp = arg;
 	struct rte_distributor *d = wp->dist;
-	unsigned int count = 0;
 	unsigned int i;
 	unsigned int num;
 	unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED);
 
 	num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 	while (!quit) {
-		count += num;
 		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
 				__ATOMIC_ACQ_REL);
 		for (i = 0; i < num; i++)
 			rte_pktmbuf_free(buf[i]);
 		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 	}
-	count += num;
 	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
 			__ATOMIC_ACQ_REL);
 	rte_distributor_return_pkt(d, id, buf, num);
@@ -316,7 +311,6 @@ sanity_test_with_mbuf_alloc(struct worker_params *wp, struct rte_mempool *p)
 			rte_distributor_process(d, NULL, 0);
 		for (j = 0; j < BURST; j++) {
 			bufs[j]->hash.usr = (i+j) << 1;
-			rte_mbuf_refcnt_set(bufs[j], 1);
 		}
 
 		rte_distributor_process(d, bufs, BURST);
@@ -340,15 +334,11 @@ sanity_test_with_mbuf_alloc(struct worker_params *wp, struct rte_mempool *p)
 static int
 handle_work_for_shutdown_test(void *arg)
 {
-	struct rte_mbuf *pkt = NULL;
 	struct rte_mbuf *buf[8] __rte_cache_aligned;
 	struct worker_params *wp = arg;
 	struct rte_distributor *d = wp->dist;
-	unsigned int count = 0;
 	unsigned int num;
-	unsigned int total = 0;
 	unsigned int i;
-	unsigned int returned = 0;
 	unsigned int zero_id = 0;
 	const unsigned int id = __atomic_fetch_add(&worker_idx, 1,
 			__ATOMIC_RELAXED);
@@ -365,7 +355,6 @@ handle_work_for_shutdown_test(void *arg)
 	/* wait for quit single globally, or for worker zero, wait
 	 * for zero_quit */
 	while (!quit && !(id == zero_id && zero_quit)) {
-		count += num;
 		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
 				__ATOMIC_ACQ_REL);
 		for (i = 0; i < num; i++)
@@ -378,12 +367,7 @@ handle_work_for_shutdown_test(void *arg)
 				__ATOMIC_ACQUIRE);
 			__atomic_store_n(&zero_idx, zero_id, __ATOMIC_RELEASE);
 		}
-
-		total += num;
 	}
-	count += num;
-	returned = rte_distributor_return_pkt(d, id, buf, num);
-
 	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
 			__ATOMIC_ACQ_REL);
 	if (id == zero_id) {
@@ -393,19 +377,19 @@ handle_work_for_shutdown_test(void *arg)
 		while (zero_quit)
 			usleep(100);
 
+		for (i = 0; i < num; i++)
+			rte_pktmbuf_free(buf[i]);
 		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 
 		while (!quit) {
-			count += num;
-			rte_pktmbuf_free(pkt);
-			num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 			__atomic_fetch_add(&worker_stats[id].handled_packets,
 					num, __ATOMIC_ACQ_REL);
+			for (i = 0; i < num; i++)
+				rte_pktmbuf_free(buf[i]);
+			num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 		}
-		returned = rte_distributor_return_pkt(d,
-				id, buf, num);
-		printf("Num returned = %d\n", returned);
 	}
+	rte_distributor_return_pkt(d, id, buf, num);
 	return 0;
 }
 
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v4 5/8] test/distributor: collect return mbufs
       [not found]               ` <CGME20200925224219eucas1p2d61447fef421573d653d2376423ecce0@eucas1p2.samsung.com>
@ 2020-09-25 22:42                 ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-09-25 22:42 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

During the quit_workers function the distributor's main core processes
some packets to wake up pending worker cores so they can quit.
As quit_workers also acts as a cleanup procedure for the next test
case, it should also collect the packets returned by the workers'
handlers, so that the cyclic buffer with returned packets
in the distributor remains empty.

Fixes: c3eabff124e6 ("distributor: add unit tests")
Cc: bruce.richardson@intel.com
Fixes: c0de0eb82e40 ("distributor: switch over to new API")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 app/test/test_distributor.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index b302ed118..1fbdf6fd1 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -590,6 +590,10 @@ quit_workers(struct worker_params *wp, struct rte_mempool *p)
 	rte_distributor_process(d, NULL, 0);
 	rte_distributor_flush(d);
 	rte_eal_mp_wait_lcore();
+
+	while (rte_distributor_returned_pkts(d, bufs, RTE_MAX_LCORE))
+		;
+
 	quit = 0;
 	worker_idx = 0;
 	zero_idx = 0;
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v4 6/8] distributor: fix missing handshake synchronization
       [not found]               ` <CGME20200925224220eucas1p1a44e99a1d7750d37d5aefa61f329209b@eucas1p1.samsung.com>
@ 2020-09-25 22:42                 ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-09-25 22:42 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

The rte_distributor_return_pkt function, which runs on worker cores,
must wait for the distributor core to clear the handshake on retptr64
before using those buffers. While the handshake is set, the distributor
core controls the buffers, and any operation on the worker side might
overwrite buffers which have not been read yet.
The same situation appears in the legacy single distributor. The
rte_distributor_return_pkt_single function shouldn't modify bufptr64
until the handshake on it is cleared by the distributor lcore.

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_distributor/rte_distributor.c        | 14 ++++++++++++++
 lib/librte_distributor/rte_distributor_single.c |  4 ++++
 2 files changed, 18 insertions(+)

diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 8a12bf856..dd68bc233 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -160,6 +160,7 @@ rte_distributor_return_pkt(struct rte_distributor *d,
 {
 	struct rte_distributor_buffer *buf = &d->bufs[worker_id];
 	unsigned int i;
+	volatile int64_t *retptr64;
 
 	if (unlikely(d->alg_type == RTE_DIST_ALG_SINGLE)) {
 		if (num == 1)
@@ -169,6 +170,19 @@ rte_distributor_return_pkt(struct rte_distributor *d,
 			return -EINVAL;
 	}
 
+	retptr64 = &(buf->retptr64[0]);
+	/* Spin while handshake bits are set (scheduler clears it).
+	 * Sync with worker on GET_BUF flag.
+	 */
+	while (unlikely(__atomic_load_n(retptr64, __ATOMIC_ACQUIRE)
+			& RTE_DISTRIB_GET_BUF)) {
+		rte_pause();
+		uint64_t t = rte_rdtsc()+100;
+
+		while (rte_rdtsc() < t)
+			rte_pause();
+	}
+
 	/* Sync with distributor to acquire retptrs */
 	__atomic_thread_fence(__ATOMIC_ACQUIRE);
 	for (i = 0; i < RTE_DIST_BURST_SIZE; i++)
diff --git a/lib/librte_distributor/rte_distributor_single.c b/lib/librte_distributor/rte_distributor_single.c
index abaf7730c..f4725b1d0 100644
--- a/lib/librte_distributor/rte_distributor_single.c
+++ b/lib/librte_distributor/rte_distributor_single.c
@@ -74,6 +74,10 @@ rte_distributor_return_pkt_single(struct rte_distributor_single *d,
 	union rte_distributor_buffer_single *buf = &d->bufs[worker_id];
 	uint64_t req = (((int64_t)(uintptr_t)oldpkt) << RTE_DISTRIB_FLAG_BITS)
 			| RTE_DISTRIB_RETURN_BUF;
+	while (unlikely(__atomic_load_n(&buf->bufptr64, __ATOMIC_RELAXED)
+			& RTE_DISTRIB_FLAGS_MASK))
+		rte_pause();
+
 	/* Sync with distributor on RETURN_BUF flag. */
 	__atomic_store_n(&(buf->bufptr64), req, __ATOMIC_RELEASE);
 	return 0;
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v4 7/8] distributor: fix handshake deadlock
       [not found]               ` <CGME20200925224221eucas1p151297834da32a0f7cfdffc120f57ab3a@eucas1p1.samsung.com>
@ 2020-09-25 22:42                 ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-09-25 22:42 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

Synchronization of data exchange between distributor and worker cores
is based on 2 handshakes: retptr64 for returning mbufs from workers
to distributor and bufptr64 for passing mbufs to workers.

Without the proper order of verifying those 2 handshakes a deadlock may
occur: the worker core wants to return mbufs and waits for the retptr
handshake to be cleared, while the distributor core waits on bufptr to
send mbufs to the worker.

This can happen because the worker core first returns mbufs and only
then requests new ones, while the distributor first releases mbufs to
the worker and only then handles the returned packets.

This patch removes the possibility of the deadlock by always taking
care of the returned packets first on the distributor side, and by
handling returned packets while waiting to release new ones.
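
To make the interleaving concrete, here is a minimal stand-alone model
of the two handshakes in plain C11 atomics and pthreads (not the DPDK
code itself). Removing the marked store reproduces the deadlock;
keeping it mirrors the fix of handling returns inside the wait loop:

	#include <pthread.h>
	#include <stdatomic.h>
	#include <stdio.h>

	static atomic_int retptr; /* worker -> distributor: "take my returns" */
	static atomic_int bufptr; /* worker -> distributor: "give me packets" */

	static void *worker_fn(void *arg)
	{
		(void)arg;
		atomic_store(&retptr, 1);	/* return packets first... */
		while (atomic_load(&retptr))	/* ...wait until consumed */
			;
		atomic_store(&bufptr, 1);	/* only then request more */
		return NULL;
	}

	static void *distributor_fn(void *arg)
	{
		(void)arg;
		while (!atomic_load(&bufptr))	/* waiting to release... */
			atomic_store(&retptr, 0); /* ...handle returns here;
						   * remove this store and both
						   * threads spin forever
						   */
		return NULL;
	}

	int main(void)
	{
		pthread_t w, d;
		pthread_create(&w, NULL, worker_fn, NULL);
		pthread_create(&d, NULL, distributor_fn, NULL);
		pthread_join(w, NULL);
		pthread_join(d, NULL);
		puts("no deadlock");
		return 0;
	}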

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_distributor/rte_distributor.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index dd68bc233..b720abe03 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -321,12 +321,14 @@ release(struct rte_distributor *d, unsigned int wkr)
 	struct rte_distributor_buffer *buf = &(d->bufs[wkr]);
 	unsigned int i;
 
+	handle_returns(d, wkr);
+
 	/* Sync with worker on GET_BUF flag */
 	while (!(__atomic_load_n(&(d->bufs[wkr].bufptr64[0]), __ATOMIC_ACQUIRE)
-		& RTE_DISTRIB_GET_BUF))
+		& RTE_DISTRIB_GET_BUF)) {
+		handle_returns(d, wkr);
 		rte_pause();
-
-	handle_returns(d, wkr);
+	}
 
 	buf->count = 0;
 
@@ -376,6 +378,7 @@ rte_distributor_process(struct rte_distributor *d,
 		/* Flush out all non-full cache-lines to workers. */
 		for (wid = 0 ; wid < d->num_workers; wid++) {
 			/* Sync with worker on GET_BUF flag. */
+			handle_returns(d, wid);
 			if (__atomic_load_n(&(d->bufs[wid].bufptr64[0]),
 				__ATOMIC_ACQUIRE) & RTE_DISTRIB_GET_BUF) {
 				release(d, wid);
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v4 8/8] distributor: align API documentation with code
       [not found]               ` <CGME20200925224222eucas1p1b10891c21bfef6784777526af4443dde@eucas1p1.samsung.com>
@ 2020-09-25 22:42                 ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-09-25 22:42 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

After introducing the burst API, some artefacts from the legacy
single API remained in the API documentation.
Also, the documented return values of the rte_distributor_poll_pkt()
function did not match the implementation.
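
For reference, a hedged sketch of a worker polling loop under the
documented semantics (passing NULL as oldpkt is legal as of this
series):

	struct rte_mbuf *buf[8] __rte_cache_aligned;
	int num;

	rte_distributor_request_pkt(d, worker_id, NULL, 0);
	do {
		/* burst mode (RTE_DIST_ALG_BURST) yields -1 while the
		 * request is unfulfilled; legacy single mode yields 0
		 */
		num = rte_distributor_poll_pkt(d, worker_id, buf);
	} while (num <= 0);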

Fixes: c0de0eb82e40 ("distributor: switch over to new API")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
---
 lib/librte_distributor/rte_distributor.h | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/lib/librte_distributor/rte_distributor.h b/lib/librte_distributor/rte_distributor.h
index 327c0c4ab..a073e6461 100644
--- a/lib/librte_distributor/rte_distributor.h
+++ b/lib/librte_distributor/rte_distributor.h
@@ -155,7 +155,7 @@ rte_distributor_clear_returns(struct rte_distributor *d);
  * @param pkts
  *   The mbufs pointer array to be filled in (up to 8 packets)
  * @param oldpkt
- *   The previous packet, if any, being processed by the worker
+ *   The previous packets, if any, being processed by the worker
  * @param retcount
  *   The number of packets being returned
  *
@@ -187,15 +187,15 @@ rte_distributor_return_pkt(struct rte_distributor *d,
 
 /**
  * API called by a worker to request a new packet to process.
- * Any previous packet given to the worker is assumed to have completed
+ * Any previous packets given to the worker are assumed to have completed
  * processing, and may be optionally returned to the distributor via
  * the oldpkt parameter.
- * Unlike rte_distributor_get_pkt_burst(), this function does not wait for a
- * new packet to be provided by the distributor.
+ * Unlike rte_distributor_get_pkt(), this function does not wait for
+ * new packets to be provided by the distributor.
  *
- * NOTE: after calling this function, rte_distributor_poll_pkt_burst() should
- * be used to poll for the packet requested. The rte_distributor_get_pkt_burst()
- * API should *not* be used to try and retrieve the new packet.
+ * NOTE: after calling this function, rte_distributor_poll_pkt() should
+ * be used to poll for the packets requested. The rte_distributor_get_pkt()
+ * API should *not* be used to try and retrieve the new packets.
  *
  * @param d
  *   The distributor instance to be used
@@ -213,9 +213,9 @@ rte_distributor_request_pkt(struct rte_distributor *d,
 		unsigned int count);
 
 /**
- * API called by a worker to check for a new packet that was previously
+ * API called by a worker to check for new packets that were previously
  * requested by a call to rte_distributor_request_pkt(). It does not wait
- * for the new packet to be available, but returns NULL if the request has
+ * for the new packets to be available, but returns if the request has
  * not yet been fulfilled by the distributor.
  *
  * @param d
@@ -227,8 +227,9 @@ rte_distributor_request_pkt(struct rte_distributor *d,
  *   The array of mbufs being given to the worker
  *
  * @return
- *   The number of packets being given to the worker thread, zero if no
- *   packet is yet available.
+ *   The number of packets being given to the worker thread,
+ *   -1 if no packets are yet available (burst API - RTE_DIST_ALG_BURST)
+ *   0 if no packets are yet available (legacy single API - RTE_DIST_ALG_SINGLE)
  */
 int
 rte_distributor_poll_pkt(struct rte_distributor *d,
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v3 0/8] fix distributor synchronization issues
  2020-09-25 12:31           ` [dpdk-dev] [PATCH v3 0/8] fix distributor synchronization issues David Marchand
@ 2020-09-25 22:42             ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-09-25 22:42 UTC (permalink / raw)
  To: David Marchand; +Cc: dev, David Hunt, "'Lukasz Wojciechowski'",

Hello David,

Thank you for your review.

W dniu 25.09.2020 o 14:31, David Marchand pisze:
> Hello Lukasz,
>
> On Wed, Sep 23, 2020 at 3:25 PM Lukasz Wojciechowski
> <l.wojciechow@partner.samsung.com> wrote:
>> During review and verification of the patch created by Sarosh Arif:
>> "test_distributor: prevent memory leakages from the pool" I found out
>> that running distributor unit tests multiple times in a row causes fails.
>> So I investigated all the issues I found.
>>
>> There are few synchronization issues that might cause deadlocks
>> or corrupted data. They are fixed with this set of patches for both tests
>> and librte_distributor library.
>>
>> ---
>> v3:
>> * add missing acked and tested by statements from v1
>>
>> v2:
>> * assign NULL to freed mbufs in distributor test
>> * fix handshake check on legacy single distributor
>>       rte_distributor_return_pkt_single()
>> * add patch 7 passing NULL to legacy API calls if no bufs are returned
>> * add patch 8 fixing API documentation
>>
>> Lukasz Wojciechowski (8):
>>    app/test: fix deadlock in distributor test
>>    app/test: synchronize statistics between lcores
>>    app/test: fix freeing mbufs in distributor tests
>>    app/test: collect return mbufs in distributor test
> For these patches, we can use the "test/distributor: " prefix, and we
> then avoid repeating "in distributor test"
Changed
>
>>    distributor: fix missing handshake synchronization
>>    distributor: fix handshake deadlock
>>    distributor: do not use oldpkt when not needed
>>    distributor: align API documentation with code
> Thanks for working on those fixes !
>
> Here is a suggestion:
>
> - we can move this new patch 7 before patch 3 in the series, and
> update the unit test:
>   * by passing NULL to the first call to rte_distributor_get_pkt(),
> there is no need for buf[] array init in handle_work(),
> handle_work_with_free_mbufs() and handle_work_for_shutdown_test(),
>   * at all points of those functions the buf[] array then contains only
> [0, num[ valid entries,
>   * bonus point, this makes the UT check passing NULL oldpkt,
>
> - the former patch 3 is then easier to do since there is no need for
> buf[] array clearing,
>
> This gives the following diff, wdyt?
I reordered the patches as you suggested.
I added the unit test changes in the same patch that changes the
distributor lib. The changes follow your diff.
This also simplified the "fix freeing mbufs" patch.
It's applied in v4.
>
> diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
> index f31b54edf3..b7ab93ecbe 100644
> --- a/app/test/test_distributor.c
> +++ b/app/test/test_distributor.c
> @@ -65,13 +65,10 @@ handle_work(void *arg)
>          struct rte_mbuf *buf[8] __rte_cache_aligned;
>          struct worker_params *wp = arg;
>          struct rte_distributor *db = wp->dist;
> -       unsigned int count = 0, num = 0;
> +       unsigned int count = 0, num;
>          unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED);
> -       int i;
>
> -       for (i = 0; i < 8; i++)
> -               buf[i] = NULL;
> -       num = rte_distributor_get_pkt(db, id, buf, buf, num);
> +       num = rte_distributor_get_pkt(db, id, buf, NULL, 0);
>          while (!quit) {
>                  __atomic_fetch_add(&worker_stats[id].handled_packets, num,
>                                  __ATOMIC_ACQ_REL);
> @@ -277,22 +274,17 @@ handle_work_with_free_mbufs(void *arg)
>          struct rte_mbuf *buf[8] __rte_cache_aligned;
>          struct worker_params *wp = arg;
>          struct rte_distributor *d = wp->dist;
> +       unsigned int num;
>          unsigned int i;
> -       unsigned int num = 0;
>          unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED);
>
> -       for (i = 0; i < 8; i++)
> -               buf[i] = NULL;
> -       num = rte_distributor_get_pkt(d, id, buf, buf, 0);
> +       num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
>          while (!quit) {
>                  __atomic_fetch_add(&worker_stats[id].handled_packets, num,
>                                  __ATOMIC_ACQ_REL);
> -               for (i = 0; i < num; i++) {
> +               for (i = 0; i < num; i++)
>                          rte_pktmbuf_free(buf[i]);
> -                       buf[i] = NULL;
> -               }
> -               num = rte_distributor_get_pkt(d,
> -                               id, buf, buf, 0);
> +               num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
>          }
>          __atomic_fetch_add(&worker_stats[id].handled_packets, num,
>                          __ATOMIC_ACQ_REL);
> @@ -347,15 +339,13 @@ handle_work_for_shutdown_test(void *arg)
>          struct rte_mbuf *buf[8] __rte_cache_aligned;
>          struct worker_params *wp = arg;
>          struct rte_distributor *d = wp->dist;
> -       unsigned int num = 0;
> +       unsigned int num;
>          unsigned int i;
>          unsigned int zero_id = 0;
>          const unsigned int id = __atomic_fetch_add(&worker_idx, 1,
>                          __ATOMIC_RELAXED);
> -       for (i = 0; i < 8; i++)
> -               buf[i] = NULL;
>
> -       num = rte_distributor_get_pkt(d, id, buf, buf, 0);
> +       num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
>
>          zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
>          if (id == zero_id && num > 0) {
> @@ -369,12 +359,9 @@ handle_work_for_shutdown_test(void *arg)
>          while (!quit && !(id == zero_id && zero_quit)) {
>                  __atomic_fetch_add(&worker_stats[id].handled_packets, num,
>                                  __ATOMIC_ACQ_REL);
> -               for (i = 0; i < num; i++) {
> +               for (i = 0; i < num; i++)
>                          rte_pktmbuf_free(buf[i]);
> -                       buf[i] = NULL;
> -               }
> -               num = rte_distributor_get_pkt(d,
> -                               id, buf, buf, 0);
> +               num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
>
>                  zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
>                  if (id == zero_id && num > 0) {
> @@ -392,21 +379,16 @@ handle_work_for_shutdown_test(void *arg)
>                  while (zero_quit)
>                          usleep(100);
>
> -               for (i = 0; i < num; i++) {
> +               for (i = 0; i < num; i++)
>                          rte_pktmbuf_free(buf[i]);
> -                       buf[i] = NULL;
> -               }
> -               num = rte_distributor_get_pkt(d,
> -                               id, buf, buf, 0);
> +               num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
>
>                  while (!quit) {
>                          __atomic_fetch_add(&worker_stats[id].handled_packets,
>                                          num, __ATOMIC_ACQ_REL);
> -                       for (i = 0; i < num; i++) {
> +                       for (i = 0; i < num; i++)
>                                  rte_pktmbuf_free(buf[i]);
> -                               buf[i] = NULL;
> -                       }
> -                       num = rte_distributor_get_pkt(d, id, buf, buf, 0);
> +                       num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
>                  }
>          }
>          rte_distributor_return_pkt(d, id, buf, num);
>
>
-- 
Lukasz Wojciechowski
Principal Software Engineer

Samsung R&D Institute Poland
Samsung Electronics
Office +48 22 377 88 25
l.wojciechow@partner.samsung.com


^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v4 1/8] test/distributor: fix deadlock with freezed worker
  2020-09-25 22:42                 ` [dpdk-dev] [PATCH v4 1/8] test/distributor: fix deadlock with freezed worker Lukasz Wojciechowski
@ 2020-09-27 23:34                   ` Honnappa Nagarahalli
  2020-09-30 20:22                     ` Lukasz Wojciechowski
  0 siblings, 1 reply; 164+ messages in thread
From: Honnappa Nagarahalli @ 2020-09-27 23:34 UTC (permalink / raw)
  To: Lukasz Wojciechowski, David Hunt, Bruce Richardson
  Cc: dev, stable, nd, Honnappa Nagarahalli, nd

Hi Lukasz,
	Few comments inline

<snip>

> 
> The sanity test with worker shutdown delegates all bufs to be processed by a
> single lcore worker, then it freezes one of the lcore workers and continues to
> send more bufs.
The designated core to freeze (the core with id == 0 in the existing code) gets out of the first while loop and enters the 2nd while loop in the function 'handle_work_for_shutdown_test'.
In between these 2 while loops, it informs the distributor that it will not accept any more packets by calling 'rte_distributor_return_pkt' (at least this API is supposed to do that). But the distributor hangs waiting for the frozen core to start accepting packets again. I think this is a problem with the distributor and not the test case.
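
For orientation, the shape of handle_work_for_shutdown_test() at this
point in the series, abridged from the patch quoted below:

	num = rte_distributor_get_pkt(d, id, buf, buf, num);
	while (!quit && !(id == zero_id && zero_quit)) {
		/* 1st loop: count, free and fetch more packets */
	}
	returned = rte_distributor_return_pkt(d, id, buf, num);
	if (id == zero_id) {
		while (zero_quit)	/* the frozen core parks here */
			usleep(100);
		num = rte_distributor_get_pkt(d, id, buf, buf, num);
		/* 2nd loop follows */
	}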

> 
> Problem occurred if freezed lcore is the same as the one that is processing
> the mbufs. The lcore processing mbufs might be different every time test is
> launched.
> This is caused by keeping the value of wkr static variable in
> rte_distributor_process function between running test cases.
> 
> Test freezed always lcore with 0 id. The patch changes avoids possible
> collision by freezing lcore with zero_idx. The lcore that receives the data
> updates the zero_idx, so it is not freezed itself.
> 
> To reproduce the issue fixed by this patch, please run distributor_autotest
> command in test app several times in a row.
> 
> Fixes: c3eabff124e6 ("distributor: add unit tests")
> Cc: bruce.richardson@intel.com
> Cc: stable@dpdk.org
> 
> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
> Tested-by: David Hunt <david.hunt@intel.com>
> ---
>  app/test/test_distributor.c | 22 ++++++++++++++++++++--
>  1 file changed, 20 insertions(+), 2 deletions(-)
> 
> diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c index
> ba1f81cf8..35b25463a 100644
> --- a/app/test/test_distributor.c
> +++ b/app/test/test_distributor.c
> @@ -28,6 +28,7 @@ struct worker_params worker_params;
>  static volatile int quit;      /**< general quit variable for all threads */
>  static volatile int zero_quit; /**< var for when we just want thr0 to quit*/
> static volatile unsigned worker_idx;
> +static volatile unsigned zero_idx;
> 
>  struct worker_stats {
>  	volatile unsigned handled_packets;
> @@ -346,27 +347,43 @@ handle_work_for_shutdown_test(void *arg)
>  	unsigned int total = 0;
>  	unsigned int i;
>  	unsigned int returned = 0;
> +	unsigned int zero_id = 0;
>  	const unsigned int id = __atomic_fetch_add(&worker_idx, 1,
>  			__ATOMIC_RELAXED);
> 
>  	num = rte_distributor_get_pkt(d, id, buf, buf, num);
> 
> +	zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
> +	if (id == zero_id && num > 0) {
> +		zero_id = (zero_id + 1) %  __atomic_load_n(&worker_idx,
> +			__ATOMIC_ACQUIRE);
> +		__atomic_store_n(&zero_idx, zero_id, __ATOMIC_RELEASE);
> +	}
> +
>  	/* wait for quit single globally, or for worker zero, wait
>  	 * for zero_quit */
> -	while (!quit && !(id == 0 && zero_quit)) {
> +	while (!quit && !(id == zero_id && zero_quit)) {
>  		worker_stats[id].handled_packets += num;
>  		count += num;
>  		for (i = 0; i < num; i++)
>  			rte_pktmbuf_free(buf[i]);
>  		num = rte_distributor_get_pkt(d,
>  				id, buf, buf, num);
> +
> +		zero_id = __atomic_load_n(&zero_idx,
> __ATOMIC_ACQUIRE);
> +		if (id == zero_id && num > 0) {
> +			zero_id = (zero_id + 1) %
> __atomic_load_n(&worker_idx,
> +				__ATOMIC_ACQUIRE);
> +			__atomic_store_n(&zero_idx, zero_id,
> __ATOMIC_RELEASE);
> +		}
> +
>  		total += num;
>  	}
>  	worker_stats[id].handled_packets += num;
>  	count += num;
>  	returned = rte_distributor_return_pkt(d, id, buf, num);
> 
> -	if (id == 0) {
> +	if (id == zero_id) {
>  		/* for worker zero, allow it to restart to pick up last packet
>  		 * when all workers are shutting down.
>  		 */
> @@ -586,6 +603,7 @@ quit_workers(struct worker_params *wp, struct
> rte_mempool *p)
>  	rte_eal_mp_wait_lcore();
>  	quit = 0;
>  	worker_idx = 0;
> +	zero_idx = 0;
>  }
> 
>  static int
> --
> 2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v4 2/8] test/distributor: synchronize lcores statistics
  2020-09-25 22:42                 ` [dpdk-dev] [PATCH v4 2/8] test/distributor: synchronize lcores statistics Lukasz Wojciechowski
@ 2020-09-29  5:49                   ` Honnappa Nagarahalli
  2020-10-02 11:25                     ` Lukasz Wojciechowski
  0 siblings, 1 reply; 164+ messages in thread
From: Honnappa Nagarahalli @ 2020-09-29  5:49 UTC (permalink / raw)
  To: Lukasz Wojciechowski, David Hunt, Bruce Richardson
  Cc: dev, stable, nd, Honnappa Nagarahalli, nd

<snip>

> 
> Statistics of handled packets are cleared and read on main lcore, while they
> are increased in worker handlers on different lcores.
> 
> Without synchronization they occasionally showed invalid values.
> This patch uses atomic acquire/release mechanisms to synchronize.
In general, load-acquire and store-release memory orderings are required while synchronizing data (that cannot be updated atomically) between threads. In this situation, making the counters atomic is enough.
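
A minimal sketch of what that means for these tests, using the names
from test_distributor.c:

	/* worker lcore: atomicity is all the counter update needs */
	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
			__ATOMIC_RELAXED);

	/* main lcore: atomic read of each counter, no ordering needed */
	count += __atomic_load_n(&worker_stats[i].handled_packets,
			__ATOMIC_RELAXED);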

> 
> Fixes: c3eabff124e6 ("distributor: add unit tests")
> Cc: bruce.richardson@intel.com
> Cc: stable@dpdk.org
> 
> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
> Acked-by: David Hunt <david.hunt@intel.com>
> ---
>  app/test/test_distributor.c | 39 ++++++++++++++++++++++++-------------
>  1 file changed, 26 insertions(+), 13 deletions(-)
> 
> diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c index
> 35b25463a..0e49e3714 100644
> --- a/app/test/test_distributor.c
> +++ b/app/test/test_distributor.c
> @@ -43,7 +43,8 @@ total_packet_count(void)  {
>  	unsigned i, count = 0;
>  	for (i = 0; i < worker_idx; i++)
> -		count += worker_stats[i].handled_packets;
> +		count +=
> __atomic_load_n(&worker_stats[i].handled_packets,
> +				__ATOMIC_ACQUIRE);
RELAXED memory order is sufficient. For example, the worker threads are not 'releasing' any data to the main thread beyond the atomically updated counters themselves.

>  	return count;
>  }
> 
> @@ -52,6 +53,7 @@ static inline void
>  clear_packet_count(void)
>  {
>  	memset(&worker_stats, 0, sizeof(worker_stats));
> +	rte_atomic_thread_fence(__ATOMIC_RELEASE);
Ideally, the counters should be set to 0 atomically rather than using a memset.
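
For example, a sketch of clear_packet_count() doing per-counter atomic
stores, assuming the worker_stats[RTE_MAX_LCORE] array used by these
tests:

	static inline void
	clear_packet_count(void)
	{
		unsigned int i;

		for (i = 0; i < RTE_MAX_LCORE; i++)
			__atomic_store_n(&worker_stats[i].handled_packets,
					0, __ATOMIC_RELAXED);
	}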

>  }
> 
>  /* this is the basic worker function for sanity test @@ -72,13 +74,13 @@
> handle_work(void *arg)
>  	num = rte_distributor_get_pkt(db, id, buf, buf, num);
>  	while (!quit) {
>  		__atomic_fetch_add(&worker_stats[id].handled_packets,
> num,
> -				__ATOMIC_RELAXED);
> +				__ATOMIC_ACQ_REL);
Using the __ATOMIC_ACQ_REL order does not mean anything to the main thread. The main thread might still see the updates from different threads in different order.

>  		count += num;
>  		num = rte_distributor_get_pkt(db, id,
>  				buf, buf, num);
>  	}
>  	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
> -			__ATOMIC_RELAXED);
> +			__ATOMIC_ACQ_REL);
Same here, do not see why this change is required.

>  	count += num;
>  	rte_distributor_return_pkt(db, id, buf, num);
>  	return 0;
> @@ -134,7 +136,8 @@ sanity_test(struct worker_params *wp, struct
> rte_mempool *p)
> 
>  	for (i = 0; i < rte_lcore_count() - 1; i++)
>  		printf("Worker %u handled %u packets\n", i,
> -				worker_stats[i].handled_packets);
> +			__atomic_load_n(&worker_stats[i].handled_packets,
> +					__ATOMIC_ACQUIRE));
__ATOMIC_RELAXED is enough.

>  	printf("Sanity test with all zero hashes done.\n");
> 
>  	/* pick two flows and check they go correctly */ @@ -159,7 +162,9
> @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
> 
>  		for (i = 0; i < rte_lcore_count() - 1; i++)
>  			printf("Worker %u handled %u packets\n", i,
> -					worker_stats[i].handled_packets);
> +				__atomic_load_n(
> +					&worker_stats[i].handled_packets,
> +					__ATOMIC_ACQUIRE));
__ATOMIC_RELAXED is enough

>  		printf("Sanity test with two hash values done\n");
>  	}
> 
> @@ -185,7 +190,8 @@ sanity_test(struct worker_params *wp, struct
> rte_mempool *p)
> 
>  	for (i = 0; i < rte_lcore_count() - 1; i++)
>  		printf("Worker %u handled %u packets\n", i,
> -				worker_stats[i].handled_packets);
> +			__atomic_load_n(&worker_stats[i].handled_packets,
> +					__ATOMIC_ACQUIRE));
__ATOMIC_RELAXED is enough

>  	printf("Sanity test with non-zero hashes done\n");
> 
>  	rte_mempool_put_bulk(p, (void *)bufs, BURST); @@ -280,15
> +286,17 @@ handle_work_with_free_mbufs(void *arg)
>  		buf[i] = NULL;
>  	num = rte_distributor_get_pkt(d, id, buf, buf, num);
>  	while (!quit) {
> -		worker_stats[id].handled_packets += num;
>  		count += num;
> +		__atomic_fetch_add(&worker_stats[id].handled_packets,
> num,
> +				__ATOMIC_ACQ_REL);
IMO, the problem would be the non-atomic update of the statistics. So, __ATOMIC_RELAXED is enough

>  		for (i = 0; i < num; i++)
>  			rte_pktmbuf_free(buf[i]);
>  		num = rte_distributor_get_pkt(d,
>  				id, buf, buf, num);
>  	}
> -	worker_stats[id].handled_packets += num;
>  	count += num;
> +	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
> +			__ATOMIC_ACQ_REL);
Same here, the problem is non-atomic update of the statistics, __ATOMIC_RELAXED is enough.
Similarly, for changes below, __ATOMIC_RELAXED is enough.

>  	rte_distributor_return_pkt(d, id, buf, num);
>  	return 0;
>  }
> @@ -363,8 +371,9 @@ handle_work_for_shutdown_test(void *arg)
>  	/* wait for quit single globally, or for worker zero, wait
>  	 * for zero_quit */
>  	while (!quit && !(id == zero_id && zero_quit)) {
> -		worker_stats[id].handled_packets += num;
>  		count += num;
> +		__atomic_fetch_add(&worker_stats[id].handled_packets,
> num,
> +				__ATOMIC_ACQ_REL);
>  		for (i = 0; i < num; i++)
>  			rte_pktmbuf_free(buf[i]);
>  		num = rte_distributor_get_pkt(d,
> @@ -379,10 +388,11 @@ handle_work_for_shutdown_test(void *arg)
> 
>  		total += num;
>  	}
> -	worker_stats[id].handled_packets += num;
>  	count += num;
>  	returned = rte_distributor_return_pkt(d, id, buf, num);
> 
> +	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
> +			__ATOMIC_ACQ_REL);
>  	if (id == zero_id) {
>  		/* for worker zero, allow it to restart to pick up last packet
>  		 * when all workers are shutting down.
> @@ -394,10 +404,11 @@ handle_work_for_shutdown_test(void *arg)
>  				id, buf, buf, num);
> 
>  		while (!quit) {
> -			worker_stats[id].handled_packets += num;
>  			count += num;
>  			rte_pktmbuf_free(pkt);
>  			num = rte_distributor_get_pkt(d, id, buf, buf, num);
> +
> 	__atomic_fetch_add(&worker_stats[id].handled_packets,
> +					num, __ATOMIC_ACQ_REL);
>  		}
>  		returned = rte_distributor_return_pkt(d,
>  				id, buf, num);
> @@ -461,7 +472,8 @@ sanity_test_with_worker_shutdown(struct
> worker_params *wp,
> 
>  	for (i = 0; i < rte_lcore_count() - 1; i++)
>  		printf("Worker %u handled %u packets\n", i,
> -				worker_stats[i].handled_packets);
> +			__atomic_load_n(&worker_stats[i].handled_packets,
> +					__ATOMIC_ACQUIRE));
> 
>  	if (total_packet_count() != BURST * 2) {
>  		printf("Line %d: Error, not all packets flushed. "
> @@ -514,7 +526,8 @@ test_flush_with_worker_shutdown(struct
> worker_params *wp,
>  	zero_quit = 0;
>  	for (i = 0; i < rte_lcore_count() - 1; i++)
>  		printf("Worker %u handled %u packets\n", i,
> -				worker_stats[i].handled_packets);
> +			__atomic_load_n(&worker_stats[i].handled_packets,
> +					__ATOMIC_ACQUIRE));
> 
>  	if (total_packet_count() != BURST) {
>  		printf("Line %d: Error, not all packets flushed. "
> --
> 2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v4 1/8] test/distributor: fix deadlock with freezed worker
  2020-09-27 23:34                   ` Honnappa Nagarahalli
@ 2020-09-30 20:22                     ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-09-30 20:22 UTC (permalink / raw)
  To: Honnappa Nagarahalli, David Hunt, Bruce Richardson; +Cc: dev, stable, nd

Hi Honnappa,

Thank you very much for your review
Reply inline below

W dniu 28.09.2020 o 01:34, Honnappa Nagarahalli pisze:
> Hi Lukasz,
> 	Few comments inline
>
> <snip>
>
>> The sanity test with worker shutdown delegates all bufs to be processed by a
>> single lcore worker, then it freezes one of the lcore workers and continues to
>> send more bufs.
> The designated core to freeze (the core with id == 0 in the existing code) gets out of the first while loop and enters the 2nd while loop in the function 'handle_work_for_shutdown_test'.
> In between these 2 while loops, it informs the distributor that it will not accept any more packets by calling 'rte_distributor_return_pkt' (at least this API is supposed to do that). But the distributor hangs waiting for the frozen core to start accepting packets again. I think this is a problem with the distributor and not the test case.
I agree.
I did some further investigation and you are correct. This is a 
distributor issue. The new burst model doesn't care at all whether the 
worker has called rte_distributor_return_pkt(). If it doesn't find a 
worker with a matching tag, it will assign packets to the worker 
without checking whether it requested any.

The legacy single model used a different handshake value to indicate 
that a worker does not want any more packets. That flag is reused for 
other purposes in the burst model (marking valid return packets), which 
is obviously wrong.

The tests, however, also need to be adjusted, as they don't properly 
verify the request/return status of a worker.

I hope I will be able to update the patches this or next week to fix it.

>
>> Problem occurred if freezed lcore is the same as the one that is processing
>> the mbufs. The lcore processing mbufs might be different every time test is
>> launched.
>> This is caused by keeping the value of wkr static variable in
>> rte_distributor_process function between running test cases.
>>
>> Test freezed always lcore with 0 id. The patch changes avoids possible
>> collision by freezing lcore with zero_idx. The lcore that receives the data
>> updates the zero_idx, so it is not freezed itself.
>>
>> To reproduce the issue fixed by this patch, please run distributor_autotest
>> command in test app several times in a row.
>>
>> Fixes: c3eabff124e6 ("distributor: add unit tests")
>> Cc: bruce.richardson@intel.com
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
>> Tested-by: David Hunt <david.hunt@intel.com>
>> ---
>>   app/test/test_distributor.c | 22 ++++++++++++++++++++--
>>   1 file changed, 20 insertions(+), 2 deletions(-)
>>
>> diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c index
>> ba1f81cf8..35b25463a 100644
>> --- a/app/test/test_distributor.c
>> +++ b/app/test/test_distributor.c
>> @@ -28,6 +28,7 @@ struct worker_params worker_params;
>>   static volatile int quit;      /**< general quit variable for all threads */
>>   static volatile int zero_quit; /**< var for when we just want thr0 to quit*/
>> static volatile unsigned worker_idx;
>> +static volatile unsigned zero_idx;
>>
>>   struct worker_stats {
>>   	volatile unsigned handled_packets;
>> @@ -346,27 +347,43 @@ handle_work_for_shutdown_test(void *arg)
>>   	unsigned int total = 0;
>>   	unsigned int i;
>>   	unsigned int returned = 0;
>> +	unsigned int zero_id = 0;
>>   	const unsigned int id = __atomic_fetch_add(&worker_idx, 1,
>>   			__ATOMIC_RELAXED);
>>
>>   	num = rte_distributor_get_pkt(d, id, buf, buf, num);
>>
>> +	zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
>> +	if (id == zero_id && num > 0) {
>> +		zero_id = (zero_id + 1) %  __atomic_load_n(&worker_idx,
>> +			__ATOMIC_ACQUIRE);
>> +		__atomic_store_n(&zero_idx, zero_id, __ATOMIC_RELEASE);
>> +	}
>> +
>>   	/* wait for quit single globally, or for worker zero, wait
>>   	 * for zero_quit */
>> -	while (!quit && !(id == 0 && zero_quit)) {
>> +	while (!quit && !(id == zero_id && zero_quit)) {
>>   		worker_stats[id].handled_packets += num;
>>   		count += num;
>>   		for (i = 0; i < num; i++)
>>   			rte_pktmbuf_free(buf[i]);
>>   		num = rte_distributor_get_pkt(d,
>>   				id, buf, buf, num);
>> +
>> +		zero_id = __atomic_load_n(&zero_idx,
>> __ATOMIC_ACQUIRE);
>> +		if (id == zero_id && num > 0) {
>> +			zero_id = (zero_id + 1) %
>> __atomic_load_n(&worker_idx,
>> +				__ATOMIC_ACQUIRE);
>> +			__atomic_store_n(&zero_idx, zero_id,
>> __ATOMIC_RELEASE);
>> +		}
>> +
>>   		total += num;
>>   	}
>>   	worker_stats[id].handled_packets += num;
>>   	count += num;
>>   	returned = rte_distributor_return_pkt(d, id, buf, num);
>>
>> -	if (id == 0) {
>> +	if (id == zero_id) {
>>   		/* for worker zero, allow it to restart to pick up last packet
>>   		 * when all workers are shutting down.
>>   		 */
>> @@ -586,6 +603,7 @@ quit_workers(struct worker_params *wp, struct
>> rte_mempool *p)
>>   	rte_eal_mp_wait_lcore();
>>   	quit = 0;
>>   	worker_idx = 0;
>> +	zero_idx = 0;
>>   }
>>
>>   static int
>> --
>> 2.17.1

-- 
Lukasz Wojciechowski
Principal Software Engineer

Samsung R&D Institute Poland
Samsung Electronics
Office +48 22 377 88 25
l.wojciechow@partner.samsung.com


^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v4 2/8] test/distributor: synchronize lcores statistics
  2020-09-29  5:49                   ` Honnappa Nagarahalli
@ 2020-10-02 11:25                     ` Lukasz Wojciechowski
  2020-10-08 20:47                       ` Lukasz Wojciechowski
  2020-10-16  5:43                       ` Honnappa Nagarahalli
  0 siblings, 2 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-02 11:25 UTC (permalink / raw)
  To: Honnappa Nagarahalli, David Hunt, Bruce Richardson
  Cc: dev, stable, nd, "'Lukasz Wojciechowski'",

Hi Honnappa,

Many thanks for the review!

I'll write my answers here not inline as it would be easier to read them 
in one place, I think.
So first of all I agree with you in 2 things:
1) all uses of statistics must be atomic and lack of that caused most of 
the problems
2) it would be better to replace barrier and memset in 
clear_packet_count() with atomic stores as you suggested

So I will apply both of above.

However, I wasn't fully convinced about changing acquire/release to 
relaxed. It would be perfectly OK
if it looked like Herb Sutter's example: 
https://youtu.be/KeLBd2EJLOU?t=4170
But in his case the counters are cleared before the worker threads 
start and are printed out after they have completed.

In the case of the DPDK distributor tests, both worker and main cores 
are running at the same time. In the sanity_test, the statistics are 
cleared and verified a few times for different packet hashes. The 
worker cores are not stopped at this time and they continue their loops 
in the handle procedure. The verification made on the main core is an 
exchange of data, as the current statistics indicate how the test will 
turn out.

Since I wasn't convinced, I ran some tests with both relaxed and 
acquire/release modes, and they both fail :(
The ratio of failures caused by statistics errors to the number of 
tests, over 200000 runs, was:
for relaxed: 0.000790562
for acq/rel: 0.000091321


That's why I'm going to modify the tests in such a way that they would:
1) clear statistics
2) launch worker threads
3) run the test
4) wait for the worker procedures to complete
5) check stats, verify results and print them out

This way the main core will use (clear or verify) the stats only when 
there are no worker threads running. This makes things simpler and lets 
us focus on testing the distributor, not the tests. And of course 
relaxed mode would then be enough!
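
A rough sketch of that intended flow, reusing names from the existing
tests (SKIP_MAIN is the current name of the launch flag; older releases
call it SKIP_MASTER):

	clear_packet_count();					/* 1) */
	rte_eal_mp_remote_launch(handle_work, wp, SKIP_MAIN);	/* 2) */
	/* 3) feed bursts through rte_distributor_process() here */
	quit_workers(wp, p);		/* 4) joins all worker lcores */
	if (total_packet_count() != BURST)	/* 5) verify and report */
		return -1;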


Best regards
Lukasz


W dniu 29.09.2020 o 07:49, Honnappa Nagarahalli pisze:
> <snip>
>
>> Statistics of handled packets are cleared and read on main lcore, while they
>> are increased in worker handlers on different lcores.
>>
>> Without synchronization they occasionally showed invalid values.
>> This patch uses atomic acquire/release mechanisms to synchronize.
> In general, load-acquire and store-release memory orderings are required while synchronizing data (that cannot be updated atomically) between threads. In this situation, making the counters atomic is enough.
>
>> Fixes: c3eabff124e6 ("distributor: add unit tests")
>> Cc: bruce.richardson@intel.com
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
>> Acked-by: David Hunt <david.hunt@intel.com>
>> ---
>>   app/test/test_distributor.c | 39 ++++++++++++++++++++++++-------------
>>   1 file changed, 26 insertions(+), 13 deletions(-)
>>
>> diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c index
>> 35b25463a..0e49e3714 100644
>> --- a/app/test/test_distributor.c
>> +++ b/app/test/test_distributor.c
>> @@ -43,7 +43,8 @@ total_packet_count(void)  {
>>   	unsigned i, count = 0;
>>   	for (i = 0; i < worker_idx; i++)
>> -		count += worker_stats[i].handled_packets;
>> +		count +=
>> __atomic_load_n(&worker_stats[i].handled_packets,
>> +				__ATOMIC_ACQUIRE);
> RELAXED memory order is sufficient. For example, the worker threads are not 'releasing' any data to the main thread beyond the atomically updated counters themselves.
>
>>   	return count;
>>   }
>>
>> @@ -52,6 +53,7 @@ static inline void
>>   clear_packet_count(void)
>>   {
>>   	memset(&worker_stats, 0, sizeof(worker_stats));
>> +	rte_atomic_thread_fence(__ATOMIC_RELEASE);
> Ideally, the counters should be set to 0 atomically rather than using a memset.
>
>>   }
>>
>>   /* this is the basic worker function for sanity test @@ -72,13 +74,13 @@
>> handle_work(void *arg)
>>   	num = rte_distributor_get_pkt(db, id, buf, buf, num);
>>   	while (!quit) {
>>   		__atomic_fetch_add(&worker_stats[id].handled_packets,
>> num,
>> -				__ATOMIC_RELAXED);
>> +				__ATOMIC_ACQ_REL);
> Using the __ATOMIC_ACQ_REL order does not mean anything to the main thread. The main thread might still see the updates from different threads in different order.
>
>>   		count += num;
>>   		num = rte_distributor_get_pkt(db, id,
>>   				buf, buf, num);
>>   	}
>>   	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
>> -			__ATOMIC_RELAXED);
>> +			__ATOMIC_ACQ_REL);
> Same here, do not see why this change is required.
>
>>   	count += num;
>>   	rte_distributor_return_pkt(db, id, buf, num);
>>   	return 0;
>> @@ -134,7 +136,8 @@ sanity_test(struct worker_params *wp, struct
>> rte_mempool *p)
>>
>>   	for (i = 0; i < rte_lcore_count() - 1; i++)
>>   		printf("Worker %u handled %u packets\n", i,
>> -				worker_stats[i].handled_packets);
>> +			__atomic_load_n(&worker_stats[i].handled_packets,
>> +					__ATOMIC_ACQUIRE));
> __ATOMIC_RELAXED is enough.
>
>>   	printf("Sanity test with all zero hashes done.\n");
>>
>>   	/* pick two flows and check they go correctly */ @@ -159,7 +162,9
>> @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
>>
>>   		for (i = 0; i < rte_lcore_count() - 1; i++)
>>   			printf("Worker %u handled %u packets\n", i,
>> -					worker_stats[i].handled_packets);
>> +				__atomic_load_n(
>> +					&worker_stats[i].handled_packets,
>> +					__ATOMIC_ACQUIRE));
> __ATOMIC_RELAXED is enough
>
>>   		printf("Sanity test with two hash values done\n");
>>   	}
>>
>> @@ -185,7 +190,8 @@ sanity_test(struct worker_params *wp, struct
>> rte_mempool *p)
>>
>>   	for (i = 0; i < rte_lcore_count() - 1; i++)
>>   		printf("Worker %u handled %u packets\n", i,
>> -				worker_stats[i].handled_packets);
>> +			__atomic_load_n(&worker_stats[i].handled_packets,
>> +					__ATOMIC_ACQUIRE));
> __ATOMIC_RELAXED is enough
>
>>   	printf("Sanity test with non-zero hashes done\n");
>>
>>   	rte_mempool_put_bulk(p, (void *)bufs, BURST); @@ -280,15
>> +286,17 @@ handle_work_with_free_mbufs(void *arg)
>>   		buf[i] = NULL;
>>   	num = rte_distributor_get_pkt(d, id, buf, buf, num);
>>   	while (!quit) {
>> -		worker_stats[id].handled_packets += num;
>>   		count += num;
>> +		__atomic_fetch_add(&worker_stats[id].handled_packets,
>> num,
>> +				__ATOMIC_ACQ_REL);
> IMO, the problem would be the non-atomic update of the statistics. So, __ATOMIC_RELAXED is enough
>
>>   		for (i = 0; i < num; i++)
>>   			rte_pktmbuf_free(buf[i]);
>>   		num = rte_distributor_get_pkt(d,
>>   				id, buf, buf, num);
>>   	}
>> -	worker_stats[id].handled_packets += num;
>>   	count += num;
>> +	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
>> +			__ATOMIC_ACQ_REL);
> Same here, the problem is non-atomic update of the statistics, __ATOMIC_RELAXED is enough.
> Similarly, for changes below, __ATOMIC_RELAXED is enough.
>
>>   	rte_distributor_return_pkt(d, id, buf, num);
>>   	return 0;
>>   }
>> @@ -363,8 +371,9 @@ handle_work_for_shutdown_test(void *arg)
>>   	/* wait for quit single globally, or for worker zero, wait
>>   	 * for zero_quit */
>>   	while (!quit && !(id == zero_id && zero_quit)) {
>> -		worker_stats[id].handled_packets += num;
>>   		count += num;
>> +		__atomic_fetch_add(&worker_stats[id].handled_packets,
>> num,
>> +				__ATOMIC_ACQ_REL);
>>   		for (i = 0; i < num; i++)
>>   			rte_pktmbuf_free(buf[i]);
>>   		num = rte_distributor_get_pkt(d,
>> @@ -379,10 +388,11 @@ handle_work_for_shutdown_test(void *arg)
>>
>>   		total += num;
>>   	}
>> -	worker_stats[id].handled_packets += num;
>>   	count += num;
>>   	returned = rte_distributor_return_pkt(d, id, buf, num);
>>
>> +	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
>> +			__ATOMIC_ACQ_REL);
>>   	if (id == zero_id) {
>>   		/* for worker zero, allow it to restart to pick up last packet
>>   		 * when all workers are shutting down.
>> @@ -394,10 +404,11 @@ handle_work_for_shutdown_test(void *arg)
>>   				id, buf, buf, num);
>>
>>   		while (!quit) {
>> -			worker_stats[id].handled_packets += num;
>>   			count += num;
>>   			rte_pktmbuf_free(pkt);
>>   			num = rte_distributor_get_pkt(d, id, buf, buf, num);
>> +
>> 	__atomic_fetch_add(&worker_stats[id].handled_packets,
>> +					num, __ATOMIC_ACQ_REL);
>>   		}
>>   		returned = rte_distributor_return_pkt(d,
>>   				id, buf, num);
>> @@ -461,7 +472,8 @@ sanity_test_with_worker_shutdown(struct
>> worker_params *wp,
>>
>>   	for (i = 0; i < rte_lcore_count() - 1; i++)
>>   		printf("Worker %u handled %u packets\n", i,
>> -				worker_stats[i].handled_packets);
>> +			__atomic_load_n(&worker_stats[i].handled_packets,
>> +					__ATOMIC_ACQUIRE));
>>
>>   	if (total_packet_count() != BURST * 2) {
>>   		printf("Line %d: Error, not all packets flushed. "
>> @@ -514,7 +526,8 @@ test_flush_with_worker_shutdown(struct
>> worker_params *wp,
>>   	zero_quit = 0;
>>   	for (i = 0; i < rte_lcore_count() - 1; i++)
>>   		printf("Worker %u handled %u packets\n", i,
>> -				worker_stats[i].handled_packets);
>> +			__atomic_load_n(&worker_stats[i].handled_packets,
>> +					__ATOMIC_ACQUIRE));
>>
>>   	if (total_packet_count() != BURST) {
>>   		printf("Line %d: Error, not all packets flushed. "
>> --
>> 2.17.1

-- 
Lukasz Wojciechowski
Principal Software Engineer

Samsung R&D Institute Poland
Samsung Electronics
Office +48 22 377 88 25
l.wojciechow@partner.samsung.com


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v5 00/15] fix distributor synchronization issues
       [not found]               ` <CGME20201008052336eucas1p16b5b1600683e33ddba30479b7fd62ce6@eucas1p1.samsung.com>
@ 2020-10-08  5:23                 ` Lukasz Wojciechowski
       [not found]                   ` <CGME20201008052337eucas1p22b9e89987caf151ba8771442385fec16@eucas1p2.samsung.com>
                                     ` (16 more replies)
  0 siblings, 17 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-08  5:23 UTC (permalink / raw)
  Cc: dev, l.wojciechow

During review and verification of the patch created by Sarosh Arif:
"test_distributor: prevent memory leakages from the pool" I found out
that running the distributor unit tests multiple times in a row causes failures.
So I investigated all the issues I found.

There are a few synchronization issues that might cause deadlocks
or corrupted data. They are fixed with this set of patches, for both
the tests and the librte_distributor library.

---
v5:
* implement missing functionality in burst mode - worker shutdown
* fix shutdown test to always shutdown busy worker
* use atomic stores instead of barrier in tests clear_packet_count()
* reorder patches
* new patch 7: fix call to return_pkt in single mode
* new patch 11: replacing delays with spinlock on atomics in tests
* new patch 12: fix scalar matching algorithm
* new patch 13: new test with marking and checking every packet
* new patch 14: flush also in flight packets
* new patch 15: fix clearing returns buffer
* minor fixes in other patches

v4:
* adjust commit name prefixes app/test -> test/distributor:
* reorder patches
* use NULL oldpkt in rte_distributor_get_pkt() calls in tests

v3:
* add missing acked and tested by statements from v1

v2:
* assign NULL to freed mbufs in distributor test
* fix handshake check on legacy single distributor
     rte_distributor_return_pkt_single()
* add patch 7 passing NULL to legacy API calls if no bufs are returned
* add patch 8 fixing API documentation


Lukasz Wojciechowski (15):
  distributor: fix missing handshake synchronization
  distributor: fix handshake deadlock
  distributor: do not use oldpkt when not needed
  distributor: handle worker shutdown in burst mode
  test/distributor: fix shutdown of busy worker
  test/distributor: synchronize lcores statistics
  distributor: fix return pkt calls in single mode
  test/distributor: fix freeing mbufs
  test/distributor: collect return mbufs
  distributor: align API documentation with code
  test/distributor: replace delays with spin locks
  distributor: fix scalar matching
  test/distributor: add test with packets marking
  distributor: fix flushing in flight packets
  distributor: fix clearing returns buffer

 app/test/test_distributor.c                   | 321 ++++++++++++++----
 lib/librte_distributor/distributor_private.h  |   3 +
 lib/librte_distributor/rte_distributor.c      | 219 +++++++++---
 lib/librte_distributor/rte_distributor.h      |  23 +-
 .../rte_distributor_single.c                  |   4 +
 5 files changed, 447 insertions(+), 123 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v5 01/15] distributor: fix missing handshake synchronization
       [not found]                   ` <CGME20201008052337eucas1p22b9e89987caf151ba8771442385fec16@eucas1p2.samsung.com>
@ 2020-10-08  5:23                     ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-08  5:23 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

The rte_distributor_return_pkt function, which is run on worker cores,
must wait for the distributor core to clear the handshake on retptr64
before using those buffers. While the handshake is set, the distributor
core controls the buffers, and any operation on the worker side might
overwrite buffers that have not been read yet.
The same situation appears in the legacy single distributor: the
rte_distributor_return_pkt_single function shouldn't modify bufptr64
until the handshake on it is cleared by the distributor lcore.

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_distributor/rte_distributor.c        | 14 ++++++++++++++
 lib/librte_distributor/rte_distributor_single.c |  4 ++++
 2 files changed, 18 insertions(+)

diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 1c047f065..89493c331 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -160,6 +160,7 @@ rte_distributor_return_pkt(struct rte_distributor *d,
 {
 	struct rte_distributor_buffer *buf = &d->bufs[worker_id];
 	unsigned int i;
+	volatile int64_t *retptr64;
 
 	if (unlikely(d->alg_type == RTE_DIST_ALG_SINGLE)) {
 		if (num == 1)
@@ -169,6 +170,19 @@ rte_distributor_return_pkt(struct rte_distributor *d,
 			return -EINVAL;
 	}
 
+	retptr64 = &(buf->retptr64[0]);
+	/* Spin while handshake bits are set (scheduler clears it).
+	 * Sync with worker on GET_BUF flag.
+	 */
+	while (unlikely(__atomic_load_n(retptr64, __ATOMIC_ACQUIRE)
+			& RTE_DISTRIB_GET_BUF)) {
+		rte_pause();
+		uint64_t t = rte_rdtsc()+100;
+
+		while (rte_rdtsc() < t)
+			rte_pause();
+	}
+
 	/* Sync with distributor to acquire retptrs */
 	__atomic_thread_fence(__ATOMIC_ACQUIRE);
 	for (i = 0; i < RTE_DIST_BURST_SIZE; i++)
diff --git a/lib/librte_distributor/rte_distributor_single.c b/lib/librte_distributor/rte_distributor_single.c
index abaf7730c..f4725b1d0 100644
--- a/lib/librte_distributor/rte_distributor_single.c
+++ b/lib/librte_distributor/rte_distributor_single.c
@@ -74,6 +74,10 @@ rte_distributor_return_pkt_single(struct rte_distributor_single *d,
 	union rte_distributor_buffer_single *buf = &d->bufs[worker_id];
 	uint64_t req = (((int64_t)(uintptr_t)oldpkt) << RTE_DISTRIB_FLAG_BITS)
 			| RTE_DISTRIB_RETURN_BUF;
+	while (unlikely(__atomic_load_n(&buf->bufptr64, __ATOMIC_RELAXED)
+			& RTE_DISTRIB_FLAGS_MASK))
+		rte_pause();
+
 	/* Sync with distributor on RETURN_BUF flag. */
 	__atomic_store_n(&(buf->bufptr64), req, __ATOMIC_RELEASE);
 	return 0;
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v5 02/15] distributor: fix handshake deadlock
       [not found]                   ` <CGME20201008052338eucas1p2d26a8705b17d07fd24056f0aeaf3504e@eucas1p2.samsung.com>
@ 2020-10-08  5:23                     ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-08  5:23 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

Synchronization of data exchange between distributor and worker cores
is based on 2 handshakes: retptr64 for returning mbufs from workers
to distributor and bufptr64 for passing mbufs to workers.

Without the proper order of verifying those 2 handshakes a deadlock may
occur: the worker core wants to return mbufs and waits for the retptr
handshake to be cleared, while the distributor core waits on bufptr to
send mbufs to the worker.

This can happen because the worker core first returns mbufs and only
then requests new ones, while the distributor first releases mbufs to
the worker and only then handles the returned packets.

This patch removes the possibility of the deadlock by always taking
care of the returned packets first on the distributor side, and by
handling returned packets while waiting to release new ones.

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_distributor/rte_distributor.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 89493c331..12b3db33c 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -321,12 +321,14 @@ release(struct rte_distributor *d, unsigned int wkr)
 	struct rte_distributor_buffer *buf = &(d->bufs[wkr]);
 	unsigned int i;
 
+	handle_returns(d, wkr);
+
 	/* Sync with worker on GET_BUF flag */
 	while (!(__atomic_load_n(&(d->bufs[wkr].bufptr64[0]), __ATOMIC_ACQUIRE)
-		& RTE_DISTRIB_GET_BUF))
+		& RTE_DISTRIB_GET_BUF)) {
+		handle_returns(d, wkr);
 		rte_pause();
-
-	handle_returns(d, wkr);
+	}
 
 	buf->count = 0;
 
@@ -376,6 +378,7 @@ rte_distributor_process(struct rte_distributor *d,
 		/* Flush out all non-full cache-lines to workers. */
 		for (wid = 0 ; wid < d->num_workers; wid++) {
 			/* Sync with worker on GET_BUF flag. */
+			handle_returns(d, wid);
 			if (__atomic_load_n(&(d->bufs[wid].bufptr64[0]),
 				__ATOMIC_ACQUIRE) & RTE_DISTRIB_GET_BUF) {
 				release(d, wid);
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v5 03/15] distributor: do not use oldpkt when not needed
       [not found]                   ` <CGME20201008052339eucas1p1a4e571cc3f5a277badff9d352ad7da8e@eucas1p1.samsung.com>
@ 2020-10-08  5:23                     ` Lukasz Wojciechowski
  2020-10-08  8:13                       ` David Hunt
  0 siblings, 1 reply; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-08  5:23 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

rte_distributor_request_pkt and rte_distributor_get_pkt dereferenced
the oldpkt parameter in RTE_DIST_ALG_SINGLE mode even if the number
of buffers returned from worker to distributor was 0.

This patch passes NULL to the legacy API when the number of returned
buffers is 0. This allows passing NULL as the oldpkt parameter.

The distributor tests are also updated to pass NULL as oldpkt and
0 as the number of returned packets wherever packets are not returned.
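
With this change a worker that has nothing to return can simply call,
as the updated tests below do:

	num = rte_distributor_get_pkt(d, id, buf, NULL, 0);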

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
---
 app/test/test_distributor.c              | 28 +++++++++---------------
 lib/librte_distributor/rte_distributor.c |  4 ++--
 2 files changed, 12 insertions(+), 20 deletions(-)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index ba1f81cf8..52230d250 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -62,13 +62,10 @@ handle_work(void *arg)
 	struct rte_mbuf *buf[8] __rte_cache_aligned;
 	struct worker_params *wp = arg;
 	struct rte_distributor *db = wp->dist;
-	unsigned int count = 0, num = 0;
+	unsigned int count = 0, num;
 	unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED);
-	int i;
 
-	for (i = 0; i < 8; i++)
-		buf[i] = NULL;
-	num = rte_distributor_get_pkt(db, id, buf, buf, num);
+	num = rte_distributor_get_pkt(db, id, buf, NULL, 0);
 	while (!quit) {
 		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
 				__ATOMIC_RELAXED);
@@ -272,19 +269,16 @@ handle_work_with_free_mbufs(void *arg)
 	struct rte_distributor *d = wp->dist;
 	unsigned int count = 0;
 	unsigned int i;
-	unsigned int num = 0;
+	unsigned int num;
 	unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED);
 
-	for (i = 0; i < 8; i++)
-		buf[i] = NULL;
-	num = rte_distributor_get_pkt(d, id, buf, buf, num);
+	num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 	while (!quit) {
 		worker_stats[id].handled_packets += num;
 		count += num;
 		for (i = 0; i < num; i++)
 			rte_pktmbuf_free(buf[i]);
-		num = rte_distributor_get_pkt(d,
-				id, buf, buf, num);
+		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 	}
 	worker_stats[id].handled_packets += num;
 	count += num;
@@ -342,14 +336,14 @@ handle_work_for_shutdown_test(void *arg)
 	struct worker_params *wp = arg;
 	struct rte_distributor *d = wp->dist;
 	unsigned int count = 0;
-	unsigned int num = 0;
+	unsigned int num;
 	unsigned int total = 0;
 	unsigned int i;
 	unsigned int returned = 0;
 	const unsigned int id = __atomic_fetch_add(&worker_idx, 1,
 			__ATOMIC_RELAXED);
 
-	num = rte_distributor_get_pkt(d, id, buf, buf, num);
+	num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 
 	/* wait for quit single globally, or for worker zero, wait
 	 * for zero_quit */
@@ -358,8 +352,7 @@ handle_work_for_shutdown_test(void *arg)
 		count += num;
 		for (i = 0; i < num; i++)
 			rte_pktmbuf_free(buf[i]);
-		num = rte_distributor_get_pkt(d,
-				id, buf, buf, num);
+		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 		total += num;
 	}
 	worker_stats[id].handled_packets += num;
@@ -373,14 +366,13 @@ handle_work_for_shutdown_test(void *arg)
 		while (zero_quit)
 			usleep(100);
 
-		num = rte_distributor_get_pkt(d,
-				id, buf, buf, num);
+		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 
 		while (!quit) {
 			worker_stats[id].handled_packets += num;
 			count += num;
 			rte_pktmbuf_free(pkt);
-			num = rte_distributor_get_pkt(d, id, buf, buf, num);
+			num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 		}
 		returned = rte_distributor_return_pkt(d,
 				id, buf, num);
diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 12b3db33c..b720abe03 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -42,7 +42,7 @@ rte_distributor_request_pkt(struct rte_distributor *d,
 
 	if (unlikely(d->alg_type == RTE_DIST_ALG_SINGLE)) {
 		rte_distributor_request_pkt_single(d->d_single,
-			worker_id, oldpkt[0]);
+			worker_id, count ? oldpkt[0] : NULL);
 		return;
 	}
 
@@ -134,7 +134,7 @@ rte_distributor_get_pkt(struct rte_distributor *d,
 	if (unlikely(d->alg_type == RTE_DIST_ALG_SINGLE)) {
 		if (return_count <= 1) {
 			pkts[0] = rte_distributor_get_pkt_single(d->d_single,
-				worker_id, oldpkt[0]);
+				worker_id, return_count ? oldpkt[0] : NULL);
 			return (pkts[0]) ? 1 : 0;
 		} else
 			return -EINVAL;
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v5 04/15] distributor: handle worker shutdown in burst mode
       [not found]                   ` <CGME20201008052339eucas1p15697f457b8b96809d04f737e041af08a@eucas1p1.samsung.com>
@ 2020-10-08  5:23                     ` Lukasz Wojciechowski
  2020-10-08 14:26                       ` David Hunt
  0 siblings, 1 reply; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-08  5:23 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

The burst version of the distributor implementation was missing proper
handling of worker shutdown. A worker processing packets received
from the distributor can call the rte_distributor_return_pkt() function
to inform the distributor that it wants no more packets. Subsequent calls to
rte_distributor_request_pkt() or rte_distributor_get_pkt(), however,
should inform the distributor that new packets are requested again.

Because this handling was missing, new packets were still sent from
the distributor even after a worker announced the return of its last
packets, causing deadlocks as no one could receive them on the worker
side.

This patch adds handling of worker shutdown in the following way:
1) It fixes the usage of the RTE_DISTRIB_VALID_BUF handshake flag. This
flag was formerly unused in the burst implementation and is now used
for marking valid packets in retptr64, replacing the invalid use
of the RTE_DISTRIB_RETURN_BUF flag (see the sketch below).
2) RTE_DISTRIB_RETURN_BUF is used as a worker-to-distributor handshake
in retptr64 to indicate that the worker has shut down.
3) A worker that shuts down also blocks bufptr for itself with the
RTE_DISTRIB_RETURN_BUF flag, allowing the distributor to retrieve any
in-flight packets.
4) When the distributor receives information about the shutdown of a
worker, it: marks the worker as not active; retrieves any in-flight and
backlog packets and reprocesses them to different workers; unlocks
bufptr64 by clearing the RTE_DISTRIB_RETURN_BUF flag, allowing its use
in the future if the worker requests any new packets.
5) Packets are no longer sent to, or added to the backlog of, inactive
workers. Such workers are also ignored when matching.
6) Calls to handle_returns() and the tag matching procedure are
adjusted to react to possible activation or deactivation of workers.
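
All of these handshake flags live in the low bits of each 64-bit
buffer slot, with the mbuf pointer shifted above them. As a rough,
standalone illustration (flag names are shortened and the values are
assumed here; the real definitions live in distributor_private.h):

#include <stdint.h>

struct rte_mbuf;

#define DIST_FLAG_BITS   3          /* assumed width of the flag field */
#define DIST_GET_BUF     (1 << 0)   /* worker requests new packets */
#define DIST_RETURN_BUF  (1 << 1)   /* worker announces shutdown */
#define DIST_VALID_BUF   (1 << 2)   /* slot carries a valid mbuf */

/* pack an mbuf pointer together with handshake flags into one slot */
static inline int64_t
slot_encode(const struct rte_mbuf *m, int64_t flags)
{
	return (((int64_t)(uintptr_t)m) << DIST_FLAG_BITS) | flags;
}

/* recover the pointer; the slot is signed, so this is an arithmetic shift */
static inline struct rte_mbuf *
slot_decode(int64_t slot)
{
	return (struct rte_mbuf *)(uintptr_t)(slot >> DIST_FLAG_BITS);
}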

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
---
 lib/librte_distributor/distributor_private.h |   3 +
 lib/librte_distributor/rte_distributor.c     | 175 +++++++++++++++----
 2 files changed, 146 insertions(+), 32 deletions(-)

diff --git a/lib/librte_distributor/distributor_private.h b/lib/librte_distributor/distributor_private.h
index 489aef2ac..689fe3e18 100644
--- a/lib/librte_distributor/distributor_private.h
+++ b/lib/librte_distributor/distributor_private.h
@@ -155,6 +155,9 @@ struct rte_distributor {
 	enum rte_distributor_match_function dist_match_fn;
 
 	struct rte_distributor_single *d_single;
+
+	uint8_t active[RTE_DISTRIB_MAX_WORKERS];
+	uint8_t activesum;
 };
 
 void
diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index b720abe03..115443fc0 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -51,7 +51,7 @@ rte_distributor_request_pkt(struct rte_distributor *d,
 	 * Sync with worker on GET_BUF flag.
 	 */
 	while (unlikely(__atomic_load_n(retptr64, __ATOMIC_ACQUIRE)
-			& RTE_DISTRIB_GET_BUF)) {
+			& (RTE_DISTRIB_GET_BUF | RTE_DISTRIB_RETURN_BUF))) {
 		rte_pause();
 		uint64_t t = rte_rdtsc()+100;
 
@@ -67,11 +67,11 @@ rte_distributor_request_pkt(struct rte_distributor *d,
 	for (i = count; i < RTE_DIST_BURST_SIZE; i++)
 		buf->retptr64[i] = 0;
 
-	/* Set Return bit for each packet returned */
+	/* Set VALID_BUF bit for each packet returned */
 	for (i = count; i-- > 0; )
 		buf->retptr64[i] =
 			(((int64_t)(uintptr_t)(oldpkt[i])) <<
-			RTE_DISTRIB_FLAG_BITS) | RTE_DISTRIB_RETURN_BUF;
+			RTE_DISTRIB_FLAG_BITS) | RTE_DISTRIB_VALID_BUF;
 
 	/*
 	 * Finally, set the GET_BUF  to signal to distributor that cache
@@ -97,11 +97,13 @@ rte_distributor_poll_pkt(struct rte_distributor *d,
 		return (pkts[0]) ? 1 : 0;
 	}
 
-	/* If bit is set, return
+	/* If any of below bits is set, return.
+	 * GET_BUF is set when distributor hasn't sent any packets yet
+	 * RETURN_BUF is set when distributor must retrieve in-flight packets
 	 * Sync with distributor to acquire bufptrs
 	 */
 	if (__atomic_load_n(&(buf->bufptr64[0]), __ATOMIC_ACQUIRE)
-		& RTE_DISTRIB_GET_BUF)
+		& (RTE_DISTRIB_GET_BUF | RTE_DISTRIB_RETURN_BUF))
 		return -1;
 
 	/* since bufptr64 is signed, this should be an arithmetic shift */
@@ -113,7 +115,7 @@ rte_distributor_poll_pkt(struct rte_distributor *d,
 	}
 
 	/*
-	 * so now we've got the contents of the cacheline into an  array of
+	 * so now we've got the contents of the cacheline into an array of
 	 * mbuf pointers, so toggle the bit so scheduler can start working
 	 * on the next cacheline while we're working.
 	 * Sync with distributor on GET_BUF flag. Release bufptrs.
@@ -175,7 +177,7 @@ rte_distributor_return_pkt(struct rte_distributor *d,
 	 * Sync with worker on GET_BUF flag.
 	 */
 	while (unlikely(__atomic_load_n(retptr64, __ATOMIC_ACQUIRE)
-			& RTE_DISTRIB_GET_BUF)) {
+			& (RTE_DISTRIB_GET_BUF | RTE_DISTRIB_RETURN_BUF))) {
 		rte_pause();
 		uint64_t t = rte_rdtsc()+100;
 
@@ -187,17 +189,25 @@ rte_distributor_return_pkt(struct rte_distributor *d,
 	__atomic_thread_fence(__ATOMIC_ACQUIRE);
 	for (i = 0; i < RTE_DIST_BURST_SIZE; i++)
 		/* Switch off the return bit first */
-		buf->retptr64[i] &= ~RTE_DISTRIB_RETURN_BUF;
+		buf->retptr64[i] = 0;
 
 	for (i = num; i-- > 0; )
 		buf->retptr64[i] = (((int64_t)(uintptr_t)oldpkt[i]) <<
-			RTE_DISTRIB_FLAG_BITS) | RTE_DISTRIB_RETURN_BUF;
+			RTE_DISTRIB_FLAG_BITS) | RTE_DISTRIB_VALID_BUF;
+
+	/* Use RETURN_BUF on bufptr64 to notify distributor that
+	 * we won't read any mbufs from there even if GET_BUF is set.
+	 * This allows distributor to retrieve in-flight already sent packets.
+	 */
+	__atomic_or_fetch(&(buf->bufptr64[0]), RTE_DISTRIB_RETURN_BUF,
+		__ATOMIC_ACQ_REL);
 
-	/* set the GET_BUF but even if we got no returns.
-	 * Sync with distributor on GET_BUF flag. Release retptrs.
+	/* set the RETURN_BUF on retptr64 even if we got no returns.
+	 * Sync with distributor on RETURN_BUF flag. Release retptrs.
+	 * Notify distributor that we don't request more packets any more.
 	 */
 	__atomic_store_n(&(buf->retptr64[0]),
-		buf->retptr64[0] | RTE_DISTRIB_GET_BUF, __ATOMIC_RELEASE);
+		buf->retptr64[0] | RTE_DISTRIB_RETURN_BUF, __ATOMIC_RELEASE);
 
 	return 0;
 }
@@ -267,6 +277,59 @@ find_match_scalar(struct rte_distributor *d,
 	 */
 }
 
+/*
+ * When worker called rte_distributor_return_pkt()
+ * and passed RTE_DISTRIB_RETURN_BUF handshake through retptr64,
+ * distributor must retrieve both inflight and backlog packets assigned
+ * to the worker and reprocess them to another worker.
+ */
+static void
+handle_worker_shutdown(struct rte_distributor *d, unsigned int wkr)
+{
+	struct rte_distributor_buffer *buf = &(d->bufs[wkr]);
+	/* double BURST size for storing both inflights and backlog */
+	struct rte_mbuf *pkts[RTE_DIST_BURST_SIZE * 2];
+	unsigned int pkts_count = 0;
+	unsigned int i;
+
+	/* If GET_BUF is cleared there are in-flight packets sent
+	 * to worker which does not require new packets.
+	 * They must be retrieved and assigned to another worker.
+	 */
+	if (!(__atomic_load_n(&(buf->bufptr64[0]), __ATOMIC_ACQUIRE)
+		& RTE_DISTRIB_GET_BUF))
+		for (i = 0; i < RTE_DIST_BURST_SIZE; i++)
+			if (buf->bufptr64[i] & RTE_DISTRIB_VALID_BUF)
+				pkts[pkts_count++] = (void *)((uintptr_t)
+					(buf->bufptr64[i]
+						>> RTE_DISTRIB_FLAG_BITS));
+
+	/* Make following operations on handshake flags on bufptr64:
+	 * - set GET_BUF to indicate that distributor can overwrite buffer
+	 *     with new packets if worker will make a new request.
+	 * - clear RETURN_BUF to unlock reads on worker side.
+	 */
+	__atomic_store_n(&(buf->bufptr64[0]), RTE_DISTRIB_GET_BUF,
+		__ATOMIC_RELEASE);
+
+	/* Collect backlog packets from worker */
+	for (i = 0; i < d->backlog[wkr].count; i++)
+		pkts[pkts_count++] = (void *)((uintptr_t)
+			(d->backlog[wkr].pkts[i] >> RTE_DISTRIB_FLAG_BITS));
+
+	d->backlog[wkr].count = 0;
+
+	/* Clear both inflight and backlog tags */
+	for (i = 0; i < RTE_DIST_BURST_SIZE; i++) {
+		d->in_flight_tags[wkr][i] = 0;
+		d->backlog[wkr].tags[i] = 0;
+	}
+
+	/* Recursive call */
+	if (pkts_count > 0)
+		rte_distributor_process(d, pkts, pkts_count);
+}
+
 
 /*
  * When the handshake bits indicate that there are packets coming
@@ -285,19 +348,33 @@ handle_returns(struct rte_distributor *d, unsigned int wkr)
 
 	/* Sync on GET_BUF flag. Acquire retptrs. */
 	if (__atomic_load_n(&(buf->retptr64[0]), __ATOMIC_ACQUIRE)
-		& RTE_DISTRIB_GET_BUF) {
+		& (RTE_DISTRIB_GET_BUF | RTE_DISTRIB_RETURN_BUF)) {
 		for (i = 0; i < RTE_DIST_BURST_SIZE; i++) {
-			if (buf->retptr64[i] & RTE_DISTRIB_RETURN_BUF) {
+			if (buf->retptr64[i] & RTE_DISTRIB_VALID_BUF) {
 				oldbuf = ((uintptr_t)(buf->retptr64[i] >>
 					RTE_DISTRIB_FLAG_BITS));
 				/* store returns in a circular buffer */
 				store_return(oldbuf, d, &ret_start, &ret_count);
 				count++;
-				buf->retptr64[i] &= ~RTE_DISTRIB_RETURN_BUF;
+				buf->retptr64[i] &= ~RTE_DISTRIB_VALID_BUF;
 			}
 		}
 		d->returns.start = ret_start;
 		d->returns.count = ret_count;
+
+		/* If worker requested packets with GET_BUF, set it to active
+		 * otherwise (RETURN_BUF), set it to not active.
+		 */
+		d->activesum -= d->active[wkr];
+		d->active[wkr] = !!(buf->retptr64[0] & RTE_DISTRIB_GET_BUF);
+		d->activesum += d->active[wkr];
+
+		/* If worker returned packets without requesting new ones,
+		 * handle all in-flights and backlog packets assigned to it.
+		 */
+		if (unlikely(buf->retptr64[0] & RTE_DISTRIB_RETURN_BUF))
+			handle_worker_shutdown(d, wkr);
+
 		/* Clear for the worker to populate with more returns.
 		 * Sync with distributor on GET_BUF flag. Release retptrs.
 		 */
@@ -322,11 +399,15 @@ release(struct rte_distributor *d, unsigned int wkr)
 	unsigned int i;
 
 	handle_returns(d, wkr);
+	if (unlikely(!d->active[wkr]))
+		return 0;
 
 	/* Sync with worker on GET_BUF flag */
 	while (!(__atomic_load_n(&(d->bufs[wkr].bufptr64[0]), __ATOMIC_ACQUIRE)
 		& RTE_DISTRIB_GET_BUF)) {
 		handle_returns(d, wkr);
+		if (unlikely(!d->active[wkr]))
+			return 0;
 		rte_pause();
 	}
 
@@ -366,7 +447,7 @@ rte_distributor_process(struct rte_distributor *d,
 	int64_t next_value = 0;
 	uint16_t new_tag = 0;
 	uint16_t flows[RTE_DIST_BURST_SIZE] __rte_cache_aligned;
-	unsigned int i, j, w, wid;
+	unsigned int i, j, w, wid, matching_required;
 
 	if (d->alg_type == RTE_DIST_ALG_SINGLE) {
 		/* Call the old API */
@@ -374,11 +455,13 @@ rte_distributor_process(struct rte_distributor *d,
 			mbufs, num_mbufs);
 	}
 
+	for (wid = 0 ; wid < d->num_workers; wid++)
+		handle_returns(d, wid);
+
 	if (unlikely(num_mbufs == 0)) {
 		/* Flush out all non-full cache-lines to workers. */
 		for (wid = 0 ; wid < d->num_workers; wid++) {
 			/* Sync with worker on GET_BUF flag. */
-			handle_returns(d, wid);
 			if (__atomic_load_n(&(d->bufs[wid].bufptr64[0]),
 				__ATOMIC_ACQUIRE) & RTE_DISTRIB_GET_BUF) {
 				release(d, wid);
@@ -388,6 +471,9 @@ rte_distributor_process(struct rte_distributor *d,
 		return 0;
 	}
 
+	if (unlikely(!d->activesum))
+		return 0;
+
 	while (next_idx < num_mbufs) {
 		uint16_t matches[RTE_DIST_BURST_SIZE];
 		unsigned int pkts;
@@ -412,22 +498,30 @@ rte_distributor_process(struct rte_distributor *d,
 		for (; i < RTE_DIST_BURST_SIZE; i++)
 			flows[i] = 0;
 
-		switch (d->dist_match_fn) {
-		case RTE_DIST_MATCH_VECTOR:
-			find_match_vec(d, &flows[0], &matches[0]);
-			break;
-		default:
-			find_match_scalar(d, &flows[0], &matches[0]);
-		}
+		matching_required = 1;
 
+		for (j = 0; j < pkts; j++) {
+			if (unlikely(!d->activesum))
+				return next_idx;
+
+			if (unlikely(matching_required)) {
+				switch (d->dist_match_fn) {
+				case RTE_DIST_MATCH_VECTOR:
+					find_match_vec(d, &flows[0],
+						&matches[0]);
+					break;
+				default:
+					find_match_scalar(d, &flows[0],
+						&matches[0]);
+				}
+				matching_required = 0;
+			}
 		/*
 		 * Matches array now contain the intended worker ID (+1) of
 		 * the incoming packets. Any zeroes need to be assigned
 		 * workers.
 		 */
 
-		for (j = 0; j < pkts; j++) {
-
 			next_mb = mbufs[next_idx++];
 			next_value = (((int64_t)(uintptr_t)next_mb) <<
 					RTE_DISTRIB_FLAG_BITS);
@@ -447,12 +541,18 @@ rte_distributor_process(struct rte_distributor *d,
 			 */
 			/* matches[j] = 0; */
 
-			if (matches[j]) {
+			if (matches[j] && d->active[matches[j]-1]) {
 				struct rte_distributor_backlog *bl =
 						&d->backlog[matches[j]-1];
 				if (unlikely(bl->count ==
 						RTE_DIST_BURST_SIZE)) {
 					release(d, matches[j]-1);
+					if (!d->active[matches[j]-1]) {
+						j--;
+						next_idx--;
+						matching_required = 1;
+						continue;
+					}
 				}
 
 				/* Add to worker that already has flow */
@@ -462,11 +562,21 @@ rte_distributor_process(struct rte_distributor *d,
 				bl->pkts[idx] = next_value;
 
 			} else {
-				struct rte_distributor_backlog *bl =
-						&d->backlog[wkr];
+				struct rte_distributor_backlog *bl;
+
+				while (unlikely(!d->active[wkr]))
+					wkr = (wkr + 1) % d->num_workers;
+				bl = &d->backlog[wkr];
+
 				if (unlikely(bl->count ==
 						RTE_DIST_BURST_SIZE)) {
 					release(d, wkr);
+					if (!d->active[wkr]) {
+						j--;
+						next_idx--;
+						matching_required = 1;
+						continue;
+					}
 				}
 
 				/* Add to current worker worker */
@@ -485,9 +595,7 @@ rte_distributor_process(struct rte_distributor *d,
 						matches[w] = wkr+1;
 			}
 		}
-		wkr++;
-		if (wkr >= d->num_workers)
-			wkr = 0;
+		wkr = (wkr + 1) % d->num_workers;
 	}
 
 	/* Flush out all non-full cache-lines to workers. */
@@ -663,6 +771,9 @@ rte_distributor_create(const char *name,
 	for (i = 0 ; i < num_workers ; i++)
 		d->backlog[i].tags = &d->in_flight_tags[i][RTE_DIST_BURST_SIZE];
 
+	memset(d->active, 0, sizeof(d->active));
+	d->activesum = 0;
+
 	dist_burst_list = RTE_TAILQ_CAST(rte_dist_burst_tailq.head,
 					  rte_dist_burst_list);
 
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v5 05/15] test/distributor: fix shutdown of busy worker
       [not found]                   ` <CGME20201008052340eucas1p1451f2bf1b6475067491753274547b837@eucas1p1.samsung.com>
@ 2020-10-08  5:23                     ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-08  5:23 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

The sanity test with worker shutdown delegates all bufs
to be processed by a single lcore worker, then it freezes
one of the lcore workers and continues to send more bufs.
The frozen core shuts down first by calling
rte_distributor_return_pkt().

The test intention is to verify that packets assigned to
the shut down lcore will be reassigned to another worker.

However, the shutdown core was not always the one that was
processing packets. The lcore processing the mbufs might be different
every time the test is launched. This is caused by keeping the value
of the wkr static variable in the rte_distributor_process() function
between test cases.

The test always froze the lcore with id 0. The patch stores the id
of the worker that is processing the data in the zero_idx global atomic
variable. This way the frozen lcore is always the proper one.
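
The latch is a one-shot compare-and-swap: the first worker that
actually receives packets records its id in zero_idx, so the test
later freezes a worker known to be busy. A condensed sketch of the
pattern (the sentinel value here is a stand-in for RTE_MAX_LCORE):

#include <stdbool.h>

#define LCORE_SENTINEL 128 /* stand-in for RTE_MAX_LCORE */

static unsigned int zero_idx = LCORE_SENTINEL; /* "unset" */

/* Called by a worker once it has received work. Only the first
 * caller wins the exchange; later calls find zero_idx already
 * set and leave it untouched.
 */
static void
claim_zero_worker(unsigned int id)
{
	unsigned int zero_unset = LCORE_SENTINEL;

	__atomic_compare_exchange_n(&zero_idx, &zero_unset, id,
		false, __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE);
}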

Fixes: c3eabff124e6 ("distributor: add unit tests")
Cc: bruce.richardson@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Tested-by: David Hunt <david.hunt@intel.com>
---
 app/test/test_distributor.c | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index 52230d250..6cd7a2edd 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -28,6 +28,7 @@ struct worker_params worker_params;
 static volatile int quit;      /**< general quit variable for all threads */
 static volatile int zero_quit; /**< var for when we just want thr0 to quit*/
 static volatile unsigned worker_idx;
+static volatile unsigned zero_idx;
 
 struct worker_stats {
 	volatile unsigned handled_packets;
@@ -340,26 +341,43 @@ handle_work_for_shutdown_test(void *arg)
 	unsigned int total = 0;
 	unsigned int i;
 	unsigned int returned = 0;
+	unsigned int zero_id = 0;
+	unsigned int zero_unset;
 	const unsigned int id = __atomic_fetch_add(&worker_idx, 1,
 			__ATOMIC_RELAXED);
 
 	num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 
+	if (num > 0) {
+		zero_unset = RTE_MAX_LCORE;
+		__atomic_compare_exchange_n(&zero_idx, &zero_unset, id,
+			false, __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE);
+	}
+	zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
+
 	/* wait for quit single globally, or for worker zero, wait
 	 * for zero_quit */
-	while (!quit && !(id == 0 && zero_quit)) {
+	while (!quit && !(id == zero_id && zero_quit)) {
 		worker_stats[id].handled_packets += num;
 		count += num;
 		for (i = 0; i < num; i++)
 			rte_pktmbuf_free(buf[i]);
 		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
+
+		if (num > 0) {
+			zero_unset = RTE_MAX_LCORE;
+			__atomic_compare_exchange_n(&zero_idx, &zero_unset, id,
+				false, __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE);
+		}
+		zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
+
 		total += num;
 	}
 	worker_stats[id].handled_packets += num;
 	count += num;
 	returned = rte_distributor_return_pkt(d, id, buf, num);
 
-	if (id == 0) {
+	if (id == zero_id) {
 		/* for worker zero, allow it to restart to pick up last packet
 		 * when all workers are shutting down.
 		 */
@@ -578,6 +596,7 @@ quit_workers(struct worker_params *wp, struct rte_mempool *p)
 	rte_eal_mp_wait_lcore();
 	quit = 0;
 	worker_idx = 0;
+	zero_idx = RTE_MAX_LCORE;
 }
 
 static int
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v5 06/15] test/distributor: synchronize lcores statistics
       [not found]                   ` <CGME20201008052341eucas1p2379b186206e5bf481e3c680de46e5c16@eucas1p2.samsung.com>
@ 2020-10-08  5:23                     ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-08  5:23 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

Statistics of handled packets are cleared and read on the main lcore,
while they are increased in the workers' handlers on different lcores.

Without synchronization, reading them occasionally showed invalid values.
This patch uses atomic acquire/release mechanisms to synchronize.
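
The mechanism is the usual acquire/release counter idiom; a minimal
standalone sketch for a single counter (the test keeps one per worker):

/* updated on a worker lcore, read and cleared on the main lcore */
static unsigned int handled_packets;

static void
stats_add(unsigned int num)
{
	/* publish the newly handled packets to the reader */
	__atomic_fetch_add(&handled_packets, num, __ATOMIC_ACQ_REL);
}

static unsigned int
stats_read(void)
{
	/* observe all updates released by the workers */
	return __atomic_load_n(&handled_packets, __ATOMIC_ACQUIRE);
}

static void
stats_clear(void)
{
	__atomic_store_n(&handled_packets, 0, __ATOMIC_RELEASE);
}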

Fixes: c3eabff124e6 ("distributor: add unit tests")
Cc: bruce.richardson@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 app/test/test_distributor.c | 43 +++++++++++++++++++++++++------------
 1 file changed, 29 insertions(+), 14 deletions(-)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index 6cd7a2edd..838459392 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -43,7 +43,8 @@ total_packet_count(void)
 {
 	unsigned i, count = 0;
 	for (i = 0; i < worker_idx; i++)
-		count += worker_stats[i].handled_packets;
+		count += __atomic_load_n(&worker_stats[i].handled_packets,
+				__ATOMIC_ACQUIRE);
 	return count;
 }
 
@@ -51,7 +52,10 @@ total_packet_count(void)
 static inline void
 clear_packet_count(void)
 {
-	memset(&worker_stats, 0, sizeof(worker_stats));
+	unsigned int i;
+	for (i = 0; i < RTE_MAX_LCORE; i++)
+		__atomic_store_n(&worker_stats[i].handled_packets, 0,
+			__ATOMIC_RELEASE);
 }
 
 /* this is the basic worker function for sanity test
@@ -69,13 +73,13 @@ handle_work(void *arg)
 	num = rte_distributor_get_pkt(db, id, buf, NULL, 0);
 	while (!quit) {
 		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
-				__ATOMIC_RELAXED);
+				__ATOMIC_ACQ_REL);
 		count += num;
 		num = rte_distributor_get_pkt(db, id,
 				buf, buf, num);
 	}
 	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
-			__ATOMIC_RELAXED);
+			__ATOMIC_ACQ_REL);
 	count += num;
 	rte_distributor_return_pkt(db, id, buf, num);
 	return 0;
@@ -131,7 +135,8 @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
 
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
-				worker_stats[i].handled_packets);
+			__atomic_load_n(&worker_stats[i].handled_packets,
+					__ATOMIC_ACQUIRE));
 	printf("Sanity test with all zero hashes done.\n");
 
 	/* pick two flows and check they go correctly */
@@ -156,7 +161,9 @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
 
 		for (i = 0; i < rte_lcore_count() - 1; i++)
 			printf("Worker %u handled %u packets\n", i,
-					worker_stats[i].handled_packets);
+				__atomic_load_n(
+					&worker_stats[i].handled_packets,
+					__ATOMIC_ACQUIRE));
 		printf("Sanity test with two hash values done\n");
 	}
 
@@ -182,7 +189,8 @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
 
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
-				worker_stats[i].handled_packets);
+			__atomic_load_n(&worker_stats[i].handled_packets,
+					__ATOMIC_ACQUIRE));
 	printf("Sanity test with non-zero hashes done\n");
 
 	rte_mempool_put_bulk(p, (void *)bufs, BURST);
@@ -275,14 +283,16 @@ handle_work_with_free_mbufs(void *arg)
 
 	num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 	while (!quit) {
-		worker_stats[id].handled_packets += num;
 		count += num;
+		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
+				__ATOMIC_ACQ_REL);
 		for (i = 0; i < num; i++)
 			rte_pktmbuf_free(buf[i]);
 		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 	}
-	worker_stats[id].handled_packets += num;
 	count += num;
+	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
+			__ATOMIC_ACQ_REL);
 	rte_distributor_return_pkt(d, id, buf, num);
 	return 0;
 }
@@ -358,8 +368,9 @@ handle_work_for_shutdown_test(void *arg)
 	/* wait for quit single globally, or for worker zero, wait
 	 * for zero_quit */
 	while (!quit && !(id == zero_id && zero_quit)) {
-		worker_stats[id].handled_packets += num;
 		count += num;
+		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
+				__ATOMIC_ACQ_REL);
 		for (i = 0; i < num; i++)
 			rte_pktmbuf_free(buf[i]);
 		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
@@ -373,10 +384,11 @@ handle_work_for_shutdown_test(void *arg)
 
 		total += num;
 	}
-	worker_stats[id].handled_packets += num;
 	count += num;
 	returned = rte_distributor_return_pkt(d, id, buf, num);
 
+	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
+			__ATOMIC_ACQ_REL);
 	if (id == zero_id) {
 		/* for worker zero, allow it to restart to pick up last packet
 		 * when all workers are shutting down.
@@ -387,7 +399,8 @@ handle_work_for_shutdown_test(void *arg)
 		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 
 		while (!quit) {
-			worker_stats[id].handled_packets += num;
+			__atomic_fetch_add(&worker_stats[id].handled_packets,
+					num, __ATOMIC_ACQ_REL);
 			count += num;
 			rte_pktmbuf_free(pkt);
 			num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
@@ -454,7 +467,8 @@ sanity_test_with_worker_shutdown(struct worker_params *wp,
 
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
-				worker_stats[i].handled_packets);
+			__atomic_load_n(&worker_stats[i].handled_packets,
+					__ATOMIC_ACQUIRE));
 
 	if (total_packet_count() != BURST * 2) {
 		printf("Line %d: Error, not all packets flushed. "
@@ -507,7 +521,8 @@ test_flush_with_worker_shutdown(struct worker_params *wp,
 	zero_quit = 0;
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
-				worker_stats[i].handled_packets);
+			__atomic_load_n(&worker_stats[i].handled_packets,
+					__ATOMIC_ACQUIRE));
 
 	if (total_packet_count() != BURST) {
 		printf("Line %d: Error, not all packets flushed. "
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v5 07/15] distributor: fix return pkt calls in single mode
       [not found]                   ` <CGME20201008052342eucas1p2376e75d9ac38f5054ca393b0ef7e663d@eucas1p2.samsung.com>
@ 2020-10-08  5:23                     ` Lukasz Wojciechowski
  2020-10-08 14:32                       ` David Hunt
  0 siblings, 1 reply; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-08  5:23 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

In the legacy single version of the distributor, synchronization
requires a continuous exchange of buffers between the distributor
and the workers. Empty buffers are sent when only handshake
synchronization is required.
However, calls to rte_distributor_return_pkt()
with 0 buffers in single mode were ignored and not passed to the
legacy algorithm implementation, causing a lack of synchronization.

This patch fixes the issue by passing NULL as the buffer, which is
a valid way of sending just synchronization handshakes
in single mode.
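
As a usage sketch (not taken from the tests), a worker on an
RTE_DIST_ALG_SINGLE distributor can now run its full loop and still
complete the shutdown handshake when it ends up holding zero packets:

#include <rte_distributor.h>
#include <rte_mbuf.h>

static int
worker_fn(struct rte_distributor *d, unsigned int worker_id,
		volatile const int *quit)
{
	struct rte_mbuf *buf[8];
	unsigned int num;

	num = rte_distributor_get_pkt(d, worker_id, buf, NULL, 0);
	while (!*quit) {
		/* ... process buf[0..num-1] ... */
		num = rte_distributor_get_pkt(d, worker_id, buf, NULL, 0);
	}
	/* num may be 0 here; before this fix a zero-length return was
	 * silently ignored in single mode, breaking the handshake.
	 */
	return rte_distributor_return_pkt(d, worker_id, buf, num);
}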

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
---
 lib/librte_distributor/rte_distributor.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 115443fc0..9fd7dcab7 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -168,6 +168,9 @@ rte_distributor_return_pkt(struct rte_distributor *d,
 		if (num == 1)
 			return rte_distributor_return_pkt_single(d->d_single,
 				worker_id, oldpkt[0]);
+		else if (num == 0)
+			return rte_distributor_return_pkt_single(d->d_single,
+				worker_id, NULL);
 		else
 			return -EINVAL;
 	}
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v5 08/15] test/distributor: fix freeing mbufs
       [not found]                   ` <CGME20201008052342eucas1p19e8474360d1f7dacd4164b3e21e54290@eucas1p1.samsung.com>
@ 2020-10-08  5:23                     ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-08  5:23 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

Sanity tests with mbuf alloc and shutdown tests assume that
mbufs passed to worker cores are freed in the handlers.
Such packets should not be returned to the distributor's main
core. The only packets that should be returned are the packets
sent after completion of the tests in the quit_workers() function.

This patch stops returning mbufs to the distributor's core.
In the case of shutdown tests, it is impossible to determine
how the worker and distributor threads would synchronize.
Packets used by the tests should be freed, while packets used during
quit_workers() shouldn't. That's why returning mbufs to the mempool
is moved from the worker threads to the test procedure running
on the distributor thread.

Additionally, this patch cleans up unused variables.

Fixes: c0de0eb82e40 ("distributor: switch over to new API")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 app/test/test_distributor.c | 68 ++++++++++++++++++-------------------
 1 file changed, 33 insertions(+), 35 deletions(-)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index 838459392..d7f780acc 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -67,20 +67,18 @@ handle_work(void *arg)
 	struct rte_mbuf *buf[8] __rte_cache_aligned;
 	struct worker_params *wp = arg;
 	struct rte_distributor *db = wp->dist;
-	unsigned int count = 0, num;
+	unsigned int num;
 	unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED);
 
 	num = rte_distributor_get_pkt(db, id, buf, NULL, 0);
 	while (!quit) {
 		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
 				__ATOMIC_ACQ_REL);
-		count += num;
 		num = rte_distributor_get_pkt(db, id,
 				buf, buf, num);
 	}
 	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
 			__ATOMIC_ACQ_REL);
-	count += num;
 	rte_distributor_return_pkt(db, id, buf, num);
 	return 0;
 }
@@ -276,21 +274,18 @@ handle_work_with_free_mbufs(void *arg)
 	struct rte_mbuf *buf[8] __rte_cache_aligned;
 	struct worker_params *wp = arg;
 	struct rte_distributor *d = wp->dist;
-	unsigned int count = 0;
 	unsigned int i;
 	unsigned int num;
 	unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED);
 
 	num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 	while (!quit) {
-		count += num;
 		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
 				__ATOMIC_ACQ_REL);
 		for (i = 0; i < num; i++)
 			rte_pktmbuf_free(buf[i]);
 		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 	}
-	count += num;
 	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
 			__ATOMIC_ACQ_REL);
 	rte_distributor_return_pkt(d, id, buf, num);
@@ -318,7 +313,6 @@ sanity_test_with_mbuf_alloc(struct worker_params *wp, struct rte_mempool *p)
 			rte_distributor_process(d, NULL, 0);
 		for (j = 0; j < BURST; j++) {
 			bufs[j]->hash.usr = (i+j) << 1;
-			rte_mbuf_refcnt_set(bufs[j], 1);
 		}
 
 		rte_distributor_process(d, bufs, BURST);
@@ -342,15 +336,10 @@ sanity_test_with_mbuf_alloc(struct worker_params *wp, struct rte_mempool *p)
 static int
 handle_work_for_shutdown_test(void *arg)
 {
-	struct rte_mbuf *pkt = NULL;
 	struct rte_mbuf *buf[8] __rte_cache_aligned;
 	struct worker_params *wp = arg;
 	struct rte_distributor *d = wp->dist;
-	unsigned int count = 0;
 	unsigned int num;
-	unsigned int total = 0;
-	unsigned int i;
-	unsigned int returned = 0;
 	unsigned int zero_id = 0;
 	unsigned int zero_unset;
 	const unsigned int id = __atomic_fetch_add(&worker_idx, 1,
@@ -368,11 +357,8 @@ handle_work_for_shutdown_test(void *arg)
 	/* wait for quit single globally, or for worker zero, wait
 	 * for zero_quit */
 	while (!quit && !(id == zero_id && zero_quit)) {
-		count += num;
 		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
 				__ATOMIC_ACQ_REL);
-		for (i = 0; i < num; i++)
-			rte_pktmbuf_free(buf[i]);
 		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 
 		if (num > 0) {
@@ -381,15 +367,12 @@ handle_work_for_shutdown_test(void *arg)
 				false, __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE);
 		}
 		zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
-
-		total += num;
 	}
-	count += num;
-	returned = rte_distributor_return_pkt(d, id, buf, num);
-
 	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
 			__ATOMIC_ACQ_REL);
 	if (id == zero_id) {
+		rte_distributor_return_pkt(d, id, NULL, 0);
+
 		/* for worker zero, allow it to restart to pick up last packet
 		 * when all workers are shutting down.
 		 */
@@ -401,14 +384,10 @@ handle_work_for_shutdown_test(void *arg)
 		while (!quit) {
 			__atomic_fetch_add(&worker_stats[id].handled_packets,
 					num, __ATOMIC_ACQ_REL);
-			count += num;
-			rte_pktmbuf_free(pkt);
 			num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 		}
-		returned = rte_distributor_return_pkt(d,
-				id, buf, num);
-		printf("Num returned = %d\n", returned);
 	}
+	rte_distributor_return_pkt(d, id, buf, num);
 	return 0;
 }
 
@@ -424,7 +403,9 @@ sanity_test_with_worker_shutdown(struct worker_params *wp,
 {
 	struct rte_distributor *d = wp->dist;
 	struct rte_mbuf *bufs[BURST];
-	unsigned i;
+	struct rte_mbuf *bufs2[BURST];
+	unsigned int i;
+	unsigned int failed = 0;
 
 	printf("=== Sanity test of worker shutdown ===\n");
 
@@ -450,16 +431,17 @@ sanity_test_with_worker_shutdown(struct worker_params *wp,
 	 */
 
 	/* get more buffers to queue up, again setting them to the same flow */
-	if (rte_mempool_get_bulk(p, (void *)bufs, BURST) != 0) {
+	if (rte_mempool_get_bulk(p, (void *)bufs2, BURST) != 0) {
 		printf("line %d: Error getting mbufs from pool\n", __LINE__);
+		rte_mempool_put_bulk(p, (void *)bufs, BURST);
 		return -1;
 	}
 	for (i = 0; i < BURST; i++)
-		bufs[i]->hash.usr = 1;
+		bufs2[i]->hash.usr = 1;
 
 	/* get worker zero to quit */
 	zero_quit = 1;
-	rte_distributor_process(d, bufs, BURST);
+	rte_distributor_process(d, bufs2, BURST);
 
 	/* flush the distributor */
 	rte_distributor_flush(d);
@@ -474,9 +456,15 @@ sanity_test_with_worker_shutdown(struct worker_params *wp,
 		printf("Line %d: Error, not all packets flushed. "
 				"Expected %u, got %u\n",
 				__LINE__, BURST * 2, total_packet_count());
-		return -1;
+		failed = 1;
 	}
 
+	rte_mempool_put_bulk(p, (void *)bufs, BURST);
+	rte_mempool_put_bulk(p, (void *)bufs2, BURST);
+
+	if (failed)
+		return -1;
+
 	printf("Sanity test with worker shutdown passed\n\n");
 	return 0;
 }
@@ -490,7 +478,8 @@ test_flush_with_worker_shutdown(struct worker_params *wp,
 {
 	struct rte_distributor *d = wp->dist;
 	struct rte_mbuf *bufs[BURST];
-	unsigned i;
+	unsigned int i;
+	unsigned int failed = 0;
 
 	printf("=== Test flush fn with worker shutdown (%s) ===\n", wp->name);
 
@@ -528,9 +517,14 @@ test_flush_with_worker_shutdown(struct worker_params *wp,
 		printf("Line %d: Error, not all packets flushed. "
 				"Expected %u, got %u\n",
 				__LINE__, BURST, total_packet_count());
-		return -1;
+		failed = 1;
 	}
 
+	rte_mempool_put_bulk(p, (void *)bufs, BURST);
+
+	if (failed)
+		return -1;
+
 	printf("Flush test with worker shutdown passed\n\n");
 	return 0;
 }
@@ -596,7 +590,10 @@ quit_workers(struct worker_params *wp, struct rte_mempool *p)
 	const unsigned num_workers = rte_lcore_count() - 1;
 	unsigned i;
 	struct rte_mbuf *bufs[RTE_MAX_LCORE];
-	rte_mempool_get_bulk(p, (void *)bufs, num_workers);
+	if (rte_mempool_get_bulk(p, (void *)bufs, num_workers) != 0) {
+		printf("line %d: Error getting mbufs from pool\n", __LINE__);
+		return;
+	}
 
 	zero_quit = 0;
 	quit = 1;
@@ -604,11 +601,12 @@ quit_workers(struct worker_params *wp, struct rte_mempool *p)
 		bufs[i]->hash.usr = i << 1;
 	rte_distributor_process(d, bufs, num_workers);
 
-	rte_mempool_put_bulk(p, (void *)bufs, num_workers);
-
 	rte_distributor_process(d, NULL, 0);
 	rte_distributor_flush(d);
 	rte_eal_mp_wait_lcore();
+
+	rte_mempool_put_bulk(p, (void *)bufs, num_workers);
+
 	quit = 0;
 	worker_idx = 0;
 	zero_idx = RTE_MAX_LCORE;
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v5 09/15] test/distributor: collect return mbufs
       [not found]                   ` <CGME20201008052343eucas1p1649655353d6c76cdf6320a04e8d43f32@eucas1p1.samsung.com>
@ 2020-10-08  5:23                     ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-08  5:23 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

During the quit_workers() function, the distributor's main core processes
some packets to wake up pending worker cores so they can quit.
As quit_workers() also acts as a cleanup procedure for the next test
case, it should also collect the packets returned by the workers'
handlers, so that the cyclic buffer holding returned packets
in the distributor remains empty.

Fixes: c3eabff124e6 ("distributor: add unit tests")
Cc: bruce.richardson@intel.com
Fixes: c0de0eb82e40 ("distributor: switch over to new API")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 app/test/test_distributor.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index d7f780acc..838a67515 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -590,6 +590,7 @@ quit_workers(struct worker_params *wp, struct rte_mempool *p)
 	const unsigned num_workers = rte_lcore_count() - 1;
 	unsigned i;
 	struct rte_mbuf *bufs[RTE_MAX_LCORE];
+	struct rte_mbuf *returns[RTE_MAX_LCORE];
 	if (rte_mempool_get_bulk(p, (void *)bufs, num_workers) != 0) {
 		printf("line %d: Error getting mbufs from pool\n", __LINE__);
 		return;
@@ -605,6 +606,10 @@ quit_workers(struct worker_params *wp, struct rte_mempool *p)
 	rte_distributor_flush(d);
 	rte_eal_mp_wait_lcore();
 
+	while (rte_distributor_returned_pkts(d, returns, RTE_MAX_LCORE))
+		;
+
+	rte_distributor_clear_returns(d);
 	rte_mempool_put_bulk(p, (void *)bufs, num_workers);
 
 	quit = 0;
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v5 10/15] distributor: align API documentation with code
       [not found]                   ` <CGME20201008052344eucas1p270b04ad2c4346e6beb5f5ef844827085@eucas1p2.samsung.com>
@ 2020-10-08  5:23                     ` Lukasz Wojciechowski
  2020-10-08 14:35                       ` David Hunt
  0 siblings, 1 reply; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-08  5:23 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

After introducing the burst API, some artefacts from the legacy
single API remained in the API documentation.
Also, the documented return values of the rte_distributor_poll_pkt()
function did not match the implementation.
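
With the corrected return values, a non-blocking worker loop can be
sketched as follows (a simplified illustration, not part of the patch;
a zero-packet burst completion is treated as "retry" here as well):

#include <rte_distributor.h>
#include <rte_mbuf.h>

static int
poll_for_work(struct rte_distributor *d, unsigned int worker_id,
		struct rte_mbuf **pkts)
{
	int num;

	rte_distributor_request_pkt(d, worker_id, NULL, 0);
	do {
		/* -1: burst algorithm not ready; 0: single not ready */
		num = rte_distributor_poll_pkt(d, worker_id, pkts);
		/* other useful work could be done here */
	} while (num <= 0);

	return num;
}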

Fixes: c0de0eb82e40 ("distributor: switch over to new API")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
---
 lib/librte_distributor/rte_distributor.h | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/lib/librte_distributor/rte_distributor.h b/lib/librte_distributor/rte_distributor.h
index 327c0c4ab..a073e6461 100644
--- a/lib/librte_distributor/rte_distributor.h
+++ b/lib/librte_distributor/rte_distributor.h
@@ -155,7 +155,7 @@ rte_distributor_clear_returns(struct rte_distributor *d);
  * @param pkts
  *   The mbufs pointer array to be filled in (up to 8 packets)
  * @param oldpkt
- *   The previous packet, if any, being processed by the worker
+ *   The previous packets, if any, being processed by the worker
  * @param retcount
  *   The number of packets being returned
  *
@@ -187,15 +187,15 @@ rte_distributor_return_pkt(struct rte_distributor *d,
 
 /**
  * API called by a worker to request a new packet to process.
- * Any previous packet given to the worker is assumed to have completed
+ * Any previous packets given to the worker are assumed to have completed
  * processing, and may be optionally returned to the distributor via
  * the oldpkt parameter.
- * Unlike rte_distributor_get_pkt_burst(), this function does not wait for a
- * new packet to be provided by the distributor.
+ * Unlike rte_distributor_get_pkt(), this function does not wait for
+ * new packets to be provided by the distributor.
  *
- * NOTE: after calling this function, rte_distributor_poll_pkt_burst() should
- * be used to poll for the packet requested. The rte_distributor_get_pkt_burst()
- * API should *not* be used to try and retrieve the new packet.
+ * NOTE: after calling this function, rte_distributor_poll_pkt() should
+ * be used to poll for the packets requested. The rte_distributor_get_pkt()
+ * API should *not* be used to try and retrieve the new packets.
  *
  * @param d
  *   The distributor instance to be used
@@ -213,9 +213,9 @@ rte_distributor_request_pkt(struct rte_distributor *d,
 		unsigned int count);
 
 /**
- * API called by a worker to check for a new packet that was previously
+ * API called by a worker to check for new packets that were previously
  * requested by a call to rte_distributor_request_pkt(). It does not wait
- * for the new packet to be available, but returns NULL if the request has
+ * for the new packets to be available, but returns if the request has
  * not yet been fulfilled by the distributor.
  *
  * @param d
@@ -227,8 +227,9 @@ rte_distributor_request_pkt(struct rte_distributor *d,
  *   The array of mbufs being given to the worker
  *
  * @return
- *   The number of packets being given to the worker thread, zero if no
- *   packet is yet available.
+ *   The number of packets being given to the worker thread,
+ *   -1 if no packets are yet available (burst API - RTE_DIST_ALG_BURST)
+ *   0 if no packets are yet available (legacy single API - RTE_DIST_ALG_SINGLE)
  */
 int
 rte_distributor_poll_pkt(struct rte_distributor *d,
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v5 11/15] test/distributor: replace delays with spin locks
       [not found]                   ` <CGME20201008052345eucas1p29e14456610d4ed48c09b8cf7bd338e18@eucas1p2.samsung.com>
@ 2020-10-08  5:23                     ` Lukasz Wojciechowski
  2020-10-09 12:23                       ` David Hunt
  0 siblings, 1 reply; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-08  5:23 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

Instead of adding delays to the test code and hoping
that a worker reaches the proper state in time,
synchronize the worker shutdown test cases by spinning
on an atomic variable.
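
The synchronization reduces to a two-phase spin on a single atomic
flag; both sides of the handshake are condensed below (variable names
as in the test code):

#include <unistd.h>

static volatile int zero_quit;  /* cleared by the main lcore */
static volatile int zero_sleep; /* set while the worker is parked */

/* worker side: announce "parked", spin until released */
static void
worker_park(void)
{
	__atomic_store_n(&zero_sleep, 1, __ATOMIC_RELEASE);
	while (zero_quit)
		usleep(100);
	__atomic_store_n(&zero_sleep, 0, __ATOMIC_RELEASE);
}

/* main side: wait for the worker to park, release it, wait for wakeup */
static void
main_sync(void)
{
	while (!__atomic_load_n(&zero_sleep, __ATOMIC_ACQUIRE))
		; /* the test calls rte_distributor_flush() here */
	zero_quit = 0;
	while (__atomic_load_n(&zero_sleep, __ATOMIC_ACQUIRE))
		; /* the test calls rte_delay_us(100) here */
}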

Fixes: c0de0eb82e40 ("distributor: switch over to new API")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
---
 app/test/test_distributor.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index 838a67515..1e0a079ff 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -27,6 +27,7 @@ struct worker_params worker_params;
 /* statics - all zero-initialized by default */
 static volatile int quit;      /**< general quit variable for all threads */
 static volatile int zero_quit; /**< var for when we just want thr0 to quit*/
+static volatile int zero_sleep; /**< thr0 has quit basic loop and is sleeping*/
 static volatile unsigned worker_idx;
 static volatile unsigned zero_idx;
 
@@ -376,8 +377,10 @@ handle_work_for_shutdown_test(void *arg)
 		/* for worker zero, allow it to restart to pick up last packet
 		 * when all workers are shutting down.
 		 */
+		__atomic_store_n(&zero_sleep, 1, __ATOMIC_RELEASE);
 		while (zero_quit)
 			usleep(100);
+		__atomic_store_n(&zero_sleep, 0, __ATOMIC_RELEASE);
 
 		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 
@@ -445,7 +448,12 @@ sanity_test_with_worker_shutdown(struct worker_params *wp,
 
 	/* flush the distributor */
 	rte_distributor_flush(d);
-	rte_delay_us(10000);
+	while (!__atomic_load_n(&zero_sleep, __ATOMIC_ACQUIRE))
+		rte_distributor_flush(d);
+
+	zero_quit = 0;
+	while (__atomic_load_n(&zero_sleep, __ATOMIC_ACQUIRE))
+		rte_delay_us(100);
 
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
@@ -505,9 +513,14 @@ test_flush_with_worker_shutdown(struct worker_params *wp,
 	/* flush the distributor */
 	rte_distributor_flush(d);
 
-	rte_delay_us(10000);
+	while (!__atomic_load_n(&zero_sleep, __ATOMIC_ACQUIRE))
+		rte_distributor_flush(d);
 
 	zero_quit = 0;
+
+	while (__atomic_load_n(&zero_sleep, __ATOMIC_ACQUIRE))
+		rte_delay_us(100);
+
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
 			__atomic_load_n(&worker_stats[i].handled_packets,
@@ -615,6 +628,8 @@ quit_workers(struct worker_params *wp, struct rte_mempool *p)
 	quit = 0;
 	worker_idx = 0;
 	zero_idx = RTE_MAX_LCORE;
+	zero_quit = 0;
+	zero_sleep = 0;
 }
 
 static int
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v5 12/15] distributor: fix scalar matching
       [not found]                   ` <CGME20201008052345eucas1p17a05f99986032885a0316d3419cdea2d@eucas1p1.samsung.com>
@ 2020-10-08  5:23                     ` Lukasz Wojciechowski
  2020-10-09 12:31                       ` David Hunt
  0 siblings, 1 reply; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-08  5:23 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

Fix improper indexes used while comparing tags.
In the find_match_scalar() function:
* j iterates over the flow tags of the incoming packets;
* w iterates over the backlog or in-flight tag positions.

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
---
 lib/librte_distributor/rte_distributor.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 9fd7dcab7..4bd23a990 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -261,13 +261,13 @@ find_match_scalar(struct rte_distributor *d,
 
 		for (j = 0; j < RTE_DIST_BURST_SIZE ; j++)
 			for (w = 0; w < RTE_DIST_BURST_SIZE; w++)
-				if (d->in_flight_tags[i][j] == data_ptr[w]) {
+				if (d->in_flight_tags[i][w] == data_ptr[j]) {
 					output_ptr[j] = i+1;
 					break;
 				}
 		for (j = 0; j < RTE_DIST_BURST_SIZE; j++)
 			for (w = 0; w < RTE_DIST_BURST_SIZE; w++)
-				if (bl->tags[j] == data_ptr[w]) {
+				if (bl->tags[w] == data_ptr[j]) {
 					output_ptr[j] = i+1;
 					break;
 				}
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v5 13/15] test/distributor: add test with packets marking
       [not found]                   ` <CGME20201008052346eucas1p15b04bf84cafc2ba52bbe063f57d08c39@eucas1p1.samsung.com>
@ 2020-10-08  5:23                     ` Lukasz Wojciechowski
  2020-10-09 12:50                       ` David Hunt
  0 siblings, 1 reply; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-08  5:23 UTC (permalink / raw)
  To: David Hunt; +Cc: dev, l.wojciechow

All of the former tests analyzed only the statistics
of packets processed by all workers.
The new test also verifies that packets are processed
by the workers as expected.
Every packet processed by a worker is marked
and analyzed after it is returned to the distributor.

This test allows finding issues in the matching algorithms.
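
The marking packs two values into the mbuf udata64 field: a sequence
number in the upper bits and the worker's mark in the low bits. A
minimal sketch of the encoding (the shift value matches the test):

#include <stdint.h>

#define SEQ_SHIFT 10 /* low bits are reserved for the worker mark */

/* sender: stamp each packet with its position in the sent sequence */
static inline uint64_t
mark_seq(unsigned int seq)
{
	return (uint64_t)seq << SEQ_SHIFT;
}

/* worker with index id: leave a non-zero mark in the low bits */
static inline uint64_t
mark_worker(uint64_t udata, unsigned int id)
{
	return udata + id + 1;
}

/* verifier: split the field back into sequence number and worker mark */
static inline void
mark_split(uint64_t udata, unsigned int *seq, unsigned int *id)
{
	*seq = udata >> SEQ_SHIFT;
	*id = udata - ((uint64_t)*seq << SEQ_SHIFT);
}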

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
---
 app/test/test_distributor.c | 141 ++++++++++++++++++++++++++++++++++++
 1 file changed, 141 insertions(+)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index 1e0a079ff..0404e463a 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -542,6 +542,141 @@ test_flush_with_worker_shutdown(struct worker_params *wp,
 	return 0;
 }
 
+static int
+handle_and_mark_work(void *arg)
+{
+	struct rte_mbuf *buf[8] __rte_cache_aligned;
+	struct worker_params *wp = arg;
+	struct rte_distributor *db = wp->dist;
+	unsigned int num, i;
+	unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED);
+	num = rte_distributor_get_pkt(db, id, buf, NULL, 0);
+	while (!quit) {
+		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
+				__ATOMIC_ACQ_REL);
+		for (i = 0; i < num; i++)
+			buf[i]->udata64 += id + 1;
+		num = rte_distributor_get_pkt(db, id,
+				buf, buf, num);
+	}
+	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
+			__ATOMIC_ACQ_REL);
+	rte_distributor_return_pkt(db, id, buf, num);
+	return 0;
+}
+
+/* sanity_mark_test sends packets to workers which mark them.
+ * Every packet has also encoded sequence number.
+ * The returned packets are sorted and verified if they were handled
+ * by proper workers.
+ */
+static int
+sanity_mark_test(struct worker_params *wp, struct rte_mempool *p)
+{
+	const unsigned int buf_count = 24;
+	const unsigned int burst = 8;
+	const unsigned int shift = 12;
+	const unsigned int seq_shift = 10;
+
+	struct rte_distributor *db = wp->dist;
+	struct rte_mbuf *bufs[buf_count];
+	struct rte_mbuf *returns[buf_count];
+	unsigned int i, count, id;
+	unsigned int sorted[buf_count], seq;
+	unsigned int failed = 0;
+
+	printf("=== Marked packets test ===\n");
+	clear_packet_count();
+	if (rte_mempool_get_bulk(p, (void *)bufs, buf_count) != 0) {
+		printf("line %d: Error getting mbufs from pool\n", __LINE__);
+		return -1;
+	}
+
+/* bufs' hashes will be like these below, but shifted left.
+ * The shifting is for avoiding collisions with backlogs
+ * and in-flight tags left by previous tests.
+ * [1, 1, 1, 1, 1, 1, 1, 1
+ *  1, 1, 1, 1, 2, 2, 2, 2
+ *  2, 2, 2, 2, 1, 1, 1, 1]
+ */
+	for (i = 0; i < burst; i++) {
+		bufs[0 * burst + i]->hash.usr = 1 << shift;
+		bufs[1 * burst + i]->hash.usr = ((i < burst / 2) ? 1 : 2)
+			<< shift;
+		bufs[2 * burst + i]->hash.usr = ((i < burst / 2) ? 2 : 1)
+			<< shift;
+	}
+/* Assign a sequence number to each packet. The sequence is shifted,
+ * so that the lower bits of the udata64 will hold the mark from the worker.
+ */
+	for (i = 0; i < buf_count; i++)
+		bufs[i]->udata64 = i << seq_shift;
+
+	count = 0;
+	for (i = 0; i < buf_count/burst; i++) {
+		rte_distributor_process(db, &bufs[i * burst], burst);
+		count += rte_distributor_returned_pkts(db, &returns[count],
+			buf_count - count);
+	}
+
+	do {
+		rte_distributor_flush(db);
+		count += rte_distributor_returned_pkts(db, &returns[count],
+			buf_count - count);
+	} while (count < buf_count);
+
+	for (i = 0; i < rte_lcore_count() - 1; i++)
+		printf("Worker %u handled %u packets\n", i,
+			__atomic_load_n(&worker_stats[i].handled_packets,
+					__ATOMIC_ACQUIRE));
+
+/* Sort returned packets by sent order (sequence numbers). */
+	for (i = 0; i < buf_count; i++) {
+		seq = returns[i]->udata64 >> seq_shift;
+		id = returns[i]->udata64 - (seq << seq_shift);
+		sorted[seq] = id;
+	}
+
+/* Verify that packets [0-11] and [20-23] were processed
+ * by the same worker
+ */
+	for (i = 1; i < 12; i++) {
+		if (sorted[i] != sorted[0]) {
+			printf("Packet number %u processed by worker %u,"
+				" but should be processed by worker %u\n",
+				i, sorted[i], sorted[0]);
+			failed = 1;
+		}
+	}
+	for (i = 20; i < 24; i++) {
+		if (sorted[i] != sorted[0]) {
+			printf("Packet number %u processed by worker %u,"
+				" but should be processed by worker %u\n",
+				i, sorted[i], sorted[0]);
+			failed = 1;
+		}
+	}
+/* And verify that packets [12-19] were processed
+ * by another worker
+ */
+	for (i = 13; i < 20; i++) {
+		if (sorted[i] != sorted[12]) {
+			printf("Packet number %u processed by worker %u,"
+				" but should be processed by worker %u\n",
+				i, sorted[i], sorted[12]);
+			failed = 1;
+		}
+	}
+
+	rte_mempool_put_bulk(p, (void *)bufs, buf_count);
+
+	if (failed)
+		return -1;
+
+	printf("Marked packets test passed\n");
+	return 0;
+}
+
 static
 int test_error_distributor_create_name(void)
 {
@@ -726,6 +861,12 @@ test_distributor(void)
 				goto err;
 			quit_workers(&worker_params, p);
 
+			rte_eal_mp_remote_launch(handle_and_mark_work,
+					&worker_params, SKIP_MASTER);
+			if (sanity_mark_test(&worker_params, p) < 0)
+				goto err;
+			quit_workers(&worker_params, p);
+
 		} else {
 			printf("Too few cores to run worker shutdown test\n");
 		}
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v5 14/15] distributor: fix flushing in flight packets
       [not found]                   ` <CGME20201008052347eucas1p1570239523104a0d609c928d8b149ebdf@eucas1p1.samsung.com>
@ 2020-10-08  5:23                     ` Lukasz Wojciechowski
  2020-10-09 13:10                       ` David Hunt
  0 siblings, 1 reply; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-08  5:23 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

rte_distributor_flush() uses the total_outstanding()
function to decide whether it should keep waiting
for packets still being processed. However, in burst mode
only backlog packets were counted.

This patch fixes that issue by also counting in-flight
packets. There are also some fixes to properly keep
the count of in-flight packets for each worker in bufs[].count.
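
Conceptually, the fixed helper now sums both places where a packet can
wait for a given worker (a restatement of the hunk below; the fields
come from the internal distributor_private.h header):

/* assumes the internal struct rte_distributor definition is visible */
static unsigned int
outstanding_sketch(const struct rte_distributor *d)
{
	unsigned int wkr, total = 0;

	for (wkr = 0; wkr < d->num_workers; wkr++)
		total += d->backlog[wkr].count /* queued, not yet handed out */
			+ d->bufs[wkr].count;  /* handed out, still in flight */

	return total;
}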

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
---
 lib/librte_distributor/rte_distributor.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 4bd23a990..2478de3b7 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -467,6 +467,7 @@ rte_distributor_process(struct rte_distributor *d,
 			/* Sync with worker on GET_BUF flag. */
 			if (__atomic_load_n(&(d->bufs[wid].bufptr64[0]),
 				__ATOMIC_ACQUIRE) & RTE_DISTRIB_GET_BUF) {
+				d->bufs[wid].count = 0;
 				release(d, wid);
 				handle_returns(d, wid);
 			}
@@ -481,11 +482,6 @@ rte_distributor_process(struct rte_distributor *d,
 		uint16_t matches[RTE_DIST_BURST_SIZE];
 		unsigned int pkts;
 
-		/* Sync with worker on GET_BUF flag. */
-		if (__atomic_load_n(&(d->bufs[wkr].bufptr64[0]),
-			__ATOMIC_ACQUIRE) & RTE_DISTRIB_GET_BUF)
-			d->bufs[wkr].count = 0;
-
 		if ((num_mbufs - next_idx) < RTE_DIST_BURST_SIZE)
 			pkts = num_mbufs - next_idx;
 		else
@@ -605,8 +601,10 @@ rte_distributor_process(struct rte_distributor *d,
 	for (wid = 0 ; wid < d->num_workers; wid++)
 		/* Sync with worker on GET_BUF flag. */
 		if ((__atomic_load_n(&(d->bufs[wid].bufptr64[0]),
-			__ATOMIC_ACQUIRE) & RTE_DISTRIB_GET_BUF))
+			__ATOMIC_ACQUIRE) & RTE_DISTRIB_GET_BUF)) {
+			d->bufs[wid].count = 0;
 			release(d, wid);
+		}
 
 	return num_mbufs;
 }
@@ -649,7 +647,7 @@ total_outstanding(const struct rte_distributor *d)
 	unsigned int wkr, total_outstanding = 0;
 
 	for (wkr = 0; wkr < d->num_workers; wkr++)
-		total_outstanding += d->backlog[wkr].count;
+		total_outstanding += d->backlog[wkr].count + d->bufs[wkr].count;
 
 	return total_outstanding;
 }
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v5 15/15] distributor: fix clearing returns buffer
       [not found]                   ` <CGME20201008052348eucas1p183cfbe10d10bd98c7a63a34af98b80df@eucas1p1.samsung.com>
@ 2020-10-08  5:23                     ` Lukasz Wojciechowski
  2020-10-09 13:12                       ` David Hunt
  0 siblings, 1 reply; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-08  5:23 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

The patch clears the distributor's returns buffer
in clear_returns() by setting start and count to 0.

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
---
 lib/librte_distributor/rte_distributor.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 2478de3b7..57240304a 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -704,6 +704,8 @@ rte_distributor_clear_returns(struct rte_distributor *d)
 		/* Sync with worker. Release retptrs. */
 		__atomic_store_n(&(d->bufs[wkr].retptr64[0]), 0,
 				__ATOMIC_RELEASE);
+
+	d->returns.start = d->returns.count = 0;
 }
 
 /* creates a distributor instance */
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v5 00/15] fix distributor synchronization issues
  2020-10-08  5:23                 ` [dpdk-dev] [PATCH v5 00/15] fix distributor synchronization issues Lukasz Wojciechowski
                                     ` (14 preceding siblings ...)
       [not found]                   ` <CGME20201008052348eucas1p183cfbe10d10bd98c7a63a34af98b80df@eucas1p1.samsung.com>
@ 2020-10-08  7:30                   ` David Marchand
  2020-10-08 21:16                     ` Lukasz Wojciechowski
       [not found]                   ` <CGME20201009220207eucas1p1d83b63b4f0e05cbaf0a58f7f01ec0052@eucas1p1.samsung.com>
  16 siblings, 1 reply; 164+ messages in thread
From: David Marchand @ 2020-10-08  7:30 UTC (permalink / raw)
  To: Lukasz Wojciechowski, David Hunt, Honnappa Nagarahalli; +Cc: dev, Sarosh Arif

On Thu, Oct 8, 2020 at 7:24 AM Lukasz Wojciechowski
<l.wojciechow@partner.samsung.com> wrote:
>
> During review and verification of the patch created by Sarosh Arif:
> "test_distributor: prevent memory leakages from the pool" I found out
> that running distributor unit tests multiple times in a row causes failures.
> So I investigated all the issues I found.
>
> There are a few synchronization issues that might cause deadlocks
> or corrupted data. They are fixed with this set of patches for both tests
> and librte_distributor library.
>
> ---
> v5:
> * implement missing functionality in burst mode - worker shutdown
> * fix shutdown test to always shutdown busy worker
> * use atomic stores instead of barrier in tests clear_packet_count()
> * reorder patches
> * new patch 7: fix call to return_pkt in single mode
> * new patch 11: replacing delays with spinlock on atomics in tests
> * new patch 12: fix scalar matching algorithm
> * new patch 13: new test with marking and checking every packet
> * new patch 14: flush also in flight packets
> * new patch 15: fix clearing returns buffer
> * minor fixes in other patches

Thanks for working on it, Lukasz.
David, Honnappa, review please.


-- 
David Marchand


^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v5 03/15] distributor: do not use oldpkt when not needed
  2020-10-08  5:23                     ` [dpdk-dev] [PATCH v5 03/15] distributor: do not use oldpkt when not needed Lukasz Wojciechowski
@ 2020-10-08  8:13                       ` David Hunt
  0 siblings, 0 replies; 164+ messages in thread
From: David Hunt @ 2020-10-08  8:13 UTC (permalink / raw)
  To: Lukasz Wojciechowski, Bruce Richardson; +Cc: dev, stable

Hi Lukasz,

On 8/10/2020 6:23 AM, Lukasz Wojciechowski wrote:
> rte_distributor_request_pkt and rte_distributor_get_pkt dereferenced
> the oldpkt parameter when in RTE_DIST_ALG_SINGLE even if the number
> of buffers returned from worker to distributor was 0.
>
> This patch passes NULL to the legacy API when the number of returned
> buffers is 0. This allows passing NULL as the oldpkt parameter.
>
> Distributor tests are also updated to pass NULL as oldpkt and
> 0 as the number of returned packets where packets are not returned.
>
> Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
> Cc: david.hunt@intel.com
> Cc: stable@dpdk.org
>
> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
> ---


Good edge-case catch, thanks.

Acked-by: David Hunt <david.hunt@intel.com>



^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v5 04/15] distributor: handle worker shutdown in burst mode
  2020-10-08  5:23                     ` [dpdk-dev] [PATCH v5 04/15] distributor: handle worker shutdown in burst mode Lukasz Wojciechowski
@ 2020-10-08 14:26                       ` David Hunt
  2020-10-08 21:07                         ` Lukasz Wojciechowski
  0 siblings, 1 reply; 164+ messages in thread
From: David Hunt @ 2020-10-08 14:26 UTC (permalink / raw)
  To: Lukasz Wojciechowski, Bruce Richardson; +Cc: dev, stable


On 8/10/2020 6:23 AM, Lukasz Wojciechowski wrote:
> The burst version of distributor implementation was missing proper
> handling of worker shutdown. A worker processing packets received
> from distributor can call rte_distributor_return_pkt() function
> informing distributor that it wants no more packets. Further calls to
> rte_distributor_request_pkt() or rte_distributor_get_pkt() however
> should inform distributor that new packets are requested again.
>
> Lack of a proper implementation caused new packets to still be sent
> from the distributor even after a worker informed it about returning
> its last packets, resulting in deadlocks as no one could get them
> on the worker side.
>
> This patch adds handling of worker shutdown in the following way:
> 1) It fixes usage of RTE_DISTRIB_VALID_BUF handshake flag. This flag
> was formerly unused in burst implementation and now it is used
> for marking valid packets in retptr64, replacing invalid use
> of RTE_DISTRIB_RETURN_BUF flag.
> 2) Uses RTE_DISTRIB_RETURN_BUF as a worker to distributor handshake
> in retptr64 to indicate that the worker has shut down.
> 3) A worker that shuts down also blocks bufptr for itself with
> RTE_DISTRIB_RETURN_BUF flag, allowing distributor to retrieve any
> in flight packets.
> 4) When distributor receives information about shutdown of a worker,
> it: marks worker as not active; retrieves any in flight and backlog
> packets and redistributes them to other workers; unlocks bufptr64
> by clearing RTE_DISTRIB_RETURN_BUF flag, allowing its use in
> the future if the worker requests any new packets.
> 5) Does not allow sending or adding to backlog any packets for
> inactive workers. Such workers are also ignored if matched.
> 6) Adjusts calls to handle_returns() and the tags matching procedure
> to react to possible activation or deactivation of workers.
>
> Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
> Cc: david.hunt@intel.com
> Cc: stable@dpdk.org
>
> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
> ---


Hi Lukasz,

    I spent the most amount of time going through this particular patch, 
and it looks good to me (even the bit where rte_distributor_process is 
called recursively) :)

I'll try and get some time to run through some more testing, but for now:

Acked-by: David Hunt <david.hunt@intel.com>





^ permalink raw reply	[flat|nested] 164+ messages in thread
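
A condensed sketch of the shutdown handshake described in points 1) to 4)
of the commit message, using the flag names from the patch; the slot
layout and the "active" bookkeeping name are assumptions for
illustration, not the exact library code:

	/* worker side, on shutdown: raise the worker-to-distributor
	 * shutdown flag in the returns slot
	 */
	__atomic_store_n(&buf->retptr64[0],
			buf->retptr64[0] | RTE_DISTRIB_RETURN_BUF,
			__ATOMIC_RELEASE);

	/* distributor side, in handle_returns(): detect the flag,
	 * park the worker and reclaim its packets
	 */
	if (__atomic_load_n(&buf->retptr64[0], __ATOMIC_ACQUIRE)
			& RTE_DISTRIB_RETURN_BUF) {
		d->active[wkr] = 0;	/* name assumed for "not active" */
		/* re-process this worker's in-flight and backlog
		 * packets to other workers, then clear
		 * RTE_DISTRIB_RETURN_BUF so the worker may request
		 * packets again later
		 */
	}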

* Re: [dpdk-dev] [PATCH v5 07/15] distributor: fix return pkt calls in single mode
  2020-10-08  5:23                     ` [dpdk-dev] [PATCH v5 07/15] distributor: fix return pkt calls in single mode Lukasz Wojciechowski
@ 2020-10-08 14:32                       ` David Hunt
  0 siblings, 0 replies; 164+ messages in thread
From: David Hunt @ 2020-10-08 14:32 UTC (permalink / raw)
  To: Lukasz Wojciechowski, Bruce Richardson; +Cc: dev, stable


On 8/10/2020 6:23 AM, Lukasz Wojciechowski wrote:
> In the single legacy version of the distributor, synchronization
> requires a continuous exchange of buffers between distributor
> and workers. Empty buffers are sent if only handshake
> synchronization is required.
> However, calls to rte_distributor_return_pkt()
> with 0 buffers in single mode were ignored and not passed to the
> legacy algorithm implementation, causing lack of synchronization.
>
> This patch fixes this issue by passing NULL as the buffer, which is
> a valid way of sending just synchronization handshakes
> in single mode.
>
> Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
> Cc: david.hunt@intel.com
> Cc: stable@dpdk.org
>
> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
> ---
>   lib/librte_distributor/rte_distributor.c | 3 +++
>   1 file changed, 3 insertions(+)
>
> diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
> index 115443fc0..9fd7dcab7 100644
> --- a/lib/librte_distributor/rte_distributor.c
> +++ b/lib/librte_distributor/rte_distributor.c
> @@ -168,6 +168,9 @@ rte_distributor_return_pkt(struct rte_distributor *d,
>   		if (num == 1)
>   			return rte_distributor_return_pkt_single(d->d_single,
>   				worker_id, oldpkt[0]);
> +		else if (num == 0)
> +			return rte_distributor_return_pkt_single(d->d_single,
> +				worker_id, NULL);
>   		else
>   			return -EINVAL;
>   	}


Acked-by: David Hunt <david.hunt@intel.com>



^ permalink raw reply	[flat|nested] 164+ messages in thread
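
With this fix in place, a worker running on top of the single (legacy)
algorithm can send a pure handshake; a minimal usage sketch:

	/* nothing to hand back: num == 0 with a NULL buffer is now a
	 * valid synchronization-only call instead of hitting -EINVAL
	 */
	rte_distributor_return_pkt(d, worker_id, NULL, 0);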

* Re: [dpdk-dev] [PATCH v5 10/15] distributor: align API documentation with code
  2020-10-08  5:23                     ` [dpdk-dev] [PATCH v5 10/15] distributor: align API documentation with code Lukasz Wojciechowski
@ 2020-10-08 14:35                       ` David Hunt
  0 siblings, 0 replies; 164+ messages in thread
From: David Hunt @ 2020-10-08 14:35 UTC (permalink / raw)
  To: Lukasz Wojciechowski, Bruce Richardson; +Cc: dev, stable

Hi Lukasz,

On 8/10/2020 6:23 AM, Lukasz Wojciechowski wrote:
> After introducing the burst API there were some artefacts in the
> API documentation left over from the legacy single API.
> Also, the rte_distributor_poll_pkt() function's return values
> did not match the implementation.
>
> Fixes: c0de0eb82e40 ("distributor: switch over to new API")
> Cc: david.hunt@intel.com
> Cc: stable@dpdk.org
>
> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
> ---


Good doc catch, thanks! :)


Acked-by: David Hunt <david.hunt@intel.com>



^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v4 2/8] test/distributor: synchronize lcores statistics
  2020-10-02 11:25                     ` Lukasz Wojciechowski
@ 2020-10-08 20:47                       ` Lukasz Wojciechowski
  2020-10-16  5:43                       ` Honnappa Nagarahalli
  1 sibling, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-08 20:47 UTC (permalink / raw)
  To: Honnappa Nagarahalli, David Hunt, Bruce Richardson; +Cc: dev, stable, nd

Hi Honnappa,

I pushed v5 of the patches today.

However, all I fixed in this patch is replacing the memset with a loop of
atomic store operations. I didn't move clearing and checking of the statistics
out of the test yet (to make sure no worker is running).

After fixing a few more distributor issues I wasn't able to catch even
a single failure using the ACQ/REL memory model.
I'll test it maybe tomorrow with RELAXED to see if it will work.
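
For reference, the memset replacement mentioned above can be sketched
roughly as follows (worker_stats and RTE_MAX_LCORE as in the test; the
memory order actually used in v5 may differ):

	static inline void
	clear_packet_count(void)
	{
		unsigned int i;

		for (i = 0; i < RTE_MAX_LCORE; i++)
			__atomic_store_n(&worker_stats[i].handled_packets,
					0, __ATOMIC_RELAXED);
	}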

Best regards

Lukasz

On 02.10.2020 at 13:25, Lukasz Wojciechowski wrote:
> Hi Honnappa,
>
> Many thanks for the review!
>
> I'll write my answers here, not inline, as it would be easier to read them
> in one place, I think.
> So first of all I agree with you on 2 things:
> 1) all uses of statistics must be atomic and lack of that caused most of
> the problems
> 2) it would be better to replace barrier and memset in
> clear_packet_count() with atomic stores as you suggested
>
> So I will apply both of above.
>
> However I wasn't fully convinced about changing acquire/release to
> relaxed. It would be perfectly ok
> if it looked like Herb Sutter's example here:
> https://youtu.be/KeLBd2EJLOU?t=4170
> But in his case the counters are cleared before the worker threads start
> and are printed out after they have completed.
>
> In case of the dpdk distributor tests both worker and main cores are
> running at the same time. In the sanity_test, the statistics are cleared
> and verified a few times for different hashes of packets. The worker
> cores are not stopped at this time and continue their loops in the
> handle procedure. Verification done on the main core is an exchange of
> data, as the current statistics indicate how the test will result.
>
> So as I wasn't convinced, I ran some tests with both relaxed and
> acquire/release modes and they both failed :(
> The ratio of failures caused by statistics errors to the number of tests,
> over 200000 runs, was:
> for relaxed: 0,000790562
> for acq/rel: 0,000091321
>
>
> That's why I'm going to modify the tests in such a way that they:
> 1) clear statistics
> 2) launch worker threads
> 3) run test
> 4) wait for workers procedures to complete
> 5) check stats, verify results and print them out
>
> This way the main core will use (clear or verify) the stats only when
> there are no worker threads running. This would make things simpler,
> allowing us to focus on testing the distributor, not the tests. And of
> course relaxed mode would be enough!
>
>
> Best regards
> Lukasz
>
>
> On 29.09.2020 at 07:49, Honnappa Nagarahalli wrote:
>> <snip>
>>
>>> Statistics of handled packets are cleared and read on main lcore, while they
>>> are increased in workers handlers on different lcores.
>>>
>>> Without synchronization occasionally showed invalid values.
>>> This patch uses atomic acquire/release mechanisms to synchronize.
>> In general, load-acquire and store-release memory orderings are required while synchronizing data (that cannot be updated atomically) between threads. In this situation, making the counters atomic is enough.
>>
>>> Fixes: c3eabff124e6 ("distributor: add unit tests")
>>> Cc: bruce.richardson@intel.com
>>> Cc: stable@dpdk.org
>>>
>>> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
>>> Acked-by: David Hunt <david.hunt@intel.com>
>>> ---
>>>    app/test/test_distributor.c | 39 ++++++++++++++++++++++++-------------
>>>    1 file changed, 26 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c index
>>> 35b25463a..0e49e3714 100644
>>> --- a/app/test/test_distributor.c
>>> +++ b/app/test/test_distributor.c
>>> @@ -43,7 +43,8 @@ total_packet_count(void)  {
>>>    	unsigned i, count = 0;
>>>    	for (i = 0; i < worker_idx; i++)
>>> -		count += worker_stats[i].handled_packets;
>>> +		count += __atomic_load_n(&worker_stats[i].handled_packets,
>>> +				__ATOMIC_ACQUIRE);
>> RELAXED memory order is sufficient. For example, the worker threads are not 'releasing' any data that is not atomically updated to the main thread.
>>
>>>    	return count;
>>>    }
>>>
>>> @@ -52,6 +53,7 @@ static inline void
>>>    clear_packet_count(void)
>>>    {
>>>    	memset(&worker_stats, 0, sizeof(worker_stats));
>>> +	rte_atomic_thread_fence(__ATOMIC_RELEASE);
>> Ideally, the counters should be set to 0 atomically rather than using a memset.
>>
>>>    }
>>>
>>>    /* this is the basic worker function for sanity test @@ -72,13 +74,13 @@
>>> handle_work(void *arg)
>>>    	num = rte_distributor_get_pkt(db, id, buf, buf, num);
>>>    	while (!quit) {
>>>    		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
>>> -				__ATOMIC_RELAXED);
>>> +				__ATOMIC_ACQ_REL);
>> Using the __ATOMIC_ACQ_REL order does not mean anything to the main thread. The main thread might still see the updates from different threads in a different order.
>>
>>>    		count += num;
>>>    		num = rte_distributor_get_pkt(db, id,
>>>    				buf, buf, num);
>>>    	}
>>>    	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
>>> -			__ATOMIC_RELAXED);
>>> +			__ATOMIC_ACQ_REL);
>> Same here, do not see why this change is required.
>>
>>>    	count += num;
>>>    	rte_distributor_return_pkt(db, id, buf, num);
>>>    	return 0;
>>> @@ -134,7 +136,8 @@ sanity_test(struct worker_params *wp, struct
>>> rte_mempool *p)
>>>
>>>    	for (i = 0; i < rte_lcore_count() - 1; i++)
>>>    		printf("Worker %u handled %u packets\n", i,
>>> -				worker_stats[i].handled_packets);
>>> +			__atomic_load_n(&worker_stats[i].handled_packets,
>>> +					__ATOMIC_ACQUIRE));
>> __ATOMIC_RELAXED is enough.
>>
>>>    	printf("Sanity test with all zero hashes done.\n");
>>>
>>>    	/* pick two flows and check they go correctly */ @@ -159,7 +162,9
>>> @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
>>>
>>>    		for (i = 0; i < rte_lcore_count() - 1; i++)
>>>    			printf("Worker %u handled %u packets\n", i,
>>> -					worker_stats[i].handled_packets);
>>> +				__atomic_load_n(
>>> +					&worker_stats[i].handled_packets,
>>> +					__ATOMIC_ACQUIRE));
>> __ATOMIC_RELAXED is enough
>>
>>>    		printf("Sanity test with two hash values done\n");
>>>    	}
>>>
>>> @@ -185,7 +190,8 @@ sanity_test(struct worker_params *wp, struct
>>> rte_mempool *p)
>>>
>>>    	for (i = 0; i < rte_lcore_count() - 1; i++)
>>>    		printf("Worker %u handled %u packets\n", i,
>>> -				worker_stats[i].handled_packets);
>>> +			__atomic_load_n(&worker_stats[i].handled_packets,
>>> +					__ATOMIC_ACQUIRE));
>> __ATOMIC_RELAXED is enough
>>
>>>    	printf("Sanity test with non-zero hashes done\n");
>>>
>>>    	rte_mempool_put_bulk(p, (void *)bufs, BURST); @@ -280,15
>>> +286,17 @@ handle_work_with_free_mbufs(void *arg)
>>>    		buf[i] = NULL;
>>>    	num = rte_distributor_get_pkt(d, id, buf, buf, num);
>>>    	while (!quit) {
>>> -		worker_stats[id].handled_packets += num;
>>>    		count += num;
>>> +		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
>>> +				__ATOMIC_ACQ_REL);
>> IMO, the problem would be the non-atomic update of the statistics. So, __ATOMIC_RELAXED is enough
>>
>>>    		for (i = 0; i < num; i++)
>>>    			rte_pktmbuf_free(buf[i]);
>>>    		num = rte_distributor_get_pkt(d,
>>>    				id, buf, buf, num);
>>>    	}
>>> -	worker_stats[id].handled_packets += num;
>>>    	count += num;
>>> +	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
>>> +			__ATOMIC_ACQ_REL);
>> Same here, the problem is non-atomic update of the statistics, __ATOMIC_RELAXED is enough.
>> Similarly, for changes below, __ATOMIC_RELAXED is enough.
>>
>>>    	rte_distributor_return_pkt(d, id, buf, num);
>>>    	return 0;
>>>    }
>>> @@ -363,8 +371,9 @@ handle_work_for_shutdown_test(void *arg)
>>>    	/* wait for quit single globally, or for worker zero, wait
>>>    	 * for zero_quit */
>>>    	while (!quit && !(id == zero_id && zero_quit)) {
>>> -		worker_stats[id].handled_packets += num;
>>>    		count += num;
>>> +		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
>>> +				__ATOMIC_ACQ_REL);
>>>    		for (i = 0; i < num; i++)
>>>    			rte_pktmbuf_free(buf[i]);
>>>    		num = rte_distributor_get_pkt(d,
>>> @@ -379,10 +388,11 @@ handle_work_for_shutdown_test(void *arg)
>>>
>>>    		total += num;
>>>    	}
>>> -	worker_stats[id].handled_packets += num;
>>>    	count += num;
>>>    	returned = rte_distributor_return_pkt(d, id, buf, num);
>>>
>>> +	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
>>> +			__ATOMIC_ACQ_REL);
>>>    	if (id == zero_id) {
>>>    		/* for worker zero, allow it to restart to pick up last packet
>>>    		 * when all workers are shutting down.
>>> @@ -394,10 +404,11 @@ handle_work_for_shutdown_test(void *arg)
>>>    				id, buf, buf, num);
>>>
>>>    		while (!quit) {
>>> -			worker_stats[id].handled_packets += num;
>>>    			count += num;
>>>    			rte_pktmbuf_free(pkt);
>>>    			num = rte_distributor_get_pkt(d, id, buf, buf, num);
>>> +			__atomic_fetch_add(&worker_stats[id].handled_packets,
>>> +					num, __ATOMIC_ACQ_REL);
>>>    		}
>>>    		returned = rte_distributor_return_pkt(d,
>>>    				id, buf, num);
>>> @@ -461,7 +472,8 @@ sanity_test_with_worker_shutdown(struct
>>> worker_params *wp,
>>>
>>>    	for (i = 0; i < rte_lcore_count() - 1; i++)
>>>    		printf("Worker %u handled %u packets\n", i,
>>> -				worker_stats[i].handled_packets);
>>> +			__atomic_load_n(&worker_stats[i].handled_packets,
>>> +					__ATOMIC_ACQUIRE));
>>>
>>>    	if (total_packet_count() != BURST * 2) {
>>>    		printf("Line %d: Error, not all packets flushed. "
>>> @@ -514,7 +526,8 @@ test_flush_with_worker_shutdown(struct
>>> worker_params *wp,
>>>    	zero_quit = 0;
>>>    	for (i = 0; i < rte_lcore_count() - 1; i++)
>>>    		printf("Worker %u handled %u packets\n", i,
>>> -				worker_stats[i].handled_packets);
>>> +			__atomic_load_n(&worker_stats[i].handled_packets,
>>> +					__ATOMIC_ACQUIRE));
>>>
>>>    	if (total_packet_count() != BURST) {
>>>    		printf("Line %d: Error, not all packets flushed. "
>>> --
>>> 2.17.1

-- 
Lukasz Wojciechowski
Principal Software Engineer

Samsung R&D Institute Poland
Samsung Electronics
Office +48 22 377 88 25
l.wojciechow@partner.samsung.com


^ permalink raw reply	[flat|nested] 164+ messages in thread
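
The ordering argument running through the review above reduces to a small
pattern: the counters are independent and nothing else is published
through them, so atomicity, not ordering, is the real requirement. A
sketch:

	/* worker lcore: atomic update; no release is needed because no
	 * other data is published through the counter
	 */
	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
			__ATOMIC_RELAXED);

	/* main lcore: atomic read; relaxed is enough once clearing and
	 * verifying happen while no worker is running
	 */
	count += __atomic_load_n(&worker_stats[i].handled_packets,
			__ATOMIC_RELAXED);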

* Re: [dpdk-dev] [PATCH v5 04/15] distributor: handle worker shutdown in burst mode
  2020-10-08 14:26                       ` David Hunt
@ 2020-10-08 21:07                         ` Lukasz Wojciechowski
  2020-10-09 12:13                           ` David Hunt
  0 siblings, 1 reply; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-08 21:07 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson
  Cc: dev, stable, "'Lukasz Wojciechowski'",


On 08.10.2020 at 16:26, David Hunt wrote:
>
> On 8/10/2020 6:23 AM, Lukasz Wojciechowski wrote:
>> The burst version of distributor implementation was missing proper
>> handling of worker shutdown. A worker processing packets received
>> from distributor can call rte_distributor_return_pkt() function
>> informing distributor that it wants no more packets. Further calls to
>> rte_distributor_request_pkt() or rte_distributor_get_pkt() however
>> should inform distributor that new packets are requested again.
>>
>> Lack of a proper implementation caused new packets to still be sent
>> from the distributor even after a worker informed it about returning
>> its last packets, resulting in deadlocks as no one could get them
>> on the worker side.
>>
>> This patch adds handling of worker shutdown in the following way:
>> 1) It fixes usage of RTE_DISTRIB_VALID_BUF handshake flag. This flag
>> was formerly unused in burst implementation and now it is used
>> for marking valid packets in retptr64, replacing invalid use
>> of RTE_DISTRIB_RETURN_BUF flag.
>> 2) Uses RTE_DISTRIB_RETURN_BUF as a worker to distributor handshake
>> in retptr64 to indicate that the worker has shut down.
>> 3) A worker that shuts down also blocks bufptr for itself with
>> RTE_DISTRIB_RETURN_BUF flag, allowing distributor to retrieve any
>> in flight packets.
>> 4) When distributor receives information about shutdown of a worker,
>> it: marks worker as not active; retrieves any in flight and backlog
>> packets and redistributes them to other workers; unlocks bufptr64
>> by clearing RTE_DISTRIB_RETURN_BUF flag, allowing its use in
>> the future if the worker requests any new packets.
>> 5) Does not allow sending or adding to backlog any packets for
>> inactive workers. Such workers are also ignored if matched.
>> 6) Adjusts calls to handle_returns() and the tags matching procedure
>> to react to possible activation or deactivation of workers.
>>
>> Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
>> Cc: david.hunt@intel.com
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
>> ---
>
>
> Hi Lukasz,
Hi David,
Many thanks for your review.
>
>    I spent the most amount of time going through this particular 
> patch, and it looks good to me (even the bit where 
> rte_distributor_process is called recursively) :)
That's the same trick that was used in the legacy single version. :)
>
> I'll try and get some time to run through some more testing, but for now:
>
> Acked-by: David Hunt <david.hunt@intel.com>
Thanks, and if you run the test, please take a look at the
performance. I think it has dropped because of these additional
synchronizations and actions on activation/deactivation.

However, the quality has increased a lot. With the v5 version, I ran tests
over 100000 times and didn't get a single failure!

Let me know about your results.


Best regards

Lukasz

>
>
>
>
>
-- 
Lukasz Wojciechowski
Principal Software Engineer

Samsung R&D Institute Poland
Samsung Electronics
Office +48 22 377 88 25
l.wojciechow@partner.samsung.com


^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v5 00/15] fix distributor synchronization issues
  2020-10-08  7:30                   ` [dpdk-dev] [PATCH v5 00/15] fix distributor synchronization issues David Marchand
@ 2020-10-08 21:16                     ` Lukasz Wojciechowski
  2020-10-09 12:53                       ` David Marchand
  0 siblings, 1 reply; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-08 21:16 UTC (permalink / raw)
  To: David Marchand, David Hunt, Honnappa Nagarahalli
  Cc: dev, Sarosh Arif, "'Lukasz Wojciechowski'",


On 08.10.2020 at 09:30, David Marchand wrote:
> On Thu, Oct 8, 2020 at 7:24 AM Lukasz Wojciechowski
> <l.wojciechow@partner.samsung.com> wrote:
>> During review and verification of the patch created by Sarosh Arif:
>> "test_distributor: prevent memory leakages from the pool" I found out
>> that running distributor unit tests multiple times in a row causes failures.
>> So I investigated all the issues I found.
>>
>> There are a few synchronization issues that might cause deadlocks
>> or corrupted data. They are fixed with this set of patches for both tests
>> and librte_distributor library.
>>
>> ---
>> v5:
>> * implement missing functionality in burst mode - worker shutdown
>> * fix shutdown test to always shutdown busy worker
>> * use atomic stores instead of barrier in tests clear_packet_count()
>> * reorder patches
>> * new patch 7: fix call to return_pkt in single mode
>> * new patch 11: replacing delays with spinlock on atomics in tests
>> * new patch 12: fix scalar matching algorithm
>> * new patch 13: new test with marking and checking every packet
>> * new patch 14: flush also in flight packets
>> * new patch 15: fix clearing returns buffer
>> * minor fixes in other patches
> Thanks for working on it, Lukasz.
Sorry for the delay, but there was much to solve and test.
> David, Honnappa, review please.
I'm here if you have any questions or suggestions.
>
>
-- 
Lukasz Wojciechowski
Principal Software Engineer

Samsung R&D Institute Poland
Samsung Electronics
Office +48 22 377 88 25
l.wojciechow@partner.samsung.com


^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v5 04/15] distributor: handle worker shutdown in burst mode
  2020-10-08 21:07                         ` Lukasz Wojciechowski
@ 2020-10-09 12:13                           ` David Hunt
  2020-10-09 20:43                             ` Lukasz Wojciechowski
  0 siblings, 1 reply; 164+ messages in thread
From: David Hunt @ 2020-10-09 12:13 UTC (permalink / raw)
  To: Lukasz Wojciechowski, Bruce Richardson; +Cc: dev, stable


On 8/10/2020 10:07 PM, Lukasz Wojciechowski wrote:
> On 08.10.2020 at 16:26, David Hunt wrote:
>> On 8/10/2020 6:23 AM, Lukasz Wojciechowski wrote:
>>> The burst version of distributor implementation was missing proper
>>> handling of worker shutdown. A worker processing packets received
>>> from distributor can call rte_distributor_return_pkt() function
>>> informing distributor that it wants no more packets. Further calls to
>>> rte_distributor_request_pkt() or rte_distributor_get_pkt() however
>>> should inform distributor that new packets are requested again.
>>>
>>> Lack of a proper implementation caused new packets to still be sent
>>> from the distributor even after a worker informed it about returning
>>> its last packets, resulting in deadlocks as no one could get them
>>> on the worker side.
>>>
>>> This patch adds handling of worker shutdown in the following way:
>>> 1) It fixes usage of RTE_DISTRIB_VALID_BUF handshake flag. This flag
>>> was formerly unused in burst implementation and now it is used
>>> for marking valid packets in retptr64, replacing invalid use
>>> of RTE_DISTRIB_RETURN_BUF flag.
>>> 2) Uses RTE_DISTRIB_RETURN_BUF as a worker to distributor handshake
>>> in retptr64 to indicate that the worker has shut down.
>>> 3) A worker that shuts down also blocks bufptr for itself with
>>> RTE_DISTRIB_RETURN_BUF flag, allowing distributor to retrieve any
>>> in flight packets.
>>> 4) When distributor receives information about shutdown of a worker,
>>> it: marks worker as not active; retrieves any in flight and backlog
>>> packets and redistributes them to other workers; unlocks bufptr64
>>> by clearing RTE_DISTRIB_RETURN_BUF flag, allowing its use in
>>> the future if the worker requests any new packets.
>>> 5) Does not allow sending or adding to backlog any packets for
>>> inactive workers. Such workers are also ignored if matched.
>>> 6) Adjusts calls to handle_returns() and the tags matching procedure
>>> to react to possible activation or deactivation of workers.
>>>
>>> Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
>>> Cc: david.hunt@intel.com
>>> Cc: stable@dpdk.org
>>>
>>> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
>>> ---
>>
>> Hi Lukasz,
> Hi David,
> Many thanks for your review.
>>     I spent the most amount of time going through this particular
>> patch, and it looks good to me (even the bit where
>> rte_distributor_process is called recursively) :)
> That's the same trick that was used in the legacy single version. :)
>> I'll try and get some time to run through some more testing, but for now:
>>
>> Acked-by: David Hunt <david.hunt@intel.com>
> Thanks and if you'll run the test, please take a look at the
> performance. I think it has dropped because of these additional
> synchronizations and actions on activation/deactivation.
>
> However the quality has increased much. With v5 version , I ran tests
> over 100000 times and didn't get a single failure!
>
> Let me know about your results.
>

Going back through the patch set and running performance on each one, I
see a 10% drop in performance at patch 2 in the series, which adds an
extra handle_returns() call in the busy loop. That call avoids a possible
deadlock.

I played around with that patch for a while, only calling
handle_returns() every x times around the loop, but the performance was
worse again, probably because of the extra branch I added.

However, it's more important to have stability than raw performance, so
it's still a good idea to have that fix applied, IMO.

Maybe we can get back some lost performance in future optimisation patches.

Thanks,
Dave.
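
The experiment described above might have looked like the following
hypothetical sketch; the counter and the period are invented names for
illustration:

	/* hypothetical: amortize handle_returns() over the busy loop;
	 * the added branch cost more than the amortization saved
	 */
	static unsigned int iter;

	if ((++iter & 7) == 0)	/* every 8th pass; tuning knob invented */
		handle_returns(d, wkr);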




^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v5 11/15] test/distributor: replace delays with spin locks
  2020-10-08  5:23                     ` [dpdk-dev] [PATCH v5 11/15] test/distributor: replace delays with spin locks Lukasz Wojciechowski
@ 2020-10-09 12:23                       ` David Hunt
  0 siblings, 0 replies; 164+ messages in thread
From: David Hunt @ 2020-10-09 12:23 UTC (permalink / raw)
  To: Lukasz Wojciechowski, Bruce Richardson; +Cc: dev, stable


On 8/10/2020 6:23 AM, Lukasz Wojciechowski wrote:
> Instead of adding delays in the test code and hoping
> that the worker reaches the proper state,
> synchronize worker shutdown test cases with a spin lock
> on an atomic variable.
>
> Fixes: c0de0eb82e40 ("distributor: switch over to new API")
> Cc: david.hunt@intel.com
> Cc: stable@dpdk.org
>
> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
> ---
>   app/test/test_distributor.c | 19 +++++++++++++++++--
>   1 file changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
> index 838a67515..1e0a079ff 100644
> --- a/app/test/test_distributor.c
> +++ b/app/test/test_distributor.c
> @@ -27,6 +27,7 @@ struct worker_params worker_params;
>   /* statics - all zero-initialized by default */
>   static volatile int quit;      /**< general quit variable for all threads */
>   static volatile int zero_quit; /**< var for when we just want thr0 to quit*/
> +static volatile int zero_sleep; /**< thr0 has quit basic loop and is sleeping*/
>   static volatile unsigned worker_idx;
>   static volatile unsigned zero_idx;
>   
> @@ -376,8 +377,10 @@ handle_work_for_shutdown_test(void *arg)
>   		/* for worker zero, allow it to restart to pick up last packet
>   		 * when all workers are shutting down.
>   		 */
> +		__atomic_store_n(&zero_sleep, 1, __ATOMIC_RELEASE);
>   		while (zero_quit)
>   			usleep(100);
> +		__atomic_store_n(&zero_sleep, 0, __ATOMIC_RELEASE);
>   
>   		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
>   
> @@ -445,7 +448,12 @@ sanity_test_with_worker_shutdown(struct worker_params *wp,
>   
>   	/* flush the distributor */
>   	rte_distributor_flush(d);
> -	rte_delay_us(10000);
> +	while (!__atomic_load_n(&zero_sleep, __ATOMIC_ACQUIRE))
> +		rte_distributor_flush(d);
> +
> +	zero_quit = 0;
> +	while (__atomic_load_n(&zero_sleep, __ATOMIC_ACQUIRE))
> +		rte_delay_us(100);
>   
>   	for (i = 0; i < rte_lcore_count() - 1; i++)
>   		printf("Worker %u handled %u packets\n", i,
> @@ -505,9 +513,14 @@ test_flush_with_worker_shutdown(struct worker_params *wp,
>   	/* flush the distributor */
>   	rte_distributor_flush(d);
>   
> -	rte_delay_us(10000);
> +	while (!__atomic_load_n(&zero_sleep, __ATOMIC_ACQUIRE))
> +		rte_distributor_flush(d);
>   
>   	zero_quit = 0;
> +
> +	while (__atomic_load_n(&zero_sleep, __ATOMIC_ACQUIRE))
> +		rte_delay_us(100);
> +
>   	for (i = 0; i < rte_lcore_count() - 1; i++)
>   		printf("Worker %u handled %u packets\n", i,
>   			__atomic_load_n(&worker_stats[i].handled_packets,
> @@ -615,6 +628,8 @@ quit_workers(struct worker_params *wp, struct rte_mempool *p)
>   	quit = 0;
>   	worker_idx = 0;
>   	zero_idx = RTE_MAX_LCORE;
> +	zero_quit = 0;
> +	zero_sleep = 0;
>   }
>   
>   static int

Acked-by: David Hunt <david.hunt@intel.com>



^ permalink raw reply	[flat|nested] 164+ messages in thread
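
Stripped to its essentials, the patch above is a release/acquire
handshake between worker zero and the main lcore; both sides side by
side, with the same variables as in the diff:

	/* worker 0: publish "parked", spin until released, then
	 * publish "awake" again
	 */
	__atomic_store_n(&zero_sleep, 1, __ATOMIC_RELEASE);
	while (zero_quit)
		usleep(100);
	__atomic_store_n(&zero_sleep, 0, __ATOMIC_RELEASE);

	/* main lcore: flush until worker 0 is parked, release it,
	 * then wait until it reports being awake
	 */
	while (!__atomic_load_n(&zero_sleep, __ATOMIC_ACQUIRE))
		rte_distributor_flush(d);
	zero_quit = 0;
	while (__atomic_load_n(&zero_sleep, __ATOMIC_ACQUIRE))
		rte_delay_us(100);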

* Re: [dpdk-dev] [PATCH v5 12/15] distributor: fix scalar matching
  2020-10-08  5:23                     ` [dpdk-dev] [PATCH v5 12/15] distributor: fix scalar matching Lukasz Wojciechowski
@ 2020-10-09 12:31                       ` David Hunt
  2020-10-09 12:35                         ` David Hunt
  0 siblings, 1 reply; 164+ messages in thread
From: David Hunt @ 2020-10-09 12:31 UTC (permalink / raw)
  To: Lukasz Wojciechowski, Bruce Richardson; +Cc: dev, stable


On 8/10/2020 6:23 AM, Lukasz Wojciechowski wrote:
> Fix improper indexes used while comparing tags.
> In the find_match_scalar() function:
> * j iterates over the flow tags of incoming packets;
> * w iterates over backlog or in-flight tag positions.
>
> Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
> Cc: david.hunt@intel.com
> Cc: stable@dpdk.org
>
> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
> ---
>   lib/librte_distributor/rte_distributor.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
> index 9fd7dcab7..4bd23a990 100644
> --- a/lib/librte_distributor/rte_distributor.c
> +++ b/lib/librte_distributor/rte_distributor.c
> @@ -261,13 +261,13 @@ find_match_scalar(struct rte_distributor *d,
>   
>   		for (j = 0; j < RTE_DIST_BURST_SIZE ; j++)
>   			for (w = 0; w < RTE_DIST_BURST_SIZE; w++)
> -				if (d->in_flight_tags[i][j] == data_ptr[w]) {
> +				if (d->in_flight_tags[i][w] == data_ptr[j]) {
>   					output_ptr[j] = i+1;
>   					break;
>   				}
>   		for (j = 0; j < RTE_DIST_BURST_SIZE; j++)
>   			for (w = 0; w < RTE_DIST_BURST_SIZE; w++)
> -				if (bl->tags[j] == data_ptr[w]) {
> +				if (bl->tags[w] == data_ptr[j]) {
>   					output_ptr[j] = i+1;
>   					break;
>   				}

Hi Lukasz,

Could you give a bit more information on the problem that this is fixing?

Were you finding that flows were not being assigned to workers correctly 
in the scalar code?

Thanks,
Dave.




^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v5 12/15] distributor: fix scalar matching
  2020-10-09 12:31                       ` David Hunt
@ 2020-10-09 12:35                         ` David Hunt
  2020-10-09 21:02                           ` Lukasz Wojciechowski
  0 siblings, 1 reply; 164+ messages in thread
From: David Hunt @ 2020-10-09 12:35 UTC (permalink / raw)
  To: Lukasz Wojciechowski, Bruce Richardson; +Cc: dev, stable

Hi Lukasz,

On 9/10/2020 1:31 PM, David Hunt wrote:
>
> On 8/10/2020 6:23 AM, Lukasz Wojciechowski wrote:
>> Fix improper indexes while comparing tags.
>> In the find_match_scalar() function:
>> * j iterates over flow tags of following packets;
>> * w iterates over backlog or in flight tags positions.
>>
>> Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
>> Cc: david.hunt@intel.com
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
>> ---
>>   lib/librte_distributor/rte_distributor.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/lib/librte_distributor/rte_distributor.c 
>> b/lib/librte_distributor/rte_distributor.c
>> index 9fd7dcab7..4bd23a990 100644
>> --- a/lib/librte_distributor/rte_distributor.c
>> +++ b/lib/librte_distributor/rte_distributor.c
>> @@ -261,13 +261,13 @@ find_match_scalar(struct rte_distributor *d,
>>             for (j = 0; j < RTE_DIST_BURST_SIZE ; j++)
>>               for (w = 0; w < RTE_DIST_BURST_SIZE; w++)
>> -                if (d->in_flight_tags[i][j] == data_ptr[w]) {
>> +                if (d->in_flight_tags[i][w] == data_ptr[j]) {
>>                       output_ptr[j] = i+1;
>>                       break;
>>                   }
>>           for (j = 0; j < RTE_DIST_BURST_SIZE; j++)
>>               for (w = 0; w < RTE_DIST_BURST_SIZE; w++)
>> -                if (bl->tags[j] == data_ptr[w]) {
>> +                if (bl->tags[w] == data_ptr[j]) {
>>                       output_ptr[j] = i+1;
>>                       break;
>>                   }
>
> Hi Lukasz,
>
> Could you give a bit more information on the problem that this is fixing?
>
> Were you finding that flows were not being assigned to workers 
> correctly in the scalar code?
>
>

You answer this question in the next patch in the series, as you are
adding a test to check that the flows go to the correct workers, etc. You
can ignore this question, and:

Acked-by: David Hunt <david.hunt@intel.com>




^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v5 13/15] test/distributor: add test with packets marking
  2020-10-08  5:23                     ` [dpdk-dev] [PATCH v5 13/15] test/distributor: add test with packets marking Lukasz Wojciechowski
@ 2020-10-09 12:50                       ` David Hunt
  2020-10-09 21:12                         ` Lukasz Wojciechowski
  0 siblings, 1 reply; 164+ messages in thread
From: David Hunt @ 2020-10-09 12:50 UTC (permalink / raw)
  To: Lukasz Wojciechowski; +Cc: dev

Hi Lukasz,

On 8/10/2020 6:23 AM, Lukasz Wojciechowski wrote:
> All of the former tests analyzed only statistics
> of packets processed by all workers.
> The new test also verifies whether packets are processed
> on the workers as expected.
> Every packet processed by a worker is marked
> and analyzed after it is returned to the distributor.
>
> This test allows finding issues in matching algorithms.
>
> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
> ---
>   app/test/test_distributor.c | 141 ++++++++++++++++++++++++++++++++++++
>   1 file changed, 141 insertions(+)
>
> diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
> index 1e0a079ff..0404e463a 100644
> --- a/app/test/test_distributor.c
> +++ b/app/test/test_distributor.c
> @@ -542,6 +542,141 @@ test_flush_with_worker_shutdown(struct worker_params *wp,
>   	return 0;
>   }
>   
> +static int
> +handle_and_mark_work(void *arg)
> +{
> +	struct rte_mbuf *buf[8] __rte_cache_aligned;
> +	struct worker_params *wp = arg;
> +	struct rte_distributor *db = wp->dist;
> +	unsigned int num, i;
> +	unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED);
> +	num = rte_distributor_get_pkt(db, id, buf, NULL, 0);
> +	while (!quit) {
> +		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
> +				__ATOMIC_ACQ_REL);
> +		for (i = 0; i < num; i++)
> +			buf[i]->udata64 += id + 1;
> +		num = rte_distributor_get_pkt(db, id,
> +				buf, buf, num);
> +	}
> +	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
> +			__ATOMIC_ACQ_REL);
> +	rte_distributor_return_pkt(db, id, buf, num);
> +	return 0;
> +}
> +
> +/* sanity_mark_test sends packets to workers which mark them.
> + * Every packet has also encoded sequence number.
> + * The returned packets are sorted and verified if they were handled
> + * by proper workers.
> + */
> +static int
> +sanity_mark_test(struct worker_params *wp, struct rte_mempool *p)
> +{
> +	const unsigned int buf_count = 24;
> +	const unsigned int burst = 8;
> +	const unsigned int shift = 12;
> +	const unsigned int seq_shift = 10;
> +
> +	struct rte_distributor *db = wp->dist;
> +	struct rte_mbuf *bufs[buf_count];
> +	struct rte_mbuf *returns[buf_count];
> +	unsigned int i, count, id;
> +	unsigned int sorted[buf_count], seq;
> +	unsigned int failed = 0;
> +
> +	printf("=== Marked packets test ===\n");
> +	clear_packet_count();
> +	if (rte_mempool_get_bulk(p, (void *)bufs, buf_count) != 0) {
> +		printf("line %d: Error getting mbufs from pool\n", __LINE__);
> +		return -1;
> +	}
> +
> +/* bufs' hashes will be like these below, but shifted left.
> + * The shifting is for avoiding collisions with backlogs
> + * and in-flight tags left by previous tests.
> + * [1, 1, 1, 1, 1, 1, 1, 1
> + *  1, 1, 1, 1, 2, 2, 2, 2
> + *  2, 2, 2, 2, 1, 1, 1, 1]
> + */

I would suggest indenting the comments to the same indent level as the
code; this would make the flow easier to read. Same with the additional
comments below.


> +	for (i = 0; i < burst; i++) {
> +		bufs[0 * burst + i]->hash.usr = 1 << shift;
> +		bufs[1 * burst + i]->hash.usr = ((i < burst / 2) ? 1 : 2)
> +			<< shift;
> +		bufs[2 * burst + i]->hash.usr = ((i < burst / 2) ? 2 : 1)
> +			<< shift;
> +	}
> +/* Assign a sequence number to each packet. The sequence is shifted,
> + * so that lower bits of the udata64 will hold the mark from the worker.
> + */
> +	for (i = 0; i < buf_count; i++)
> +		bufs[i]->udata64 = i << seq_shift;
> +
> +	count = 0;
> +	for (i = 0; i < buf_count/burst; i++) {
> +		rte_distributor_process(db, &bufs[i * burst], burst);
> +		count += rte_distributor_returned_pkts(db, &returns[count],
> +			buf_count - count);
> +	}
> +
> +	do {
> +		rte_distributor_flush(db);
> +		count += rte_distributor_returned_pkts(db, &returns[count],
> +			buf_count - count);
> +	} while (count < buf_count);
> +
> +	for (i = 0; i < rte_lcore_count() - 1; i++)
> +		printf("Worker %u handled %u packets\n", i,
> +			__atomic_load_n(&worker_stats[i].handled_packets,
> +					__ATOMIC_ACQUIRE));
> +
> +/* Sort returned packets by sent order (sequence numbers). */
> +	for (i = 0; i < buf_count; i++) {
> +		seq = returns[i]->udata64 >> seq_shift;
> +		id = returns[i]->udata64 - (seq << seq_shift);
> +		sorted[seq] = id;
> +	}
> +
> +/* Verify that packets [0-11] and [20-23] were processed
> + * by the same worker
> + */
> +	for (i = 1; i < 12; i++) {
> +		if (sorted[i] != sorted[0]) {
> +			printf("Packet number %u processed by worker %u,"
> +				" but should be processed by worker %u\n",
> +				i, sorted[i], sorted[0]);
> +			failed = 1;
> +		}
> +	}
> +	for (i = 20; i < 24; i++) {
> +		if (sorted[i] != sorted[0]) {
> +			printf("Packet number %u processed by worker %u,"
> +				" but should be processed by worker %u\n",
> +				i, sorted[i], sorted[0]);
> +			failed = 1;
> +		}
> +	}
> +/* And verify that packets [12-19] were processed
> + * by another worker
> + */
> +	for (i = 13; i < 20; i++) {
> +		if (sorted[i] != sorted[12]) {
> +			printf("Packet number %u processed by worker %u,"
> +				" but should be processed by worker %u\n",
> +				i, sorted[i], sorted[12]);
> +			failed = 1;
> +		}
> +	}
> +
> +	rte_mempool_put_bulk(p, (void *)bufs, buf_count);
> +
> +	if (failed)
> +		return -1;
> +
> +	printf("Marked packets test passed\n");
> +	return 0;
> +}
> +
>   static
>   int test_error_distributor_create_name(void)
>   {
> @@ -726,6 +861,12 @@ test_distributor(void)
>   				goto err;
>   			quit_workers(&worker_params, p);
>   
> +			rte_eal_mp_remote_launch(handle_and_mark_work,
> +					&worker_params, SKIP_MASTER);
> +			if (sanity_mark_test(&worker_params, p) < 0)
> +				goto err;
> +			quit_workers(&worker_params, p);
> +
>   		} else {
>   			printf("Too few cores to run worker shutdown test\n");
>   		}


Checking that the flows go to the correct workers is a really good test to
have. Thanks for the effort here.

Apart from the comment indentation nit: the rest looks good to me.

Acked-by: David Hunt <david.hunt@intel.com>



^ permalink raw reply	[flat|nested] 164+ messages in thread
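
As a quick worked example of the marking scheme in this test (seq_shift
is 10, so the low 10 bits are left free for the worker's mark; index
names follow the test, k is just an index into the returned array):

	bufs[i]->udata64 = (uint64_t)i << 10;	/* distributor: sequence */
	buf[j]->udata64 += id + 1;		/* worker: its mark */

	/* decoding: packet 5 handled by worker 2 carries
	 * (5 << 10) + 3 = 5123, so 5123 >> 10 = 5 recovers the
	 * sequence and 5123 - 5120 = 3 recovers the mark
	 */
	seq = returns[k]->udata64 >> 10;
	id = returns[k]->udata64 - (seq << 10);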

* Re: [dpdk-dev] [PATCH v5 00/15] fix distributor synchronization issues
  2020-10-08 21:16                     ` Lukasz Wojciechowski
@ 2020-10-09 12:53                       ` David Marchand
  2020-10-09 21:41                         ` Lukasz Wojciechowski
  0 siblings, 1 reply; 164+ messages in thread
From: David Marchand @ 2020-10-09 12:53 UTC (permalink / raw)
  To: Lukasz Wojciechowski; +Cc: David Hunt, Honnappa Nagarahalli, dev, Sarosh Arif

Hello Lukasz,

On Thu, Oct 8, 2020 at 11:17 PM Lukasz Wojciechowski
<l.wojciechow@partner.samsung.com> wrote:
> I'm here if you have any questions or suggestions

Unfortunately, I can see a timeout on the distributor autotest in Travis:
https://travis-ci.com/github/ovsrobot/dpdk/jobs/396703415#L1151

Can you have a look?
Btw, did you receive a notification about this from the robot?


-- 
David Marchand


^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v5 14/15] distributor: fix flushing in flight packets
  2020-10-08  5:23                     ` [dpdk-dev] [PATCH v5 14/15] distributor: fix flushing in flight packets Lukasz Wojciechowski
@ 2020-10-09 13:10                       ` David Hunt
  0 siblings, 0 replies; 164+ messages in thread
From: David Hunt @ 2020-10-09 13:10 UTC (permalink / raw)
  To: Lukasz Wojciechowski, Bruce Richardson; +Cc: dev, stable


On 8/10/2020 6:23 AM, Lukasz Wojciechowski wrote:
> rte_distributor_flush() uses the total_outstanding()
> function to decide whether it should still wait
> for packets being processed. However, in burst mode
> only backlog packets were counted.
>
> This patch fixes that issue by also counting in flight
> packets. There are also some fixes to properly keep
> count of in flight packets for each worker in bufs[].count.
>
> Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
> Cc: david.hunt@intel.com
> Cc: stable@dpdk.org
>
> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
> ---
>   lib/librte_distributor/rte_distributor.c | 12 +++++-------
>   1 file changed, 5 insertions(+), 7 deletions(-)
>
> diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
> index 4bd23a990..2478de3b7 100644
> --- a/lib/librte_distributor/rte_distributor.c
> +++ b/lib/librte_distributor/rte_distributor.c
> @@ -467,6 +467,7 @@ rte_distributor_process(struct rte_distributor *d,
>   			/* Sync with worker on GET_BUF flag. */
>   			if (__atomic_load_n(&(d->bufs[wid].bufptr64[0]),
>   				__ATOMIC_ACQUIRE) & RTE_DISTRIB_GET_BUF) {
> +				d->bufs[wid].count = 0;
>   				release(d, wid);
>   				handle_returns(d, wid);
>   			}
> @@ -481,11 +482,6 @@ rte_distributor_process(struct rte_distributor *d,
>   		uint16_t matches[RTE_DIST_BURST_SIZE];
>   		unsigned int pkts;
>   
> -		/* Sync with worker on GET_BUF flag. */
> -		if (__atomic_load_n(&(d->bufs[wkr].bufptr64[0]),
> -			__ATOMIC_ACQUIRE) & RTE_DISTRIB_GET_BUF)
> -			d->bufs[wkr].count = 0;
> -
>   		if ((num_mbufs - next_idx) < RTE_DIST_BURST_SIZE)
>   			pkts = num_mbufs - next_idx;
>   		else
> @@ -605,8 +601,10 @@ rte_distributor_process(struct rte_distributor *d,
>   	for (wid = 0 ; wid < d->num_workers; wid++)
>   		/* Sync with worker on GET_BUF flag. */
>   		if ((__atomic_load_n(&(d->bufs[wid].bufptr64[0]),
> -			__ATOMIC_ACQUIRE) & RTE_DISTRIB_GET_BUF))
> +			__ATOMIC_ACQUIRE) & RTE_DISTRIB_GET_BUF)) {
> +			d->bufs[wid].count = 0;
>   			release(d, wid);
> +		}
>   
>   	return num_mbufs;
>   }
> @@ -649,7 +647,7 @@ total_outstanding(const struct rte_distributor *d)
>   	unsigned int wkr, total_outstanding = 0;
>   
>   	for (wkr = 0; wkr < d->num_workers; wkr++)
> -		total_outstanding += d->backlog[wkr].count;
> +		total_outstanding += d->backlog[wkr].count + d->bufs[wkr].count;
>   
>   	return total_outstanding;
>   }



Acked-by: David Hunt <david.hunt@intel.com>



^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v5 15/15] distributor: fix clearing returns buffer
  2020-10-08  5:23                     ` [dpdk-dev] [PATCH v5 15/15] distributor: fix clearing returns buffer Lukasz Wojciechowski
@ 2020-10-09 13:12                       ` David Hunt
  0 siblings, 0 replies; 164+ messages in thread
From: David Hunt @ 2020-10-09 13:12 UTC (permalink / raw)
  To: Lukasz Wojciechowski, Bruce Richardson; +Cc: dev, stable

Hi Lukasz,

On 8/10/2020 6:23 AM, Lukasz Wojciechowski wrote:
> The patch clears the distributor's returns buffer
> in clear_returns() by setting start and count to 0.
>
> Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
> Cc: david.hunt@intel.com
> Cc: stable@dpdk.org
>
> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
> ---
>   lib/librte_distributor/rte_distributor.c | 2 ++
>   1 file changed, 2 insertions(+)
>
> diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
> index 2478de3b7..57240304a 100644
> --- a/lib/librte_distributor/rte_distributor.c
> +++ b/lib/librte_distributor/rte_distributor.c
> @@ -704,6 +704,8 @@ rte_distributor_clear_returns(struct rte_distributor *d)
>   		/* Sync with worker. Release retptrs. */
>   		__atomic_store_n(&(d->bufs[wkr].retptr64[0]), 0,
>   				__ATOMIC_RELEASE);
> +
> +	d->returns.start = d->returns.count = 0;
>   }
>   
>   /* creates a distributor instance */


Acked-by: David Hunt <david.hunt@intel.com>



^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v5 04/15] distributor: handle worker shutdown in burst mode
  2020-10-09 12:13                           ` David Hunt
@ 2020-10-09 20:43                             ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-09 20:43 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson
  Cc: dev, stable, "'Lukasz Wojciechowski'",


On 09.10.2020 at 14:13, David Hunt wrote:
>
> On 8/10/2020 10:07 PM, Lukasz Wojciechowski wrote:
>> On 08.10.2020 at 16:26, David Hunt wrote:
>>> On 8/10/2020 6:23 AM, Lukasz Wojciechowski wrote:
>>>> The burst version of distributor implementation was missing proper
>>>> handling of worker shutdown. A worker processing packets received
>>>> from distributor can call rte_distributor_return_pkt() function
>>>> informing distributor that it wants no more packets. Further calls to
>>>> rte_distributor_request_pkt() or rte_distributor_get_pkt() however
>>>> should inform distributor that new packets are requested again.
>>>>
>>>> Lack of a proper implementation caused new packets to still be sent
>>>> from the distributor even after a worker informed it about returning
>>>> its last packets, resulting in deadlocks as no one could get them
>>>> on the worker side.
>>>>
>>>> This patch adds handling of worker shutdown in the following way:
>>>> 1) It fixes usage of RTE_DISTRIB_VALID_BUF handshake flag. This flag
>>>> was formerly unused in burst implementation and now it is used
>>>> for marking valid packets in retptr64, replacing invalid use
>>>> of RTE_DISTRIB_RETURN_BUF flag.
>>>> 2) Uses RTE_DISTRIB_RETURN_BUF as a worker to distributor handshake
>>>> in retptr64 to indicate that the worker has shut down.
>>>> 3) A worker that shuts down also blocks bufptr for itself with
>>>> RTE_DISTRIB_RETURN_BUF flag, allowing distributor to retrieve any
>>>> in flight packets.
>>>> 4) When distributor receives information about shutdown of a worker,
>>>> it: marks worker as not active; retrieves any in flight and backlog
>>>> packets and redistributes them to other workers; unlocks bufptr64
>>>> by clearing RTE_DISTRIB_RETURN_BUF flag, allowing its use in
>>>> the future if the worker requests any new packets.
>>>> 5) Does not allow sending or adding to backlog any packets for
>>>> inactive workers. Such workers are also ignored if matched.
>>>> 6) Adjusts calls to handle_returns() and the tags matching procedure
>>>> to react to possible activation or deactivation of workers.
>>>>
>>>> Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
>>>> Cc: david.hunt@intel.com
>>>> Cc: stable@dpdk.org
>>>>
>>>> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
>>>> ---
>>>
>>> Hi Lukasz,
>> Hi David,
>> Many thanks for your review.
>>>     I spent the most amount of time going through this particular
>>> patch, and it looks good to me (even the bit where
>>> rte_distributor_process is called recursively) :)
>> That's the same trick that was used in the legacy single version. :)
>>> I'll try and get some time to run through some more testing, but for 
>>> now:
>>>
>>> Acked-by: David Hunt <david.hunt@intel.com>
>> Thanks, and if you run the test, please take a look at the
>> performance. I think it has dropped because of these additional
>> synchronizations and actions on activation/deactivation.
>>
>> However, the quality has increased a lot. With the v5 version, I ran
>> tests over 100000 times and didn't get a single failure!
>>
>> Let me know about your results.
>>
>
> Going back through the patch set and running performance on each one,
> I see a 10% drop in performance at patch 2 in the series, which adds
> an extra handle_returns() call in the busy loop. That call avoids a
> possible deadlock.
>
> I played around with that patch for a while, only calling
> handle_returns() every x times around the loop, but the performance
> was worse again, probably because of the extra branch I added.
>
> However, it's more important to have stability than raw performance,
> so it's still a good idea to have that fix applied, IMO.
I agree
>
> Maybe we can get back some lost performance in future optimisation 
> patches.
That would be really nice. If I have some time, I would like to try some
ideas I came up with while working on the series.
>
> Thanks,
> Dave.
>
>
>
>
-- 
Lukasz Wojciechowski
Principal Software Engineer

Samsung R&D Institute Poland
Samsung Electronics
Office +48 22 377 88 25
l.wojciechow@partner.samsung.com


^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v5 12/15] distributor: fix scalar matching
  2020-10-09 12:35                         ` David Hunt
@ 2020-10-09 21:02                           ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-09 21:02 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson
  Cc: dev, stable, "'Lukasz Wojciechowski'",


On 09.10.2020 at 14:35, David Hunt wrote:
> Hi Lukasz,
>
> On 9/10/2020 1:31 PM, David Hunt wrote:
>>
>> On 8/10/2020 6:23 AM, Lukasz Wojciechowski wrote:
>>> Fix improper indexes used while comparing tags.
>>> In the find_match_scalar() function:
>>> * j iterates over the flow tags of incoming packets;
>>> * w iterates over backlog or in-flight tag positions.
>>>
>>> Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
>>> Cc: david.hunt@intel.com
>>> Cc: stable@dpdk.org
>>>
>>> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
>>> ---
>>>   lib/librte_distributor/rte_distributor.c | 4 ++--
>>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/lib/librte_distributor/rte_distributor.c 
>>> b/lib/librte_distributor/rte_distributor.c
>>> index 9fd7dcab7..4bd23a990 100644
>>> --- a/lib/librte_distributor/rte_distributor.c
>>> +++ b/lib/librte_distributor/rte_distributor.c
>>> @@ -261,13 +261,13 @@ find_match_scalar(struct rte_distributor *d,
>>>             for (j = 0; j < RTE_DIST_BURST_SIZE ; j++)
>>>               for (w = 0; w < RTE_DIST_BURST_SIZE; w++)
>>> -                if (d->in_flight_tags[i][j] == data_ptr[w]) {
>>> +                if (d->in_flight_tags[i][w] == data_ptr[j]) {
>>>                       output_ptr[j] = i+1;
>>>                       break;
>>>                   }
>>>           for (j = 0; j < RTE_DIST_BURST_SIZE; j++)
>>>               for (w = 0; w < RTE_DIST_BURST_SIZE; w++)
>>> -                if (bl->tags[j] == data_ptr[w]) {
>>> +                if (bl->tags[w] == data_ptr[j]) {
>>>                       output_ptr[j] = i+1;
>>>                       break;
>>>                   }
>>
>> Hi Lukasz,
>>
>> Could you give a bit more information on the problem that this is 
>> fixing?
>>
>> Were you finding that flows were not being assigned to workers 
>> correctly in the scalar code?
>>
>>
>
> You answer this question in the next patch in the series, as you are
> adding a test to check that the flows go to the correct workers, etc.
> You can ignore this question, and:
>
> Acked-by: David Hunt <david.hunt@intel.com>
>
Thanks for the ack.

And you probably already know the answer about flows, but let me show an
example:

worker 0 tags:   3 5 7 0 0 0 0 0
incoming flow:   1 2 3 4 5 6 7 8
expected result: 0 0 1 0 1 0 1 0
unfixed result:  1 1 1 0 0 0 0 0

The tags were iterated with the "j" variable, the same one that indexed
the result table.
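
That example can be replayed standalone; a minimal sketch with the burst
size fixed at 8 and a match recorded as 1:

	unsigned int tags[8] = {3, 5, 7, 0, 0, 0, 0, 0}; /* worker 0 tags */
	unsigned int flow[8] = {1, 2, 3, 4, 5, 6, 7, 8}; /* incoming flow */
	unsigned int out[8] = {0};
	unsigned int j, w;

	for (j = 0; j < 8; j++)		/* fixed: j walks the flow */
		for (w = 0; w < 8; w++)	/* w walks the stored tags */
			if (tags[w] == flow[j]) {
				out[j] = 1;
				break;
			}
	/* out == {0, 0, 1, 0, 1, 0, 1, 0}, the expected result above;
	 * swapping the indexes back gives {1, 1, 1, 0, 0, 0, 0, 0}
	 */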


Best regards

Lukasz

>
>
-- 
Lukasz Wojciechowski
Principal Software Engineer

Samsung R&D Institute Poland
Samsung Electronics
Office +48 22 377 88 25
l.wojciechow@partner.samsung.com


^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v5 13/15] test/distributor: add test with packets marking
  2020-10-09 12:50                       ` David Hunt
@ 2020-10-09 21:12                         ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-09 21:12 UTC (permalink / raw)
  To: David Hunt; +Cc: dev, "'Lukasz Wojciechowski'",

Hi David,

On 09.10.2020 at 14:50, David Hunt wrote:
> Hi Lukasz,
>
> On 8/10/2020 6:23 AM, Lukasz Wojciechowski wrote:
>> All of the former tests analyzed only statistics
>> of packets processed by all workers.
>> The new test also verifies whether packets are processed
>> on the workers as expected.
>> Every packet processed by a worker is marked
>> and analyzed after it is returned to the distributor.
>>
>> This test allows finding issues in matching algorithms.
>>
>> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
>> ---
>>   app/test/test_distributor.c | 141 ++++++++++++++++++++++++++++++++++++
>>   1 file changed, 141 insertions(+)
>>
>> diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
>> index 1e0a079ff..0404e463a 100644
>> --- a/app/test/test_distributor.c
>> +++ b/app/test/test_distributor.c
>> @@ -542,6 +542,141 @@ test_flush_with_worker_shutdown(struct 
>> worker_params *wp,
>>       return 0;
>>   }
>>   +static int
>> +handle_and_mark_work(void *arg)
>> +{
>> +    struct rte_mbuf *buf[8] __rte_cache_aligned;
>> +    struct worker_params *wp = arg;
>> +    struct rte_distributor *db = wp->dist;
>> +    unsigned int num, i;
>> +    unsigned int id = __atomic_fetch_add(&worker_idx, 1, 
>> __ATOMIC_RELAXED);
>> +    num = rte_distributor_get_pkt(db, id, buf, NULL, 0);
>> +    while (!quit) {
>> + __atomic_fetch_add(&worker_stats[id].handled_packets, num,
>> +                __ATOMIC_ACQ_REL);
>> +        for (i = 0; i < num; i++)
>> +            buf[i]->udata64 += id + 1;
>> +        num = rte_distributor_get_pkt(db, id,
>> +                buf, buf, num);
>> +    }
>> +    __atomic_fetch_add(&worker_stats[id].handled_packets, num,
>> +            __ATOMIC_ACQ_REL);
>> +    rte_distributor_return_pkt(db, id, buf, num);
>> +    return 0;
>> +}
>> +
>> +/* sanity_mark_test sends packets to workers which mark them.
>> + * Every packet has also encoded sequence number.
>> + * The returned packets are sorted and verified if they were handled
>> + * by proper workers.
>> + */
>> +static int
>> +sanity_mark_test(struct worker_params *wp, struct rte_mempool *p)
>> +{
>> +    const unsigned int buf_count = 24;
>> +    const unsigned int burst = 8;
>> +    const unsigned int shift = 12;
>> +    const unsigned int seq_shift = 10;
>> +
>> +    struct rte_distributor *db = wp->dist;
>> +    struct rte_mbuf *bufs[buf_count];
>> +    struct rte_mbuf *returns[buf_count];
>> +    unsigned int i, count, id;
>> +    unsigned int sorted[buf_count], seq;
>> +    unsigned int failed = 0;
>> +
>> +    printf("=== Marked packets test ===\n");
>> +    clear_packet_count();
>> +    if (rte_mempool_get_bulk(p, (void *)bufs, buf_count) != 0) {
>> +        printf("line %d: Error getting mbufs from pool\n", __LINE__);
>> +        return -1;
>> +    }
>> +
>> +/* bufs' hashes will be like these below, but shifted left.
>> + * The shifting is for avoiding collisions with backlogs
>> + * and in-flight tags left by previous tests.
>> + * [1, 1, 1, 1, 1, 1, 1, 1
>> + *  1, 1, 1, 1, 2, 2, 2, 2
>> + *  2, 2, 2, 2, 1, 1, 1, 1]
>> + */
>
> I would suggest indenting the comments to the same indent level as the 
> code; this would make the flow easier to read. The same goes for the
> additional comments below.
>
The indentation will be fixed in v6 as you suggested.
>
>> +    for (i = 0; i < burst; i++) {
>> +        bufs[0 * burst + i]->hash.usr = 1 << shift;
>> +        bufs[1 * burst + i]->hash.usr = ((i < burst / 2) ? 1 : 2)
>> +            << shift;
>> +        bufs[2 * burst + i]->hash.usr = ((i < burst / 2) ? 2 : 1)
>> +            << shift;
>> +    }
>> +/* Assign a sequence number to each packet. The sequence is shifted,
>> + * so that lower bits of the udata64 will hold mark from worker.
>> + */
>> +    for (i = 0; i < buf_count; i++)
>> +        bufs[i]->udata64 = i << seq_shift;
>> +
>> +    count = 0;
>> +    for (i = 0; i < buf_count/burst; i++) {
>> +        rte_distributor_process(db, &bufs[i * burst], burst);
>> +        count += rte_distributor_returned_pkts(db, &returns[count],
>> +            buf_count - count);
>> +    }
>> +
>> +    do {
>> +        rte_distributor_flush(db);
>> +        count += rte_distributor_returned_pkts(db, &returns[count],
>> +            buf_count - count);
>> +    } while (count < buf_count);
>> +
>> +    for (i = 0; i < rte_lcore_count() - 1; i++)
>> +        printf("Worker %u handled %u packets\n", i,
>> + __atomic_load_n(&worker_stats[i].handled_packets,
>> +                    __ATOMIC_ACQUIRE));
>> +
>> +/* Sort returned packets by sent order (sequence numbers). */
>> +    for (i = 0; i < buf_count; i++) {
>> +        seq = returns[i]->udata64 >> seq_shift;
>> +        id = returns[i]->udata64 - (seq << seq_shift);
>> +        sorted[seq] = id;
>> +    }
>> +
>> +/* Verify that packets [0-11] and [20-23] were processed
>> + * by the same worker
>> + */
>> +    for (i = 1; i < 12; i++) {
>> +        if (sorted[i] != sorted[0]) {
>> +            printf("Packet number %u processed by worker %u,"
>> +                " but should be processed by worker %u\n",
>> +                i, sorted[i], sorted[0]);
>> +            failed = 1;
>> +        }
>> +    }
>> +    for (i = 20; i < 24; i++) {
>> +        if (sorted[i] != sorted[0]) {
>> +            printf("Packet number %u processed by worker %u,"
>> +                " but should be processed by worker %u\n",
>> +                i, sorted[i], sorted[0]);
>> +            failed = 1;
>> +        }
>> +    }
>> +/* And verify that packets [12-19] were processed
>> + * by the another worker
>> + */
>> +    for (i = 13; i < 20; i++) {
>> +        if (sorted[i] != sorted[12]) {
>> +            printf("Packet number %u processed by worker %u,"
>> +                " but should be processed by worker %u\n",
>> +                i, sorted[i], sorted[12]);
>> +            failed = 1;
>> +        }
>> +    }
>> +
>> +    rte_mempool_put_bulk(p, (void *)bufs, buf_count);
>> +
>> +    if (failed)
>> +        return -1;
>> +
>> +    printf("Marked packets test passed\n");
>> +    return 0;
>> +}
>> +
>>   static
>>   int test_error_distributor_create_name(void)
>>   {
>> @@ -726,6 +861,12 @@ test_distributor(void)
>>                   goto err;
>>               quit_workers(&worker_params, p);
>>   +            rte_eal_mp_remote_launch(handle_and_mark_work,
>> +                    &worker_params, SKIP_MASTER);
>> +            if (sanity_mark_test(&worker_params, p) < 0)
>> +                goto err;
>> +            quit_workers(&worker_params, p);
>> +
>>           } else {
>>               printf("Too few cores to run worker shutdown test\n");
>>           }
>
>
> Checking that the flows go to the correct workers is a really good test
> to have. Thanks for the effort here.
>
> Apart from the comment indentation nit: the rest looks good to me.
>
> Acked-by: David Hunt <david.hunt@intel.com>

Thanks

Lukasz

>
>
>
-- 
Lukasz Wojciechowski
Principal Software Engineer

Samsung R&D Institute Poland
Samsung Electronics
Office +48 22 377 88 25
l.wojciechow@partner.samsung.com


^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v5 00/15] fix distributor synchronization issues
  2020-10-09 12:53                       ` David Marchand
@ 2020-10-09 21:41                         ` Lukasz Wojciechowski
  2020-10-09 23:25                           ` Lukasz Wojciechowski
  0 siblings, 1 reply; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-09 21:41 UTC (permalink / raw)
  To: David Marchand
  Cc: David Hunt, Honnappa Nagarahalli, dev, Sarosh Arif,
	"'Lukasz Wojciechowski'",

Hi David,

On 09.10.2020 at 14:53, David Marchand wrote:
> Hello Lukasz,
>
> On Thu, Oct 8, 2020 at 11:17 PM Lukasz Wojciechowski
> <l.wojciechow@partner.samsung.com> wrote:
>> I'm here if you have any questions or suggestions
> Unfortunately, I can see a timeout on the distributor autotest in Travis:
> https://travis-ci.com/github/ovsrobot/dpdk/jobs/396703415#L1151
>
> Can you have a look?
I took a look, but I don't know the cause of the test hanging and timing out.
I ran more than 200000 iterations of the distributor tests today and didn't
get a single failure or lock.
David Hunt also ran the series tests today when checking the impact on
performance, and I guess he didn't get the issue.
@DavidHunt, am I right?

The failure happened in only one configuration, and the tests were run by
travis using different compilers, architectures, etc.

The test did not write anything on stdout or stderr:
--- stdout ---
EAL: Probing VFIO support...
APP: HPET is not enabled, using TSC as default timer
RTE>>distributor_autotest
--- stderr ---
EAL: Detected 2 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/distributor_autotest/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: No available hugepages reported in hugepages-1048576kB
-------
That's quite strange, because the first test that is run, sanity_test,
starts by printing information about the start.

Before that there is only the initialization code of the distributor
structure and the creation of the mempool.

The only modification I made to the initialization of the distributor
structure was initializing the active and activesum fields:

     memset(d->active, 0, sizeof(d->active));
     d->activesum = 0;

That seems not to be the reason.

I don't know what it could be.


Is there a way to trigger the travis job manually to see if the timeout
reproduces?

> Btw, did you receive a notification about this from the robot?
Yes, I got it.
But I interpreted it badly. I downloaded the log and started reading it
from the end, and when I saw:

    Compiler stderr:
      /usr/bin/ld: cannot find -lvirt
    collect2: error: ld returned 1 exit status

  I thought that was it. Sorry for that.


BTW I'm going to publish v6 with changes suggested by Honnappa 
Nagarahalli (RELAXED memory mode) and David Hunt (indentations)


Best regards

Lukasz

>
>
-- 
Lukasz Wojciechowski
Principal Software Engineer

Samsung R&D Institute Poland
Samsung Electronics
Office +48 22 377 88 25
l.wojciechow@partner.samsung.com


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v6 00/15] fix distributor synchronization issues
       [not found]                   ` <CGME20201009220207eucas1p1d83b63b4f0e05cbaf0a58f7f01ec0052@eucas1p1.samsung.com>
@ 2020-10-09 22:01                     ` Lukasz Wojciechowski
       [not found]                       ` <CGME20201009220229eucas1p17ad627f31005ed506c5422b93ad6d112@eucas1p1.samsung.com>
                                         ` (15 more replies)
  0 siblings, 16 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-09 22:01 UTC (permalink / raw)
  Cc: dev, l.wojciechow

During review and verification of the patch created by Sarosh Arif:
"test_distributor: prevent memory leakages from the pool" I found out
that running the distributor unit tests multiple times in a row causes failures.
So I investigated all the issues I found.

There are a few synchronization issues that might cause deadlocks
or corrupted data. They are fixed with this set of patches for both tests
and librte_distributor library.

---
v6:
* fix comments indentation
* fix stats atomic operations memory mode from ACQUIRE/RELEASE
    to RELAXED

v5:
* implement missing functionality in burst mode - worker shutdown
* fix shutdown test to always shutdown busy worker
* use atomic stores instead of barrier in tests clear_packet_count()
* reorder patches
* new patch 7: fix call to return_pkt in single mode
* new patch 11: replacing delays with spinlock on atomics in tests
* new patch 12: fix scalar matching algorithm
* new patch 13: new test with marking and checking every packet
* new patch 14: flush also in flight packets
* new patch 15: fix clearing returns buffer
* minor fixes in other patches

v4:
* adjust commit name prefixes app/test -> test/distributor:
* reorder patches
* use NULL oldpkt in rte_distributor_get_pkt() calls in tests

v3:
* add missing acked and tested by statements from v1

v2:
* assign NULL to freed mbufs in distributor test
* fix handshake check on legacy single distributor
     rte_distributor_return_pkt_single()
* add patch 7 passing NULL to legacy API calls if no bufs are returned
* add patch 8 fixing API documentation


Lukasz Wojciechowski (15):
  distributor: fix missing handshake synchronization
  distributor: fix handshake deadlock
  distributor: do not use oldpkt when not needed
  distributor: handle worker shutdown in burst mode
  test/distributor: fix shutdown of busy worker
  test/distributor: synchronize lcores statistics
  distributor: fix return pkt calls in single mode
  test/distributor: fix freeing mbufs
  test/distributor: collect return mbufs
  distributor: align API documentation with code
  test/distributor: replace delays with spin locks
  distributor: fix scalar matching
  test/distributor: add test with packets marking
  distributor: fix flushing in flight packets
  distributor: fix clearing returns buffer

 app/test/test_distributor.c                   | 317 ++++++++++++++----
 lib/librte_distributor/distributor_private.h  |   3 +
 lib/librte_distributor/rte_distributor.c      | 219 +++++++++---
 lib/librte_distributor/rte_distributor.h      |  23 +-
 .../rte_distributor_single.c                  |   4 +
 5 files changed, 445 insertions(+), 121 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v6 01/15] distributor: fix missing handshake synchronization
       [not found]                       ` <CGME20201009220229eucas1p17ad627f31005ed506c5422b93ad6d112@eucas1p1.samsung.com>
@ 2020-10-09 22:01                         ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-09 22:01 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

The rte_distributor_return_pkt() function, which is run on worker
cores, must wait for the distributor core to clear the handshake on
retptr64 before using those buffers. While the handshake is set, the
distributor core controls the buffers, and any operation on the worker
side might overwrite buffers which are not yet read.
The same situation appears in the legacy single distributor: the
rte_distributor_return_pkt_single() function shouldn't modify bufptr64
until the handshake on it is cleared by the distributor lcore.
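
A minimal standalone sketch of the rule this patch enforces
(illustrative flag and slot names, not the library code): the writer
must not touch the shared slot until the reader has cleared the
handshake flag.

    #include <stdint.h>

    #define FLAG_BUSY 1  /* stands in for the handshake bit */

    static void
    publish(volatile int64_t *slot, int64_t payload)
    {
        /* spin while the previous value is still owned by the reader */
        while (__atomic_load_n(slot, __ATOMIC_ACQUIRE) & FLAG_BUSY)
            ;  /* the real code also backs off with rte_pause() */
        /* the slot is ours now: store the payload with the flag set */
        __atomic_store_n(slot, payload | FLAG_BUSY, __ATOMIC_RELEASE);
    }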

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_distributor/rte_distributor.c        | 14 ++++++++++++++
 lib/librte_distributor/rte_distributor_single.c |  4 ++++
 2 files changed, 18 insertions(+)

diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 1c047f065..89493c331 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -160,6 +160,7 @@ rte_distributor_return_pkt(struct rte_distributor *d,
 {
 	struct rte_distributor_buffer *buf = &d->bufs[worker_id];
 	unsigned int i;
+	volatile int64_t *retptr64;
 
 	if (unlikely(d->alg_type == RTE_DIST_ALG_SINGLE)) {
 		if (num == 1)
@@ -169,6 +170,19 @@ rte_distributor_return_pkt(struct rte_distributor *d,
 			return -EINVAL;
 	}
 
+	retptr64 = &(buf->retptr64[0]);
+	/* Spin while handshake bits are set (scheduler clears it).
+	 * Sync with worker on GET_BUF flag.
+	 */
+	while (unlikely(__atomic_load_n(retptr64, __ATOMIC_ACQUIRE)
+			& RTE_DISTRIB_GET_BUF)) {
+		rte_pause();
+		uint64_t t = rte_rdtsc()+100;
+
+		while (rte_rdtsc() < t)
+			rte_pause();
+	}
+
 	/* Sync with distributor to acquire retptrs */
 	__atomic_thread_fence(__ATOMIC_ACQUIRE);
 	for (i = 0; i < RTE_DIST_BURST_SIZE; i++)
diff --git a/lib/librte_distributor/rte_distributor_single.c b/lib/librte_distributor/rte_distributor_single.c
index abaf7730c..f4725b1d0 100644
--- a/lib/librte_distributor/rte_distributor_single.c
+++ b/lib/librte_distributor/rte_distributor_single.c
@@ -74,6 +74,10 @@ rte_distributor_return_pkt_single(struct rte_distributor_single *d,
 	union rte_distributor_buffer_single *buf = &d->bufs[worker_id];
 	uint64_t req = (((int64_t)(uintptr_t)oldpkt) << RTE_DISTRIB_FLAG_BITS)
 			| RTE_DISTRIB_RETURN_BUF;
+	while (unlikely(__atomic_load_n(&buf->bufptr64, __ATOMIC_RELAXED)
+			& RTE_DISTRIB_FLAGS_MASK))
+		rte_pause();
+
 	/* Sync with distributor on RETURN_BUF flag. */
 	__atomic_store_n(&(buf->bufptr64), req, __ATOMIC_RELEASE);
 	return 0;
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v6 02/15] distributor: fix handshake deadlock
       [not found]                       ` <CGME20201009220231eucas1p217c48d880aaa7f15e4351f92eede01b6@eucas1p2.samsung.com>
@ 2020-10-09 22:01                         ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-09 22:01 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

Synchronization of the data exchange between distributor and worker
cores is based on 2 handshakes: retptr64 for returning mbufs from
workers to the distributor, and bufptr64 for passing mbufs to workers.

Without a proper order of verifying those 2 handshakes a deadlock may
occur: the worker core wants to return mbufs and waits for the retptr
handshake to be cleared, while the distributor core waits for bufptr
to send mbufs to the worker.

This can happen because a worker core first returns mbufs to the
distributor and only later gets new mbufs, while the distributor first
releases mbufs to the worker and only later handles returned packets.

This patch removes the possibility of the deadlock by always handling
returned packets first on the distributor side, and by handling them
also while waiting to release new ones.
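
The cycle can be sketched as follows (an illustrative pseudo-timeline
in comment form, not code from the library):

    /* worker lcore:                  distributor lcore:
     *   return_pkt() spins until       release() spins until the
     *   the retptr64 handshake         bufptr64 GET_BUF flag is
     *   is cleared ...                 set ...
     *
     * Before the fix the distributor cleared retptr64 only in
     * handle_returns(), which ran after its spin, so neither side
     * could make progress. Calling handle_returns() before and inside
     * the spin breaks the cycle.
     */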

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_distributor/rte_distributor.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 89493c331..12b3db33c 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -321,12 +321,14 @@ release(struct rte_distributor *d, unsigned int wkr)
 	struct rte_distributor_buffer *buf = &(d->bufs[wkr]);
 	unsigned int i;
 
+	handle_returns(d, wkr);
+
 	/* Sync with worker on GET_BUF flag */
 	while (!(__atomic_load_n(&(d->bufs[wkr].bufptr64[0]), __ATOMIC_ACQUIRE)
-		& RTE_DISTRIB_GET_BUF))
+		& RTE_DISTRIB_GET_BUF)) {
+		handle_returns(d, wkr);
 		rte_pause();
-
-	handle_returns(d, wkr);
+	}
 
 	buf->count = 0;
 
@@ -376,6 +378,7 @@ rte_distributor_process(struct rte_distributor *d,
 		/* Flush out all non-full cache-lines to workers. */
 		for (wid = 0 ; wid < d->num_workers; wid++) {
 			/* Sync with worker on GET_BUF flag. */
+			handle_returns(d, wid);
 			if (__atomic_load_n(&(d->bufs[wid].bufptr64[0]),
 				__ATOMIC_ACQUIRE) & RTE_DISTRIB_GET_BUF) {
 				release(d, wid);
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v6 03/15] distributor: do not use oldpkt when not needed
       [not found]                       ` <CGME20201009220232eucas1p201d3b81574b7ec42ff3fb18f4bbfcbea@eucas1p2.samsung.com>
@ 2020-10-09 22:01                         ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-09 22:01 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

rte_distributor_request_pkt() and rte_distributor_get_pkt()
dereferenced the oldpkt parameter when in RTE_DIST_ALG_SINGLE mode,
even if the number of buffers returned from worker to distributor
was 0.

This patch passes NULL to the legacy API when the number of returned
buffers is 0. This allows passing NULL as the oldpkt parameter.

The distributor tests are also updated to pass NULL as oldpkt and
0 as the number of returned packets where no packets are returned.
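
For example, the first request from a worker, which has nothing to
return yet, now looks like this in the updated tests (no dummy oldpkt
array needed):

    num = rte_distributor_get_pkt(d, id, buf, NULL, 0);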

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 app/test/test_distributor.c              | 28 +++++++++---------------
 lib/librte_distributor/rte_distributor.c |  4 ++--
 2 files changed, 12 insertions(+), 20 deletions(-)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index ba1f81cf8..52230d250 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -62,13 +62,10 @@ handle_work(void *arg)
 	struct rte_mbuf *buf[8] __rte_cache_aligned;
 	struct worker_params *wp = arg;
 	struct rte_distributor *db = wp->dist;
-	unsigned int count = 0, num = 0;
+	unsigned int count = 0, num;
 	unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED);
-	int i;
 
-	for (i = 0; i < 8; i++)
-		buf[i] = NULL;
-	num = rte_distributor_get_pkt(db, id, buf, buf, num);
+	num = rte_distributor_get_pkt(db, id, buf, NULL, 0);
 	while (!quit) {
 		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
 				__ATOMIC_RELAXED);
@@ -272,19 +269,16 @@ handle_work_with_free_mbufs(void *arg)
 	struct rte_distributor *d = wp->dist;
 	unsigned int count = 0;
 	unsigned int i;
-	unsigned int num = 0;
+	unsigned int num;
 	unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED);
 
-	for (i = 0; i < 8; i++)
-		buf[i] = NULL;
-	num = rte_distributor_get_pkt(d, id, buf, buf, num);
+	num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 	while (!quit) {
 		worker_stats[id].handled_packets += num;
 		count += num;
 		for (i = 0; i < num; i++)
 			rte_pktmbuf_free(buf[i]);
-		num = rte_distributor_get_pkt(d,
-				id, buf, buf, num);
+		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 	}
 	worker_stats[id].handled_packets += num;
 	count += num;
@@ -342,14 +336,14 @@ handle_work_for_shutdown_test(void *arg)
 	struct worker_params *wp = arg;
 	struct rte_distributor *d = wp->dist;
 	unsigned int count = 0;
-	unsigned int num = 0;
+	unsigned int num;
 	unsigned int total = 0;
 	unsigned int i;
 	unsigned int returned = 0;
 	const unsigned int id = __atomic_fetch_add(&worker_idx, 1,
 			__ATOMIC_RELAXED);
 
-	num = rte_distributor_get_pkt(d, id, buf, buf, num);
+	num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 
 	/* wait for quit single globally, or for worker zero, wait
 	 * for zero_quit */
@@ -358,8 +352,7 @@ handle_work_for_shutdown_test(void *arg)
 		count += num;
 		for (i = 0; i < num; i++)
 			rte_pktmbuf_free(buf[i]);
-		num = rte_distributor_get_pkt(d,
-				id, buf, buf, num);
+		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 		total += num;
 	}
 	worker_stats[id].handled_packets += num;
@@ -373,14 +366,13 @@ handle_work_for_shutdown_test(void *arg)
 		while (zero_quit)
 			usleep(100);
 
-		num = rte_distributor_get_pkt(d,
-				id, buf, buf, num);
+		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 
 		while (!quit) {
 			worker_stats[id].handled_packets += num;
 			count += num;
 			rte_pktmbuf_free(pkt);
-			num = rte_distributor_get_pkt(d, id, buf, buf, num);
+			num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 		}
 		returned = rte_distributor_return_pkt(d,
 				id, buf, num);
diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 12b3db33c..b720abe03 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -42,7 +42,7 @@ rte_distributor_request_pkt(struct rte_distributor *d,
 
 	if (unlikely(d->alg_type == RTE_DIST_ALG_SINGLE)) {
 		rte_distributor_request_pkt_single(d->d_single,
-			worker_id, oldpkt[0]);
+			worker_id, count ? oldpkt[0] : NULL);
 		return;
 	}
 
@@ -134,7 +134,7 @@ rte_distributor_get_pkt(struct rte_distributor *d,
 	if (unlikely(d->alg_type == RTE_DIST_ALG_SINGLE)) {
 		if (return_count <= 1) {
 			pkts[0] = rte_distributor_get_pkt_single(d->d_single,
-				worker_id, oldpkt[0]);
+				worker_id, return_count ? oldpkt[0] : NULL);
 			return (pkts[0]) ? 1 : 0;
 		} else
 			return -EINVAL;
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v6 04/15] distributor: handle worker shutdown in burst mode
       [not found]                       ` <CGME20201009220233eucas1p285b4d01402c0c8bcfd018673afeb05eb@eucas1p2.samsung.com>
@ 2020-10-09 22:01                         ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-09 22:01 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

The burst version of the distributor implementation was missing proper
handling of worker shutdown. A worker processing packets received
from the distributor can call the rte_distributor_return_pkt()
function to inform the distributor that it wants no more packets.
Further calls to rte_distributor_request_pkt() or
rte_distributor_get_pkt() however should inform the distributor that
new packets are requested again.

The lack of a proper implementation caused that, even after a worker
informed the distributor about returning its last packets, new packets
were still sent to it, causing deadlocks as no one could get them on
the worker side.

This patch adds handling of worker shutdown in the following way
(a bookkeeping sketch follows the list):
1) It fixes the usage of the RTE_DISTRIB_VALID_BUF handshake flag.
This flag was formerly unused in the burst implementation and now it
is used for marking valid packets in retptr64, replacing the invalid
use of the RTE_DISTRIB_RETURN_BUF flag.
2) It uses RTE_DISTRIB_RETURN_BUF as a worker-to-distributor handshake
in retptr64 to indicate that the worker has shut down.
3) A worker that shuts down also blocks bufptr64 for itself with the
RTE_DISTRIB_RETURN_BUF flag, allowing the distributor to retrieve any
in-flight packets.
4) When the distributor receives information about the shutdown of a
worker, it: marks the worker as not active; retrieves any in-flight
and backlog packets and reprocesses them to different workers; and
unlocks bufptr64 by clearing the RTE_DISTRIB_RETURN_BUF flag, allowing
its use in the future if the worker requests any new packets.
5) It does not send or add to the backlog any packets for workers that
are not active. Such workers are also ignored if matched.
6) It adjusts the calls to handle_returns() and the tag matching
procedure to react to possible activation and deactivation of workers.
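
The active-worker bookkeeping the patch adds to handle_returns() can
be illustrated with a standalone mock (the flag values and plain
globals below are assumptions for the sketch; the patch operates on
the distributor structure instead):

    #include <stdint.h>
    #include <stdio.h>

    #define MAX_WORKERS 4
    #define GET_BUF     (1ULL << 0)  /* assumed bit values */
    #define RETURN_BUF  (1ULL << 1)

    static uint8_t active[MAX_WORKERS];
    static uint8_t activesum;

    /* a worker is active iff its last retptr64 handshake had GET_BUF */
    static void
    update_active(unsigned int wkr, uint64_t retptr0)
    {
        activesum -= active[wkr];
        active[wkr] = !!(retptr0 & GET_BUF);
        activesum += active[wkr];
    }

    int main(void)
    {
        update_active(0, GET_BUF);     /* worker 0 requests packets */
        update_active(1, RETURN_BUF);  /* worker 1 shuts down */
        printf("active workers: %u\n", (unsigned int)activesum);
        return 0;
    }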

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_distributor/distributor_private.h |   3 +
 lib/librte_distributor/rte_distributor.c     | 175 +++++++++++++++----
 2 files changed, 146 insertions(+), 32 deletions(-)

diff --git a/lib/librte_distributor/distributor_private.h b/lib/librte_distributor/distributor_private.h
index 489aef2ac..689fe3e18 100644
--- a/lib/librte_distributor/distributor_private.h
+++ b/lib/librte_distributor/distributor_private.h
@@ -155,6 +155,9 @@ struct rte_distributor {
 	enum rte_distributor_match_function dist_match_fn;
 
 	struct rte_distributor_single *d_single;
+
+	uint8_t active[RTE_DISTRIB_MAX_WORKERS];
+	uint8_t activesum;
 };
 
 void
diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index b720abe03..115443fc0 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -51,7 +51,7 @@ rte_distributor_request_pkt(struct rte_distributor *d,
 	 * Sync with worker on GET_BUF flag.
 	 */
 	while (unlikely(__atomic_load_n(retptr64, __ATOMIC_ACQUIRE)
-			& RTE_DISTRIB_GET_BUF)) {
+			& (RTE_DISTRIB_GET_BUF | RTE_DISTRIB_RETURN_BUF))) {
 		rte_pause();
 		uint64_t t = rte_rdtsc()+100;
 
@@ -67,11 +67,11 @@ rte_distributor_request_pkt(struct rte_distributor *d,
 	for (i = count; i < RTE_DIST_BURST_SIZE; i++)
 		buf->retptr64[i] = 0;
 
-	/* Set Return bit for each packet returned */
+	/* Set VALID_BUF bit for each packet returned */
 	for (i = count; i-- > 0; )
 		buf->retptr64[i] =
 			(((int64_t)(uintptr_t)(oldpkt[i])) <<
-			RTE_DISTRIB_FLAG_BITS) | RTE_DISTRIB_RETURN_BUF;
+			RTE_DISTRIB_FLAG_BITS) | RTE_DISTRIB_VALID_BUF;
 
 	/*
 	 * Finally, set the GET_BUF  to signal to distributor that cache
@@ -97,11 +97,13 @@ rte_distributor_poll_pkt(struct rte_distributor *d,
 		return (pkts[0]) ? 1 : 0;
 	}
 
-	/* If bit is set, return
+	/* If any of below bits is set, return.
+	 * GET_BUF is set when distributor hasn't sent any packets yet
+	 * RETURN_BUF is set when distributor must retrieve in-flight packets
 	 * Sync with distributor to acquire bufptrs
 	 */
 	if (__atomic_load_n(&(buf->bufptr64[0]), __ATOMIC_ACQUIRE)
-		& RTE_DISTRIB_GET_BUF)
+		& (RTE_DISTRIB_GET_BUF | RTE_DISTRIB_RETURN_BUF))
 		return -1;
 
 	/* since bufptr64 is signed, this should be an arithmetic shift */
@@ -113,7 +115,7 @@ rte_distributor_poll_pkt(struct rte_distributor *d,
 	}
 
 	/*
-	 * so now we've got the contents of the cacheline into an  array of
+	 * so now we've got the contents of the cacheline into an array of
 	 * mbuf pointers, so toggle the bit so scheduler can start working
 	 * on the next cacheline while we're working.
 	 * Sync with distributor on GET_BUF flag. Release bufptrs.
@@ -175,7 +177,7 @@ rte_distributor_return_pkt(struct rte_distributor *d,
 	 * Sync with worker on GET_BUF flag.
 	 */
 	while (unlikely(__atomic_load_n(retptr64, __ATOMIC_ACQUIRE)
-			& RTE_DISTRIB_GET_BUF)) {
+			& (RTE_DISTRIB_GET_BUF | RTE_DISTRIB_RETURN_BUF))) {
 		rte_pause();
 		uint64_t t = rte_rdtsc()+100;
 
@@ -187,17 +189,25 @@ rte_distributor_return_pkt(struct rte_distributor *d,
 	__atomic_thread_fence(__ATOMIC_ACQUIRE);
 	for (i = 0; i < RTE_DIST_BURST_SIZE; i++)
 		/* Switch off the return bit first */
-		buf->retptr64[i] &= ~RTE_DISTRIB_RETURN_BUF;
+		buf->retptr64[i] = 0;
 
 	for (i = num; i-- > 0; )
 		buf->retptr64[i] = (((int64_t)(uintptr_t)oldpkt[i]) <<
-			RTE_DISTRIB_FLAG_BITS) | RTE_DISTRIB_RETURN_BUF;
+			RTE_DISTRIB_FLAG_BITS) | RTE_DISTRIB_VALID_BUF;
+
+	/* Use RETURN_BUF on bufptr64 to notify distributor that
+	 * we won't read any mbufs from there even if GET_BUF is set.
+	 * This allows distributor to retrieve in-flight already sent packets.
+	 */
+	__atomic_or_fetch(&(buf->bufptr64[0]), RTE_DISTRIB_RETURN_BUF,
+		__ATOMIC_ACQ_REL);
 
-	/* set the GET_BUF but even if we got no returns.
-	 * Sync with distributor on GET_BUF flag. Release retptrs.
+	/* set the RETURN_BUF on retptr64 even if we got no returns.
+	 * Sync with distributor on RETURN_BUF flag. Release retptrs.
+	 * Notify distributor that we don't request more packets any more.
 	 */
 	__atomic_store_n(&(buf->retptr64[0]),
-		buf->retptr64[0] | RTE_DISTRIB_GET_BUF, __ATOMIC_RELEASE);
+		buf->retptr64[0] | RTE_DISTRIB_RETURN_BUF, __ATOMIC_RELEASE);
 
 	return 0;
 }
@@ -267,6 +277,59 @@ find_match_scalar(struct rte_distributor *d,
 	 */
 }
 
+/*
+ * When worker called rte_distributor_return_pkt()
+ * and passed RTE_DISTRIB_RETURN_BUF handshake through retptr64,
+ * distributor must retrieve both inflight and backlog packets assigned
+ * to the worker and reprocess them to another worker.
+ */
+static void
+handle_worker_shutdown(struct rte_distributor *d, unsigned int wkr)
+{
+	struct rte_distributor_buffer *buf = &(d->bufs[wkr]);
+	/* double BURST size for storing both inflights and backlog */
+	struct rte_mbuf *pkts[RTE_DIST_BURST_SIZE * 2];
+	unsigned int pkts_count = 0;
+	unsigned int i;
+
+	/* If GET_BUF is cleared there are in-flight packets sent
+	 * to worker which does not require new packets.
+	 * They must be retrieved and assigned to another worker.
+	 */
+	if (!(__atomic_load_n(&(buf->bufptr64[0]), __ATOMIC_ACQUIRE)
+		& RTE_DISTRIB_GET_BUF))
+		for (i = 0; i < RTE_DIST_BURST_SIZE; i++)
+			if (buf->bufptr64[i] & RTE_DISTRIB_VALID_BUF)
+				pkts[pkts_count++] = (void *)((uintptr_t)
+					(buf->bufptr64[i]
+						>> RTE_DISTRIB_FLAG_BITS));
+
+	/* Make following operations on handshake flags on bufptr64:
+	 * - set GET_BUF to indicate that distributor can overwrite buffer
+	 *     with new packets if worker will make a new request.
+	 * - clear RETURN_BUF to unlock reads on worker side.
+	 */
+	__atomic_store_n(&(buf->bufptr64[0]), RTE_DISTRIB_GET_BUF,
+		__ATOMIC_RELEASE);
+
+	/* Collect backlog packets from worker */
+	for (i = 0; i < d->backlog[wkr].count; i++)
+		pkts[pkts_count++] = (void *)((uintptr_t)
+			(d->backlog[wkr].pkts[i] >> RTE_DISTRIB_FLAG_BITS));
+
+	d->backlog[wkr].count = 0;
+
+	/* Clear both inflight and backlog tags */
+	for (i = 0; i < RTE_DIST_BURST_SIZE; i++) {
+		d->in_flight_tags[wkr][i] = 0;
+		d->backlog[wkr].tags[i] = 0;
+	}
+
+	/* Recursive call */
+	if (pkts_count > 0)
+		rte_distributor_process(d, pkts, pkts_count);
+}
+
 
 /*
  * When the handshake bits indicate that there are packets coming
@@ -285,19 +348,33 @@ handle_returns(struct rte_distributor *d, unsigned int wkr)
 
 	/* Sync on GET_BUF flag. Acquire retptrs. */
 	if (__atomic_load_n(&(buf->retptr64[0]), __ATOMIC_ACQUIRE)
-		& RTE_DISTRIB_GET_BUF) {
+		& (RTE_DISTRIB_GET_BUF | RTE_DISTRIB_RETURN_BUF)) {
 		for (i = 0; i < RTE_DIST_BURST_SIZE; i++) {
-			if (buf->retptr64[i] & RTE_DISTRIB_RETURN_BUF) {
+			if (buf->retptr64[i] & RTE_DISTRIB_VALID_BUF) {
 				oldbuf = ((uintptr_t)(buf->retptr64[i] >>
 					RTE_DISTRIB_FLAG_BITS));
 				/* store returns in a circular buffer */
 				store_return(oldbuf, d, &ret_start, &ret_count);
 				count++;
-				buf->retptr64[i] &= ~RTE_DISTRIB_RETURN_BUF;
+				buf->retptr64[i] &= ~RTE_DISTRIB_VALID_BUF;
 			}
 		}
 		d->returns.start = ret_start;
 		d->returns.count = ret_count;
+
+		/* If worker requested packets with GET_BUF, set it to active
+		 * otherwise (RETURN_BUF), set it to not active.
+		 */
+		d->activesum -= d->active[wkr];
+		d->active[wkr] = !!(buf->retptr64[0] & RTE_DISTRIB_GET_BUF);
+		d->activesum += d->active[wkr];
+
+		/* If worker returned packets without requesting new ones,
+		 * handle all in-flights and backlog packets assigned to it.
+		 */
+		if (unlikely(buf->retptr64[0] & RTE_DISTRIB_RETURN_BUF))
+			handle_worker_shutdown(d, wkr);
+
 		/* Clear for the worker to populate with more returns.
 		 * Sync with distributor on GET_BUF flag. Release retptrs.
 		 */
@@ -322,11 +399,15 @@ release(struct rte_distributor *d, unsigned int wkr)
 	unsigned int i;
 
 	handle_returns(d, wkr);
+	if (unlikely(!d->active[wkr]))
+		return 0;
 
 	/* Sync with worker on GET_BUF flag */
 	while (!(__atomic_load_n(&(d->bufs[wkr].bufptr64[0]), __ATOMIC_ACQUIRE)
 		& RTE_DISTRIB_GET_BUF)) {
 		handle_returns(d, wkr);
+		if (unlikely(!d->active[wkr]))
+			return 0;
 		rte_pause();
 	}
 
@@ -366,7 +447,7 @@ rte_distributor_process(struct rte_distributor *d,
 	int64_t next_value = 0;
 	uint16_t new_tag = 0;
 	uint16_t flows[RTE_DIST_BURST_SIZE] __rte_cache_aligned;
-	unsigned int i, j, w, wid;
+	unsigned int i, j, w, wid, matching_required;
 
 	if (d->alg_type == RTE_DIST_ALG_SINGLE) {
 		/* Call the old API */
@@ -374,11 +455,13 @@ rte_distributor_process(struct rte_distributor *d,
 			mbufs, num_mbufs);
 	}
 
+	for (wid = 0 ; wid < d->num_workers; wid++)
+		handle_returns(d, wid);
+
 	if (unlikely(num_mbufs == 0)) {
 		/* Flush out all non-full cache-lines to workers. */
 		for (wid = 0 ; wid < d->num_workers; wid++) {
 			/* Sync with worker on GET_BUF flag. */
-			handle_returns(d, wid);
 			if (__atomic_load_n(&(d->bufs[wid].bufptr64[0]),
 				__ATOMIC_ACQUIRE) & RTE_DISTRIB_GET_BUF) {
 				release(d, wid);
@@ -388,6 +471,9 @@ rte_distributor_process(struct rte_distributor *d,
 		return 0;
 	}
 
+	if (unlikely(!d->activesum))
+		return 0;
+
 	while (next_idx < num_mbufs) {
 		uint16_t matches[RTE_DIST_BURST_SIZE];
 		unsigned int pkts;
@@ -412,22 +498,30 @@ rte_distributor_process(struct rte_distributor *d,
 		for (; i < RTE_DIST_BURST_SIZE; i++)
 			flows[i] = 0;
 
-		switch (d->dist_match_fn) {
-		case RTE_DIST_MATCH_VECTOR:
-			find_match_vec(d, &flows[0], &matches[0]);
-			break;
-		default:
-			find_match_scalar(d, &flows[0], &matches[0]);
-		}
+		matching_required = 1;
 
+		for (j = 0; j < pkts; j++) {
+			if (unlikely(!d->activesum))
+				return next_idx;
+
+			if (unlikely(matching_required)) {
+				switch (d->dist_match_fn) {
+				case RTE_DIST_MATCH_VECTOR:
+					find_match_vec(d, &flows[0],
+						&matches[0]);
+					break;
+				default:
+					find_match_scalar(d, &flows[0],
+						&matches[0]);
+				}
+				matching_required = 0;
+			}
 		/*
 		 * Matches array now contain the intended worker ID (+1) of
 		 * the incoming packets. Any zeroes need to be assigned
 		 * workers.
 		 */
 
-		for (j = 0; j < pkts; j++) {
-
 			next_mb = mbufs[next_idx++];
 			next_value = (((int64_t)(uintptr_t)next_mb) <<
 					RTE_DISTRIB_FLAG_BITS);
@@ -447,12 +541,18 @@ rte_distributor_process(struct rte_distributor *d,
 			 */
 			/* matches[j] = 0; */
 
-			if (matches[j]) {
+			if (matches[j] && d->active[matches[j]-1]) {
 				struct rte_distributor_backlog *bl =
 						&d->backlog[matches[j]-1];
 				if (unlikely(bl->count ==
 						RTE_DIST_BURST_SIZE)) {
 					release(d, matches[j]-1);
+					if (!d->active[matches[j]-1]) {
+						j--;
+						next_idx--;
+						matching_required = 1;
+						continue;
+					}
 				}
 
 				/* Add to worker that already has flow */
@@ -462,11 +562,21 @@ rte_distributor_process(struct rte_distributor *d,
 				bl->pkts[idx] = next_value;
 
 			} else {
-				struct rte_distributor_backlog *bl =
-						&d->backlog[wkr];
+				struct rte_distributor_backlog *bl;
+
+				while (unlikely(!d->active[wkr]))
+					wkr = (wkr + 1) % d->num_workers;
+				bl = &d->backlog[wkr];
+
 				if (unlikely(bl->count ==
 						RTE_DIST_BURST_SIZE)) {
 					release(d, wkr);
+					if (!d->active[wkr]) {
+						j--;
+						next_idx--;
+						matching_required = 1;
+						continue;
+					}
 				}
 
 				/* Add to current worker worker */
@@ -485,9 +595,7 @@ rte_distributor_process(struct rte_distributor *d,
 						matches[w] = wkr+1;
 			}
 		}
-		wkr++;
-		if (wkr >= d->num_workers)
-			wkr = 0;
+		wkr = (wkr + 1) % d->num_workers;
 	}
 
 	/* Flush out all non-full cache-lines to workers. */
@@ -663,6 +771,9 @@ rte_distributor_create(const char *name,
 	for (i = 0 ; i < num_workers ; i++)
 		d->backlog[i].tags = &d->in_flight_tags[i][RTE_DIST_BURST_SIZE];
 
+	memset(d->active, 0, sizeof(d->active));
+	d->activesum = 0;
+
 	dist_burst_list = RTE_TAILQ_CAST(rte_dist_burst_tailq.head,
 					  rte_dist_burst_list);
 
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v6 05/15] test/distributor: fix shutdown of busy worker
       [not found]                       ` <CGME20201009220235eucas1p17ded8b5bb42f2fef159a5715ef6fbca7@eucas1p1.samsung.com>
@ 2020-10-09 22:01                         ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-09 22:01 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

The sanity test with worker shutdown delegates all bufs
to be processed by a single lcore worker, then it freezes
one of the lcore workers and continues to send more bufs.
The frozen core shuts down first by calling
rte_distributor_return_pkt().

The test's intention is to verify that packets assigned to
the shut down lcore will be reassigned to another worker.

However, the shut down core was not always the one that was
processing packets. The lcore processing mbufs might be different
every time the test is launched. This is caused by the wkr static
variable in the rte_distributor_process() function keeping its
value between test cases.

The test always froze the lcore with id 0. The patch stores the id
of the worker that is processing the data in the zero_idx global
atomic variable. This way the frozen lcore is always the proper one.
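
The claim-by-CAS idea can be sketched standalone (ZERO_UNSET stands in
for RTE_MAX_LCORE, which the patch uses as the unset marker):

    #include <stdbool.h>
    #include <stdio.h>

    #define ZERO_UNSET 128

    static unsigned int zero_idx = ZERO_UNSET;

    /* only the first worker that actually receives packets wins */
    static void
    claim_zero(unsigned int id)
    {
        unsigned int unset = ZERO_UNSET;
        __atomic_compare_exchange_n(&zero_idx, &unset, id,
                false, __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE);
    }

    int main(void)
    {
        claim_zero(3);  /* worker 3 got packets first and wins */
        claim_zero(1);  /* no effect: zero_idx is already set */
        printf("frozen worker id: %u\n",
                __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE));
        return 0;
    }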

Fixes: c3eabff124e6 ("distributor: add unit tests")
Cc: bruce.richardson@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Tested-by: David Hunt <david.hunt@intel.com>
---
 app/test/test_distributor.c | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index 52230d250..6cd7a2edd 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -28,6 +28,7 @@ struct worker_params worker_params;
 static volatile int quit;      /**< general quit variable for all threads */
 static volatile int zero_quit; /**< var for when we just want thr0 to quit*/
 static volatile unsigned worker_idx;
+static volatile unsigned zero_idx;
 
 struct worker_stats {
 	volatile unsigned handled_packets;
@@ -340,26 +341,43 @@ handle_work_for_shutdown_test(void *arg)
 	unsigned int total = 0;
 	unsigned int i;
 	unsigned int returned = 0;
+	unsigned int zero_id = 0;
+	unsigned int zero_unset;
 	const unsigned int id = __atomic_fetch_add(&worker_idx, 1,
 			__ATOMIC_RELAXED);
 
 	num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 
+	if (num > 0) {
+		zero_unset = RTE_MAX_LCORE;
+		__atomic_compare_exchange_n(&zero_idx, &zero_unset, id,
+			false, __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE);
+	}
+	zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
+
 	/* wait for quit single globally, or for worker zero, wait
 	 * for zero_quit */
-	while (!quit && !(id == 0 && zero_quit)) {
+	while (!quit && !(id == zero_id && zero_quit)) {
 		worker_stats[id].handled_packets += num;
 		count += num;
 		for (i = 0; i < num; i++)
 			rte_pktmbuf_free(buf[i]);
 		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
+
+		if (num > 0) {
+			zero_unset = RTE_MAX_LCORE;
+			__atomic_compare_exchange_n(&zero_idx, &zero_unset, id,
+				false, __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE);
+		}
+		zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
+
 		total += num;
 	}
 	worker_stats[id].handled_packets += num;
 	count += num;
 	returned = rte_distributor_return_pkt(d, id, buf, num);
 
-	if (id == 0) {
+	if (id == zero_id) {
 		/* for worker zero, allow it to restart to pick up last packet
 		 * when all workers are shutting down.
 		 */
@@ -578,6 +596,7 @@ quit_workers(struct worker_params *wp, struct rte_mempool *p)
 	rte_eal_mp_wait_lcore();
 	quit = 0;
 	worker_idx = 0;
+	zero_idx = RTE_MAX_LCORE;
 }
 
 static int
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v6 06/15] test/distributor: synchronize lcores statistics
       [not found]                       ` <CGME20201009220236eucas1p192e34b3bbf00681ec90de296abd1a6b5@eucas1p1.samsung.com>
@ 2020-10-09 22:01                         ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-09 22:01 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

Statistics of handled packets are cleared and read on the main lcore,
while they are increased in the workers' handlers on different lcores.

Without synchronization they occasionally showed invalid values.
This patch uses atomic acquire/release mechanisms to synchronize them.
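
The pattern boils down to this standalone sketch (same GCC builtins as
the patch; a later patch in this series relaxes the memory order of
these counters to __ATOMIC_RELAXED):

    #include <stdio.h>

    static unsigned int handled_packets;

    /* worker lcore: add with release semantics */
    static void
    worker_add(unsigned int num)
    {
        __atomic_fetch_add(&handled_packets, num, __ATOMIC_ACQ_REL);
    }

    /* main lcore: read with acquire semantics */
    static unsigned int
    main_read(void)
    {
        return __atomic_load_n(&handled_packets, __ATOMIC_ACQUIRE);
    }

    int main(void)
    {
        worker_add(8);
        printf("%u\n", main_read());  /* prints 8 */
        return 0;
    }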

Fixes: c3eabff124e6 ("distributor: add unit tests")
Cc: bruce.richardson@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 app/test/test_distributor.c | 43 +++++++++++++++++++++++++------------
 1 file changed, 29 insertions(+), 14 deletions(-)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index 6cd7a2edd..838459392 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -43,7 +43,8 @@ total_packet_count(void)
 {
 	unsigned i, count = 0;
 	for (i = 0; i < worker_idx; i++)
-		count += worker_stats[i].handled_packets;
+		count += __atomic_load_n(&worker_stats[i].handled_packets,
+				__ATOMIC_ACQUIRE);
 	return count;
 }
 
@@ -51,7 +52,10 @@ total_packet_count(void)
 static inline void
 clear_packet_count(void)
 {
-	memset(&worker_stats, 0, sizeof(worker_stats));
+	unsigned int i;
+	for (i = 0; i < RTE_MAX_LCORE; i++)
+		__atomic_store_n(&worker_stats[i].handled_packets, 0,
+			__ATOMIC_RELEASE);
 }
 
 /* this is the basic worker function for sanity test
@@ -69,13 +73,13 @@ handle_work(void *arg)
 	num = rte_distributor_get_pkt(db, id, buf, NULL, 0);
 	while (!quit) {
 		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
-				__ATOMIC_RELAXED);
+				__ATOMIC_ACQ_REL);
 		count += num;
 		num = rte_distributor_get_pkt(db, id,
 				buf, buf, num);
 	}
 	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
-			__ATOMIC_RELAXED);
+			__ATOMIC_ACQ_REL);
 	count += num;
 	rte_distributor_return_pkt(db, id, buf, num);
 	return 0;
@@ -131,7 +135,8 @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
 
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
-				worker_stats[i].handled_packets);
+			__atomic_load_n(&worker_stats[i].handled_packets,
+					__ATOMIC_ACQUIRE));
 	printf("Sanity test with all zero hashes done.\n");
 
 	/* pick two flows and check they go correctly */
@@ -156,7 +161,9 @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
 
 		for (i = 0; i < rte_lcore_count() - 1; i++)
 			printf("Worker %u handled %u packets\n", i,
-					worker_stats[i].handled_packets);
+				__atomic_load_n(
+					&worker_stats[i].handled_packets,
+					__ATOMIC_ACQUIRE));
 		printf("Sanity test with two hash values done\n");
 	}
 
@@ -182,7 +189,8 @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
 
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
-				worker_stats[i].handled_packets);
+			__atomic_load_n(&worker_stats[i].handled_packets,
+					__ATOMIC_ACQUIRE));
 	printf("Sanity test with non-zero hashes done\n");
 
 	rte_mempool_put_bulk(p, (void *)bufs, BURST);
@@ -275,14 +283,16 @@ handle_work_with_free_mbufs(void *arg)
 
 	num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 	while (!quit) {
-		worker_stats[id].handled_packets += num;
 		count += num;
+		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
+				__ATOMIC_ACQ_REL);
 		for (i = 0; i < num; i++)
 			rte_pktmbuf_free(buf[i]);
 		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 	}
-	worker_stats[id].handled_packets += num;
 	count += num;
+	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
+			__ATOMIC_ACQ_REL);
 	rte_distributor_return_pkt(d, id, buf, num);
 	return 0;
 }
@@ -358,8 +368,9 @@ handle_work_for_shutdown_test(void *arg)
 	/* wait for quit single globally, or for worker zero, wait
 	 * for zero_quit */
 	while (!quit && !(id == zero_id && zero_quit)) {
-		worker_stats[id].handled_packets += num;
 		count += num;
+		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
+				__ATOMIC_ACQ_REL);
 		for (i = 0; i < num; i++)
 			rte_pktmbuf_free(buf[i]);
 		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
@@ -373,10 +384,11 @@ handle_work_for_shutdown_test(void *arg)
 
 		total += num;
 	}
-	worker_stats[id].handled_packets += num;
 	count += num;
 	returned = rte_distributor_return_pkt(d, id, buf, num);
 
+	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
+			__ATOMIC_ACQ_REL);
 	if (id == zero_id) {
 		/* for worker zero, allow it to restart to pick up last packet
 		 * when all workers are shutting down.
@@ -387,7 +399,8 @@ handle_work_for_shutdown_test(void *arg)
 		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 
 		while (!quit) {
-			worker_stats[id].handled_packets += num;
+			__atomic_fetch_add(&worker_stats[id].handled_packets,
+					num, __ATOMIC_ACQ_REL);
 			count += num;
 			rte_pktmbuf_free(pkt);
 			num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
@@ -454,7 +467,8 @@ sanity_test_with_worker_shutdown(struct worker_params *wp,
 
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
-				worker_stats[i].handled_packets);
+			__atomic_load_n(&worker_stats[i].handled_packets,
+					__ATOMIC_ACQUIRE));
 
 	if (total_packet_count() != BURST * 2) {
 		printf("Line %d: Error, not all packets flushed. "
@@ -507,7 +521,8 @@ test_flush_with_worker_shutdown(struct worker_params *wp,
 	zero_quit = 0;
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
-				worker_stats[i].handled_packets);
+			__atomic_load_n(&worker_stats[i].handled_packets,
+					__ATOMIC_ACQUIRE));
 
 	if (total_packet_count() != BURST) {
 		printf("Line %d: Error, not all packets flushed. "
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v6 07/15] distributor: fix return pkt calls in single mode
       [not found]                       ` <CGME20201009220238eucas1p2e86c0026064774e5b494c16c7fd384ec@eucas1p2.samsung.com>
@ 2020-10-09 22:01                         ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-09 22:01 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

In the legacy single version of the distributor, synchronization
requires a continuous exchange of buffers between the distributor
and the workers. Empty buffers are sent if only handshake
synchronization is required.
However, calls to rte_distributor_return_pkt()
with 0 buffers in single mode were ignored and not passed to the
legacy algorithm implementation, causing a lack of synchronization.

This patch fixes this issue by passing NULL as the buffer, which is
a valid way of sending just synchronization handshakes
in single mode.
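
As a usage illustration (the same call appears in the reworked tests
later in this series), a worker that has already freed its buffers can
now signal a pure handshake:

    /* return no mbufs, just complete the synchronization handshake */
    rte_distributor_return_pkt(d, worker_id, NULL, 0);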

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_distributor/rte_distributor.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 115443fc0..9fd7dcab7 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -168,6 +168,9 @@ rte_distributor_return_pkt(struct rte_distributor *d,
 		if (num == 1)
 			return rte_distributor_return_pkt_single(d->d_single,
 				worker_id, oldpkt[0]);
+		else if (num == 0)
+			return rte_distributor_return_pkt_single(d->d_single,
+				worker_id, NULL);
 		else
 			return -EINVAL;
 	}
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v6 08/15] test/distributor: fix freeing mbufs
       [not found]                       ` <CGME20201009220246eucas1p1283b16f1f54c572b5952ca9334d667da@eucas1p1.samsung.com>
@ 2020-10-09 22:01                         ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-09 22:01 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

The sanity test with mbuf alloc and the shutdown tests assume that
mbufs passed to worker cores are freed in the handlers.
Such packets should not be returned to the distributor's main
core. The only packets that should be returned are the packets
sent after completion of the tests in the quit_workers() function.

This patch stops returning mbufs to the distributor's core.
In the case of the shutdown tests it is impossible to determine
how the worker and distributor threads would synchronize.
Packets used by the tests should be freed and packets used during
quit_workers() shouldn't. That's why returning mbufs to the mempool
is moved from the worker threads to the test procedure run on the
distributor thread.

Additionally this patch cleans up unused variables.

Fixes: c0de0eb82e40 ("distributor: switch over to new API")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 app/test/test_distributor.c | 96 ++++++++++++++++++-------------------
 1 file changed, 47 insertions(+), 49 deletions(-)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index 838459392..06e01ff9d 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -44,7 +44,7 @@ total_packet_count(void)
 	unsigned i, count = 0;
 	for (i = 0; i < worker_idx; i++)
 		count += __atomic_load_n(&worker_stats[i].handled_packets,
-				__ATOMIC_ACQUIRE);
+				__ATOMIC_RELAXED);
 	return count;
 }
 
@@ -55,7 +55,7 @@ clear_packet_count(void)
 	unsigned int i;
 	for (i = 0; i < RTE_MAX_LCORE; i++)
 		__atomic_store_n(&worker_stats[i].handled_packets, 0,
-			__ATOMIC_RELEASE);
+			__ATOMIC_RELAXED);
 }
 
 /* this is the basic worker function for sanity test
@@ -67,20 +67,18 @@ handle_work(void *arg)
 	struct rte_mbuf *buf[8] __rte_cache_aligned;
 	struct worker_params *wp = arg;
 	struct rte_distributor *db = wp->dist;
-	unsigned int count = 0, num;
+	unsigned int num;
 	unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED);
 
 	num = rte_distributor_get_pkt(db, id, buf, NULL, 0);
 	while (!quit) {
 		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
-				__ATOMIC_ACQ_REL);
-		count += num;
+				__ATOMIC_RELAXED);
 		num = rte_distributor_get_pkt(db, id,
 				buf, buf, num);
 	}
 	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
-			__ATOMIC_ACQ_REL);
-	count += num;
+			__ATOMIC_RELAXED);
 	rte_distributor_return_pkt(db, id, buf, num);
 	return 0;
 }
@@ -136,7 +134,7 @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
 			__atomic_load_n(&worker_stats[i].handled_packets,
-					__ATOMIC_ACQUIRE));
+					__ATOMIC_RELAXED));
 	printf("Sanity test with all zero hashes done.\n");
 
 	/* pick two flows and check they go correctly */
@@ -163,7 +161,7 @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
 			printf("Worker %u handled %u packets\n", i,
 				__atomic_load_n(
 					&worker_stats[i].handled_packets,
-					__ATOMIC_ACQUIRE));
+					__ATOMIC_RELAXED));
 		printf("Sanity test with two hash values done\n");
 	}
 
@@ -190,7 +188,7 @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
 			__atomic_load_n(&worker_stats[i].handled_packets,
-					__ATOMIC_ACQUIRE));
+					__ATOMIC_RELAXED));
 	printf("Sanity test with non-zero hashes done\n");
 
 	rte_mempool_put_bulk(p, (void *)bufs, BURST);
@@ -276,23 +274,20 @@ handle_work_with_free_mbufs(void *arg)
 	struct rte_mbuf *buf[8] __rte_cache_aligned;
 	struct worker_params *wp = arg;
 	struct rte_distributor *d = wp->dist;
-	unsigned int count = 0;
 	unsigned int i;
 	unsigned int num;
 	unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED);
 
 	num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 	while (!quit) {
-		count += num;
 		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
-				__ATOMIC_ACQ_REL);
+				__ATOMIC_RELAXED);
 		for (i = 0; i < num; i++)
 			rte_pktmbuf_free(buf[i]);
 		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 	}
-	count += num;
 	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
-			__ATOMIC_ACQ_REL);
+			__ATOMIC_RELAXED);
 	rte_distributor_return_pkt(d, id, buf, num);
 	return 0;
 }
@@ -318,7 +313,6 @@ sanity_test_with_mbuf_alloc(struct worker_params *wp, struct rte_mempool *p)
 			rte_distributor_process(d, NULL, 0);
 		for (j = 0; j < BURST; j++) {
 			bufs[j]->hash.usr = (i+j) << 1;
-			rte_mbuf_refcnt_set(bufs[j], 1);
 		}
 
 		rte_distributor_process(d, bufs, BURST);
@@ -342,15 +336,10 @@ sanity_test_with_mbuf_alloc(struct worker_params *wp, struct rte_mempool *p)
 static int
 handle_work_for_shutdown_test(void *arg)
 {
-	struct rte_mbuf *pkt = NULL;
 	struct rte_mbuf *buf[8] __rte_cache_aligned;
 	struct worker_params *wp = arg;
 	struct rte_distributor *d = wp->dist;
-	unsigned int count = 0;
 	unsigned int num;
-	unsigned int total = 0;
-	unsigned int i;
-	unsigned int returned = 0;
 	unsigned int zero_id = 0;
 	unsigned int zero_unset;
 	const unsigned int id = __atomic_fetch_add(&worker_idx, 1,
@@ -368,11 +357,8 @@ handle_work_for_shutdown_test(void *arg)
 	/* wait for quit single globally, or for worker zero, wait
 	 * for zero_quit */
 	while (!quit && !(id == zero_id && zero_quit)) {
-		count += num;
 		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
-				__ATOMIC_ACQ_REL);
-		for (i = 0; i < num; i++)
-			rte_pktmbuf_free(buf[i]);
+				__ATOMIC_RELAXED);
 		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 
 		if (num > 0) {
@@ -381,15 +367,12 @@ handle_work_for_shutdown_test(void *arg)
 				false, __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE);
 		}
 		zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
-
-		total += num;
 	}
-	count += num;
-	returned = rte_distributor_return_pkt(d, id, buf, num);
-
 	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
-			__ATOMIC_ACQ_REL);
+			__ATOMIC_RELAXED);
 	if (id == zero_id) {
+		rte_distributor_return_pkt(d, id, NULL, 0);
+
 		/* for worker zero, allow it to restart to pick up last packet
 		 * when all workers are shutting down.
 		 */
@@ -400,15 +383,11 @@ handle_work_for_shutdown_test(void *arg)
 
 		while (!quit) {
 			__atomic_fetch_add(&worker_stats[id].handled_packets,
-					num, __ATOMIC_ACQ_REL);
-			count += num;
-			rte_pktmbuf_free(pkt);
+					num, __ATOMIC_RELAXED);
 			num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 		}
-		returned = rte_distributor_return_pkt(d,
-				id, buf, num);
-		printf("Num returned = %d\n", returned);
 	}
+	rte_distributor_return_pkt(d, id, buf, num);
 	return 0;
 }
 
@@ -424,7 +403,9 @@ sanity_test_with_worker_shutdown(struct worker_params *wp,
 {
 	struct rte_distributor *d = wp->dist;
 	struct rte_mbuf *bufs[BURST];
-	unsigned i;
+	struct rte_mbuf *bufs2[BURST];
+	unsigned int i;
+	unsigned int failed = 0;
 
 	printf("=== Sanity test of worker shutdown ===\n");
 
@@ -450,16 +431,17 @@ sanity_test_with_worker_shutdown(struct worker_params *wp,
 	 */
 
 	/* get more buffers to queue up, again setting them to the same flow */
-	if (rte_mempool_get_bulk(p, (void *)bufs, BURST) != 0) {
+	if (rte_mempool_get_bulk(p, (void *)bufs2, BURST) != 0) {
 		printf("line %d: Error getting mbufs from pool\n", __LINE__);
+		rte_mempool_put_bulk(p, (void *)bufs, BURST);
 		return -1;
 	}
 	for (i = 0; i < BURST; i++)
-		bufs[i]->hash.usr = 1;
+		bufs2[i]->hash.usr = 1;
 
 	/* get worker zero to quit */
 	zero_quit = 1;
-	rte_distributor_process(d, bufs, BURST);
+	rte_distributor_process(d, bufs2, BURST);
 
 	/* flush the distributor */
 	rte_distributor_flush(d);
@@ -468,15 +450,21 @@ sanity_test_with_worker_shutdown(struct worker_params *wp,
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
 			__atomic_load_n(&worker_stats[i].handled_packets,
-					__ATOMIC_ACQUIRE));
+					__ATOMIC_RELAXED));
 
 	if (total_packet_count() != BURST * 2) {
 		printf("Line %d: Error, not all packets flushed. "
 				"Expected %u, got %u\n",
 				__LINE__, BURST * 2, total_packet_count());
-		return -1;
+		failed = 1;
 	}
 
+	rte_mempool_put_bulk(p, (void *)bufs, BURST);
+	rte_mempool_put_bulk(p, (void *)bufs2, BURST);
+
+	if (failed)
+		return -1;
+
 	printf("Sanity test with worker shutdown passed\n\n");
 	return 0;
 }
@@ -490,7 +478,8 @@ test_flush_with_worker_shutdown(struct worker_params *wp,
 {
 	struct rte_distributor *d = wp->dist;
 	struct rte_mbuf *bufs[BURST];
-	unsigned i;
+	unsigned int i;
+	unsigned int failed = 0;
 
 	printf("=== Test flush fn with worker shutdown (%s) ===\n", wp->name);
 
@@ -522,15 +511,20 @@ test_flush_with_worker_shutdown(struct worker_params *wp,
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
 			__atomic_load_n(&worker_stats[i].handled_packets,
-					__ATOMIC_ACQUIRE));
+					__ATOMIC_RELAXED));
 
 	if (total_packet_count() != BURST) {
 		printf("Line %d: Error, not all packets flushed. "
 				"Expected %u, got %u\n",
 				__LINE__, BURST, total_packet_count());
-		return -1;
+		failed = 1;
 	}
 
+	rte_mempool_put_bulk(p, (void *)bufs, BURST);
+
+	if (failed)
+		return -1;
+
 	printf("Flush test with worker shutdown passed\n\n");
 	return 0;
 }
@@ -596,7 +590,10 @@ quit_workers(struct worker_params *wp, struct rte_mempool *p)
 	const unsigned num_workers = rte_lcore_count() - 1;
 	unsigned i;
 	struct rte_mbuf *bufs[RTE_MAX_LCORE];
-	rte_mempool_get_bulk(p, (void *)bufs, num_workers);
+	if (rte_mempool_get_bulk(p, (void *)bufs, num_workers) != 0) {
+		printf("line %d: Error getting mbufs from pool\n", __LINE__);
+		return;
+	}
 
 	zero_quit = 0;
 	quit = 1;
@@ -604,11 +601,12 @@ quit_workers(struct worker_params *wp, struct rte_mempool *p)
 		bufs[i]->hash.usr = i << 1;
 	rte_distributor_process(d, bufs, num_workers);
 
-	rte_mempool_put_bulk(p, (void *)bufs, num_workers);
-
 	rte_distributor_process(d, NULL, 0);
 	rte_distributor_flush(d);
 	rte_eal_mp_wait_lcore();
+
+	rte_mempool_put_bulk(p, (void *)bufs, num_workers);
+
 	quit = 0;
 	worker_idx = 0;
 	zero_idx = RTE_MAX_LCORE;
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v6 09/15] test/distributor: collect return mbufs
       [not found]                       ` <CGME20201009220247eucas1p1a783663e586127cbfd406a61e13c40eb@eucas1p1.samsung.com>
@ 2020-10-09 22:01                         ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-09 22:01 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

During the quit_workers function the distributor's main core processes
some packets to wake up pending worker cores so they can quit.
As quit_workers also acts as a cleanup procedure for the next test
case, it should also collect the packets returned by the workers'
handlers, so that the cyclic buffer with returned packets
in the distributor remains empty.

Fixes: c3eabff124e6 ("distributor: add unit tests")
Cc: bruce.richardson@intel.com
Fixes: c0de0eb82e40 ("distributor: switch over to new API")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 app/test/test_distributor.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index 06e01ff9d..ed03040d1 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -590,6 +590,7 @@ quit_workers(struct worker_params *wp, struct rte_mempool *p)
 	const unsigned num_workers = rte_lcore_count() - 1;
 	unsigned i;
 	struct rte_mbuf *bufs[RTE_MAX_LCORE];
+	struct rte_mbuf *returns[RTE_MAX_LCORE];
 	if (rte_mempool_get_bulk(p, (void *)bufs, num_workers) != 0) {
 		printf("line %d: Error getting mbufs from pool\n", __LINE__);
 		return;
@@ -605,6 +606,10 @@ quit_workers(struct worker_params *wp, struct rte_mempool *p)
 	rte_distributor_flush(d);
 	rte_eal_mp_wait_lcore();
 
+	while (rte_distributor_returned_pkts(d, returns, RTE_MAX_LCORE))
+		;
+
+	rte_distributor_clear_returns(d);
 	rte_mempool_put_bulk(p, (void *)bufs, num_workers);
 
 	quit = 0;
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v6 10/15] distributor: align API documentation with code
       [not found]                       ` <CGME20201009220248eucas1p156346857c1aab2340ccd7549abdce966@eucas1p1.samsung.com>
@ 2020-10-09 22:01                         ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-09 22:01 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

After the burst API was introduced, some artefacts from the
legacy single API remained in the API documentation.
Also, the return values of the rte_distributor_poll_pkt() function
did not match the implementation.
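
As an illustration of the documented return values, here is a minimal
worker-side polling sketch (not part of the patch). It assumes burst
mode (RTE_DIST_ALG_BURST), where -1 means the request is not yet
fulfilled, and already initialized d and worker_id variables:

	struct rte_mbuf *pkts[8] __rte_cache_aligned;
	int num;

	/* request new packets; no packets are returned here */
	rte_distributor_request_pkt(d, worker_id, NULL, 0);

	/* poll until the distributor fulfils the request */
	do {
		num = rte_distributor_poll_pkt(d, worker_id, pkts);
	} while (num < 0);
	/* pkts[0..num-1] are now ready for processing */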

Fixes: c0de0eb82e40 ("distributor: switch over to new API")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_distributor/rte_distributor.h | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/lib/librte_distributor/rte_distributor.h b/lib/librte_distributor/rte_distributor.h
index 327c0c4ab..a073e6461 100644
--- a/lib/librte_distributor/rte_distributor.h
+++ b/lib/librte_distributor/rte_distributor.h
@@ -155,7 +155,7 @@ rte_distributor_clear_returns(struct rte_distributor *d);
  * @param pkts
  *   The mbufs pointer array to be filled in (up to 8 packets)
  * @param oldpkt
- *   The previous packet, if any, being processed by the worker
+ *   The previous packets, if any, being processed by the worker
  * @param retcount
  *   The number of packets being returned
  *
@@ -187,15 +187,15 @@ rte_distributor_return_pkt(struct rte_distributor *d,
 
 /**
  * API called by a worker to request a new packet to process.
- * Any previous packet given to the worker is assumed to have completed
+ * Any previous packets given to the worker are assumed to have completed
  * processing, and may be optionally returned to the distributor via
  * the oldpkt parameter.
- * Unlike rte_distributor_get_pkt_burst(), this function does not wait for a
- * new packet to be provided by the distributor.
+ * Unlike rte_distributor_get_pkt(), this function does not wait for
+ * new packets to be provided by the distributor.
  *
- * NOTE: after calling this function, rte_distributor_poll_pkt_burst() should
- * be used to poll for the packet requested. The rte_distributor_get_pkt_burst()
- * API should *not* be used to try and retrieve the new packet.
+ * NOTE: after calling this function, rte_distributor_poll_pkt() should
+ * be used to poll for the packets requested. The rte_distributor_get_pkt()
+ * API should *not* be used to try and retrieve the new packets.
  *
  * @param d
  *   The distributor instance to be used
@@ -213,9 +213,9 @@ rte_distributor_request_pkt(struct rte_distributor *d,
 		unsigned int count);
 
 /**
- * API called by a worker to check for a new packet that was previously
+ * API called by a worker to check for new packets that were previously
  * requested by a call to rte_distributor_request_pkt(). It does not wait
- * for the new packet to be available, but returns NULL if the request has
+ * for the new packets to be available, but returns if the request has
  * not yet been fulfilled by the distributor.
  *
  * @param d
@@ -227,8 +227,9 @@ rte_distributor_request_pkt(struct rte_distributor *d,
  *   The array of mbufs being given to the worker
  *
  * @return
- *   The number of packets being given to the worker thread, zero if no
- *   packet is yet available.
+ *   The number of packets being given to the worker thread,
+ *   -1 if no packets are yet available (burst API - RTE_DIST_ALG_BURST)
+ *   0 if no packets are yet available (legacy single API - RTE_DIST_ALG_SINGLE)
  */
 int
 rte_distributor_poll_pkt(struct rte_distributor *d,
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v6 11/15] test/distributor: replace delays with spin locks
       [not found]                       ` <CGME20201009220250eucas1p18587737171d82a9bde52c767ee8ed24b@eucas1p1.samsung.com>
@ 2020-10-09 22:01                         ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-09 22:01 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

Instead of adding delays in the test code and hoping
that the workers reach the proper states,
synchronize the worker shutdown test cases with a spin lock
on an atomic variable.
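
Condensed from the diff below, the pattern is: worker zero publishes
its state through the new zero_sleep atomic flag, and the main lcore
spins on that flag instead of sleeping for a fixed period:

	/* worker zero, around its sleep loop: */
	__atomic_store_n(&zero_sleep, 1, __ATOMIC_RELEASE);
	while (zero_quit)
		usleep(100);
	__atomic_store_n(&zero_sleep, 0, __ATOMIC_RELEASE);

	/* main lcore, replacing a fixed rte_delay_us(10000): */
	while (!__atomic_load_n(&zero_sleep, __ATOMIC_ACQUIRE))
		rte_distributor_flush(d);
	zero_quit = 0;
	while (__atomic_load_n(&zero_sleep, __ATOMIC_ACQUIRE))
		rte_delay_us(100);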

Fixes: c0de0eb82e40 ("distributor: switch over to new API")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 app/test/test_distributor.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index ed03040d1..e8dd75078 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -27,6 +27,7 @@ struct worker_params worker_params;
 /* statics - all zero-initialized by default */
 static volatile int quit;      /**< general quit variable for all threads */
 static volatile int zero_quit; /**< var for when we just want thr0 to quit*/
+static volatile int zero_sleep; /**< thr0 has quit basic loop and is sleeping*/
 static volatile unsigned worker_idx;
 static volatile unsigned zero_idx;
 
@@ -376,8 +377,10 @@ handle_work_for_shutdown_test(void *arg)
 		/* for worker zero, allow it to restart to pick up last packet
 		 * when all workers are shutting down.
 		 */
+		__atomic_store_n(&zero_sleep, 1, __ATOMIC_RELEASE);
 		while (zero_quit)
 			usleep(100);
+		__atomic_store_n(&zero_sleep, 0, __ATOMIC_RELEASE);
 
 		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 
@@ -445,7 +448,12 @@ sanity_test_with_worker_shutdown(struct worker_params *wp,
 
 	/* flush the distributor */
 	rte_distributor_flush(d);
-	rte_delay_us(10000);
+	while (!__atomic_load_n(&zero_sleep, __ATOMIC_ACQUIRE))
+		rte_distributor_flush(d);
+
+	zero_quit = 0;
+	while (__atomic_load_n(&zero_sleep, __ATOMIC_ACQUIRE))
+		rte_delay_us(100);
 
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
@@ -505,9 +513,14 @@ test_flush_with_worker_shutdown(struct worker_params *wp,
 	/* flush the distributor */
 	rte_distributor_flush(d);
 
-	rte_delay_us(10000);
+	while (!__atomic_load_n(&zero_sleep, __ATOMIC_ACQUIRE))
+		rte_distributor_flush(d);
 
 	zero_quit = 0;
+
+	while (__atomic_load_n(&zero_sleep, __ATOMIC_ACQUIRE))
+		rte_delay_us(100);
+
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
 			__atomic_load_n(&worker_stats[i].handled_packets,
@@ -615,6 +628,8 @@ quit_workers(struct worker_params *wp, struct rte_mempool *p)
 	quit = 0;
 	worker_idx = 0;
 	zero_idx = RTE_MAX_LCORE;
+	zero_quit = 0;
+	zero_sleep = 0;
 }
 
 static int
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v6 12/15] distributor: fix scalar matching
       [not found]                       ` <CGME20201009220253eucas1p14078ab159186d2c26e787b3b2ed68062@eucas1p1.samsung.com>
@ 2020-10-09 22:01                         ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-09 22:01 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

Fix improper indexes used while comparing tags.
In the find_match_scalar() function:
* j iterates over the flow tags of incoming packets;
* w iterates over the backlog or in-flight tag positions
  (a small standalone sketch of the corrected orientation follows).
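
A small standalone sketch of the corrected orientation (a hypothetical
helper, not library code; "stored" stands for the worker's in-flight
or backlog tags):

static void
match_tags(const uint16_t *incoming, const uint16_t *stored,
		unsigned int n, uint16_t wkr_plus_one, uint16_t *out)
{
	unsigned int j, w;

	for (j = 0; j < n; j++)         /* j: incoming packet tags */
		for (w = 0; w < n; w++) /* w: tags held by the worker */
			if (stored[w] == incoming[j]) {
				out[j] = wkr_plus_one; /* bind packet j */
				break;
			}
}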

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_distributor/rte_distributor.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 9fd7dcab7..4bd23a990 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -261,13 +261,13 @@ find_match_scalar(struct rte_distributor *d,
 
 		for (j = 0; j < RTE_DIST_BURST_SIZE ; j++)
 			for (w = 0; w < RTE_DIST_BURST_SIZE; w++)
-				if (d->in_flight_tags[i][j] == data_ptr[w]) {
+				if (d->in_flight_tags[i][w] == data_ptr[j]) {
 					output_ptr[j] = i+1;
 					break;
 				}
 		for (j = 0; j < RTE_DIST_BURST_SIZE; j++)
 			for (w = 0; w < RTE_DIST_BURST_SIZE; w++)
-				if (bl->tags[j] == data_ptr[w]) {
+				if (bl->tags[w] == data_ptr[j]) {
 					output_ptr[j] = i+1;
 					break;
 				}
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v6 13/15] test/distributor: add test with packets marking
       [not found]                       ` <CGME20201009220253eucas1p2c0e27c3a495cb9603102b2cbf8a8f706@eucas1p2.samsung.com>
@ 2020-10-09 22:02                         ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-09 22:02 UTC (permalink / raw)
  To: David Hunt; +Cc: dev, l.wojciechow

All of the former tests analyzed only the statistics
of packets processed by all workers.
The new test also verifies whether packets are processed
by the workers they were expected to go to.
Every packet processed by a worker is marked
and analyzed after it is returned to the distributor.

This test allows finding issues in the matching algorithms.

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 app/test/test_distributor.c | 141 ++++++++++++++++++++++++++++++++++++
 1 file changed, 141 insertions(+)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index e8dd75078..4fc10b3cc 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -542,6 +542,141 @@ test_flush_with_worker_shutdown(struct worker_params *wp,
 	return 0;
 }
 
+static int
+handle_and_mark_work(void *arg)
+{
+	struct rte_mbuf *buf[8] __rte_cache_aligned;
+	struct worker_params *wp = arg;
+	struct rte_distributor *db = wp->dist;
+	unsigned int num, i;
+	unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED);
+	num = rte_distributor_get_pkt(db, id, buf, NULL, 0);
+	while (!quit) {
+		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
+				__ATOMIC_RELAXED);
+		for (i = 0; i < num; i++)
+			buf[i]->udata64 += id + 1;
+		num = rte_distributor_get_pkt(db, id,
+				buf, buf, num);
+	}
+	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
+			__ATOMIC_RELAXED);
+	rte_distributor_return_pkt(db, id, buf, num);
+	return 0;
+}
+
+/* sanity_mark_test sends packets to workers which mark them.
+ * Every packet has also encoded sequence number.
+ * The returned packets are sorted and verified if they were handled
+ * by proper workers.
+ */
+static int
+sanity_mark_test(struct worker_params *wp, struct rte_mempool *p)
+{
+	const unsigned int buf_count = 24;
+	const unsigned int burst = 8;
+	const unsigned int shift = 12;
+	const unsigned int seq_shift = 10;
+
+	struct rte_distributor *db = wp->dist;
+	struct rte_mbuf *bufs[buf_count];
+	struct rte_mbuf *returns[buf_count];
+	unsigned int i, count, id;
+	unsigned int sorted[buf_count], seq;
+	unsigned int failed = 0;
+
+	printf("=== Marked packets test ===\n");
+	clear_packet_count();
+	if (rte_mempool_get_bulk(p, (void *)bufs, buf_count) != 0) {
+		printf("line %d: Error getting mbufs from pool\n", __LINE__);
+		return -1;
+	}
+
+	/* bufs' hashes will be like these below, but shifted left.
+	 * The shifting is for avoiding collisions with backlogs
+	 * and in-flight tags left by previous tests.
+	 * [1, 1, 1, 1, 1, 1, 1, 1
+	 *  1, 1, 1, 1, 2, 2, 2, 2
+	 *  2, 2, 2, 2, 1, 1, 1, 1]
+	 */
+	for (i = 0; i < burst; i++) {
+		bufs[0 * burst + i]->hash.usr = 1 << shift;
+		bufs[1 * burst + i]->hash.usr = ((i < burst / 2) ? 1 : 2)
+			<< shift;
+		bufs[2 * burst + i]->hash.usr = ((i < burst / 2) ? 2 : 1)
+			<< shift;
+	}
+	/* Assign a sequence number to each packet. The sequence is shifted,
+	 * so that lower bits of the udata64 will hold the mark from the worker.
+	 */
+	for (i = 0; i < buf_count; i++)
+		bufs[i]->udata64 = i << seq_shift;
+
+	count = 0;
+	for (i = 0; i < buf_count/burst; i++) {
+		rte_distributor_process(db, &bufs[i * burst], burst);
+		count += rte_distributor_returned_pkts(db, &returns[count],
+			buf_count - count);
+	}
+
+	do {
+		rte_distributor_flush(db);
+		count += rte_distributor_returned_pkts(db, &returns[count],
+			buf_count - count);
+	} while (count < buf_count);
+
+	for (i = 0; i < rte_lcore_count() - 1; i++)
+		printf("Worker %u handled %u packets\n", i,
+			__atomic_load_n(&worker_stats[i].handled_packets,
+					__ATOMIC_RELAXED));
+
+	/* Sort returned packets by sent order (sequence numbers). */
+	for (i = 0; i < buf_count; i++) {
+		seq = returns[i]->udata64 >> seq_shift;
+		id = returns[i]->udata64 - (seq << seq_shift);
+		sorted[seq] = id;
+	}
+
+	/* Verify that packets [0-11] and [20-23] were processed
+	 * by the same worker
+	 */
+	for (i = 1; i < 12; i++) {
+		if (sorted[i] != sorted[0]) {
+			printf("Packet number %u processed by worker %u,"
+				" but should be processes by worker %u\n",
+				i, sorted[i], sorted[0]);
+			failed = 1;
+		}
+	}
+	for (i = 20; i < 24; i++) {
+		if (sorted[i] != sorted[0]) {
+			printf("Packet number %u processed by worker %u,"
+				" but should be processes by worker %u\n",
+				i, sorted[i], sorted[0]);
+			failed = 1;
+		}
+	}
+	/* And verify that packets [12-19] were processed
+	 * by another worker
+	 */
+	for (i = 13; i < 20; i++) {
+		if (sorted[i] != sorted[12]) {
+			printf("Packet number %u processed by worker %u,"
+				" but should be processes by worker %u\n",
+				i, sorted[i], sorted[12]);
+			failed = 1;
+		}
+	}
+
+	rte_mempool_put_bulk(p, (void *)bufs, buf_count);
+
+	if (failed)
+		return -1;
+
+	printf("Marked packets test passed\n");
+	return 0;
+}
+
 static
 int test_error_distributor_create_name(void)
 {
@@ -726,6 +861,12 @@ test_distributor(void)
 				goto err;
 			quit_workers(&worker_params, p);
 
+			rte_eal_mp_remote_launch(handle_and_mark_work,
+					&worker_params, SKIP_MASTER);
+			if (sanity_mark_test(&worker_params, p) < 0)
+				goto err;
+			quit_workers(&worker_params, p);
+
 		} else {
 			printf("Too few cores to run worker shutdown test\n");
 		}
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v6 14/15] distributor: fix flushing in flight packets
       [not found]                       ` <CGME20201009220254eucas1p187bad9a066f00ee4c05ec6ca7fb4decd@eucas1p1.samsung.com>
@ 2020-10-09 22:02                         ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-09 22:02 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

rte_distributor_flush() uses the total_outstanding()
function to determine whether it should keep waiting
for packets being processed. However, in burst mode
only backlog packets were counted.

This patch fixes that issue by also counting in-flight
packets. There are also some fixes to properly keep
count of in-flight packets for each worker in bufs[].count.

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_distributor/rte_distributor.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 4bd23a990..2478de3b7 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -467,6 +467,7 @@ rte_distributor_process(struct rte_distributor *d,
 			/* Sync with worker on GET_BUF flag. */
 			if (__atomic_load_n(&(d->bufs[wid].bufptr64[0]),
 				__ATOMIC_ACQUIRE) & RTE_DISTRIB_GET_BUF) {
+				d->bufs[wid].count = 0;
 				release(d, wid);
 				handle_returns(d, wid);
 			}
@@ -481,11 +482,6 @@ rte_distributor_process(struct rte_distributor *d,
 		uint16_t matches[RTE_DIST_BURST_SIZE];
 		unsigned int pkts;
 
-		/* Sync with worker on GET_BUF flag. */
-		if (__atomic_load_n(&(d->bufs[wkr].bufptr64[0]),
-			__ATOMIC_ACQUIRE) & RTE_DISTRIB_GET_BUF)
-			d->bufs[wkr].count = 0;
-
 		if ((num_mbufs - next_idx) < RTE_DIST_BURST_SIZE)
 			pkts = num_mbufs - next_idx;
 		else
@@ -605,8 +601,10 @@ rte_distributor_process(struct rte_distributor *d,
 	for (wid = 0 ; wid < d->num_workers; wid++)
 		/* Sync with worker on GET_BUF flag. */
 		if ((__atomic_load_n(&(d->bufs[wid].bufptr64[0]),
-			__ATOMIC_ACQUIRE) & RTE_DISTRIB_GET_BUF))
+			__ATOMIC_ACQUIRE) & RTE_DISTRIB_GET_BUF)) {
+			d->bufs[wid].count = 0;
 			release(d, wid);
+		}
 
 	return num_mbufs;
 }
@@ -649,7 +647,7 @@ total_outstanding(const struct rte_distributor *d)
 	unsigned int wkr, total_outstanding = 0;
 
 	for (wkr = 0; wkr < d->num_workers; wkr++)
-		total_outstanding += d->backlog[wkr].count;
+		total_outstanding += d->backlog[wkr].count + d->bufs[wkr].count;
 
 	return total_outstanding;
 }
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v6 15/15] distributor: fix clearing returns buffer
       [not found]                       ` <CGME20201009220255eucas1p1e7a286684291e586ebb22cb0a2117e50@eucas1p1.samsung.com>
@ 2020-10-09 22:02                         ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-09 22:02 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

The patch clears the distributor's returns buffer
in clear_returns() by setting start and count to 0.

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_distributor/rte_distributor.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 2478de3b7..57240304a 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -704,6 +704,8 @@ rte_distributor_clear_returns(struct rte_distributor *d)
 		/* Sync with worker. Release retptrs. */
 		__atomic_store_n(&(d->bufs[wkr].retptr64[0]), 0,
 				__ATOMIC_RELEASE);
+
+	d->returns.start = d->returns.count = 0;
 }
 
 /* creates a distributor instance */
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v5 00/15] fix distributor synchronization issues
  2020-10-09 21:41                         ` Lukasz Wojciechowski
@ 2020-10-09 23:25                           ` Lukasz Wojciechowski
  2020-10-10  8:12                             ` David Marchand
  0 siblings, 1 reply; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-09 23:25 UTC (permalink / raw)
  To: David Marchand; +Cc: David Hunt, Honnappa Nagarahalli, dev, Sarosh Arif


On 09.10.2020 at 23:41, Lukasz Wojciechowski wrote:
>
> Hi David,
>
> On 09.10.2020 at 14:53, David Marchand wrote:
>> Hello Lukasz,
>>
>> On Thu, Oct 8, 2020 at 11:17 PM Lukasz Wojciechowski
>> <l.wojciechow@partner.samsung.com>  wrote:
>>> I'm here if you have any questions or suggestions
>> Unfortunately, I can see a timeout on the distributor autotest in Travis:
>> https://travis-ci.com/github/ovsrobot/dpdk/jobs/396703415#L1151
>>
>> Can you have a look?
> I took a look, but I don't know the cause of the test hanging and timing out.
> I ran more than 200000 iterations of the distributor tests today and
> didn't get a single failure or lockup.
> David Hunt also ran the series' tests today, when checking the impact on
> performance, and I believe he didn't hit the issue.
> @DavidHunt, am I right?
>
> The failure happened in only one configuration, and the tests were run by
> Travis using different compilers, architectures, etc.
>
> The test did not write anything to stdout or stderr:
> --- stdout ---
> EAL: Probing VFIO support...
> APP: HPET is not enabled, using TSC as default timer
> RTE>>distributor_autotest
> --- stderr ---
> EAL: Detected 2 lcore(s)
> EAL: Detected 1 NUMA nodes
> EAL: Multi-process socket /var/run/dpdk/distributor_autotest/mp_socket
> EAL: Selected IOVA mode 'PA'
> EAL: No available hugepages reported in hugepages-1048576kB
> -------
> That's quite strange, because the first test that runs, sanity_test,
> starts by printing information about its start.
>
> Before that there is only the initialization code of the distributor
> structure and the creation of the mempool.
>
> The only modifications I made to the initialization of the distributor
> structure were the initialization of its active and activesum fields:
>
>     memset(d->active, 0, sizeof(d->active));
>     d->activesum = 0;
>
> That seems not to be the reason.
>
> I don't know what it could be.
>
>
> Is there a way to trigger the Travis job manually to see if the timeout
> reproduces?
>
>> Btw, did you receive a notification about this from the robot?
> Yes, I got it.
> But I interpreted it badly. I downloaded the log and started reading it
> from the end, and when I saw:
>
>     Compiler stderr:^M
>      /usr/bin/ld: cannot find -lvirt^M
>     collect2: error: ld returned 1 exit status^M
>
>  I thought that was it. Sorry for that.
>
>
> BTW I'm going to publish v6 with changes suggested by Honnappa 
> Nagarahalli (RELAXED memory mode) and David Hunt (indentations)
>

More bad news: the same issue just appeared on Travis for v6.
The good news is that we can reproduce it.

Is there a way to delegate a job to Travis other than by sending a new
patch version?

> Best regards
>
> Lukasz
>
> -- 
> Lukasz Wojciechowski
> Principal Software Engineer
>
> Samsung R&D Institute Poland
> Samsung Electronics
> Office +48 22 377 88 25
> l.wojciechow@partner.samsung.com
>
-- 
Lukasz Wojciechowski
Principal Software Engineer

Samsung R&D Institute Poland
Samsung Electronics
Office +48 22 377 88 25
l.wojciechow@partner.samsung.com


^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v5 00/15] fix distributor synchronization issues
  2020-10-09 23:25                           ` Lukasz Wojciechowski
@ 2020-10-10  8:12                             ` David Marchand
  2020-10-10  8:15                               ` David Marchand
  2020-10-10 16:27                               ` Lukasz Wojciechowski
  0 siblings, 2 replies; 164+ messages in thread
From: David Marchand @ 2020-10-10  8:12 UTC (permalink / raw)
  To: Lukasz Wojciechowski; +Cc: David Hunt, Honnappa Nagarahalli, dev, Sarosh Arif

Hello Lukasz,

On Sat, Oct 10, 2020 at 1:26 AM Lukasz Wojciechowski
<l.wojciechow@partner.samsung.com> wrote:
> On 09.10.2020 at 23:41, Lukasz Wojciechowski wrote:
> More bad news: the same issue just appeared on Travis for v6.
> The good news is that we can reproduce it.
>
> Is there a way to delegate a job to Travis other than by sending a new patch version?

You just need to fork DPDK on GitHub, then set up Travis.
Travis will get triggered on push.
I can help off-list if needed.


-- 
David Marchand


^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v5 00/15] fix distributor synchronization issues
  2020-10-10  8:12                             ` David Marchand
@ 2020-10-10  8:15                               ` David Marchand
  2020-10-10 16:27                               ` Lukasz Wojciechowski
  1 sibling, 0 replies; 164+ messages in thread
From: David Marchand @ 2020-10-10  8:15 UTC (permalink / raw)
  To: Lukasz Wojciechowski; +Cc: David Hunt, Honnappa Nagarahalli, dev, Sarosh Arif

On Sat, Oct 10, 2020 at 10:12 AM David Marchand
<david.marchand@redhat.com> wrote:
>
> Hello Lukasz,
>
> On Sat, Oct 10, 2020 at 1:26 AM Lukasz Wojciechowski
> <l.wojciechow@partner.samsung.com> wrote:
> > On 09.10.2020 at 23:41, Lukasz Wojciechowski wrote:
> > More bad news: the same issue just appeared on Travis for v6.
> > The good news is that we can reproduce it.
> >
> > Is there a way to delegate a job to Travis other than by sending a new patch version?
>
> You just need to fork DPDK on GitHub, then set up Travis.

Forgot to paste it:
https://docs.travis-ci.com/user/tutorial/#to-get-started-with-travis-ci-using-github

> Travis will get triggered on push.
> I can help off-list if needed.


-- 
David Marchand


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v7 00/16] fix distributor synchronization issues
       [not found]                       ` <CGME20201010160513eucas1p1fbacf1f82c40d65aef40634f245c4206@eucas1p1.samsung.com>
@ 2020-10-10 16:04                         ` Lukasz Wojciechowski
       [not found]                           ` <CGME20201010160515eucas1p18003d01d8217cdf04be3cba2e32f969f@eucas1p1.samsung.com>
                                             ` (16 more replies)
  0 siblings, 17 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-10 16:04 UTC (permalink / raw)
  Cc: dev, l.wojciechow

During review and verification of the patch created by Sarosh Arif:
"test_distributor: prevent memory leakages from the pool" I found out
that running distributor unit tests multiple times in a row causes fails.
So I investigated all the issues I found.

There are few synchronization issues that might cause deadlocks
or corrupted data. They are fixed with this set of patches for both tests
and librte_distributor library.

---
v7:
* add patch 16 ensuring that tests keep trying to send packets until workers
    are started and have requested packets

v6:
* fix comments indentation
* fix stats atomic operations memory mode from ACQUIRE/RELEASE
    to RELAXED

v5:
* implement missing functionality in burst mode - worker shutdown
* fix shutdown test to always shut down the busy worker
* use atomic stores instead of barrier in tests clear_packet_count()
* reorder patches
* new patch 7: fix call to return_pkt in single mode
* new patch 11: replacing delays with spinlock on atomics in tests
* new patch 12: fix scalar matching algorithm
* new patch 13: new test with marking and checking every packet
* new patch 14: flush also in flight packets
* new patch 15: fix clearing returns buffer
* minor fixes in other patches

v4:
* adjust commit name prefixes app/test -> test/distributor:
* reorder patches
* use NULL oldpkt in rte_distributor_get_pkt() calls in tests

v3:
* add missing acked and tested by statements from v1

v2:
* assign NULL to freed mbufs in distributor test
* fix handshake check on legacy single distributor
     rte_distributor_return_pkt_single()
* add patch 7 passing NULL to legacy API calls if no bufs are returned
* add patch 8 fixing API documentation


Lukasz Wojciechowski (16):
  distributor: fix missing handshake synchronization
  distributor: fix handshake deadlock
  distributor: do not use oldpkt when not needed
  distributor: handle worker shutdown in burst mode
  test/distributor: fix shutdown of busy worker
  test/distributor: synchronize lcores statistics
  distributor: fix return pkt calls in single mode
  test/distributor: fix freeing mbufs
  test/distributor: collect return mbufs
  distributor: align API documentation with code
  test/distributor: replace delays with spin locks
  distributor: fix scalar matching
  test/distributor: add test with packets marking
  distributor: fix flushing in flight packets
  distributor: fix clearing returns buffer
  test/distributor: ensure all packets are delivered

 app/test/test_distributor.c                   | 347 ++++++++++++++----
 lib/librte_distributor/distributor_private.h  |   3 +
 lib/librte_distributor/rte_distributor.c      | 219 ++++++++---
 lib/librte_distributor/rte_distributor.h      |  23 +-
 .../rte_distributor_single.c                  |   4 +
 5 files changed, 471 insertions(+), 125 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v7 01/16] distributor: fix missing handshake synchronization
       [not found]                           ` <CGME20201010160515eucas1p18003d01d8217cdf04be3cba2e32f969f@eucas1p1.samsung.com>
@ 2020-10-10 16:04                             ` Lukasz Wojciechowski
  2020-10-15 23:47                               ` Honnappa Nagarahalli
  0 siblings, 1 reply; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-10 16:04 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

The rte_distributor_return_pkt function, which is run on worker cores,
must wait for the distributor core to clear the handshake on retptr64
before using those buffers. While the handshake is set, the distributor
core controls the buffers, and any operation on the worker side might
overwrite buffers that have not been read yet.
The same situation appears in the legacy single distributor. The
rte_distributor_return_pkt_single function shouldn't modify bufptr64
until the handshake on it is cleared by the distributor lcore.

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_distributor/rte_distributor.c        | 14 ++++++++++++++
 lib/librte_distributor/rte_distributor_single.c |  4 ++++
 2 files changed, 18 insertions(+)

diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 1c047f065..89493c331 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -160,6 +160,7 @@ rte_distributor_return_pkt(struct rte_distributor *d,
 {
 	struct rte_distributor_buffer *buf = &d->bufs[worker_id];
 	unsigned int i;
+	volatile int64_t *retptr64;
 
 	if (unlikely(d->alg_type == RTE_DIST_ALG_SINGLE)) {
 		if (num == 1)
@@ -169,6 +170,19 @@ rte_distributor_return_pkt(struct rte_distributor *d,
 			return -EINVAL;
 	}
 
+	retptr64 = &(buf->retptr64[0]);
+	/* Spin while handshake bits are set (scheduler clears it).
+	 * Sync with worker on GET_BUF flag.
+	 */
+	while (unlikely(__atomic_load_n(retptr64, __ATOMIC_ACQUIRE)
+			& RTE_DISTRIB_GET_BUF)) {
+		rte_pause();
+		uint64_t t = rte_rdtsc()+100;
+
+		while (rte_rdtsc() < t)
+			rte_pause();
+	}
+
 	/* Sync with distributor to acquire retptrs */
 	__atomic_thread_fence(__ATOMIC_ACQUIRE);
 	for (i = 0; i < RTE_DIST_BURST_SIZE; i++)
diff --git a/lib/librte_distributor/rte_distributor_single.c b/lib/librte_distributor/rte_distributor_single.c
index abaf7730c..f4725b1d0 100644
--- a/lib/librte_distributor/rte_distributor_single.c
+++ b/lib/librte_distributor/rte_distributor_single.c
@@ -74,6 +74,10 @@ rte_distributor_return_pkt_single(struct rte_distributor_single *d,
 	union rte_distributor_buffer_single *buf = &d->bufs[worker_id];
 	uint64_t req = (((int64_t)(uintptr_t)oldpkt) << RTE_DISTRIB_FLAG_BITS)
 			| RTE_DISTRIB_RETURN_BUF;
+	while (unlikely(__atomic_load_n(&buf->bufptr64, __ATOMIC_RELAXED)
+			& RTE_DISTRIB_FLAGS_MASK))
+		rte_pause();
+
 	/* Sync with distributor on RETURN_BUF flag. */
 	__atomic_store_n(&(buf->bufptr64), req, __ATOMIC_RELEASE);
 	return 0;
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v7 02/16] distributor: fix handshake deadlock
       [not found]                           ` <CGME20201010160517eucas1p2141c0bb6097a05aa99ed8efdf5fb7512@eucas1p2.samsung.com>
@ 2020-10-10 16:04                             ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-10 16:04 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

Synchronization of the data exchange between distributor and worker
cores is based on 2 handshakes: retptr64 for returning mbufs from
workers to the distributor, and bufptr64 for passing mbufs to workers.

Without a proper order of verifying those 2 handshakes, a deadlock may
occur. This can happen when a worker core wants to return mbufs
and waits for the retptr handshake to be cleared, while the distributor
core waits for bufptr to send mbufs to the worker.

This can happen because a worker core first returns mbufs to the
distributor and only later gets new mbufs, while the distributor first
releases mbufs to the worker and only later handles the returned packets.

This patch removes the possibility of the deadlock by always taking
care of returned packets first on the distributor side, and by handling
returns while waiting to release new packets.
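
Condensed from the diff below, the distributor-side wait now keeps
servicing the return handshake while it spins, so a worker blocked
on retptr64 can always make progress:

	handle_returns(d, wkr);

	/* Sync with worker on GET_BUF flag */
	while (!(__atomic_load_n(&(d->bufs[wkr].bufptr64[0]),
		__ATOMIC_ACQUIRE) & RTE_DISTRIB_GET_BUF)) {
		handle_returns(d, wkr); /* keep returns moving */
		rte_pause();
	}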

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_distributor/rte_distributor.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 89493c331..12b3db33c 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -321,12 +321,14 @@ release(struct rte_distributor *d, unsigned int wkr)
 	struct rte_distributor_buffer *buf = &(d->bufs[wkr]);
 	unsigned int i;
 
+	handle_returns(d, wkr);
+
 	/* Sync with worker on GET_BUF flag */
 	while (!(__atomic_load_n(&(d->bufs[wkr].bufptr64[0]), __ATOMIC_ACQUIRE)
-		& RTE_DISTRIB_GET_BUF))
+		& RTE_DISTRIB_GET_BUF)) {
+		handle_returns(d, wkr);
 		rte_pause();
-
-	handle_returns(d, wkr);
+	}
 
 	buf->count = 0;
 
@@ -376,6 +378,7 @@ rte_distributor_process(struct rte_distributor *d,
 		/* Flush out all non-full cache-lines to workers. */
 		for (wid = 0 ; wid < d->num_workers; wid++) {
 			/* Sync with worker on GET_BUF flag. */
+			handle_returns(d, wid);
 			if (__atomic_load_n(&(d->bufs[wid].bufptr64[0]),
 				__ATOMIC_ACQUIRE) & RTE_DISTRIB_GET_BUF) {
 				release(d, wid);
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v7 03/16] distributor: do not use oldpkt when not needed
       [not found]                           ` <CGME20201010160523eucas1p19287c5bf3b7e2818c730ae23f514853f@eucas1p1.samsung.com>
@ 2020-10-10 16:04                             ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-10 16:04 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

rte_distributor_request_pkt and rte_distributor_get_pkt dereferenced
the oldpkt parameter in RTE_DIST_ALG_SINGLE mode even if the number
of buffers returned from worker to distributor was 0.

This patch passes NULL to the legacy API when the number of returned
buffers is 0. This allows passing NULL as the oldpkt parameter.

The distributor tests are also updated to pass NULL as oldpkt and
0 as the number of returned packets wherever no packets are returned.
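
With this fix a worker that has nothing to return can request packets
the same way in both modes, for example:

	num = rte_distributor_get_pkt(d, id, buf, NULL, 0);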

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 app/test/test_distributor.c              | 28 +++++++++---------------
 lib/librte_distributor/rte_distributor.c |  4 ++--
 2 files changed, 12 insertions(+), 20 deletions(-)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index ba1f81cf8..52230d250 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -62,13 +62,10 @@ handle_work(void *arg)
 	struct rte_mbuf *buf[8] __rte_cache_aligned;
 	struct worker_params *wp = arg;
 	struct rte_distributor *db = wp->dist;
-	unsigned int count = 0, num = 0;
+	unsigned int count = 0, num;
 	unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED);
-	int i;
 
-	for (i = 0; i < 8; i++)
-		buf[i] = NULL;
-	num = rte_distributor_get_pkt(db, id, buf, buf, num);
+	num = rte_distributor_get_pkt(db, id, buf, NULL, 0);
 	while (!quit) {
 		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
 				__ATOMIC_RELAXED);
@@ -272,19 +269,16 @@ handle_work_with_free_mbufs(void *arg)
 	struct rte_distributor *d = wp->dist;
 	unsigned int count = 0;
 	unsigned int i;
-	unsigned int num = 0;
+	unsigned int num;
 	unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED);
 
-	for (i = 0; i < 8; i++)
-		buf[i] = NULL;
-	num = rte_distributor_get_pkt(d, id, buf, buf, num);
+	num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 	while (!quit) {
 		worker_stats[id].handled_packets += num;
 		count += num;
 		for (i = 0; i < num; i++)
 			rte_pktmbuf_free(buf[i]);
-		num = rte_distributor_get_pkt(d,
-				id, buf, buf, num);
+		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 	}
 	worker_stats[id].handled_packets += num;
 	count += num;
@@ -342,14 +336,14 @@ handle_work_for_shutdown_test(void *arg)
 	struct worker_params *wp = arg;
 	struct rte_distributor *d = wp->dist;
 	unsigned int count = 0;
-	unsigned int num = 0;
+	unsigned int num;
 	unsigned int total = 0;
 	unsigned int i;
 	unsigned int returned = 0;
 	const unsigned int id = __atomic_fetch_add(&worker_idx, 1,
 			__ATOMIC_RELAXED);
 
-	num = rte_distributor_get_pkt(d, id, buf, buf, num);
+	num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 
 	/* wait for quit single globally, or for worker zero, wait
 	 * for zero_quit */
@@ -358,8 +352,7 @@ handle_work_for_shutdown_test(void *arg)
 		count += num;
 		for (i = 0; i < num; i++)
 			rte_pktmbuf_free(buf[i]);
-		num = rte_distributor_get_pkt(d,
-				id, buf, buf, num);
+		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 		total += num;
 	}
 	worker_stats[id].handled_packets += num;
@@ -373,14 +366,13 @@ handle_work_for_shutdown_test(void *arg)
 		while (zero_quit)
 			usleep(100);
 
-		num = rte_distributor_get_pkt(d,
-				id, buf, buf, num);
+		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 
 		while (!quit) {
 			worker_stats[id].handled_packets += num;
 			count += num;
 			rte_pktmbuf_free(pkt);
-			num = rte_distributor_get_pkt(d, id, buf, buf, num);
+			num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 		}
 		returned = rte_distributor_return_pkt(d,
 				id, buf, num);
diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 12b3db33c..b720abe03 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -42,7 +42,7 @@ rte_distributor_request_pkt(struct rte_distributor *d,
 
 	if (unlikely(d->alg_type == RTE_DIST_ALG_SINGLE)) {
 		rte_distributor_request_pkt_single(d->d_single,
-			worker_id, oldpkt[0]);
+			worker_id, count ? oldpkt[0] : NULL);
 		return;
 	}
 
@@ -134,7 +134,7 @@ rte_distributor_get_pkt(struct rte_distributor *d,
 	if (unlikely(d->alg_type == RTE_DIST_ALG_SINGLE)) {
 		if (return_count <= 1) {
 			pkts[0] = rte_distributor_get_pkt_single(d->d_single,
-				worker_id, oldpkt[0]);
+				worker_id, return_count ? oldpkt[0] : NULL);
 			return (pkts[0]) ? 1 : 0;
 		} else
 			return -EINVAL;
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v7 04/16] distributor: handle worker shutdown in burst mode
       [not found]                           ` <CGME20201010160525eucas1p2314810086b9dd1c8cddf90eabe800363@eucas1p2.samsung.com>
@ 2020-10-10 16:04                             ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-10 16:04 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

The burst version of the distributor implementation was missing proper
handling of worker shutdown. A worker processing packets received
from the distributor can call the rte_distributor_return_pkt() function
to inform the distributor that it wants no more packets. Subsequent
calls to rte_distributor_request_pkt() or rte_distributor_get_pkt(),
however, should inform the distributor that new packets are requested
again.

Because this was not implemented, even after a worker informed the
distributor that it had returned its last packets, new packets were
still sent to it by the distributor, causing deadlocks as no one could
retrieve them on the worker side.

This patch adds handling of worker shutdown in the following way
(a worker-side usage sketch follows the list):
1) It fixes the usage of the RTE_DISTRIB_VALID_BUF handshake flag. This
flag was formerly unused in the burst implementation; now it is used
for marking valid packets in retptr64, replacing the invalid use
of the RTE_DISTRIB_RETURN_BUF flag.
2) It uses RTE_DISTRIB_RETURN_BUF as a worker-to-distributor handshake
in retptr64 to indicate that the worker has shut down.
3) A worker that shuts down also blocks bufptr64 for itself with the
RTE_DISTRIB_RETURN_BUF flag, allowing the distributor to retrieve any
in-flight packets.
4) When the distributor receives information about the shutdown of a
worker, it: marks the worker as not active; retrieves any in-flight and
backlog packets and reprocesses them to different workers; and unlocks
bufptr64 by clearing the RTE_DISTRIB_RETURN_BUF flag, allowing future
use if the worker requests new packets again.
5) It does not allow sending, or adding to the backlog, any packets for
workers that are not active. Such workers are also ignored if matched.
6) It adjusts calls to handle_returns() and the tag matching procedure
to react to possible activation and deactivation of workers.
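
As mentioned above, here is a minimal sketch of a worker using these
semantics (should_stop() is a hypothetical application predicate;
packet processing and error handling are omitted):

static int
worker_fn(struct rte_distributor *d, unsigned int id)
{
	struct rte_mbuf *buf[8] __rte_cache_aligned;
	unsigned int num;

	num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
	while (!should_stop()) {
		/* ... process buf[0..num-1] ... */
		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
	}
	/* Return the packets still held and signal shutdown; the
	 * distributor reassigns this worker's in-flight and backlog
	 * packets to the remaining active workers.
	 */
	return rte_distributor_return_pkt(d, id, buf, num);
}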

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_distributor/distributor_private.h |   3 +
 lib/librte_distributor/rte_distributor.c     | 175 +++++++++++++++----
 2 files changed, 146 insertions(+), 32 deletions(-)

diff --git a/lib/librte_distributor/distributor_private.h b/lib/librte_distributor/distributor_private.h
index 489aef2ac..689fe3e18 100644
--- a/lib/librte_distributor/distributor_private.h
+++ b/lib/librte_distributor/distributor_private.h
@@ -155,6 +155,9 @@ struct rte_distributor {
 	enum rte_distributor_match_function dist_match_fn;
 
 	struct rte_distributor_single *d_single;
+
+	uint8_t active[RTE_DISTRIB_MAX_WORKERS];
+	uint8_t activesum;
 };
 
 void
diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index b720abe03..115443fc0 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -51,7 +51,7 @@ rte_distributor_request_pkt(struct rte_distributor *d,
 	 * Sync with worker on GET_BUF flag.
 	 */
 	while (unlikely(__atomic_load_n(retptr64, __ATOMIC_ACQUIRE)
-			& RTE_DISTRIB_GET_BUF)) {
+			& (RTE_DISTRIB_GET_BUF | RTE_DISTRIB_RETURN_BUF))) {
 		rte_pause();
 		uint64_t t = rte_rdtsc()+100;
 
@@ -67,11 +67,11 @@ rte_distributor_request_pkt(struct rte_distributor *d,
 	for (i = count; i < RTE_DIST_BURST_SIZE; i++)
 		buf->retptr64[i] = 0;
 
-	/* Set Return bit for each packet returned */
+	/* Set VALID_BUF bit for each packet returned */
 	for (i = count; i-- > 0; )
 		buf->retptr64[i] =
 			(((int64_t)(uintptr_t)(oldpkt[i])) <<
-			RTE_DISTRIB_FLAG_BITS) | RTE_DISTRIB_RETURN_BUF;
+			RTE_DISTRIB_FLAG_BITS) | RTE_DISTRIB_VALID_BUF;
 
 	/*
 	 * Finally, set the GET_BUF  to signal to distributor that cache
@@ -97,11 +97,13 @@ rte_distributor_poll_pkt(struct rte_distributor *d,
 		return (pkts[0]) ? 1 : 0;
 	}
 
-	/* If bit is set, return
+	/* If any of below bits is set, return.
+	 * GET_BUF is set when distributor hasn't sent any packets yet
+	 * RETURN_BUF is set when distributor must retrieve in-flight packets
 	 * Sync with distributor to acquire bufptrs
 	 */
 	if (__atomic_load_n(&(buf->bufptr64[0]), __ATOMIC_ACQUIRE)
-		& RTE_DISTRIB_GET_BUF)
+		& (RTE_DISTRIB_GET_BUF | RTE_DISTRIB_RETURN_BUF))
 		return -1;
 
 	/* since bufptr64 is signed, this should be an arithmetic shift */
@@ -113,7 +115,7 @@ rte_distributor_poll_pkt(struct rte_distributor *d,
 	}
 
 	/*
-	 * so now we've got the contents of the cacheline into an  array of
+	 * so now we've got the contents of the cacheline into an array of
 	 * mbuf pointers, so toggle the bit so scheduler can start working
 	 * on the next cacheline while we're working.
 	 * Sync with distributor on GET_BUF flag. Release bufptrs.
@@ -175,7 +177,7 @@ rte_distributor_return_pkt(struct rte_distributor *d,
 	 * Sync with worker on GET_BUF flag.
 	 */
 	while (unlikely(__atomic_load_n(retptr64, __ATOMIC_ACQUIRE)
-			& RTE_DISTRIB_GET_BUF)) {
+			& (RTE_DISTRIB_GET_BUF | RTE_DISTRIB_RETURN_BUF))) {
 		rte_pause();
 		uint64_t t = rte_rdtsc()+100;
 
@@ -187,17 +189,25 @@ rte_distributor_return_pkt(struct rte_distributor *d,
 	__atomic_thread_fence(__ATOMIC_ACQUIRE);
 	for (i = 0; i < RTE_DIST_BURST_SIZE; i++)
 		/* Switch off the return bit first */
-		buf->retptr64[i] &= ~RTE_DISTRIB_RETURN_BUF;
+		buf->retptr64[i] = 0;
 
 	for (i = num; i-- > 0; )
 		buf->retptr64[i] = (((int64_t)(uintptr_t)oldpkt[i]) <<
-			RTE_DISTRIB_FLAG_BITS) | RTE_DISTRIB_RETURN_BUF;
+			RTE_DISTRIB_FLAG_BITS) | RTE_DISTRIB_VALID_BUF;
+
+	/* Use RETURN_BUF on bufptr64 to notify distributor that
+	 * we won't read any mbufs from there even if GET_BUF is set.
+	 * This allows distributor to retrieve in-flight already sent packets.
+	 */
+	__atomic_or_fetch(&(buf->bufptr64[0]), RTE_DISTRIB_RETURN_BUF,
+		__ATOMIC_ACQ_REL);
 
-	/* set the GET_BUF but even if we got no returns.
-	 * Sync with distributor on GET_BUF flag. Release retptrs.
+	/* set the RETURN_BUF on retptr64 even if we got no returns.
+	 * Sync with distributor on RETURN_BUF flag. Release retptrs.
+	 * Notify distributor that we don't request more packets any more.
 	 */
 	__atomic_store_n(&(buf->retptr64[0]),
-		buf->retptr64[0] | RTE_DISTRIB_GET_BUF, __ATOMIC_RELEASE);
+		buf->retptr64[0] | RTE_DISTRIB_RETURN_BUF, __ATOMIC_RELEASE);
 
 	return 0;
 }
@@ -267,6 +277,59 @@ find_match_scalar(struct rte_distributor *d,
 	 */
 }
 
+/*
+ * When worker called rte_distributor_return_pkt()
+ * and passed RTE_DISTRIB_RETURN_BUF handshake through retptr64,
+ * distributor must retrieve both inflight and backlog packets assigned
+ * to the worker and reprocess them to another worker.
+ */
+static void
+handle_worker_shutdown(struct rte_distributor *d, unsigned int wkr)
+{
+	struct rte_distributor_buffer *buf = &(d->bufs[wkr]);
+	/* double BURST size for storing both inflights and backlog */
+	struct rte_mbuf *pkts[RTE_DIST_BURST_SIZE * 2];
+	unsigned int pkts_count = 0;
+	unsigned int i;
+
+	/* If GET_BUF is cleared there are in-flight packets sent
+	 * to worker which does not require new packets.
+	 * They must be retrieved and assigned to another worker.
+	 */
+	if (!(__atomic_load_n(&(buf->bufptr64[0]), __ATOMIC_ACQUIRE)
+		& RTE_DISTRIB_GET_BUF))
+		for (i = 0; i < RTE_DIST_BURST_SIZE; i++)
+			if (buf->bufptr64[i] & RTE_DISTRIB_VALID_BUF)
+				pkts[pkts_count++] = (void *)((uintptr_t)
+					(buf->bufptr64[i]
+						>> RTE_DISTRIB_FLAG_BITS));
+
+	/* Make following operations on handshake flags on bufptr64:
+	 * - set GET_BUF to indicate that distributor can overwrite buffer
+	 *     with new packets if worker will make a new request.
+	 * - clear RETURN_BUF to unlock reads on worker side.
+	 */
+	__atomic_store_n(&(buf->bufptr64[0]), RTE_DISTRIB_GET_BUF,
+		__ATOMIC_RELEASE);
+
+	/* Collect backlog packets from worker */
+	for (i = 0; i < d->backlog[wkr].count; i++)
+		pkts[pkts_count++] = (void *)((uintptr_t)
+			(d->backlog[wkr].pkts[i] >> RTE_DISTRIB_FLAG_BITS));
+
+	d->backlog[wkr].count = 0;
+
+	/* Clear both inflight and backlog tags */
+	for (i = 0; i < RTE_DIST_BURST_SIZE; i++) {
+		d->in_flight_tags[wkr][i] = 0;
+		d->backlog[wkr].tags[i] = 0;
+	}
+
+	/* Recursive call */
+	if (pkts_count > 0)
+		rte_distributor_process(d, pkts, pkts_count);
+}
+
 
 /*
  * When the handshake bits indicate that there are packets coming
@@ -285,19 +348,33 @@ handle_returns(struct rte_distributor *d, unsigned int wkr)
 
 	/* Sync on GET_BUF flag. Acquire retptrs. */
 	if (__atomic_load_n(&(buf->retptr64[0]), __ATOMIC_ACQUIRE)
-		& RTE_DISTRIB_GET_BUF) {
+		& (RTE_DISTRIB_GET_BUF | RTE_DISTRIB_RETURN_BUF)) {
 		for (i = 0; i < RTE_DIST_BURST_SIZE; i++) {
-			if (buf->retptr64[i] & RTE_DISTRIB_RETURN_BUF) {
+			if (buf->retptr64[i] & RTE_DISTRIB_VALID_BUF) {
 				oldbuf = ((uintptr_t)(buf->retptr64[i] >>
 					RTE_DISTRIB_FLAG_BITS));
 				/* store returns in a circular buffer */
 				store_return(oldbuf, d, &ret_start, &ret_count);
 				count++;
-				buf->retptr64[i] &= ~RTE_DISTRIB_RETURN_BUF;
+				buf->retptr64[i] &= ~RTE_DISTRIB_VALID_BUF;
 			}
 		}
 		d->returns.start = ret_start;
 		d->returns.count = ret_count;
+
+		/* If worker requested packets with GET_BUF, set it to active
+		 * otherwise (RETURN_BUF), set it to not active.
+		 */
+		d->activesum -= d->active[wkr];
+		d->active[wkr] = !!(buf->retptr64[0] & RTE_DISTRIB_GET_BUF);
+		d->activesum += d->active[wkr];
+
+		/* If worker returned packets without requesting new ones,
+		 * handle all in-flights and backlog packets assigned to it.
+		 */
+		if (unlikely(buf->retptr64[0] & RTE_DISTRIB_RETURN_BUF))
+			handle_worker_shutdown(d, wkr);
+
 		/* Clear for the worker to populate with more returns.
 		 * Sync with distributor on GET_BUF flag. Release retptrs.
 		 */
@@ -322,11 +399,15 @@ release(struct rte_distributor *d, unsigned int wkr)
 	unsigned int i;
 
 	handle_returns(d, wkr);
+	if (unlikely(!d->active[wkr]))
+		return 0;
 
 	/* Sync with worker on GET_BUF flag */
 	while (!(__atomic_load_n(&(d->bufs[wkr].bufptr64[0]), __ATOMIC_ACQUIRE)
 		& RTE_DISTRIB_GET_BUF)) {
 		handle_returns(d, wkr);
+		if (unlikely(!d->active[wkr]))
+			return 0;
 		rte_pause();
 	}
 
@@ -366,7 +447,7 @@ rte_distributor_process(struct rte_distributor *d,
 	int64_t next_value = 0;
 	uint16_t new_tag = 0;
 	uint16_t flows[RTE_DIST_BURST_SIZE] __rte_cache_aligned;
-	unsigned int i, j, w, wid;
+	unsigned int i, j, w, wid, matching_required;
 
 	if (d->alg_type == RTE_DIST_ALG_SINGLE) {
 		/* Call the old API */
@@ -374,11 +455,13 @@ rte_distributor_process(struct rte_distributor *d,
 			mbufs, num_mbufs);
 	}
 
+	for (wid = 0 ; wid < d->num_workers; wid++)
+		handle_returns(d, wid);
+
 	if (unlikely(num_mbufs == 0)) {
 		/* Flush out all non-full cache-lines to workers. */
 		for (wid = 0 ; wid < d->num_workers; wid++) {
 			/* Sync with worker on GET_BUF flag. */
-			handle_returns(d, wid);
 			if (__atomic_load_n(&(d->bufs[wid].bufptr64[0]),
 				__ATOMIC_ACQUIRE) & RTE_DISTRIB_GET_BUF) {
 				release(d, wid);
@@ -388,6 +471,9 @@ rte_distributor_process(struct rte_distributor *d,
 		return 0;
 	}
 
+	if (unlikely(!d->activesum))
+		return 0;
+
 	while (next_idx < num_mbufs) {
 		uint16_t matches[RTE_DIST_BURST_SIZE];
 		unsigned int pkts;
@@ -412,22 +498,30 @@ rte_distributor_process(struct rte_distributor *d,
 		for (; i < RTE_DIST_BURST_SIZE; i++)
 			flows[i] = 0;
 
-		switch (d->dist_match_fn) {
-		case RTE_DIST_MATCH_VECTOR:
-			find_match_vec(d, &flows[0], &matches[0]);
-			break;
-		default:
-			find_match_scalar(d, &flows[0], &matches[0]);
-		}
+		matching_required = 1;
 
+		for (j = 0; j < pkts; j++) {
+			if (unlikely(!d->activesum))
+				return next_idx;
+
+			if (unlikely(matching_required)) {
+				switch (d->dist_match_fn) {
+				case RTE_DIST_MATCH_VECTOR:
+					find_match_vec(d, &flows[0],
+						&matches[0]);
+					break;
+				default:
+					find_match_scalar(d, &flows[0],
+						&matches[0]);
+				}
+				matching_required = 0;
+			}
 		/*
 		 * Matches array now contains the intended worker ID (+1) of
 		 * the incoming packets. Any zeroes need to be assigned
 		 * workers.
 		 */
 
-		for (j = 0; j < pkts; j++) {
-
 			next_mb = mbufs[next_idx++];
 			next_value = (((int64_t)(uintptr_t)next_mb) <<
 					RTE_DISTRIB_FLAG_BITS);
@@ -447,12 +541,18 @@ rte_distributor_process(struct rte_distributor *d,
 			 */
 			/* matches[j] = 0; */
 
-			if (matches[j]) {
+			if (matches[j] && d->active[matches[j]-1]) {
 				struct rte_distributor_backlog *bl =
 						&d->backlog[matches[j]-1];
 				if (unlikely(bl->count ==
 						RTE_DIST_BURST_SIZE)) {
 					release(d, matches[j]-1);
+					if (!d->active[matches[j]-1]) {
+						j--;
+						next_idx--;
+						matching_required = 1;
+						continue;
+					}
 				}
 
 				/* Add to worker that already has flow */
@@ -462,11 +562,21 @@ rte_distributor_process(struct rte_distributor *d,
 				bl->pkts[idx] = next_value;
 
 			} else {
-				struct rte_distributor_backlog *bl =
-						&d->backlog[wkr];
+				struct rte_distributor_backlog *bl;
+
+				while (unlikely(!d->active[wkr]))
+					wkr = (wkr + 1) % d->num_workers;
+				bl = &d->backlog[wkr];
+
 				if (unlikely(bl->count ==
 						RTE_DIST_BURST_SIZE)) {
 					release(d, wkr);
+					if (!d->active[wkr]) {
+						j--;
+						next_idx--;
+						matching_required = 1;
+						continue;
+					}
 				}
 
 				/* Add to current worker */
@@ -485,9 +595,7 @@ rte_distributor_process(struct rte_distributor *d,
 						matches[w] = wkr+1;
 			}
 		}
-		wkr++;
-		if (wkr >= d->num_workers)
-			wkr = 0;
+		wkr = (wkr + 1) % d->num_workers;
 	}
 
 	/* Flush out all non-full cache-lines to workers. */
@@ -663,6 +771,9 @@ rte_distributor_create(const char *name,
 	for (i = 0 ; i < num_workers ; i++)
 		d->backlog[i].tags = &d->in_flight_tags[i][RTE_DIST_BURST_SIZE];
 
+	memset(d->active, 0, sizeof(d->active));
+	d->activesum = 0;
+
 	dist_burst_list = RTE_TAILQ_CAST(rte_dist_burst_tailq.head,
 					  rte_dist_burst_list);
 
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v7 05/16] test/distributor: fix shutdown of busy worker
       [not found]                           ` <CGME20201010160527eucas1p2f55cb0fc45bf3647234cdfa251e542fc@eucas1p2.samsung.com>
@ 2020-10-10 16:04                             ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-10 16:04 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

The sanity test with worker shutdown delegates all bufs
to be processed by a single lcore worker, then it freezes
one of the lcore workers and continues to send more bufs.
The frozen core shuts down first by calling
rte_distributor_return_pkt().

The test's intention is to verify that packets assigned to
the shut-down lcore are reassigned to another worker.

However, the core shut down was not always the one that was
processing packets. The lcore processing mbufs might be different
every time the test is launched. This is caused by keeping the value
of the wkr static variable in the rte_distributor_process() function
between test case runs.

The test always froze the lcore with id 0. The patch stores the id
of the worker that is processing the data in the zero_idx global
atomic variable. This way the frozen lcore is always the proper one.
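A minimal standalone sketch of this latching idiom; UNSET is a
stand-in for RTE_MAX_LCORE and the worker ids are simulated, not
real lcores:

#include <stdio.h>

#define UNSET 128 /* stands in for RTE_MAX_LCORE */

static unsigned int zero_idx = UNSET;

/* Only the first caller wins the compare-exchange, so zero_idx
 * latches onto the id of the worker actually receiving packets.
 */
static void latch_first_worker(unsigned int id)
{
	unsigned int expected = UNSET;

	__atomic_compare_exchange_n(&zero_idx, &expected, id,
		0, __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE);
}

int main(void)
{
	latch_first_worker(3); /* first worker with packets wins */
	latch_first_worker(1); /* later attempts fail, id 3 is kept */
	printf("frozen worker id: %u\n",
		__atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE));
	return 0;
}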

Fixes: c3eabff124e6 ("distributor: add unit tests")
Cc: bruce.richardson@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Tested-by: David Hunt <david.hunt@intel.com>
---
 app/test/test_distributor.c | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index 52230d250..6cd7a2edd 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -28,6 +28,7 @@ struct worker_params worker_params;
 static volatile int quit;      /**< general quit variable for all threads */
 static volatile int zero_quit; /**< var for when we just want thr0 to quit*/
 static volatile unsigned worker_idx;
+static volatile unsigned zero_idx;
 
 struct worker_stats {
 	volatile unsigned handled_packets;
@@ -340,26 +341,43 @@ handle_work_for_shutdown_test(void *arg)
 	unsigned int total = 0;
 	unsigned int i;
 	unsigned int returned = 0;
+	unsigned int zero_id = 0;
+	unsigned int zero_unset;
 	const unsigned int id = __atomic_fetch_add(&worker_idx, 1,
 			__ATOMIC_RELAXED);
 
 	num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 
+	if (num > 0) {
+		zero_unset = RTE_MAX_LCORE;
+		__atomic_compare_exchange_n(&zero_idx, &zero_unset, id,
+			false, __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE);
+	}
+	zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
+
 	/* wait for quit signal globally, or for worker zero, wait
 	 * for zero_quit */
-	while (!quit && !(id == 0 && zero_quit)) {
+	while (!quit && !(id == zero_id && zero_quit)) {
 		worker_stats[id].handled_packets += num;
 		count += num;
 		for (i = 0; i < num; i++)
 			rte_pktmbuf_free(buf[i]);
 		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
+
+		if (num > 0) {
+			zero_unset = RTE_MAX_LCORE;
+			__atomic_compare_exchange_n(&zero_idx, &zero_unset, id,
+				false, __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE);
+		}
+		zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
+
 		total += num;
 	}
 	worker_stats[id].handled_packets += num;
 	count += num;
 	returned = rte_distributor_return_pkt(d, id, buf, num);
 
-	if (id == 0) {
+	if (id == zero_id) {
 		/* for worker zero, allow it to restart to pick up last packet
 		 * when all workers are shutting down.
 		 */
@@ -578,6 +596,7 @@ quit_workers(struct worker_params *wp, struct rte_mempool *p)
 	rte_eal_mp_wait_lcore();
 	quit = 0;
 	worker_idx = 0;
+	zero_idx = RTE_MAX_LCORE;
 }
 
 static int
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v7 06/16] test/distributor: synchronize lcores statistics
       [not found]                           ` <CGME20201010160528eucas1p2b9b8189aef51c18d116f97ccebf5719c@eucas1p2.samsung.com>
@ 2020-10-10 16:04                             ` Lukasz Wojciechowski
  2020-10-16  5:13                               ` Honnappa Nagarahalli
  0 siblings, 1 reply; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-10 16:04 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

Statistics of handled packets are cleared and read on the main lcore,
while they are increased in the workers' handlers on different lcores.

Without synchronization they occasionally showed invalid values.
This patch uses atomic acquire/release mechanisms to synchronize.
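A stripped-down sketch of the acquire/release pattern, with one
simulated counter in place of the per-lcore worker_stats array:

#include <stdio.h>

/* one simulated counter; the test keeps one per worker */
static unsigned int handled_packets;

/* worker side: publish the updated count */
static void worker_add(unsigned int num)
{
	__atomic_fetch_add(&handled_packets, num, __ATOMIC_ACQ_REL);
}

/* main lcore side: read a value that is never torn or stale */
static unsigned int main_read(void)
{
	return __atomic_load_n(&handled_packets, __ATOMIC_ACQUIRE);
}

int main(void)
{
	worker_add(8);
	printf("handled: %u\n", main_read());
	return 0;
}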

Fixes: c3eabff124e6 ("distributor: add unit tests")
Cc: bruce.richardson@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 app/test/test_distributor.c | 43 +++++++++++++++++++++++++------------
 1 file changed, 29 insertions(+), 14 deletions(-)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index 6cd7a2edd..838459392 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -43,7 +43,8 @@ total_packet_count(void)
 {
 	unsigned i, count = 0;
 	for (i = 0; i < worker_idx; i++)
-		count += worker_stats[i].handled_packets;
+		count += __atomic_load_n(&worker_stats[i].handled_packets,
+				__ATOMIC_ACQUIRE);
 	return count;
 }
 
@@ -51,7 +52,10 @@ total_packet_count(void)
 static inline void
 clear_packet_count(void)
 {
-	memset(&worker_stats, 0, sizeof(worker_stats));
+	unsigned int i;
+	for (i = 0; i < RTE_MAX_LCORE; i++)
+		__atomic_store_n(&worker_stats[i].handled_packets, 0,
+			__ATOMIC_RELEASE);
 }
 
 /* this is the basic worker function for sanity test
@@ -69,13 +73,13 @@ handle_work(void *arg)
 	num = rte_distributor_get_pkt(db, id, buf, NULL, 0);
 	while (!quit) {
 		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
-				__ATOMIC_RELAXED);
+				__ATOMIC_ACQ_REL);
 		count += num;
 		num = rte_distributor_get_pkt(db, id,
 				buf, buf, num);
 	}
 	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
-			__ATOMIC_RELAXED);
+			__ATOMIC_ACQ_REL);
 	count += num;
 	rte_distributor_return_pkt(db, id, buf, num);
 	return 0;
@@ -131,7 +135,8 @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
 
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
-				worker_stats[i].handled_packets);
+			__atomic_load_n(&worker_stats[i].handled_packets,
+					__ATOMIC_ACQUIRE));
 	printf("Sanity test with all zero hashes done.\n");
 
 	/* pick two flows and check they go correctly */
@@ -156,7 +161,9 @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
 
 		for (i = 0; i < rte_lcore_count() - 1; i++)
 			printf("Worker %u handled %u packets\n", i,
-					worker_stats[i].handled_packets);
+				__atomic_load_n(
+					&worker_stats[i].handled_packets,
+					__ATOMIC_ACQUIRE));
 		printf("Sanity test with two hash values done\n");
 	}
 
@@ -182,7 +189,8 @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
 
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
-				worker_stats[i].handled_packets);
+			__atomic_load_n(&worker_stats[i].handled_packets,
+					__ATOMIC_ACQUIRE));
 	printf("Sanity test with non-zero hashes done\n");
 
 	rte_mempool_put_bulk(p, (void *)bufs, BURST);
@@ -275,14 +283,16 @@ handle_work_with_free_mbufs(void *arg)
 
 	num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 	while (!quit) {
-		worker_stats[id].handled_packets += num;
 		count += num;
+		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
+				__ATOMIC_ACQ_REL);
 		for (i = 0; i < num; i++)
 			rte_pktmbuf_free(buf[i]);
 		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 	}
-	worker_stats[id].handled_packets += num;
 	count += num;
+	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
+			__ATOMIC_ACQ_REL);
 	rte_distributor_return_pkt(d, id, buf, num);
 	return 0;
 }
@@ -358,8 +368,9 @@ handle_work_for_shutdown_test(void *arg)
 	/* wait for quit signal globally, or for worker zero, wait
 	 * for zero_quit */
 	while (!quit && !(id == zero_id && zero_quit)) {
-		worker_stats[id].handled_packets += num;
 		count += num;
+		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
+				__ATOMIC_ACQ_REL);
 		for (i = 0; i < num; i++)
 			rte_pktmbuf_free(buf[i]);
 		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
@@ -373,10 +384,11 @@ handle_work_for_shutdown_test(void *arg)
 
 		total += num;
 	}
-	worker_stats[id].handled_packets += num;
 	count += num;
 	returned = rte_distributor_return_pkt(d, id, buf, num);
 
+	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
+			__ATOMIC_ACQ_REL);
 	if (id == zero_id) {
 		/* for worker zero, allow it to restart to pick up last packet
 		 * when all workers are shutting down.
@@ -387,7 +399,8 @@ handle_work_for_shutdown_test(void *arg)
 		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 
 		while (!quit) {
-			worker_stats[id].handled_packets += num;
+			__atomic_fetch_add(&worker_stats[id].handled_packets,
+					num, __ATOMIC_ACQ_REL);
 			count += num;
 			rte_pktmbuf_free(pkt);
 			num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
@@ -454,7 +467,8 @@ sanity_test_with_worker_shutdown(struct worker_params *wp,
 
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
-				worker_stats[i].handled_packets);
+			__atomic_load_n(&worker_stats[i].handled_packets,
+					__ATOMIC_ACQUIRE));
 
 	if (total_packet_count() != BURST * 2) {
 		printf("Line %d: Error, not all packets flushed. "
@@ -507,7 +521,8 @@ test_flush_with_worker_shutdown(struct worker_params *wp,
 	zero_quit = 0;
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
-				worker_stats[i].handled_packets);
+			__atomic_load_n(&worker_stats[i].handled_packets,
+					__ATOMIC_ACQUIRE));
 
 	if (total_packet_count() != BURST) {
 		printf("Line %d: Error, not all packets flushed. "
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v7 07/16] distributor: fix return pkt calls in single mode
       [not found]                           ` <CGME20201010160530eucas1p15baba6fba44a7caee8b4b0ff778a961d@eucas1p1.samsung.com>
@ 2020-10-10 16:04                             ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-10 16:04 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

In the legacy single version of the distributor, synchronization
requires a continuous exchange of buffers between the distributor
and the workers. Empty buffers are sent when only a handshake
is required.
However, calls to rte_distributor_return_pkt()
with 0 buffers in single mode were ignored and not passed to the
legacy algorithm implementation, causing a lack of synchronization.

This patch fixes the issue by passing NULL as the buffer, which is
a valid way of sending just synchronization handshakes
in single mode.
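A worker-side sketch of the shutdown handshake this fix enables,
assuming a distributor "d" created with RTE_DIST_ALG_SINGLE and a
valid worker_id from the test:

#include <stddef.h>
#include <rte_distributor.h>

/* Returning zero mbufs still performs the handshake with the
 * distributor core, which is what a quitting worker needs.
 */
static void worker_quit(struct rte_distributor *d, unsigned int worker_id)
{
	rte_distributor_return_pkt(d, worker_id, NULL, 0);
}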

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_distributor/rte_distributor.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 115443fc0..9fd7dcab7 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -168,6 +168,9 @@ rte_distributor_return_pkt(struct rte_distributor *d,
 		if (num == 1)
 			return rte_distributor_return_pkt_single(d->d_single,
 				worker_id, oldpkt[0]);
+		else if (num == 0)
+			return rte_distributor_return_pkt_single(d->d_single,
+				worker_id, NULL);
 		else
 			return -EINVAL;
 	}
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v7 08/16] test/distributor: fix freeing mbufs
       [not found]                           ` <CGME20201010160536eucas1p2b20e729b90d66eddd03618e98d38c179@eucas1p2.samsung.com>
@ 2020-10-10 16:04                             ` Lukasz Wojciechowski
  2020-10-16  5:12                               ` Honnappa Nagarahalli
  0 siblings, 1 reply; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-10 16:04 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

Sanity tests with mbuf alloc and shutdown tests assume that
mbufs passed to worker cores are freed in the handlers.
Such packets should not be returned to the distributor's main
core. The only packets that should be returned are the packets
sent after completion of the tests in the quit_workers() function.

This patch stops returning mbufs to the distributor's core.
In the case of the shutdown tests it is impossible to determine
how the worker and distributor threads will synchronize.
Packets used by the tests should be freed and packets used during
quit_workers() shouldn't. That's why returning mbufs to the mempool
is moved from the worker threads to the test procedure run on the
distributor thread, as sketched below.

Additionally, this patch cleans up unused variables.
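A minimal sketch of the resulting cleanup ordering, assuming the
distributor "d", pool "p" and "bufs" array come from the test (this is
a hypothetical helper, not a verbatim copy of the patch):

#include <rte_distributor.h>
#include <rte_launch.h>
#include <rte_mempool.h>

static void
quit_and_recycle(struct rte_distributor *d, struct rte_mempool *p,
		struct rte_mbuf **bufs, unsigned int num_workers)
{
	rte_distributor_process(d, bufs, num_workers); /* wake workers */
	rte_distributor_process(d, NULL, 0);
	rte_distributor_flush(d);
	rte_eal_mp_wait_lcore(); /* workers no longer touch the mbufs */

	/* only now is it safe to return the buffers to the pool */
	rte_mempool_put_bulk(p, (void *)bufs, num_workers);
}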

Fixes: c0de0eb82e40 ("distributor: switch over to new API")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 app/test/test_distributor.c | 96 ++++++++++++++++++-------------------
 1 file changed, 47 insertions(+), 49 deletions(-)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index 838459392..06e01ff9d 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -44,7 +44,7 @@ total_packet_count(void)
 	unsigned i, count = 0;
 	for (i = 0; i < worker_idx; i++)
 		count += __atomic_load_n(&worker_stats[i].handled_packets,
-				__ATOMIC_ACQUIRE);
+				__ATOMIC_RELAXED);
 	return count;
 }
 
@@ -55,7 +55,7 @@ clear_packet_count(void)
 	unsigned int i;
 	for (i = 0; i < RTE_MAX_LCORE; i++)
 		__atomic_store_n(&worker_stats[i].handled_packets, 0,
-			__ATOMIC_RELEASE);
+			__ATOMIC_RELAXED);
 }
 
 /* this is the basic worker function for sanity test
@@ -67,20 +67,18 @@ handle_work(void *arg)
 	struct rte_mbuf *buf[8] __rte_cache_aligned;
 	struct worker_params *wp = arg;
 	struct rte_distributor *db = wp->dist;
-	unsigned int count = 0, num;
+	unsigned int num;
 	unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED);
 
 	num = rte_distributor_get_pkt(db, id, buf, NULL, 0);
 	while (!quit) {
 		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
-				__ATOMIC_ACQ_REL);
-		count += num;
+				__ATOMIC_RELAXED);
 		num = rte_distributor_get_pkt(db, id,
 				buf, buf, num);
 	}
 	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
-			__ATOMIC_ACQ_REL);
-	count += num;
+			__ATOMIC_RELAXED);
 	rte_distributor_return_pkt(db, id, buf, num);
 	return 0;
 }
@@ -136,7 +134,7 @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
 			__atomic_load_n(&worker_stats[i].handled_packets,
-					__ATOMIC_ACQUIRE));
+					__ATOMIC_RELAXED));
 	printf("Sanity test with all zero hashes done.\n");
 
 	/* pick two flows and check they go correctly */
@@ -163,7 +161,7 @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
 			printf("Worker %u handled %u packets\n", i,
 				__atomic_load_n(
 					&worker_stats[i].handled_packets,
-					__ATOMIC_ACQUIRE));
+					__ATOMIC_RELAXED));
 		printf("Sanity test with two hash values done\n");
 	}
 
@@ -190,7 +188,7 @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
 			__atomic_load_n(&worker_stats[i].handled_packets,
-					__ATOMIC_ACQUIRE));
+					__ATOMIC_RELAXED));
 	printf("Sanity test with non-zero hashes done\n");
 
 	rte_mempool_put_bulk(p, (void *)bufs, BURST);
@@ -276,23 +274,20 @@ handle_work_with_free_mbufs(void *arg)
 	struct rte_mbuf *buf[8] __rte_cache_aligned;
 	struct worker_params *wp = arg;
 	struct rte_distributor *d = wp->dist;
-	unsigned int count = 0;
 	unsigned int i;
 	unsigned int num;
 	unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED);
 
 	num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 	while (!quit) {
-		count += num;
 		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
-				__ATOMIC_ACQ_REL);
+				__ATOMIC_RELAXED);
 		for (i = 0; i < num; i++)
 			rte_pktmbuf_free(buf[i]);
 		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 	}
-	count += num;
 	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
-			__ATOMIC_ACQ_REL);
+			__ATOMIC_RELAXED);
 	rte_distributor_return_pkt(d, id, buf, num);
 	return 0;
 }
@@ -318,7 +313,6 @@ sanity_test_with_mbuf_alloc(struct worker_params *wp, struct rte_mempool *p)
 			rte_distributor_process(d, NULL, 0);
 		for (j = 0; j < BURST; j++) {
 			bufs[j]->hash.usr = (i+j) << 1;
-			rte_mbuf_refcnt_set(bufs[j], 1);
 		}
 
 		rte_distributor_process(d, bufs, BURST);
@@ -342,15 +336,10 @@ sanity_test_with_mbuf_alloc(struct worker_params *wp, struct rte_mempool *p)
 static int
 handle_work_for_shutdown_test(void *arg)
 {
-	struct rte_mbuf *pkt = NULL;
 	struct rte_mbuf *buf[8] __rte_cache_aligned;
 	struct worker_params *wp = arg;
 	struct rte_distributor *d = wp->dist;
-	unsigned int count = 0;
 	unsigned int num;
-	unsigned int total = 0;
-	unsigned int i;
-	unsigned int returned = 0;
 	unsigned int zero_id = 0;
 	unsigned int zero_unset;
 	const unsigned int id = __atomic_fetch_add(&worker_idx, 1,
@@ -368,11 +357,8 @@ handle_work_for_shutdown_test(void *arg)
 	/* wait for quit signal globally, or for worker zero, wait
 	 * for zero_quit */
 	while (!quit && !(id == zero_id && zero_quit)) {
-		count += num;
 		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
-				__ATOMIC_ACQ_REL);
-		for (i = 0; i < num; i++)
-			rte_pktmbuf_free(buf[i]);
+				__ATOMIC_RELAXED);
 		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 
 		if (num > 0) {
@@ -381,15 +367,12 @@ handle_work_for_shutdown_test(void *arg)
 				false, __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE);
 		}
 		zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
-
-		total += num;
 	}
-	count += num;
-	returned = rte_distributor_return_pkt(d, id, buf, num);
-
 	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
-			__ATOMIC_ACQ_REL);
+			__ATOMIC_RELAXED);
 	if (id == zero_id) {
+		rte_distributor_return_pkt(d, id, NULL, 0);
+
 		/* for worker zero, allow it to restart to pick up last packet
 		 * when all workers are shutting down.
 		 */
@@ -400,15 +383,11 @@ handle_work_for_shutdown_test(void *arg)
 
 		while (!quit) {
 			__atomic_fetch_add(&worker_stats[id].handled_packets,
-					num, __ATOMIC_ACQ_REL);
-			count += num;
-			rte_pktmbuf_free(pkt);
+					num, __ATOMIC_RELAXED);
 			num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 		}
-		returned = rte_distributor_return_pkt(d,
-				id, buf, num);
-		printf("Num returned = %d\n", returned);
 	}
+	rte_distributor_return_pkt(d, id, buf, num);
 	return 0;
 }
 
@@ -424,7 +403,9 @@ sanity_test_with_worker_shutdown(struct worker_params *wp,
 {
 	struct rte_distributor *d = wp->dist;
 	struct rte_mbuf *bufs[BURST];
-	unsigned i;
+	struct rte_mbuf *bufs2[BURST];
+	unsigned int i;
+	unsigned int failed = 0;
 
 	printf("=== Sanity test of worker shutdown ===\n");
 
@@ -450,16 +431,17 @@ sanity_test_with_worker_shutdown(struct worker_params *wp,
 	 */
 
 	/* get more buffers to queue up, again setting them to the same flow */
-	if (rte_mempool_get_bulk(p, (void *)bufs, BURST) != 0) {
+	if (rte_mempool_get_bulk(p, (void *)bufs2, BURST) != 0) {
 		printf("line %d: Error getting mbufs from pool\n", __LINE__);
+		rte_mempool_put_bulk(p, (void *)bufs, BURST);
 		return -1;
 	}
 	for (i = 0; i < BURST; i++)
-		bufs[i]->hash.usr = 1;
+		bufs2[i]->hash.usr = 1;
 
 	/* get worker zero to quit */
 	zero_quit = 1;
-	rte_distributor_process(d, bufs, BURST);
+	rte_distributor_process(d, bufs2, BURST);
 
 	/* flush the distributor */
 	rte_distributor_flush(d);
@@ -468,15 +450,21 @@ sanity_test_with_worker_shutdown(struct worker_params *wp,
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
 			__atomic_load_n(&worker_stats[i].handled_packets,
-					__ATOMIC_ACQUIRE));
+					__ATOMIC_RELAXED));
 
 	if (total_packet_count() != BURST * 2) {
 		printf("Line %d: Error, not all packets flushed. "
 				"Expected %u, got %u\n",
 				__LINE__, BURST * 2, total_packet_count());
-		return -1;
+		failed = 1;
 	}
 
+	rte_mempool_put_bulk(p, (void *)bufs, BURST);
+	rte_mempool_put_bulk(p, (void *)bufs2, BURST);
+
+	if (failed)
+		return -1;
+
 	printf("Sanity test with worker shutdown passed\n\n");
 	return 0;
 }
@@ -490,7 +478,8 @@ test_flush_with_worker_shutdown(struct worker_params *wp,
 {
 	struct rte_distributor *d = wp->dist;
 	struct rte_mbuf *bufs[BURST];
-	unsigned i;
+	unsigned int i;
+	unsigned int failed = 0;
 
 	printf("=== Test flush fn with worker shutdown (%s) ===\n", wp->name);
 
@@ -522,15 +511,20 @@ test_flush_with_worker_shutdown(struct worker_params *wp,
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
 			__atomic_load_n(&worker_stats[i].handled_packets,
-					__ATOMIC_ACQUIRE));
+					__ATOMIC_RELAXED));
 
 	if (total_packet_count() != BURST) {
 		printf("Line %d: Error, not all packets flushed. "
 				"Expected %u, got %u\n",
 				__LINE__, BURST, total_packet_count());
-		return -1;
+		failed = 1;
 	}
 
+	rte_mempool_put_bulk(p, (void *)bufs, BURST);
+
+	if (failed)
+		return -1;
+
 	printf("Flush test with worker shutdown passed\n\n");
 	return 0;
 }
@@ -596,7 +590,10 @@ quit_workers(struct worker_params *wp, struct rte_mempool *p)
 	const unsigned num_workers = rte_lcore_count() - 1;
 	unsigned i;
 	struct rte_mbuf *bufs[RTE_MAX_LCORE];
-	rte_mempool_get_bulk(p, (void *)bufs, num_workers);
+	if (rte_mempool_get_bulk(p, (void *)bufs, num_workers) != 0) {
+		printf("line %d: Error getting mbufs from pool\n", __LINE__);
+		return;
+	}
 
 	zero_quit = 0;
 	quit = 1;
@@ -604,11 +601,12 @@ quit_workers(struct worker_params *wp, struct rte_mempool *p)
 		bufs[i]->hash.usr = i << 1;
 	rte_distributor_process(d, bufs, num_workers);
 
-	rte_mempool_put_bulk(p, (void *)bufs, num_workers);
-
 	rte_distributor_process(d, NULL, 0);
 	rte_distributor_flush(d);
 	rte_eal_mp_wait_lcore();
+
+	rte_mempool_put_bulk(p, (void *)bufs, num_workers);
+
 	quit = 0;
 	worker_idx = 0;
 	zero_idx = RTE_MAX_LCORE;
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v7 09/16] test/distributor: collect return mbufs
       [not found]                           ` <CGME20201010160538eucas1p19298667f236209cfeaa4745f9bb3aae6@eucas1p1.samsung.com>
@ 2020-10-10 16:05                             ` Lukasz Wojciechowski
  2020-10-16  4:53                               ` Honnappa Nagarahalli
  2020-10-16  5:13                               ` Honnappa Nagarahalli
  0 siblings, 2 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-10 16:05 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

During the quit_workers() function, the distributor's main core
processes some packets to wake up pending worker cores so they can
quit. As quit_workers() also acts as a cleanup procedure for the next
test case, it should also collect the packets returned by the workers'
handlers, so that the cyclic buffer of returned packets in the
distributor remains empty.
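A sketch of the drain idiom this patch adds, assuming a distributor
"d" from the test harness:

#include <rte_distributor.h>
#include <rte_lcore.h>

static void drain_returns(struct rte_distributor *d)
{
	struct rte_mbuf *returns[RTE_MAX_LCORE];

	/* pull returned packets until the cyclic buffer is empty */
	while (rte_distributor_returned_pkts(d, returns, RTE_MAX_LCORE) > 0)
		;
	rte_distributor_clear_returns(d);
}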

Fixes: c3eabff124e6 ("distributor: add unit tests")
Cc: bruce.richardson@intel.com
Fixes: c0de0eb82e40 ("distributor: switch over to new API")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 app/test/test_distributor.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index 06e01ff9d..ed03040d1 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -590,6 +590,7 @@ quit_workers(struct worker_params *wp, struct rte_mempool *p)
 	const unsigned num_workers = rte_lcore_count() - 1;
 	unsigned i;
 	struct rte_mbuf *bufs[RTE_MAX_LCORE];
+	struct rte_mbuf *returns[RTE_MAX_LCORE];
 	if (rte_mempool_get_bulk(p, (void *)bufs, num_workers) != 0) {
 		printf("line %d: Error getting mbufs from pool\n", __LINE__);
 		return;
@@ -605,6 +606,10 @@ quit_workers(struct worker_params *wp, struct rte_mempool *p)
 	rte_distributor_flush(d);
 	rte_eal_mp_wait_lcore();
 
+	while (rte_distributor_returned_pkts(d, returns, RTE_MAX_LCORE))
+		;
+
+	rte_distributor_clear_returns(d);
 	rte_mempool_put_bulk(p, (void *)bufs, num_workers);
 
 	quit = 0;
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v7 10/16] distributor: align API documentation with code
       [not found]                           ` <CGME20201010160540eucas1p2d942834b4749672c433a37a8fe520bd1@eucas1p2.samsung.com>
@ 2020-10-10 16:05                             ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-10 16:05 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

After introducing the burst API, some artefacts from the legacy
single API were left in the API documentation.
Also, the documented return values of the rte_distributor_poll_pkt()
function mismatched the implementation.
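For illustration, a worker-side sketch of the request/poll pattern
with the corrected return-value handling; poll_for_work is a
hypothetical helper name, and burst mode (RTE_DIST_ALG_BURST) is
assumed, where -1 means the request is not yet fulfilled:

#include <stddef.h>
#include <rte_distributor.h>

static int
poll_for_work(struct rte_distributor *d, unsigned int worker_id,
		struct rte_mbuf **pkts)
{
	int num;

	rte_distributor_request_pkt(d, worker_id, NULL, 0);
	do {
		num = rte_distributor_poll_pkt(d, worker_id, pkts);
	} while (num == -1); /* burst API: request not yet fulfilled */

	return num; /* packets handed to this worker */
}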

Fixes: c0de0eb82e40 ("distributor: switch over to new API")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_distributor/rte_distributor.h | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/lib/librte_distributor/rte_distributor.h b/lib/librte_distributor/rte_distributor.h
index 327c0c4ab..a073e6461 100644
--- a/lib/librte_distributor/rte_distributor.h
+++ b/lib/librte_distributor/rte_distributor.h
@@ -155,7 +155,7 @@ rte_distributor_clear_returns(struct rte_distributor *d);
  * @param pkts
  *   The mbufs pointer array to be filled in (up to 8 packets)
  * @param oldpkt
- *   The previous packet, if any, being processed by the worker
+ *   The previous packets, if any, being processed by the worker
  * @param retcount
  *   The number of packets being returned
  *
@@ -187,15 +187,15 @@ rte_distributor_return_pkt(struct rte_distributor *d,
 
 /**
  * API called by a worker to request a new packet to process.
- * Any previous packet given to the worker is assumed to have completed
+ * Any previous packets given to the worker are assumed to have completed
  * processing, and may be optionally returned to the distributor via
  * the oldpkt parameter.
- * Unlike rte_distributor_get_pkt_burst(), this function does not wait for a
- * new packet to be provided by the distributor.
+ * Unlike rte_distributor_get_pkt(), this function does not wait for
+ * new packets to be provided by the distributor.
  *
- * NOTE: after calling this function, rte_distributor_poll_pkt_burst() should
- * be used to poll for the packet requested. The rte_distributor_get_pkt_burst()
- * API should *not* be used to try and retrieve the new packet.
+ * NOTE: after calling this function, rte_distributor_poll_pkt() should
+ * be used to poll for the packets requested. The rte_distributor_get_pkt()
+ * API should *not* be used to try and retrieve the new packets.
  *
  * @param d
  *   The distributor instance to be used
@@ -213,9 +213,9 @@ rte_distributor_request_pkt(struct rte_distributor *d,
 		unsigned int count);
 
 /**
- * API called by a worker to check for a new packet that was previously
+ * API called by a worker to check for new packets that were previously
  * requested by a call to rte_distributor_request_pkt(). It does not wait
- * for the new packet to be available, but returns NULL if the request has
+ * for the new packets to be available, but returns if the request has
  * not yet been fulfilled by the distributor.
  *
  * @param d
@@ -227,8 +227,9 @@ rte_distributor_request_pkt(struct rte_distributor *d,
  *   The array of mbufs being given to the worker
  *
  * @return
- *   The number of packets being given to the worker thread, zero if no
- *   packet is yet available.
+ *   The number of packets being given to the worker thread,
+ *   -1 if no packets are yet available (burst API - RTE_DIST_ALG_BURST)
+ *   0 if no packets are yet available (legacy single API - RTE_DIST_ALG_SINGLE)
  */
 int
 rte_distributor_poll_pkt(struct rte_distributor *d,
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v7 11/16] test/distributor: replace delays with spin locks
       [not found]                           ` <CGME20201010160541eucas1p11d079bad2b7500f9ab927463e1eeac04@eucas1p1.samsung.com>
@ 2020-10-10 16:05                             ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-10 16:05 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

Instead of adding delays to the test code and hoping that
the workers reach the proper states in time, synchronize
the worker shutdown test cases with a spin lock
on an atomic variable.
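A standalone sketch of the spin-lock pairing; zero_sleep is simulated
in a single thread here rather than shared between real lcores:

static int zero_sleep;

/* worker zero, just before parking */
static void park(void)
{
	__atomic_store_n(&zero_sleep, 1, __ATOMIC_RELEASE);
}

/* main lcore: spin until worker zero is known to be parked */
static void wait_parked(void)
{
	while (!__atomic_load_n(&zero_sleep, __ATOMIC_ACQUIRE))
		; /* the test keeps flushing the distributor here */
}

int main(void)
{
	park();
	wait_parked(); /* returns once the release store is visible */
	return 0;
}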

Fixes: c0de0eb82e40 ("distributor: switch over to new API")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 app/test/test_distributor.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index ed03040d1..e8dd75078 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -27,6 +27,7 @@ struct worker_params worker_params;
 /* statics - all zero-initialized by default */
 static volatile int quit;      /**< general quit variable for all threads */
 static volatile int zero_quit; /**< var for when we just want thr0 to quit*/
+static volatile int zero_sleep; /**< thr0 has quit basic loop and is sleeping*/
 static volatile unsigned worker_idx;
 static volatile unsigned zero_idx;
 
@@ -376,8 +377,10 @@ handle_work_for_shutdown_test(void *arg)
 		/* for worker zero, allow it to restart to pick up last packet
 		 * when all workers are shutting down.
 		 */
+		__atomic_store_n(&zero_sleep, 1, __ATOMIC_RELEASE);
 		while (zero_quit)
 			usleep(100);
+		__atomic_store_n(&zero_sleep, 0, __ATOMIC_RELEASE);
 
 		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 
@@ -445,7 +448,12 @@ sanity_test_with_worker_shutdown(struct worker_params *wp,
 
 	/* flush the distributor */
 	rte_distributor_flush(d);
-	rte_delay_us(10000);
+	while (!__atomic_load_n(&zero_sleep, __ATOMIC_ACQUIRE))
+		rte_distributor_flush(d);
+
+	zero_quit = 0;
+	while (__atomic_load_n(&zero_sleep, __ATOMIC_ACQUIRE))
+		rte_delay_us(100);
 
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
@@ -505,9 +513,14 @@ test_flush_with_worker_shutdown(struct worker_params *wp,
 	/* flush the distributor */
 	rte_distributor_flush(d);
 
-	rte_delay_us(10000);
+	while (!__atomic_load_n(&zero_sleep, __ATOMIC_ACQUIRE))
+		rte_distributor_flush(d);
 
 	zero_quit = 0;
+
+	while (__atomic_load_n(&zero_sleep, __ATOMIC_ACQUIRE))
+		rte_delay_us(100);
+
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
 			__atomic_load_n(&worker_stats[i].handled_packets,
@@ -615,6 +628,8 @@ quit_workers(struct worker_params *wp, struct rte_mempool *p)
 	quit = 0;
 	worker_idx = 0;
 	zero_idx = RTE_MAX_LCORE;
+	zero_quit = 0;
+	zero_sleep = 0;
 }
 
 static int
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v7 12/16] distributor: fix scalar matching
       [not found]                           ` <CGME20201010160548eucas1p193e4f234da1005b91f22a8e7cb1d3226@eucas1p1.samsung.com>
@ 2020-10-10 16:05                             ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-10 16:05 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

Fix improper indexes used when comparing tags.
In the find_match_scalar() function:
* j iterates over the flow tags of the incoming packets;
* w iterates over the backlog or in-flight tag positions.
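A standalone sketch of the corrected index roles, with simplified
stand-in arrays instead of the library's structures:

#define BURST 8

static void
match_tags(const unsigned int new_tags[BURST],
		const unsigned int stored_tags[BURST],
		unsigned int out[BURST], unsigned int worker)
{
	unsigned int j, w;

	for (j = 0; j < BURST; j++)	/* each incoming packet's tag */
		for (w = 0; w < BURST; w++)	/* each stored tag */
			if (stored_tags[w] == new_tags[j]) {
				out[j] = worker + 1; /* worker id + 1 */
				break;
			}
}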

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_distributor/rte_distributor.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 9fd7dcab7..4bd23a990 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -261,13 +261,13 @@ find_match_scalar(struct rte_distributor *d,
 
 		for (j = 0; j < RTE_DIST_BURST_SIZE ; j++)
 			for (w = 0; w < RTE_DIST_BURST_SIZE; w++)
-				if (d->in_flight_tags[i][j] == data_ptr[w]) {
+				if (d->in_flight_tags[i][w] == data_ptr[j]) {
 					output_ptr[j] = i+1;
 					break;
 				}
 		for (j = 0; j < RTE_DIST_BURST_SIZE; j++)
 			for (w = 0; w < RTE_DIST_BURST_SIZE; w++)
-				if (bl->tags[j] == data_ptr[w]) {
+				if (bl->tags[w] == data_ptr[j]) {
 					output_ptr[j] = i+1;
 					break;
 				}
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v7 13/16] test/distributor: add test with packets marking
       [not found]                           ` <CGME20201010160549eucas1p1eba7cb8e4e9ba9200e9cd498137848c3@eucas1p1.samsung.com>
@ 2020-10-10 16:05                             ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-10 16:05 UTC (permalink / raw)
  To: David Hunt; +Cc: dev, l.wojciechow

All of the former tests analyzed only the statistics
of packets processed by all workers.
The new test also verifies whether packets are processed
by the workers as expected.
Every packet processed by a worker is marked
and analyzed after it is returned to the distributor.

This test allows finding issues in the matching algorithms.
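A standalone sketch of the udata64 encoding the test relies on,
assuming a packet that is handled exactly once:

#include <stdio.h>
#include <stdint.h>

#define SEQ_SHIFT 10

int main(void)
{
	uint64_t udata = (uint64_t)5 << SEQ_SHIFT; /* packet #5, unmarked */

	udata += 2 + 1; /* worker 2 marks the packet with id + 1 */

	unsigned int seq = udata >> SEQ_SHIFT;
	unsigned int mark = udata - ((uint64_t)seq << SEQ_SHIFT);

	printf("seq %u handled by worker %u\n", seq, mark - 1);
	return 0;
}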

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 app/test/test_distributor.c | 141 ++++++++++++++++++++++++++++++++++++
 1 file changed, 141 insertions(+)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index e8dd75078..4fc10b3cc 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -542,6 +542,141 @@ test_flush_with_worker_shutdown(struct worker_params *wp,
 	return 0;
 }
 
+static int
+handle_and_mark_work(void *arg)
+{
+	struct rte_mbuf *buf[8] __rte_cache_aligned;
+	struct worker_params *wp = arg;
+	struct rte_distributor *db = wp->dist;
+	unsigned int num, i;
+	unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED);
+	num = rte_distributor_get_pkt(db, id, buf, NULL, 0);
+	while (!quit) {
+		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
+				__ATOMIC_RELAXED);
+		for (i = 0; i < num; i++)
+			buf[i]->udata64 += id + 1;
+		num = rte_distributor_get_pkt(db, id,
+				buf, buf, num);
+	}
+	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
+			__ATOMIC_RELAXED);
+	rte_distributor_return_pkt(db, id, buf, num);
+	return 0;
+}
+
+/* sanity_mark_test sends packets to workers, which mark them.
+ * Every packet also has an encoded sequence number.
+ * The returned packets are sorted and verified to have been handled
+ * by the proper workers.
+ */
+static int
+sanity_mark_test(struct worker_params *wp, struct rte_mempool *p)
+{
+	const unsigned int buf_count = 24;
+	const unsigned int burst = 8;
+	const unsigned int shift = 12;
+	const unsigned int seq_shift = 10;
+
+	struct rte_distributor *db = wp->dist;
+	struct rte_mbuf *bufs[buf_count];
+	struct rte_mbuf *returns[buf_count];
+	unsigned int i, count, id;
+	unsigned int sorted[buf_count], seq;
+	unsigned int failed = 0;
+
+	printf("=== Marked packets test ===\n");
+	clear_packet_count();
+	if (rte_mempool_get_bulk(p, (void *)bufs, buf_count) != 0) {
+		printf("line %d: Error getting mbufs from pool\n", __LINE__);
+		return -1;
+	}
+
+	/* bufs' hashes will be like these below, but shifted left.
+	 * The shifting is for avoiding collisions with backlogs
+	 * and in-flight tags left by previous tests.
+	 * [1, 1, 1, 1, 1, 1, 1, 1
+	 *  1, 1, 1, 1, 2, 2, 2, 2
+	 *  2, 2, 2, 2, 1, 1, 1, 1]
+	 */
+	for (i = 0; i < burst; i++) {
+		bufs[0 * burst + i]->hash.usr = 1 << shift;
+		bufs[1 * burst + i]->hash.usr = ((i < burst / 2) ? 1 : 2)
+			<< shift;
+		bufs[2 * burst + i]->hash.usr = ((i < burst / 2) ? 2 : 1)
+			<< shift;
+	}
+	/* Assign a sequence number to each packet. The sequence is shifted,
+	 * so that lower bits of the udata64 will hold the mark from worker.
+	 */
+	for (i = 0; i < buf_count; i++)
+		bufs[i]->udata64 = i << seq_shift;
+
+	count = 0;
+	for (i = 0; i < buf_count/burst; i++) {
+		rte_distributor_process(db, &bufs[i * burst], burst);
+		count += rte_distributor_returned_pkts(db, &returns[count],
+			buf_count - count);
+	}
+
+	do {
+		rte_distributor_flush(db);
+		count += rte_distributor_returned_pkts(db, &returns[count],
+			buf_count - count);
+	} while (count < buf_count);
+
+	for (i = 0; i < rte_lcore_count() - 1; i++)
+		printf("Worker %u handled %u packets\n", i,
+			__atomic_load_n(&worker_stats[i].handled_packets,
+					__ATOMIC_RELAXED));
+
+	/* Sort returned packets by sent order (sequence numbers). */
+	for (i = 0; i < buf_count; i++) {
+		seq = returns[i]->udata64 >> seq_shift;
+		id = returns[i]->udata64 - (seq << seq_shift);
+		sorted[seq] = id;
+	}
+
+	/* Verify that packets [0-11] and [20-23] were processed
+	 * by the same worker
+	 */
+	for (i = 1; i < 12; i++) {
+		if (sorted[i] != sorted[0]) {
+			printf("Packet number %u processed by worker %u,"
+				" but should be processed by worker %u\n",
+				i, sorted[i], sorted[0]);
+			failed = 1;
+		}
+	}
+	for (i = 20; i < 24; i++) {
+		if (sorted[i] != sorted[0]) {
+			printf("Packet number %u processed by worker %u,"
+				" but should be processed by worker %u\n",
+				i, sorted[i], sorted[0]);
+			failed = 1;
+		}
+	}
+	/* And verify that packets [12-19] were processed
+	 * by the another worker
+	 */
+	for (i = 13; i < 20; i++) {
+		if (sorted[i] != sorted[12]) {
+			printf("Packet number %u processed by worker %u,"
+				" but should be processed by worker %u\n",
+				i, sorted[i], sorted[12]);
+			failed = 1;
+		}
+	}
+
+	rte_mempool_put_bulk(p, (void *)bufs, buf_count);
+
+	if (failed)
+		return -1;
+
+	printf("Marked packets test passed\n");
+	return 0;
+}
+
 static
 int test_error_distributor_create_name(void)
 {
@@ -726,6 +861,12 @@ test_distributor(void)
 				goto err;
 			quit_workers(&worker_params, p);
 
+			rte_eal_mp_remote_launch(handle_and_mark_work,
+					&worker_params, SKIP_MASTER);
+			if (sanity_mark_test(&worker_params, p) < 0)
+				goto err;
+			quit_workers(&worker_params, p);
+
 		} else {
 			printf("Too few cores to run worker shutdown test\n");
 		}
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v7 14/16] distributor: fix flushing in flight packets
       [not found]                           ` <CGME20201010160551eucas1p171642aa2d451e501287915824bfe7c24@eucas1p1.samsung.com>
@ 2020-10-10 16:05                             ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-10 16:05 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

rte_distributor_flush() uses the total_outstanding()
function to decide whether it should still wait
for packets being processed. However, in burst mode
only backlog packets were counted.

This patch fixes that issue by also counting in-flight
packets. There are also some fixes to properly keep
count of in-flight packets for each worker in bufs[].count.
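A simplified sketch of what the fixed accounting sums, using stand-in
structures rather than the library's types:

struct worker_state {
	unsigned int backlog_count; /* queued for the worker */
	unsigned int bufs_count;    /* already pushed to the worker */
};

static unsigned int
total_outstanding(const struct worker_state *w, unsigned int num_workers)
{
	unsigned int wkr, total = 0;

	for (wkr = 0; wkr < num_workers; wkr++)
		total += w[wkr].backlog_count + w[wkr].bufs_count;

	return total; /* flush spins until this reaches zero */
}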

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_distributor/rte_distributor.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 4bd23a990..2478de3b7 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -467,6 +467,7 @@ rte_distributor_process(struct rte_distributor *d,
 			/* Sync with worker on GET_BUF flag. */
 			if (__atomic_load_n(&(d->bufs[wid].bufptr64[0]),
 				__ATOMIC_ACQUIRE) & RTE_DISTRIB_GET_BUF) {
+				d->bufs[wid].count = 0;
 				release(d, wid);
 				handle_returns(d, wid);
 			}
@@ -481,11 +482,6 @@ rte_distributor_process(struct rte_distributor *d,
 		uint16_t matches[RTE_DIST_BURST_SIZE];
 		unsigned int pkts;
 
-		/* Sync with worker on GET_BUF flag. */
-		if (__atomic_load_n(&(d->bufs[wkr].bufptr64[0]),
-			__ATOMIC_ACQUIRE) & RTE_DISTRIB_GET_BUF)
-			d->bufs[wkr].count = 0;
-
 		if ((num_mbufs - next_idx) < RTE_DIST_BURST_SIZE)
 			pkts = num_mbufs - next_idx;
 		else
@@ -605,8 +601,10 @@ rte_distributor_process(struct rte_distributor *d,
 	for (wid = 0 ; wid < d->num_workers; wid++)
 		/* Sync with worker on GET_BUF flag. */
 		if ((__atomic_load_n(&(d->bufs[wid].bufptr64[0]),
-			__ATOMIC_ACQUIRE) & RTE_DISTRIB_GET_BUF))
+			__ATOMIC_ACQUIRE) & RTE_DISTRIB_GET_BUF)) {
+			d->bufs[wid].count = 0;
 			release(d, wid);
+		}
 
 	return num_mbufs;
 }
@@ -649,7 +647,7 @@ total_outstanding(const struct rte_distributor *d)
 	unsigned int wkr, total_outstanding = 0;
 
 	for (wkr = 0; wkr < d->num_workers; wkr++)
-		total_outstanding += d->backlog[wkr].count;
+		total_outstanding += d->backlog[wkr].count + d->bufs[wkr].count;
 
 	return total_outstanding;
 }
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v7 15/16] distributor: fix clearing returns buffer
       [not found]                           ` <CGME20201010160552eucas1p2efdec872c4aea2b63af29c84e9a5b52d@eucas1p2.samsung.com>
@ 2020-10-10 16:05                             ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-10 16:05 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

The patch clears the distributor's returns buffer
in clear_returns() by setting start and count to 0.

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_distributor/rte_distributor.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 2478de3b7..57240304a 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -704,6 +704,8 @@ rte_distributor_clear_returns(struct rte_distributor *d)
 		/* Sync with worker. Release retptrs. */
 		__atomic_store_n(&(d->bufs[wkr].retptr64[0]), 0,
 				__ATOMIC_RELEASE);
+
+	d->returns.start = d->returns.count = 0;
 }
 
 /* creates a distributor instance */
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v7 16/16] test/distributor: ensure all packets are delivered
       [not found]                           ` <CGME20201010160605eucas1p1ff6b4cb5065e1355cb8eeafd4696abaf@eucas1p1.samsung.com>
@ 2020-10-10 16:05                             ` Lukasz Wojciechowski
  2020-10-12  7:46                               ` David Hunt
  0 siblings, 1 reply; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-10 16:05 UTC (permalink / raw)
  To: David Hunt; +Cc: dev, l.wojciechow

In all distributor tests there is a chance that the tests
will send packets to the distributor with rte_distributor_process()
before the workers have started and requested packets.

This patch ensures that all packets are delivered to the workers
by calling rte_distributor_process() in a loop until the number
of successfully processed packets reaches the count required by the
test. The change is applied to the first call in every test case.
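A sketch of the retry idiom as a hypothetical helper, assuming "d",
"bufs" and the count come from the test:

#include <rte_distributor.h>

static void
process_all(struct rte_distributor *d, struct rte_mbuf **bufs,
		unsigned int count)
{
	unsigned int processed = 0;

	/* workers may not have requested packets yet, so keep offering
	 * the remaining mbufs until all of them are accepted
	 */
	while (processed < count)
		processed += rte_distributor_process(d, &bufs[processed],
				count - processed);
}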

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
---
 app/test/test_distributor.c | 32 +++++++++++++++++++++++++++-----
 1 file changed, 27 insertions(+), 5 deletions(-)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index 4fc10b3cc..3c56358d4 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -103,6 +103,7 @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
 	struct rte_mbuf *returns[BURST*2];
 	unsigned int i, count;
 	unsigned int retries;
+	unsigned int processed;
 
 	printf("=== Basic distributor sanity tests ===\n");
 	clear_packet_count();
@@ -116,7 +117,11 @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
 	for (i = 0; i < BURST; i++)
 		bufs[i]->hash.usr = 0;
 
-	rte_distributor_process(db, bufs, BURST);
+	processed = 0;
+	while (processed < BURST)
+		processed += rte_distributor_process(db, &bufs[processed],
+			BURST - processed);
+
 	count = 0;
 	do {
 
@@ -304,6 +309,7 @@ sanity_test_with_mbuf_alloc(struct worker_params *wp, struct rte_mempool *p)
 	struct rte_distributor *d = wp->dist;
 	unsigned i;
 	struct rte_mbuf *bufs[BURST];
+	unsigned int processed;
 
 	printf("=== Sanity test with mbuf alloc/free (%s) ===\n", wp->name);
 
@@ -316,7 +322,10 @@ sanity_test_with_mbuf_alloc(struct worker_params *wp, struct rte_mempool *p)
 			bufs[j]->hash.usr = (i+j) << 1;
 		}
 
-		rte_distributor_process(d, bufs, BURST);
+		processed = 0;
+		while (processed < BURST)
+			processed += rte_distributor_process(d,
+				&bufs[processed], BURST - processed);
 	}
 
 	rte_distributor_flush(d);
@@ -409,6 +418,7 @@ sanity_test_with_worker_shutdown(struct worker_params *wp,
 	struct rte_mbuf *bufs2[BURST];
 	unsigned int i;
 	unsigned int failed = 0;
+	unsigned int processed = 0;
 
 	printf("=== Sanity test of worker shutdown ===\n");
 
@@ -426,7 +436,10 @@ sanity_test_with_worker_shutdown(struct worker_params *wp,
 	for (i = 0; i < BURST; i++)
 		bufs[i]->hash.usr = 1;
 
-	rte_distributor_process(d, bufs, BURST);
+	processed = 0;
+	while (processed < BURST)
+		processed += rte_distributor_process(d, &bufs[processed],
+			BURST - processed);
 	rte_distributor_flush(d);
 
 	/* at this point, we will have processed some packets and have a full
@@ -488,6 +501,7 @@ test_flush_with_worker_shutdown(struct worker_params *wp,
 	struct rte_mbuf *bufs[BURST];
 	unsigned int i;
 	unsigned int failed = 0;
+	unsigned int processed;
 
 	printf("=== Test flush fn with worker shutdown (%s) ===\n", wp->name);
 
@@ -502,7 +516,10 @@ test_flush_with_worker_shutdown(struct worker_params *wp,
 	for (i = 0; i < BURST; i++)
 		bufs[i]->hash.usr = 0;
 
-	rte_distributor_process(d, bufs, BURST);
+	processed = 0;
+	while (processed < BURST)
+		processed += rte_distributor_process(d, &bufs[processed],
+			BURST - processed);
 	/* at this point, we will have processed some packets and have a full
 	 * backlog for the other ones at worker 0.
 	 */
@@ -584,6 +601,7 @@ sanity_mark_test(struct worker_params *wp, struct rte_mempool *p)
 	unsigned int i, count, id;
 	unsigned int sorted[buf_count], seq;
 	unsigned int failed = 0;
+	unsigned int processed;
 
 	printf("=== Marked packets test ===\n");
 	clear_packet_count();
@@ -614,7 +632,11 @@ sanity_mark_test(struct worker_params *wp, struct rte_mempool *p)
 
 	count = 0;
 	for (i = 0; i < buf_count/burst; i++) {
-		rte_distributor_process(db, &bufs[i * burst], burst);
+		processed = 0;
+		while (processed < burst)
+			processed += rte_distributor_process(db,
+				&bufs[i * burst + processed],
+				burst - processed);
 		count += rte_distributor_returned_pkts(db, &returns[count],
 			buf_count - count);
 	}
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v5 00/15] fix distributor synchronization issues
  2020-10-10  8:12                             ` David Marchand
  2020-10-10  8:15                               ` David Marchand
@ 2020-10-10 16:27                               ` Lukasz Wojciechowski
  1 sibling, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-10 16:27 UTC (permalink / raw)
  To: David Marchand
  Cc: David Hunt, Honnappa Nagarahalli, dev, Sarosh Arif,
	"'Lukasz Wojciechowski'",


W dniu 10.10.2020 o 10:12, David Marchand pisze:
> Hello Lukasz,
>
> On Sat, Oct 10, 2020 at 1:26 AM Lukasz Wojciechowski
> <l.wojciechow@partner.samsung.com> wrote:
>> W dniu 09.10.2020 o 23:41, Lukasz Wojciechowski pisze:
>> More bad news - same issue just appeared on travis for v6.
>> Good news we can reproduce it.
>>
>> Is there a way to delegate a job for travis other way than sending a new patch version?
> You just need to fork dpdk in github, then setup travis.
> Travis will get triggered on push.
> I can help offlist if needed.

Thank you

I managed to reproduce the issue by stressing my machine's CPUs and
memory.

The issue was caused by the slow start of the worker threads, which
didn't reach the point where they request packets; because of that
they were treated as not activated. The distributor thread therefore
didn't send any packets, but waited in an infinite loop until packets
were returned from the workers.

I pushed v7 of the series with an additional patch fixing that by
running rte_distributor_process() in a loop until it manages to send
all packets to the workers.

>
>
-- 
Lukasz Wojciechowski
Principal Software Engineer

Samsung R&D Institute Poland
Samsung Electronics
Office +48 22 377 88 25
l.wojciechow@partner.samsung.com


^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v7 16/16] test/distributor: ensure all packets are delivered
  2020-10-10 16:05                             ` [dpdk-dev] [PATCH v7 16/16] test/distributor: ensure all packets are delivered Lukasz Wojciechowski
@ 2020-10-12  7:46                               ` David Hunt
  0 siblings, 0 replies; 164+ messages in thread
From: David Hunt @ 2020-10-12  7:46 UTC (permalink / raw)
  To: Lukasz Wojciechowski; +Cc: dev

Hi Lukasz,

On 10/10/2020 5:05 PM, Lukasz Wojciechowski wrote:
> In all distributor tests there is a chance that tests
> will send packets to distributor with rte_distributor_process()
> before workers are started and requested for packets.
>
> This patch ensures that all packets are delivered to workers
> by calling rte_distributor_process() in loop until number
> of successfully processed packets reaches required by test.
> Change is applied to every first call in test case.
>
> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
> ---
>   app/test/test_distributor.c | 32 +++++++++++++++++++++++++++-----
>   1 file changed, 27 insertions(+), 5 deletions(-)
>
> diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c


Thanks for the quick turnaround.

Acked-by: David Hunt <david.hunt@intel.com>



^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v7 01/16] distributor: fix missing handshake synchronization
  2020-10-10 16:04                             ` [dpdk-dev] [PATCH v7 01/16] distributor: fix missing handshake synchronization Lukasz Wojciechowski
@ 2020-10-15 23:47                               ` Honnappa Nagarahalli
  2020-10-17  3:13                                 ` Lukasz Wojciechowski
  0 siblings, 1 reply; 164+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-15 23:47 UTC (permalink / raw)
  To: Lukasz Wojciechowski, David Hunt, Bruce Richardson
  Cc: dev, stable, nd, Honnappa Nagarahalli, nd

<snip>

> 
> rte_distributor_return_pkt function which is run on worker cores must wait
> for distributor core to clear handshake on retptr64 before using those
> buffers. While the handshake is set distributor core controls buffers and any
> operations on worker side might overwrite buffers which are unread yet.
> Same situation appears in the legacy single distributor. Function
> rte_distributor_return_pkt_single shouldn't modify the bufptr64 until
> handshake on it is cleared by distributor lcore.
> 
> Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
> Cc: david.hunt@intel.com
> Cc: stable@dpdk.org
> 
> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
> Acked-by: David Hunt <david.hunt@intel.com>
> ---
>  lib/librte_distributor/rte_distributor.c        | 14 ++++++++++++++
>  lib/librte_distributor/rte_distributor_single.c |  4 ++++
>  2 files changed, 18 insertions(+)
> 
> diff --git a/lib/librte_distributor/rte_distributor.c
> b/lib/librte_distributor/rte_distributor.c
> index 1c047f065..89493c331 100644
> --- a/lib/librte_distributor/rte_distributor.c
> +++ b/lib/librte_distributor/rte_distributor.c
> @@ -160,6 +160,7 @@ rte_distributor_return_pkt(struct rte_distributor *d,
> {
>  	struct rte_distributor_buffer *buf = &d->bufs[worker_id];
>  	unsigned int i;
> +	volatile int64_t *retptr64;
volatile is not needed here as use of __atomic_load_n implies volatile inherently.

> 
>  	if (unlikely(d->alg_type == RTE_DIST_ALG_SINGLE)) {
>  		if (num == 1)
> @@ -169,6 +170,19 @@ rte_distributor_return_pkt(struct rte_distributor *d,
>  			return -EINVAL;
>  	}
> 
> +	retptr64 = &(buf->retptr64[0]);
> +	/* Spin while handshake bits are set (scheduler clears it).
> +	 * Sync with worker on GET_BUF flag.
> +	 */
> +	while (unlikely(__atomic_load_n(retptr64, __ATOMIC_ACQUIRE)
nit: we could avoid using the temp variable retptr64 and use '&buf->retptr64[0]' directly.
RELAXED memory order should be good as the thread_fence below will ensure that this load does not sink.

[1] 
> +			& RTE_DISTRIB_GET_BUF)) {
> +		rte_pause();
> +		uint64_t t = rte_rdtsc()+100;
> +
> +		while (rte_rdtsc() < t)
> +			rte_pause();
> +	}
> +
>  	/* Sync with distributor to acquire retptrs */
>  	__atomic_thread_fence(__ATOMIC_ACQUIRE);
>  	for (i = 0; i < RTE_DIST_BURST_SIZE; i++) diff --git
> a/lib/librte_distributor/rte_distributor_single.c
> b/lib/librte_distributor/rte_distributor_single.c
> index abaf7730c..f4725b1d0 100644
> --- a/lib/librte_distributor/rte_distributor_single.c
> +++ b/lib/librte_distributor/rte_distributor_single.c
> @@ -74,6 +74,10 @@ rte_distributor_return_pkt_single(struct
> rte_distributor_single *d,
>  	union rte_distributor_buffer_single *buf = &d->bufs[worker_id];
>  	uint64_t req = (((int64_t)(uintptr_t)oldpkt) <<
> RTE_DISTRIB_FLAG_BITS)
>  			| RTE_DISTRIB_RETURN_BUF;
> +	while (unlikely(__atomic_load_n(&buf->bufptr64,
> __ATOMIC_RELAXED)
> +			& RTE_DISTRIB_FLAGS_MASK))
> +		rte_pause();
> +
>  	/* Sync with distributor on RETURN_BUF flag. */
>  	__atomic_store_n(&(buf->bufptr64), req, __ATOMIC_RELEASE);
>  	return 0;
> --
> 2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread
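
Taking the two review comments together (no volatile temporary, and a RELAXED
polling load ordered by the acquire fence that already follows the loop), the
worker-side wait could be written as below. This is a sketch of the reviewer's
suggestion, not necessarily the code that was finally merged:

	/* Spin while the handshake bit is set (the distributor clears it).
	 * A RELAXED load suffices: the acquire fence below keeps the later
	 * reads of retptr64[] from being hoisted above the loop, and
	 * __atomic_load_n() already provides volatile semantics.
	 */
	while (unlikely(__atomic_load_n(&buf->retptr64[0], __ATOMIC_RELAXED)
			& RTE_DISTRIB_GET_BUF))
		rte_pause();

	/* Sync with distributor to acquire retptrs */
	__atomic_thread_fence(__ATOMIC_ACQUIRE);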

* Re: [dpdk-dev] [PATCH v7 09/16] test/distributor: collect return mbufs
  2020-10-10 16:05                             ` [dpdk-dev] [PATCH v7 09/16] test/distributor: collect return mbufs Lukasz Wojciechowski
@ 2020-10-16  4:53                               ` Honnappa Nagarahalli
  2020-10-16  5:13                               ` Honnappa Nagarahalli
  1 sibling, 0 replies; 164+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-16  4:53 UTC (permalink / raw)
  To: Lukasz Wojciechowski, David Hunt, Bruce Richardson
  Cc: dev, stable, nd, Honnappa Nagarahalli, nd



> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Lukasz Wojciechowski
> Sent: Saturday, October 10, 2020 11:05 AM
> To: David Hunt <david.hunt@intel.com>; Bruce Richardson
> <bruce.richardson@intel.com>
> Cc: dev@dpdk.org; l.wojciechow@partner.samsung.com; stable@dpdk.org
> Subject: [dpdk-dev] [PATCH v7 09/16] test/distributor: collect return mbufs
> 
> During the quit_workers function the distributor's main core processes some
> packets to wake up pending worker cores so they can quit.
> As quit_workers also acts as a cleanup procedure for the next test case, it
> should also collect the packets returned by the workers'
> handlers, so that the cyclic buffer of returned packets in the distributor
> remains empty.
> 
> Fixes: c3eabff124e6 ("distributor: add unit tests")
> Cc: bruce.richardson@intel.com
> Fixes: c0de0eb82e40 ("distributor: switch over to new API")
> Cc: david.hunt@intel.com
> Cc: stable@dpdk.org
> 
> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
> Acked-by: David Hunt <david.hunt@intel.com>
> ---
>  app/test/test_distributor.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c index
> 06e01ff9d..ed03040d1 100644
> --- a/app/test/test_distributor.c
> +++ b/app/test/test_distributor.c
> @@ -590,6 +590,7 @@ quit_workers(struct worker_params *wp, struct
> rte_mempool *p)
>  	const unsigned num_workers = rte_lcore_count() - 1;
>  	unsigned i;
>  	struct rte_mbuf *bufs[RTE_MAX_LCORE];
> +	struct rte_mbuf *returns[RTE_MAX_LCORE];
>  	if (rte_mempool_get_bulk(p, (void *)bufs, num_workers) != 0) {
>  		printf("line %d: Error getting mbufs from pool\n", __LINE__);
>  		return;
> @@ -605,6 +606,10 @@ quit_workers(struct worker_params *wp, struct
> rte_mempool *p)
>  	rte_distributor_flush(d);
>  	rte_eal_mp_wait_lcore();
> 
> +	while (rte_distributor_returned_pkts(d, returns, RTE_MAX_LCORE))
> +		;
> +
> +	rte_distributor_clear_returns(d);
>  	rte_mempool_put_bulk(p, (void *)bufs, num_workers);
> 
>  	quit = 0;
> --
> 2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread
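
Condensed, the cleanup added by this patch drains the distributor's cyclic
returns buffer until it is empty and then clears it, so the next test case
starts from a clean state (identifiers as in the patch above):

	/* collect everything the workers returned during quit_workers() */
	while (rte_distributor_returned_pkts(d, returns, RTE_MAX_LCORE) > 0)
		;
	rte_distributor_clear_returns(d);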

* Re: [dpdk-dev] [PATCH v7 08/16] test/distributor: fix freeing mbufs
  2020-10-10 16:04                             ` [dpdk-dev] [PATCH v7 08/16] test/distributor: fix freeing mbufs Lukasz Wojciechowski
@ 2020-10-16  5:12                               ` Honnappa Nagarahalli
  2020-10-17  3:28                                 ` Lukasz Wojciechowski
  0 siblings, 1 reply; 164+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-16  5:12 UTC (permalink / raw)
  To: Lukasz Wojciechowski, David Hunt, Bruce Richardson
  Cc: dev, stable, nd, Honnappa Nagarahalli, nd

<snip>

> 
> The sanity tests with mbuf alloc and the shutdown tests assume that mbufs
> passed to worker cores are freed in the handlers.
> Such packets should not be returned to the distributor's main core. The only
> packets that should be returned are the ones sent after completion of
> the tests in the quit_workers function.
> 
> This patch stops returning mbufs to the distributor's core.
> In the case of the shutdown tests it is impossible to determine how the
> worker and distributor threads would synchronize.
> Packets used by the tests should be freed, while packets used during
> quit_workers() shouldn't. That's why returning mbufs to the mempool is
> moved from the worker threads to the test procedure run on the distributor
> thread.
> 
> Additionally this patch cleans up unused variables.
> 
> Fixes: c0de0eb82e40 ("distributor: switch over to new API")
> Cc: david.hunt@intel.com
> Cc: stable@dpdk.org
> 
> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
> Acked-by: David Hunt <david.hunt@intel.com>
> ---
>  app/test/test_distributor.c | 96 ++++++++++++++++++-------------------
>  1 file changed, 47 insertions(+), 49 deletions(-)
> 
> diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c index
> 838459392..06e01ff9d 100644
> --- a/app/test/test_distributor.c
> +++ b/app/test/test_distributor.c
> @@ -44,7 +44,7 @@ total_packet_count(void)
>  	unsigned i, count = 0;
>  	for (i = 0; i < worker_idx; i++)
>  		count +=
> __atomic_load_n(&worker_stats[i].handled_packets,
> -				__ATOMIC_ACQUIRE);
> +				__ATOMIC_RELAXED);
I think it is better to make this and other statistics changes below in commit 6/16. It will be in line with the commit log as well.

>  	return count;
>  }
> 
> @@ -55,7 +55,7 @@ clear_packet_count(void)
>  	unsigned int i;
>  	for (i = 0; i < RTE_MAX_LCORE; i++)
>  		__atomic_store_n(&worker_stats[i].handled_packets, 0,
> -			__ATOMIC_RELEASE);
> +			__ATOMIC_RELAXED);
>  }
> 
>  /* this is the basic worker function for sanity test @@ -67,20 +67,18 @@
> handle_work(void *arg)
>  	struct rte_mbuf *buf[8] __rte_cache_aligned;
>  	struct worker_params *wp = arg;
>  	struct rte_distributor *db = wp->dist;
> -	unsigned int count = 0, num;
> +	unsigned int num;
>  	unsigned int id = __atomic_fetch_add(&worker_idx, 1,
> __ATOMIC_RELAXED);
> 
>  	num = rte_distributor_get_pkt(db, id, buf, NULL, 0);
>  	while (!quit) {
>  		__atomic_fetch_add(&worker_stats[id].handled_packets,
> num,
> -				__ATOMIC_ACQ_REL);
> -		count += num;
> +				__ATOMIC_RELAXED);
>  		num = rte_distributor_get_pkt(db, id,
>  				buf, buf, num);
>  	}
>  	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
> -			__ATOMIC_ACQ_REL);
> -	count += num;
> +			__ATOMIC_RELAXED);
>  	rte_distributor_return_pkt(db, id, buf, num);
>  	return 0;
>  }
> @@ -136,7 +134,7 @@ sanity_test(struct worker_params *wp, struct
> rte_mempool *p)
>  	for (i = 0; i < rte_lcore_count() - 1; i++)
>  		printf("Worker %u handled %u packets\n", i,
>  			__atomic_load_n(&worker_stats[i].handled_packets,
> -					__ATOMIC_ACQUIRE));
> +					__ATOMIC_RELAXED));
>  	printf("Sanity test with all zero hashes done.\n");
> 
>  	/* pick two flows and check they go correctly */ @@ -163,7 +161,7
> @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
>  			printf("Worker %u handled %u packets\n", i,
>  				__atomic_load_n(
>  					&worker_stats[i].handled_packets,
> -					__ATOMIC_ACQUIRE));
> +					__ATOMIC_RELAXED));
>  		printf("Sanity test with two hash values done\n");
>  	}
> 
> @@ -190,7 +188,7 @@ sanity_test(struct worker_params *wp, struct
> rte_mempool *p)
>  	for (i = 0; i < rte_lcore_count() - 1; i++)
>  		printf("Worker %u handled %u packets\n", i,
>  			__atomic_load_n(&worker_stats[i].handled_packets,
> -					__ATOMIC_ACQUIRE));
> +					__ATOMIC_RELAXED));
>  	printf("Sanity test with non-zero hashes done\n");
> 
>  	rte_mempool_put_bulk(p, (void *)bufs, BURST); @@ -276,23
> +274,20 @@ handle_work_with_free_mbufs(void *arg)
>  	struct rte_mbuf *buf[8] __rte_cache_aligned;
>  	struct worker_params *wp = arg;
>  	struct rte_distributor *d = wp->dist;
> -	unsigned int count = 0;
>  	unsigned int i;
>  	unsigned int num;
>  	unsigned int id = __atomic_fetch_add(&worker_idx, 1,
> __ATOMIC_RELAXED);
> 
>  	num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
>  	while (!quit) {
> -		count += num;
>  		__atomic_fetch_add(&worker_stats[id].handled_packets,
> num,
> -				__ATOMIC_ACQ_REL);
> +				__ATOMIC_RELAXED);
>  		for (i = 0; i < num; i++)
>  			rte_pktmbuf_free(buf[i]);
>  		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
>  	}
> -	count += num;
>  	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
> -			__ATOMIC_ACQ_REL);
> +			__ATOMIC_RELAXED);
>  	rte_distributor_return_pkt(d, id, buf, num);
>  	return 0;
>  }
> @@ -318,7 +313,6 @@ sanity_test_with_mbuf_alloc(struct worker_params
> *wp, struct rte_mempool *p)
>  			rte_distributor_process(d, NULL, 0);
>  		for (j = 0; j < BURST; j++) {
>  			bufs[j]->hash.usr = (i+j) << 1;
> -			rte_mbuf_refcnt_set(bufs[j], 1);
>  		}
> 
>  		rte_distributor_process(d, bufs, BURST); @@ -342,15 +336,10
> @@ sanity_test_with_mbuf_alloc(struct worker_params *wp, struct
> rte_mempool *p)  static int  handle_work_for_shutdown_test(void *arg)  {
> -	struct rte_mbuf *pkt = NULL;
>  	struct rte_mbuf *buf[8] __rte_cache_aligned;
>  	struct worker_params *wp = arg;
>  	struct rte_distributor *d = wp->dist;
> -	unsigned int count = 0;
>  	unsigned int num;
> -	unsigned int total = 0;
> -	unsigned int i;
> -	unsigned int returned = 0;
>  	unsigned int zero_id = 0;
>  	unsigned int zero_unset;
>  	const unsigned int id = __atomic_fetch_add(&worker_idx, 1, @@ -
> 368,11 +357,8 @@ handle_work_for_shutdown_test(void *arg)
>  	/* wait for quit single globally, or for worker zero, wait
>  	 * for zero_quit */
>  	while (!quit && !(id == zero_id && zero_quit)) {
> -		count += num;
>  		__atomic_fetch_add(&worker_stats[id].handled_packets,
> num,
> -				__ATOMIC_ACQ_REL);
> -		for (i = 0; i < num; i++)
> -			rte_pktmbuf_free(buf[i]);
> +				__ATOMIC_RELAXED);
>  		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
> 
>  		if (num > 0) {
> @@ -381,15 +367,12 @@ handle_work_for_shutdown_test(void *arg)
>  				false, __ATOMIC_ACQ_REL,
> __ATOMIC_ACQUIRE);
>  		}
>  		zero_id = __atomic_load_n(&zero_idx,
> __ATOMIC_ACQUIRE);
> -
> -		total += num;
>  	}
> -	count += num;
> -	returned = rte_distributor_return_pkt(d, id, buf, num);
> -
>  	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
> -			__ATOMIC_ACQ_REL);
> +			__ATOMIC_RELAXED);
>  	if (id == zero_id) {
> +		rte_distributor_return_pkt(d, id, NULL, 0);
> +
>  		/* for worker zero, allow it to restart to pick up last packet
>  		 * when all workers are shutting down.
>  		 */
> @@ -400,15 +383,11 @@ handle_work_for_shutdown_test(void *arg)
> 
>  		while (!quit) {
> 
> 	__atomic_fetch_add(&worker_stats[id].handled_packets,
> -					num, __ATOMIC_ACQ_REL);
> -			count += num;
> -			rte_pktmbuf_free(pkt);
> +					num, __ATOMIC_RELAXED);
>  			num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
>  		}
> -		returned = rte_distributor_return_pkt(d,
> -				id, buf, num);
> -		printf("Num returned = %d\n", returned);
>  	}
> +	rte_distributor_return_pkt(d, id, buf, num);
>  	return 0;
>  }
> 
> @@ -424,7 +403,9 @@ sanity_test_with_worker_shutdown(struct
> worker_params *wp,  {
>  	struct rte_distributor *d = wp->dist;
>  	struct rte_mbuf *bufs[BURST];
> -	unsigned i;
> +	struct rte_mbuf *bufs2[BURST];
> +	unsigned int i;
> +	unsigned int failed = 0;
> 
>  	printf("=== Sanity test of worker shutdown ===\n");
> 
> @@ -450,16 +431,17 @@ sanity_test_with_worker_shutdown(struct
> worker_params *wp,
>  	 */
> 
>  	/* get more buffers to queue up, again setting them to the same
> flow */
> -	if (rte_mempool_get_bulk(p, (void *)bufs, BURST) != 0) {
> +	if (rte_mempool_get_bulk(p, (void *)bufs2, BURST) != 0) {
>  		printf("line %d: Error getting mbufs from pool\n", __LINE__);
> +		rte_mempool_put_bulk(p, (void *)bufs, BURST);
>  		return -1;
>  	}
>  	for (i = 0; i < BURST; i++)
> -		bufs[i]->hash.usr = 1;
> +		bufs2[i]->hash.usr = 1;
> 
>  	/* get worker zero to quit */
>  	zero_quit = 1;
> -	rte_distributor_process(d, bufs, BURST);
> +	rte_distributor_process(d, bufs2, BURST);
> 
>  	/* flush the distributor */
>  	rte_distributor_flush(d);
> @@ -468,15 +450,21 @@ sanity_test_with_worker_shutdown(struct
> worker_params *wp,
>  	for (i = 0; i < rte_lcore_count() - 1; i++)
>  		printf("Worker %u handled %u packets\n", i,
>  			__atomic_load_n(&worker_stats[i].handled_packets,
> -					__ATOMIC_ACQUIRE));
> +					__ATOMIC_RELAXED));
> 
>  	if (total_packet_count() != BURST * 2) {
>  		printf("Line %d: Error, not all packets flushed. "
>  				"Expected %u, got %u\n",
>  				__LINE__, BURST * 2, total_packet_count());
> -		return -1;
> +		failed = 1;
>  	}
> 
> +	rte_mempool_put_bulk(p, (void *)bufs, BURST);
> +	rte_mempool_put_bulk(p, (void *)bufs2, BURST);
> +
> +	if (failed)
> +		return -1;
> +
>  	printf("Sanity test with worker shutdown passed\n\n");
>  	return 0;
>  }
> @@ -490,7 +478,8 @@ test_flush_with_worker_shutdown(struct
> worker_params *wp,  {
>  	struct rte_distributor *d = wp->dist;
>  	struct rte_mbuf *bufs[BURST];
> -	unsigned i;
> +	unsigned int i;
> +	unsigned int failed = 0;
> 
>  	printf("=== Test flush fn with worker shutdown (%s) ===\n", wp-
> >name);
> 
> @@ -522,15 +511,20 @@ test_flush_with_worker_shutdown(struct
> worker_params *wp,
>  	for (i = 0; i < rte_lcore_count() - 1; i++)
>  		printf("Worker %u handled %u packets\n", i,
>  			__atomic_load_n(&worker_stats[i].handled_packets,
> -					__ATOMIC_ACQUIRE));
> +					__ATOMIC_RELAXED));
> 
>  	if (total_packet_count() != BURST) {
>  		printf("Line %d: Error, not all packets flushed. "
>  				"Expected %u, got %u\n",
>  				__LINE__, BURST, total_packet_count());
> -		return -1;
> +		failed = 1;
>  	}
> 
> +	rte_mempool_put_bulk(p, (void *)bufs, BURST);
> +
> +	if (failed)
> +		return -1;
> +
>  	printf("Flush test with worker shutdown passed\n\n");
>  	return 0;
>  }
> @@ -596,7 +590,10 @@ quit_workers(struct worker_params *wp, struct
> rte_mempool *p)
>  	const unsigned num_workers = rte_lcore_count() - 1;
>  	unsigned i;
>  	struct rte_mbuf *bufs[RTE_MAX_LCORE];
> -	rte_mempool_get_bulk(p, (void *)bufs, num_workers);
> +	if (rte_mempool_get_bulk(p, (void *)bufs, num_workers) != 0) {
> +		printf("line %d: Error getting mbufs from pool\n", __LINE__);
> +		return;
> +	}
> 
>  	zero_quit = 0;
>  	quit = 1;
> @@ -604,11 +601,12 @@ quit_workers(struct worker_params *wp, struct
> rte_mempool *p)
>  		bufs[i]->hash.usr = i << 1;
>  	rte_distributor_process(d, bufs, num_workers);
> 
> -	rte_mempool_put_bulk(p, (void *)bufs, num_workers);
> -
>  	rte_distributor_process(d, NULL, 0);
>  	rte_distributor_flush(d);
>  	rte_eal_mp_wait_lcore();
> +
> +	rte_mempool_put_bulk(p, (void *)bufs, num_workers);
> +
>  	quit = 0;
>  	worker_idx = 0;
>  	zero_idx = RTE_MAX_LCORE;
> --
> 2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread
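
One pattern worth noting from the patch above: on a failed check the test no
longer returns immediately (which leaked the mbufs), but records the failure,
returns all buffers to the mempool, and only then reports the error. In
condensed form (identifiers as in the patch):

	unsigned int failed = 0;

	if (total_packet_count() != BURST * 2) {
		printf("Line %d: Error, not all packets flushed\n", __LINE__);
		failed = 1;	/* defer the error return */
	}

	/* always hand the mbufs back before returning */
	rte_mempool_put_bulk(p, (void *)bufs, BURST);
	rte_mempool_put_bulk(p, (void *)bufs2, BURST);

	if (failed)
		return -1;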

* Re: [dpdk-dev] [PATCH v7 06/16] test/distributor: synchronize lcores statistics
  2020-10-10 16:04                             ` [dpdk-dev] [PATCH v7 06/16] test/distributor: synchronize lcores statistics Lukasz Wojciechowski
@ 2020-10-16  5:13                               ` Honnappa Nagarahalli
  2020-10-17  3:23                                 ` Lukasz Wojciechowski
  0 siblings, 1 reply; 164+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-16  5:13 UTC (permalink / raw)
  To: Lukasz Wojciechowski, David Hunt, Bruce Richardson
  Cc: dev, stable, nd, Honnappa Nagarahalli, nd

Hi Lukasz,
	I see that in commit 8/16, the same code is changed again (updating the counters using the RELAXED memory order). It is better to pull the statistics changes from 8/16 into this commit.

Thanks,
Honnappa

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Lukasz Wojciechowski
> Sent: Saturday, October 10, 2020 11:05 AM
> To: David Hunt <david.hunt@intel.com>; Bruce Richardson
> <bruce.richardson@intel.com>
> Cc: dev@dpdk.org; l.wojciechow@partner.samsung.com; stable@dpdk.org
> Subject: [dpdk-dev] [PATCH v7 06/16] test/distributor: synchronize lcores
> statistics
> 
> Statistics of handled packets are cleared and read on the main lcore, while
> they are increased in the workers' handlers on different lcores.
> 
> Without synchronization they occasionally showed invalid values.
> This patch uses atomic acquire/release mechanisms to synchronize.
> 
> Fixes: c3eabff124e6 ("distributor: add unit tests")
> Cc: bruce.richardson@intel.com
> Cc: stable@dpdk.org
> 
> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
> Acked-by: David Hunt <david.hunt@intel.com>
> ---
>  app/test/test_distributor.c | 43 +++++++++++++++++++++++++------------
>  1 file changed, 29 insertions(+), 14 deletions(-)
> 
> diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c index
> 6cd7a2edd..838459392 100644
> --- a/app/test/test_distributor.c
> +++ b/app/test/test_distributor.c
> @@ -43,7 +43,8 @@ total_packet_count(void)  {
>  	unsigned i, count = 0;
>  	for (i = 0; i < worker_idx; i++)
> -		count += worker_stats[i].handled_packets;
> +		count +=
> __atomic_load_n(&worker_stats[i].handled_packets,
> +				__ATOMIC_ACQUIRE);
For ex: this line is changed in commit 8/16 as well. It is better to pull the changes from 8/16 to this commit.

>  	return count;
>  }
> 
> @@ -51,7 +52,10 @@ total_packet_count(void)  static inline void
>  clear_packet_count(void)
>  {
> -	memset(&worker_stats, 0, sizeof(worker_stats));
> +	unsigned int i;
> +	for (i = 0; i < RTE_MAX_LCORE; i++)
> +		__atomic_store_n(&worker_stats[i].handled_packets, 0,
> +			__ATOMIC_RELEASE);
>  }
> 
>  /* this is the basic worker function for sanity test @@ -69,13 +73,13 @@
> handle_work(void *arg)
>  	num = rte_distributor_get_pkt(db, id, buf, NULL, 0);
>  	while (!quit) {
>  		__atomic_fetch_add(&worker_stats[id].handled_packets,
> num,
> -				__ATOMIC_RELAXED);
> +				__ATOMIC_ACQ_REL);
>  		count += num;
>  		num = rte_distributor_get_pkt(db, id,
>  				buf, buf, num);
>  	}
>  	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
> -			__ATOMIC_RELAXED);
> +			__ATOMIC_ACQ_REL);
>  	count += num;
>  	rte_distributor_return_pkt(db, id, buf, num);
>  	return 0;
> @@ -131,7 +135,8 @@ sanity_test(struct worker_params *wp, struct
> rte_mempool *p)
> 
>  	for (i = 0; i < rte_lcore_count() - 1; i++)
>  		printf("Worker %u handled %u packets\n", i,
> -				worker_stats[i].handled_packets);
> +			__atomic_load_n(&worker_stats[i].handled_packets,
> +					__ATOMIC_ACQUIRE));
>  	printf("Sanity test with all zero hashes done.\n");
> 
>  	/* pick two flows and check they go correctly */ @@ -156,7 +161,9
> @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
> 
>  		for (i = 0; i < rte_lcore_count() - 1; i++)
>  			printf("Worker %u handled %u packets\n", i,
> -					worker_stats[i].handled_packets);
> +				__atomic_load_n(
> +					&worker_stats[i].handled_packets,
> +					__ATOMIC_ACQUIRE));
>  		printf("Sanity test with two hash values done\n");
>  	}
> 
> @@ -182,7 +189,8 @@ sanity_test(struct worker_params *wp, struct
> rte_mempool *p)
> 
>  	for (i = 0; i < rte_lcore_count() - 1; i++)
>  		printf("Worker %u handled %u packets\n", i,
> -				worker_stats[i].handled_packets);
> +			__atomic_load_n(&worker_stats[i].handled_packets,
> +					__ATOMIC_ACQUIRE));
>  	printf("Sanity test with non-zero hashes done\n");
> 
>  	rte_mempool_put_bulk(p, (void *)bufs, BURST); @@ -275,14
> +283,16 @@ handle_work_with_free_mbufs(void *arg)
> 
>  	num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
>  	while (!quit) {
> -		worker_stats[id].handled_packets += num;
>  		count += num;
> +		__atomic_fetch_add(&worker_stats[id].handled_packets,
> num,
> +				__ATOMIC_ACQ_REL);
>  		for (i = 0; i < num; i++)
>  			rte_pktmbuf_free(buf[i]);
>  		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
>  	}
> -	worker_stats[id].handled_packets += num;
>  	count += num;
> +	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
> +			__ATOMIC_ACQ_REL);
>  	rte_distributor_return_pkt(d, id, buf, num);
>  	return 0;
>  }
> @@ -358,8 +368,9 @@ handle_work_for_shutdown_test(void *arg)
>  	/* wait for quit single globally, or for worker zero, wait
>  	 * for zero_quit */
>  	while (!quit && !(id == zero_id && zero_quit)) {
> -		worker_stats[id].handled_packets += num;
>  		count += num;
> +		__atomic_fetch_add(&worker_stats[id].handled_packets,
> num,
> +				__ATOMIC_ACQ_REL);
>  		for (i = 0; i < num; i++)
>  			rte_pktmbuf_free(buf[i]);
>  		num = rte_distributor_get_pkt(d, id, buf, NULL, 0); @@ -
> 373,10 +384,11 @@ handle_work_for_shutdown_test(void *arg)
> 
>  		total += num;
>  	}
> -	worker_stats[id].handled_packets += num;
>  	count += num;
>  	returned = rte_distributor_return_pkt(d, id, buf, num);
> 
> +	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
> +			__ATOMIC_ACQ_REL);
>  	if (id == zero_id) {
>  		/* for worker zero, allow it to restart to pick up last packet
>  		 * when all workers are shutting down.
> @@ -387,7 +399,8 @@ handle_work_for_shutdown_test(void *arg)
>  		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
> 
>  		while (!quit) {
> -			worker_stats[id].handled_packets += num;
> +
> 	__atomic_fetch_add(&worker_stats[id].handled_packets,
> +					num, __ATOMIC_ACQ_REL);
>  			count += num;
>  			rte_pktmbuf_free(pkt);
>  			num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
> @@ -454,7 +467,8 @@ sanity_test_with_worker_shutdown(struct
> worker_params *wp,
> 
>  	for (i = 0; i < rte_lcore_count() - 1; i++)
>  		printf("Worker %u handled %u packets\n", i,
> -				worker_stats[i].handled_packets);
> +			__atomic_load_n(&worker_stats[i].handled_packets,
> +					__ATOMIC_ACQUIRE));
> 
>  	if (total_packet_count() != BURST * 2) {
>  		printf("Line %d: Error, not all packets flushed. "
> @@ -507,7 +521,8 @@ test_flush_with_worker_shutdown(struct
> worker_params *wp,
>  	zero_quit = 0;
>  	for (i = 0; i < rte_lcore_count() - 1; i++)
>  		printf("Worker %u handled %u packets\n", i,
> -				worker_stats[i].handled_packets);
> +			__atomic_load_n(&worker_stats[i].handled_packets,
> +					__ATOMIC_ACQUIRE));
> 
>  	if (total_packet_count() != BURST) {
>  		printf("Line %d: Error, not all packets flushed. "
> --
> 2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread
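
The memory-ordering argument running through this sub-thread boils down to
one pair of operations (a sketch using the test's own worker_stats; RELAXED
is sufficient because the counter itself is the only data the two threads
exchange):

	/* worker lcore: the add is atomic, and nothing else is published
	 * together with it, so no release ordering is needed */
	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
			__ATOMIC_RELAXED);

	/* main lcore: likewise, a RELAXED load is enough to read it */
	count += __atomic_load_n(&worker_stats[i].handled_packets,
			__ATOMIC_RELAXED);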

* Re: [dpdk-dev] [PATCH v7 09/16] test/distributor: collect return mbufs
  2020-10-10 16:05                             ` [dpdk-dev] [PATCH v7 09/16] test/distributor: collect return mbufs Lukasz Wojciechowski
  2020-10-16  4:53                               ` Honnappa Nagarahalli
@ 2020-10-16  5:13                               ` Honnappa Nagarahalli
  2020-10-17  3:29                                 ` Lukasz Wojciechowski
  1 sibling, 1 reply; 164+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-16  5:13 UTC (permalink / raw)
  To: Lukasz Wojciechowski, David Hunt, Bruce Richardson
  Cc: dev, stable, nd, Honnappa Nagarahalli, nd

<snip>
> 
> During quit_workers function distributor's main core processes some packets
> to wake up pending worker cores so they can quit.
> As quit_workers acts also as a cleanup procedure for next test case it should
> also collect these packages returned by workers'
nit                              ^^^^^^^^ packets

> handlers, so the cyclic buffer with returned packets in distributor remains
> empty.
> 
> Fixes: c3eabff124e6 ("distributor: add unit tests")
> Cc: bruce.richardson@intel.com
> Fixes: c0de0eb82e40 ("distributor: switch over to new API")
> Cc: david.hunt@intel.com
> Cc: stable@dpdk.org
> 
> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
> Acked-by: David Hunt <david.hunt@intel.com>
> ---
>  app/test/test_distributor.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c index
> 06e01ff9d..ed03040d1 100644
> --- a/app/test/test_distributor.c
> +++ b/app/test/test_distributor.c
> @@ -590,6 +590,7 @@ quit_workers(struct worker_params *wp, struct
> rte_mempool *p)
>  	const unsigned num_workers = rte_lcore_count() - 1;
>  	unsigned i;
>  	struct rte_mbuf *bufs[RTE_MAX_LCORE];
> +	struct rte_mbuf *returns[RTE_MAX_LCORE];
>  	if (rte_mempool_get_bulk(p, (void *)bufs, num_workers) != 0) {
>  		printf("line %d: Error getting mbufs from pool\n", __LINE__);
>  		return;
> @@ -605,6 +606,10 @@ quit_workers(struct worker_params *wp, struct
> rte_mempool *p)
>  	rte_distributor_flush(d);
>  	rte_eal_mp_wait_lcore();
> 
> +	while (rte_distributor_returned_pkts(d, returns, RTE_MAX_LCORE))
> +		;
> +
> +	rte_distributor_clear_returns(d);
>  	rte_mempool_put_bulk(p, (void *)bufs, num_workers);
> 
>  	quit = 0;
> --
> 2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v4 2/8] test/distributor: synchronize lcores statistics
  2020-10-02 11:25                     ` Lukasz Wojciechowski
  2020-10-08 20:47                       ` Lukasz Wojciechowski
@ 2020-10-16  5:43                       ` Honnappa Nagarahalli
  2020-10-16 12:43                         ` Lukasz Wojciechowski
  1 sibling, 1 reply; 164+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-16  5:43 UTC (permalink / raw)
  To: Lukasz Wojciechowski, David Hunt, Bruce Richardson
  Cc: dev, stable, nd, Honnappa Nagarahalli, nd

<snip>

> 
> Hi Honnappa,
> 
> Many thanks for the review!
> 
> I'll write my answers here, not inline, as it would be easier to read them in
> one place, I think.
> So first of all I agree with you on 2 things:
> 1) all uses of statistics must be atomic, and the lack of that caused most of
> the problems
> 2) it would be better to replace the barrier and memset in
> clear_packet_count() with atomic stores as you suggested
> 
> So I will apply both of the above.
> 
> However I wasn't fully convinced about changing acquire/release to relaxed.
> It would be perfectly ok if it looked like this Herb Sutter example:
> https://youtu.be/KeLBd2EJLOU?t=4170
> But in his case the counters are cleared before the worker threads start and
> are printed out after they have completed.
> 
> In the case of the dpdk distributor tests both the worker and main cores are
> running at the same time. In the sanity_test, the statistics are cleared and
> verified a few times for different packet hashes. The worker cores are not
> stopped at this time and they continue their loops in the handle procedure.
> The verification done on the main core is an exchange of data, as the current
> statistics indicate how the test will turn out.
Agree. The key point we have to note is that the data that is exchanged between the two threads is already atomic (handled_packets is atomic).

> 
> So as I wasn't convinced, I ran some tests with both the relaxed and
> acquire/release modes, and they both fail :( The ratio of failures caused by
> statistics errors to the number of tests, over
> 200000 tests, was:
> for relaxed: 0,000790562
> for acq/rel: 0,000091321
> 
> 
> That's why I'm going to modify the tests in such a way that they would:
> 1) clear statistics
> 2) launch worker threads
> 3) run test
> 4) wait for workers procedures to complete
> 5) check stats, verify results and print them out
> 
> This way the main core will use (clear or verify) the stats only when there
> are no worker threads running. This would make things simpler and allow
> focusing on testing the distributor, not the tests. And of course relaxed
> mode would be enough!
Agree, this would be the only way to ensure that the main thread sees the correct statistics (just like in the video)
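
A sketch of that proposed ordering is below. The helper run_test_body() and
the expected count are hypothetical stand-ins (and SKIP_MAIN is spelled
SKIP_MASTER in pre-20.11 DPDK); the point is that the main lcore touches the
statistics only while no workers are running, so RELAXED accesses to the
counters are clearly sufficient:

	clear_packet_count();			/* 1) clear statistics */
	rte_eal_mp_remote_launch(handle_work,	/* 2) launch workers */
			wp, SKIP_MAIN);
	run_test_body(wp, p);			/* 3) run the test case */
	quit_workers(wp, p);			/* 4) wake and join workers */
	if (total_packet_count() != expected)	/* 5) verify and print */
		printf("Line %d: unexpected packet count\n", __LINE__);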

> 
> 
> Best regards
> Lukasz
> 
> 
> On 29.09.2020 at 07:49, Honnappa Nagarahalli wrote:
> > <snip>
> >
> >> Statistics of handled packets are cleared and read on main lcore,
> >> while they are increased in workers handlers on different lcores.
> >>
> >> Without synchronization they occasionally showed invalid values.
> >> This patch uses atomic acquire/release mechanisms to synchronize.
> > In general, load-acquire and store-release memory orderings are required
> while synchronizing data (that cannot be updated atomically) between
> threads. In this situation, making the counters atomic is enough.
> >
> >> Fixes: c3eabff124e6 ("distributor: add unit tests")
> >> Cc: bruce.richardson@intel.com
> >> Cc: stable@dpdk.org
> >>
> >> Signed-off-by: Lukasz Wojciechowski
> >> <l.wojciechow@partner.samsung.com>
> >> Acked-by: David Hunt <david.hunt@intel.com>
> >> ---
> >>   app/test/test_distributor.c | 39 ++++++++++++++++++++++++-----------
> --
> >>   1 file changed, 26 insertions(+), 13 deletions(-)
> >>
> >> diff --git a/app/test/test_distributor.c
> >> b/app/test/test_distributor.c index
> >> 35b25463a..0e49e3714 100644
> >> --- a/app/test/test_distributor.c
> >> +++ b/app/test/test_distributor.c
> >> @@ -43,7 +43,8 @@ total_packet_count(void)  {
> >>   	unsigned i, count = 0;
> >>   	for (i = 0; i < worker_idx; i++)
> >> -		count += worker_stats[i].handled_packets;
> >> +		count +=
> >> __atomic_load_n(&worker_stats[i].handled_packets,
> >> +				__ATOMIC_ACQUIRE);
> > RELAXED memory order is sufficient. For ex: the worker threads are not
> 'releasing' any data that is not atomically updated to the main thread.
> >
> >>   	return count;
> >>   }
> >>
> >> @@ -52,6 +53,7 @@ static inline void
> >>   clear_packet_count(void)
> >>   {
> >>   	memset(&worker_stats, 0, sizeof(worker_stats));
> >> +	rte_atomic_thread_fence(__ATOMIC_RELEASE);
> > Ideally, the counters should be set to 0 atomically rather than using a
> memset.
> >
> >>   }
> >>
> >>   /* this is the basic worker function for sanity test @@ -72,13
> >> +74,13 @@ handle_work(void *arg)
> >>   	num = rte_distributor_get_pkt(db, id, buf, buf, num);
> >>   	while (!quit) {
> >>   		__atomic_fetch_add(&worker_stats[id].handled_packets,
> >> num,
> >> -				__ATOMIC_RELAXED);
> >> +				__ATOMIC_ACQ_REL);
> > Using the __ATOMIC_ACQ_REL order does not mean anything to the main
> thread. The main thread might still see the updates from different threads in
> different order.
> >
> >>   		count += num;
> >>   		num = rte_distributor_get_pkt(db, id,
> >>   				buf, buf, num);
> >>   	}
> >>   	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
> >> -			__ATOMIC_RELAXED);
> >> +			__ATOMIC_ACQ_REL);
> > Same here, do not see why this change is required.
> >
> >>   	count += num;
> >>   	rte_distributor_return_pkt(db, id, buf, num);
> >>   	return 0;
> >> @@ -134,7 +136,8 @@ sanity_test(struct worker_params *wp, struct
> >> rte_mempool *p)
> >>
> >>   	for (i = 0; i < rte_lcore_count() - 1; i++)
> >>   		printf("Worker %u handled %u packets\n", i,
> >> -				worker_stats[i].handled_packets);
> >> +			__atomic_load_n(&worker_stats[i].handled_packets,
> >> +					__ATOMIC_ACQUIRE));
> > __ATOMIC_RELAXED is enough.
> >
> >>   	printf("Sanity test with all zero hashes done.\n");
> >>
> >>   	/* pick two flows and check they go correctly */ @@ -159,7 +162,9
> >> @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
> >>
> >>   		for (i = 0; i < rte_lcore_count() - 1; i++)
> >>   			printf("Worker %u handled %u packets\n", i,
> >> -					worker_stats[i].handled_packets);
> >> +				__atomic_load_n(
> >> +					&worker_stats[i].handled_packets,
> >> +					__ATOMIC_ACQUIRE));
> > __ATOMIC_RELAXED is enough
> >
> >>   		printf("Sanity test with two hash values done\n");
> >>   	}
> >>
> >> @@ -185,7 +190,8 @@ sanity_test(struct worker_params *wp, struct
> >> rte_mempool *p)
> >>
> >>   	for (i = 0; i < rte_lcore_count() - 1; i++)
> >>   		printf("Worker %u handled %u packets\n", i,
> >> -				worker_stats[i].handled_packets);
> >> +			__atomic_load_n(&worker_stats[i].handled_packets,
> >> +					__ATOMIC_ACQUIRE));
> > __ATOMIC_RELAXED is enough
> >
> >>   	printf("Sanity test with non-zero hashes done\n");
> >>
> >>   	rte_mempool_put_bulk(p, (void *)bufs, BURST); @@ -280,15
> >> +286,17 @@ handle_work_with_free_mbufs(void *arg)
> >>   		buf[i] = NULL;
> >>   	num = rte_distributor_get_pkt(d, id, buf, buf, num);
> >>   	while (!quit) {
> >> -		worker_stats[id].handled_packets += num;
> >>   		count += num;
> >> +		__atomic_fetch_add(&worker_stats[id].handled_packets,
> >> num,
> >> +				__ATOMIC_ACQ_REL);
> > IMO, the problem would be the non-atomic update of the statistics. So,
> > __ATOMIC_RELAXED is enough
> >
> >>   		for (i = 0; i < num; i++)
> >>   			rte_pktmbuf_free(buf[i]);
> >>   		num = rte_distributor_get_pkt(d,
> >>   				id, buf, buf, num);
> >>   	}
> >> -	worker_stats[id].handled_packets += num;
> >>   	count += num;
> >> +	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
> >> +			__ATOMIC_ACQ_REL);
> > Same here, the problem is non-atomic update of the statistics,
> __ATOMIC_RELAXED is enough.
> > Similarly, for changes below, __ATOMIC_RELAXED is enough.
> >
> >>   	rte_distributor_return_pkt(d, id, buf, num);
> >>   	return 0;
> >>   }
> >> @@ -363,8 +371,9 @@ handle_work_for_shutdown_test(void *arg)
> >>   	/* wait for quit single globally, or for worker zero, wait
> >>   	 * for zero_quit */
> >>   	while (!quit && !(id == zero_id && zero_quit)) {
> >> -		worker_stats[id].handled_packets += num;
> >>   		count += num;
> >> +		__atomic_fetch_add(&worker_stats[id].handled_packets,
> >> num,
> >> +				__ATOMIC_ACQ_REL);
> >>   		for (i = 0; i < num; i++)
> >>   			rte_pktmbuf_free(buf[i]);
> >>   		num = rte_distributor_get_pkt(d,
> >> @@ -379,10 +388,11 @@ handle_work_for_shutdown_test(void *arg)
> >>
> >>   		total += num;
> >>   	}
> >> -	worker_stats[id].handled_packets += num;
> >>   	count += num;
> >>   	returned = rte_distributor_return_pkt(d, id, buf, num);
> >>
> >> +	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
> >> +			__ATOMIC_ACQ_REL);
> >>   	if (id == zero_id) {
> >>   		/* for worker zero, allow it to restart to pick up last packet
> >>   		 * when all workers are shutting down.
> >> @@ -394,10 +404,11 @@ handle_work_for_shutdown_test(void *arg)
> >>   				id, buf, buf, num);
> >>
> >>   		while (!quit) {
> >> -			worker_stats[id].handled_packets += num;
> >>   			count += num;
> >>   			rte_pktmbuf_free(pkt);
> >>   			num = rte_distributor_get_pkt(d, id, buf, buf, num);
> >> +
> >> 	__atomic_fetch_add(&worker_stats[id].handled_packets,
> >> +					num, __ATOMIC_ACQ_REL);
> >>   		}
> >>   		returned = rte_distributor_return_pkt(d,
> >>   				id, buf, num);
> >> @@ -461,7 +472,8 @@ sanity_test_with_worker_shutdown(struct
> >> worker_params *wp,
> >>
> >>   	for (i = 0; i < rte_lcore_count() - 1; i++)
> >>   		printf("Worker %u handled %u packets\n", i,
> >> -				worker_stats[i].handled_packets);
> >> +			__atomic_load_n(&worker_stats[i].handled_packets,
> >> +					__ATOMIC_ACQUIRE));
> >>
> >>   	if (total_packet_count() != BURST * 2) {
> >>   		printf("Line %d: Error, not all packets flushed. "
> >> @@ -514,7 +526,8 @@ test_flush_with_worker_shutdown(struct
> >> worker_params *wp,
> >>   	zero_quit = 0;
> >>   	for (i = 0; i < rte_lcore_count() - 1; i++)
> >>   		printf("Worker %u handled %u packets\n", i,
> >> -				worker_stats[i].handled_packets);
> >> +			__atomic_load_n(&worker_stats[i].handled_packets,
> >> +					__ATOMIC_ACQUIRE));
> >>
> >>   	if (total_packet_count() != BURST) {
> >>   		printf("Line %d: Error, not all packets flushed. "
> >> --
> >> 2.17.1
> 
> --
> Lukasz Wojciechowski
> Principal Software Engineer
> 
> Samsung R&D Institute Poland
> Samsung Electronics
> Office +48 22 377 88 25
> l.wojciechow@partner.samsung.com


^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v4 2/8] test/distributor: synchronize lcores statistics
  2020-10-16  5:43                       ` Honnappa Nagarahalli
@ 2020-10-16 12:43                         ` Lukasz Wojciechowski
  2020-10-16 12:58                           ` Lukasz Wojciechowski
  0 siblings, 1 reply; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-16 12:43 UTC (permalink / raw)
  To: Honnappa Nagarahalli, David Hunt, Bruce Richardson
  Cc: dev, stable, nd, David Marchand,
	"'Lukasz Wojciechowski'",

Hi Honnappa,

Thank you for your answer.
In the current v7 version I followed your advice and used the RELAXED memory model.
And it works without any issues. I guess after fixing the other issues found since v4 the distributor works more stably.
I didn't have time to rearrange all the tests in the way I proposed, but I guess if they work like this it's not a top priority.

Can you give an ack on the series? I believe David Marchand is waiting for your opinion to process it.

Best regards
Lukasz

On 16.10.2020 at 07:43, Honnappa Nagarahalli wrote:
> <snip>
>
>> Hi Honnappa,
>>
>> Many thanks for the review!
>>
>> I'll write my answers here, not inline, as it would be easier to read them
>> in one place, I think.
>> So first of all I agree with you on 2 things:
>> 1) all uses of statistics must be atomic, and the lack of that caused most
>> of the problems
>> 2) it would be better to replace the barrier and memset in
>> clear_packet_count() with atomic stores as you suggested
>>
>> So I will apply both of the above.
>>
>> However I wasn't fully convinced about changing acquire/release to relaxed.
>> It would be perfectly ok if it looked like this Herb Sutter example:
>> https://youtu.be/KeLBd2EJLOU?t=4170
>> But in his case the counters are cleared before the worker threads start and
>> are printed out after they have completed.
>>
>> In the case of the dpdk distributor tests both the worker and main cores are
>> running at the same time. In the sanity_test, the statistics are cleared and
>> verified a few times for different packet hashes. The worker cores are not
>> stopped at this time and they continue their loops in the handle procedure.
>> The verification done on the main core is an exchange of data, as the
>> current statistics indicate how the test will turn out.
> Agree. The key point we have to note is that the data that is exchanged between the two threads is already atomic (handled_packets is atomic).
>
>> So as I wasn't convinced, I ran some tests with both the relaxed and
>> acquire/release modes, and they both fail :( The ratio of failures caused by
>> statistics errors to the number of tests, over
>> 200000 tests, was:
>> for relaxed: 0,000790562
>> for acq/rel: 0,000091321
>>
>>
>> That's why I'm going to modify the tests in such a way that they would:
>> 1) clear statistics
>> 2) launch worker threads
>> 3) run test
>> 4) wait for workers procedures to complete
>> 5) check stats, verify results and print them out
>>
>> This way the main core will use (clear or verify) the stats only when there
>> are no worker threads running. This would make things simpler and allow
>> focusing on testing the distributor, not the tests. And of course relaxed
>> mode would be enough!
> Agree, this would be the only way to ensure that the main thread sees the correct statistics (just like in the video)
>
>>
>> Best regards
>> Lukasz
>>
>>
>> On 29.09.2020 at 07:49, Honnappa Nagarahalli wrote:
>>> <snip>
>>>
>>>> Statistics of handled packets are cleared and read on main lcore,
>>>> while they are increased in workers handlers on different lcores.
>>>>
>>>>> Without synchronization they occasionally showed invalid values.
>>>> This patch uses atomic acquire/release mechanisms to synchronize.
>>> In general, load-acquire and store-release memory orderings are required
>> while synchronizing data (that cannot be updated atomically) between
>> threads. In this situation, making the counters atomic is enough.
>>>> Fixes: c3eabff124e6 ("distributor: add unit tests")
>>>> Cc: bruce.richardson@intel.com
>>>> Cc: stable@dpdk.org
>>>>
>>>> Signed-off-by: Lukasz Wojciechowski
>>>> <l.wojciechow@partner.samsung.com>
>>>> Acked-by: David Hunt <david.hunt@intel.com>
>>>> ---
>>>>    app/test/test_distributor.c | 39 ++++++++++++++++++++++++-----------
>> --
>>>>    1 file changed, 26 insertions(+), 13 deletions(-)
>>>>
>>>> diff --git a/app/test/test_distributor.c
>>>> b/app/test/test_distributor.c index
>>>> 35b25463a..0e49e3714 100644
>>>> --- a/app/test/test_distributor.c
>>>> +++ b/app/test/test_distributor.c
>>>> @@ -43,7 +43,8 @@ total_packet_count(void)  {
>>>>    	unsigned i, count = 0;
>>>>    	for (i = 0; i < worker_idx; i++)
>>>> -		count += worker_stats[i].handled_packets;
>>>> +		count +=
>>>> __atomic_load_n(&worker_stats[i].handled_packets,
>>>> +				__ATOMIC_ACQUIRE);
>>> RELAXED memory order is sufficient. For ex: the worker threads are not
>> 'releasing' any data that is not atomically updated to the main thread.
>>>>    	return count;
>>>>    }
>>>>
>>>> @@ -52,6 +53,7 @@ static inline void
>>>>    clear_packet_count(void)
>>>>    {
>>>>    	memset(&worker_stats, 0, sizeof(worker_stats));
>>>> +	rte_atomic_thread_fence(__ATOMIC_RELEASE);
>>> Ideally, the counters should be set to 0 atomically rather than using a
>> memset.
>>>>    }
>>>>
>>>>    /* this is the basic worker function for sanity test @@ -72,13
>>>> +74,13 @@ handle_work(void *arg)
>>>>    	num = rte_distributor_get_pkt(db, id, buf, buf, num);
>>>>    	while (!quit) {
>>>>    		__atomic_fetch_add(&worker_stats[id].handled_packets,
>>>> num,
>>>> -				__ATOMIC_RELAXED);
>>>> +				__ATOMIC_ACQ_REL);
>>> Using the __ATOMIC_ACQ_REL order does not mean anything to the main
>> thread. The main thread might still see the updates from different threads in
>> different order.
>>>>    		count += num;
>>>>    		num = rte_distributor_get_pkt(db, id,
>>>>    				buf, buf, num);
>>>>    	}
>>>>    	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
>>>> -			__ATOMIC_RELAXED);
>>>> +			__ATOMIC_ACQ_REL);
>>> Same here, do not see why this change is required.
>>>
>>>>    	count += num;
>>>>    	rte_distributor_return_pkt(db, id, buf, num);
>>>>    	return 0;
>>>> @@ -134,7 +136,8 @@ sanity_test(struct worker_params *wp, struct
>>>> rte_mempool *p)
>>>>
>>>>    	for (i = 0; i < rte_lcore_count() - 1; i++)
>>>>    		printf("Worker %u handled %u packets\n", i,
>>>> -				worker_stats[i].handled_packets);
>>>> +			__atomic_load_n(&worker_stats[i].handled_packets,
>>>> +					__ATOMIC_ACQUIRE));
>>> __ATOMIC_RELAXED is enough.
>>>
>>>>    	printf("Sanity test with all zero hashes done.\n");
>>>>
>>>>    	/* pick two flows and check they go correctly */ @@ -159,7 +162,9
>>>> @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
>>>>
>>>>    		for (i = 0; i < rte_lcore_count() - 1; i++)
>>>>    			printf("Worker %u handled %u packets\n", i,
>>>> -					worker_stats[i].handled_packets);
>>>> +				__atomic_load_n(
>>>> +					&worker_stats[i].handled_packets,
>>>> +					__ATOMIC_ACQUIRE));
>>> __ATOMIC_RELAXED is enough
>>>
>>>>    		printf("Sanity test with two hash values done\n");
>>>>    	}
>>>>
>>>> @@ -185,7 +190,8 @@ sanity_test(struct worker_params *wp, struct
>>>> rte_mempool *p)
>>>>
>>>>    	for (i = 0; i < rte_lcore_count() - 1; i++)
>>>>    		printf("Worker %u handled %u packets\n", i,
>>>> -				worker_stats[i].handled_packets);
>>>> +			__atomic_load_n(&worker_stats[i].handled_packets,
>>>> +					__ATOMIC_ACQUIRE));
>>> __ATOMIC_RELAXED is enough
>>>
>>>>    	printf("Sanity test with non-zero hashes done\n");
>>>>
>>>>    	rte_mempool_put_bulk(p, (void *)bufs, BURST); @@ -280,15
>>>> +286,17 @@ handle_work_with_free_mbufs(void *arg)
>>>>    		buf[i] = NULL;
>>>>    	num = rte_distributor_get_pkt(d, id, buf, buf, num);
>>>>    	while (!quit) {
>>>> -		worker_stats[id].handled_packets += num;
>>>>    		count += num;
>>>> +		__atomic_fetch_add(&worker_stats[id].handled_packets,
>>>> num,
>>>> +				__ATOMIC_ACQ_REL);
>>> IMO, the problem would be the non-atomic update of the statistics. So,
>>> __ATOMIC_RELAXED is enough
>>>
>>>>    		for (i = 0; i < num; i++)
>>>>    			rte_pktmbuf_free(buf[i]);
>>>>    		num = rte_distributor_get_pkt(d,
>>>>    				id, buf, buf, num);
>>>>    	}
>>>> -	worker_stats[id].handled_packets += num;
>>>>    	count += num;
>>>> +	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
>>>> +			__ATOMIC_ACQ_REL);
>>> Same here, the problem is non-atomic update of the statistics,
>> __ATOMIC_RELAXED is enough.
>>> Similarly, for changes below, __ATOMIC_RELAXED is enough.
>>>
>>>>    	rte_distributor_return_pkt(d, id, buf, num);
>>>>    	return 0;
>>>>    }
>>>> @@ -363,8 +371,9 @@ handle_work_for_shutdown_test(void *arg)
>>>>    	/* wait for quit single globally, or for worker zero, wait
>>>>    	 * for zero_quit */
>>>>    	while (!quit && !(id == zero_id && zero_quit)) {
>>>> -		worker_stats[id].handled_packets += num;
>>>>    		count += num;
>>>> +		__atomic_fetch_add(&worker_stats[id].handled_packets,
>>>> num,
>>>> +				__ATOMIC_ACQ_REL);
>>>>    		for (i = 0; i < num; i++)
>>>>    			rte_pktmbuf_free(buf[i]);
>>>>    		num = rte_distributor_get_pkt(d,
>>>> @@ -379,10 +388,11 @@ handle_work_for_shutdown_test(void *arg)
>>>>
>>>>    		total += num;
>>>>    	}
>>>> -	worker_stats[id].handled_packets += num;
>>>>    	count += num;
>>>>    	returned = rte_distributor_return_pkt(d, id, buf, num);
>>>>
>>>> +	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
>>>> +			__ATOMIC_ACQ_REL);
>>>>    	if (id == zero_id) {
>>>>    		/* for worker zero, allow it to restart to pick up last packet
>>>>    		 * when all workers are shutting down.
>>>> @@ -394,10 +404,11 @@ handle_work_for_shutdown_test(void *arg)
>>>>    				id, buf, buf, num);
>>>>
>>>>    		while (!quit) {
>>>> -			worker_stats[id].handled_packets += num;
>>>>    			count += num;
>>>>    			rte_pktmbuf_free(pkt);
>>>>    			num = rte_distributor_get_pkt(d, id, buf, buf, num);
>>>> +
>>>> 	__atomic_fetch_add(&worker_stats[id].handled_packets,
>>>> +					num, __ATOMIC_ACQ_REL);
>>>>    		}
>>>>    		returned = rte_distributor_return_pkt(d,
>>>>    				id, buf, num);
>>>> @@ -461,7 +472,8 @@ sanity_test_with_worker_shutdown(struct
>>>> worker_params *wp,
>>>>
>>>>    	for (i = 0; i < rte_lcore_count() - 1; i++)
>>>>    		printf("Worker %u handled %u packets\n", i,
>>>> -				worker_stats[i].handled_packets);
>>>> +			__atomic_load_n(&worker_stats[i].handled_packets,
>>>> +					__ATOMIC_ACQUIRE));
>>>>
>>>>    	if (total_packet_count() != BURST * 2) {
>>>>    		printf("Line %d: Error, not all packets flushed. "
>>>> @@ -514,7 +526,8 @@ test_flush_with_worker_shutdown(struct
>>>> worker_params *wp,
>>>>    	zero_quit = 0;
>>>>    	for (i = 0; i < rte_lcore_count() - 1; i++)
>>>>    		printf("Worker %u handled %u packets\n", i,
>>>> -				worker_stats[i].handled_packets);
>>>> +			__atomic_load_n(&worker_stats[i].handled_packets,
>>>> +					__ATOMIC_ACQUIRE));
>>>>
>>>>    	if (total_packet_count() != BURST) {
>>>>    		printf("Line %d: Error, not all packets flushed. "
>>>> --
>>>> 2.17.1
>> --
>> Lukasz Wojciechowski
>> Principal Software Engineer
>>
>> Samsung R&D Institute Poland
>> Samsung Electronics
>> Office +48 22 377 88 25
>> l.wojciechow@partner.samsung.com

-- 
Lukasz Wojciechowski
Principal Software Engineer

Samsung R&D Institute Poland
Samsung Electronics
Office +48 22 377 88 25
l.wojciechow@partner.samsung.com


^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v4 2/8] test/distributor: synchronize lcores statistics
  2020-10-16 12:43                         ` Lukasz Wojciechowski
@ 2020-10-16 12:58                           ` Lukasz Wojciechowski
  2020-10-16 15:42                             ` Honnappa Nagarahalli
  0 siblings, 1 reply; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-16 12:58 UTC (permalink / raw)
  To: Honnappa Nagarahalli, David Hunt, Bruce Richardson
  Cc: dev, stable, nd, David Marchand,
	"'Lukasz Wojciechowski'",


On 16.10.2020 at 14:43, Lukasz Wojciechowski wrote:
> Hi Honnappa,
>
> Thank you for your answer.
> In the current v7 version I followed your advice and used the RELAXED memory model.
> And it works without any issues. I guess after fixing the other issues found since v4 the distributor works more stably.
> I didn't have time to rearrange all the tests in the way I proposed, but I guess if they work like this it's not a top priority.
>
> Can you give an ack on the series? I believe David Marchand is waiting for your opinion to process it.
I'm sorry, I didn't see your other comments. I'll try to fix them today.
>
> Best regards
> Lukasz
>
> On 16.10.2020 at 07:43, Honnappa Nagarahalli wrote:
>> <snip>
>>
>>> Hi Honnappa,
>>>
>>> Many thanks for the review!
>>>
>>> I'll write my answers here, not inline, as it would be easier to read them
>>> in one place, I think.
>>> So first of all I agree with you on 2 things:
>>> 1) all uses of statistics must be atomic, and the lack of that caused most
>>> of the problems
>>> 2) it would be better to replace the barrier and memset in
>>> clear_packet_count() with atomic stores as you suggested
>>>
>>> So I will apply both of the above.
>>>
>>> However I wasn't fully convinced about changing acquire/release to relaxed.
>>> It would be perfectly ok if it looked like this Herb Sutter example:
>>> https://youtu.be/KeLBd2EJLOU?t=4170
>>> But in his case the counters are cleared before the worker threads start
>>> and are printed out after they have completed.
>>>
>>> In the case of the dpdk distributor tests both the worker and main cores
>>> are running at the same time. In the sanity_test, the statistics are
>>> cleared and verified a few times for different packet hashes. The worker
>>> cores are not stopped at this time and they continue their loops in the
>>> handle procedure. The verification done on the main core is an exchange of
>>> data, as the current statistics indicate how the test will turn out.
>> Agree. The key point we have to note is that the data that is exchanged between the two threads is already atomic (handled_packets is atomic).
>>
>>> So as I wasn't convinced, I ran some tests with both the relaxed and
>>> acquire/release modes, and they both fail :( The ratio of failures caused
>>> by statistics errors to the number of tests, over
>>> 200000 tests, was:
>>> for relaxed: 0,000790562
>>> for acq/rel: 0,000091321
>>>
>>>
>>> That's why I'm going to modify the tests in such a way that they would:
>>> 1) clear statistics
>>> 2) launch worker threads
>>> 3) run test
>>> 4) wait for workers procedures to complete
>>> 5) check stats, verify results and print them out
>>>
>>> This way the main core will use (clear or verify) the stats only when
>>> there are no worker threads running. This would make things simpler and
>>> allow focusing on testing the distributor, not the tests. And of course
>>> relaxed mode would be enough!
>> Agree, this would be the only way to ensure that the main thread sees the correct statistics (just like in the video)
>>
>>> Best regards
>>> Lukasz
>>>
>>>
>>> On 29.09.2020 at 07:49, Honnappa Nagarahalli wrote:
>>>> <snip>
>>>>
>>>>> Statistics of handled packets are cleared and read on main lcore,
>>>>> while they are increased in workers handlers on different lcores.
>>>>>
>>>>> Without synchronization they occasionally showed invalid values.
>>>>> This patch uses atomic acquire/release mechanisms to synchronize.
>>>> In general, load-acquire and store-release memory orderings are required
>>> while synchronizing data (that cannot be updated atomically) between
>>> threads. In this situation, making the counters atomic is enough.
>>>>> Fixes: c3eabff124e6 ("distributor: add unit tests")
>>>>> Cc: bruce.richardson@intel.com
>>>>> Cc: stable@dpdk.org
>>>>>
>>>>> Signed-off-by: Lukasz Wojciechowski
>>>>> <l.wojciechow@partner.samsung.com>
>>>>> Acked-by: David Hunt <david.hunt@intel.com>
>>>>> ---
>>>>>     app/test/test_distributor.c | 39 ++++++++++++++++++++++++-----------
>>> --
>>>>>     1 file changed, 26 insertions(+), 13 deletions(-)
>>>>>
>>>>> diff --git a/app/test/test_distributor.c
>>>>> b/app/test/test_distributor.c index
>>>>> 35b25463a..0e49e3714 100644
>>>>> --- a/app/test/test_distributor.c
>>>>> +++ b/app/test/test_distributor.c
>>>>> @@ -43,7 +43,8 @@ total_packet_count(void)  {
>>>>>     	unsigned i, count = 0;
>>>>>     	for (i = 0; i < worker_idx; i++)
>>>>> -		count += worker_stats[i].handled_packets;
>>>>> +		count +=
>>>>> __atomic_load_n(&worker_stats[i].handled_packets,
>>>>> +				__ATOMIC_ACQUIRE);
>>>> RELAXED memory order is sufficient. For ex: the worker threads are not
>>> 'releasing' any data that is not atomically updated to the main thread.
>>>>>     	return count;
>>>>>     }
>>>>>
>>>>> @@ -52,6 +53,7 @@ static inline void
>>>>>     clear_packet_count(void)
>>>>>     {
>>>>>     	memset(&worker_stats, 0, sizeof(worker_stats));
>>>>> +	rte_atomic_thread_fence(__ATOMIC_RELEASE);
>>>> Ideally, the counters should be set to 0 atomically rather than using a
>>> memset.
>>>>>     }
>>>>>
>>>>>     /* this is the basic worker function for sanity test @@ -72,13
>>>>> +74,13 @@ handle_work(void *arg)
>>>>>     	num = rte_distributor_get_pkt(db, id, buf, buf, num);
>>>>>     	while (!quit) {
>>>>>     		__atomic_fetch_add(&worker_stats[id].handled_packets,
>>>>> num,
>>>>> -				__ATOMIC_RELAXED);
>>>>> +				__ATOMIC_ACQ_REL);
>>>> Using the __ATOMIC_ACQ_REL order does not mean anything to the main
>>> thread. The main thread might still see the updates from different threads in
>>> different order.
>>>>>     		count += num;
>>>>>     		num = rte_distributor_get_pkt(db, id,
>>>>>     				buf, buf, num);
>>>>>     	}
>>>>>     	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
>>>>> -			__ATOMIC_RELAXED);
>>>>> +			__ATOMIC_ACQ_REL);
>>>> Same here, do not see why this change is required.
>>>>
>>>>>     	count += num;
>>>>>     	rte_distributor_return_pkt(db, id, buf, num);
>>>>>     	return 0;
>>>>> @@ -134,7 +136,8 @@ sanity_test(struct worker_params *wp, struct
>>>>> rte_mempool *p)
>>>>>
>>>>>     	for (i = 0; i < rte_lcore_count() - 1; i++)
>>>>>     		printf("Worker %u handled %u packets\n", i,
>>>>> -				worker_stats[i].handled_packets);
>>>>> +			__atomic_load_n(&worker_stats[i].handled_packets,
>>>>> +					__ATOMIC_ACQUIRE));
>>>> __ATOMIC_RELAXED is enough.
>>>>
>>>>>     	printf("Sanity test with all zero hashes done.\n");
>>>>>
>>>>>     	/* pick two flows and check they go correctly */ @@ -159,7 +162,9
>>>>> @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
>>>>>
>>>>>     		for (i = 0; i < rte_lcore_count() - 1; i++)
>>>>>     			printf("Worker %u handled %u packets\n", i,
>>>>> -					worker_stats[i].handled_packets);
>>>>> +				__atomic_load_n(
>>>>> +					&worker_stats[i].handled_packets,
>>>>> +					__ATOMIC_ACQUIRE));
>>>> __ATOMIC_RELAXED is enough
>>>>
>>>>>     		printf("Sanity test with two hash values done\n");
>>>>>     	}
>>>>>
>>>>> @@ -185,7 +190,8 @@ sanity_test(struct worker_params *wp, struct
>>>>> rte_mempool *p)
>>>>>
>>>>>     	for (i = 0; i < rte_lcore_count() - 1; i++)
>>>>>     		printf("Worker %u handled %u packets\n", i,
>>>>> -				worker_stats[i].handled_packets);
>>>>> +			__atomic_load_n(&worker_stats[i].handled_packets,
>>>>> +					__ATOMIC_ACQUIRE));
>>>> __ATOMIC_RELAXED is enough
>>>>
>>>>>     	printf("Sanity test with non-zero hashes done\n");
>>>>>
>>>>>     	rte_mempool_put_bulk(p, (void *)bufs, BURST); @@ -280,15
>>>>> +286,17 @@ handle_work_with_free_mbufs(void *arg)
>>>>>     		buf[i] = NULL;
>>>>>     	num = rte_distributor_get_pkt(d, id, buf, buf, num);
>>>>>     	while (!quit) {
>>>>> -		worker_stats[id].handled_packets += num;
>>>>>     		count += num;
>>>>> +		__atomic_fetch_add(&worker_stats[id].handled_packets,
>>>>> num,
>>>>> +				__ATOMIC_ACQ_REL);
>>>> IMO, the problem would be the non-atomic update of the statistics. So,
>>>> __ATOMIC_RELAXED is enough
>>>>
>>>>>     		for (i = 0; i < num; i++)
>>>>>     			rte_pktmbuf_free(buf[i]);
>>>>>     		num = rte_distributor_get_pkt(d,
>>>>>     				id, buf, buf, num);
>>>>>     	}
>>>>> -	worker_stats[id].handled_packets += num;
>>>>>     	count += num;
>>>>> +	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
>>>>> +			__ATOMIC_ACQ_REL);
>>>> Same here, the problem is non-atomic update of the statistics,
>>> __ATOMIC_RELAXED is enough.
>>>> Similarly, for changes below, __ATOMIC_RELAXED is enough.
>>>>
>>>>>     	rte_distributor_return_pkt(d, id, buf, num);
>>>>>     	return 0;
>>>>>     }
>>>>> @@ -363,8 +371,9 @@ handle_work_for_shutdown_test(void *arg)
>>>>>     	/* wait for quit single globally, or for worker zero, wait
>>>>>     	 * for zero_quit */
>>>>>     	while (!quit && !(id == zero_id && zero_quit)) {
>>>>> -		worker_stats[id].handled_packets += num;
>>>>>     		count += num;
>>>>> +		__atomic_fetch_add(&worker_stats[id].handled_packets,
>>>>> num,
>>>>> +				__ATOMIC_ACQ_REL);
>>>>>     		for (i = 0; i < num; i++)
>>>>>     			rte_pktmbuf_free(buf[i]);
>>>>>     		num = rte_distributor_get_pkt(d,
>>>>> @@ -379,10 +388,11 @@ handle_work_for_shutdown_test(void *arg)
>>>>>
>>>>>     		total += num;
>>>>>     	}
>>>>> -	worker_stats[id].handled_packets += num;
>>>>>     	count += num;
>>>>>     	returned = rte_distributor_return_pkt(d, id, buf, num);
>>>>>
>>>>> +	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
>>>>> +			__ATOMIC_ACQ_REL);
>>>>>     	if (id == zero_id) {
>>>>>     		/* for worker zero, allow it to restart to pick up last packet
>>>>>     		 * when all workers are shutting down.
>>>>> @@ -394,10 +404,11 @@ handle_work_for_shutdown_test(void *arg)
>>>>>     				id, buf, buf, num);
>>>>>
>>>>>     		while (!quit) {
>>>>> -			worker_stats[id].handled_packets += num;
>>>>>     			count += num;
>>>>>     			rte_pktmbuf_free(pkt);
>>>>>     			num = rte_distributor_get_pkt(d, id, buf, buf, num);
>>>>> +
>>>>> 	__atomic_fetch_add(&worker_stats[id].handled_packets,
>>>>> +					num, __ATOMIC_ACQ_REL);
>>>>>     		}
>>>>>     		returned = rte_distributor_return_pkt(d,
>>>>>     				id, buf, num);
>>>>> @@ -461,7 +472,8 @@ sanity_test_with_worker_shutdown(struct
>>>>> worker_params *wp,
>>>>>
>>>>>     	for (i = 0; i < rte_lcore_count() - 1; i++)
>>>>>     		printf("Worker %u handled %u packets\n", i,
>>>>> -				worker_stats[i].handled_packets);
>>>>> +			__atomic_load_n(&worker_stats[i].handled_packets,
>>>>> +					__ATOMIC_ACQUIRE));
>>>>>
>>>>>     	if (total_packet_count() != BURST * 2) {
>>>>>     		printf("Line %d: Error, not all packets flushed. "
>>>>> @@ -514,7 +526,8 @@ test_flush_with_worker_shutdown(struct
>>>>> worker_params *wp,
>>>>>     	zero_quit = 0;
>>>>>     	for (i = 0; i < rte_lcore_count() - 1; i++)
>>>>>     		printf("Worker %u handled %u packets\n", i,
>>>>> -				worker_stats[i].handled_packets);
>>>>> +			__atomic_load_n(&worker_stats[i].handled_packets,
>>>>> +					__ATOMIC_ACQUIRE));
>>>>>
>>>>>     	if (total_packet_count() != BURST) {
>>>>>     		printf("Line %d: Error, not all packets flushed. "
>>>>> --
>>>>> 2.17.1
>>> --
>>> Lukasz Wojciechowski
>>> Principal Software Engineer
>>>
>>> Samsung R&D Institute Poland
>>> Samsung Electronics
>>> Office +48 22 377 88 25
>>> l.wojciechow@partner.samsung.com

-- 
Lukasz Wojciechowski
Principal Software Engineer

Samsung R&D Institute Poland
Samsung Electronics
Office +48 22 377 88 25
l.wojciechow@partner.samsung.com


^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v4 2/8] test/distributor: synchronize lcores statistics
  2020-10-16 12:58                           ` Lukasz Wojciechowski
@ 2020-10-16 15:42                             ` Honnappa Nagarahalli
  2020-10-17  3:34                               ` Lukasz Wojciechowski
  0 siblings, 1 reply; 164+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-16 15:42 UTC (permalink / raw)
  To: Lukasz Wojciechowski, David Hunt, Bruce Richardson
  Cc: dev, stable, nd, David Marchand, Honnappa Nagarahalli, nd

<snip>

> 
> W dniu 16.10.2020 o 14:43, Lukasz Wojciechowski pisze:
> > Hi Honnappa,
> >
> > Thank you for your answer.
> > In the current v7 version I followed your advice and used RELAXED memory
> model.
> > And it works without any issues. I guess after fixing other issues found
> since v4 the distributor works more stably.
> > I didn't have time to rearrange all tests in the way I proposed, but I guess if
> they work like this it's not a top priority.
Agree, not a top priority.

> >
> > Can you give an ack on the series? I believe David Marchand is waiting for
> your opinion to process it.
> I'm sorry I didn't see your other comments. I'll try to fix them today.
No problem, I can review the next series quickly.

> >
> > Best regards
> > Lukasz
> >
> > W dniu 16.10.2020 o 07:43, Honnappa Nagarahalli pisze:
> >> <snip>
> >>
> >>> Hi Honnappa,
> >>>
> >>> Many thanks for the review!
> >>>
> >>> I'll write my answers here not inline as it would be easier to read
> >>> them in one place, I think.
> >>> So first of all I agree with you in 2 things:
> >>> 1) all uses of statistics must be atomic and lack of that caused
> >>> most of the problems
> >>> 2) it would be better to replace barrier and memset in
> >>> clear_packet_count() with atomic stores as you suggested
> >>>
> >>> So I will apply both of above.
> >>>
> >>> However I wasn't fully convinced about changing acquire/release to
> relaxed.
> >>> It would be perfectly ok if it looked like this Herb Sutter example:
> >>> https://youtu.be/KeLBd2EJLOU?t=4170 But in his case the counters
> >>> are cleared before the worker threads start and are printed out after
> >>> they have completed.
> >>>
> >>> In the case of the dpdk distributor tests, both worker and main cores
> >>> are running at the same time. In the sanity_test, the statistics are
> >>> cleared and verified a few times for different hashes of packets. The
> >>> worker cores are not stopped at this time and they continue their loops
> in the handler procedure.
> >>> The verification done on the main core is an exchange of data, as the
> >>> current statistics indicate how the test will turn out.
> >> Agree. The key point we have to note is that the data that is exchanged
> between the two threads is already atomic (handled_packets is atomic).
> >>
> >>> So as I wasn't convinced, I ran some tests with both relaxed
> >>> and acquire/release modes, and they both fail :( The ratio of failures
> >>> caused by statistics errors to the number of tests, over
> >>> 200000 runs, was:
> >>> for relaxed: 0,000790562
> >>> for acq/rel: 0,000091321
> >>>
> >>>
> >>> That's why I'm going to modify tests in such way, that they would:
> >>> 1) clear statistics
> >>> 2) launch worker threads
> >>> 3) run test
> >>> 4) wait for workers procedures to complete
> >>> 5) check stats, verify results and print them out
> >>>
> >>> This way the main core will use (clear or verify) stats only when
> >>> there are no worker threads running. This would make things simpler,
> >>> allowing us to focus on testing the distributor, not the tests. And of
> >>> course relaxed mode would be enough!
> >> Agree, this would be the only way to ensure that the main thread sees
> >> the correct statistics (just like in the video)
> >>
> >>> Best regards
> >>> Lukasz
> >>>
> >>>
> >>> W dniu 29.09.2020 o 07:49, Honnappa Nagarahalli pisze:
> >>>> <snip>
> >>>>
> >>>>> Statistics of handled packets are cleared and read on main lcore,
> >>>>> while they are increased in workers handlers on different lcores.
> >>>>>
> >>>>> Without synchronization occasionally showed invalid values.
> >>>>> This patch uses atomic acquire/release mechanisms to synchronize.
> >>>> In general, load-acquire and store-release memory orderings are
> >>>> required
> >>> while synchronizing data (that cannot be updated atomically) between
> >>> threads. In the situation, making counters atomic is enough.
> >>>>> Fixes: c3eabff124e6 ("distributor: add unit tests")
> >>>>> Cc: bruce.richardson@intel.com
> >>>>> Cc: stable@dpdk.org
> >>>>>
> >>>>> Signed-off-by: Lukasz Wojciechowski
> >>>>> <l.wojciechow@partner.samsung.com>
> >>>>> Acked-by: David Hunt <david.hunt@intel.com>
> >>>>> ---
> >>>>>     app/test/test_distributor.c | 39
> >>>>> ++++++++++++++++++++++++-----------
> >>> --
> >>>>>     1 file changed, 26 insertions(+), 13 deletions(-)
> >>>>>
> >>>>> diff --git a/app/test/test_distributor.c
> >>>>> b/app/test/test_distributor.c index
> >>>>> 35b25463a..0e49e3714 100644
> >>>>> --- a/app/test/test_distributor.c
> >>>>> +++ b/app/test/test_distributor.c
> >>>>> @@ -43,7 +43,8 @@ total_packet_count(void)  {
> >>>>>     	unsigned i, count = 0;
> >>>>>     	for (i = 0; i < worker_idx; i++)
> >>>>> -		count += worker_stats[i].handled_packets;
> >>>>> +		count +=
> >>>>> __atomic_load_n(&worker_stats[i].handled_packets,
> >>>>> +				__ATOMIC_ACQUIRE);
> >>>> RELAXED memory order is sufficient. For ex: the worker threads are
> >>>> not
> >>> 'releasing' any data that is not atomically updated to the main thread.
> >>>>>     	return count;
> >>>>>     }
> >>>>>
> >>>>> @@ -52,6 +53,7 @@ static inline void
> >>>>>     clear_packet_count(void)
> >>>>>     {
> >>>>>     	memset(&worker_stats, 0, sizeof(worker_stats));
> >>>>> +	rte_atomic_thread_fence(__ATOMIC_RELEASE);
> >>>> Ideally, the counters should be set to 0 atomically rather than
> >>>> using a
> >>> memset.
> >>>>>     }
> >>>>>
> >>>>>     /* this is the basic worker function for sanity test @@ -72,13
> >>>>> +74,13 @@ handle_work(void *arg)
> >>>>>     	num = rte_distributor_get_pkt(db, id, buf, buf, num);
> >>>>>     	while (!quit) {
> >>>>>
> 	__atomic_fetch_add(&worker_stats[id].handled_packets,
> >>>>> num,
> >>>>> -				__ATOMIC_RELAXED);
> >>>>> +				__ATOMIC_ACQ_REL);
> >>>> Using the __ATOMIC_ACQ_REL order does not mean anything to the
> main
> >>> thread. The main thread might still see the updates from different
> >>> threads in different order.
> >>>>>     		count += num;
> >>>>>     		num = rte_distributor_get_pkt(db, id,
> >>>>>     				buf, buf, num);
> >>>>>     	}
> >>>>>     	__atomic_fetch_add(&worker_stats[id].handled_packets,
> num,
> >>>>> -			__ATOMIC_RELAXED);
> >>>>> +			__ATOMIC_ACQ_REL);
> >>>> Same here, do not see why this change is required.
> >>>>
> >>>>>     	count += num;
> >>>>>     	rte_distributor_return_pkt(db, id, buf, num);
> >>>>>     	return 0;
> >>>>> @@ -134,7 +136,8 @@ sanity_test(struct worker_params *wp, struct
> >>>>> rte_mempool *p)
> >>>>>
> >>>>>     	for (i = 0; i < rte_lcore_count() - 1; i++)
> >>>>>     		printf("Worker %u handled %u packets\n", i,
> >>>>> -				worker_stats[i].handled_packets);
> >>>>> +
> 	__atomic_load_n(&worker_stats[i].handled_packets,
> >>>>> +					__ATOMIC_ACQUIRE));
> >>>> __ATOMIC_RELAXED is enough.
> >>>>
> >>>>>     	printf("Sanity test with all zero hashes done.\n");
> >>>>>
> >>>>>     	/* pick two flows and check they go correctly */ @@ -159,7
> >>>>> +162,9 @@ sanity_test(struct worker_params *wp, struct
> rte_mempool
> >>>>> *p)
> >>>>>
> >>>>>     		for (i = 0; i < rte_lcore_count() - 1; i++)
> >>>>>     			printf("Worker %u handled %u packets\n", i,
> >>>>> -
> 	worker_stats[i].handled_packets);
> >>>>> +				__atomic_load_n(
> >>>>> +
> 	&worker_stats[i].handled_packets,
> >>>>> +					__ATOMIC_ACQUIRE));
> >>>> __ATOMIC_RELAXED is enough
> >>>>
> >>>>>     		printf("Sanity test with two hash values done\n");
> >>>>>     	}
> >>>>>
> >>>>> @@ -185,7 +190,8 @@ sanity_test(struct worker_params *wp, struct
> >>>>> rte_mempool *p)
> >>>>>
> >>>>>     	for (i = 0; i < rte_lcore_count() - 1; i++)
> >>>>>     		printf("Worker %u handled %u packets\n", i,
> >>>>> -				worker_stats[i].handled_packets);
> >>>>> +
> 	__atomic_load_n(&worker_stats[i].handled_packets,
> >>>>> +					__ATOMIC_ACQUIRE));
> >>>> __ATOMIC_RELAXED is enough
> >>>>
> >>>>>     	printf("Sanity test with non-zero hashes done\n");
> >>>>>
> >>>>>     	rte_mempool_put_bulk(p, (void *)bufs, BURST); @@ -280,15
> >>>>> +286,17 @@ handle_work_with_free_mbufs(void *arg)
> >>>>>     		buf[i] = NULL;
> >>>>>     	num = rte_distributor_get_pkt(d, id, buf, buf, num);
> >>>>>     	while (!quit) {
> >>>>> -		worker_stats[id].handled_packets += num;
> >>>>>     		count += num;
> >>>>> +
> 	__atomic_fetch_add(&worker_stats[id].handled_packets,
> >>>>> num,
> >>>>> +				__ATOMIC_ACQ_REL);
> >>>> IMO, the problem would be the non-atomic update of the statistics.
> >>>> So, __ATOMIC_RELAXED is enough
> >>>>
> >>>>>     		for (i = 0; i < num; i++)
> >>>>>     			rte_pktmbuf_free(buf[i]);
> >>>>>     		num = rte_distributor_get_pkt(d,
> >>>>>     				id, buf, buf, num);
> >>>>>     	}
> >>>>> -	worker_stats[id].handled_packets += num;
> >>>>>     	count += num;
> >>>>> +	__atomic_fetch_add(&worker_stats[id].handled_packets,
> num,
> >>>>> +			__ATOMIC_ACQ_REL);
> >>>> Same here, the problem is non-atomic update of the statistics,
> >>> __ATOMIC_RELAXED is enough.
> >>>> Similarly, for changes below, __ATOMIC_RELAXED is enough.
> >>>>
> >>>>>     	rte_distributor_return_pkt(d, id, buf, num);
> >>>>>     	return 0;
> >>>>>     }
> >>>>> @@ -363,8 +371,9 @@ handle_work_for_shutdown_test(void *arg)
> >>>>>     	/* wait for quit single globally, or for worker zero, wait
> >>>>>     	 * for zero_quit */
> >>>>>     	while (!quit && !(id == zero_id && zero_quit)) {
> >>>>> -		worker_stats[id].handled_packets += num;
> >>>>>     		count += num;
> >>>>> +
> 	__atomic_fetch_add(&worker_stats[id].handled_packets,
> >>>>> num,
> >>>>> +				__ATOMIC_ACQ_REL);
> >>>>>     		for (i = 0; i < num; i++)
> >>>>>     			rte_pktmbuf_free(buf[i]);
> >>>>>     		num = rte_distributor_get_pkt(d, @@ -379,10
> +388,11 @@
> >>>>> handle_work_for_shutdown_test(void *arg)
> >>>>>
> >>>>>     		total += num;
> >>>>>     	}
> >>>>> -	worker_stats[id].handled_packets += num;
> >>>>>     	count += num;
> >>>>>     	returned = rte_distributor_return_pkt(d, id, buf, num);
> >>>>>
> >>>>> +	__atomic_fetch_add(&worker_stats[id].handled_packets,
> num,
> >>>>> +			__ATOMIC_ACQ_REL);
> >>>>>     	if (id == zero_id) {
> >>>>>     		/* for worker zero, allow it to restart to pick up last
> packet
> >>>>>     		 * when all workers are shutting down.
> >>>>> @@ -394,10 +404,11 @@ handle_work_for_shutdown_test(void *arg)
> >>>>>     				id, buf, buf, num);
> >>>>>
> >>>>>     		while (!quit) {
> >>>>> -			worker_stats[id].handled_packets += num;
> >>>>>     			count += num;
> >>>>>     			rte_pktmbuf_free(pkt);
> >>>>>     			num = rte_distributor_get_pkt(d, id, buf, buf,
> num);
> >>>>> +
> >>>>> 	__atomic_fetch_add(&worker_stats[id].handled_packets,
> >>>>> +					num, __ATOMIC_ACQ_REL);
> >>>>>     		}
> >>>>>     		returned = rte_distributor_return_pkt(d,
> >>>>>     				id, buf, num);
> >>>>> @@ -461,7 +472,8 @@ sanity_test_with_worker_shutdown(struct
> >>>>> worker_params *wp,
> >>>>>
> >>>>>     	for (i = 0; i < rte_lcore_count() - 1; i++)
> >>>>>     		printf("Worker %u handled %u packets\n", i,
> >>>>> -				worker_stats[i].handled_packets);
> >>>>> +
> 	__atomic_load_n(&worker_stats[i].handled_packets,
> >>>>> +					__ATOMIC_ACQUIRE));
> >>>>>
> >>>>>     	if (total_packet_count() != BURST * 2) {
> >>>>>     		printf("Line %d: Error, not all packets flushed. "
> >>>>> @@ -514,7 +526,8 @@ test_flush_with_worker_shutdown(struct
> >>>>> worker_params *wp,
> >>>>>     	zero_quit = 0;
> >>>>>     	for (i = 0; i < rte_lcore_count() - 1; i++)
> >>>>>     		printf("Worker %u handled %u packets\n", i,
> >>>>> -				worker_stats[i].handled_packets);
> >>>>> +
> 	__atomic_load_n(&worker_stats[i].handled_packets,
> >>>>> +					__ATOMIC_ACQUIRE));
> >>>>>
> >>>>>     	if (total_packet_count() != BURST) {
> >>>>>     		printf("Line %d: Error, not all packets flushed. "
> >>>>> --
> >>>>> 2.17.1
> >>> --
> >>> Lukasz Wojciechowski
> >>> Principal Software Engineer
> >>>
> >>> Samsung R&D Institute Poland
> >>> Samsung Electronics
> >>> Office +48 22 377 88 25
> >>> l.wojciechow@partner.samsung.com
> 
> --
> Lukasz Wojciechowski
> Principal Software Engineer
> 
> Samsung R&D Institute Poland
> Samsung Electronics
> Office +48 22 377 88 25
> l.wojciechow@partner.samsung.com


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v8 00/17] fix distributor synchronization issues
       [not found]                           ` <CGME20201017030709eucas1p11285f14ee4fe2e79ad5791b0e9b9c653@eucas1p1.samsung.com>
@ 2020-10-17  3:06                             ` Lukasz Wojciechowski
       [not found]                               ` <CGME20201017030710eucas1p17fb6129fd3414b4b6b70dcd593c01a40@eucas1p1.samsung.com>
                                                 ` (17 more replies)
  0 siblings, 18 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-17  3:06 UTC (permalink / raw)
  Cc: dev, l.wojciechow

During review and verification of the patch created by Sarosh Arif:
"test_distributor: prevent memory leakages from the pool" I found out
that running the distributor unit tests multiple times in a row causes failures.
So I investigated all the issues I found.

There are a few synchronization issues that might cause deadlocks
or corrupted data. They are fixed with this set of patches for both tests
and librte_distributor library.

---
v8:
* simplify memory model to relaxed and remove extra variable in patch 1
* rearrange order of patches: "synchronize lcores statistics"
    and "fix freeing mbufs" to avoid changing same code twice
* reword "packages" -> "packets" in "collect return mbufs" commit message
* add patch 17 fixing quitting of workers in distributor tests

v7:
* add patch 16 ensuring that tests will try sending packets until workers
    are started and have requested packets

v6:
* fix comments indentation
* fix stats atomic operations memory mode from ACQUIRE/RELEASE
    to RELAXED

v5:
* implement missing functionality in burst mode - worker shutdown
* fix shutdown test to always shutdown busy worker
* use atomic stores instead of barrier in tests clear_packet_count()
* reorder patches
* new patch 7: fix call to return_pkt in single mode
* new patch 11: replacing delays with spinlock on atomics in tests
* new patch 12: fix scalar matching algorithm
* new patch 13: new test with marking and checking every packet
* new patch 14: flush also in flight packets
* new patch 15: fix clearing returns buffer
* minor fixes in other patches

v4:
* adjust commit name prefixes app/test -> test/distributor:
* reorder patches
* use NULL oldpkt in rte_distributor_get_pkt() calls in tests

v3:
* add missing acked and tested by statements from v1

v2:
* assign NULL to freed mbufs in distributor test
* fix handshake check on legacy single distributor
     rte_distributor_return_pkt_single()
* add patch 7 passing NULL to legacy API calls if no bufs are returned
* add patch 8 fixing API documentation
 

Lukasz Wojciechowski (17):
  distributor: fix missing handshake synchronization
  distributor: fix handshake deadlock
  distributor: do not use oldpkt when not needed
  distributor: handle worker shutdown in burst mode
  test/distributor: fix shutdown of busy worker
  distributor: fix return pkt calls in single mode
  test/distributor: fix freeing mbufs
  test/distributor: synchronize lcores statistics
  test/distributor: collect return mbufs
  distributor: align API documentation with code
  test/distributor: replace delays with spin locks
  distributor: fix scalar matching
  test/distributor: add test with packets marking
  distributor: fix flushing in flight packets
  distributor: fix clearing returns buffer
  test/distributor: ensure all packets are delivered
  test/distributor: fix quitting workers

 app/test/test_distributor.c                   | 353 ++++++++++++++----
 lib/librte_distributor/distributor_private.h  |   3 +
 lib/librte_distributor/rte_distributor.c      | 217 ++++++++---
 lib/librte_distributor/rte_distributor.h      |  23 +-
 .../rte_distributor_single.c                  |   4 +
 5 files changed, 473 insertions(+), 127 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v8 01/17] distributor: fix missing handshake synchronization
       [not found]                               ` <CGME20201017030710eucas1p17fb6129fd3414b4b6b70dcd593c01a40@eucas1p1.samsung.com>
@ 2020-10-17  3:06                                 ` Lukasz Wojciechowski
  2020-10-17 21:05                                   ` Honnappa Nagarahalli
  0 siblings, 1 reply; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-17  3:06 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

The rte_distributor_return_pkt function, which is run on worker cores,
must wait for the distributor core to clear the handshake on retptr64
before using those buffers. While the handshake is set, the distributor
core controls the buffers, and any operation on the worker side might
overwrite buffers which have not been read yet.
The same situation appears in the legacy single distributor. The
rte_distributor_return_pkt_single function shouldn't modify bufptr64
until the handshake on it is cleared by the distributor lcore.
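
A minimal sketch of the rule this enforces on the worker side (flag and
field names as in the diff below; the plain spin body is illustrative,
the actual patch adds a small rdtsc back-off):

    /* retptr64[0] carries the worker/distributor handshake; while
     * GET_BUF is set the distributor still owns the return slots,
     * so the worker must not store new return pointers yet.
     */
    while (__atomic_load_n(&buf->retptr64[0], __ATOMIC_RELAXED)
            & RTE_DISTRIB_GET_BUF)
        rte_pause();
    /* handshake cleared: safe to fill retptr64[] with returns */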

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_distributor/rte_distributor.c        | 12 ++++++++++++
 lib/librte_distributor/rte_distributor_single.c |  4 ++++
 2 files changed, 16 insertions(+)

diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 1c047f065..c6b19a388 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -169,6 +169,18 @@ rte_distributor_return_pkt(struct rte_distributor *d,
 			return -EINVAL;
 	}
 
+	/* Spin while handshake bits are set (scheduler clears it).
+	 * Sync with worker on GET_BUF flag.
+	 */
+	while (unlikely(__atomic_load_n(&(buf->retptr64[0]), __ATOMIC_RELAXED)
+			& RTE_DISTRIB_GET_BUF)) {
+		rte_pause();
+		uint64_t t = rte_rdtsc()+100;
+
+		while (rte_rdtsc() < t)
+			rte_pause();
+	}
+
 	/* Sync with distributor to acquire retptrs */
 	__atomic_thread_fence(__ATOMIC_ACQUIRE);
 	for (i = 0; i < RTE_DIST_BURST_SIZE; i++)
diff --git a/lib/librte_distributor/rte_distributor_single.c b/lib/librte_distributor/rte_distributor_single.c
index abaf7730c..f4725b1d0 100644
--- a/lib/librte_distributor/rte_distributor_single.c
+++ b/lib/librte_distributor/rte_distributor_single.c
@@ -74,6 +74,10 @@ rte_distributor_return_pkt_single(struct rte_distributor_single *d,
 	union rte_distributor_buffer_single *buf = &d->bufs[worker_id];
 	uint64_t req = (((int64_t)(uintptr_t)oldpkt) << RTE_DISTRIB_FLAG_BITS)
 			| RTE_DISTRIB_RETURN_BUF;
+	while (unlikely(__atomic_load_n(&buf->bufptr64, __ATOMIC_RELAXED)
+			& RTE_DISTRIB_FLAGS_MASK))
+		rte_pause();
+
 	/* Sync with distributor on RETURN_BUF flag. */
 	__atomic_store_n(&(buf->bufptr64), req, __ATOMIC_RELEASE);
 	return 0;
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v8 02/17] distributor: fix handshake deadlock
       [not found]                               ` <CGME20201017030711eucas1p1b70f13e4636ad7c3e842b48726ae1845@eucas1p1.samsung.com>
@ 2020-10-17  3:06                                 ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-17  3:06 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

Synchronization of data exchange between distributor and worker cores
is based on 2 handshakes: retptr64 for returning mbufs from workers
to the distributor, and bufptr64 for passing mbufs to workers.

Without a proper order of verifying those 2 handshakes a deadlock may
occur. This can happen when a worker core wants to return mbufs
and waits for the retptr handshake to be cleared, while the distributor
core waits for bufptr to send mbufs to the worker.

This can happen because a worker core first returns mbufs to the
distributor and only later gets new mbufs, while the distributor first
releases mbufs to the worker and only later handles the returned
packets.

This patch removes the possibility of deadlock by always taking care
of the returned packets first on the distributor side, and by handling
returns while waiting to release new packets.
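
Reduced to the two blocking waits, the cycle looks like this
(illustrative pseudo-code only; the *_is_set helpers are hypothetical
shorthands for the flag tests in the library):

    /* worker, in rte_distributor_return_pkt(): waits until the
     * distributor reads the previous returns from retptr64
     */
    while (retptr64_handshake_is_set(buf))
        rte_pause();

    /* distributor, in release(): waits until the worker raises
     * GET_BUF on bufptr64 to request new packets
     */
    while (!bufptr64_get_buf_is_set(buf))
        rte_pause();

    /* Each side waits for an action the other performs only after
     * its own wait completes, so neither progresses. The fix calls
     * handle_returns() inside the distributor's wait loop, which
     * clears retptr64 and unblocks the worker.
     */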

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_distributor/rte_distributor.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index c6b19a388..d6d4350a2 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -319,12 +319,14 @@ release(struct rte_distributor *d, unsigned int wkr)
 	struct rte_distributor_buffer *buf = &(d->bufs[wkr]);
 	unsigned int i;
 
+	handle_returns(d, wkr);
+
 	/* Sync with worker on GET_BUF flag */
 	while (!(__atomic_load_n(&(d->bufs[wkr].bufptr64[0]), __ATOMIC_ACQUIRE)
-		& RTE_DISTRIB_GET_BUF))
+		& RTE_DISTRIB_GET_BUF)) {
+		handle_returns(d, wkr);
 		rte_pause();
-
-	handle_returns(d, wkr);
+	}
 
 	buf->count = 0;
 
@@ -374,6 +376,7 @@ rte_distributor_process(struct rte_distributor *d,
 		/* Flush out all non-full cache-lines to workers. */
 		for (wid = 0 ; wid < d->num_workers; wid++) {
 			/* Sync with worker on GET_BUF flag. */
+			handle_returns(d, wid);
 			if (__atomic_load_n(&(d->bufs[wid].bufptr64[0]),
 				__ATOMIC_ACQUIRE) & RTE_DISTRIB_GET_BUF) {
 				release(d, wid);
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v8 03/17] distributor: do not use oldpkt when not needed
       [not found]                               ` <CGME20201017030711eucas1p14855de461cd9d6a4fd3e4bac031b53e5@eucas1p1.samsung.com>
@ 2020-10-17  3:06                                 ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-17  3:06 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

rte_distributor_request_pkt and rte_distributor_get_pkt dereferenced
the oldpkt parameter when in RTE_DIST_ALG_SINGLE mode even if the
number of buffers returned from worker to distributor was 0.

This patch passes NULL to the legacy API when the number of returned
buffers is 0. This allows passing NULL as the oldpkt parameter.

Distributor tests are also updated to pass NULL as oldpkt and
0 as the number of returned packets where no packets are returned.
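
The resulting worker-side calling convention, as used in the updated
tests below (sketch):

    struct rte_mbuf *buf[8] __rte_cache_aligned;
    unsigned int num;

    /* nothing to hand back yet, so no dummy array is needed:
     * oldpkt may simply be NULL with a return count of 0
     */
    num = rte_distributor_get_pkt(d, id, buf, NULL, 0);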

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 app/test/test_distributor.c              | 28 +++++++++---------------
 lib/librte_distributor/rte_distributor.c |  4 ++--
 2 files changed, 12 insertions(+), 20 deletions(-)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index ba1f81cf8..52230d250 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -62,13 +62,10 @@ handle_work(void *arg)
 	struct rte_mbuf *buf[8] __rte_cache_aligned;
 	struct worker_params *wp = arg;
 	struct rte_distributor *db = wp->dist;
-	unsigned int count = 0, num = 0;
+	unsigned int count = 0, num;
 	unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED);
-	int i;
 
-	for (i = 0; i < 8; i++)
-		buf[i] = NULL;
-	num = rte_distributor_get_pkt(db, id, buf, buf, num);
+	num = rte_distributor_get_pkt(db, id, buf, NULL, 0);
 	while (!quit) {
 		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
 				__ATOMIC_RELAXED);
@@ -272,19 +269,16 @@ handle_work_with_free_mbufs(void *arg)
 	struct rte_distributor *d = wp->dist;
 	unsigned int count = 0;
 	unsigned int i;
-	unsigned int num = 0;
+	unsigned int num;
 	unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED);
 
-	for (i = 0; i < 8; i++)
-		buf[i] = NULL;
-	num = rte_distributor_get_pkt(d, id, buf, buf, num);
+	num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 	while (!quit) {
 		worker_stats[id].handled_packets += num;
 		count += num;
 		for (i = 0; i < num; i++)
 			rte_pktmbuf_free(buf[i]);
-		num = rte_distributor_get_pkt(d,
-				id, buf, buf, num);
+		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 	}
 	worker_stats[id].handled_packets += num;
 	count += num;
@@ -342,14 +336,14 @@ handle_work_for_shutdown_test(void *arg)
 	struct worker_params *wp = arg;
 	struct rte_distributor *d = wp->dist;
 	unsigned int count = 0;
-	unsigned int num = 0;
+	unsigned int num;
 	unsigned int total = 0;
 	unsigned int i;
 	unsigned int returned = 0;
 	const unsigned int id = __atomic_fetch_add(&worker_idx, 1,
 			__ATOMIC_RELAXED);
 
-	num = rte_distributor_get_pkt(d, id, buf, buf, num);
+	num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 
 	/* wait for quit single globally, or for worker zero, wait
 	 * for zero_quit */
@@ -358,8 +352,7 @@ handle_work_for_shutdown_test(void *arg)
 		count += num;
 		for (i = 0; i < num; i++)
 			rte_pktmbuf_free(buf[i]);
-		num = rte_distributor_get_pkt(d,
-				id, buf, buf, num);
+		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 		total += num;
 	}
 	worker_stats[id].handled_packets += num;
@@ -373,14 +366,13 @@ handle_work_for_shutdown_test(void *arg)
 		while (zero_quit)
 			usleep(100);
 
-		num = rte_distributor_get_pkt(d,
-				id, buf, buf, num);
+		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 
 		while (!quit) {
 			worker_stats[id].handled_packets += num;
 			count += num;
 			rte_pktmbuf_free(pkt);
-			num = rte_distributor_get_pkt(d, id, buf, buf, num);
+			num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 		}
 		returned = rte_distributor_return_pkt(d,
 				id, buf, num);
diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index d6d4350a2..93c90cf54 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -42,7 +42,7 @@ rte_distributor_request_pkt(struct rte_distributor *d,
 
 	if (unlikely(d->alg_type == RTE_DIST_ALG_SINGLE)) {
 		rte_distributor_request_pkt_single(d->d_single,
-			worker_id, oldpkt[0]);
+			worker_id, count ? oldpkt[0] : NULL);
 		return;
 	}
 
@@ -134,7 +134,7 @@ rte_distributor_get_pkt(struct rte_distributor *d,
 	if (unlikely(d->alg_type == RTE_DIST_ALG_SINGLE)) {
 		if (return_count <= 1) {
 			pkts[0] = rte_distributor_get_pkt_single(d->d_single,
-				worker_id, oldpkt[0]);
+				worker_id, return_count ? oldpkt[0] : NULL);
 			return (pkts[0]) ? 1 : 0;
 		} else
 			return -EINVAL;
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v8 04/17] distributor: handle worker shutdown in burst mode
       [not found]                               ` <CGME20201017030712eucas1p1ce19efadc60ed2888dc615cbb2549bdc@eucas1p1.samsung.com>
@ 2020-10-17  3:06                                 ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-17  3:06 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

The burst version of the distributor implementation was missing proper
handling of worker shutdown. A worker processing packets received
from the distributor can call the rte_distributor_return_pkt() function
to inform the distributor that it wants no more packets. However,
further calls to rte_distributor_request_pkt() or
rte_distributor_get_pkt() should inform the distributor that new
packets are requested again.

Because this was not implemented properly, even after a worker informed
the distributor that it was returning its last packets, new packets were
still sent to it, causing deadlocks as no one could pick them up on the
worker side.

This patch adds handling of worker shutdown in the following way:
1) It fixes the usage of the RTE_DISTRIB_VALID_BUF handshake flag.
This flag was formerly unused in the burst implementation and is now
used for marking valid packets in retptr64, replacing the invalid use
of the RTE_DISTRIB_RETURN_BUF flag.
2) It uses RTE_DISTRIB_RETURN_BUF as a worker-to-distributor handshake
in retptr64 to indicate that the worker has shut down.
3) A worker that shuts down also blocks bufptr for itself with the
RTE_DISTRIB_RETURN_BUF flag, allowing the distributor to retrieve any
in-flight packets.
4) When the distributor receives information about the shutdown of a
worker, it: marks the worker as not active; retrieves any in-flight and
backlog packets and reprocesses them to different workers; unlocks
bufptr64 by clearing the RTE_DISTRIB_RETURN_BUF flag, allowing future
use if the worker requests any new packets.
5) It does not send or add to the backlog any packets for workers that
are not active. Such workers are also ignored if matched.
6) It adjusts the calls to handle_returns() and the tag matching
procedure to react to possible activation and deactivation of workers
(see the sketch after this list).
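
A condensed sketch of the retptr64 handling this introduces on the
distributor side (flag and field names are from the library;
collect_valid_returns() is a hypothetical shorthand for the store loop
in the diff below):

    uint64_t hs = __atomic_load_n(&buf->retptr64[0], __ATOMIC_ACQUIRE);

    if (hs & (RTE_DISTRIB_GET_BUF | RTE_DISTRIB_RETURN_BUF)) {
        /* VALID_BUF now marks slots holding returned mbufs */
        collect_valid_returns(buf);

        /* GET_BUF: worker requests more packets -> active;
         * RETURN_BUF: worker shuts down -> inactive, and its
         * in-flight and backlog packets are redistributed
         */
        d->activesum -= d->active[wkr];
        d->active[wkr] = !!(hs & RTE_DISTRIB_GET_BUF);
        d->activesum += d->active[wkr];

        if (unlikely(hs & RTE_DISTRIB_RETURN_BUF))
            handle_worker_shutdown(d, wkr);
    }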

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_distributor/distributor_private.h |   3 +
 lib/librte_distributor/rte_distributor.c     | 175 +++++++++++++++----
 2 files changed, 146 insertions(+), 32 deletions(-)

diff --git a/lib/librte_distributor/distributor_private.h b/lib/librte_distributor/distributor_private.h
index 489aef2ac..689fe3e18 100644
--- a/lib/librte_distributor/distributor_private.h
+++ b/lib/librte_distributor/distributor_private.h
@@ -155,6 +155,9 @@ struct rte_distributor {
 	enum rte_distributor_match_function dist_match_fn;
 
 	struct rte_distributor_single *d_single;
+
+	uint8_t active[RTE_DISTRIB_MAX_WORKERS];
+	uint8_t activesum;
 };
 
 void
diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 93c90cf54..7aa079d53 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -51,7 +51,7 @@ rte_distributor_request_pkt(struct rte_distributor *d,
 	 * Sync with worker on GET_BUF flag.
 	 */
 	while (unlikely(__atomic_load_n(retptr64, __ATOMIC_ACQUIRE)
-			& RTE_DISTRIB_GET_BUF)) {
+			& (RTE_DISTRIB_GET_BUF | RTE_DISTRIB_RETURN_BUF))) {
 		rte_pause();
 		uint64_t t = rte_rdtsc()+100;
 
@@ -67,11 +67,11 @@ rte_distributor_request_pkt(struct rte_distributor *d,
 	for (i = count; i < RTE_DIST_BURST_SIZE; i++)
 		buf->retptr64[i] = 0;
 
-	/* Set Return bit for each packet returned */
+	/* Set VALID_BUF bit for each packet returned */
 	for (i = count; i-- > 0; )
 		buf->retptr64[i] =
 			(((int64_t)(uintptr_t)(oldpkt[i])) <<
-			RTE_DISTRIB_FLAG_BITS) | RTE_DISTRIB_RETURN_BUF;
+			RTE_DISTRIB_FLAG_BITS) | RTE_DISTRIB_VALID_BUF;
 
 	/*
 	 * Finally, set the GET_BUF  to signal to distributor that cache
@@ -97,11 +97,13 @@ rte_distributor_poll_pkt(struct rte_distributor *d,
 		return (pkts[0]) ? 1 : 0;
 	}
 
-	/* If bit is set, return
+	/* If any of below bits is set, return.
+	 * GET_BUF is set when distributor hasn't sent any packets yet
+	 * RETURN_BUF is set when distributor must retrieve in-flight packets
 	 * Sync with distributor to acquire bufptrs
 	 */
 	if (__atomic_load_n(&(buf->bufptr64[0]), __ATOMIC_ACQUIRE)
-		& RTE_DISTRIB_GET_BUF)
+		& (RTE_DISTRIB_GET_BUF | RTE_DISTRIB_RETURN_BUF))
 		return -1;
 
 	/* since bufptr64 is signed, this should be an arithmetic shift */
@@ -113,7 +115,7 @@ rte_distributor_poll_pkt(struct rte_distributor *d,
 	}
 
 	/*
-	 * so now we've got the contents of the cacheline into an  array of
+	 * so now we've got the contents of the cacheline into an array of
 	 * mbuf pointers, so toggle the bit so scheduler can start working
 	 * on the next cacheline while we're working.
 	 * Sync with distributor on GET_BUF flag. Release bufptrs.
@@ -173,7 +175,7 @@ rte_distributor_return_pkt(struct rte_distributor *d,
 	 * Sync with worker on GET_BUF flag.
 	 */
 	while (unlikely(__atomic_load_n(&(buf->retptr64[0]), __ATOMIC_RELAXED)
-			& RTE_DISTRIB_GET_BUF)) {
+			& (RTE_DISTRIB_GET_BUF | RTE_DISTRIB_RETURN_BUF))) {
 		rte_pause();
 		uint64_t t = rte_rdtsc()+100;
 
@@ -185,17 +187,25 @@ rte_distributor_return_pkt(struct rte_distributor *d,
 	__atomic_thread_fence(__ATOMIC_ACQUIRE);
 	for (i = 0; i < RTE_DIST_BURST_SIZE; i++)
 		/* Switch off the return bit first */
-		buf->retptr64[i] &= ~RTE_DISTRIB_RETURN_BUF;
+		buf->retptr64[i] = 0;
 
 	for (i = num; i-- > 0; )
 		buf->retptr64[i] = (((int64_t)(uintptr_t)oldpkt[i]) <<
-			RTE_DISTRIB_FLAG_BITS) | RTE_DISTRIB_RETURN_BUF;
+			RTE_DISTRIB_FLAG_BITS) | RTE_DISTRIB_VALID_BUF;
+
+	/* Use RETURN_BUF on bufptr64 to notify distributor that
+	 * we won't read any mbufs from there even if GET_BUF is set.
+	 * This allows distributor to retrieve in-flight already sent packets.
+	 */
+	__atomic_or_fetch(&(buf->bufptr64[0]), RTE_DISTRIB_RETURN_BUF,
+		__ATOMIC_ACQ_REL);
 
-	/* set the GET_BUF but even if we got no returns.
-	 * Sync with distributor on GET_BUF flag. Release retptrs.
+	/* set the RETURN_BUF on retptr64 even if we got no returns.
+	 * Sync with distributor on RETURN_BUF flag. Release retptrs.
+	 * Notify distributor that we don't request more packets any more.
 	 */
 	__atomic_store_n(&(buf->retptr64[0]),
-		buf->retptr64[0] | RTE_DISTRIB_GET_BUF, __ATOMIC_RELEASE);
+		buf->retptr64[0] | RTE_DISTRIB_RETURN_BUF, __ATOMIC_RELEASE);
 
 	return 0;
 }
@@ -265,6 +275,59 @@ find_match_scalar(struct rte_distributor *d,
 	 */
 }
 
+/*
+ * When worker called rte_distributor_return_pkt()
+ * and passed RTE_DISTRIB_RETURN_BUF handshake through retptr64,
+ * distributor must retrieve both inflight and backlog packets assigned
+ * to the worker and reprocess them to another worker.
+ */
+static void
+handle_worker_shutdown(struct rte_distributor *d, unsigned int wkr)
+{
+	struct rte_distributor_buffer *buf = &(d->bufs[wkr]);
+	/* double BURST size for storing both inflights and backlog */
+	struct rte_mbuf *pkts[RTE_DIST_BURST_SIZE * 2];
+	unsigned int pkts_count = 0;
+	unsigned int i;
+
+	/* If GET_BUF is cleared there are in-flight packets sent
+	 * to worker which does not require new packets.
+	 * They must be retrieved and assigned to another worker.
+	 */
+	if (!(__atomic_load_n(&(buf->bufptr64[0]), __ATOMIC_ACQUIRE)
+		& RTE_DISTRIB_GET_BUF))
+		for (i = 0; i < RTE_DIST_BURST_SIZE; i++)
+			if (buf->bufptr64[i] & RTE_DISTRIB_VALID_BUF)
+				pkts[pkts_count++] = (void *)((uintptr_t)
+					(buf->bufptr64[i]
+						>> RTE_DISTRIB_FLAG_BITS));
+
+	/* Make following operations on handshake flags on bufptr64:
+	 * - set GET_BUF to indicate that distributor can overwrite buffer
+	 *     with new packets if worker will make a new request.
+	 * - clear RETURN_BUF to unlock reads on worker side.
+	 */
+	__atomic_store_n(&(buf->bufptr64[0]), RTE_DISTRIB_GET_BUF,
+		__ATOMIC_RELEASE);
+
+	/* Collect backlog packets from worker */
+	for (i = 0; i < d->backlog[wkr].count; i++)
+		pkts[pkts_count++] = (void *)((uintptr_t)
+			(d->backlog[wkr].pkts[i] >> RTE_DISTRIB_FLAG_BITS));
+
+	d->backlog[wkr].count = 0;
+
+	/* Clear both inflight and backlog tags */
+	for (i = 0; i < RTE_DIST_BURST_SIZE; i++) {
+		d->in_flight_tags[wkr][i] = 0;
+		d->backlog[wkr].tags[i] = 0;
+	}
+
+	/* Recursive call */
+	if (pkts_count > 0)
+		rte_distributor_process(d, pkts, pkts_count);
+}
+
 
 /*
  * When the handshake bits indicate that there are packets coming
@@ -283,19 +346,33 @@ handle_returns(struct rte_distributor *d, unsigned int wkr)
 
 	/* Sync on GET_BUF flag. Acquire retptrs. */
 	if (__atomic_load_n(&(buf->retptr64[0]), __ATOMIC_ACQUIRE)
-		& RTE_DISTRIB_GET_BUF) {
+		& (RTE_DISTRIB_GET_BUF | RTE_DISTRIB_RETURN_BUF)) {
 		for (i = 0; i < RTE_DIST_BURST_SIZE; i++) {
-			if (buf->retptr64[i] & RTE_DISTRIB_RETURN_BUF) {
+			if (buf->retptr64[i] & RTE_DISTRIB_VALID_BUF) {
 				oldbuf = ((uintptr_t)(buf->retptr64[i] >>
 					RTE_DISTRIB_FLAG_BITS));
 				/* store returns in a circular buffer */
 				store_return(oldbuf, d, &ret_start, &ret_count);
 				count++;
-				buf->retptr64[i] &= ~RTE_DISTRIB_RETURN_BUF;
+				buf->retptr64[i] &= ~RTE_DISTRIB_VALID_BUF;
 			}
 		}
 		d->returns.start = ret_start;
 		d->returns.count = ret_count;
+
+		/* If worker requested packets with GET_BUF, set it to active
+		 * otherwise (RETURN_BUF), set it to not active.
+		 */
+		d->activesum -= d->active[wkr];
+		d->active[wkr] = !!(buf->retptr64[0] & RTE_DISTRIB_GET_BUF);
+		d->activesum += d->active[wkr];
+
+		/* If worker returned packets without requesting new ones,
+		 * handle all in-flights and backlog packets assigned to it.
+		 */
+		if (unlikely(buf->retptr64[0] & RTE_DISTRIB_RETURN_BUF))
+			handle_worker_shutdown(d, wkr);
+
 		/* Clear for the worker to populate with more returns.
 		 * Sync with distributor on GET_BUF flag. Release retptrs.
 		 */
@@ -320,11 +397,15 @@ release(struct rte_distributor *d, unsigned int wkr)
 	unsigned int i;
 
 	handle_returns(d, wkr);
+	if (unlikely(!d->active[wkr]))
+		return 0;
 
 	/* Sync with worker on GET_BUF flag */
 	while (!(__atomic_load_n(&(d->bufs[wkr].bufptr64[0]), __ATOMIC_ACQUIRE)
 		& RTE_DISTRIB_GET_BUF)) {
 		handle_returns(d, wkr);
+		if (unlikely(!d->active[wkr]))
+			return 0;
 		rte_pause();
 	}
 
@@ -364,7 +445,7 @@ rte_distributor_process(struct rte_distributor *d,
 	int64_t next_value = 0;
 	uint16_t new_tag = 0;
 	uint16_t flows[RTE_DIST_BURST_SIZE] __rte_cache_aligned;
-	unsigned int i, j, w, wid;
+	unsigned int i, j, w, wid, matching_required;
 
 	if (d->alg_type == RTE_DIST_ALG_SINGLE) {
 		/* Call the old API */
@@ -372,11 +453,13 @@ rte_distributor_process(struct rte_distributor *d,
 			mbufs, num_mbufs);
 	}
 
+	for (wid = 0 ; wid < d->num_workers; wid++)
+		handle_returns(d, wid);
+
 	if (unlikely(num_mbufs == 0)) {
 		/* Flush out all non-full cache-lines to workers. */
 		for (wid = 0 ; wid < d->num_workers; wid++) {
 			/* Sync with worker on GET_BUF flag. */
-			handle_returns(d, wid);
 			if (__atomic_load_n(&(d->bufs[wid].bufptr64[0]),
 				__ATOMIC_ACQUIRE) & RTE_DISTRIB_GET_BUF) {
 				release(d, wid);
@@ -386,6 +469,9 @@ rte_distributor_process(struct rte_distributor *d,
 		return 0;
 	}
 
+	if (unlikely(!d->activesum))
+		return 0;
+
 	while (next_idx < num_mbufs) {
 		uint16_t matches[RTE_DIST_BURST_SIZE];
 		unsigned int pkts;
@@ -410,22 +496,30 @@ rte_distributor_process(struct rte_distributor *d,
 		for (; i < RTE_DIST_BURST_SIZE; i++)
 			flows[i] = 0;
 
-		switch (d->dist_match_fn) {
-		case RTE_DIST_MATCH_VECTOR:
-			find_match_vec(d, &flows[0], &matches[0]);
-			break;
-		default:
-			find_match_scalar(d, &flows[0], &matches[0]);
-		}
+		matching_required = 1;
 
+		for (j = 0; j < pkts; j++) {
+			if (unlikely(!d->activesum))
+				return next_idx;
+
+			if (unlikely(matching_required)) {
+				switch (d->dist_match_fn) {
+				case RTE_DIST_MATCH_VECTOR:
+					find_match_vec(d, &flows[0],
+						&matches[0]);
+					break;
+				default:
+					find_match_scalar(d, &flows[0],
+						&matches[0]);
+				}
+				matching_required = 0;
+			}
 		/*
 		 * Matches array now contain the intended worker ID (+1) of
 		 * the incoming packets. Any zeroes need to be assigned
 		 * workers.
 		 */
 
-		for (j = 0; j < pkts; j++) {
-
 			next_mb = mbufs[next_idx++];
 			next_value = (((int64_t)(uintptr_t)next_mb) <<
 					RTE_DISTRIB_FLAG_BITS);
@@ -445,12 +539,18 @@ rte_distributor_process(struct rte_distributor *d,
 			 */
 			/* matches[j] = 0; */
 
-			if (matches[j]) {
+			if (matches[j] && d->active[matches[j]-1]) {
 				struct rte_distributor_backlog *bl =
 						&d->backlog[matches[j]-1];
 				if (unlikely(bl->count ==
 						RTE_DIST_BURST_SIZE)) {
 					release(d, matches[j]-1);
+					if (!d->active[matches[j]-1]) {
+						j--;
+						next_idx--;
+						matching_required = 1;
+						continue;
+					}
 				}
 
 				/* Add to worker that already has flow */
@@ -460,11 +560,21 @@ rte_distributor_process(struct rte_distributor *d,
 				bl->pkts[idx] = next_value;
 
 			} else {
-				struct rte_distributor_backlog *bl =
-						&d->backlog[wkr];
+				struct rte_distributor_backlog *bl;
+
+				while (unlikely(!d->active[wkr]))
+					wkr = (wkr + 1) % d->num_workers;
+				bl = &d->backlog[wkr];
+
 				if (unlikely(bl->count ==
 						RTE_DIST_BURST_SIZE)) {
 					release(d, wkr);
+					if (!d->active[wkr]) {
+						j--;
+						next_idx--;
+						matching_required = 1;
+						continue;
+					}
 				}
 
 				/* Add to current worker worker */
@@ -483,9 +593,7 @@ rte_distributor_process(struct rte_distributor *d,
 						matches[w] = wkr+1;
 			}
 		}
-		wkr++;
-		if (wkr >= d->num_workers)
-			wkr = 0;
+		wkr = (wkr + 1) % d->num_workers;
 	}
 
 	/* Flush out all non-full cache-lines to workers. */
@@ -661,6 +769,9 @@ rte_distributor_create(const char *name,
 	for (i = 0 ; i < num_workers ; i++)
 		d->backlog[i].tags = &d->in_flight_tags[i][RTE_DIST_BURST_SIZE];
 
+	memset(d->active, 0, sizeof(d->active));
+	d->activesum = 0;
+
 	dist_burst_list = RTE_TAILQ_CAST(rte_dist_burst_tailq.head,
 					  rte_dist_burst_list);
 
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v8 05/17] test/distributor: fix shutdown of busy worker
       [not found]                               ` <CGME20201017030713eucas1p1173c2178e647be341db2da29078c8d5d@eucas1p1.samsung.com>
@ 2020-10-17  3:06                                 ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-17  3:06 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

The sanity test with worker shutdown delegates all bufs
to be processed by a single lcore worker, then it freezes
one of the lcore workers and continues to send more bufs.
The frozen core shuts down first by calling
rte_distributor_return_pkt().

The test's intention is to verify that packets assigned to
the shut-down lcore will be reassigned to another worker.

However, the shut-down core was not always the one that was
processing packets. The lcore processing mbufs might be different
every time the test is launched. This is caused by keeping the value
of the wkr static variable in the rte_distributor_process() function
between test-case runs.

The test always froze the lcore with id 0. The patch stores the id
of the worker that is processing the data in the zero_idx global atomic
variable. This way the frozen lcore is always the proper one.
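
The claiming step boils down to a single compare-and-swap per receive
(sketch; zero_unset is a local variable, as in the diff below):

    if (num > 0) {
        /* only the first worker that actually received packets
         * claims the "zero" role; for later workers zero_idx no
         * longer holds RTE_MAX_LCORE and the exchange fails
         */
        unsigned int zero_unset = RTE_MAX_LCORE;
        __atomic_compare_exchange_n(&zero_idx, &zero_unset, id,
            false, __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE);
    }
    zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);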

Fixes: c3eabff124e6 ("distributor: add unit tests")
Cc: bruce.richardson@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Tested-by: David Hunt <david.hunt@intel.com>
---
 app/test/test_distributor.c | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index 52230d250..6cd7a2edd 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -28,6 +28,7 @@ struct worker_params worker_params;
 static volatile int quit;      /**< general quit variable for all threads */
 static volatile int zero_quit; /**< var for when we just want thr0 to quit*/
 static volatile unsigned worker_idx;
+static volatile unsigned zero_idx;
 
 struct worker_stats {
 	volatile unsigned handled_packets;
@@ -340,26 +341,43 @@ handle_work_for_shutdown_test(void *arg)
 	unsigned int total = 0;
 	unsigned int i;
 	unsigned int returned = 0;
+	unsigned int zero_id = 0;
+	unsigned int zero_unset;
 	const unsigned int id = __atomic_fetch_add(&worker_idx, 1,
 			__ATOMIC_RELAXED);
 
 	num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 
+	if (num > 0) {
+		zero_unset = RTE_MAX_LCORE;
+		__atomic_compare_exchange_n(&zero_idx, &zero_unset, id,
+			false, __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE);
+	}
+	zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
+
 	/* wait for quit single globally, or for worker zero, wait
 	 * for zero_quit */
-	while (!quit && !(id == 0 && zero_quit)) {
+	while (!quit && !(id == zero_id && zero_quit)) {
 		worker_stats[id].handled_packets += num;
 		count += num;
 		for (i = 0; i < num; i++)
 			rte_pktmbuf_free(buf[i]);
 		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
+
+		if (num > 0) {
+			zero_unset = RTE_MAX_LCORE;
+			__atomic_compare_exchange_n(&zero_idx, &zero_unset, id,
+				false, __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE);
+		}
+		zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
+
 		total += num;
 	}
 	worker_stats[id].handled_packets += num;
 	count += num;
 	returned = rte_distributor_return_pkt(d, id, buf, num);
 
-	if (id == 0) {
+	if (id == zero_id) {
 		/* for worker zero, allow it to restart to pick up last packet
 		 * when all workers are shutting down.
 		 */
@@ -578,6 +596,7 @@ quit_workers(struct worker_params *wp, struct rte_mempool *p)
 	rte_eal_mp_wait_lcore();
 	quit = 0;
 	worker_idx = 0;
+	zero_idx = RTE_MAX_LCORE;
 }
 
 static int
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v8 06/17] distributor: fix return pkt calls in single mode
       [not found]                               ` <CGME20201017030714eucas1p292bd71a85ea6d638256c21d279c8d533@eucas1p2.samsung.com>
@ 2020-10-17  3:06                                 ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-17  3:06 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

In the legacy single version of the distributor, synchronization
requires a continuous exchange of buffers between the distributor
and the workers. Empty buffers are sent if only handshake
synchronization is required.
However, calls to rte_distributor_return_pkt()
with 0 buffers in single mode were ignored and not passed to the
legacy algorithm implementation, causing a lack of synchronization.

This patch fixes the issue by passing NULL as the buffer, which is
a valid way of sending just synchronization handshakes
in single mode.
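
On the caller side this makes a handshake-only return possible, the
pattern the later test patches in this series rely on:

    /* a worker with no packets in hand can still synchronize: */
    rte_distributor_return_pkt(d, id, NULL, 0);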

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_distributor/rte_distributor.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 7aa079d53..6e3eae58f 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -167,6 +167,9 @@ rte_distributor_return_pkt(struct rte_distributor *d,
 		if (num == 1)
 			return rte_distributor_return_pkt_single(d->d_single,
 				worker_id, oldpkt[0]);
+		else if (num == 0)
+			return rte_distributor_return_pkt_single(d->d_single,
+				worker_id, NULL);
 		else
 			return -EINVAL;
 	}
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v8 07/17] test/distributor: fix freeing mbufs
       [not found]                               ` <CGME20201017030715eucas1p2366d1f0ce16a219b21542bb26e4588a6@eucas1p2.samsung.com>
@ 2020-10-17  3:06                                 ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-17  3:06 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

The sanity tests with mbuf alloc and the shutdown tests assume that
mbufs passed to worker cores are freed in the handlers.
Such packets should not be returned to the distributor's main
core. The only packets that should be returned are the packets
sent after completion of the tests in the quit_workers function.

This patch stops returning mbufs to the distributor's core.
In the case of the shutdown tests it is impossible to determine
how the worker and distributor threads would synchronize.
Packets used by the tests should be freed, while packets used during
quit_workers() shouldn't. That's why returning mbufs to the mempool
is moved from the worker threads to the test procedure running
on the distributor thread.

Additionally this patch cleans up unused variables.
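
The resulting ownership pattern on the distributor thread, for the
shutdown tests (sketch, checks abridged):

    struct rte_mbuf *bufs[BURST];

    if (rte_mempool_get_bulk(p, (void *)bufs, BURST) != 0)
        return -1;    /* nothing allocated, nothing to leak */

    rte_distributor_process(d, bufs, BURST);
    rte_distributor_flush(d);
    /* ... verify packet counters ... */

    /* the test thread, not the workers, hands the mbufs back */
    rte_mempool_put_bulk(p, (void *)bufs, BURST);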

Fixes: c0de0eb82e40 ("distributor: switch over to new API")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 app/test/test_distributor.c | 67 ++++++++++++++++++-------------------
 1 file changed, 33 insertions(+), 34 deletions(-)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index 6cd7a2edd..ec1fe348b 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -63,20 +63,18 @@ handle_work(void *arg)
 	struct rte_mbuf *buf[8] __rte_cache_aligned;
 	struct worker_params *wp = arg;
 	struct rte_distributor *db = wp->dist;
-	unsigned int count = 0, num;
+	unsigned int num;
 	unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED);
 
 	num = rte_distributor_get_pkt(db, id, buf, NULL, 0);
 	while (!quit) {
 		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
 				__ATOMIC_RELAXED);
-		count += num;
 		num = rte_distributor_get_pkt(db, id,
 				buf, buf, num);
 	}
 	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
 			__ATOMIC_RELAXED);
-	count += num;
 	rte_distributor_return_pkt(db, id, buf, num);
 	return 0;
 }
@@ -268,7 +266,6 @@ handle_work_with_free_mbufs(void *arg)
 	struct rte_mbuf *buf[8] __rte_cache_aligned;
 	struct worker_params *wp = arg;
 	struct rte_distributor *d = wp->dist;
-	unsigned int count = 0;
 	unsigned int i;
 	unsigned int num;
 	unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED);
@@ -276,13 +273,11 @@ handle_work_with_free_mbufs(void *arg)
 	num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 	while (!quit) {
 		worker_stats[id].handled_packets += num;
-		count += num;
 		for (i = 0; i < num; i++)
 			rte_pktmbuf_free(buf[i]);
 		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 	}
 	worker_stats[id].handled_packets += num;
-	count += num;
 	rte_distributor_return_pkt(d, id, buf, num);
 	return 0;
 }
@@ -308,7 +303,6 @@ sanity_test_with_mbuf_alloc(struct worker_params *wp, struct rte_mempool *p)
 			rte_distributor_process(d, NULL, 0);
 		for (j = 0; j < BURST; j++) {
 			bufs[j]->hash.usr = (i+j) << 1;
-			rte_mbuf_refcnt_set(bufs[j], 1);
 		}
 
 		rte_distributor_process(d, bufs, BURST);
@@ -332,15 +326,10 @@ sanity_test_with_mbuf_alloc(struct worker_params *wp, struct rte_mempool *p)
 static int
 handle_work_for_shutdown_test(void *arg)
 {
-	struct rte_mbuf *pkt = NULL;
 	struct rte_mbuf *buf[8] __rte_cache_aligned;
 	struct worker_params *wp = arg;
 	struct rte_distributor *d = wp->dist;
-	unsigned int count = 0;
 	unsigned int num;
-	unsigned int total = 0;
-	unsigned int i;
-	unsigned int returned = 0;
 	unsigned int zero_id = 0;
 	unsigned int zero_unset;
 	const unsigned int id = __atomic_fetch_add(&worker_idx, 1,
@@ -359,9 +348,6 @@ handle_work_for_shutdown_test(void *arg)
 	 * for zero_quit */
 	while (!quit && !(id == zero_id && zero_quit)) {
 		worker_stats[id].handled_packets += num;
-		count += num;
-		for (i = 0; i < num; i++)
-			rte_pktmbuf_free(buf[i]);
 		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 
 		if (num > 0) {
@@ -370,14 +356,12 @@ handle_work_for_shutdown_test(void *arg)
 				false, __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE);
 		}
 		zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
-
-		total += num;
 	}
 	worker_stats[id].handled_packets += num;
-	count += num;
-	returned = rte_distributor_return_pkt(d, id, buf, num);
 
 	if (id == zero_id) {
+		rte_distributor_return_pkt(d, id, NULL, 0);
+
 		/* for worker zero, allow it to restart to pick up last packet
 		 * when all workers are shutting down.
 		 */
@@ -388,14 +372,10 @@ handle_work_for_shutdown_test(void *arg)
 
 		while (!quit) {
 			worker_stats[id].handled_packets += num;
-			count += num;
-			rte_pktmbuf_free(pkt);
 			num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 		}
-		returned = rte_distributor_return_pkt(d,
-				id, buf, num);
-		printf("Num returned = %d\n", returned);
 	}
+	rte_distributor_return_pkt(d, id, buf, num);
 	return 0;
 }
 
@@ -411,7 +391,9 @@ sanity_test_with_worker_shutdown(struct worker_params *wp,
 {
 	struct rte_distributor *d = wp->dist;
 	struct rte_mbuf *bufs[BURST];
-	unsigned i;
+	struct rte_mbuf *bufs2[BURST];
+	unsigned int i;
+	unsigned int failed = 0;
 
 	printf("=== Sanity test of worker shutdown ===\n");
 
@@ -437,16 +419,17 @@ sanity_test_with_worker_shutdown(struct worker_params *wp,
 	 */
 
 	/* get more buffers to queue up, again setting them to the same flow */
-	if (rte_mempool_get_bulk(p, (void *)bufs, BURST) != 0) {
+	if (rte_mempool_get_bulk(p, (void *)bufs2, BURST) != 0) {
 		printf("line %d: Error getting mbufs from pool\n", __LINE__);
+		rte_mempool_put_bulk(p, (void *)bufs, BURST);
 		return -1;
 	}
 	for (i = 0; i < BURST; i++)
-		bufs[i]->hash.usr = 1;
+		bufs2[i]->hash.usr = 1;
 
 	/* get worker zero to quit */
 	zero_quit = 1;
-	rte_distributor_process(d, bufs, BURST);
+	rte_distributor_process(d, bufs2, BURST);
 
 	/* flush the distributor */
 	rte_distributor_flush(d);
@@ -460,9 +443,15 @@ sanity_test_with_worker_shutdown(struct worker_params *wp,
 		printf("Line %d: Error, not all packets flushed. "
 				"Expected %u, got %u\n",
 				__LINE__, BURST * 2, total_packet_count());
-		return -1;
+		failed = 1;
 	}
 
+	rte_mempool_put_bulk(p, (void *)bufs, BURST);
+	rte_mempool_put_bulk(p, (void *)bufs2, BURST);
+
+	if (failed)
+		return -1;
+
 	printf("Sanity test with worker shutdown passed\n\n");
 	return 0;
 }
@@ -476,7 +465,8 @@ test_flush_with_worker_shutdown(struct worker_params *wp,
 {
 	struct rte_distributor *d = wp->dist;
 	struct rte_mbuf *bufs[BURST];
-	unsigned i;
+	unsigned int i;
+	unsigned int failed = 0;
 
 	printf("=== Test flush fn with worker shutdown (%s) ===\n", wp->name);
 
@@ -513,9 +503,14 @@ test_flush_with_worker_shutdown(struct worker_params *wp,
 		printf("Line %d: Error, not all packets flushed. "
 				"Expected %u, got %u\n",
 				__LINE__, BURST, total_packet_count());
-		return -1;
+		failed = 1;
 	}
 
+	rte_mempool_put_bulk(p, (void *)bufs, BURST);
+
+	if (failed)
+		return -1;
+
 	printf("Flush test with worker shutdown passed\n\n");
 	return 0;
 }
@@ -581,7 +576,10 @@ quit_workers(struct worker_params *wp, struct rte_mempool *p)
 	const unsigned num_workers = rte_lcore_count() - 1;
 	unsigned i;
 	struct rte_mbuf *bufs[RTE_MAX_LCORE];
-	rte_mempool_get_bulk(p, (void *)bufs, num_workers);
+	if (rte_mempool_get_bulk(p, (void *)bufs, num_workers) != 0) {
+		printf("line %d: Error getting mbufs from pool\n", __LINE__);
+		return;
+	}
 
 	zero_quit = 0;
 	quit = 1;
@@ -589,11 +587,12 @@ quit_workers(struct worker_params *wp, struct rte_mempool *p)
 		bufs[i]->hash.usr = i << 1;
 	rte_distributor_process(d, bufs, num_workers);
 
-	rte_mempool_put_bulk(p, (void *)bufs, num_workers);
-
 	rte_distributor_process(d, NULL, 0);
 	rte_distributor_flush(d);
 	rte_eal_mp_wait_lcore();
+
+	rte_mempool_put_bulk(p, (void *)bufs, num_workers);
+
 	quit = 0;
 	worker_idx = 0;
 	zero_idx = RTE_MAX_LCORE;
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v8 08/17] test/distributor: synchronize lcores statistics
       [not found]                               ` <CGME20201017030716eucas1p2911112ee3c9e0a3f3dd9a811cbafe77b@eucas1p2.samsung.com>
@ 2020-10-17  3:06                                 ` Lukasz Wojciechowski
  2020-10-17 21:11                                   ` Honnappa Nagarahalli
  0 siblings, 1 reply; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-17  3:06 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

Statistics of handled packets are cleared and read on the main lcore,
while they are increased in the workers' handlers on different lcores.

Without synchronization the counters occasionally showed invalid values.
This patch uses atomic mechanisms to synchronize.
Relaxed memory model is used.
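
For illustration, a minimal sketch of the resulting pattern (worker_count
here stands in for the number of active workers; the counters publish no
other data, so relaxed ordering is enough):

	/* worker lcore: count handled packets */
	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
			__ATOMIC_RELAXED);

	/* main lcore: read each counter without tearing */
	unsigned int i, count = 0;
	for (i = 0; i < worker_count; i++)
		count += __atomic_load_n(&worker_stats[i].handled_packets,
				__ATOMIC_RELAXED);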

Fixes: c3eabff124e6 ("distributor: add unit tests")
Cc: bruce.richardson@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 app/test/test_distributor.c | 39 +++++++++++++++++++++++++------------
 1 file changed, 27 insertions(+), 12 deletions(-)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index ec1fe348b..4343efed1 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -43,7 +43,8 @@ total_packet_count(void)
 {
 	unsigned i, count = 0;
 	for (i = 0; i < worker_idx; i++)
-		count += worker_stats[i].handled_packets;
+		count += __atomic_load_n(&worker_stats[i].handled_packets,
+				__ATOMIC_RELAXED);
 	return count;
 }
 
@@ -51,7 +52,10 @@ total_packet_count(void)
 static inline void
 clear_packet_count(void)
 {
-	memset(&worker_stats, 0, sizeof(worker_stats));
+	unsigned int i;
+	for (i = 0; i < RTE_MAX_LCORE; i++)
+		__atomic_store_n(&worker_stats[i].handled_packets, 0,
+			__ATOMIC_RELAXED);
 }
 
 /* this is the basic worker function for sanity test
@@ -129,7 +133,8 @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
 
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
-				worker_stats[i].handled_packets);
+			__atomic_load_n(&worker_stats[i].handled_packets,
+					__ATOMIC_RELAXED));
 	printf("Sanity test with all zero hashes done.\n");
 
 	/* pick two flows and check they go correctly */
@@ -154,7 +159,9 @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
 
 		for (i = 0; i < rte_lcore_count() - 1; i++)
 			printf("Worker %u handled %u packets\n", i,
-					worker_stats[i].handled_packets);
+				__atomic_load_n(
+					&worker_stats[i].handled_packets,
+					__ATOMIC_RELAXED));
 		printf("Sanity test with two hash values done\n");
 	}
 
@@ -180,7 +187,8 @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
 
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
-				worker_stats[i].handled_packets);
+			__atomic_load_n(&worker_stats[i].handled_packets,
+					__ATOMIC_RELAXED));
 	printf("Sanity test with non-zero hashes done\n");
 
 	rte_mempool_put_bulk(p, (void *)bufs, BURST);
@@ -272,12 +280,14 @@ handle_work_with_free_mbufs(void *arg)
 
 	num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 	while (!quit) {
-		worker_stats[id].handled_packets += num;
+		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
+				__ATOMIC_RELAXED);
 		for (i = 0; i < num; i++)
 			rte_pktmbuf_free(buf[i]);
 		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 	}
-	worker_stats[id].handled_packets += num;
+	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
+			__ATOMIC_RELAXED);
 	rte_distributor_return_pkt(d, id, buf, num);
 	return 0;
 }
@@ -347,7 +357,8 @@ handle_work_for_shutdown_test(void *arg)
 	/* wait for quit single globally, or for worker zero, wait
 	 * for zero_quit */
 	while (!quit && !(id == zero_id && zero_quit)) {
-		worker_stats[id].handled_packets += num;
+		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
+				__ATOMIC_RELAXED);
 		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 
 		if (num > 0) {
@@ -357,8 +368,9 @@ handle_work_for_shutdown_test(void *arg)
 		}
 		zero_id = __atomic_load_n(&zero_idx, __ATOMIC_ACQUIRE);
 	}
-	worker_stats[id].handled_packets += num;
 
+	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
+			__ATOMIC_RELAXED);
 	if (id == zero_id) {
 		rte_distributor_return_pkt(d, id, NULL, 0);
 
@@ -371,7 +383,8 @@ handle_work_for_shutdown_test(void *arg)
 		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 
 		while (!quit) {
-			worker_stats[id].handled_packets += num;
+			__atomic_fetch_add(&worker_stats[id].handled_packets,
+					num, __ATOMIC_RELAXED);
 			num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 		}
 	}
@@ -437,7 +450,8 @@ sanity_test_with_worker_shutdown(struct worker_params *wp,
 
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
-				worker_stats[i].handled_packets);
+			__atomic_load_n(&worker_stats[i].handled_packets,
+					__ATOMIC_RELAXED));
 
 	if (total_packet_count() != BURST * 2) {
 		printf("Line %d: Error, not all packets flushed. "
@@ -497,7 +511,8 @@ test_flush_with_worker_shutdown(struct worker_params *wp,
 	zero_quit = 0;
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
-				worker_stats[i].handled_packets);
+			__atomic_load_n(&worker_stats[i].handled_packets,
+					__ATOMIC_RELAXED));
 
 	if (total_packet_count() != BURST) {
 		printf("Line %d: Error, not all packets flushed. "
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v8 09/17] test/distributor: collect return mbufs
       [not found]                               ` <CGME20201017030717eucas1p1ae327494575f851af4bdf77f3e8c83ae@eucas1p1.samsung.com>
@ 2020-10-17  3:06                                 ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-17  3:06 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

During the quit_workers function the distributor's main core processes
some packets to wake up pending worker cores so they can quit.
As quit_workers also acts as a cleanup procedure for the next test
case, it should collect the packets returned by the workers'
handlers, so that the cyclic buffer of returned packets in the
distributor remains empty.

Fixes: c3eabff124e6 ("distributor: add unit tests")
Cc: bruce.richardson@intel.com
Fixes: c0de0eb82e40 ("distributor: switch over to new API")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 app/test/test_distributor.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index 4343efed1..3f0aeb7b9 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -591,6 +591,7 @@ quit_workers(struct worker_params *wp, struct rte_mempool *p)
 	const unsigned num_workers = rte_lcore_count() - 1;
 	unsigned i;
 	struct rte_mbuf *bufs[RTE_MAX_LCORE];
+	struct rte_mbuf *returns[RTE_MAX_LCORE];
 	if (rte_mempool_get_bulk(p, (void *)bufs, num_workers) != 0) {
 		printf("line %d: Error getting mbufs from pool\n", __LINE__);
 		return;
@@ -606,6 +607,10 @@ quit_workers(struct worker_params *wp, struct rte_mempool *p)
 	rte_distributor_flush(d);
 	rte_eal_mp_wait_lcore();
 
+	while (rte_distributor_returned_pkts(d, returns, RTE_MAX_LCORE))
+		;
+
+	rte_distributor_clear_returns(d);
 	rte_mempool_put_bulk(p, (void *)bufs, num_workers);
 
 	quit = 0;
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v8 10/17] distributor: align API documentation with code
       [not found]                               ` <CGME20201017030718eucas1p256e1f934af12af2a6b07640c9de7a766@eucas1p2.samsung.com>
@ 2020-10-17  3:06                                 ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-17  3:06 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

After introducing the burst API, some artefacts from the legacy
single API remained in the API documentation.
Also, the documented return values of the rte_distributor_poll_pkt()
function did not match the implementation.
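
For example, a worker using the non-blocking request/poll pair has to
interpret the return value according to the algorithm in use; a minimal
sketch (argument names follow the prototypes, not the test code):

	/* burst API (RTE_DIST_ALG_BURST) */
	rte_distributor_request_pkt(d, worker_id, oldpkt, count);
	int n;
	do {
		/* other useful work can be done while waiting */
		n = rte_distributor_poll_pkt(d, worker_id, pkts);
	} while (n < 0); /* -1 means the request is not yet fulfilled */
	/* the legacy single API (RTE_DIST_ALG_SINGLE) reports the same
	 * "not yet fulfilled" condition as 0 instead */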

Fixes: c0de0eb82e40 ("distributor: switch over to new API")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_distributor/rte_distributor.h | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/lib/librte_distributor/rte_distributor.h b/lib/librte_distributor/rte_distributor.h
index 327c0c4ab..a073e6461 100644
--- a/lib/librte_distributor/rte_distributor.h
+++ b/lib/librte_distributor/rte_distributor.h
@@ -155,7 +155,7 @@ rte_distributor_clear_returns(struct rte_distributor *d);
  * @param pkts
  *   The mbufs pointer array to be filled in (up to 8 packets)
  * @param oldpkt
- *   The previous packet, if any, being processed by the worker
+ *   The previous packets, if any, being processed by the worker
  * @param retcount
  *   The number of packets being returned
  *
@@ -187,15 +187,15 @@ rte_distributor_return_pkt(struct rte_distributor *d,
 
 /**
  * API called by a worker to request a new packet to process.
- * Any previous packet given to the worker is assumed to have completed
+ * Any previous packets given to the worker are assumed to have completed
  * processing, and may be optionally returned to the distributor via
  * the oldpkt parameter.
- * Unlike rte_distributor_get_pkt_burst(), this function does not wait for a
- * new packet to be provided by the distributor.
+ * Unlike rte_distributor_get_pkt(), this function does not wait for
+ * new packets to be provided by the distributor.
  *
- * NOTE: after calling this function, rte_distributor_poll_pkt_burst() should
- * be used to poll for the packet requested. The rte_distributor_get_pkt_burst()
- * API should *not* be used to try and retrieve the new packet.
+ * NOTE: after calling this function, rte_distributor_poll_pkt() should
+ * be used to poll for the packets requested. The rte_distributor_get_pkt()
+ * API should *not* be used to try and retrieve the new packets.
  *
  * @param d
  *   The distributor instance to be used
@@ -213,9 +213,9 @@ rte_distributor_request_pkt(struct rte_distributor *d,
 		unsigned int count);
 
 /**
- * API called by a worker to check for a new packet that was previously
+ * API called by a worker to check for new packets that were previously
  * requested by a call to rte_distributor_request_pkt(). It does not wait
- * for the new packet to be available, but returns NULL if the request has
+ * for the new packets to be available, but returns if the request has
  * not yet been fulfilled by the distributor.
  *
  * @param d
@@ -227,8 +227,9 @@ rte_distributor_request_pkt(struct rte_distributor *d,
  *   The array of mbufs being given to the worker
  *
  * @return
- *   The number of packets being given to the worker thread, zero if no
- *   packet is yet available.
+ *   The number of packets being given to the worker thread,
+ *   -1 if no packets are yet available (burst API - RTE_DIST_ALG_BURST)
+ *   0 if no packets are yet available (legacy single API - RTE_DIST_ALG_SINGLE)
  */
 int
 rte_distributor_poll_pkt(struct rte_distributor *d,
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v8 11/17] test/distributor: replace delays with spin locks
       [not found]                               ` <CGME20201017030719eucas1p13b13db1fbc3715e19e81bb4be4635b7d@eucas1p1.samsung.com>
@ 2020-10-17  3:06                                 ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-17  3:06 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

Instead of inserting fixed delays in the test code and hoping
that a worker reaches the proper state in time, synchronize the
worker shutdown test cases by spinning on an atomic variable.
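
Gathered from the hunks below, the two sides of the handshake look
roughly like this (zero_sleep is the flag added by this patch; the
release store on the worker pairs with the acquire load on the main
lcore):

	/* worker zero, parking: */
	__atomic_store_n(&zero_sleep, 1, __ATOMIC_RELEASE);
	while (zero_quit)
		usleep(100);
	__atomic_store_n(&zero_sleep, 0, __ATOMIC_RELEASE);

	/* main lcore, replacing the fixed rte_delay_us(10000): */
	while (!__atomic_load_n(&zero_sleep, __ATOMIC_ACQUIRE))
		rte_distributor_flush(d); /* keep flushing until it parks */
	zero_quit = 0;
	while (__atomic_load_n(&zero_sleep, __ATOMIC_ACQUIRE))
		rte_delay_us(100); /* wait until it resumes */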

Fixes: c0de0eb82e40 ("distributor: switch over to new API")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 app/test/test_distributor.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index 3f0aeb7b9..fdb6ea9ce 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -27,6 +27,7 @@ struct worker_params worker_params;
 /* statics - all zero-initialized by default */
 static volatile int quit;      /**< general quit variable for all threads */
 static volatile int zero_quit; /**< var for when we just want thr0 to quit*/
+static volatile int zero_sleep; /**< thr0 has quit basic loop and is sleeping*/
 static volatile unsigned worker_idx;
 static volatile unsigned zero_idx;
 
@@ -377,8 +378,10 @@ handle_work_for_shutdown_test(void *arg)
 		/* for worker zero, allow it to restart to pick up last packet
 		 * when all workers are shutting down.
 		 */
+		__atomic_store_n(&zero_sleep, 1, __ATOMIC_RELEASE);
 		while (zero_quit)
 			usleep(100);
+		__atomic_store_n(&zero_sleep, 0, __ATOMIC_RELEASE);
 
 		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
 
@@ -446,7 +449,12 @@ sanity_test_with_worker_shutdown(struct worker_params *wp,
 
 	/* flush the distributor */
 	rte_distributor_flush(d);
-	rte_delay_us(10000);
+	while (!__atomic_load_n(&zero_sleep, __ATOMIC_ACQUIRE))
+		rte_distributor_flush(d);
+
+	zero_quit = 0;
+	while (__atomic_load_n(&zero_sleep, __ATOMIC_ACQUIRE))
+		rte_delay_us(100);
 
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
@@ -506,9 +514,14 @@ test_flush_with_worker_shutdown(struct worker_params *wp,
 	/* flush the distributor */
 	rte_distributor_flush(d);
 
-	rte_delay_us(10000);
+	while (!__atomic_load_n(&zero_sleep, __ATOMIC_ACQUIRE))
+		rte_distributor_flush(d);
 
 	zero_quit = 0;
+
+	while (__atomic_load_n(&zero_sleep, __ATOMIC_ACQUIRE))
+		rte_delay_us(100);
+
 	for (i = 0; i < rte_lcore_count() - 1; i++)
 		printf("Worker %u handled %u packets\n", i,
 			__atomic_load_n(&worker_stats[i].handled_packets,
@@ -616,6 +629,8 @@ quit_workers(struct worker_params *wp, struct rte_mempool *p)
 	quit = 0;
 	worker_idx = 0;
 	zero_idx = RTE_MAX_LCORE;
+	zero_quit = 0;
+	zero_sleep = 0;
 }
 
 static int
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v8 12/17] distributor: fix scalar matching
       [not found]                               ` <CGME20201017030720eucas1p1fe683996638c3692cae530e67271b79b@eucas1p1.samsung.com>
@ 2020-10-17  3:06                                 ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-17  3:06 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

Fix improper indexes used when comparing tags.
In the find_match_scalar() function:
* j iterates over the flow tags of incoming packets;
* w iterates over backlog or in-flight tag positions.
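
A hypothetical example of the wrong indexing (tag values invented
for illustration):

	/* incoming flow tags:      data_ptr[]          = {5, 7}
	 * worker i in-flight tags: in_flight_tags[i][] = {7, 0}
	 * Correct result: packet 1 (tag 7) matches worker i, so
	 * output_ptr[1] = i+1. The old code compared
	 * in_flight_tags[i][j] with data_ptr[w] but wrote output_ptr[j],
	 * so it set output_ptr[0] (packet 0, tag 5) and left
	 * output_ptr[1] clear -- pinning the wrong packet to the worker.
	 */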

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_distributor/rte_distributor.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 6e3eae58f..9fea3f69a 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -259,13 +259,13 @@ find_match_scalar(struct rte_distributor *d,
 
 		for (j = 0; j < RTE_DIST_BURST_SIZE ; j++)
 			for (w = 0; w < RTE_DIST_BURST_SIZE; w++)
-				if (d->in_flight_tags[i][j] == data_ptr[w]) {
+				if (d->in_flight_tags[i][w] == data_ptr[j]) {
 					output_ptr[j] = i+1;
 					break;
 				}
 		for (j = 0; j < RTE_DIST_BURST_SIZE; j++)
 			for (w = 0; w < RTE_DIST_BURST_SIZE; w++)
-				if (bl->tags[j] == data_ptr[w]) {
+				if (bl->tags[w] == data_ptr[j]) {
 					output_ptr[j] = i+1;
 					break;
 				}
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v8 13/17] test/distributor: add test with packets marking
       [not found]                               ` <CGME20201017030720eucas1p1359382fafa661abb1ba82fa65e19562c@eucas1p1.samsung.com>
@ 2020-10-17  3:06                                 ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-17  3:06 UTC (permalink / raw)
  To: David Hunt; +Cc: dev, l.wojciechow

All of the former tests analyzed only the statistics
of packets processed by all workers.
The new test also verifies that packets are processed
by the workers as expected.
Every packet processed by a worker is marked
and analyzed after it is returned to the distributor.

This test allows finding issues in matching algorithms.
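
The encoding is plain shift arithmetic; gathering the pieces from the
diff below (seq_shift is 10, so the low bits accumulate worker marks
and the high bits carry the sequence number):

	bufs[i]->udata64 = i << seq_shift;      /* sender: sequence */
	buf[i]->udata64 += id + 1;              /* worker: mark */
	seq = returns[i]->udata64 >> seq_shift; /* verifier: split */
	id = returns[i]->udata64 - (seq << seq_shift);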

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 app/test/test_distributor.c | 141 ++++++++++++++++++++++++++++++++++++
 1 file changed, 141 insertions(+)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index fdb6ea9ce..cfae5a1ac 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -543,6 +543,141 @@ test_flush_with_worker_shutdown(struct worker_params *wp,
 	return 0;
 }
 
+static int
+handle_and_mark_work(void *arg)
+{
+	struct rte_mbuf *buf[8] __rte_cache_aligned;
+	struct worker_params *wp = arg;
+	struct rte_distributor *db = wp->dist;
+	unsigned int num, i;
+	unsigned int id = __atomic_fetch_add(&worker_idx, 1, __ATOMIC_RELAXED);
+	num = rte_distributor_get_pkt(db, id, buf, NULL, 0);
+	while (!quit) {
+		__atomic_fetch_add(&worker_stats[id].handled_packets, num,
+				__ATOMIC_RELAXED);
+		for (i = 0; i < num; i++)
+			buf[i]->udata64 += id + 1;
+		num = rte_distributor_get_pkt(db, id,
+				buf, buf, num);
+	}
+	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
+			__ATOMIC_RELAXED);
+	rte_distributor_return_pkt(db, id, buf, num);
+	return 0;
+}
+
+/* sanity_mark_test sends packets to workers which mark them.
+ * Every packet has also encoded sequence number.
+ * The returned packets are sorted and verified if they were handled
+ * by proper workers.
+ */
+static int
+sanity_mark_test(struct worker_params *wp, struct rte_mempool *p)
+{
+	const unsigned int buf_count = 24;
+	const unsigned int burst = 8;
+	const unsigned int shift = 12;
+	const unsigned int seq_shift = 10;
+
+	struct rte_distributor *db = wp->dist;
+	struct rte_mbuf *bufs[buf_count];
+	struct rte_mbuf *returns[buf_count];
+	unsigned int i, count, id;
+	unsigned int sorted[buf_count], seq;
+	unsigned int failed = 0;
+
+	printf("=== Marked packets test ===\n");
+	clear_packet_count();
+	if (rte_mempool_get_bulk(p, (void *)bufs, buf_count) != 0) {
+		printf("line %d: Error getting mbufs from pool\n", __LINE__);
+		return -1;
+	}
+
+	/* bufs' hashes will be like these below, but shifted left.
+	 * The shifting is for avoiding collisions with backlogs
+	 * and in-flight tags left by previous tests.
+	 * [1, 1, 1, 1, 1, 1, 1, 1
+	 *  1, 1, 1, 1, 2, 2, 2, 2
+	 *  2, 2, 2, 2, 1, 1, 1, 1]
+	 */
+	for (i = 0; i < burst; i++) {
+		bufs[0 * burst + i]->hash.usr = 1 << shift;
+		bufs[1 * burst + i]->hash.usr = ((i < burst / 2) ? 1 : 2)
+			<< shift;
+		bufs[2 * burst + i]->hash.usr = ((i < burst / 2) ? 2 : 1)
+			<< shift;
+	}
+	/* Assign a sequence number to each packet. The sequence is shifted,
+	 * so that the lower bits of udata64 will hold the mark from worker.
+	 */
+	for (i = 0; i < buf_count; i++)
+		bufs[i]->udata64 = i << seq_shift;
+
+	count = 0;
+	for (i = 0; i < buf_count/burst; i++) {
+		rte_distributor_process(db, &bufs[i * burst], burst);
+		count += rte_distributor_returned_pkts(db, &returns[count],
+			buf_count - count);
+	}
+
+	do {
+		rte_distributor_flush(db);
+		count += rte_distributor_returned_pkts(db, &returns[count],
+			buf_count - count);
+	} while (count < buf_count);
+
+	for (i = 0; i < rte_lcore_count() - 1; i++)
+		printf("Worker %u handled %u packets\n", i,
+			__atomic_load_n(&worker_stats[i].handled_packets,
+					__ATOMIC_RELAXED));
+
+	/* Sort returned packets by sent order (sequence numbers). */
+	for (i = 0; i < buf_count; i++) {
+		seq = returns[i]->udata64 >> seq_shift;
+		id = returns[i]->udata64 - (seq << seq_shift);
+		sorted[seq] = id;
+	}
+
+	/* Verify that packets [0-11] and [20-23] were processed
+	 * by the same worker
+	 */
+	for (i = 1; i < 12; i++) {
+		if (sorted[i] != sorted[0]) {
+			printf("Packet number %u processed by worker %u,"
+				" but should be processed by worker %u\n",
+				i, sorted[i], sorted[0]);
+			failed = 1;
+		}
+	}
+	for (i = 20; i < 24; i++) {
+		if (sorted[i] != sorted[0]) {
+			printf("Packet number %u processed by worker %u,"
+				" but should be processed by worker %u\n",
+				i, sorted[i], sorted[0]);
+			failed = 1;
+		}
+	}
+	/* And verify that packets [12-19] were processed
+	 * by another worker
+	 */
+	for (i = 13; i < 20; i++) {
+		if (sorted[i] != sorted[12]) {
+			printf("Packet number %u processed by worker %u,"
+				" but should be processed by worker %u\n",
+				i, sorted[i], sorted[12]);
+			failed = 1;
+		}
+	}
+
+	rte_mempool_put_bulk(p, (void *)bufs, buf_count);
+
+	if (failed)
+		return -1;
+
+	printf("Marked packets test passed\n");
+	return 0;
+}
+
 static
 int test_error_distributor_create_name(void)
 {
@@ -727,6 +862,12 @@ test_distributor(void)
 				goto err;
 			quit_workers(&worker_params, p);
 
+			rte_eal_mp_remote_launch(handle_and_mark_work,
+					&worker_params, SKIP_MASTER);
+			if (sanity_mark_test(&worker_params, p) < 0)
+				goto err;
+			quit_workers(&worker_params, p);
+
 		} else {
 			printf("Too few cores to run worker shutdown test\n");
 		}
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v8 14/17] distributor: fix flushing in flight packets
       [not found]                               ` <CGME20201017030721eucas1p2a1032e6c78d99f903ea539e49f057a83@eucas1p2.samsung.com>
@ 2020-10-17  3:06                                 ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-17  3:06 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

rte_distributor_flush() uses the total_outstanding()
function to decide whether it should keep waiting
for packets still being processed. However, in burst mode
only backlog packets were counted.

This patch fixes that issue by also counting in-flight
packets. There are also some fixes to properly keep the
count of in-flight packets for each worker in bufs[].count.
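
A hypothetical scenario showing why this matters: if eight packets
have been released to a worker's buffer but its backlog is empty, the
old total_outstanding() evaluated to 0 and rte_distributor_flush()
returned at once, although the worker still held eight unprocessed
packets. The corrected accounting is:

	for (wkr = 0; wkr < d->num_workers; wkr++)
		total_outstanding += d->backlog[wkr].count /* queued */
				+ d->bufs[wkr].count;      /* in flight */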

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_distributor/rte_distributor.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index 9fea3f69a..fb4e9d93f 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -465,6 +465,7 @@ rte_distributor_process(struct rte_distributor *d,
 			/* Sync with worker on GET_BUF flag. */
 			if (__atomic_load_n(&(d->bufs[wid].bufptr64[0]),
 				__ATOMIC_ACQUIRE) & RTE_DISTRIB_GET_BUF) {
+				d->bufs[wid].count = 0;
 				release(d, wid);
 				handle_returns(d, wid);
 			}
@@ -479,11 +480,6 @@ rte_distributor_process(struct rte_distributor *d,
 		uint16_t matches[RTE_DIST_BURST_SIZE];
 		unsigned int pkts;
 
-		/* Sync with worker on GET_BUF flag. */
-		if (__atomic_load_n(&(d->bufs[wkr].bufptr64[0]),
-			__ATOMIC_ACQUIRE) & RTE_DISTRIB_GET_BUF)
-			d->bufs[wkr].count = 0;
-
 		if ((num_mbufs - next_idx) < RTE_DIST_BURST_SIZE)
 			pkts = num_mbufs - next_idx;
 		else
@@ -603,8 +599,10 @@ rte_distributor_process(struct rte_distributor *d,
 	for (wid = 0 ; wid < d->num_workers; wid++)
 		/* Sync with worker on GET_BUF flag. */
 		if ((__atomic_load_n(&(d->bufs[wid].bufptr64[0]),
-			__ATOMIC_ACQUIRE) & RTE_DISTRIB_GET_BUF))
+			__ATOMIC_ACQUIRE) & RTE_DISTRIB_GET_BUF)) {
+			d->bufs[wid].count = 0;
 			release(d, wid);
+		}
 
 	return num_mbufs;
 }
@@ -647,7 +645,7 @@ total_outstanding(const struct rte_distributor *d)
 	unsigned int wkr, total_outstanding = 0;
 
 	for (wkr = 0; wkr < d->num_workers; wkr++)
-		total_outstanding += d->backlog[wkr].count;
+		total_outstanding += d->backlog[wkr].count + d->bufs[wkr].count;
 
 	return total_outstanding;
 }
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v8 15/17] distributor: fix clearing returns buffer
       [not found]                               ` <CGME20201017030721eucas1p1f3307c1e4e69c65186ad8f2fb18f5f74@eucas1p1.samsung.com>
@ 2020-10-17  3:06                                 ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-17  3:06 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

The patch clears the distributor's returns buffer
in clear_returns() by setting start and count to 0.

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 lib/librte_distributor/rte_distributor.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
index fb4e9d93f..ef34facba 100644
--- a/lib/librte_distributor/rte_distributor.c
+++ b/lib/librte_distributor/rte_distributor.c
@@ -702,6 +702,8 @@ rte_distributor_clear_returns(struct rte_distributor *d)
 		/* Sync with worker. Release retptrs. */
 		__atomic_store_n(&(d->bufs[wkr].retptr64[0]), 0,
 				__ATOMIC_RELEASE);
+
+	d->returns.start = d->returns.count = 0;
 }
 
 /* creates a distributor instance */
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v8 16/17] test/distributor: ensure all packets are delivered
       [not found]                               ` <CGME20201017030722eucas1p107dc8d3eb2d9ef620065deba31cf08ed@eucas1p1.samsung.com>
@ 2020-10-17  3:07                                 ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-17  3:07 UTC (permalink / raw)
  To: David Hunt; +Cc: dev, l.wojciechow, stable

In all distributor tests there is a chance that a test
will send packets to the distributor with rte_distributor_process()
before the workers have started and requested packets.

This patch ensures that all packets are delivered to workers
by calling rte_distributor_process() in a loop until the number
of successfully processed packets reaches the count required by the test.
The change is applied to the first such call in every test case.
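
The pattern used throughout is the sketch below; it relies on
rte_distributor_process() returning the number of packets it accepted,
which can be less than the number offered while no worker has
requested work yet:

	processed = 0;
	while (processed < BURST)
		processed += rte_distributor_process(d, &bufs[processed],
				BURST - processed);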

Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Acked-by: David Hunt <david.hunt@intel.com>
---
 app/test/test_distributor.c | 32 +++++++++++++++++++++++++++-----
 1 file changed, 27 insertions(+), 5 deletions(-)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index cfae5a1ac..a4af0a39c 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -103,6 +103,7 @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
 	struct rte_mbuf *returns[BURST*2];
 	unsigned int i, count;
 	unsigned int retries;
+	unsigned int processed;
 
 	printf("=== Basic distributor sanity tests ===\n");
 	clear_packet_count();
@@ -116,7 +117,11 @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
 	for (i = 0; i < BURST; i++)
 		bufs[i]->hash.usr = 0;
 
-	rte_distributor_process(db, bufs, BURST);
+	processed = 0;
+	while (processed < BURST)
+		processed += rte_distributor_process(db, &bufs[processed],
+			BURST - processed);
+
 	count = 0;
 	do {
 
@@ -304,6 +309,7 @@ sanity_test_with_mbuf_alloc(struct worker_params *wp, struct rte_mempool *p)
 	struct rte_distributor *d = wp->dist;
 	unsigned i;
 	struct rte_mbuf *bufs[BURST];
+	unsigned int processed;
 
 	printf("=== Sanity test with mbuf alloc/free (%s) ===\n", wp->name);
 
@@ -316,7 +322,10 @@ sanity_test_with_mbuf_alloc(struct worker_params *wp, struct rte_mempool *p)
 			bufs[j]->hash.usr = (i+j) << 1;
 		}
 
-		rte_distributor_process(d, bufs, BURST);
+		processed = 0;
+		while (processed < BURST)
+			processed += rte_distributor_process(d,
+				&bufs[processed], BURST - processed);
 	}
 
 	rte_distributor_flush(d);
@@ -410,6 +419,7 @@ sanity_test_with_worker_shutdown(struct worker_params *wp,
 	struct rte_mbuf *bufs2[BURST];
 	unsigned int i;
 	unsigned int failed = 0;
+	unsigned int processed = 0;
 
 	printf("=== Sanity test of worker shutdown ===\n");
 
@@ -427,7 +437,10 @@ sanity_test_with_worker_shutdown(struct worker_params *wp,
 	for (i = 0; i < BURST; i++)
 		bufs[i]->hash.usr = 1;
 
-	rte_distributor_process(d, bufs, BURST);
+	processed = 0;
+	while (processed < BURST)
+		processed += rte_distributor_process(d, &bufs[processed],
+			BURST - processed);
 	rte_distributor_flush(d);
 
 	/* at this point, we will have processed some packets and have a full
@@ -489,6 +502,7 @@ test_flush_with_worker_shutdown(struct worker_params *wp,
 	struct rte_mbuf *bufs[BURST];
 	unsigned int i;
 	unsigned int failed = 0;
+	unsigned int processed;
 
 	printf("=== Test flush fn with worker shutdown (%s) ===\n", wp->name);
 
@@ -503,7 +517,10 @@ test_flush_with_worker_shutdown(struct worker_params *wp,
 	for (i = 0; i < BURST; i++)
 		bufs[i]->hash.usr = 0;
 
-	rte_distributor_process(d, bufs, BURST);
+	processed = 0;
+	while (processed < BURST)
+		processed += rte_distributor_process(d, &bufs[processed],
+			BURST - processed);
 	/* at this point, we will have processed some packets and have a full
 	 * backlog for the other ones at worker 0.
 	 */
@@ -585,6 +602,7 @@ sanity_mark_test(struct worker_params *wp, struct rte_mempool *p)
 	unsigned int i, count, id;
 	unsigned int sorted[buf_count], seq;
 	unsigned int failed = 0;
+	unsigned int processed;
 
 	printf("=== Marked packets test ===\n");
 	clear_packet_count();
@@ -615,7 +633,11 @@ sanity_mark_test(struct worker_params *wp, struct rte_mempool *p)
 
 	count = 0;
 	for (i = 0; i < buf_count/burst; i++) {
-		rte_distributor_process(db, &bufs[i * burst], burst);
+		processed = 0;
+		while (processed < burst)
+			processed += rte_distributor_process(db,
+				&bufs[i * burst + processed],
+				burst - processed);
 		count += rte_distributor_returned_pkts(db, &returns[count],
 			buf_count - count);
 	}
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* [dpdk-dev] [PATCH v8 17/17] test/distributor: fix quitting workers
       [not found]                               ` <CGME20201017030723eucas1p16904cabfd94afa4fe751c072077e09ae@eucas1p1.samsung.com>
@ 2020-10-17  3:07                                 ` Lukasz Wojciechowski
  2020-10-17 21:15                                   ` Honnappa Nagarahalli
  0 siblings, 1 reply; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-17  3:07 UTC (permalink / raw)
  To: David Hunt, Bruce Richardson; +Cc: dev, l.wojciechow, stable

Sending a number of packets equal to the number of workers isn't
enough to stop all workers in the burst version of the distributor,
as more than one packet can be matched and consumed by a single
worker. This way some of the workers might never be woken up from
rte_distributor_get_pkt().

This patch fixes it by sending packets one by one. Each sent packet
causes exactly one worker to quit.
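
A hypothetical failure mode with the old single burst call, for
illustration:

	/* 3 workers, bufs[0..2] with distinct tags. If matching lets
	 * one worker consume two of them within the same burst, only
	 * two workers wake from rte_distributor_get_pkt() and observe
	 * quit == 1; the third parks forever. Feeding the packets one
	 * by one guarantees exactly one wake-up per packet:
	 */
	for (i = 0; i < num_workers; i++) {
		bufs[i]->hash.usr = i << 1;
		rte_distributor_process(d, &bufs[i], 1);
	}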

Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
Cc: david.hunt@intel.com
Cc: stable@dpdk.org

Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
---
 app/test/test_distributor.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c
index a4af0a39c..e0cb698e1 100644
--- a/app/test/test_distributor.c
+++ b/app/test/test_distributor.c
@@ -769,9 +769,10 @@ quit_workers(struct worker_params *wp, struct rte_mempool *p)
 
 	zero_quit = 0;
 	quit = 1;
-	for (i = 0; i < num_workers; i++)
+	for (i = 0; i < num_workers; i++) {
 		bufs[i]->hash.usr = i << 1;
-	rte_distributor_process(d, bufs, num_workers);
+		rte_distributor_process(d, &bufs[i], 1);
+	}
 
 	rte_distributor_process(d, NULL, 0);
 	rte_distributor_flush(d);
-- 
2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v7 01/16] distributor: fix missing handshake synchronization
  2020-10-15 23:47                               ` Honnappa Nagarahalli
@ 2020-10-17  3:13                                 ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-17  3:13 UTC (permalink / raw)
  To: Honnappa Nagarahalli, David Hunt, Bruce Richardson
  Cc: dev, stable, nd, "'Lukasz Wojciechowski'",

All suggested changes applied and published in v8.

W dniu 16.10.2020 o 01:47, Honnappa Nagarahalli pisze:
> <snip>
>
>> rte_distributor_return_pkt function which is run on worker cores must wait
>> for distributor core to clear handshake on retptr64 before using those
>> buffers. While the handshake is set distributor core controls buffers and any
>> operations on worker side might overwrite buffers which are unread yet.
>> Same situation appears in the legacy single distributor. Function
>> rte_distributor_return_pkt_single shouldn't modify the bufptr64 until
>> handshake on it is cleared by distributor lcore.
>>
>> Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
>> Cc: david.hunt@intel.com
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
>> Acked-by: David Hunt <david.hunt@intel.com>
>> ---
>>   lib/librte_distributor/rte_distributor.c        | 14 ++++++++++++++
>>   lib/librte_distributor/rte_distributor_single.c |  4 ++++
>>   2 files changed, 18 insertions(+)
>>
>> diff --git a/lib/librte_distributor/rte_distributor.c
>> b/lib/librte_distributor/rte_distributor.c
>> index 1c047f065..89493c331 100644
>> --- a/lib/librte_distributor/rte_distributor.c
>> +++ b/lib/librte_distributor/rte_distributor.c
>> @@ -160,6 +160,7 @@ rte_distributor_return_pkt(struct rte_distributor *d,
>> {
>>   	struct rte_distributor_buffer *buf = &d->bufs[worker_id];
>>   	unsigned int i;
>> +	volatile int64_t *retptr64;
> volatile is not needed here as use of __atomic_load_n implies volatile inherently.
retptr64 variable removed entirely
>>   	if (unlikely(d->alg_type == RTE_DIST_ALG_SINGLE)) {
>>   		if (num == 1)
>> @@ -169,6 +170,19 @@ rte_distributor_return_pkt(struct rte_distributor *d,
>>   			return -EINVAL;
>>   	}
>>
>> +	retptr64 = &(buf->retptr64[0]);
>> +	/* Spin while handshake bits are set (scheduler clears it).
>> +	 * Sync with worker on GET_BUF flag.
>> +	 */
>> +	while (unlikely(__atomic_load_n(retptr64, __ATOMIC_ACQUIRE)
> nit. we could avoid using the temp variable retptr64, you could use '&buf->retptr64[0]' directly.
> RELAXED memory order should be good as the thread_fence below will ensure that this load does not sink.
retptr64 variable removed and relaxed memory order used
>
> [1]
>> +			& RTE_DISTRIB_GET_BUF)) {
>> +		rte_pause();
>> +		uint64_t t = rte_rdtsc()+100;
>> +
>> +		while (rte_rdtsc() < t)
>> +			rte_pause();
>> +	}
>> +
>>   	/* Sync with distributor to acquire retptrs */
>>   	__atomic_thread_fence(__ATOMIC_ACQUIRE);
>>   	for (i = 0; i < RTE_DIST_BURST_SIZE; i++) diff --git
>> a/lib/librte_distributor/rte_distributor_single.c
>> b/lib/librte_distributor/rte_distributor_single.c
>> index abaf7730c..f4725b1d0 100644
>> --- a/lib/librte_distributor/rte_distributor_single.c
>> +++ b/lib/librte_distributor/rte_distributor_single.c
>> @@ -74,6 +74,10 @@ rte_distributor_return_pkt_single(struct
>> rte_distributor_single *d,
>>   	union rte_distributor_buffer_single *buf = &d->bufs[worker_id];
>>   	uint64_t req = (((int64_t)(uintptr_t)oldpkt) <<
>> RTE_DISTRIB_FLAG_BITS)
>>   			| RTE_DISTRIB_RETURN_BUF;
>> +	while (unlikely(__atomic_load_n(&buf->bufptr64,
>> __ATOMIC_RELAXED)
>> +			& RTE_DISTRIB_FLAGS_MASK))
>> +		rte_pause();
>> +
>>   	/* Sync with distributor on RETURN_BUF flag. */
>>   	__atomic_store_n(&(buf->bufptr64), req, __ATOMIC_RELEASE);
>>   	return 0;
>> --
>> 2.17.1

-- 
Lukasz Wojciechowski
Principal Software Engineer

Samsung R&D Institute Poland
Samsung Electronics
Office +48 22 377 88 25
l.wojciechow@partner.samsung.com


^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v7 06/16] test/distributor: synchronize lcores statistics
  2020-10-16  5:13                               ` Honnappa Nagarahalli
@ 2020-10-17  3:23                                 ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-17  3:23 UTC (permalink / raw)
  To: Honnappa Nagarahalli, David Hunt, Bruce Richardson
  Cc: dev, stable, nd, "'Lukasz Wojciechowski'",

Hi Honnappa,

W dniu 16.10.2020 o 07:13, Honnappa Nagarahalli pisze:
> Hi Lukasz,
> 	I see that in commit 8/16, the same code is changed again (updating the counters using the RELAXED memory order). It is better to pull the statistics changes from 8/16 into this commit.

I reordered patches: "synchronize lcores statistics" and "fix freeing 
mbufs" to avoid changing same code.

Many thanks for the review

Lukasz

>
> Thanks,
> Honnappa
>
>> -----Original Message-----
>> From: dev <dev-bounces@dpdk.org> On Behalf Of Lukasz Wojciechowski
>> Sent: Saturday, October 10, 2020 11:05 AM
>> To: David Hunt <david.hunt@intel.com>; Bruce Richardson
>> <bruce.richardson@intel.com>
>> Cc: dev@dpdk.org; l.wojciechow@partner.samsung.com; stable@dpdk.org
>> Subject: [dpdk-dev] [PATCH v7 06/16] test/distributor: synchronize lcores
>> statistics
>>
>> Statistics of handled packets are cleared and read on main lcore, while they
>> are increased in workers handlers on different lcores.
>>
>> Without synchronization occasionally showed invalid values.
>> This patch uses atomic acquire/release mechanisms to synchronize.
>>
>> Fixes: c3eabff124e6 ("distributor: add unit tests")
>> Cc: bruce.richardson@intel.com
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
>> Acked-by: David Hunt <david.hunt@intel.com>
>> ---
>>   app/test/test_distributor.c | 43 +++++++++++++++++++++++++------------
>>   1 file changed, 29 insertions(+), 14 deletions(-)
>>
>> diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c index
>> 6cd7a2edd..838459392 100644
>> --- a/app/test/test_distributor.c
>> +++ b/app/test/test_distributor.c
>> @@ -43,7 +43,8 @@ total_packet_count(void)  {
>>   	unsigned i, count = 0;
>>   	for (i = 0; i < worker_idx; i++)
>> -		count += worker_stats[i].handled_packets;
>> +		count +=
>> __atomic_load_n(&worker_stats[i].handled_packets,
>> +				__ATOMIC_ACQUIRE);
> For ex: this line is changed in commit 8/16 as well. It is better to pull the changes from 8/16 to this commit.
>
>>   	return count;
>>   }
>>
>> @@ -51,7 +52,10 @@ total_packet_count(void)  static inline void
>>   clear_packet_count(void)
>>   {
>> -	memset(&worker_stats, 0, sizeof(worker_stats));
>> +	unsigned int i;
>> +	for (i = 0; i < RTE_MAX_LCORE; i++)
>> +		__atomic_store_n(&worker_stats[i].handled_packets, 0,
>> +			__ATOMIC_RELEASE);
>>   }
>>
>>   /* this is the basic worker function for sanity test @@ -69,13 +73,13 @@
>> handle_work(void *arg)
>>   	num = rte_distributor_get_pkt(db, id, buf, NULL, 0);
>>   	while (!quit) {
>>   		__atomic_fetch_add(&worker_stats[id].handled_packets,
>> num,
>> -				__ATOMIC_RELAXED);
>> +				__ATOMIC_ACQ_REL);
>>   		count += num;
>>   		num = rte_distributor_get_pkt(db, id,
>>   				buf, buf, num);
>>   	}
>>   	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
>> -			__ATOMIC_RELAXED);
>> +			__ATOMIC_ACQ_REL);
>>   	count += num;
>>   	rte_distributor_return_pkt(db, id, buf, num);
>>   	return 0;
>> @@ -131,7 +135,8 @@ sanity_test(struct worker_params *wp, struct
>> rte_mempool *p)
>>
>>   	for (i = 0; i < rte_lcore_count() - 1; i++)
>>   		printf("Worker %u handled %u packets\n", i,
>> -				worker_stats[i].handled_packets);
>> +			__atomic_load_n(&worker_stats[i].handled_packets,
>> +					__ATOMIC_ACQUIRE));
>>   	printf("Sanity test with all zero hashes done.\n");
>>
>>   	/* pick two flows and check they go correctly */ @@ -156,7 +161,9
>> @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
>>
>>   		for (i = 0; i < rte_lcore_count() - 1; i++)
>>   			printf("Worker %u handled %u packets\n", i,
>> -					worker_stats[i].handled_packets);
>> +				__atomic_load_n(
>> +					&worker_stats[i].handled_packets,
>> +					__ATOMIC_ACQUIRE));
>>   		printf("Sanity test with two hash values done\n");
>>   	}
>>
>> @@ -182,7 +189,8 @@ sanity_test(struct worker_params *wp, struct
>> rte_mempool *p)
>>
>>   	for (i = 0; i < rte_lcore_count() - 1; i++)
>>   		printf("Worker %u handled %u packets\n", i,
>> -				worker_stats[i].handled_packets);
>> +			__atomic_load_n(&worker_stats[i].handled_packets,
>> +					__ATOMIC_ACQUIRE));
>>   	printf("Sanity test with non-zero hashes done\n");
>>
>>   	rte_mempool_put_bulk(p, (void *)bufs, BURST); @@ -275,14
>> +283,16 @@ handle_work_with_free_mbufs(void *arg)
>>
>>   	num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
>>   	while (!quit) {
>> -		worker_stats[id].handled_packets += num;
>>   		count += num;
>> +		__atomic_fetch_add(&worker_stats[id].handled_packets,
>> num,
>> +				__ATOMIC_ACQ_REL);
>>   		for (i = 0; i < num; i++)
>>   			rte_pktmbuf_free(buf[i]);
>>   		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
>>   	}
>> -	worker_stats[id].handled_packets += num;
>>   	count += num;
>> +	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
>> +			__ATOMIC_ACQ_REL);
>>   	rte_distributor_return_pkt(d, id, buf, num);
>>   	return 0;
>>   }
>> @@ -358,8 +368,9 @@ handle_work_for_shutdown_test(void *arg)
>>   	/* wait for quit single globally, or for worker zero, wait
>>   	 * for zero_quit */
>>   	while (!quit && !(id == zero_id && zero_quit)) {
>> -		worker_stats[id].handled_packets += num;
>>   		count += num;
>> +		__atomic_fetch_add(&worker_stats[id].handled_packets,
>> num,
>> +				__ATOMIC_ACQ_REL);
>>   		for (i = 0; i < num; i++)
>>   			rte_pktmbuf_free(buf[i]);
>>   		num = rte_distributor_get_pkt(d, id, buf, NULL, 0); @@ -
>> 373,10 +384,11 @@ handle_work_for_shutdown_test(void *arg)
>>
>>   		total += num;
>>   	}
>> -	worker_stats[id].handled_packets += num;
>>   	count += num;
>>   	returned = rte_distributor_return_pkt(d, id, buf, num);
>>
>> +	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
>> +			__ATOMIC_ACQ_REL);
>>   	if (id == zero_id) {
>>   		/* for worker zero, allow it to restart to pick up last packet
>>   		 * when all workers are shutting down.
>> @@ -387,7 +399,8 @@ handle_work_for_shutdown_test(void *arg)
>>   		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
>>
>>   		while (!quit) {
>> -			worker_stats[id].handled_packets += num;
>> +
>> 	__atomic_fetch_add(&worker_stats[id].handled_packets,
>> +					num, __ATOMIC_ACQ_REL);
>>   			count += num;
>>   			rte_pktmbuf_free(pkt);
>>   			num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
>> @@ -454,7 +467,8 @@ sanity_test_with_worker_shutdown(struct
>> worker_params *wp,
>>
>>   	for (i = 0; i < rte_lcore_count() - 1; i++)
>>   		printf("Worker %u handled %u packets\n", i,
>> -				worker_stats[i].handled_packets);
>> +			__atomic_load_n(&worker_stats[i].handled_packets,
>> +					__ATOMIC_ACQUIRE));
>>
>>   	if (total_packet_count() != BURST * 2) {
>>   		printf("Line %d: Error, not all packets flushed. "
>> @@ -507,7 +521,8 @@ test_flush_with_worker_shutdown(struct
>> worker_params *wp,
>>   	zero_quit = 0;
>>   	for (i = 0; i < rte_lcore_count() - 1; i++)
>>   		printf("Worker %u handled %u packets\n", i,
>> -				worker_stats[i].handled_packets);
>> +			__atomic_load_n(&worker_stats[i].handled_packets,
>> +					__ATOMIC_ACQUIRE));
>>
>>   	if (total_packet_count() != BURST) {
>>   		printf("Line %d: Error, not all packets flushed. "
>> --
>> 2.17.1

-- 
Lukasz Wojciechowski
Principal Software Engineer

Samsung R&D Institute Poland
Samsung Electronics
Office +48 22 377 88 25
l.wojciechow@partner.samsung.com


^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v7 08/16] test/distributor: fix freeing mbufs
  2020-10-16  5:12                               ` Honnappa Nagarahalli
@ 2020-10-17  3:28                                 ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-17  3:28 UTC (permalink / raw)
  To: Honnappa Nagarahalli, David Hunt, Bruce Richardson
  Cc: dev, stable, nd, "'Lukasz Wojciechowski'",

Hi Honnappa,

W dniu 16.10.2020 o 07:12, Honnappa Nagarahalli pisze:
> <snip>
>
>> Sanity tests with mbuf alloc and shutdown tests assume that mbufs passed
>> to worker cores are freed in handlers.
>> Such packets should not be returned to the distributor's main core. The only
>> packets that should be returned are the packets send after completion of
>> the tests in quit_workers function.
>>
>> This patch stops returning mbufs to distributor's core.
>> In case of shutdown tests it is impossible to determine how worker and
>> distributor threads would synchronize.
>> Packets used by tests should be freed and packets used during
>> quit_workers() shouldn't. That's why returning mbufs to mempool is moved
>> to test procedure run on distributor thread from worker threads.
>>
>> Additionally this patch cleans up unused variables.
>>
>> Fixes: c0de0eb82e40 ("distributor: switch over to new API")
>> Cc: david.hunt@intel.com
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
>> Acked-by: David Hunt <david.hunt@intel.com>
>> ---
>>   app/test/test_distributor.c | 96 ++++++++++++++++++-------------------
>>   1 file changed, 47 insertions(+), 49 deletions(-)
>>
>> diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c index
>> 838459392..06e01ff9d 100644
>> --- a/app/test/test_distributor.c
>> +++ b/app/test/test_distributor.c
>> @@ -44,7 +44,7 @@ total_packet_count(void)
>>   	unsigned i, count = 0;
>>   	for (i = 0; i < worker_idx; i++)
>>   		count +=
>> __atomic_load_n(&worker_stats[i].handled_packets,
>> -				__ATOMIC_ACQUIRE);
>> +				__ATOMIC_RELAXED);
> I think it is better to make this and other statistics changes below in commit 6/16. It will be in line with the commit log as well.
I changed the order of patches to avoid duplicated changes in the code.
>
>>   	return count;
>>   }
>>
>> @@ -55,7 +55,7 @@ clear_packet_count(void)
>>   	unsigned int i;
>>   	for (i = 0; i < RTE_MAX_LCORE; i++)
>>   		__atomic_store_n(&worker_stats[i].handled_packets, 0,
>> -			__ATOMIC_RELEASE);
>> +			__ATOMIC_RELAXED);
>>   }
>>
>>   /* this is the basic worker function for sanity test @@ -67,20 +67,18 @@
>> handle_work(void *arg)
>>   	struct rte_mbuf *buf[8] __rte_cache_aligned;
>>   	struct worker_params *wp = arg;
>>   	struct rte_distributor *db = wp->dist;
>> -	unsigned int count = 0, num;
>> +	unsigned int num;
>>   	unsigned int id = __atomic_fetch_add(&worker_idx, 1,
>> __ATOMIC_RELAXED);
>>
>>   	num = rte_distributor_get_pkt(db, id, buf, NULL, 0);
>>   	while (!quit) {
>>   		__atomic_fetch_add(&worker_stats[id].handled_packets,
>> num,
>> -				__ATOMIC_ACQ_REL);
>> -		count += num;
>> +				__ATOMIC_RELAXED);
>>   		num = rte_distributor_get_pkt(db, id,
>>   				buf, buf, num);
>>   	}
>>   	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
>> -			__ATOMIC_ACQ_REL);
>> -	count += num;
>> +			__ATOMIC_RELAXED);
>>   	rte_distributor_return_pkt(db, id, buf, num);
>>   	return 0;
>>   }
>> @@ -136,7 +134,7 @@ sanity_test(struct worker_params *wp, struct
>> rte_mempool *p)
>>   	for (i = 0; i < rte_lcore_count() - 1; i++)
>>   		printf("Worker %u handled %u packets\n", i,
>>   			__atomic_load_n(&worker_stats[i].handled_packets,
>> -					__ATOMIC_ACQUIRE));
>> +					__ATOMIC_RELAXED));
>>   	printf("Sanity test with all zero hashes done.\n");
>>
>>   	/* pick two flows and check they go correctly */ @@ -163,7 +161,7
>> @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
>>   			printf("Worker %u handled %u packets\n", i,
>>   				__atomic_load_n(
>>   					&worker_stats[i].handled_packets,
>> -					__ATOMIC_ACQUIRE));
>> +					__ATOMIC_RELAXED));
>>   		printf("Sanity test with two hash values done\n");
>>   	}
>>
>> @@ -190,7 +188,7 @@ sanity_test(struct worker_params *wp, struct
>> rte_mempool *p)
>>   	for (i = 0; i < rte_lcore_count() - 1; i++)
>>   		printf("Worker %u handled %u packets\n", i,
>>   			__atomic_load_n(&worker_stats[i].handled_packets,
>> -					__ATOMIC_ACQUIRE));
>> +					__ATOMIC_RELAXED));
>>   	printf("Sanity test with non-zero hashes done\n");
>>
>>   	rte_mempool_put_bulk(p, (void *)bufs, BURST); @@ -276,23
>> +274,20 @@ handle_work_with_free_mbufs(void *arg)
>>   	struct rte_mbuf *buf[8] __rte_cache_aligned;
>>   	struct worker_params *wp = arg;
>>   	struct rte_distributor *d = wp->dist;
>> -	unsigned int count = 0;
>>   	unsigned int i;
>>   	unsigned int num;
>>   	unsigned int id = __atomic_fetch_add(&worker_idx, 1,
>> __ATOMIC_RELAXED);
>>
>>   	num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
>>   	while (!quit) {
>> -		count += num;
>>   		__atomic_fetch_add(&worker_stats[id].handled_packets,
>> num,
>> -				__ATOMIC_ACQ_REL);
>> +				__ATOMIC_RELAXED);
>>   		for (i = 0; i < num; i++)
>>   			rte_pktmbuf_free(buf[i]);
>>   		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
>>   	}
>> -	count += num;
>>   	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
>> -			__ATOMIC_ACQ_REL);
>> +			__ATOMIC_RELAXED);
>>   	rte_distributor_return_pkt(d, id, buf, num);
>>   	return 0;
>>   }
>> @@ -318,7 +313,6 @@ sanity_test_with_mbuf_alloc(struct worker_params
>> *wp, struct rte_mempool *p)
>>   			rte_distributor_process(d, NULL, 0);
>>   		for (j = 0; j < BURST; j++) {
>>   			bufs[j]->hash.usr = (i+j) << 1;
>> -			rte_mbuf_refcnt_set(bufs[j], 1);
>>   		}
>>
>>   		rte_distributor_process(d, bufs, BURST); @@ -342,15 +336,10
>> @@ sanity_test_with_mbuf_alloc(struct worker_params *wp, struct
>> rte_mempool *p)  static int  handle_work_for_shutdown_test(void *arg)  {
>> -	struct rte_mbuf *pkt = NULL;
>>   	struct rte_mbuf *buf[8] __rte_cache_aligned;
>>   	struct worker_params *wp = arg;
>>   	struct rte_distributor *d = wp->dist;
>> -	unsigned int count = 0;
>>   	unsigned int num;
>> -	unsigned int total = 0;
>> -	unsigned int i;
>> -	unsigned int returned = 0;
>>   	unsigned int zero_id = 0;
>>   	unsigned int zero_unset;
>>   	const unsigned int id = __atomic_fetch_add(&worker_idx, 1, @@ -
>> 368,11 +357,8 @@ handle_work_for_shutdown_test(void *arg)
>>   	/* wait for quit single globally, or for worker zero, wait
>>   	 * for zero_quit */
>>   	while (!quit && !(id == zero_id && zero_quit)) {
>> -		count += num;
>>   		__atomic_fetch_add(&worker_stats[id].handled_packets,
>> num,
>> -				__ATOMIC_ACQ_REL);
>> -		for (i = 0; i < num; i++)
>> -			rte_pktmbuf_free(buf[i]);
>> +				__ATOMIC_RELAXED);
>>   		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
>>
>>   		if (num > 0) {
>> @@ -381,15 +367,12 @@ handle_work_for_shutdown_test(void *arg)
>>   				false, __ATOMIC_ACQ_REL,
>> __ATOMIC_ACQUIRE);
>>   		}
>>   		zero_id = __atomic_load_n(&zero_idx,
>> __ATOMIC_ACQUIRE);
>> -
>> -		total += num;
>>   	}
>> -	count += num;
>> -	returned = rte_distributor_return_pkt(d, id, buf, num);
>> -
>>   	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
>> -			__ATOMIC_ACQ_REL);
>> +			__ATOMIC_RELAXED);
>>   	if (id == zero_id) {
>> +		rte_distributor_return_pkt(d, id, NULL, 0);
>> +
>>   		/* for worker zero, allow it to restart to pick up last packet
>>   		 * when all workers are shutting down.
>>   		 */
>> @@ -400,15 +383,11 @@ handle_work_for_shutdown_test(void *arg)
>>
>>   		while (!quit) {
>>
>> 	__atomic_fetch_add(&worker_stats[id].handled_packets,
>> -					num, __ATOMIC_ACQ_REL);
>> -			count += num;
>> -			rte_pktmbuf_free(pkt);
>> +					num, __ATOMIC_RELAXED);
>>   			num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
>>   		}
>> -		returned = rte_distributor_return_pkt(d,
>> -				id, buf, num);
>> -		printf("Num returned = %d\n", returned);
>>   	}
>> +	rte_distributor_return_pkt(d, id, buf, num);
>>   	return 0;
>>   }
>>
>> @@ -424,7 +403,9 @@ sanity_test_with_worker_shutdown(struct
>> worker_params *wp,  {
>>   	struct rte_distributor *d = wp->dist;
>>   	struct rte_mbuf *bufs[BURST];
>> -	unsigned i;
>> +	struct rte_mbuf *bufs2[BURST];
>> +	unsigned int i;
>> +	unsigned int failed = 0;
>>
>>   	printf("=== Sanity test of worker shutdown ===\n");
>>
>> @@ -450,16 +431,17 @@ sanity_test_with_worker_shutdown(struct
>> worker_params *wp,
>>   	 */
>>
>>   	/* get more buffers to queue up, again setting them to the same
>> flow */
>> -	if (rte_mempool_get_bulk(p, (void *)bufs, BURST) != 0) {
>> +	if (rte_mempool_get_bulk(p, (void *)bufs2, BURST) != 0) {
>>   		printf("line %d: Error getting mbufs from pool\n", __LINE__);
>> +		rte_mempool_put_bulk(p, (void *)bufs, BURST);
>>   		return -1;
>>   	}
>>   	for (i = 0; i < BURST; i++)
>> -		bufs[i]->hash.usr = 1;
>> +		bufs2[i]->hash.usr = 1;
>>
>>   	/* get worker zero to quit */
>>   	zero_quit = 1;
>> -	rte_distributor_process(d, bufs, BURST);
>> +	rte_distributor_process(d, bufs2, BURST);
>>
>>   	/* flush the distributor */
>>   	rte_distributor_flush(d);
>> @@ -468,15 +450,21 @@ sanity_test_with_worker_shutdown(struct
>> worker_params *wp,
>>   	for (i = 0; i < rte_lcore_count() - 1; i++)
>>   		printf("Worker %u handled %u packets\n", i,
>>   			__atomic_load_n(&worker_stats[i].handled_packets,
>> -					__ATOMIC_ACQUIRE));
>> +					__ATOMIC_RELAXED));
>>
>>   	if (total_packet_count() != BURST * 2) {
>>   		printf("Line %d: Error, not all packets flushed. "
>>   				"Expected %u, got %u\n",
>>   				__LINE__, BURST * 2, total_packet_count());
>> -		return -1;
>> +		failed = 1;
>>   	}
>>
>> +	rte_mempool_put_bulk(p, (void *)bufs, BURST);
>> +	rte_mempool_put_bulk(p, (void *)bufs2, BURST);
>> +
>> +	if (failed)
>> +		return -1;
>> +
>>   	printf("Sanity test with worker shutdown passed\n\n");
>>   	return 0;
>>   }
>> @@ -490,7 +478,8 @@ test_flush_with_worker_shutdown(struct
>> worker_params *wp,  {
>>   	struct rte_distributor *d = wp->dist;
>>   	struct rte_mbuf *bufs[BURST];
>> -	unsigned i;
>> +	unsigned int i;
>> +	unsigned int failed = 0;
>>
>>   	printf("=== Test flush fn with worker shutdown (%s) ===\n", wp-
>>> name);
>> @@ -522,15 +511,20 @@ test_flush_with_worker_shutdown(struct
>> worker_params *wp,
>>   	for (i = 0; i < rte_lcore_count() - 1; i++)
>>   		printf("Worker %u handled %u packets\n", i,
>>   			__atomic_load_n(&worker_stats[i].handled_packets,
>> -					__ATOMIC_ACQUIRE));
>> +					__ATOMIC_RELAXED));
>>
>>   	if (total_packet_count() != BURST) {
>>   		printf("Line %d: Error, not all packets flushed. "
>>   				"Expected %u, got %u\n",
>>   				__LINE__, BURST, total_packet_count());
>> -		return -1;
>> +		failed = 1;
>>   	}
>>
>> +	rte_mempool_put_bulk(p, (void *)bufs, BURST);
>> +
>> +	if (failed)
>> +		return -1;
>> +
>>   	printf("Flush test with worker shutdown passed\n\n");
>>   	return 0;
>>   }
>> @@ -596,7 +590,10 @@ quit_workers(struct worker_params *wp, struct
>> rte_mempool *p)
>>   	const unsigned num_workers = rte_lcore_count() - 1;
>>   	unsigned i;
>>   	struct rte_mbuf *bufs[RTE_MAX_LCORE];
>> -	rte_mempool_get_bulk(p, (void *)bufs, num_workers);
>> +	if (rte_mempool_get_bulk(p, (void *)bufs, num_workers) != 0) {
>> +		printf("line %d: Error getting mbufs from pool\n", __LINE__);
>> +		return;
>> +	}
>>
>>   	zero_quit = 0;
>>   	quit = 1;
>> @@ -604,11 +601,12 @@ quit_workers(struct worker_params *wp, struct
>> rte_mempool *p)
>>   		bufs[i]->hash.usr = i << 1;
>>   	rte_distributor_process(d, bufs, num_workers);
>>
>> -	rte_mempool_put_bulk(p, (void *)bufs, num_workers);
>> -
>>   	rte_distributor_process(d, NULL, 0);
>>   	rte_distributor_flush(d);
>>   	rte_eal_mp_wait_lcore();
>> +
>> +	rte_mempool_put_bulk(p, (void *)bufs, num_workers);
>> +
>>   	quit = 0;
>>   	worker_idx = 0;
>>   	zero_idx = RTE_MAX_LCORE;
>> --
>> 2.17.1

-- 
Lukasz Wojciechowski
Principal Software Engineer

Samsung R&D Institute Poland
Samsung Electronics
Office +48 22 377 88 25
l.wojciechow@partner.samsung.com
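
The pattern these hunks keep applying (record the failure, release every pool
buffer on all paths, and only then return) looks like this in isolation. A
minimal sketch, with check_results() as a hypothetical stand-in for the tests'
packet-count verification:

#include <rte_mbuf.h>
#include <rte_mempool.h>

#define SKETCH_BURST 32

/* Leak-free error handling: every exit path puts the mbufs back into
 * the pool; failure is recorded first and reported only after cleanup.
 * check_results() is a hypothetical verification helper, not a DPDK API.
 */
static int
run_case(struct rte_mempool *p, int (*check_results)(void))
{
	struct rte_mbuf *bufs[SKETCH_BURST];
	unsigned int failed = 0;

	if (rte_mempool_get_bulk(p, (void *)bufs, SKETCH_BURST) != 0)
		return -1; /* nothing acquired, so nothing to free */

	/* ... run the scenario with bufs ... */

	if (check_results() != 0)
		failed = 1; /* remember the failure, but fall through */

	rte_mempool_put_bulk(p, (void *)bufs, SKETCH_BURST);

	return failed ? -1 : 0;
}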


^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v7 09/16] test/distributor: collect return mbufs
  2020-10-16  5:13                               ` Honnappa Nagarahalli
@ 2020-10-17  3:29                                 ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-17  3:29 UTC (permalink / raw)
  To: Honnappa Nagarahalli, David Hunt, Bruce Richardson
  Cc: dev, stable, nd, "'Lukasz Wojciechowski'",

Hi Honnappa,

On 16.10.2020 at 07:13, Honnappa Nagarahalli wrote:
> <snip>
>> During quit_workers function distributor's main core processes some packets
>> to wake up pending worker cores so they can quit.
>> As quit_workers acts also as a cleanup procedure for next test case it should
>> also collect these packages returned by workers'
> nit                              ^^^^^^^^ packets
Fixed in v8
>
>> handlers, so the cyclic buffer with returned packets in distributor remains
>> empty.
>>
>> Fixes: c3eabff124e6 ("distributor: add unit tests")
>> Cc: bruce.richardson@intel.com
>> Fixes: c0de0eb82e40 ("distributor: switch over to new API")
>> Cc: david.hunt@intel.com
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
>> Acked-by: David Hunt <david.hunt@intel.com>
>> ---
>>   app/test/test_distributor.c | 5 +++++
>>   1 file changed, 5 insertions(+)
>>
>> diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c index
>> 06e01ff9d..ed03040d1 100644
>> --- a/app/test/test_distributor.c
>> +++ b/app/test/test_distributor.c
>> @@ -590,6 +590,7 @@ quit_workers(struct worker_params *wp, struct
>> rte_mempool *p)
>>   	const unsigned num_workers = rte_lcore_count() - 1;
>>   	unsigned i;
>>   	struct rte_mbuf *bufs[RTE_MAX_LCORE];
>> +	struct rte_mbuf *returns[RTE_MAX_LCORE];
>>   	if (rte_mempool_get_bulk(p, (void *)bufs, num_workers) != 0) {
>>   		printf("line %d: Error getting mbufs from pool\n", __LINE__);
>>   		return;
>> @@ -605,6 +606,10 @@ quit_workers(struct worker_params *wp, struct
>> rte_mempool *p)
>>   	rte_distributor_flush(d);
>>   	rte_eal_mp_wait_lcore();
>>
>> +	while (rte_distributor_returned_pkts(d, returns, RTE_MAX_LCORE))
>> +		;
>> +
>> +	rte_distributor_clear_returns(d);
>>   	rte_mempool_put_bulk(p, (void *)bufs, num_workers);
>>
>>   	quit = 0;
>> --
>> 2.17.1

-- 
Lukasz Wojciechowski
Principal Software Engineer

Samsung R&D Institute Poland
Samsung Electronics
Office +48 22 377 88 25
l.wojciechow@partner.samsung.com


^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v4 2/8] test/distributor: synchronize lcores statistics
  2020-10-16 15:42                             ` Honnappa Nagarahalli
@ 2020-10-17  3:34                               ` Lukasz Wojciechowski
  0 siblings, 0 replies; 164+ messages in thread
From: Lukasz Wojciechowski @ 2020-10-17  3:34 UTC (permalink / raw)
  To: Honnappa Nagarahalli, David Hunt, Bruce Richardson
  Cc: dev, stable, nd, David Marchand,
	"'Lukasz Wojciechowski'",


On 16.10.2020 at 17:42, Honnappa Nagarahalli wrote:
> <snip>
>
>> On 16.10.2020 at 14:43, Lukasz Wojciechowski wrote:
>>> Hi Honnappa,
>>>
>>> Thank you for your answer.
>>> In the current v7 version I followed your advice and used the RELAXED
>>> memory model.
>>> And it works without any issues. I guess after fixing the other issues
>>> found since v4 the distributor works more stably.
>>> I didn't have time to rearrange all the tests in the way I proposed, but
>>> I guess if they work like this it's not a top priority.
> Agree, not a top priority.
>
>>> Can you give an ack on the series? I believe David Marchand is waiting
>>> for your opinion to process it.
>> I'm sorry I didn't see your other comments. I'll try to fix them today.
> No problem, I can review the next series quickly.
So it's already there, the v8,
but you have one more patch to look at :)
I was ready to submit the new version, but I ran some tests ... and one of the
executions hung, so I attached gdb and found one more issue.
>
>>> Best regards
>>> Lukasz
>>>
>>> On 16.10.2020 at 07:43, Honnappa Nagarahalli wrote:
>>>> <snip>
>>>>
>>>>> Hi Honnappa,
>>>>>
>>>>> Many thanks for the review!
>>>>>
>>>>> I'll write my answers here rather than inline, as I think it would be
>>>>> easier to read them in one place.
>>>>> So first of all I agree with you in 2 things:
>>>>> 1) all uses of statistics must be atomic and lack of that caused
>>>>> most of the problems
>>>>> 2) it would be better to replace barrier and memset in
>>>>> clear_packet_count() with atomic stores as you suggested
>>>>>
>>>>> So I will apply both of above.
>>>>>
>>>>> However, I wasn't fully convinced about changing acquire/release to
>>>>> relaxed.
>>>>> It would be perfectly OK if it looked like this Herb Sutter example:
>>>>> https://youtu.be/KeLBd2EJLOU?t=4170 But in his case the counters
>>>>> are cleared before the worker threads start and are printed out after
>>>>> they have completed.
>>>>>
>>>>> In the case of the DPDK distributor tests, both the worker and main
>>>>> cores are running at the same time. In the sanity_test, the statistics
>>>>> are cleared and verified a few times for different packet hashes. The
>>>>> worker cores are not stopped at this point and they continue their loops
>>>>> in the handler procedure.
>>>>> The verification made on the main core is itself an exchange of data, as
>>>>> the current statistics determine how the test will turn out.
>>>> Agree. The key point we have to note is that the data that is exchanged
>>>> between the two threads is already atomic (handled_packets is atomic).
>>>>> So, as I wasn't convinced, I ran some tests with both the relaxed
>>>>> and acquire/release modes, and they both fail :( The ratio of failures
>>>>> caused by statistics errors to the number of tests, over
>>>>> 200000 runs, was:
>>>>> for relaxed: 0.000790562
>>>>> for acq/rel: 0.000091321
>>>>>
>>>>>
>>>>> That's why I'm going to modify the tests in such a way that they:
>>>>> 1) clear statistics
>>>>> 2) launch worker threads
>>>>> 3) run test
>>>>> 4) wait for workers procedures to complete
>>>>> 5) check stats, verify results and print them out
>>>>>
>>>>> This way the main core will use (clear or verify) the stats only when
>>>>> no worker threads are running. This would make things simpler, allowing
>>>>> us to focus on testing the distributor rather than the tests. And of
>>>>> course relaxed mode would be enough! (See the sketch at the end of
>>>>> this message.)
>>>> Agree, this would be the only way to ensure that the main thread sees
>>>> the correct statistics (just like in the video)
>>>>
>>>>> Best regards
>>>>> Lukasz
>>>>>
>>>>>
>>>>> On 29.09.2020 at 07:49, Honnappa Nagarahalli wrote:
>>>>>> <snip>
>>>>>>
>>>>>>> Statistics of handled packets are cleared and read on the main lcore,
>>>>>>> while they are incremented in the workers' handlers on different lcores.
>>>>>>>
>>>>>>> Without synchronization, they occasionally showed invalid values.
>>>>>>> This patch uses atomic acquire/release mechanisms to synchronize.
>>>>>> In general, load-acquire and store-release memory orderings are
>>>>>> required while synchronizing data (that cannot be updated atomically)
>>>>>> between threads. In this situation, making the counters atomic is enough.
>>>>>>> Fixes: c3eabff124e6 ("distributor: add unit tests")
>>>>>>> Cc: bruce.richardson@intel.com
>>>>>>> Cc: stable@dpdk.org
>>>>>>>
>>>>>>> Signed-off-by: Lukasz Wojciechowski
>>>>>>> <l.wojciechow@partner.samsung.com>
>>>>>>> Acked-by: David Hunt <david.hunt@intel.com>
>>>>>>> ---
>>>>>>>      app/test/test_distributor.c | 39
>>>>>>> ++++++++++++++++++++++++-----------
>>>>> --
>>>>>>>      1 file changed, 26 insertions(+), 13 deletions(-)
>>>>>>>
>>>>>>> diff --git a/app/test/test_distributor.c
>>>>>>> b/app/test/test_distributor.c index
>>>>>>> 35b25463a..0e49e3714 100644
>>>>>>> --- a/app/test/test_distributor.c
>>>>>>> +++ b/app/test/test_distributor.c
>>>>>>> @@ -43,7 +43,8 @@ total_packet_count(void)  {
>>>>>>>      	unsigned i, count = 0;
>>>>>>>      	for (i = 0; i < worker_idx; i++)
>>>>>>> -		count += worker_stats[i].handled_packets;
>>>>>>> +		count +=
>>>>>>> __atomic_load_n(&worker_stats[i].handled_packets,
>>>>>>> +				__ATOMIC_ACQUIRE);
>>>>>> RELAXED memory order is sufficient. For example, the worker threads
>>>>>> are not 'releasing' any data that is not atomically updated to the
>>>>>> main thread.
>>>>>>>      	return count;
>>>>>>>      }
>>>>>>>
>>>>>>> @@ -52,6 +53,7 @@ static inline void
>>>>>>>      clear_packet_count(void)
>>>>>>>      {
>>>>>>>      	memset(&worker_stats, 0, sizeof(worker_stats));
>>>>>>> +	rte_atomic_thread_fence(__ATOMIC_RELEASE);
>>>>>> Ideally, the counters should be set to 0 atomically rather than
>>>>>> using a memset.
>>>>>>>      }
>>>>>>>
>>>>>>>      /* this is the basic worker function for sanity test @@ -72,13
>>>>>>> +74,13 @@ handle_work(void *arg)
>>>>>>>      	num = rte_distributor_get_pkt(db, id, buf, buf, num);
>>>>>>>      	while (!quit) {
>>>>>>>
>> 	__atomic_fetch_add(&worker_stats[id].handled_packets,
>>>>>>> num,
>>>>>>> -				__ATOMIC_RELAXED);
>>>>>>> +				__ATOMIC_ACQ_REL);
>>>>>> Using the __ATOMIC_ACQ_REL order does not mean anything to the main
>>>>>> thread. The main thread might still see the updates from different
>>>>>> threads in a different order.
>>>>>>>      		count += num;
>>>>>>>      		num = rte_distributor_get_pkt(db, id,
>>>>>>>      				buf, buf, num);
>>>>>>>      	}
>>>>>>>      	__atomic_fetch_add(&worker_stats[id].handled_packets,
>> num,
>>>>>>> -			__ATOMIC_RELAXED);
>>>>>>> +			__ATOMIC_ACQ_REL);
>>>>>> Same here, I do not see why this change is required.
>>>>>>
>>>>>>>      	count += num;
>>>>>>>      	rte_distributor_return_pkt(db, id, buf, num);
>>>>>>>      	return 0;
>>>>>>> @@ -134,7 +136,8 @@ sanity_test(struct worker_params *wp, struct
>>>>>>> rte_mempool *p)
>>>>>>>
>>>>>>>      	for (i = 0; i < rte_lcore_count() - 1; i++)
>>>>>>>      		printf("Worker %u handled %u packets\n", i,
>>>>>>> -				worker_stats[i].handled_packets);
>>>>>>> +
>> 	__atomic_load_n(&worker_stats[i].handled_packets,
>>>>>>> +					__ATOMIC_ACQUIRE));
>>>>>> __ATOMIC_RELAXED is enough.
>>>>>>
>>>>>>>      	printf("Sanity test with all zero hashes done.\n");
>>>>>>>
>>>>>>>      	/* pick two flows and check they go correctly */ @@ -159,7
>>>>>>> +162,9 @@ sanity_test(struct worker_params *wp, struct
>> rte_mempool
>>>>>>> *p)
>>>>>>>
>>>>>>>      		for (i = 0; i < rte_lcore_count() - 1; i++)
>>>>>>>      			printf("Worker %u handled %u packets\n", i,
>>>>>>> -
>> 	worker_stats[i].handled_packets);
>>>>>>> +				__atomic_load_n(
>>>>>>> +
>> 	&worker_stats[i].handled_packets,
>>>>>>> +					__ATOMIC_ACQUIRE));
>>>>>> __ATOMIC_RELAXED is enough
>>>>>>
>>>>>>>      		printf("Sanity test with two hash values done\n");
>>>>>>>      	}
>>>>>>>
>>>>>>> @@ -185,7 +190,8 @@ sanity_test(struct worker_params *wp, struct
>>>>>>> rte_mempool *p)
>>>>>>>
>>>>>>>      	for (i = 0; i < rte_lcore_count() - 1; i++)
>>>>>>>      		printf("Worker %u handled %u packets\n", i,
>>>>>>> -				worker_stats[i].handled_packets);
>>>>>>> +
>> 	__atomic_load_n(&worker_stats[i].handled_packets,
>>>>>>> +					__ATOMIC_ACQUIRE));
>>>>>> __ATOMIC_RELAXED is enough
>>>>>>
>>>>>>>      	printf("Sanity test with non-zero hashes done\n");
>>>>>>>
>>>>>>>      	rte_mempool_put_bulk(p, (void *)bufs, BURST); @@ -280,15
>>>>>>> +286,17 @@ handle_work_with_free_mbufs(void *arg)
>>>>>>>      		buf[i] = NULL;
>>>>>>>      	num = rte_distributor_get_pkt(d, id, buf, buf, num);
>>>>>>>      	while (!quit) {
>>>>>>> -		worker_stats[id].handled_packets += num;
>>>>>>>      		count += num;
>>>>>>> +
>> 	__atomic_fetch_add(&worker_stats[id].handled_packets,
>>>>>>> num,
>>>>>>> +				__ATOMIC_ACQ_REL);
>>>>>> IMO, the problem would be the non-atomic update of the statistics.
>>>>>> So, __ATOMIC_RELAXED is enough
>>>>>>
>>>>>>>      		for (i = 0; i < num; i++)
>>>>>>>      			rte_pktmbuf_free(buf[i]);
>>>>>>>      		num = rte_distributor_get_pkt(d,
>>>>>>>      				id, buf, buf, num);
>>>>>>>      	}
>>>>>>> -	worker_stats[id].handled_packets += num;
>>>>>>>      	count += num;
>>>>>>> +	__atomic_fetch_add(&worker_stats[id].handled_packets,
>> num,
>>>>>>> +			__ATOMIC_ACQ_REL);
>>>>>> Same here, the problem is the non-atomic update of the statistics;
>>>>>> __ATOMIC_RELAXED is enough.
>>>>>> Similarly, for the changes below, __ATOMIC_RELAXED is enough.
>>>>>>
>>>>>>>      	rte_distributor_return_pkt(d, id, buf, num);
>>>>>>>      	return 0;
>>>>>>>      }
>>>>>>> @@ -363,8 +371,9 @@ handle_work_for_shutdown_test(void *arg)
>>>>>>>      	/* wait for quit single globally, or for worker zero, wait
>>>>>>>      	 * for zero_quit */
>>>>>>>      	while (!quit && !(id == zero_id && zero_quit)) {
>>>>>>> -		worker_stats[id].handled_packets += num;
>>>>>>>      		count += num;
>>>>>>> +
>> 	__atomic_fetch_add(&worker_stats[id].handled_packets,
>>>>>>> num,
>>>>>>> +				__ATOMIC_ACQ_REL);
>>>>>>>      		for (i = 0; i < num; i++)
>>>>>>>      			rte_pktmbuf_free(buf[i]);
>>>>>>>      		num = rte_distributor_get_pkt(d, @@ -379,10
>> +388,11 @@
>>>>>>> handle_work_for_shutdown_test(void *arg)
>>>>>>>
>>>>>>>      		total += num;
>>>>>>>      	}
>>>>>>> -	worker_stats[id].handled_packets += num;
>>>>>>>      	count += num;
>>>>>>>      	returned = rte_distributor_return_pkt(d, id, buf, num);
>>>>>>>
>>>>>>> +	__atomic_fetch_add(&worker_stats[id].handled_packets,
>> num,
>>>>>>> +			__ATOMIC_ACQ_REL);
>>>>>>>      	if (id == zero_id) {
>>>>>>>      		/* for worker zero, allow it to restart to pick up last
>> packet
>>>>>>>      		 * when all workers are shutting down.
>>>>>>> @@ -394,10 +404,11 @@ handle_work_for_shutdown_test(void *arg)
>>>>>>>      				id, buf, buf, num);
>>>>>>>
>>>>>>>      		while (!quit) {
>>>>>>> -			worker_stats[id].handled_packets += num;
>>>>>>>      			count += num;
>>>>>>>      			rte_pktmbuf_free(pkt);
>>>>>>>      			num = rte_distributor_get_pkt(d, id, buf, buf,
>> num);
>>>>>>> +
>>>>>>> 	__atomic_fetch_add(&worker_stats[id].handled_packets,
>>>>>>> +					num, __ATOMIC_ACQ_REL);
>>>>>>>      		}
>>>>>>>      		returned = rte_distributor_return_pkt(d,
>>>>>>>      				id, buf, num);
>>>>>>> @@ -461,7 +472,8 @@ sanity_test_with_worker_shutdown(struct
>>>>>>> worker_params *wp,
>>>>>>>
>>>>>>>      	for (i = 0; i < rte_lcore_count() - 1; i++)
>>>>>>>      		printf("Worker %u handled %u packets\n", i,
>>>>>>> -				worker_stats[i].handled_packets);
>>>>>>> +
>> 	__atomic_load_n(&worker_stats[i].handled_packets,
>>>>>>> +					__ATOMIC_ACQUIRE));
>>>>>>>
>>>>>>>      	if (total_packet_count() != BURST * 2) {
>>>>>>>      		printf("Line %d: Error, not all packets flushed. "
>>>>>>> @@ -514,7 +526,8 @@ test_flush_with_worker_shutdown(struct
>>>>>>> worker_params *wp,
>>>>>>>      	zero_quit = 0;
>>>>>>>      	for (i = 0; i < rte_lcore_count() - 1; i++)
>>>>>>>      		printf("Worker %u handled %u packets\n", i,
>>>>>>> -				worker_stats[i].handled_packets);
>>>>>>> +
>> 	__atomic_load_n(&worker_stats[i].handled_packets,
>>>>>>> +					__ATOMIC_ACQUIRE));
>>>>>>>
>>>>>>>      	if (total_packet_count() != BURST) {
>>>>>>>      		printf("Line %d: Error, not all packets flushed. "
>>>>>>> --
>>>>>>> 2.17.1
>>>>> --
>>>>> Lukasz Wojciechowski
>>>>> Principal Software Engineer
>>>>>
>>>>> Samsung R&D Institute Poland
>>>>> Samsung Electronics
>>>>> Office +48 22 377 88 25
>>>>> l.wojciechow@partner.samsung.com
>> --
>> Lukasz Wojciechowski
>> Principal Software Engineer
>>
>> Samsung R&D Institute Poland
>> Samsung Electronics
>> Office +48 22 377 88 25
>> l.wojciechow@partner.samsung.com

-- 
Lukasz Wojciechowski
Principal Software Engineer

Samsung R&D Institute Poland
Samsung Electronics
Office +48 22 377 88 25
l.wojciechow@partner.samsung.com
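
The restructuring proposed above (clear, launch, run, join, verify) is what
lets relaxed ordering suffice: the thread join orders every worker write
before the main thread's reads. A minimal sketch of that shape with the same
GCC __atomic builtins, using plain pthreads in place of EAL lcores purely to
keep the sketch self-contained (a real test would use rte_eal_remote_launch()
and rte_eal_mp_wait_lcore()):

#include <pthread.h>
#include <stdio.h>

#define WORKERS 4
#define PACKETS_PER_WORKER 1000

static unsigned int handled[WORKERS];

/* Worker side: the only shared write is the relaxed atomic add. */
static void *
worker(void *arg)
{
	unsigned int id = *(unsigned int *)arg;
	unsigned int i;

	for (i = 0; i < PACKETS_PER_WORKER; i++)
		__atomic_fetch_add(&handled[id], 1, __ATOMIC_RELAXED);
	return NULL;
}

int
main(void)
{
	pthread_t t[WORKERS];
	unsigned int id[WORKERS], i, total = 0;

	/* 1) clear statistics - no worker is running yet */
	for (i = 0; i < WORKERS; i++)
		__atomic_store_n(&handled[i], 0, __ATOMIC_RELAXED);

	/* 2) launch worker threads, 3) run the test
	 * (pthreads stand in for lcores here)
	 */
	for (i = 0; i < WORKERS; i++) {
		id[i] = i;
		pthread_create(&t[i], NULL, worker, &id[i]);
	}

	/* 4) wait for worker procedures to complete; the join already
	 * orders every worker write before the reads below
	 */
	for (i = 0; i < WORKERS; i++)
		pthread_join(t[i], NULL);

	/* 5) only now read and verify - relaxed loads are enough */
	for (i = 0; i < WORKERS; i++)
		total += __atomic_load_n(&handled[i], __ATOMIC_RELAXED);

	printf("total %u (expected %u)\n", total,
			WORKERS * PACKETS_PER_WORKER);
	return total == WORKERS * PACKETS_PER_WORKER ? 0 : 1;
}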


^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v8 01/17] distributor: fix missing handshake synchronization
  2020-10-17  3:06                                 ` [dpdk-dev] [PATCH v8 01/17] distributor: fix missing handshake synchronization Lukasz Wojciechowski
@ 2020-10-17 21:05                                   ` Honnappa Nagarahalli
  0 siblings, 0 replies; 164+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-17 21:05 UTC (permalink / raw)
  To: Lukasz Wojciechowski, David Hunt, Bruce Richardson
  Cc: dev, stable, nd, Honnappa Nagarahalli, nd

<snip>
> 
> The rte_distributor_return_pkt function, which runs on worker cores, must
> wait for the distributor core to clear the handshake on retptr64 before
> using those buffers. While the handshake is set, the distributor core
> controls the buffers, and any operation on the worker side might overwrite
> buffers which have not been read yet.
> The same situation appears in the legacy single distributor. The
> rte_distributor_return_pkt_single function shouldn't modify bufptr64 until
> the handshake on it is cleared by the distributor lcore.
> 
> Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
> Cc: david.hunt@intel.com
> Cc: stable@dpdk.org
> 
> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
> Acked-by: David Hunt <david.hunt@intel.com>
Looks good.
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>

> ---
>  lib/librte_distributor/rte_distributor.c        | 12 ++++++++++++
>  lib/librte_distributor/rte_distributor_single.c |  4 ++++
>  2 files changed, 16 insertions(+)
> 
> diff --git a/lib/librte_distributor/rte_distributor.c
> b/lib/librte_distributor/rte_distributor.c
> index 1c047f065..c6b19a388 100644
> --- a/lib/librte_distributor/rte_distributor.c
> +++ b/lib/librte_distributor/rte_distributor.c
> @@ -169,6 +169,18 @@ rte_distributor_return_pkt(struct rte_distributor *d,
>  			return -EINVAL;
>  	}
> 
> +	/* Spin while handshake bits are set (scheduler clears it).
> +	 * Sync with worker on GET_BUF flag.
> +	 */
> +	while (unlikely(__atomic_load_n(&(buf->retptr64[0]),
> __ATOMIC_RELAXED)
> +			& RTE_DISTRIB_GET_BUF)) {
> +		rte_pause();
> +		uint64_t t = rte_rdtsc()+100;
> +
> +		while (rte_rdtsc() < t)
> +			rte_pause();
> +	}
> +
>  	/* Sync with distributor to acquire retptrs */
>  	__atomic_thread_fence(__ATOMIC_ACQUIRE);
>  	for (i = 0; i < RTE_DIST_BURST_SIZE; i++) diff --git
> a/lib/librte_distributor/rte_distributor_single.c
> b/lib/librte_distributor/rte_distributor_single.c
> index abaf7730c..f4725b1d0 100644
> --- a/lib/librte_distributor/rte_distributor_single.c
> +++ b/lib/librte_distributor/rte_distributor_single.c
> @@ -74,6 +74,10 @@ rte_distributor_return_pkt_single(struct
> rte_distributor_single *d,
>  	union rte_distributor_buffer_single *buf = &d->bufs[worker_id];
>  	uint64_t req = (((int64_t)(uintptr_t)oldpkt) <<
> RTE_DISTRIB_FLAG_BITS)
>  			| RTE_DISTRIB_RETURN_BUF;
> +	while (unlikely(__atomic_load_n(&buf->bufptr64,
> __ATOMIC_RELAXED)
> +			& RTE_DISTRIB_FLAGS_MASK))
> +		rte_pause();
> +
>  	/* Sync with distributor on RETURN_BUF flag. */
>  	__atomic_store_n(&(buf->bufptr64), req, __ATOMIC_RELEASE);
>  	return 0;
> --
> 2.17.1
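
The rule the fix enforces generalizes to any single-slot flag handshake: the
side that wants to write the slot must first observe the flag cleared by the
reader, otherwise it can overwrite an unread value. A minimal sketch of such
a mailbox, with hypothetical names (slot, HANDOFF_FLAG, hand_over, take)
rather than the library's retptr64/GET_BUF machinery:

#include <stdint.h>

#define HANDOFF_FLAG 1ULL

/* Hypothetical one-slot mailbox: ownership is encoded in the low bit,
 * the payload in the upper bits - the same idea as the handshake bits
 * in retptr64/bufptr64.
 */
static uint64_t slot;

/* Producer side: it may rewrite the slot only once the consumer has
 * cleared the flag; writing earlier would overwrite a value the
 * consumer has not read yet - the bug the patch fixes.
 */
static void
hand_over(uint64_t payload)
{
	while (__atomic_load_n(&slot, __ATOMIC_RELAXED) & HANDOFF_FLAG)
		; /* spin: consumer still owns the slot */

	/* Acquire pairs with the consumer's release when clearing. */
	__atomic_thread_fence(__ATOMIC_ACQUIRE);

	/* Release publishes the payload together with the flag. */
	__atomic_store_n(&slot, (payload << 1) | HANDOFF_FLAG,
			__ATOMIC_RELEASE);
}

/* Consumer side. */
static uint64_t
take(void)
{
	uint64_t v;

	while (!((v = __atomic_load_n(&slot, __ATOMIC_ACQUIRE))
			& HANDOFF_FLAG))
		; /* spin: nothing handed over yet */

	/* Clearing the flag releases the slot back to the producer. */
	__atomic_store_n(&slot, 0, __ATOMIC_RELEASE);
	return v >> 1;
}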


^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v8 08/17] test/distributor: synchronize lcores statistics
  2020-10-17  3:06                                 ` [dpdk-dev] [PATCH v8 08/17] test/distributor: synchronize lcores statistics Lukasz Wojciechowski
@ 2020-10-17 21:11                                   ` Honnappa Nagarahalli
  0 siblings, 0 replies; 164+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-17 21:11 UTC (permalink / raw)
  To: Lukasz Wojciechowski, David Hunt, Bruce Richardson
  Cc: dev, stable, nd, Honnappa Nagarahalli, nd

<snip>

> 
> Statistics of handled packets are cleared and read on the main lcore, while
> they are incremented in the workers' handlers on different lcores.
> 
> Without synchronization, they occasionally showed invalid values.
> This patch uses atomic mechanisms to synchronize.
> The relaxed memory model is used.
> 
> Fixes: c3eabff124e6 ("distributor: add unit tests")
> Cc: bruce.richardson@intel.com
> Cc: stable@dpdk.org
> 
> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
> Acked-by: David Hunt <david.hunt@intel.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>

> ---
>  app/test/test_distributor.c | 39 +++++++++++++++++++++++++------------
>  1 file changed, 27 insertions(+), 12 deletions(-)
> 
> diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c index
> ec1fe348b..4343efed1 100644
> --- a/app/test/test_distributor.c
> +++ b/app/test/test_distributor.c
> @@ -43,7 +43,8 @@ total_packet_count(void)  {
>  	unsigned i, count = 0;
>  	for (i = 0; i < worker_idx; i++)
> -		count += worker_stats[i].handled_packets;
> +		count +=
> __atomic_load_n(&worker_stats[i].handled_packets,
> +				__ATOMIC_RELAXED);
>  	return count;
>  }
> 
> @@ -51,7 +52,10 @@ total_packet_count(void)  static inline void
>  clear_packet_count(void)
>  {
> -	memset(&worker_stats, 0, sizeof(worker_stats));
> +	unsigned int i;
> +	for (i = 0; i < RTE_MAX_LCORE; i++)
> +		__atomic_store_n(&worker_stats[i].handled_packets, 0,
> +			__ATOMIC_RELAXED);
>  }
> 
>  /* this is the basic worker function for sanity test @@ -129,7 +133,8 @@
> sanity_test(struct worker_params *wp, struct rte_mempool *p)
> 
>  	for (i = 0; i < rte_lcore_count() - 1; i++)
>  		printf("Worker %u handled %u packets\n", i,
> -				worker_stats[i].handled_packets);
> +			__atomic_load_n(&worker_stats[i].handled_packets,
> +					__ATOMIC_RELAXED));
>  	printf("Sanity test with all zero hashes done.\n");
> 
>  	/* pick two flows and check they go correctly */ @@ -154,7 +159,9
> @@ sanity_test(struct worker_params *wp, struct rte_mempool *p)
> 
>  		for (i = 0; i < rte_lcore_count() - 1; i++)
>  			printf("Worker %u handled %u packets\n", i,
> -					worker_stats[i].handled_packets);
> +				__atomic_load_n(
> +					&worker_stats[i].handled_packets,
> +					__ATOMIC_RELAXED));
>  		printf("Sanity test with two hash values done\n");
>  	}
> 
> @@ -180,7 +187,8 @@ sanity_test(struct worker_params *wp, struct
> rte_mempool *p)
> 
>  	for (i = 0; i < rte_lcore_count() - 1; i++)
>  		printf("Worker %u handled %u packets\n", i,
> -				worker_stats[i].handled_packets);
> +			__atomic_load_n(&worker_stats[i].handled_packets,
> +					__ATOMIC_RELAXED));
>  	printf("Sanity test with non-zero hashes done\n");
> 
>  	rte_mempool_put_bulk(p, (void *)bufs, BURST); @@ -272,12
> +280,14 @@ handle_work_with_free_mbufs(void *arg)
> 
>  	num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
>  	while (!quit) {
> -		worker_stats[id].handled_packets += num;
> +		__atomic_fetch_add(&worker_stats[id].handled_packets,
> num,
> +				__ATOMIC_RELAXED);
>  		for (i = 0; i < num; i++)
>  			rte_pktmbuf_free(buf[i]);
>  		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
>  	}
> -	worker_stats[id].handled_packets += num;
> +	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
> +			__ATOMIC_RELAXED);
>  	rte_distributor_return_pkt(d, id, buf, num);
>  	return 0;
>  }
> @@ -347,7 +357,8 @@ handle_work_for_shutdown_test(void *arg)
>  	/* wait for quit single globally, or for worker zero, wait
>  	 * for zero_quit */
>  	while (!quit && !(id == zero_id && zero_quit)) {
> -		worker_stats[id].handled_packets += num;
> +		__atomic_fetch_add(&worker_stats[id].handled_packets,
> num,
> +				__ATOMIC_RELAXED);
>  		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
> 
>  		if (num > 0) {
> @@ -357,8 +368,9 @@ handle_work_for_shutdown_test(void *arg)
>  		}
>  		zero_id = __atomic_load_n(&zero_idx,
> __ATOMIC_ACQUIRE);
>  	}
> -	worker_stats[id].handled_packets += num;
> 
> +	__atomic_fetch_add(&worker_stats[id].handled_packets, num,
> +			__ATOMIC_RELAXED);
>  	if (id == zero_id) {
>  		rte_distributor_return_pkt(d, id, NULL, 0);
> 
> @@ -371,7 +383,8 @@ handle_work_for_shutdown_test(void *arg)
>  		num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
> 
>  		while (!quit) {
> -			worker_stats[id].handled_packets += num;
> +
> 	__atomic_fetch_add(&worker_stats[id].handled_packets,
> +					num, __ATOMIC_RELAXED);
>  			num = rte_distributor_get_pkt(d, id, buf, NULL, 0);
>  		}
>  	}
> @@ -437,7 +450,8 @@ sanity_test_with_worker_shutdown(struct
> worker_params *wp,
> 
>  	for (i = 0; i < rte_lcore_count() - 1; i++)
>  		printf("Worker %u handled %u packets\n", i,
> -				worker_stats[i].handled_packets);
> +			__atomic_load_n(&worker_stats[i].handled_packets,
> +					__ATOMIC_RELAXED));
> 
>  	if (total_packet_count() != BURST * 2) {
>  		printf("Line %d: Error, not all packets flushed. "
> @@ -497,7 +511,8 @@ test_flush_with_worker_shutdown(struct
> worker_params *wp,
>  	zero_quit = 0;
>  	for (i = 0; i < rte_lcore_count() - 1; i++)
>  		printf("Worker %u handled %u packets\n", i,
> -				worker_stats[i].handled_packets);
> +			__atomic_load_n(&worker_stats[i].handled_packets,
> +					__ATOMIC_RELAXED));
> 
>  	if (total_packet_count() != BURST) {
>  		printf("Line %d: Error, not all packets flushed. "
> --
> 2.17.1
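
The clear_packet_count() hunk above settles the earlier memset debate:
clearing each counter with an atomic store keeps the clearing itself
race-free against concurrent atomic increments, and relaxed ordering
suffices. A minimal sketch of the two variants, using the plain GCC builtin
fence in place of the rte_ wrapper:

#include <string.h>

#define SKETCH_MAX_LCORE 128

static struct worker_stats_sketch {
	volatile unsigned int handled_packets;
} stats[SKETCH_MAX_LCORE];

/* Earlier variant: a plain memset is a non-atomic write and races with
 * any concurrent __atomic_fetch_add() on the same counters; the trailing
 * release fence does not repair that.
 */
static void
clear_with_memset(void)
{
	memset(&stats, 0, sizeof(stats));
	__atomic_thread_fence(__ATOMIC_RELEASE); /* rte_atomic_thread_fence
						  * in the actual test */
}

/* Variant the series settled on: each counter is cleared with an atomic
 * store, so mixed with atomic increments there is no data race, and
 * relaxed ordering suffices.
 */
static void
clear_atomically(void)
{
	unsigned int i;

	for (i = 0; i < SKETCH_MAX_LCORE; i++)
		__atomic_store_n(&stats[i].handled_packets, 0,
				__ATOMIC_RELAXED);
}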


^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v8 17/17] test/distributor: fix quitting workers
  2020-10-17  3:07                                 ` [dpdk-dev] [PATCH v8 17/17] test/distributor: fix quitting workers Lukasz Wojciechowski
@ 2020-10-17 21:15                                   ` Honnappa Nagarahalli
  0 siblings, 0 replies; 164+ messages in thread
From: Honnappa Nagarahalli @ 2020-10-17 21:15 UTC (permalink / raw)
  To: Lukasz Wojciechowski, David Hunt, Bruce Richardson
  Cc: dev, stable, nd, Honnappa Nagarahalli, nd

<snip>

> 
> Sending a number of packets equal to the number of workers isn't enough to
> stop all workers in the burst version of the distributor, as more than one
> packet can be matched and consumed by a single worker. This way some of the
> workers might never be woken up from rte_distributor_get_pkt().
> 
> This patch fixes it by sending packets one by one. Each sent packet causes
> exactly one worker to quit.
> 
> Fixes: 775003ad2f96 ("distributor: add new burst-capable library")
> Cc: david.hunt@intel.com
> Cc: stable@dpdk.org
> 
> Signed-off-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Looks good.
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>

> ---
>  app/test/test_distributor.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/app/test/test_distributor.c b/app/test/test_distributor.c index
> a4af0a39c..e0cb698e1 100644
> --- a/app/test/test_distributor.c
> +++ b/app/test/test_distributor.c
> @@ -769,9 +769,10 @@ quit_workers(struct worker_params *wp, struct
> rte_mempool *p)
> 
>  	zero_quit = 0;
>  	quit = 1;
> -	for (i = 0; i < num_workers; i++)
> +	for (i = 0; i < num_workers; i++) {
>  		bufs[i]->hash.usr = i << 1;
> -	rte_distributor_process(d, bufs, num_workers);
> +		rte_distributor_process(d, &bufs[i], 1);
> +	}
> 
>  	rte_distributor_process(d, NULL, 0);
>  	rte_distributor_flush(d);
> --
> 2.17.1


^ permalink raw reply	[flat|nested] 164+ messages in thread

* Re: [dpdk-dev] [PATCH v8 00/17] fix distributor synchronization issues
  2020-10-17  3:06                             ` [dpdk-dev] [PATCH v8 00/17] fix distributor synchronization issues Lukasz Wojciechowski
                                                 ` (16 preceding siblings ...)
       [not found]                               ` <CGME20201017030723eucas1p16904cabfd94afa4fe751c072077e09ae@eucas1p1.samsung.com>
@ 2020-10-19  8:32                               ` David Marchand
  17 siblings, 0 replies; 164+ messages in thread
From: David Marchand @ 2020-10-19  8:32 UTC (permalink / raw)
  To: Lukasz Wojciechowski; +Cc: dev, David Hunt, Honnappa Nagarahalli

On Sat, Oct 17, 2020 at 5:07 AM Lukasz Wojciechowski
<l.wojciechow@partner.samsung.com> wrote:
>
> During review and verification of the patch created by Sarosh Arif:
> "test_distributor: prevent memory leakages from the pool" I found out
> that running distributor unit tests multiple times in a row causes fails.
> So I investigated all the issues I found.
>
> There are few synchronization issues that might cause deadlocks
> or corrupted data. They are fixed with this set of patches for both tests
> and librte_distributor library.
>

Series applied, thanks Lukasz!


-- 
David Marchand


^ permalink raw reply	[flat|nested] 164+ messages in thread

end of thread, other threads:[~2020-10-19  8:32 UTC | newest]

Thread overview: 164+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <CGME20200915193456eucas1p1a38a0bb16e9c81ed587f916aeb8c41e5@eucas1p1.samsung.com>
2020-09-15 19:34 ` [dpdk-dev] [PATCH v1 0/6] fix distributor synchronization issues Lukasz Wojciechowski
     [not found]   ` <CGME20200915193457eucas1p2adbe25c41a0e4ef16c029e7bff104503@eucas1p2.samsung.com>
2020-09-15 19:34     ` [dpdk-dev] [PATCH v1 1/6] app/test: fix deadlock in distributor test Lukasz Wojciechowski
2020-09-17 11:21       ` David Hunt
2020-09-17 14:01         ` Lukasz Wojciechowski
     [not found]   ` <CGME20200915193457eucas1p2321d28b6abf69f244cd7c1e61ed0620e@eucas1p2.samsung.com>
2020-09-15 19:34     ` [dpdk-dev] [PATCH v1 2/6] app/test: synchronize statistics between lcores Lukasz Wojciechowski
2020-09-17 11:50       ` David Hunt
     [not found]   ` <CGME20200915193458eucas1p1d9308e63063eda28f96eedba3a361a2b@eucas1p1.samsung.com>
2020-09-15 19:34     ` [dpdk-dev] [PATCH v1 3/6] app/test: fix freeing mbufs in distributor tests Lukasz Wojciechowski
2020-09-17 12:34       ` David Hunt
2020-09-22 12:42       ` David Marchand
2020-09-23  1:55         ` Lukasz Wojciechowski
     [not found]   ` <CGME20200915193459eucas1p19f5d1cbea87d7dc3bbd2638cdb96a31b@eucas1p1.samsung.com>
2020-09-15 19:34     ` [dpdk-dev] [PATCH v1 4/6] app/test: collect return mbufs in distributor test Lukasz Wojciechowski
2020-09-17 12:37       ` David Hunt
     [not found]   ` <CGME20200915193500eucas1p2b079e1dcfd2d54e01a5630609b82b370@eucas1p2.samsung.com>
2020-09-15 19:34     ` [dpdk-dev] [PATCH v1 5/6] distributor: fix missing handshake synchronization Lukasz Wojciechowski
2020-09-17 13:22       ` David Hunt
     [not found]   ` <CGME20200915193501eucas1p2333f0b08077c06ba04b89ce192072f9a@eucas1p2.samsung.com>
2020-09-15 19:34     ` [dpdk-dev] [PATCH v1 6/6] distributor: fix handshake deadlock Lukasz Wojciechowski
2020-09-17 13:28       ` David Hunt
     [not found]   ` <CGME20200923014717eucas1p18699ad84d206e786a84f20dab9b65c33@eucas1p1.samsung.com>
2020-09-23  1:47     ` [dpdk-dev] [PATCH v2 0/8] fix distributor synchronization issues Lukasz Wojciechowski
     [not found]       ` <CGME20200923014718eucas1p11fdcd774fef7b9e077e14e01c9f951d5@eucas1p1.samsung.com>
2020-09-23  1:47         ` [dpdk-dev] [PATCH v2 1/8] app/test: fix deadlock in distributor test Lukasz Wojciechowski
     [not found]       ` <CGME20200923014719eucas1p2f26000109e86a649796e902c30e58bf0@eucas1p2.samsung.com>
2020-09-23  1:47         ` [dpdk-dev] [PATCH v2 2/8] app/test: synchronize statistics between lcores Lukasz Wojciechowski
2020-09-23  4:30           ` Honnappa Nagarahalli
2020-09-23 12:47             ` Lukasz Wojciechowski
     [not found]       ` <CGME20200923014719eucas1p165c419cff4f265cff8add8cc818210ff@eucas1p1.samsung.com>
2020-09-23  1:47         ` [dpdk-dev] [PATCH v2 3/8] app/test: fix freeing mbufs in distributor tests Lukasz Wojciechowski
     [not found]       ` <CGME20200923014720eucas1p2bd5887c96c24839f364810a1bbe840da@eucas1p2.samsung.com>
2020-09-23  1:47         ` [dpdk-dev] [PATCH v2 4/8] app/test: collect return mbufs in distributor test Lukasz Wojciechowski
     [not found]       ` <CGME20200923014721eucas1p1d22ac56c9b9e4fb49ac73d72d51a7a23@eucas1p1.samsung.com>
2020-09-23  1:47         ` [dpdk-dev] [PATCH v2 5/8] distributor: fix missing handshake synchronization Lukasz Wojciechowski
     [not found]       ` <CGME20200923014722eucas1p2c2ef63759f4b800c1b5a80094e07e384@eucas1p2.samsung.com>
2020-09-23  1:47         ` [dpdk-dev] [PATCH v2 6/8] distributor: fix handshake deadlock Lukasz Wojciechowski
     [not found]       ` <CGME20200923014723eucas1p2a7c7210a55289b3739faff4f5ed72e30@eucas1p2.samsung.com>
2020-09-23  1:47         ` [dpdk-dev] [PATCH v2 7/8] distributor: do not use oldpkt when not needed Lukasz Wojciechowski
     [not found]       ` <CGME20200923014724eucas1p13d3c0428a15bea26def7a4343251e4e4@eucas1p1.samsung.com>
2020-09-23  1:47         ` [dpdk-dev] [PATCH v2 8/8] distributor: align API documentation with code Lukasz Wojciechowski
2020-09-23  8:46       ` [dpdk-dev] [PATCH v2 0/8] fix distributor synchronization issues David Hunt
2020-09-23 14:03         ` Lukasz Wojciechowski
2020-09-23  8:47       ` David Hunt
     [not found]       ` <CGME20200923132544eucas1p29470697e7cb6621cc65e6e676c3e5d69@eucas1p2.samsung.com>
2020-09-23 13:25         ` [dpdk-dev] [PATCH v3 " Lukasz Wojciechowski
     [not found]           ` <CGME20200923132545eucas1p10db12d91121c9afdbab338bb60c8ed37@eucas1p1.samsung.com>
2020-09-23 13:25             ` [dpdk-dev] [PATCH v3 1/8] app/test: fix deadlock in distributor test Lukasz Wojciechowski
     [not found]           ` <CGME20200923132546eucas1p212b6eede801514b544d82d41f5b7e4b8@eucas1p2.samsung.com>
2020-09-23 13:25             ` [dpdk-dev] [PATCH v3 2/8] app/test: synchronize statistics between lcores Lukasz Wojciechowski
     [not found]           ` <CGME20200923132547eucas1p130620b0d5f3080a7a57234838a992e0e@eucas1p1.samsung.com>
2020-09-23 13:25             ` [dpdk-dev] [PATCH v3 3/8] app/test: fix freeing mbufs in distributor tests Lukasz Wojciechowski
     [not found]           ` <CGME20200923132548eucas1p2a54328cddb79ae5e876eb104217d585f@eucas1p2.samsung.com>
2020-09-23 13:25             ` [dpdk-dev] [PATCH v3 4/8] app/test: collect return mbufs in distributor test Lukasz Wojciechowski
     [not found]           ` <CGME20200923132549eucas1p29fc391c3f236fa704ff800774ab851f0@eucas1p2.samsung.com>
2020-09-23 13:25             ` [dpdk-dev] [PATCH v3 5/8] distributor: fix missing handshake synchronization Lukasz Wojciechowski
     [not found]           ` <CGME20200923132550eucas1p2ce158dd81ccc04abcab4130d8cb391f4@eucas1p2.samsung.com>
2020-09-23 13:25             ` [dpdk-dev] [PATCH v3 6/8] distributor: fix handshake deadlock Lukasz Wojciechowski
     [not found]           ` <CGME20200923132550eucas1p1ce21011562d0a00cccfd4ae3f0be4ff9@eucas1p1.samsung.com>
2020-09-23 13:25             ` [dpdk-dev] [PATCH v3 7/8] distributor: do not use oldpkt when not needed Lukasz Wojciechowski
     [not found]           ` <CGME20200923132551eucas1p214a5f78c61e891c5e7b6cddc038d0e2e@eucas1p2.samsung.com>
2020-09-23 13:25             ` [dpdk-dev] [PATCH v3 8/8] distributor: align API documentation with code Lukasz Wojciechowski
2020-09-25 12:31           ` [dpdk-dev] [PATCH v3 0/8] fix distributor synchronization issues David Marchand
2020-09-25 22:42             ` Lukasz Wojciechowski
     [not found]           ` <CGME20200925224216eucas1p28b73b1b56c6f3372db4fcaba0890522f@eucas1p2.samsung.com>
2020-09-25 22:42             ` [dpdk-dev] [PATCH v4 " Lukasz Wojciechowski
     [not found]               ` <CGME20200925224216eucas1p1e8e1d0ecab4bbbf6e43b117c1d210649@eucas1p1.samsung.com>
2020-09-25 22:42                 ` [dpdk-dev] [PATCH v4 1/8] test/distributor: fix deadlock with freezed worker Lukasz Wojciechowski
2020-09-27 23:34                   ` Honnappa Nagarahalli
2020-09-30 20:22                     ` Lukasz Wojciechowski
     [not found]               ` <CGME20200925224217eucas1p1bb5f73109b4aeed8f2badf311fa8dfb5@eucas1p1.samsung.com>
2020-09-25 22:42                 ` [dpdk-dev] [PATCH v4 2/8] test/distributor: synchronize lcores statistics Lukasz Wojciechowski
2020-09-29  5:49                   ` Honnappa Nagarahalli
2020-10-02 11:25                     ` Lukasz Wojciechowski
2020-10-08 20:47                       ` Lukasz Wojciechowski
2020-10-16  5:43                       ` Honnappa Nagarahalli
2020-10-16 12:43                         ` Lukasz Wojciechowski
2020-10-16 12:58                           ` Lukasz Wojciechowski
2020-10-16 15:42                             ` Honnappa Nagarahalli
2020-10-17  3:34                               ` Lukasz Wojciechowski
     [not found]               ` <CGME20200925224218eucas1p2383ff0ebdaee18b581f5f731476f05ab@eucas1p2.samsung.com>
2020-09-25 22:42                 ` [dpdk-dev] [PATCH v4 3/8] distributor: do not use oldpkt when not needed Lukasz Wojciechowski
     [not found]               ` <CGME20200925224218eucas1p221c1af87b0e4547547106503cd336afd@eucas1p2.samsung.com>
2020-09-25 22:42                 ` [dpdk-dev] [PATCH v4 4/8] test/distributor: fix freeing mbufs Lukasz Wojciechowski
     [not found]               ` <CGME20200925224219eucas1p2d61447fef421573d653d2376423ecce0@eucas1p2.samsung.com>
2020-09-25 22:42                 ` [dpdk-dev] [PATCH v4 5/8] test/distributor: collect return mbufs Lukasz Wojciechowski
     [not found]               ` <CGME20200925224220eucas1p1a44e99a1d7750d37d5aefa61f329209b@eucas1p1.samsung.com>
2020-09-25 22:42                 ` [dpdk-dev] [PATCH v4 6/8] distributor: fix missing handshake synchronization Lukasz Wojciechowski
     [not found]               ` <CGME20200925224221eucas1p151297834da32a0f7cfdffc120f57ab3a@eucas1p1.samsung.com>
2020-09-25 22:42                 ` [dpdk-dev] [PATCH v4 7/8] distributor: fix handshake deadlock Lukasz Wojciechowski
     [not found]               ` <CGME20200925224222eucas1p1b10891c21bfef6784777526af4443dde@eucas1p1.samsung.com>
2020-09-25 22:42                 ` [dpdk-dev] [PATCH v4 8/8] distributor: align API documentation with code Lukasz Wojciechowski
     [not found]               ` <CGME20201008052336eucas1p16b5b1600683e33ddba30479b7fd62ce6@eucas1p1.samsung.com>
2020-10-08  5:23                 ` [dpdk-dev] [PATCH v5 00/15] fix distributor synchronization issues Lukasz Wojciechowski
     [not found]                   ` <CGME20201008052337eucas1p22b9e89987caf151ba8771442385fec16@eucas1p2.samsung.com>
2020-10-08  5:23                     ` [dpdk-dev] [PATCH v5 01/15] distributor: fix missing handshake synchronization Lukasz Wojciechowski
     [not found]                   ` <CGME20201008052338eucas1p2d26a8705b17d07fd24056f0aeaf3504e@eucas1p2.samsung.com>
2020-10-08  5:23                     ` [dpdk-dev] [PATCH v5 02/15] distributor: fix handshake deadlock Lukasz Wojciechowski
     [not found]                   ` <CGME20201008052339eucas1p1a4e571cc3f5a277badff9d352ad7da8e@eucas1p1.samsung.com>
2020-10-08  5:23                     ` [dpdk-dev] [PATCH v5 03/15] distributor: do not use oldpkt when not needed Lukasz Wojciechowski
2020-10-08  8:13                       ` David Hunt
     [not found]                   ` <CGME20201008052339eucas1p15697f457b8b96809d04f737e041af08a@eucas1p1.samsung.com>
2020-10-08  5:23                     ` [dpdk-dev] [PATCH v5 04/15] distributor: handle worker shutdown in burst mode Lukasz Wojciechowski
2020-10-08 14:26                       ` David Hunt
2020-10-08 21:07                         ` Lukasz Wojciechowski
2020-10-09 12:13                           ` David Hunt
2020-10-09 20:43                             ` Lukasz Wojciechowski
     [not found]                   ` <CGME20201008052340eucas1p1451f2bf1b6475067491753274547b837@eucas1p1.samsung.com>
2020-10-08  5:23                     ` [dpdk-dev] [PATCH v5 05/15] test/distributor: fix shutdown of busy worker Lukasz Wojciechowski
     [not found]                   ` <CGME20201008052341eucas1p2379b186206e5bf481e3c680de46e5c16@eucas1p2.samsung.com>
2020-10-08  5:23                     ` [dpdk-dev] [PATCH v5 06/15] test/distributor: synchronize lcores statistics Lukasz Wojciechowski
     [not found]                   ` <CGME20201008052342eucas1p2376e75d9ac38f5054ca393b0ef7e663d@eucas1p2.samsung.com>
2020-10-08  5:23                     ` [dpdk-dev] [PATCH v5 07/15] distributor: fix return pkt calls in single mode Lukasz Wojciechowski
2020-10-08 14:32                       ` David Hunt
     [not found]                   ` <CGME20201008052342eucas1p19e8474360d1f7dacd4164b3e21e54290@eucas1p1.samsung.com>
2020-10-08  5:23                     ` [dpdk-dev] [PATCH v5 08/15] test/distributor: fix freeing mbufs Lukasz Wojciechowski
     [not found]                   ` <CGME20201008052343eucas1p1649655353d6c76cdf6320a04e8d43f32@eucas1p1.samsung.com>
2020-10-08  5:23                     ` [dpdk-dev] [PATCH v5 09/15] test/distributor: collect return mbufs Lukasz Wojciechowski
     [not found]                   ` <CGME20201008052344eucas1p270b04ad2c4346e6beb5f5ef844827085@eucas1p2.samsung.com>
2020-10-08  5:23                     ` [dpdk-dev] [PATCH v5 10/15] distributor: align API documentation with code Lukasz Wojciechowski
2020-10-08 14:35                       ` David Hunt
     [not found]                   ` <CGME20201008052345eucas1p29e14456610d4ed48c09b8cf7bd338e18@eucas1p2.samsung.com>
2020-10-08  5:23                     ` [dpdk-dev] [PATCH v5 11/15] test/distributor: replace delays with spin locks Lukasz Wojciechowski
2020-10-09 12:23                       ` David Hunt
     [not found]                   ` <CGME20201008052345eucas1p17a05f99986032885a0316d3419cdea2d@eucas1p1.samsung.com>
2020-10-08  5:23                     ` [dpdk-dev] [PATCH v5 12/15] distributor: fix scalar matching Lukasz Wojciechowski
2020-10-09 12:31                       ` David Hunt
2020-10-09 12:35                         ` David Hunt
2020-10-09 21:02                           ` Lukasz Wojciechowski
     [not found]                   ` <CGME20201008052346eucas1p15b04bf84cafc2ba52bbe063f57d08c39@eucas1p1.samsung.com>
2020-10-08  5:23                     ` [dpdk-dev] [PATCH v5 13/15] test/distributor: add test with packets marking Lukasz Wojciechowski
2020-10-09 12:50                       ` David Hunt
2020-10-09 21:12                         ` Lukasz Wojciechowski
     [not found]                   ` <CGME20201008052347eucas1p1570239523104a0d609c928d8b149ebdf@eucas1p1.samsung.com>
2020-10-08  5:23                     ` [dpdk-dev] [PATCH v5 14/15] distributor: fix flushing in flight packets Lukasz Wojciechowski
2020-10-09 13:10                       ` David Hunt
     [not found]                   ` <CGME20201008052348eucas1p183cfbe10d10bd98c7a63a34af98b80df@eucas1p1.samsung.com>
2020-10-08  5:23                     ` [dpdk-dev] [PATCH v5 15/15] distributor: fix clearing returns buffer Lukasz Wojciechowski
2020-10-09 13:12                       ` David Hunt
2020-10-08  7:30                   ` [dpdk-dev] [PATCH v5 00/15] fix distributor synchronization issues David Marchand
2020-10-08 21:16                     ` Lukasz Wojciechowski
2020-10-09 12:53                       ` David Marchand
2020-10-09 21:41                         ` Lukasz Wojciechowski
2020-10-09 23:25                           ` Lukasz Wojciechowski
2020-10-10  8:12                             ` David Marchand
2020-10-10  8:15                               ` David Marchand
2020-10-10 16:27                               ` Lukasz Wojciechowski
     [not found]                   ` <CGME20201009220207eucas1p1d83b63b4f0e05cbaf0a58f7f01ec0052@eucas1p1.samsung.com>
2020-10-09 22:01                     ` [dpdk-dev] [PATCH v6 " Lukasz Wojciechowski
     [not found]                       ` <CGME20201009220229eucas1p17ad627f31005ed506c5422b93ad6d112@eucas1p1.samsung.com>
2020-10-09 22:01                         ` [dpdk-dev] [PATCH v6 01/15] distributor: fix missing handshake synchronization Lukasz Wojciechowski
     [not found]                       ` <CGME20201009220231eucas1p217c48d880aaa7f15e4351f92eede01b6@eucas1p2.samsung.com>
2020-10-09 22:01                         ` [dpdk-dev] [PATCH v6 02/15] distributor: fix handshake deadlock Lukasz Wojciechowski
     [not found]                       ` <CGME20201009220232eucas1p201d3b81574b7ec42ff3fb18f4bbfcbea@eucas1p2.samsung.com>
2020-10-09 22:01                         ` [dpdk-dev] [PATCH v6 03/15] distributor: do not use oldpkt when not needed Lukasz Wojciechowski
     [not found]                       ` <CGME20201009220233eucas1p285b4d01402c0c8bcfd018673afeb05eb@eucas1p2.samsung.com>
2020-10-09 22:01                         ` [dpdk-dev] [PATCH v6 04/15] distributor: handle worker shutdown in burst mode Lukasz Wojciechowski
     [not found]                       ` <CGME20201009220235eucas1p17ded8b5bb42f2fef159a5715ef6fbca7@eucas1p1.samsung.com>
2020-10-09 22:01                         ` [dpdk-dev] [PATCH v6 05/15] test/distributor: fix shutdown of busy worker Lukasz Wojciechowski
     [not found]                       ` <CGME20201009220236eucas1p192e34b3bbf00681ec90de296abd1a6b5@eucas1p1.samsung.com>
2020-10-09 22:01                         ` [dpdk-dev] [PATCH v6 06/15] test/distributor: synchronize lcores statistics Lukasz Wojciechowski
     [not found]                       ` <CGME20201009220238eucas1p2e86c0026064774e5b494c16c7fd384ec@eucas1p2.samsung.com>
2020-10-09 22:01                         ` [dpdk-dev] [PATCH v6 07/15] distributor: fix return pkt calls in single mode Lukasz Wojciechowski
     [not found]                       ` <CGME20201009220246eucas1p1283b16f1f54c572b5952ca9334d667da@eucas1p1.samsung.com>
2020-10-09 22:01                         ` [dpdk-dev] [PATCH v6 08/15] test/distributor: fix freeing mbufs Lukasz Wojciechowski
     [not found]                       ` <CGME20201009220247eucas1p1a783663e586127cbfd406a61e13c40eb@eucas1p1.samsung.com>
2020-10-09 22:01                         ` [dpdk-dev] [PATCH v6 09/15] test/distributor: collect return mbufs Lukasz Wojciechowski
     [not found]                       ` <CGME20201009220248eucas1p156346857c1aab2340ccd7549abdce966@eucas1p1.samsung.com>
2020-10-09 22:01                         ` [dpdk-dev] [PATCH v6 10/15] distributor: align API documentation with code Lukasz Wojciechowski
     [not found]                       ` <CGME20201009220250eucas1p18587737171d82a9bde52c767ee8ed24b@eucas1p1.samsung.com>
2020-10-09 22:01                         ` [dpdk-dev] [PATCH v6 11/15] test/distributor: replace delays with spin locks Lukasz Wojciechowski
     [not found]                       ` <CGME20201009220253eucas1p14078ab159186d2c26e787b3b2ed68062@eucas1p1.samsung.com>
2020-10-09 22:01                         ` [dpdk-dev] [PATCH v6 12/15] distributor: fix scalar matching Lukasz Wojciechowski
     [not found]                       ` <CGME20201009220253eucas1p2c0e27c3a495cb9603102b2cbf8a8f706@eucas1p2.samsung.com>
2020-10-09 22:02                         ` [dpdk-dev] [PATCH v6 13/15] test/distributor: add test with packets marking Lukasz Wojciechowski
     [not found]                       ` <CGME20201009220254eucas1p187bad9a066f00ee4c05ec6ca7fb4decd@eucas1p1.samsung.com>
2020-10-09 22:02                         ` [dpdk-dev] [PATCH v6 14/15] distributor: fix flushing in flight packets Lukasz Wojciechowski
     [not found]                       ` <CGME20201009220255eucas1p1e7a286684291e586ebb22cb0a2117e50@eucas1p1.samsung.com>
2020-10-09 22:02                         ` [dpdk-dev] [PATCH v6 15/15] distributor: fix clearing returns buffer Lukasz Wojciechowski
     [not found]                       ` <CGME20201010160513eucas1p1fbacf1f82c40d65aef40634f245c4206@eucas1p1.samsung.com>
2020-10-10 16:04                         ` [dpdk-dev] [PATCH v7 00/16] fix distributor synchronization issues Lukasz Wojciechowski
     [not found]                           ` <CGME20201010160515eucas1p18003d01d8217cdf04be3cba2e32f969f@eucas1p1.samsung.com>
2020-10-10 16:04                             ` [dpdk-dev] [PATCH v7 01/16] distributor: fix missing handshake synchronization Lukasz Wojciechowski
2020-10-15 23:47                               ` Honnappa Nagarahalli
2020-10-17  3:13                                 ` Lukasz Wojciechowski
     [not found]                           ` <CGME20201010160517eucas1p2141c0bb6097a05aa99ed8efdf5fb7512@eucas1p2.samsung.com>
2020-10-10 16:04                             ` [dpdk-dev] [PATCH v7 02/16] distributor: fix handshake deadlock Lukasz Wojciechowski
     [not found]                           ` <CGME20201010160523eucas1p19287c5bf3b7e2818c730ae23f514853f@eucas1p1.samsung.com>
2020-10-10 16:04                             ` [dpdk-dev] [PATCH v7 03/16] distributor: do not use oldpkt when not needed Lukasz Wojciechowski
     [not found]                           ` <CGME20201010160525eucas1p2314810086b9dd1c8cddf90eabe800363@eucas1p2.samsung.com>
2020-10-10 16:04                             ` [dpdk-dev] [PATCH v7 04/16] distributor: handle worker shutdown in burst mode Lukasz Wojciechowski
     [not found]                           ` <CGME20201010160527eucas1p2f55cb0fc45bf3647234cdfa251e542fc@eucas1p2.samsung.com>
2020-10-10 16:04                             ` [dpdk-dev] [PATCH v7 05/16] test/distributor: fix shutdown of busy worker Lukasz Wojciechowski
     [not found]                           ` <CGME20201010160528eucas1p2b9b8189aef51c18d116f97ccebf5719c@eucas1p2.samsung.com>
2020-10-10 16:04                             ` [dpdk-dev] [PATCH v7 06/16] test/distributor: synchronize lcores statistics Lukasz Wojciechowski
2020-10-16  5:13                               ` Honnappa Nagarahalli
2020-10-17  3:23                                 ` Lukasz Wojciechowski
     [not found]                           ` <CGME20201010160530eucas1p15baba6fba44a7caee8b4b0ff778a961d@eucas1p1.samsung.com>
2020-10-10 16:04                             ` [dpdk-dev] [PATCH v7 07/16] distributor: fix return pkt calls in single mode Lukasz Wojciechowski
     [not found]                           ` <CGME20201010160536eucas1p2b20e729b90d66eddd03618e98d38c179@eucas1p2.samsung.com>
2020-10-10 16:04                             ` [dpdk-dev] [PATCH v7 08/16] test/distributor: fix freeing mbufs Lukasz Wojciechowski
2020-10-16  5:12                               ` Honnappa Nagarahalli
2020-10-17  3:28                                 ` Lukasz Wojciechowski
     [not found]                           ` <CGME20201010160538eucas1p19298667f236209cfeaa4745f9bb3aae6@eucas1p1.samsung.com>
2020-10-10 16:05                             ` [dpdk-dev] [PATCH v7 09/16] test/distributor: collect return mbufs Lukasz Wojciechowski
2020-10-16  4:53                               ` Honnappa Nagarahalli
2020-10-16  5:13                               ` Honnappa Nagarahalli
2020-10-17  3:29                                 ` Lukasz Wojciechowski
     [not found]                           ` <CGME20201010160540eucas1p2d942834b4749672c433a37a8fe520bd1@eucas1p2.samsung.com>
2020-10-10 16:05                             ` [dpdk-dev] [PATCH v7 10/16] distributor: align API documentation with code Lukasz Wojciechowski
     [not found]                           ` <CGME20201010160541eucas1p11d079bad2b7500f9ab927463e1eeac04@eucas1p1.samsung.com>
2020-10-10 16:05                             ` [dpdk-dev] [PATCH v7 11/16] test/distributor: replace delays with spin locks Lukasz Wojciechowski
     [not found]                           ` <CGME20201010160548eucas1p193e4f234da1005b91f22a8e7cb1d3226@eucas1p1.samsung.com>
2020-10-10 16:05                             ` [dpdk-dev] [PATCH v7 12/16] distributor: fix scalar matching Lukasz Wojciechowski
     [not found]                           ` <CGME20201010160549eucas1p1eba7cb8e4e9ba9200e9cd498137848c3@eucas1p1.samsung.com>
2020-10-10 16:05                             ` [dpdk-dev] [PATCH v7 13/16] test/distributor: add test with packets marking Lukasz Wojciechowski
     [not found]                           ` <CGME20201010160551eucas1p171642aa2d451e501287915824bfe7c24@eucas1p1.samsung.com>
2020-10-10 16:05                             ` [dpdk-dev] [PATCH v7 14/16] distributor: fix flushing in flight packets Lukasz Wojciechowski
     [not found]                           ` <CGME20201010160552eucas1p2efdec872c4aea2b63af29c84e9a5b52d@eucas1p2.samsung.com>
2020-10-10 16:05                             ` [dpdk-dev] [PATCH v7 15/16] distributor: fix clearing returns buffer Lukasz Wojciechowski
     [not found]                           ` <CGME20201010160605eucas1p1ff6b4cb5065e1355cb8eeafd4696abaf@eucas1p1.samsung.com>
2020-10-10 16:05                             ` [dpdk-dev] [PATCH v7 16/16] test/distributor: ensure all packets are delivered Lukasz Wojciechowski
2020-10-12  7:46                               ` David Hunt
     [not found]                           ` <CGME20201017030709eucas1p11285f14ee4fe2e79ad5791b0e9b9c653@eucas1p1.samsung.com>
2020-10-17  3:06                             ` [dpdk-dev] [PATCH v8 00/17] fix distributor synchronization issues Lukasz Wojciechowski
     [not found]                               ` <CGME20201017030710eucas1p17fb6129fd3414b4b6b70dcd593c01a40@eucas1p1.samsung.com>
2020-10-17  3:06                                 ` [dpdk-dev] [PATCH v8 01/17] distributor: fix missing handshake synchronization Lukasz Wojciechowski
2020-10-17 21:05                                   ` Honnappa Nagarahalli
     [not found]                               ` <CGME20201017030711eucas1p1b70f13e4636ad7c3e842b48726ae1845@eucas1p1.samsung.com>
2020-10-17  3:06                                 ` [dpdk-dev] [PATCH v8 02/17] distributor: fix handshake deadlock Lukasz Wojciechowski
     [not found]                               ` <CGME20201017030711eucas1p14855de461cd9d6a4fd3e4bac031b53e5@eucas1p1.samsung.com>
2020-10-17  3:06                                 ` [dpdk-dev] [PATCH v8 03/17] distributor: do not use oldpkt when not needed Lukasz Wojciechowski
     [not found]                               ` <CGME20201017030712eucas1p1ce19efadc60ed2888dc615cbb2549bdc@eucas1p1.samsung.com>
2020-10-17  3:06                                 ` [dpdk-dev] [PATCH v8 04/17] distributor: handle worker shutdown in burst mode Lukasz Wojciechowski
     [not found]                               ` <CGME20201017030713eucas1p1173c2178e647be341db2da29078c8d5d@eucas1p1.samsung.com>
2020-10-17  3:06                                 ` [dpdk-dev] [PATCH v8 05/17] test/distributor: fix shutdown of busy worker Lukasz Wojciechowski
     [not found]                               ` <CGME20201017030714eucas1p292bd71a85ea6d638256c21d279c8d533@eucas1p2.samsung.com>
2020-10-17  3:06                                 ` [dpdk-dev] [PATCH v8 06/17] distributor: fix return pkt calls in single mode Lukasz Wojciechowski
     [not found]                               ` <CGME20201017030715eucas1p2366d1f0ce16a219b21542bb26e4588a6@eucas1p2.samsung.com>
2020-10-17  3:06                                 ` [dpdk-dev] [PATCH v8 07/17] test/distributor: fix freeing mbufs Lukasz Wojciechowski
     [not found]                               ` <CGME20201017030716eucas1p2911112ee3c9e0a3f3dd9a811cbafe77b@eucas1p2.samsung.com>
2020-10-17  3:06                                 ` [dpdk-dev] [PATCH v8 08/17] test/distributor: synchronize lcores statistics Lukasz Wojciechowski
2020-10-17 21:11                                   ` Honnappa Nagarahalli
     [not found]                               ` <CGME20201017030717eucas1p1ae327494575f851af4bdf77f3e8c83ae@eucas1p1.samsung.com>
2020-10-17  3:06                                 ` [dpdk-dev] [PATCH v8 09/17] test/distributor: collect return mbufs Lukasz Wojciechowski
     [not found]                               ` <CGME20201017030718eucas1p256e1f934af12af2a6b07640c9de7a766@eucas1p2.samsung.com>
2020-10-17  3:06                                 ` [dpdk-dev] [PATCH v8 10/17] distributor: align API documentation with code Lukasz Wojciechowski
     [not found]                               ` <CGME20201017030719eucas1p13b13db1fbc3715e19e81bb4be4635b7d@eucas1p1.samsung.com>
2020-10-17  3:06                                 ` [dpdk-dev] [PATCH v8 11/17] test/distributor: replace delays with spin locks Lukasz Wojciechowski
     [not found]                               ` <CGME20201017030720eucas1p1fe683996638c3692cae530e67271b79b@eucas1p1.samsung.com>
2020-10-17  3:06                                 ` [dpdk-dev] [PATCH v8 12/17] distributor: fix scalar matching Lukasz Wojciechowski
     [not found]                               ` <CGME20201017030720eucas1p1359382fafa661abb1ba82fa65e19562c@eucas1p1.samsung.com>
2020-10-17  3:06                                 ` [dpdk-dev] [PATCH v8 13/17] test/distributor: add test with packets marking Lukasz Wojciechowski
     [not found]                               ` <CGME20201017030721eucas1p2a1032e6c78d99f903ea539e49f057a83@eucas1p2.samsung.com>
2020-10-17  3:06                                 ` [dpdk-dev] [PATCH v8 14/17] distributor: fix flushing in flight packets Lukasz Wojciechowski
     [not found]                               ` <CGME20201017030721eucas1p1f3307c1e4e69c65186ad8f2fb18f5f74@eucas1p1.samsung.com>
2020-10-17  3:06                                 ` [dpdk-dev] [PATCH v8 15/17] distributor: fix clearing returns buffer Lukasz Wojciechowski
     [not found]                               ` <CGME20201017030722eucas1p107dc8d3eb2d9ef620065deba31cf08ed@eucas1p1.samsung.com>
2020-10-17  3:07                                 ` [dpdk-dev] [PATCH v8 16/17] test/distributor: ensure all packets are delivered Lukasz Wojciechowski
     [not found]                               ` <CGME20201017030723eucas1p16904cabfd94afa4fe751c072077e09ae@eucas1p1.samsung.com>
2020-10-17  3:07                                 ` [dpdk-dev] [PATCH v8 17/17] test/distributor: fix quitting workers Lukasz Wojciechowski
2020-10-17 21:15                                   ` Honnappa Nagarahalli
2020-10-19  8:32                               ` [dpdk-dev] [PATCH v8 00/17] fix distributor synchronization issues David Marchand
