* [PATCH v2 00/12] use compiler atomic builtins for app modules
From: Joyce Kong @ 2021-11-16 9:41 UTC
Cc: dev, honnappa.nagarahalli, nd, Joyce Kong
Since DPDK has adopted the C11 memory model[1], replace the
rte_atomicNN_xxx APIs with the compiler's atomic built-ins
in the app modules[2].
[1] https://www.dpdk.org/blog/2021/03/26/dpdk-adopts-the-c11-memory-model/
[2] https://doc.dpdk.org/guides/rel_notes/deprecation.html
v2:
By Honnappa Nagarahalli:
1. Replace the RELAXED memory orderings with suitable ones for
shared data sync in the pmd_perf and timer test cases.
2. Avoid unnecessary atomic operations in compress and testpmd
modules.
3. Fix some typos.
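As an illustration of item 1 above, here is a minimal sketch of a flag that guards shared data. The names payload, ready, publish() and consume() are hypothetical and do not appear in the patches. A plain spin loop stands in for rte_wait_until_equal_32() so that the block builds without DPDK headers.

```c
#include <stdint.h>

/* Hypothetical shared state; not taken from the patches. */
static uint64_t payload;
static uint32_t ready;

/* Writer: fill the payload first, then publish it with a release store. */
static void publish(uint64_t value)
{
	payload = value;
	__atomic_store_n(&ready, 1, __ATOMIC_RELEASE);
}

/*
 * Reader: spin until the flag is set. The acquire load pairs with the
 * release store above, so the payload write is visible once the flag
 * is observed. With RELAXED on both sides only the flag itself would
 * be ordered, not the payload.
 */
static uint64_t consume(void)
{
	while (__atomic_load_n(&ready, __ATOMIC_ACQUIRE) != 1)
		;
	return payload;
}
```

When the flag only gates the start of a measurement and guards no other data, RELAXED on both sides is enough, which is presumably why some of the patches keep it.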
Joyce Kong (12):
test/pmd_perf: use compiler atomic builtins for polling sync
test/ring_perf: use compiler atomic builtins for lcores sync
test/timer: use compiler atomic builtins for sync
test/stack_perf: use compiler atomics for lcore sync
test/bpf: use compiler atomics for calculation
test/func_reentrancy: use compiler atomics for data sync
app/eventdev: use compiler atomics for shared data sync
app/crypto: use compiler atomic builtins for display sync
app/compress: use compiler atomic builtins for display sync
app/testpmd: remove atomic operations for port status
app/bbdev: use compiler atomics for shared data sync
app: remove unnecessary include of atomic header file
app/proc-info/main.c | 1 -
app/test-bbdev/test_bbdev_perf.c | 135 ++++++++----------
.../comp_perf_test_common.h | 2 +-
.../comp_perf_test_cyclecount.c | 15 +-
.../comp_perf_test_throughput.c | 10 +-
.../comp_perf_test_verify.c | 6 +-
app/test-crypto-perf/cperf_test_latency.c | 6 +-
.../cperf_test_pmd_cyclecount.c | 9 +-
app/test-crypto-perf/cperf_test_throughput.c | 9 +-
app/test-crypto-perf/cperf_test_verify.c | 9 +-
app/test-eventdev/evt_main.c | 1 -
app/test-eventdev/test_order_atq.c | 4 +-
app/test-eventdev/test_order_common.c | 4 +-
app/test-eventdev/test_order_common.h | 8 +-
app/test-eventdev/test_order_queue.c | 4 +-
app/test-pipeline/config.c | 1 -
app/test-pipeline/init.c | 1 -
app/test-pipeline/main.c | 1 -
app/test-pipeline/runtime.c | 1 -
app/test-pmd/cmdline.c | 1 -
app/test-pmd/config.c | 1 -
app/test-pmd/csumonly.c | 1 -
app/test-pmd/flowgen.c | 1 -
app/test-pmd/icmpecho.c | 1 -
app/test-pmd/iofwd.c | 1 -
app/test-pmd/macfwd.c | 1 -
app/test-pmd/macswap.c | 1 -
app/test-pmd/parameters.c | 1 -
app/test-pmd/rxonly.c | 1 -
app/test-pmd/testpmd.c | 58 ++++----
app/test-pmd/txonly.c | 1 -
app/test/test_barrier.c | 1 -
app/test/test_bpf.c | 28 ++--
app/test/test_func_reentrancy.c | 27 ++--
app/test/test_mbuf.c | 1 -
app/test/test_mp_secondary.c | 1 -
app/test/test_pmd_perf.c | 14 +-
app/test/test_ring.c | 1 -
app/test/test_ring_perf.c | 9 +-
app/test/test_stack_perf.c | 14 +-
app/test/test_timer.c | 30 ++--
app/test/test_timer_secondary.c | 1 -
42 files changed, 197 insertions(+), 226 deletions(-)
--
2.25.1
* [PATCH v2 01/12] test/pmd_perf: use compiler atomic builtins for polling sync
From: Joyce Kong @ 2021-11-16 9:41 UTC
Cc: dev, honnappa.nagarahalli, nd, Joyce Kong, Ruifeng Wang
Convert rte_atomic usages to compiler atomic built-ins
for polling sync in pmd_perf test cases.
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
app/test/test_pmd_perf.c | 14 ++++++--------
1 file changed, 6 insertions(+), 8 deletions(-)
diff --git a/app/test/test_pmd_perf.c b/app/test/test_pmd_perf.c
index 1df86ce080..546384a50d 100644
--- a/app/test/test_pmd_perf.c
+++ b/app/test/test_pmd_perf.c
@@ -10,7 +10,6 @@
#include <rte_cycles.h>
#include <rte_ethdev.h>
#include <rte_byteorder.h>
-#include <rte_atomic.h>
#include <rte_malloc.h>
#include "packet_burst_generator.h"
#include "test.h"
@@ -525,7 +524,7 @@ main_loop(__rte_unused void *args)
return 0;
}
-static rte_atomic64_t start;
+static uint64_t start;
static inline int
poll_burst(void *args)
@@ -563,8 +562,7 @@ poll_burst(void *args)
num[portid] = pkt_per_port;
}
- while (!rte_atomic64_read(&start))
- ;
+ rte_wait_until_equal_64(&start, 1, __ATOMIC_ACQUIRE);
cur_tsc = rte_rdtsc();
while (total) {
@@ -616,15 +614,15 @@ exec_burst(uint32_t flags, int lcore)
pkt_per_port = MAX_TRAFFIC_BURST;
num = pkt_per_port * conf->nb_ports;
- rte_atomic64_init(&start);
-
/* start polling thread, but not actually poll yet */
rte_eal_remote_launch(poll_burst,
(void *)&pkt_per_port, lcore);
/* Only when polling first */
if (flags == SC_BURST_POLL_FIRST)
- rte_atomic64_set(&start, 1);
+ __atomic_store_n(&start, 1, __ATOMIC_RELAXED);
+ else
+ __atomic_store_n(&start, 0, __ATOMIC_RELAXED);
/* start xmit */
i = 0;
@@ -641,7 +639,7 @@ exec_burst(uint32_t flags, int lcore)
/* only when polling second */
if (flags == SC_BURST_XMIT_FIRST)
- rte_atomic64_set(&start, 1);
+ __atomic_store_n(&start, 1, __ATOMIC_RELEASE);
/* wait for polling finished */
diff_tsc = rte_eal_wait_lcore(lcore);
--
2.25.1
* [PATCH v2 02/12] test/ring_perf: use compiler atomic builtins for lcores sync
From: Joyce Kong @ 2021-11-16 9:41 UTC
To: Honnappa Nagarahalli, Konstantin Ananyev
Cc: dev, nd, Joyce Kong, Ruifeng Wang
Convert rte_atomic usages to compiler atomic built-ins
for lcores sync in ring_perf test cases.
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
---
app/test/test_ring_perf.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/app/test/test_ring_perf.c b/app/test/test_ring_perf.c
index fd82e20412..2d8bb675a3 100644
--- a/app/test/test_ring_perf.c
+++ b/app/test/test_ring_perf.c
@@ -320,7 +320,7 @@ run_on_core_pair(struct lcore_pair *cores, struct rte_ring *r, const int esize)
return 0;
}
-static rte_atomic32_t synchro;
+static uint32_t synchro;
static uint64_t queue_count[RTE_MAX_LCORE];
#define TIME_MS 100
@@ -342,8 +342,7 @@ load_loop_fn_helper(struct thread_params *p, const int esize)
/* wait synchro for workers */
if (lcore != rte_get_main_lcore())
- while (rte_atomic32_read(&synchro) == 0)
- rte_pause();
+ rte_wait_until_equal_32(&synchro, 1, __ATOMIC_RELAXED);
begin = rte_get_timer_cycles();
while (time_diff < hz * TIME_MS / 1000) {
@@ -398,12 +397,12 @@ run_on_all_cores(struct rte_ring *r, const int esize)
param.r = r;
/* clear synchro and start workers */
- rte_atomic32_set(&synchro, 0);
+ __atomic_store_n(&synchro, 0, __ATOMIC_RELAXED);
if (rte_eal_mp_remote_launch(lcore_f, &param, SKIP_MAIN) < 0)
return -1;
/* start synchro and launch test on main */
- rte_atomic32_set(&synchro, 1);
+ __atomic_store_n(&synchro, 1, __ATOMIC_RELAXED);
lcore_f(&param);
rte_eal_mp_wait_lcore();
--
2.25.1
* [PATCH v2 03/12] test/timer: use compiler atomic builtins for sync
From: Joyce Kong @ 2021-11-16 9:41 UTC
To: Robert Sanford, Erik Gabriel Carrillo
Cc: dev, honnappa.nagarahalli, nd, Joyce Kong, Ruifeng Wang
Convert rte_atomic usages to compiler atomic
built-ins for lcore_state and collisions sync.
Also, move 'main_init_workers' outside of
'timer_stress2_main_loop' to guarantee that lcore_state
is initialized correctly before the threads are launched.
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
app/test/test_timer.c | 30 +++++++++++++-----------------
app/test/test_timer_secondary.c | 1 -
2 files changed, 13 insertions(+), 18 deletions(-)
diff --git a/app/test/test_timer.c b/app/test/test_timer.c
index a10b2fe9da..c97e5c891c 100644
--- a/app/test/test_timer.c
+++ b/app/test/test_timer.c
@@ -102,7 +102,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_lcore.h>
-#include <rte_atomic.h>
#include <rte_timer.h>
#include <rte_random.h>
#include <rte_malloc.h>
@@ -203,7 +202,7 @@ timer_stress_main_loop(__rte_unused void *arg)
/* Need to synchronize worker lcores through multiple steps. */
enum { WORKER_WAITING = 1, WORKER_RUN_SIGNAL, WORKER_RUNNING, WORKER_FINISHED };
-static rte_atomic16_t lcore_state[RTE_MAX_LCORE];
+static uint16_t lcore_state[RTE_MAX_LCORE];
static void
main_init_workers(void)
@@ -211,7 +210,7 @@ main_init_workers(void)
unsigned i;
RTE_LCORE_FOREACH_WORKER(i) {
- rte_atomic16_set(&lcore_state[i], WORKER_WAITING);
+ __atomic_store_n(&lcore_state[i], WORKER_WAITING, __ATOMIC_RELAXED);
}
}
@@ -221,11 +220,10 @@ main_start_workers(void)
unsigned i;
RTE_LCORE_FOREACH_WORKER(i) {
- rte_atomic16_set(&lcore_state[i], WORKER_RUN_SIGNAL);
+ __atomic_store_n(&lcore_state[i], WORKER_RUN_SIGNAL, __ATOMIC_RELEASE);
}
RTE_LCORE_FOREACH_WORKER(i) {
- while (rte_atomic16_read(&lcore_state[i]) != WORKER_RUNNING)
- rte_pause();
+ rte_wait_until_equal_16(&lcore_state[i], WORKER_RUNNING, __ATOMIC_ACQUIRE);
}
}
@@ -235,8 +233,7 @@ main_wait_for_workers(void)
unsigned i;
RTE_LCORE_FOREACH_WORKER(i) {
- while (rte_atomic16_read(&lcore_state[i]) != WORKER_FINISHED)
- rte_pause();
+ rte_wait_until_equal_16(&lcore_state[i], WORKER_FINISHED, __ATOMIC_ACQUIRE);
}
}
@@ -245,9 +242,8 @@ worker_wait_to_start(void)
{
unsigned lcore_id = rte_lcore_id();
- while (rte_atomic16_read(&lcore_state[lcore_id]) != WORKER_RUN_SIGNAL)
- rte_pause();
- rte_atomic16_set(&lcore_state[lcore_id], WORKER_RUNNING);
+ rte_wait_until_equal_16(&lcore_state[lcore_id], WORKER_RUN_SIGNAL, __ATOMIC_ACQUIRE);
+ __atomic_store_n(&lcore_state[lcore_id], WORKER_RUNNING, __ATOMIC_RELEASE);
}
static void
@@ -255,7 +251,7 @@ worker_finish(void)
{
unsigned lcore_id = rte_lcore_id();
- rte_atomic16_set(&lcore_state[lcore_id], WORKER_FINISHED);
+ __atomic_store_n(&lcore_state[lcore_id], WORKER_FINISHED, __ATOMIC_RELEASE);
}
@@ -281,13 +277,12 @@ timer_stress2_main_loop(__rte_unused void *arg)
unsigned int lcore_id = rte_lcore_id();
unsigned int main_lcore = rte_get_main_lcore();
int32_t my_collisions = 0;
- static rte_atomic32_t collisions;
+ static uint32_t collisions;
if (lcore_id == main_lcore) {
cb_count = 0;
test_failed = 0;
- rte_atomic32_set(&collisions, 0);
- main_init_workers();
+ __atomic_store_n(&collisions, 0, __ATOMIC_RELAXED);
timers = rte_malloc(NULL, sizeof(*timers) * NB_STRESS2_TIMERS, 0);
if (timers == NULL) {
printf("Test Failed\n");
@@ -315,7 +310,7 @@ timer_stress2_main_loop(__rte_unused void *arg)
my_collisions++;
}
if (my_collisions != 0)
- rte_atomic32_add(&collisions, my_collisions);
+ __atomic_fetch_add(&collisions, my_collisions, __ATOMIC_RELAXED);
/* wait long enough for timers to expire */
rte_delay_ms(100);
@@ -329,7 +324,7 @@ timer_stress2_main_loop(__rte_unused void *arg)
/* now check that we get the right number of callbacks */
if (lcore_id == main_lcore) {
- my_collisions = rte_atomic32_read(&collisions);
+ my_collisions = __atomic_load_n(&collisions, __ATOMIC_RELAXED);
if (my_collisions != 0)
printf("- %d timer reset collisions (OK)\n", my_collisions);
rte_timer_manage();
@@ -573,6 +568,7 @@ test_timer(void)
/* run a second, slightly different set of stress tests */
printf("\nStart timer stress tests 2\n");
test_failed = 0;
+ main_init_workers();
rte_eal_mp_remote_launch(timer_stress2_main_loop, NULL, CALL_MAIN);
rte_eal_mp_wait_lcore();
if (test_failed)
diff --git a/app/test/test_timer_secondary.c b/app/test/test_timer_secondary.c
index 16a9f1878b..5795c97f07 100644
--- a/app/test/test_timer_secondary.c
+++ b/app/test/test_timer_secondary.c
@@ -9,7 +9,6 @@
#include <rte_lcore.h>
#include <rte_debug.h>
#include <rte_memzone.h>
-#include <rte_atomic.h>
#include <rte_timer.h>
#include <rte_cycles.h>
#include <rte_mempool.h>
--
2.25.1
* [PATCH v2 04/12] test/stack_perf: use compiler atomics for lcore sync
From: Joyce Kong @ 2021-11-16 9:41 UTC
To: Olivier Matz; +Cc: dev, honnappa.nagarahalli, nd, Joyce Kong, Ruifeng Wang
Convert rte_atomic usages to compiler atomic built-ins
for lcore sync in stack_perf test cases.
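The lcore sync converted here is a countdown barrier. A minimal sketch of the pattern follows. The helper names barrier_init() and barrier_arrive_and_wait() are hypothetical, and a plain spin loop is used in place of rte_wait_until_equal_32() so that the block builds without DPDK headers.

```c
#include <stdint.h>

static uint32_t lcore_barrier;

/* Set the number of participants before any of them is launched. */
static void barrier_init(uint32_t n)
{
	__atomic_store_n(&lcore_barrier, n, __ATOMIC_RELAXED);
}

/*
 * Each participant decrements the counter once and then spins until
 * every other participant has done the same. RELAXED is used because
 * the counter only aligns the start of the measurement and guards no
 * other data.
 */
static void barrier_arrive_and_wait(void)
{
	__atomic_fetch_sub(&lcore_barrier, 1, __ATOMIC_RELAXED);
	while (__atomic_load_n(&lcore_barrier, __ATOMIC_RELAXED) != 0)
		;
}
```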
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
---
app/test/test_stack_perf.c | 14 ++++++--------
1 file changed, 6 insertions(+), 8 deletions(-)
diff --git a/app/test/test_stack_perf.c b/app/test/test_stack_perf.c
index 4ee40d5d19..1eae00a334 100644
--- a/app/test/test_stack_perf.c
+++ b/app/test/test_stack_perf.c
@@ -6,7 +6,6 @@
#include <stdio.h>
#include <inttypes.h>
-#include <rte_atomic.h>
#include <rte_cycles.h>
#include <rte_launch.h>
#include <rte_pause.h>
@@ -24,7 +23,7 @@
*/
static volatile unsigned int bulk_sizes[] = {8, MAX_BURST};
-static rte_atomic32_t lcore_barrier;
+static uint32_t lcore_barrier;
struct lcore_pair {
unsigned int c1;
@@ -144,9 +143,8 @@ bulk_push_pop(void *p)
s = args->s;
size = args->sz;
- rte_atomic32_sub(&lcore_barrier, 1);
- while (rte_atomic32_read(&lcore_barrier) != 0)
- rte_pause();
+ __atomic_fetch_sub(&lcore_barrier, 1, __ATOMIC_RELAXED);
+ rte_wait_until_equal_32(&lcore_barrier, 0, __ATOMIC_RELAXED);
uint64_t start = rte_rdtsc();
@@ -175,7 +173,7 @@ run_on_core_pair(struct lcore_pair *cores, struct rte_stack *s,
unsigned int i;
for (i = 0; i < RTE_DIM(bulk_sizes); i++) {
- rte_atomic32_set(&lcore_barrier, 2);
+ __atomic_store_n(&lcore_barrier, 2, __ATOMIC_RELAXED);
args[0].sz = args[1].sz = bulk_sizes[i];
args[0].s = args[1].s = s;
@@ -208,7 +206,7 @@ run_on_n_cores(struct rte_stack *s, lcore_function_t fn, int n)
int cnt = 0;
double avg;
- rte_atomic32_set(&lcore_barrier, n);
+ __atomic_store_n(&lcore_barrier, n, __ATOMIC_RELAXED);
RTE_LCORE_FOREACH_WORKER(lcore_id) {
if (++cnt >= n)
@@ -302,7 +300,7 @@ __test_stack_perf(uint32_t flags)
struct lcore_pair cores;
struct rte_stack *s;
- rte_atomic32_init(&lcore_barrier);
+ __atomic_store_n(&lcore_barrier, 0, __ATOMIC_RELAXED);
s = rte_stack_create(STACK_NAME, STACK_SIZE, rte_socket_id(), flags);
if (s == NULL) {
--
2.25.1
* [PATCH v2 05/12] test/bpf: use compiler atomics for calculation
From: Joyce Kong @ 2021-11-16 9:41 UTC
To: Konstantin Ananyev
Cc: dev, honnappa.nagarahalli, nd, Joyce Kong, Ruifeng Wang
Convert rte_atomic usages to compiler atomic built-ins
for calculation in bpf test cases.
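A minimal sketch of why the casts to rte_atomic32_t and rte_atomic64_t can be dropped: the built-ins operate directly on plain integer objects. The struct and helper below are hypothetical stand-ins for the test's data, and the int32_t type of rv is an assumption.

```c
#include <stdint.h>

/* Hypothetical stand-in for the test's data structure. */
struct counters {
	uint32_t u32;
	uint64_t u64;
};

/*
 * Add rv to both fields. The value is converted to the type of the
 * pointed-to object, so a negative rv wraps modulo 2^32 or 2^64 and
 * acts as a subtraction.
 */
static void add_both(struct counters *c, int32_t rv)
{
	__atomic_fetch_add(&c->u32, rv, __ATOMIC_RELAXED);
	__atomic_fetch_add(&c->u64, rv, __ATOMIC_RELAXED);
}
```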
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
app/test/test_bpf.c | 28 ++++++++++++++--------------
1 file changed, 14 insertions(+), 14 deletions(-)
diff --git a/app/test/test_bpf.c b/app/test/test_bpf.c
index e3e9a1b0b5..b8be1e3d30 100644
--- a/app/test/test_bpf.c
+++ b/app/test/test_bpf.c
@@ -1569,32 +1569,32 @@ test_xadd1_check(uint64_t rc, const void *arg)
memset(&dfe, 0, sizeof(dfe));
rv = 1;
- rte_atomic32_add((rte_atomic32_t *)&dfe.u32, rv);
- rte_atomic64_add((rte_atomic64_t *)&dfe.u64, rv);
+ __atomic_fetch_add(&dfe.u32, rv, __ATOMIC_RELAXED);
+ __atomic_fetch_add(&dfe.u64, rv, __ATOMIC_RELAXED);
rv = -1;
- rte_atomic32_add((rte_atomic32_t *)&dfe.u32, rv);
- rte_atomic64_add((rte_atomic64_t *)&dfe.u64, rv);
+ __atomic_fetch_add(&dfe.u32, rv, __ATOMIC_RELAXED);
+ __atomic_fetch_add(&dfe.u64, rv, __ATOMIC_RELAXED);
rv = (int32_t)TEST_FILL_1;
- rte_atomic32_add((rte_atomic32_t *)&dfe.u32, rv);
- rte_atomic64_add((rte_atomic64_t *)&dfe.u64, rv);
+ __atomic_fetch_add(&dfe.u32, rv, __ATOMIC_RELAXED);
+ __atomic_fetch_add(&dfe.u64, rv, __ATOMIC_RELAXED);
rv = TEST_MUL_1;
- rte_atomic32_add((rte_atomic32_t *)&dfe.u32, rv);
- rte_atomic64_add((rte_atomic64_t *)&dfe.u64, rv);
+ __atomic_fetch_add(&dfe.u32, rv, __ATOMIC_RELAXED);
+ __atomic_fetch_add(&dfe.u64, rv, __ATOMIC_RELAXED);
rv = TEST_MUL_2;
- rte_atomic32_add((rte_atomic32_t *)&dfe.u32, rv);
- rte_atomic64_add((rte_atomic64_t *)&dfe.u64, rv);
+ __atomic_fetch_add(&dfe.u32, rv, __ATOMIC_RELAXED);
+ __atomic_fetch_add(&dfe.u64, rv, __ATOMIC_RELAXED);
rv = TEST_JCC_2;
- rte_atomic32_add((rte_atomic32_t *)&dfe.u32, rv);
- rte_atomic64_add((rte_atomic64_t *)&dfe.u64, rv);
+ __atomic_fetch_add(&dfe.u32, rv, __ATOMIC_RELAXED);
+ __atomic_fetch_add(&dfe.u64, rv, __ATOMIC_RELAXED);
rv = TEST_JCC_3;
- rte_atomic32_add((rte_atomic32_t *)&dfe.u32, rv);
- rte_atomic64_add((rte_atomic64_t *)&dfe.u64, rv);
+ __atomic_fetch_add(&dfe.u32, rv, __ATOMIC_RELAXED);
+ __atomic_fetch_add(&dfe.u64, rv, __ATOMIC_RELAXED);
return cmp_res(__func__, 1, rc, &dfe, dft, sizeof(dfe));
}
--
2.25.1
* [PATCH v2 06/12] test/func_reentrancy: use compiler atomics for data sync
From: Joyce Kong @ 2021-11-16 9:41 UTC
To: Olivier Matz, Andrew Rybchenko, Bruce Richardson,
Vladimir Medvedkin, Yipeng Wang, Sameh Gobriel, Anatoly Burakov,
Honnappa Nagarahalli, Konstantin Ananyev
Cc: dev, nd, Joyce Kong, Ruifeng Wang
Convert rte_atomic usages to compiler atomic built-ins
for shared data sync in func_reentrancy test cases.
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
---
app/test/test_func_reentrancy.c | 27 +++++++++++++--------------
1 file changed, 13 insertions(+), 14 deletions(-)
diff --git a/app/test/test_func_reentrancy.c b/app/test/test_func_reentrancy.c
index 838ab6f0f9..7825c6cb86 100644
--- a/app/test/test_func_reentrancy.c
+++ b/app/test/test_func_reentrancy.c
@@ -20,7 +20,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_lcore.h>
-#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_ring.h>
#include <rte_mempool.h>
@@ -54,12 +53,12 @@ typedef void (*case_clean_t)(unsigned lcore_id);
#define MAX_LCORES (RTE_MAX_MEMZONE / (MAX_ITER_MULTI * 4U))
-static rte_atomic32_t obj_count = RTE_ATOMIC32_INIT(0);
-static rte_atomic32_t synchro = RTE_ATOMIC32_INIT(0);
+static uint32_t obj_count;
+static uint32_t synchro;
#define WAIT_SYNCHRO_FOR_WORKERS() do { \
if (lcore_self != rte_get_main_lcore()) \
- while (rte_atomic32_read(&synchro) == 0); \
+ rte_wait_until_equal_32(&synchro, 1, __ATOMIC_RELAXED); \
} while(0)
/*
@@ -72,7 +71,7 @@ test_eal_init_once(__rte_unused void *arg)
WAIT_SYNCHRO_FOR_WORKERS();
- rte_atomic32_set(&obj_count, 1); /* silent the check in the caller */
+ __atomic_store_n(&obj_count, 1, __ATOMIC_RELAXED); /* silent the check in the caller */
if (rte_eal_init(0, NULL) != -1)
return -1;
@@ -116,7 +115,7 @@ ring_create_lookup(__rte_unused void *arg)
for (i = 0; i < MAX_ITER_ONCE; i++) {
rp = rte_ring_create("fr_test_once", 4096, SOCKET_ID_ANY, 0);
if (rp != NULL)
- rte_atomic32_inc(&obj_count);
+ __atomic_fetch_add(&obj_count, 1, __ATOMIC_RELAXED);
}
/* create/lookup new ring several times */
@@ -183,7 +182,7 @@ mempool_create_lookup(__rte_unused void *arg)
my_obj_init, NULL,
SOCKET_ID_ANY, 0);
if (mp != NULL)
- rte_atomic32_inc(&obj_count);
+ __atomic_fetch_add(&obj_count, 1, __ATOMIC_RELAXED);
}
/* create/lookup new ring several times */
@@ -250,7 +249,7 @@ hash_create_free(__rte_unused void *arg)
for (i = 0; i < MAX_ITER_ONCE; i++) {
handle = rte_hash_create(&hash_params);
if (handle != NULL)
- rte_atomic32_inc(&obj_count);
+ __atomic_fetch_add(&obj_count, 1, __ATOMIC_RELAXED);
}
/* create mutiple times simultaneously */
@@ -318,7 +317,7 @@ fbk_create_free(__rte_unused void *arg)
for (i = 0; i < MAX_ITER_ONCE; i++) {
handle = rte_fbk_hash_create(&fbk_params);
if (handle != NULL)
- rte_atomic32_inc(&obj_count);
+ __atomic_fetch_add(&obj_count, 1, __ATOMIC_RELAXED);
}
/* create mutiple fbk tables simultaneously */
@@ -384,7 +383,7 @@ lpm_create_free(__rte_unused void *arg)
for (i = 0; i < MAX_ITER_ONCE; i++) {
lpm = rte_lpm_create("fr_test_once", SOCKET_ID_ANY, &config);
if (lpm != NULL)
- rte_atomic32_inc(&obj_count);
+ __atomic_fetch_add(&obj_count, 1, __ATOMIC_RELAXED);
}
/* create mutiple fbk tables simultaneously */
@@ -445,8 +444,8 @@ launch_test(struct test_case *pt_case)
if (pt_case->func == NULL)
return -1;
- rte_atomic32_set(&obj_count, 0);
- rte_atomic32_set(&synchro, 0);
+ __atomic_store_n(&obj_count, 0, __ATOMIC_RELAXED);
+ __atomic_store_n(&synchro, 0, __ATOMIC_RELAXED);
cores = RTE_MIN(rte_lcore_count(), MAX_LCORES);
RTE_LCORE_FOREACH_WORKER(lcore_id) {
@@ -456,7 +455,7 @@ launch_test(struct test_case *pt_case)
rte_eal_remote_launch(pt_case->func, pt_case->arg, lcore_id);
}
- rte_atomic32_set(&synchro, 1);
+ __atomic_store_n(&synchro, 1, __ATOMIC_RELAXED);
if (pt_case->func(pt_case->arg) < 0)
ret = -1;
@@ -471,7 +470,7 @@ launch_test(struct test_case *pt_case)
pt_case->clean(lcore_id);
}
- count = rte_atomic32_read(&obj_count);
+ count = __atomic_load_n(&obj_count, __ATOMIC_RELAXED);
if (count != 1) {
printf("%s: common object allocated %d times (should be 1)\n",
pt_case->name, count);
--
2.25.1
* [PATCH v2 07/12] app/eventdev: use compiler atomics for shared data sync
From: Joyce Kong @ 2021-11-16 9:42 UTC
To: Jerin Jacob; +Cc: dev, honnappa.nagarahalli, nd, Joyce Kong, Ruifeng Wang
Convert rte_atomic usages to compiler atomic built-ins
for shared data sync in eventdev cases.
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
app/test-eventdev/evt_main.c | 1 -
app/test-eventdev/test_order_atq.c | 4 ++--
app/test-eventdev/test_order_common.c | 4 ++--
app/test-eventdev/test_order_common.h | 8 ++++----
app/test-eventdev/test_order_queue.c | 4 ++--
5 files changed, 10 insertions(+), 11 deletions(-)
diff --git a/app/test-eventdev/evt_main.c b/app/test-eventdev/evt_main.c
index 3534aabca7..194c980c7a 100644
--- a/app/test-eventdev/evt_main.c
+++ b/app/test-eventdev/evt_main.c
@@ -6,7 +6,6 @@
#include <unistd.h>
#include <signal.h>
-#include <rte_atomic.h>
#include <rte_debug.h>
#include <rte_eal.h>
#include <rte_eventdev.h>
diff --git a/app/test-eventdev/test_order_atq.c b/app/test-eventdev/test_order_atq.c
index 71215a07b6..2fee4b4daa 100644
--- a/app/test-eventdev/test_order_atq.c
+++ b/app/test-eventdev/test_order_atq.c
@@ -28,7 +28,7 @@ order_atq_worker(void *arg, const bool flow_id_cap)
uint16_t event = rte_event_dequeue_burst(dev_id, port,
&ev, 1, 0);
if (!event) {
- if (rte_atomic64_read(outstand_pkts) <= 0)
+ if (__atomic_load_n(outstand_pkts, __ATOMIC_RELAXED) <= 0)
break;
rte_pause();
continue;
@@ -64,7 +64,7 @@ order_atq_worker_burst(void *arg, const bool flow_id_cap)
BURST_SIZE, 0);
if (nb_rx == 0) {
- if (rte_atomic64_read(outstand_pkts) <= 0)
+ if (__atomic_load_n(outstand_pkts, __ATOMIC_RELAXED) <= 0)
break;
rte_pause();
continue;
diff --git a/app/test-eventdev/test_order_common.c b/app/test-eventdev/test_order_common.c
index d7760061ba..ff7813f9c2 100644
--- a/app/test-eventdev/test_order_common.c
+++ b/app/test-eventdev/test_order_common.c
@@ -187,7 +187,7 @@ order_test_setup(struct evt_test *test, struct evt_options *opt)
evt_err("failed to allocate t->expected_flow_seq memory");
goto exp_nomem;
}
- rte_atomic64_set(&t->outstand_pkts, opt->nb_pkts);
+ __atomic_store_n(&t->outstand_pkts, opt->nb_pkts, __ATOMIC_RELAXED);
t->err = false;
t->nb_pkts = opt->nb_pkts;
t->nb_flows = opt->nb_flows;
@@ -294,7 +294,7 @@ order_launch_lcores(struct evt_test *test, struct evt_options *opt,
while (t->err == false) {
uint64_t new_cycles = rte_get_timer_cycles();
- int64_t remaining = rte_atomic64_read(&t->outstand_pkts);
+ int64_t remaining = __atomic_load_n(&t->outstand_pkts, __ATOMIC_RELAXED);
if (remaining <= 0) {
t->result = EVT_TEST_SUCCESS;
diff --git a/app/test-eventdev/test_order_common.h b/app/test-eventdev/test_order_common.h
index cd9d6009ec..92781d9587 100644
--- a/app/test-eventdev/test_order_common.h
+++ b/app/test-eventdev/test_order_common.h
@@ -48,7 +48,7 @@ struct test_order {
* The atomic_* is an expensive operation,Since it is a functional test,
* We are using the atomic_ operation to reduce the code complexity.
*/
- rte_atomic64_t outstand_pkts;
+ uint64_t outstand_pkts;
enum evt_test_result result;
uint32_t nb_flows;
uint64_t nb_pkts;
@@ -95,7 +95,7 @@ static __rte_always_inline void
order_process_stage_1(struct test_order *const t,
struct rte_event *const ev, const uint32_t nb_flows,
uint32_t *const expected_flow_seq,
- rte_atomic64_t *const outstand_pkts)
+ uint64_t *const outstand_pkts)
{
const uint32_t flow = (uintptr_t)ev->mbuf % nb_flows;
/* compare the seqn against expected value */
@@ -113,7 +113,7 @@ order_process_stage_1(struct test_order *const t,
*/
expected_flow_seq[flow]++;
rte_pktmbuf_free(ev->mbuf);
- rte_atomic64_sub(outstand_pkts, 1);
+ __atomic_sub_fetch(outstand_pkts, 1, __ATOMIC_RELAXED);
}
static __rte_always_inline void
@@ -132,7 +132,7 @@ order_process_stage_invalid(struct test_order *const t,
const uint8_t port = w->port_id;\
const uint32_t nb_flows = t->nb_flows;\
uint32_t *expected_flow_seq = t->expected_flow_seq;\
- rte_atomic64_t *outstand_pkts = &t->outstand_pkts;\
+ uint64_t *outstand_pkts = &t->outstand_pkts;\
if (opt->verbose_level > 1)\
printf("%s(): lcore %d dev_id %d port=%d\n",\
__func__, rte_lcore_id(), dev_id, port)
diff --git a/app/test-eventdev/test_order_queue.c b/app/test-eventdev/test_order_queue.c
index 621367805a..80eaea5cf5 100644
--- a/app/test-eventdev/test_order_queue.c
+++ b/app/test-eventdev/test_order_queue.c
@@ -28,7 +28,7 @@ order_queue_worker(void *arg, const bool flow_id_cap)
uint16_t event = rte_event_dequeue_burst(dev_id, port,
&ev, 1, 0);
if (!event) {
- if (rte_atomic64_read(outstand_pkts) <= 0)
+ if (__atomic_load_n(outstand_pkts, __ATOMIC_RELAXED) <= 0)
break;
rte_pause();
continue;
@@ -64,7 +64,7 @@ order_queue_worker_burst(void *arg, const bool flow_id_cap)
BURST_SIZE, 0);
if (nb_rx == 0) {
- if (rte_atomic64_read(outstand_pkts) <= 0)
+ if (__atomic_load_n(outstand_pkts, __ATOMIC_RELAXED) <= 0)
break;
rte_pause();
continue;
--
2.25.1
* [PATCH v2 08/12] app/crypto: use compiler atomic builtins for display sync
From: Joyce Kong @ 2021-11-16 9:42 UTC
To: Declan Doherty, Ciara Power
Cc: dev, honnappa.nagarahalli, nd, Joyce Kong, Ruifeng Wang
Convert rte_atomic_test_and_set usage to compiler atomic
CAS operation for display sync in crypto cases.
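A minimal sketch of the display-once pattern with a compare-and-swap follows. The helper name claim_display_once() is hypothetical; the patch open-codes the call at each printf site.

```c
#include <stdbool.h>
#include <stdint.h>

static uint16_t display_once;

/*
 * Returns true for exactly one caller: the one that swaps 0 -> 1.
 * On failure the built-in overwrites 'expected' with the current
 * value, so it has to be reset to 0 before every attempt.
 */
static bool claim_display_once(void)
{
	uint16_t expected = 0;

	return __atomic_compare_exchange_n(&display_once, &expected, 1,
			0 /* strong CAS */,
			__ATOMIC_RELAXED, __ATOMIC_RELAXED);
}
```

RELAXED ordering is sufficient in this sketch because the flag only decides which lcore prints the header and publishes no other data.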
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
---
app/test-crypto-perf/cperf_test_latency.c | 6 ++++--
app/test-crypto-perf/cperf_test_pmd_cyclecount.c | 9 ++++++---
app/test-crypto-perf/cperf_test_throughput.c | 9 ++++++---
app/test-crypto-perf/cperf_test_verify.c | 9 ++++++---
4 files changed, 22 insertions(+), 11 deletions(-)
diff --git a/app/test-crypto-perf/cperf_test_latency.c b/app/test-crypto-perf/cperf_test_latency.c
index 69f55de50a..ce49feaba9 100644
--- a/app/test-crypto-perf/cperf_test_latency.c
+++ b/app/test-crypto-perf/cperf_test_latency.c
@@ -126,7 +126,7 @@ cperf_latency_test_runner(void *arg)
uint8_t burst_size_idx = 0;
uint32_t imix_idx = 0;
- static rte_atomic16_t display_once = RTE_ATOMIC16_INIT(0);
+ static uint16_t display_once;
if (ctx == NULL)
return 0;
@@ -307,8 +307,10 @@ cperf_latency_test_runner(void *arg)
time_max = tunit*(double)(tsc_max) / tsc_hz;
time_min = tunit*(double)(tsc_min) / tsc_hz;
+ uint16_t exp = 0;
if (ctx->options->csv) {
- if (rte_atomic16_test_and_set(&display_once))
+ if (__atomic_compare_exchange_n(&display_once, &exp, 1, 0,
+ __ATOMIC_RELAXED, __ATOMIC_RELAXED))
printf("\n# lcore, Buffer Size, Burst Size, Pakt Seq #, "
"cycles, time (us)");
diff --git a/app/test-crypto-perf/cperf_test_pmd_cyclecount.c b/app/test-crypto-perf/cperf_test_pmd_cyclecount.c
index fda97e8ab9..ba1f104f72 100644
--- a/app/test-crypto-perf/cperf_test_pmd_cyclecount.c
+++ b/app/test-crypto-perf/cperf_test_pmd_cyclecount.c
@@ -404,7 +404,7 @@ cperf_pmd_cyclecount_test_runner(void *test_ctx)
state.lcore = rte_lcore_id();
state.linearize = 0;
- static rte_atomic16_t display_once = RTE_ATOMIC16_INIT(0);
+ static uint16_t display_once;
static bool warmup = true;
/*
@@ -449,8 +449,10 @@ cperf_pmd_cyclecount_test_runner(void *test_ctx)
continue;
}
+ uint16_t exp = 0;
if (!opts->csv) {
- if (rte_atomic16_test_and_set(&display_once))
+ if (__atomic_compare_exchange_n(&display_once, &exp, 1, 0,
+ __ATOMIC_RELAXED, __ATOMIC_RELAXED))
printf(PRETTY_HDR_FMT, "lcore id", "Buf Size",
"Burst Size", "Enqueued",
"Dequeued", "Enq Retries",
@@ -466,7 +468,8 @@ cperf_pmd_cyclecount_test_runner(void *test_ctx)
state.cycles_per_enq,
state.cycles_per_deq);
} else {
- if (rte_atomic16_test_and_set(&display_once))
+ if (__atomic_compare_exchange_n(&display_once, &exp, 1, 0,
+ __ATOMIC_RELAXED, __ATOMIC_RELAXED))
printf(CSV_HDR_FMT, "# lcore id", "Buf Size",
"Burst Size", "Enqueued",
"Dequeued", "Enq Retries",
diff --git a/app/test-crypto-perf/cperf_test_throughput.c b/app/test-crypto-perf/cperf_test_throughput.c
index 739ed9e573..51512af2ad 100644
--- a/app/test-crypto-perf/cperf_test_throughput.c
+++ b/app/test-crypto-perf/cperf_test_throughput.c
@@ -113,7 +113,7 @@ cperf_throughput_test_runner(void *test_ctx)
uint8_t burst_size_idx = 0;
uint32_t imix_idx = 0;
- static rte_atomic16_t display_once = RTE_ATOMIC16_INIT(0);
+ static uint16_t display_once;
struct rte_crypto_op *ops[ctx->options->max_burst_size];
struct rte_crypto_op *ops_processed[ctx->options->max_burst_size];
@@ -281,8 +281,10 @@ cperf_throughput_test_runner(void *test_ctx)
double cycles_per_packet = ((double)tsc_duration /
ctx->options->total_ops);
+ uint16_t exp = 0;
if (!ctx->options->csv) {
- if (rte_atomic16_test_and_set(&display_once))
+ if (__atomic_compare_exchange_n(&display_once, &exp, 1, 0,
+ __ATOMIC_RELAXED, __ATOMIC_RELAXED))
printf("%12s%12s%12s%12s%12s%12s%12s%12s%12s%12s\n\n",
"lcore id", "Buf Size", "Burst Size",
"Enqueued", "Dequeued", "Failed Enq",
@@ -302,7 +304,8 @@ cperf_throughput_test_runner(void *test_ctx)
throughput_gbps,
cycles_per_packet);
} else {
- if (rte_atomic16_test_and_set(&display_once))
+ if (__atomic_compare_exchange_n(&display_once, &exp, 1, 0,
+ __ATOMIC_RELAXED, __ATOMIC_RELAXED))
printf("#lcore id,Buffer Size(B),"
"Burst Size,Enqueued,Dequeued,Failed Enq,"
"Failed Deq,Ops(Millions),Throughput(Gbps),"
diff --git a/app/test-crypto-perf/cperf_test_verify.c b/app/test-crypto-perf/cperf_test_verify.c
index 1962438034..496eb0de00 100644
--- a/app/test-crypto-perf/cperf_test_verify.c
+++ b/app/test-crypto-perf/cperf_test_verify.c
@@ -241,7 +241,7 @@ cperf_verify_test_runner(void *test_ctx)
uint64_t ops_deqd = 0, ops_deqd_total = 0, ops_deqd_failed = 0;
uint64_t ops_failed = 0;
- static rte_atomic16_t display_once = RTE_ATOMIC16_INIT(0);
+ static uint16_t display_once;
uint64_t i;
uint16_t ops_unused = 0;
@@ -383,8 +383,10 @@ cperf_verify_test_runner(void *test_ctx)
ops_deqd_total += ops_deqd;
}
+ uint16_t exp = 0;
if (!ctx->options->csv) {
- if (rte_atomic16_test_and_set(&display_once))
+ if (__atomic_compare_exchange_n(&display_once, &exp, 1, 0,
+ __ATOMIC_RELAXED, __ATOMIC_RELAXED))
printf("%12s%12s%12s%12s%12s%12s%12s%12s\n\n",
"lcore id", "Buf Size", "Burst size",
"Enqueued", "Dequeued", "Failed Enq",
@@ -401,7 +403,8 @@ cperf_verify_test_runner(void *test_ctx)
ops_deqd_failed,
ops_failed);
} else {
- if (rte_atomic16_test_and_set(&display_once))
+ if (__atomic_compare_exchange_n(&display_once, &exp, 1, 0,
+ __ATOMIC_RELAXED, __ATOMIC_RELAXED))
printf("\n# lcore id, Buffer Size(B), "
"Burst Size,Enqueued,Dequeued,Failed Enq,"
"Failed Deq,Failed Ops\n");
--
2.25.1
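
The display-once conversion in this patch can be sketched in isolation as follows. This is a minimal illustration, not code from the series: the helper name claim_once is hypothetical, and only the GCC/Clang built-in __atomic_compare_exchange_n is a real API. It shows why the expected value has to be a fresh 0 on every attempt: a failed CAS overwrites it with the value currently stored in the flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of DPDK): returns true for the first
 * caller only, mirroring the old rte_atomic16_test_and_set() usage.
 */
static bool
claim_once(uint16_t *flag)
{
	/* Fresh local on every call: a failed CAS writes the observed
	 * value of *flag into 'expected'. */
	uint16_t expected = 0;

	assert(flag != NULL);
	/* Strong CAS (weak = 0): stores 1 only if *flag is still 0.
	 * Relaxed ordering gives atomicity of the flag update, nothing more. */
	return __atomic_compare_exchange_n(flag, &expected, 1, 0,
			__ATOMIC_RELAXED, __ATOMIC_RELAXED);
}
```

A caller would guard the header printf with "if (claim_once(&display_once))". Relaxed ordering is enough here only on the assumption that the flag protects no other shared data.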
* [PATCH v2 09/12] app/compress: use compiler atomic builtins for display sync
2021-11-16 9:41 [PATCH v2 00/12] use compiler atomic builtins for app modules Joyce Kong
` (7 preceding siblings ...)
2021-11-16 9:42 ` [PATCH v2 08/12] app/crypto: use compiler atomic builtins for display sync Joyce Kong
@ 2021-11-16 9:42 ` Joyce Kong
2021-11-16 20:15 ` Honnappa Nagarahalli
2021-11-16 9:42 ` [PATCH v2 10/12] app/testpmd: remove atomic operations for port status Joyce Kong
` (3 subsequent siblings)
12 siblings, 1 reply; 36+ messages in thread
From: Joyce Kong @ 2021-11-16 9:42 UTC (permalink / raw)
Cc: dev, honnappa.nagarahalli, nd, Joyce Kong, Ruifeng Wang
Convert rte_atomic16_test_and_set usage to the compiler atomic
CAS built-in for display sync.
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
app/test-compress-perf/comp_perf_test_common.h | 2 +-
.../comp_perf_test_cyclecount.c | 15 +++++++--------
.../comp_perf_test_throughput.c | 10 +++++++---
app/test-compress-perf/comp_perf_test_verify.c | 6 ++++--
4 files changed, 19 insertions(+), 14 deletions(-)
diff --git a/app/test-compress-perf/comp_perf_test_common.h b/app/test-compress-perf/comp_perf_test_common.h
index 72705c6a2b..d039e5a29a 100644
--- a/app/test-compress-perf/comp_perf_test_common.h
+++ b/app/test-compress-perf/comp_perf_test_common.h
@@ -14,7 +14,7 @@ struct cperf_mem_resources {
uint16_t qp_id;
uint8_t lcore_id;
- rte_atomic16_t print_info_once;
+ uint16_t print_info_once;
uint32_t total_bufs;
uint8_t *compressed_data;
diff --git a/app/test-compress-perf/comp_perf_test_cyclecount.c b/app/test-compress-perf/comp_perf_test_cyclecount.c
index c875ddbdac..da55b02b74 100644
--- a/app/test-compress-perf/comp_perf_test_cyclecount.c
+++ b/app/test-compress-perf/comp_perf_test_cyclecount.c
@@ -466,7 +466,7 @@ cperf_cyclecount_test_runner(void *test_ctx)
struct cperf_cyclecount_ctx *ctx = test_ctx;
struct comp_test_data *test_data = ctx->ver.options;
uint32_t lcore = rte_lcore_id();
- static rte_atomic16_t display_once = RTE_ATOMIC16_INIT(0);
+ static uint16_t display_once;
static rte_spinlock_t print_spinlock;
int i;
@@ -486,10 +486,12 @@ cperf_cyclecount_test_runner(void *test_ctx)
ctx->ver.mem.lcore_id = lcore;
+ uint16_t exp = 0;
/*
* printing information about current compression thread
*/
- if (rte_atomic16_test_and_set(&ctx->ver.mem.print_info_once))
+ if (__atomic_compare_exchange_n(&ctx->ver.mem.print_info_once, &exp,
+ 1, 0, __ATOMIC_RELAXED, __ATOMIC_RELAXED))
printf(" lcore: %u,"
" driver name: %s,"
" device name: %s,"
@@ -546,9 +548,10 @@ cperf_cyclecount_test_runner(void *test_ctx)
(ctx->ver.mem.total_bufs * test_data->num_iter);
/* R E P O R T processing */
- if (rte_atomic16_test_and_set(&display_once)) {
+ rte_spinlock_lock(&print_spinlock);
- rte_spinlock_lock(&print_spinlock);
+ if (display_once == 0) {
+ display_once = 1;
printf("\nLegend for the table\n"
" - Retries section: number of retries for the following operations:\n"
@@ -576,12 +579,8 @@ cperf_cyclecount_test_runner(void *test_ctx)
"setup/op",
"[C-e]", "[C-d]",
"[D-e]", "[D-d]");
-
- rte_spinlock_unlock(&print_spinlock);
}
- rte_spinlock_lock(&print_spinlock);
-
printf("%12u"
"%6u"
"%12zu"
diff --git a/app/test-compress-perf/comp_perf_test_throughput.c b/app/test-compress-perf/comp_perf_test_throughput.c
index 13922b658c..d3dff070b0 100644
--- a/app/test-compress-perf/comp_perf_test_throughput.c
+++ b/app/test-compress-perf/comp_perf_test_throughput.c
@@ -329,15 +329,17 @@ cperf_throughput_test_runner(void *test_ctx)
struct cperf_benchmark_ctx *ctx = test_ctx;
struct comp_test_data *test_data = ctx->ver.options;
uint32_t lcore = rte_lcore_id();
- static rte_atomic16_t display_once = RTE_ATOMIC16_INIT(0);
+ static uint16_t display_once;
int i, ret = EXIT_SUCCESS;
ctx->ver.mem.lcore_id = lcore;
+ uint16_t exp = 0;
/*
* printing information about current compression thread
*/
- if (rte_atomic16_test_and_set(&ctx->ver.mem.print_info_once))
+ if (__atomic_compare_exchange_n(&ctx->ver.mem.print_info_once, &exp,
+ 1, 0, __ATOMIC_RELAXED, __ATOMIC_RELAXED))
printf(" lcore: %u,"
" driver name: %s,"
" device name: %s,"
@@ -391,7 +393,9 @@ cperf_throughput_test_runner(void *test_ctx)
ctx->decomp_gbps = rte_get_tsc_hz() / ctx->decomp_tsc_byte * 8 /
1000000000;
- if (rte_atomic16_test_and_set(&display_once)) {
+ exp = 0;
+ if (__atomic_compare_exchange_n(&display_once, &exp, 1, 0,
+ __ATOMIC_RELAXED, __ATOMIC_RELAXED)) {
printf("\n%12s%6s%12s%17s%15s%16s\n",
"lcore id", "Level", "Comp size", "Comp ratio [%]",
"Comp [Gbps]", "Decomp [Gbps]");
diff --git a/app/test-compress-perf/comp_perf_test_verify.c b/app/test-compress-perf/comp_perf_test_verify.c
index 5e13257b79..f6e21368e8 100644
--- a/app/test-compress-perf/comp_perf_test_verify.c
+++ b/app/test-compress-perf/comp_perf_test_verify.c
@@ -388,7 +388,7 @@ cperf_verify_test_runner(void *test_ctx)
struct cperf_verify_ctx *ctx = test_ctx;
struct comp_test_data *test_data = ctx->options;
int ret = EXIT_SUCCESS;
- static rte_atomic16_t display_once = RTE_ATOMIC16_INIT(0);
+ static uint16_t display_once;
uint32_t lcore = rte_lcore_id();
ctx->mem.lcore_id = lcore;
@@ -427,8 +427,10 @@ cperf_verify_test_runner(void *test_ctx)
ctx->ratio = (double) ctx->comp_data_sz /
test_data->input_data_sz * 100;
+ uint16_t exp = 0;
if (!ctx->silent) {
- if (rte_atomic16_test_and_set(&display_once)) {
+ if (__atomic_compare_exchange_n(&display_once, &exp, 1, 0,
+ __ATOMIC_RELAXED, __ATOMIC_RELAXED)) {
printf("%12s%6s%12s%17s\n",
"lcore id", "Level", "Comp size", "Comp ratio [%]");
}
--
2.25.1
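
In comp_perf_test_cyclecount.c the patch takes a different route from the CAS: the once-flag becomes a plain variable that is read and written only while print_spinlock is held. A rough sketch of that pattern is below, assuming a tiny stand-in lock built on compiler built-ins instead of rte_spinlock_t; the names toy_lock, toy_unlock and report are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

static uint8_t print_lock;     /* hypothetical stand-in for rte_spinlock_t */
static uint16_t display_once;  /* plain flag, touched only under the lock */

static void
toy_lock(uint8_t *l)
{
	/* Spin until the previous value was 0, i.e. we took the lock. */
	while (__atomic_exchange_n(l, 1, __ATOMIC_ACQUIRE))
		;
}

static void
toy_unlock(uint8_t *l)
{
	assert(__atomic_load_n(l, __ATOMIC_RELAXED) == 1); /* must be held */
	__atomic_store_n(l, 0, __ATOMIC_RELEASE);
}

/* Prints the header once, then one row; returns 1 if this call
 * printed the header. */
static int
report(unsigned int lcore)
{
	int printed_header = 0;

	toy_lock(&print_lock);
	if (display_once == 0) {
		display_once = 1;
		printf("%12s\n", "lcore id");
		printed_header = 1;
	}
	printf("%12u\n", lcore);
	toy_unlock(&print_lock);
	return printed_header;
}
```

Because the lock already serializes the printers, the flag itself needs no atomic operation, and the header cannot interleave with another lcore's row.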
* [PATCH v2 10/12] app/testpmd: remove atomic operations for port status
2021-11-16 9:41 [PATCH v2 00/12] use compiler atomic builtins for app modules Joyce Kong
` (8 preceding siblings ...)
2021-11-16 9:42 ` [PATCH v2 09/12] app/compress: " Joyce Kong
@ 2021-11-16 9:42 ` Joyce Kong
2021-11-16 21:34 ` Honnappa Nagarahalli
2021-11-16 9:42 ` [PATCH v2 11/12] app/bbdev: use compiler atomics for shared data sync Joyce Kong
` (2 subsequent siblings)
12 siblings, 1 reply; 36+ messages in thread
From: Joyce Kong @ 2021-11-16 9:42 UTC (permalink / raw)
To: Xiaoyun Li; +Cc: dev, honnappa.nagarahalli, nd, Joyce Kong, Ruifeng Wang
The port_status changes do not need to be handled
atomically, as port_status is modified only during
initialization or from the testpmd prompt, not by
multiple threads.
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
app/test-pmd/testpmd.c | 58 ++++++++++++++++++++++--------------------
1 file changed, 31 insertions(+), 27 deletions(-)
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index a66dfb297c..ed472cacd2 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -36,7 +36,6 @@
#include <rte_alarm.h>
#include <rte_per_lcore.h>
#include <rte_lcore.h>
-#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_mempool.h>
#include <rte_malloc.h>
@@ -2521,9 +2520,9 @@ setup_hairpin_queues(portid_t pi, portid_t p_pi, uint16_t cnt_pi)
continue;
/* Fail to setup rx queue, return */
- if (rte_atomic16_cmpset(&(port->port_status),
- RTE_PORT_HANDLING,
- RTE_PORT_STOPPED) == 0)
+ if (port->port_status == RTE_PORT_HANDLING)
+ port->port_status = RTE_PORT_STOPPED;
+ else
fprintf(stderr,
"Port %d can not be set back to stopped\n", pi);
fprintf(stderr, "Fail to configure port %d hairpin queues\n",
@@ -2544,9 +2543,9 @@ setup_hairpin_queues(portid_t pi, portid_t p_pi, uint16_t cnt_pi)
continue;
/* Fail to setup rx queue, return */
- if (rte_atomic16_cmpset(&(port->port_status),
- RTE_PORT_HANDLING,
- RTE_PORT_STOPPED) == 0)
+ if (port->port_status == RTE_PORT_HANDLING)
+ port->port_status = RTE_PORT_STOPPED;
+ else
fprintf(stderr,
"Port %d can not be set back to stopped\n", pi);
fprintf(stderr, "Fail to configure port %d hairpin queues\n",
@@ -2729,8 +2728,9 @@ start_port(portid_t pid)
need_check_link_status = 0;
port = &ports[pi];
- if (rte_atomic16_cmpset(&(port->port_status), RTE_PORT_STOPPED,
- RTE_PORT_HANDLING) == 0) {
+ if (port->port_status == RTE_PORT_STOPPED)
+ port->port_status = RTE_PORT_HANDLING;
+ else {
fprintf(stderr, "Port %d is now not stopped\n", pi);
continue;
}
@@ -2766,8 +2766,9 @@ start_port(portid_t pid)
nb_txq + nb_hairpinq,
&(port->dev_conf));
if (diag != 0) {
- if (rte_atomic16_cmpset(&(port->port_status),
- RTE_PORT_HANDLING, RTE_PORT_STOPPED) == 0)
+ if (port->port_status == RTE_PORT_HANDLING)
+ port->port_status = RTE_PORT_STOPPED;
+ else
fprintf(stderr,
"Port %d can not be set back to stopped\n",
pi);
@@ -2828,9 +2829,9 @@ start_port(portid_t pid)
continue;
/* Fail to setup tx queue, return */
- if (rte_atomic16_cmpset(&(port->port_status),
- RTE_PORT_HANDLING,
- RTE_PORT_STOPPED) == 0)
+ if (port->port_status == RTE_PORT_HANDLING)
+ port->port_status = RTE_PORT_STOPPED;
+ else
fprintf(stderr,
"Port %d can not be set back to stopped\n",
pi);
@@ -2880,9 +2881,9 @@ start_port(portid_t pid)
continue;
/* Fail to setup rx queue, return */
- if (rte_atomic16_cmpset(&(port->port_status),
- RTE_PORT_HANDLING,
- RTE_PORT_STOPPED) == 0)
+ if (port->port_status == RTE_PORT_HANDLING)
+ port->port_status = RTE_PORT_STOPPED;
+ else
fprintf(stderr,
"Port %d can not be set back to stopped\n",
pi);
@@ -2917,16 +2918,18 @@ start_port(portid_t pid)
pi, rte_strerror(-diag));
/* Fail to setup rx queue, return */
- if (rte_atomic16_cmpset(&(port->port_status),
- RTE_PORT_HANDLING, RTE_PORT_STOPPED) == 0)
+ if (port->port_status == RTE_PORT_HANDLING)
+ port->port_status = RTE_PORT_STOPPED;
+ else
fprintf(stderr,
"Port %d can not be set back to stopped\n",
pi);
continue;
}
- if (rte_atomic16_cmpset(&(port->port_status),
- RTE_PORT_HANDLING, RTE_PORT_STARTED) == 0)
+ if (port->port_status == RTE_PORT_HANDLING)
+ port->port_status = RTE_PORT_STARTED;
+ else
fprintf(stderr, "Port %d can not be set into started\n",
pi);
@@ -3028,8 +3031,9 @@ stop_port(portid_t pid)
}
port = &ports[pi];
- if (rte_atomic16_cmpset(&(port->port_status), RTE_PORT_STARTED,
- RTE_PORT_HANDLING) == 0)
+ if (port->port_status == RTE_PORT_STARTED)
+ port->port_status = RTE_PORT_HANDLING;
+ else
continue;
if (hairpin_mode & 0xf) {
@@ -3055,8 +3059,9 @@ stop_port(portid_t pid)
RTE_LOG(ERR, EAL, "rte_eth_dev_stop failed for port %u\n",
pi);
- if (rte_atomic16_cmpset(&(port->port_status),
- RTE_PORT_HANDLING, RTE_PORT_STOPPED) == 0)
+ if (port->port_status == RTE_PORT_HANDLING)
+ port->port_status = RTE_PORT_STOPPED;
+ else
fprintf(stderr, "Port %d can not be set into stopped\n",
pi);
need_check_link_status = 1;
@@ -3119,8 +3124,7 @@ close_port(portid_t pid)
}
port = &ports[pi];
- if (rte_atomic16_cmpset(&(port->port_status),
- RTE_PORT_CLOSED, RTE_PORT_CLOSED) == 1) {
+ if (port->port_status == RTE_PORT_CLOSED) {
fprintf(stderr, "Port %d is already closed\n", pi);
continue;
}
--
2.25.1
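
The rewritten port_status checks all follow one check-then-set shape. A minimal sketch is below, assuming (as the commit message states) that only one thread ever changes the status; the enum values, struct port and port_transition are hypothetical stand-ins, not the testpmd definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values; the names only mirror the RTE_PORT_* states. */
enum { PORT_STOPPED, PORT_STARTED, PORT_CLOSED, PORT_HANDLING };

struct port {
	uint16_t port_status;
};

/*
 * Non-atomic replacement for a cmpset: move the port from 'from' to 'to'.
 * Returns 1 on success, 0 if the port was not in state 'from'.
 * Safe only when a single thread modifies port_status.
 */
static int
port_transition(struct port *p, uint16_t from, uint16_t to)
{
	assert(p != NULL);
	if (p->port_status != from)
		return 0;
	p->port_status = to;
	return 1;
}
```

With concurrent writers the check and the store could interleave, which is exactly the case the old rte_atomic16_cmpset calls guarded against.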
* [PATCH v2 11/12] app/bbdev: use compiler atomics for shared data sync
2021-11-16 9:41 [PATCH v2 00/12] use compiler atomic builtins for app modules Joyce Kong
` (9 preceding siblings ...)
2021-11-16 9:42 ` [PATCH v2 10/12] app/testpmd: remove atomic operations for port status Joyce Kong
@ 2021-11-16 9:42 ` Joyce Kong
2021-11-16 9:42 ` [PATCH v2 12/12] app: remove unnecessary include of atomic header file Joyce Kong
2021-11-17 8:21 ` [PATCH v3 00/12] use compiler atomic builtins for app modules Joyce Kong
12 siblings, 0 replies; 36+ messages in thread
From: Joyce Kong @ 2021-11-16 9:42 UTC (permalink / raw)
To: Nicolas Chautru; +Cc: dev, honnappa.nagarahalli, nd, Joyce Kong, Ruifeng Wang
Convert rte_atomic usages to compiler atomic built-ins
for shared data sync in bbdev cases.
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
---
app/test-bbdev/test_bbdev_perf.c | 135 ++++++++++++++-----------------
1 file changed, 59 insertions(+), 76 deletions(-)
diff --git a/app/test-bbdev/test_bbdev_perf.c b/app/test-bbdev/test_bbdev_perf.c
index 7b4529789b..0fa119a502 100644
--- a/app/test-bbdev/test_bbdev_perf.c
+++ b/app/test-bbdev/test_bbdev_perf.c
@@ -133,7 +133,7 @@ struct test_op_params {
uint16_t num_to_process;
uint16_t num_lcores;
int vector_mask;
- rte_atomic16_t sync;
+ uint16_t sync;
struct test_buffers q_bufs[RTE_MAX_NUMA_NODES][MAX_QUEUES];
};
@@ -148,9 +148,9 @@ struct thread_params {
uint8_t iter_count;
double iter_average;
double bler;
- rte_atomic16_t nb_dequeued;
- rte_atomic16_t processing_status;
- rte_atomic16_t burst_sz;
+ uint16_t nb_dequeued;
+ int16_t processing_status;
+ uint16_t burst_sz;
struct test_op_params *op_params;
struct rte_bbdev_dec_op *dec_ops[MAX_BURST];
struct rte_bbdev_enc_op *enc_ops[MAX_BURST];
@@ -2637,46 +2637,46 @@ dequeue_event_callback(uint16_t dev_id,
}
if (unlikely(event != RTE_BBDEV_EVENT_DEQUEUE)) {
- rte_atomic16_set(&tp->processing_status, TEST_FAILED);
+ __atomic_store_n(&tp->processing_status, TEST_FAILED, __ATOMIC_RELAXED);
printf(
"Dequeue interrupt handler called for incorrect event!\n");
return;
}
- burst_sz = rte_atomic16_read(&tp->burst_sz);
+ burst_sz = __atomic_load_n(&tp->burst_sz, __ATOMIC_RELAXED);
num_ops = tp->op_params->num_to_process;
if (test_vector.op_type == RTE_BBDEV_OP_TURBO_DEC)
deq = rte_bbdev_dequeue_dec_ops(dev_id, queue_id,
&tp->dec_ops[
- rte_atomic16_read(&tp->nb_dequeued)],
+ __atomic_load_n(&tp->nb_dequeued, __ATOMIC_RELAXED)],
burst_sz);
else if (test_vector.op_type == RTE_BBDEV_OP_LDPC_DEC)
deq = rte_bbdev_dequeue_ldpc_dec_ops(dev_id, queue_id,
&tp->dec_ops[
- rte_atomic16_read(&tp->nb_dequeued)],
+ __atomic_load_n(&tp->nb_dequeued, __ATOMIC_RELAXED)],
burst_sz);
else if (test_vector.op_type == RTE_BBDEV_OP_LDPC_ENC)
deq = rte_bbdev_dequeue_ldpc_enc_ops(dev_id, queue_id,
&tp->enc_ops[
- rte_atomic16_read(&tp->nb_dequeued)],
+ __atomic_load_n(&tp->nb_dequeued, __ATOMIC_RELAXED)],
burst_sz);
else /*RTE_BBDEV_OP_TURBO_ENC*/
deq = rte_bbdev_dequeue_enc_ops(dev_id, queue_id,
&tp->enc_ops[
- rte_atomic16_read(&tp->nb_dequeued)],
+ __atomic_load_n(&tp->nb_dequeued, __ATOMIC_RELAXED)],
burst_sz);
if (deq < burst_sz) {
printf(
"After receiving the interrupt all operations should be dequeued. Expected: %u, got: %u\n",
burst_sz, deq);
- rte_atomic16_set(&tp->processing_status, TEST_FAILED);
+ __atomic_store_n(&tp->processing_status, TEST_FAILED, __ATOMIC_RELAXED);
return;
}
- if (rte_atomic16_read(&tp->nb_dequeued) + deq < num_ops) {
- rte_atomic16_add(&tp->nb_dequeued, deq);
+ if (__atomic_load_n(&tp->nb_dequeued, __ATOMIC_RELAXED) + deq < num_ops) {
+ __atomic_fetch_add(&tp->nb_dequeued, deq, __ATOMIC_RELAXED);
return;
}
@@ -2713,7 +2713,7 @@ dequeue_event_callback(uint16_t dev_id,
if (ret) {
printf("Buffers validation failed\n");
- rte_atomic16_set(&tp->processing_status, TEST_FAILED);
+ __atomic_store_n(&tp->processing_status, TEST_FAILED, __ATOMIC_RELAXED);
}
switch (test_vector.op_type) {
@@ -2734,7 +2734,7 @@ dequeue_event_callback(uint16_t dev_id,
break;
default:
printf("Unknown op type: %d\n", test_vector.op_type);
- rte_atomic16_set(&tp->processing_status, TEST_FAILED);
+ __atomic_store_n(&tp->processing_status, TEST_FAILED, __ATOMIC_RELAXED);
return;
}
@@ -2743,7 +2743,7 @@ dequeue_event_callback(uint16_t dev_id,
tp->mbps += (((double)(num_ops * tb_len_bits)) / 1000000.0) /
((double)total_time / (double)rte_get_tsc_hz());
- rte_atomic16_add(&tp->nb_dequeued, deq);
+ __atomic_fetch_add(&tp->nb_dequeued, deq, __ATOMIC_RELAXED);
}
static int
@@ -2781,11 +2781,10 @@ throughput_intr_lcore_ldpc_dec(void *arg)
bufs = &tp->op_params->q_bufs[GET_SOCKET(info.socket_id)][queue_id];
- rte_atomic16_clear(&tp->processing_status);
- rte_atomic16_clear(&tp->nb_dequeued);
+ __atomic_store_n(&tp->processing_status, 0, __ATOMIC_RELAXED);
+ __atomic_store_n(&tp->nb_dequeued, 0, __ATOMIC_RELAXED);
- while (rte_atomic16_read(&tp->op_params->sync) == SYNC_WAIT)
- rte_pause();
+ rte_wait_until_equal_16(&tp->op_params->sync, SYNC_START, __ATOMIC_RELAXED);
ret = rte_bbdev_dec_op_alloc_bulk(tp->op_params->mp, ops,
num_to_process);
@@ -2833,17 +2832,15 @@ throughput_intr_lcore_ldpc_dec(void *arg)
* the number of operations is not a multiple of
* burst size.
*/
- rte_atomic16_set(&tp->burst_sz, num_to_enq);
+ __atomic_store_n(&tp->burst_sz, num_to_enq, __ATOMIC_RELAXED);
/* Wait until processing of previous batch is
* completed
*/
- while (rte_atomic16_read(&tp->nb_dequeued) !=
- (int16_t) enqueued)
- rte_pause();
+ rte_wait_until_equal_16(&tp->nb_dequeued, enqueued, __ATOMIC_RELAXED);
}
if (j != TEST_REPETITIONS - 1)
- rte_atomic16_clear(&tp->nb_dequeued);
+ __atomic_store_n(&tp->nb_dequeued, 0, __ATOMIC_RELAXED);
}
return TEST_SUCCESS;
@@ -2878,11 +2875,10 @@ throughput_intr_lcore_dec(void *arg)
bufs = &tp->op_params->q_bufs[GET_SOCKET(info.socket_id)][queue_id];
- rte_atomic16_clear(&tp->processing_status);
- rte_atomic16_clear(&tp->nb_dequeued);
+ __atomic_store_n(&tp->processing_status, 0, __ATOMIC_RELAXED);
+ __atomic_store_n(&tp->nb_dequeued, 0, __ATOMIC_RELAXED);
- while (rte_atomic16_read(&tp->op_params->sync) == SYNC_WAIT)
- rte_pause();
+ rte_wait_until_equal_16(&tp->op_params->sync, SYNC_START, __ATOMIC_RELAXED);
ret = rte_bbdev_dec_op_alloc_bulk(tp->op_params->mp, ops,
num_to_process);
@@ -2923,17 +2919,15 @@ throughput_intr_lcore_dec(void *arg)
* the number of operations is not a multiple of
* burst size.
*/
- rte_atomic16_set(&tp->burst_sz, num_to_enq);
+ __atomic_store_n(&tp->burst_sz, num_to_enq, __ATOMIC_RELAXED);
/* Wait until processing of previous batch is
* completed
*/
- while (rte_atomic16_read(&tp->nb_dequeued) !=
- (int16_t) enqueued)
- rte_pause();
+ rte_wait_until_equal_16(&tp->nb_dequeued, enqueued, __ATOMIC_RELAXED);
}
if (j != TEST_REPETITIONS - 1)
- rte_atomic16_clear(&tp->nb_dequeued);
+ __atomic_store_n(&tp->nb_dequeued, 0, __ATOMIC_RELAXED);
}
return TEST_SUCCESS;
@@ -2968,11 +2962,10 @@ throughput_intr_lcore_enc(void *arg)
bufs = &tp->op_params->q_bufs[GET_SOCKET(info.socket_id)][queue_id];
- rte_atomic16_clear(&tp->processing_status);
- rte_atomic16_clear(&tp->nb_dequeued);
+ __atomic_store_n(&tp->processing_status, 0, __ATOMIC_RELAXED);
+ __atomic_store_n(&tp->nb_dequeued, 0, __ATOMIC_RELAXED);
- while (rte_atomic16_read(&tp->op_params->sync) == SYNC_WAIT)
- rte_pause();
+ rte_wait_until_equal_16(&tp->op_params->sync, SYNC_START, __ATOMIC_RELAXED);
ret = rte_bbdev_enc_op_alloc_bulk(tp->op_params->mp, ops,
num_to_process);
@@ -3012,17 +3005,15 @@ throughput_intr_lcore_enc(void *arg)
* the number of operations is not a multiple of
* burst size.
*/
- rte_atomic16_set(&tp->burst_sz, num_to_enq);
+ __atomic_store_n(&tp->burst_sz, num_to_enq, __ATOMIC_RELAXED);
/* Wait until processing of previous batch is
* completed
*/
- while (rte_atomic16_read(&tp->nb_dequeued) !=
- (int16_t) enqueued)
- rte_pause();
+ rte_wait_until_equal_16(&tp->nb_dequeued, enqueued, __ATOMIC_RELAXED);
}
if (j != TEST_REPETITIONS - 1)
- rte_atomic16_clear(&tp->nb_dequeued);
+ __atomic_store_n(&tp->nb_dequeued, 0, __ATOMIC_RELAXED);
}
return TEST_SUCCESS;
@@ -3058,11 +3049,10 @@ throughput_intr_lcore_ldpc_enc(void *arg)
bufs = &tp->op_params->q_bufs[GET_SOCKET(info.socket_id)][queue_id];
- rte_atomic16_clear(&tp->processing_status);
- rte_atomic16_clear(&tp->nb_dequeued);
+ __atomic_store_n(&tp->processing_status, 0, __ATOMIC_RELAXED);
+ __atomic_store_n(&tp->nb_dequeued, 0, __ATOMIC_RELAXED);
- while (rte_atomic16_read(&tp->op_params->sync) == SYNC_WAIT)
- rte_pause();
+ rte_wait_until_equal_16(&tp->op_params->sync, SYNC_START, __ATOMIC_RELAXED);
ret = rte_bbdev_enc_op_alloc_bulk(tp->op_params->mp, ops,
num_to_process);
@@ -3104,17 +3094,15 @@ throughput_intr_lcore_ldpc_enc(void *arg)
* the number of operations is not a multiple of
* burst size.
*/
- rte_atomic16_set(&tp->burst_sz, num_to_enq);
+ __atomic_store_n(&tp->burst_sz, num_to_enq, __ATOMIC_RELAXED);
/* Wait until processing of previous batch is
* completed
*/
- while (rte_atomic16_read(&tp->nb_dequeued) !=
- (int16_t) enqueued)
- rte_pause();
+ rte_wait_until_equal_16(&tp->nb_dequeued, enqueued, __ATOMIC_RELAXED);
}
if (j != TEST_REPETITIONS - 1)
- rte_atomic16_clear(&tp->nb_dequeued);
+ __atomic_store_n(&tp->nb_dequeued, 0, __ATOMIC_RELAXED);
}
return TEST_SUCCESS;
@@ -3148,8 +3136,7 @@ throughput_pmd_lcore_dec(void *arg)
bufs = &tp->op_params->q_bufs[GET_SOCKET(info.socket_id)][queue_id];
- while (rte_atomic16_read(&tp->op_params->sync) == SYNC_WAIT)
- rte_pause();
+ rte_wait_until_equal_16(&tp->op_params->sync, SYNC_START, __ATOMIC_RELAXED);
ret = rte_bbdev_dec_op_alloc_bulk(tp->op_params->mp, ops_enq, num_ops);
TEST_ASSERT_SUCCESS(ret, "Allocation failed for %d ops", num_ops);
@@ -3252,8 +3239,7 @@ bler_pmd_lcore_ldpc_dec(void *arg)
bufs = &tp->op_params->q_bufs[GET_SOCKET(info.socket_id)][queue_id];
- while (rte_atomic16_read(&tp->op_params->sync) == SYNC_WAIT)
- rte_pause();
+ rte_wait_until_equal_16(&tp->op_params->sync, SYNC_START, __ATOMIC_RELAXED);
ret = rte_bbdev_dec_op_alloc_bulk(tp->op_params->mp, ops_enq, num_ops);
TEST_ASSERT_SUCCESS(ret, "Allocation failed for %d ops", num_ops);
@@ -3382,8 +3368,7 @@ throughput_pmd_lcore_ldpc_dec(void *arg)
bufs = &tp->op_params->q_bufs[GET_SOCKET(info.socket_id)][queue_id];
- while (rte_atomic16_read(&tp->op_params->sync) == SYNC_WAIT)
- rte_pause();
+ rte_wait_until_equal_16(&tp->op_params->sync, SYNC_START, __ATOMIC_RELAXED);
ret = rte_bbdev_dec_op_alloc_bulk(tp->op_params->mp, ops_enq, num_ops);
TEST_ASSERT_SUCCESS(ret, "Allocation failed for %d ops", num_ops);
@@ -3499,8 +3484,7 @@ throughput_pmd_lcore_enc(void *arg)
bufs = &tp->op_params->q_bufs[GET_SOCKET(info.socket_id)][queue_id];
- while (rte_atomic16_read(&tp->op_params->sync) == SYNC_WAIT)
- rte_pause();
+ rte_wait_until_equal_16(&tp->op_params->sync, SYNC_START, __ATOMIC_RELAXED);
ret = rte_bbdev_enc_op_alloc_bulk(tp->op_params->mp, ops_enq,
num_ops);
@@ -3590,8 +3574,7 @@ throughput_pmd_lcore_ldpc_enc(void *arg)
bufs = &tp->op_params->q_bufs[GET_SOCKET(info.socket_id)][queue_id];
- while (rte_atomic16_read(&tp->op_params->sync) == SYNC_WAIT)
- rte_pause();
+ rte_wait_until_equal_16(&tp->op_params->sync, SYNC_START, __ATOMIC_RELAXED);
ret = rte_bbdev_enc_op_alloc_bulk(tp->op_params->mp, ops_enq,
num_ops);
@@ -3774,7 +3757,7 @@ bler_test(struct active_device *ad,
else
return TEST_SKIPPED;
- rte_atomic16_set(&op_params->sync, SYNC_WAIT);
+ __atomic_store_n(&op_params->sync, SYNC_WAIT, __ATOMIC_RELAXED);
/* Main core is set at first entry */
t_params[0].dev_id = ad->dev_id;
@@ -3797,7 +3780,7 @@ bler_test(struct active_device *ad,
&t_params[used_cores++], lcore_id);
}
- rte_atomic16_set(&op_params->sync, SYNC_START);
+ __atomic_store_n(&op_params->sync, SYNC_START, __ATOMIC_RELAXED);
ret = bler_function(&t_params[0]);
/* Main core is always used */
@@ -3892,7 +3875,7 @@ throughput_test(struct active_device *ad,
throughput_function = throughput_pmd_lcore_enc;
}
- rte_atomic16_set(&op_params->sync, SYNC_WAIT);
+ __atomic_store_n(&op_params->sync, SYNC_WAIT, __ATOMIC_RELAXED);
/* Main core is set at first entry */
t_params[0].dev_id = ad->dev_id;
@@ -3915,7 +3898,7 @@ throughput_test(struct active_device *ad,
&t_params[used_cores++], lcore_id);
}
- rte_atomic16_set(&op_params->sync, SYNC_START);
+ __atomic_store_n(&op_params->sync, SYNC_START, __ATOMIC_RELAXED);
ret = throughput_function(&t_params[0]);
/* Main core is always used */
@@ -3945,29 +3928,29 @@ throughput_test(struct active_device *ad,
* Wait for main lcore operations.
*/
tp = &t_params[0];
- while ((rte_atomic16_read(&tp->nb_dequeued) <
- op_params->num_to_process) &&
- (rte_atomic16_read(&tp->processing_status) !=
- TEST_FAILED))
+ while ((__atomic_load_n(&tp->nb_dequeued, __ATOMIC_RELAXED) <
+ op_params->num_to_process) &&
+ (__atomic_load_n(&tp->processing_status, __ATOMIC_RELAXED) !=
+ TEST_FAILED))
rte_pause();
tp->ops_per_sec /= TEST_REPETITIONS;
tp->mbps /= TEST_REPETITIONS;
- ret |= (int)rte_atomic16_read(&tp->processing_status);
+ ret |= (int)__atomic_load_n(&tp->processing_status, __ATOMIC_RELAXED);
/* Wait for worker lcores operations */
for (used_cores = 1; used_cores < num_lcores; used_cores++) {
tp = &t_params[used_cores];
- while ((rte_atomic16_read(&tp->nb_dequeued) <
- op_params->num_to_process) &&
- (rte_atomic16_read(&tp->processing_status) !=
- TEST_FAILED))
+ while ((__atomic_load_n(&tp->nb_dequeued, __ATOMIC_RELAXED) <
+ op_params->num_to_process) &&
+ (__atomic_load_n(&tp->processing_status, __ATOMIC_RELAXED) !=
+ TEST_FAILED))
rte_pause();
tp->ops_per_sec /= TEST_REPETITIONS;
tp->mbps /= TEST_REPETITIONS;
- ret |= (int)rte_atomic16_read(&tp->processing_status);
+ ret |= (int)__atomic_load_n(&tp->processing_status, __ATOMIC_RELAXED);
}
/* Print throughput if test passed */
--
2.25.1
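
Two patterns recur in the bbdev changes: the "while (read == SYNC_WAIT) rte_pause()" loops become a wait for an exact value, and the dequeue counter is bumped with __atomic_fetch_add. The sketch below imitates both with compiler built-ins only. wait_until_equal_16 is a simplified stand-in for rte_wait_until_equal_16 (no pause or architecture-specific wait), record_dequeue is a hypothetical helper, and the SYNC_* values are assumed.

```c
#include <assert.h>
#include <stdint.h>

enum { SYNC_WAIT = 0, SYNC_START = 1 }; /* assumed values */

/* Simplified stand-in for rte_wait_until_equal_16(): spin until
 * *addr equals 'expected'. */
static void
wait_until_equal_16(uint16_t *addr, uint16_t expected, int memorder)
{
	assert(memorder == __ATOMIC_RELAXED || memorder == __ATOMIC_ACQUIRE);
	while (__atomic_load_n(addr, memorder) != expected)
		;
}

/* Hypothetical helper: add 'deq' to the counter and return the new total. */
static uint16_t
record_dequeue(uint16_t *nb_dequeued, uint16_t deq)
{
	return __atomic_fetch_add(nb_dequeued, deq, __ATOMIC_RELAXED) + deq;
}
```

Note the subtle change of condition: the old loop left as soon as the value was anything other than SYNC_WAIT, while the new call returns only on SYNC_START. The two agree as long as the flag takes just those two values.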
* [PATCH v2 12/12] app: remove unnecessary include of atomic header file
2021-11-16 9:41 [PATCH v2 00/12] use compiler atomic builtins for app modules Joyce Kong
` (10 preceding siblings ...)
2021-11-16 9:42 ` [PATCH v2 11/12] app/bbdev: use compiler atomics for shared data sync Joyce Kong
@ 2021-11-16 9:42 ` Joyce Kong
2021-11-16 20:23 ` David Marchand
2021-11-17 8:21 ` [PATCH v3 00/12] use compiler atomic builtins for app modules Joyce Kong
12 siblings, 1 reply; 36+ messages in thread
From: Joyce Kong @ 2021-11-16 9:42 UTC (permalink / raw)
To: Maryam Tahhan, Reshma Pattan, Cristian Dumitrescu, Xiaoyun Li,
Olivier Matz, Anatoly Burakov, Honnappa Nagarahalli,
Konstantin Ananyev
Cc: dev, nd, Joyce Kong, Ruifeng Wang
Remove the unnecessary includes of rte_atomic.h from app modules.
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
app/proc-info/main.c | 1 -
app/test-pipeline/config.c | 1 -
app/test-pipeline/init.c | 1 -
app/test-pipeline/main.c | 1 -
app/test-pipeline/runtime.c | 1 -
app/test-pmd/cmdline.c | 1 -
app/test-pmd/config.c | 1 -
app/test-pmd/csumonly.c | 1 -
app/test-pmd/flowgen.c | 1 -
app/test-pmd/icmpecho.c | 1 -
app/test-pmd/iofwd.c | 1 -
app/test-pmd/macfwd.c | 1 -
app/test-pmd/macswap.c | 1 -
app/test-pmd/parameters.c | 1 -
app/test-pmd/rxonly.c | 1 -
app/test-pmd/txonly.c | 1 -
app/test/test_barrier.c | 1 -
app/test/test_mbuf.c | 1 -
app/test/test_mp_secondary.c | 1 -
app/test/test_ring.c | 1 -
20 files changed, 20 deletions(-)
diff --git a/app/proc-info/main.c b/app/proc-info/main.c
index a4271047e6..ebe2d77264 100644
--- a/app/proc-info/main.c
+++ b/app/proc-info/main.c
@@ -27,7 +27,6 @@
#include <rte_per_lcore.h>
#include <rte_lcore.h>
#include <rte_log.h>
-#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_string_fns.h>
#include <rte_metrics.h>
diff --git a/app/test-pipeline/config.c b/app/test-pipeline/config.c
index 33f3f1c827..daf838948b 100644
--- a/app/test-pipeline/config.c
+++ b/app/test-pipeline/config.c
@@ -21,7 +21,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_launch.h>
-#include <rte_atomic.h>
#include <rte_cycles.h>
#include <rte_prefetch.h>
#include <rte_lcore.h>
diff --git a/app/test-pipeline/init.c b/app/test-pipeline/init.c
index c738019041..eee0719b67 100644
--- a/app/test-pipeline/init.c
+++ b/app/test-pipeline/init.c
@@ -21,7 +21,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_launch.h>
-#include <rte_atomic.h>
#include <rte_cycles.h>
#include <rte_prefetch.h>
#include <rte_lcore.h>
diff --git a/app/test-pipeline/main.c b/app/test-pipeline/main.c
index 72e4797ff2..1e16794183 100644
--- a/app/test-pipeline/main.c
+++ b/app/test-pipeline/main.c
@@ -22,7 +22,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_launch.h>
-#include <rte_atomic.h>
#include <rte_cycles.h>
#include <rte_prefetch.h>
#include <rte_lcore.h>
diff --git a/app/test-pipeline/runtime.c b/app/test-pipeline/runtime.c
index 159192bcd8..d939a85d7e 100644
--- a/app/test-pipeline/runtime.c
+++ b/app/test-pipeline/runtime.c
@@ -21,7 +21,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_launch.h>
-#include <rte_atomic.h>
#include <rte_cycles.h>
#include <rte_prefetch.h>
#include <rte_branch_prediction.h>
diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 4f51b259fe..4e93f535ff 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -24,7 +24,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_lcore.h>
-#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_ring.h>
#include <rte_mempool.h>
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 26cadf39f7..d8b5032b58 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -27,7 +27,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_lcore.h>
-#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_mempool.h>
#include <rte_mbuf.h>
diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index 8526d9158a..e0b00abe8c 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -24,7 +24,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_lcore.h>
-#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_mempool.h>
#include <rte_mbuf.h>
diff --git a/app/test-pmd/flowgen.c b/app/test-pmd/flowgen.c
index 5737eaa105..9ceef3b54a 100644
--- a/app/test-pmd/flowgen.c
+++ b/app/test-pmd/flowgen.c
@@ -24,7 +24,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_lcore.h>
-#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_mempool.h>
#include <rte_mbuf.h>
diff --git a/app/test-pmd/icmpecho.c b/app/test-pmd/icmpecho.c
index 8f1d68a83a..3a85ec3dd1 100644
--- a/app/test-pmd/icmpecho.c
+++ b/app/test-pmd/icmpecho.c
@@ -20,7 +20,6 @@
#include <rte_cycles.h>
#include <rte_per_lcore.h>
#include <rte_lcore.h>
-#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_memory.h>
#include <rte_mempool.h>
diff --git a/app/test-pmd/iofwd.c b/app/test-pmd/iofwd.c
index 83d098adcb..19cd920f70 100644
--- a/app/test-pmd/iofwd.c
+++ b/app/test-pmd/iofwd.c
@@ -23,7 +23,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_lcore.h>
-#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_memcpy.h>
#include <rte_mempool.h>
diff --git a/app/test-pmd/macfwd.c b/app/test-pmd/macfwd.c
index ac50d0b9f8..812a0c721f 100644
--- a/app/test-pmd/macfwd.c
+++ b/app/test-pmd/macfwd.c
@@ -24,7 +24,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_lcore.h>
-#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_mempool.h>
#include <rte_mbuf.h>
diff --git a/app/test-pmd/macswap.c b/app/test-pmd/macswap.c
index 310bca06af..4627ff83e9 100644
--- a/app/test-pmd/macswap.c
+++ b/app/test-pmd/macswap.c
@@ -24,7 +24,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_lcore.h>
-#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_mempool.h>
#include <rte_mbuf.h>
diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
index 0974b0a38f..2f4f944efa 100644
--- a/app/test-pmd/parameters.c
+++ b/app/test-pmd/parameters.c
@@ -30,7 +30,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_lcore.h>
-#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_mempool.h>
#include <rte_interrupts.h>
diff --git a/app/test-pmd/rxonly.c b/app/test-pmd/rxonly.c
index c78fc4609a..d1a579d8d8 100644
--- a/app/test-pmd/rxonly.c
+++ b/app/test-pmd/rxonly.c
@@ -24,7 +24,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_lcore.h>
-#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_mempool.h>
#include <rte_mbuf.h>
diff --git a/app/test-pmd/txonly.c b/app/test-pmd/txonly.c
index 34bb538379..b8497e733d 100644
--- a/app/test-pmd/txonly.c
+++ b/app/test-pmd/txonly.c
@@ -24,7 +24,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_lcore.h>
-#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_mempool.h>
#include <rte_mbuf.h>
diff --git a/app/test/test_barrier.c b/app/test/test_barrier.c
index c27f8a0742..898c2516ed 100644
--- a/app/test/test_barrier.c
+++ b/app/test/test_barrier.c
@@ -24,7 +24,6 @@
#include <rte_memory.h>
#include <rte_per_lcore.h>
#include <rte_launch.h>
-#include <rte_atomic.h>
#include <rte_eal.h>
#include <rte_lcore.h>
#include <rte_pause.h>
diff --git a/app/test/test_mbuf.c b/app/test/test_mbuf.c
index f93bcef8a9..d53126710f 100644
--- a/app/test/test_mbuf.c
+++ b/app/test/test_mbuf.c
@@ -21,7 +21,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_lcore.h>
-#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_ring.h>
#include <rte_mempool.h>
diff --git a/app/test/test_mp_secondary.c b/app/test/test_mp_secondary.c
index 5b6f05dbb1..021ca0547f 100644
--- a/app/test/test_mp_secondary.c
+++ b/app/test/test_mp_secondary.c
@@ -28,7 +28,6 @@
#include <rte_lcore.h>
#include <rte_errno.h>
#include <rte_branch_prediction.h>
-#include <rte_atomic.h>
#include <rte_ring.h>
#include <rte_debug.h>
#include <rte_log.h>
diff --git a/app/test/test_ring.c b/app/test/test_ring.c
index fb8532a409..bde33ab4a1 100644
--- a/app/test/test_ring.c
+++ b/app/test/test_ring.c
@@ -20,7 +20,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_lcore.h>
-#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_malloc.h>
#include <rte_ring.h>
--
2.25.1
^ permalink raw reply [flat|nested] 36+ messages in thread
* RE: [PATCH v2 03/12] test/timer: use compiler atomic builtins for sync
2021-11-16 9:41 ` [PATCH v2 03/12] test/timer: use compiler atomic builtins for sync Joyce Kong
@ 2021-11-16 19:52 ` Honnappa Nagarahalli
2021-11-16 20:20 ` David Marchand
1 sibling, 0 replies; 36+ messages in thread
From: Honnappa Nagarahalli @ 2021-11-16 19:52 UTC (permalink / raw)
To: Joyce Kong, Robert Sanford, Erik Gabriel Carrillo
Cc: dev, nd, Joyce Kong, Ruifeng Wang, nd
<snip>
>
> Convert rte_atomic usages to compiler atomic built-ins for lcore_state and
> collisions sync.
>
> Also, move 'main_init_workers' outside of 'timer_stress2_main_loop' to
> guarantee that lcore_state is initialized correctly before the threads are
> launched.
>
> Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> ---
> app/test/test_timer.c | 30 +++++++++++++-----------------
> app/test/test_timer_secondary.c | 1 -
> 2 files changed, 13 insertions(+), 18 deletions(-)
>
> diff --git a/app/test/test_timer.c b/app/test/test_timer.c
> index a10b2fe9da..c97e5c891c 100644
> --- a/app/test/test_timer.c
> +++ b/app/test/test_timer.c
> @@ -102,7 +102,6 @@
> #include <rte_eal.h>
> #include <rte_per_lcore.h>
> #include <rte_lcore.h>
> -#include <rte_atomic.h>
> #include <rte_timer.h>
> #include <rte_random.h>
> #include <rte_malloc.h>
> @@ -203,7 +202,7 @@ timer_stress_main_loop(__rte_unused void *arg)
>
> /* Need to synchronize worker lcores through multiple steps. */
> enum { WORKER_WAITING = 1, WORKER_RUN_SIGNAL, WORKER_RUNNING, WORKER_FINISHED };
> -static rte_atomic16_t lcore_state[RTE_MAX_LCORE];
> +static uint16_t lcore_state[RTE_MAX_LCORE];
>
> static void
> main_init_workers(void)
> @@ -211,7 +210,7 @@ main_init_workers(void)
> unsigned i;
>
> RTE_LCORE_FOREACH_WORKER(i) {
> - rte_atomic16_set(&lcore_state[i], WORKER_WAITING);
> + __atomic_store_n(&lcore_state[i], WORKER_WAITING, __ATOMIC_RELAXED);
> }
> }
>
> @@ -221,11 +220,10 @@ main_start_workers(void)
> unsigned i;
>
> RTE_LCORE_FOREACH_WORKER(i) {
> - rte_atomic16_set(&lcore_state[i], WORKER_RUN_SIGNAL);
> + __atomic_store_n(&lcore_state[i], WORKER_RUN_SIGNAL, __ATOMIC_RELEASE);
> }
> RTE_LCORE_FOREACH_WORKER(i) {
> - while (rte_atomic16_read(&lcore_state[i]) != WORKER_RUNNING)
> - rte_pause();
> + rte_wait_until_equal_16(&lcore_state[i], WORKER_RUNNING, __ATOMIC_ACQUIRE);
> }
> }
>
> @@ -235,8 +233,7 @@ main_wait_for_workers(void)
> unsigned i;
>
> RTE_LCORE_FOREACH_WORKER(i) {
> - while (rte_atomic16_read(&lcore_state[i]) != WORKER_FINISHED)
> - rte_pause();
> + rte_wait_until_equal_16(&lcore_state[i], WORKER_FINISHED, __ATOMIC_ACQUIRE);
> }
> }
>
> @@ -245,9 +242,8 @@ worker_wait_to_start(void)
> {
> unsigned lcore_id = rte_lcore_id();
>
> - while (rte_atomic16_read(&lcore_state[lcore_id]) != WORKER_RUN_SIGNAL)
> - rte_pause();
> - rte_atomic16_set(&lcore_state[lcore_id], WORKER_RUNNING);
> + rte_wait_until_equal_16(&lcore_state[lcore_id], WORKER_RUN_SIGNAL, __ATOMIC_ACQUIRE);
> + __atomic_store_n(&lcore_state[lcore_id], WORKER_RUNNING, __ATOMIC_RELEASE);
> }
>
> static void
> @@ -255,7 +251,7 @@ worker_finish(void)
> {
> unsigned lcore_id = rte_lcore_id();
>
> - rte_atomic16_set(&lcore_state[lcore_id], WORKER_FINISHED);
> + __atomic_store_n(&lcore_state[lcore_id], WORKER_FINISHED, __ATOMIC_RELEASE);
> }
>
>
> @@ -281,13 +277,12 @@ timer_stress2_main_loop(__rte_unused void *arg)
> unsigned int lcore_id = rte_lcore_id();
> unsigned int main_lcore = rte_get_main_lcore();
> int32_t my_collisions = 0;
> - static rte_atomic32_t collisions;
> + static uint32_t collisions;
>
> if (lcore_id == main_lcore) {
> cb_count = 0;
> test_failed = 0;
> - rte_atomic32_set(&collisions, 0);
> - main_init_workers();
> + __atomic_store_n(&collisions, 0, __ATOMIC_RELAXED);
> timers = rte_malloc(NULL, sizeof(*timers) * NB_STRESS2_TIMERS, 0);
> if (timers == NULL) {
> printf("Test Failed\n");
> @@ -315,7 +310,7 @@ timer_stress2_main_loop(__rte_unused void *arg)
> my_collisions++;
> }
> if (my_collisions != 0)
> - rte_atomic32_add(&collisions, my_collisions);
> + __atomic_fetch_add(&collisions, my_collisions, __ATOMIC_RELAXED);
>
> /* wait long enough for timers to expire */
> rte_delay_ms(100);
> @@ -329,7 +324,7 @@ timer_stress2_main_loop(__rte_unused void *arg)
>
> /* now check that we get the right number of callbacks */
> if (lcore_id == main_lcore) {
> - my_collisions = rte_atomic32_read(&collisions);
> + my_collisions = __atomic_load_n(&collisions, __ATOMIC_RELAXED);
> if (my_collisions != 0)
> printf("- %d timer reset collisions (OK)\n", my_collisions);
> rte_timer_manage();
> @@ -573,6 +568,7 @@ test_timer(void)
> /* run a second, slightly different set of stress tests */
> printf("\nStart timer stress tests 2\n");
> test_failed = 0;
> + main_init_workers();
> rte_eal_mp_remote_launch(timer_stress2_main_loop, NULL, CALL_MAIN);
> rte_eal_mp_wait_lcore();
> if (test_failed)
> diff --git a/app/test/test_timer_secondary.c b/app/test/test_timer_secondary.c
> index 16a9f1878b..5795c97f07 100644
> --- a/app/test/test_timer_secondary.c
> +++ b/app/test/test_timer_secondary.c
> @@ -9,7 +9,6 @@
> #include <rte_lcore.h>
> #include <rte_debug.h>
> #include <rte_memzone.h>
> -#include <rte_atomic.h>
> #include <rte_timer.h>
> #include <rte_cycles.h>
> #include <rte_mempool.h>
> --
> 2.25.1
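
For readers following the ordering changes above, here is a minimal sketch of the same release/acquire handshake outside of DPDK. It assumes plain pthreads in place of lcores, and wait_until_equal_16() is a hypothetical stand-in for rte_wait_until_equal_16(). The state machine is trimmed to three states for brevity, and run_handshake(), job_input and job_result are illustrative names that do not appear in the patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Trimmed state machine; the names follow the patch where they overlap. */
enum { WORKER_WAITING = 1, WORKER_RUN_SIGNAL, WORKER_FINISHED };

static uint16_t state;   /* stands in for one lcore_state[] slot */
static int job_input;    /* plain data published by the main thread */
static int job_result;   /* plain data published by the worker */

/* Hypothetical stand-in for rte_wait_until_equal_16(): spin on acquire loads. */
static void wait_until_equal_16(uint16_t *addr, uint16_t expected)
{
    while (__atomic_load_n(addr, __ATOMIC_ACQUIRE) != expected)
        ;
}

static void *worker(void *arg)
{
    (void)arg;
    /* The acquire load pairs with the release store in run_handshake(). */
    wait_until_equal_16(&state, WORKER_RUN_SIGNAL);
    job_result = job_input * 2;
    /* The release store publishes job_result to the waiting main thread. */
    __atomic_store_n(&state, WORKER_FINISHED, __ATOMIC_RELEASE);
    return NULL;
}

int run_handshake(int input)
{
    pthread_t tid;
    int rc;

    /* Initialise the state before the worker thread exists. */
    __atomic_store_n(&state, WORKER_WAITING, __ATOMIC_RELAXED);
    rc = pthread_create(&tid, NULL, worker, NULL);
    assert(rc == 0);
    (void)rc;

    job_input = input;
    /* The release store publishes job_input together with the signal. */
    __atomic_store_n(&state, WORKER_RUN_SIGNAL, __ATOMIC_RELEASE);
    wait_until_equal_16(&state, WORKER_FINISHED);

    pthread_join(tid, NULL);
    return job_result;
}
```

In this sketch the plain variables job_input and job_result need no atomics of their own, because each is written before a release store and read only after the matching acquire load.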
^ permalink raw reply [flat|nested] 36+ messages in thread
* RE: [PATCH v2 09/12] app/compress: use compiler atomic builtins for display sync
2021-11-16 9:42 ` [PATCH v2 09/12] app/compress: " Joyce Kong
@ 2021-11-16 20:15 ` Honnappa Nagarahalli
0 siblings, 0 replies; 36+ messages in thread
From: Honnappa Nagarahalli @ 2021-11-16 20:15 UTC (permalink / raw)
To: Joyce Kong; +Cc: dev, nd, Joyce Kong, Ruifeng Wang, nd
<snip>
>
> Convert rte_atomic_test_and_set usage to compiler atomic CAS operation for
> display sync.
>
> Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> ---
> app/test-compress-perf/comp_perf_test_common.h | 2 +-
> .../comp_perf_test_cyclecount.c | 15 +++++++--------
> .../comp_perf_test_throughput.c | 10 +++++++---
> app/test-compress-perf/comp_perf_test_verify.c | 6 ++++--
> 4 files changed, 19 insertions(+), 14 deletions(-)
>
> diff --git a/app/test-compress-perf/comp_perf_test_common.h b/app/test-compress-perf/comp_perf_test_common.h
> index 72705c6a2b..d039e5a29a 100644
> --- a/app/test-compress-perf/comp_perf_test_common.h
> +++ b/app/test-compress-perf/comp_perf_test_common.h
> @@ -14,7 +14,7 @@ struct cperf_mem_resources {
> uint16_t qp_id;
> uint8_t lcore_id;
>
> - rte_atomic16_t print_info_once;
> + uint16_t print_info_once;
>
> uint32_t total_bufs;
> uint8_t *compressed_data;
> diff --git a/app/test-compress-perf/comp_perf_test_cyclecount.c b/app/test-compress-perf/comp_perf_test_cyclecount.c
> index c875ddbdac..da55b02b74 100644
> --- a/app/test-compress-perf/comp_perf_test_cyclecount.c
> +++ b/app/test-compress-perf/comp_perf_test_cyclecount.c
> @@ -466,7 +466,7 @@ cperf_cyclecount_test_runner(void *test_ctx)
> struct cperf_cyclecount_ctx *ctx = test_ctx;
> struct comp_test_data *test_data = ctx->ver.options;
> uint32_t lcore = rte_lcore_id();
> - static rte_atomic16_t display_once = RTE_ATOMIC16_INIT(0);
> + static uint16_t display_once;
> static rte_spinlock_t print_spinlock;
> int i;
>
> @@ -486,10 +486,12 @@ cperf_cyclecount_test_runner(void *test_ctx)
>
> ctx->ver.mem.lcore_id = lcore;
>
> + uint16_t exp = 0;
> /*
> * printing information about current compression thread
> */
> - if (rte_atomic16_test_and_set(&ctx->ver.mem.print_info_once))
> + if (__atomic_compare_exchange_n(&ctx->ver.mem.print_info_once, &exp,
> + 1, 0, __ATOMIC_RELAXED, __ATOMIC_RELAXED))
> printf(" lcore: %u,"
> " driver name: %s,"
> " device name: %s,"
> @@ -546,9 +548,10 @@ cperf_cyclecount_test_runner(void *test_ctx)
> (ctx->ver.mem.total_bufs * test_data->num_iter);
>
> /* R E P O R T processing */
> - if (rte_atomic16_test_and_set(&display_once)) {
> + rte_spinlock_lock(&print_spinlock);
>
> - rte_spinlock_lock(&print_spinlock);
> + if (display_once == 0) {
> + display_once = 1;
>
> printf("\nLegend for the table\n"
> " - Retries section: number of retries for the following operations:\n"
> @@ -576,12 +579,8 @@ cperf_cyclecount_test_runner(void *test_ctx)
> "setup/op",
> "[C-e]", "[C-d]",
> "[D-e]", "[D-d]");
> -
> - rte_spinlock_unlock(&print_spinlock);
> }
>
> - rte_spinlock_lock(&print_spinlock);
> -
> printf("%12u"
> "%6u"
> "%12zu"
> diff --git a/app/test-compress-perf/comp_perf_test_throughput.c b/app/test-compress-perf/comp_perf_test_throughput.c
> index 13922b658c..d3dff070b0 100644
> --- a/app/test-compress-perf/comp_perf_test_throughput.c
> +++ b/app/test-compress-perf/comp_perf_test_throughput.c
> @@ -329,15 +329,17 @@ cperf_throughput_test_runner(void *test_ctx)
> struct cperf_benchmark_ctx *ctx = test_ctx;
> struct comp_test_data *test_data = ctx->ver.options;
> uint32_t lcore = rte_lcore_id();
> - static rte_atomic16_t display_once = RTE_ATOMIC16_INIT(0);
> + static uint16_t display_once;
> int i, ret = EXIT_SUCCESS;
>
> ctx->ver.mem.lcore_id = lcore;
>
> + uint16_t exp = 0;
> /*
> * printing information about current compression thread
> */
> - if (rte_atomic16_test_and_set(&ctx->ver.mem.print_info_once))
> + if (__atomic_compare_exchange_n(&ctx->ver.mem.print_info_once, &exp,
> + 1, 0, __ATOMIC_RELAXED, __ATOMIC_RELAXED))
> printf(" lcore: %u,"
> " driver name: %s,"
> " device name: %s,"
> @@ -391,7 +393,9 @@ cperf_throughput_test_runner(void *test_ctx)
> ctx->decomp_gbps = rte_get_tsc_hz() / ctx->decomp_tsc_byte * 8 /
> 1000000000;
>
> - if (rte_atomic16_test_and_set(&display_once)) {
> + exp = 0;
> + if (__atomic_compare_exchange_n(&display_once, &exp, 1, 0,
> + __ATOMIC_RELAXED, __ATOMIC_RELAXED)) {
> printf("\n%12s%6s%12s%17s%15s%16s\n",
> "lcore id", "Level", "Comp size", "Comp ratio [%]",
> "Comp [Gbps]", "Decomp [Gbps]");
> diff --git a/app/test-compress-perf/comp_perf_test_verify.c b/app/test-compress-perf/comp_perf_test_verify.c
> index 5e13257b79..f6e21368e8 100644
> --- a/app/test-compress-perf/comp_perf_test_verify.c
> +++ b/app/test-compress-perf/comp_perf_test_verify.c
> @@ -388,7 +388,7 @@ cperf_verify_test_runner(void *test_ctx)
> struct cperf_verify_ctx *ctx = test_ctx;
> struct comp_test_data *test_data = ctx->options;
> int ret = EXIT_SUCCESS;
> - static rte_atomic16_t display_once = RTE_ATOMIC16_INIT(0);
> + static uint16_t display_once;
> uint32_t lcore = rte_lcore_id();
>
> ctx->mem.lcore_id = lcore;
> @@ -427,8 +427,10 @@ cperf_verify_test_runner(void *test_ctx)
> ctx->ratio = (double) ctx->comp_data_sz /
> test_data->input_data_sz * 100;
>
> + uint16_t exp = 0;
> if (!ctx->silent) {
> - if (rte_atomic16_test_and_set(&display_once)) {
> + if (__atomic_compare_exchange_n(&display_once, &exp, 1, 0,
> + __ATOMIC_RELAXED, __ATOMIC_RELAXED)) {
> printf("%12s%6s%12s%17s\n",
> "lcore id", "Level", "Comp size", "Comp ratio [%]");
> }
> --
> 2.25.1
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2 03/12] test/timer: use compiler atomic builtins for sync
2021-11-16 9:41 ` [PATCH v2 03/12] test/timer: use compiler atomic builtins for sync Joyce Kong
2021-11-16 19:52 ` Honnappa Nagarahalli
@ 2021-11-16 20:20 ` David Marchand
2021-11-16 21:21 ` Honnappa Nagarahalli
1 sibling, 1 reply; 36+ messages in thread
From: David Marchand @ 2021-11-16 20:20 UTC (permalink / raw)
To: Joyce Kong, Honnappa Nagarahalli
Cc: Robert Sanford, Erik Gabriel Carrillo, dev, nd, Ruifeng Wang
Joyce, Honnappa,
On Tue, Nov 16, 2021 at 10:43 AM Joyce Kong <joyce.kong@arm.com> wrote:
>
> Convert rte_atomic usages to compiler atomic
> built-ins for lcore_state and collisions sync.
>
> Also, move 'main_init_workers' outside of
> 'timer_stress2_main_loop' to guarantee that lcore_state
> is initialized correctly before the threads are launched.
Is this "also" part actually related to the change?
Or is it a separate fix?
>
> Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
--
David Marchand
^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v2 12/12] app: remove unnecessary include of atomic header file
2021-11-16 9:42 ` [PATCH v2 12/12] app: remove unnecessary include of atomic header file Joyce Kong
@ 2021-11-16 20:23 ` David Marchand
2021-11-17 7:05 ` Joyce Kong
0 siblings, 1 reply; 36+ messages in thread
From: David Marchand @ 2021-11-16 20:23 UTC (permalink / raw)
To: Joyce Kong
Cc: Maryam Tahhan, Reshma Pattan, Cristian Dumitrescu, Xiaoyun Li,
Olivier Matz, Anatoly Burakov, Honnappa Nagarahalli,
Konstantin Ananyev, dev, nd, Ruifeng Wang
On Tue, Nov 16, 2021 at 10:44 AM Joyce Kong <joyce.kong@arm.com> wrote:
>
> Remove the unnecessary rte_atomic.h included in app modules.
>
> Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
After patch, I still see:
$ git grep rte_atomic.h app/
app/test/commands.c:#include <rte_atomic.h>
app/test/test_atomic.c:#include <rte_atomic.h>
app/test/test_event_timer_adapter.c:#include <rte_atomic.h>
I can understand why test_atomic would depend on rte_atomic.h :-)
but not the rest.
Is there a reason, or were they just missed?
--
David Marchand
^ permalink raw reply [flat|nested] 36+ messages in thread
* RE: [PATCH v2 03/12] test/timer: use compiler atomic builtins for sync
2021-11-16 20:20 ` David Marchand
@ 2021-11-16 21:21 ` Honnappa Nagarahalli
2021-11-17 9:29 ` David Marchand
0 siblings, 1 reply; 36+ messages in thread
From: Honnappa Nagarahalli @ 2021-11-16 21:21 UTC (permalink / raw)
To: David Marchand, Joyce Kong
Cc: Robert Sanford, Erik Gabriel Carrillo, dev, nd, Ruifeng Wang, nd
<snip>
>
> Joyce, Honnappa,
>
> On Tue, Nov 16, 2021 at 10:43 AM Joyce Kong <joyce.kong@arm.com> wrote:
> >
> > Convert rte_atomic usages to compiler atomic built-ins for lcore_state
> > and collisions sync.
> >
> > Also, move 'main_init_workers' outside of 'timer_stress2_main_loop' to
> > guarantee that lcore_state is initialized correctly before the threads
> > are launched.
>
> Is this "also" part actually related to the change?
> Or is it a separate fix?
The 'also' part does not fix a separate problem (i.e. the earlier code did not have any issues). It just helps to keep the code simple.
>
>
> >
> > Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
>
>
>
> --
> David Marchand
^ permalink raw reply [flat|nested] 36+ messages in thread
* RE: [PATCH v2 01/12] test/pmd_perf: use compiler atomic builtins for polling sync
2021-11-16 9:41 ` [PATCH v2 01/12] test/pmd_perf: use compiler atomic builtins for polling sync Joyce Kong
@ 2021-11-16 21:30 ` Honnappa Nagarahalli
0 siblings, 0 replies; 36+ messages in thread
From: Honnappa Nagarahalli @ 2021-11-16 21:30 UTC (permalink / raw)
To: Joyce Kong; +Cc: dev, nd, Joyce Kong, Ruifeng Wang, nd
<snip>
>
> Convert rte_atomic usages to compiler atomic built-ins for polling sync in
> pmd_perf test cases.
>
> Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> ---
> app/test/test_pmd_perf.c | 14 ++++++--------
> 1 file changed, 6 insertions(+), 8 deletions(-)
>
> diff --git a/app/test/test_pmd_perf.c b/app/test/test_pmd_perf.c
> index 1df86ce080..546384a50d 100644
> --- a/app/test/test_pmd_perf.c
> +++ b/app/test/test_pmd_perf.c
> @@ -10,7 +10,6 @@
> #include <rte_cycles.h>
> #include <rte_ethdev.h>
> #include <rte_byteorder.h>
> -#include <rte_atomic.h>
> #include <rte_malloc.h>
> #include "packet_burst_generator.h"
> #include "test.h"
> @@ -525,7 +524,7 @@ main_loop(__rte_unused void *args)
> return 0;
> }
>
> -static rte_atomic64_t start;
> +static uint64_t start;
>
> static inline int
> poll_burst(void *args)
> @@ -563,8 +562,7 @@ poll_burst(void *args)
> num[portid] = pkt_per_port;
> }
>
> - while (!rte_atomic64_read(&start))
> - ;
> + rte_wait_until_equal_64(&start, 1, __ATOMIC_ACQUIRE);
>
> cur_tsc = rte_rdtsc();
> while (total) {
> @@ -616,15 +614,15 @@ exec_burst(uint32_t flags, int lcore)
> pkt_per_port = MAX_TRAFFIC_BURST;
> num = pkt_per_port * conf->nb_ports;
>
> - rte_atomic64_init(&start);
> -
> /* start polling thread, but not actually poll yet */
> rte_eal_remote_launch(poll_burst,
> (void *)&pkt_per_port, lcore);
>
> /* Only when polling first */
> if (flags == SC_BURST_POLL_FIRST)
> - rte_atomic64_set(&start, 1);
> + __atomic_store_n(&start, 1, __ATOMIC_RELAXED);
> + else
> + __atomic_store_n(&start, 0, __ATOMIC_RELAXED);
These lines need to be moved up, before the call to rte_eal_remote_launch, so that the update to 'start' is visible to the worker threads.
>
> /* start xmit */
> i = 0;
> @@ -641,7 +639,7 @@ exec_burst(uint32_t flags, int lcore)
>
> /* only when polling second */
> if (flags == SC_BURST_XMIT_FIRST)
> - rte_atomic64_set(&start, 1);
> + __atomic_store_n(&start, 1, __ATOMIC_RELEASE);
>
> /* wait for polling finished */
> diff_tsc = rte_eal_wait_lcore(lcore);
> --
> 2.25.1
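
To make the ordering point above concrete, here is a minimal sketch assuming plain pthreads instead of rte_eal_remote_launch. It relies on pthread_create() ordering all earlier stores before the new thread starts; whether the EAL launch call gives the same guarantee is not shown here. run_burst() and tx_packets are hypothetical names, and poll_first is a simplified stand-in for the SC_BURST_POLL_FIRST / SC_BURST_XMIT_FIRST flags.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

static uint64_t start;        /* start flag polled by the worker */
static uint32_t tx_packets;   /* packets "transmitted" by the launcher */

/* Worker: wait for the start flag, then report how many packets it can see. */
static void *poll_burst(void *arg)
{
    uint32_t *seen = arg;

    /* Stand-in for rte_wait_until_equal_64(&start, 1, __ATOMIC_ACQUIRE). */
    while (__atomic_load_n(&start, __ATOMIC_ACQUIRE) != 1)
        ;
    *seen = __atomic_load_n(&tx_packets, __ATOMIC_RELAXED);
    return NULL;
}

/* poll_first != 0 models "poll first"; poll_first == 0 models "xmit first". */
uint32_t run_burst(int poll_first)
{
    pthread_t tid;
    uint32_t seen = 0;
    int rc;

    /*
     * Set the initial values BEFORE the worker exists. Relaxed stores are
     * enough here because thread creation orders them before the worker runs.
     */
    __atomic_store_n(&tx_packets, 0, __ATOMIC_RELAXED);
    __atomic_store_n(&start, poll_first ? 1 : 0, __ATOMIC_RELAXED);

    rc = pthread_create(&tid, NULL, poll_burst, &seen);
    assert(rc == 0);
    (void)rc;

    /* "start xmit" */
    __atomic_store_n(&tx_packets, 32, __ATOMIC_RELAXED);

    /* Xmit first: release the worker only after the transmit is done. */
    if (!poll_first)
        __atomic_store_n(&start, 1, __ATOMIC_RELEASE);

    pthread_join(tid, NULL);
    return seen;
}
```

Because start is reset before every launch, a second run cannot begin polling on a stale 1 left over from the previous run. In the xmit-first case the worker is guaranteed to observe all 32 packets, while in the poll-first case it may observe 0 or 32.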
^ permalink raw reply [flat|nested] 36+ messages in thread
* RE: [PATCH v2 10/12] app/testpmd: remove atomic operations for port status
2021-11-16 9:42 ` [PATCH v2 10/12] app/testpmd: remove atomic operations for port status Joyce Kong
@ 2021-11-16 21:34 ` Honnappa Nagarahalli
0 siblings, 0 replies; 36+ messages in thread
From: Honnappa Nagarahalli @ 2021-11-16 21:34 UTC (permalink / raw)
To: Joyce Kong, Xiaoyun Li; +Cc: dev, nd, Joyce Kong, Ruifeng Wang, nd
<snip>
>
> The port_status changes do not need to be handled atomically, as they are
> made during initialization or through the testpmd prompt rather than by
> multiple threads.
>
> Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> ---
> app/test-pmd/testpmd.c | 58 ++++++++++++++++++++++--------------------
> 1 file changed, 31 insertions(+), 27 deletions(-)
>
> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
> index a66dfb297c..ed472cacd2 100644
> --- a/app/test-pmd/testpmd.c
> +++ b/app/test-pmd/testpmd.c
> @@ -36,7 +36,6 @@
> #include <rte_alarm.h>
> #include <rte_per_lcore.h>
> #include <rte_lcore.h>
> -#include <rte_atomic.h>
> #include <rte_branch_prediction.h>
> #include <rte_mempool.h>
> #include <rte_malloc.h>
> @@ -2521,9 +2520,9 @@ setup_hairpin_queues(portid_t pi, portid_t p_pi, uint16_t cnt_pi)
> continue;
>
> /* Fail to setup rx queue, return */
> - if (rte_atomic16_cmpset(&(port->port_status),
> - RTE_PORT_HANDLING,
> - RTE_PORT_STOPPED) == 0)
> + if (port->port_status == RTE_PORT_HANDLING)
> + port->port_status = RTE_PORT_STOPPED;
> + else
> fprintf(stderr,
> "Port %d can not be set back to stopped\n",
> pi);
> fprintf(stderr, "Fail to configure port %d hairpin queues\n",
> @@ -2544,9 +2543,9 @@ setup_hairpin_queues(portid_t pi, portid_t p_pi, uint16_t cnt_pi)
> continue;
>
> /* Fail to setup rx queue, return */
> - if (rte_atomic16_cmpset(&(port->port_status),
> - RTE_PORT_HANDLING,
> - RTE_PORT_STOPPED) == 0)
> + if (port->port_status == RTE_PORT_HANDLING)
> + port->port_status = RTE_PORT_STOPPED;
> + else
> fprintf(stderr,
> "Port %d can not be set back to stopped\n",
> pi);
> fprintf(stderr, "Fail to configure port %d hairpin queues\n",
> @@ -2729,8 +2728,9 @@ start_port(portid_t pid)
>
> need_check_link_status = 0;
> port = &ports[pi];
> - if (rte_atomic16_cmpset(&(port->port_status), RTE_PORT_STOPPED,
> - RTE_PORT_HANDLING) == 0) {
> + if (port->port_status == RTE_PORT_STOPPED)
> + port->port_status = RTE_PORT_HANDLING;
> + else {
> fprintf(stderr, "Port %d is now not stopped\n", pi);
> continue;
> }
> @@ -2766,8 +2766,9 @@ start_port(portid_t pid)
> nb_txq + nb_hairpinq,
> &(port->dev_conf));
> if (diag != 0) {
> - if (rte_atomic16_cmpset(&(port->port_status),
> - RTE_PORT_HANDLING, RTE_PORT_STOPPED) == 0)
> + if (port->port_status == RTE_PORT_HANDLING)
> + port->port_status = RTE_PORT_STOPPED;
> + else
> fprintf(stderr,
> "Port %d can not be set back to stopped\n",
> pi);
> @@ -2828,9 +2829,9 @@ start_port(portid_t pid)
> continue;
>
> /* Fail to setup tx queue, return */
> - if (rte_atomic16_cmpset(&(port->port_status),
> - RTE_PORT_HANDLING,
> - RTE_PORT_STOPPED) == 0)
> + if (port->port_status == RTE_PORT_HANDLING)
> + port->port_status = RTE_PORT_STOPPED;
> + else
> fprintf(stderr,
> "Port %d can not be set back to stopped\n",
> pi);
> @@ -2880,9 +2881,9 @@ start_port(portid_t pid)
> continue;
>
> /* Fail to setup rx queue, return */
> - if (rte_atomic16_cmpset(&(port->port_status),
> - RTE_PORT_HANDLING,
> - RTE_PORT_STOPPED) == 0)
> + if (port->port_status == RTE_PORT_HANDLING)
> + port->port_status = RTE_PORT_STOPPED;
> + else
> fprintf(stderr,
> "Port %d can not be set back to stopped\n",
> pi);
> @@ -2917,16 +2918,18 @@ start_port(portid_t pid)
> pi, rte_strerror(-diag));
>
> /* Fail to setup rx queue, return */
> - if (rte_atomic16_cmpset(&(port->port_status),
> - RTE_PORT_HANDLING, RTE_PORT_STOPPED) == 0)
> + if (port->port_status == RTE_PORT_HANDLING)
> + port->port_status = RTE_PORT_STOPPED;
> + else
> fprintf(stderr,
> "Port %d can not be set back to stopped\n",
> pi);
> continue;
> }
>
> - if (rte_atomic16_cmpset(&(port->port_status),
> - RTE_PORT_HANDLING, RTE_PORT_STARTED) == 0)
> + if (port->port_status == RTE_PORT_HANDLING)
> + port->port_status = RTE_PORT_STARTED;
> + else
> fprintf(stderr, "Port %d can not be set into started\n",
> pi);
>
> @@ -3028,8 +3031,9 @@ stop_port(portid_t pid)
> }
>
> port = &ports[pi];
> - if (rte_atomic16_cmpset(&(port->port_status), RTE_PORT_STARTED,
> - RTE_PORT_HANDLING) == 0)
> + if (port->port_status == RTE_PORT_STARTED)
> + port->port_status = RTE_PORT_HANDLING;
> + else
> continue;
>
> if (hairpin_mode & 0xf) {
> @@ -3055,8 +3059,9 @@ stop_port(portid_t pid)
> RTE_LOG(ERR, EAL, "rte_eth_dev_stop failed for port %u\n",
> pi);
>
> - if (rte_atomic16_cmpset(&(port->port_status),
> - RTE_PORT_HANDLING, RTE_PORT_STOPPED) == 0)
> + if (port->port_status == RTE_PORT_HANDLING)
> + port->port_status = RTE_PORT_STOPPED;
> + else
> fprintf(stderr, "Port %d can not be set into stopped\n",
> pi);
> need_check_link_status = 1;
> @@ -3119,8 +3124,7 @@ close_port(portid_t pid)
> }
>
> port = &ports[pi];
> - if (rte_atomic16_cmpset(&(port->port_status),
> - RTE_PORT_CLOSED, RTE_PORT_CLOSED) == 1) {
> + if (port->port_status == RTE_PORT_CLOSED) {
> fprintf(stderr, "Port %d is already closed\n", pi);
> continue;
> }
> --
> 2.25.1
^ permalink raw reply [flat|nested] 36+ messages in thread
* RE: [PATCH v2 12/12] app: remove unnecessary include of atomic header file
2021-11-16 20:23 ` David Marchand
@ 2021-11-17 7:05 ` Joyce Kong
0 siblings, 0 replies; 36+ messages in thread
From: Joyce Kong @ 2021-11-17 7:05 UTC (permalink / raw)
To: David Marchand
Cc: Maryam Tahhan, Reshma Pattan, Cristian Dumitrescu, Xiaoyun Li,
Olivier Matz, Anatoly Burakov, Honnappa Nagarahalli,
Konstantin Ananyev, dev, nd, Ruifeng Wang
<snip>
> Subject: Re: [PATCH v2 12/12] app: remove unnecessary include of atomic
> header file
>
> On Tue, Nov 16, 2021 at 10:44 AM Joyce Kong <joyce.kong@arm.com> wrote:
> >
> > Remove the unnecessary rte_atomic.h included in app modules.
> >
> > Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> > Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
>
> After patch, I still see:
>
> $ git grep rte_atomic.h app/
> app/test/commands.c:#include <rte_atomic.h>
> app/test/test_atomic.c:#include <rte_atomic.h>
> app/test/test_event_timer_adapter.c:#include <rte_atomic.h>
>
> I can understand why test_atomic would depend on rte_atomic.h :-) but
> not the rest.
> Is there a reason, or were they just missed?
>
> --
> David Marchand
Hi David, I checked the rest and they were simply missed. Thanks for the reminder; I will update this in v3.
Joyce
^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCH v3 00/12] use compiler atomic builtins for app modules
2021-11-16 9:41 [PATCH v2 00/12] use compiler atomic builtins for app modules Joyce Kong
` (11 preceding siblings ...)
2021-11-16 9:42 ` [PATCH v2 12/12] app: remove unnecessary include of atomic header file Joyce Kong
@ 2021-11-17 8:21 ` Joyce Kong
2021-11-17 8:21 ` [PATCH v3 01/12] test/pmd_perf: use compiler atomic builtins for polling sync Joyce Kong
` (12 more replies)
12 siblings, 13 replies; 36+ messages in thread
From: Joyce Kong @ 2021-11-17 8:21 UTC (permalink / raw)
Cc: dev, honnappa.nagarahalli, nd, Joyce Kong
Since the C11 memory model has been adopted in DPDK[1],
change rte_atomicNN_xxx APIs to compiler atomic built-ins
in app modules[2].
[1] https://www.dpdk.org/blog/2021/03/26/dpdk-adopts-the-c11-memory-model/
[2] https://doc.dpdk.org/guides/rel_notes/deprecation.html
v3:
1. In the pmd_perf test case, move the initialization of the polling
start flag before the call to rte_eal_remote_launch, so the update
is visible to the worker threads. (Honnappa Nagarahalli)
2. Remove the remaining rte_atomic.h includes that were missed in v2.
(David Marchand)
v2:
By Honnappa Nagarahalli:
1. Replace the RELAXED memory orderings with suitable ones for shared
data sync in pmd_perf and timer test cases.
2. Avoid unnecessary atomic operations in compress and testpmd
modules.
3. Fix some typos.
Joyce Kong (12):
test/pmd_perf: use compiler atomic builtins for polling sync
test/ring_perf: use compiler atomic builtins for lcores sync
test/timer: use compiler atomic builtins for sync
test/stack_perf: use compiler atomics for lcore sync
test/bpf: use compiler atomics for calculation
test/func_reentrancy: use compiler atomics for data sync
app/eventdev: use compiler atomics for shared data sync
app/crypto: use compiler atomic builtins for display sync
app/compress: use compiler atomic builtins for display sync
app/testpmd: remove atomic operations for port status
app/bbdev: use compiler atomics for shared data sync
app: remove unnecessary include of atomic header file
app/proc-info/main.c | 1 -
app/test-bbdev/test_bbdev_perf.c | 135 ++++++++----------
.../comp_perf_test_common.h | 2 +-
.../comp_perf_test_cyclecount.c | 15 +-
.../comp_perf_test_throughput.c | 10 +-
.../comp_perf_test_verify.c | 6 +-
app/test-crypto-perf/cperf_test_latency.c | 6 +-
.../cperf_test_pmd_cyclecount.c | 9 +-
app/test-crypto-perf/cperf_test_throughput.c | 9 +-
app/test-crypto-perf/cperf_test_verify.c | 9 +-
app/test-eventdev/evt_main.c | 1 -
app/test-eventdev/test_order_atq.c | 4 +-
app/test-eventdev/test_order_common.c | 4 +-
app/test-eventdev/test_order_common.h | 8 +-
app/test-eventdev/test_order_queue.c | 4 +-
app/test-pipeline/config.c | 1 -
app/test-pipeline/init.c | 1 -
app/test-pipeline/main.c | 1 -
app/test-pipeline/runtime.c | 1 -
app/test-pmd/cmdline.c | 1 -
app/test-pmd/config.c | 1 -
app/test-pmd/csumonly.c | 1 -
app/test-pmd/flowgen.c | 1 -
app/test-pmd/icmpecho.c | 1 -
app/test-pmd/iofwd.c | 1 -
app/test-pmd/macfwd.c | 1 -
app/test-pmd/macswap.c | 1 -
app/test-pmd/parameters.c | 1 -
app/test-pmd/rxonly.c | 1 -
app/test-pmd/testpmd.c | 58 ++++----
app/test-pmd/txonly.c | 1 -
app/test/commands.c | 1 -
app/test/test_barrier.c | 1 -
app/test/test_bpf.c | 28 ++--
app/test/test_event_timer_adapter.c | 1 -
app/test/test_func_reentrancy.c | 27 ++--
app/test/test_mbuf.c | 1 -
app/test/test_mp_secondary.c | 1 -
app/test/test_pmd_perf.c | 23 +--
app/test/test_ring.c | 1 -
app/test/test_ring_perf.c | 9 +-
app/test/test_stack_perf.c | 14 +-
app/test/test_timer.c | 30 ++--
app/test/test_timer_secondary.c | 1 -
44 files changed, 203 insertions(+), 231 deletions(-)
--
2.25.1
^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCH v3 01/12] test/pmd_perf: use compiler atomic builtins for polling sync
2021-11-17 8:21 ` [PATCH v3 00/12] use compiler atomic builtins for app modules Joyce Kong
@ 2021-11-17 8:21 ` Joyce Kong
2021-11-17 8:21 ` [PATCH v3 02/12] test/ring_perf: use compiler atomic builtins for lcores sync Joyce Kong
` (11 subsequent siblings)
12 siblings, 0 replies; 36+ messages in thread
From: Joyce Kong @ 2021-11-17 8:21 UTC (permalink / raw)
Cc: dev, honnappa.nagarahalli, nd, Joyce Kong, Ruifeng Wang
Convert rte_atomic usages to compiler atomic built-ins
for polling sync in pmd_perf test cases.
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
app/test/test_pmd_perf.c | 23 ++++++++++++-----------
1 file changed, 12 insertions(+), 11 deletions(-)
diff --git a/app/test/test_pmd_perf.c b/app/test/test_pmd_perf.c
index 1df86ce080..a6bac9d45e 100644
--- a/app/test/test_pmd_perf.c
+++ b/app/test/test_pmd_perf.c
@@ -10,7 +10,6 @@
#include <rte_cycles.h>
#include <rte_ethdev.h>
#include <rte_byteorder.h>
-#include <rte_atomic.h>
#include <rte_malloc.h>
#include "packet_burst_generator.h"
#include "test.h"
@@ -525,7 +524,7 @@ main_loop(__rte_unused void *args)
return 0;
}
-static rte_atomic64_t start;
+static uint64_t start;
static inline int
poll_burst(void *args)
@@ -563,8 +562,7 @@ poll_burst(void *args)
num[portid] = pkt_per_port;
}
- while (!rte_atomic64_read(&start))
- ;
+ rte_wait_until_equal_64(&start, 1, __ATOMIC_ACQUIRE);
cur_tsc = rte_rdtsc();
while (total) {
@@ -616,16 +614,19 @@ exec_burst(uint32_t flags, int lcore)
pkt_per_port = MAX_TRAFFIC_BURST;
num = pkt_per_port * conf->nb_ports;
- rte_atomic64_init(&start);
+ /* only when polling first */
+ if (flags == SC_BURST_POLL_FIRST)
+ __atomic_store_n(&start, 1, __ATOMIC_RELAXED);
+ else
+ __atomic_store_n(&start, 0, __ATOMIC_RELAXED);
- /* start polling thread, but not actually poll yet */
+ /* start polling thread
+ * if in POLL_FIRST mode, poll once launched;
+ * otherwise, not actually poll yet
+ */
rte_eal_remote_launch(poll_burst,
(void *)&pkt_per_port, lcore);
- /* Only when polling first */
- if (flags == SC_BURST_POLL_FIRST)
- rte_atomic64_set(&start, 1);
-
/* start xmit */
i = 0;
while (num) {
@@ -641,7 +642,7 @@ exec_burst(uint32_t flags, int lcore)
/* only when polling second */
if (flags == SC_BURST_XMIT_FIRST)
- rte_atomic64_set(&start, 1);
+ __atomic_store_n(&start, 1, __ATOMIC_RELEASE);
/* wait for polling finished */
diff_tsc = rte_eal_wait_lcore(lcore);
--
2.25.1
^ permalink raw reply [flat|nested] 36+ messages in thread
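The handshake in the patch above pairs a release store to `start` with an acquire wait in the polling thread. Below is a minimal sketch of that pairing outside DPDK, assuming GCC/Clang `__atomic` builtins and POSIX threads. The names `payload`, `poller` and `run_handshake` are hypothetical, and the plain spin loop only stands in for `rte_wait_until_equal_64()`:

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

static uint64_t start;   /* plain integer, accessed only through builtins */
static uint64_t payload; /* ordinary data published before the flag (hypothetical) */

/* Poller: spin until start == 1, then read the published data. */
static void *poller(void *arg)
{
	uint64_t *seen = arg;

	while (__atomic_load_n(&start, __ATOMIC_ACQUIRE) != 1)
		; /* stand-in for rte_wait_until_equal_64() */
	*seen = payload;
	return NULL;
}

/* Hypothetical helper: run one handshake, return what the poller saw. */
static uint64_t run_handshake(uint64_t value)
{
	pthread_t t;
	uint64_t seen = 0;
	int rc;

	/* Reset the flag before the poller exists, as exec_burst() does. */
	__atomic_store_n(&start, 0, __ATOMIC_RELAXED);
	rc = pthread_create(&t, NULL, poller, &seen);
	assert(rc == 0);

	payload = value;                               /* plain write */
	__atomic_store_n(&start, 1, __ATOMIC_RELEASE); /* publish it */

	rc = pthread_join(t, NULL);
	assert(rc == 0);
	return seen;
}
```

Calling `run_handshake(42)` should return 42: the acquire load that observes 1 also makes the earlier plain write to `payload` visible.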
* [PATCH v3 02/12] test/ring_perf: use compiler atomic builtins for lcores sync
2021-11-17 8:21 ` [PATCH v3 00/12] use compiler atomic builtins for app modules Joyce Kong
2021-11-17 8:21 ` [PATCH v3 01/12] test/pmd_perf: use compiler atomic builtins for polling sync Joyce Kong
@ 2021-11-17 8:21 ` Joyce Kong
2021-11-17 8:21 ` [PATCH v3 03/12] test/timer: use compiler atomic builtins for sync Joyce Kong
` (10 subsequent siblings)
12 siblings, 0 replies; 36+ messages in thread
From: Joyce Kong @ 2021-11-17 8:21 UTC (permalink / raw)
To: Honnappa Nagarahalli, Konstantin Ananyev
Cc: dev, nd, Joyce Kong, Ruifeng Wang
Convert rte_atomic usages to compiler atomic built-ins
for lcores sync in ring_perf test cases.
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
---
app/test/test_ring_perf.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/app/test/test_ring_perf.c b/app/test/test_ring_perf.c
index fd82e20412..2d8bb675a3 100644
--- a/app/test/test_ring_perf.c
+++ b/app/test/test_ring_perf.c
@@ -320,7 +320,7 @@ run_on_core_pair(struct lcore_pair *cores, struct rte_ring *r, const int esize)
return 0;
}
-static rte_atomic32_t synchro;
+static uint32_t synchro;
static uint64_t queue_count[RTE_MAX_LCORE];
#define TIME_MS 100
@@ -342,8 +342,7 @@ load_loop_fn_helper(struct thread_params *p, const int esize)
/* wait synchro for workers */
if (lcore != rte_get_main_lcore())
- while (rte_atomic32_read(&synchro) == 0)
- rte_pause();
+ rte_wait_until_equal_32(&synchro, 1, __ATOMIC_RELAXED);
begin = rte_get_timer_cycles();
while (time_diff < hz * TIME_MS / 1000) {
@@ -398,12 +397,12 @@ run_on_all_cores(struct rte_ring *r, const int esize)
param.r = r;
/* clear synchro and start workers */
- rte_atomic32_set(&synchro, 0);
+ __atomic_store_n(&synchro, 0, __ATOMIC_RELAXED);
if (rte_eal_mp_remote_launch(lcore_f, &param, SKIP_MAIN) < 0)
return -1;
/* start synchro and launch test on main */
- rte_atomic32_set(&synchro, 1);
+ __atomic_store_n(&synchro, 1, __ATOMIC_RELAXED);
lcore_f(&param);
rte_eal_mp_wait_lcore();
--
2.25.1
^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCH v3 03/12] test/timer: use compiler atomic builtins for sync
2021-11-17 8:21 ` [PATCH v3 00/12] use compiler atomic builtins for app modules Joyce Kong
2021-11-17 8:21 ` [PATCH v3 01/12] test/pmd_perf: use compiler atomic builtins for polling sync Joyce Kong
2021-11-17 8:21 ` [PATCH v3 02/12] test/ring_perf: use compiler atomic builtins for lcores sync Joyce Kong
@ 2021-11-17 8:21 ` Joyce Kong
2021-11-17 8:21 ` [PATCH v3 04/12] test/stack_perf: use compiler atomics for lcore sync Joyce Kong
` (9 subsequent siblings)
12 siblings, 0 replies; 36+ messages in thread
From: Joyce Kong @ 2021-11-17 8:21 UTC (permalink / raw)
To: Robert Sanford, Erik Gabriel Carrillo
Cc: dev, honnappa.nagarahalli, nd, Joyce Kong, Ruifeng Wang
Convert rte_atomic usages to compiler atomic
built-ins for lcore_state and collisions sync.
Also, move 'main_init_workers' outside of
'timer_stress2_main_loop' to guarantee that lcore_state is
initialized correctly before the threads are launched.
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
---
app/test/test_timer.c | 30 +++++++++++++-----------------
app/test/test_timer_secondary.c | 1 -
2 files changed, 13 insertions(+), 18 deletions(-)
diff --git a/app/test/test_timer.c b/app/test/test_timer.c
index a10b2fe9da..c97e5c891c 100644
--- a/app/test/test_timer.c
+++ b/app/test/test_timer.c
@@ -102,7 +102,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_lcore.h>
-#include <rte_atomic.h>
#include <rte_timer.h>
#include <rte_random.h>
#include <rte_malloc.h>
@@ -203,7 +202,7 @@ timer_stress_main_loop(__rte_unused void *arg)
/* Need to synchronize worker lcores through multiple steps. */
enum { WORKER_WAITING = 1, WORKER_RUN_SIGNAL, WORKER_RUNNING, WORKER_FINISHED };
-static rte_atomic16_t lcore_state[RTE_MAX_LCORE];
+static uint16_t lcore_state[RTE_MAX_LCORE];
static void
main_init_workers(void)
@@ -211,7 +210,7 @@ main_init_workers(void)
unsigned i;
RTE_LCORE_FOREACH_WORKER(i) {
- rte_atomic16_set(&lcore_state[i], WORKER_WAITING);
+ __atomic_store_n(&lcore_state[i], WORKER_WAITING, __ATOMIC_RELAXED);
}
}
@@ -221,11 +220,10 @@ main_start_workers(void)
unsigned i;
RTE_LCORE_FOREACH_WORKER(i) {
- rte_atomic16_set(&lcore_state[i], WORKER_RUN_SIGNAL);
+ __atomic_store_n(&lcore_state[i], WORKER_RUN_SIGNAL, __ATOMIC_RELEASE);
}
RTE_LCORE_FOREACH_WORKER(i) {
- while (rte_atomic16_read(&lcore_state[i]) != WORKER_RUNNING)
- rte_pause();
+ rte_wait_until_equal_16(&lcore_state[i], WORKER_RUNNING, __ATOMIC_ACQUIRE);
}
}
@@ -235,8 +233,7 @@ main_wait_for_workers(void)
unsigned i;
RTE_LCORE_FOREACH_WORKER(i) {
- while (rte_atomic16_read(&lcore_state[i]) != WORKER_FINISHED)
- rte_pause();
+ rte_wait_until_equal_16(&lcore_state[i], WORKER_FINISHED, __ATOMIC_ACQUIRE);
}
}
@@ -245,9 +242,8 @@ worker_wait_to_start(void)
{
unsigned lcore_id = rte_lcore_id();
- while (rte_atomic16_read(&lcore_state[lcore_id]) != WORKER_RUN_SIGNAL)
- rte_pause();
- rte_atomic16_set(&lcore_state[lcore_id], WORKER_RUNNING);
+ rte_wait_until_equal_16(&lcore_state[lcore_id], WORKER_RUN_SIGNAL, __ATOMIC_ACQUIRE);
+ __atomic_store_n(&lcore_state[lcore_id], WORKER_RUNNING, __ATOMIC_RELEASE);
}
static void
@@ -255,7 +251,7 @@ worker_finish(void)
{
unsigned lcore_id = rte_lcore_id();
- rte_atomic16_set(&lcore_state[lcore_id], WORKER_FINISHED);
+ __atomic_store_n(&lcore_state[lcore_id], WORKER_FINISHED, __ATOMIC_RELEASE);
}
@@ -281,13 +277,12 @@ timer_stress2_main_loop(__rte_unused void *arg)
unsigned int lcore_id = rte_lcore_id();
unsigned int main_lcore = rte_get_main_lcore();
int32_t my_collisions = 0;
- static rte_atomic32_t collisions;
+ static uint32_t collisions;
if (lcore_id == main_lcore) {
cb_count = 0;
test_failed = 0;
- rte_atomic32_set(&collisions, 0);
- main_init_workers();
+ __atomic_store_n(&collisions, 0, __ATOMIC_RELAXED);
timers = rte_malloc(NULL, sizeof(*timers) * NB_STRESS2_TIMERS, 0);
if (timers == NULL) {
printf("Test Failed\n");
@@ -315,7 +310,7 @@ timer_stress2_main_loop(__rte_unused void *arg)
my_collisions++;
}
if (my_collisions != 0)
- rte_atomic32_add(&collisions, my_collisions);
+ __atomic_fetch_add(&collisions, my_collisions, __ATOMIC_RELAXED);
/* wait long enough for timers to expire */
rte_delay_ms(100);
@@ -329,7 +324,7 @@ timer_stress2_main_loop(__rte_unused void *arg)
/* now check that we get the right number of callbacks */
if (lcore_id == main_lcore) {
- my_collisions = rte_atomic32_read(&collisions);
+ my_collisions = __atomic_load_n(&collisions, __ATOMIC_RELAXED);
if (my_collisions != 0)
printf("- %d timer reset collisions (OK)\n", my_collisions);
rte_timer_manage();
@@ -573,6 +568,7 @@ test_timer(void)
/* run a second, slightly different set of stress tests */
printf("\nStart timer stress tests 2\n");
test_failed = 0;
+ main_init_workers();
rte_eal_mp_remote_launch(timer_stress2_main_loop, NULL, CALL_MAIN);
rte_eal_mp_wait_lcore();
if (test_failed)
diff --git a/app/test/test_timer_secondary.c b/app/test/test_timer_secondary.c
index 16a9f1878b..5795c97f07 100644
--- a/app/test/test_timer_secondary.c
+++ b/app/test/test_timer_secondary.c
@@ -9,7 +9,6 @@
#include <rte_lcore.h>
#include <rte_debug.h>
#include <rte_memzone.h>
-#include <rte_atomic.h>
#include <rte_timer.h>
#include <rte_cycles.h>
#include <rte_mempool.h>
--
2.25.1
^ permalink raw reply [flat|nested] 36+ messages in thread
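The state handshake in the timer patch, and the reason for initializing the state before launching the threads, can be sketched as follows. This is an illustration only, assuming GCC/Clang `__atomic` builtins and POSIX threads. It uses a single state slot instead of one per lcore, `work_done` and `run_state_machine` are hypothetical names, and the local `wait_until_equal_16()` is a stand-in for the DPDK helper:

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

enum { WORKER_WAITING = 1, WORKER_RUN_SIGNAL, WORKER_RUNNING, WORKER_FINISHED };

static uint16_t lcore_state; /* one slot only; the test keeps one per lcore */
static int work_done;        /* hypothetical result written by the worker */

/* Stand-in for rte_wait_until_equal_16() with acquire ordering. */
static void wait_until_equal_16(uint16_t *addr, uint16_t expected)
{
	while (__atomic_load_n(addr, __ATOMIC_ACQUIRE) != expected)
		;
}

static void *worker(void *arg)
{
	(void)arg;
	wait_until_equal_16(&lcore_state, WORKER_RUN_SIGNAL);
	__atomic_store_n(&lcore_state, WORKER_RUNNING, __ATOMIC_RELEASE);
	work_done = 1;
	__atomic_store_n(&lcore_state, WORKER_FINISHED, __ATOMIC_RELEASE);
	return NULL;
}

static int run_state_machine(void)
{
	pthread_t t;
	int rc, done;

	work_done = 0;
	/* Initialize the state BEFORE the worker exists, so the worker can
	 * never observe a stale value left over from a previous run.
	 */
	__atomic_store_n(&lcore_state, WORKER_WAITING, __ATOMIC_RELAXED);
	rc = pthread_create(&t, NULL, worker, NULL);
	assert(rc == 0);

	__atomic_store_n(&lcore_state, WORKER_RUN_SIGNAL, __ATOMIC_RELEASE);
	/* The sketch waits only for FINISHED: a fast worker may pass through
	 * RUNNING before the main thread gets a chance to observe it.
	 */
	wait_until_equal_16(&lcore_state, WORKER_FINISHED);
	done = work_done;

	rc = pthread_join(t, NULL);
	assert(rc == 0);
	return done;
}
```

If the reset to `WORKER_WAITING` ran after the worker had started, a second run could see the `WORKER_FINISHED` value of the previous run, which is the kind of ordering problem the move of `main_init_workers()` is meant to rule out.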
* [PATCH v3 04/12] test/stack_perf: use compiler atomics for lcore sync
2021-11-17 8:21 ` [PATCH v3 00/12] use compiler atomic builtins for app modules Joyce Kong
` (2 preceding siblings ...)
2021-11-17 8:21 ` [PATCH v3 03/12] test/timer: use compiler atomic builtins for sync Joyce Kong
@ 2021-11-17 8:21 ` Joyce Kong
2021-11-17 8:21 ` [PATCH v3 05/12] test/bpf: use compiler atomics for calculation Joyce Kong
` (8 subsequent siblings)
12 siblings, 0 replies; 36+ messages in thread
From: Joyce Kong @ 2021-11-17 8:21 UTC (permalink / raw)
To: Olivier Matz; +Cc: dev, honnappa.nagarahalli, nd, Joyce Kong, Ruifeng Wang
Convert rte_atomic usages to compiler atomic built-ins
for lcore sync in stack_perf test cases.
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
---
app/test/test_stack_perf.c | 14 ++++++--------
1 file changed, 6 insertions(+), 8 deletions(-)
diff --git a/app/test/test_stack_perf.c b/app/test/test_stack_perf.c
index 4ee40d5d19..1eae00a334 100644
--- a/app/test/test_stack_perf.c
+++ b/app/test/test_stack_perf.c
@@ -6,7 +6,6 @@
#include <stdio.h>
#include <inttypes.h>
-#include <rte_atomic.h>
#include <rte_cycles.h>
#include <rte_launch.h>
#include <rte_pause.h>
@@ -24,7 +23,7 @@
*/
static volatile unsigned int bulk_sizes[] = {8, MAX_BURST};
-static rte_atomic32_t lcore_barrier;
+static uint32_t lcore_barrier;
struct lcore_pair {
unsigned int c1;
@@ -144,9 +143,8 @@ bulk_push_pop(void *p)
s = args->s;
size = args->sz;
- rte_atomic32_sub(&lcore_barrier, 1);
- while (rte_atomic32_read(&lcore_barrier) != 0)
- rte_pause();
+ __atomic_fetch_sub(&lcore_barrier, 1, __ATOMIC_RELAXED);
+ rte_wait_until_equal_32(&lcore_barrier, 0, __ATOMIC_RELAXED);
uint64_t start = rte_rdtsc();
@@ -175,7 +173,7 @@ run_on_core_pair(struct lcore_pair *cores, struct rte_stack *s,
unsigned int i;
for (i = 0; i < RTE_DIM(bulk_sizes); i++) {
- rte_atomic32_set(&lcore_barrier, 2);
+ __atomic_store_n(&lcore_barrier, 2, __ATOMIC_RELAXED);
args[0].sz = args[1].sz = bulk_sizes[i];
args[0].s = args[1].s = s;
@@ -208,7 +206,7 @@ run_on_n_cores(struct rte_stack *s, lcore_function_t fn, int n)
int cnt = 0;
double avg;
- rte_atomic32_set(&lcore_barrier, n);
+ __atomic_store_n(&lcore_barrier, n, __ATOMIC_RELAXED);
RTE_LCORE_FOREACH_WORKER(lcore_id) {
if (++cnt >= n)
@@ -302,7 +300,7 @@ __test_stack_perf(uint32_t flags)
struct lcore_pair cores;
struct rte_stack *s;
- rte_atomic32_init(&lcore_barrier);
+ __atomic_store_n(&lcore_barrier, 0, __ATOMIC_RELAXED);
s = rte_stack_create(STACK_NAME, STACK_SIZE, rte_socket_id(), flags);
if (s == NULL) {
--
2.25.1
^ permalink raw reply [flat|nested] 36+ messages in thread
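The countdown barrier used by `bulk_push_pop()` can be sketched in isolation as below, assuming GCC/Clang `__atomic` builtins and POSIX threads. `NB_THREADS`, `passed`, `barrier_worker` and `run_barrier` are hypothetical names, and the spin loop replaces `rte_wait_until_equal_32()`:

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define NB_THREADS 2

static uint32_t lcore_barrier; /* counts down to zero */
static uint32_t passed;        /* hypothetical: threads that got through */

static void *barrier_worker(void *arg)
{
	(void)arg;
	/* Announce arrival, then wait until everyone has arrived. */
	__atomic_fetch_sub(&lcore_barrier, 1, __ATOMIC_RELAXED);
	while (__atomic_load_n(&lcore_barrier, __ATOMIC_RELAXED) != 0)
		; /* stand-in for rte_wait_until_equal_32() */

	__atomic_fetch_add(&passed, 1, __ATOMIC_RELAXED);
	return NULL;
}

static uint32_t run_barrier(void)
{
	pthread_t t[NB_THREADS];
	int i, rc;

	__atomic_store_n(&passed, 0, __ATOMIC_RELAXED);
	/* The counter must be set before any participant is launched. */
	__atomic_store_n(&lcore_barrier, NB_THREADS, __ATOMIC_RELAXED);

	for (i = 0; i < NB_THREADS; i++) {
		rc = pthread_create(&t[i], NULL, barrier_worker, NULL);
		assert(rc == 0);
	}
	for (i = 0; i < NB_THREADS; i++) {
		rc = pthread_join(t[i], NULL);
		assert(rc == 0);
	}
	return __atomic_load_n(&passed, __ATOMIC_RELAXED);
}
```

Relaxed ordering is enough in this sketch because the barrier only aligns the start of the threads and publishes no other data through the counter.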
* [PATCH v3 05/12] test/bpf: use compiler atomics for calculation
2021-11-17 8:21 ` [PATCH v3 00/12] use compiler atomic builtins for app modules Joyce Kong
` (3 preceding siblings ...)
2021-11-17 8:21 ` [PATCH v3 04/12] test/stack_perf: use compiler atomics for lcore sync Joyce Kong
@ 2021-11-17 8:21 ` Joyce Kong
2021-11-17 8:21 ` [PATCH v3 06/12] test/func_reentrancy: use compiler atomics for data sync Joyce Kong
` (7 subsequent siblings)
12 siblings, 0 replies; 36+ messages in thread
From: Joyce Kong @ 2021-11-17 8:21 UTC (permalink / raw)
To: Konstantin Ananyev
Cc: dev, honnappa.nagarahalli, nd, Joyce Kong, Ruifeng Wang
Convert rte_atomic usages to compiler atomic built-ins
for calculation in bpf test cases.
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
app/test/test_bpf.c | 28 ++++++++++++++--------------
1 file changed, 14 insertions(+), 14 deletions(-)
diff --git a/app/test/test_bpf.c b/app/test/test_bpf.c
index e3e9a1b0b5..b8be1e3d30 100644
--- a/app/test/test_bpf.c
+++ b/app/test/test_bpf.c
@@ -1569,32 +1569,32 @@ test_xadd1_check(uint64_t rc, const void *arg)
memset(&dfe, 0, sizeof(dfe));
rv = 1;
- rte_atomic32_add((rte_atomic32_t *)&dfe.u32, rv);
- rte_atomic64_add((rte_atomic64_t *)&dfe.u64, rv);
+ __atomic_fetch_add(&dfe.u32, rv, __ATOMIC_RELAXED);
+ __atomic_fetch_add(&dfe.u64, rv, __ATOMIC_RELAXED);
rv = -1;
- rte_atomic32_add((rte_atomic32_t *)&dfe.u32, rv);
- rte_atomic64_add((rte_atomic64_t *)&dfe.u64, rv);
+ __atomic_fetch_add(&dfe.u32, rv, __ATOMIC_RELAXED);
+ __atomic_fetch_add(&dfe.u64, rv, __ATOMIC_RELAXED);
rv = (int32_t)TEST_FILL_1;
- rte_atomic32_add((rte_atomic32_t *)&dfe.u32, rv);
- rte_atomic64_add((rte_atomic64_t *)&dfe.u64, rv);
+ __atomic_fetch_add(&dfe.u32, rv, __ATOMIC_RELAXED);
+ __atomic_fetch_add(&dfe.u64, rv, __ATOMIC_RELAXED);
rv = TEST_MUL_1;
- rte_atomic32_add((rte_atomic32_t *)&dfe.u32, rv);
- rte_atomic64_add((rte_atomic64_t *)&dfe.u64, rv);
+ __atomic_fetch_add(&dfe.u32, rv, __ATOMIC_RELAXED);
+ __atomic_fetch_add(&dfe.u64, rv, __ATOMIC_RELAXED);
rv = TEST_MUL_2;
- rte_atomic32_add((rte_atomic32_t *)&dfe.u32, rv);
- rte_atomic64_add((rte_atomic64_t *)&dfe.u64, rv);
+ __atomic_fetch_add(&dfe.u32, rv, __ATOMIC_RELAXED);
+ __atomic_fetch_add(&dfe.u64, rv, __ATOMIC_RELAXED);
rv = TEST_JCC_2;
- rte_atomic32_add((rte_atomic32_t *)&dfe.u32, rv);
- rte_atomic64_add((rte_atomic64_t *)&dfe.u64, rv);
+ __atomic_fetch_add(&dfe.u32, rv, __ATOMIC_RELAXED);
+ __atomic_fetch_add(&dfe.u64, rv, __ATOMIC_RELAXED);
rv = TEST_JCC_3;
- rte_atomic32_add((rte_atomic32_t *)&dfe.u32, rv);
- rte_atomic64_add((rte_atomic64_t *)&dfe.u64, rv);
+ __atomic_fetch_add(&dfe.u32, rv, __ATOMIC_RELAXED);
+ __atomic_fetch_add(&dfe.u64, rv, __ATOMIC_RELAXED);
return cmp_res(__func__, 1, rc, &dfe, dft, sizeof(dfe));
}
--
2.25.1
^ permalink raw reply [flat|nested] 36+ messages in thread
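Dropping the casts to `rte_atomic32_t *` and `rte_atomic64_t *` works because `__atomic_fetch_add()` is generic over the pointed-to integer type and converts the addend to that type. A small sketch of the effect with a negative addend, assuming GCC/Clang builtins; the two-field struct and the `int64_t` addend are assumptions made for illustration, and `xadd_both` is a hypothetical helper:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced stand-in for the structure used by the test (assumption). */
struct dummy_offset {
	uint64_t u64;
	uint32_t u32;
};

/* Add the same signed value to both fields, as the check function does. */
static void xadd_both(struct dummy_offset *d, int64_t rv)
{
	assert(d != NULL);
	/* rv is converted to the type of each field, so -1 wraps modulo
	 * 2^32 for u32 and modulo 2^64 for u64.
	 */
	__atomic_fetch_add(&d->u32, rv, __ATOMIC_RELAXED);
	__atomic_fetch_add(&d->u64, rv, __ATOMIC_RELAXED);
}
```

Adding 1 and then -1 leaves both fields at their starting value, while adding -1 to a zeroed structure yields `UINT32_MAX` and `UINT64_MAX` respectively.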
* [PATCH v3 06/12] test/func_reentrancy: use compiler atomics for data sync
2021-11-17 8:21 ` [PATCH v3 00/12] use compiler atomic builtins for app modules Joyce Kong
` (4 preceding siblings ...)
2021-11-17 8:21 ` [PATCH v3 05/12] test/bpf: use compiler atomics for calculation Joyce Kong
@ 2021-11-17 8:21 ` Joyce Kong
2021-11-17 8:21 ` [PATCH v3 07/12] app/eventdev: use compiler atomics for shared " Joyce Kong
` (6 subsequent siblings)
12 siblings, 0 replies; 36+ messages in thread
From: Joyce Kong @ 2021-11-17 8:21 UTC (permalink / raw)
To: Olivier Matz, Andrew Rybchenko, Bruce Richardson,
Vladimir Medvedkin, Honnappa Nagarahalli, Konstantin Ananyev,
Anatoly Burakov, Yipeng Wang, Sameh Gobriel
Cc: dev, nd, Joyce Kong, Ruifeng Wang
Convert rte_atomic usages to compiler atomic built-ins
for shared data sync in func_reentrancy test cases.
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
---
app/test/test_func_reentrancy.c | 27 +++++++++++++--------------
1 file changed, 13 insertions(+), 14 deletions(-)
diff --git a/app/test/test_func_reentrancy.c b/app/test/test_func_reentrancy.c
index 838ab6f0f9..7825c6cb86 100644
--- a/app/test/test_func_reentrancy.c
+++ b/app/test/test_func_reentrancy.c
@@ -20,7 +20,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_lcore.h>
-#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_ring.h>
#include <rte_mempool.h>
@@ -54,12 +53,12 @@ typedef void (*case_clean_t)(unsigned lcore_id);
#define MAX_LCORES (RTE_MAX_MEMZONE / (MAX_ITER_MULTI * 4U))
-static rte_atomic32_t obj_count = RTE_ATOMIC32_INIT(0);
-static rte_atomic32_t synchro = RTE_ATOMIC32_INIT(0);
+static uint32_t obj_count;
+static uint32_t synchro;
#define WAIT_SYNCHRO_FOR_WORKERS() do { \
if (lcore_self != rte_get_main_lcore()) \
- while (rte_atomic32_read(&synchro) == 0); \
+ rte_wait_until_equal_32(&synchro, 1, __ATOMIC_RELAXED); \
} while(0)
/*
@@ -72,7 +71,7 @@ test_eal_init_once(__rte_unused void *arg)
WAIT_SYNCHRO_FOR_WORKERS();
- rte_atomic32_set(&obj_count, 1); /* silent the check in the caller */
+ __atomic_store_n(&obj_count, 1, __ATOMIC_RELAXED); /* silent the check in the caller */
if (rte_eal_init(0, NULL) != -1)
return -1;
@@ -116,7 +115,7 @@ ring_create_lookup(__rte_unused void *arg)
for (i = 0; i < MAX_ITER_ONCE; i++) {
rp = rte_ring_create("fr_test_once", 4096, SOCKET_ID_ANY, 0);
if (rp != NULL)
- rte_atomic32_inc(&obj_count);
+ __atomic_fetch_add(&obj_count, 1, __ATOMIC_RELAXED);
}
/* create/lookup new ring several times */
@@ -183,7 +182,7 @@ mempool_create_lookup(__rte_unused void *arg)
my_obj_init, NULL,
SOCKET_ID_ANY, 0);
if (mp != NULL)
- rte_atomic32_inc(&obj_count);
+ __atomic_fetch_add(&obj_count, 1, __ATOMIC_RELAXED);
}
/* create/lookup new ring several times */
@@ -250,7 +249,7 @@ hash_create_free(__rte_unused void *arg)
for (i = 0; i < MAX_ITER_ONCE; i++) {
handle = rte_hash_create(&hash_params);
if (handle != NULL)
- rte_atomic32_inc(&obj_count);
+ __atomic_fetch_add(&obj_count, 1, __ATOMIC_RELAXED);
}
/* create mutiple times simultaneously */
@@ -318,7 +317,7 @@ fbk_create_free(__rte_unused void *arg)
for (i = 0; i < MAX_ITER_ONCE; i++) {
handle = rte_fbk_hash_create(&fbk_params);
if (handle != NULL)
- rte_atomic32_inc(&obj_count);
+ __atomic_fetch_add(&obj_count, 1, __ATOMIC_RELAXED);
}
/* create mutiple fbk tables simultaneously */
@@ -384,7 +383,7 @@ lpm_create_free(__rte_unused void *arg)
for (i = 0; i < MAX_ITER_ONCE; i++) {
lpm = rte_lpm_create("fr_test_once", SOCKET_ID_ANY, &config);
if (lpm != NULL)
- rte_atomic32_inc(&obj_count);
+ __atomic_fetch_add(&obj_count, 1, __ATOMIC_RELAXED);
}
/* create mutiple fbk tables simultaneously */
@@ -445,8 +444,8 @@ launch_test(struct test_case *pt_case)
if (pt_case->func == NULL)
return -1;
- rte_atomic32_set(&obj_count, 0);
- rte_atomic32_set(&synchro, 0);
+ __atomic_store_n(&obj_count, 0, __ATOMIC_RELAXED);
+ __atomic_store_n(&synchro, 0, __ATOMIC_RELAXED);
cores = RTE_MIN(rte_lcore_count(), MAX_LCORES);
RTE_LCORE_FOREACH_WORKER(lcore_id) {
@@ -456,7 +455,7 @@ launch_test(struct test_case *pt_case)
rte_eal_remote_launch(pt_case->func, pt_case->arg, lcore_id);
}
- rte_atomic32_set(&synchro, 1);
+ __atomic_store_n(&synchro, 1, __ATOMIC_RELAXED);
if (pt_case->func(pt_case->arg) < 0)
ret = -1;
@@ -471,7 +470,7 @@ launch_test(struct test_case *pt_case)
pt_case->clean(lcore_id);
}
- count = rte_atomic32_read(&obj_count);
+ count = __atomic_load_n(&obj_count, __ATOMIC_RELAXED);
if (count != 1) {
printf("%s: common object allocated %d times (should be 1)\n",
pt_case->name, count);
--
2.25.1
^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCH v3 07/12] app/eventdev: use compiler atomics for shared data sync
2021-11-17 8:21 ` [PATCH v3 00/12] use compiler atomic builtins for app modules Joyce Kong
` (5 preceding siblings ...)
2021-11-17 8:21 ` [PATCH v3 06/12] test/func_reentrancy: use compiler atomics for data sync Joyce Kong
@ 2021-11-17 8:21 ` Joyce Kong
2021-11-17 8:21 ` [PATCH v3 08/12] app/crypto: use compiler atomic builtins for display sync Joyce Kong
` (5 subsequent siblings)
12 siblings, 0 replies; 36+ messages in thread
From: Joyce Kong @ 2021-11-17 8:21 UTC (permalink / raw)
To: Jerin Jacob; +Cc: dev, honnappa.nagarahalli, nd, Joyce Kong, Ruifeng Wang
Convert rte_atomic usages to compiler atomic built-ins
for shared data sync in eventdev cases.
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
app/test-eventdev/evt_main.c | 1 -
app/test-eventdev/test_order_atq.c | 4 ++--
app/test-eventdev/test_order_common.c | 4 ++--
app/test-eventdev/test_order_common.h | 8 ++++----
app/test-eventdev/test_order_queue.c | 4 ++--
5 files changed, 10 insertions(+), 11 deletions(-)
diff --git a/app/test-eventdev/evt_main.c b/app/test-eventdev/evt_main.c
index 3534aabca7..194c980c7a 100644
--- a/app/test-eventdev/evt_main.c
+++ b/app/test-eventdev/evt_main.c
@@ -6,7 +6,6 @@
#include <unistd.h>
#include <signal.h>
-#include <rte_atomic.h>
#include <rte_debug.h>
#include <rte_eal.h>
#include <rte_eventdev.h>
diff --git a/app/test-eventdev/test_order_atq.c b/app/test-eventdev/test_order_atq.c
index 71215a07b6..2fee4b4daa 100644
--- a/app/test-eventdev/test_order_atq.c
+++ b/app/test-eventdev/test_order_atq.c
@@ -28,7 +28,7 @@ order_atq_worker(void *arg, const bool flow_id_cap)
uint16_t event = rte_event_dequeue_burst(dev_id, port,
&ev, 1, 0);
if (!event) {
- if (rte_atomic64_read(outstand_pkts) <= 0)
+ if (__atomic_load_n(outstand_pkts, __ATOMIC_RELAXED) <= 0)
break;
rte_pause();
continue;
@@ -64,7 +64,7 @@ order_atq_worker_burst(void *arg, const bool flow_id_cap)
BURST_SIZE, 0);
if (nb_rx == 0) {
- if (rte_atomic64_read(outstand_pkts) <= 0)
+ if (__atomic_load_n(outstand_pkts, __ATOMIC_RELAXED) <= 0)
break;
rte_pause();
continue;
diff --git a/app/test-eventdev/test_order_common.c b/app/test-eventdev/test_order_common.c
index d7760061ba..ff7813f9c2 100644
--- a/app/test-eventdev/test_order_common.c
+++ b/app/test-eventdev/test_order_common.c
@@ -187,7 +187,7 @@ order_test_setup(struct evt_test *test, struct evt_options *opt)
evt_err("failed to allocate t->expected_flow_seq memory");
goto exp_nomem;
}
- rte_atomic64_set(&t->outstand_pkts, opt->nb_pkts);
+ __atomic_store_n(&t->outstand_pkts, opt->nb_pkts, __ATOMIC_RELAXED);
t->err = false;
t->nb_pkts = opt->nb_pkts;
t->nb_flows = opt->nb_flows;
@@ -294,7 +294,7 @@ order_launch_lcores(struct evt_test *test, struct evt_options *opt,
while (t->err == false) {
uint64_t new_cycles = rte_get_timer_cycles();
- int64_t remaining = rte_atomic64_read(&t->outstand_pkts);
+ int64_t remaining = __atomic_load_n(&t->outstand_pkts, __ATOMIC_RELAXED);
if (remaining <= 0) {
t->result = EVT_TEST_SUCCESS;
diff --git a/app/test-eventdev/test_order_common.h b/app/test-eventdev/test_order_common.h
index cd9d6009ec..92781d9587 100644
--- a/app/test-eventdev/test_order_common.h
+++ b/app/test-eventdev/test_order_common.h
@@ -48,7 +48,7 @@ struct test_order {
* The atomic_* is an expensive operation,Since it is a functional test,
* We are using the atomic_ operation to reduce the code complexity.
*/
- rte_atomic64_t outstand_pkts;
+ uint64_t outstand_pkts;
enum evt_test_result result;
uint32_t nb_flows;
uint64_t nb_pkts;
@@ -95,7 +95,7 @@ static __rte_always_inline void
order_process_stage_1(struct test_order *const t,
struct rte_event *const ev, const uint32_t nb_flows,
uint32_t *const expected_flow_seq,
- rte_atomic64_t *const outstand_pkts)
+ uint64_t *const outstand_pkts)
{
const uint32_t flow = (uintptr_t)ev->mbuf % nb_flows;
/* compare the seqn against expected value */
@@ -113,7 +113,7 @@ order_process_stage_1(struct test_order *const t,
*/
expected_flow_seq[flow]++;
rte_pktmbuf_free(ev->mbuf);
- rte_atomic64_sub(outstand_pkts, 1);
+ __atomic_sub_fetch(outstand_pkts, 1, __ATOMIC_RELAXED);
}
static __rte_always_inline void
@@ -132,7 +132,7 @@ order_process_stage_invalid(struct test_order *const t,
const uint8_t port = w->port_id;\
const uint32_t nb_flows = t->nb_flows;\
uint32_t *expected_flow_seq = t->expected_flow_seq;\
- rte_atomic64_t *outstand_pkts = &t->outstand_pkts;\
+ uint64_t *outstand_pkts = &t->outstand_pkts;\
if (opt->verbose_level > 1)\
printf("%s(): lcore %d dev_id %d port=%d\n",\
__func__, rte_lcore_id(), dev_id, port)
diff --git a/app/test-eventdev/test_order_queue.c b/app/test-eventdev/test_order_queue.c
index 621367805a..80eaea5cf5 100644
--- a/app/test-eventdev/test_order_queue.c
+++ b/app/test-eventdev/test_order_queue.c
@@ -28,7 +28,7 @@ order_queue_worker(void *arg, const bool flow_id_cap)
uint16_t event = rte_event_dequeue_burst(dev_id, port,
&ev, 1, 0);
if (!event) {
- if (rte_atomic64_read(outstand_pkts) <= 0)
+ if (__atomic_load_n(outstand_pkts, __ATOMIC_RELAXED) <= 0)
break;
rte_pause();
continue;
@@ -64,7 +64,7 @@ order_queue_worker_burst(void *arg, const bool flow_id_cap)
BURST_SIZE, 0);
if (nb_rx == 0) {
- if (rte_atomic64_read(outstand_pkts) <= 0)
+ if (__atomic_load_n(outstand_pkts, __ATOMIC_RELAXED) <= 0)
break;
rte_pause();
continue;
--
2.25.1
^ permalink raw reply [flat|nested] 36+ messages in thread
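One detail to keep in mind when reading the eventdev change: `outstand_pkts` becomes a `uint64_t`, so a `<= 0` test applied directly to the loaded value behaves like `== 0`, whereas assigning the value to an `int64_t` first keeps the signed meaning. The sketch below only illustrates that C behaviour with GCC/Clang builtins. Whether the counter can ever drop below zero in the real test is not established by this thread, and the helper names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

static uint64_t outstand_pkts;

/* Consume one packet; returns the new counter value. */
static uint64_t consume_one(void)
{
	return __atomic_sub_fetch(&outstand_pkts, 1, __ATOMIC_RELAXED);
}

/* Comparison done on the unsigned value: true only at exactly zero. */
static int drained_unsigned(void)
{
	return __atomic_load_n(&outstand_pkts, __ATOMIC_RELAXED) <= 0;
}

/* Comparison done after conversion to a signed type (the conversion of
 * out-of-range values is implementation-defined; GCC and Clang wrap).
 */
static int drained_signed(void)
{
	int64_t remaining =
		(int64_t)__atomic_load_n(&outstand_pkts, __ATOMIC_RELAXED);

	return remaining <= 0;
}

/* Hypothetical self-check walking the counter from 1 to "-1". */
static void check_wrap_behaviour(void)
{
	__atomic_store_n(&outstand_pkts, 1, __ATOMIC_RELAXED);

	assert(consume_one() == 0);
	assert(drained_unsigned() && drained_signed());

	consume_one(); /* wraps to UINT64_MAX */
	assert(!drained_unsigned());
	assert(drained_signed());
}
```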
* [PATCH v3 08/12] app/crypto: use compiler atomic builtins for display sync
2021-11-17 8:21 ` [PATCH v3 00/12] use compiler atomic builtins for app modules Joyce Kong
` (6 preceding siblings ...)
2021-11-17 8:21 ` [PATCH v3 07/12] app/eventdev: use compiler atomics for shared " Joyce Kong
@ 2021-11-17 8:21 ` Joyce Kong
2021-11-17 8:21 ` [PATCH v3 09/12] app/compress: " Joyce Kong
` (4 subsequent siblings)
12 siblings, 0 replies; 36+ messages in thread
From: Joyce Kong @ 2021-11-17 8:21 UTC (permalink / raw)
To: Declan Doherty, Ciara Power
Cc: dev, honnappa.nagarahalli, nd, Joyce Kong, Ruifeng Wang
Convert rte_atomic_test_and_set usage to compiler atomic
CAS operation for display sync in crypto cases.
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
---
app/test-crypto-perf/cperf_test_latency.c | 6 ++++--
app/test-crypto-perf/cperf_test_pmd_cyclecount.c | 9 ++++++---
app/test-crypto-perf/cperf_test_throughput.c | 9 ++++++---
app/test-crypto-perf/cperf_test_verify.c | 9 ++++++---
4 files changed, 22 insertions(+), 11 deletions(-)
diff --git a/app/test-crypto-perf/cperf_test_latency.c b/app/test-crypto-perf/cperf_test_latency.c
index 69f55de50a..ce49feaba9 100644
--- a/app/test-crypto-perf/cperf_test_latency.c
+++ b/app/test-crypto-perf/cperf_test_latency.c
@@ -126,7 +126,7 @@ cperf_latency_test_runner(void *arg)
uint8_t burst_size_idx = 0;
uint32_t imix_idx = 0;
- static rte_atomic16_t display_once = RTE_ATOMIC16_INIT(0);
+ static uint16_t display_once;
if (ctx == NULL)
return 0;
@@ -307,8 +307,10 @@ cperf_latency_test_runner(void *arg)
time_max = tunit*(double)(tsc_max) / tsc_hz;
time_min = tunit*(double)(tsc_min) / tsc_hz;
+ uint16_t exp = 0;
if (ctx->options->csv) {
- if (rte_atomic16_test_and_set(&display_once))
+ if (__atomic_compare_exchange_n(&display_once, &exp, 1, 0,
+ __ATOMIC_RELAXED, __ATOMIC_RELAXED))
printf("\n# lcore, Buffer Size, Burst Size, Pakt Seq #, "
"cycles, time (us)");
diff --git a/app/test-crypto-perf/cperf_test_pmd_cyclecount.c b/app/test-crypto-perf/cperf_test_pmd_cyclecount.c
index fda97e8ab9..ba1f104f72 100644
--- a/app/test-crypto-perf/cperf_test_pmd_cyclecount.c
+++ b/app/test-crypto-perf/cperf_test_pmd_cyclecount.c
@@ -404,7 +404,7 @@ cperf_pmd_cyclecount_test_runner(void *test_ctx)
state.lcore = rte_lcore_id();
state.linearize = 0;
- static rte_atomic16_t display_once = RTE_ATOMIC16_INIT(0);
+ static uint16_t display_once;
static bool warmup = true;
/*
@@ -449,8 +449,10 @@ cperf_pmd_cyclecount_test_runner(void *test_ctx)
continue;
}
+ uint16_t exp = 0;
if (!opts->csv) {
- if (rte_atomic16_test_and_set(&display_once))
+ if (__atomic_compare_exchange_n(&display_once, &exp, 1, 0,
+ __ATOMIC_RELAXED, __ATOMIC_RELAXED))
printf(PRETTY_HDR_FMT, "lcore id", "Buf Size",
"Burst Size", "Enqueued",
"Dequeued", "Enq Retries",
@@ -466,7 +468,8 @@ cperf_pmd_cyclecount_test_runner(void *test_ctx)
state.cycles_per_enq,
state.cycles_per_deq);
} else {
- if (rte_atomic16_test_and_set(&display_once))
+ if (__atomic_compare_exchange_n(&display_once, &exp, 1, 0,
+ __ATOMIC_RELAXED, __ATOMIC_RELAXED))
printf(CSV_HDR_FMT, "# lcore id", "Buf Size",
"Burst Size", "Enqueued",
"Dequeued", "Enq Retries",
diff --git a/app/test-crypto-perf/cperf_test_throughput.c b/app/test-crypto-perf/cperf_test_throughput.c
index 739ed9e573..51512af2ad 100644
--- a/app/test-crypto-perf/cperf_test_throughput.c
+++ b/app/test-crypto-perf/cperf_test_throughput.c
@@ -113,7 +113,7 @@ cperf_throughput_test_runner(void *test_ctx)
uint8_t burst_size_idx = 0;
uint32_t imix_idx = 0;
- static rte_atomic16_t display_once = RTE_ATOMIC16_INIT(0);
+ static uint16_t display_once;
struct rte_crypto_op *ops[ctx->options->max_burst_size];
struct rte_crypto_op *ops_processed[ctx->options->max_burst_size];
@@ -281,8 +281,10 @@ cperf_throughput_test_runner(void *test_ctx)
double cycles_per_packet = ((double)tsc_duration /
ctx->options->total_ops);
+ uint16_t exp = 0;
if (!ctx->options->csv) {
- if (rte_atomic16_test_and_set(&display_once))
+ if (__atomic_compare_exchange_n(&display_once, &exp, 1, 0,
+ __ATOMIC_RELAXED, __ATOMIC_RELAXED))
printf("%12s%12s%12s%12s%12s%12s%12s%12s%12s%12s\n\n",
"lcore id", "Buf Size", "Burst Size",
"Enqueued", "Dequeued", "Failed Enq",
@@ -302,7 +304,8 @@ cperf_throughput_test_runner(void *test_ctx)
throughput_gbps,
cycles_per_packet);
} else {
- if (rte_atomic16_test_and_set(&display_once))
+ if (__atomic_compare_exchange_n(&display_once, &exp, 1, 0,
+ __ATOMIC_RELAXED, __ATOMIC_RELAXED))
printf("#lcore id,Buffer Size(B),"
"Burst Size,Enqueued,Dequeued,Failed Enq,"
"Failed Deq,Ops(Millions),Throughput(Gbps),"
diff --git a/app/test-crypto-perf/cperf_test_verify.c b/app/test-crypto-perf/cperf_test_verify.c
index 1962438034..496eb0de00 100644
--- a/app/test-crypto-perf/cperf_test_verify.c
+++ b/app/test-crypto-perf/cperf_test_verify.c
@@ -241,7 +241,7 @@ cperf_verify_test_runner(void *test_ctx)
uint64_t ops_deqd = 0, ops_deqd_total = 0, ops_deqd_failed = 0;
uint64_t ops_failed = 0;
- static rte_atomic16_t display_once = RTE_ATOMIC16_INIT(0);
+ static uint16_t display_once;
uint64_t i;
uint16_t ops_unused = 0;
@@ -383,8 +383,10 @@ cperf_verify_test_runner(void *test_ctx)
ops_deqd_total += ops_deqd;
}
+ uint16_t exp = 0;
if (!ctx->options->csv) {
- if (rte_atomic16_test_and_set(&display_once))
+ if (__atomic_compare_exchange_n(&display_once, &exp, 1, 0,
+ __ATOMIC_RELAXED, __ATOMIC_RELAXED))
printf("%12s%12s%12s%12s%12s%12s%12s%12s\n\n",
"lcore id", "Buf Size", "Burst size",
"Enqueued", "Dequeued", "Failed Enq",
@@ -401,7 +403,8 @@ cperf_verify_test_runner(void *test_ctx)
ops_deqd_failed,
ops_failed);
} else {
- if (rte_atomic16_test_and_set(&display_once))
+ if (__atomic_compare_exchange_n(&display_once, &exp, 1, 0,
+ __ATOMIC_RELAXED, __ATOMIC_RELAXED))
printf("\n# lcore id, Buffer Size(B), "
"Burst Size,Enqueued,Dequeued,Failed Enq,"
"Failed Deq,Failed Ops\n");
--
2.25.1
* [PATCH v3 09/12] app/compress: use compiler atomic builtins for display sync
2021-11-17 8:21 ` [PATCH v3 00/12] use compiler atomic builtins for app modules Joyce Kong
` (7 preceding siblings ...)
2021-11-17 8:21 ` [PATCH v3 08/12] app/crypto: use compiler atomic builtins for display sync Joyce Kong
@ 2021-11-17 8:21 ` Joyce Kong
2021-11-17 8:21 ` [PATCH v3 10/12] app/testpmd: remove atomic operations for port status Joyce Kong
` (3 subsequent siblings)
12 siblings, 0 replies; 36+ messages in thread
From: Joyce Kong @ 2021-11-17 8:21 UTC (permalink / raw)
Cc: dev, honnappa.nagarahalli, nd, Joyce Kong, Ruifeng Wang
Convert rte_atomic16_test_and_set usage to a compiler atomic
CAS operation for display sync.
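As a rough illustration of the conversion described above, the following standalone sketch shows a print-once guard built on the GCC/Clang builtin __atomic_compare_exchange_n. The helper names claim_display_once and print_header_once are hypothetical and do not appear in the patch; only the builtin call and its arguments mirror the diff below.

```c
#include <stdint.h>
#include <stdio.h>

/* Shared flag: 0 means the header has not been printed yet. */
static uint16_t display_once;

/*
 * Hypothetical helper (not part of the patch): returns 1 for exactly one
 * caller, the one that flips the flag from 0 to 1, and 0 for all others.
 */
static int
claim_display_once(void)
{
	/* Expected value; the builtin overwrites it with the current
	 * flag value when the exchange fails. */
	uint16_t exp = 0;

	return __atomic_compare_exchange_n(&display_once, &exp, 1,
			0 /* strong CAS, no spurious failure */,
			__ATOMIC_RELAXED, __ATOMIC_RELAXED);
}

/* Hypothetical caller: prints the table header only on the first call. */
static void
print_header_once(void)
{
	if (claim_display_once())
		printf("%12s%12s\n", "lcore id", "Buf Size");
}
```

Because a failed exchange writes the observed value back into the expected-value variable, a caller that reuses one local variable for a second CAS has to reset it to 0 first. The throughput runner in this patch does that with `exp = 0;` before its second call.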
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
---
app/test-compress-perf/comp_perf_test_common.h | 2 +-
.../comp_perf_test_cyclecount.c | 15 +++++++--------
.../comp_perf_test_throughput.c | 10 +++++++---
app/test-compress-perf/comp_perf_test_verify.c | 6 ++++--
4 files changed, 19 insertions(+), 14 deletions(-)
diff --git a/app/test-compress-perf/comp_perf_test_common.h b/app/test-compress-perf/comp_perf_test_common.h
index 72705c6a2b..d039e5a29a 100644
--- a/app/test-compress-perf/comp_perf_test_common.h
+++ b/app/test-compress-perf/comp_perf_test_common.h
@@ -14,7 +14,7 @@ struct cperf_mem_resources {
uint16_t qp_id;
uint8_t lcore_id;
- rte_atomic16_t print_info_once;
+ uint16_t print_info_once;
uint32_t total_bufs;
uint8_t *compressed_data;
diff --git a/app/test-compress-perf/comp_perf_test_cyclecount.c b/app/test-compress-perf/comp_perf_test_cyclecount.c
index c875ddbdac..da55b02b74 100644
--- a/app/test-compress-perf/comp_perf_test_cyclecount.c
+++ b/app/test-compress-perf/comp_perf_test_cyclecount.c
@@ -466,7 +466,7 @@ cperf_cyclecount_test_runner(void *test_ctx)
struct cperf_cyclecount_ctx *ctx = test_ctx;
struct comp_test_data *test_data = ctx->ver.options;
uint32_t lcore = rte_lcore_id();
- static rte_atomic16_t display_once = RTE_ATOMIC16_INIT(0);
+ static uint16_t display_once;
static rte_spinlock_t print_spinlock;
int i;
@@ -486,10 +486,12 @@ cperf_cyclecount_test_runner(void *test_ctx)
ctx->ver.mem.lcore_id = lcore;
+ uint16_t exp = 0;
/*
* printing information about current compression thread
*/
- if (rte_atomic16_test_and_set(&ctx->ver.mem.print_info_once))
+ if (__atomic_compare_exchange_n(&ctx->ver.mem.print_info_once, &exp,
+ 1, 0, __ATOMIC_RELAXED, __ATOMIC_RELAXED))
printf(" lcore: %u,"
" driver name: %s,"
" device name: %s,"
@@ -546,9 +548,10 @@ cperf_cyclecount_test_runner(void *test_ctx)
(ctx->ver.mem.total_bufs * test_data->num_iter);
/* R E P O R T processing */
- if (rte_atomic16_test_and_set(&display_once)) {
+ rte_spinlock_lock(&print_spinlock);
- rte_spinlock_lock(&print_spinlock);
+ if (display_once == 0) {
+ display_once = 1;
printf("\nLegend for the table\n"
" - Retries section: number of retries for the following operations:\n"
@@ -576,12 +579,8 @@ cperf_cyclecount_test_runner(void *test_ctx)
"setup/op",
"[C-e]", "[C-d]",
"[D-e]", "[D-d]");
-
- rte_spinlock_unlock(&print_spinlock);
}
- rte_spinlock_lock(&print_spinlock);
-
printf("%12u"
"%6u"
"%12zu"
diff --git a/app/test-compress-perf/comp_perf_test_throughput.c b/app/test-compress-perf/comp_perf_test_throughput.c
index 13922b658c..d3dff070b0 100644
--- a/app/test-compress-perf/comp_perf_test_throughput.c
+++ b/app/test-compress-perf/comp_perf_test_throughput.c
@@ -329,15 +329,17 @@ cperf_throughput_test_runner(void *test_ctx)
struct cperf_benchmark_ctx *ctx = test_ctx;
struct comp_test_data *test_data = ctx->ver.options;
uint32_t lcore = rte_lcore_id();
- static rte_atomic16_t display_once = RTE_ATOMIC16_INIT(0);
+ static uint16_t display_once;
int i, ret = EXIT_SUCCESS;
ctx->ver.mem.lcore_id = lcore;
+ uint16_t exp = 0;
/*
* printing information about current compression thread
*/
- if (rte_atomic16_test_and_set(&ctx->ver.mem.print_info_once))
+ if (__atomic_compare_exchange_n(&ctx->ver.mem.print_info_once, &exp,
+ 1, 0, __ATOMIC_RELAXED, __ATOMIC_RELAXED))
printf(" lcore: %u,"
" driver name: %s,"
" device name: %s,"
@@ -391,7 +393,9 @@ cperf_throughput_test_runner(void *test_ctx)
ctx->decomp_gbps = rte_get_tsc_hz() / ctx->decomp_tsc_byte * 8 /
1000000000;
- if (rte_atomic16_test_and_set(&display_once)) {
+ exp = 0;
+ if (__atomic_compare_exchange_n(&display_once, &exp, 1, 0,
+ __ATOMIC_RELAXED, __ATOMIC_RELAXED)) {
printf("\n%12s%6s%12s%17s%15s%16s\n",
"lcore id", "Level", "Comp size", "Comp ratio [%]",
"Comp [Gbps]", "Decomp [Gbps]");
diff --git a/app/test-compress-perf/comp_perf_test_verify.c b/app/test-compress-perf/comp_perf_test_verify.c
index 5e13257b79..f6e21368e8 100644
--- a/app/test-compress-perf/comp_perf_test_verify.c
+++ b/app/test-compress-perf/comp_perf_test_verify.c
@@ -388,7 +388,7 @@ cperf_verify_test_runner(void *test_ctx)
struct cperf_verify_ctx *ctx = test_ctx;
struct comp_test_data *test_data = ctx->options;
int ret = EXIT_SUCCESS;
- static rte_atomic16_t display_once = RTE_ATOMIC16_INIT(0);
+ static uint16_t display_once;
uint32_t lcore = rte_lcore_id();
ctx->mem.lcore_id = lcore;
@@ -427,8 +427,10 @@ cperf_verify_test_runner(void *test_ctx)
ctx->ratio = (double) ctx->comp_data_sz /
test_data->input_data_sz * 100;
+ uint16_t exp = 0;
if (!ctx->silent) {
- if (rte_atomic16_test_and_set(&display_once)) {
+ if (__atomic_compare_exchange_n(&display_once, &exp, 1, 0,
+ __ATOMIC_RELAXED, __ATOMIC_RELAXED)) {
printf("%12s%6s%12s%17s\n",
"lcore id", "Level", "Comp size", "Comp ratio [%]");
}
--
2.25.1
* [PATCH v3 10/12] app/testpmd: remove atomic operations for port status
2021-11-17 8:21 ` [PATCH v3 00/12] use compiler atomic builtins for app modules Joyce Kong
` (8 preceding siblings ...)
2021-11-17 8:21 ` [PATCH v3 09/12] app/compress: " Joyce Kong
@ 2021-11-17 8:21 ` Joyce Kong
2021-11-17 8:21 ` [PATCH v3 11/12] app/bbdev: use compiler atomics for shared data sync Joyce Kong
` (2 subsequent siblings)
12 siblings, 0 replies; 36+ messages in thread
From: Joyce Kong @ 2021-11-17 8:21 UTC (permalink / raw)
To: Xiaoyun Li; +Cc: dev, honnappa.nagarahalli, nd, Joyce Kong, Ruifeng Wang
Changes to port_status do not need to be atomic, because
the field is modified only during initialization or from
the testpmd prompt, never concurrently by multiple
threads.
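A minimal sketch of the resulting pattern, assuming a single control thread: the compare-and-set becomes an ordinary check followed by an assignment. The enum values, struct port and port_transition below are simplified, hypothetical stand-ins for illustration and are not testpmd definitions.

```c
#include <stdint.h>

/* Hypothetical stand-ins for testpmd's RTE_PORT_* states. */
enum port_state {
	PORT_STOPPED,
	PORT_HANDLING,
	PORT_STARTED,
	PORT_CLOSED
};

/* Simplified stand-in for the per-port structure. */
struct port {
	uint16_t port_status;
};

/*
 * Move the port from state `from` to state `to`.
 * Returns 1 on success, 0 if the port was in some other state.
 * This is only safe while a single thread changes port_status,
 * which is the assumption stated in the commit message.
 */
static int
port_transition(struct port *p, uint16_t from, uint16_t to)
{
	if (p->port_status != from)
		return 0;
	p->port_status = to;
	return 1;
}
```

The check and the store are two separate steps here, so another thread could slip in between them. That is acceptable only under the single-writer assumption above; code that lets several threads change the state would still need a compare-and-swap.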
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
---
app/test-pmd/testpmd.c | 58 ++++++++++++++++++++++--------------------
1 file changed, 31 insertions(+), 27 deletions(-)
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index a66dfb297c..ed472cacd2 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -36,7 +36,6 @@
#include <rte_alarm.h>
#include <rte_per_lcore.h>
#include <rte_lcore.h>
-#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_mempool.h>
#include <rte_malloc.h>
@@ -2521,9 +2520,9 @@ setup_hairpin_queues(portid_t pi, portid_t p_pi, uint16_t cnt_pi)
continue;
/* Fail to setup rx queue, return */
- if (rte_atomic16_cmpset(&(port->port_status),
- RTE_PORT_HANDLING,
- RTE_PORT_STOPPED) == 0)
+ if (port->port_status == RTE_PORT_HANDLING)
+ port->port_status = RTE_PORT_STOPPED;
+ else
fprintf(stderr,
"Port %d can not be set back to stopped\n", pi);
fprintf(stderr, "Fail to configure port %d hairpin queues\n",
@@ -2544,9 +2543,9 @@ setup_hairpin_queues(portid_t pi, portid_t p_pi, uint16_t cnt_pi)
continue;
/* Fail to setup rx queue, return */
- if (rte_atomic16_cmpset(&(port->port_status),
- RTE_PORT_HANDLING,
- RTE_PORT_STOPPED) == 0)
+ if (port->port_status == RTE_PORT_HANDLING)
+ port->port_status = RTE_PORT_STOPPED;
+ else
fprintf(stderr,
"Port %d can not be set back to stopped\n", pi);
fprintf(stderr, "Fail to configure port %d hairpin queues\n",
@@ -2729,8 +2728,9 @@ start_port(portid_t pid)
need_check_link_status = 0;
port = &ports[pi];
- if (rte_atomic16_cmpset(&(port->port_status), RTE_PORT_STOPPED,
- RTE_PORT_HANDLING) == 0) {
+ if (port->port_status == RTE_PORT_STOPPED)
+ port->port_status = RTE_PORT_HANDLING;
+ else {
fprintf(stderr, "Port %d is now not stopped\n", pi);
continue;
}
@@ -2766,8 +2766,9 @@ start_port(portid_t pid)
nb_txq + nb_hairpinq,
&(port->dev_conf));
if (diag != 0) {
- if (rte_atomic16_cmpset(&(port->port_status),
- RTE_PORT_HANDLING, RTE_PORT_STOPPED) == 0)
+ if (port->port_status == RTE_PORT_HANDLING)
+ port->port_status = RTE_PORT_STOPPED;
+ else
fprintf(stderr,
"Port %d can not be set back to stopped\n",
pi);
@@ -2828,9 +2829,9 @@ start_port(portid_t pid)
continue;
/* Fail to setup tx queue, return */
- if (rte_atomic16_cmpset(&(port->port_status),
- RTE_PORT_HANDLING,
- RTE_PORT_STOPPED) == 0)
+ if (port->port_status == RTE_PORT_HANDLING)
+ port->port_status = RTE_PORT_STOPPED;
+ else
fprintf(stderr,
"Port %d can not be set back to stopped\n",
pi);
@@ -2880,9 +2881,9 @@ start_port(portid_t pid)
continue;
/* Fail to setup rx queue, return */
- if (rte_atomic16_cmpset(&(port->port_status),
- RTE_PORT_HANDLING,
- RTE_PORT_STOPPED) == 0)
+ if (port->port_status == RTE_PORT_HANDLING)
+ port->port_status = RTE_PORT_STOPPED;
+ else
fprintf(stderr,
"Port %d can not be set back to stopped\n",
pi);
@@ -2917,16 +2918,18 @@ start_port(portid_t pid)
pi, rte_strerror(-diag));
/* Fail to setup rx queue, return */
- if (rte_atomic16_cmpset(&(port->port_status),
- RTE_PORT_HANDLING, RTE_PORT_STOPPED) == 0)
+ if (port->port_status == RTE_PORT_HANDLING)
+ port->port_status = RTE_PORT_STOPPED;
+ else
fprintf(stderr,
"Port %d can not be set back to stopped\n",
pi);
continue;
}
- if (rte_atomic16_cmpset(&(port->port_status),
- RTE_PORT_HANDLING, RTE_PORT_STARTED) == 0)
+ if (port->port_status == RTE_PORT_HANDLING)
+ port->port_status = RTE_PORT_STARTED;
+ else
fprintf(stderr, "Port %d can not be set into started\n",
pi);
@@ -3028,8 +3031,9 @@ stop_port(portid_t pid)
}
port = &ports[pi];
- if (rte_atomic16_cmpset(&(port->port_status), RTE_PORT_STARTED,
- RTE_PORT_HANDLING) == 0)
+ if (port->port_status == RTE_PORT_STARTED)
+ port->port_status = RTE_PORT_HANDLING;
+ else
continue;
if (hairpin_mode & 0xf) {
@@ -3055,8 +3059,9 @@ stop_port(portid_t pid)
RTE_LOG(ERR, EAL, "rte_eth_dev_stop failed for port %u\n",
pi);
- if (rte_atomic16_cmpset(&(port->port_status),
- RTE_PORT_HANDLING, RTE_PORT_STOPPED) == 0)
+ if (port->port_status == RTE_PORT_HANDLING)
+ port->port_status = RTE_PORT_STOPPED;
+ else
fprintf(stderr, "Port %d can not be set into stopped\n",
pi);
need_check_link_status = 1;
@@ -3119,8 +3124,7 @@ close_port(portid_t pid)
}
port = &ports[pi];
- if (rte_atomic16_cmpset(&(port->port_status),
- RTE_PORT_CLOSED, RTE_PORT_CLOSED) == 1) {
+ if (port->port_status == RTE_PORT_CLOSED) {
fprintf(stderr, "Port %d is already closed\n", pi);
continue;
}
--
2.25.1
* [PATCH v3 11/12] app/bbdev: use compiler atomics for shared data sync
2021-11-17 8:21 ` [PATCH v3 00/12] use compiler atomic builtins for app modules Joyce Kong
` (9 preceding siblings ...)
2021-11-17 8:21 ` [PATCH v3 10/12] app/testpmd: remove atomic operations for port status Joyce Kong
@ 2021-11-17 8:21 ` Joyce Kong
2021-11-17 8:22 ` [PATCH v3 12/12] app: remove unnecessary include of atomic header file Joyce Kong
2021-11-17 10:02 ` [PATCH v3 00/12] use compiler atomic builtins for app modules David Marchand
12 siblings, 0 replies; 36+ messages in thread
From: Joyce Kong @ 2021-11-17 8:21 UTC (permalink / raw)
To: Nicolas Chautru; +Cc: dev, honnappa.nagarahalli, nd, Joyce Kong, Ruifeng Wang
Convert rte_atomic usages to compiler atomic built-ins
for shared data synchronization in the bbdev test cases.
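The mapping used in this patch can be sketched in standalone C as follows. The struct, the helper names and the SYNC_WAIT/SYNC_START values are assumptions made for illustration, and wait_until_equal_16 is a plain spin loop standing in for DPDK's rte_wait_until_equal_16, which is not reproduced here.

```c
#include <stdint.h>

/* Assumed values; the real definitions live in test_bbdev_perf.c. */
#define SYNC_WAIT  0
#define SYNC_START 1

/* Hypothetical, reduced version of the shared test parameters. */
struct params {
	uint16_t sync;        /* was rte_atomic16_t */
	uint16_t nb_dequeued; /* was rte_atomic16_t */
};

/* rte_atomic16_set(&v, x) becomes a relaxed atomic store. */
static void
signal_start(struct params *p)
{
	__atomic_store_n(&p->sync, SYNC_START, __ATOMIC_RELAXED);
}

/*
 * Stand-in for rte_wait_until_equal_16(): spin until *addr == expected.
 * rte_atomic16_read(&v) becomes a relaxed atomic load.
 */
static void
wait_until_equal_16(uint16_t *addr, uint16_t expected)
{
	while (__atomic_load_n(addr, __ATOMIC_RELAXED) != expected)
		; /* a DPDK application would pause the core here */
}

/* rte_atomic16_add(&v, n) becomes a relaxed fetch-and-add;
 * the return value is the counter before the addition. */
static uint16_t
add_dequeued(struct params *p, uint16_t n)
{
	return __atomic_fetch_add(&p->nb_dequeued, n, __ATOMIC_RELAXED);
}
```

One detail worth noting: the removed loops spun while sync was equal to SYNC_WAIT, whereas the replacement waits until sync equals SYNC_START. The two forms behave the same only as long as sync takes no other value.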
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
---
app/test-bbdev/test_bbdev_perf.c | 135 ++++++++++++++-----------------
1 file changed, 59 insertions(+), 76 deletions(-)
diff --git a/app/test-bbdev/test_bbdev_perf.c b/app/test-bbdev/test_bbdev_perf.c
index 7b4529789b..0fa119a502 100644
--- a/app/test-bbdev/test_bbdev_perf.c
+++ b/app/test-bbdev/test_bbdev_perf.c
@@ -133,7 +133,7 @@ struct test_op_params {
uint16_t num_to_process;
uint16_t num_lcores;
int vector_mask;
- rte_atomic16_t sync;
+ uint16_t sync;
struct test_buffers q_bufs[RTE_MAX_NUMA_NODES][MAX_QUEUES];
};
@@ -148,9 +148,9 @@ struct thread_params {
uint8_t iter_count;
double iter_average;
double bler;
- rte_atomic16_t nb_dequeued;
- rte_atomic16_t processing_status;
- rte_atomic16_t burst_sz;
+ uint16_t nb_dequeued;
+ int16_t processing_status;
+ uint16_t burst_sz;
struct test_op_params *op_params;
struct rte_bbdev_dec_op *dec_ops[MAX_BURST];
struct rte_bbdev_enc_op *enc_ops[MAX_BURST];
@@ -2637,46 +2637,46 @@ dequeue_event_callback(uint16_t dev_id,
}
if (unlikely(event != RTE_BBDEV_EVENT_DEQUEUE)) {
- rte_atomic16_set(&tp->processing_status, TEST_FAILED);
+ __atomic_store_n(&tp->processing_status, TEST_FAILED, __ATOMIC_RELAXED);
printf(
"Dequeue interrupt handler called for incorrect event!\n");
return;
}
- burst_sz = rte_atomic16_read(&tp->burst_sz);
+ burst_sz = __atomic_load_n(&tp->burst_sz, __ATOMIC_RELAXED);
num_ops = tp->op_params->num_to_process;
if (test_vector.op_type == RTE_BBDEV_OP_TURBO_DEC)
deq = rte_bbdev_dequeue_dec_ops(dev_id, queue_id,
&tp->dec_ops[
- rte_atomic16_read(&tp->nb_dequeued)],
+ __atomic_load_n(&tp->nb_dequeued, __ATOMIC_RELAXED)],
burst_sz);
else if (test_vector.op_type == RTE_BBDEV_OP_LDPC_DEC)
deq = rte_bbdev_dequeue_ldpc_dec_ops(dev_id, queue_id,
&tp->dec_ops[
- rte_atomic16_read(&tp->nb_dequeued)],
+ __atomic_load_n(&tp->nb_dequeued, __ATOMIC_RELAXED)],
burst_sz);
else if (test_vector.op_type == RTE_BBDEV_OP_LDPC_ENC)
deq = rte_bbdev_dequeue_ldpc_enc_ops(dev_id, queue_id,
&tp->enc_ops[
- rte_atomic16_read(&tp->nb_dequeued)],
+ __atomic_load_n(&tp->nb_dequeued, __ATOMIC_RELAXED)],
burst_sz);
else /*RTE_BBDEV_OP_TURBO_ENC*/
deq = rte_bbdev_dequeue_enc_ops(dev_id, queue_id,
&tp->enc_ops[
- rte_atomic16_read(&tp->nb_dequeued)],
+ __atomic_load_n(&tp->nb_dequeued, __ATOMIC_RELAXED)],
burst_sz);
if (deq < burst_sz) {
printf(
"After receiving the interrupt all operations should be dequeued. Expected: %u, got: %u\n",
burst_sz, deq);
- rte_atomic16_set(&tp->processing_status, TEST_FAILED);
+ __atomic_store_n(&tp->processing_status, TEST_FAILED, __ATOMIC_RELAXED);
return;
}
- if (rte_atomic16_read(&tp->nb_dequeued) + deq < num_ops) {
- rte_atomic16_add(&tp->nb_dequeued, deq);
+ if (__atomic_load_n(&tp->nb_dequeued, __ATOMIC_RELAXED) + deq < num_ops) {
+ __atomic_fetch_add(&tp->nb_dequeued, deq, __ATOMIC_RELAXED);
return;
}
@@ -2713,7 +2713,7 @@ dequeue_event_callback(uint16_t dev_id,
if (ret) {
printf("Buffers validation failed\n");
- rte_atomic16_set(&tp->processing_status, TEST_FAILED);
+ __atomic_store_n(&tp->processing_status, TEST_FAILED, __ATOMIC_RELAXED);
}
switch (test_vector.op_type) {
@@ -2734,7 +2734,7 @@ dequeue_event_callback(uint16_t dev_id,
break;
default:
printf("Unknown op type: %d\n", test_vector.op_type);
- rte_atomic16_set(&tp->processing_status, TEST_FAILED);
+ __atomic_store_n(&tp->processing_status, TEST_FAILED, __ATOMIC_RELAXED);
return;
}
@@ -2743,7 +2743,7 @@ dequeue_event_callback(uint16_t dev_id,
tp->mbps += (((double)(num_ops * tb_len_bits)) / 1000000.0) /
((double)total_time / (double)rte_get_tsc_hz());
- rte_atomic16_add(&tp->nb_dequeued, deq);
+ __atomic_fetch_add(&tp->nb_dequeued, deq, __ATOMIC_RELAXED);
}
static int
@@ -2781,11 +2781,10 @@ throughput_intr_lcore_ldpc_dec(void *arg)
bufs = &tp->op_params->q_bufs[GET_SOCKET(info.socket_id)][queue_id];
- rte_atomic16_clear(&tp->processing_status);
- rte_atomic16_clear(&tp->nb_dequeued);
+ __atomic_store_n(&tp->processing_status, 0, __ATOMIC_RELAXED);
+ __atomic_store_n(&tp->nb_dequeued, 0, __ATOMIC_RELAXED);
- while (rte_atomic16_read(&tp->op_params->sync) == SYNC_WAIT)
- rte_pause();
+ rte_wait_until_equal_16(&tp->op_params->sync, SYNC_START, __ATOMIC_RELAXED);
ret = rte_bbdev_dec_op_alloc_bulk(tp->op_params->mp, ops,
num_to_process);
@@ -2833,17 +2832,15 @@ throughput_intr_lcore_ldpc_dec(void *arg)
* the number of operations is not a multiple of
* burst size.
*/
- rte_atomic16_set(&tp->burst_sz, num_to_enq);
+ __atomic_store_n(&tp->burst_sz, num_to_enq, __ATOMIC_RELAXED);
/* Wait until processing of previous batch is
* completed
*/
- while (rte_atomic16_read(&tp->nb_dequeued) !=
- (int16_t) enqueued)
- rte_pause();
+ rte_wait_until_equal_16(&tp->nb_dequeued, enqueued, __ATOMIC_RELAXED);
}
if (j != TEST_REPETITIONS - 1)
- rte_atomic16_clear(&tp->nb_dequeued);
+ __atomic_store_n(&tp->nb_dequeued, 0, __ATOMIC_RELAXED);
}
return TEST_SUCCESS;
@@ -2878,11 +2875,10 @@ throughput_intr_lcore_dec(void *arg)
bufs = &tp->op_params->q_bufs[GET_SOCKET(info.socket_id)][queue_id];
- rte_atomic16_clear(&tp->processing_status);
- rte_atomic16_clear(&tp->nb_dequeued);
+ __atomic_store_n(&tp->processing_status, 0, __ATOMIC_RELAXED);
+ __atomic_store_n(&tp->nb_dequeued, 0, __ATOMIC_RELAXED);
- while (rte_atomic16_read(&tp->op_params->sync) == SYNC_WAIT)
- rte_pause();
+ rte_wait_until_equal_16(&tp->op_params->sync, SYNC_START, __ATOMIC_RELAXED);
ret = rte_bbdev_dec_op_alloc_bulk(tp->op_params->mp, ops,
num_to_process);
@@ -2923,17 +2919,15 @@ throughput_intr_lcore_dec(void *arg)
* the number of operations is not a multiple of
* burst size.
*/
- rte_atomic16_set(&tp->burst_sz, num_to_enq);
+ __atomic_store_n(&tp->burst_sz, num_to_enq, __ATOMIC_RELAXED);
/* Wait until processing of previous batch is
* completed
*/
- while (rte_atomic16_read(&tp->nb_dequeued) !=
- (int16_t) enqueued)
- rte_pause();
+ rte_wait_until_equal_16(&tp->nb_dequeued, enqueued, __ATOMIC_RELAXED);
}
if (j != TEST_REPETITIONS - 1)
- rte_atomic16_clear(&tp->nb_dequeued);
+ __atomic_store_n(&tp->nb_dequeued, 0, __ATOMIC_RELAXED);
}
return TEST_SUCCESS;
@@ -2968,11 +2962,10 @@ throughput_intr_lcore_enc(void *arg)
bufs = &tp->op_params->q_bufs[GET_SOCKET(info.socket_id)][queue_id];
- rte_atomic16_clear(&tp->processing_status);
- rte_atomic16_clear(&tp->nb_dequeued);
+ __atomic_store_n(&tp->processing_status, 0, __ATOMIC_RELAXED);
+ __atomic_store_n(&tp->nb_dequeued, 0, __ATOMIC_RELAXED);
- while (rte_atomic16_read(&tp->op_params->sync) == SYNC_WAIT)
- rte_pause();
+ rte_wait_until_equal_16(&tp->op_params->sync, SYNC_START, __ATOMIC_RELAXED);
ret = rte_bbdev_enc_op_alloc_bulk(tp->op_params->mp, ops,
num_to_process);
@@ -3012,17 +3005,15 @@ throughput_intr_lcore_enc(void *arg)
* the number of operations is not a multiple of
* burst size.
*/
- rte_atomic16_set(&tp->burst_sz, num_to_enq);
+ __atomic_store_n(&tp->burst_sz, num_to_enq, __ATOMIC_RELAXED);
/* Wait until processing of previous batch is
* completed
*/
- while (rte_atomic16_read(&tp->nb_dequeued) !=
- (int16_t) enqueued)
- rte_pause();
+ rte_wait_until_equal_16(&tp->nb_dequeued, enqueued, __ATOMIC_RELAXED);
}
if (j != TEST_REPETITIONS - 1)
- rte_atomic16_clear(&tp->nb_dequeued);
+ __atomic_store_n(&tp->nb_dequeued, 0, __ATOMIC_RELAXED);
}
return TEST_SUCCESS;
@@ -3058,11 +3049,10 @@ throughput_intr_lcore_ldpc_enc(void *arg)
bufs = &tp->op_params->q_bufs[GET_SOCKET(info.socket_id)][queue_id];
- rte_atomic16_clear(&tp->processing_status);
- rte_atomic16_clear(&tp->nb_dequeued);
+ __atomic_store_n(&tp->processing_status, 0, __ATOMIC_RELAXED);
+ __atomic_store_n(&tp->nb_dequeued, 0, __ATOMIC_RELAXED);
- while (rte_atomic16_read(&tp->op_params->sync) == SYNC_WAIT)
- rte_pause();
+ rte_wait_until_equal_16(&tp->op_params->sync, SYNC_START, __ATOMIC_RELAXED);
ret = rte_bbdev_enc_op_alloc_bulk(tp->op_params->mp, ops,
num_to_process);
@@ -3104,17 +3094,15 @@ throughput_intr_lcore_ldpc_enc(void *arg)
* the number of operations is not a multiple of
* burst size.
*/
- rte_atomic16_set(&tp->burst_sz, num_to_enq);
+ __atomic_store_n(&tp->burst_sz, num_to_enq, __ATOMIC_RELAXED);
/* Wait until processing of previous batch is
* completed
*/
- while (rte_atomic16_read(&tp->nb_dequeued) !=
- (int16_t) enqueued)
- rte_pause();
+ rte_wait_until_equal_16(&tp->nb_dequeued, enqueued, __ATOMIC_RELAXED);
}
if (j != TEST_REPETITIONS - 1)
- rte_atomic16_clear(&tp->nb_dequeued);
+ __atomic_store_n(&tp->nb_dequeued, 0, __ATOMIC_RELAXED);
}
return TEST_SUCCESS;
@@ -3148,8 +3136,7 @@ throughput_pmd_lcore_dec(void *arg)
bufs = &tp->op_params->q_bufs[GET_SOCKET(info.socket_id)][queue_id];
- while (rte_atomic16_read(&tp->op_params->sync) == SYNC_WAIT)
- rte_pause();
+ rte_wait_until_equal_16(&tp->op_params->sync, SYNC_START, __ATOMIC_RELAXED);
ret = rte_bbdev_dec_op_alloc_bulk(tp->op_params->mp, ops_enq, num_ops);
TEST_ASSERT_SUCCESS(ret, "Allocation failed for %d ops", num_ops);
@@ -3252,8 +3239,7 @@ bler_pmd_lcore_ldpc_dec(void *arg)
bufs = &tp->op_params->q_bufs[GET_SOCKET(info.socket_id)][queue_id];
- while (rte_atomic16_read(&tp->op_params->sync) == SYNC_WAIT)
- rte_pause();
+ rte_wait_until_equal_16(&tp->op_params->sync, SYNC_START, __ATOMIC_RELAXED);
ret = rte_bbdev_dec_op_alloc_bulk(tp->op_params->mp, ops_enq, num_ops);
TEST_ASSERT_SUCCESS(ret, "Allocation failed for %d ops", num_ops);
@@ -3382,8 +3368,7 @@ throughput_pmd_lcore_ldpc_dec(void *arg)
bufs = &tp->op_params->q_bufs[GET_SOCKET(info.socket_id)][queue_id];
- while (rte_atomic16_read(&tp->op_params->sync) == SYNC_WAIT)
- rte_pause();
+ rte_wait_until_equal_16(&tp->op_params->sync, SYNC_START, __ATOMIC_RELAXED);
ret = rte_bbdev_dec_op_alloc_bulk(tp->op_params->mp, ops_enq, num_ops);
TEST_ASSERT_SUCCESS(ret, "Allocation failed for %d ops", num_ops);
@@ -3499,8 +3484,7 @@ throughput_pmd_lcore_enc(void *arg)
bufs = &tp->op_params->q_bufs[GET_SOCKET(info.socket_id)][queue_id];
- while (rte_atomic16_read(&tp->op_params->sync) == SYNC_WAIT)
- rte_pause();
+ rte_wait_until_equal_16(&tp->op_params->sync, SYNC_START, __ATOMIC_RELAXED);
ret = rte_bbdev_enc_op_alloc_bulk(tp->op_params->mp, ops_enq,
num_ops);
@@ -3590,8 +3574,7 @@ throughput_pmd_lcore_ldpc_enc(void *arg)
bufs = &tp->op_params->q_bufs[GET_SOCKET(info.socket_id)][queue_id];
- while (rte_atomic16_read(&tp->op_params->sync) == SYNC_WAIT)
- rte_pause();
+ rte_wait_until_equal_16(&tp->op_params->sync, SYNC_START, __ATOMIC_RELAXED);
ret = rte_bbdev_enc_op_alloc_bulk(tp->op_params->mp, ops_enq,
num_ops);
@@ -3774,7 +3757,7 @@ bler_test(struct active_device *ad,
else
return TEST_SKIPPED;
- rte_atomic16_set(&op_params->sync, SYNC_WAIT);
+ __atomic_store_n(&op_params->sync, SYNC_WAIT, __ATOMIC_RELAXED);
/* Main core is set at first entry */
t_params[0].dev_id = ad->dev_id;
@@ -3797,7 +3780,7 @@ bler_test(struct active_device *ad,
&t_params[used_cores++], lcore_id);
}
- rte_atomic16_set(&op_params->sync, SYNC_START);
+ __atomic_store_n(&op_params->sync, SYNC_START, __ATOMIC_RELAXED);
ret = bler_function(&t_params[0]);
/* Main core is always used */
@@ -3892,7 +3875,7 @@ throughput_test(struct active_device *ad,
throughput_function = throughput_pmd_lcore_enc;
}
- rte_atomic16_set(&op_params->sync, SYNC_WAIT);
+ __atomic_store_n(&op_params->sync, SYNC_WAIT, __ATOMIC_RELAXED);
/* Main core is set at first entry */
t_params[0].dev_id = ad->dev_id;
@@ -3915,7 +3898,7 @@ throughput_test(struct active_device *ad,
&t_params[used_cores++], lcore_id);
}
- rte_atomic16_set(&op_params->sync, SYNC_START);
+ __atomic_store_n(&op_params->sync, SYNC_START, __ATOMIC_RELAXED);
ret = throughput_function(&t_params[0]);
/* Main core is always used */
@@ -3945,29 +3928,29 @@ throughput_test(struct active_device *ad,
* Wait for main lcore operations.
*/
tp = &t_params[0];
- while ((rte_atomic16_read(&tp->nb_dequeued) <
- op_params->num_to_process) &&
- (rte_atomic16_read(&tp->processing_status) !=
- TEST_FAILED))
+ while ((__atomic_load_n(&tp->nb_dequeued, __ATOMIC_RELAXED) <
+ op_params->num_to_process) &&
+ (__atomic_load_n(&tp->processing_status, __ATOMIC_RELAXED) !=
+ TEST_FAILED))
rte_pause();
tp->ops_per_sec /= TEST_REPETITIONS;
tp->mbps /= TEST_REPETITIONS;
- ret |= (int)rte_atomic16_read(&tp->processing_status);
+ ret |= (int)__atomic_load_n(&tp->processing_status, __ATOMIC_RELAXED);
/* Wait for worker lcores operations */
for (used_cores = 1; used_cores < num_lcores; used_cores++) {
tp = &t_params[used_cores];
- while ((rte_atomic16_read(&tp->nb_dequeued) <
- op_params->num_to_process) &&
- (rte_atomic16_read(&tp->processing_status) !=
- TEST_FAILED))
+ while ((__atomic_load_n(&tp->nb_dequeued, __ATOMIC_RELAXED) <
+ op_params->num_to_process) &&
+ (__atomic_load_n(&tp->processing_status, __ATOMIC_RELAXED) !=
+ TEST_FAILED))
rte_pause();
tp->ops_per_sec /= TEST_REPETITIONS;
tp->mbps /= TEST_REPETITIONS;
- ret |= (int)rte_atomic16_read(&tp->processing_status);
+ ret |= (int)__atomic_load_n(&tp->processing_status, __ATOMIC_RELAXED);
}
/* Print throughput if test passed */
--
2.25.1
* [PATCH v3 12/12] app: remove unnecessary include of atomic header file
2021-11-17 8:21 ` [PATCH v3 00/12] use compiler atomic builtins for app modules Joyce Kong
` (10 preceding siblings ...)
2021-11-17 8:21 ` [PATCH v3 11/12] app/bbdev: use compiler atomics for shared data sync Joyce Kong
@ 2021-11-17 8:22 ` Joyce Kong
2021-11-17 10:02 ` [PATCH v3 00/12] use compiler atomic builtins for app modules David Marchand
12 siblings, 0 replies; 36+ messages in thread
From: Joyce Kong @ 2021-11-17 8:22 UTC (permalink / raw)
To: Maryam Tahhan, Reshma Pattan, Cristian Dumitrescu, Xiaoyun Li,
Erik Gabriel Carrillo, Olivier Matz, Anatoly Burakov,
Honnappa Nagarahalli, Konstantin Ananyev
Cc: dev, nd, Joyce Kong, Ruifeng Wang
Remove unnecessary includes of rte_atomic.h from app modules.
Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
app/proc-info/main.c | 1 -
app/test-pipeline/config.c | 1 -
app/test-pipeline/init.c | 1 -
app/test-pipeline/main.c | 1 -
app/test-pipeline/runtime.c | 1 -
app/test-pmd/cmdline.c | 1 -
app/test-pmd/config.c | 1 -
app/test-pmd/csumonly.c | 1 -
app/test-pmd/flowgen.c | 1 -
app/test-pmd/icmpecho.c | 1 -
app/test-pmd/iofwd.c | 1 -
app/test-pmd/macfwd.c | 1 -
app/test-pmd/macswap.c | 1 -
app/test-pmd/parameters.c | 1 -
app/test-pmd/rxonly.c | 1 -
app/test-pmd/txonly.c | 1 -
app/test/commands.c | 1 -
app/test/test_barrier.c | 1 -
app/test/test_event_timer_adapter.c | 1 -
app/test/test_mbuf.c | 1 -
app/test/test_mp_secondary.c | 1 -
app/test/test_ring.c | 1 -
22 files changed, 22 deletions(-)
diff --git a/app/proc-info/main.c b/app/proc-info/main.c
index a4271047e6..ebe2d77264 100644
--- a/app/proc-info/main.c
+++ b/app/proc-info/main.c
@@ -27,7 +27,6 @@
#include <rte_per_lcore.h>
#include <rte_lcore.h>
#include <rte_log.h>
-#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_string_fns.h>
#include <rte_metrics.h>
diff --git a/app/test-pipeline/config.c b/app/test-pipeline/config.c
index 33f3f1c827..daf838948b 100644
--- a/app/test-pipeline/config.c
+++ b/app/test-pipeline/config.c
@@ -21,7 +21,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_launch.h>
-#include <rte_atomic.h>
#include <rte_cycles.h>
#include <rte_prefetch.h>
#include <rte_lcore.h>
diff --git a/app/test-pipeline/init.c b/app/test-pipeline/init.c
index c738019041..eee0719b67 100644
--- a/app/test-pipeline/init.c
+++ b/app/test-pipeline/init.c
@@ -21,7 +21,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_launch.h>
-#include <rte_atomic.h>
#include <rte_cycles.h>
#include <rte_prefetch.h>
#include <rte_lcore.h>
diff --git a/app/test-pipeline/main.c b/app/test-pipeline/main.c
index 72e4797ff2..1e16794183 100644
--- a/app/test-pipeline/main.c
+++ b/app/test-pipeline/main.c
@@ -22,7 +22,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_launch.h>
-#include <rte_atomic.h>
#include <rte_cycles.h>
#include <rte_prefetch.h>
#include <rte_lcore.h>
diff --git a/app/test-pipeline/runtime.c b/app/test-pipeline/runtime.c
index 159192bcd8..d939a85d7e 100644
--- a/app/test-pipeline/runtime.c
+++ b/app/test-pipeline/runtime.c
@@ -21,7 +21,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_launch.h>
-#include <rte_atomic.h>
#include <rte_cycles.h>
#include <rte_prefetch.h>
#include <rte_branch_prediction.h>
diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 4f51b259fe..4e93f535ff 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -24,7 +24,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_lcore.h>
-#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_ring.h>
#include <rte_mempool.h>
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 26cadf39f7..d8b5032b58 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -27,7 +27,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_lcore.h>
-#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_mempool.h>
#include <rte_mbuf.h>
diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index 8526d9158a..e0b00abe8c 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -24,7 +24,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_lcore.h>
-#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_mempool.h>
#include <rte_mbuf.h>
diff --git a/app/test-pmd/flowgen.c b/app/test-pmd/flowgen.c
index 5737eaa105..9ceef3b54a 100644
--- a/app/test-pmd/flowgen.c
+++ b/app/test-pmd/flowgen.c
@@ -24,7 +24,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_lcore.h>
-#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_mempool.h>
#include <rte_mbuf.h>
diff --git a/app/test-pmd/icmpecho.c b/app/test-pmd/icmpecho.c
index 8f1d68a83a..3a85ec3dd1 100644
--- a/app/test-pmd/icmpecho.c
+++ b/app/test-pmd/icmpecho.c
@@ -20,7 +20,6 @@
#include <rte_cycles.h>
#include <rte_per_lcore.h>
#include <rte_lcore.h>
-#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_memory.h>
#include <rte_mempool.h>
diff --git a/app/test-pmd/iofwd.c b/app/test-pmd/iofwd.c
index 83d098adcb..19cd920f70 100644
--- a/app/test-pmd/iofwd.c
+++ b/app/test-pmd/iofwd.c
@@ -23,7 +23,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_lcore.h>
-#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_memcpy.h>
#include <rte_mempool.h>
diff --git a/app/test-pmd/macfwd.c b/app/test-pmd/macfwd.c
index ac50d0b9f8..812a0c721f 100644
--- a/app/test-pmd/macfwd.c
+++ b/app/test-pmd/macfwd.c
@@ -24,7 +24,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_lcore.h>
-#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_mempool.h>
#include <rte_mbuf.h>
diff --git a/app/test-pmd/macswap.c b/app/test-pmd/macswap.c
index 310bca06af..4627ff83e9 100644
--- a/app/test-pmd/macswap.c
+++ b/app/test-pmd/macswap.c
@@ -24,7 +24,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_lcore.h>
-#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_mempool.h>
#include <rte_mbuf.h>
diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
index 0974b0a38f..2f4f944efa 100644
--- a/app/test-pmd/parameters.c
+++ b/app/test-pmd/parameters.c
@@ -30,7 +30,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_lcore.h>
-#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_mempool.h>
#include <rte_interrupts.h>
diff --git a/app/test-pmd/rxonly.c b/app/test-pmd/rxonly.c
index c78fc4609a..d1a579d8d8 100644
--- a/app/test-pmd/rxonly.c
+++ b/app/test-pmd/rxonly.c
@@ -24,7 +24,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_lcore.h>
-#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_mempool.h>
#include <rte_mbuf.h>
diff --git a/app/test-pmd/txonly.c b/app/test-pmd/txonly.c
index 34bb538379..b8497e733d 100644
--- a/app/test-pmd/txonly.c
+++ b/app/test-pmd/txonly.c
@@ -24,7 +24,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_lcore.h>
-#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_mempool.h>
#include <rte_mbuf.h>
diff --git a/app/test/commands.c b/app/test/commands.c
index 76f6ee5d23..2dced3bc44 100644
--- a/app/test/commands.c
+++ b/app/test/commands.c
@@ -25,7 +25,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_lcore.h>
-#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_ring.h>
#include <rte_malloc.h>
diff --git a/app/test/test_barrier.c b/app/test/test_barrier.c
index c27f8a0742..898c2516ed 100644
--- a/app/test/test_barrier.c
+++ b/app/test/test_barrier.c
@@ -24,7 +24,6 @@
#include <rte_memory.h>
#include <rte_per_lcore.h>
#include <rte_launch.h>
-#include <rte_atomic.h>
#include <rte_eal.h>
#include <rte_lcore.h>
#include <rte_pause.h>
diff --git a/app/test/test_event_timer_adapter.c b/app/test/test_event_timer_adapter.c
index 12c00e678e..25bac2d155 100644
--- a/app/test/test_event_timer_adapter.c
+++ b/app/test/test_event_timer_adapter.c
@@ -5,7 +5,6 @@
#include <math.h>
-#include <rte_atomic.h>
#include <rte_common.h>
#include <rte_cycles.h>
#include <rte_debug.h>
diff --git a/app/test/test_mbuf.c b/app/test/test_mbuf.c
index f93bcef8a9..d53126710f 100644
--- a/app/test/test_mbuf.c
+++ b/app/test/test_mbuf.c
@@ -21,7 +21,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_lcore.h>
-#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_ring.h>
#include <rte_mempool.h>
diff --git a/app/test/test_mp_secondary.c b/app/test/test_mp_secondary.c
index 5b6f05dbb1..021ca0547f 100644
--- a/app/test/test_mp_secondary.c
+++ b/app/test/test_mp_secondary.c
@@ -28,7 +28,6 @@
#include <rte_lcore.h>
#include <rte_errno.h>
#include <rte_branch_prediction.h>
-#include <rte_atomic.h>
#include <rte_ring.h>
#include <rte_debug.h>
#include <rte_log.h>
diff --git a/app/test/test_ring.c b/app/test/test_ring.c
index fb8532a409..bde33ab4a1 100644
--- a/app/test/test_ring.c
+++ b/app/test/test_ring.c
@@ -20,7 +20,6 @@
#include <rte_eal.h>
#include <rte_per_lcore.h>
#include <rte_lcore.h>
-#include <rte_atomic.h>
#include <rte_branch_prediction.h>
#include <rte_malloc.h>
#include <rte_ring.h>
--
2.25.1
* Re: [PATCH v2 03/12] test/timer: use compiler atomic builtins for sync
2021-11-16 21:21 ` Honnappa Nagarahalli
@ 2021-11-17 9:29 ` David Marchand
0 siblings, 0 replies; 36+ messages in thread
From: David Marchand @ 2021-11-17 9:29 UTC (permalink / raw)
To: Honnappa Nagarahalli
Cc: Joyce Kong, Robert Sanford, Erik Gabriel Carrillo, dev, nd, Ruifeng Wang
On Tue, Nov 16, 2021 at 10:21 PM Honnappa Nagarahalli
<Honnappa.Nagarahalli@arm.com> wrote:
> > Joyce, Honnappa,
> >
> > On Tue, Nov 16, 2021 at 10:43 AM Joyce Kong <joyce.kong@arm.com> wrote:
> > >
> > > Convert rte_atomic usages to compiler atomic built-ins for lcore_state
> > > and collisions sync.
> > >
> > > Also, move 'main_init_workers' outside of 'timer_stress2_main_loop' to
> > > guarantee that lcore_state is initialized correctly before the threads
> > > are launched.
> >
> > Is this "also" part actually related to the change?
> > Or is it a separate fix?
> The 'also' part does not fix a separate problem (i.e. the earlier code had no issue). It just helps keep the code simple.
This is indeed better this way.
Thanks.
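
As a rough illustration of the point discussed above (not the actual test code), the sketch below initializes a per-worker lcore_state array before any thread is launched and then synchronizes with the GCC/Clang __atomic built-ins. The names lcore_state, collisions and main_init_workers come from the commit log; the state values, NB_WORKERS, worker_loop and run_timer_sync_sketch are hypothetical, and plain pthreads stand in for the EAL launch calls.

```c
#include <pthread.h>
#include <stdint.h>

#define NB_WORKERS 4	/* hypothetical worker count */

/* Hypothetical state values, for illustration only. */
enum { WORKER_WAITING = 1, WORKER_RUN_SIGNAL, WORKER_FINISHED };

static int lcore_state[NB_WORKERS];
static uint32_t collisions;

/* Runs on the main thread before any worker exists. */
static void main_init_workers(void)
{
	int i;

	for (i = 0; i < NB_WORKERS; i++)
		__atomic_store_n(&lcore_state[i], WORKER_WAITING,
				 __ATOMIC_RELAXED);
}

static void *worker_loop(void *arg)
{
	int id = (int)(intptr_t)arg;

	/* Acquire pairs with the release store of the run signal. */
	while (__atomic_load_n(&lcore_state[id], __ATOMIC_ACQUIRE) !=
			WORKER_RUN_SIGNAL)
		;

	/* A pure counter does not order other data, so relaxed is enough. */
	__atomic_fetch_add(&collisions, 1, __ATOMIC_RELAXED);

	/* Release publishes the work done above to the main thread. */
	__atomic_store_n(&lcore_state[id], WORKER_FINISHED, __ATOMIC_RELEASE);
	return NULL;
}

/* Returns the final counter value, or -1 on failure. */
int run_timer_sync_sketch(void)
{
	pthread_t tid[NB_WORKERS];
	int i, created = 0;

	/* Initialize shared state before launching any thread. */
	main_init_workers();
	__atomic_store_n(&collisions, 0, __ATOMIC_RELAXED);

	for (i = 0; i < NB_WORKERS; i++) {
		if (pthread_create(&tid[i], NULL, worker_loop,
				   (void *)(intptr_t)i) != 0)
			break;
		created++;
	}

	for (i = 0; i < created; i++)
		__atomic_store_n(&lcore_state[i], WORKER_RUN_SIGNAL,
				 __ATOMIC_RELEASE);
	for (i = 0; i < created; i++)
		pthread_join(tid[i], NULL);

	if (created != NB_WORKERS)
		return -1;
	for (i = 0; i < NB_WORKERS; i++)
		if (__atomic_load_n(&lcore_state[i], __ATOMIC_ACQUIRE) !=
				WORKER_FINISHED)
			return -1;
	return (int)__atomic_load_n(&collisions, __ATOMIC_RELAXED);
}
```

Because the initialization happens before thread creation, a worker can never observe an uninitialized lcore_state entry, which is why doing it ahead of the launch keeps the reasoning simple.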
--
David Marchand
* Re: [PATCH v3 00/12] use compiler atomic builtins for app modules
2021-11-17 8:21 ` [PATCH v3 00/12] use compiler atomic builtins for app modules Joyce Kong
` (11 preceding siblings ...)
2021-11-17 8:22 ` [PATCH v3 12/12] app: remove unnecessary include of atomic header file Joyce Kong
@ 2021-11-17 10:02 ` David Marchand
12 siblings, 0 replies; 36+ messages in thread
From: David Marchand @ 2021-11-17 10:02 UTC (permalink / raw)
To: Joyce Kong; +Cc: dev, Honnappa Nagarahalli, nd
On Wed, Nov 17, 2021 at 9:22 AM Joyce Kong <joyce.kong@arm.com> wrote:
>
> Since atomic operations have been adopted in DPDK now[1],
> change rte_atomicNN_xxx APIs to compiler atomic built-ins
> in app modules[2].
>
> [1] https://www.dpdk.org/blog/2021/03/26/dpdk-adopts-the-c11-memory-model/
> [2] https://doc.dpdk.org/guides/rel_notes/deprecation.html
>
> v3:
> 1. In pmd_perf test case, move the initialization of polling
> start before calling rte_eal_remote_launch, so the update
> is visible to the worker threads. (Honnappa Nagarahalli)
> 2. Remove the remaining rte_atomic.h includes that were missed in v2. (David Marchand)
>
> v2:
> By Honnappa Nagarahalli:
> 1. Replace the RELAXED barriers with suitable ones for shared
> data sync in pmd_perf and timer test cases.
> 2. Avoid unnecessary atomic operations in compress and testpmd
> modules.
> 3. Fix some typos.
>
> Joyce Kong (12):
> test/pmd_perf: use compiler atomic builtins for polling sync
> test/ring_perf: use compiler atomic builtins for lcores sync
> test/timer: use compiler atomic builtins for sync
> test/stack_perf: use compiler atomics for lcore sync
> test/bpf: use compiler atomics for calculation
> test/func_reentrancy: use compiler atomics for data sync
> app/eventdev: use compiler atomics for shared data sync
> app/crypto: use compiler atomic builtins for display sync
> app/compress: use compiler atomic builtins for display sync
> app/testpmd: remove atomic operations for port status
> app/bbdev: use compiler atomics for shared data sync
> app: remove unnecessary include of atomic header file
There were cleanups of unneeded rte_atomic.h inclusion along the series:
I moved all of them to the last patch so that patches focus on what
their commitlog describes.
Series applied, thanks.
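
For readers skimming the thread, the kind of conversion listed in the cover letter can be sketched roughly as below. This is only an assumption about the general pattern, not code taken from the patches: display_once, nb_events, claim_display and count_event are hypothetical names, and the legacy calls mentioned in the comments are the rte_atomic APIs that such built-ins typically replace.

```c
#include <stdint.h>

static uint16_t display_once;	/* hypothetical "print header once" flag */
static uint32_t nb_events;	/* hypothetical shared counter */

/*
 * Legacy style: rte_atomic16_test_and_set(&display_once).
 * Returns 1 for the first caller only, 0 for everyone else.
 */
int claim_display(void)
{
	uint16_t expected = 0;

	/* Strong CAS; relaxed ordering since no other data depends on it. */
	return __atomic_compare_exchange_n(&display_once, &expected, 1, 0,
					   __ATOMIC_RELAXED, __ATOMIC_RELAXED);
}

/*
 * Legacy style: rte_atomic32_inc(&cnt) followed by rte_atomic32_read(&cnt).
 * Returns the counter value after the increment.
 */
uint32_t count_event(void)
{
	return __atomic_add_fetch(&nb_events, 1, __ATOMIC_RELAXED);
}
```

Relaxed ordering is used here only because neither variable guards other shared data; where a flag publishes data to another thread, release/acquire ordering would be the safer choice, as noted in the v2 changelog.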
--
David Marchand
Thread overview: 36+ messages
2021-11-16 9:41 [PATCH v2 00/12] use compiler atomic builtins for app modules Joyce Kong
2021-11-16 9:41 ` [PATCH v2 01/12] test/pmd_perf: use compiler atomic builtins for polling sync Joyce Kong
2021-11-16 21:30 ` Honnappa Nagarahalli
2021-11-16 9:41 ` [PATCH v2 02/12] test/ring_perf: use compiler atomic builtins for lcores sync Joyce Kong
2021-11-16 9:41 ` [PATCH v2 03/12] test/timer: use compiler atomic builtins for sync Joyce Kong
2021-11-16 19:52 ` Honnappa Nagarahalli
2021-11-16 20:20 ` David Marchand
2021-11-16 21:21 ` Honnappa Nagarahalli
2021-11-17 9:29 ` David Marchand
2021-11-16 9:41 ` [PATCH v2 04/12] test/stack_perf: use compiler atomics for lcore sync Joyce Kong
2021-11-16 9:41 ` [PATCH v2 05/12] test/bpf: use compiler atomics for calculation Joyce Kong
2021-11-16 9:41 ` [PATCH v2 06/12] test/func_reentrancy: use compiler atomics for data sync Joyce Kong
2021-11-16 9:42 ` [PATCH v2 07/12] app/eventdev: use compiler atomics for shared " Joyce Kong
2021-11-16 9:42 ` [PATCH v2 08/12] app/crypto: use compiler atomic builtins for display sync Joyce Kong
2021-11-16 9:42 ` [PATCH v2 09/12] app/compress: " Joyce Kong
2021-11-16 20:15 ` Honnappa Nagarahalli
2021-11-16 9:42 ` [PATCH v2 10/12] app/testpmd: remove atomic operations for port status Joyce Kong
2021-11-16 21:34 ` Honnappa Nagarahalli
2021-11-16 9:42 ` [PATCH v2 11/12] app/bbdev: use compiler atomics for shared data sync Joyce Kong
2021-11-16 9:42 ` [PATCH v2 12/12] app: remove unnecessary include of atomic header file Joyce Kong
2021-11-16 20:23 ` David Marchand
2021-11-17 7:05 ` Joyce Kong
2021-11-17 8:21 ` [PATCH v3 00/12] use compiler atomic builtins for app modules Joyce Kong
2021-11-17 8:21 ` [PATCH v3 01/12] test/pmd_perf: use compiler atomic builtins for polling sync Joyce Kong
2021-11-17 8:21 ` [PATCH v3 02/12] test/ring_perf: use compiler atomic builtins for lcores sync Joyce Kong
2021-11-17 8:21 ` [PATCH v3 03/12] test/timer: use compiler atomic builtins for sync Joyce Kong
2021-11-17 8:21 ` [PATCH v3 04/12] test/stack_perf: use compiler atomics for lcore sync Joyce Kong
2021-11-17 8:21 ` [PATCH v3 05/12] test/bpf: use compiler atomics for calculation Joyce Kong
2021-11-17 8:21 ` [PATCH v3 06/12] test/func_reentrancy: use compiler atomics for data sync Joyce Kong
2021-11-17 8:21 ` [PATCH v3 07/12] app/eventdev: use compiler atomics for shared " Joyce Kong
2021-11-17 8:21 ` [PATCH v3 08/12] app/crypto: use compiler atomic builtins for display sync Joyce Kong
2021-11-17 8:21 ` [PATCH v3 09/12] app/compress: " Joyce Kong
2021-11-17 8:21 ` [PATCH v3 10/12] app/testpmd: remove atomic operations for port status Joyce Kong
2021-11-17 8:21 ` [PATCH v3 11/12] app/bbdev: use compiler atomics for shared data sync Joyce Kong
2021-11-17 8:22 ` [PATCH v3 12/12] app: remove unnecessary include of atomic header file Joyce Kong
2021-11-17 10:02 ` [PATCH v3 00/12] use compiler atomic builtins for app modules David Marchand