* [dpdk-stable] [PATCH 1/7] doc/rcu: fix typos
From: Honnappa Nagarahalli @ 2019-09-08 22:49 UTC
To: honnappa.nagarahalli, konstantin.ananyev; +Cc: dev, stable
Fix typos.
Fixes: 64994b56cfd7 ("rcu: add RCU library supporting QSBR mechanism")
Cc: stable@dpdk.org
Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
---
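Note for readers of the documentation touched below: the quiescent state
reporting it describes can be sketched in plain C11. This is a simplified
model, not the rte_rcu_qsbr implementation. The names qs_var, qs_init,
qs_report, qs_start, qs_check and QS_MAX_READERS are hypothetical and exist
only for this illustration. The memory orderings are also a simplification.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define QS_MAX_READERS 4 /* hypothetical limit for this sketch */

/* Hypothetical, simplified QS variable: one per shared data structure. */
struct qs_var {
	_Atomic uint64_t token;               /* advanced by the writer */
	_Atomic uint64_t cnt[QS_MAX_READERS]; /* last token seen by each reader */
};

static void qs_init(struct qs_var *v)
{
	atomic_init(&v->token, 1);
	for (unsigned int i = 0; i < QS_MAX_READERS; i++)
		atomic_init(&v->cnt[i], 0);
}

/* Reader: call once per while(1) iteration, at a point where no
 * reference to the shared data structure is held.
 */
static void qs_report(struct qs_var *v, unsigned int reader)
{
	uint64_t t = atomic_load_explicit(&v->token, memory_order_acquire);

	atomic_store_explicit(&v->cnt[reader], t, memory_order_release);
}

/* Writer: call after unlinking an element; returns the token to wait for. */
static uint64_t qs_start(struct qs_var *v)
{
	return atomic_fetch_add_explicit(&v->token, 1,
					 memory_order_release) + 1;
}

/* Writer: non-blocking poll; true once every reader has reported a
 * quiescent state at or after token t.
 */
static bool qs_check(struct qs_var *v, uint64_t t, unsigned int num_readers)
{
	for (unsigned int i = 0; i < num_readers; i++) {
		if (atomic_load_explicit(&v->cnt[i],
					 memory_order_acquire) < t)
			return false;
	}
	return true;
}
```

Because qs_check does not block, the writer can call qs_start, continue with
other work, and free the unlinked element only after a later qs_check
returns true. Creating one qs_var per data structure means the writer polls
only the readers that actually use that structure.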
doc/guides/prog_guide/rcu_lib.rst | 24 ++++++++++++------------
1 file changed, 12 insertions(+), 12 deletions(-)
diff --git a/doc/guides/prog_guide/rcu_lib.rst b/doc/guides/prog_guide/rcu_lib.rst
index 8fe5b1f73..c019dfca8 100644
--- a/doc/guides/prog_guide/rcu_lib.rst
+++ b/doc/guides/prog_guide/rcu_lib.rst
@@ -37,8 +37,8 @@ What is Quiescent State
-----------------------
Quiescent State can be defined as "any point in the thread execution where the
-thread does not hold a reference to shared memory". It is up to the application
-to determine its quiescent state.
+thread does not hold a reference to shared memory". It is the responsibility of
+the application to determine its quiescent state.
Let us consider the following diagram:
@@ -76,7 +76,7 @@ Factors affecting the RCU mechanism
It is important to make sure that this library keeps the overhead of
identifying the end of grace period and subsequent freeing of memory,
-to a minimum. The following explains how grace period and critical
+to a minimum. The following paras explain how grace period and critical
section affect this overhead.
The writer has to poll the readers to identify the end of grace period.
@@ -91,14 +91,14 @@ critical sections smaller requires additional CPU cycles (due to additional
reporting) in the readers.
Hence, we need the characteristics of a small grace period and large critical
-section. This library addresses this by allowing the writer to do
-other work without having to block until the readers report their quiescent
-state.
+section. This library addresses these characteristics by allowing the writer
+to do other work without having to block until the readers report their
+quiescent state.
RCU in DPDK
-----------
-For DPDK applications, the start and end of a ``while(1)`` loop (where no
+For DPDK applications, the beginning and end of a ``while(1)`` loop (where no
references to shared data structures are kept) act as perfect quiescent
states. This will combine all the shared data structure accesses into a
single, large critical section which helps keep the overhead on the
@@ -106,11 +106,11 @@ reader side to a minimum.
DPDK supports a pipeline model of packet processing and service cores.
In these use cases, a given data structure may not be used by all the
-workers in the application. The writer does not have to wait for all
-the workers to report their quiescent state. To provide the required
-flexibility, this library has a concept of a QS variable. The application
-can create one QS variable per data structure to help it track the
-end of grace period for each data structure. This helps keep the grace
+workers in the application. The writer has to wait only for the workers that
+use the data structure to report their quiescent state. To provide the required
+flexibility, this library has a concept of a QS variable. If required, the
+application can create one QS variable per data structure to help it track the
+end of grace period for each data structure. This helps keep the length of grace
period to a minimum.
How to use this library
--
2.17.1
* [dpdk-stable] [PATCH 4/7] test/rcu: use size_t instead of int
From: Honnappa Nagarahalli @ 2019-09-08 22:49 UTC
To: honnappa.nagarahalli, konstantin.ananyev; +Cc: dev, stable
The return value of rte_rcu_qsbr_get_memsize was stored in variables
of type 'int'. These variables are now of type 'size_t'.
Fixes: b87089b0bb19 ("test/rcu: add API and functional tests")
Cc: stable@dpdk.org
Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
---
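Note on the motivation: a size that fits in 'size_t' does not necessarily
fit in 'int'. The sketch below illustrates this with a hypothetical
stand-in, example_get_memsize, whose layout arithmetic is invented for the
illustration and is not the library's real calculation. The helper
size_overflows_int is likewise an assumption.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a size query such as
 * rte_rcu_qsbr_get_memsize. The layout is illustrative only:
 * a 64-byte header plus one 64-byte counter slot per thread.
 */
static size_t example_get_memsize(uint32_t max_threads)
{
	return (size_t)64 + (size_t)64 * max_threads;
}

/* Returns 1 when the value cannot be represented in an 'int',
 * i.e. when 'int sz = example_get_memsize(n);' would lose information.
 */
static int size_overflows_int(size_t sz)
{
	return sz > (size_t)INT_MAX;
}
```

With these assumed numbers, 128 threads need 8256 bytes, which fits in an
'int', while 0x2000000 threads need more than INT_MAX bytes on platforms
with a 32-bit 'int'. Keeping the value in a 'size_t' avoids the
implementation-defined narrowing conversion altogether.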
app/test/test_rcu_qsbr_perf.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/app/test/test_rcu_qsbr_perf.c b/app/test/test_rcu_qsbr_perf.c
index cb2d177b7..e0598614c 100644
--- a/app/test/test_rcu_qsbr_perf.c
+++ b/app/test/test_rcu_qsbr_perf.c
@@ -125,7 +125,7 @@ test_rcu_qsbr_writer_perf(void *arg)
static int
test_rcu_qsbr_perf(void)
{
- int sz;
+ size_t sz;
unsigned int i, tmp_num_cores;
writer_done = 0;
@@ -188,7 +188,7 @@ test_rcu_qsbr_perf(void)
static int
test_rcu_qsbr_rperf(void)
{
- int sz;
+ size_t sz;
unsigned int i, tmp_num_cores;
rte_atomic64_clear(&updates);
@@ -234,7 +234,7 @@ test_rcu_qsbr_rperf(void)
static int
test_rcu_qsbr_wperf(void)
{
- int sz;
+ size_t sz;
unsigned int i;
rte_atomic64_clear(&checks);
@@ -379,7 +379,7 @@ static int
test_rcu_qsbr_sw_sv_1qs(void)
{
uint64_t token, begin, cycles;
- int sz;
+ size_t sz;
unsigned int i, j, tmp_num_cores;
int32_t pos;
--
2.17.1
* [dpdk-stable] [PATCH 6/7] lib/rcu: add least acknowledged token optimization
From: Honnappa Nagarahalli @ 2019-09-08 22:49 UTC
To: honnappa.nagarahalli, konstantin.ananyev; +Cc: dev, stable
When the rte_rcu_qsbr_check API is called, it is possible to
calculate the least valued token acknowledged by all the readers.
When the API is called the next time, the readers' token counters do
not need to be scanned if the value of the token being queried is
less than or equal to the least acknowledged token recorded by the
previous call. This avoids cache line bounces between the readers
and the writer.
Fixes: 64994b56cfd7 ("rcu: add RCU library supporting QSBR mechanism")
Cc: stable@dpdk.org
Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
---
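Note: the fast path described above can be modelled in a few lines of
single-threaded C. This is a sketch only. The struct qs_model, the
functions model_init and model_check, and the 'scans' counter are
hypothetical and do not exist in the library. Plain fields replace the
atomics, and only the non-blocking (wait == false) behaviour is modelled.

```c
#include <stdbool.h>
#include <stdint.h>

#define CNT_THR_OFFLINE 0
#define CNT_INIT        1
#define CNT_MAX         UINT64_MAX
#define MAX_READERS     4 /* hypothetical limit for this sketch */

struct qs_model {
	uint64_t token;            /* writer's current token */
	uint64_t acked_token;      /* least token acked in the last full scan */
	uint64_t cnt[MAX_READERS]; /* per-reader counters */
	unsigned int scans;        /* instrumentation: full scans performed */
};

static void model_init(struct qs_model *v)
{
	v->token = CNT_INIT;
	v->acked_token = CNT_INIT - 1;
	v->scans = 0;
	for (unsigned int i = 0; i < MAX_READERS; i++)
		v->cnt[i] = CNT_THR_OFFLINE;
}

static bool model_check(struct qs_model *v, uint64_t t)
{
	uint64_t least = CNT_MAX;

	/* Fast path: every reader already acknowledged this token. */
	if (t <= v->acked_token)
		return true;

	v->scans++;
	for (unsigned int i = 0; i < MAX_READERS; i++) {
		uint64_t c = v->cnt[i];

		/* An online reader has not yet acknowledged token t. */
		if (c != CNT_THR_OFFLINE && c < t)
			return false;

		/* Track the least counter among the online readers. */
		if (c != CNT_THR_OFFLINE && c < least)
			least = c;
	}

	/* All readers checked: remember the least acknowledged token. */
	if (least != CNT_MAX)
		v->acked_token = least;

	return true;
}
```

In this model, a query for a token at or below acked_token returns without
touching cnt[], which is the part of the structure the readers write to.
A failed scan leaves acked_token unchanged, so the recorded value only
reflects scans in which every online reader was checked.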
lib/librte_rcu/rte_rcu_qsbr.c | 4 ++++
lib/librte_rcu/rte_rcu_qsbr.h | 42 +++++++++++++++++++++++++++++++++++
2 files changed, 46 insertions(+)
diff --git a/lib/librte_rcu/rte_rcu_qsbr.c b/lib/librte_rcu/rte_rcu_qsbr.c
index ce7f93dd3..c9ca66aaa 100644
--- a/lib/librte_rcu/rte_rcu_qsbr.c
+++ b/lib/librte_rcu/rte_rcu_qsbr.c
@@ -73,6 +73,7 @@ rte_rcu_qsbr_init(struct rte_rcu_qsbr *v, uint32_t max_threads)
__RTE_QSBR_THRID_ARRAY_ELM_SIZE) /
__RTE_QSBR_THRID_ARRAY_ELM_SIZE;
v->token = __RTE_QSBR_CNT_INIT;
+ v->acked_token = __RTE_QSBR_CNT_INIT - 1;
return 0;
}
@@ -245,6 +246,9 @@ rte_rcu_qsbr_dump(FILE *f, struct rte_rcu_qsbr *v)
fprintf(f, " Token = %"PRIu64"\n",
__atomic_load_n(&v->token, __ATOMIC_ACQUIRE));
+ fprintf(f, " Least Acknowledged Token = %"PRIu64"\n",
+ __atomic_load_n(&v->acked_token, __ATOMIC_ACQUIRE));
+
fprintf(f, "Quiescent State Counts for readers:\n");
for (i = 0; i < v->num_elems; i++) {
bmap = __atomic_load_n(__RTE_QSBR_THRID_ARRAY_ELM(v, i),
diff --git a/lib/librte_rcu/rte_rcu_qsbr.h b/lib/librte_rcu/rte_rcu_qsbr.h
index c80f15c00..3f445ba6c 100644
--- a/lib/librte_rcu/rte_rcu_qsbr.h
+++ b/lib/librte_rcu/rte_rcu_qsbr.h
@@ -83,6 +83,7 @@ struct rte_rcu_qsbr_cnt {
#define __RTE_QSBR_CNT_THR_OFFLINE 0
#define __RTE_QSBR_CNT_INIT 1
+#define __RTE_QSBR_CNT_MAX ((uint64_t)~0)
/* RTE Quiescent State variable structure.
* This structure has two elements that vary in size based on the
@@ -93,6 +94,10 @@ struct rte_rcu_qsbr_cnt {
struct rte_rcu_qsbr {
uint64_t token __rte_cache_aligned;
/**< Counter to allow for multiple concurrent quiescent state queries */
+ uint64_t acked_token;
+ /**< Least token acked by all the threads in the last call to
+ * rte_rcu_qsbr_check API.
+ */
uint32_t num_elems __rte_cache_aligned;
/**< Number of elements in the thread ID array */
@@ -472,6 +477,7 @@ __rte_rcu_qsbr_check_selective(struct rte_rcu_qsbr *v, uint64_t t, bool wait)
uint64_t bmap;
uint64_t c;
uint64_t *reg_thread_id;
+ uint64_t acked_token = __RTE_QSBR_CNT_MAX;
for (i = 0, reg_thread_id = __RTE_QSBR_THRID_ARRAY_ELM(v, 0);
i < v->num_elems;
@@ -493,6 +499,7 @@ __rte_rcu_qsbr_check_selective(struct rte_rcu_qsbr *v, uint64_t t, bool wait)
__RTE_RCU_DP_LOG(DEBUG,
"%s: status: token = %"PRIu64", wait = %d, Thread QS cnt = %"PRIu64", Thread ID = %d",
__func__, t, wait, c, id+j);
+
/* Counter is not checked for wrap-around condition
* as it is a 64b counter.
*/
@@ -512,10 +519,25 @@ __rte_rcu_qsbr_check_selective(struct rte_rcu_qsbr *v, uint64_t t, bool wait)
continue;
}
+ /* This thread is in quiescent state. Use the counter
+ * to find the least acknowledged token among all the
+ * readers.
+ */
+ if (c != __RTE_QSBR_CNT_THR_OFFLINE && acked_token > c)
+ acked_token = c;
+
bmap &= ~(1UL << j);
}
}
+ /* All readers are checked, update least acknowledged token.
+ * There might be multiple writers trying to update this. There is
+ * no need to update this very accurately using compare-and-swap.
+ */
+ if (acked_token != __RTE_QSBR_CNT_MAX)
+ __atomic_store_n(&v->acked_token, acked_token,
+ __ATOMIC_RELAXED);
+
return 1;
}
@@ -528,6 +550,7 @@ __rte_rcu_qsbr_check_all(struct rte_rcu_qsbr *v, uint64_t t, bool wait)
uint32_t i;
struct rte_rcu_qsbr_cnt *cnt;
uint64_t c;
+ uint64_t acked_token = __RTE_QSBR_CNT_MAX;
for (i = 0, cnt = v->qsbr_cnt; i < v->max_threads; i++, cnt++) {
__RTE_RCU_DP_LOG(DEBUG,
@@ -538,6 +561,7 @@ __rte_rcu_qsbr_check_all(struct rte_rcu_qsbr *v, uint64_t t, bool wait)
__RTE_RCU_DP_LOG(DEBUG,
"%s: status: token = %"PRIu64", wait = %d, Thread QS cnt = %"PRIu64", Thread ID = %d",
__func__, t, wait, c, i);
+
/* Counter is not checked for wrap-around condition
* as it is a 64b counter.
*/
@@ -550,8 +574,22 @@ __rte_rcu_qsbr_check_all(struct rte_rcu_qsbr *v, uint64_t t, bool wait)
rte_pause();
}
+
+ /* This thread is in quiescent state. Use the counter to find
+ * the least acknowledged token among all the readers.
+ */
+ if (likely(c != __RTE_QSBR_CNT_THR_OFFLINE && acked_token > c))
+ acked_token = c;
}
+ /* All readers are checked, update least acknowledged token.
+ * There might be multiple writers trying to update this. There is
+ * no need to update this very accurately using compare-and-swap.
+ */
+ if (acked_token != __RTE_QSBR_CNT_MAX)
+ __atomic_store_n(&v->acked_token, acked_token,
+ __ATOMIC_RELAXED);
+
return 1;
}
@@ -595,6 +633,10 @@ rte_rcu_qsbr_check(struct rte_rcu_qsbr *v, uint64_t t, bool wait)
{
RTE_ASSERT(v != NULL);
+ /* Check if all the readers have already acknowledged this token */
+ if (likely(t <= v->acked_token))
+ return 1;
+
if (likely(v->num_threads == v->max_threads))
return __rte_rcu_qsbr_check_all(v, t, wait);
else
--
2.17.1