* [dpdk-dev] [PATCH 1/7] doc/rcu: fix typos
2019-09-08 22:49 [dpdk-dev] [PATCH 0/7] typo, doc, simple fixes and some optimizations Honnappa Nagarahalli
@ 2019-09-08 22:49 ` Honnappa Nagarahalli
2019-09-08 22:49 ` [dpdk-dev] [PATCH 2/7] doc/rcu: correct the limitation on number of threads Honnappa Nagarahalli
` (5 subsequent siblings)
6 siblings, 0 replies; 10+ messages in thread
From: Honnappa Nagarahalli @ 2019-09-08 22:49 UTC (permalink / raw)
To: honnappa.nagarahalli, konstantin.ananyev; +Cc: dev, stable
Fix typos.
Fixes: 64994b56cfd7 ("rcu: add RCU library supporting QSBR mechanism")
Cc: stable@dpdk.org
Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
---
doc/guides/prog_guide/rcu_lib.rst | 24 ++++++++++++------------
1 file changed, 12 insertions(+), 12 deletions(-)
diff --git a/doc/guides/prog_guide/rcu_lib.rst b/doc/guides/prog_guide/rcu_lib.rst
index 8fe5b1f73..c019dfca8 100644
--- a/doc/guides/prog_guide/rcu_lib.rst
+++ b/doc/guides/prog_guide/rcu_lib.rst
@@ -37,8 +37,8 @@ What is Quiescent State
-----------------------
Quiescent State can be defined as "any point in the thread execution where the
-thread does not hold a reference to shared memory". It is up to the application
-to determine its quiescent state.
+thread does not hold a reference to shared memory". It is the responsibility of
+the application to determine its quiescent state.
Let us consider the following diagram:
@@ -76,7 +76,7 @@ Factors affecting the RCU mechanism
It is important to make sure that this library keeps the overhead of
identifying the end of grace period and subsequent freeing of memory,
-to a minimum. The following explains how grace period and critical
+to a minimum. The following paras explain how grace period and critical
section affect this overhead.
The writer has to poll the readers to identify the end of grace period.
@@ -91,14 +91,14 @@ critical sections smaller requires additional CPU cycles (due to additional
reporting) in the readers.
Hence, we need the characteristics of a small grace period and large critical
-section. This library addresses this by allowing the writer to do
-other work without having to block until the readers report their quiescent
-state.
+section. This library addresses these characteristics by allowing the writer
+to do other work without having to block until the readers report their
+quiescent state.
RCU in DPDK
-----------
-For DPDK applications, the start and end of a ``while(1)`` loop (where no
+For DPDK applications, the beginning and end of a ``while(1)`` loop (where no
references to shared data structures are kept) act as perfect quiescent
states. This will combine all the shared data structure accesses into a
single, large critical section which helps keep the overhead on the
@@ -106,11 +106,11 @@ reader side to a minimum.
DPDK supports a pipeline model of packet processing and service cores.
In these use cases, a given data structure may not be used by all the
-workers in the application. The writer does not have to wait for all
-the workers to report their quiescent state. To provide the required
-flexibility, this library has a concept of a QS variable. The application
-can create one QS variable per data structure to help it track the
-end of grace period for each data structure. This helps keep the grace
+workers in the application. The writer has to wait only for the workers that
+use the data structure to report their quiescent state. To provide the required
+flexibility, this library has a concept of a QS variable. If required, the
+application can create one QS variable per data structure to help it track the
+end of grace period for each data structure. This helps keep the length of grace
period to a minimum.
How to use this library
--
2.17.1
^ permalink raw reply [flat|nested] 10+ messages in thread
* [dpdk-dev] [PATCH 2/7] doc/rcu: correct the limitation on number of threads
2019-09-08 22:49 [dpdk-dev] [PATCH 0/7] typo, doc, simple fixes and some optimizations Honnappa Nagarahalli
2019-09-08 22:49 ` [dpdk-dev] [PATCH 1/7] doc/rcu: fix typos Honnappa Nagarahalli
@ 2019-09-08 22:49 ` Honnappa Nagarahalli
2019-09-08 22:49 ` [dpdk-dev] [PATCH 3/7] doc/rcu: add information about storing token and resource Honnappa Nagarahalli
` (4 subsequent siblings)
6 siblings, 0 replies; 10+ messages in thread
From: Honnappa Nagarahalli @ 2019-09-08 22:49 UTC (permalink / raw)
To: honnappa.nagarahalli, konstantin.ananyev; +Cc: dev, stable
There is no limitation of 1024 reader threads.
Fixes: 64994b56cfd7 ("rcu: add RCU library supporting QSBR mechanism")
Cc: stable@dpdk.org
Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
---
doc/guides/prog_guide/rcu_lib.rst | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/doc/guides/prog_guide/rcu_lib.rst b/doc/guides/prog_guide/rcu_lib.rst
index c019dfca8..4d1ed3954 100644
--- a/doc/guides/prog_guide/rcu_lib.rst
+++ b/doc/guides/prog_guide/rcu_lib.rst
@@ -120,8 +120,7 @@ The application must allocate memory and initialize a QS variable.
Applications can call ``rte_rcu_qsbr_get_memsize()`` to calculate the size
of memory to allocate. This API takes a maximum number of reader threads,
-using this variable, as a parameter. Currently, a maximum of 1024 threads
-are supported.
+using this variable, as a parameter.
Further, the application can initialize a QS variable using the API
``rte_rcu_qsbr_init()``.
--
2.17.1
^ permalink raw reply [flat|nested] 10+ messages in thread
* [dpdk-dev] [PATCH 3/7] doc/rcu: add information about storing token and resource
2019-09-08 22:49 [dpdk-dev] [PATCH 0/7] typo, doc, simple fixes and some optimizations Honnappa Nagarahalli
2019-09-08 22:49 ` [dpdk-dev] [PATCH 1/7] doc/rcu: fix typos Honnappa Nagarahalli
2019-09-08 22:49 ` [dpdk-dev] [PATCH 2/7] doc/rcu: correct the limitation on number of threads Honnappa Nagarahalli
@ 2019-09-08 22:49 ` Honnappa Nagarahalli
2019-09-08 22:49 ` [dpdk-dev] [PATCH 4/7] test/rcu: use size_t instead of int Honnappa Nagarahalli
` (3 subsequent siblings)
6 siblings, 0 replies; 10+ messages in thread
From: Honnappa Nagarahalli @ 2019-09-08 22:49 UTC (permalink / raw)
To: honnappa.nagarahalli, konstantin.ananyev; +Cc: dev, stable
After calling rte_rcu_qsbr_start API, the token and the deleted
resource need to be stored for subsequent query/free.
Fixes: 64994b56cfd7 ("rcu: add RCU library supporting QSBR mechanism")
Cc: stable@dpdk.org
Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
---
doc/guides/prog_guide/rcu_lib.rst | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/doc/guides/prog_guide/rcu_lib.rst b/doc/guides/prog_guide/rcu_lib.rst
index 4d1ed3954..8d0dfcf29 100644
--- a/doc/guides/prog_guide/rcu_lib.rst
+++ b/doc/guides/prog_guide/rcu_lib.rst
@@ -162,14 +162,19 @@ running as worker threads.
The separation of triggering the reporting from querying the status provides
the writer threads flexibility to do useful work instead of blocking for the
reader threads to enter the quiescent state or go offline. This reduces the
-memory accesses due to continuous polling for the status.
+memory accesses due to continuous polling for the status. But, since the
+resource is freed at a later time, the token and the reference to the deleted
+resource need to be stored for later queries.
The ``rte_rcu_qsbr_synchronize()`` API combines the functionality of
``rte_rcu_qsbr_start()`` and blocking ``rte_rcu_qsbr_check()`` into a single
API. This API triggers the reader threads to report their quiescent state and
polls till all the readers enter the quiescent state or go offline. This API
does not allow the writer to do useful work while waiting and introduces
-additional memory accesses due to continuous polling.
+additional memory accesses due to continuous polling. However, the application
+does not have to store the token or the reference to the deleted resource. The
+resource can be freed immediately after ``rte_rcu_qsbr_synchronize()`` API
+returns.
The reader thread must call ``rte_rcu_qsbr_thread_offline()`` and
``rte_rcu_qsbr_thread_unregister()`` APIs to remove itself from reporting its
--
2.17.1
^ permalink raw reply [flat|nested] 10+ messages in thread
* [dpdk-dev] [PATCH 4/7] test/rcu: use size_t instead of int
2019-09-08 22:49 [dpdk-dev] [PATCH 0/7] typo, doc, simple fixes and some optimizations Honnappa Nagarahalli
` (2 preceding siblings ...)
2019-09-08 22:49 ` [dpdk-dev] [PATCH 3/7] doc/rcu: add information about storing token and resource Honnappa Nagarahalli
@ 2019-09-08 22:49 ` Honnappa Nagarahalli
2019-09-09 15:16 ` Ruifeng Wang (Arm Technology China)
2019-09-08 22:49 ` [dpdk-dev] [PATCH 5/7] test/rcu: use correct nomenclature while printing results Honnappa Nagarahalli
` (2 subsequent siblings)
6 siblings, 1 reply; 10+ messages in thread
From: Honnappa Nagarahalli @ 2019-09-08 22:49 UTC (permalink / raw)
To: honnappa.nagarahalli, konstantin.ananyev; +Cc: dev, stable
Variables used to store the return value of rte_rcu_qsbr_get_memsize
in variables of type 'int'. The variables are of type 'size_t' now.
Fixes: b87089b0bb19 ("test/rcu: add API and functional tests")
Cc: stable@dpdk.org
Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
---
app/test/test_rcu_qsbr_perf.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/app/test/test_rcu_qsbr_perf.c b/app/test/test_rcu_qsbr_perf.c
index cb2d177b7..e0598614c 100644
--- a/app/test/test_rcu_qsbr_perf.c
+++ b/app/test/test_rcu_qsbr_perf.c
@@ -125,7 +125,7 @@ test_rcu_qsbr_writer_perf(void *arg)
static int
test_rcu_qsbr_perf(void)
{
- int sz;
+ size_t sz;
unsigned int i, tmp_num_cores;
writer_done = 0;
@@ -188,7 +188,7 @@ test_rcu_qsbr_perf(void)
static int
test_rcu_qsbr_rperf(void)
{
- int sz;
+ size_t sz;
unsigned int i, tmp_num_cores;
rte_atomic64_clear(&updates);
@@ -234,7 +234,7 @@ test_rcu_qsbr_rperf(void)
static int
test_rcu_qsbr_wperf(void)
{
- int sz;
+ size_t sz;
unsigned int i;
rte_atomic64_clear(&checks);
@@ -379,7 +379,7 @@ static int
test_rcu_qsbr_sw_sv_1qs(void)
{
uint64_t token, begin, cycles;
- int sz;
+ size_t sz;
unsigned int i, j, tmp_num_cores;
int32_t pos;
--
2.17.1
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [dpdk-dev] [PATCH 4/7] test/rcu: use size_t instead of int
2019-09-08 22:49 ` [dpdk-dev] [PATCH 4/7] test/rcu: use size_t instead of int Honnappa Nagarahalli
@ 2019-09-09 15:16 ` Ruifeng Wang (Arm Technology China)
2019-10-07 14:00 ` David Marchand
0 siblings, 1 reply; 10+ messages in thread
From: Ruifeng Wang (Arm Technology China) @ 2019-09-09 15:16 UTC (permalink / raw)
To: Honnappa Nagarahalli, Honnappa Nagarahalli, konstantin.ananyev
Cc: dev, stable, nd
> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Honnappa Nagarahalli
> Sent: Monday, September 9, 2019 06:50
> To: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>;
> konstantin.ananyev@intel.com
> Cc: dev@dpdk.org; stable@dpdk.org
> Subject: [dpdk-dev] [PATCH 4/7] test/rcu: use size_t instead of int
>
> Variables used to store the return value of rte_rcu_qsbr_get_memsize in
> variables of type 'int'. The variables are of type 'size_t' now.
>
> Fixes: b87089b0bb19 ("test/rcu: add API and functional tests")
> Cc: stable@dpdk.org
>
> Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> Reviewed-by: Gavin Hu <gavin.hu@arm.com>
> ---
> app/test/test_rcu_qsbr_perf.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/app/test/test_rcu_qsbr_perf.c b/app/test/test_rcu_qsbr_perf.c
> index cb2d177b7..e0598614c 100644
> --- a/app/test/test_rcu_qsbr_perf.c
> +++ b/app/test/test_rcu_qsbr_perf.c
> @@ -125,7 +125,7 @@ test_rcu_qsbr_writer_perf(void *arg) static int
> test_rcu_qsbr_perf(void)
> {
> - int sz;
> + size_t sz;
> unsigned int i, tmp_num_cores;
>
> writer_done = 0;
> @@ -188,7 +188,7 @@ test_rcu_qsbr_perf(void) static int
> test_rcu_qsbr_rperf(void)
> {
> - int sz;
> + size_t sz;
> unsigned int i, tmp_num_cores;
>
> rte_atomic64_clear(&updates);
> @@ -234,7 +234,7 @@ test_rcu_qsbr_rperf(void) static int
> test_rcu_qsbr_wperf(void)
> {
> - int sz;
> + size_t sz;
> unsigned int i;
>
> rte_atomic64_clear(&checks);
> @@ -379,7 +379,7 @@ static int
> test_rcu_qsbr_sw_sv_1qs(void)
> {
> uint64_t token, begin, cycles;
> - int sz;
> + size_t sz;
> unsigned int i, j, tmp_num_cores;
> int32_t pos;
>
> --
> 2.17.1
The same change is needed by test_rcu_qsbr_sw_sv_1qs_non_blocking.
And there are another 2 occurrences in test_rcu_qsbr.c.
Thanks.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [dpdk-dev] [PATCH 4/7] test/rcu: use size_t instead of int
2019-09-09 15:16 ` Ruifeng Wang (Arm Technology China)
@ 2019-10-07 14:00 ` David Marchand
0 siblings, 0 replies; 10+ messages in thread
From: David Marchand @ 2019-10-07 14:00 UTC (permalink / raw)
To: Honnappa Nagarahalli
Cc: konstantin.ananyev, dev, stable, nd, Ruifeng Wang (Arm Technology China)
Hello Honnappa,
On Mon, Sep 9, 2019 at 5:17 PM Ruifeng Wang (Arm Technology China)
<Ruifeng.Wang@arm.com> wrote:
>
>
> > -----Original Message-----
> > From: dev <dev-bounces@dpdk.org> On Behalf Of Honnappa Nagarahalli
> > Sent: Monday, September 9, 2019 06:50
> > To: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>;
> > konstantin.ananyev@intel.com
> > Cc: dev@dpdk.org; stable@dpdk.org
> > Subject: [dpdk-dev] [PATCH 4/7] test/rcu: use size_t instead of int
> >
> > Variables used to store the return value of rte_rcu_qsbr_get_memsize in
> > variables of type 'int'. The variables are of type 'size_t' now.
> >
> > Fixes: b87089b0bb19 ("test/rcu: add API and functional tests")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
> > Reviewed-by: Gavin Hu <gavin.hu@arm.com>
> > ---
> > app/test/test_rcu_qsbr_perf.c | 8 ++++----
> > 1 file changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/app/test/test_rcu_qsbr_perf.c b/app/test/test_rcu_qsbr_perf.c
> > index cb2d177b7..e0598614c 100644
> > --- a/app/test/test_rcu_qsbr_perf.c
> > +++ b/app/test/test_rcu_qsbr_perf.c
> > @@ -125,7 +125,7 @@ test_rcu_qsbr_writer_perf(void *arg) static int
> > test_rcu_qsbr_perf(void)
> > {
> > - int sz;
> > + size_t sz;
> > unsigned int i, tmp_num_cores;
> >
> > writer_done = 0;
> > @@ -188,7 +188,7 @@ test_rcu_qsbr_perf(void) static int
> > test_rcu_qsbr_rperf(void)
> > {
> > - int sz;
> > + size_t sz;
> > unsigned int i, tmp_num_cores;
> >
> > rte_atomic64_clear(&updates);
> > @@ -234,7 +234,7 @@ test_rcu_qsbr_rperf(void) static int
> > test_rcu_qsbr_wperf(void)
> > {
> > - int sz;
> > + size_t sz;
> > unsigned int i;
> >
> > rte_atomic64_clear(&checks);
> > @@ -379,7 +379,7 @@ static int
> > test_rcu_qsbr_sw_sv_1qs(void)
> > {
> > uint64_t token, begin, cycles;
> > - int sz;
> > + size_t sz;
> > unsigned int i, j, tmp_num_cores;
> > int32_t pos;
> >
> > --
> > 2.17.1
>
> The same change is needed by test_rcu_qsbr_sw_sv_1qs_non_blocking.
> And there are another 2 occurrences in test_rcu_qsbr.c.
Ruifeng comment looks valid, can you have a look and submit a new
version of this patch?
Thanks.
--
David Marchand
^ permalink raw reply [flat|nested] 10+ messages in thread
* [dpdk-dev] [PATCH 5/7] test/rcu: use correct nomenclature while printing results
2019-09-08 22:49 [dpdk-dev] [PATCH 0/7] typo, doc, simple fixes and some optimizations Honnappa Nagarahalli
` (3 preceding siblings ...)
2019-09-08 22:49 ` [dpdk-dev] [PATCH 4/7] test/rcu: use size_t instead of int Honnappa Nagarahalli
@ 2019-09-08 22:49 ` Honnappa Nagarahalli
2019-09-08 22:49 ` [dpdk-dev] [PATCH 6/7] lib/rcu: add least acknowledged token optimization Honnappa Nagarahalli
2019-09-08 22:49 ` [dpdk-dev] [PATCH 7/7] lib/rcu: update QS only when there are updates from writer Honnappa Nagarahalli
6 siblings, 0 replies; 10+ messages in thread
From: Honnappa Nagarahalli @ 2019-09-08 22:49 UTC (permalink / raw)
To: honnappa.nagarahalli, konstantin.ananyev; +Cc: dev
Use 'quiescent state updates' instead of just 'updates'.
Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
---
app/test/test_rcu_qsbr_perf.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/app/test/test_rcu_qsbr_perf.c b/app/test/test_rcu_qsbr_perf.c
index e0598614c..3a1df5d17 100644
--- a/app/test/test_rcu_qsbr_perf.c
+++ b/app/test/test_rcu_qsbr_perf.c
@@ -167,8 +167,10 @@ test_rcu_qsbr_perf(void)
/* Wait until all readers have exited */
rte_eal_mp_wait_lcore();
- printf("Total RCU updates = %"PRIi64"\n", rte_atomic64_read(&updates));
- printf("Cycles per %d updates: %"PRIi64"\n", RCU_SCALE_DOWN,
+ printf("Total quiescent state updates = %"PRIi64"\n",
+ rte_atomic64_read(&updates));
+ printf("Cycles per %d quiescent state updates: %"PRIi64"\n",
+ RCU_SCALE_DOWN,
rte_atomic64_read(&update_cycles) /
(rte_atomic64_read(&updates) / RCU_SCALE_DOWN));
printf("Total RCU checks = %"PRIi64"\n", rte_atomic64_read(&checks));
@@ -217,8 +219,10 @@ test_rcu_qsbr_rperf(void)
/* Wait until all readers have exited */
rte_eal_mp_wait_lcore();
- printf("Total RCU updates = %"PRIi64"\n", rte_atomic64_read(&updates));
- printf("Cycles per %d updates: %"PRIi64"\n", RCU_SCALE_DOWN,
+ printf("Total quiescent state updates = %"PRIi64"\n",
+ rte_atomic64_read(&updates));
+ printf("Cycles per %d quiescent state updates: %"PRIi64"\n",
+ RCU_SCALE_DOWN,
rte_atomic64_read(&update_cycles) /
(rte_atomic64_read(&updates) / RCU_SCALE_DOWN));
@@ -462,7 +466,7 @@ test_rcu_qsbr_sw_sv_1qs(void)
rte_free(keys);
printf("Following numbers include calls to rte_hash functions\n");
- printf("Cycles per 1 update(online/update/offline): %"PRIi64"\n",
+ printf("Cycles per 1 quiescent state update(online/update/offline): %"PRIi64"\n",
rte_atomic64_read(&update_cycles) /
rte_atomic64_read(&updates));
@@ -577,7 +581,7 @@ test_rcu_qsbr_sw_sv_1qs_non_blocking(void)
rte_free(keys);
printf("Following numbers include calls to rte_hash functions\n");
- printf("Cycles per 1 update(online/update/offline): %"PRIi64"\n",
+ printf("Cycles per 1 quiescent state update(online/update/offline): %"PRIi64"\n",
rte_atomic64_read(&update_cycles) /
rte_atomic64_read(&updates));
--
2.17.1
^ permalink raw reply [flat|nested] 10+ messages in thread
* [dpdk-dev] [PATCH 6/7] lib/rcu: add least acknowledged token optimization
2019-09-08 22:49 [dpdk-dev] [PATCH 0/7] typo, doc, simple fixes and some optimizations Honnappa Nagarahalli
` (4 preceding siblings ...)
2019-09-08 22:49 ` [dpdk-dev] [PATCH 5/7] test/rcu: use correct nomenclature while printing results Honnappa Nagarahalli
@ 2019-09-08 22:49 ` Honnappa Nagarahalli
2019-09-08 22:49 ` [dpdk-dev] [PATCH 7/7] lib/rcu: update QS only when there are updates from writer Honnappa Nagarahalli
6 siblings, 0 replies; 10+ messages in thread
From: Honnappa Nagarahalli @ 2019-09-08 22:49 UTC (permalink / raw)
To: honnappa.nagarahalli, konstantin.ananyev; +Cc: dev, stable
When the rte_rcu_qsbr_check API is called, it is possible to
calculate the least valued token acknowledged by all the readers.
When the API is called next time, the readers' token counters do
not need to be scanned if the value of the token being queried is
less than the last least token acknowledged. This avoids the
cache line bounces between readers and writer.
Fixes: 64994b56cfd7 ("rcu: add RCU library supporting QSBR mechanism")
Cc: stable@dpdk.org
Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
---
lib/librte_rcu/rte_rcu_qsbr.c | 4 ++++
lib/librte_rcu/rte_rcu_qsbr.h | 42 +++++++++++++++++++++++++++++++++++
2 files changed, 46 insertions(+)
diff --git a/lib/librte_rcu/rte_rcu_qsbr.c b/lib/librte_rcu/rte_rcu_qsbr.c
index ce7f93dd3..c9ca66aaa 100644
--- a/lib/librte_rcu/rte_rcu_qsbr.c
+++ b/lib/librte_rcu/rte_rcu_qsbr.c
@@ -73,6 +73,7 @@ rte_rcu_qsbr_init(struct rte_rcu_qsbr *v, uint32_t max_threads)
__RTE_QSBR_THRID_ARRAY_ELM_SIZE) /
__RTE_QSBR_THRID_ARRAY_ELM_SIZE;
v->token = __RTE_QSBR_CNT_INIT;
+ v->acked_token = __RTE_QSBR_CNT_INIT - 1;
return 0;
}
@@ -245,6 +246,9 @@ rte_rcu_qsbr_dump(FILE *f, struct rte_rcu_qsbr *v)
fprintf(f, " Token = %"PRIu64"\n",
__atomic_load_n(&v->token, __ATOMIC_ACQUIRE));
+ fprintf(f, " Least Acknowledged Token = %"PRIu64"\n",
+ __atomic_load_n(&v->acked_token, __ATOMIC_ACQUIRE));
+
fprintf(f, "Quiescent State Counts for readers:\n");
for (i = 0; i < v->num_elems; i++) {
bmap = __atomic_load_n(__RTE_QSBR_THRID_ARRAY_ELM(v, i),
diff --git a/lib/librte_rcu/rte_rcu_qsbr.h b/lib/librte_rcu/rte_rcu_qsbr.h
index c80f15c00..3f445ba6c 100644
--- a/lib/librte_rcu/rte_rcu_qsbr.h
+++ b/lib/librte_rcu/rte_rcu_qsbr.h
@@ -83,6 +83,7 @@ struct rte_rcu_qsbr_cnt {
#define __RTE_QSBR_CNT_THR_OFFLINE 0
#define __RTE_QSBR_CNT_INIT 1
+#define __RTE_QSBR_CNT_MAX ((uint64_t)~0)
/* RTE Quiescent State variable structure.
* This structure has two elements that vary in size based on the
@@ -93,6 +94,10 @@ struct rte_rcu_qsbr_cnt {
struct rte_rcu_qsbr {
uint64_t token __rte_cache_aligned;
/**< Counter to allow for multiple concurrent quiescent state queries */
+ uint64_t acked_token;
+ /**< Least token acked by all the threads in the last call to
+ * rte_rcu_qsbr_check API.
+ */
uint32_t num_elems __rte_cache_aligned;
/**< Number of elements in the thread ID array */
@@ -472,6 +477,7 @@ __rte_rcu_qsbr_check_selective(struct rte_rcu_qsbr *v, uint64_t t, bool wait)
uint64_t bmap;
uint64_t c;
uint64_t *reg_thread_id;
+ uint64_t acked_token = __RTE_QSBR_CNT_MAX;
for (i = 0, reg_thread_id = __RTE_QSBR_THRID_ARRAY_ELM(v, 0);
i < v->num_elems;
@@ -493,6 +499,7 @@ __rte_rcu_qsbr_check_selective(struct rte_rcu_qsbr *v, uint64_t t, bool wait)
__RTE_RCU_DP_LOG(DEBUG,
"%s: status: token = %"PRIu64", wait = %d, Thread QS cnt = %"PRIu64", Thread ID = %d",
__func__, t, wait, c, id+j);
+
/* Counter is not checked for wrap-around condition
* as it is a 64b counter.
*/
@@ -512,10 +519,25 @@ __rte_rcu_qsbr_check_selective(struct rte_rcu_qsbr *v, uint64_t t, bool wait)
continue;
}
+ /* This thread is in quiescent state. Use the counter
+ * to find the least acknowledged token among all the
+ * readers.
+ */
+ if (c != __RTE_QSBR_CNT_THR_OFFLINE && acked_token > c)
+ acked_token = c;
+
bmap &= ~(1UL << j);
}
}
+ /* All readers are checked, update least acknowledged token.
+ * There might be multiple writers trying to update this. There is
+ * no need to update this very accurately using compare-and-swap.
+ */
+ if (acked_token != __RTE_QSBR_CNT_MAX)
+ __atomic_store_n(&v->acked_token, acked_token,
+ __ATOMIC_RELAXED);
+
return 1;
}
@@ -528,6 +550,7 @@ __rte_rcu_qsbr_check_all(struct rte_rcu_qsbr *v, uint64_t t, bool wait)
uint32_t i;
struct rte_rcu_qsbr_cnt *cnt;
uint64_t c;
+ uint64_t acked_token = __RTE_QSBR_CNT_MAX;
for (i = 0, cnt = v->qsbr_cnt; i < v->max_threads; i++, cnt++) {
__RTE_RCU_DP_LOG(DEBUG,
@@ -538,6 +561,7 @@ __rte_rcu_qsbr_check_all(struct rte_rcu_qsbr *v, uint64_t t, bool wait)
__RTE_RCU_DP_LOG(DEBUG,
"%s: status: token = %"PRIu64", wait = %d, Thread QS cnt = %"PRIu64", Thread ID = %d",
__func__, t, wait, c, i);
+
/* Counter is not checked for wrap-around condition
* as it is a 64b counter.
*/
@@ -550,8 +574,22 @@ __rte_rcu_qsbr_check_all(struct rte_rcu_qsbr *v, uint64_t t, bool wait)
rte_pause();
}
+
+ /* This thread is in quiescent state. Use the counter to find
+ * the least acknowledged token among all the readers.
+ */
+ if (likely(c != __RTE_QSBR_CNT_THR_OFFLINE && acked_token > c))
+ acked_token = c;
}
+ /* All readers are checked, update least acknowledged token.
+ * There might be multiple writers trying to update this. There is
+ * no need to update this very accurately using compare-and-swap.
+ */
+ if (acked_token != __RTE_QSBR_CNT_MAX)
+ __atomic_store_n(&v->acked_token, acked_token,
+ __ATOMIC_RELAXED);
+
return 1;
}
@@ -595,6 +633,10 @@ rte_rcu_qsbr_check(struct rte_rcu_qsbr *v, uint64_t t, bool wait)
{
RTE_ASSERT(v != NULL);
+ /* Check if all the readers have already acknowledged this token */
+ if (likely(t <= v->acked_token))
+ return 1;
+
if (likely(v->num_threads == v->max_threads))
return __rte_rcu_qsbr_check_all(v, t, wait);
else
--
2.17.1
^ permalink raw reply [flat|nested] 10+ messages in thread
* [dpdk-dev] [PATCH 7/7] lib/rcu: update QS only when there are updates from writer
2019-09-08 22:49 [dpdk-dev] [PATCH 0/7] typo, doc, simple fixes and some optimizations Honnappa Nagarahalli
` (5 preceding siblings ...)
2019-09-08 22:49 ` [dpdk-dev] [PATCH 6/7] lib/rcu: add least acknowledged token optimization Honnappa Nagarahalli
@ 2019-09-08 22:49 ` Honnappa Nagarahalli
6 siblings, 0 replies; 10+ messages in thread
From: Honnappa Nagarahalli @ 2019-09-08 22:49 UTC (permalink / raw)
To: honnappa.nagarahalli, konstantin.ananyev; +Cc: dev
When the writer is checking the quiescent state status, it is not
deleting any entries in the data structure. This means, the readers
do not need to update their quiescent state during that period.
Readers update the quiescent state only when there are updates
available from the writer.
Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
---
lib/librte_rcu/rte_rcu_qsbr.h | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/lib/librte_rcu/rte_rcu_qsbr.h b/lib/librte_rcu/rte_rcu_qsbr.h
index 3f445ba6c..0d4664572 100644
--- a/lib/librte_rcu/rte_rcu_qsbr.h
+++ b/lib/librte_rcu/rte_rcu_qsbr.h
@@ -456,12 +456,14 @@ rte_rcu_qsbr_quiescent(struct rte_rcu_qsbr *v, unsigned int thread_id)
*/
t = __atomic_load_n(&v->token, __ATOMIC_ACQUIRE);
- /* Inform the writer that updates are visible to this reader.
+ /* Check if there are updates available from the writer.
+ * Inform the writer that updates are visible to this reader.
* Prior loads of the shared data structure should not move
* beyond this store. Hence use store-release.
*/
- __atomic_store_n(&v->qsbr_cnt[thread_id].cnt,
- t, __ATOMIC_RELEASE);
+ if (t != __atomic_load_n(&v->qsbr_cnt[thread_id].cnt, __ATOMIC_RELAXED))
+ __atomic_store_n(&v->qsbr_cnt[thread_id].cnt,
+ t, __ATOMIC_RELEASE);
__RTE_RCU_DP_LOG(DEBUG, "%s: update: token = %"PRIu64", Thread ID = %d",
__func__, t, thread_id);
--
2.17.1
^ permalink raw reply [flat|nested] 10+ messages in thread