* [dpdk-dev] [PATCH] event/octeontx2: use wfe while waiting for head
@ 2019-10-23 16:12 pbhagavatula
2019-10-24 15:53 ` Gavin Hu (Arm Technology China)
0 siblings, 1 reply; 8+ messages in thread
From: pbhagavatula @ 2019-10-23 16:12 UTC (permalink / raw)
To: gavin.hu, jerinj, Pavan Nikhilesh; +Cc: dev
From: Pavan Nikhilesh <pbhagavatula@marvell.com>
Use wfe to save power while waiting for tag to become head.
SSO signals EVENTI to allow cores to exit from wfe when they
are waiting for specific operations in which one of them is
setting HEAD bit in GWS_TAG.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
---
drivers/event/octeontx2/otx2_worker.h | 30 ++++++++++++++++++++++++---
1 file changed, 27 insertions(+), 3 deletions(-)
diff --git a/drivers/event/octeontx2/otx2_worker.h b/drivers/event/octeontx2/otx2_worker.h
index 4e971f27c..7a55caca5 100644
--- a/drivers/event/octeontx2/otx2_worker.h
+++ b/drivers/event/octeontx2/otx2_worker.h
@@ -226,10 +226,34 @@ otx2_ssogws_swtag_wait(struct otx2_ssogws *ws)
}
static __rte_always_inline void
-otx2_ssogws_head_wait(struct otx2_ssogws *ws, const uint8_t wait_flag)
+otx2_ssogws_head_wait(struct otx2_ssogws *ws)
{
- while (wait_flag && !(otx2_read64(ws->tag_op) & BIT_ULL(35)))
+#ifdef RTE_ARCH_ARM64
+ uint64_t tag;
+
+ asm volatile (
+ " ldr %[tag], [%[tag_op]] \n"
+ " tbnz %[tag], 35, done%= \n"
+ " sevl \n"
+ "rty%=: wfe \n"
+ " ldr %[tag], [%[tag_op]] \n"
+ " tbz %[tag], 35, rty%= \n"
+ "done%=: \n"
+ : [tag] "=&r" (tag)
+ : [tag_op] "r" (ws->tag_op)
+ );
+#else
+ /* Wait for the HEAD to be set */
+ while (!(otx2_read64(ws->tag_op) & BIT_ULL(35)))
;
+#endif
+}
+
+static __rte_always_inline void
+otx2_ssogws_order(struct otx2_ssogws *ws, const uint8_t wait_flag)
+{
+ if (wait_flag)
+ otx2_ssogws_head_wait(ws);
rte_cio_wmb();
}
@@ -258,7 +282,7 @@ otx2_ssogws_event_tx(struct otx2_ssogws *ws, struct rte_event ev[],
/* Perform header writes before barrier for TSO */
otx2_nix_xmit_prepare_tso(m, flags);
- otx2_ssogws_head_wait(ws, !ev->sched_type);
+ otx2_ssogws_order(ws, !ev->sched_type);
otx2_ssogws_prepare_pkt(txq, m, cmd, flags);
if (flags & NIX_TX_MULTI_SEG_F) {
--
2.17.1
* Re: [dpdk-dev] [PATCH] event/octeontx2: use wfe while waiting for head
2019-10-23 16:12 [dpdk-dev] [PATCH] event/octeontx2: use wfe while waiting for head pbhagavatula
@ 2019-10-24 15:53 ` Gavin Hu (Arm Technology China)
2019-10-25 4:26 ` Pavan Nikhilesh Bhagavatula
0 siblings, 1 reply; 8+ messages in thread
From: Gavin Hu (Arm Technology China) @ 2019-10-24 15:53 UTC (permalink / raw)
To: pbhagavatula, jerinj; +Cc: dev, nd
Hi Pavan,
> -----Original Message-----
> From: pbhagavatula@marvell.com <pbhagavatula@marvell.com>
> Sent: Thursday, October 24, 2019 12:13 AM
> To: Gavin Hu (Arm Technology China) <Gavin.Hu@arm.com>;
> jerinj@marvell.com; Pavan Nikhilesh <pbhagavatula@marvell.com>
> Cc: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH] event/octeontx2: use wfe while waiting for
> head
>
> From: Pavan Nikhilesh <pbhagavatula@marvell.com>
>
> Use wfe to save power while waiting for tag to become head.
>
> SSO signals EVENTI to allow cores to exit from wfe when they
> are waiting for specific operations in which one of them is
> setting HEAD bit in GWS_TAG.
>
> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> ---
> drivers/event/octeontx2/otx2_worker.h | 30 ++++++++++++++++++++++++--
> -
> 1 file changed, 27 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/event/octeontx2/otx2_worker.h
> b/drivers/event/octeontx2/otx2_worker.h
> index 4e971f27c..7a55caca5 100644
> --- a/drivers/event/octeontx2/otx2_worker.h
> +++ b/drivers/event/octeontx2/otx2_worker.h
> @@ -226,10 +226,34 @@ otx2_ssogws_swtag_wait(struct otx2_ssogws *ws)
> }
>
> static __rte_always_inline void
> -otx2_ssogws_head_wait(struct otx2_ssogws *ws, const uint8_t wait_flag)
> +otx2_ssogws_head_wait(struct otx2_ssogws *ws)
> {
> - while (wait_flag && !(otx2_read64(ws->tag_op) & BIT_ULL(35)))
> +#ifdef RTE_ARCH_ARM64
> + uint64_t tag;
> +
> + asm volatile (
> + " ldr %[tag], [%[tag_op]] \n"
"ldxr" should be used here: an exclusive load is required to "monitor" the location. A write to the location then clears the exclusive monitor, which implicitly generates a wake-up event.
You can find more explanation here:
http://inbox.dpdk.org/dev/AM0PR08MB5363F9D1BA158B66B803EA068F6B0@AM0PR08MB5363.eurprd08.prod.outlook.com/
/Gavin
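For reference, a minimal sketch of the exclusive-load variant suggested above. The helper name head_wait_ldxr and its plain-pointer argument are hypothetical and not part of the driver; on non-arm64 builds the sketch falls back to simple polling. Whether an exclusive load is meaningful on a device register such as GWS_TAG is platform-specific, so treat this as an illustration of the monitor-based pattern only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HEAD_BIT 35 /* bit tested by the patch (BIT_ULL(35)) */

/* Hypothetical helper: return the tag word once its HEAD bit is set. */
static inline uint64_t
head_wait_ldxr(const volatile uint64_t *tag_op)
{
	uint64_t tag;

	assert(tag_op != NULL);
#if defined(__aarch64__)
	__asm__ volatile (
		"	ldxr %[tag], [%[tag_op]]	\n" /* load and arm the monitor */
		"	tbnz %[tag], 35, done%=		\n" /* already head: skip the wait */
		"	sevl				\n" /* make the first wfe fall through */
		"rty%=:	wfe				\n" /* sleep until an event arrives */
		"	ldxr %[tag], [%[tag_op]]	\n" /* reload and re-arm the monitor */
		"	tbz %[tag], 35, rty%=		\n" /* not head yet: wait again */
		"done%=:				\n"
		: [tag] "=&r" (tag)
		: [tag_op] "r" (tag_op)
		: "memory"
	);
#else
	/* Portable fallback: plain polling, no power saving. */
	do {
		tag = *tag_op;
	} while (!(tag & (UINT64_C(1) << HEAD_BIT)));
#endif
	return tag;
}
```

The only difference from the loop in the patch is ldxr in place of ldr, so that a write to the monitored location can itself produce the wake-up event.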
> + " tbnz %[tag], 35, done%= \n"
> + " sevl \n"
> + "rty%=: wfe \n"
> + " ldr %[tag], [%[tag_op]] \n"
> + " tbz %[tag], 35, rty%= \n"
> + "done%=: \n"
> + : [tag] "=&r" (tag)
> + : [tag_op] "r" (ws->tag_op)
> + );
> +#else
> + /* Wait for the HEAD to be set */
> + while (!(otx2_read64(ws->tag_op) & BIT_ULL(35)))
> ;
> +#endif
> +}
> +
> +static __rte_always_inline void
> +otx2_ssogws_order(struct otx2_ssogws *ws, const uint8_t wait_flag)
> +{
> + if (wait_flag)
> + otx2_ssogws_head_wait(ws);
>
> rte_cio_wmb();
What ordering is this barrier meant to enforce? If the sequence is a write followed by a wait for some kind of response, should the barrier move before otx2_ssogws_head_wait?
/Gavin
> }
> @@ -258,7 +282,7 @@ otx2_ssogws_event_tx(struct otx2_ssogws *ws,
> struct rte_event ev[],
>
> /* Perform header writes before barrier for TSO */
> otx2_nix_xmit_prepare_tso(m, flags);
> - otx2_ssogws_head_wait(ws, !ev->sched_type);
> + otx2_ssogws_order(ws, !ev->sched_type);
> otx2_ssogws_prepare_pkt(txq, m, cmd, flags);
>
> if (flags & NIX_TX_MULTI_SEG_F) {
> --
> 2.17.1
* Re: [dpdk-dev] [PATCH] event/octeontx2: use wfe while waiting for head
2019-10-24 15:53 ` Gavin Hu (Arm Technology China)
@ 2019-10-25 4:26 ` Pavan Nikhilesh Bhagavatula
2019-10-25 16:34 ` Gavin Hu (Arm Technology China)
0 siblings, 1 reply; 8+ messages in thread
From: Pavan Nikhilesh Bhagavatula @ 2019-10-25 4:26 UTC (permalink / raw)
To: Gavin Hu (Arm Technology China), Jerin Jacob Kollanukkaran; +Cc: dev, nd
Hi Gavin,
>-----Original Message-----
>From: dev <dev-bounces@dpdk.org> On Behalf Of Gavin Hu (Arm
>Technology China)
>Sent: Thursday, October 24, 2019 9:23 PM
>To: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>; Jerin
>Jacob Kollanukkaran <jerinj@marvell.com>
>Cc: dev@dpdk.org; nd <nd@arm.com>
>Subject: Re: [dpdk-dev] [PATCH] event/octeontx2: use wfe while
>waiting for head
>
>Hi Pavan,
>
>> -----Original Message-----
>> From: pbhagavatula@marvell.com <pbhagavatula@marvell.com>
>> Sent: Thursday, October 24, 2019 12:13 AM
>> To: Gavin Hu (Arm Technology China) <Gavin.Hu@arm.com>;
>> jerinj@marvell.com; Pavan Nikhilesh <pbhagavatula@marvell.com>
>> Cc: dev@dpdk.org
>> Subject: [dpdk-dev] [PATCH] event/octeontx2: use wfe while waiting
>for
>> head
>>
>> From: Pavan Nikhilesh <pbhagavatula@marvell.com>
>>
>> Use wfe to save power while waiting for tag to become head.
>>
>> SSO signals EVENTI to allow cores to exit from wfe when they
>> are waiting for specific operations in which one of them is
>> setting HEAD bit in GWS_TAG.
>>
>> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
>> ---
>> drivers/event/octeontx2/otx2_worker.h | 30
>++++++++++++++++++++++++--
>> -
>> 1 file changed, 27 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/event/octeontx2/otx2_worker.h
>> b/drivers/event/octeontx2/otx2_worker.h
>> index 4e971f27c..7a55caca5 100644
>> --- a/drivers/event/octeontx2/otx2_worker.h
>> +++ b/drivers/event/octeontx2/otx2_worker.h
>> @@ -226,10 +226,34 @@ otx2_ssogws_swtag_wait(struct
>otx2_ssogws *ws)
>> }
>>
>> static __rte_always_inline void
>> -otx2_ssogws_head_wait(struct otx2_ssogws *ws, const uint8_t
>wait_flag)
>> +otx2_ssogws_head_wait(struct otx2_ssogws *ws)
>> {
>> - while (wait_flag && !(otx2_read64(ws->tag_op) &
>BIT_ULL(35)))
>> +#ifdef RTE_ARCH_ARM64
>> + uint64_t tag;
>> +
>> + asm volatile (
>> + " ldr %[tag], [%[tag_op]] \n"
>"ldxr" should be used, exclusive-load is required to "monitor" the
>location, then a write to the location will cause clear of the exclusive
>monitor, thus a wake up event is generated implicitly.
As I have mentioned in the commit log:
"SSO signals EVENTI to allow cores to exit from wfe when they
are waiting for specific operations in which one of them is
setting HEAD bit in GWS_TAG."
The address need not be tracked by the global monitor.
>You can find more explanation is here:
>http://inbox.dpdk.org/dev/AM0PR08MB5363F9D1BA158B66B803EA068F6B0@AM0PR08MB5363.eurprd08.prod.outlook.com/
>/Gavin
>> + " tbnz %[tag], 35, done%=
> \n"
>> + " sevl \n"
>> + "rty%=: wfe \n"
>> + " ldr %[tag], [%[tag_op]] \n"
>> + " tbz %[tag], 35, rty%= \n"
>> + "done%=: \n"
>> + : [tag] "=&r" (tag)
>> + : [tag_op] "r" (ws->tag_op)
>> + );
>> +#else
>> + /* Wait for the HEAD to be set */
>> + while (!(otx2_read64(ws->tag_op) & BIT_ULL(35)))
>> ;
>> +#endif
>> +}
>> +
>> +static __rte_always_inline void
>> +otx2_ssogws_order(struct otx2_ssogws *ws, const uint8_t
>wait_flag)
>> +{
>> + if (wait_flag)
>> + otx2_ssogws_head_wait(ws);
>>
>> rte_cio_wmb();
>What ordering does this barrier try to keep? If there is a write then wait
>for kind of response, should this barrier move before
>otx2_ssogws_head_wait?
The barrier flushes the write buffer out to the LLC (the octeontx2 point of coherence) so
that NIX Tx picks up all the modifications made to the packet.
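To illustrate the ordering being described, here is a minimal sketch with hypothetical names (fake_txq, tx_one, write_barrier_standin). A C11 release fence stands in for rte_cio_wmb so that the sketch builds anywhere; it is not equivalent to the real device barrier and only shows where the barrier sits relative to the packet writes and the hand-off.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a Tx queue: one word the "device" reads. */
struct fake_txq {
	uint64_t doorbell;
};

/* Stand-in for rte_cio_wmb(); assumption: a release fence is enough for a sketch. */
static inline void
write_barrier_standin(void)
{
	atomic_thread_fence(memory_order_release);
}

static inline void
tx_one(struct fake_txq *txq, uint8_t *pkt, const uint8_t *hdr, size_t hdr_len)
{
	assert(txq != NULL && pkt != NULL && hdr != NULL);

	memcpy(pkt, hdr, hdr_len);                /* 1. modify the packet buffer */
	write_barrier_standin();                  /* 2. order those writes first */
	txq->doorbell = (uint64_t)(uintptr_t)pkt; /* 3. hand the packet over */
}
```

In the patch, the optional wait for HEAD happens between steps 1 and 2, which is why the barrier follows otx2_ssogws_head_wait rather than preceding it.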
>/Gavin
>> }
>> @@ -258,7 +282,7 @@ otx2_ssogws_event_tx(struct otx2_ssogws
>*ws,
>> struct rte_event ev[],
>>
>> /* Perform header writes before barrier for TSO */
>> otx2_nix_xmit_prepare_tso(m, flags);
>> - otx2_ssogws_head_wait(ws, !ev->sched_type);
>> + otx2_ssogws_order(ws, !ev->sched_type);
>> otx2_ssogws_prepare_pkt(txq, m, cmd, flags);
>>
>> if (flags & NIX_TX_MULTI_SEG_F) {
>> --
>> 2.17.1
* Re: [dpdk-dev] [PATCH] event/octeontx2: use wfe while waiting for head
2019-10-25 4:26 ` Pavan Nikhilesh Bhagavatula
@ 2019-10-25 16:34 ` Gavin Hu (Arm Technology China)
2019-10-25 17:06 ` Pavan Nikhilesh Bhagavatula
0 siblings, 1 reply; 8+ messages in thread
From: Gavin Hu (Arm Technology China) @ 2019-10-25 16:34 UTC (permalink / raw)
To: Pavan Nikhilesh Bhagavatula, jerinj; +Cc: dev, nd, nd
Hi Pavan,
> -----Original Message-----
> From: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>
> Sent: Friday, October 25, 2019 12:26 PM
> To: Gavin Hu (Arm Technology China) <Gavin.Hu@arm.com>;
> jerinj@marvell.com
> Cc: dev@dpdk.org; nd <nd@arm.com>
> Subject: RE: [dpdk-dev] [PATCH] event/octeontx2: use wfe while waiting for
> head
>
> Hi Gavin,
>
> >-----Original Message-----
> >From: dev <dev-bounces@dpdk.org> On Behalf Of Gavin Hu (Arm
> >Technology China)
> >Sent: Thursday, October 24, 2019 9:23 PM
> >To: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>; Jerin
> >Jacob Kollanukkaran <jerinj@marvell.com>
> >Cc: dev@dpdk.org; nd <nd@arm.com>
> >Subject: Re: [dpdk-dev] [PATCH] event/octeontx2: use wfe while
> >waiting for head
> >
> >Hi Pavan,
> >
> >> -----Original Message-----
> >> From: pbhagavatula@marvell.com <pbhagavatula@marvell.com>
> >> Sent: Thursday, October 24, 2019 12:13 AM
> >> To: Gavin Hu (Arm Technology China) <Gavin.Hu@arm.com>;
> >> jerinj@marvell.com; Pavan Nikhilesh <pbhagavatula@marvell.com>
> >> Cc: dev@dpdk.org
> >> Subject: [dpdk-dev] [PATCH] event/octeontx2: use wfe while waiting
> >for
> >> head
> >>
> >> From: Pavan Nikhilesh <pbhagavatula@marvell.com>
> >>
> >> Use wfe to save power while waiting for tag to become head.
> >>
> >> SSO signals EVENTI to allow cores to exit from wfe when they
> >> are waiting for specific operations in which one of them is
> >> setting HEAD bit in GWS_TAG.
> >>
> >> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> >> ---
> >> drivers/event/octeontx2/otx2_worker.h | 30
> >++++++++++++++++++++++++--
> >> -
> >> 1 file changed, 27 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/event/octeontx2/otx2_worker.h
> >> b/drivers/event/octeontx2/otx2_worker.h
> >> index 4e971f27c..7a55caca5 100644
> >> --- a/drivers/event/octeontx2/otx2_worker.h
> >> +++ b/drivers/event/octeontx2/otx2_worker.h
> >> @@ -226,10 +226,34 @@ otx2_ssogws_swtag_wait(struct
> >otx2_ssogws *ws)
> >> }
> >>
> >> static __rte_always_inline void
> >> -otx2_ssogws_head_wait(struct otx2_ssogws *ws, const uint8_t
> >wait_flag)
> >> +otx2_ssogws_head_wait(struct otx2_ssogws *ws)
> >> {
> >> - while (wait_flag && !(otx2_read64(ws->tag_op) &
> >BIT_ULL(35)))
> >> +#ifdef RTE_ARCH_ARM64
> >> + uint64_t tag;
> >> +
> >> + asm volatile (
> >> + " ldr %[tag], [%[tag_op]] \n"
> >"ldxr" should be used, exclusive-load is required to "monitor" the
> >location, then a write to the location will cause clear of the exclusive
> >monitor, thus a wake up event is generated implicitly.
>
> As I have mentioned in the commit log:
> "SSO signals EVENTI to allow cores to exit from wfe when they
> are waiting for specific operations in which one of them is
> setting HEAD bit in GWS_TAG."
If you have another expected wake-up source, that is OK. Just curious: is this signal sent explicitly to make the core exit WFE?
Also, between the implicit event (clearing of the exclusive monitor) and the explicit signal, which has the shorter latency?
/Gavin
>
> The address need not be tracked by the global monitor.
>
> >You can find more explanation is here:
> >http://inbox.dpdk.org/dev/AM0PR08MB5363F9D1BA158B66B803EA068F6B0@AM0PR08MB5363.eurprd08.prod.outlook.com/
> >/Gavin
> >> + " tbnz %[tag], 35, done%=
> > \n"
> >> + " sevl \n"
> >> + "rty%=: wfe \n"
> >> + " ldr %[tag], [%[tag_op]] \n"
> >> + " tbz %[tag], 35, rty%= \n"
> >> + "done%=: \n"
> >> + : [tag] "=&r" (tag)
> >> + : [tag_op] "r" (ws->tag_op)
> >> + );
> >> +#else
> >> + /* Wait for the HEAD to be set */
> >> + while (!(otx2_read64(ws->tag_op) & BIT_ULL(35)))
> >> ;
> >> +#endif
> >> +}
> >> +
> >> +static __rte_always_inline void
> >> +otx2_ssogws_order(struct otx2_ssogws *ws, const uint8_t
> >wait_flag)
> >> +{
> >> + if (wait_flag)
> >> + otx2_ssogws_head_wait(ws);
> >>
> >> rte_cio_wmb();
> >What ordering does this barrier try to keep? If there is a write then wait
> >for kind of response, should this barrier move before
> >otx2_ssogws_head_wait?
>
> The barrier is used to flush out write buffer to LLC (octeontx2 point of
> coherence) so
> that NIX Tx picks up all the modifications done to the packet.
Looking at the otx2_ssogws_event_tx function, at the point of rte_cio_wmb only the header has been written so far, right?
Should the barrier be delayed until after the whole packet is written and before the submission?
If NIX does not fall within the SMP configuration, should it be rte_io_wmb instead?
/Gavin
> >> }
> >> @@ -258,7 +282,7 @@ otx2_ssogws_event_tx(struct otx2_ssogws
> >*ws,
> >> struct rte_event ev[],
> >>
> >> /* Perform header writes before barrier for TSO */
> >> otx2_nix_xmit_prepare_tso(m, flags);
> >> - otx2_ssogws_head_wait(ws, !ev->sched_type);
> >> + otx2_ssogws_order(ws, !ev->sched_type);
> >> otx2_ssogws_prepare_pkt(txq, m, cmd, flags);
> >>
> >> if (flags & NIX_TX_MULTI_SEG_F) {
> >> --
> >> 2.17.1
* Re: [dpdk-dev] [PATCH] event/octeontx2: use wfe while waiting for head
2019-10-25 16:34 ` Gavin Hu (Arm Technology China)
@ 2019-10-25 17:06 ` Pavan Nikhilesh Bhagavatula
2019-10-27 9:12 ` Gavin Hu (Arm Technology China)
2019-12-18 17:42 ` Honnappa Nagarahalli
0 siblings, 2 replies; 8+ messages in thread
From: Pavan Nikhilesh Bhagavatula @ 2019-10-25 17:06 UTC (permalink / raw)
To: Gavin Hu (Arm Technology China), Jerin Jacob Kollanukkaran; +Cc: dev, nd, nd
Hi Gavin,
>Hi Pavan,
>
>> -----Original Message-----
>> From: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>
>> Sent: Friday, October 25, 2019 12:26 PM
>> To: Gavin Hu (Arm Technology China) <Gavin.Hu@arm.com>;
>> jerinj@marvell.com
>> Cc: dev@dpdk.org; nd <nd@arm.com>
>> Subject: RE: [dpdk-dev] [PATCH] event/octeontx2: use wfe while
>waiting for
>> head
>>
>> Hi Gavin,
>>
>> >-----Original Message-----
>> >From: dev <dev-bounces@dpdk.org> On Behalf Of Gavin Hu (Arm
>> >Technology China)
>> >Sent: Thursday, October 24, 2019 9:23 PM
>> >To: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>;
>Jerin
>> >Jacob Kollanukkaran <jerinj@marvell.com>
>> >Cc: dev@dpdk.org; nd <nd@arm.com>
>> >Subject: Re: [dpdk-dev] [PATCH] event/octeontx2: use wfe while
>> >waiting for head
>> >
>> >Hi Pavan,
>> >
>> >> -----Original Message-----
>> >> From: pbhagavatula@marvell.com <pbhagavatula@marvell.com>
>> >> Sent: Thursday, October 24, 2019 12:13 AM
>> >> To: Gavin Hu (Arm Technology China) <Gavin.Hu@arm.com>;
>> >> jerinj@marvell.com; Pavan Nikhilesh
><pbhagavatula@marvell.com>
>> >> Cc: dev@dpdk.org
>> >> Subject: [dpdk-dev] [PATCH] event/octeontx2: use wfe while
>waiting
>> >for
>> >> head
>> >>
>> >> From: Pavan Nikhilesh <pbhagavatula@marvell.com>
>> >>
>> >> Use wfe to save power while waiting for tag to become head.
>> >>
>> >> SSO signals EVENTI to allow cores to exit from wfe when they
>> >> are waiting for specific operations in which one of them is
>> >> setting HEAD bit in GWS_TAG.
>> >>
>> >> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
>> >> ---
>> >> drivers/event/octeontx2/otx2_worker.h | 30
>> >++++++++++++++++++++++++--
>> >> -
>> >> 1 file changed, 27 insertions(+), 3 deletions(-)
>> >>
>> >> diff --git a/drivers/event/octeontx2/otx2_worker.h
>> >> b/drivers/event/octeontx2/otx2_worker.h
>> >> index 4e971f27c..7a55caca5 100644
>> >> --- a/drivers/event/octeontx2/otx2_worker.h
>> >> +++ b/drivers/event/octeontx2/otx2_worker.h
>> >> @@ -226,10 +226,34 @@ otx2_ssogws_swtag_wait(struct
>> >otx2_ssogws *ws)
>> >> }
>> >>
>> >> static __rte_always_inline void
>> >> -otx2_ssogws_head_wait(struct otx2_ssogws *ws, const uint8_t
>> >wait_flag)
>> >> +otx2_ssogws_head_wait(struct otx2_ssogws *ws)
>> >> {
>> >> - while (wait_flag && !(otx2_read64(ws->tag_op) &
>> >BIT_ULL(35)))
>> >> +#ifdef RTE_ARCH_ARM64
>> >> + uint64_t tag;
>> >> +
>> >> + asm volatile (
>> >> + " ldr %[tag], [%[tag_op]] \n"
>> >"ldxr" should be used, exclusive-load is required to "monitor" the
>> >location, then a write to the location will cause clear of the exclusive
>> >monitor, thus a wake up event is generated implicitly.
>>
>> As I have mentioned in the commit log:
>> "SSO signals EVENTI to allow cores to exit from wfe when they
>> are waiting for specific operations in which one of them is
>> setting HEAD bit in GWS_TAG."
>If you have other expected wake up sources, that is ok. Just curious is
>this signal explicitly sent to quit WFE?
AFAIK yes, explicitly sent to quit WFE.
>Just wondering, implicit event(Clear of exclusive monitor) vs explicit
>signal, which has shorter latency?
Not really sure, but the SSO has a dedicated bus inside each core.
>/Gavin
>>
>> The address need not be tracked by the global monitor.
>>
>> >You can find more explanation is here:
>> >http://inbox.dpdk.org/dev/AM0PR08MB5363F9D1BA158B66B803EA068F6B0@AM0PR08MB5363.eurprd08.prod.outlook.com/
>> >/Gavin
>> >> + " tbnz %[tag], 35, done%=
>> > \n"
>> >> + " sevl \n"
>> >> + "rty%=: wfe \n"
>> >> + " ldr %[tag], [%[tag_op]] \n"
>> >> + " tbz %[tag], 35, rty%= \n"
>> >> + "done%=: \n"
>> >> + : [tag] "=&r" (tag)
>> >> + : [tag_op] "r" (ws->tag_op)
>> >> + );
>> >> +#else
>> >> + /* Wait for the HEAD to be set */
>> >> + while (!(otx2_read64(ws->tag_op) & BIT_ULL(35)))
>> >> ;
>> >> +#endif
>> >> +}
>> >> +
>> >> +static __rte_always_inline void
>> >> +otx2_ssogws_order(struct otx2_ssogws *ws, const uint8_t
>> >wait_flag)
>> >> +{
>> >> + if (wait_flag)
>> >> + otx2_ssogws_head_wait(ws);
>> >>
>> >> rte_cio_wmb();
>> >What ordering does this barrier try to keep? If there is a write then
>wait
>> >for kind of response, should this barrier move before
>> >otx2_ssogws_head_wait?
>>
>> The barrier is used to flush out write buffer to LLC (octeontx2 point of
>> coherence) so
>> that NIX Tx picks up all the modifications done to the packet.
>Looking at the otx2_ssogws_event_tx function, so far at the point of
>rte_cio_wmb, only the header is written?
>Should it be delayed after the whole packet written and before the
>submission?
We only care that the writes to the actual packet buffer, e.g. the start of the Ethernet header,
are committed.
The rest of the mbuf fields are translated into a HW command after the barrier and written
to an LMTLINE using ldeor.
>If NIX is not falling within the SMP configuration, should it be
>rte_io_wmb instead?
Octeontx2 has only a single shareability domain, i.e. it makes no distinction between
the outer and inner shareable domains.
Since all IO devices are interpreted to be in the outer shareable domain, we like to use
rte_cio_(r/w)mb for IO devices.
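For context, a rough sketch of how the three write barriers under discussion differ on arm64. The sketch_* names are hypothetical; the instruction mapping reflects my understanding of the arm64 EAL headers of that release (rte_io_wmb as dsb st, rte_cio_wmb as dmb oshst, rte_smp_wmb as dmb ishst) and should be checked against the tree. On other architectures the sketch degrades to a compiler barrier.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__aarch64__)
#define sketch_dsb(opt) __asm__ volatile("dsb " #opt : : : "memory")
#define sketch_dmb(opt) __asm__ volatile("dmb " #opt : : : "memory")
#else
/* Assumption: a compiler barrier is an acceptable placeholder off arm64. */
#define sketch_dsb(opt) __asm__ volatile("" : : : "memory")
#define sketch_dmb(opt) __asm__ volatile("" : : : "memory")
#endif

#define sketch_io_wmb()  sketch_dsb(st)    /* full-system store barrier */
#define sketch_cio_wmb() sketch_dmb(oshst) /* outer shareable, stores only */
#define sketch_smp_wmb() sketch_dmb(ishst) /* inner shareable, stores only */

/* Hypothetical usage: write data, then publish a flag after the barrier. */
static inline uint64_t
publish_with_cio_wmb(volatile uint64_t *data, volatile uint64_t *flag, uint64_t v)
{
	assert(data != NULL && flag != NULL);

	*data = v;        /* payload write */
	sketch_cio_wmb(); /* order the payload before the flag */
	*flag = 1;        /* consumer may now read *data */
	return *data;
}
```

With a single shareability domain, as described above, the outer shareable dmb is the lighter choice that still covers IO devices.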
>/Gavin
Regards,
Pavan.
>> >> }
>> >> @@ -258,7 +282,7 @@ otx2_ssogws_event_tx(struct otx2_ssogws
>> >*ws,
>> >> struct rte_event ev[],
>> >>
>> >> /* Perform header writes before barrier for TSO */
>> >> otx2_nix_xmit_prepare_tso(m, flags);
>> >> - otx2_ssogws_head_wait(ws, !ev->sched_type);
>> >> + otx2_ssogws_order(ws, !ev->sched_type);
>> >> otx2_ssogws_prepare_pkt(txq, m, cmd, flags);
>> >>
>> >> if (flags & NIX_TX_MULTI_SEG_F) {
>> >> --
>> >> 2.17.1
* Re: [dpdk-dev] [PATCH] event/octeontx2: use wfe while waiting for head
2019-10-25 17:06 ` Pavan Nikhilesh Bhagavatula
@ 2019-10-27 9:12 ` Gavin Hu (Arm Technology China)
2019-10-30 13:33 ` Jerin Jacob
2019-12-18 17:42 ` Honnappa Nagarahalli
1 sibling, 1 reply; 8+ messages in thread
From: Gavin Hu (Arm Technology China) @ 2019-10-27 9:12 UTC (permalink / raw)
To: Pavan Nikhilesh Bhagavatula, jerinj; +Cc: dev, nd, nd, nd
Hi Pavan,
> -----Original Message-----
> From: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>
> Sent: Saturday, October 26, 2019 1:06 AM
> To: Gavin Hu (Arm Technology China) <Gavin.Hu@arm.com>;
> jerinj@marvell.com
> Cc: dev@dpdk.org; nd <nd@arm.com>; nd <nd@arm.com>
> Subject: RE: [dpdk-dev] [PATCH] event/octeontx2: use wfe while waiting for
> head
>
> Hi Gavin,
> >Hi Pavan,
> >
> >> -----Original Message-----
> >> From: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>
> >> Sent: Friday, October 25, 2019 12:26 PM
> >> To: Gavin Hu (Arm Technology China) <Gavin.Hu@arm.com>;
> >> jerinj@marvell.com
> >> Cc: dev@dpdk.org; nd <nd@arm.com>
> >> Subject: RE: [dpdk-dev] [PATCH] event/octeontx2: use wfe while
> >waiting for
> >> head
> >>
> >> Hi Gavin,
> >>
> >> >-----Original Message-----
> >> >From: dev <dev-bounces@dpdk.org> On Behalf Of Gavin Hu (Arm
> >> >Technology China)
> >> >Sent: Thursday, October 24, 2019 9:23 PM
> >> >To: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>;
> >Jerin
> >> >Jacob Kollanukkaran <jerinj@marvell.com>
> >> >Cc: dev@dpdk.org; nd <nd@arm.com>
> >> >Subject: Re: [dpdk-dev] [PATCH] event/octeontx2: use wfe while
> >> >waiting for head
> >> >
> >> >Hi Pavan,
> >> >
> >> >> -----Original Message-----
> >> >> From: pbhagavatula@marvell.com <pbhagavatula@marvell.com>
> >> >> Sent: Thursday, October 24, 2019 12:13 AM
> >> >> To: Gavin Hu (Arm Technology China) <Gavin.Hu@arm.com>;
> >> >> jerinj@marvell.com; Pavan Nikhilesh
> ><pbhagavatula@marvell.com>
> >> >> Cc: dev@dpdk.org
> >> >> Subject: [dpdk-dev] [PATCH] event/octeontx2: use wfe while
> >waiting
> >> >for
> >> >> head
> >> >>
> >> >> From: Pavan Nikhilesh <pbhagavatula@marvell.com>
> >> >>
> >> >> Use wfe to save power while waiting for tag to become head.
> >> >>
> >> >> SSO signals EVENTI to allow cores to exit from wfe when they
> >> >> are waiting for specific operations in which one of them is
> >> >> setting HEAD bit in GWS_TAG.
> >> >>
> >> >> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> >> >> ---
> >> >> drivers/event/octeontx2/otx2_worker.h | 30
> >> >++++++++++++++++++++++++--
> >> >> -
> >> >> 1 file changed, 27 insertions(+), 3 deletions(-)
> >> >>
> >> >> diff --git a/drivers/event/octeontx2/otx2_worker.h
> >> >> b/drivers/event/octeontx2/otx2_worker.h
> >> >> index 4e971f27c..7a55caca5 100644
> >> >> --- a/drivers/event/octeontx2/otx2_worker.h
> >> >> +++ b/drivers/event/octeontx2/otx2_worker.h
> >> >> @@ -226,10 +226,34 @@ otx2_ssogws_swtag_wait(struct
> >> >otx2_ssogws *ws)
> >> >> }
> >> >>
> >> >> static __rte_always_inline void
> >> >> -otx2_ssogws_head_wait(struct otx2_ssogws *ws, const uint8_t
> >> >wait_flag)
> >> >> +otx2_ssogws_head_wait(struct otx2_ssogws *ws)
> >> >> {
> >> >> - while (wait_flag && !(otx2_read64(ws->tag_op) &
> >> >BIT_ULL(35)))
> >> >> +#ifdef RTE_ARCH_ARM64
> >> >> + uint64_t tag;
> >> >> +
> >> >> + asm volatile (
> >> >> + " ldr %[tag], [%[tag_op]] \n"
> >> >"ldxr" should be used, exclusive-load is required to "monitor" the
> >> >location, then a write to the location will cause clear of the exclusive
> >> >monitor, thus a wake up event is generated implicitly.
> >>
> >> As I have mentioned in the commit log:
> >> "SSO signals EVENTI to allow cores to exit from wfe when they
> >> are waiting for specific operations in which one of them is
> >> setting HEAD bit in GWS_TAG."
> >If you have other expected wake up sources, that is ok. Just curious is
> >this signal explicitly sent to quit WFE?
>
> AFAIK yes, explicitly sent to quit WFE.
>
> >Just wondering, implicit event(Clear of exclusive monitor) vs explicit
> >signal, which has shorter latency?
>
> Not really sure but SSO has dedicated bus inside each core.
That's ok.
>
> >/Gavin
> >>
> >> The address need not be tracked by the global monitor.
> >>
> >> >You can find more explanation is here:
> >> >http://inbox.dpdk.org/dev/AM0PR08MB5363F9D1BA158B66B803EA068F6B0@AM0PR08MB5363.eurprd08.prod.outlook.com/
> >> >/Gavin
> >> >> + " tbnz %[tag], 35, done%=
> >> > \n"
> >> >> + " sevl \n"
> >> >> + "rty%=: wfe \n"
> >> >> + " ldr %[tag], [%[tag_op]] \n"
> >> >> + " tbz %[tag], 35, rty%= \n"
> >> >> + "done%=: \n"
> >> >> + : [tag] "=&r" (tag)
> >> >> + : [tag_op] "r" (ws->tag_op)
> >> >> + );
> >> >> +#else
> >> >> + /* Wait for the HEAD to be set */
> >> >> + while (!(otx2_read64(ws->tag_op) & BIT_ULL(35)))
> >> >> ;
> >> >> +#endif
> >> >> +}
> >> >> +
> >> >> +static __rte_always_inline void
> >> >> +otx2_ssogws_order(struct otx2_ssogws *ws, const uint8_t
> >> >wait_flag)
> >> >> +{
> >> >> + if (wait_flag)
> >> >> + otx2_ssogws_head_wait(ws);
> >> >>
> >> >> rte_cio_wmb();
> >> >What ordering does this barrier try to keep? If there is a write then
> >wait
> >> >for kind of response, should this barrier move before
> >> >otx2_ssogws_head_wait?
> >>
> >> The barrier is used to flush out write buffer to LLC (octeontx2 point of
> >> coherence) so
> >> that NIX Tx picks up all the modifications done to the packet.
>
> >Looking at the otx2_ssogws_event_tx function, so far at the point of
> >rte_cio_wmb, only the header is written?
> >Should it be delayed after the whole packet written and before the
> >submission?
>
> We only care that the writes to the actual packet buffer ex. Start of ethernet
> header
> are committed.
> The rest of mbuf fields are translated into a HW command after the barrier
> and written
> to a LMTLINE using ldoer.
>
> >If NIX is not falling within the SMP configuration, should it be
> >rte_io_wmb instead?
>
> Octeontx2 has only single shareability domain i.e. it makes no distinction
> between
> Outer and inner sharable domains.
> Since all IO devices are interpreted to be on outer sharable domain, we like
> to use
> rte_cio_(r/w)mb for IO devices.
Yes, for an integral part of the outer shareable domain, rte_cio_(r/w)mb is sufficient.
>
> Regards,
> Pavan.
>
> >> >> }
> >> >> @@ -258,7 +282,7 @@ otx2_ssogws_event_tx(struct otx2_ssogws
> >> >*ws,
> >> >> struct rte_event ev[],
> >> >>
> >> >> /* Perform header writes before barrier for TSO */
> >> >> otx2_nix_xmit_prepare_tso(m, flags);
> >> >> - otx2_ssogws_head_wait(ws, !ev->sched_type);
> >> >> + otx2_ssogws_order(ws, !ev->sched_type);
> >> >> otx2_ssogws_prepare_pkt(txq, m, cmd, flags);
> >> >>
> >> >> if (flags & NIX_TX_MULTI_SEG_F) {
> >> >> --
> >> >> 2.17.1
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
* Re: [dpdk-dev] [PATCH] event/octeontx2: use wfe while waiting for head
2019-10-27 9:12 ` Gavin Hu (Arm Technology China)
@ 2019-10-30 13:33 ` Jerin Jacob
0 siblings, 0 replies; 8+ messages in thread
From: Jerin Jacob @ 2019-10-30 13:33 UTC (permalink / raw)
To: Gavin Hu (Arm Technology China)
Cc: Pavan Nikhilesh Bhagavatula, jerinj, dev, nd
On Sun, Oct 27, 2019 at 2:42 PM Gavin Hu (Arm Technology China)
<Gavin.Hu@arm.com> wrote:
>
> Hi Pavan,
>
> > -----Original Message-----
> > From: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>
> > Sent: Saturday, October 26, 2019 1:06 AM
> > To: Gavin Hu (Arm Technology China) <Gavin.Hu@arm.com>;
> > jerinj@marvell.com
> > Cc: dev@dpdk.org; nd <nd@arm.com>; nd <nd@arm.com>
> > Subject: RE: [dpdk-dev] [PATCH] event/octeontx2: use wfe while waiting for
> > head
> >
> Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Applied to dpdk-next-eventdev/master. Thanks.
>
* Re: [dpdk-dev] [PATCH] event/octeontx2: use wfe while waiting for head
2019-10-25 17:06 ` Pavan Nikhilesh Bhagavatula
2019-10-27 9:12 ` Gavin Hu (Arm Technology China)
@ 2019-12-18 17:42 ` Honnappa Nagarahalli
1 sibling, 0 replies; 8+ messages in thread
From: Honnappa Nagarahalli @ 2019-12-18 17:42 UTC (permalink / raw)
To: Pavan Nikhilesh Bhagavatula, Gavin Hu, jerinj, Honnappa Nagarahalli
Cc: dev, nd, nd
<snip>
> >> >>
> >> >> static __rte_always_inline void
> >> >> -otx2_ssogws_head_wait(struct otx2_ssogws *ws, const uint8_t
> >> >wait_flag)
> >> >> +otx2_ssogws_head_wait(struct otx2_ssogws *ws)
> >> >> {
> >> >> - while (wait_flag && !(otx2_read64(ws->tag_op) &
> >> >BIT_ULL(35)))
> >> >> +#ifdef RTE_ARCH_ARM64
> >> >> + uint64_t tag;
> >> >> +
> >> >> + asm volatile (
> >> >> + " ldr %[tag], [%[tag_op]] \n"
> >> >"ldxr" should be used, exclusive-load is required to "monitor" the
> >> >location, then a write to the location will cause clear of the
> >> >exclusive monitor, thus a wake up event is generated implicitly.
> >>
> >> As I have mentioned in the commit log:
> >> "SSO signals EVENTI to allow cores to exit from wfe when they are
> >> waiting for specific operations in which one of them is setting HEAD
> >> bit in GWS_TAG."
> >If you have other expected wake up sources, that is ok. Just curious is
> >this signal explicitly sent to quit WFE?
>
> AFAIK yes, explicitly sent to quit WFE.
Pavan, is the wake up event sent to the particular core that is waiting on this head or is it sent to all the cores?
end of thread, other threads:[~2019-12-18 17:42 UTC | newest]
Thread overview: 8+ messages
2019-10-23 16:12 [dpdk-dev] [PATCH] event/octeontx2: use wfe while waiting for head pbhagavatula
2019-10-24 15:53 ` Gavin Hu (Arm Technology China)
2019-10-25 4:26 ` Pavan Nikhilesh Bhagavatula
2019-10-25 16:34 ` Gavin Hu (Arm Technology China)
2019-10-25 17:06 ` Pavan Nikhilesh Bhagavatula
2019-10-27 9:12 ` Gavin Hu (Arm Technology China)
2019-10-30 13:33 ` Jerin Jacob
2019-12-18 17:42 ` Honnappa Nagarahalli