patches for DPDK stable branches
 help / color / mirror / Atom feed
From: Feifei Wang <Feifei.Wang2@arm.com>
To: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>,
	"jerinj@marvell.com" <jerinj@marvell.com>,
	Harry van Haaren <harry.van.haaren@intel.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>, nd <nd@arm.com>,
	Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>,
	"stable@dpdk.org" <stable@dpdk.org>,
	Ruifeng Wang <Ruifeng.Wang@arm.com>, nd <nd@arm.com>,
	nd <nd@arm.com>, nd <nd@arm.com>, nd <nd@arm.com>
Subject: [dpdk-stable] 回复: [RFC PATCH v1 4/6] app/eventdev: add release barriers for pipeline test
Date: Mon, 11 Jan 2021 01:57:07 +0000	[thread overview]
Message-ID: <DBBPR08MB44119E6B8B54CF36183446B4C8AB0@DBBPR08MB4411.eurprd08.prod.outlook.com> (raw)
In-Reply-To: <CO6PR18MB382875FB67CD8C5B6C8D29A7DEAE9@CO6PR18MB3828.namprd18.prod.outlook.com>



> -----邮件原件-----
> 发件人: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>
> 发送时间: 2021年1月8日 18:58
> 收件人: Feifei Wang <Feifei.Wang2@arm.com>; jerinj@marvell.com; Harry
> van Haaren <harry.van.haaren@intel.com>
> 抄送: dev@dpdk.org; nd <nd@arm.com>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; stable@dpdk.org; Ruifeng Wang
> <Ruifeng.Wang@arm.com>; nd <nd@arm.com>; nd <nd@arm.com>; nd
> <nd@arm.com>
> 主题: RE: [RFC PATCH v1 4/6] app/eventdev: add release barriers for pipeline
> test
> 
> Hi Feifei,
> 
> >Hi, Pavan
> >
> >> -----邮件原件-----
> >> 发件人: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>
> >> 发送时间: 2021年1月8日 17:13
> >> 收件人: Feifei Wang <Feifei.Wang2@arm.com>; jerinj@marvell.com;
> >Harry
> >> van Haaren <harry.van.haaren@intel.com>
> >> 抄送: dev@dpdk.org; nd <nd@arm.com>; Honnappa Nagarahalli
> >> <Honnappa.Nagarahalli@arm.com>; stable@dpdk.org; Ruifeng Wang
> >> <Ruifeng.Wang@arm.com>; nd <nd@arm.com>; nd <nd@arm.com>
> >> 主题: RE: [RFC PATCH v1 4/6] app/eventdev: add release barriers for
> >pipeline
> >> test
> >>
> >> Hi Feifei,
> >>
> >> >Hi, Pavan
> >> >
> >> >
> >> >
> >> >> -----邮件原件-----
> >> >
> >> >> 发件人: Pavan Nikhilesh Bhagavatula
> ><mailto:pbhagavatula@marvell.com>
> >> >
> >> >> 发送时间: 2021年1月5日 17:29
> >> >
> >> >> 收件人: Feifei Wang <mailto:Feifei.Wang2@arm.com>;
> >mailto:jerinj@marvell.com;
> >> >Harry
> >> >
> >> >> van Haaren <mailto:harry.van.haaren@intel.com>
> >> >
> >> >> 抄送: mailto:dev@dpdk.org; nd <mailto:nd@arm.com>; Honnappa
> >Nagarahalli
> >> >
> >> >> <mailto:Honnappa.Nagarahalli@arm.com>;
> >mailto:stable@dpdk.org; Ruifeng Wang
> >> >
> >> >> <mailto:Ruifeng.Wang@arm.com>; nd <mailto:nd@arm.com>
> >> >
> >> >> 主题: RE: [RFC PATCH v1 4/6] app/eventdev: add release barriers
> >for
> >> >pipeline
> >> >
> >> >> test
> >> >
> >> >>
> >> >
> >> >> Hi Feifei,
> >> >
> >> >>
> >> >
> >> >> >Hi, Pavan
> >> >
> >> >> >
> >> >
> >> >> >Sorry for my late reply and thanks very much for your review.
> >> >
> >> >> >
> >> >
> >> >> >> -----Original Message-----
> >> >
> >> >> >> From: Pavan Nikhilesh Bhagavatula
> >>
> >><mailto:pbhagavatula@marvell.com<mailto:pbhagavatula@marvell.c
> >om>>
> >> >
> >> >> >> Sent: 2020年12月22日 18:33
> >> >
> >> >> >> To: Feifei Wang
> >> ><mailto:Feifei.Wang2@arm.com<mailto:Feifei.Wang2@arm.com>>;
> >> >mailto:jerinj@marvell.com<mailto:jerinj@marvell.com>;
> >> >
> >> >> >Harry van
> >> >
> >> >> >> Haaren
> >>
> >><mailto:harry.van.haaren@intel.com<mailto:harry.van.haaren@intel.
> >com>>;
> >> >Pavan Nikhilesh
> >> >
> >> >> >>
> >>
> >><pbhagavatula@caviumnetworks.com<mailto:pbhagavatula@cavium
> >n
> >> >etworks.com>>
> >> >
> >> >> >> Cc: mailto:dev@dpdk.org<mailto:dev@dpdk.org>; nd
> >> ><mailto:nd@arm.com<mailto:nd@arm.com>>; Honnappa
> >Nagarahalli
> >> >
> >> >> >>
> >>
> >><Honnappa.Nagarahalli@arm.com<mailto:Honnappa.Nagarahalli@ar
> >m
> >> >.com>>; mailto:stable@dpdk.org<mailto:stable@dpdk.org>; Phil
> >Yang
> >> >
> >> >> >> <mailto:Phil.Yang@arm.com<mailto:Phil.Yang@arm.com>>
> >> >
> >> >> >> Subject: RE: [RFC PATCH v1 4/6] app/eventdev: add release
> >> >barriers
> >> >
> >> >> >for
> >> >
> >> >> >> pipeline test
> >> >
> >> >> >>
> >> >
> >> >> >>
> >> >
> >> >> >> >Add release barriers before updating the processed packets for
> >> >
> >> >> >worker
> >> >
> >> >> >> >lcores to ensure the worker lcore has really finished data
> >> >
> >> >> >> >processing and then it can update the processed packets
> >number.
> >> >
> >> >> >> >
> >> >
> >> >> >>
> >> >
> >> >> >> I believe we can live with minor inaccuracies in stats being
> >> >
> >> >> >> presented
> >> >
> >> >> >as
> >> >
> >> >> >> atomics are pretty heavy when scheduler is limited to burst
> >> >> >> size as
> >> >1.
> >> >
> >> >> >>
> >> >
> >> >> >> One option is to move it before a pipeline operation
> >> >
> >> >> >(pipeline_event_tx,
> >> >
> >> >> >> pipeline_fwd_event etc.) as they imply implicit release barrier
> >> >> >> (as
> >> >
> >> >> >> all
> >> >
> >> >> >the
> >> >
> >> >> >> changes done to the event should be visible to the next core).
> >> >
> >> >> >
> >> >
> >> >> >If I understand correctly, your meaning is that move release
> >> >> >barriers
> >> >
> >> >> >before pipeline_event_tx or pipeline_fwd_event. This can ensure
> >the
> >> >
> >> >> >event has been processed before the next core begins to tx/fwd.
> >For
> >> >
> >> >> >example:
> >> >
> >> >>
> >> >
> >> >> What I meant was event APIs such as `rte_event_enqueue_burst`,
> >> >
> >> >> `rte_event_eth_tx_adapter_enqueue`
> >> >
> >> >> act as an implicit release barrier and the API
> >> >`rte_event_dequeue_burst` act
> >> >
> >> >> as an implicit acquire barrier.
> >> >
> >> >
> >> >
> >> >>
> >> >
> >> >> Since, pipeline_* test starts with a dequeue() and ends with an
> >> >enqueue() I
> >> >
> >> >> don’t believe we need barriers in Between.
> >> >
> >> >
> >> >
> >> >Sorry for my misunderstanding this. And I agree with you that no
> >> >barriers are
> >> >
> >> >needed between dequeue and enqueue.
> >> >
> >> >
> >> >
> >> >Now, let's go back to the beginning. Actually with this patch, our
> >> >barrier is mainly
> >> >
> >> >for the synchronous variable " w->processed_pkts ". As we all know,
> >the
> >> >event is firstly
> >> >
> >> >dequeued and then enqueued, after this, the event can be treated as
> >the
> >> >processed event
> >> >
> >> >and included in the statistics("w->processed_pkts++").
> >> >
> >> >
> >> >
> >> >Thus, we add a release barrier before " w->processed_pkts++" is to
> >> >prevent this operation
> >> >
> >> >being executed ahead of time. For example:
> >> >
> >> >dequeue  ->  w->processed_pkts++  ->  enqueue
> >> >
> >> >This cause that the worker doesn't actually finish this event
> >> >processing, but the event is treated
> >> >
> >> >as the processed one and included in the statistics.
> >> >
> >>
> >> But the current sequence is dequeue-> enqueue-> w- processed_pkts++
> >>and enqueue already acts as an explicit release barrier right?
> >>
> >
> >Sorry maybe I cannot understand how “enqueue” as an explicit release
> >barrier. I think of two possibilities:
> >1. As you say before, all the changes done to the event should be
> >visible to the next core and enqueue is a operation for event, so the
> >next core should wait for the event to be enqueued.
> >I think this is due to data dependence for the same variable. However,
> >‘w->processed_pkts’ and ‘ev’ are different variables, so this cannot
> >prevent ‘w->processed_pkts++’ before enqueue.
> >And the main core may load updated ‘w->processed_pkts’ but actually the
> >event is still being processed. For example:
> >
> > Time Slot	   Worker 1                                      Main core
> >   1                           dequeue
> >   2                  w->processed_pkts++
> >   3                       		                              load w->processed_pkts
> >   4                          enqueue
> >
> >2. Some release barriers have been included in enqueue. There is a
> >release barrier in rte_ring_enqueue :
> >move head -> copy elements to the ring -> release barrier -> update
> >tail
> >-> w->processed_pkts++
> >However, this barrier cannot prevent ‘w->processed_pkts++’ before
> >update tail, and when update_tail has been finished, the enqueue
> >process  can be seen completed.
> 
> I was talking about case 2 in particular almost all enqueue calls have some
> kind of release barrier in place. I do agree w->processed_pkts++ might get
> reordered with tail update but since enqueue itself is a ldr + blr I was hoping
> that it wouldn't occur.
> 
> We can continue the discussion once I have some performance data.
> 
Ok, that's great. I think this is a meaningful discussion. Thanks for your effort~.

Best Regards
Feifei

> Thanks for your patience :)
> Pavan.
> 
> >
> >> >
> >> >
> >>
> >>________________________________________________________
> _
> >> _
> >> >____________________
> >> >
> >> >
> >> >
> >> >By the way, I have two other questions about pipeline process test
> >> >in "test_pipeline_queue".
> >> >
> >> >1. when do we start counting processed events (w-
> >>processed_pkts)?
> >> >
> >> >For the fwd mode (internal_port = false), when we choose single
> >stage,
> >> >application increments
> >> >
> >> >the number events processed after "pipeline_event enqueue".
> >> >However, when we choose multiple
> >> >
> >> >stage, application increments the number events processed before
> >> >"pipleline_event_enqueue".
> >>
> >> We count an event as process when all the stages have completed
> >and its
> >> Trasnmitted.
> >>
> >> >So,
> >> >
> >> >maybe we can unify this. For example of multiple stage:
> >> >
> >> >
> >> >
> >> >                                if (cq_id == last_queue) {
> >> >
> >> >                                                ev.queue_id =
> >> > tx_queue[ev.mbuf->port];
> >> >
> >> >
> >> >rte_event_eth_tx_adapter_txq_set(ev.mbuf,
> >> >0);
> >> >
> >> >
> >> >pipeline_fwd_event(&ev, RTE_SCHED_TYPE_ATOMIC);
> >> >
> >> >                                +             pipeline_event_enqueue(dev, port, &ev);
> >> >
> >> >                                                w->processed_pkts++;
> >> >
> >> >                                } else {
> >> >
> >> >                                                ev.queue_id++;
> >> >
> >> >
> >> >pipeline_fwd_event(&ev, sched_type_list[cq_id]);
> >> >
> >> >                                +             pipeline_event_enqueue(dev, port, &ev);
> >> >
> >> >                                }
> >> >
> >> >
> >> >
> >> >                -              pipeline_event_enqueue(dev, port, &ev);
> >> >
> >> >
> >>
> >> The above change makes sense.
> >>
> >Thanks for your review, and I’ll update this change into the next
> >version.
> >> >
> >> >2. Whether  "pipeline_event_enqueue" is needed after
> >> >"pipeline_event_tx" for tx mode?
> >> >
> >> >For single_stage_burst_tx mode, after "pipeline_event_tx", the
> >worker
> >> >has to enqueue again
> >> >
> >> >due to  "pipeline_event_enqueue_burst", so maybe we should jump
> >out of
> >> >the loop after
> >> >
> >> >“pipeline_event_tx”,
> >>
> >> We call enqueue burst to release the events i.e. enqueue events with
> >> RTE_EVENT_OP_RELEASE.
> >>
> >However,
> >In case of single event, for ' pipeline_queue_worker_single_stage_tx'
> >and ' pipeline_queue_worker_multi_stage_tx',
> >after tx, there is no release operation.
> >
> >> >
> >>  for example:
> >> >
> >> >
> >> >
> >> >                                                if (ev[i].sched_type
> >> >==
> >> >RTE_SCHED_TYPE_ATOMIC) {
> >> >
> >> >
> >> > pipeline_event_tx(dev, port, &ev[i]);
> >> >
> >> >
> >> > ev[i].op = RTE_EVENT_OP_RELEASE;
> >> >
> >> >
> >> > w->processed_pkts++;
> >> >
> >> >                                                +             continue;
> >> >
> >> >                                                } else {
> >> >
> >> >
> >> > ev[i].queue_id++;
> >> >
> >> >
> >> > pipeline_fwd_event(&ev[i],
> >> >
> >> >
> >> >RTE_SCHED_TYPE_ATOMIC);
> >> >
> >> >                                                }
> >> >
> >> >                                }
> >> >
> >> >
> >> >
> >> >                                pipeline_event_enqueue_burst(dev,
> >> > port, ev, nb_rx);
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >>
> >> >
> >> >> >
> >> >
> >> >> >if (ev.sched_type == RTE_SCHED_TYPE_ATOMIC) {
> >> >
> >> >> >                          +
> >> >__atomic_thread_fence(__ATOMIC_RELEASE);
> >> >
> >> >> >                                          pipeline_event_tx(dev,
> >> >> > port, &ev);
> >> >
> >> >> >                                          w->processed_pkts++;
> >> >
> >> >> >                          } else {
> >> >
> >> >> >                                          ev.queue_id++;
> >> >
> >> >> >                          +
> >> >__atomic_thread_fence(__ATOMIC_RELEASE);
> >> >
> >> >> >                                          pipeline_fwd_event(&ev,
> >> >
> >> >> >RTE_SCHED_TYPE_ATOMIC);
> >> >
> >> >> >
> >> >> > pipeline_event_enqueue(dev, port, &ev);
> >> >
> >> >> >
> >> >
> >> >> >However, there are two reasons to prevent this:
> >> >
> >> >> >
> >> >
> >> >> >First, compare with other tests in app/eventdev, for example, the
> >> >
> >> >> >eventdev perf test, the wmb is after event operation to ensure
> >> >
> >> >> >operation has been finished and then w->processed_pkts++.
> >> >
> >> >>
> >> >
> >> >> In case of perf_* tests start with a dequeue() and finally ends
> >> >> with a
> >> >
> >> >> mempool_put() should also act as implicit acquire release pairs
> >> >making stats
> >> >
> >> >> consistent?
> >> >
> >> >
> >> >
> >> >For perf tests, this consistency refers to that there is a wmb after
> >> >mempool_put().
> >> >
> >> >Please refer to this link:
> >> >
> >> >https://urldefense.proofpoint.com/v2/url?u=http-
> >>
> >>3A__patches.dpdk.org_patch_85634_&d=DwIGaQ&c=nKjWec2b6R0m
> >O
> >>
> >>yPaz7xtfQ&r=1cjuAHrGh745jHNmj2fD85sUMIJ2IPIDsIJzo6FN6Z0&m=z
> >g
> >> >QHeSDiXWfI1PIIUxXBqMS6E-
> >2_3G46nhrzGXoBpHI&s=0FwTxPXjWflh-
> >> >sdmnkY133IPlJB780x0yxe7Am3JCBw&e=
> >> >
> >> >
> >> >
> >> >>
> >> >
> >> >> >So, if we move release barriers before tx/fwd, it may cause that
> >the
> >> >
> >> >> >tests of app/eventdev become  inconsistent.This may reduce the
> >> >
> >> >> >maintainability of the code and make it difficult to understand.
> >> >
> >> >> >
> >> >
> >> >> >Second, it is a test case, though heavy thread may cause
> >> >performance
> >> >
> >> >> >degradation, it can ensure that the operation process and the
> >> >> >test
> >> >
> >> >> >result are correct. And maybe for a test case, correctness is
> >> >> >more
> >> >
> >> >> >important than performance.
> >> >
> >> >> >
> >> >
> >> >>
> >> >
> >> >> Most of our internal perf test run on 24/48 core combinations and
> >> >since
> >> >
> >> >> Octeontx2 event device driver supports a burst size of 1, it will
> >> >> show
> >> >up as
> >> >
> >> >> Huge performance degradation.
> >> >
> >> >
> >> >
> >> >For the impact on performance, I do the test using software driver,
> >> >following are some test results:
> >> >
> >> >--------------------------------------------------------------------
> >> >---
> >> >----------------
> >> >---------------------------------------------
> >> >
> >> >Architecture: aarch64
> >> >
> >> >Nics: ixgbe-82599
> >> >
> >> >CPU: Cortex-A72
> >> >
> >> >BURST_SIZE: 1
> >> >
> >> >Order: ./dpdk-test-eventdev -l 0-15 -s 0x2 --vdev=event_sw0 -- --
> >> >test=pipeline_queue --wlcore=4-14 --prod_type_ethdev --stlist=a,a
> >> >
> >> >Flow: one flow, 64bits package, TX rate: 1.4Mpps
> >> >
> >> >
> >> >
> >> >Without this patch:
> >> >
> >> >0.954 mpps avg 0.953 mpps
> >> >
> >> >
> >> >
> >> >With this patch:
> >> >
> >> >0.932 mpps avg 0.930 mpps
> >> >
> >> >--------------------------------------------------------------------
> >> >---
> >> >----------------
> >> >---------------------------------------------
> >> >
> >> >
> >> >
> >> >Based on the result above, there is no significant performance
> >> >degradation with this patch.
> >> >
> >> >This is because the release barrier is only for  “w-
> >>processed_pkts++”.
> >> >It just ensures that the worker core
> >> >
> >> >increments the number events processed after enqueue, and it
> >doesn’t
> >> >affect dequeue/enqueue:
> >> >
> >> >
> >> >
> >> >dequeue -> enqueue -> release barrier -> w->processed_pkts++
> >> >
> >>
> >> Here enqueue already acts as an explicit release barrier.
> >>
> >Please refer above reasons.
> >
> >> >
> >> >
> >> >On the other hand, I infer the reason for the slight decrease in
> >> >measurement performance is that the release barrier
> >> >
> >> >prevent “w->processed_pkts++” before that the event has been
> >processed
> >> >(enqueue). But I think this test result is closer
> >> >
> >> >to the real performance.
> >> >
> >> >And sorry for that we have no octentx2 device, so there is no test
> >> >result on Octeontx2 event device driver. Would you please
> >> >
> >> >help us test this patch on octentx2 when you are convenient. Thanks
> >> >very much.
> >> >
> >>
> >> I will report the performance numbers on Monday.
> >>
> >
> >That’s great, Thanks very much for your help.
> >
> >Best Regards
> >Feifei
> >
> >> >
> >> >
> >> >Best Regards
> >> >
> >> >Feifei
> >>
> >> Regards,
> >> Pavan.
> >>
> >> >
> >> >
> >> >
> >> >>
> >> >
> >> >> >So, due to two reasons above, I'm ambivalent about how we
> >should
> >> >do in
> >> >
> >> >> >the next step.
> >> >
> >> >> >
> >> >
> >> >> >Best Regards
> >> >
> >> >> >Feifei
> >> >
> >> >>
> >> >
> >> >> Regards,
> >> >
> >> >> Pavan.
> >> >
> >> >>
> >> >
> >> >> >
> >> >
> >> >> >> >Fixes: 314bcf58ca8f ("app/eventdev: add pipeline queue
> >worker
> >> >
> >> >> >> >functions")
> >> >
> >> >> >> >Cc:
> >>
> >>mailto:pbhagavatula@marvell.com<mailto:pbhagavatula@marvell.co
> >m>
> >> >
> >> >> >> >Cc: mailto:stable@dpdk.org<mailto:stable@dpdk.org>
> >> >
> >> >> >> >
> >> >
> >> >> >> >Signed-off-by: Phil Yang
> >> ><mailto:phil.yang@arm.com<mailto:phil.yang@arm.com>>
> >> >
> >> >> >> >Signed-off-by: Feifei Wang
> >> ><mailto:feifei.wang2@arm.com<mailto:feifei.wang2@arm.com>>
> >> >
> >> >> >> >Reviewed-by: Ruifeng Wang
> >> ><mailto:ruifeng.wang@arm.com<mailto:ruifeng.wang@arm.com>>
> >> >
> >> >> >> >---
> >> >
> >> >> >> > app/test-eventdev/test_pipeline_queue.c | 64
> >> >
> >> >> >> >+++++++++++++++++++++----
> >> >
> >> >> >> > 1 file changed, 56 insertions(+), 8 deletions(-)
> >> >
> >> >> >> >
> >> >
> >> >> >> >diff --git a/app/test-eventdev/test_pipeline_queue.c
> >b/app/test-
> >> >
> >> >> >> >eventdev/test_pipeline_queue.c index 7bebac34f..0c0ec0ceb
> >> >
> >> >> >100644
> >> >
> >> >> >> >--- a/app/test-eventdev/test_pipeline_queue.c
> >> >
> >> >> >> >+++ b/app/test-eventdev/test_pipeline_queue.c
> >> >
> >> >> >> >@@ -30,7 +30,13 @@
> >> >pipeline_queue_worker_single_stage_tx(void
> >> >
> >> >> >> >*arg)
> >> >
> >> >> >> >
> >> >
> >> >> >> >                    if (ev.sched_type ==
> >> >> >> > RTE_SCHED_TYPE_ATOMIC) {
> >> >
> >> >> >> >                                    pipeline_event_tx(dev,
> >> >> >> > port, &ev);
> >> >
> >> >> >> >-                                  w->processed_pkts++;
> >> >
> >> >> >> >+
> >> >
> >> >> >> >+                                 /* release barrier here
> >> >> >> >+ ensures stored
> >> >operation
> >> >
> >> >> >> >+                                 * of the event completes
> >> >> >> >+ before the number
> >> >of
> >> >
> >> >> >> >+                                 * processed pkts is visible
> >> >> >> >+ to the main core
> >> >
> >> >> >> >+                                 */
> >> >
> >> >> >> >+
> >> >> >> >+ __atomic_fetch_add(&(w->processed_pkts),
> >> >1,
> >> >
> >> >> >> >+
> >> >> >> >+ __ATOMIC_RELEASE);
> >> >
> >> >> >> >                    } else {
> >> >
> >> >> >> >                                    ev.queue_id++;
> >> >
> >> >> >> >                                    pipeline_fwd_event(&ev,
> >> >
> >> >> >> >RTE_SCHED_TYPE_ATOMIC);
> >> >
> >> >> >> >@@ -59,7 +65,13 @@
> >> >
> >> >> >pipeline_queue_worker_single_stage_fwd(void
> >> >
> >> >> >> >*arg)
> >> >
> >> >> >> >                    rte_event_eth_tx_adapter_txq_set(ev.mbuf,
> >> >> >> > 0);
> >> >
> >> >> >> >                    pipeline_fwd_event(&ev,
> >> >> >> > RTE_SCHED_TYPE_ATOMIC);
> >> >
> >> >> >> >                    pipeline_event_enqueue(dev, port, &ev);
> >> >
> >> >> >> >-                  w->processed_pkts++;
> >> >
> >> >> >> >+
> >> >
> >> >> >> >+                 /* release barrier here ensures stored
> >> >> >> >+ operation
> >> >
> >> >> >> >+                 * of the event completes before the number
> >> >> >> >+ of
> >> >
> >> >> >> >+                 * processed pkts is visible to the main core
> >> >
> >> >> >> >+                 */
> >> >
> >> >> >> >+                 __atomic_fetch_add(&(w->processed_pkts), 1,
> >> >
> >> >> >> >+
> >> >> >> >+ __ATOMIC_RELEASE);
> >> >
> >> >> >> >    }
> >> >
> >> >> >> >
> >> >
> >> >> >> >    return 0;
> >> >
> >> >> >> >@@ -84,7 +96,13 @@
> >> >
> >> >> >> >pipeline_queue_worker_single_stage_burst_tx(void *arg)
> >> >
> >> >> >> >                                    if (ev[i].sched_type ==
> >> >
> >> >> >> >RTE_SCHED_TYPE_ATOMIC) {
> >> >
> >> >> >> >
> >> >> >> > pipeline_event_tx(dev, port, &ev[i]);
> >> >
> >> >> >> >                                                    ev[i].op =
> >> >> >> > RTE_EVENT_OP_RELEASE;
> >> >
> >> >> >> >-                                                  w->processed_pkts++;
> >> >
> >> >> >> >+
> >> >
> >> >> >> >+                                                 /* release
> >> >> >> >+ barrier here ensures stored
> >> >
> >> >> >> >operation
> >> >
> >> >> >> >+                                                 * of the
> >> >> >> >+ event completes before the
> >> >
> >> >> >> >number of
> >> >
> >> >> >> >+                                                 * processed
> >> >> >> >+ pkts is visible to the main
> >> >
> >> >> >> >core
> >> >
> >> >> >> >+                                                 */
> >> >
> >> >> >> >+
> >> >> >> >+ __atomic_fetch_add(&(w-
> >> >
> >> >> >> >>processed_pkts), 1,
> >> >
> >> >> >> >+
> >> >__ATOMIC_RELEASE);
> >> >
> >> >> >> >                                    } else {
> >> >
> >> >> >> >
> >> >> >> > ev[i].queue_id++;
> >> >
> >> >> >> >
> >> >> >> > pipeline_fwd_event(&ev[i],
> >> >
> >> >> >> >@@ -121,7 +139,13 @@
> >> >
> >> >> >> >pipeline_queue_worker_single_stage_burst_fwd(void *arg)
> >> >
> >> >> >> >                    }
> >> >
> >> >> >> >
> >> >
> >> >> >> >                    pipeline_event_enqueue_burst(dev, port,
> >> >> >> > ev, nb_rx);
> >> >
> >> >> >> >-                  w->processed_pkts += nb_rx;
> >> >
> >> >> >> >+
> >> >
> >> >> >> >+                 /* release barrier here ensures stored
> >> >> >> >+ operation
> >> >
> >> >> >> >+                 * of the event completes before the number
> >> >> >> >+ of
> >> >
> >> >> >> >+                 * processed pkts is visible to the main core
> >> >
> >> >> >> >+                 */
> >> >
> >> >> >> >+                 __atomic_fetch_add(&(w->processed_pkts),
> >> >> >> >+ nb_rx,
> >> >
> >> >> >> >+
> >> >> >> >+ __ATOMIC_RELEASE);
> >> >
> >> >> >> >    }
> >> >
> >> >> >> >
> >> >
> >> >> >> >    return 0;
> >> >
> >> >> >> >@@ -146,7 +170,13 @@
> >> >
> >> >> >pipeline_queue_worker_multi_stage_tx(void
> >> >
> >> >> >> >*arg)
> >> >
> >> >> >> >
> >> >
> >> >> >> >                    if (ev.queue_id ==
> >> >> >> > tx_queue[ev.mbuf->port]) {
> >> >
> >> >> >> >                                    pipeline_event_tx(dev,
> >> >> >> > port, &ev);
> >> >
> >> >> >> >-                                  w->processed_pkts++;
> >> >
> >> >> >> >+
> >> >
> >> >> >> >+                                 /* release barrier here
> >> >> >> >+ ensures stored
> >> >operation
> >> >
> >> >> >> >+                                 * of the event completes
> >> >> >> >+ before the number
> >> >of
> >> >
> >> >> >> >+                                 * processed pkts is visible
> >> >> >> >+ to the main core
> >> >
> >> >> >> >+                                 */
> >> >
> >> >> >> >+
> >> >> >> >+ __atomic_fetch_add(&(w->processed_pkts),
> >> >1,
> >> >
> >> >> >> >+
> >> >> >> >+ __ATOMIC_RELEASE);
> >> >
> >> >> >> >                                    continue;
> >> >
> >> >> >> >                    }
> >> >
> >> >> >> >
> >> >
> >> >> >> >@@ -180,7 +210,13 @@
> >> >
> >> >> >> >pipeline_queue_worker_multi_stage_fwd(void *arg)
> >> >
> >> >> >> >                                    ev.queue_id =
> >> >> >> > tx_queue[ev.mbuf->port];
> >> >
> >> >> >> >
> >> >> >> > rte_event_eth_tx_adapter_txq_set(ev.mbuf,
> >> >0);
> >> >
> >> >> >> >                                    pipeline_fwd_event(&ev,
> >> >
> >> >> >> >RTE_SCHED_TYPE_ATOMIC);
> >> >
> >> >> >> >-                                  w->processed_pkts++;
> >> >
> >> >> >> >+
> >> >
> >> >> >> >+                                 /* release barrier here
> >> >> >> >+ ensures stored
> >> >operation
> >> >
> >> >> >> >+                                 * of the event completes
> >> >> >> >+ before the number
> >> >of
> >> >
> >> >> >> >+                                 * processed pkts is visible
> >> >> >> >+ to the main core
> >> >
> >> >> >> >+                                 */
> >> >
> >> >> >> >+
> >> >> >> >+ __atomic_fetch_add(&(w->processed_pkts),
> >> >1,
> >> >
> >> >> >> >+
> >> >> >> >+ __ATOMIC_RELEASE);
> >> >
> >> >> >> >                    } else {
> >> >
> >> >> >> >                                    ev.queue_id++;
> >> >
> >> >> >> >                                    pipeline_fwd_event(&ev,
> >> >
> >> >> >> >sched_type_list[cq_id]);
> >> >
> >> >> >> >@@ -214,7 +250,13 @@
> >> >
> >> >> >> >pipeline_queue_worker_multi_stage_burst_tx(void *arg)
> >> >
> >> >> >> >                                    if (ev[i].queue_id ==
> >> >> >> > tx_queue[ev[i].mbuf-
> >> >
> >> >> >> >>port]) {
> >> >
> >> >> >> >
> >> >> >> > pipeline_event_tx(dev, port, &ev[i]);
> >> >
> >> >> >> >                                                    ev[i].op =
> >> >> >> > RTE_EVENT_OP_RELEASE;
> >> >
> >> >> >> >-                                                  w->processed_pkts++;
> >> >
> >> >> >> >+
> >> >
> >> >> >> >+                                                 /* release
> >> >> >> >+ barrier here ensures stored
> >> >
> >> >> >> >operation
> >> >
> >> >> >> >+                                                 * of the
> >> >> >> >+ event completes before the
> >> >
> >> >> >> >number of
> >> >
> >> >> >> >+                                                 * processed
> >> >> >> >+ pkts is visible to the main
> >> >
> >> >> >> >core
> >> >
> >> >> >> >+                                                 */
> >> >
> >> >> >> >+
> >> >> >> >+ __atomic_fetch_add(&(w-
> >> >
> >> >> >> >>processed_pkts), 1,
> >> >
> >> >> >> >+
> >> >__ATOMIC_RELEASE);
> >> >
> >> >> >> >                                                    continue;
> >> >
> >> >> >> >                                    }
> >> >
> >> >> >> >
> >> >
> >> >> >> >@@ -254,7 +296,13 @@
> >> >
> >> >> >> >pipeline_queue_worker_multi_stage_burst_fwd(void *arg)
> >> >
> >> >> >> >
> >> >
> >> >> >> >    rte_event_eth_tx_adapter_txq_set(ev[i].mbuf, 0);
> >> >
> >> >> >> >
> >> >> >> > pipeline_fwd_event(&ev[i],
> >> >
> >> >> >> >
> >> >
> >> >> >> >    RTE_SCHED_TYPE_ATOMIC);
> >> >
> >> >> >> >-                                                  w->processed_pkts++;
> >> >
> >> >> >> >+
> >> >
> >> >> >> >+                                                 /* release
> >> >> >> >+ barrier here ensures stored
> >> >
> >> >> >> >operation
> >> >
> >> >> >> >+                                                 * of the
> >> >> >> >+ event completes before the
> >> >
> >> >> >> >number of
> >> >
> >> >> >> >+                                                 * processed
> >> >> >> >+ pkts is visible to the main
> >> >
> >> >> >> >core
> >> >
> >> >> >> >+                                                 */
> >> >
> >> >> >> >+
> >> >> >> >+ __atomic_fetch_add(&(w-
> >> >
> >> >> >> >>processed_pkts), 1,
> >> >
> >> >> >> >+
> >> >__ATOMIC_RELEASE);
> >> >
> >> >> >> >                                    } else {
> >> >
> >> >> >> >
> >> >> >> > ev[i].queue_id++;
> >> >
> >> >> >> >
> >> >> >> > pipeline_fwd_event(&ev[i],
> >> >
> >> >> >> >--
> >> >
> >> >> >> >2.17.1
> >> >


  reply	other threads:[~2021-01-11  1:57 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-22 10:33 [dpdk-stable] " Pavan Nikhilesh Bhagavatula
2021-01-05  7:39 ` Feifei Wang
2021-01-05  9:29   ` Pavan Nikhilesh Bhagavatula
2021-01-08  7:12     ` [dpdk-stable] 回复: " Feifei Wang
2021-01-08  9:12       ` [dpdk-stable] " Pavan Nikhilesh Bhagavatula
2021-01-08 10:44         ` [dpdk-stable] 回复: " Feifei Wang
2021-01-08 10:58           ` [dpdk-stable] " Pavan Nikhilesh Bhagavatula
2021-01-11  1:57             ` Feifei Wang [this message]
2021-01-12  8:29             ` Pavan Nikhilesh Bhagavatula
2021-01-14  6:07               ` [dpdk-stable] 回复: " Feifei Wang
     [not found] <20201222050728.41000-1-feifei.wang2@arm.com>
2020-12-22  5:07 ` [dpdk-stable] " Feifei Wang
     [not found] <20201222032237.36046-1-feifei.wang2@arm.com>
2020-12-22  3:22 ` Feifei Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=DBBPR08MB44119E6B8B54CF36183446B4C8AB0@DBBPR08MB4411.eurprd08.prod.outlook.com \
    --to=feifei.wang2@arm.com \
    --cc=Honnappa.Nagarahalli@arm.com \
    --cc=Ruifeng.Wang@arm.com \
    --cc=dev@dpdk.org \
    --cc=harry.van.haaren@intel.com \
    --cc=jerinj@marvell.com \
    --cc=nd@arm.com \
    --cc=pbhagavatula@marvell.com \
    --cc=stable@dpdk.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).