* Re: [PATCH] app: fix silent enqueue fail in test_mbuf test_refcnt_iter
2023-08-10 16:09 ` jhascoet
@ 2023-08-18 5:16 ` Hascoet Julien
2023-08-18 6:25 ` David Marchand
2023-08-22 6:34 ` jhascoet
2023-08-24 7:37 ` [PATCH] app: enhance error check if " jhascoet
2 siblings, 1 reply; 14+ messages in thread
From: Hascoet Julien @ 2023-08-18 5:16 UTC (permalink / raw)
To: david.marchand; +Cc: dev
[-- Attachment #1: Type: text/plain, Size: 2020 bytes --]
Any news on that ? Thanks
Julien
On Thu, Aug 10, 2023 at 6:09 PM jhascoet <ju.hascoet@gmail.com> wrote:
> From: Julien Hascoet <ju.hascoet@gmail.com>
>
> In case of ring full state, we retry the enqueue
> operation in order to avoid mbuf loss.
>
> Fixes: af75078fece ("first public release")
>
> Signed-off-by: Julien Hascoet <ju.hascoet@gmail.com>
> ---
> app/test/test_mbuf.c | 15 ++++++++++++---
> 1 file changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/app/test/test_mbuf.c b/app/test/test_mbuf.c
> index efac01806b..ad18bf6378 100644
> --- a/app/test/test_mbuf.c
> +++ b/app/test/test_mbuf.c
> @@ -1033,12 +1033,21 @@ test_refcnt_iter(unsigned int lcore, unsigned int
> iter,
> tref += ref;
> if ((ref & 1) != 0) {
> rte_pktmbuf_refcnt_update(m, ref);
> - while (ref-- != 0)
> - rte_ring_enqueue(refcnt_mbuf_ring, m);
> + while (ref-- != 0) {
> + /* retry in case of failure */
> + while (rte_ring_enqueue(refcnt_mbuf_ring,
> m) != 0) {
> + /* let others consume */
> + rte_pause();
> + }
> + }
> } else {
> while (ref-- != 0) {
> rte_pktmbuf_refcnt_update(m, 1);
> - rte_ring_enqueue(refcnt_mbuf_ring, m);
> + /* retry in case of failure */
> + while (rte_ring_enqueue(refcnt_mbuf_ring,
> m) != 0) {
> + /* let others consume */
> + rte_pause();
> + }
> }
> }
> rte_pktmbuf_free(m);
> --
> 2.34.1
>
>
[-- Attachment #2: Type: text/html, Size: 2895 bytes --]
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH] app: fix silent enqueue fail in test_mbuf test_refcnt_iter
2023-08-10 16:09 ` jhascoet
2023-08-18 5:16 ` Hascoet Julien
@ 2023-08-22 6:34 ` jhascoet
2023-08-22 8:34 ` Olivier Matz
2023-08-30 16:16 ` Thomas Monjalon
2023-08-24 7:37 ` [PATCH] app: enhance error check if " jhascoet
2 siblings, 2 replies; 14+ messages in thread
From: jhascoet @ 2023-08-22 6:34 UTC (permalink / raw)
To: olivier.matz; +Cc: dev, maintainers, Julien Hascoet
From: Julien Hascoet <ju.hascoet@gmail.com>
In case of ring full state, we retry the enqueue
operation in order to avoid mbuf loss.
Fixes: af75078fece ("first public release")
Signed-off-by: Julien Hascoet <ju.hascoet@gmail.com>
---
app/test/test_mbuf.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/app/test/test_mbuf.c b/app/test/test_mbuf.c
index efac01806b..ad18bf6378 100644
--- a/app/test/test_mbuf.c
+++ b/app/test/test_mbuf.c
@@ -1033,12 +1033,21 @@ test_refcnt_iter(unsigned int lcore, unsigned int iter,
tref += ref;
if ((ref & 1) != 0) {
rte_pktmbuf_refcnt_update(m, ref);
- while (ref-- != 0)
- rte_ring_enqueue(refcnt_mbuf_ring, m);
+ while (ref-- != 0) {
+ /* retry in case of failure */
+ while (rte_ring_enqueue(refcnt_mbuf_ring, m) != 0) {
+ /* let others consume */
+ rte_pause();
+ }
+ }
} else {
while (ref-- != 0) {
rte_pktmbuf_refcnt_update(m, 1);
- rte_ring_enqueue(refcnt_mbuf_ring, m);
+ /* retry in case of failure */
+ while (rte_ring_enqueue(refcnt_mbuf_ring, m) != 0) {
+ /* let others consume */
+ rte_pause();
+ }
}
}
rte_pktmbuf_free(m);
--
2.34.1
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] app: fix silent enqueue fail in test_mbuf test_refcnt_iter
2023-08-22 6:34 ` jhascoet
@ 2023-08-22 8:34 ` Olivier Matz
2023-08-24 6:51 ` Hascoet Julien
2023-08-30 16:16 ` Thomas Monjalon
1 sibling, 1 reply; 14+ messages in thread
From: Olivier Matz @ 2023-08-22 8:34 UTC (permalink / raw)
To: jhascoet; +Cc: dev, David Marchand
Hello Julien,
On Tue, Aug 22, 2023 at 08:34:53AM +0200, jhascoet wrote:
> From: Julien Hascoet <ju.hascoet@gmail.com>
>
> In case of ring full state, we retry the enqueue
> operation in order to avoid mbuf loss.
>
> Fixes: af75078fece ("first public release")
>
> Signed-off-by: Julien Hascoet <ju.hascoet@gmail.com>
> ---
> app/test/test_mbuf.c | 15 ++++++++++++---
> 1 file changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/app/test/test_mbuf.c b/app/test/test_mbuf.c
> index efac01806b..ad18bf6378 100644
> --- a/app/test/test_mbuf.c
> +++ b/app/test/test_mbuf.c
> @@ -1033,12 +1033,21 @@ test_refcnt_iter(unsigned int lcore, unsigned int iter,
> tref += ref;
> if ((ref & 1) != 0) {
> rte_pktmbuf_refcnt_update(m, ref);
> - while (ref-- != 0)
> - rte_ring_enqueue(refcnt_mbuf_ring, m);
> + while (ref-- != 0) {
> + /* retry in case of failure */
> + while (rte_ring_enqueue(refcnt_mbuf_ring, m) != 0) {
> + /* let others consume */
> + rte_pause();
> + }
> + }
> } else {
> while (ref-- != 0) {
> rte_pktmbuf_refcnt_update(m, 1);
> - rte_ring_enqueue(refcnt_mbuf_ring, m);
> + /* retry in case of failure */
> + while (rte_ring_enqueue(refcnt_mbuf_ring, m) != 0) {
> + /* let others consume */
> + rte_pause();
> + }
> }
> }
> rte_pktmbuf_free(m);
> --
> 2.34.1
>
Can you give some more details about how to reproduce the issue?
From what I see, the code does the following:
main core:
create a ring with at least (REFCNT_MBUF_NUM * REFCNT_MAX_REF) entries
create an mbuf pool with REFCNT_MBUF_NUM entries
start worker cores
do REFCNT_MAX_ITER times:
for each mbuf of the pool (REFCNT_MBUF_NUM entries):
let r be a random number between 1 and REFCNT_MAX_REF
increase mbuf references by r, and enqueue r times in the ring
wait that the ring is empty (since worker cores are dequeuing mbufs)
stop worker cores
worker cores:
dequeue packets from the ring and free them until asked to stop
I may be mistaking but I don't see how the number of mbufs in ring could
exceed REFCNT_MBUF_NUM * REFCNT_MAX_REF.
Regards,
Olivier
Note: removing CC maintainers@dpdk.org
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] app: fix silent enqueue fail in test_mbuf test_refcnt_iter
2023-08-22 8:34 ` Olivier Matz
@ 2023-08-24 6:51 ` Hascoet Julien
0 siblings, 0 replies; 14+ messages in thread
From: Hascoet Julien @ 2023-08-24 6:51 UTC (permalink / raw)
To: Olivier Matz; +Cc: dev, David Marchand
[-- Attachment #1: Type: text/plain, Size: 3326 bytes --]
Hello Oliver,
thanks, your response helped a lot, I managed to find the root cause of the
instability which is on our side.
It was due to other internal developments.
I'll still add an error check on the enqueue ops to catch eventual problems
earlier, if that suits you.
Best regards,
Julien
On Tue, Aug 22, 2023 at 10:34 AM Olivier Matz <olivier.matz@6wind.com>
wrote:
> Hello Julien,
>
> On Tue, Aug 22, 2023 at 08:34:53AM +0200, jhascoet wrote:
> > From: Julien Hascoet <ju.hascoet@gmail.com>
> >
> > In case of ring full state, we retry the enqueue
> > operation in order to avoid mbuf loss.
> >
> > Fixes: af75078fece ("first public release")
> >
> > Signed-off-by: Julien Hascoet <ju.hascoet@gmail.com>
> > ---
> > app/test/test_mbuf.c | 15 ++++++++++++---
> > 1 file changed, 12 insertions(+), 3 deletions(-)
> >
> > diff --git a/app/test/test_mbuf.c b/app/test/test_mbuf.c
> > index efac01806b..ad18bf6378 100644
> > --- a/app/test/test_mbuf.c
> > +++ b/app/test/test_mbuf.c
> > @@ -1033,12 +1033,21 @@ test_refcnt_iter(unsigned int lcore, unsigned
> int iter,
> > tref += ref;
> > if ((ref & 1) != 0) {
> > rte_pktmbuf_refcnt_update(m, ref);
> > - while (ref-- != 0)
> > - rte_ring_enqueue(refcnt_mbuf_ring, m);
> > + while (ref-- != 0) {
> > + /* retry in case of failure */
> > + while (rte_ring_enqueue(refcnt_mbuf_ring,
> m) != 0) {
> > + /* let others consume */
> > + rte_pause();
> > + }
> > + }
> > } else {
> > while (ref-- != 0) {
> > rte_pktmbuf_refcnt_update(m, 1);
> > - rte_ring_enqueue(refcnt_mbuf_ring, m);
> > + /* retry in case of failure */
> > + while (rte_ring_enqueue(refcnt_mbuf_ring,
> m) != 0) {
> > + /* let others consume */
> > + rte_pause();
> > + }
> > }
> > }
> > rte_pktmbuf_free(m);
> > --
> > 2.34.1
> >
>
> Can you give some more details about how to reproduce the issue?
>
> From what I see, the code does the following:
>
> main core:
> create a ring with at least (REFCNT_MBUF_NUM * REFCNT_MAX_REF) entries
> create an mbuf pool with REFCNT_MBUF_NUM entries
> start worker cores
> do REFCNT_MAX_ITER times:
> for each mbuf of the pool (REFCNT_MBUF_NUM entries):
> let r be a random number between 1 and REFCNT_MAX_REF
> increase mbuf references by r, and enqueue r times in the ring
> wait that the ring is empty (since worker cores are dequeuing mbufs)
> stop worker cores
>
> worker cores:
> dequeue packets from the ring and free them until asked to stop
>
>
> I may be mistaking but I don't see how the number of mbufs in ring could
> exceed REFCNT_MBUF_NUM * REFCNT_MAX_REF.
>
> Regards,
> Olivier
>
>
> Note: removing CC maintainers@dpdk.org
>
[-- Attachment #2: Type: text/html, Size: 4534 bytes --]
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] app: fix silent enqueue fail in test_mbuf test_refcnt_iter
2023-08-22 6:34 ` jhascoet
2023-08-22 8:34 ` Olivier Matz
@ 2023-08-30 16:16 ` Thomas Monjalon
1 sibling, 0 replies; 14+ messages in thread
From: Thomas Monjalon @ 2023-08-30 16:16 UTC (permalink / raw)
To: Julien Hascoet; +Cc: olivier.matz, dev
Hello,
22/08/2023 08:34, jhascoet:
> From: Julien Hascoet <ju.hascoet@gmail.com>
>
> In case of ring full state, we retry the enqueue
> operation in order to avoid mbuf loss.
>
> Fixes: af75078fece ("first public release")
>
> Signed-off-by: Julien Hascoet <ju.hascoet@gmail.com>
Please could you give more information about when this is happening?
I'm surprised such thing was not a problem so far.
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH] app: enhance error check if enqueue fail in test_mbuf test_refcnt_iter
2023-08-10 16:09 ` jhascoet
2023-08-18 5:16 ` Hascoet Julien
2023-08-22 6:34 ` jhascoet
@ 2023-08-24 7:37 ` jhascoet
2024-10-04 21:18 ` Stephen Hemminger
2 siblings, 1 reply; 14+ messages in thread
From: jhascoet @ 2023-08-24 7:37 UTC (permalink / raw)
To: olivier.matz; +Cc: dev, Julien Hascoet
From: Julien Hascoet <ju.hascoet@gmail.com>
As the ring is big enough by construction, the ring
enqueue ops cannot fail; therefore, we check and panic
on it as soon as it is detected.
Signed-off-by: Julien Hascoet <ju.hascoet@gmail.com>
---
app/test/test_mbuf.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/app/test/test_mbuf.c b/app/test/test_mbuf.c
index d7393df7eb..c94dd3749e 100644
--- a/app/test/test_mbuf.c
+++ b/app/test/test_mbuf.c
@@ -1033,12 +1033,15 @@ test_refcnt_iter(unsigned int lcore, unsigned int iter,
tref += ref;
if ((ref & 1) != 0) {
rte_pktmbuf_refcnt_update(m, ref);
- while (ref-- != 0)
- rte_ring_enqueue(refcnt_mbuf_ring, m);
+ while (ref-- != 0) {
+ if (rte_ring_enqueue(refcnt_mbuf_ring, m) != 0)
+ goto fail_enqueue;
+ }
} else {
while (ref-- != 0) {
rte_pktmbuf_refcnt_update(m, 1);
- rte_ring_enqueue(refcnt_mbuf_ring, m);
+ if (rte_ring_enqueue(refcnt_mbuf_ring, m) != 0)
+ goto fail_enqueue;
}
}
rte_pktmbuf_free(m);
@@ -1066,6 +1069,10 @@ test_refcnt_iter(unsigned int lcore, unsigned int iter,
rte_panic("(lcore=%u, iter=%u): after %us only "
"%u of %u mbufs left free\n", lcore, iter, wn, i, n);
+
+fail_enqueue:
+ rte_panic("(lcore=%u, iter=%u): enqueue mbuf %u cannot "
+ "fail since ring is big enough\n", lcore, iter, i);
}
static int
--
2.34.1
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH] app: enhance error check if enqueue fail in test_mbuf test_refcnt_iter
2023-08-24 7:37 ` [PATCH] app: enhance error check if " jhascoet
@ 2024-10-04 21:18 ` Stephen Hemminger
0 siblings, 0 replies; 14+ messages in thread
From: Stephen Hemminger @ 2024-10-04 21:18 UTC (permalink / raw)
To: jhascoet; +Cc: olivier.matz, dev
On Thu, 24 Aug 2023 09:37:18 +0200
jhascoet <ju.hascoet@gmail.com> wrote:
> From: Julien Hascoet <ju.hascoet@gmail.com>
>
> As the ring is big enough by construction, the ring
> enqueue ops cannot fail; therefore, we check and panic
> on it as soon as it is detected.
>
> Signed-off-by: Julien Hascoet <ju.hascoet@gmail.com>
> ---
As per earlier patch in the email thread.
This is a test for something that should never happen.
Adding that test code is good and bad. Good, tests should never assume
code works. Bad, it creates more paths and complexity in the test code.
If the test was broken, and enqueue fails, it would fail in later checks
because of unfreed mbuf's.
Bottom line: lets skip this patch, it only happened to the submitter
because of other problems in their environment.
^ permalink raw reply [flat|nested] 14+ messages in thread