DPDK patches and discussions
* [PATCH] net/gve: Change ERR to DEBUG to prevent flooding of logs for Tx-Dqo.
@ 2024-02-19  2:44 Rushil Gupta
  2024-02-19  9:33 ` Ferruh Yigit
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Rushil Gupta @ 2024-02-19  2:44 UTC
  To: junfeng.guo, jeroendb, joshwash, ferruh.yigit; +Cc: dev, stable, Rushil Gupta

This was causing failure for testpmd runs (for queues >=15)
presumably due to flooding of logs due to descriptor ring being
overwritten.

Fixes: a01854 ("net/gve: fix dqo bug for chained descriptors")
Cc: stable@dpdk.org

Signed-off-by: Rushil Gupta <rushilg@google.com>
Reviewed-by: Joshua Washington <joshwash@google.com>
---
 drivers/net/gve/gve_tx_dqo.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/gve/gve_tx_dqo.c b/drivers/net/gve/gve_tx_dqo.c
index 1a8eb96ea9..30a1455b20 100644
--- a/drivers/net/gve/gve_tx_dqo.c
+++ b/drivers/net/gve/gve_tx_dqo.c
@@ -116,7 +116,7 @@ gve_tx_burst_dqo(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 		first_sw_id = sw_id;
 		do {
 			if (sw_ring[sw_id] != NULL)
-				PMD_DRV_LOG(ERR, "Overwriting an entry in sw_ring");
+				PMD_DRV_LOG(DEBUG, "Overwriting an entry in sw_ring");
 
 			txd = &txr[tx_id];
 			sw_ring[sw_id] = tx_pkt;
-- 
2.44.0.rc0.258.g7320e95886-goog



* Re: [PATCH] net/gve: Change ERR to DEBUG to prevent flooding of logs for Tx-Dqo.
  2024-02-19  2:44 [PATCH] net/gve: Change ERR to DEBUG to prevent flooding of logs for Tx-Dqo Rushil Gupta
@ 2024-02-19  9:33 ` Ferruh Yigit
  2024-02-20  9:40   ` Rushil Gupta
  2024-02-19 10:14 ` Ferruh Yigit
  2024-02-19 17:26 ` Stephen Hemminger
  2 siblings, 1 reply; 6+ messages in thread
From: Ferruh Yigit @ 2024-02-19  9:33 UTC
  To: Rushil Gupta, junfeng.guo, jeroendb, joshwash; +Cc: dev, stable

On 2/19/2024 2:44 AM, Rushil Gupta wrote:
> This was causing failure for testpmd runs (for queues >=15)
> presumably due to flooding of logs due to descriptor ring being
> overwritten.
> 
> Fixes: a01854 ("net/gve: fix dqo bug for chained descriptors")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Rushil Gupta <rushilg@google.com>
> Reviewed-by: Joshua Washington <joshwash@google.com>
> ---
>  drivers/net/gve/gve_tx_dqo.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/gve/gve_tx_dqo.c b/drivers/net/gve/gve_tx_dqo.c
> index 1a8eb96ea9..30a1455b20 100644
> --- a/drivers/net/gve/gve_tx_dqo.c
> +++ b/drivers/net/gve/gve_tx_dqo.c
> @@ -116,7 +116,7 @@ gve_tx_burst_dqo(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
>  		first_sw_id = sw_id;
>  		do {
>  			if (sw_ring[sw_id] != NULL)
> -				PMD_DRV_LOG(ERR, "Overwriting an entry in sw_ring");
> +				PMD_DRV_LOG(DEBUG, "Overwriting an entry in sw_ring");
>  
>  			txd = &txr[tx_id];
>  			sw_ring[sw_id] = tx_pkt;

Hi Rushil,

I will continue with this patch, BUT
logging in the datapath has potential problems like this one, and it may
also have a performance impact even when the log is not printed, because
of the additional check in the log function.

For the datapath, prefer one of:
- Add log macros controlled by the RTE_ETHDEV_DEBUG_RX & RTE_ETHDEV_DEBUG_TX
flags
- Or use the RTE_LOG_DP() macro, which is compiled out when the log's level
is more verbose than the build-time datapath log level
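
A minimal sketch of the compile-time gating idea behind the second option,
written in plain C without DPDK headers so it stands alone. Every name
prefixed SK_ or sk_ is hypothetical and chosen only for illustration; the
level values are arbitrary apart from following the "smaller is more severe"
ordering. It is not the DPDK implementation, only the same pattern: the level
check compares two compile-time constants, so the compiler is free to drop
the call from the Tx loop altogether.

```c
#include <stdio.h>

/* Hypothetical stand-ins for this sketch; these are not DPDK definitions.
 * A smaller number means a more severe message. */
enum { SK_LOG_ERR = 4, SK_LOG_INFO = 7, SK_LOG_DEBUG = 8 };

/* Build-time ceiling for datapath logs (assumed name, overridable with -D). */
#ifndef SK_DP_LOG_LEVEL
#define SK_DP_LOG_LEVEL SK_LOG_INFO
#endif

/* Count of messages actually emitted; kept only so the sketch is checkable. */
static unsigned int sk_emitted;

static int sk_emit(const char *msg)
{
	sk_emitted++;
	return fprintf(stderr, "%s\n", msg);
}

/* Both operands of the comparison are compile-time constants, so the branch
 * that calls sk_emit() is dead code whenever level exceeds the ceiling. */
#define SK_LOG_DP(level, msg) \
	((void)(((level) <= SK_DP_LOG_LEVEL) ? sk_emit(msg) : 0))

/* Toy Tx loop: returns how many slots were already occupied. */
static unsigned int sk_tx_burst(void *const *sw_ring, unsigned int n)
{
	unsigned int hits = 0;

	for (unsigned int i = 0; i < n; i++) {
		if (sw_ring[i] != NULL) {
			hits++;
			SK_LOG_DP(SK_LOG_DEBUG, "Overwriting an entry in sw_ring");
		}
	}
	return hits;
}
```

With the default ceiling above, the DEBUG message is never emitted; building
with -DSK_DP_LOG_LEVEL=8 would turn it back on for a debugging session.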


Another perspective on the logs: when an end user observes this flood
of logs, what can she do?
Can she change some driver arguments or environment variables to fix the
issue? If not, what is the point of the log?
If this is a condition that can occur dynamically based on received
traffic and the user has no control over it, maybe the driver should
handle the error without a log?
And if this is a log for the driver developer, perhaps it should be an
assert or similar that is disabled in release builds?
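
The assert suggestion could look roughly like the sketch below. It is plain C
with hypothetical names (SK_ENABLE_ASSERT, SK_ASSERT, sk_store) rather than
driver code; DPDK provides RTE_ASSERT for the same purpose, which this sketch
only imitates. The check costs nothing unless the build opts in.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical build flag and macro for this sketch (not DPDK names).
 * Define SK_ENABLE_ASSERT in developer builds only. */
#ifdef SK_ENABLE_ASSERT
#define SK_ASSERT(cond) \
	do { \
		if (!(cond)) { \
			fprintf(stderr, "assert failed: %s (%s:%d)\n", \
				#cond, __FILE__, __LINE__); \
			abort(); \
		} \
	} while (0)
#else
#define SK_ASSERT(cond) do { } while (0)
#endif

/* Store pkt in slot id. A developer build traps if the slot is still
 * occupied; a release build performs the store with no check at all. */
static void sk_store(void **sw_ring, unsigned int id, void *pkt)
{
	SK_ASSERT(sw_ring[id] == NULL);
	sw_ring[id] = pkt;
}
```

In a release build the overwrite goes through silently, so this only suits
conditions that indicate a driver bug rather than something traffic can
trigger.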


* Re: [PATCH] net/gve: Change ERR to DEBUG to prevent flooding of logs for Tx-Dqo.
  2024-02-19  2:44 [PATCH] net/gve: Change ERR to DEBUG to prevent flooding of logs for Tx-Dqo Rushil Gupta
  2024-02-19  9:33 ` Ferruh Yigit
@ 2024-02-19 10:14 ` Ferruh Yigit
  2024-02-19 17:26 ` Stephen Hemminger
  2 siblings, 0 replies; 6+ messages in thread
From: Ferruh Yigit @ 2024-02-19 10:14 UTC
  To: Rushil Gupta, junfeng.guo, jeroendb, joshwash; +Cc: dev, stable

On 2/19/2024 2:44 AM, Rushil Gupta wrote:
> This was causing failure for testpmd runs (for queues >=15)
> presumably due to flooding of logs due to descriptor ring being
> overwritten.
> 
> Fixes: a01854 ("net/gve: fix dqo bug for chained descriptors")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Rushil Gupta <rushilg@google.com>
> Reviewed-by: Joshua Washington <joshwash@google.com>
>

Squashed into relevant commit in next-net, thanks.


* Re: [PATCH] net/gve: Change ERR to DEBUG to prevent flooding of logs for Tx-Dqo.
  2024-02-19  2:44 [PATCH] net/gve: Change ERR to DEBUG to prevent flooding of logs for Tx-Dqo Rushil Gupta
  2024-02-19  9:33 ` Ferruh Yigit
  2024-02-19 10:14 ` Ferruh Yigit
@ 2024-02-19 17:26 ` Stephen Hemminger
  2024-02-20  3:38   ` Rushil Gupta
  2 siblings, 1 reply; 6+ messages in thread
From: Stephen Hemminger @ 2024-02-19 17:26 UTC
  To: Rushil Gupta; +Cc: junfeng.guo, jeroendb, joshwash, ferruh.yigit, dev, stable

On Mon, 19 Feb 2024 02:44:35 +0000
Rushil Gupta <rushilg@google.com> wrote:

> This was causing failure for testpmd runs (for queues >=15)
> presumably due to flooding of logs due to descriptor ring being
> overwritten.
> 
> Fixes: a01854 ("net/gve: fix dqo bug for chained descriptors")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Rushil Gupta <rushilg@google.com>
> Reviewed-by: Joshua Washington <joshwash@google.com>

Isn't this still an error? What about the overwritten descriptor: is there an mbuf leak?
Maybe a statistic would be better than a message?
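
A rough sketch of the statistic idea, in plain C. The struct and its field
names (sk_txq, overwrite_count, and so on) are hypothetical and are not taken
from the gve driver; the point is only that the datapath increments a
per-queue counter instead of formatting a message.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-queue state; field names are illustrative only. */
struct sk_txq {
	void **sw_ring;           /* software ring of in-flight packets */
	uint16_t nb_desc;         /* number of slots in sw_ring */
	uint16_t tail;            /* next slot to fill */
	uint64_t overwrite_count; /* statistic kept in place of a log line */
};

/* Place pkt in the next slot, counting the event if the slot was still
 * occupied. No logging happens on this path. */
static void sk_enqueue(struct sk_txq *q, void *pkt)
{
	if (q->sw_ring[q->tail] != NULL)
		q->overwrite_count++;

	q->sw_ring[q->tail] = pkt;
	q->tail = (uint16_t)((q->tail + 1) % q->nb_desc);
}
```

The counter could then be reported through whatever statistics interface the
driver already exposes. Whether the overwritten entry also needs to be freed
is the open mbuf-leak question above, and this sketch does not settle it.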


* Re: [PATCH] net/gve: Change ERR to DEBUG to prevent flooding of logs for Tx-Dqo.
  2024-02-19 17:26 ` Stephen Hemminger
@ 2024-02-20  3:38   ` Rushil Gupta
  0 siblings, 0 replies; 6+ messages in thread
From: Rushil Gupta @ 2024-02-20  3:38 UTC
  To: Stephen Hemminger
  Cc: Guo, Junfeng, Jeroen de Borst, Joshua Washington, Ferruh Yigit,
	dev, stable


I agree.
This bug had been manifesting for a while before I partially fixed it in
"[PATCH] net/gve: fix dqo bug for chained descriptors".
However, for higher queue counts (> 13), we still see this behavior. I'll
add a statistic.


* Re: [PATCH] net/gve: Change ERR to DEBUG to prevent flooding of logs for Tx-Dqo.
  2024-02-19  9:33 ` Ferruh Yigit
@ 2024-02-20  9:40   ` Rushil Gupta
  0 siblings, 0 replies; 6+ messages in thread
From: Rushil Gupta @ 2024-02-20  9:40 UTC
  To: Ferruh Yigit
  Cc: Guo, Junfeng, Jeroen de Borst, Joshua Washington, dev, stable


These are very useful insights, Ferruh. I think RTE_LOG_DP() would have
been more suitable here.
Also, the DEBUG statement combined with a statistic will be more useful
than ERR from a developer's perspective if they see a potential memory
leak in their program.

