In bnxt_init_one_rx_ring(), reset this variable internal to the driver ring to 0, so that there is no mismatch with actual value in HW on traffic resumption. Fixes: 03c8f2fe111c ("net/bnxt: detect bad opaque in Rx completion") Cc: stable@dpdk.org Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> --- drivers/net/bnxt/bnxt_rxr.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/bnxt/bnxt_rxr.c b/drivers/net/bnxt/bnxt_rxr.c index aea71703d1..73fbdd17d1 100644 --- a/drivers/net/bnxt/bnxt_rxr.c +++ b/drivers/net/bnxt/bnxt_rxr.c @@ -1379,6 +1379,9 @@ int bnxt_init_one_rx_ring(struct bnxt_rx_queue *rxq) } PMD_DRV_LOG(DEBUG, "TPA alloc Done!\n"); + /* Explicitly reset this driver internal tracker on a ring init */ + rxr->rx_next_cons = 0; + return 0; } -- 2.28.0.497.g54e85e7
[-- Attachment #1: Type: text/plain, Size: 1162 bytes --] On Mon, Aug 9, 2021 at 11:11 PM Somnath Kotur <somnath.kotur@broadcom.com> wrote: > In bnxt_init_one_rx_ring(), reset this variable internal to the driver > ring to 0, so that there is no mismatch with actual value in HW on > traffic resumption. > > Fixes: 03c8f2fe111c ("net/bnxt: detect bad opaque in Rx completion") > Cc: stable@dpdk.org > > Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> > Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> > Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> > Patch applied to dpdk-next-net-brcm for-next-net branch. --- > drivers/net/bnxt/bnxt_rxr.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/net/bnxt/bnxt_rxr.c b/drivers/net/bnxt/bnxt_rxr.c > index aea71703d1..73fbdd17d1 100644 > --- a/drivers/net/bnxt/bnxt_rxr.c > +++ b/drivers/net/bnxt/bnxt_rxr.c > @@ -1379,6 +1379,9 @@ int bnxt_init_one_rx_ring(struct bnxt_rx_queue *rxq) > } > PMD_DRV_LOG(DEBUG, "TPA alloc Done!\n"); > > + /* Explicitly reset this driver internal tracker on a ring init */ > + rxr->rx_next_cons = 0; > + > return 0; > } > > -- > 2.28.0.497.g54e85e7 > >
On 8/10/2021 7:07 AM, Somnath Kotur wrote: > In bnxt_init_one_rx_ring(), reset this variable internal to the driver > ring to 0, so that there is no mismatch with actual value in HW on > traffic resumption. > Hi Somnath, What is the impact of the patch, what are you fixing? If the 'rx_next_cons' (next consumer index) is not matching with HW value, what happens, does the Rx fails completely and unrecoverable? Or does miss one or more packet? And when/how this happens, since driver is not new I assume this doesn't happen in first ring init. Does this happen after uninit and init again? Providing this information helps clarifying scope of the fix. Can you please send a new version with updated commit log? Thanks, ferruh > Fixes: 03c8f2fe111c ("net/bnxt: detect bad opaque in Rx completion") > Cc: stable@dpdk.org > > Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> > Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> > Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> > --- > drivers/net/bnxt/bnxt_rxr.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/net/bnxt/bnxt_rxr.c b/drivers/net/bnxt/bnxt_rxr.c > index aea71703d1..73fbdd17d1 100644 > --- a/drivers/net/bnxt/bnxt_rxr.c > +++ b/drivers/net/bnxt/bnxt_rxr.c > @@ -1379,6 +1379,9 @@ int bnxt_init_one_rx_ring(struct bnxt_rx_queue *rxq) > } > PMD_DRV_LOG(DEBUG, "TPA alloc Done!\n"); > > + /* Explicitly reset this driver internal tracker on a ring init */ > + rxr->rx_next_cons = 0; > + > return 0; > } > >
In bnxt_init_one_rx_ring(), reset this variable internal to the driver ring to 0, so that there is no scope for any false mismatch errors being flagged on traffic resumption. Typically, this would show up in a port stop/start sequence as then the HW would start afresh, but this driver internal variable would still point to a stale value. Fixes: 03c8f2fe111c ("net/bnxt: detect bad opaque in Rx completion") Cc: stable@dpdk.org Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> --- v2: Updated commit log as requested by Ferruh drivers/net/bnxt/bnxt_rxr.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/bnxt/bnxt_rxr.c b/drivers/net/bnxt/bnxt_rxr.c index aea71703d1..73fbdd17d1 100644 --- a/drivers/net/bnxt/bnxt_rxr.c +++ b/drivers/net/bnxt/bnxt_rxr.c @@ -1379,6 +1379,9 @@ int bnxt_init_one_rx_ring(struct bnxt_rx_queue *rxq) } PMD_DRV_LOG(DEBUG, "TPA alloc Done!\n"); + /* Explicitly reset this driver internal tracker on a ring init */ + rxr->rx_next_cons = 0; + return 0; } -- 2.28.0.497.g54e85e7
[-- Attachment #1: Type: text/plain, Size: 1828 bytes --] Thank you Ferruh, have sent V2 with updated commit log, please see if it sounds good now? -Som On Fri, Aug 20, 2021 at 6:17 PM Ferruh Yigit <ferruh.yigit@intel.com> wrote: > On 8/10/2021 7:07 AM, Somnath Kotur wrote: > > In bnxt_init_one_rx_ring(), reset this variable internal to the driver > > ring to 0, so that there is no mismatch with actual value in HW on > > traffic resumption. > > > > Hi Somnath, > > What is the impact of the patch, what are you fixing? > > If the 'rx_next_cons' (next consumer index) is not matching with HW value, > what > happens, does the Rx fails completely and unrecoverable? Or does miss one > or > more packet? > > And when/how this happens, since driver is not new I assume this doesn't > happen > in first ring init. Does this happen after uninit and init again? > Providing this > information helps clarifying scope of the fix. > > Can you please send a new version with updated commit log? > > Thanks, > ferruh > > > Fixes: 03c8f2fe111c ("net/bnxt: detect bad opaque in Rx completion") > > Cc: stable@dpdk.org > > > > Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> > > Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> > > Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> > > --- > > drivers/net/bnxt/bnxt_rxr.c | 3 +++ > > 1 file changed, 3 insertions(+) > > > > diff --git a/drivers/net/bnxt/bnxt_rxr.c b/drivers/net/bnxt/bnxt_rxr.c > > index aea71703d1..73fbdd17d1 100644 > > --- a/drivers/net/bnxt/bnxt_rxr.c > > +++ b/drivers/net/bnxt/bnxt_rxr.c > > @@ -1379,6 +1379,9 @@ int bnxt_init_one_rx_ring(struct bnxt_rx_queue > *rxq) > > } > > PMD_DRV_LOG(DEBUG, "TPA alloc Done!\n"); > > > > + /* Explicitly reset this driver internal tracker on a ring init */ > > + rxr->rx_next_cons = 0; > > + > > return 0; > > } > > > > > >
In bnxt_init_one_rx_ring(), reset rx_next_cons to 0, so that there is no scope for any false mismatch errors being flagged on traffic resumption. Typically, this would show up in a port stop/start sequence as then the HW would start with fresh values, but this driver internal variable would still point to a stale value. Fixes: 03c8f2fe111c ("net/bnxt: detect bad opaque in Rx completion") Cc: stable@dpdk.org Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> --- v3: Updated commit log and summary as requested by Ferruh drivers/net/bnxt/bnxt_rxr.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/bnxt/bnxt_rxr.c b/drivers/net/bnxt/bnxt_rxr.c index aea71703d1..73fbdd17d1 100644 --- a/drivers/net/bnxt/bnxt_rxr.c +++ b/drivers/net/bnxt/bnxt_rxr.c @@ -1379,6 +1379,9 @@ int bnxt_init_one_rx_ring(struct bnxt_rx_queue *rxq) } PMD_DRV_LOG(DEBUG, "TPA alloc Done!\n"); + /* Explicitly reset this driver internal tracker on a ring init */ + rxr->rx_next_cons = 0; + return 0; } -- 2.28.0.497.g54e85e7
On chips like Thor, port stop/start sequence could result in a crash in the application. This is because of false detection of a bad opaque in the Rx completion and the subsequent kicking-in of the ring reset code to recover from the condition. The root cause being that the port stop/start would result in the HW starting with fresh values, while the driver internal tracker variable `rx_next_cons` is still pointing to a stale value. Fix this by reseting rx_next_cons to 0 in bnxt_init_one_rx_ring() Fixes: 03c8f2fe111c ("net/bnxt: detect bad opaque in Rx completion") Cc: stable@dpdk.org Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> --- v4: Updated commit log as requested by Ferruh v3: Updated commit log and summary as requested by Ferruh v2: Updated commit log as requested by Ferruh drivers/net/bnxt/bnxt_rxr.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/bnxt/bnxt_rxr.c b/drivers/net/bnxt/bnxt_rxr.c index aea71703d1..73fbdd17d1 100644 --- a/drivers/net/bnxt/bnxt_rxr.c +++ b/drivers/net/bnxt/bnxt_rxr.c @@ -1379,6 +1379,9 @@ int bnxt_init_one_rx_ring(struct bnxt_rx_queue *rxq) } PMD_DRV_LOG(DEBUG, "TPA alloc Done!\n"); + /* Explicitly reset this driver internal tracker on a ring init */ + rxr->rx_next_cons = 0; + return 0; } -- 2.28.0.497.g54e85e7
On chips like Thor, port stop/start sequence could result in a crash in the application. This is because of false detection of a bad opaque in the Rx completion and the subsequent kicking-in of the ring reset code to recover from the condition. The root cause being that the port stop/start would result in the HW starting with fresh values, while the driver internal tracker variable `rx_next_cons` is still pointing to a stale value. Fix this by resetting rx_next_cons to 0 in bnxt_init_one_rx_ring() Fixes: 03c8f2fe111c ("net/bnxt: detect bad opaque in Rx completion") Cc: stable@dpdk.org Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> --- v4: Updated commit log as requested by Ferruh v3: Updated commit log and summary as requested by Ferruh v2: Updated commit log as requested by Ferruh drivers/net/bnxt/bnxt_rxr.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/net/bnxt/bnxt_rxr.c b/drivers/net/bnxt/bnxt_rxr.c index aea71703d1..73fbdd17d1 100644 --- a/drivers/net/bnxt/bnxt_rxr.c +++ b/drivers/net/bnxt/bnxt_rxr.c @@ -1379,6 +1379,9 @@ int bnxt_init_one_rx_ring(struct bnxt_rx_queue *rxq) } PMD_DRV_LOG(DEBUG, "TPA alloc Done!\n"); + /* Explicitly reset this driver internal tracker on a ring init */ + rxr->rx_next_cons = 0; + return 0; } -- 2.28.0.497.g54e85e7
[-- Attachment #1: Type: text/plain, Size: 932 bytes --] On Mon, Aug 23, 2021 at 8:49 AM Somnath Kotur <somnath.kotur@broadcom.com> wrote: > > On chips like Thor, port stop/start sequence could result in a crash > in the application. This is because of false detection of a bad > opaque in the Rx completion and the subsequent kicking-in of the ring > reset code to recover from the condition. > The root cause being that the port stop/start would result in the HW > starting with fresh values, while the driver internal tracker variable > `rx_next_cons` is still pointing to a stale value. > Fix this by resetting rx_next_cons to 0 in bnxt_init_one_rx_ring() > > Fixes: 03c8f2fe111c ("net/bnxt: detect bad opaque in Rx completion") > Cc: stable@dpdk.org > > Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> > Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> > Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Thanks Ferruh. Patch applied to dpdk-next-net-brcm