DPDK patches and discussions
 help / color / mirror / Atom feed
* BUILD bug hidden in SFC driver.
@ 2023-11-11 16:56 Stephen Hemminger
  2023-11-11 17:29 ` Stephen Hemminger
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Stephen Hemminger @ 2023-11-11 16:56 UTC (permalink / raw)
  To: Andrew Rybchenko; +Cc: dev

While examining the use of VLA in DPDK, ran into a bug in sfc driver.

If DPDK is built with -Wvla, then the RTE_BUILD_BUG_ON() macro won't work
as written. Experimenting with a better more portable version of that macro
as:
	#define RTE_BUILD_BUG_ON(e) _Static_assert(!(e),  #e)

revealed that the SFC driver was calling RTE_BUILD_BUG_ON with non constant
expression.

../drivers/net/sfc/sfc_ef100_tx.c: In function ‘sfc_ef100_tx_pkt_descs_max’:
../lib/eal/include/rte_common.h:585:20: warning: comparison of integer expressions of different signedness: ‘unsigned int’ and ‘int’ [-Wsign-compare]
  585 |                 _a < _b ? _a : _b; \
      |                    ^
../lib/eal/include/rte_common.h:498:46: note: in definition of macro ‘RTE_BUILD_BUG_ON’
  498 | #define RTE_BUILD_BUG_ON(e) _Static_assert(!(e),  #e)
      |                                              ^
../drivers/net/sfc/sfc_ef100_tx.c:566:34: note: in expansion of macro ‘RTE_MIN’
  566 |                                  RTE_MIN((unsigned int)EFX_MAC_PDU_MAX,
      |                                  ^~~~~~~
../lib/eal/include/rte_common.h:585:32: warning: operand of ‘?:’ changes signedness from ‘int’ to ‘unsigned int’ due to unsignedness of other operand [-Wsign-compare]
  585 |                 _a < _b ? _a : _b; \
      |                                ^~
../lib/eal/include/rte_common.h:498:46: note: in definition of macro ‘RTE_BUILD_BUG_ON’
  498 | #define RTE_BUILD_BUG_ON(e) _Static_assert(!(e),  #e)
      |                                              ^
../drivers/net/sfc/sfc_ef100_tx.c:566:34: note: in expansion of macro ‘RTE_MIN’
  566 |                                  RTE_MIN((unsigned int)EFX_MAC_PDU_MAX,
      |                                  ^~~~~~~
../lib/eal/include/rte_common.h:498:44: error: expression in static assertion is not constant
  498 | #define RTE_BUILD_BUG_ON(e) _Static_assert(!(e),  #e)
      |                                            ^~~~
../drivers/net/sfc/sfc_ef100_tx.c:565:17: note: in expansion of macro ‘RTE_BUILD_BUG_ON’


The problem is that Gcc does not evaluate a ternary operator expression
with all constants at compile time to produce a constant value! Apparently,
the language standards leave this as ambiguous.

If the code is expanded into two assertions as:

diff --git a/drivers/net/sfc/sfc_ef100_tx.c b/drivers/net/sfc/sfc_ef100_tx.c
index 1b6374775f07..25e6633d6679 100644
--- a/drivers/net/sfc/sfc_ef100_tx.c
+++ b/drivers/net/sfc/sfc_ef100_tx.c
@@ -562,9 +562,8 @@ sfc_ef100_tx_pkt_descs_max(const struct rte_mbuf *m)
                 * Make sure that the first segment does not need fragmentation
                 * (split into many Tx descriptors).
                 */
-               RTE_BUILD_BUG_ON(SFC_EF100_TX_SEND_DESC_LEN_MAX <
-                                RTE_MIN((unsigned int)EFX_MAC_PDU_MAX,
-                                SFC_MBUF_SEG_LEN_MAX));
+               RTE_BUILD_BUG_ON(SFC_EF100_TX_SEND_DESC_LEN_MAX < EFX_MAC_PDU_MAX);
+               RTE_BUILD_BUG_ON(SFC_EF100_TX_SEND_DESC_LEN_MAX < SFC_MBUF_SEG_LEN_MAX);
        }
 
        if (m->ol_flags & sfc_dp_mport_override) {

Then a new problem arises:
In file included from ../lib/mbuf/rte_mbuf.h:36,
                 from ../drivers/net/sfc/sfc_ef100_tx.c:12:
../drivers/net/sfc/sfc_ef100_tx.c: In function ‘sfc_ef100_tx_pkt_descs_max’:
../lib/eal/include/rte_common.h:498:29: error: static assertion failed: "SFC_EF100_TX_SEND_DESC_LEN_MAX < SFC_MBUF_SEG_LEN_MAX"
  498 | #define RTE_BUILD_BUG_ON(e) _Static_assert(!(e),  #e)
      |                             ^~~~~~~~~~~~~~
../drivers/net/sfc/sfc_ef100_tx.c:566:17: note: in expansion of macro ‘RTE_BUILD_BUG_ON’
  566 |                 RTE_BUILD_BUG_ON(SFC_EF100_TX_SEND_DESC_LEN_MAX < SFC_MBUF_SEG_LEN_MAX);
      |                 ^~~~~~~~~~~~~~~~

Building a little program to unwind the #defines reveals:

SFC_EF100_TX_SEND_DESC_LEN_MAX = 16383
EFX_MAC_PDU_MAX = 9240
SFC_MBUF_SEG_LEN_MAX = 65535

I.e:
	RTE_BUILD_BUG_ON(16383 < RTE_MIN(9240, 65535));
	

Therefore the current driver should be getting build bug, but the existing macro
hides it.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: BUILD bug hidden in SFC driver.
  2023-11-11 16:56 BUILD bug hidden in SFC driver Stephen Hemminger
@ 2023-11-11 17:29 ` Stephen Hemminger
  2023-11-13 12:04 ` Ivan Malov
  2023-11-13 16:28 ` Tyler Retzlaff
  2 siblings, 0 replies; 5+ messages in thread
From: Stephen Hemminger @ 2023-11-11 17:29 UTC (permalink / raw)
  To: Andrew Rybchenko; +Cc: dev

On Sat, 11 Nov 2023 08:56:34 -0800
Stephen Hemminger <stephen@networkplumber.org> wrote:

> While examining the use of VLA in DPDK, ran into a bug in sfc driver.
> 
> If DPDK is built with -Wvla, then the RTE_BUILD_BUG_ON() macro won't work
> as written. Experimenting with a better more portable version of that macro
> as:
> 	#define RTE_BUILD_BUG_ON(e) _Static_assert(!(e),  #e)

Sending this as RFC also catches another driver with hidden failure.

from ../drivers/event/opdl/opdl_ring.c:10:
./drivers/event/opdl/opdl_ring.c: In function ‘opdl_ring_create’:
./lib/eal/include/rte_common.h:498:44: error: expression in static assertion is not constant
#define RTE_BUILD_BUG_ON(e) _Static_assert(!(e), #e)
^~~~
./drivers/event/opdl/opdl_ring.c:913:2: note: in expansion of macro ‘RTE_BUILD_BUG_ON’
RTE_BUILD_BUG_ON(!rte_is_power_of_2(OPDL_DISCLAIMS_PER_LCORE));

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: BUILD bug hidden in SFC driver.
  2023-11-11 16:56 BUILD bug hidden in SFC driver Stephen Hemminger
  2023-11-11 17:29 ` Stephen Hemminger
@ 2023-11-13 12:04 ` Ivan Malov
  2023-11-13 16:30   ` Stephen Hemminger
  2023-11-13 16:28 ` Tyler Retzlaff
  2 siblings, 1 reply; 5+ messages in thread
From: Ivan Malov @ 2023-11-13 12:04 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Andrew Rybchenko, dev

[-- Attachment #1: Type: text/plain, Size: 5770 bytes --]

Hi Stephen,

On Sat, 11 Nov 2023, Stephen Hemminger wrote:

> While examining the use of VLA in DPDK, ran into a bug in sfc driver.
>
> If DPDK is built with -Wvla, then the RTE_BUILD_BUG_ON() macro won't work
> as written. Experimenting with a better more portable version of that macro
> as:
> 	#define RTE_BUILD_BUG_ON(e) _Static_assert(!(e),  #e)

First of all, thanks for the effort. Very helpful.
Please see below.

>
> revealed that the SFC driver was calling RTE_BUILD_BUG_ON with non constant
> expression.
>
> ../drivers/net/sfc/sfc_ef100_tx.c: In function ‘sfc_ef100_tx_pkt_descs_max’:
> ../lib/eal/include/rte_common.h:585:20: warning: comparison of integer expressions of different signedness: ‘unsigned int’ and ‘int’ [-Wsign-compare]
>  585 |                 _a < _b ? _a : _b; \
>      |                    ^
> ../lib/eal/include/rte_common.h:498:46: note: in definition of macro ‘RTE_BUILD_BUG_ON’
>  498 | #define RTE_BUILD_BUG_ON(e) _Static_assert(!(e),  #e)
>      |                                              ^
> ../drivers/net/sfc/sfc_ef100_tx.c:566:34: note: in expansion of macro ‘RTE_MIN’
>  566 |                                  RTE_MIN((unsigned int)EFX_MAC_PDU_MAX,
>      |                                  ^~~~~~~
> ../lib/eal/include/rte_common.h:585:32: warning: operand of ‘?:’ changes signedness from ‘int’ to ‘unsigned int’ due to unsignedness of other operand [-Wsign-compare]
>  585 |                 _a < _b ? _a : _b; \
>      |                                ^~
> ../lib/eal/include/rte_common.h:498:46: note: in definition of macro ‘RTE_BUILD_BUG_ON’
>  498 | #define RTE_BUILD_BUG_ON(e) _Static_assert(!(e),  #e)
>      |                                              ^
> ../drivers/net/sfc/sfc_ef100_tx.c:566:34: note: in expansion of macro ‘RTE_MIN’
>  566 |                                  RTE_MIN((unsigned int)EFX_MAC_PDU_MAX,
>      |                                  ^~~~~~~
> ../lib/eal/include/rte_common.h:498:44: error: expression in static assertion is not constant
>  498 | #define RTE_BUILD_BUG_ON(e) _Static_assert(!(e),  #e)
>      |                                            ^~~~
> ../drivers/net/sfc/sfc_ef100_tx.c:565:17: note: in expansion of macro ‘RTE_BUILD_BUG_ON’
>
>
> The problem is that Gcc does not evaluate a ternary operator expression
> with all constants at compile time to produce a constant value! Apparently,
> the language standards leave this as ambiguous.
>
> If the code is expanded into two assertions as:
>
> diff --git a/drivers/net/sfc/sfc_ef100_tx.c b/drivers/net/sfc/sfc_ef100_tx.c
> index 1b6374775f07..25e6633d6679 100644
> --- a/drivers/net/sfc/sfc_ef100_tx.c
> +++ b/drivers/net/sfc/sfc_ef100_tx.c
> @@ -562,9 +562,8 @@ sfc_ef100_tx_pkt_descs_max(const struct rte_mbuf *m)
>                 * Make sure that the first segment does not need fragmentation
>                 * (split into many Tx descriptors).
>                 */
> -               RTE_BUILD_BUG_ON(SFC_EF100_TX_SEND_DESC_LEN_MAX <
> -                                RTE_MIN((unsigned int)EFX_MAC_PDU_MAX,
> -                                SFC_MBUF_SEG_LEN_MAX));
> +               RTE_BUILD_BUG_ON(SFC_EF100_TX_SEND_DESC_LEN_MAX < EFX_MAC_PDU_MAX);
> +               RTE_BUILD_BUG_ON(SFC_EF100_TX_SEND_DESC_LEN_MAX < SFC_MBUF_SEG_LEN_MAX);
>        }
>
>        if (m->ol_flags & sfc_dp_mport_override) {
>
> Then a new problem arises:
> In file included from ../lib/mbuf/rte_mbuf.h:36,
>                 from ../drivers/net/sfc/sfc_ef100_tx.c:12:
> ../drivers/net/sfc/sfc_ef100_tx.c: In function ‘sfc_ef100_tx_pkt_descs_max’:
> ../lib/eal/include/rte_common.h:498:29: error: static assertion failed: "SFC_EF100_TX_SEND_DESC_LEN_MAX < SFC_MBUF_SEG_LEN_MAX"
>  498 | #define RTE_BUILD_BUG_ON(e) _Static_assert(!(e),  #e)
>      |                             ^~~~~~~~~~~~~~
> ../drivers/net/sfc/sfc_ef100_tx.c:566:17: note: in expansion of macro ‘RTE_BUILD_BUG_ON’
>  566 |                 RTE_BUILD_BUG_ON(SFC_EF100_TX_SEND_DESC_LEN_MAX < SFC_MBUF_SEG_LEN_MAX);
>      |                 ^~~~~~~~~~~~~~~~
>
> Building a little program to unwind the #defines reveals:
>
> SFC_EF100_TX_SEND_DESC_LEN_MAX = 16383
> EFX_MAC_PDU_MAX = 9240
> SFC_MBUF_SEG_LEN_MAX = 65535
>
> I.e:
> 	RTE_BUILD_BUG_ON(16383 < RTE_MIN(9240, 65535));
>
>
> Therefore the current driver should be getting build bug, but the existing macro
> hides it.

As far as I understand, the intention behind this check is to make sure
that SFC_EF100_TX_SEND_DESC_LEN_MAX represents enough room to
accommodate either EFX_MAC_PDU_MAX or SFC_MBUF_SEG_LEN_MAX bytes,
whichever is smaller. Is 16383 sufficient to accommodate 9240?
I think so. Do you agree?

That being said, indeed, applying the "more portable version" of yours
results in me seeing the warning about a non-constant expression.

Applying the following patch makes all errors disappear
when building with either version of RTE_BUILD_BUG_ON:

diff --git a/drivers/net/sfc/sfc_ef100_tx.c b/drivers/net/sfc/sfc_ef100_tx.c
index 1b6374775f..01f37c2616 100644
--- a/drivers/net/sfc/sfc_ef100_tx.c
+++ b/drivers/net/sfc/sfc_ef100_tx.c
@@ -563,7 +563,7 @@ sfc_ef100_tx_pkt_descs_max(const struct rte_mbuf *m)
                  * (split into many Tx descriptors).
                  */
                 RTE_BUILD_BUG_ON(SFC_EF100_TX_SEND_DESC_LEN_MAX <
-                                RTE_MIN((unsigned int)EFX_MAC_PDU_MAX,
+                                MIN((unsigned int)EFX_MAC_PDU_MAX,
                                  SFC_MBUF_SEG_LEN_MAX));
         }

with MIN being defined in drivers/common/sfc_efx/efsys.h as
#define MIN(v1, v2)	((v1) < (v2) ? (v1) : (v2))

Would that be an acceptable fix? Or am I missing something?

Thank you.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: BUILD bug hidden in SFC driver.
  2023-11-11 16:56 BUILD bug hidden in SFC driver Stephen Hemminger
  2023-11-11 17:29 ` Stephen Hemminger
  2023-11-13 12:04 ` Ivan Malov
@ 2023-11-13 16:28 ` Tyler Retzlaff
  2 siblings, 0 replies; 5+ messages in thread
From: Tyler Retzlaff @ 2023-11-13 16:28 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Andrew Rybchenko, dev

On Sat, Nov 11, 2023 at 08:56:34AM -0800, Stephen Hemminger wrote:
> While examining the use of VLA in DPDK, ran into a bug in sfc driver.
> 
> If DPDK is built with -Wvla, then the RTE_BUILD_BUG_ON() macro won't work
> as written. Experimenting with a better more portable version of that macro
> as:
> 	#define RTE_BUILD_BUG_ON(e) _Static_assert(!(e),  #e)
> 
> revealed that the SFC driver was calling RTE_BUILD_BUG_ON with non constant
> expression.
> 
> ../drivers/net/sfc/sfc_ef100_tx.c: In function ‘sfc_ef100_tx_pkt_descs_max’:
> ../lib/eal/include/rte_common.h:585:20: warning: comparison of integer expressions of different signedness: ‘unsigned int’ and ‘int’ [-Wsign-compare]
>   585 |                 _a < _b ? _a : _b; \
>       |                    ^
> ../lib/eal/include/rte_common.h:498:46: note: in definition of macro ‘RTE_BUILD_BUG_ON’
>   498 | #define RTE_BUILD_BUG_ON(e) _Static_assert(!(e),  #e)
>       |                                              ^
> ../drivers/net/sfc/sfc_ef100_tx.c:566:34: note: in expansion of macro ‘RTE_MIN’
>   566 |                                  RTE_MIN((unsigned int)EFX_MAC_PDU_MAX,
>       |                                  ^~~~~~~
> ../lib/eal/include/rte_common.h:585:32: warning: operand of ‘?:’ changes signedness from ‘int’ to ‘unsigned int’ due to unsignedness of other operand [-Wsign-compare]
>   585 |                 _a < _b ? _a : _b; \
>       |                                ^~
> ../lib/eal/include/rte_common.h:498:46: note: in definition of macro ‘RTE_BUILD_BUG_ON’
>   498 | #define RTE_BUILD_BUG_ON(e) _Static_assert(!(e),  #e)
>       |                                              ^
> ../drivers/net/sfc/sfc_ef100_tx.c:566:34: note: in expansion of macro ‘RTE_MIN’
>   566 |                                  RTE_MIN((unsigned int)EFX_MAC_PDU_MAX,
>       |                                  ^~~~~~~
> ../lib/eal/include/rte_common.h:498:44: error: expression in static assertion is not constant
>   498 | #define RTE_BUILD_BUG_ON(e) _Static_assert(!(e),  #e)
>       |                                            ^~~~
> ../drivers/net/sfc/sfc_ef100_tx.c:565:17: note: in expansion of macro ‘RTE_BUILD_BUG_ON’
> 
> 
> The problem is that Gcc does not evaluate a ternary operator expression
> with all constants at compile time to produce a constant value! Apparently,
> the language standards leave this as ambiguous.

the problem is the expression being asserted was *never* constant or at
least not the expression you thought was being evaluated.

when the non-constant value is provided to the macro it produces a VLA
where the sizeof that VLA is the result of the expansion. i'm aware of
multiple instances of this bug outside of the use of this macro in dpdk.

the reason the expression is non-constant is because all the RTE_XXX
macros are expanded to statement expressions which themselves do not
evaluate as constant expressions. any RTE_BUILD_BUG_ON() expanding an
RTE_XXX macro will be impacted.

ideally we would move the source tree to disable VLAs entirely to wipe
out this and other similar bugs i've found. it would also be nice to
find a way to eliminate the use of statement expressions because of the
extremely subtle ways they can explode when seemingly innocuous control
flow mechanisms are present.

we should immediately convert to _Static_assert even if that means
temporarily disabling some of the RTE_BUILD_BUG_ON entries (which as we
can see have no value as written).

we should enable -Wvla and then handle re-enablement as an exception to
allow us to transition to a tree without VLAs to squash new additions
and catch other unexpected VLA generation.

> 
> If the code is expanded into two assertions as:
> 
> diff --git a/drivers/net/sfc/sfc_ef100_tx.c b/drivers/net/sfc/sfc_ef100_tx.c
> index 1b6374775f07..25e6633d6679 100644
> --- a/drivers/net/sfc/sfc_ef100_tx.c
> +++ b/drivers/net/sfc/sfc_ef100_tx.c
> @@ -562,9 +562,8 @@ sfc_ef100_tx_pkt_descs_max(const struct rte_mbuf *m)
>                  * Make sure that the first segment does not need fragmentation
>                  * (split into many Tx descriptors).
>                  */
> -               RTE_BUILD_BUG_ON(SFC_EF100_TX_SEND_DESC_LEN_MAX <
> -                                RTE_MIN((unsigned int)EFX_MAC_PDU_MAX,
> -                                SFC_MBUF_SEG_LEN_MAX));
> +               RTE_BUILD_BUG_ON(SFC_EF100_TX_SEND_DESC_LEN_MAX < EFX_MAC_PDU_MAX);
> +               RTE_BUILD_BUG_ON(SFC_EF100_TX_SEND_DESC_LEN_MAX < SFC_MBUF_SEG_LEN_MAX);
>         }
>  
>         if (m->ol_flags & sfc_dp_mport_override) {
> 
> Then a new problem arises:
> In file included from ../lib/mbuf/rte_mbuf.h:36,
>                  from ../drivers/net/sfc/sfc_ef100_tx.c:12:
> ../drivers/net/sfc/sfc_ef100_tx.c: In function ‘sfc_ef100_tx_pkt_descs_max’:
> ../lib/eal/include/rte_common.h:498:29: error: static assertion failed: "SFC_EF100_TX_SEND_DESC_LEN_MAX < SFC_MBUF_SEG_LEN_MAX"
>   498 | #define RTE_BUILD_BUG_ON(e) _Static_assert(!(e),  #e)
>       |                             ^~~~~~~~~~~~~~
> ../drivers/net/sfc/sfc_ef100_tx.c:566:17: note: in expansion of macro ‘RTE_BUILD_BUG_ON’
>   566 |                 RTE_BUILD_BUG_ON(SFC_EF100_TX_SEND_DESC_LEN_MAX < SFC_MBUF_SEG_LEN_MAX);
>       |                 ^~~~~~~~~~~~~~~~
> 
> Building a little program to unwind the #defines reveals:
> 
> SFC_EF100_TX_SEND_DESC_LEN_MAX = 16383
> EFX_MAC_PDU_MAX = 9240
> SFC_MBUF_SEG_LEN_MAX = 65535
> 
> I.e:
> 	RTE_BUILD_BUG_ON(16383 < RTE_MIN(9240, 65535));
> 	
> 
> Therefore the current driver should be getting build bug, but the existing macro
> hides it.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: BUILD bug hidden in SFC driver.
  2023-11-13 12:04 ` Ivan Malov
@ 2023-11-13 16:30   ` Stephen Hemminger
  0 siblings, 0 replies; 5+ messages in thread
From: Stephen Hemminger @ 2023-11-13 16:30 UTC (permalink / raw)
  To: Ivan Malov; +Cc: Andrew Rybchenko, dev

On Mon, 13 Nov 2023 16:04:51 +0400 (+04)
Ivan Malov <ivan.malov@arknetworks.am> wrote:

> diff --git a/drivers/net/sfc/sfc_ef100_tx.c b/drivers/net/sfc/sfc_ef100_tx.c
> index 1b6374775f..01f37c2616 100644
> --- a/drivers/net/sfc/sfc_ef100_tx.c
> +++ b/drivers/net/sfc/sfc_ef100_tx.c
> @@ -563,7 +563,7 @@ sfc_ef100_tx_pkt_descs_max(const struct rte_mbuf *m)
>                   * (split into many Tx descriptors).
>                   */
>                  RTE_BUILD_BUG_ON(SFC_EF100_TX_SEND_DESC_LEN_MAX <
> -                                RTE_MIN((unsigned int)EFX_MAC_PDU_MAX,
> +                                MIN((unsigned int)EFX_MAC_PDU_MAX,
>                                   SFC_MBUF_SEG_LEN_MAX));
>          }
> 
> with MIN being defined in drivers/common/sfc_efx/efsys.h as
> #define MIN(v1, v2)	((v1) < (v2) ? (v1) : (v2))

That should work.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-11-13 16:30 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-11-11 16:56 BUILD bug hidden in SFC driver Stephen Hemminger
2023-11-11 17:29 ` Stephen Hemminger
2023-11-13 12:04 ` Ivan Malov
2023-11-13 16:30   ` Stephen Hemminger
2023-11-13 16:28 ` Tyler Retzlaff

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).