From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 3C36C4331C; Mon, 13 Nov 2023 17:28:20 +0100 (CET) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 2E5E0402CF; Mon, 13 Nov 2023 17:28:20 +0100 (CET) Received: from linux.microsoft.com (linux.microsoft.com [13.77.154.182]) by mails.dpdk.org (Postfix) with ESMTP id 8308A4026C for ; Mon, 13 Nov 2023 17:28:18 +0100 (CET) Received: by linux.microsoft.com (Postfix, from userid 1086) id D5E7920B74C1; Mon, 13 Nov 2023 08:28:17 -0800 (PST) DKIM-Filter: OpenDKIM Filter v2.11.0 linux.microsoft.com D5E7920B74C1 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.microsoft.com; s=default; t=1699892897; bh=76cqrmV+XcBMDG42IrG34Ug60UgF80UA03Ni4svwp24=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=WO7KceImF/UwjkimotgSdVos4wwcVCSVHUkY26a+1CTimmTDFQjtApp9hb7pB74nn ULGGudWPbGTnDk9XWmhwomFWXM77PbYzIHlnt5NXaHNB3VPkeGXGTYAliwonXJE3ne kaxWkmjHwz8J/t3W3ab35OD1BXlADNVShepb4bKQ= Date: Mon, 13 Nov 2023 08:28:17 -0800 From: Tyler Retzlaff To: Stephen Hemminger Cc: Andrew Rybchenko , dev@dpdk.org Subject: Re: BUILD bug hidden in SFC driver. Message-ID: <20231113162817.GA13153@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net> References: <20231111085634.5df512bf@hermes.local> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20231111085634.5df512bf@hermes.local> User-Agent: Mutt/1.5.21 (2010-09-15) X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org On Sat, Nov 11, 2023 at 08:56:34AM -0800, Stephen Hemminger wrote: > While examining the use of VLA in DPDK, ran into a bug in sfc driver. > > If DPDK is built with -Wvla, then the RTE_BUILD_BUG_ON() macro won't work > as written. Experimenting with a better more portable version of that macro > as: > #define RTE_BUILD_BUG_ON(e) _Static_assert(!(e), #e) > > revealed that the SFC driver was calling RTE_BUILD_BUG_ON with non constant > expression. > > ../drivers/net/sfc/sfc_ef100_tx.c: In function ‘sfc_ef100_tx_pkt_descs_max’: > ../lib/eal/include/rte_common.h:585:20: warning: comparison of integer expressions of different signedness: ‘unsigned int’ and ‘int’ [-Wsign-compare] > 585 | _a < _b ? _a : _b; \ > | ^ > ../lib/eal/include/rte_common.h:498:46: note: in definition of macro ‘RTE_BUILD_BUG_ON’ > 498 | #define RTE_BUILD_BUG_ON(e) _Static_assert(!(e), #e) > | ^ > ../drivers/net/sfc/sfc_ef100_tx.c:566:34: note: in expansion of macro ‘RTE_MIN’ > 566 | RTE_MIN((unsigned int)EFX_MAC_PDU_MAX, > | ^~~~~~~ > ../lib/eal/include/rte_common.h:585:32: warning: operand of ‘?:’ changes signedness from ‘int’ to ‘unsigned int’ due to unsignedness of other operand [-Wsign-compare] > 585 | _a < _b ? _a : _b; \ > | ^~ > ../lib/eal/include/rte_common.h:498:46: note: in definition of macro ‘RTE_BUILD_BUG_ON’ > 498 | #define RTE_BUILD_BUG_ON(e) _Static_assert(!(e), #e) > | ^ > ../drivers/net/sfc/sfc_ef100_tx.c:566:34: note: in expansion of macro ‘RTE_MIN’ > 566 | RTE_MIN((unsigned int)EFX_MAC_PDU_MAX, > | ^~~~~~~ > ../lib/eal/include/rte_common.h:498:44: error: expression in static assertion is not constant > 498 | #define RTE_BUILD_BUG_ON(e) _Static_assert(!(e), #e) > | ^~~~ > ../drivers/net/sfc/sfc_ef100_tx.c:565:17: note: in expansion of macro ‘RTE_BUILD_BUG_ON’ > > > The problem is that Gcc does not evaluate a ternary operator expression > with all constants at compile time to produce a constant value! Apparently, > the language standards leave this as ambiguous. the problem is the expression being asserted was *never* constant or at least not the expression you thought was being evaluated. when the non-constant value is provided to the macro it produces a VLA where the sizeof that VLA is the result of the expansion. i'm aware of multiple instances of this bug outside of the use of this macro in dpdk. the reason the expression is non-constant is because all the RTE_XXX macros are expanded to statement expressions which themselves do not evaluate as constant expressions. any RTE_BUILD_BUG_ON() expanding an RTE_XXX macro will be impacted. ideally we would move the source tree to disable VLAs entirely to wipe out this and other similar bugs i've found. it would also be nice to find a way to eliminate the use of statement expressions because of the extremely subtle ways they can explode when seemingly innocuous control flow mechanisms are present. we should immediately convert to _Static_assert even if that means temporarily disabling some of the RTE_BUILD_BUG_ON entries (which as we can see have no value as written). we should enable -Wvla and then handle re-enablement as an exception to allow us to transition to a tree without VLAs to squash new additions and catch other unexpected VLA generation. > > If the code is expanded into two assertions as: > > diff --git a/drivers/net/sfc/sfc_ef100_tx.c b/drivers/net/sfc/sfc_ef100_tx.c > index 1b6374775f07..25e6633d6679 100644 > --- a/drivers/net/sfc/sfc_ef100_tx.c > +++ b/drivers/net/sfc/sfc_ef100_tx.c > @@ -562,9 +562,8 @@ sfc_ef100_tx_pkt_descs_max(const struct rte_mbuf *m) > * Make sure that the first segment does not need fragmentation > * (split into many Tx descriptors). > */ > - RTE_BUILD_BUG_ON(SFC_EF100_TX_SEND_DESC_LEN_MAX < > - RTE_MIN((unsigned int)EFX_MAC_PDU_MAX, > - SFC_MBUF_SEG_LEN_MAX)); > + RTE_BUILD_BUG_ON(SFC_EF100_TX_SEND_DESC_LEN_MAX < EFX_MAC_PDU_MAX); > + RTE_BUILD_BUG_ON(SFC_EF100_TX_SEND_DESC_LEN_MAX < SFC_MBUF_SEG_LEN_MAX); > } > > if (m->ol_flags & sfc_dp_mport_override) { > > Then a new problem arises: > In file included from ../lib/mbuf/rte_mbuf.h:36, > from ../drivers/net/sfc/sfc_ef100_tx.c:12: > ../drivers/net/sfc/sfc_ef100_tx.c: In function ‘sfc_ef100_tx_pkt_descs_max’: > ../lib/eal/include/rte_common.h:498:29: error: static assertion failed: "SFC_EF100_TX_SEND_DESC_LEN_MAX < SFC_MBUF_SEG_LEN_MAX" > 498 | #define RTE_BUILD_BUG_ON(e) _Static_assert(!(e), #e) > | ^~~~~~~~~~~~~~ > ../drivers/net/sfc/sfc_ef100_tx.c:566:17: note: in expansion of macro ‘RTE_BUILD_BUG_ON’ > 566 | RTE_BUILD_BUG_ON(SFC_EF100_TX_SEND_DESC_LEN_MAX < SFC_MBUF_SEG_LEN_MAX); > | ^~~~~~~~~~~~~~~~ > > Building a little program to unwind the #defines reveals: > > SFC_EF100_TX_SEND_DESC_LEN_MAX = 16383 > EFX_MAC_PDU_MAX = 9240 > SFC_MBUF_SEG_LEN_MAX = 65535 > > I.e: > RTE_BUILD_BUG_ON(16383 < RTE_MIN(9240, 65535)); > > > Therefore the current driver should be getting build bug, but the existing macro > hides it.