From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id CA93E45FCC; Fri, 3 Jan 2025 01:12:24 +0100 (CET) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 64E4E402B1; Fri, 3 Jan 2025 01:12:24 +0100 (CET) Received: from linux.microsoft.com (linux.microsoft.com [13.77.154.182]) by mails.dpdk.org (Postfix) with ESMTP id 97D56402B1 for ; Fri, 3 Jan 2025 01:12:22 +0100 (CET) Received: by linux.microsoft.com (Postfix, from userid 1213) id D4E4B2041A8E; Thu, 2 Jan 2025 16:12:21 -0800 (PST) DKIM-Filter: OpenDKIM Filter v2.11.0 linux.microsoft.com D4E4B2041A8E DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.microsoft.com; s=default; t=1735863141; bh=wwLivJV1fVvcRqEdE/bSa4BswY+smogP0g0DEgs47I8=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=FI5y4NQeRFgHFlEjUVVgbSEk7HRcDJWWKBLyEafgto8siur+jjJ/AGuhi48tcG5Hc fOc1J5lEH8aQk8uLZZUWVeS/IAxFCyTiPq4egrlX9fSaQBulayj+gFMKoS/FmTeVhR pEonGhna0kiX9lClehrk5O4jXwdqzD2SQV1JCwuA= From: Andre Muezerie To: andremue@linux.microsoft.com Cc: dev@dpdk.org, stephen@networkplumber.org Subject: [PATCH v10 2/3] drivers/common: add diagnostics macros to make code portable Date: Thu, 2 Jan 2025 16:12:16 -0800 Message-Id: <1735863137-31675-3-git-send-email-andremue@linux.microsoft.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1735863137-31675-1-git-send-email-andremue@linux.microsoft.com> References: <1735263196-2809-1-git-send-email-andremue@linux.microsoft.com> <1735863137-31675-1-git-send-email-andremue@linux.microsoft.com> X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org It was a common pattern to have "GCC diagnostic ignored" pragmas sprinkled over the code and only activate these pragmas for certain compilers (gcc and clang). Clang supports GCC's pragma for compatibility with existing source code, so #pragma GCC diagnostic and #pragma clang diagnostic are synonyms for Clang (https://clang.llvm.org/docs/UsersManual.html). Now that effort is being made to make the code compatible with MSVC these expressions would become more complex. It makes sense to hide this complexity behind macros. This makes maintenance easier as these macros are defined in a single place. As a plus the code becomes more readable as well. Signed-off-by: Andre Muezerie --- drivers/common/idpf/idpf_common_rxtx_avx512.c | 46 +++++++++++++++++-- 1 file changed, 42 insertions(+), 4 deletions(-) diff --git a/drivers/common/idpf/idpf_common_rxtx_avx512.c b/drivers/common/idpf/idpf_common_rxtx_avx512.c index b8450b03ae..37cd0a43e2 100644 --- a/drivers/common/idpf/idpf_common_rxtx_avx512.c +++ b/drivers/common/idpf/idpf_common_rxtx_avx512.c @@ -6,10 +6,6 @@ #include "idpf_common_device.h" #include "idpf_common_rxtx.h" -#ifndef __INTEL_COMPILER -#pragma GCC diagnostic ignored "-Wcast-qual" -#endif - #define IDPF_DESCS_PER_LOOP_AVX 8 #define PKTLEN_SHIFT 10 @@ -34,8 +30,11 @@ idpf_singleq_rearm_common(struct idpf_rx_queue *rxq) dma_addr0 = _mm_setzero_si128(); for (i = 0; i < IDPF_VPMD_DESCS_PER_LOOP; i++) { rxp[i] = &rxq->fake_mbuf; +__rte_diagnostic_push +__rte_diagnostic_ignored_wcast_qual _mm_store_si128((__m128i *)&rxdp[i].read, dma_addr0); +__rte_diagnostic_pop } } rte_atomic_fetch_add_explicit(&rxq->rx_stats.mbuf_alloc_failed, @@ -108,8 +107,11 @@ idpf_singleq_rearm_common(struct idpf_rx_queue *rxq) dma_addr4_7 = _mm512_add_epi64(dma_addr4_7, hdr_room); /* flush desc with pa dma_addr */ +__rte_diagnostic_push +__rte_diagnostic_ignored_wcast_qual _mm512_store_si512((__m512i *)&rxdp->read, dma_addr0_3); _mm512_store_si512((__m512i *)&(rxdp + 4)->read, dma_addr4_7); +__rte_diagnostic_pop } rxq->rxrearm_start += IDPF_RXQ_REARM_THRESH; @@ -164,8 +166,11 @@ idpf_singleq_rearm(struct idpf_rx_queue *rxq) dma_addr0 = _mm_setzero_si128(); for (i = 0; i < IDPF_VPMD_DESCS_PER_LOOP; i++) { rxp[i] = &rxq->fake_mbuf; +__rte_diagnostic_push +__rte_diagnostic_ignored_wcast_qual _mm_storeu_si128((__m128i *)&rxdp[i].read, dma_addr0); +__rte_diagnostic_pop } } rte_atomic_fetch_add_explicit(&rxq->rx_stats.mbuf_alloc_failed, @@ -216,10 +221,13 @@ idpf_singleq_rearm(struct idpf_rx_queue *rxq) iovas1); const __m512i desc6_7 = _mm512_bsrli_epi128(desc4_5, 8); +__rte_diagnostic_push +__rte_diagnostic_ignored_wcast_qual _mm512_storeu_si512((void *)rxdp, desc0_1); _mm512_storeu_si512((void *)(rxdp + 2), desc2_3); _mm512_storeu_si512((void *)(rxdp + 4), desc4_5); _mm512_storeu_si512((void *)(rxdp + 6), desc6_7); +__rte_diagnostic_pop rxp += IDPF_DESCS_PER_LOOP_AVX; rxdp += IDPF_DESCS_PER_LOOP_AVX; @@ -336,6 +344,8 @@ _idpf_singleq_recv_raw_pkts_avx512(struct idpf_rx_queue *rxq, #endif __m512i raw_desc0_3, raw_desc4_7; +__rte_diagnostic_push +__rte_diagnostic_ignored_wcast_qual const __m128i raw_desc7 = _mm_load_si128((void *)(rxdp + 7)); rte_compiler_barrier(); @@ -359,6 +369,7 @@ _idpf_singleq_recv_raw_pkts_avx512(struct idpf_rx_queue *rxq, rte_compiler_barrier(); const __m128i raw_desc0 = _mm_load_si128((void *)(rxdp + 0)); +__rte_diagnostic_pop raw_desc4_7 = _mm512_broadcast_i32x4(raw_desc4); raw_desc4_7 = _mm512_inserti32x4(raw_desc4_7, raw_desc5, 1); @@ -560,8 +571,11 @@ idpf_splitq_rearm_common(struct idpf_rx_queue *rx_bufq) dma_addr0 = _mm_setzero_si128(); for (i = 0; i < IDPF_VPMD_DESCS_PER_LOOP; i++) { rxp[i] = &rx_bufq->fake_mbuf; +__rte_diagnostic_push +__rte_diagnostic_ignored_wcast_qual _mm_store_si128((__m128i *)&rxdp[i], dma_addr0); +__rte_diagnostic_pop } } rte_atomic_fetch_add_explicit(&rx_bufq->rx_stats.mbuf_alloc_failed, @@ -634,8 +648,11 @@ idpf_splitq_rearm(struct idpf_rx_queue *rx_bufq) dma_addr0 = _mm_setzero_si128(); for (i = 0; i < IDPF_VPMD_DESCS_PER_LOOP; i++) { rxp[i] = &rx_bufq->fake_mbuf; +__rte_diagnostic_push +__rte_diagnostic_ignored_wcast_qual _mm_storeu_si128((__m128i *)&rxdp[i], dma_addr0); +__rte_diagnostic_pop } } rte_atomic_fetch_add_explicit(&rx_bufq->rx_stats.mbuf_alloc_failed, @@ -797,6 +814,8 @@ _idpf_splitq_recv_raw_pkts_avx512(struct idpf_rx_queue *rxq, #endif __m512i raw_desc0_3, raw_desc4_7; +__rte_diagnostic_push +__rte_diagnostic_ignored_wcast_qual const __m128i raw_desc7 = _mm_load_si128((void *)(rxdp + 7)); rte_compiler_barrier(); @@ -820,6 +839,7 @@ _idpf_splitq_recv_raw_pkts_avx512(struct idpf_rx_queue *rxq, rte_compiler_barrier(); const __m128i raw_desc0 = _mm_load_si128((void *)(rxdp + 0)); +__rte_diagnostic_pop raw_desc4_7 = _mm512_broadcast_i32x4(raw_desc4); raw_desc4_7 = _mm512_inserti32x4(raw_desc4_7, raw_desc5, 1); @@ -1131,7 +1151,10 @@ idpf_singleq_vtx1(volatile struct idpf_base_tx_desc *txdp, __m128i descriptor = _mm_set_epi64x(high_qw, pkt->buf_iova + pkt->data_off); +__rte_diagnostic_push +__rte_diagnostic_ignored_wcast_qual _mm_storeu_si128((__m128i *)txdp, descriptor); +__rte_diagnostic_pop } #define IDPF_TX_LEN_MASK 0xAA @@ -1178,7 +1201,10 @@ idpf_singleq_vtx(volatile struct idpf_base_tx_desc *txdp, pkt[1]->buf_iova + pkt[1]->data_off, hi_qw0, pkt[0]->buf_iova + pkt[0]->data_off); +__rte_diagnostic_push +__rte_diagnostic_ignored_wcast_qual _mm512_storeu_si512((void *)txdp, desc0_3); +__rte_diagnostic_pop } /* do any last ones */ @@ -1435,7 +1461,10 @@ idpf_splitq_vtx1(volatile struct idpf_flex_tx_sched_desc *txdp, __m128i descriptor = _mm_set_epi64x(high_qw, pkt->buf_iova + pkt->data_off); +__rte_diagnostic_push +__rte_diagnostic_ignored_wcast_qual _mm_storeu_si128((__m128i *)txdp, descriptor); +__rte_diagnostic_pop } static __rte_always_inline void @@ -1480,7 +1509,10 @@ idpf_splitq_vtx(volatile struct idpf_flex_tx_sched_desc *txdp, pkt[1]->buf_iova + pkt[1]->data_off, hi_qw0, pkt[0]->buf_iova + pkt[0]->data_off); +__rte_diagnostic_push +__rte_diagnostic_ignored_wcast_qual _mm512_storeu_si512((void *)txdp, desc0_3); +__rte_diagnostic_pop } /* do any last ones */ @@ -1521,11 +1553,14 @@ idpf_splitq_xmit_fixed_burst_vec_avx512(void *tx_queue, struct rte_mbuf **tx_pkt if (nb_commit >= n) { tx_backlog_entry_avx512(txep, tx_pkts, n); +__rte_diagnostic_push +__rte_diagnostic_ignored_wcast_qual idpf_splitq_vtx((void *)txdp, tx_pkts, n - 1, cmd_dtype); tx_pkts += (n - 1); txdp += (n - 1); idpf_splitq_vtx1((void *)txdp, *tx_pkts++, cmd_dtype); +__rte_diagnostic_pop nb_commit = (uint16_t)(nb_commit - n); @@ -1540,7 +1575,10 @@ idpf_splitq_xmit_fixed_burst_vec_avx512(void *tx_queue, struct rte_mbuf **tx_pkt tx_backlog_entry_avx512(txep, tx_pkts, nb_commit); +__rte_diagnostic_push +__rte_diagnostic_ignored_wcast_qual idpf_splitq_vtx((void *)txdp, tx_pkts, nb_commit, cmd_dtype); +__rte_diagnostic_pop tx_id = (uint16_t)(tx_id + nb_commit); if (tx_id > txq->next_rs) -- 2.47.0.vfs.0.3