From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 61923A0C41 for ; Tue, 30 Nov 2021 17:41:16 +0100 (CET) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 5B2AE410F7; Tue, 30 Nov 2021 17:41:16 +0100 (CET) Received: from smtp-relay-internal-0.canonical.com (smtp-relay-internal-0.canonical.com [185.125.188.122]) by mails.dpdk.org (Postfix) with ESMTP id 5D5EF410F7 for ; Tue, 30 Nov 2021 17:41:15 +0100 (CET) Received: from mail-ed1-f69.google.com (mail-ed1-f69.google.com [209.85.208.69]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by smtp-relay-internal-0.canonical.com (Postfix) with ESMTPS id 4E33640035 for ; Tue, 30 Nov 2021 16:41:14 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=canonical.com; s=20210705; t=1638290474; bh=AqIlX9ws067UMc4YkJ7bfDgPPdc9y+wgT6Lpa2U1pPE=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=IUdws9mwOA3BVWRtnst7YBzk3WlS+52bE9bgwk3lEy0uveAsDMH6rYkjcGU3EeNwX Ry6vXjqSdmSihEs5wz9qHkFBRr6bVWVt1J6mX09iSYT3YSHZ8x79Hz79eebiEP62DF 3gsmguFRJp+HcID2hdd/vNezmuD7L+NBNFu6l+hZM8g3rrRl7shUwKHhosmYLLWdCw 6jTLwvM8DVYkFqQQc+fmLRB+4J+Pu3BXxBQQIQj2/EPv2h/Ep9GQiIwVgxQTSsBAe0 m099CKPHOeqjjP66c/aHFKU5RK4GlMPdL9Xh681SnomIy6fhnp8KcdhrRMlNber18R EWYz9VWFJ3JNg== Received: by mail-ed1-f69.google.com with SMTP id i19-20020a05640242d300b003e7d13ebeedso17451437edc.7 for ; Tue, 30 Nov 2021 08:41:14 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=AqIlX9ws067UMc4YkJ7bfDgPPdc9y+wgT6Lpa2U1pPE=; b=xQAm/im2oszqM5TUm6+ZnkfS3uIdX77Cyoz7Yr0dAzzrCYrY+qTSsp8WU40+6DwE/r d1vKLxof2bEutAOMfGZTcNb/B0UnAOQk1KT7TrpJxpncy+Q4cXNdPULBX+Gmjtf7DiZX 8F8sbEDiI8JYvfQ3CR/d8SbYC72W9wb+iypi/RI6ACSMoFDYOPZCh6XdG+MkdomJm38e 4Nhzgn+LJL+FrTwm9f1whmYSTTKaD9mF81uPqjFSBRTjwxtjgyeIwHkmuZ0NidY+JlHk iOG3txsd3aiIdtleXMqLvOacHr/zqe343cE9L6IguRLyZQVyhPaYIP9rDtaH7i0thAS4 YS7g== X-Gm-Message-State: AOAM531V8dgKDfP+LbqgXM0bWJLof9riypAu6ozRiOLNFVsbkWzzoxkL XhKvmfYphf8Yi6DxtnrLzqdvd2o/b5U0ZNQN30PSGZeG5iNPD1oHQ0ZXSglyruXY7GIBBh+q7VZ xL4WFBLZRPLNNKCJBclWM+qEb X-Received: by 2002:aa7:c7cf:: with SMTP id o15mr126719eds.176.1638290473652; Tue, 30 Nov 2021 08:41:13 -0800 (PST) X-Google-Smtp-Source: ABdhPJwGFDf3ZwyEq0/B60AdobjRV+A7zufd9h97bPlfiDXlteOEXtbTRCAt94FK6vWsHPb7X36iWw== X-Received: by 2002:aa7:c7cf:: with SMTP id o15mr126691eds.176.1638290473431; Tue, 30 Nov 2021 08:41:13 -0800 (PST) Received: from localhost.localdomain ([2001:67c:1560:8007::aac:c4ad]) by smtp.gmail.com with ESMTPSA id kx3sm9154519ejc.112.2021.11.30.08.41.12 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 30 Nov 2021 08:41:12 -0800 (PST) From: christian.ehrhardt@canonical.com To: Eli Britstein Cc: dpdk stable Subject: patch 'eal/x86: avoid cast-align warning in memcpy functions' has been queued to stable release 19.11.11 Date: Tue, 30 Nov 2021 17:35:04 +0100 Message-Id: <20211130163605.2460997-100-christian.ehrhardt@canonical.com> X-Mailer: git-send-email 2.34.0 In-Reply-To: <20211130163605.2460997-1-christian.ehrhardt@canonical.com> References: <20211130163605.2460997-1-christian.ehrhardt@canonical.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-BeenThere: stable@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: patches for DPDK stable branches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: stable-bounces@dpdk.org Hi, FYI, your patch has been queued to stable release 19.11.11 Note it hasn't been pushed to http://dpdk.org/browse/dpdk-stable yet. It will be pushed if I get no objections before December 10th 2021. So please shout if anyone has objections. Also note that after the patch there's a diff of the upstream commit vs the patch applied to the branch. This will indicate if there was any rebasing needed to apply to the stable branch. If there were code changes for rebasing (ie: not only metadata diffs), please double check that the rebase was correctly done. Queued patches are on a temporary branch at: https://github.com/cpaelzer/dpdk-stable-queue This queued commit can be viewed at: https://github.com/cpaelzer/dpdk-stable-queue/commit/545b8e9ee5faaa4983cffc669c743fe372a75263 Thanks. Christian Ehrhardt --- >From 545b8e9ee5faaa4983cffc669c743fe372a75263 Mon Sep 17 00:00:00 2001 From: Eli Britstein Date: Thu, 21 Oct 2021 11:51:32 +0300 Subject: [PATCH] eal/x86: avoid cast-align warning in memcpy functions [ upstream commit 6de430b7079e8f7c29f9c18869393f74f8dffcb6 ] Functions and macros in x86 rte_memcpy.h may cause cast-align warnings, when using strict cast align flag with supporting gcc: gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0 CFLAGS="-Wcast-align=strict" make V=1 -C examples/l2fwd clean static For example: In file included from main.c:24: /dpdk/build/include/rte_memcpy.h: In function 'rte_mov16': /dpdk/build/include/rte_memcpy.h:306:25: warning: cast increases required alignment of target type [-Wcast-align] 306 | xmm0 = _mm_loadu_si128((const __m128i *)src); | ^ As the code assumes correct alignment, add first a (void *) or (const void *) castings, to avoid the warnings. Fixes: 9484092baad3 ("eal/x86: optimize memcpy for AVX512 platforms") Signed-off-by: Eli Britstein --- .../common/include/arch/x86/rte_memcpy.h | 80 ++++++++++--------- 1 file changed, 44 insertions(+), 36 deletions(-) diff --git a/lib/librte_eal/common/include/arch/x86/rte_memcpy.h b/lib/librte_eal/common/include/arch/x86/rte_memcpy.h index d01832fa15..f1751dd41c 100644 --- a/lib/librte_eal/common/include/arch/x86/rte_memcpy.h +++ b/lib/librte_eal/common/include/arch/x86/rte_memcpy.h @@ -303,8 +303,8 @@ rte_mov16(uint8_t *dst, const uint8_t *src) { __m128i xmm0; - xmm0 = _mm_loadu_si128((const __m128i *)src); - _mm_storeu_si128((__m128i *)dst, xmm0); + xmm0 = _mm_loadu_si128((const __m128i *)(const void *)src); + _mm_storeu_si128((__m128i *)(void *)dst, xmm0); } /** @@ -316,8 +316,8 @@ rte_mov32(uint8_t *dst, const uint8_t *src) { __m256i ymm0; - ymm0 = _mm256_loadu_si256((const __m256i *)src); - _mm256_storeu_si256((__m256i *)dst, ymm0); + ymm0 = _mm256_loadu_si256((const __m256i *)(const void *)src); + _mm256_storeu_si256((__m256i *)(void *)dst, ymm0); } /** @@ -354,16 +354,24 @@ rte_mov128blocks(uint8_t *dst, const uint8_t *src, size_t n) __m256i ymm0, ymm1, ymm2, ymm3; while (n >= 128) { - ymm0 = _mm256_loadu_si256((const __m256i *)((const uint8_t *)src + 0 * 32)); + ymm0 = _mm256_loadu_si256((const __m256i *)(const void *) + ((const uint8_t *)src + 0 * 32)); n -= 128; - ymm1 = _mm256_loadu_si256((const __m256i *)((const uint8_t *)src + 1 * 32)); - ymm2 = _mm256_loadu_si256((const __m256i *)((const uint8_t *)src + 2 * 32)); - ymm3 = _mm256_loadu_si256((const __m256i *)((const uint8_t *)src + 3 * 32)); + ymm1 = _mm256_loadu_si256((const __m256i *)(const void *) + ((const uint8_t *)src + 1 * 32)); + ymm2 = _mm256_loadu_si256((const __m256i *)(const void *) + ((const uint8_t *)src + 2 * 32)); + ymm3 = _mm256_loadu_si256((const __m256i *)(const void *) + ((const uint8_t *)src + 3 * 32)); src = (const uint8_t *)src + 128; - _mm256_storeu_si256((__m256i *)((uint8_t *)dst + 0 * 32), ymm0); - _mm256_storeu_si256((__m256i *)((uint8_t *)dst + 1 * 32), ymm1); - _mm256_storeu_si256((__m256i *)((uint8_t *)dst + 2 * 32), ymm2); - _mm256_storeu_si256((__m256i *)((uint8_t *)dst + 3 * 32), ymm3); + _mm256_storeu_si256((__m256i *)(void *) + ((uint8_t *)dst + 0 * 32), ymm0); + _mm256_storeu_si256((__m256i *)(void *) + ((uint8_t *)dst + 1 * 32), ymm1); + _mm256_storeu_si256((__m256i *)(void *) + ((uint8_t *)dst + 2 * 32), ymm2); + _mm256_storeu_si256((__m256i *)(void *) + ((uint8_t *)dst + 3 * 32), ymm3); dst = (uint8_t *)dst + 128; } } @@ -496,8 +504,8 @@ rte_mov16(uint8_t *dst, const uint8_t *src) { __m128i xmm0; - xmm0 = _mm_loadu_si128((const __m128i *)(const __m128i *)src); - _mm_storeu_si128((__m128i *)dst, xmm0); + xmm0 = _mm_loadu_si128((const __m128i *)(const void *)src); + _mm_storeu_si128((__m128i *)(void *)dst, xmm0); } /** @@ -581,25 +589,25 @@ rte_mov256(uint8_t *dst, const uint8_t *src) __extension__ ({ \ size_t tmp; \ while (len >= 128 + 16 - offset) { \ - xmm0 = _mm_loadu_si128((const __m128i *)((const uint8_t *)src - offset + 0 * 16)); \ + xmm0 = _mm_loadu_si128((const __m128i *)(const void *)((const uint8_t *)src - offset + 0 * 16)); \ len -= 128; \ - xmm1 = _mm_loadu_si128((const __m128i *)((const uint8_t *)src - offset + 1 * 16)); \ - xmm2 = _mm_loadu_si128((const __m128i *)((const uint8_t *)src - offset + 2 * 16)); \ - xmm3 = _mm_loadu_si128((const __m128i *)((const uint8_t *)src - offset + 3 * 16)); \ - xmm4 = _mm_loadu_si128((const __m128i *)((const uint8_t *)src - offset + 4 * 16)); \ - xmm5 = _mm_loadu_si128((const __m128i *)((const uint8_t *)src - offset + 5 * 16)); \ - xmm6 = _mm_loadu_si128((const __m128i *)((const uint8_t *)src - offset + 6 * 16)); \ - xmm7 = _mm_loadu_si128((const __m128i *)((const uint8_t *)src - offset + 7 * 16)); \ - xmm8 = _mm_loadu_si128((const __m128i *)((const uint8_t *)src - offset + 8 * 16)); \ + xmm1 = _mm_loadu_si128((const __m128i *)(const void *)((const uint8_t *)src - offset + 1 * 16)); \ + xmm2 = _mm_loadu_si128((const __m128i *)(const void *)((const uint8_t *)src - offset + 2 * 16)); \ + xmm3 = _mm_loadu_si128((const __m128i *)(const void *)((const uint8_t *)src - offset + 3 * 16)); \ + xmm4 = _mm_loadu_si128((const __m128i *)(const void *)((const uint8_t *)src - offset + 4 * 16)); \ + xmm5 = _mm_loadu_si128((const __m128i *)(const void *)((const uint8_t *)src - offset + 5 * 16)); \ + xmm6 = _mm_loadu_si128((const __m128i *)(const void *)((const uint8_t *)src - offset + 6 * 16)); \ + xmm7 = _mm_loadu_si128((const __m128i *)(const void *)((const uint8_t *)src - offset + 7 * 16)); \ + xmm8 = _mm_loadu_si128((const __m128i *)(const void *)((const uint8_t *)src - offset + 8 * 16)); \ src = (const uint8_t *)src + 128; \ - _mm_storeu_si128((__m128i *)((uint8_t *)dst + 0 * 16), _mm_alignr_epi8(xmm1, xmm0, offset)); \ - _mm_storeu_si128((__m128i *)((uint8_t *)dst + 1 * 16), _mm_alignr_epi8(xmm2, xmm1, offset)); \ - _mm_storeu_si128((__m128i *)((uint8_t *)dst + 2 * 16), _mm_alignr_epi8(xmm3, xmm2, offset)); \ - _mm_storeu_si128((__m128i *)((uint8_t *)dst + 3 * 16), _mm_alignr_epi8(xmm4, xmm3, offset)); \ - _mm_storeu_si128((__m128i *)((uint8_t *)dst + 4 * 16), _mm_alignr_epi8(xmm5, xmm4, offset)); \ - _mm_storeu_si128((__m128i *)((uint8_t *)dst + 5 * 16), _mm_alignr_epi8(xmm6, xmm5, offset)); \ - _mm_storeu_si128((__m128i *)((uint8_t *)dst + 6 * 16), _mm_alignr_epi8(xmm7, xmm6, offset)); \ - _mm_storeu_si128((__m128i *)((uint8_t *)dst + 7 * 16), _mm_alignr_epi8(xmm8, xmm7, offset)); \ + _mm_storeu_si128((__m128i *)(void *)((uint8_t *)dst + 0 * 16), _mm_alignr_epi8(xmm1, xmm0, offset)); \ + _mm_storeu_si128((__m128i *)(void *)((uint8_t *)dst + 1 * 16), _mm_alignr_epi8(xmm2, xmm1, offset)); \ + _mm_storeu_si128((__m128i *)(void *)((uint8_t *)dst + 2 * 16), _mm_alignr_epi8(xmm3, xmm2, offset)); \ + _mm_storeu_si128((__m128i *)(void *)((uint8_t *)dst + 3 * 16), _mm_alignr_epi8(xmm4, xmm3, offset)); \ + _mm_storeu_si128((__m128i *)(void *)((uint8_t *)dst + 4 * 16), _mm_alignr_epi8(xmm5, xmm4, offset)); \ + _mm_storeu_si128((__m128i *)(void *)((uint8_t *)dst + 5 * 16), _mm_alignr_epi8(xmm6, xmm5, offset)); \ + _mm_storeu_si128((__m128i *)(void *)((uint8_t *)dst + 6 * 16), _mm_alignr_epi8(xmm7, xmm6, offset)); \ + _mm_storeu_si128((__m128i *)(void *)((uint8_t *)dst + 7 * 16), _mm_alignr_epi8(xmm8, xmm7, offset)); \ dst = (uint8_t *)dst + 128; \ } \ tmp = len; \ @@ -609,13 +617,13 @@ __extension__ ({ dst = (uint8_t *)dst + tmp; \ if (len >= 32 + 16 - offset) { \ while (len >= 32 + 16 - offset) { \ - xmm0 = _mm_loadu_si128((const __m128i *)((const uint8_t *)src - offset + 0 * 16)); \ + xmm0 = _mm_loadu_si128((const __m128i *)(const void *)((const uint8_t *)src - offset + 0 * 16)); \ len -= 32; \ - xmm1 = _mm_loadu_si128((const __m128i *)((const uint8_t *)src - offset + 1 * 16)); \ - xmm2 = _mm_loadu_si128((const __m128i *)((const uint8_t *)src - offset + 2 * 16)); \ + xmm1 = _mm_loadu_si128((const __m128i *)(const void *)((const uint8_t *)src - offset + 1 * 16)); \ + xmm2 = _mm_loadu_si128((const __m128i *)(const void *)((const uint8_t *)src - offset + 2 * 16)); \ src = (const uint8_t *)src + 32; \ - _mm_storeu_si128((__m128i *)((uint8_t *)dst + 0 * 16), _mm_alignr_epi8(xmm1, xmm0, offset)); \ - _mm_storeu_si128((__m128i *)((uint8_t *)dst + 1 * 16), _mm_alignr_epi8(xmm2, xmm1, offset)); \ + _mm_storeu_si128((__m128i *)(void *)((uint8_t *)dst + 0 * 16), _mm_alignr_epi8(xmm1, xmm0, offset)); \ + _mm_storeu_si128((__m128i *)(void *)((uint8_t *)dst + 1 * 16), _mm_alignr_epi8(xmm2, xmm1, offset)); \ dst = (uint8_t *)dst + 32; \ } \ tmp = len; \ -- 2.34.0 --- Diff of the applied patch vs upstream commit (please double-check if non-empty: --- --- - 2021-11-30 16:50:11.658164101 +0100 +++ 0100-eal-x86-avoid-cast-align-warning-in-memcpy-functions.patch 2021-11-30 16:50:05.898874322 +0100 @@ -1 +1 @@ -From 6de430b7079e8f7c29f9c18869393f74f8dffcb6 Mon Sep 17 00:00:00 2001 +From 545b8e9ee5faaa4983cffc669c743fe372a75263 Mon Sep 17 00:00:00 2001 @@ -5,0 +6,2 @@ +[ upstream commit 6de430b7079e8f7c29f9c18869393f74f8dffcb6 ] + @@ -23 +24,0 @@ -Cc: stable@dpdk.org @@ -27 +28 @@ - lib/eal/x86/include/rte_memcpy.h | 80 ++++++++++++++++++-------------- + .../common/include/arch/x86/rte_memcpy.h | 80 ++++++++++--------- @@ -30,4 +31,4 @@ -diff --git a/lib/eal/x86/include/rte_memcpy.h b/lib/eal/x86/include/rte_memcpy.h -index 79f381dd9b..1b6c6e585f 100644 ---- a/lib/eal/x86/include/rte_memcpy.h -+++ b/lib/eal/x86/include/rte_memcpy.h +diff --git a/lib/librte_eal/common/include/arch/x86/rte_memcpy.h b/lib/librte_eal/common/include/arch/x86/rte_memcpy.h +index d01832fa15..f1751dd41c 100644 +--- a/lib/librte_eal/common/include/arch/x86/rte_memcpy.h ++++ b/lib/librte_eal/common/include/arch/x86/rte_memcpy.h