From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 0F18B41C7F; Mon, 13 Feb 2023 02:48:21 +0100 (CET) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id D9E9040EE7; Mon, 13 Feb 2023 02:48:20 +0100 (CET) Received: from forward500b.mail.yandex.net (forward500b.mail.yandex.net [178.154.239.144]) by mails.dpdk.org (Postfix) with ESMTP id C7B9740E09 for ; Mon, 13 Feb 2023 02:48:19 +0100 (CET) Received: from sas2-dd6f205a74fe.qloud-c.yandex.net (sas2-dd6f205a74fe.qloud-c.yandex.net [IPv6:2a02:6b8:c08:bc8b:0:640:dd6f:205a]) by forward500b.mail.yandex.net (Yandex) with ESMTP id 2C99E5E9DA; Mon, 13 Feb 2023 04:48:19 +0300 (MSK) Received: by sas2-dd6f205a74fe.qloud-c.yandex.net (smtp/Yandex) with ESMTPSA id FmW4NLYYMiE1-iCCUxaN7; Mon, 13 Feb 2023 04:48:18 +0300 X-Yandex-Fwd: 1 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex.ru; s=mail; t=1676252898; bh=APuzmzkKDf9Dmn+daktbVN96E8YDobDzQMnJhqNQ0dQ=; h=In-Reply-To:Cc:Date:References:To:Subject:From:Message-ID; b=tVl1uPG0CBtyDoHy4wq2Lj7XZ+Kv0Xy/QBDdcug9m9xSqJXGb9pRsuYTPqdOln4Er EizUMyCTUy1zC/f+ZK54KWC038BaYR1yhajTKRvCl2PnUR1/pUbHBTPfnJg5q1Fn1D qrzBvucvfKyulWK983kxXrOjwtIEYrPADK/ixtg4= Authentication-Results: sas2-dd6f205a74fe.qloud-c.yandex.net; dkim=pass header.i=@yandex.ru Message-ID: <0c9daa31-5f4c-5252-c4f0-0e45ac5d2953@yandex.ru> Date: Mon, 13 Feb 2023 01:48:15 +0000 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.7.1 From: Konstantin Ananyev Subject: Re: [EXT] Re: [PATCH] ring: compilation fix with GCC-12 To: Thomas Monjalon , Honnappa Nagarahalli , Amit Prakash Shukla Cc: "dev@dpdk.org" , Jerin Jacob Kollanukkaran , "david.marchand@redhat.com" , "bruce.richardson@intel.com" , "ferruh.yigit@amd.com" References: <20220805090348.1947658-1-amitprakashs@marvell.com> <837071028.0ifERbkFSE@thomas> <2906200.X9hSmTKtgW@thomas> Content-Language: en-US In-Reply-To: <2906200.X9hSmTKtgW@thomas> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org 13/01/2023 13:11, Thomas Monjalon пишет: > 13/01/2023 13:39, Amit Prakash Shukla: >> From: Thomas Monjalon >>> 23/08/2022 11:38, Amit Prakash Shukla: >>>> From: Konstantin Ananyev >>>>> 06/08/2022 19:35, Honnappa Nagarahalli пишет: >>>>>>> Replacing memcpy with rte_memcpy fixes the GCC-12 compilation >>> issue. >>>>>> >>>>>> Any reason why this replacement fixes the problem? >>>>>> Do you have any performance numbers with this change? >>>>>> >>>>>>> Also it would be better to change to rte_memcpy as the function >>>>>>> is called in fastpath. >>>>>> >>>>>> On Arm platforms, memcpy in the later versions has the best >>> performance. >>>>> >>>>> I agree with Honnappa, it is better to keep memcpy() here. >>>>> Actually what is strange - why it ends up in >>>>> __rte_ring_dequeue_elems_128() at all? >>>>> Inside rxa_intr_ring_dequeue() we clearly doing: rte_ring_dequeue(), >>>>> which should boil down to ___rte_ring_dequeue_elems_64(). >>>>> it should go to __rte_ring_dequeue_elems_128() at all. >>>> >>>> I agree. After having close look and doing few experiments, ideally it >>>> should not be going to __rte_ring_dequeue_elems_128(). >>>> Sizeof(in call of rte_ring_enqueue_elem) gets evaluated at compile >>>> time which in this case it is evaluated to 8 bytes so >>>> __rte_ring_dequeue_elems_128() shall not be in the path. Looks like more >>> of a gcc-12 bug.? >>>> >>>>> Another q - is this warning happens only on arm platforms? >>>> >>>> Warning is observed on x86 with build type as debug. >>>> "meson --werror --buildtype=debug build" >>> >>> I confirm the compilation issue on x86 with GCC 12 in a debug build. >>> >>> We need to find a workaround. >>> Is it reported to GCC already? >>> >> I found an old gcc bug reporting similar issue. This bug seems to be re-opened recently in Dec-2022. Not sure if it is reopened specifically for gcc-12. >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89689 > > Please would you like to open a bug specific to GCC 12? > >> Kevin has push a work around for DPDK-21.11.3. >> https://git.dpdk.org/dpdk-stable/commit/?h=21.11&id=e1d728588dc73af9ed60cc0074d51a7f24b2ba60 > > In the meantime we could use Kevin's workaround: > > #if defined(RTE_TOOLCHAIN_GCC) && (GCC_VERSION >= 120000) > #pragma GCC diagnostic push > #pragma GCC diagnostic ignored "-Wstringop-overflow" > #pragma GCC diagnostic ignored "-Wstringop-overread" > #endif > > Opinions? > > Yep, disable warnings should work. Anoter way to consider - change enqueue/dequeue_elems_128() functions to not use memcpy() at all. Instead of that they can copy 2*num 64-bit entities directly, same as _64_ versions do. Something like the patch below. That's pretty similar to what Amit initially proposed, but without rte_memcpy() involvement. Performance-wise I don't expect noticeable difference with what we have right now. But sure, we'll need to do extra checks here. diff --git a/lib/ring/rte_ring_elem_pvt.h b/lib/ring/rte_ring_elem_pvt.h index 83788c56e6..de79040618 100644 --- a/lib/ring/rte_ring_elem_pvt.h +++ b/lib/ring/rte_ring_elem_pvt.h @@ -93,25 +93,32 @@ __rte_ring_enqueue_elems_128(struct rte_ring *r, uint32_t prod_head, unsigned int i; const uint32_t size = r->size; uint32_t idx = prod_head & r->mask; - rte_int128_t *ring = (rte_int128_t *)&r[1]; - const rte_int128_t *obj = (const rte_int128_t *)obj_table; + uint64_t *ring = (uint64_t *)&r[1]; + const unaligned_uint64_t *obj = (const unaligned_uint64_t *)obj_table; if (likely(idx + n <= size)) { - for (i = 0; i < (n & ~0x1); i += 2, idx += 2) - memcpy((void *)(ring + idx), - (const void *)(obj + i), 32); + idx *= 2; + for (i = 0; i < 2 * (n & ~0x1); i += 4, idx += 4) { + ring[idx] = obj[i]; + ring[idx + 1] = obj[i + 1]; + ring[idx + 2] = obj[i + 2]; + ring[idx + 3] = obj[i + 3]; + } switch (n & 0x1) { case 1: - memcpy((void *)(ring + idx), - (const void *)(obj + i), 16); + ring[idx] = obj[i]; + ring[idx + 1] = obj[i + 1]; } } else { - for (i = 0; idx < size; i++, idx++) - memcpy((void *)(ring + idx), - (const void *)(obj + i), 16); + idx *= 2; + for (i = 0; idx < 2 * size; i += 2, idx += 2) { + ring[idx] = obj[i]; + ring[idx + 1] = obj[i + 1]; + } /* Start at the beginning */ - for (idx = 0; i < n; i++, idx++) - memcpy((void *)(ring + idx), - (const void *)(obj + i), 16); + for (idx = 0; i < 2 * n; i += 2, idx += 2) { + ring[idx] = obj[i]; + ring[idx + 1] = obj[i + 1]; + } } } @@ -227,21 +234,32 @@ __rte_ring_dequeue_elems_128(struct rte_ring *r, uint32_t prod_head, unsigned int i; const uint32_t size = r->size; uint32_t idx = prod_head & r->mask; - rte_int128_t *ring = (rte_int128_t *)&r[1]; - rte_int128_t *obj = (rte_int128_t *)obj_table; + uint64_t *ring = (uint64_t *)&r[1]; + unaligned_uint64_t *obj = (unaligned_uint64_t *)obj_table; if (likely(idx + n <= size)) { - for (i = 0; i < (n & ~0x1); i += 2, idx += 2) - memcpy((void *)(obj + i), (void *)(ring + idx), 32); + idx *= 2; + for (i = 0; i < 2 * (n & ~0x1); i += 4, idx += 4) { + obj[i] = ring[idx]; + obj[i + 1] = ring[idx + 1]; + obj[i + 2] = ring[idx + 2]; + obj[i + 3] = ring[idx + 3]; + } switch (n & 0x1) { case 1: - memcpy((void *)(obj + i), (void *)(ring + idx), 16); + obj[i] = ring[idx]; + obj[i + 1] = ring[idx + 1]; } } else { - for (i = 0; idx < size; i++, idx++) - memcpy((void *)(obj + i), (void *)(ring + idx), 16); + idx *= 2; + for (i = 0; idx < 2 * size; i += 2, idx += 2) { + obj[i] = ring[idx]; + obj[i + 1] = ring[idx + 1]; + } /* Start at the beginning */ - for (idx = 0; i < n; i++, idx++) - memcpy((void *)(obj + i), (void *)(ring + idx), 16); + for (idx = 0; i < 2 * n; i += 2, idx += 2) { + obj[i] = ring[idx]; + obj[i + 1] = ring[idx + 1]; + } } }