From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124])
	by inbox.dpdk.org (Postfix) with ESMTP id 386DD431CA;
	Sat, 4 Nov 2023 17:54:50 +0100 (CET)
Received: from mails.dpdk.org (localhost [127.0.0.1])
	by mails.dpdk.org (Postfix) with ESMTP id BC9244029B;
	Sat, 4 Nov 2023 17:54:49 +0100 (CET)
Received: from dkmailrelay1.smartsharesystems.com (smartserver.smartsharesystems.com [77.243.40.215])
	by mails.dpdk.org (Postfix) with ESMTP id 9119440282
	for ; Sat, 4 Nov 2023 17:54:48 +0100 (CET)
Received: from smartserver.smartsharesystems.com (smartserver.smartsharesys.local [192.168.4.10])
	by dkmailrelay1.smartsharesystems.com (Postfix) with ESMTP id 63FEF206C3;
	Sat, 4 Nov 2023 17:54:48 +0100 (CET)
Content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
Subject: RE: [dpdk-dev] [PATCH] ring: fix unaligned memory access on aarch32
X-MimeOLE: Produced By Microsoft Exchange V6.5
Date: Sat, 4 Nov 2023 17:54:46 +0100
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35E9EFD3@smartserver.smartshare.dk>
In-Reply-To:
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
Thread-Topic: [dpdk-dev] [PATCH] ring: fix unaligned memory access on aarch32
Thread-Index: AdX2Nvy5Gi0oTfcnRCSjjFjXwwOPtYgw8JBwACLmqxAAAJhyUA==
References: <1583774395-10233-1-git-send-email-phil.yang@arm.com> <98CBD80474FA8B44BF855DF32C47DC35E9EFD1@smartserver.smartshare.dk>
From: =?iso-8859-1?Q?Morten_Br=F8rup?=
To: "Honnappa Nagarahalli" , "Phil Yang" , "Ruifeng Wang" ,
Cc: , , "Dharmik Jayesh Thakkar" , "Gavin Hu" , "nd" , , "nd"
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org

> From: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli@arm.com]
> Sent: Saturday, 4 November 2023 17.32
>
> > From: Morten Brørup
> > Sent: Friday, November 3, 2023 7:04 PM
> >
> > I have for a long time now wondered why the ring functions for
> > enqueue/dequeue of 64-bit objects supports unaligned addresses, and now I
> > finally found the patch introducing it.
> >
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Phil Yang
> > > Sent: Monday, 9 March 2020 18.20
> > >
> > > The 32-bit arm machine doesn't support unaligned memory access. It
> > > will cause a bus error on aarch32 with the custom element size ring.
> > >
> > > Thread 1 "test" received signal SIGBUS, Bus error.
> > > __rte_ring_enqueue_elems_64 (n=1, obj_table=0xf5edfe41, prod_head=0, \
> > > r=0xf5edfb80) at /build/dpdk/build/include/rte_ring_elem.h:177
> > > 177 ring[idx++] = obj[i++];
> >
> > Which test is this? Why is it using an unaligned array of 64-bit objects? (Notice
> > that obj_table=0xf5edfe41.)
> Can't recollect which test it is. I am guessing one of the unit test
> cases. We might have to reinvestigate, not sure why the obj_table is
> unaligned.

Thank you for picking this up, Honnappa.

>
> >
> > Nobody in their right mind would use an unaligned array of 64-bit objects. You
> > can only create such an array if you force the compiler to prevent automatic
> > alignment! And all the functions in your application using this array would also
> > need to support unaligned addressing of these objects.
> >
> > This seems extremely exotic, and not something any real application would do!
> >
> > I would like to revert this patch for performance reasons.
> Can you provide more details? Platform, test, how much is the
> regression?

I haven't seen a regression, but I speculate some performance cost on low-end CPUs. Maybe it is purely academic.

Maybe not purely academic...
I just tested on Godbolt, which shows different code generated:

uint64_t fa(void *p)
{
    return *(uint64_t *)p;
}

uint64_t fu(void *p)
{
    return *(unaligned_uint64_t *)p;
}

Generates different output:

fa:
        ldrd    r0, [r0]
        bx      lr
fu:
        mov     r3, r0
        ldr     r0, [r0]        @ unaligned
        ldr     r1, [r3, #4]    @ unaligned
        bx      lr

>
> >
> > >
> > > Fixes: cc4b218790f6 ("ring: support configurable element size")
> > >
> > > Signed-off-by: Phil Yang
> > > Reviewed-by: Ruifeng Wang
> > > Reviewed-by: Honnappa Nagarahalli
> > > ---
> > >  lib/librte_ring/rte_ring_elem.h | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/lib/librte_ring/rte_ring_elem.h b/lib/librte_ring/rte_ring_elem.h
> > > index 3976757..663addc 100644
> > > --- a/lib/librte_ring/rte_ring_elem.h
> > > +++ b/lib/librte_ring/rte_ring_elem.h
> > > @@ -160,7 +160,7 @@ __rte_ring_enqueue_elems_64(struct rte_ring *r, uint32_t prod_head,
> > >  	const uint32_t size = r->size;
> > >  	uint32_t idx = prod_head & r->mask;
> > >  	uint64_t *ring = (uint64_t *)&r[1];
> > > -	const uint64_t *obj = (const uint64_t *)obj_table;
> > > +	const unaligned_uint64_t *obj = (const unaligned_uint64_t *)obj_table;
> > >  	if (likely(idx + n < size)) {
> > >  		for (i = 0; i < (n & ~0x3); i += 4, idx += 4) {
> > >  			ring[idx] = obj[i];
> > > @@ -294,7 +294,7 @@ __rte_ring_dequeue_elems_64(struct rte_ring *r, uint32_t prod_head,
> > >  	const uint32_t size = r->size;
> > >  	uint32_t idx = prod_head & r->mask;
> > >  	uint64_t *ring = (uint64_t *)&r[1];
> > > -	uint64_t *obj = (uint64_t *)obj_table;
> > > +	unaligned_uint64_t *obj = (unaligned_uint64_t *)obj_table;
> > >  	if (likely(idx + n < size)) {
> > >  		for (i = 0; i < (n & ~0x3); i += 4, idx += 4) {
> > >  			obj[i] = ring[idx];
> > > --
> > > 2.7.4
> > >
> >
> > References:
> >
> > https://git.dpdk.org/dpdk/commit/lib/librte_ring/rte_ring_elem.h?id=3ba51478a3ab3132c33effc8b132641233275b36
> > https://patchwork.dpdk.org/project/dpdk/patch/1583774395-10233-1-git-send-email-phil.yang@arm.com/