From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 296B8A04A3; Mon, 3 Jan 2022 15:56:44 +0100 (CET)
Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id B54E040042; Mon, 3 Jan 2022 15:56:43 +0100 (CET)
Received: from smartserver.smartsharesystems.com (smartserver.smartsharesystems.com [77.243.40.215]) by mails.dpdk.org (Postfix) with ESMTP id D74AB4003C for ; Mon, 3 Jan 2022 15:56:41 +0100 (CET)
X-MimeOLE: Produced By Microsoft Exchange V6.5
Content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
Subject: RE: [PATCH 1/1] ring: fix off by 1 mistake
Date: Mon, 3 Jan 2022 15:56:33 +0100
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35D86DCF@smartserver.smartshare.dk>
In-Reply-To: <20220103142201.475552-2-amo@semihalf.com>
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
Thread-Topic: [PATCH 1/1] ring: fix off by 1 mistake
Thread-Index: AdgArXcoSkgNvM5CSuiNKccTB0s0uAAA54hQ
References: <20220103142201.475552-1-amo@semihalf.com> <20220103142201.475552-2-amo@semihalf.com>
From: Morten Brørup
To: "Andrzej Ostruszka", "Honnappa Nagarahalli", "Konstantin Ananyev"
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions
Errors-To: dev-bounces@dpdk.org

+Ring queue maintainers: Honnappa Nagarahalli, Konstantin Ananyev

> From: Andrzej Ostruszka [mailto:amo@semihalf.com]
> Sent: Monday, 3 January 2022 15.22
>
> When enqueueing/dequeueing to/from the ring we try to optimize by
> manual
> loop unrolling. The check for this optimization looks like:
>
> 	if (likely(idx + n < size)) {
>
> where 'idx' points to the first usable element (empty slot for enqueue,
> data for dequeue).
> The correct comparison here should be '<=' instead
> of '<'.
>
> This is not a functional error since we fall back to the loop with
> correct checks on indexes. Just a minor suboptimal behaviour for the
> case when we want to enqueue/dequeue exactly the number of elements
> that
> we have in the ring before wrapping to its beginning.
>
> Signed-off-by: Andrzej Ostruszka
> ---
>  lib/ring/rte_ring_elem_pvt.h | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/lib/ring/rte_ring_elem_pvt.h
> b/lib/ring/rte_ring_elem_pvt.h
> index 275ec55393..83788c56e6 100644
> --- a/lib/ring/rte_ring_elem_pvt.h
> +++ b/lib/ring/rte_ring_elem_pvt.h
> @@ -17,7 +17,7 @@ __rte_ring_enqueue_elems_32(struct rte_ring *r, const
> uint32_t size,
>  	unsigned int i;
>  	uint32_t *ring = (uint32_t *)&r[1];
>  	const uint32_t *obj = (const uint32_t *)obj_table;
> -	if (likely(idx + n < size)) {
> +	if (likely(idx + n <= size)) {
>  		for (i = 0; i < (n & ~0x7); i += 8, idx += 8) {
>  			ring[idx] = obj[i];
>  			ring[idx + 1] = obj[i + 1];
> @@ -62,7 +62,7 @@ __rte_ring_enqueue_elems_64(struct rte_ring *r,
> uint32_t prod_head,
>  	uint32_t idx = prod_head & r->mask;
>  	uint64_t *ring = (uint64_t *)&r[1];
>  	const unaligned_uint64_t *obj = (const unaligned_uint64_t
> *)obj_table;
> -	if (likely(idx + n < size)) {
> +	if (likely(idx + n <= size)) {
>  		for (i = 0; i < (n & ~0x3); i += 4, idx += 4) {
>  			ring[idx] = obj[i];
>  			ring[idx + 1] = obj[i + 1];
> @@ -95,7 +95,7 @@ __rte_ring_enqueue_elems_128(struct rte_ring *r,
> uint32_t prod_head,
>  	uint32_t idx = prod_head & r->mask;
>  	rte_int128_t *ring = (rte_int128_t *)&r[1];
>  	const rte_int128_t *obj = (const rte_int128_t *)obj_table;
> -	if (likely(idx + n < size)) {
> +	if (likely(idx + n <= size)) {
>  		for (i = 0; i < (n & ~0x1); i += 2, idx += 2)
>  			memcpy((void *)(ring + idx),
>  				(const void *)(obj + i), 32);
> @@ -151,7 +151,7 @@ __rte_ring_dequeue_elems_32(struct rte_ring
> *r, const uint32_t size,
>  	unsigned int i;
>  	uint32_t *ring = (uint32_t *)&r[1];
>  	uint32_t *obj = (uint32_t *)obj_table;
> -	if (likely(idx + n < size)) {
> +	if (likely(idx + n <= size)) {
>  		for (i = 0; i < (n & ~0x7); i += 8, idx += 8) {
>  			obj[i] = ring[idx];
>  			obj[i + 1] = ring[idx + 1];
> @@ -196,7 +196,7 @@ __rte_ring_dequeue_elems_64(struct rte_ring *r,
> uint32_t prod_head,
>  	uint32_t idx = prod_head & r->mask;
>  	uint64_t *ring = (uint64_t *)&r[1];
>  	unaligned_uint64_t *obj = (unaligned_uint64_t *)obj_table;
> -	if (likely(idx + n < size)) {
> +	if (likely(idx + n <= size)) {
>  		for (i = 0; i < (n & ~0x3); i += 4, idx += 4) {
>  			obj[i] = ring[idx];
>  			obj[i + 1] = ring[idx + 1];
> @@ -229,7 +229,7 @@ __rte_ring_dequeue_elems_128(struct rte_ring *r,
> uint32_t prod_head,
>  	uint32_t idx = prod_head & r->mask;
>  	rte_int128_t *ring = (rte_int128_t *)&r[1];
>  	rte_int128_t *obj = (rte_int128_t *)obj_table;
> -	if (likely(idx + n < size)) {
> +	if (likely(idx + n <= size)) {
>  		for (i = 0; i < (n & ~0x1); i += 2, idx += 2)
>  			memcpy((void *)(obj + i), (void *)(ring + idx), 32);
>  		switch (n & 0x1) {
> --
> 2.34.1.448.ga2b2bfdf31-goog
>

Well spotted!

I took a very good look at it, and came to the same conclusion: It is not a functional bug; the only consequence is that the optimized code path may not be taken in a situation where it could be taken. But it should be fixed as suggested in your patch.

Reviewed-by: Morten Brørup
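
For anyone who wants to convince themselves of the boundary case, here is a minimal stand-alone sketch. It is not the DPDK code: fast_path_old(), fast_path_new() and toy_enqueue() are hypothetical names made up for this illustration, the unrolling is left out, and only the two comparisons are taken from the patch. It assumes idx < size and n <= size, as in the real ring.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Comparison before the patch: rejects the case idx + n == size. */
static bool fast_path_old(uint32_t idx, uint32_t n, uint32_t size)
{
	return idx + n < size;
}

/* Comparison after the patch: accepts the case idx + n == size. */
static bool fast_path_new(uint32_t idx, uint32_t n, uint32_t size)
{
	return idx + n <= size;
}

/*
 * Hypothetical toy enqueue (no unrolling): copy n elements into the ring
 * starting at idx. Returns true if the contiguous path was taken.
 */
static bool toy_enqueue(uint32_t *ring, uint32_t size, uint32_t idx,
		const uint32_t *obj, uint32_t n)
{
	uint32_t i;

	assert(idx < size && n <= size);

	if (fast_path_new(idx, n, size)) {
		/* Highest index written is idx + n - 1 <= size - 1. */
		for (i = 0; i < n; i++)
			ring[idx + i] = obj[i];
		return true;
	}

	/* Fallback: fill up to the end of the ring, then wrap to index 0. */
	for (i = 0; idx < size; i++, idx++)
		ring[idx] = obj[i];
	for (idx = 0; i < n; i++, idx++)
		ring[idx] = obj[i];
	return false;
}
```

With size = 8, idx = 5 and n = 3 the copy touches slots 5, 6 and 7 only, so no wrap is needed; the old comparison still sends this case to the fallback loop, while the new one keeps it on the contiguous path. Both paths store the same data, which matches the observation that this is not a functional bug.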