From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124])
	by inbox.dpdk.org (Postfix) with ESMTP id B8FEFA034F;
	Tue, 11 Jan 2022 12:38:47 +0100 (CET)
Received: from [217.70.189.124] (localhost [127.0.0.1])
	by mails.dpdk.org (Postfix) with ESMTP id 359D4426D4;
	Tue, 11 Jan 2022 12:38:47 +0100 (CET)
Received: from mail-lf1-f51.google.com (mail-lf1-f51.google.com
	[209.85.167.51])
	by mails.dpdk.org (Postfix) with ESMTP id 9572A411AE
	for ; Tue, 11 Jan 2022 12:38:45 +0100 (CET)
Received: by mail-lf1-f51.google.com with SMTP id j11so55222765lfg.3
	for ; Tue, 11 Jan 2022 03:38:45 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=semihalf-com.20210112.gappssmtp.com; s=20210112;
	h=from:to:cc:subject:date:message-id:in-reply-to:references
	:mime-version:content-transfer-encoding;
	bh=s2I8CFJ1ew3IOYfMmgcubn0AO4axhtvn/ltuuoBzcgI=;
	b=ISGdmJqfUJ0BXNU1dU7tDQJG9/XjN5etVbfNddG+qzTi3+P4anVlAnCPOtz3k7o7D7
	q4B+PdDJgQ5XzmArJgtkENHcE4Pk1XGaZQRXi/ju+6GldTQgXsferakqrwWiqmy7g01U
	uYz/49JtCKcEK5hF2D88rKczqXQC9sssOz6yPpCQC7pmfp4rFyyIhAVD60DbWgctKlDt
	GLdEkoO6XyPCzve31EBYT2Nb19RZCA6O0W3o96DulvKVe2IS/nGx2nDpgbQVjAmnZYbt
	VJmAHgSLx9SmHYGulS5WfefQH9qAwbSqHUkuI3rQNIho6IdI0Iyzt/5MoR1MyY3XB4e+
	EOWw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=1e100.net; s=20210112;
	h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to
	:references:mime-version:content-transfer-encoding;
	bh=s2I8CFJ1ew3IOYfMmgcubn0AO4axhtvn/ltuuoBzcgI=;
	b=Cb4pYhRdn5pYWB9/rKSNvfZS/Sv5TDlr15LNTg/6/gMYoZZMpME5C5bYD3vApBWWx2
	fpqoSmG3ygiLO81Kc/0epe3jm8kf1qi6nwMvTT9i7RqMNuIdiCC12Zmbn5KK6BsiDv+c
	LQkH1HMrcuCaT4zox4+D14t8gG4gT0OCxaxC/cA6f78cVB2xVcz3x5ZAHEgBAohmKZC3
	mMM8XgTfkB3W2yInifkcggl+ZFD64roW0YsoKEmREBdooQr90Kx6TTCU5Rb4kDvLksvX
	Ipal6ywmBihDrhlhu3AB2P6yhlJB2Rb5J0Tii+7BuGOiISGf/kOs0IuzwaQlCiGSB+/z
	K4HA==
X-Gm-Message-State: AOAM533uj0A8A7FgtNOeZa4gNpGFEVO+xXaOtS0L+pGfYUCezsOiV0MI
	4+/1x64BFonJwjXZKcpVBLIbTEDG3CXyYA==
X-Google-Smtp-Source: ABdhPJxGXVoDacgIov45RNsSVvWI/AhhwHKVK0fcc/1yCcTXEg8akF+AmWordM4HAP6EMYZgu64v+Q==
X-Received: by 2002:a2e:884e:: with SMTP id z14mr2574592ljj.523.1641901124968;
	Tue, 11 Jan 2022 03:38:44 -0800 (PST)
Received: from andrzejo-l.semihalf.net ([83.142.187.84])
	by smtp.googlemail.com with ESMTPSA id f24sm437182lfk.187.2022.01.11.03.38.44
	(version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
	Tue, 11 Jan 2022 03:38:44 -0800 (PST)
From: Andrzej Ostruszka
To: dev@dpdk.org
Cc: Andrzej Ostruszka, Olivier Matz, Konstantin Ananyev
Subject: [PATCH] ring: optimize corner case for enqueue/dequeue
Date: Tue, 11 Jan 2022 12:37:39 +0100
Message-Id: <20220111113739.1104058-1-amo@semihalf.com>
X-Mailer: git-send-email 2.34.1.575.g55b058a8bb-goog
In-Reply-To: <20220103142201.475552-2-amo@semihalf.com>
References: <20220103142201.475552-2-amo@semihalf.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org

When enqueueing/dequeueing to/from the ring we try to optimize by manual
loop unrolling.  The check for this optimization looks like:

	if (likely(idx + n < size)) {

where 'idx' points to the first usable element (empty slot for enqueue,
data for dequeue).  The correct comparison here should be '<=' instead
of '<'.

This is not a functional error, since we fall back to the loop with
correct checks on indexes.  It is just a minor inefficiency in the case
where we enqueue/dequeue exactly as many elements as remain in the ring
before it wraps to its beginning.

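To make the boundary case concrete, here is a minimal sketch in C.  The
helper names (fits_before_wrap_old, fits_before_wrap_new, last_slot) are
hypothetical and exist only for illustration; they are not part of
rte_ring, and the real code inlines the comparison in each copy routine.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; not part of the DPDK API. */

/* Fast-path test as written before this patch. */
static bool fits_before_wrap_old(uint32_t idx, uint32_t n, uint32_t size)
{
	return idx + n < size;
}

/* Fast-path test as written after this patch. */
static bool fits_before_wrap_new(uint32_t idx, uint32_t n, uint32_t size)
{
	return idx + n <= size;
}

/* Highest slot index touched when copying n (> 0) elements from idx. */
static uint32_t last_slot(uint32_t idx, uint32_t n)
{
	return idx + n - 1;
}
```

With size = 8, idx = 5 and n = 3 the copy touches slots 5..7 and never
wraps, yet the old test rejects it (5 + 3 < 8 is false) while the new
one accepts it.  For n = 4 both tests correctly reject the fast path.
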
Fixes: cc4b218790f6 ("ring: support configurable element size")
Fixes: 286bd05bf70d ("ring: optimisations")

Signed-off-by: Andrzej Ostruszka
Reviewed-by: Olivier Matz
Acked-by: Konstantin Ananyev
---
 lib/ring/rte_ring_elem_pvt.h | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/lib/ring/rte_ring_elem_pvt.h b/lib/ring/rte_ring_elem_pvt.h
index 275ec55393..83788c56e6 100644
--- a/lib/ring/rte_ring_elem_pvt.h
+++ b/lib/ring/rte_ring_elem_pvt.h
@@ -17,7 +17,7 @@ __rte_ring_enqueue_elems_32(struct rte_ring *r, const uint32_t size,
 	unsigned int i;
 	uint32_t *ring = (uint32_t *)&r[1];
 	const uint32_t *obj = (const uint32_t *)obj_table;
-	if (likely(idx + n < size)) {
+	if (likely(idx + n <= size)) {
 		for (i = 0; i < (n & ~0x7); i += 8, idx += 8) {
 			ring[idx] = obj[i];
 			ring[idx + 1] = obj[i + 1];
@@ -62,7 +62,7 @@ __rte_ring_enqueue_elems_64(struct rte_ring *r, uint32_t prod_head,
 	uint32_t idx = prod_head & r->mask;
 	uint64_t *ring = (uint64_t *)&r[1];
 	const unaligned_uint64_t *obj = (const unaligned_uint64_t *)obj_table;
-	if (likely(idx + n < size)) {
+	if (likely(idx + n <= size)) {
 		for (i = 0; i < (n & ~0x3); i += 4, idx += 4) {
 			ring[idx] = obj[i];
 			ring[idx + 1] = obj[i + 1];
@@ -95,7 +95,7 @@ __rte_ring_enqueue_elems_128(struct rte_ring *r, uint32_t prod_head,
 	uint32_t idx = prod_head & r->mask;
 	rte_int128_t *ring = (rte_int128_t *)&r[1];
 	const rte_int128_t *obj = (const rte_int128_t *)obj_table;
-	if (likely(idx + n < size)) {
+	if (likely(idx + n <= size)) {
 		for (i = 0; i < (n & ~0x1); i += 2, idx += 2)
 			memcpy((void *)(ring + idx),
 				(const void *)(obj + i), 32);
@@ -151,7 +151,7 @@ __rte_ring_dequeue_elems_32(struct rte_ring *r, const uint32_t size,
 	unsigned int i;
 	uint32_t *ring = (uint32_t *)&r[1];
 	uint32_t *obj = (uint32_t *)obj_table;
-	if (likely(idx + n < size)) {
+	if (likely(idx + n <= size)) {
 		for (i = 0; i < (n & ~0x7); i += 8, idx += 8) {
 			obj[i] = ring[idx];
 			obj[i + 1] = ring[idx + 1];
@@ -196,7 +196,7 @@ __rte_ring_dequeue_elems_64(struct rte_ring *r, uint32_t prod_head,
 	uint32_t idx = prod_head & r->mask;
 	uint64_t *ring = (uint64_t *)&r[1];
 	unaligned_uint64_t *obj = (unaligned_uint64_t *)obj_table;
-	if (likely(idx + n < size)) {
+	if (likely(idx + n <= size)) {
 		for (i = 0; i < (n & ~0x3); i += 4, idx += 4) {
 			obj[i] = ring[idx];
 			obj[i + 1] = ring[idx + 1];
@@ -229,7 +229,7 @@ __rte_ring_dequeue_elems_128(struct rte_ring *r, uint32_t prod_head,
 	uint32_t idx = prod_head & r->mask;
 	rte_int128_t *ring = (rte_int128_t *)&r[1];
 	rte_int128_t *obj = (rte_int128_t *)obj_table;
-	if (likely(idx + n < size)) {
+	if (likely(idx + n <= size)) {
 		for (i = 0; i < (n & ~0x1); i += 2, idx += 2)
 			memcpy((void *)(obj + i), (void *)(ring + idx), 32);
 		switch (n & 0x1) {
-- 
2.34.1.575.g55b058a8bb-goog