Date: Thu, 25 Feb 2021 17:53:11 +0530
Message-ID: <20210225122315.6350-1-pbhagavatula@marvell.com>
X-Mailer: git-send-email 2.17.1
Subject: [dpdk-dev] [PATCH 1/4] event/octeontx2: simplify timer bucket estimation
List-Id: DPDK patches and discussions

From: Pavan Nikhilesh

Simplify timer bucket estimation. Buckets need not be aligned to a
power of 2; instead, use reciprocal division to compute the modulo.
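For illustration only, the "reciprocal division to compute mod" idea can be
sketched outside DPDK as below. This is not the rte_reciprocal
implementation: struct recip_u64, recip_init(), recip_div(), fast_mod() and
target_buckets() are hypothetical stand-ins, the quotient estimate is a
Barrett-style multiply-high followed by one correction step, and
unsigned __int128 (a GCC/Clang extension) is assumed to be available.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rte_reciprocal_u64; not the DPDK type. */
struct recip_u64 {
	uint64_t d; /* divisor, e.g. the number of buckets */
	uint64_t m; /* floor((2^64 - 1) / d), precomputed once */
};

/* Precompute the reciprocal; done once, outside the fast path. */
static inline struct recip_u64
recip_init(uint64_t d)
{
	struct recip_u64 r;

	assert(d != 0);
	r.d = d;
	r.m = UINT64_MAX / d;
	return r;
}

/* n / d using a multiply-high instead of a hardware divide. */
static inline uint64_t
recip_div(uint64_t n, const struct recip_u64 *r)
{
	/* High 64 bits of n * m: either the true quotient or one less. */
	uint64_t q = (uint64_t)(((unsigned __int128)n * r->m) >> 64);

	/* Single correction step for the "one less" case. */
	if (n - q * r->d >= r->d)
		q++;
	return q;
}

/* n % d expressed as n - d * (n / d), the same shape as tim_bkt_fast_mod(). */
static inline uint64_t
fast_mod(uint64_t n, const struct recip_u64 *r)
{
	return n - r->d * recip_div(n, r);
}

/*
 * Map an absolute bucket index to a target bucket and its mirror bucket
 * (half a ring away), without requiring a power-of-2 bucket count.
 */
static inline void
target_buckets(uint64_t abs_bkt, const struct recip_u64 *nb_bkts,
	       uint64_t *bkt, uint64_t *mirr_bkt)
{
	*bkt = fast_mod(abs_bkt, nb_bkts);
	*mirr_bkt = fast_mod(*bkt + (nb_bkts->d >> 1), nb_bkts);
}
```

With 1000 buckets (not a power of 2), absolute bucket 2503 maps to bucket
503 and mirror bucket 3, the same values the % operator gives, so the
separate AND and MOD fast paths are no longer needed.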
Signed-off-by: Pavan Nikhilesh
---
 drivers/event/octeontx2/otx2_tim_evdev.c  | 78 ++++-----------------
 drivers/event/octeontx2/otx2_tim_evdev.h  | 84 ++++++++---------------
 drivers/event/octeontx2/otx2_tim_worker.c |  4 +-
 drivers/event/octeontx2/otx2_tim_worker.h | 40 +++++------
 4 files changed, 64 insertions(+), 142 deletions(-)

diff --git a/drivers/event/octeontx2/otx2_tim_evdev.c b/drivers/event/octeontx2/otx2_tim_evdev.c
index 4c24cc8a6..d1e967eb7 100644
--- a/drivers/event/octeontx2/otx2_tim_evdev.c
+++ b/drivers/event/octeontx2/otx2_tim_evdev.c
@@ -34,27 +34,25 @@ tim_set_fp_ops(struct otx2_tim_ring *tim_ring)
 {
 	uint8_t prod_flag = !tim_ring->prod_type_sp;

-	/* [MOD/AND] [DFB/FB] [SP][MP]*/
-	const rte_event_timer_arm_burst_t arm_burst[2][2][2][2] = {
-#define FP(_name, _f4, _f3, _f2, _f1, flags) \
-	[_f4][_f3][_f2][_f1] = otx2_tim_arm_burst_ ## _name,
-TIM_ARM_FASTPATH_MODES
+	/* [DFB/FB] [SP][MP]*/
+	const rte_event_timer_arm_burst_t arm_burst[2][2][2] = {
+#define FP(_name, _f3, _f2, _f1, flags) \
+	[_f3][_f2][_f1] = otx2_tim_arm_burst_##_name,
+	TIM_ARM_FASTPATH_MODES
 #undef FP
 	};

-	const rte_event_timer_arm_tmo_tick_burst_t arm_tmo_burst[2][2][2] = {
-#define FP(_name, _f3, _f2, _f1, flags) \
-	[_f3][_f2][_f1] = otx2_tim_arm_tmo_tick_burst_ ## _name,
-TIM_ARM_TMO_FASTPATH_MODES
+	const rte_event_timer_arm_tmo_tick_burst_t arm_tmo_burst[2][2] = {
+#define FP(_name, _f2, _f1, flags) \
+	[_f2][_f1] = otx2_tim_arm_tmo_tick_burst_##_name,
+	TIM_ARM_TMO_FASTPATH_MODES
 #undef FP
 	};

 	otx2_tim_ops.arm_burst =
-		arm_burst[tim_ring->enable_stats][tim_ring->optimized]
-			[tim_ring->ena_dfb][prod_flag];
+		arm_burst[tim_ring->enable_stats][tim_ring->ena_dfb][prod_flag];
 	otx2_tim_ops.arm_tmo_tick_burst =
-		arm_tmo_burst[tim_ring->enable_stats][tim_ring->optimized]
-			[tim_ring->ena_dfb];
+		arm_tmo_burst[tim_ring->enable_stats][tim_ring->ena_dfb];
 	otx2_tim_ops.cancel_burst = otx2_tim_timer_cancel_burst;
 }

@@ -70,51 +68,6 @@ otx2_tim_ring_info_get(const struct rte_event_timer_adapter *adptr,
 		   sizeof(struct rte_event_timer_adapter_conf));
 }

-static void
-tim_optimze_bkt_param(struct otx2_tim_ring *tim_ring)
-{
-	uint64_t tck_nsec;
-	uint32_t hbkts;
-	uint32_t lbkts;
-
-	hbkts = rte_align32pow2(tim_ring->nb_bkts);
-	tck_nsec = RTE_ALIGN_MUL_CEIL(tim_ring->max_tout / (hbkts - 1), 10);
-
-	if ((tck_nsec < TICK2NSEC(OTX2_TIM_MIN_TMO_TKS,
-				  tim_ring->tenns_clk_freq) ||
-	    hbkts > OTX2_TIM_MAX_BUCKETS))
-		hbkts = 0;
-
-	lbkts = rte_align32prevpow2(tim_ring->nb_bkts);
-	tck_nsec = RTE_ALIGN_MUL_CEIL((tim_ring->max_tout / (lbkts - 1)), 10);
-
-	if ((tck_nsec < TICK2NSEC(OTX2_TIM_MIN_TMO_TKS,
-				  tim_ring->tenns_clk_freq) ||
-	    lbkts > OTX2_TIM_MAX_BUCKETS))
-		lbkts = 0;
-
-	if (!hbkts && !lbkts)
-		return;
-
-	if (!hbkts) {
-		tim_ring->nb_bkts = lbkts;
-		goto end;
-	} else if (!lbkts) {
-		tim_ring->nb_bkts = hbkts;
-		goto end;
-	}
-
-	tim_ring->nb_bkts = (hbkts - tim_ring->nb_bkts) <
-		(tim_ring->nb_bkts - lbkts) ? hbkts : lbkts;
-end:
-	tim_ring->optimized = true;
-	tim_ring->tck_nsec = RTE_ALIGN_MUL_CEIL((tim_ring->max_tout /
-				(tim_ring->nb_bkts - 1)), 10);
-	otx2_tim_dbg("Optimized configured values");
-	otx2_tim_dbg("Nb_bkts : %" PRIu32 "", tim_ring->nb_bkts);
-	otx2_tim_dbg("Tck_nsec : %" PRIu64 "", tim_ring->tck_nsec);
-}
-
 static int
 tim_chnk_pool_create(struct otx2_tim_ring *tim_ring,
 		     struct rte_event_timer_adapter_conf *rcfg)
@@ -319,14 +272,6 @@ otx2_tim_ring_create(struct rte_event_timer_adapter *adptr)
 				tim_ring->chunk_sz);
 	tim_ring->nb_chunk_slots = OTX2_TIM_NB_CHUNK_SLOTS(tim_ring->chunk_sz);

-	/* Try to optimize the bucket parameters. */
-	if ((rcfg->flags & RTE_EVENT_TIMER_ADAPTER_F_ADJUST_RES)) {
-		if (rte_is_power_of_2(tim_ring->nb_bkts))
-			tim_ring->optimized = true;
-		else
-			tim_optimze_bkt_param(tim_ring);
-	}
-
 	if (tim_ring->disable_npa)
 		tim_ring->nb_chunks = tim_ring->nb_chunks * tim_ring->nb_bkts;
 	else
@@ -459,6 +404,7 @@ otx2_tim_ring_start(const struct rte_event_timer_adapter *adptr)
 	tim_ring->tck_int = NSEC2TICK(tim_ring->tck_nsec, rte_get_timer_hz());
 	tim_ring->tot_int = tim_ring->tck_int * tim_ring->nb_bkts;
 	tim_ring->fast_div = rte_reciprocal_value_u64(tim_ring->tck_int);
+	tim_ring->fast_bkt = rte_reciprocal_value_u64(tim_ring->nb_bkts);

 	otx2_tim_calibrate_start_tsc(tim_ring);

diff --git a/drivers/event/octeontx2/otx2_tim_evdev.h b/drivers/event/octeontx2/otx2_tim_evdev.h
index 44e3c7b51..bf89b85b0 100644
--- a/drivers/event/octeontx2/otx2_tim_evdev.h
+++ b/drivers/event/octeontx2/otx2_tim_evdev.h
@@ -76,8 +76,6 @@

 #define OTX2_TIM_SP 0x1
 #define OTX2_TIM_MP 0x2
-#define OTX2_TIM_BKT_AND 0x4
-#define OTX2_TIM_BKT_MOD 0x8
 #define OTX2_TIM_ENA_FB 0x10
 #define OTX2_TIM_ENA_DFB 0x20
 #define OTX2_TIM_ENA_STATS 0x40
@@ -149,11 +147,11 @@ struct otx2_tim_ring {
 	struct otx2_tim_bkt *bkt;
 	struct rte_mempool *chunk_pool;
 	struct rte_reciprocal_u64 fast_div;
+	struct rte_reciprocal_u64 fast_bkt;
 	uint64_t arm_cnt;
 	uint8_t prod_type_sp;
 	uint8_t enable_stats;
 	uint8_t disable_npa;
-	uint8_t optimized;
 	uint8_t ena_dfb;
 	uint16_t ring_id;
 	uint32_t aura;
@@ -178,60 +176,38 @@ tim_priv_get(void)
 	return mz->addr;
 }

-#define TIM_ARM_FASTPATH_MODES \
-FP(mod_sp, 0, 0, 0, 0, OTX2_TIM_BKT_MOD | OTX2_TIM_ENA_DFB | OTX2_TIM_SP) \
-FP(mod_mp, 0, 0, 0, 1, OTX2_TIM_BKT_MOD | OTX2_TIM_ENA_DFB | OTX2_TIM_MP) \
-FP(mod_fb_sp, 0, 0, 1, 0, OTX2_TIM_BKT_MOD | OTX2_TIM_ENA_FB | OTX2_TIM_SP) \
-FP(mod_fb_mp, 0, 0, 1, 1, OTX2_TIM_BKT_MOD | OTX2_TIM_ENA_FB | OTX2_TIM_MP) \
-FP(and_sp, 0, 1, 0, 0, OTX2_TIM_BKT_AND | OTX2_TIM_ENA_DFB | OTX2_TIM_SP) \
-FP(and_mp, 0, 1, 0, 1, OTX2_TIM_BKT_AND | OTX2_TIM_ENA_DFB | OTX2_TIM_MP) \
-FP(and_fb_sp, 0, 1, 1, 0, OTX2_TIM_BKT_AND | OTX2_TIM_ENA_FB | OTX2_TIM_SP) \
-FP(and_fb_mp, 0, 1, 1, 1, OTX2_TIM_BKT_AND | OTX2_TIM_ENA_FB | OTX2_TIM_MP) \
-FP(stats_mod_sp, 1, 0, 0, 0, OTX2_TIM_ENA_STATS | OTX2_TIM_BKT_MOD | \
-	OTX2_TIM_ENA_DFB | OTX2_TIM_SP) \
-FP(stats_mod_mp, 1, 0, 0, 1, OTX2_TIM_ENA_STATS | OTX2_TIM_BKT_MOD | \
-	OTX2_TIM_ENA_DFB | OTX2_TIM_MP) \
-FP(stats_mod_fb_sp, 1, 0, 1, 0, OTX2_TIM_ENA_STATS | OTX2_TIM_BKT_MOD | \
-	OTX2_TIM_ENA_FB | OTX2_TIM_SP) \
-FP(stats_mod_fb_mp, 1, 0, 1, 1, OTX2_TIM_ENA_STATS | OTX2_TIM_BKT_MOD | \
-	OTX2_TIM_ENA_FB | OTX2_TIM_MP) \
-FP(stats_and_sp, 1, 1, 0, 0, OTX2_TIM_ENA_STATS | OTX2_TIM_BKT_AND | \
-	OTX2_TIM_ENA_DFB | OTX2_TIM_SP) \
-FP(stats_and_mp, 1, 1, 0, 1, OTX2_TIM_ENA_STATS | OTX2_TIM_BKT_AND | \
-	OTX2_TIM_ENA_DFB | OTX2_TIM_MP) \
-FP(stats_and_fb_sp, 1, 1, 1, 0, OTX2_TIM_ENA_STATS | OTX2_TIM_BKT_AND | \
-	OTX2_TIM_ENA_FB | OTX2_TIM_SP) \
-FP(stats_and_fb_mp, 1, 1, 1, 1, OTX2_TIM_ENA_STATS | OTX2_TIM_BKT_AND | \
-	OTX2_TIM_ENA_FB | OTX2_TIM_MP)
-
-#define TIM_ARM_TMO_FASTPATH_MODES \
-FP(mod, 0, 0, 0, OTX2_TIM_BKT_MOD | OTX2_TIM_ENA_DFB) \
-FP(mod_fb, 0, 0, 1, OTX2_TIM_BKT_MOD | OTX2_TIM_ENA_FB) \
-FP(and, 0, 1, 0, OTX2_TIM_BKT_AND | OTX2_TIM_ENA_DFB) \
-FP(and_fb, 0, 1, 1, OTX2_TIM_BKT_AND | OTX2_TIM_ENA_FB) \
-FP(stats_mod, 1, 0, 0, OTX2_TIM_ENA_STATS | OTX2_TIM_BKT_MOD | \
-	OTX2_TIM_ENA_DFB) \
-FP(stats_mod_fb, 1, 0, 1, OTX2_TIM_ENA_STATS | OTX2_TIM_BKT_MOD | \
-	OTX2_TIM_ENA_FB) \
-FP(stats_and, 1, 1, 0, OTX2_TIM_ENA_STATS | OTX2_TIM_BKT_AND | \
-	OTX2_TIM_ENA_DFB) \
-FP(stats_and_fb, 1, 1, 1, OTX2_TIM_ENA_STATS | OTX2_TIM_BKT_AND | \
-	OTX2_TIM_ENA_FB)
-
-#define FP(_name, _f4, _f3, _f2, _f1, flags) \
-uint16_t \
-otx2_tim_arm_burst_ ## _name(const struct rte_event_timer_adapter *adptr, \
-			     struct rte_event_timer **tim, \
-			     const uint16_t nb_timers);
+#define TIM_ARM_FASTPATH_MODES \
+	FP(sp, 0, 0, 0, OTX2_TIM_ENA_DFB | OTX2_TIM_SP) \
+	FP(mp, 0, 0, 1, OTX2_TIM_ENA_DFB | OTX2_TIM_MP) \
+	FP(fb_sp, 0, 1, 0, OTX2_TIM_ENA_FB | OTX2_TIM_SP) \
+	FP(fb_mp, 0, 1, 1, OTX2_TIM_ENA_FB | OTX2_TIM_MP) \
+	FP(stats_mod_sp, 1, 0, 0, \
+	   OTX2_TIM_ENA_STATS | OTX2_TIM_ENA_DFB | OTX2_TIM_SP) \
+	FP(stats_mod_mp, 1, 0, 1, \
+	   OTX2_TIM_ENA_STATS | OTX2_TIM_ENA_DFB | OTX2_TIM_MP) \
+	FP(stats_mod_fb_sp, 1, 1, 0, \
+	   OTX2_TIM_ENA_STATS | OTX2_TIM_ENA_FB | OTX2_TIM_SP) \
+	FP(stats_mod_fb_mp, 1, 1, 1, \
+	   OTX2_TIM_ENA_STATS | OTX2_TIM_ENA_FB | OTX2_TIM_MP)
+
+#define TIM_ARM_TMO_FASTPATH_MODES \
+	FP(dfb, 0, 0, OTX2_TIM_ENA_DFB) \
+	FP(fb, 0, 1, OTX2_TIM_ENA_FB) \
+	FP(stats_dfb, 1, 0, OTX2_TIM_ENA_STATS | OTX2_TIM_ENA_DFB) \
+	FP(stats_fb, 1, 1, OTX2_TIM_ENA_STATS | OTX2_TIM_ENA_FB)
+
+#define FP(_name, _f3, _f2, _f1, flags) \
+	uint16_t otx2_tim_arm_burst_##_name( \
+		const struct rte_event_timer_adapter *adptr, \
+		struct rte_event_timer **tim, const uint16_t nb_timers);
 TIM_ARM_FASTPATH_MODES
 #undef FP

-#define FP(_name, _f3, _f2, _f1, flags) \
-uint16_t \
-otx2_tim_arm_tmo_tick_burst_ ## _name( \
-		const struct rte_event_timer_adapter *adptr, \
-		struct rte_event_timer **tim, \
-		const uint64_t timeout_tick, const uint16_t nb_timers);
+#define FP(_name, _f2, _f1, flags) \
+	uint16_t otx2_tim_arm_tmo_tick_burst_##_name( \
+		const struct rte_event_timer_adapter *adptr, \
+		struct rte_event_timer **tim, const uint64_t timeout_tick, \
+		const uint16_t nb_timers);
 TIM_ARM_TMO_FASTPATH_MODES
 #undef FP

diff --git a/drivers/event/octeontx2/otx2_tim_worker.c b/drivers/event/octeontx2/otx2_tim_worker.c
index 4b5cfdc72..eb901844d 100644
--- a/drivers/event/octeontx2/otx2_tim_worker.c
+++ b/drivers/event/octeontx2/otx2_tim_worker.c
@@ -136,7 +136,7 @@ tim_timer_arm_tmo_brst(const struct rte_event_timer_adapter *adptr,
 	return set_timers;
 }

-#define FP(_name, _f4, _f3, _f2, _f1, _flags) \
+#define FP(_name, _f3, _f2, _f1, _flags) \
 uint16_t __rte_noinline \
 otx2_tim_arm_burst_ ## _name(const struct rte_event_timer_adapter *adptr, \
 			     struct rte_event_timer **tim, \
@@ -147,7 +147,7 @@ otx2_tim_arm_burst_ ## _name(const struct rte_event_timer_adapter *adptr, \
 TIM_ARM_FASTPATH_MODES
 #undef FP

-#define FP(_name, _f3, _f2, _f1, _flags) \
+#define FP(_name, _f2, _f1, _flags) \
 uint16_t __rte_noinline \
 otx2_tim_arm_tmo_tick_burst_ ## _name( \
 	const struct rte_event_timer_adapter *adptr, \
diff --git a/drivers/event/octeontx2/otx2_tim_worker.h b/drivers/event/octeontx2/otx2_tim_worker.h
index af2f864d7..f03912b81 100644
--- a/drivers/event/octeontx2/otx2_tim_worker.h
+++ b/drivers/event/octeontx2/otx2_tim_worker.h
@@ -115,27 +115,27 @@ tim_bkt_clr_nent(struct otx2_tim_bkt *bktp)
 	return __atomic_and_fetch(&bktp->w1, v, __ATOMIC_ACQ_REL);
 }

+static inline uint64_t
+tim_bkt_fast_mod(uint64_t n, uint64_t d, struct rte_reciprocal_u64 R)
+{
+	return (n - (d * rte_reciprocal_divide_u64(n, &R)));
+}
+
 static __rte_always_inline void
-tim_get_target_bucket(struct otx2_tim_ring * const tim_ring,
+tim_get_target_bucket(struct otx2_tim_ring *const tim_ring,
 		      const uint32_t rel_bkt, struct otx2_tim_bkt **bkt,
-		      struct otx2_tim_bkt **mirr_bkt, const uint8_t flag)
+		      struct otx2_tim_bkt **mirr_bkt)
 {
 	const uint64_t bkt_cyc = rte_rdtsc() - tim_ring->ring_start_cyc;
-	uint32_t bucket = rte_reciprocal_divide_u64(bkt_cyc,
-			&tim_ring->fast_div) + rel_bkt;
-	uint32_t mirr_bucket = 0;
-
-	if (flag & OTX2_TIM_BKT_MOD) {
-		bucket = bucket % tim_ring->nb_bkts;
-		mirr_bucket = (bucket + (tim_ring->nb_bkts >> 1)) %
-						tim_ring->nb_bkts;
-	}
-	if (flag & OTX2_TIM_BKT_AND) {
-		bucket = bucket & (tim_ring->nb_bkts - 1);
-		mirr_bucket = (bucket + (tim_ring->nb_bkts >> 1)) &
-						(tim_ring->nb_bkts - 1);
-	}
-
+	uint64_t bucket =
+		rte_reciprocal_divide_u64(bkt_cyc, &tim_ring->fast_div) +
+		rel_bkt;
+	uint64_t mirr_bucket = 0;
+
+	bucket =
+		tim_bkt_fast_mod(bucket, tim_ring->nb_bkts, tim_ring->fast_bkt);
+	mirr_bucket = tim_bkt_fast_mod(bucket + (tim_ring->nb_bkts >> 1),
+				       tim_ring->nb_bkts, tim_ring->fast_bkt);
 	*bkt = &tim_ring->bkt[bucket];
 	*mirr_bkt = &tim_ring->bkt[mirr_bucket];
 }
@@ -236,7 +236,7 @@ tim_add_entry_sp(struct otx2_tim_ring * const tim_ring,
 	int16_t rem;

 __retry:
-	tim_get_target_bucket(tim_ring, rel_bkt, &bkt, &mirr_bkt, flags);
+	tim_get_target_bucket(tim_ring, rel_bkt, &bkt, &mirr_bkt);

 	/* Get Bucket sema*/
 	lock_sema = tim_bkt_fetch_sema_lock(bkt);
@@ -322,7 +322,7 @@ tim_add_entry_mp(struct otx2_tim_ring * const tim_ring,
 	int16_t rem;

 __retry:
-	tim_get_target_bucket(tim_ring, rel_bkt, &bkt, &mirr_bkt, flags);
+	tim_get_target_bucket(tim_ring, rel_bkt, &bkt, &mirr_bkt);

 	/* Get Bucket sema*/
 	lock_sema = tim_bkt_fetch_sema_lock(bkt);
@@ -454,7 +454,7 @@ tim_add_entry_brst(struct otx2_tim_ring * const tim_ring,
 	uint8_t lock_cnt;

 __retry:
-	tim_get_target_bucket(tim_ring, rel_bkt, &bkt, &mirr_bkt, flags);
+	tim_get_target_bucket(tim_ring, rel_bkt, &bkt, &mirr_bkt);

 	/* Only one thread beyond this. */
 	lock_sema = tim_bkt_inc_lock(bkt);
--
2.17.1