From: Jerin Jacob
Date: Mon, 22 May 2023 12:53:18 +0530
Subject: Re: [PATCH v2 1/3] event/cnxk: use LMTST for enqueue new burst
To: Shijith Thotton
Cc: Pavan Nikhilesh Bhagavatula, Jerin Jacob Kollanukkaran, Nithin Kumar Dabilpuram, Kiran Kumar Kokkilagadda, Sunil Kumar Kori, Satha Koteswara Rao Kottidi, dev@dpdk.org
References: <20230419200151.2474-1-pbhagavatula@marvell.com> <20230425195110.4223-1-pbhagavatula@marvell.com>

On Thu, May 18, 2023 at 9:12 PM Shijith Thotton wrote:
>
> >From: Pavan Nikhilesh
> >
> >Use LMTST when all events in the burst are enqueued with
> >rte_event::op as RTE_EVENT_OP_NEW, i.e. events are enqueued
> >with the `rte_event_enqueue_new_burst` API.
> >
> >Signed-off-by: Pavan Nikhilesh
>
> Acked-by: Shijith Thotton

Applied 2/3 and 3/3 to dpdk-next-net-eventdev/for-main. Thanks.

Please resend 1/3 to fix the following:

### [PATCH] event/cnxk: use LMTST for enqueue new burst

Warning in drivers/event/cnxk/cn10k_worker.c:
Using __atomic_op_fetch, prefer __atomic_fetch_op

2/3 valid patches checkpatch failed
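The flagged call is the __atomic_sub_fetch() in the new sso_lmt_aw_wait_fc()
helper quoted below. A minimal sketch of the rework checkpatch asks for (my
suggestion, untested): __atomic_fetch_sub() returns the value *before* the
subtraction, so subtract the request once more locally to keep the same
"space left after this claim" result:

	/* Equivalent to __atomic_sub_fetch(ws->fc_cache_space, req, ...) */
	cached = __atomic_fetch_sub(ws->fc_cache_space, req,
				    __ATOMIC_ACQUIRE) - req;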
> >---
> >v2 Changes:
> >- Fix spell check.
> >
> > drivers/common/cnxk/hw/sso.h        |   1 +
> > drivers/common/cnxk/roc_sso.c       |  10 +-
> > drivers/common/cnxk/roc_sso.h       |   3 +
> > drivers/event/cnxk/cn10k_eventdev.c |   9 +-
> > drivers/event/cnxk/cn10k_eventdev.h |   6 +-
> > drivers/event/cnxk/cn10k_worker.c   | 304 +++++++++++++++++++++++++++-
> > drivers/event/cnxk/cn9k_eventdev.c  |   2 +-
> > drivers/event/cnxk/cnxk_eventdev.c  |  15 +-
> > drivers/event/cnxk/cnxk_eventdev.h  |   4 +-
> > 9 files changed, 338 insertions(+), 16 deletions(-)
> >
> >diff --git a/drivers/common/cnxk/hw/sso.h b/drivers/common/cnxk/hw/sso.h
> >index 25deaa4c14..09b8d4955f 100644
> >--- a/drivers/common/cnxk/hw/sso.h
> >+++ b/drivers/common/cnxk/hw/sso.h
> >@@ -157,6 +157,7 @@
> > #define SSO_LF_GGRP_AQ_CNT	 (0x1c0ull)
> > #define SSO_LF_GGRP_AQ_THR	 (0x1e0ull)
> > #define SSO_LF_GGRP_MISC_CNT	 (0x200ull)
> >+#define SSO_LF_GGRP_OP_AW_LMTST	 (0x400ull)
> >
> > #define SSO_AF_IAQ_FREE_CNT_MASK  0x3FFFull
> > #define SSO_AF_IAQ_RSVD_FREE_MASK 0x3FFFull
> >diff --git a/drivers/common/cnxk/roc_sso.c b/drivers/common/cnxk/roc_sso.c
> >index 4a6a5080f7..99a55e49b0 100644
> >--- a/drivers/common/cnxk/roc_sso.c
> >+++ b/drivers/common/cnxk/roc_sso.c
> >@@ -6,6 +6,7 @@
> > #include "roc_priv.h"
> >
> > #define SSO_XAQ_CACHE_CNT (0x7)
> >+#define SSO_XAQ_SLACK	  (16)
> >
> > /* Private functions. */
> > int
> >@@ -493,9 +494,13 @@ sso_hwgrp_init_xaq_aura(struct dev *dev, struct roc_sso_xaq_data *xaq,
> >
> > 	xaq->nb_xae = nb_xae;
> >
> >-	/* Taken from HRM 14.3.3(4) */
> >+	/* SSO will reserve up to 0x4 XAQ buffers per group when GetWork engine
> >+	 * is inactive and it might prefetch an additional 0x3 buffers due to
> >+	 * pipelining.
> >+	 */
> > 	xaq->nb_xaq = (SSO_XAQ_CACHE_CNT * nb_hwgrp);
> > 	xaq->nb_xaq += PLT_MAX(1 + ((xaq->nb_xae - 1) / xae_waes), xaq->nb_xaq);
> >+	xaq->nb_xaq += SSO_XAQ_SLACK;
> >
> > 	xaq->mem = plt_zmalloc(xaq_buf_size * xaq->nb_xaq, xaq_buf_size);
> > 	if (xaq->mem == NULL) {
> >@@ -537,7 +542,8 @@ sso_hwgrp_init_xaq_aura(struct dev *dev, struct roc_sso_xaq_data *xaq,
> > 	 * There should be a minimum headroom of 7 XAQs per HWGRP for SSO
> > 	 * to request XAQ to cache them even before enqueue is called.
> > 	 */
> >-	xaq->xaq_lmt = xaq->nb_xaq - (nb_hwgrp * SSO_XAQ_CACHE_CNT);
> >+	xaq->xaq_lmt = xaq->nb_xaq - (nb_hwgrp * SSO_XAQ_CACHE_CNT) -
> >+		       SSO_XAQ_SLACK;
> >
> > 	return 0;
> > npa_fill_fail:
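A worked example of the new sizing (illustrative numbers, not from the
patch): with one HWGRP, nb_xae = 8192 and xae_waes = 352,

	nb_xaq  = 7 + PLT_MAX(1 + (8192 - 1) / 352, 7) + 16 = 7 + 24 + 16 = 47
	xaq_lmt = 47 - (1 * 7) - 16 = 24

so the enqueue limit keeps both the per-HWGRP cache headroom and the new
prefetch slack out of reach of software.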
> >diff --git a/drivers/common/cnxk/roc_sso.h b/drivers/common/cnxk/roc_sso.h
> >index e67797b046..a2bb6fcb22 100644
> >--- a/drivers/common/cnxk/roc_sso.h
> >+++ b/drivers/common/cnxk/roc_sso.h
> >@@ -7,6 +7,9 @@
> >
> > #include "hw/ssow.h"
> >
> >+#define ROC_SSO_AW_PER_LMT_LINE_LOG2 3
> >+#define ROC_SSO_XAE_PER_XAQ	     352
> >+
> > struct roc_sso_hwgrp_qos {
> > 	uint16_t hwgrp;
> > 	uint8_t xaq_prcnt;
> >diff --git a/drivers/event/cnxk/cn10k_eventdev.c b/drivers/event/cnxk/cn10k_eventdev.c
> >index 071ea5a212..855c92da83 100644
> >--- a/drivers/event/cnxk/cn10k_eventdev.c
> >+++ b/drivers/event/cnxk/cn10k_eventdev.c
> >@@ -91,8 +91,10 @@ cn10k_sso_hws_setup(void *arg, void *hws, uintptr_t grp_base)
> > 	uint64_t val;
> >
> > 	ws->grp_base = grp_base;
> >-	ws->fc_mem = (uint64_t *)dev->fc_iova;
> >+	ws->fc_mem = (int64_t *)dev->fc_iova;
> > 	ws->xaq_lmt = dev->xaq_lmt;
> >+	ws->fc_cache_space = dev->fc_cache_space;
> >+	ws->aw_lmt = ws->lmt_base;
> >
> > 	/* Set get_work timeout for HWS */
> > 	val = NSEC2USEC(dev->deq_tmo_ns);
> >@@ -624,6 +626,7 @@ cn10k_sso_info_get(struct rte_eventdev *event_dev,
> >
> > 	dev_info->driver_name = RTE_STR(EVENTDEV_NAME_CN10K_PMD);
> > 	cnxk_sso_info_get(dev, dev_info);
> >+	dev_info->max_event_port_enqueue_depth = UINT32_MAX;
> > }
> >
> > static int
> >@@ -632,7 +635,7 @@ cn10k_sso_dev_configure(const struct rte_eventdev *event_dev)
> > 	struct cnxk_sso_evdev *dev = cnxk_sso_pmd_priv(event_dev);
> > 	int rc;
> >
> >-	rc = cnxk_sso_dev_validate(event_dev);
> >+	rc = cnxk_sso_dev_validate(event_dev, 1, UINT32_MAX);
> > 	if (rc < 0) {
> > 		plt_err("Invalid event device configuration");
> > 		return -EINVAL;
> >@@ -871,7 +874,7 @@ cn10k_sso_set_priv_mem(const struct rte_eventdev *event_dev, void *lookup_mem)
> > 	for (i = 0; i < dev->nb_event_ports; i++) {
> > 		struct cn10k_sso_hws *ws = event_dev->data->ports[i];
> > 		ws->xaq_lmt = dev->xaq_lmt;
> >-		ws->fc_mem = (uint64_t *)dev->fc_iova;
> >+		ws->fc_mem = (int64_t *)dev->fc_iova;
> > 		ws->tstamp = dev->tstamp;
> > 		if (lookup_mem)
> > 			ws->lookup_mem = lookup_mem;
> >diff --git a/drivers/event/cnxk/cn10k_eventdev.h b/drivers/event/cnxk/cn10k_eventdev.h
> >index aaa01d1ec1..29567728cd 100644
> >--- a/drivers/event/cnxk/cn10k_eventdev.h
> >+++ b/drivers/event/cnxk/cn10k_eventdev.h
> >@@ -19,9 +19,11 @@ struct cn10k_sso_hws {
> > 	struct cnxk_timesync_info **tstamp;
> > 	uint64_t meta_aura;
> > 	/* Add Work Fastpath data */
> >-	uint64_t xaq_lmt __rte_cache_aligned;
> >-	uint64_t *fc_mem;
> >+	int64_t *fc_mem __rte_cache_aligned;
> >+	int64_t *fc_cache_space;
> >+	uintptr_t aw_lmt;
> > 	uintptr_t grp_base;
> >+	int32_t xaq_lmt;
> > 	/* Tx Fastpath data */
> > 	uintptr_t lmt_base __rte_cache_aligned;
> > 	uint64_t lso_tun_fmt;
> >diff --git a/drivers/event/cnxk/cn10k_worker.c b/drivers/event/cnxk/cn10k_worker.c
> >index 562d2fca13..1028a12c64 100644
> >--- a/drivers/event/cnxk/cn10k_worker.c
> >+++ b/drivers/event/cnxk/cn10k_worker.c
> >@@ -77,6 +77,36 @@ cn10k_sso_hws_forward_event(struct cn10k_sso_hws *ws,
> > 	cn10k_sso_hws_fwd_group(ws, ev, grp);
> > }
> >
> >+static inline int32_t
> >+sso_read_xaq_space(struct cn10k_sso_hws *ws)
> >+{
> >+	return (ws->xaq_lmt - __atomic_load_n(ws->fc_mem, __ATOMIC_RELAXED)) *
> >+	       ROC_SSO_XAE_PER_XAQ;
> >+}
> >+
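Continuing the example numbers above (still illustrative): with
xaq_lmt = 24 and 20 XAQ buffers currently in flight per *fc_mem,
sso_read_xaq_space() reports (24 - 20) * 352 = 1408 events of remaining
headroom.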
> >+static inline void
> >+sso_lmt_aw_wait_fc(struct cn10k_sso_hws *ws, int64_t req)
> >+{
> >+	int64_t cached, refill;
> >+
> >+retry:
> >+	while (__atomic_load_n(ws->fc_cache_space, __ATOMIC_RELAXED) < 0)
> >+		;
> >+
> >+	cached = __atomic_sub_fetch(ws->fc_cache_space, req, __ATOMIC_ACQUIRE);
> >+	/* Check if there is enough space, else update and retry. */
> >+	if (cached < 0) {
> >+		/* Check if we have space else retry. */
> >+		do {
> >+			refill = sso_read_xaq_space(ws);
> >+		} while (refill <= 0);
> >+		__atomic_compare_exchange(ws->fc_cache_space, &cached, &refill,
> >+					  0, __ATOMIC_RELEASE,
> >+					  __ATOMIC_RELAXED);
> >+		goto retry;
> >+	}
> >+}
> >+
> > uint16_t __rte_hot
> > cn10k_sso_hws_enq(void *port, const struct rte_event *ev)
> > {
> >@@ -103,6 +133,253 @@ cn10k_sso_hws_enq(void *port, const struct rte_event *ev)
> > 	return 1;
> > }
> >
> >+#define VECTOR_SIZE_BITS	     0xFFFFFFFFFFF80000ULL
> >+#define VECTOR_GET_LINE_OFFSET(line) (19 + (3 * line))
> >+
> >+static uint64_t
> >+vector_size_partial_mask(uint16_t off, uint16_t cnt)
> >+{
> >+	return (VECTOR_SIZE_BITS & ~(~0x0ULL << off)) |
> >+	       ((uint64_t)(cnt - 1) << off);
> >+}
> >+
> >+static __rte_always_inline uint16_t
> >+cn10k_sso_hws_new_event_lmtst(struct cn10k_sso_hws *ws, uint8_t queue_id,
> >+			      const struct rte_event ev[], uint16_t n)
> >+{
> >+	uint16_t lines, partial_line, burst, left;
> >+	uint64_t wdata[2], pa[2] = {0};
> >+	uintptr_t lmt_addr;
> >+	uint16_t sz0, sz1;
> >+	uint16_t lmt_id;
> >+
> >+	sz0 = sz1 = 0;
> >+	lmt_addr = ws->lmt_base;
> >+	ROC_LMT_BASE_ID_GET(lmt_addr, lmt_id);
> >+
> >+	left = n;
> >+again:
> >+	burst = RTE_MIN(
> >+		BIT(ROC_SSO_AW_PER_LMT_LINE_LOG2 + ROC_LMT_LINES_PER_CORE_LOG2),
> >+		left);
> >+
> >+	/* Set wdata */
> >+	lines = burst >> ROC_SSO_AW_PER_LMT_LINE_LOG2;
> >+	partial_line = burst & (BIT(ROC_SSO_AW_PER_LMT_LINE_LOG2) - 1);
> >+	wdata[0] = wdata[1] = 0;
> >+	if (lines > BIT(ROC_LMT_LINES_PER_STR_LOG2)) {
> >+		wdata[0] = lmt_id;
> >+		wdata[0] |= 15ULL << 12;
> >+		wdata[0] |= VECTOR_SIZE_BITS;
> >+		pa[0] = (ws->grp_base + (queue_id << 12) +
> >+			 SSO_LF_GGRP_OP_AW_LMTST) |
> >+			(0x7 << 4);
> >+		sz0 = 16 << ROC_SSO_AW_PER_LMT_LINE_LOG2;
> >+
> >+		wdata[1] = lmt_id + 16;
> >+		pa[1] = (ws->grp_base + (queue_id << 12) +
> >+			 SSO_LF_GGRP_OP_AW_LMTST) |
> >+			(0x7 << 4);
> >+
> >+		lines -= 17;
> >+		wdata[1] |= partial_line ? (uint64_t)(lines + 1) << 12 :
> >+					   (uint64_t)(lines << 12);
> >+		wdata[1] |= partial_line ?
> >+				    vector_size_partial_mask(
> >+					    VECTOR_GET_LINE_OFFSET(lines),
> >+					    partial_line) :
> >+				    VECTOR_SIZE_BITS;
> >+		sz1 = burst - sz0;
> >+		partial_line = 0;
> >+	} else if (lines) {
> >+		/* We need to handle two cases here:
> >+		 * 1. Partial line spill over to wdata[1] i.e. lines == 16
> >+		 * 2. Partial line with spill lines < 16.
> >+		 */
> >+		wdata[0] = lmt_id;
> >+		pa[0] = (ws->grp_base + (queue_id << 12) +
> >+			 SSO_LF_GGRP_OP_AW_LMTST) |
> >+			(0x7 << 4);
> >+		sz0 = lines << ROC_SSO_AW_PER_LMT_LINE_LOG2;
> >+		if (lines == 16) {
> >+			wdata[0] |= 15ULL << 12;
> >+			wdata[0] |= VECTOR_SIZE_BITS;
> >+			if (partial_line) {
> >+				wdata[1] = lmt_id + 16;
> >+				pa[1] = (ws->grp_base + (queue_id << 12) +
> >+					 SSO_LF_GGRP_OP_AW_LMTST) |
> >+					((partial_line - 1) << 4);
> >+			}
> >+		} else {
> >+			lines -= 1;
> >+			wdata[0] |= partial_line ? (uint64_t)(lines + 1) << 12 :
> >+						   (uint64_t)(lines << 12);
> >+			wdata[0] |=
> >+				partial_line ?
> >+					vector_size_partial_mask(
> >+						VECTOR_GET_LINE_OFFSET(lines),
> >+						partial_line) :
> >+					VECTOR_SIZE_BITS;
> >+			sz0 += partial_line;
> >+		}
> >+		sz1 = burst - sz0;
> >+		partial_line = 0;
> >+	}
> >+
> >+	/* Only partial lines */
> >+	if (partial_line) {
> >+		wdata[0] = lmt_id;
> >+		pa[0] = (ws->grp_base + (queue_id << 12) +
> >+			 SSO_LF_GGRP_OP_AW_LMTST) |
> >+			((partial_line - 1) << 4);
> >+		sz0 = partial_line;
> >+		sz1 = burst - sz0;
> >+	}
> >+
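If I read the size encoding right (it mirrors the NIX LMTST layout, where
line 0's size travels in bits [6:4] of the I/O address), a burst of 19
events works out as follows (my own worked example, not part of the patch):

	lines = 19 >> 3 = 2, partial_line = 19 & 7 = 3
	/* else-if branch, lines != 16, so lines -= 1 leaves lines = 1 */
	wdata[0] |= (1 + 1) << 12;	/* 3 LMT lines, minus-one encoded */
	wdata[0] |= vector_size_partial_mask(VECTOR_GET_LINE_OFFSET(1), 3);
	/* = bits [21:19] = 0x7 (line 1 full: 8 AWs) | 2 << 22 (line 2: 3 AWs);
	 * line 0's 8 AWs are the (0x7 << 4) folded into pa[0].
	 */
	sz0 = (2 << 3) + 3 = 19; sz1 = 0;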
> >+#if defined(RTE_ARCH_ARM64)
> >+	uint64x2_t aw_mask = {0xC0FFFFFFFFULL, ~0x0ULL};
> >+	uint64x2_t tt_mask = {0x300000000ULL, 0};
> >+	uint16_t parts;
> >+
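An annotation on the two masks for other reviewers (not part of the patch):
aw_mask keeps event[31:0] plus the sched_type bits [39:38] of the first
word, and all of u64 in the second; the shift-by-6/tt_mask step then copies
sched_type down to bits [33:32]. Per event this builds the same add-work
word pair as the scalar #else path further down:

	aw0 = ev->u64;
	aw0 <<= 64;
	aw0 |= ev->event & (BIT_ULL(32) - 1);
	aw0 |= (uint64_t)ev->sched_type << 32;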
> >+	while (burst) {
> >+		parts = burst > 7 ? 8 : plt_align32prevpow2(burst);
> >+		burst -= parts;
> >+		/* Lets try to fill at least one line per burst. */
> >+		switch (parts) {
> >+		case 8: {
> >+			uint64x2_t aw0, aw1, aw2, aw3, aw4, aw5, aw6, aw7;
> >+
> >+			aw0 = vandq_u64(vld1q_u64((const uint64_t *)&ev[0]),
> >+					aw_mask);
> >+			aw1 = vandq_u64(vld1q_u64((const uint64_t *)&ev[1]),
> >+					aw_mask);
> >+			aw2 = vandq_u64(vld1q_u64((const uint64_t *)&ev[2]),
> >+					aw_mask);
> >+			aw3 = vandq_u64(vld1q_u64((const uint64_t *)&ev[3]),
> >+					aw_mask);
> >+			aw4 = vandq_u64(vld1q_u64((const uint64_t *)&ev[4]),
> >+					aw_mask);
> >+			aw5 = vandq_u64(vld1q_u64((const uint64_t *)&ev[5]),
> >+					aw_mask);
> >+			aw6 = vandq_u64(vld1q_u64((const uint64_t *)&ev[6]),
> >+					aw_mask);
> >+			aw7 = vandq_u64(vld1q_u64((const uint64_t *)&ev[7]),
> >+					aw_mask);
> >+
> >+			aw0 = vorrq_u64(vandq_u64(vshrq_n_u64(aw0, 6), tt_mask),
> >+					aw0);
> >+			aw1 = vorrq_u64(vandq_u64(vshrq_n_u64(aw1, 6), tt_mask),
> >+					aw1);
> >+			aw2 = vorrq_u64(vandq_u64(vshrq_n_u64(aw2, 6), tt_mask),
> >+					aw2);
> >+			aw3 = vorrq_u64(vandq_u64(vshrq_n_u64(aw3, 6), tt_mask),
> >+					aw3);
> >+			aw4 = vorrq_u64(vandq_u64(vshrq_n_u64(aw4, 6), tt_mask),
> >+					aw4);
> >+			aw5 = vorrq_u64(vandq_u64(vshrq_n_u64(aw5, 6), tt_mask),
> >+					aw5);
> >+			aw6 = vorrq_u64(vandq_u64(vshrq_n_u64(aw6, 6), tt_mask),
> >+					aw6);
> >+			aw7 = vorrq_u64(vandq_u64(vshrq_n_u64(aw7, 6), tt_mask),
> >+					aw7);
> >+
> >+			vst1q_u64((void *)lmt_addr, aw0);
> >+			vst1q_u64((void *)PLT_PTR_ADD(lmt_addr, 16), aw1);
> >+			vst1q_u64((void *)PLT_PTR_ADD(lmt_addr, 32), aw2);
> >+			vst1q_u64((void *)PLT_PTR_ADD(lmt_addr, 48), aw3);
> >+			vst1q_u64((void *)PLT_PTR_ADD(lmt_addr, 64), aw4);
> >+			vst1q_u64((void *)PLT_PTR_ADD(lmt_addr, 80), aw5);
> >+			vst1q_u64((void *)PLT_PTR_ADD(lmt_addr, 96), aw6);
> >+			vst1q_u64((void *)PLT_PTR_ADD(lmt_addr, 112), aw7);
> >+			lmt_addr = (uintptr_t)PLT_PTR_ADD(lmt_addr, 128);
> >+		} break;
> >+		case 4: {
> >+			uint64x2_t aw0, aw1, aw2, aw3;
> >+
> >+			aw0 = vandq_u64(vld1q_u64((const uint64_t *)&ev[0]),
> >+					aw_mask);
> >+			aw1 = vandq_u64(vld1q_u64((const uint64_t *)&ev[1]),
> >+					aw_mask);
> >+			aw2 = vandq_u64(vld1q_u64((const uint64_t *)&ev[2]),
> >+					aw_mask);
> >+			aw3 = vandq_u64(vld1q_u64((const uint64_t *)&ev[3]),
> >+					aw_mask);
> >+
> >+			aw0 = vorrq_u64(vandq_u64(vshrq_n_u64(aw0, 6), tt_mask),
> >+					aw0);
> >+			aw1 = vorrq_u64(vandq_u64(vshrq_n_u64(aw1, 6), tt_mask),
> >+					aw1);
> >+			aw2 = vorrq_u64(vandq_u64(vshrq_n_u64(aw2, 6), tt_mask),
> >+					aw2);
> >+			aw3 = vorrq_u64(vandq_u64(vshrq_n_u64(aw3, 6), tt_mask),
> >+					aw3);
> >+
> >+			vst1q_u64((void *)lmt_addr, aw0);
> >+			vst1q_u64((void *)PLT_PTR_ADD(lmt_addr, 16), aw1);
> >+			vst1q_u64((void *)PLT_PTR_ADD(lmt_addr, 32), aw2);
> >+			vst1q_u64((void *)PLT_PTR_ADD(lmt_addr, 48), aw3);
> >+			lmt_addr = (uintptr_t)PLT_PTR_ADD(lmt_addr, 64);
> >+		} break;
> >+		case 2: {
> >+			uint64x2_t aw0, aw1;
> >+
> >+			aw0 = vandq_u64(vld1q_u64((const uint64_t *)&ev[0]),
> >+					aw_mask);
> >+			aw1 = vandq_u64(vld1q_u64((const uint64_t *)&ev[1]),
> >+					aw_mask);
> >+
> >+			aw0 = vorrq_u64(vandq_u64(vshrq_n_u64(aw0, 6), tt_mask),
> >+					aw0);
> >+			aw1 = vorrq_u64(vandq_u64(vshrq_n_u64(aw1, 6), tt_mask),
> >+					aw1);
> >+
> >+			vst1q_u64((void *)lmt_addr, aw0);
> >+			vst1q_u64((void *)PLT_PTR_ADD(lmt_addr, 16), aw1);
> >+			lmt_addr = (uintptr_t)PLT_PTR_ADD(lmt_addr, 32);
> >+		} break;
> >+		case 1: {
> >+			__uint128_t aw0;
> >+
> >+			aw0 = ev[0].u64;
> >+			aw0 <<= 64;
> >+			aw0 |= ev[0].event & (BIT_ULL(32) - 1);
> >+			aw0 |= (uint64_t)ev[0].sched_type << 32;
> >+
> >+			*((__uint128_t *)lmt_addr) = aw0;
> >+			lmt_addr = (uintptr_t)PLT_PTR_ADD(lmt_addr, 16);
> >+		} break;
> >+		}
> >+		ev += parts;
> >+	}
> >+#else
> >+	uint16_t i;
> >+
> >+	for (i = 0; i < burst; i++) {
> >+		__uint128_t aw0;
> >+
> >+		aw0 = ev[0].u64;
> >+		aw0 <<= 64;
> >+		aw0 |= ev[0].event & (BIT_ULL(32) - 1);
> >+		aw0 |= (uint64_t)ev[0].sched_type << 32;
> >+		*((__uint128_t *)lmt_addr) = aw0;
> >+		lmt_addr = (uintptr_t)PLT_PTR_ADD(lmt_addr, 16);
> >+	}
> >+#endif
> >+
> >+	/* wdata[0] will be always valid */
> >+	sso_lmt_aw_wait_fc(ws, sz0);
> >+	roc_lmt_submit_steorl(wdata[0], pa[0]);
> >+	if (wdata[1]) {
> >+		sso_lmt_aw_wait_fc(ws, sz1);
> >+		roc_lmt_submit_steorl(wdata[1], pa[1]);
> >+	}
> >+
> >+	left -= (sz0 + sz1);
> >+	if (left)
> >+		goto again;
> >+
> >+	return n;
> >+}
> >+
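One thing worth double-checking in the resend: the non-ARM64 #else loop
above reads ev[0] on every iteration and never advances ev, unlike the NEON
path's ev += parts.

For context, this fast path is what an application reaches through
rte_event_enqueue_new_burst(). A minimal producer sketch, assuming dev_id,
port_id, queue_id and an objs[] array of application pointers are set up
elsewhere (illustrative only, not part of the patch):

	struct rte_event evs[64];
	uint16_t i, sent = 0;

	for (i = 0; i < 64; i++)
		evs[i] = (struct rte_event){
			.op = RTE_EVENT_OP_NEW,
			.queue_id = queue_id,
			.sched_type = RTE_SCHED_TYPE_ATOMIC,
			.event_type = RTE_EVENT_TYPE_CPU,
			.event_ptr = objs[i],
		};
	/* May return fewer than requested when the device back-pressures
	 * (XAQ exhaustion); retry the remainder.
	 */
	while (sent < 64)
		sent += rte_event_enqueue_new_burst(dev_id, port_id,
						    evs + sent, 64 - sent);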
> > uint16_t __rte_hot
> > cn10k_sso_hws_enq_burst(void *port, const struct rte_event ev[],
> > 			uint16_t nb_events)
> >@@ -115,13 +392,32 @@ uint16_t __rte_hot
> > cn10k_sso_hws_enq_new_burst(void *port, const struct rte_event ev[],
> > 			    uint16_t nb_events)
> > {
> >+	uint16_t idx = 0, done = 0, rc = 0;
> > 	struct cn10k_sso_hws *ws = port;
> >-	uint16_t i, rc = 1;
> >+	uint8_t queue_id;
> >+	int32_t space;
> >+
> >+	/* Do a common back-pressure check and return */
> >+	space = sso_read_xaq_space(ws) - ROC_SSO_XAE_PER_XAQ;
> >+	if (space <= 0)
> >+		return 0;
> >+	nb_events = space < nb_events ? space : nb_events;
> >+
> >+	do {
> >+		queue_id = ev[idx].queue_id;
> >+		for (idx = idx + 1; idx < nb_events; idx++)
> >+			if (queue_id != ev[idx].queue_id)
> >+				break;
> >+
> >+		rc = cn10k_sso_hws_new_event_lmtst(ws, queue_id, &ev[done],
> >+						   idx - done);
> >+		if (rc != (idx - done))
> >+			return rc + done;
> >+		done += rc;
> >
> >-	for (i = 0; i < nb_events && rc; i++)
> >-		rc = cn10k_sso_hws_new_event(ws, &ev[i]);
> >+	} while (done < nb_events);
> >
> >-	return nb_events;
> >+	return done;
> > }
> >
> > uint16_t __rte_hot
> >diff --git a/drivers/event/cnxk/cn9k_eventdev.c b/drivers/event/cnxk/cn9k_eventdev.c
> >index 7e8339bd3a..e59e537311 100644
> >--- a/drivers/event/cnxk/cn9k_eventdev.c
> >+++ b/drivers/event/cnxk/cn9k_eventdev.c
> >@@ -753,7 +753,7 @@ cn9k_sso_dev_configure(const struct rte_eventdev *event_dev)
> > 	struct cnxk_sso_evdev *dev = cnxk_sso_pmd_priv(event_dev);
> > 	int rc;
> >
> >-	rc = cnxk_sso_dev_validate(event_dev);
> >+	rc = cnxk_sso_dev_validate(event_dev, 1, 1);
> > 	if (rc < 0) {
> > 		plt_err("Invalid event device configuration");
> > 		return -EINVAL;
> >diff --git a/drivers/event/cnxk/cnxk_eventdev.c b/drivers/event/cnxk/cnxk_eventdev.c
> >index cb9ba5d353..99f9cdcd0d 100644
> >--- a/drivers/event/cnxk/cnxk_eventdev.c
> >+++ b/drivers/event/cnxk/cnxk_eventdev.c
> >@@ -145,7 +145,8 @@ cnxk_sso_restore_links(const struct rte_eventdev *event_dev,
> > }
> >
> > int
> >-cnxk_sso_dev_validate(const struct rte_eventdev *event_dev)
> >+cnxk_sso_dev_validate(const struct rte_eventdev *event_dev, uint32_t deq_depth,
> >+		      uint32_t enq_depth)
> > {
> > 	struct rte_event_dev_config *conf = &event_dev->data->dev_conf;
> > 	struct cnxk_sso_evdev *dev = cnxk_sso_pmd_priv(event_dev);
> >@@ -173,12 +174,12 @@ cnxk_sso_dev_validate(const struct rte_eventdev *event_dev)
> > 		return -EINVAL;
> > 	}
> >
> >-	if (conf->nb_event_port_dequeue_depth > 1) {
> >+	if (conf->nb_event_port_dequeue_depth > deq_depth) {
> > 		plt_err("Unsupported event port deq depth requested");
> > 		return -EINVAL;
> > 	}
> >
> >-	if (conf->nb_event_port_enqueue_depth > 1) {
> >+	if (conf->nb_event_port_enqueue_depth > enq_depth) {
> > 		plt_err("Unsupported event port enq depth requested");
> > 		return -EINVAL;
> > 	}
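With the relaxed check, a cn10k application can now request a real burst
enqueue depth at configure time. A hedged sketch of the relevant
rte_event_dev_config fields (values illustrative):

	struct rte_event_dev_config conf = {
		/* ... other fields populated from rte_event_dev_info ... */
		.nb_event_port_dequeue_depth = 1,  /* cn10k still caps this at 1 */
		.nb_event_port_enqueue_depth = 64, /* now checked against UINT32_MAX */
	};
	/* followed by the usual rte_event_dev_configure(dev_id, &conf) */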
> >@@ -630,6 +631,14 @@ cnxk_sso_init(struct rte_eventdev *event_dev)
> > 	}
> >
> > 	dev = cnxk_sso_pmd_priv(event_dev);
> >+	dev->fc_cache_space = rte_zmalloc("fc_cache", PLT_CACHE_LINE_SIZE,
> >+					  PLT_CACHE_LINE_SIZE);
> >+	if (dev->fc_cache_space == NULL) {
> >+		plt_memzone_free(mz);
> >+		plt_err("Failed to reserve memory for XAQ fc cache");
> >+		return -ENOMEM;
> >+	}
> >+
> > 	pci_dev = container_of(event_dev->dev, struct rte_pci_device, device);
> > 	dev->sso.pci_dev = pci_dev;
> >
> >diff --git a/drivers/event/cnxk/cnxk_eventdev.h b/drivers/event/cnxk/cnxk_eventdev.h
> >index c7cbd722ab..a2f30bfe5f 100644
> >--- a/drivers/event/cnxk/cnxk_eventdev.h
> >+++ b/drivers/event/cnxk/cnxk_eventdev.h
> >@@ -90,6 +90,7 @@ struct cnxk_sso_evdev {
> > 	uint32_t max_dequeue_timeout_ns;
> > 	int32_t max_num_events;
> > 	uint64_t xaq_lmt;
> >+	int64_t *fc_cache_space;
> > 	rte_iova_t fc_iova;
> > 	uint64_t rx_offloads;
> > 	uint64_t tx_offloads;
> >@@ -206,7 +207,8 @@ int cnxk_sso_fini(struct rte_eventdev *event_dev);
> > int cnxk_sso_remove(struct rte_pci_device *pci_dev);
> > void cnxk_sso_info_get(struct cnxk_sso_evdev *dev,
> > 		       struct rte_event_dev_info *dev_info);
> >-int cnxk_sso_dev_validate(const struct rte_eventdev *event_dev);
> >+int cnxk_sso_dev_validate(const struct rte_eventdev *event_dev,
> >+			  uint32_t deq_depth, uint32_t enq_depth);
> > int cnxk_setup_event_ports(const struct rte_eventdev *event_dev,
> > 			   cnxk_sso_init_hws_mem_t init_hws_mem,
> > 			   cnxk_sso_hws_setup_t hws_setup);
> >--
> >2.25.1
>