From: Anoob Joseph
To: Akhil Goyal, Jerin Jacob
CC: Tejasree Kondoj, Shijith Thotton
Subject: [PATCH] crypto/cnxk: add CPT hardware flow control checks
Date: Mon, 20 Jun 2022 17:29:03 +0530
Message-ID: <20220620115903.2485-1-anoobj@marvell.com>
List-Id: DPDK patches and discussions

Add hardware-supported flow control checks before enqueueing to CPT.
Since poll mode and event mode can be used at the same time, the checks
make sure software does not oversubmit to the hardware queues. On cn9k,
queue depth usage is not high, so the FC check is omitted for poll mode.

To allow for more accurate updates, the hardware flow control setting is
changed to give an update every 32 packets.

With the crypto adapter, multiple cores can enqueue to the same CPT LF
at the same time. To allow for this, the flow control threshold is
lowered when the adapter is configured.

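The threshold arithmetic described above can be illustrated with a minimal, standalone sketch. It is not driver code: the helper names `fc_thresh_base` and `fc_thresh_reserve` and the macro `FC_MIN_THRESHOLD` are hypothetical, and the per-core burst values (32 for cn10k, 2 for cn9k) are taken from the patch itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical name; mirrors the 32-packet FC update interval in the patch. */
#define FC_MIN_THRESHOLD 32u

/* Base threshold: queue depth minus one FC update interval. */
static uint32_t
fc_thresh_base(uint32_t nb_desc)
{
	assert(nb_desc > FC_MIN_THRESHOLD);
	return nb_desc - FC_MIN_THRESHOLD;
}

/*
 * Reserve headroom so that every core can enqueue one burst at once.
 * per_core_burst is 32 for cn10k and 2 for cn9k in the patch.
 * Returns 0 and lowers *fc_thresh, or -ENOSPC (leaving *fc_thresh
 * untouched) when the queue is too shallow.
 */
static int
fc_thresh_reserve(uint32_t *fc_thresh, uint32_t nb_cores,
		  uint32_t per_core_burst)
{
	uint32_t nb_desc_min = nb_cores * per_core_burst;

	if (*fc_thresh < nb_desc_min)
		return -ENOSPC;

	*fc_thresh -= nb_desc_min;
	return 0;
}
```

As a worked case, assuming the default depth of 1024 descriptors and 8 lcores on cn10k: the base threshold is 1024 - 32 = 992, and configuring the adapter lowers it by 8 * 32 = 256 to 736.
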
Signed-off-by: Anoob Joseph
---
 drivers/common/cnxk/hw/cpt.h              |  9 ++++++++
 drivers/common/cnxk/roc_cpt.c             | 10 ++++-----
 drivers/common/cnxk/roc_cpt.h             | 12 +---------
 drivers/crypto/cnxk/cn10k_cryptodev_ops.c | 20 ++++++++++++++++-
 drivers/crypto/cnxk/cn9k_cryptodev_ops.c  |  9 +++++++-
 drivers/event/cnxk/cnxk_eventdev.c        | 27 +++++++++++++++++++++++
 6 files changed, 69 insertions(+), 18 deletions(-)

diff --git a/drivers/common/cnxk/hw/cpt.h b/drivers/common/cnxk/hw/cpt.h
index 8fbba2c2a4..3c87a0d1e4 100644
--- a/drivers/common/cnxk/hw/cpt.h
+++ b/drivers/common/cnxk/hw/cpt.h
@@ -322,4 +322,13 @@ struct cpt_frag_info_s {
 	} w1;
 };
 
+union cpt_fc_write_s {
+	struct {
+		uint32_t qsize;
+		uint32_t reserved_32_63;
+		uint64_t reserved_64_127;
+	} s;
+	uint64_t u64[2];
+};
+
 #endif /* __CPT_HW_H__ */
diff --git a/drivers/common/cnxk/roc_cpt.c b/drivers/common/cnxk/roc_cpt.c
index 742723ad1d..dee093db61 100644
--- a/drivers/common/cnxk/roc_cpt.c
+++ b/drivers/common/cnxk/roc_cpt.c
@@ -21,8 +21,9 @@
 #define CPT_IQ_GRP_SIZE(nb_desc) \
 	(CPT_IQ_NB_DESC_SIZE_DIV40(nb_desc) * CPT_IQ_GRP_LEN)
 
-#define CPT_LF_MAX_NB_DESC     128000
-#define CPT_LF_DEFAULT_NB_DESC 1024
+#define CPT_LF_MAX_NB_DESC      128000
+#define CPT_LF_DEFAULT_NB_DESC  1024
+#define CPT_LF_FC_MIN_THRESHOLD 32
 
 static void
 cpt_lf_misc_intr_enb_dis(struct roc_cpt_lf *lf, bool enb)
@@ -474,8 +475,6 @@ cpt_iq_init(struct roc_cpt_lf *lf)
 	plt_write64(lf_q_size.u, lf->rbase + CPT_LF_Q_SIZE);
 
 	lf->fc_addr = (uint64_t *)addr;
-	lf->fc_hyst_bits = plt_log2_u32(lf->nb_desc) / 2;
-	lf->fc_thresh = lf->nb_desc - (lf->nb_desc % (1 << lf->fc_hyst_bits));
 }
 
 int
@@ -879,7 +878,7 @@ roc_cpt_iq_enable(struct roc_cpt_lf *lf)
 	lf_ctl.s.ena = 1;
 	lf_ctl.s.fc_ena = 1;
 	lf_ctl.s.fc_up_crossing = 0;
-	lf_ctl.s.fc_hyst_bits = lf->fc_hyst_bits;
+	lf_ctl.s.fc_hyst_bits = plt_log2_u32(CPT_LF_FC_MIN_THRESHOLD);
 	plt_write64(lf_ctl.u, lf->rbase + CPT_LF_CTL);
 
 	/* Enable command queue execution */
@@ -906,6 +905,7 @@ roc_cpt_lmtline_init(struct roc_cpt *roc_cpt, struct roc_cpt_lmtline *lmtline,
 
 	lmtline->fc_addr = lf->fc_addr;
 	lmtline->lmt_base = lf->lmt_base;
+	lmtline->fc_thresh = lf->nb_desc - CPT_LF_FC_MIN_THRESHOLD;
 
 	return 0;
 }
diff --git a/drivers/common/cnxk/roc_cpt.h b/drivers/common/cnxk/roc_cpt.h
index 99cb8b2862..1824a9ce6b 100644
--- a/drivers/common/cnxk/roc_cpt.h
+++ b/drivers/common/cnxk/roc_cpt.h
@@ -99,6 +99,7 @@ struct roc_cpt_lmtline {
 	uint64_t io_addr;
 	uint64_t *fc_addr;
 	uintptr_t lmt_base;
+	uint32_t fc_thresh;
 };
 
 struct roc_cpt_lf {
@@ -114,8 +115,6 @@ struct roc_cpt_lf {
 	uint16_t msixoff;
 	uint16_t pf_func;
 	uint64_t *fc_addr;
-	uint32_t fc_hyst_bits;
-	uint64_t fc_thresh;
 	uint64_t io_addr;
 	uint8_t *iq_vaddr;
 	struct roc_nix *inl_outb_nix;
@@ -144,15 +143,6 @@ struct roc_cpt_rxc_time_cfg {
 	uint16_t zombie_thres;
 };
 
-static inline int
-roc_cpt_is_iq_full(struct roc_cpt_lf *lf)
-{
-	if (*lf->fc_addr < lf->fc_thresh)
-		return 0;
-
-	return 1;
-}
-
 int __roc_api roc_cpt_rxc_time_cfg(struct roc_cpt *roc_cpt,
 				   struct roc_cpt_rxc_time_cfg *cfg);
 int __roc_api roc_cpt_dev_init(struct roc_cpt *roc_cpt);
diff --git a/drivers/crypto/cnxk/cn10k_cryptodev_ops.c b/drivers/crypto/cnxk/cn10k_cryptodev_ops.c
index 869fde0176..f761ba36e2 100644
--- a/drivers/crypto/cnxk/cn10k_cryptodev_ops.c
+++ b/drivers/crypto/cnxk/cn10k_cryptodev_ops.c
@@ -194,6 +194,8 @@ cn10k_cpt_enqueue_burst(void *qptr, struct rte_crypto_op **ops, uint16_t nb_ops)
 	struct cnxk_cpt_qp *qp = qptr;
 	struct pending_queue *pend_q;
 	struct cpt_inst_s *inst;
+	union cpt_fc_write_s fc;
+	uint64_t *fc_addr;
 	uint16_t lmt_id;
 	uint64_t head;
 	int ret, i;
@@ -211,11 +213,20 @@ cn10k_cpt_enqueue_burst(void *qptr, struct rte_crypto_op **ops, uint16_t nb_ops)
 
 	lmt_base = qp->lmtline.lmt_base;
 	io_addr = qp->lmtline.io_addr;
+	fc_addr = qp->lmtline.fc_addr;
+
+	const uint32_t fc_thresh = qp->lmtline.fc_thresh;
 
 	ROC_LMT_BASE_ID_GET(lmt_base, lmt_id);
 	inst = (struct cpt_inst_s *)lmt_base;
 
 again:
+	fc.u64[0] = __atomic_load_n(fc_addr, __ATOMIC_RELAXED);
+	if (unlikely(fc.s.qsize > fc_thresh)) {
+		i = 0;
+		goto pend_q_commit;
+	}
+
 	for (i = 0; i < RTE_MIN(PKTS_PER_LOOP, nb_ops); i++) {
 		infl_req = &pend_q->req_queue[head];
 		infl_req->op_flags = 0;
@@ -386,7 +397,9 @@ cn10k_cpt_crypto_adapter_enqueue(uintptr_t base, struct rte_crypto_op *op)
 	struct cpt_inflight_req *infl_req;
 	uint64_t lmt_base, lmt_arg, w2;
 	struct cpt_inst_s *inst;
+	union cpt_fc_write_s fc;
 	struct cnxk_cpt_qp *qp;
+	uint64_t *fc_addr;
 	uint16_t lmt_id;
 	int ret;
 
@@ -408,6 +421,10 @@ cn10k_cpt_crypto_adapter_enqueue(uintptr_t base, struct rte_crypto_op *op)
 	infl_req->op_flags = 0;
 
 	lmt_base = qp->lmtline.lmt_base;
+	fc_addr = qp->lmtline.fc_addr;
+
+	const uint32_t fc_thresh = qp->lmtline.fc_thresh;
+
 	ROC_LMT_BASE_ID_GET(lmt_base, lmt_id);
 	inst = (struct cpt_inst_s *)lmt_base;
 
@@ -426,7 +443,8 @@ cn10k_cpt_crypto_adapter_enqueue(uintptr_t base, struct rte_crypto_op *op)
 	inst->w2.u64 = w2;
 	inst->w3.u64 = CNXK_CPT_INST_W3(1, infl_req);
 
-	if (roc_cpt_is_iq_full(&qp->lf)) {
+	fc.u64[0] = __atomic_load_n(fc_addr, __ATOMIC_RELAXED);
+	if (unlikely(fc.s.qsize > fc_thresh)) {
 		rte_mempool_put(qp->ca.req_mp, infl_req);
 		rte_errno = EAGAIN;
 		return 0;
diff --git a/drivers/crypto/cnxk/cn9k_cryptodev_ops.c b/drivers/crypto/cnxk/cn9k_cryptodev_ops.c
index eccaf398df..9752ab42ef 100644
--- a/drivers/crypto/cnxk/cn9k_cryptodev_ops.c
+++ b/drivers/crypto/cnxk/cn9k_cryptodev_ops.c
@@ -436,8 +436,10 @@ uint16_t
 cn9k_cpt_crypto_adapter_enqueue(uintptr_t base, struct rte_crypto_op *op)
 {
 	struct cpt_inflight_req *infl_req;
+	union cpt_fc_write_s fc;
 	struct cnxk_cpt_qp *qp;
 	struct cpt_inst_s inst;
+	uint64_t *fc_addr;
 	int ret;
 
 	ret = cn9k_ca_meta_info_extract(op, &qp, &inst);
@@ -471,7 +473,12 @@ cn9k_cpt_crypto_adapter_enqueue(uintptr_t base, struct rte_crypto_op *op)
 	inst.res_addr = (uint64_t)&infl_req->res;
 	inst.w3.u64 = CNXK_CPT_INST_W3(1, infl_req);
 
-	if (roc_cpt_is_iq_full(&qp->lf)) {
+	fc_addr = qp->lmtline.fc_addr;
+
+	const uint32_t fc_thresh = qp->lmtline.fc_thresh;
+
+	fc.u64[0] = __atomic_load_n(fc_addr, __ATOMIC_RELAXED);
+	if (unlikely(fc.s.qsize > fc_thresh)) {
 		rte_mempool_put(qp->ca.req_mp, infl_req);
 		rte_errno = EAGAIN;
 		return 0;
diff --git a/drivers/event/cnxk/cnxk_eventdev.c b/drivers/event/cnxk/cnxk_eventdev.c
index b66f241ef8..a9e4201ed8 100644
--- a/drivers/event/cnxk/cnxk_eventdev.c
+++ b/drivers/event/cnxk/cnxk_eventdev.c
@@ -12,6 +12,24 @@ crypto_adapter_qp_setup(const struct rte_cryptodev *cdev,
 	char name[RTE_MEMPOOL_NAMESIZE];
 	uint32_t cache_size, nb_req;
 	unsigned int req_size;
+	uint32_t nb_desc_min;
+
+	/*
+	 * Update CPT FC threshold. Decrement by hardware burst size to allow
+	 * simultaneous enqueue from all available cores.
+	 */
+	if (roc_model_is_cn10k())
+		nb_desc_min = rte_lcore_count() * 32;
+	else
+		nb_desc_min = rte_lcore_count() * 2;
+
+	if (qp->lmtline.fc_thresh < nb_desc_min) {
+		plt_err("CPT queue depth not sufficient to allow enqueueing from %d cores",
+			rte_lcore_count());
+		return -ENOSPC;
+	}
+
+	qp->lmtline.fc_thresh -= nb_desc_min;
 
 	snprintf(name, RTE_MEMPOOL_NAMESIZE, "cnxk_ca_req_%u:%u",
 		 cdev->data->dev_id, qp->lf.lf_id);
@@ -69,9 +87,18 @@ cnxk_crypto_adapter_qp_add(const struct rte_eventdev *event_dev,
 static int
 crypto_adapter_qp_free(struct cnxk_cpt_qp *qp)
 {
+	int ret;
+
 	rte_mempool_free(qp->ca.req_mp);
 	qp->ca.enabled = false;
 
+	ret = roc_cpt_lmtline_init(qp->lf.roc_cpt, &qp->lmtline, qp->lf.lf_id);
+	if (ret < 0) {
+		plt_err("Could not reset lmtline for queue pair %d",
+			qp->lf.lf_id);
+		return ret;
+	}
+
 	return 0;
 }
 
-- 
2.25.1
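
The check that the patch adds to the enqueue paths can be sketched in isolation as follows. This is an illustration under stated assumptions, not the driver's code: `union fc_word` and `fc_queue_full` are hypothetical names, the layout copies `union cpt_fc_write_s` from the patch, and reading `qsize` out of the low 32 bits assumes a little-endian host. `__atomic_load_n` is the GCC/Clang builtin also used in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for union cpt_fc_write_s from the patch. */
union fc_word {
	struct {
		uint32_t qsize;           /* current queue occupancy */
		uint32_t reserved_32_63;
		uint64_t reserved_64_127;
	} s;
	uint64_t u64[2];
};

static_assert(sizeof(union fc_word) == 2 * sizeof(uint64_t),
	      "FC word is expected to be 128 bits");

/*
 * Returns true when the occupancy reported at fc_addr exceeds the
 * threshold, i.e. the caller should back off instead of submitting.
 */
static bool
fc_queue_full(uint64_t *fc_addr, uint32_t fc_thresh)
{
	union fc_word fc;

	fc.u64[1] = 0;
	/* A relaxed load suffices here: only the latest count is needed. */
	fc.u64[0] = __atomic_load_n(fc_addr, __ATOMIC_RELAXED);

	return fc.s.qsize > fc_thresh;
}
```

Unlike the removed `roc_cpt_is_iq_full()`, which compared the whole 64-bit word against the threshold, this form looks only at the 32-bit `qsize` field, so the reserved upper bits cannot influence the result.
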