To: Nithin Dabilpuram, Kiran Kumar K, Sunil Kumar Kori, Satha Rao, Harman Kalra, Vamsi Attunuru
Cc: Pavan Nikhilesh, Amit Prakash Shukla
Subject: [PATCH v2] dma/cnxk: add higher chunk size support
Date: Mon, 27 May 2024 17:54:01 +0530
Message-ID: <20240527122401.5954-1-pbhagavatula@marvell.com>
In-Reply-To: <20240522144520.1907-1-pbhagavatula@marvell.com>
References: <20240522144520.1907-1-pbhagavatula@marvell.com>
List-Id: DPDK patches and discussions

From: Pavan Nikhilesh

Add support for configuring a higher chunk size by using the new
OPEN_V2 mailbox command. This improves performance because fewer
mempool allocations are needed.

Add a timeout when polling for the queue idle state.

Signed-off-by: Pavan Nikhilesh
Signed-off-by: Amit Prakash Shukla
---
v2 Changes:
- Update release notes.
- Use timeout when polling for queue idle state.
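For readers who want to try the bounded-polling pattern outside the driver, here is a minimal, self-contained sketch. It is not the driver code. The names `wait_idle`, `reg_read_fn`, `fake_reg_countdown` and `fake_reg_never_idle` are hypothetical, and POSIX `clock_gettime()` stands in for the platform cycle counter. The deadline check uses `>=` so that a poll which overshoots the deadline still times out.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical register-read callback; stands in for a 64-bit MMIO read. */
typedef uint64_t (*reg_read_fn)(void *ctx);

/* The patch treats bit 63 of the polled register as the idle flag. */
#define IDLE_BIT (1ULL << 63)

/* Monotonic time in nanoseconds. */
static uint64_t
now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Poll until the idle bit is set or timeout_ms elapses. */
int
wait_idle(reg_read_fn read_reg, void *ctx, uint64_t timeout_ms)
{
	const uint64_t budget_ns = timeout_ms * 1000000ULL;
	const uint64_t start = now_ns();

	assert(read_reg != NULL);
	while (!(read_reg(ctx) & IDLE_BIT)) {
		/* Use >=: a free-running clock rarely lands on one exact value. */
		if (now_ns() - start >= budget_ns)
			return -ETIMEDOUT;
	}

	return 0;
}

/* Fake register for demonstration: reports idle after *ctx busy reads. */
uint64_t
fake_reg_countdown(void *ctx)
{
	unsigned int *remaining = ctx;

	if (*remaining == 0)
		return IDLE_BIT;
	(*remaining)--;
	return 0;
}

/* Fake register that never reports idle, to exercise the timeout path. */
uint64_t
fake_reg_never_idle(void *ctx)
{
	(void)ctx;
	return 0;
}
```

Calling `wait_idle(fake_reg_never_idle, NULL, 5)` exercises the timeout path, while `fake_reg_countdown` exercises the success path after a fixed number of busy reads.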
 doc/guides/rel_notes/release_24_07.rst |  6 +++
 drivers/common/cnxk/roc_dpi.c          | 72 ++++++++++++++++++++++----
 drivers/common/cnxk/roc_dpi.h          |  3 ++
 drivers/common/cnxk/roc_dpi_priv.h     |  3 ++
 drivers/common/cnxk/version.map        |  2 +
 drivers/dma/cnxk/cnxk_dmadev.c         | 37 ++++++++-----
 drivers/dma/cnxk/cnxk_dmadev.h         |  1 +
 7 files changed, 101 insertions(+), 23 deletions(-)

diff --git a/doc/guides/rel_notes/release_24_07.rst b/doc/guides/rel_notes/release_24_07.rst
index a69f24cf99..60b92e4842 100644
--- a/doc/guides/rel_notes/release_24_07.rst
+++ b/doc/guides/rel_notes/release_24_07.rst
@@ -55,6 +55,12 @@ New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+* **Updated Marvell CNXK DMA driver.**
+
+  * Updated DMA driver internal pool to use higher chunk size, effectively
+    reducing the number of mempool allocs needed, thereby increasing DMA
+    performance.
+
 
 Removed Items
 -------------
diff --git a/drivers/common/cnxk/roc_dpi.c b/drivers/common/cnxk/roc_dpi.c
index 1ee777d779..892685d185 100644
--- a/drivers/common/cnxk/roc_dpi.c
+++ b/drivers/common/cnxk/roc_dpi.c
@@ -38,6 +38,24 @@ send_msg_to_pf(struct plt_pci_addr *pci_addr, const char *value, int size)
 	return 0;
 }
 
+int
+roc_dpi_wait_queue_idle(struct roc_dpi *roc_dpi)
+{
+	const uint64_t cyc = (DPI_QUEUE_IDLE_TMO_MS * plt_tsc_hz()) / 1E3;
+	const uint64_t start = plt_tsc_cycles();
+	uint64_t reg;
+
+	/* Wait for SADDR to become idle */
+	reg = plt_read64(roc_dpi->rbase + DPI_VDMA_SADDR);
+	while (!(reg & BIT_ULL(63))) {
+		reg = plt_read64(roc_dpi->rbase + DPI_VDMA_SADDR);
+		if (plt_tsc_cycles() - start >= cyc)
+			return -ETIMEDOUT;
+	}
+
+	return 0;
+}
+
 int
 roc_dpi_enable(struct roc_dpi *dpi)
 {
@@ -57,7 +75,6 @@ roc_dpi_configure(struct roc_dpi *roc_dpi, uint32_t chunk_sz, uint64_t aura, uin
 {
 	struct plt_pci_device *pci_dev;
 	dpi_mbox_msg_t mbox_msg;
-	uint64_t reg;
 	int rc;
 
 	if (!roc_dpi) {
@@ -68,9 +85,9 @@ roc_dpi_configure(struct roc_dpi *roc_dpi, uint32_t chunk_sz, uint64_t aura, uin
 	pci_dev = roc_dpi->pci_dev;
 
 	roc_dpi_disable(roc_dpi);
-	reg = plt_read64(roc_dpi->rbase + DPI_VDMA_SADDR);
-	while (!(reg & BIT_ULL(63)))
-		reg = plt_read64(roc_dpi->rbase + DPI_VDMA_SADDR);
+	rc = roc_dpi_wait_queue_idle(roc_dpi);
+	if (rc)
+		return rc;
 
 	plt_write64(0x0, roc_dpi->rbase + DPI_VDMA_REQQ_CTL);
 	plt_write64(chunk_base, roc_dpi->rbase + DPI_VDMA_SADDR);
@@ -87,6 +104,45 @@ roc_dpi_configure(struct roc_dpi *roc_dpi, uint32_t chunk_sz, uint64_t aura, uin
 	if (mbox_msg.s.wqecsoff)
 		mbox_msg.s.wqecs = 1;
 
+	rc = send_msg_to_pf(&pci_dev->addr, (const char *)&mbox_msg, sizeof(dpi_mbox_msg_t));
+	if (rc < 0)
+		plt_err("Failed to send mbox message %d to DPI PF, err %d", mbox_msg.s.cmd, rc);
+
+	return rc;
+}
+
+int
+roc_dpi_configure_v2(struct roc_dpi *roc_dpi, uint32_t chunk_sz, uint64_t aura, uint64_t chunk_base)
+{
+	struct plt_pci_device *pci_dev;
+	dpi_mbox_msg_t mbox_msg;
+	int rc;
+
+	if (!roc_dpi) {
+		plt_err("roc_dpi is NULL");
+		return -EINVAL;
+	}
+
+	pci_dev = roc_dpi->pci_dev;
+
+	roc_dpi_disable(roc_dpi);
+
+	rc = roc_dpi_wait_queue_idle(roc_dpi);
+	if (rc)
+		return rc;
+
+	plt_write64(0x0, roc_dpi->rbase + DPI_VDMA_REQQ_CTL);
+	plt_write64(chunk_base, roc_dpi->rbase + DPI_VDMA_SADDR);
+	mbox_msg.u[0] = 0;
+	mbox_msg.u[1] = 0;
+	/* DPI PF driver expects vfid starts from index 0 */
+	mbox_msg.s.vfid = roc_dpi->vfid;
+	mbox_msg.s.cmd = DPI_QUEUE_OPEN_V2;
+	mbox_msg.s.csize = chunk_sz / 8;
+	mbox_msg.s.aura = aura;
+	mbox_msg.s.sso_pf_func = idev_sso_pffunc_get();
+	mbox_msg.s.npa_pf_func = idev_npa_pffunc_get();
+
 	rc = send_msg_to_pf(&pci_dev->addr, (const char *)&mbox_msg,
 			    sizeof(dpi_mbox_msg_t));
 	if (rc < 0)
@@ -116,13 +172,11 @@ roc_dpi_dev_fini(struct roc_dpi *roc_dpi)
 {
 	struct plt_pci_device *pci_dev = roc_dpi->pci_dev;
 	dpi_mbox_msg_t mbox_msg;
-	uint64_t reg;
 	int rc;
 
-	/* Wait for SADDR to become idle */
-	reg = plt_read64(roc_dpi->rbase + DPI_VDMA_SADDR);
-	while (!(reg & BIT_ULL(63)))
-		reg = plt_read64(roc_dpi->rbase + DPI_VDMA_SADDR);
+	rc = roc_dpi_wait_queue_idle(roc_dpi);
+	if (rc)
+		return rc;
 
 	mbox_msg.u[0] = 0;
 	mbox_msg.u[1] = 0;
diff --git a/drivers/common/cnxk/roc_dpi.h b/drivers/common/cnxk/roc_dpi.h
index 978e2badb2..7b4f9d4f4f 100644
--- a/drivers/common/cnxk/roc_dpi.h
+++ b/drivers/common/cnxk/roc_dpi.h
@@ -16,7 +16,10 @@ int __roc_api roc_dpi_dev_fini(struct roc_dpi *roc_dpi);
 
 int __roc_api roc_dpi_configure(struct roc_dpi *dpi, uint32_t chunk_sz, uint64_t aura,
 				uint64_t chunk_base);
+int __roc_api roc_dpi_configure_v2(struct roc_dpi *roc_dpi, uint32_t chunk_sz, uint64_t aura,
+				   uint64_t chunk_base);
 int __roc_api roc_dpi_enable(struct roc_dpi *dpi);
+int __roc_api roc_dpi_wait_queue_idle(struct roc_dpi *dpi);
 int __roc_api roc_dpi_disable(struct roc_dpi *dpi);
 
 #endif
diff --git a/drivers/common/cnxk/roc_dpi_priv.h b/drivers/common/cnxk/roc_dpi_priv.h
index 52962c8bc0..844e5f37ee 100644
--- a/drivers/common/cnxk/roc_dpi_priv.h
+++ b/drivers/common/cnxk/roc_dpi_priv.h
@@ -15,6 +15,9 @@
 #define DPI_QUEUE_CLOSE 0x2
 #define DPI_REG_DUMP 0x3
 #define DPI_GET_REG_CFG 0x4
+#define DPI_QUEUE_OPEN_V2 0x5
+
+#define DPI_QUEUE_IDLE_TMO_MS 1E3
 
 typedef union dpi_mbox_msg_t {
 	uint64_t u[2];
diff --git a/drivers/common/cnxk/version.map b/drivers/common/cnxk/version.map
index 424ad7f484..cc9f47e0ad 100644
--- a/drivers/common/cnxk/version.map
+++ b/drivers/common/cnxk/version.map
@@ -82,10 +82,12 @@ INTERNAL {
 	roc_cpt_int_misc_cb_register;
 	roc_cpt_int_misc_cb_unregister;
 	roc_dpi_configure;
+	roc_dpi_configure_v2;
 	roc_dpi_dev_fini;
 	roc_dpi_dev_init;
 	roc_dpi_disable;
 	roc_dpi_enable;
+	roc_dpi_wait_queue_idle;
 	roc_error_msg_get;
 	roc_eswitch_nix_process_repte_notify_cb_register;
 	roc_eswitch_nix_process_repte_notify_cb_unregister;
diff --git a/drivers/dma/cnxk/cnxk_dmadev.c b/drivers/dma/cnxk/cnxk_dmadev.c
index 4ab3cfbdf2..2de0a0a3ce 100644
--- a/drivers/dma/cnxk/cnxk_dmadev.c
+++ b/drivers/dma/cnxk/cnxk_dmadev.c
@@ -291,6 +291,7 @@ cnxk_dmadev_start(struct rte_dma_dev *dev)
 	struct cnxk_dpi_vf_s *dpivf = dev->fp_obj->dev_private;
 	struct cnxk_dpi_conf *dpi_conf;
 	uint32_t chunks, nb_desc = 0;
+	uint32_t queue_buf_sz;
 	int i, j, rc = 0;
 	void *chunk;
 
@@ -310,34 +311,44 @@ cnxk_dmadev_start(struct rte_dma_dev *dev)
 		dpi_conf->completed_offset = 0;
 	}
 
-	chunks = CNXK_DPI_CHUNKS_FROM_DESC(CNXK_DPI_QUEUE_BUF_SIZE, nb_desc);
-	rc = cnxk_dmadev_chunk_pool_create(dev, chunks, CNXK_DPI_QUEUE_BUF_SIZE);
+	queue_buf_sz = CNXK_DPI_QUEUE_BUF_SIZE_V2;
+	/* Max block size allowed by cnxk mempool driver is (128 * 1024).
+	 * Block size = elt_size + mp->header + mp->trailer.
+	 *
+	 * Note from cn9k mempool driver:
+	 * In cn9k additional padding of 128 bytes is added to mempool->trailer to
+	 * ensure that the element size always occupies odd number of cachelines
+	 * to ensure even distribution of elements among L1D cache sets.
+	 */
+	if (!roc_model_is_cn10k())
+		queue_buf_sz = CNXK_DPI_QUEUE_BUF_SIZE_V2 - 128;
+
+	chunks = CNXK_DPI_CHUNKS_FROM_DESC(queue_buf_sz, nb_desc);
+	rc = cnxk_dmadev_chunk_pool_create(dev, chunks, queue_buf_sz);
 	if (rc < 0) {
 		plt_err("DMA pool configure failed err = %d", rc);
-		goto done;
+		goto error;
 	}
 
 	rc = rte_mempool_get(dpivf->chunk_pool, &chunk);
 	if (rc < 0) {
 		plt_err("DMA failed to get chunk pointer err = %d", rc);
 		rte_mempool_free(dpivf->chunk_pool);
-		goto done;
+		goto error;
 	}
 
-	rc = roc_dpi_configure(&dpivf->rdpi, CNXK_DPI_QUEUE_BUF_SIZE, dpivf->aura, (uint64_t)chunk);
+	rc = roc_dpi_configure_v2(&dpivf->rdpi, queue_buf_sz, dpivf->aura, (uint64_t)chunk);
 	if (rc < 0) {
 		plt_err("DMA configure failed err = %d", rc);
 		rte_mempool_free(dpivf->chunk_pool);
-		goto done;
+		goto error;
 	}
-
 	dpivf->chunk_base = chunk;
 	dpivf->chunk_head = 0;
-	dpivf->chunk_size_m1 = (CNXK_DPI_QUEUE_BUF_SIZE >> 3) - 2;
+	dpivf->chunk_size_m1 = (queue_buf_sz >> 3) - 2;
 
 	roc_dpi_enable(&dpivf->rdpi);
-
-done:
+error:
 	return rc;
 }
 
@@ -345,11 +356,9 @@ static int
 cnxk_dmadev_stop(struct rte_dma_dev *dev)
 {
 	struct cnxk_dpi_vf_s *dpivf = dev->fp_obj->dev_private;
-	uint64_t reg;
 
-	reg = plt_read64(dpivf->rdpi.rbase + DPI_VDMA_SADDR);
-	while (!(reg & BIT_ULL(63)))
-		reg = plt_read64(dpivf->rdpi.rbase + DPI_VDMA_SADDR);
+	if (roc_dpi_wait_queue_idle(&dpivf->rdpi))
+		return -EAGAIN;
 
 	roc_dpi_disable(&dpivf->rdpi);
 	rte_mempool_free(dpivf->chunk_pool);
diff --git a/drivers/dma/cnxk/cnxk_dmadev.h b/drivers/dma/cnxk/cnxk_dmadev.h
index 610a360ba2..3d8f875ada 100644
--- a/drivers/dma/cnxk/cnxk_dmadev.h
+++ b/drivers/dma/cnxk/cnxk_dmadev.h
@@ -30,6 +30,7 @@
 #define CNXK_DPI_MIN_DESC 2
 #define CNXK_DPI_MAX_VCHANS_PER_QUEUE 4
 #define CNXK_DPI_QUEUE_BUF_SIZE 16256
+#define CNXK_DPI_QUEUE_BUF_SIZE_V2 130944
 #define CNXK_DPI_POOL_MAX_CACHE_SZ (16)
 #define CNXK_DPI_DW_PER_SINGLE_CMD 8
 #define CNXK_DPI_HDR_LEN 4
-- 
2.25.1
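
The chunk-size arithmetic in this patch can be checked in isolation. The sketch below only restates the constants and expressions that appear in the diff: the 128 * 1024 block limit from the comment, the 128-byte reduction on non-cn10k models, `csize = chunk_sz / 8`, and `chunk_size_m1 = (size >> 3) - 2`. The helper names (`queue_buf_size`, `csize_words`, `chunk_size_m1`) and the shortened macro names are hypothetical and do not exist in the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Constants copied from the patch; the macro names are shortened here. */
#define MAX_BLOCK_SZ      (128u * 1024u) /* mempool block limit from the comment */
#define QUEUE_BUF_SIZE    16256u         /* chunk size used before this patch */
#define QUEUE_BUF_SIZE_V2 130944u        /* chunk size used with OPEN_V2 */
#define CN9K_TRAILER_PAD  128u           /* extra trailer padding noted for cn9k */

/* Chunk size picked at start time: V2 size, minus the padding off cn10k. */
uint32_t
queue_buf_size(int is_cn10k)
{
	return is_cn10k ? QUEUE_BUF_SIZE_V2 : QUEUE_BUF_SIZE_V2 - CN9K_TRAILER_PAD;
}

/* Value written to the mailbox csize field: chunk size in 8-byte words. */
uint32_t
csize_words(uint32_t chunk_sz)
{
	return chunk_sz / 8;
}

/* Value stored in chunk_size_m1, exactly as computed in the patch. */
uint32_t
chunk_size_m1(uint32_t chunk_sz)
{
	return (chunk_sz >> 3) - 2;
}
```

With these numbers, the V2 size sits 128 bytes below the 128 KiB limit, and each chunk holds roughly eight times as many 8-byte words as the old 16256-byte chunk.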