From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 14A2E41B9D; Wed, 1 Feb 2023 10:26:55 +0100 (CET) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 2C41A43024; Wed, 1 Feb 2023 10:23:58 +0100 (CET) Received: from mx0b-0016f401.pphosted.com (mx0a-0016f401.pphosted.com [67.231.148.174]) by mails.dpdk.org (Postfix) with ESMTP id 3790742D75 for ; Wed, 1 Feb 2023 10:23:27 +0100 (CET) Received: from pps.filterd (m0045849.ppops.net [127.0.0.1]) by mx0a-0016f401.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 3116LbgX026195 for ; Wed, 1 Feb 2023 01:23:26 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=marvell.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-type; s=pfpt0220; bh=uYy+ehBKF17fkJZV2ciaRw0t8ol1+ZfDYlR6MWYfBDc=; b=LTqSwjUjbUvdQIlAqxAkOJiBCxuM3m8UaJ2D4K5j+DGsm8Y5BB9d21535F2v31FfS/x+ b9ZI+ErFd3fnx2bc0AzP+h8UjHNTAyM40t2W6Cf+T7Y1wIMXuNvEZiSykf561A1WEDdz Z3FcJ7xL5s23CyyQ1+tFRIBHSC7dckplPn6SapggS75WRjajSkZ06rbA3s0guMo+MaT7 fTzO4nayLwZXO/Z+b7wK/O3aILz1d1FL6QTWFUFuEODPorZSdhsf/evOQx5BIEMY9YsS rZER+7B1yoU4U/ytE2r83OasTmlDlaTdiceMbqRYG8Ke73Ypc5tRD2Zm8j9pyfZEzazi Bw== Received: from dc5-exch01.marvell.com ([199.233.59.181]) by mx0a-0016f401.pphosted.com (PPS) with ESMTPS id 3nfjr8rgv8-2 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-SHA384 bits=256 verify=NOT) for ; Wed, 01 Feb 2023 01:23:26 -0800 Received: from DC5-EXCH01.marvell.com (10.69.176.38) by DC5-EXCH01.marvell.com (10.69.176.38) with Microsoft SMTP Server (TLS) id 15.0.1497.42; Wed, 1 Feb 2023 01:23:24 -0800 Received: from maili.marvell.com (10.69.176.80) by DC5-EXCH01.marvell.com (10.69.176.38) with Microsoft SMTP Server id 15.0.1497.42 via Frontend Transport; Wed, 1 Feb 2023 01:23:24 -0800 Received: from ml-host-33.caveonetworks.com (unknown [10.110.143.233]) by maili.marvell.com (Postfix) with ESMTP id 165D23F7059; Wed, 1 Feb 2023 01:23:24 -0800 (PST) From: Srikanth Yalavarthi To: Srikanth Yalavarthi CC: , , , Subject: [PATCH v4 36/39] ml/cnxk: add support to use lock during jcmd enq Date: Wed, 1 Feb 2023 01:23:07 -0800 Message-ID: <20230201092310.23252-37-syalavarthi@marvell.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20230201092310.23252-1-syalavarthi@marvell.com> References: <20221208200220.20267-1-syalavarthi@marvell.com> <20230201092310.23252-1-syalavarthi@marvell.com> MIME-Version: 1.0 Content-Type: text/plain X-Proofpoint-GUID: cdBtumfiyoHy1-XtfllIxAWLVQl-Vkfj X-Proofpoint-ORIG-GUID: cdBtumfiyoHy1-XtfllIxAWLVQl-Vkfj X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.219,Aquarius:18.0.930,Hydra:6.0.562,FMLib:17.11.122.1 definitions=2023-02-01_03,2023-01-31_01,2022-06-22_01 X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Added device argument "hw_queue_lock" to select the JCMDQ enqueue ROC function to be used in fast path. hw_queue_lock: 0: Disable, use lock free version of JCMDQ enqueue ROC function for job queuing. To avoid race condition in request queuing to hardware, disabling hw_queue_lock restricts the number of queue-pairs supported by cnxk driver to 1. 1: Enable, (default) use spin-lock version of JCMDQ enqueue ROC function for job queuing. Enabling spinlock version would disable restrictions on the number of queue-pairs that can be created. Signed-off-by: Srikanth Yalavarthi --- drivers/ml/cnxk/cn10k_ml_dev.c | 31 ++++++++++++++++++++++++++++++- drivers/ml/cnxk/cn10k_ml_dev.h | 13 +++++++++++-- drivers/ml/cnxk/cn10k_ml_ops.c | 20 +++++++++++++++++--- 3 files changed, 58 insertions(+), 6 deletions(-) diff --git a/drivers/ml/cnxk/cn10k_ml_dev.c b/drivers/ml/cnxk/cn10k_ml_dev.c index 5c02d67c8e..aa503b2691 100644 --- a/drivers/ml/cnxk/cn10k_ml_dev.c +++ b/drivers/ml/cnxk/cn10k_ml_dev.c @@ -22,12 +22,14 @@ #define CN10K_ML_FW_REPORT_DPE_WARNINGS "report_dpe_warnings" #define CN10K_ML_DEV_CACHE_MODEL_DATA "cache_model_data" #define CN10K_ML_OCM_ALLOC_MODE "ocm_alloc_mode" +#define CN10K_ML_DEV_HW_QUEUE_LOCK "hw_queue_lock" #define CN10K_ML_FW_PATH_DEFAULT "/lib/firmware/mlip-fw.bin" #define CN10K_ML_FW_ENABLE_DPE_WARNINGS_DEFAULT 1 #define CN10K_ML_FW_REPORT_DPE_WARNINGS_DEFAULT 0 #define CN10K_ML_DEV_CACHE_MODEL_DATA_DEFAULT 1 #define CN10K_ML_OCM_ALLOC_MODE_DEFAULT "lowest" +#define CN10K_ML_DEV_HW_QUEUE_LOCK_DEFAULT 1 /* ML firmware macros */ #define FW_MEMZONE_NAME "ml_cn10k_fw_mz" @@ -46,6 +48,7 @@ static const char *const valid_args[] = {CN10K_ML_FW_PATH, CN10K_ML_FW_REPORT_DPE_WARNINGS, CN10K_ML_DEV_CACHE_MODEL_DATA, CN10K_ML_OCM_ALLOC_MODE, + CN10K_ML_DEV_HW_QUEUE_LOCK, NULL}; /* Dummy operations for ML device */ @@ -87,6 +90,7 @@ cn10k_mldev_parse_devargs(struct rte_devargs *devargs, struct cn10k_ml_dev *mlde bool cache_model_data_set = false; struct rte_kvargs *kvlist = NULL; bool ocm_alloc_mode_set = false; + bool hw_queue_lock_set = false; char *ocm_alloc_mode = NULL; bool fw_path_set = false; char *fw_path = NULL; @@ -158,6 +162,18 @@ cn10k_mldev_parse_devargs(struct rte_devargs *devargs, struct cn10k_ml_dev *mlde ocm_alloc_mode_set = true; } + if (rte_kvargs_count(kvlist, CN10K_ML_DEV_HW_QUEUE_LOCK) == 1) { + ret = rte_kvargs_process(kvlist, CN10K_ML_DEV_HW_QUEUE_LOCK, &parse_integer_arg, + &mldev->hw_queue_lock); + if (ret < 0) { + plt_err("Error processing arguments, key = %s\n", + CN10K_ML_DEV_HW_QUEUE_LOCK); + ret = -EINVAL; + goto exit; + } + hw_queue_lock_set = true; + } + check_args: if (!fw_path_set) mldev->fw.path = CN10K_ML_FW_PATH_DEFAULT; @@ -215,6 +231,18 @@ cn10k_mldev_parse_devargs(struct rte_devargs *devargs, struct cn10k_ml_dev *mlde } plt_info("ML: %s = %s", CN10K_ML_OCM_ALLOC_MODE, mldev->ocm.alloc_mode); + if (!hw_queue_lock_set) { + mldev->hw_queue_lock = CN10K_ML_DEV_HW_QUEUE_LOCK_DEFAULT; + } else { + if ((mldev->hw_queue_lock < 0) || (mldev->hw_queue_lock > 1)) { + plt_err("Invalid argument, %s = %d\n", CN10K_ML_DEV_HW_QUEUE_LOCK, + mldev->hw_queue_lock); + ret = -EINVAL; + goto exit; + } + } + plt_info("ML: %s = %d", CN10K_ML_DEV_HW_QUEUE_LOCK, mldev->hw_queue_lock); + exit: if (kvlist) rte_kvargs_free(kvlist); @@ -756,4 +784,5 @@ RTE_PMD_REGISTER_PARAM_STRING(MLDEV_NAME_CN10K_PMD, CN10K_ML_FW_PATH "=" CN10K_ML_FW_ENABLE_DPE_WARNINGS "=<0|1>" CN10K_ML_FW_REPORT_DPE_WARNINGS "=<0|1>" CN10K_ML_DEV_CACHE_MODEL_DATA - "=<0|1>" CN10K_ML_OCM_ALLOC_MODE "="); + "=<0|1>" CN10K_ML_OCM_ALLOC_MODE + "=" CN10K_ML_DEV_HW_QUEUE_LOCK "=<0|1>"); diff --git a/drivers/ml/cnxk/cn10k_ml_dev.h b/drivers/ml/cnxk/cn10k_ml_dev.h index 718edadde7..49676ac9e7 100644 --- a/drivers/ml/cnxk/cn10k_ml_dev.h +++ b/drivers/ml/cnxk/cn10k_ml_dev.h @@ -21,8 +21,11 @@ /* Maximum number of models per device */ #define ML_CN10K_MAX_MODELS 16 -/* Maximum number of queue-pairs per device */ -#define ML_CN10K_MAX_QP_PER_DEVICE 1 +/* Maximum number of queue-pairs per device, spinlock version */ +#define ML_CN10K_MAX_QP_PER_DEVICE_SL 16 + +/* Maximum number of queue-pairs per device, lock-free version */ +#define ML_CN10K_MAX_QP_PER_DEVICE_LF 1 /* Maximum number of descriptors per queue-pair */ #define ML_CN10K_MAX_DESC_PER_QP 1024 @@ -384,6 +387,12 @@ struct cn10k_ml_dev { /* Enable / disable model data caching */ int cache_model_data; + + /* Use spinlock version of ROC enqueue */ + int hw_queue_lock; + + /* JCMD enqueue function handler */ + bool (*ml_jcmdq_enqueue)(struct roc_ml *roc_ml, struct ml_job_cmd_s *job_cmd); }; uint64_t cn10k_ml_fw_flags_get(struct cn10k_ml_fw *fw); diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c index c12259f44c..3c96db4514 100644 --- a/drivers/ml/cnxk/cn10k_ml_ops.c +++ b/drivers/ml/cnxk/cn10k_ml_ops.c @@ -534,13 +534,21 @@ cn10k_ml_cache_model_data(struct rte_ml_dev *dev, int16_t model_id) static int cn10k_ml_dev_info_get(struct rte_ml_dev *dev, struct rte_ml_dev_info *dev_info) { + struct cn10k_ml_dev *mldev; + if (dev_info == NULL) return -EINVAL; + mldev = dev->data->dev_private; + memset(dev_info, 0, sizeof(struct rte_ml_dev_info)); dev_info->driver_name = dev->device->driver->name; dev_info->max_models = ML_CN10K_MAX_MODELS; - dev_info->max_queue_pairs = ML_CN10K_MAX_QP_PER_DEVICE; + if (mldev->hw_queue_lock) + dev_info->max_queue_pairs = ML_CN10K_MAX_QP_PER_DEVICE_SL; + else + dev_info->max_queue_pairs = ML_CN10K_MAX_QP_PER_DEVICE_LF; + dev_info->max_desc = ML_CN10K_MAX_DESC_PER_QP; dev_info->max_segments = ML_CN10K_MAX_SEGMENTS; dev_info->min_align_size = ML_CN10K_ALIGN_SIZE; @@ -703,6 +711,12 @@ cn10k_ml_dev_configure(struct rte_ml_dev *dev, const struct rte_ml_dev_config *c else mldev->xstats_enabled = false; + /* Set JCMDQ enqueue function */ + if (mldev->hw_queue_lock == 1) + mldev->ml_jcmdq_enqueue = roc_ml_jcmdq_enqueue_sl; + else + mldev->ml_jcmdq_enqueue = roc_ml_jcmdq_enqueue_lf; + dev->enqueue_burst = cn10k_ml_enqueue_burst; dev->dequeue_burst = cn10k_ml_dequeue_burst; dev->op_error_get = cn10k_ml_op_error_get; @@ -1993,7 +2007,7 @@ cn10k_ml_enqueue_burst(struct rte_ml_dev *dev, uint16_t qp_id, struct rte_ml_op req->result.user_ptr = op->user_ptr; plt_write64(ML_CN10K_POLL_JOB_START, &req->status); - enqueued = roc_ml_jcmdq_enqueue_lf(&mldev->roc, &req->jcmd); + enqueued = mldev->ml_jcmdq_enqueue(&mldev->roc, &req->jcmd); if (unlikely(!enqueued)) goto jcmdq_full; @@ -2114,7 +2128,7 @@ cn10k_ml_inference_sync(struct rte_ml_dev *dev, struct rte_ml_op *op) timeout = true; req->timeout = plt_tsc_cycles() + ML_CN10K_CMD_TIMEOUT * plt_tsc_hz(); do { - if (roc_ml_jcmdq_enqueue_lf(&mldev->roc, &req->jcmd)) { + if (mldev->ml_jcmdq_enqueue(&mldev->roc, &req->jcmd)) { req->op = op; timeout = false; break; -- 2.17.1