From mboxrd@z Thu Jan  1 00:00:00 1970
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
To: Srikanth Yalavarthi <syalavarthi@marvell.com>
Cc: dev@dpdk.org
Subject: [PATCH v1 18/37] ml/cnxk: enable support to stop an ML model
Date: Thu, 8 Dec 2022 12:02:01 -0800
Message-ID: <20221208200220.20267-19-syalavarthi@marvell.com>
In-Reply-To: <20221208200220.20267-1-syalavarthi@marvell.com>
References: <20221208200220.20267-1-syalavarthi@marvell.com>
List-Id: DPDK patches and discussions <dev.dpdk.org>

Implemented the model stop driver function. A model stop job is enqueued
through scratch registers and is checked for completion by polling in
synchronous mode. OCM pages are released after the model stop completes.
Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
---
 drivers/ml/cnxk/cn10k_ml_ops.c | 115 ++++++++++++++++++++++++++++++++-
 drivers/ml/cnxk/cn10k_ml_ops.h |   1 +
 2 files changed, 114 insertions(+), 2 deletions(-)

diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c
index b74092e605..a0b0fc7e1f 100644
--- a/drivers/ml/cnxk/cn10k_ml_ops.c
+++ b/drivers/ml/cnxk/cn10k_ml_ops.c
@@ -295,10 +295,14 @@ cn10k_ml_dev_configure(struct rte_ml_dev *dev, const struct rte_ml_dev_config *c
 		/* Re-configure */
 		void **models;
 
-		/* Unload all models */
+		/* Stop and unload all models */
 		for (model_id = 0; model_id < dev->data->nb_models; model_id++) {
 			model = dev->data->models[model_id];
 			if (model != NULL) {
+				if (model->state == ML_CN10K_MODEL_STATE_STARTED) {
+					if (cn10k_ml_model_stop(dev, model_id) != 0)
+						plt_err("Could not stop model %d", model_id);
+				}
 				if (model->state == ML_CN10K_MODEL_STATE_LOADED) {
 					if (cn10k_ml_model_unload(dev, model_id) != 0)
 						plt_err("Could not unload model %d", model_id);
@@ -362,10 +366,14 @@ cn10k_ml_dev_close(struct rte_ml_dev *dev)
 	mldev = dev->data->dev_private;
 
-	/* Unload all models */
+	/* Stop and unload all models */
 	for (model_id = 0; model_id < dev->data->nb_models; model_id++) {
 		model = dev->data->models[model_id];
 		if (model != NULL) {
+			if (model->state == ML_CN10K_MODEL_STATE_STARTED) {
+				if (cn10k_ml_model_stop(dev, model_id) != 0)
+					plt_err("Could not stop model %d", model_id);
+			}
 			if (model->state == ML_CN10K_MODEL_STATE_LOADED) {
 				if (cn10k_ml_model_unload(dev, model_id) != 0)
 					plt_err("Could not unload model %d", model_id);
@@ -767,6 +775,108 @@ cn10k_ml_model_start(struct rte_ml_dev *dev, int16_t model_id)
 	return ret;
 }
 
+int
+cn10k_ml_model_stop(struct rte_ml_dev *dev, int16_t model_id)
+{
+	struct cn10k_ml_model *model;
+	struct cn10k_ml_dev *mldev;
+	struct cn10k_ml_ocm *ocm;
+	struct cn10k_ml_req *req;
+
+	bool job_enqueued;
+	bool job_dequeued;
+	bool locked;
+	int ret = 0;
+
+	mldev = dev->data->dev_private;
+	ocm = &mldev->ocm;
+	model = dev->data->models[model_id];
+
+	if (model == NULL) {
+		plt_err("Invalid model_id = %d", model_id);
+		return -EINVAL;
+	}
+
+	/* Prepare JD */
+	req = model->req;
+	cn10k_ml_prep_sp_job_descriptor(mldev, model, req, ML_CN10K_JOB_TYPE_MODEL_STOP);
+	req->result.error_code = 0x0;
+	req->result.user_ptr = NULL;
+
+	plt_write64(ML_CN10K_POLL_JOB_START, &req->status);
+	plt_wmb();
+
+	locked = false;
+	while (!locked) {
+		if (plt_spinlock_trylock(&model->lock) != 0) {
+			if (model->state == ML_CN10K_MODEL_STATE_LOADED) {
+				plt_ml_dbg("Model not started, model = 0x%016lx",
+					   PLT_U64_CAST(model));
+				plt_spinlock_unlock(&model->lock);
+				return 1;
+			}
+
+			if (model->state == ML_CN10K_MODEL_STATE_JOB_ACTIVE) {
+				plt_err("A slow-path job is active for the model = 0x%016lx",
+					PLT_U64_CAST(model));
+				plt_spinlock_unlock(&model->lock);
+				return -EBUSY;
+			}
+
+			model->state = ML_CN10K_MODEL_STATE_JOB_ACTIVE;
+			plt_spinlock_unlock(&model->lock);
+			locked = true;
+		}
+	}
+
+	while (model->model_mem_map.ocm_reserved) {
+		if (plt_spinlock_trylock(&ocm->lock) != 0) {
+			cn10k_ml_ocm_free_pages(dev, model->model_id);
+			model->model_mem_map.ocm_reserved = false;
+			model->model_mem_map.tilemask = 0x0;
+			plt_spinlock_unlock(&ocm->lock);
+		}
+	}
+
+	job_enqueued = false;
+	job_dequeued = false;
+	do {
+		if (!job_enqueued) {
+			req->timeout = plt_tsc_cycles() + ML_CN10K_CMD_TIMEOUT * plt_tsc_hz();
+			job_enqueued = roc_ml_scratch_enqueue(&mldev->roc, &req->jd);
+		}
+
+		if (job_enqueued && !job_dequeued)
+			job_dequeued = roc_ml_scratch_dequeue(&mldev->roc, &req->jd);
+
+		if (job_dequeued)
+			break;
+	} while (plt_tsc_cycles() < req->timeout);
+
+	if (job_dequeued) {
+		if (plt_read64(&req->status) == ML_CN10K_POLL_JOB_FINISH) {
+			if (req->result.error_code == 0x0)
+				ret = 0;
+			else
+				ret = -1;
+		}
+	} else {
+		roc_ml_scratch_queue_reset(&mldev->roc);
+		ret = -ETIME;
+	}
+
+	locked = false;
+	while (!locked) {
+		if (plt_spinlock_trylock(&model->lock) != 0) {
+			model->state = ML_CN10K_MODEL_STATE_LOADED;
+			plt_spinlock_unlock(&model->lock);
+			locked = true;
+		}
+	}
+
+	return ret;
+}
+
 struct rte_ml_dev_ops cn10k_ml_ops = {
 	/* Device control ops */
 	.dev_info_get = cn10k_ml_dev_info_get,
@@ -783,4 +893,5 @@ struct rte_ml_dev_ops cn10k_ml_ops = {
 	.model_load = cn10k_ml_model_load,
 	.model_unload = cn10k_ml_model_unload,
 	.model_start = cn10k_ml_model_start,
+	.model_stop = cn10k_ml_model_stop,
 };
 
diff --git a/drivers/ml/cnxk/cn10k_ml_ops.h b/drivers/ml/cnxk/cn10k_ml_ops.h
index 3fe3872fd1..5e7e42ee88 100644
--- a/drivers/ml/cnxk/cn10k_ml_ops.h
+++ b/drivers/ml/cnxk/cn10k_ml_ops.h
@@ -65,5 +65,6 @@ int cn10k_ml_model_load(struct rte_ml_dev *dev, struct rte_ml_model_params *para
 		       int16_t *model_id);
 int cn10k_ml_model_unload(struct rte_ml_dev *dev, int16_t model_id);
 int cn10k_ml_model_start(struct rte_ml_dev *dev, int16_t model_id);
+int cn10k_ml_model_stop(struct rte_ml_dev *dev, int16_t model_id);
 
 #endif /* _CN10K_ML_OPS_H_ */
-- 
2.17.1