From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dev-bounces@dpdk.org>
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124])
	by inbox.dpdk.org (Postfix) with ESMTP id 5A0D841D3D;
	Fri, 10 Mar 2023 09:22:25 +0100 (CET)
Received: from mails.dpdk.org (localhost [127.0.0.1])
	by mails.dpdk.org (Postfix) with ESMTP id BBD3A42D84;
	Fri, 10 Mar 2023 09:20:49 +0100 (CET)
Received: from mx0b-0016f401.pphosted.com (mx0b-0016f401.pphosted.com
 [67.231.156.173])
 by mails.dpdk.org (Postfix) with ESMTP id E5AA142B8E
 for <dev@dpdk.org>; Fri, 10 Mar 2023 09:20:29 +0100 (CET)
Received: from pps.filterd (m0045851.ppops.net [127.0.0.1])
 by mx0b-0016f401.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id
 32A7uIdA021875 for <dev@dpdk.org>; Fri, 10 Mar 2023 00:20:29 -0800
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=marvell.com;
 h=from : to : cc :
 subject : date : message-id : in-reply-to : references : mime-version :
 content-type; s=pfpt0220; bh=yIJAlsqHtFlVIXo/vDDUx61Y98G+kO8oSKtFSS2lnSY=;
 b=PCslzJHizYa9p1cjvNOCYmLCG6q/eE2JZrgyfQUNQ4FtcarLVPxseDk69/QmSyLup1hq
 pmxF1tncZ9KeUMeQEFDoeKFUs7mEXI6JEMI51uQzTB2T0Uz/Aqet0i3mU/HvS/uzIQoQ
 DH44ntPI045R+Q/A3f3YgelGnYBE9kfVBMsJ2/qPGwmNgZNPKKgyLohAB8GzYHrau/XQ
 l1P8/QDZ1O0nasBnifebmQxFU4A9TCgUivKR5HezdT1PX6VlABgWfy42YYZ83Z7v62W1
 Bu/GuEea3hL5Osv5J3gFeZxLvznCxMXm8X80u1xMCdhP0i5OIT+JUIFAGbw0j8UwJlNB sQ== 
Received: from dc5-exch01.marvell.com ([199.233.59.181])
 by mx0b-0016f401.pphosted.com (PPS) with ESMTPS id 3p80k782tf-9
 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-SHA384 bits=256 verify=NOT)
 for <dev@dpdk.org>; Fri, 10 Mar 2023 00:20:29 -0800
Received: from DC5-EXCH02.marvell.com (10.69.176.39) by DC5-EXCH01.marvell.com
 (10.69.176.38) with Microsoft SMTP Server (TLS) id 15.0.1497.42;
 Fri, 10 Mar 2023 00:20:23 -0800
Received: from maili.marvell.com (10.69.176.80) by DC5-EXCH02.marvell.com
 (10.69.176.39) with Microsoft SMTP Server id 15.0.1497.42 via Frontend
 Transport; Fri, 10 Mar 2023 00:20:23 -0800
Received: from ml-host-33.caveonetworks.com (unknown [10.110.143.233])
 by maili.marvell.com (Postfix) with ESMTP id 73A5D3F7088;
 Fri, 10 Mar 2023 00:20:21 -0800 (PST)
From: Srikanth Yalavarthi <syalavarthi@marvell.com>
To: Srikanth Yalavarthi <syalavarthi@marvell.com>
CC: <dev@dpdk.org>, <sshankarnara@marvell.com>, <jerinj@marvell.com>,
 <aprabhu@marvell.com>, <ptakkar@marvell.com>, <pshukla@marvell.com>
Subject: [PATCH v6 19/39] ml/cnxk: enable support to stop an ML models
Date: Fri, 10 Mar 2023 00:19:55 -0800
Message-ID: <20230310082015.20200-20-syalavarthi@marvell.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20230310082015.20200-1-syalavarthi@marvell.com>
References: <20221208200220.20267-1-syalavarthi@marvell.com>
 <20230310082015.20200-1-syalavarthi@marvell.com>
MIME-Version: 1.0
Content-Type: text/plain
X-Proofpoint-GUID: 9ozsgbFBnUufWvkHn42vrxcVuBE4cnjl
X-Proofpoint-ORIG-GUID: 9ozsgbFBnUufWvkHn42vrxcVuBE4cnjl
X-Proofpoint-Virus-Version: vendor=baseguard
 engine=ICAP:2.0.254,Aquarius:18.0.942,Hydra:6.0.573,FMLib:17.11.170.22
 definitions=2023-03-10_02,2023-03-09_01,2023-02-09_01
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org

Implemented model stop driver function. A model stop job is
enqueued through scratch registers and is checked for
completion through polling in a synchronous mode. OCM pages
are released after model stop completion.

Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
---
 drivers/ml/cnxk/cn10k_ml_ops.c | 115 ++++++++++++++++++++++++++++++++-
 drivers/ml/cnxk/cn10k_ml_ops.h |   1 +
 2 files changed, 114 insertions(+), 2 deletions(-)

diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c
index e8ce65b182..77d3728d8d 100644
--- a/drivers/ml/cnxk/cn10k_ml_ops.c
+++ b/drivers/ml/cnxk/cn10k_ml_ops.c
@@ -295,10 +295,14 @@ cn10k_ml_dev_configure(struct rte_ml_dev *dev, const struct rte_ml_dev_config *c
 		/* Re-configure */
 		void **models;
 
-		/* Unload all models */
+		/* Stop and unload all models */
 		for (model_id = 0; model_id < dev->data->nb_models; model_id++) {
 			model = dev->data->models[model_id];
 			if (model != NULL) {
+				if (model->state == ML_CN10K_MODEL_STATE_STARTED) {
+					if (cn10k_ml_model_stop(dev, model_id) != 0)
+						plt_err("Could not stop model %u", model_id);
+				}
 				if (model->state == ML_CN10K_MODEL_STATE_LOADED) {
 					if (cn10k_ml_model_unload(dev, model_id) != 0)
 						plt_err("Could not unload model %u", model_id);
@@ -362,10 +366,14 @@ cn10k_ml_dev_close(struct rte_ml_dev *dev)
 
 	mldev = dev->data->dev_private;
 
-	/* Unload all models */
+	/* Stop and unload all models */
 	for (model_id = 0; model_id < dev->data->nb_models; model_id++) {
 		model = dev->data->models[model_id];
 		if (model != NULL) {
+			if (model->state == ML_CN10K_MODEL_STATE_STARTED) {
+				if (cn10k_ml_model_stop(dev, model_id) != 0)
+					plt_err("Could not stop model %u", model_id);
+			}
 			if (model->state == ML_CN10K_MODEL_STATE_LOADED) {
 				if (cn10k_ml_model_unload(dev, model_id) != 0)
 					plt_err("Could not unload model %u", model_id);
@@ -767,6 +775,108 @@ cn10k_ml_model_start(struct rte_ml_dev *dev, uint16_t model_id)
 	return ret;
 }
 
+int
+cn10k_ml_model_stop(struct rte_ml_dev *dev, uint16_t model_id)
+{
+	struct cn10k_ml_model *model;
+	struct cn10k_ml_dev *mldev;
+	struct cn10k_ml_ocm *ocm;
+	struct cn10k_ml_req *req;
+
+	bool job_enqueued;
+	bool job_dequeued;
+	bool locked;
+	int ret = 0;
+
+	mldev = dev->data->dev_private;
+	ocm = &mldev->ocm;
+	model = dev->data->models[model_id];
+
+	if (model == NULL) {
+		plt_err("Invalid model_id = %u", model_id);
+		return -EINVAL;
+	}
+
+	/* Prepare JD */
+	req = model->req;
+	cn10k_ml_prep_sp_job_descriptor(mldev, model, req, ML_CN10K_JOB_TYPE_MODEL_STOP);
+	req->result.error_code = 0x0;
+	req->result.user_ptr = NULL;
+
+	plt_write64(ML_CN10K_POLL_JOB_START, &req->status);
+	plt_wmb();
+
+	locked = false;
+	while (!locked) {
+		if (plt_spinlock_trylock(&model->lock) != 0) {
+			if (model->state == ML_CN10K_MODEL_STATE_LOADED) {
+				plt_ml_dbg("Model not started, model = 0x%016lx",
+					   PLT_U64_CAST(model));
+				plt_spinlock_unlock(&model->lock);
+				return 1;
+			}
+
+			if (model->state == ML_CN10K_MODEL_STATE_JOB_ACTIVE) {
+				plt_err("A slow-path job is active for the model = 0x%016lx",
+					PLT_U64_CAST(model));
+				plt_spinlock_unlock(&model->lock);
+				return -EBUSY;
+			}
+
+			model->state = ML_CN10K_MODEL_STATE_JOB_ACTIVE;
+			plt_spinlock_unlock(&model->lock);
+			locked = true;
+		}
+	}
+
+	while (model->model_mem_map.ocm_reserved) {
+		if (plt_spinlock_trylock(&ocm->lock) != 0) {
+			cn10k_ml_ocm_free_pages(dev, model->model_id);
+			model->model_mem_map.ocm_reserved = false;
+			model->model_mem_map.tilemask = 0x0;
+			plt_spinlock_unlock(&ocm->lock);
+		}
+	}
+
+	job_enqueued = false;
+	job_dequeued = false;
+	do {
+		if (!job_enqueued) {
+			req->timeout = plt_tsc_cycles() + ML_CN10K_CMD_TIMEOUT * plt_tsc_hz();
+			job_enqueued = roc_ml_scratch_enqueue(&mldev->roc, &req->jd);
+		}
+
+		if (job_enqueued && !job_dequeued)
+			job_dequeued = roc_ml_scratch_dequeue(&mldev->roc, &req->jd);
+
+		if (job_dequeued)
+			break;
+	} while (plt_tsc_cycles() < req->timeout);
+
+	if (job_dequeued) {
+		if (plt_read64(&req->status) == ML_CN10K_POLL_JOB_FINISH) {
+			if (req->result.error_code == 0x0)
+				ret = 0;
+			else
+				ret = -1;
+		}
+	} else {
+		roc_ml_scratch_queue_reset(&mldev->roc);
+		ret = -ETIME;
+	}
+
+	locked = false;
+	while (!locked) {
+		if (plt_spinlock_trylock(&model->lock) != 0) {
+			model->state = ML_CN10K_MODEL_STATE_LOADED;
+			plt_spinlock_unlock(&model->lock);
+			locked = true;
+		}
+	}
+
+	return ret;
+}
+
 struct rte_ml_dev_ops cn10k_ml_ops = {
 	/* Device control ops */
 	.dev_info_get = cn10k_ml_dev_info_get,
@@ -783,4 +893,5 @@ struct rte_ml_dev_ops cn10k_ml_ops = {
 	.model_load = cn10k_ml_model_load,
 	.model_unload = cn10k_ml_model_unload,
 	.model_start = cn10k_ml_model_start,
+	.model_stop = cn10k_ml_model_stop,
 };
diff --git a/drivers/ml/cnxk/cn10k_ml_ops.h b/drivers/ml/cnxk/cn10k_ml_ops.h
index 989af978c4..22576b93c0 100644
--- a/drivers/ml/cnxk/cn10k_ml_ops.h
+++ b/drivers/ml/cnxk/cn10k_ml_ops.h
@@ -65,5 +65,6 @@ int cn10k_ml_model_load(struct rte_ml_dev *dev, struct rte_ml_model_params *para
 			uint16_t *model_id);
 int cn10k_ml_model_unload(struct rte_ml_dev *dev, uint16_t model_id);
 int cn10k_ml_model_start(struct rte_ml_dev *dev, uint16_t model_id);
+int cn10k_ml_model_stop(struct rte_ml_dev *dev, uint16_t model_id);
 
 #endif /* _CN10K_ML_OPS_H_ */
-- 
2.17.1