From: Srikanth Yalavarthi
To: Srikanth Yalavarthi
Subject: [PATCH v4 13/34] ml/cnxk: update data quantization functions
Date: Tue, 17 Oct 2023 09:59:26 -0700
Message-ID: <20231017165951.27299-14-syalavarthi@marvell.com>
In-Reply-To: <20231017165951.27299-1-syalavarthi@marvell.com>
References: <20230830155927.3566-1-syalavarthi@marvell.com> <20231017165951.27299-1-syalavarthi@marvell.com>

Added cnxk wrapper functions to quantize input data and dequantize
output data.

Signed-off-by: Srikanth Yalavarthi
---
 drivers/ml/cnxk/cn10k_ml_ops.c | 164 ---------------------------------
 drivers/ml/cnxk/cn10k_ml_ops.h |   7 --
 drivers/ml/cnxk/cnxk_ml_io.c   |  95 +++++++++++++++++++
 drivers/ml/cnxk/cnxk_ml_io.h   |   3 +
 drivers/ml/cnxk/cnxk_ml_ops.c  |  78 +++++++++++++++-
 drivers/ml/cnxk/meson.build    |   1 +
 6 files changed, 175 insertions(+), 173 deletions(-)
 create mode 100644 drivers/ml/cnxk/cnxk_ml_io.c

diff --git a/drivers/ml/cnxk/cn10k_ml_ops.c b/drivers/ml/cnxk/cn10k_ml_ops.c
index c0d6216485..ff190b7f86 100644
--- a/drivers/ml/cnxk/cn10k_ml_ops.c
+++ b/drivers/ml/cnxk/cn10k_ml_ops.c
@@ -1856,170 +1856,6 @@ cn10k_ml_model_params_update(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_mode
 	return 0;
 }
 
-int
-cn10k_ml_io_quantize(struct rte_ml_dev *dev, uint16_t model_id, struct rte_ml_buff_seg **dbuffer,
-		     struct rte_ml_buff_seg **qbuffer)
-{
-	struct cnxk_ml_model *model;
-	uint8_t model_input_type;
-	uint8_t *lcl_dbuffer;
-	uint8_t *lcl_qbuffer;
-	uint8_t input_type;
-	float qscale;
-	uint32_t i;
-	uint32_t j;
-	int ret;
-
-	model = dev->data->models[model_id];
-
-	if (model == NULL) {
-		plt_err("Invalid model_id = %u", model_id);
-		return -EINVAL;
-	}
-
-	lcl_dbuffer = dbuffer[0]->addr;
-	lcl_qbuffer = qbuffer[0]->addr;
-
-	for (i = 0; i < model->layer[0].glow.metadata.model.num_input; i++) {
-		if (i < MRVL_ML_NUM_INPUT_OUTPUT_1) {
-			input_type = model->layer[0].glow.metadata.input1[i].input_type;
-			model_input_type = model->layer[0].glow.metadata.input1[i].model_input_type;
-			qscale = model->layer[0].glow.metadata.input1[i].qscale;
-		} else {
-			j = i - MRVL_ML_NUM_INPUT_OUTPUT_1;
-			input_type = model->layer[0].glow.metadata.input2[j].input_type;
-			model_input_type = model->layer[0].glow.metadata.input2[j].model_input_type;
-			qscale = model->layer[0].glow.metadata.input2[j].qscale;
-		}
-
-		if (input_type == model_input_type) {
-			rte_memcpy(lcl_qbuffer, lcl_dbuffer, model->layer[0].info.input[i].sz_d);
-		} else {
-			switch (model->layer[0].glow.metadata.input1[i].model_input_type) {
-			case RTE_ML_IO_TYPE_INT8:
-				ret = rte_ml_io_float32_to_int8(
-					qscale, model->layer[0].info.input[i].nb_elements,
-					lcl_dbuffer, lcl_qbuffer);
-				break;
-			case RTE_ML_IO_TYPE_UINT8:
-				ret = rte_ml_io_float32_to_uint8(
-					qscale, model->layer[0].info.input[i].nb_elements,
-					lcl_dbuffer, lcl_qbuffer);
-				break;
-			case RTE_ML_IO_TYPE_INT16:
-				ret = rte_ml_io_float32_to_int16(
-					qscale, model->layer[0].info.input[i].nb_elements,
-					lcl_dbuffer, lcl_qbuffer);
-				break;
-			case RTE_ML_IO_TYPE_UINT16:
-				ret = rte_ml_io_float32_to_uint16(
-					qscale, model->layer[0].info.input[i].nb_elements,
-					lcl_dbuffer, lcl_qbuffer);
-				break;
-			case RTE_ML_IO_TYPE_FP16:
-				ret = rte_ml_io_float32_to_float16(
-					model->layer[0].info.input[i].nb_elements, lcl_dbuffer,
-					lcl_qbuffer);
-				break;
-			default:
-				plt_err("Unsupported model_input_type[%u] : %u", i,
-					model->layer[0].glow.metadata.input1[i].model_input_type);
-				ret = -ENOTSUP;
-			}
-			if (ret < 0)
-				return ret;
-		}
-
-		lcl_dbuffer += model->layer[0].info.input[i].sz_d;
-		lcl_qbuffer += model->layer[0].info.input[i].sz_q;
-	}
-
-	return 0;
-}
-
-int
-cn10k_ml_io_dequantize(struct rte_ml_dev *dev, uint16_t model_id, struct rte_ml_buff_seg **qbuffer,
-		       struct rte_ml_buff_seg **dbuffer)
-{
-	struct cnxk_ml_model *model;
-	uint8_t model_output_type;
-	uint8_t *lcl_qbuffer;
-	uint8_t *lcl_dbuffer;
-	uint8_t output_type;
-	float dscale;
-	uint32_t i;
-	uint32_t j;
-	int ret;
-
-	model = dev->data->models[model_id];
-
-	if (model == NULL) {
-		plt_err("Invalid model_id = %u", model_id);
-		return -EINVAL;
-	}
-
-	lcl_dbuffer = dbuffer[0]->addr;
-	lcl_qbuffer = qbuffer[0]->addr;
-
-	for (i = 0; i < model->layer[0].glow.metadata.model.num_output; i++) {
-		if (i < MRVL_ML_NUM_INPUT_OUTPUT_1) {
-			output_type = model->layer[0].glow.metadata.output1[i].output_type;
-			model_output_type =
-				model->layer[0].glow.metadata.output1[i].model_output_type;
-			dscale = model->layer[0].glow.metadata.output1[i].dscale;
-		} else {
-			j = i - MRVL_ML_NUM_INPUT_OUTPUT_1;
-			output_type = model->layer[0].glow.metadata.output2[j].output_type;
-			model_output_type =
-				model->layer[0].glow.metadata.output2[j].model_output_type;
-			dscale = model->layer[0].glow.metadata.output2[j].dscale;
-		}
-
-		if (output_type == model_output_type) {
-			rte_memcpy(lcl_dbuffer, lcl_qbuffer, model->layer[0].info.output[i].sz_q);
-		} else {
-			switch (model->layer[0].glow.metadata.output1[i].model_output_type) {
-			case RTE_ML_IO_TYPE_INT8:
-				ret = rte_ml_io_int8_to_float32(
-					dscale, model->layer[0].info.output[i].nb_elements,
-					lcl_qbuffer, lcl_dbuffer);
-				break;
-			case RTE_ML_IO_TYPE_UINT8:
-				ret = rte_ml_io_uint8_to_float32(
-					dscale, model->layer[0].info.output[i].nb_elements,
-					lcl_qbuffer, lcl_dbuffer);
-				break;
-			case RTE_ML_IO_TYPE_INT16:
-				ret = rte_ml_io_int16_to_float32(
-					dscale, model->layer[0].info.output[i].nb_elements,
-					lcl_qbuffer, lcl_dbuffer);
-				break;
-			case RTE_ML_IO_TYPE_UINT16:
-				ret = rte_ml_io_uint16_to_float32(
-					dscale, model->layer[0].info.output[i].nb_elements,
-					lcl_qbuffer, lcl_dbuffer);
-				break;
-			case RTE_ML_IO_TYPE_FP16:
-				ret = rte_ml_io_float16_to_float32(
-					model->layer[0].info.output[i].nb_elements, lcl_qbuffer,
-					lcl_dbuffer);
-				break;
-			default:
-				plt_err("Unsupported model_output_type[%u] : %u", i,
-					model->layer[0].glow.metadata.output1[i].model_output_type);
-				ret = -ENOTSUP;
-			}
-			if (ret < 0)
-				return ret;
-		}
-
-		lcl_qbuffer += model->layer[0].info.output[i].sz_q;
-		lcl_dbuffer += model->layer[0].info.output[i].sz_d;
-	}
-
-	return 0;
-}
-
 static __rte_always_inline void
 queue_index_advance(uint64_t *index, uint64_t nb_desc)
 {
diff --git a/drivers/ml/cnxk/cn10k_ml_ops.h b/drivers/ml/cnxk/cn10k_ml_ops.h
index ef12069f0d..780e2a9f9c 100644
--- a/drivers/ml/cnxk/cn10k_ml_ops.h
+++ b/drivers/ml/cnxk/cn10k_ml_ops.h
@@ -320,13 +320,6 @@ int cn10k_ml_model_stop(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model *mo
 int cn10k_ml_model_params_update(struct cnxk_ml_dev *cnxk_mldev, struct cnxk_ml_model *model,
 				 void *buffer);
 
-/* I/O ops */
-int cn10k_ml_io_quantize(struct rte_ml_dev *dev, uint16_t model_id,
-			 struct rte_ml_buff_seg **dbuffer, struct rte_ml_buff_seg **qbuffer);
-
-int cn10k_ml_io_dequantize(struct rte_ml_dev *dev, uint16_t model_id,
-			   struct rte_ml_buff_seg **qbuffer, struct rte_ml_buff_seg **dbuffer);
-
 /* Fast-path ops */
 __rte_hot uint16_t cn10k_ml_enqueue_burst(struct rte_ml_dev *dev, uint16_t qp_id,
 					  struct rte_ml_op **ops, uint16_t nb_ops);
diff --git a/drivers/ml/cnxk/cnxk_ml_io.c b/drivers/ml/cnxk/cnxk_ml_io.c
new file mode 100644
index 0000000000..c78009ab0c
--- /dev/null
+++ b/drivers/ml/cnxk/cnxk_ml_io.c
@@ -0,0 +1,95 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2023 Marvell.
+ */
+
+#include
+
+#include
+
+#include
+
+#include "cnxk_ml_io.h"
+
+inline int
+cnxk_ml_io_quantize_single(struct cnxk_ml_io *input, uint8_t *dbuffer, uint8_t *qbuffer)
+{
+	enum rte_ml_io_type qtype;
+	enum rte_ml_io_type dtype;
+	uint32_t nb_elements;
+	float qscale;
+	int ret = 0;
+
+	dtype = input->dtype;
+	qtype = input->qtype;
+	qscale = input->scale;
+	nb_elements = input->nb_elements;
+
+	if (dtype == qtype) {
+		rte_memcpy(qbuffer, dbuffer, input->sz_d);
+	} else {
+		switch (qtype) {
+		case RTE_ML_IO_TYPE_INT8:
+			ret = rte_ml_io_float32_to_int8(qscale, nb_elements, dbuffer, qbuffer);
+			break;
+		case RTE_ML_IO_TYPE_UINT8:
+			ret = rte_ml_io_float32_to_uint8(qscale, nb_elements, dbuffer, qbuffer);
+			break;
+		case RTE_ML_IO_TYPE_INT16:
+			ret = rte_ml_io_float32_to_int16(qscale, nb_elements, dbuffer, qbuffer);
+			break;
+		case RTE_ML_IO_TYPE_UINT16:
+			ret = rte_ml_io_float32_to_uint16(qscale, nb_elements, dbuffer, qbuffer);
+			break;
+		case RTE_ML_IO_TYPE_FP16:
+			ret = rte_ml_io_float32_to_float16(nb_elements, dbuffer, qbuffer);
+			break;
+		default:
+			plt_err("Unsupported qtype : %u", qtype);
+			ret = -ENOTSUP;
+		}
+	}
+
+	return ret;
+}
+
+inline int
+cnxk_ml_io_dequantize_single(struct cnxk_ml_io *output, uint8_t *qbuffer, uint8_t *dbuffer)
+{
+	enum rte_ml_io_type qtype;
+	enum rte_ml_io_type dtype;
+	uint32_t nb_elements;
+	float dscale;
+	int ret = 0;
+
+	dtype = output->dtype;
+	qtype = output->qtype;
+	dscale = output->scale;
+	nb_elements = output->nb_elements;
+
+	if (dtype == qtype) {
+		rte_memcpy(dbuffer, qbuffer, output->sz_q);
+	} else {
+		switch (qtype) {
+		case RTE_ML_IO_TYPE_INT8:
+			ret = rte_ml_io_int8_to_float32(dscale, nb_elements, qbuffer, dbuffer);
+			break;
+		case RTE_ML_IO_TYPE_UINT8:
+			ret = rte_ml_io_uint8_to_float32(dscale, nb_elements, qbuffer, dbuffer);
+			break;
+		case RTE_ML_IO_TYPE_INT16:
+			ret = rte_ml_io_int16_to_float32(dscale, nb_elements, qbuffer, dbuffer);
+			break;
+		case RTE_ML_IO_TYPE_UINT16:
+			ret = rte_ml_io_uint16_to_float32(dscale, nb_elements, qbuffer, dbuffer);
+			break;
+		case RTE_ML_IO_TYPE_FP16:
+			ret = rte_ml_io_float16_to_float32(nb_elements, qbuffer, dbuffer);
+			break;
+		default:
+			plt_err("Unsupported qtype: %u", qtype);
+			ret = -ENOTSUP;
+		}
+	}
+
+	return ret;
+}
diff --git a/drivers/ml/cnxk/cnxk_ml_io.h b/drivers/ml/cnxk/cnxk_ml_io.h
index 29ec7ec511..5de166c252 100644
--- a/drivers/ml/cnxk/cnxk_ml_io.h
+++ b/drivers/ml/cnxk/cnxk_ml_io.h
@@ -76,4 +76,7 @@ struct cnxk_ml_io_info {
 	uint32_t total_output_sz_d;
 };
 
+int cnxk_ml_io_quantize_single(struct cnxk_ml_io *input, uint8_t *dbuffer, uint8_t *qbuffer);
+int cnxk_ml_io_dequantize_single(struct cnxk_ml_io *output, uint8_t *qbuffer, uint8_t *dbuffer);
+
 #endif /* _CNXK_ML_IO_H_ */
diff --git a/drivers/ml/cnxk/cnxk_ml_ops.c b/drivers/ml/cnxk/cnxk_ml_ops.c
index 79665fa21b..5d181eb0f2 100644
--- a/drivers/ml/cnxk/cnxk_ml_ops.c
+++ b/drivers/ml/cnxk/cnxk_ml_ops.c
@@ -5,6 +5,8 @@
 #include
 #include
 
+#include
+
 #include "cnxk_ml_dev.h"
 #include "cnxk_ml_io.h"
 #include "cnxk_ml_model.h"
@@ -708,6 +710,78 @@ cnxk_ml_model_params_update(struct rte_ml_dev *dev, uint16_t model_id, void *buf
 	return cn10k_ml_model_params_update(cnxk_mldev, model, buffer);
 }
 
+static int
+cnxk_ml_io_quantize(struct rte_ml_dev *dev, uint16_t model_id, struct rte_ml_buff_seg **dbuffer,
+		    struct rte_ml_buff_seg **qbuffer)
+{
+	struct cnxk_ml_io_info *info = NULL;
+	struct cnxk_ml_model *model;
+	uint8_t *lcl_dbuffer;
+	uint8_t *lcl_qbuffer;
+	uint32_t i;
+	int ret;
+
+	if ((dev == NULL) || (dbuffer == NULL) || (qbuffer == NULL))
+		return -EINVAL;
+
+	model = dev->data->models[model_id];
+	if (model == NULL) {
+		plt_err("Invalid model_id = %u", model_id);
+		return -EINVAL;
+	}
+
+	info = &model->layer[0].info;
+
+	lcl_dbuffer = dbuffer[0]->addr;
+	lcl_qbuffer = qbuffer[0]->addr;
+	for (i = 0; i < info->nb_inputs; i++) {
+		ret = cnxk_ml_io_quantize_single(&info->input[i], lcl_dbuffer, lcl_qbuffer);
+		if (ret < 0)
+			return ret;
+
+		lcl_dbuffer += info->input[i].sz_d;
+		lcl_qbuffer += info->input[i].sz_q;
+	}
+
+	return 0;
+}
+
+static int
+cnxk_ml_io_dequantize(struct rte_ml_dev *dev, uint16_t model_id, struct rte_ml_buff_seg **qbuffer,
+		      struct rte_ml_buff_seg **dbuffer)
+{
+	struct cnxk_ml_io_info *info = NULL;
+	struct cnxk_ml_model *model;
+	uint8_t *lcl_qbuffer;
+	uint8_t *lcl_dbuffer;
+	uint32_t i;
+	int ret;
+
+	if ((dev == NULL) || (qbuffer == NULL) || (dbuffer == NULL))
+		return -EINVAL;
+
+	model = dev->data->models[model_id];
+	if (model == NULL) {
+		plt_err("Invalid model_id = %u", model_id);
+		return -EINVAL;
+	}
+
+	info = &model->layer[model->nb_layers - 1].info;
+
+	lcl_qbuffer = qbuffer[0]->addr;
+	lcl_dbuffer = dbuffer[0]->addr;
+	for (i = 0; i < info->nb_outputs; i++) {
+		ret = cnxk_ml_io_dequantize_single(&info->output[i], lcl_qbuffer, lcl_dbuffer);
+		if (ret < 0)
+			return ret;
+
+		lcl_qbuffer += info->output[i].sz_q;
+		lcl_dbuffer += info->output[i].sz_d;
+	}
+
+	return 0;
+}
+
 struct rte_ml_dev_ops cnxk_ml_ops = {
 	/* Device control ops */
 	.dev_info_get = cnxk_ml_dev_info_get,
@@ -739,6 +813,6 @@ struct rte_ml_dev_ops cnxk_ml_ops = {
 	.model_params_update = cnxk_ml_model_params_update,
 
 	/* I/O ops */
-	.io_quantize = cn10k_ml_io_quantize,
-	.io_dequantize = cn10k_ml_io_dequantize,
+	.io_quantize = cnxk_ml_io_quantize,
+	.io_dequantize = cnxk_ml_io_dequantize,
 };
diff --git a/drivers/ml/cnxk/meson.build b/drivers/ml/cnxk/meson.build
index 6385ac4548..9cc4ddec70 100644
--- a/drivers/ml/cnxk/meson.build
+++ b/drivers/ml/cnxk/meson.build
@@ -25,6 +25,7 @@ sources = files(
         'cn10k_ml_model.c',
         'cn10k_ml_ocm.c',
         'cnxk_ml_dev.c',
+        'cnxk_ml_io.c',
         'cnxk_ml_model.c',
         'cnxk_ml_ops.c',
 )
-- 
2.42.0
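
For readers without a DPDK tree at hand, the split introduced by this patch (a per-tensor helper plus a wrapper that walks the packed buffers) can be sketched in plain C. This is only an illustration under stated assumptions: struct io_desc, enum io_type, quantize_single() and quantize_all() are hypothetical stand-ins, not the driver's types or functions, and the float32 to int8 conversion is assumed to scale, round to nearest and saturate, which may differ from what rte_ml_io_float32_to_int8() actually does.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the driver's I/O descriptor; not DPDK types. */
enum io_type { IO_TYPE_FP32, IO_TYPE_INT8 };

struct io_desc {
	enum io_type dtype;   /* type of the dequantized (user) data */
	enum io_type qtype;   /* type of the quantized (device) data */
	float scale;          /* quantization scale */
	uint32_t nb_elements; /* number of elements in the tensor */
	uint32_t sz_d;        /* bytes used in the dequantized buffer */
	uint32_t sz_q;        /* bytes used in the quantized buffer */
};

/* Quantize one tensor: plain copy when the types match, else dispatch on qtype. */
static int
quantize_single(const struct io_desc *io, const uint8_t *dbuf, uint8_t *qbuf)
{
	uint32_t i;

	if (io->dtype == io->qtype) {
		memcpy(qbuf, dbuf, io->sz_d);
		return 0;
	}

	switch (io->qtype) {
	case IO_TYPE_INT8:
		for (i = 0; i < io->nb_elements; i++) {
			float v;

			/* memcpy avoids unaligned float loads from a byte buffer. */
			memcpy(&v, dbuf + i * sizeof(v), sizeof(v));
			v *= io->scale;
			/* Round half away from zero, then saturate (assumed behaviour). */
			v += (v >= 0.0f) ? 0.5f : -0.5f;
			if (v > 127.0f)
				v = 127.0f;
			if (v < -128.0f)
				v = -128.0f;
			((int8_t *)qbuf)[i] = (int8_t)v;
		}
		return 0;
	default:
		return -ENOTSUP;
	}
}

/* Walk tensors packed back to back, advancing each buffer by its own size. */
static int
quantize_all(const struct io_desc *io, uint32_t nb_io, const uint8_t *dbuf, uint8_t *qbuf)
{
	uint32_t i;
	int ret;

	assert(io != NULL && dbuf != NULL && qbuf != NULL);

	for (i = 0; i < nb_io; i++) {
		ret = quantize_single(&io[i], dbuf, qbuf);
		if (ret < 0)
			return ret;

		dbuf += io[i].sz_d;
		qbuf += io[i].sz_q;
	}

	return 0;
}
```

The point of the sketch is the shape of the loop: the wrapper only knows the per-tensor sizes, so supporting another quantized type touches only the switch in the single-tensor helper.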