From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <20a46f14-e564-bab4-da88-6eeefcb9a1d0@redhat.com>
Date: Thu, 15 Sep 2022 10:15:18 +0200
To: Hernan Vargas, dev@dpdk.org, gakhil@marvell.com, trix@redhat.com
Cc: nicolas.chautru@intel.com, qi.z.zhang@intel.com
References: <20220820023157.189047-1-hernan.vargas@intel.com> <20220820023157.189047-16-hernan.vargas@intel.com>
From: Maxime Coquelin
Subject: Re: [PATCH v2 15/37] baseband/acc100: add workaround for deRM corner cases
In-Reply-To: <20220820023157.189047-16-hernan.vargas@intel.com>
List-Id: DPDK patches and discussions

On 8/20/22 04:31, Hernan Vargas wrote:
> Add function to assess if de-ratematch pre-processing should be run
> in SW for corner cases.
>
> Signed-off-by: Hernan Vargas
> ---
>  drivers/baseband/acc100/acc100_pmd.h     |  13 +++
>  drivers/baseband/acc100/rte_acc100_pmd.c | 103 ++++++++++++++++++++++-
>  2 files changed, 114 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/baseband/acc100/acc100_pmd.h b/drivers/baseband/acc100/acc100_pmd.h
> index 19a1f434bc..c98a182be6 100644
> --- a/drivers/baseband/acc100/acc100_pmd.h
> +++ b/drivers/baseband/acc100/acc100_pmd.h
> @@ -140,6 +140,8 @@
>  /* Constants from K0 computation from 3GPP 38.212 Table 5.4.2.1-2 */
>  #define ACC100_N_ZC_1 66 /* N = 66 Zc for BG 1 */
>  #define ACC100_N_ZC_2 50 /* N = 50 Zc for BG 2 */
> +#define ACC100_K_ZC_1 22 /* K = 22 Zc for BG 1 */
> +#define ACC100_K_ZC_2 10 /* K = 10 Zc for BG 2 */
>  #define ACC100_K0_1_1 17 /* K0 fraction numerator for rv 1 and BG 1 */
>  #define ACC100_K0_1_2 13 /* K0 fraction numerator for rv 1 and BG 2 */
>  #define ACC100_K0_2_1 33 /* K0 fraction numerator for rv 2 and BG 1 */
> @@ -177,6 +179,16 @@
>  #define ACC100_MS_IN_US (1000)
>  #define ACC100_DDR_TRAINING_MAX (5000)
>
> +/* Code rate limitation when padding is required */
> +#define ACC100_LIM_03 2 /* 0.03 */
> +#define ACC100_LIM_09 6 /* 0.09 */
> +#define ACC100_LIM_14 9 /* 0.14 */
> +#define ACC100_LIM_21 14 /* 0.21 */
> +#define ACC100_LIM_31 20 /* 0.31 */
> +#define ACC100_MAX_E (128 * 1024 - 2)
> +
> +
> +
>  /* ACC100 DMA Descriptor triplet */
>  struct acc100_dma_triplet {
>  	uint64_t address;
> @@ -572,6 +584,7 @@ struct __rte_cache_aligned acc100_queue {
>  	uint8_t *lb_out;
>  	rte_iova_t lb_in_addr_iova;
>  	rte_iova_t lb_out_addr_iova;
> +	int8_t *derm_buffer; /* interim buffer for de-rm in SDK */
>  	struct acc100_device *d;
>  };
>
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
> index 1504acfadd..69c0714a37 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> @@ -24,6 +24,10 @@
>  #include "acc100_pmd.h"
>  #include "acc101_pmd.h"
>
> +#ifdef RTE_BBDEV_SDK_AVX512
> +#include
> +#endif
> +
>  #ifdef RTE_LIBRTE_BBDEV_DEBUG
>  RTE_LOG_REGISTER_DEFAULT(acc100_logtype, DEBUG);
>  #else
> @@ -898,6 +902,16 @@ acc100_queue_setup(struct rte_bbdev *dev, uint16_t queue_id,
>  		rte_free(q);
>  		return -ENOMEM;
>  	}
> +	q->derm_buffer = rte_zmalloc_socket(dev->device->driver->name,
> +			RTE_BBDEV_TURBO_MAX_CB_SIZE * 10,
> +			RTE_CACHE_LINE_SIZE, conf->socket);
> +	if (q->derm_buffer == NULL) {
> +		rte_bbdev_log(ERR, "Failed to allocate derm_buffer memory");
> +		rte_free(q->lb_in);
> +		rte_free(q->lb_out);
> +		rte_free(q);
> +		return -ENOMEM;
> +	}

It may make sense to have a common error path here, to avoid the
duplication and reduce the risk of introducing memory leaks when this
function is changed later.

>  	q->lb_out_addr_iova = rte_malloc_virt2iova(q->lb_out);
>
>  	/*
> @@ -918,6 +932,7 @@ acc100_queue_setup(struct rte_bbdev *dev, uint16_t queue_id,
>
>  	q_idx = acc100_find_free_queue_idx(dev, conf);
>  	if (q_idx == -1) {
> +		rte_free(q->derm_buffer);
>  		rte_free(q->lb_in);
>  		rte_free(q->lb_out);
>  		rte_free(q);
> @@ -955,6 +970,7 @@ acc100_queue_release(struct rte_bbdev *dev, uint16_t q_id)
>  		/* Mark the Queue as un-assigned */
>  		d->q_assigned_bit_map[q->qgrp_id] &= (0xFFFFFFFF -
>  				(1 << q->aq_id));
> +		rte_free(q->derm_buffer);
>  		rte_free(q->lb_in);
>  		rte_free(q->lb_out);
>  		rte_free(q);
> @@ -3512,10 +3528,42 @@ harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
>  	return 1;
>  }
>
> +/** Assess whether a work around is required for the deRM corner cases */
> +static inline bool
> +derm_workaround_required(struct rte_bbdev_op_ldpc_dec *ldpc_dec, struct acc100_queue *q)
> +{
> +	if (!is_acc100(q))
> +		return false;
> +	int32_t e = ldpc_dec->cb_params.e;
> +	int q_m = ldpc_dec->q_m;
> +	int z_c = ldpc_dec->z_c;
> +	int K = (ldpc_dec->basegraph == 1 ? ACC100_K_ZC_1 : ACC100_K_ZC_2)
> +			* z_c;
> +	bool required = false;

Add a blank line here.

> +	if (ldpc_dec->basegraph == 1) {
> +		if ((q_m == 4) && (z_c >= 320) && (e * ACC100_LIM_31 > K * 64))
> +			required = true;
> +		else if ((e * ACC100_LIM_21 > K * 64))
> +			required = true;
> +	} else {
> +		if (q_m <= 2) {
> +			if ((z_c >= 208) && (e * ACC100_LIM_09 > K * 64))
> +				required = true;
> +			else if ((z_c < 208) && (e * ACC100_LIM_03 > K * 64))
> +				required = true;
> +		} else if (e * ACC100_LIM_14 > K * 64)
> +			required = true;
> +	}
> +	if (required)
> +		rte_bbdev_log(INFO, "Running deRM pre-processing in SW");

Add a blank line here.

> +	return required;
> +}
> +
>  /** Enqueue one decode operations for ACC100 device in CB mode */
>  static inline int
>  enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
> -		uint16_t total_enqueued_cbs, bool same_op)
> +		uint16_t total_enqueued_cbs, bool same_op,
> +		struct rte_bbdev_queue_data *q_data)
>  {
>  	int ret;
>  	if (unlikely(check_bit(op->ldpc_dec.op_flags,
> @@ -3571,6 +3619,57 @@ enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
>  				&in_offset, &h_out_offset,
>  				&h_out_length, harq_layout);
>  	} else {
> +		if (derm_workaround_required(&op->ldpc_dec, q)) {
> +			#ifdef RTE_BBDEV_SDK_AVX512

First, the indentation is not good here.

Also, my understanding is that this code is built only if the FlexRAN
SDK is available. The FlexRAN SDK is proprietary, so this code cannot be
exercised by the upstream CI.

Code under RTE_BBDEV_SDK_AVX512 should be dropped IMO.

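Side note on the thresholds in derm_workaround_required() quoted above: each test has the form e * LIM > K * 64, which for e > 0 is the same as K / e < LIM / 64. The ACC100_LIM_* values are therefore code-rate limits expressed in units of 1/64 (2/64 is about 0.03, 20/64 is about 0.31), and the integer form avoids a floating-point division. The sketch below is a standalone copy of that decision for experimenting with the numbers. The function name derm_needed(), the shortened macro names and the 64-bit intermediates are assumptions made for this illustration; they are not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the quoted patch; limits are in units of 1/64. */
#define K_ZC_1 22 /* K = 22 Zc for BG 1 */
#define K_ZC_2 10 /* K = 10 Zc for BG 2 */
#define LIM_03 2
#define LIM_09 6
#define LIM_14 9
#define LIM_21 14
#define LIM_31 20

/*
 * Hypothetical standalone version of the decision in the patch.
 * Returns true when the code rate K / e is below the applicable limit.
 */
static bool
derm_needed(int basegraph, int q_m, int z_c, int32_t e)
{
	/* 64-bit intermediates are a precaution added for this sketch. */
	int64_t k64 = (int64_t)(basegraph == 1 ? K_ZC_1 : K_ZC_2) * z_c * 64;
	int64_t e64 = e;

	if (basegraph == 1) {
		if (q_m == 4 && z_c >= 320 && e64 * LIM_31 > k64)
			return true;
		return e64 * LIM_21 > k64;
	}
	if (q_m <= 2) {
		if (z_c >= 208)
			return e64 * LIM_09 > k64;
		return e64 * LIM_03 > k64;
	}
	return e64 * LIM_14 > k64;
}
```

For example, with BG 1 and z_c = 384, K = 22 * 384 = 8448 and K * 64 = 540672, so for q_m = 2 the check selects the workaround once e * 14 > 540672, that is from e = 38620 upwards.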
> +			struct rte_bbdev_op_ldpc_dec *dec = &op->ldpc_dec;
> +			/* Checking input size is matching with E */
> +			if (dec->input.data->data_len < dec->cb_params.e) {
> +				rte_bbdev_log(ERR,
> +						"deRM: Input size mismatch");
> +				return -EFAULT;
> +			}
> +			/* Run first deRM processing in SW */
> +			struct bblib_rate_dematching_5gnr_request derm_req;
> +			struct bblib_rate_dematching_5gnr_response derm_resp;
> +			uint8_t *in = rte_pktmbuf_mtod_offset(dec->input.data,
> +					uint8_t *, in_offset);

Don't mix declarations & code.

> +			derm_req.p_in = (int8_t *) in;
> +			derm_req.p_harq = (int8_t *) q->derm_buffer;
> +			derm_req.base_graph = dec->basegraph;
> +			derm_req.zc = dec->z_c;
> +			derm_req.ncb = dec->n_cb;
> +			derm_req.e = dec->cb_params.e;
> +			if (derm_req.e > ACC100_MAX_E) {
> +				rte_bbdev_log(WARNING,
> +						"deRM: E %d > %d max",
> +						derm_req.e, ACC100_MAX_E);
> +				derm_req.e = ACC100_MAX_E;
> +			}
> +			derm_req.k0 = 0; /* Actual output from SDK */
> +			derm_req.isretx = false;
> +			derm_req.rvid = dec->rv_index;
> +			derm_req.modulation_order = dec->q_m;
> +			derm_req.start_null_index =
> +					(dec->basegraph == 1 ? 22 : 10)
> +					* dec->z_c - 2 * dec->z_c
> +					- dec->n_filler;
> +			derm_req.num_of_null = dec->n_filler;
> +			bblib_rate_dematching_5gnr(&derm_req, &derm_resp);
> +			/* Force back the HW DeRM */
> +			dec->q_m = 1;
> +			dec->cb_params.e = dec->n_cb - dec->n_filler;
> +			dec->rv_index = 0;
> +			rte_memcpy(in, q->derm_buffer, dec->cb_params.e);
> +			/* Capture counter when pre-processing is used */
> +			q_data->queue_stats.enqueue_warn_count++;
> +			#else
> +			RTE_SET_USED(q_data);
> +			rte_bbdev_log(WARNING,
> +					"Corner case may require deRM pre-processing in SDK"
> +					);
> +			#endif
> +		}
> +
>  		struct acc100_fcw_ld *fcw;
>  		uint32_t seg_total_left;

Don't mix declarations & code.

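To illustrate the "declarations first" layout asked for above, here is a minimal standalone sketch. It mirrors only the input-size check and the clamp of E from the quoted hunk. The names demo_derm_req, demo_fill_req() and DEMO_MAX_E are hypothetical stand-ins invented for this example and do not exist in the driver or in the SDK.

```c
#include <errno.h>
#include <stdint.h>

/* Same value as ACC100_MAX_E in the patch, written as unsigned. */
#define DEMO_MAX_E (128u * 1024u - 2u)

/* Hypothetical request type standing in for the SDK structure. */
struct demo_derm_req {
	const int8_t *p_in;
	int32_t e;
};

static int
demo_fill_req(struct demo_derm_req *req, const uint8_t *in,
		uint32_t data_len, uint32_t e)
{
	/* All declarations sit at the start of the block ... */
	const int8_t *p_in;
	int32_t clamped_e;

	/* ... and the statements follow. */
	if (data_len < e)
		return -EFAULT; /* input shorter than E, as in the patch */

	p_in = (const int8_t *)in;
	clamped_e = (e > DEMO_MAX_E) ? (int32_t)DEMO_MAX_E : (int32_t)e;

	req->p_in = p_in;
	req->e = clamped_e;
	return 0;
}
```

In the patch itself this would probably mean either hoisting the declarations to the top of the enclosing block or moving the workaround into its own helper function, which would also keep the #ifdef out of the enqueue path.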
>  		fcw = &desc->req.fcw_ld;
> @@ -4322,7 +4421,7 @@ acc100_enqueue_ldpc_dec_cb(struct rte_bbdev_queue_data *q_data,
>  				ops[i]->ldpc_dec.n_cb, ops[i]->ldpc_dec.q_m,
>  				ops[i]->ldpc_dec.n_filler, ops[i]->ldpc_dec.cb_params.e,
>  				same_op);
> -		ret = enqueue_ldpc_dec_one_op_cb(q, ops[i], i, same_op);
> +		ret = enqueue_ldpc_dec_one_op_cb(q, ops[i], i, same_op, q_data);
>  		if (ret < 0) {
>  			acc100_enqueue_invalid(q_data);
>  			break;
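
A common error path of the kind suggested above for acc100_queue_setup() could look roughly like the sketch below. It is a standalone illustration rather than driver code: struct demo_queue, demo_alloc() and demo_queue_setup() are hypothetical, calloc()/free() stand in for rte_zmalloc_socket()/rte_free(), and the fail_at parameter exists only so the failure path can be exercised.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct acc100_queue; only the buffers matter. */
struct demo_queue {
	uint8_t *lb_in;
	uint8_t *lb_out;
	int8_t *derm_buffer;
};

/* Hypothetical allocator wrapper: fails on purpose on the fail_at-th call. */
static void *
demo_alloc(size_t len, int *count, int fail_at)
{
	if (++(*count) == fail_at)
		return NULL;
	return calloc(1, len);
}

/* Set up a queue; every failure after the first allocation shares one exit. */
static int
demo_queue_setup(struct demo_queue **out, size_t buf_len, int fail_at)
{
	struct demo_queue *q;
	int count = 0;

	q = calloc(1, sizeof(*q));
	if (q == NULL)
		return -ENOMEM;

	q->lb_in = demo_alloc(buf_len, &count, fail_at);
	if (q->lb_in == NULL)
		goto free_all;
	q->lb_out = demo_alloc(buf_len, &count, fail_at);
	if (q->lb_out == NULL)
		goto free_all;
	q->derm_buffer = demo_alloc(buf_len, &count, fail_at);
	if (q->derm_buffer == NULL)
		goto free_all;

	*out = q;
	return 0;

free_all:
	/* free(NULL) is a no-op, so a partially built queue is handled too. */
	free(q->derm_buffer);
	free(q->lb_out);
	free(q->lb_in);
	free(q);
	return -ENOMEM;
}
```

With this shape, adding another buffer costs one allocation check and one line under the label, instead of one more copy of the whole free sequence. The same pattern should carry over to the driver, since rte_free() also accepts NULL.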