DPDK patches and discussions
 help / color / mirror / Atom feed
From: "Chautru, Nicolas" <nicolas.chautru@intel.com>
To: Maxime Coquelin <maxime.coquelin@redhat.com>,
	"Vargas, Hernan" <hernan.vargas@intel.com>,
	"dev@dpdk.org" <dev@dpdk.org>,
	"gakhil@marvell.com" <gakhil@marvell.com>,
	"trix@redhat.com" <trix@redhat.com>
Cc: "Zhang, Qi Z" <qi.z.zhang@intel.com>
Subject: RE: [PATCH v5 28/29] baseband/acc100: add workaround for deRM corner cases
Date: Fri, 21 Oct 2022 15:40:51 +0000	[thread overview]
Message-ID: <BY5PR11MB44510F3D15DAA61CD787D398F82D9@BY5PR11MB4451.namprd11.prod.outlook.com> (raw)
In-Reply-To: <f1517456-1b9f-e8b3-2c59-274be4839999@redhat.com>

Hi Maxime, 

> -----Original Message-----
> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> Sent: Friday, October 21, 2022 2:33 AM
> To: Vargas, Hernan <hernan.vargas@intel.com>; dev@dpdk.org;
> gakhil@marvell.com; trix@redhat.com
> Cc: Chautru, Nicolas <nicolas.chautru@intel.com>; Zhang, Qi Z
> <qi.z.zhang@intel.com>
> Subject: Re: [PATCH v5 28/29] baseband/acc100: add workaround for deRM
> corner cases
> 
> 
> 
> On 10/21/22 07:21, Hernan Vargas wrote:
> > Add function to support de-ratematch pre-processing for SW corner cases.
> > Some specific 5GUL FEC corner cases may cause unintended back pressure
> > and in some cases potential stability issue on the ACC100.
> > To be able to avoid completly such potential issue, the PMD can
> > preempt such code block configuration so that to process the first
> > level deRM in SW using the SDK libraries prior to running the rest of
> > the FEC decoding in HW using an amended code block configuration.
> > In case meson build system doesn't find such SDK libraries, the fall
> > method is to run in HW as is with a warning.
> >
> > Signed-off-by: Hernan Vargas <hernan.vargas@intel.com>
> > ---
> >   drivers/baseband/acc/acc_common.h     |   8 ++
> >   drivers/baseband/acc/meson.build      |  21 +++++
> >   drivers/baseband/acc/rte_acc100_pmd.c | 108
> +++++++++++++++++++++++++-
> >   3 files changed, 134 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/baseband/acc/acc_common.h
> > b/drivers/baseband/acc/acc_common.h
> > index eae7eab4e9..5e8972b40a 100644
> > --- a/drivers/baseband/acc/acc_common.h
> > +++ b/drivers/baseband/acc/acc_common.h
> > @@ -123,6 +123,14 @@
> >   #define ACC_HARQ_ALIGN_64B          64
> >   #define ACC_MAX_ZC                  384
> >
> > +/* De-ratematch code rate limitation when padding is required */
> > +#define ACC_LIM_03 2  /* 0.03 */ #define ACC_LIM_09 6  /* 0.09 */
> > +#define ACC_LIM_14 9  /* 0.14 */ #define ACC_LIM_21 14 /* 0.21 */
> > +#define ACC_LIM_31 20 /* 0.31 */ #define ACC_MAX_E (128 * 1024 - 2)
> > +
> >   /* Helper macro for logging */
> >   #define rte_acc_log(level, fmt, ...) \
> >   	rte_log(RTE_LOG_ ## level, RTE_LOG_NOTICE, fmt "\n", \ diff --git
> > a/drivers/baseband/acc/meson.build b/drivers/baseband/acc/meson.build
> > index 77c393b533..a5fc4fed01 100644
> > --- a/drivers/baseband/acc/meson.build
> > +++ b/drivers/baseband/acc/meson.build
> > @@ -1,6 +1,27 @@
> >   # SPDX-License-Identifier: BSD-3-Clause
> >   # Copyright(c) 2020 Intel Corporation
> >
> > +# Check for FlexRAN SDK libraries
> > +dep_dec5g = dependency('flexran_sdk_ldpc_decoder_5gnr', required:
> > +false)
> > +
> > +if dep_dec5g.found()
> > +    ext_deps += cc.find_library('libstdc++', required: true)
> > +    ext_deps += cc.find_library('libirc', required: true)
> > +    ext_deps += cc.find_library('libimf', required: true)
> > +    ext_deps += cc.find_library('libipps', required: true)
> > +    ext_deps += cc.find_library('libsvml', required: true)
> > +    ext_deps += dep_dec5g
> > +    ext_deps += dependency('flexran_sdk_ldpc_encoder_5gnr', required:
> true)
> > +    ext_deps += dependency('flexran_sdk_LDPC_ratematch_5gnr', required:
> true)
> > +    ext_deps += dependency('flexran_sdk_rate_dematching_5gnr', required:
> true)
> > +    ext_deps += dependency('flexran_sdk_turbo', required: true)
> > +    ext_deps += dependency('flexran_sdk_crc', required: true)
> > +    ext_deps += dependency('flexran_sdk_rate_matching', required: true)
> > +    ext_deps += dependency('flexran_sdk_common', required: true)
> > +    cflags += ['-DRTE_BBDEV_SDK_AVX2']
> > +    cflags += ['-DRTE_BBDEV_SDK_AVX512'] endif
> > +
> >   deps += ['bbdev', 'bus_pci']
> >
> >   sources = files('rte_acc100_pmd.c', 'rte_acc200_pmd.c') diff --git
> > a/drivers/baseband/acc/rte_acc100_pmd.c
> > b/drivers/baseband/acc/rte_acc100_pmd.c
> > index 23bc5d25bb..e8b230e563 100644
> > --- a/drivers/baseband/acc/rte_acc100_pmd.c
> > +++ b/drivers/baseband/acc/rte_acc100_pmd.c
> > @@ -25,6 +25,10 @@
> >   #include "acc101_pmd.h"
> >   #include "acc200_cfg.h"
> >
> > +#ifdef RTE_BBDEV_SDK_AVX512
> > +#include <phy_rate_dematching_5gnr.h> #endif
> > +
> >   #ifdef RTE_LIBRTE_BBDEV_DEBUG
> >   RTE_LOG_REGISTER_DEFAULT(acc100_logtype, DEBUG);
> >   #else
> > @@ -756,6 +760,14 @@ acc100_queue_setup(struct rte_bbdev *dev,
> uint16_t queue_id,
> >   		ret = -ENOMEM;
> >   		goto free_lb_out;
> >   	}
> > +	q->derm_buffer = rte_zmalloc_socket(dev->device->driver->name,
> > +			RTE_BBDEV_TURBO_MAX_CB_SIZE * 10,
> > +			RTE_CACHE_LINE_SIZE, conf->socket);
> > +	if (q->derm_buffer == NULL) {
> > +		rte_bbdev_log(ERR, "Failed to allocate derm_buffer memory");
> > +		ret = -ENOMEM;
> > +		goto free_companion_ring_addr;
> > +	}
> >
> >   	/*
> >   	 * Software queue ring wraps synchronously with the HW when it
> > reaches @@ -776,7 +788,7 @@ acc100_queue_setup(struct rte_bbdev *dev,
> uint16_t queue_id,
> >   	q_idx = acc100_find_free_queue_idx(dev, conf);
> >   	if (q_idx == -1) {
> >   		ret = -EINVAL;
> > -		goto free_companion_ring_addr;
> > +		goto free_derm_buffer;
> >   	}
> >
> >   	q->qgrp_id = (q_idx >> ACC100_GRP_ID_SHIFT) & 0xF; @@ -804,6
> +816,9
> > @@ acc100_queue_setup(struct rte_bbdev *dev, uint16_t queue_id,
> >   	dev->data->queues[queue_id].queue_private = q;
> >   	return 0;
> >
> > +free_derm_buffer:
> > +	rte_free(q->derm_buffer);
> > +	q->derm_buffer = NULL;
> >   free_companion_ring_addr:
> >   	rte_free(q->companion_ring_addr);
> >   	q->companion_ring_addr = NULL;
> > @@ -890,6 +905,7 @@ acc100_queue_release(struct rte_bbdev *dev,
> uint16_t q_id)
> >   		/* Mark the Queue as un-assigned */
> >   		d->q_assigned_bit_map[q->qgrp_id] &= (0xFFFFFFFFFFFFFFFF -
> >   				(uint64_t) (1 << q->aq_id));
> > +		rte_free(q->derm_buffer);
> >   		rte_free(q->companion_ring_addr);
> >   		rte_free(q->lb_in);
> >   		rte_free(q->lb_out);
> > @@ -3111,10 +3127,44 @@ harq_loopback(struct acc_queue *q, struct
> rte_bbdev_dec_op *op,
> >   	return 1;
> >   }
> >
> > +/** Assess whether a work around is required for the deRM corner
> > +cases */ static inline bool derm_workaround_required(struct
> > +rte_bbdev_op_ldpc_dec *ldpc_dec, struct acc_queue *q) {
> > +	if (!is_acc100(q))
> > +		return false;
> > +	int32_t e = ldpc_dec->cb_params.e;
> > +	int q_m = ldpc_dec->q_m;
> > +	int z_c = ldpc_dec->z_c;
> > +	int K = (ldpc_dec->basegraph == 1 ? ACC_K_ZC_1 : ACC_K_ZC_2)
> > +			* z_c;
> > +
> > +	bool required = false;
> > +	if (ldpc_dec->basegraph == 1) {
> > +		if ((q_m == 4) && (z_c >= 320) && (e * ACC_LIM_31 > K * 64))
> > +			required = true;
> > +		else if ((e * ACC_LIM_21 > K * 64))
> > +			required = true;
> > +	} else {
> > +		if (q_m <= 2) {
> > +			if ((z_c >= 208) && (e * ACC_LIM_09 > K * 64))
> > +				required = true;
> > +			else if ((z_c < 208) && (e * ACC_LIM_03 > K * 64))
> > +				required = true;
> > +		} else if (e * ACC_LIM_14 > K * 64)
> > +			required = true;
> > +	}
> > +	if (required)
> > +		rte_bbdev_log(INFO, "Running deRM pre-processing in SW");
> 
> I understood bblib_rate_dematching_5gnr could not be reimplemented in the
> bbdev library as a simple helper because of performance reason.
> So is it a good idea to emit a log with INFO level each time you enter in the
> workaround?

The INFO level is not enabled by default at run time, so we should not see anything except if log level is explicit reduced. 
You don’t think INFO log level is low enough for such low level trace?
If any concern we can definitely remove. 

> 
> > +
> > +	return required;
> > +}
> > +
> >   /** Enqueue one decode operations for ACC100 device in CB mode */
> >   static inline int
> >   enqueue_ldpc_dec_one_op_cb(struct acc_queue *q, struct
> rte_bbdev_dec_op *op,
> > -		uint16_t total_enqueued_cbs, bool same_op)
> > +		uint16_t total_enqueued_cbs, bool same_op,
> > +		struct rte_bbdev_queue_data *q_data)
> >   {
> >   	int ret;
> >   	if (unlikely(check_bit(op->ldpc_dec.op_flags,
> > @@ -3168,6 +3218,58 @@ enqueue_ldpc_dec_one_op_cb(struct acc_queue
> *q, struct rte_bbdev_dec_op *op,
> >   	} else {
> >   		struct acc_fcw_ld *fcw;
> >   		uint32_t seg_total_left;
> > +
> > +		if (derm_workaround_required(&op->ldpc_dec, q)) {
> > +			#ifdef RTE_BBDEV_SDK_AVX512
> > +			struct rte_bbdev_op_ldpc_dec *dec = &op->ldpc_dec;
> > +			struct bblib_rate_dematching_5gnr_request
> derm_req;
> > +			struct bblib_rate_dematching_5gnr_response
> derm_resp;
> > +			uint8_t *in;
> > +
> > +			/* Checking input size is matching with E */
> > +			if (dec->input.data->data_len < dec->cb_params.e) {
> > +				rte_bbdev_log(ERR, "deRM: Input size
> mismatch");
> > +				return -EFAULT;
> > +			}
> > +			/* Run first deRM processing in SW */
> > +			in = rte_pktmbuf_mtod_offset(dec->input.data, uint8_t
> *, in_offset);
> > +			derm_req.p_in = (int8_t *) in;
> > +			derm_req.p_harq = (int8_t *) q->derm_buffer;
> > +			derm_req.base_graph = dec->basegraph;
> > +			derm_req.zc = dec->z_c;
> > +			derm_req.ncb = dec->n_cb;
> > +			derm_req.e = dec->cb_params.e;
> > +			if (derm_req.e > ACC_MAX_E) {
> > +				rte_bbdev_log(WARNING,
> > +						"deRM: E %d > %d max",
> > +						derm_req.e, ACC_MAX_E);
> > +				derm_req.e = ACC_MAX_E;
> > +			}
> > +			derm_req.k0 = 0; /* Actual output from SDK */
> > +			derm_req.isretx = false;
> > +			derm_req.rvid = dec->rv_index;
> > +			derm_req.modulation_order = dec->q_m;
> > +			derm_req.start_null_index =
> > +					(dec->basegraph == 1 ? 22 : 10)
> > +					* dec->z_c - 2 * dec->z_c
> > +					- dec->n_filler;
> > +			derm_req.num_of_null = dec->n_filler;
> > +			bblib_rate_dematching_5gnr(&derm_req,
> &derm_resp);
> > +			/* Force back the HW DeRM */
> > +			dec->q_m = 1;
> > +			dec->cb_params.e = dec->n_cb - dec->n_filler;
> > +			dec->rv_index = 0;
> > +			rte_memcpy(in, q->derm_buffer, dec->cb_params.e);
> > +			/* Capture counter when pre-processing is used */
> > +			q_data->queue_stats.enqueue_warn_count++;
> > +			#else
> > +			RTE_SET_USED(q_data);
> > +			rte_bbdev_log(WARNING,
> > +				"Corner case may require deRM pre-
> processing in SDK"
> > +				);
> 
> What is the result if we end here?
> Shouldn't we handle it as an error and so propagate it?

This will work fine most of the time. But there is theoretical risk of reliability issue identified even if probability is at best low. Recommendation is to use the SDK workaround to prevent such issue to potentially occur but that is not imposed on user/customer. 

> 
> > +			#endif
> > +		}
> > +
> >   		fcw = &desc->req.fcw_ld;
> >   		q->d->fcw_ld_fill(op, fcw, harq_layout);
> >
> > @@ -3721,7 +3823,7 @@ acc100_enqueue_ldpc_dec_cb(struct
> rte_bbdev_queue_data *q_data,
> >   			ops[i]->ldpc_dec.n_cb, ops[i]->ldpc_dec.q_m,
> >   			ops[i]->ldpc_dec.n_filler, ops[i]-
> >ldpc_dec.cb_params.e,
> >   			same_op);
> > -		ret = enqueue_ldpc_dec_one_op_cb(q, ops[i], i, same_op);
> > +		ret = enqueue_ldpc_dec_one_op_cb(q, ops[i], i, same_op,
> q_data);
> >   		if (ret < 0) {
> >   			acc_enqueue_invalid(q_data);
> >   			break;


  reply	other threads:[~2022-10-21 15:41 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-21  5:20 [PATCH v5 00/29] baseband/acc100: changes for 22.11 Hernan Vargas
2022-10-21  5:20 ` [PATCH v5 01/29] baseband/acc100: fix ring availability calculation Hernan Vargas
2022-10-21  9:04   ` Maxime Coquelin
2022-10-21  5:20 ` [PATCH v5 02/29] baseband/acc100: add function to check AQ availability Hernan Vargas
2022-10-21  9:07   ` Maxime Coquelin
2022-10-21  5:20 ` [PATCH v5 03/29] baseband/acc100: memory leak fix Hernan Vargas
2022-10-21  5:20 ` [PATCH v5 04/29] baseband/acc100: add LDPC encoder padding function Hernan Vargas
2022-10-21  5:20 ` [PATCH v5 05/29] baseband/acc100: check turbo dec/enc input Hernan Vargas
2022-10-21  5:20 ` [PATCH v5 06/29] baseband/acc100: check for unlikely operation vals Hernan Vargas
2022-10-21  5:20 ` [PATCH v5 07/29] baseband/acc100: enforce additional check on FCW Hernan Vargas
2022-10-21  9:08   ` Maxime Coquelin
2022-10-21  5:20 ` [PATCH v5 08/29] baseband/acc100: allocate ring/queue mem when NULL Hernan Vargas
2022-10-21  9:16   ` Maxime Coquelin
2022-10-21  5:20 ` [PATCH v5 09/29] baseband/acc100: reduce input length for CRC24B Hernan Vargas
2022-10-21  5:20 ` [PATCH v5 10/29] baseband/acc100: fix clearing PF IR outside handler Hernan Vargas
2022-10-21  5:20 ` [PATCH v5 11/29] baseband/acc100: set device min alignment to 1 Hernan Vargas
2022-10-21  5:20 ` [PATCH v5 12/29] baseband/acc100: add protection for NULL HARQ input Hernan Vargas
2022-10-21  5:20 ` [PATCH v5 13/29] baseband/acc100: reset pointer after rte_free Hernan Vargas
2022-10-21  5:20 ` [PATCH v5 14/29] baseband/acc100: fix debug print for LDPC FCW Hernan Vargas
2022-10-21  5:20 ` [PATCH v5 15/29] baseband/acc100: add enqueue status Hernan Vargas
2022-10-21  5:20 ` [PATCH v5 16/29] baseband/acc100: add scatter-gather support Hernan Vargas
2022-10-21  5:20 ` [PATCH v5 17/29] baseband/acc100: add HARQ index helper function Hernan Vargas
2022-10-21  5:20 ` [PATCH v5 18/29] baseband/acc100: enable input validation by default Hernan Vargas
2022-10-21  5:20 ` [PATCH v5 19/29] baseband/acc100: added LDPC transport block support Hernan Vargas
2022-10-21  5:20 ` [PATCH v5 20/29] baseband/acc100: update validate LDPC enc/dec Hernan Vargas
2022-10-21  9:21   ` Maxime Coquelin
2022-10-21  5:20 ` [PATCH v5 21/29] baseband/acc100: implement configurable queue depth Hernan Vargas
2022-10-21  5:20 ` [PATCH v5 22/29] baseband/acc100: add queue stop operation Hernan Vargas
2022-10-21  5:20 ` [PATCH v5 23/29] baseband/acc100: update uplink CB input length Hernan Vargas
2022-10-21  5:20 ` [PATCH v5 24/29] baseband/acc100: update log messages Hernan Vargas
2022-10-21  5:20 ` [PATCH v5 25/29] baseband/acc100: store FCW from first CB descriptor Hernan Vargas
2022-10-21  5:20 ` [PATCH v5 26/29] baseband/acc100: update device info Hernan Vargas
2022-10-21  5:21 ` [PATCH v5 27/29] baseband/acc100: add ring companion address Hernan Vargas
2022-10-21  9:29   ` Maxime Coquelin
2022-10-21  5:21 ` [PATCH v5 28/29] baseband/acc100: add workaround for deRM corner cases Hernan Vargas
2022-10-21  9:32   ` Maxime Coquelin
2022-10-21 15:40     ` Chautru, Nicolas [this message]
2022-10-21  5:21 ` [PATCH v5 29/29] baseband/acc100: configure PMON control registers Hernan Vargas
2022-10-21 13:06 ` [EXT] [PATCH v5 00/29] baseband/acc100: changes for 22.11 Akhil Goyal

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=BY5PR11MB44510F3D15DAA61CD787D398F82D9@BY5PR11MB4451.namprd11.prod.outlook.com \
    --to=nicolas.chautru@intel.com \
    --cc=dev@dpdk.org \
    --cc=gakhil@marvell.com \
    --cc=hernan.vargas@intel.com \
    --cc=maxime.coquelin@redhat.com \
    --cc=qi.z.zhang@intel.com \
    --cc=trix@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).