From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 03719A00C4; Sat, 29 Oct 2022 14:11:09 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 945A940694; Sat, 29 Oct 2022 14:11:09 +0200 (CEST) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by mails.dpdk.org (Postfix) with ESMTP id A365140146 for ; Sat, 29 Oct 2022 14:11:08 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1667045468; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=fitzFfaAWDJq/b57ou07vzrkiLiK+NW4DEZrCt5qW6s=; b=ftWrg2+LKzaYEeenNRvzpHDMwctJgjwguDmv+096g0HYcxxwR4u7TuSCkN3KjqE5k21fVL sQG54Ea5ADj0+pD2zrKmyneInwsTaNLnzLb16hakUQaa/se6B7q3QLDoBI2qWuVeq9b15c zz9nKLexaCLGpfuH//vDLxyTaEQ8QU8= Received: from mail-qt1-f198.google.com (mail-qt1-f198.google.com [209.85.160.198]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-151-Y9nQ-h7iNYK8iXYJ0yj59Q-1; Sat, 29 Oct 2022 08:11:06 -0400 X-MC-Unique: Y9nQ-h7iNYK8iXYJ0yj59Q-1 Received: by mail-qt1-f198.google.com with SMTP id cj6-20020a05622a258600b003a519d02f59so424708qtb.5 for ; Sat, 29 Oct 2022 05:11:06 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-language:content-transfer-encoding:in-reply-to:mime-version :user-agent:date:message-id:from:references:cc:to:subject :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=fitzFfaAWDJq/b57ou07vzrkiLiK+NW4DEZrCt5qW6s=; b=mR1qibVHB8xXIHblkqR4aEisuUQYA5zDdWvSg6F1PbZ8ogdT3DdXWBboSMvYX99ZXs wXgjXlTG6lSHxMrEWPeZnhDRulgV1t6tR/GduMkwbdgGED5wBuvMSmG3TvogPFDac3no c485BWIW9Ga85OZ9TreienZv0FOmCsYk8fUv11SfBpScDJuz4BL+Aq0IKqCeFjiH31gu YTCYywg23lZLDoT9luTiQ8R7Nd4HB0vbNxAgNiHMdzTlaZo4YQ/xq/4kT1WldO7U6d3r AgZL4ry+DylApaPjsaZ8I5GsKgrdaiboKt5HJv6LY61DpSeUVxO+Q2GPOiFjoXid2z32 5k5w== X-Gm-Message-State: ACrzQf27ViucgVjEVB6muf/RIaMURIK0eMkvbQyEfHgZ1hgg2L9VwfTd 8wWihrywUnzvGdJoKe9iV+XEH6REtELwbi7J61pOSCc2Sv6nmdDdF1qMXSJt9HSnvobUA/xsQ1H lfB0= X-Received: by 2002:ac8:5712:0:b0:39c:cd48:25e4 with SMTP id 18-20020ac85712000000b0039ccd4825e4mr3280545qtw.581.1667045466217; Sat, 29 Oct 2022 05:11:06 -0700 (PDT) X-Google-Smtp-Source: AMsMyM4XwJh4LyRoqk7F3HRTK44QEh3gsB6Skbc03c+YI3wBI4sNsd3h5MfeLVN5jnAkeDmE6S6njQ== X-Received: by 2002:ac8:5712:0:b0:39c:cd48:25e4 with SMTP id 18-20020ac85712000000b0039ccd4825e4mr3280521qtw.581.1667045465980; Sat, 29 Oct 2022 05:11:05 -0700 (PDT) Received: from localhost.localdomain (024-205-208-113.res.spectrum.com. [24.205.208.113]) by smtp.gmail.com with ESMTPSA id gd26-20020a05622a5c1a00b0039764587192sm732745qtb.57.2022.10.29.05.11.04 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Sat, 29 Oct 2022 05:11:05 -0700 (PDT) Subject: Re: [PATCH v6 1/1] baseband/acc100: add workaround for deRM corner cases To: "Chautru, Nicolas" , Maxime Coquelin , "Vargas, Hernan" , "dev@dpdk.org" , "gakhil@marvell.com" Cc: "Zhang, Qi Z" , David Marchand , Thomas Monjalon References: <20221025023824.127049-1-hernan.vargas@intel.com> <20221025023824.127049-2-hernan.vargas@intel.com> From: Tom Rix Message-ID: Date: Sat, 29 Oct 2022 05:11:03 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.10.1 MIME-Version: 1.0 In-Reply-To: X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Content-Language: en-US X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org On 10/25/22 8:01 AM, Chautru, Nicolas wrote: > Hi Maxime, > >> -----Original Message----- >> From: Maxime Coquelin >> Sent: Tuesday, October 25, 2022 1:26 AM >> To: Vargas, Hernan ; dev@dpdk.org; >> gakhil@marvell.com; trix@redhat.com >> Cc: Chautru, Nicolas ; Zhang, Qi Z >> ; David Marchand ; >> Thomas Monjalon >> Subject: Re: [PATCH v6 1/1] baseband/acc100: add workaround for deRM >> corner cases >> >> Hi Hernan, >> >> On 10/25/22 04:38, Hernan Vargas wrote: >>> Add function to support de-ratematch pre-processing for SW corner cases. >>> Some specific 5GUL FEC corner cases may cause unintended back pressure >>> and in some cases a potential stability issue on the ACC100. >>> To be able to avoid completely such potential issue, the PMD can >>> preempt such code block configuration so that to process the first >>> level deRM in SW using the SDK libraries prior to running the rest of >>> the FEC decoding in HW using an amended code block configuration. >>> In case meson build system doesn't find such SDK libraries, the fall >>> method is to run in HW with a warning. >>> >>> Signed-off-by: Hernan Vargas >>> --- >>> drivers/baseband/acc/acc_common.h | 8 ++ >>> drivers/baseband/acc/meson.build | 21 +++++ >>> drivers/baseband/acc/rte_acc100_pmd.c | 108 >> +++++++++++++++++++++++++- >>> 3 files changed, 134 insertions(+), 3 deletions(-) >>> >>> diff --git a/drivers/baseband/acc/acc_common.h >>> b/drivers/baseband/acc/acc_common.h >>> index eae7eab4e9..5e8972b40a 100644 >>> --- a/drivers/baseband/acc/acc_common.h >>> +++ b/drivers/baseband/acc/acc_common.h >>> @@ -123,6 +123,14 @@ >>> #define ACC_HARQ_ALIGN_64B 64 >>> #define ACC_MAX_ZC 384 >>> >>> +/* De-ratematch code rate limitation when padding is required */ >>> +#define ACC_LIM_03 2 /* 0.03 */ #define ACC_LIM_09 6 /* 0.09 */ >>> +#define ACC_LIM_14 9 /* 0.14 */ #define ACC_LIM_21 14 /* 0.21 */ >>> +#define ACC_LIM_31 20 /* 0.31 */ #define ACC_MAX_E (128 * 1024 - 2) >>> + >>> /* Helper macro for logging */ >>> #define rte_acc_log(level, fmt, ...) \ >>> rte_log(RTE_LOG_ ## level, RTE_LOG_NOTICE, fmt "\n", \ diff --git >>> a/drivers/baseband/acc/meson.build b/drivers/baseband/acc/meson.build >>> index 77c393b533..a5fc4fed01 100644 >>> --- a/drivers/baseband/acc/meson.build >>> +++ b/drivers/baseband/acc/meson.build >>> @@ -1,6 +1,27 @@ >>> # SPDX-License-Identifier: BSD-3-Clause >>> # Copyright(c) 2020 Intel Corporation >>> >>> +# Check for FlexRAN SDK libraries >>> +dep_dec5g = dependency('flexran_sdk_ldpc_decoder_5gnr', required: >>> +false) >>> + >>> +if dep_dec5g.found() >>> + ext_deps += cc.find_library('libstdc++', required: true) >>> + ext_deps += cc.find_library('libirc', required: true) >>> + ext_deps += cc.find_library('libimf', required: true) >>> + ext_deps += cc.find_library('libipps', required: true) >>> + ext_deps += cc.find_library('libsvml', required: true) >>> + ext_deps += dep_dec5g >>> + ext_deps += dependency('flexran_sdk_ldpc_encoder_5gnr', required: >> true) >>> + ext_deps += dependency('flexran_sdk_LDPC_ratematch_5gnr', required: >> true) >>> + ext_deps += dependency('flexran_sdk_rate_dematching_5gnr', required: >> true) >>> + ext_deps += dependency('flexran_sdk_turbo', required: true) >>> + ext_deps += dependency('flexran_sdk_crc', required: true) >>> + ext_deps += dependency('flexran_sdk_rate_matching', required: true) >>> + ext_deps += dependency('flexran_sdk_common', required: true) >>> + cflags += ['-DRTE_BBDEV_SDK_AVX2'] >>> + cflags += ['-DRTE_BBDEV_SDK_AVX512'] endif >>> + >>> deps += ['bbdev', 'bus_pci'] >>> >>> sources = files('rte_acc100_pmd.c', 'rte_acc200_pmd.c') diff --git >>> a/drivers/baseband/acc/rte_acc100_pmd.c >>> b/drivers/baseband/acc/rte_acc100_pmd.c >>> index 23bc5d25bb..e8b230e563 100644 >>> --- a/drivers/baseband/acc/rte_acc100_pmd.c >>> +++ b/drivers/baseband/acc/rte_acc100_pmd.c >>> @@ -25,6 +25,10 @@ >>> #include "acc101_pmd.h" >>> #include "acc200_cfg.h" >>> >>> +#ifdef RTE_BBDEV_SDK_AVX512 >>> +#include #endif >>> + >>> #ifdef RTE_LIBRTE_BBDEV_DEBUG >>> RTE_LOG_REGISTER_DEFAULT(acc100_logtype, DEBUG); >>> #else >>> @@ -756,6 +760,14 @@ acc100_queue_setup(struct rte_bbdev *dev, >> uint16_t queue_id, >>> ret = -ENOMEM; >>> goto free_lb_out; >>> } >>> + q->derm_buffer = rte_zmalloc_socket(dev->device->driver->name, >>> + RTE_BBDEV_TURBO_MAX_CB_SIZE * 10, >>> + RTE_CACHE_LINE_SIZE, conf->socket); >>> + if (q->derm_buffer == NULL) { >>> + rte_bbdev_log(ERR, "Failed to allocate derm_buffer memory"); >>> + ret = -ENOMEM; >>> + goto free_companion_ring_addr; >>> + } >>> >>> /* >>> * Software queue ring wraps synchronously with the HW when it >>> reaches @@ -776,7 +788,7 @@ acc100_queue_setup(struct rte_bbdev *dev, >> uint16_t queue_id, >>> q_idx = acc100_find_free_queue_idx(dev, conf); >>> if (q_idx == -1) { >>> ret = -EINVAL; >>> - goto free_companion_ring_addr; >>> + goto free_derm_buffer; >>> } >>> >>> q->qgrp_id = (q_idx >> ACC100_GRP_ID_SHIFT) & 0xF; @@ -804,6 >> +816,9 >>> @@ acc100_queue_setup(struct rte_bbdev *dev, uint16_t queue_id, >>> dev->data->queues[queue_id].queue_private = q; >>> return 0; >>> >>> +free_derm_buffer: >>> + rte_free(q->derm_buffer); >>> + q->derm_buffer = NULL; >>> free_companion_ring_addr: >>> rte_free(q->companion_ring_addr); >>> q->companion_ring_addr = NULL; >>> @@ -890,6 +905,7 @@ acc100_queue_release(struct rte_bbdev *dev, >> uint16_t q_id) >>> /* Mark the Queue as un-assigned */ >>> d->q_assigned_bit_map[q->qgrp_id] &= (0xFFFFFFFFFFFFFFFF - >>> (uint64_t) (1 << q->aq_id)); >>> + rte_free(q->derm_buffer); >>> rte_free(q->companion_ring_addr); >>> rte_free(q->lb_in); >>> rte_free(q->lb_out); >>> @@ -3111,10 +3127,44 @@ harq_loopback(struct acc_queue *q, struct >> rte_bbdev_dec_op *op, >>> return 1; >>> } >>> >>> +/** Assess whether a work around is required for the deRM corner >>> +cases */ static inline bool derm_workaround_required(struct >>> +rte_bbdev_op_ldpc_dec *ldpc_dec, struct acc_queue *q) { >>> + if (!is_acc100(q)) >>> + return false; >>> + int32_t e = ldpc_dec->cb_params.e; >>> + int q_m = ldpc_dec->q_m; >>> + int z_c = ldpc_dec->z_c; >>> + int K = (ldpc_dec->basegraph == 1 ? ACC_K_ZC_1 : ACC_K_ZC_2) >>> + * z_c; >>> + >>> + bool required = false; >>> + if (ldpc_dec->basegraph == 1) { >>> + if ((q_m == 4) && (z_c >= 320) && (e * ACC_LIM_31 > K * 64)) >>> + required = true; >>> + else if ((e * ACC_LIM_21 > K * 64)) >>> + required = true; >>> + } else { >>> + if (q_m <= 2) { >>> + if ((z_c >= 208) && (e * ACC_LIM_09 > K * 64)) >>> + required = true; >>> + else if ((z_c < 208) && (e * ACC_LIM_03 > K * 64)) >>> + required = true; >>> + } else if (e * ACC_LIM_14 > K * 64) >>> + required = true; >>> + } >>> + if (required) >>> + rte_bbdev_log(INFO, "Running deRM pre-processing in SW"); >>> + >>> + return required; >>> +} >>> + >>> /** Enqueue one decode operations for ACC100 device in CB mode */ >>> static inline int >>> enqueue_ldpc_dec_one_op_cb(struct acc_queue *q, struct >> rte_bbdev_dec_op *op, >>> - uint16_t total_enqueued_cbs, bool same_op) >>> + uint16_t total_enqueued_cbs, bool same_op, >>> + struct rte_bbdev_queue_data *q_data) >>> { >>> int ret; >>> if (unlikely(check_bit(op->ldpc_dec.op_flags, >>> @@ -3168,6 +3218,58 @@ enqueue_ldpc_dec_one_op_cb(struct acc_queue >> *q, struct rte_bbdev_dec_op *op, >>> } else { >>> struct acc_fcw_ld *fcw; >>> uint32_t seg_total_left; >>> + >>> + if (derm_workaround_required(&op->ldpc_dec, q)) { >>> + #ifdef RTE_BBDEV_SDK_AVX512 >>> + struct rte_bbdev_op_ldpc_dec *dec = &op->ldpc_dec; >>> + struct bblib_rate_dematching_5gnr_request >> derm_req; >>> + struct bblib_rate_dematching_5gnr_response >> derm_resp; >>> + uint8_t *in; >>> + >>> + /* Checking input size is matching with E */ >>> + if (dec->input.data->data_len < dec->cb_params.e) { >>> + rte_bbdev_log(ERR, "deRM: Input size >> mismatch"); >>> + return -EFAULT; >>> + } >>> + /* Run first deRM processing in SW */ >>> + in = rte_pktmbuf_mtod_offset(dec->input.data, uint8_t >> *, in_offset); >>> + derm_req.p_in = (int8_t *) in; >>> + derm_req.p_harq = (int8_t *) q->derm_buffer; >>> + derm_req.base_graph = dec->basegraph; >>> + derm_req.zc = dec->z_c; >>> + derm_req.ncb = dec->n_cb; >>> + derm_req.e = dec->cb_params.e; >>> + if (derm_req.e > ACC_MAX_E) { >>> + rte_bbdev_log(WARNING, >>> + "deRM: E %d > %d max", >>> + derm_req.e, ACC_MAX_E); >>> + derm_req.e = ACC_MAX_E; >>> + } >>> + derm_req.k0 = 0; /* Actual output from SDK */ >>> + derm_req.isretx = false; >>> + derm_req.rvid = dec->rv_index; >>> + derm_req.modulation_order = dec->q_m; >>> + derm_req.start_null_index = >>> + (dec->basegraph == 1 ? 22 : 10) >>> + * dec->z_c - 2 * dec->z_c >>> + - dec->n_filler; >>> + derm_req.num_of_null = dec->n_filler; >>> + bblib_rate_dematching_5gnr(&derm_req, >> &derm_resp); >>> + /* Force back the HW DeRM */ >>> + dec->q_m = 1; >>> + dec->cb_params.e = dec->n_cb - dec->n_filler; >>> + dec->rv_index = 0; >>> + rte_memcpy(in, q->derm_buffer, dec->cb_params.e); >>> + /* Capture counter when pre-processing is used */ >>> + q_data->queue_stats.enqueue_warn_count++; >>> + #else >>> + RTE_SET_USED(q_data); >>> + rte_bbdev_log(WARNING, >>> + "Corner case may require deRM pre- >> processing in SDK" >>> + ); >>> + #endif >>> + } >>> + >>> fcw = &desc->req.fcw_ld; >>> q->d->fcw_ld_fill(op, fcw, harq_layout); >>> >>> @@ -3721,7 +3823,7 @@ acc100_enqueue_ldpc_dec_cb(struct >> rte_bbdev_queue_data *q_data, >>> ops[i]->ldpc_dec.n_cb, ops[i]->ldpc_dec.q_m, >>> ops[i]->ldpc_dec.n_filler, ops[i]- >>> ldpc_dec.cb_params.e, >>> same_op); >>> - ret = enqueue_ldpc_dec_one_op_cb(q, ops[i], i, same_op); >>> + ret = enqueue_ldpc_dec_one_op_cb(q, ops[i], i, same_op, >> q_data); >>> if (ret < 0) { >>> acc_enqueue_invalid(q_data); >>> break; >> As I already replied on a previous revision, I think it is not acceptable to >> condition the fix on forcing to depend on a proprietary SDK. The ACC100 driver >> did not have this Flexran SDK requirement initially, not like the SW-only BBDEV >> PMDs. > The SDK is not proprietary. This is available on IDZ with a permissible license and includes a plain C code implementation in case there is no AVX512 isa on the CPU. > >> In my opinion, you can keep the call to Flexran SDK API but you should provide >> an open-source alternative, even if not optimized so that it is at least >> functionnal. Later contributors could provide an optimized open-source version >> if they feel the need. > Again the SDK is available on the IDZ with a permissible license and includes a plain C implementation. Agreeing this is a license issue. Additionally, this approach does not scale. If everyone solved corner cases by referencing a external library to pull in a function we would quickly have configuration mess. Even within using the flexran sdk its a mess, which version of the external library works with which version of dpdk ?  Leave it to the user to try and when it fails who fixes what ? Distro wise we can not use idz so this is dead code for dpdk in fedora and rhel. Can the c implementation be imported into dpdk or someone clean room it? Tom > >> I had a quick look at the bblib_rate_dematching_5gnr C implementation, and it >> is reasonably small and there is really nothing special in it. >> >> Not having an open-source alternative is problematic, because it could make CI >> to fail when using test-bbdev application if the HW bug is triggered. > CI would not fail. > We can discuss further offline later this week as there seems to be some confusion still. > > Thanks > Nic > >> Regards, >> Maxime