From: Maxime Coquelin
To: Hernan Vargas, dev@dpdk.org, gakhil@marvell.com, trix@redhat.com
Cc: nicolas.chautru@intel.com, qi.z.zhang@intel.com
Subject: Re: [PATCH v2 13/37] baseband/acc10x: limit cases for HARQ pruning
Date: Thu, 15 Sep 2022 09:37:14 +0200
In-Reply-To: <20220820023157.189047-14-hernan.vargas@intel.com>
References: <20220820023157.189047-1-hernan.vargas@intel.com> <20220820023157.189047-14-hernan.vargas@intel.com>

On 8/20/22 04:31, Hernan Vargas wrote:
> Add flag ACC101_HARQ_PRUNING_OPTIMIZATION to limit cases when HARQ
> pruning is valid.
>
> Signed-off-by: Hernan Vargas
> ---
>  drivers/baseband/acc100/rte_acc100_pmd.c | 52 +++++++++++++++++++-----
>  1 file changed, 41 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
> index 81bae4d695..e47f7d68c2 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> @@ -1370,17 +1370,23 @@ acc100_fcw_ld_fill(struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
>  	harq_index = hq_index(op->ldpc_dec.harq_combined_output.offset);
>  #ifdef ACC100_EXT_MEM
>  	/* Limit cases when HARQ pruning is valid */
> +#ifdef ACC100_HARQ_PRUNING_OPTIMIZATION
>  	harq_prun = ((op->ldpc_dec.harq_combined_output.offset %
> -			ACC100_HARQ_OFFSET) == 0) &&
> -			(op->ldpc_dec.harq_combined_output.offset <= UINT16_MAX
> -			* ACC100_HARQ_OFFSET);
> +			ACC100_HARQ_OFFSET) == 0);
> +#endif

Optimizations should not be put under #ifdefs, it will become a testing
hell otherwise. CI will have to run as many builds as there are possible
combinations, which is not sustainable.

Even if not part of this patch, the "#ifdef ACC100_EXT_MEM" should also
be removed.

>  #endif
>  	if (fcw->hcin_en > 0) {
>  		harq_in_length = op->ldpc_dec.harq_combined_input.length;
>  		if (fcw->hcin_decomp_mode > 0)
>  			harq_in_length = harq_in_length * 8 / 6;
> -		harq_in_length = RTE_ALIGN(harq_in_length, 64);
> -		if ((harq_layout[harq_index].offset > 0) & harq_prun) {
> +		harq_in_length = RTE_MIN(harq_in_length, op->ldpc_dec.n_cb
> +				- op->ldpc_dec.n_filler);
> +		/* Alignment on next 64B - Already enforced from HC output */
> +		harq_in_length = RTE_ALIGN_FLOOR(harq_in_length, 64);
> +		/* Stronger alignment requirement when in decompression mode */
> +		if (fcw->hcin_decomp_mode > 0)
> +			harq_in_length = RTE_ALIGN_FLOOR(harq_in_length, 256);
> +		if ((harq_layout[harq_index].offset > 0) && harq_prun) {
>  			rte_bbdev_log_debug("HARQ IN offset unexpected for now\n");
>  			fcw->hcin_size0 = harq_layout[harq_index].size0;
>  			fcw->hcin_offset = harq_layout[harq_index].offset;
> @@ -1455,6 +1461,7 @@ acc101_fcw_ld_fill(struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
>  	uint16_t harq_out_length, harq_in_length, ncb_p, k0_p, parity_offset;
>  	uint32_t harq_index;
>  	uint32_t l;
> +	bool harq_prun = false;
>
>  	fcw->qm = op->ldpc_dec.q_m;
>  	fcw->nfiller = op->ldpc_dec.n_filler;
> @@ -1500,6 +1507,13 @@ acc101_fcw_ld_fill(struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
>  	fcw->llr_pack_mode = check_bit(op->ldpc_dec.op_flags,
>  			RTE_BBDEV_LDPC_LLR_COMPRESSION);
>  	harq_index = hq_index(op->ldpc_dec.harq_combined_output.offset);
> +	#ifdef ACC100_EXT_MEM
> +	/* Limit cases when HARQ pruning is valid */
> +#ifdef ACC101_HARQ_PRUNING_OPTIMIZATION
> +	harq_prun = ((op->ldpc_dec.harq_combined_output.offset %
> +			ACC101_HARQ_OFFSET) == 0);
> +#endif
> +#endif
>  	if (fcw->hcin_en > 0) {
>  		harq_in_length = op->ldpc_dec.harq_combined_input.length;
>  		if (fcw->hcin_decomp_mode > 0)
> @@ -1508,9 +1522,17 @@ acc101_fcw_ld_fill(struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
>  				- op->ldpc_dec.n_filler);
>  		/* Alignment on next 64B - Already enforced from HC output */
>  		harq_in_length = RTE_ALIGN_FLOOR(harq_in_length, 64);
> -		fcw->hcin_size0 = harq_in_length;
> -		fcw->hcin_offset = 0;
> -		fcw->hcin_size1 = 0;
> +		if ((harq_layout[harq_index].offset > 0) && harq_prun) {
> +			rte_bbdev_log_debug("HARQ IN offset unexpected for now\n");
> +			fcw->hcin_size0 = harq_layout[harq_index].size0;
> +			fcw->hcin_offset = harq_layout[harq_index].offset;
> +			fcw->hcin_size1 = harq_in_length -
> +					harq_layout[harq_index].offset;
> +		} else {
> +			fcw->hcin_size0 = harq_in_length;
> +			fcw->hcin_offset = 0;
> +			fcw->hcin_size1 = 0;
> +		}
>  	} else {
>  		fcw->hcin_size0 = 0;
>  		fcw->hcin_offset = 0;
> @@ -1551,9 +1573,17 @@ acc101_fcw_ld_fill(struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
>  		harq_out_length = RTE_MIN(harq_out_length, ncb_p);
>  		/* Alignment on next 64B */
>  		harq_out_length = RTE_ALIGN_CEIL(harq_out_length, 64);
> -		fcw->hcout_size0 = harq_out_length;
> -		fcw->hcout_size1 = 0;
> -		fcw->hcout_offset = 0;
> +		if ((k0_p > fcw->hcin_size0 + ACC100_HARQ_OFFSET_THRESHOLD) &&
> +				harq_prun) {
> +			fcw->hcout_size0 = (uint16_t) fcw->hcin_size0;
> +			fcw->hcout_offset = k0_p & 0xFFC0;
> +			fcw->hcout_size1 = harq_out_length - fcw->hcout_offset;
> +		} else {
> +			fcw->hcout_size0 = harq_out_length;
> +			fcw->hcout_size1 = 0;
> +			fcw->hcout_offset = 0;
> +		}
> +
>  		harq_layout[harq_index].offset = fcw->hcout_offset;
>  		harq_layout[harq_index].size0 = fcw->hcout_size0;
>  	} else {
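
To illustrate the #ifdef comment above, here is a rough sketch of what a runtime switch could look like in place of the compile-time flag. None of these names exist in the driver: struct harq_cfg, HARQ_OFFSET_UNIT, harq_pruning_valid() and build_combinations() are all hypothetical, and the offset unit is only a stand-in for ACC100_HARQ_OFFSET. It is meant to show the shape of the idea, not a proposed implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for ACC100_HARQ_OFFSET; the real value is
 * defined in the driver headers and may differ.
 */
#define HARQ_OFFSET_UNIT 32768u

/* Hypothetical per-device setting that replaces the compile-time flag. */
struct harq_cfg {
	bool pruning_enabled;
};

/* Number of distinct builds needed to cover n independent #ifdef flags:
 * each flag is either defined or not, so the count doubles per flag.
 */
uint64_t
build_combinations(unsigned int n_flags)
{
	return (n_flags < 64) ? (UINT64_C(1) << n_flags) : 0;
}

/* Runtime equivalent of the guarded assignment in the patch: pruning is
 * considered valid only when it is enabled and the HARQ combined output
 * offset is a multiple of the offset unit.
 */
bool
harq_pruning_valid(const struct harq_cfg *cfg, uint32_t offset)
{
	return cfg->pruning_enabled && (offset % HARQ_OFFSET_UNIT) == 0;
}
```

With the two flags discussed here (ACC100_EXT_MEM and the pruning one), build_combinations(2) gives 4 builds to cover. With a runtime setting along these lines, a single binary can exercise both the pruned and non-pruned paths in the same test run.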