Date: Tue, 3 Oct 2023 16:36:48 +0200
Subject: Re: [PATCH v3 09/12] baseband/acc: add FFT support to VRB2 variant
To: Nicolas Chautru, dev@dpdk.org
Cc: hemant.agrawal@nxp.com, david.marchand@redhat.com, hernan.vargas@intel.com
References: <20230929163516.3636499-1-nicolas.chautru@intel.com> <20230929163516.3636499-10-nicolas.chautru@intel.com>
From: Maxime Coquelin
In-Reply-To: <20230929163516.3636499-10-nicolas.chautru@intel.com>

On 9/29/23 18:35, Nicolas Chautru wrote:
> Support for the FFT the processing specific to the
> VRB2 variant.
>
> Signed-off-by: Nicolas Chautru
> ---
>  drivers/baseband/acc/rte_vrb_pmd.c | 132 ++++++++++++++++++++++++++++-
>  1 file changed, 128 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/baseband/acc/rte_vrb_pmd.c b/drivers/baseband/acc/rte_vrb_pmd.c
> index 93add82947..ce4b90d8e7 100644
> --- a/drivers/baseband/acc/rte_vrb_pmd.c
> +++ b/drivers/baseband/acc/rte_vrb_pmd.c
> @@ -903,6 +903,9 @@ vrb_queue_setup(struct rte_bbdev *dev, uint16_t queue_id,
>  			ACC_FCW_LD_BLEN : (conf->op_type == RTE_BBDEV_OP_FFT ?
>  			ACC_FCW_FFT_BLEN : ACC_FCW_MLDTS_BLEN))));
>
> +	if ((q->d->device_variant == VRB2_VARIANT) && (conf->op_type == RTE_BBDEV_OP_FFT))
> +		fcw_len = ACC_FCW_FFT_BLEN_3;
> +
>  	for (desc_idx = 0; desc_idx < d->sw_ring_max_depth; desc_idx++) {
>  		desc = q->ring_addr + desc_idx;
>  		desc->req.word0 = ACC_DMA_DESC_TYPE;
> @@ -1323,6 +1326,24 @@ vrb_dev_info_get(struct rte_bbdev *dev, struct rte_bbdev_driver_info *dev_info)
>  				.num_buffers_soft_out = 0,
>  			}
>  		},
> +		{
> +			.type = RTE_BBDEV_OP_FFT,
> +			.cap.fft = {
> +				.capability_flags =
> +						RTE_BBDEV_FFT_WINDOWING |
> +						RTE_BBDEV_FFT_CS_ADJUSTMENT |
> +						RTE_BBDEV_FFT_DFT_BYPASS |
> +						RTE_BBDEV_FFT_IDFT_BYPASS |
> +						RTE_BBDEV_FFT_FP16_INPUT |
> +						RTE_BBDEV_FFT_FP16_OUTPUT |
> +						RTE_BBDEV_FFT_POWER_MEAS |
> +						RTE_BBDEV_FFT_WINDOWING_BYPASS,
> +				.num_buffers_src =
> +						1,
> +				.num_buffers_dst =
> +						1,
> +			}
> +		},
>  		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
>  	};
>
> @@ -3849,6 +3870,47 @@ vrb1_fcw_fft_fill(struct rte_bbdev_fft_op *op, struct acc_fcw_fft *fcw)
>  		fcw->bypass = 0;
>  }
>
> +/* Fill in a frame control word for FFT processing. */
> +static inline void
> +vrb2_fcw_fft_fill(struct rte_bbdev_fft_op *op, struct acc_fcw_fft_3 *fcw)
> +{
> +	fcw->in_frame_size = op->fft.input_sequence_size;
> +	fcw->leading_pad_size = op->fft.input_leading_padding;
> +	fcw->out_frame_size = op->fft.output_sequence_size;
> +	fcw->leading_depad_size = op->fft.output_leading_depadding;
> +	fcw->cs_window_sel = op->fft.window_index[0] +
> +			(op->fft.window_index[1] << 8) +
> +			(op->fft.window_index[2] << 16) +
> +			(op->fft.window_index[3] << 24);
> +	fcw->cs_window_sel2 = op->fft.window_index[4] +
> +			(op->fft.window_index[5] << 8);
> +	fcw->cs_enable_bmap = op->fft.cs_bitmap;
> +	fcw->num_antennas = op->fft.num_antennas_log2;
> +	fcw->idft_size = op->fft.idft_log2;
> +	fcw->dft_size = op->fft.dft_log2;
> +	fcw->cs_offset = op->fft.cs_time_adjustment;
> +	fcw->idft_shift = op->fft.idft_shift;
> +	fcw->dft_shift = op->fft.dft_shift;
> +	fcw->cs_multiplier = op->fft.ncs_reciprocal;
> +	fcw->power_shift = op->fft.power_shift;
> +	fcw->exp_adj = op->fft.fp16_exp_adjust;
> +	fcw->fp16_in = check_bit(op->fft.op_flags, RTE_BBDEV_FFT_FP16_INPUT);
> +	fcw->fp16_out = check_bit(op->fft.op_flags, RTE_BBDEV_FFT_FP16_OUTPUT);
> +	fcw->power_en = check_bit(op->fft.op_flags, RTE_BBDEV_FFT_POWER_MEAS);
> +	if (check_bit(op->fft.op_flags,
> +			RTE_BBDEV_FFT_IDFT_BYPASS)) {
> +		if (check_bit(op->fft.op_flags,
> +				RTE_BBDEV_FFT_WINDOWING_BYPASS))
> +			fcw->bypass = 2;
> +		else
> +			fcw->bypass = 1;
> +	} else if (check_bit(op->fft.op_flags,
> +			RTE_BBDEV_FFT_DFT_BYPASS))
> +		fcw->bypass = 3;
> +	else
> +		fcw->bypass = 0;

The only differences I see with VRB1 are backed by the corresponding
op_flags (POWER & FP16), is that correct? If so, it does not make sense
to me to have a specific function for VRB2.
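
As a rough sketch of the kind of sharing I have in mind, assuming the
bypass selection really is identical on both variants: the decision tree
quoted above could live in one small helper that both fill paths call.
The flag values and the names flag_set() and fft_bypass_mode() below are
placeholders for illustration only, not the real RTE_BBDEV_FFT_*
definitions or existing driver functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder flag bits for illustration only; the real values are the
 * RTE_BBDEV_FFT_* definitions from the bbdev library.
 */
#define FFT_IDFT_BYPASS      (1u << 0)
#define FFT_DFT_BYPASS       (1u << 1)
#define FFT_WINDOWING_BYPASS (1u << 2)

/* Hypothetical stand-in for the driver's check_bit() helper. */
static inline bool
flag_set(uint32_t bitmap, uint32_t mask)
{
	return (bitmap & mask) != 0;
}

/* Same decision tree as the bypass branch quoted above:
 * IDFT bypass takes priority (2 if windowing is bypassed too, else 1),
 * then DFT bypass (3), otherwise no bypass (0).
 */
static inline uint8_t
fft_bypass_mode(uint32_t op_flags)
{
	if (flag_set(op_flags, FFT_IDFT_BYPASS))
		return flag_set(op_flags, FFT_WINDOWING_BYPASS) ? 2 : 1;
	if (flag_set(op_flags, FFT_DFT_BYPASS))
		return 3;
	return 0;
}
```

With something like this, only the fields that actually differ between
the two FCW layouts would need variant-specific code.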

> +}
> +
>  static inline int
>  vrb1_dma_desc_fft_fill(struct rte_bbdev_fft_op *op,
>  		struct acc_dma_req_desc *desc,
> @@ -3882,6 +3944,58 @@ vrb1_dma_desc_fft_fill(struct rte_bbdev_fft_op *op,
>  	return 0;
>  }
>
> +static inline int
> +vrb2_dma_desc_fft_fill(struct rte_bbdev_fft_op *op,
> +		struct acc_dma_req_desc *desc,
> +		struct rte_mbuf *input, struct rte_mbuf *output, struct rte_mbuf *win_input,
> +		struct rte_mbuf *pwr, uint32_t *in_offset, uint32_t *out_offset,
> +		uint32_t *win_offset, uint32_t *pwr_offset)
> +{
> +	bool pwr_en = check_bit(op->fft.op_flags, RTE_BBDEV_FFT_POWER_MEAS);
> +	bool win_en = check_bit(op->fft.op_flags, RTE_BBDEV_FFT_DEWINDOWING);
> +	int num_cs = 0, i, bd_idx = 1;
> +
> +	/* FCW already done */
> +	acc_header_init(desc);
> +
> +	RTE_SET_USED(win_input);
> +	RTE_SET_USED(win_offset);
> +
> +	desc->data_ptrs[bd_idx].address = rte_pktmbuf_iova_offset(input, *in_offset);
> +	desc->data_ptrs[bd_idx].blen = op->fft.input_sequence_size * ACC_IQ_SIZE;
> +	desc->data_ptrs[bd_idx].blkid = ACC_DMA_BLKID_IN;
> +	desc->data_ptrs[bd_idx].last = 1;
> +	desc->data_ptrs[bd_idx].dma_ext = 0;
> +	bd_idx++;
> +
> +	desc->data_ptrs[bd_idx].address = rte_pktmbuf_iova_offset(output, *out_offset);
> +	desc->data_ptrs[bd_idx].blen = op->fft.output_sequence_size * ACC_IQ_SIZE;
> +	desc->data_ptrs[bd_idx].blkid = ACC_DMA_BLKID_OUT_HARD;
> +	desc->data_ptrs[bd_idx].last = pwr_en ? 0 : 1;
> +	desc->data_ptrs[bd_idx].dma_ext = 0;
> +	desc->m2dlen = win_en ? 3 : 2;
> +	desc->d2mlen = pwr_en ? 2 : 1;
> +	desc->ib_ant_offset = op->fft.input_sequence_size;
> +	desc->num_ant = op->fft.num_antennas_log2 - 3;
> +
> +	for (i = 0; i < RTE_BBDEV_MAX_CS; i++)
> +		if (check_bit(op->fft.cs_bitmap, 1 << i))
> +			num_cs++;
> +	desc->num_cs = num_cs;
> +
> +	if (pwr_en && pwr) {
> +		bd_idx++;
> +		desc->data_ptrs[bd_idx].address = rte_pktmbuf_iova_offset(pwr, *pwr_offset);
> +		desc->data_ptrs[bd_idx].blen = num_cs * (1 << op->fft.num_antennas_log2) * 4;
> +		desc->data_ptrs[bd_idx].blkid = ACC_DMA_BLKID_OUT_SOFT;
> +		desc->data_ptrs[bd_idx].last = 1;
> +		desc->data_ptrs[bd_idx].dma_ext = 0;
> +	}
> +	desc->ob_cyc_offset = op->fft.output_sequence_size;
> +	desc->ob_ant_offset = op->fft.output_sequence_size * num_cs;
> +	desc->op_addr = op;
> +	return 0;
> +}
>
>  /** Enqueue one FFT operation for device. */
>  static inline int
> @@ -3889,22 +4003,32 @@ vrb_enqueue_fft_one_op(struct acc_queue *q, struct rte_bbdev_fft_op *op,
>  		uint16_t total_enqueued_cbs)
>  {
>  	union acc_dma_desc *desc;
> -	struct rte_mbuf *input, *output;
> -	uint32_t in_offset, out_offset;
> +	struct rte_mbuf *input, *output, *pwr, *win;
> +	uint32_t in_offset, out_offset, pwr_offset, win_offset;
>  	struct acc_fcw_fft *fcw;
>
>  	desc = acc_desc(q, total_enqueued_cbs);
>  	input = op->fft.base_input.data;
>  	output = op->fft.base_output.data;
> +	pwr = op->fft.power_meas_output.data;
> +	win = op->fft.dewindowing_input.data;
>  	in_offset = op->fft.base_input.offset;
>  	out_offset = op->fft.base_output.offset;
> +	pwr_offset = op->fft.power_meas_output.offset;
> +	win_offset = op->fft.dewindowing_input.offset;
>
>  	fcw = (struct acc_fcw_fft *) (q->fcw_ring +
>  			((q->sw_ring_head + total_enqueued_cbs) & q->sw_ring_wrap_mask)
>  			* ACC_MAX_FCW_SIZE);
>
> -	vrb1_fcw_fft_fill(op, fcw);
> -	vrb1_dma_desc_fft_fill(op, &desc->req, input, output, &in_offset, &out_offset);
> +	if (q->d->device_variant == VRB1_VARIANT) {
> +		vrb1_fcw_fft_fill(op, fcw);
> +		vrb1_dma_desc_fft_fill(op, &desc->req, input, output, &in_offset, &out_offset);
> +	} else {
> +		vrb2_fcw_fft_fill(op, (struct acc_fcw_fft_3 *) fcw);
> +		vrb2_dma_desc_fft_fill(op, &desc->req, input, output, win, pwr,
> +				&in_offset, &out_offset, &win_offset, &pwr_offset);
> +	}
>  #ifdef RTE_LIBRTE_BBDEV_DEBUG
>  	rte_memdump(stderr, "FCW", &desc->req.fcw_fft,
>  			sizeof(desc->req.fcw_fft));