Subject: Re: [PATCH v4 5/7] bbdev: add new operation for FFT processing
To: "Chautru, Nicolas", "dev@dpdk.org", "thomas@monjalon.net", "gakhil@marvell.com", "hemant.agrawal@nxp.com"
Cc: "maxime.coquelin@redhat.com", "mdr@ashroe.eu", "Richardson, Bruce", "david.marchand@redhat.com", "stephen@networkplumber.org"
References: <1655491040-183649-6-git-send-email-nicolas.chautru@intel.com> <1657067022-54373-1-git-send-email-nicolas.chautru@intel.com> <1657067022-54373-6-git-send-email-nicolas.chautru@intel.com> <0d0baa2c-3c59-ebde-b223-bb6062a814bd@redhat.com>
From: Tom Rix
Message-ID: <556dc305-1e1c-614a-0adc-9dcd7a8505c8@redhat.com>
Date: Thu, 7 Jul 2022 06:09:38 -0700
List-Id: DPDK patches and discussions

Nic,

Not all my comments were addressed.

The one I am most interested in is the default type / size and how it interacts with fp16.

Please see the others below.

On 7/6/22 2:04 PM, Chautru, Nicolas wrote:
> Hi Tom,
>
>> -----Original Message-----
>> From: Tom Rix
>
>>
>> On 7/5/22 5:23 PM, Nicolas Chautru wrote:
>>> Extension of bbdev operation to support FFT based operations.
>>>
>>> Signed-off-by: Nicolas Chautru
>>> Acked-by: Hemant Agrawal
>>> ---
>>> doc/guides/prog_guide/bbdev.rst | 130
>> +++++++++++++++++++++++++++++++++++
>>> lib/bbdev/rte_bbdev.c | 11 ++-
>>> lib/bbdev/rte_bbdev.h | 76 ++++++++++++++++++++
>>> lib/bbdev/rte_bbdev_op.h | 149
>> ++++++++++++++++++++++++++++++++++++++++
>>> lib/bbdev/version.map | 4 ++
>>> 5 files changed, 369 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/doc/guides/prog_guide/bbdev.rst
>>> b/doc/guides/prog_guide/bbdev.rst index 70fa01a..4a055b5 100644
>>> --- a/doc/guides/prog_guide/bbdev.rst
>>> +++ b/doc/guides/prog_guide/bbdev.rst
>>> @@ -1118,6 +1118,136 @@ Figure :numref:`figure_turbo_tb_decode` above
>>> showing the Turbo decoding of CBs using BBDEV interface in TB-mode
>>> is also valid for LDPC decode.
>>>
>>> +BBDEV FFT Operation
>>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> +
>>> +This operation allows to run a combination of DFT and/or IDFT and/or time-
>> domain windowing.
>>> +These can be used in a modular fashion (using bypass modes) or as a
>>> +processing pipeline which can be used for FFT-based baseband signal
>> processing.
>>> +In more details it allows : >>> +- to process the data first through an IDFT of adjustable size and >>> +padding; >>> +- to perform the windowing as a programmable cyclic shift offset of >>> +the data followed by a pointwise multiplication by a time domain >>> +window; >>> +- to process the related data through a DFT of adjustable size and >>> +depadding for each such cyclic shift output. >>> + >>> +A flexible number of Rx antennas are being processed in parallel with the >> same configuration. >>> +The API allows more generally for flexibility in what the PMD may >>> +support (cabability flags) and flexibility to adjust some of the parameters of >> the processing. >>> + >>> +The operation/capability flags that can be set for each FFT operation are >> given below. >>> + >>> + **NOTE:** The actual operation flags that may be used with a >>> + specific BBDEV PMD are dependent on the driver capabilities as >>> + reported via ``rte_bbdev_info_get()``, and may be a subset of those below. 
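To make the NOTE above concrete, here is a minimal sketch of the check an application might do before setting op_flags. It assumes the FFT capability_flags reported through rte_bbdev_info_get() have already been copied into a plain integer. The helper fft_flags_supported() and the FLAG_* constants are hypothetical stand-ins that only mirror the first bit positions of the proposed enum; they are not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins mirroring bit positions of the proposed
 * rte_bbdev_op_fft_flag_bitmasks enum; not the real definitions. */
#define FLAG_FFT_WINDOWING     (1u << 0)
#define FLAG_FFT_CS_ADJUSTMENT (1u << 1)
#define FLAG_FFT_DFT_BYPASS    (1u << 2)

/* Return true when every bit requested in op_flags is also present in
 * the capability flags reported by the driver. */
static bool
fft_flags_supported(uint32_t op_flags, uint32_t capability_flags)
{
    return (op_flags & ~capability_flags) == 0;
}
```

With capabilities FLAG_FFT_WINDOWING | FLAG_FFT_DFT_BYPASS, a request for FLAG_FFT_CS_ADJUSTMENT would be rejected, while a request for FLAG_FFT_WINDOWING alone would pass.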
>>> +
>>> ++--------------------------------------------------------------------+
>>> +|Description of FFT capability flags |
>>>
>> ++===============================================================
>> =====
>>> +++
>>> +|RTE_BBDEV_FFT_WINDOWING |
>>> +| Set to enable/support windowing in time domain |
>>> ++--------------------------------------------------------------------+
>>> +|RTE_BBDEV_FFT_CS_ADJUSTMENT |
>>> +| Set to enable/support the cyclic shift time offset adjustment |
>>> ++--------------------------------------------------------------------+
>>> +|RTE_BBDEV_FFT_DFT_BYPASS |
>>> +| Set to bypass the DFT and use directly the IDFT as an option |
>>> ++--------------------------------------------------------------------+
>>> +|RTE_BBDEV_FFT_IDFT_BYPASS |
>>> +| Set to bypass the IDFT and use directly the DFT as an option |
>>> ++--------------------------------------------------------------------+
>>> +|RTE_BBDEV_FFT_WINDOWING_BYPASS |
>>> +| Set to bypass the time domain windowing as an option |
>>> ++--------------------------------------------------------------------+
>>> +|RTE_BBDEV_FFT_POWER_MEAS
>> Other flags are not truncated, should be
>>
>> RTE_BBDEV_FFT_POWER_MEASUREMENT
>>
> The intention from DPDK recommendation is for these to be kept shortnames, isn't it?
> Above we use many acronyms to keep it short (CS, etc...)
> Even in current BBDEV API we use many truncation to keep names short: OUT, ENC/DEC, HQ, RM on top of acronyms.
> I believe this is still super explicit with that name? Some of other identifier have longer names than this.

If you wanted to keep things short, drop the last _

Generally the use of acronyms should be avoided because they add a layer of jargon that makes the code less readable to all but the writer.

>
>>> |
>>> +| Set to provide an optional power measument of the DFT output |
>>> ++--------------------------------------------------------------------+
>> measurement
> OK Thanks
>
>>> +|RTE_BBDEV_FFT_FP16_INPUT |
>>> +| Set if the input data shall use FP16 format instead of INT16 |
>>> ++--------------------------------------------------------------------+
>>> +|RTE_BBDEV_FFT_FP16_OUTPUT |
>>> +| Set if the output data shall use FP16 format instead of INT16 |
>>> ++--------------------------------------------------------------------+
>>> +
>>> +The structure passed for each FFT operation is given below, with the
>>> +operation flags forming a bitmask in the ``op_flags`` field.
>>> +
>>> +.. code-block:: c
>>> +
>>> + struct rte_bbdev_op_fft {
>>> + struct rte_bbdev_op_data base_input;
>>> + struct rte_bbdev_op_data base_output;
>>> + struct rte_bbdev_op_data power_meas_output;
>> similar to above, meas -> measurement
> See above. Would that really help? I don’t believe there can be any confusion.

Naming is hard. How about dropping the _meas_ and going with power_output?

>
>>> + uint32_t op_flags;
>>> + uint16_t input_sequence_size;
>> Could these be future proofed by increasing small int size's to uint32_t ?
> It is not possible to be that big for any signal processing relevant to that operation.
>
>>> + uint16_t input_leading_padding;
>>> + uint16_t output_sequence_size;
>>> + uint16_t output_leading_depadding;
>>> + uint8_t window_index[RTE_BBDEV_MAX_CS_2];
>>> + uint16_t cs_bitmap;
>>> + uint8_t num_antennas_log2;
>>> + uint8_t idft_log2;
>>> + uint8_t dft_log2;
>> is _log2 needed in variable name if it is documenation ?
> I believe it is a best practice when the variable name may be misleading, ie. this is not the actual dft size as a natural number (2048 for instance) but there is an implied mapping.
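
For readers of the thread, the mapping implied by the _log2 suffix can be sketched as below. This is only an illustration: size_from_log2() is a hypothetical helper name, and it assumes the field holds the base-2 exponent of the size, as the parameter table in the patch describes (10 maps to 1024).

```c
#include <stdint.h>

/* Hypothetical helper: expand a *_log2 field (for instance dft_log2)
 * into the size it stands for. Exponents that do not fit in 32 bits
 * return 0 rather than shifting out of range. */
static inline uint32_t
size_from_log2(uint8_t size_log2)
{
    if (size_log2 >= 32)
        return 0;
    return UINT32_C(1) << size_log2;
}
```

Under that assumption, a 2048-point DFT is requested with dft_log2 = 11 rather than dft_log2 = 2048, which is the confusion the suffix is meant to prevent.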
> >>> + int8_t cs_time_adjustment; >>> + int8_t idft_shift; >>> + int8_t dft_shift; >>> + uint16_t ncs_reciprocal; >>> + uint16_t power_shift; >>> + uint16_t fp16_exp_adjust; >>> + }; >>> + >>> +The FFT parameters are set out in the table below. >>> + >>> ++----------------------+--------------------------------------------------------------+ >>> +|Parameter |Description | >>> >> ++======================+======================================== >> ===== >>> ++=================+ >>> +|base_input |input data | >>> ++----------------------+--------------------------------------------------------------+ >>> +|base_output |output data | >>> ++----------------------+--------------------------------------------------------------+ >>> +|power_meas_output |optional output data with power measurement >> on DFT output | >>> ++----------------------+--------------------------------------------------------------+ >>> +|op_flags |bitmask of all active operation capabilities | >>> ++----------------------+--------------------------------------------------------------+ >>> +|input_sequence_size |size of the input sequence in 32-bits points per >> antenna | >>> ++----------------------+--------------------------------------------------------------+ >>> +|input_leading_padding |number of points padded at the start of input >> data | >>> ++----------------------+--------------------------------------------------------------+ >>> +|output_sequence_size |size of the output sequence per antenna and >> cyclic shift | >>> ++----------------------+--------------------------------------------------------------+ >>> +|output_depadding |number of points depadded at the start of output >> data | >>> ++----------------------+--------------------------------------------------------------+ >> output_leading_depadding > OK Thanks > >>> +|window_index |optional windowing profile index used for each cyclic >> shift | >>> 
++----------------------+--------------------------------------------------------------+ >>> +|cs_bitmap |bitmap of the cyclic shift output requested (LSB for >> index 0) | >>> ++----------------------+--------------------------------------------------------------+ >>> +|num_antennas_log2 |number of antennas as a log2 (10 maps to 1024...) >> | >>> ++----------------------+--------------------------------------------------------------+ >>> +|idft_log2 |iDFT size as a log2 | >>> ++----------------------+--------------------------------------------------------------+ >>> +|dft_log2 |DFT size as a log2 | >>> ++----------------------+--------------------------------------------------------------+ >>> +|cs_time_adjustment |adjustment of time position of all the cyclic shift >> output | >>> ++----------------------+--------------------------------------------------------------+ >>> +|idft_shift |shift down of signal level post iDFT | >>> ++----------------------+--------------------------------------------------------------+ >>> +|dft_shift |shift down of signal level post DFT | >>> ++----------------------+--------------------------------------------------------------+ >>> +|ncs_reciprocal |inverse of max number of CS normalized to 15b (ie. >> 231 for 12)| >>> ++----------------------+--------------------------------------------------------------+ >>> +|power_shift |shift down of level of power measurement when >> enabled | >>> ++----------------------+--------------------------------------------------------------+ >>> +|fp16_exp_adjust |value added to FP16 exponent at conversion from >> INT16 | >>> ++----------------------+--------------------------------------------------------------+ >>> + >>> +The mbuf input ``base_input`` is mandatory for all BBDEV PMDs and is >>> +the incoming data for the processing. Its size may not fit into an >>> +actual mbuf, but the structure is used to pass iova address. 
>>> +The mbuf output ``output`` is mandatory and is output of the FFT >> processing chain. >>> +Each point is a complex number of 32bits : either as 2 INT16 or as 2 >>> +FP16 based when the option supported. >>> +The data layout is based on contiguous concatenation of output data >>> +first by cyclic shift then by antenna. >>> >>> Sample code >>> ----------- >>> diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c index >>> 555bda9..28b105d 100644 >>> --- a/lib/bbdev/rte_bbdev.c >>> +++ b/lib/bbdev/rte_bbdev.c >>> @@ -24,7 +24,7 @@ >>> #define DEV_NAME "BBDEV" >>> >>> /* Number of supported operation types */ -#define >>> BBDEV_OP_TYPE_COUNT 5 >>> +#define BBDEV_OP_TYPE_COUNT 6 >>> /* Number of supported device status */ >>> #define BBDEV_DEV_STATUS_COUNT 9 >>> >>> @@ -854,6 +854,9 @@ struct rte_bbdev * >>> case RTE_BBDEV_OP_LDPC_ENC: >>> result = sizeof(struct rte_bbdev_enc_op); >>> break; >>> + case RTE_BBDEV_OP_FFT: >>> + result = sizeof(struct rte_bbdev_fft_op); >>> + break; >>> default: >>> break; >>> } >>> @@ -877,6 +880,10 @@ struct rte_bbdev * >>> struct rte_bbdev_enc_op *op = element; >>> memset(op, 0, mempool->elt_size); >>> op->mempool = mempool; >>> + } else if (type == RTE_BBDEV_OP_FFT) { >>> + struct rte_bbdev_fft_op *op = element; >>> + memset(op, 0, mempool->elt_size); >>> + op->mempool = mempool; >>> } >>> } >>> >>> @@ -1126,6 +1133,8 @@ struct rte_mempool * >>> "RTE_BBDEV_OP_TURBO_DEC", >>> "RTE_BBDEV_OP_TURBO_ENC", >>> "RTE_BBDEV_OP_LDPC_DEC", >>> + "RTE_BBDEV_OP_LDPC_ENC", >> Why ldpc_enc line, this is already in codebase ? 
>>> + "RTE_BBDEV_OP_FFT", > Thanks, there this is a rebase issue in previous commit > > >>> }; >>> >>> if (op_type < BBDEV_OP_TYPE_COUNT) >>> diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h index >>> ac941d6..ed528b8 100644 >>> --- a/lib/bbdev/rte_bbdev.h >>> +++ b/lib/bbdev/rte_bbdev.h >>> @@ -401,6 +401,12 @@ typedef uint16_t >> (*rte_bbdev_enqueue_dec_ops_t)( >>> struct rte_bbdev_dec_op **ops, >>> uint16_t num); >>> >>> +/** @internal Enqueue fft operations for processing on queue of a >>> +device. */ typedef uint16_t (*rte_bbdev_enqueue_fft_ops_t)( >>> + struct rte_bbdev_queue_data *q_data, >>> + struct rte_bbdev_fft_op **ops, >>> + uint16_t num); >>> + >>> /** @internal Dequeue encode operations from a queue of a device. */ >>> typedef uint16_t (*rte_bbdev_dequeue_enc_ops_t)( >>> struct rte_bbdev_queue_data *q_data, @@ -411,6 +417,11 >> @@ typedef >>> uint16_t (*rte_bbdev_dequeue_dec_ops_t)( >>> struct rte_bbdev_queue_data *q_data, >>> struct rte_bbdev_dec_op **ops, uint16_t num); >>> >>> +/** @internal Dequeue fft operations from a queue of a device. 
*/ >>> +typedef uint16_t (*rte_bbdev_dequeue_fft_ops_t)( >>> + struct rte_bbdev_queue_data *q_data, >>> + struct rte_bbdev_fft_op **ops, uint16_t num); >>> + >>> #define RTE_BBDEV_NAME_MAX_LEN 64 /**< Max length of device name >>> */ >>> >>> /** >>> @@ -459,6 +470,10 @@ struct __rte_cache_aligned rte_bbdev { >>> rte_bbdev_dequeue_enc_ops_t dequeue_ldpc_enc_ops; >>> /** Dequeue decode function */ >>> rte_bbdev_dequeue_dec_ops_t dequeue_ldpc_dec_ops; >>> + /** Enqueue FFT function */ >>> + rte_bbdev_enqueue_fft_ops_t enqueue_fft_ops; >>> + /** Dequeue FFT function */ >>> + rte_bbdev_dequeue_fft_ops_t dequeue_fft_ops; >>> const struct rte_bbdev_ops *dev_ops; /**< Functions exported by >> PMD */ >>> struct rte_bbdev_data *data; /**< Pointer to device data */ >>> enum rte_bbdev_state state; /**< If device is currently used or >>> not */ @@ -591,6 +606,36 @@ struct __rte_cache_aligned rte_bbdev { >>> return dev->enqueue_ldpc_dec_ops(q_data, ops, num_ops); >>> } >>> >>> +/** >>> + * Enqueue a burst of fft operations to a queue of the device. >>> + * This functions only enqueues as many operations as currently >>> +possible and >>> + * does not block until @p num_ops entries in the queue are available. >>> + * This function does not provide any error notification to avoid the >>> + * corresponding overhead. >>> + * >>> + * @param dev_id >>> + * The identifier of the device. >>> + * @param queue_id >>> + * The index of the queue. >>> + * @param ops >>> + * Pointer array containing operations to be enqueued Must have at least >>> + * @p num_ops entries >>> + * @param num_ops >>> + * The maximum number of operations to enqueue. >>> + * >>> + * @return >>> + * The number of operations actually enqueued (this is the number of >> processed >>> + * entries in the @p ops array). 
>>> + */
>>> +__rte_experimental
>>> +static inline uint16_t
>>> +rte_bbdev_enqueue_fft_ops(uint16_t dev_id, uint16_t queue_id,
>>> + struct rte_bbdev_fft_op **ops, uint16_t num_ops) {
>>> + struct rte_bbdev *dev = &rte_bbdev_devices[dev_id];
>> Who checks the input is valid ?

Who checks the input is valid ?

>>> + struct rte_bbdev_queue_data *q_data = &dev->data-
>>> queues[queue_id];
>>> + return dev->enqueue_fft_ops(q_data, ops, num_ops); }
>>>
>>> /**
>>> * Dequeue a burst of processed encode operations from a queue of the
>> device.
>>> @@ -716,6 +761,37 @@ struct __rte_cache_aligned rte_bbdev {
>>> return dev->dequeue_ldpc_dec_ops(q_data, ops, num_ops);
>>> }
>>>
>>> +/**
>>> + * Dequeue a burst of fft operations from a queue of the device.
>>> + * This functions returns only the current contents of the queue, and
>>> +does not
>>> + * block until @ num_ops is available.
>>> + * This function does not provide any error notification to avoid the
>>> + * corresponding overhead.
>>> + *
>>> + * @param dev_id
>>> + * The identifier of the device.
>>> + * @param queue_id
>>> + * The index of the queue.
>>> + * @param ops
>>> + * Pointer array where operations will be dequeued to. Must have at least
>>> + * @p num_ops entries
>>> + * @param num_ops
>>> + * The maximum number of operations to dequeue.
>>> + *
>>> + * @return
>>> + * The number of operations actually dequeued (this is the number of
>> entries
>>> + * copied into the @p ops array).
>>> + */ >>> +__rte_experimental >>> +static inline uint16_t >>> +rte_bbdev_dequeue_fft_ops(uint16_t dev_id, uint16_t queue_id, >>> + struct rte_bbdev_fft_op **ops, uint16_t num_ops) { >>> + struct rte_bbdev *dev = &rte_bbdev_devices[dev_id]; >>> + struct rte_bbdev_queue_data *q_data = &dev->data- >>> queues[queue_id]; >>> + return dev->dequeue_fft_ops(q_data, ops, num_ops); } >>> + >>> /** Definitions of device event types */ >>> enum rte_bbdev_event_type { >>> RTE_BBDEV_EVENT_UNKNOWN, /**< unknown event type */ diff --git >>> a/lib/bbdev/rte_bbdev_op.h b/lib/bbdev/rte_bbdev_op.h index >>> cd82418..3e46f1d 100644 >>> --- a/lib/bbdev/rte_bbdev_op.h >>> +++ b/lib/bbdev/rte_bbdev_op.h >>> @@ -47,6 +47,8 @@ >>> #define RTE_BBDEV_TURBO_MAX_CODE_BLOCKS (64) >>> /* LDPC: Maximum number of Code Blocks in Transport Block.*/ >>> #define RTE_BBDEV_LDPC_MAX_CODE_BLOCKS (256) >>> +/* 12 CS maximum */ >>> +#define RTE_BBDEV_MAX_CS_2 (6) >>> >>> /** Flags for turbo decoder operation and capability structure */ >>> enum rte_bbdev_op_td_flag_bitmasks { @@ -211,6 +213,26 @@ enum >>> rte_bbdev_op_ldpcenc_flag_bitmasks { >>> RTE_BBDEV_LDPC_ENC_CONCATENATION = (1ULL << 7) >>> }; >>> >>> +/** Flags for DFT operation and capability structure */ enum >>> +rte_bbdev_op_fft_flag_bitmasks { >>> + /** Flexible windowing capability */ >>> + RTE_BBDEV_FFT_WINDOWING = (1ULL << 0), >>> + /** Flexible adjustment of Cyclic Shift time offset */ >>> + RTE_BBDEV_FFT_CS_ADJUSTMENT = (1ULL << 1), >>> + /** Set for bypass the DFT and get directly into iDFT input */ >>> + RTE_BBDEV_FFT_DFT_BYPASS = (1ULL << 2), >>> + /** Set for bypass the IDFT and get directly the DFT output */ >>> + RTE_BBDEV_FFT_IDFT_BYPASS = (1ULL << 3), >>> + /** Set for bypass time domain windowing */ >>> + RTE_BBDEV_FFT_WINDOWING_BYPASS = (1ULL << 4), >>> + /** Set for optional power measurement on DFT output */ >>> + RTE_BBDEV_FFT_POWER_MEAS = (1ULL << 5), >> Meas here too, change generally >>> + /** Set if the input data used 
FP16 format */
>>> + RTE_BBDEV_FFT_FP16_INPUT = (1ULL << 6),
>> What are the other data type(s) ?
>>
>> The default is not mentioned, or i missed it.

?

>>
>>> + /** Set if the output data uses FP16 format */
>>> + RTE_BBDEV_FFT_FP16_OUTPUT = (1ULL << 7) };
>>> +
>>> /** Flags for the Code Block/Transport block mode */
>>> enum rte_bbdev_op_cb_mode {
>>> /** One operation is one or fraction of one transport block */ @@
>>> -689,6 +711,55 @@ struct rte_bbdev_op_ldpc_enc {
>>> };
>>> };
>>>
>>> +/** Operation structure for FFT processing.
>>> + *
>>> + * The operation processes the data for multiple antennas in a single
>>> +call
>>> + * (.i.e for all the REs belonging to a given SRS sequence for
>>> +instance)
>>> + *
>>> + * The output mbuf data structure is expected to be allocated by the
>>> + * application with enough room for the output data.
>>> + */
>>> +struct rte_bbdev_op_fft {
>>> + /** Input data starting from first antenna */
>>> + struct rte_bbdev_op_data base_input;
>>> + /** Output data starting from first antenna and first cyclic shift */
>>> + struct rte_bbdev_op_data base_output;
>>> + /** Optional power measurement output data */
>>> + struct rte_bbdev_op_data power_meas_output;
>>> + /** Flags from rte_bbdev_op_fft_flag_bitmasks */
>>> + uint32_t op_flags;
>>> + /** Input sequence size in 32-bits points */
>>> + uint16_t input_sequence_size;
>> size is bytes*4 ? how does this work with fp16 ?

?

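One possible reading of the documentation quoted earlier, where each point is a 32-bit complex value (two INT16 or two FP16 components) and input_sequence_size is counted in points per antenna, is that the byte count does not change with the FP16 flag. A minimal sketch under that assumption follows; fft_input_bytes() is a hypothetical helper, not part of the patch, and the upper bound on the antenna exponent is taken from the "8 to 128" comment in the proposed struct.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption from the quoted documentation: one point is a 32-bit
 * complex value, either 2 x INT16 or 2 x FP16, so 4 bytes either way. */
#define FFT_BYTES_PER_POINT 4u

/* Hypothetical helper: bytes of input covering all antennas.
 * Returns 0 when the antenna exponent exceeds 7 (128 antennas), the
 * largest value mentioned in the patch comments. */
static size_t
fft_input_bytes(uint16_t input_sequence_size, uint8_t num_antennas_log2)
{
    if (num_antennas_log2 > 7)
        return 0;
    return (size_t)input_sequence_size * FFT_BYTES_PER_POINT
        * ((size_t)1 << num_antennas_log2);
}
```

If this reading is wrong and FP16 changes the point width, the constant above is the place where the default type and size would need to be spelled out, which is the open question in this review.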
>>> + /** Padding at the start of the sequence */
>>> + uint16_t input_leading_padding;
>>> + /** Output sequence size in 32-bits points */
>>> + uint16_t output_sequence_size;
>>> + /** Depadding at the start of the DFT output */
>>> + uint16_t output_leading_depadding;
>>> + /** Window index being used for each cyclic shift output */
>>> + uint8_t window_index[RTE_BBDEV_MAX_CS_2];
>>> + /** Bitmap of the cyclic shift output requested */
>>> + uint16_t cs_bitmap;
>>> + /** Number of antennas as a log2 – 8 to 128 */
>>> + uint8_t num_antennas_log2;
>>> + /** iDFT size as a log2 - 32 to 2048 */
>>> + uint8_t idft_log2;
>>> + /** DFT size as a log2 - 8 to 2048 */
>>> + uint8_t dft_log2;
>>> + /** Adjustment of position of the cyclic shifts - -31 to 31 */
>>> + int8_t cs_time_adjustment;
>>> + /** iDFT shift down */
>>> + int8_t idft_shift;
>>> + /** DFT shift down */
>>> + int8_t dft_shift;
>>> + /** NCS reciprocal factor */
>>> + uint16_t ncs_reciprocal;
>>> + /** power measurement out shift down */
>>> + uint16_t power_shift;
>>> + /** Adjust the FP6 exponent for INT<->FP16 conversion */
>>> + uint16_t fp16_exp_adjust;
>>> +};
>>> +
>>> /** List of the capabilities for the Turbo Decoder */
>>> struct rte_bbdev_op_cap_turbo_dec {
>>> /** Flags from rte_bbdev_op_td_flag_bitmasks */ @@ -741,6 +812,16
>>> @@ struct rte_bbdev_op_cap_ldpc_enc {
>>> uint16_t num_buffers_dst;
>>> };
>>>
>>> +/** List of the capabilities for the FFT */ struct
>>> +rte_bbdev_op_cap_fft {
>>> + /** Flags from rte_bbdev_op_ldpcenc_flag_bitmasks */
>> you mean 'from rte_bbdev_op_fft_flag_bitmasks' ?

?

>>> + uint32_t capability_flags; >>> + /** Num input code block buffers */ >>> + uint16_t num_buffers_src; >>> + /** Num output code block buffers */ >>> + uint16_t num_buffers_dst; >>> +}; >>> + >>> /** Different operation types supported by the device */ >>> enum rte_bbdev_op_type { >>> RTE_BBDEV_OP_NONE, /**< Dummy operation that does nothing */ >> @@ >>> -748,6 +829,7 @@ enum rte_bbdev_op_type { >>> RTE_BBDEV_OP_TURBO_ENC, /**< Turbo encode */ >>> RTE_BBDEV_OP_LDPC_DEC, /**< LDPC decode */ >>> RTE_BBDEV_OP_LDPC_ENC, /**< LDPC encode */ >>> + RTE_BBDEV_OP_FFT, /**< FFT */ >>> RTE_BBDEV_OP_TYPE_PADDED_MAX = 8, /**< Maximum op type >> number including padding */ >>> }; >>> >>> @@ -791,6 +873,18 @@ struct rte_bbdev_dec_op { >>> }; >>> }; >>> >>> +/** Structure specifying a single fft operation */ struct >>> +rte_bbdev_fft_op { >>> + /** Status of operation that was performed */ >>> + int status; >>> + /** Mempool which op instance is in */ >>> + struct rte_mempool *mempool; >>> + /** Opaque pointer for user data */ >>> + void *opaque_data; >>> + /** Contains turbo decoder specific parameters */ >>> + struct rte_bbdev_op_fft fft; >>> +}; >>> + >>> /** Operation capabilities supported by a device */ >>> struct rte_bbdev_op_cap { >>> enum rte_bbdev_op_type type; /**< Type of operation */ @@ -799,6 >>> +893,7 @@ struct rte_bbdev_op_cap { >>> struct rte_bbdev_op_cap_turbo_enc turbo_enc; >>> struct rte_bbdev_op_cap_ldpc_dec ldpc_dec; >>> struct rte_bbdev_op_cap_ldpc_enc ldpc_enc; >>> + struct rte_bbdev_op_cap_fft fft; >>> } cap; /**< Operation-type specific capabilities */ >>> }; >>> >>> @@ -918,6 +1013,42 @@ struct rte_mempool * >>> } >>> >>> /** >>> + * Bulk allocate fft operations from a mempool with parameter defaults >> reset. >>> + * >>> + * @param mempool >>> + * Operation mempool, created by rte_bbdev_op_pool_create(). 
>>> + * @param ops
>>> + * Output array to place allocated operations
>>> + * @param num_ops
>>> + * Number of operations to allocate
>>> + *
>>> + * @returns
>>> + * - 0 on success
>>> + * - EINVAL if invalid mempool is provided
>>> + */
>>> +__rte_experimental
>>> +static inline int
>>> +rte_bbdev_fft_op_alloc_bulk(struct rte_mempool *mempool,
>>> + struct rte_bbdev_fft_op **ops, uint16_t num_ops) {
>>> + struct rte_bbdev_op_pool_private *priv;
>>> + int ret;
>>> +
>>> + /* Check type */
>>> + priv = (struct rte_bbdev_op_pool_private *)
>>> + rte_mempool_get_priv(mempool);
>>> + if (unlikely(priv->type != RTE_BBDEV_OP_FFT))
>>> + return -EINVAL;
>>> +
>>> + /* Get elements */
>>> + ret = rte_mempool_get_bulk(mempool, (void **)ops, num_ops);
>>> + if (unlikely(ret < 0))
>>> + return ret;
>> if-check is not needed, just
>>
>> return ret;
>>
>> and drop the next line

?

>>
>> Tom
>>
>>> +
>>> + return 0;
>>> +}
>>> +
>>> +/**
>>> * Free decode operation structures that were allocated by
>>> * rte_bbdev_dec_op_alloc_bulk().
>>> * All structures must belong to the same mempool.
>>> @@ -951,6 +1082,24 @@ struct rte_mempool *
>>> rte_mempool_put_bulk(ops[0]->mempool, (void **)ops,
>> num_ops);
>>> }
>>>
>>> +/**
>>> + * Free encode operation structures that were allocated by
>>> + * rte_bbdev_fft_op_alloc_bulk().
>>> + * All structures must belong to the same mempool.
>>> + * >>> + * @param ops >>> + * Operation structures >>> + * @param num_ops >>> + * Number of structures >>> + */ >>> +__rte_experimental >>> +static inline void >>> +rte_bbdev_fft_op_free_bulk(struct rte_bbdev_fft_op **ops, unsigned >>> +int num_ops) { >>> + if (num_ops > 0) >>> + rte_mempool_put_bulk(ops[0]->mempool, (void **)ops, >> num_ops); } >>> + >>> #ifdef __cplusplus >>> } >>> #endif >>> diff --git a/lib/bbdev/version.map b/lib/bbdev/version.map index >>> 9ac3643..efae50b 100644 >>> --- a/lib/bbdev/version.map >>> +++ b/lib/bbdev/version.map >>> @@ -44,4 +44,8 @@ EXPERIMENTAL { >>> global: >>> >>> rte_bbdev_device_status_str; >>> + rte_bbdev_enqueue_fft_ops; >>> + rte_bbdev_dequeue_fft_ops; >>> + rte_bbdev_fft_op_alloc_bulk; >>> + rte_bbdev_fft_op_free_bulk; >>> };