From: Maxime Coquelin <maxime.coquelin@redhat.com>
To: Hernan Vargas <hernan.vargas@intel.com>,
dev@dpdk.org, gakhil@marvell.com, trix@redhat.com
Cc: nicolas.chautru@intel.com, qi.z.zhang@intel.com
Subject: Re: [PATCH v3 3/4] baseband/fpga_5gnr_fec: add AGX100 support
Date: Tue, 17 Oct 2023 14:48:28 +0200
Message-ID: <ad42ed96-2dbc-4456-bc9a-5d4568b14e7b@redhat.com>
In-Reply-To: <20230918163114.276722-4-hernan.vargas@intel.com>
On 9/18/23 18:31, Hernan Vargas wrote:
> Add support for new FPGA variant AGX100 (on Arrow Creek N6000).
>
> Signed-off-by: Hernan Vargas <hernan.vargas@intel.com>
> ---
> doc/guides/bbdevs/fpga_5gnr_fec.rst | 72 +-
> drivers/baseband/fpga_5gnr_fec/agx100_pmd.h | 273 ++++
> .../baseband/fpga_5gnr_fec/fpga_5gnr_fec.h | 12 +-
> .../fpga_5gnr_fec/rte_fpga_5gnr_fec.c | 1214 +++++++++++++++--
> 4 files changed, 1429 insertions(+), 142 deletions(-)
> create mode 100644 drivers/baseband/fpga_5gnr_fec/agx100_pmd.h
>
> diff --git a/doc/guides/bbdevs/fpga_5gnr_fec.rst b/doc/guides/bbdevs/fpga_5gnr_fec.rst
> index 9d71585e9e18..c27db695a834 100644
> --- a/doc/guides/bbdevs/fpga_5gnr_fec.rst
> +++ b/doc/guides/bbdevs/fpga_5gnr_fec.rst
> @@ -6,12 +6,13 @@ Intel(R) FPGA 5GNR FEC Poll Mode Driver
>
> The BBDEV FPGA 5GNR FEC poll mode driver (PMD) supports an FPGA implementation of a VRAN
> LDPC Encode / Decode 5GNR wireless acceleration function, using Intel's PCI-e and FPGA
> -based Vista Creek device.
> +based Vista Creek (N3000, referred to as VC_5GNR in the code) as well as Arrow Creek (N6000,
> +referred to as AGX100 in the code).
>
> Features
> --------
>
> -FPGA 5GNR FEC PMD supports the following features:
> +FPGA 5GNR FEC PMD supports the following BBDEV capabilities:
>
> - LDPC Encode in the DL
> - LDPC Decode in the UL
> @@ -67,10 +68,18 @@ Initialization
>
> When the device first powers up, its PCI Physical Functions (PF) can be listed through this command:
>
> +Vista Creek (N3000)
> +
> .. code-block:: console
>
> sudo lspci -vd8086:0d8f
>
> +Arrow Creek (N6000)
> +
> +.. code-block:: console
> +
> + sudo lspci -vd8086:5799
> +
> The physical and virtual functions are compatible with Linux UIO drivers:
> ``vfio`` and ``igb_uio``. However, in order to work the FPGA 5GNR FEC device firstly needs
> to be bound to one of these linux drivers through DPDK.
> @@ -85,24 +94,34 @@ Install the DPDK igb_uio driver, bind it with the PF PCI device ID and use
> The igb_uio driver may be bound to the PF PCI device using one of two methods:
>
>
> -1. PCI functions (physical or virtual, depending on the use case) can be bound to
> -the UIO driver by repeating this command for every function.
> +1. PCI functions (physical or virtual, depending on the use case) can be bound to the UIO driver by repeating this command for every function.
>
> -.. code-block:: console
> + .. code-block:: console
> +
> + insmod igb_uio.ko
> +
> + Bind N3000 to igb_uio
> +
> + .. code-block:: console
>
> - insmod igb_uio.ko
> - echo "8086 0d8f" > /sys/bus/pci/drivers/igb_uio/new_id
> - lspci -vd8086:0d8f
> + echo "8086 0d8f" > /sys/bus/pci/drivers/igb_uio/new_id
> + lspci -vd8086:0d8f
>
> + Bind N6000 to igb_uio
> +
> + .. code-block:: console
> +
> + echo "8086 5799" > /sys/bus/pci/drivers/igb_uio/new_id
> + lspci -vd8086:5799
>
> 2. Another way to bind PF with DPDK UIO driver is by using the ``dpdk-devbind.py`` tool
>
> -.. code-block:: console
> + .. code-block:: console
>
> - cd <dpdk-top-level-directory>
> - ./usertools/dpdk-devbind.py -b igb_uio 0000:06:00.0
> + cd <dpdk-top-level-directory>
> + ./usertools/dpdk-devbind.py -b igb_uio 0000:06:00.0
>
> -where the PCI device ID (example: 0000:06:00.0) is obtained using lspci -vd8086:0d8f
> +where the PCI device ID (example: 0000:06:00.0) is obtained using lspci -vd8086:0d8f for N3000 or lspci -vd8086:5799 for N6000
As was done in the VRB2 series, please provide a link to the VFIO doc
instead of listing instructions for both UIO and VFIO.
>
> In the same way the FPGA 5GNR FEC PF can be bound with vfio, but vfio driver does not
> @@ -165,7 +184,6 @@ parameters defined in ``rte_fpga_5gnr_fec_conf`` structure:
> uint8_t dl_bandwidth;
> uint8_t ul_load_balance;
> uint8_t dl_load_balance;
> - uint16_t flr_time_out;
> };
>
> - ``pf_mode_en``: identifies whether only PF is to be used, or the VFs. PF and
> @@ -176,12 +194,12 @@ parameters defined in ``rte_fpga_5gnr_fec_conf`` structure:
>
> - ``vf_*l_queues_number``: defines the hardware queue mapping for every VF.
>
> -- ``*l_bandwidth``: in case of congestion on PCIe interface. The device
> - allocates different bandwidth to UL and DL. The weight is configured by this
> - setting. The unit of weight is 3 code blocks. For example, if the code block
> - cbps (code block per second) ratio between UL and DL is 12:1, then the
> - configuration value should be set to 36:3. The schedule algorithm is based
> - on code block regardless the length of each block.
> +- ``*l_bandwidth``: Only used for the Vista Creek schedule algorithm in case of
> + congestion on PCIe interface. The device allocates different bandwidth to UL
> + and DL. The weight is configured by this setting. The unit of weight is 3 code
> + blocks. For example, if the code block cbps (code block per second) ratio between
> + UL and DL is 12:1, then the configuration value should be set to 36:3.
> + The schedule algorithm is based on code block regardless the length of each block.
>
> - ``*l_load_balance``: hardware queues are load-balanced in a round-robin
> fashion. Queues get filled first-in first-out until they reach a pre-defined
> @@ -191,10 +209,6 @@ parameters defined in ``rte_fpga_5gnr_fec_conf`` structure:
> If all hardware queues exceeds the watermark, no code blocks will be
> streamed in from UL/DL code block FIFO.
>
> -- ``flr_time_out``: specifies how many 16.384us to be FLR time out. The
> - time_out = flr_time_out x 16.384us. For instance, if you want to set 10ms for
> - the FLR time out then set this setting to 0x262=610.
> -
Why is it removed?
Is it no longer applicable to Vista Creek?
If so, please do it in a dedicated patch, or in the patch that removes
the parameter from the code.
>
> An example configuration code calling the function ``rte_fpga_5gnr_fec_configure()`` is shown
> below:
> @@ -219,7 +233,7 @@ below:
> /* setup FPGA PF */
> ret = rte_fpga_5gnr_fec_configure(info->dev_name, &conf);
> TEST_ASSERT_SUCCESS(ret,
> - "Failed to configure 4G FPGA PF for bbdev %s",
> + "Failed to configure 5GNR FPGA PF for bbdev %s",
> info->dev_name);
>
>
> @@ -263,7 +277,6 @@ are defined in test_bbdev_perf.c as:
> - DL_BANDWIDTH 3
> - UL_LOAD_BALANCE 128
> - DL_LOAD_BALANCE 128
> -- FLR_TIMEOUT 610
>
>
> Test Vectors
> @@ -287,7 +300,16 @@ See for more details: https://github.com/intel/pf-bb-config
>
> Specifically for the BBDEV FPGA 5GNR FEC PMD, the command below can be used:
>
> +Vista Creek (N3000)
> +
> .. code-block:: console
>
> ./pf_bb_config FPGA_5GNR -c fpga_5gnr/fpga_5gnr_config_vf.cfg
> ./test-bbdev.py -e="-c 0xff0 -a${VF_PCI_ADDR}" -c validation -n 64 -b 32 -l 1 -v ./ldpc_dec_default.data
> +
> +Arrow Creek (N6000)
> +
> +.. code-block:: console
> +
> + ./pf_bb_config AGX100 -c agx100/agx100_config_1vf.cfg
> + ./test-bbdev.py -e="-c 0xff0 -a${VF_PCI_ADDR}" -c validation -n 64 -b 32 -l 1 -v ./ldpc_dec_default.data
> diff --git a/drivers/baseband/fpga_5gnr_fec/agx100_pmd.h b/drivers/baseband/fpga_5gnr_fec/agx100_pmd.h
> new file mode 100644
> index 000000000000..8013571402c8
> --- /dev/null
> +++ b/drivers/baseband/fpga_5gnr_fec/agx100_pmd.h
> @@ -0,0 +1,273 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2022 Intel Corporation
> + */
> +
> +#ifndef _AGX100_H_
> +#define _AGX100_H_
> +
> +#include <stdint.h>
> +#include <stdbool.h>
> +
> +/* AGX100 PCI vendor & device IDs. */
> +#define AGX100_VENDOR_ID (0x8086)
> +#define AGX100_PF_DEVICE_ID (0x5799)
> +#define AGX100_VF_DEVICE_ID (0x579A)
> +
> +/* Maximum number of possible queues supported on device. */
> +#define AGX100_MAXIMUM_QUEUES_SUPPORTED (64)
> +
> +/* AGX100 Ring size is in 256 bits (64 bytes) units. */
> +#define AGX100_RING_DESC_LEN_UNIT_BYTES (64)
> +
> +/* Align DMA descriptors to 256 bytes - cache-aligned. */
> +#define AGX100_RING_DESC_ENTRY_LENGTH (8)
> +
> +/* AGX100 Register mapping on BAR0. */
> +enum {
> + AGX100_FLR_TIME_OUT = 0x0000000E, /* len: 2B. */
> + AGX100_QUEUE_MAP = 0x00000100 /* len: 256B. */
> +};
> +
> +/* AGX100 DESCRIPTOR ERROR. */
> +enum {
> + AGX100_DESC_ERR_NO_ERR = 0x00, /**< 4'b0000 2'b00. */
> + AGX100_DESC_ERR_E_NOT_LEGAL = 0x11, /**< 4'b0001 2'b01. */
> + AGX100_DESC_ERR_K_P_OUT_OF_RANGE = 0x21, /**< 4'b0010 2'b01. */
> + AGX100_DESC_ERR_NCB_OUT_OF_RANGE = 0x31, /**< 4'b0011 2'b01. */
> + AGX100_DESC_ERR_Z_C_NOT_LEGAL = 0x41, /**< 4'b0100 2'b01. */
> + AGX100_DESC_ERR_DESC_INDEX_ERR = 0x03, /**< 4'b0000 2'b11. */
> + AGX100_DESC_ERR_HARQ_INPUT_LEN_A = 0x51, /**< 4'b0101 2'b01. */
> + AGX100_DESC_ERR_HARQ_INPUT_LEN_B = 0x61, /**< 4'b0110 2'b01. */
> + AGX100_DESC_ERR_HBSTORE_OFFSET_ERR = 0x71, /**< 4'b0111 2'b01. */
> + AGX100_DESC_ERR_TB_CBG_ERR = 0x81, /**< 4'b1000 2'b01. */
> + AGX100_DESC_ERR_CBG_OUT_OF_RANGE = 0x91, /**< 4'b1001 2'b01. */
> + AGX100_DESC_ERR_CW_RM_NOT_LEGAL = 0xA1, /**< 4'b1010 2'b01. */
> + AGX100_DESC_ERR_UNSUPPORTED_REQ = 0x12, /**< 4'b0000 2'b10. */
> + AGX100_DESC_ERR_RESERVED = 0x22, /**< 4'b0010 2'b10. */
> + AGX100_DESC_ERR_DESC_ABORT = 0x42, /**< 4'b0100 2'b10. */
> + AGX100_DESC_ERR_DESC_READ_TLP_POISONED = 0x82 /**< 4'b1000 2'b10. */
> +};
> +
> +/* AGX100 TX Slice Descriptor. */
> +struct __rte_packed agx100_input_slice_desc {
> + uint32_t input_start_addr_lo;
> + uint32_t input_start_addr_hi;
> + uint32_t input_slice_length:21,
> + rsrvd0:9,
> + end_of_pkt:1,
> + start_of_pkt:1;
> + uint32_t input_slice_time_stamp:31,
> + input_c:1;
> +};
> +
> +/* AGX100 RX Slice Descriptor. */
> +struct __rte_packed agx100_output_slice_desc {
> + uint32_t output_start_addr_lo;
> + uint32_t output_start_addr_hi;
> + uint32_t output_slice_length:21,
> + rsrvd0:9,
> + end_of_pkt:1,
> + start_of_pkt:1;
> + uint32_t output_slice_time_stamp:31,
> + output_c:1;
> +};
> +
> +/* AGX100 DL DMA Encoding Request Descriptor. */
> +struct __rte_packed agx100_dma_enc_desc {
> + uint32_t done:1, /**< 0: not completed 1: completed. */
> + rsrvd0:17,
> + error_msg:2,
> + error_code:4,
> + rsrvd1:8;
> + uint32_t ncb:16, /**< Limited circular buffer size. */
> + bg_idx:1, /**< Base Graph 0: BG1 1: BG2.*/
> + qm_idx:3, /**< 0: BPSK; 1: QPSK; 2: 16QAM; 3: 64QAM; 4: 256QAM. */
> + zc:9, /**< Lifting size. */
> + rv:2, /**< Redundancy version number. */
> + int_en:1; /**< Interrupt enable. */
> + uint32_t max_cbg:4, /**< Only valid when workload is TB or CBGs. */
> + rsrvd2:4,
> + cbgti:8, /**< CBG bitmap. */
> + rsrvd3:4,
> + cbgs:1, /**< 0: TB or CB 1: CBGs. */
> + desc_idx:11; /**< Sequence number of the descriptor. */
> + uint32_t ca:10, /**< Code block number with Ea in TB or CBG. */
> + c:10, /**< Total code block number in TB or CBG. */
> + rsrvd4:2,
> + num_null:10; /**< Number of null bits. */
> + uint32_t ea:21, /**< Value of E when worload is CB. */
> + rsrvd5:11;
> + uint32_t eb:21, /**< Only valid when workload is TB or CBGs. */
> + rsrvd6:11;
> + uint32_t k_:16, /**< Code block length without null bits. */
> + rsrvd7:8,
> + en_slice_ts:1, /**< Enable slice descriptor timestamp. */
> + en_host_ts:1, /**< Enable host descriptor timestamp. */
> + en_cb_wr_status:1, /**< Enable code block write back status. */
> + en_output_sg:1, /**< Enable RX scatter-gather. */
> + en_input_sg:1, /**< Enable TX scatter-gather. */
> + tb_cb:1, /**< 2'b10: the descriptor is for a TrBlk.
> + * 2'b00: the descriptor is for a CBlk.
> + * 2'b11 or 01: the descriptor is for a CBGs.
> + */
> + crc_en:1, /**< 1: CB CRC enabled 0: CB CRC disabled.
> + * Only valid when workload is CB or CBGs.
> + */
> + rsrvd8:1;
> + uint32_t rsrvd9;
> + union {
> + uint32_t input_slice_table_addr_lo; /**<Used when scatter-gather enabled.*/
> + uint32_t input_start_addr_lo; /**< Used when scatter-gather disabled. */
> + };
> + union {
> + uint32_t input_slice_table_addr_hi; /**<Used when scatter-gather enabled.*/
> + uint32_t input_start_addr_hi; /**< Used when scatter-gather disabled. */
> + };
> + union {
> + uint32_t input_slice_num:21, /**< Used when scatter-gather enabled. */
> + rsrvd10:11;
> + uint32_t input_length:26, /**< Used when scatter-gather disabled. */
> + rsrvd11:6;
> + };
> + union {
> + uint32_t output_slice_table_addr_lo; /**< Used when scatter-gather enabled.*/
> + uint32_t output_start_addr_lo; /**< Used when scatter-gather disabled. */
> + };
> + union {
> + uint32_t output_slice_table_addr_hi; /**< Used when scatter-gather enabled.*/
> + uint32_t output_start_addr_hi; /**< Used when scatter-gather disabled. */
> + };
> + union {
> + uint32_t output_slice_num:21, /**< Used when scatter-gather enabled. */
> + rsrvd12:11;
> + uint32_t output_length:26, /**< Used when scatter-gather disabled. */
> + rsrvd13:6;
> + };
> + uint32_t enqueue_timestamp:31, /**< Time when AGX100 receives descriptor. */
> + rsrvd14:1;
> + uint32_t completion_timestamp:31, /**< Time when AGX100 completes descriptor. */
> + rsrvd15:1;
> +
> + union {
> + struct {
> + /** Virtual addresses used to retrieve SW context info. */
> + void *op_addr;
> + /** Stores information about total number of Code Blocks
> + * in currently processed Transport Block
> + */
> + uint64_t cbs_in_op;
> + };
> +
> + uint8_t sw_ctxt[AGX100_RING_DESC_LEN_UNIT_BYTES *
> + (AGX100_RING_DESC_ENTRY_LENGTH - 1)];
> + };
> +};
> +
> +/* AGX100 UL DMA Decoding Request Descriptor. */
> +struct __rte_packed agx100_dma_dec_desc {
> + uint32_t done:1, /**< 0: not completed 1: completed. */
> + tb_crc_pass:1, /**< 0: doesn't pass 1: pass. */
> + cb_crc_all_pass:1, /**< 0: doesn't pass 1: pass. */
> + cb_all_et_pass:1, /**< 0: not all decoded 1: all decoded. */
> + max_iter_ret:6, /**< Iteration number returned by LDPC decoder. */
> + cgb_crc_bitmap:8, /**< Field valid only when workload is TB or CBGs. */
> + error_msg:2,
> + error_code:4,
> + et_dis:1, /**< Disable the early termination feature of LDPC decoder. */
> + harq_in_en:1, /**< 0: combine disabled 1: combine enable.*/
> + max_iter:6; /**< Maximum value of iteration for decoding CB. */
> + uint32_t ncb:16, /**< Limited circular buffer size. */
> + bg_idx:1, /**< Base Graph 0: BG1 1: BG2.*/
> + qm_idx:3, /**< 0: BPSK; 1: QPSK; 2: 16QAM; 3: 64QAM; 4: 256QAM. */
> + zc:9, /**< Lifting size. */
> + rv:2, /**< Redundancy version number. */
> + int_en:1; /**< Interrupt enable. */
> + uint32_t max_cbg:4, /**< Only valid when workload is TB or CBGs. */
> + rsrvd0:4,
> + cbgti:8, /**< CBG bitmap. */
> + cbgfi:1, /**< 0: overwrite HARQ buffer 1: enable HARQ for CBGs. */
> + rsrvd1:3,
> + cbgs:1, /**< 0: TB or CB 1: CBGs. */
> + desc_idx:11; /**< Sequence number of the descriptor. */
> + uint32_t ca:10, /**< Code block number with Ea in TB or CBG. */
> + c:10, /**< Total code block number in TB or CBG. */
> + llr_pckg:1, /**< 0: 8-bit LLR 1: 6-bit LLR packed together. */
> + syndrome_check_mode:1, /**<0: full syndrome check 1: 4-layer syndome check.*/
> + num_null:10; /**< Number of null bits. */
> + uint32_t ea:21, /**< Value of E when worload is CB. */
> + rsrvd2:3,
> + eba:8; /**< Only valid when workload is TB or CBGs. */
> + uint32_t hbstore_offset_out:24, /**< HARQ buffer write address. */
> + rsrvd3:8;
> + uint32_t hbstore_offset_in:24, /**< HARQ buffer read address. */
> + en_slice_ts:1, /**< Enable slice descriptor timestamp. */
> + en_host_ts:1, /**< Enable host descriptor timestamp. */
> + en_cb_wr_status:1, /**< Enable code block write back status. */
> + en_output_sg:1, /**< Enable RX scatter-gather. */
> + en_input_sg:1, /**< Enable TX scatter-gather. */
> + tb_cb:1, /**< 2'b10: the descriptor is for a TrBlk.
> + * 2'b00: the descriptor is for a CBlk.
> + * 2'b11 or 01: the descriptor is for a CBGs.
> + */
> + crc24b_ind:1, /**< 1: CB includes CRC, need LDPC-V to check the CB CRC.
> + * 0: There is no CB CRC check.
> + * Only valid when workload is CB or CBGs.
> + */
> + drop_crc24b:1; /**< 1: CB CRC will be dropped. */
> + uint32_t harq_input_length_a: 16, /**< HARQ_input_length for CB. */
> + harq_input_length_b:16; /**< Only valid when workload is TB or CBGs. */
> + union {
> + uint32_t input_slice_table_addr_lo; /**< Used when scatter-gather enabled.*/
> + uint32_t input_start_addr_lo; /**< Used when scatter-gather disabled. */
> + };
> + union {
> + uint32_t input_slice_table_addr_hi; /**< Used when scatter-gather enabled.*/
> + uint32_t input_start_addr_hi; /**< Used when scatter-gather disabled. */
> + };
> + union {
> + uint32_t input_slice_num:21, /**< Used when scatter-gather enabled. */
> + rsrvd4:11;
> + uint32_t input_length:26, /**< Used when scatter-gather disabled. */
> + rsrvd5:6;
> + };
> + union {
> + uint32_t output_slice_table_addr_lo; /**< Used when scatter-gather enabled.*/
> + uint32_t output_start_addr_lo; /**< Used when scatter-gather disabled. */
> + };
> + union {
> + uint32_t output_slice_table_addr_hi; /**< Used when scatter-gather enabled.*/
> + uint32_t output_start_addr_hi; /**< Used when scatter-gather disabled. */
> + };
> + union {
> + uint32_t output_slice_num:21, /**< Used when scatter-gather enabled. */
> + rsrvd6:11;
> + uint32_t output_length:26, /**< Used when scatter-gather disabled. */
> + rsrvd7:6;
> + };
> + uint32_t enqueue_timestamp:31, /**< Time when AGX100 receives descriptor. */
> + rsrvd8:1;
> + uint32_t completion_timestamp:31, /**< Time when AGX100 completes descriptor. */
> + rsrvd9:1;
> +
> + union {
> + struct {
> + /** Virtual addresses used to retrieve SW context info. */
> + void *op_addr;
> + /** Stores information about total number of Code Blocks
> + * in currently processed Transport Block
> + */
> + uint8_t cbs_in_op;
> + };
> +
> + uint8_t sw_ctxt[AGX100_RING_DESC_LEN_UNIT_BYTES *
> + (AGX100_RING_DESC_ENTRY_LENGTH - 1)];
> + };
> +};
> +
> +/* AGX100 DMA Descriptor. */
> +union agx100_dma_desc {
> + struct agx100_dma_enc_desc agx100_enc_req;
> + struct agx100_dma_dec_desc agx100_dec_req;
No need to prefix the fields with agx100_, it is redundant.
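Something along these lines is what I have in mind. This is only a sketch: the two descriptor structs are stubbed down to a single word so the snippet builds on its own, and the agx100_dec_desc_done() helper is hypothetical, added just to show what a call site would look like.

```c
#include <stdint.h>

/* Stand-ins for the real descriptors in agx100_pmd.h; only 'done' is kept. */
struct agx100_dma_enc_desc {
	uint32_t done:1;
	uint32_t rsrvd:31;
};

struct agx100_dma_dec_desc {
	uint32_t done:1;
	uint32_t rsrvd:31;
};

/* AGX100 DMA Descriptor: the union type already carries the agx100_ prefix. */
union agx100_dma_desc {
	struct agx100_dma_enc_desc enc_req;
	struct agx100_dma_dec_desc dec_req;
};

/* Hypothetical helper: call sites read desc->dec_req.done
 * instead of desc->agx100_dec_req.done.
 */
static inline uint32_t
agx100_dec_desc_done(const union agx100_dma_desc *desc)
{
	return desc->dec_req.done;
}
```

It would also shorten the long argument lists in the debug print functions further down.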
> +};
> +
> +#endif /* _AGX100_H_ */
> diff --git a/drivers/baseband/fpga_5gnr_fec/fpga_5gnr_fec.h b/drivers/baseband/fpga_5gnr_fec/fpga_5gnr_fec.h
> index 982e956dc819..224684902569 100644
> --- a/drivers/baseband/fpga_5gnr_fec/fpga_5gnr_fec.h
> +++ b/drivers/baseband/fpga_5gnr_fec/fpga_5gnr_fec.h
> @@ -8,6 +8,7 @@
> #include <stdint.h>
> #include <stdbool.h>
>
> +#include "agx100_pmd.h"
> #include "vc_5gnr_pmd.h"
>
> /* Helper macro for logging */
> @@ -131,12 +132,21 @@ struct fpga_5gnr_fec_device {
> uint64_t q_assigned_bit_map;
> /** True if this is a PF FPGA 5GNR device. */
> bool pf_device;
> + /** Maximum number of possible queues for this device. */
> + uint8_t total_num_queues;
Introduction of total_num_queues should be in a dedicated patch as a
preliminary rework.
> + /** FPGA Variant. VC_5GNR_FPGA_VARIANT = 0; AGX100_FPGA_VARIANT = 1. */
> + uint8_t fpga_variant;
> };
>
> /** Structure associated with each queue. */
> struct __rte_cache_aligned fpga_5gnr_queue {
> struct fpga_5gnr_ring_ctrl_reg ring_ctrl_reg; /**< Ring Control Register */
> - union vc_5gnr_dma_desc *vc_5gnr_ring_addr; /**< Virtual address of VC 5GNR software ring. */
> + union {
> + /** Virtual address of VC 5GNR software ring. */
> + union vc_5gnr_dma_desc *vc_5gnr_ring_addr;
> + /** Virtual address of AGX100 software ring. */
> + union agx100_dma_desc *agx100_ring_addr;
> + };
> uint64_t *ring_head_addr; /* Virtual address of completion_head */
> uint64_t shadow_completion_head; /* Shadow completion head value */
> uint16_t head_free_desc; /* Ring head */
> diff --git a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> index 41e6e6b58905..5c88f84d581a 100644
> --- a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> +++ b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
> @@ -18,8 +18,8 @@
> #include <rte_bbdev.h>
> #include <rte_bbdev_pmd.h>
>
> -#include "fpga_5gnr_fec.h"
> #include "rte_pmd_fpga_5gnr_fec.h"
> +#include "fpga_5gnr_fec.h"
>
> #ifdef RTE_LIBRTE_BBDEV_DEBUG
> RTE_LOG_REGISTER_DEFAULT(fpga_5gnr_fec_logtype, DEBUG);
> @@ -71,9 +71,11 @@ print_ring_reg_debug_info(void *mmio_base, uint32_t offset)
>
> /* Read Static Register of Vista Creek device. */
> static inline void
> -print_static_reg_debug_info(void *mmio_base)
> +print_static_reg_debug_info(void *mmio_base, uint8_t fpga_variant)
> {
> - uint16_t config = fpga_5gnr_reg_read_16(mmio_base, VC_5GNR_CONFIGURATION);
> + uint16_t config;
> + if (fpga_variant == VC_5GNR_FPGA_VARIANT)
> + config = fpga_5gnr_reg_read_16(mmio_base, VC_5GNR_CONFIGURATION);
Please do not mix code and variable declarations.
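For instance, keep all declarations at the top and do the conditional read afterwards. The sketch below is standalone and makes a few assumptions: stub_reg_read_16() and read_ul_dl_weights() are hypothetical names, the register offset is a placeholder, and the variant values are the ones given in the comment added to fpga_5gnr_fec.h.

```c
#include <stdint.h>

/* Variant values as documented in the fpga_5gnr_fec.h comment of this patch. */
#define VC_5GNR_FPGA_VARIANT 0
#define AGX100_FPGA_VARIANT 1

/* Placeholder offset, for illustration only. */
#define VC_5GNR_CONFIGURATION 0x4

/* Hypothetical stand-in for the driver's 16-bit register read:
 * reads a little-endian value from a plain byte buffer.
 */
static uint16_t
stub_reg_read_16(void *mmio_base, uint32_t offset)
{
	const uint8_t *p = (const uint8_t *)mmio_base + offset;

	return (uint16_t)(p[0] | (p[1] << 8));
}

/* All declarations come first; 'config' gets a defined value for AGX100 too. */
static uint16_t
read_ul_dl_weights(void *mmio_base, uint8_t fpga_variant)
{
	uint16_t config = 0;

	if (fpga_variant == VC_5GNR_FPGA_VARIANT)
		config = stub_reg_read_16(mmio_base, VC_5GNR_CONFIGURATION);

	return config;
}
```

Initializing config also avoids a possible "may be used uninitialized" warning on the AGX100 path.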
> uint8_t qmap_done = fpga_5gnr_reg_read_8(mmio_base,
> FPGA_5GNR_FEC_QUEUE_PF_VF_MAP_DONE);
> uint16_t lb_factor = fpga_5gnr_reg_read_16(mmio_base,
> @@ -81,14 +83,19 @@ print_static_reg_debug_info(void *mmio_base)
> uint16_t ring_desc_len = fpga_5gnr_reg_read_16(mmio_base,
> FPGA_5GNR_FEC_RING_DESC_LEN);
>
> - rte_bbdev_log_debug("UL.DL Weights = %u.%u",
> - ((uint8_t)config), ((uint8_t)(config >> 8)));
> + if (fpga_variant == VC_5GNR_FPGA_VARIANT)
> + rte_bbdev_log_debug("UL.DL Weights = %u.%u",
> + ((uint8_t)config), ((uint8_t)(config >> 8)));
> rte_bbdev_log_debug("UL.DL Load Balance = %u.%u",
> ((uint8_t)lb_factor), ((uint8_t)(lb_factor >> 8)));
> rte_bbdev_log_debug("Queue-PF/VF Mapping Table = %s",
> (qmap_done > 0) ? "READY" : "NOT-READY");
> - rte_bbdev_log_debug("Ring Descriptor Size = %u bytes",
> - ring_desc_len*VC_5GNR_RING_DESC_LEN_UNIT_BYTES);
> + if (fpga_variant == VC_5GNR_FPGA_VARIANT)
> + rte_bbdev_log_debug("Ring Descriptor Size = %u bytes",
> + ring_desc_len*VC_5GNR_RING_DESC_LEN_UNIT_BYTES);
Please add spaces around '*'.
> + else
> + rte_bbdev_log_debug("Ring Descriptor Size = %u bytes",
> + ring_desc_len*AGX100_RING_DESC_LEN_UNIT_BYTES);
Ditto
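One way to avoid duplicating the log call is to pick the unit first and multiply once. A rough sketch, assuming a hypothetical ring_desc_size_bytes() helper; the AGX100 unit is the value from agx100_pmd.h in this patch, while the Vista Creek unit of 32 bytes is my assumption and should be checked against vc_5gnr_pmd.h.

```c
#include <stdint.h>

/* Variant values as documented in the fpga_5gnr_fec.h comment of this patch. */
#define VC_5GNR_FPGA_VARIANT 0
#define AGX100_FPGA_VARIANT 1

/* Assumed value for Vista Creek; AGX100 value taken from agx100_pmd.h. */
#define VC_5GNR_RING_DESC_LEN_UNIT_BYTES 32
#define AGX100_RING_DESC_LEN_UNIT_BYTES 64

/* Hypothetical helper: select the per-variant unit, then multiply once. */
static uint32_t
ring_desc_size_bytes(uint16_t ring_desc_len, uint8_t fpga_variant)
{
	uint32_t unit_bytes;

	if (fpga_variant == VC_5GNR_FPGA_VARIANT)
		unit_bytes = VC_5GNR_RING_DESC_LEN_UNIT_BYTES;
	else
		unit_bytes = AGX100_RING_DESC_LEN_UNIT_BYTES;

	return ring_desc_len * unit_bytes;
}
```

The debug print would then need a single rte_bbdev_log_debug() call for both variants.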
> }
>
> /* Print decode DMA Descriptor of Vista Creek Decoder device. */
> @@ -142,6 +149,108 @@ vc_5gnr_print_dma_dec_desc_debug_info(union vc_5gnr_dma_desc *desc)
> word[4], word[5], word[6], word[7]);
> }
>
> +/* Print decode DMA Descriptor of AGX100 Decoder device. */
> +static void
> +agx100_print_dma_dec_desc_debug_info(union agx100_dma_desc *desc)
> +{
> + rte_bbdev_log_debug("DMA response desc %p\n"
> + "\t-- done(%"PRIu32") | tb_crc_pass(%"PRIu32") | cb_crc_all_pass(%"PRIu32")"
> + " | cb_all_et_pass(%"PRIu32") | max_iter_ret(%"PRIu32") |"
> + "cgb_crc_bitmap(%"PRIu32") | error_msg(%"PRIu32") | error_code(%"PRIu32") |"
> + "et_dis (%"PRIu32") | harq_in_en(%"PRIu32") | max_iter(%"PRIu32")\n"
> + "\t-- ncb(%"PRIu32") | bg_idx (%"PRIu32") | qm_idx (%"PRIu32")"
> + "| zc(%"PRIu32") | rv(%"PRIu32") | int_en(%"PRIu32")\n"
> + "\t-- max_cbg(%"PRIu32") | cbgti(%"PRIu32") | cbgfi(%"PRIu32") |"
> + "cbgs(%"PRIu32") | desc_idx(%"PRIu32")\n"
> + "\t-- ca(%"PRIu32") | c(%"PRIu32") | llr_pckg(%"PRIu32") |"
> + "syndrome_check_mode(%"PRIu32") | num_null(%"PRIu32")\n"
> + "\t-- ea(%"PRIu32") | eba(%"PRIu32")\n"
> + "\t-- hbstore_offset_out(%"PRIu32")\n"
> + "\t-- hbstore_offset_in(%"PRIu32") | en_slice_ts(%"PRIu32") |"
> + "en_host_ts(%"PRIu32") | en_cb_wr_status(%"PRIu32")"
> + " | en_output_sg(%"PRIu32") | en_input_sg(%"PRIu32") | tb_cb(%"PRIu32")"
> + " | crc24b_ind(%"PRIu32")| drop_crc24b(%"PRIu32")\n"
> + "\t-- harq_input_length_a(%"PRIu32") | harq_input_length_b(%"PRIu32")\n"
> + "\t-- input_slice_table_addr_lo(%"PRIu32")"
> + " | input_start_addr_lo(%"PRIu32")\n"
> + "\t-- input_slice_table_addr_hi(%"PRIu32")"
> + " | input_start_addr_hi(%"PRIu32")\n"
> + "\t-- input_slice_num(%"PRIu32") | input_length(%"PRIu32")\n"
> + "\t-- output_slice_table_addr_lo(%"PRIu32")"
> + " | output_start_addr_lo(%"PRIu32")\n"
> + "\t-- output_slice_table_addr_hi(%"PRIu32")"
> + " | output_start_addr_hi(%"PRIu32")\n"
> + "\t-- output_slice_num(%"PRIu32") | output_length(%"PRIu32")\n"
> + "\t-- enqueue_timestamp(%"PRIu32")\n"
> + "\t-- completion_timestamp(%"PRIu32")\n",
> + desc,
> + (uint32_t)desc->agx100_dec_req.done,
> + (uint32_t)desc->agx100_dec_req.tb_crc_pass,
> + (uint32_t)desc->agx100_dec_req.cb_crc_all_pass,
> + (uint32_t)desc->agx100_dec_req.cb_all_et_pass,
> + (uint32_t)desc->agx100_dec_req.max_iter_ret,
> + (uint32_t)desc->agx100_dec_req.cgb_crc_bitmap,
> + (uint32_t)desc->agx100_dec_req.error_msg,
> + (uint32_t)desc->agx100_dec_req.error_code,
> + (uint32_t)desc->agx100_dec_req.et_dis,
> + (uint32_t)desc->agx100_dec_req.harq_in_en,
> + (uint32_t)desc->agx100_dec_req.max_iter,
> + (uint32_t)desc->agx100_dec_req.ncb,
> + (uint32_t)desc->agx100_dec_req.bg_idx,
> + (uint32_t)desc->agx100_dec_req.qm_idx,
> + (uint32_t)desc->agx100_dec_req.zc,
> + (uint32_t)desc->agx100_dec_req.rv,
> + (uint32_t)desc->agx100_dec_req.int_en,
> + (uint32_t)desc->agx100_dec_req.max_cbg,
> + (uint32_t)desc->agx100_dec_req.cbgti,
> + (uint32_t)desc->agx100_dec_req.cbgfi,
> + (uint32_t)desc->agx100_dec_req.cbgs,
> + (uint32_t)desc->agx100_dec_req.desc_idx,
> + (uint32_t)desc->agx100_dec_req.ca,
> + (uint32_t)desc->agx100_dec_req.c,
> + (uint32_t)desc->agx100_dec_req.llr_pckg,
> + (uint32_t)desc->agx100_dec_req.syndrome_check_mode,
> + (uint32_t)desc->agx100_dec_req.num_null,
> + (uint32_t)desc->agx100_dec_req.ea,
> + (uint32_t)desc->agx100_dec_req.eba,
> + (uint32_t)desc->agx100_dec_req.hbstore_offset_out,
> + (uint32_t)desc->agx100_dec_req.hbstore_offset_in,
> + (uint32_t)desc->agx100_dec_req.en_slice_ts,
> + (uint32_t)desc->agx100_dec_req.en_host_ts,
> + (uint32_t)desc->agx100_dec_req.en_cb_wr_status,
> + (uint32_t)desc->agx100_dec_req.en_output_sg,
> + (uint32_t)desc->agx100_dec_req.en_input_sg,
> + (uint32_t)desc->agx100_dec_req.tb_cb,
> + (uint32_t)desc->agx100_dec_req.crc24b_ind,
> + (uint32_t)desc->agx100_dec_req.drop_crc24b,
> + (uint32_t)desc->agx100_dec_req.harq_input_length_a,
> + (uint32_t)desc->agx100_dec_req.harq_input_length_b,
> + (uint32_t)desc->agx100_dec_req.input_slice_table_addr_lo,
> + (uint32_t)desc->agx100_dec_req.input_start_addr_lo,
> + (uint32_t)desc->agx100_dec_req.input_slice_table_addr_hi,
> + (uint32_t)desc->agx100_dec_req.input_start_addr_hi,
> + (uint32_t)desc->agx100_dec_req.input_slice_num,
> + (uint32_t)desc->agx100_dec_req.input_length,
> + (uint32_t)desc->agx100_dec_req.output_slice_table_addr_lo,
> + (uint32_t)desc->agx100_dec_req.output_start_addr_lo,
> + (uint32_t)desc->agx100_dec_req.output_slice_table_addr_hi,
> + (uint32_t)desc->agx100_dec_req.output_start_addr_hi,
> + (uint32_t)desc->agx100_dec_req.output_slice_num,
> + (uint32_t)desc->agx100_dec_req.output_length,
> + (uint32_t)desc->agx100_dec_req.enqueue_timestamp,
> + (uint32_t)desc->agx100_dec_req.completion_timestamp);
> +
> + uint32_t *word = (uint32_t *) desc;
> + rte_bbdev_log_debug("%08"PRIx32"\n%08"PRIx32"\n%08"PRIx32"\n%08"PRIx32"\n"
> + "%08"PRIx32"\n%08"PRIx32"\n%08"PRIx32"\n%08"PRIx32"\n"
> + "%08"PRIx32"\n%08"PRIx32"\n%08"PRIx32"\n%08"PRIx32"\n"
> + "%08"PRIx32"\n%08"PRIx32"\n%08"PRIx32"\n%08"PRIx32"\n",
> + word[0], word[1], word[2], word[3],
> + word[4], word[5], word[6], word[7],
> + word[8], word[9], word[10], word[11],
> + word[12], word[13], word[14], word[15]);
> +}
> +
> /* Print decode DMA Descriptor of Vista Creek encoder device. */
> static void
> vc_5gnr_print_dma_enc_desc_debug_info(union vc_5gnr_dma_desc *desc)
> @@ -175,6 +284,87 @@ vc_5gnr_print_dma_enc_desc_debug_info(union vc_5gnr_dma_desc *desc)
> word[4], word[5], word[6], word[7]);
> }
>
> +/* Print decode DMA Descriptor of AGX100 encoder device. */
> +static void
> +agx100_print_dma_enc_desc_debug_info(union agx100_dma_desc *desc)
> +{
> + rte_bbdev_log_debug("DMA response desc %p\n"
> + "\t-- done(%"PRIu32") | error_msg(%"PRIu32") | error_code(%"PRIu32")\n"
> + "\t-- ncb(%"PRIu32") | bg_idx (%"PRIu32") | qm_idx (%"PRIu32")"
> + "| zc(%"PRIu32") | rv(%"PRIu32") | int_en(%"PRIu32")\n"
> + "\t-- max_cbg(%"PRIu32") | cbgti(%"PRIu32") | cbgs(%"PRIu32") | "
> + "desc_idx(%"PRIu32")\n"
> + "\t-- ca(%"PRIu32") | c(%"PRIu32") | num_null(%"PRIu32")\n"
> + "\t-- ea(%"PRIu32")\n"
> + "\t-- eb(%"PRIu32")\n"
> + "\t-- k_(%"PRIu32") | en_slice_ts(%"PRIu32") | en_host_ts(%"PRIu32") | "
> + "en_cb_wr_status(%"PRIu32") | en_output_sg(%"PRIu32") | "
> + "en_input_sg(%"PRIu32") | tb_cb(%"PRIu32") | crc_en(%"PRIu32")\n"
> + "\t-- input_slice_table_addr_lo(%"PRIu32")"
> + " | input_start_addr_lo(%"PRIu32")\n"
> + "\t-- input_slice_table_addr_hi(%"PRIu32")"
> + " | input_start_addr_hi(%"PRIu32")\n"
> + "\t-- input_slice_num(%"PRIu32") | input_length(%"PRIu32")\n"
> + "\t-- output_slice_table_addr_lo(%"PRIu32")"
> + " | output_start_addr_lo(%"PRIu32")\n"
> + "\t-- output_slice_table_addr_hi(%"PRIu32")"
> + " | output_start_addr_hi(%"PRIu32")\n"
> + "\t-- output_slice_num(%"PRIu32") | output_length(%"PRIu32")\n"
> + "\t-- enqueue_timestamp(%"PRIu32")\n"
> + "\t-- completion_timestamp(%"PRIu32")\n",
> + desc,
> + (uint32_t)desc->agx100_enc_req.done,
> + (uint32_t)desc->agx100_enc_req.error_msg,
> + (uint32_t)desc->agx100_enc_req.error_code,
> + (uint32_t)desc->agx100_enc_req.ncb,
> + (uint32_t)desc->agx100_enc_req.bg_idx,
> + (uint32_t)desc->agx100_enc_req.qm_idx,
> + (uint32_t)desc->agx100_enc_req.zc,
> + (uint32_t)desc->agx100_enc_req.rv,
> + (uint32_t)desc->agx100_enc_req.int_en,
> + (uint32_t)desc->agx100_enc_req.max_cbg,
> + (uint32_t)desc->agx100_enc_req.cbgti,
> + (uint32_t)desc->agx100_enc_req.cbgs,
> + (uint32_t)desc->agx100_enc_req.desc_idx,
> + (uint32_t)desc->agx100_enc_req.ca,
> + (uint32_t)desc->agx100_enc_req.c,
> + (uint32_t)desc->agx100_enc_req.num_null,
> + (uint32_t)desc->agx100_enc_req.ea,
> + (uint32_t)desc->agx100_enc_req.eb,
> + (uint32_t)desc->agx100_enc_req.k_,
> + (uint32_t)desc->agx100_enc_req.en_slice_ts,
> + (uint32_t)desc->agx100_enc_req.en_host_ts,
> + (uint32_t)desc->agx100_enc_req.en_cb_wr_status,
> + (uint32_t)desc->agx100_enc_req.en_output_sg,
> + (uint32_t)desc->agx100_enc_req.en_input_sg,
> + (uint32_t)desc->agx100_enc_req.tb_cb,
> + (uint32_t)desc->agx100_enc_req.crc_en,
> + (uint32_t)desc->agx100_enc_req.input_slice_table_addr_lo,
> + (uint32_t)desc->agx100_enc_req.input_start_addr_lo,
> + (uint32_t)desc->agx100_enc_req.input_slice_table_addr_hi,
> + (uint32_t)desc->agx100_enc_req.input_start_addr_hi,
> + (uint32_t)desc->agx100_enc_req.input_slice_num,
> + (uint32_t)desc->agx100_enc_req.input_length,
> + (uint32_t)desc->agx100_enc_req.output_slice_table_addr_lo,
> + (uint32_t)desc->agx100_enc_req.output_start_addr_lo,
> + (uint32_t)desc->agx100_enc_req.output_slice_table_addr_hi,
> + (uint32_t)desc->agx100_enc_req.output_start_addr_hi,
> + (uint32_t)desc->agx100_enc_req.output_slice_num,
> + (uint32_t)desc->agx100_enc_req.output_length,
> + (uint32_t)desc->agx100_enc_req.enqueue_timestamp,
> + (uint32_t)desc->agx100_enc_req.completion_timestamp);
> +
> + uint32_t *word = (uint32_t *) desc;
> + rte_bbdev_log_debug("%08"PRIx32"\n%08"PRIx32"\n%08"PRIx32"\n%08"PRIx32"\n"
> + "%08"PRIx32"\n%08"PRIx32"\n%08"PRIx32"\n%08"PRIx32"\n"
> + "%08"PRIx32"\n%08"PRIx32"\n%08"PRIx32"\n%08"PRIx32"\n"
> + "%08"PRIx32"\n%08"PRIx32"\n%08"PRIx32"\n%08"PRIx32"\n",
> + word[0], word[1], word[2], word[3],
> + word[4], word[5], word[6], word[7],
> + word[8], word[9], word[10], word[11],
> + word[12], word[13], word[14], word[15]);
> +}
> +
> #endif
>
> static int
> @@ -198,14 +388,32 @@ fpga_5gnr_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id
> /* Clear queue registers structure */
> memset(&ring_reg, 0, sizeof(struct fpga_5gnr_ring_ctrl_reg));
>
> + if (d->fpga_variant == AGX100_FPGA_VARIANT) {
> + /* Maximum number of queues possible for this device. */
> + d->total_num_queues = fpga_5gnr_reg_read_32(
> + d->mmio_base,
> + FPGA_5GNR_FEC_VERSION_ID) >> 24;
total_num_queues seems to be set twice for AGX100 (here and in
fpga_5gnr_fec_init()). It is odd to set it here, given that this is not
done for Vista Creek.
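If the value really has to come from the version register, the extraction and the range check could live in one small helper called from a single place. A sketch under that assumption; agx100_total_num_queues() is a hypothetical name and takes the raw register value so that it can be exercised without hardware.

```c
#include <errno.h>
#include <stdint.h>

/* Maximum number of possible queues, as defined in agx100_pmd.h. */
#define AGX100_MAXIMUM_QUEUES_SUPPORTED 64

/* Hypothetical helper: the queue count sits in the top byte of the
 * version ID register value. Returns 0 on success, -EPERM when the
 * value exceeds the supported maximum (output left untouched).
 */
static int
agx100_total_num_queues(uint32_t version_id, uint8_t *total_num_queues)
{
	uint8_t num = version_id >> 24;

	if (num > AGX100_MAXIMUM_QUEUES_SUPPORTED)
		return -EPERM;

	*total_num_queues = num;
	return 0;
}
```

fpga_5gnr_setup_queues() could then rely on d->total_num_queues for both variants.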
> + if (d->total_num_queues > AGX100_MAXIMUM_QUEUES_SUPPORTED) {
> + rte_bbdev_log(ERR,
> + "Total number of queues defined greater %d! Register value corrupted?\n",
> + AGX100_MAXIMUM_QUEUES_SUPPORTED);
> + return -EPERM;
> + }
> + }
> +
> /* Scan queue map.
> * If a queue is valid and mapped to a calling PF/VF the read value is
> * replaced with a queue ID and if it's not then
> * FPGA_5GNR_INVALID_HW_QUEUE_ID is returned.
> */
> - for (q_id = 0; q_id < VC_5GNR_TOTAL_NUM_QUEUES; ++q_id) {
> - uint32_t hw_q_id = fpga_5gnr_reg_read_32(d->mmio_base,
> - VC_5GNR_QUEUE_MAP + (q_id << 2));
> + for (q_id = 0; q_id < d->total_num_queues; ++q_id) {
> + uint32_t hw_q_id;
> + if (d->fpga_variant == VC_5GNR_FPGA_VARIANT)
> + hw_q_id = fpga_5gnr_reg_read_32(d->mmio_base,
> + VC_5GNR_QUEUE_MAP + (q_id << 2));
> + else
> + hw_q_id = fpga_5gnr_reg_read_32(d->mmio_base,
> + AGX100_QUEUE_MAP + (q_id << 2));
>
> rte_bbdev_log_debug("%s: queue ID: %u, registry queue ID: %u",
> dev->device->name, q_id, hw_q_id);
> @@ -231,8 +439,10 @@ fpga_5gnr_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id
> dev->device->name, num_queues, hw_q_num);
> return -EINVAL;
> }
> -
> - ring_size = FPGA_5GNR_RING_MAX_SIZE * sizeof(struct vc_5gnr_dma_dec_desc);
> + if (d->fpga_variant == VC_5GNR_FPGA_VARIANT)
> + ring_size = FPGA_5GNR_RING_MAX_SIZE * sizeof(struct vc_5gnr_dma_dec_desc);
> + else
> + ring_size = FPGA_5GNR_RING_MAX_SIZE * sizeof(struct agx100_dma_dec_desc);
>
> /* Enforce 32 byte alignment */
> RTE_BUILD_BUG_ON((RTE_CACHE_LINE_SIZE % 32) != 0);
> @@ -293,7 +503,7 @@ fpga_5gnr_dev_info_get(struct rte_bbdev *dev, struct rte_bbdev_driver_info *dev_
> struct fpga_5gnr_fec_device *d = dev->data->dev_private;
> uint32_t q_id = 0;
>
> - static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
> + static const struct rte_bbdev_op_cap vc_5gnr_bbdev_capabilities[] = {
> {
> .type = RTE_BBDEV_OP_LDPC_ENC,
> .cap.ldpc_enc = {
> @@ -333,6 +543,44 @@ fpga_5gnr_dev_info_get(struct rte_bbdev *dev, struct rte_bbdev_driver_info *dev_
> RTE_BBDEV_END_OF_CAPABILITIES_LIST()
> };
>
> + static const struct rte_bbdev_op_cap agx100_bbdev_capabilities[] = {
> + {
> + .type = RTE_BBDEV_OP_LDPC_ENC,
> + .cap.ldpc_enc = {
> + .capability_flags =
> + RTE_BBDEV_LDPC_RATE_MATCH |
> + RTE_BBDEV_LDPC_CRC_24B_ATTACH,
> + .num_buffers_src =
> + RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> + .num_buffers_dst =
> + RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> + }
> + },
> + {
> + .type = RTE_BBDEV_OP_LDPC_DEC,
> + .cap.ldpc_dec = {
> + .capability_flags =
> + RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK |
> + RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP |
> + RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
> + RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
> + RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE |
> + RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
> + RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
> + RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK |
> + RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_FILLERS,
> + .llr_size = 6,
> + .llr_decimals = 2,
> + .num_buffers_src =
> + RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> + .num_buffers_hard_out =
> + RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> + .num_buffers_soft_out = 0,
> + }
> + },
> + RTE_BBDEV_END_OF_CAPABILITIES_LIST()
> + };
> +
> /* Check the HARQ DDR size available */
> uint8_t timeout_counter = 0;
> uint32_t harq_buf_ready = fpga_5gnr_reg_read_32(d->mmio_base,
> @@ -357,19 +605,30 @@ fpga_5gnr_dev_info_get(struct rte_bbdev *dev, struct rte_bbdev_driver_info *dev_
> dev_info->driver_name = dev->device->driver->name;
> dev_info->queue_size_lim = FPGA_5GNR_RING_MAX_SIZE;
> dev_info->hardware_accelerated = true;
> - dev_info->min_alignment = 64;
> - dev_info->harq_buffer_size = (harq_buf_size >> 10) + 1;
> + dev_info->min_alignment = 1;
> + if (d->fpga_variant == VC_5GNR_FPGA_VARIANT)
> + dev_info->harq_buffer_size = (harq_buf_size >> 10) + 1;
> + else
> + dev_info->harq_buffer_size = harq_buf_size << 10;
> dev_info->default_queue_conf = default_queue_conf;
> - dev_info->capabilities = bbdev_capabilities;
> + if (d->fpga_variant == VC_5GNR_FPGA_VARIANT)
> + dev_info->capabilities = vc_5gnr_bbdev_capabilities;
> + else
> + dev_info->capabilities = agx100_bbdev_capabilities;
> dev_info->cpu_flag_reqs = NULL;
> dev_info->data_endianness = RTE_LITTLE_ENDIAN;
> dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
>
> /* Calculates number of queues assigned to device */
> dev_info->max_num_queues = 0;
> - for (q_id = 0; q_id < VC_5GNR_TOTAL_NUM_QUEUES; ++q_id) {
> - uint32_t hw_q_id = fpga_5gnr_reg_read_32(d->mmio_base,
> - VC_5GNR_QUEUE_MAP + (q_id << 2));
> + for (q_id = 0; q_id < d->total_num_queues; ++q_id) {
> + uint32_t hw_q_id;
> + if (d->fpga_variant == VC_5GNR_FPGA_VARIANT)
> + hw_q_id = fpga_5gnr_reg_read_32(d->mmio_base,
> + VC_5GNR_QUEUE_MAP + (q_id << 2));
> + else
> + hw_q_id = fpga_5gnr_reg_read_32(d->mmio_base,
> + AGX100_QUEUE_MAP + (q_id << 2));
Maybe you could consider helpers for getting and setting the queue map,
to make adding new variants less painful.
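A rough, untested sketch of what I have in mind. The struct, the helper
names and the register offsets below are placeholders for illustration,
not the driver's actual definitions:

```c
#include <stdint.h>

/* Hypothetical stand-ins for the driver's types and register offsets. */
enum sketch_variant { SKETCH_VC_5GNR_VARIANT, SKETCH_AGX100_VARIANT };
#define SKETCH_VC_5GNR_QUEUE_MAP 0x040u /* placeholder offset */
#define SKETCH_AGX100_QUEUE_MAP  0x100u /* placeholder offset */

struct sketch_dev {
	enum sketch_variant fpga_variant;
	void *mmio_base;
};

/* Minimal 32-bit MMIO accessors, standing in for the driver's own ones. */
static inline uint32_t
sketch_reg_read_32(void *mmio_base, uint32_t offset)
{
	return *(volatile uint32_t *)((uint8_t *)mmio_base + offset);
}

static inline void
sketch_reg_write_32(void *mmio_base, uint32_t offset, uint32_t value)
{
	*(volatile uint32_t *)((uint8_t *)mmio_base + offset) = value;
}

/* Single place that knows which queue map base belongs to which variant. */
static inline uint32_t
sketch_queue_map_base(const struct sketch_dev *d)
{
	return d->fpga_variant == SKETCH_VC_5GNR_VARIANT ?
			SKETCH_VC_5GNR_QUEUE_MAP : SKETCH_AGX100_QUEUE_MAP;
}

/* Read the queue map entry for q_id (one 32-bit register per queue). */
static inline uint32_t
sketch_queue_map_get(const struct sketch_dev *d, uint32_t q_id)
{
	return sketch_reg_read_32(d->mmio_base,
			sketch_queue_map_base(d) + (q_id << 2));
}

/* Write the queue map entry for q_id. */
static inline void
sketch_queue_map_set(const struct sketch_dev *d, uint32_t q_id, uint32_t value)
{
	sketch_reg_write_32(d->mmio_base,
			sketch_queue_map_base(d) + (q_id << 2), value);
}
```

With something like this, the loops in fpga_5gnr_setup_queues() and
fpga_5gnr_dev_info_get() would reduce to a single call per queue, and a
new variant would only need to touch the base selection.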
> if (hw_q_id != FPGA_5GNR_INVALID_HW_QUEUE_ID)
> dev_info->max_num_queues++;
> }
> @@ -377,8 +636,8 @@ fpga_5gnr_dev_info_get(struct rte_bbdev *dev, struct rte_bbdev_driver_info *dev_
> dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
> dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = 0;
> dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = 0;
> - dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = dev_info->max_num_queues / 2;
> - dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = dev_info->max_num_queues / 2;
> + dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = dev_info->max_num_queues >> 1;
> + dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = dev_info->max_num_queues >> 1;
Looks good, but not really related to the purpose of this patch (maybe
move it to patch 4?)
> dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = 1;
> dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = 1;
> }
> @@ -394,11 +653,11 @@ fpga_5gnr_find_free_queue_idx(struct rte_bbdev *dev,
> struct fpga_5gnr_fec_device *d = dev->data->dev_private;
> uint64_t q_idx;
> uint8_t i = 0;
> - uint8_t range = VC_5GNR_TOTAL_NUM_QUEUES >> 1;
> + uint8_t range = d->total_num_queues >> 1;
>
> if (conf->op_type == RTE_BBDEV_OP_LDPC_ENC) {
> - i = VC_5GNR_NUM_DL_QUEUES;
> - range = VC_5GNR_TOTAL_NUM_QUEUES;
> + i = d->total_num_queues >> 1;
> + range = d->total_num_queues;
> }
>
> for (; i < range; ++i) {
> @@ -445,7 +704,11 @@ fpga_5gnr_queue_setup(struct rte_bbdev *dev, uint16_t queue_id,
> q->q_idx = q_idx;
>
> /* Set ring_base_addr */
> - q->vc_5gnr_ring_addr = RTE_PTR_ADD(d->sw_rings, (d->sw_ring_size * queue_id));
> + if (d->fpga_variant == VC_5GNR_FPGA_VARIANT)
> + q->vc_5gnr_ring_addr = RTE_PTR_ADD(d->sw_rings, (d->sw_ring_size * queue_id));
> + else
> + q->agx100_ring_addr = RTE_PTR_ADD(d->sw_rings, (d->sw_ring_size * queue_id));
> +
> q->ring_ctrl_reg.ring_base_addr = d->sw_rings_phys + (d->sw_ring_size * queue_id);
>
> /* Allocate memory for Completion Head variable*/
> @@ -661,7 +924,7 @@ fpga_5gnr_dev_interrupt_handler(void *cb_arg)
> uint8_t i;
>
> /* Scan queue assigned to this device */
> - for (i = 0; i < VC_5GNR_TOTAL_NUM_QUEUES; ++i) {
> + for (i = 0; i < d->total_num_queues; ++i) {
> q_idx = 1ULL << i;
> if (d->q_bound_bit_map & q_idx) {
> queue_id = get_queue_id(dev->data, i);
> @@ -710,6 +973,13 @@ fpga_5gnr_intr_enable(struct rte_bbdev *dev)
> {
> int ret;
> uint8_t i;
> + struct fpga_5gnr_fec_device *d = dev->data->dev_private;
> + uint8_t num_intr_vec;
> +
> + if (d->fpga_variant == VC_5GNR_FPGA_VARIANT)
> + num_intr_vec = VC_5GNR_NUM_INTR_VEC;
> + else
> + num_intr_vec = d->total_num_queues - RTE_INTR_VEC_RXTX_OFFSET;
Checking the variant is not needed, as
VC_5GNR_NUM_INTR_VEC == d->total_num_queues - RTE_INTR_VEC_RXTX_OFFSET
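To spell out the arithmetic: assuming RTE_INTR_VEC_RXTX_OFFSET is 1, and
taking the 64 queues and 63 vectors for Vista Creek from the comment this
patch rewrites below, a single expression covers both variants. The
SKETCH_/sketch_ names are hypothetical and only mirror those values:

```c
#include <stdint.h>

/* Assumed to mirror RTE_INTR_VEC_RXTX_OFFSET. */
#define SKETCH_INTR_VEC_RXTX_OFFSET 1
/* Vista Creek values as stated in the existing driver comment. */
#define SKETCH_VC_5GNR_TOTAL_NUM_QUEUES 64
#define SKETCH_VC_5GNR_NUM_INTR_VEC 63

/* Number of event fds to create, with no per-variant branch. */
static inline uint8_t
sketch_num_intr_vec(uint8_t total_num_queues)
{
	return total_num_queues - SKETCH_INTR_VEC_RXTX_OFFSET;
}
```

This relies on d->total_num_queues already being set to the Vista Creek
total for that variant.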
>
> if (!rte_intr_cap_multiple(dev->intr_handle)) {
> rte_bbdev_log(ERR, "Multiple intr vector is not supported by FPGA (%s)",
> @@ -717,15 +987,15 @@ fpga_5gnr_intr_enable(struct rte_bbdev *dev)
> return -ENOTSUP;
> }
>
> - /* Create event file descriptors for each of 64 queue. Event fds will be
> - * mapped to FPGA IRQs in rte_intr_enable(). This is a 1:1 mapping where
> - * the IRQ number is a direct translation to the queue number.
> + /* Create event file descriptors for each of the supported queues (Maximum 64).
> + * Event fds will be mapped to FPGA IRQs in rte_intr_enable().
> + * This is a 1:1 mapping where the IRQ number is a direct translation to the queue number.
> *
> - * 63 (VC_5GNR_NUM_INTR_VEC) event fds are created as rte_intr_enable()
> + * num_intr_vec event fds are created as rte_intr_enable()
> * mapped the first IRQ to already created interrupt event file
> * descriptor (intr_handle->fd).
> */
> - if (rte_intr_efd_enable(dev->intr_handle, VC_5GNR_NUM_INTR_VEC)) {
> + if (rte_intr_efd_enable(dev->intr_handle, num_intr_vec)) {
> rte_bbdev_log(ERR, "Failed to create fds for %u queues", dev->data->num_queues);
> return -1;
> }
> @@ -735,7 +1005,7 @@ fpga_5gnr_intr_enable(struct rte_bbdev *dev)
> * It ensures that callback function assigned to that descriptor will
> * invoked when any FPGA queue issues interrupt.
> */
> - for (i = 0; i < VC_5GNR_NUM_INTR_VEC; ++i) {
> + for (i = 0; i < num_intr_vec; ++i) {
> if (rte_intr_efds_index_set(dev->intr_handle, i,
> rte_intr_fd_get(dev->intr_handle)))
> return -rte_errno;
> @@ -856,6 +1126,72 @@ vc_5gnr_check_desc_error(uint32_t error_code) {
> return 1;
> }
>
> +/* AGX100 FPGA descriptor errors
> + * Print an error if a descriptor error has occurred.
> + * Return 0 on success, 1 on failure
> + */
> +static inline int
> +agx100_check_desc_error(uint32_t error_code, uint32_t error_msg) {
> + uint8_t error = error_code << 4 | error_msg;
> + switch (error) {
> + case AGX100_DESC_ERR_NO_ERR:
> + return 0;
> + case AGX100_DESC_ERR_E_NOT_LEGAL:
> + rte_bbdev_log(ERR, "Invalid output length of rate matcher E");
> + break;
> + case AGX100_DESC_ERR_K_P_OUT_OF_RANGE:
> + rte_bbdev_log(ERR, "Encode block size K' is out of range");
> + break;
> + case AGX100_DESC_ERR_NCB_OUT_OF_RANGE:
> + rte_bbdev_log(ERR, "Ncb circular buffer size is out of range");
> + break;
> + case AGX100_DESC_ERR_Z_C_NOT_LEGAL:
> + rte_bbdev_log(ERR, "Zc is illegal");
> + break;
> + case AGX100_DESC_ERR_DESC_INDEX_ERR:
> + rte_bbdev_log(ERR,
> + "Desc_index received does not meet the expectation in the AGX100"
> + );
> + break;
> + case AGX100_DESC_ERR_HARQ_INPUT_LEN_A:
> + rte_bbdev_log(ERR, "HARQ input length A is invalid.");
> + break;
> + case AGX100_DESC_ERR_HARQ_INPUT_LEN_B:
> + rte_bbdev_log(ERR, "HARQ input length B is invalid.");
> + break;
> + case AGX100_DESC_ERR_HBSTORE_OFFSET_ERR:
> + rte_bbdev_log(ERR, "Hbstore exceeds HARQ buffer size.");
> + break;
> + case AGX100_DESC_ERR_TB_CBG_ERR:
> + rte_bbdev_log(ERR, "Total CB number C=0 or CB number with Ea Ca=0 or Ca>C.");
> + break;
> + case AGX100_DESC_ERR_CBG_OUT_OF_RANGE:
> + rte_bbdev_log(ERR, "Cbgti or max_cbg is out of range");
> + break;
> + case AGX100_DESC_ERR_CW_RM_NOT_LEGAL:
> + rte_bbdev_log(ERR, "Cw_rm is illegal");
> + break;
> + case AGX100_DESC_ERR_UNSUPPORTED_REQ:
> + rte_bbdev_log(ERR, "Unsupported request for descriptor");
> + break;
> + case AGX100_DESC_ERR_RESERVED:
> + rte_bbdev_log(ERR, "Reserved");
> + break;
> + case AGX100_DESC_ERR_DESC_ABORT:
> + rte_bbdev_log(ERR, "Completed abort for descriptor");
> + break;
> + case AGX100_DESC_ERR_DESC_READ_TLP_POISONED:
> + rte_bbdev_log(ERR, "Descriptor read TLP poisoned");
> + break;
> + default:
> + rte_bbdev_log(ERR,
> + "Descriptor error unknown error code %u error msg %u",
> + error_code, error_msg);
> + break;
> + }
> + return 1;
> +}
> +
> /* Compute value of k0.
> * Based on 3GPP 38.212 Table 5.4.2.1-2
> * Starting position of different redundancy versions, k0
> @@ -953,6 +1289,88 @@ vc_5gnr_dma_desc_te_fill(struct rte_bbdev_enc_op *op,
> return 0;
> }
>
> +/**
> + * AGX100 FPGA
> + * Set DMA descriptor for encode operation (1 Code Block)
> + *
> + * @param op
> + * Pointer to a single encode operation.
> + * @param desc
> + * Pointer to DMA descriptor.
> + * @param input
> + * Pointer to pointer to input data which will be decoded.
> + * @param e
> + * E value (length of output in bits).
> + * @param ncb
> + * Ncb value (size of the soft buffer).
> + * @param out_length
> + * Length of output buffer
> + * @param in_offset
> + * Input offset in rte_mbuf structure. It is used for calculating the point
> + * where data is starting.
> + * @param out_offset
> + * Output offset in rte_mbuf structure. It is used for calculating the point
> + * where hard output data will be stored.
> + * @param cbs_in_op
> + * Number of CBs contained in one operation.
> + */
> +static inline int
> +agx100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
> + struct agx100_dma_enc_desc *desc, struct rte_mbuf *input,
> + struct rte_mbuf *output, uint16_t k_, uint32_t e,
> + uint32_t in_offset, uint32_t out_offset, uint16_t desc_offset,
> + uint8_t cbs_in_op)
> +{
> + /* reset. */
> + desc->done = 0;
> + desc->error_msg = 0;
> + desc->error_code = 0;
> + desc->ncb = op->ldpc_enc.n_cb;
> + desc->bg_idx = op->ldpc_enc.basegraph - 1;
> + desc->qm_idx = op->ldpc_enc.q_m >> 1;
> + desc->zc = op->ldpc_enc.z_c;
> + desc->rv = op->ldpc_enc.rv_index;
> + desc->int_en = 0; /**< Set by device externally. */
> + desc->max_cbg = 0; /**< TODO: CBG specific. */
> + desc->cbgti = 0; /**< TODO: CBG specific. */
> + desc->cbgs = 0; /**< TODO: CBG specific. */
> + desc->desc_idx = desc_offset;
> + desc->ca = 0; /**< TODO: CBG specific. */
> + desc->c = 0; /**< TODO: CBG specific. */
> + desc->num_null = op->ldpc_enc.n_filler;
> + desc->ea = e;
> + desc->eb = e; /**< TODO: TB/CBG specific. */
> + desc->k_ = k_;
> + desc->en_slice_ts = 0; /**< TODO: Slice specific. */
> + desc->en_host_ts = 0; /**< TODO: Slice specific. */
> + desc->en_cb_wr_status = 0; /**< TODO: Event Queue specific. */
> + desc->en_output_sg = 0; /**< TODO: Slice specific. */
> + desc->en_input_sg = 0; /**< TODO: Slice specific. */
> + desc->tb_cb = 0; /**< Descriptor for CB. TODO: Add TB and CBG logic. */
> + desc->crc_en = check_bit(op->ldpc_enc.op_flags,
> + RTE_BBDEV_LDPC_CRC_24B_ATTACH);
> +
> + /* Set inbound/outbound data buffer address. */
> + /* TODO: add logic for input_slice. */
> + desc->output_start_addr_hi = (uint32_t)(
> + rte_pktmbuf_iova_offset(output, out_offset) >> 32);
> + desc->output_start_addr_lo = (uint32_t)(
> + rte_pktmbuf_iova_offset(output, out_offset));
> + desc->input_start_addr_hi = (uint32_t)(
> + rte_pktmbuf_iova_offset(input, in_offset) >> 32);
> + desc->input_start_addr_lo = (uint32_t)(
> + rte_pktmbuf_iova_offset(input, in_offset));
> + desc->output_length = (e + 7) >> 3; /* in bytes. */
> + desc->input_length = input->data_len;
> + desc->enqueue_timestamp = 0;
> + desc->completion_timestamp = 0;
> + /* Save software context needed for dequeue. */
> + desc->op_addr = op;
> + /* Set total number of CBs in an op. */
> + desc->cbs_in_op = cbs_in_op;
> + return 0;
> +}
> +
> /**
> * Vista Creek 5GNR FPGA
> * Set DMA descriptor for decode operation (1 Code Block)
> @@ -1021,6 +1439,105 @@ vc_5gnr_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
> return 0;
> }
>
> +/**
> + * AGX100 FPGA
> + * Set DMA descriptor for decode operation (1 Code Block)
> + *
> + * @param op
> + * Pointer to a single encode operation.
> + * @param desc
> + * Pointer to DMA descriptor.
> + * @param input
> + * Pointer to pointer to input data which will be decoded.
> + * @param in_offset
> + * Input offset in rte_mbuf structure. It is used for calculating the point
> + * where data is starting.
> + * @param out_offset
> + * Output offset in rte_mbuf structure. It is used for calculating the point
> + * where hard output data will be stored.
> + * @param cbs_in_op
> + * Number of CBs contained in one operation.
> + */
> +static inline int
> +agx100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
> + struct agx100_dma_dec_desc *desc,
> + struct rte_mbuf *input, struct rte_mbuf *output,
> + uint16_t harq_in_length,
> + uint32_t in_offset, uint32_t out_offset,
> + uint32_t harq_in_offset,
> + uint32_t harq_out_offset,
> + uint16_t desc_offset,
> + uint8_t cbs_in_op)
> +{
> + /* reset. */
> + desc->done = 0;
> + desc->tb_crc_pass = 0;
> + desc->cb_crc_all_pass = 0;
> + desc->cb_all_et_pass = 0;
> + desc->max_iter_ret = 0;
> + desc->cgb_crc_bitmap = 0; /**< TODO: CBG specific. */
> + desc->error_msg = 0;
> + desc->error_code = 0;
> + desc->et_dis = !check_bit(op->ldpc_dec.op_flags,
> + RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE);
> + desc->harq_in_en = check_bit(op->ldpc_dec.op_flags,
> + RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE);
> + desc->max_iter = op->ldpc_dec.iter_max;
> + desc->ncb = op->ldpc_dec.n_cb;
> + desc->bg_idx = op->ldpc_dec.basegraph - 1;
> + desc->qm_idx = op->ldpc_dec.q_m >> 1;
> + desc->zc = op->ldpc_dec.z_c;
> + desc->rv = op->ldpc_dec.rv_index;
> + desc->int_en = 0; /**< Set by device externally. */
> + desc->max_cbg = 0; /**< TODO: CBG specific. */
> + desc->cbgti = 0; /**< TODO: CBG specific. */
> + desc->cbgfi = 0; /**< TODO: CBG specific. */
> + desc->cbgs = 0; /**< TODO: CBG specific. */
> + desc->desc_idx = desc_offset;
> + desc->ca = 0; /**< TODO: CBG specific. */
> + desc->c = 0; /**< TODO: CBG specific. */
> + desc->llr_pckg = 0; /**< TODO: Not implemented yet. */
> + desc->syndrome_check_mode = 1; /**< TODO: Make it configurable. */
> + desc->num_null = op->ldpc_dec.n_filler;
> + desc->ea = op->ldpc_dec.cb_params.e; /**< TODO: TB/CBG specific. */
> + desc->eba = 0; /**< TODO: TB/CBG specific. */
> + desc->hbstore_offset_out = harq_out_offset >> 10;
> + desc->hbstore_offset_in = harq_in_offset >> 10;
> + desc->en_slice_ts = 0; /**< TODO: Slice specific. */
> + desc->en_host_ts = 0; /**< TODO: Slice specific. */
> + desc->en_cb_wr_status = 0; /**< TODO: Event Queue specific. */
> + desc->en_output_sg = 0; /**< TODO: Slice specific. */
> + desc->en_input_sg = 0; /**< TODO: Slice specific. */
> + desc->tb_cb = 0; /**< Descriptor for CB. TODO: Add TB and CBG logic. */
> + desc->crc24b_ind = check_bit(op->ldpc_dec.op_flags,
> + RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK);
> + desc->drop_crc24b = check_bit(op->ldpc_dec.op_flags,
> + RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP);
> + desc->harq_input_length_a =
> + harq_in_length; /**< Descriptor for CB. TODO: Add TB and CBG logic. */
> + desc->harq_input_length_b = 0; /**< Descriptor for CB. TODO: Add TB and CBG logic. */
> + /* Set inbound/outbound data buffer address. */
> + /* TODO: add logic for input_slice. */
> + desc->output_start_addr_hi = (uint32_t)(
> + rte_pktmbuf_iova_offset(output, out_offset) >> 32);
> + desc->output_start_addr_lo = (uint32_t)(
> + rte_pktmbuf_iova_offset(output, out_offset));
> + desc->input_start_addr_hi = (uint32_t)(
> + rte_pktmbuf_iova_offset(input, in_offset) >> 32);
> + desc->input_start_addr_lo = (uint32_t)(
> + rte_pktmbuf_iova_offset(input, in_offset));
> + desc->output_length = (((op->ldpc_dec.basegraph == 1) ? 22 : 10) * op->ldpc_dec.z_c
> + - op->ldpc_dec.n_filler - desc->drop_crc24b * 24) >> 3;
> + desc->input_length = op->ldpc_dec.cb_params.e; /**< TODO: TB/CBG specific. */
> + desc->enqueue_timestamp = 0;
> + desc->completion_timestamp = 0;
> + /* Save software context needed for dequeue. */
> + desc->op_addr = op;
> + /* Set total number of CBs in an op. */
> + desc->cbs_in_op = cbs_in_op;
> + return 0;
> +}
> +
> /* Validates LDPC encoder parameters for VC 5GNR FPGA. */
> static inline int
> vc_5gnr_validate_ldpc_enc_op(struct rte_bbdev_enc_op *op)
> @@ -1484,27 +2001,35 @@ fpga_5gnr_harq_write_loopback(struct fpga_5gnr_queue *q,
> uint64_t *input = NULL;
> uint32_t last_transaction = left_length % FPGA_5GNR_DDR_WR_DATA_LEN_IN_BYTES;
> uint64_t last_word;
> + struct fpga_5gnr_fec_device *d = q->d;
>
> if (last_transaction > 0)
> left_length -= last_transaction;
> -
> - /*
> - * Get HARQ buffer size for each VF/PF: When 0x00, there is no
> - * available DDR space for the corresponding VF/PF.
> - */
> - reg_32 = fpga_5gnr_reg_read_32(q->d->mmio_base, FPGA_5GNR_FEC_HARQ_BUF_SIZE_REGS);
> - if (reg_32 < harq_in_length) {
> - left_length = reg_32;
> - rte_bbdev_log(ERR, "HARQ in length > HARQ buffer size\n");
> + if (d->fpga_variant == VC_5GNR_FPGA_VARIANT) {
> + /*
> + * Get HARQ buffer size for each VF/PF: When 0x00, there is no
> + * available DDR space for the corresponding VF/PF.
> + */
> + reg_32 = fpga_5gnr_reg_read_32(q->d->mmio_base, FPGA_5GNR_FEC_HARQ_BUF_SIZE_REGS);
> + if (reg_32 < harq_in_length) {
> + left_length = reg_32;
> + rte_bbdev_log(ERR, "HARQ in length > HARQ buffer size\n");
> + }
> }
>
> input = (uint64_t *)rte_pktmbuf_mtod_offset(harq_input, uint8_t *, in_offset);
>
> while (left_length > 0) {
> if (fpga_5gnr_reg_read_8(q->d->mmio_base, FPGA_5GNR_FEC_DDR4_ADDR_RDY_REGS) == 1) {
> - fpga_5gnr_reg_write_32(q->d->mmio_base,
> - FPGA_5GNR_FEC_DDR4_WR_ADDR_REGS,
> - out_offset);
> + if (d->fpga_variant == AGX100_FPGA_VARIANT) {
> + fpga_5gnr_reg_write_32(q->d->mmio_base,
> + FPGA_5GNR_FEC_DDR4_WR_ADDR_REGS,
> + out_offset >> 3);
> + } else {
> + fpga_5gnr_reg_write_32(q->d->mmio_base,
> + FPGA_5GNR_FEC_DDR4_WR_ADDR_REGS,
> + out_offset);
> + }
> fpga_5gnr_reg_write_64(q->d->mmio_base,
> FPGA_5GNR_FEC_DDR4_WR_DATA_REGS,
> input[increment]);
> @@ -1516,12 +2041,17 @@ fpga_5gnr_harq_write_loopback(struct fpga_5gnr_queue *q,
> }
> while (last_transaction > 0) {
> if (fpga_5gnr_reg_read_8(q->d->mmio_base, FPGA_5GNR_FEC_DDR4_ADDR_RDY_REGS) == 1) {
> - fpga_5gnr_reg_write_32(q->d->mmio_base,
> - FPGA_5GNR_FEC_DDR4_WR_ADDR_REGS,
> - out_offset);
> + if (d->fpga_variant == AGX100_FPGA_VARIANT) {
> + fpga_5gnr_reg_write_32(q->d->mmio_base,
> + FPGA_5GNR_FEC_DDR4_WR_ADDR_REGS,
> + out_offset >> 3);
> + } else {
> + fpga_5gnr_reg_write_32(q->d->mmio_base,
> + FPGA_5GNR_FEC_DDR4_WR_ADDR_REGS,
> + out_offset);
> + }
> last_word = input[increment];
> - last_word &= (uint64_t)(1 << (last_transaction * 4))
> - - 1;
> + last_word &= (uint64_t)(1ULL << (last_transaction * 4)) - 1;
> fpga_5gnr_reg_write_64(q->d->mmio_base,
> FPGA_5GNR_FEC_DDR4_WR_DATA_REGS,
> last_word);
> @@ -1544,14 +2074,17 @@ fpga_5gnr_harq_read_loopback(struct fpga_5gnr_queue *q,
> uint32_t increment = 0;
> uint64_t *input = NULL;
> uint32_t last_transaction = harq_in_length % FPGA_5GNR_DDR_WR_DATA_LEN_IN_BYTES;
> + struct fpga_5gnr_fec_device *d = q->d;
>
> if (last_transaction > 0)
> harq_in_length += (8 - last_transaction);
>
> - reg = fpga_5gnr_reg_read_32(q->d->mmio_base, FPGA_5GNR_FEC_HARQ_BUF_SIZE_REGS);
> - if (reg < harq_in_length) {
> - harq_in_length = reg;
> - rte_bbdev_log(ERR, "HARQ in length > HARQ buffer size\n");
> + if (d->fpga_variant == VC_5GNR_FPGA_VARIANT) {
> + reg = fpga_5gnr_reg_read_32(q->d->mmio_base, FPGA_5GNR_FEC_HARQ_BUF_SIZE_REGS);
> + if (reg < harq_in_length) {
> + harq_in_length = reg;
> + rte_bbdev_log(ERR, "HARQ in length > HARQ buffer size\n");
> + }
> }
>
> if (!mbuf_append(harq_output, harq_output, harq_in_length)) {
> @@ -1570,9 +2103,15 @@ fpga_5gnr_harq_read_loopback(struct fpga_5gnr_queue *q,
> input = (uint64_t *)rte_pktmbuf_mtod_offset(harq_output, uint8_t *, harq_out_offset);
>
> while (left_length > 0) {
> - fpga_5gnr_reg_write_32(q->d->mmio_base,
> - FPGA_5GNR_FEC_DDR4_RD_ADDR_REGS,
> - in_offset);
> + if (d->fpga_variant == AGX100_FPGA_VARIANT) {
> + fpga_5gnr_reg_write_32(q->d->mmio_base,
> + FPGA_5GNR_FEC_DDR4_RD_ADDR_REGS,
> + in_offset >> 3);
> + } else {
> + fpga_5gnr_reg_write_32(q->d->mmio_base,
> + FPGA_5GNR_FEC_DDR4_RD_ADDR_REGS,
> + in_offset);
> + }
> fpga_5gnr_reg_write_8(q->d->mmio_base, FPGA_5GNR_FEC_DDR4_RD_DONE_REGS, 1);
> reg = fpga_5gnr_reg_read_8(q->d->mmio_base, FPGA_5GNR_FEC_DDR4_RD_RDY_REGS);
> while (reg != 1) {
> @@ -1587,7 +2126,10 @@ fpga_5gnr_harq_read_loopback(struct fpga_5gnr_queue *q,
> left_length -= FPGA_5GNR_DDR_RD_DATA_LEN_IN_BYTES;
> in_offset += FPGA_5GNR_DDR_WR_DATA_LEN_IN_BYTES;
> increment++;
> - fpga_5gnr_reg_write_8(q->d->mmio_base, FPGA_5GNR_FEC_DDR4_RD_DONE_REGS, 0);
> + if (d->fpga_variant == AGX100_FPGA_VARIANT)
> + fpga_5gnr_reg_write_8(q->d->mmio_base, FPGA_5GNR_FEC_DDR4_RD_RDY_REGS, 0);
> + else
> + fpga_5gnr_reg_write_8(q->d->mmio_base, FPGA_5GNR_FEC_DDR4_RD_DONE_REGS, 0);
> }
> fpga_5gnr_mutex_free(q);
> return 1;
> @@ -1598,6 +2140,7 @@ enqueue_ldpc_enc_one_op_cb(struct fpga_5gnr_queue *q, struct rte_bbdev_enc_op *o
> uint16_t desc_offset)
> {
> union vc_5gnr_dma_desc *vc_5gnr_desc;
> + union agx100_dma_desc *agx100_desc;
> int ret;
> uint8_t c, crc24_bits = 0;
> struct rte_bbdev_op_ldpc_enc *enc = &op->ldpc_enc;
> @@ -1610,10 +2153,13 @@ enqueue_ldpc_enc_one_op_cb(struct fpga_5gnr_queue *q, struct rte_bbdev_enc_op *o
> uint16_t total_left = enc->input.length;
> uint16_t ring_offset;
> uint16_t K, k_;
> + struct fpga_5gnr_fec_device *d = q->d;
>
> - if (vc_5gnr_validate_ldpc_enc_op(op) == -1) {
> - rte_bbdev_log(ERR, "LDPC encoder validation rejected");
> - return -EINVAL;
> + if (d->fpga_variant == VC_5GNR_FPGA_VARIANT) {
> + if (vc_5gnr_validate_ldpc_enc_op(op) == -1) {
> + rte_bbdev_log(ERR, "LDPC encoder validation rejected");
> + return -EINVAL;
> + }
> }
>
> /* Clear op status */
> @@ -1629,14 +2175,13 @@ enqueue_ldpc_enc_one_op_cb(struct fpga_5gnr_queue *q, struct rte_bbdev_enc_op *o
> crc24_bits = 24;
>
> if (enc->code_block_mode == RTE_BBDEV_TRANSPORT_BLOCK) {
> - /* For Transport Block mode */
> - /* FIXME */
> - c = enc->tb_params.c;
> - e = enc->tb_params.ea;
> - } else { /* For Code Block mode */
> - c = 1;
> - e = enc->cb_params.e;
> + /* TODO: For Transport Block mode. */
> + rte_bbdev_log(ERR, "Transport Block not supported yet");
> + return -1;
> }
> + /* For Code Block mode. */
> + c = 1;
> + e = enc->cb_params.e;
>
> /* Update total_left */
> K = (enc->basegraph == 1 ? 22 : 10) * enc->z_c;
> @@ -1658,10 +2203,19 @@ enqueue_ldpc_enc_one_op_cb(struct fpga_5gnr_queue *q, struct rte_bbdev_enc_op *o
>
> /* Offset into the ring */
> ring_offset = ((q->tail + desc_offset) & q->sw_ring_wrap_mask);
> - /* Setup DMA Descriptor */
> - vc_5gnr_desc = q->vc_5gnr_ring_addr + ring_offset;
> - ret = vc_5gnr_dma_desc_te_fill(op, &vc_5gnr_desc->vc_5gnr_enc_req, m_in, m_out,
> - k_, e, in_offset, out_offset, ring_offset, c);
> +
> + if (d->fpga_variant == VC_5GNR_FPGA_VARIANT) {
> + /* Setup DMA Descriptor. */
> + vc_5gnr_desc = q->vc_5gnr_ring_addr + ring_offset;
> + ret = vc_5gnr_dma_desc_te_fill(op, &vc_5gnr_desc->vc_5gnr_enc_req, m_in, m_out,
> + k_, e, in_offset, out_offset, ring_offset, c);
> + } else {
> + /* Setup DMA Descriptor. */
> + agx100_desc = q->agx100_ring_addr + ring_offset;
> + ret = agx100_dma_desc_le_fill(op, &agx100_desc->agx100_enc_req, m_in, m_out,
> + k_, e, in_offset, out_offset, ring_offset, c);
> + }
> +
> if (unlikely(ret < 0))
> return ret;
>
> @@ -1677,7 +2231,10 @@ enqueue_ldpc_enc_one_op_cb(struct fpga_5gnr_queue *q, struct rte_bbdev_enc_op *o
> }
>
> #ifdef RTE_LIBRTE_BBDEV_DEBUG
> - vc_5gnr_print_dma_enc_desc_debug_info(vc_5gnr_desc);
> + if (d->fpga_variant == VC_5GNR_FPGA_VARIANT)
> + vc_5gnr_print_dma_enc_desc_debug_info(vc_5gnr_desc);
> + else
> + agx100_print_dma_enc_desc_debug_info(agx100_desc);
> #endif
> return 1;
> }
> @@ -1817,28 +2374,152 @@ vc_5gnr_enqueue_ldpc_dec_one_op_cb(struct fpga_5gnr_queue *q, struct rte_bbdev_d
> return 1;
> }
>
> -static uint16_t
> -fpga_5gnr_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
> - struct rte_bbdev_enc_op **ops, uint16_t num)
> +static inline int
> +agx100_enqueue_ldpc_dec_one_op_cb(struct fpga_5gnr_queue *q, struct rte_bbdev_dec_op *op,
> + uint16_t desc_offset)
> {
> - uint16_t i, total_enqueued_cbs = 0;
> - int32_t avail;
> - int enqueued_cbs;
> - struct fpga_5gnr_queue *q = q_data->queue_private;
> - union vc_5gnr_dma_desc *vc_5gnr_desc;
> + union agx100_dma_desc *desc;
> + int ret;
> + uint16_t ring_offset;
> + uint8_t c;
> + uint16_t e, in_length, out_length, k0, l, seg_total_left, sys_cols;
> + uint16_t K, parity_offset, harq_in_length = 0, harq_out_length = 0;
> + uint16_t crc24_overlap = 0;
> + struct rte_bbdev_op_ldpc_dec *dec = &op->ldpc_dec;
> + struct rte_mbuf *m_in = dec->input.data;
> + struct rte_mbuf *m_out = dec->hard_output.data;
> + struct rte_mbuf *m_out_head = dec->hard_output.data;
> + uint16_t in_offset = dec->input.offset;
> + uint16_t out_offset = dec->hard_output.offset;
> + uint32_t harq_in_offset = 0;
> + uint32_t harq_out_offset = 0;
>
> - /* Check if queue is not full */
> - if (unlikely(((q->tail + 1) & q->sw_ring_wrap_mask) == q->head_free_desc))
> - return 0;
> + /* Clear op status. */
> + op->status = 0;
>
> - /* Calculates available space */
> - avail = (q->head_free_desc > q->tail) ?
> - q->head_free_desc - q->tail - 1 :
> - q->ring_ctrl_reg.ring_size + q->head_free_desc - q->tail - 1;
> + /* Setup DMA Descriptor. */
> + ring_offset = ((q->tail + desc_offset) & q->sw_ring_wrap_mask);
> + desc = q->agx100_ring_addr + ring_offset;
>
> - for (i = 0; i < num; ++i) {
> - /* Check if there is available space for further
> - * processing
> + if (check_bit(dec->op_flags, RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK)) {
> + struct rte_mbuf *harq_in = dec->harq_combined_input.data;
> + struct rte_mbuf *harq_out = dec->harq_combined_output.data;
> + harq_in_length = dec->harq_combined_input.length;
> + uint32_t harq_in_offset = dec->harq_combined_input.offset;
> + uint32_t harq_out_offset = dec->harq_combined_output.offset;
> +
> + if (check_bit(dec->op_flags, RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE)) {
> + ret = fpga_5gnr_harq_write_loopback(q, harq_in,
> + harq_in_length, harq_in_offset,
> + harq_out_offset);
> + } else if (check_bit(dec->op_flags,
> + RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE)) {
> + ret = fpga_5gnr_harq_read_loopback(q, harq_out,
> + harq_in_length, harq_in_offset,
> + harq_out_offset);
> + dec->harq_combined_output.length = harq_in_length;
> + } else {
> + rte_bbdev_log(ERR, "OP flag Err!");
> + ret = -1;
> + }
> +
> + /* Set descriptor for dequeue. */
> + desc->agx100_dec_req.done = 1;
> + desc->agx100_dec_req.error_code = 0;
> + desc->agx100_dec_req.error_msg = 0;
> + desc->agx100_dec_req.op_addr = op;
> + desc->agx100_dec_req.cbs_in_op = 1;
> +
> + /* Mark this dummy descriptor to be dropped by HW. */
> + desc->agx100_dec_req.desc_idx = (ring_offset + 1) & q->sw_ring_wrap_mask;
> +
> + return ret; /* Error or number of CB. */
> + }
> +
> + if (m_in == NULL || m_out == NULL) {
> + rte_bbdev_log(ERR, "Invalid mbuf pointer");
> + op->status = 1 << RTE_BBDEV_DATA_ERROR;
> + return -1;
> + }
> +
> + c = 1;
> + e = dec->cb_params.e;
> +
> + if (check_bit(dec->op_flags, RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP))
> + crc24_overlap = 24;
> +
> + sys_cols = (dec->basegraph == 1) ? 22 : 10;
> + K = sys_cols * dec->z_c;
> + parity_offset = K - 2 * dec->z_c;
> +
> + out_length = ((K - crc24_overlap - dec->n_filler) >> 3);
> + in_length = e;
> + seg_total_left = dec->input.length;
> +
> + if (check_bit(dec->op_flags, RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE))
> + harq_in_length = RTE_MIN(dec->harq_combined_input.length, (uint32_t)dec->n_cb);
> +
> + if (check_bit(dec->op_flags, RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
> + k0 = get_k0(dec->n_cb, dec->z_c, dec->basegraph, dec->rv_index);
> + if (k0 > parity_offset)
> + l = k0 + e;
> + else
> + l = k0 + e + dec->n_filler;
> + harq_out_length = RTE_MIN(RTE_MAX(harq_in_length, l), dec->n_cb);
> + dec->harq_combined_output.length = harq_out_length;
> + }
> +
> + mbuf_append(m_out_head, m_out, out_length);
> + harq_in_offset = dec->harq_combined_input.offset;
> + harq_out_offset = dec->harq_combined_output.offset;
> +
> + ret = agx100_dma_desc_ld_fill(op, &desc->agx100_dec_req, m_in, m_out,
> + harq_in_length, in_offset, out_offset, harq_in_offset,
> + harq_out_offset, ring_offset, c);
> +
> + if (unlikely(ret < 0))
> + return ret;
> + /* Update lengths. */
> + seg_total_left -= in_length;
> + op->ldpc_dec.hard_output.length += out_length;
> + if (seg_total_left > 0) {
> + rte_bbdev_log(ERR,
> + "Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
> + seg_total_left, in_length);
> + return -1;
> + }
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> + agx100_print_dma_dec_desc_debug_info(desc);
> +#endif
> +
> + return 1;
> +}
> +
> +static uint16_t
> +fpga_5gnr_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
> + struct rte_bbdev_enc_op **ops, uint16_t num)
> +{
> + uint16_t i, total_enqueued_cbs = 0;
> + int32_t avail;
> + int enqueued_cbs;
> + struct fpga_5gnr_queue *q = q_data->queue_private;
> + union vc_5gnr_dma_desc *vc_5gnr_desc;
> + union agx100_dma_desc *agx100_desc;
> + struct fpga_5gnr_fec_device *d = q->d;
> +
> + /* Check if queue is not full */
> + if (unlikely(((q->tail + 1) & q->sw_ring_wrap_mask) == q->head_free_desc))
> + return 0;
> +
> + /* Calculates available space */
> + avail = (q->head_free_desc > q->tail) ?
> + q->head_free_desc - q->tail - 1 :
> + q->ring_ctrl_reg.ring_size + q->head_free_desc - q->tail - 1;
> +
> + for (i = 0; i < num; ++i) {
> + /* Check if there is available space for further
> + * processing
> */
> if (unlikely(avail - 1 < 0))
> break;
> @@ -1858,9 +2539,15 @@ fpga_5gnr_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
> /* Set interrupt bit for last CB in enqueued ops. FPGA issues interrupt
> * only when all previous CBs were already processed.
> */
> - vc_5gnr_desc = q->vc_5gnr_ring_addr +
> - ((q->tail + total_enqueued_cbs - 1) & q->sw_ring_wrap_mask);
> - vc_5gnr_desc->vc_5gnr_enc_req.irq_en = q->irq_enable;
> + if (d->fpga_variant == VC_5GNR_FPGA_VARIANT) {
> + vc_5gnr_desc = q->vc_5gnr_ring_addr +
> + ((q->tail + total_enqueued_cbs - 1) & q->sw_ring_wrap_mask);
> + vc_5gnr_desc->vc_5gnr_enc_req.irq_en = q->irq_enable;
> + } else {
> + agx100_desc = q->agx100_ring_addr +
> + ((q->tail + total_enqueued_cbs - 1) & q->sw_ring_wrap_mask);
Could we have a helper for this, like in the acc100 & VRB drivers?
We could add a new union member of type void * for the ring addresses.
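
Something along these lines, as a rough, untested-against-hardware sketch. The
types below are simplified stand-ins, not the driver's real definitions: the
struct name fpga_5gnr_queue_sketch, the helper name
fpga_5gnr_set_last_desc_irq and the fpga_variant field placed directly in the
queue (the patch reads it through q->d) are all hypothetical. It only shows
the void * union member and the index computation being shared in one place.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins for the two descriptor layouts (hypothetical sizes). */
union vc_5gnr_dma_desc {
	struct { uint8_t irq_en; } vc_5gnr_enc_req;
	uint8_t raw[32];
};

union agx100_dma_desc {
	struct { uint8_t int_en; } agx100_enc_req;
	uint8_t raw[64];
};

enum fpga_variant_sketch { VC_5GNR_FPGA_VARIANT, AGX100_FPGA_VARIANT };

/* Hypothetical, trimmed-down queue: only the fields the helper needs. */
struct fpga_5gnr_queue_sketch {
	union {
		void *ring_addr; /* generic view of the ring base address */
		union vc_5gnr_dma_desc *vc_5gnr_ring_addr;
		union agx100_dma_desc *agx100_ring_addr;
	};
	uint32_t tail;
	uint32_t sw_ring_wrap_mask;
	uint8_t irq_enable;
	enum fpga_variant_sketch fpga_variant;
};

/* Set the interrupt bit on the last descriptor of an enqueued burst. */
static inline void
fpga_5gnr_set_last_desc_irq(struct fpga_5gnr_queue_sketch *q, uint16_t num_enqueued)
{
	uint32_t idx;

	/* The caller is expected to have enqueued at least one descriptor. */
	assert(num_enqueued > 0);
	idx = (q->tail + num_enqueued - 1) & q->sw_ring_wrap_mask;

	if (q->fpga_variant == VC_5GNR_FPGA_VARIANT)
		q->vc_5gnr_ring_addr[idx].vc_5gnr_enc_req.irq_en = q->irq_enable;
	else
		q->agx100_ring_addr[idx].agx100_enc_req.int_en = q->irq_enable;
}
```

Both the encode and decode enqueue paths could then call the same helper
instead of duplicating the variant check.
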
> + agx100_desc->agx100_enc_req.int_en = q->irq_enable;
> + }
>
> fpga_5gnr_dma_enqueue(q, total_enqueued_cbs, &q_data->queue_stats);
>
> @@ -1880,6 +2567,8 @@ fpga_5gnr_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
> int enqueued_cbs;
> struct fpga_5gnr_queue *q = q_data->queue_private;
> union vc_5gnr_dma_desc *vc_5gnr_desc;
> + union agx100_dma_desc *agx100_desc;
> + struct fpga_5gnr_fec_device *d = q->d;
>
> /* Check if queue is not full */
> if (unlikely(((q->tail + 1) & q->sw_ring_wrap_mask) == q->head_free_desc))
> @@ -1898,8 +2587,13 @@ fpga_5gnr_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
> if (unlikely(avail - 1 < 0))
> break;
> avail -= 1;
> - enqueued_cbs = vc_5gnr_enqueue_ldpc_dec_one_op_cb(q, ops[i],
> - total_enqueued_cbs);
> + if (q->d->fpga_variant == VC_5GNR_FPGA_VARIANT) {
> + enqueued_cbs = vc_5gnr_enqueue_ldpc_dec_one_op_cb(q, ops[i],
> + total_enqueued_cbs);
> + } else {
> + enqueued_cbs = agx100_enqueue_ldpc_dec_one_op_cb(q, ops[i],
> + total_enqueued_cbs);
> + }
>
> if (enqueued_cbs < 0)
> break;
> @@ -1918,9 +2612,16 @@ fpga_5gnr_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
> /* Set interrupt bit for last CB in enqueued ops. FPGA issues interrupt
> * only when all previous CBs were already processed.
> */
> - vc_5gnr_desc = q->vc_5gnr_ring_addr +
> - ((q->tail + total_enqueued_cbs - 1) & q->sw_ring_wrap_mask);
> - vc_5gnr_desc->vc_5gnr_enc_req.irq_en = q->irq_enable;
> + if (d->fpga_variant == VC_5GNR_FPGA_VARIANT) {
> + vc_5gnr_desc = q->vc_5gnr_ring_addr +
> + ((q->tail + total_enqueued_cbs - 1) & q->sw_ring_wrap_mask);
> + vc_5gnr_desc->vc_5gnr_enc_req.irq_en = q->irq_enable;
> + } else {
> + agx100_desc = q->agx100_ring_addr +
> + ((q->tail + total_enqueued_cbs - 1) & q->sw_ring_wrap_mask);
> + agx100_desc->agx100_enc_req.int_en = q->irq_enable;
> + }
> +
> fpga_5gnr_dma_enqueue(q, total_enqueued_cbs, &q_data->queue_stats);
> return i;
> }
> @@ -1955,6 +2656,36 @@ vc_5gnr_dequeue_ldpc_enc_one_op_cb(struct fpga_5gnr_queue *q, struct rte_bbdev_e
> return 1;
> }
>
> +static inline int
> +agx100_dequeue_ldpc_enc_one_op_cb(struct fpga_5gnr_queue *q, struct rte_bbdev_enc_op **op,
> + uint16_t desc_offset)
> +{
> + union agx100_dma_desc *desc;
> + int desc_error;
> +
> + /* Set current desc. */
> + desc = q->agx100_ring_addr + ((q->head_free_desc + desc_offset) & q->sw_ring_wrap_mask);
> + /*check if done */
> + if (desc->agx100_enc_req.done == 0)
> + return -1;
> +
> + /* make sure the response is read atomically. */
> + rte_smp_rmb();
> +
> + rte_bbdev_log_debug("DMA response desc %p", desc);
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> + agx100_print_dma_enc_desc_debug_info(desc);
> +#endif
> + *op = desc->agx100_enc_req.op_addr;
> + /* Check the descriptor error field, return 1 on error. */
> + desc_error = agx100_check_desc_error(desc->agx100_enc_req.error_code,
> + desc->agx100_enc_req.error_msg);
> +
> + (*op)->status = desc_error << RTE_BBDEV_DATA_ERROR;
> +
> + return 1;
> +}
>
> static inline int
> vc_5gnr_dequeue_ldpc_dec_one_op_cb(struct fpga_5gnr_queue *q, struct rte_bbdev_dec_op **op,
> @@ -2003,6 +2734,52 @@ vc_5gnr_dequeue_ldpc_dec_one_op_cb(struct fpga_5gnr_queue *q, struct rte_bbdev_d
> return 1;
> }
>
> +static inline int
> +agx100_dequeue_ldpc_dec_one_op_cb(struct fpga_5gnr_queue *q, struct rte_bbdev_dec_op **op,
> + uint16_t desc_offset)
> +{
> + union agx100_dma_desc *desc;
> + int desc_error;
> +
> + /* Set descriptor. */
> + desc = q->agx100_ring_addr +
> + ((q->head_free_desc + desc_offset) & q->sw_ring_wrap_mask);
> + /* Verify done bit is set. */
> + if (desc->agx100_dec_req.done == 0)
> + return -1;
> +
> + /* make sure the response is read atomically. */
> + rte_smp_rmb();
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> + agx100_print_dma_dec_desc_debug_info(desc);
> +#endif
> +
> + *op = desc->agx100_dec_req.op_addr;
> +
> + if (check_bit((*op)->ldpc_dec.op_flags, RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK)) {
> + (*op)->status = 0;
> + return 1;
> + }
> +
> + /* FPGA reports iterations based on round-up minus 1. */
> + (*op)->ldpc_dec.iter_count = desc->agx100_dec_req.max_iter_ret + 1;
> +
> + /* CRC Check criteria. */
> + if (desc->agx100_dec_req.crc24b_ind && !(desc->agx100_dec_req.cb_crc_all_pass))
> + (*op)->status = 1 << RTE_BBDEV_CRC_ERROR;
> +
> + /* et_pass = 0 when decoder fails. */
> + (*op)->status |= !(desc->agx100_dec_req.cb_all_et_pass) << RTE_BBDEV_SYNDROME_ERROR;
> +
> + /* Check the descriptor error field, return 1 on error. */
> + desc_error = agx100_check_desc_error(desc->agx100_dec_req.error_code,
> + desc->agx100_dec_req.error_msg);
> +
> + (*op)->status |= desc_error << RTE_BBDEV_DATA_ERROR;
> + return 1;
> +}
> +
> static uint16_t
> fpga_5gnr_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
> struct rte_bbdev_enc_op **ops, uint16_t num)
> @@ -2014,7 +2791,10 @@ fpga_5gnr_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
> int ret;
>
> for (i = 0; (i < num) && (dequeued_cbs < avail); ++i) {
> - ret = vc_5gnr_dequeue_ldpc_enc_one_op_cb(q, &ops[i], dequeued_cbs);
> + if (q->d->fpga_variant == VC_5GNR_FPGA_VARIANT)
> + ret = vc_5gnr_dequeue_ldpc_enc_one_op_cb(q, &ops[i], dequeued_cbs);
> + else
> + ret = agx100_dequeue_ldpc_enc_one_op_cb(q, &ops[i], dequeued_cbs);
>
> if (ret < 0)
> break;
> @@ -2046,7 +2826,10 @@ fpga_5gnr_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
> int ret;
>
> for (i = 0; (i < num) && (dequeued_cbs < avail); ++i) {
> - ret = vc_5gnr_dequeue_ldpc_dec_one_op_cb(q, &ops[i], dequeued_cbs);
> + if (q->d->fpga_variant == VC_5GNR_FPGA_VARIANT)
> + ret = vc_5gnr_dequeue_ldpc_dec_one_op_cb(q, &ops[i], dequeued_cbs);
> + else
> + ret = agx100_dequeue_ldpc_dec_one_op_cb(q, &ops[i], dequeued_cbs);
>
> if (ret < 0)
> break;
> @@ -2079,10 +2862,29 @@ fpga_5gnr_fec_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
> dev->dequeue_ldpc_enc_ops = fpga_5gnr_dequeue_ldpc_enc;
> dev->dequeue_ldpc_dec_ops = fpga_5gnr_dequeue_ldpc_dec;
>
> - ((struct fpga_5gnr_fec_device *) dev->data->dev_private)->pf_device =
> - !strcmp(drv->driver.name, RTE_STR(FPGA_5GNR_FEC_PF_DRIVER_NAME));
> - ((struct fpga_5gnr_fec_device *) dev->data->dev_private)->mmio_base =
> - pci_dev->mem_resource[0].addr;
> + /* Device variant specific handling. */
> + if ((pci_dev->id.device_id == AGX100_PF_DEVICE_ID) ||
> + (pci_dev->id.device_id == AGX100_VF_DEVICE_ID)) {
> + ((struct fpga_5gnr_fec_device *) dev->data->dev_private)->fpga_variant =
> + AGX100_FPGA_VARIANT;
> + ((struct fpga_5gnr_fec_device *) dev->data->dev_private)->pf_device =
> + !strcmp(drv->driver.name, RTE_STR(FPGA_5GNR_FEC_PF_DRIVER_NAME));
> + ((struct fpga_5gnr_fec_device *) dev->data->dev_private)->mmio_base =
> + pci_dev->mem_resource[0].addr;
> + /* Maximum number of queues possible for this device. */
> + ((struct fpga_5gnr_fec_device *) dev->data->dev_private)->total_num_queues =
> + fpga_5gnr_reg_read_32(pci_dev->mem_resource[0].addr,
> + FPGA_5GNR_FEC_VERSION_ID) >> 24;
> + } else {
> + ((struct fpga_5gnr_fec_device *) dev->data->dev_private)->fpga_variant =
> + VC_5GNR_FPGA_VARIANT;
> + ((struct fpga_5gnr_fec_device *) dev->data->dev_private)->pf_device =
> + !strcmp(drv->driver.name, RTE_STR(FPGA_5GNR_FEC_PF_DRIVER_NAME));
> + ((struct fpga_5gnr_fec_device *) dev->data->dev_private)->mmio_base =
> + pci_dev->mem_resource[0].addr;
> + ((struct fpga_5gnr_fec_device *) dev->data->dev_private)->total_num_queues =
> + VC_5GNR_TOTAL_NUM_QUEUES;
> + }
>
> rte_bbdev_log_debug(
> "Init device %s [%s] @ virtaddr %p phyaddr %#"PRIx64,
> @@ -2097,6 +2899,7 @@ fpga_5gnr_fec_probe(struct rte_pci_driver *pci_drv,
> {
> struct rte_bbdev *bbdev = NULL;
> char dev_name[RTE_BBDEV_NAME_MAX_LEN];
> + struct fpga_5gnr_fec_device *d;
>
> if (pci_dev == NULL) {
> rte_bbdev_log(ERR, "NULL PCI device");
> @@ -2135,15 +2938,24 @@ fpga_5gnr_fec_probe(struct rte_pci_driver *pci_drv,
> rte_bbdev_log_debug("bbdev id = %u [%s]",
> bbdev->data->dev_id, dev_name);
>
> - struct fpga_5gnr_fec_device *d = bbdev->data->dev_private;
> - uint32_t version_id = fpga_5gnr_reg_read_32(d->mmio_base, FPGA_5GNR_FEC_VERSION_ID);
> - rte_bbdev_log(INFO, "Vista Creek FPGA RTL v%u.%u",
> - ((uint16_t)(version_id >> 16)), ((uint16_t)version_id));
> + d = bbdev->data->dev_private;
> + if (d->fpga_variant == VC_5GNR_FPGA_VARIANT) {
> + uint32_t version_id = fpga_5gnr_reg_read_32(d->mmio_base, FPGA_5GNR_FEC_VERSION_ID);
> + rte_bbdev_log(INFO, "Vista Creek FPGA RTL v%u.%u",
> + ((uint16_t)(version_id >> 16)), ((uint16_t)version_id));
> + } else {
> + uint32_t version_num_queues = fpga_5gnr_reg_read_32(d->mmio_base,
> + FPGA_5GNR_FEC_VERSION_ID);
> + uint8_t major_version_id = version_num_queues >> 16;
> + uint8_t minor_version_id = version_num_queues >> 8;
> + uint8_t patch_id = version_num_queues;
> +
> + rte_bbdev_log(INFO, "AGX100 RTL v%u.%u.%u",
> + major_version_id, minor_version_id, patch_id);
> + }
>
> #ifdef RTE_LIBRTE_BBDEV_DEBUG
> - if (!strcmp(pci_drv->driver.name,
> - RTE_STR(FPGA_5GNR_FEC_PF_DRIVER_NAME)))
> - print_static_reg_debug_info(d->mmio_base);
> + print_static_reg_debug_info(d->mmio_base, d->fpga_variant);
> #endif
> return 0;
> }
> @@ -2242,7 +3054,7 @@ static int vc_5gnr_configure(const char *dev_name, const struct rte_fpga_5gnr_fe
>
> /* Clear all queues registers */
> payload_32 = FPGA_5GNR_INVALID_HW_QUEUE_ID;
> - for (q_id = 0; q_id < VC_5GNR_TOTAL_NUM_QUEUES; ++q_id) {
> + for (q_id = 0; q_id < d->total_num_queues; ++q_id) {
> address = (q_id << 2) + VC_5GNR_QUEUE_MAP;
> fpga_5gnr_reg_write_32(d->mmio_base, address, payload_32);
> }
> @@ -2303,7 +3115,7 @@ static int vc_5gnr_configure(const char *dev_name, const struct rte_fpga_5gnr_fe
> */
> if (conf->pf_mode_en) {
> payload_32 = 0x1;
> - for (q_id = 0; q_id < VC_5GNR_TOTAL_NUM_QUEUES; ++q_id) {
> + for (q_id = 0; q_id < d->total_num_queues; ++q_id) {
> address = (q_id << 2) + VC_5GNR_QUEUE_MAP;
> fpga_5gnr_reg_write_32(d->mmio_base, address, payload_32);
> }
> @@ -2321,11 +3133,11 @@ static int vc_5gnr_configure(const char *dev_name, const struct rte_fpga_5gnr_fe
> */
> if ((total_ul_q_id > VC_5GNR_NUM_UL_QUEUES) ||
> (total_dl_q_id > VC_5GNR_NUM_DL_QUEUES) ||
> - (total_q_id > VC_5GNR_TOTAL_NUM_QUEUES)) {
> + (total_q_id > d->total_num_queues)) {
> rte_bbdev_log(ERR,
> "VC 5GNR FPGA Configuration failed. Too many queues to configure: UL_Q %u, DL_Q %u, FPGA_Q %u",
> total_ul_q_id, total_dl_q_id,
> - VC_5GNR_TOTAL_NUM_QUEUES);
> + d->total_num_queues);
> return -EINVAL;
> }
> total_ul_q_id = 0;
> @@ -2369,7 +3181,169 @@ static int vc_5gnr_configure(const char *dev_name, const struct rte_fpga_5gnr_fe
> rte_bbdev_log_debug("PF Vista Creek 5GNR FPGA configuration complete for %s", dev_name);
>
> #ifdef RTE_LIBRTE_BBDEV_DEBUG
> - print_static_reg_debug_info(d->mmio_base);
> + print_static_reg_debug_info(d->mmio_base, d->fpga_variant);
> +#endif
> + return 0;
> +}
> +
> +/* Initial configuration of AGX100 device. */
> +static int agx100_configure(const char *dev_name, const struct rte_fpga_5gnr_fec_conf *conf)
> +{
> + uint32_t payload_32, address;
> + uint16_t payload_16;
> + uint8_t payload_8;
> + uint16_t q_id, vf_id, total_q_id, total_ul_q_id, total_dl_q_id;
> + struct rte_bbdev *bbdev = rte_bbdev_get_named_dev(dev_name);
> + struct rte_fpga_5gnr_fec_conf def_conf;
> +
> + if (bbdev == NULL) {
> + rte_bbdev_log(ERR,
> + "Invalid dev_name (%s), or device is not yet initialised",
> + dev_name);
> + return -ENODEV;
> + }
> +
> + struct fpga_5gnr_fec_device *d = bbdev->data->dev_private;
> +
> + if (conf == NULL) {
> + rte_bbdev_log(ERR, "AGX100 Configuration was not provided.");
> + rte_bbdev_log(ERR, "Default configuration will be loaded.");
> + fpga_5gnr_set_default_conf(&def_conf);
> + conf = &def_conf;
> + }
> +
> + uint8_t total_num_queues = d->total_num_queues;
> + uint8_t num_ul_queues = total_num_queues >> 1;
> + uint8_t num_dl_queues = total_num_queues >> 1;
> +
> + /* Clear all queues registers */
> + payload_32 = FPGA_5GNR_INVALID_HW_QUEUE_ID;
> + for (q_id = 0; q_id < total_num_queues; ++q_id) {
> + address = (q_id << 2) + AGX100_QUEUE_MAP;
> + fpga_5gnr_reg_write_32(d->mmio_base, address, payload_32);
> + }
> +
> + /*
> + * If PF mode is enabled allocate all queues for PF only.
> + *
> + * For VF mode each VF can have different number of UL and DL queues.
> + * Total number of queues to configure cannot exceed AGX100
> + * capabilities - 64 queues - 32 queues for UL and 32 queues for DL.
> + * Queues mapping is done according to configuration:
> + *
> + * UL queues:
> + * | Q_ID | VF_ID |
> + * | 0 | 0 |
> + * | ... | 0 |
> + * | conf->vf_dl_queues_number[0] - 1 | 0 |
> + * | conf->vf_dl_queues_number[0] | 1 |
> + * | ... | 1 |
> + * | conf->vf_dl_queues_number[1] - 1 | 1 |
> + * | ... | ... |
> + * | conf->vf_dl_queues_number[7] - 1 | 7 |
> + *
> + * DL queues:
> + * | Q_ID | VF_ID |
> + * | 32 | 0 |
> + * | ... | 0 |
> + * | conf->vf_ul_queues_number[0] - 1 | 0 |
> + * | conf->vf_ul_queues_number[0] | 1 |
> + * | ... | 1 |
> + * | conf->vf_ul_queues_number[1] - 1 | 1 |
> + * | ... | ... |
> + * | conf->vf_ul_queues_number[7] - 1 | 7 |
> + *
> + * Example of configuration:
> + * conf->vf_ul_queues_number[0] = 4; -> 4 UL queues for VF0
> + * conf->vf_dl_queues_number[0] = 4; -> 4 DL queues for VF0
> + * conf->vf_ul_queues_number[1] = 2; -> 2 UL queues for VF1
> + * conf->vf_dl_queues_number[1] = 2; -> 2 DL queues for VF1
> + *
> + * UL:
> + * | Q_ID | VF_ID |
> + * | 0 | 0 |
> + * | 1 | 0 |
> + * | 2 | 0 |
> + * | 3 | 0 |
> + * | 4 | 1 |
> + * | 5 | 1 |
> + *
> + * DL:
> + * | Q_ID | VF_ID |
> + * | 32 | 0 |
> + * | 33 | 0 |
> + * | 34 | 0 |
> + * | 35 | 0 |
> + * | 36 | 1 |
> + * | 37 | 1 |
> + */
> + if (conf->pf_mode_en) {
> + payload_32 = 0x1;
> + for (q_id = 0; q_id < total_num_queues; ++q_id) {
> + address = (q_id << 2) + AGX100_QUEUE_MAP;
> + fpga_5gnr_reg_write_32(d->mmio_base, address, payload_32);
> + }
> + } else {
> + /* Calculate total number of UL and DL queues to configure. */
> + total_ul_q_id = total_dl_q_id = 0;
> + for (vf_id = 0; vf_id < FPGA_5GNR_FEC_NUM_VFS; ++vf_id) {
> + total_ul_q_id += conf->vf_ul_queues_number[vf_id];
> + total_dl_q_id += conf->vf_dl_queues_number[vf_id];
> + }
> + total_q_id = total_dl_q_id + total_ul_q_id;
> + /*
> + * Check if total number of queues to configure does not exceed
> + * AGX100 capabilities (64 queues - 32 UL and 32 DL queues)
> + */
> + if ((total_ul_q_id > num_ul_queues) ||
> + (total_dl_q_id > num_dl_queues) ||
> + (total_q_id > total_num_queues)) {
> + rte_bbdev_log(ERR,
> + "AGX100 Configuration failed. Too many queues to configure: UL_Q %u, DL_Q %u, AGX100_Q %u",
> + total_ul_q_id, total_dl_q_id,
> + total_num_queues);
> + return -EINVAL;
> + }
> + total_ul_q_id = 0;
> + for (vf_id = 0; vf_id < FPGA_5GNR_FEC_NUM_VFS; ++vf_id) {
> + for (q_id = 0; q_id < conf->vf_ul_queues_number[vf_id];
> + ++q_id, ++total_ul_q_id) {
> + address = (total_ul_q_id << 2) + AGX100_QUEUE_MAP;
> + payload_32 = ((0x80 + vf_id) << 16) | 0x1;
> + fpga_5gnr_reg_write_32(d->mmio_base, address, payload_32);
> + }
> + }
> + total_dl_q_id = 0;
> + for (vf_id = 0; vf_id < FPGA_5GNR_FEC_NUM_VFS; ++vf_id) {
> + for (q_id = 0; q_id < conf->vf_dl_queues_number[vf_id];
> + ++q_id, ++total_dl_q_id) {
> + address = ((total_dl_q_id + num_ul_queues)
> + << 2) + AGX100_QUEUE_MAP;
> + payload_32 = ((0x80 + vf_id) << 16) | 0x1;
> + fpga_5gnr_reg_write_32(d->mmio_base, address, payload_32);
> + }
> + }
> + }
> +
> + /* Setting Load Balance Factor. */
> + payload_16 = (conf->dl_load_balance << 8) | (conf->ul_load_balance);
> + address = FPGA_5GNR_FEC_LOAD_BALANCE_FACTOR;
> + fpga_5gnr_reg_write_16(d->mmio_base, address, payload_16);
> +
> + /* Setting length of ring descriptor entry. */
> + payload_16 = FPGA_5GNR_RING_DESC_ENTRY_LENGTH;
> + address = FPGA_5GNR_FEC_RING_DESC_LEN;
> + fpga_5gnr_reg_write_16(d->mmio_base, address, payload_16);
> +
> + /* Queue PF/VF mapping table is ready. */
> + payload_8 = 0x1;
> + address = FPGA_5GNR_FEC_QUEUE_PF_VF_MAP_DONE;
> + fpga_5gnr_reg_write_8(d->mmio_base, address, payload_8);
> +
> + rte_bbdev_log_debug("PF AGX100 configuration complete for %s", dev_name);
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> + print_static_reg_debug_info(d->mmio_base, d->fpga_variant);
> #endif
> return 0;
> }
> @@ -2386,6 +3360,8 @@ int rte_fpga_5gnr_fec_configure(const char *dev_name, const struct rte_fpga_5gnr
> printf("Configure dev id %x\n", pci_dev->id.device_id);
> if (pci_dev->id.device_id == VC_5GNR_PF_DEVICE_ID)
> return vc_5gnr_configure(dev_name, conf);
> + else if (pci_dev->id.device_id == AGX100_PF_DEVICE_ID)
> + return agx100_configure(dev_name, conf);
>
> rte_bbdev_log(ERR, "Invalid device_id (%d)", pci_dev->id.device_id);
> return -ENODEV;
> @@ -2393,6 +3369,9 @@ int rte_fpga_5gnr_fec_configure(const char *dev_name, const struct rte_fpga_5gnr
>
> /* FPGA 5GNR FEC PCI PF address map */
> static struct rte_pci_id pci_id_fpga_5gnr_fec_pf_map[] = {
> + {
> + RTE_PCI_DEVICE(AGX100_VENDOR_ID, AGX100_PF_DEVICE_ID)
> + },
> {
> RTE_PCI_DEVICE(VC_5GNR_VENDOR_ID, VC_5GNR_PF_DEVICE_ID)
> },
> @@ -2408,6 +3387,9 @@ static struct rte_pci_driver fpga_5gnr_fec_pci_pf_driver = {
>
> /* FPGA 5GNR FEC PCI VF address map */
> static struct rte_pci_id pci_id_fpga_5gnr_fec_vf_map[] = {
> + {
> + RTE_PCI_DEVICE(AGX100_VENDOR_ID, AGX100_VF_DEVICE_ID)
> + },
> {
> RTE_PCI_DEVICE(VC_5GNR_VENDOR_ID, VC_5GNR_VF_DEVICE_ID)
> },
Thread overview: 8+ messages
2023-09-18 16:31 [PATCH v3 0/4] changes for 23.11 Hernan Vargas
2023-09-18 16:31 ` [PATCH v3 1/4] baseband/fpga_5gnr_fec: renaming for consistency Hernan Vargas
2023-09-18 16:31 ` [PATCH v3 2/4] baseband/fpga_5gnr_fec: add Vista Creek variant Hernan Vargas
2023-10-17 12:55 ` Maxime Coquelin
2023-09-18 16:31 ` [PATCH v3 3/4] baseband/fpga_5gnr_fec: add AGX100 support Hernan Vargas
2023-10-17 12:48 ` Maxime Coquelin [this message]
2023-09-18 16:31 ` [PATCH v3 4/4] baseband/fpga_5gnr_fec: cosmetic comment changes Hernan Vargas
2023-10-17 12:50 ` Maxime Coquelin