From mboxrd@z Thu Jan 1 00:00:00 1970 From: Hernan Vargas To: dev@dpdk.org, maxime.coquelin@redhat.com, gakhil@marvell.com, trix@redhat.com Cc: nicolas.chautru@intel.com, qi.z.zhang@intel.com, Hernan Vargas Subject: [PATCH v1 5/6] baseband/fpga_5gnr_fec: add AGX100 support Date: Tue, 23 May 2023 11:48:17 -0700
Message-Id: <20230523184818.139353-6-hernan.vargas@intel.com> X-Mailer: git-send-email 2.37.1 In-Reply-To: <20230523184818.139353-1-hernan.vargas@intel.com> References: <20230523184818.139353-1-hernan.vargas@intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Add support for new FPGA variant AGX100 (on Arrow Creek N6000). Signed-off-by: Hernan Vargas --- doc/guides/bbdevs/fpga_5gnr_fec.rst | 72 +- drivers/baseband/fpga_5gnr_fec/agx100_pmd.h | 273 ++++ .../baseband/fpga_5gnr_fec/fpga_5gnr_fec.h | 6 + .../fpga_5gnr_fec/rte_fpga_5gnr_fec.c | 1197 +++++++++++++++-- 4 files changed, 1395 insertions(+), 153 deletions(-) create mode 100644 drivers/baseband/fpga_5gnr_fec/agx100_pmd.h diff --git a/doc/guides/bbdevs/fpga_5gnr_fec.rst b/doc/guides/bbdevs/fpga_5gnr_fec.rst index 9d71585e9e18..c27db695a834 100644 --- a/doc/guides/bbdevs/fpga_5gnr_fec.rst +++ b/doc/guides/bbdevs/fpga_5gnr_fec.rst @@ -6,12 +6,13 @@ Intel(R) FPGA 5GNR FEC Poll Mode Driver The BBDEV FPGA 5GNR FEC poll mode driver (PMD) supports an FPGA implementation of a VRAN LDPC Encode / Decode 5GNR wireless acceleration function, using Intel's PCI-e and FPGA -based Vista Creek device. +based Vista Creek (N3000, referred to as VC_5GNR in the code) as well as Arrow Creek (N6000, +referred to as AGX100 in the code). Features -------- -FPGA 5GNR FEC PMD supports the following features: +FPGA 5GNR FEC PMD supports the following BBDEV capabilities: - LDPC Encode in the DL - LDPC Decode in the UL @@ -67,10 +68,18 @@ Initialization When the device first powers up, its PCI Physical Functions (PF) can be listed through this command: +Vista Creek (N3000) + .. code-block:: console sudo lspci -vd8086:0d8f +Arrow Creek (N6000) + +.. 
code-block:: console + + sudo lspci -vd8086:5799 + The physical and virtual functions are compatible with Linux UIO drivers: ``vfio`` and ``igb_uio``. However, in order to work the FPGA 5GNR FEC device firstly needs to be bound to one of these linux drivers through DPDK. @@ -85,24 +94,34 @@ Install the DPDK igb_uio driver, bind it with the PF PCI device ID and use The igb_uio driver may be bound to the PF PCI device using one of two methods: -1. PCI functions (physical or virtual, depending on the use case) can be bound to -the UIO driver by repeating this command for every function. +1. PCI functions (physical or virtual, depending on the use case) can be bound to the UIO driver by repeating this command for every function. -.. code-block:: console + .. code-block:: console + + insmod igb_uio.ko + + Bind N3000 to igb_uio + + .. code-block:: console - insmod igb_uio.ko - echo "8086 0d8f" > /sys/bus/pci/drivers/igb_uio/new_id - lspci -vd8086:0d8f + echo "8086 0d8f" > /sys/bus/pci/drivers/igb_uio/new_id + lspci -vd8086:0d8f + Bind N6000 to igb_uio + + .. code-block:: console + + echo "8086 5799" > /sys/bus/pci/drivers/igb_uio/new_id + lspci -vd8086:5799 2. Another way to bind PF with DPDK UIO driver is by using the ``dpdk-devbind.py`` tool -.. code-block:: console + .. code-block:: console - cd - ./usertools/dpdk-devbind.py -b igb_uio 0000:06:00.0 + cd + ./usertools/dpdk-devbind.py -b igb_uio 0000:06:00.0 -where the PCI device ID (example: 0000:06:00.0) is obtained using lspci -vd8086:0d8f +where the PCI address (example: 0000:06:00.0) is obtained using ``lspci -vd8086:0d8f`` for N3000 or ``lspci -vd8086:5799`` for N6000. In the same way the FPGA 5GNR FEC PF can be bound with vfio, but vfio driver does not @@ -165,7 +184,6 @@ parameters defined in ``rte_fpga_5gnr_fec_conf`` structure: uint8_t dl_bandwidth; uint8_t ul_load_balance; uint8_t dl_load_balance; - uint16_t flr_time_out; }; - ``pf_mode_en``: identifies whether only PF is to be used, or the VFs.
PF and @@ -176,12 +194,12 @@ parameters defined in ``rte_fpga_5gnr_fec_conf`` structure: - ``vf_*l_queues_number``: defines the hardware queue mapping for every VF. -- ``*l_bandwidth``: in case of congestion on PCIe interface. The device - allocates different bandwidth to UL and DL. The weight is configured by this - setting. The unit of weight is 3 code blocks. For example, if the code block - cbps (code block per second) ratio between UL and DL is 12:1, then the - configuration value should be set to 36:3. The schedule algorithm is based - on code block regardless the length of each block. +- ``*l_bandwidth``: Only used by the Vista Creek scheduling algorithm in case of + congestion on the PCIe interface. The device allocates different bandwidth to UL + and DL. The weight is configured by this setting. The unit of weight is 3 code + blocks. For example, if the code block cbps (code block per second) ratio between + UL and DL is 12:1, then the configuration value should be set to 36:3. + The scheduling algorithm counts code blocks regardless of the length of each block. - ``*l_load_balance``: hardware queues are load-balanced in a round-robin fashion. Queues get filled first-in first-out until they reach a pre-defined @@ -191,10 +209,6 @@ parameters defined in ``rte_fpga_5gnr_fec_conf`` structure: If all hardware queues exceeds the watermark, no code blocks will be streamed in from UL/DL code block FIFO. -- ``flr_time_out``: specifies how many 16.384us to be FLR time out. The - time_out = flr_time_out x 16.384us. For instance, if you want to set 10ms for - the FLR time out then set this setting to 0x262=610.
- An example configuration code calling the function ``rte_fpga_5gnr_fec_configure()`` is shown below: @@ -219,7 +233,7 @@ below: /* setup FPGA PF */ ret = rte_fpga_5gnr_fec_configure(info->dev_name, &conf); TEST_ASSERT_SUCCESS(ret, - "Failed to configure 4G FPGA PF for bbdev %s", + "Failed to configure 5GNR FPGA PF for bbdev %s", info->dev_name); @@ -263,7 +277,6 @@ are defined in test_bbdev_perf.c as: - DL_BANDWIDTH 3 - UL_LOAD_BALANCE 128 - DL_LOAD_BALANCE 128 -- FLR_TIMEOUT 610 Test Vectors @@ -287,7 +300,16 @@ See for more details: https://github.com/intel/pf-bb-config Specifically for the BBDEV FPGA 5GNR FEC PMD, the command below can be used: +Vista Creek (N3000) + .. code-block:: console ./pf_bb_config FPGA_5GNR -c fpga_5gnr/fpga_5gnr_config_vf.cfg ./test-bbdev.py -e="-c 0xff0 -a${VF_PCI_ADDR}" -c validation -n 64 -b 32 -l 1 -v ./ldpc_dec_default.data + +Arrow Creek (N6000) + +.. code-block:: console + + ./pf_bb_config AGX100 -c agx100/agx100_config_1vf.cfg + ./test-bbdev.py -e="-c 0xff0 -a${VF_PCI_ADDR}" -c validation -n 64 -b 32 -l 1 -v ./ldpc_dec_default.data diff --git a/drivers/baseband/fpga_5gnr_fec/agx100_pmd.h b/drivers/baseband/fpga_5gnr_fec/agx100_pmd.h new file mode 100644 index 000000000000..8013571402c8 --- /dev/null +++ b/drivers/baseband/fpga_5gnr_fec/agx100_pmd.h @@ -0,0 +1,273 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2022 Intel Corporation + */ + +#ifndef _AGX100_H_ +#define _AGX100_H_ + +#include +#include + +/* AGX100 PCI vendor & device IDs. */ +#define AGX100_VENDOR_ID (0x8086) +#define AGX100_PF_DEVICE_ID (0x5799) +#define AGX100_VF_DEVICE_ID (0x579A) + +/* Maximum number of possible queues supported on device. */ +#define AGX100_MAXIMUM_QUEUES_SUPPORTED (64) + +/* AGX100 Ring size is in 512 bits (64 bytes) units. */ +#define AGX100_RING_DESC_LEN_UNIT_BYTES (64) + +/* Align DMA descriptors to 256 bytes - cache-aligned. */ +#define AGX100_RING_DESC_ENTRY_LENGTH (8) + +/* AGX100 Register mapping on BAR0.
*/ +enum { + AGX100_FLR_TIME_OUT = 0x0000000E, /* len: 2B. */ + AGX100_QUEUE_MAP = 0x00000100 /* len: 256B. */ +}; + +/* AGX100 DESCRIPTOR ERROR. */ +enum { + AGX100_DESC_ERR_NO_ERR = 0x00, /**< 4'b0000 2'b00. */ + AGX100_DESC_ERR_E_NOT_LEGAL = 0x11, /**< 4'b0001 2'b01. */ + AGX100_DESC_ERR_K_P_OUT_OF_RANGE = 0x21, /**< 4'b0010 2'b01. */ + AGX100_DESC_ERR_NCB_OUT_OF_RANGE = 0x31, /**< 4'b0011 2'b01. */ + AGX100_DESC_ERR_Z_C_NOT_LEGAL = 0x41, /**< 4'b0100 2'b01. */ + AGX100_DESC_ERR_DESC_INDEX_ERR = 0x03, /**< 4'b0000 2'b11. */ + AGX100_DESC_ERR_HARQ_INPUT_LEN_A = 0x51, /**< 4'b0101 2'b01. */ + AGX100_DESC_ERR_HARQ_INPUT_LEN_B = 0x61, /**< 4'b0110 2'b01. */ + AGX100_DESC_ERR_HBSTORE_OFFSET_ERR = 0x71, /**< 4'b0111 2'b01. */ + AGX100_DESC_ERR_TB_CBG_ERR = 0x81, /**< 4'b1000 2'b01. */ + AGX100_DESC_ERR_CBG_OUT_OF_RANGE = 0x91, /**< 4'b1001 2'b01. */ + AGX100_DESC_ERR_CW_RM_NOT_LEGAL = 0xA1, /**< 4'b1010 2'b01. */ + AGX100_DESC_ERR_UNSUPPORTED_REQ = 0x12, /**< 4'b0001 2'b10. */ + AGX100_DESC_ERR_RESERVED = 0x22, /**< 4'b0010 2'b10. */ + AGX100_DESC_ERR_DESC_ABORT = 0x42, /**< 4'b0100 2'b10. */ + AGX100_DESC_ERR_DESC_READ_TLP_POISONED = 0x82 /**< 4'b1000 2'b10. */ +}; + +/* AGX100 TX Slice Descriptor. */ +struct __rte_packed agx100_input_slice_desc { + uint32_t input_start_addr_lo; + uint32_t input_start_addr_hi; + uint32_t input_slice_length:21, + rsrvd0:9, + end_of_pkt:1, + start_of_pkt:1; + uint32_t input_slice_time_stamp:31, + input_c:1; +}; + +/* AGX100 RX Slice Descriptor. */ +struct __rte_packed agx100_output_slice_desc { + uint32_t output_start_addr_lo; + uint32_t output_start_addr_hi; + uint32_t output_slice_length:21, + rsrvd0:9, + end_of_pkt:1, + start_of_pkt:1; + uint32_t output_slice_time_stamp:31, + output_c:1; +}; + +/* AGX100 DL DMA Encoding Request Descriptor. */ +struct __rte_packed agx100_dma_enc_desc { + uint32_t done:1, /**< 0: not completed 1: completed.
*/ + rsrvd0:17, + error_msg:2, + error_code:4, + rsrvd1:8; + uint32_t ncb:16, /**< Limited circular buffer size. */ + bg_idx:1, /**< Base Graph 0: BG1 1: BG2.*/ + qm_idx:3, /**< 0: BPSK; 1: QPSK; 2: 16QAM; 3: 64QAM; 4: 256QAM. */ + zc:9, /**< Lifting size. */ + rv:2, /**< Redundancy version number. */ + int_en:1; /**< Interrupt enable. */ + uint32_t max_cbg:4, /**< Only valid when workload is TB or CBGs. */ + rsrvd2:4, + cbgti:8, /**< CBG bitmap. */ + rsrvd3:4, + cbgs:1, /**< 0: TB or CB 1: CBGs. */ + desc_idx:11; /**< Sequence number of the descriptor. */ + uint32_t ca:10, /**< Code block number with Ea in TB or CBG. */ + c:10, /**< Total code block number in TB or CBG. */ + rsrvd4:2, + num_null:10; /**< Number of null bits. */ + uint32_t ea:21, /**< Value of E when workload is CB. */ + rsrvd5:11; + uint32_t eb:21, /**< Only valid when workload is TB or CBGs. */ + rsrvd6:11; + uint32_t k_:16, /**< Code block length without null bits. */ + rsrvd7:8, + en_slice_ts:1, /**< Enable slice descriptor timestamp. */ + en_host_ts:1, /**< Enable host descriptor timestamp. */ + en_cb_wr_status:1, /**< Enable code block write back status. */ + en_output_sg:1, /**< Enable RX scatter-gather. */ + en_input_sg:1, /**< Enable TX scatter-gather. */ + tb_cb:1, /**< 2'b10: the descriptor is for a TrBlk. + * 2'b00: the descriptor is for a CBlk. + * 2'b11 or 01: the descriptor is for a CBGs. + */ + crc_en:1, /**< 1: CB CRC enabled 0: CB CRC disabled. + * Only valid when workload is CB or CBGs. + */ + rsrvd8:1; + uint32_t rsrvd9; + union { + uint32_t input_slice_table_addr_lo; /** #include +#include "agx100_pmd.h" #include "vc_5gnr_pmd.h" /* Helper macro for logging */ @@ -131,12 +132,17 @@ struct fpga_5gnr_fec_device { uint64_t q_assigned_bit_map; /** True if this is a PF FPGA 5GNR device. */ bool pf_device; + /** Maximum number of possible queues for this device */ + uint8_t total_num_queues; + /** FPGA Variant.
VC_5GNR_FPGA_VARIANT = 0; AGX100_FPGA_VARIANT = 1 */ + uint8_t fpga_variant; }; /** Structure associated with each queue. */ struct __rte_cache_aligned fpga_5gnr_queue { struct fpga_5gnr_ring_ctrl_reg ring_ctrl_reg; /**< Ring Control Register */ union vc_5gnr_dma_desc *vc_5gnr_ring_addr; /**< Virtual address of VC 5GNR software ring. */ + union agx100_dma_desc *agx100_ring_addr; /**< Virtual address of AGX100 software ring */ uint64_t *ring_head_addr; /* Virtual address of completion_head */ uint64_t shadow_completion_head; /* Shadow completion head value */ uint16_t head_free_desc; /* Ring head */ diff --git a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c index 979028405902..a2ce859f5d4b 100644 --- a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c +++ b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c @@ -18,8 +18,8 @@ #include #include -#include "fpga_5gnr_fec.h" #include "rte_pmd_fpga_5gnr_fec.h" +#include "fpga_5gnr_fec.h" #ifdef RTE_LIBRTE_BBDEV_DEBUG RTE_LOG_REGISTER_DEFAULT(fpga_5gnr_fec_logtype, DEBUG); @@ -71,9 +71,11 @@ print_ring_reg_debug_info(void *mmio_base, uint32_t offset) /* Read Static Register of Vista Creek device. 
*/ static inline void -print_static_reg_debug_info(void *mmio_base) +print_static_reg_debug_info(void *mmio_base, uint8_t fpga_variant) { - uint16_t config = fpga_5gnr_reg_read_16(mmio_base, VC_5GNR_CONFIGURATION); + uint16_t config = 0; + if (fpga_variant == VC_5GNR_FPGA_VARIANT) + config = fpga_5gnr_reg_read_16(mmio_base, VC_5GNR_CONFIGURATION); uint8_t qmap_done = fpga_5gnr_reg_read_8(mmio_base, FPGA_5GNR_FEC_QUEUE_PF_VF_MAP_DONE); uint16_t lb_factor = fpga_5gnr_reg_read_16(mmio_base, @@ -81,14 +83,19 @@ print_static_reg_debug_info(void *mmio_base) uint16_t ring_desc_len = fpga_5gnr_reg_read_16(mmio_base, FPGA_5GNR_FEC_RING_DESC_LEN); - rte_bbdev_log_debug("UL.DL Weights = %u.%u", - ((uint8_t)config), ((uint8_t)(config >> 8))); + if (fpga_variant == VC_5GNR_FPGA_VARIANT) + rte_bbdev_log_debug("UL.DL Weights = %u.%u", + ((uint8_t)config), ((uint8_t)(config >> 8))); rte_bbdev_log_debug("UL.DL Load Balance = %u.%u", ((uint8_t)lb_factor), ((uint8_t)(lb_factor >> 8))); rte_bbdev_log_debug("Queue-PF/VF Mapping Table = %s", (qmap_done > 0) ? "READY" : "NOT-READY"); - rte_bbdev_log_debug("Ring Descriptor Size = %u bytes", - ring_desc_len*VC_5GNR_RING_DESC_LEN_UNIT_BYTES); + if (fpga_variant == VC_5GNR_FPGA_VARIANT) + rte_bbdev_log_debug("Ring Descriptor Size = %u bytes", + ring_desc_len*VC_5GNR_RING_DESC_LEN_UNIT_BYTES); + else + rte_bbdev_log_debug("Ring Descriptor Size = %u bytes", + ring_desc_len*AGX100_RING_DESC_LEN_UNIT_BYTES); } /* Print decode DMA Descriptor of Vista Creek Decoder device.
*/ @@ -142,6 +149,108 @@ vc_5gnr_print_dma_dec_desc_debug_info(union vc_5gnr_dma_desc *desc) word[4], word[5], word[6], word[7]); } +/* Print decode DMA Descriptor of AGX100 Decoder device */ +static void +agx100_print_dma_dec_desc_debug_info(union agx100_dma_desc *desc) +{ + rte_bbdev_log_debug("DMA response desc %p\n" + "\t-- done(%"PRIu32") | tb_crc_pass(%"PRIu32") | cb_crc_all_pass(%"PRIu32")" + " | cb_all_et_pass(%"PRIu32") | max_iter_ret(%"PRIu32") |" + "cgb_crc_bitmap(%"PRIu32") | error_msg(%"PRIu32") | error_code(%"PRIu32") |" + "et_dis (%"PRIu32") | harq_in_en(%"PRIu32") | max_iter(%"PRIu32")\n" + "\t-- ncb(%"PRIu32") | bg_idx (%"PRIu32") | qm_idx (%"PRIu32")" + "| zc(%"PRIu32") | rv(%"PRIu32") | int_en(%"PRIu32")\n" + "\t-- max_cbg(%"PRIu32") | cbgti(%"PRIu32") | cbgfi(%"PRIu32") |" + "cbgs(%"PRIu32") | desc_idx(%"PRIu32")\n" + "\t-- ca(%"PRIu32") | c(%"PRIu32") | llr_pckg(%"PRIu32") |" + "syndrome_check_mode(%"PRIu32") | num_null(%"PRIu32")\n" + "\t-- ea(%"PRIu32") | eba(%"PRIu32")\n" + "\t-- hbstore_offset_out(%"PRIu32")\n" + "\t-- hbstore_offset_in(%"PRIu32") | en_slice_ts(%"PRIu32") |" + "en_host_ts(%"PRIu32") | en_cb_wr_status(%"PRIu32")" + " | en_output_sg(%"PRIu32") | en_input_sg(%"PRIu32") | tb_cb(%"PRIu32")" + " | crc24b_ind(%"PRIu32")| drop_crc24b(%"PRIu32")\n" + "\t-- harq_input_length_a(%"PRIu32") | harq_input_length_b(%"PRIu32")\n" + "\t-- input_slice_table_addr_lo(%"PRIu32")" + " | input_start_addr_lo(%"PRIu32")\n" + "\t-- input_slice_table_addr_hi(%"PRIu32")" + " | input_start_addr_hi(%"PRIu32")\n" + "\t-- input_slice_num(%"PRIu32") | input_length(%"PRIu32")\n" + "\t-- output_slice_table_addr_lo(%"PRIu32")" + " | output_start_addr_lo(%"PRIu32")\n" + "\t-- output_slice_table_addr_hi(%"PRIu32")" + " | output_start_addr_hi(%"PRIu32")\n" + "\t-- output_slice_num(%"PRIu32") | output_length(%"PRIu32")\n" + "\t-- enqueue_timestamp(%"PRIu32")\n" + "\t-- completion_timestamp(%"PRIu32")\n", + desc, + (uint32_t)desc->agx100_dec_req.done, + 
(uint32_t)desc->agx100_dec_req.tb_crc_pass, + (uint32_t)desc->agx100_dec_req.cb_crc_all_pass, + (uint32_t)desc->agx100_dec_req.cb_all_et_pass, + (uint32_t)desc->agx100_dec_req.max_iter_ret, + (uint32_t)desc->agx100_dec_req.cgb_crc_bitmap, + (uint32_t)desc->agx100_dec_req.error_msg, + (uint32_t)desc->agx100_dec_req.error_code, + (uint32_t)desc->agx100_dec_req.et_dis, + (uint32_t)desc->agx100_dec_req.harq_in_en, + (uint32_t)desc->agx100_dec_req.max_iter, + (uint32_t)desc->agx100_dec_req.ncb, + (uint32_t)desc->agx100_dec_req.bg_idx, + (uint32_t)desc->agx100_dec_req.qm_idx, + (uint32_t)desc->agx100_dec_req.zc, + (uint32_t)desc->agx100_dec_req.rv, + (uint32_t)desc->agx100_dec_req.int_en, + (uint32_t)desc->agx100_dec_req.max_cbg, + (uint32_t)desc->agx100_dec_req.cbgti, + (uint32_t)desc->agx100_dec_req.cbgfi, + (uint32_t)desc->agx100_dec_req.cbgs, + (uint32_t)desc->agx100_dec_req.desc_idx, + (uint32_t)desc->agx100_dec_req.ca, + (uint32_t)desc->agx100_dec_req.c, + (uint32_t)desc->agx100_dec_req.llr_pckg, + (uint32_t)desc->agx100_dec_req.syndrome_check_mode, + (uint32_t)desc->agx100_dec_req.num_null, + (uint32_t)desc->agx100_dec_req.ea, + (uint32_t)desc->agx100_dec_req.eba, + (uint32_t)desc->agx100_dec_req.hbstore_offset_out, + (uint32_t)desc->agx100_dec_req.hbstore_offset_in, + (uint32_t)desc->agx100_dec_req.en_slice_ts, + (uint32_t)desc->agx100_dec_req.en_host_ts, + (uint32_t)desc->agx100_dec_req.en_cb_wr_status, + (uint32_t)desc->agx100_dec_req.en_output_sg, + (uint32_t)desc->agx100_dec_req.en_input_sg, + (uint32_t)desc->agx100_dec_req.tb_cb, + (uint32_t)desc->agx100_dec_req.crc24b_ind, + (uint32_t)desc->agx100_dec_req.drop_crc24b, + (uint32_t)desc->agx100_dec_req.harq_input_length_a, + (uint32_t)desc->agx100_dec_req.harq_input_length_b, + (uint32_t)desc->agx100_dec_req.input_slice_table_addr_lo, + (uint32_t)desc->agx100_dec_req.input_start_addr_lo, + (uint32_t)desc->agx100_dec_req.input_slice_table_addr_hi, + (uint32_t)desc->agx100_dec_req.input_start_addr_hi, + 
(uint32_t)desc->agx100_dec_req.input_slice_num, + (uint32_t)desc->agx100_dec_req.input_length, + (uint32_t)desc->agx100_dec_req.output_slice_table_addr_lo, + (uint32_t)desc->agx100_dec_req.output_start_addr_lo, + (uint32_t)desc->agx100_dec_req.output_slice_table_addr_hi, + (uint32_t)desc->agx100_dec_req.output_start_addr_hi, + (uint32_t)desc->agx100_dec_req.output_slice_num, + (uint32_t)desc->agx100_dec_req.output_length, + (uint32_t)desc->agx100_dec_req.enqueue_timestamp, + (uint32_t)desc->agx100_dec_req.completion_timestamp); + + uint32_t *word = (uint32_t *) desc; + rte_bbdev_log_debug("%08"PRIx32"\n%08"PRIx32"\n%08"PRIx32"\n%08"PRIx32"\n" + "%08"PRIx32"\n%08"PRIx32"\n%08"PRIx32"\n%08"PRIx32"\n" + "%08"PRIx32"\n%08"PRIx32"\n%08"PRIx32"\n%08"PRIx32"\n" + "%08"PRIx32"\n%08"PRIx32"\n%08"PRIx32"\n%08"PRIx32"\n", + word[0], word[1], word[2], word[3], + word[4], word[5], word[6], word[7], + word[8], word[9], word[10], word[11], + word[12], word[13], word[14], word[15]); +} + /* Print decode DMA Descriptor of Vista Creek encoder device */ static void vc_5gnr_print_dma_enc_desc_debug_info(union vc_5gnr_dma_desc *desc) @@ -175,6 +284,87 @@ vc_5gnr_print_dma_enc_desc_debug_info(union vc_5gnr_dma_desc *desc) word[4], word[5], word[6], word[7]); } +/* Print encode DMA Descriptor of AGX100 encoder device */ +static void +agx100_print_dma_enc_desc_debug_info(union agx100_dma_desc *desc) +{ + rte_bbdev_log_debug("DMA response desc %p\n" + "\t-- done(%"PRIu32") | error_msg(%"PRIu32") | error_code(%"PRIu32")\n" + "\t-- ncb(%"PRIu32") | bg_idx (%"PRIu32") | qm_idx (%"PRIu32")" + "| zc(%"PRIu32") | rv(%"PRIu32") | int_en(%"PRIu32")\n" + "\t-- max_cbg(%"PRIu32") | cbgti(%"PRIu32") | cbgs(%"PRIu32") | " + "desc_idx(%"PRIu32")\n" + "\t-- ca(%"PRIu32") | c(%"PRIu32") | num_null(%"PRIu32")\n" + "\t-- ea(%"PRIu32")\n" + "\t-- eb(%"PRIu32")\n" + "\t-- k_(%"PRIu32") | en_slice_ts(%"PRIu32") | en_host_ts(%"PRIu32") | " + "en_cb_wr_status(%"PRIu32") | en_output_sg(%"PRIu32") | " +
"en_input_sg(%"PRIu32") | tb_cb(%"PRIu32") | crc_en(%"PRIu32")\n" + "\t-- input_slice_table_addr_lo(%"PRIu32")" + " | input_start_addr_lo(%"PRIu32")\n" + "\t-- input_slice_table_addr_hi(%"PRIu32")" + " | input_start_addr_hi(%"PRIu32")\n" + "\t-- input_slice_num(%"PRIu32") | input_length(%"PRIu32")\n" + "\t-- output_slice_table_addr_lo(%"PRIu32")" + " | output_start_addr_lo(%"PRIu32")\n" + "\t-- output_slice_table_addr_hi(%"PRIu32")" + " | output_start_addr_hi(%"PRIu32")\n" + "\t-- output_slice_num(%"PRIu32") | output_length(%"PRIu32")\n" + "\t-- enqueue_timestamp(%"PRIu32")\n" + "\t-- completion_timestamp(%"PRIu32")\n", + desc, + (uint32_t)desc->agx100_enc_req.done, + (uint32_t)desc->agx100_enc_req.error_msg, + (uint32_t)desc->agx100_enc_req.error_code, + (uint32_t)desc->agx100_enc_req.ncb, + (uint32_t)desc->agx100_enc_req.bg_idx, + (uint32_t)desc->agx100_enc_req.qm_idx, + (uint32_t)desc->agx100_enc_req.zc, + (uint32_t)desc->agx100_enc_req.rv, + (uint32_t)desc->agx100_enc_req.int_en, + (uint32_t)desc->agx100_enc_req.max_cbg, + (uint32_t)desc->agx100_enc_req.cbgti, + (uint32_t)desc->agx100_enc_req.cbgs, + (uint32_t)desc->agx100_enc_req.desc_idx, + (uint32_t)desc->agx100_enc_req.ca, + (uint32_t)desc->agx100_enc_req.c, + (uint32_t)desc->agx100_enc_req.num_null, + (uint32_t)desc->agx100_enc_req.ea, + (uint32_t)desc->agx100_enc_req.eb, + (uint32_t)desc->agx100_enc_req.k_, + (uint32_t)desc->agx100_enc_req.en_slice_ts, + (uint32_t)desc->agx100_enc_req.en_host_ts, + (uint32_t)desc->agx100_enc_req.en_cb_wr_status, + (uint32_t)desc->agx100_enc_req.en_output_sg, + (uint32_t)desc->agx100_enc_req.en_input_sg, + (uint32_t)desc->agx100_enc_req.tb_cb, + (uint32_t)desc->agx100_enc_req.crc_en, + (uint32_t)desc->agx100_enc_req.input_slice_table_addr_lo, + (uint32_t)desc->agx100_enc_req.input_start_addr_lo, + (uint32_t)desc->agx100_enc_req.input_slice_table_addr_hi, + (uint32_t)desc->agx100_enc_req.input_start_addr_hi, + (uint32_t)desc->agx100_enc_req.input_slice_num, + 
(uint32_t)desc->agx100_enc_req.input_length, + (uint32_t)desc->agx100_enc_req.output_slice_table_addr_lo, + (uint32_t)desc->agx100_enc_req.output_start_addr_lo, + (uint32_t)desc->agx100_enc_req.output_slice_table_addr_hi, + (uint32_t)desc->agx100_enc_req.output_start_addr_hi, + (uint32_t)desc->agx100_enc_req.output_slice_num, + (uint32_t)desc->agx100_enc_req.output_length, + (uint32_t)desc->agx100_enc_req.enqueue_timestamp, + (uint32_t)desc->agx100_enc_req.completion_timestamp); + + uint32_t *word = (uint32_t *) desc; + rte_bbdev_log_debug("%08"PRIx32"\n%08"PRIx32"\n%08"PRIx32"\n%08"PRIx32"\n" + "%08"PRIx32"\n%08"PRIx32"\n%08"PRIx32"\n%08"PRIx32"\n" + "%08"PRIx32"\n%08"PRIx32"\n%08"PRIx32"\n%08"PRIx32"\n" + "%08"PRIx32"\n%08"PRIx32"\n%08"PRIx32"\n%08"PRIx32"\n", + word[0], word[1], word[2], word[3], + word[4], word[5], word[6], word[7], + word[8], word[9], word[10], word[11], + word[12], word[13], word[14], word[15]); +} + #endif static int @@ -198,14 +388,32 @@ fpga_5gnr_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id /* Clear queue registers structure */ memset(&ring_reg, 0, sizeof(struct fpga_5gnr_ring_ctrl_reg)); + if (d->fpga_variant == AGX100_FPGA_VARIANT) { + /* Maximum number of queues possible for this device */ + d->total_num_queues = fpga_5gnr_reg_read_32( + d->mmio_base, + FPGA_5GNR_FEC_VERSION_ID) >> 24; + if (d->total_num_queues > AGX100_MAXIMUM_QUEUES_SUPPORTED) { + rte_bbdev_log(ERR, + "Total number of queues defined is greater than %d! Register value corrupted?\n", + AGX100_MAXIMUM_QUEUES_SUPPORTED); + return -EPERM; + } + } + /* Scan queue map. * If a queue is valid and mapped to a calling PF/VF the read value is * replaced with a queue ID and if it's not then * FPGA_5GNR_INVALID_HW_QUEUE_ID is returned.
*/ - for (q_id = 0; q_id < VC_5GNR_TOTAL_NUM_QUEUES; ++q_id) { - uint32_t hw_q_id = fpga_5gnr_reg_read_32(d->mmio_base, - VC_5GNR_QUEUE_MAP + (q_id << 2)); + for (q_id = 0; q_id < d->total_num_queues; ++q_id) { + uint32_t hw_q_id; + if (d->fpga_variant == VC_5GNR_FPGA_VARIANT) + hw_q_id = fpga_5gnr_reg_read_32(d->mmio_base, + VC_5GNR_QUEUE_MAP + (q_id << 2)); + else + hw_q_id = fpga_5gnr_reg_read_32(d->mmio_base, + AGX100_QUEUE_MAP + (q_id << 2)); rte_bbdev_log_debug("%s: queue ID: %u, registry queue ID: %u", dev->device->name, q_id, hw_q_id); @@ -231,8 +439,10 @@ fpga_5gnr_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id dev->device->name, num_queues, hw_q_num); return -EINVAL; } - - ring_size = FPGA_5GNR_RING_MAX_SIZE * sizeof(struct vc_5gnr_dma_dec_desc); + if (d->fpga_variant == VC_5GNR_FPGA_VARIANT) + ring_size = FPGA_5GNR_RING_MAX_SIZE * sizeof(struct vc_5gnr_dma_dec_desc); + else + ring_size = FPGA_5GNR_RING_MAX_SIZE * sizeof(struct agx100_dma_dec_desc); /* Enforce 32 byte alignment */ RTE_BUILD_BUG_ON((RTE_CACHE_LINE_SIZE % 32) != 0); @@ -357,8 +567,11 @@ fpga_5gnr_dev_info_get(struct rte_bbdev *dev, struct rte_bbdev_driver_info *dev_ dev_info->driver_name = dev->device->driver->name; dev_info->queue_size_lim = FPGA_5GNR_RING_MAX_SIZE; dev_info->hardware_accelerated = true; - dev_info->min_alignment = 64; - dev_info->harq_buffer_size = (harq_buf_size >> 10) + 1; + dev_info->min_alignment = 1; + if (d->fpga_variant == VC_5GNR_FPGA_VARIANT) + dev_info->harq_buffer_size = (harq_buf_size >> 10) + 1; + else + dev_info->harq_buffer_size = harq_buf_size << 10; dev_info->default_queue_conf = default_queue_conf; dev_info->capabilities = bbdev_capabilities; dev_info->cpu_flag_reqs = NULL; @@ -367,9 +580,14 @@ fpga_5gnr_dev_info_get(struct rte_bbdev *dev, struct rte_bbdev_driver_info *dev_ /* Calculates number of queues assigned to device */ dev_info->max_num_queues = 0; - for (q_id = 0; q_id < VC_5GNR_TOTAL_NUM_QUEUES; ++q_id) { - uint32_t 
hw_q_id = fpga_5gnr_reg_read_32(d->mmio_base, - VC_5GNR_QUEUE_MAP + (q_id << 2)); + for (q_id = 0; q_id < d->total_num_queues; ++q_id) { + uint32_t hw_q_id; + if (d->fpga_variant == VC_5GNR_FPGA_VARIANT) + hw_q_id = fpga_5gnr_reg_read_32(d->mmio_base, + VC_5GNR_QUEUE_MAP + (q_id << 2)); + else + hw_q_id = fpga_5gnr_reg_read_32(d->mmio_base, + AGX100_QUEUE_MAP + (q_id << 2)); if (hw_q_id != FPGA_5GNR_INVALID_HW_QUEUE_ID) dev_info->max_num_queues++; } @@ -377,8 +595,8 @@ fpga_5gnr_dev_info_get(struct rte_bbdev *dev, struct rte_bbdev_driver_info *dev_ dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0; dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = 0; dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = 0; - dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = dev_info->max_num_queues / 2; - dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = dev_info->max_num_queues / 2; + dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = dev_info->max_num_queues >> 1; + dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = dev_info->max_num_queues >> 1; dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = 1; dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = 1; } @@ -394,11 +612,11 @@ fpga_5gnr_find_free_queue_idx(struct rte_bbdev *dev, struct fpga_5gnr_fec_device *d = dev->data->dev_private; uint64_t q_idx; uint8_t i = 0; - uint8_t range = VC_5GNR_TOTAL_NUM_QUEUES >> 1; + uint8_t range = d->total_num_queues >> 1; if (conf->op_type == RTE_BBDEV_OP_LDPC_ENC) { - i = VC_5GNR_NUM_DL_QUEUES; - range = VC_5GNR_TOTAL_NUM_QUEUES; + i = d->total_num_queues >> 1; + range = d->total_num_queues; } for (; i < range; ++i) { @@ -445,7 +663,11 @@ fpga_5gnr_queue_setup(struct rte_bbdev *dev, uint16_t queue_id, q->q_idx = q_idx; /* Set ring_base_addr */ - q->vc_5gnr_ring_addr = RTE_PTR_ADD(d->sw_rings, (d->sw_ring_size * queue_id)); + if (d->fpga_variant == VC_5GNR_FPGA_VARIANT) + q->vc_5gnr_ring_addr = RTE_PTR_ADD(d->sw_rings, (d->sw_ring_size * queue_id)); + else + q->agx100_ring_addr = RTE_PTR_ADD(d->sw_rings, (d->sw_ring_size * 
queue_id)); + q->ring_ctrl_reg.ring_base_addr = d->sw_rings_phys + (d->sw_ring_size * queue_id); /* Allocate memory for Completion Head variable*/ @@ -661,7 +883,7 @@ fpga_5gnr_dev_interrupt_handler(void *cb_arg) uint8_t i; /* Scan queue assigned to this device */ - for (i = 0; i < VC_5GNR_TOTAL_NUM_QUEUES; ++i) { + for (i = 0; i < d->total_num_queues; ++i) { q_idx = 1ULL << i; if (d->q_bound_bit_map & q_idx) { queue_id = get_queue_id(dev->data, i); @@ -710,6 +932,13 @@ fpga_5gnr_intr_enable(struct rte_bbdev *dev) { int ret; uint8_t i; + struct fpga_5gnr_fec_device *d = dev->data->dev_private; + uint8_t num_intr_vec; + + if (d->fpga_variant == VC_5GNR_FPGA_VARIANT) + num_intr_vec = VC_5GNR_NUM_INTR_VEC; + else + num_intr_vec = d->total_num_queues - RTE_INTR_VEC_RXTX_OFFSET; if (!rte_intr_cap_multiple(dev->intr_handle)) { rte_bbdev_log(ERR, "Multiple intr vector is not supported by FPGA (%s)", @@ -717,15 +946,15 @@ fpga_5gnr_intr_enable(struct rte_bbdev *dev) return -ENOTSUP; } - /* Create event file descriptors for each of 64 queue. Event fds will be - * mapped to FPGA IRQs in rte_intr_enable(). This is a 1:1 mapping where - * the IRQ number is a direct translation to the queue number. + /* Create event file descriptors for each of the supported queues (Maximum 64). + * Event fds will be mapped to FPGA IRQs in rte_intr_enable(). + * This is a 1:1 mapping where the IRQ number is a direct translation to the queue number. * - * 63 (VC_5GNR_NUM_INTR_VEC) event fds are created as rte_intr_enable() + * num_intr_vec event fds are created as rte_intr_enable() * mapped the first IRQ to already created interrupt event file * descriptor (intr_handle->fd). 
*/ - if (rte_intr_efd_enable(dev->intr_handle, VC_5GNR_NUM_INTR_VEC)) { + if (rte_intr_efd_enable(dev->intr_handle, num_intr_vec)) { rte_bbdev_log(ERR, "Failed to create fds for %u queues", dev->data->num_queues); return -1; } @@ -735,7 +964,7 @@ fpga_5gnr_intr_enable(struct rte_bbdev *dev) * It ensures that callback function assigned to that descriptor will * invoked when any FPGA queue issues interrupt. */ - for (i = 0; i < VC_5GNR_NUM_INTR_VEC; ++i) { + for (i = 0; i < num_intr_vec; ++i) { if (rte_intr_efds_index_set(dev->intr_handle, i, rte_intr_fd_get(dev->intr_handle))) return -rte_errno; @@ -856,6 +1085,72 @@ vc_5gnr_check_desc_error(uint32_t error_code) { return 1; } +/* AGX100 FPGA descriptor errors + * Print an error if a descriptor error has occurred. + * Return 0 on success, 1 on failure + */ +static inline int +agx100_check_desc_error(uint32_t error_code, uint32_t error_msg) { + uint8_t error = error_code << 4 | error_msg; + switch (error) { + case AGX100_DESC_ERR_NO_ERR: + return 0; + case AGX100_DESC_ERR_E_NOT_LEGAL: + rte_bbdev_log(ERR, "Invalid output length of rate matcher E"); + break; + case AGX100_DESC_ERR_K_P_OUT_OF_RANGE: + rte_bbdev_log(ERR, "Encode block size K' is out of range"); + break; + case AGX100_DESC_ERR_NCB_OUT_OF_RANGE: + rte_bbdev_log(ERR, "Ncb circular buffer size is out of range"); + break; + case AGX100_DESC_ERR_Z_C_NOT_LEGAL: + rte_bbdev_log(ERR, "Zc is illegal"); + break; + case AGX100_DESC_ERR_DESC_INDEX_ERR: + rte_bbdev_log(ERR, + "Desc_index received does not meet the expectation in the AGX100" + ); + break; + case AGX100_DESC_ERR_HARQ_INPUT_LEN_A: + rte_bbdev_log(ERR, "HARQ input length A is invalid."); + break; + case AGX100_DESC_ERR_HARQ_INPUT_LEN_B: + rte_bbdev_log(ERR, "HARQ input length B is invalid."); + break; + case AGX100_DESC_ERR_HBSTORE_OFFSET_ERR: + rte_bbdev_log(ERR, "Hbstore exceeds HARQ buffer size."); + break; + case AGX100_DESC_ERR_TB_CBG_ERR: + rte_bbdev_log(ERR, "Total CB number C=0 or CB number with 
Ea Ca=0 or Ca>C."); + break; + case AGX100_DESC_ERR_CBG_OUT_OF_RANGE: + rte_bbdev_log(ERR, "Cbgti or max_cbg is out of range"); + break; + case AGX100_DESC_ERR_CW_RM_NOT_LEGAL: + rte_bbdev_log(ERR, "Cw_rm is illegal"); + break; + case AGX100_DESC_ERR_UNSUPPORTED_REQ: + rte_bbdev_log(ERR, "Unsupported request for descriptor"); + break; + case AGX100_DESC_ERR_RESERVED: + rte_bbdev_log(ERR, "Reserved"); + break; + case AGX100_DESC_ERR_DESC_ABORT: + rte_bbdev_log(ERR, "Completed abort for descriptor"); + break; + case AGX100_DESC_ERR_DESC_READ_TLP_POISONED: + rte_bbdev_log(ERR, "Descriptor read TLP poisoned"); + break; + default: + rte_bbdev_log(ERR, + "Descriptor error unknown error code %u error msg %u", + error_code, error_msg); + break; + } + return 1; +} + /* Compute value of k0. * Based on 3GPP 38.212 Table 5.4.2.1-2 * Starting position of different redundancy versions, k0 @@ -953,6 +1248,88 @@ vc_5gnr_dma_desc_te_fill(struct rte_bbdev_enc_op *op, return 0; } +/** + * AGX100 FPGA + * Set DMA descriptor for encode operation (1 Code Block) + * + * @param op + * Pointer to a single encode operation. + * @param desc + * Pointer to DMA descriptor. + * @param input + * Pointer to input data which will be encoded. + * @param e + * E value (length of output in bits). + * @param ncb + * Ncb value (size of the soft buffer). + * @param out_length + * Length of output buffer. + * @param in_offset + * Input offset in rte_mbuf structure. It is used for calculating the point + * where data is starting. + * @param out_offset + * Output offset in rte_mbuf structure. It is used for calculating the point + * where hard output data will be stored. + * @param cbs_in_op + * Number of CBs contained in one operation. 
+ */ +static inline int +agx100_dma_desc_le_fill(struct rte_bbdev_enc_op *op, + struct agx100_dma_enc_desc *desc, struct rte_mbuf *input, + struct rte_mbuf *output, uint16_t k_, uint32_t e, + uint32_t in_offset, uint32_t out_offset, uint16_t desc_offset, + uint8_t cbs_in_op) +{ + /* reset */ + desc->done = 0; + desc->error_msg = 0; + desc->error_code = 0; + desc->ncb = op->ldpc_enc.n_cb; + desc->bg_idx = op->ldpc_enc.basegraph - 1; + desc->qm_idx = op->ldpc_enc.q_m >> 1; + desc->zc = op->ldpc_enc.z_c; + desc->rv = op->ldpc_enc.rv_index; + desc->int_en = 0; /**< Set by device externally*/ + desc->max_cbg = 0; /*TODO: CBG specific */ + desc->cbgti = 0; /*TODO: CBG specific */ + desc->cbgs = 0; /*TODO: CBG specific */ + desc->desc_idx = desc_offset; + desc->ca = 0; /*TODO: CBG specific */ + desc->c = 0; /*TODO: CBG specific */ + desc->num_null = op->ldpc_enc.n_filler; + desc->ea = e; + desc->eb = e; /*TODO: TB/CBG specific */ + desc->k_ = k_; + desc->en_slice_ts = 0; /*TODO: Slice specific*/ + desc->en_host_ts = 0; /*TODO: Slice specific*/ + desc->en_cb_wr_status = 0; /*TODO: Event Queue specific*/ + desc->en_output_sg = 0; /*TODO: Slice specific*/ + desc->en_input_sg = 0; /*TODO: Slice specific*/ + desc->tb_cb = 0; /*Descriptor for CB. 
TODO: Add TB and CBG logic*/ + desc->crc_en = check_bit(op->ldpc_enc.op_flags, + RTE_BBDEV_LDPC_CRC_24B_ATTACH); + + /* Set inbound/outbound data buffer address */ + /* TODO: add logic for input_slice */ + desc->output_start_addr_hi = (uint32_t)( + rte_pktmbuf_iova_offset(output, out_offset) >> 32); + desc->output_start_addr_lo = (uint32_t)( + rte_pktmbuf_iova_offset(output, out_offset)); + desc->input_start_addr_hi = (uint32_t)( + rte_pktmbuf_iova_offset(input, in_offset) >> 32); + desc->input_start_addr_lo = (uint32_t)( + rte_pktmbuf_iova_offset(input, in_offset)); + desc->output_length = (e + 7) >> 3; /* in bytes */ + desc->input_length = input->data_len; + desc->enqueue_timestamp = 0; + desc->completion_timestamp = 0; + /* Save software context needed for dequeue */ + desc->op_addr = op; + /* Set total number of CBs in an op */ + desc->cbs_in_op = cbs_in_op; + return 0; +} + /** * Vista Creek 5GNR FPGA * Set DMA descriptor for decode operation (1 Code Block) @@ -1021,6 +1398,105 @@ vc_5gnr_dma_desc_ld_fill(struct rte_bbdev_dec_op *op, return 0; } +/** + * AGX100 FPGA + * Set DMA descriptor for decode operation (1 Code Block) + * + * @param op + * Pointer to a single decode operation. + * @param desc + * Pointer to DMA descriptor. + * @param input + * Pointer to input data which will be decoded. + * @param in_offset + * Input offset in rte_mbuf structure. It is used for calculating the point + * where data is starting. + * @param out_offset + * Output offset in rte_mbuf structure. It is used for calculating the point + * where hard output data will be stored. + * @param cbs_in_op + * Number of CBs contained in one operation. 
+ */ +static inline int +agx100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op, + struct agx100_dma_dec_desc *desc, + struct rte_mbuf *input, struct rte_mbuf *output, + uint16_t harq_in_length, + uint32_t in_offset, uint32_t out_offset, + uint32_t harq_in_offset, + uint32_t harq_out_offset, + uint16_t desc_offset, + uint8_t cbs_in_op) +{ + /* reset */ + desc->done = 0; + desc->tb_crc_pass = 0; + desc->cb_crc_all_pass = 0; + desc->cb_all_et_pass = 0; + desc->max_iter_ret = 0; + desc->cgb_crc_bitmap = 0; /*TODO: CBG specific */ + desc->error_msg = 0; + desc->error_code = 0; + desc->et_dis = !check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE); + desc->harq_in_en = check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE); + desc->max_iter = op->ldpc_dec.iter_max; + desc->ncb = op->ldpc_dec.n_cb; + desc->bg_idx = op->ldpc_dec.basegraph - 1; + desc->qm_idx = op->ldpc_dec.q_m >> 1; + desc->zc = op->ldpc_dec.z_c; + desc->rv = op->ldpc_dec.rv_index; + desc->int_en = 0; /**< Set by device externally*/ + desc->max_cbg = 0; /*TODO: CBG specific*/ + desc->cbgti = 0; /*TODO: CBG specific*/ + desc->cbgfi = 0; /*TODO: CBG specific*/ + desc->cbgs = 0; /*TODO: CBG specific*/ + desc->desc_idx = desc_offset; + desc->ca = 0; /*TODO: CBG specific*/ + desc->c = 0; /*TODO: CBG specific*/ + desc->llr_pckg = 0; /*TODO: Not implemented yet*/ + desc->syndrome_check_mode = 1; /*TODO: Make it configurable*/ + desc->num_null = op->ldpc_dec.n_filler; + desc->ea = op->ldpc_dec.cb_params.e; /*TODO: TB/CBG specific*/ + desc->eba = 0; /*TODO: TB/CBG specific*/ + desc->hbstore_offset_out = harq_out_offset >> 10; + desc->hbstore_offset_in = harq_in_offset >> 10; + desc->en_slice_ts = 0; /*TODO: Slice specific*/ + desc->en_host_ts = 0; /*TODO: Slice specific*/ + desc->en_cb_wr_status = 0; /*TODO: Event Queue specific*/ + desc->en_output_sg = 0; /*TODO: Slice specific*/ + desc->en_input_sg = 0; /*TODO: Slice specific*/ + desc->tb_cb = 0; /* Descriptor for CB. 
TODO: Add TB and CBG logic*/ + desc->crc24b_ind = check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK); + desc->drop_crc24b = check_bit(op->ldpc_dec.op_flags, + RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP); + desc->harq_input_length_a = + harq_in_length; /*Descriptor for CB. TODO: Add TB and CBG logic*/ + desc->harq_input_length_b = 0; /*Descriptor for CB. TODO: Add TB and CBG logic*/ + /* Set inbound/outbound data buffer address */ + /* TODO: add logic for input_slice */ + desc->output_start_addr_hi = (uint32_t)( + rte_pktmbuf_iova_offset(output, out_offset) >> 32); + desc->output_start_addr_lo = (uint32_t)( + rte_pktmbuf_iova_offset(output, out_offset)); + desc->input_start_addr_hi = (uint32_t)( + rte_pktmbuf_iova_offset(input, in_offset) >> 32); + desc->input_start_addr_lo = (uint32_t)( + rte_pktmbuf_iova_offset(input, in_offset)); + desc->output_length = (((op->ldpc_dec.basegraph == 1) ? 22 : 10) * op->ldpc_dec.z_c + - op->ldpc_dec.n_filler - desc->drop_crc24b * 24) >> 3; + desc->input_length = op->ldpc_dec.cb_params.e; /*TODO: TB/CBG specific*/ + desc->enqueue_timestamp = 0; + desc->completion_timestamp = 0; + /* Save software context needed for dequeue */ + desc->op_addr = op; + /* Set total number of CBs in an op */ + desc->cbs_in_op = cbs_in_op; + return 0; +} + /* Validates LDPC encoder parameters for VC 5GNR FPGA. */ static inline int vc_5gnr_validate_ldpc_enc_op(struct rte_bbdev_enc_op *op) @@ -1484,27 +1960,35 @@ fpga_5gnr_harq_write_loopback(struct fpga_5gnr_queue *q, uint64_t *input = NULL; uint32_t last_transaction = left_length % FPGA_5GNR_DDR_WR_DATA_LEN_IN_BYTES; uint64_t last_word; + struct fpga_5gnr_fec_device *d = q->d; if (last_transaction > 0) left_length -= last_transaction; - - /* - * Get HARQ buffer size for each VF/PF: When 0x00, there is no - * available DDR space for the corresponding VF/PF. 
- */ - reg_32 = fpga_5gnr_reg_read_32(q->d->mmio_base, FPGA_5GNR_FEC_HARQ_BUF_SIZE_REGS); - if (reg_32 < harq_in_length) { - left_length = reg_32; - rte_bbdev_log(ERR, "HARQ in length > HARQ buffer size\n"); + if (d->fpga_variant == VC_5GNR_FPGA_VARIANT) { + /* + * Get HARQ buffer size for each VF/PF: When 0x00, there is no + * available DDR space for the corresponding VF/PF. + */ + reg_32 = fpga_5gnr_reg_read_32(q->d->mmio_base, FPGA_5GNR_FEC_HARQ_BUF_SIZE_REGS); + if (reg_32 < harq_in_length) { + left_length = reg_32; + rte_bbdev_log(ERR, "HARQ in length > HARQ buffer size\n"); + } } input = (uint64_t *)rte_pktmbuf_mtod_offset(harq_input, uint8_t *, in_offset); while (left_length > 0) { if (fpga_5gnr_reg_read_8(q->d->mmio_base, FPGA_5GNR_FEC_DDR4_ADDR_RDY_REGS) == 1) { - fpga_5gnr_reg_write_32(q->d->mmio_base, - FPGA_5GNR_FEC_DDR4_WR_ADDR_REGS, - out_offset); + if (d->fpga_variant == AGX100_FPGA_VARIANT) { + fpga_5gnr_reg_write_32(q->d->mmio_base, + FPGA_5GNR_FEC_DDR4_WR_ADDR_REGS, + out_offset >> 3); + } else { + fpga_5gnr_reg_write_32(q->d->mmio_base, + FPGA_5GNR_FEC_DDR4_WR_ADDR_REGS, + out_offset); + } fpga_5gnr_reg_write_64(q->d->mmio_base, FPGA_5GNR_FEC_DDR4_WR_DATA_REGS, input[increment]); @@ -1516,12 +2000,17 @@ fpga_5gnr_harq_write_loopback(struct fpga_5gnr_queue *q, } while (last_transaction > 0) { if (fpga_5gnr_reg_read_8(q->d->mmio_base, FPGA_5GNR_FEC_DDR4_ADDR_RDY_REGS) == 1) { - fpga_5gnr_reg_write_32(q->d->mmio_base, - FPGA_5GNR_FEC_DDR4_WR_ADDR_REGS, - out_offset); + if (d->fpga_variant == AGX100_FPGA_VARIANT) { + fpga_5gnr_reg_write_32(q->d->mmio_base, + FPGA_5GNR_FEC_DDR4_WR_ADDR_REGS, + out_offset >> 3); + } else { + fpga_5gnr_reg_write_32(q->d->mmio_base, + FPGA_5GNR_FEC_DDR4_WR_ADDR_REGS, + out_offset); + } last_word = input[increment]; - last_word &= (uint64_t)(1 << (last_transaction * 4)) - - 1; + last_word &= (uint64_t)(1ULL << (last_transaction * 4)) - 1; fpga_5gnr_reg_write_64(q->d->mmio_base, FPGA_5GNR_FEC_DDR4_WR_DATA_REGS, last_word); 
@@ -1544,14 +2033,17 @@ fpga_5gnr_harq_read_loopback(struct fpga_5gnr_queue *q, uint32_t increment = 0; uint64_t *input = NULL; uint32_t last_transaction = harq_in_length % FPGA_5GNR_DDR_WR_DATA_LEN_IN_BYTES; + struct fpga_5gnr_fec_device *d = q->d; if (last_transaction > 0) harq_in_length += (8 - last_transaction); - reg = fpga_5gnr_reg_read_32(q->d->mmio_base, FPGA_5GNR_FEC_HARQ_BUF_SIZE_REGS); - if (reg < harq_in_length) { - harq_in_length = reg; - rte_bbdev_log(ERR, "HARQ in length > HARQ buffer size\n"); + if (d->fpga_variant == VC_5GNR_FPGA_VARIANT) { + reg = fpga_5gnr_reg_read_32(q->d->mmio_base, FPGA_5GNR_FEC_HARQ_BUF_SIZE_REGS); + if (reg < harq_in_length) { + harq_in_length = reg; + rte_bbdev_log(ERR, "HARQ in length > HARQ buffer size\n"); + } } if (!mbuf_append(harq_output, harq_output, harq_in_length)) { @@ -1570,9 +2062,15 @@ fpga_5gnr_harq_read_loopback(struct fpga_5gnr_queue *q, input = (uint64_t *)rte_pktmbuf_mtod_offset(harq_output, uint8_t *, harq_out_offset); while (left_length > 0) { - fpga_5gnr_reg_write_32(q->d->mmio_base, - FPGA_5GNR_FEC_DDR4_RD_ADDR_REGS, - in_offset); + if (d->fpga_variant == AGX100_FPGA_VARIANT) { + fpga_5gnr_reg_write_32(q->d->mmio_base, + FPGA_5GNR_FEC_DDR4_RD_ADDR_REGS, + in_offset >> 3); + } else { + fpga_5gnr_reg_write_32(q->d->mmio_base, + FPGA_5GNR_FEC_DDR4_RD_ADDR_REGS, + in_offset); + } fpga_5gnr_reg_write_8(q->d->mmio_base, FPGA_5GNR_FEC_DDR4_RD_DONE_REGS, 1); reg = fpga_5gnr_reg_read_8(q->d->mmio_base, FPGA_5GNR_FEC_DDR4_RD_RDY_REGS); while (reg != 1) { @@ -1587,7 +2085,10 @@ fpga_5gnr_harq_read_loopback(struct fpga_5gnr_queue *q, left_length -= FPGA_5GNR_DDR_RD_DATA_LEN_IN_BYTES; in_offset += FPGA_5GNR_DDR_WR_DATA_LEN_IN_BYTES; increment++; - fpga_5gnr_reg_write_8(q->d->mmio_base, FPGA_5GNR_FEC_DDR4_RD_DONE_REGS, 0); + if (d->fpga_variant == AGX100_FPGA_VARIANT) + fpga_5gnr_reg_write_8(q->d->mmio_base, FPGA_5GNR_FEC_DDR4_RD_RDY_REGS, 0); + else + fpga_5gnr_reg_write_8(q->d->mmio_base, 
FPGA_5GNR_FEC_DDR4_RD_DONE_REGS, 0); } fpga_5gnr_mutex_free(q); return 1; @@ -1598,6 +2099,7 @@ enqueue_ldpc_enc_one_op_cb(struct fpga_5gnr_queue *q, struct rte_bbdev_enc_op *o uint16_t desc_offset) { union vc_5gnr_dma_desc *vc_5gnr_desc; + union agx100_dma_desc *agx100_desc; int ret; uint8_t c, crc24_bits = 0; struct rte_bbdev_op_ldpc_enc *enc = &op->ldpc_enc; @@ -1610,10 +2112,13 @@ enqueue_ldpc_enc_one_op_cb(struct fpga_5gnr_queue *q, struct rte_bbdev_enc_op *o uint16_t total_left = enc->input.length; uint16_t ring_offset; uint16_t K, k_; + struct fpga_5gnr_fec_device *d = q->d; - if (vc_5gnr_validate_ldpc_enc_op(op) == -1) { - rte_bbdev_log(ERR, "LDPC encoder validation rejected"); - return -EINVAL; + if (d->fpga_variant == VC_5GNR_FPGA_VARIANT) { + if (vc_5gnr_validate_ldpc_enc_op(op) == -1) { + rte_bbdev_log(ERR, "LDPC encoder validation rejected"); + return -EINVAL; + } } /* Clear op status */ @@ -1629,14 +2134,13 @@ enqueue_ldpc_enc_one_op_cb(struct fpga_5gnr_queue *q, struct rte_bbdev_enc_op *o crc24_bits = 24; if (enc->code_block_mode == RTE_BBDEV_TRANSPORT_BLOCK) { - /* For Transport Block mode */ - /* FIXME */ - c = enc->tb_params.c; - e = enc->tb_params.ea; - } else { /* For Code Block mode */ - c = 1; - e = enc->cb_params.e; + /* TODO: For Transport Block mode */ + rte_bbdev_log(ERR, "Transport Block not supported yet"); + return -1; } + /* For Code Block mode */ + c = 1; + e = enc->cb_params.e; /* Update total_left */ K = (enc->basegraph == 1 ? 
22 : 10) * enc->z_c; @@ -1658,10 +2162,19 @@ enqueue_ldpc_enc_one_op_cb(struct fpga_5gnr_queue *q, struct rte_bbdev_enc_op *o /* Offset into the ring */ ring_offset = ((q->tail + desc_offset) & q->sw_ring_wrap_mask); - /* Setup DMA Descriptor */ - vc_5gnr_desc = q->vc_5gnr_ring_addr + ring_offset; - ret = vc_5gnr_dma_desc_te_fill(op, &vc_5gnr_desc->vc_5gnr_enc_req, m_in, m_out, - k_, e, in_offset, out_offset, ring_offset, c); + + if (d->fpga_variant == VC_5GNR_FPGA_VARIANT) { + /* Setup DMA Descriptor */ + vc_5gnr_desc = q->vc_5gnr_ring_addr + ring_offset; + ret = vc_5gnr_dma_desc_te_fill(op, &vc_5gnr_desc->vc_5gnr_enc_req, m_in, m_out, + k_, e, in_offset, out_offset, ring_offset, c); + } else { + /* Setup DMA Descriptor */ + agx100_desc = q->agx100_ring_addr + ring_offset; + ret = agx100_dma_desc_le_fill(op, &agx100_desc->agx100_enc_req, m_in, m_out, + k_, e, in_offset, out_offset, ring_offset, c); + } + if (unlikely(ret < 0)) return ret; @@ -1677,7 +2190,10 @@ enqueue_ldpc_enc_one_op_cb(struct fpga_5gnr_queue *q, struct rte_bbdev_enc_op *o } #ifdef RTE_LIBRTE_BBDEV_DEBUG - vc_5gnr_print_dma_enc_desc_debug_info(vc_5gnr_desc); + if (d->fpga_variant == VC_5GNR_FPGA_VARIANT) + vc_5gnr_print_dma_enc_desc_debug_info(vc_5gnr_desc); + else + agx100_print_dma_enc_desc_debug_info(agx100_desc); #endif return 1; } @@ -1817,50 +2333,180 @@ vc_5gnr_enqueue_ldpc_dec_one_op_cb(struct fpga_5gnr_queue *q, struct rte_bbdev_d return 1; } -static uint16_t -fpga_5gnr_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data, - struct rte_bbdev_enc_op **ops, uint16_t num) +static inline int +agx100_enqueue_ldpc_dec_one_op_cb(struct fpga_5gnr_queue *q, struct rte_bbdev_dec_op *op, + uint16_t desc_offset) { - uint16_t i, total_enqueued_cbs = 0; - int32_t avail; - int enqueued_cbs; - struct fpga_5gnr_queue *q = q_data->queue_private; - union vc_5gnr_dma_desc *vc_5gnr_desc; - - /* Check if queue is not full */ - if (unlikely(((q->tail + 1) & q->sw_ring_wrap_mask) == q->head_free_desc)) - return 
0; - - /* Calculates available space */ - avail = (q->head_free_desc > q->tail) ? - q->head_free_desc - q->tail - 1 : - q->ring_ctrl_reg.ring_size + q->head_free_desc - q->tail - 1; + union agx100_dma_desc *desc; + int ret; + uint16_t ring_offset; + uint8_t c; + uint16_t e, in_length, out_length, k0, l, seg_total_left, sys_cols; + uint16_t K, parity_offset, harq_in_length = 0, harq_out_length = 0; + uint16_t crc24_overlap = 0; + struct rte_bbdev_op_ldpc_dec *dec = &op->ldpc_dec; + struct rte_mbuf *m_in = dec->input.data; + struct rte_mbuf *m_out = dec->hard_output.data; + struct rte_mbuf *m_out_head = dec->hard_output.data; + uint16_t in_offset = dec->input.offset; + uint16_t out_offset = dec->hard_output.offset; + uint32_t harq_in_offset = 0; + uint32_t harq_out_offset = 0; - for (i = 0; i < num; ++i) { - /* Check if there is available space for further - * processing - */ - if (unlikely(avail - 1 < 0)) - break; - avail -= 1; - enqueued_cbs = enqueue_ldpc_enc_one_op_cb(q, ops[i], total_enqueued_cbs); + /* Clear op status */ + op->status = 0; - if (enqueued_cbs < 0) - break; + /* Setup DMA Descriptor */ + ring_offset = ((q->tail + desc_offset) & q->sw_ring_wrap_mask); + desc = q->agx100_ring_addr + ring_offset; - total_enqueued_cbs += enqueued_cbs; + if (check_bit(dec->op_flags, RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK)) { + struct rte_mbuf *harq_in = dec->harq_combined_input.data; + struct rte_mbuf *harq_out = dec->harq_combined_output.data; + harq_in_length = dec->harq_combined_input.length; + uint32_t harq_in_offset = dec->harq_combined_input.offset; + uint32_t harq_out_offset = dec->harq_combined_output.offset; - rte_bbdev_log_debug("enqueuing enc ops [%d/%d] | head %d | tail %d", - total_enqueued_cbs, num, - q->head_free_desc, q->tail); - } + if (check_bit(dec->op_flags, RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE)) { + ret = fpga_5gnr_harq_write_loopback(q, harq_in, + harq_in_length, harq_in_offset, + harq_out_offset); + } else if 
(check_bit(dec->op_flags, + RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE)) { + ret = fpga_5gnr_harq_read_loopback(q, harq_out, + harq_in_length, harq_in_offset, + harq_out_offset); + dec->harq_combined_output.length = harq_in_length; + } else { + rte_bbdev_log(ERR, "OP flag Err!"); + ret = -1; + } + + /* Set descriptor for dequeue */ + desc->agx100_dec_req.done = 1; + desc->agx100_dec_req.error_code = 0; + desc->agx100_dec_req.error_msg = 0; + desc->agx100_dec_req.op_addr = op; + desc->agx100_dec_req.cbs_in_op = 1; + + /* Mark this dummy descriptor to be dropped by HW */ + desc->agx100_dec_req.desc_idx = (ring_offset + 1) & q->sw_ring_wrap_mask; + + return ret; /* Error or number of CB */ + } + + if (m_in == NULL || m_out == NULL) { + rte_bbdev_log(ERR, "Invalid mbuf pointer"); + op->status = 1 << RTE_BBDEV_DATA_ERROR; + return -1; + } + + c = 1; + e = dec->cb_params.e; + + if (check_bit(dec->op_flags, RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP)) + crc24_overlap = 24; + + sys_cols = (dec->basegraph == 1) ? 
22 : 10; + K = sys_cols * dec->z_c; + parity_offset = K - 2 * dec->z_c; + + out_length = ((K - crc24_overlap - dec->n_filler) >> 3); + in_length = e; + seg_total_left = dec->input.length; + + if (check_bit(dec->op_flags, RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) + harq_in_length = RTE_MIN(dec->harq_combined_input.length, (uint32_t)dec->n_cb); + + if (check_bit(dec->op_flags, RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) { + k0 = get_k0(dec->n_cb, dec->z_c, dec->basegraph, dec->rv_index); + if (k0 > parity_offset) + l = k0 + e; + else + l = k0 + e + dec->n_filler; + harq_out_length = RTE_MIN(RTE_MAX(harq_in_length, l), dec->n_cb); + dec->harq_combined_output.length = harq_out_length; + } + + mbuf_append(m_out_head, m_out, out_length); + harq_in_offset = dec->harq_combined_input.offset; + harq_out_offset = dec->harq_combined_output.offset; + + ret = agx100_dma_desc_ld_fill(op, &desc->agx100_dec_req, m_in, m_out, + harq_in_length, in_offset, out_offset, harq_in_offset, + harq_out_offset, ring_offset, c); + + if (unlikely(ret < 0)) + return ret; + /* Update lengths */ + seg_total_left -= in_length; + op->ldpc_dec.hard_output.length += out_length; + if (seg_total_left > 0) { + rte_bbdev_log(ERR, + "Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u", + seg_total_left, in_length); + return -1; + } + +#ifdef RTE_LIBRTE_BBDEV_DEBUG + agx100_print_dma_dec_desc_debug_info(desc); +#endif + + return 1; +} + +static uint16_t +fpga_5gnr_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data, + struct rte_bbdev_enc_op **ops, uint16_t num) +{ + uint16_t i, total_enqueued_cbs = 0; + int32_t avail; + int enqueued_cbs; + struct fpga_5gnr_queue *q = q_data->queue_private; + union vc_5gnr_dma_desc *vc_5gnr_desc; + union agx100_dma_desc *agx100_desc; + struct fpga_5gnr_fec_device *d = q->d; + + /* Check if queue is not full */ + if (unlikely(((q->tail + 1) & q->sw_ring_wrap_mask) == q->head_free_desc)) + return 0; + + /* Calculates available space */ + avail = 
(q->head_free_desc > q->tail) ? + q->head_free_desc - q->tail - 1 : + q->ring_ctrl_reg.ring_size + q->head_free_desc - q->tail - 1; + + for (i = 0; i < num; ++i) { + /* Check if there is available space for further + * processing + */ + if (unlikely(avail - 1 < 0)) + break; + avail -= 1; + enqueued_cbs = enqueue_ldpc_enc_one_op_cb(q, ops[i], total_enqueued_cbs); + + if (enqueued_cbs < 0) + break; + + total_enqueued_cbs += enqueued_cbs; + + rte_bbdev_log_debug("enqueuing enc ops [%d/%d] | head %d | tail %d", + total_enqueued_cbs, num, + q->head_free_desc, q->tail); + } /* Set interrupt bit for last CB in enqueued ops. FPGA issues interrupt * only when all previous CBs were already processed. */ - vc_5gnr_desc = q->vc_5gnr_ring_addr + - ((q->tail + total_enqueued_cbs - 1) & q->sw_ring_wrap_mask); - vc_5gnr_desc->vc_5gnr_enc_req.irq_en = q->irq_enable; + if (d->fpga_variant == VC_5GNR_FPGA_VARIANT) { + vc_5gnr_desc = q->vc_5gnr_ring_addr + + ((q->tail + total_enqueued_cbs - 1) & q->sw_ring_wrap_mask); + vc_5gnr_desc->vc_5gnr_enc_req.irq_en = q->irq_enable; + } else { + agx100_desc = q->agx100_ring_addr + + ((q->tail + total_enqueued_cbs - 1) & q->sw_ring_wrap_mask); + agx100_desc->agx100_enc_req.int_en = q->irq_enable; + } fpga_5gnr_dma_enqueue(q, total_enqueued_cbs, &q_data->queue_stats); @@ -1880,6 +2526,8 @@ fpga_5gnr_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data, int enqueued_cbs; struct fpga_5gnr_queue *q = q_data->queue_private; union vc_5gnr_dma_desc *vc_5gnr_desc; + union agx100_dma_desc *agx100_desc; + struct fpga_5gnr_fec_device *d = q->d; /* Check if queue is not full */ if (unlikely(((q->tail + 1) & q->sw_ring_wrap_mask) == q->head_free_desc)) @@ -1898,8 +2546,13 @@ fpga_5gnr_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data, if (unlikely(avail - 1 < 0)) break; avail -= 1; - enqueued_cbs = vc_5gnr_enqueue_ldpc_dec_one_op_cb(q, ops[i], - total_enqueued_cbs); + if (q->d->fpga_variant == VC_5GNR_FPGA_VARIANT) { + enqueued_cbs = 
vc_5gnr_enqueue_ldpc_dec_one_op_cb(q, ops[i], + total_enqueued_cbs); + } else { + enqueued_cbs = agx100_enqueue_ldpc_dec_one_op_cb(q, ops[i], + total_enqueued_cbs); + } if (enqueued_cbs < 0) break; @@ -1918,9 +2571,16 @@ fpga_5gnr_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data, /* Set interrupt bit for last CB in enqueued ops. FPGA issues interrupt * only when all previous CBs were already processed. */ - vc_5gnr_desc = q->vc_5gnr_ring_addr + - ((q->tail + total_enqueued_cbs - 1) & q->sw_ring_wrap_mask); - vc_5gnr_desc->vc_5gnr_enc_req.irq_en = q->irq_enable; + if (d->fpga_variant == VC_5GNR_FPGA_VARIANT) { + vc_5gnr_desc = q->vc_5gnr_ring_addr + + ((q->tail + total_enqueued_cbs - 1) & q->sw_ring_wrap_mask); + vc_5gnr_desc->vc_5gnr_enc_req.irq_en = q->irq_enable; + } else { + agx100_desc = q->agx100_ring_addr + + ((q->tail + total_enqueued_cbs - 1) & q->sw_ring_wrap_mask); + agx100_desc->agx100_enc_req.int_en = q->irq_enable; + } + fpga_5gnr_dma_enqueue(q, total_enqueued_cbs, &q_data->queue_stats); return i; } @@ -1955,6 +2615,36 @@ vc_5gnr_dequeue_ldpc_enc_one_op_cb(struct fpga_5gnr_queue *q, struct rte_bbdev_e return 1; } +static inline int +agx100_dequeue_ldpc_enc_one_op_cb(struct fpga_5gnr_queue *q, struct rte_bbdev_enc_op **op, + uint16_t desc_offset) +{ + union agx100_dma_desc *desc; + int desc_error; + + /* Set current desc */ + desc = q->agx100_ring_addr + ((q->head_free_desc + desc_offset) & q->sw_ring_wrap_mask); + /*check if done */ + if (desc->agx100_enc_req.done == 0) + return -1; + + /* make sure the response is read atomically */ + rte_smp_rmb(); + + rte_bbdev_log_debug("DMA response desc %p", desc); + +#ifdef RTE_LIBRTE_BBDEV_DEBUG + agx100_print_dma_enc_desc_debug_info(desc); +#endif + *op = desc->agx100_enc_req.op_addr; + /* Check the descriptor error field, return 1 on error */ + desc_error = agx100_check_desc_error(desc->agx100_enc_req.error_code, + desc->agx100_enc_req.error_msg); + + (*op)->status = desc_error << RTE_BBDEV_DATA_ERROR; + 
+ return 1; +} static inline int vc_5gnr_dequeue_ldpc_dec_one_op_cb(struct fpga_5gnr_queue *q, struct rte_bbdev_dec_op **op, @@ -2003,6 +2693,52 @@ vc_5gnr_dequeue_ldpc_dec_one_op_cb(struct fpga_5gnr_queue *q, struct rte_bbdev_d return 1; } +static inline int +agx100_dequeue_ldpc_dec_one_op_cb(struct fpga_5gnr_queue *q, struct rte_bbdev_dec_op **op, + uint16_t desc_offset) +{ + union agx100_dma_desc *desc; + int desc_error; + + /* Set descriptor */ + desc = q->agx100_ring_addr + + ((q->head_free_desc + desc_offset) & q->sw_ring_wrap_mask); + /* Verify done bit is set */ + if (desc->agx100_dec_req.done == 0) + return -1; + + /* make sure the response is read atomically */ + rte_smp_rmb(); + +#ifdef RTE_LIBRTE_BBDEV_DEBUG + agx100_print_dma_dec_desc_debug_info(desc); +#endif + + *op = desc->agx100_dec_req.op_addr; + + if (check_bit((*op)->ldpc_dec.op_flags, RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK)) { + (*op)->status = 0; + return 1; + } + + /* FPGA reports iterations based on round-up minus 1 */ + (*op)->ldpc_dec.iter_count = desc->agx100_dec_req.max_iter_ret + 1; + + /* CRC Check criteria */ + if (desc->agx100_dec_req.crc24b_ind && !(desc->agx100_dec_req.cb_crc_all_pass)) + (*op)->status = 1 << RTE_BBDEV_CRC_ERROR; + + /* et_pass = 0 when decoder fails */ + (*op)->status |= !(desc->agx100_dec_req.cb_all_et_pass) << RTE_BBDEV_SYNDROME_ERROR; + + /* Check the descriptor error field, return 1 on error */ + desc_error = agx100_check_desc_error(desc->agx100_dec_req.error_code, + desc->agx100_dec_req.error_msg); + + (*op)->status |= desc_error << RTE_BBDEV_DATA_ERROR; + return 1; +} + static uint16_t fpga_5gnr_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data, struct rte_bbdev_enc_op **ops, uint16_t num) @@ -2014,7 +2750,10 @@ fpga_5gnr_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data, int ret; for (i = 0; (i < num) && (dequeued_cbs < avail); ++i) { - ret = vc_5gnr_dequeue_ldpc_enc_one_op_cb(q, &ops[i], dequeued_cbs); + if (q->d->fpga_variant == 
VC_5GNR_FPGA_VARIANT) + ret = vc_5gnr_dequeue_ldpc_enc_one_op_cb(q, &ops[i], dequeued_cbs); + else + ret = agx100_dequeue_ldpc_enc_one_op_cb(q, &ops[i], dequeued_cbs); if (ret < 0) break; @@ -2046,7 +2785,10 @@ fpga_5gnr_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data, int ret; for (i = 0; (i < num) && (dequeued_cbs < avail); ++i) { - ret = vc_5gnr_dequeue_ldpc_dec_one_op_cb(q, &ops[i], dequeued_cbs); + if (q->d->fpga_variant == VC_5GNR_FPGA_VARIANT) + ret = vc_5gnr_dequeue_ldpc_dec_one_op_cb(q, &ops[i], dequeued_cbs); + else + ret = agx100_dequeue_ldpc_dec_one_op_cb(q, &ops[i], dequeued_cbs); if (ret < 0) break; @@ -2079,10 +2821,29 @@ fpga_5gnr_fec_init(struct rte_bbdev *dev, struct rte_pci_driver *drv) dev->dequeue_ldpc_enc_ops = fpga_5gnr_dequeue_ldpc_enc; dev->dequeue_ldpc_dec_ops = fpga_5gnr_dequeue_ldpc_dec; - ((struct fpga_5gnr_fec_device *) dev->data->dev_private)->pf_device = - !strcmp(drv->driver.name, RTE_STR(FPGA_5GNR_FEC_PF_DRIVER_NAME)); - ((struct fpga_5gnr_fec_device *) dev->data->dev_private)->mmio_base = - pci_dev->mem_resource[0].addr; + /* Device variant specific handling */ + if ((pci_dev->id.device_id == AGX100_PF_DEVICE_ID) || + (pci_dev->id.device_id == AGX100_VF_DEVICE_ID)) { + ((struct fpga_5gnr_fec_device *) dev->data->dev_private)->fpga_variant = + AGX100_FPGA_VARIANT; + ((struct fpga_5gnr_fec_device *) dev->data->dev_private)->pf_device = + !strcmp(drv->driver.name, RTE_STR(FPGA_5GNR_FEC_PF_DRIVER_NAME)); + ((struct fpga_5gnr_fec_device *) dev->data->dev_private)->mmio_base = + pci_dev->mem_resource[0].addr; + /* Maximum number of queues possible for this device */ + ((struct fpga_5gnr_fec_device *) dev->data->dev_private)->total_num_queues = + fpga_5gnr_reg_read_32(pci_dev->mem_resource[0].addr, + FPGA_5GNR_FEC_VERSION_ID) >> 24; + } else { + ((struct fpga_5gnr_fec_device *) dev->data->dev_private)->fpga_variant = + VC_5GNR_FPGA_VARIANT; + ((struct fpga_5gnr_fec_device *) dev->data->dev_private)->pf_device = + 
!strcmp(drv->driver.name, RTE_STR(FPGA_5GNR_FEC_PF_DRIVER_NAME)); + ((struct fpga_5gnr_fec_device *) dev->data->dev_private)->mmio_base = + pci_dev->mem_resource[0].addr; + ((struct fpga_5gnr_fec_device *) dev->data->dev_private)->total_num_queues = + VC_5GNR_TOTAL_NUM_QUEUES; + } rte_bbdev_log_debug( "Init device %s [%s] @ virtaddr %p phyaddr %#"PRIx64, @@ -2097,6 +2858,7 @@ fpga_5gnr_fec_probe(struct rte_pci_driver *pci_drv, { struct rte_bbdev *bbdev = NULL; char dev_name[RTE_BBDEV_NAME_MAX_LEN]; + struct fpga_5gnr_fec_device *d; if (pci_dev == NULL) { rte_bbdev_log(ERR, "NULL PCI device"); @@ -2135,15 +2897,24 @@ fpga_5gnr_fec_probe(struct rte_pci_driver *pci_drv, rte_bbdev_log_debug("bbdev id = %u [%s]", bbdev->data->dev_id, dev_name); - struct fpga_5gnr_fec_device *d = bbdev->data->dev_private; - uint32_t version_id = fpga_5gnr_reg_read_32(d->mmio_base, FPGA_5GNR_FEC_VERSION_ID); - rte_bbdev_log(INFO, "Vista Creek FPGA RTL v%u.%u", - ((uint16_t)(version_id >> 16)), ((uint16_t)version_id)); + d = bbdev->data->dev_private; + if (d->fpga_variant == VC_5GNR_FPGA_VARIANT) { + uint32_t version_id = fpga_5gnr_reg_read_32(d->mmio_base, FPGA_5GNR_FEC_VERSION_ID); + rte_bbdev_log(INFO, "Vista Creek FPGA RTL v%u.%u", + ((uint16_t)(version_id >> 16)), ((uint16_t)version_id)); + } else { + uint32_t version_num_queues = fpga_5gnr_reg_read_32(d->mmio_base, + FPGA_5GNR_FEC_VERSION_ID); + uint8_t major_version_id = version_num_queues >> 16; + uint8_t minor_version_id = version_num_queues >> 8; + uint8_t patch_id = version_num_queues; + + rte_bbdev_log(INFO, "AGX100 RTL v%u.%u.%u", + major_version_id, minor_version_id, patch_id); + } #ifdef RTE_LIBRTE_BBDEV_DEBUG - if (!strcmp(pci_drv->driver.name, - RTE_STR(FPGA_5GNR_FEC_PF_DRIVER_NAME))) - print_static_reg_debug_info(d->mmio_base); + print_static_reg_debug_info(d->mmio_base, d->fpga_variant); #endif return 0; } @@ -2242,7 +3013,7 @@ static int vc_5gnr_configure(const char *dev_name, const struct rte_fpga_5gnr_fe /* Clear all 
queues registers */ payload_32 = FPGA_5GNR_INVALID_HW_QUEUE_ID; - for (q_id = 0; q_id < VC_5GNR_TOTAL_NUM_QUEUES; ++q_id) { + for (q_id = 0; q_id < d->total_num_queues; ++q_id) { address = (q_id << 2) + VC_5GNR_QUEUE_MAP; fpga_5gnr_reg_write_32(d->mmio_base, address, payload_32); } @@ -2303,7 +3074,7 @@ static int vc_5gnr_configure(const char *dev_name, const struct rte_fpga_5gnr_fe */ if (conf->pf_mode_en) { payload_32 = 0x1; - for (q_id = 0; q_id < VC_5GNR_TOTAL_NUM_QUEUES; ++q_id) { + for (q_id = 0; q_id < d->total_num_queues; ++q_id) { address = (q_id << 2) + VC_5GNR_QUEUE_MAP; fpga_5gnr_reg_write_32(d->mmio_base, address, payload_32); } @@ -2321,11 +3092,11 @@ static int vc_5gnr_configure(const char *dev_name, const struct rte_fpga_5gnr_fe */ if ((total_ul_q_id > VC_5GNR_NUM_UL_QUEUES) || (total_dl_q_id > VC_5GNR_NUM_DL_QUEUES) || - (total_q_id > VC_5GNR_TOTAL_NUM_QUEUES)) { + (total_q_id > d->total_num_queues)) { rte_bbdev_log(ERR, "VC 5GNR FPGA Configuration failed. Too many queues to configure: UL_Q %u, DL_Q %u, FPGA_Q %u", total_ul_q_id, total_dl_q_id, - VC_5GNR_TOTAL_NUM_QUEUES); + d->total_num_queues); return -EINVAL; } total_ul_q_id = 0; @@ -2369,7 +3140,169 @@ static int vc_5gnr_configure(const char *dev_name, const struct rte_fpga_5gnr_fe rte_bbdev_log_debug("PF Vista Creek 5GNR FPGA configuration complete for %s", dev_name); #ifdef RTE_LIBRTE_BBDEV_DEBUG - print_static_reg_debug_info(d->mmio_base); + print_static_reg_debug_info(d->mmio_base, d->fpga_variant); +#endif + return 0; +} + +/* Initial configuration of AGX100 device */ +static int agx100_configure(const char *dev_name, const struct rte_fpga_5gnr_fec_conf *conf) +{ + uint32_t payload_32, address; + uint16_t payload_16; + uint8_t payload_8; + uint16_t q_id, vf_id, total_q_id, total_ul_q_id, total_dl_q_id; + struct rte_bbdev *bbdev = rte_bbdev_get_named_dev(dev_name); + struct rte_fpga_5gnr_fec_conf def_conf; + + if (bbdev == NULL) { + rte_bbdev_log(ERR, + "Invalid dev_name (%s), or device is 
not yet initialised", + dev_name); + return -ENODEV; + } + + struct fpga_5gnr_fec_device *d = bbdev->data->dev_private; + + if (conf == NULL) { + rte_bbdev_log(ERR, "AGX100 Configuration was not provided."); + rte_bbdev_log(ERR, "Default configuration will be loaded."); + fpga_5gnr_set_default_conf(&def_conf); + conf = &def_conf; + } + + uint8_t total_num_queues = d->total_num_queues; + uint8_t num_ul_queues = total_num_queues >> 1; + uint8_t num_dl_queues = total_num_queues >> 1; + + /* Clear all queues registers */ + payload_32 = FPGA_5GNR_INVALID_HW_QUEUE_ID; + for (q_id = 0; q_id < total_num_queues; ++q_id) { + address = (q_id << 2) + AGX100_QUEUE_MAP; + fpga_5gnr_reg_write_32(d->mmio_base, address, payload_32); + } + + /* + * If PF mode is enabled allocate all queues for PF only. + * + * For VF mode each VF can have different number of UL and DL queues. + * Total number of queues to configure cannot exceed AGX100 + * capabilities - 64 queues - 32 queues for UL and 32 queues for DL. + * Queues mapping is done according to configuration: + * + * UL queues: + * | Q_ID | VF_ID | + * | 0 | 0 | + * | ... | 0 | + * | conf->vf_ul_queues_number[0] - 1 | 0 | + * | conf->vf_ul_queues_number[0] | 1 | + * | ... | 1 | + * | conf->vf_ul_queues_number[1] - 1 | 1 | + * | ... | ... | + * | conf->vf_ul_queues_number[7] - 1 | 7 | + * + * DL queues: + * | Q_ID | VF_ID | + * | 32 | 0 | + * | ... | 0 | + * | conf->vf_dl_queues_number[0] - 1 | 0 | + * | conf->vf_dl_queues_number[0] | 1 | + * | ... | 1 | + * | conf->vf_dl_queues_number[1] - 1 | 1 | + * | ... | ... 
| + * | conf->vf_dl_queues_number[7] - 1 | 7 | + * + * Example of configuration: + * conf->vf_ul_queues_number[0] = 4; -> 4 UL queues for VF0 + * conf->vf_dl_queues_number[0] = 4; -> 4 DL queues for VF0 + * conf->vf_ul_queues_number[1] = 2; -> 2 UL queues for VF1 + * conf->vf_dl_queues_number[1] = 2; -> 2 DL queues for VF1 + * + * UL: + * | Q_ID | VF_ID | + * | 0 | 0 | + * | 1 | 0 | + * | 2 | 0 | + * | 3 | 0 | + * | 4 | 1 | + * | 5 | 1 | + * + * DL: + * | Q_ID | VF_ID | + * | 32 | 0 | + * | 33 | 0 | + * | 34 | 0 | + * | 35 | 0 | + * | 36 | 1 | + * | 37 | 1 | + */ + if (conf->pf_mode_en) { + payload_32 = 0x1; + for (q_id = 0; q_id < total_num_queues; ++q_id) { + address = (q_id << 2) + AGX100_QUEUE_MAP; + fpga_5gnr_reg_write_32(d->mmio_base, address, payload_32); + } + } else { + /* Calculate total number of UL and DL queues to configure */ + total_ul_q_id = total_dl_q_id = 0; + for (vf_id = 0; vf_id < FPGA_5GNR_FEC_NUM_VFS; ++vf_id) { + total_ul_q_id += conf->vf_ul_queues_number[vf_id]; + total_dl_q_id += conf->vf_dl_queues_number[vf_id]; + } + total_q_id = total_dl_q_id + total_ul_q_id; + /* + * Check if total number of queues to configure does not exceed + * AGX100 capabilities (64 queues - 32 UL and 32 DL queues) + */ + if ((total_ul_q_id > num_ul_queues) || + (total_dl_q_id > num_dl_queues) || + (total_q_id > total_num_queues)) { + rte_bbdev_log(ERR, + "AGX100 Configuration failed.
Too many queues to configure: UL_Q %u, DL_Q %u, AGX100_Q %u", + total_ul_q_id, total_dl_q_id, + total_num_queues); + return -EINVAL; + } + total_ul_q_id = 0; + for (vf_id = 0; vf_id < FPGA_5GNR_FEC_NUM_VFS; ++vf_id) { + for (q_id = 0; q_id < conf->vf_ul_queues_number[vf_id]; + ++q_id, ++total_ul_q_id) { + address = (total_ul_q_id << 2) + AGX100_QUEUE_MAP; + payload_32 = ((0x80 + vf_id) << 16) | 0x1; + fpga_5gnr_reg_write_32(d->mmio_base, address, payload_32); + } + } + total_dl_q_id = 0; + for (vf_id = 0; vf_id < FPGA_5GNR_FEC_NUM_VFS; ++vf_id) { + for (q_id = 0; q_id < conf->vf_dl_queues_number[vf_id]; + ++q_id, ++total_dl_q_id) { + address = ((total_dl_q_id + num_ul_queues) + << 2) + AGX100_QUEUE_MAP; + payload_32 = ((0x80 + vf_id) << 16) | 0x1; + fpga_5gnr_reg_write_32(d->mmio_base, address, payload_32); + } + } + } + + /* Setting Load Balance Factor */ + payload_16 = (conf->dl_load_balance << 8) | (conf->ul_load_balance); + address = FPGA_5GNR_FEC_LOAD_BALANCE_FACTOR; + fpga_5gnr_reg_write_16(d->mmio_base, address, payload_16); + + /* Setting length of ring descriptor entry */ + payload_16 = FPGA_5GNR_RING_DESC_ENTRY_LENGTH; + address = FPGA_5GNR_FEC_RING_DESC_LEN; + fpga_5gnr_reg_write_16(d->mmio_base, address, payload_16); + + /* Queue PF/VF mapping table is ready */ + payload_8 = 0x1; + address = FPGA_5GNR_FEC_QUEUE_PF_VF_MAP_DONE; + fpga_5gnr_reg_write_8(d->mmio_base, address, payload_8); + + rte_bbdev_log_debug("PF AGX100 configuration complete for %s", dev_name); + +#ifdef RTE_LIBRTE_BBDEV_DEBUG + print_static_reg_debug_info(d->mmio_base, d->fpga_variant); #endif return 0; } @@ -2386,6 +3319,8 @@ int rte_fpga_5gnr_fec_configure(const char *dev_name, const struct rte_fpga_5gnr printf("Configure dev id %x\n", pci_dev->id.device_id); if (pci_dev->id.device_id == VC_5GNR_PF_DEVICE_ID) return vc_5gnr_configure(dev_name, conf); + else if (pci_dev->id.device_id == AGX100_PF_DEVICE_ID) + return agx100_configure(dev_name, conf); rte_bbdev_log(ERR, "Invalid 
device_id (%d)", pci_dev->id.device_id); return -ENODEV; @@ -2393,6 +3328,9 @@ int rte_fpga_5gnr_fec_configure(const char *dev_name, const struct rte_fpga_5gnr /* FPGA 5GNR FEC PCI PF address map */ static struct rte_pci_id pci_id_fpga_5gnr_fec_pf_map[] = { + { + RTE_PCI_DEVICE(AGX100_VENDOR_ID, AGX100_PF_DEVICE_ID) + }, { RTE_PCI_DEVICE(VC_5GNR_VENDOR_ID, VC_5GNR_PF_DEVICE_ID) }, @@ -2408,6 +3346,9 @@ static struct rte_pci_driver fpga_5gnr_fec_pci_pf_driver = { /* FPGA 5GNR FEC PCI VF address map */ static struct rte_pci_id pci_id_fpga_5gnr_fec_vf_map[] = { + { + RTE_PCI_DEVICE(AGX100_VENDOR_ID, AGX100_VF_DEVICE_ID) + }, { RTE_PCI_DEVICE(VC_5GNR_VENDOR_ID, VC_5GNR_VF_DEVICE_ID) }, -- 2.37.1