* [dpdk-dev] [PATCH v3 00/11] bbdev PMD ACC100
@ 2020-08-19 0:25 Nicolas Chautru
2020-08-19 0:25 ` [dpdk-dev] [PATCH v3 01/11] drivers/baseband: add PMD for ACC100 Nicolas Chautru
` (10 more replies)
0 siblings, 11 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-08-19 0:25 UTC (permalink / raw)
To: dev, akhil.goyal; +Cc: bruce.richardson, Nicolas Chautru
v3: includes a change that was missed during rebase.
v2: includes clean-up from the latest CI checks.
This set adds a new PMD for the ACC100 accelerator
for 4G+5G FEC, targeting DPDK 20.11.
Documentation is updated accordingly.
Existing unit tests are all still supported.
Nicolas Chautru (11):
drivers/baseband: add PMD for ACC100
baseband/acc100: add register definition file
baseband/acc100: add info get function
baseband/acc100: add queue configuration
baseband/acc100: add LDPC processing functions
baseband/acc100: add HARQ loopback support
baseband/acc100: add support for 4G processing
baseband/acc100: add interrupt support to PMD
baseband/acc100: add debug function to validate input
baseband/acc100: add configure function
doc: update bbdev feature table
app/test-bbdev/Makefile | 3 +
app/test-bbdev/meson.build | 3 +
app/test-bbdev/test_bbdev_perf.c | 72 +
config/common_base | 4 +
doc/guides/bbdevs/acc100.rst | 233 +
doc/guides/bbdevs/features/acc100.ini | 14 +
doc/guides/bbdevs/features/mbc.ini | 14 -
doc/guides/bbdevs/index.rst | 1 +
doc/guides/rel_notes/release_20_11.rst | 6 +
drivers/baseband/Makefile | 2 +
drivers/baseband/acc100/Makefile | 28 +
drivers/baseband/acc100/acc100_pf_enum.h | 1068 +++++
drivers/baseband/acc100/acc100_vf_enum.h | 73 +
drivers/baseband/acc100/meson.build | 8 +
drivers/baseband/acc100/rte_acc100_cfg.h | 113 +
drivers/baseband/acc100/rte_acc100_pmd.c | 4684 ++++++++++++++++++++
drivers/baseband/acc100/rte_acc100_pmd.h | 593 +++
.../acc100/rte_pmd_bbdev_acc100_version.map | 10 +
drivers/baseband/meson.build | 2 +-
mk/rte.app.mk | 1 +
20 files changed, 6917 insertions(+), 15 deletions(-)
create mode 100644 doc/guides/bbdevs/acc100.rst
create mode 100644 doc/guides/bbdevs/features/acc100.ini
delete mode 100644 doc/guides/bbdevs/features/mbc.ini
create mode 100644 drivers/baseband/acc100/Makefile
create mode 100644 drivers/baseband/acc100/acc100_pf_enum.h
create mode 100644 drivers/baseband/acc100/acc100_vf_enum.h
create mode 100644 drivers/baseband/acc100/meson.build
create mode 100644 drivers/baseband/acc100/rte_acc100_cfg.h
create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.c
create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.h
create mode 100644 drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
--
1.8.3.1
^ permalink raw reply [flat|nested] 213+ messages in thread
* [dpdk-dev] [PATCH v3 01/11] drivers/baseband: add PMD for ACC100
2020-08-19 0:25 [dpdk-dev] [PATCH v3 00/11] bbdev PMD ACC100 Nicolas Chautru
@ 2020-08-19 0:25 ` Nicolas Chautru
2020-08-29 9:44 ` Xu, Rosen
2020-08-19 0:25 ` [dpdk-dev] [PATCH v3 02/11] baseband/acc100: add register definition file Nicolas Chautru
` (9 subsequent siblings)
10 siblings, 1 reply; 213+ messages in thread
From: Nicolas Chautru @ 2020-08-19 0:25 UTC (permalink / raw)
To: dev, akhil.goyal; +Cc: bruce.richardson, Nicolas Chautru
Add stubs for the ACC100 PMD.
Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
config/common_base | 4 +
doc/guides/bbdevs/acc100.rst | 233 +++++++++++++++++++++
doc/guides/bbdevs/index.rst | 1 +
doc/guides/rel_notes/release_20_11.rst | 6 +
drivers/baseband/Makefile | 2 +
drivers/baseband/acc100/Makefile | 25 +++
drivers/baseband/acc100/meson.build | 6 +
drivers/baseband/acc100/rte_acc100_pmd.c | 175 ++++++++++++++++
drivers/baseband/acc100/rte_acc100_pmd.h | 37 ++++
.../acc100/rte_pmd_bbdev_acc100_version.map | 3 +
drivers/baseband/meson.build | 2 +-
mk/rte.app.mk | 1 +
12 files changed, 494 insertions(+), 1 deletion(-)
create mode 100644 doc/guides/bbdevs/acc100.rst
create mode 100644 drivers/baseband/acc100/Makefile
create mode 100644 drivers/baseband/acc100/meson.build
create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.c
create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.h
create mode 100644 drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
diff --git a/config/common_base b/config/common_base
index fbf0ee7..218ab16 100644
--- a/config/common_base
+++ b/config/common_base
@@ -584,6 +584,10 @@ CONFIG_RTE_LIBRTE_PMD_BBDEV_NULL=y
#
CONFIG_RTE_LIBRTE_PMD_BBDEV_TURBO_SW=y
+# Compile PMD for ACC100 bbdev device
+#
+CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100=y
+
#
# Compile PMD for Intel FPGA LTE FEC bbdev device
#
diff --git a/doc/guides/bbdevs/acc100.rst b/doc/guides/bbdevs/acc100.rst
new file mode 100644
index 0000000..f87ee09
--- /dev/null
+++ b/doc/guides/bbdevs/acc100.rst
@@ -0,0 +1,233 @@
+.. SPDX-License-Identifier: BSD-3-Clause
+ Copyright(c) 2020 Intel Corporation
+
+Intel(R) ACC100 5G/4G FEC Poll Mode Driver
+==========================================
+
+The BBDEV ACC100 5G/4G FEC poll mode driver (PMD) supports an
+implementation of a vRAN FEC wireless acceleration function.
+This device is also known as Mount Bryce.
+
+Features
+--------
+
+ACC100 5G/4G FEC PMD supports the following features:
+
+- LDPC Encode in the DL (5GNR)
+- LDPC Decode in the UL (5GNR)
+- Turbo Encode in the DL (4G)
+- Turbo Decode in the UL (4G)
+- 16 VFs per PF (physical device)
+- Maximum of 128 queues per VF
+- PCIe Gen-3 x16 Interface
+- MSI
+- SR-IOV
+
+ACC100 5G/4G FEC PMD supports the following BBDEV capabilities:
+
+* For the LDPC encode operation:
+ - ``RTE_BBDEV_LDPC_CRC_24B_ATTACH`` : set to attach CRC24B to CB(s)
+ - ``RTE_BBDEV_LDPC_RATE_MATCH`` : if set then do not do Rate Match bypass
+ - ``RTE_BBDEV_LDPC_INTERLEAVER_BYPASS`` : if set then bypass interleaver
+
+* For the LDPC decode operation:
+ - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK`` : check CRC24B from CB(s)
+ - ``RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE`` : disable early termination
+ - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP`` : drops CRC24B bits appended while decoding
+ - ``RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE`` : provides an input for HARQ combining
+ - ``RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE`` : provides an output for HARQ combining
+ - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE`` : HARQ memory input is internal
+ - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE`` : HARQ memory output is internal
+ - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK`` : loopback data to/from HARQ memory
+ - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_FILLERS`` : HARQ memory includes the filler bits
+ - ``RTE_BBDEV_LDPC_DEC_SCATTER_GATHER`` : supports scatter-gather for input/output data
+ - ``RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION`` : supports compression of the HARQ input/output
+ - ``RTE_BBDEV_LDPC_LLR_COMPRESSION`` : supports LLR input compression
+
+* For the turbo encode operation:
+ - ``RTE_BBDEV_TURBO_CRC_24B_ATTACH`` : set to attach CRC24B to CB(s)
+ - ``RTE_BBDEV_TURBO_RATE_MATCH`` : if set then do not do Rate Match bypass
+ - ``RTE_BBDEV_TURBO_ENC_INTERRUPTS`` : set for encoder dequeue interrupts
+ - ``RTE_BBDEV_TURBO_RV_INDEX_BYPASS`` : set to bypass RV index
+ - ``RTE_BBDEV_TURBO_ENC_SCATTER_GATHER`` : supports scatter-gather for input/output data
+
+* For the turbo decode operation:
+ - ``RTE_BBDEV_TURBO_CRC_TYPE_24B`` : check CRC24B from CB(s)
+ - ``RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE`` : perform subblock de-interleave
+ - ``RTE_BBDEV_TURBO_DEC_INTERRUPTS`` : set for decoder dequeue interrupts
+ - ``RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN`` : set if negative LLR input is supported
+ - ``RTE_BBDEV_TURBO_POS_LLR_1_BIT_IN`` : set if positive LLR input is supported
+ - ``RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP`` : keep CRC24B bits appended while decoding
+ - ``RTE_BBDEV_TURBO_EARLY_TERMINATION`` : set early termination feature
+ - ``RTE_BBDEV_TURBO_DEC_SCATTER_GATHER`` : supports scatter-gather for input/output data
+ - ``RTE_BBDEV_TURBO_HALF_ITERATION_EVEN`` : set half iteration granularity
+
+Installation
+------------
+
+Section 3 of the DPDK manual provides instructions on installing and compiling DPDK. The
+default set of bbdev compile flags can be found in config/common_base, where, for example,
+the flag to build the ACC100 5G/4G FEC device, ``CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100``,
+is already set.
+
+DPDK requires hugepages to be configured as detailed in section 2 of the DPDK manual.
+The bbdev test application has been tested with a configuration of 40 x 1GB hugepages. The
+hugepage configuration of a server may be examined using:
+
+.. code-block:: console
+
+ grep Huge* /proc/meminfo
+
+
+Initialization
+--------------
+
+When the device first powers up, its PCI Physical Function (PF) can be listed with the following command:
+
+.. code-block:: console
+
+ sudo lspci -vd8086:0d5c
+
+The physical and virtual functions are compatible with Linux UIO drivers:
+``vfio`` and ``igb_uio``. However, the ACC100 5G/4G FEC device first needs
+to be bound to one of these Linux drivers through DPDK before it can be used.
+
+
+Bind PF UIO driver(s)
+~~~~~~~~~~~~~~~~~~~~~
+
+Install the DPDK igb_uio driver, bind it with the PF PCI device ID and use
+``lspci`` to confirm that the PF device is in use by the ``igb_uio`` DPDK UIO driver.
+
+The igb_uio driver may be bound to the PF PCI device using one of three methods:
+
+
+1. PCI functions (physical or virtual, depending on the use case) can be bound to
+the UIO driver by repeating this command for every function.
+
+.. code-block:: console
+
+ cd <dpdk-top-level-directory>
+ insmod ./build/kmod/igb_uio.ko
+ echo "8086 0d5c" > /sys/bus/pci/drivers/igb_uio/new_id
+ lspci -vd8086:0d5c
+
+
+2. The PF can also be bound with the DPDK UIO driver using the ``dpdk-devbind.py`` tool:
+
+.. code-block:: console
+
+ cd <dpdk-top-level-directory>
+ ./usertools/dpdk-devbind.py -b igb_uio 0000:06:00.0
+
+where the PCI device address (example: 0000:06:00.0) is obtained using ``lspci -vd8086:0d5c``.
+
+
+3. A third way to bind is to use the ``dpdk-setup.sh`` tool:
+
+.. code-block:: console
+
+ cd <dpdk-top-level-directory>
+ ./usertools/dpdk-setup.sh
+
+ select 'Bind Ethernet/Crypto/Baseband device to IGB UIO module'
+ or
+ select 'Bind Ethernet/Crypto/Baseband device to VFIO module' depending on driver required
+ enter PCI device ID
+ select 'Display current Ethernet/Crypto/Baseband device settings' to confirm binding
+
+
+The ACC100 5G/4G FEC PF can be bound with vfio in the same way, but the vfio
+driver does not support SR-IOV configuration out of the box, so it needs to be patched.
+
+
+Enable Virtual Functions
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+The ``lspci`` printout should now show that the PCI PF is under ``igb_uio``
+control: "``Kernel driver in use: igb_uio``".
+
+To show the number of available VFs on the device, read the ``sriov_totalvfs`` file:
+
+.. code-block:: console
+
+ cat /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_totalvfs
+
+ where 0000\:<b>\:<d>.<f> is the PCI device ID
+
+
+To enable VFs via igb_uio, echo the number of virtual functions to enable
+to the ``max_vfs`` file:
+
+.. code-block:: console
+
+ echo <num-of-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/max_vfs
+
+
+Afterwards, all VFs must be bound to the appropriate UIO drivers as required,
+in the same way as was done for the physical function previously.
+
+Enabling SR-IOV via the vfio driver is similar, except that the file name
+is different:
+
+.. code-block:: console
+
+ echo <num-of-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_numvfs
+
+
+Configure the VFs through PF
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The PCI virtual functions must be configured before use or assignment to
+VMs/containers. The configuration involves allocating the number of hardware
+queues, priorities, load balance, bandwidth and other settings necessary for the
+device to perform FEC functions.
+
+This configuration needs to be executed at least once after reboot or PCI FLR and can
+be achieved by using the function ``acc100_configure()``, which sets up the
+parameters defined in the ``acc100_conf`` structure, as sketched below.
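+
+A minimal sketch of such a one-time configuration call is shown below. It
+assumes the ``acc100_configure()`` prototype and ``acc100_conf`` fields
+introduced later in this patch series; the field names and values used are
+illustrative assumptions only.
+
+.. code-block:: c
+
+    #include <rte_acc100_cfg.h>
+
+    /* Illustrative sketch only: one-time PF configuration after boot or FLR.
+     * The acc100_conf field names below are assumptions based on
+     * rte_acc100_cfg.h added later in this series.
+     */
+    static int
+    configure_acc100_pf(const char *dev_name)
+    {
+        struct acc100_conf conf = {
+            .pf_mode_en = false,   /* expose queues through the VFs */
+            .num_vf_bundles = 16,  /* one bundle per VF */
+        };
+
+        return acc100_configure(dev_name, &conf);
+    }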
+
+Test Application
+----------------
+
+BBDEV provides a test application, ``test-bbdev.py``, and a range of test data for testing
+the functionality of ACC100 5G/4G FEC encode and decode, depending on the device's
+capabilities. The test application is located in the app/test-bbdev folder and has the
+following options:
+
+.. code-block:: console
+
+ "-p", "--testapp-path": specifies path to the bbdev test app.
+ "-e", "--eal-params" : EAL arguments which are passed to the test app.
+ "-t", "--timeout" : Timeout in seconds (default=300).
+ "-c", "--test-cases" : Defines test cases to run. Run all if not specified.
+ "-v", "--test-vector" : Test vector path (default=dpdk_path+/app/test-bbdev/test_vectors/bbdev_null.data).
+ "-n", "--num-ops" : Number of operations to process on device (default=32).
+ "-b", "--burst-size" : Operations enqueue/dequeue burst size (default=32).
+ "-s", "--snr" : SNR in dB used when generating LLRs for bler tests.
+ "-s", "--iter_max" : Number of iterations for LDPC decoder.
+ "-l", "--num-lcores" : Number of lcores to run (default=16).
+ "-i", "--init-device" : Initialise PF device with default values.
+
+
+To execute the test application tool using simple decode or encode data,
+type one of the following:
+
+.. code-block:: console
+
+ ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_dec_default.data
+ ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_enc_default.data
+
+
+The test application ``test-bbdev.py`` supports configuring the PF device with
+a default set of values if the "-i" or "--init-device" option is included. The default
+values are defined in test_bbdev_perf.c.
+
+
+Test Vectors
+~~~~~~~~~~~~
+
+In addition to the simple LDPC decoder and LDPC encoder tests, bbdev also provides
+a range of additional tests under the test_vectors folder, which may be useful. The
+results of these tests will depend on the ACC100 5G/4G FEC capabilities; some test
+cases may be skipped, but no failure should be reported.
diff --git a/doc/guides/bbdevs/index.rst b/doc/guides/bbdevs/index.rst
index a8092dd..4445cbd 100644
--- a/doc/guides/bbdevs/index.rst
+++ b/doc/guides/bbdevs/index.rst
@@ -13,3 +13,4 @@ Baseband Device Drivers
turbo_sw
fpga_lte_fec
fpga_5gnr_fec
+ acc100
diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
index df227a1..b3ab614 100644
--- a/doc/guides/rel_notes/release_20_11.rst
+++ b/doc/guides/rel_notes/release_20_11.rst
@@ -55,6 +55,12 @@ New Features
Also, make sure to start the actual text at the margin.
=======================================================
+* **Added Intel ACC100 bbdev PMD.**
+
+ Added a new ``acc100`` bbdev driver for the Intel\ |reg| ACC100 accelerator,
+ also known as Mount Bryce. See the
+ :doc:`../bbdevs/acc100` BBDEV guide for more details on this new driver.
+
Removed Items
-------------
diff --git a/drivers/baseband/Makefile b/drivers/baseband/Makefile
index dcc0969..b640294 100644
--- a/drivers/baseband/Makefile
+++ b/drivers/baseband/Makefile
@@ -10,6 +10,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_NULL) += null
DEPDIRS-null = $(core-libs)
DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_TURBO_SW) += turbo_sw
DEPDIRS-turbo_sw = $(core-libs)
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100) += acc100
+DEPDIRS-acc100 = $(core-libs)
DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_LTE_FEC) += fpga_lte_fec
DEPDIRS-fpga_lte_fec = $(core-libs)
DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC) += fpga_5gnr_fec
diff --git a/drivers/baseband/acc100/Makefile b/drivers/baseband/acc100/Makefile
new file mode 100644
index 0000000..c79e487
--- /dev/null
+++ b/drivers/baseband/acc100/Makefile
@@ -0,0 +1,25 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2020 Intel Corporation
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_pmd_bbdev_acc100.a
+
+# build flags
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring -lrte_cfgfile
+LDLIBS += -lrte_bbdev
+LDLIBS += -lrte_pci -lrte_bus_pci
+
+# versioning export map
+EXPORT_MAP := rte_pmd_bbdev_acc100_version.map
+
+# library version
+LIBABIVER := 1
+
+# library source files
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100) += rte_acc100_pmd.c
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/baseband/acc100/meson.build b/drivers/baseband/acc100/meson.build
new file mode 100644
index 0000000..8afafc2
--- /dev/null
+++ b/drivers/baseband/acc100/meson.build
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2020 Intel Corporation
+
+deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci']
+
+sources = files('rte_acc100_pmd.c')
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
new file mode 100644
index 0000000..1b4cd13
--- /dev/null
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -0,0 +1,175 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include <unistd.h>
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_dev.h>
+#include <rte_malloc.h>
+#include <rte_mempool.h>
+#include <rte_byteorder.h>
+#include <rte_errno.h>
+#include <rte_branch_prediction.h>
+#include <rte_hexdump.h>
+#include <rte_pci.h>
+#include <rte_bus_pci.h>
+
+#include <rte_bbdev.h>
+#include <rte_bbdev_pmd.h>
+#include "rte_acc100_pmd.h"
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, DEBUG);
+#else
+RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
+#endif
+
+/* Device close stub; freeing of the software-ring memory is added later in this series */
+static int
+acc100_dev_close(struct rte_bbdev *dev __rte_unused)
+{
+ return 0;
+}
+
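+/* Only the close operation is implemented at this stage; info_get, queue
+ * configuration and the processing functions are added by later commits in
+ * this series.
+ */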
+static const struct rte_bbdev_ops acc100_bbdev_ops = {
+ .close = acc100_dev_close,
+};
+
+/* ACC100 PCI PF address map */
+static const struct rte_pci_id pci_id_acc100_pf_map[] = {
+ {
+ RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID, RTE_ACC100_PF_DEVICE_ID)
+ },
+ {.device_id = 0},
+};
+
+/* ACC100 PCI VF address map */
+static const struct rte_pci_id pci_id_acc100_vf_map[] = {
+ {
+ RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID, RTE_ACC100_VF_DEVICE_ID)
+ },
+ {.device_id = 0},
+};
+
+/* Initialization Function */
+static void
+acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
+{
+ struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
+
+ dev->dev_ops = &acc100_bbdev_ops;
+
+ ((struct acc100_device *) dev->data->dev_private)->pf_device =
+ !strcmp(drv->driver.name,
+ RTE_STR(ACC100PF_DRIVER_NAME));
+ ((struct acc100_device *) dev->data->dev_private)->mmio_base =
+ pci_dev->mem_resource[0].addr;
+
+ rte_bbdev_log_debug("Init device %s [%s] @ vaddr %p paddr %#"PRIx64"",
+ drv->driver.name, dev->data->name,
+ (void *)pci_dev->mem_resource[0].addr,
+ pci_dev->mem_resource[0].phys_addr);
+}
+
+static int acc100_pci_probe(struct rte_pci_driver *pci_drv,
+ struct rte_pci_device *pci_dev)
+{
+ struct rte_bbdev *bbdev = NULL;
+ char dev_name[RTE_BBDEV_NAME_MAX_LEN];
+
+ if (pci_dev == NULL) {
+ rte_bbdev_log(ERR, "NULL PCI device");
+ return -EINVAL;
+ }
+
+ rte_pci_device_name(&pci_dev->addr, dev_name, sizeof(dev_name));
+
+ /* Allocate a bbdev device instance for this PCI device */
+ bbdev = rte_bbdev_allocate(pci_dev->device.name);
+ if (bbdev == NULL)
+ return -ENODEV;
+
+ /* allocate device private memory */
+ bbdev->data->dev_private = rte_zmalloc_socket(dev_name,
+ sizeof(struct acc100_device), RTE_CACHE_LINE_SIZE,
+ pci_dev->device.numa_node);
+
+ if (bbdev->data->dev_private == NULL) {
+ rte_bbdev_log(CRIT,
+ "Allocate of %zu bytes for device \"%s\" failed",
+ sizeof(struct acc100_device), dev_name);
+ rte_bbdev_release(bbdev);
+ return -ENOMEM;
+ }
+
+ /* Fill HW specific part of device structure */
+ bbdev->device = &pci_dev->device;
+ bbdev->intr_handle = &pci_dev->intr_handle;
+ bbdev->data->socket_id = pci_dev->device.numa_node;
+
+ /* Invoke ACC100 device initialization function */
+ acc100_bbdev_init(bbdev, pci_drv);
+
+ rte_bbdev_log_debug("Initialised bbdev %s (id = %u)",
+ dev_name, bbdev->data->dev_id);
+ return 0;
+}
+
+static int acc100_pci_remove(struct rte_pci_device *pci_dev)
+{
+ struct rte_bbdev *bbdev;
+ int ret;
+ uint8_t dev_id;
+
+ if (pci_dev == NULL)
+ return -EINVAL;
+
+ /* Find device */
+ bbdev = rte_bbdev_get_named_dev(pci_dev->device.name);
+ if (bbdev == NULL) {
+ rte_bbdev_log(CRIT,
+ "Couldn't find HW dev \"%s\" to uninitialise it",
+ pci_dev->device.name);
+ return -ENODEV;
+ }
+ dev_id = bbdev->data->dev_id;
+
+ /* free device private memory before close */
+ rte_free(bbdev->data->dev_private);
+
+ /* Close device */
+ ret = rte_bbdev_close(dev_id);
+ if (ret < 0)
+ rte_bbdev_log(ERR,
+ "Device %i failed to close during uninit: %i",
+ dev_id, ret);
+
+ /* release bbdev from library */
+ rte_bbdev_release(bbdev);
+
+ rte_bbdev_log_debug("Destroyed bbdev = %u", dev_id);
+
+ return 0;
+}
+
+static struct rte_pci_driver acc100_pci_pf_driver = {
+ .probe = acc100_pci_probe,
+ .remove = acc100_pci_remove,
+ .id_table = pci_id_acc100_pf_map,
+ .drv_flags = RTE_PCI_DRV_NEED_MAPPING
+};
+
+static struct rte_pci_driver acc100_pci_vf_driver = {
+ .probe = acc100_pci_probe,
+ .remove = acc100_pci_remove,
+ .id_table = pci_id_acc100_vf_map,
+ .drv_flags = RTE_PCI_DRV_NEED_MAPPING
+};
+
+RTE_PMD_REGISTER_PCI(ACC100PF_DRIVER_NAME, acc100_pci_pf_driver);
+RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
+RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
+RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
new file mode 100644
index 0000000..6f46df0
--- /dev/null
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -0,0 +1,37 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_ACC100_PMD_H_
+#define _RTE_ACC100_PMD_H_
+
+/* Helper macro for logging */
+#define rte_bbdev_log(level, fmt, ...) \
+ rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \
+ ##__VA_ARGS__)
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+#define rte_bbdev_log_debug(fmt, ...) \
+ rte_bbdev_log(DEBUG, "acc100_pmd: " fmt, \
+ ##__VA_ARGS__)
+#else
+#define rte_bbdev_log_debug(fmt, ...)
+#endif
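+
+/* Usage example (illustrative):
+ *   rte_bbdev_log(ERR, "Device %s failed: %d", dev_name, ret);
+ *   rte_bbdev_log_debug("Queue %u configured", q_id);
+ */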
+
+/* ACC100 PF and VF driver names */
+#define ACC100PF_DRIVER_NAME intel_acc100_pf
+#define ACC100VF_DRIVER_NAME intel_acc100_vf
+
+/* ACC100 PCI vendor & device IDs */
+#define RTE_ACC100_VENDOR_ID (0x8086)
+#define RTE_ACC100_PF_DEVICE_ID (0x0d5c)
+#define RTE_ACC100_VF_DEVICE_ID (0x0d5d)
+
+/* Private data structure for each ACC100 device */
+struct acc100_device {
+ void *mmio_base; /**< Base address of MMIO registers (BAR0) */
+ bool pf_device; /**< True if this is a PF ACC100 device */
+ bool configured; /**< True if this ACC100 device is configured */
+};
+
+#endif /* _RTE_ACC100_PMD_H_ */
diff --git a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
new file mode 100644
index 0000000..4a76d1d
--- /dev/null
+++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
@@ -0,0 +1,3 @@
+DPDK_21 {
+ local: *;
+};
diff --git a/drivers/baseband/meson.build b/drivers/baseband/meson.build
index 415b672..72301ce 100644
--- a/drivers/baseband/meson.build
+++ b/drivers/baseband/meson.build
@@ -5,7 +5,7 @@ if is_windows
subdir_done()
endif
-drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec']
+drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec', 'acc100']
config_flag_fmt = 'RTE_LIBRTE_PMD_BBDEV_@0@'
driver_name_fmt = 'rte_pmd_bbdev_@0@'
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index a544259..a77f538 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -254,6 +254,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_NETVSC_PMD) += -lrte_pmd_netvsc
ifeq ($(CONFIG_RTE_LIBRTE_BBDEV),y)
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_NULL) += -lrte_pmd_bbdev_null
+_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100) += -lrte_pmd_bbdev_acc100
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_LTE_FEC) += -lrte_pmd_bbdev_fpga_lte_fec
_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC) += -lrte_pmd_bbdev_fpga_5gnr_fec
--
1.8.3.1
^ permalink raw reply [flat|nested] 213+ messages in thread
* [dpdk-dev] [PATCH v3 02/11] baseband/acc100: add register definition file
2020-08-19 0:25 [dpdk-dev] [PATCH v3 00/11] bbdev PMD ACC100 Nicolas Chautru
2020-08-19 0:25 ` [dpdk-dev] [PATCH v3 01/11] drivers/baseband: add PMD for ACC100 Nicolas Chautru
@ 2020-08-19 0:25 ` Nicolas Chautru
2020-08-29 9:55 ` Xu, Rosen
2020-08-19 0:25 ` [dpdk-dev] [PATCH v3 03/11] baseband/acc100: add info get function Nicolas Chautru
` (8 subsequent siblings)
10 siblings, 1 reply; 213+ messages in thread
From: Nicolas Chautru @ 2020-08-19 0:25 UTC (permalink / raw)
To: dev, akhil.goyal; +Cc: bruce.richardson, Nicolas Chautru
Add the list of registers for the device and related
HW specification definitions.
Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
drivers/baseband/acc100/acc100_pf_enum.h | 1068 ++++++++++++++++++++++++++++++
drivers/baseband/acc100/acc100_vf_enum.h | 73 ++
drivers/baseband/acc100/rte_acc100_pmd.h | 490 ++++++++++++++
3 files changed, 1631 insertions(+)
create mode 100644 drivers/baseband/acc100/acc100_pf_enum.h
create mode 100644 drivers/baseband/acc100/acc100_vf_enum.h
diff --git a/drivers/baseband/acc100/acc100_pf_enum.h b/drivers/baseband/acc100/acc100_pf_enum.h
new file mode 100644
index 0000000..a1ee416
--- /dev/null
+++ b/drivers/baseband/acc100/acc100_pf_enum.h
@@ -0,0 +1,1068 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017 Intel Corporation
+ */
+
+#ifndef ACC100_PF_ENUM_H
+#define ACC100_PF_ENUM_H
+
+/*
+ * ACC100 Register mapping on PF BAR0
+ * This is automatically generated from RDL; the format may change with a new
+ * RDL release.
+ * Variable names are kept as generated.
+ */
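+/* Offsets below are relative to the PF BAR0 mapping that the PMD exposes
+ * through acc100_device.mmio_base, i.e. a register is accessed at
+ * mmio_base + offset.
+ */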
+enum {
+ HWPfQmgrEgressQueuesTemplate = 0x0007FE00,
+ HWPfQmgrIngressAq = 0x00080000,
+ HWPfQmgrArbQAvail = 0x00A00010,
+ HWPfQmgrArbQBlock = 0x00A00014,
+ HWPfQmgrAqueueDropNotifEn = 0x00A00024,
+ HWPfQmgrAqueueDisableNotifEn = 0x00A00028,
+ HWPfQmgrSoftReset = 0x00A00038,
+ HWPfQmgrInitStatus = 0x00A0003C,
+ HWPfQmgrAramWatchdogCount = 0x00A00040,
+ HWPfQmgrAramWatchdogCounterEn = 0x00A00044,
+ HWPfQmgrAxiWatchdogCount = 0x00A00048,
+ HWPfQmgrAxiWatchdogCounterEn = 0x00A0004C,
+ HWPfQmgrProcessWatchdogCount = 0x00A00050,
+ HWPfQmgrProcessWatchdogCounterEn = 0x00A00054,
+ HWPfQmgrProcessUl4GWatchdogCounter = 0x00A00058,
+ HWPfQmgrProcessDl4GWatchdogCounter = 0x00A0005C,
+ HWPfQmgrProcessUl5GWatchdogCounter = 0x00A00060,
+ HWPfQmgrProcessDl5GWatchdogCounter = 0x00A00064,
+ HWPfQmgrProcessMldWatchdogCounter = 0x00A00068,
+ HWPfQmgrMsiOverflowUpperVf = 0x00A00070,
+ HWPfQmgrMsiOverflowLowerVf = 0x00A00074,
+ HWPfQmgrMsiWatchdogOverflow = 0x00A00078,
+ HWPfQmgrMsiOverflowEnable = 0x00A0007C,
+ HWPfQmgrDebugAqPointerMemGrp = 0x00A00100,
+ HWPfQmgrDebugOutputArbQFifoGrp = 0x00A00140,
+ HWPfQmgrDebugMsiFifoGrp = 0x00A00180,
+ HWPfQmgrDebugAxiWdTimeoutMsiFifo = 0x00A001C0,
+ HWPfQmgrDebugProcessWdTimeoutMsiFifo = 0x00A001C4,
+ HWPfQmgrDepthLog2Grp = 0x00A00200,
+ HWPfQmgrTholdGrp = 0x00A00300,
+ HWPfQmgrGrpTmplateReg0Indx = 0x00A00600,
+ HWPfQmgrGrpTmplateReg1Indx = 0x00A00680,
+ HWPfQmgrGrpTmplateReg2indx = 0x00A00700,
+ HWPfQmgrGrpTmplateReg3Indx = 0x00A00780,
+ HWPfQmgrGrpTmplateReg4Indx = 0x00A00800,
+ HWPfQmgrVfBaseAddr = 0x00A01000,
+ HWPfQmgrUl4GWeightRrVf = 0x00A02000,
+ HWPfQmgrDl4GWeightRrVf = 0x00A02100,
+ HWPfQmgrUl5GWeightRrVf = 0x00A02200,
+ HWPfQmgrDl5GWeightRrVf = 0x00A02300,
+ HWPfQmgrMldWeightRrVf = 0x00A02400,
+ HWPfQmgrArbQDepthGrp = 0x00A02F00,
+ HWPfQmgrGrpFunction0 = 0x00A02F40,
+ HWPfQmgrGrpFunction1 = 0x00A02F44,
+ HWPfQmgrGrpPriority = 0x00A02F48,
+ HWPfQmgrWeightSync = 0x00A03000,
+ HWPfQmgrAqEnableVf = 0x00A10000,
+ HWPfQmgrAqResetVf = 0x00A20000,
+ HWPfQmgrRingSizeVf = 0x00A20004,
+ HWPfQmgrGrpDepthLog20Vf = 0x00A20008,
+ HWPfQmgrGrpDepthLog21Vf = 0x00A2000C,
+ HWPfQmgrGrpFunction0Vf = 0x00A20010,
+ HWPfQmgrGrpFunction1Vf = 0x00A20014,
+ HWPfDmaConfig0Reg = 0x00B80000,
+ HWPfDmaConfig1Reg = 0x00B80004,
+ HWPfDmaQmgrAddrReg = 0x00B80008,
+ HWPfDmaSoftResetReg = 0x00B8000C,
+ HWPfDmaAxcacheReg = 0x00B80010,
+ HWPfDmaVersionReg = 0x00B80014,
+ HWPfDmaFrameThreshold = 0x00B80018,
+ HWPfDmaTimestampLo = 0x00B8001C,
+ HWPfDmaTimestampHi = 0x00B80020,
+ HWPfDmaAxiStatus = 0x00B80028,
+ HWPfDmaAxiControl = 0x00B8002C,
+ HWPfDmaNoQmgr = 0x00B80030,
+ HWPfDmaQosScale = 0x00B80034,
+ HWPfDmaQmanen = 0x00B80040,
+ HWPfDmaQmgrQosBase = 0x00B80060,
+ HWPfDmaFecClkGatingEnable = 0x00B80080,
+ HWPfDmaPmEnable = 0x00B80084,
+ HWPfDmaQosEnable = 0x00B80088,
+ HWPfDmaHarqWeightedRrFrameThreshold = 0x00B800B0,
+ HWPfDmaDataSmallWeightedRrFrameThresh = 0x00B800B4,
+ HWPfDmaDataLargeWeightedRrFrameThresh = 0x00B800B8,
+ HWPfDmaInboundCbMaxSize = 0x00B800BC,
+ HWPfDmaInboundDrainDataSize = 0x00B800C0,
+ HWPfDmaVfDdrBaseRw = 0x00B80400,
+ HWPfDmaCmplTmOutCnt = 0x00B80800,
+ HWPfDmaProcTmOutCnt = 0x00B80804,
+ HWPfDmaStatusRrespBresp = 0x00B80810,
+ HWPfDmaCfgRrespBresp = 0x00B80814,
+ HWPfDmaStatusMemParErr = 0x00B80818,
+ HWPfDmaCfgMemParErrEn = 0x00B8081C,
+ HWPfDmaStatusDmaHwErr = 0x00B80820,
+ HWPfDmaCfgDmaHwErrEn = 0x00B80824,
+ HWPfDmaStatusFecCoreErr = 0x00B80828,
+ HWPfDmaCfgFecCoreErrEn = 0x00B8082C,
+ HWPfDmaStatusFcwDescrErr = 0x00B80830,
+ HWPfDmaCfgFcwDescrErrEn = 0x00B80834,
+ HWPfDmaStatusBlockTransmit = 0x00B80838,
+ HWPfDmaBlockOnErrEn = 0x00B8083C,
+ HWPfDmaStatusFlushDma = 0x00B80840,
+ HWPfDmaFlushDmaOnErrEn = 0x00B80844,
+ HWPfDmaStatusSdoneFifoFull = 0x00B80848,
+ HWPfDmaStatusDescriptorErrLoVf = 0x00B8084C,
+ HWPfDmaStatusDescriptorErrHiVf = 0x00B80850,
+ HWPfDmaStatusFcwErrLoVf = 0x00B80854,
+ HWPfDmaStatusFcwErrHiVf = 0x00B80858,
+ HWPfDmaStatusDataErrLoVf = 0x00B8085C,
+ HWPfDmaStatusDataErrHiVf = 0x00B80860,
+ HWPfDmaCfgMsiEnSoftwareErr = 0x00B80864,
+ HWPfDmaDescriptorSignatuture = 0x00B80868,
+ HWPfDmaFcwSignature = 0x00B8086C,
+ HWPfDmaErrorDetectionEn = 0x00B80870,
+ HWPfDmaErrCntrlFifoDebug = 0x00B8087C,
+ HWPfDmaStatusToutData = 0x00B80880,
+ HWPfDmaStatusToutDesc = 0x00B80884,
+ HWPfDmaStatusToutUnexpData = 0x00B80888,
+ HWPfDmaStatusToutUnexpDesc = 0x00B8088C,
+ HWPfDmaStatusToutProcess = 0x00B80890,
+ HWPfDmaConfigCtoutOutDataEn = 0x00B808A0,
+ HWPfDmaConfigCtoutOutDescrEn = 0x00B808A4,
+ HWPfDmaConfigUnexpComplDataEn = 0x00B808A8,
+ HWPfDmaConfigUnexpComplDescrEn = 0x00B808AC,
+ HWPfDmaConfigPtoutOutEn = 0x00B808B0,
+ HWPfDmaFec5GulDescBaseLoRegVf = 0x00B88020,
+ HWPfDmaFec5GulDescBaseHiRegVf = 0x00B88024,
+ HWPfDmaFec5GulRespPtrLoRegVf = 0x00B88028,
+ HWPfDmaFec5GulRespPtrHiRegVf = 0x00B8802C,
+ HWPfDmaFec5GdlDescBaseLoRegVf = 0x00B88040,
+ HWPfDmaFec5GdlDescBaseHiRegVf = 0x00B88044,
+ HWPfDmaFec5GdlRespPtrLoRegVf = 0x00B88048,
+ HWPfDmaFec5GdlRespPtrHiRegVf = 0x00B8804C,
+ HWPfDmaFec4GulDescBaseLoRegVf = 0x00B88060,
+ HWPfDmaFec4GulDescBaseHiRegVf = 0x00B88064,
+ HWPfDmaFec4GulRespPtrLoRegVf = 0x00B88068,
+ HWPfDmaFec4GulRespPtrHiRegVf = 0x00B8806C,
+ HWPfDmaFec4GdlDescBaseLoRegVf = 0x00B88080,
+ HWPfDmaFec4GdlDescBaseHiRegVf = 0x00B88084,
+ HWPfDmaFec4GdlRespPtrLoRegVf = 0x00B88088,
+ HWPfDmaFec4GdlRespPtrHiRegVf = 0x00B8808C,
+ HWPfDmaVfDdrBaseRangeRo = 0x00B880A0,
+ HWPfQosmonACntrlReg = 0x00B90000,
+ HWPfQosmonAEvalOverflow0 = 0x00B90008,
+ HWPfQosmonAEvalOverflow1 = 0x00B9000C,
+ HWPfQosmonADivTerm = 0x00B90010,
+ HWPfQosmonATickTerm = 0x00B90014,
+ HWPfQosmonAEvalTerm = 0x00B90018,
+ HWPfQosmonAAveTerm = 0x00B9001C,
+ HWPfQosmonAForceEccErr = 0x00B90020,
+ HWPfQosmonAEccErrDetect = 0x00B90024,
+ HWPfQosmonAIterationConfig0Low = 0x00B90060,
+ HWPfQosmonAIterationConfig0High = 0x00B90064,
+ HWPfQosmonAIterationConfig1Low = 0x00B90068,
+ HWPfQosmonAIterationConfig1High = 0x00B9006C,
+ HWPfQosmonAIterationConfig2Low = 0x00B90070,
+ HWPfQosmonAIterationConfig2High = 0x00B90074,
+ HWPfQosmonAIterationConfig3Low = 0x00B90078,
+ HWPfQosmonAIterationConfig3High = 0x00B9007C,
+ HWPfQosmonAEvalMemAddr = 0x00B90080,
+ HWPfQosmonAEvalMemData = 0x00B90084,
+ HWPfQosmonAXaction = 0x00B900C0,
+ HWPfQosmonARemThres1Vf = 0x00B90400,
+ HWPfQosmonAThres2Vf = 0x00B90404,
+ HWPfQosmonAWeiFracVf = 0x00B90408,
+ HWPfQosmonARrWeiVf = 0x00B9040C,
+ HWPfPermonACntrlRegVf = 0x00B98000,
+ HWPfPermonACountVf = 0x00B98008,
+ HWPfPermonAKCntLoVf = 0x00B98010,
+ HWPfPermonAKCntHiVf = 0x00B98014,
+ HWPfPermonADeltaCntLoVf = 0x00B98020,
+ HWPfPermonADeltaCntHiVf = 0x00B98024,
+ HWPfPermonAVersionReg = 0x00B9C000,
+ HWPfPermonACbControlFec = 0x00B9C0F0,
+ HWPfPermonADltTimerLoFec = 0x00B9C0F4,
+ HWPfPermonADltTimerHiFec = 0x00B9C0F8,
+ HWPfPermonACbCountFec = 0x00B9C100,
+ HWPfPermonAAccExecTimerLoFec = 0x00B9C104,
+ HWPfPermonAAccExecTimerHiFec = 0x00B9C108,
+ HWPfPermonAExecTimerMinFec = 0x00B9C200,
+ HWPfPermonAExecTimerMaxFec = 0x00B9C204,
+ HWPfPermonAControlBusMon = 0x00B9C400,
+ HWPfPermonAConfigBusMon = 0x00B9C404,
+ HWPfPermonASkipCountBusMon = 0x00B9C408,
+ HWPfPermonAMinLatBusMon = 0x00B9C40C,
+ HWPfPermonAMaxLatBusMon = 0x00B9C500,
+ HWPfPermonATotalLatLowBusMon = 0x00B9C504,
+ HWPfPermonATotalLatUpperBusMon = 0x00B9C508,
+ HWPfPermonATotalReqCntBusMon = 0x00B9C50C,
+ HWPfQosmonBCntrlReg = 0x00BA0000,
+ HWPfQosmonBEvalOverflow0 = 0x00BA0008,
+ HWPfQosmonBEvalOverflow1 = 0x00BA000C,
+ HWPfQosmonBDivTerm = 0x00BA0010,
+ HWPfQosmonBTickTerm = 0x00BA0014,
+ HWPfQosmonBEvalTerm = 0x00BA0018,
+ HWPfQosmonBAveTerm = 0x00BA001C,
+ HWPfQosmonBForceEccErr = 0x00BA0020,
+ HWPfQosmonBEccErrDetect = 0x00BA0024,
+ HWPfQosmonBIterationConfig0Low = 0x00BA0060,
+ HWPfQosmonBIterationConfig0High = 0x00BA0064,
+ HWPfQosmonBIterationConfig1Low = 0x00BA0068,
+ HWPfQosmonBIterationConfig1High = 0x00BA006C,
+ HWPfQosmonBIterationConfig2Low = 0x00BA0070,
+ HWPfQosmonBIterationConfig2High = 0x00BA0074,
+ HWPfQosmonBIterationConfig3Low = 0x00BA0078,
+ HWPfQosmonBIterationConfig3High = 0x00BA007C,
+ HWPfQosmonBEvalMemAddr = 0x00BA0080,
+ HWPfQosmonBEvalMemData = 0x00BA0084,
+ HWPfQosmonBXaction = 0x00BA00C0,
+ HWPfQosmonBRemThres1Vf = 0x00BA0400,
+ HWPfQosmonBThres2Vf = 0x00BA0404,
+ HWPfQosmonBWeiFracVf = 0x00BA0408,
+ HWPfQosmonBRrWeiVf = 0x00BA040C,
+ HWPfPermonBCntrlRegVf = 0x00BA8000,
+ HWPfPermonBCountVf = 0x00BA8008,
+ HWPfPermonBKCntLoVf = 0x00BA8010,
+ HWPfPermonBKCntHiVf = 0x00BA8014,
+ HWPfPermonBDeltaCntLoVf = 0x00BA8020,
+ HWPfPermonBDeltaCntHiVf = 0x00BA8024,
+ HWPfPermonBVersionReg = 0x00BAC000,
+ HWPfPermonBCbControlFec = 0x00BAC0F0,
+ HWPfPermonBDltTimerLoFec = 0x00BAC0F4,
+ HWPfPermonBDltTimerHiFec = 0x00BAC0F8,
+ HWPfPermonBCbCountFec = 0x00BAC100,
+ HWPfPermonBAccExecTimerLoFec = 0x00BAC104,
+ HWPfPermonBAccExecTimerHiFec = 0x00BAC108,
+ HWPfPermonBExecTimerMinFec = 0x00BAC200,
+ HWPfPermonBExecTimerMaxFec = 0x00BAC204,
+ HWPfPermonBControlBusMon = 0x00BAC400,
+ HWPfPermonBConfigBusMon = 0x00BAC404,
+ HWPfPermonBSkipCountBusMon = 0x00BAC408,
+ HWPfPermonBMinLatBusMon = 0x00BAC40C,
+ HWPfPermonBMaxLatBusMon = 0x00BAC500,
+ HWPfPermonBTotalLatLowBusMon = 0x00BAC504,
+ HWPfPermonBTotalLatUpperBusMon = 0x00BAC508,
+ HWPfPermonBTotalReqCntBusMon = 0x00BAC50C,
+ HWPfFecUl5gCntrlReg = 0x00BC0000,
+ HWPfFecUl5gI2MThreshReg = 0x00BC0004,
+ HWPfFecUl5gVersionReg = 0x00BC0100,
+ HWPfFecUl5gFcwStatusReg = 0x00BC0104,
+ HWPfFecUl5gWarnReg = 0x00BC0108,
+ HwPfFecUl5gIbDebugReg = 0x00BC0200,
+ HwPfFecUl5gObLlrDebugReg = 0x00BC0204,
+ HwPfFecUl5gObHarqDebugReg = 0x00BC0208,
+ HwPfFecUl5g1CntrlReg = 0x00BC1000,
+ HwPfFecUl5g1I2MThreshReg = 0x00BC1004,
+ HwPfFecUl5g1VersionReg = 0x00BC1100,
+ HwPfFecUl5g1FcwStatusReg = 0x00BC1104,
+ HwPfFecUl5g1WarnReg = 0x00BC1108,
+ HwPfFecUl5g1IbDebugReg = 0x00BC1200,
+ HwPfFecUl5g1ObLlrDebugReg = 0x00BC1204,
+ HwPfFecUl5g1ObHarqDebugReg = 0x00BC1208,
+ HwPfFecUl5g2CntrlReg = 0x00BC2000,
+ HwPfFecUl5g2I2MThreshReg = 0x00BC2004,
+ HwPfFecUl5g2VersionReg = 0x00BC2100,
+ HwPfFecUl5g2FcwStatusReg = 0x00BC2104,
+ HwPfFecUl5g2WarnReg = 0x00BC2108,
+ HwPfFecUl5g2IbDebugReg = 0x00BC2200,
+ HwPfFecUl5g2ObLlrDebugReg = 0x00BC2204,
+ HwPfFecUl5g2ObHarqDebugReg = 0x00BC2208,
+ HwPfFecUl5g3CntrlReg = 0x00BC3000,
+ HwPfFecUl5g3I2MThreshReg = 0x00BC3004,
+ HwPfFecUl5g3VersionReg = 0x00BC3100,
+ HwPfFecUl5g3FcwStatusReg = 0x00BC3104,
+ HwPfFecUl5g3WarnReg = 0x00BC3108,
+ HwPfFecUl5g3IbDebugReg = 0x00BC3200,
+ HwPfFecUl5g3ObLlrDebugReg = 0x00BC3204,
+ HwPfFecUl5g3ObHarqDebugReg = 0x00BC3208,
+ HwPfFecUl5g4CntrlReg = 0x00BC4000,
+ HwPfFecUl5g4I2MThreshReg = 0x00BC4004,
+ HwPfFecUl5g4VersionReg = 0x00BC4100,
+ HwPfFecUl5g4FcwStatusReg = 0x00BC4104,
+ HwPfFecUl5g4WarnReg = 0x00BC4108,
+ HwPfFecUl5g4IbDebugReg = 0x00BC4200,
+ HwPfFecUl5g4ObLlrDebugReg = 0x00BC4204,
+ HwPfFecUl5g4ObHarqDebugReg = 0x00BC4208,
+ HwPfFecUl5g5CntrlReg = 0x00BC5000,
+ HwPfFecUl5g5I2MThreshReg = 0x00BC5004,
+ HwPfFecUl5g5VersionReg = 0x00BC5100,
+ HwPfFecUl5g5FcwStatusReg = 0x00BC5104,
+ HwPfFecUl5g5WarnReg = 0x00BC5108,
+ HwPfFecUl5g5IbDebugReg = 0x00BC5200,
+ HwPfFecUl5g5ObLlrDebugReg = 0x00BC5204,
+ HwPfFecUl5g5ObHarqDebugReg = 0x00BC5208,
+ HwPfFecUl5g6CntrlReg = 0x00BC6000,
+ HwPfFecUl5g6I2MThreshReg = 0x00BC6004,
+ HwPfFecUl5g6VersionReg = 0x00BC6100,
+ HwPfFecUl5g6FcwStatusReg = 0x00BC6104,
+ HwPfFecUl5g6WarnReg = 0x00BC6108,
+ HwPfFecUl5g6IbDebugReg = 0x00BC6200,
+ HwPfFecUl5g6ObLlrDebugReg = 0x00BC6204,
+ HwPfFecUl5g6ObHarqDebugReg = 0x00BC6208,
+ HwPfFecUl5g7CntrlReg = 0x00BC7000,
+ HwPfFecUl5g7I2MThreshReg = 0x00BC7004,
+ HwPfFecUl5g7VersionReg = 0x00BC7100,
+ HwPfFecUl5g7FcwStatusReg = 0x00BC7104,
+ HwPfFecUl5g7WarnReg = 0x00BC7108,
+ HwPfFecUl5g7IbDebugReg = 0x00BC7200,
+ HwPfFecUl5g7ObLlrDebugReg = 0x00BC7204,
+ HwPfFecUl5g7ObHarqDebugReg = 0x00BC7208,
+ HwPfFecUl5g8CntrlReg = 0x00BC8000,
+ HwPfFecUl5g8I2MThreshReg = 0x00BC8004,
+ HwPfFecUl5g8VersionReg = 0x00BC8100,
+ HwPfFecUl5g8FcwStatusReg = 0x00BC8104,
+ HwPfFecUl5g8WarnReg = 0x00BC8108,
+ HwPfFecUl5g8IbDebugReg = 0x00BC8200,
+ HwPfFecUl5g8ObLlrDebugReg = 0x00BC8204,
+ HwPfFecUl5g8ObHarqDebugReg = 0x00BC8208,
+ HWPfFecDl5gCntrlReg = 0x00BCF000,
+ HWPfFecDl5gI2MThreshReg = 0x00BCF004,
+ HWPfFecDl5gVersionReg = 0x00BCF100,
+ HWPfFecDl5gFcwStatusReg = 0x00BCF104,
+ HWPfFecDl5gWarnReg = 0x00BCF108,
+ HWPfFecUlVersionReg = 0x00BD0000,
+ HWPfFecUlControlReg = 0x00BD0004,
+ HWPfFecUlStatusReg = 0x00BD0008,
+ HWPfFecDlVersionReg = 0x00BDF000,
+ HWPfFecDlClusterConfigReg = 0x00BDF004,
+ HWPfFecDlBurstThres = 0x00BDF00C,
+ HWPfFecDlClusterStatusReg0 = 0x00BDF040,
+ HWPfFecDlClusterStatusReg1 = 0x00BDF044,
+ HWPfFecDlClusterStatusReg2 = 0x00BDF048,
+ HWPfFecDlClusterStatusReg3 = 0x00BDF04C,
+ HWPfFecDlClusterStatusReg4 = 0x00BDF050,
+ HWPfFecDlClusterStatusReg5 = 0x00BDF054,
+ HWPfChaFabPllPllrst = 0x00C40000,
+ HWPfChaFabPllClk0 = 0x00C40004,
+ HWPfChaFabPllClk1 = 0x00C40008,
+ HWPfChaFabPllBwadj = 0x00C4000C,
+ HWPfChaFabPllLbw = 0x00C40010,
+ HWPfChaFabPllResetq = 0x00C40014,
+ HWPfChaFabPllPhshft0 = 0x00C40018,
+ HWPfChaFabPllPhshft1 = 0x00C4001C,
+ HWPfChaFabPllDivq0 = 0x00C40020,
+ HWPfChaFabPllDivq1 = 0x00C40024,
+ HWPfChaFabPllDivq2 = 0x00C40028,
+ HWPfChaFabPllDivq3 = 0x00C4002C,
+ HWPfChaFabPllDivq4 = 0x00C40030,
+ HWPfChaFabPllDivq5 = 0x00C40034,
+ HWPfChaFabPllDivq6 = 0x00C40038,
+ HWPfChaFabPllDivq7 = 0x00C4003C,
+ HWPfChaDl5gPllPllrst = 0x00C40080,
+ HWPfChaDl5gPllClk0 = 0x00C40084,
+ HWPfChaDl5gPllClk1 = 0x00C40088,
+ HWPfChaDl5gPllBwadj = 0x00C4008C,
+ HWPfChaDl5gPllLbw = 0x00C40090,
+ HWPfChaDl5gPllResetq = 0x00C40094,
+ HWPfChaDl5gPllPhshft0 = 0x00C40098,
+ HWPfChaDl5gPllPhshft1 = 0x00C4009C,
+ HWPfChaDl5gPllDivq0 = 0x00C400A0,
+ HWPfChaDl5gPllDivq1 = 0x00C400A4,
+ HWPfChaDl5gPllDivq2 = 0x00C400A8,
+ HWPfChaDl5gPllDivq3 = 0x00C400AC,
+ HWPfChaDl5gPllDivq4 = 0x00C400B0,
+ HWPfChaDl5gPllDivq5 = 0x00C400B4,
+ HWPfChaDl5gPllDivq6 = 0x00C400B8,
+ HWPfChaDl5gPllDivq7 = 0x00C400BC,
+ HWPfChaDl4gPllPllrst = 0x00C40100,
+ HWPfChaDl4gPllClk0 = 0x00C40104,
+ HWPfChaDl4gPllClk1 = 0x00C40108,
+ HWPfChaDl4gPllBwadj = 0x00C4010C,
+ HWPfChaDl4gPllLbw = 0x00C40110,
+ HWPfChaDl4gPllResetq = 0x00C40114,
+ HWPfChaDl4gPllPhshft0 = 0x00C40118,
+ HWPfChaDl4gPllPhshft1 = 0x00C4011C,
+ HWPfChaDl4gPllDivq0 = 0x00C40120,
+ HWPfChaDl4gPllDivq1 = 0x00C40124,
+ HWPfChaDl4gPllDivq2 = 0x00C40128,
+ HWPfChaDl4gPllDivq3 = 0x00C4012C,
+ HWPfChaDl4gPllDivq4 = 0x00C40130,
+ HWPfChaDl4gPllDivq5 = 0x00C40134,
+ HWPfChaDl4gPllDivq6 = 0x00C40138,
+ HWPfChaDl4gPllDivq7 = 0x00C4013C,
+ HWPfChaUl5gPllPllrst = 0x00C40180,
+ HWPfChaUl5gPllClk0 = 0x00C40184,
+ HWPfChaUl5gPllClk1 = 0x00C40188,
+ HWPfChaUl5gPllBwadj = 0x00C4018C,
+ HWPfChaUl5gPllLbw = 0x00C40190,
+ HWPfChaUl5gPllResetq = 0x00C40194,
+ HWPfChaUl5gPllPhshft0 = 0x00C40198,
+ HWPfChaUl5gPllPhshft1 = 0x00C4019C,
+ HWPfChaUl5gPllDivq0 = 0x00C401A0,
+ HWPfChaUl5gPllDivq1 = 0x00C401A4,
+ HWPfChaUl5gPllDivq2 = 0x00C401A8,
+ HWPfChaUl5gPllDivq3 = 0x00C401AC,
+ HWPfChaUl5gPllDivq4 = 0x00C401B0,
+ HWPfChaUl5gPllDivq5 = 0x00C401B4,
+ HWPfChaUl5gPllDivq6 = 0x00C401B8,
+ HWPfChaUl5gPllDivq7 = 0x00C401BC,
+ HWPfChaUl4gPllPllrst = 0x00C40200,
+ HWPfChaUl4gPllClk0 = 0x00C40204,
+ HWPfChaUl4gPllClk1 = 0x00C40208,
+ HWPfChaUl4gPllBwadj = 0x00C4020C,
+ HWPfChaUl4gPllLbw = 0x00C40210,
+ HWPfChaUl4gPllResetq = 0x00C40214,
+ HWPfChaUl4gPllPhshft0 = 0x00C40218,
+ HWPfChaUl4gPllPhshft1 = 0x00C4021C,
+ HWPfChaUl4gPllDivq0 = 0x00C40220,
+ HWPfChaUl4gPllDivq1 = 0x00C40224,
+ HWPfChaUl4gPllDivq2 = 0x00C40228,
+ HWPfChaUl4gPllDivq3 = 0x00C4022C,
+ HWPfChaUl4gPllDivq4 = 0x00C40230,
+ HWPfChaUl4gPllDivq5 = 0x00C40234,
+ HWPfChaUl4gPllDivq6 = 0x00C40238,
+ HWPfChaUl4gPllDivq7 = 0x00C4023C,
+ HWPfChaDdrPllPllrst = 0x00C40280,
+ HWPfChaDdrPllClk0 = 0x00C40284,
+ HWPfChaDdrPllClk1 = 0x00C40288,
+ HWPfChaDdrPllBwadj = 0x00C4028C,
+ HWPfChaDdrPllLbw = 0x00C40290,
+ HWPfChaDdrPllResetq = 0x00C40294,
+ HWPfChaDdrPllPhshft0 = 0x00C40298,
+ HWPfChaDdrPllPhshft1 = 0x00C4029C,
+ HWPfChaDdrPllDivq0 = 0x00C402A0,
+ HWPfChaDdrPllDivq1 = 0x00C402A4,
+ HWPfChaDdrPllDivq2 = 0x00C402A8,
+ HWPfChaDdrPllDivq3 = 0x00C402AC,
+ HWPfChaDdrPllDivq4 = 0x00C402B0,
+ HWPfChaDdrPllDivq5 = 0x00C402B4,
+ HWPfChaDdrPllDivq6 = 0x00C402B8,
+ HWPfChaDdrPllDivq7 = 0x00C402BC,
+ HWPfChaErrStatus = 0x00C40400,
+ HWPfChaErrMask = 0x00C40404,
+ HWPfChaDebugPcieMsiFifo = 0x00C40410,
+ HWPfChaDebugDdrMsiFifo = 0x00C40414,
+ HWPfChaDebugMiscMsiFifo = 0x00C40418,
+ HWPfChaPwmSet = 0x00C40420,
+ HWPfChaDdrRstStatus = 0x00C40430,
+ HWPfChaDdrStDoneStatus = 0x00C40434,
+ HWPfChaDdrWbRstCfg = 0x00C40438,
+ HWPfChaDdrApbRstCfg = 0x00C4043C,
+ HWPfChaDdrPhyRstCfg = 0x00C40440,
+ HWPfChaDdrCpuRstCfg = 0x00C40444,
+ HWPfChaDdrSifRstCfg = 0x00C40448,
+ HWPfChaPadcfgPcomp0 = 0x00C41000,
+ HWPfChaPadcfgNcomp0 = 0x00C41004,
+ HWPfChaPadcfgOdt0 = 0x00C41008,
+ HWPfChaPadcfgProtect0 = 0x00C4100C,
+ HWPfChaPreemphasisProtect0 = 0x00C41010,
+ HWPfChaPreemphasisCompen0 = 0x00C41040,
+ HWPfChaPreemphasisOdten0 = 0x00C41044,
+ HWPfChaPadcfgPcomp1 = 0x00C41100,
+ HWPfChaPadcfgNcomp1 = 0x00C41104,
+ HWPfChaPadcfgOdt1 = 0x00C41108,
+ HWPfChaPadcfgProtect1 = 0x00C4110C,
+ HWPfChaPreemphasisProtect1 = 0x00C41110,
+ HWPfChaPreemphasisCompen1 = 0x00C41140,
+ HWPfChaPreemphasisOdten1 = 0x00C41144,
+ HWPfChaPadcfgPcomp2 = 0x00C41200,
+ HWPfChaPadcfgNcomp2 = 0x00C41204,
+ HWPfChaPadcfgOdt2 = 0x00C41208,
+ HWPfChaPadcfgProtect2 = 0x00C4120C,
+ HWPfChaPreemphasisProtect2 = 0x00C41210,
+ HWPfChaPreemphasisCompen2 = 0x00C41240,
+ HWPfChaPreemphasisOdten4 = 0x00C41444,
+ HWPfChaPreemphasisOdten2 = 0x00C41244,
+ HWPfChaPadcfgPcomp3 = 0x00C41300,
+ HWPfChaPadcfgNcomp3 = 0x00C41304,
+ HWPfChaPadcfgOdt3 = 0x00C41308,
+ HWPfChaPadcfgProtect3 = 0x00C4130C,
+ HWPfChaPreemphasisProtect3 = 0x00C41310,
+ HWPfChaPreemphasisCompen3 = 0x00C41340,
+ HWPfChaPreemphasisOdten3 = 0x00C41344,
+ HWPfChaPadcfgPcomp4 = 0x00C41400,
+ HWPfChaPadcfgNcomp4 = 0x00C41404,
+ HWPfChaPadcfgOdt4 = 0x00C41408,
+ HWPfChaPadcfgProtect4 = 0x00C4140C,
+ HWPfChaPreemphasisProtect4 = 0x00C41410,
+ HWPfChaPreemphasisCompen4 = 0x00C41440,
+ HWPfHiVfToPfDbellVf = 0x00C80000,
+ HWPfHiPfToVfDbellVf = 0x00C80008,
+ HWPfHiInfoRingBaseLoVf = 0x00C80010,
+ HWPfHiInfoRingBaseHiVf = 0x00C80014,
+ HWPfHiInfoRingPointerVf = 0x00C80018,
+ HWPfHiInfoRingIntWrEnVf = 0x00C80020,
+ HWPfHiInfoRingPf2VfWrEnVf = 0x00C80024,
+ HWPfHiMsixVectorMapperVf = 0x00C80060,
+ HWPfHiModuleVersionReg = 0x00C84000,
+ HWPfHiIosf2axiErrLogReg = 0x00C84004,
+ HWPfHiHardResetReg = 0x00C84008,
+ HWPfHi5GHardResetReg = 0x00C8400C,
+ HWPfHiInfoRingBaseLoRegPf = 0x00C84010,
+ HWPfHiInfoRingBaseHiRegPf = 0x00C84014,
+ HWPfHiInfoRingPointerRegPf = 0x00C84018,
+ HWPfHiInfoRingIntWrEnRegPf = 0x00C84020,
+ HWPfHiInfoRingVf2pfLoWrEnReg = 0x00C84024,
+ HWPfHiInfoRingVf2pfHiWrEnReg = 0x00C84028,
+ HWPfHiLogParityErrStatusReg = 0x00C8402C,
+ HWPfHiLogDataParityErrorVfStatusLo = 0x00C84030,
+ HWPfHiLogDataParityErrorVfStatusHi = 0x00C84034,
+ HWPfHiBlockTransmitOnErrorEn = 0x00C84038,
+ HWPfHiCfgMsiIntWrEnRegPf = 0x00C84040,
+ HWPfHiCfgMsiVf2pfLoWrEnReg = 0x00C84044,
+ HWPfHiCfgMsiVf2pfHighWrEnReg = 0x00C84048,
+ HWPfHiMsixVectorMapperPf = 0x00C84060,
+ HWPfHiApbWrWaitTime = 0x00C84100,
+ HWPfHiXCounterMaxValue = 0x00C84104,
+ HWPfHiPfMode = 0x00C84108,
+ HWPfHiClkGateHystReg = 0x00C8410C,
+ HWPfHiSnoopBitsReg = 0x00C84110,
+ HWPfHiMsiDropEnableReg = 0x00C84114,
+ HWPfHiMsiStatReg = 0x00C84120,
+ HWPfHiFifoOflStatReg = 0x00C84124,
+ HWPfHiHiDebugReg = 0x00C841F4,
+ HWPfHiDebugMemSnoopMsiFifo = 0x00C841F8,
+ HWPfHiDebugMemSnoopInputFifo = 0x00C841FC,
+ HWPfHiMsixMappingConfig = 0x00C84200,
+ HWPfHiJunkReg = 0x00C8FF00,
+ HWPfDdrUmmcVer = 0x00D00000,
+ HWPfDdrUmmcCap = 0x00D00010,
+ HWPfDdrUmmcCtrl = 0x00D00020,
+ HWPfDdrMpcPe = 0x00D00080,
+ HWPfDdrMpcPpri3 = 0x00D00090,
+ HWPfDdrMpcPpri2 = 0x00D000A0,
+ HWPfDdrMpcPpri1 = 0x00D000B0,
+ HWPfDdrMpcPpri0 = 0x00D000C0,
+ HWPfDdrMpcPrwgrpCtrl = 0x00D000D0,
+ HWPfDdrMpcPbw7 = 0x00D000E0,
+ HWPfDdrMpcPbw6 = 0x00D000F0,
+ HWPfDdrMpcPbw5 = 0x00D00100,
+ HWPfDdrMpcPbw4 = 0x00D00110,
+ HWPfDdrMpcPbw3 = 0x00D00120,
+ HWPfDdrMpcPbw2 = 0x00D00130,
+ HWPfDdrMpcPbw1 = 0x00D00140,
+ HWPfDdrMpcPbw0 = 0x00D00150,
+ HWPfDdrMemoryInit = 0x00D00200,
+ HWPfDdrMemoryInitDone = 0x00D00210,
+ HWPfDdrMemInitPhyTrng0 = 0x00D00240,
+ HWPfDdrMemInitPhyTrng1 = 0x00D00250,
+ HWPfDdrMemInitPhyTrng2 = 0x00D00260,
+ HWPfDdrMemInitPhyTrng3 = 0x00D00270,
+ HWPfDdrBcDram = 0x00D003C0,
+ HWPfDdrBcAddrMap = 0x00D003D0,
+ HWPfDdrBcRef = 0x00D003E0,
+ HWPfDdrBcTim0 = 0x00D00400,
+ HWPfDdrBcTim1 = 0x00D00410,
+ HWPfDdrBcTim2 = 0x00D00420,
+ HWPfDdrBcTim3 = 0x00D00430,
+ HWPfDdrBcTim4 = 0x00D00440,
+ HWPfDdrBcTim5 = 0x00D00450,
+ HWPfDdrBcTim6 = 0x00D00460,
+ HWPfDdrBcTim7 = 0x00D00470,
+ HWPfDdrBcTim8 = 0x00D00480,
+ HWPfDdrBcTim9 = 0x00D00490,
+ HWPfDdrBcTim10 = 0x00D004A0,
+ HWPfDdrBcTim12 = 0x00D004C0,
+ HWPfDdrDfiInit = 0x00D004D0,
+ HWPfDdrDfiInitComplete = 0x00D004E0,
+ HWPfDdrDfiTim0 = 0x00D004F0,
+ HWPfDdrDfiTim1 = 0x00D00500,
+ HWPfDdrDfiPhyUpdEn = 0x00D00530,
+ HWPfDdrMemStatus = 0x00D00540,
+ HWPfDdrUmmcErrStatus = 0x00D00550,
+ HWPfDdrUmmcIntStatus = 0x00D00560,
+ HWPfDdrUmmcIntEn = 0x00D00570,
+ HWPfDdrPhyRdLatency = 0x00D48400,
+ HWPfDdrPhyRdLatencyDbi = 0x00D48410,
+ HWPfDdrPhyWrLatency = 0x00D48420,
+ HWPfDdrPhyTrngType = 0x00D48430,
+ HWPfDdrPhyMrsTiming2 = 0x00D48440,
+ HWPfDdrPhyMrsTiming0 = 0x00D48450,
+ HWPfDdrPhyMrsTiming1 = 0x00D48460,
+ HWPfDdrPhyDramTmrd = 0x00D48470,
+ HWPfDdrPhyDramTmod = 0x00D48480,
+ HWPfDdrPhyDramTwpre = 0x00D48490,
+ HWPfDdrPhyDramTrfc = 0x00D484A0,
+ HWPfDdrPhyDramTrwtp = 0x00D484B0,
+ HWPfDdrPhyMr01Dimm = 0x00D484C0,
+ HWPfDdrPhyMr01DimmDbi = 0x00D484D0,
+ HWPfDdrPhyMr23Dimm = 0x00D484E0,
+ HWPfDdrPhyMr45Dimm = 0x00D484F0,
+ HWPfDdrPhyMr67Dimm = 0x00D48500,
+ HWPfDdrPhyWrlvlWwRdlvlRr = 0x00D48510,
+ HWPfDdrPhyOdtEn = 0x00D48520,
+ HWPfDdrPhyFastTrng = 0x00D48530,
+ HWPfDdrPhyDynTrngGap = 0x00D48540,
+ HWPfDdrPhyDynRcalGap = 0x00D48550,
+ HWPfDdrPhyIdletimeout = 0x00D48560,
+ HWPfDdrPhyRstCkeGap = 0x00D48570,
+ HWPfDdrPhyCkeMrsGap = 0x00D48580,
+ HWPfDdrPhyMemVrefMidVal = 0x00D48590,
+ HWPfDdrPhyVrefStep = 0x00D485A0,
+ HWPfDdrPhyVrefThreshold = 0x00D485B0,
+ HWPfDdrPhyPhyVrefMidVal = 0x00D485C0,
+ HWPfDdrPhyDqsCountMax = 0x00D485D0,
+ HWPfDdrPhyDqsCountNum = 0x00D485E0,
+ HWPfDdrPhyDramRow = 0x00D485F0,
+ HWPfDdrPhyDramCol = 0x00D48600,
+ HWPfDdrPhyDramBgBa = 0x00D48610,
+ HWPfDdrPhyDynamicUpdreqrel = 0x00D48620,
+ HWPfDdrPhyVrefLimits = 0x00D48630,
+ HWPfDdrPhyIdtmTcStatus = 0x00D6C020,
+ HWPfDdrPhyIdtmFwVersion = 0x00D6C410,
+ HWPfDdrPhyRdlvlGateInitDelay = 0x00D70000,
+ HWPfDdrPhyRdenSmplabc = 0x00D70008,
+ HWPfDdrPhyVrefNibble0 = 0x00D7000C,
+ HWPfDdrPhyVrefNibble1 = 0x00D70010,
+ HWPfDdrPhyRdlvlGateDqsSmpl0 = 0x00D70014,
+ HWPfDdrPhyRdlvlGateDqsSmpl1 = 0x00D70018,
+ HWPfDdrPhyRdlvlGateDqsSmpl2 = 0x00D7001C,
+ HWPfDdrPhyDqsCount = 0x00D70020,
+ HWPfDdrPhyWrlvlRdlvlGateStatus = 0x00D70024,
+ HWPfDdrPhyErrorFlags = 0x00D70028,
+ HWPfDdrPhyPowerDown = 0x00D70030,
+ HWPfDdrPhyPrbsSeedByte0 = 0x00D70034,
+ HWPfDdrPhyPrbsSeedByte1 = 0x00D70038,
+ HWPfDdrPhyPcompDq = 0x00D70040,
+ HWPfDdrPhyNcompDq = 0x00D70044,
+ HWPfDdrPhyPcompDqs = 0x00D70048,
+ HWPfDdrPhyNcompDqs = 0x00D7004C,
+ HWPfDdrPhyPcompCmd = 0x00D70050,
+ HWPfDdrPhyNcompCmd = 0x00D70054,
+ HWPfDdrPhyPcompCk = 0x00D70058,
+ HWPfDdrPhyNcompCk = 0x00D7005C,
+ HWPfDdrPhyRcalOdtDq = 0x00D70060,
+ HWPfDdrPhyRcalOdtDqs = 0x00D70064,
+ HWPfDdrPhyRcalMask1 = 0x00D70068,
+ HWPfDdrPhyRcalMask2 = 0x00D7006C,
+ HWPfDdrPhyRcalCtrl = 0x00D70070,
+ HWPfDdrPhyRcalCnt = 0x00D70074,
+ HWPfDdrPhyRcalOverride = 0x00D70078,
+ HWPfDdrPhyRcalGateen = 0x00D7007C,
+ HWPfDdrPhyCtrl = 0x00D70080,
+ HWPfDdrPhyWrlvlAlg = 0x00D70084,
+ HWPfDdrPhyRcalVreftTxcmdOdt = 0x00D70088,
+ HWPfDdrPhyRdlvlGateParam = 0x00D7008C,
+ HWPfDdrPhyRdlvlGateParam2 = 0x00D70090,
+ HWPfDdrPhyRcalVreftTxdata = 0x00D70094,
+ HWPfDdrPhyCmdIntDelay = 0x00D700A4,
+ HWPfDdrPhyAlertN = 0x00D700A8,
+ HWPfDdrPhyTrngReqWpre2tck = 0x00D700AC,
+ HWPfDdrPhyCmdPhaseSel = 0x00D700B4,
+ HWPfDdrPhyCmdDcdl = 0x00D700B8,
+ HWPfDdrPhyCkDcdl = 0x00D700BC,
+ HWPfDdrPhySwTrngCtrl1 = 0x00D700C0,
+ HWPfDdrPhySwTrngCtrl2 = 0x00D700C4,
+ HWPfDdrPhyRcalPcompRden = 0x00D700C8,
+ HWPfDdrPhyRcalNcompRden = 0x00D700CC,
+ HWPfDdrPhyRcalCompen = 0x00D700D0,
+ HWPfDdrPhySwTrngRdqs = 0x00D700D4,
+ HWPfDdrPhySwTrngWdqs = 0x00D700D8,
+ HWPfDdrPhySwTrngRdena = 0x00D700DC,
+ HWPfDdrPhySwTrngRdenb = 0x00D700E0,
+ HWPfDdrPhySwTrngRdenc = 0x00D700E4,
+ HWPfDdrPhySwTrngWdq = 0x00D700E8,
+ HWPfDdrPhySwTrngRdq = 0x00D700EC,
+ HWPfDdrPhyPcfgHmValue = 0x00D700F0,
+ HWPfDdrPhyPcfgTimerValue = 0x00D700F4,
+ HWPfDdrPhyPcfgSoftwareTraining = 0x00D700F8,
+ HWPfDdrPhyPcfgMcStatus = 0x00D700FC,
+ HWPfDdrPhyWrlvlPhRank0 = 0x00D70100,
+ HWPfDdrPhyRdenPhRank0 = 0x00D70104,
+ HWPfDdrPhyRdenIntRank0 = 0x00D70108,
+ HWPfDdrPhyRdqsDcdlRank0 = 0x00D7010C,
+ HWPfDdrPhyRdqsShadowDcdlRank0 = 0x00D70110,
+ HWPfDdrPhyWdqsDcdlRank0 = 0x00D70114,
+ HWPfDdrPhyWdmDcdlShadowRank0 = 0x00D70118,
+ HWPfDdrPhyWdmDcdlRank0 = 0x00D7011C,
+ HWPfDdrPhyDbiDcdlRank0 = 0x00D70120,
+ HWPfDdrPhyRdenDcdlaRank0 = 0x00D70124,
+ HWPfDdrPhyDbiDcdlShadowRank0 = 0x00D70128,
+ HWPfDdrPhyRdenDcdlbRank0 = 0x00D7012C,
+ HWPfDdrPhyWdqsShadowDcdlRank0 = 0x00D70130,
+ HWPfDdrPhyRdenDcdlcRank0 = 0x00D70134,
+ HWPfDdrPhyRdenShadowDcdlaRank0 = 0x00D70138,
+ HWPfDdrPhyWrlvlIntRank0 = 0x00D7013C,
+ HWPfDdrPhyRdqDcdlBit0Rank0 = 0x00D70200,
+ HWPfDdrPhyRdqDcdlShadowBit0Rank0 = 0x00D70204,
+ HWPfDdrPhyWdqDcdlBit0Rank0 = 0x00D70208,
+ HWPfDdrPhyWdqDcdlShadowBit0Rank0 = 0x00D7020C,
+ HWPfDdrPhyRdqDcdlBit1Rank0 = 0x00D70240,
+ HWPfDdrPhyRdqDcdlShadowBit1Rank0 = 0x00D70244,
+ HWPfDdrPhyWdqDcdlBit1Rank0 = 0x00D70248,
+ HWPfDdrPhyWdqDcdlShadowBit1Rank0 = 0x00D7024C,
+ HWPfDdrPhyRdqDcdlBit2Rank0 = 0x00D70280,
+ HWPfDdrPhyRdqDcdlShadowBit2Rank0 = 0x00D70284,
+ HWPfDdrPhyWdqDcdlBit2Rank0 = 0x00D70288,
+ HWPfDdrPhyWdqDcdlShadowBit2Rank0 = 0x00D7028C,
+ HWPfDdrPhyRdqDcdlBit3Rank0 = 0x00D702C0,
+ HWPfDdrPhyRdqDcdlShadowBit3Rank0 = 0x00D702C4,
+ HWPfDdrPhyWdqDcdlBit3Rank0 = 0x00D702C8,
+ HWPfDdrPhyWdqDcdlShadowBit3Rank0 = 0x00D702CC,
+ HWPfDdrPhyRdqDcdlBit4Rank0 = 0x00D70300,
+ HWPfDdrPhyRdqDcdlShadowBit4Rank0 = 0x00D70304,
+ HWPfDdrPhyWdqDcdlBit4Rank0 = 0x00D70308,
+ HWPfDdrPhyWdqDcdlShadowBit4Rank0 = 0x00D7030C,
+ HWPfDdrPhyRdqDcdlBit5Rank0 = 0x00D70340,
+ HWPfDdrPhyRdqDcdlShadowBit5Rank0 = 0x00D70344,
+ HWPfDdrPhyWdqDcdlBit5Rank0 = 0x00D70348,
+ HWPfDdrPhyWdqDcdlShadowBit5Rank0 = 0x00D7034C,
+ HWPfDdrPhyRdqDcdlBit6Rank0 = 0x00D70380,
+ HWPfDdrPhyRdqDcdlShadowBit6Rank0 = 0x00D70384,
+ HWPfDdrPhyWdqDcdlBit6Rank0 = 0x00D70388,
+ HWPfDdrPhyWdqDcdlShadowBit6Rank0 = 0x00D7038C,
+ HWPfDdrPhyRdqDcdlBit7Rank0 = 0x00D703C0,
+ HWPfDdrPhyRdqDcdlShadowBit7Rank0 = 0x00D703C4,
+ HWPfDdrPhyWdqDcdlBit7Rank0 = 0x00D703C8,
+ HWPfDdrPhyWdqDcdlShadowBit7Rank0 = 0x00D703CC,
+ HWPfDdrPhyIdtmStatus = 0x00D740D0,
+ HWPfDdrPhyIdtmError = 0x00D74110,
+ HWPfDdrPhyIdtmDebug = 0x00D74120,
+ HWPfDdrPhyIdtmDebugInt = 0x00D74130,
+ HwPfPcieLnAsicCfgovr = 0x00D80000,
+ HwPfPcieLnAclkmixer = 0x00D80004,
+ HwPfPcieLnTxrampfreq = 0x00D80008,
+ HwPfPcieLnLanetest = 0x00D8000C,
+ HwPfPcieLnDcctrl = 0x00D80010,
+ HwPfPcieLnDccmeas = 0x00D80014,
+ HwPfPcieLnDccovrAclk = 0x00D80018,
+ HwPfPcieLnDccovrTxa = 0x00D8001C,
+ HwPfPcieLnDccovrTxk = 0x00D80020,
+ HwPfPcieLnDccovrDclk = 0x00D80024,
+ HwPfPcieLnDccovrEclk = 0x00D80028,
+ HwPfPcieLnDcctrimAclk = 0x00D8002C,
+ HwPfPcieLnDcctrimTx = 0x00D80030,
+ HwPfPcieLnDcctrimDclk = 0x00D80034,
+ HwPfPcieLnDcctrimEclk = 0x00D80038,
+ HwPfPcieLnQuadCtrl = 0x00D8003C,
+ HwPfPcieLnQuadCorrIndex = 0x00D80040,
+ HwPfPcieLnQuadCorrStatus = 0x00D80044,
+ HwPfPcieLnAsicRxovr1 = 0x00D80048,
+ HwPfPcieLnAsicRxovr2 = 0x00D8004C,
+ HwPfPcieLnAsicEqinfovr = 0x00D80050,
+ HwPfPcieLnRxcsr = 0x00D80054,
+ HwPfPcieLnRxfectrl = 0x00D80058,
+ HwPfPcieLnRxtest = 0x00D8005C,
+ HwPfPcieLnEscount = 0x00D80060,
+ HwPfPcieLnCdrctrl = 0x00D80064,
+ HwPfPcieLnCdrctrl2 = 0x00D80068,
+ HwPfPcieLnCdrcfg0Ctrl0 = 0x00D8006C,
+ HwPfPcieLnCdrcfg0Ctrl1 = 0x00D80070,
+ HwPfPcieLnCdrcfg0Ctrl2 = 0x00D80074,
+ HwPfPcieLnCdrcfg1Ctrl0 = 0x00D80078,
+ HwPfPcieLnCdrcfg1Ctrl1 = 0x00D8007C,
+ HwPfPcieLnCdrcfg1Ctrl2 = 0x00D80080,
+ HwPfPcieLnCdrcfg2Ctrl0 = 0x00D80084,
+ HwPfPcieLnCdrcfg2Ctrl1 = 0x00D80088,
+ HwPfPcieLnCdrcfg2Ctrl2 = 0x00D8008C,
+ HwPfPcieLnCdrcfg3Ctrl0 = 0x00D80090,
+ HwPfPcieLnCdrcfg3Ctrl1 = 0x00D80094,
+ HwPfPcieLnCdrcfg3Ctrl2 = 0x00D80098,
+ HwPfPcieLnCdrphase = 0x00D8009C,
+ HwPfPcieLnCdrfreq = 0x00D800A0,
+ HwPfPcieLnCdrstatusPhase = 0x00D800A4,
+ HwPfPcieLnCdrstatusFreq = 0x00D800A8,
+ HwPfPcieLnCdroffset = 0x00D800AC,
+ HwPfPcieLnRxvosctl = 0x00D800B0,
+ HwPfPcieLnRxvosctl2 = 0x00D800B4,
+ HwPfPcieLnRxlosctl = 0x00D800B8,
+ HwPfPcieLnRxlos = 0x00D800BC,
+ HwPfPcieLnRxlosvval = 0x00D800C0,
+ HwPfPcieLnRxvosd0 = 0x00D800C4,
+ HwPfPcieLnRxvosd1 = 0x00D800C8,
+ HwPfPcieLnRxvosep0 = 0x00D800CC,
+ HwPfPcieLnRxvosep1 = 0x00D800D0,
+ HwPfPcieLnRxvosen0 = 0x00D800D4,
+ HwPfPcieLnRxvosen1 = 0x00D800D8,
+ HwPfPcieLnRxvosafe = 0x00D800DC,
+ HwPfPcieLnRxvosa0 = 0x00D800E0,
+ HwPfPcieLnRxvosa0Out = 0x00D800E4,
+ HwPfPcieLnRxvosa1 = 0x00D800E8,
+ HwPfPcieLnRxvosa1Out = 0x00D800EC,
+ HwPfPcieLnRxmisc = 0x00D800F0,
+ HwPfPcieLnRxbeacon = 0x00D800F4,
+ HwPfPcieLnRxdssout = 0x00D800F8,
+ HwPfPcieLnRxdssout2 = 0x00D800FC,
+ HwPfPcieLnAlphapctrl = 0x00D80100,
+ HwPfPcieLnAlphanctrl = 0x00D80104,
+ HwPfPcieLnAdaptctrl = 0x00D80108,
+ HwPfPcieLnAdaptctrl1 = 0x00D8010C,
+ HwPfPcieLnAdaptstatus = 0x00D80110,
+ HwPfPcieLnAdaptvga1 = 0x00D80114,
+ HwPfPcieLnAdaptvga2 = 0x00D80118,
+ HwPfPcieLnAdaptvga3 = 0x00D8011C,
+ HwPfPcieLnAdaptvga4 = 0x00D80120,
+ HwPfPcieLnAdaptboost1 = 0x00D80124,
+ HwPfPcieLnAdaptboost2 = 0x00D80128,
+ HwPfPcieLnAdaptboost3 = 0x00D8012C,
+ HwPfPcieLnAdaptboost4 = 0x00D80130,
+ HwPfPcieLnAdaptsslms1 = 0x00D80134,
+ HwPfPcieLnAdaptsslms2 = 0x00D80138,
+ HwPfPcieLnAdaptvgaStatus = 0x00D8013C,
+ HwPfPcieLnAdaptboostStatus = 0x00D80140,
+ HwPfPcieLnAdaptsslmsStatus1 = 0x00D80144,
+ HwPfPcieLnAdaptsslmsStatus2 = 0x00D80148,
+ HwPfPcieLnAfectrl1 = 0x00D8014C,
+ HwPfPcieLnAfectrl2 = 0x00D80150,
+ HwPfPcieLnAfectrl3 = 0x00D80154,
+ HwPfPcieLnAfedefault1 = 0x00D80158,
+ HwPfPcieLnAfedefault2 = 0x00D8015C,
+ HwPfPcieLnDfectrl1 = 0x00D80160,
+ HwPfPcieLnDfectrl2 = 0x00D80164,
+ HwPfPcieLnDfectrl3 = 0x00D80168,
+ HwPfPcieLnDfectrl4 = 0x00D8016C,
+ HwPfPcieLnDfectrl5 = 0x00D80170,
+ HwPfPcieLnDfectrl6 = 0x00D80174,
+ HwPfPcieLnAfestatus1 = 0x00D80178,
+ HwPfPcieLnAfestatus2 = 0x00D8017C,
+ HwPfPcieLnDfestatus1 = 0x00D80180,
+ HwPfPcieLnDfestatus2 = 0x00D80184,
+ HwPfPcieLnDfestatus3 = 0x00D80188,
+ HwPfPcieLnDfestatus4 = 0x00D8018C,
+ HwPfPcieLnDfestatus5 = 0x00D80190,
+ HwPfPcieLnAlphastatus = 0x00D80194,
+ HwPfPcieLnFomctrl1 = 0x00D80198,
+ HwPfPcieLnFomctrl2 = 0x00D8019C,
+ HwPfPcieLnFomctrl3 = 0x00D801A0,
+ HwPfPcieLnAclkcalStatus = 0x00D801A4,
+ HwPfPcieLnOffscorrStatus = 0x00D801A8,
+ HwPfPcieLnEyewidthStatus = 0x00D801AC,
+ HwPfPcieLnEyeheightStatus = 0x00D801B0,
+ HwPfPcieLnAsicTxovr1 = 0x00D801B4,
+ HwPfPcieLnAsicTxovr2 = 0x00D801B8,
+ HwPfPcieLnAsicTxovr3 = 0x00D801BC,
+ HwPfPcieLnTxbiasadjOvr = 0x00D801C0,
+ HwPfPcieLnTxcsr = 0x00D801C4,
+ HwPfPcieLnTxtest = 0x00D801C8,
+ HwPfPcieLnTxtestword = 0x00D801CC,
+ HwPfPcieLnTxtestwordHigh = 0x00D801D0,
+ HwPfPcieLnTxdrive = 0x00D801D4,
+ HwPfPcieLnMtcsLn = 0x00D801D8,
+ HwPfPcieLnStatsumLn = 0x00D801DC,
+ HwPfPcieLnRcbusScratch = 0x00D801E0,
+ HwPfPcieLnRcbusMinorrev = 0x00D801F0,
+ HwPfPcieLnRcbusMajorrev = 0x00D801F4,
+ HwPfPcieLnRcbusBlocktype = 0x00D801F8,
+ HwPfPcieSupPllcsr = 0x00D80800,
+ HwPfPcieSupPlldiv = 0x00D80804,
+ HwPfPcieSupPllcal = 0x00D80808,
+ HwPfPcieSupPllcalsts = 0x00D8080C,
+ HwPfPcieSupPllmeas = 0x00D80810,
+ HwPfPcieSupPlldactrim = 0x00D80814,
+ HwPfPcieSupPllbiastrim = 0x00D80818,
+ HwPfPcieSupPllbwtrim = 0x00D8081C,
+ HwPfPcieSupPllcaldly = 0x00D80820,
+ HwPfPcieSupRefclkonpclkctrl = 0x00D80824,
+ HwPfPcieSupPclkdelay = 0x00D80828,
+ HwPfPcieSupPhyconfig = 0x00D8082C,
+ HwPfPcieSupRcalIntf = 0x00D80830,
+ HwPfPcieSupAuxcsr = 0x00D80834,
+ HwPfPcieSupVref = 0x00D80838,
+ HwPfPcieSupLinkmode = 0x00D8083C,
+ HwPfPcieSupRrefcalctl = 0x00D80840,
+ HwPfPcieSupRrefcal = 0x00D80844,
+ HwPfPcieSupRrefcaldly = 0x00D80848,
+ HwPfPcieSupTximpcalctl = 0x00D8084C,
+ HwPfPcieSupTximpcal = 0x00D80850,
+ HwPfPcieSupTximpoffset = 0x00D80854,
+ HwPfPcieSupTximpcaldly = 0x00D80858,
+ HwPfPcieSupRximpcalctl = 0x00D8085C,
+ HwPfPcieSupRximpcal = 0x00D80860,
+ HwPfPcieSupRximpoffset = 0x00D80864,
+ HwPfPcieSupRximpcaldly = 0x00D80868,
+ HwPfPcieSupFence = 0x00D8086C,
+ HwPfPcieSupMtcs = 0x00D80870,
+ HwPfPcieSupStatsum = 0x00D809B8,
+ HwPfPciePcsDpStatus0 = 0x00D81000,
+ HwPfPciePcsDpControl0 = 0x00D81004,
+ HwPfPciePcsPmaStatusLane0 = 0x00D81008,
+ HwPfPciePcsPipeStatusLane0 = 0x00D8100C,
+ HwPfPciePcsTxdeemph0Lane0 = 0x00D81010,
+ HwPfPciePcsTxdeemph1Lane0 = 0x00D81014,
+ HwPfPciePcsInternalStatusLane0 = 0x00D81018,
+ HwPfPciePcsDpStatus1 = 0x00D8101C,
+ HwPfPciePcsDpControl1 = 0x00D81020,
+ HwPfPciePcsPmaStatusLane1 = 0x00D81024,
+ HwPfPciePcsPipeStatusLane1 = 0x00D81028,
+ HwPfPciePcsTxdeemph0Lane1 = 0x00D8102C,
+ HwPfPciePcsTxdeemph1Lane1 = 0x00D81030,
+ HwPfPciePcsInternalStatusLane1 = 0x00D81034,
+ HwPfPciePcsDpStatus2 = 0x00D81038,
+ HwPfPciePcsDpControl2 = 0x00D8103C,
+ HwPfPciePcsPmaStatusLane2 = 0x00D81040,
+ HwPfPciePcsPipeStatusLane2 = 0x00D81044,
+ HwPfPciePcsTxdeemph0Lane2 = 0x00D81048,
+ HwPfPciePcsTxdeemph1Lane2 = 0x00D8104C,
+ HwPfPciePcsInternalStatusLane2 = 0x00D81050,
+ HwPfPciePcsDpStatus3 = 0x00D81054,
+ HwPfPciePcsDpControl3 = 0x00D81058,
+ HwPfPciePcsPmaStatusLane3 = 0x00D8105C,
+ HwPfPciePcsPipeStatusLane3 = 0x00D81060,
+ HwPfPciePcsTxdeemph0Lane3 = 0x00D81064,
+ HwPfPciePcsTxdeemph1Lane3 = 0x00D81068,
+ HwPfPciePcsInternalStatusLane3 = 0x00D8106C,
+ HwPfPciePcsEbStatus0 = 0x00D81070,
+ HwPfPciePcsEbStatus1 = 0x00D81074,
+ HwPfPciePcsEbStatus2 = 0x00D81078,
+ HwPfPciePcsEbStatus3 = 0x00D8107C,
+ HwPfPciePcsPllSettingPcieG1 = 0x00D81088,
+ HwPfPciePcsPllSettingPcieG2 = 0x00D8108C,
+ HwPfPciePcsPllSettingPcieG3 = 0x00D81090,
+ HwPfPciePcsControl = 0x00D81094,
+ HwPfPciePcsEqControl = 0x00D81098,
+ HwPfPciePcsEqTimer = 0x00D8109C,
+ HwPfPciePcsEqErrStatus = 0x00D810A0,
+ HwPfPciePcsEqErrCount = 0x00D810A4,
+ HwPfPciePcsStatus = 0x00D810A8,
+ HwPfPciePcsMiscRegister = 0x00D810AC,
+ HwPfPciePcsObsControl = 0x00D810B0,
+ HwPfPciePcsPrbsCount0 = 0x00D81200,
+ HwPfPciePcsBistControl0 = 0x00D81204,
+ HwPfPciePcsBistStaticWord00 = 0x00D81208,
+ HwPfPciePcsBistStaticWord10 = 0x00D8120C,
+ HwPfPciePcsBistStaticWord20 = 0x00D81210,
+ HwPfPciePcsBistStaticWord30 = 0x00D81214,
+ HwPfPciePcsPrbsCount1 = 0x00D81220,
+ HwPfPciePcsBistControl1 = 0x00D81224,
+ HwPfPciePcsBistStaticWord01 = 0x00D81228,
+ HwPfPciePcsBistStaticWord11 = 0x00D8122C,
+ HwPfPciePcsBistStaticWord21 = 0x00D81230,
+ HwPfPciePcsBistStaticWord31 = 0x00D81234,
+ HwPfPciePcsPrbsCount2 = 0x00D81240,
+ HwPfPciePcsBistControl2 = 0x00D81244,
+ HwPfPciePcsBistStaticWord02 = 0x00D81248,
+ HwPfPciePcsBistStaticWord12 = 0x00D8124C,
+ HwPfPciePcsBistStaticWord22 = 0x00D81250,
+ HwPfPciePcsBistStaticWord32 = 0x00D81254,
+ HwPfPciePcsPrbsCount3 = 0x00D81260,
+ HwPfPciePcsBistControl3 = 0x00D81264,
+ HwPfPciePcsBistStaticWord03 = 0x00D81268,
+ HwPfPciePcsBistStaticWord13 = 0x00D8126C,
+ HwPfPciePcsBistStaticWord23 = 0x00D81270,
+ HwPfPciePcsBistStaticWord33 = 0x00D81274,
+ HwPfPcieGpexLtssmStateCntrl = 0x00D90400,
+ HwPfPcieGpexLtssmStateStatus = 0x00D90404,
+ HwPfPcieGpexSkipFreqTimer = 0x00D90408,
+ HwPfPcieGpexLaneSelect = 0x00D9040C,
+ HwPfPcieGpexLaneDeskew = 0x00D90410,
+ HwPfPcieGpexRxErrorStatus = 0x00D90414,
+ HwPfPcieGpexLaneNumControl = 0x00D90418,
+ HwPfPcieGpexNFstControl = 0x00D9041C,
+ HwPfPcieGpexLinkStatus = 0x00D90420,
+ HwPfPcieGpexAckReplayTimeout = 0x00D90438,
+ HwPfPcieGpexSeqNumberStatus = 0x00D9043C,
+ HwPfPcieGpexCoreClkRatio = 0x00D90440,
+ HwPfPcieGpexDllTholdControl = 0x00D90448,
+ HwPfPcieGpexPmTimer = 0x00D90450,
+ HwPfPcieGpexPmeTimeout = 0x00D90454,
+ HwPfPcieGpexAspmL1Timer = 0x00D90458,
+ HwPfPcieGpexAspmReqTimer = 0x00D9045C,
+ HwPfPcieGpexAspmL1Dis = 0x00D90460,
+ HwPfPcieGpexAdvisoryErrorControl = 0x00D90468,
+ HwPfPcieGpexId = 0x00D90470,
+ HwPfPcieGpexClasscode = 0x00D90474,
+ HwPfPcieGpexSubsystemId = 0x00D90478,
+ HwPfPcieGpexDeviceCapabilities = 0x00D9047C,
+ HwPfPcieGpexLinkCapabilities = 0x00D90480,
+ HwPfPcieGpexFunctionNumber = 0x00D90484,
+ HwPfPcieGpexPmCapabilities = 0x00D90488,
+ HwPfPcieGpexFunctionSelect = 0x00D9048C,
+ HwPfPcieGpexErrorCounter = 0x00D904AC,
+ HwPfPcieGpexConfigReady = 0x00D904B0,
+ HwPfPcieGpexFcUpdateTimeout = 0x00D904B8,
+ HwPfPcieGpexFcUpdateTimer = 0x00D904BC,
+ HwPfPcieGpexVcBufferLoad = 0x00D904C8,
+ HwPfPcieGpexVcBufferSizeThold = 0x00D904CC,
+ HwPfPcieGpexVcBufferSelect = 0x00D904D0,
+ HwPfPcieGpexBarEnable = 0x00D904D4,
+ HwPfPcieGpexBarDwordLower = 0x00D904D8,
+ HwPfPcieGpexBarDwordUpper = 0x00D904DC,
+ HwPfPcieGpexBarSelect = 0x00D904E0,
+ HwPfPcieGpexCreditCounterSelect = 0x00D904E4,
+ HwPfPcieGpexCreditCounterStatus = 0x00D904E8,
+ HwPfPcieGpexTlpHeaderSelect = 0x00D904EC,
+ HwPfPcieGpexTlpHeaderDword0 = 0x00D904F0,
+ HwPfPcieGpexTlpHeaderDword1 = 0x00D904F4,
+ HwPfPcieGpexTlpHeaderDword2 = 0x00D904F8,
+ HwPfPcieGpexTlpHeaderDword3 = 0x00D904FC,
+ HwPfPcieGpexRelaxOrderControl = 0x00D90500,
+ HwPfPcieGpexBarPrefetch = 0x00D90504,
+ HwPfPcieGpexFcCheckControl = 0x00D90508,
+ HwPfPcieGpexFcUpdateTimerTraffic = 0x00D90518,
+ HwPfPcieGpexPhyControl0 = 0x00D9053C,
+ HwPfPcieGpexPhyControl1 = 0x00D90544,
+ HwPfPcieGpexPhyControl2 = 0x00D9054C,
+ HwPfPcieGpexUserControl0 = 0x00D9055C,
+ HwPfPcieGpexUncorrErrorStatus = 0x00D905F0,
+ HwPfPcieGpexRxCplError = 0x00D90620,
+ HwPfPcieGpexRxCplErrorDword0 = 0x00D90624,
+ HwPfPcieGpexRxCplErrorDword1 = 0x00D90628,
+ HwPfPcieGpexRxCplErrorDword2 = 0x00D9062C,
+ HwPfPcieGpexPabSwResetEn = 0x00D90630,
+ HwPfPcieGpexGen3Control0 = 0x00D90634,
+ HwPfPcieGpexGen3Control1 = 0x00D90638,
+ HwPfPcieGpexGen3Control2 = 0x00D9063C,
+ HwPfPcieGpexGen2ControlCsr = 0x00D90640,
+ HwPfPcieGpexTotalVfInitialVf0 = 0x00D90644,
+ HwPfPcieGpexTotalVfInitialVf1 = 0x00D90648,
+ HwPfPcieGpexSriovLinkDevId0 = 0x00D90684,
+ HwPfPcieGpexSriovLinkDevId1 = 0x00D90688,
+ HwPfPcieGpexSriovPageSize0 = 0x00D906C4,
+ HwPfPcieGpexSriovPageSize1 = 0x00D906C8,
+ HwPfPcieGpexIdVersion = 0x00D906FC,
+ HwPfPcieGpexSriovVfOffsetStride0 = 0x00D90704,
+ HwPfPcieGpexSriovVfOffsetStride1 = 0x00D90708,
+ HwPfPcieGpexGen3DeskewControl = 0x00D907B4,
+ HwPfPcieGpexGen3EqControl = 0x00D907B8,
+ HwPfPcieGpexBridgeVersion = 0x00D90800,
+ HwPfPcieGpexBridgeCapability = 0x00D90804,
+ HwPfPcieGpexBridgeControl = 0x00D90808,
+ HwPfPcieGpexBridgeStatus = 0x00D9080C,
+ HwPfPcieGpexEngineActivityStatus = 0x00D9081C,
+ HwPfPcieGpexEngineResetControl = 0x00D90820,
+ HwPfPcieGpexAxiPioControl = 0x00D90840,
+ HwPfPcieGpexAxiPioStatus = 0x00D90844,
+ HwPfPcieGpexAmbaSlaveCmdStatus = 0x00D90848,
+ HwPfPcieGpexPexPioControl = 0x00D908C0,
+ HwPfPcieGpexPexPioStatus = 0x00D908C4,
+ HwPfPcieGpexAmbaMasterStatus = 0x00D908C8,
+ HwPfPcieGpexCsrSlaveCmdStatus = 0x00D90920,
+ HwPfPcieGpexMailboxAxiControl = 0x00D90A50,
+ HwPfPcieGpexMailboxAxiData = 0x00D90A54,
+ HwPfPcieGpexMailboxPexControl = 0x00D90A90,
+ HwPfPcieGpexMailboxPexData = 0x00D90A94,
+ HwPfPcieGpexPexInterruptEnable = 0x00D90AD0,
+ HwPfPcieGpexPexInterruptStatus = 0x00D90AD4,
+ HwPfPcieGpexPexInterruptAxiPioVector = 0x00D90AD8,
+ HwPfPcieGpexPexInterruptPexPioVector = 0x00D90AE0,
+ HwPfPcieGpexPexInterruptMiscVector = 0x00D90AF8,
+ HwPfPcieGpexAmbaInterruptPioEnable = 0x00D90B00,
+ HwPfPcieGpexAmbaInterruptMiscEnable = 0x00D90B0C,
+ HwPfPcieGpexAmbaInterruptPioStatus = 0x00D90B10,
+ HwPfPcieGpexAmbaInterruptMiscStatus = 0x00D90B1C,
+ HwPfPcieGpexPexPmControl = 0x00D90B80,
+ HwPfPcieGpexSlotMisc = 0x00D90B88,
+ HwPfPcieGpexAxiAddrMappingControl = 0x00D90BA0,
+ HwPfPcieGpexAxiAddrMappingWindowAxiBase = 0x00D90BA4,
+ HwPfPcieGpexAxiAddrMappingWindowPexBaseLow = 0x00D90BA8,
+ HwPfPcieGpexAxiAddrMappingWindowPexBaseHigh = 0x00D90BAC,
+ HwPfPcieGpexPexBarAddrFunc0Bar0 = 0x00D91BA0,
+ HwPfPcieGpexPexBarAddrFunc0Bar1 = 0x00D91BA4,
+ HwPfPcieGpexAxiAddrMappingPcieHdrParam = 0x00D95BA0,
+ HwPfPcieGpexExtAxiAddrMappingAxiBase = 0x00D980A0,
+ HwPfPcieGpexPexExtBarAddrFunc0Bar0 = 0x00D984A0,
+ HwPfPcieGpexPexExtBarAddrFunc0Bar1 = 0x00D984A4,
+ HwPfPcieGpexAmbaInterruptFlrEnable = 0x00D9B960,
+ HwPfPcieGpexAmbaInterruptFlrStatus = 0x00D9B9A0,
+ HwPfPcieGpexExtAxiAddrMappingSize = 0x00D9BAF0,
+ HwPfPcieGpexPexPioAwcacheControl = 0x00D9C300,
+ HwPfPcieGpexPexPioArcacheControl = 0x00D9C304,
+ HwPfPcieGpexPabObSizeControlVc0 = 0x00D9C310
+};
+
+/* TIP PF Interrupt numbers */
+enum {
+ ACC100_PF_INT_QMGR_AQ_OVERFLOW = 0,
+ ACC100_PF_INT_DOORBELL_VF_2_PF = 1,
+ ACC100_PF_INT_DMA_DL_DESC_IRQ = 2,
+ ACC100_PF_INT_DMA_UL_DESC_IRQ = 3,
+ ACC100_PF_INT_DMA_MLD_DESC_IRQ = 4,
+ ACC100_PF_INT_DMA_UL5G_DESC_IRQ = 5,
+ ACC100_PF_INT_DMA_DL5G_DESC_IRQ = 6,
+ ACC100_PF_INT_ILLEGAL_FORMAT = 7,
+ ACC100_PF_INT_QMGR_DISABLED_ACCESS = 8,
+ ACC100_PF_INT_QMGR_AQ_OVERTHRESHOLD = 9,
+ ACC100_PF_INT_ARAM_ACCESS_ERR = 10,
+ ACC100_PF_INT_ARAM_ECC_1BIT_ERR = 11,
+ ACC100_PF_INT_PARITY_ERR = 12,
+ ACC100_PF_INT_QMGR_ERR = 13,
+ ACC100_PF_INT_INT_REQ_OVERFLOW = 14,
+ ACC100_PF_INT_APB_TIMEOUT = 15,
+};
+
+#endif /* ACC100_PF_ENUM_H */
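Every value above is a byte offset into the PF BAR0 MMIO space. As a minimal
sketch of how such an offset is consumed (assuming bar0 is the mapped PF BAR0
base; the driver introduces its own accessor later in this series, this helper
is only illustrative):

    #include <stdint.h>

    /* Sketch: 32-bit register read at one of the enum byte offsets.
     * Registers are little-endian; byte-swap on big-endian hosts omitted.
     */
    static inline uint32_t
    pf_reg_read(volatile void *bar0, uint32_t offset)
    {
        return *(volatile uint32_t *)((volatile uint8_t *)bar0 + offset);
    }

    /* usage: uint32_t link = pf_reg_read(bar0, HwPfPcieGpexLinkStatus); */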
diff --git a/drivers/baseband/acc100/acc100_vf_enum.h b/drivers/baseband/acc100/acc100_vf_enum.h
new file mode 100644
index 0000000..b512af3
--- /dev/null
+++ b/drivers/baseband/acc100/acc100_vf_enum.h
@@ -0,0 +1,73 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017 Intel Corporation
+ */
+
+#ifndef ACC100_VF_ENUM_H
+#define ACC100_VF_ENUM_H
+
+/*
+ * ACC100 Register mapping on VF BAR0
+ * This is automatically generated from RDL, format may change with new RDL
+ */
+enum {
+ HWVfQmgrIngressAq = 0x00000000,
+ HWVfHiVfToPfDbellVf = 0x00000800,
+ HWVfHiPfToVfDbellVf = 0x00000808,
+ HWVfHiInfoRingBaseLoVf = 0x00000810,
+ HWVfHiInfoRingBaseHiVf = 0x00000814,
+ HWVfHiInfoRingPointerVf = 0x00000818,
+ HWVfHiInfoRingIntWrEnVf = 0x00000820,
+ HWVfHiInfoRingPf2VfWrEnVf = 0x00000824,
+ HWVfHiMsixVectorMapperVf = 0x00000860,
+ HWVfDmaFec5GulDescBaseLoRegVf = 0x00000920,
+ HWVfDmaFec5GulDescBaseHiRegVf = 0x00000924,
+ HWVfDmaFec5GulRespPtrLoRegVf = 0x00000928,
+ HWVfDmaFec5GulRespPtrHiRegVf = 0x0000092C,
+ HWVfDmaFec5GdlDescBaseLoRegVf = 0x00000940,
+ HWVfDmaFec5GdlDescBaseHiRegVf = 0x00000944,
+ HWVfDmaFec5GdlRespPtrLoRegVf = 0x00000948,
+ HWVfDmaFec5GdlRespPtrHiRegVf = 0x0000094C,
+ HWVfDmaFec4GulDescBaseLoRegVf = 0x00000960,
+ HWVfDmaFec4GulDescBaseHiRegVf = 0x00000964,
+ HWVfDmaFec4GulRespPtrLoRegVf = 0x00000968,
+ HWVfDmaFec4GulRespPtrHiRegVf = 0x0000096C,
+ HWVfDmaFec4GdlDescBaseLoRegVf = 0x00000980,
+ HWVfDmaFec4GdlDescBaseHiRegVf = 0x00000984,
+ HWVfDmaFec4GdlRespPtrLoRegVf = 0x00000988,
+ HWVfDmaFec4GdlRespPtrHiRegVf = 0x0000098C,
+ HWVfDmaDdrBaseRangeRoVf = 0x000009A0,
+ HWVfQmgrAqResetVf = 0x00000E00,
+ HWVfQmgrRingSizeVf = 0x00000E04,
+ HWVfQmgrGrpDepthLog20Vf = 0x00000E08,
+ HWVfQmgrGrpDepthLog21Vf = 0x00000E0C,
+ HWVfQmgrGrpFunction0Vf = 0x00000E10,
+ HWVfQmgrGrpFunction1Vf = 0x00000E14,
+ HWVfPmACntrlRegVf = 0x00000F40,
+ HWVfPmACountVf = 0x00000F48,
+ HWVfPmAKCntLoVf = 0x00000F50,
+ HWVfPmAKCntHiVf = 0x00000F54,
+ HWVfPmADeltaCntLoVf = 0x00000F60,
+ HWVfPmADeltaCntHiVf = 0x00000F64,
+ HWVfPmBCntrlRegVf = 0x00000F80,
+ HWVfPmBCountVf = 0x00000F88,
+ HWVfPmBKCntLoVf = 0x00000F90,
+ HWVfPmBKCntHiVf = 0x00000F94,
+ HWVfPmBDeltaCntLoVf = 0x00000FA0,
+ HWVfPmBDeltaCntHiVf = 0x00000FA4
+};
+
+/* TIP VF Interrupt numbers */
+enum {
+ ACC100_VF_INT_QMGR_AQ_OVERFLOW = 0,
+ ACC100_VF_INT_DOORBELL_VF_2_PF = 1,
+ ACC100_VF_INT_DMA_DL_DESC_IRQ = 2,
+ ACC100_VF_INT_DMA_UL_DESC_IRQ = 3,
+ ACC100_VF_INT_DMA_MLD_DESC_IRQ = 4,
+ ACC100_VF_INT_DMA_UL5G_DESC_IRQ = 5,
+ ACC100_VF_INT_DMA_DL5G_DESC_IRQ = 6,
+ ACC100_VF_INT_ILLEGAL_FORMAT = 7,
+ ACC100_VF_INT_QMGR_DISABLED_ACCESS = 8,
+ ACC100_VF_INT_QMGR_AQ_OVERTHRESHOLD = 9,
+};
+
+#endif /* ACC100_VF_ENUM_H */
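The VF map is intentionally small: a VF only sees its own queues, doorbells
and counters. As a sketch (illustrative helper, assuming a mapped VF BAR0
base), notifying the PF is a single 32-bit doorbell write:

    /* Sketch: ring the VF-to-PF doorbell with an arbitrary payload */
    static inline void
    vf_to_pf_doorbell(volatile void *bar0, uint32_t payload)
    {
        *(volatile uint32_t *)((volatile uint8_t *)bar0 +
                HWVfHiVfToPfDbellVf) = payload;
    }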
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 6f46df0..cd77570 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -5,6 +5,9 @@
#ifndef _RTE_ACC100_PMD_H_
#define _RTE_ACC100_PMD_H_
+#include "acc100_pf_enum.h"
+#include "acc100_vf_enum.h"
+
/* Helper macro for logging */
#define rte_bbdev_log(level, fmt, ...) \
rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \
@@ -27,6 +30,493 @@
#define RTE_ACC100_PF_DEVICE_ID (0x0d5c)
#define RTE_ACC100_VF_DEVICE_ID (0x0d5d)
+/* Define as 1 to use only a single FEC engine */
+#ifndef RTE_ACC100_SINGLE_FEC
+#define RTE_ACC100_SINGLE_FEC 0
+#endif
+
+/* Values used in filling in descriptors */
+#define ACC100_DMA_DESC_TYPE 2
+#define ACC100_DMA_CODE_BLK_MODE 0
+#define ACC100_DMA_BLKID_FCW 1
+#define ACC100_DMA_BLKID_IN 2
+#define ACC100_DMA_BLKID_OUT_ENC 1
+#define ACC100_DMA_BLKID_OUT_HARD 1
+#define ACC100_DMA_BLKID_OUT_SOFT 2
+#define ACC100_DMA_BLKID_OUT_HARQ 3
+#define ACC100_DMA_BLKID_IN_HARQ 3
+
+/* Values used in filling in decode FCWs */
+#define ACC100_FCW_TD_VER 1
+#define ACC100_FCW_TD_EXT_COLD_REG_EN 1
+#define ACC100_FCW_TD_AUTOMAP 0x0f
+#define ACC100_FCW_TD_RVIDX_0 2
+#define ACC100_FCW_TD_RVIDX_1 26
+#define ACC100_FCW_TD_RVIDX_2 50
+#define ACC100_FCW_TD_RVIDX_3 74
+
+/* Values used in writing to the registers */
+#define ACC100_REG_IRQ_EN_ALL 0x1FF83FF /* Enable all interrupts */
+
+/* ACC100 Specific Dimensioning */
+#define ACC100_SIZE_64MBYTE (64*1024*1024)
+/* Number of elements in an Info Ring */
+#define ACC100_INFO_RING_NUM_ENTRIES 1024
+/* Number of elements in HARQ layout memory */
+#define ACC100_HARQ_LAYOUT (64*1024*1024)
+/* Assume offset for HARQ in memory */
+#define ACC100_HARQ_OFFSET (32*1024)
+/* Mask used to calculate an index in an Info Ring array (not a byte offset) */
+#define ACC100_INFO_RING_MASK (ACC100_INFO_RING_NUM_ENTRIES-1)
+/* Number of Virtual Functions ACC100 supports */
+#define ACC100_NUM_VFS 16
+#define ACC100_NUM_QGRPS 8
+#define ACC100_NUM_QGRPS_PER_WORD 8
+#define ACC100_NUM_AQS 16
+#define MAX_ENQ_BATCH_SIZE 255
+/* All ACC100 registers are aligned to 32 bits = 4 bytes */
+#define BYTES_IN_WORD 4
+#define MAX_E_MBUF 64000
+
+#define GRP_ID_SHIFT 10 /* Queue Index Hierarchy */
+#define VF_ID_SHIFT 4 /* Queue Index Hierarchy */
+#define VF_OFFSET_QOS 16 /* offset in Memory Space specific to QoS Mon */
+#define TMPL_PRI_0 0x03020100
+#define TMPL_PRI_1 0x07060504
+#define TMPL_PRI_2 0x0b0a0908
+#define TMPL_PRI_3 0x0f0e0d0c
+#define QUEUE_ENABLE 0x80000000 /* Bit to mark Queue as Enabled */
+#define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
+
+#define ACC100_NUM_TMPL 32
+/* Mapping of signals for the available engines */
+#define SIG_UL_5G 0
+#define SIG_UL_5G_LAST 7
+#define SIG_DL_5G 13
+#define SIG_DL_5G_LAST 15
+#define SIG_UL_4G 16
+#define SIG_UL_4G_LAST 21
+#define SIG_DL_4G 27
+#define SIG_DL_4G_LAST 31
+
+/* Max number of attempts to allocate a memory block for all rings */
+#define SW_RING_MEM_ALLOC_ATTEMPTS 5
+#define MAX_QUEUE_DEPTH 1024
+#define ACC100_DMA_MAX_NUM_POINTERS 14
+#define ACC100_DMA_DESC_PADDING 8
+#define ACC100_FCW_PADDING 12
+#define ACC100_DESC_FCW_OFFSET 192
+#define ACC100_DESC_SIZE 256
+#define ACC100_DESC_OFFSET (ACC100_DESC_SIZE / 64)
+#define ACC100_FCW_TE_BLEN 32
+#define ACC100_FCW_TD_BLEN 24
+#define ACC100_FCW_LE_BLEN 32
+#define ACC100_FCW_LD_BLEN 36
+
+#define ACC100_FCW_VER 2
+#define MUX_5GDL_DESC 6
+#define CMP_ENC_SIZE 20
+#define CMP_DEC_SIZE 24
+#define ENC_OFFSET (32)
+#define DEC_OFFSET (80)
+#define ACC100_EXT_MEM
+#define ACC100_HARQ_OFFSET_THRESHOLD 1024
+
+/* Constants from K0 computation from 3GPP 38.212 Table 5.4.2.1-2 */
+#define N_ZC_1 66 /* N = 66 Zc for BG 1 */
+#define N_ZC_2 50 /* N = 50 Zc for BG 2 */
+#define K0_1_1 17 /* K0 fraction numerator for rv 1 and BG 1 */
+#define K0_1_2 13 /* K0 fraction numerator for rv 1 and BG 2 */
+#define K0_2_1 33 /* K0 fraction numerator for rv 2 and BG 1 */
+#define K0_2_2 25 /* K0 fraction numerator for rv 2 and BG 2 */
+#define K0_3_1 56 /* K0 fraction numerator for rv 3 and BG 1 */
+#define K0_3_2 43 /* K0 fraction numerator for rv 3 and BG 2 */
+
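These numerators plug into the k0 starting-position formula of 3GPP TS 38.212
Table 5.4.2.1-2: for rv_index > 0, k0 = floor(numerator * Ncb / (N * Zc)) * Zc,
with N = 66*Zc for BG1 and 50*Zc for BG2. A sketch of the computation these
constants support (the helper name is an assumption; the real usage lands with
the LDPC patch later in this series):

    /* Sketch: k0 start position for LDPC circular-buffer rate matching.
     * Integer division provides the floor.
     */
    static inline uint16_t
    get_k0(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv_index)
    {
        if (rv_index == 0)
            return 0;
        if (bg == 1) {
            if (rv_index == 1)
                return (K0_1_1 * n_cb / (N_ZC_1 * z_c)) * z_c;
            if (rv_index == 2)
                return (K0_2_1 * n_cb / (N_ZC_1 * z_c)) * z_c;
            return (K0_3_1 * n_cb / (N_ZC_1 * z_c)) * z_c;
        }
        if (rv_index == 1)
            return (K0_1_2 * n_cb / (N_ZC_2 * z_c)) * z_c;
        if (rv_index == 2)
            return (K0_2_2 * n_cb / (N_ZC_2 * z_c)) * z_c;
        return (K0_3_2 * n_cb / (N_ZC_2 * z_c)) * z_c;
    }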
+/* ACC100 Configuration */
+#define ACC100_DDR_ECC_ENABLE
+#define ACC100_CFG_DMA_ERROR 0x3D7
+#define ACC100_CFG_AXI_CACHE 0x11
+#define ACC100_CFG_QMGR_HI_P 0x0F0F
+#define ACC100_CFG_PCI_AXI 0xC003
+#define ACC100_CFG_PCI_BRIDGE 0x40006033
+#define ACC100_ENGINE_OFFSET 0x1000
+#define ACC100_RESET_HI 0x20100
+#define ACC100_RESET_LO 0x20000
+#define ACC100_RESET_HARD 0x1FF
+#define ACC100_ENGINES_MAX 9
+#define LONG_WAIT 1000
+
+/* ACC100 DMA Descriptor triplet */
+struct acc100_dma_triplet {
+ uint64_t address;
+ uint32_t blen:20,
+ res0:4,
+ last:1,
+ dma_ext:1,
+ res1:2,
+ blkid:4;
+} __rte_packed;
+
+/* ACC100 DMA Response Descriptor */
+union acc100_dma_rsp_desc {
+ uint32_t val;
+ struct {
+ uint32_t crc_status:1,
+ synd_ok:1,
+ dma_err:1,
+ neg_stop:1,
+ fcw_err:1,
+ output_err:1,
+ input_err:1,
+ timestampEn:1,
+ iterCountFrac:8,
+ iter_cnt:8,
+ rsrvd3:6,
+ sdone:1,
+ fdone:1;
+ uint32_t add_info_0;
+ uint32_t add_info_1;
+ };
+};
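On completion the hardware writes this response layout over the first words of
the request descriptor, so the dequeue side can poll it in place. A sketch of
the completion check (illustrative; the actual dequeue path comes with the
processing patches later in the series):

    /* Sketch: poll one ring descriptor for completion and errors */
    union acc100_dma_rsp_desc rsp;
    rsp.val = desc->rsp.val;        /* one atomic 32-bit status read */
    if (rsp.fdone && rsp.sdone) {
        if (rsp.dma_err || rsp.fcw_err || rsp.input_err || rsp.output_err)
            ; /* flag the op as failed */
        /* iter_cnt reports the decoder iteration count used */
    }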
+
+/* ACC100 Queue Manager Enqueue PCI Register */
+union acc100_enqueue_reg_fmt {
+ uint32_t val;
+ struct {
+ uint32_t num_elem:8,
+ addr_offset:3,
+ rsrvd:1,
+ req_elem_addr:20;
+ };
+};
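A batch is kicked by a single 32-bit MMIO write of this union to the queue's
enqueue register: num_elem carries the batch size and req_elem_addr the ring
position of the first descriptor, in the units the hardware expects. A sketch
(illustrative; mmio_write() and q->mmio_reg_enqueue appear in later patches of
this series, and the exact req_elem_addr encoding is hardware-defined):

    /* Sketch: enqueue a batch of descriptors with one register write */
    union acc100_enqueue_reg_fmt enq_req;
    enq_req.val = 0;
    enq_req.addr_offset = ACC100_DESC_OFFSET;    /* 256B descriptors */
    enq_req.num_elem = num_descs;                /* <= MAX_ENQ_BATCH_SIZE */
    enq_req.req_elem_addr = first_desc_ring_pos; /* HW-defined encoding */
    mmio_write(q->mmio_reg_enqueue, enq_req.val);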
+
+/* FEC 4G Uplink Frame Control Word */
+struct __rte_packed acc100_fcw_td {
+ uint8_t fcw_ver:4,
+ num_maps:4; /* Unused */
+ uint8_t filler:6, /* Unused */
+ rsrvd0:1,
+ bypass_sb_deint:1;
+ uint16_t k_pos;
+ uint16_t k_neg; /* Unused */
+ uint8_t c_neg; /* Unused */
+ uint8_t c; /* Unused */
+ uint32_t ea; /* Unused */
+ uint32_t eb; /* Unused */
+ uint8_t cab; /* Unused */
+ uint8_t k0_start_col; /* Unused */
+ uint8_t rsrvd1;
+ uint8_t code_block_mode:1, /* Unused */
+ turbo_crc_type:1,
+ rsrvd2:3,
+ bypass_teq:1, /* Unused */
+ soft_output_en:1, /* Unused */
+ ext_td_cold_reg_en:1;
+ union { /* External Cold register */
+ uint32_t ext_td_cold_reg;
+ struct {
+ uint32_t min_iter:4, /* Unused */
+ max_iter:4,
+ ext_scale:5, /* Unused */
+ rsrvd3:3,
+ early_stop_en:1, /* Unused */
+ sw_soft_out_dis:1, /* Unused */
+ sw_et_cont:1, /* Unused */
+ sw_soft_out_saturation:1, /* Unused */
+ half_iter_on:1, /* Unused */
+ raw_decoder_input_on:1, /* Unused */
+ rsrvd4:10;
+ };
+ };
+};
+
+/* FEC 5GNR Uplink Frame Control Word */
+struct __rte_packed acc100_fcw_ld {
+ uint32_t FCWversion:4,
+ qm:4,
+ nfiller:11,
+ BG:1,
+ Zc:9,
+ res0:1,
+ synd_precoder:1,
+ synd_post:1;
+ uint32_t ncb:16,
+ k0:16;
+ uint32_t rm_e:24,
+ hcin_en:1,
+ hcout_en:1,
+ crc_select:1,
+ bypass_dec:1,
+ bypass_intlv:1,
+ so_en:1,
+ so_bypass_rm:1,
+ so_bypass_intlv:1;
+ uint32_t hcin_offset:16,
+ hcin_size0:16;
+ uint32_t hcin_size1:16,
+ hcin_decomp_mode:3,
+ llr_pack_mode:1,
+ hcout_comp_mode:3,
+ res2:1,
+ dec_convllr:4,
+ hcout_convllr:4;
+ uint32_t itmax:7,
+ itstop:1,
+ so_it:7,
+ res3:1,
+ hcout_offset:16;
+ uint32_t hcout_size0:16,
+ hcout_size1:16;
+ uint32_t gain_i:8,
+ gain_h:8,
+ negstop_th:16;
+ uint32_t negstop_it:7,
+ negstop_en:1,
+ res4:24;
+};
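Most of these fields map one-to-one onto struct rte_bbdev_op_ldpc_dec. A
sketch of a minimal fill for code-block mode (field encodings here are
illustrative; the real fill routine arrives with the LDPC processing patch
later in the series, and get_k0() refers to the sketch shown earlier):

    /* Sketch: minimal LDPC decode FCW fill from a bbdev op, CB mode */
    struct rte_bbdev_op_ldpc_dec *op = &bb_op->ldpc_dec;
    fcw->FCWversion = ACC100_FCW_VER;
    fcw->qm = op->q_m;
    fcw->nfiller = op->n_filler;
    fcw->BG = op->basegraph - 1;    /* 0 = BG1, 1 = BG2 */
    fcw->Zc = op->z_c;
    fcw->ncb = op->n_cb;
    fcw->k0 = get_k0(op->n_cb, op->z_c, op->basegraph, op->rv_index);
    fcw->rm_e = op->cb_params.e;
    fcw->itmax = op->iter_max;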
+
+/* FEC 4G Downlink Frame Control Word */
+struct __rte_packed acc100_fcw_te {
+ uint16_t k_neg;
+ uint16_t k_pos;
+ uint8_t c_neg;
+ uint8_t c;
+ uint8_t filler;
+ uint8_t cab;
+ uint32_t ea:17,
+ rsrvd0:15;
+ uint32_t eb:17,
+ rsrvd1:15;
+ uint16_t ncb_neg;
+ uint16_t ncb_pos;
+ uint8_t rv_idx0:2,
+ rsrvd2:2,
+ rv_idx1:2,
+ rsrvd3:2;
+ uint8_t bypass_rv_idx0:1,
+ bypass_rv_idx1:1,
+ bypass_rm:1,
+ rsrvd4:5;
+ uint8_t rsrvd5:1,
+ rsrvd6:3,
+ code_block_crc:1,
+ rsrvd7:3;
+ uint8_t code_block_mode:1,
+ rsrvd8:7;
+ uint64_t rsrvd9;
+};
+
+/* FEC 5GNR Downlink Frame Control Word */
+struct __rte_packed acc100_fcw_le {
+ uint32_t FCWversion:4,
+ qm:4,
+ nfiller:11,
+ BG:1,
+ Zc:9,
+ res0:3;
+ uint32_t ncb:16,
+ k0:16;
+ uint32_t rm_e:24,
+ res1:2,
+ crc_select:1,
+ res2:1,
+ bypass_intlv:1,
+ res3:3;
+ uint32_t res4_a:12,
+ mcb_count:3,
+ res4_b:17;
+ uint32_t res5;
+ uint32_t res6;
+ uint32_t res7;
+ uint32_t res8;
+};
+
+/* ACC100 DMA Request Descriptor */
+struct __rte_packed acc100_dma_req_desc {
+ union {
+ struct{
+ uint32_t type:4,
+ rsrvd0:26,
+ sdone:1,
+ fdone:1;
+ uint32_t rsrvd1;
+ uint32_t rsrvd2;
+ uint32_t pass_param:8,
+ sdone_enable:1,
+ irq_enable:1,
+ timeStampEn:1,
+ res0:5,
+ numCBs:4,
+ res1:4,
+ m2dlen:4,
+ d2mlen:4;
+ };
+ struct{
+ uint32_t word0;
+ uint32_t word1;
+ uint32_t word2;
+ uint32_t word3;
+ };
+ };
+ struct acc100_dma_triplet data_ptrs[ACC100_DMA_MAX_NUM_POINTERS];
+
+ /* Virtual addresses used to retrieve SW context info */
+ union {
+ void *op_addr;
+ uint64_t pad1; /* pad to 64 bits */
+ };
+ /*
+ * Stores additional information needed for driver processing:
+ * - last_desc_in_batch - flag used to mark last descriptor (CB)
+ * in batch
+ * - cbs_in_tb - stores information about total number of Code Blocks
+ * in currently processed Transport Block
+ */
+ union {
+ struct {
+ union {
+ struct acc100_fcw_ld fcw_ld;
+ struct acc100_fcw_td fcw_td;
+ struct acc100_fcw_le fcw_le;
+ struct acc100_fcw_te fcw_te;
+ uint32_t pad2[ACC100_FCW_PADDING];
+ };
+ uint32_t last_desc_in_batch :8,
+ cbs_in_tb:8,
+ pad4 : 16;
+ };
+ uint64_t pad3[ACC100_DMA_DESC_PADDING]; /* pad to 64 bits */
+ };
+};
+
+/* ACC100 DMA Descriptor */
+union acc100_dma_desc {
+ struct acc100_dma_req_desc req;
+ union acc100_dma_rsp_desc rsp;
+};
+
+/* Union describing HARQ layout entry */
+union acc100_harq_layout_data {
+ uint32_t val;
+ struct {
+ uint16_t offset;
+ uint16_t size0;
+ };
+} __rte_packed;
+
+/* Union describing Info Ring entry */
+union acc100_info_ring_data {
+ uint32_t val;
+ struct {
+ union {
+ uint16_t detailed_info;
+ struct {
+ uint16_t aq_id: 4;
+ uint16_t qg_id: 4;
+ uint16_t vf_id: 6;
+ uint16_t reserved: 2;
+ };
+ };
+ uint16_t int_nb: 7;
+ uint16_t msi_0: 1;
+ uint16_t vf2pf: 6;
+ uint16_t loop: 1;
+ uint16_t valid: 1;
+ };
+} __rte_packed;
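Each entry packs the interrupt source and, for queue events, the coordinates
of the offending queue. A sketch of how an entry would be consumed against the
ACC100_PF_INT_* numbers from acc100_pf_enum.h (illustrative; the interrupt
handler is added later in the series):

    /* Sketch: walk the info ring on interrupt */
    union acc100_info_ring_data info = ring[head & ACC100_INFO_RING_MASK];
    if (info.valid) {
        switch (info.int_nb) {
        case ACC100_PF_INT_DMA_DL_DESC_IRQ:
        case ACC100_PF_INT_DMA_UL_DESC_IRQ:
            /* completion: vf_id / qg_id / aq_id locate the queue */
            break;
        default:
            /* error condition, to be logged or recovered */
            break;
        }
        head++;
    }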
+
+struct acc100_registry_addr {
+ unsigned int dma_ring_dl5g_hi;
+ unsigned int dma_ring_dl5g_lo;
+ unsigned int dma_ring_ul5g_hi;
+ unsigned int dma_ring_ul5g_lo;
+ unsigned int dma_ring_dl4g_hi;
+ unsigned int dma_ring_dl4g_lo;
+ unsigned int dma_ring_ul4g_hi;
+ unsigned int dma_ring_ul4g_lo;
+ unsigned int ring_size;
+ unsigned int info_ring_hi;
+ unsigned int info_ring_lo;
+ unsigned int info_ring_en;
+ unsigned int info_ring_ptr;
+ unsigned int tail_ptrs_dl5g_hi;
+ unsigned int tail_ptrs_dl5g_lo;
+ unsigned int tail_ptrs_ul5g_hi;
+ unsigned int tail_ptrs_ul5g_lo;
+ unsigned int tail_ptrs_dl4g_hi;
+ unsigned int tail_ptrs_dl4g_lo;
+ unsigned int tail_ptrs_ul4g_hi;
+ unsigned int tail_ptrs_ul4g_lo;
+ unsigned int depth_log0_offset;
+ unsigned int depth_log1_offset;
+ unsigned int qman_group_func;
+ unsigned int ddr_range;
+};
+
+/* Structure holding registry addresses for PF */
+static const struct acc100_registry_addr pf_reg_addr = {
+ .dma_ring_dl5g_hi = HWPfDmaFec5GdlDescBaseHiRegVf,
+ .dma_ring_dl5g_lo = HWPfDmaFec5GdlDescBaseLoRegVf,
+ .dma_ring_ul5g_hi = HWPfDmaFec5GulDescBaseHiRegVf,
+ .dma_ring_ul5g_lo = HWPfDmaFec5GulDescBaseLoRegVf,
+ .dma_ring_dl4g_hi = HWPfDmaFec4GdlDescBaseHiRegVf,
+ .dma_ring_dl4g_lo = HWPfDmaFec4GdlDescBaseLoRegVf,
+ .dma_ring_ul4g_hi = HWPfDmaFec4GulDescBaseHiRegVf,
+ .dma_ring_ul4g_lo = HWPfDmaFec4GulDescBaseLoRegVf,
+ .ring_size = HWPfQmgrRingSizeVf,
+ .info_ring_hi = HWPfHiInfoRingBaseHiRegPf,
+ .info_ring_lo = HWPfHiInfoRingBaseLoRegPf,
+ .info_ring_en = HWPfHiInfoRingIntWrEnRegPf,
+ .info_ring_ptr = HWPfHiInfoRingPointerRegPf,
+ .tail_ptrs_dl5g_hi = HWPfDmaFec5GdlRespPtrHiRegVf,
+ .tail_ptrs_dl5g_lo = HWPfDmaFec5GdlRespPtrLoRegVf,
+ .tail_ptrs_ul5g_hi = HWPfDmaFec5GulRespPtrHiRegVf,
+ .tail_ptrs_ul5g_lo = HWPfDmaFec5GulRespPtrLoRegVf,
+ .tail_ptrs_dl4g_hi = HWPfDmaFec4GdlRespPtrHiRegVf,
+ .tail_ptrs_dl4g_lo = HWPfDmaFec4GdlRespPtrLoRegVf,
+ .tail_ptrs_ul4g_hi = HWPfDmaFec4GulRespPtrHiRegVf,
+ .tail_ptrs_ul4g_lo = HWPfDmaFec4GulRespPtrLoRegVf,
+ .depth_log0_offset = HWPfQmgrGrpDepthLog20Vf,
+ .depth_log1_offset = HWPfQmgrGrpDepthLog21Vf,
+ .qman_group_func = HWPfQmgrGrpFunction0,
+ .ddr_range = HWPfDmaVfDdrBaseRw,
+};
+
+/* Structure holding registry addresses for VF */
+static const struct acc100_registry_addr vf_reg_addr = {
+ .dma_ring_dl5g_hi = HWVfDmaFec5GdlDescBaseHiRegVf,
+ .dma_ring_dl5g_lo = HWVfDmaFec5GdlDescBaseLoRegVf,
+ .dma_ring_ul5g_hi = HWVfDmaFec5GulDescBaseHiRegVf,
+ .dma_ring_ul5g_lo = HWVfDmaFec5GulDescBaseLoRegVf,
+ .dma_ring_dl4g_hi = HWVfDmaFec4GdlDescBaseHiRegVf,
+ .dma_ring_dl4g_lo = HWVfDmaFec4GdlDescBaseLoRegVf,
+ .dma_ring_ul4g_hi = HWVfDmaFec4GulDescBaseHiRegVf,
+ .dma_ring_ul4g_lo = HWVfDmaFec4GulDescBaseLoRegVf,
+ .ring_size = HWVfQmgrRingSizeVf,
+ .info_ring_hi = HWVfHiInfoRingBaseHiVf,
+ .info_ring_lo = HWVfHiInfoRingBaseLoVf,
+ .info_ring_en = HWVfHiInfoRingIntWrEnVf,
+ .info_ring_ptr = HWVfHiInfoRingPointerVf,
+ .tail_ptrs_dl5g_hi = HWVfDmaFec5GdlRespPtrHiRegVf,
+ .tail_ptrs_dl5g_lo = HWVfDmaFec5GdlRespPtrLoRegVf,
+ .tail_ptrs_ul5g_hi = HWVfDmaFec5GulRespPtrHiRegVf,
+ .tail_ptrs_ul5g_lo = HWVfDmaFec5GulRespPtrLoRegVf,
+ .tail_ptrs_dl4g_hi = HWVfDmaFec4GdlRespPtrHiRegVf,
+ .tail_ptrs_dl4g_lo = HWVfDmaFec4GdlRespPtrLoRegVf,
+ .tail_ptrs_ul4g_hi = HWVfDmaFec4GulRespPtrHiRegVf,
+ .tail_ptrs_ul4g_lo = HWVfDmaFec4GulRespPtrLoRegVf,
+ .depth_log0_offset = HWVfQmgrGrpDepthLog20Vf,
+ .depth_log1_offset = HWVfQmgrGrpDepthLog21Vf,
+ .qman_group_func = HWVfQmgrGrpFunction0Vf,
+ .ddr_range = HWVfDmaDdrBaseRangeRoVf,
+};
+
/* Private data structure for each ACC100 device */
struct acc100_device {
void *mmio_base; /**< Base address of MMIO registers (BAR0) */
--
1.8.3.1
^ permalink raw reply [flat|nested] 213+ messages in thread
* [dpdk-dev] [PATCH v3 03/11] baseband/acc100: add info get function
2020-08-19 0:25 [dpdk-dev] [PATCH v3 00/11] bbdev PMD ACC100 Nicolas Chautru
2020-08-19 0:25 ` [dpdk-dev] [PATCH v3 01/11] drivers/baseband: add PMD for ACC100 Nicolas Chautru
2020-08-19 0:25 ` [dpdk-dev] [PATCH v3 02/11] baseband/acc100: add register definition file Nicolas Chautru
@ 2020-08-19 0:25 ` Nicolas Chautru
2020-08-19 0:25 ` [dpdk-dev] [PATCH v3 04/11] baseband/acc100: add queue configuration Nicolas Chautru
` (7 subsequent siblings)
10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-08-19 0:25 UTC (permalink / raw)
To: dev, akhil.goyal; +Cc: bruce.richardson, Nicolas Chautru
Add in the "info_get" function to the driver, to allow us to query the
device.
No processing capability are available yet.
Linking bbdev-test to support the PMD with null capability.
Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
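For context, once this lands an application can query the device through the
standard bbdev API; dev_id below is assumed to refer to a probed ACC100
device:

    #include <stdio.h>
    #include <rte_bbdev.h>

    struct rte_bbdev_info info;
    if (rte_bbdev_info_get(dev_id, &info) == 0)
        printf("%s: up to %u queues, min alignment %u\n",
                info.dev_name, info.drv.max_num_queues,
                info.drv.min_alignment);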
app/test-bbdev/Makefile | 3 +
app/test-bbdev/meson.build | 3 +
drivers/baseband/acc100/rte_acc100_cfg.h | 96 +++++++++++++
drivers/baseband/acc100/rte_acc100_pmd.c | 225 +++++++++++++++++++++++++++++++
drivers/baseband/acc100/rte_acc100_pmd.h | 3 +
5 files changed, 330 insertions(+)
create mode 100644 drivers/baseband/acc100/rte_acc100_cfg.h
diff --git a/app/test-bbdev/Makefile b/app/test-bbdev/Makefile
index dc29557..dbc3437 100644
--- a/app/test-bbdev/Makefile
+++ b/app/test-bbdev/Makefile
@@ -26,5 +26,8 @@ endif
ifeq ($(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC),y)
LDLIBS += -lrte_pmd_bbdev_fpga_5gnr_fec
endif
+ifeq ($(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100),y)
+LDLIBS += -lrte_pmd_bbdev_acc100
+endif
include $(RTE_SDK)/mk/rte.app.mk
diff --git a/app/test-bbdev/meson.build b/app/test-bbdev/meson.build
index 18ab6a8..fbd8ae3 100644
--- a/app/test-bbdev/meson.build
+++ b/app/test-bbdev/meson.build
@@ -12,3 +12,6 @@ endif
if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC')
deps += ['pmd_bbdev_fpga_5gnr_fec']
endif
+if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_ACC100')
+ deps += ['pmd_bbdev_acc100']
+endif
\ No newline at end of file
diff --git a/drivers/baseband/acc100/rte_acc100_cfg.h b/drivers/baseband/acc100/rte_acc100_cfg.h
new file mode 100644
index 0000000..73bbe36
--- /dev/null
+++ b/drivers/baseband/acc100/rte_acc100_cfg.h
@@ -0,0 +1,96 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_ACC100_CFG_H_
+#define _RTE_ACC100_CFG_H_
+
+/**
+ * @file rte_acc100_cfg.h
+ *
+ * Functions for configuring ACC100 HW, exposed directly to applications.
+ * Configuration related to encoding/decoding is done through the
+ * librte_bbdev library.
+ *
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ */
+
+#include <stdint.h>
+#include <stdbool.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+/** Number of Virtual Functions ACC100 supports */
+#define RTE_ACC100_NUM_VFS 16
+
+/**
+ * Definition of Queue Topology for ACC100 Configuration
+ * Some level of details is abstracted out to expose a clean interface
+ * given that comprehensive flexibility is not required
+ */
+struct rte_q_topology_t {
+ /** Number of QGroups in incremental order of priority */
+ uint16_t num_qgroups;
+ /**
+ * All QGroups have the same number of AQs here.
+ * Note : Could be made a 16-array if more flexibility is really
+ * required
+ */
+ uint16_t num_aqs_per_groups;
+ /**
+ * Depth of the AQs is the same for all QGroups here. Log2 Enum : 2^N
+ * Note : Could be made a 16-array if more flexibility is really
+ * required
+ */
+ uint16_t aq_depth_log2;
+ /**
+ * Index of the first Queue Group - assuming contiguity
+ * Initialized as -1
+ */
+ int8_t first_qgroup_index;
+};
+
+/**
+ * Definition of Arbitration related parameters for ACC100 Configuration
+ */
+struct rte_arbitration_t {
+ /** Default Weight for VF Fairness Arbitration */
+ uint16_t round_robin_weight;
+ uint32_t gbr_threshold1; /**< Guaranteed Bitrate Threshold 1 */
+ uint32_t gbr_threshold2; /**< Guaranteed Bitrate Threshold 2 */
+};
+
+/**
+ * Structure to pass ACC100 configuration.
+ * Note: all VF Bundles will have the same configuration.
+ */
+struct acc100_conf {
+ bool pf_mode_en; /**< 1 if PF is used for dataplane, 0 for VFs */
+ /** 1 if input '1' bit is represented by a positive LLR value, 0 if '1'
+ * bit is represented by a negative value.
+ */
+ bool input_pos_llr_1_bit;
+ /** 1 if output '1' bit is represented by a positive value, 0 if '1'
+ * bit is represented by a negative value.
+ */
+ bool output_pos_llr_1_bit;
+ uint16_t num_vf_bundles; /**< Number of VF bundles to setup */
+ /** Queue topology for each operation type */
+ struct rte_q_topology_t q_ul_4g;
+ struct rte_q_topology_t q_dl_4g;
+ struct rte_q_topology_t q_ul_5g;
+ struct rte_q_topology_t q_dl_5g;
+ /** Arbitration configuration for each operation type */
+ struct rte_arbitration_t arb_ul_4g[RTE_ACC100_NUM_VFS];
+ struct rte_arbitration_t arb_dl_4g[RTE_ACC100_NUM_VFS];
+ struct rte_arbitration_t arb_ul_5g[RTE_ACC100_NUM_VFS];
+ struct rte_arbitration_t arb_dl_5g[RTE_ACC100_NUM_VFS];
+};
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_ACC100_CFG_H_ */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 1b4cd13..7807a30 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -26,6 +26,184 @@
RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
#endif
+/* Read a register of an ACC100 device */
+static inline uint32_t
+acc100_reg_read(struct acc100_device *d, uint32_t offset)
+{
+ void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
+ uint32_t ret = *((volatile uint32_t *)(reg_addr));
+ return rte_le_to_cpu_32(ret);
+}
+
+/* Calculate the offset of the enqueue register */
+static inline uint32_t
+queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
+{
+ if (pf_device)
+ return ((vf_id << 12) + (qgrp_id << 7) + (aq_id << 3) +
+ HWPfQmgrIngressAq);
+ else
+ return ((qgrp_id << 7) + (aq_id << 3) +
+ HWVfQmgrIngressAq);
+}
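For example on the PF, vf_id 1, qgrp_id 2 and aq_id 3 resolve to
HWPfQmgrIngressAq + (1 << 12) + (2 << 7) + (3 << 3) = HWPfQmgrIngressAq +
0x1118, so each VF bundle spans a 4 kB window of ingress queue registers.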
+
+enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC};
+
+/* Return the queue topology for a Queue Group Index */
+static inline void
+qtopFromAcc(struct rte_q_topology_t **qtop, int acc_enum,
+ struct acc100_conf *acc100_conf)
+{
+ struct rte_q_topology_t *p_qtop;
+ p_qtop = NULL;
+ switch (acc_enum) {
+ case UL_4G:
+ p_qtop = &(acc100_conf->q_ul_4g);
+ break;
+ case UL_5G:
+ p_qtop = &(acc100_conf->q_ul_5g);
+ break;
+ case DL_4G:
+ p_qtop = &(acc100_conf->q_dl_4g);
+ break;
+ case DL_5G:
+ p_qtop = &(acc100_conf->q_dl_5g);
+ break;
+ default:
+ /* NOTREACHED */
+ rte_bbdev_log(ERR, "Unexpected error evaluating qtopFromAcc");
+ break;
+ }
+ *qtop = p_qtop;
+}
+
+static void
+initQTop(struct acc100_conf *acc100_conf)
+{
+ acc100_conf->q_ul_4g.num_aqs_per_groups = 0;
+ acc100_conf->q_ul_4g.num_qgroups = 0;
+ acc100_conf->q_ul_4g.first_qgroup_index = -1;
+ acc100_conf->q_ul_5g.num_aqs_per_groups = 0;
+ acc100_conf->q_ul_5g.num_qgroups = 0;
+ acc100_conf->q_ul_5g.first_qgroup_index = -1;
+ acc100_conf->q_dl_4g.num_aqs_per_groups = 0;
+ acc100_conf->q_dl_4g.num_qgroups = 0;
+ acc100_conf->q_dl_4g.first_qgroup_index = -1;
+ acc100_conf->q_dl_5g.num_aqs_per_groups = 0;
+ acc100_conf->q_dl_5g.num_qgroups = 0;
+ acc100_conf->q_dl_5g.first_qgroup_index = -1;
+}
+
+static inline void
+updateQtop(uint8_t acc, uint8_t qg, struct acc100_conf *acc100_conf,
+ struct acc100_device *d) {
+ uint32_t reg;
+ struct rte_q_topology_t *q_top = NULL;
+ qtopFromAcc(&q_top, acc, acc100_conf);
+ if (unlikely(q_top == NULL))
+ return;
+ uint16_t aq;
+ q_top->num_qgroups++;
+ if (q_top->first_qgroup_index == -1) {
+ q_top->first_qgroup_index = qg;
+ /* Can be optimized to assume all are enabled by default */
+ reg = acc100_reg_read(d, queue_offset(d->pf_device,
+ 0, qg, ACC100_NUM_AQS - 1));
+ if (reg & QUEUE_ENABLE) {
+ q_top->num_aqs_per_groups = ACC100_NUM_AQS;
+ return;
+ }
+ q_top->num_aqs_per_groups = 0;
+ for (aq = 0; aq < ACC100_NUM_AQS; aq++) {
+ reg = acc100_reg_read(d, queue_offset(d->pf_device,
+ 0, qg, aq));
+ if (reg & QUEUE_ENABLE)
+ q_top->num_aqs_per_groups++;
+ }
+ }
+}
+
+/* Fetch the configuration enabled for the PF/VF using MMIO reads (slow) */
+static inline void
+fetch_acc100_config(struct rte_bbdev *dev)
+{
+ struct acc100_device *d = dev->data->dev_private;
+ struct acc100_conf *acc100_conf = &d->acc100_conf;
+ const struct acc100_registry_addr *reg_addr;
+ uint8_t acc, qg;
+ uint32_t reg, reg_aq, reg_len0, reg_len1;
+ uint32_t reg_mode;
+
+ /* No need to retrieve the configuration if it is already done */
+ if (d->configured)
+ return;
+
+ /* Choose correct registry addresses for the device type */
+ if (d->pf_device)
+ reg_addr = &pf_reg_addr;
+ else
+ reg_addr = &vf_reg_addr;
+
+ d->ddr_size = (1 + acc100_reg_read(d, reg_addr->ddr_range)) << 10;
+
+ /* Single VF Bundle by VF */
+ acc100_conf->num_vf_bundles = 1;
+ initQTop(acc100_conf);
+
+ struct rte_q_topology_t *q_top = NULL;
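+ /* Map the per-qgroup 3-bit function code from HW to the ACC enum order */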
+ int qman_func_id[5] = {0, 2, 1, 3, 4};
+ reg = acc100_reg_read(d, reg_addr->qman_group_func);
+ for (qg = 0; qg < ACC100_NUM_QGRPS_PER_WORD; qg++) {
+ reg_aq = acc100_reg_read(d,
+ queue_offset(d->pf_device, 0, qg, 0));
+ if (reg_aq & QUEUE_ENABLE) {
+ acc = qman_func_id[(reg >> (qg * 4)) & 0x7];
+ updateQtop(acc, qg, acc100_conf, d);
+ }
+ }
+
+ /* Check the depth of the AQs */
+ reg_len0 = acc100_reg_read(d, reg_addr->depth_log0_offset);
+ reg_len1 = acc100_reg_read(d, reg_addr->depth_log1_offset);
+ for (acc = 0; acc < NUM_ACC; acc++) {
+ qtopFromAcc(&q_top, acc, acc100_conf);
+ if (q_top->first_qgroup_index < ACC100_NUM_QGRPS_PER_WORD)
+ q_top->aq_depth_log2 = (reg_len0 >>
+ (q_top->first_qgroup_index * 4))
+ & 0xF;
+ else
+ q_top->aq_depth_log2 = (reg_len1 >>
+ ((q_top->first_qgroup_index -
+ ACC100_NUM_QGRPS_PER_WORD) * 4))
+ & 0xF;
+ }
+
+ /* Read PF mode */
+ if (d->pf_device) {
+ reg_mode = acc100_reg_read(d, HWPfHiPfMode);
+ acc100_conf->pf_mode_en = (reg_mode == 2) ? 1 : 0;
+ }
+
+ rte_bbdev_log_debug(
+ "%s Config LLR SIGN IN/OUT %s %s QG %u %u %u %u AQ %u %u %u %u Len %u %u %u %u\n",
+ (d->pf_device) ? "PF" : "VF",
+ (acc100_conf->input_pos_llr_1_bit) ? "POS" : "NEG",
+ (acc100_conf->output_pos_llr_1_bit) ? "POS" : "NEG",
+ acc100_conf->q_ul_4g.num_qgroups,
+ acc100_conf->q_dl_4g.num_qgroups,
+ acc100_conf->q_ul_5g.num_qgroups,
+ acc100_conf->q_dl_5g.num_qgroups,
+ acc100_conf->q_ul_4g.num_aqs_per_groups,
+ acc100_conf->q_dl_4g.num_aqs_per_groups,
+ acc100_conf->q_ul_5g.num_aqs_per_groups,
+ acc100_conf->q_dl_5g.num_aqs_per_groups,
+ acc100_conf->q_ul_4g.aq_depth_log2,
+ acc100_conf->q_dl_4g.aq_depth_log2,
+ acc100_conf->q_ul_5g.aq_depth_log2,
+ acc100_conf->q_dl_5g.aq_depth_log2);
+}
+
/* Free 64MB memory used for software rings */
static int
acc100_dev_close(struct rte_bbdev *dev __rte_unused)
@@ -33,8 +211,55 @@
return 0;
}
+/* Get ACC100 device info */
+static void
+acc100_dev_info_get(struct rte_bbdev *dev,
+ struct rte_bbdev_driver_info *dev_info)
+{
+ struct acc100_device *d = dev->data->dev_private;
+
+ static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
+ RTE_BBDEV_END_OF_CAPABILITIES_LIST()
+ };
+
+ static struct rte_bbdev_queue_conf default_queue_conf;
+ default_queue_conf.socket = dev->data->socket_id;
+ default_queue_conf.queue_size = MAX_QUEUE_DEPTH;
+
+ dev_info->driver_name = dev->device->driver->name;
+
+ /* Read and save the populated config from ACC100 registers */
+ fetch_acc100_config(dev);
+
+ /* This isn't ideal because it reports the maximum number of queues but
+ * does not provide info on how many can be uplink/downlink or different
+ * priorities
+ */
+ dev_info->max_num_queues =
+ d->acc100_conf.q_dl_5g.num_aqs_per_groups *
+ d->acc100_conf.q_dl_5g.num_qgroups +
+ d->acc100_conf.q_ul_5g.num_aqs_per_groups *
+ d->acc100_conf.q_ul_5g.num_qgroups +
+ d->acc100_conf.q_dl_4g.num_aqs_per_groups *
+ d->acc100_conf.q_dl_4g.num_qgroups +
+ d->acc100_conf.q_ul_4g.num_aqs_per_groups *
+ d->acc100_conf.q_ul_4g.num_qgroups;
+ dev_info->queue_size_lim = MAX_QUEUE_DEPTH;
+ dev_info->hardware_accelerated = true;
+ dev_info->max_dl_queue_priority =
+ d->acc100_conf.q_dl_4g.num_qgroups - 1;
+ dev_info->max_ul_queue_priority =
+ d->acc100_conf.q_ul_4g.num_qgroups - 1;
+ dev_info->default_queue_conf = default_queue_conf;
+ dev_info->cpu_flag_reqs = NULL;
+ dev_info->min_alignment = 64;
+ dev_info->capabilities = bbdev_capabilities;
+ dev_info->harq_buffer_size = d->ddr_size;
+}
+
static const struct rte_bbdev_ops acc100_bbdev_ops = {
.close = acc100_dev_close,
+ .info_get = acc100_dev_info_get,
};
/* ACC100 PCI PF address map */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index cd77570..662e2c8 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -7,6 +7,7 @@
#include "acc100_pf_enum.h"
#include "acc100_vf_enum.h"
+#include "rte_acc100_cfg.h"
/* Helper macro for logging */
#define rte_bbdev_log(level, fmt, ...) \
@@ -520,6 +521,8 @@ struct acc100_registry_addr {
/* Private data structure for each ACC100 device */
struct acc100_device {
void *mmio_base; /**< Base address of MMIO registers (BAR0) */
+ uint32_t ddr_size; /* Size in kB */
+ struct acc100_conf acc100_conf; /* ACC100 Initial configuration */
bool pf_device; /**< True if this is a PF ACC100 device */
bool configured; /**< True if this ACC100 device is configured */
};
--
1.8.3.1
^ permalink raw reply [flat|nested] 213+ messages in thread
* [dpdk-dev] [PATCH v3 04/11] baseband/acc100: add queue configuration
2020-08-19 0:25 [dpdk-dev] [PATCH v3 00/11] bbdev PMD ACC100 Nicolas Chautru
` (2 preceding siblings ...)
2020-08-19 0:25 ` [dpdk-dev] [PATCH v3 03/11] baseband/acc100: add info get function Nicolas Chautru
@ 2020-08-19 0:25 ` Nicolas Chautru
2020-08-29 10:39 ` Xu, Rosen
2020-08-19 0:25 ` [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions Nicolas Chautru
` (6 subsequent siblings)
10 siblings, 1 reply; 213+ messages in thread
From: Nicolas Chautru @ 2020-08-19 0:25 UTC (permalink / raw)
To: dev, akhil.goyal; +Cc: bruce.richardson, Nicolas Chautru
Adding functions to create and configure queues for
the device. Still no processing capabilities are exposed.
Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
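For context, the application-side bring-up these ops serve looks as follows;
dev_id and num_queues are assumptions, and rte_bbdev_start() is still required
before traffic:

    #include <rte_bbdev.h>
    #include <rte_lcore.h>

    struct rte_bbdev_queue_conf qconf = {
        .socket = rte_socket_id(),
        .queue_size = 1024,
        .priority = 0,
        .op_type = RTE_BBDEV_OP_LDPC_DEC,
    };
    if (rte_bbdev_setup_queues(dev_id, num_queues, rte_socket_id()) == 0 &&
            rte_bbdev_queue_configure(dev_id, 0, &qconf) == 0)
        (void)rte_bbdev_start(dev_id);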
drivers/baseband/acc100/rte_acc100_pmd.c | 420 ++++++++++++++++++++++++++++++-
drivers/baseband/acc100/rte_acc100_pmd.h | 45 ++++
2 files changed, 464 insertions(+), 1 deletion(-)
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 7807a30..7a21c57 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -26,6 +26,22 @@
RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
#endif
+/* Write to MMIO register address */
+static inline void
+mmio_write(void *addr, uint32_t value)
+{
+ *((volatile uint32_t *)(addr)) = rte_cpu_to_le_32(value);
+}
+
+/* Write a register of an ACC100 device */
+static inline void
+acc100_reg_write(struct acc100_device *d, uint32_t offset, uint32_t payload)
+{
+ void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
+ mmio_write(reg_addr, payload);
+ usleep(1000);
+}
+
/* Read a register of a ACC100 device */
static inline uint32_t
acc100_reg_read(struct acc100_device *d, uint32_t offset)
@@ -36,6 +52,22 @@
return rte_le_to_cpu_32(ret);
}
+/* Basic Implementation of Log2 for exact 2^N */
+static inline uint32_t
+log2_basic(uint32_t value)
+{
+ return (value == 0) ? 0 : __builtin_ctz(value);
+}
+
+/* Calculate memory alignment offset assuming alignment is 2^N */
+static inline uint32_t
+calc_mem_alignment_offset(void *unaligned_virt_mem, uint32_t alignment)
+{
+ rte_iova_t unaligned_phy_mem = rte_malloc_virt2iova(unaligned_virt_mem);
+ return (uint32_t)(alignment -
+ (unaligned_phy_mem & (alignment-1)));
+}
+
/* Calculate the offset of the enqueue register */
static inline uint32_t
queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
@@ -204,10 +236,393 @@
acc100_conf->q_dl_5g.aq_depth_log2);
}
+static void
+free_base_addresses(void **base_addrs, int size)
+{
+ int i;
+ for (i = 0; i < size; i++)
+ rte_free(base_addrs[i]);
+}
+
+static inline uint32_t
+get_desc_len(void)
+{
+ return sizeof(union acc100_dma_desc);
+}
+
+/* Allocate the 2 * 64MB block for the sw rings */
+static int
+alloc_2x64mb_sw_rings_mem(struct rte_bbdev *dev, struct acc100_device *d,
+ int socket)
+{
+ uint32_t sw_ring_size = ACC100_SIZE_64MBYTE;
+ d->sw_rings_base = rte_zmalloc_socket(dev->device->driver->name,
+ 2 * sw_ring_size, RTE_CACHE_LINE_SIZE, socket);
+ if (d->sw_rings_base == NULL) {
+ rte_bbdev_log(ERR, "Failed to allocate memory for %s:%u",
+ dev->device->driver->name,
+ dev->data->dev_id);
+ return -ENOMEM;
+ }
+ memset(d->sw_rings_base, 0, ACC100_SIZE_64MBYTE);
+ uint32_t next_64mb_align_offset = calc_mem_alignment_offset(
+ d->sw_rings_base, ACC100_SIZE_64MBYTE);
+ d->sw_rings = RTE_PTR_ADD(d->sw_rings_base, next_64mb_align_offset);
+ d->sw_rings_phys = rte_malloc_virt2iova(d->sw_rings_base) +
+ next_64mb_align_offset;
+ d->sw_ring_size = MAX_QUEUE_DEPTH * get_desc_len();
+ d->sw_ring_max_depth = d->sw_ring_size / get_desc_len();
+
+ return 0;
+}
+
+/* Attempt to allocate minimised memory space for sw rings */
+static void
+alloc_sw_rings_min_mem(struct rte_bbdev *dev, struct acc100_device *d,
+ uint16_t num_queues, int socket)
+{
+ rte_iova_t sw_rings_base_phy, next_64mb_align_addr_phy;
+ uint32_t next_64mb_align_offset;
+ rte_iova_t sw_ring_phys_end_addr;
+ void *base_addrs[SW_RING_MEM_ALLOC_ATTEMPTS];
+ void *sw_rings_base;
+ int i = 0;
+ uint32_t q_sw_ring_size = MAX_QUEUE_DEPTH * get_desc_len();
+ uint32_t dev_sw_ring_size = q_sw_ring_size * num_queues;
+
+ /* Find an aligned block of memory to store sw rings */
+ while (i < SW_RING_MEM_ALLOC_ATTEMPTS) {
+ /*
+ * sw_ring allocated memory is guaranteed to be aligned to
+ * q_sw_ring_size on the condition that the requested size is
+ * less than the page size
+ */
+ sw_rings_base = rte_zmalloc_socket(
+ dev->device->driver->name,
+ dev_sw_ring_size, q_sw_ring_size, socket);
+
+ if (sw_rings_base == NULL) {
+ rte_bbdev_log(ERR,
+ "Failed to allocate memory for %s:%u",
+ dev->device->driver->name,
+ dev->data->dev_id);
+ break;
+ }
+
+ sw_rings_base_phy = rte_malloc_virt2iova(sw_rings_base);
+ next_64mb_align_offset = calc_mem_alignment_offset(
+ sw_rings_base, ACC100_SIZE_64MBYTE);
+ next_64mb_align_addr_phy = sw_rings_base_phy +
+ next_64mb_align_offset;
+ sw_ring_phys_end_addr = sw_rings_base_phy + dev_sw_ring_size;
+
+ /* Check if the end of the sw ring memory block is before the
+ * start of next 64MB aligned mem address
+ */
+ if (sw_ring_phys_end_addr < next_64mb_align_addr_phy) {
+ d->sw_rings_phys = sw_rings_base_phy;
+ d->sw_rings = sw_rings_base;
+ d->sw_rings_base = sw_rings_base;
+ d->sw_ring_size = q_sw_ring_size;
+ d->sw_ring_max_depth = MAX_QUEUE_DEPTH;
+ break;
+ }
+ /* Store the address of the unaligned mem block */
+ base_addrs[i] = sw_rings_base;
+ i++;
+ }
+
+ /* Free all unaligned blocks of mem allocated in the loop */
+ free_base_addresses(base_addrs, i);
+}
+
+/* Allocate 64MB memory used for all software rings */
+static int
+acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
+{
+ uint32_t phys_low, phys_high, payload;
+ struct acc100_device *d = dev->data->dev_private;
+ const struct acc100_registry_addr *reg_addr;
+
+ if (d->pf_device && !d->acc100_conf.pf_mode_en) {
+ rte_bbdev_log(NOTICE,
+ "%s has PF mode disabled. This PF can't be used.",
+ dev->data->name);
+ return -ENODEV;
+ }
+
+ alloc_sw_rings_min_mem(dev, d, num_queues, socket_id);
+
+ /* If the minimal memory space approach failed, then allocate
+ * the 2 * 64MB block for the sw rings
+ */
+ if (d->sw_rings == NULL)
+ alloc_2x64mb_sw_rings_mem(dev, d, socket_id);
+
+ /* Configure ACC100 with the base address for DMA descriptor rings
+ * Same descriptor rings used for UL and DL DMA Engines
+ * Note : Assuming only VF0 bundle is used for PF mode
+ */
+ phys_high = (uint32_t)(d->sw_rings_phys >> 32);
+ phys_low = (uint32_t)(d->sw_rings_phys & ~(ACC100_SIZE_64MBYTE-1));
+
+ /* Choose correct registry addresses for the device type */
+ if (d->pf_device)
+ reg_addr = &pf_reg_addr;
+ else
+ reg_addr = &vf_reg_addr;
+
+ /* Read the populated cfg from ACC100 registers */
+ fetch_acc100_config(dev);
+
+ /* Mark as configured properly */
+ d->configured = true;
+
+ /* Release AXI from PF */
+ if (d->pf_device)
+ acc100_reg_write(d, HWPfDmaAxiControl, 1);
+
+ acc100_reg_write(d, reg_addr->dma_ring_ul5g_hi, phys_high);
+ acc100_reg_write(d, reg_addr->dma_ring_ul5g_lo, phys_low);
+ acc100_reg_write(d, reg_addr->dma_ring_dl5g_hi, phys_high);
+ acc100_reg_write(d, reg_addr->dma_ring_dl5g_lo, phys_low);
+ acc100_reg_write(d, reg_addr->dma_ring_ul4g_hi, phys_high);
+ acc100_reg_write(d, reg_addr->dma_ring_ul4g_lo, phys_low);
+ acc100_reg_write(d, reg_addr->dma_ring_dl4g_hi, phys_high);
+ acc100_reg_write(d, reg_addr->dma_ring_dl4g_lo, phys_low);
+
+ /*
+ * Configure Ring Size to the max queue ring size
+ * (used for wrapping purpose)
+ */
+ payload = log2_basic(d->sw_ring_size / 64);
+ acc100_reg_write(d, reg_addr->ring_size, payload);
+
+ /* Configure tail pointer for use when SDONE enabled */
+ d->tail_ptrs = rte_zmalloc_socket(
+ dev->device->driver->name,
+ ACC100_NUM_QGRPS * ACC100_NUM_AQS * sizeof(uint32_t),
+ RTE_CACHE_LINE_SIZE, socket_id);
+ if (d->tail_ptrs == NULL) {
+ rte_bbdev_log(ERR, "Failed to allocate tail ptr for %s:%u",
+ dev->device->driver->name,
+ dev->data->dev_id);
+ rte_free(d->sw_rings);
+ return -ENOMEM;
+ }
+ d->tail_ptr_phys = rte_malloc_virt2iova(d->tail_ptrs);
+
+ phys_high = (uint32_t)(d->tail_ptr_phys >> 32);
+ phys_low = (uint32_t)(d->tail_ptr_phys);
+ acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_hi, phys_high);
+ acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_lo, phys_low);
+ acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_hi, phys_high);
+ acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_lo, phys_low);
+ acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_hi, phys_high);
+ acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_lo, phys_low);
+ acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
+ acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
+
+ d->harq_layout = rte_zmalloc_socket("HARQ Layout",
+ ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
+ RTE_CACHE_LINE_SIZE, dev->data->socket_id);
+
+ rte_bbdev_log_debug(
+ "ACC100 (%s) configured sw_rings = %p, sw_rings_phys = %#"
+ PRIx64, dev->data->name, d->sw_rings, d->sw_rings_phys);
+
+ return 0;
+}
+
/* Free 64MB memory used for software rings */
static int
-acc100_dev_close(struct rte_bbdev *dev __rte_unused)
+acc100_dev_close(struct rte_bbdev *dev)
{
+ struct acc100_device *d = dev->data->dev_private;
+ if (d->sw_rings_base != NULL) {
+ rte_free(d->tail_ptrs);
+ rte_free(d->sw_rings_base);
+ d->sw_rings_base = NULL;
+ }
+ usleep(1000);
+ return 0;
+}
+
+/**
+ * Report an ACC100 queue index which is free
+ * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
+ * Note : Only supporting VF0 Bundle for PF mode
+ */
+static int
+acc100_find_free_queue_idx(struct rte_bbdev *dev,
+ const struct rte_bbdev_queue_conf *conf)
+{
+ struct acc100_device *d = dev->data->dev_private;
+ int op_2_acc[5] = {0, UL_4G, DL_4G, UL_5G, DL_5G};
+ int acc = op_2_acc[conf->op_type];
+ struct rte_q_topology_t *qtop = NULL;
+ qtopFromAcc(&qtop, acc, &(d->acc100_conf));
+ if (qtop == NULL)
+ return -1;
+ /* Identify matching QGroup Index which are sorted in priority order */
+ uint16_t group_idx = qtop->first_qgroup_index;
+ group_idx += conf->priority;
+ if (group_idx >= ACC100_NUM_QGRPS ||
+ conf->priority >= qtop->num_qgroups) {
+ rte_bbdev_log(INFO, "Invalid Priority on %s, priority %u",
+ dev->data->name, conf->priority);
+ return -1;
+ }
+ /* Find a free AQ_idx */
+ uint16_t aq_idx;
+ for (aq_idx = 0; aq_idx < qtop->num_aqs_per_groups; aq_idx++) {
+ if (((d->q_assigned_bit_map[group_idx] >> aq_idx) & 0x1) == 0) {
+ /* Mark the Queue as assigned */
+ d->q_assigned_bit_map[group_idx] |= (1 << aq_idx);
+ /* Report the AQ Index */
+ return (group_idx << GRP_ID_SHIFT) + aq_idx;
+ }
+ }
+ rte_bbdev_log(INFO, "Failed to find free queue on %s, priority %u",
+ dev->data->name, conf->priority);
+ return -1;
+}
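The returned value packs the queue coordinates, e.g. group 2, AQ 5 yields
(2 << GRP_ID_SHIFT) + 5 = 2053; acc100_queue_setup() below unpacks it back
into qgrp_id, vf_id and aq_id.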
+
+/* Setup ACC100 queue */
+static int
+acc100_queue_setup(struct rte_bbdev *dev, uint16_t queue_id,
+ const struct rte_bbdev_queue_conf *conf)
+{
+ struct acc100_device *d = dev->data->dev_private;
+ struct acc100_queue *q;
+ int16_t q_idx;
+
+ /* Allocate the queue data structure. */
+ q = rte_zmalloc_socket(dev->device->driver->name, sizeof(*q),
+ RTE_CACHE_LINE_SIZE, conf->socket);
+ if (q == NULL) {
+ rte_bbdev_log(ERR, "Failed to allocate queue memory");
+ return -ENOMEM;
+ }
+
+ q->d = d;
+ q->ring_addr = RTE_PTR_ADD(d->sw_rings, (d->sw_ring_size * queue_id));
+ q->ring_addr_phys = d->sw_rings_phys + (d->sw_ring_size * queue_id);
+
+ /* Prepare the Ring with default descriptor format */
+ union acc100_dma_desc *desc = NULL;
+ unsigned int desc_idx, b_idx;
+ int fcw_len = (conf->op_type == RTE_BBDEV_OP_LDPC_ENC ?
+ ACC100_FCW_LE_BLEN : (conf->op_type == RTE_BBDEV_OP_TURBO_DEC ?
+ ACC100_FCW_TD_BLEN : ACC100_FCW_LD_BLEN));
+
+ for (desc_idx = 0; desc_idx < d->sw_ring_max_depth; desc_idx++) {
+ desc = q->ring_addr + desc_idx;
+ desc->req.word0 = ACC100_DMA_DESC_TYPE;
+ desc->req.word1 = 0; /**< Timestamp */
+ desc->req.word2 = 0;
+ desc->req.word3 = 0;
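+ /* 256B per descriptor: the FCW sits at byte 192 of entry desc_idx */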
+ uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+ desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+ desc->req.data_ptrs[0].blen = fcw_len;
+ desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
+ desc->req.data_ptrs[0].last = 0;
+ desc->req.data_ptrs[0].dma_ext = 0;
+ for (b_idx = 1; b_idx < ACC100_DMA_MAX_NUM_POINTERS - 1;
+ b_idx++) {
+ desc->req.data_ptrs[b_idx].blkid = ACC100_DMA_BLKID_IN;
+ desc->req.data_ptrs[b_idx].last = 1;
+ desc->req.data_ptrs[b_idx].dma_ext = 0;
+ b_idx++;
+ desc->req.data_ptrs[b_idx].blkid =
+ ACC100_DMA_BLKID_OUT_ENC;
+ desc->req.data_ptrs[b_idx].last = 1;
+ desc->req.data_ptrs[b_idx].dma_ext = 0;
+ }
+ /* Preset some fields of LDPC FCW */
+ desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
+ desc->req.fcw_ld.gain_i = 1;
+ desc->req.fcw_ld.gain_h = 1;
+ }
+
+ q->lb_in = rte_zmalloc_socket(dev->device->driver->name,
+ RTE_CACHE_LINE_SIZE,
+ RTE_CACHE_LINE_SIZE, conf->socket);
+ if (q->lb_in == NULL) {
+ rte_bbdev_log(ERR, "Failed to allocate lb_in memory");
+ rte_free(q);
+ return -ENOMEM;
+ }
+ q->lb_in_addr_phys = rte_malloc_virt2iova(q->lb_in);
+ q->lb_out = rte_zmalloc_socket(dev->device->driver->name,
+ RTE_CACHE_LINE_SIZE,
+ RTE_CACHE_LINE_SIZE, conf->socket);
+ if (q->lb_out == NULL) {
+ rte_bbdev_log(ERR, "Failed to allocate lb_out memory");
+ rte_free(q->lb_in);
+ rte_free(q);
+ return -ENOMEM;
+ }
+ q->lb_out_addr_phys = rte_malloc_virt2iova(q->lb_out);
+
+ /*
+ * Software queue ring wraps synchronously with the HW when it reaches
+ * the boundary of the maximum allocated queue size, no matter what the
+ * sw queue size is. This wrapping is guarded by setting the wrap_mask
+ * to represent the maximum queue size as allocated at the time when
+ * the device has been setup (in configure()).
+ *
+ * The queue depth is set to the queue size value (conf->queue_size).
+ * This limits the occupancy of the queue at any point of time, so that
+ * the queue does not get swamped with enqueue requests.
+ */
+ q->sw_ring_depth = conf->queue_size;
+ q->sw_ring_wrap_mask = d->sw_ring_max_depth - 1;
+
+ q->op_type = conf->op_type;
+
+ q_idx = acc100_find_free_queue_idx(dev, conf);
+ if (q_idx == -1) {
+ rte_free(q->lb_in);
+ rte_free(q->lb_out);
+ rte_free(q);
+ return -1;
+ }
+
+ q->qgrp_id = (q_idx >> GRP_ID_SHIFT) & 0xF;
+ q->vf_id = (q_idx >> VF_ID_SHIFT) & 0x3F;
+ q->aq_id = q_idx & 0xF;
+ q->aq_depth = (conf->op_type == RTE_BBDEV_OP_TURBO_DEC) ?
+ (1 << d->acc100_conf.q_ul_4g.aq_depth_log2) :
+ (1 << d->acc100_conf.q_dl_4g.aq_depth_log2);
+
+ q->mmio_reg_enqueue = RTE_PTR_ADD(d->mmio_base,
+ queue_offset(d->pf_device,
+ q->vf_id, q->qgrp_id, q->aq_id));
+
+ rte_bbdev_log_debug(
+ "Setup dev%u q%u: qgrp_id=%u, vf_id=%u, aq_id=%u, aq_depth=%u, mmio_reg_enqueue=%p",
+ dev->data->dev_id, queue_id, q->qgrp_id, q->vf_id,
+ q->aq_id, q->aq_depth, q->mmio_reg_enqueue);
+
+ dev->data->queues[queue_id].queue_private = q;
+ return 0;
+}
+
+/* Release ACC100 queue */
+static int
+acc100_queue_release(struct rte_bbdev *dev, uint16_t q_id)
+{
+ struct acc100_device *d = dev->data->dev_private;
+ struct acc100_queue *q = dev->data->queues[q_id].queue_private;
+
+ if (q != NULL) {
+ /* Mark the Queue as un-assigned */
+ d->q_assigned_bit_map[q->qgrp_id] &= (0xFFFFFFFF -
+ (1 << q->aq_id));
+ rte_free(q->lb_in);
+ rte_free(q->lb_out);
+ rte_free(q);
+ dev->data->queues[q_id].queue_private = NULL;
+ }
+
return 0;
}
@@ -258,8 +673,11 @@
}
static const struct rte_bbdev_ops acc100_bbdev_ops = {
+ .setup_queues = acc100_setup_queues,
.close = acc100_dev_close,
.info_get = acc100_dev_info_get,
+ .queue_setup = acc100_queue_setup,
+ .queue_release = acc100_queue_release,
};
/* ACC100 PCI PF address map */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 662e2c8..0e2b79c 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -518,11 +518,56 @@ struct acc100_registry_addr {
.ddr_range = HWVfDmaDdrBaseRangeRoVf,
};
+/* Structure associated with each queue. */
+struct __rte_cache_aligned acc100_queue {
+ union acc100_dma_desc *ring_addr; /* Virtual address of sw ring */
+ rte_iova_t ring_addr_phys; /* Physical address of software ring */
+ uint32_t sw_ring_head; /* software ring head */
+ uint32_t sw_ring_tail; /* software ring tail */
+ /* software ring size (descriptors, not bytes) */
+ uint32_t sw_ring_depth;
+ /* mask used to wrap enqueued descriptors on the sw ring */
+ uint32_t sw_ring_wrap_mask;
+ /* MMIO register used to enqueue descriptors */
+ void *mmio_reg_enqueue;
+ uint8_t vf_id; /* VF ID (max = 63) */
+ uint8_t qgrp_id; /* Queue Group ID */
+ uint16_t aq_id; /* Atomic Queue ID */
+ uint16_t aq_depth; /* Depth of atomic queue */
+ uint32_t aq_enqueued; /* Count how many "batches" have been enqueued */
+ uint32_t aq_dequeued; /* Count how many "batches" have been dequeued */
+ uint32_t irq_enable; /* Enable ops dequeue interrupts if set to 1 */
+ struct rte_mempool *fcw_mempool; /* FCW mempool */
+ enum rte_bbdev_op_type op_type; /* Type of this queue: Enc or Dec */
+ /* Internal Buffers for loopback input */
+ uint8_t *lb_in;
+ uint8_t *lb_out;
+ rte_iova_t lb_in_addr_phys;
+ rte_iova_t lb_out_addr_phys;
+ struct acc100_device *d;
+};
+
/* Private data structure for each ACC100 device */
struct acc100_device {
void *mmio_base; /**< Base address of MMIO registers (BAR0) */
+ void *sw_rings_base; /* Base addr of unaligned memory for sw rings */
+ void *sw_rings; /* 64 MB of 64MB-aligned memory for sw rings */
+ rte_iova_t sw_rings_phys; /* Physical address of sw_rings */
+ /* Virtual address of the info memory routed to this function under
+ * operation, whether it is PF or VF.
+ */
+ union acc100_harq_layout_data *harq_layout;
+ uint32_t sw_ring_size;
uint32_t ddr_size; /* Size in kB */
+ uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */
+ rte_iova_t tail_ptr_phys; /* Physical address of tail pointers */
+ /* Max number of entries available for each queue in device, depending
+ * on how many queues are enabled with configure()
+ */
+ uint32_t sw_ring_max_depth;
struct acc100_conf acc100_conf; /* ACC100 Initial configuration */
+ /* Bitmap capturing which Queues have already been assigned */
+ uint16_t q_assigned_bit_map[ACC100_NUM_QGRPS];
bool pf_device; /**< True if this is a PF ACC100 device */
bool configured; /**< True if this ACC100 device is configured */
};
--
1.8.3.1
^ permalink raw reply [flat|nested] 213+ messages in thread
* [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions
2020-08-19 0:25 [dpdk-dev] [PATCH v3 00/11] bbdev PMD ACC100 Nicolas Chautru
` (3 preceding siblings ...)
2020-08-19 0:25 ` [dpdk-dev] [PATCH v3 04/11] baseband/acc100: add queue configuration Nicolas Chautru
@ 2020-08-19 0:25 ` Nicolas Chautru
2020-08-20 14:38 ` Dave Burley
2020-08-29 11:10 ` Xu, Rosen
2020-08-19 0:25 ` [dpdk-dev] [PATCH v3 06/11] baseband/acc100: add HARQ loopback support Nicolas Chautru
` (5 subsequent siblings)
10 siblings, 2 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-08-19 0:25 UTC (permalink / raw)
To: dev, akhil.goyal; +Cc: bruce.richardson, Nicolas Chautru
Add LDPC encode and decode processing operations.
Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
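
Note for reviewers: a minimal usage sketch (not part of the patch) of how
an application drives the LDPC datapath added here, via the public bbdev
API. Device/queue setup and op-pool allocation are omitted; dev_id, q_id
and the op array are placeholders.

#include <rte_bbdev.h>

static void
run_ldpc_enc(uint16_t dev_id, uint16_t q_id,
		struct rte_bbdev_enc_op **ops, uint16_t n)
{
	uint16_t enq = 0, deq = 0;

	/* Lands in acc100_enqueue_ldpc_enc() added below */
	while (enq < n)
		enq += rte_bbdev_enqueue_ldpc_enc_ops(dev_id, q_id,
				&ops[enq], n - enq);
	/* Poll until the device has produced all responses */
	while (deq < n)
		deq += rte_bbdev_dequeue_ldpc_enc_ops(dev_id, q_id,
				&ops[deq], n - deq);
}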
drivers/baseband/acc100/rte_acc100_pmd.c | 1625 +++++++++++++++++++++++++++++-
drivers/baseband/acc100/rte_acc100_pmd.h | 3 +
2 files changed, 1626 insertions(+), 2 deletions(-)
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 7a21c57..5f32813 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -15,6 +15,9 @@
#include <rte_hexdump.h>
#include <rte_pci.h>
#include <rte_bus_pci.h>
+#ifdef RTE_BBDEV_OFFLOAD_COST
+#include <rte_cycles.h>
+#endif
#include <rte_bbdev.h>
#include <rte_bbdev_pmd.h>
@@ -449,7 +452,6 @@
return 0;
}
-
/**
* Report an ACC100 queue index which is free.
* Return 0 to 16k for a valid queue_idx or -1 when no queue is available
@@ -634,6 +636,46 @@
struct acc100_device *d = dev->data->dev_private;
static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
+ {
+ .type = RTE_BBDEV_OP_LDPC_ENC,
+ .cap.ldpc_enc = {
+ .capability_flags =
+ RTE_BBDEV_LDPC_RATE_MATCH |
+ RTE_BBDEV_LDPC_CRC_24B_ATTACH |
+ RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
+ .num_buffers_src =
+ RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+ .num_buffers_dst =
+ RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+ }
+ },
+ {
+ .type = RTE_BBDEV_OP_LDPC_DEC,
+ .cap.ldpc_dec = {
+ .capability_flags =
+ RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK |
+ RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP |
+ RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
+ RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
+#ifdef ACC100_EXT_MEM
+ RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
+ RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
+#endif
+ RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE |
+ RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS |
+ RTE_BBDEV_LDPC_DECODE_BYPASS |
+ RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
+ RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
+ RTE_BBDEV_LDPC_LLR_COMPRESSION,
+ .llr_size = 8,
+ .llr_decimals = 1,
+ .num_buffers_src =
+ RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+ .num_buffers_hard_out =
+ RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+ .num_buffers_soft_out = 0,
+ }
+ },
RTE_BBDEV_END_OF_CAPABILITIES_LIST()
};
@@ -669,9 +711,14 @@
dev_info->cpu_flag_reqs = NULL;
dev_info->min_alignment = 64;
dev_info->capabilities = bbdev_capabilities;
+#ifdef ACC100_EXT_MEM
dev_info->harq_buffer_size = d->ddr_size;
+#else
+ dev_info->harq_buffer_size = 0;
+#endif
}
+
static const struct rte_bbdev_ops acc100_bbdev_ops = {
.setup_queues = acc100_setup_queues,
.close = acc100_dev_close,
@@ -696,6 +743,1577 @@
{.device_id = 0},
};
+/* Read flag value 0/1 from bitmap */
+static inline bool
+check_bit(uint32_t bitmap, uint32_t bitmask)
+{
+ return bitmap & bitmask;
+}
+
+static inline char *
+mbuf_append(struct rte_mbuf *m_head, struct rte_mbuf *m, uint16_t len)
+{
+ if (unlikely(len > rte_pktmbuf_tailroom(m)))
+ return NULL;
+
+ char *tail = (char *)m->buf_addr + m->data_off + m->data_len;
+ m->data_len = (uint16_t)(m->data_len + len);
+ m_head->pkt_len = (m_head->pkt_len + len);
+ return tail;
+}
+
+/* Compute value of k0.
+ * Based on 3GPP 38.212 Table 5.4.2.1-2
+ * Starting position of different redundancy versions, k0
+ */
+static inline uint16_t
+get_k0(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv_index)
+{
+ if (rv_index == 0)
+ return 0;
+ uint16_t n = (bg == 1 ? N_ZC_1 : N_ZC_2) * z_c;
+ if (n_cb == n) {
+ if (rv_index == 1)
+ return (bg == 1 ? K0_1_1 : K0_1_2) * z_c;
+ else if (rv_index == 2)
+ return (bg == 1 ? K0_2_1 : K0_2_2) * z_c;
+ else
+ return (bg == 1 ? K0_3_1 : K0_3_2) * z_c;
+ }
+ /* LBRM case - includes a division by N */
+ if (rv_index == 1)
+ return (((bg == 1 ? K0_1_1 : K0_1_2) * n_cb)
+ / n) * z_c;
+ else if (rv_index == 2)
+ return (((bg == 1 ? K0_2_1 : K0_2_2) * n_cb)
+ / n) * z_c;
+ else
+ return (((bg == 1 ? K0_3_1 : K0_3_2) * n_cb)
+ / n) * z_c;
+}
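
/* Editorial worked example, not part of the patch. Assuming the K0_*
 * constants carry the numerators of 3GPP 38.212 Table 5.4.2.1-2
 * (e.g. K0_2_1 = 33 for BG1/rv2, with N_ZC_1 = 66):
 *
 *   BG1, z_c = 384, n_cb = n = 66 * 384 = 25344, rv_index = 2
 *     -> k0 = K0_2_1 * z_c = 33 * 384 = 12672
 *
 *   Same CB, LBRM-limited to n_cb = 12800
 *     -> k0 = ((33 * 12800) / 25344) * z_c = 16 * 384 = 6144
 */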
+
+/* Fill in a frame control word for LDPC encoding. */
+static inline void
+acc100_fcw_le_fill(const struct rte_bbdev_enc_op *op,
+ struct acc100_fcw_le *fcw, int num_cb)
+{
+ fcw->qm = op->ldpc_enc.q_m;
+ fcw->nfiller = op->ldpc_enc.n_filler;
+ fcw->BG = (op->ldpc_enc.basegraph - 1);
+ fcw->Zc = op->ldpc_enc.z_c;
+ fcw->ncb = op->ldpc_enc.n_cb;
+ fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_enc.basegraph,
+ op->ldpc_enc.rv_index);
+ fcw->rm_e = op->ldpc_enc.cb_params.e;
+ fcw->crc_select = check_bit(op->ldpc_enc.op_flags,
+ RTE_BBDEV_LDPC_CRC_24B_ATTACH);
+ fcw->bypass_intlv = check_bit(op->ldpc_enc.op_flags,
+ RTE_BBDEV_LDPC_INTERLEAVER_BYPASS);
+ fcw->mcb_count = num_cb;
+}
+
+/* Fill in a frame control word for LDPC decoding. */
+static inline void
+acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
+ union acc100_harq_layout_data *harq_layout)
+{
+ uint16_t harq_out_length, harq_in_length, ncb_p, k0_p, parity_offset;
+ uint16_t harq_index;
+ uint32_t l;
+ bool harq_prun = false;
+
+ fcw->qm = op->ldpc_dec.q_m;
+ fcw->nfiller = op->ldpc_dec.n_filler;
+ fcw->BG = (op->ldpc_dec.basegraph - 1);
+ fcw->Zc = op->ldpc_dec.z_c;
+ fcw->ncb = op->ldpc_dec.n_cb;
+ fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_dec.basegraph,
+ op->ldpc_dec.rv_index);
+ if (op->ldpc_dec.code_block_mode == 1)
+ fcw->rm_e = op->ldpc_dec.cb_params.e;
+ else
+ fcw->rm_e = (op->ldpc_dec.tb_params.r <
+ op->ldpc_dec.tb_params.cab) ?
+ op->ldpc_dec.tb_params.ea :
+ op->ldpc_dec.tb_params.eb;
+
+ fcw->hcin_en = check_bit(op->ldpc_dec.op_flags,
+ RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE);
+ fcw->hcout_en = check_bit(op->ldpc_dec.op_flags,
+ RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE);
+ fcw->crc_select = check_bit(op->ldpc_dec.op_flags,
+ RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK);
+ fcw->bypass_dec = check_bit(op->ldpc_dec.op_flags,
+ RTE_BBDEV_LDPC_DECODE_BYPASS);
+ fcw->bypass_intlv = check_bit(op->ldpc_dec.op_flags,
+ RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS);
+ if (op->ldpc_dec.q_m == 1) {
+ fcw->bypass_intlv = 1;
+ fcw->qm = 2;
+ }
+ fcw->hcin_decomp_mode = check_bit(op->ldpc_dec.op_flags,
+ RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+ fcw->hcout_comp_mode = check_bit(op->ldpc_dec.op_flags,
+ RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+ fcw->llr_pack_mode = check_bit(op->ldpc_dec.op_flags,
+ RTE_BBDEV_LDPC_LLR_COMPRESSION);
+ harq_index = op->ldpc_dec.harq_combined_output.offset /
+ ACC100_HARQ_OFFSET;
+#ifdef ACC100_EXT_MEM
+ /* Limit cases when HARQ pruning is valid */
+ harq_prun = ((op->ldpc_dec.harq_combined_output.offset %
+ ACC100_HARQ_OFFSET) == 0) &&
+ (op->ldpc_dec.harq_combined_output.offset <= UINT16_MAX
+ * ACC100_HARQ_OFFSET);
+#endif
+ if (fcw->hcin_en > 0) {
+ harq_in_length = op->ldpc_dec.harq_combined_input.length;
+ if (fcw->hcin_decomp_mode > 0)
+ harq_in_length = harq_in_length * 8 / 6;
+ harq_in_length = RTE_ALIGN(harq_in_length, 64);
+ if ((harq_layout[harq_index].offset > 0) & harq_prun) {
+ rte_bbdev_log_debug("HARQ IN offset unexpected for now\n");
+ fcw->hcin_size0 = harq_layout[harq_index].size0;
+ fcw->hcin_offset = harq_layout[harq_index].offset;
+ fcw->hcin_size1 = harq_in_length -
+ harq_layout[harq_index].offset;
+ } else {
+ fcw->hcin_size0 = harq_in_length;
+ fcw->hcin_offset = 0;
+ fcw->hcin_size1 = 0;
+ }
+ } else {
+ fcw->hcin_size0 = 0;
+ fcw->hcin_offset = 0;
+ fcw->hcin_size1 = 0;
+ }
+
+ fcw->itmax = op->ldpc_dec.iter_max;
+ fcw->itstop = check_bit(op->ldpc_dec.op_flags,
+ RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE);
+ fcw->synd_precoder = fcw->itstop;
+ /*
+ * These are all implicitly set
+ * fcw->synd_post = 0;
+ * fcw->so_en = 0;
+ * fcw->so_bypass_rm = 0;
+ * fcw->so_bypass_intlv = 0;
+ * fcw->dec_convllr = 0;
+ * fcw->hcout_convllr = 0;
+ * fcw->hcout_size1 = 0;
+ * fcw->so_it = 0;
+ * fcw->hcout_offset = 0;
+ * fcw->negstop_th = 0;
+ * fcw->negstop_it = 0;
+ * fcw->negstop_en = 0;
+ * fcw->gain_i = 1;
+ * fcw->gain_h = 1;
+ */
+ if (fcw->hcout_en > 0) {
+ parity_offset = (op->ldpc_dec.basegraph == 1 ? 20 : 8)
+ * op->ldpc_dec.z_c - op->ldpc_dec.n_filler;
+ k0_p = (fcw->k0 > parity_offset) ?
+ fcw->k0 - op->ldpc_dec.n_filler : fcw->k0;
+ ncb_p = fcw->ncb - op->ldpc_dec.n_filler;
+ l = k0_p + fcw->rm_e;
+ harq_out_length = (uint16_t) fcw->hcin_size0;
+ harq_out_length = RTE_MIN(RTE_MAX(harq_out_length, l), ncb_p);
+ harq_out_length = (harq_out_length + 0x3F) & 0xFFC0;
+ if ((k0_p > fcw->hcin_size0 + ACC100_HARQ_OFFSET_THRESHOLD) &&
+ harq_prun) {
+ fcw->hcout_size0 = (uint16_t) fcw->hcin_size0;
+ fcw->hcout_offset = k0_p & 0xFFC0;
+ fcw->hcout_size1 = harq_out_length - fcw->hcout_offset;
+ } else {
+ fcw->hcout_size0 = harq_out_length;
+ fcw->hcout_size1 = 0;
+ fcw->hcout_offset = 0;
+ }
+ harq_layout[harq_index].offset = fcw->hcout_offset;
+ harq_layout[harq_index].size0 = fcw->hcout_size0;
+ } else {
+ fcw->hcout_size0 = 0;
+ fcw->hcout_size1 = 0;
+ fcw->hcout_offset = 0;
+ }
+}
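
/* Editorial worked example, not part of the patch, for the HARQ input
 * sizing above: a harq_combined_input.length of 3000 bytes with 6-bit
 * compression enabled expands to 3000 * 8 / 6 = 4000 LLRs, then
 * RTE_ALIGN(4000, 64) = 4032, which lands in fcw->hcin_size0 when no
 * pruned layout (offset == 0) has been recorded for that HARQ index.
 */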
+
+/**
+ * Fills descriptor with data pointers of one block type.
+ *
+ * @param desc
+ * Pointer to DMA descriptor.
+ * @param input
+ * Pointer to a pointer to the input data to be encoded. It may be updated
+ * to point to the next segment in the scatter-gather case.
+ * @param offset
+ * Input offset within the rte_mbuf structure, used to locate where the
+ * data starts.
+ * @param cb_len
+ * Length of the currently processed Code Block.
+ * @param seg_total_left
+ * Number of bytes still left in the current segment (mbuf) for further
+ * processing.
+ * @param next_triplet
+ * Index of the next ACC100 DMA descriptor triplet to fill.
+ *
+ * @return
+ * Returns index of next triplet on success, other value if lengths of
+ * pkt and processed cb do not match.
+ *
+ */
+static inline int
+acc100_dma_fill_blk_type_in(struct acc100_dma_req_desc *desc,
+ struct rte_mbuf **input, uint32_t *offset, uint32_t cb_len,
+ uint32_t *seg_total_left, int next_triplet)
+{
+ uint32_t part_len;
+ struct rte_mbuf *m = *input;
+
+ part_len = (*seg_total_left < cb_len) ? *seg_total_left : cb_len;
+ cb_len -= part_len;
+ *seg_total_left -= part_len;
+
+ desc->data_ptrs[next_triplet].address =
+ rte_pktmbuf_iova_offset(m, *offset);
+ desc->data_ptrs[next_triplet].blen = part_len;
+ desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
+ desc->data_ptrs[next_triplet].last = 0;
+ desc->data_ptrs[next_triplet].dma_ext = 0;
+ *offset += part_len;
+ next_triplet++;
+
+ while (cb_len > 0) {
+ if (next_triplet < ACC100_DMA_MAX_NUM_POINTERS &&
+ m->next != NULL) {
+
+ m = m->next;
+ *seg_total_left = rte_pktmbuf_data_len(m);
+ part_len = (*seg_total_left < cb_len) ?
+ *seg_total_left :
+ cb_len;
+ desc->data_ptrs[next_triplet].address =
+ rte_pktmbuf_iova(m);
+ desc->data_ptrs[next_triplet].blen = part_len;
+ desc->data_ptrs[next_triplet].blkid =
+ ACC100_DMA_BLKID_IN;
+ desc->data_ptrs[next_triplet].last = 0;
+ desc->data_ptrs[next_triplet].dma_ext = 0;
+ cb_len -= part_len;
+ *seg_total_left -= part_len;
+ /* Initializing offset for next segment (mbuf) */
+ *offset = part_len;
+ next_triplet++;
+ } else {
+ rte_bbdev_log(ERR,
+ "Some data still left for processing: "
+ "data_left: %u, next_triplet: %u, next_mbuf: %p",
+ cb_len, next_triplet, m->next);
+ return -EINVAL;
+ }
+ }
+ /* Store the new mbuf as it may have changed in the scatter-gather case */
+ *input = m;
+
+ return next_triplet;
+}
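
/* Editorial sketch, not part of the patch: the segment-splitting rule of
 * acc100_dma_fill_blk_type_in() reduced to plain, runnable C. One
 * (address, length) triplet is emitted per mbuf segment touched by the
 * code block; the segment sizes below are illustrative only.
 */
#include <stdint.h>
#include <stdio.h>

static void
fill_triplets(const uint32_t *seg_len, int nb_segs, uint32_t cb_len)
{
	uint32_t left = seg_len[0]; /* bytes left in the current segment */
	int s = 0;

	while (cb_len > 0 && s < nb_segs) {
		uint32_t part = left < cb_len ? left : cb_len;

		printf("triplet: segment %d, blen %u\n", s, part);
		cb_len -= part;
		if (cb_len > 0 && ++s < nb_segs)
			left = seg_len[s]; /* hop to the next mbuf */
	}
}

int
main(void)
{
	const uint32_t segs[] = {4096, 4096};

	fill_triplets(segs, 2, 6000); /* -> 4096 from seg 0, 1904 from seg 1 */
	return 0;
}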
+
+/* Fills descriptor with data pointers of one block type.
+ * Returns index of next triplet on success, other value if lengths of
+ * output data and processed mbuf do not match.
+ */
+static inline int
+acc100_dma_fill_blk_type_out(struct acc100_dma_req_desc *desc,
+ struct rte_mbuf *output, uint32_t out_offset,
+ uint32_t output_len, int next_triplet, int blk_id)
+{
+ desc->data_ptrs[next_triplet].address =
+ rte_pktmbuf_iova_offset(output, out_offset);
+ desc->data_ptrs[next_triplet].blen = output_len;
+ desc->data_ptrs[next_triplet].blkid = blk_id;
+ desc->data_ptrs[next_triplet].last = 0;
+ desc->data_ptrs[next_triplet].dma_ext = 0;
+ next_triplet++;
+
+ return next_triplet;
+}
+
+static inline int
+acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
+ struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
+ struct rte_mbuf *output, uint32_t *in_offset,
+ uint32_t *out_offset, uint32_t *out_length,
+ uint32_t *mbuf_total_left, uint32_t *seg_total_left)
+{
+ int next_triplet = 1; /* FCW already done */
+ uint16_t K, in_length_in_bits, in_length_in_bytes;
+ struct rte_bbdev_op_ldpc_enc *enc = &op->ldpc_enc;
+
+ desc->word0 = ACC100_DMA_DESC_TYPE;
+ desc->word1 = 0; /**< Timestamp could be disabled */
+ desc->word2 = 0;
+ desc->word3 = 0;
+ desc->numCBs = 1;
+
+ K = (enc->basegraph == 1 ? 22 : 10) * enc->z_c;
+ in_length_in_bits = K - enc->n_filler;
+ if ((enc->op_flags & RTE_BBDEV_LDPC_CRC_24A_ATTACH) ||
+ (enc->op_flags & RTE_BBDEV_LDPC_CRC_24B_ATTACH))
+ in_length_in_bits -= 24;
+ in_length_in_bytes = in_length_in_bits >> 3;
+
+ if (unlikely((*mbuf_total_left == 0) ||
+ (*mbuf_total_left < in_length_in_bytes))) {
+ rte_bbdev_log(ERR,
+ "Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+ *mbuf_total_left, in_length_in_bytes);
+ return -1;
+ }
+
+ next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
+ in_length_in_bytes,
+ seg_total_left, next_triplet);
+ if (unlikely(next_triplet < 0)) {
+ rte_bbdev_log(ERR,
+ "Mismatch between data to process and mbuf data length in bbdev_op: %p",
+ op);
+ return -1;
+ }
+ desc->data_ptrs[next_triplet - 1].last = 1;
+ desc->m2dlen = next_triplet;
+ *mbuf_total_left -= in_length_in_bytes;
+
+ /* Set output length */
+ /* Integer round up division by 8 */
+ *out_length = (enc->cb_params.e + 7) >> 3;
+
+ next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
+ *out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
+ if (unlikely(next_triplet < 0)) {
+ rte_bbdev_log(ERR,
+ "Mismatch between data to process and mbuf data length in bbdev_op: %p",
+ op);
+ return -1;
+ }
+ op->ldpc_enc.output.length += *out_length;
+ *out_offset += *out_length;
+ desc->data_ptrs[next_triplet - 1].last = 1;
+ desc->data_ptrs[next_triplet - 1].dma_ext = 0;
+ desc->d2mlen = next_triplet - desc->m2dlen;
+
+ desc->op_addr = op;
+
+ return 0;
+}
+
+static inline int
+acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
+ struct acc100_dma_req_desc *desc,
+ struct rte_mbuf **input, struct rte_mbuf *h_output,
+ uint32_t *in_offset, uint32_t *h_out_offset,
+ uint32_t *h_out_length, uint32_t *mbuf_total_left,
+ uint32_t *seg_total_left,
+ struct acc100_fcw_ld *fcw)
+{
+ struct rte_bbdev_op_ldpc_dec *dec = &op->ldpc_dec;
+ int next_triplet = 1; /* FCW already done */
+ uint32_t input_length;
+ uint16_t output_length, crc24_overlap = 0;
+ uint16_t sys_cols, K, h_p_size, h_np_size;
+ bool h_comp = check_bit(dec->op_flags,
+ RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+
+ desc->word0 = ACC100_DMA_DESC_TYPE;
+ desc->word1 = 0; /**< Timestamp could be disabled */
+ desc->word2 = 0;
+ desc->word3 = 0;
+ desc->numCBs = 1;
+
+ if (check_bit(op->ldpc_dec.op_flags,
+ RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP))
+ crc24_overlap = 24;
+
+ /* Compute some LDPC BG lengths */
+ input_length = dec->cb_params.e;
+ if (check_bit(op->ldpc_dec.op_flags,
+ RTE_BBDEV_LDPC_LLR_COMPRESSION))
+ input_length = (input_length * 3 + 3) / 4;
+ sys_cols = (dec->basegraph == 1) ? 22 : 10;
+ K = sys_cols * dec->z_c;
+ output_length = K - dec->n_filler - crc24_overlap;
+
+ if (unlikely((*mbuf_total_left == 0) ||
+ (*mbuf_total_left < input_length))) {
+ rte_bbdev_log(ERR,
+ "Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+ *mbuf_total_left, input_length);
+ return -1;
+ }
+
+ next_triplet = acc100_dma_fill_blk_type_in(desc, input,
+ in_offset, input_length,
+ seg_total_left, next_triplet);
+
+ if (unlikely(next_triplet < 0)) {
+ rte_bbdev_log(ERR,
+ "Mismatch between data to process and mbuf data length in bbdev_op: %p",
+ op);
+ return -1;
+ }
+
+ if (check_bit(op->ldpc_dec.op_flags,
+ RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
+ h_p_size = fcw->hcin_size0 + fcw->hcin_size1;
+ if (h_comp)
+ h_p_size = (h_p_size * 3 + 3) / 4;
+ desc->data_ptrs[next_triplet].address =
+ dec->harq_combined_input.offset;
+ desc->data_ptrs[next_triplet].blen = h_p_size;
+ desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN_HARQ;
+ desc->data_ptrs[next_triplet].dma_ext = 1;
+#ifndef ACC100_EXT_MEM
+ acc100_dma_fill_blk_type_out(
+ desc,
+ op->ldpc_dec.harq_combined_input.data,
+ op->ldpc_dec.harq_combined_input.offset,
+ h_p_size,
+ next_triplet,
+ ACC100_DMA_BLKID_IN_HARQ);
+#endif
+ next_triplet++;
+ }
+
+ desc->data_ptrs[next_triplet - 1].last = 1;
+ desc->m2dlen = next_triplet;
+ *mbuf_total_left -= input_length;
+
+ next_triplet = acc100_dma_fill_blk_type_out(desc, h_output,
+ *h_out_offset, output_length >> 3, next_triplet,
+ ACC100_DMA_BLKID_OUT_HARD);
+ if (unlikely(next_triplet < 0)) {
+ rte_bbdev_log(ERR,
+ "Mismatch between data to process and mbuf data length in bbdev_op: %p",
+ op);
+ return -1;
+ }
+
+ if (check_bit(op->ldpc_dec.op_flags,
+ RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
+ /* Pruned size of the HARQ */
+ h_p_size = fcw->hcout_size0 + fcw->hcout_size1;
+ /* Non-Pruned size of the HARQ */
+ h_np_size = fcw->hcout_offset > 0 ?
+ fcw->hcout_offset + fcw->hcout_size1 :
+ h_p_size;
+ if (h_comp) {
+ h_np_size = (h_np_size * 3 + 3) / 4;
+ h_p_size = (h_p_size * 3 + 3) / 4;
+ }
+ dec->harq_combined_output.length = h_np_size;
+ desc->data_ptrs[next_triplet].address =
+ dec->harq_combined_output.offset;
+ desc->data_ptrs[next_triplet].blen = h_p_size;
+ desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_OUT_HARQ;
+ desc->data_ptrs[next_triplet].dma_ext = 1;
+#ifndef ACC100_EXT_MEM
+ acc100_dma_fill_blk_type_out(
+ desc,
+ dec->harq_combined_output.data,
+ dec->harq_combined_output.offset,
+ h_p_size,
+ next_triplet,
+ ACC100_DMA_BLKID_OUT_HARQ);
+#endif
+ next_triplet++;
+ }
+
+ *h_out_length = output_length >> 3;
+ dec->hard_output.length += *h_out_length;
+ *h_out_offset += *h_out_length;
+ desc->data_ptrs[next_triplet - 1].last = 1;
+ desc->d2mlen = next_triplet - desc->m2dlen;
+
+ desc->op_addr = op;
+
+ return 0;
+}
+
+static inline void
+acc100_dma_desc_ld_update(struct rte_bbdev_dec_op *op,
+ struct acc100_dma_req_desc *desc,
+ struct rte_mbuf *input, struct rte_mbuf *h_output,
+ uint32_t *in_offset, uint32_t *h_out_offset,
+ uint32_t *h_out_length,
+ union acc100_harq_layout_data *harq_layout)
+{
+ int next_triplet = 1; /* FCW already done */
+ desc->data_ptrs[next_triplet].address =
+ rte_pktmbuf_iova_offset(input, *in_offset);
+ next_triplet++;
+
+ if (check_bit(op->ldpc_dec.op_flags,
+ RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
+ struct rte_bbdev_op_data hi = op->ldpc_dec.harq_combined_input;
+ desc->data_ptrs[next_triplet].address = hi.offset;
+#ifndef ACC100_EXT_MEM
+ desc->data_ptrs[next_triplet].address =
+ rte_pktmbuf_iova_offset(hi.data, hi.offset);
+#endif
+ next_triplet++;
+ }
+
+ desc->data_ptrs[next_triplet].address =
+ rte_pktmbuf_iova_offset(h_output, *h_out_offset);
+ *h_out_length = desc->data_ptrs[next_triplet].blen;
+ next_triplet++;
+
+ if (check_bit(op->ldpc_dec.op_flags,
+ RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
+ desc->data_ptrs[next_triplet].address =
+ op->ldpc_dec.harq_combined_output.offset;
+ /* Adjust based on previous operation */
+ struct rte_bbdev_dec_op *prev_op = desc->op_addr;
+ op->ldpc_dec.harq_combined_output.length =
+ prev_op->ldpc_dec.harq_combined_output.length;
+ int16_t hq_idx = op->ldpc_dec.harq_combined_output.offset /
+ ACC100_HARQ_OFFSET;
+ int16_t prev_hq_idx =
+ prev_op->ldpc_dec.harq_combined_output.offset
+ / ACC100_HARQ_OFFSET;
+ harq_layout[hq_idx].val = harq_layout[prev_hq_idx].val;
+#ifndef ACC100_EXT_MEM
+ struct rte_bbdev_op_data ho =
+ op->ldpc_dec.harq_combined_output;
+ desc->data_ptrs[next_triplet].address =
+ rte_pktmbuf_iova_offset(ho.data, ho.offset);
+#endif
+ next_triplet++;
+ }
+
+ op->ldpc_dec.hard_output.length += *h_out_length;
+ desc->op_addr = op;
+}
+
+
+/* Enqueue a number of operations to HW and update software rings */
+static inline void
+acc100_dma_enqueue(struct acc100_queue *q, uint16_t n,
+ struct rte_bbdev_stats *queue_stats)
+{
+ union acc100_enqueue_reg_fmt enq_req;
+#ifdef RTE_BBDEV_OFFLOAD_COST
+ uint64_t start_time = 0;
+ queue_stats->acc_offload_cycles = 0;
+#else
+ RTE_SET_USED(queue_stats);
+#endif
+
+ enq_req.val = 0;
+ /* Setting offset, 100b for 256 DMA Desc */
+ enq_req.addr_offset = ACC100_DESC_OFFSET;
+
+ /* Split ops into batches */
+ do {
+ union acc100_dma_desc *desc;
+ uint16_t enq_batch_size;
+ uint64_t offset;
+ rte_iova_t req_elem_addr;
+
+ enq_batch_size = RTE_MIN(n, MAX_ENQ_BATCH_SIZE);
+
+ /* Set flag on last descriptor in a batch */
+ desc = q->ring_addr + ((q->sw_ring_head + enq_batch_size - 1) &
+ q->sw_ring_wrap_mask);
+ desc->req.last_desc_in_batch = 1;
+
+ /* Calculate the 1st descriptor's address */
+ offset = ((q->sw_ring_head & q->sw_ring_wrap_mask) *
+ sizeof(union acc100_dma_desc));
+ req_elem_addr = q->ring_addr_phys + offset;
+
+ /* Fill enqueue struct */
+ enq_req.num_elem = enq_batch_size;
+ /* low 6 bits are not needed */
+ enq_req.req_elem_addr = (uint32_t)(req_elem_addr >> 6);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+ rte_memdump(stderr, "Req sdone", desc, sizeof(*desc));
+#endif
+ rte_bbdev_log_debug(
+ "Enqueue %u reqs (phys %#"PRIx64") to reg %p",
+ enq_batch_size,
+ req_elem_addr,
+ (void *)q->mmio_reg_enqueue);
+
+ rte_wmb();
+
+#ifdef RTE_BBDEV_OFFLOAD_COST
+ /* Start time measurement for enqueue function offload. */
+ start_time = rte_rdtsc_precise();
+#endif
+ rte_bbdev_log(DEBUG, "Debug : MMIO Enqueue");
+ mmio_write(q->mmio_reg_enqueue, enq_req.val);
+
+#ifdef RTE_BBDEV_OFFLOAD_COST
+ queue_stats->acc_offload_cycles +=
+ rte_rdtsc_precise() - start_time;
+#endif
+
+ q->aq_enqueued++;
+ q->sw_ring_head += enq_batch_size;
+ n -= enq_batch_size;
+
+ } while (n);
+}
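
/* Editorial sketch, not part of the patch: the batch-splitting loop above
 * as plain, runnable C. MAX_ENQ_BATCH_SIZE is defined elsewhere in the
 * PMD; 255 is an assumed value for illustration only.
 */
#include <stdint.h>
#include <stdio.h>

#define MAX_ENQ_BATCH_SIZE 255

int
main(void)
{
	uint16_t n = 600, head = 0;

	while (n) { /* one MMIO doorbell write per batch */
		uint16_t batch = n < MAX_ENQ_BATCH_SIZE ?
				n : MAX_ENQ_BATCH_SIZE;

		printf("doorbell: %u descs from head %u\n", batch, head);
		head += batch;
		n -= batch;
	}
	return 0; /* 600 descriptors -> 3 doorbell writes */
}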
+
+/* Enqueue a group of muxed encode operations for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
+ uint16_t total_enqueued_cbs, int16_t num)
+{
+ union acc100_dma_desc *desc = NULL;
+ uint32_t out_length;
+ struct rte_mbuf *output_head, *output;
+ int i, next_triplet;
+ uint16_t in_length_in_bytes;
+ struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
+
+ uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+ & q->sw_ring_wrap_mask);
+ desc = q->ring_addr + desc_idx;
+ acc100_fcw_le_fill(ops[0], &desc->req.fcw_le, num);
+
+ /** This could be done at polling */
+ desc->req.word0 = ACC100_DMA_DESC_TYPE;
+ desc->req.word1 = 0; /**< Timestamp could be disabled */
+ desc->req.word2 = 0;
+ desc->req.word3 = 0;
+ desc->req.numCBs = num;
+
+ in_length_in_bytes = ops[0]->ldpc_enc.input.data->data_len;
+ out_length = (enc->cb_params.e + 7) >> 3;
+ desc->req.m2dlen = 1 + num;
+ desc->req.d2mlen = num;
+ next_triplet = 1;
+
+ for (i = 0; i < num; i++) {
+ desc->req.data_ptrs[next_triplet].address =
+ rte_pktmbuf_iova_offset(ops[i]->ldpc_enc.input.data, 0);
+ desc->req.data_ptrs[next_triplet].blen = in_length_in_bytes;
+ next_triplet++;
+ desc->req.data_ptrs[next_triplet].address =
+ rte_pktmbuf_iova_offset(
+ ops[i]->ldpc_enc.output.data, 0);
+ desc->req.data_ptrs[next_triplet].blen = out_length;
+ next_triplet++;
+ ops[i]->ldpc_enc.output.length = out_length;
+ output_head = output = ops[i]->ldpc_enc.output.data;
+ mbuf_append(output_head, output, out_length);
+ output->data_len = out_length;
+ }
+
+ desc->req.op_addr = ops[0];
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+ rte_memdump(stderr, "FCW", &desc->req.fcw_le,
+ sizeof(desc->req.fcw_le) - 8);
+ rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+ /* The num ops were successfully muxed into one descriptor for enqueue */
+ return num;
+}
+
+/* Enqueue one encode operation for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+ uint16_t total_enqueued_cbs)
+{
+ union acc100_dma_desc *desc = NULL;
+ int ret;
+ uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+ seg_total_left;
+ struct rte_mbuf *input, *output_head, *output;
+
+ uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+ & q->sw_ring_wrap_mask);
+ desc = q->ring_addr + desc_idx;
+ acc100_fcw_le_fill(op, &desc->req.fcw_le, 1);
+
+ input = op->ldpc_enc.input.data;
+ output_head = output = op->ldpc_enc.output.data;
+ in_offset = op->ldpc_enc.input.offset;
+ out_offset = op->ldpc_enc.output.offset;
+ out_length = 0;
+ mbuf_total_left = op->ldpc_enc.input.length;
+ seg_total_left = rte_pktmbuf_data_len(op->ldpc_enc.input.data)
+ - in_offset;
+
+ ret = acc100_dma_desc_le_fill(op, &desc->req, &input, output,
+ &in_offset, &out_offset, &out_length, &mbuf_total_left,
+ &seg_total_left);
+
+ if (unlikely(ret < 0))
+ return ret;
+
+ mbuf_append(output_head, output, out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+ rte_memdump(stderr, "FCW", &desc->req.fcw_le,
+ sizeof(desc->req.fcw_le) - 8);
+ rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+
+ /* Check if any data left after processing one CB */
+ if (mbuf_total_left != 0) {
+ rte_bbdev_log(ERR,
+ "Some date still left after processing one CB: mbuf_total_left = %u",
+ mbuf_total_left);
+ return -EINVAL;
+ }
+#endif
+ /* One CB (one op) was successfully prepared to enqueue */
+ return 1;
+}
+
+/* Enqueue one decode operation for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+ uint16_t total_enqueued_cbs, bool same_op)
+{
+ int ret;
+
+ union acc100_dma_desc *desc;
+ uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+ & q->sw_ring_wrap_mask);
+ desc = q->ring_addr + desc_idx;
+ struct rte_mbuf *input, *h_output_head, *h_output;
+ uint32_t in_offset, h_out_offset, h_out_length, mbuf_total_left;
+ input = op->ldpc_dec.input.data;
+ h_output_head = h_output = op->ldpc_dec.hard_output.data;
+ in_offset = op->ldpc_dec.input.offset;
+ h_out_offset = op->ldpc_dec.hard_output.offset;
+ mbuf_total_left = op->ldpc_dec.input.length;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+ if (unlikely(input == NULL)) {
+ rte_bbdev_log(ERR, "Invalid mbuf pointer");
+ return -EFAULT;
+ }
+#endif
+ union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+
+ if (same_op) {
+ union acc100_dma_desc *prev_desc;
+ desc_idx = ((q->sw_ring_head + total_enqueued_cbs - 1)
+ & q->sw_ring_wrap_mask);
+ prev_desc = q->ring_addr + desc_idx;
+ uint8_t *prev_ptr = (uint8_t *) prev_desc;
+ uint8_t *new_ptr = (uint8_t *) desc;
+ /* Copy first 4 words and BDESCs */
+ rte_memcpy(new_ptr, prev_ptr, 16);
+ rte_memcpy(new_ptr + 36, prev_ptr + 36, 40);
+ desc->req.op_addr = prev_desc->req.op_addr;
+ /* Copy FCW */
+ rte_memcpy(new_ptr + ACC100_DESC_FCW_OFFSET,
+ prev_ptr + ACC100_DESC_FCW_OFFSET,
+ ACC100_FCW_LD_BLEN);
+ acc100_dma_desc_ld_update(op, &desc->req, input, h_output,
+ &in_offset, &h_out_offset,
+ &h_out_length, harq_layout);
+ } else {
+ struct acc100_fcw_ld *fcw;
+ uint32_t seg_total_left;
+ fcw = &desc->req.fcw_ld;
+ acc100_fcw_ld_fill(op, fcw, harq_layout);
+
+ /* Special handling when e is too large to fit in a single mbuf */
+ if (fcw->rm_e < MAX_E_MBUF)
+ seg_total_left = rte_pktmbuf_data_len(input)
+ - in_offset;
+ else
+ seg_total_left = fcw->rm_e;
+
+ ret = acc100_dma_desc_ld_fill(op, &desc->req, &input, h_output,
+ &in_offset, &h_out_offset,
+ &h_out_length, &mbuf_total_left,
+ &seg_total_left, fcw);
+ if (unlikely(ret < 0))
+ return ret;
+ }
+
+ /* Hard output */
+ mbuf_append(h_output_head, h_output, h_out_length);
+#ifndef ACC100_EXT_MEM
+ if (op->ldpc_dec.harq_combined_output.length > 0) {
+ /* Push the HARQ output into host memory */
+ struct rte_mbuf *hq_output_head, *hq_output;
+ hq_output_head = op->ldpc_dec.harq_combined_output.data;
+ hq_output = op->ldpc_dec.harq_combined_output.data;
+ mbuf_append(hq_output_head, hq_output,
+ op->ldpc_dec.harq_combined_output.length);
+ }
+#endif
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+ rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
+ sizeof(desc->req.fcw_ld) - 8);
+ rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+ /* One CB (one op) was successfully prepared to enqueue */
+ return 1;
+}
+
+
+/* Enqueue one decode operation for ACC100 device in TB mode */
+static inline int
+enqueue_ldpc_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+ uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
+{
+ union acc100_dma_desc *desc = NULL;
+ int ret;
+ uint8_t r, c;
+ uint32_t in_offset, h_out_offset,
+ h_out_length, mbuf_total_left, seg_total_left;
+ struct rte_mbuf *input, *h_output_head, *h_output;
+ uint16_t current_enqueued_cbs = 0;
+
+ uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+ & q->sw_ring_wrap_mask);
+ desc = q->ring_addr + desc_idx;
+ uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+ union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+ acc100_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout);
+
+ input = op->ldpc_dec.input.data;
+ h_output_head = h_output = op->ldpc_dec.hard_output.data;
+ in_offset = op->ldpc_dec.input.offset;
+ h_out_offset = op->ldpc_dec.hard_output.offset;
+ h_out_length = 0;
+ mbuf_total_left = op->ldpc_dec.input.length;
+ c = op->ldpc_dec.tb_params.c;
+ r = op->ldpc_dec.tb_params.r;
+
+ while (mbuf_total_left > 0 && r < c) {
+
+ seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+
+ /* Set up DMA descriptor */
+ desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+ & q->sw_ring_wrap_mask);
+ desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+ desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
+ ret = acc100_dma_desc_ld_fill(op, &desc->req, &input,
+ h_output, &in_offset, &h_out_offset,
+ &h_out_length,
+ &mbuf_total_left, &seg_total_left,
+ &desc->req.fcw_ld);
+
+ if (unlikely(ret < 0))
+ return ret;
+
+ /* Hard output */
+ mbuf_append(h_output_head, h_output, h_out_length);
+
+ /* Set total number of CBs in TB */
+ desc->req.cbs_in_tb = cbs_in_tb;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+ rte_memdump(stderr, "FCW", &desc->req.fcw_td,
+ sizeof(desc->req.fcw_td) - 8);
+ rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+ if (seg_total_left == 0) {
+ /* Go to the next mbuf */
+ input = input->next;
+ in_offset = 0;
+ h_output = h_output->next;
+ h_out_offset = 0;
+ }
+ total_enqueued_cbs++;
+ current_enqueued_cbs++;
+ r++;
+ }
+
+ if (unlikely(desc == NULL))
+ return current_enqueued_cbs;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+ /* Check if any CBs left for processing */
+ if (mbuf_total_left != 0) {
+ rte_bbdev_log(ERR,
+ "Some date still left for processing: mbuf_total_left = %u",
+ mbuf_total_left);
+ return -EINVAL;
+ }
+#endif
+ /* Set SDone on last CB descriptor for TB mode */
+ desc->req.sdone_enable = 1;
+ desc->req.irq_enable = q->irq_enable;
+
+ return current_enqueued_cbs;
+}
+
+
+/* Calculates number of CBs in processed encoder TB based on 'r' and input
+ * length.
+ */
+static inline uint8_t
+get_num_cbs_in_tb_enc(struct rte_bbdev_op_turbo_enc *turbo_enc)
+{
+ uint8_t c, c_neg, r, crc24_bits = 0;
+ uint16_t k, k_neg, k_pos;
+ uint8_t cbs_in_tb = 0;
+ int32_t length;
+
+ length = turbo_enc->input.length;
+ r = turbo_enc->tb_params.r;
+ c = turbo_enc->tb_params.c;
+ c_neg = turbo_enc->tb_params.c_neg;
+ k_neg = turbo_enc->tb_params.k_neg;
+ k_pos = turbo_enc->tb_params.k_pos;
+ crc24_bits = 0;
+ if (check_bit(turbo_enc->op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH))
+ crc24_bits = 24;
+ while (length > 0 && r < c) {
+ k = (r < c_neg) ? k_neg : k_pos;
+ length -= (k - crc24_bits) >> 3;
+ r++;
+ cbs_in_tb++;
+ }
+
+ return cbs_in_tb;
+}
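
/* Editorial worked example, not part of the patch: with CRC24B attach,
 * k_pos = 6144, k_neg = 512 and c_neg = 0, each CB consumes
 * (6144 - 24) / 8 = 765 bytes of input, so an input.length of 2295 bytes
 * with r = 0 and c = 3 yields cbs_in_tb = 3.
 */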
+
+/* Calculates number of CBs in processed decoder TB based on 'r' and input
+ * length.
+ */
+static inline uint16_t
+get_num_cbs_in_tb_dec(struct rte_bbdev_op_turbo_dec *turbo_dec)
+{
+ uint8_t c, c_neg, r = 0;
+ uint16_t kw, k, k_neg, k_pos, cbs_in_tb = 0;
+ int32_t length;
+
+ length = turbo_dec->input.length;
+ r = turbo_dec->tb_params.r;
+ c = turbo_dec->tb_params.c;
+ c_neg = turbo_dec->tb_params.c_neg;
+ k_neg = turbo_dec->tb_params.k_neg;
+ k_pos = turbo_dec->tb_params.k_pos;
+ while (length > 0 && r < c) {
+ k = (r < c_neg) ? k_neg : k_pos;
+ kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
+ length -= kw;
+ r++;
+ cbs_in_tb++;
+ }
+
+ return cbs_in_tb;
+}
+
+/* Calculates number of CBs in processed decoder TB based on 'r' and input
+ * length.
+ */
+static inline uint16_t
+get_num_cbs_in_tb_ldpc_dec(struct rte_bbdev_op_ldpc_dec *ldpc_dec)
+{
+ uint16_t r, cbs_in_tb = 0;
+ int32_t length = ldpc_dec->input.length;
+ r = ldpc_dec->tb_params.r;
+ while (length > 0 && r < ldpc_dec->tb_params.c) {
+ length -= (r < ldpc_dec->tb_params.cab) ?
+ ldpc_dec->tb_params.ea :
+ ldpc_dec->tb_params.eb;
+ r++;
+ cbs_in_tb++;
+ }
+ return cbs_in_tb;
+}
+
+/* Check we can mux encode operations with common FCW */
+static inline bool
+check_mux(struct rte_bbdev_enc_op **ops, uint16_t num) {
+ uint16_t i;
+ if (num == 1)
+ return false;
+ for (i = 1; i < num; ++i) {
+ /* Only mux compatible code blocks */
+ if (memcmp((uint8_t *)(&ops[i]->ldpc_enc) + ENC_OFFSET,
+ (uint8_t *)(&ops[0]->ldpc_enc) + ENC_OFFSET,
+ CMP_ENC_SIZE) != 0)
+ return false;
+ }
+ return true;
+}
+
+/** Enqueue encode operations for ACC100 device in CB mode. */
+static inline uint16_t
+acc100_enqueue_ldpc_enc_cb(struct rte_bbdev_queue_data *q_data,
+ struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+ struct acc100_queue *q = q_data->queue_private;
+ int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+ uint16_t i = 0;
+ union acc100_dma_desc *desc;
+ int ret, desc_idx = 0;
+ int16_t enq, left = num;
+
+ while (left > 0) {
+ if (unlikely(avail - 1 < 0))
+ break;
+ avail--;
+ enq = RTE_MIN(left, MUX_5GDL_DESC);
+ if (check_mux(&ops[i], enq)) {
+ ret = enqueue_ldpc_enc_n_op_cb(q, &ops[i],
+ desc_idx, enq);
+ if (ret < 0)
+ break;
+ i += enq;
+ } else {
+ ret = enqueue_ldpc_enc_one_op_cb(q, ops[i], desc_idx);
+ if (ret < 0)
+ break;
+ i++;
+ }
+ desc_idx++;
+ left = num - i;
+ }
+
+ if (unlikely(i == 0))
+ return 0; /* Nothing to enqueue */
+
+ /* Set SDone in last CB in enqueued ops for CB mode */
+ desc = q->ring_addr + ((q->sw_ring_head + desc_idx - 1)
+ & q->sw_ring_wrap_mask);
+ desc->req.sdone_enable = 1;
+ desc->req.irq_enable = q->irq_enable;
+
+ acc100_dma_enqueue(q, desc_idx, &q_data->queue_stats);
+
+ /* Update stats */
+ q_data->queue_stats.enqueued_count += i;
+ q_data->queue_stats.enqueue_err_count += num - i;
+
+ return i;
+}
+
+/* Enqueue encode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
+ struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+ if (unlikely(num == 0))
+ return 0;
+ return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
+}
+
+/* Check we can mux decode operations with common FCW */
+static inline bool
+cmp_ldpc_dec_op(struct rte_bbdev_dec_op **ops) {
+ /* Only mux compatible code blocks */
+ return memcmp((uint8_t *)(&ops[0]->ldpc_dec) + DEC_OFFSET,
+ (uint8_t *)(&ops[1]->ldpc_dec) + DEC_OFFSET,
+ CMP_DEC_SIZE) == 0;
+}
+
+
+/* Enqueue decode operations for ACC100 device in TB mode */
+static uint16_t
+acc100_enqueue_ldpc_dec_tb(struct rte_bbdev_queue_data *q_data,
+ struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+ struct acc100_queue *q = q_data->queue_private;
+ int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+ uint16_t i, enqueued_cbs = 0;
+ uint8_t cbs_in_tb;
+ int ret;
+
+ for (i = 0; i < num; ++i) {
+ cbs_in_tb = get_num_cbs_in_tb_ldpc_dec(&ops[i]->ldpc_dec);
+ /* Check if there is enough space available for further processing */
+ if (unlikely(avail - cbs_in_tb < 0))
+ break;
+ avail -= cbs_in_tb;
+
+ ret = enqueue_ldpc_dec_one_op_tb(q, ops[i],
+ enqueued_cbs, cbs_in_tb);
+ if (ret < 0)
+ break;
+ enqueued_cbs += ret;
+ }
+
+ acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
+
+ /* Update stats */
+ q_data->queue_stats.enqueued_count += i;
+ q_data->queue_stats.enqueue_err_count += num - i;
+ return i;
+}
+
+/* Enqueue decode operations for ACC100 device in CB mode */
+static uint16_t
+acc100_enqueue_ldpc_dec_cb(struct rte_bbdev_queue_data *q_data,
+ struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+ struct acc100_queue *q = q_data->queue_private;
+ int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+ uint16_t i;
+ union acc100_dma_desc *desc;
+ int ret;
+ bool same_op = false;
+ for (i = 0; i < num; ++i) {
+ /* Check if there is enough space available for further processing */
+ if (unlikely(avail - 1 < 0))
+ break;
+ avail -= 1;
+
+ if (i > 0)
+ same_op = cmp_ldpc_dec_op(&ops[i-1]);
+ rte_bbdev_log(INFO, "Op %d %d %d %d %d %d %d %d %d %d %d %d\n",
+ i, ops[i]->ldpc_dec.op_flags, ops[i]->ldpc_dec.rv_index,
+ ops[i]->ldpc_dec.iter_max, ops[i]->ldpc_dec.iter_count,
+ ops[i]->ldpc_dec.basegraph, ops[i]->ldpc_dec.z_c,
+ ops[i]->ldpc_dec.n_cb, ops[i]->ldpc_dec.q_m,
+ ops[i]->ldpc_dec.n_filler, ops[i]->ldpc_dec.cb_params.e,
+ same_op);
+ ret = enqueue_ldpc_dec_one_op_cb(q, ops[i], i, same_op);
+ if (ret < 0)
+ break;
+ }
+
+ if (unlikely(i == 0))
+ return 0; /* Nothing to enqueue */
+
+ /* Set SDone in last CB in enqueued ops for CB mode */
+ desc = q->ring_addr + ((q->sw_ring_head + i - 1)
+ & q->sw_ring_wrap_mask);
+
+ desc->req.sdone_enable = 1;
+ desc->req.irq_enable = q->irq_enable;
+
+ acc100_dma_enqueue(q, i, &q_data->queue_stats);
+
+ /* Update stats */
+ q_data->queue_stats.enqueued_count += i;
+ q_data->queue_stats.enqueue_err_count += num - i;
+ return i;
+}
+
+/* Enqueue decode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
+ struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+ struct acc100_queue *q = q_data->queue_private;
+ int32_t aq_avail = q->aq_depth +
+ (q->aq_dequeued - q->aq_enqueued) / 128;
+
+ if (unlikely((aq_avail == 0) || (num == 0)))
+ return 0;
+
+ if (ops[0]->ldpc_dec.code_block_mode == 0)
+ return acc100_enqueue_ldpc_dec_tb(q_data, ops, num);
+ else
+ return acc100_enqueue_ldpc_dec_cb(q_data, ops, num);
+}
+
+
+/* Dequeue one encode descriptor (which may carry several muxed CBs) from ACC100 device in CB mode */
+static inline int
+dequeue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
+ uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
+{
+ union acc100_dma_desc *desc, atom_desc;
+ union acc100_dma_rsp_desc rsp;
+ struct rte_bbdev_enc_op *op;
+ int i;
+
+ desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
+ & q->sw_ring_wrap_mask);
+ atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+ __ATOMIC_RELAXED);
+
+ /* Check fdone bit */
+ if (!(atom_desc.rsp.val & ACC100_FDONE))
+ return -1;
+
+ rsp.val = atom_desc.rsp.val;
+ rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
+
+ /* Dequeue */
+ op = desc->req.op_addr;
+
+ /* Clearing status, it will be set based on response */
+ op->status = 0;
+
+ op->status |= ((rsp.input_err)
+ ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+ op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+ op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+ if (desc->req.last_desc_in_batch) {
+ (*aq_dequeued)++;
+ desc->req.last_desc_in_batch = 0;
+ }
+ desc->rsp.val = ACC100_DMA_DESC_TYPE;
+ desc->rsp.add_info_0 = 0; /* Reserved bits */
+ desc->rsp.add_info_1 = 0; /* Reserved bits */
+
+ /* Flag that the muxing causes loss of opaque data */
+ op->opaque_data = (void *)-1;
+ for (i = 0 ; i < desc->req.numCBs; i++)
+ ref_op[i] = op;
+
+ /* All CBs muxed into this descriptor were successfully dequeued */
+ return desc->req.numCBs;
+}
+
+/* Dequeue one encode operation from ACC100 device in TB mode */
+static inline int
+dequeue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
+ uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
+{
+ union acc100_dma_desc *desc, *last_desc, atom_desc;
+ union acc100_dma_rsp_desc rsp;
+ struct rte_bbdev_enc_op *op;
+ uint8_t i = 0;
+ uint16_t current_dequeued_cbs = 0, cbs_in_tb;
+
+ desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
+ & q->sw_ring_wrap_mask);
+ atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+ __ATOMIC_RELAXED);
+
+ /* Check fdone bit */
+ if (!(atom_desc.rsp.val & ACC100_FDONE))
+ return -1;
+
+ /* Get number of CBs in dequeued TB */
+ cbs_in_tb = desc->req.cbs_in_tb;
+ /* Get last CB */
+ last_desc = q->ring_addr + ((q->sw_ring_tail
+ + total_dequeued_cbs + cbs_in_tb - 1)
+ & q->sw_ring_wrap_mask);
+ /* Check if last CB in TB is ready to dequeue (and thus
+ * the whole TB) - checking sdone bit. If not return.
+ */
+ atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
+ __ATOMIC_RELAXED);
+ if (!(atom_desc.rsp.val & ACC100_SDONE))
+ return -1;
+
+ /* Dequeue */
+ op = desc->req.op_addr;
+
+ /* Clearing status, it will be set based on response */
+ op->status = 0;
+
+ while (i < cbs_in_tb) {
+ desc = q->ring_addr + ((q->sw_ring_tail
+ + total_dequeued_cbs)
+ & q->sw_ring_wrap_mask);
+ atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+ __ATOMIC_RELAXED);
+ rsp.val = atom_desc.rsp.val;
+ rte_bbdev_log_debug("Resp. desc %p: %x", desc,
+ rsp.val);
+
+ op->status |= ((rsp.input_err)
+ ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+ op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+ op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+ if (desc->req.last_desc_in_batch) {
+ (*aq_dequeued)++;
+ desc->req.last_desc_in_batch = 0;
+ }
+ desc->rsp.val = ACC100_DMA_DESC_TYPE;
+ desc->rsp.add_info_0 = 0;
+ desc->rsp.add_info_1 = 0;
+ total_dequeued_cbs++;
+ current_dequeued_cbs++;
+ i++;
+ }
+
+ *ref_op = op;
+
+ return current_dequeued_cbs;
+}
+
+/* Dequeue one decode operation from ACC100 device in CB mode */
+static inline int
+dequeue_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
+ struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+ uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+ union acc100_dma_desc *desc, atom_desc;
+ union acc100_dma_rsp_desc rsp;
+ struct rte_bbdev_dec_op *op;
+
+ desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+ & q->sw_ring_wrap_mask);
+ atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+ __ATOMIC_RELAXED);
+
+ /* Check fdone bit */
+ if (!(atom_desc.rsp.val & ACC100_FDONE))
+ return -1;
+
+ rsp.val = atom_desc.rsp.val;
+ rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
+
+ /* Dequeue */
+ op = desc->req.op_addr;
+
+ /* Clearing status, it will be set based on response */
+ op->status = 0;
+ op->status |= ((rsp.input_err)
+ ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+ op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+ op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+ if (op->status != 0)
+ q_data->queue_stats.dequeue_err_count++;
+
+ /* CRC invalid if error exists */
+ if (!op->status)
+ op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+ op->turbo_dec.iter_count = (uint8_t) rsp.iter_cnt / 2;
+ /* Check if this is the last desc in batch (Atomic Queue) */
+ if (desc->req.last_desc_in_batch) {
+ (*aq_dequeued)++;
+ desc->req.last_desc_in_batch = 0;
+ }
+ desc->rsp.val = ACC100_DMA_DESC_TYPE;
+ desc->rsp.add_info_0 = 0;
+ desc->rsp.add_info_1 = 0;
+ *ref_op = op;
+
+ /* One CB (op) was successfully dequeued */
+ return 1;
+}
+
+/* Dequeue one LDPC decode operation from ACC100 device in CB mode */
+static inline int
+dequeue_ldpc_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
+ struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+ uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+ union acc100_dma_desc *desc, atom_desc;
+ union acc100_dma_rsp_desc rsp;
+ struct rte_bbdev_dec_op *op;
+
+ desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+ & q->sw_ring_wrap_mask);
+ atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+ __ATOMIC_RELAXED);
+
+ /* Check fdone bit */
+ if (!(atom_desc.rsp.val & ACC100_FDONE))
+ return -1;
+
+ rsp.val = atom_desc.rsp.val;
+
+ /* Dequeue */
+ op = desc->req.op_addr;
+
+ /* Clearing status, it will be set based on response */
+ op->status = 0;
+ op->status |= rsp.input_err << RTE_BBDEV_DATA_ERROR;
+ op->status |= rsp.dma_err << RTE_BBDEV_DRV_ERROR;
+ op->status |= rsp.fcw_err << RTE_BBDEV_DRV_ERROR;
+ if (op->status != 0)
+ q_data->queue_stats.dequeue_err_count++;
+
+ op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+ if (op->ldpc_dec.hard_output.length > 0 && !rsp.synd_ok)
+ op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
+ op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
+
+ /* Check if this is the last desc in batch (Atomic Queue) */
+ if (desc->req.last_desc_in_batch) {
+ (*aq_dequeued)++;
+ desc->req.last_desc_in_batch = 0;
+ }
+
+ desc->rsp.val = ACC100_DMA_DESC_TYPE;
+ desc->rsp.add_info_0 = 0;
+ desc->rsp.add_info_1 = 0;
+
+ *ref_op = op;
+
+ /* One CB (op) was successfully dequeued */
+ return 1;
+}
+
+/* Dequeue one decode operation from ACC100 device in TB mode. */
+static inline int
+dequeue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+ uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+ union acc100_dma_desc *desc, *last_desc, atom_desc;
+ union acc100_dma_rsp_desc rsp;
+ struct rte_bbdev_dec_op *op;
+ uint8_t cbs_in_tb = 1, cb_idx = 0;
+
+ desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+ & q->sw_ring_wrap_mask);
+ atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+ __ATOMIC_RELAXED);
+
+ /* Check fdone bit */
+ if (!(atom_desc.rsp.val & ACC100_FDONE))
+ return -1;
+
+ /* Dequeue */
+ op = desc->req.op_addr;
+
+ /* Get number of CBs in dequeued TB */
+ cbs_in_tb = desc->req.cbs_in_tb;
+ /* Get last CB */
+ last_desc = q->ring_addr + ((q->sw_ring_tail
+ + dequeued_cbs + cbs_in_tb - 1)
+ & q->sw_ring_wrap_mask);
+ /* Check if last CB in TB is ready to dequeue (and thus
+ * the whole TB) - checking sdone bit. If not return.
+ */
+ atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
+ __ATOMIC_RELAXED);
+ if (!(atom_desc.rsp.val & ACC100_SDONE))
+ return -1;
+
+ /* Clearing status, it will be set based on response */
+ op->status = 0;
+
+ /* Read remaining CBs if any exist */
+ while (cb_idx < cbs_in_tb) {
+ desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+ & q->sw_ring_wrap_mask);
+ atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+ __ATOMIC_RELAXED);
+ rsp.val = atom_desc.rsp.val;
+ rte_bbdev_log_debug("Resp. desc %p: %x", desc,
+ rsp.val);
+
+ op->status |= ((rsp.input_err)
+ ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+ op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+ op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+ /* CRC invalid if error exists */
+ if (!op->status)
+ op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+ op->turbo_dec.iter_count = RTE_MAX((uint8_t) rsp.iter_cnt,
+ op->turbo_dec.iter_count);
+
+ /* Check if this is the last desc in batch (Atomic Queue) */
+ if (desc->req.last_desc_in_batch) {
+ (*aq_dequeued)++;
+ desc->req.last_desc_in_batch = 0;
+ }
+ desc->rsp.val = ACC100_DMA_DESC_TYPE;
+ desc->rsp.add_info_0 = 0;
+ desc->rsp.add_info_1 = 0;
+ dequeued_cbs++;
+ cb_idx++;
+ }
+
+ *ref_op = op;
+
+ return cb_idx;
+}
+
+/* Dequeue LDPC encode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
+ struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+ struct acc100_queue *q = q_data->queue_private;
+ uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+ uint32_t aq_dequeued = 0;
+ uint16_t dequeue_num, i, dequeued_cbs = 0, dequeued_descs = 0;
+ int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+ if (unlikely(ops == NULL || q == NULL))
+ return 0;
+#endif
+
+ dequeue_num = (avail < num) ? avail : num;
+
+ for (i = 0; i < dequeue_num; i++) {
+ ret = dequeue_enc_one_op_cb(q, &ops[dequeued_cbs],
+ dequeued_descs, &aq_dequeued);
+ if (ret < 0)
+ break;
+ dequeued_cbs += ret;
+ dequeued_descs++;
+ if (dequeued_cbs >= num)
+ break;
+ }
+
+ q->aq_dequeued += aq_dequeued;
+ q->sw_ring_tail += dequeued_descs;
+
+ /* Update dequeue stats */
+ q_data->queue_stats.dequeued_count += dequeued_cbs;
+
+ return dequeued_cbs;
+}
+
+/* Dequeue decode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
+ struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+ struct acc100_queue *q = q_data->queue_private;
+ uint16_t dequeue_num;
+ uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+ uint32_t aq_dequeued = 0;
+ uint16_t i;
+ uint16_t dequeued_cbs = 0;
+ struct rte_bbdev_dec_op *op;
+ int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+ if (unlikely(ops == NULL || q == NULL))
+ return 0;
+#endif
+
+ dequeue_num = (avail < num) ? avail : num;
+
+ for (i = 0; i < dequeue_num; ++i) {
+ op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+ & q->sw_ring_wrap_mask))->req.op_addr;
+ if (op->ldpc_dec.code_block_mode == 0)
+ ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
+ &aq_dequeued);
+ else
+ ret = dequeue_ldpc_dec_one_op_cb(
+ q_data, q, &ops[i], dequeued_cbs,
+ &aq_dequeued);
+
+ if (ret < 0)
+ break;
+ dequeued_cbs += ret;
+ }
+
+ q->aq_dequeued += aq_dequeued;
+ q->sw_ring_tail += dequeued_cbs;
+
+ /* Update dequeue stats */
+ q_data->queue_stats.dequeued_count += i;
+
+ return i;
+}
+
/* Initialization Function */
static void
acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
@@ -703,6 +2321,10 @@
struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
dev->dev_ops = &acc100_bbdev_ops;
+ dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
+ dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
+ dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
+ dev->dequeue_ldpc_dec_ops = acc100_dequeue_ldpc_dec;
((struct acc100_device *) dev->data->dev_private)->pf_device =
!strcmp(drv->driver.name,
@@ -815,4 +2437,3 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
-
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 0e2b79c..78686c1 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -88,6 +88,8 @@
#define TMPL_PRI_3 0x0f0e0d0c
#define QUEUE_ENABLE 0x80000000 /* Bit to mark Queue as Enabled */
#define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
+#define ACC100_FDONE 0x80000000
+#define ACC100_SDONE 0x40000000
#define ACC100_NUM_TMPL 32
#define VF_OFFSET_QOS 16 /* offset in Memory Space specific to QoS Mon */
@@ -398,6 +400,7 @@ struct __rte_packed acc100_dma_req_desc {
union acc100_dma_desc {
struct acc100_dma_req_desc req;
union acc100_dma_rsp_desc rsp;
+ uint64_t atom_hdr;
};
--
1.8.3.1
^ permalink raw reply [flat|nested] 213+ messages in thread
* [dpdk-dev] [PATCH v3 06/11] baseband/acc100: add HARQ loopback support
2020-08-19 0:25 [dpdk-dev] [PATCH v3 00/11] bbdev PMD ACC100 Nicolas Chautru
` (4 preceding siblings ...)
2020-08-19 0:25 ` [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions Nicolas Chautru
@ 2020-08-19 0:25 ` Nicolas Chautru
2020-08-19 0:25 ` [dpdk-dev] [PATCH v3 07/11] baseband/acc100: add support for 4G processing Nicolas Chautru
` (4 subsequent siblings)
10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-08-19 0:25 UTC (permalink / raw)
To: dev, akhil.goyal; +Cc: bruce.richardson, Nicolas Chautru
Additional support for HARQ memory loopback
Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
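
Note for reviewers: a minimal sketch (not part of the patch) of how an
application would request a loopback of HARQ data already resident in
device DDR; the offset value is a placeholder and op allocation from an
op pool is omitted.

#include <rte_bbdev_op.h>

static void
set_harq_loopback(struct rte_bbdev_dec_op *op, uint32_t byte_len)
{
	op->ldpc_dec.op_flags =
		RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK |
		RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE;
	/* Input already sits in device DDR at this offset */
	op->ldpc_dec.harq_combined_input.offset = 0;
	op->ldpc_dec.harq_combined_input.length = byte_len;
	/* harq_combined_output.data mbuf must be attached by the app
	 * to receive the looped-back data in host memory.
	 */
}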
drivers/baseband/acc100/rte_acc100_pmd.c | 158 +++++++++++++++++++++++++++++++
1 file changed, 158 insertions(+)
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 5f32813..b44b2f5 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -658,6 +658,7 @@
RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
#ifdef ACC100_EXT_MEM
+ RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK |
RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
#endif
@@ -1480,12 +1481,169 @@
return 1;
}
+static inline int
+harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+ uint16_t total_enqueued_cbs) {
+ struct acc100_fcw_ld *fcw;
+ union acc100_dma_desc *desc;
+ int next_triplet = 1;
+ struct rte_mbuf *hq_output_head, *hq_output;
+ uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
+ if (harq_in_length == 0) {
+ rte_bbdev_log(ERR, "Loopback of invalid null size\n");
+ return -EINVAL;
+ }
+
+ int h_comp = check_bit(op->ldpc_dec.op_flags,
+ RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
+ ) ? 1 : 0;
+ if (h_comp == 1)
+ harq_in_length = harq_in_length * 8 / 6;
+ harq_in_length = RTE_ALIGN(harq_in_length, 64);
+ uint16_t harq_dma_length_in = (h_comp == 0) ?
+ harq_in_length :
+ harq_in_length * 6 / 8;
+ uint16_t harq_dma_length_out = harq_dma_length_in;
+ bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
+ RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
+ union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+ uint16_t harq_index = (ddr_mem_in ?
+ op->ldpc_dec.harq_combined_input.offset :
+ op->ldpc_dec.harq_combined_output.offset)
+ / ACC100_HARQ_OFFSET;
+
+ uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+ & q->sw_ring_wrap_mask);
+ desc = q->ring_addr + desc_idx;
+ fcw = &desc->req.fcw_ld;
+ /* Set the FCW from loopback into DDR */
+ memset(fcw, 0, sizeof(struct acc100_fcw_ld));
+ fcw->FCWversion = ACC100_FCW_VER;
+ fcw->qm = 2;
+ fcw->Zc = 384;
+ if (harq_in_length < 16 * N_ZC_1)
+ fcw->Zc = 16;
+ fcw->ncb = fcw->Zc * N_ZC_1;
+ fcw->rm_e = 2;
+ fcw->hcin_en = 1;
+ fcw->hcout_en = 1;
+
+ rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d\n",
+ ddr_mem_in, harq_index,
+ harq_layout[harq_index].offset, harq_in_length,
+ harq_dma_length_in);
+
+ if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
+ fcw->hcin_size0 = harq_layout[harq_index].size0;
+ fcw->hcin_offset = harq_layout[harq_index].offset;
+ fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
+ harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
+ if (h_comp == 1)
+ harq_dma_length_in = harq_dma_length_in * 6 / 8;
+ } else {
+ fcw->hcin_size0 = harq_in_length;
+ }
+ harq_layout[harq_index].val = 0;
+ rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
+ fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
+ fcw->hcout_size0 = harq_in_length;
+ fcw->hcin_decomp_mode = h_comp;
+ fcw->hcout_comp_mode = h_comp;
+ fcw->gain_i = 1;
+ fcw->gain_h = 1;
+
+ /* Set the prefix of descriptor. This could be done at polling */
+ desc->req.word0 = ACC100_DMA_DESC_TYPE;
+ desc->req.word1 = 0; /**< Timestamp could be disabled */
+ desc->req.word2 = 0;
+ desc->req.word3 = 0;
+ desc->req.numCBs = 1;
+
+ /* Null LLR input for Decoder */
+ desc->req.data_ptrs[next_triplet].address =
+ q->lb_in_addr_phys;
+ desc->req.data_ptrs[next_triplet].blen = 2;
+ desc->req.data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
+ desc->req.data_ptrs[next_triplet].last = 0;
+ desc->req.data_ptrs[next_triplet].dma_ext = 0;
+ next_triplet++;
+
+ /* HARQ Combine input from either Memory interface */
+ if (!ddr_mem_in) {
+ next_triplet = acc100_dma_fill_blk_type_out(&desc->req,
+ op->ldpc_dec.harq_combined_input.data,
+ op->ldpc_dec.harq_combined_input.offset,
+ harq_dma_length_in,
+ next_triplet,
+ ACC100_DMA_BLKID_IN_HARQ);
+ } else {
+ desc->req.data_ptrs[next_triplet].address =
+ op->ldpc_dec.harq_combined_input.offset;
+ desc->req.data_ptrs[next_triplet].blen =
+ harq_dma_length_in;
+ desc->req.data_ptrs[next_triplet].blkid =
+ ACC100_DMA_BLKID_IN_HARQ;
+ desc->req.data_ptrs[next_triplet].dma_ext = 1;
+ next_triplet++;
+ }
+ desc->req.data_ptrs[next_triplet - 1].last = 1;
+ desc->req.m2dlen = next_triplet;
+
+ /* Dropped decoder hard output */
+ desc->req.data_ptrs[next_triplet].address =
+ q->lb_out_addr_phys;
+ desc->req.data_ptrs[next_triplet].blen = BYTES_IN_WORD;
+ desc->req.data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_OUT_HARD;
+ desc->req.data_ptrs[next_triplet].last = 0;
+ desc->req.data_ptrs[next_triplet].dma_ext = 0;
+ next_triplet++;
+
+ /* HARQ Combine output to either Memory interface */
+ if (check_bit(op->ldpc_dec.op_flags,
+ RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE
+ )) {
+ desc->req.data_ptrs[next_triplet].address =
+ op->ldpc_dec.harq_combined_output.offset;
+ desc->req.data_ptrs[next_triplet].blen =
+ harq_dma_length_out;
+ desc->req.data_ptrs[next_triplet].blkid =
+ ACC100_DMA_BLKID_OUT_HARQ;
+ desc->req.data_ptrs[next_triplet].dma_ext = 1;
+ next_triplet++;
+ } else {
+ hq_output_head = op->ldpc_dec.harq_combined_output.data;
+ hq_output = op->ldpc_dec.harq_combined_output.data;
+ next_triplet = acc100_dma_fill_blk_type_out(
+ &desc->req,
+ op->ldpc_dec.harq_combined_output.data,
+ op->ldpc_dec.harq_combined_output.offset,
+ harq_dma_length_out,
+ next_triplet,
+ ACC100_DMA_BLKID_OUT_HARQ);
+ /* HARQ output */
+ mbuf_append(hq_output_head, hq_output, harq_dma_length_out);
+ op->ldpc_dec.harq_combined_output.length =
+ harq_dma_length_out;
+ }
+ desc->req.data_ptrs[next_triplet - 1].last = 1;
+ desc->req.d2mlen = next_triplet - desc->req.m2dlen;
+ desc->req.op_addr = op;
+
+ /* One CB (one op) was successfully prepared to enqueue */
+ return 1;
+}
+
/** Enqueue one decode operation for ACC100 device in CB mode */
static inline int
enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
uint16_t total_enqueued_cbs, bool same_op)
{
int ret;
+ if (unlikely(check_bit(op->ldpc_dec.op_flags,
+ RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK))) {
+ ret = harq_loopback(q, op, total_enqueued_cbs);
+ return ret;
+ }
union acc100_dma_desc *desc;
uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
--
1.8.3.1
^ permalink raw reply [flat|nested] 213+ messages in thread
* [dpdk-dev] [PATCH v3 07/11] baseband/acc100: add support for 4G processing
2020-08-19 0:25 [dpdk-dev] [PATCH v3 00/11] bbdev PMD ACC100 Nicolas Chautru
` (5 preceding siblings ...)
2020-08-19 0:25 ` [dpdk-dev] [PATCH v3 06/11] baseband/acc100: add HARQ loopback support Nicolas Chautru
@ 2020-08-19 0:25 ` Nicolas Chautru
2020-08-19 0:25 ` [dpdk-dev] [PATCH v3 08/11] baseband/acc100: add interrupt support to PMD Nicolas Chautru
` (3 subsequent siblings)
10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-08-19 0:25 UTC (permalink / raw)
To: dev, akhil.goyal; +Cc: bruce.richardson, Nicolas Chautru
Adding capability for 4G encode and decode processing
Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
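Usage sketch for reviewers (illustrative only, not part of the diff below):
the helper name and the parameter values are hypothetical, chosen to satisfy
the 3GPP Turbo constraints checked later in this series (k <= 6144, e even,
ncb = 3 * ceil32(k + 4)). It shows a single 4G encode enqueued in CB mode
through the generic bbdev API that this patch wires up.

#include <rte_bbdev.h>
#include <rte_bbdev_op.h>

/* Hypothetical helper: enqueue a single Turbo encode in CB mode.
 * Device, queue, op and mbufs are assumed to be set up elsewhere.
 */
static uint16_t
enqueue_one_turbo_enc(uint16_t dev_id, uint16_t queue_id,
		struct rte_bbdev_enc_op *op,
		struct rte_mbuf *in, struct rte_mbuf *out)
{
	op->turbo_enc.code_block_mode = 1; /* CB mode; 0 selects TB mode */
	op->turbo_enc.cb_params.k = 6144; /* CB size in bits */
	op->turbo_enc.cb_params.e = 10376; /* rate-matched output bits, even */
	op->turbo_enc.cb_params.ncb = 18528; /* 3 * ceil32(k + 4) */
	op->turbo_enc.rv_index = 0;
	op->turbo_enc.op_flags = RTE_BBDEV_TURBO_CRC_24B_ATTACH |
			RTE_BBDEV_TURBO_RATE_MATCH;
	op->turbo_enc.input.data = in;
	op->turbo_enc.input.offset = 0;
	op->turbo_enc.input.length = (6144 - 24) / 8; /* payload bytes */
	op->turbo_enc.output.data = out;
	op->turbo_enc.output.offset = 0;
	return rte_bbdev_enqueue_enc_ops(dev_id, queue_id, &op, 1);
}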
drivers/baseband/acc100/rte_acc100_pmd.c | 1010 ++++++++++++++++++++++++++++--
1 file changed, 943 insertions(+), 67 deletions(-)
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index b44b2f5..1de7531 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -339,7 +339,6 @@
free_base_addresses(base_addrs, i);
}
-
/* Allocate 64MB memory used for all software rings */
static int
acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
@@ -637,6 +636,41 @@
static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
{
+ .type = RTE_BBDEV_OP_TURBO_DEC,
+ .cap.turbo_dec = {
+ .capability_flags =
+ RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE |
+ RTE_BBDEV_TURBO_CRC_TYPE_24B |
+ RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
+ RTE_BBDEV_TURBO_EARLY_TERMINATION |
+ RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
+ RTE_BBDEV_TURBO_MAP_DEC |
+ RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP |
+ RTE_BBDEV_TURBO_DEC_SCATTER_GATHER,
+ .max_llr_modulus = INT8_MAX,
+ .num_buffers_src =
+ RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+ .num_buffers_hard_out =
+ RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+ .num_buffers_soft_out =
+ RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+ }
+ },
+ {
+ .type = RTE_BBDEV_OP_TURBO_ENC,
+ .cap.turbo_enc = {
+ .capability_flags =
+ RTE_BBDEV_TURBO_CRC_24B_ATTACH |
+ RTE_BBDEV_TURBO_RV_INDEX_BYPASS |
+ RTE_BBDEV_TURBO_RATE_MATCH |
+ RTE_BBDEV_TURBO_ENC_SCATTER_GATHER,
+ .num_buffers_src =
+ RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+ .num_buffers_dst =
+ RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+ }
+ },
+ {
.type = RTE_BBDEV_OP_LDPC_ENC,
.cap.ldpc_enc = {
.capability_flags =
@@ -719,7 +753,6 @@
#endif
}
-
static const struct rte_bbdev_ops acc100_bbdev_ops = {
.setup_queues = acc100_setup_queues,
.close = acc100_dev_close,
@@ -763,6 +796,58 @@
return tail;
}
+/* Fill in a frame control word for turbo encoding. */
+static inline void
+acc100_fcw_te_fill(const struct rte_bbdev_enc_op *op, struct acc100_fcw_te *fcw)
+{
+ fcw->code_block_mode = op->turbo_enc.code_block_mode;
+ if (fcw->code_block_mode == 0) { /* For TB mode */
+ fcw->k_neg = op->turbo_enc.tb_params.k_neg;
+ fcw->k_pos = op->turbo_enc.tb_params.k_pos;
+ fcw->c_neg = op->turbo_enc.tb_params.c_neg;
+ fcw->c = op->turbo_enc.tb_params.c;
+ fcw->ncb_neg = op->turbo_enc.tb_params.ncb_neg;
+ fcw->ncb_pos = op->turbo_enc.tb_params.ncb_pos;
+
+ if (check_bit(op->turbo_enc.op_flags,
+ RTE_BBDEV_TURBO_RATE_MATCH)) {
+ fcw->bypass_rm = 0;
+ fcw->cab = op->turbo_enc.tb_params.cab;
+ fcw->ea = op->turbo_enc.tb_params.ea;
+ fcw->eb = op->turbo_enc.tb_params.eb;
+ } else {
+ /* E is set to the encoding output size when RM is
+ * bypassed.
+ */
+ fcw->bypass_rm = 1;
+ fcw->cab = fcw->c_neg;
+ fcw->ea = 3 * fcw->k_neg + 12;
+ fcw->eb = 3 * fcw->k_pos + 12;
+ }
+ } else { /* For CB mode */
+ fcw->k_pos = op->turbo_enc.cb_params.k;
+ fcw->ncb_pos = op->turbo_enc.cb_params.ncb;
+
+ if (check_bit(op->turbo_enc.op_flags,
+ RTE_BBDEV_TURBO_RATE_MATCH)) {
+ fcw->bypass_rm = 0;
+ fcw->eb = op->turbo_enc.cb_params.e;
+ } else {
+ /* E is set to the encoding output size when RM is
+ * bypassed.
+ */
+ fcw->bypass_rm = 1;
+ fcw->eb = 3 * fcw->k_pos + 12;
+ }
+ }
+
+ fcw->bypass_rv_idx1 = check_bit(op->turbo_enc.op_flags,
+ RTE_BBDEV_TURBO_RV_INDEX_BYPASS);
+ fcw->code_block_crc = check_bit(op->turbo_enc.op_flags,
+ RTE_BBDEV_TURBO_CRC_24B_ATTACH);
+ fcw->rv_idx1 = op->turbo_enc.rv_index;
+}
+
/* Compute value of k0.
* Based on 3GPP 38.212 Table 5.4.2.1-2
* Starting position of different redundancy versions, k0
@@ -813,6 +898,25 @@
fcw->mcb_count = num_cb;
}
+/* Fill in a frame control word for turbo decoding. */
+static inline void
+acc100_fcw_td_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_td *fcw)
+{
+ /* Note : Early termination is always enabled for 4GUL */
+ fcw->fcw_ver = 1;
+ if (op->turbo_dec.code_block_mode == 0)
+ fcw->k_pos = op->turbo_dec.tb_params.k_pos;
+ else
+ fcw->k_pos = op->turbo_dec.cb_params.k;
+ fcw->turbo_crc_type = check_bit(op->turbo_dec.op_flags,
+ RTE_BBDEV_TURBO_CRC_TYPE_24B);
+ fcw->bypass_sb_deint = 0;
+ fcw->raw_decoder_input_on = 0;
+ fcw->max_iter = op->turbo_dec.iter_max;
+ fcw->half_iter_on = !check_bit(op->turbo_dec.op_flags,
+ RTE_BBDEV_TURBO_HALF_ITERATION_EVEN);
+}
+
/* Fill in a frame control word for LDPC decoding. */
static inline void
acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
@@ -1042,6 +1146,87 @@
}
static inline int
+acc100_dma_desc_te_fill(struct rte_bbdev_enc_op *op,
+ struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
+ struct rte_mbuf *output, uint32_t *in_offset,
+ uint32_t *out_offset, uint32_t *out_length,
+ uint32_t *mbuf_total_left, uint32_t *seg_total_left, uint8_t r)
+{
+ int next_triplet = 1; /* FCW already done */
+ uint32_t e, ea, eb, length;
+ uint16_t k, k_neg, k_pos;
+ uint8_t cab, c_neg;
+
+ desc->word0 = ACC100_DMA_DESC_TYPE;
+ desc->word1 = 0; /**< Timestamp could be disabled */
+ desc->word2 = 0;
+ desc->word3 = 0;
+ desc->numCBs = 1;
+
+ if (op->turbo_enc.code_block_mode == 0) {
+ ea = op->turbo_enc.tb_params.ea;
+ eb = op->turbo_enc.tb_params.eb;
+ cab = op->turbo_enc.tb_params.cab;
+ k_neg = op->turbo_enc.tb_params.k_neg;
+ k_pos = op->turbo_enc.tb_params.k_pos;
+ c_neg = op->turbo_enc.tb_params.c_neg;
+ e = (r < cab) ? ea : eb;
+ k = (r < c_neg) ? k_neg : k_pos;
+ } else {
+ e = op->turbo_enc.cb_params.e;
+ k = op->turbo_enc.cb_params.k;
+ }
+
+ if (check_bit(op->turbo_enc.op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH))
+ length = (k - 24) >> 3;
+ else
+ length = k >> 3;
+
+ if (unlikely((*mbuf_total_left == 0) || (*mbuf_total_left < length))) {
+ rte_bbdev_log(ERR,
+ "Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+ *mbuf_total_left, length);
+ return -1;
+ }
+
+ next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
+ length, seg_total_left, next_triplet);
+ if (unlikely(next_triplet < 0)) {
+ rte_bbdev_log(ERR,
+ "Mismatch between data to process and mbuf data length in bbdev_op: %p",
+ op);
+ return -1;
+ }
+ desc->data_ptrs[next_triplet - 1].last = 1;
+ desc->m2dlen = next_triplet;
+ *mbuf_total_left -= length;
+
+ /* Set output length */
+ if (check_bit(op->turbo_enc.op_flags, RTE_BBDEV_TURBO_RATE_MATCH))
+ /* Integer round up division by 8 */
+ *out_length = (e + 7) >> 3;
+ else
+ *out_length = (k >> 3) * 3 + 2;
+
+ next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
+ *out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
+ if (unlikely(next_triplet < 0)) {
+ rte_bbdev_log(ERR,
+ "Mismatch between data to process and mbuf data length in bbdev_op: %p",
+ op);
+ return -1;
+ }
+ op->turbo_enc.output.length += *out_length;
+ *out_offset += *out_length;
+ desc->data_ptrs[next_triplet - 1].last = 1;
+ desc->d2mlen = next_triplet - desc->m2dlen;
+
+ desc->op_addr = op;
+
+ return 0;
+}
+
+static inline int
acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
struct rte_mbuf *output, uint32_t *in_offset,
@@ -1110,6 +1295,117 @@
}
static inline int
+acc100_dma_desc_td_fill(struct rte_bbdev_dec_op *op,
+ struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
+ struct rte_mbuf *h_output, struct rte_mbuf *s_output,
+ uint32_t *in_offset, uint32_t *h_out_offset,
+ uint32_t *s_out_offset, uint32_t *h_out_length,
+ uint32_t *s_out_length, uint32_t *mbuf_total_left,
+ uint32_t *seg_total_left, uint8_t r)
+{
+ int next_triplet = 1; /* FCW already done */
+ uint16_t k;
+ uint16_t crc24_overlap = 0;
+ uint32_t e, kw;
+
+ desc->word0 = ACC100_DMA_DESC_TYPE;
+ desc->word1 = 0; /**< Timestamp could be disabled */
+ desc->word2 = 0;
+ desc->word3 = 0;
+ desc->numCBs = 1;
+
+ if (op->turbo_dec.code_block_mode == 0) {
+ k = (r < op->turbo_dec.tb_params.c_neg)
+ ? op->turbo_dec.tb_params.k_neg
+ : op->turbo_dec.tb_params.k_pos;
+ e = (r < op->turbo_dec.tb_params.cab)
+ ? op->turbo_dec.tb_params.ea
+ : op->turbo_dec.tb_params.eb;
+ } else {
+ k = op->turbo_dec.cb_params.k;
+ e = op->turbo_dec.cb_params.e;
+ }
+
+ if ((op->turbo_dec.code_block_mode == 0)
+ && !check_bit(op->turbo_dec.op_flags,
+ RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP))
+ crc24_overlap = 24;
+
+ /* Calculates circular buffer size.
+ * According to 3gpp 36.212 section 5.1.4.2
+ * Kw = 3 * Kpi,
+ * where:
+ * Kpi = nCol * nRow
+ * where nCol is 32 and nRow can be calculated from:
+ * D =< nCol * nRow
+ * where D is the size of each output from turbo encoder block (k + 4).
+ */
+ kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
+
+ if (unlikely((*mbuf_total_left == 0) || (*mbuf_total_left < kw))) {
+ rte_bbdev_log(ERR,
+ "Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+ *mbuf_total_left, kw);
+ return -1;
+ }
+
+ next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset, kw,
+ seg_total_left, next_triplet);
+ if (unlikely(next_triplet < 0)) {
+ rte_bbdev_log(ERR,
+ "Mismatch between data to process and mbuf data length in bbdev_op: %p",
+ op);
+ return -1;
+ }
+ desc->data_ptrs[next_triplet - 1].last = 1;
+ desc->m2dlen = next_triplet;
+ *mbuf_total_left -= kw;
+
+ next_triplet = acc100_dma_fill_blk_type_out(
+ desc, h_output, *h_out_offset,
+ k >> 3, next_triplet, ACC100_DMA_BLKID_OUT_HARD);
+ if (unlikely(next_triplet < 0)) {
+ rte_bbdev_log(ERR,
+ "Mismatch between data to process and mbuf data length in bbdev_op: %p",
+ op);
+ return -1;
+ }
+
+ *h_out_length = ((k - crc24_overlap) >> 3);
+ op->turbo_dec.hard_output.length += *h_out_length;
+ *h_out_offset += *h_out_length;
+
+ /* Soft output */
+ if (check_bit(op->turbo_dec.op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT)) {
+ if (check_bit(op->turbo_dec.op_flags,
+ RTE_BBDEV_TURBO_EQUALIZER))
+ *s_out_length = e;
+ else
+ *s_out_length = (k * 3) + 12;
+
+ next_triplet = acc100_dma_fill_blk_type_out(desc, s_output,
+ *s_out_offset, *s_out_length, next_triplet,
+ ACC100_DMA_BLKID_OUT_SOFT);
+ if (unlikely(next_triplet < 0)) {
+ rte_bbdev_log(ERR,
+ "Mismatch between data to process and mbuf data length in bbdev_op: %p",
+ op);
+ return -1;
+ }
+
+ op->turbo_dec.soft_output.length += *s_out_length;
+ *s_out_offset += *s_out_length;
+ }
+
+ desc->data_ptrs[next_triplet - 1].last = 1;
+ desc->d2mlen = next_triplet - desc->m2dlen;
+
+ desc->op_addr = op;
+
+ return 0;
+}
+
+static inline int
acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
struct acc100_dma_req_desc *desc,
struct rte_mbuf **input, struct rte_mbuf *h_output,
@@ -1374,6 +1670,57 @@
/* Enqueue one encode operation for ACC100 device in CB mode */
static inline int
+enqueue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+ uint16_t total_enqueued_cbs)
+{
+ union acc100_dma_desc *desc = NULL;
+ int ret;
+ uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+ seg_total_left;
+ struct rte_mbuf *input, *output_head, *output;
+
+ uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+ & q->sw_ring_wrap_mask);
+ desc = q->ring_addr + desc_idx;
+ acc100_fcw_te_fill(op, &desc->req.fcw_te);
+
+ input = op->turbo_enc.input.data;
+ output_head = output = op->turbo_enc.output.data;
+ in_offset = op->turbo_enc.input.offset;
+ out_offset = op->turbo_enc.output.offset;
+ out_length = 0;
+ mbuf_total_left = op->turbo_enc.input.length;
+ seg_total_left = rte_pktmbuf_data_len(op->turbo_enc.input.data)
+ - in_offset;
+
+ ret = acc100_dma_desc_te_fill(op, &desc->req, &input, output,
+ &in_offset, &out_offset, &out_length, &mbuf_total_left,
+ &seg_total_left, 0);
+
+ if (unlikely(ret < 0))
+ return ret;
+
+ mbuf_append(output_head, output, out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+ rte_memdump(stderr, "FCW", &desc->req.fcw_te,
+ sizeof(desc->req.fcw_te) - 8);
+ rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+
+ /* Check if any data left after processing one CB */
+ if (mbuf_total_left != 0) {
+ rte_bbdev_log(ERR,
+ "Some date still left after processing one CB: mbuf_total_left = %u",
+ mbuf_total_left);
+ return -EINVAL;
+ }
+#endif
+ /* One CB (one op) was successfully prepared to enqueue */
+ return 1;
+}
+
+/* Enqueue LDPC encode operations for ACC100 device in CB mode */
+static inline int
enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
uint16_t total_enqueued_cbs, int16_t num)
{
@@ -1481,78 +1828,235 @@
return 1;
}
-static inline int
-harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
- uint16_t total_enqueued_cbs) {
- struct acc100_fcw_ld *fcw;
- union acc100_dma_desc *desc;
- int next_triplet = 1;
- struct rte_mbuf *hq_output_head, *hq_output;
- uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
- if (harq_in_length == 0) {
- rte_bbdev_log(ERR, "Loopback of invalid null size\n");
- return -EINVAL;
- }
- int h_comp = check_bit(op->ldpc_dec.op_flags,
- RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
- ) ? 1 : 0;
- if (h_comp == 1)
- harq_in_length = harq_in_length * 8 / 6;
- harq_in_length = RTE_ALIGN(harq_in_length, 64);
- uint16_t harq_dma_length_in = (h_comp == 0) ?
- harq_in_length :
- harq_in_length * 6 / 8;
- uint16_t harq_dma_length_out = harq_dma_length_in;
- bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
- RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
- union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
- uint16_t harq_index = (ddr_mem_in ?
- op->ldpc_dec.harq_combined_input.offset :
- op->ldpc_dec.harq_combined_output.offset)
- / ACC100_HARQ_OFFSET;
+/* Enqueue one encode operation for ACC100 device in TB mode. */
+static inline int
+enqueue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+ uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
+{
+ union acc100_dma_desc *desc = NULL;
+ int ret;
+ uint8_t r, c;
+ uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+ seg_total_left;
+ struct rte_mbuf *input, *output_head, *output;
+ uint16_t current_enqueued_cbs = 0;
uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
& q->sw_ring_wrap_mask);
desc = q->ring_addr + desc_idx;
- fcw = &desc->req.fcw_ld;
- /* Set the FCW from loopback into DDR */
- memset(fcw, 0, sizeof(struct acc100_fcw_ld));
- fcw->FCWversion = ACC100_FCW_VER;
- fcw->qm = 2;
- fcw->Zc = 384;
- if (harq_in_length < 16 * N_ZC_1)
- fcw->Zc = 16;
- fcw->ncb = fcw->Zc * N_ZC_1;
- fcw->rm_e = 2;
- fcw->hcin_en = 1;
- fcw->hcout_en = 1;
+ uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+ acc100_fcw_te_fill(op, &desc->req.fcw_te);
- rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d\n",
- ddr_mem_in, harq_index,
- harq_layout[harq_index].offset, harq_in_length,
- harq_dma_length_in);
+ input = op->turbo_enc.input.data;
+ output_head = output = op->turbo_enc.output.data;
+ in_offset = op->turbo_enc.input.offset;
+ out_offset = op->turbo_enc.output.offset;
+ out_length = 0;
+ mbuf_total_left = op->turbo_enc.input.length;
- if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
- fcw->hcin_size0 = harq_layout[harq_index].size0;
- fcw->hcin_offset = harq_layout[harq_index].offset;
- fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
- harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
- if (h_comp == 1)
- harq_dma_length_in = harq_dma_length_in * 6 / 8;
- } else {
- fcw->hcin_size0 = harq_in_length;
- }
- harq_layout[harq_index].val = 0;
- rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
- fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
- fcw->hcout_size0 = harq_in_length;
- fcw->hcin_decomp_mode = h_comp;
- fcw->hcout_comp_mode = h_comp;
- fcw->gain_i = 1;
- fcw->gain_h = 1;
+ c = op->turbo_enc.tb_params.c;
+ r = op->turbo_enc.tb_params.r;
- /* Set the prefix of descriptor. This could be done at polling */
+ while (mbuf_total_left > 0 && r < c) {
+ seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+ /* Set up DMA descriptor */
+ desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+ & q->sw_ring_wrap_mask);
+ desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+ desc->req.data_ptrs[0].blen = ACC100_FCW_TE_BLEN;
+
+ ret = acc100_dma_desc_te_fill(op, &desc->req, &input, output,
+ &in_offset, &out_offset, &out_length,
+ &mbuf_total_left, &seg_total_left, r);
+ if (unlikely(ret < 0))
+ return ret;
+ mbuf_append(output_head, output, out_length);
+
+ /* Set total number of CBs in TB */
+ desc->req.cbs_in_tb = cbs_in_tb;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+ rte_memdump(stderr, "FCW", &desc->req.fcw_te,
+ sizeof(desc->req.fcw_te) - 8);
+ rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+ if (seg_total_left == 0) {
+ /* Go to the next mbuf */
+ input = input->next;
+ in_offset = 0;
+ output = output->next;
+ out_offset = 0;
+ }
+
+ total_enqueued_cbs++;
+ current_enqueued_cbs++;
+ r++;
+ }
+
+ if (unlikely(desc == NULL))
+ return current_enqueued_cbs;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+ /* Check if any CBs left for processing */
+ if (mbuf_total_left != 0) {
+ rte_bbdev_log(ERR,
+ "Some date still left for processing: mbuf_total_left = %u",
+ mbuf_total_left);
+ return -EINVAL;
+ }
+#endif
+
+ /* Set SDone on last CB descriptor for TB mode. */
+ desc->req.sdone_enable = 1;
+ desc->req.irq_enable = q->irq_enable;
+
+ return current_enqueued_cbs;
+}
+
+/** Enqueue one decode operation for ACC100 device in CB mode */
+static inline int
+enqueue_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+ uint16_t total_enqueued_cbs)
+{
+ union acc100_dma_desc *desc = NULL;
+ int ret;
+ uint32_t in_offset, h_out_offset, s_out_offset, s_out_length,
+ h_out_length, mbuf_total_left, seg_total_left;
+ struct rte_mbuf *input, *h_output_head, *h_output,
+ *s_output_head, *s_output;
+
+ uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+ & q->sw_ring_wrap_mask);
+ desc = q->ring_addr + desc_idx;
+ acc100_fcw_td_fill(op, &desc->req.fcw_td);
+
+ input = op->turbo_dec.input.data;
+ h_output_head = h_output = op->turbo_dec.hard_output.data;
+ s_output_head = s_output = op->turbo_dec.soft_output.data;
+ in_offset = op->turbo_dec.input.offset;
+ h_out_offset = op->turbo_dec.hard_output.offset;
+ s_out_offset = op->turbo_dec.soft_output.offset;
+ h_out_length = s_out_length = 0;
+ mbuf_total_left = op->turbo_dec.input.length;
+ seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+ if (unlikely(input == NULL)) {
+ rte_bbdev_log(ERR, "Invalid mbuf pointer");
+ return -EFAULT;
+ }
+#endif
+
+ /* Set up DMA descriptor */
+ desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+ & q->sw_ring_wrap_mask);
+
+ ret = acc100_dma_desc_td_fill(op, &desc->req, &input, h_output,
+ s_output, &in_offset, &h_out_offset, &s_out_offset,
+ &h_out_length, &s_out_length, &mbuf_total_left,
+ &seg_total_left, 0);
+
+ if (unlikely(ret < 0))
+ return ret;
+
+ /* Hard output */
+ mbuf_append(h_output_head, h_output, h_out_length);
+
+ /* Soft output */
+ if (check_bit(op->turbo_dec.op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT))
+ mbuf_append(s_output_head, s_output, s_out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+ rte_memdump(stderr, "FCW", &desc->req.fcw_td,
+ sizeof(desc->req.fcw_td) - 8);
+ rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+
+ /* Check if any CBs left for processing */
+ if (mbuf_total_left != 0) {
+ rte_bbdev_log(ERR,
+ "Some date still left after processing one CB: mbuf_total_left = %u",
+ mbuf_total_left);
+ return -EINVAL;
+ }
+#endif
+
+ /* One CB (one op) was successfully prepared to enqueue */
+ return 1;
+}
+
+static inline int
+harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+ uint16_t total_enqueued_cbs) {
+ struct acc100_fcw_ld *fcw;
+ union acc100_dma_desc *desc;
+ int next_triplet = 1;
+ struct rte_mbuf *hq_output_head, *hq_output;
+ uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
+ if (harq_in_length == 0) {
+ rte_bbdev_log(ERR, "Loopback of invalid null size\n");
+ return -EINVAL;
+ }
+
+ int h_comp = check_bit(op->ldpc_dec.op_flags,
+ RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
+ ) ? 1 : 0;
+ if (h_comp == 1)
+ harq_in_length = harq_in_length * 8 / 6;
+ harq_in_length = RTE_ALIGN(harq_in_length, 64);
+ uint16_t harq_dma_length_in = (h_comp == 0) ?
+ harq_in_length :
+ harq_in_length * 6 / 8;
+ uint16_t harq_dma_length_out = harq_dma_length_in;
+ bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
+ RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
+ union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+ uint16_t harq_index = (ddr_mem_in ?
+ op->ldpc_dec.harq_combined_input.offset :
+ op->ldpc_dec.harq_combined_output.offset)
+ / ACC100_HARQ_OFFSET;
+
+ uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+ & q->sw_ring_wrap_mask);
+ desc = q->ring_addr + desc_idx;
+ fcw = &desc->req.fcw_ld;
+ /* Set the FCW from loopback into DDR */
+ memset(fcw, 0, sizeof(struct acc100_fcw_ld));
+ fcw->FCWversion = ACC100_FCW_VER;
+ fcw->qm = 2;
+ fcw->Zc = 384;
+ if (harq_in_length < 16 * N_ZC_1)
+ fcw->Zc = 16;
+ fcw->ncb = fcw->Zc * N_ZC_1;
+ fcw->rm_e = 2;
+ fcw->hcin_en = 1;
+ fcw->hcout_en = 1;
+
+ rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d\n",
+ ddr_mem_in, harq_index,
+ harq_layout[harq_index].offset, harq_in_length,
+ harq_dma_length_in);
+
+ if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
+ fcw->hcin_size0 = harq_layout[harq_index].size0;
+ fcw->hcin_offset = harq_layout[harq_index].offset;
+ fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
+ harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
+ if (h_comp == 1)
+ harq_dma_length_in = harq_dma_length_in * 6 / 8;
+ } else {
+ fcw->hcin_size0 = harq_in_length;
+ }
+ harq_layout[harq_index].val = 0;
+ rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
+ fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
+ fcw->hcout_size0 = harq_in_length;
+ fcw->hcin_decomp_mode = h_comp;
+ fcw->hcout_comp_mode = h_comp;
+ fcw->gain_i = 1;
+ fcw->gain_h = 1;
+
+ /* Set the prefix of descriptor. This could be done at polling */
desc->req.word0 = ACC100_DMA_DESC_TYPE;
desc->req.word1 = 0; /**< Timestamp could be disabled */
desc->req.word2 = 0;
@@ -1816,6 +2320,107 @@
return current_enqueued_cbs;
}
+/* Enqueue one decode operation for ACC100 device in TB mode */
+static inline int
+enqueue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+ uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
+{
+ union acc100_dma_desc *desc = NULL;
+ int ret;
+ uint8_t r, c;
+ uint32_t in_offset, h_out_offset, s_out_offset, s_out_length,
+ h_out_length, mbuf_total_left, seg_total_left;
+ struct rte_mbuf *input, *h_output_head, *h_output,
+ *s_output_head, *s_output;
+ uint16_t current_enqueued_cbs = 0;
+
+ uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+ & q->sw_ring_wrap_mask);
+ desc = q->ring_addr + desc_idx;
+ uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+ acc100_fcw_td_fill(op, &desc->req.fcw_td);
+
+ input = op->turbo_dec.input.data;
+ h_output_head = h_output = op->turbo_dec.hard_output.data;
+ s_output_head = s_output = op->turbo_dec.soft_output.data;
+ in_offset = op->turbo_dec.input.offset;
+ h_out_offset = op->turbo_dec.hard_output.offset;
+ s_out_offset = op->turbo_dec.soft_output.offset;
+ h_out_length = s_out_length = 0;
+ mbuf_total_left = op->turbo_dec.input.length;
+ c = op->turbo_dec.tb_params.c;
+ r = op->turbo_dec.tb_params.r;
+
+ while (mbuf_total_left > 0 && r < c) {
+
+ seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+
+ /* Set up DMA descriptor */
+ desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+ & q->sw_ring_wrap_mask);
+ desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+ desc->req.data_ptrs[0].blen = ACC100_FCW_TD_BLEN;
+ ret = acc100_dma_desc_td_fill(op, &desc->req, &input,
+ h_output, s_output, &in_offset, &h_out_offset,
+ &s_out_offset, &h_out_length, &s_out_length,
+ &mbuf_total_left, &seg_total_left, r);
+
+ if (unlikely(ret < 0))
+ return ret;
+
+ /* Hard output */
+ mbuf_append(h_output_head, h_output, h_out_length);
+
+ /* Soft output */
+ if (check_bit(op->turbo_dec.op_flags,
+ RTE_BBDEV_TURBO_SOFT_OUTPUT))
+ mbuf_append(s_output_head, s_output, s_out_length);
+
+ /* Set total number of CBs in TB */
+ desc->req.cbs_in_tb = cbs_in_tb;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+ rte_memdump(stderr, "FCW", &desc->req.fcw_td,
+ sizeof(desc->req.fcw_td) - 8);
+ rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+ if (seg_total_left == 0) {
+ /* Go to the next mbuf */
+ input = input->next;
+ in_offset = 0;
+ h_output = h_output->next;
+ h_out_offset = 0;
+
+ if (check_bit(op->turbo_dec.op_flags,
+ RTE_BBDEV_TURBO_SOFT_OUTPUT)) {
+ s_output = s_output->next;
+ s_out_offset = 0;
+ }
+ }
+
+ total_enqueued_cbs++;
+ current_enqueued_cbs++;
+ r++;
+ }
+
+ if (unlikely(desc == NULL))
+ return current_enqueued_cbs;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+ /* Check if any CBs left for processing */
+ if (mbuf_total_left != 0) {
+ rte_bbdev_log(ERR,
+ "Some date still left for processing: mbuf_total_left = %u",
+ mbuf_total_left);
+ return -EINVAL;
+ }
+#endif
+ /* Set SDone on last CB descriptor for TB mode */
+ desc->req.sdone_enable = 1;
+ desc->req.irq_enable = q->irq_enable;
+
+ return current_enqueued_cbs;
+}
/* Calculates number of CBs in processed encoder TB based on 'r' and input
* length.
@@ -1893,6 +2498,45 @@
return cbs_in_tb;
}
+/* Enqueue encode operations for ACC100 device in CB mode. */
+static uint16_t
+acc100_enqueue_enc_cb(struct rte_bbdev_queue_data *q_data,
+ struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+ struct acc100_queue *q = q_data->queue_private;
+ int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+ uint16_t i;
+ union acc100_dma_desc *desc;
+ int ret;
+
+ for (i = 0; i < num; ++i) {
+ /* Check if there is available space for further processing */
+ if (unlikely(avail - 1 < 0))
+ break;
+ avail -= 1;
+
+ ret = enqueue_enc_one_op_cb(q, ops[i], i);
+ if (ret < 0)
+ break;
+ }
+
+ if (unlikely(i == 0))
+ return 0; /* Nothing to enqueue */
+
+ /* Set SDone in last CB in enqueued ops for CB mode */
+ desc = q->ring_addr + ((q->sw_ring_head + i - 1)
+ & q->sw_ring_wrap_mask);
+ desc->req.sdone_enable = 1;
+ desc->req.irq_enable = q->irq_enable;
+
+ acc100_dma_enqueue(q, i, &q_data->queue_stats);
+
+ /* Update stats */
+ q_data->queue_stats.enqueued_count += i;
+ q_data->queue_stats.enqueue_err_count += num - i;
+ return i;
+}
+
/* Check we can mux encode operations with common FCW */
static inline bool
check_mux(struct rte_bbdev_enc_op **ops, uint16_t num) {
@@ -1960,6 +2604,52 @@
return i;
}
+/* Enqueue encode operations for ACC100 device in TB mode. */
+static uint16_t
+acc100_enqueue_enc_tb(struct rte_bbdev_queue_data *q_data,
+ struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+ struct acc100_queue *q = q_data->queue_private;
+ int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+ uint16_t i, enqueued_cbs = 0;
+ uint8_t cbs_in_tb;
+ int ret;
+
+ for (i = 0; i < num; ++i) {
+ cbs_in_tb = get_num_cbs_in_tb_enc(&ops[i]->turbo_enc);
+ /* Check if there is available space for further processing */
+ if (unlikely(avail - cbs_in_tb < 0))
+ break;
+ avail -= cbs_in_tb;
+
+ ret = enqueue_enc_one_op_tb(q, ops[i], enqueued_cbs, cbs_in_tb);
+ if (ret < 0)
+ break;
+ enqueued_cbs += ret;
+ }
+
+ acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
+
+ /* Update stats */
+ q_data->queue_stats.enqueued_count += i;
+ q_data->queue_stats.enqueue_err_count += num - i;
+
+ return i;
+}
+
+/* Enqueue encode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_enc(struct rte_bbdev_queue_data *q_data,
+ struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+ if (unlikely(num == 0))
+ return 0;
+ if (ops[0]->turbo_enc.code_block_mode == 0)
+ return acc100_enqueue_enc_tb(q_data, ops, num);
+ else
+ return acc100_enqueue_enc_cb(q_data, ops, num);
+}
+
/* Enqueue encode operations for ACC100 device. */
static uint16_t
acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
@@ -1967,7 +2657,51 @@
{
if (unlikely(num == 0))
return 0;
- return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
+ if (ops[0]->ldpc_enc.code_block_mode == 0)
+ return acc100_enqueue_enc_tb(q_data, ops, num);
+ else
+ return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
+}
+
+
+/* Enqueue decode operations for ACC100 device in CB mode */
+static uint16_t
+acc100_enqueue_dec_cb(struct rte_bbdev_queue_data *q_data,
+ struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+ struct acc100_queue *q = q_data->queue_private;
+ int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+ uint16_t i;
+ union acc100_dma_desc *desc;
+ int ret;
+
+ for (i = 0; i < num; ++i) {
+ /* Check if there is available space for further processing */
+ if (unlikely(avail - 1 < 0))
+ break;
+ avail -= 1;
+
+ ret = enqueue_dec_one_op_cb(q, ops[i], i);
+ if (ret < 0)
+ break;
+ }
+
+ if (unlikely(i == 0))
+ return 0; /* Nothing to enqueue */
+
+ /* Set SDone in last CB in enqueued ops for CB mode */
+ desc = q->ring_addr + ((q->sw_ring_head + i - 1)
+ & q->sw_ring_wrap_mask);
+ desc->req.sdone_enable = 1;
+ desc->req.irq_enable = q->irq_enable;
+
+ acc100_dma_enqueue(q, i, &q_data->queue_stats);
+
+ /* Update stats */
+ q_data->queue_stats.enqueued_count += i;
+ q_data->queue_stats.enqueue_err_count += num - i;
+
+ return i;
}
/* Check we can mux encode operations with common FCW */
@@ -2065,6 +2799,53 @@
return i;
}
+
+/* Enqueue decode operations for ACC100 device in TB mode */
+static uint16_t
+acc100_enqueue_dec_tb(struct rte_bbdev_queue_data *q_data,
+ struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+ struct acc100_queue *q = q_data->queue_private;
+ int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+ uint16_t i, enqueued_cbs = 0;
+ uint8_t cbs_in_tb;
+ int ret;
+
+ for (i = 0; i < num; ++i) {
+ cbs_in_tb = get_num_cbs_in_tb_dec(&ops[i]->turbo_dec);
+ /* Check if there is available space for further processing */
+ if (unlikely(avail - cbs_in_tb < 0))
+ break;
+ avail -= cbs_in_tb;
+
+ ret = enqueue_dec_one_op_tb(q, ops[i], enqueued_cbs, cbs_in_tb);
+ if (ret < 0)
+ break;
+ enqueued_cbs += ret;
+ }
+
+ acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
+
+ /* Update stats */
+ q_data->queue_stats.enqueued_count += i;
+ q_data->queue_stats.enqueue_err_count += num - i;
+
+ return i;
+}
+
+/* Enqueue decode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_dec(struct rte_bbdev_queue_data *q_data,
+ struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+ if (unlikely(num == 0))
+ return 0;
+ if (ops[0]->turbo_dec.code_block_mode == 0)
+ return acc100_enqueue_dec_tb(q_data, ops, num);
+ else
+ return acc100_enqueue_dec_cb(q_data, ops, num);
+}
+
/* Enqueue decode operations for ACC100 device. */
static uint16_t
acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
@@ -2388,6 +3169,51 @@
return cb_idx;
}
+/* Dequeue encode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_enc(struct rte_bbdev_queue_data *q_data,
+ struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+ struct acc100_queue *q = q_data->queue_private;
+ uint16_t dequeue_num;
+ uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+ uint32_t aq_dequeued = 0;
+ uint16_t i;
+ uint16_t dequeued_cbs = 0;
+ struct rte_bbdev_enc_op *op;
+ int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+ if (unlikely(ops == 0 && q == NULL))
+ return 0;
+#endif
+
+ dequeue_num = (avail < num) ? avail : num;
+
+ for (i = 0; i < dequeue_num; ++i) {
+ op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+ & q->sw_ring_wrap_mask))->req.op_addr;
+ if (op->turbo_enc.code_block_mode == 0)
+ ret = dequeue_enc_one_op_tb(q, &ops[i], dequeued_cbs,
+ &aq_dequeued);
+ else
+ ret = dequeue_enc_one_op_cb(q, &ops[i], dequeued_cbs,
+ &aq_dequeued);
+
+ if (ret < 0)
+ break;
+ dequeued_cbs += ret;
+ }
+
+ q->aq_dequeued += aq_dequeued;
+ q->sw_ring_tail += dequeued_cbs;
+
+ /* Update enqueue stats */
+ q_data->queue_stats.dequeued_count += i;
+
+ return i;
+}
+
/* Dequeue LDPC encode operations from ACC100 device. */
static uint16_t
acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
@@ -2426,6 +3252,52 @@
return dequeued_cbs;
}
+
+/* Dequeue decode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_dec(struct rte_bbdev_queue_data *q_data,
+ struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+ struct acc100_queue *q = q_data->queue_private;
+ uint16_t dequeue_num;
+ uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+ uint32_t aq_dequeued = 0;
+ uint16_t i;
+ uint16_t dequeued_cbs = 0;
+ struct rte_bbdev_dec_op *op;
+ int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+ if (unlikely(ops == 0 && q == NULL))
+ return 0;
+#endif
+
+ dequeue_num = (avail < num) ? avail : num;
+
+ for (i = 0; i < dequeue_num; ++i) {
+ op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+ & q->sw_ring_wrap_mask))->req.op_addr;
+ if (op->turbo_dec.code_block_mode == 0)
+ ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
+ &aq_dequeued);
+ else
+ ret = dequeue_dec_one_op_cb(q_data, q, &ops[i],
+ dequeued_cbs, &aq_dequeued);
+
+ if (ret < 0)
+ break;
+ dequeued_cbs += ret;
+ }
+
+ q->aq_dequeued += aq_dequeued;
+ q->sw_ring_tail += dequeued_cbs;
+
+ /* Update enqueue stats */
+ q_data->queue_stats.dequeued_count += i;
+
+ return i;
+}
+
/* Dequeue decode operations from ACC100 device. */
static uint16_t
acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
@@ -2479,6 +3351,10 @@
struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
dev->dev_ops = &acc100_bbdev_ops;
+ dev->enqueue_enc_ops = acc100_enqueue_enc;
+ dev->enqueue_dec_ops = acc100_enqueue_dec;
+ dev->dequeue_enc_ops = acc100_dequeue_enc;
+ dev->dequeue_dec_ops = acc100_dequeue_dec;
dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
--
1.8.3.1
^ permalink raw reply [flat|nested] 213+ messages in thread
* [dpdk-dev] [PATCH v3 08/11] baseband/acc100: add interrupt support to PMD
2020-08-19 0:25 [dpdk-dev] [PATCH v3 00/11] bbdev PMD ACC100 Nicolas Chautru
` (6 preceding siblings ...)
2020-08-19 0:25 ` [dpdk-dev] [PATCH v3 07/11] baseband/acc100: add support for 4G processing Nicolas Chautru
@ 2020-08-19 0:25 ` Nicolas Chautru
2020-08-19 0:25 ` [dpdk-dev] [PATCH v3 09/11] baseband/acc100: add debug function to validate input Nicolas Chautru
` (2 subsequent siblings)
10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-08-19 0:25 UTC (permalink / raw)
To: dev, akhil.goyal; +Cc: bruce.richardson, Nicolas Chautru
Adding capability and functions to support MSI
interrupts, callbacks and the Info Ring.
Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
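Usage sketch for reviewers (illustrative only, not part of the diff below):
the callback body and helper name are hypothetical; the registration calls
are the existing bbdev API (rte_bbdev_callback_register(),
rte_bbdev_queue_intr_enable()) that this patch hooks into. The cast of
ret_param assumes the PMD passes a structure whose first field is the
queue_id, as introduced by this patch.

#include <stdio.h>
#include <rte_bbdev.h>
#include <rte_common.h>

/* Hypothetical callback: invoked from the PMD interrupt handler with
 * RTE_BBDEV_EVENT_DEQUEUE; ret_param carries the id of the ready queue.
 */
static void
deq_event_cb(uint16_t dev_id, enum rte_bbdev_event_type event,
		void *cb_arg, void *ret_param)
{
	RTE_SET_USED(cb_arg);
	if (event == RTE_BBDEV_EVENT_DEQUEUE && ret_param != NULL)
		printf("Dev %u: dequeue ready on queue %u\n", dev_id,
				*(uint16_t *)ret_param);
}

/* Hypothetical helper: register the callback and arm one queue's IRQ */
static int
enable_deq_irq(uint16_t dev_id, uint16_t queue_id)
{
	int ret = rte_bbdev_callback_register(dev_id,
			RTE_BBDEV_EVENT_DEQUEUE, deq_event_cb, NULL);
	if (ret < 0)
		return ret;
	return rte_bbdev_queue_intr_enable(dev_id, queue_id);
}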
drivers/baseband/acc100/rte_acc100_pmd.c | 288 ++++++++++++++++++++++++++++++-
drivers/baseband/acc100/rte_acc100_pmd.h | 15 ++
2 files changed, 300 insertions(+), 3 deletions(-)
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 1de7531..ba8e1d8 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -339,6 +339,213 @@
free_base_addresses(base_addrs, i);
}
+/*
+ * Find queue_id of a device queue based on details from the Info Ring.
+ * If a queue isn't found UINT16_MAX is returned.
+ */
+static inline uint16_t
+get_queue_id_from_ring_info(struct rte_bbdev_data *data,
+ const union acc100_info_ring_data ring_data)
+{
+ uint16_t queue_id;
+
+ for (queue_id = 0; queue_id < data->num_queues; ++queue_id) {
+ struct acc100_queue *acc100_q =
+ data->queues[queue_id].queue_private;
+ if (acc100_q != NULL && acc100_q->aq_id == ring_data.aq_id &&
+ acc100_q->qgrp_id == ring_data.qg_id &&
+ acc100_q->vf_id == ring_data.vf_id)
+ return queue_id;
+ }
+
+ return UINT16_MAX;
+}
+
+/* Checks PF Info Ring to find the interrupt cause and handles it accordingly */
+static inline void
+acc100_check_ir(struct acc100_device *acc100_dev)
+{
+ volatile union acc100_info_ring_data *ring_data;
+ uint16_t info_ring_head = acc100_dev->info_ring_head;
+ if (acc100_dev->info_ring == NULL)
+ return;
+
+ ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
+ ACC100_INFO_RING_MASK);
+
+ while (ring_data->valid) {
+ if ((ring_data->int_nb < ACC100_PF_INT_DMA_DL_DESC_IRQ) || (
+ ring_data->int_nb >
+ ACC100_PF_INT_DMA_DL5G_DESC_IRQ))
+ rte_bbdev_log(WARNING, "InfoRing: ITR:%d Info:0x%x",
+ ring_data->int_nb, ring_data->detailed_info);
+ /* Initialize Info Ring entry and move forward */
+ ring_data->val = 0;
+ info_ring_head++;
+ ring_data = acc100_dev->info_ring +
+ (info_ring_head & ACC100_INFO_RING_MASK);
+ }
+}
+
+/* Checks PF Info Ring to find the interrupt cause and handles it accordingly */
+static inline void
+acc100_pf_interrupt_handler(struct rte_bbdev *dev)
+{
+ struct acc100_device *acc100_dev = dev->data->dev_private;
+ volatile union acc100_info_ring_data *ring_data;
+ struct acc100_deq_intr_details deq_intr_det;
+
+ ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
+ ACC100_INFO_RING_MASK);
+
+ while (ring_data->valid) {
+
+ rte_bbdev_log_debug(
+ "ACC100 PF Interrupt received, Info Ring data: 0x%x",
+ ring_data->val);
+
+ switch (ring_data->int_nb) {
+ case ACC100_PF_INT_DMA_DL_DESC_IRQ:
+ case ACC100_PF_INT_DMA_UL_DESC_IRQ:
+ case ACC100_PF_INT_DMA_UL5G_DESC_IRQ:
+ case ACC100_PF_INT_DMA_DL5G_DESC_IRQ:
+ deq_intr_det.queue_id = get_queue_id_from_ring_info(
+ dev->data, *ring_data);
+ if (deq_intr_det.queue_id == UINT16_MAX) {
+ rte_bbdev_log(ERR,
+ "Couldn't find queue: aq_id: %u, qg_id: %u, vf_id: %u",
+ ring_data->aq_id,
+ ring_data->qg_id,
+ ring_data->vf_id);
+ return;
+ }
+ rte_bbdev_pmd_callback_process(dev,
+ RTE_BBDEV_EVENT_DEQUEUE, &deq_intr_det);
+ break;
+ default:
+ rte_bbdev_pmd_callback_process(dev,
+ RTE_BBDEV_EVENT_ERROR, NULL);
+ break;
+ }
+
+ /* Initialize Info Ring entry and move forward */
+ ring_data->val = 0;
+ ++acc100_dev->info_ring_head;
+ ring_data = acc100_dev->info_ring +
+ (acc100_dev->info_ring_head &
+ ACC100_INFO_RING_MASK);
+ }
+}
+
+/* Checks VF Info Ring to find the interrupt cause and handles it accordingly */
+static inline void
+acc100_vf_interrupt_handler(struct rte_bbdev *dev)
+{
+ struct acc100_device *acc100_dev = dev->data->dev_private;
+ volatile union acc100_info_ring_data *ring_data;
+ struct acc100_deq_intr_details deq_intr_det;
+
+ ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
+ ACC100_INFO_RING_MASK);
+
+ while (ring_data->valid) {
+
+ rte_bbdev_log_debug(
+ "ACC100 VF Interrupt received, Info Ring data: 0x%x",
+ ring_data->val);
+
+ switch (ring_data->int_nb) {
+ case ACC100_VF_INT_DMA_DL_DESC_IRQ:
+ case ACC100_VF_INT_DMA_UL_DESC_IRQ:
+ case ACC100_VF_INT_DMA_UL5G_DESC_IRQ:
+ case ACC100_VF_INT_DMA_DL5G_DESC_IRQ:
+ /* VFs are not aware of their vf_id - it's set to 0 in
+ * queue structures.
+ */
+ ring_data->vf_id = 0;
+ deq_intr_det.queue_id = get_queue_id_from_ring_info(
+ dev->data, *ring_data);
+ if (deq_intr_det.queue_id == UINT16_MAX) {
+ rte_bbdev_log(ERR,
+ "Couldn't find queue: aq_id: %u, qg_id: %u",
+ ring_data->aq_id,
+ ring_data->qg_id);
+ return;
+ }
+ rte_bbdev_pmd_callback_process(dev,
+ RTE_BBDEV_EVENT_DEQUEUE, &deq_intr_det);
+ break;
+ default:
+ rte_bbdev_pmd_callback_process(dev,
+ RTE_BBDEV_EVENT_ERROR, NULL);
+ break;
+ }
+
+ /* Initialize Info Ring entry and move forward */
+ ring_data->valid = 0;
+ ++acc100_dev->info_ring_head;
+ ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head
+ & ACC100_INFO_RING_MASK);
+ }
+}
+
+/* Interrupt handler triggered by ACC100 dev for handling specific interrupt */
+static void
+acc100_dev_interrupt_handler(void *cb_arg)
+{
+ struct rte_bbdev *dev = cb_arg;
+ struct acc100_device *acc100_dev = dev->data->dev_private;
+
+ /* Read info ring */
+ if (acc100_dev->pf_device)
+ acc100_pf_interrupt_handler(dev);
+ else
+ acc100_vf_interrupt_handler(dev);
+}
+
+/* Allocate and set up the Info Ring */
+static int
+allocate_inforing(struct rte_bbdev *dev)
+{
+ struct acc100_device *d = dev->data->dev_private;
+ const struct acc100_registry_addr *reg_addr;
+ rte_iova_t info_ring_phys;
+ uint32_t phys_low, phys_high;
+
+ if (d->info_ring != NULL)
+ return 0; /* Already configured */
+
+ /* Choose correct registry addresses for the device type */
+ if (d->pf_device)
+ reg_addr = &pf_reg_addr;
+ else
+ reg_addr = &vf_reg_addr;
+ /* Allocate InfoRing */
+ d->info_ring = rte_zmalloc_socket("Info Ring",
+ ACC100_INFO_RING_NUM_ENTRIES *
+ sizeof(*d->info_ring), RTE_CACHE_LINE_SIZE,
+ dev->data->socket_id);
+ if (d->info_ring == NULL) {
+ rte_bbdev_log(ERR,
+ "Failed to allocate Info Ring for %s:%u",
+ dev->device->driver->name,
+ dev->data->dev_id);
+ return -ENOMEM;
+ }
+ info_ring_phys = rte_malloc_virt2iova(d->info_ring);
+
+ /* Setup Info Ring */
+ phys_high = (uint32_t)(info_ring_phys >> 32);
+ phys_low = (uint32_t)(info_ring_phys);
+ acc100_reg_write(d, reg_addr->info_ring_hi, phys_high);
+ acc100_reg_write(d, reg_addr->info_ring_lo, phys_low);
+ acc100_reg_write(d, reg_addr->info_ring_en, ACC100_REG_IRQ_EN_ALL);
+ d->info_ring_head = (acc100_reg_read(d, reg_addr->info_ring_ptr) &
+ 0xFFF) / sizeof(union acc100_info_ring_data);
+ return 0;
+}
+
+
/* Allocate 64MB memory used for all software rings */
static int
acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
@@ -426,6 +633,7 @@
acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
+ allocate_inforing(dev);
d->harq_layout = rte_zmalloc_socket("HARQ Layout",
ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
RTE_CACHE_LINE_SIZE, dev->data->socket_id);
@@ -437,13 +645,53 @@
return 0;
}
+static int
+acc100_intr_enable(struct rte_bbdev *dev)
+{
+ int ret;
+ struct acc100_device *d = dev->data->dev_private;
+
+ /* Only MSI are currently supported */
+ if (dev->intr_handle->type == RTE_INTR_HANDLE_VFIO_MSI ||
+ dev->intr_handle->type == RTE_INTR_HANDLE_UIO) {
+
+ allocate_inforing(dev);
+
+ ret = rte_intr_enable(dev->intr_handle);
+ if (ret < 0) {
+ rte_bbdev_log(ERR,
+ "Couldn't enable interrupts for device: %s",
+ dev->data->name);
+ rte_free(d->info_ring);
+ return ret;
+ }
+ ret = rte_intr_callback_register(dev->intr_handle,
+ acc100_dev_interrupt_handler, dev);
+ if (ret < 0) {
+ rte_bbdev_log(ERR,
+ "Couldn't register interrupt callback for device: %s",
+ dev->data->name);
+ rte_free(d->info_ring);
+ return ret;
+ }
+
+ return 0;
+ }
+
+ rte_bbdev_log(ERR, "ACC100 (%s) supports only VFIO MSI interrupts",
+ dev->data->name);
+ return -ENOTSUP;
+}
+
/* Free 64MB memory used for software rings */
static int
acc100_dev_close(struct rte_bbdev *dev)
{
struct acc100_device *d = dev->data->dev_private;
+ acc100_check_ir(d);
if (d->sw_rings_base != NULL) {
rte_free(d->tail_ptrs);
+ rte_free(d->info_ring);
rte_free(d->sw_rings_base);
d->sw_rings_base = NULL;
}
@@ -643,6 +891,7 @@
RTE_BBDEV_TURBO_CRC_TYPE_24B |
RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
RTE_BBDEV_TURBO_EARLY_TERMINATION |
+ RTE_BBDEV_TURBO_DEC_INTERRUPTS |
RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
RTE_BBDEV_TURBO_MAP_DEC |
RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP |
@@ -663,6 +912,7 @@
RTE_BBDEV_TURBO_CRC_24B_ATTACH |
RTE_BBDEV_TURBO_RV_INDEX_BYPASS |
RTE_BBDEV_TURBO_RATE_MATCH |
+ RTE_BBDEV_TURBO_ENC_INTERRUPTS |
RTE_BBDEV_TURBO_ENC_SCATTER_GATHER,
.num_buffers_src =
RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
@@ -676,7 +926,8 @@
.capability_flags =
RTE_BBDEV_LDPC_RATE_MATCH |
RTE_BBDEV_LDPC_CRC_24B_ATTACH |
- RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
+ RTE_BBDEV_LDPC_INTERLEAVER_BYPASS |
+ RTE_BBDEV_LDPC_ENC_INTERRUPTS,
.num_buffers_src =
RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
.num_buffers_dst =
@@ -701,7 +952,8 @@
RTE_BBDEV_LDPC_DECODE_BYPASS |
RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
- RTE_BBDEV_LDPC_LLR_COMPRESSION,
+ RTE_BBDEV_LDPC_LLR_COMPRESSION |
+ RTE_BBDEV_LDPC_DEC_INTERRUPTS,
.llr_size = 8,
.llr_decimals = 1,
.num_buffers_src =
@@ -751,14 +1003,39 @@
#else
dev_info->harq_buffer_size = 0;
#endif
+ acc100_check_ir(d);
+}
+
+static int
+acc100_queue_intr_enable(struct rte_bbdev *dev, uint16_t queue_id)
+{
+ struct acc100_queue *q = dev->data->queues[queue_id].queue_private;
+
+ if (dev->intr_handle->type != RTE_INTR_HANDLE_VFIO_MSI &&
+ dev->intr_handle->type != RTE_INTR_HANDLE_UIO)
+ return -ENOTSUP;
+
+ q->irq_enable = 1;
+ return 0;
+}
+
+static int
+acc100_queue_intr_disable(struct rte_bbdev *dev, uint16_t queue_id)
+{
+ struct acc100_queue *q = dev->data->queues[queue_id].queue_private;
+ q->irq_enable = 0;
+ return 0;
}
static const struct rte_bbdev_ops acc100_bbdev_ops = {
.setup_queues = acc100_setup_queues,
+ .intr_enable = acc100_intr_enable,
.close = acc100_dev_close,
.info_get = acc100_dev_info_get,
.queue_setup = acc100_queue_setup,
.queue_release = acc100_queue_release,
+ .queue_intr_enable = acc100_queue_intr_enable,
+ .queue_intr_disable = acc100_queue_intr_disable
};
/* ACC100 PCI PF address map */
@@ -3018,8 +3295,10 @@
? (1 << RTE_BBDEV_DATA_ERROR) : 0);
op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
- if (op->status != 0)
+ if (op->status != 0) {
q_data->queue_stats.dequeue_err_count++;
+ acc100_check_ir(q->d);
+ }
/* CRC invalid if error exists */
if (!op->status)
@@ -3076,6 +3355,9 @@
op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
+ if (op->status & (1 << RTE_BBDEV_DRV_ERROR))
+ acc100_check_ir(q->d);
+
/* Check if this is the last desc in batch (Atomic Queue) */
if (desc->req.last_desc_in_batch) {
(*aq_dequeued)++;
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 78686c1..8980fa5 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -559,7 +559,14 @@ struct acc100_device {
/* Virtual address of the info memory routed to this function under
* operation, whether it is PF or VF.
*/
+ union acc100_info_ring_data *info_ring;
+
union acc100_harq_layout_data *harq_layout;
+ /* Virtual Info Ring head */
+ uint16_t info_ring_head;
+ /* Number of bytes available for each queue in device, depending on
+ * how many queues are enabled with configure()
+ */
uint32_t sw_ring_size;
uint32_t ddr_size; /* Size in kB */
uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */
@@ -575,4 +582,12 @@ struct acc100_device {
bool configured; /**< True if this ACC100 device is configured */
};
+/**
+ * Structure with details about RTE_BBDEV_EVENT_DEQUEUE event. It's passed to
+ * the callback function.
+ */
+struct acc100_deq_intr_details {
+ uint16_t queue_id;
+};
+
#endif /* _RTE_ACC100_PMD_H_ */
--
1.8.3.1
^ permalink raw reply [flat|nested] 213+ messages in thread
* [dpdk-dev] [PATCH v3 09/11] baseband/acc100: add debug function to validate input
2020-08-19 0:25 [dpdk-dev] [PATCH v3 00/11] bbdev PMD ACC100 Nicolas Chautru
` (7 preceding siblings ...)
2020-08-19 0:25 ` [dpdk-dev] [PATCH v3 08/11] baseband/acc100: add interrupt support to PMD Nicolas Chautru
@ 2020-08-19 0:25 ` Nicolas Chautru
2020-08-19 0:25 ` [dpdk-dev] [PATCH v3 10/11] baseband/acc100: add configure function Nicolas Chautru
2020-08-19 0:25 ` [dpdk-dev] [PATCH v3 11/11] doc: update bbdev feature table Nicolas Chautru
10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-08-19 0:25 UTC (permalink / raw)
To: dev, akhil.goyal; +Cc: bruce.richardson, Nicolas Chautru
Debug functions to validate the API input from the user
Only enabled in DEBUG mode at build time
Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
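Usage sketch for reviewers (illustrative only, not part of the diff below):
the helper name is hypothetical and the op is assumed to be otherwise fully
populated. It shows the behaviour this patch adds when the PMD is built with
RTE_LIBRTE_BBDEV_DEBUG defined: a malformed op is rejected at enqueue time
instead of reaching the hardware.

#include <rte_bbdev.h>
#include <rte_bbdev_op.h>

/* Hypothetical helper: submit an op that violates the new checks. */
static uint16_t
try_invalid_enqueue(uint16_t dev_id, uint16_t queue_id,
		struct rte_bbdev_enc_op *op)
{
	op->ldpc_enc.rv_index = 7; /* valid range is 0..3 */
	/* With validation compiled in, validate_ldpc_enc_op() fails,
	 * an error is logged and 0 ops are enqueued.
	 */
	return rte_bbdev_enqueue_ldpc_enc_ops(dev_id, queue_id, &op, 1);
}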
drivers/baseband/acc100/rte_acc100_pmd.c | 424 +++++++++++++++++++++++++++++++
1 file changed, 424 insertions(+)
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index ba8e1d8..dc14079 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -1945,6 +1945,231 @@
}
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+/* Validates turbo encoder parameters */
+static inline int
+validate_enc_op(struct rte_bbdev_enc_op *op)
+{
+ struct rte_bbdev_op_turbo_enc *turbo_enc = &op->turbo_enc;
+ struct rte_bbdev_op_enc_turbo_cb_params *cb = NULL;
+ struct rte_bbdev_op_enc_turbo_tb_params *tb = NULL;
+ uint16_t kw, kw_neg, kw_pos;
+
+ if (op->mempool == NULL) {
+ rte_bbdev_log(ERR, "Invalid mempool pointer");
+ return -1;
+ }
+ if (turbo_enc->input.data == NULL) {
+ rte_bbdev_log(ERR, "Invalid input pointer");
+ return -1;
+ }
+ if (turbo_enc->output.data == NULL) {
+ rte_bbdev_log(ERR, "Invalid output pointer");
+ return -1;
+ }
+ if (turbo_enc->rv_index > 3) {
+ rte_bbdev_log(ERR,
+ "rv_index (%u) is out of range 0 <= value <= 3",
+ turbo_enc->rv_index);
+ return -1;
+ }
+ if (turbo_enc->code_block_mode != 0 &&
+ turbo_enc->code_block_mode != 1) {
+ rte_bbdev_log(ERR,
+ "code_block_mode (%u) is out of range 0 <= value <= 1",
+ turbo_enc->code_block_mode);
+ return -1;
+ }
+
+ if (turbo_enc->code_block_mode == 0) {
+ tb = &turbo_enc->tb_params;
+ if ((tb->k_neg < RTE_BBDEV_TURBO_MIN_CB_SIZE
+ || tb->k_neg > RTE_BBDEV_TURBO_MAX_CB_SIZE)
+ && tb->c_neg > 0) {
+ rte_bbdev_log(ERR,
+ "k_neg (%u) is out of range %u <= value <= %u",
+ tb->k_neg, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+ RTE_BBDEV_TURBO_MAX_CB_SIZE);
+ return -1;
+ }
+ if (tb->k_pos < RTE_BBDEV_TURBO_MIN_CB_SIZE
+ || tb->k_pos > RTE_BBDEV_TURBO_MAX_CB_SIZE) {
+ rte_bbdev_log(ERR,
+ "k_pos (%u) is out of range %u <= value <= %u",
+ tb->k_pos, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+ RTE_BBDEV_TURBO_MAX_CB_SIZE);
+ return -1;
+ }
+ if (tb->c_neg > (RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1))
+ rte_bbdev_log(ERR,
+ "c_neg (%u) is out of range 0 <= value <= %u",
+ tb->c_neg,
+ RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1);
+ if (tb->c < 1 || tb->c > RTE_BBDEV_TURBO_MAX_CODE_BLOCKS) {
+ rte_bbdev_log(ERR,
+ "c (%u) is out of range 1 <= value <= %u",
+ tb->c, RTE_BBDEV_TURBO_MAX_CODE_BLOCKS);
+ return -1;
+ }
+ if (tb->cab > tb->c) {
+ rte_bbdev_log(ERR,
+ "cab (%u) is greater than c (%u)",
+ tb->cab, tb->c);
+ return -1;
+ }
+ if ((tb->ea < RTE_BBDEV_TURBO_MIN_CB_SIZE || (tb->ea % 2))
+ && tb->r < tb->cab) {
+ rte_bbdev_log(ERR,
+ "ea (%u) is less than %u or it is not even",
+ tb->ea, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+ return -1;
+ }
+ if ((tb->eb < RTE_BBDEV_TURBO_MIN_CB_SIZE || (tb->eb % 2))
+ && tb->c > tb->cab) {
+ rte_bbdev_log(ERR,
+ "eb (%u) is less than %u or it is not even",
+ tb->eb, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+ return -1;
+ }
+
+ kw_neg = 3 * RTE_ALIGN_CEIL(tb->k_neg + 4,
+ RTE_BBDEV_TURBO_C_SUBBLOCK);
+ if (tb->ncb_neg < tb->k_neg || tb->ncb_neg > kw_neg) {
+ rte_bbdev_log(ERR,
+ "ncb_neg (%u) is out of range (%u) k_neg <= value <= (%u) kw_neg",
+ tb->ncb_neg, tb->k_neg, kw_neg);
+ return -1;
+ }
+
+ kw_pos = 3 * RTE_ALIGN_CEIL(tb->k_pos + 4,
+ RTE_BBDEV_TURBO_C_SUBBLOCK);
+ if (tb->ncb_pos < tb->k_pos || tb->ncb_pos > kw_pos) {
+ rte_bbdev_log(ERR,
+ "ncb_pos (%u) is out of range (%u) k_pos <= value <= (%u) kw_pos",
+ tb->ncb_pos, tb->k_pos, kw_pos);
+ return -1;
+ }
+ if (tb->r > (tb->c - 1)) {
+ rte_bbdev_log(ERR,
+ "r (%u) is greater than c - 1 (%u)",
+ tb->r, tb->c - 1);
+ return -1;
+ }
+ } else {
+ cb = &turbo_enc->cb_params;
+ if (cb->k < RTE_BBDEV_TURBO_MIN_CB_SIZE
+ || cb->k > RTE_BBDEV_TURBO_MAX_CB_SIZE) {
+ rte_bbdev_log(ERR,
+ "k (%u) is out of range %u <= value <= %u",
+ cb->k, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+ RTE_BBDEV_TURBO_MAX_CB_SIZE);
+ return -1;
+ }
+
+ if (cb->e < RTE_BBDEV_TURBO_MIN_CB_SIZE || (cb->e % 2)) {
+ rte_bbdev_log(ERR,
+ "e (%u) is less than %u or it is not even",
+ cb->e, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+ return -1;
+ }
+
+ kw = RTE_ALIGN_CEIL(cb->k + 4, RTE_BBDEV_TURBO_C_SUBBLOCK) * 3;
+ if (cb->ncb < cb->k || cb->ncb > kw) {
+ rte_bbdev_log(ERR,
+ "ncb (%u) is out of range (%u) k <= value <= (%u) kw",
+ cb->ncb, cb->k, kw);
+ return -1;
+ }
+ }
+
+ return 0;
+}
+/* Validates LDPC encoder parameters */
+static inline int
+validate_ldpc_enc_op(struct rte_bbdev_enc_op *op)
+{
+ struct rte_bbdev_op_ldpc_enc *ldpc_enc = &op->ldpc_enc;
+
+ if (op->mempool == NULL) {
+ rte_bbdev_log(ERR, "Invalid mempool pointer");
+ return -1;
+ }
+ if (ldpc_enc->input.data == NULL) {
+ rte_bbdev_log(ERR, "Invalid input pointer");
+ return -1;
+ }
+ if (ldpc_enc->output.data == NULL) {
+ rte_bbdev_log(ERR, "Invalid output pointer");
+ return -1;
+ }
+ if (ldpc_enc->input.length >
+ RTE_BBDEV_LDPC_MAX_CB_SIZE >> 3) {
+ rte_bbdev_log(ERR, "CB size (%u) is too big, max: %d",
+ ldpc_enc->input.length,
+ RTE_BBDEV_LDPC_MAX_CB_SIZE);
+ return -1;
+ }
+ if ((ldpc_enc->basegraph > 2) || (ldpc_enc->basegraph == 0)) {
+ rte_bbdev_log(ERR,
+ "BG (%u) is out of range 1 <= value <= 2",
+ ldpc_enc->basegraph);
+ return -1;
+ }
+ if (ldpc_enc->rv_index > 3) {
+ rte_bbdev_log(ERR,
+ "rv_index (%u) is out of range 0 <= value <= 3",
+ ldpc_enc->rv_index);
+ return -1;
+ }
+ if (ldpc_enc->code_block_mode > 1) {
+ rte_bbdev_log(ERR,
+ "code_block_mode (%u) is out of range 0 <= value <= 1",
+ ldpc_enc->code_block_mode);
+ return -1;
+ }
+
+ return 0;
+}
+
+/* Validates LDPC decoder parameters */
+static inline int
+validate_ldpc_dec_op(struct rte_bbdev_dec_op *op)
+{
+ struct rte_bbdev_op_ldpc_dec *ldpc_dec = &op->ldpc_dec;
+
+ if (op->mempool == NULL) {
+ rte_bbdev_log(ERR, "Invalid mempool pointer");
+ return -1;
+ }
+ if ((ldpc_dec->basegraph > 2) || (ldpc_dec->basegraph == 0)) {
+ rte_bbdev_log(ERR,
+ "BG (%u) is out of range 1 <= value <= 2",
+ ldpc_dec->basegraph);
+ return -1;
+ }
+ if (ldpc_dec->iter_max == 0) {
+ rte_bbdev_log(ERR,
+ "iter_max (%u) is equal to 0",
+ ldpc_dec->iter_max);
+ return -1;
+ }
+ if (ldpc_dec->rv_index > 3) {
+ rte_bbdev_log(ERR,
+ "rv_index (%u) is out of range 0 <= value <= 3",
+ ldpc_dec->rv_index);
+ return -1;
+ }
+ if (ldpc_dec->code_block_mode > 1) {
+ rte_bbdev_log(ERR,
+ "code_block_mode (%u) is out of range 0 <= value <= 1",
+ ldpc_dec->code_block_mode);
+ return -1;
+ }
+
+ return 0;
+}
+#endif
+
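+/*
+ * For reference (illustrative values, not the only valid ones): an LDPC
+ * encode op with non-NULL mempool, input and output pointers, an input
+ * length within RTE_BBDEV_LDPC_MAX_CB_SIZE / 8 bytes, basegraph = 1,
+ * rv_index = 0 and code_block_mode = 1 passes every check in
+ * validate_ldpc_enc_op() above.
+ */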
/* Enqueue one encode operation for ACC100 device in CB mode */
static inline int
enqueue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
@@ -1956,6 +2181,14 @@
seg_total_left;
struct rte_mbuf *input, *output_head, *output;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+ /* Validate op structure */
+ if (validate_enc_op(op) == -1) {
+ rte_bbdev_log(ERR, "Turbo encoder validation failed");
+ return -EINVAL;
+ }
+#endif
+
uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
& q->sw_ring_wrap_mask);
desc = q->ring_addr + desc_idx;
@@ -2008,6 +2241,14 @@
uint16_t in_length_in_bytes;
struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+ /* Validate op structure */
+ if (validate_ldpc_enc_op(ops[0]) == -1) {
+ rte_bbdev_log(ERR, "LDPC encoder validation failed");
+ return -EINVAL;
+ }
+#endif
+
uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
& q->sw_ring_wrap_mask);
desc = q->ring_addr + desc_idx;
@@ -2065,6 +2306,14 @@
seg_total_left;
struct rte_mbuf *input, *output_head, *output;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+ /* Validate op structure */
+ if (validate_ldpc_enc_op(op) == -1) {
+ rte_bbdev_log(ERR, "LDPC encoder validation failed");
+ return -EINVAL;
+ }
+#endif
+
uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
& q->sw_ring_wrap_mask);
desc = q->ring_addr + desc_idx;
@@ -2119,6 +2368,14 @@
struct rte_mbuf *input, *output_head, *output;
uint16_t current_enqueued_cbs = 0;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+ /* Validate op structure */
+ if (validate_enc_op(op) == -1) {
+ rte_bbdev_log(ERR, "Turbo encoder validation failed");
+ return -EINVAL;
+ }
+#endif
+
uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
& q->sw_ring_wrap_mask);
desc = q->ring_addr + desc_idx;
@@ -2191,6 +2448,142 @@
return current_enqueued_cbs;
}
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+/* Validates turbo decoder parameters */
+static inline int
+validate_dec_op(struct rte_bbdev_dec_op *op)
+{
+ struct rte_bbdev_op_turbo_dec *turbo_dec = &op->turbo_dec;
+ struct rte_bbdev_op_dec_turbo_cb_params *cb = NULL;
+ struct rte_bbdev_op_dec_turbo_tb_params *tb = NULL;
+
+ if (op->mempool == NULL) {
+ rte_bbdev_log(ERR, "Invalid mempool pointer");
+ return -1;
+ }
+ if (turbo_dec->input.data == NULL) {
+ rte_bbdev_log(ERR, "Invalid input pointer");
+ return -1;
+ }
+ if (turbo_dec->hard_output.data == NULL) {
+ rte_bbdev_log(ERR, "Invalid hard_output pointer");
+ return -1;
+ }
+ if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT) &&
+ turbo_dec->soft_output.data == NULL) {
+ rte_bbdev_log(ERR, "Invalid soft_output pointer");
+ return -1;
+ }
+ if (turbo_dec->rv_index > 3) {
+ rte_bbdev_log(ERR,
+ "rv_index (%u) is out of range 0 <= value <= 3",
+ turbo_dec->rv_index);
+ return -1;
+ }
+ if (turbo_dec->iter_min < 1) {
+ rte_bbdev_log(ERR,
+ "iter_min (%u) is less than 1",
+ turbo_dec->iter_min);
+ return -1;
+ }
+ if (turbo_dec->iter_max <= 2) {
+ rte_bbdev_log(ERR,
+ "iter_max (%u) is less than or equal to 2",
+ turbo_dec->iter_max);
+ return -1;
+ }
+ if (turbo_dec->iter_min > turbo_dec->iter_max) {
+ rte_bbdev_log(ERR,
+ "iter_min (%u) is greater than iter_max (%u)",
+ turbo_dec->iter_min, turbo_dec->iter_max);
+ return -1;
+ }
+ if (turbo_dec->code_block_mode != 0 &&
+ turbo_dec->code_block_mode != 1) {
+ rte_bbdev_log(ERR,
+ "code_block_mode (%u) is out of range 0 <= value <= 1",
+ turbo_dec->code_block_mode);
+ return -1;
+ }
+
+ if (turbo_dec->code_block_mode == 0) {
+ tb = &turbo_dec->tb_params;
+ if ((tb->k_neg < RTE_BBDEV_TURBO_MIN_CB_SIZE
+ || tb->k_neg > RTE_BBDEV_TURBO_MAX_CB_SIZE)
+ && tb->c_neg > 0) {
+ rte_bbdev_log(ERR,
+ "k_neg (%u) is out of range %u <= value <= %u",
+ tb->k_neg, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+ RTE_BBDEV_TURBO_MAX_CB_SIZE);
+ return -1;
+ }
+ if ((tb->k_pos < RTE_BBDEV_TURBO_MIN_CB_SIZE
+ || tb->k_pos > RTE_BBDEV_TURBO_MAX_CB_SIZE)
+ && tb->c > tb->c_neg) {
+ rte_bbdev_log(ERR,
+ "k_pos (%u) is out of range %u <= value <= %u",
+ tb->k_pos, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+ RTE_BBDEV_TURBO_MAX_CB_SIZE);
+ return -1;
+ }
+ if (tb->c_neg > (RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1))
+ rte_bbdev_log(ERR,
+ "c_neg (%u) is out of range 0 <= value <= %u",
+ tb->c_neg,
+ RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1);
+ if (tb->c < 1 || tb->c > RTE_BBDEV_TURBO_MAX_CODE_BLOCKS) {
+ rte_bbdev_log(ERR,
+ "c (%u) is out of range 1 <= value <= %u",
+ tb->c, RTE_BBDEV_TURBO_MAX_CODE_BLOCKS);
+ return -1;
+ }
+ if (tb->cab > tb->c) {
+ rte_bbdev_log(ERR,
+ "cab (%u) is greater than c (%u)",
+ tb->cab, tb->c);
+ return -1;
+ }
+ if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) &&
+ (tb->ea < RTE_BBDEV_TURBO_MIN_CB_SIZE
+ || (tb->ea % 2))
+ && tb->cab > 0) {
+ rte_bbdev_log(ERR,
+ "ea (%u) is less than %u or it is not even",
+ tb->ea, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+ return -1;
+ }
+ if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) &&
+ (tb->eb < RTE_BBDEV_TURBO_MIN_CB_SIZE
+ || (tb->eb % 2))
+ && tb->c > tb->cab) {
+ rte_bbdev_log(ERR,
+ "eb (%u) is less than %u or it is not even",
+ tb->eb, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+ }
+ } else {
+ cb = &turbo_dec->cb_params;
+ if (cb->k < RTE_BBDEV_TURBO_MIN_CB_SIZE
+ || cb->k > RTE_BBDEV_TURBO_MAX_CB_SIZE) {
+ rte_bbdev_log(ERR,
+ "k (%u) is out of range %u <= value <= %u",
+ cb->k, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+ RTE_BBDEV_TURBO_MAX_CB_SIZE);
+ return -1;
+ }
+ if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) &&
+ (cb->e < RTE_BBDEV_TURBO_MIN_CB_SIZE ||
+ (cb->e % 2))) {
+ rte_bbdev_log(ERR,
+ "e (%u) is less than %u or it is not even",
+ cb->e, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+ return -1;
+ }
+ }
+
+ return 0;
+}
+#endif
+
/** Enqueue one decode operation for ACC100 device in CB mode */
static inline int
enqueue_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
@@ -2203,6 +2596,14 @@
struct rte_mbuf *input, *h_output_head, *h_output,
*s_output_head, *s_output;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+ /* Validate op structure */
+ if (validate_dec_op(op) == -1) {
+ rte_bbdev_log(ERR, "Turbo decoder validation failed");
+ return -EINVAL;
+ }
+#endif
+
uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
& q->sw_ring_wrap_mask);
desc = q->ring_addr + desc_idx;
@@ -2426,6 +2827,13 @@
return ret;
}
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+ /* Validate op structure */
+ if (validate_ldpc_dec_op(op) == -1) {
+ rte_bbdev_log(ERR, "LDPC decoder validation failed");
+ return -EINVAL;
+ }
+#endif
union acc100_dma_desc *desc;
uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
& q->sw_ring_wrap_mask);
@@ -2521,6 +2929,14 @@
struct rte_mbuf *input, *h_output_head, *h_output;
uint16_t current_enqueued_cbs = 0;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+ /* Validate op structure */
+ if (validate_ldpc_dec_op(op) == -1) {
+ rte_bbdev_log(ERR, "LDPC decoder validation failed");
+ return -EINVAL;
+ }
+#endif
+
uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
& q->sw_ring_wrap_mask);
desc = q->ring_addr + desc_idx;
@@ -2611,6 +3027,14 @@
*s_output_head, *s_output;
uint16_t current_enqueued_cbs = 0;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+ /* Validate op structure */
+ if (validate_dec_op(op) == -1) {
+ rte_bbdev_log(ERR, "Turbo decoder validation failed");
+ return -EINVAL;
+ }
+#endif
+
uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
& q->sw_ring_wrap_mask);
desc = q->ring_addr + desc_idx;
--
1.8.3.1
^ permalink raw reply [flat|nested] 213+ messages in thread
* [dpdk-dev] [PATCH v3 10/11] baseband/acc100: add configure function
2020-08-19 0:25 [dpdk-dev] [PATCH v3 00/11] bbdev PMD ACC100 Nicolas Chautru
` (8 preceding siblings ...)
2020-08-19 0:25 ` [dpdk-dev] [PATCH v3 09/11] baseband/acc100: add debug function to validate input Nicolas Chautru
@ 2020-08-19 0:25 ` Nicolas Chautru
2020-09-03 10:06 ` Aidan Goddard
2020-08-19 0:25 ` [dpdk-dev] [PATCH v3 11/11] doc: update bbdev feature table Nicolas Chautru
10 siblings, 1 reply; 213+ messages in thread
From: Nicolas Chautru @ 2020-08-19 0:25 UTC (permalink / raw)
To: dev, akhil.goyal; +Cc: bruce.richardson, Nicolas Chautru
Add a configure function to set up the PF from within
bbdev-test itself, without requiring an external
application to configure the device.
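For reference, a minimal calling sketch (illustrative only: it assumes
the device is already probed, mirrors the bbdev-test defaults added
below, and the BDF string is a placeholder):

    #include <string.h>
    #include <rte_acc100_cfg.h>

    struct acc100_conf conf;
    memset(&conf, 0, sizeof(conf));
    conf.pf_mode_en = true;       /* built-in PF configuration */
    conf.num_vf_bundles = 1;
    conf.q_ul_5g.num_qgroups = 2;
    conf.q_ul_5g.num_aqs_per_groups = 16;
    conf.q_ul_5g.aq_depth_log2 = 5;
    /* ... likewise for q_ul_4g, q_dl_4g and q_dl_5g ... */
    int ret = acc100_configure("00:01.0", &conf);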
Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
app/test-bbdev/test_bbdev_perf.c | 72 +++
drivers/baseband/acc100/Makefile | 3 +
drivers/baseband/acc100/meson.build | 2 +
drivers/baseband/acc100/rte_acc100_cfg.h | 17 +
drivers/baseband/acc100/rte_acc100_pmd.c | 505 +++++++++++++++++++++
.../acc100/rte_pmd_bbdev_acc100_version.map | 7 +
6 files changed, 606 insertions(+)
diff --git a/app/test-bbdev/test_bbdev_perf.c b/app/test-bbdev/test_bbdev_perf.c
index 45c0d62..32f23ff 100644
--- a/app/test-bbdev/test_bbdev_perf.c
+++ b/app/test-bbdev/test_bbdev_perf.c
@@ -52,6 +52,18 @@
#define FLR_5G_TIMEOUT 610
#endif
+#ifdef RTE_LIBRTE_PMD_BBDEV_ACC100
+#include <rte_acc100_cfg.h>
+#define ACC100PF_DRIVER_NAME ("intel_acc100_pf")
+#define ACC100VF_DRIVER_NAME ("intel_acc100_vf")
+#define ACC100_QMGR_NUM_AQS 16
+#define ACC100_QMGR_NUM_QGS 2
+#define ACC100_QMGR_AQ_DEPTH 5
+#define ACC100_QMGR_INVALID_IDX -1
+#define ACC100_QMGR_RR 1
+#define ACC100_QOS_GBR 0
+#endif
+
#define OPS_CACHE_SIZE 256U
#define OPS_POOL_SIZE_MIN 511U /* 0.5K per queue */
@@ -653,6 +665,66 @@ typedef int (test_case_function)(struct active_device *ad,
info->dev_name);
}
#endif
+#ifdef RTE_LIBRTE_PMD_BBDEV_ACC100
+ if ((get_init_device() == true) &&
+ (!strcmp(info->drv.driver_name, ACC100PF_DRIVER_NAME))) {
+ struct acc100_conf conf;
+ unsigned int i;
+
+ printf("Configure ACC100 FEC Driver %s with default values\n",
+ info->drv.driver_name);
+
+ /* clear default configuration before initialization */
+ memset(&conf, 0, sizeof(struct acc100_conf));
+
+ /* Always set in PF mode for built-in configuration */
+ conf.pf_mode_en = true;
+ for (i = 0; i < RTE_ACC100_NUM_VFS; ++i) {
+ conf.arb_dl_4g[i].gbr_threshold1 = ACC100_QOS_GBR;
+ conf.arb_dl_4g[i].gbr_threshold2 = ACC100_QOS_GBR;
+ conf.arb_dl_4g[i].round_robin_weight = ACC100_QMGR_RR;
+ conf.arb_ul_4g[i].gbr_threshold1 = ACC100_QOS_GBR;
+ conf.arb_ul_4g[i].gbr_threshold2 = ACC100_QOS_GBR;
+ conf.arb_ul_4g[i].round_robin_weight = ACC100_QMGR_RR;
+ conf.arb_dl_5g[i].gbr_threshold1 = ACC100_QOS_GBR;
+ conf.arb_dl_5g[i].gbr_threshold2 = ACC100_QOS_GBR;
+ conf.arb_dl_5g[i].round_robin_weight = ACC100_QMGR_RR;
+ conf.arb_ul_5g[i].gbr_threshold1 = ACC100_QOS_GBR;
+ conf.arb_ul_5g[i].gbr_threshold2 = ACC100_QOS_GBR;
+ conf.arb_ul_5g[i].round_robin_weight = ACC100_QMGR_RR;
+ }
+
+ conf.input_pos_llr_1_bit = true;
+ conf.output_pos_llr_1_bit = true;
+ conf.num_vf_bundles = 1; /**< Number of VF bundles to setup */
+
+ conf.q_ul_4g.num_qgroups = ACC100_QMGR_NUM_QGS;
+ conf.q_ul_4g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+ conf.q_ul_4g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+ conf.q_ul_4g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+ conf.q_dl_4g.num_qgroups = ACC100_QMGR_NUM_QGS;
+ conf.q_dl_4g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+ conf.q_dl_4g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+ conf.q_dl_4g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+ conf.q_ul_5g.num_qgroups = ACC100_QMGR_NUM_QGS;
+ conf.q_ul_5g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+ conf.q_ul_5g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+ conf.q_ul_5g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+ conf.q_dl_5g.num_qgroups = ACC100_QMGR_NUM_QGS;
+ conf.q_dl_5g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+ conf.q_dl_5g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+ conf.q_dl_5g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+
+ /* setup PF with configuration information */
+ ret = acc100_configure(info->dev_name, &conf);
+ TEST_ASSERT_SUCCESS(ret,
+ "Failed to configure ACC100 PF for bbdev %s",
+ info->dev_name);
+ /* Refresh the info now that the device is configured */
+ }
+ rte_bbdev_info_get(dev_id, info);
+#endif
+
nb_queues = RTE_MIN(rte_lcore_count(), info->drv.max_num_queues);
nb_queues = RTE_MIN(nb_queues, (unsigned int) MAX_QUEUES);
diff --git a/drivers/baseband/acc100/Makefile b/drivers/baseband/acc100/Makefile
index c79e487..37e73af 100644
--- a/drivers/baseband/acc100/Makefile
+++ b/drivers/baseband/acc100/Makefile
@@ -22,4 +22,7 @@ LIBABIVER := 1
# library source files
SRCS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100) += rte_acc100_pmd.c
+# export include files
+SYMLINK-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100)-include += rte_acc100_cfg.h
+
include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/baseband/acc100/meson.build b/drivers/baseband/acc100/meson.build
index 8afafc2..7ac44dc 100644
--- a/drivers/baseband/acc100/meson.build
+++ b/drivers/baseband/acc100/meson.build
@@ -4,3 +4,5 @@
deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci']
sources = files('rte_acc100_pmd.c')
+
+install_headers('rte_acc100_cfg.h')
diff --git a/drivers/baseband/acc100/rte_acc100_cfg.h b/drivers/baseband/acc100/rte_acc100_cfg.h
index 73bbe36..7f523bc 100644
--- a/drivers/baseband/acc100/rte_acc100_cfg.h
+++ b/drivers/baseband/acc100/rte_acc100_cfg.h
@@ -89,6 +89,23 @@ struct acc100_conf {
struct rte_arbitration_t arb_dl_5g[RTE_ACC100_NUM_VFS];
};
+/**
+ * Configure an ACC100 device
+ *
+ * @param dev_name
+ * The name of the device. This is the short form of PCI BDF, e.g. 00:01.0.
+ * It can also be retrieved for a bbdev device from the dev_name field in the
+ * rte_bbdev_info structure returned by rte_bbdev_info_get().
+ * @param conf
+ * Configuration to apply to ACC100 HW.
+ *
+ * @return
+ * Zero on success, negative value on failure.
+ */
+__rte_experimental
+int
+acc100_configure(const char *dev_name, struct acc100_conf *conf);
+
#ifdef __cplusplus
}
#endif
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index dc14079..43f664b 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -85,6 +85,26 @@
enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC};
+/* Return the accelerator enum for a Queue Group Index */
+static inline int
+accFromQgid(int qg_idx, const struct acc100_conf *acc100_conf)
+{
+ int accQg[ACC100_NUM_QGRPS];
+ int NumQGroupsPerFn[NUM_ACC];
+ int acc, qgIdx, qgIndex = 0;
+ for (qgIdx = 0; qgIdx < ACC100_NUM_QGRPS; qgIdx++)
+ accQg[qgIdx] = 0;
+ NumQGroupsPerFn[UL_4G] = acc100_conf->q_ul_4g.num_qgroups;
+ NumQGroupsPerFn[UL_5G] = acc100_conf->q_ul_5g.num_qgroups;
+ NumQGroupsPerFn[DL_4G] = acc100_conf->q_dl_4g.num_qgroups;
+ NumQGroupsPerFn[DL_5G] = acc100_conf->q_dl_5g.num_qgroups;
+ for (acc = UL_4G; acc < NUM_ACC; acc++)
+ for (qgIdx = 0; qgIdx < NumQGroupsPerFn[acc]; qgIdx++)
+ accQg[qgIndex++] = acc;
+ acc = accQg[qg_idx];
+ return acc;
+}
+
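+/*
+ * Worked example (illustrative): with the bbdev-test defaults of two
+ * queue groups per mode, qg_idx 0-1 map to UL_4G, 2-3 to UL_5G,
+ * 4-5 to DL_4G and 6-7 to DL_5G.
+ */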
/* Return the queue topology for a Queue Group Index */
static inline void
qtopFromAcc(struct rte_q_topology_t **qtop, int acc_enum,
@@ -113,6 +133,30 @@
*qtop = p_qtop;
}
+/* Return the AQ depth for a Queue Group Index */
+static inline int
+aqDepth(int qg_idx, struct acc100_conf *acc100_conf)
+{
+ struct rte_q_topology_t *q_top = NULL;
+ int acc_enum = accFromQgid(qg_idx, acc100_conf);
+ qtopFromAcc(&q_top, acc_enum, acc100_conf);
+ if (unlikely(q_top == NULL))
+ return 0;
+ return q_top->aq_depth_log2;
+}
+
+/* Return the number of AQs for a Queue Group Index */
+static inline int
+aqNum(int qg_idx, struct acc100_conf *acc100_conf)
+{
+ struct rte_q_topology_t *q_top = NULL;
+ int acc_enum = accFromQgid(qg_idx, acc100_conf);
+ qtopFromAcc(&q_top, acc_enum, acc100_conf);
+ if (unlikely(q_top == NULL))
+ return 0;
+ return q_top->num_aqs_per_groups;
+}
+
static void
initQTop(struct acc100_conf *acc100_conf)
{
@@ -4177,3 +4221,464 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
+
+/*
+ * Implementation to fix the power-on status of some 5GUL engines
+ * This requires DMA permission if ported outside DPDK
+ */
+static void
+poweron_cleanup(struct rte_bbdev *bbdev, struct acc100_device *d,
+ struct acc100_conf *conf)
+{
+ int i, template_idx, qg_idx;
+ uint32_t address, status, payload;
+ printf("Need to clear power-on 5GUL status in internal memory\n");
+ /* Reset LDPC Cores */
+ for (i = 0; i < ACC100_ENGINES_MAX; i++)
+ acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+ ACC100_ENGINE_OFFSET * i, ACC100_RESET_HI);
+ usleep(LONG_WAIT);
+ for (i = 0; i < ACC100_ENGINES_MAX; i++)
+ acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+ ACC100_ENGINE_OFFSET * i, ACC100_RESET_LO);
+ usleep(LONG_WAIT);
+ /* Prepare dummy workload */
+ alloc_2x64mb_sw_rings_mem(bbdev, d, 0);
+ /* Set base addresses */
+ uint32_t phys_high = (uint32_t)(d->sw_rings_phys >> 32);
+ uint32_t phys_low = (uint32_t)(d->sw_rings_phys &
+ ~(ACC100_SIZE_64MBYTE-1));
+ acc100_reg_write(d, HWPfDmaFec5GulDescBaseHiRegVf, phys_high);
+ acc100_reg_write(d, HWPfDmaFec5GulDescBaseLoRegVf, phys_low);
+
+ /* Descriptor for dummy 5GUL code block processing */
+ union acc100_dma_desc *desc = NULL;
+ desc = d->sw_rings;
+ desc->req.data_ptrs[0].address = d->sw_rings_phys +
+ ACC100_DESC_FCW_OFFSET;
+ desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
+ desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
+ desc->req.data_ptrs[0].last = 0;
+ desc->req.data_ptrs[0].dma_ext = 0;
+ desc->req.data_ptrs[1].address = d->sw_rings_phys + 512;
+ desc->req.data_ptrs[1].blkid = ACC100_DMA_BLKID_IN;
+ desc->req.data_ptrs[1].last = 1;
+ desc->req.data_ptrs[1].dma_ext = 0;
+ desc->req.data_ptrs[1].blen = 44;
+ desc->req.data_ptrs[2].address = d->sw_rings_phys + 1024;
+ desc->req.data_ptrs[2].blkid = ACC100_DMA_BLKID_OUT_ENC;
+ desc->req.data_ptrs[2].last = 1;
+ desc->req.data_ptrs[2].dma_ext = 0;
+ desc->req.data_ptrs[2].blen = 5;
+ /* Dummy FCW */
+ desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
+ desc->req.fcw_ld.qm = 1;
+ desc->req.fcw_ld.nfiller = 30;
+ desc->req.fcw_ld.BG = 2 - 1;
+ desc->req.fcw_ld.Zc = 7;
+ desc->req.fcw_ld.ncb = 350;
+ desc->req.fcw_ld.rm_e = 4;
+ desc->req.fcw_ld.itmax = 10;
+ desc->req.fcw_ld.gain_i = 1;
+ desc->req.fcw_ld.gain_h = 1;
+
+ int engines_to_restart[SIG_UL_5G_LAST + 1] = {0};
+ int num_failed_engine = 0;
+ /* Detect engines in undefined state */
+ for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
+ template_idx++) {
+ /* Check engine power-on status */
+ address = HwPfFecUl5gIbDebugReg +
+ ACC100_ENGINE_OFFSET * template_idx;
+ status = (acc100_reg_read(d, address) >> 4) & 0xF;
+ if (status == 0) {
+ engines_to_restart[num_failed_engine] = template_idx;
+ num_failed_engine++;
+ }
+ }
+
+ int numQqsAcc = conf->q_ul_5g.num_qgroups;
+ int numQgs = conf->q_ul_5g.num_qgroups;
+ payload = 0;
+ for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+ payload |= (1 << qg_idx);
+ /* Force each engine which is in unspecified state */
+ for (i = 0; i < num_failed_engine; i++) {
+ int failed_engine = engines_to_restart[i];
+ printf("Force engine %d\n", failed_engine);
+ for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
+ template_idx++) {
+ address = HWPfQmgrGrpTmplateReg4Indx
+ + BYTES_IN_WORD * template_idx;
+ if (template_idx == failed_engine)
+ acc100_reg_write(d, address, payload);
+ else
+ acc100_reg_write(d, address, 0);
+ }
+ /* Reset descriptor header */
+ desc->req.word0 = ACC100_DMA_DESC_TYPE;
+ desc->req.word1 = 0;
+ desc->req.word2 = 0;
+ desc->req.word3 = 0;
+ desc->req.numCBs = 1;
+ desc->req.m2dlen = 2;
+ desc->req.d2mlen = 1;
+ /* Enqueue the code block for processing */
+ union acc100_enqueue_reg_fmt enq_req;
+ enq_req.val = 0;
+ enq_req.addr_offset = ACC100_DESC_OFFSET;
+ enq_req.num_elem = 1;
+ enq_req.req_elem_addr = 0;
+ rte_wmb();
+ acc100_reg_write(d, HWPfQmgrIngressAq + 0x100, enq_req.val);
+ usleep(LONG_WAIT * 100);
+ if (desc->req.word0 != 2)
+ printf("DMA Response %#"PRIx32"\n", desc->req.word0);
+ }
+
+ /* Reset LDPC Cores */
+ for (i = 0; i < ACC100_ENGINES_MAX; i++)
+ acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+ ACC100_ENGINE_OFFSET * i, ACC100_RESET_HI);
+ usleep(LONG_WAIT);
+ for (i = 0; i < ACC100_ENGINES_MAX; i++)
+ acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+ ACC100_ENGINE_OFFSET * i, ACC100_RESET_LO);
+ usleep(LONG_WAIT);
+ acc100_reg_write(d, HWPfHi5GHardResetReg, ACC100_RESET_HARD);
+ usleep(LONG_WAIT);
+ int numEngines = 0;
+ /* Check engine power-on status again */
+ for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
+ template_idx++) {
+ address = HwPfFecUl5gIbDebugReg +
+ ACC100_ENGINE_OFFSET * template_idx;
+ status = (acc100_reg_read(d, address) >> 4) & 0xF;
+ address = HWPfQmgrGrpTmplateReg4Indx
+ + BYTES_IN_WORD * template_idx;
+ if (status == 1) {
+ acc100_reg_write(d, address, payload);
+ numEngines++;
+ } else
+ acc100_reg_write(d, address, 0);
+ }
+ printf("Number of 5GUL engines %d\n", numEngines);
+
+ if (d->sw_rings_base != NULL)
+ rte_free(d->sw_rings_base);
+ usleep(LONG_WAIT);
+}
+
+/* Initial configuration of an ACC100 device prior to running configure() */
+int
+acc100_configure(const char *dev_name, struct acc100_conf *conf)
+{
+ rte_bbdev_log(INFO, "acc100_configure");
+ uint32_t payload, address, status;
+ int qg_idx, template_idx, vf_idx, acc, i;
+ struct rte_bbdev *bbdev = rte_bbdev_get_named_dev(dev_name);
+
+ /* Compile time checks */
+ RTE_BUILD_BUG_ON(sizeof(struct acc100_dma_req_desc) != 256);
+ RTE_BUILD_BUG_ON(sizeof(union acc100_dma_desc) != 256);
+ RTE_BUILD_BUG_ON(sizeof(struct acc100_fcw_td) != 24);
+ RTE_BUILD_BUG_ON(sizeof(struct acc100_fcw_te) != 32);
+
+ if (bbdev == NULL) {
+ rte_bbdev_log(ERR,
+ "Invalid dev_name (%s), or device is not yet initialised",
+ dev_name);
+ return -ENODEV;
+ }
+ struct acc100_device *d = bbdev->data->dev_private;
+
+ /* Store configuration */
+ rte_memcpy(&d->acc100_conf, conf, sizeof(d->acc100_conf));
+
+ /* PCIe Bridge configuration */
+ acc100_reg_write(d, HwPfPcieGpexBridgeControl, ACC100_CFG_PCI_BRIDGE);
+ for (i = 1; i < 17; i++)
+ acc100_reg_write(d,
+ HwPfPcieGpexAxiAddrMappingWindowPexBaseHigh
+ + i * 16, 0);
+
+ /* PCIe Link Training and Status State Machine */
+ acc100_reg_write(d, HwPfPcieGpexLtssmStateCntrl, 0xDFC00000);
+
+ /* Prevent blocking AXI read on BRESP for AXI Write */
+ address = HwPfPcieGpexAxiPioControl;
+ payload = ACC100_CFG_PCI_AXI;
+ acc100_reg_write(d, address, payload);
+
+ /* 5GDL PLL phase shift */
+ acc100_reg_write(d, HWPfChaDl5gPllPhshft0, 0x1);
+
+ /* Explicitly releasing AXI as this may be stopped after PF FLR/BME */
+ address = HWPfDmaAxiControl;
+ payload = 1;
+ acc100_reg_write(d, address, payload);
+
+ /* DDR Configuration */
+ address = HWPfDdrBcTim6;
+ payload = acc100_reg_read(d, address);
+ payload &= 0xFFFFFFFB; /* Bit 2 */
+#ifdef ACC100_DDR_ECC_ENABLE
+ payload |= 0x4;
+#endif
+ acc100_reg_write(d, address, payload);
+ address = HWPfDdrPhyDqsCountNum;
+#ifdef ACC100_DDR_ECC_ENABLE
+ payload = 9;
+#else
+ payload = 8;
+#endif
+ acc100_reg_write(d, address, payload);
+
+ /* Set default descriptor signature */
+ address = HWPfDmaDescriptorSignatuture;
+ payload = 0;
+ acc100_reg_write(d, address, payload);
+
+ /* Enable the Error Detection in DMA */
+ payload = ACC100_CFG_DMA_ERROR;
+ address = HWPfDmaErrorDetectionEn;
+ acc100_reg_write(d, address, payload);
+
+ /* AXI Cache configuration */
+ payload = ACC100_CFG_AXI_CACHE;
+ address = HWPfDmaAxcacheReg;
+ acc100_reg_write(d, address, payload);
+
+ /* Default DMA Configuration (Qmgr Enabled) */
+ address = HWPfDmaConfig0Reg;
+ payload = 0;
+ acc100_reg_write(d, address, payload);
+ address = HWPfDmaQmanen;
+ payload = 0;
+ acc100_reg_write(d, address, payload);
+
+ /* Default RLIM/ALEN configuration */
+ address = HWPfDmaConfig1Reg;
+ payload = (1 << 31) + (23 << 8) + (1 << 6) + 7;
+ acc100_reg_write(d, address, payload);
+
+ /* Configure DMA Qmanager addresses */
+ address = HWPfDmaQmgrAddrReg;
+ payload = HWPfQmgrEgressQueuesTemplate;
+ acc100_reg_write(d, address, payload);
+
+ /* ===== Qmgr Configuration ===== */
+ /* Configuration of the AQueue Depth QMGR_GRP_0_DEPTH_LOG2 for UL */
+ int totalQgs = conf->q_ul_4g.num_qgroups +
+ conf->q_ul_5g.num_qgroups +
+ conf->q_dl_4g.num_qgroups +
+ conf->q_dl_5g.num_qgroups;
+ for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
+ address = HWPfQmgrDepthLog2Grp +
+ BYTES_IN_WORD * qg_idx;
+ payload = aqDepth(qg_idx, conf);
+ acc100_reg_write(d, address, payload);
+ address = HWPfQmgrTholdGrp +
+ BYTES_IN_WORD * qg_idx;
+ payload = (1 << 16) + (1 << (aqDepth(qg_idx, conf) - 1));
+ acc100_reg_write(d, address, payload);
+ }
+
+ /* Template Priority in incremental order */
+ for (template_idx = 0; template_idx < ACC100_NUM_TMPL;
+ template_idx++) {
+ address = HWPfQmgrGrpTmplateReg0Indx +
+ BYTES_IN_WORD * (template_idx % 8);
+ payload = TMPL_PRI_0;
+ acc100_reg_write(d, address, payload);
+ address = HWPfQmgrGrpTmplateReg1Indx +
+ BYTES_IN_WORD * (template_idx % 8);
+ payload = TMPL_PRI_1;
+ acc100_reg_write(d, address, payload);
+ address = HWPfQmgrGrpTmplateReg2indx +
+ BYTES_IN_WORD * (template_idx % 8);
+ payload = TMPL_PRI_2;
+ acc100_reg_write(d, address, payload);
+ address = HWPfQmgrGrpTmplateReg3Indx +
+ BYTES_IN_WORD * (template_idx % 8);
+ payload = TMPL_PRI_3;
+ acc100_reg_write(d, address, payload);
+ }
+
+ address = HWPfQmgrGrpPriority;
+ payload = ACC100_CFG_QMGR_HI_P;
+ acc100_reg_write(d, address, payload);
+
+ /* Template Configuration */
+ for (template_idx = 0; template_idx < ACC100_NUM_TMPL; template_idx++) {
+ payload = 0;
+ address = HWPfQmgrGrpTmplateReg4Indx
+ + BYTES_IN_WORD * template_idx;
+ acc100_reg_write(d, address, payload);
+ }
+ /* 4GUL */
+ int numQgs = conf->q_ul_4g.num_qgroups;
+ int numQqsAcc = 0;
+ payload = 0;
+ for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+ payload |= (1 << qg_idx);
+ for (template_idx = SIG_UL_4G; template_idx <= SIG_UL_4G_LAST;
+ template_idx++) {
+ address = HWPfQmgrGrpTmplateReg4Indx
+ + BYTES_IN_WORD*template_idx;
+ acc100_reg_write(d, address, payload);
+ }
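+ /*
+ * Example (illustrative): with two 4GUL queue groups starting at
+ * index 0, payload = 0x3, i.e. qgroups 0 and 1 are enabled on each
+ * 4GUL template.
+ */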
+ /* 5GUL */
+ numQqsAcc += numQgs;
+ numQgs = conf->q_ul_5g.num_qgroups;
+ payload = 0;
+ int numEngines = 0;
+ for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+ payload |= (1 << qg_idx);
+ for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
+ template_idx++) {
+ /* Check engine power-on status */
+ address = HwPfFecUl5gIbDebugReg +
+ ACC100_ENGINE_OFFSET * template_idx;
+ status = (acc100_reg_read(d, address) >> 4) & 0xF;
+ address = HWPfQmgrGrpTmplateReg4Indx
+ + BYTES_IN_WORD * template_idx;
+ if (status == 1) {
+ acc100_reg_write(d, address, payload);
+ numEngines++;
+ } else
+ acc100_reg_write(d, address, 0);
+ #if RTE_ACC100_SINGLE_FEC == 1
+ payload = 0;
+ #endif
+ }
+ printf("Number of 5GUL engines %d\n", numEngines);
+ /* 4GDL */
+ numQqsAcc += numQgs;
+ numQgs = conf->q_dl_4g.num_qgroups;
+ payload = 0;
+ for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+ payload |= (1 << qg_idx);
+ for (template_idx = SIG_DL_4G; template_idx <= SIG_DL_4G_LAST;
+ template_idx++) {
+ address = HWPfQmgrGrpTmplateReg4Indx
+ + BYTES_IN_WORD*template_idx;
+ acc100_reg_write(d, address, payload);
+ #if RTE_ACC100_SINGLE_FEC == 1
+ payload = 0;
+ #endif
+ }
+ /* 5GDL */
+ numQqsAcc += numQgs;
+ numQgs = conf->q_dl_5g.num_qgroups;
+ payload = 0;
+ for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+ payload |= (1 << qg_idx);
+ for (template_idx = SIG_DL_5G; template_idx <= SIG_DL_5G_LAST;
+ template_idx++) {
+ address = HWPfQmgrGrpTmplateReg4Indx
+ + BYTES_IN_WORD*template_idx;
+ acc100_reg_write(d, address, payload);
+ #if RTE_ACC100_SINGLE_FEC == 1
+ payload = 0;
+ #endif
+ }
+
+ /* Queue Group Function mapping */
+ int qman_func_id[5] = {0, 2, 1, 3, 4};
+ address = HWPfQmgrGrpFunction0;
+ payload = 0;
+ for (qg_idx = 0; qg_idx < 8; qg_idx++) {
+ acc = accFromQgid(qg_idx, conf);
+ payload |= qman_func_id[acc]<<(qg_idx * 4);
+ }
+ acc100_reg_write(d, address, payload);
+
+ /* Configuration of the Arbitration QGroup depth to 1 */
+ for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
+ address = HWPfQmgrArbQDepthGrp +
+ BYTES_IN_WORD * qg_idx;
+ payload = 0;
+ acc100_reg_write(d, address, payload);
+ }
+
+ /* Enabling AQueues through the Queue hierarchy */
+ for (vf_idx = 0; vf_idx < ACC100_NUM_VFS; vf_idx++) {
+ for (qg_idx = 0; qg_idx < ACC100_NUM_QGRPS; qg_idx++) {
+ payload = 0;
+ if (vf_idx < conf->num_vf_bundles &&
+ qg_idx < totalQgs)
+ payload = (1 << aqNum(qg_idx, conf)) - 1;
+ address = HWPfQmgrAqEnableVf
+ + vf_idx * BYTES_IN_WORD;
+ payload += (qg_idx << 16);
+ acc100_reg_write(d, address, payload);
+ }
+ }
+
+ /* This pointer to ARAM (256kB) is shifted by 2 (4B per register) */
+ uint32_t aram_address = 0;
+ for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
+ for (vf_idx = 0; vf_idx < conf->num_vf_bundles; vf_idx++) {
+ address = HWPfQmgrVfBaseAddr + vf_idx
+ * BYTES_IN_WORD + qg_idx
+ * BYTES_IN_WORD * 64;
+ payload = aram_address;
+ acc100_reg_write(d, address, payload);
+ /* Offset ARAM Address for next memory bank
+ * - increment of 4B
+ */
+ aram_address += aqNum(qg_idx, conf) *
+ (1 << aqDepth(qg_idx, conf));
+ }
+ }
+
+ if (aram_address > WORDS_IN_ARAM_SIZE) {
+ rte_bbdev_log(ERR, "ARAM Configuration not fitting %d %d\n",
+ aram_address, WORDS_IN_ARAM_SIZE);
+ return -EINVAL;
+ }
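+ /*
+ * Sizing example (illustrative): 8 queue groups with 16 AQs each,
+ * aq_depth_log2 = 5 and one VF bundle consume 16 * 2^5 = 512 ARAM
+ * words per group, 4096 words in total, well within the 64K words
+ * implied by 256 kB of ARAM at 4 B per word.
+ */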
+
+ /* ==== HI Configuration ==== */
+
+ /* Prevent Block on Transmit Error */
+ address = HWPfHiBlockTransmitOnErrorEn;
+ payload = 0;
+ acc100_reg_write(d, address, payload);
+ /* Prevent MSIs from being dropped */
+ address = HWPfHiMsiDropEnableReg;
+ payload = 0;
+ acc100_reg_write(d, address, payload);
+ /* Set the PF Mode register */
+ address = HWPfHiPfMode;
+ payload = (conf->pf_mode_en) ? 2 : 0;
+ acc100_reg_write(d, address, payload);
+ /* Enable Error Detection in HW */
+ address = HWPfDmaErrorDetectionEn;
+ payload = 0x3D7;
+ acc100_reg_write(d, address, payload);
+
+ /* QoS overflow init */
+ payload = 1;
+ address = HWPfQosmonAEvalOverflow0;
+ acc100_reg_write(d, address, payload);
+ address = HWPfQosmonBEvalOverflow0;
+ acc100_reg_write(d, address, payload);
+
+ /* HARQ DDR Configuration */
+ unsigned int ddrSizeInMb = 512; /* Fixed to 512 MB per VF for now */
+ for (vf_idx = 0; vf_idx < conf->num_vf_bundles; vf_idx++) {
+ address = HWPfDmaVfDdrBaseRw + vf_idx
+ * 0x10;
+ payload = ((vf_idx * (ddrSizeInMb / 64)) << 16) +
+ (ddrSizeInMb - 1);
+ acc100_reg_write(d, address, payload);
+ }
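+ /*
+ * Example for the loop above (illustrative): with 512 MB per VF,
+ * VF 1 is written payload = ((1 * 8) << 16) + 511 = 0x801FF; the
+ * upper half appears to hold the DDR base in 64 MB units and the
+ * lower half the window size in MB minus one.
+ */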
+ usleep(LONG_WAIT);
+
+ if (numEngines < (SIG_UL_5G_LAST + 1))
+ poweron_cleanup(bbdev, d, conf);
+
+ rte_bbdev_log_debug("PF Tip configuration complete for %s", dev_name);
+ return 0;
+}
diff --git a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
index 4a76d1d..91c234d 100644
--- a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
+++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
@@ -1,3 +1,10 @@
DPDK_21 {
local: *;
};
+
+EXPERIMENTAL {
+ global:
+
+ acc100_configure;
+
+};
--
1.8.3.1
^ permalink raw reply [flat|nested] 213+ messages in thread
* [dpdk-dev] [PATCH v3 11/11] doc: update bbdev feature table
2020-08-19 0:25 [dpdk-dev] [PATCH v3 00/11] bbdev PMD ACC100 Nicolas Chautru
` (9 preceding siblings ...)
2020-08-19 0:25 ` [dpdk-dev] [PATCH v3 10/11] baseband/acc100: add configure function Nicolas Chautru
@ 2020-08-19 0:25 ` Nicolas Chautru
2020-09-04 17:53 ` [dpdk-dev] [PATCH v4 00/11] bbdev PMD ACC100 Nicolas Chautru
` (8 more replies)
10 siblings, 9 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-08-19 0:25 UTC (permalink / raw)
To: dev, akhil.goyal; +Cc: bruce.richardson, Nicolas Chautru
Correct the overview matrix to use the acc100 name.
Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
doc/guides/bbdevs/features/acc100.ini | 14 ++++++++++++++
doc/guides/bbdevs/features/mbc.ini | 14 --------------
2 files changed, 14 insertions(+), 14 deletions(-)
create mode 100644 doc/guides/bbdevs/features/acc100.ini
delete mode 100644 doc/guides/bbdevs/features/mbc.ini
diff --git a/doc/guides/bbdevs/features/acc100.ini b/doc/guides/bbdevs/features/acc100.ini
new file mode 100644
index 0000000..642cd48
--- /dev/null
+++ b/doc/guides/bbdevs/features/acc100.ini
@@ -0,0 +1,14 @@
+;
+; Supported features of the 'acc100' bbdev driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+Turbo Decoder (4G) = Y
+Turbo Encoder (4G) = Y
+LDPC Decoder (5G) = Y
+LDPC Encoder (5G) = Y
+LLR/HARQ Compression = Y
+External DDR Access = Y
+HW Accelerated = Y
+BBDEV API = Y
diff --git a/doc/guides/bbdevs/features/mbc.ini b/doc/guides/bbdevs/features/mbc.ini
deleted file mode 100644
index 78a7b95..0000000
--- a/doc/guides/bbdevs/features/mbc.ini
+++ /dev/null
@@ -1,14 +0,0 @@
-;
-; Supported features of the 'mbc' bbdev driver.
-;
-; Refer to default.ini for the full list of available PMD features.
-;
-[Features]
-Turbo Decoder (4G) = Y
-Turbo Encoder (4G) = Y
-LDPC Decoder (5G) = Y
-LDPC Encoder (5G) = Y
-LLR/HARQ Compression = Y
-External DDR Access = Y
-HW Accelerated = Y
-BBDEV API = Y
--
1.8.3.1
^ permalink raw reply [flat|nested] 213+ messages in thread
* Re: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions
2020-08-19 0:25 ` [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions Nicolas Chautru
@ 2020-08-20 14:38 ` Dave Burley
2020-08-20 14:52 ` Chautru, Nicolas
2020-08-29 11:10 ` Xu, Rosen
1 sibling, 1 reply; 213+ messages in thread
From: Dave Burley @ 2020-08-20 14:38 UTC (permalink / raw)
To: Nicolas Chautru, dev; +Cc: bruce.richardson
Hi Nic,
As you've now specified the use of RTE_BBDEV_LDPC_LLR_COMPRESSION for this PMD, please could you confirm what the packed format of the LLRs in memory looks like?
Best Regards
Dave Burley
From: dev <dev-bounces@dpdk.org> on behalf of Nicolas Chautru <nicolas.chautru@intel.com>
Sent: 19 August 2020 01:25
To: dev@dpdk.org <dev@dpdk.org>; akhil.goyal@nxp.com <akhil.goyal@nxp.com>
Cc: bruce.richardson@intel.com <bruce.richardson@intel.com>; Nicolas Chautru <nicolas.chautru@intel.com>
Subject: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions
Adding LDPC decode and encode processing operations
Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
drivers/baseband/acc100/rte_acc100_pmd.c | 1625 +++++++++++++++++++++++++++++-
drivers/baseband/acc100/rte_acc100_pmd.h | 3 +
2 files changed, 1626 insertions(+), 2 deletions(-)
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 7a21c57..5f32813 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -15,6 +15,9 @@
#include <rte_hexdump.h>
#include <rte_pci.h>
#include <rte_bus_pci.h>
+#ifdef RTE_BBDEV_OFFLOAD_COST
+#include <rte_cycles.h>
+#endif
#include <rte_bbdev.h>
#include <rte_bbdev_pmd.h>
@@ -449,7 +452,6 @@
return 0;
}
-
/**
* Report a ACC100 queue index which is free
* Return 0 to 16k for a valid queue_idx or -1 when no queue is available
@@ -634,6 +636,46 @@
struct acc100_device *d = dev->data->dev_private;
static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
+ {
+ .type = RTE_BBDEV_OP_LDPC_ENC,
+ .cap.ldpc_enc = {
+ .capability_flags =
+ RTE_BBDEV_LDPC_RATE_MATCH |
+ RTE_BBDEV_LDPC_CRC_24B_ATTACH |
+ RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
+ .num_buffers_src =
+ RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+ .num_buffers_dst =
+ RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+ }
+ },
+ {
+ .type = RTE_BBDEV_OP_LDPC_DEC,
+ .cap.ldpc_dec = {
+ .capability_flags =
+ RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK |
+ RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP |
+ RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
+ RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
+#ifdef ACC100_EXT_MEM
+ RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
+ RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
+#endif
+ RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE |
+ RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS |
+ RTE_BBDEV_LDPC_DECODE_BYPASS |
+ RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
+ RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
+ RTE_BBDEV_LDPC_LLR_COMPRESSION,
+ .llr_size = 8,
+ .llr_decimals = 1,
+ .num_buffers_src =
+ RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+ .num_buffers_hard_out =
+ RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+ .num_buffers_soft_out = 0,
+ }
+ },
RTE_BBDEV_END_OF_CAPABILITIES_LIST()
};
@@ -669,9 +711,14 @@
dev_info->cpu_flag_reqs = NULL;
dev_info->min_alignment = 64;
dev_info->capabilities = bbdev_capabilities;
+#ifdef ACC100_EXT_MEM
dev_info->harq_buffer_size = d->ddr_size;
+#else
+ dev_info->harq_buffer_size = 0;
+#endif
}
+
static const struct rte_bbdev_ops acc100_bbdev_ops = {
.setup_queues = acc100_setup_queues,
.close = acc100_dev_close,
@@ -696,6 +743,1577 @@
{.device_id = 0},
};
+/* Read flag value 0/1 from bitmap */
+static inline bool
+check_bit(uint32_t bitmap, uint32_t bitmask)
+{
+ return bitmap & bitmask;
+}
+
+static inline char *
+mbuf_append(struct rte_mbuf *m_head, struct rte_mbuf *m, uint16_t len)
+{
+ if (unlikely(len > rte_pktmbuf_tailroom(m)))
+ return NULL;
+
+ char *tail = (char *)m->buf_addr + m->data_off + m->data_len;
+ m->data_len = (uint16_t)(m->data_len + len);
+ m_head->pkt_len = (m_head->pkt_len + len);
+ return tail;
+}
+
+/* Compute value of k0.
+ * Based on 3GPP 38.212 Table 5.4.2.1-2
+ * Starting position of different redundancy versions, k0
+ */
+static inline uint16_t
+get_k0(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv_index)
+{
+ if (rv_index == 0)
+ return 0;
+ uint16_t n = (bg == 1 ? N_ZC_1 : N_ZC_2) * z_c;
+ if (n_cb == n) {
+ if (rv_index == 1)
+ return (bg == 1 ? K0_1_1 : K0_1_2) * z_c;
+ else if (rv_index == 2)
+ return (bg == 1 ? K0_2_1 : K0_2_2) * z_c;
+ else
+ return (bg == 1 ? K0_3_1 : K0_3_2) * z_c;
+ }
+ /* LBRM case - includes a division by N */
+ if (rv_index == 1)
+ return (((bg == 1 ? K0_1_1 : K0_1_2) * n_cb)
+ / n) * z_c;
+ else if (rv_index == 2)
+ return (((bg == 1 ? K0_2_1 : K0_2_2) * n_cb)
+ / n) * z_c;
+ else
+ return (((bg == 1 ? K0_3_1 : K0_3_2) * n_cb)
+ / n) * z_c;
+}
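+/*
+ * Worked example (illustrative, assuming the K0_* constants carry the
+ * Table 5.4.2.1-2 numerators, e.g. K0_1_1 = 17 for BG1/rv1): bg = 1,
+ * z_c = 128 and a full buffer n_cb = 66 * 128 = 8448 give, for
+ * rv_index = 1, k0 = 17 * 128 = 2176.
+ */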
+
+/* Fill in a frame control word for LDPC encoding. */
+static inline void
+acc100_fcw_le_fill(const struct rte_bbdev_enc_op *op,
+ struct acc100_fcw_le *fcw, int num_cb)
+{
+ fcw->qm = op->ldpc_enc.q_m;
+ fcw->nfiller = op->ldpc_enc.n_filler;
+ fcw->BG = (op->ldpc_enc.basegraph - 1);
+ fcw->Zc = op->ldpc_enc.z_c;
+ fcw->ncb = op->ldpc_enc.n_cb;
+ fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_enc.basegraph,
+ op->ldpc_enc.rv_index);
+ fcw->rm_e = op->ldpc_enc.cb_params.e;
+ fcw->crc_select = check_bit(op->ldpc_enc.op_flags,
+ RTE_BBDEV_LDPC_CRC_24B_ATTACH);
+ fcw->bypass_intlv = check_bit(op->ldpc_enc.op_flags,
+ RTE_BBDEV_LDPC_INTERLEAVER_BYPASS);
+ fcw->mcb_count = num_cb;
+}
+
+/* Fill in a frame control word for LDPC decoding. */
+static inline void
+acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
+ union acc100_harq_layout_data *harq_layout)
+{
+ uint16_t harq_out_length, harq_in_length, ncb_p, k0_p, parity_offset;
+ uint16_t harq_index;
+ uint32_t l;
+ bool harq_prun = false;
+
+ fcw->qm = op->ldpc_dec.q_m;
+ fcw->nfiller = op->ldpc_dec.n_filler;
+ fcw->BG = (op->ldpc_dec.basegraph - 1);
+ fcw->Zc = op->ldpc_dec.z_c;
+ fcw->ncb = op->ldpc_dec.n_cb;
+ fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_dec.basegraph,
+ op->ldpc_dec.rv_index);
+ if (op->ldpc_dec.code_block_mode == 1)
+ fcw->rm_e = op->ldpc_dec.cb_params.e;
+ else
+ fcw->rm_e = (op->ldpc_dec.tb_params.r <
+ op->ldpc_dec.tb_params.cab) ?
+ op->ldpc_dec.tb_params.ea :
+ op->ldpc_dec.tb_params.eb;
+
+ fcw->hcin_en = check_bit(op->ldpc_dec.op_flags,
+ RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE);
+ fcw->hcout_en = check_bit(op->ldpc_dec.op_flags,
+ RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE);
+ fcw->crc_select = check_bit(op->ldpc_dec.op_flags,
+ RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK);
+ fcw->bypass_dec = check_bit(op->ldpc_dec.op_flags,
+ RTE_BBDEV_LDPC_DECODE_BYPASS);
+ fcw->bypass_intlv = check_bit(op->ldpc_dec.op_flags,
+ RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS);
+ if (op->ldpc_dec.q_m == 1) {
+ fcw->bypass_intlv = 1;
+ fcw->qm = 2;
+ }
+ fcw->hcin_decomp_mode = check_bit(op->ldpc_dec.op_flags,
+ RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+ fcw->hcout_comp_mode = check_bit(op->ldpc_dec.op_flags,
+ RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+ fcw->llr_pack_mode = check_bit(op->ldpc_dec.op_flags,
+ RTE_BBDEV_LDPC_LLR_COMPRESSION);
+ harq_index = op->ldpc_dec.harq_combined_output.offset /
+ ACC100_HARQ_OFFSET;
+#ifdef ACC100_EXT_MEM
+ /* Limit cases when HARQ pruning is valid */
+ harq_prun = ((op->ldpc_dec.harq_combined_output.offset %
+ ACC100_HARQ_OFFSET) == 0) &&
+ (op->ldpc_dec.harq_combined_output.offset <= UINT16_MAX
+ * ACC100_HARQ_OFFSET);
+#endif
+ if (fcw->hcin_en > 0) {
+ harq_in_length = op->ldpc_dec.harq_combined_input.length;
+ if (fcw->hcin_decomp_mode > 0)
+ harq_in_length = harq_in_length * 8 / 6;
+ harq_in_length = RTE_ALIGN(harq_in_length, 64);
+ if ((harq_layout[harq_index].offset > 0) & harq_prun) {
+ rte_bbdev_log_debug("HARQ IN offset unexpected for now\n");
+ fcw->hcin_size0 = harq_layout[harq_index].size0;
+ fcw->hcin_offset = harq_layout[harq_index].offset;
+ fcw->hcin_size1 = harq_in_length -
+ harq_layout[harq_index].offset;
+ } else {
+ fcw->hcin_size0 = harq_in_length;
+ fcw->hcin_offset = 0;
+ fcw->hcin_size1 = 0;
+ }
+ } else {
+ fcw->hcin_size0 = 0;
+ fcw->hcin_offset = 0;
+ fcw->hcin_size1 = 0;
+ }
+
+ fcw->itmax = op->ldpc_dec.iter_max;
+ fcw->itstop = check_bit(op->ldpc_dec.op_flags,
+ RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE);
+ fcw->synd_precoder = fcw->itstop;
+ /*
+ * These are all implicitly set
+ * fcw->synd_post = 0;
+ * fcw->so_en = 0;
+ * fcw->so_bypass_rm = 0;
+ * fcw->so_bypass_intlv = 0;
+ * fcw->dec_convllr = 0;
+ * fcw->hcout_convllr = 0;
+ * fcw->hcout_size1 = 0;
+ * fcw->so_it = 0;
+ * fcw->hcout_offset = 0;
+ * fcw->negstop_th = 0;
+ * fcw->negstop_it = 0;
+ * fcw->negstop_en = 0;
+ * fcw->gain_i = 1;
+ * fcw->gain_h = 1;
+ */
+ if (fcw->hcout_en > 0) {
+ parity_offset = (op->ldpc_dec.basegraph == 1 ? 20 : 8)
+ * op->ldpc_dec.z_c - op->ldpc_dec.n_filler;
+ k0_p = (fcw->k0 > parity_offset) ?
+ fcw->k0 - op->ldpc_dec.n_filler : fcw->k0;
+ ncb_p = fcw->ncb - op->ldpc_dec.n_filler;
+ l = k0_p + fcw->rm_e;
+ harq_out_length = (uint16_t) fcw->hcin_size0;
+ harq_out_length = RTE_MIN(RTE_MAX(harq_out_length, l), ncb_p);
+ harq_out_length = (harq_out_length + 0x3F) & 0xFFC0;
+ if ((k0_p > fcw->hcin_size0 + ACC100_HARQ_OFFSET_THRESHOLD) &&
+ harq_prun) {
+ fcw->hcout_size0 = (uint16_t) fcw->hcin_size0;
+ fcw->hcout_offset = k0_p & 0xFFC0;
+ fcw->hcout_size1 = harq_out_length - fcw->hcout_offset;
+ } else {
+ fcw->hcout_size0 = harq_out_length;
+ fcw->hcout_size1 = 0;
+ fcw->hcout_offset = 0;
+ }
+ harq_layout[harq_index].offset = fcw->hcout_offset;
+ harq_layout[harq_index].size0 = fcw->hcout_size0;
+ } else {
+ fcw->hcout_size0 = 0;
+ fcw->hcout_size1 = 0;
+ fcw->hcout_offset = 0;
+ }
+}
+
+/**
+ * Fills descriptor with data pointers of one block type.
+ *
+ * @param desc
+ * Pointer to DMA descriptor.
+ * @param input
+ * Pointer to pointer to input data which will be encoded. It can be changed
+ * and points to next segment in scatter-gather case.
+ * @param offset
+ * Input offset in rte_mbuf structure. It is used for calculating the point
+ * where data is starting.
+ * @param cb_len
+ * Length of currently processed Code Block
+ * @param seg_total_left
+ * It indicates how many bytes still left in segment (mbuf) for further
+ * processing.
+ * @param op_flags
+ * Store information about device capabilities
+ * @param next_triplet
+ * Index for ACC100 DMA Descriptor triplet
+ *
+ * @return
+ * Returns index of next triplet on success, other value if lengths of
+ * pkt and processed cb do not match.
+ *
+ */
+static inline int
+acc100_dma_fill_blk_type_in(struct acc100_dma_req_desc *desc,
+ struct rte_mbuf **input, uint32_t *offset, uint32_t cb_len,
+ uint32_t *seg_total_left, int next_triplet)
+{
+ uint32_t part_len;
+ struct rte_mbuf *m = *input;
+
+ part_len = (*seg_total_left < cb_len) ? *seg_total_left : cb_len;
+ cb_len -= part_len;
+ *seg_total_left -= part_len;
+
+ desc->data_ptrs[next_triplet].address =
+ rte_pktmbuf_iova_offset(m, *offset);
+ desc->data_ptrs[next_triplet].blen = part_len;
+ desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
+ desc->data_ptrs[next_triplet].last = 0;
+ desc->data_ptrs[next_triplet].dma_ext = 0;
+ *offset += part_len;
+ next_triplet++;
+
+ while (cb_len > 0) {
+ if (next_triplet < ACC100_DMA_MAX_NUM_POINTERS &&
+ m->next != NULL) {
+
+ m = m->next;
+ *seg_total_left = rte_pktmbuf_data_len(m);
+ part_len = (*seg_total_left < cb_len) ?
+ *seg_total_left :
+ cb_len;
+ desc->data_ptrs[next_triplet].address =
+ rte_pktmbuf_mtophys(m);
+ desc->data_ptrs[next_triplet].blen = part_len;
+ desc->data_ptrs[next_triplet].blkid =
+ ACC100_DMA_BLKID_IN;
+ desc->data_ptrs[next_triplet].last = 0;
+ desc->data_ptrs[next_triplet].dma_ext = 0;
+ cb_len -= part_len;
+ *seg_total_left -= part_len;
+ /* Initializing offset for next segment (mbuf) */
+ *offset = part_len;
+ next_triplet++;
+ } else {
+ rte_bbdev_log(ERR,
+ "Some data still left for processing: "
+ "data_left: %u, next_triplet: %u, next_mbuf: %p",
+ cb_len, next_triplet, m->next);
+ return -EINVAL;
+ }
+ }
+ /* Storing new mbuf as it could be changed in scatter-gather case*/
+ *input = m;
+
+ return next_triplet;
+}
+
+/* Fills descriptor with data pointers of one block type.
+ * Returns index of next triplet on success, other value if lengths of
+ * output data and processed mbuf do not match.
+ */
+static inline int
+acc100_dma_fill_blk_type_out(struct acc100_dma_req_desc *desc,
+ struct rte_mbuf *output, uint32_t out_offset,
+ uint32_t output_len, int next_triplet, int blk_id)
+{
+ desc->data_ptrs[next_triplet].address =
+ rte_pktmbuf_iova_offset(output, out_offset);
+ desc->data_ptrs[next_triplet].blen = output_len;
+ desc->data_ptrs[next_triplet].blkid = blk_id;
+ desc->data_ptrs[next_triplet].last = 0;
+ desc->data_ptrs[next_triplet].dma_ext = 0;
+ next_triplet++;
+
+ return next_triplet;
+}
+
+static inline int
+acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
+ struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
+ struct rte_mbuf *output, uint32_t *in_offset,
+ uint32_t *out_offset, uint32_t *out_length,
+ uint32_t *mbuf_total_left, uint32_t *seg_total_left)
+{
+ int next_triplet = 1; /* FCW already done */
+ uint16_t K, in_length_in_bits, in_length_in_bytes;
+ struct rte_bbdev_op_ldpc_enc *enc = &op->ldpc_enc;
+
+ desc->word0 = ACC100_DMA_DESC_TYPE;
+ desc->word1 = 0; /**< Timestamp could be disabled */
+ desc->word2 = 0;
+ desc->word3 = 0;
+ desc->numCBs = 1;
+
+ K = (enc->basegraph == 1 ? 22 : 10) * enc->z_c;
+ in_length_in_bits = K - enc->n_filler;
+ if ((enc->op_flags & RTE_BBDEV_LDPC_CRC_24A_ATTACH) ||
+ (enc->op_flags & RTE_BBDEV_LDPC_CRC_24B_ATTACH))
+ in_length_in_bits -= 24;
+ in_length_in_bytes = in_length_in_bits >> 3;
+
+ if (unlikely((*mbuf_total_left == 0) ||
+ (*mbuf_total_left < in_length_in_bytes))) {
+ rte_bbdev_log(ERR,
+ "Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+ *mbuf_total_left, in_length_in_bytes);
+ return -1;
+ }
+
+ next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
+ in_length_in_bytes,
+ seg_total_left, next_triplet);
+ if (unlikely(next_triplet < 0)) {
+ rte_bbdev_log(ERR,
+ "Mismatch between data to process and mbuf data length in bbdev_op: %p",
+ op);
+ return -1;
+ }
+ desc->data_ptrs[next_triplet - 1].last = 1;
+ desc->m2dlen = next_triplet;
+ *mbuf_total_left -= in_length_in_bytes;
+
+ /* Set output length */
+ /* Integer round up division by 8 */
+ *out_length = (enc->cb_params.e + 7) >> 3;
+
+ next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
+ *out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
+ if (unlikely(next_triplet < 0)) {
+ rte_bbdev_log(ERR,
+ "Mismatch between data to process and mbuf data length in bbdev_op: %p",
+ op);
+ return -1;
+ }
+ op->ldpc_enc.output.length += *out_length;
+ *out_offset += *out_length;
+ desc->data_ptrs[next_triplet - 1].last = 1;
+ desc->data_ptrs[next_triplet - 1].dma_ext = 0;
+ desc->d2mlen = next_triplet - desc->m2dlen;
+
+ desc->op_addr = op;
+
+ return 0;
+}
+
+static inline int
+acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
+ struct acc100_dma_req_desc *desc,
+ struct rte_mbuf **input, struct rte_mbuf *h_output,
+ uint32_t *in_offset, uint32_t *h_out_offset,
+ uint32_t *h_out_length, uint32_t *mbuf_total_left,
+ uint32_t *seg_total_left,
+ struct acc100_fcw_ld *fcw)
+{
+ struct rte_bbdev_op_ldpc_dec *dec = &op->ldpc_dec;
+ int next_triplet = 1; /* FCW already done */
+ uint32_t input_length;
+ uint16_t output_length, crc24_overlap = 0;
+ uint16_t sys_cols, K, h_p_size, h_np_size;
+ bool h_comp = check_bit(dec->op_flags,
+ RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+
+ desc->word0 = ACC100_DMA_DESC_TYPE;
+ desc->word1 = 0; /**< Timestamp could be disabled */
+ desc->word2 = 0;
+ desc->word3 = 0;
+ desc->numCBs = 1;
+
+ if (check_bit(op->ldpc_dec.op_flags,
+ RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP))
+ crc24_overlap = 24;
+
+ /* Compute some LDPC BG lengths */
+ input_length = dec->cb_params.e;
+ if (check_bit(op->ldpc_dec.op_flags,
+ RTE_BBDEV_LDPC_LLR_COMPRESSION))
+ input_length = (input_length * 3 + 3) / 4;
+ sys_cols = (dec->basegraph == 1) ? 22 : 10;
+ K = sys_cols * dec->z_c;
+ output_length = K - dec->n_filler - crc24_overlap;
+
+ if (unlikely((*mbuf_total_left == 0) ||
+ (*mbuf_total_left < input_length))) {
+ rte_bbdev_log(ERR,
+ "Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+ *mbuf_total_left, input_length);
+ return -1;
+ }
+
+ next_triplet = acc100_dma_fill_blk_type_in(desc, input,
+ in_offset, input_length,
+ seg_total_left, next_triplet);
+
+ if (unlikely(next_triplet < 0)) {
+ rte_bbdev_log(ERR,
+ "Mismatch between data to process and mbuf data length in bbdev_op: %p",
+ op);
+ return -1;
+ }
+
+ if (check_bit(op->ldpc_dec.op_flags,
+ RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
+ h_p_size = fcw->hcin_size0 + fcw->hcin_size1;
+ if (h_comp)
+ h_p_size = (h_p_size * 3 + 3) / 4;
+ desc->data_ptrs[next_triplet].address =
+ dec->harq_combined_input.offset;
+ desc->data_ptrs[next_triplet].blen = h_p_size;
+ desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN_HARQ;
+ desc->data_ptrs[next_triplet].dma_ext = 1;
+#ifndef ACC100_EXT_MEM
+ acc100_dma_fill_blk_type_out(
+ desc,
+ op->ldpc_dec.harq_combined_input.data,
+ op->ldpc_dec.harq_combined_input.offset,
+ h_p_size,
+ next_triplet,
+ ACC100_DMA_BLKID_IN_HARQ);
+#endif
+ next_triplet++;
+ }
+
+ desc->data_ptrs[next_triplet - 1].last = 1;
+ desc->m2dlen = next_triplet;
+ *mbuf_total_left -= input_length;
+
+ next_triplet = acc100_dma_fill_blk_type_out(desc, h_output,
+ *h_out_offset, output_length >> 3, next_triplet,
+ ACC100_DMA_BLKID_OUT_HARD);
+ if (unlikely(next_triplet < 0)) {
+ rte_bbdev_log(ERR,
+ "Mismatch between data to process and mbuf data length in bbdev_op: %p",
+ op);
+ return -1;
+ }
+
+ if (check_bit(op->ldpc_dec.op_flags,
+ RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
+ /* Pruned size of the HARQ */
+ h_p_size = fcw->hcout_size0 + fcw->hcout_size1;
+ /* Non-Pruned size of the HARQ */
+ h_np_size = fcw->hcout_offset > 0 ?
+ fcw->hcout_offset + fcw->hcout_size1 :
+ h_p_size;
+ if (h_comp) {
+ h_np_size = (h_np_size * 3 + 3) / 4;
+ h_p_size = (h_p_size * 3 + 3) / 4;
+ }
+ dec->harq_combined_output.length = h_np_size;
+ desc->data_ptrs[next_triplet].address =
+ dec->harq_combined_output.offset;
+ desc->data_ptrs[next_triplet].blen = h_p_size;
+ desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_OUT_HARQ;
+ desc->data_ptrs[next_triplet].dma_ext = 1;
+#ifndef ACC100_EXT_MEM
+ acc100_dma_fill_blk_type_out(
+ desc,
+ dec->harq_combined_output.data,
+ dec->harq_combined_output.offset,
+ h_p_size,
+ next_triplet,
+ ACC100_DMA_BLKID_OUT_HARQ);
+#endif
+ next_triplet++;
+ }
+
+ *h_out_length = output_length >> 3;
+ dec->hard_output.length += *h_out_length;
+ *h_out_offset += *h_out_length;
+ desc->data_ptrs[next_triplet - 1].last = 1;
+ desc->d2mlen = next_triplet - desc->m2dlen;
+
+ desc->op_addr = op;
+
+ return 0;
+}
+
+static inline void
+acc100_dma_desc_ld_update(struct rte_bbdev_dec_op *op,
+ struct acc100_dma_req_desc *desc,
+ struct rte_mbuf *input, struct rte_mbuf *h_output,
+ uint32_t *in_offset, uint32_t *h_out_offset,
+ uint32_t *h_out_length,
+ union acc100_harq_layout_data *harq_layout)
+{
+ int next_triplet = 1; /* FCW already done */
+ desc->data_ptrs[next_triplet].address =
+ rte_pktmbuf_iova_offset(input, *in_offset);
+ next_triplet++;
+
+ if (check_bit(op->ldpc_dec.op_flags,
+ RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
+ struct rte_bbdev_op_data hi = op->ldpc_dec.harq_combined_input;
+ desc->data_ptrs[next_triplet].address = hi.offset;
+#ifndef ACC100_EXT_MEM
+ desc->data_ptrs[next_triplet].address =
+ rte_pktmbuf_iova_offset(hi.data, hi.offset);
+#endif
+ next_triplet++;
+ }
+
+ desc->data_ptrs[next_triplet].address =
+ rte_pktmbuf_iova_offset(h_output, *h_out_offset);
+ *h_out_length = desc->data_ptrs[next_triplet].blen;
+ next_triplet++;
+
+ if (check_bit(op->ldpc_dec.op_flags,
+ RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
+ desc->data_ptrs[next_triplet].address =
+ op->ldpc_dec.harq_combined_output.offset;
+ /* Adjust based on previous operation */
+ struct rte_bbdev_dec_op *prev_op = desc->op_addr;
+ op->ldpc_dec.harq_combined_output.length =
+ prev_op->ldpc_dec.harq_combined_output.length;
+ int16_t hq_idx = op->ldpc_dec.harq_combined_output.offset /
+ ACC100_HARQ_OFFSET;
+ int16_t prev_hq_idx =
+ prev_op->ldpc_dec.harq_combined_output.offset
+ / ACC100_HARQ_OFFSET;
+ harq_layout[hq_idx].val = harq_layout[prev_hq_idx].val;
+#ifndef ACC100_EXT_MEM
+ struct rte_bbdev_op_data ho =
+ op->ldpc_dec.harq_combined_output;
+ desc->data_ptrs[next_triplet].address =
+ rte_pktmbuf_iova_offset(ho.data, ho.offset);
+#endif
+ next_triplet++;
+ }
+
+ op->ldpc_dec.hard_output.length += *h_out_length;
+ desc->op_addr = op;
+}
+
+
+/* Enqueue a number of operations to HW and update software rings */
+static inline void
+acc100_dma_enqueue(struct acc100_queue *q, uint16_t n,
+ struct rte_bbdev_stats *queue_stats)
+{
+ union acc100_enqueue_reg_fmt enq_req;
+#ifdef RTE_BBDEV_OFFLOAD_COST
+ uint64_t start_time = 0;
+ queue_stats->acc_offload_cycles = 0;
+ RTE_SET_USED(queue_stats);
+#else
+ RTE_SET_USED(queue_stats);
+#endif
+
+ enq_req.val = 0;
+ /* Setting offset, 100b for 256 DMA Desc */
+ enq_req.addr_offset = ACC100_DESC_OFFSET;
+
+ /* Split ops into batches */
+ do {
+ union acc100_dma_desc *desc;
+ uint16_t enq_batch_size;
+ uint64_t offset;
+ rte_iova_t req_elem_addr;
+
+ enq_batch_size = RTE_MIN(n, MAX_ENQ_BATCH_SIZE);
+
+ /* Set flag on last descriptor in a batch */
+ desc = q->ring_addr + ((q->sw_ring_head + enq_batch_size - 1) &
+ q->sw_ring_wrap_mask);
+ desc->req.last_desc_in_batch = 1;
+
+ /* Calculate the 1st descriptor's address */
+ offset = ((q->sw_ring_head & q->sw_ring_wrap_mask) *
+ sizeof(union acc100_dma_desc));
+ req_elem_addr = q->ring_addr_phys + offset;
+
+ /* Fill enqueue struct */
+ enq_req.num_elem = enq_batch_size;
+ /* low 6 bits are not needed */
+ enq_req.req_elem_addr = (uint32_t)(req_elem_addr >> 6);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+ rte_memdump(stderr, "Req sdone", desc, sizeof(*desc));
+#endif
+ rte_bbdev_log_debug(
+ "Enqueue %u reqs (phys %#"PRIx64") to reg %p",
+ enq_batch_size,
+ req_elem_addr,
+ (void *)q->mmio_reg_enqueue);
+
+ rte_wmb();
+
+#ifdef RTE_BBDEV_OFFLOAD_COST
+ /* Start time measurement for enqueue function offload. */
+ start_time = rte_rdtsc_precise();
+#endif
+ rte_bbdev_log(DEBUG, "Debug : MMIO Enqueue");
+ mmio_write(q->mmio_reg_enqueue, enq_req.val);
+
+#ifdef RTE_BBDEV_OFFLOAD_COST
+ queue_stats->acc_offload_cycles +=
+ rte_rdtsc_precise() - start_time;
+#endif
+
+ q->aq_enqueued++;
+ q->sw_ring_head += enq_batch_size;
+ n -= enq_batch_size;
+
+ } while (n);
+
+}
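+/*
+ * Illustration (assuming MAX_ENQ_BATCH_SIZE = 255): enqueueing n = 300
+ * ops issues two MMIO writes, one batch of 255 and one of 45, each
+ * pointing the HW at the first descriptor of its batch, with the last
+ * descriptor of each batch flagged.
+ */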
+
+/* Enqueue a batch of encode operations for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
+ uint16_t total_enqueued_cbs, int16_t num)
+{
+ union acc100_dma_desc *desc = NULL;
+ uint32_t out_length;
+ struct rte_mbuf *output_head, *output;
+ int i, next_triplet;
+ uint16_t in_length_in_bytes;
+ struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
+
+ uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+ & q->sw_ring_wrap_mask);
+ desc = q->ring_addr + desc_idx;
+ acc100_fcw_le_fill(ops[0], &desc->req.fcw_le, num);
+
+ /* This could be done at polling time */
+ desc->req.word0 = ACC100_DMA_DESC_TYPE;
+ desc->req.word1 = 0; /* Timestamp could be disabled */
+ desc->req.word2 = 0;
+ desc->req.word3 = 0;
+ desc->req.numCBs = num;
+
+ in_length_in_bytes = ops[0]->ldpc_enc.input.data->data_len;
+ out_length = (enc->cb_params.e + 7) >> 3;
+ desc->req.m2dlen = 1 + num;
+ desc->req.d2mlen = num;
+ next_triplet = 1;
+
+ for (i = 0; i < num; i++) {
+ desc->req.data_ptrs[next_triplet].address =
+ rte_pktmbuf_iova_offset(ops[i]->ldpc_enc.input.data, 0);
+ desc->req.data_ptrs[next_triplet].blen = in_length_in_bytes;
+ next_triplet++;
+ desc->req.data_ptrs[next_triplet].address =
+ rte_pktmbuf_iova_offset(
+ ops[i]->ldpc_enc.output.data, 0);
+ desc->req.data_ptrs[next_triplet].blen = out_length;
+ next_triplet++;
+ ops[i]->ldpc_enc.output.length = out_length;
+ output_head = output = ops[i]->ldpc_enc.output.data;
+ mbuf_append(output_head, output, out_length);
+ output->data_len = out_length;
+ }
+
+ desc->req.op_addr = ops[0];
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+ rte_memdump(stderr, "FCW", &desc->req.fcw_le,
+ sizeof(desc->req.fcw_le) - 8);
+ rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+ /* Multiple CBs (num ops muxed into one descriptor) were successfully prepared to enqueue */
+ return num;
+}
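
Note on the muxed descriptor layout above: m2dlen counts the FCW pointer plus one input pointer per CB (1 + num), while d2mlen counts one output pointer per CB (num), so a single descriptor carries all of the muxed code blocks.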
+
+/* Enqueue one encode operation for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+ uint16_t total_enqueued_cbs)
+{
+ union acc100_dma_desc *desc = NULL;
+ int ret;
+ uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+ seg_total_left;
+ struct rte_mbuf *input, *output_head, *output;
+
+ uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+ & q->sw_ring_wrap_mask);
+ desc = q->ring_addr + desc_idx;
+ acc100_fcw_le_fill(op, &desc->req.fcw_le, 1);
+
+ input = op->ldpc_enc.input.data;
+ output_head = output = op->ldpc_enc.output.data;
+ in_offset = op->ldpc_enc.input.offset;
+ out_offset = op->ldpc_enc.output.offset;
+ out_length = 0;
+ mbuf_total_left = op->ldpc_enc.input.length;
+ seg_total_left = rte_pktmbuf_data_len(op->ldpc_enc.input.data)
+ - in_offset;
+
+ ret = acc100_dma_desc_le_fill(op, &desc->req, &input, output,
+ &in_offset, &out_offset, &out_length, &mbuf_total_left,
+ &seg_total_left);
+
+ if (unlikely(ret < 0))
+ return ret;
+
+ mbuf_append(output_head, output, out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+ rte_memdump(stderr, "FCW", &desc->req.fcw_le,
+ sizeof(desc->req.fcw_le) - 8);
+ rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+
+ /* Check if any data left after processing one CB */
+ if (mbuf_total_left != 0) {
+ rte_bbdev_log(ERR,
+ "Some date still left after processing one CB: mbuf_total_left = %u",
+ mbuf_total_left);
+ return -EINVAL;
+ }
+#endif
+ /* One CB (one op) was successfully prepared to enqueue */
+ return 1;
+}
+
+/* Enqueue one decode operation for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+ uint16_t total_enqueued_cbs, bool same_op)
+{
+ int ret;
+
+ union acc100_dma_desc *desc;
+ uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+ & q->sw_ring_wrap_mask);
+ desc = q->ring_addr + desc_idx;
+ struct rte_mbuf *input, *h_output_head, *h_output;
+ uint32_t in_offset, h_out_offset, h_out_length, mbuf_total_left;
+ input = op->ldpc_dec.input.data;
+ h_output_head = h_output = op->ldpc_dec.hard_output.data;
+ in_offset = op->ldpc_dec.input.offset;
+ h_out_offset = op->ldpc_dec.hard_output.offset;
+ mbuf_total_left = op->ldpc_dec.input.length;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+ if (unlikely(input == NULL)) {
+ rte_bbdev_log(ERR, "Invalid mbuf pointer");
+ return -EFAULT;
+ }
+#endif
+ union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+
+ if (same_op) {
+ union acc100_dma_desc *prev_desc;
+ desc_idx = ((q->sw_ring_head + total_enqueued_cbs - 1)
+ & q->sw_ring_wrap_mask);
+ prev_desc = q->ring_addr + desc_idx;
+ uint8_t *prev_ptr = (uint8_t *) prev_desc;
+ uint8_t *new_ptr = (uint8_t *) desc;
+ /* Copy first 4 words and BDESCs */
+ rte_memcpy(new_ptr, prev_ptr, 16);
+ rte_memcpy(new_ptr + 36, prev_ptr + 36, 40);
+ desc->req.op_addr = prev_desc->req.op_addr;
+ /* Copy FCW */
+ rte_memcpy(new_ptr + ACC100_DESC_FCW_OFFSET,
+ prev_ptr + ACC100_DESC_FCW_OFFSET,
+ ACC100_FCW_LD_BLEN);
+ acc100_dma_desc_ld_update(op, &desc->req, input, h_output,
+ &in_offset, &h_out_offset,
+ &h_out_length, harq_layout);
+ } else {
+ struct acc100_fcw_ld *fcw;
+ uint32_t seg_total_left;
+ fcw = &desc->req.fcw_ld;
+ acc100_fcw_ld_fill(op, fcw, harq_layout);
+
+ /* Special handling when overusing mbuf */
+ if (fcw->rm_e < MAX_E_MBUF)
+ seg_total_left = rte_pktmbuf_data_len(input)
+ - in_offset;
+ else
+ seg_total_left = fcw->rm_e;
+
+ ret = acc100_dma_desc_ld_fill(op, &desc->req, &input, h_output,
+ &in_offset, &h_out_offset,
+ &h_out_length, &mbuf_total_left,
+ &seg_total_left, fcw);
+ if (unlikely(ret < 0))
+ return ret;
+ }
+
+ /* Hard output */
+ mbuf_append(h_output_head, h_output, h_out_length);
+#ifndef ACC100_EXT_MEM
+ if (op->ldpc_dec.harq_combined_output.length > 0) {
+ /* Push the HARQ output into host memory */
+ struct rte_mbuf *hq_output_head, *hq_output;
+ hq_output_head = op->ldpc_dec.harq_combined_output.data;
+ hq_output = op->ldpc_dec.harq_combined_output.data;
+ mbuf_append(hq_output_head, hq_output,
+ op->ldpc_dec.harq_combined_output.length);
+ }
+#endif
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+ rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
+ sizeof(desc->req.fcw_ld) - 8);
+ rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+ /* One CB (one op) was successfully prepared to enqueue */
+ return 1;
+}
+
+/* Enqueue one decode operation for ACC100 device in TB mode */
+static inline int
+enqueue_ldpc_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+ uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
+{
+ union acc100_dma_desc *desc = NULL;
+ int ret;
+ uint8_t r, c;
+ uint32_t in_offset, h_out_offset,
+ h_out_length, mbuf_total_left, seg_total_left;
+ struct rte_mbuf *input, *h_output_head, *h_output;
+ uint16_t current_enqueued_cbs = 0;
+
+ uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+ & q->sw_ring_wrap_mask);
+ desc = q->ring_addr + desc_idx;
+ uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+ union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+ acc100_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout);
+
+ input = op->ldpc_dec.input.data;
+ h_output_head = h_output = op->ldpc_dec.hard_output.data;
+ in_offset = op->ldpc_dec.input.offset;
+ h_out_offset = op->ldpc_dec.hard_output.offset;
+ h_out_length = 0;
+ mbuf_total_left = op->ldpc_dec.input.length;
+ c = op->ldpc_dec.tb_params.c;
+ r = op->ldpc_dec.tb_params.r;
+
+ while (mbuf_total_left > 0 && r < c) {
+
+ seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+
+ /* Set up DMA descriptor */
+ desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+ & q->sw_ring_wrap_mask);
+ desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+ desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
+ ret = acc100_dma_desc_ld_fill(op, &desc->req, &input,
+ h_output, &in_offset, &h_out_offset,
+ &h_out_length,
+ &mbuf_total_left, &seg_total_left,
+ &desc->req.fcw_ld);
+
+ if (unlikely(ret < 0))
+ return ret;
+
+ /* Hard output */
+ mbuf_append(h_output_head, h_output, h_out_length);
+
+ /* Set total number of CBs in TB */
+ desc->req.cbs_in_tb = cbs_in_tb;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+ rte_memdump(stderr, "FCW", &desc->req.fcw_td,
+ sizeof(desc->req.fcw_td) - 8);
+ rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+ if (seg_total_left == 0) {
+ /* Go to the next mbuf */
+ input = input->next;
+ in_offset = 0;
+ h_output = h_output->next;
+ h_out_offset = 0;
+ }
+ total_enqueued_cbs++;
+ current_enqueued_cbs++;
+ r++;
+ }
+
+ if (unlikely(desc == NULL))
+ return current_enqueued_cbs;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+ /* Check if any CBs left for processing */
+ if (mbuf_total_left != 0) {
+ rte_bbdev_log(ERR,
+ "Some date still left for processing: mbuf_total_left = %u",
+ mbuf_total_left);
+ return -EINVAL;
+ }
+#endif
+ /* Set SDone on last CB descriptor for TB mode */
+ desc->req.sdone_enable = 1;
+ desc->req.irq_enable = q->irq_enable;
+
+ return current_enqueued_cbs;
+}
+
+/* Calculates number of CBs in processed encoder TB based on 'r' and input
+ * length.
+ */
+static inline uint8_t
+get_num_cbs_in_tb_enc(struct rte_bbdev_op_turbo_enc *turbo_enc)
+{
+ uint8_t c, c_neg, r, crc24_bits = 0;
+ uint16_t k, k_neg, k_pos;
+ uint8_t cbs_in_tb = 0;
+ int32_t length;
+
+ length = turbo_enc->input.length;
+ r = turbo_enc->tb_params.r;
+ c = turbo_enc->tb_params.c;
+ c_neg = turbo_enc->tb_params.c_neg;
+ k_neg = turbo_enc->tb_params.k_neg;
+ k_pos = turbo_enc->tb_params.k_pos;
+ crc24_bits = 0;
+ if (check_bit(turbo_enc->op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH))
+ crc24_bits = 24;
+ while (length > 0 && r < c) {
+ k = (r < c_neg) ? k_neg : k_pos;
+ length -= (k - crc24_bits) >> 3;
+ r++;
+ cbs_in_tb++;
+ }
+
+ return cbs_in_tb;
+}
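
As a worked example with hypothetical values: c = 2, c_neg = 0, k_pos = 6144 and RTE_BBDEV_TURBO_CRC_24B_ATTACH set gives (6144 - 24) >> 3 = 765 bytes per CB, so an input length of 1530 bytes yields cbs_in_tb = 2.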
+
+/* Calculates number of CBs in processed decoder TB based on 'r' and input
+ * length.
+ */
+static inline uint16_t
+get_num_cbs_in_tb_dec(struct rte_bbdev_op_turbo_dec *turbo_dec)
+{
+ uint8_t c, c_neg, r = 0;
+ uint16_t kw, k, k_neg, k_pos, cbs_in_tb = 0;
+ int32_t length;
+
+ length = turbo_dec->input.length;
+ r = turbo_dec->tb_params.r;
+ c = turbo_dec->tb_params.c;
+ c_neg = turbo_dec->tb_params.c_neg;
+ k_neg = turbo_dec->tb_params.k_neg;
+ k_pos = turbo_dec->tb_params.k_pos;
+ while (length > 0 && r < c) {
+ k = (r < c_neg) ? k_neg : k_pos;
+ kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
+ length -= kw;
+ r++;
+ cbs_in_tb++;
+ }
+
+ return cbs_in_tb;
+}
+
+/* Calculates number of CBs in processed decoder TB based on 'r' and input
+ * length.
+ */
+static inline uint16_t
+get_num_cbs_in_tb_ldpc_dec(struct rte_bbdev_op_ldpc_dec *ldpc_dec)
+{
+ uint16_t r, cbs_in_tb = 0;
+ int32_t length = ldpc_dec->input.length;
+ r = ldpc_dec->tb_params.r;
+ while (length > 0 && r < ldpc_dec->tb_params.c) {
+ length -= (r < ldpc_dec->tb_params.cab) ?
+ ldpc_dec->tb_params.ea :
+ ldpc_dec->tb_params.eb;
+ r++;
+ cbs_in_tb++;
+ }
+ return cbs_in_tb;
+}
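
For example (again hypothetical values): with tb_params.c = 3, cab = 1, ea = 3000 and eb = 2800, the first CB consumes ea bytes and the remaining two consume eb each, so an input length of 3000 + 2 * 2800 = 8600 bytes yields cbs_in_tb = 3.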
+
+/* Check we can mux encode operations with common FCW */
+static inline bool
+check_mux(struct rte_bbdev_enc_op **ops, uint16_t num) {
+ uint16_t i;
+ if (num == 1)
+ return false;
+ for (i = 1; i < num; ++i) {
+ /* Only mux compatible code blocks */
+ if (memcmp((uint8_t *)(&ops[i]->ldpc_enc) + ENC_OFFSET,
+ (uint8_t *)(&ops[0]->ldpc_enc) + ENC_OFFSET,
+ CMP_ENC_SIZE) != 0)
+ return false;
+ }
+ return true;
+}
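
The ENC_OFFSET / CMP_ENC_SIZE window is presumably chosen to skip the per-op mbuf fields of rte_bbdev_op_ldpc_enc and compare only the parameters that feed the FCW, so operations that differ only in their data buffers can share one descriptor.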
+
+/* Enqueue encode operations for ACC100 device in CB mode. */
+static inline uint16_t
+acc100_enqueue_ldpc_enc_cb(struct rte_bbdev_queue_data *q_data,
+ struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+ struct acc100_queue *q = q_data->queue_private;
+ int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+ uint16_t i = 0;
+ union acc100_dma_desc *desc;
+ int ret, desc_idx = 0;
+ int16_t enq, left = num;
+
+ while (left > 0) {
+ if (unlikely(avail - 1 < 0))
+ break;
+ avail--;
+ enq = RTE_MIN(left, MUX_5GDL_DESC);
+ if (check_mux(&ops[i], enq)) {
+ ret = enqueue_ldpc_enc_n_op_cb(q, &ops[i],
+ desc_idx, enq);
+ if (ret < 0)
+ break;
+ i += enq;
+ } else {
+ ret = enqueue_ldpc_enc_one_op_cb(q, ops[i], desc_idx);
+ if (ret < 0)
+ break;
+ i++;
+ }
+ desc_idx++;
+ left = num - i;
+ }
+
+ if (unlikely(i == 0))
+ return 0; /* Nothing to enqueue */
+
+ /* Set SDone in last CB in enqueued ops for CB mode */
+ desc = q->ring_addr + ((q->sw_ring_head + desc_idx - 1)
+ & q->sw_ring_wrap_mask);
+ desc->req.sdone_enable = 1;
+ desc->req.irq_enable = q->irq_enable;
+
+ acc100_dma_enqueue(q, desc_idx, &q_data->queue_stats);
+
+ /* Update stats */
+ q_data->queue_stats.enqueued_count += i;
+ q_data->queue_stats.enqueue_err_count += num - i;
+
+ return i;
+}
+
+/* Enqueue encode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
+ struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+ if (unlikely(num == 0))
+ return 0;
+ return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
+}
+
+/* Check we can mux decode operations with common FCW */
+static inline bool
+cmp_ldpc_dec_op(struct rte_bbdev_dec_op **ops) {
+ /* Only mux compatible code blocks */
+ return memcmp((uint8_t *)(&ops[0]->ldpc_dec) + DEC_OFFSET,
+ (uint8_t *)(&ops[1]->ldpc_dec) + DEC_OFFSET,
+ CMP_DEC_SIZE) == 0;
+}
+
+/* Enqueue decode operations for ACC100 device in TB mode */
+static uint16_t
+acc100_enqueue_ldpc_dec_tb(struct rte_bbdev_queue_data *q_data,
+ struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+ struct acc100_queue *q = q_data->queue_private;
+ int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+ uint16_t i, enqueued_cbs = 0;
+ uint8_t cbs_in_tb;
+ int ret;
+
+ for (i = 0; i < num; ++i) {
+ cbs_in_tb = get_num_cbs_in_tb_ldpc_dec(&ops[i]->ldpc_dec);
+ /* Check if there is space available for further processing */
+ if (unlikely(avail - cbs_in_tb < 0))
+ break;
+ avail -= cbs_in_tb;
+
+ ret = enqueue_ldpc_dec_one_op_tb(q, ops[i],
+ enqueued_cbs, cbs_in_tb);
+ if (ret < 0)
+ break;
+ enqueued_cbs += ret;
+ }
+
+ acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
+
+ /* Update stats */
+ q_data->queue_stats.enqueued_count += i;
+ q_data->queue_stats.enqueue_err_count += num - i;
+ return i;
+}
+
+/* Enqueue decode operations for ACC100 device in CB mode */
+static uint16_t
+acc100_enqueue_ldpc_dec_cb(struct rte_bbdev_queue_data *q_data,
+ struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+ struct acc100_queue *q = q_data->queue_private;
+ int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+ uint16_t i;
+ union acc100_dma_desc *desc;
+ int ret;
+ bool same_op = false;
+ for (i = 0; i < num; ++i) {
+ /* Check if there is space available for further processing */
+ if (unlikely(avail - 1 < 0))
+ break;
+ avail -= 1;
+
+ if (i > 0)
+ same_op = cmp_ldpc_dec_op(&ops[i-1]);
+ rte_bbdev_log_debug("Op %d %d %d %d %d %d %d %d %d %d %d %d",
+ i, ops[i]->ldpc_dec.op_flags, ops[i]->ldpc_dec.rv_index,
+ ops[i]->ldpc_dec.iter_max, ops[i]->ldpc_dec.iter_count,
+ ops[i]->ldpc_dec.basegraph, ops[i]->ldpc_dec.z_c,
+ ops[i]->ldpc_dec.n_cb, ops[i]->ldpc_dec.q_m,
+ ops[i]->ldpc_dec.n_filler, ops[i]->ldpc_dec.cb_params.e,
+ same_op);
+ ret = enqueue_ldpc_dec_one_op_cb(q, ops[i], i, same_op);
+ if (ret < 0)
+ break;
+ }
+
+ if (unlikely(i == 0))
+ return 0; /* Nothing to enqueue */
+
+ /* Set SDone in last CB in enqueued ops for CB mode */
+ desc = q->ring_addr + ((q->sw_ring_head + i - 1)
+ & q->sw_ring_wrap_mask);
+
+ desc->req.sdone_enable = 1;
+ desc->req.irq_enable = q->irq_enable;
+
+ acc100_dma_enqueue(q, i, &q_data->queue_stats);
+
+ /* Update stats */
+ q_data->queue_stats.enqueued_count += i;
+ q_data->queue_stats.enqueue_err_count += num - i;
+ return i;
+}
+
+/* Enqueue decode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
+ struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+ struct acc100_queue *q = q_data->queue_private;
+ int32_t aq_avail = q->aq_depth +
+ (q->aq_dequeued - q->aq_enqueued) / 128;
+
+ if (unlikely((aq_avail == 0) || (num == 0)))
+ return 0;
+
+ if (ops[0]->ldpc_dec.code_block_mode == 0) /* TB mode */
+ return acc100_enqueue_ldpc_dec_tb(q_data, ops, num);
+ else
+ return acc100_enqueue_ldpc_dec_cb(q_data, ops, num);
+}
+
+/* Dequeue one encode operation (possibly several muxed CBs) from ACC100 device in CB mode */
+static inline int
+dequeue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
+ uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
+{
+ union acc100_dma_desc *desc, atom_desc;
+ union acc100_dma_rsp_desc rsp;
+ struct rte_bbdev_enc_op *op;
+ int i;
+
+ desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
+ & q->sw_ring_wrap_mask);
+ atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+ __ATOMIC_RELAXED);
+
+ /* Check fdone bit */
+ if (!(atom_desc.rsp.val & ACC100_FDONE))
+ return -1;
+
+ rsp.val = atom_desc.rsp.val;
+ rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
+
+ /* Dequeue */
+ op = desc->req.op_addr;
+
+ /* Clearing status, it will be set based on response */
+ op->status = 0;
+
+ op->status |= ((rsp.input_err)
+ ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+ op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+ op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+ if (desc->req.last_desc_in_batch) {
+ (*aq_dequeued)++;
+ desc->req.last_desc_in_batch = 0;
+ }
+ desc->rsp.val = ACC100_DMA_DESC_TYPE;
+ desc->rsp.add_info_0 = 0; /*Reserved bits */
+ desc->rsp.add_info_1 = 0; /*Reserved bits */
+
+ /* Flag that the muxing causes loss of opaque data */
+ op->opaque_data = (void *)-1;
+ for (i = 0 ; i < desc->req.numCBs; i++)
+ ref_op[i] = op;
+
+ /* One descriptor (with one or more muxed CBs) was successfully dequeued */
+ return desc->req.numCBs;
+}
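
The rsp-to-status mapping above is repeated verbatim in each dequeue path; a hypothetical helper (the name and the factoring are illustrative, not part of this patch) could condense it:

static inline void
acc100_rsp_to_op_status(union acc100_dma_rsp_desc rsp, int *status)
{
	/* Map HW response flags onto bbdev status bits */
	*status |= rsp.input_err ? (1 << RTE_BBDEV_DATA_ERROR) : 0;
	*status |= rsp.dma_err ? (1 << RTE_BBDEV_DRV_ERROR) : 0;
	*status |= rsp.fcw_err ? (1 << RTE_BBDEV_DRV_ERROR) : 0;
}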
+
+/* Dequeue one encode operation from ACC100 device in TB mode */
+static inline int
+dequeue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
+ uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
+{
+ union acc100_dma_desc *desc, *last_desc, atom_desc;
+ union acc100_dma_rsp_desc rsp;
+ struct rte_bbdev_enc_op *op;
+ uint8_t i = 0;
+ uint16_t current_dequeued_cbs = 0, cbs_in_tb;
+
+ desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
+ & q->sw_ring_wrap_mask);
+ atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+ __ATOMIC_RELAXED);
+
+ /* Check fdone bit */
+ if (!(atom_desc.rsp.val & ACC100_FDONE))
+ return -1;
+
+ /* Get number of CBs in dequeued TB */
+ cbs_in_tb = desc->req.cbs_in_tb;
+ /* Get last CB */
+ last_desc = q->ring_addr + ((q->sw_ring_tail
+ + total_dequeued_cbs + cbs_in_tb - 1)
+ & q->sw_ring_wrap_mask);
+ /* Check if last CB in TB is ready to dequeue (and thus
+ * the whole TB) - checking sdone bit. If not return.
+ */
+ atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
+ __ATOMIC_RELAXED);
+ if (!(atom_desc.rsp.val & ACC100_SDONE))
+ return -1;
+
+ /* Dequeue */
+ op = desc->req.op_addr;
+
+ /* Clearing status, it will be set based on response */
+ op->status = 0;
+
+ while (i < cbs_in_tb) {
+ desc = q->ring_addr + ((q->sw_ring_tail
+ + total_dequeued_cbs)
+ & q->sw_ring_wrap_mask);
+ atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+ __ATOMIC_RELAXED);
+ rsp.val = atom_desc.rsp.val;
+ rte_bbdev_log_debug("Resp. desc %p: %x", desc,
+ rsp.val);
+
+ op->status |= ((rsp.input_err)
+ ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+ op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+ op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+ if (desc->req.last_desc_in_batch) {
+ (*aq_dequeued)++;
+ desc->req.last_desc_in_batch = 0;
+ }
+ desc->rsp.val = ACC100_DMA_DESC_TYPE;
+ desc->rsp.add_info_0 = 0;
+ desc->rsp.add_info_1 = 0;
+ total_dequeued_cbs++;
+ current_dequeued_cbs++;
+ i++;
+ }
+
+ *ref_op = op;
+
+ return current_dequeued_cbs;
+}
+
+/* Dequeue one decode operation from ACC100 device in CB mode */
+static inline int
+dequeue_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
+ struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+ uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+ union acc100_dma_desc *desc, atom_desc;
+ union acc100_dma_rsp_desc rsp;
+ struct rte_bbdev_dec_op *op;
+
+ desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+ & q->sw_ring_wrap_mask);
+ atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+ __ATOMIC_RELAXED);
+
+ /* Check fdone bit */
+ if (!(atom_desc.rsp.val & ACC100_FDONE))
+ return -1;
+
+ rsp.val = atom_desc.rsp.val;
+ rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
+
+ /* Dequeue */
+ op = desc->req.op_addr;
+
+ /* Clearing status, it will be set based on response */
+ op->status = 0;
+ op->status |= ((rsp.input_err)
+ ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+ op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+ op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+ if (op->status != 0)
+ q_data->queue_stats.dequeue_err_count++;
+
+ /* CRC invalid if error exists */
+ if (!op->status)
+ op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+ op->turbo_dec.iter_count = (uint8_t) rsp.iter_cnt / 2;
+ /* Check if this is the last desc in batch (Atomic Queue) */
+ if (desc->req.last_desc_in_batch) {
+ (*aq_dequeued)++;
+ desc->req.last_desc_in_batch = 0;
+ }
+ desc->rsp.val = ACC100_DMA_DESC_TYPE;
+ desc->rsp.add_info_0 = 0;
+ desc->rsp.add_info_1 = 0;
+ *ref_op = op;
+
+ /* One CB (op) was successfully dequeued */
+ return 1;
+}
+
+/* Dequeue one decode operation from ACC100 device in CB mode */
+static inline int
+dequeue_ldpc_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
+ struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+ uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+ union acc100_dma_desc *desc, atom_desc;
+ union acc100_dma_rsp_desc rsp;
+ struct rte_bbdev_dec_op *op;
+
+ desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+ & q->sw_ring_wrap_mask);
+ atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+ __ATOMIC_RELAXED);
+
+ /* Check fdone bit */
+ if (!(atom_desc.rsp.val & ACC100_FDONE))
+ return -1;
+
+ rsp.val = atom_desc.rsp.val;
+
+ /* Dequeue */
+ op = desc->req.op_addr;
+
+ /* Clearing status, it will be set based on response */
+ op->status = 0;
+ op->status |= rsp.input_err << RTE_BBDEV_DATA_ERROR;
+ op->status |= rsp.dma_err << RTE_BBDEV_DRV_ERROR;
+ op->status |= rsp.fcw_err << RTE_BBDEV_DRV_ERROR;
+ if (op->status != 0)
+ q_data->queue_stats.dequeue_err_count++;
+
+ op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+ if (op->ldpc_dec.hard_output.length > 0 && !rsp.synd_ok)
+ op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
+ op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
+
+ /* Check if this is the last desc in batch (Atomic Queue) */
+ if (desc->req.last_desc_in_batch) {
+ (*aq_dequeued)++;
+ desc->req.last_desc_in_batch = 0;
+ }
+
+ desc->rsp.val = ACC100_DMA_DESC_TYPE;
+ desc->rsp.add_info_0 = 0;
+ desc->rsp.add_info_1 = 0;
+
+ *ref_op = op;
+
+ /* One CB (op) was successfully dequeued */
+ return 1;
+}
+
+/* Dequeue one decode operation from ACC100 device in TB mode. */
+static inline int
+dequeue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+ uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+ union acc100_dma_desc *desc, *last_desc, atom_desc;
+ union acc100_dma_rsp_desc rsp;
+ struct rte_bbdev_dec_op *op;
+ uint8_t cbs_in_tb = 1, cb_idx = 0;
+
+ desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+ & q->sw_ring_wrap_mask);
+ atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+ __ATOMIC_RELAXED);
+
+ /* Check fdone bit */
+ if (!(atom_desc.rsp.val & ACC100_FDONE))
+ return -1;
+
+ /* Dequeue */
+ op = desc->req.op_addr;
+
+ /* Get number of CBs in dequeued TB */
+ cbs_in_tb = desc->req.cbs_in_tb;
+ /* Get last CB */
+ last_desc = q->ring_addr + ((q->sw_ring_tail
+ + dequeued_cbs + cbs_in_tb - 1)
+ & q->sw_ring_wrap_mask);
+ /* Check if last CB in TB is ready to dequeue (and thus
+ * the whole TB) - checking sdone bit. If not return.
+ */
+ atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
+ __ATOMIC_RELAXED);
+ if (!(atom_desc.rsp.val & ACC100_SDONE))
+ return -1;
+
+ /* Clearing status, it will be set based on response */
+ op->status = 0;
+
+ /* Read remaining CBs if any */
+ while (cb_idx < cbs_in_tb) {
+ desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+ & q->sw_ring_wrap_mask);
+ atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+ __ATOMIC_RELAXED);
+ rsp.val = atom_desc.rsp.val;
+ rte_bbdev_log_debug("Resp. desc %p: %x", desc,
+ rsp.val);
+
+ op->status |= ((rsp.input_err)
+ ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+ op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+ op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+ /* CRC invalid if error exists */
+ if (!op->status)
+ op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+ op->turbo_dec.iter_count = RTE_MAX((uint8_t) rsp.iter_cnt,
+ op->turbo_dec.iter_count);
+
+ /* Check if this is the last desc in batch (Atomic Queue) */
+ if (desc->req.last_desc_in_batch) {
+ (*aq_dequeued)++;
+ desc->req.last_desc_in_batch = 0;
+ }
+ desc->rsp.val = ACC100_DMA_DESC_TYPE;
+ desc->rsp.add_info_0 = 0;
+ desc->rsp.add_info_1 = 0;
+ dequeued_cbs++;
+ cb_idx++;
+ }
+
+ *ref_op = op;
+
+ return cb_idx;
+}
+
+/* Dequeue LDPC encode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
+ struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+ struct acc100_queue *q = q_data->queue_private;
+ uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+ uint32_t aq_dequeued = 0;
+ uint16_t dequeue_num, i, dequeued_cbs = 0, dequeued_descs = 0;
+ int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+ if (unlikely(ops == NULL || q == NULL))
+ return 0;
+#endif
+
+ dequeue_num = (avail < num) ? avail : num;
+
+ for (i = 0; i < dequeue_num; i++) {
+ ret = dequeue_enc_one_op_cb(q, &ops[dequeued_cbs],
+ dequeued_descs, &aq_dequeued);
+ if (ret < 0)
+ break;
+ dequeued_cbs += ret;
+ dequeued_descs++;
+ if (dequeued_cbs >= num)
+ break;
+ }
+
+ q->aq_dequeued += aq_dequeued;
+ q->sw_ring_tail += dequeued_descs;
+
+ /* Update dequeue stats */
+ q_data->queue_stats.dequeued_count += dequeued_cbs;
+
+ return dequeued_cbs;
+}
+
+/* Dequeue decode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
+ struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+ struct acc100_queue *q = q_data->queue_private;
+ uint16_t dequeue_num;
+ uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+ uint32_t aq_dequeued = 0;
+ uint16_t i;
+ uint16_t dequeued_cbs = 0;
+ struct rte_bbdev_dec_op *op;
+ int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+ if (unlikely(ops == NULL || q == NULL))
+ return 0;
+#endif
+
+ dequeue_num = (avail < num) ? avail : num;
+
+ for (i = 0; i < dequeue_num; ++i) {
+ op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+ & q->sw_ring_wrap_mask))->req.op_addr;
+ if (op->ldpc_dec.code_block_mode == 0)
+ ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
+ &aq_dequeued);
+ else
+ ret = dequeue_ldpc_dec_one_op_cb(
+ q_data, q, &ops[i], dequeued_cbs,
+ &aq_dequeued);
+
+ if (ret < 0)
+ break;
+ dequeued_cbs += ret;
+ }
+
+ q->aq_dequeued += aq_dequeued;
+ q->sw_ring_tail += dequeued_cbs;
+
+ /* Update dequeue stats */
+ q_data->queue_stats.dequeued_count += i;
+
+ return i;
+}
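
For context, these hooks are reached through the public bbdev API; a minimal application-side poll loop might look as follows (dev_id, queue_id, burst and the op arrays are placeholders set up elsewhere by the application):

#include <rte_bbdev.h>

static void
poll_ldpc_dec(uint16_t dev_id, uint16_t queue_id,
		struct rte_bbdev_dec_op **ops, uint16_t burst)
{
	/* ops[] previously allocated with rte_bbdev_dec_op_alloc_bulk()
	 * and filled with LDPC decode parameters; 64 is an arbitrary cap. */
	struct rte_bbdev_dec_op *ops_deq[64];
	uint16_t enq, deq = 0;

	enq = rte_bbdev_enqueue_ldpc_dec_ops(dev_id, queue_id, ops, burst);
	while (deq < enq)
		deq += rte_bbdev_dequeue_ldpc_dec_ops(dev_id, queue_id,
				&ops_deq[deq], enq - deq);
}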
+
/* Initialization Function */
static void
acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
@@ -703,6 +2321,10 @@
struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
dev->dev_ops = &acc100_bbdev_ops;
+ dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
+ dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
+ dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
+ dev->dequeue_ldpc_dec_ops = acc100_dequeue_ldpc_dec;
((struct acc100_device *) dev->data->dev_private)->pf_device =
!strcmp(drv->driver.name,
@@ -815,4 +2437,3 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
-
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 0e2b79c..78686c1 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -88,6 +88,8 @@
#define TMPL_PRI_3 0x0f0e0d0c
#define QUEUE_ENABLE 0x80000000 /* Bit to mark Queue as Enabled */
#define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
+#define ACC100_FDONE 0x80000000
+#define ACC100_SDONE 0x40000000
#define ACC100_NUM_TMPL 32
#define VF_OFFSET_QOS 16 /* offset in Memory Space specific to QoS Mon */
@@ -398,6 +400,7 @@ struct __rte_packed acc100_dma_req_desc {
union acc100_dma_desc {
struct acc100_dma_req_desc req;
union acc100_dma_rsp_desc rsp;
+ uint64_t atom_hdr;
};
--
1.8.3.1
^ permalink raw reply [flat|nested] 213+ messages in thread
* Re: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions
2020-08-20 14:38 ` Dave Burley
@ 2020-08-20 14:52 ` Chautru, Nicolas
2020-08-20 14:57 ` Dave Burley
0 siblings, 1 reply; 213+ messages in thread
From: Chautru, Nicolas @ 2020-08-20 14:52 UTC (permalink / raw)
To: Dave Burley, dev; +Cc: Richardson, Bruce
Hi Dave,
This assumes 6-bit LLR compression packing (i.e. the 2 MSBs of each 8-bit LLR are dropped, so four LLRs pack into three bytes), similar to the HARQ compression.
Let me know if this is still unclear; I can document it further if the description is not explicit enough.
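
For illustration, a hypothetical packing routine; the bit order within each byte is illustrative only, the device documentation defines the exact layout:

#include <stdint.h>

/* Drop the 2 MSBs of each 8-bit LLR and pack four 6-bit LLRs into
 * three bytes, matching the (len * 3 + 3) / 4 sizing used in
 * acc100_dma_desc_ld_fill(). */
static void pack_llr_6bit(const int8_t *in, uint8_t *out, unsigned int n)
{
	unsigned int i, o = 0;
	for (i = 0; i + 4 <= n; i += 4) {
		uint8_t a = in[i] & 0x3F, b = in[i + 1] & 0x3F;
		uint8_t c = in[i + 2] & 0x3F, d = in[i + 3] & 0x3F;
		out[o++] = (uint8_t)(a | (b << 6));
		out[o++] = (uint8_t)((b >> 2) | (c << 4));
		out[o++] = (uint8_t)((c >> 4) | (d << 2));
	}
}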
Thanks
Nic
> -----Original Message-----
> From: Dave Burley <dave.burley@accelercomm.com>
> Sent: Thursday, August 20, 2020 7:39 AM
> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org
> Cc: Richardson, Bruce <bruce.richardson@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC
> processing functions
>
> Hi Nic,
>
> As you've now specified the use of RTE_BBDEV_LDPC_LLR_COMPRESSION for
> this PMB, please could you confirm what the packed format of the LLRs in
> memory looks like?
>
> Best Regards
>
> Dave Burley
>
>
> From: dev <dev-bounces@dpdk.org> on behalf of Nicolas Chautru
> <nicolas.chautru@intel.com>
> Sent: 19 August 2020 01:25
> To: dev@dpdk.org <dev@dpdk.org>; akhil.goyal@nxp.com
> <akhil.goyal@nxp.com>
> Cc: bruce.richardson@intel.com <bruce.richardson@intel.com>; Nicolas
> Chautru <nicolas.chautru@intel.com>
> Subject: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing
> functions
>
> Adding LDPC decode and encode processing operations
>
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
> drivers/baseband/acc100/rte_acc100_pmd.c | 1625
> +++++++++++++++++++++++++++++-
> drivers/baseband/acc100/rte_acc100_pmd.h | 3 +
> 2 files changed, 1626 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> b/drivers/baseband/acc100/rte_acc100_pmd.c
> index 7a21c57..5f32813 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> @@ -15,6 +15,9 @@
> #include <rte_hexdump.h>
> #include <rte_pci.h>
> #include <rte_bus_pci.h>
> +#ifdef RTE_BBDEV_OFFLOAD_COST
> +#include <rte_cycles.h>
> +#endif
>
> #include <rte_bbdev.h>
> #include <rte_bbdev_pmd.h>
> @@ -449,7 +452,6 @@
> return 0;
> }
>
> -
> /**
> * Report a ACC100 queue index which is free
> * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
> @@ -634,6 +636,46 @@
> struct acc100_device *d = dev->data->dev_private;
>
> static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
> + {
> + .type = RTE_BBDEV_OP_LDPC_ENC,
> + .cap.ldpc_enc = {
> + .capability_flags =
> + RTE_BBDEV_LDPC_RATE_MATCH |
> + RTE_BBDEV_LDPC_CRC_24B_ATTACH |
> + RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
> + .num_buffers_src =
> + RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> + .num_buffers_dst =
> + RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> + }
> + },
> + {
> + .type = RTE_BBDEV_OP_LDPC_DEC,
> + .cap.ldpc_dec = {
> + .capability_flags =
> + RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK |
> + RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP |
> + RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
> + RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
> +#ifdef ACC100_EXT_MEM
> + RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABL
> E |
> + RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENA
> BLE |
> +#endif
> + RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE |
> + RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS |
> + RTE_BBDEV_LDPC_DECODE_BYPASS |
> + RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
> + RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
> + RTE_BBDEV_LDPC_LLR_COMPRESSION,
> + .llr_size = 8,
> + .llr_decimals = 1,
> + .num_buffers_src =
> + RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> + .num_buffers_hard_out =
> + RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> + .num_buffers_soft_out = 0,
> + }
> + },
> RTE_BBDEV_END_OF_CAPABILITIES_LIST()
> };
>
> @@ -669,9 +711,14 @@
> dev_info->cpu_flag_reqs = NULL;
> dev_info->min_alignment = 64;
> dev_info->capabilities = bbdev_capabilities;
> +#ifdef ACC100_EXT_MEM
> dev_info->harq_buffer_size = d->ddr_size;
> +#else
> + dev_info->harq_buffer_size = 0;
> +#endif
> }
>
> +
> static const struct rte_bbdev_ops acc100_bbdev_ops = {
> .setup_queues = acc100_setup_queues,
> .close = acc100_dev_close,
> @@ -696,6 +743,1577 @@
> {.device_id = 0},
> };
>
> +/* Read flag value 0/1 from bitmap */
> +static inline bool
> +check_bit(uint32_t bitmap, uint32_t bitmask)
> +{
> + return bitmap & bitmask;
> +}
> +
> +static inline char *
> +mbuf_append(struct rte_mbuf *m_head, struct rte_mbuf *m, uint16_t len)
> +{
> + if (unlikely(len > rte_pktmbuf_tailroom(m)))
> + return NULL;
> +
> + char *tail = (char *)m->buf_addr + m->data_off + m->data_len;
> + m->data_len = (uint16_t)(m->data_len + len);
> + m_head->pkt_len = (m_head->pkt_len + len);
> + return tail;
> +}
> +
> +/* Compute value of k0.
> + * Based on 3GPP 38.212 Table 5.4.2.1-2
> + * Starting position of different redundancy versions, k0
> + */
> +static inline uint16_t
> +get_k0(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv_index)
> +{
> + if (rv_index == 0)
> + return 0;
> + uint16_t n = (bg == 1 ? N_ZC_1 : N_ZC_2) * z_c;
> + if (n_cb == n) {
> + if (rv_index == 1)
> + return (bg == 1 ? K0_1_1 : K0_1_2) * z_c;
> + else if (rv_index == 2)
> + return (bg == 1 ? K0_2_1 : K0_2_2) * z_c;
> + else
> + return (bg == 1 ? K0_3_1 : K0_3_2) * z_c;
> + }
> + /* LBRM case - includes a division by N */
> + if (rv_index == 1)
> + return (((bg == 1 ? K0_1_1 : K0_1_2) * n_cb)
> + / n) * z_c;
> + else if (rv_index == 2)
> + return (((bg == 1 ? K0_2_1 : K0_2_2) * n_cb)
> + / n) * z_c;
> + else
> + return (((bg == 1 ? K0_3_1 : K0_3_2) * n_cb)
> + / n) * z_c;
> +}
> +
> +/* Fill in a frame control word for LDPC encoding. */
> +static inline void
> +acc100_fcw_le_fill(const struct rte_bbdev_enc_op *op,
> + struct acc100_fcw_le *fcw, int num_cb)
> +{
> + fcw->qm = op->ldpc_enc.q_m;
> + fcw->nfiller = op->ldpc_enc.n_filler;
> + fcw->BG = (op->ldpc_enc.basegraph - 1);
> + fcw->Zc = op->ldpc_enc.z_c;
> + fcw->ncb = op->ldpc_enc.n_cb;
> + fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_enc.basegraph,
> + op->ldpc_enc.rv_index);
> + fcw->rm_e = op->ldpc_enc.cb_params.e;
> + fcw->crc_select = check_bit(op->ldpc_enc.op_flags,
> + RTE_BBDEV_LDPC_CRC_24B_ATTACH);
> + fcw->bypass_intlv = check_bit(op->ldpc_enc.op_flags,
> + RTE_BBDEV_LDPC_INTERLEAVER_BYPASS);
> + fcw->mcb_count = num_cb;
> +}
> +
> +/* Fill in a frame control word for LDPC decoding. */
> +static inline void
> +acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld
> *fcw,
> + union acc100_harq_layout_data *harq_layout)
> +{
> + uint16_t harq_out_length, harq_in_length, ncb_p, k0_p, parity_offset;
> + uint16_t harq_index;
> + uint32_t l;
> + bool harq_prun = false;
> +
> + fcw->qm = op->ldpc_dec.q_m;
> + fcw->nfiller = op->ldpc_dec.n_filler;
> + fcw->BG = (op->ldpc_dec.basegraph - 1);
> + fcw->Zc = op->ldpc_dec.z_c;
> + fcw->ncb = op->ldpc_dec.n_cb;
> + fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_dec.basegraph,
> + op->ldpc_dec.rv_index);
> + if (op->ldpc_dec.code_block_mode == 1)
> + fcw->rm_e = op->ldpc_dec.cb_params.e;
> + else
> + fcw->rm_e = (op->ldpc_dec.tb_params.r <
> + op->ldpc_dec.tb_params.cab) ?
> + op->ldpc_dec.tb_params.ea :
> + op->ldpc_dec.tb_params.eb;
> +
> + fcw->hcin_en = check_bit(op->ldpc_dec.op_flags,
> + RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE);
> + fcw->hcout_en = check_bit(op->ldpc_dec.op_flags,
> + RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE);
> + fcw->crc_select = check_bit(op->ldpc_dec.op_flags,
> + RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK);
> + fcw->bypass_dec = check_bit(op->ldpc_dec.op_flags,
> + RTE_BBDEV_LDPC_DECODE_BYPASS);
> + fcw->bypass_intlv = check_bit(op->ldpc_dec.op_flags,
> + RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS);
> + if (op->ldpc_dec.q_m == 1) {
> + fcw->bypass_intlv = 1;
> + fcw->qm = 2;
> + }
> + fcw->hcin_decomp_mode = check_bit(op->ldpc_dec.op_flags,
> + RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> + fcw->hcout_comp_mode = check_bit(op->ldpc_dec.op_flags,
> + RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> + fcw->llr_pack_mode = check_bit(op->ldpc_dec.op_flags,
> + RTE_BBDEV_LDPC_LLR_COMPRESSION);
> + harq_index = op->ldpc_dec.harq_combined_output.offset /
> + ACC100_HARQ_OFFSET;
> +#ifdef ACC100_EXT_MEM
> + /* Limit cases when HARQ pruning is valid */
> + harq_prun = ((op->ldpc_dec.harq_combined_output.offset %
> + ACC100_HARQ_OFFSET) == 0) &&
> + (op->ldpc_dec.harq_combined_output.offset <= UINT16_MAX
> + * ACC100_HARQ_OFFSET);
> +#endif
> + if (fcw->hcin_en > 0) {
> + harq_in_length = op->ldpc_dec.harq_combined_input.length;
> + if (fcw->hcin_decomp_mode > 0)
> + harq_in_length = harq_in_length * 8 / 6;
> + harq_in_length = RTE_ALIGN(harq_in_length, 64);
> + if ((harq_layout[harq_index].offset > 0) & harq_prun) {
> + rte_bbdev_log_debug("HARQ IN offset unexpected for now\n");
> + fcw->hcin_size0 = harq_layout[harq_index].size0;
> + fcw->hcin_offset = harq_layout[harq_index].offset;
> + fcw->hcin_size1 = harq_in_length -
> + harq_layout[harq_index].offset;
> + } else {
> + fcw->hcin_size0 = harq_in_length;
> + fcw->hcin_offset = 0;
> + fcw->hcin_size1 = 0;
> + }
> + } else {
> + fcw->hcin_size0 = 0;
> + fcw->hcin_offset = 0;
> + fcw->hcin_size1 = 0;
> + }
> +
> + fcw->itmax = op->ldpc_dec.iter_max;
> + fcw->itstop = check_bit(op->ldpc_dec.op_flags,
> + RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE);
> + fcw->synd_precoder = fcw->itstop;
> + /*
> + * These are all implicitly set
> + * fcw->synd_post = 0;
> + * fcw->so_en = 0;
> + * fcw->so_bypass_rm = 0;
> + * fcw->so_bypass_intlv = 0;
> + * fcw->dec_convllr = 0;
> + * fcw->hcout_convllr = 0;
> + * fcw->hcout_size1 = 0;
> + * fcw->so_it = 0;
> + * fcw->hcout_offset = 0;
> + * fcw->negstop_th = 0;
> + * fcw->negstop_it = 0;
> + * fcw->negstop_en = 0;
> + * fcw->gain_i = 1;
> + * fcw->gain_h = 1;
> + */
> + if (fcw->hcout_en > 0) {
> + parity_offset = (op->ldpc_dec.basegraph == 1 ? 20 : 8)
> + * op->ldpc_dec.z_c - op->ldpc_dec.n_filler;
> + k0_p = (fcw->k0 > parity_offset) ?
> + fcw->k0 - op->ldpc_dec.n_filler : fcw->k0;
> + ncb_p = fcw->ncb - op->ldpc_dec.n_filler;
> + l = k0_p + fcw->rm_e;
> + harq_out_length = (uint16_t) fcw->hcin_size0;
> + harq_out_length = RTE_MIN(RTE_MAX(harq_out_length, l), ncb_p);
> + harq_out_length = (harq_out_length + 0x3F) & 0xFFC0;
> + if ((k0_p > fcw->hcin_size0 + ACC100_HARQ_OFFSET_THRESHOLD)
> &&
> + harq_prun) {
> + fcw->hcout_size0 = (uint16_t) fcw->hcin_size0;
> + fcw->hcout_offset = k0_p & 0xFFC0;
> + fcw->hcout_size1 = harq_out_length - fcw->hcout_offset;
> + } else {
> + fcw->hcout_size0 = harq_out_length;
> + fcw->hcout_size1 = 0;
> + fcw->hcout_offset = 0;
> + }
> + harq_layout[harq_index].offset = fcw->hcout_offset;
> + harq_layout[harq_index].size0 = fcw->hcout_size0;
> + } else {
> + fcw->hcout_size0 = 0;
> + fcw->hcout_size1 = 0;
> + fcw->hcout_offset = 0;
> + }
> +}
> +
> +/**
> + * Fills descriptor with data pointers of one block type.
> + *
> + * @param desc
> + * Pointer to DMA descriptor.
> + * @param input
> + * Pointer to pointer to input data which will be encoded. It can be changed
> + * and points to next segment in scatter-gather case.
> + * @param offset
> + * Input offset in rte_mbuf structure. It is used for calculating the point
> + * where data is starting.
> + * @param cb_len
> + * Length of currently processed Code Block
> + * @param seg_total_left
> + * It indicates how many bytes still left in segment (mbuf) for further
> + * processing.
> + * @param op_flags
> + * Store information about device capabilities
> + * @param next_triplet
> + * Index for ACC100 DMA Descriptor triplet
> + *
> + * @return
> + * Returns index of next triplet on success, other value if lengths of
> + * pkt and processed cb do not match.
> + *
> + */
> +static inline int
> +acc100_dma_fill_blk_type_in(struct acc100_dma_req_desc *desc,
> + struct rte_mbuf **input, uint32_t *offset, uint32_t cb_len,
> + uint32_t *seg_total_left, int next_triplet)
> +{
> + uint32_t part_len;
> + struct rte_mbuf *m = *input;
> +
> + part_len = (*seg_total_left < cb_len) ? *seg_total_left : cb_len;
> + cb_len -= part_len;
> + *seg_total_left -= part_len;
> +
> + desc->data_ptrs[next_triplet].address =
> + rte_pktmbuf_iova_offset(m, *offset);
> + desc->data_ptrs[next_triplet].blen = part_len;
> + desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
> + desc->data_ptrs[next_triplet].last = 0;
> + desc->data_ptrs[next_triplet].dma_ext = 0;
> + *offset += part_len;
> + next_triplet++;
> +
> + while (cb_len > 0) {
> + if (next_triplet < ACC100_DMA_MAX_NUM_POINTERS &&
> + m->next != NULL) {
> +
> + m = m->next;
> + *seg_total_left = rte_pktmbuf_data_len(m);
> + part_len = (*seg_total_left < cb_len) ?
> + *seg_total_left :
> + cb_len;
> + desc->data_ptrs[next_triplet].address =
> + rte_pktmbuf_mtophys(m);
> + desc->data_ptrs[next_triplet].blen = part_len;
> + desc->data_ptrs[next_triplet].blkid =
> + ACC100_DMA_BLKID_IN;
> + desc->data_ptrs[next_triplet].last = 0;
> + desc->data_ptrs[next_triplet].dma_ext = 0;
> + cb_len -= part_len;
> + *seg_total_left -= part_len;
> + /* Initializing offset for next segment (mbuf) */
> + *offset = part_len;
> + next_triplet++;
> + } else {
> + rte_bbdev_log(ERR,
> + "Some data still left for processing: "
> + "data_left: %u, next_triplet: %u, next_mbuf: %p",
> + cb_len, next_triplet, m->next);
> + return -EINVAL;
> + }
> + }
> + /* Storing new mbuf as it could be changed in scatter-gather case*/
> + *input = m;
> +
> + return next_triplet;
> +}
> +
> +/* Fills descriptor with data pointers of one block type.
> + * Returns index of next triplet on success, other value if lengths of
> + * output data and processed mbuf do not match.
> + */
> +static inline int
> +acc100_dma_fill_blk_type_out(struct acc100_dma_req_desc *desc,
> + struct rte_mbuf *output, uint32_t out_offset,
> + uint32_t output_len, int next_triplet, int blk_id)
> +{
> + desc->data_ptrs[next_triplet].address =
> + rte_pktmbuf_iova_offset(output, out_offset);
> + desc->data_ptrs[next_triplet].blen = output_len;
> + desc->data_ptrs[next_triplet].blkid = blk_id;
> + desc->data_ptrs[next_triplet].last = 0;
> + desc->data_ptrs[next_triplet].dma_ext = 0;
> + next_triplet++;
> +
> + return next_triplet;
> +}
> +
> +static inline int
> +acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
> + struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
> + struct rte_mbuf *output, uint32_t *in_offset,
> + uint32_t *out_offset, uint32_t *out_length,
> + uint32_t *mbuf_total_left, uint32_t *seg_total_left)
> +{
> + int next_triplet = 1; /* FCW already done */
> + uint16_t K, in_length_in_bits, in_length_in_bytes;
> + struct rte_bbdev_op_ldpc_enc *enc = &op->ldpc_enc;
> +
> + desc->word0 = ACC100_DMA_DESC_TYPE;
> + desc->word1 = 0; /**< Timestamp could be disabled */
> + desc->word2 = 0;
> + desc->word3 = 0;
> + desc->numCBs = 1;
> +
> + K = (enc->basegraph == 1 ? 22 : 10) * enc->z_c;
> + in_length_in_bits = K - enc->n_filler;
> + if ((enc->op_flags & RTE_BBDEV_LDPC_CRC_24A_ATTACH) ||
> + (enc->op_flags & RTE_BBDEV_LDPC_CRC_24B_ATTACH))
> + in_length_in_bits -= 24;
> + in_length_in_bytes = in_length_in_bits >> 3;
> +
> + if (unlikely((*mbuf_total_left == 0) ||
> + (*mbuf_total_left < in_length_in_bytes))) {
> + rte_bbdev_log(ERR,
> + "Mismatch between mbuf length and included CB sizes:
> mbuf len %u, cb len %u",
> + *mbuf_total_left, in_length_in_bytes);
> + return -1;
> + }
> +
> + next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
> + in_length_in_bytes,
> + seg_total_left, next_triplet);
> + if (unlikely(next_triplet < 0)) {
> + rte_bbdev_log(ERR,
> + "Mismatch between data to process and mbuf data length
> in bbdev_op: %p",
> + op);
> + return -1;
> + }
> + desc->data_ptrs[next_triplet - 1].last = 1;
> + desc->m2dlen = next_triplet;
> + *mbuf_total_left -= in_length_in_bytes;
> +
> + /* Set output length */
> + /* Integer round up division by 8 */
> + *out_length = (enc->cb_params.e + 7) >> 3;
> +
> + next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
> + *out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
> + if (unlikely(next_triplet < 0)) {
> + rte_bbdev_log(ERR,
> + "Mismatch between data to process and mbuf data length
> in bbdev_op: %p",
> + op);
> + return -1;
> + }
> + op->ldpc_enc.output.length += *out_length;
> + *out_offset += *out_length;
> + desc->data_ptrs[next_triplet - 1].last = 1;
> + desc->data_ptrs[next_triplet - 1].dma_ext = 0;
> + desc->d2mlen = next_triplet - desc->m2dlen;
> +
> + desc->op_addr = op;
> +
> + return 0;
> +}
> +
> +static inline int
> +acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
> + struct acc100_dma_req_desc *desc,
> + struct rte_mbuf **input, struct rte_mbuf *h_output,
> + uint32_t *in_offset, uint32_t *h_out_offset,
> + uint32_t *h_out_length, uint32_t *mbuf_total_left,
> + uint32_t *seg_total_left,
> + struct acc100_fcw_ld *fcw)
> +{
> + struct rte_bbdev_op_ldpc_dec *dec = &op->ldpc_dec;
> + int next_triplet = 1; /* FCW already done */
> + uint32_t input_length;
> + uint16_t output_length, crc24_overlap = 0;
> + uint16_t sys_cols, K, h_p_size, h_np_size;
> + bool h_comp = check_bit(dec->op_flags,
> + RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> +
> + desc->word0 = ACC100_DMA_DESC_TYPE;
> + desc->word1 = 0; /**< Timestamp could be disabled */
> + desc->word2 = 0;
> + desc->word3 = 0;
> + desc->numCBs = 1;
> +
> + if (check_bit(op->ldpc_dec.op_flags,
> + RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP))
> + crc24_overlap = 24;
> +
> + /* Compute some LDPC BG lengths */
> + input_length = dec->cb_params.e;
> + if (check_bit(op->ldpc_dec.op_flags,
> + RTE_BBDEV_LDPC_LLR_COMPRESSION))
> + input_length = (input_length * 3 + 3) / 4;
> + sys_cols = (dec->basegraph == 1) ? 22 : 10;
> + K = sys_cols * dec->z_c;
> + output_length = K - dec->n_filler - crc24_overlap;
> +
> + if (unlikely((*mbuf_total_left == 0) ||
> + (*mbuf_total_left < input_length))) {
> + rte_bbdev_log(ERR,
> + "Mismatch between mbuf length and included CB sizes:
> mbuf len %u, cb len %u",
> + *mbuf_total_left, input_length);
> + return -1;
> + }
> +
> + next_triplet = acc100_dma_fill_blk_type_in(desc, input,
> + in_offset, input_length,
> + seg_total_left, next_triplet);
> +
> + if (unlikely(next_triplet < 0)) {
> + rte_bbdev_log(ERR,
> + "Mismatch between data to process and mbuf data length
> in bbdev_op: %p",
> + op);
> + return -1;
> + }
> +
> + if (check_bit(op->ldpc_dec.op_flags,
> + RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
> + h_p_size = fcw->hcin_size0 + fcw->hcin_size1;
> + if (h_comp)
> + h_p_size = (h_p_size * 3 + 3) / 4;
> + desc->data_ptrs[next_triplet].address =
> + dec->harq_combined_input.offset;
> + desc->data_ptrs[next_triplet].blen = h_p_size;
> + desc->data_ptrs[next_triplet].blkid =
> ACC100_DMA_BLKID_IN_HARQ;
> + desc->data_ptrs[next_triplet].dma_ext = 1;
> +#ifndef ACC100_EXT_MEM
> + acc100_dma_fill_blk_type_out(
> + desc,
> + op->ldpc_dec.harq_combined_input.data,
> + op->ldpc_dec.harq_combined_input.offset,
> + h_p_size,
> + next_triplet,
> + ACC100_DMA_BLKID_IN_HARQ);
> +#endif
> + next_triplet++;
> + }
> +
> + desc->data_ptrs[next_triplet - 1].last = 1;
> + desc->m2dlen = next_triplet;
> + *mbuf_total_left -= input_length;
> +
> + next_triplet = acc100_dma_fill_blk_type_out(desc, h_output,
> + *h_out_offset, output_length >> 3, next_triplet,
> + ACC100_DMA_BLKID_OUT_HARD);
> + if (unlikely(next_triplet < 0)) {
> + rte_bbdev_log(ERR,
> + "Mismatch between data to process and mbuf data length
> in bbdev_op: %p",
> + op);
> + return -1;
> + }
> +
> + if (check_bit(op->ldpc_dec.op_flags,
> + RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
> + /* Pruned size of the HARQ */
> + h_p_size = fcw->hcout_size0 + fcw->hcout_size1;
> + /* Non-Pruned size of the HARQ */
> + h_np_size = fcw->hcout_offset > 0 ?
> + fcw->hcout_offset + fcw->hcout_size1 :
> + h_p_size;
> + if (h_comp) {
> + h_np_size = (h_np_size * 3 + 3) / 4;
> + h_p_size = (h_p_size * 3 + 3) / 4;
> + }
> + dec->harq_combined_output.length = h_np_size;
> + desc->data_ptrs[next_triplet].address =
> + dec->harq_combined_output.offset;
> + desc->data_ptrs[next_triplet].blen = h_p_size;
> + desc->data_ptrs[next_triplet].blkid =
> ACC100_DMA_BLKID_OUT_HARQ;
> + desc->data_ptrs[next_triplet].dma_ext = 1;
> +#ifndef ACC100_EXT_MEM
> + acc100_dma_fill_blk_type_out(
> + desc,
> + dec->harq_combined_output.data,
> + dec->harq_combined_output.offset,
> + h_p_size,
> + next_triplet,
> + ACC100_DMA_BLKID_OUT_HARQ);
> +#endif
> + next_triplet++;
> + }
> +
> + *h_out_length = output_length >> 3;
> + dec->hard_output.length += *h_out_length;
> + *h_out_offset += *h_out_length;
> + desc->data_ptrs[next_triplet - 1].last = 1;
> + desc->d2mlen = next_triplet - desc->m2dlen;
> +
> + desc->op_addr = op;
> +
> + return 0;
> +}
> +
> +static inline void
> +acc100_dma_desc_ld_update(struct rte_bbdev_dec_op *op,
> + struct acc100_dma_req_desc *desc,
> + struct rte_mbuf *input, struct rte_mbuf *h_output,
> + uint32_t *in_offset, uint32_t *h_out_offset,
> + uint32_t *h_out_length,
> + union acc100_harq_layout_data *harq_layout)
> +{
> + int next_triplet = 1; /* FCW already done */
> + desc->data_ptrs[next_triplet].address =
> + rte_pktmbuf_iova_offset(input, *in_offset);
> + next_triplet++;
> +
> + if (check_bit(op->ldpc_dec.op_flags,
> + RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
> + struct rte_bbdev_op_data hi = op->ldpc_dec.harq_combined_input;
> + desc->data_ptrs[next_triplet].address = hi.offset;
> +#ifndef ACC100_EXT_MEM
> + desc->data_ptrs[next_triplet].address =
> + rte_pktmbuf_iova_offset(hi.data, hi.offset);
> +#endif
> + next_triplet++;
> + }
> +
> + desc->data_ptrs[next_triplet].address =
> + rte_pktmbuf_iova_offset(h_output, *h_out_offset);
> + *h_out_length = desc->data_ptrs[next_triplet].blen;
> + next_triplet++;
> +
> + if (check_bit(op->ldpc_dec.op_flags,
> + RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
> + desc->data_ptrs[next_triplet].address =
> + op->ldpc_dec.harq_combined_output.offset;
> + /* Adjust based on previous operation */
> + struct rte_bbdev_dec_op *prev_op = desc->op_addr;
> + op->ldpc_dec.harq_combined_output.length =
> + prev_op->ldpc_dec.harq_combined_output.length;
> + int16_t hq_idx = op->ldpc_dec.harq_combined_output.offset /
> + ACC100_HARQ_OFFSET;
> + int16_t prev_hq_idx =
> + prev_op->ldpc_dec.harq_combined_output.offset
> + / ACC100_HARQ_OFFSET;
> + harq_layout[hq_idx].val = harq_layout[prev_hq_idx].val;
> +#ifndef ACC100_EXT_MEM
> + struct rte_bbdev_op_data ho =
> + op->ldpc_dec.harq_combined_output;
> + desc->data_ptrs[next_triplet].address =
> + rte_pktmbuf_iova_offset(ho.data, ho.offset);
> +#endif
> + next_triplet++;
> + }
> +
> + op->ldpc_dec.hard_output.length += *h_out_length;
> + desc->op_addr = op;
> +}
> +
> +
> +/* Enqueue a number of operations to HW and update software rings */
> +static inline void
> +acc100_dma_enqueue(struct acc100_queue *q, uint16_t n,
> + struct rte_bbdev_stats *queue_stats)
> +{
> + union acc100_enqueue_reg_fmt enq_req;
> +#ifdef RTE_BBDEV_OFFLOAD_COST
> + uint64_t start_time = 0;
> + queue_stats->acc_offload_cycles = 0;
> + RTE_SET_USED(queue_stats);
> +#else
> + RTE_SET_USED(queue_stats);
> +#endif
> +
> + enq_req.val = 0;
> + /* Setting offset, 100b for 256 DMA Desc */
> + enq_req.addr_offset = ACC100_DESC_OFFSET;
> +
> + /* Split ops into batches */
> + do {
> + union acc100_dma_desc *desc;
> + uint16_t enq_batch_size;
> + uint64_t offset;
> + rte_iova_t req_elem_addr;
> +
> + enq_batch_size = RTE_MIN(n, MAX_ENQ_BATCH_SIZE);
> +
> + /* Set flag on last descriptor in a batch */
> + desc = q->ring_addr + ((q->sw_ring_head + enq_batch_size - 1) &
> + q->sw_ring_wrap_mask);
> + desc->req.last_desc_in_batch = 1;
> +
> + /* Calculate the 1st descriptor's address */
> + offset = ((q->sw_ring_head & q->sw_ring_wrap_mask) *
> + sizeof(union acc100_dma_desc));
> + req_elem_addr = q->ring_addr_phys + offset;
> +
> + /* Fill enqueue struct */
> + enq_req.num_elem = enq_batch_size;
> + /* low 6 bits are not needed */
> + enq_req.req_elem_addr = (uint32_t)(req_elem_addr >> 6);
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> + rte_memdump(stderr, "Req sdone", desc, sizeof(*desc));
> +#endif
> + rte_bbdev_log_debug(
> + "Enqueue %u reqs (phys %#"PRIx64") to reg %p",
> + enq_batch_size,
> + req_elem_addr,
> + (void *)q->mmio_reg_enqueue);
> +
> + rte_wmb();
> +
> +#ifdef RTE_BBDEV_OFFLOAD_COST
> + /* Start time measurement for enqueue function offload. */
> + start_time = rte_rdtsc_precise();
> +#endif
> + rte_bbdev_log(DEBUG, "Debug : MMIO Enqueue");
> + mmio_write(q->mmio_reg_enqueue, enq_req.val);
> +
> +#ifdef RTE_BBDEV_OFFLOAD_COST
> + queue_stats->acc_offload_cycles +=
> + rte_rdtsc_precise() - start_time;
> +#endif
> +
> + q->aq_enqueued++;
> + q->sw_ring_head += enq_batch_size;
> + n -= enq_batch_size;
> +
> + } while (n);
> +
> +
> +}
> +
> +/* Enqueue one encode operations for ACC100 device in CB mode */
> +static inline int
> +enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct
> rte_bbdev_enc_op **ops,
> + uint16_t total_enqueued_cbs, int16_t num)
> +{
> + union acc100_dma_desc *desc = NULL;
> + uint32_t out_length;
> + struct rte_mbuf *output_head, *output;
> + int i, next_triplet;
> + uint16_t in_length_in_bytes;
> + struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
> +
> + uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> + & q->sw_ring_wrap_mask);
> + desc = q->ring_addr + desc_idx;
> + acc100_fcw_le_fill(ops[0], &desc->req.fcw_le, num);
> +
> + /** This could be done at polling */
> + desc->req.word0 = ACC100_DMA_DESC_TYPE;
> + desc->req.word1 = 0; /**< Timestamp could be disabled */
> + desc->req.word2 = 0;
> + desc->req.word3 = 0;
> + desc->req.numCBs = num;
> +
> + in_length_in_bytes = ops[0]->ldpc_enc.input.data->data_len;
> + out_length = (enc->cb_params.e + 7) >> 3;
> + desc->req.m2dlen = 1 + num;
> + desc->req.d2mlen = num;
> + next_triplet = 1;
> +
> + for (i = 0; i < num; i++) {
> + desc->req.data_ptrs[next_triplet].address =
> + rte_pktmbuf_iova_offset(ops[i]->ldpc_enc.input.data, 0);
> + desc->req.data_ptrs[next_triplet].blen = in_length_in_bytes;
> + next_triplet++;
> + desc->req.data_ptrs[next_triplet].address =
> + rte_pktmbuf_iova_offset(
> + ops[i]->ldpc_enc.output.data, 0);
> + desc->req.data_ptrs[next_triplet].blen = out_length;
> + next_triplet++;
> + ops[i]->ldpc_enc.output.length = out_length;
> + output_head = output = ops[i]->ldpc_enc.output.data;
> + mbuf_append(output_head, output, out_length);
> + output->data_len = out_length;
> + }
> +
> + desc->req.op_addr = ops[0];
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> + rte_memdump(stderr, "FCW", &desc->req.fcw_le,
> + sizeof(desc->req.fcw_le) - 8);
> + rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> +#endif
> +
> + /* One CB (one op) was successfully prepared to enqueue */
> + return num;
> +}
> +
> +/* Enqueue one encode operations for ACC100 device in CB mode */
> +static inline int
> +enqueue_ldpc_enc_one_op_cb(struct acc100_queue *q, struct
> rte_bbdev_enc_op *op,
> + uint16_t total_enqueued_cbs)
> +{
> + union acc100_dma_desc *desc = NULL;
> + int ret;
> + uint32_t in_offset, out_offset, out_length, mbuf_total_left,
> + seg_total_left;
> + struct rte_mbuf *input, *output_head, *output;
> +
> + uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> + & q->sw_ring_wrap_mask);
> + desc = q->ring_addr + desc_idx;
> + acc100_fcw_le_fill(op, &desc->req.fcw_le, 1);
> +
> + input = op->ldpc_enc.input.data;
> + output_head = output = op->ldpc_enc.output.data;
> + in_offset = op->ldpc_enc.input.offset;
> + out_offset = op->ldpc_enc.output.offset;
> + out_length = 0;
> + mbuf_total_left = op->ldpc_enc.input.length;
> + seg_total_left = rte_pktmbuf_data_len(op->ldpc_enc.input.data)
> + - in_offset;
> +
> + ret = acc100_dma_desc_le_fill(op, &desc->req, &input, output,
> + &in_offset, &out_offset, &out_length, &mbuf_total_left,
> + &seg_total_left);
> +
> + if (unlikely(ret < 0))
> + return ret;
> +
> + mbuf_append(output_head, output, out_length);
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> + rte_memdump(stderr, "FCW", &desc->req.fcw_le,
> + sizeof(desc->req.fcw_le) - 8);
> + rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> +
> + /* Check if any data left after processing one CB */
> + if (mbuf_total_left != 0) {
> + rte_bbdev_log(ERR,
> + "Some data still left after processing one CB: mbuf_total_left = %u",
> + mbuf_total_left);
> + return -EINVAL;
> + }
> +#endif
> + /* One CB (one op) was successfully prepared to enqueue */
> + return 1;
> +}
> +
> +/* Enqueue one decode operation for ACC100 device in CB mode */
> +static inline int
> +enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
> + uint16_t total_enqueued_cbs, bool same_op)
> +{
> + int ret;
> +
> + union acc100_dma_desc *desc;
> + uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> + & q->sw_ring_wrap_mask);
> + desc = q->ring_addr + desc_idx;
> + struct rte_mbuf *input, *h_output_head, *h_output;
> + uint32_t in_offset, h_out_offset, h_out_length, mbuf_total_left;
> + input = op->ldpc_dec.input.data;
> + h_output_head = h_output = op->ldpc_dec.hard_output.data;
> + in_offset = op->ldpc_dec.input.offset;
> + h_out_offset = op->ldpc_dec.hard_output.offset;
> + mbuf_total_left = op->ldpc_dec.input.length;
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> + if (unlikely(input == NULL)) {
> + rte_bbdev_log(ERR, "Invalid mbuf pointer");
> + return -EFAULT;
> + }
> +#endif
> + union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
> +
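> + /* Consecutive ops with identical parameters reuse the previous
> + * descriptor content and FCW; only the data pointers are updated
> + */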
> + if (same_op) {
> + union acc100_dma_desc *prev_desc;
> + desc_idx = ((q->sw_ring_head + total_enqueued_cbs - 1)
> + & q->sw_ring_wrap_mask);
> + prev_desc = q->ring_addr + desc_idx;
> + uint8_t *prev_ptr = (uint8_t *) prev_desc;
> + uint8_t *new_ptr = (uint8_t *) desc;
> + /* Copy first 4 words and BDESCs */
> + rte_memcpy(new_ptr, prev_ptr, 16);
> + rte_memcpy(new_ptr + 36, prev_ptr + 36, 40);
> + desc->req.op_addr = prev_desc->req.op_addr;
> + /* Copy FCW */
> + rte_memcpy(new_ptr + ACC100_DESC_FCW_OFFSET,
> + prev_ptr + ACC100_DESC_FCW_OFFSET,
> + ACC100_FCW_LD_BLEN);
> + acc100_dma_desc_ld_update(op, &desc->req, input, h_output,
> + &in_offset, &h_out_offset,
> + &h_out_length, harq_layout);
> + } else {
> + struct acc100_fcw_ld *fcw;
> + uint32_t seg_total_left;
> + fcw = &desc->req.fcw_ld;
> + acc100_fcw_ld_fill(op, fcw, harq_layout);
> +
> + /* Special handling when overusing mbuf */
> + if (fcw->rm_e < MAX_E_MBUF)
> + seg_total_left = rte_pktmbuf_data_len(input)
> + - in_offset;
> + else
> + seg_total_left = fcw->rm_e;
> +
> + ret = acc100_dma_desc_ld_fill(op, &desc->req, &input, h_output,
> + &in_offset, &h_out_offset,
> + &h_out_length, &mbuf_total_left,
> + &seg_total_left, fcw);
> + if (unlikely(ret < 0))
> + return ret;
> + }
> +
> + /* Hard output */
> + mbuf_append(h_output_head, h_output, h_out_length);
> +#ifndef ACC100_EXT_MEM
> + if (op->ldpc_dec.harq_combined_output.length > 0) {
> + /* Push the HARQ output into host memory */
> + struct rte_mbuf *hq_output_head, *hq_output;
> + hq_output_head = op->ldpc_dec.harq_combined_output.data;
> + hq_output = op->ldpc_dec.harq_combined_output.data;
> + mbuf_append(hq_output_head, hq_output,
> + op->ldpc_dec.harq_combined_output.length);
> + }
> +#endif
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> + rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
> + sizeof(desc->req.fcw_ld) - 8);
> + rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> +#endif
> +
> + /* One CB (one op) was successfully prepared to enqueue */
> + return 1;
> +}
> +
> +
> +/* Enqueue one decode operation for ACC100 device in TB mode */
> +static inline int
> +enqueue_ldpc_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
> + uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
> +{
> + union acc100_dma_desc *desc = NULL;
> + int ret;
> + uint8_t r, c;
> + uint32_t in_offset, h_out_offset,
> + h_out_length, mbuf_total_left, seg_total_left;
> + struct rte_mbuf *input, *h_output_head, *h_output;
> + uint16_t current_enqueued_cbs = 0;
> +
> + uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> + & q->sw_ring_wrap_mask);
> + desc = q->ring_addr + desc_idx;
> + uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
> + union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
> + acc100_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout);
> +
> + input = op->ldpc_dec.input.data;
> + h_output_head = h_output = op->ldpc_dec.hard_output.data;
> + in_offset = op->ldpc_dec.input.offset;
> + h_out_offset = op->ldpc_dec.hard_output.offset;
> + h_out_length = 0;
> + mbuf_total_left = op->ldpc_dec.input.length;
> + c = op->ldpc_dec.tb_params.c;
> + r = op->ldpc_dec.tb_params.r;
> +
> + while (mbuf_total_left > 0 && r < c) {
> +
> + seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
> +
> + /* Set up DMA descriptor */
> + desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
> + & q->sw_ring_wrap_mask);
> + desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
> + desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
> + ret = acc100_dma_desc_ld_fill(op, &desc->req, &input,
> + h_output, &in_offset, &h_out_offset,
> + &h_out_length,
> + &mbuf_total_left, &seg_total_left,
> + &desc->req.fcw_ld);
> +
> + if (unlikely(ret < 0))
> + return ret;
> +
> + /* Hard output */
> + mbuf_append(h_output_head, h_output, h_out_length);
> +
> + /* Set total number of CBs in TB */
> + desc->req.cbs_in_tb = cbs_in_tb;
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> + rte_memdump(stderr, "FCW", &desc->req.fcw_td,
> + sizeof(desc->req.fcw_td) - 8);
> + rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> +#endif
> +
> + if (seg_total_left == 0) {
> + /* Go to the next mbuf */
> + input = input->next;
> + in_offset = 0;
> + h_output = h_output->next;
> + h_out_offset = 0;
> + }
> + total_enqueued_cbs++;
> + current_enqueued_cbs++;
> + r++;
> + }
> +
> + if (unlikely(desc == NULL))
> + return current_enqueued_cbs;
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> + /* Check if any CBs left for processing */
> + if (mbuf_total_left != 0) {
> + rte_bbdev_log(ERR,
> + "Some date still left for processing: mbuf_total_left = %u",
> + mbuf_total_left);
> + return -EINVAL;
> + }
> +#endif
> + /* Set SDone on last CB descriptor for TB mode */
> + desc->req.sdone_enable = 1;
> + desc->req.irq_enable = q->irq_enable;
> +
> + return current_enqueued_cbs;
> +}
> +
> +
> +/* Calculates number of CBs in processed encoder TB based on 'r' and input
> + * length.
> + */
> +static inline uint8_t
> +get_num_cbs_in_tb_enc(struct rte_bbdev_op_turbo_enc *turbo_enc)
> +{
> + uint8_t c, c_neg, r, crc24_bits = 0;
> + uint16_t k, k_neg, k_pos;
> + uint8_t cbs_in_tb = 0;
> + int32_t length;
> +
> + length = turbo_enc->input.length;
> + r = turbo_enc->tb_params.r;
> + c = turbo_enc->tb_params.c;
> + c_neg = turbo_enc->tb_params.c_neg;
> + k_neg = turbo_enc->tb_params.k_neg;
> + k_pos = turbo_enc->tb_params.k_pos;
> + crc24_bits = 0;
> + if (check_bit(turbo_enc->op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH))
> + crc24_bits = 24;
> + while (length > 0 && r < c) {
> + k = (r < c_neg) ? k_neg : k_pos;
> + length -= (k - crc24_bits) >> 3;
> + r++;
> + cbs_in_tb++;
> + }
> +
> + return cbs_in_tb;
> +}
> +
> +/* Calculates number of CBs in processed decoder TB based on 'r' and input
> + * length.
> + */
> +static inline uint16_t
> +get_num_cbs_in_tb_dec(struct rte_bbdev_op_turbo_dec *turbo_dec)
> +{
> + uint8_t c, c_neg, r = 0;
> + uint16_t kw, k, k_neg, k_pos, cbs_in_tb = 0;
> + int32_t length;
> +
> + length = turbo_dec->input.length;
> + r = turbo_dec->tb_params.r;
> + c = turbo_dec->tb_params.c;
> + c_neg = turbo_dec->tb_params.c_neg;
> + k_neg = turbo_dec->tb_params.k_neg;
> + k_pos = turbo_dec->tb_params.k_pos;
> + while (length > 0 && r < c) {
> + k = (r < c_neg) ? k_neg : k_pos;
> + kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
> + length -= kw;
> + r++;
> + cbs_in_tb++;
> + }
> +
> + return cbs_in_tb;
> +}
> +
> +/* Calculates number of CBs in processed decoder TB based on 'r' and input
> + * length.
> + */
> +static inline uint16_t
> +get_num_cbs_in_tb_ldpc_dec(struct rte_bbdev_op_ldpc_dec *ldpc_dec)
> +{
> + uint16_t r, cbs_in_tb = 0;
> + int32_t length = ldpc_dec->input.length;
> + r = ldpc_dec->tb_params.r;
> + while (length > 0 && r < ldpc_dec->tb_params.c) {
> + length -= (r < ldpc_dec->tb_params.cab) ?
> + ldpc_dec->tb_params.ea :
> + ldpc_dec->tb_params.eb;
> + r++;
> + cbs_in_tb++;
> + }
> + return cbs_in_tb;
> +}
> +
> +/* Check we can mux encode operations with common FCW */
> +static inline bool
> +check_mux(struct rte_bbdev_enc_op **ops, uint16_t num)
> +{
> + uint16_t i;
> + if (num == 1)
> + return false;
> + for (i = 1; i < num; ++i) {
> + /* Only mux compatible code blocks */
> + if (memcmp((uint8_t *)(&ops[i]->ldpc_enc) + ENC_OFFSET,
> + (uint8_t *)(&ops[0]->ldpc_enc) + ENC_OFFSET,
> + CMP_ENC_SIZE) != 0)
> + return false;
> + }
> + return true;
> +}
> +
> +/** Enqueue encode operations for ACC100 device in CB mode. */
> +static inline uint16_t
> +acc100_enqueue_ldpc_enc_cb(struct rte_bbdev_queue_data *q_data,
> + struct rte_bbdev_enc_op **ops, uint16_t num)
> +{
> + struct acc100_queue *q = q_data->queue_private;
> + int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> + uint16_t i = 0;
> + union acc100_dma_desc *desc;
> + int ret, desc_idx = 0;
> + int16_t enq, left = num;
> +
> + while (left > 0) {
> + if (unlikely(avail - 1 < 0))
> + break;
> + avail--;
> + enq = RTE_MIN(left, MUX_5GDL_DESC);
> + if (check_mux(&ops[i], enq)) {
> + ret = enqueue_ldpc_enc_n_op_cb(q, &ops[i],
> + desc_idx, enq);
> + if (ret < 0)
> + break;
> + i += enq;
> + } else {
> + ret = enqueue_ldpc_enc_one_op_cb(q, ops[i], desc_idx);
> + if (ret < 0)
> + break;
> + i++;
> + }
> + desc_idx++;
> + left = num - i;
> + }
> +
> + if (unlikely(i == 0))
> + return 0; /* Nothing to enqueue */
> +
> + /* Set SDone in last CB in enqueued ops for CB mode */
> + desc = q->ring_addr + ((q->sw_ring_head + desc_idx - 1)
> + & q->sw_ring_wrap_mask);
> + desc->req.sdone_enable = 1;
> + desc->req.irq_enable = q->irq_enable;
> +
> + acc100_dma_enqueue(q, desc_idx, &q_data->queue_stats);
> +
> + /* Update stats */
> + q_data->queue_stats.enqueued_count += i;
> + q_data->queue_stats.enqueue_err_count += num - i;
> +
> + return i;
> +}
> +
> +/* Enqueue encode operations for ACC100 device. */
> +static uint16_t
> +acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
> + struct rte_bbdev_enc_op **ops, uint16_t num)
> +{
> + if (unlikely(num == 0))
> + return 0;
> + return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
> +}
> +
> +/* Check if two consecutive decode operations can share a common FCW */
> +static inline bool
> +cmp_ldpc_dec_op(struct rte_bbdev_dec_op **ops)
> +{
> + /* Only mux compatible code blocks */
> + return memcmp((uint8_t *)(&ops[0]->ldpc_dec) + DEC_OFFSET,
> + (uint8_t *)(&ops[1]->ldpc_dec) + DEC_OFFSET,
> + CMP_DEC_SIZE) == 0;
> +}
> +
> +
> +/* Enqueue decode operations for ACC100 device in TB mode */
> +static uint16_t
> +acc100_enqueue_ldpc_dec_tb(struct rte_bbdev_queue_data *q_data,
> + struct rte_bbdev_dec_op **ops, uint16_t num)
> +{
> + struct acc100_queue *q = q_data->queue_private;
> + int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> + uint16_t i, enqueued_cbs = 0;
> + uint8_t cbs_in_tb;
> + int ret;
> +
> + for (i = 0; i < num; ++i) {
> + cbs_in_tb = get_num_cbs_in_tb_ldpc_dec(&ops[i]->ldpc_dec);
> + /* Check if there is available space for further processing */
> + if (unlikely(avail - cbs_in_tb < 0))
> + break;
> + avail -= cbs_in_tb;
> +
> + ret = enqueue_ldpc_dec_one_op_tb(q, ops[i],
> + enqueued_cbs, cbs_in_tb);
> + if (ret < 0)
> + break;
> + enqueued_cbs += ret;
> + }
> +
> + acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
> +
> + /* Update stats */
> + q_data->queue_stats.enqueued_count += i;
> + q_data->queue_stats.enqueue_err_count += num - i;
> + return i;
> +}
> +
> +/* Enqueue decode operations for ACC100 device in CB mode */
> +static uint16_t
> +acc100_enqueue_ldpc_dec_cb(struct rte_bbdev_queue_data *q_data,
> + struct rte_bbdev_dec_op **ops, uint16_t num)
> +{
> + struct acc100_queue *q = q_data->queue_private;
> + int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> + uint16_t i;
> + union acc100_dma_desc *desc;
> + int ret;
> + bool same_op = false;
> + for (i = 0; i < num; ++i) {
> + /* Check if there is available space for further processing */
> + if (unlikely(avail - 1 < 0))
> + break;
> + avail -= 1;
> +
> + if (i > 0)
> + same_op = cmp_ldpc_dec_op(&ops[i-1]);
> + rte_bbdev_log(INFO, "Op %d %d %d %d %d %d %d %d %d %d %d
> %d\n",
> + i, ops[i]->ldpc_dec.op_flags, ops[i]->ldpc_dec.rv_index,
> + ops[i]->ldpc_dec.iter_max, ops[i]->ldpc_dec.iter_count,
> + ops[i]->ldpc_dec.basegraph, ops[i]->ldpc_dec.z_c,
> + ops[i]->ldpc_dec.n_cb, ops[i]->ldpc_dec.q_m,
> + ops[i]->ldpc_dec.n_filler, ops[i]->ldpc_dec.cb_params.e,
> + same_op);
> + ret = enqueue_ldpc_dec_one_op_cb(q, ops[i], i, same_op);
> + if (ret < 0)
> + break;
> + }
> +
> + if (unlikely(i == 0))
> + return 0; /* Nothing to enqueue */
> +
> + /* Set SDone in last CB in enqueued ops for CB mode */
> + desc = q->ring_addr + ((q->sw_ring_head + i - 1)
> + & q->sw_ring_wrap_mask);
> +
> + desc->req.sdone_enable = 1;
> + desc->req.irq_enable = q->irq_enable;
> +
> + acc100_dma_enqueue(q, i, &q_data->queue_stats);
> +
> + /* Update stats */
> + q_data->queue_stats.enqueued_count += i;
> + q_data->queue_stats.enqueue_err_count += num - i;
> + return i;
> +}
> +
> +/* Enqueue decode operations for ACC100 device. */
> +static uint16_t
> +acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
> + struct rte_bbdev_dec_op **ops, uint16_t num)
> +{
> + struct acc100_queue *q = q_data->queue_private;
> + int32_t aq_avail = q->aq_depth +
> + (q->aq_dequeued - q->aq_enqueued) / 128;
> +
> + if (unlikely((aq_avail == 0) || (num == 0)))
> + return 0;
> +
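> + /* code_block_mode of 0 selects transport block mode, 1 code block mode */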
> + if (ops[0]->ldpc_dec.code_block_mode == 0)
> + return acc100_enqueue_ldpc_dec_tb(q_data, ops, num);
> + else
> + return acc100_enqueue_ldpc_dec_cb(q_data, ops, num);
> +}
> +
> +
> +/* Dequeue one encode operation from ACC100 device in CB mode */
> +static inline int
> +dequeue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
> + uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
> +{
> + union acc100_dma_desc *desc, atom_desc;
> + union acc100_dma_rsp_desc rsp;
> + struct rte_bbdev_enc_op *op;
> + int i;
> +
> + desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
> + & q->sw_ring_wrap_mask);
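> + /* Load the 64-bit response header atomically: HW may update it concurrently */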
> + atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> + __ATOMIC_RELAXED);
> +
> + /* Check fdone bit */
> + if (!(atom_desc.rsp.val & ACC100_FDONE))
> + return -1;
> +
> + rsp.val = atom_desc.rsp.val;
> + rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
> +
> + /* Dequeue */
> + op = desc->req.op_addr;
> +
> + /* Clearing status, it will be set based on response */
> + op->status = 0;
> +
> + op->status |= ((rsp.input_err)
> + ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> + op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> + op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +
> + if (desc->req.last_desc_in_batch) {
> + (*aq_dequeued)++;
> + desc->req.last_desc_in_batch = 0;
> + }
> + desc->rsp.val = ACC100_DMA_DESC_TYPE;
> + desc->rsp.add_info_0 = 0; /*Reserved bits */
> + desc->rsp.add_info_1 = 0; /*Reserved bits */
> +
> + /* Flag that the muxing cause loss of opaque data */
> + op->opaque_data = (void *)-1;
> + for (i = 0 ; i < desc->req.numCBs; i++)
> + ref_op[i] = op;
> +
> + /* One op (covering numCBs muxed CBs) was successfully dequeued */
> + return desc->req.numCBs;
> +}
> +
> +/* Dequeue one encode operation from ACC100 device in TB mode */
> +static inline int
> +dequeue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
> + uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
> +{
> + union acc100_dma_desc *desc, *last_desc, atom_desc;
> + union acc100_dma_rsp_desc rsp;
> + struct rte_bbdev_enc_op *op;
> + uint8_t i = 0;
> + uint16_t current_dequeued_cbs = 0, cbs_in_tb;
> +
> + desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
> + & q->sw_ring_wrap_mask);
> + atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> + __ATOMIC_RELAXED);
> +
> + /* Check fdone bit */
> + if (!(atom_desc.rsp.val & ACC100_FDONE))
> + return -1;
> +
> + /* Get number of CBs in dequeued TB */
> + cbs_in_tb = desc->req.cbs_in_tb;
> + /* Get last CB */
> + last_desc = q->ring_addr + ((q->sw_ring_tail
> + + total_dequeued_cbs + cbs_in_tb - 1)
> + & q->sw_ring_wrap_mask);
> + /* Check if last CB in TB is ready to dequeue (and thus
> + * the whole TB) - checking sdone bit. If not return.
> + */
> + atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
> + __ATOMIC_RELAXED);
> + if (!(atom_desc.rsp.val & ACC100_SDONE))
> + return -1;
> +
> + /* Dequeue */
> + op = desc->req.op_addr;
> +
> + /* Clearing status, it will be set based on response */
> + op->status = 0;
> +
> + while (i < cbs_in_tb) {
> + desc = q->ring_addr + ((q->sw_ring_tail
> + + total_dequeued_cbs)
> + & q->sw_ring_wrap_mask);
> + atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> + __ATOMIC_RELAXED);
> + rsp.val = atom_desc.rsp.val;
> + rte_bbdev_log_debug("Resp. desc %p: %x", desc,
> + rsp.val);
> +
> + op->status |= ((rsp.input_err)
> + ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> + op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> + op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +
> + if (desc->req.last_desc_in_batch) {
> + (*aq_dequeued)++;
> + desc->req.last_desc_in_batch = 0;
> + }
> + desc->rsp.val = ACC100_DMA_DESC_TYPE;
> + desc->rsp.add_info_0 = 0;
> + desc->rsp.add_info_1 = 0;
> + total_dequeued_cbs++;
> + current_dequeued_cbs++;
> + i++;
> + }
> +
> + *ref_op = op;
> +
> + return current_dequeued_cbs;
> +}
> +
> +/* Dequeue one decode operation from ACC100 device in CB mode */
> +static inline int
> +dequeue_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
> + struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
> + uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> +{
> + union acc100_dma_desc *desc, atom_desc;
> + union acc100_dma_rsp_desc rsp;
> + struct rte_bbdev_dec_op *op;
> +
> + desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> + & q->sw_ring_wrap_mask);
> + atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> + __ATOMIC_RELAXED);
> +
> + /* Check fdone bit */
> + if (!(atom_desc.rsp.val & ACC100_FDONE))
> + return -1;
> +
> + rsp.val = atom_desc.rsp.val;
> + rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
> +
> + /* Dequeue */
> + op = desc->req.op_addr;
> +
> + /* Clearing status, it will be set based on response */
> + op->status = 0;
> + op->status |= ((rsp.input_err)
> + ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> + op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> + op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> + if (op->status != 0)
> + q_data->queue_stats.dequeue_err_count++;
> +
> + /* CRC invalid if error exists */
> + if (!op->status)
> + op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
> + op->turbo_dec.iter_count = (uint8_t) rsp.iter_cnt / 2;
> + /* Check if this is the last desc in batch (Atomic Queue) */
> + if (desc->req.last_desc_in_batch) {
> + (*aq_dequeued)++;
> + desc->req.last_desc_in_batch = 0;
> + }
> + desc->rsp.val = ACC100_DMA_DESC_TYPE;
> + desc->rsp.add_info_0 = 0;
> + desc->rsp.add_info_1 = 0;
> + *ref_op = op;
> +
> + /* One CB (op) was successfully dequeued */
> + return 1;
> +}
> +
> +/* Dequeue one decode operation from ACC100 device in CB mode */
> +static inline int
> +dequeue_ldpc_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
> + struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
> + uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> +{
> + union acc100_dma_desc *desc, atom_desc;
> + union acc100_dma_rsp_desc rsp;
> + struct rte_bbdev_dec_op *op;
> +
> + desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> + & q->sw_ring_wrap_mask);
> + atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> + __ATOMIC_RELAXED);
> +
> + /* Check fdone bit */
> + if (!(atom_desc.rsp.val & ACC100_FDONE))
> + return -1;
> +
> + rsp.val = atom_desc.rsp.val;
> +
> + /* Dequeue */
> + op = desc->req.op_addr;
> +
> + /* Clearing status, it will be set based on response */
> + op->status = 0;
> + op->status |= rsp.input_err << RTE_BBDEV_DATA_ERROR;
> + op->status |= rsp.dma_err << RTE_BBDEV_DRV_ERROR;
> + op->status |= rsp.fcw_err << RTE_BBDEV_DRV_ERROR;
> + if (op->status != 0)
> + q_data->queue_stats.dequeue_err_count++;
> +
> + op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
> + if (op->ldpc_dec.hard_output.length > 0 && !rsp.synd_ok)
> + op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
> + op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
> +
> + /* Check if this is the last desc in batch (Atomic Queue) */
> + if (desc->req.last_desc_in_batch) {
> + (*aq_dequeued)++;
> + desc->req.last_desc_in_batch = 0;
> + }
> +
> + desc->rsp.val = ACC100_DMA_DESC_TYPE;
> + desc->rsp.add_info_0 = 0;
> + desc->rsp.add_info_1 = 0;
> +
> + *ref_op = op;
> +
> + /* One CB (op) was successfully dequeued */
> + return 1;
> +}
> +
> +/* Dequeue one decode operation from ACC100 device in TB mode. */
> +static inline int
> +dequeue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
> + uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> +{
> + union acc100_dma_desc *desc, *last_desc, atom_desc;
> + union acc100_dma_rsp_desc rsp;
> + struct rte_bbdev_dec_op *op;
> + uint8_t cbs_in_tb = 1, cb_idx = 0;
> +
> + desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> + & q->sw_ring_wrap_mask);
> + atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> + __ATOMIC_RELAXED);
> +
> + /* Check fdone bit */
> + if (!(atom_desc.rsp.val & ACC100_FDONE))
> + return -1;
> +
> + /* Dequeue */
> + op = desc->req.op_addr;
> +
> + /* Get number of CBs in dequeued TB */
> + cbs_in_tb = desc->req.cbs_in_tb;
> + /* Get last CB */
> + last_desc = q->ring_addr + ((q->sw_ring_tail
> + + dequeued_cbs + cbs_in_tb - 1)
> + & q->sw_ring_wrap_mask);
> + /* Check if last CB in TB is ready to dequeue (and thus
> + * the whole TB) - checking sdone bit. If not return.
> + */
> + atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
> + __ATOMIC_RELAXED);
> + if (!(atom_desc.rsp.val & ACC100_SDONE))
> + return -1;
> +
> + /* Clearing status, it will be set based on response */
> + op->status = 0;
> +
> + /* Read the remaining CBs, if any exist */
> + while (cb_idx < cbs_in_tb) {
> + desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> + & q->sw_ring_wrap_mask);
> + atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> + __ATOMIC_RELAXED);
> + rsp.val = atom_desc.rsp.val;
> + rte_bbdev_log_debug("Resp. desc %p: %x", desc,
> + rsp.val);
> +
> + op->status |= ((rsp.input_err)
> + ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> + op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> + op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +
> + /* CRC invalid if error exists */
> + if (!op->status)
> + op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
> + op->turbo_dec.iter_count = RTE_MAX((uint8_t) rsp.iter_cnt,
> + op->turbo_dec.iter_count);
> +
> + /* Check if this is the last desc in batch (Atomic Queue) */
> + if (desc->req.last_desc_in_batch) {
> + (*aq_dequeued)++;
> + desc->req.last_desc_in_batch = 0;
> + }
> + desc->rsp.val = ACC100_DMA_DESC_TYPE;
> + desc->rsp.add_info_0 = 0;
> + desc->rsp.add_info_1 = 0;
> + dequeued_cbs++;
> + cb_idx++;
> + }
> +
> + *ref_op = op;
> +
> + return cb_idx;
> +}
> +
> +/* Dequeue LDPC encode operations from ACC100 device. */
> +static uint16_t
> +acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
> + struct rte_bbdev_enc_op **ops, uint16_t num)
> +{
> + struct acc100_queue *q = q_data->queue_private;
> + uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
> + uint32_t aq_dequeued = 0;
> + uint16_t dequeue_num, i, dequeued_cbs = 0, dequeued_descs = 0;
> + int ret;
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> + if (unlikely(ops == NULL || q == NULL))
> + return 0;
> +#endif
> +
> + dequeue_num = (avail < num) ? avail : num;
> +
> + for (i = 0; i < dequeue_num; i++) {
> + ret = dequeue_enc_one_op_cb(q, &ops[dequeued_cbs],
> + dequeued_descs, &aq_dequeued);
> + if (ret < 0)
> + break;
> + dequeued_cbs += ret;
> + dequeued_descs++;
> + if (dequeued_cbs >= num)
> + break;
> + }
> +
> + q->aq_dequeued += aq_dequeued;
> + q->sw_ring_tail += dequeued_descs;
> +
> + /* Update dequeue stats */
> + q_data->queue_stats.dequeued_count += dequeued_cbs;
> +
> + return dequeued_cbs;
> +}
> +
> +/* Dequeue decode operations from ACC100 device. */
> +static uint16_t
> +acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
> + struct rte_bbdev_dec_op **ops, uint16_t num)
> +{
> + struct acc100_queue *q = q_data->queue_private;
> + uint16_t dequeue_num;
> + uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
> + uint32_t aq_dequeued = 0;
> + uint16_t i;
> + uint16_t dequeued_cbs = 0;
> + struct rte_bbdev_dec_op *op;
> + int ret;
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> + if (unlikely(ops == NULL || q == NULL))
> + return 0;
> +#endif
> +
> + dequeue_num = (avail < num) ? avail : num;
> +
> + for (i = 0; i < dequeue_num; ++i) {
> + op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> + & q->sw_ring_wrap_mask))->req.op_addr;
> + if (op->ldpc_dec.code_block_mode == 0)
> + ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
> + &aq_dequeued);
> + else
> + ret = dequeue_ldpc_dec_one_op_cb(
> + q_data, q, &ops[i], dequeued_cbs,
> + &aq_dequeued);
> +
> + if (ret < 0)
> + break;
> + dequeued_cbs += ret;
> + }
> +
> + q->aq_dequeued += aq_dequeued;
> + q->sw_ring_tail += dequeued_cbs;
> +
> + /* Update dequeue stats */
> + q_data->queue_stats.dequeued_count += i;
> +
> + return i;
> +}
> +
> /* Initialization Function */
> static void
> acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
> @@ -703,6 +2321,10 @@
> struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
>
> dev->dev_ops = &acc100_bbdev_ops;
> + dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
> + dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
> + dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
> + dev->dequeue_ldpc_dec_ops = acc100_dequeue_ldpc_dec;
>
> ((struct acc100_device *) dev->data->dev_private)->pf_device =
> !strcmp(drv->driver.name,
> @@ -815,4 +2437,3 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
> RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
> RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
> RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
> -
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
> index 0e2b79c..78686c1 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> @@ -88,6 +88,8 @@
> #define TMPL_PRI_3 0x0f0e0d0c
> #define QUEUE_ENABLE 0x80000000 /* Bit to mark Queue as Enabled */
> #define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
> +#define ACC100_FDONE 0x80000000
> +#define ACC100_SDONE 0x40000000
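> +/* FDONE flags an individual completed descriptor, SDONE the completed
> + * last descriptor of a set (CB batch or TB)
> + */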
>
> #define ACC100_NUM_TMPL 32
> #define VF_OFFSET_QOS 16 /* offset in Memory Space specific to QoS Mon */
> @@ -398,6 +400,7 @@ struct __rte_packed acc100_dma_req_desc {
> union acc100_dma_desc {
> struct acc100_dma_req_desc req;
> union acc100_dma_rsp_desc rsp;
> + uint64_t atom_hdr;
> };
>
>
> --
> 1.8.3.1
^ permalink raw reply [flat|nested] 213+ messages in thread
* Re: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions
2020-08-20 14:52 ` Chautru, Nicolas
@ 2020-08-20 14:57 ` Dave Burley
2020-08-20 21:05 ` Chautru, Nicolas
0 siblings, 1 reply; 213+ messages in thread
From: Dave Burley @ 2020-08-20 14:57 UTC (permalink / raw)
To: Chautru, Nicolas, dev; +Cc: Richardson, Bruce
Hi Nic
Thank you - it would be useful to have further documentation for clarification as the data format isn't explicitly documented in BBDEV.
Best Regards
Dave
From: Chautru, Nicolas <nicolas.chautru@intel.com>
Sent: 20 August 2020 15:52
To: Dave Burley <dave.burley@accelercomm.com>; dev@dpdk.org <dev@dpdk.org>
Cc: Richardson, Bruce <bruce.richardson@intel.com>
Subject: RE: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions
Hi Dave,
This is assuming 6-bit LLR compression packing (i.e. the first 2 MSBs are dropped). Similar to HARQ compression.
Let me know if this is unclear; I can clarify further in the documentation if it is not explicit enough.
Thanks
Nic
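For illustration, a minimal sketch of what that packing implies, assuming 8-bit LLRs truncated to their 6 LSBs and packed MSB-first so that four 6-bit values occupy three bytes; the helper name and bit order are assumptions made here for illustration, not part of the PMD or the bbdev API:

#include <stddef.h>
#include <stdint.h>

/* Illustrative only: pack n 8-bit LLRs into 6-bit fields, dropping the two
 * MSBs of each value. Four inputs produce three output bytes, matching the
 * (len * 3 + 3) / 4 sizing used by the driver.
 */
static void
pack_llrs_6bit(const int8_t *in, uint8_t *out, size_t n)
{
    uint32_t acc = 0;      /* bit accumulator */
    unsigned int bits = 0; /* pending bits held in acc */
    size_t i, o = 0;

    for (i = 0; i < n; i++) {
        acc = (acc << 6) | ((uint8_t)in[i] & 0x3F); /* keep low 6 bits */
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            out[o++] = (uint8_t)(acc >> bits); /* emit the top full byte */
        }
    }
    /* Assumes n is a multiple of 4 so no residual bits remain */
}

Whether the device consumes the 6-bit fields MSB-first or LSB-first within each byte is not stated in the thread, so treat the bit order above as an assumption.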
> -----Original Message-----
> From: Dave Burley <dave.burley@accelercomm.com>
> Sent: Thursday, August 20, 2020 7:39 AM
> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org
> Cc: Richardson, Bruce <bruce.richardson@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions
>
> Hi Nic,
>
> As you've now specified the use of RTE_BBDEV_LDPC_LLR_COMPRESSION for
> this PMD, please could you confirm what the packed format of the LLRs in
> memory looks like?
>
> Best Regards
>
> Dave Burley
>
>
> From: dev <dev-bounces@dpdk.org> on behalf of Nicolas Chautru
> <nicolas.chautru@intel.com>
> Sent: 19 August 2020 01:25
> To: dev@dpdk.org <dev@dpdk.org>; akhil.goyal@nxp.com
> <akhil.goyal@nxp.com>
> Cc: bruce.richardson@intel.com <bruce.richardson@intel.com>; Nicolas
> Chautru <nicolas.chautru@intel.com>
> Subject: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions
>
> Adding LDPC decode and encode processing operations
>
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
> drivers/baseband/acc100/rte_acc100_pmd.c | 1625 +++++++++++++++++++++++++++++-
> drivers/baseband/acc100/rte_acc100_pmd.h | 3 +
> 2 files changed, 1626 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
> index 7a21c57..5f32813 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> @@ -15,6 +15,9 @@
> #include <rte_hexdump.h>
> #include <rte_pci.h>
> #include <rte_bus_pci.h>
> +#ifdef RTE_BBDEV_OFFLOAD_COST
> +#include <rte_cycles.h>
> +#endif
>
> #include <rte_bbdev.h>
> #include <rte_bbdev_pmd.h>
> @@ -449,7 +452,6 @@
> return 0;
> }
>
> -
> /**
> * Report an ACC100 queue index which is free
> * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
> @@ -634,6 +636,46 @@
> struct acc100_device *d = dev->data->dev_private;
>
> static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
> + {
> + .type = RTE_BBDEV_OP_LDPC_ENC,
> + .cap.ldpc_enc = {
> + .capability_flags =
> + RTE_BBDEV_LDPC_RATE_MATCH |
> + RTE_BBDEV_LDPC_CRC_24B_ATTACH |
> + RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
> + .num_buffers_src =
> + RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> + .num_buffers_dst =
> + RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> + }
> + },
> + {
> + .type = RTE_BBDEV_OP_LDPC_DEC,
> + .cap.ldpc_dec = {
> + .capability_flags =
> + RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK |
> + RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP |
> + RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
> + RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
> +#ifdef ACC100_EXT_MEM
> + RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
> + RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
> +#endif
> + RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE |
> + RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS |
> + RTE_BBDEV_LDPC_DECODE_BYPASS |
> + RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
> + RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
> + RTE_BBDEV_LDPC_LLR_COMPRESSION,
> + .llr_size = 8,
> + .llr_decimals = 1,
> + .num_buffers_src =
> + RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> + .num_buffers_hard_out =
> + RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> + .num_buffers_soft_out = 0,
> + }
> + },
> RTE_BBDEV_END_OF_CAPABILITIES_LIST()
> };
>
> @@ -669,9 +711,14 @@
> dev_info->cpu_flag_reqs = NULL;
> dev_info->min_alignment = 64;
> dev_info->capabilities = bbdev_capabilities;
> +#ifdef ACC100_EXT_MEM
> dev_info->harq_buffer_size = d->ddr_size;
> +#else
> + dev_info->harq_buffer_size = 0;
> +#endif
> }
>
> +
> static const struct rte_bbdev_ops acc100_bbdev_ops = {
> .setup_queues = acc100_setup_queues,
> .close = acc100_dev_close,
> @@ -696,6 +743,1577 @@
> {.device_id = 0},
> };
>
> +/* Read flag value 0/1 from bitmap */
> +static inline bool
> +check_bit(uint32_t bitmap, uint32_t bitmask)
> +{
> + return bitmap & bitmask;
> +}
> +
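> +/* Grow the mbuf data length bookkeeping by len bytes without copying and
> + * return a pointer to the appended tail region (NULL if tailroom is short)
> + */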
> +static inline char *
> +mbuf_append(struct rte_mbuf *m_head, struct rte_mbuf *m, uint16_t len)
> +{
> + if (unlikely(len > rte_pktmbuf_tailroom(m)))
> + return NULL;
> +
> + char *tail = (char *)m->buf_addr + m->data_off + m->data_len;
> + m->data_len = (uint16_t)(m->data_len + len);
> + m_head->pkt_len = (m_head->pkt_len + len);
> + return tail;
> +}
> +
> +/* Compute value of k0.
> + * Based on 3GPP 38.212 Table 5.4.2.1-2
> + * Starting position of different redundancy versions, k0
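> + * For example, assuming the K0_* constants carry the Table 5.4.2.1-2
> + * numerators (e.g. K0_2_1 == 33): for BG1 with a full circular buffer
> + * (n_cb == 66 * Zc), rv_index 2 starts at k0 = 33 * Zc. The LBRM branch
> + * scales the same numerator by n_cb / N before multiplying by Zc.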
> + */
> +static inline uint16_t
> +get_k0(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv_index)
> +{
> + if (rv_index == 0)
> + return 0;
> + uint16_t n = (bg == 1 ? N_ZC_1 : N_ZC_2) * z_c;
> + if (n_cb == n) {
> + if (rv_index == 1)
> + return (bg == 1 ? K0_1_1 : K0_1_2) * z_c;
> + else if (rv_index == 2)
> + return (bg == 1 ? K0_2_1 : K0_2_2) * z_c;
> + else
> + return (bg == 1 ? K0_3_1 : K0_3_2) * z_c;
> + }
> + /* LBRM case - includes a division by N */
> + if (rv_index == 1)
> + return (((bg == 1 ? K0_1_1 : K0_1_2) * n_cb)
> + / n) * z_c;
> + else if (rv_index == 2)
> + return (((bg == 1 ? K0_2_1 : K0_2_2) * n_cb)
> + / n) * z_c;
> + else
> + return (((bg == 1 ? K0_3_1 : K0_3_2) * n_cb)
> + / n) * z_c;
> +}
> +
> +/* Fill in a frame control word for LDPC encoding. */
> +static inline void
> +acc100_fcw_le_fill(const struct rte_bbdev_enc_op *op,
> + struct acc100_fcw_le *fcw, int num_cb)
> +{
> + fcw->qm = op->ldpc_enc.q_m;
> + fcw->nfiller = op->ldpc_enc.n_filler;
> + fcw->BG = (op->ldpc_enc.basegraph - 1);
> + fcw->Zc = op->ldpc_enc.z_c;
> + fcw->ncb = op->ldpc_enc.n_cb;
> + fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_enc.basegraph,
> + op->ldpc_enc.rv_index);
> + fcw->rm_e = op->ldpc_enc.cb_params.e;
> + fcw->crc_select = check_bit(op->ldpc_enc.op_flags,
> + RTE_BBDEV_LDPC_CRC_24B_ATTACH);
> + fcw->bypass_intlv = check_bit(op->ldpc_enc.op_flags,
> + RTE_BBDEV_LDPC_INTERLEAVER_BYPASS);
> + fcw->mcb_count = num_cb;
> +}
> +
> +/* Fill in a frame control word for LDPC decoding. */
> +static inline void
> +acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
> + union acc100_harq_layout_data *harq_layout)
> +{
> + uint16_t harq_out_length, harq_in_length, ncb_p, k0_p, parity_offset;
> + uint16_t harq_index;
> + uint32_t l;
> + bool harq_prun = false;
> +
> + fcw->qm = op->ldpc_dec.q_m;
> + fcw->nfiller = op->ldpc_dec.n_filler;
> + fcw->BG = (op->ldpc_dec.basegraph - 1);
> + fcw->Zc = op->ldpc_dec.z_c;
> + fcw->ncb = op->ldpc_dec.n_cb;
> + fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_dec.basegraph,
> + op->ldpc_dec.rv_index);
> + if (op->ldpc_dec.code_block_mode == 1)
> + fcw->rm_e = op->ldpc_dec.cb_params.e;
> + else
> + fcw->rm_e = (op->ldpc_dec.tb_params.r <
> + op->ldpc_dec.tb_params.cab) ?
> + op->ldpc_dec.tb_params.ea :
> + op->ldpc_dec.tb_params.eb;
> +
> + fcw->hcin_en = check_bit(op->ldpc_dec.op_flags,
> + RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE);
> + fcw->hcout_en = check_bit(op->ldpc_dec.op_flags,
> + RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE);
> + fcw->crc_select = check_bit(op->ldpc_dec.op_flags,
> + RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK);
> + fcw->bypass_dec = check_bit(op->ldpc_dec.op_flags,
> + RTE_BBDEV_LDPC_DECODE_BYPASS);
> + fcw->bypass_intlv = check_bit(op->ldpc_dec.op_flags,
> + RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS);
> + if (op->ldpc_dec.q_m == 1) {
> + fcw->bypass_intlv = 1;
> + fcw->qm = 2;
> + }
> + fcw->hcin_decomp_mode = check_bit(op->ldpc_dec.op_flags,
> + RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> + fcw->hcout_comp_mode = check_bit(op->ldpc_dec.op_flags,
> + RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> + fcw->llr_pack_mode = check_bit(op->ldpc_dec.op_flags,
> + RTE_BBDEV_LDPC_LLR_COMPRESSION);
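> + /* The HARQ layout table is indexed by the combined output offset
> + * in units of ACC100_HARQ_OFFSET
> + */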
> + harq_index = op->ldpc_dec.harq_combined_output.offset /
> + ACC100_HARQ_OFFSET;
> +#ifdef ACC100_EXT_MEM
> + /* Limit cases when HARQ pruning is valid */
> + harq_prun = ((op->ldpc_dec.harq_combined_output.offset %
> + ACC100_HARQ_OFFSET) == 0) &&
> + (op->ldpc_dec.harq_combined_output.offset <= UINT16_MAX * ACC100_HARQ_OFFSET);
> +#endif
> + if (fcw->hcin_en > 0) {
> + harq_in_length = op->ldpc_dec.harq_combined_input.length;
> + if (fcw->hcin_decomp_mode > 0)
> + harq_in_length = harq_in_length * 8 / 6;
> + harq_in_length = RTE_ALIGN(harq_in_length, 64);
> + if ((harq_layout[harq_index].offset > 0) & harq_prun) {
> + rte_bbdev_log_debug("HARQ IN offset unexpected for now\n");
> + fcw->hcin_size0 = harq_layout[harq_index].size0;
> + fcw->hcin_offset = harq_layout[harq_index].offset;
> + fcw->hcin_size1 = harq_in_length -
> + harq_layout[harq_index].offset;
> + } else {
> + fcw->hcin_size0 = harq_in_length;
> + fcw->hcin_offset = 0;
> + fcw->hcin_size1 = 0;
> + }
> + } else {
> + fcw->hcin_size0 = 0;
> + fcw->hcin_offset = 0;
> + fcw->hcin_size1 = 0;
> + }
> +
> + fcw->itmax = op->ldpc_dec.iter_max;
> + fcw->itstop = check_bit(op->ldpc_dec.op_flags,
> + RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE);
> + fcw->synd_precoder = fcw->itstop;
> + /*
> + * These are all implicitly set
> + * fcw->synd_post = 0;
> + * fcw->so_en = 0;
> + * fcw->so_bypass_rm = 0;
> + * fcw->so_bypass_intlv = 0;
> + * fcw->dec_convllr = 0;
> + * fcw->hcout_convllr = 0;
> + * fcw->hcout_size1 = 0;
> + * fcw->so_it = 0;
> + * fcw->hcout_offset = 0;
> + * fcw->negstop_th = 0;
> + * fcw->negstop_it = 0;
> + * fcw->negstop_en = 0;
> + * fcw->gain_i = 1;
> + * fcw->gain_h = 1;
> + */
> + if (fcw->hcout_en > 0) {
> + parity_offset = (op->ldpc_dec.basegraph == 1 ? 20 : 8)
> + * op->ldpc_dec.z_c - op->ldpc_dec.n_filler;
> + k0_p = (fcw->k0 > parity_offset) ?
> + fcw->k0 - op->ldpc_dec.n_filler : fcw->k0;
> + ncb_p = fcw->ncb - op->ldpc_dec.n_filler;
> + l = k0_p + fcw->rm_e;
> + harq_out_length = (uint16_t) fcw->hcin_size0;
> + harq_out_length = RTE_MIN(RTE_MAX(harq_out_length, l), ncb_p);
> + harq_out_length = (harq_out_length + 0x3F) & 0xFFC0;
> + if ((k0_p > fcw->hcin_size0 + ACC100_HARQ_OFFSET_THRESHOLD) && harq_prun) {
> + fcw->hcout_size0 = (uint16_t) fcw->hcin_size0;
> + fcw->hcout_offset = k0_p & 0xFFC0;
> + fcw->hcout_size1 = harq_out_length - fcw->hcout_offset;
> + } else {
> + fcw->hcout_size0 = harq_out_length;
> + fcw->hcout_size1 = 0;
> + fcw->hcout_offset = 0;
> + }
> + harq_layout[harq_index].offset = fcw->hcout_offset;
> + harq_layout[harq_index].size0 = fcw->hcout_size0;
> + } else {
> + fcw->hcout_size0 = 0;
> + fcw->hcout_size1 = 0;
> + fcw->hcout_offset = 0;
> + }
> +}
> +
> +/**
> + * Fills descriptor with data pointers of one block type.
> + *
> + * @param desc
> + * Pointer to DMA descriptor.
> + * @param input
> + * Pointer to pointer to input data which will be encoded. It can be changed
> + * and points to next segment in scatter-gather case.
> + * @param offset
> + * Input offset in rte_mbuf structure. It is used for calculating the point
> + * where the data starts.
> + * @param cb_len
> + * Length of currently processed Code Block
> + * @param seg_total_left
> + * It indicates how many bytes are still left in the segment (mbuf) for
> + * further processing.
> + * @param op_flags
> + * Store information about device capabilities
> + * @param next_triplet
> + * Index for ACC100 DMA Descriptor triplet
> + *
> + * @return
> + * Returns index of next triplet on success, other value if lengths of
> + * pkt and processed cb do not match.
> + *
> + */
> +static inline int
> +acc100_dma_fill_blk_type_in(struct acc100_dma_req_desc *desc,
> + struct rte_mbuf **input, uint32_t *offset, uint32_t cb_len,
> + uint32_t *seg_total_left, int next_triplet)
> +{
> + uint32_t part_len;
> + struct rte_mbuf *m = *input;
> +
> + part_len = (*seg_total_left < cb_len) ? *seg_total_left : cb_len;
> + cb_len -= part_len;
> + *seg_total_left -= part_len;
> +
> + desc->data_ptrs[next_triplet].address =
> + rte_pktmbuf_iova_offset(m, *offset);
> + desc->data_ptrs[next_triplet].blen = part_len;
> + desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
> + desc->data_ptrs[next_triplet].last = 0;
> + desc->data_ptrs[next_triplet].dma_ext = 0;
> + *offset += part_len;
> + next_triplet++;
> +
> + while (cb_len > 0) {
> + if (next_triplet < ACC100_DMA_MAX_NUM_POINTERS &&
> + m->next != NULL) {
> +
> + m = m->next;
> + *seg_total_left = rte_pktmbuf_data_len(m);
> + part_len = (*seg_total_left < cb_len) ?
> + *seg_total_left :
> + cb_len;
> + desc->data_ptrs[next_triplet].address =
> + rte_pktmbuf_mtophys(m);
> + desc->data_ptrs[next_triplet].blen = part_len;
> + desc->data_ptrs[next_triplet].blkid =
> + ACC100_DMA_BLKID_IN;
> + desc->data_ptrs[next_triplet].last = 0;
> + desc->data_ptrs[next_triplet].dma_ext = 0;
> + cb_len -= part_len;
> + *seg_total_left -= part_len;
> + /* Initializing offset for next segment (mbuf) */
> + *offset = part_len;
> + next_triplet++;
> + } else {
> + rte_bbdev_log(ERR,
> + "Some data still left for processing: "
> + "data_left: %u, next_triplet: %u, next_mbuf: %p",
> + cb_len, next_triplet, m->next);
> + return -EINVAL;
> + }
> + }
> + /* Storing new mbuf as it could be changed in scatter-gather case */
> + *input = m;
> +
> + return next_triplet;
> +}
> +
> +/* Fills descriptor with data pointers of one block type.
> + * Returns index of next triplet on success, other value if lengths of
> + * output data and processed mbuf do not match.
> + */
> +static inline int
> +acc100_dma_fill_blk_type_out(struct acc100_dma_req_desc *desc,
> + struct rte_mbuf *output, uint32_t out_offset,
> + uint32_t output_len, int next_triplet, int blk_id)
> +{
> + desc->data_ptrs[next_triplet].address =
> + rte_pktmbuf_iova_offset(output, out_offset);
> + desc->data_ptrs[next_triplet].blen = output_len;
> + desc->data_ptrs[next_triplet].blkid = blk_id;
> + desc->data_ptrs[next_triplet].last = 0;
> + desc->data_ptrs[next_triplet].dma_ext = 0;
> + next_triplet++;
> +
> + return next_triplet;
> +}
> +
> +static inline int
> +acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
> + struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
> + struct rte_mbuf *output, uint32_t *in_offset,
> + uint32_t *out_offset, uint32_t *out_length,
> + uint32_t *mbuf_total_left, uint32_t *seg_total_left)
> +{
> + int next_triplet = 1; /* FCW already done */
> + uint16_t K, in_length_in_bits, in_length_in_bytes;
> + struct rte_bbdev_op_ldpc_enc *enc = &op->ldpc_enc;
> +
> + desc->word0 = ACC100_DMA_DESC_TYPE;
> + desc->word1 = 0; /**< Timestamp could be disabled */
> + desc->word2 = 0;
> + desc->word3 = 0;
> + desc->numCBs = 1;
> +
> + K = (enc->basegraph == 1 ? 22 : 10) * enc->z_c;
> + in_length_in_bits = K - enc->n_filler;
> + if ((enc->op_flags & RTE_BBDEV_LDPC_CRC_24A_ATTACH) ||
> + (enc->op_flags & RTE_BBDEV_LDPC_CRC_24B_ATTACH))
> + in_length_in_bits -= 24;
> + in_length_in_bytes = in_length_in_bits >> 3;
> +
> + if (unlikely((*mbuf_total_left == 0) ||
> + (*mbuf_total_left < in_length_in_bytes))) {
> + rte_bbdev_log(ERR,
> + "Mismatch between mbuf length and included CB sizes:
> mbuf len %u, cb len %u",
> + *mbuf_total_left, in_length_in_bytes);
> + return -1;
> + }
> +
> + next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
> + in_length_in_bytes,
> + seg_total_left, next_triplet);
> + if (unlikely(next_triplet < 0)) {
> + rte_bbdev_log(ERR,
> + "Mismatch between data to process and mbuf data length
> in bbdev_op: %p",
> + op);
> + return -1;
> + }
> + desc->data_ptrs[next_triplet - 1].last = 1;
> + desc->m2dlen = next_triplet;
> + *mbuf_total_left -= in_length_in_bytes;
> +
> + /* Set output length */
> + /* Integer round up division by 8 */
> + *out_length = (enc->cb_params.e + 7) >> 3;
> +
> + next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
> + *out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
> + if (unlikely(next_triplet < 0)) {
> + rte_bbdev_log(ERR,
> + "Mismatch between data to process and mbuf data length
> in bbdev_op: %p",
> + op);
> + return -1;
> + }
> + op->ldpc_enc.output.length += *out_length;
> + *out_offset += *out_length;
> + desc->data_ptrs[next_triplet - 1].last = 1;
> + desc->data_ptrs[next_triplet - 1].dma_ext = 0;
> + desc->d2mlen = next_triplet - desc->m2dlen;
> +
> + desc->op_addr = op;
> +
> + return 0;
> +}
> +
> +static inline int
> +acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
> + struct acc100_dma_req_desc *desc,
> + struct rte_mbuf **input, struct rte_mbuf *h_output,
> + uint32_t *in_offset, uint32_t *h_out_offset,
> + uint32_t *h_out_length, uint32_t *mbuf_total_left,
> + uint32_t *seg_total_left,
> + struct acc100_fcw_ld *fcw)
> +{
> + struct rte_bbdev_op_ldpc_dec *dec = &op->ldpc_dec;
> + int next_triplet = 1; /* FCW already done */
> + uint32_t input_length;
> + uint16_t output_length, crc24_overlap = 0;
> + uint16_t sys_cols, K, h_p_size, h_np_size;
> + bool h_comp = check_bit(dec->op_flags,
> + RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> +
> + desc->word0 = ACC100_DMA_DESC_TYPE;
> + desc->word1 = 0; /**< Timestamp could be disabled */
> + desc->word2 = 0;
> + desc->word3 = 0;
> + desc->numCBs = 1;
> +
> + if (check_bit(op->ldpc_dec.op_flags,
> + RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP))
> + crc24_overlap = 24;
> +
> + /* Compute some LDPC BG lengths */
> + input_length = dec->cb_params.e;
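> + /* With 6-bit LLR compression, four LLRs pack into three bytes */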
> + if (check_bit(op->ldpc_dec.op_flags,
> + RTE_BBDEV_LDPC_LLR_COMPRESSION))
> + input_length = (input_length * 3 + 3) / 4;
> + sys_cols = (dec->basegraph == 1) ? 22 : 10;
> + K = sys_cols * dec->z_c;
> + output_length = K - dec->n_filler - crc24_overlap;
> +
> + if (unlikely((*mbuf_total_left == 0) ||
> + (*mbuf_total_left < input_length))) {
> + rte_bbdev_log(ERR,
> + "Mismatch between mbuf length and included CB sizes:
> mbuf len %u, cb len %u",
> + *mbuf_total_left, input_length);
> + return -1;
> + }
> +
> + next_triplet = acc100_dma_fill_blk_type_in(desc, input,
> + in_offset, input_length,
> + seg_total_left, next_triplet);
> +
> + if (unlikely(next_triplet < 0)) {
> + rte_bbdev_log(ERR,
> + "Mismatch between data to process and mbuf data length
> in bbdev_op: %p",
> + op);
> + return -1;
> + }
> +
> + if (check_bit(op->ldpc_dec.op_flags,
> + RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
> + h_p_size = fcw->hcin_size0 + fcw->hcin_size1;
> + if (h_comp)
> + h_p_size = (h_p_size * 3 + 3) / 4;
> + desc->data_ptrs[next_triplet].address =
> + dec->harq_combined_input.offset;
> + desc->data_ptrs[next_triplet].blen = h_p_size;
> + desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN_HARQ;
> + desc->data_ptrs[next_triplet].dma_ext = 1;
> +#ifndef ACC100_EXT_MEM
> + acc100_dma_fill_blk_type_out(
> + desc,
> + op->ldpc_dec.harq_combined_input.data,
> + op->ldpc_dec.harq_combined_input.offset,
> + h_p_size,
> + next_triplet,
> + ACC100_DMA_BLKID_IN_HARQ);
> +#endif
> + next_triplet++;
> + }
> +
> + desc->data_ptrs[next_triplet - 1].last = 1;
> + desc->m2dlen = next_triplet;
> + *mbuf_total_left -= input_length;
> +
> + next_triplet = acc100_dma_fill_blk_type_out(desc, h_output,
> + *h_out_offset, output_length >> 3, next_triplet,
> + ACC100_DMA_BLKID_OUT_HARD);
> + if (unlikely(next_triplet < 0)) {
> + rte_bbdev_log(ERR,
> + "Mismatch between data to process and mbuf data length
> in bbdev_op: %p",
> + op);
> + return -1;
> + }
> +
> + if (check_bit(op->ldpc_dec.op_flags,
> + RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
> + /* Pruned size of the HARQ */
> + h_p_size = fcw->hcout_size0 + fcw->hcout_size1;
> + /* Non-Pruned size of the HARQ */
> + h_np_size = fcw->hcout_offset > 0 ?
> + fcw->hcout_offset + fcw->hcout_size1 :
> + h_p_size;
> + if (h_comp) {
> + h_np_size = (h_np_size * 3 + 3) / 4;
> + h_p_size = (h_p_size * 3 + 3) / 4;
> + }
> + dec->harq_combined_output.length = h_np_size;
> + desc->data_ptrs[next_triplet].address =
> + dec->harq_combined_output.offset;
> + desc->data_ptrs[next_triplet].blen = h_p_size;
> + desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_OUT_HARQ;
> + desc->data_ptrs[next_triplet].dma_ext = 1;
> +#ifndef ACC100_EXT_MEM
> + acc100_dma_fill_blk_type_out(
> + desc,
> + dec->harq_combined_output.data,
> + dec->harq_combined_output.offset,
> + h_p_size,
> + next_triplet,
> + ACC100_DMA_BLKID_OUT_HARQ);
> +#endif
> + next_triplet++;
> + }
> +
> + *h_out_length = output_length >> 3;
> + dec->hard_output.length += *h_out_length;
> + *h_out_offset += *h_out_length;
> + desc->data_ptrs[next_triplet - 1].last = 1;
> + desc->d2mlen = next_triplet - desc->m2dlen;
> +
> + desc->op_addr = op;
> +
> + return 0;
> +}
> +
> +static inline void
> +acc100_dma_desc_ld_update(struct rte_bbdev_dec_op *op,
> + struct acc100_dma_req_desc *desc,
> + struct rte_mbuf *input, struct rte_mbuf *h_output,
> + uint32_t *in_offset, uint32_t *h_out_offset,
> + uint32_t *h_out_length,
> + union acc100_harq_layout_data *harq_layout)
> +{
> + int next_triplet = 1; /* FCW already done */
> + desc->data_ptrs[next_triplet].address =
> + rte_pktmbuf_iova_offset(input, *in_offset);
> + next_triplet++;
> +
> + if (check_bit(op->ldpc_dec.op_flags,
> + RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
> + struct rte_bbdev_op_data hi = op->ldpc_dec.harq_combined_input;
> + desc->data_ptrs[next_triplet].address = hi.offset;
> +#ifndef ACC100_EXT_MEM
> + desc->data_ptrs[next_triplet].address =
> + rte_pktmbuf_iova_offset(hi.data, hi.offset);
> +#endif
> + next_triplet++;
> + }
> +
> + desc->data_ptrs[next_triplet].address =
> + rte_pktmbuf_iova_offset(h_output, *h_out_offset);
> + *h_out_length = desc->data_ptrs[next_triplet].blen;
> + next_triplet++;
> +
> + if (check_bit(op->ldpc_dec.op_flags,
> + RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
> + desc->data_ptrs[next_triplet].address =
> + op->ldpc_dec.harq_combined_output.offset;
> + /* Adjust based on previous operation */
> + struct rte_bbdev_dec_op *prev_op = desc->op_addr;
> + op->ldpc_dec.harq_combined_output.length =
> + prev_op->ldpc_dec.harq_combined_output.length;
> + int16_t hq_idx = op->ldpc_dec.harq_combined_output.offset /
> + ACC100_HARQ_OFFSET;
> + int16_t prev_hq_idx =
> + prev_op->ldpc_dec.harq_combined_output.offset
> + / ACC100_HARQ_OFFSET;
> + harq_layout[hq_idx].val = harq_layout[prev_hq_idx].val;
> +#ifndef ACC100_EXT_MEM
> + struct rte_bbdev_op_data ho =
> + op->ldpc_dec.harq_combined_output;
> + desc->data_ptrs[next_triplet].address =
> + rte_pktmbuf_iova_offset(ho.data, ho.offset);
> +#endif
> + next_triplet++;
> + }
> +
> + op->ldpc_dec.hard_output.length += *h_out_length;
> + desc->op_addr = op;
> +}
> +
> +
> +/* Enqueue a number of operations to HW and update software rings */
> +static inline void
> +acc100_dma_enqueue(struct acc100_queue *q, uint16_t n,
> + struct rte_bbdev_stats *queue_stats)
> +{
> + union acc100_enqueue_reg_fmt enq_req;
> +#ifdef RTE_BBDEV_OFFLOAD_COST
> + uint64_t start_time = 0;
> + queue_stats->acc_offload_cycles = 0;
> +#else
> + RTE_SET_USED(queue_stats);
> +#endif
> +
> + enq_req.val = 0;
> + /* Setting offset, 100b for 256 DMA Desc */
> + enq_req.addr_offset = ACC100_DESC_OFFSET;
> +
> + /* Split ops into batches */
> + do {
> + union acc100_dma_desc *desc;
> + uint16_t enq_batch_size;
> + uint64_t offset;
> + rte_iova_t req_elem_addr;
> +
> + enq_batch_size = RTE_MIN(n, MAX_ENQ_BATCH_SIZE);
> +
> + /* Set flag on last descriptor in a batch */
> + desc = q->ring_addr + ((q->sw_ring_head + enq_batch_size - 1) &
> + q->sw_ring_wrap_mask);
> + desc->req.last_desc_in_batch = 1;
> +
> + /* Calculate the 1st descriptor's address */
> + offset = ((q->sw_ring_head & q->sw_ring_wrap_mask) *
> + sizeof(union acc100_dma_desc));
> + req_elem_addr = q->ring_addr_phys + offset;
> +
> + /* Fill enqueue struct */
> + enq_req.num_elem = enq_batch_size;
> + /* Low 6 bits of the 64-byte aligned descriptor address are not needed */
> + enq_req.req_elem_addr = (uint32_t)(req_elem_addr >> 6);
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> + rte_memdump(stderr, "Req sdone", desc, sizeof(*desc));
> +#endif
> + rte_bbdev_log_debug(
> + "Enqueue %u reqs (phys %#"PRIx64") to reg %p",
> + enq_batch_size,
> + req_elem_addr,
> + (void *)q->mmio_reg_enqueue);
> +
> + rte_wmb();
> +
> +#ifdef RTE_BBDEV_OFFLOAD_COST
> + /* Start time measurement for enqueue function offload. */
> + start_time = rte_rdtsc_precise();
> +#endif
> + rte_bbdev_log(DEBUG, "Debug : MMIO Enqueue");
> + mmio_write(q->mmio_reg_enqueue, enq_req.val);
> +
> +#ifdef RTE_BBDEV_OFFLOAD_COST
> + queue_stats->acc_offload_cycles +=
> + rte_rdtsc_precise() - start_time;
> +#endif
> +
> + q->aq_enqueued++;
> + q->sw_ring_head += enq_batch_size;
> + n -= enq_batch_size;
> +
> + } while (n);
> +}
> +
> +/* Enqueue one encode operations for ACC100 device in CB mode */
> +static inline int
> +enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct
> rte_bbdev_enc_op **ops,
> + uint16_t total_enqueued_cbs, int16_t num)
> +{
> + union acc100_dma_desc *desc = NULL;
> + uint32_t out_length;
> + struct rte_mbuf *output_head, *output;
> + int i, next_triplet;
> + uint16_t in_length_in_bytes;
> + struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
> +
> + uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> + & q->sw_ring_wrap_mask);
> + desc = q->ring_addr + desc_idx;
> + acc100_fcw_le_fill(ops[0], &desc->req.fcw_le, num);
> +
> + /** This could be done at polling */
> + desc->req.word0 = ACC100_DMA_DESC_TYPE;
> + desc->req.word1 = 0; /**< Timestamp could be disabled */
> + desc->req.word2 = 0;
> + desc->req.word3 = 0;
> + desc->req.numCBs = num;
> +
> + in_length_in_bytes = ops[0]->ldpc_enc.input.data->data_len;
> + out_length = (enc->cb_params.e + 7) >> 3;
> + desc->req.m2dlen = 1 + num;
> + desc->req.d2mlen = num;
> + next_triplet = 1;
> +
> + for (i = 0; i < num; i++) {
> + desc->req.data_ptrs[next_triplet].address =
> + rte_pktmbuf_iova_offset(ops[i]->ldpc_enc.input.data, 0);
> + desc->req.data_ptrs[next_triplet].blen = in_length_in_bytes;
> + next_triplet++;
> + desc->req.data_ptrs[next_triplet].address =
> + rte_pktmbuf_iova_offset(
> + ops[i]->ldpc_enc.output.data, 0);
> + desc->req.data_ptrs[next_triplet].blen = out_length;
> + next_triplet++;
> + ops[i]->ldpc_enc.output.length = out_length;
> + output_head = output = ops[i]->ldpc_enc.output.data;
> + mbuf_append(output_head, output, out_length);
> + output->data_len = out_length;
> + }
> +
> + desc->req.op_addr = ops[0];
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> + rte_memdump(stderr, "FCW", &desc->req.fcw_le,
> + sizeof(desc->req.fcw_le) - 8);
> + rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> +#endif
> +
> + /* One CB (one op) was successfully prepared to enqueue */
> + return num;
> +}
> +
> +/* Enqueue one encode operations for ACC100 device in CB mode */
> +static inline int
> +enqueue_ldpc_enc_one_op_cb(struct acc100_queue *q, struct
> rte_bbdev_enc_op *op,
> + uint16_t total_enqueued_cbs)
> +{
> + union acc100_dma_desc *desc = NULL;
> + int ret;
> + uint32_t in_offset, out_offset, out_length, mbuf_total_left,
> + seg_total_left;
> + struct rte_mbuf *input, *output_head, *output;
> +
> + uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> + & q->sw_ring_wrap_mask);
> + desc = q->ring_addr + desc_idx;
> + acc100_fcw_le_fill(op, &desc->req.fcw_le, 1);
> +
> + input = op->ldpc_enc.input.data;
> + output_head = output = op->ldpc_enc.output.data;
> + in_offset = op->ldpc_enc.input.offset;
> + out_offset = op->ldpc_enc.output.offset;
> + out_length = 0;
> + mbuf_total_left = op->ldpc_enc.input.length;
> + seg_total_left = rte_pktmbuf_data_len(op->ldpc_enc.input.data)
> + - in_offset;
> +
> + ret = acc100_dma_desc_le_fill(op, &desc->req, &input, output,
> + &in_offset, &out_offset, &out_length, &mbuf_total_left,
> + &seg_total_left);
> +
> + if (unlikely(ret < 0))
> + return ret;
> +
> + mbuf_append(output_head, output, out_length);
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> + rte_memdump(stderr, "FCW", &desc->req.fcw_le,
> + sizeof(desc->req.fcw_le) - 8);
> + rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> +
> + /* Check if any data left after processing one CB */
> + if (mbuf_total_left != 0) {
> + rte_bbdev_log(ERR,
> + "Some date still left after processing one CB:
> mbuf_total_left = %u",
> + mbuf_total_left);
> + return -EINVAL;
> + }
> +#endif
> + /* One CB (one op) was successfully prepared to enqueue */
> + return 1;
> +}
> +
> +/** Enqueue one decode operations for ACC100 device in CB mode */
> +static inline int
> +enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct
> rte_bbdev_dec_op *op,
> + uint16_t total_enqueued_cbs, bool same_op)
> +{
> + int ret;
> +
> + union acc100_dma_desc *desc;
> + uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> + & q->sw_ring_wrap_mask);
> + desc = q->ring_addr + desc_idx;
> + struct rte_mbuf *input, *h_output_head, *h_output;
> + uint32_t in_offset, h_out_offset, h_out_length, mbuf_total_left;
> + input = op->ldpc_dec.input.data;
> + h_output_head = h_output = op->ldpc_dec.hard_output.data;
> + in_offset = op->ldpc_dec.input.offset;
> + h_out_offset = op->ldpc_dec.hard_output.offset;
> + mbuf_total_left = op->ldpc_dec.input.length;
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> + if (unlikely(input == NULL)) {
> + rte_bbdev_log(ERR, "Invalid mbuf pointer");
> + return -EFAULT;
> + }
> +#endif
> + union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
> +
> + if (same_op) {
> + union acc100_dma_desc *prev_desc;
> + desc_idx = ((q->sw_ring_head + total_enqueued_cbs - 1)
> + & q->sw_ring_wrap_mask);
> + prev_desc = q->ring_addr + desc_idx;
> + uint8_t *prev_ptr = (uint8_t *) prev_desc;
> + uint8_t *new_ptr = (uint8_t *) desc;
> + /* Copy first 4 words and BDESCs */
> + rte_memcpy(new_ptr, prev_ptr, 16);
> + rte_memcpy(new_ptr + 36, prev_ptr + 36, 40);
> + desc->req.op_addr = prev_desc->req.op_addr;
> + /* Copy FCW */
> + rte_memcpy(new_ptr + ACC100_DESC_FCW_OFFSET,
> + prev_ptr + ACC100_DESC_FCW_OFFSET,
> + ACC100_FCW_LD_BLEN);
> + acc100_dma_desc_ld_update(op, &desc->req, input, h_output,
> + &in_offset, &h_out_offset,
> + &h_out_length, harq_layout);
> + } else {
> + struct acc100_fcw_ld *fcw;
> + uint32_t seg_total_left;
> + fcw = &desc->req.fcw_ld;
> + acc100_fcw_ld_fill(op, fcw, harq_layout);
> +
> + /* Special handling when overusing mbuf */
> + if (fcw->rm_e < MAX_E_MBUF)
> + seg_total_left = rte_pktmbuf_data_len(input)
> + - in_offset;
> + else
> + seg_total_left = fcw->rm_e;
> +
> + ret = acc100_dma_desc_ld_fill(op, &desc->req, &input, h_output,
> + &in_offset, &h_out_offset,
> + &h_out_length, &mbuf_total_left,
> + &seg_total_left, fcw);
> + if (unlikely(ret < 0))
> + return ret;
> + }
> +
> + /* Hard output */
> + mbuf_append(h_output_head, h_output, h_out_length);
> +#ifndef ACC100_EXT_MEM
> + if (op->ldpc_dec.harq_combined_output.length > 0) {
> + /* Push the HARQ output into host memory */
> + struct rte_mbuf *hq_output_head, *hq_output;
> + hq_output_head = op->ldpc_dec.harq_combined_output.data;
> + hq_output = op->ldpc_dec.harq_combined_output.data;
> + mbuf_append(hq_output_head, hq_output,
> + op->ldpc_dec.harq_combined_output.length);
> + }
> +#endif
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> + rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
> + sizeof(desc->req.fcw_ld) - 8);
> + rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> +#endif
> +
> + /* One CB (one op) was successfully prepared to enqueue */
> + return 1;
> +}
> +
> +
> +/* Enqueue one decode operation for ACC100 device in TB mode */
> +static inline int
> +enqueue_ldpc_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
> + uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
> +{
> + union acc100_dma_desc *desc = NULL;
> + int ret;
> + uint8_t r, c;
> + uint32_t in_offset, h_out_offset,
> + h_out_length, mbuf_total_left, seg_total_left;
> + struct rte_mbuf *input, *h_output_head, *h_output;
> + uint16_t current_enqueued_cbs = 0;
> +
> + uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> + & q->sw_ring_wrap_mask);
> + desc = q->ring_addr + desc_idx;
> + uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
> + union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
> + acc100_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout);
> +
> + input = op->ldpc_dec.input.data;
> + h_output_head = h_output = op->ldpc_dec.hard_output.data;
> + in_offset = op->ldpc_dec.input.offset;
> + h_out_offset = op->ldpc_dec.hard_output.offset;
> + h_out_length = 0;
> + mbuf_total_left = op->ldpc_dec.input.length;
> + c = op->ldpc_dec.tb_params.c;
> + r = op->ldpc_dec.tb_params.r;
> +
> + while (mbuf_total_left > 0 && r < c) {
> +
> + seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
> +
> + /* Set up DMA descriptor */
> + desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
> + & q->sw_ring_wrap_mask);
> + desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
> + desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
> + ret = acc100_dma_desc_ld_fill(op, &desc->req, &input,
> + h_output, &in_offset, &h_out_offset,
> + &h_out_length,
> + &mbuf_total_left, &seg_total_left,
> + &desc->req.fcw_ld);
> +
> + if (unlikely(ret < 0))
> + return ret;
> +
> + /* Hard output */
> + mbuf_append(h_output_head, h_output, h_out_length);
> +
> + /* Set total number of CBs in TB */
> + desc->req.cbs_in_tb = cbs_in_tb;
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> + rte_memdump(stderr, "FCW", &desc->req.fcw_td,
> + sizeof(desc->req.fcw_td) - 8);
> + rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> +#endif
> +
> + if (seg_total_left == 0) {
> + /* Go to the next mbuf */
> + input = input->next;
> + in_offset = 0;
> + h_output = h_output->next;
> + h_out_offset = 0;
> + }
> + total_enqueued_cbs++;
> + current_enqueued_cbs++;
> + r++;
> + }
> +
> + if (unlikely(desc == NULL))
> + return current_enqueued_cbs;
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> + /* Check if any CBs left for processing */
> + if (mbuf_total_left != 0) {
> + rte_bbdev_log(ERR,
> + "Some date still left for processing: mbuf_total_left = %u",
> + mbuf_total_left);
> + return -EINVAL;
> + }
> +#endif
> + /* Set SDone on last CB descriptor for TB mode */
> + desc->req.sdone_enable = 1;
> + desc->req.irq_enable = q->irq_enable;
> +
> + return current_enqueued_cbs;
> +}
> +
> +
> +/* Calculates number of CBs in processed encoder TB based on 'r' and input
> + * length.
> + */
> +static inline uint8_t
> +get_num_cbs_in_tb_enc(struct rte_bbdev_op_turbo_enc *turbo_enc)
> +{
> + uint8_t c, c_neg, r, crc24_bits = 0;
> + uint16_t k, k_neg, k_pos;
> + uint8_t cbs_in_tb = 0;
> + int32_t length;
> +
> + length = turbo_enc->input.length;
> + r = turbo_enc->tb_params.r;
> + c = turbo_enc->tb_params.c;
> + c_neg = turbo_enc->tb_params.c_neg;
> + k_neg = turbo_enc->tb_params.k_neg;
> + k_pos = turbo_enc->tb_params.k_pos;
> + crc24_bits = 0;
> + if (check_bit(turbo_enc->op_flags,
> RTE_BBDEV_TURBO_CRC_24B_ATTACH))
> + crc24_bits = 24;
> + while (length > 0 && r < c) {
> + k = (r < c_neg) ? k_neg : k_pos;
> + length -= (k - crc24_bits) >> 3;
> + r++;
> + cbs_in_tb++;
> + }
> +
> + return cbs_in_tb;
> +}
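
As a side note for readers, the segmentation rule above can be exercised
standalone. Below is a minimal sketch with made-up TB parameters (c = 3,
c_neg = 1, k_neg = 2560, k_pos = 3072, CRC24B attached); these values are
chosen for illustration only and do not come from the patch.

#include <stdint.h>
#include <stdio.h>

/* Mirrors the loop in get_num_cbs_in_tb_enc(): walk code blocks from
 * index 'r', subtracting each CB payload (K minus CRC bits, in bytes)
 * from the remaining TB length until it is consumed.
 */
static uint8_t
example_cbs_in_tb(int32_t length, uint8_t r, uint8_t c, uint8_t c_neg,
        uint16_t k_neg, uint16_t k_pos, uint8_t crc24_bits)
{
    uint8_t cbs = 0;
    while (length > 0 && r < c) {
        uint16_t k = (r < c_neg) ? k_neg : k_pos;
        length -= (k - crc24_bits) >> 3;
        r++;
        cbs++;
    }
    return cbs;
}

int main(void)
{
    /* (2560-24)/8 + 2*(3072-24)/8 = 317 + 762 = 1079 bytes -> 3 CBs */
    printf("%u\n", example_cbs_in_tb(1079, 0, 3, 1, 2560, 3072, 24));
    return 0;
}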
> +
> +/* Calculates number of CBs in processed decoder TB based on 'r' and input
> + * length.
> + */
> +static inline uint16_t
> +get_num_cbs_in_tb_dec(struct rte_bbdev_op_turbo_dec *turbo_dec)
> +{
> + uint8_t c, c_neg, r = 0;
> + uint16_t kw, k, k_neg, k_pos, cbs_in_tb = 0;
> + int32_t length;
> +
> + length = turbo_dec->input.length;
> + r = turbo_dec->tb_params.r;
> + c = turbo_dec->tb_params.c;
> + c_neg = turbo_dec->tb_params.c_neg;
> + k_neg = turbo_dec->tb_params.k_neg;
> + k_pos = turbo_dec->tb_params.k_pos;
> + while (length > 0 && r < c) {
> + k = (r < c_neg) ? k_neg : k_pos;
> + kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
> + length -= kw;
> + r++;
> + cbs_in_tb++;
> + }
> +
> + return cbs_in_tb;
> +}
> +
> +/* Calculates number of CBs in processed decoder TB based on 'r' and input
> + * length.
> + */
> +static inline uint16_t
> +get_num_cbs_in_tb_ldpc_dec(struct rte_bbdev_op_ldpc_dec *ldpc_dec)
> +{
> + uint16_t r, cbs_in_tb = 0;
> + int32_t length = ldpc_dec->input.length;
> + r = ldpc_dec->tb_params.r;
> + while (length > 0 && r < ldpc_dec->tb_params.c) {
> + length -= (r < ldpc_dec->tb_params.cab) ?
> + ldpc_dec->tb_params.ea :
> + ldpc_dec->tb_params.eb;
> + r++;
> + cbs_in_tb++;
> + }
> + return cbs_in_tb;
> +}
> +
> +/* Check we can mux encode operations with common FCW */
> +static inline bool
> +check_mux(struct rte_bbdev_enc_op **ops, uint16_t num) {
> + uint16_t i;
> + if (num == 1)
> + return false;
> + for (i = 1; i < num; ++i) {
> + /* Only mux compatible code blocks */
> + if (memcmp((uint8_t *)(&ops[i]->ldpc_enc) + ENC_OFFSET,
> + (uint8_t *)(&ops[0]->ldpc_enc) + ENC_OFFSET,
> + CMP_ENC_SIZE) != 0)
> + return false;
> + }
> + return true;
> +}
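
The check above works because all FCW-relevant fields of
rte_bbdev_op_ldpc_enc sit in one contiguous byte window; ENC_OFFSET and
CMP_ENC_SIZE (defined elsewhere in the patch, not visible in this hunk)
bound that window. A generic sketch of the same pattern, with the window
bounds passed in rather than hard-coded:

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Two ops can share one descriptor/FCW only if the bytes that feed the
 * FCW match exactly; fields outside [window_off, window_off+window_len)
 * such as mbuf pointers or opaque data are allowed to differ.
 */
static bool
can_mux(const void *op_a, const void *op_b, size_t window_off,
        size_t window_len)
{
    const uint8_t *a = (const uint8_t *)op_a + window_off;
    const uint8_t *b = (const uint8_t *)op_b + window_off;
    return memcmp(a, b, window_len) == 0;
}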
> +
> +/** Enqueue encode operations for ACC100 device in CB mode. */
> +static inline uint16_t
> +acc100_enqueue_ldpc_enc_cb(struct rte_bbdev_queue_data *q_data,
> + struct rte_bbdev_enc_op **ops, uint16_t num)
> +{
> + struct acc100_queue *q = q_data->queue_private;
> + int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> + uint16_t i = 0;
> + union acc100_dma_desc *desc;
> + int ret, desc_idx = 0;
> + int16_t enq, left = num;
> +
> + while (left > 0) {
> + if (unlikely(avail - 1 < 0))
> + break;
> + avail--;
> + enq = RTE_MIN(left, MUX_5GDL_DESC);
> + if (check_mux(&ops[i], enq)) {
> + ret = enqueue_ldpc_enc_n_op_cb(q, &ops[i],
> + desc_idx, enq);
> + if (ret < 0)
> + break;
> + i += enq;
> + } else {
> + ret = enqueue_ldpc_enc_one_op_cb(q, ops[i], desc_idx);
> + if (ret < 0)
> + break;
> + i++;
> + }
> + desc_idx++;
> + left = num - i;
> + }
> +
> + if (unlikely(i == 0))
> + return 0; /* Nothing to enqueue */
> +
> + /* Set SDone in last CB in enqueued ops for CB mode */
> + desc = q->ring_addr + ((q->sw_ring_head + desc_idx - 1)
> + & q->sw_ring_wrap_mask);
> + desc->req.sdone_enable = 1;
> + desc->req.irq_enable = q->irq_enable;
> +
> + acc100_dma_enqueue(q, desc_idx, &q_data->queue_stats);
> +
> + /* Update stats */
> + q_data->queue_stats.enqueued_count += i;
> + q_data->queue_stats.enqueue_err_count += num - i;
> +
> + return i;
> +}
> +
> +/* Enqueue encode operations for ACC100 device. */
> +static uint16_t
> +acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
> + struct rte_bbdev_enc_op **ops, uint16_t num)
> +{
> + if (unlikely(num == 0))
> + return 0;
> + return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
> +}
> +
> +/* Check we can mux decode operations with common FCW */
> +static inline bool
> +cmp_ldpc_dec_op(struct rte_bbdev_dec_op **ops) {
> + /* Only mux compatible code blocks */
> + if (memcmp((uint8_t *)(&ops[0]->ldpc_dec) + DEC_OFFSET,
> + (uint8_t *)(&ops[1]->ldpc_dec) +
> + DEC_OFFSET, CMP_DEC_SIZE) != 0) {
> + return false;
> + } else
> + return true;
> +}
> +
> +
> +/* Enqueue decode operations for ACC100 device in TB mode */
> +static uint16_t
> +acc100_enqueue_ldpc_dec_tb(struct rte_bbdev_queue_data *q_data,
> + struct rte_bbdev_dec_op **ops, uint16_t num)
> +{
> + struct acc100_queue *q = q_data->queue_private;
> + int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> + uint16_t i, enqueued_cbs = 0;
> + uint8_t cbs_in_tb;
> + int ret;
> +
> + for (i = 0; i < num; ++i) {
> + cbs_in_tb = get_num_cbs_in_tb_ldpc_dec(&ops[i]->ldpc_dec);
> + /* Check if there is space available for further processing */
> + if (unlikely(avail - cbs_in_tb < 0))
> + break;
> + avail -= cbs_in_tb;
> +
> + ret = enqueue_ldpc_dec_one_op_tb(q, ops[i],
> + enqueued_cbs, cbs_in_tb);
> + if (ret < 0)
> + break;
> + enqueued_cbs += ret;
> + }
> +
> + acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
> +
> + /* Update stats */
> + q_data->queue_stats.enqueued_count += i;
> + q_data->queue_stats.enqueue_err_count += num - i;
> + return i;
> +}
> +
> +/* Enqueue decode operations for ACC100 device in CB mode */
> +static uint16_t
> +acc100_enqueue_ldpc_dec_cb(struct rte_bbdev_queue_data *q_data,
> + struct rte_bbdev_dec_op **ops, uint16_t num)
> +{
> + struct acc100_queue *q = q_data->queue_private;
> + int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> + uint16_t i;
> + union acc100_dma_desc *desc;
> + int ret;
> + bool same_op = false;
> + for (i = 0; i < num; ++i) {
> + /* Check if there is space available for further processing */
> + if (unlikely(avail - 1 < 0))
> + break;
> + avail -= 1;
> +
> + if (i > 0)
> + same_op = cmp_ldpc_dec_op(&ops[i-1]);
> + rte_bbdev_log(INFO, "Op %d %d %d %d %d %d %d %d %d %d %d %d\n",
> + i, ops[i]->ldpc_dec.op_flags, ops[i]->ldpc_dec.rv_index,
> + ops[i]->ldpc_dec.iter_max, ops[i]->ldpc_dec.iter_count,
> + ops[i]->ldpc_dec.basegraph, ops[i]->ldpc_dec.z_c,
> + ops[i]->ldpc_dec.n_cb, ops[i]->ldpc_dec.q_m,
> + ops[i]->ldpc_dec.n_filler, ops[i]->ldpc_dec.cb_params.e,
> + same_op);
> + ret = enqueue_ldpc_dec_one_op_cb(q, ops[i], i, same_op);
> + if (ret < 0)
> + break;
> + }
> +
> + if (unlikely(i == 0))
> + return 0; /* Nothing to enqueue */
> +
> + /* Set SDone in last CB in enqueued ops for CB mode */
> + desc = q->ring_addr + ((q->sw_ring_head + i - 1)
> + & q->sw_ring_wrap_mask);
> +
> + desc->req.sdone_enable = 1;
> + desc->req.irq_enable = q->irq_enable;
> +
> + acc100_dma_enqueue(q, i, &q_data->queue_stats);
> +
> + /* Update stats */
> + q_data->queue_stats.enqueued_count += i;
> + q_data->queue_stats.enqueue_err_count += num - i;
> + return i;
> +}
> +
> +/* Enqueue decode operations for ACC100 device. */
> +static uint16_t
> +acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
> + struct rte_bbdev_dec_op **ops, uint16_t num)
> +{
> + struct acc100_queue *q = q_data->queue_private;
> + int32_t aq_avail = q->aq_depth +
> + (q->aq_dequeued - q->aq_enqueued) / 128;
> +
> + if (unlikely((aq_avail == 0) || (num == 0)))
> + return 0;
> +
> + if (ops[0]->ldpc_dec.code_block_mode == 0)
> + return acc100_enqueue_ldpc_dec_tb(q_data, ops, num);
> + else
> + return acc100_enqueue_ldpc_dec_cb(q_data, ops, num);
> +}
> +
> +
> +/* Dequeue one encode operation from ACC100 device in CB mode */
> +static inline int
> +dequeue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
> + uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
> +{
> + union acc100_dma_desc *desc, atom_desc;
> + union acc100_dma_rsp_desc rsp;
> + struct rte_bbdev_enc_op *op;
> + int i;
> +
> + desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
> + & q->sw_ring_wrap_mask);
> + atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> + __ATOMIC_RELAXED);
> +
> + /* Check fdone bit */
> + if (!(atom_desc.rsp.val & ACC100_FDONE))
> + return -1;
> +
> + rsp.val = atom_desc.rsp.val;
> + rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
> +
> + /* Dequeue */
> + op = desc->req.op_addr;
> +
> + /* Clearing status, it will be set based on response */
> + op->status = 0;
> +
> + op->status |= ((rsp.input_err)
> + ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> + op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> + op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +
> + if (desc->req.last_desc_in_batch) {
> + (*aq_dequeued)++;
> + desc->req.last_desc_in_batch = 0;
> + }
> + desc->rsp.val = ACC100_DMA_DESC_TYPE;
> + desc->rsp.add_info_0 = 0; /* Reserved bits */
> + desc->rsp.add_info_1 = 0; /* Reserved bits */
> +
> + /* Flag that the muxing cause loss of opaque data */
> + op->opaque_data = (void *)-1;
> + for (i = 0 ; i < desc->req.numCBs; i++)
> + ref_op[i] = op;
> +
> + /* One or more CBs from the same op were successfully dequeued */
> + return desc->req.numCBs;
> +}
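
The dequeue path hinges on reading the first 8 bytes of the descriptor as
a single atomic word (the atom_hdr member this patch adds to union
acc100_dma_desc), so the FDONE flag is observed consistently with the
rest of the response word. A minimal standalone sketch of that polling
pattern; the union below is a simplified stand-in for illustration, only
the ACC100_FDONE value matches the patch:

#include <stdbool.h>
#include <stdint.h>

#define ACC100_FDONE 0x80000000

union example_desc {
    uint64_t atom_hdr;                           /* atomic view */
    struct { uint32_t val; uint32_t rest; } rsp; /* response view */
};

/* One relaxed 64-bit load (same builtin the patch uses), then test
 * the frame-done bit on the snapshot.
 */
static bool
example_desc_done(const union example_desc *desc)
{
    union example_desc snap;
    snap.atom_hdr = __atomic_load_n(&desc->atom_hdr, __ATOMIC_RELAXED);
    return (snap.rsp.val & ACC100_FDONE) != 0;
}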
> +
> +/* Dequeue one encode operation from ACC100 device in TB mode */
> +static inline int
> +dequeue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
> + uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
> +{
> + union acc100_dma_desc *desc, *last_desc, atom_desc;
> + union acc100_dma_rsp_desc rsp;
> + struct rte_bbdev_enc_op *op;
> + uint8_t i = 0;
> + uint16_t current_dequeued_cbs = 0, cbs_in_tb;
> +
> + desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
> + & q->sw_ring_wrap_mask);
> + atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> + __ATOMIC_RELAXED);
> +
> + /* Check fdone bit */
> + if (!(atom_desc.rsp.val & ACC100_FDONE))
> + return -1;
> +
> + /* Get number of CBs in dequeued TB */
> + cbs_in_tb = desc->req.cbs_in_tb;
> + /* Get last CB */
> + last_desc = q->ring_addr + ((q->sw_ring_tail
> + + total_dequeued_cbs + cbs_in_tb - 1)
> + & q->sw_ring_wrap_mask);
> + /* Check if last CB in TB is ready to dequeue (and thus
> + * the whole TB) - checking sdone bit. If not return.
> + */
> + atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
> + __ATOMIC_RELAXED);
> + if (!(atom_desc.rsp.val & ACC100_SDONE))
> + return -1;
> +
> + /* Dequeue */
> + op = desc->req.op_addr;
> +
> + /* Clearing status, it will be set based on response */
> + op->status = 0;
> +
> + while (i < cbs_in_tb) {
> + desc = q->ring_addr + ((q->sw_ring_tail
> + + total_dequeued_cbs)
> + & q->sw_ring_wrap_mask);
> + atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> + __ATOMIC_RELAXED);
> + rsp.val = atom_desc.rsp.val;
> + rte_bbdev_log_debug("Resp. desc %p: %x", desc,
> + rsp.val);
> +
> + op->status |= ((rsp.input_err)
> + ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> + op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> + op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +
> + if (desc->req.last_desc_in_batch) {
> + (*aq_dequeued)++;
> + desc->req.last_desc_in_batch = 0;
> + }
> + desc->rsp.val = ACC100_DMA_DESC_TYPE;
> + desc->rsp.add_info_0 = 0;
> + desc->rsp.add_info_1 = 0;
> + total_dequeued_cbs++;
> + current_dequeued_cbs++;
> + i++;
> + }
> +
> + *ref_op = op;
> +
> + return current_dequeued_cbs;
> +}
> +
> +/* Dequeue one decode operation from ACC100 device in CB mode */
> +static inline int
> +dequeue_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
> + struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
> + uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> +{
> + union acc100_dma_desc *desc, atom_desc;
> + union acc100_dma_rsp_desc rsp;
> + struct rte_bbdev_dec_op *op;
> +
> + desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> + & q->sw_ring_wrap_mask);
> + atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> + __ATOMIC_RELAXED);
> +
> + /* Check fdone bit */
> + if (!(atom_desc.rsp.val & ACC100_FDONE))
> + return -1;
> +
> + rsp.val = atom_desc.rsp.val;
> + rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
> +
> + /* Dequeue */
> + op = desc->req.op_addr;
> +
> + /* Clearing status, it will be set based on response */
> + op->status = 0;
> + op->status |= ((rsp.input_err)
> + ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> + op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> + op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> + if (op->status != 0)
> + q_data->queue_stats.dequeue_err_count++;
> +
> + /* CRC invalid if error exists */
> + if (!op->status)
> + op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
> + op->turbo_dec.iter_count = (uint8_t) rsp.iter_cnt / 2;
> + /* Check if this is the last desc in batch (Atomic Queue) */
> + if (desc->req.last_desc_in_batch) {
> + (*aq_dequeued)++;
> + desc->req.last_desc_in_batch = 0;
> + }
> + desc->rsp.val = ACC100_DMA_DESC_TYPE;
> + desc->rsp.add_info_0 = 0;
> + desc->rsp.add_info_1 = 0;
> + *ref_op = op;
> +
> + /* One CB (op) was successfully dequeued */
> + return 1;
> +}
> +
> +/* Dequeue one decode operation from ACC100 device in CB mode */
> +static inline int
> +dequeue_ldpc_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
> + struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
> + uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> +{
> + union acc100_dma_desc *desc, atom_desc;
> + union acc100_dma_rsp_desc rsp;
> + struct rte_bbdev_dec_op *op;
> +
> + desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> + & q->sw_ring_wrap_mask);
> + atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> + __ATOMIC_RELAXED);
> +
> + /* Check fdone bit */
> + if (!(atom_desc.rsp.val & ACC100_FDONE))
> + return -1;
> +
> + rsp.val = atom_desc.rsp.val;
> +
> + /* Dequeue */
> + op = desc->req.op_addr;
> +
> + /* Clearing status, it will be set based on response */
> + op->status = 0;
> + op->status |= rsp.input_err << RTE_BBDEV_DATA_ERROR;
> + op->status |= rsp.dma_err << RTE_BBDEV_DRV_ERROR;
> + op->status |= rsp.fcw_err << RTE_BBDEV_DRV_ERROR;
> + if (op->status != 0)
> + q_data->queue_stats.dequeue_err_count++;
> +
> + op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
> + if (op->ldpc_dec.hard_output.length > 0 && !rsp.synd_ok)
> + op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
> + op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
> +
> + /* Check if this is the last desc in batch (Atomic Queue) */
> + if (desc->req.last_desc_in_batch) {
> + (*aq_dequeued)++;
> + desc->req.last_desc_in_batch = 0;
> + }
> +
> + desc->rsp.val = ACC100_DMA_DESC_TYPE;
> + desc->rsp.add_info_0 = 0;
> + desc->rsp.add_info_1 = 0;
> +
> + *ref_op = op;
> +
> + /* One CB (op) was successfully dequeued */
> + return 1;
> +}
> +
> +/* Dequeue one decode operation from ACC100 device in TB mode. */
> +static inline int
> +dequeue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
> + uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> +{
> + union acc100_dma_desc *desc, *last_desc, atom_desc;
> + union acc100_dma_rsp_desc rsp;
> + struct rte_bbdev_dec_op *op;
> + uint8_t cbs_in_tb = 1, cb_idx = 0;
> +
> + desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> + & q->sw_ring_wrap_mask);
> + atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> + __ATOMIC_RELAXED);
> +
> + /* Check fdone bit */
> + if (!(atom_desc.rsp.val & ACC100_FDONE))
> + return -1;
> +
> + /* Dequeue */
> + op = desc->req.op_addr;
> +
> + /* Get number of CBs in dequeued TB */
> + cbs_in_tb = desc->req.cbs_in_tb;
> + /* Get last CB */
> + last_desc = q->ring_addr + ((q->sw_ring_tail
> + + dequeued_cbs + cbs_in_tb - 1)
> + & q->sw_ring_wrap_mask);
> + /* Check if last CB in TB is ready to dequeue (and thus
> + * the whole TB) - checking sdone bit. If not return.
> + */
> + atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
> + __ATOMIC_RELAXED);
> + if (!(atom_desc.rsp.val & ACC100_SDONE))
> + return -1;
> +
> + /* Clearing status, it will be set based on response */
> + op->status = 0;
> +
> + /* Read remaining CBs if any exist */
> + while (cb_idx < cbs_in_tb) {
> + desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> + & q->sw_ring_wrap_mask);
> + atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> + __ATOMIC_RELAXED);
> + rsp.val = atom_desc.rsp.val;
> + rte_bbdev_log_debug("Resp. desc %p: %x", desc,
> + rsp.val);
> +
> + op->status |= ((rsp.input_err)
> + ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> + op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> + op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +
> + /* CRC invalid if error exists */
> + if (!op->status)
> + op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
> + op->turbo_dec.iter_count = RTE_MAX((uint8_t) rsp.iter_cnt,
> + op->turbo_dec.iter_count);
> +
> + /* Check if this is the last desc in batch (Atomic Queue) */
> + if (desc->req.last_desc_in_batch) {
> + (*aq_dequeued)++;
> + desc->req.last_desc_in_batch = 0;
> + }
> + desc->rsp.val = ACC100_DMA_DESC_TYPE;
> + desc->rsp.add_info_0 = 0;
> + desc->rsp.add_info_1 = 0;
> + dequeued_cbs++;
> + cb_idx++;
> + }
> +
> + *ref_op = op;
> +
> + return cb_idx;
> +}
> +
> +/* Dequeue LDPC encode operations from ACC100 device. */
> +static uint16_t
> +acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
> + struct rte_bbdev_enc_op **ops, uint16_t num)
> +{
> + struct acc100_queue *q = q_data->queue_private;
> + uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
> + uint32_t aq_dequeued = 0;
> + uint16_t dequeue_num, i, dequeued_cbs = 0, dequeued_descs = 0;
> + int ret;
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> + if (unlikely(ops == NULL || q == NULL))
> + return 0;
> +#endif
> +
> + dequeue_num = (avail < num) ? avail : num;
> +
> + for (i = 0; i < dequeue_num; i++) {
> + ret = dequeue_enc_one_op_cb(q, &ops[dequeued_cbs],
> + dequeued_descs, &aq_dequeued);
> + if (ret < 0)
> + break;
> + dequeued_cbs += ret;
> + dequeued_descs++;
> + if (dequeued_cbs >= num)
> + break;
> + }
> +
> + q->aq_dequeued += aq_dequeued;
> + q->sw_ring_tail += dequeued_descs;
> +
> + /* Update dequeue stats */
> + q_data->queue_stats.dequeued_count += dequeued_cbs;
> +
> + return dequeued_cbs;
> +}
> +
> +/* Dequeue decode operations from ACC100 device. */
> +static uint16_t
> +acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
> + struct rte_bbdev_dec_op **ops, uint16_t num)
> +{
> + struct acc100_queue *q = q_data->queue_private;
> + uint16_t dequeue_num;
> + uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
> + uint32_t aq_dequeued = 0;
> + uint16_t i;
> + uint16_t dequeued_cbs = 0;
> + struct rte_bbdev_dec_op *op;
> + int ret;
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> + if (unlikely(ops == NULL || q == NULL))
> + return 0;
> +#endif
> +
> + dequeue_num = (avail < num) ? avail : num;
> +
> + for (i = 0; i < dequeue_num; ++i) {
> + op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> + & q->sw_ring_wrap_mask))->req.op_addr;
> + if (op->ldpc_dec.code_block_mode == 0)
> + ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
> + &aq_dequeued);
> + else
> + ret = dequeue_ldpc_dec_one_op_cb(
> + q_data, q, &ops[i], dequeued_cbs,
> + &aq_dequeued);
> +
> + if (ret < 0)
> + break;
> + dequeued_cbs += ret;
> + }
> +
> + q->aq_dequeued += aq_dequeued;
> + q->sw_ring_tail += dequeued_cbs;
> +
> + /* Update dequeue stats */
> + q_data->queue_stats.dequeued_count += i;
> +
> + return i;
> +}
> +
> /* Initialization Function */
> static void
> acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
> @@ -703,6 +2321,10 @@
> struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
>
> dev->dev_ops = &acc100_bbdev_ops;
> + dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
> + dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
> + dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
> + dev->dequeue_ldpc_dec_ops = acc100_dequeue_ldpc_dec;
>
> ((struct acc100_device *) dev->data->dev_private)->pf_device =
> !strcmp(drv->driver.name,
> @@ -815,4 +2437,3 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
> RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME,
> pci_id_acc100_pf_map);
> RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
> RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME,
> pci_id_acc100_vf_map);
> -
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
> index 0e2b79c..78686c1 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> @@ -88,6 +88,8 @@
> #define TMPL_PRI_3 0x0f0e0d0c
> #define QUEUE_ENABLE 0x80000000 /* Bit to mark Queue as Enabled */
> #define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
> +#define ACC100_FDONE 0x80000000
> +#define ACC100_SDONE 0x40000000
>
> #define ACC100_NUM_TMPL 32
> #define VF_OFFSET_QOS 16 /* offset in Memory Space specific to QoS Mon */
> @@ -398,6 +400,7 @@ struct __rte_packed acc100_dma_req_desc {
> union acc100_dma_desc {
> struct acc100_dma_req_desc req;
> union acc100_dma_rsp_desc rsp;
> + uint64_t atom_hdr;
> };
>
>
> --
> 1.8.3.1
^ permalink raw reply [flat|nested] 213+ messages in thread
* Re: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions
2020-08-20 14:57 ` Dave Burley
@ 2020-08-20 21:05 ` Chautru, Nicolas
2020-09-03 8:06 ` Dave Burley
0 siblings, 1 reply; 213+ messages in thread
From: Chautru, Nicolas @ 2020-08-20 21:05 UTC (permalink / raw)
To: Dave Burley, dev; +Cc: Richardson, Bruce
> From: Dave Burley <dave.burley@accelercomm.com>
> Hi Nic
>
> Thank you - it would be useful to have further documentation for clarification
> as the data format isn't explicitly documented in BBDEV.
Thanks Dave. Just updated on this other patch -> https://patches.dpdk.org/patch/75793/
Feel free to ack or let me know if you need more details.
> Best Regards
>
> Dave
>
>
> From: Chautru, Nicolas <nicolas.chautru@intel.com>
> Sent: 20 August 2020 15:52
> To: Dave Burley <dave.burley@accelercomm.com>; dev@dpdk.org
> <dev@dpdk.org>
> Cc: Richardson, Bruce <bruce.richardson@intel.com>
> Subject: RE: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC
> processing functions
>
> Hi Dave,
> This is assuming 6-bit LLR compression packing (i.e. the first 2 MSBs dropped).
> Similar to HARQ compression.
> Let me know if unclear, I can clarify further in documentation if not explicit
> enough.
> Thanks
> Nic
>
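
For reference, a minimal sketch of that packing, assuming each signed
8-bit LLR keeps its low 6 bits and four 6-bit values are packed into
three bytes (consistent with the 8/6 and *3/4 length scaling in the
patch). The bit order within the output bytes below is an illustration,
not a statement of the ACC100 memory format:

#include <stddef.h>
#include <stdint.h>

/* Drop the two MSBs of each 8-bit LLR and pack 4 x 6-bit -> 3 bytes. */
static size_t
pack_llr_6bit(const int8_t *in, size_t n, uint8_t *out)
{
    size_t i, o = 0;
    for (i = 0; i + 4 <= n; i += 4) {
        uint8_t a = in[i] & 0x3F, b = in[i + 1] & 0x3F;
        uint8_t c = in[i + 2] & 0x3F, d = in[i + 3] & 0x3F;
        out[o++] = (uint8_t)((a << 2) | (b >> 4));
        out[o++] = (uint8_t)((b << 4) | (c >> 2));
        out[o++] = (uint8_t)((c << 6) | d);
    }
    return o; /* 3 * n / 4 bytes when n is a multiple of 4 */
}
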
> > -----Original Message-----
> > From: Dave Burley <dave.burley@accelercomm.com>
> > Sent: Thursday, August 20, 2020 7:39 AM
> > To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org
> > Cc: Richardson, Bruce <bruce.richardson@intel.com>
> > Subject: Re: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC
> > processing functions
> >
> > Hi Nic,
> >
> > As you've now specified the use of RTE_BBDEV_LDPC_LLR_COMPRESSION for
> > this PMB, please could you confirm what the packed format of the LLRs in
> > memory looks like?
> >
> > Best Regards
> >
> > Dave Burley
> >
> >
> > From: dev <dev-bounces@dpdk.org> on behalf of Nicolas Chautru
> > <nicolas.chautru@intel.com>
> > Sent: 19 August 2020 01:25
> > To: dev@dpdk.org <dev@dpdk.org>; akhil.goyal@nxp.com
> > <akhil.goyal@nxp.com>
> > Cc: bruce.richardson@intel.com <bruce.richardson@intel.com>; Nicolas
> > Chautru <nicolas.chautru@intel.com>
> > Subject: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing
> > functions
> >
> > Adding LDPC decode and encode processing operations
> >
> > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > ---
> > drivers/baseband/acc100/rte_acc100_pmd.c | 1625 +++++++++++++++++++++++++++++-
> > drivers/baseband/acc100/rte_acc100_pmd.h | 3 +
> > 2 files changed, 1626 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
> > index 7a21c57..5f32813 100644
> > --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> > +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> > @@ -15,6 +15,9 @@
> > #include <rte_hexdump.h>
> > #include <rte_pci.h>
> > #include <rte_bus_pci.h>
> > +#ifdef RTE_BBDEV_OFFLOAD_COST
> > +#include <rte_cycles.h>
> > +#endif
> >
> > #include <rte_bbdev.h>
> > #include <rte_bbdev_pmd.h>
> > @@ -449,7 +452,6 @@
> > return 0;
> > }
> >
> > -
> > /**
> > * Report a ACC100 queue index which is free
> > * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
> > @@ -634,6 +636,46 @@
> > struct acc100_device *d = dev->data->dev_private;
> >
> > static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
> > + {
> > + .type = RTE_BBDEV_OP_LDPC_ENC,
> > + .cap.ldpc_enc = {
> > + .capability_flags =
> > + RTE_BBDEV_LDPC_RATE_MATCH |
> > + RTE_BBDEV_LDPC_CRC_24B_ATTACH |
> > + RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
> > + .num_buffers_src =
> > + RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > + .num_buffers_dst =
> > + RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > + }
> > + },
> > + {
> > + .type = RTE_BBDEV_OP_LDPC_DEC,
> > + .cap.ldpc_dec = {
> > + .capability_flags =
> > + RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK |
> > + RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP |
> > + RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
> > + RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
> > +#ifdef ACC100_EXT_MEM
> > + RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
> > + RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
> > +#endif
> > + RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE |
> > + RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS |
> > + RTE_BBDEV_LDPC_DECODE_BYPASS |
> > + RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
> > + RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
> > + RTE_BBDEV_LDPC_LLR_COMPRESSION,
> > + .llr_size = 8,
> > + .llr_decimals = 1,
> > + .num_buffers_src =
> > + RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > + .num_buffers_hard_out =
> > + RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > + .num_buffers_soft_out = 0,
> > + }
> > + },
> > RTE_BBDEV_END_OF_CAPABILITIES_LIST()
> > };
> >
> > @@ -669,9 +711,14 @@
> > dev_info->cpu_flag_reqs = NULL;
> > dev_info->min_alignment = 64;
> > dev_info->capabilities = bbdev_capabilities;
> > +#ifdef ACC100_EXT_MEM
> > dev_info->harq_buffer_size = d->ddr_size;
> > +#else
> > + dev_info->harq_buffer_size = 0;
> > +#endif
> > }
> >
> > +
> > static const struct rte_bbdev_ops acc100_bbdev_ops = {
> > .setup_queues = acc100_setup_queues,
> > .close = acc100_dev_close,
> > @@ -696,6 +743,1577 @@
> > {.device_id = 0},
> > };
> >
> > +/* Read flag value 0/1 from bitmap */
> > +static inline bool
> > +check_bit(uint32_t bitmap, uint32_t bitmask)
> > +{
> > + return bitmap & bitmask;
> > +}
> > +
> > +static inline char *
> > +mbuf_append(struct rte_mbuf *m_head, struct rte_mbuf *m, uint16_t len)
> > +{
> > + if (unlikely(len > rte_pktmbuf_tailroom(m)))
> > + return NULL;
> > +
> > + char *tail = (char *)m->buf_addr + m->data_off + m->data_len;
> > + m->data_len = (uint16_t)(m->data_len + len);
> > + m_head->pkt_len = (m_head->pkt_len + len);
> > + return tail;
> > +}
> > +
> > +/* Compute value of k0.
> > + * Based on 3GPP 38.212 Table 5.4.2.1-2
> > + * Starting position of different redundancy versions, k0
> > + */
> > +static inline uint16_t
> > +get_k0(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv_index)
> > +{
> > + if (rv_index == 0)
> > + return 0;
> > + uint16_t n = (bg == 1 ? N_ZC_1 : N_ZC_2) * z_c;
> > + if (n_cb == n) {
> > + if (rv_index == 1)
> > + return (bg == 1 ? K0_1_1 : K0_1_2) * z_c;
> > + else if (rv_index == 2)
> > + return (bg == 1 ? K0_2_1 : K0_2_2) * z_c;
> > + else
> > + return (bg == 1 ? K0_3_1 : K0_3_2) * z_c;
> > + }
> > + /* LBRM case - includes a division by N */
> > + if (rv_index == 1)
> > + return (((bg == 1 ? K0_1_1 : K0_1_2) * n_cb)
> > + / n) * z_c;
> > + else if (rv_index == 2)
> > + return (((bg == 1 ? K0_2_1 : K0_2_2) * n_cb)
> > + / n) * z_c;
> > + else
> > + return (((bg == 1 ? K0_3_1 : K0_3_2) * n_cb)
> > + / n) * z_c;
> > +}
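
As a worked example of the full-buffer branch (n_cb == n), assuming the K0_* and N_ZC_* macros carry the numerators from 3GPP 38.212 Table 5.4.2.1-2 (N_ZC_1 = 66 and K0_2_1 = 33 for basegraph 1): with z_c = 384 and n_cb = 66 * 384 = 25344, rv_index = 2 yields k0 = 33 * 384 = 12672, while rv_index = 0 always returns 0. The LBRM branch applies the same numerator but scales it by n_cb/n before multiplying by z_c.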
> > +
> > +/* Fill in a frame control word for LDPC encoding. */
> > +static inline void
> > +acc100_fcw_le_fill(const struct rte_bbdev_enc_op *op,
> > + struct acc100_fcw_le *fcw, int num_cb)
> > +{
> > + fcw->qm = op->ldpc_enc.q_m;
> > + fcw->nfiller = op->ldpc_enc.n_filler;
> > + fcw->BG = (op->ldpc_enc.basegraph - 1);
> > + fcw->Zc = op->ldpc_enc.z_c;
> > + fcw->ncb = op->ldpc_enc.n_cb;
> > + fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_enc.basegraph,
> > + op->ldpc_enc.rv_index);
> > + fcw->rm_e = op->ldpc_enc.cb_params.e;
> > + fcw->crc_select = check_bit(op->ldpc_enc.op_flags,
> > + RTE_BBDEV_LDPC_CRC_24B_ATTACH);
> > + fcw->bypass_intlv = check_bit(op->ldpc_enc.op_flags,
> > + RTE_BBDEV_LDPC_INTERLEAVER_BYPASS);
> > + fcw->mcb_count = num_cb;
> > +}
> > +
> > +/* Fill in a frame control word for LDPC decoding. */
> > +static inline void
> > +acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
> > + union acc100_harq_layout_data *harq_layout)
> > +{
> > + uint16_t harq_out_length, harq_in_length, ncb_p, k0_p, parity_offset;
> > + uint16_t harq_index;
> > + uint32_t l;
> > + bool harq_prun = false;
> > +
> > + fcw->qm = op->ldpc_dec.q_m;
> > + fcw->nfiller = op->ldpc_dec.n_filler;
> > + fcw->BG = (op->ldpc_dec.basegraph - 1);
> > + fcw->Zc = op->ldpc_dec.z_c;
> > + fcw->ncb = op->ldpc_dec.n_cb;
> > + fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_dec.basegraph,
> > + op->ldpc_dec.rv_index);
> > + if (op->ldpc_dec.code_block_mode == 1)
> > + fcw->rm_e = op->ldpc_dec.cb_params.e;
> > + else
> > + fcw->rm_e = (op->ldpc_dec.tb_params.r <
> > + op->ldpc_dec.tb_params.cab) ?
> > + op->ldpc_dec.tb_params.ea :
> > + op->ldpc_dec.tb_params.eb;
> > +
> > + fcw->hcin_en = check_bit(op->ldpc_dec.op_flags,
> > + RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE);
> > + fcw->hcout_en = check_bit(op->ldpc_dec.op_flags,
> > + RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE);
> > + fcw->crc_select = check_bit(op->ldpc_dec.op_flags,
> > + RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK);
> > + fcw->bypass_dec = check_bit(op->ldpc_dec.op_flags,
> > + RTE_BBDEV_LDPC_DECODE_BYPASS);
> > + fcw->bypass_intlv = check_bit(op->ldpc_dec.op_flags,
> > + RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS);
> > + if (op->ldpc_dec.q_m == 1) {
> > + fcw->bypass_intlv = 1;
> > + fcw->qm = 2;
> > + }
> > + fcw->hcin_decomp_mode = check_bit(op->ldpc_dec.op_flags,
> > + RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> > + fcw->hcout_comp_mode = check_bit(op->ldpc_dec.op_flags,
> > + RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> > + fcw->llr_pack_mode = check_bit(op->ldpc_dec.op_flags,
> > + RTE_BBDEV_LDPC_LLR_COMPRESSION);
> > + harq_index = op->ldpc_dec.harq_combined_output.offset /
> > + ACC100_HARQ_OFFSET;
> > +#ifdef ACC100_EXT_MEM
> > + /* Limit cases when HARQ pruning is valid */
> > + harq_prun = ((op->ldpc_dec.harq_combined_output.offset %
> > + ACC100_HARQ_OFFSET) == 0) &&
> > + (op->ldpc_dec.harq_combined_output.offset <= UINT16_MAX
> > + * ACC100_HARQ_OFFSET);
> > +#endif
> > + if (fcw->hcin_en > 0) {
> > + harq_in_length = op->ldpc_dec.harq_combined_input.length;
> > + if (fcw->hcin_decomp_mode > 0)
> > + harq_in_length = harq_in_length * 8 / 6;
> > + harq_in_length = RTE_ALIGN(harq_in_length, 64);
> > + if ((harq_layout[harq_index].offset > 0) & harq_prun) {
> > + rte_bbdev_log_debug("HARQ IN offset unexpected for now\n");
> > + fcw->hcin_size0 = harq_layout[harq_index].size0;
> > + fcw->hcin_offset = harq_layout[harq_index].offset;
> > + fcw->hcin_size1 = harq_in_length -
> > + harq_layout[harq_index].offset;
> > + } else {
> > + fcw->hcin_size0 = harq_in_length;
> > + fcw->hcin_offset = 0;
> > + fcw->hcin_size1 = 0;
> > + }
> > + } else {
> > + fcw->hcin_size0 = 0;
> > + fcw->hcin_offset = 0;
> > + fcw->hcin_size1 = 0;
> > + }
> > +
> > + fcw->itmax = op->ldpc_dec.iter_max;
> > + fcw->itstop = check_bit(op->ldpc_dec.op_flags,
> > + RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE);
> > + fcw->synd_precoder = fcw->itstop;
> > + /*
> > + * These are all implicitly set
> > + * fcw->synd_post = 0;
> > + * fcw->so_en = 0;
> > + * fcw->so_bypass_rm = 0;
> > + * fcw->so_bypass_intlv = 0;
> > + * fcw->dec_convllr = 0;
> > + * fcw->hcout_convllr = 0;
> > + * fcw->hcout_size1 = 0;
> > + * fcw->so_it = 0;
> > + * fcw->hcout_offset = 0;
> > + * fcw->negstop_th = 0;
> > + * fcw->negstop_it = 0;
> > + * fcw->negstop_en = 0;
> > + * fcw->gain_i = 1;
> > + * fcw->gain_h = 1;
> > + */
> > + if (fcw->hcout_en > 0) {
> > + parity_offset = (op->ldpc_dec.basegraph == 1 ? 20 : 8)
> > + * op->ldpc_dec.z_c - op->ldpc_dec.n_filler;
> > + k0_p = (fcw->k0 > parity_offset) ?
> > + fcw->k0 - op->ldpc_dec.n_filler : fcw->k0;
> > + ncb_p = fcw->ncb - op->ldpc_dec.n_filler;
> > + l = k0_p + fcw->rm_e;
> > + harq_out_length = (uint16_t) fcw->hcin_size0;
> > + harq_out_length = RTE_MIN(RTE_MAX(harq_out_length, l), ncb_p);
> > + harq_out_length = (harq_out_length + 0x3F) & 0xFFC0;
> > + if ((k0_p > fcw->hcin_size0 + ACC100_HARQ_OFFSET_THRESHOLD) && harq_prun) {
> > + fcw->hcout_size0 = (uint16_t) fcw->hcin_size0;
> > + fcw->hcout_offset = k0_p & 0xFFC0;
> > + fcw->hcout_size1 = harq_out_length - fcw->hcout_offset;
> > + } else {
> > + fcw->hcout_size0 = harq_out_length;
> > + fcw->hcout_size1 = 0;
> > + fcw->hcout_offset = 0;
> > + }
> > + harq_layout[harq_index].offset = fcw->hcout_offset;
> > + harq_layout[harq_index].size0 = fcw->hcout_size0;
> > + } else {
> > + fcw->hcout_size0 = 0;
> > + fcw->hcout_size1 = 0;
> > + fcw->hcout_offset = 0;
> > + }
> > +}
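
Two size manipulations in the function above are compact enough to
misread: the 8/6 scaling converts a 6-bit-compressed HARQ byte count back
into LLR samples, and (x + 0x3F) & 0xFFC0 rounds a 16-bit length up to
the next multiple of 64. Equivalent helpers, spelled out:

#include <stdint.h>

static inline uint16_t align64_up(uint16_t x)
{
    return (x + 0x3F) & 0xFFC0;   /* e.g. 100 -> 128, 128 -> 128 */
}

static inline uint32_t harq_decompressed_len(uint32_t bytes_6bit)
{
    return bytes_6bit * 8 / 6;    /* 6-bit packed bytes -> LLR count */
}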
> > +
> > +/**
> > + * Fills descriptor with data pointers of one block type.
> > + *
> > + * @param desc
> > + * Pointer to DMA descriptor.
> > + * @param input
> > + * Pointer to pointer to input data which will be encoded. It can be changed
> > + * and points to next segment in scatter-gather case.
> > + * @param offset
> > + * Input offset in rte_mbuf structure. It is used for calculating the point
> > + * where data is starting.
> > + * @param cb_len
> > + * Length of currently processed Code Block
> > + * @param seg_total_left
> > + * It indicates how many bytes still left in segment (mbuf) for further
> > + * processing.
> > + * @param op_flags
> > + * Store information about device capabilities
> > + * @param next_triplet
> > + * Index for ACC100 DMA Descriptor triplet
> > + *
> > + * @return
> > + * Returns index of next triplet on success, other value if lengths of
> > + * pkt and processed cb do not match.
> > + *
> > + */
> > +static inline int
> > +acc100_dma_fill_blk_type_in(struct acc100_dma_req_desc *desc,
> > + struct rte_mbuf **input, uint32_t *offset, uint32_t cb_len,
> > + uint32_t *seg_total_left, int next_triplet)
> > +{
> > + uint32_t part_len;
> > + struct rte_mbuf *m = *input;
> > +
> > + part_len = (*seg_total_left < cb_len) ? *seg_total_left : cb_len;
> > + cb_len -= part_len;
> > + *seg_total_left -= part_len;
> > +
> > + desc->data_ptrs[next_triplet].address =
> > + rte_pktmbuf_iova_offset(m, *offset);
> > + desc->data_ptrs[next_triplet].blen = part_len;
> > + desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
> > + desc->data_ptrs[next_triplet].last = 0;
> > + desc->data_ptrs[next_triplet].dma_ext = 0;
> > + *offset += part_len;
> > + next_triplet++;
> > +
> > + while (cb_len > 0) {
> > + if (next_triplet < ACC100_DMA_MAX_NUM_POINTERS &&
> > + m->next != NULL) {
> > +
> > + m = m->next;
> > + *seg_total_left = rte_pktmbuf_data_len(m);
> > + part_len = (*seg_total_left < cb_len) ?
> > + *seg_total_left :
> > + cb_len;
> > + desc->data_ptrs[next_triplet].address =
> > + rte_pktmbuf_mtophys(m);
> > + desc->data_ptrs[next_triplet].blen = part_len;
> > + desc->data_ptrs[next_triplet].blkid =
> > + ACC100_DMA_BLKID_IN;
> > + desc->data_ptrs[next_triplet].last = 0;
> > + desc->data_ptrs[next_triplet].dma_ext = 0;
> > + cb_len -= part_len;
> > + *seg_total_left -= part_len;
> > + /* Initializing offset for next segment (mbuf) */
> > + *offset = part_len;
> > + next_triplet++;
> > + } else {
> > + rte_bbdev_log(ERR,
> > + "Some data still left for processing: "
> > + "data_left: %u, next_triplet: %u, next_mbuf: %p",
> > + cb_len, next_triplet, m->next);
> > + return -EINVAL;
> > + }
> > + }
> > + /* Storing new mbuf as it could be changed in scatter-gather case*/
> > + *input = m;
> > +
> > + return next_triplet;
> > +}
> > +
> > +/* Fills descriptor with data pointers of one block type.
> > + * Returns index of next triplet on success, other value if lengths of
> > + * output data and processed mbuf do not match.
> > + */
> > +static inline int
> > +acc100_dma_fill_blk_type_out(struct acc100_dma_req_desc *desc,
> > + struct rte_mbuf *output, uint32_t out_offset,
> > + uint32_t output_len, int next_triplet, int blk_id)
> > +{
> > + desc->data_ptrs[next_triplet].address =
> > + rte_pktmbuf_iova_offset(output, out_offset);
> > + desc->data_ptrs[next_triplet].blen = output_len;
> > + desc->data_ptrs[next_triplet].blkid = blk_id;
> > + desc->data_ptrs[next_triplet].last = 0;
> > + desc->data_ptrs[next_triplet].dma_ext = 0;
> > + next_triplet++;
> > +
> > + return next_triplet;
> > +}
> > +
> > +static inline int
> > +acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
> > + struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
> > + struct rte_mbuf *output, uint32_t *in_offset,
> > + uint32_t *out_offset, uint32_t *out_length,
> > + uint32_t *mbuf_total_left, uint32_t *seg_total_left)
> > +{
> > + int next_triplet = 1; /* FCW already done */
> > + uint16_t K, in_length_in_bits, in_length_in_bytes;
> > + struct rte_bbdev_op_ldpc_enc *enc = &op->ldpc_enc;
> > +
> > + desc->word0 = ACC100_DMA_DESC_TYPE;
> > + desc->word1 = 0; /**< Timestamp could be disabled */
> > + desc->word2 = 0;
> > + desc->word3 = 0;
> > + desc->numCBs = 1;
> > +
> > + K = (enc->basegraph == 1 ? 22 : 10) * enc->z_c;
> > + in_length_in_bits = K - enc->n_filler;
> > + if ((enc->op_flags & RTE_BBDEV_LDPC_CRC_24A_ATTACH) ||
> > + (enc->op_flags & RTE_BBDEV_LDPC_CRC_24B_ATTACH))
> > + in_length_in_bits -= 24;
> > + in_length_in_bytes = in_length_in_bits >> 3;
> > +
> > + if (unlikely((*mbuf_total_left == 0) ||
> > + (*mbuf_total_left < in_length_in_bytes))) {
> > + rte_bbdev_log(ERR,
> > + "Mismatch between mbuf length and included CB sizes:
> > mbuf len %u, cb len %u",
> > + *mbuf_total_left, in_length_in_bytes);
> > + return -1;
> > + }
> > +
> > + next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
> > + in_length_in_bytes,
> > + seg_total_left, next_triplet);
> > + if (unlikely(next_triplet < 0)) {
> > + rte_bbdev_log(ERR,
> > + "Mismatch between data to process and mbuf data
> length
> > in bbdev_op: %p",
> > + op);
> > + return -1;
> > + }
> > + desc->data_ptrs[next_triplet - 1].last = 1;
> > + desc->m2dlen = next_triplet;
> > + *mbuf_total_left -= in_length_in_bytes;
> > +
> > + /* Set output length */
> > + /* Integer round up division by 8 */
> > + *out_length = (enc->cb_params.e + 7) >> 3;
> > +
> > + next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
> > + *out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
> > + if (unlikely(next_triplet < 0)) {
> > + rte_bbdev_log(ERR,
> > + "Mismatch between data to process and mbuf data
> length
> > in bbdev_op: %p",
> > + op);
> > + return -1;
> > + }
> > + op->ldpc_enc.output.length += *out_length;
> > + *out_offset += *out_length;
> > + desc->data_ptrs[next_triplet - 1].last = 1;
> > + desc->data_ptrs[next_triplet - 1].dma_ext = 0;
> > + desc->d2mlen = next_triplet - desc->m2dlen;
> > +
> > + desc->op_addr = op;
> > +
> > + return 0;
> > +}
> > +
> > +static inline int
> > +acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
> > + struct acc100_dma_req_desc *desc,
> > + struct rte_mbuf **input, struct rte_mbuf *h_output,
> > + uint32_t *in_offset, uint32_t *h_out_offset,
> > + uint32_t *h_out_length, uint32_t *mbuf_total_left,
> > + uint32_t *seg_total_left,
> > + struct acc100_fcw_ld *fcw)
> > +{
> > + struct rte_bbdev_op_ldpc_dec *dec = &op->ldpc_dec;
> > + int next_triplet = 1; /* FCW already done */
> > + uint32_t input_length;
> > + uint16_t output_length, crc24_overlap = 0;
> > + uint16_t sys_cols, K, h_p_size, h_np_size;
> > + bool h_comp = check_bit(dec->op_flags,
> > + RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> > +
> > + desc->word0 = ACC100_DMA_DESC_TYPE;
> > + desc->word1 = 0; /**< Timestamp could be disabled */
> > + desc->word2 = 0;
> > + desc->word3 = 0;
> > + desc->numCBs = 1;
> > +
> > + if (check_bit(op->ldpc_dec.op_flags,
> > + RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP))
> > + crc24_overlap = 24;
> > +
> > + /* Compute some LDPC BG lengths */
> > + input_length = dec->cb_params.e;
> > + if (check_bit(op->ldpc_dec.op_flags,
> > + RTE_BBDEV_LDPC_LLR_COMPRESSION))
> > + input_length = (input_length * 3 + 3) / 4;
> > + sys_cols = (dec->basegraph == 1) ? 22 : 10;
> > + K = sys_cols * dec->z_c;
> > + output_length = K - dec->n_filler - crc24_overlap;
> > +
> > + if (unlikely((*mbuf_total_left == 0) ||
> > + (*mbuf_total_left < input_length))) {
> > + rte_bbdev_log(ERR,
> > + "Mismatch between mbuf length and included CB sizes:
> > mbuf len %u, cb len %u",
> > + *mbuf_total_left, input_length);
> > + return -1;
> > + }
> > +
> > + next_triplet = acc100_dma_fill_blk_type_in(desc, input,
> > + in_offset, input_length,
> > + seg_total_left, next_triplet);
> > +
> > + if (unlikely(next_triplet < 0)) {
> > + rte_bbdev_log(ERR,
> > + "Mismatch between data to process and mbuf data
> length
> > in bbdev_op: %p",
> > + op);
> > + return -1;
> > + }
> > +
> > + if (check_bit(op->ldpc_dec.op_flags,
> > + RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
> > + h_p_size = fcw->hcin_size0 + fcw->hcin_size1;
> > + if (h_comp)
> > + h_p_size = (h_p_size * 3 + 3) / 4;
> > + desc->data_ptrs[next_triplet].address =
> > + dec->harq_combined_input.offset;
> > + desc->data_ptrs[next_triplet].blen = h_p_size;
> > + desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN_HARQ;
> > + desc->data_ptrs[next_triplet].dma_ext = 1;
> > +#ifndef ACC100_EXT_MEM
> > + acc100_dma_fill_blk_type_out(
> > + desc,
> > + op->ldpc_dec.harq_combined_input.data,
> > + op->ldpc_dec.harq_combined_input.offset,
> > + h_p_size,
> > + next_triplet,
> > + ACC100_DMA_BLKID_IN_HARQ);
> > +#endif
> > + next_triplet++;
> > + }
> > +
> > + desc->data_ptrs[next_triplet - 1].last = 1;
> > + desc->m2dlen = next_triplet;
> > + *mbuf_total_left -= input_length;
> > +
> > + next_triplet = acc100_dma_fill_blk_type_out(desc, h_output,
> > + *h_out_offset, output_length >> 3, next_triplet,
> > + ACC100_DMA_BLKID_OUT_HARD);
> > + if (unlikely(next_triplet < 0)) {
> > + rte_bbdev_log(ERR,
> > + "Mismatch between data to process and mbuf data
> length
> > in bbdev_op: %p",
> > + op);
> > + return -1;
> > + }
> > +
> > + if (check_bit(op->ldpc_dec.op_flags,
> > + RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
> > + /* Pruned size of the HARQ */
> > + h_p_size = fcw->hcout_size0 + fcw->hcout_size1;
> > + /* Non-Pruned size of the HARQ */
> > + h_np_size = fcw->hcout_offset > 0 ?
> > + fcw->hcout_offset + fcw->hcout_size1 :
> > + h_p_size;
> > + if (h_comp) {
> > + h_np_size = (h_np_size * 3 + 3) / 4;
> > + h_p_size = (h_p_size * 3 + 3) / 4;
> > + }
> > + dec->harq_combined_output.length = h_np_size;
> > + desc->data_ptrs[next_triplet].address =
> > + dec->harq_combined_output.offset;
> > + desc->data_ptrs[next_triplet].blen = h_p_size;
> > + desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_OUT_HARQ;
> > + desc->data_ptrs[next_triplet].dma_ext = 1;
> > +#ifndef ACC100_EXT_MEM
> > + acc100_dma_fill_blk_type_out(
> > + desc,
> > + dec->harq_combined_output.data,
> > + dec->harq_combined_output.offset,
> > + h_p_size,
> > + next_triplet,
> > + ACC100_DMA_BLKID_OUT_HARQ);
> > +#endif
> > + next_triplet++;
> > + }
> > +
> > + *h_out_length = output_length >> 3;
> > + dec->hard_output.length += *h_out_length;
> > + *h_out_offset += *h_out_length;
> > + desc->data_ptrs[next_triplet - 1].last = 1;
> > + desc->d2mlen = next_triplet - desc->m2dlen;
> > +
> > + desc->op_addr = op;
> > +
> > + return 0;
> > +}
> > +
> > +static inline void
> > +acc100_dma_desc_ld_update(struct rte_bbdev_dec_op *op,
> > + struct acc100_dma_req_desc *desc,
> > + struct rte_mbuf *input, struct rte_mbuf *h_output,
> > + uint32_t *in_offset, uint32_t *h_out_offset,
> > + uint32_t *h_out_length,
> > + union acc100_harq_layout_data *harq_layout)
> > +{
> > + int next_triplet = 1; /* FCW already done */
> > + desc->data_ptrs[next_triplet].address =
> > + rte_pktmbuf_iova_offset(input, *in_offset);
> > + next_triplet++;
> > +
> > + if (check_bit(op->ldpc_dec.op_flags,
> > + RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
> > + struct rte_bbdev_op_data hi = op->ldpc_dec.harq_combined_input;
> > + desc->data_ptrs[next_triplet].address = hi.offset;
> > +#ifndef ACC100_EXT_MEM
> > + desc->data_ptrs[next_triplet].address =
> > + rte_pktmbuf_iova_offset(hi.data, hi.offset);
> > +#endif
> > + next_triplet++;
> > + }
> > +
> > + desc->data_ptrs[next_triplet].address =
> > + rte_pktmbuf_iova_offset(h_output, *h_out_offset);
> > + *h_out_length = desc->data_ptrs[next_triplet].blen;
> > + next_triplet++;
> > +
> > + if (check_bit(op->ldpc_dec.op_flags,
> > + RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
> > + desc->data_ptrs[next_triplet].address =
> > + op->ldpc_dec.harq_combined_output.offset;
> > + /* Adjust based on previous operation */
> > + struct rte_bbdev_dec_op *prev_op = desc->op_addr;
> > + op->ldpc_dec.harq_combined_output.length =
> > + prev_op->ldpc_dec.harq_combined_output.length;
> > + int16_t hq_idx = op->ldpc_dec.harq_combined_output.offset /
> > + ACC100_HARQ_OFFSET;
> > + int16_t prev_hq_idx =
> > + prev_op->ldpc_dec.harq_combined_output.offset
> > + / ACC100_HARQ_OFFSET;
> > + harq_layout[hq_idx].val = harq_layout[prev_hq_idx].val;
> > +#ifndef ACC100_EXT_MEM
> > + struct rte_bbdev_op_data ho =
> > + op->ldpc_dec.harq_combined_output;
> > + desc->data_ptrs[next_triplet].address =
> > + rte_pktmbuf_iova_offset(ho.data, ho.offset);
> > +#endif
> > + next_triplet++;
> > + }
> > +
> > + op->ldpc_dec.hard_output.length += *h_out_length;
> > + desc->op_addr = op;
> > +}
> > +
> > +
> > +/* Enqueue a number of operations to HW and update software rings */
> > +static inline void
> > +acc100_dma_enqueue(struct acc100_queue *q, uint16_t n,
> > + struct rte_bbdev_stats *queue_stats)
> > +{
> > + union acc100_enqueue_reg_fmt enq_req;
> > +#ifdef RTE_BBDEV_OFFLOAD_COST
> > + uint64_t start_time = 0;
> > + queue_stats->acc_offload_cycles = 0;
> > + RTE_SET_USED(queue_stats);
> > +#else
> > + RTE_SET_USED(queue_stats);
> > +#endif
> > +
> > + enq_req.val = 0;
> > + /* Setting offset, 100b for 256 DMA Desc */
> > + enq_req.addr_offset = ACC100_DESC_OFFSET;
> > +
> > + /* Split ops into batches */
> > + do {
> > + union acc100_dma_desc *desc;
> > + uint16_t enq_batch_size;
> > + uint64_t offset;
> > + rte_iova_t req_elem_addr;
> > +
> > + enq_batch_size = RTE_MIN(n, MAX_ENQ_BATCH_SIZE);
> > +
> > + /* Set flag on last descriptor in a batch */
> > + desc = q->ring_addr + ((q->sw_ring_head + enq_batch_size - 1) &
> > + q->sw_ring_wrap_mask);
> > + desc->req.last_desc_in_batch = 1;
> > +
> > + /* Calculate the 1st descriptor's address */
> > + offset = ((q->sw_ring_head & q->sw_ring_wrap_mask) *
> > + sizeof(union acc100_dma_desc));
> > + req_elem_addr = q->ring_addr_phys + offset;
> > +
> > + /* Fill enqueue struct */
> > + enq_req.num_elem = enq_batch_size;
> > + /* low 6 bits are not needed */
> > + enq_req.req_elem_addr = (uint32_t)(req_elem_addr >> 6);
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > + rte_memdump(stderr, "Req sdone", desc, sizeof(*desc));
> > +#endif
> > + rte_bbdev_log_debug(
> > + "Enqueue %u reqs (phys %#"PRIx64") to reg %p",
> > + enq_batch_size,
> > + req_elem_addr,
> > + (void *)q->mmio_reg_enqueue);
> > +
> > + rte_wmb();
> > +
> > +#ifdef RTE_BBDEV_OFFLOAD_COST
> > + /* Start time measurement for enqueue function offload. */
> > + start_time = rte_rdtsc_precise();
> > +#endif
> > + rte_bbdev_log(DEBUG, "Debug : MMIO Enqueue");
> > + mmio_write(q->mmio_reg_enqueue, enq_req.val);
> > +
> > +#ifdef RTE_BBDEV_OFFLOAD_COST
> > + queue_stats->acc_offload_cycles +=
> > + rte_rdtsc_precise() - start_time;
> > +#endif
> > +
> > + q->aq_enqueued++;
> > + q->sw_ring_head += enq_batch_size;
> > + n -= enq_batch_size;
> > +
> > + } while (n);
> > +
> > +
> > +}
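
The doorbell write above condenses a whole batch into one 32-bit MMIO
store: the 64-byte-aligned address of the first descriptor (low 6 bits
implicit) and the element count share a single register word via union
acc100_enqueue_reg_fmt. A sketch of that encoding with assumed field
widths; the real bit layout lives in the PMD header and is not
reproduced here:

#include <stdint.h>

/* Illustrative register layout only; the widths are assumptions. */
union example_enq_reg {
    uint32_t val;
    struct {
        uint32_t req_elem_addr:26; /* first descriptor IOVA >> 6 */
        uint32_t num_elem:5;       /* descriptors in this batch */
        uint32_t rsvd:1;
    };
};

static uint32_t
example_doorbell(uint64_t first_desc_iova, uint8_t batch)
{
    union example_enq_reg r;
    r.val = 0;
    /* Low 6 address bits are implicit: descriptors are 64B aligned */
    r.req_elem_addr = (uint32_t)(first_desc_iova >> 6);
    r.num_elem = batch;
    return r.val; /* one MMIO write of r.val kicks the whole batch */
}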
> > +
> > +/* Enqueue a group of encode operations for ACC100 device in CB mode */
> > +static inline int
> > +enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
> > + uint16_t total_enqueued_cbs, int16_t num)
> > +{
> > + union acc100_dma_desc *desc = NULL;
> > + uint32_t out_length;
> > + struct rte_mbuf *output_head, *output;
> > + int i, next_triplet;
> > + uint16_t in_length_in_bytes;
> > + struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
> > +
> > + uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> > + & q->sw_ring_wrap_mask);
> > + desc = q->ring_addr + desc_idx;
> > + acc100_fcw_le_fill(ops[0], &desc->req.fcw_le, num);
> > +
> > + /** This could be done at polling */
> > + desc->req.word0 = ACC100_DMA_DESC_TYPE;
> > + desc->req.word1 = 0; /**< Timestamp could be disabled */
> > + desc->req.word2 = 0;
> > + desc->req.word3 = 0;
> > + desc->req.numCBs = num;
> > +
> > + in_length_in_bytes = ops[0]->ldpc_enc.input.data->data_len;
> > + out_length = (enc->cb_params.e + 7) >> 3;
> > + desc->req.m2dlen = 1 + num;
> > + desc->req.d2mlen = num;
> > + next_triplet = 1;
> > +
> > + for (i = 0; i < num; i++) {
> > + desc->req.data_ptrs[next_triplet].address =
> > + rte_pktmbuf_iova_offset(ops[i]->ldpc_enc.input.data, 0);
> > + desc->req.data_ptrs[next_triplet].blen = in_length_in_bytes;
> > + next_triplet++;
> > + desc->req.data_ptrs[next_triplet].address =
> > + rte_pktmbuf_iova_offset(
> > + ops[i]->ldpc_enc.output.data, 0);
> > + desc->req.data_ptrs[next_triplet].blen = out_length;
> > + next_triplet++;
> > + ops[i]->ldpc_enc.output.length = out_length;
> > + output_head = output = ops[i]->ldpc_enc.output.data;
> > + mbuf_append(output_head, output, out_length);
> > + output->data_len = out_length;
> > + }
> > +
> > + desc->req.op_addr = ops[0];
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > + rte_memdump(stderr, "FCW", &desc->req.fcw_le,
> > + sizeof(desc->req.fcw_le) - 8);
> > + rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> > +#endif
> > +
> > + /* Multiple CBs (one per op) were successfully prepared to enqueue */
> > + return num;
> > +}
> > +
> > +/* Enqueue one encode operation for ACC100 device in CB mode */
> > +static inline int
> > +enqueue_ldpc_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
> > + uint16_t total_enqueued_cbs)
> > +{
> > + union acc100_dma_desc *desc = NULL;
> > + int ret;
> > + uint32_t in_offset, out_offset, out_length, mbuf_total_left,
> > + seg_total_left;
> > + struct rte_mbuf *input, *output_head, *output;
> > +
> > + uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> > + & q->sw_ring_wrap_mask);
> > + desc = q->ring_addr + desc_idx;
> > + acc100_fcw_le_fill(op, &desc->req.fcw_le, 1);
> > +
> > + input = op->ldpc_enc.input.data;
> > + output_head = output = op->ldpc_enc.output.data;
> > + in_offset = op->ldpc_enc.input.offset;
> > + out_offset = op->ldpc_enc.output.offset;
> > + out_length = 0;
> > + mbuf_total_left = op->ldpc_enc.input.length;
> > + seg_total_left = rte_pktmbuf_data_len(op->ldpc_enc.input.data)
> > + - in_offset;
> > +
> > + ret = acc100_dma_desc_le_fill(op, &desc->req, &input, output,
> > + &in_offset, &out_offset, &out_length, &mbuf_total_left,
> > + &seg_total_left);
> > +
> > + if (unlikely(ret < 0))
> > + return ret;
> > +
> > + mbuf_append(output_head, output, out_length);
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > + rte_memdump(stderr, "FCW", &desc->req.fcw_le,
> > + sizeof(desc->req.fcw_le) - 8);
> > + rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> > +
> > + /* Check if any data left after processing one CB */
> > + if (mbuf_total_left != 0) {
> > + rte_bbdev_log(ERR,
> > + "Some date still left after processing one CB:
> > mbuf_total_left = %u",
> > + mbuf_total_left);
> > + return -EINVAL;
> > + }
> > +#endif
> > + /* One CB (one op) was successfully prepared to enqueue */
> > + return 1;
> > +}
> > +
> > +/* Enqueue one decode operation for ACC100 device in CB mode */
> > +static inline int
> > +enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct
> > rte_bbdev_dec_op *op,
> > + uint16_t total_enqueued_cbs, bool same_op)
> > +{
> > + int ret;
> > +
> > + union acc100_dma_desc *desc;
> > + uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> > + & q->sw_ring_wrap_mask);
> > + desc = q->ring_addr + desc_idx;
> > + struct rte_mbuf *input, *h_output_head, *h_output;
> > + uint32_t in_offset, h_out_offset, h_out_length, mbuf_total_left;
> > + input = op->ldpc_dec.input.data;
> > + h_output_head = h_output = op->ldpc_dec.hard_output.data;
> > + in_offset = op->ldpc_dec.input.offset;
> > + h_out_offset = op->ldpc_dec.hard_output.offset;
> > + mbuf_total_left = op->ldpc_dec.input.length;
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > + if (unlikely(input == NULL)) {
> > + rte_bbdev_log(ERR, "Invalid mbuf pointer");
> > + return -EFAULT;
> > + }
> > +#endif
> > + union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
> > +
> > + if (same_op) {
> > + union acc100_dma_desc *prev_desc;
> > + desc_idx = ((q->sw_ring_head + total_enqueued_cbs - 1)
> > + & q->sw_ring_wrap_mask);
> > + prev_desc = q->ring_addr + desc_idx;
> > + uint8_t *prev_ptr = (uint8_t *) prev_desc;
> > + uint8_t *new_ptr = (uint8_t *) desc;
> > + /* Copy first 4 words and BDESCs */
> > + rte_memcpy(new_ptr, prev_ptr, 16);
> > + rte_memcpy(new_ptr + 36, prev_ptr + 36, 40);
> > + desc->req.op_addr = prev_desc->req.op_addr;
> > + /* Copy FCW */
> > + rte_memcpy(new_ptr + ACC100_DESC_FCW_OFFSET,
> > + prev_ptr + ACC100_DESC_FCW_OFFSET,
> > + ACC100_FCW_LD_BLEN);
> > + acc100_dma_desc_ld_update(op, &desc->req, input, h_output,
> > + &in_offset, &h_out_offset,
> > + &h_out_length, harq_layout);
> > + } else {
> > + struct acc100_fcw_ld *fcw;
> > + uint32_t seg_total_left;
> > + fcw = &desc->req.fcw_ld;
> > + acc100_fcw_ld_fill(op, fcw, harq_layout);
> > +
> > + /* Special handling when the requested size exceeds the mbuf */
> > + if (fcw->rm_e < MAX_E_MBUF)
> > + seg_total_left = rte_pktmbuf_data_len(input)
> > + - in_offset;
> > + else
> > + seg_total_left = fcw->rm_e;
> > +
> > + ret = acc100_dma_desc_ld_fill(op, &desc->req, &input, h_output,
> > + &in_offset, &h_out_offset,
> > + &h_out_length, &mbuf_total_left,
> > + &seg_total_left, fcw);
> > + if (unlikely(ret < 0))
> > + return ret;
> > + }
> > +
> > + /* Hard output */
> > + mbuf_append(h_output_head, h_output, h_out_length);
> > +#ifndef ACC100_EXT_MEM
> > + if (op->ldpc_dec.harq_combined_output.length > 0) {
> > + /* Push the HARQ output into host memory */
> > + struct rte_mbuf *hq_output_head, *hq_output;
> > + hq_output_head = op->ldpc_dec.harq_combined_output.data;
> > + hq_output = op->ldpc_dec.harq_combined_output.data;
> > + mbuf_append(hq_output_head, hq_output,
> > + op->ldpc_dec.harq_combined_output.length);
> > + }
> > +#endif
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > + rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
> > + sizeof(desc->req.fcw_ld) - 8);
> > + rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> > +#endif
> > +
> > + /* One CB (one op) was successfully prepared to enqueue */
> > + return 1;
> > +}
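The same_op branch above avoids rebuilding identical control words: it clones the previous descriptor and FCW, then patches only the per-CB data pointers. The same pattern in standalone form (hypothetical descriptor layout, not the real ACC100 one):

    #include <stdint.h>
    #include <string.h>

    struct fake_desc {
            uint32_t words[4]; /* control words, identical across muxed ops */
            uint64_t in_ptr;   /* per-op input address */
            uint64_t out_ptr;  /* per-op output address */
    };

    /* Clone shared control words from prev, then patch per-op fields. */
    static void clone_and_patch(struct fake_desc *dst,
                    const struct fake_desc *prev, uint64_t in, uint64_t out)
    {
            memcpy(dst->words, prev->words, sizeof(dst->words));
            dst->in_ptr = in;
            dst->out_ptr = out;
    }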
> > +
> > +
> > +/* Enqueue one decode operation for ACC100 device in TB mode */
> > +static inline int
> > +enqueue_ldpc_dec_one_op_tb(struct acc100_queue *q, struct
> > rte_bbdev_dec_op *op,
> > + uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
> > +{
> > + union acc100_dma_desc *desc = NULL;
> > + int ret;
> > + uint8_t r, c;
> > + uint32_t in_offset, h_out_offset,
> > + h_out_length, mbuf_total_left, seg_total_left;
> > + struct rte_mbuf *input, *h_output_head, *h_output;
> > + uint16_t current_enqueued_cbs = 0;
> > +
> > + uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> > + & q->sw_ring_wrap_mask);
> > + desc = q->ring_addr + desc_idx;
> > + uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
> > + union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
> > + acc100_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout);
> > +
> > + input = op->ldpc_dec.input.data;
> > + h_output_head = h_output = op->ldpc_dec.hard_output.data;
> > + in_offset = op->ldpc_dec.input.offset;
> > + h_out_offset = op->ldpc_dec.hard_output.offset;
> > + h_out_length = 0;
> > + mbuf_total_left = op->ldpc_dec.input.length;
> > + c = op->ldpc_dec.tb_params.c;
> > + r = op->ldpc_dec.tb_params.r;
> > +
> > + while (mbuf_total_left > 0 && r < c) {
> > +
> > + seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
> > +
> > + /* Set up DMA descriptor */
> > + desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
> > + & q->sw_ring_wrap_mask);
> > + desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
> > + desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
> > + ret = acc100_dma_desc_ld_fill(op, &desc->req, &input,
> > + h_output, &in_offset, &h_out_offset,
> > + &h_out_length,
> > + &mbuf_total_left, &seg_total_left,
> > + &desc->req.fcw_ld);
> > +
> > + if (unlikely(ret < 0))
> > + return ret;
> > +
> > + /* Hard output */
> > + mbuf_append(h_output_head, h_output, h_out_length);
> > +
> > + /* Set total number of CBs in TB */
> > + desc->req.cbs_in_tb = cbs_in_tb;
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > + rte_memdump(stderr, "FCW", &desc->req.fcw_td,
> > + sizeof(desc->req.fcw_td) - 8);
> > + rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> > +#endif
> > +
> > + if (seg_total_left == 0) {
> > + /* Go to the next mbuf */
> > + input = input->next;
> > + in_offset = 0;
> > + h_output = h_output->next;
> > + h_out_offset = 0;
> > + }
> > + total_enqueued_cbs++;
> > + current_enqueued_cbs++;
> > + r++;
> > + }
> > +
> > + if (unlikely(desc == NULL))
> > + return current_enqueued_cbs;
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > + /* Check if any CBs left for processing */
> > + if (mbuf_total_left != 0) {
> > + rte_bbdev_log(ERR,
> > + "Some date still left for processing: mbuf_total_left =
> %u",
> > + mbuf_total_left);
> > + return -EINVAL;
> > + }
> > +#endif
> > + /* Set SDone on last CB descriptor for TB mode */
> > + desc->req.sdone_enable = 1;
> > + desc->req.irq_enable = q->irq_enable;
> > +
> > + return current_enqueued_cbs;
> > +}
> > +
> > +
> > +/* Calculates number of CBs in processed encoder TB based on 'r' and input
> > + * length.
> > + */
> > +static inline uint8_t
> > +get_num_cbs_in_tb_enc(struct rte_bbdev_op_turbo_enc *turbo_enc)
> > +{
> > + uint8_t c, c_neg, r, crc24_bits = 0;
> > + uint16_t k, k_neg, k_pos;
> > + uint8_t cbs_in_tb = 0;
> > + int32_t length;
> > +
> > + length = turbo_enc->input.length;
> > + r = turbo_enc->tb_params.r;
> > + c = turbo_enc->tb_params.c;
> > + c_neg = turbo_enc->tb_params.c_neg;
> > + k_neg = turbo_enc->tb_params.k_neg;
> > + k_pos = turbo_enc->tb_params.k_pos;
> > + crc24_bits = 0;
> > + if (check_bit(turbo_enc->op_flags,
> > RTE_BBDEV_TURBO_CRC_24B_ATTACH))
> > + crc24_bits = 24;
> > + while (length > 0 && r < c) {
> > + k = (r < c_neg) ? k_neg : k_pos;
> > + length -= (k - crc24_bits) >> 3;
> > + r++;
> > + cbs_in_tb++;
> > + }
> > +
> > + return cbs_in_tb;
> > +}
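To make the loop above concrete: each code block consumes (k - crc24_bits)/8 input bytes, with the first c_neg blocks using k_neg. A standalone replica with hypothetical 4G parameters (k in bits, length in bytes):

    #include <stdint.h>
    #include <stdio.h>

    static uint8_t num_cbs(int32_t length, uint8_t r, uint8_t c,
                    uint8_t c_neg, uint16_t k_neg, uint16_t k_pos,
                    uint8_t crc24_bits)
    {
            uint8_t cbs = 0;
            while (length > 0 && r < c) {
                    uint16_t k = (r < c_neg) ? k_neg : k_pos;
                    length -= (k - crc24_bits) >> 3; /* bits to bytes */
                    r++;
                    cbs++;
            }
            return cbs;
    }

    int main(void)
    {
            /* 2 CBs of 2048 bits then 2 CBs of 2112 bits, CRC24B
             * attached: 253 + 253 + 261 + 261 = 1028 input bytes. */
            printf("%u\n", num_cbs(1028, 0, 4, 2, 2048, 2112, 24)); /* 4 */
            return 0;
    }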
> > +
> > +/* Calculates number of CBs in processed decoder TB based on 'r' and input
> > + * length.
> > + */
> > +static inline uint16_t
> > +get_num_cbs_in_tb_dec(struct rte_bbdev_op_turbo_dec *turbo_dec)
> > +{
> > + uint8_t c, c_neg, r = 0;
> > + uint16_t kw, k, k_neg, k_pos, cbs_in_tb = 0;
> > + int32_t length;
> > +
> > + length = turbo_dec->input.length;
> > + r = turbo_dec->tb_params.r;
> > + c = turbo_dec->tb_params.c;
> > + c_neg = turbo_dec->tb_params.c_neg;
> > + k_neg = turbo_dec->tb_params.k_neg;
> > + k_pos = turbo_dec->tb_params.k_pos;
> > + while (length > 0 && r < c) {
> > + k = (r < c_neg) ? k_neg : k_pos;
> > + kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
> > + length -= kw;
> > + r++;
> > + cbs_in_tb++;
> > + }
> > +
> > + return cbs_in_tb;
> > +}
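kw above is the per-CB circular-buffer size for 4G decode: three streams of K+4 soft bits each, aligned up to 32. A quick standalone check (K chosen as the largest 4G CB size):

    #include <assert.h>
    #include <stdint.h>

    /* Same rounding as RTE_ALIGN_CEIL, spelled out. */
    #define ALIGN_CEIL(v, a) (((v) + (a) - 1) / (a) * (a))

    int main(void)
    {
            uint16_t k = 6144;                       /* largest 4G CB */
            uint16_t kw = ALIGN_CEIL(k + 4, 32) * 3; /* 6176 * 3 */
            assert(kw == 18528);
            return 0;
    }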
> > +
> > +/* Calculates number of CBs in processed LDPC decoder TB based on 'r'
> > + * and input length.
> > + */
> > +static inline uint16_t
> > +get_num_cbs_in_tb_ldpc_dec(struct rte_bbdev_op_ldpc_dec *ldpc_dec)
> > +{
> > + uint16_t r, cbs_in_tb = 0;
> > + int32_t length = ldpc_dec->input.length;
> > + r = ldpc_dec->tb_params.r;
> > + while (length > 0 && r < ldpc_dec->tb_params.c) {
> > + length -= (r < ldpc_dec->tb_params.cab) ?
> > + ldpc_dec->tb_params.ea :
> > + ldpc_dec->tb_params.eb;
> > + r++;
> > + cbs_in_tb++;
> > + }
> > + return cbs_in_tb;
> > +}
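For the LDPC case the per-CB sizes are the two rate-matched lengths ea/eb, with the first cab CBs using ea. Mirroring the loop with hypothetical values:

    #include <stdint.h>
    #include <stdio.h>

    /* c = 3 CBs in the TB, the first cab = 1 of them sized ea. */
    int main(void)
    {
            int32_t length = 1000, ea = 400, eb = 300;
            uint16_t r = 0, c = 3, cab = 1, cbs = 0;

            while (length > 0 && r < c) {
                    length -= (r < cab) ? ea : eb;
                    r++;
                    cbs++;
            }
            printf("cbs_in_tb = %u\n", cbs); /* prints 3 */
            return 0;
    }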
> > +
> > +/* Check we can mux encode operations with common FCW */
> > +static inline bool
> > +check_mux(struct rte_bbdev_enc_op **ops, uint16_t num) {
> > + uint16_t i;
> > + if (num == 1)
> > + return false;
> > + for (i = 1; i < num; ++i) {
> > + /* Only mux compatible code blocks */
> > + if (memcmp((uint8_t *)(&ops[i]->ldpc_enc) + ENC_OFFSET,
> > + (uint8_t *)(&ops[0]->ldpc_enc) + ENC_OFFSET,
> > + CMP_ENC_SIZE) != 0)
> > + return false;
> > + }
> > + return true;
> > +}
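check_mux above (and cmp_ldpc_dec_op further down) both compare a fixed byte window of the op struct, skipping the leading per-op data pointers; ENC_OFFSET/CMP_ENC_SIZE are PMD constants defining that window. The same idea standalone, with a made-up struct (ops assumed zero-initialized so padding compares consistently):

    #include <stdbool.h>
    #include <stddef.h>
    #include <string.h>

    struct fake_enc {
            void *in, *out; /* per-op pointers, excluded from comparison */
            int basegraph, z_c, n_cb, q_m, rv_index; /* must match to mux */
    };

    #define CMP_OFF  offsetof(struct fake_enc, basegraph)
    #define CMP_SIZE (sizeof(struct fake_enc) - CMP_OFF)

    static bool can_mux(const struct fake_enc *a, const struct fake_enc *b)
    {
            return memcmp((const char *)a + CMP_OFF,
                            (const char *)b + CMP_OFF, CMP_SIZE) == 0;
    }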
> > +
> > +/** Enqueue encode operations for ACC100 device in CB mode. */
> > +static inline uint16_t
> > +acc100_enqueue_ldpc_enc_cb(struct rte_bbdev_queue_data *q_data,
> > + struct rte_bbdev_enc_op **ops, uint16_t num)
> > +{
> > + struct acc100_queue *q = q_data->queue_private;
> > + int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> > + uint16_t i = 0;
> > + union acc100_dma_desc *desc;
> > + int ret, desc_idx = 0;
> > + int16_t enq, left = num;
> > +
> > + while (left > 0) {
> > + if (unlikely(avail - 1 < 0))
> > + break;
> > + avail--;
> > + enq = RTE_MIN(left, MUX_5GDL_DESC);
> > + if (check_mux(&ops[i], enq)) {
> > + ret = enqueue_ldpc_enc_n_op_cb(q, &ops[i],
> > + desc_idx, enq);
> > + if (ret < 0)
> > + break;
> > + i += enq;
> > + } else {
> > + ret = enqueue_ldpc_enc_one_op_cb(q, ops[i], desc_idx);
> > + if (ret < 0)
> > + break;
> > + i++;
> > + }
> > + desc_idx++;
> > + left = num - i;
> > + }
> > +
> > + if (unlikely(i == 0))
> > + return 0; /* Nothing to enqueue */
> > +
> > + /* Set SDone in last CB in enqueued ops for CB mode */
> > + desc = q->ring_addr + ((q->sw_ring_head + desc_idx - 1)
> > + & q->sw_ring_wrap_mask);
> > + desc->req.sdone_enable = 1;
> > + desc->req.irq_enable = q->irq_enable;
> > +
> > + acc100_dma_enqueue(q, desc_idx, &q_data->queue_stats);
> > +
> > + /* Update stats */
> > + q_data->queue_stats.enqueued_count += i;
> > + q_data->queue_stats.enqueue_err_count += num - i;
> > +
> > + return i;
> > +}
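The avail bookkeeping above works because sw_ring_head/tail are free-running counters: free slots = depth + tail - head, and every array access is masked. A minimal model of that ring arithmetic (power-of-two depth assumed):

    #include <stdint.h>

    struct ring {
            uint32_t depth;      /* power of two */
            uint32_t head, tail; /* free-running, never wrapped */
    };

    static int32_t ring_free(const struct ring *r)
    {
            return (int32_t)(r->depth + r->tail - r->head);
    }

    static uint32_t ring_slot(const struct ring *r, uint32_t off)
    {
            return (r->head + off) & (r->depth - 1); /* wrap mask */
    }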
> > +
> > +/* Enqueue encode operations for ACC100 device. */
> > +static uint16_t
> > +acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
> > + struct rte_bbdev_enc_op **ops, uint16_t num)
> > +{
> > + if (unlikely(num == 0))
> > + return 0;
> > + return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
> > +}
> > +
> > +/* Check we can mux decode operations with common FCW */
> > +static inline bool
> > +cmp_ldpc_dec_op(struct rte_bbdev_dec_op **ops) {
> > + /* Only mux compatible code blocks */
> > + if (memcmp((uint8_t *)(&ops[0]->ldpc_dec) + DEC_OFFSET,
> > + (uint8_t *)(&ops[1]->ldpc_dec) +
> > + DEC_OFFSET, CMP_DEC_SIZE) != 0) {
> > + return false;
> > + } else
> > + return true;
> > +}
> > +
> > +
> > +/* Enqueue decode operations for ACC100 device in TB mode */
> > +static uint16_t
> > +acc100_enqueue_ldpc_dec_tb(struct rte_bbdev_queue_data *q_data,
> > + struct rte_bbdev_dec_op **ops, uint16_t num)
> > +{
> > + struct acc100_queue *q = q_data->queue_private;
> > + int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> > + uint16_t i, enqueued_cbs = 0;
> > + uint8_t cbs_in_tb;
> > + int ret;
> > +
> > + for (i = 0; i < num; ++i) {
> > + cbs_in_tb = get_num_cbs_in_tb_ldpc_dec(&ops[i]->ldpc_dec);
> > + /* Check if there is space available for further processing */
> > + if (unlikely(avail - cbs_in_tb < 0))
> > + break;
> > + avail -= cbs_in_tb;
> > +
> > + ret = enqueue_ldpc_dec_one_op_tb(q, ops[i],
> > + enqueued_cbs, cbs_in_tb);
> > + if (ret < 0)
> > + break;
> > + enqueued_cbs += ret;
> > + }
> > +
> > + acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
> > +
> > + /* Update stats */
> > + q_data->queue_stats.enqueued_count += i;
> > + q_data->queue_stats.enqueue_err_count += num - i;
> > + return i;
> > +}
> > +
> > +/* Enqueue decode operations for ACC100 device in CB mode */
> > +static uint16_t
> > +acc100_enqueue_ldpc_dec_cb(struct rte_bbdev_queue_data *q_data,
> > + struct rte_bbdev_dec_op **ops, uint16_t num)
> > +{
> > + struct acc100_queue *q = q_data->queue_private;
> > + int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> > + uint16_t i;
> > + union acc100_dma_desc *desc;
> > + int ret;
> > + bool same_op = false;
> > + for (i = 0; i < num; ++i) {
> > + /* Check if there is space available for further processing */
> > + if (unlikely(avail - 1 < 0))
> > + break;
> > + avail -= 1;
> > +
> > + if (i > 0)
> > + same_op = cmp_ldpc_dec_op(&ops[i-1]);
> > + rte_bbdev_log(INFO, "Op %d %d %d %d %d %d %d %d %d %d %d
> > %d\n",
> > + i, ops[i]->ldpc_dec.op_flags, ops[i]->ldpc_dec.rv_index,
> > + ops[i]->ldpc_dec.iter_max, ops[i]->ldpc_dec.iter_count,
> > + ops[i]->ldpc_dec.basegraph, ops[i]->ldpc_dec.z_c,
> > + ops[i]->ldpc_dec.n_cb, ops[i]->ldpc_dec.q_m,
> > + ops[i]->ldpc_dec.n_filler, ops[i]->ldpc_dec.cb_params.e,
> > + same_op);
> > + ret = enqueue_ldpc_dec_one_op_cb(q, ops[i], i, same_op);
> > + if (ret < 0)
> > + break;
> > + }
> > +
> > + if (unlikely(i == 0))
> > + return 0; /* Nothing to enqueue */
> > +
> > + /* Set SDone in last CB in enqueued ops for CB mode */
> > + desc = q->ring_addr + ((q->sw_ring_head + i - 1)
> > + & q->sw_ring_wrap_mask);
> > +
> > + desc->req.sdone_enable = 1;
> > + desc->req.irq_enable = q->irq_enable;
> > +
> > + acc100_dma_enqueue(q, i, &q_data->queue_stats);
> > +
> > + /* Update stats */
> > + q_data->queue_stats.enqueued_count += i;
> > + q_data->queue_stats.enqueue_err_count += num - i;
> > + return i;
> > +}
> > +
> > +/* Enqueue decode operations for ACC100 device. */
> > +static uint16_t
> > +acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
> > + struct rte_bbdev_dec_op **ops, uint16_t num)
> > +{
> > + struct acc100_queue *q = q_data->queue_private;
> > + int32_t aq_avail = q->aq_depth +
> > + (q->aq_dequeued - q->aq_enqueued) / 128;
> > +
> > + if (unlikely((aq_avail == 0) || (num == 0)))
> > + return 0;
> > +
> > + if (ops[0]->ldpc_dec.code_block_mode == 0)
> > + return acc100_enqueue_ldpc_dec_tb(q_data, ops, num);
> > + else
> > + return acc100_enqueue_ldpc_dec_cb(q_data, ops, num);
> > +}
> > +
> > +
> > +/* Dequeue one encode operation from ACC100 device in CB mode */
> > +static inline int
> > +dequeue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op
> > **ref_op,
> > + uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
> > +{
> > + union acc100_dma_desc *desc, atom_desc;
> > + union acc100_dma_rsp_desc rsp;
> > + struct rte_bbdev_enc_op *op;
> > + int i;
> > +
> > + desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
> > + & q->sw_ring_wrap_mask);
> > + atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> > + __ATOMIC_RELAXED);
> > +
> > + /* Check fdone bit */
> > + if (!(atom_desc.rsp.val & ACC100_FDONE))
> > + return -1;
> > +
> > + rsp.val = atom_desc.rsp.val;
> > + rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
> > +
> > + /* Dequeue */
> > + op = desc->req.op_addr;
> > +
> > + /* Clearing status, it will be set based on response */
> > + op->status = 0;
> > +
> > + op->status |= ((rsp.input_err)
> > + ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> > + op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> > + op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> > +
> > + if (desc->req.last_desc_in_batch) {
> > + (*aq_dequeued)++;
> > + desc->req.last_desc_in_batch = 0;
> > + }
> > + desc->rsp.val = ACC100_DMA_DESC_TYPE;
> > + desc->rsp.add_info_0 = 0; /* Reserved bits */
> > + desc->rsp.add_info_1 = 0; /* Reserved bits */
> > +
> > + /* Flag that the muxing causes loss of opaque data */
> > + op->opaque_data = (void *)-1;
> > + for (i = 0 ; i < desc->req.numCBs; i++)
> > + ref_op[i] = op;
> > +
> > + /* All CBs (ops) in the descriptor were successfully dequeued */
> > + return desc->req.numCBs;
> > +}
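Note on the polling above: the whole 8-byte response header is loaded atomically before FDONE is inspected, so a concurrently-writing device can never expose a torn, half-written word. The shape of that check standalone (GCC/Clang builtin; the bit value is illustrative):

    #include <stdbool.h>
    #include <stdint.h>

    #define FDONE_BIT 0x80000000u /* assumed completion bit */

    union rsp_hdr {
            uint64_t atom; /* whole header read in one atomic load */
            struct { uint32_t val; uint32_t rest; } rsp;
    };

    static bool op_done(const uint64_t *desc_hdr)
    {
            union rsp_hdr h;

            h.atom = __atomic_load_n(desc_hdr, __ATOMIC_RELAXED);
            return (h.rsp.val & FDONE_BIT) != 0;
    }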
> > +
> > +/* Dequeue one encode operation from ACC100 device in TB mode */
> > +static inline int
> > +dequeue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op
> > **ref_op,
> > + uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
> > +{
> > + union acc100_dma_desc *desc, *last_desc, atom_desc;
> > + union acc100_dma_rsp_desc rsp;
> > + struct rte_bbdev_enc_op *op;
> > + uint8_t i = 0;
> > + uint16_t current_dequeued_cbs = 0, cbs_in_tb;
> > +
> > + desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
> > + & q->sw_ring_wrap_mask);
> > + atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> > + __ATOMIC_RELAXED);
> > +
> > + /* Check fdone bit */
> > + if (!(atom_desc.rsp.val & ACC100_FDONE))
> > + return -1;
> > +
> > + /* Get number of CBs in dequeued TB */
> > + cbs_in_tb = desc->req.cbs_in_tb;
> > + /* Get last CB */
> > + last_desc = q->ring_addr + ((q->sw_ring_tail
> > + + total_dequeued_cbs + cbs_in_tb - 1)
> > + & q->sw_ring_wrap_mask);
> > + /* Check if last CB in TB is ready to dequeue (and thus
> > + * the whole TB) - checking sdone bit. If not return.
> > + */
> > + atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
> > + __ATOMIC_RELAXED);
> > + if (!(atom_desc.rsp.val & ACC100_SDONE))
> > + return -1;
> > +
> > + /* Dequeue */
> > + op = desc->req.op_addr;
> > +
> > + /* Clearing status, it will be set based on response */
> > + op->status = 0;
> > +
> > + while (i < cbs_in_tb) {
> > + desc = q->ring_addr + ((q->sw_ring_tail
> > + + total_dequeued_cbs)
> > + & q->sw_ring_wrap_mask);
> > + atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> > + __ATOMIC_RELAXED);
> > + rsp.val = atom_desc.rsp.val;
> > + rte_bbdev_log_debug("Resp. desc %p: %x", desc,
> > + rsp.val);
> > +
> > + op->status |= ((rsp.input_err)
> > + ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> > + op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) :
> 0);
> > + op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> > +
> > + if (desc->req.last_desc_in_batch) {
> > + (*aq_dequeued)++;
> > + desc->req.last_desc_in_batch = 0;
> > + }
> > + desc->rsp.val = ACC100_DMA_DESC_TYPE;
> > + desc->rsp.add_info_0 = 0;
> > + desc->rsp.add_info_1 = 0;
> > + total_dequeued_cbs++;
> > + current_dequeued_cbs++;
> > + i++;
> > + }
> > +
> > + *ref_op = op;
> > +
> > + return current_dequeued_cbs;
> > +}
> > +
> > +/* Dequeue one decode operation from ACC100 device in CB mode */
> > +static inline int
> > +dequeue_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
> > + struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
> > + uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> > +{
> > + union acc100_dma_desc *desc, atom_desc;
> > + union acc100_dma_rsp_desc rsp;
> > + struct rte_bbdev_dec_op *op;
> > +
> > + desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> > + & q->sw_ring_wrap_mask);
> > + atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> > + __ATOMIC_RELAXED);
> > +
> > + /* Check fdone bit */
> > + if (!(atom_desc.rsp.val & ACC100_FDONE))
> > + return -1;
> > +
> > + rsp.val = atom_desc.rsp.val;
> > + rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
> > +
> > + /* Dequeue */
> > + op = desc->req.op_addr;
> > +
> > + /* Clearing status, it will be set based on response */
> > + op->status = 0;
> > + op->status |= ((rsp.input_err)
> > + ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> > + op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> > + op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> > + if (op->status != 0)
> > + q_data->queue_stats.dequeue_err_count++;
> > +
> > + /* Report CRC status only when no other error is set */
> > + if (!op->status)
> > + op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
> > + op->turbo_dec.iter_count = (uint8_t) rsp.iter_cnt / 2;
> > + /* Check if this is the last desc in batch (Atomic Queue) */
> > + if (desc->req.last_desc_in_batch) {
> > + (*aq_dequeued)++;
> > + desc->req.last_desc_in_batch = 0;
> > + }
> > + desc->rsp.val = ACC100_DMA_DESC_TYPE;
> > + desc->rsp.add_info_0 = 0;
> > + desc->rsp.add_info_1 = 0;
> > + *ref_op = op;
> > +
> > + /* One CB (op) was successfully dequeued */
> > + return 1;
> > +}
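On the status handling above: op->status packs one flag per error class, with bit positions taken from the rte_bbdev op status enum (RTE_BBDEV_DATA_ERROR and friends); the positions below are stand-ins for illustration only:

    #include <stdbool.h>

    /* Hypothetical bit positions; the real ones are the enum values
     * declared in rte_bbdev_op.h. */
    enum { DATA_ERROR = 1, DRV_ERROR = 2, CRC_ERROR = 3 };

    static int make_status(bool input_err, bool dma_err, bool crc_fail)
    {
            int status = 0;

            status |= input_err << DATA_ERROR;
            status |= dma_err   << DRV_ERROR;
            status |= crc_fail  << CRC_ERROR;
            return status; /* 0 means the op completed cleanly */
    }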
> > +
> > +/* Dequeue one decode operation from ACC100 device in CB mode */
> > +static inline int
> > +dequeue_ldpc_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
> > + struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
> > + uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> > +{
> > + union acc100_dma_desc *desc, atom_desc;
> > + union acc100_dma_rsp_desc rsp;
> > + struct rte_bbdev_dec_op *op;
> > +
> > + desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> > + & q->sw_ring_wrap_mask);
> > + atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> > + __ATOMIC_RELAXED);
> > +
> > + /* Check fdone bit */
> > + if (!(atom_desc.rsp.val & ACC100_FDONE))
> > + return -1;
> > +
> > + rsp.val = atom_desc.rsp.val;
> > +
> > + /* Dequeue */
> > + op = desc->req.op_addr;
> > +
> > + /* Clearing status, it will be set based on response */
> > + op->status = 0;
> > + op->status |= rsp.input_err << RTE_BBDEV_DATA_ERROR;
> > + op->status |= rsp.dma_err << RTE_BBDEV_DRV_ERROR;
> > + op->status |= rsp.fcw_err << RTE_BBDEV_DRV_ERROR;
> > + if (op->status != 0)
> > + q_data->queue_stats.dequeue_err_count++;
> > +
> > + op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
> > + if (op->ldpc_dec.hard_output.length > 0 && !rsp.synd_ok)
> > + op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
> > + op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
> > +
> > + /* Check if this is the last desc in batch (Atomic Queue) */
> > + if (desc->req.last_desc_in_batch) {
> > + (*aq_dequeued)++;
> > + desc->req.last_desc_in_batch = 0;
> > + }
> > +
> > + desc->rsp.val = ACC100_DMA_DESC_TYPE;
> > + desc->rsp.add_info_0 = 0;
> > + desc->rsp.add_info_1 = 0;
> > +
> > + *ref_op = op;
> > +
> > + /* One CB (op) was successfully dequeued */
> > + return 1;
> > +}
> > +
> > +/* Dequeue one decode operation from ACC100 device in TB mode. */
> > +static inline int
> > +dequeue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op
> > **ref_op,
> > + uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> > +{
> > + union acc100_dma_desc *desc, *last_desc, atom_desc;
> > + union acc100_dma_rsp_desc rsp;
> > + struct rte_bbdev_dec_op *op;
> > + uint8_t cbs_in_tb = 1, cb_idx = 0;
> > +
> > + desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> > + & q->sw_ring_wrap_mask);
> > + atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> > + __ATOMIC_RELAXED);
> > +
> > + /* Check fdone bit */
> > + if (!(atom_desc.rsp.val & ACC100_FDONE))
> > + return -1;
> > +
> > + /* Dequeue */
> > + op = desc->req.op_addr;
> > +
> > + /* Get number of CBs in dequeued TB */
> > + cbs_in_tb = desc->req.cbs_in_tb;
> > + /* Get last CB */
> > + last_desc = q->ring_addr + ((q->sw_ring_tail
> > + + dequeued_cbs + cbs_in_tb - 1)
> > + & q->sw_ring_wrap_mask);
> > + /* Check if last CB in TB is ready to dequeue (and thus
> > + * the whole TB) - checking sdone bit. If not return.
> > + */
> > + atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
> > + __ATOMIC_RELAXED);
> > + if (!(atom_desc.rsp.val & ACC100_SDONE))
> > + return -1;
> > +
> > + /* Clearing status, it will be set based on response */
> > + op->status = 0;
> > +
> > + /* Read remaining CBs if any */
> > + while (cb_idx < cbs_in_tb) {
> > + desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> > + & q->sw_ring_wrap_mask);
> > + atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> > + __ATOMIC_RELAXED);
> > + rsp.val = atom_desc.rsp.val;
> > + rte_bbdev_log_debug("Resp. desc %p: %x", desc,
> > + rsp.val);
> > +
> > + op->status |= ((rsp.input_err)
> > + ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> > + op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) :
> 0);
> > + op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> > +
> > + /* Report CRC status only when no other error is set */
> > + if (!op->status)
> > + op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
> > + op->turbo_dec.iter_count = RTE_MAX((uint8_t) rsp.iter_cnt,
> > + op->turbo_dec.iter_count);
> > +
> > + /* Check if this is the last desc in batch (Atomic Queue) */
> > + if (desc->req.last_desc_in_batch) {
> > + (*aq_dequeued)++;
> > + desc->req.last_desc_in_batch = 0;
> > + }
> > + desc->rsp.val = ACC100_DMA_DESC_TYPE;
> > + desc->rsp.add_info_0 = 0;
> > + desc->rsp.add_info_1 = 0;
> > + dequeued_cbs++;
> > + cb_idx++;
> > + }
> > +
> > + *ref_op = op;
> > +
> > + return cb_idx;
> > +}
> > +
> > +/* Dequeue LDPC encode operations from ACC100 device. */
> > +static uint16_t
> > +acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
> > + struct rte_bbdev_enc_op **ops, uint16_t num)
> > +{
> > + struct acc100_queue *q = q_data->queue_private;
> > + uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
> > + uint32_t aq_dequeued = 0;
> > + uint16_t dequeue_num, i, dequeued_cbs = 0, dequeued_descs = 0;
> > + int ret;
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > + if (unlikely(ops == NULL || q == NULL))
> > + return 0;
> > +#endif
> > +
> > + dequeue_num = (avail < num) ? avail : num;
> > +
> > + for (i = 0; i < dequeue_num; i++) {
> > + ret = dequeue_enc_one_op_cb(q, &ops[dequeued_cbs],
> > + dequeued_descs, &aq_dequeued);
> > + if (ret < 0)
> > + break;
> > + dequeued_cbs += ret;
> > + dequeued_descs++;
> > + if (dequeued_cbs >= num)
> > + break;
> > + }
> > +
> > + q->aq_dequeued += aq_dequeued;
> > + q->sw_ring_tail += dequeued_descs;
> > +
> > + /* Update dequeue stats */
> > + q_data->queue_stats.dequeued_count += dequeued_cbs;
> > +
> > + return dequeued_cbs;
> > +}
> > +
> > +/* Dequeue decode operations from ACC100 device. */
> > +static uint16_t
> > +acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
> > + struct rte_bbdev_dec_op **ops, uint16_t num)
> > +{
> > + struct acc100_queue *q = q_data->queue_private;
> > + uint16_t dequeue_num;
> > + uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
> > + uint32_t aq_dequeued = 0;
> > + uint16_t i;
> > + uint16_t dequeued_cbs = 0;
> > + struct rte_bbdev_dec_op *op;
> > + int ret;
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > + if (unlikely(ops == NULL || q == NULL))
> > + return 0;
> > +#endif
> > +
> > + dequeue_num = (avail < num) ? avail : num;
> > +
> > + for (i = 0; i < dequeue_num; ++i) {
> > + op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> > + & q->sw_ring_wrap_mask))->req.op_addr;
> > + if (op->ldpc_dec.code_block_mode == 0)
> > + ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
> > + &aq_dequeued);
> > + else
> > + ret = dequeue_ldpc_dec_one_op_cb(
> > + q_data, q, &ops[i], dequeued_cbs,
> > + &aq_dequeued);
> > +
> > + if (ret < 0)
> > + break;
> > + dequeued_cbs += ret;
> > + }
> > +
> > + q->aq_dequeued += aq_dequeued;
> > + q->sw_ring_tail += dequeued_cbs;
> > +
> > + /* Update dequeue stats */
> > + q_data->queue_stats.dequeued_count += i;
> > +
> > + return i;
> > +}
> > +
> > /* Initialization Function */
> > static void
> > acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
> > @@ -703,6 +2321,10 @@
> > struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
> >
> > dev->dev_ops = &acc100_bbdev_ops;
> > + dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
> > + dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
> > + dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
> > + dev->dequeue_ldpc_dec_ops = acc100_dequeue_ldpc_dec;
> >
> > ((struct acc100_device *) dev->data->dev_private)->pf_device =
> > !strcmp(drv->driver.name,
> > @@ -815,4 +2437,3 @@ static int acc100_pci_remove(struct rte_pci_device
> > *pci_dev)
> > RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME,
> > pci_id_acc100_pf_map);
> > RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
> > RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME,
> > pci_id_acc100_vf_map);
> > -
> > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h
> > b/drivers/baseband/acc100/rte_acc100_pmd.h
> > index 0e2b79c..78686c1 100644
> > --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> > +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> > @@ -88,6 +88,8 @@
> > #define TMPL_PRI_3 0x0f0e0d0c
> > #define QUEUE_ENABLE 0x80000000 /* Bit to mark Queue as Enabled */
> > #define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
> > +#define ACC100_FDONE 0x80000000
> > +#define ACC100_SDONE 0x40000000
> >
> > #define ACC100_NUM_TMPL 32
> > #define VF_OFFSET_QOS 16 /* offset in Memory Space specific to QoS Mon
> */
> > @@ -398,6 +400,7 @@ struct __rte_packed acc100_dma_req_desc {
> > union acc100_dma_desc {
> > struct acc100_dma_req_desc req;
> > union acc100_dma_rsp_desc rsp;
> > + uint64_t atom_hdr;
> > };
> >
> >
> > --
> > 1.8.3.1
^ permalink raw reply [flat|nested] 213+ messages in thread
* Re: [dpdk-dev] [PATCH v3 01/11] drivers/baseband: add PMD for ACC100
2020-08-19 0:25 ` [dpdk-dev] [PATCH v3 01/11] drivers/baseband: add PMD for ACC100 Nicolas Chautru
@ 2020-08-29 9:44 ` Xu, Rosen
2020-09-04 16:44 ` Chautru, Nicolas
0 siblings, 1 reply; 213+ messages in thread
From: Xu, Rosen @ 2020-08-29 9:44 UTC (permalink / raw)
To: Chautru, Nicolas, dev, akhil.goyal
Cc: Richardson, Bruce, Chautru, Nicolas, Xu, Rosen
Hi,
> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Nicolas Chautru
> Sent: Wednesday, August 19, 2020 8:25
> To: dev@dpdk.org; akhil.goyal@nxp.com
> Cc: Richardson, Bruce <bruce.richardson@intel.com>; Chautru, Nicolas
> <nicolas.chautru@intel.com>
> Subject: [dpdk-dev] [PATCH v3 01/11] drivers/baseband: add PMD for
> ACC100
>
> Add stubs for the ACC100 PMD
>
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
> config/common_base | 4 +
> doc/guides/bbdevs/acc100.rst | 233 +++++++++++++++++++++
> doc/guides/bbdevs/index.rst | 1 +
> doc/guides/rel_notes/release_20_11.rst | 6 +
> drivers/baseband/Makefile | 2 +
> drivers/baseband/acc100/Makefile | 25 +++
> drivers/baseband/acc100/meson.build | 6 +
> drivers/baseband/acc100/rte_acc100_pmd.c | 175 ++++++++++++++++
> drivers/baseband/acc100/rte_acc100_pmd.h | 37 ++++
> .../acc100/rte_pmd_bbdev_acc100_version.map | 3 +
> drivers/baseband/meson.build | 2 +-
> mk/rte.app.mk | 1 +
> 12 files changed, 494 insertions(+), 1 deletion(-) create mode 100644
> doc/guides/bbdevs/acc100.rst create mode 100644
> drivers/baseband/acc100/Makefile create mode 100644
> drivers/baseband/acc100/meson.build
> create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.c
> create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.h
> create mode 100644
> drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
>
> diff --git a/config/common_base b/config/common_base index
> fbf0ee7..218ab16 100644
> --- a/config/common_base
> +++ b/config/common_base
> @@ -584,6 +584,10 @@ CONFIG_RTE_LIBRTE_PMD_BBDEV_NULL=y
> #
> CONFIG_RTE_LIBRTE_PMD_BBDEV_TURBO_SW=y
>
> +# Compile PMD for ACC100 bbdev device
> +#
> +CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100=y
> +
> #
> # Compile PMD for Intel FPGA LTE FEC bbdev device # diff --git
> a/doc/guides/bbdevs/acc100.rst b/doc/guides/bbdevs/acc100.rst new file
> mode 100644 index 0000000..f87ee09
> --- /dev/null
> +++ b/doc/guides/bbdevs/acc100.rst
> @@ -0,0 +1,233 @@
> +.. SPDX-License-Identifier: BSD-3-Clause
> + Copyright(c) 2020 Intel Corporation
> +
> +Intel(R) ACC100 5G/4G FEC Poll Mode Driver
> +==========================================
> +
> +The BBDEV ACC100 5G/4G FEC poll mode driver (PMD) supports an
> +implementation of a VRAN FEC wireless acceleration function.
> +This device is also known as Mount Bryce.
> +
> +Features
> +--------
> +
> +ACC100 5G/4G FEC PMD supports the following features:
> +
> +- LDPC Encode in the DL (5GNR)
> +- LDPC Decode in the UL (5GNR)
> +- Turbo Encode in the DL (4G)
> +- Turbo Decode in the UL (4G)
> +- 16 VFs per PF (physical device)
> +- Maximum of 128 queues per VF
> +- PCIe Gen-3 x16 Interface
> +- MSI
> +- SR-IOV
> +
> +ACC100 5G/4G FEC PMD supports the following BBDEV capabilities:
> +
> +* For the LDPC encode operation:
> + - ``RTE_BBDEV_LDPC_CRC_24B_ATTACH`` : set to attach CRC24B to CB(s)
> + - ``RTE_BBDEV_LDPC_RATE_MATCH`` : if set then do not do Rate Match
> bypass
> + - ``RTE_BBDEV_LDPC_INTERLEAVER_BYPASS`` : if set then bypass
> +interleaver
> +
> +* For the LDPC decode operation:
> + - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK`` : check CRC24B from CB(s)
> + - ``RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE`` : disable early
> termination
> + - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP`` : drops CRC24B bits
> appended while decoding
> + - ``RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE`` : provides an input for
> HARQ combining
> > + - ``RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE`` : provides an output
> for HARQ combining
> + - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE`` : HARQ
> memory input is internal
> + - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE`` : HARQ
> memory output is internal
> + - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK`` :
> loopback data to/from HARQ memory
> + - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_FILLERS`` : HARQ
> memory includes the fillers bits
> + - ``RTE_BBDEV_LDPC_DEC_SCATTER_GATHER`` : supports scatter-gather
> for input/output data
> + - ``RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION`` : supports
> compression of the HARQ input/output
> + - ``RTE_BBDEV_LDPC_LLR_COMPRESSION`` : supports LLR input
> +compression
> +
> +* For the turbo encode operation:
> + - ``RTE_BBDEV_TURBO_CRC_24B_ATTACH`` : set to attach CRC24B to CB(s)
> + - ``RTE_BBDEV_TURBO_RATE_MATCH`` : if set then do not do Rate Match
> bypass
> + - ``RTE_BBDEV_TURBO_ENC_INTERRUPTS`` : set for encoder dequeue
> interrupts
> + - ``RTE_BBDEV_TURBO_RV_INDEX_BYPASS`` : set to bypass RV index
> + - ``RTE_BBDEV_TURBO_ENC_SCATTER_GATHER`` : supports scatter-
> gather
> +for input/output data
> +
> +* For the turbo decode operation:
> + - ``RTE_BBDEV_TURBO_CRC_TYPE_24B`` : check CRC24B from CB(s)
> + - ``RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE`` : perform subblock
> de-interleave
> + - ``RTE_BBDEV_TURBO_DEC_INTERRUPTS`` : set for decoder dequeue
> interrupts
> + - ``RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN`` : set if negative LLR encoder
> i/p is supported
> + - ``RTE_BBDEV_TURBO_POS_LLR_1_BIT_IN`` : set if positive LLR encoder
> i/p is supported
> + - ``RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP`` : keep CRC24B bits
> appended while decoding
> > + - ``RTE_BBDEV_TURBO_EARLY_TERMINATION`` : set the early
> termination feature
> + - ``RTE_BBDEV_TURBO_DEC_SCATTER_GATHER`` : supports scatter-
> gather for input/output data
> + - ``RTE_BBDEV_TURBO_HALF_ITERATION_EVEN`` : set half iteration
> +granularity
> +
> +Installation
> +------------
> +
> > +Section 3 of the DPDK manual provides instructions on installing and
> +compiling DPDK. The default set of bbdev compile flags may be found in
> +config/common_base, where for example the flag to build the ACC100
> +5G/4G FEC device, ``CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100``,
> +is already set.
> +
> +DPDK requires hugepages to be configured as detailed in section 2 of the
> DPDK manual.
> +The bbdev test application has been tested with a configuration 40 x
> +1GB hugepages. The hugepage configuration of a server may be examined
> using:
> +
> +.. code-block:: console
> +
> + grep Huge* /proc/meminfo
> +
> +
> +Initialization
> +--------------
> +
> +When the device first powers up, its PCI Physical Functions (PF) can be
> listed through this command:
> +
> +.. code-block:: console
> +
> + sudo lspci -vd8086:0d5c
> +
> +The physical and virtual functions are compatible with Linux UIO drivers:
> > +``vfio`` and ``igb_uio``. However, in order to work, the ACC100 5G/4G
> > +FEC device first needs to be bound to one of these Linux drivers through
> DPDK.
> +
> +
> +Bind PF UIO driver(s)
> +~~~~~~~~~~~~~~~~~~~~~
> +
> +Install the DPDK igb_uio driver, bind it with the PF PCI device ID and
> +use ``lspci`` to confirm the PF device is under use by ``igb_uio`` DPDK UIO
> driver.
> +
> +The igb_uio driver may be bound to the PF PCI device using one of three
> methods:
> +
> +
> +1. PCI functions (physical or virtual, depending on the use case) can
> +be bound to the UIO driver by repeating this command for every function.
> +
> +.. code-block:: console
> +
> + cd <dpdk-top-level-directory>
> + insmod ./build/kmod/igb_uio.ko
> + echo "8086 0d5c" > /sys/bus/pci/drivers/igb_uio/new_id
> + lspci -vd8086:0d5c
> +
> +
> +2. Another way to bind PF with DPDK UIO driver is by using the
> +``dpdk-devbind.py`` tool
> +
> +.. code-block:: console
> +
> + cd <dpdk-top-level-directory>
> + ./usertools/dpdk-devbind.py -b igb_uio 0000:06:00.0
> +
> +where the PCI device ID (example: 0000:06:00.0) is obtained using lspci
> +-vd8086:0d5c
> +
> +
> +3. A third way to bind is to use ``dpdk-setup.sh`` tool
> +
> +.. code-block:: console
> +
> + cd <dpdk-top-level-directory>
> + ./usertools/dpdk-setup.sh
> +
> + select 'Bind Ethernet/Crypto/Baseband device to IGB UIO module'
> + or
> + select 'Bind Ethernet/Crypto/Baseband device to VFIO module'
> + depending on driver required enter PCI device ID select 'Display
> + current Ethernet/Crypto/Baseband device settings' to confirm binding
> +
> +
> +In the same way the ACC100 5G/4G FEC PF can be bound with vfio, but
> +vfio driver does not support SR-IOV configuration right out of the box, so it
> will need to be patched.
> +
> +
> +Enable Virtual Functions
> +~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +Now, it should be visible in the printouts that PCI PF is under igb_uio
> +control "``Kernel driver in use: igb_uio``"
> +
> +To show the number of available VFs on the device, read ``sriov_totalvfs``
> > file.
> +
> +.. code-block:: console
> +
> + cat /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_totalvfs
> +
> + where 0000\:<b>\:<d>.<f> is the PCI device ID
> +
> +
> +To enable VFs via igb_uio, echo the number of virtual functions
> > +to be enabled to the ``max_vfs`` file.
> +
> +.. code-block:: console
> +
> + echo <num-of-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/max_vfs
> +
> +
> +Afterwards, all VFs must be bound to appropriate UIO drivers as
> +required, same way it was done with the physical function previously.
> +
> > +Enabling SR-IOV via the vfio driver works much the same way, except that
> +the file name is different:
> +
> +.. code-block:: console
> +
> + echo <num-of-vfs> >
> + /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_numvfs
> +
> +
> +Configure the VFs through PF
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> > +The PCI virtual functions must be configured before use or before being
> +assigned to VMs/Containers. The configuration involves allocating the
> +number of hardware queues, priorities, load balance, bandwidth and
> +other settings necessary for the device to perform FEC functions.
> +
> +This configuration needs to be executed at least once after reboot or
> +PCI FLR and can be achieved by using the function
> +``acc100_configure()``, which sets up the parameters defined in
> ``acc100_conf`` structure.
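For reference, the call shape is roughly as below. This is a hedged sketch: ``acc100_configure()`` and ``struct acc100_conf`` come from ``rte_acc100_cfg.h`` in this series, but the field names used here are illustrative only, not confirmed against the header.

    #include <rte_acc100_cfg.h>

    /* Configure the PF once after reboot or PCI FLR.
     * Illustrative only: real deployments fill in the full queue
     * topology; pf_mode_en is an assumed field name. */
    static int configure_acc100_pf(const char *dev_name)
    {
            struct acc100_conf conf = {0};

            conf.pf_mode_en = 1;
            /* ... queue group / priority / bandwidth fields ... */
            return acc100_configure(dev_name, &conf);
    }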
> +
> +Test Application
> +----------------
> +
> +BBDEV provides a test application, ``test-bbdev.py`` and range of test
> +data for testing the functionality of ACC100 5G/4G FEC encode and
> +decode, depending on the device's capabilities. The test application is
> +located under app->test-bbdev folder and has the following options:
> +
> +.. code-block:: console
> +
> + "-p", "--testapp-path": specifies path to the bbdev test app.
> + "-e", "--eal-params" : EAL arguments which are passed to the test app.
> + "-t", "--timeout" : Timeout in seconds (default=300).
> + "-c", "--test-cases" : Defines test cases to run. Run all if not specified.
> + "-v", "--test-vector" : Test vector path (default=dpdk_path+/app/test-
> bbdev/test_vectors/bbdev_null.data).
> + "-n", "--num-ops" : Number of operations to process on device
> (default=32).
> + "-b", "--burst-size" : Operations enqueue/dequeue burst size
> (default=32).
> + "-s", "--snr" : SNR in dB used when generating LLRs for bler tests.
> + "-s", "--iter_max" : Number of iterations for LDPC decoder.
> + "-l", "--num-lcores" : Number of lcores to run (default=16).
> + "-i", "--init-device" : Initialise PF device with default values.
> +
> +
> +To execute the test application tool using simple decode or encode
> +data, type one of the following:
> +
> +.. code-block:: console
> +
> + ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_dec_default.data
> + ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_enc_default.data
> +
> +
> > +The test application ``test-bbdev.py`` can configure the PF device with
> > +a default set of values, if the "-i" or "--init-device" option is
> > +included. The default values are defined in
> test_bbdev_perf.c.
> +
> +
> +Test Vectors
> +~~~~~~~~~~~~
> +
> +In addition to the simple LDPC decoder and LDPC encoder tests, bbdev
> +also provides a range of additional tests under the test_vectors
> +folder, which may be useful. The results of these tests will depend on
> +the ACC100 5G/4G FEC capabilities which may cause some testcases to be
> skipped, but no failure should be reported.
> diff --git a/doc/guides/bbdevs/index.rst b/doc/guides/bbdevs/index.rst
> index a8092dd..4445cbd 100644
> --- a/doc/guides/bbdevs/index.rst
> +++ b/doc/guides/bbdevs/index.rst
> @@ -13,3 +13,4 @@ Baseband Device Drivers
> turbo_sw
> fpga_lte_fec
> fpga_5gnr_fec
> + acc100
> diff --git a/doc/guides/rel_notes/release_20_11.rst
> b/doc/guides/rel_notes/release_20_11.rst
> index df227a1..b3ab614 100644
> --- a/doc/guides/rel_notes/release_20_11.rst
> +++ b/doc/guides/rel_notes/release_20_11.rst
> @@ -55,6 +55,12 @@ New Features
> Also, make sure to start the actual text at the margin.
> =======================================================
>
> +* **Added Intel ACC100 bbdev PMD.**
> +
> + Added a new ``acc100`` bbdev driver for the Intel\ |reg| ACC100
> + accelerator also known as Mount Bryce. See the
> + :doc:`../bbdevs/acc100` BBDEV guide for more details on this new driver.
> +
>
> Removed Items
> -------------
> diff --git a/drivers/baseband/Makefile b/drivers/baseband/Makefile index
> dcc0969..b640294 100644
> --- a/drivers/baseband/Makefile
> +++ b/drivers/baseband/Makefile
> @@ -10,6 +10,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_NULL) +=
> null DEPDIRS-null = $(core-libs)
> DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_TURBO_SW) += turbo_sw
> DEPDIRS-turbo_sw = $(core-libs)
> +DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100) += acc100
> +DEPDIRS-acc100 = $(core-libs)
> DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_LTE_FEC) += fpga_lte_fec
> DEPDIRS-fpga_lte_fec = $(core-libs)
> DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC) +=
> fpga_5gnr_fec diff --git a/drivers/baseband/acc100/Makefile
> b/drivers/baseband/acc100/Makefile
> new file mode 100644
> index 0000000..c79e487
> --- /dev/null
> +++ b/drivers/baseband/acc100/Makefile
> @@ -0,0 +1,25 @@
> +# SPDX-License-Identifier: BSD-3-Clause # Copyright(c) 2020 Intel
> +Corporation
> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +# library name
> +LIB = librte_pmd_bbdev_acc100.a
> +
> +# build flags
> +CFLAGS += -O3
> +CFLAGS += $(WERROR_FLAGS)
> +LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring -lrte_cfgfile
> +LDLIBS += -lrte_bbdev LDLIBS += -lrte_pci -lrte_bus_pci
> +
> +# versioning export map
> +EXPORT_MAP := rte_pmd_bbdev_acc100_version.map
> +
> +# library version
> +LIBABIVER := 1
> +
> +# library source files
> +SRCS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100) += rte_acc100_pmd.c
> +
> +include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/drivers/baseband/acc100/meson.build
> b/drivers/baseband/acc100/meson.build
> new file mode 100644
> index 0000000..8afafc2
> --- /dev/null
> +++ b/drivers/baseband/acc100/meson.build
> @@ -0,0 +1,6 @@
> +# SPDX-License-Identifier: BSD-3-Clause # Copyright(c) 2020 Intel
> +Corporation
> +
> +deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci']
> +
> +sources = files('rte_acc100_pmd.c')
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> b/drivers/baseband/acc100/rte_acc100_pmd.c
> new file mode 100644
> index 0000000..1b4cd13
> --- /dev/null
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> @@ -0,0 +1,175 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation
> + */
> +
> +#include <unistd.h>
> +
> +#include <rte_common.h>
> +#include <rte_log.h>
> +#include <rte_dev.h>
> +#include <rte_malloc.h>
> +#include <rte_mempool.h>
> +#include <rte_byteorder.h>
> +#include <rte_errno.h>
> +#include <rte_branch_prediction.h>
> +#include <rte_hexdump.h>
> +#include <rte_pci.h>
> +#include <rte_bus_pci.h>
> +
> +#include <rte_bbdev.h>
> +#include <rte_bbdev_pmd.h>
> +#include "rte_acc100_pmd.h"
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, DEBUG); #else
> +RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE); #endif
> +
> +/* Free 64MB memory used for software rings */ static int
> +acc100_dev_close(struct rte_bbdev *dev __rte_unused) {
> + return 0;
> +}
> +
> +static const struct rte_bbdev_ops acc100_bbdev_ops = {
> + .close = acc100_dev_close,
> +};
> +
> +/* ACC100 PCI PF address map */
> +static struct rte_pci_id pci_id_acc100_pf_map[] = {
> + {
> + RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID,
> RTE_ACC100_PF_DEVICE_ID)
> + },
> + {.device_id = 0},
> +};
> +
> +/* ACC100 PCI VF address map */
> +static struct rte_pci_id pci_id_acc100_vf_map[] = {
> + {
> + RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID,
> RTE_ACC100_VF_DEVICE_ID)
> + },
> + {.device_id = 0},
> +};
> +
> +/* Initialization Function */
> +static void
> +acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv) {
> + struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
> +
> + dev->dev_ops = &acc100_bbdev_ops;
> +
> + ((struct acc100_device *) dev->data->dev_private)->pf_device =
> + !strcmp(drv->driver.name,
> + RTE_STR(ACC100PF_DRIVER_NAME));
> + ((struct acc100_device *) dev->data->dev_private)->mmio_base =
> + pci_dev->mem_resource[0].addr;
> +
> + rte_bbdev_log_debug("Init device %s [%s] @ vaddr %p
> paddr %#"PRIx64"",
> + drv->driver.name, dev->data->name,
> + (void *)pci_dev->mem_resource[0].addr,
> + pci_dev->mem_resource[0].phys_addr);
> +}
> +
> +static int acc100_pci_probe(struct rte_pci_driver *pci_drv,
> + struct rte_pci_device *pci_dev)
> +{
> + struct rte_bbdev *bbdev = NULL;
> + char dev_name[RTE_BBDEV_NAME_MAX_LEN];
> +
> + if (pci_dev == NULL) {
> + rte_bbdev_log(ERR, "NULL PCI device");
> + return -EINVAL;
> + }
> +
> + rte_pci_device_name(&pci_dev->addr, dev_name,
> sizeof(dev_name));
> +
> + /* Allocate memory to be used privately by drivers */
> + bbdev = rte_bbdev_allocate(pci_dev->device.name);
> + if (bbdev == NULL)
> + return -ENODEV;
> +
> + /* allocate device private memory */
> + bbdev->data->dev_private = rte_zmalloc_socket(dev_name,
> + sizeof(struct acc100_device), RTE_CACHE_LINE_SIZE,
> + pci_dev->device.numa_node);
> +
> + if (bbdev->data->dev_private == NULL) {
> + rte_bbdev_log(CRIT,
> + "Allocate of %zu bytes for device \"%s\"
> failed",
> + sizeof(struct acc100_device), dev_name);
> + rte_bbdev_release(bbdev);
> + return -ENOMEM;
> + }
> +
> + /* Fill HW specific part of device structure */
> + bbdev->device = &pci_dev->device;
> + bbdev->intr_handle = &pci_dev->intr_handle;
> + bbdev->data->socket_id = pci_dev->device.numa_node;
> +
> + /* Invoke ACC100 device initialization function */
> + acc100_bbdev_init(bbdev, pci_drv);
> +
> + rte_bbdev_log_debug("Initialised bbdev %s (id = %u)",
> + dev_name, bbdev->data->dev_id);
> + return 0;
> +}
> +
> +static int acc100_pci_remove(struct rte_pci_device *pci_dev) {
> + struct rte_bbdev *bbdev;
> + int ret;
> + uint8_t dev_id;
> +
> + if (pci_dev == NULL)
> + return -EINVAL;
> +
> + /* Find device */
> + bbdev = rte_bbdev_get_named_dev(pci_dev->device.name);
> + if (bbdev == NULL) {
> + rte_bbdev_log(CRIT,
> + "Couldn't find HW dev \"%s\" to uninitialise
> it",
> + pci_dev->device.name);
> + return -ENODEV;
> + }
> + dev_id = bbdev->data->dev_id;
> +
> + /* free device private memory before close */
> + rte_free(bbdev->data->dev_private);
> +
> + /* Close device */
> + ret = rte_bbdev_close(dev_id);
> + if (ret < 0)
> + rte_bbdev_log(ERR,
> + "Device %i failed to close during uninit: %i",
> + dev_id, ret);
> +
> + /* release bbdev from library */
> + rte_bbdev_release(bbdev);
> +
> + rte_bbdev_log_debug("Destroyed bbdev = %u", dev_id);
> +
> + return 0;
> +}
> +
> +static struct rte_pci_driver acc100_pci_pf_driver = {
> + .probe = acc100_pci_probe,
> + .remove = acc100_pci_remove,
> + .id_table = pci_id_acc100_pf_map,
> + .drv_flags = RTE_PCI_DRV_NEED_MAPPING };
> +
> +static struct rte_pci_driver acc100_pci_vf_driver = {
> + .probe = acc100_pci_probe,
> + .remove = acc100_pci_remove,
> + .id_table = pci_id_acc100_vf_map,
> + .drv_flags = RTE_PCI_DRV_NEED_MAPPING };
> +
> +RTE_PMD_REGISTER_PCI(ACC100PF_DRIVER_NAME, acc100_pci_pf_driver);
> +RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME,
> pci_id_acc100_pf_map);
> +RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
> +RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME,
> pci_id_acc100_vf_map);
It seems both PF and VF share the same data for rte_pci_driver;
it looks strange to duplicate the code.
> +
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h
> b/drivers/baseband/acc100/rte_acc100_pmd.h
> new file mode 100644
> index 0000000..6f46df0
> --- /dev/null
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> @@ -0,0 +1,37 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation
> + */
> +
> +#ifndef _RTE_ACC100_PMD_H_
> +#define _RTE_ACC100_PMD_H_
> +
> +/* Helper macro for logging */
> +#define rte_bbdev_log(level, fmt, ...) \
> + rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \
> + ##__VA_ARGS__)
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +#define rte_bbdev_log_debug(fmt, ...) \
> + rte_bbdev_log(DEBUG, "acc100_pmd: " fmt, \
> + ##__VA_ARGS__)
> +#else
> +#define rte_bbdev_log_debug(fmt, ...)
> +#endif
> +
> +/* ACC100 PF and VF driver names */
> +#define ACC100PF_DRIVER_NAME intel_acc100_pf
> +#define ACC100VF_DRIVER_NAME intel_acc100_vf
> +
> +/* ACC100 PCI vendor & device IDs */
> +#define RTE_ACC100_VENDOR_ID (0x8086)
> +#define RTE_ACC100_PF_DEVICE_ID (0x0d5c)
> +#define RTE_ACC100_VF_DEVICE_ID (0x0d5d)
> +
> +/* Private data structure for each ACC100 device */ struct
> +acc100_device {
> + void *mmio_base; /**< Base address of MMIO registers (BAR0) */
> + bool pf_device; /**< True if this is a PF ACC100 device */
> + bool configured; /**< True if this ACC100 device is configured */ };
> +
> +#endif /* _RTE_ACC100_PMD_H_ */
> diff --git a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> new file mode 100644
> index 0000000..4a76d1d
> --- /dev/null
> +++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> @@ -0,0 +1,3 @@
> +DPDK_21 {
> + local: *;
> +};
> diff --git a/drivers/baseband/meson.build b/drivers/baseband/meson.build
> index 415b672..72301ce 100644
> --- a/drivers/baseband/meson.build
> +++ b/drivers/baseband/meson.build
> @@ -5,7 +5,7 @@ if is_windows
> subdir_done()
> endif
>
> -drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec']
> +drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec',
> +'acc100']
>
> config_flag_fmt = 'RTE_LIBRTE_PMD_BBDEV_@0@'
> driver_name_fmt = 'rte_pmd_bbdev_@0@'
> diff --git a/mk/rte.app.mk b/mk/rte.app.mk index a544259..a77f538 100644
> --- a/mk/rte.app.mk
> +++ b/mk/rte.app.mk
> @@ -254,6 +254,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_NETVSC_PMD) +=
> -lrte_pmd_netvsc
>
> ifeq ($(CONFIG_RTE_LIBRTE_BBDEV),y)
> _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_NULL) += -
> lrte_pmd_bbdev_null
> +_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100) += -
> lrte_pmd_bbdev_acc100
> _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_LTE_FEC) += -
> lrte_pmd_bbdev_fpga_lte_fec
> _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC) += -
> lrte_pmd_bbdev_fpga_5gnr_fec
>
> --
> 1.8.3.1
^ permalink raw reply [flat|nested] 213+ messages in thread
* Re: [dpdk-dev] [PATCH v3 02/11] baseband/acc100: add register definition file
2020-08-19 0:25 ` [dpdk-dev] [PATCH v3 02/11] baseband/acc100: add register definition file Nicolas Chautru
@ 2020-08-29 9:55 ` Xu, Rosen
2020-08-29 17:39 ` Chautru, Nicolas
0 siblings, 1 reply; 213+ messages in thread
From: Xu, Rosen @ 2020-08-29 9:55 UTC (permalink / raw)
To: Chautru, Nicolas, dev, akhil.goyal; +Cc: Richardson, Bruce, Chautru, Nicolas
Hi,
> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Nicolas Chautru
> Sent: Wednesday, August 19, 2020 8:25
> To: dev@dpdk.org; akhil.goyal@nxp.com
> Cc: Richardson, Bruce <bruce.richardson@intel.com>; Chautru, Nicolas
> <nicolas.chautru@intel.com>
> Subject: [dpdk-dev] [PATCH v3 02/11] baseband/acc100: add register
> definition file
>
> Add the list of registers for the device and related
> HW spec definitions.
>
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
> drivers/baseband/acc100/acc100_pf_enum.h | 1068
> ++++++++++++++++++++++++++++++
> drivers/baseband/acc100/acc100_vf_enum.h | 73 ++
> drivers/baseband/acc100/rte_acc100_pmd.h | 490 ++++++++++++++
> 3 files changed, 1631 insertions(+)
> create mode 100644 drivers/baseband/acc100/acc100_pf_enum.h
> create mode 100644 drivers/baseband/acc100/acc100_vf_enum.h
>
> diff --git a/drivers/baseband/acc100/acc100_pf_enum.h
> b/drivers/baseband/acc100/acc100_pf_enum.h
> new file mode 100644
> index 0000000..a1ee416
> --- /dev/null
> +++ b/drivers/baseband/acc100/acc100_pf_enum.h
> @@ -0,0 +1,1068 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2017 Intel Corporation
> + */
> +
> +#ifndef ACC100_PF_ENUM_H
> +#define ACC100_PF_ENUM_H
> +
> +/*
> + * ACC100 Register mapping on PF BAR0
> + * This is automatically generated from RDL, format may change with new
> RDL
> + * Release.
> + * Variable names are as is
> + */
> +enum {
> + HWPfQmgrEgressQueuesTemplate = 0x0007FE00,
> + HWPfQmgrIngressAq = 0x00080000,
> + HWPfQmgrArbQAvail = 0x00A00010,
> + HWPfQmgrArbQBlock = 0x00A00014,
> + HWPfQmgrAqueueDropNotifEn = 0x00A00024,
> + HWPfQmgrAqueueDisableNotifEn = 0x00A00028,
> + HWPfQmgrSoftReset = 0x00A00038,
> + HWPfQmgrInitStatus = 0x00A0003C,
> + HWPfQmgrAramWatchdogCount = 0x00A00040,
> + HWPfQmgrAramWatchdogCounterEn = 0x00A00044,
> + HWPfQmgrAxiWatchdogCount = 0x00A00048,
> + HWPfQmgrAxiWatchdogCounterEn = 0x00A0004C,
> + HWPfQmgrProcessWatchdogCount = 0x00A00050,
> + HWPfQmgrProcessWatchdogCounterEn = 0x00A00054,
> + HWPfQmgrProcessUl4GWatchdogCounter = 0x00A00058,
> + HWPfQmgrProcessDl4GWatchdogCounter = 0x00A0005C,
> + HWPfQmgrProcessUl5GWatchdogCounter = 0x00A00060,
> + HWPfQmgrProcessDl5GWatchdogCounter = 0x00A00064,
> + HWPfQmgrProcessMldWatchdogCounter = 0x00A00068,
> + HWPfQmgrMsiOverflowUpperVf = 0x00A00070,
> + HWPfQmgrMsiOverflowLowerVf = 0x00A00074,
> + HWPfQmgrMsiWatchdogOverflow = 0x00A00078,
> + HWPfQmgrMsiOverflowEnable = 0x00A0007C,
> + HWPfQmgrDebugAqPointerMemGrp = 0x00A00100,
> + HWPfQmgrDebugOutputArbQFifoGrp = 0x00A00140,
> + HWPfQmgrDebugMsiFifoGrp = 0x00A00180,
> + HWPfQmgrDebugAxiWdTimeoutMsiFifo = 0x00A001C0,
> + HWPfQmgrDebugProcessWdTimeoutMsiFifo = 0x00A001C4,
> + HWPfQmgrDepthLog2Grp = 0x00A00200,
> + HWPfQmgrTholdGrp = 0x00A00300,
> + HWPfQmgrGrpTmplateReg0Indx = 0x00A00600,
> + HWPfQmgrGrpTmplateReg1Indx = 0x00A00680,
> + HWPfQmgrGrpTmplateReg2indx = 0x00A00700,
> + HWPfQmgrGrpTmplateReg3Indx = 0x00A00780,
> + HWPfQmgrGrpTmplateReg4Indx = 0x00A00800,
> + HWPfQmgrVfBaseAddr = 0x00A01000,
> + HWPfQmgrUl4GWeightRrVf = 0x00A02000,
> + HWPfQmgrDl4GWeightRrVf = 0x00A02100,
> + HWPfQmgrUl5GWeightRrVf = 0x00A02200,
> + HWPfQmgrDl5GWeightRrVf = 0x00A02300,
> + HWPfQmgrMldWeightRrVf = 0x00A02400,
> + HWPfQmgrArbQDepthGrp = 0x00A02F00,
> + HWPfQmgrGrpFunction0 = 0x00A02F40,
> + HWPfQmgrGrpFunction1 = 0x00A02F44,
> + HWPfQmgrGrpPriority = 0x00A02F48,
> + HWPfQmgrWeightSync = 0x00A03000,
> + HWPfQmgrAqEnableVf = 0x00A10000,
> + HWPfQmgrAqResetVf = 0x00A20000,
> + HWPfQmgrRingSizeVf = 0x00A20004,
> + HWPfQmgrGrpDepthLog20Vf = 0x00A20008,
> + HWPfQmgrGrpDepthLog21Vf = 0x00A2000C,
> + HWPfQmgrGrpFunction0Vf = 0x00A20010,
> + HWPfQmgrGrpFunction1Vf = 0x00A20014,
> + HWPfDmaConfig0Reg = 0x00B80000,
> + HWPfDmaConfig1Reg = 0x00B80004,
> + HWPfDmaQmgrAddrReg = 0x00B80008,
> + HWPfDmaSoftResetReg = 0x00B8000C,
> + HWPfDmaAxcacheReg = 0x00B80010,
> + HWPfDmaVersionReg = 0x00B80014,
> + HWPfDmaFrameThreshold = 0x00B80018,
> + HWPfDmaTimestampLo = 0x00B8001C,
> + HWPfDmaTimestampHi = 0x00B80020,
> + HWPfDmaAxiStatus = 0x00B80028,
> + HWPfDmaAxiControl = 0x00B8002C,
> + HWPfDmaNoQmgr = 0x00B80030,
> + HWPfDmaQosScale = 0x00B80034,
> + HWPfDmaQmanen = 0x00B80040,
> + HWPfDmaQmgrQosBase = 0x00B80060,
> + HWPfDmaFecClkGatingEnable = 0x00B80080,
> + HWPfDmaPmEnable = 0x00B80084,
> + HWPfDmaQosEnable = 0x00B80088,
> + HWPfDmaHarqWeightedRrFrameThreshold = 0x00B800B0,
> + HWPfDmaDataSmallWeightedRrFrameThresh = 0x00B800B4,
> + HWPfDmaDataLargeWeightedRrFrameThresh = 0x00B800B8,
> + HWPfDmaInboundCbMaxSize = 0x00B800BC,
> + HWPfDmaInboundDrainDataSize = 0x00B800C0,
> + HWPfDmaVfDdrBaseRw = 0x00B80400,
> + HWPfDmaCmplTmOutCnt = 0x00B80800,
> + HWPfDmaProcTmOutCnt = 0x00B80804,
> + HWPfDmaStatusRrespBresp = 0x00B80810,
> + HWPfDmaCfgRrespBresp = 0x00B80814,
> + HWPfDmaStatusMemParErr = 0x00B80818,
> + HWPfDmaCfgMemParErrEn = 0x00B8081C,
> + HWPfDmaStatusDmaHwErr = 0x00B80820,
> + HWPfDmaCfgDmaHwErrEn = 0x00B80824,
> + HWPfDmaStatusFecCoreErr = 0x00B80828,
> + HWPfDmaCfgFecCoreErrEn = 0x00B8082C,
> + HWPfDmaStatusFcwDescrErr = 0x00B80830,
> + HWPfDmaCfgFcwDescrErrEn = 0x00B80834,
> + HWPfDmaStatusBlockTransmit = 0x00B80838,
> + HWPfDmaBlockOnErrEn = 0x00B8083C,
> + HWPfDmaStatusFlushDma = 0x00B80840,
> + HWPfDmaFlushDmaOnErrEn = 0x00B80844,
> + HWPfDmaStatusSdoneFifoFull = 0x00B80848,
> + HWPfDmaStatusDescriptorErrLoVf = 0x00B8084C,
> + HWPfDmaStatusDescriptorErrHiVf = 0x00B80850,
> + HWPfDmaStatusFcwErrLoVf = 0x00B80854,
> + HWPfDmaStatusFcwErrHiVf = 0x00B80858,
> + HWPfDmaStatusDataErrLoVf = 0x00B8085C,
> + HWPfDmaStatusDataErrHiVf = 0x00B80860,
> + HWPfDmaCfgMsiEnSoftwareErr = 0x00B80864,
> + HWPfDmaDescriptorSignatuture = 0x00B80868,
> + HWPfDmaFcwSignature = 0x00B8086C,
> + HWPfDmaErrorDetectionEn = 0x00B80870,
> + HWPfDmaErrCntrlFifoDebug = 0x00B8087C,
> + HWPfDmaStatusToutData = 0x00B80880,
> + HWPfDmaStatusToutDesc = 0x00B80884,
> + HWPfDmaStatusToutUnexpData = 0x00B80888,
> + HWPfDmaStatusToutUnexpDesc = 0x00B8088C,
> + HWPfDmaStatusToutProcess = 0x00B80890,
> + HWPfDmaConfigCtoutOutDataEn = 0x00B808A0,
> + HWPfDmaConfigCtoutOutDescrEn = 0x00B808A4,
> + HWPfDmaConfigUnexpComplDataEn = 0x00B808A8,
> + HWPfDmaConfigUnexpComplDescrEn = 0x00B808AC,
> + HWPfDmaConfigPtoutOutEn = 0x00B808B0,
> + HWPfDmaFec5GulDescBaseLoRegVf = 0x00B88020,
> + HWPfDmaFec5GulDescBaseHiRegVf = 0x00B88024,
> + HWPfDmaFec5GulRespPtrLoRegVf = 0x00B88028,
> + HWPfDmaFec5GulRespPtrHiRegVf = 0x00B8802C,
> + HWPfDmaFec5GdlDescBaseLoRegVf = 0x00B88040,
> + HWPfDmaFec5GdlDescBaseHiRegVf = 0x00B88044,
> + HWPfDmaFec5GdlRespPtrLoRegVf = 0x00B88048,
> + HWPfDmaFec5GdlRespPtrHiRegVf = 0x00B8804C,
> + HWPfDmaFec4GulDescBaseLoRegVf = 0x00B88060,
> + HWPfDmaFec4GulDescBaseHiRegVf = 0x00B88064,
> + HWPfDmaFec4GulRespPtrLoRegVf = 0x00B88068,
> + HWPfDmaFec4GulRespPtrHiRegVf = 0x00B8806C,
> + HWPfDmaFec4GdlDescBaseLoRegVf = 0x00B88080,
> + HWPfDmaFec4GdlDescBaseHiRegVf = 0x00B88084,
> + HWPfDmaFec4GdlRespPtrLoRegVf = 0x00B88088,
> + HWPfDmaFec4GdlRespPtrHiRegVf = 0x00B8808C,
> + HWPfDmaVfDdrBaseRangeRo = 0x00B880A0,
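> +	/* QoS monitor A (HWPfQosmonA*) registers */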
> + HWPfQosmonACntrlReg = 0x00B90000,
> + HWPfQosmonAEvalOverflow0 = 0x00B90008,
> + HWPfQosmonAEvalOverflow1 = 0x00B9000C,
> + HWPfQosmonADivTerm = 0x00B90010,
> + HWPfQosmonATickTerm = 0x00B90014,
> + HWPfQosmonAEvalTerm = 0x00B90018,
> + HWPfQosmonAAveTerm = 0x00B9001C,
> + HWPfQosmonAForceEccErr = 0x00B90020,
> + HWPfQosmonAEccErrDetect = 0x00B90024,
> + HWPfQosmonAIterationConfig0Low = 0x00B90060,
> + HWPfQosmonAIterationConfig0High = 0x00B90064,
> + HWPfQosmonAIterationConfig1Low = 0x00B90068,
> + HWPfQosmonAIterationConfig1High = 0x00B9006C,
> + HWPfQosmonAIterationConfig2Low = 0x00B90070,
> + HWPfQosmonAIterationConfig2High = 0x00B90074,
> + HWPfQosmonAIterationConfig3Low = 0x00B90078,
> + HWPfQosmonAIterationConfig3High = 0x00B9007C,
> + HWPfQosmonAEvalMemAddr = 0x00B90080,
> + HWPfQosmonAEvalMemData = 0x00B90084,
> + HWPfQosmonAXaction = 0x00B900C0,
> + HWPfQosmonARemThres1Vf = 0x00B90400,
> + HWPfQosmonAThres2Vf = 0x00B90404,
> + HWPfQosmonAWeiFracVf = 0x00B90408,
> + HWPfQosmonARrWeiVf = 0x00B9040C,
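> +	/* Performance monitor A (HWPfPermonA*) registers */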
> + HWPfPermonACntrlRegVf = 0x00B98000,
> + HWPfPermonACountVf = 0x00B98008,
> + HWPfPermonAKCntLoVf = 0x00B98010,
> + HWPfPermonAKCntHiVf = 0x00B98014,
> + HWPfPermonADeltaCntLoVf = 0x00B98020,
> + HWPfPermonADeltaCntHiVf = 0x00B98024,
> + HWPfPermonAVersionReg = 0x00B9C000,
> + HWPfPermonACbControlFec = 0x00B9C0F0,
> + HWPfPermonADltTimerLoFec = 0x00B9C0F4,
> + HWPfPermonADltTimerHiFec = 0x00B9C0F8,
> + HWPfPermonACbCountFec = 0x00B9C100,
> + HWPfPermonAAccExecTimerLoFec = 0x00B9C104,
> + HWPfPermonAAccExecTimerHiFec = 0x00B9C108,
> + HWPfPermonAExecTimerMinFec = 0x00B9C200,
> + HWPfPermonAExecTimerMaxFec = 0x00B9C204,
> + HWPfPermonAControlBusMon = 0x00B9C400,
> + HWPfPermonAConfigBusMon = 0x00B9C404,
> + HWPfPermonASkipCountBusMon = 0x00B9C408,
> + HWPfPermonAMinLatBusMon = 0x00B9C40C,
> + HWPfPermonAMaxLatBusMon = 0x00B9C500,
> + HWPfPermonATotalLatLowBusMon = 0x00B9C504,
> + HWPfPermonATotalLatUpperBusMon = 0x00B9C508,
> + HWPfPermonATotalReqCntBusMon = 0x00B9C50C,
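> +	/* QoS monitor B (HWPfQosmonB*) registers */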
> + HWPfQosmonBCntrlReg = 0x00BA0000,
> + HWPfQosmonBEvalOverflow0 = 0x00BA0008,
> + HWPfQosmonBEvalOverflow1 = 0x00BA000C,
> + HWPfQosmonBDivTerm = 0x00BA0010,
> + HWPfQosmonBTickTerm = 0x00BA0014,
> + HWPfQosmonBEvalTerm = 0x00BA0018,
> + HWPfQosmonBAveTerm = 0x00BA001C,
> + HWPfQosmonBForceEccErr = 0x00BA0020,
> + HWPfQosmonBEccErrDetect = 0x00BA0024,
> + HWPfQosmonBIterationConfig0Low = 0x00BA0060,
> + HWPfQosmonBIterationConfig0High = 0x00BA0064,
> + HWPfQosmonBIterationConfig1Low = 0x00BA0068,
> + HWPfQosmonBIterationConfig1High = 0x00BA006C,
> + HWPfQosmonBIterationConfig2Low = 0x00BA0070,
> + HWPfQosmonBIterationConfig2High = 0x00BA0074,
> + HWPfQosmonBIterationConfig3Low = 0x00BA0078,
> + HWPfQosmonBIterationConfig3High = 0x00BA007C,
> + HWPfQosmonBEvalMemAddr = 0x00BA0080,
> + HWPfQosmonBEvalMemData = 0x00BA0084,
> + HWPfQosmonBXaction = 0x00BA00C0,
> + HWPfQosmonBRemThres1Vf = 0x00BA0400,
> + HWPfQosmonBThres2Vf = 0x00BA0404,
> + HWPfQosmonBWeiFracVf = 0x00BA0408,
> + HWPfQosmonBRrWeiVf = 0x00BA040C,
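> +	/* Performance monitor B (HWPfPermonB*) registers */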
> + HWPfPermonBCntrlRegVf = 0x00BA8000,
> + HWPfPermonBCountVf = 0x00BA8008,
> + HWPfPermonBKCntLoVf = 0x00BA8010,
> + HWPfPermonBKCntHiVf = 0x00BA8014,
> + HWPfPermonBDeltaCntLoVf = 0x00BA8020,
> + HWPfPermonBDeltaCntHiVf = 0x00BA8024,
> + HWPfPermonBVersionReg = 0x00BAC000,
> + HWPfPermonBCbControlFec = 0x00BAC0F0,
> + HWPfPermonBDltTimerLoFec = 0x00BAC0F4,
> + HWPfPermonBDltTimerHiFec = 0x00BAC0F8,
> + HWPfPermonBCbCountFec = 0x00BAC100,
> + HWPfPermonBAccExecTimerLoFec = 0x00BAC104,
> + HWPfPermonBAccExecTimerHiFec = 0x00BAC108,
> + HWPfPermonBExecTimerMinFec = 0x00BAC200,
> + HWPfPermonBExecTimerMaxFec = 0x00BAC204,
> + HWPfPermonBControlBusMon = 0x00BAC400,
> + HWPfPermonBConfigBusMon = 0x00BAC404,
> + HWPfPermonBSkipCountBusMon = 0x00BAC408,
> + HWPfPermonBMinLatBusMon = 0x00BAC40C,
> + HWPfPermonBMaxLatBusMon = 0x00BAC500,
> + HWPfPermonBTotalLatLowBusMon = 0x00BAC504,
> + HWPfPermonBTotalLatUpperBusMon = 0x00BAC508,
> + HWPfPermonBTotalReqCntBusMon = 0x00BAC50C,
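> +	/* 5G uplink FEC engine registers, one 0x1000 block per engine */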
> + HWPfFecUl5gCntrlReg = 0x00BC0000,
> + HWPfFecUl5gI2MThreshReg = 0x00BC0004,
> + HWPfFecUl5gVersionReg = 0x00BC0100,
> + HWPfFecUl5gFcwStatusReg = 0x00BC0104,
> + HWPfFecUl5gWarnReg = 0x00BC0108,
> + HwPfFecUl5gIbDebugReg = 0x00BC0200,
> + HwPfFecUl5gObLlrDebugReg = 0x00BC0204,
> + HwPfFecUl5gObHarqDebugReg = 0x00BC0208,
> + HwPfFecUl5g1CntrlReg = 0x00BC1000,
> + HwPfFecUl5g1I2MThreshReg = 0x00BC1004,
> + HwPfFecUl5g1VersionReg = 0x00BC1100,
> + HwPfFecUl5g1FcwStatusReg = 0x00BC1104,
> + HwPfFecUl5g1WarnReg = 0x00BC1108,
> + HwPfFecUl5g1IbDebugReg = 0x00BC1200,
> + HwPfFecUl5g1ObLlrDebugReg = 0x00BC1204,
> + HwPfFecUl5g1ObHarqDebugReg = 0x00BC1208,
> + HwPfFecUl5g2CntrlReg = 0x00BC2000,
> + HwPfFecUl5g2I2MThreshReg = 0x00BC2004,
> + HwPfFecUl5g2VersionReg = 0x00BC2100,
> + HwPfFecUl5g2FcwStatusReg = 0x00BC2104,
> + HwPfFecUl5g2WarnReg = 0x00BC2108,
> + HwPfFecUl5g2IbDebugReg = 0x00BC2200,
> + HwPfFecUl5g2ObLlrDebugReg = 0x00BC2204,
> + HwPfFecUl5g2ObHarqDebugReg = 0x00BC2208,
> + HwPfFecUl5g3CntrlReg = 0x00BC3000,
> + HwPfFecUl5g3I2MThreshReg = 0x00BC3004,
> + HwPfFecUl5g3VersionReg = 0x00BC3100,
> + HwPfFecUl5g3FcwStatusReg = 0x00BC3104,
> + HwPfFecUl5g3WarnReg = 0x00BC3108,
> + HwPfFecUl5g3IbDebugReg = 0x00BC3200,
> + HwPfFecUl5g3ObLlrDebugReg = 0x00BC3204,
> + HwPfFecUl5g3ObHarqDebugReg = 0x00BC3208,
> + HwPfFecUl5g4CntrlReg = 0x00BC4000,
> + HwPfFecUl5g4I2MThreshReg = 0x00BC4004,
> + HwPfFecUl5g4VersionReg = 0x00BC4100,
> + HwPfFecUl5g4FcwStatusReg = 0x00BC4104,
> + HwPfFecUl5g4WarnReg = 0x00BC4108,
> + HwPfFecUl5g4IbDebugReg = 0x00BC4200,
> + HwPfFecUl5g4ObLlrDebugReg = 0x00BC4204,
> + HwPfFecUl5g4ObHarqDebugReg = 0x00BC4208,
> + HwPfFecUl5g5CntrlReg = 0x00BC5000,
> + HwPfFecUl5g5I2MThreshReg = 0x00BC5004,
> + HwPfFecUl5g5VersionReg = 0x00BC5100,
> + HwPfFecUl5g5FcwStatusReg = 0x00BC5104,
> + HwPfFecUl5g5WarnReg = 0x00BC5108,
> + HwPfFecUl5g5IbDebugReg = 0x00BC5200,
> + HwPfFecUl5g5ObLlrDebugReg = 0x00BC5204,
> + HwPfFecUl5g5ObHarqDebugReg = 0x00BC5208,
> + HwPfFecUl5g6CntrlReg = 0x00BC6000,
> + HwPfFecUl5g6I2MThreshReg = 0x00BC6004,
> + HwPfFecUl5g6VersionReg = 0x00BC6100,
> + HwPfFecUl5g6FcwStatusReg = 0x00BC6104,
> + HwPfFecUl5g6WarnReg = 0x00BC6108,
> + HwPfFecUl5g6IbDebugReg = 0x00BC6200,
> + HwPfFecUl5g6ObLlrDebugReg = 0x00BC6204,
> + HwPfFecUl5g6ObHarqDebugReg = 0x00BC6208,
> + HwPfFecUl5g7CntrlReg = 0x00BC7000,
> + HwPfFecUl5g7I2MThreshReg = 0x00BC7004,
> + HwPfFecUl5g7VersionReg = 0x00BC7100,
> + HwPfFecUl5g7FcwStatusReg = 0x00BC7104,
> + HwPfFecUl5g7WarnReg = 0x00BC7108,
> + HwPfFecUl5g7IbDebugReg = 0x00BC7200,
> + HwPfFecUl5g7ObLlrDebugReg = 0x00BC7204,
> + HwPfFecUl5g7ObHarqDebugReg = 0x00BC7208,
> + HwPfFecUl5g8CntrlReg = 0x00BC8000,
> + HwPfFecUl5g8I2MThreshReg = 0x00BC8004,
> + HwPfFecUl5g8VersionReg = 0x00BC8100,
> + HwPfFecUl5g8FcwStatusReg = 0x00BC8104,
> + HwPfFecUl5g8WarnReg = 0x00BC8108,
> + HwPfFecUl5g8IbDebugReg = 0x00BC8200,
> + HwPfFecUl5g8ObLlrDebugReg = 0x00BC8204,
> + HwPfFecUl5g8ObHarqDebugReg = 0x00BC8208,
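> +	/* 5G downlink FEC engine (HWPfFecDl5g*) registers */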
> + HWPfFecDl5gCntrlReg = 0x00BCF000,
> + HWPfFecDl5gI2MThreshReg = 0x00BCF004,
> + HWPfFecDl5gVersionReg = 0x00BCF100,
> + HWPfFecDl5gFcwStatusReg = 0x00BCF104,
> + HWPfFecDl5gWarnReg = 0x00BCF108,
> + HWPfFecUlVersionReg = 0x00BD0000,
> + HWPfFecUlControlReg = 0x00BD0004,
> + HWPfFecUlStatusReg = 0x00BD0008,
> + HWPfFecDlVersionReg = 0x00BDF000,
> + HWPfFecDlClusterConfigReg = 0x00BDF004,
> + HWPfFecDlBurstThres = 0x00BDF00C,
> + HWPfFecDlClusterStatusReg0 = 0x00BDF040,
> + HWPfFecDlClusterStatusReg1 = 0x00BDF044,
> + HWPfFecDlClusterStatusReg2 = 0x00BDF048,
> + HWPfFecDlClusterStatusReg3 = 0x00BDF04C,
> + HWPfFecDlClusterStatusReg4 = 0x00BDF050,
> + HWPfFecDlClusterStatusReg5 = 0x00BDF054,
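> +	/* HWPfCha*: per-domain PLLs, error status and pad configuration */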
> + HWPfChaFabPllPllrst = 0x00C40000,
> + HWPfChaFabPllClk0 = 0x00C40004,
> + HWPfChaFabPllClk1 = 0x00C40008,
> + HWPfChaFabPllBwadj = 0x00C4000C,
> + HWPfChaFabPllLbw = 0x00C40010,
> + HWPfChaFabPllResetq = 0x00C40014,
> + HWPfChaFabPllPhshft0 = 0x00C40018,
> + HWPfChaFabPllPhshft1 = 0x00C4001C,
> + HWPfChaFabPllDivq0 = 0x00C40020,
> + HWPfChaFabPllDivq1 = 0x00C40024,
> + HWPfChaFabPllDivq2 = 0x00C40028,
> + HWPfChaFabPllDivq3 = 0x00C4002C,
> + HWPfChaFabPllDivq4 = 0x00C40030,
> + HWPfChaFabPllDivq5 = 0x00C40034,
> + HWPfChaFabPllDivq6 = 0x00C40038,
> + HWPfChaFabPllDivq7 = 0x00C4003C,
> + HWPfChaDl5gPllPllrst = 0x00C40080,
> + HWPfChaDl5gPllClk0 = 0x00C40084,
> + HWPfChaDl5gPllClk1 = 0x00C40088,
> + HWPfChaDl5gPllBwadj = 0x00C4008C,
> + HWPfChaDl5gPllLbw = 0x00C40090,
> + HWPfChaDl5gPllResetq = 0x00C40094,
> + HWPfChaDl5gPllPhshft0 = 0x00C40098,
> + HWPfChaDl5gPllPhshft1 = 0x00C4009C,
> + HWPfChaDl5gPllDivq0 = 0x00C400A0,
> + HWPfChaDl5gPllDivq1 = 0x00C400A4,
> + HWPfChaDl5gPllDivq2 = 0x00C400A8,
> + HWPfChaDl5gPllDivq3 = 0x00C400AC,
> + HWPfChaDl5gPllDivq4 = 0x00C400B0,
> + HWPfChaDl5gPllDivq5 = 0x00C400B4,
> + HWPfChaDl5gPllDivq6 = 0x00C400B8,
> + HWPfChaDl5gPllDivq7 = 0x00C400BC,
> + HWPfChaDl4gPllPllrst = 0x00C40100,
> + HWPfChaDl4gPllClk0 = 0x00C40104,
> + HWPfChaDl4gPllClk1 = 0x00C40108,
> + HWPfChaDl4gPllBwadj = 0x00C4010C,
> + HWPfChaDl4gPllLbw = 0x00C40110,
> + HWPfChaDl4gPllResetq = 0x00C40114,
> + HWPfChaDl4gPllPhshft0 = 0x00C40118,
> + HWPfChaDl4gPllPhshft1 = 0x00C4011C,
> + HWPfChaDl4gPllDivq0 = 0x00C40120,
> + HWPfChaDl4gPllDivq1 = 0x00C40124,
> + HWPfChaDl4gPllDivq2 = 0x00C40128,
> + HWPfChaDl4gPllDivq3 = 0x00C4012C,
> + HWPfChaDl4gPllDivq4 = 0x00C40130,
> + HWPfChaDl4gPllDivq5 = 0x00C40134,
> + HWPfChaDl4gPllDivq6 = 0x00C40138,
> + HWPfChaDl4gPllDivq7 = 0x00C4013C,
> + HWPfChaUl5gPllPllrst = 0x00C40180,
> + HWPfChaUl5gPllClk0 = 0x00C40184,
> + HWPfChaUl5gPllClk1 = 0x00C40188,
> + HWPfChaUl5gPllBwadj = 0x00C4018C,
> + HWPfChaUl5gPllLbw = 0x00C40190,
> + HWPfChaUl5gPllResetq = 0x00C40194,
> + HWPfChaUl5gPllPhshft0 = 0x00C40198,
> + HWPfChaUl5gPllPhshft1 = 0x00C4019C,
> + HWPfChaUl5gPllDivq0 = 0x00C401A0,
> + HWPfChaUl5gPllDivq1 = 0x00C401A4,
> + HWPfChaUl5gPllDivq2 = 0x00C401A8,
> + HWPfChaUl5gPllDivq3 = 0x00C401AC,
> + HWPfChaUl5gPllDivq4 = 0x00C401B0,
> + HWPfChaUl5gPllDivq5 = 0x00C401B4,
> + HWPfChaUl5gPllDivq6 = 0x00C401B8,
> + HWPfChaUl5gPllDivq7 = 0x00C401BC,
> + HWPfChaUl4gPllPllrst = 0x00C40200,
> + HWPfChaUl4gPllClk0 = 0x00C40204,
> + HWPfChaUl4gPllClk1 = 0x00C40208,
> + HWPfChaUl4gPllBwadj = 0x00C4020C,
> + HWPfChaUl4gPllLbw = 0x00C40210,
> + HWPfChaUl4gPllResetq = 0x00C40214,
> + HWPfChaUl4gPllPhshft0 = 0x00C40218,
> + HWPfChaUl4gPllPhshft1 = 0x00C4021C,
> + HWPfChaUl4gPllDivq0 = 0x00C40220,
> + HWPfChaUl4gPllDivq1 = 0x00C40224,
> + HWPfChaUl4gPllDivq2 = 0x00C40228,
> + HWPfChaUl4gPllDivq3 = 0x00C4022C,
> + HWPfChaUl4gPllDivq4 = 0x00C40230,
> + HWPfChaUl4gPllDivq5 = 0x00C40234,
> + HWPfChaUl4gPllDivq6 = 0x00C40238,
> + HWPfChaUl4gPllDivq7 = 0x00C4023C,
> + HWPfChaDdrPllPllrst = 0x00C40280,
> + HWPfChaDdrPllClk0 = 0x00C40284,
> + HWPfChaDdrPllClk1 = 0x00C40288,
> + HWPfChaDdrPllBwadj = 0x00C4028C,
> + HWPfChaDdrPllLbw = 0x00C40290,
> + HWPfChaDdrPllResetq = 0x00C40294,
> + HWPfChaDdrPllPhshft0 = 0x00C40298,
> + HWPfChaDdrPllPhshft1 = 0x00C4029C,
> + HWPfChaDdrPllDivq0 = 0x00C402A0,
> + HWPfChaDdrPllDivq1 = 0x00C402A4,
> + HWPfChaDdrPllDivq2 = 0x00C402A8,
> + HWPfChaDdrPllDivq3 = 0x00C402AC,
> + HWPfChaDdrPllDivq4 = 0x00C402B0,
> + HWPfChaDdrPllDivq5 = 0x00C402B4,
> + HWPfChaDdrPllDivq6 = 0x00C402B8,
> + HWPfChaDdrPllDivq7 = 0x00C402BC,
> + HWPfChaErrStatus = 0x00C40400,
> + HWPfChaErrMask = 0x00C40404,
> + HWPfChaDebugPcieMsiFifo = 0x00C40410,
> + HWPfChaDebugDdrMsiFifo = 0x00C40414,
> + HWPfChaDebugMiscMsiFifo = 0x00C40418,
> + HWPfChaPwmSet = 0x00C40420,
> + HWPfChaDdrRstStatus = 0x00C40430,
> + HWPfChaDdrStDoneStatus = 0x00C40434,
> + HWPfChaDdrWbRstCfg = 0x00C40438,
> + HWPfChaDdrApbRstCfg = 0x00C4043C,
> + HWPfChaDdrPhyRstCfg = 0x00C40440,
> + HWPfChaDdrCpuRstCfg = 0x00C40444,
> + HWPfChaDdrSifRstCfg = 0x00C40448,
> + HWPfChaPadcfgPcomp0 = 0x00C41000,
> + HWPfChaPadcfgNcomp0 = 0x00C41004,
> + HWPfChaPadcfgOdt0 = 0x00C41008,
> + HWPfChaPadcfgProtect0 = 0x00C4100C,
> + HWPfChaPreemphasisProtect0 = 0x00C41010,
> + HWPfChaPreemphasisCompen0 = 0x00C41040,
> + HWPfChaPreemphasisOdten0 = 0x00C41044,
> + HWPfChaPadcfgPcomp1 = 0x00C41100,
> + HWPfChaPadcfgNcomp1 = 0x00C41104,
> + HWPfChaPadcfgOdt1 = 0x00C41108,
> + HWPfChaPadcfgProtect1 = 0x00C4110C,
> + HWPfChaPreemphasisProtect1 = 0x00C41110,
> + HWPfChaPreemphasisCompen1 = 0x00C41140,
> + HWPfChaPreemphasisOdten1 = 0x00C41144,
> + HWPfChaPadcfgPcomp2 = 0x00C41200,
> + HWPfChaPadcfgNcomp2 = 0x00C41204,
> + HWPfChaPadcfgOdt2 = 0x00C41208,
> + HWPfChaPadcfgProtect2 = 0x00C4120C,
> + HWPfChaPreemphasisProtect2 = 0x00C41210,
> + HWPfChaPreemphasisCompen2 = 0x00C41240,
> + HWPfChaPreemphasisOdten2 = 0x00C41244,
> + HWPfChaPadcfgPcomp3 = 0x00C41300,
> + HWPfChaPadcfgNcomp3 = 0x00C41304,
> + HWPfChaPadcfgOdt3 = 0x00C41308,
> + HWPfChaPadcfgProtect3 = 0x00C4130C,
> + HWPfChaPreemphasisProtect3 = 0x00C41310,
> + HWPfChaPreemphasisCompen3 = 0x00C41340,
> + HWPfChaPreemphasisOdten3 = 0x00C41344,
> + HWPfChaPadcfgPcomp4 = 0x00C41400,
> + HWPfChaPadcfgNcomp4 = 0x00C41404,
> + HWPfChaPadcfgOdt4 = 0x00C41408,
> + HWPfChaPadcfgProtect4 = 0x00C4140C,
> + HWPfChaPreemphasisProtect4 = 0x00C41410,
> + HWPfChaPreemphasisCompen4 = 0x00C41440,
> + HWPfChaPreemphasisOdten4 = 0x00C41444,
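> +	/* Host interface (HWPfHi*): doorbells, info rings, MSI-X mapping and reset */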
> + HWPfHiVfToPfDbellVf = 0x00C80000,
> + HWPfHiPfToVfDbellVf = 0x00C80008,
> + HWPfHiInfoRingBaseLoVf = 0x00C80010,
> + HWPfHiInfoRingBaseHiVf = 0x00C80014,
> + HWPfHiInfoRingPointerVf = 0x00C80018,
> + HWPfHiInfoRingIntWrEnVf = 0x00C80020,
> + HWPfHiInfoRingPf2VfWrEnVf = 0x00C80024,
> + HWPfHiMsixVectorMapperVf = 0x00C80060,
> + HWPfHiModuleVersionReg = 0x00C84000,
> + HWPfHiIosf2axiErrLogReg = 0x00C84004,
> + HWPfHiHardResetReg = 0x00C84008,
> + HWPfHi5GHardResetReg = 0x00C8400C,
> + HWPfHiInfoRingBaseLoRegPf = 0x00C84010,
> + HWPfHiInfoRingBaseHiRegPf = 0x00C84014,
> + HWPfHiInfoRingPointerRegPf = 0x00C84018,
> + HWPfHiInfoRingIntWrEnRegPf = 0x00C84020,
> + HWPfHiInfoRingVf2pfLoWrEnReg = 0x00C84024,
> + HWPfHiInfoRingVf2pfHiWrEnReg = 0x00C84028,
> + HWPfHiLogParityErrStatusReg = 0x00C8402C,
> + HWPfHiLogDataParityErrorVfStatusLo = 0x00C84030,
> + HWPfHiLogDataParityErrorVfStatusHi = 0x00C84034,
> + HWPfHiBlockTransmitOnErrorEn = 0x00C84038,
> + HWPfHiCfgMsiIntWrEnRegPf = 0x00C84040,
> + HWPfHiCfgMsiVf2pfLoWrEnReg = 0x00C84044,
> + HWPfHiCfgMsiVf2pfHighWrEnReg = 0x00C84048,
> + HWPfHiMsixVectorMapperPf = 0x00C84060,
> + HWPfHiApbWrWaitTime = 0x00C84100,
> + HWPfHiXCounterMaxValue = 0x00C84104,
> + HWPfHiPfMode = 0x00C84108,
> + HWPfHiClkGateHystReg = 0x00C8410C,
> + HWPfHiSnoopBitsReg = 0x00C84110,
> + HWPfHiMsiDropEnableReg = 0x00C84114,
> + HWPfHiMsiStatReg = 0x00C84120,
> + HWPfHiFifoOflStatReg = 0x00C84124,
> + HWPfHiHiDebugReg = 0x00C841F4,
> + HWPfHiDebugMemSnoopMsiFifo = 0x00C841F8,
> + HWPfHiDebugMemSnoopInputFifo = 0x00C841FC,
> + HWPfHiMsixMappingConfig = 0x00C84200,
> + HWPfHiJunkReg = 0x00C8FF00,
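> +	/* DDR memory controller (HWPfDdrUmmc, HWPfDdrMpc, HWPfDdrBc, HWPfDdrDfi) registers */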
> + HWPfDdrUmmcVer = 0x00D00000,
> + HWPfDdrUmmcCap = 0x00D00010,
> + HWPfDdrUmmcCtrl = 0x00D00020,
> + HWPfDdrMpcPe = 0x00D00080,
> + HWPfDdrMpcPpri3 = 0x00D00090,
> + HWPfDdrMpcPpri2 = 0x00D000A0,
> + HWPfDdrMpcPpri1 = 0x00D000B0,
> + HWPfDdrMpcPpri0 = 0x00D000C0,
> + HWPfDdrMpcPrwgrpCtrl = 0x00D000D0,
> + HWPfDdrMpcPbw7 = 0x00D000E0,
> + HWPfDdrMpcPbw6 = 0x00D000F0,
> + HWPfDdrMpcPbw5 = 0x00D00100,
> + HWPfDdrMpcPbw4 = 0x00D00110,
> + HWPfDdrMpcPbw3 = 0x00D00120,
> + HWPfDdrMpcPbw2 = 0x00D00130,
> + HWPfDdrMpcPbw1 = 0x00D00140,
> + HWPfDdrMpcPbw0 = 0x00D00150,
> + HWPfDdrMemoryInit = 0x00D00200,
> + HWPfDdrMemoryInitDone = 0x00D00210,
> + HWPfDdrMemInitPhyTrng0 = 0x00D00240,
> + HWPfDdrMemInitPhyTrng1 = 0x00D00250,
> + HWPfDdrMemInitPhyTrng2 = 0x00D00260,
> + HWPfDdrMemInitPhyTrng3 = 0x00D00270,
> + HWPfDdrBcDram = 0x00D003C0,
> + HWPfDdrBcAddrMap = 0x00D003D0,
> + HWPfDdrBcRef = 0x00D003E0,
> + HWPfDdrBcTim0 = 0x00D00400,
> + HWPfDdrBcTim1 = 0x00D00410,
> + HWPfDdrBcTim2 = 0x00D00420,
> + HWPfDdrBcTim3 = 0x00D00430,
> + HWPfDdrBcTim4 = 0x00D00440,
> + HWPfDdrBcTim5 = 0x00D00450,
> + HWPfDdrBcTim6 = 0x00D00460,
> + HWPfDdrBcTim7 = 0x00D00470,
> + HWPfDdrBcTim8 = 0x00D00480,
> + HWPfDdrBcTim9 = 0x00D00490,
> + HWPfDdrBcTim10 = 0x00D004A0,
> + HWPfDdrBcTim12 = 0x00D004C0,
> + HWPfDdrDfiInit = 0x00D004D0,
> + HWPfDdrDfiInitComplete = 0x00D004E0,
> + HWPfDdrDfiTim0 = 0x00D004F0,
> + HWPfDdrDfiTim1 = 0x00D00500,
> + HWPfDdrDfiPhyUpdEn = 0x00D00530,
> + HWPfDdrMemStatus = 0x00D00540,
> + HWPfDdrUmmcErrStatus = 0x00D00550,
> + HWPfDdrUmmcIntStatus = 0x00D00560,
> + HWPfDdrUmmcIntEn = 0x00D00570,
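> +	/* DDR PHY (HWPfDdrPhy*) training and calibration registers */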
> + HWPfDdrPhyRdLatency = 0x00D48400,
> + HWPfDdrPhyRdLatencyDbi = 0x00D48410,
> + HWPfDdrPhyWrLatency = 0x00D48420,
> + HWPfDdrPhyTrngType = 0x00D48430,
> + HWPfDdrPhyMrsTiming2 = 0x00D48440,
> + HWPfDdrPhyMrsTiming0 = 0x00D48450,
> + HWPfDdrPhyMrsTiming1 = 0x00D48460,
> + HWPfDdrPhyDramTmrd = 0x00D48470,
> + HWPfDdrPhyDramTmod = 0x00D48480,
> + HWPfDdrPhyDramTwpre = 0x00D48490,
> + HWPfDdrPhyDramTrfc = 0x00D484A0,
> + HWPfDdrPhyDramTrwtp = 0x00D484B0,
> + HWPfDdrPhyMr01Dimm = 0x00D484C0,
> + HWPfDdrPhyMr01DimmDbi = 0x00D484D0,
> + HWPfDdrPhyMr23Dimm = 0x00D484E0,
> + HWPfDdrPhyMr45Dimm = 0x00D484F0,
> + HWPfDdrPhyMr67Dimm = 0x00D48500,
> + HWPfDdrPhyWrlvlWwRdlvlRr = 0x00D48510,
> + HWPfDdrPhyOdtEn = 0x00D48520,
> + HWPfDdrPhyFastTrng = 0x00D48530,
> + HWPfDdrPhyDynTrngGap = 0x00D48540,
> + HWPfDdrPhyDynRcalGap = 0x00D48550,
> + HWPfDdrPhyIdletimeout = 0x00D48560,
> + HWPfDdrPhyRstCkeGap = 0x00D48570,
> + HWPfDdrPhyCkeMrsGap = 0x00D48580,
> + HWPfDdrPhyMemVrefMidVal = 0x00D48590,
> + HWPfDdrPhyVrefStep = 0x00D485A0,
> + HWPfDdrPhyVrefThreshold = 0x00D485B0,
> + HWPfDdrPhyPhyVrefMidVal = 0x00D485C0,
> + HWPfDdrPhyDqsCountMax = 0x00D485D0,
> + HWPfDdrPhyDqsCountNum = 0x00D485E0,
> + HWPfDdrPhyDramRow = 0x00D485F0,
> + HWPfDdrPhyDramCol = 0x00D48600,
> + HWPfDdrPhyDramBgBa = 0x00D48610,
> + HWPfDdrPhyDynamicUpdreqrel = 0x00D48620,
> + HWPfDdrPhyVrefLimits = 0x00D48630,
> + HWPfDdrPhyIdtmTcStatus = 0x00D6C020,
> + HWPfDdrPhyIdtmFwVersion = 0x00D6C410,
> + HWPfDdrPhyRdlvlGateInitDelay = 0x00D70000,
> + HWPfDdrPhyRdenSmplabc = 0x00D70008,
> + HWPfDdrPhyVrefNibble0 = 0x00D7000C,
> + HWPfDdrPhyVrefNibble1 = 0x00D70010,
> + HWPfDdrPhyRdlvlGateDqsSmpl0 = 0x00D70014,
> + HWPfDdrPhyRdlvlGateDqsSmpl1 = 0x00D70018,
> + HWPfDdrPhyRdlvlGateDqsSmpl2 = 0x00D7001C,
> + HWPfDdrPhyDqsCount = 0x00D70020,
> + HWPfDdrPhyWrlvlRdlvlGateStatus = 0x00D70024,
> + HWPfDdrPhyErrorFlags = 0x00D70028,
> + HWPfDdrPhyPowerDown = 0x00D70030,
> + HWPfDdrPhyPrbsSeedByte0 = 0x00D70034,
> + HWPfDdrPhyPrbsSeedByte1 = 0x00D70038,
> + HWPfDdrPhyPcompDq = 0x00D70040,
> + HWPfDdrPhyNcompDq = 0x00D70044,
> + HWPfDdrPhyPcompDqs = 0x00D70048,
> + HWPfDdrPhyNcompDqs = 0x00D7004C,
> + HWPfDdrPhyPcompCmd = 0x00D70050,
> + HWPfDdrPhyNcompCmd = 0x00D70054,
> + HWPfDdrPhyPcompCk = 0x00D70058,
> + HWPfDdrPhyNcompCk = 0x00D7005C,
> + HWPfDdrPhyRcalOdtDq = 0x00D70060,
> + HWPfDdrPhyRcalOdtDqs = 0x00D70064,
> + HWPfDdrPhyRcalMask1 = 0x00D70068,
> + HWPfDdrPhyRcalMask2 = 0x00D7006C,
> + HWPfDdrPhyRcalCtrl = 0x00D70070,
> + HWPfDdrPhyRcalCnt = 0x00D70074,
> + HWPfDdrPhyRcalOverride = 0x00D70078,
> + HWPfDdrPhyRcalGateen = 0x00D7007C,
> + HWPfDdrPhyCtrl = 0x00D70080,
> + HWPfDdrPhyWrlvlAlg = 0x00D70084,
> + HWPfDdrPhyRcalVreftTxcmdOdt = 0x00D70088,
> + HWPfDdrPhyRdlvlGateParam = 0x00D7008C,
> + HWPfDdrPhyRdlvlGateParam2 = 0x00D70090,
> + HWPfDdrPhyRcalVreftTxdata = 0x00D70094,
> + HWPfDdrPhyCmdIntDelay = 0x00D700A4,
> + HWPfDdrPhyAlertN = 0x00D700A8,
> + HWPfDdrPhyTrngReqWpre2tck = 0x00D700AC,
> + HWPfDdrPhyCmdPhaseSel = 0x00D700B4,
> + HWPfDdrPhyCmdDcdl = 0x00D700B8,
> + HWPfDdrPhyCkDcdl = 0x00D700BC,
> + HWPfDdrPhySwTrngCtrl1 = 0x00D700C0,
> + HWPfDdrPhySwTrngCtrl2 = 0x00D700C4,
> + HWPfDdrPhyRcalPcompRden = 0x00D700C8,
> + HWPfDdrPhyRcalNcompRden = 0x00D700CC,
> + HWPfDdrPhyRcalCompen = 0x00D700D0,
> + HWPfDdrPhySwTrngRdqs = 0x00D700D4,
> + HWPfDdrPhySwTrngWdqs = 0x00D700D8,
> + HWPfDdrPhySwTrngRdena = 0x00D700DC,
> + HWPfDdrPhySwTrngRdenb = 0x00D700E0,
> + HWPfDdrPhySwTrngRdenc = 0x00D700E4,
> + HWPfDdrPhySwTrngWdq = 0x00D700E8,
> + HWPfDdrPhySwTrngRdq = 0x00D700EC,
> + HWPfDdrPhyPcfgHmValue = 0x00D700F0,
> + HWPfDdrPhyPcfgTimerValue = 0x00D700F4,
> + HWPfDdrPhyPcfgSoftwareTraining = 0x00D700F8,
> + HWPfDdrPhyPcfgMcStatus = 0x00D700FC,
> + HWPfDdrPhyWrlvlPhRank0 = 0x00D70100,
> + HWPfDdrPhyRdenPhRank0 = 0x00D70104,
> + HWPfDdrPhyRdenIntRank0 = 0x00D70108,
> + HWPfDdrPhyRdqsDcdlRank0 = 0x00D7010C,
> + HWPfDdrPhyRdqsShadowDcdlRank0 = 0x00D70110,
> + HWPfDdrPhyWdqsDcdlRank0 = 0x00D70114,
> + HWPfDdrPhyWdmDcdlShadowRank0 = 0x00D70118,
> + HWPfDdrPhyWdmDcdlRank0 = 0x00D7011C,
> + HWPfDdrPhyDbiDcdlRank0 = 0x00D70120,
> + HWPfDdrPhyRdenDcdlaRank0 = 0x00D70124,
> + HWPfDdrPhyDbiDcdlShadowRank0 = 0x00D70128,
> + HWPfDdrPhyRdenDcdlbRank0 = 0x00D7012C,
> + HWPfDdrPhyWdqsShadowDcdlRank0 = 0x00D70130,
> + HWPfDdrPhyRdenDcdlcRank0 = 0x00D70134,
> + HWPfDdrPhyRdenShadowDcdlaRank0 = 0x00D70138,
> + HWPfDdrPhyWrlvlIntRank0 = 0x00D7013C,
> + HWPfDdrPhyRdqDcdlBit0Rank0 = 0x00D70200,
> + HWPfDdrPhyRdqDcdlShadowBit0Rank0 = 0x00D70204,
> + HWPfDdrPhyWdqDcdlBit0Rank0 = 0x00D70208,
> + HWPfDdrPhyWdqDcdlShadowBit0Rank0 = 0x00D7020C,
> + HWPfDdrPhyRdqDcdlBit1Rank0 = 0x00D70240,
> + HWPfDdrPhyRdqDcdlShadowBit1Rank0 = 0x00D70244,
> + HWPfDdrPhyWdqDcdlBit1Rank0 = 0x00D70248,
> + HWPfDdrPhyWdqDcdlShadowBit1Rank0 = 0x00D7024C,
> + HWPfDdrPhyRdqDcdlBit2Rank0 = 0x00D70280,
> + HWPfDdrPhyRdqDcdlShadowBit2Rank0 = 0x00D70284,
> + HWPfDdrPhyWdqDcdlBit2Rank0 = 0x00D70288,
> + HWPfDdrPhyWdqDcdlShadowBit2Rank0 = 0x00D7028C,
> + HWPfDdrPhyRdqDcdlBit3Rank0 = 0x00D702C0,
> + HWPfDdrPhyRdqDcdlShadowBit3Rank0 = 0x00D702C4,
> + HWPfDdrPhyWdqDcdlBit3Rank0 = 0x00D702C8,
> + HWPfDdrPhyWdqDcdlShadowBit3Rank0 = 0x00D702CC,
> + HWPfDdrPhyRdqDcdlBit4Rank0 = 0x00D70300,
> + HWPfDdrPhyRdqDcdlShadowBit4Rank0 = 0x00D70304,
> + HWPfDdrPhyWdqDcdlBit4Rank0 = 0x00D70308,
> + HWPfDdrPhyWdqDcdlShadowBit4Rank0 = 0x00D7030C,
> + HWPfDdrPhyRdqDcdlBit5Rank0 = 0x00D70340,
> + HWPfDdrPhyRdqDcdlShadowBit5Rank0 = 0x00D70344,
> + HWPfDdrPhyWdqDcdlBit5Rank0 = 0x00D70348,
> + HWPfDdrPhyWdqDcdlShadowBit5Rank0 = 0x00D7034C,
> + HWPfDdrPhyRdqDcdlBit6Rank0 = 0x00D70380,
> + HWPfDdrPhyRdqDcdlShadowBit6Rank0 = 0x00D70384,
> + HWPfDdrPhyWdqDcdlBit6Rank0 = 0x00D70388,
> + HWPfDdrPhyWdqDcdlShadowBit6Rank0 = 0x00D7038C,
> + HWPfDdrPhyRdqDcdlBit7Rank0 = 0x00D703C0,
> + HWPfDdrPhyRdqDcdlShadowBit7Rank0 = 0x00D703C4,
> + HWPfDdrPhyWdqDcdlBit7Rank0 = 0x00D703C8,
> + HWPfDdrPhyWdqDcdlShadowBit7Rank0 = 0x00D703CC,
> + HWPfDdrPhyIdtmStatus = 0x00D740D0,
> + HWPfDdrPhyIdtmError = 0x00D74110,
> + HWPfDdrPhyIdtmDebug = 0x00D74120,
> + HWPfDdrPhyIdtmDebugInt = 0x00D74130,
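> +	/* PCIe lane (HwPfPcieLn*) PHY registers */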
> + HwPfPcieLnAsicCfgovr = 0x00D80000,
> + HwPfPcieLnAclkmixer = 0x00D80004,
> + HwPfPcieLnTxrampfreq = 0x00D80008,
> + HwPfPcieLnLanetest = 0x00D8000C,
> + HwPfPcieLnDcctrl = 0x00D80010,
> + HwPfPcieLnDccmeas = 0x00D80014,
> + HwPfPcieLnDccovrAclk = 0x00D80018,
> + HwPfPcieLnDccovrTxa = 0x00D8001C,
> + HwPfPcieLnDccovrTxk = 0x00D80020,
> + HwPfPcieLnDccovrDclk = 0x00D80024,
> + HwPfPcieLnDccovrEclk = 0x00D80028,
> + HwPfPcieLnDcctrimAclk = 0x00D8002C,
> + HwPfPcieLnDcctrimTx = 0x00D80030,
> + HwPfPcieLnDcctrimDclk = 0x00D80034,
> + HwPfPcieLnDcctrimEclk = 0x00D80038,
> + HwPfPcieLnQuadCtrl = 0x00D8003C,
> + HwPfPcieLnQuadCorrIndex = 0x00D80040,
> + HwPfPcieLnQuadCorrStatus = 0x00D80044,
> + HwPfPcieLnAsicRxovr1 = 0x00D80048,
> + HwPfPcieLnAsicRxovr2 = 0x00D8004C,
> + HwPfPcieLnAsicEqinfovr = 0x00D80050,
> + HwPfPcieLnRxcsr = 0x00D80054,
> + HwPfPcieLnRxfectrl = 0x00D80058,
> + HwPfPcieLnRxtest = 0x00D8005C,
> + HwPfPcieLnEscount = 0x00D80060,
> + HwPfPcieLnCdrctrl = 0x00D80064,
> + HwPfPcieLnCdrctrl2 = 0x00D80068,
> + HwPfPcieLnCdrcfg0Ctrl0 = 0x00D8006C,
> + HwPfPcieLnCdrcfg0Ctrl1 = 0x00D80070,
> + HwPfPcieLnCdrcfg0Ctrl2 = 0x00D80074,
> + HwPfPc