DPDK patches and discussions
* [dpdk-dev] [PATCH v3 00/11] bbdev PMD ACC100
@ 2020-08-19  0:25 Nicolas Chautru
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 01/11] drivers/baseband: add PMD for ACC100 Nicolas Chautru
                   ` (10 more replies)
  0 siblings, 11 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-08-19  0:25 UTC (permalink / raw)
  To: dev, akhil.goyal; +Cc: bruce.richardson, Nicolas Chautru

v3: includes a change that was missed during rebase
v2: includes clean up from latest CI checks.

This set includes a new PMD for the ACC100 accelerator
providing 4G+5G FEC, targeting the 20.11 release.
Documentation is updated accordingly.
Existing unit tests are all still supported.


Nicolas Chautru (11):
  drivers/baseband: add PMD for ACC100
  baseband/acc100: add register definition file
  baseband/acc100: add info get function
  baseband/acc100: add queue configuration
  baseband/acc100: add LDPC processing functions
  baseband/acc100: add HARQ loopback support
  baseband/acc100: add support for 4G processing
  baseband/acc100: add interrupt support to PMD
  baseband/acc100: add debug function to validate input
  baseband/acc100: add configure function
  doc: update bbdev feature table

 app/test-bbdev/Makefile                            |    3 +
 app/test-bbdev/meson.build                         |    3 +
 app/test-bbdev/test_bbdev_perf.c                   |   72 +
 config/common_base                                 |    4 +
 doc/guides/bbdevs/acc100.rst                       |  233 +
 doc/guides/bbdevs/features/acc100.ini              |   14 +
 doc/guides/bbdevs/features/mbc.ini                 |   14 -
 doc/guides/bbdevs/index.rst                        |    1 +
 doc/guides/rel_notes/release_20_11.rst             |    6 +
 drivers/baseband/Makefile                          |    2 +
 drivers/baseband/acc100/Makefile                   |   28 +
 drivers/baseband/acc100/acc100_pf_enum.h           | 1068 +++++
 drivers/baseband/acc100/acc100_vf_enum.h           |   73 +
 drivers/baseband/acc100/meson.build                |    8 +
 drivers/baseband/acc100/rte_acc100_cfg.h           |  113 +
 drivers/baseband/acc100/rte_acc100_pmd.c           | 4684 ++++++++++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.h           |  593 +++
 .../acc100/rte_pmd_bbdev_acc100_version.map        |   10 +
 drivers/baseband/meson.build                       |    2 +-
 mk/rte.app.mk                                      |    1 +
 20 files changed, 6917 insertions(+), 15 deletions(-)
 create mode 100644 doc/guides/bbdevs/acc100.rst
 create mode 100644 doc/guides/bbdevs/features/acc100.ini
 delete mode 100644 doc/guides/bbdevs/features/mbc.ini
 create mode 100644 drivers/baseband/acc100/Makefile
 create mode 100644 drivers/baseband/acc100/acc100_pf_enum.h
 create mode 100644 drivers/baseband/acc100/acc100_vf_enum.h
 create mode 100644 drivers/baseband/acc100/meson.build
 create mode 100644 drivers/baseband/acc100/rte_acc100_cfg.h
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.c
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.h
 create mode 100644 drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map

-- 
1.8.3.1



* [dpdk-dev] [PATCH v3 01/11] drivers/baseband: add PMD for ACC100
  2020-08-19  0:25 [dpdk-dev] [PATCH v3 00/11] bbdev PMD ACC100 Nicolas Chautru
@ 2020-08-19  0:25 ` Nicolas Chautru
  2020-08-29  9:44   ` Xu, Rosen
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 02/11] baseband/acc100: add register definition file Nicolas Chautru
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 213+ messages in thread
From: Nicolas Chautru @ 2020-08-19  0:25 UTC (permalink / raw)
  To: dev, akhil.goyal; +Cc: bruce.richardson, Nicolas Chautru

Add stubs for the ACC100 PMD

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
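Note: once the device is bound (see the acc100.rst additions below) and
probed during EAL init, the stub device can already be enumerated through
the bbdev API. A minimal, illustrative fragment:

    #include <stdio.h>
    #include <rte_bbdev.h>

    /* After rte_eal_init(), count the bbdev devices that were probed */
    uint16_t nb_devs = rte_bbdev_count();
    printf("Found %u bbdev device(s)\n", nb_devs);
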
 config/common_base                                 |   4 +
 doc/guides/bbdevs/acc100.rst                       | 233 +++++++++++++++++++++
 doc/guides/bbdevs/index.rst                        |   1 +
 doc/guides/rel_notes/release_20_11.rst             |   6 +
 drivers/baseband/Makefile                          |   2 +
 drivers/baseband/acc100/Makefile                   |  25 +++
 drivers/baseband/acc100/meson.build                |   6 +
 drivers/baseband/acc100/rte_acc100_pmd.c           | 175 ++++++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.h           |  37 ++++
 .../acc100/rte_pmd_bbdev_acc100_version.map        |   3 +
 drivers/baseband/meson.build                       |   2 +-
 mk/rte.app.mk                                      |   1 +
 12 files changed, 494 insertions(+), 1 deletion(-)
 create mode 100644 doc/guides/bbdevs/acc100.rst
 create mode 100644 drivers/baseband/acc100/Makefile
 create mode 100644 drivers/baseband/acc100/meson.build
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.c
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.h
 create mode 100644 drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map

diff --git a/config/common_base b/config/common_base
index fbf0ee7..218ab16 100644
--- a/config/common_base
+++ b/config/common_base
@@ -584,6 +584,10 @@ CONFIG_RTE_LIBRTE_PMD_BBDEV_NULL=y
 #
 CONFIG_RTE_LIBRTE_PMD_BBDEV_TURBO_SW=y
 
+# Compile PMD for ACC100 bbdev device
+#
+CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100=y
+
 #
 # Compile PMD for Intel FPGA LTE FEC bbdev device
 #
diff --git a/doc/guides/bbdevs/acc100.rst b/doc/guides/bbdevs/acc100.rst
new file mode 100644
index 0000000..f87ee09
--- /dev/null
+++ b/doc/guides/bbdevs/acc100.rst
@@ -0,0 +1,233 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2020 Intel Corporation
+
+Intel(R) ACC100 5G/4G FEC Poll Mode Driver
+==========================================
+
+The BBDEV ACC100 5G/4G FEC poll mode driver (PMD) supports an
+implementation of a vRAN FEC wireless acceleration function.
+This device is also known as Mount Bryce.
+
+Features
+--------
+
+ACC100 5G/4G FEC PMD supports the following features:
+
+- LDPC Encode in the DL (5GNR)
+- LDPC Decode in the UL (5GNR)
+- Turbo Encode in the DL (4G)
+- Turbo Decode in the UL (4G)
+- 16 VFs per PF (physical device)
+- Maximum of 128 queues per VF
+- PCIe Gen-3 x16 Interface
+- MSI
+- SR-IOV
+
+ACC100 5G/4G FEC PMD supports the following BBDEV capabilities:
+
+* For the LDPC encode operation:
+   - ``RTE_BBDEV_LDPC_CRC_24B_ATTACH`` :  set to attach CRC24B to CB(s)
+   - ``RTE_BBDEV_LDPC_RATE_MATCH`` :  if set then do not do Rate Match bypass
+   - ``RTE_BBDEV_LDPC_INTERLEAVER_BYPASS`` : if set then bypass interleaver
+
+* For the LDPC decode operation:
+   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK`` :  check CRC24B from CB(s)
+   - ``RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE`` :  set to enable early termination of decode iterations
+   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP`` :  drops CRC24B bits appended while decoding
+   - ``RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE`` :  provides an input for HARQ combining
+   - ``RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE`` :  provides an output for HARQ combining
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE`` :  HARQ memory input is internal
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE`` :  HARQ memory output is internal
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK`` :  loopback data to/from HARQ memory
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_FILLERS`` :  HARQ memory includes the filler bits
+   - ``RTE_BBDEV_LDPC_DEC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
+   - ``RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION`` :  supports compression of the HARQ input/output
+   - ``RTE_BBDEV_LDPC_LLR_COMPRESSION`` :  supports LLR input compression
+
+* For the turbo encode operation:
+   - ``RTE_BBDEV_TURBO_CRC_24B_ATTACH`` :  set to attach CRC24B to CB(s)
+   - ``RTE_BBDEV_TURBO_RATE_MATCH`` :  if set then do not do Rate Match bypass
+   - ``RTE_BBDEV_TURBO_ENC_INTERRUPTS`` :  set for encoder dequeue interrupts
+   - ``RTE_BBDEV_TURBO_RV_INDEX_BYPASS`` :  set to bypass RV index
+   - ``RTE_BBDEV_TURBO_ENC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
+
+* For the turbo decode operation:
+   - ``RTE_BBDEV_TURBO_CRC_TYPE_24B`` :  check CRC24B from CB(s)
+   - ``RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE`` :  perform subblock de-interleave
+   - ``RTE_BBDEV_TURBO_DEC_INTERRUPTS`` :  set for decoder dequeue interrupts
+   - ``RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN`` :  set if negative LLR decoder i/p is supported
+   - ``RTE_BBDEV_TURBO_POS_LLR_1_BIT_IN`` :  set if positive LLR decoder i/p is supported
+   - ``RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP`` :  keep CRC24B bits appended while decoding
+   - ``RTE_BBDEV_TURBO_EARLY_TERMINATION`` :  set early termination feature
+   - ``RTE_BBDEV_TURBO_DEC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
+   - ``RTE_BBDEV_TURBO_HALF_ITERATION_EVEN`` :  set half iteration granularity
+
+Installation
+------------
+
+Section 3 of the DPDK manual provides instructions on installing and compiling DPDK. The
+default set of bbdev compile flags may be found in config/common_base, where for example
+the flag to build the ACC100 5G/4G FEC device, ``CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100``,
+is already set.
+
+DPDK requires hugepages to be configured as detailed in section 2 of the DPDK manual.
+The bbdev test application has been tested with a configuration of 40 x 1GB hugepages. The
+hugepage configuration of a server may be examined using:
+
+.. code-block:: console
+
+   grep Huge* /proc/meminfo
+
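+The 1GB hugepages used above can typically be reserved at runtime through
+sysfs, as in the illustrative command below (some platforms require boot-time
+reservation via kernel parameters instead):
+
+.. code-block:: console
+
+   echo 40 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages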
+
+Initialization
+--------------
+
+When the device first powers up, its PCI Physical Functions (PFs) can be listed through this command:
+
+.. code-block:: console
+
+  sudo lspci -vd8086:0d5c
+
+The physical and virtual functions are compatible with Linux UIO drivers:
+``vfio`` and ``igb_uio``. However, before it can be used, the ACC100 5G/4G
+FEC device must first be bound to one of these Linux drivers through DPDK.
+
+
+Bind PF UIO driver(s)
+~~~~~~~~~~~~~~~~~~~~~
+
+Install the DPDK igb_uio driver, bind it with the PF PCI device ID and use
+``lspci`` to confirm the PF device is in use by the ``igb_uio`` DPDK UIO driver.
+
+The igb_uio driver may be bound to the PF PCI device using one of three methods:
+
+
+1. PCI functions (physical or virtual, depending on the use case) can be bound to
+the UIO driver by repeating this command for every function.
+
+.. code-block:: console
+
+  cd <dpdk-top-level-directory>
+  insmod ./build/kmod/igb_uio.ko
+  echo "8086 0d5c" > /sys/bus/pci/drivers/igb_uio/new_id
+  lspci -vd8086:0d5c
+
+
+2. Another way to bind the PF to the DPDK UIO driver is by using the ``dpdk-devbind.py`` tool
+
+.. code-block:: console
+
+  cd <dpdk-top-level-directory>
+  ./usertools/dpdk-devbind.py -b igb_uio 0000:06:00.0
+
+where the PCI address (example: 0000:06:00.0) is obtained using ``lspci -vd8086:0d5c``
+
+
+3. A third way to bind is to use the ``dpdk-setup.sh`` tool
+
+.. code-block:: console
+
+  cd <dpdk-top-level-directory>
+  ./usertools/dpdk-setup.sh
+
+  select 'Bind Ethernet/Crypto/Baseband device to IGB UIO module'
+  or
+  select 'Bind Ethernet/Crypto/Baseband device to VFIO module' depending on driver required
+  enter PCI device ID
+  select 'Display current Ethernet/Crypto/Baseband device settings' to confirm binding
+
+
+In the same way the ACC100 5G/4G FEC PF can be bound with vfio; however, the vfio
+driver does not support SR-IOV configuration out of the box, so it needs to be patched.
+
+
+Enable Virtual Functions
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+The printouts should now show that the PCI PF is under igb_uio control:
+"``Kernel driver in use: igb_uio``"
+
+To show the number of available VFs on the device, read the ``sriov_totalvfs`` file:
+
+.. code-block:: console
+
+  cat /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_totalvfs
+
+  where 0000\:<b>\:<d>.<f> is the PCI device address
+
+
+To enable VFs via igb_uio, write the number of virtual functions to be enabled
+to the ``max_vfs`` file:
+
+.. code-block:: console
+
+  echo <num-of-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/max_vfs
+
+
+Afterwards, all VFs must be bound to the appropriate UIO drivers as required, in
+the same way as was done with the physical function previously.
+
+Enabling SR-IOV via the vfio driver is much the same, except that the file name
+is different:
+
+.. code-block:: console
+
+  echo <num-of-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_numvfs
+
+
+Configure the VFs through PF
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The PCI virtual functions must be configured before use or before being
+assigned to VMs/Containers. The configuration involves allocating the number
+of hardware queues, priorities, load balance, bandwidth and other settings
+necessary for the device to perform FEC functions.
+
+This configuration needs to be executed at least once after reboot or PCI FLR and can
+be achieved by using the function ``acc100_configure()``, which sets up the
+parameters defined in the ``acc100_conf`` structure.
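+
+Below is a minimal sketch of that call. It assumes the ``acc100_configure()``
+prototype and the ``pf_mode_en`` field from ``rte_acc100_cfg.h`` (added later in
+this series); treat the exact names and values as illustrative:
+
+.. code-block:: c
+
+   #include <stdio.h>
+   #include <rte_acc100_cfg.h>
+
+   struct acc100_conf conf = {0};
+   int ret;
+
+   /* Illustrative: leave PF mode disabled so the workload runs on the VFs */
+   conf.pf_mode_en = 0;
+
+   /* The device name is the PCI address of the PF, e.g. "0000:06:00.0" */
+   ret = acc100_configure("0000:06:00.0", &conf);
+   if (ret != 0)
+           printf("ACC100 configuration failed: %d\n", ret);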
+
+Test Application
+----------------
+
+BBDEV provides a test application, ``test-bbdev.py``, and a range of test data
+for testing the functionality of ACC100 5G/4G FEC encode and decode, depending
+on the device's capabilities. The test application is located in the
+``app/test-bbdev`` folder and has the following options:
+
+.. code-block:: console
+
+  "-p", "--testapp-path": specifies path to the bbdev test app.
+  "-e", "--eal-params"	: EAL arguments which are passed to the test app.
+  "-t", "--timeout"	: Timeout in seconds (default=300).
+  "-c", "--test-cases"	: Defines test cases to run. Run all if not specified.
+  "-v", "--test-vector"	: Test vector path (default=dpdk_path+/app/test-bbdev/test_vectors/bbdev_null.data).
+  "-n", "--num-ops"	: Number of operations to process on device (default=32).
+  "-b", "--burst-size"	: Operations enqueue/dequeue burst size (default=32).
+  "-s", "--snr"		: SNR in dB used when generating LLRs for bler tests.
+  "-s", "--iter_max"	: Number of iterations for LDPC decoder.
+  "-l", "--num-lcores"	: Number of lcores to run (default=16).
+  "-i", "--init-device" : Initialise PF device with default values.
+
+
+To execute the test application using simple decode or encode data, type one of
+the following:
+
+.. code-block:: console
+
+  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_dec_default.data
+  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_enc_default.data
+
+
+The test application ``test-bbdev.py`` supports the ability to configure the PF device
+with a default set of values, if the ``-i`` or ``--init-device`` option is included. The
+default values are defined in test_bbdev_perf.c.
+
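+For example, the decode test above can be combined with the PF initialisation
+step by appending the ``-i`` option:
+
+.. code-block:: console
+
+  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_dec_default.data -i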
+
+Test Vectors
+~~~~~~~~~~~~
+
+In addition to the simple LDPC decoder and LDPC encoder tests, bbdev also provides
+a range of additional tests under the test_vectors folder, which may be useful. The
+results of these tests will depend on the ACC100 5G/4G FEC capabilities, which may
+cause some test cases to be skipped, but no failure should be reported.
diff --git a/doc/guides/bbdevs/index.rst b/doc/guides/bbdevs/index.rst
index a8092dd..4445cbd 100644
--- a/doc/guides/bbdevs/index.rst
+++ b/doc/guides/bbdevs/index.rst
@@ -13,3 +13,4 @@ Baseband Device Drivers
     turbo_sw
     fpga_lte_fec
     fpga_5gnr_fec
+    acc100
diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
index df227a1..b3ab614 100644
--- a/doc/guides/rel_notes/release_20_11.rst
+++ b/doc/guides/rel_notes/release_20_11.rst
@@ -55,6 +55,12 @@ New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+* **Added Intel ACC100 bbdev PMD.**
+
+  Added a new ``acc100`` bbdev driver for the Intel\ |reg| ACC100 accelerator,
+  also known as Mount Bryce. See the :doc:`../bbdevs/acc100` BBDEV guide for
+  more details on this new driver.
+
 
 Removed Items
 -------------
diff --git a/drivers/baseband/Makefile b/drivers/baseband/Makefile
index dcc0969..b640294 100644
--- a/drivers/baseband/Makefile
+++ b/drivers/baseband/Makefile
@@ -10,6 +10,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_NULL) += null
 DEPDIRS-null = $(core-libs)
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_TURBO_SW) += turbo_sw
 DEPDIRS-turbo_sw = $(core-libs)
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100) += acc100
+DEPDIRS-acc100 = $(core-libs)
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_LTE_FEC) += fpga_lte_fec
 DEPDIRS-fpga_lte_fec = $(core-libs)
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC) += fpga_5gnr_fec
diff --git a/drivers/baseband/acc100/Makefile b/drivers/baseband/acc100/Makefile
new file mode 100644
index 0000000..c79e487
--- /dev/null
+++ b/drivers/baseband/acc100/Makefile
@@ -0,0 +1,25 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2020 Intel Corporation
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_pmd_bbdev_acc100.a
+
+# build flags
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring -lrte_cfgfile
+LDLIBS += -lrte_bbdev
+LDLIBS += -lrte_pci -lrte_bus_pci
+
+# versioning export map
+EXPORT_MAP := rte_pmd_bbdev_acc100_version.map
+
+# library version
+LIBABIVER := 1
+
+# library source files
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100) += rte_acc100_pmd.c
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/baseband/acc100/meson.build b/drivers/baseband/acc100/meson.build
new file mode 100644
index 0000000..8afafc2
--- /dev/null
+++ b/drivers/baseband/acc100/meson.build
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2020 Intel Corporation
+
+deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci']
+
+sources = files('rte_acc100_pmd.c')
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
new file mode 100644
index 0000000..1b4cd13
--- /dev/null
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -0,0 +1,175 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include <unistd.h>
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_dev.h>
+#include <rte_malloc.h>
+#include <rte_mempool.h>
+#include <rte_byteorder.h>
+#include <rte_errno.h>
+#include <rte_branch_prediction.h>
+#include <rte_hexdump.h>
+#include <rte_pci.h>
+#include <rte_bus_pci.h>
+
+#include <rte_bbdev.h>
+#include <rte_bbdev_pmd.h>
+#include "rte_acc100_pmd.h"
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, DEBUG);
+#else
+RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
+#endif
+
+/* Device close stub; freeing of the software ring memory is added in a later commit */
+static int
+acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
+{
+	return 0;
+}
+
+static const struct rte_bbdev_ops acc100_bbdev_ops = {
+	.close = acc100_dev_close,
+};
+
+/* ACC100 PCI PF address map */
+static struct rte_pci_id pci_id_acc100_pf_map[] = {
+	{
+		RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID, RTE_ACC100_PF_DEVICE_ID)
+	},
+	{.device_id = 0},
+};
+
+/* ACC100 PCI VF address map */
+static struct rte_pci_id pci_id_acc100_vf_map[] = {
+	{
+		RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID, RTE_ACC100_VF_DEVICE_ID)
+	},
+	{.device_id = 0},
+};
+
+/* Initialization Function */
+static void
+acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
+{
+	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
+
+	dev->dev_ops = &acc100_bbdev_ops;
+
+	((struct acc100_device *) dev->data->dev_private)->pf_device =
+			!strcmp(drv->driver.name,
+					RTE_STR(ACC100PF_DRIVER_NAME));
+	((struct acc100_device *) dev->data->dev_private)->mmio_base =
+			pci_dev->mem_resource[0].addr;
+
+	rte_bbdev_log_debug("Init device %s [%s] @ vaddr %p paddr %#"PRIx64"",
+			drv->driver.name, dev->data->name,
+			(void *)pci_dev->mem_resource[0].addr,
+			pci_dev->mem_resource[0].phys_addr);
+}
+
+static int acc100_pci_probe(struct rte_pci_driver *pci_drv,
+	struct rte_pci_device *pci_dev)
+{
+	struct rte_bbdev *bbdev = NULL;
+	char dev_name[RTE_BBDEV_NAME_MAX_LEN];
+
+	if (pci_dev == NULL) {
+		rte_bbdev_log(ERR, "NULL PCI device");
+		return -EINVAL;
+	}
+
+	rte_pci_device_name(&pci_dev->addr, dev_name, sizeof(dev_name));
+
+	/* Allocate memory to be used privately by drivers */
+	bbdev = rte_bbdev_allocate(pci_dev->device.name);
+	if (bbdev == NULL)
+		return -ENODEV;
+
+	/* allocate device private memory */
+	bbdev->data->dev_private = rte_zmalloc_socket(dev_name,
+			sizeof(struct acc100_device), RTE_CACHE_LINE_SIZE,
+			pci_dev->device.numa_node);
+
+	if (bbdev->data->dev_private == NULL) {
+		rte_bbdev_log(CRIT,
+				"Allocate of %zu bytes for device \"%s\" failed",
+				sizeof(struct acc100_device), dev_name);
+		rte_bbdev_release(bbdev);
+		return -ENOMEM;
+	}
+
+	/* Fill HW specific part of device structure */
+	bbdev->device = &pci_dev->device;
+	bbdev->intr_handle = &pci_dev->intr_handle;
+	bbdev->data->socket_id = pci_dev->device.numa_node;
+
+	/* Invoke ACC100 device initialization function */
+	acc100_bbdev_init(bbdev, pci_drv);
+
+	rte_bbdev_log_debug("Initialised bbdev %s (id = %u)",
+			dev_name, bbdev->data->dev_id);
+	return 0;
+}
+
+static int acc100_pci_remove(struct rte_pci_device *pci_dev)
+{
+	struct rte_bbdev *bbdev;
+	int ret;
+	uint8_t dev_id;
+
+	if (pci_dev == NULL)
+		return -EINVAL;
+
+	/* Find device */
+	bbdev = rte_bbdev_get_named_dev(pci_dev->device.name);
+	if (bbdev == NULL) {
+		rte_bbdev_log(CRIT,
+				"Couldn't find HW dev \"%s\" to uninitialise it",
+				pci_dev->device.name);
+		return -ENODEV;
+	}
+	dev_id = bbdev->data->dev_id;
+
+	/* free device private memory before close */
+	rte_free(bbdev->data->dev_private);
+
+	/* Close device */
+	ret = rte_bbdev_close(dev_id);
+	if (ret < 0)
+		rte_bbdev_log(ERR,
+				"Device %i failed to close during uninit: %i",
+				dev_id, ret);
+
+	/* release bbdev from library */
+	rte_bbdev_release(bbdev);
+
+	rte_bbdev_log_debug("Destroyed bbdev = %u", dev_id);
+
+	return 0;
+}
+
+static struct rte_pci_driver acc100_pci_pf_driver = {
+		.probe = acc100_pci_probe,
+		.remove = acc100_pci_remove,
+		.id_table = pci_id_acc100_pf_map,
+		.drv_flags = RTE_PCI_DRV_NEED_MAPPING
+};
+
+static struct rte_pci_driver acc100_pci_vf_driver = {
+		.probe = acc100_pci_probe,
+		.remove = acc100_pci_remove,
+		.id_table = pci_id_acc100_vf_map,
+		.drv_flags = RTE_PCI_DRV_NEED_MAPPING
+};
+
+RTE_PMD_REGISTER_PCI(ACC100PF_DRIVER_NAME, acc100_pci_pf_driver);
+RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
+RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
+RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
+
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
new file mode 100644
index 0000000..6f46df0
--- /dev/null
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -0,0 +1,37 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_ACC100_PMD_H_
+#define _RTE_ACC100_PMD_H_
+
+/* Helper macro for logging */
+#define rte_bbdev_log(level, fmt, ...) \
+	rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \
+		##__VA_ARGS__)
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+#define rte_bbdev_log_debug(fmt, ...) \
+		rte_bbdev_log(DEBUG, "acc100_pmd: " fmt, \
+		##__VA_ARGS__)
+#else
+#define rte_bbdev_log_debug(fmt, ...)
+#endif
+
+/* ACC100 PF and VF driver names */
+#define ACC100PF_DRIVER_NAME           intel_acc100_pf
+#define ACC100VF_DRIVER_NAME           intel_acc100_vf
+
+/* ACC100 PCI vendor & device IDs */
+#define RTE_ACC100_VENDOR_ID           (0x8086)
+#define RTE_ACC100_PF_DEVICE_ID        (0x0d5c)
+#define RTE_ACC100_VF_DEVICE_ID        (0x0d5d)
+
+/* Private data structure for each ACC100 device */
+struct acc100_device {
+	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
+	bool pf_device; /**< True if this is a PF ACC100 device */
+	bool configured; /**< True if this ACC100 device is configured */
+};
+
+#endif /* _RTE_ACC100_PMD_H_ */
diff --git a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
new file mode 100644
index 0000000..4a76d1d
--- /dev/null
+++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
@@ -0,0 +1,3 @@
+DPDK_21 {
+	local: *;
+};
diff --git a/drivers/baseband/meson.build b/drivers/baseband/meson.build
index 415b672..72301ce 100644
--- a/drivers/baseband/meson.build
+++ b/drivers/baseband/meson.build
@@ -5,7 +5,7 @@ if is_windows
 	subdir_done()
 endif
 
-drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec']
+drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec', 'acc100']
 
 config_flag_fmt = 'RTE_LIBRTE_PMD_BBDEV_@0@'
 driver_name_fmt = 'rte_pmd_bbdev_@0@'
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index a544259..a77f538 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -254,6 +254,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_NETVSC_PMD)     += -lrte_pmd_netvsc
 
 ifeq ($(CONFIG_RTE_LIBRTE_BBDEV),y)
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_NULL)     += -lrte_pmd_bbdev_null
+_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100)    += -lrte_pmd_bbdev_acc100
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_LTE_FEC) += -lrte_pmd_bbdev_fpga_lte_fec
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC) += -lrte_pmd_bbdev_fpga_5gnr_fec
 
-- 
1.8.3.1



* [dpdk-dev] [PATCH v3 02/11] baseband/acc100: add register definition file
  2020-08-19  0:25 [dpdk-dev] [PATCH v3 00/11] bbdev PMD ACC100 Nicolas Chautru
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 01/11] drivers/baseband: add PMD for ACC100 Nicolas Chautru
@ 2020-08-19  0:25 ` Nicolas Chautru
  2020-08-29  9:55   ` Xu, Rosen
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 03/11] baseband/acc100: add info get function Nicolas Chautru
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 213+ messages in thread
From: Nicolas Chautru @ 2020-08-19  0:25 UTC (permalink / raw)
  To: dev, akhil.goyal; +Cc: bruce.richardson, Nicolas Chautru

Add in the list of registers for the device and related
HW specs definitions.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
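Note: each value below is a byte offset into the PF BAR0 region. As an
illustration only (not part of this patch), a later commit in this series
adds MMIO accessors roughly along these lines:

    /* Sketch: 32-bit register write at an offset from the mapped BAR0
     * base held in the stub struct acc100_device from the first patch.
     */
    static inline void
    acc100_reg_write(struct acc100_device *d, uint32_t offset, uint32_t value)
    {
            void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
            rte_write32(rte_cpu_to_le_32(value), reg_addr);
    }
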
 drivers/baseband/acc100/acc100_pf_enum.h | 1068 ++++++++++++++++++++++++++++++
 drivers/baseband/acc100/acc100_vf_enum.h |   73 ++
 drivers/baseband/acc100/rte_acc100_pmd.h |  490 ++++++++++++++
 3 files changed, 1631 insertions(+)
 create mode 100644 drivers/baseband/acc100/acc100_pf_enum.h
 create mode 100644 drivers/baseband/acc100/acc100_vf_enum.h

diff --git a/drivers/baseband/acc100/acc100_pf_enum.h b/drivers/baseband/acc100/acc100_pf_enum.h
new file mode 100644
index 0000000..a1ee416
--- /dev/null
+++ b/drivers/baseband/acc100/acc100_pf_enum.h
@@ -0,0 +1,1068 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef ACC100_PF_ENUM_H
+#define ACC100_PF_ENUM_H
+
+/*
+ * ACC100 Register mapping on PF BAR0
+ * This is automatically generated from RDL; the format may change with a new
+ * RDL release.
+ * Variable names are kept as-is
+ */
+enum {
+	HWPfQmgrEgressQueuesTemplate          =  0x0007FE00,
+	HWPfQmgrIngressAq                     =  0x00080000,
+	HWPfQmgrArbQAvail                     =  0x00A00010,
+	HWPfQmgrArbQBlock                     =  0x00A00014,
+	HWPfQmgrAqueueDropNotifEn             =  0x00A00024,
+	HWPfQmgrAqueueDisableNotifEn          =  0x00A00028,
+	HWPfQmgrSoftReset                     =  0x00A00038,
+	HWPfQmgrInitStatus                    =  0x00A0003C,
+	HWPfQmgrAramWatchdogCount             =  0x00A00040,
+	HWPfQmgrAramWatchdogCounterEn         =  0x00A00044,
+	HWPfQmgrAxiWatchdogCount              =  0x00A00048,
+	HWPfQmgrAxiWatchdogCounterEn          =  0x00A0004C,
+	HWPfQmgrProcessWatchdogCount          =  0x00A00050,
+	HWPfQmgrProcessWatchdogCounterEn      =  0x00A00054,
+	HWPfQmgrProcessUl4GWatchdogCounter    =  0x00A00058,
+	HWPfQmgrProcessDl4GWatchdogCounter    =  0x00A0005C,
+	HWPfQmgrProcessUl5GWatchdogCounter    =  0x00A00060,
+	HWPfQmgrProcessDl5GWatchdogCounter    =  0x00A00064,
+	HWPfQmgrProcessMldWatchdogCounter     =  0x00A00068,
+	HWPfQmgrMsiOverflowUpperVf            =  0x00A00070,
+	HWPfQmgrMsiOverflowLowerVf            =  0x00A00074,
+	HWPfQmgrMsiWatchdogOverflow           =  0x00A00078,
+	HWPfQmgrMsiOverflowEnable             =  0x00A0007C,
+	HWPfQmgrDebugAqPointerMemGrp          =  0x00A00100,
+	HWPfQmgrDebugOutputArbQFifoGrp        =  0x00A00140,
+	HWPfQmgrDebugMsiFifoGrp               =  0x00A00180,
+	HWPfQmgrDebugAxiWdTimeoutMsiFifo      =  0x00A001C0,
+	HWPfQmgrDebugProcessWdTimeoutMsiFifo  =  0x00A001C4,
+	HWPfQmgrDepthLog2Grp                  =  0x00A00200,
+	HWPfQmgrTholdGrp                      =  0x00A00300,
+	HWPfQmgrGrpTmplateReg0Indx            =  0x00A00600,
+	HWPfQmgrGrpTmplateReg1Indx            =  0x00A00680,
+	HWPfQmgrGrpTmplateReg2indx            =  0x00A00700,
+	HWPfQmgrGrpTmplateReg3Indx            =  0x00A00780,
+	HWPfQmgrGrpTmplateReg4Indx            =  0x00A00800,
+	HWPfQmgrVfBaseAddr                    =  0x00A01000,
+	HWPfQmgrUl4GWeightRrVf                =  0x00A02000,
+	HWPfQmgrDl4GWeightRrVf                =  0x00A02100,
+	HWPfQmgrUl5GWeightRrVf                =  0x00A02200,
+	HWPfQmgrDl5GWeightRrVf                =  0x00A02300,
+	HWPfQmgrMldWeightRrVf                 =  0x00A02400,
+	HWPfQmgrArbQDepthGrp                  =  0x00A02F00,
+	HWPfQmgrGrpFunction0                  =  0x00A02F40,
+	HWPfQmgrGrpFunction1                  =  0x00A02F44,
+	HWPfQmgrGrpPriority                   =  0x00A02F48,
+	HWPfQmgrWeightSync                    =  0x00A03000,
+	HWPfQmgrAqEnableVf                    =  0x00A10000,
+	HWPfQmgrAqResetVf                     =  0x00A20000,
+	HWPfQmgrRingSizeVf                    =  0x00A20004,
+	HWPfQmgrGrpDepthLog20Vf               =  0x00A20008,
+	HWPfQmgrGrpDepthLog21Vf               =  0x00A2000C,
+	HWPfQmgrGrpFunction0Vf                =  0x00A20010,
+	HWPfQmgrGrpFunction1Vf                =  0x00A20014,
+	HWPfDmaConfig0Reg                     =  0x00B80000,
+	HWPfDmaConfig1Reg                     =  0x00B80004,
+	HWPfDmaQmgrAddrReg                    =  0x00B80008,
+	HWPfDmaSoftResetReg                   =  0x00B8000C,
+	HWPfDmaAxcacheReg                     =  0x00B80010,
+	HWPfDmaVersionReg                     =  0x00B80014,
+	HWPfDmaFrameThreshold                 =  0x00B80018,
+	HWPfDmaTimestampLo                    =  0x00B8001C,
+	HWPfDmaTimestampHi                    =  0x00B80020,
+	HWPfDmaAxiStatus                      =  0x00B80028,
+	HWPfDmaAxiControl                     =  0x00B8002C,
+	HWPfDmaNoQmgr                         =  0x00B80030,
+	HWPfDmaQosScale                       =  0x00B80034,
+	HWPfDmaQmanen                         =  0x00B80040,
+	HWPfDmaQmgrQosBase                    =  0x00B80060,
+	HWPfDmaFecClkGatingEnable             =  0x00B80080,
+	HWPfDmaPmEnable                       =  0x00B80084,
+	HWPfDmaQosEnable                      =  0x00B80088,
+	HWPfDmaHarqWeightedRrFrameThreshold   =  0x00B800B0,
+	HWPfDmaDataSmallWeightedRrFrameThresh  = 0x00B800B4,
+	HWPfDmaDataLargeWeightedRrFrameThresh  = 0x00B800B8,
+	HWPfDmaInboundCbMaxSize               =  0x00B800BC,
+	HWPfDmaInboundDrainDataSize           =  0x00B800C0,
+	HWPfDmaVfDdrBaseRw                    =  0x00B80400,
+	HWPfDmaCmplTmOutCnt                   =  0x00B80800,
+	HWPfDmaProcTmOutCnt                   =  0x00B80804,
+	HWPfDmaStatusRrespBresp               =  0x00B80810,
+	HWPfDmaCfgRrespBresp                  =  0x00B80814,
+	HWPfDmaStatusMemParErr                =  0x00B80818,
+	HWPfDmaCfgMemParErrEn                 =  0x00B8081C,
+	HWPfDmaStatusDmaHwErr                 =  0x00B80820,
+	HWPfDmaCfgDmaHwErrEn                  =  0x00B80824,
+	HWPfDmaStatusFecCoreErr               =  0x00B80828,
+	HWPfDmaCfgFecCoreErrEn                =  0x00B8082C,
+	HWPfDmaStatusFcwDescrErr              =  0x00B80830,
+	HWPfDmaCfgFcwDescrErrEn               =  0x00B80834,
+	HWPfDmaStatusBlockTransmit            =  0x00B80838,
+	HWPfDmaBlockOnErrEn                   =  0x00B8083C,
+	HWPfDmaStatusFlushDma                 =  0x00B80840,
+	HWPfDmaFlushDmaOnErrEn                =  0x00B80844,
+	HWPfDmaStatusSdoneFifoFull            =  0x00B80848,
+	HWPfDmaStatusDescriptorErrLoVf        =  0x00B8084C,
+	HWPfDmaStatusDescriptorErrHiVf        =  0x00B80850,
+	HWPfDmaStatusFcwErrLoVf               =  0x00B80854,
+	HWPfDmaStatusFcwErrHiVf               =  0x00B80858,
+	HWPfDmaStatusDataErrLoVf              =  0x00B8085C,
+	HWPfDmaStatusDataErrHiVf              =  0x00B80860,
+	HWPfDmaCfgMsiEnSoftwareErr            =  0x00B80864,
+	HWPfDmaDescriptorSignatuture          =  0x00B80868,
+	HWPfDmaFcwSignature                   =  0x00B8086C,
+	HWPfDmaErrorDetectionEn               =  0x00B80870,
+	HWPfDmaErrCntrlFifoDebug              =  0x00B8087C,
+	HWPfDmaStatusToutData                 =  0x00B80880,
+	HWPfDmaStatusToutDesc                 =  0x00B80884,
+	HWPfDmaStatusToutUnexpData            =  0x00B80888,
+	HWPfDmaStatusToutUnexpDesc            =  0x00B8088C,
+	HWPfDmaStatusToutProcess              =  0x00B80890,
+	HWPfDmaConfigCtoutOutDataEn           =  0x00B808A0,
+	HWPfDmaConfigCtoutOutDescrEn          =  0x00B808A4,
+	HWPfDmaConfigUnexpComplDataEn         =  0x00B808A8,
+	HWPfDmaConfigUnexpComplDescrEn        =  0x00B808AC,
+	HWPfDmaConfigPtoutOutEn               =  0x00B808B0,
+	HWPfDmaFec5GulDescBaseLoRegVf         =  0x00B88020,
+	HWPfDmaFec5GulDescBaseHiRegVf         =  0x00B88024,
+	HWPfDmaFec5GulRespPtrLoRegVf          =  0x00B88028,
+	HWPfDmaFec5GulRespPtrHiRegVf          =  0x00B8802C,
+	HWPfDmaFec5GdlDescBaseLoRegVf         =  0x00B88040,
+	HWPfDmaFec5GdlDescBaseHiRegVf         =  0x00B88044,
+	HWPfDmaFec5GdlRespPtrLoRegVf          =  0x00B88048,
+	HWPfDmaFec5GdlRespPtrHiRegVf          =  0x00B8804C,
+	HWPfDmaFec4GulDescBaseLoRegVf         =  0x00B88060,
+	HWPfDmaFec4GulDescBaseHiRegVf         =  0x00B88064,
+	HWPfDmaFec4GulRespPtrLoRegVf          =  0x00B88068,
+	HWPfDmaFec4GulRespPtrHiRegVf          =  0x00B8806C,
+	HWPfDmaFec4GdlDescBaseLoRegVf         =  0x00B88080,
+	HWPfDmaFec4GdlDescBaseHiRegVf         =  0x00B88084,
+	HWPfDmaFec4GdlRespPtrLoRegVf          =  0x00B88088,
+	HWPfDmaFec4GdlRespPtrHiRegVf          =  0x00B8808C,
+	HWPfDmaVfDdrBaseRangeRo               =  0x00B880A0,
+	HWPfQosmonACntrlReg                   =  0x00B90000,
+	HWPfQosmonAEvalOverflow0              =  0x00B90008,
+	HWPfQosmonAEvalOverflow1              =  0x00B9000C,
+	HWPfQosmonADivTerm                    =  0x00B90010,
+	HWPfQosmonATickTerm                   =  0x00B90014,
+	HWPfQosmonAEvalTerm                   =  0x00B90018,
+	HWPfQosmonAAveTerm                    =  0x00B9001C,
+	HWPfQosmonAForceEccErr                =  0x00B90020,
+	HWPfQosmonAEccErrDetect               =  0x00B90024,
+	HWPfQosmonAIterationConfig0Low        =  0x00B90060,
+	HWPfQosmonAIterationConfig0High       =  0x00B90064,
+	HWPfQosmonAIterationConfig1Low        =  0x00B90068,
+	HWPfQosmonAIterationConfig1High       =  0x00B9006C,
+	HWPfQosmonAIterationConfig2Low        =  0x00B90070,
+	HWPfQosmonAIterationConfig2High       =  0x00B90074,
+	HWPfQosmonAIterationConfig3Low        =  0x00B90078,
+	HWPfQosmonAIterationConfig3High       =  0x00B9007C,
+	HWPfQosmonAEvalMemAddr                =  0x00B90080,
+	HWPfQosmonAEvalMemData                =  0x00B90084,
+	HWPfQosmonAXaction                    =  0x00B900C0,
+	HWPfQosmonARemThres1Vf                =  0x00B90400,
+	HWPfQosmonAThres2Vf                   =  0x00B90404,
+	HWPfQosmonAWeiFracVf                  =  0x00B90408,
+	HWPfQosmonARrWeiVf                    =  0x00B9040C,
+	HWPfPermonACntrlRegVf                 =  0x00B98000,
+	HWPfPermonACountVf                    =  0x00B98008,
+	HWPfPermonAKCntLoVf                   =  0x00B98010,
+	HWPfPermonAKCntHiVf                   =  0x00B98014,
+	HWPfPermonADeltaCntLoVf               =  0x00B98020,
+	HWPfPermonADeltaCntHiVf               =  0x00B98024,
+	HWPfPermonAVersionReg                 =  0x00B9C000,
+	HWPfPermonACbControlFec               =  0x00B9C0F0,
+	HWPfPermonADltTimerLoFec              =  0x00B9C0F4,
+	HWPfPermonADltTimerHiFec              =  0x00B9C0F8,
+	HWPfPermonACbCountFec                 =  0x00B9C100,
+	HWPfPermonAAccExecTimerLoFec          =  0x00B9C104,
+	HWPfPermonAAccExecTimerHiFec          =  0x00B9C108,
+	HWPfPermonAExecTimerMinFec            =  0x00B9C200,
+	HWPfPermonAExecTimerMaxFec            =  0x00B9C204,
+	HWPfPermonAControlBusMon              =  0x00B9C400,
+	HWPfPermonAConfigBusMon               =  0x00B9C404,
+	HWPfPermonASkipCountBusMon            =  0x00B9C408,
+	HWPfPermonAMinLatBusMon               =  0x00B9C40C,
+	HWPfPermonAMaxLatBusMon               =  0x00B9C500,
+	HWPfPermonATotalLatLowBusMon          =  0x00B9C504,
+	HWPfPermonATotalLatUpperBusMon        =  0x00B9C508,
+	HWPfPermonATotalReqCntBusMon          =  0x00B9C50C,
+	HWPfQosmonBCntrlReg                   =  0x00BA0000,
+	HWPfQosmonBEvalOverflow0              =  0x00BA0008,
+	HWPfQosmonBEvalOverflow1              =  0x00BA000C,
+	HWPfQosmonBDivTerm                    =  0x00BA0010,
+	HWPfQosmonBTickTerm                   =  0x00BA0014,
+	HWPfQosmonBEvalTerm                   =  0x00BA0018,
+	HWPfQosmonBAveTerm                    =  0x00BA001C,
+	HWPfQosmonBForceEccErr                =  0x00BA0020,
+	HWPfQosmonBEccErrDetect               =  0x00BA0024,
+	HWPfQosmonBIterationConfig0Low        =  0x00BA0060,
+	HWPfQosmonBIterationConfig0High       =  0x00BA0064,
+	HWPfQosmonBIterationConfig1Low        =  0x00BA0068,
+	HWPfQosmonBIterationConfig1High       =  0x00BA006C,
+	HWPfQosmonBIterationConfig2Low        =  0x00BA0070,
+	HWPfQosmonBIterationConfig2High       =  0x00BA0074,
+	HWPfQosmonBIterationConfig3Low        =  0x00BA0078,
+	HWPfQosmonBIterationConfig3High       =  0x00BA007C,
+	HWPfQosmonBEvalMemAddr                =  0x00BA0080,
+	HWPfQosmonBEvalMemData                =  0x00BA0084,
+	HWPfQosmonBXaction                    =  0x00BA00C0,
+	HWPfQosmonBRemThres1Vf                =  0x00BA0400,
+	HWPfQosmonBThres2Vf                   =  0x00BA0404,
+	HWPfQosmonBWeiFracVf                  =  0x00BA0408,
+	HWPfQosmonBRrWeiVf                    =  0x00BA040C,
+	HWPfPermonBCntrlRegVf                 =  0x00BA8000,
+	HWPfPermonBCountVf                    =  0x00BA8008,
+	HWPfPermonBKCntLoVf                   =  0x00BA8010,
+	HWPfPermonBKCntHiVf                   =  0x00BA8014,
+	HWPfPermonBDeltaCntLoVf               =  0x00BA8020,
+	HWPfPermonBDeltaCntHiVf               =  0x00BA8024,
+	HWPfPermonBVersionReg                 =  0x00BAC000,
+	HWPfPermonBCbControlFec               =  0x00BAC0F0,
+	HWPfPermonBDltTimerLoFec              =  0x00BAC0F4,
+	HWPfPermonBDltTimerHiFec              =  0x00BAC0F8,
+	HWPfPermonBCbCountFec                 =  0x00BAC100,
+	HWPfPermonBAccExecTimerLoFec          =  0x00BAC104,
+	HWPfPermonBAccExecTimerHiFec          =  0x00BAC108,
+	HWPfPermonBExecTimerMinFec            =  0x00BAC200,
+	HWPfPermonBExecTimerMaxFec            =  0x00BAC204,
+	HWPfPermonBControlBusMon              =  0x00BAC400,
+	HWPfPermonBConfigBusMon               =  0x00BAC404,
+	HWPfPermonBSkipCountBusMon            =  0x00BAC408,
+	HWPfPermonBMinLatBusMon               =  0x00BAC40C,
+	HWPfPermonBMaxLatBusMon               =  0x00BAC500,
+	HWPfPermonBTotalLatLowBusMon          =  0x00BAC504,
+	HWPfPermonBTotalLatUpperBusMon        =  0x00BAC508,
+	HWPfPermonBTotalReqCntBusMon          =  0x00BAC50C,
+	HWPfFecUl5gCntrlReg                   =  0x00BC0000,
+	HWPfFecUl5gI2MThreshReg               =  0x00BC0004,
+	HWPfFecUl5gVersionReg                 =  0x00BC0100,
+	HWPfFecUl5gFcwStatusReg               =  0x00BC0104,
+	HWPfFecUl5gWarnReg                    =  0x00BC0108,
+	HwPfFecUl5gIbDebugReg                 =  0x00BC0200,
+	HwPfFecUl5gObLlrDebugReg              =  0x00BC0204,
+	HwPfFecUl5gObHarqDebugReg             =  0x00BC0208,
+	HwPfFecUl5g1CntrlReg                  =  0x00BC1000,
+	HwPfFecUl5g1I2MThreshReg              =  0x00BC1004,
+	HwPfFecUl5g1VersionReg                =  0x00BC1100,
+	HwPfFecUl5g1FcwStatusReg              =  0x00BC1104,
+	HwPfFecUl5g1WarnReg                   =  0x00BC1108,
+	HwPfFecUl5g1IbDebugReg                =  0x00BC1200,
+	HwPfFecUl5g1ObLlrDebugReg             =  0x00BC1204,
+	HwPfFecUl5g1ObHarqDebugReg            =  0x00BC1208,
+	HwPfFecUl5g2CntrlReg                  =  0x00BC2000,
+	HwPfFecUl5g2I2MThreshReg              =  0x00BC2004,
+	HwPfFecUl5g2VersionReg                =  0x00BC2100,
+	HwPfFecUl5g2FcwStatusReg              =  0x00BC2104,
+	HwPfFecUl5g2WarnReg                   =  0x00BC2108,
+	HwPfFecUl5g2IbDebugReg                =  0x00BC2200,
+	HwPfFecUl5g2ObLlrDebugReg             =  0x00BC2204,
+	HwPfFecUl5g2ObHarqDebugReg            =  0x00BC2208,
+	HwPfFecUl5g3CntrlReg                  =  0x00BC3000,
+	HwPfFecUl5g3I2MThreshReg              =  0x00BC3004,
+	HwPfFecUl5g3VersionReg                =  0x00BC3100,
+	HwPfFecUl5g3FcwStatusReg              =  0x00BC3104,
+	HwPfFecUl5g3WarnReg                   =  0x00BC3108,
+	HwPfFecUl5g3IbDebugReg                =  0x00BC3200,
+	HwPfFecUl5g3ObLlrDebugReg             =  0x00BC3204,
+	HwPfFecUl5g3ObHarqDebugReg            =  0x00BC3208,
+	HwPfFecUl5g4CntrlReg                  =  0x00BC4000,
+	HwPfFecUl5g4I2MThreshReg              =  0x00BC4004,
+	HwPfFecUl5g4VersionReg                =  0x00BC4100,
+	HwPfFecUl5g4FcwStatusReg              =  0x00BC4104,
+	HwPfFecUl5g4WarnReg                   =  0x00BC4108,
+	HwPfFecUl5g4IbDebugReg                =  0x00BC4200,
+	HwPfFecUl5g4ObLlrDebugReg             =  0x00BC4204,
+	HwPfFecUl5g4ObHarqDebugReg            =  0x00BC4208,
+	HwPfFecUl5g5CntrlReg                  =  0x00BC5000,
+	HwPfFecUl5g5I2MThreshReg              =  0x00BC5004,
+	HwPfFecUl5g5VersionReg                =  0x00BC5100,
+	HwPfFecUl5g5FcwStatusReg              =  0x00BC5104,
+	HwPfFecUl5g5WarnReg                   =  0x00BC5108,
+	HwPfFecUl5g5IbDebugReg                =  0x00BC5200,
+	HwPfFecUl5g5ObLlrDebugReg             =  0x00BC5204,
+	HwPfFecUl5g5ObHarqDebugReg            =  0x00BC5208,
+	HwPfFecUl5g6CntrlReg                  =  0x00BC6000,
+	HwPfFecUl5g6I2MThreshReg              =  0x00BC6004,
+	HwPfFecUl5g6VersionReg                =  0x00BC6100,
+	HwPfFecUl5g6FcwStatusReg              =  0x00BC6104,
+	HwPfFecUl5g6WarnReg                   =  0x00BC6108,
+	HwPfFecUl5g6IbDebugReg                =  0x00BC6200,
+	HwPfFecUl5g6ObLlrDebugReg             =  0x00BC6204,
+	HwPfFecUl5g6ObHarqDebugReg            =  0x00BC6208,
+	HwPfFecUl5g7CntrlReg                  =  0x00BC7000,
+	HwPfFecUl5g7I2MThreshReg              =  0x00BC7004,
+	HwPfFecUl5g7VersionReg                =  0x00BC7100,
+	HwPfFecUl5g7FcwStatusReg              =  0x00BC7104,
+	HwPfFecUl5g7WarnReg                   =  0x00BC7108,
+	HwPfFecUl5g7IbDebugReg                =  0x00BC7200,
+	HwPfFecUl5g7ObLlrDebugReg             =  0x00BC7204,
+	HwPfFecUl5g7ObHarqDebugReg            =  0x00BC7208,
+	HwPfFecUl5g8CntrlReg                  =  0x00BC8000,
+	HwPfFecUl5g8I2MThreshReg              =  0x00BC8004,
+	HwPfFecUl5g8VersionReg                =  0x00BC8100,
+	HwPfFecUl5g8FcwStatusReg              =  0x00BC8104,
+	HwPfFecUl5g8WarnReg                   =  0x00BC8108,
+	HwPfFecUl5g8IbDebugReg                =  0x00BC8200,
+	HwPfFecUl5g8ObLlrDebugReg             =  0x00BC8204,
+	HwPfFecUl5g8ObHarqDebugReg            =  0x00BC8208,
+	HWPfFecDl5gCntrlReg                   =  0x00BCF000,
+	HWPfFecDl5gI2MThreshReg               =  0x00BCF004,
+	HWPfFecDl5gVersionReg                 =  0x00BCF100,
+	HWPfFecDl5gFcwStatusReg               =  0x00BCF104,
+	HWPfFecDl5gWarnReg                    =  0x00BCF108,
+	HWPfFecUlVersionReg                   =  0x00BD0000,
+	HWPfFecUlControlReg                   =  0x00BD0004,
+	HWPfFecUlStatusReg                    =  0x00BD0008,
+	HWPfFecDlVersionReg                   =  0x00BDF000,
+	HWPfFecDlClusterConfigReg             =  0x00BDF004,
+	HWPfFecDlBurstThres                   =  0x00BDF00C,
+	HWPfFecDlClusterStatusReg0            =  0x00BDF040,
+	HWPfFecDlClusterStatusReg1            =  0x00BDF044,
+	HWPfFecDlClusterStatusReg2            =  0x00BDF048,
+	HWPfFecDlClusterStatusReg3            =  0x00BDF04C,
+	HWPfFecDlClusterStatusReg4            =  0x00BDF050,
+	HWPfFecDlClusterStatusReg5            =  0x00BDF054,
+	HWPfChaFabPllPllrst                   =  0x00C40000,
+	HWPfChaFabPllClk0                     =  0x00C40004,
+	HWPfChaFabPllClk1                     =  0x00C40008,
+	HWPfChaFabPllBwadj                    =  0x00C4000C,
+	HWPfChaFabPllLbw                      =  0x00C40010,
+	HWPfChaFabPllResetq                   =  0x00C40014,
+	HWPfChaFabPllPhshft0                  =  0x00C40018,
+	HWPfChaFabPllPhshft1                  =  0x00C4001C,
+	HWPfChaFabPllDivq0                    =  0x00C40020,
+	HWPfChaFabPllDivq1                    =  0x00C40024,
+	HWPfChaFabPllDivq2                    =  0x00C40028,
+	HWPfChaFabPllDivq3                    =  0x00C4002C,
+	HWPfChaFabPllDivq4                    =  0x00C40030,
+	HWPfChaFabPllDivq5                    =  0x00C40034,
+	HWPfChaFabPllDivq6                    =  0x00C40038,
+	HWPfChaFabPllDivq7                    =  0x00C4003C,
+	HWPfChaDl5gPllPllrst                  =  0x00C40080,
+	HWPfChaDl5gPllClk0                    =  0x00C40084,
+	HWPfChaDl5gPllClk1                    =  0x00C40088,
+	HWPfChaDl5gPllBwadj                   =  0x00C4008C,
+	HWPfChaDl5gPllLbw                     =  0x00C40090,
+	HWPfChaDl5gPllResetq                  =  0x00C40094,
+	HWPfChaDl5gPllPhshft0                 =  0x00C40098,
+	HWPfChaDl5gPllPhshft1                 =  0x00C4009C,
+	HWPfChaDl5gPllDivq0                   =  0x00C400A0,
+	HWPfChaDl5gPllDivq1                   =  0x00C400A4,
+	HWPfChaDl5gPllDivq2                   =  0x00C400A8,
+	HWPfChaDl5gPllDivq3                   =  0x00C400AC,
+	HWPfChaDl5gPllDivq4                   =  0x00C400B0,
+	HWPfChaDl5gPllDivq5                   =  0x00C400B4,
+	HWPfChaDl5gPllDivq6                   =  0x00C400B8,
+	HWPfChaDl5gPllDivq7                   =  0x00C400BC,
+	HWPfChaDl4gPllPllrst                  =  0x00C40100,
+	HWPfChaDl4gPllClk0                    =  0x00C40104,
+	HWPfChaDl4gPllClk1                    =  0x00C40108,
+	HWPfChaDl4gPllBwadj                   =  0x00C4010C,
+	HWPfChaDl4gPllLbw                     =  0x00C40110,
+	HWPfChaDl4gPllResetq                  =  0x00C40114,
+	HWPfChaDl4gPllPhshft0                 =  0x00C40118,
+	HWPfChaDl4gPllPhshft1                 =  0x00C4011C,
+	HWPfChaDl4gPllDivq0                   =  0x00C40120,
+	HWPfChaDl4gPllDivq1                   =  0x00C40124,
+	HWPfChaDl4gPllDivq2                   =  0x00C40128,
+	HWPfChaDl4gPllDivq3                   =  0x00C4012C,
+	HWPfChaDl4gPllDivq4                   =  0x00C40130,
+	HWPfChaDl4gPllDivq5                   =  0x00C40134,
+	HWPfChaDl4gPllDivq6                   =  0x00C40138,
+	HWPfChaDl4gPllDivq7                   =  0x00C4013C,
+	HWPfChaUl5gPllPllrst                  =  0x00C40180,
+	HWPfChaUl5gPllClk0                    =  0x00C40184,
+	HWPfChaUl5gPllClk1                    =  0x00C40188,
+	HWPfChaUl5gPllBwadj                   =  0x00C4018C,
+	HWPfChaUl5gPllLbw                     =  0x00C40190,
+	HWPfChaUl5gPllResetq                  =  0x00C40194,
+	HWPfChaUl5gPllPhshft0                 =  0x00C40198,
+	HWPfChaUl5gPllPhshft1                 =  0x00C4019C,
+	HWPfChaUl5gPllDivq0                   =  0x00C401A0,
+	HWPfChaUl5gPllDivq1                   =  0x00C401A4,
+	HWPfChaUl5gPllDivq2                   =  0x00C401A8,
+	HWPfChaUl5gPllDivq3                   =  0x00C401AC,
+	HWPfChaUl5gPllDivq4                   =  0x00C401B0,
+	HWPfChaUl5gPllDivq5                   =  0x00C401B4,
+	HWPfChaUl5gPllDivq6                   =  0x00C401B8,
+	HWPfChaUl5gPllDivq7                   =  0x00C401BC,
+	HWPfChaUl4gPllPllrst                  =  0x00C40200,
+	HWPfChaUl4gPllClk0                    =  0x00C40204,
+	HWPfChaUl4gPllClk1                    =  0x00C40208,
+	HWPfChaUl4gPllBwadj                   =  0x00C4020C,
+	HWPfChaUl4gPllLbw                     =  0x00C40210,
+	HWPfChaUl4gPllResetq                  =  0x00C40214,
+	HWPfChaUl4gPllPhshft0                 =  0x00C40218,
+	HWPfChaUl4gPllPhshft1                 =  0x00C4021C,
+	HWPfChaUl4gPllDivq0                   =  0x00C40220,
+	HWPfChaUl4gPllDivq1                   =  0x00C40224,
+	HWPfChaUl4gPllDivq2                   =  0x00C40228,
+	HWPfChaUl4gPllDivq3                   =  0x00C4022C,
+	HWPfChaUl4gPllDivq4                   =  0x00C40230,
+	HWPfChaUl4gPllDivq5                   =  0x00C40234,
+	HWPfChaUl4gPllDivq6                   =  0x00C40238,
+	HWPfChaUl4gPllDivq7                   =  0x00C4023C,
+	HWPfChaDdrPllPllrst                   =  0x00C40280,
+	HWPfChaDdrPllClk0                     =  0x00C40284,
+	HWPfChaDdrPllClk1                     =  0x00C40288,
+	HWPfChaDdrPllBwadj                    =  0x00C4028C,
+	HWPfChaDdrPllLbw                      =  0x00C40290,
+	HWPfChaDdrPllResetq                   =  0x00C40294,
+	HWPfChaDdrPllPhshft0                  =  0x00C40298,
+	HWPfChaDdrPllPhshft1                  =  0x00C4029C,
+	HWPfChaDdrPllDivq0                    =  0x00C402A0,
+	HWPfChaDdrPllDivq1                    =  0x00C402A4,
+	HWPfChaDdrPllDivq2                    =  0x00C402A8,
+	HWPfChaDdrPllDivq3                    =  0x00C402AC,
+	HWPfChaDdrPllDivq4                    =  0x00C402B0,
+	HWPfChaDdrPllDivq5                    =  0x00C402B4,
+	HWPfChaDdrPllDivq6                    =  0x00C402B8,
+	HWPfChaDdrPllDivq7                    =  0x00C402BC,
+	HWPfChaErrStatus                      =  0x00C40400,
+	HWPfChaErrMask                        =  0x00C40404,
+	HWPfChaDebugPcieMsiFifo               =  0x00C40410,
+	HWPfChaDebugDdrMsiFifo                =  0x00C40414,
+	HWPfChaDebugMiscMsiFifo               =  0x00C40418,
+	HWPfChaPwmSet                         =  0x00C40420,
+	HWPfChaDdrRstStatus                   =  0x00C40430,
+	HWPfChaDdrStDoneStatus                =  0x00C40434,
+	HWPfChaDdrWbRstCfg                    =  0x00C40438,
+	HWPfChaDdrApbRstCfg                   =  0x00C4043C,
+	HWPfChaDdrPhyRstCfg                   =  0x00C40440,
+	HWPfChaDdrCpuRstCfg                   =  0x00C40444,
+	HWPfChaDdrSifRstCfg                   =  0x00C40448,
+	HWPfChaPadcfgPcomp0                   =  0x00C41000,
+	HWPfChaPadcfgNcomp0                   =  0x00C41004,
+	HWPfChaPadcfgOdt0                     =  0x00C41008,
+	HWPfChaPadcfgProtect0                 =  0x00C4100C,
+	HWPfChaPreemphasisProtect0            =  0x00C41010,
+	HWPfChaPreemphasisCompen0             =  0x00C41040,
+	HWPfChaPreemphasisOdten0              =  0x00C41044,
+	HWPfChaPadcfgPcomp1                   =  0x00C41100,
+	HWPfChaPadcfgNcomp1                   =  0x00C41104,
+	HWPfChaPadcfgOdt1                     =  0x00C41108,
+	HWPfChaPadcfgProtect1                 =  0x00C4110C,
+	HWPfChaPreemphasisProtect1            =  0x00C41110,
+	HWPfChaPreemphasisCompen1             =  0x00C41140,
+	HWPfChaPreemphasisOdten1              =  0x00C41144,
+	HWPfChaPadcfgPcomp2                   =  0x00C41200,
+	HWPfChaPadcfgNcomp2                   =  0x00C41204,
+	HWPfChaPadcfgOdt2                     =  0x00C41208,
+	HWPfChaPadcfgProtect2                 =  0x00C4120C,
+	HWPfChaPreemphasisProtect2            =  0x00C41210,
+	HWPfChaPreemphasisCompen2             =  0x00C41240,
+	HWPfChaPreemphasisOdten4              =  0x00C41444,
+	HWPfChaPreemphasisOdten2              =  0x00C41244,
+	HWPfChaPadcfgPcomp3                   =  0x00C41300,
+	HWPfChaPadcfgNcomp3                   =  0x00C41304,
+	HWPfChaPadcfgOdt3                     =  0x00C41308,
+	HWPfChaPadcfgProtect3                 =  0x00C4130C,
+	HWPfChaPreemphasisProtect3            =  0x00C41310,
+	HWPfChaPreemphasisCompen3             =  0x00C41340,
+	HWPfChaPreemphasisOdten3              =  0x00C41344,
+	HWPfChaPadcfgPcomp4                   =  0x00C41400,
+	HWPfChaPadcfgNcomp4                   =  0x00C41404,
+	HWPfChaPadcfgOdt4                     =  0x00C41408,
+	HWPfChaPadcfgProtect4                 =  0x00C4140C,
+	HWPfChaPreemphasisProtect4            =  0x00C41410,
+	HWPfChaPreemphasisCompen4             =  0x00C41440,
+	HWPfHiVfToPfDbellVf                   =  0x00C80000,
+	HWPfHiPfToVfDbellVf                   =  0x00C80008,
+	HWPfHiInfoRingBaseLoVf                =  0x00C80010,
+	HWPfHiInfoRingBaseHiVf                =  0x00C80014,
+	HWPfHiInfoRingPointerVf               =  0x00C80018,
+	HWPfHiInfoRingIntWrEnVf               =  0x00C80020,
+	HWPfHiInfoRingPf2VfWrEnVf             =  0x00C80024,
+	HWPfHiMsixVectorMapperVf              =  0x00C80060,
+	HWPfHiModuleVersionReg                =  0x00C84000,
+	HWPfHiIosf2axiErrLogReg               =  0x00C84004,
+	HWPfHiHardResetReg                    =  0x00C84008,
+	HWPfHi5GHardResetReg                  =  0x00C8400C,
+	HWPfHiInfoRingBaseLoRegPf             =  0x00C84010,
+	HWPfHiInfoRingBaseHiRegPf             =  0x00C84014,
+	HWPfHiInfoRingPointerRegPf            =  0x00C84018,
+	HWPfHiInfoRingIntWrEnRegPf            =  0x00C84020,
+	HWPfHiInfoRingVf2pfLoWrEnReg          =  0x00C84024,
+	HWPfHiInfoRingVf2pfHiWrEnReg          =  0x00C84028,
+	HWPfHiLogParityErrStatusReg           =  0x00C8402C,
+	HWPfHiLogDataParityErrorVfStatusLo    =  0x00C84030,
+	HWPfHiLogDataParityErrorVfStatusHi    =  0x00C84034,
+	HWPfHiBlockTransmitOnErrorEn          =  0x00C84038,
+	HWPfHiCfgMsiIntWrEnRegPf              =  0x00C84040,
+	HWPfHiCfgMsiVf2pfLoWrEnReg            =  0x00C84044,
+	HWPfHiCfgMsiVf2pfHighWrEnReg          =  0x00C84048,
+	HWPfHiMsixVectorMapperPf              =  0x00C84060,
+	HWPfHiApbWrWaitTime                   =  0x00C84100,
+	HWPfHiXCounterMaxValue                =  0x00C84104,
+	HWPfHiPfMode                          =  0x00C84108,
+	HWPfHiClkGateHystReg                  =  0x00C8410C,
+	HWPfHiSnoopBitsReg                    =  0x00C84110,
+	HWPfHiMsiDropEnableReg                =  0x00C84114,
+	HWPfHiMsiStatReg                      =  0x00C84120,
+	HWPfHiFifoOflStatReg                  =  0x00C84124,
+	HWPfHiHiDebugReg                      =  0x00C841F4,
+	HWPfHiDebugMemSnoopMsiFifo            =  0x00C841F8,
+	HWPfHiDebugMemSnoopInputFifo          =  0x00C841FC,
+	HWPfHiMsixMappingConfig               =  0x00C84200,
+	HWPfHiJunkReg                         =  0x00C8FF00,
+	HWPfDdrUmmcVer                        =  0x00D00000,
+	HWPfDdrUmmcCap                        =  0x00D00010,
+	HWPfDdrUmmcCtrl                       =  0x00D00020,
+	HWPfDdrMpcPe                          =  0x00D00080,
+	HWPfDdrMpcPpri3                       =  0x00D00090,
+	HWPfDdrMpcPpri2                       =  0x00D000A0,
+	HWPfDdrMpcPpri1                       =  0x00D000B0,
+	HWPfDdrMpcPpri0                       =  0x00D000C0,
+	HWPfDdrMpcPrwgrpCtrl                  =  0x00D000D0,
+	HWPfDdrMpcPbw7                        =  0x00D000E0,
+	HWPfDdrMpcPbw6                        =  0x00D000F0,
+	HWPfDdrMpcPbw5                        =  0x00D00100,
+	HWPfDdrMpcPbw4                        =  0x00D00110,
+	HWPfDdrMpcPbw3                        =  0x00D00120,
+	HWPfDdrMpcPbw2                        =  0x00D00130,
+	HWPfDdrMpcPbw1                        =  0x00D00140,
+	HWPfDdrMpcPbw0                        =  0x00D00150,
+	HWPfDdrMemoryInit                     =  0x00D00200,
+	HWPfDdrMemoryInitDone                 =  0x00D00210,
+	HWPfDdrMemInitPhyTrng0                =  0x00D00240,
+	HWPfDdrMemInitPhyTrng1                =  0x00D00250,
+	HWPfDdrMemInitPhyTrng2                =  0x00D00260,
+	HWPfDdrMemInitPhyTrng3                =  0x00D00270,
+	HWPfDdrBcDram                         =  0x00D003C0,
+	HWPfDdrBcAddrMap                      =  0x00D003D0,
+	HWPfDdrBcRef                          =  0x00D003E0,
+	HWPfDdrBcTim0                         =  0x00D00400,
+	HWPfDdrBcTim1                         =  0x00D00410,
+	HWPfDdrBcTim2                         =  0x00D00420,
+	HWPfDdrBcTim3                         =  0x00D00430,
+	HWPfDdrBcTim4                         =  0x00D00440,
+	HWPfDdrBcTim5                         =  0x00D00450,
+	HWPfDdrBcTim6                         =  0x00D00460,
+	HWPfDdrBcTim7                         =  0x00D00470,
+	HWPfDdrBcTim8                         =  0x00D00480,
+	HWPfDdrBcTim9                         =  0x00D00490,
+	HWPfDdrBcTim10                        =  0x00D004A0,
+	HWPfDdrBcTim12                        =  0x00D004C0,
+	HWPfDdrDfiInit                        =  0x00D004D0,
+	HWPfDdrDfiInitComplete                =  0x00D004E0,
+	HWPfDdrDfiTim0                        =  0x00D004F0,
+	HWPfDdrDfiTim1                        =  0x00D00500,
+	HWPfDdrDfiPhyUpdEn                    =  0x00D00530,
+	HWPfDdrMemStatus                      =  0x00D00540,
+	HWPfDdrUmmcErrStatus                  =  0x00D00550,
+	HWPfDdrUmmcIntStatus                  =  0x00D00560,
+	HWPfDdrUmmcIntEn                      =  0x00D00570,
+	HWPfDdrPhyRdLatency                   =  0x00D48400,
+	HWPfDdrPhyRdLatencyDbi                =  0x00D48410,
+	HWPfDdrPhyWrLatency                   =  0x00D48420,
+	HWPfDdrPhyTrngType                    =  0x00D48430,
+	HWPfDdrPhyMrsTiming2                  =  0x00D48440,
+	HWPfDdrPhyMrsTiming0                  =  0x00D48450,
+	HWPfDdrPhyMrsTiming1                  =  0x00D48460,
+	HWPfDdrPhyDramTmrd                    =  0x00D48470,
+	HWPfDdrPhyDramTmod                    =  0x00D48480,
+	HWPfDdrPhyDramTwpre                   =  0x00D48490,
+	HWPfDdrPhyDramTrfc                    =  0x00D484A0,
+	HWPfDdrPhyDramTrwtp                   =  0x00D484B0,
+	HWPfDdrPhyMr01Dimm                    =  0x00D484C0,
+	HWPfDdrPhyMr01DimmDbi                 =  0x00D484D0,
+	HWPfDdrPhyMr23Dimm                    =  0x00D484E0,
+	HWPfDdrPhyMr45Dimm                    =  0x00D484F0,
+	HWPfDdrPhyMr67Dimm                    =  0x00D48500,
+	HWPfDdrPhyWrlvlWwRdlvlRr              =  0x00D48510,
+	HWPfDdrPhyOdtEn                       =  0x00D48520,
+	HWPfDdrPhyFastTrng                    =  0x00D48530,
+	HWPfDdrPhyDynTrngGap                  =  0x00D48540,
+	HWPfDdrPhyDynRcalGap                  =  0x00D48550,
+	HWPfDdrPhyIdletimeout                 =  0x00D48560,
+	HWPfDdrPhyRstCkeGap                   =  0x00D48570,
+	HWPfDdrPhyCkeMrsGap                   =  0x00D48580,
+	HWPfDdrPhyMemVrefMidVal               =  0x00D48590,
+	HWPfDdrPhyVrefStep                    =  0x00D485A0,
+	HWPfDdrPhyVrefThreshold               =  0x00D485B0,
+	HWPfDdrPhyPhyVrefMidVal               =  0x00D485C0,
+	HWPfDdrPhyDqsCountMax                 =  0x00D485D0,
+	HWPfDdrPhyDqsCountNum                 =  0x00D485E0,
+	HWPfDdrPhyDramRow                     =  0x00D485F0,
+	HWPfDdrPhyDramCol                     =  0x00D48600,
+	HWPfDdrPhyDramBgBa                    =  0x00D48610,
+	HWPfDdrPhyDynamicUpdreqrel            =  0x00D48620,
+	HWPfDdrPhyVrefLimits                  =  0x00D48630,
+	HWPfDdrPhyIdtmTcStatus                =  0x00D6C020,
+	HWPfDdrPhyIdtmFwVersion               =  0x00D6C410,
+	HWPfDdrPhyRdlvlGateInitDelay          =  0x00D70000,
+	HWPfDdrPhyRdenSmplabc                 =  0x00D70008,
+	HWPfDdrPhyVrefNibble0                 =  0x00D7000C,
+	HWPfDdrPhyVrefNibble1                 =  0x00D70010,
+	HWPfDdrPhyRdlvlGateDqsSmpl0           =  0x00D70014,
+	HWPfDdrPhyRdlvlGateDqsSmpl1           =  0x00D70018,
+	HWPfDdrPhyRdlvlGateDqsSmpl2           =  0x00D7001C,
+	HWPfDdrPhyDqsCount                    =  0x00D70020,
+	HWPfDdrPhyWrlvlRdlvlGateStatus        =  0x00D70024,
+	HWPfDdrPhyErrorFlags                  =  0x00D70028,
+	HWPfDdrPhyPowerDown                   =  0x00D70030,
+	HWPfDdrPhyPrbsSeedByte0               =  0x00D70034,
+	HWPfDdrPhyPrbsSeedByte1               =  0x00D70038,
+	HWPfDdrPhyPcompDq                     =  0x00D70040,
+	HWPfDdrPhyNcompDq                     =  0x00D70044,
+	HWPfDdrPhyPcompDqs                    =  0x00D70048,
+	HWPfDdrPhyNcompDqs                    =  0x00D7004C,
+	HWPfDdrPhyPcompCmd                    =  0x00D70050,
+	HWPfDdrPhyNcompCmd                    =  0x00D70054,
+	HWPfDdrPhyPcompCk                     =  0x00D70058,
+	HWPfDdrPhyNcompCk                     =  0x00D7005C,
+	HWPfDdrPhyRcalOdtDq                   =  0x00D70060,
+	HWPfDdrPhyRcalOdtDqs                  =  0x00D70064,
+	HWPfDdrPhyRcalMask1                   =  0x00D70068,
+	HWPfDdrPhyRcalMask2                   =  0x00D7006C,
+	HWPfDdrPhyRcalCtrl                    =  0x00D70070,
+	HWPfDdrPhyRcalCnt                     =  0x00D70074,
+	HWPfDdrPhyRcalOverride                =  0x00D70078,
+	HWPfDdrPhyRcalGateen                  =  0x00D7007C,
+	HWPfDdrPhyCtrl                        =  0x00D70080,
+	HWPfDdrPhyWrlvlAlg                    =  0x00D70084,
+	HWPfDdrPhyRcalVreftTxcmdOdt           =  0x00D70088,
+	HWPfDdrPhyRdlvlGateParam              =  0x00D7008C,
+	HWPfDdrPhyRdlvlGateParam2             =  0x00D70090,
+	HWPfDdrPhyRcalVreftTxdata             =  0x00D70094,
+	HWPfDdrPhyCmdIntDelay                 =  0x00D700A4,
+	HWPfDdrPhyAlertN                      =  0x00D700A8,
+	HWPfDdrPhyTrngReqWpre2tck             =  0x00D700AC,
+	HWPfDdrPhyCmdPhaseSel                 =  0x00D700B4,
+	HWPfDdrPhyCmdDcdl                     =  0x00D700B8,
+	HWPfDdrPhyCkDcdl                      =  0x00D700BC,
+	HWPfDdrPhySwTrngCtrl1                 =  0x00D700C0,
+	HWPfDdrPhySwTrngCtrl2                 =  0x00D700C4,
+	HWPfDdrPhyRcalPcompRden               =  0x00D700C8,
+	HWPfDdrPhyRcalNcompRden               =  0x00D700CC,
+	HWPfDdrPhyRcalCompen                  =  0x00D700D0,
+	HWPfDdrPhySwTrngRdqs                  =  0x00D700D4,
+	HWPfDdrPhySwTrngWdqs                  =  0x00D700D8,
+	HWPfDdrPhySwTrngRdena                 =  0x00D700DC,
+	HWPfDdrPhySwTrngRdenb                 =  0x00D700E0,
+	HWPfDdrPhySwTrngRdenc                 =  0x00D700E4,
+	HWPfDdrPhySwTrngWdq                   =  0x00D700E8,
+	HWPfDdrPhySwTrngRdq                   =  0x00D700EC,
+	HWPfDdrPhyPcfgHmValue                 =  0x00D700F0,
+	HWPfDdrPhyPcfgTimerValue              =  0x00D700F4,
+	HWPfDdrPhyPcfgSoftwareTraining        =  0x00D700F8,
+	HWPfDdrPhyPcfgMcStatus                =  0x00D700FC,
+	HWPfDdrPhyWrlvlPhRank0                =  0x00D70100,
+	HWPfDdrPhyRdenPhRank0                 =  0x00D70104,
+	HWPfDdrPhyRdenIntRank0                =  0x00D70108,
+	HWPfDdrPhyRdqsDcdlRank0               =  0x00D7010C,
+	HWPfDdrPhyRdqsShadowDcdlRank0         =  0x00D70110,
+	HWPfDdrPhyWdqsDcdlRank0               =  0x00D70114,
+	HWPfDdrPhyWdmDcdlShadowRank0          =  0x00D70118,
+	HWPfDdrPhyWdmDcdlRank0                =  0x00D7011C,
+	HWPfDdrPhyDbiDcdlRank0                =  0x00D70120,
+	HWPfDdrPhyRdenDcdlaRank0              =  0x00D70124,
+	HWPfDdrPhyDbiDcdlShadowRank0          =  0x00D70128,
+	HWPfDdrPhyRdenDcdlbRank0              =  0x00D7012C,
+	HWPfDdrPhyWdqsShadowDcdlRank0         =  0x00D70130,
+	HWPfDdrPhyRdenDcdlcRank0              =  0x00D70134,
+	HWPfDdrPhyRdenShadowDcdlaRank0        =  0x00D70138,
+	HWPfDdrPhyWrlvlIntRank0               =  0x00D7013C,
+	HWPfDdrPhyRdqDcdlBit0Rank0            =  0x00D70200,
+	HWPfDdrPhyRdqDcdlShadowBit0Rank0      =  0x00D70204,
+	HWPfDdrPhyWdqDcdlBit0Rank0            =  0x00D70208,
+	HWPfDdrPhyWdqDcdlShadowBit0Rank0      =  0x00D7020C,
+	HWPfDdrPhyRdqDcdlBit1Rank0            =  0x00D70240,
+	HWPfDdrPhyRdqDcdlShadowBit1Rank0      =  0x00D70244,
+	HWPfDdrPhyWdqDcdlBit1Rank0            =  0x00D70248,
+	HWPfDdrPhyWdqDcdlShadowBit1Rank0      =  0x00D7024C,
+	HWPfDdrPhyRdqDcdlBit2Rank0            =  0x00D70280,
+	HWPfDdrPhyRdqDcdlShadowBit2Rank0      =  0x00D70284,
+	HWPfDdrPhyWdqDcdlBit2Rank0            =  0x00D70288,
+	HWPfDdrPhyWdqDcdlShadowBit2Rank0      =  0x00D7028C,
+	HWPfDdrPhyRdqDcdlBit3Rank0            =  0x00D702C0,
+	HWPfDdrPhyRdqDcdlShadowBit3Rank0      =  0x00D702C4,
+	HWPfDdrPhyWdqDcdlBit3Rank0            =  0x00D702C8,
+	HWPfDdrPhyWdqDcdlShadowBit3Rank0      =  0x00D702CC,
+	HWPfDdrPhyRdqDcdlBit4Rank0            =  0x00D70300,
+	HWPfDdrPhyRdqDcdlShadowBit4Rank0      =  0x00D70304,
+	HWPfDdrPhyWdqDcdlBit4Rank0            =  0x00D70308,
+	HWPfDdrPhyWdqDcdlShadowBit4Rank0      =  0x00D7030C,
+	HWPfDdrPhyRdqDcdlBit5Rank0            =  0x00D70340,
+	HWPfDdrPhyRdqDcdlShadowBit5Rank0      =  0x00D70344,
+	HWPfDdrPhyWdqDcdlBit5Rank0            =  0x00D70348,
+	HWPfDdrPhyWdqDcdlShadowBit5Rank0      =  0x00D7034C,
+	HWPfDdrPhyRdqDcdlBit6Rank0            =  0x00D70380,
+	HWPfDdrPhyRdqDcdlShadowBit6Rank0      =  0x00D70384,
+	HWPfDdrPhyWdqDcdlBit6Rank0            =  0x00D70388,
+	HWPfDdrPhyWdqDcdlShadowBit6Rank0      =  0x00D7038C,
+	HWPfDdrPhyRdqDcdlBit7Rank0            =  0x00D703C0,
+	HWPfDdrPhyRdqDcdlShadowBit7Rank0      =  0x00D703C4,
+	HWPfDdrPhyWdqDcdlBit7Rank0            =  0x00D703C8,
+	HWPfDdrPhyWdqDcdlShadowBit7Rank0      =  0x00D703CC,
+	HWPfDdrPhyIdtmStatus                  =  0x00D740D0,
+	HWPfDdrPhyIdtmError                   =  0x00D74110,
+	HWPfDdrPhyIdtmDebug                   =  0x00D74120,
+	HWPfDdrPhyIdtmDebugInt                =  0x00D74130,
+	HwPfPcieLnAsicCfgovr                  =  0x00D80000,
+	HwPfPcieLnAclkmixer                   =  0x00D80004,
+	HwPfPcieLnTxrampfreq                  =  0x00D80008,
+	HwPfPcieLnLanetest                    =  0x00D8000C,
+	HwPfPcieLnDcctrl                      =  0x00D80010,
+	HwPfPcieLnDccmeas                     =  0x00D80014,
+	HwPfPcieLnDccovrAclk                  =  0x00D80018,
+	HwPfPcieLnDccovrTxa                   =  0x00D8001C,
+	HwPfPcieLnDccovrTxk                   =  0x00D80020,
+	HwPfPcieLnDccovrDclk                  =  0x00D80024,
+	HwPfPcieLnDccovrEclk                  =  0x00D80028,
+	HwPfPcieLnDcctrimAclk                 =  0x00D8002C,
+	HwPfPcieLnDcctrimTx                   =  0x00D80030,
+	HwPfPcieLnDcctrimDclk                 =  0x00D80034,
+	HwPfPcieLnDcctrimEclk                 =  0x00D80038,
+	HwPfPcieLnQuadCtrl                    =  0x00D8003C,
+	HwPfPcieLnQuadCorrIndex               =  0x00D80040,
+	HwPfPcieLnQuadCorrStatus              =  0x00D80044,
+	HwPfPcieLnAsicRxovr1                  =  0x00D80048,
+	HwPfPcieLnAsicRxovr2                  =  0x00D8004C,
+	HwPfPcieLnAsicEqinfovr                =  0x00D80050,
+	HwPfPcieLnRxcsr                       =  0x00D80054,
+	HwPfPcieLnRxfectrl                    =  0x00D80058,
+	HwPfPcieLnRxtest                      =  0x00D8005C,
+	HwPfPcieLnEscount                     =  0x00D80060,
+	HwPfPcieLnCdrctrl                     =  0x00D80064,
+	HwPfPcieLnCdrctrl2                    =  0x00D80068,
+	HwPfPcieLnCdrcfg0Ctrl0                =  0x00D8006C,
+	HwPfPcieLnCdrcfg0Ctrl1                =  0x00D80070,
+	HwPfPcieLnCdrcfg0Ctrl2                =  0x00D80074,
+	HwPfPcieLnCdrcfg1Ctrl0                =  0x00D80078,
+	HwPfPcieLnCdrcfg1Ctrl1                =  0x00D8007C,
+	HwPfPcieLnCdrcfg1Ctrl2                =  0x00D80080,
+	HwPfPcieLnCdrcfg2Ctrl0                =  0x00D80084,
+	HwPfPcieLnCdrcfg2Ctrl1                =  0x00D80088,
+	HwPfPcieLnCdrcfg2Ctrl2                =  0x00D8008C,
+	HwPfPcieLnCdrcfg3Ctrl0                =  0x00D80090,
+	HwPfPcieLnCdrcfg3Ctrl1                =  0x00D80094,
+	HwPfPcieLnCdrcfg3Ctrl2                =  0x00D80098,
+	HwPfPcieLnCdrphase                    =  0x00D8009C,
+	HwPfPcieLnCdrfreq                     =  0x00D800A0,
+	HwPfPcieLnCdrstatusPhase              =  0x00D800A4,
+	HwPfPcieLnCdrstatusFreq               =  0x00D800A8,
+	HwPfPcieLnCdroffset                   =  0x00D800AC,
+	HwPfPcieLnRxvosctl                    =  0x00D800B0,
+	HwPfPcieLnRxvosctl2                   =  0x00D800B4,
+	HwPfPcieLnRxlosctl                    =  0x00D800B8,
+	HwPfPcieLnRxlos                       =  0x00D800BC,
+	HwPfPcieLnRxlosvval                   =  0x00D800C0,
+	HwPfPcieLnRxvosd0                     =  0x00D800C4,
+	HwPfPcieLnRxvosd1                     =  0x00D800C8,
+	HwPfPcieLnRxvosep0                    =  0x00D800CC,
+	HwPfPcieLnRxvosep1                    =  0x00D800D0,
+	HwPfPcieLnRxvosen0                    =  0x00D800D4,
+	HwPfPcieLnRxvosen1                    =  0x00D800D8,
+	HwPfPcieLnRxvosafe                    =  0x00D800DC,
+	HwPfPcieLnRxvosa0                     =  0x00D800E0,
+	HwPfPcieLnRxvosa0Out                  =  0x00D800E4,
+	HwPfPcieLnRxvosa1                     =  0x00D800E8,
+	HwPfPcieLnRxvosa1Out                  =  0x00D800EC,
+	HwPfPcieLnRxmisc                      =  0x00D800F0,
+	HwPfPcieLnRxbeacon                    =  0x00D800F4,
+	HwPfPcieLnRxdssout                    =  0x00D800F8,
+	HwPfPcieLnRxdssout2                   =  0x00D800FC,
+	HwPfPcieLnAlphapctrl                  =  0x00D80100,
+	HwPfPcieLnAlphanctrl                  =  0x00D80104,
+	HwPfPcieLnAdaptctrl                   =  0x00D80108,
+	HwPfPcieLnAdaptctrl1                  =  0x00D8010C,
+	HwPfPcieLnAdaptstatus                 =  0x00D80110,
+	HwPfPcieLnAdaptvga1                   =  0x00D80114,
+	HwPfPcieLnAdaptvga2                   =  0x00D80118,
+	HwPfPcieLnAdaptvga3                   =  0x00D8011C,
+	HwPfPcieLnAdaptvga4                   =  0x00D80120,
+	HwPfPcieLnAdaptboost1                 =  0x00D80124,
+	HwPfPcieLnAdaptboost2                 =  0x00D80128,
+	HwPfPcieLnAdaptboost3                 =  0x00D8012C,
+	HwPfPcieLnAdaptboost4                 =  0x00D80130,
+	HwPfPcieLnAdaptsslms1                 =  0x00D80134,
+	HwPfPcieLnAdaptsslms2                 =  0x00D80138,
+	HwPfPcieLnAdaptvgaStatus              =  0x00D8013C,
+	HwPfPcieLnAdaptboostStatus            =  0x00D80140,
+	HwPfPcieLnAdaptsslmsStatus1           =  0x00D80144,
+	HwPfPcieLnAdaptsslmsStatus2           =  0x00D80148,
+	HwPfPcieLnAfectrl1                    =  0x00D8014C,
+	HwPfPcieLnAfectrl2                    =  0x00D80150,
+	HwPfPcieLnAfectrl3                    =  0x00D80154,
+	HwPfPcieLnAfedefault1                 =  0x00D80158,
+	HwPfPcieLnAfedefault2                 =  0x00D8015C,
+	HwPfPcieLnDfectrl1                    =  0x00D80160,
+	HwPfPcieLnDfectrl2                    =  0x00D80164,
+	HwPfPcieLnDfectrl3                    =  0x00D80168,
+	HwPfPcieLnDfectrl4                    =  0x00D8016C,
+	HwPfPcieLnDfectrl5                    =  0x00D80170,
+	HwPfPcieLnDfectrl6                    =  0x00D80174,
+	HwPfPcieLnAfestatus1                  =  0x00D80178,
+	HwPfPcieLnAfestatus2                  =  0x00D8017C,
+	HwPfPcieLnDfestatus1                  =  0x00D80180,
+	HwPfPcieLnDfestatus2                  =  0x00D80184,
+	HwPfPcieLnDfestatus3                  =  0x00D80188,
+	HwPfPcieLnDfestatus4                  =  0x00D8018C,
+	HwPfPcieLnDfestatus5                  =  0x00D80190,
+	HwPfPcieLnAlphastatus                 =  0x00D80194,
+	HwPfPcieLnFomctrl1                    =  0x00D80198,
+	HwPfPcieLnFomctrl2                    =  0x00D8019C,
+	HwPfPcieLnFomctrl3                    =  0x00D801A0,
+	HwPfPcieLnAclkcalStatus               =  0x00D801A4,
+	HwPfPcieLnOffscorrStatus              =  0x00D801A8,
+	HwPfPcieLnEyewidthStatus              =  0x00D801AC,
+	HwPfPcieLnEyeheightStatus             =  0x00D801B0,
+	HwPfPcieLnAsicTxovr1                  =  0x00D801B4,
+	HwPfPcieLnAsicTxovr2                  =  0x00D801B8,
+	HwPfPcieLnAsicTxovr3                  =  0x00D801BC,
+	HwPfPcieLnTxbiasadjOvr                =  0x00D801C0,
+	HwPfPcieLnTxcsr                       =  0x00D801C4,
+	HwPfPcieLnTxtest                      =  0x00D801C8,
+	HwPfPcieLnTxtestword                  =  0x00D801CC,
+	HwPfPcieLnTxtestwordHigh              =  0x00D801D0,
+	HwPfPcieLnTxdrive                     =  0x00D801D4,
+	HwPfPcieLnMtcsLn                      =  0x00D801D8,
+	HwPfPcieLnStatsumLn                   =  0x00D801DC,
+	HwPfPcieLnRcbusScratch                =  0x00D801E0,
+	HwPfPcieLnRcbusMinorrev               =  0x00D801F0,
+	HwPfPcieLnRcbusMajorrev               =  0x00D801F4,
+	HwPfPcieLnRcbusBlocktype              =  0x00D801F8,
+	HwPfPcieSupPllcsr                     =  0x00D80800,
+	HwPfPcieSupPlldiv                     =  0x00D80804,
+	HwPfPcieSupPllcal                     =  0x00D80808,
+	HwPfPcieSupPllcalsts                  =  0x00D8080C,
+	HwPfPcieSupPllmeas                    =  0x00D80810,
+	HwPfPcieSupPlldactrim                 =  0x00D80814,
+	HwPfPcieSupPllbiastrim                =  0x00D80818,
+	HwPfPcieSupPllbwtrim                  =  0x00D8081C,
+	HwPfPcieSupPllcaldly                  =  0x00D80820,
+	HwPfPcieSupRefclkonpclkctrl           =  0x00D80824,
+	HwPfPcieSupPclkdelay                  =  0x00D80828,
+	HwPfPcieSupPhyconfig                  =  0x00D8082C,
+	HwPfPcieSupRcalIntf                   =  0x00D80830,
+	HwPfPcieSupAuxcsr                     =  0x00D80834,
+	HwPfPcieSupVref                       =  0x00D80838,
+	HwPfPcieSupLinkmode                   =  0x00D8083C,
+	HwPfPcieSupRrefcalctl                 =  0x00D80840,
+	HwPfPcieSupRrefcal                    =  0x00D80844,
+	HwPfPcieSupRrefcaldly                 =  0x00D80848,
+	HwPfPcieSupTximpcalctl                =  0x00D8084C,
+	HwPfPcieSupTximpcal                   =  0x00D80850,
+	HwPfPcieSupTximpoffset                =  0x00D80854,
+	HwPfPcieSupTximpcaldly                =  0x00D80858,
+	HwPfPcieSupRximpcalctl                =  0x00D8085C,
+	HwPfPcieSupRximpcal                   =  0x00D80860,
+	HwPfPcieSupRximpoffset                =  0x00D80864,
+	HwPfPcieSupRximpcaldly                =  0x00D80868,
+	HwPfPcieSupFence                      =  0x00D8086C,
+	HwPfPcieSupMtcs                       =  0x00D80870,
+	HwPfPcieSupStatsum                    =  0x00D809B8,
+	HwPfPciePcsDpStatus0                  =  0x00D81000,
+	HwPfPciePcsDpControl0                 =  0x00D81004,
+	HwPfPciePcsPmaStatusLane0             =  0x00D81008,
+	HwPfPciePcsPipeStatusLane0            =  0x00D8100C,
+	HwPfPciePcsTxdeemph0Lane0             =  0x00D81010,
+	HwPfPciePcsTxdeemph1Lane0             =  0x00D81014,
+	HwPfPciePcsInternalStatusLane0        =  0x00D81018,
+	HwPfPciePcsDpStatus1                  =  0x00D8101C,
+	HwPfPciePcsDpControl1                 =  0x00D81020,
+	HwPfPciePcsPmaStatusLane1             =  0x00D81024,
+	HwPfPciePcsPipeStatusLane1            =  0x00D81028,
+	HwPfPciePcsTxdeemph0Lane1             =  0x00D8102C,
+	HwPfPciePcsTxdeemph1Lane1             =  0x00D81030,
+	HwPfPciePcsInternalStatusLane1        =  0x00D81034,
+	HwPfPciePcsDpStatus2                  =  0x00D81038,
+	HwPfPciePcsDpControl2                 =  0x00D8103C,
+	HwPfPciePcsPmaStatusLane2             =  0x00D81040,
+	HwPfPciePcsPipeStatusLane2            =  0x00D81044,
+	HwPfPciePcsTxdeemph0Lane2             =  0x00D81048,
+	HwPfPciePcsTxdeemph1Lane2             =  0x00D8104C,
+	HwPfPciePcsInternalStatusLane2        =  0x00D81050,
+	HwPfPciePcsDpStatus3                  =  0x00D81054,
+	HwPfPciePcsDpControl3                 =  0x00D81058,
+	HwPfPciePcsPmaStatusLane3             =  0x00D8105C,
+	HwPfPciePcsPipeStatusLane3            =  0x00D81060,
+	HwPfPciePcsTxdeemph0Lane3             =  0x00D81064,
+	HwPfPciePcsTxdeemph1Lane3             =  0x00D81068,
+	HwPfPciePcsInternalStatusLane3        =  0x00D8106C,
+	HwPfPciePcsEbStatus0                  =  0x00D81070,
+	HwPfPciePcsEbStatus1                  =  0x00D81074,
+	HwPfPciePcsEbStatus2                  =  0x00D81078,
+	HwPfPciePcsEbStatus3                  =  0x00D8107C,
+	HwPfPciePcsPllSettingPcieG1           =  0x00D81088,
+	HwPfPciePcsPllSettingPcieG2           =  0x00D8108C,
+	HwPfPciePcsPllSettingPcieG3           =  0x00D81090,
+	HwPfPciePcsControl                    =  0x00D81094,
+	HwPfPciePcsEqControl                  =  0x00D81098,
+	HwPfPciePcsEqTimer                    =  0x00D8109C,
+	HwPfPciePcsEqErrStatus                =  0x00D810A0,
+	HwPfPciePcsEqErrCount                 =  0x00D810A4,
+	HwPfPciePcsStatus                     =  0x00D810A8,
+	HwPfPciePcsMiscRegister               =  0x00D810AC,
+	HwPfPciePcsObsControl                 =  0x00D810B0,
+	HwPfPciePcsPrbsCount0                 =  0x00D81200,
+	HwPfPciePcsBistControl0               =  0x00D81204,
+	HwPfPciePcsBistStaticWord00           =  0x00D81208,
+	HwPfPciePcsBistStaticWord10           =  0x00D8120C,
+	HwPfPciePcsBistStaticWord20           =  0x00D81210,
+	HwPfPciePcsBistStaticWord30           =  0x00D81214,
+	HwPfPciePcsPrbsCount1                 =  0x00D81220,
+	HwPfPciePcsBistControl1               =  0x00D81224,
+	HwPfPciePcsBistStaticWord01           =  0x00D81228,
+	HwPfPciePcsBistStaticWord11           =  0x00D8122C,
+	HwPfPciePcsBistStaticWord21           =  0x00D81230,
+	HwPfPciePcsBistStaticWord31           =  0x00D81234,
+	HwPfPciePcsPrbsCount2                 =  0x00D81240,
+	HwPfPciePcsBistControl2               =  0x00D81244,
+	HwPfPciePcsBistStaticWord02           =  0x00D81248,
+	HwPfPciePcsBistStaticWord12           =  0x00D8124C,
+	HwPfPciePcsBistStaticWord22           =  0x00D81250,
+	HwPfPciePcsBistStaticWord32           =  0x00D81254,
+	HwPfPciePcsPrbsCount3                 =  0x00D81260,
+	HwPfPciePcsBistControl3               =  0x00D81264,
+	HwPfPciePcsBistStaticWord03           =  0x00D81268,
+	HwPfPciePcsBistStaticWord13           =  0x00D8126C,
+	HwPfPciePcsBistStaticWord23           =  0x00D81270,
+	HwPfPciePcsBistStaticWord33           =  0x00D81274,
+	HwPfPcieGpexLtssmStateCntrl           =  0x00D90400,
+	HwPfPcieGpexLtssmStateStatus          =  0x00D90404,
+	HwPfPcieGpexSkipFreqTimer             =  0x00D90408,
+	HwPfPcieGpexLaneSelect                =  0x00D9040C,
+	HwPfPcieGpexLaneDeskew                =  0x00D90410,
+	HwPfPcieGpexRxErrorStatus             =  0x00D90414,
+	HwPfPcieGpexLaneNumControl            =  0x00D90418,
+	HwPfPcieGpexNFstControl               =  0x00D9041C,
+	HwPfPcieGpexLinkStatus                =  0x00D90420,
+	HwPfPcieGpexAckReplayTimeout          =  0x00D90438,
+	HwPfPcieGpexSeqNumberStatus           =  0x00D9043C,
+	HwPfPcieGpexCoreClkRatio              =  0x00D90440,
+	HwPfPcieGpexDllTholdControl           =  0x00D90448,
+	HwPfPcieGpexPmTimer                   =  0x00D90450,
+	HwPfPcieGpexPmeTimeout                =  0x00D90454,
+	HwPfPcieGpexAspmL1Timer               =  0x00D90458,
+	HwPfPcieGpexAspmReqTimer              =  0x00D9045C,
+	HwPfPcieGpexAspmL1Dis                 =  0x00D90460,
+	HwPfPcieGpexAdvisoryErrorControl      =  0x00D90468,
+	HwPfPcieGpexId                        =  0x00D90470,
+	HwPfPcieGpexClasscode                 =  0x00D90474,
+	HwPfPcieGpexSubsystemId               =  0x00D90478,
+	HwPfPcieGpexDeviceCapabilities        =  0x00D9047C,
+	HwPfPcieGpexLinkCapabilities          =  0x00D90480,
+	HwPfPcieGpexFunctionNumber            =  0x00D90484,
+	HwPfPcieGpexPmCapabilities            =  0x00D90488,
+	HwPfPcieGpexFunctionSelect            =  0x00D9048C,
+	HwPfPcieGpexErrorCounter              =  0x00D904AC,
+	HwPfPcieGpexConfigReady               =  0x00D904B0,
+	HwPfPcieGpexFcUpdateTimeout           =  0x00D904B8,
+	HwPfPcieGpexFcUpdateTimer             =  0x00D904BC,
+	HwPfPcieGpexVcBufferLoad              =  0x00D904C8,
+	HwPfPcieGpexVcBufferSizeThold         =  0x00D904CC,
+	HwPfPcieGpexVcBufferSelect            =  0x00D904D0,
+	HwPfPcieGpexBarEnable                 =  0x00D904D4,
+	HwPfPcieGpexBarDwordLower             =  0x00D904D8,
+	HwPfPcieGpexBarDwordUpper             =  0x00D904DC,
+	HwPfPcieGpexBarSelect                 =  0x00D904E0,
+	HwPfPcieGpexCreditCounterSelect       =  0x00D904E4,
+	HwPfPcieGpexCreditCounterStatus       =  0x00D904E8,
+	HwPfPcieGpexTlpHeaderSelect           =  0x00D904EC,
+	HwPfPcieGpexTlpHeaderDword0           =  0x00D904F0,
+	HwPfPcieGpexTlpHeaderDword1           =  0x00D904F4,
+	HwPfPcieGpexTlpHeaderDword2           =  0x00D904F8,
+	HwPfPcieGpexTlpHeaderDword3           =  0x00D904FC,
+	HwPfPcieGpexRelaxOrderControl         =  0x00D90500,
+	HwPfPcieGpexBarPrefetch               =  0x00D90504,
+	HwPfPcieGpexFcCheckControl            =  0x00D90508,
+	HwPfPcieGpexFcUpdateTimerTraffic      =  0x00D90518,
+	HwPfPcieGpexPhyControl0               =  0x00D9053C,
+	HwPfPcieGpexPhyControl1               =  0x00D90544,
+	HwPfPcieGpexPhyControl2               =  0x00D9054C,
+	HwPfPcieGpexUserControl0              =  0x00D9055C,
+	HwPfPcieGpexUncorrErrorStatus         =  0x00D905F0,
+	HwPfPcieGpexRxCplError                =  0x00D90620,
+	HwPfPcieGpexRxCplErrorDword0          =  0x00D90624,
+	HwPfPcieGpexRxCplErrorDword1          =  0x00D90628,
+	HwPfPcieGpexRxCplErrorDword2          =  0x00D9062C,
+	HwPfPcieGpexPabSwResetEn              =  0x00D90630,
+	HwPfPcieGpexGen3Control0              =  0x00D90634,
+	HwPfPcieGpexGen3Control1              =  0x00D90638,
+	HwPfPcieGpexGen3Control2              =  0x00D9063C,
+	HwPfPcieGpexGen2ControlCsr            =  0x00D90640,
+	HwPfPcieGpexTotalVfInitialVf0         =  0x00D90644,
+	HwPfPcieGpexTotalVfInitialVf1         =  0x00D90648,
+	HwPfPcieGpexSriovLinkDevId0           =  0x00D90684,
+	HwPfPcieGpexSriovLinkDevId1           =  0x00D90688,
+	HwPfPcieGpexSriovPageSize0            =  0x00D906C4,
+	HwPfPcieGpexSriovPageSize1            =  0x00D906C8,
+	HwPfPcieGpexIdVersion                 =  0x00D906FC,
+	HwPfPcieGpexSriovVfOffsetStride0      =  0x00D90704,
+	HwPfPcieGpexSriovVfOffsetStride1      =  0x00D90708,
+	HwPfPcieGpexGen3DeskewControl         =  0x00D907B4,
+	HwPfPcieGpexGen3EqControl             =  0x00D907B8,
+	HwPfPcieGpexBridgeVersion             =  0x00D90800,
+	HwPfPcieGpexBridgeCapability          =  0x00D90804,
+	HwPfPcieGpexBridgeControl             =  0x00D90808,
+	HwPfPcieGpexBridgeStatus              =  0x00D9080C,
+	HwPfPcieGpexEngineActivityStatus      =  0x00D9081C,
+	HwPfPcieGpexEngineResetControl        =  0x00D90820,
+	HwPfPcieGpexAxiPioControl             =  0x00D90840,
+	HwPfPcieGpexAxiPioStatus              =  0x00D90844,
+	HwPfPcieGpexAmbaSlaveCmdStatus        =  0x00D90848,
+	HwPfPcieGpexPexPioControl             =  0x00D908C0,
+	HwPfPcieGpexPexPioStatus              =  0x00D908C4,
+	HwPfPcieGpexAmbaMasterStatus          =  0x00D908C8,
+	HwPfPcieGpexCsrSlaveCmdStatus         =  0x00D90920,
+	HwPfPcieGpexMailboxAxiControl         =  0x00D90A50,
+	HwPfPcieGpexMailboxAxiData            =  0x00D90A54,
+	HwPfPcieGpexMailboxPexControl         =  0x00D90A90,
+	HwPfPcieGpexMailboxPexData            =  0x00D90A94,
+	HwPfPcieGpexPexInterruptEnable        =  0x00D90AD0,
+	HwPfPcieGpexPexInterruptStatus        =  0x00D90AD4,
+	HwPfPcieGpexPexInterruptAxiPioVector  =  0x00D90AD8,
+	HwPfPcieGpexPexInterruptPexPioVector  =  0x00D90AE0,
+	HwPfPcieGpexPexInterruptMiscVector    =  0x00D90AF8,
+	HwPfPcieGpexAmbaInterruptPioEnable    =  0x00D90B00,
+	HwPfPcieGpexAmbaInterruptMiscEnable   =  0x00D90B0C,
+	HwPfPcieGpexAmbaInterruptPioStatus    =  0x00D90B10,
+	HwPfPcieGpexAmbaInterruptMiscStatus   =  0x00D90B1C,
+	HwPfPcieGpexPexPmControl              =  0x00D90B80,
+	HwPfPcieGpexSlotMisc                  =  0x00D90B88,
+	HwPfPcieGpexAxiAddrMappingControl     =  0x00D90BA0,
+	HwPfPcieGpexAxiAddrMappingWindowAxiBase     =  0x00D90BA4,
+	HwPfPcieGpexAxiAddrMappingWindowPexBaseLow  =  0x00D90BA8,
+	HwPfPcieGpexAxiAddrMappingWindowPexBaseHigh =  0x00D90BAC,
+	HwPfPcieGpexPexBarAddrFunc0Bar0       =  0x00D91BA0,
+	HwPfPcieGpexPexBarAddrFunc0Bar1       =  0x00D91BA4,
+	HwPfPcieGpexAxiAddrMappingPcieHdrParam =  0x00D95BA0,
+	HwPfPcieGpexExtAxiAddrMappingAxiBase  =  0x00D980A0,
+	HwPfPcieGpexPexExtBarAddrFunc0Bar0    =  0x00D984A0,
+	HwPfPcieGpexPexExtBarAddrFunc0Bar1    =  0x00D984A4,
+	HwPfPcieGpexAmbaInterruptFlrEnable    =  0x00D9B960,
+	HwPfPcieGpexAmbaInterruptFlrStatus    =  0x00D9B9A0,
+	HwPfPcieGpexExtAxiAddrMappingSize     =  0x00D9BAF0,
+	HwPfPcieGpexPexPioAwcacheControl      =  0x00D9C300,
+	HwPfPcieGpexPexPioArcacheControl      =  0x00D9C304,
+	HwPfPcieGpexPabObSizeControlVc0       =  0x00D9C310
+};
+
+/* TIP PF Interrupt numbers */
+enum {
+	ACC100_PF_INT_QMGR_AQ_OVERFLOW = 0,
+	ACC100_PF_INT_DOORBELL_VF_2_PF = 1,
+	ACC100_PF_INT_DMA_DL_DESC_IRQ = 2,
+	ACC100_PF_INT_DMA_UL_DESC_IRQ = 3,
+	ACC100_PF_INT_DMA_MLD_DESC_IRQ = 4,
+	ACC100_PF_INT_DMA_UL5G_DESC_IRQ = 5,
+	ACC100_PF_INT_DMA_DL5G_DESC_IRQ = 6,
+	ACC100_PF_INT_ILLEGAL_FORMAT = 7,
+	ACC100_PF_INT_QMGR_DISABLED_ACCESS = 8,
+	ACC100_PF_INT_QMGR_AQ_OVERTHRESHOLD = 9,
+	ACC100_PF_INT_ARAM_ACCESS_ERR = 10,
+	ACC100_PF_INT_ARAM_ECC_1BIT_ERR = 11,
+	ACC100_PF_INT_PARITY_ERR = 12,
+	ACC100_PF_INT_QMGR_ERR = 13,
+	ACC100_PF_INT_INT_REQ_OVERFLOW = 14,
+	ACC100_PF_INT_APB_TIMEOUT = 15,
+};
+
+#endif /* ACC100_PF_ENUM_H */
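[Each value in the enum above is a byte offset into the PF BAR0 region, so a register access reduces to a 32-bit read or write at base + offset. A minimal sketch of the read side, mirroring the acc100_reg_read() helper added later in this series (the helper name and BAR0 argument here are illustrative only); the rte_le_to_cpu_32() conversion reflects the little-endian register layout:

#include <stdint.h>
#include <rte_byteorder.h>

/* Sketch only: read a 32-bit PF register given the BAR0 mapping */
static inline uint32_t
pf_reg_read(void *bar0, uint32_t offset)
{
	volatile uint32_t *reg =
		(volatile uint32_t *)((uint8_t *)bar0 + offset);
	return rte_le_to_cpu_32(*reg);
}

/* e.g. uint32_t mode = pf_reg_read(bar0, HWPfHiPfMode); */
]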
diff --git a/drivers/baseband/acc100/acc100_vf_enum.h b/drivers/baseband/acc100/acc100_vf_enum.h
new file mode 100644
index 0000000..b512af3
--- /dev/null
+++ b/drivers/baseband/acc100/acc100_vf_enum.h
@@ -0,0 +1,73 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017 Intel Corporation
+ */
+
+#ifndef ACC100_VF_ENUM_H
+#define ACC100_VF_ENUM_H
+
+/*
+ * ACC100 Register mapping on VF BAR0
+ * This is automatically generated from RDL, format may change with new RDL
+ */
+enum {
+	HWVfQmgrIngressAq             =  0x00000000,
+	HWVfHiVfToPfDbellVf           =  0x00000800,
+	HWVfHiPfToVfDbellVf           =  0x00000808,
+	HWVfHiInfoRingBaseLoVf        =  0x00000810,
+	HWVfHiInfoRingBaseHiVf        =  0x00000814,
+	HWVfHiInfoRingPointerVf       =  0x00000818,
+	HWVfHiInfoRingIntWrEnVf       =  0x00000820,
+	HWVfHiInfoRingPf2VfWrEnVf     =  0x00000824,
+	HWVfHiMsixVectorMapperVf      =  0x00000860,
+	HWVfDmaFec5GulDescBaseLoRegVf =  0x00000920,
+	HWVfDmaFec5GulDescBaseHiRegVf =  0x00000924,
+	HWVfDmaFec5GulRespPtrLoRegVf  =  0x00000928,
+	HWVfDmaFec5GulRespPtrHiRegVf  =  0x0000092C,
+	HWVfDmaFec5GdlDescBaseLoRegVf =  0x00000940,
+	HWVfDmaFec5GdlDescBaseHiRegVf =  0x00000944,
+	HWVfDmaFec5GdlRespPtrLoRegVf  =  0x00000948,
+	HWVfDmaFec5GdlRespPtrHiRegVf  =  0x0000094C,
+	HWVfDmaFec4GulDescBaseLoRegVf =  0x00000960,
+	HWVfDmaFec4GulDescBaseHiRegVf =  0x00000964,
+	HWVfDmaFec4GulRespPtrLoRegVf  =  0x00000968,
+	HWVfDmaFec4GulRespPtrHiRegVf  =  0x0000096C,
+	HWVfDmaFec4GdlDescBaseLoRegVf =  0x00000980,
+	HWVfDmaFec4GdlDescBaseHiRegVf =  0x00000984,
+	HWVfDmaFec4GdlRespPtrLoRegVf  =  0x00000988,
+	HWVfDmaFec4GdlRespPtrHiRegVf  =  0x0000098C,
+	HWVfDmaDdrBaseRangeRoVf       =  0x000009A0,
+	HWVfQmgrAqResetVf             =  0x00000E00,
+	HWVfQmgrRingSizeVf            =  0x00000E04,
+	HWVfQmgrGrpDepthLog20Vf       =  0x00000E08,
+	HWVfQmgrGrpDepthLog21Vf       =  0x00000E0C,
+	HWVfQmgrGrpFunction0Vf        =  0x00000E10,
+	HWVfQmgrGrpFunction1Vf        =  0x00000E14,
+	HWVfPmACntrlRegVf             =  0x00000F40,
+	HWVfPmACountVf                =  0x00000F48,
+	HWVfPmAKCntLoVf               =  0x00000F50,
+	HWVfPmAKCntHiVf               =  0x00000F54,
+	HWVfPmADeltaCntLoVf           =  0x00000F60,
+	HWVfPmADeltaCntHiVf           =  0x00000F64,
+	HWVfPmBCntrlRegVf             =  0x00000F80,
+	HWVfPmBCountVf                =  0x00000F88,
+	HWVfPmBKCntLoVf               =  0x00000F90,
+	HWVfPmBKCntHiVf               =  0x00000F94,
+	HWVfPmBDeltaCntLoVf           =  0x00000FA0,
+	HWVfPmBDeltaCntHiVf           =  0x00000FA4
+};
+
+/* TIP VF Interrupt numbers */
+enum {
+	ACC100_VF_INT_QMGR_AQ_OVERFLOW = 0,
+	ACC100_VF_INT_DOORBELL_VF_2_PF = 1,
+	ACC100_VF_INT_DMA_DL_DESC_IRQ = 2,
+	ACC100_VF_INT_DMA_UL_DESC_IRQ = 3,
+	ACC100_VF_INT_DMA_MLD_DESC_IRQ = 4,
+	ACC100_VF_INT_DMA_UL5G_DESC_IRQ = 5,
+	ACC100_VF_INT_DMA_DL5G_DESC_IRQ = 6,
+	ACC100_VF_INT_ILLEGAL_FORMAT = 7,
+	ACC100_VF_INT_QMGR_DISABLED_ACCESS = 8,
+	ACC100_VF_INT_QMGR_AQ_OVERTHRESHOLD = 9,
+};
+
+#endif /* ACC100_VF_ENUM_H */
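[The descriptor base and response pointer registers above come in Lo/Hi pairs: each pair holds one 64-bit IOVA written through two 32-bit registers. A minimal sketch of the split, assuming the acc100_device handle from the PMD header and the acc100_reg_write() MMIO helper added later in this series:

/* Sketch only: program a 64-bit 5G UL descriptor ring base */
static inline void
vf_set_5g_ul_ring_base(struct acc100_device *d, uint64_t ring_iova)
{
	acc100_reg_write(d, HWVfDmaFec5GulDescBaseLoRegVf,
			(uint32_t)(ring_iova & 0xFFFFFFFF));
	acc100_reg_write(d, HWVfDmaFec5GulDescBaseHiRegVf,
			(uint32_t)(ring_iova >> 32));
}
]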
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 6f46df0..cd77570 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -5,6 +5,9 @@
 #ifndef _RTE_ACC100_PMD_H_
 #define _RTE_ACC100_PMD_H_
 
+#include "acc100_pf_enum.h"
+#include "acc100_vf_enum.h"
+
 /* Helper macro for logging */
 #define rte_bbdev_log(level, fmt, ...) \
 	rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \
@@ -27,6 +30,493 @@
 #define RTE_ACC100_PF_DEVICE_ID        (0x0d5c)
 #define RTE_ACC100_VF_DEVICE_ID        (0x0d5d)
 
+/* Define as 1 to use only a single FEC engine */
+#ifndef RTE_ACC100_SINGLE_FEC
+#define RTE_ACC100_SINGLE_FEC 0
+#endif
+
+/* Values used in filling in descriptors */
+#define ACC100_DMA_DESC_TYPE           2
+#define ACC100_DMA_CODE_BLK_MODE       0
+#define ACC100_DMA_BLKID_FCW           1
+#define ACC100_DMA_BLKID_IN            2
+#define ACC100_DMA_BLKID_OUT_ENC       1
+#define ACC100_DMA_BLKID_OUT_HARD      1
+#define ACC100_DMA_BLKID_OUT_SOFT      2
+#define ACC100_DMA_BLKID_OUT_HARQ      3
+#define ACC100_DMA_BLKID_IN_HARQ       3
+
+/* Values used in filling in decode FCWs */
+#define ACC100_FCW_TD_VER              1
+#define ACC100_FCW_TD_EXT_COLD_REG_EN  1
+#define ACC100_FCW_TD_AUTOMAP          0x0f
+#define ACC100_FCW_TD_RVIDX_0          2
+#define ACC100_FCW_TD_RVIDX_1          26
+#define ACC100_FCW_TD_RVIDX_2          50
+#define ACC100_FCW_TD_RVIDX_3          74
+
+/* Values used in writing to the registers */
+#define ACC100_REG_IRQ_EN_ALL          0x1FF83FF  /* Enable all interrupts */
+
+/* ACC100 Specific Dimensioning */
+#define ACC100_SIZE_64MBYTE            (64*1024*1024)
+/* Number of elements in an Info Ring */
+#define ACC100_INFO_RING_NUM_ENTRIES   1024
+/* Number of elements in HARQ layout memory */
+#define ACC100_HARQ_LAYOUT             (64*1024*1024)
+/* Assume offset for HARQ in memory */
+#define ACC100_HARQ_OFFSET             (32*1024)
+/* Mask used to calculate an index in an Info Ring array (not a byte offset) */
+#define ACC100_INFO_RING_MASK          (ACC100_INFO_RING_NUM_ENTRIES-1)
+/* Number of Virtual Functions ACC100 supports */
+#define ACC100_NUM_VFS                  16
+#define ACC100_NUM_QGRPS                 8
+#define ACC100_NUM_QGRPS_PER_WORD        8
+#define ACC100_NUM_AQS                  16
+#define MAX_ENQ_BATCH_SIZE          255
+/* All ACC100 registers are 32 bits = 4B aligned */
+#define BYTES_IN_WORD                 4
+#define MAX_E_MBUF                64000
+
+#define GRP_ID_SHIFT    10 /* Queue Index Hierarchy */
+#define VF_ID_SHIFT     4  /* Queue Index Hierarchy */
+#define VF_OFFSET_QOS   16 /* offset in Memory Space specific to QoS Mon */
+#define TMPL_PRI_0      0x03020100
+#define TMPL_PRI_1      0x07060504
+#define TMPL_PRI_2      0x0b0a0908
+#define TMPL_PRI_3      0x0f0e0d0c
+#define QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
+#define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
+
+#define ACC100_NUM_TMPL  32
+/* Mapping of signals for the available engines */
+#define SIG_UL_5G      0
+#define SIG_UL_5G_LAST 7
+#define SIG_DL_5G      13
+#define SIG_DL_5G_LAST 15
+#define SIG_UL_4G      16
+#define SIG_UL_4G_LAST 21
+#define SIG_DL_4G      27
+#define SIG_DL_4G_LAST 31
+
+/* Max number of attempts to allocate a memory block for all rings */
+#define SW_RING_MEM_ALLOC_ATTEMPTS 5
+#define MAX_QUEUE_DEPTH           1024
+#define ACC100_DMA_MAX_NUM_POINTERS  14
+#define ACC100_DMA_DESC_PADDING      8
+#define ACC100_FCW_PADDING           12
+#define ACC100_DESC_FCW_OFFSET       192
+#define ACC100_DESC_SIZE             256
+#define ACC100_DESC_OFFSET           (ACC100_DESC_SIZE / 64)
+#define ACC100_FCW_TE_BLEN     32
+#define ACC100_FCW_TD_BLEN     24
+#define ACC100_FCW_LE_BLEN     32
+#define ACC100_FCW_LD_BLEN     36
+
+#define ACC100_FCW_VER         2
+#define MUX_5GDL_DESC 6
+#define CMP_ENC_SIZE 20
+#define CMP_DEC_SIZE 24
+#define ENC_OFFSET (32)
+#define DEC_OFFSET (80)
+#define ACC100_EXT_MEM
+#define ACC100_HARQ_OFFSET_THRESHOLD 1024
+
+/* Constants from K0 computation from 3GPP 38.212 Table 5.4.2.1-2 */
+#define N_ZC_1 66 /* N = 66 Zc for BG 1 */
+#define N_ZC_2 50 /* N = 50 Zc for BG 2 */
+#define K0_1_1 17 /* K0 fraction numerator for rv 1 and BG 1 */
+#define K0_1_2 13 /* K0 fraction numerator for rv 1 and BG 2 */
+#define K0_2_1 33 /* K0 fraction numerator for rv 2 and BG 1 */
+#define K0_2_2 25 /* K0 fraction numerator for rv 2 and BG 2 */
+#define K0_3_1 56 /* K0 fraction numerator for rv 3 and BG 1 */
+#define K0_3_2 43 /* K0 fraction numerator for rv 3 and BG 2 */
+
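/*
 * Sketch (not part of the patch): the K0_<rv>_<bg> values below are the
 * numerators of the k0 starting positions in 3GPP 38.212 Table 5.4.2.1-2.
 * For redundancy version rv > 0, base graph bg, lifting size z_c and
 * circular buffer size ncb:
 *
 *	k0 = floor(K0_<rv>_<bg> * ncb / (N_ZC_<bg> * z_c)) * z_c
 *
 * e.g. bg = 1, z_c = 128, ncb = 66 * 128 = 8448, rv = 2:
 *	k0 = floor(33 * 8448 / 8448) * 128 = 4224 (half the buffer).
 * The helper name below is illustrative only:
 */
static inline uint16_t
example_get_k0(uint16_t ncb, uint16_t z_c, uint8_t bg, uint8_t rv)
{
	if (rv == 0)
		return 0;
	uint32_t n = (uint32_t)((bg == 1) ? N_ZC_1 : N_ZC_2) * z_c;
	uint32_t num = (bg == 1) ?
			(rv == 1 ? K0_1_1 : rv == 2 ? K0_2_1 : K0_3_1) :
			(rv == 1 ? K0_1_2 : rv == 2 ? K0_2_2 : K0_3_2);
	return (uint16_t)((num * ncb / n) * z_c);
}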
+/* ACC100 Configuration */
+#define ACC100_DDR_ECC_ENABLE
+#define ACC100_CFG_DMA_ERROR 0x3D7
+#define ACC100_CFG_AXI_CACHE 0x11
+#define ACC100_CFG_QMGR_HI_P 0x0F0F
+#define ACC100_CFG_PCI_AXI 0xC003
+#define ACC100_CFG_PCI_BRIDGE 0x40006033
+#define ACC100_ENGINE_OFFSET 0x1000
+#define ACC100_RESET_HI 0x20100
+#define ACC100_RESET_LO 0x20000
+#define ACC100_RESET_HARD 0x1FF
+#define ACC100_ENGINES_MAX 9
+#define LONG_WAIT 1000
+
+/* ACC100 DMA Descriptor triplet */
+struct acc100_dma_triplet {
+	uint64_t address;
+	uint32_t blen:20,
+		res0:4,
+		last:1,
+		dma_ext:1,
+		res1:2,
+		blkid:4;
+} __rte_packed;
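/*
 * Sketch (not part of the patch): each data pointer triplet carries an
 * IOVA, a 20-bit byte length and a block id telling the engine how to
 * interpret the block (one of the ACC100_DMA_BLKID_* values above).
 * Filling an input block from an assumed mbuf 'm' might look like:
 *
 *	t->address = rte_pktmbuf_iova(m);
 *	t->blen    = rte_pktmbuf_data_len(m);
 *	t->blkid   = ACC100_DMA_BLKID_IN;
 *	t->last    = 1;        (final triplet of this direction)
 *	t->dma_ext = 0;
 */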
+
+/* ACC100 DMA Response Descriptor */
+union acc100_dma_rsp_desc {
+	uint32_t val;
+	struct {
+		uint32_t crc_status:1,
+			synd_ok:1,
+			dma_err:1,
+			neg_stop:1,
+			fcw_err:1,
+			output_err:1,
+			input_err:1,
+			timestampEn:1,
+			iterCountFrac:8,
+			iter_cnt:8,
+			rsrvd3:6,
+			sdone:1,
+			fdone:1;
+		uint32_t add_info_0;
+		uint32_t add_info_1;
+	};
+};
+
+/* ACC100 Queue Manager Enqueue PCI Register */
+union acc100_enqueue_reg_fmt {
+	uint32_t val;
+	struct {
+		uint32_t num_elem:8,
+			addr_offset:3,
+			rsrvd:1,
+			req_elem_addr:20;
+	};
+};
+
+/* FEC 4G Uplink Frame Control Word */
+struct __rte_packed acc100_fcw_td {
+	uint8_t fcw_ver:4,
+		num_maps:4; /* Unused */
+	uint8_t filler:6, /* Unused */
+		rsrvd0:1,
+		bypass_sb_deint:1;
+	uint16_t k_pos;
+	uint16_t k_neg; /* Unused */
+	uint8_t c_neg; /* Unused */
+	uint8_t c; /* Unused */
+	uint32_t ea; /* Unused */
+	uint32_t eb; /* Unused */
+	uint8_t cab; /* Unused */
+	uint8_t k0_start_col; /* Unused */
+	uint8_t rsrvd1;
+	uint8_t code_block_mode:1, /* Unused */
+		turbo_crc_type:1,
+		rsrvd2:3,
+		bypass_teq:1, /* Unused */
+		soft_output_en:1, /* Unused */
+		ext_td_cold_reg_en:1;
+	union { /* External Cold register */
+		uint32_t ext_td_cold_reg;
+		struct {
+			uint32_t min_iter:4, /* Unused */
+				max_iter:4,
+				ext_scale:5, /* Unused */
+				rsrvd3:3,
+				early_stop_en:1, /* Unused */
+				sw_soft_out_dis:1, /* Unused */
+				sw_et_cont:1, /* Unused */
+				sw_soft_out_saturation:1, /* Unused */
+				half_iter_on:1, /* Unused */
+				raw_decoder_input_on:1, /* Unused */
+				rsrvd4:10;
+		};
+	};
+};
+
+/* FEC 5GNR Uplink Frame Control Word */
+struct __rte_packed acc100_fcw_ld {
+	uint32_t FCWversion:4,
+		qm:4,
+		nfiller:11,
+		BG:1,
+		Zc:9,
+		res0:1,
+		synd_precoder:1,
+		synd_post:1;
+	uint32_t ncb:16,
+		k0:16;
+	uint32_t rm_e:24,
+		hcin_en:1,
+		hcout_en:1,
+		crc_select:1,
+		bypass_dec:1,
+		bypass_intlv:1,
+		so_en:1,
+		so_bypass_rm:1,
+		so_bypass_intlv:1;
+	uint32_t hcin_offset:16,
+		hcin_size0:16;
+	uint32_t hcin_size1:16,
+		hcin_decomp_mode:3,
+		llr_pack_mode:1,
+		hcout_comp_mode:3,
+		res2:1,
+		dec_convllr:4,
+		hcout_convllr:4;
+	uint32_t itmax:7,
+		itstop:1,
+		so_it:7,
+		res3:1,
+		hcout_offset:16;
+	uint32_t hcout_size0:16,
+		hcout_size1:16;
+	uint32_t gain_i:8,
+		gain_h:8,
+		negstop_th:16;
+	uint32_t negstop_it:7,
+		negstop_en:1,
+		res4:24;
+};
+
+/* FEC 4G Downlink Frame Control Word */
+struct __rte_packed acc100_fcw_te {
+	uint16_t k_neg;
+	uint16_t k_pos;
+	uint8_t c_neg;
+	uint8_t c;
+	uint8_t filler;
+	uint8_t cab;
+	uint32_t ea:17,
+		rsrvd0:15;
+	uint32_t eb:17,
+		rsrvd1:15;
+	uint16_t ncb_neg;
+	uint16_t ncb_pos;
+	uint8_t rv_idx0:2,
+		rsrvd2:2,
+		rv_idx1:2,
+		rsrvd3:2;
+	uint8_t bypass_rv_idx0:1,
+		bypass_rv_idx1:1,
+		bypass_rm:1,
+		rsrvd4:5;
+	uint8_t rsrvd5:1,
+		rsrvd6:3,
+		code_block_crc:1,
+		rsrvd7:3;
+	uint8_t code_block_mode:1,
+		rsrvd8:7;
+	uint64_t rsrvd9;
+};
+
+/* FEC 5GNR Downlink Frame Control Word */
+struct __rte_packed acc100_fcw_le {
+	uint32_t FCWversion:4,
+		qm:4,
+		nfiller:11,
+		BG:1,
+		Zc:9,
+		res0:3;
+	uint32_t ncb:16,
+		k0:16;
+	uint32_t rm_e:24,
+		res1:2,
+		crc_select:1,
+		res2:1,
+		bypass_intlv:1,
+		res3:3;
+	uint32_t res4_a:12,
+		mcb_count:3,
+		res4_b:17;
+	uint32_t res5;
+	uint32_t res6;
+	uint32_t res7;
+	uint32_t res8;
+};
+
+/* ACC100 DMA Request Descriptor */
+struct __rte_packed acc100_dma_req_desc {
+	union {
+		struct{
+			uint32_t type:4,
+				rsrvd0:26,
+				sdone:1,
+				fdone:1;
+			uint32_t rsrvd1;
+			uint32_t rsrvd2;
+			uint32_t pass_param:8,
+				sdone_enable:1,
+				irq_enable:1,
+				timeStampEn:1,
+				res0:5,
+				numCBs:4,
+				res1:4,
+				m2dlen:4,
+				d2mlen:4;
+		};
+		struct{
+			uint32_t word0;
+			uint32_t word1;
+			uint32_t word2;
+			uint32_t word3;
+		};
+	};
+	struct acc100_dma_triplet data_ptrs[ACC100_DMA_MAX_NUM_POINTERS];
+
+	/* Virtual addresses used to retrieve SW context info */
+	union {
+		void *op_addr;
+		uint64_t pad1;  /* pad to 64 bits */
+	};
+	/*
+	 * Stores additional information needed for driver processing:
+	 * - last_desc_in_batch - flag used to mark last descriptor (CB)
+	 *                        in batch
+	 * - cbs_in_tb - stores information about total number of Code Blocks
+	 *               in currently processed Transport Block
+	 */
+	union {
+		struct {
+			union {
+				struct acc100_fcw_ld fcw_ld;
+				struct acc100_fcw_td fcw_td;
+				struct acc100_fcw_le fcw_le;
+				struct acc100_fcw_te fcw_te;
+				uint32_t pad2[ACC100_FCW_PADDING];
+			};
+			uint32_t last_desc_in_batch :8,
+				cbs_in_tb:8,
+				pad4 : 16;
+		};
+		uint64_t pad3[ACC100_DMA_DESC_PADDING]; /* pad to 64 bits */
+	};
+};
+
+/* ACC100 DMA Descriptor */
+union acc100_dma_desc {
+	struct acc100_dma_req_desc req;
+	union acc100_dma_rsp_desc rsp;
+};
+
+/* Union describing a HARQ layout entry */
+union acc100_harq_layout_data {
+	uint32_t val;
+	struct {
+		uint16_t offset;
+		uint16_t size0;
+	};
+} __rte_packed;
+
+/* Union describing Info Ring entry */
+union acc100_info_ring_data {
+	uint32_t val;
+	struct {
+		union {
+			uint16_t detailed_info;
+			struct {
+				uint16_t aq_id: 4;
+				uint16_t qg_id: 4;
+				uint16_t vf_id: 6;
+				uint16_t reserved: 2;
+			};
+		};
+		uint16_t int_nb: 7;
+		uint16_t msi_0: 1;
+		uint16_t vf2pf: 6;
+		uint16_t loop: 1;
+		uint16_t valid: 1;
+	};
+} __rte_packed;
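/*
 * Sketch (not part of the patch): an interrupt handler would typically
 * walk the Info Ring, masking the head index with ACC100_INFO_RING_MASK,
 * and dispatch on int_nb, which carries one of the ACC100_PF_INT_* /
 * ACC100_VF_INT_* numbers from the enum headers:
 *
 *	union acc100_info_ring_data *entry =
 *			&ring[head & ACC100_INFO_RING_MASK];
 *	while (entry->valid) {
 *		if (entry->int_nb == ACC100_PF_INT_DMA_DL_DESC_IRQ)
 *			;	(handle DL descriptor completion)
 *		entry = &ring[++head & ACC100_INFO_RING_MASK];
 *	}
 */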
+
+struct acc100_registry_addr {
+	unsigned int dma_ring_dl5g_hi;
+	unsigned int dma_ring_dl5g_lo;
+	unsigned int dma_ring_ul5g_hi;
+	unsigned int dma_ring_ul5g_lo;
+	unsigned int dma_ring_dl4g_hi;
+	unsigned int dma_ring_dl4g_lo;
+	unsigned int dma_ring_ul4g_hi;
+	unsigned int dma_ring_ul4g_lo;
+	unsigned int ring_size;
+	unsigned int info_ring_hi;
+	unsigned int info_ring_lo;
+	unsigned int info_ring_en;
+	unsigned int info_ring_ptr;
+	unsigned int tail_ptrs_dl5g_hi;
+	unsigned int tail_ptrs_dl5g_lo;
+	unsigned int tail_ptrs_ul5g_hi;
+	unsigned int tail_ptrs_ul5g_lo;
+	unsigned int tail_ptrs_dl4g_hi;
+	unsigned int tail_ptrs_dl4g_lo;
+	unsigned int tail_ptrs_ul4g_hi;
+	unsigned int tail_ptrs_ul4g_lo;
+	unsigned int depth_log0_offset;
+	unsigned int depth_log1_offset;
+	unsigned int qman_group_func;
+	unsigned int ddr_range;
+};
+
+/* Structure holding registry addresses for PF */
+static const struct acc100_registry_addr pf_reg_addr = {
+	.dma_ring_dl5g_hi = HWPfDmaFec5GdlDescBaseHiRegVf,
+	.dma_ring_dl5g_lo = HWPfDmaFec5GdlDescBaseLoRegVf,
+	.dma_ring_ul5g_hi = HWPfDmaFec5GulDescBaseHiRegVf,
+	.dma_ring_ul5g_lo = HWPfDmaFec5GulDescBaseLoRegVf,
+	.dma_ring_dl4g_hi = HWPfDmaFec4GdlDescBaseHiRegVf,
+	.dma_ring_dl4g_lo = HWPfDmaFec4GdlDescBaseLoRegVf,
+	.dma_ring_ul4g_hi = HWPfDmaFec4GulDescBaseHiRegVf,
+	.dma_ring_ul4g_lo = HWPfDmaFec4GulDescBaseLoRegVf,
+	.ring_size = HWPfQmgrRingSizeVf,
+	.info_ring_hi = HWPfHiInfoRingBaseHiRegPf,
+	.info_ring_lo = HWPfHiInfoRingBaseLoRegPf,
+	.info_ring_en = HWPfHiInfoRingIntWrEnRegPf,
+	.info_ring_ptr = HWPfHiInfoRingPointerRegPf,
+	.tail_ptrs_dl5g_hi = HWPfDmaFec5GdlRespPtrHiRegVf,
+	.tail_ptrs_dl5g_lo = HWPfDmaFec5GdlRespPtrLoRegVf,
+	.tail_ptrs_ul5g_hi = HWPfDmaFec5GulRespPtrHiRegVf,
+	.tail_ptrs_ul5g_lo = HWPfDmaFec5GulRespPtrLoRegVf,
+	.tail_ptrs_dl4g_hi = HWPfDmaFec4GdlRespPtrHiRegVf,
+	.tail_ptrs_dl4g_lo = HWPfDmaFec4GdlRespPtrLoRegVf,
+	.tail_ptrs_ul4g_hi = HWPfDmaFec4GulRespPtrHiRegVf,
+	.tail_ptrs_ul4g_lo = HWPfDmaFec4GulRespPtrLoRegVf,
+	.depth_log0_offset = HWPfQmgrGrpDepthLog20Vf,
+	.depth_log1_offset = HWPfQmgrGrpDepthLog21Vf,
+	.qman_group_func = HWPfQmgrGrpFunction0,
+	.ddr_range = HWPfDmaVfDdrBaseRw,
+};
+
+/* Structure holding registry addresses for VF */
+static const struct acc100_registry_addr vf_reg_addr = {
+	.dma_ring_dl5g_hi = HWVfDmaFec5GdlDescBaseHiRegVf,
+	.dma_ring_dl5g_lo = HWVfDmaFec5GdlDescBaseLoRegVf,
+	.dma_ring_ul5g_hi = HWVfDmaFec5GulDescBaseHiRegVf,
+	.dma_ring_ul5g_lo = HWVfDmaFec5GulDescBaseLoRegVf,
+	.dma_ring_dl4g_hi = HWVfDmaFec4GdlDescBaseHiRegVf,
+	.dma_ring_dl4g_lo = HWVfDmaFec4GdlDescBaseLoRegVf,
+	.dma_ring_ul4g_hi = HWVfDmaFec4GulDescBaseHiRegVf,
+	.dma_ring_ul4g_lo = HWVfDmaFec4GulDescBaseLoRegVf,
+	.ring_size = HWVfQmgrRingSizeVf,
+	.info_ring_hi = HWVfHiInfoRingBaseHiVf,
+	.info_ring_lo = HWVfHiInfoRingBaseLoVf,
+	.info_ring_en = HWVfHiInfoRingIntWrEnVf,
+	.info_ring_ptr = HWVfHiInfoRingPointerVf,
+	.tail_ptrs_dl5g_hi = HWVfDmaFec5GdlRespPtrHiRegVf,
+	.tail_ptrs_dl5g_lo = HWVfDmaFec5GdlRespPtrLoRegVf,
+	.tail_ptrs_ul5g_hi = HWVfDmaFec5GulRespPtrHiRegVf,
+	.tail_ptrs_ul5g_lo = HWVfDmaFec5GulRespPtrLoRegVf,
+	.tail_ptrs_dl4g_hi = HWVfDmaFec4GdlRespPtrHiRegVf,
+	.tail_ptrs_dl4g_lo = HWVfDmaFec4GdlRespPtrLoRegVf,
+	.tail_ptrs_ul4g_hi = HWVfDmaFec4GulRespPtrHiRegVf,
+	.tail_ptrs_ul4g_lo = HWVfDmaFec4GulRespPtrLoRegVf,
+	.depth_log0_offset = HWVfQmgrGrpDepthLog20Vf,
+	.depth_log1_offset = HWVfQmgrGrpDepthLog21Vf,
+	.qman_group_func = HWVfQmgrGrpFunction0Vf,
+	.ddr_range = HWVfDmaDdrBaseRangeRoVf,
+};
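/*
 * Sketch (not part of the patch): keeping the PF and VF offsets in two
 * parallel tables lets the rest of the PMD program the rings without
 * branching on the device type at every access, e.g.:
 *
 *	const struct acc100_registry_addr *reg_addr =
 *			d->pf_device ? &pf_reg_addr : &vf_reg_addr;
 *	acc100_reg_write(d, reg_addr->ring_size, ring_size_value);
 *
 * acc100_reg_write() is the MMIO write helper added later in this series.
 */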
+
 /* Private data structure for each ACC100 device */
 struct acc100_device {
 	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v3 03/11] baseband/acc100: add info get function
  2020-08-19  0:25 [dpdk-dev] [PATCH v3 00/11] bbdev PMD ACC100 Nicolas Chautru
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 01/11] drivers/baseband: add PMD for ACC100 Nicolas Chautru
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 02/11] baseband/acc100: add register definition file Nicolas Chautru
@ 2020-08-19  0:25 ` Nicolas Chautru
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 04/11] baseband/acc100: add queue configuration Nicolas Chautru
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-08-19  0:25 UTC (permalink / raw)
  To: dev, akhil.goyal; +Cc: bruce.richardson, Nicolas Chautru

Add the "info_get" function to the driver to allow querying the
device.
No processing capabilities are exposed yet.
Link bbdev-test to support the PMD with null capability.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 app/test-bbdev/Makefile                  |   3 +
 app/test-bbdev/meson.build               |   3 +
 drivers/baseband/acc100/rte_acc100_cfg.h |  96 +++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.c | 225 +++++++++++++++++++++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.h |   3 +
 5 files changed, 330 insertions(+)
 create mode 100644 drivers/baseband/acc100/rte_acc100_cfg.h

diff --git a/app/test-bbdev/Makefile b/app/test-bbdev/Makefile
index dc29557..dbc3437 100644
--- a/app/test-bbdev/Makefile
+++ b/app/test-bbdev/Makefile
@@ -26,5 +26,8 @@ endif
 ifeq ($(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC),y)
 LDLIBS += -lrte_pmd_bbdev_fpga_5gnr_fec
 endif
+ifeq ($(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100),y)
+LDLIBS += -lrte_pmd_bbdev_acc100
+endif
 
 include $(RTE_SDK)/mk/rte.app.mk
diff --git a/app/test-bbdev/meson.build b/app/test-bbdev/meson.build
index 18ab6a8..fbd8ae3 100644
--- a/app/test-bbdev/meson.build
+++ b/app/test-bbdev/meson.build
@@ -12,3 +12,6 @@ endif
 if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC')
 	deps += ['pmd_bbdev_fpga_5gnr_fec']
 endif
+if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_ACC100')
+	deps += ['pmd_bbdev_acc100']
+endif
\ No newline at end of file
diff --git a/drivers/baseband/acc100/rte_acc100_cfg.h b/drivers/baseband/acc100/rte_acc100_cfg.h
new file mode 100644
index 0000000..73bbe36
--- /dev/null
+++ b/drivers/baseband/acc100/rte_acc100_cfg.h
@@ -0,0 +1,96 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_ACC100_CFG_H_
+#define _RTE_ACC100_CFG_H_
+
+/**
+ * @file rte_acc100_cfg.h
+ *
+ * Functions for configuring ACC100 HW, exposed directly to applications.
+ * Configuration related to encoding/decoding is done through the
+ * librte_bbdev library.
+ *
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ */
+
+#include <stdint.h>
+#include <stdbool.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+/** Number of Virtual Functions ACC100 supports */
+#define RTE_ACC100_NUM_VFS 16
+
+/**
+ * Definition of Queue Topology for ACC100 Configuration
+ * Some level of details is abstracted out to expose a clean interface
+ * given that comprehensive flexibility is not required
+ */
+struct rte_q_topology_t {
+	/** Number of QGroups in incremental order of priority */
+	uint16_t num_qgroups;
+	/**
+	 * All QGroups have the same number of AQs here.
+	 * Note : Could be made a 16-array if more flexibility is really
+	 * required
+	 */
+	uint16_t num_aqs_per_groups;
+	/**
+	 * Depth of the AQs is the same of all QGroups here. Log2 Enum : 2^N
+	 * Note : Could be made a 16-array if more flexibility is really
+	 * required
+	 */
+	uint16_t aq_depth_log2;
+	/**
+	 * Index of the first Queue Group - assuming contiguity
+	 * Initialized as -1
+	 */
+	int8_t first_qgroup_index;
+};
+
+/**
+ * Definition of Arbitration related parameters for ACC100 Configuration
+ */
+struct rte_arbitration_t {
+	/** Default Weight for VF Fairness Arbitration */
+	uint16_t round_robin_weight;
+	uint32_t gbr_threshold1; /**< Guaranteed Bitrate Threshold 1 */
+	uint32_t gbr_threshold2; /**< Guaranteed Bitrate Threshold 2 */
+};
+
+/**
+ * Structure to pass ACC100 configuration.
+ * Note: all VF Bundles will have the same configuration.
+ */
+struct acc100_conf {
+	bool pf_mode_en; /**< 1 if PF is used for dataplane, 0 for VFs */
+	/** 1 if input '1' bit is represented by a positive LLR value, 0 if '1'
+	 * bit is represented by a negative value.
+	 */
+	bool input_pos_llr_1_bit;
+	/** 1 if output '1' bit is represented by a positive value, 0 if '1'
+	 * bit is represented by a negative value.
+	 */
+	bool output_pos_llr_1_bit;
+	uint16_t num_vf_bundles; /**< Number of VF bundles to setup */
+	/** Queue topology for each operation type */
+	struct rte_q_topology_t q_ul_4g;
+	struct rte_q_topology_t q_dl_4g;
+	struct rte_q_topology_t q_ul_5g;
+	struct rte_q_topology_t q_dl_5g;
+	/** Arbitration configuration for each operation type */
+	struct rte_arbitration_t arb_ul_4g[RTE_ACC100_NUM_VFS];
+	struct rte_arbitration_t arb_dl_4g[RTE_ACC100_NUM_VFS];
+	struct rte_arbitration_t arb_ul_5g[RTE_ACC100_NUM_VFS];
+	struct rte_arbitration_t arb_dl_5g[RTE_ACC100_NUM_VFS];
+};
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_ACC100_CFG_H_ */
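[To make the topology fields concrete, a hypothetical single-VF setup exposing one 5G downlink queue group of 16 atomic queues, each 16 entries deep, could be filled as below. The values are illustrative only; the structure is consumed by the PF configuration entry point added later in this series:

struct acc100_conf conf = {0};

conf.pf_mode_en = false;              /* dataplane through the VFs */
conf.num_vf_bundles = 1;
conf.q_dl_5g.num_qgroups = 1;
conf.q_dl_5g.num_aqs_per_groups = 16;
conf.q_dl_5g.aq_depth_log2 = 4;       /* 2^4 = 16 entries per AQ */
conf.q_dl_5g.first_qgroup_index = 0;
]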
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 1b4cd13..7807a30 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -26,6 +26,184 @@
 RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
 #endif
 
+/* Read a register of an ACC100 device */
+static inline uint32_t
+acc100_reg_read(struct acc100_device *d, uint32_t offset)
+{
+	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
+	uint32_t ret = *((volatile uint32_t *)(reg_addr));
+	return rte_le_to_cpu_32(ret);
+}
+
+/* Calculate the offset of the enqueue register */
+static inline uint32_t
+queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
+{
+	if (pf_device)
+		return ((vf_id << 12) + (qgrp_id << 7) + (aq_id << 3) +
+				HWPfQmgrIngressAq);
+	else
+		return ((qgrp_id << 7) + (aq_id << 3) +
+				HWVfQmgrIngressAq);
+}
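/*
 * Worked example (not part of the patch): on a VF device, qgrp_id = 2 and
 * aq_id = 3 give (2 << 7) + (3 << 3) + HWVfQmgrIngressAq
 * = 0x100 + 0x18 + 0x0 = 0x118, i.e. each queue group spans 0x80 bytes
 * of the BAR and each atomic queue 8 bytes within its group.
 */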
+
+enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC};
+
+/* Return the queue topology for a given accelerator enum */
+static inline void
+qtopFromAcc(struct rte_q_topology_t **qtop, int acc_enum,
+		struct acc100_conf *acc100_conf)
+{
+	struct rte_q_topology_t *p_qtop;
+	p_qtop = NULL;
+	switch (acc_enum) {
+	case UL_4G:
+		p_qtop = &(acc100_conf->q_ul_4g);
+		break;
+	case UL_5G:
+		p_qtop = &(acc100_conf->q_ul_5g);
+		break;
+	case DL_4G:
+		p_qtop = &(acc100_conf->q_dl_4g);
+		break;
+	case DL_5G:
+		p_qtop = &(acc100_conf->q_dl_5g);
+		break;
+	default:
+		/* NOTREACHED */
+		rte_bbdev_log(ERR, "Unexpected error evaluating qtopFromAcc");
+		break;
+	}
+	*qtop = p_qtop;
+}
+
+static void
+initQTop(struct acc100_conf *acc100_conf)
+{
+	acc100_conf->q_ul_4g.num_aqs_per_groups = 0;
+	acc100_conf->q_ul_4g.num_qgroups = 0;
+	acc100_conf->q_ul_4g.first_qgroup_index = -1;
+	acc100_conf->q_ul_5g.num_aqs_per_groups = 0;
+	acc100_conf->q_ul_5g.num_qgroups = 0;
+	acc100_conf->q_ul_5g.first_qgroup_index = -1;
+	acc100_conf->q_dl_4g.num_aqs_per_groups = 0;
+	acc100_conf->q_dl_4g.num_qgroups = 0;
+	acc100_conf->q_dl_4g.first_qgroup_index = -1;
+	acc100_conf->q_dl_5g.num_aqs_per_groups = 0;
+	acc100_conf->q_dl_5g.num_qgroups = 0;
+	acc100_conf->q_dl_5g.first_qgroup_index = -1;
+}
+
+static inline void
+updateQtop(uint8_t acc, uint8_t qg, struct acc100_conf *acc100_conf,
+		struct acc100_device *d) {
+	uint32_t reg;
+	struct rte_q_topology_t *q_top = NULL;
+	qtopFromAcc(&q_top, acc, acc100_conf);
+	if (unlikely(q_top == NULL))
+		return;
+	uint16_t aq;
+	q_top->num_qgroups++;
+	if (q_top->first_qgroup_index == -1) {
+		q_top->first_qgroup_index = qg;
+		/* Can be optimized to assume all are enabled by default */
+		reg = acc100_reg_read(d, queue_offset(d->pf_device,
+				0, qg, ACC100_NUM_AQS - 1));
+		if (reg & QUEUE_ENABLE) {
+			q_top->num_aqs_per_groups = ACC100_NUM_AQS;
+			return;
+		}
+		q_top->num_aqs_per_groups = 0;
+		for (aq = 0; aq < ACC100_NUM_AQS; aq++) {
+			reg = acc100_reg_read(d, queue_offset(d->pf_device,
+					0, qg, aq));
+			if (reg & QUEUE_ENABLE)
+				q_top->num_aqs_per_groups++;
+		}
+	}
+}
+
+/* Fetch configuration enabled for the PF/VF using MMIO Read (slow) */
+static inline void
+fetch_acc100_config(struct rte_bbdev *dev)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	struct acc100_conf *acc100_conf = &d->acc100_conf;
+	const struct acc100_registry_addr *reg_addr;
+	uint8_t acc, qg;
+	uint32_t reg, reg_aq, reg_len0, reg_len1;
+	uint32_t reg_mode;
+
+	/* No need to retrieve the configuration if it is already done */
+	if (d->configured)
+		return;
+
+	/* Choose correct registry addresses for the device type */
+	if (d->pf_device)
+		reg_addr = &pf_reg_addr;
+	else
+		reg_addr = &vf_reg_addr;
+
+	d->ddr_size = (1 + acc100_reg_read(d, reg_addr->ddr_range)) << 10;
+
+	/* Single VF Bundle by VF */
+	acc100_conf->num_vf_bundles = 1;
+	initQTop(acc100_conf);
+
+	struct rte_q_topology_t *q_top = NULL;
+	int qman_func_id[5] = {0, 2, 1, 3, 4};
+	reg = acc100_reg_read(d, reg_addr->qman_group_func);
+	for (qg = 0; qg < ACC100_NUM_QGRPS_PER_WORD; qg++) {
+		reg_aq = acc100_reg_read(d,
+				queue_offset(d->pf_device, 0, qg, 0));
+		if (reg_aq & QUEUE_ENABLE) {
+			acc = qman_func_id[(reg >> (qg * 4)) & 0x7];
+			updateQtop(acc, qg, acc100_conf, d);
+		}
+	}
+
+	/* Check the depth of the AQs */
+	reg_len0 = acc100_reg_read(d, reg_addr->depth_log0_offset);
+	reg_len1 = acc100_reg_read(d, reg_addr->depth_log1_offset);
+	for (acc = 0; acc < NUM_ACC; acc++) {
+		qtopFromAcc(&q_top, acc, acc100_conf);
+		if (q_top->first_qgroup_index < ACC100_NUM_QGRPS_PER_WORD)
+			q_top->aq_depth_log2 = (reg_len0 >>
+					(q_top->first_qgroup_index * 4))
+					& 0xF;
+		else
+			q_top->aq_depth_log2 = (reg_len1 >>
+					((q_top->first_qgroup_index -
+					ACC100_NUM_QGRPS_PER_WORD) * 4))
+					& 0xF;
+	}
+
+	/* Read PF mode */
+	if (d->pf_device) {
+		reg_mode = acc100_reg_read(d, HWPfHiPfMode);
+		acc100_conf->pf_mode_en = (reg_mode == 2) ? 1 : 0;
+	}
+
+	rte_bbdev_log_debug(
+			"%s Config LLR SIGN IN/OUT %s %s QG %u %u %u %u AQ %u %u %u %u Len %u %u %u %u\n",
+			(d->pf_device) ? "PF" : "VF",
+			(acc100_conf->input_pos_llr_1_bit) ? "POS" : "NEG",
+			(acc100_conf->output_pos_llr_1_bit) ? "POS" : "NEG",
+			acc100_conf->q_ul_4g.num_qgroups,
+			acc100_conf->q_dl_4g.num_qgroups,
+			acc100_conf->q_ul_5g.num_qgroups,
+			acc100_conf->q_dl_5g.num_qgroups,
+			acc100_conf->q_ul_4g.num_aqs_per_groups,
+			acc100_conf->q_dl_4g.num_aqs_per_groups,
+			acc100_conf->q_ul_5g.num_aqs_per_groups,
+			acc100_conf->q_dl_5g.num_aqs_per_groups,
+			acc100_conf->q_ul_4g.aq_depth_log2,
+			acc100_conf->q_dl_4g.aq_depth_log2,
+			acc100_conf->q_ul_5g.aq_depth_log2,
+			acc100_conf->q_dl_5g.aq_depth_log2);
+}
+
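/*
 * Worked example (not part of the patch): the depth registers pack one
 * 4-bit log2 depth per queue group, eight groups per 32-bit word, so
 * reg_len0 = 0x00003333 gives queue groups 0..3 an AQ depth of 2^3 = 8
 * entries. Likewise ddr_size is derived from the range register in kB
 * units: reg = 3 yields (1 + 3) << 10 = 4096 kB of HARQ DDR.
 */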
 /* Free 64MB memory used for software rings */
 static int
 acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
@@ -33,8 +211,55 @@
 	return 0;
 }
 
+/* Get ACC100 device info */
+static void
+acc100_dev_info_get(struct rte_bbdev *dev,
+		struct rte_bbdev_driver_info *dev_info)
+{
+	struct acc100_device *d = dev->data->dev_private;
+
+	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
+		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
+	};
+
+	static struct rte_bbdev_queue_conf default_queue_conf;
+	default_queue_conf.socket = dev->data->socket_id;
+	default_queue_conf.queue_size = MAX_QUEUE_DEPTH;
+
+	dev_info->driver_name = dev->device->driver->name;
+
+	/* Read and save the populated config from ACC100 registers */
+	fetch_acc100_config(dev);
+
+	/* This isn't ideal because it reports the maximum number of queues but
+	 * does not provide info on how many can be uplink/downlink or at
+	 * different priorities.
+	 */
+	dev_info->max_num_queues =
+			d->acc100_conf.q_dl_5g.num_aqs_per_groups *
+			d->acc100_conf.q_dl_5g.num_qgroups +
+			d->acc100_conf.q_ul_5g.num_aqs_per_groups *
+			d->acc100_conf.q_ul_5g.num_qgroups +
+			d->acc100_conf.q_dl_4g.num_aqs_per_groups *
+			d->acc100_conf.q_dl_4g.num_qgroups +
+			d->acc100_conf.q_ul_4g.num_aqs_per_groups *
+			d->acc100_conf.q_ul_4g.num_qgroups;
+	dev_info->queue_size_lim = MAX_QUEUE_DEPTH;
+	dev_info->hardware_accelerated = true;
+	dev_info->max_dl_queue_priority =
+			d->acc100_conf.q_dl_4g.num_qgroups - 1;
+	dev_info->max_ul_queue_priority =
+			d->acc100_conf.q_ul_4g.num_qgroups - 1;
+	dev_info->default_queue_conf = default_queue_conf;
+	dev_info->cpu_flag_reqs = NULL;
+	dev_info->min_alignment = 64;
+	dev_info->capabilities = bbdev_capabilities;
+	dev_info->harq_buffer_size = d->ddr_size;
+}
+
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.close = acc100_dev_close,
+	.info_get = acc100_dev_info_get,
 };
 
 /* ACC100 PCI PF address map */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index cd77570..662e2c8 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -7,6 +7,7 @@
 
 #include "acc100_pf_enum.h"
 #include "acc100_vf_enum.h"
+#include "rte_acc100_cfg.h"
 
 /* Helper macro for logging */
 #define rte_bbdev_log(level, fmt, ...) \
@@ -520,6 +521,8 @@ struct acc100_registry_addr {
 /* Private data structure for each ACC100 device */
 struct acc100_device {
 	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
+	uint32_t ddr_size; /* Size in kB */
+	struct acc100_conf acc100_conf; /* ACC100 Initial configuration */
 	bool pf_device; /**< True if this is a PF ACC100 device */
 	bool configured; /**< True if this ACC100 device is configured */
 };
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v3 04/11] baseband/acc100: add queue configuration
  2020-08-19  0:25 [dpdk-dev] [PATCH v3 00/11] bbdev PMD ACC100 Nicolas Chautru
                   ` (2 preceding siblings ...)
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 03/11] baseband/acc100: add info get function Nicolas Chautru
@ 2020-08-19  0:25 ` Nicolas Chautru
  2020-08-29 10:39   ` Xu, Rosen
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions Nicolas Chautru
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 213+ messages in thread
From: Nicolas Chautru @ 2020-08-19  0:25 UTC (permalink / raw)
  To: dev, akhil.goyal; +Cc: bruce.richardson, Nicolas Chautru

Adding functions to create and configure queues for
the device. No operation capabilities are exposed yet.
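
From the application side these callbacks are reached through the
public bbdev API. A minimal sketch, assuming a probed device id and
using RTE_BBDEV_OP_LDPC_ENC only for illustration (capabilities are
not exposed by this patch yet):

    #include <rte_bbdev.h>

    static int
    setup_one_queue(uint16_t dev_id)
    {
            struct rte_bbdev_info info;
            struct rte_bbdev_queue_conf conf = {0};

            rte_bbdev_info_get(dev_id, &info);
            if (rte_bbdev_setup_queues(dev_id, 1, info.socket_id) < 0)
                    return -1;
            conf.socket = info.socket_id;
            conf.queue_size = info.drv.queue_size_lim;
            conf.priority = 0;
            conf.op_type = RTE_BBDEV_OP_LDPC_ENC;
            return rte_bbdev_queue_configure(dev_id, 0, &conf);
    }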

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 420 ++++++++++++++++++++++++++++++-
 drivers/baseband/acc100/rte_acc100_pmd.h |  45 ++++
 2 files changed, 464 insertions(+), 1 deletion(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 7807a30..7a21c57 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -26,6 +26,22 @@
 RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
 #endif
 
+/* Write to MMIO register address */
+static inline void
+mmio_write(void *addr, uint32_t value)
+{
+	*((volatile uint32_t *)(addr)) = rte_cpu_to_le_32(value);
+}
+
+/* Write a register of an ACC100 device */
+static inline void
+acc100_reg_write(struct acc100_device *d, uint32_t offset, uint32_t payload)
+{
+	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
+	mmio_write(reg_addr, payload);
+	usleep(1000);
+}
+
 /* Read a register of an ACC100 device */
 static inline uint32_t
 acc100_reg_read(struct acc100_device *d, uint32_t offset)
@@ -36,6 +52,22 @@
 	return rte_le_to_cpu_32(ret);
 }
 
+/* Basic implementation of log2 for an exact power of two (2^N) */
+static inline uint32_t
+log2_basic(uint32_t value)
+{
+	return (value == 0) ? 0 : __builtin_ctz(value);
+}
+
+/* Calculate memory alignment offset assuming alignment is 2^N */
+static inline uint32_t
+calc_mem_alignment_offset(void *unaligned_virt_mem, uint32_t alignment)
+{
+	rte_iova_t unaligned_phy_mem = rte_malloc_virt2iova(unaligned_virt_mem);
+	return (uint32_t)(alignment -
+			(unaligned_phy_mem & (alignment-1)));
+}
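+
+/*
+ * Worked example: with alignment = 0x4000000 (64MB) and an IOVA of
+ * 0x2345000, the remainder is 0x2345000 & 0x3FFFFFF = 0x2345000, so the
+ * returned offset is 0x4000000 - 0x2345000 = 0x1CBB000 and the adjusted
+ * address lands exactly on the 0x4000000 boundary. Note an already
+ * aligned address yields an offset of one full alignment, not zero.
+ */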
+
 /* Calculate the offset of the enqueue register */
 static inline uint32_t
 queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
@@ -204,10 +236,393 @@
 			acc100_conf->q_dl_5g.aq_depth_log2);
 }
 
+static void
+free_base_addresses(void **base_addrs, int size)
+{
+	int i;
+	for (i = 0; i < size; i++)
+		rte_free(base_addrs[i]);
+}
+
+static inline uint32_t
+get_desc_len(void)
+{
+	return sizeof(union acc100_dma_desc);
+}
+
+/* Allocate the 2 * 64MB block for the sw rings */
+static int
+alloc_2x64mb_sw_rings_mem(struct rte_bbdev *dev, struct acc100_device *d,
+		int socket)
+{
+	uint32_t sw_ring_size = ACC100_SIZE_64MBYTE;
+	d->sw_rings_base = rte_zmalloc_socket(dev->device->driver->name,
+			2 * sw_ring_size, RTE_CACHE_LINE_SIZE, socket);
+	if (d->sw_rings_base == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate memory for %s:%u",
+				dev->device->driver->name,
+				dev->data->dev_id);
+		return -ENOMEM;
+	}
+	memset(d->sw_rings_base, 0, ACC100_SIZE_64MBYTE);
+	uint32_t next_64mb_align_offset = calc_mem_alignment_offset(
+			d->sw_rings_base, ACC100_SIZE_64MBYTE);
+	d->sw_rings = RTE_PTR_ADD(d->sw_rings_base, next_64mb_align_offset);
+	d->sw_rings_phys = rte_malloc_virt2iova(d->sw_rings_base) +
+			next_64mb_align_offset;
+	d->sw_ring_size = MAX_QUEUE_DEPTH * get_desc_len();
+	d->sw_ring_max_depth = d->sw_ring_size / get_desc_len();
+
+	return 0;
+}
+
+/* Attempt to allocate minimal memory space for sw rings */
+static void
+alloc_sw_rings_min_mem(struct rte_bbdev *dev, struct acc100_device *d,
+		uint16_t num_queues, int socket)
+{
+	rte_iova_t sw_rings_base_phy, next_64mb_align_addr_phy;
+	uint32_t next_64mb_align_offset;
+	rte_iova_t sw_ring_phys_end_addr;
+	void *base_addrs[SW_RING_MEM_ALLOC_ATTEMPTS];
+	void *sw_rings_base;
+	int i = 0;
+	uint32_t q_sw_ring_size = MAX_QUEUE_DEPTH * get_desc_len();
+	uint32_t dev_sw_ring_size = q_sw_ring_size * num_queues;
+
+	/* Find an aligned block of memory to store sw rings */
+	while (i < SW_RING_MEM_ALLOC_ATTEMPTS) {
+		/*
+		 * sw_ring allocated memory is guaranteed to be aligned to
+		 * q_sw_ring_size at the condition that the requested size is
+		 * less than the page size
+		 */
+		sw_rings_base = rte_zmalloc_socket(
+				dev->device->driver->name,
+				dev_sw_ring_size, q_sw_ring_size, socket);
+
+		if (sw_rings_base == NULL) {
+			rte_bbdev_log(ERR,
+					"Failed to allocate memory for %s:%u",
+					dev->device->driver->name,
+					dev->data->dev_id);
+			break;
+		}
+
+		sw_rings_base_phy = rte_malloc_virt2iova(sw_rings_base);
+		next_64mb_align_offset = calc_mem_alignment_offset(
+				sw_rings_base, ACC100_SIZE_64MBYTE);
+		next_64mb_align_addr_phy = sw_rings_base_phy +
+				next_64mb_align_offset;
+		sw_ring_phys_end_addr = sw_rings_base_phy + dev_sw_ring_size;
+
+		/* Check if the end of the sw ring memory block is before the
+		 * start of next 64MB aligned mem address
+		 */
+		if (sw_ring_phys_end_addr < next_64mb_align_addr_phy) {
+			d->sw_rings_phys = sw_rings_base_phy;
+			d->sw_rings = sw_rings_base;
+			d->sw_rings_base = sw_rings_base;
+			d->sw_ring_size = q_sw_ring_size;
+			d->sw_ring_max_depth = MAX_QUEUE_DEPTH;
+			break;
+		}
+		/* Store the address of the unaligned mem block */
+		base_addrs[i] = sw_rings_base;
+		i++;
+	}
+
+	/* Free all unaligned blocks of mem allocated in the loop */
+	free_base_addresses(base_addrs, i);
+}
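+
+/*
+ * Example of the straddle check above: a 1MB (0x100000) block placed by
+ * the allocator at IOVA 0x3F80000 would end at 0x4080000 and cross the
+ * 64MB boundary at 0x4000000, so it is parked in base_addrs[] and the
+ * allocation retried; a block at 0x3E00000 ends at 0x3F00000, fits
+ * entirely below the boundary and is kept.
+ */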
+
+
+/* Allocate 64MB memory used for all software rings */
+static int
+acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
+{
+	uint32_t phys_low, phys_high, payload;
+	struct acc100_device *d = dev->data->dev_private;
+	const struct acc100_registry_addr *reg_addr;
+
+	if (d->pf_device && !d->acc100_conf.pf_mode_en) {
+		rte_bbdev_log(NOTICE,
+				"%s has PF mode disabled. This PF can't be used.",
+				dev->data->name);
+		return -ENODEV;
+	}
+
+	alloc_sw_rings_min_mem(dev, d, num_queues, socket_id);
+
+	/* If minimal memory space approach failed, then allocate
+	 * the 2 * 64MB block for the sw rings
+	 */
+	if (d->sw_rings == NULL) {
+		if (alloc_2x64mb_sw_rings_mem(dev, d, socket_id) < 0)
+			return -ENOMEM;
+	}
+
+	/* Configure ACC100 with the base address for DMA descriptor rings
+	 * Same descriptor rings used for UL and DL DMA Engines
+	 * Note: assuming only the VF0 bundle is used in PF mode
+	 */
+	phys_high = (uint32_t)(d->sw_rings_phys >> 32);
+	phys_low  = (uint32_t)(d->sw_rings_phys & ~(ACC100_SIZE_64MBYTE-1));
+
+	/* Choose correct registry addresses for the device type */
+	if (d->pf_device)
+		reg_addr = &pf_reg_addr;
+	else
+		reg_addr = &vf_reg_addr;
+
+	/* Read the populated cfg from ACC100 registers */
+	fetch_acc100_config(dev);
+
+	/* Mark as configured properly */
+	d->configured = true;
+
+	/* Release AXI from PF */
+	if (d->pf_device)
+		acc100_reg_write(d, HWPfDmaAxiControl, 1);
+
+	acc100_reg_write(d, reg_addr->dma_ring_ul5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_ul5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->dma_ring_dl5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_dl5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->dma_ring_ul4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_ul4g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->dma_ring_dl4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_dl4g_lo, phys_low);
+
+	/*
+	 * Configure ring size to the max queue ring size
+	 * (used for wrapping purposes)
+	 */
+	payload = log2_basic(d->sw_ring_size / 64);
+	acc100_reg_write(d, reg_addr->ring_size, payload);
+
+	/* Configure tail pointer for use when SDONE enabled */
+	d->tail_ptrs = rte_zmalloc_socket(
+			dev->device->driver->name,
+			ACC100_NUM_QGRPS * ACC100_NUM_AQS * sizeof(uint32_t),
+			RTE_CACHE_LINE_SIZE, socket_id);
+	if (d->tail_ptrs == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate tail ptr for %s:%u",
+				dev->device->driver->name,
+				dev->data->dev_id);
+		rte_free(d->sw_rings);
+		return -ENOMEM;
+	}
+	d->tail_ptr_phys = rte_malloc_virt2iova(d->tail_ptrs);
+
+	phys_high = (uint32_t)(d->tail_ptr_phys >> 32);
+	phys_low  = (uint32_t)(d->tail_ptr_phys);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
+
+	d->harq_layout = rte_zmalloc_socket("HARQ Layout",
+			ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
+			RTE_CACHE_LINE_SIZE, dev->data->socket_id);
+
+	rte_bbdev_log_debug(
+			"ACC100 (%s) configured  sw_rings = %p, sw_rings_phys = %#"
+			PRIx64, dev->data->name, d->sw_rings, d->sw_rings_phys);
+
+	return 0;
+}
+
 /* Free 64MB memory used for software rings */
 static int
-acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
+acc100_dev_close(struct rte_bbdev *dev)
 {
+	struct acc100_device *d = dev->data->dev_private;
+	if (d->sw_rings_base != NULL) {
+		rte_free(d->tail_ptrs);
+		rte_free(d->sw_rings_base);
+		d->sw_rings_base = NULL;
+	}
+	usleep(1000);
+	return 0;
+}
+
+
+/**
+ * Report an ACC100 queue index which is free.
+ * Return 0 to 16k for a valid queue_idx or -1 when no queue is available.
+ * Note: only the VF0 bundle is supported in PF mode.
+ */
+static int
+acc100_find_free_queue_idx(struct rte_bbdev *dev,
+		const struct rte_bbdev_queue_conf *conf)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	int op_2_acc[5] = {0, UL_4G, DL_4G, UL_5G, DL_5G};
+	int acc = op_2_acc[conf->op_type];
+	struct rte_q_topology_t *qtop = NULL;
+	qtopFromAcc(&qtop, acc, &(d->acc100_conf));
+	if (qtop == NULL)
+		return -1;
+	/* Identify matching QGroup index; groups are sorted in priority order */
+	uint16_t group_idx = qtop->first_qgroup_index;
+	group_idx += conf->priority;
+	if (group_idx >= ACC100_NUM_QGRPS ||
+			conf->priority >= qtop->num_qgroups) {
+		rte_bbdev_log(INFO, "Invalid Priority on %s, priority %u",
+				dev->data->name, conf->priority);
+		return -1;
+	}
+	/* Find a free AQ_idx */
+	uint16_t aq_idx;
+	for (aq_idx = 0; aq_idx < qtop->num_aqs_per_groups; aq_idx++) {
+		if (((d->q_assigned_bit_map[group_idx] >> aq_idx) & 0x1) == 0) {
+			/* Mark the Queue as assigned */
+			d->q_assigned_bit_map[group_idx] |= (1 << aq_idx);
+			/* Report the AQ Index */
+			return (group_idx << GRP_ID_SHIFT) + aq_idx;
+		}
+	}
+	rte_bbdev_log(INFO, "Failed to find free queue on %s, priority %u",
+			dev->data->name, conf->priority);
+	return -1;
+}
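+
+/*
+ * The returned q_idx packs the group and atomic queue numbers, matching
+ * how acc100_queue_setup() unpacks them below:
+ *   q_idx   = (group_idx << GRP_ID_SHIFT) + aq_idx
+ *   qgrp_id = (q_idx >> GRP_ID_SHIFT) & 0xF
+ *   aq_id   = q_idx & 0xF
+ * E.g. assuming GRP_ID_SHIFT = 10 (4-bit AQ id plus 6-bit VF id), group 2
+ * and AQ 5 encode to q_idx = 0x805.
+ */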
+
+/* Setup ACC100 queue */
+static int
+acc100_queue_setup(struct rte_bbdev *dev, uint16_t queue_id,
+		const struct rte_bbdev_queue_conf *conf)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	struct acc100_queue *q;
+	int16_t q_idx;
+
+	/* Allocate the queue data structure. */
+	q = rte_zmalloc_socket(dev->device->driver->name, sizeof(*q),
+			RTE_CACHE_LINE_SIZE, conf->socket);
+	if (q == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate queue memory");
+		return -ENOMEM;
+	}
+
+	q->d = d;
+	q->ring_addr = RTE_PTR_ADD(d->sw_rings, (d->sw_ring_size * queue_id));
+	q->ring_addr_phys = d->sw_rings_phys + (d->sw_ring_size * queue_id);
+
+	/* Prepare the Ring with default descriptor format */
+	union acc100_dma_desc *desc = NULL;
+	unsigned int desc_idx, b_idx;
+	int fcw_len = (conf->op_type == RTE_BBDEV_OP_LDPC_ENC ?
+		ACC100_FCW_LE_BLEN : (conf->op_type == RTE_BBDEV_OP_TURBO_DEC ?
+		ACC100_FCW_TD_BLEN : ACC100_FCW_LD_BLEN));
+
+	for (desc_idx = 0; desc_idx < d->sw_ring_max_depth; desc_idx++) {
+		desc = q->ring_addr + desc_idx;
+		desc->req.word0 = ACC100_DMA_DESC_TYPE;
+		desc->req.word1 = 0; /**< Timestamp */
+		desc->req.word2 = 0;
+		desc->req.word3 = 0;
+		uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+		desc->req.data_ptrs[0].blen = fcw_len;
+		desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
+		desc->req.data_ptrs[0].last = 0;
+		desc->req.data_ptrs[0].dma_ext = 0;
+		for (b_idx = 1; b_idx < ACC100_DMA_MAX_NUM_POINTERS - 1;
+				b_idx++) {
+			desc->req.data_ptrs[b_idx].blkid = ACC100_DMA_BLKID_IN;
+			desc->req.data_ptrs[b_idx].last = 1;
+			desc->req.data_ptrs[b_idx].dma_ext = 0;
+			b_idx++;
+			desc->req.data_ptrs[b_idx].blkid =
+					ACC100_DMA_BLKID_OUT_ENC;
+			desc->req.data_ptrs[b_idx].last = 1;
+			desc->req.data_ptrs[b_idx].dma_ext = 0;
+		}
+		/* Preset some fields of LDPC FCW */
+		desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
+		desc->req.fcw_ld.gain_i = 1;
+		desc->req.fcw_ld.gain_h = 1;
+	}
+
+	q->lb_in = rte_zmalloc_socket(dev->device->driver->name,
+			RTE_CACHE_LINE_SIZE,
+			RTE_CACHE_LINE_SIZE, conf->socket);
+	if (q->lb_in == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate lb_in memory");
+		rte_free(q);
+		return -ENOMEM;
+	}
+	q->lb_in_addr_phys = rte_malloc_virt2iova(q->lb_in);
+	q->lb_out = rte_zmalloc_socket(dev->device->driver->name,
+			RTE_CACHE_LINE_SIZE,
+			RTE_CACHE_LINE_SIZE, conf->socket);
+	if (q->lb_out == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate lb_out memory");
+		rte_free(q->lb_in);
+		rte_free(q);
+		return -ENOMEM;
+	}
+	q->lb_out_addr_phys = rte_malloc_virt2iova(q->lb_out);
+
+	/*
+	 * Software queue ring wraps synchronously with the HW when it reaches
+	 * the boundary of the maximum allocated queue size, no matter what the
+	 * sw queue size is. This wrapping is guarded by setting the wrap_mask
+	 * to represent the maximum queue size as allocated at the time when
+	 * the device has been setup (in configure()).
+	 *
+	 * The queue depth is set to the queue size value (conf->queue_size).
+	 * This limits the occupancy of the queue at any point of time, so that
+	 * the queue does not get swamped with enqueue requests.
+	 */
+	q->sw_ring_depth = conf->queue_size;
+	q->sw_ring_wrap_mask = d->sw_ring_max_depth - 1;
+
+	q->op_type = conf->op_type;
+
+	q_idx = acc100_find_free_queue_idx(dev, conf);
+	if (q_idx == -1) {
+		rte_free(q->lb_in);
+		rte_free(q->lb_out);
+		rte_free(q);
+		return -1;
+	}
+
+	q->qgrp_id = (q_idx >> GRP_ID_SHIFT) & 0xF;
+	q->vf_id = (q_idx >> VF_ID_SHIFT)  & 0x3F;
+	q->aq_id = q_idx & 0xF;
+	q->aq_depth = (conf->op_type ==  RTE_BBDEV_OP_TURBO_DEC) ?
+			(1 << d->acc100_conf.q_ul_4g.aq_depth_log2) :
+			(1 << d->acc100_conf.q_dl_4g.aq_depth_log2);
+
+	q->mmio_reg_enqueue = RTE_PTR_ADD(d->mmio_base,
+			queue_offset(d->pf_device,
+					q->vf_id, q->qgrp_id, q->aq_id));
+
+	rte_bbdev_log_debug(
+			"Setup dev%u q%u: qgrp_id=%u, vf_id=%u, aq_id=%u, aq_depth=%u, mmio_reg_enqueue=%p",
+			dev->data->dev_id, queue_id, q->qgrp_id, q->vf_id,
+			q->aq_id, q->aq_depth, q->mmio_reg_enqueue);
+
+	dev->data->queues[queue_id].queue_private = q;
+	return 0;
+}
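+
+/*
+ * Wrap arithmetic example: with sw_ring_max_depth = 1024 the wrap mask
+ * is 0x3FF, so descriptor slots are always (sw_ring_head + i) & 0x3FF
+ * even when conf->queue_size caps the in-flight depth at a smaller
+ * power of two.
+ */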
+
+/* Release ACC100 queue */
+static int
+acc100_queue_release(struct rte_bbdev *dev, uint16_t q_id)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	struct acc100_queue *q = dev->data->queues[q_id].queue_private;
+
+	if (q != NULL) {
+		/* Mark the Queue as un-assigned */
+		d->q_assigned_bit_map[q->qgrp_id] &= (0xFFFFFFFF -
+				(1 << q->aq_id));
+		rte_free(q->lb_in);
+		rte_free(q->lb_out);
+		rte_free(q);
+		dev->data->queues[q_id].queue_private = NULL;
+	}
+
 	return 0;
 }
 
@@ -258,8 +673,11 @@
 }
 
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
+	.setup_queues = acc100_setup_queues,
 	.close = acc100_dev_close,
 	.info_get = acc100_dev_info_get,
+	.queue_setup = acc100_queue_setup,
+	.queue_release = acc100_queue_release,
 };
 
 /* ACC100 PCI PF address map */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 662e2c8..0e2b79c 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -518,11 +518,56 @@ struct acc100_registry_addr {
 	.ddr_range = HWVfDmaDdrBaseRangeRoVf,
 };
 
+/* Structure associated with each queue. */
+struct __rte_cache_aligned acc100_queue {
+	union acc100_dma_desc *ring_addr;  /* Virtual address of sw ring */
+	rte_iova_t ring_addr_phys;  /* Physical address of software ring */
+	uint32_t sw_ring_head;  /* software ring head */
+	uint32_t sw_ring_tail;  /* software ring tail */
+	/* software ring size (descriptors, not bytes) */
+	uint32_t sw_ring_depth;
+	/* mask used to wrap enqueued descriptors on the sw ring */
+	uint32_t sw_ring_wrap_mask;
+	/* MMIO register used to enqueue descriptors */
+	void *mmio_reg_enqueue;
+	uint8_t vf_id;  /* VF ID (max = 63) */
+	uint8_t qgrp_id;  /* Queue Group ID */
+	uint16_t aq_id;  /* Atomic Queue ID */
+	uint16_t aq_depth;  /* Depth of atomic queue */
+	uint32_t aq_enqueued;  /* Count how many "batches" have been enqueued */
+	uint32_t aq_dequeued;  /* Count how many "batches" have been dequeued */
+	uint32_t irq_enable;  /* Enable ops dequeue interrupts if set to 1 */
+	struct rte_mempool *fcw_mempool;  /* FCW mempool */
+	enum rte_bbdev_op_type op_type;  /* Type of operations for this queue */
+	/* Internal Buffers for loopback input */
+	uint8_t *lb_in;
+	uint8_t *lb_out;
+	rte_iova_t lb_in_addr_phys;
+	rte_iova_t lb_out_addr_phys;
+	struct acc100_device *d;
+};
+
 /* Private data structure for each ACC100 device */
 struct acc100_device {
 	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
+	void *sw_rings_base;  /* Base addr of un-aligned memory for sw rings */
+	void *sw_rings;  /* 64MB of 64MB-aligned memory for sw rings */
+	rte_iova_t sw_rings_phys;  /* Physical address of sw_rings */
+	/* Virtual address of the info memory routed to this function under
+	 * operation, whether it is PF or VF.
+	 */
+	union acc100_harq_layout_data *harq_layout;
+	uint32_t sw_ring_size;
 	uint32_t ddr_size; /* Size in kB */
+	uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */
+	rte_iova_t tail_ptr_phys; /* Physical address of tail pointers */
+	/* Max number of entries available for each queue in device, depending
+	 * on how many queues are enabled with configure()
+	 */
+	uint32_t sw_ring_max_depth;
 	struct acc100_conf acc100_conf; /* ACC100 Initial configuration */
+	/* Bitmap capturing which Queues have already been assigned */
+	uint16_t q_assigned_bit_map[ACC100_NUM_QGRPS];
 	bool pf_device; /**< True if this is a PF ACC100 device */
 	bool configured; /**< True if this ACC100 device is configured */
 };
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions
  2020-08-19  0:25 [dpdk-dev] [PATCH v3 00/11] bbdev PMD ACC100 Nicolas Chautru
                   ` (3 preceding siblings ...)
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 04/11] baseband/acc100: add queue configuration Nicolas Chautru
@ 2020-08-19  0:25 ` Nicolas Chautru
  2020-08-20 14:38   ` Dave Burley
  2020-08-29 11:10   ` Xu, Rosen
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 06/11] baseband/acc100: add HARQ loopback support Nicolas Chautru
                   ` (5 subsequent siblings)
  10 siblings, 2 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-08-19  0:25 UTC (permalink / raw)
  To: dev, akhil.goyal; +Cc: bruce.richardson, Nicolas Chautru

Adding LDPC decode and encode processing operations
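
Once these handlers are registered, the standard bbdev burst API
drives them. A minimal polling sketch (dev_id, queue_id, ops[] and
num_ops are assumed to come from the usual bbdev setup and an
RTE_BBDEV_OP_LDPC_ENC op mempool):

    uint16_t enq, deq = 0;

    enq = rte_bbdev_enqueue_ldpc_enc_ops(dev_id, queue_id, ops, num_ops);
    while (deq < enq)
            deq += rte_bbdev_dequeue_ldpc_enc_ops(dev_id, queue_id,
                            &ops[deq], enq - deq);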

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 1625 +++++++++++++++++++++++++++++-
 drivers/baseband/acc100/rte_acc100_pmd.h |    3 +
 2 files changed, 1626 insertions(+), 2 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 7a21c57..5f32813 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -15,6 +15,9 @@
 #include <rte_hexdump.h>
 #include <rte_pci.h>
 #include <rte_bus_pci.h>
+#ifdef RTE_BBDEV_OFFLOAD_COST
+#include <rte_cycles.h>
+#endif
 
 #include <rte_bbdev.h>
 #include <rte_bbdev_pmd.h>
@@ -449,7 +452,6 @@
 	return 0;
 }
 
-
 /**
  * Report an ACC100 queue index which is free.
  * Return 0 to 16k for a valid queue_idx or -1 when no queue is available.
@@ -634,6 +636,46 @@
 	struct acc100_device *d = dev->data->dev_private;
 
 	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
+		{
+			.type   = RTE_BBDEV_OP_LDPC_ENC,
+			.cap.ldpc_enc = {
+				.capability_flags =
+					RTE_BBDEV_LDPC_RATE_MATCH |
+					RTE_BBDEV_LDPC_CRC_24B_ATTACH |
+					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
+				.num_buffers_src =
+						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+				.num_buffers_dst =
+						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+			}
+		},
+		{
+			.type   = RTE_BBDEV_OP_LDPC_DEC,
+			.cap.ldpc_dec = {
+			.capability_flags =
+				RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK |
+				RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP |
+				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
+				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
+#ifdef ACC100_EXT_MEM
+				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
+				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
+#endif
+				RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE |
+				RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS |
+				RTE_BBDEV_LDPC_DECODE_BYPASS |
+				RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
+				RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
+				RTE_BBDEV_LDPC_LLR_COMPRESSION,
+			.llr_size = 8,
+			.llr_decimals = 1,
+			.num_buffers_src =
+					RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+			.num_buffers_hard_out =
+					RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+			.num_buffers_soft_out = 0,
+			}
+		},
 		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
 	};
 
@@ -669,9 +711,14 @@
 	dev_info->cpu_flag_reqs = NULL;
 	dev_info->min_alignment = 64;
 	dev_info->capabilities = bbdev_capabilities;
+#ifdef ACC100_EXT_MEM
 	dev_info->harq_buffer_size = d->ddr_size;
+#else
+	dev_info->harq_buffer_size = 0;
+#endif
 }
 
+
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.setup_queues = acc100_setup_queues,
 	.close = acc100_dev_close,
@@ -696,6 +743,1577 @@
 	{.device_id = 0},
 };
 
+/* Read flag value 0/1 from bitmap */
+static inline bool
+check_bit(uint32_t bitmap, uint32_t bitmask)
+{
+	return bitmap & bitmask;
+}
+
+static inline char *
+mbuf_append(struct rte_mbuf *m_head, struct rte_mbuf *m, uint16_t len)
+{
+	if (unlikely(len > rte_pktmbuf_tailroom(m)))
+		return NULL;
+
+	char *tail = (char *)m->buf_addr + m->data_off + m->data_len;
+	m->data_len = (uint16_t)(m->data_len + len);
+	m_head->pkt_len  = (m_head->pkt_len + len);
+	return tail;
+}
+
+/* Compute value of k0.
+ * Based on 3GPP 38.212 Table 5.4.2.1-2
+ * Starting position of different redundancy versions, k0
+ */
+static inline uint16_t
+get_k0(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv_index)
+{
+	if (rv_index == 0)
+		return 0;
+	uint16_t n = (bg == 1 ? N_ZC_1 : N_ZC_2) * z_c;
+	if (n_cb == n) {
+		if (rv_index == 1)
+			return (bg == 1 ? K0_1_1 : K0_1_2) * z_c;
+		else if (rv_index == 2)
+			return (bg == 1 ? K0_2_1 : K0_2_2) * z_c;
+		else
+			return (bg == 1 ? K0_3_1 : K0_3_2) * z_c;
+	}
+	/* LBRM case - includes a division by N */
+	if (rv_index == 1)
+		return (((bg == 1 ? K0_1_1 : K0_1_2) * n_cb)
+				/ n) * z_c;
+	else if (rv_index == 2)
+		return (((bg == 1 ? K0_2_1 : K0_2_2) * n_cb)
+				/ n) * z_c;
+	else
+		return (((bg == 1 ? K0_3_1 : K0_3_2) * n_cb)
+				/ n) * z_c;
+}
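+
+/*
+ * Worked example, assuming the K0_x_y constants carry the numerators of
+ * 3GPP 38.212 Table 5.4.2.1-2 (e.g. K0_2_1 = 33, N_ZC_1 = 66 for BG1):
+ * with bg = 1, z_c = 128 and a full buffer n_cb = 66 * 128 = 8448,
+ * rv_index = 2 returns 33 * 128 = 4224. In the LBRM branch the integer
+ * division by n floors the start position to a multiple of z_c.
+ */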
+
+/* Fill in a frame control word for LDPC encoding. */
+static inline void
+acc100_fcw_le_fill(const struct rte_bbdev_enc_op *op,
+		struct acc100_fcw_le *fcw, int num_cb)
+{
+	fcw->qm = op->ldpc_enc.q_m;
+	fcw->nfiller = op->ldpc_enc.n_filler;
+	fcw->BG = (op->ldpc_enc.basegraph - 1);
+	fcw->Zc = op->ldpc_enc.z_c;
+	fcw->ncb = op->ldpc_enc.n_cb;
+	fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_enc.basegraph,
+			op->ldpc_enc.rv_index);
+	fcw->rm_e = op->ldpc_enc.cb_params.e;
+	fcw->crc_select = check_bit(op->ldpc_enc.op_flags,
+			RTE_BBDEV_LDPC_CRC_24B_ATTACH);
+	fcw->bypass_intlv = check_bit(op->ldpc_enc.op_flags,
+			RTE_BBDEV_LDPC_INTERLEAVER_BYPASS);
+	fcw->mcb_count = num_cb;
+}
+
+/* Fill in a frame control word for LDPC decoding. */
+static inline void
+acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
+		union acc100_harq_layout_data *harq_layout)
+{
+	uint16_t harq_out_length, harq_in_length, ncb_p, k0_p, parity_offset;
+	uint16_t harq_index;
+	uint32_t l;
+	bool harq_prun = false;
+
+	fcw->qm = op->ldpc_dec.q_m;
+	fcw->nfiller = op->ldpc_dec.n_filler;
+	fcw->BG = (op->ldpc_dec.basegraph - 1);
+	fcw->Zc = op->ldpc_dec.z_c;
+	fcw->ncb = op->ldpc_dec.n_cb;
+	fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_dec.basegraph,
+			op->ldpc_dec.rv_index);
+	if (op->ldpc_dec.code_block_mode == 1)
+		fcw->rm_e = op->ldpc_dec.cb_params.e;
+	else
+		fcw->rm_e = (op->ldpc_dec.tb_params.r <
+				op->ldpc_dec.tb_params.cab) ?
+						op->ldpc_dec.tb_params.ea :
+						op->ldpc_dec.tb_params.eb;
+
+	fcw->hcin_en = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE);
+	fcw->hcout_en = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE);
+	fcw->crc_select = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK);
+	fcw->bypass_dec = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_DECODE_BYPASS);
+	fcw->bypass_intlv = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS);
+	if (op->ldpc_dec.q_m == 1) {
+		fcw->bypass_intlv = 1;
+		fcw->qm = 2;
+	}
+	fcw->hcin_decomp_mode = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+	fcw->hcout_comp_mode = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+	fcw->llr_pack_mode = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_LLR_COMPRESSION);
+	harq_index = op->ldpc_dec.harq_combined_output.offset /
+			ACC100_HARQ_OFFSET;
+#ifdef ACC100_EXT_MEM
+	/* Limit cases when HARQ pruning is valid */
+	harq_prun = ((op->ldpc_dec.harq_combined_output.offset %
+			ACC100_HARQ_OFFSET) == 0) &&
+			(op->ldpc_dec.harq_combined_output.offset <= UINT16_MAX
+			* ACC100_HARQ_OFFSET);
+#endif
+	if (fcw->hcin_en > 0) {
+		harq_in_length = op->ldpc_dec.harq_combined_input.length;
+		if (fcw->hcin_decomp_mode > 0)
+			harq_in_length = harq_in_length * 8 / 6;
+		harq_in_length = RTE_ALIGN(harq_in_length, 64);
+		if ((harq_layout[harq_index].offset > 0) && harq_prun) {
+			rte_bbdev_log_debug("HARQ IN offset unexpected for now\n");
+			fcw->hcin_size0 = harq_layout[harq_index].size0;
+			fcw->hcin_offset = harq_layout[harq_index].offset;
+			fcw->hcin_size1 = harq_in_length -
+					harq_layout[harq_index].offset;
+		} else {
+			fcw->hcin_size0 = harq_in_length;
+			fcw->hcin_offset = 0;
+			fcw->hcin_size1 = 0;
+		}
+	} else {
+		fcw->hcin_size0 = 0;
+		fcw->hcin_offset = 0;
+		fcw->hcin_size1 = 0;
+	}
+
+	fcw->itmax = op->ldpc_dec.iter_max;
+	fcw->itstop = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE);
+	fcw->synd_precoder = fcw->itstop;
+	/*
+	 * These are all implicitly set
+	 * fcw->synd_post = 0;
+	 * fcw->so_en = 0;
+	 * fcw->so_bypass_rm = 0;
+	 * fcw->so_bypass_intlv = 0;
+	 * fcw->dec_convllr = 0;
+	 * fcw->hcout_convllr = 0;
+	 * fcw->hcout_size1 = 0;
+	 * fcw->so_it = 0;
+	 * fcw->hcout_offset = 0;
+	 * fcw->negstop_th = 0;
+	 * fcw->negstop_it = 0;
+	 * fcw->negstop_en = 0;
+	 * fcw->gain_i = 1;
+	 * fcw->gain_h = 1;
+	 */
+	if (fcw->hcout_en > 0) {
+		parity_offset = (op->ldpc_dec.basegraph == 1 ? 20 : 8)
+			* op->ldpc_dec.z_c - op->ldpc_dec.n_filler;
+		k0_p = (fcw->k0 > parity_offset) ?
+				fcw->k0 - op->ldpc_dec.n_filler : fcw->k0;
+		ncb_p = fcw->ncb - op->ldpc_dec.n_filler;
+		l = k0_p + fcw->rm_e;
+		harq_out_length = (uint16_t) fcw->hcin_size0;
+		harq_out_length = RTE_MIN(RTE_MAX(harq_out_length, l), ncb_p);
+		harq_out_length = (harq_out_length + 0x3F) & 0xFFC0;
+		if ((k0_p > fcw->hcin_size0 + ACC100_HARQ_OFFSET_THRESHOLD) &&
+				harq_prun) {
+			fcw->hcout_size0 = (uint16_t) fcw->hcin_size0;
+			fcw->hcout_offset = k0_p & 0xFFC0;
+			fcw->hcout_size1 = harq_out_length - fcw->hcout_offset;
+		} else {
+			fcw->hcout_size0 = harq_out_length;
+			fcw->hcout_size1 = 0;
+			fcw->hcout_offset = 0;
+		}
+		harq_layout[harq_index].offset = fcw->hcout_offset;
+		harq_layout[harq_index].size0 = fcw->hcout_size0;
+	} else {
+		fcw->hcout_size0 = 0;
+		fcw->hcout_size1 = 0;
+		fcw->hcout_offset = 0;
+	}
+}
+
+/**
+ * Fills descriptor with data pointers of one block type.
+ *
+ * @param desc
+ *   Pointer to DMA descriptor.
+ * @param input
+ *   Pointer to pointer to input data which will be encoded. It can be changed
+ *   and points to next segment in scatter-gather case.
+ * @param offset
+ *   Input offset in rte_mbuf structure. It is used for calculating the point
+ *   where data is starting.
+ * @param cb_len
+ *   Length of currently processed Code Block
+ * @param seg_total_left
+ *   It indicates how many bytes still left in segment (mbuf) for further
+ *   processing.
+ * @param next_triplet
+ *   Index for ACC100 DMA Descriptor triplet
+ *
+ * @return
+ *   Index of the next triplet on success, a negative value if the lengths
+ *   of the pkt and the processed CB do not match.
+ *
+ */
+static inline int
+acc100_dma_fill_blk_type_in(struct acc100_dma_req_desc *desc,
+		struct rte_mbuf **input, uint32_t *offset, uint32_t cb_len,
+		uint32_t *seg_total_left, int next_triplet)
+{
+	uint32_t part_len;
+	struct rte_mbuf *m = *input;
+
+	part_len = (*seg_total_left < cb_len) ? *seg_total_left : cb_len;
+	cb_len -= part_len;
+	*seg_total_left -= part_len;
+
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(m, *offset);
+	desc->data_ptrs[next_triplet].blen = part_len;
+	desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
+	desc->data_ptrs[next_triplet].last = 0;
+	desc->data_ptrs[next_triplet].dma_ext = 0;
+	*offset += part_len;
+	next_triplet++;
+
+	while (cb_len > 0) {
+		if (next_triplet < ACC100_DMA_MAX_NUM_POINTERS &&
+				m->next != NULL) {
+
+			m = m->next;
+			*seg_total_left = rte_pktmbuf_data_len(m);
+			part_len = (*seg_total_left < cb_len) ?
+					*seg_total_left :
+					cb_len;
+			desc->data_ptrs[next_triplet].address =
+					rte_pktmbuf_mtophys(m);
+			desc->data_ptrs[next_triplet].blen = part_len;
+			desc->data_ptrs[next_triplet].blkid =
+					ACC100_DMA_BLKID_IN;
+			desc->data_ptrs[next_triplet].last = 0;
+			desc->data_ptrs[next_triplet].dma_ext = 0;
+			cb_len -= part_len;
+			*seg_total_left -= part_len;
+			/* Initializing offset for next segment (mbuf) */
+			*offset = part_len;
+			next_triplet++;
+		} else {
+			rte_bbdev_log(ERR,
+				"Some data still left for processing: "
+				"data_left: %u, next_triplet: %u, next_mbuf: %p",
+				cb_len, next_triplet, m->next);
+			return -EINVAL;
+		}
+	}
+	/* Storing new mbuf as it could be changed in scatter-gather case */
+	*input = m;
+
+	return next_triplet;
+}
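+
+/*
+ * Scatter-gather example: a 6144-byte CB starting in a segment with only
+ * 4096 bytes left consumes one triplet with blen = 4096, then walks to
+ * m->next for a second triplet with blen = 2048; a single CB can thus
+ * occupy several data_ptrs[] entries of the descriptor.
+ */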
+
+/* Fills descriptor with data pointers of one block type.
+ * Returns index of next triplet on success, other value if lengths of
+ * output data and processed mbuf do not match.
+ */
+static inline int
+acc100_dma_fill_blk_type_out(struct acc100_dma_req_desc *desc,
+		struct rte_mbuf *output, uint32_t out_offset,
+		uint32_t output_len, int next_triplet, int blk_id)
+{
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(output, out_offset);
+	desc->data_ptrs[next_triplet].blen = output_len;
+	desc->data_ptrs[next_triplet].blkid = blk_id;
+	desc->data_ptrs[next_triplet].last = 0;
+	desc->data_ptrs[next_triplet].dma_ext = 0;
+	next_triplet++;
+
+	return next_triplet;
+}
+
+static inline int
+acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
+		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
+		struct rte_mbuf *output, uint32_t *in_offset,
+		uint32_t *out_offset, uint32_t *out_length,
+		uint32_t *mbuf_total_left, uint32_t *seg_total_left)
+{
+	int next_triplet = 1; /* FCW already done */
+	uint16_t K, in_length_in_bits, in_length_in_bytes;
+	struct rte_bbdev_op_ldpc_enc *enc = &op->ldpc_enc;
+
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /**< Timestamp could be disabled */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+
+	K = (enc->basegraph == 1 ? 22 : 10) * enc->z_c;
+	in_length_in_bits = K - enc->n_filler;
+	if ((enc->op_flags & RTE_BBDEV_LDPC_CRC_24A_ATTACH) ||
+			(enc->op_flags & RTE_BBDEV_LDPC_CRC_24B_ATTACH))
+		in_length_in_bits -= 24;
+	in_length_in_bytes = in_length_in_bits >> 3;
+
+	if (unlikely((*mbuf_total_left == 0) ||
+			(*mbuf_total_left < in_length_in_bytes))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, in_length_in_bytes);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
+			in_length_in_bytes,
+			seg_total_left, next_triplet);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= in_length_in_bytes;
+
+	/* Set output length */
+	/* Integer round up division by 8 */
+	*out_length = (enc->cb_params.e + 7) >> 3;
+
+	next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
+			*out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	op->ldpc_enc.output.length += *out_length;
+	*out_offset += *out_length;
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->data_ptrs[next_triplet - 1].dma_ext = 0;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
+
+static inline int
+acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
+		struct acc100_dma_req_desc *desc,
+		struct rte_mbuf **input, struct rte_mbuf *h_output,
+		uint32_t *in_offset, uint32_t *h_out_offset,
+		uint32_t *h_out_length, uint32_t *mbuf_total_left,
+		uint32_t *seg_total_left,
+		struct acc100_fcw_ld *fcw)
+{
+	struct rte_bbdev_op_ldpc_dec *dec = &op->ldpc_dec;
+	int next_triplet = 1; /* FCW already done */
+	uint32_t input_length;
+	uint16_t output_length, crc24_overlap = 0;
+	uint16_t sys_cols, K, h_p_size, h_np_size;
+	bool h_comp = check_bit(dec->op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /**< Timestamp could be disabled */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+
+	if (check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP))
+		crc24_overlap = 24;
+
+	/* Compute some LDPC BG lengths */
+	input_length = dec->cb_params.e;
+	if (check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_LLR_COMPRESSION))
+		input_length = (input_length * 3 + 3) / 4;
+	sys_cols = (dec->basegraph == 1) ? 22 : 10;
+	K = sys_cols * dec->z_c;
+	output_length = K - dec->n_filler - crc24_overlap;
+
+	if (unlikely((*mbuf_total_left == 0) ||
+			(*mbuf_total_left < input_length))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, input_length);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input,
+			in_offset, input_length,
+			seg_total_left, next_triplet);
+
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
+		h_p_size = fcw->hcin_size0 + fcw->hcin_size1;
+		if (h_comp)
+			h_p_size = (h_p_size * 3 + 3) / 4;
+		desc->data_ptrs[next_triplet].address =
+				dec->harq_combined_input.offset;
+		desc->data_ptrs[next_triplet].blen = h_p_size;
+		desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN_HARQ;
+		desc->data_ptrs[next_triplet].dma_ext = 1;
+#ifndef ACC100_EXT_MEM
+		acc100_dma_fill_blk_type_out(
+				desc,
+				op->ldpc_dec.harq_combined_input.data,
+				op->ldpc_dec.harq_combined_input.offset,
+				h_p_size,
+				next_triplet,
+				ACC100_DMA_BLKID_IN_HARQ);
+#endif
+		next_triplet++;
+	}
+
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= input_length;
+
+	next_triplet = acc100_dma_fill_blk_type_out(desc, h_output,
+			*h_out_offset, output_length >> 3, next_triplet,
+			ACC100_DMA_BLKID_OUT_HARD);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
+		/* Pruned size of the HARQ */
+		h_p_size = fcw->hcout_size0 + fcw->hcout_size1;
+		/* Non-Pruned size of the HARQ */
+		h_np_size = fcw->hcout_offset > 0 ?
+				fcw->hcout_offset + fcw->hcout_size1 :
+				h_p_size;
+		if (h_comp) {
+			h_np_size = (h_np_size * 3 + 3) / 4;
+			h_p_size = (h_p_size * 3 + 3) / 4;
+		}
+		dec->harq_combined_output.length = h_np_size;
+		desc->data_ptrs[next_triplet].address =
+				dec->harq_combined_output.offset;
+		desc->data_ptrs[next_triplet].blen = h_p_size;
+		desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_OUT_HARQ;
+		desc->data_ptrs[next_triplet].dma_ext = 1;
+#ifndef ACC100_EXT_MEM
+		acc100_dma_fill_blk_type_out(
+				desc,
+				dec->harq_combined_output.data,
+				dec->harq_combined_output.offset,
+				h_p_size,
+				next_triplet,
+				ACC100_DMA_BLKID_OUT_HARQ);
+#endif
+		next_triplet++;
+	}
+
+	*h_out_length = output_length >> 3;
+	dec->hard_output.length += *h_out_length;
+	*h_out_offset += *h_out_length;
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
+
+static inline void
+acc100_dma_desc_ld_update(struct rte_bbdev_dec_op *op,
+		struct acc100_dma_req_desc *desc,
+		struct rte_mbuf *input, struct rte_mbuf *h_output,
+		uint32_t *in_offset, uint32_t *h_out_offset,
+		uint32_t *h_out_length,
+		union acc100_harq_layout_data *harq_layout)
+{
+	int next_triplet = 1; /* FCW already done */
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(input, *in_offset);
+	next_triplet++;
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
+		struct rte_bbdev_op_data hi = op->ldpc_dec.harq_combined_input;
+		desc->data_ptrs[next_triplet].address = hi.offset;
+#ifndef ACC100_EXT_MEM
+		desc->data_ptrs[next_triplet].address =
+				rte_pktmbuf_iova_offset(hi.data, hi.offset);
+#endif
+		next_triplet++;
+	}
+
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(h_output, *h_out_offset);
+	*h_out_length = desc->data_ptrs[next_triplet].blen;
+	next_triplet++;
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
+		desc->data_ptrs[next_triplet].address =
+				op->ldpc_dec.harq_combined_output.offset;
+		/* Adjust based on previous operation */
+		struct rte_bbdev_dec_op *prev_op = desc->op_addr;
+		op->ldpc_dec.harq_combined_output.length =
+				prev_op->ldpc_dec.harq_combined_output.length;
+		int16_t hq_idx = op->ldpc_dec.harq_combined_output.offset /
+				ACC100_HARQ_OFFSET;
+		int16_t prev_hq_idx =
+				prev_op->ldpc_dec.harq_combined_output.offset
+				/ ACC100_HARQ_OFFSET;
+		harq_layout[hq_idx].val = harq_layout[prev_hq_idx].val;
+#ifndef ACC100_EXT_MEM
+		struct rte_bbdev_op_data ho =
+				op->ldpc_dec.harq_combined_output;
+		desc->data_ptrs[next_triplet].address =
+				rte_pktmbuf_iova_offset(ho.data, ho.offset);
+#endif
+		next_triplet++;
+	}
+
+	op->ldpc_dec.hard_output.length += *h_out_length;
+	desc->op_addr = op;
+}
+
+
+/* Enqueue a number of operations to HW and update software rings */
+static inline void
+acc100_dma_enqueue(struct acc100_queue *q, uint16_t n,
+		struct rte_bbdev_stats *queue_stats)
+{
+	union acc100_enqueue_reg_fmt enq_req;
+#ifdef RTE_BBDEV_OFFLOAD_COST
+	uint64_t start_time = 0;
+	queue_stats->acc_offload_cycles = 0;
+#else
+	RTE_SET_USED(queue_stats);
+#endif
+
+	enq_req.val = 0;
+	/* Setting offset, 100b for 256 DMA Desc */
+	enq_req.addr_offset = ACC100_DESC_OFFSET;
+
+	/* Split ops into batches */
+	do {
+		union acc100_dma_desc *desc;
+		uint16_t enq_batch_size;
+		uint64_t offset;
+		rte_iova_t req_elem_addr;
+
+		enq_batch_size = RTE_MIN(n, MAX_ENQ_BATCH_SIZE);
+
+		/* Set flag on last descriptor in a batch */
+		desc = q->ring_addr + ((q->sw_ring_head + enq_batch_size - 1) &
+				q->sw_ring_wrap_mask);
+		desc->req.last_desc_in_batch = 1;
+
+		/* Calculate the 1st descriptor's address */
+		offset = ((q->sw_ring_head & q->sw_ring_wrap_mask) *
+				sizeof(union acc100_dma_desc));
+		req_elem_addr = q->ring_addr_phys + offset;
+
+		/* Fill enqueue struct */
+		enq_req.num_elem = enq_batch_size;
+		/* low 6 bits are not needed */
+		enq_req.req_elem_addr = (uint32_t)(req_elem_addr >> 6);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "Req sdone", desc, sizeof(*desc));
+#endif
+		rte_bbdev_log_debug(
+				"Enqueue %u reqs (phys %#"PRIx64") to reg %p",
+				enq_batch_size,
+				req_elem_addr,
+				(void *)q->mmio_reg_enqueue);
+
+		rte_wmb();
+
+#ifdef RTE_BBDEV_OFFLOAD_COST
+		/* Start time measurement for enqueue function offload. */
+		start_time = rte_rdtsc_precise();
+#endif
+		rte_bbdev_log(DEBUG, "Debug : MMIO Enqueue");
+		mmio_write(q->mmio_reg_enqueue, enq_req.val);
+
+#ifdef RTE_BBDEV_OFFLOAD_COST
+		queue_stats->acc_offload_cycles +=
+				rte_rdtsc_precise() - start_time;
+#endif
+
+		q->aq_enqueued++;
+		q->sw_ring_head += enq_batch_size;
+		n -= enq_batch_size;
+
+	} while (n);
+
+}
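+
+/*
+ * The doorbell is a single 32-bit MMIO write: enq_req packs the batch
+ * size (num_elem), the descriptor stride code (addr_offset) and the
+ * first descriptor's IOVA shifted right by 6 bits, which is why the
+ * ring memory must be at least 64-byte aligned.
+ */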
+
+/* Enqueue a group of encode operations for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
+		uint16_t total_enqueued_cbs, int16_t num)
+{
+	union acc100_dma_desc *desc = NULL;
+	uint32_t out_length;
+	struct rte_mbuf *output_head, *output;
+	int i, next_triplet;
+	uint16_t  in_length_in_bytes;
+	struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_le_fill(ops[0], &desc->req.fcw_le, num);
+
+	/* This could be done at polling time */
+	desc->req.word0 = ACC100_DMA_DESC_TYPE;
+	desc->req.word1 = 0; /**< Timestamp could be disabled */
+	desc->req.word2 = 0;
+	desc->req.word3 = 0;
+	desc->req.numCBs = num;
+
+	in_length_in_bytes = ops[0]->ldpc_enc.input.data->data_len;
+	out_length = (enc->cb_params.e + 7) >> 3;
+	desc->req.m2dlen = 1 + num;
+	desc->req.d2mlen = num;
+	next_triplet = 1;
+
+	for (i = 0; i < num; i++) {
+		desc->req.data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(ops[i]->ldpc_enc.input.data, 0);
+		desc->req.data_ptrs[next_triplet].blen = in_length_in_bytes;
+		next_triplet++;
+		desc->req.data_ptrs[next_triplet].address =
+				rte_pktmbuf_iova_offset(
+				ops[i]->ldpc_enc.output.data, 0);
+		desc->req.data_ptrs[next_triplet].blen = out_length;
+		next_triplet++;
+		ops[i]->ldpc_enc.output.length = out_length;
+		output_head = output = ops[i]->ldpc_enc.output.data;
+		mbuf_append(output_head, output, out_length);
+		output->data_len = out_length;
+	}
+
+	desc->req.op_addr = ops[0];
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_le,
+			sizeof(desc->req.fcw_le) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+	/* A batch of CBs (one per op) was successfully prepared to enqueue */
+	return num;
+}
+
+/* Enqueue one encode operation for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+		uint16_t total_enqueued_cbs)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+		seg_total_left;
+	struct rte_mbuf *input, *output_head, *output;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_le_fill(op, &desc->req.fcw_le, 1);
+
+	input = op->ldpc_enc.input.data;
+	output_head = output = op->ldpc_enc.output.data;
+	in_offset = op->ldpc_enc.input.offset;
+	out_offset = op->ldpc_enc.output.offset;
+	out_length = 0;
+	mbuf_total_left = op->ldpc_enc.input.length;
+	seg_total_left = rte_pktmbuf_data_len(op->ldpc_enc.input.data)
+			- in_offset;
+
+	ret = acc100_dma_desc_le_fill(op, &desc->req, &input, output,
+			&in_offset, &out_offset, &out_length, &mbuf_total_left,
+			&seg_total_left);
+
+	if (unlikely(ret < 0))
+		return ret;
+
+	mbuf_append(output_head, output, out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_le,
+			sizeof(desc->req.fcw_le) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+
+	/* Check if any data left after processing one CB */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some date still left after processing one CB: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+/** Enqueue one decode operation for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs, bool same_op)
+{
+	int ret;
+
+	union acc100_dma_desc *desc;
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	struct rte_mbuf *input, *h_output_head, *h_output;
+	uint32_t in_offset, h_out_offset, h_out_length, mbuf_total_left;
+	input = op->ldpc_dec.input.data;
+	h_output_head = h_output = op->ldpc_dec.hard_output.data;
+	in_offset = op->ldpc_dec.input.offset;
+	h_out_offset = op->ldpc_dec.hard_output.offset;
+	mbuf_total_left = op->ldpc_dec.input.length;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(input == NULL)) {
+		rte_bbdev_log(ERR, "Invalid mbuf pointer");
+		return -EFAULT;
+	}
+#endif
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+
+	if (same_op) {
+		union acc100_dma_desc *prev_desc;
+		desc_idx = ((q->sw_ring_head + total_enqueued_cbs - 1)
+				& q->sw_ring_wrap_mask);
+		prev_desc = q->ring_addr + desc_idx;
+		uint8_t *prev_ptr = (uint8_t *) prev_desc;
+		uint8_t *new_ptr = (uint8_t *) desc;
+		/* Copy first 4 words and BDESCs */
+		rte_memcpy(new_ptr, prev_ptr, 16);
+		rte_memcpy(new_ptr + 36, prev_ptr + 36, 40);
+		desc->req.op_addr = prev_desc->req.op_addr;
+		/* Copy FCW */
+		rte_memcpy(new_ptr + ACC100_DESC_FCW_OFFSET,
+				prev_ptr + ACC100_DESC_FCW_OFFSET,
+				ACC100_FCW_LD_BLEN);
+		acc100_dma_desc_ld_update(op, &desc->req, input, h_output,
+				&in_offset, &h_out_offset,
+				&h_out_length, harq_layout);
+	} else {
+		struct acc100_fcw_ld *fcw;
+		uint32_t seg_total_left;
+		fcw = &desc->req.fcw_ld;
+		acc100_fcw_ld_fill(op, fcw, harq_layout);
+
+		/* Special handling when overusing mbuf */
+		if (fcw->rm_e < MAX_E_MBUF)
+			seg_total_left = rte_pktmbuf_data_len(input)
+					- in_offset;
+		else
+			seg_total_left = fcw->rm_e;
+
+		ret = acc100_dma_desc_ld_fill(op, &desc->req, &input, h_output,
+				&in_offset, &h_out_offset,
+				&h_out_length, &mbuf_total_left,
+				&seg_total_left, fcw);
+		if (unlikely(ret < 0))
+			return ret;
+	}
+
+	/* Hard output */
+	mbuf_append(h_output_head, h_output, h_out_length);
+#ifndef ACC100_EXT_MEM
+	if (op->ldpc_dec.harq_combined_output.length > 0) {
+		/* Push the HARQ output into host memory */
+		struct rte_mbuf *hq_output_head, *hq_output;
+		hq_output_head = op->ldpc_dec.harq_combined_output.data;
+		hq_output = op->ldpc_dec.harq_combined_output.data;
+		mbuf_append(hq_output_head, hq_output,
+				op->ldpc_dec.harq_combined_output.length);
+	}
+#endif
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
+			sizeof(desc->req.fcw_ld) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+
+/* Enqueue one decode operation for ACC100 device in TB mode */
+static inline int
+enqueue_ldpc_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint8_t r, c;
+	uint32_t in_offset, h_out_offset,
+		h_out_length, mbuf_total_left, seg_total_left;
+	struct rte_mbuf *input, *h_output_head, *h_output;
+	uint16_t current_enqueued_cbs = 0;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+	acc100_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout);
+
+	input = op->ldpc_dec.input.data;
+	h_output_head = h_output = op->ldpc_dec.hard_output.data;
+	in_offset = op->ldpc_dec.input.offset;
+	h_out_offset = op->ldpc_dec.hard_output.offset;
+	h_out_length = 0;
+	mbuf_total_left = op->ldpc_dec.input.length;
+	c = op->ldpc_dec.tb_params.c;
+	r = op->ldpc_dec.tb_params.r;
+
+	while (mbuf_total_left > 0 && r < c) {
+
+		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+
+		/* Set up DMA descriptor */
+		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+				& q->sw_ring_wrap_mask);
+		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+		desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
+		ret = acc100_dma_desc_ld_fill(op, &desc->req, &input,
+				h_output, &in_offset, &h_out_offset,
+				&h_out_length,
+				&mbuf_total_left, &seg_total_left,
+				&desc->req.fcw_ld);
+
+		if (unlikely(ret < 0))
+			return ret;
+
+		/* Hard output */
+		mbuf_append(h_output_head, h_output, h_out_length);
+
+		/* Set total number of CBs in TB */
+		desc->req.cbs_in_tb = cbs_in_tb;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "FCW", &desc->req.fcw_td,
+				sizeof(desc->req.fcw_td) - 8);
+		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+		if (seg_total_left == 0) {
+			/* Go to the next mbuf */
+			input = input->next;
+			in_offset = 0;
+			h_output = h_output->next;
+			h_out_offset = 0;
+		}
+		total_enqueued_cbs++;
+		current_enqueued_cbs++;
+		r++;
+	}
+
+	if (unlikely(desc == NULL))
+		return current_enqueued_cbs;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Check if any CBs left for processing */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some date still left for processing: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+	/* Set SDone on last CB descriptor for TB mode */
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	return current_enqueued_cbs;
+}
+
+
+/* Calculates number of CBs in processed encoder TB based on 'r' and input
+ * length.
+ */
+static inline uint8_t
+get_num_cbs_in_tb_enc(struct rte_bbdev_op_turbo_enc *turbo_enc)
+{
+	uint8_t c, c_neg, r, crc24_bits = 0;
+	uint16_t k, k_neg, k_pos;
+	uint8_t cbs_in_tb = 0;
+	int32_t length;
+
+	length = turbo_enc->input.length;
+	r = turbo_enc->tb_params.r;
+	c = turbo_enc->tb_params.c;
+	c_neg = turbo_enc->tb_params.c_neg;
+	k_neg = turbo_enc->tb_params.k_neg;
+	k_pos = turbo_enc->tb_params.k_pos;
+	crc24_bits = 0;
+	if (check_bit(turbo_enc->op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH))
+		crc24_bits = 24;
+	while (length > 0 && r < c) {
+		k = (r < c_neg) ? k_neg : k_pos;
+		length -= (k - crc24_bits) >> 3;
+		r++;
+		cbs_in_tb++;
+	}
+
+	return cbs_in_tb;
+}
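+
+/*
+ * Worked example: a TB with c = 3, r = 0, k_pos = k_neg = 6144 and
+ * CRC24B attached consumes (6144 - 24) >> 3 = 765 input bytes per CB,
+ * so input.length = 2295 yields cbs_in_tb = 3.
+ */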
+
+/* Calculates number of CBs in processed decoder TB based on 'r' and input
+ * length.
+ */
+static inline uint16_t
+get_num_cbs_in_tb_dec(struct rte_bbdev_op_turbo_dec *turbo_dec)
+{
+	uint8_t c, c_neg, r = 0;
+	uint16_t kw, k, k_neg, k_pos, cbs_in_tb = 0;
+	int32_t length;
+
+	length = turbo_dec->input.length;
+	r = turbo_dec->tb_params.r;
+	c = turbo_dec->tb_params.c;
+	c_neg = turbo_dec->tb_params.c_neg;
+	k_neg = turbo_dec->tb_params.k_neg;
+	k_pos = turbo_dec->tb_params.k_pos;
+	while (length > 0 && r < c) {
+		k = (r < c_neg) ? k_neg : k_pos;
+		kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
+		length -= kw;
+		r++;
+		cbs_in_tb++;
+	}
+
+	return cbs_in_tb;
+}
+
+/* Calculates number of CBs in processed decoder TB based on 'r' and input
+ * length.
+ */
+static inline uint16_t
+get_num_cbs_in_tb_ldpc_dec(struct rte_bbdev_op_ldpc_dec *ldpc_dec)
+{
+	uint16_t r, cbs_in_tb = 0;
+	int32_t length = ldpc_dec->input.length;
+	r = ldpc_dec->tb_params.r;
+	while (length > 0 && r < ldpc_dec->tb_params.c) {
+		length -=  (r < ldpc_dec->tb_params.cab) ?
+				ldpc_dec->tb_params.ea :
+				ldpc_dec->tb_params.eb;
+		r++;
+		cbs_in_tb++;
+	}
+	return cbs_in_tb;
+}
+
+/* Check we can mux encode operations with common FCW */
+static inline bool
+check_mux(struct rte_bbdev_enc_op **ops, uint16_t num) {
+	uint16_t i;
+	if (num == 1)
+		return false;
+	for (i = 1; i < num; ++i) {
+		/* Only mux compatible code blocks */
+		if (memcmp((uint8_t *)(&ops[i]->ldpc_enc) + ENC_OFFSET,
+				(uint8_t *)(&ops[0]->ldpc_enc) + ENC_OFFSET,
+				CMP_ENC_SIZE) != 0)
+			return false;
+	}
+	return true;
+}
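+
+/*
+ * The memcmp() window above (ENC_OFFSET/CMP_ENC_SIZE from the PMD
+ * header) is assumed to cover the FCW-relevant fields of the op while
+ * skipping the per-op mbuf pointers, so ops differing only by their
+ * data buffers can share one descriptor and one FCW.
+ */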
+
+/** Enqueue encode operations for ACC100 device in CB mode. */
+static inline uint16_t
+acc100_enqueue_ldpc_enc_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i = 0;
+	union acc100_dma_desc *desc;
+	int ret, desc_idx = 0;
+	int16_t enq, left = num;
+
+	while (left > 0) {
+		if (unlikely(avail - 1 < 0))
+			break;
+		avail--;
+		enq = RTE_MIN(left, MUX_5GDL_DESC);
+		if (check_mux(&ops[i], enq)) {
+			ret = enqueue_ldpc_enc_n_op_cb(q, &ops[i],
+					desc_idx, enq);
+			if (ret < 0)
+				break;
+			i += enq;
+		} else {
+			ret = enqueue_ldpc_enc_one_op_cb(q, ops[i], desc_idx);
+			if (ret < 0)
+				break;
+			i++;
+		}
+		desc_idx++;
+		left = num - i;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + desc_idx - 1)
+			& q->sw_ring_wrap_mask);
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, desc_idx, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
+}
+
+/* Enqueue encode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	if (unlikely(num == 0))
+		return 0;
+	return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
+}
+
+/* Check we can mux decode operations with common FCW */
+static inline bool
+cmp_ldpc_dec_op(struct rte_bbdev_dec_op **ops) {
+	/* Only mux compatible code blocks */
+	return memcmp((uint8_t *)(&ops[0]->ldpc_dec) + DEC_OFFSET,
+			(uint8_t *)(&ops[1]->ldpc_dec) + DEC_OFFSET,
+			CMP_DEC_SIZE) == 0;
+}
+
+
+/* Enqueue LDPC decode operations for ACC100 device in TB mode */
+static uint16_t
+acc100_enqueue_ldpc_dec_tb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i, enqueued_cbs = 0;
+	uint8_t cbs_in_tb;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		cbs_in_tb = get_num_cbs_in_tb_ldpc_dec(&ops[i]->ldpc_dec);
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - cbs_in_tb < 0))
+			break;
+		avail -= cbs_in_tb;
+
+		ret = enqueue_ldpc_dec_one_op_tb(q, ops[i],
+				enqueued_cbs, cbs_in_tb);
+		if (ret < 0)
+			break;
+		enqueued_cbs += ret;
+	}
+
+	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+	return i;
+}
+
+/* Enqueue LDPC decode operations for ACC100 device in CB mode */
+static uint16_t
+acc100_enqueue_ldpc_dec_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i;
+	union acc100_dma_desc *desc;
+	int ret;
+	bool same_op = false;
+	for (i = 0; i < num; ++i) {
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - 1 < 0))
+			break;
+		avail -= 1;
+
+		if (i > 0)
+			same_op = cmp_ldpc_dec_op(&ops[i-1]);
+		rte_bbdev_log(INFO, "Op %d %d %d %d %d %d %d %d %d %d %d %d\n",
+			i, ops[i]->ldpc_dec.op_flags, ops[i]->ldpc_dec.rv_index,
+			ops[i]->ldpc_dec.iter_max, ops[i]->ldpc_dec.iter_count,
+			ops[i]->ldpc_dec.basegraph, ops[i]->ldpc_dec.z_c,
+			ops[i]->ldpc_dec.n_cb, ops[i]->ldpc_dec.q_m,
+			ops[i]->ldpc_dec.n_filler, ops[i]->ldpc_dec.cb_params.e,
+			same_op);
+		ret = enqueue_ldpc_dec_one_op_cb(q, ops[i], i, same_op);
+		if (ret < 0)
+			break;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
+			& q->sw_ring_wrap_mask);
+
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, i, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+	return i;
+}
+
+/* Enqueue LDPC decode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t aq_avail = q->aq_depth +
+			(q->aq_dequeued - q->aq_enqueued) / 128;
+
+	if (unlikely((aq_avail == 0) || (num == 0)))
+		return 0;
+
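+	/* code_block_mode == 0 indicates transport block (TB) processing */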
+	if (ops[0]->ldpc_dec.code_block_mode == 0)
+		return acc100_enqueue_ldpc_dec_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_ldpc_dec_cb(q_data, ops, num);
+}
+
+
+/* Dequeue one encode operation from ACC100 device in CB mode */
+static inline int
+dequeue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
+		uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_enc_op *op;
+	int i;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	rsp.val = atom_desc.rsp.val;
+	rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+
+	op->status |= ((rsp.input_err)
+			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+	if (desc->req.last_desc_in_batch) {
+		(*aq_dequeued)++;
+		desc->req.last_desc_in_batch = 0;
+	}
+	desc->rsp.val = ACC100_DMA_DESC_TYPE;
+	desc->rsp.add_info_0 = 0; /* Reserved bits */
+	desc->rsp.add_info_1 = 0; /* Reserved bits */
+
+	/* Flag that the muxing cause loss of opaque data */
+	op->opaque_data = (void *)-1;
+	for (i = 0 ; i < desc->req.numCBs; i++)
+		ref_op[i] = op;
+
+	/* One descriptor was dequeued: it may carry several muxed CBs */
+	return desc->req.numCBs;
+}
+
+/* Dequeue one encode operation from ACC100 device in TB mode */
+static inline int
+dequeue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
+		uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, *last_desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_enc_op *op;
+	uint8_t i = 0;
+	uint16_t current_dequeued_cbs = 0, cbs_in_tb;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	/* Get number of CBs in dequeued TB */
+	cbs_in_tb = desc->req.cbs_in_tb;
+	/* Get last CB */
+	last_desc = q->ring_addr + ((q->sw_ring_tail
+			+ total_dequeued_cbs + cbs_in_tb - 1)
+			& q->sw_ring_wrap_mask);
+	/* Check if last CB in TB is ready to dequeue (and thus
+	 * the whole TB) - checking sdone bit. If not return.
+	 */
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
+			__ATOMIC_RELAXED);
+	if (!(atom_desc.rsp.val & ACC100_SDONE))
+		return -1;
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+
+	while (i < cbs_in_tb) {
+		desc = q->ring_addr + ((q->sw_ring_tail
+				+ total_dequeued_cbs)
+				& q->sw_ring_wrap_mask);
+		atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+				__ATOMIC_RELAXED);
+		rsp.val = atom_desc.rsp.val;
+		rte_bbdev_log_debug("Resp. desc %p: %x", desc,
+				rsp.val);
+
+		op->status |= ((rsp.input_err)
+				? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+		op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+		op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+		if (desc->req.last_desc_in_batch) {
+			(*aq_dequeued)++;
+			desc->req.last_desc_in_batch = 0;
+		}
+		desc->rsp.val = ACC100_DMA_DESC_TYPE;
+		desc->rsp.add_info_0 = 0;
+		desc->rsp.add_info_1 = 0;
+		total_dequeued_cbs++;
+		current_dequeued_cbs++;
+		i++;
+	}
+
+	*ref_op = op;
+
+	return current_dequeued_cbs;
+}
+
+/* Dequeue one decode operation from ACC100 device in CB mode */
+static inline int
+dequeue_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
+		struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_dec_op *op;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	rsp.val = atom_desc.rsp.val;
+	rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+	op->status |= ((rsp.input_err)
+			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+	if (op->status != 0)
+		q_data->queue_stats.dequeue_err_count++;
+
+	/* Report CRC error status only when no other error is flagged */
+	if (!op->status)
+		op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+	op->turbo_dec.iter_count = (uint8_t) rsp.iter_cnt / 2;
+	/* Check if this is the last desc in batch (Atomic Queue) */
+	if (desc->req.last_desc_in_batch) {
+		(*aq_dequeued)++;
+		desc->req.last_desc_in_batch = 0;
+	}
+	desc->rsp.val = ACC100_DMA_DESC_TYPE;
+	desc->rsp.add_info_0 = 0;
+	desc->rsp.add_info_1 = 0;
+	*ref_op = op;
+
+	/* One CB (op) was successfully dequeued */
+	return 1;
+}
+
+/* Dequeue one LDPC decode operation from ACC100 device in CB mode */
+static inline int
+dequeue_ldpc_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
+		struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_dec_op *op;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	rsp.val = atom_desc.rsp.val;
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+	op->status |= rsp.input_err << RTE_BBDEV_DATA_ERROR;
+	op->status |= rsp.dma_err << RTE_BBDEV_DRV_ERROR;
+	op->status |= rsp.fcw_err << RTE_BBDEV_DRV_ERROR;
+	if (op->status != 0)
+		q_data->queue_stats.dequeue_err_count++;
+
+	op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+	if (op->ldpc_dec.hard_output.length > 0 && !rsp.synd_ok)
+		op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
+	op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
+
+	/* Check if this is the last desc in batch (Atomic Queue) */
+	if (desc->req.last_desc_in_batch) {
+		(*aq_dequeued)++;
+		desc->req.last_desc_in_batch = 0;
+	}
+
+	desc->rsp.val = ACC100_DMA_DESC_TYPE;
+	desc->rsp.add_info_0 = 0;
+	desc->rsp.add_info_1 = 0;
+
+	*ref_op = op;
+
+	/* One CB (op) was successfully dequeued */
+	return 1;
+}
+
+/* Dequeue one decode operation from ACC100 device in TB mode. */
+static inline int
+dequeue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, *last_desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_dec_op *op;
+	uint8_t cbs_in_tb = 1, cb_idx = 0;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Get number of CBs in dequeued TB */
+	cbs_in_tb = desc->req.cbs_in_tb;
+	/* Get last CB */
+	last_desc = q->ring_addr + ((q->sw_ring_tail
+			+ dequeued_cbs + cbs_in_tb - 1)
+			& q->sw_ring_wrap_mask);
+	/* Check if last CB in TB is ready to dequeue (and thus
+	 * the whole TB) - checking sdone bit. If not return.
+	 */
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
+			__ATOMIC_RELAXED);
+	if (!(atom_desc.rsp.val & ACC100_SDONE))
+		return -1;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+
+	/* Read remaining CBs if any */
+	while (cb_idx < cbs_in_tb) {
+		desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+				& q->sw_ring_wrap_mask);
+		atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+				__ATOMIC_RELAXED);
+		rsp.val = atom_desc.rsp.val;
+		rte_bbdev_log_debug("Resp. desc %p: %x", desc,
+				rsp.val);
+
+		op->status |= ((rsp.input_err)
+				? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+		op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+		op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+		/* Report CRC error status only when no other error is flagged */
+		if (!op->status)
+			op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+		op->turbo_dec.iter_count = RTE_MAX((uint8_t) rsp.iter_cnt,
+				op->turbo_dec.iter_count);
+
+		/* Check if this is the last desc in batch (Atomic Queue) */
+		if (desc->req.last_desc_in_batch) {
+			(*aq_dequeued)++;
+			desc->req.last_desc_in_batch = 0;
+		}
+		desc->rsp.val = ACC100_DMA_DESC_TYPE;
+		desc->rsp.add_info_0 = 0;
+		desc->rsp.add_info_1 = 0;
+		dequeued_cbs++;
+		cb_idx++;
+	}
+
+	*ref_op = op;
+
+	return cb_idx;
+}
+
+/* Dequeue LDPC encode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t dequeue_num, i, dequeued_cbs = 0, dequeued_descs = 0;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = (avail < num) ? avail : num;
+
+	for (i = 0; i < dequeue_num; i++) {
+		ret = dequeue_enc_one_op_cb(q, &ops[dequeued_cbs],
+				dequeued_descs, &aq_dequeued);
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+		dequeued_descs++;
+		if (dequeued_cbs >= num)
+			break;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_descs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += dequeued_cbs;
+
+	return dequeued_cbs;
+}
+
+/* Dequeue LDPC decode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint16_t dequeue_num;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t i;
+	uint16_t dequeued_cbs = 0;
+	struct rte_bbdev_dec_op *op;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = (avail < num) ? avail : num;
+
+	for (i = 0; i < dequeue_num; ++i) {
+		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask))->req.op_addr;
+		if (op->ldpc_dec.code_block_mode == 0)
+			ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+		else
+			ret = dequeue_ldpc_dec_one_op_cb(
+					q_data, q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_cbs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += i;
+
+	return i;
+}
+
 /* Initialization Function */
 static void
 acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
@@ -703,6 +2321,10 @@
 	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
 
 	dev->dev_ops = &acc100_bbdev_ops;
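+	/* Register the LDPC enqueue/dequeue burst functions with bbdev */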
+	dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
+	dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
+	dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
+	dev->dequeue_ldpc_dec_ops = acc100_dequeue_ldpc_dec;
 
 	((struct acc100_device *) dev->data->dev_private)->pf_device =
 			!strcmp(drv->driver.name,
@@ -815,4 +2437,3 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
 RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
 RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
 RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
-
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 0e2b79c..78686c1 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -88,6 +88,8 @@
 #define TMPL_PRI_3      0x0f0e0d0c
 #define QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
 #define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
+#define ACC100_FDONE    0x80000000
+#define ACC100_SDONE    0x40000000
 
 #define ACC100_NUM_TMPL  32
 #define VF_OFFSET_QOS 16 /* offset in Memory Space specific to QoS Mon */
@@ -398,6 +400,7 @@ struct __rte_packed acc100_dma_req_desc {
 union acc100_dma_desc {
 	struct acc100_dma_req_desc req;
 	union acc100_dma_rsp_desc rsp;
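+	/* 64-bit view of the descriptor header, used for atomic loads */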
+	uint64_t atom_hdr;
 };
 
 
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v3 06/11] baseband/acc100: add HARQ loopback support
  2020-08-19  0:25 [dpdk-dev] [PATCH v3 00/11] bbdev PMD ACC100 Nicolas Chautru
                   ` (4 preceding siblings ...)
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions Nicolas Chautru
@ 2020-08-19  0:25 ` Nicolas Chautru
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 07/11] baseband/acc100: add support for 4G processing Nicolas Chautru
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-08-19  0:25 UTC (permalink / raw)
  To: dev, akhil.goyal; +Cc: bruce.richardson, Nicolas Chautru

Additional support for HARQ memory loopback
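
An illustrative usage sketch (not part of the patch) of driving this
loopback mode through the public bbdev API is shown below; dev_id,
queue_id and the op allocation are assumed to already be set up by the
caller, with <rte_bbdev.h> included:

/*
 * Hypothetical sketch: run one HARQ memory loopback operation with the
 * combined input read back from device HARQ memory. Error handling is
 * elided.
 */
static int
harq_loopback_sketch(uint16_t dev_id, uint16_t queue_id,
		struct rte_bbdev_dec_op *op)
{
	/* Request loopback with input read from device HARQ memory */
	op->ldpc_dec.op_flags =
			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK |
			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE;
	/* Offsets select the HARQ region; length must be non-zero */
	op->ldpc_dec.harq_combined_input.offset = 0;
	op->ldpc_dec.harq_combined_input.length = 1024;
	op->ldpc_dec.harq_combined_output.offset = 0;

	if (rte_bbdev_enqueue_ldpc_dec_ops(dev_id, queue_id, &op, 1) != 1)
		return -1;
	while (rte_bbdev_dequeue_ldpc_dec_ops(dev_id, queue_id, &op, 1) == 0)
		; /* Poll until the loopback completes */
	return 0;
}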

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 158 +++++++++++++++++++++++++++++++
 1 file changed, 158 insertions(+)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 5f32813..b44b2f5 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -658,6 +658,7 @@
 				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
 				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
 #ifdef ACC100_EXT_MEM
+				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK |
 				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
 				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
 #endif
@@ -1480,12 +1481,169 @@
 	return 1;
 }
 
+static inline int
+harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs) {
+	struct acc100_fcw_ld *fcw;
+	union acc100_dma_desc *desc;
+	int next_triplet = 1;
+	struct rte_mbuf *hq_output_head, *hq_output;
+	uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
+	if (harq_in_length == 0) {
+		rte_bbdev_log(ERR, "Loopback of invalid null size\n");
+		return -EINVAL;
+	}
+
+	int h_comp = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
+			) ? 1 : 0;
+	if (h_comp == 1)
+		harq_in_length = harq_in_length * 8 / 6;
+	harq_in_length = RTE_ALIGN(harq_in_length, 64);
+	uint16_t harq_dma_length_in = (h_comp == 0) ?
+			harq_in_length :
+			harq_in_length * 6 / 8;
+	uint16_t harq_dma_length_out = harq_dma_length_in;
+	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+	uint16_t harq_index = (ddr_mem_in ?
+			op->ldpc_dec.harq_combined_input.offset :
+			op->ldpc_dec.harq_combined_output.offset)
+			/ ACC100_HARQ_OFFSET;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	fcw = &desc->req.fcw_ld;
+	/* Set the FCW from loopback into DDR */
+	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
+	fcw->FCWversion = ACC100_FCW_VER;
+	fcw->qm = 2;
+	fcw->Zc = 384;
+	if (harq_in_length < 16 * N_ZC_1)
+		fcw->Zc = 16;
+	fcw->ncb = fcw->Zc * N_ZC_1;
+	fcw->rm_e = 2;
+	fcw->hcin_en = 1;
+	fcw->hcout_en = 1;
+
+	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d\n",
+			ddr_mem_in, harq_index,
+			harq_layout[harq_index].offset, harq_in_length,
+			harq_dma_length_in);
+
+	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
+		fcw->hcin_size0 = harq_layout[harq_index].size0;
+		fcw->hcin_offset = harq_layout[harq_index].offset;
+		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
+		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
+		if (h_comp == 1)
+			harq_dma_length_in = harq_dma_length_in * 6 / 8;
+	} else {
+		fcw->hcin_size0 = harq_in_length;
+	}
+	harq_layout[harq_index].val = 0;
+	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
+			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
+	fcw->hcout_size0 = harq_in_length;
+	fcw->hcin_decomp_mode = h_comp;
+	fcw->hcout_comp_mode = h_comp;
+	fcw->gain_i = 1;
+	fcw->gain_h = 1;
+
+	/* Set the prefix of descriptor. This could be done at polling */
+	desc->req.word0 = ACC100_DMA_DESC_TYPE;
+	desc->req.word1 = 0; /**< Timestamp could be disabled */
+	desc->req.word2 = 0;
+	desc->req.word3 = 0;
+	desc->req.numCBs = 1;
+
+	/* Null LLR input for Decoder */
+	desc->req.data_ptrs[next_triplet].address =
+			q->lb_in_addr_phys;
+	desc->req.data_ptrs[next_triplet].blen = 2;
+	desc->req.data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
+	desc->req.data_ptrs[next_triplet].last = 0;
+	desc->req.data_ptrs[next_triplet].dma_ext = 0;
+	next_triplet++;
+
+	/* HARQ Combine input from either Memory interface */
+	if (!ddr_mem_in) {
+		next_triplet = acc100_dma_fill_blk_type_out(&desc->req,
+				op->ldpc_dec.harq_combined_input.data,
+				op->ldpc_dec.harq_combined_input.offset,
+				harq_dma_length_in,
+				next_triplet,
+				ACC100_DMA_BLKID_IN_HARQ);
+	} else {
+		desc->req.data_ptrs[next_triplet].address =
+				op->ldpc_dec.harq_combined_input.offset;
+		desc->req.data_ptrs[next_triplet].blen =
+				harq_dma_length_in;
+		desc->req.data_ptrs[next_triplet].blkid =
+				ACC100_DMA_BLKID_IN_HARQ;
+		desc->req.data_ptrs[next_triplet].dma_ext = 1;
+		next_triplet++;
+	}
+	desc->req.data_ptrs[next_triplet - 1].last = 1;
+	desc->req.m2dlen = next_triplet;
+
+	/* Dropped decoder hard output */
+	desc->req.data_ptrs[next_triplet].address =
+			q->lb_out_addr_phys;
+	desc->req.data_ptrs[next_triplet].blen = BYTES_IN_WORD;
+	desc->req.data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_OUT_HARD;
+	desc->req.data_ptrs[next_triplet].last = 0;
+	desc->req.data_ptrs[next_triplet].dma_ext = 0;
+	next_triplet++;
+
+	/* HARQ Combine output to either Memory interface */
+	if (check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE
+			)) {
+		desc->req.data_ptrs[next_triplet].address =
+				op->ldpc_dec.harq_combined_output.offset;
+		desc->req.data_ptrs[next_triplet].blen =
+				harq_dma_length_out;
+		desc->req.data_ptrs[next_triplet].blkid =
+				ACC100_DMA_BLKID_OUT_HARQ;
+		desc->req.data_ptrs[next_triplet].dma_ext = 1;
+		next_triplet++;
+	} else {
+		hq_output_head = op->ldpc_dec.harq_combined_output.data;
+		hq_output = op->ldpc_dec.harq_combined_output.data;
+		next_triplet = acc100_dma_fill_blk_type_out(
+				&desc->req,
+				op->ldpc_dec.harq_combined_output.data,
+				op->ldpc_dec.harq_combined_output.offset,
+				harq_dma_length_out,
+				next_triplet,
+				ACC100_DMA_BLKID_OUT_HARQ);
+		/* HARQ output */
+		mbuf_append(hq_output_head, hq_output, harq_dma_length_out);
+		op->ldpc_dec.harq_combined_output.length =
+				harq_dma_length_out;
+	}
+	desc->req.data_ptrs[next_triplet - 1].last = 1;
+	desc->req.d2mlen = next_triplet - desc->req.m2dlen;
+	desc->req.op_addr = op;
+
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
 /** Enqueue one decode operation for ACC100 device in CB mode */
 static inline int
 enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
 		uint16_t total_enqueued_cbs, bool same_op)
 {
 	int ret;
+	if (unlikely(check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK))) {
+		ret = harq_loopback(q, op, total_enqueued_cbs);
+		return ret;
+	}
 
 	union acc100_dma_desc *desc;
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v3 07/11] baseband/acc100: add support for 4G processing
  2020-08-19  0:25 [dpdk-dev] [PATCH v3 00/11] bbdev PMD ACC100 Nicolas Chautru
                   ` (5 preceding siblings ...)
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 06/11] baseband/acc100: add HARQ loopback support Nicolas Chautru
@ 2020-08-19  0:25 ` Nicolas Chautru
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 08/11] baseband/acc100: add interrupt support to PMD Nicolas Chautru
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-08-19  0:25 UTC (permalink / raw)
  To: dev, akhil.goyal; +Cc: bruce.richardson, Nicolas Chautru

Adding capability for 4G encode and decode processing
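
For illustration, a minimal sketch (not part of the patch) of pushing
4G turbo decode operations through the generic bbdev burst API; dev_id,
queue_id and the ops array are assumed to be prepared by the caller:

/*
 * Hypothetical sketch: burst-enqueue turbo decode ops and poll until
 * all enqueued ops complete.
 */
static uint16_t
turbo_dec_burst_sketch(uint16_t dev_id, uint16_t queue_id,
		struct rte_bbdev_dec_op **ops, uint16_t num)
{
	uint16_t enq, deq = 0;

	enq = rte_bbdev_enqueue_dec_ops(dev_id, queue_id, ops, num);
	while (deq < enq)
		deq += rte_bbdev_dequeue_dec_ops(dev_id, queue_id,
				&ops[deq], enq - deq);
	return deq;
}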

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 1010 ++++++++++++++++++++++++++++--
 1 file changed, 943 insertions(+), 67 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index b44b2f5..1de7531 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -339,7 +339,6 @@
 	free_base_addresses(base_addrs, i);
 }
 
-
 /* Allocate 64MB memory used for all software rings */
 static int
 acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
@@ -637,6 +636,41 @@
 
 	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
 		{
+			.type = RTE_BBDEV_OP_TURBO_DEC,
+			.cap.turbo_dec = {
+				.capability_flags =
+					RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE |
+					RTE_BBDEV_TURBO_CRC_TYPE_24B |
+					RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
+					RTE_BBDEV_TURBO_EARLY_TERMINATION |
+					RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
+					RTE_BBDEV_TURBO_MAP_DEC |
+					RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP |
+					RTE_BBDEV_TURBO_DEC_SCATTER_GATHER,
+				.max_llr_modulus = INT8_MAX,
+				.num_buffers_src =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+				.num_buffers_hard_out =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+				.num_buffers_soft_out =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+			}
+		},
+		{
+			.type = RTE_BBDEV_OP_TURBO_ENC,
+			.cap.turbo_enc = {
+				.capability_flags =
+					RTE_BBDEV_TURBO_CRC_24B_ATTACH |
+					RTE_BBDEV_TURBO_RV_INDEX_BYPASS |
+					RTE_BBDEV_TURBO_RATE_MATCH |
+					RTE_BBDEV_TURBO_ENC_SCATTER_GATHER,
+				.num_buffers_src =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+				.num_buffers_dst =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+			}
+		},
+		{
 			.type   = RTE_BBDEV_OP_LDPC_ENC,
 			.cap.ldpc_enc = {
 				.capability_flags =
@@ -719,7 +753,6 @@
 #endif
 }
 
-
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.setup_queues = acc100_setup_queues,
 	.close = acc100_dev_close,
@@ -763,6 +796,58 @@
 	return tail;
 }
 
+/* Fill in a frame control word for turbo encoding. */
+static inline void
+acc100_fcw_te_fill(const struct rte_bbdev_enc_op *op, struct acc100_fcw_te *fcw)
+{
+	fcw->code_block_mode = op->turbo_enc.code_block_mode;
+	if (fcw->code_block_mode == 0) { /* For TB mode */
+		fcw->k_neg = op->turbo_enc.tb_params.k_neg;
+		fcw->k_pos = op->turbo_enc.tb_params.k_pos;
+		fcw->c_neg = op->turbo_enc.tb_params.c_neg;
+		fcw->c = op->turbo_enc.tb_params.c;
+		fcw->ncb_neg = op->turbo_enc.tb_params.ncb_neg;
+		fcw->ncb_pos = op->turbo_enc.tb_params.ncb_pos;
+
+		if (check_bit(op->turbo_enc.op_flags,
+				RTE_BBDEV_TURBO_RATE_MATCH)) {
+			fcw->bypass_rm = 0;
+			fcw->cab = op->turbo_enc.tb_params.cab;
+			fcw->ea = op->turbo_enc.tb_params.ea;
+			fcw->eb = op->turbo_enc.tb_params.eb;
+		} else {
+			/* E is set to the encoding output size when RM is
+			 * bypassed.
+			 */
+			fcw->bypass_rm = 1;
+			fcw->cab = fcw->c_neg;
+			fcw->ea = 3 * fcw->k_neg + 12;
+			fcw->eb = 3 * fcw->k_pos + 12;
+		}
+	} else { /* For CB mode */
+		fcw->k_pos = op->turbo_enc.cb_params.k;
+		fcw->ncb_pos = op->turbo_enc.cb_params.ncb;
+
+		if (check_bit(op->turbo_enc.op_flags,
+				RTE_BBDEV_TURBO_RATE_MATCH)) {
+			fcw->bypass_rm = 0;
+			fcw->eb = op->turbo_enc.cb_params.e;
+		} else {
+			/* E is set to the encoding output size when RM is
+			 * bypassed.
+			 */
+			fcw->bypass_rm = 1;
+			fcw->eb = 3 * fcw->k_pos + 12;
+		}
+	}
+
+	fcw->bypass_rv_idx1 = check_bit(op->turbo_enc.op_flags,
+			RTE_BBDEV_TURBO_RV_INDEX_BYPASS);
+	fcw->code_block_crc = check_bit(op->turbo_enc.op_flags,
+			RTE_BBDEV_TURBO_CRC_24B_ATTACH);
+	fcw->rv_idx1 = op->turbo_enc.rv_index;
+}
+
 /* Compute value of k0.
  * Based on 3GPP 38.212 Table 5.4.2.1-2
  * Starting position of different redundancy versions, k0
@@ -813,6 +898,25 @@
 	fcw->mcb_count = num_cb;
 }
 
+/* Fill in a frame control word for turbo decoding. */
+static inline void
+acc100_fcw_td_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_td *fcw)
+{
+	/* Note : Early termination is always enabled for 4GUL */
+	/* Note: Early termination is always enabled for 4GUL */
+	if (op->turbo_dec.code_block_mode == 0)
+		fcw->k_pos = op->turbo_dec.tb_params.k_pos;
+	else
+		fcw->k_pos = op->turbo_dec.cb_params.k;
+	fcw->turbo_crc_type = check_bit(op->turbo_dec.op_flags,
+			RTE_BBDEV_TURBO_CRC_TYPE_24B);
+	fcw->bypass_sb_deint = 0;
+	fcw->raw_decoder_input_on = 0;
+	fcw->max_iter = op->turbo_dec.iter_max;
+	fcw->half_iter_on = !check_bit(op->turbo_dec.op_flags,
+			RTE_BBDEV_TURBO_HALF_ITERATION_EVEN);
+}
+
 /* Fill in a frame control word for LDPC decoding. */
 static inline void
 acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
@@ -1042,6 +1146,87 @@
 }
 
 static inline int
+acc100_dma_desc_te_fill(struct rte_bbdev_enc_op *op,
+		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
+		struct rte_mbuf *output, uint32_t *in_offset,
+		uint32_t *out_offset, uint32_t *out_length,
+		uint32_t *mbuf_total_left, uint32_t *seg_total_left, uint8_t r)
+{
+	int next_triplet = 1; /* FCW already done */
+	uint32_t e, ea, eb, length;
+	uint16_t k, k_neg, k_pos;
+	uint8_t cab, c_neg;
+
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /**< Timestamp could be disabled */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+
+	if (op->turbo_enc.code_block_mode == 0) {
+		ea = op->turbo_enc.tb_params.ea;
+		eb = op->turbo_enc.tb_params.eb;
+		cab = op->turbo_enc.tb_params.cab;
+		k_neg = op->turbo_enc.tb_params.k_neg;
+		k_pos = op->turbo_enc.tb_params.k_pos;
+		c_neg = op->turbo_enc.tb_params.c_neg;
+		e = (r < cab) ? ea : eb;
+		k = (r < c_neg) ? k_neg : k_pos;
+	} else {
+		e = op->turbo_enc.cb_params.e;
+		k = op->turbo_enc.cb_params.k;
+	}
+
+	if (check_bit(op->turbo_enc.op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH))
+		length = (k - 24) >> 3;
+	else
+		length = k >> 3;
+
+	if (unlikely((*mbuf_total_left == 0) || (*mbuf_total_left < length))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, length);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
+			length, seg_total_left, next_triplet);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= length;
+
+	/* Set output length */
+	if (check_bit(op->turbo_enc.op_flags, RTE_BBDEV_TURBO_RATE_MATCH))
+		/* Integer round up division by 8 */
+		*out_length = (e + 7) >> 3;
+	else
+		*out_length = (k >> 3) * 3 + 2;
+
+	next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
+			*out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	op->turbo_enc.output.length += *out_length;
+	*out_offset += *out_length;
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
+
+static inline int
 acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
 		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
 		struct rte_mbuf *output, uint32_t *in_offset,
@@ -1110,6 +1295,117 @@
 }
 
 static inline int
+acc100_dma_desc_td_fill(struct rte_bbdev_dec_op *op,
+		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
+		struct rte_mbuf *h_output, struct rte_mbuf *s_output,
+		uint32_t *in_offset, uint32_t *h_out_offset,
+		uint32_t *s_out_offset, uint32_t *h_out_length,
+		uint32_t *s_out_length, uint32_t *mbuf_total_left,
+		uint32_t *seg_total_left, uint8_t r)
+{
+	int next_triplet = 1; /* FCW already done */
+	uint16_t k;
+	uint16_t crc24_overlap = 0;
+	uint32_t e, kw;
+
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /**< Timestamp could be disabled */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+
+	if (op->turbo_dec.code_block_mode == 0) {
+		k = (r < op->turbo_dec.tb_params.c_neg)
+			? op->turbo_dec.tb_params.k_neg
+			: op->turbo_dec.tb_params.k_pos;
+		e = (r < op->turbo_dec.tb_params.cab)
+			? op->turbo_dec.tb_params.ea
+			: op->turbo_dec.tb_params.eb;
+	} else {
+		k = op->turbo_dec.cb_params.k;
+		e = op->turbo_dec.cb_params.e;
+	}
+
+	if ((op->turbo_dec.code_block_mode == 0)
+		&& !check_bit(op->turbo_dec.op_flags,
+		RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP))
+		crc24_overlap = 24;
+
+	/* Calculates circular buffer size.
+	 * According to 3gpp 36.212 section 5.1.4.2
+	 *   Kw = 3 * Kpi,
+	 * where:
+	 *   Kpi = nCol * nRow
+	 * where nCol is 32 and nRow can be calculated from:
+	 *   D =< nCol * nRow
+	 * where D is the size of each output from turbo encoder block (k + 4).
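+	 * Illustrative example: k = 40 gives D = 44, hence nRow = 2,
+	 * Kpi = 64 and Kw = 192.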
+	 */
+	kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
+
+	if (unlikely((*mbuf_total_left == 0) || (*mbuf_total_left < kw))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, kw);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset, kw,
+			seg_total_left, next_triplet);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= kw;
+
+	next_triplet = acc100_dma_fill_blk_type_out(
+			desc, h_output, *h_out_offset,
+			k >> 3, next_triplet, ACC100_DMA_BLKID_OUT_HARD);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+
+	*h_out_length = ((k - crc24_overlap) >> 3);
+	op->turbo_dec.hard_output.length += *h_out_length;
+	*h_out_offset += *h_out_length;
+
+	/* Soft output */
+	if (check_bit(op->turbo_dec.op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT)) {
+		if (check_bit(op->turbo_dec.op_flags,
+				RTE_BBDEV_TURBO_EQUALIZER))
+			*s_out_length = e;
+		else
+			*s_out_length = (k * 3) + 12;
+
+		next_triplet = acc100_dma_fill_blk_type_out(desc, s_output,
+				*s_out_offset, *s_out_length, next_triplet,
+				ACC100_DMA_BLKID_OUT_SOFT);
+		if (unlikely(next_triplet < 0)) {
+			rte_bbdev_log(ERR,
+					"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+					op);
+			return -1;
+		}
+
+		op->turbo_dec.soft_output.length += *s_out_length;
+		*s_out_offset += *s_out_length;
+	}
+
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
+
+static inline int
 acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
 		struct acc100_dma_req_desc *desc,
 		struct rte_mbuf **input, struct rte_mbuf *h_output,
@@ -1374,6 +1670,57 @@
 
 /* Enqueue one encode operation for ACC100 device in CB mode */
 static inline int
+enqueue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+		uint16_t total_enqueued_cbs)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+		seg_total_left;
+	struct rte_mbuf *input, *output_head, *output;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_te_fill(op, &desc->req.fcw_te);
+
+	input = op->turbo_enc.input.data;
+	output_head = output = op->turbo_enc.output.data;
+	in_offset = op->turbo_enc.input.offset;
+	out_offset = op->turbo_enc.output.offset;
+	out_length = 0;
+	mbuf_total_left = op->turbo_enc.input.length;
+	seg_total_left = rte_pktmbuf_data_len(op->turbo_enc.input.data)
+			- in_offset;
+
+	ret = acc100_dma_desc_te_fill(op, &desc->req, &input, output,
+			&in_offset, &out_offset, &out_length, &mbuf_total_left,
+			&seg_total_left, 0);
+
+	if (unlikely(ret < 0))
+		return ret;
+
+	mbuf_append(output_head, output, out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_te,
+			sizeof(desc->req.fcw_te) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+
+	/* Check if any data left after processing one CB */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left after processing one CB: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+/* Enqueue muxed LDPC encode operations for ACC100 device in CB mode */
+static inline int
 enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
 		uint16_t total_enqueued_cbs, int16_t num)
 {
@@ -1481,78 +1828,235 @@
 	return 1;
 }
 
-static inline int
-harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
-		uint16_t total_enqueued_cbs) {
-	struct acc100_fcw_ld *fcw;
-	union acc100_dma_desc *desc;
-	int next_triplet = 1;
-	struct rte_mbuf *hq_output_head, *hq_output;
-	uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
-	if (harq_in_length == 0) {
-		rte_bbdev_log(ERR, "Loopback of invalid null size\n");
-		return -EINVAL;
-	}
 
-	int h_comp = check_bit(op->ldpc_dec.op_flags,
-			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
-			) ? 1 : 0;
-	if (h_comp == 1)
-		harq_in_length = harq_in_length * 8 / 6;
-	harq_in_length = RTE_ALIGN(harq_in_length, 64);
-	uint16_t harq_dma_length_in = (h_comp == 0) ?
-			harq_in_length :
-			harq_in_length * 6 / 8;
-	uint16_t harq_dma_length_out = harq_dma_length_in;
-	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
-			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
-	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
-	uint16_t harq_index = (ddr_mem_in ?
-			op->ldpc_dec.harq_combined_input.offset :
-			op->ldpc_dec.harq_combined_output.offset)
-			/ ACC100_HARQ_OFFSET;
+/* Enqueue one encode operation for ACC100 device in TB mode. */
+static inline int
+enqueue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint8_t r, c;
+	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+		seg_total_left;
+	struct rte_mbuf *input, *output_head, *output;
+	uint16_t current_enqueued_cbs = 0;
 
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
-	fcw = &desc->req.fcw_ld;
-	/* Set the FCW from loopback into DDR */
-	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
-	fcw->FCWversion = ACC100_FCW_VER;
-	fcw->qm = 2;
-	fcw->Zc = 384;
-	if (harq_in_length < 16 * N_ZC_1)
-		fcw->Zc = 16;
-	fcw->ncb = fcw->Zc * N_ZC_1;
-	fcw->rm_e = 2;
-	fcw->hcin_en = 1;
-	fcw->hcout_en = 1;
+	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+	acc100_fcw_te_fill(op, &desc->req.fcw_te);
 
-	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d\n",
-			ddr_mem_in, harq_index,
-			harq_layout[harq_index].offset, harq_in_length,
-			harq_dma_length_in);
+	input = op->turbo_enc.input.data;
+	output_head = output = op->turbo_enc.output.data;
+	in_offset = op->turbo_enc.input.offset;
+	out_offset = op->turbo_enc.output.offset;
+	out_length = 0;
+	mbuf_total_left = op->turbo_enc.input.length;
 
-	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
-		fcw->hcin_size0 = harq_layout[harq_index].size0;
-		fcw->hcin_offset = harq_layout[harq_index].offset;
-		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
-		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
-		if (h_comp == 1)
-			harq_dma_length_in = harq_dma_length_in * 6 / 8;
-	} else {
-		fcw->hcin_size0 = harq_in_length;
-	}
-	harq_layout[harq_index].val = 0;
-	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
-			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
-	fcw->hcout_size0 = harq_in_length;
-	fcw->hcin_decomp_mode = h_comp;
-	fcw->hcout_comp_mode = h_comp;
-	fcw->gain_i = 1;
-	fcw->gain_h = 1;
+	c = op->turbo_enc.tb_params.c;
+	r = op->turbo_enc.tb_params.r;
 
-	/* Set the prefix of descriptor. This could be done at polling */
+	while (mbuf_total_left > 0 && r < c) {
+		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+		/* Set up DMA descriptor */
+		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+				& q->sw_ring_wrap_mask);
+		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+		desc->req.data_ptrs[0].blen = ACC100_FCW_TE_BLEN;
+
+		ret = acc100_dma_desc_te_fill(op, &desc->req, &input, output,
+				&in_offset, &out_offset, &out_length,
+				&mbuf_total_left, &seg_total_left, r);
+		if (unlikely(ret < 0))
+			return ret;
+		mbuf_append(output_head, output, out_length);
+
+		/* Set total number of CBs in TB */
+		desc->req.cbs_in_tb = cbs_in_tb;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "FCW", &desc->req.fcw_te,
+				sizeof(desc->req.fcw_te) - 8);
+		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+		if (seg_total_left == 0) {
+			/* Go to the next mbuf */
+			input = input->next;
+			in_offset = 0;
+			output = output->next;
+			out_offset = 0;
+		}
+
+		total_enqueued_cbs++;
+		current_enqueued_cbs++;
+		r++;
+	}
+
+	if (unlikely(desc == NULL))
+		return current_enqueued_cbs;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Check if any CBs left for processing */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left for processing: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+
+	/* Set SDone on last CB descriptor for TB mode. */
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	return current_enqueued_cbs;
+}
+
+/** Enqueue one decode operation for ACC100 device in CB mode */
+static inline int
+enqueue_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint32_t in_offset, h_out_offset, s_out_offset, s_out_length,
+		h_out_length, mbuf_total_left, seg_total_left;
+	struct rte_mbuf *input, *h_output_head, *h_output,
+		*s_output_head, *s_output;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_td_fill(op, &desc->req.fcw_td);
+
+	input = op->turbo_dec.input.data;
+	h_output_head = h_output = op->turbo_dec.hard_output.data;
+	s_output_head = s_output = op->turbo_dec.soft_output.data;
+	in_offset = op->turbo_dec.input.offset;
+	h_out_offset = op->turbo_dec.hard_output.offset;
+	s_out_offset = op->turbo_dec.soft_output.offset;
+	h_out_length = s_out_length = 0;
+	mbuf_total_left = op->turbo_dec.input.length;
+	seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(input == NULL)) {
+		rte_bbdev_log(ERR, "Invalid mbuf pointer");
+		return -EFAULT;
+	}
+#endif
+
+	/* Set up DMA descriptor */
+	desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+
+	ret = acc100_dma_desc_td_fill(op, &desc->req, &input, h_output,
+			s_output, &in_offset, &h_out_offset, &s_out_offset,
+			&h_out_length, &s_out_length, &mbuf_total_left,
+			&seg_total_left, 0);
+
+	if (unlikely(ret < 0))
+		return ret;
+
+	/* Hard output */
+	mbuf_append(h_output_head, h_output, h_out_length);
+
+	/* Soft output */
+	if (check_bit(op->turbo_dec.op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT))
+		mbuf_append(s_output_head, s_output, s_out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_td,
+			sizeof(desc->req.fcw_td) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+
+	/* Check if any CBs left for processing */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left after processing one CB: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+static inline int
+harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs) {
+	struct acc100_fcw_ld *fcw;
+	union acc100_dma_desc *desc;
+	int next_triplet = 1;
+	struct rte_mbuf *hq_output_head, *hq_output;
+	uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
+	if (harq_in_length == 0) {
+		rte_bbdev_log(ERR, "Loopback of invalid null size\n");
+		return -EINVAL;
+	}
+
+	int h_comp = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
+			) ? 1 : 0;
+	if (h_comp == 1)
+		harq_in_length = harq_in_length * 8 / 6;
+	harq_in_length = RTE_ALIGN(harq_in_length, 64);
+	uint16_t harq_dma_length_in = (h_comp == 0) ?
+			harq_in_length :
+			harq_in_length * 6 / 8;
+	uint16_t harq_dma_length_out = harq_dma_length_in;
+	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+	uint16_t harq_index = (ddr_mem_in ?
+			op->ldpc_dec.harq_combined_input.offset :
+			op->ldpc_dec.harq_combined_output.offset)
+			/ ACC100_HARQ_OFFSET;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	fcw = &desc->req.fcw_ld;
+	/* Set the FCW from loopback into DDR */
+	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
+	fcw->FCWversion = ACC100_FCW_VER;
+	fcw->qm = 2;
+	fcw->Zc = 384;
+	if (harq_in_length < 16 * N_ZC_1)
+		fcw->Zc = 16;
+	fcw->ncb = fcw->Zc * N_ZC_1;
+	fcw->rm_e = 2;
+	fcw->hcin_en = 1;
+	fcw->hcout_en = 1;
+
+	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d\n",
+			ddr_mem_in, harq_index,
+			harq_layout[harq_index].offset, harq_in_length,
+			harq_dma_length_in);
+
+	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
+		fcw->hcin_size0 = harq_layout[harq_index].size0;
+		fcw->hcin_offset = harq_layout[harq_index].offset;
+		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
+		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
+		if (h_comp == 1)
+			harq_dma_length_in = harq_dma_length_in * 6 / 8;
+	} else {
+		fcw->hcin_size0 = harq_in_length;
+	}
+	harq_layout[harq_index].val = 0;
+	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
+			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
+	fcw->hcout_size0 = harq_in_length;
+	fcw->hcin_decomp_mode = h_comp;
+	fcw->hcout_comp_mode = h_comp;
+	fcw->gain_i = 1;
+	fcw->gain_h = 1;
+
+	/* Set the prefix of descriptor. This could be done at polling */
 	desc->req.word0 = ACC100_DMA_DESC_TYPE;
 	desc->req.word1 = 0; /**< Timestamp could be disabled */
 	desc->req.word2 = 0;
@@ -1816,6 +2320,107 @@
 	return current_enqueued_cbs;
 }
 
+/* Enqueue one decode operation for ACC100 device in TB mode */
+static inline int
+enqueue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint8_t r, c;
+	uint32_t in_offset, h_out_offset, s_out_offset, s_out_length,
+		h_out_length, mbuf_total_left, seg_total_left;
+	struct rte_mbuf *input, *h_output_head, *h_output,
+		*s_output_head, *s_output;
+	uint16_t current_enqueued_cbs = 0;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+	acc100_fcw_td_fill(op, &desc->req.fcw_td);
+
+	input = op->turbo_dec.input.data;
+	h_output_head = h_output = op->turbo_dec.hard_output.data;
+	s_output_head = s_output = op->turbo_dec.soft_output.data;
+	in_offset = op->turbo_dec.input.offset;
+	h_out_offset = op->turbo_dec.hard_output.offset;
+	s_out_offset = op->turbo_dec.soft_output.offset;
+	h_out_length = s_out_length = 0;
+	mbuf_total_left = op->turbo_dec.input.length;
+	c = op->turbo_dec.tb_params.c;
+	r = op->turbo_dec.tb_params.r;
+
+	while (mbuf_total_left > 0 && r < c) {
+
+		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+
+		/* Set up DMA descriptor */
+		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+				& q->sw_ring_wrap_mask);
+		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+		desc->req.data_ptrs[0].blen = ACC100_FCW_TD_BLEN;
+		ret = acc100_dma_desc_td_fill(op, &desc->req, &input,
+				h_output, s_output, &in_offset, &h_out_offset,
+				&s_out_offset, &h_out_length, &s_out_length,
+				&mbuf_total_left, &seg_total_left, r);
+
+		if (unlikely(ret < 0))
+			return ret;
+
+		/* Hard output */
+		mbuf_append(h_output_head, h_output, h_out_length);
+
+		/* Soft output */
+		if (check_bit(op->turbo_dec.op_flags,
+				RTE_BBDEV_TURBO_SOFT_OUTPUT))
+			mbuf_append(s_output_head, s_output, s_out_length);
+
+		/* Set total number of CBs in TB */
+		desc->req.cbs_in_tb = cbs_in_tb;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "FCW", &desc->req.fcw_td,
+				sizeof(desc->req.fcw_td) - 8);
+		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+		if (seg_total_left == 0) {
+			/* Go to the next mbuf */
+			input = input->next;
+			in_offset = 0;
+			h_output = h_output->next;
+			h_out_offset = 0;
+
+			if (check_bit(op->turbo_dec.op_flags,
+					RTE_BBDEV_TURBO_SOFT_OUTPUT)) {
+				s_output = s_output->next;
+				s_out_offset = 0;
+			}
+		}
+
+		total_enqueued_cbs++;
+		current_enqueued_cbs++;
+		r++;
+	}
+
+	if (unlikely(desc == NULL))
+		return current_enqueued_cbs;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Check if any CBs left for processing */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left for processing: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+	/* Set SDone on last CB descriptor for TB mode */
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	return current_enqueued_cbs;
+}
 
 /* Calculates number of CBs in processed encoder TB based on 'r' and input
  * length.
@@ -1893,6 +2498,45 @@
 	return cbs_in_tb;
 }
 
+/* Enqueue encode operations for ACC100 device in CB mode. */
+static uint16_t
+acc100_enqueue_enc_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i;
+	union acc100_dma_desc *desc;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - 1 < 0))
+			break;
+		avail -= 1;
+
+		ret = enqueue_enc_one_op_cb(q, ops[i], i);
+		if (ret < 0)
+			break;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
+			& q->sw_ring_wrap_mask);
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, i, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+	return i;
+}
+
 /* Check we can mux encode operations with common FCW */
 static inline bool
 check_mux(struct rte_bbdev_enc_op **ops, uint16_t num) {
@@ -1960,6 +2604,52 @@
 	return i;
 }
 
+/* Enqueue encode operations for ACC100 device in TB mode. */
+static uint16_t
+acc100_enqueue_enc_tb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i, enqueued_cbs = 0;
+	uint8_t cbs_in_tb;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		cbs_in_tb = get_num_cbs_in_tb_enc(&ops[i]->turbo_enc);
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - cbs_in_tb < 0))
+			break;
+		avail -= cbs_in_tb;
+
+		ret = enqueue_enc_one_op_tb(q, ops[i], enqueued_cbs, cbs_in_tb);
+		if (ret < 0)
+			break;
+		enqueued_cbs += ret;
+	}
+
+	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
+}
+
+/* Enqueue 4G encode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	if (unlikely(num == 0))
+		return 0;
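+	/* code_block_mode == 0 indicates transport block (TB) processing */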
+	if (ops[0]->turbo_enc.code_block_mode == 0)
+		return acc100_enqueue_enc_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_enc_cb(q_data, ops, num);
+}
+
 /* Enqueue LDPC encode operations for ACC100 device. */
 static uint16_t
 acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
@@ -1967,7 +2657,51 @@
 {
 	if (unlikely(num == 0))
 		return 0;
-	return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
+	if (ops[0]->ldpc_enc.code_block_mode == 0)
+		return acc100_enqueue_enc_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
+}
+
+
+/* Enqueue decode operations for ACC100 device in CB mode */
+static uint16_t
+acc100_enqueue_dec_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i;
+	union acc100_dma_desc *desc;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - 1 < 0))
+			break;
+		avail -= 1;
+
+		ret = enqueue_dec_one_op_cb(q, ops[i], i);
+		if (ret < 0)
+			break;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
+			& q->sw_ring_wrap_mask);
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, i, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
 }
 
 /* Check we can mux decode operations with common FCW */
@@ -2065,6 +2799,53 @@
 	return i;
 }
 
+
+/* Enqueue decode operations for ACC100 device in TB mode */
+static uint16_t
+acc100_enqueue_dec_tb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i, enqueued_cbs = 0;
+	uint8_t cbs_in_tb;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		cbs_in_tb = get_num_cbs_in_tb_dec(&ops[i]->turbo_dec);
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - cbs_in_tb < 0))
+			break;
+		avail -= cbs_in_tb;
+
+		ret = enqueue_dec_one_op_tb(q, ops[i], enqueued_cbs, cbs_in_tb);
+		if (ret < 0)
+			break;
+		enqueued_cbs += ret;
+	}
+
+	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
+}
+
+/* Enqueue 4G decode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	if (unlikely(num == 0))
+		return 0;
+	if (ops[0]->turbo_dec.code_block_mode == 0)
+		return acc100_enqueue_dec_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_dec_cb(q_data, ops, num);
+}
+
 /* Enqueue LDPC decode operations for ACC100 device. */
 static uint16_t
 acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
@@ -2388,6 +3169,51 @@
 	return cb_idx;
 }
 
+/* Dequeue 4G encode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint16_t dequeue_num;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t i;
+	uint16_t dequeued_cbs = 0;
+	struct rte_bbdev_enc_op *op;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = (avail < num) ? avail : num;
+
+	for (i = 0; i < dequeue_num; ++i) {
+		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask))->req.op_addr;
+		if (op->turbo_enc.code_block_mode == 0)
+			ret = dequeue_enc_one_op_tb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+		else
+			ret = dequeue_enc_one_op_cb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_cbs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += i;
+
+	return i;
+}
+
 /* Dequeue LDPC encode operations from ACC100 device. */
 static uint16_t
 acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
@@ -2426,6 +3252,52 @@
 	return dequeued_cbs;
 }
 
+
+/* Dequeue decode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint16_t dequeue_num;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t i;
+	uint16_t dequeued_cbs = 0;
+	struct rte_bbdev_dec_op *op;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = (avail < num) ? avail : num;
+
+	for (i = 0; i < dequeue_num; ++i) {
+		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask))->req.op_addr;
+		if (op->turbo_dec.code_block_mode == 0)
+			ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+		else
+			ret = dequeue_dec_one_op_cb(q_data, q, &ops[i],
+					dequeued_cbs, &aq_dequeued);
+
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_cbs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += i;
+
+	return i;
+}
+
 /* Dequeue LDPC decode operations from ACC100 device. */
 static uint16_t
 acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
@@ -2479,6 +3351,10 @@
 	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
 
 	dev->dev_ops = &acc100_bbdev_ops;
+	dev->enqueue_enc_ops = acc100_enqueue_enc;
+	dev->enqueue_dec_ops = acc100_enqueue_dec;
+	dev->dequeue_enc_ops = acc100_dequeue_enc;
+	dev->dequeue_dec_ops = acc100_dequeue_dec;
 	dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
 	dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
 	dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v3 08/11] baseband/acc100: add interrupt support to PMD
  2020-08-19  0:25 [dpdk-dev] [PATCH v3 00/11] bbdev PMD ACC100 Nicolas Chautru
                   ` (6 preceding siblings ...)
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 07/11] baseband/acc100: add support for 4G processing Nicolas Chautru
@ 2020-08-19  0:25 ` Nicolas Chautru
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 09/11] baseband/acc100: add debug function to validate input Nicolas Chautru
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-08-19  0:25 UTC (permalink / raw)
  To: dev, akhil.goyal; +Cc: bruce.richardson, Nicolas Chautru

Adding capability and functions to support MSI
interrupts, callbacks and the Info Ring.
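
Not part of the patch, for context only: a minimal sketch of how an
application could consume these events through the public bbdev API.
The dev_id and queue_id values are placeholders, and pulling the
queue_id out of ret_param assumes the acc100_deq_intr_details layout
added below.

#include <stdio.h>
#include <rte_bbdev.h>
#include <rte_common.h>

/* Invoked via rte_bbdev_pmd_callback_process() from the PMD handler */
static void
dequeue_event_cb(uint16_t dev_id, enum rte_bbdev_event_type event,
		void *cb_arg, void *ret_param)
{
	RTE_SET_USED(cb_arg);
	if (event == RTE_BBDEV_EVENT_DEQUEUE && ret_param != NULL)
		printf("dev %u: queue %u ready for dequeue\n",
				dev_id, *(uint16_t *)ret_param);
}

static int
setup_dequeue_interrupts(uint16_t dev_id, uint16_t queue_id)
{
	int ret = rte_bbdev_callback_register(dev_id,
			RTE_BBDEV_EVENT_DEQUEUE, dequeue_event_cb, NULL);
	if (ret == 0)
		ret = rte_bbdev_queue_intr_enable(dev_id, queue_id);
	return ret;
}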

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 288 ++++++++++++++++++++++++++++++-
 drivers/baseband/acc100/rte_acc100_pmd.h |  15 ++
 2 files changed, 300 insertions(+), 3 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 1de7531..ba8e1d8 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -339,6 +339,213 @@
 	free_base_addresses(base_addrs, i);
 }
 
+/*
+ * Find queue_id of a device queue based on details from the Info Ring.
+ * If a queue isn't found, UINT16_MAX is returned.
+ */
+static inline uint16_t
+get_queue_id_from_ring_info(struct rte_bbdev_data *data,
+		const union acc100_info_ring_data ring_data)
+{
+	uint16_t queue_id;
+
+	for (queue_id = 0; queue_id < data->num_queues; ++queue_id) {
+		struct acc100_queue *acc100_q =
+				data->queues[queue_id].queue_private;
+		if (acc100_q != NULL && acc100_q->aq_id == ring_data.aq_id &&
+				acc100_q->qgrp_id == ring_data.qg_id &&
+				acc100_q->vf_id == ring_data.vf_id)
+			return queue_id;
+	}
+
+	return UINT16_MAX;
+}
+
+/* Check the Info Ring for unexpected entries and log any that are found */
+static inline void
+acc100_check_ir(struct acc100_device *acc100_dev)
+{
+	volatile union acc100_info_ring_data *ring_data;
+	uint16_t info_ring_head = acc100_dev->info_ring_head;
+	if (acc100_dev->info_ring == NULL)
+		return;
+
+	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
+			ACC100_INFO_RING_MASK);
+
+	while (ring_data->valid) {
+		if ((ring_data->int_nb < ACC100_PF_INT_DMA_DL_DESC_IRQ) ||
+				(ring_data->int_nb >
+				ACC100_PF_INT_DMA_DL5G_DESC_IRQ))
+			rte_bbdev_log(WARNING, "InfoRing: ITR:%d Info:0x%x",
+				ring_data->int_nb, ring_data->detailed_info);
+		/* Initialize Info Ring entry and move forward */
+		ring_data->val = 0;
+		info_ring_head++;
+		ring_data = acc100_dev->info_ring +
+				(info_ring_head & ACC100_INFO_RING_MASK);
+	}
+}
+
+/* Checks PF Info Ring to find the interrupt cause and handles it accordingly */
+static inline void
+acc100_pf_interrupt_handler(struct rte_bbdev *dev)
+{
+	struct acc100_device *acc100_dev = dev->data->dev_private;
+	volatile union acc100_info_ring_data *ring_data;
+	struct acc100_deq_intr_details deq_intr_det;
+
+	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
+			ACC100_INFO_RING_MASK);
+
+	while (ring_data->valid) {
+
+		rte_bbdev_log_debug(
+				"ACC100 PF Interrupt received, Info Ring data: 0x%x",
+				ring_data->val);
+
+		switch (ring_data->int_nb) {
+		case ACC100_PF_INT_DMA_DL_DESC_IRQ:
+		case ACC100_PF_INT_DMA_UL_DESC_IRQ:
+		case ACC100_PF_INT_DMA_UL5G_DESC_IRQ:
+		case ACC100_PF_INT_DMA_DL5G_DESC_IRQ:
+			deq_intr_det.queue_id = get_queue_id_from_ring_info(
+					dev->data, *ring_data);
+			if (deq_intr_det.queue_id == UINT16_MAX) {
+				rte_bbdev_log(ERR,
+						"Couldn't find queue: aq_id: %u, qg_id: %u, vf_id: %u",
+						ring_data->aq_id,
+						ring_data->qg_id,
+						ring_data->vf_id);
+				return;
+			}
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_DEQUEUE, &deq_intr_det);
+			break;
+		default:
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_ERROR, NULL);
+			break;
+		}
+
+		/* Initialize Info Ring entry and move forward */
+		ring_data->val = 0;
+		++acc100_dev->info_ring_head;
+		ring_data = acc100_dev->info_ring +
+				(acc100_dev->info_ring_head &
+				ACC100_INFO_RING_MASK);
+	}
+}
+
+/* Checks VF Info Ring to find the interrupt cause and handles it accordingly */
+static inline void
+acc100_vf_interrupt_handler(struct rte_bbdev *dev)
+{
+	struct acc100_device *acc100_dev = dev->data->dev_private;
+	volatile union acc100_info_ring_data *ring_data;
+	struct acc100_deq_intr_details deq_intr_det;
+
+	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
+			ACC100_INFO_RING_MASK);
+
+	while (ring_data->valid) {
+
+		rte_bbdev_log_debug(
+				"ACC100 VF Interrupt received, Info Ring data: 0x%x",
+				ring_data->val);
+
+		switch (ring_data->int_nb) {
+		case ACC100_VF_INT_DMA_DL_DESC_IRQ:
+		case ACC100_VF_INT_DMA_UL_DESC_IRQ:
+		case ACC100_VF_INT_DMA_UL5G_DESC_IRQ:
+		case ACC100_VF_INT_DMA_DL5G_DESC_IRQ:
+			/* VFs are not aware of their vf_id - it's set to 0 in
+			 * queue structures.
+			 */
+			ring_data->vf_id = 0;
+			deq_intr_det.queue_id = get_queue_id_from_ring_info(
+					dev->data, *ring_data);
+			if (deq_intr_det.queue_id == UINT16_MAX) {
+				rte_bbdev_log(ERR,
+						"Couldn't find queue: aq_id: %u, qg_id: %u",
+						ring_data->aq_id,
+						ring_data->qg_id);
+				return;
+			}
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_DEQUEUE, &deq_intr_det);
+			break;
+		default:
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_ERROR, NULL);
+			break;
+		}
+
+		/* Initialize Info Ring entry and move forward */
+		ring_data->valid = 0;
+		++acc100_dev->info_ring_head;
+		ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head
+				& ACC100_INFO_RING_MASK);
+	}
+}
+
+/* Interrupt handler triggered by ACC100 dev for handling specific interrupt */
+static void
+acc100_dev_interrupt_handler(void *cb_arg)
+{
+	struct rte_bbdev *dev = cb_arg;
+	struct acc100_device *acc100_dev = dev->data->dev_private;
+
+	/* Read info ring */
+	if (acc100_dev->pf_device)
+		acc100_pf_interrupt_handler(dev);
+	else
+		acc100_vf_interrupt_handler(dev);
+}
+
+/* Allocate and set up the Info Ring */
+static int
+allocate_inforing(struct rte_bbdev *dev)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	const struct acc100_registry_addr *reg_addr;
+	rte_iova_t info_ring_phys;
+	uint32_t phys_low, phys_high;
+
+	if (d->info_ring != NULL)
+		return 0; /* Already configured */
+
+	/* Choose correct registry addresses for the device type */
+	if (d->pf_device)
+		reg_addr = &pf_reg_addr;
+	else
+		reg_addr = &vf_reg_addr;
+	/* Allocate InfoRing */
+	d->info_ring = rte_zmalloc_socket("Info Ring",
+			ACC100_INFO_RING_NUM_ENTRIES *
+			sizeof(*d->info_ring), RTE_CACHE_LINE_SIZE,
+			dev->data->socket_id);
+	if (d->info_ring == NULL) {
+		rte_bbdev_log(ERR,
+				"Failed to allocate Info Ring for %s:%u",
+				dev->device->driver->name,
+				dev->data->dev_id);
+		return -ENOMEM;
+	}
+	info_ring_phys = rte_malloc_virt2iova(d->info_ring);
+
+	/* Setup Info Ring */
+	phys_high = (uint32_t)(info_ring_phys >> 32);
+	phys_low  = (uint32_t)(info_ring_phys);
+	acc100_reg_write(d, reg_addr->info_ring_hi, phys_high);
+	acc100_reg_write(d, reg_addr->info_ring_lo, phys_low);
+	acc100_reg_write(d, reg_addr->info_ring_en, ACC100_REG_IRQ_EN_ALL);
+	d->info_ring_head = (acc100_reg_read(d, reg_addr->info_ring_ptr) &
+			0xFFF) / sizeof(union acc100_info_ring_data);
+	return 0;
+}
+
+
 /* Allocate 64MB memory used for all software rings */
 static int
 acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
@@ -426,6 +633,7 @@
 	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
 	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
 
+	allocate_inforing(dev);
 	d->harq_layout = rte_zmalloc_socket("HARQ Layout",
 			ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
 			RTE_CACHE_LINE_SIZE, dev->data->socket_id);
@@ -437,13 +645,53 @@
 	return 0;
 }
 
+static int
+acc100_intr_enable(struct rte_bbdev *dev)
+{
+	int ret;
+	struct acc100_device *d = dev->data->dev_private;
+
+	/* Only MSI interrupts are currently supported */
+	if (dev->intr_handle->type == RTE_INTR_HANDLE_VFIO_MSI ||
+			dev->intr_handle->type == RTE_INTR_HANDLE_UIO) {
+
+		allocate_inforing(dev);
+
+		ret = rte_intr_enable(dev->intr_handle);
+		if (ret < 0) {
+			rte_bbdev_log(ERR,
+					"Couldn't enable interrupts for device: %s",
+					dev->data->name);
+			rte_free(d->info_ring);
+			return ret;
+		}
+		ret = rte_intr_callback_register(dev->intr_handle,
+				acc100_dev_interrupt_handler, dev);
+		if (ret < 0) {
+			rte_bbdev_log(ERR,
+					"Couldn't register interrupt callback for device: %s",
+					dev->data->name);
+			rte_free(d->info_ring);
+			return ret;
+		}
+
+		return 0;
+	}
+
+	rte_bbdev_log(ERR, "ACC100 (%s) supports only VFIO MSI interrupts",
+			dev->data->name);
+	return -ENOTSUP;
+}
+
 /* Free 64MB memory used for software rings */
 static int
 acc100_dev_close(struct rte_bbdev *dev)
 {
 	struct acc100_device *d = dev->data->dev_private;
+	acc100_check_ir(d);
 	if (d->sw_rings_base != NULL) {
 		rte_free(d->tail_ptrs);
+		rte_free(d->info_ring);
 		rte_free(d->sw_rings_base);
 		d->sw_rings_base = NULL;
 	}
@@ -643,6 +891,7 @@
 					RTE_BBDEV_TURBO_CRC_TYPE_24B |
 					RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
 					RTE_BBDEV_TURBO_EARLY_TERMINATION |
+					RTE_BBDEV_TURBO_DEC_INTERRUPTS |
 					RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
 					RTE_BBDEV_TURBO_MAP_DEC |
 					RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP |
@@ -663,6 +912,7 @@
 					RTE_BBDEV_TURBO_CRC_24B_ATTACH |
 					RTE_BBDEV_TURBO_RV_INDEX_BYPASS |
 					RTE_BBDEV_TURBO_RATE_MATCH |
+					RTE_BBDEV_TURBO_ENC_INTERRUPTS |
 					RTE_BBDEV_TURBO_ENC_SCATTER_GATHER,
 				.num_buffers_src =
 						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
@@ -676,7 +926,8 @@
 				.capability_flags =
 					RTE_BBDEV_LDPC_RATE_MATCH |
 					RTE_BBDEV_LDPC_CRC_24B_ATTACH |
-					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
+					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS |
+					RTE_BBDEV_LDPC_ENC_INTERRUPTS,
 				.num_buffers_src =
 						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
 				.num_buffers_dst =
@@ -701,7 +952,8 @@
 				RTE_BBDEV_LDPC_DECODE_BYPASS |
 				RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
 				RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
-				RTE_BBDEV_LDPC_LLR_COMPRESSION,
+				RTE_BBDEV_LDPC_LLR_COMPRESSION |
+				RTE_BBDEV_LDPC_DEC_INTERRUPTS,
 			.llr_size = 8,
 			.llr_decimals = 1,
 			.num_buffers_src =
@@ -751,14 +1003,39 @@
 #else
 	dev_info->harq_buffer_size = 0;
 #endif
+	acc100_check_ir(d);
+}
+
+static int
+acc100_queue_intr_enable(struct rte_bbdev *dev, uint16_t queue_id)
+{
+	struct acc100_queue *q = dev->data->queues[queue_id].queue_private;
+
+	if (dev->intr_handle->type != RTE_INTR_HANDLE_VFIO_MSI &&
+			dev->intr_handle->type != RTE_INTR_HANDLE_UIO)
+		return -ENOTSUP;
+
+	q->irq_enable = 1;
+	return 0;
+}
+
+static int
+acc100_queue_intr_disable(struct rte_bbdev *dev, uint16_t queue_id)
+{
+	struct acc100_queue *q = dev->data->queues[queue_id].queue_private;
+	q->irq_enable = 0;
+	return 0;
 }
 
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.setup_queues = acc100_setup_queues,
+	.intr_enable = acc100_intr_enable,
 	.close = acc100_dev_close,
 	.info_get = acc100_dev_info_get,
 	.queue_setup = acc100_queue_setup,
 	.queue_release = acc100_queue_release,
+	.queue_intr_enable = acc100_queue_intr_enable,
+	.queue_intr_disable = acc100_queue_intr_disable
 };
 
 /* ACC100 PCI PF address map */
@@ -3018,8 +3295,10 @@
 			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
 	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
 	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
-	if (op->status != 0)
+	if (op->status != 0) {
 		q_data->queue_stats.dequeue_err_count++;
+		acc100_check_ir(q->d);
+	}
 
 	/* CRC invalid if error exists */
 	if (!op->status)
@@ -3076,6 +3355,9 @@
 		op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
 	op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
 
+	if (op->status & (1 << RTE_BBDEV_DRV_ERROR))
+		acc100_check_ir(q->d);
+
 	/* Check if this is the last desc in batch (Atomic Queue) */
 	if (desc->req.last_desc_in_batch) {
 		(*aq_dequeued)++;
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 78686c1..8980fa5 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -559,7 +559,14 @@ struct acc100_device {
 	/* Virtual address of the info memory routed to this function under
 	 * operation, whether it is PF or VF.
 	 */
+	union acc100_info_ring_data *info_ring;
+
 	union acc100_harq_layout_data *harq_layout;
+	/* Virtual Info Ring head */
+	uint16_t info_ring_head;
+	/* Number of bytes available for each queue in device, depending on
+	 * how many queues are enabled with configure()
+	 */
 	uint32_t sw_ring_size;
 	uint32_t ddr_size; /* Size in kB */
 	uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */
@@ -575,4 +582,12 @@ struct acc100_device {
 	bool configured; /**< True if this ACC100 device is configured */
 };
 
+/**
+ * Structure with details about RTE_BBDEV_EVENT_DEQUEUE event. It's passed to
+ * the callback function.
+ */
+struct acc100_deq_intr_details {
+	uint16_t queue_id;
+};
+
 #endif /* _RTE_ACC100_PMD_H_ */
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v3 09/11] baseband/acc100: add debug function to validate input
  2020-08-19  0:25 [dpdk-dev] [PATCH v3 00/11] bbdev PMD ACC100 Nicolas Chautru
                   ` (7 preceding siblings ...)
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 08/11] baseband/acc100: add interrupt support to PMD Nicolas Chautru
@ 2020-08-19  0:25 ` Nicolas Chautru
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 10/11] baseband/acc100: add configure function Nicolas Chautru
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 11/11] doc: update bbdev feature table Nicolas Chautru
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-08-19  0:25 UTC (permalink / raw)
  To: dev, akhil.goyal; +Cc: bruce.richardson, Nicolas Chautru

Debug functions to validate the input API from the user.
Only enabled in DEBUG mode at build time.
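
For illustration only, assuming a build with RTE_LIBRTE_BBDEV_DEBUG
defined and hypothetical dev_id/queue_id values: an op failing one of
these checks is rejected before any descriptor is written, e.g.

#include <stdio.h>
#include <rte_bbdev.h>

static void
demo_invalid_rv_index(uint16_t dev_id, uint16_t queue_id,
		struct rte_bbdev_enc_op *op)
{
	/* rv_index is only valid in [0, 3]; force an invalid value */
	op->ldpc_enc.rv_index = 5;
	if (rte_bbdev_enqueue_ldpc_enc_ops(dev_id, queue_id, &op, 1) == 0)
		/* Refused before any DMA descriptor was built; the PMD
		 * logs "rv_index (5) is out of range 0 <= value <= 3".
		 */
		printf("op rejected by debug validation\n");
}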

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 424 +++++++++++++++++++++++++++++++
 1 file changed, 424 insertions(+)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index ba8e1d8..dc14079 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -1945,6 +1945,231 @@
 
 }
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+/* Validates turbo encoder parameters */
+static inline int
+validate_enc_op(struct rte_bbdev_enc_op *op)
+{
+	struct rte_bbdev_op_turbo_enc *turbo_enc = &op->turbo_enc;
+	struct rte_bbdev_op_enc_turbo_cb_params *cb = NULL;
+	struct rte_bbdev_op_enc_turbo_tb_params *tb = NULL;
+	uint16_t kw, kw_neg, kw_pos;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if (turbo_enc->input.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid input pointer");
+		return -1;
+	}
+	if (turbo_enc->output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid output pointer");
+		return -1;
+	}
+	if (turbo_enc->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				turbo_enc->rv_index);
+		return -1;
+	}
+	if (turbo_enc->code_block_mode != 0 &&
+			turbo_enc->code_block_mode != 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				turbo_enc->code_block_mode);
+		return -1;
+	}
+
+	if (turbo_enc->code_block_mode == 0) {
+		tb = &turbo_enc->tb_params;
+		if ((tb->k_neg < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_neg > RTE_BBDEV_TURBO_MAX_CB_SIZE)
+				&& tb->c_neg > 0) {
+			rte_bbdev_log(ERR,
+					"k_neg (%u) is out of range %u <= value <= %u",
+					tb->k_neg, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (tb->k_pos < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_pos > RTE_BBDEV_TURBO_MAX_CB_SIZE) {
+			rte_bbdev_log(ERR,
+					"k_pos (%u) is out of range %u <= value <= %u",
+					tb->k_pos, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (tb->c_neg > (RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1))
+			rte_bbdev_log(ERR,
+					"c_neg (%u) is out of range 0 <= value <= %u",
+					tb->c_neg,
+					RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1);
+		if (tb->c < 1 || tb->c > RTE_BBDEV_TURBO_MAX_CODE_BLOCKS) {
+			rte_bbdev_log(ERR,
+					"c (%u) is out of range 1 <= value <= %u",
+					tb->c, RTE_BBDEV_TURBO_MAX_CODE_BLOCKS);
+			return -1;
+		}
+		if (tb->cab > tb->c) {
+			rte_bbdev_log(ERR,
+					"cab (%u) is greater than c (%u)",
+					tb->cab, tb->c);
+			return -1;
+		}
+		if ((tb->ea < RTE_BBDEV_TURBO_MIN_CB_SIZE || (tb->ea % 2))
+				&& tb->r < tb->cab) {
+			rte_bbdev_log(ERR,
+					"ea (%u) is less than %u or it is not even",
+					tb->ea, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+		if ((tb->eb < RTE_BBDEV_TURBO_MIN_CB_SIZE || (tb->eb % 2))
+				&& tb->c > tb->cab) {
+			rte_bbdev_log(ERR,
+					"eb (%u) is less than %u or it is not even",
+					tb->eb, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+
+		kw_neg = 3 * RTE_ALIGN_CEIL(tb->k_neg + 4,
+					RTE_BBDEV_TURBO_C_SUBBLOCK);
+		if (tb->ncb_neg < tb->k_neg || tb->ncb_neg > kw_neg) {
+			rte_bbdev_log(ERR,
+					"ncb_neg (%u) is out of range (%u) k_neg <= value <= (%u) kw_neg",
+					tb->ncb_neg, tb->k_neg, kw_neg);
+			return -1;
+		}
+
+		kw_pos = 3 * RTE_ALIGN_CEIL(tb->k_pos + 4,
+					RTE_BBDEV_TURBO_C_SUBBLOCK);
+		if (tb->ncb_pos < tb->k_pos || tb->ncb_pos > kw_pos) {
+			rte_bbdev_log(ERR,
+					"ncb_pos (%u) is out of range (%u) k_pos <= value <= (%u) kw_pos",
+					tb->ncb_pos, tb->k_pos, kw_pos);
+			return -1;
+		}
+		if (tb->r > (tb->c - 1)) {
+			rte_bbdev_log(ERR,
+					"r (%u) is greater than c - 1 (%u)",
+					tb->r, tb->c - 1);
+			return -1;
+		}
+	} else {
+		cb = &turbo_enc->cb_params;
+		if (cb->k < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| cb->k > RTE_BBDEV_TURBO_MAX_CB_SIZE) {
+			rte_bbdev_log(ERR,
+					"k (%u) is out of range %u <= value <= %u",
+					cb->k, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+
+		if (cb->e < RTE_BBDEV_TURBO_MIN_CB_SIZE || (cb->e % 2)) {
+			rte_bbdev_log(ERR,
+					"e (%u) is less than %u or it is not even",
+					cb->e, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+
+		kw = RTE_ALIGN_CEIL(cb->k + 4, RTE_BBDEV_TURBO_C_SUBBLOCK) * 3;
+		if (cb->ncb < cb->k || cb->ncb > kw) {
+			rte_bbdev_log(ERR,
+					"ncb (%u) is out of range (%u) k <= value <= (%u) kw",
+					cb->ncb, cb->k, kw);
+			return -1;
+		}
+	}
+
+	return 0;
+}
+/* Validates LDPC encoder parameters */
+static inline int
+validate_ldpc_enc_op(struct rte_bbdev_enc_op *op)
+{
+	struct rte_bbdev_op_ldpc_enc *ldpc_enc = &op->ldpc_enc;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if (ldpc_enc->input.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid input pointer");
+		return -1;
+	}
+	if (ldpc_enc->output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid output pointer");
+		return -1;
+	}
+	if (ldpc_enc->input.length >
+			RTE_BBDEV_LDPC_MAX_CB_SIZE >> 3) {
+		rte_bbdev_log(ERR, "CB size (%u) is too big, max: %d",
+				ldpc_enc->input.length,
+				RTE_BBDEV_LDPC_MAX_CB_SIZE);
+		return -1;
+	}
+	if ((ldpc_enc->basegraph > 2) || (ldpc_enc->basegraph == 0)) {
+		rte_bbdev_log(ERR,
+				"BG (%u) is out of range 1 <= value <= 2",
+				ldpc_enc->basegraph);
+		return -1;
+	}
+	if (ldpc_enc->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				ldpc_enc->rv_index);
+		return -1;
+	}
+	if (ldpc_enc->code_block_mode > 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				ldpc_enc->code_block_mode);
+		return -1;
+	}
+
+	return 0;
+}
+
+/* Validates LDPC decoder parameters */
+static inline int
+validate_ldpc_dec_op(struct rte_bbdev_dec_op *op)
+{
+	struct rte_bbdev_op_ldpc_dec *ldpc_dec = &op->ldpc_dec;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if ((ldpc_dec->basegraph > 2) || (ldpc_dec->basegraph == 0)) {
+		rte_bbdev_log(ERR,
+				"BG (%u) is out of range 1 <= value <= 2",
+				ldpc_dec->basegraph);
+		return -1;
+	}
+	if (ldpc_dec->iter_max == 0) {
+		rte_bbdev_log(ERR,
+				"iter_max (%u) is equal to 0",
+				ldpc_dec->iter_max);
+		return -1;
+	}
+	if (ldpc_dec->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				ldpc_dec->rv_index);
+		return -1;
+	}
+	if (ldpc_dec->code_block_mode > 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				ldpc_dec->code_block_mode);
+		return -1;
+	}
+
+	return 0;
+}
+#endif
+
 /* Enqueue one encode operations for ACC100 device in CB mode */
 static inline int
 enqueue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
@@ -1956,6 +2181,14 @@
 		seg_total_left;
 	struct rte_mbuf *input, *output_head, *output;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_enc_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2008,6 +2241,14 @@
 	uint16_t  in_length_in_bytes;
 	struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_enc_op(ops[0]) == -1) {
+		rte_bbdev_log(ERR, "LDPC encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2065,6 +2306,14 @@
 		seg_total_left;
 	struct rte_mbuf *input, *output_head, *output;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_enc_op(op) == -1) {
+		rte_bbdev_log(ERR, "LDPC encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2119,6 +2368,14 @@
 	struct rte_mbuf *input, *output_head, *output;
 	uint16_t current_enqueued_cbs = 0;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_enc_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2191,6 +2448,142 @@
 	return current_enqueued_cbs;
 }
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+/* Validates turbo decoder parameters */
+static inline int
+validate_dec_op(struct rte_bbdev_dec_op *op)
+{
+	struct rte_bbdev_op_turbo_dec *turbo_dec = &op->turbo_dec;
+	struct rte_bbdev_op_dec_turbo_cb_params *cb = NULL;
+	struct rte_bbdev_op_dec_turbo_tb_params *tb = NULL;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if (turbo_dec->input.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid input pointer");
+		return -1;
+	}
+	if (turbo_dec->hard_output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid hard_output pointer");
+		return -1;
+	}
+	if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT) &&
+			turbo_dec->soft_output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid soft_output pointer");
+		return -1;
+	}
+	if (turbo_dec->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				turbo_dec->rv_index);
+		return -1;
+	}
+	if (turbo_dec->iter_min < 1) {
+		rte_bbdev_log(ERR,
+				"iter_min (%u) is less than 1",
+				turbo_dec->iter_min);
+		return -1;
+	}
+	if (turbo_dec->iter_max <= 2) {
+		rte_bbdev_log(ERR,
+				"iter_max (%u) is less than or equal to 2",
+				turbo_dec->iter_max);
+		return -1;
+	}
+	if (turbo_dec->iter_min > turbo_dec->iter_max) {
+		rte_bbdev_log(ERR,
+				"iter_min (%u) is greater than iter_max (%u)",
+				turbo_dec->iter_min, turbo_dec->iter_max);
+		return -1;
+	}
+	if (turbo_dec->code_block_mode != 0 &&
+			turbo_dec->code_block_mode != 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				turbo_dec->code_block_mode);
+		return -1;
+	}
+
+	if (turbo_dec->code_block_mode == 0) {
+		tb = &turbo_dec->tb_params;
+		if ((tb->k_neg < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_neg > RTE_BBDEV_TURBO_MAX_CB_SIZE)
+				&& tb->c_neg > 0) {
+			rte_bbdev_log(ERR,
+					"k_neg (%u) is out of range %u <= value <= %u",
+					tb->k_neg, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if ((tb->k_pos < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_pos > RTE_BBDEV_TURBO_MAX_CB_SIZE)
+				&& tb->c > tb->c_neg) {
+			rte_bbdev_log(ERR,
+					"k_pos (%u) is out of range %u <= value <= %u",
+					tb->k_pos, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (tb->c_neg > (RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1))
+			rte_bbdev_log(ERR,
+					"c_neg (%u) is out of range 0 <= value <= %u",
+					tb->c_neg,
+					RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1);
+		if (tb->c < 1 || tb->c > RTE_BBDEV_TURBO_MAX_CODE_BLOCKS) {
+			rte_bbdev_log(ERR,
+					"c (%u) is out of range 1 <= value <= %u",
+					tb->c, RTE_BBDEV_TURBO_MAX_CODE_BLOCKS);
+			return -1;
+		}
+		if (tb->cab > tb->c) {
+			rte_bbdev_log(ERR,
+					"cab (%u) is greater than c (%u)",
+					tb->cab, tb->c);
+			return -1;
+		}
+		if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) &&
+				(tb->ea < RTE_BBDEV_TURBO_MIN_CB_SIZE
+						|| (tb->ea % 2))
+				&& tb->cab > 0) {
+			rte_bbdev_log(ERR,
+					"ea (%u) is less than %u or it is not even",
+					tb->ea, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+		if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) &&
+				(tb->eb < RTE_BBDEV_TURBO_MIN_CB_SIZE
+						|| (tb->eb % 2))
+				&& tb->c > tb->cab) {
+			rte_bbdev_log(ERR,
+					"eb (%u) is less than %u or it is not even",
+					tb->eb, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+		}
+	} else {
+		cb = &turbo_dec->cb_params;
+		if (cb->k < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| cb->k > RTE_BBDEV_TURBO_MAX_CB_SIZE) {
+			rte_bbdev_log(ERR,
+					"k (%u) is out of range %u <= value <= %u",
+					cb->k, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) &&
+				(cb->e < RTE_BBDEV_TURBO_MIN_CB_SIZE ||
+				(cb->e % 2))) {
+			rte_bbdev_log(ERR,
+					"e (%u) is less than %u or it is not even",
+					cb->e, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+	}
+
+	return 0;
+}
+#endif
+
 /** Enqueue one decode operations for ACC100 device in CB mode */
 static inline int
 enqueue_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
@@ -2203,6 +2596,14 @@
 	struct rte_mbuf *input, *h_output_head, *h_output,
 		*s_output_head, *s_output;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo decoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2426,6 +2827,13 @@
 		return ret;
 	}
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "LDPC decoder validation failed");
+		return -EINVAL;
+	}
+#endif
 	union acc100_dma_desc *desc;
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
@@ -2521,6 +2929,14 @@
 	struct rte_mbuf *input, *h_output_head, *h_output;
 	uint16_t current_enqueued_cbs = 0;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "LDPC decoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2611,6 +3027,14 @@
 		*s_output_head, *s_output;
 	uint16_t current_enqueued_cbs = 0;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo decoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v3 10/11] baseband/acc100: add configure function
  2020-08-19  0:25 [dpdk-dev] [PATCH v3 00/11] bbdev PMD ACC100 Nicolas Chautru
                   ` (8 preceding siblings ...)
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 09/11] baseband/acc100: add debug function to validate input Nicolas Chautru
@ 2020-08-19  0:25 ` Nicolas Chautru
  2020-09-03 10:06   ` Aidan Goddard
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 11/11] doc: update bbdev feature table Nicolas Chautru
  10 siblings, 1 reply; 213+ messages in thread
From: Nicolas Chautru @ 2020-08-19  0:25 UTC (permalink / raw)
  To: dev, akhil.goyal; +Cc: bruce.richardson, Nicolas Chautru

Add a configure function to configure the PF from within
bbdev-test itself, without an external application
configuring the device.
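
For context, the call sequence reduces to the minimal sketch below
(the PCI address is a placeholder; the full queue-topology fill-in is
in the test code in this patch):

#include <stdbool.h>
#include <string.h>
#include <rte_acc100_cfg.h>

static int
configure_acc100_pf(void)
{
	struct acc100_conf conf;

	memset(&conf, 0, sizeof(conf));
	conf.pf_mode_en = true; /* PF queues the workload directly */
	conf.num_vf_bundles = 1;
	/* ... fill q_ul_4g/q_dl_4g/q_ul_5g/q_dl_5g and arbitration ... */
	return acc100_configure("0000:b3:00.0", &conf);
}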

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 app/test-bbdev/test_bbdev_perf.c                   |  72 +++
 drivers/baseband/acc100/Makefile                   |   3 +
 drivers/baseband/acc100/meson.build                |   2 +
 drivers/baseband/acc100/rte_acc100_cfg.h           |  17 +
 drivers/baseband/acc100/rte_acc100_pmd.c           | 505 +++++++++++++++++++++
 .../acc100/rte_pmd_bbdev_acc100_version.map        |   7 +
 6 files changed, 606 insertions(+)

diff --git a/app/test-bbdev/test_bbdev_perf.c b/app/test-bbdev/test_bbdev_perf.c
index 45c0d62..32f23ff 100644
--- a/app/test-bbdev/test_bbdev_perf.c
+++ b/app/test-bbdev/test_bbdev_perf.c
@@ -52,6 +52,18 @@
 #define FLR_5G_TIMEOUT 610
 #endif
 
+#ifdef RTE_LIBRTE_PMD_BBDEV_ACC100
+#include <rte_acc100_cfg.h>
+#define ACC100PF_DRIVER_NAME   ("intel_acc100_pf")
+#define ACC100VF_DRIVER_NAME   ("intel_acc100_vf")
+#define ACC100_QMGR_NUM_AQS 16
+#define ACC100_QMGR_NUM_QGS 2
+#define ACC100_QMGR_AQ_DEPTH 5
+#define ACC100_QMGR_INVALID_IDX -1
+#define ACC100_QMGR_RR 1
+#define ACC100_QOS_GBR 0
+#endif
+
 #define OPS_CACHE_SIZE 256U
 #define OPS_POOL_SIZE_MIN 511U /* 0.5K per queue */
 
@@ -653,6 +665,66 @@ typedef int (test_case_function)(struct active_device *ad,
 				info->dev_name);
 	}
 #endif
+#ifdef RTE_LIBRTE_PMD_BBDEV_ACC100
+	if ((get_init_device() == true) &&
+		(!strcmp(info->drv.driver_name, ACC100PF_DRIVER_NAME))) {
+		struct acc100_conf conf;
+		unsigned int i;
+
+		printf("Configure ACC100 FEC Driver %s with default values\n",
+				info->drv.driver_name);
+
+		/* clear default configuration before initialization */
+		memset(&conf, 0, sizeof(struct acc100_conf));
+
+		/* Always set in PF mode for built-in configuration */
+		conf.pf_mode_en = true;
+		for (i = 0; i < RTE_ACC100_NUM_VFS; ++i) {
+			conf.arb_dl_4g[i].gbr_threshold1 = ACC100_QOS_GBR;
+			conf.arb_dl_4g[i].gbr_threshold2 = ACC100_QOS_GBR;
+			conf.arb_dl_4g[i].round_robin_weight = ACC100_QMGR_RR;
+			conf.arb_ul_4g[i].gbr_threshold1 = ACC100_QOS_GBR;
+			conf.arb_ul_4g[i].gbr_threshold2 = ACC100_QOS_GBR;
+			conf.arb_ul_4g[i].round_robin_weight = ACC100_QMGR_RR;
+			conf.arb_dl_5g[i].gbr_threshold1 = ACC100_QOS_GBR;
+			conf.arb_dl_5g[i].gbr_threshold2 = ACC100_QOS_GBR;
+			conf.arb_dl_5g[i].round_robin_weight = ACC100_QMGR_RR;
+			conf.arb_ul_5g[i].gbr_threshold1 = ACC100_QOS_GBR;
+			conf.arb_ul_5g[i].gbr_threshold2 = ACC100_QOS_GBR;
+			conf.arb_ul_5g[i].round_robin_weight = ACC100_QMGR_RR;
+		}
+
+		conf.input_pos_llr_1_bit = true;
+		conf.output_pos_llr_1_bit = true;
+		conf.num_vf_bundles = 1; /**< Number of VF bundles to setup */
+
+		conf.q_ul_4g.num_qgroups = ACC100_QMGR_NUM_QGS;
+		conf.q_ul_4g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+		conf.q_ul_4g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+		conf.q_ul_4g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+		conf.q_dl_4g.num_qgroups = ACC100_QMGR_NUM_QGS;
+		conf.q_dl_4g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+		conf.q_dl_4g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+		conf.q_dl_4g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+		conf.q_ul_5g.num_qgroups = ACC100_QMGR_NUM_QGS;
+		conf.q_ul_5g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+		conf.q_ul_5g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+		conf.q_ul_5g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+		conf.q_dl_5g.num_qgroups = ACC100_QMGR_NUM_QGS;
+		conf.q_dl_5g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+		conf.q_dl_5g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+		conf.q_dl_5g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+
+		/* setup PF with configuration information */
+		ret = acc100_configure(info->dev_name, &conf);
+		TEST_ASSERT_SUCCESS(ret,
+				"Failed to configure ACC100 PF for bbdev %s",
+				info->dev_name);
+		/* Let's refresh this now that the device is configured */
+	}
+	rte_bbdev_info_get(dev_id, info);
+#endif
+
 	nb_queues = RTE_MIN(rte_lcore_count(), info->drv.max_num_queues);
 	nb_queues = RTE_MIN(nb_queues, (unsigned int) MAX_QUEUES);
 
diff --git a/drivers/baseband/acc100/Makefile b/drivers/baseband/acc100/Makefile
index c79e487..37e73af 100644
--- a/drivers/baseband/acc100/Makefile
+++ b/drivers/baseband/acc100/Makefile
@@ -22,4 +22,7 @@ LIBABIVER := 1
 # library source files
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100) += rte_acc100_pmd.c
 
+# export include files
+SYMLINK-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100)-include += rte_acc100_cfg.h
+
 include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/baseband/acc100/meson.build b/drivers/baseband/acc100/meson.build
index 8afafc2..7ac44dc 100644
--- a/drivers/baseband/acc100/meson.build
+++ b/drivers/baseband/acc100/meson.build
@@ -4,3 +4,5 @@
 deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci']
 
 sources = files('rte_acc100_pmd.c')
+
+install_headers('rte_acc100_cfg.h')
diff --git a/drivers/baseband/acc100/rte_acc100_cfg.h b/drivers/baseband/acc100/rte_acc100_cfg.h
index 73bbe36..7f523bc 100644
--- a/drivers/baseband/acc100/rte_acc100_cfg.h
+++ b/drivers/baseband/acc100/rte_acc100_cfg.h
@@ -89,6 +89,23 @@ struct acc100_conf {
 	struct rte_arbitration_t arb_dl_5g[RTE_ACC100_NUM_VFS];
 };
 
+/**
+ * Configure an ACC100 device
+ *
+ * @param dev_name
+ *   The name of the device. This is the short form of PCI BDF, e.g. 00:01.0.
+ *   It can also be retrieved for a bbdev device from the dev_name field in the
+ *   rte_bbdev_info structure returned by rte_bbdev_info_get().
+ * @param conf
+ *   Configuration to apply to ACC100 HW.
+ *
+ * @return
+ *   Zero on success, negative value on failure.
+ */
+__rte_experimental
+int
+acc100_configure(const char *dev_name, struct acc100_conf *conf);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index dc14079..43f664b 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -85,6 +85,26 @@
 
 enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC};
 
+/* Return the accelerator enum for a Queue Group Index */
+static inline int
+accFromQgid(int qg_idx, const struct acc100_conf *acc100_conf)
+{
+	int accQg[ACC100_NUM_QGRPS];
+	int NumQGroupsPerFn[NUM_ACC];
+	int acc, qgIdx, qgIndex = 0;
+	for (qgIdx = 0; qgIdx < ACC100_NUM_QGRPS; qgIdx++)
+		accQg[qgIdx] = 0;
+	NumQGroupsPerFn[UL_4G] = acc100_conf->q_ul_4g.num_qgroups;
+	NumQGroupsPerFn[UL_5G] = acc100_conf->q_ul_5g.num_qgroups;
+	NumQGroupsPerFn[DL_4G] = acc100_conf->q_dl_4g.num_qgroups;
+	NumQGroupsPerFn[DL_5G] = acc100_conf->q_dl_5g.num_qgroups;
+	for (acc = UL_4G;  acc < NUM_ACC; acc++)
+		for (qgIdx = 0; qgIdx < NumQGroupsPerFn[acc]; qgIdx++)
+			accQg[qgIndex++] = acc;
+	acc = accQg[qg_idx];
+	return acc;
+}
+
 /* Return the queue topology for a Queue Group Index */
 static inline void
 qtopFromAcc(struct rte_q_topology_t **qtop, int acc_enum,
@@ -113,6 +133,30 @@
 	*qtop = p_qtop;
 }
 
+/* Return the AQ depth for a Queue Group Index */
+static inline int
+aqDepth(int qg_idx, struct acc100_conf *acc100_conf)
+{
+	struct rte_q_topology_t *q_top = NULL;
+	int acc_enum = accFromQgid(qg_idx, acc100_conf);
+	qtopFromAcc(&q_top, acc_enum, acc100_conf);
+	if (unlikely(q_top == NULL))
+		return 0;
+	return q_top->aq_depth_log2;
+}
+
+/* Return the number of AQs for a Queue Group Index */
+static inline int
+aqNum(int qg_idx, struct acc100_conf *acc100_conf)
+{
+	struct rte_q_topology_t *q_top = NULL;
+	int acc_enum = accFromQgid(qg_idx, acc100_conf);
+	qtopFromAcc(&q_top, acc_enum, acc100_conf);
+	if (unlikely(q_top == NULL))
+		return 0;
+	return q_top->num_aqs_per_groups;
+}
+
 static void
 initQTop(struct acc100_conf *acc100_conf)
 {
@@ -4177,3 +4221,464 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
 RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
 RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
 RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
+
+/*
+ * Implementation to fix the power on status of some 5GUL engines
+ * This requires DMA permission if ported outside DPDK
+ */
+static void
+poweron_cleanup(struct rte_bbdev *bbdev, struct acc100_device *d,
+		struct acc100_conf *conf)
+{
+	int i, template_idx, qg_idx;
+	uint32_t address, status, payload;
+	printf("Need to clear power-on 5GUL status in internal memory\n");
+	/* Reset LDPC Cores */
+	for (i = 0; i < ACC100_ENGINES_MAX; i++)
+		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+				ACC100_ENGINE_OFFSET * i, ACC100_RESET_HI);
+	usleep(LONG_WAIT);
+	for (i = 0; i < ACC100_ENGINES_MAX; i++)
+		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+				ACC100_ENGINE_OFFSET * i, ACC100_RESET_LO);
+	usleep(LONG_WAIT);
+	/* Prepare dummy workload */
+	alloc_2x64mb_sw_rings_mem(bbdev, d, 0);
+	/* Set base addresses */
+	uint32_t phys_high = (uint32_t)(d->sw_rings_phys >> 32);
+	uint32_t phys_low  = (uint32_t)(d->sw_rings_phys &
+			~(ACC100_SIZE_64MBYTE-1));
+	acc100_reg_write(d, HWPfDmaFec5GulDescBaseHiRegVf, phys_high);
+	acc100_reg_write(d, HWPfDmaFec5GulDescBaseLoRegVf, phys_low);
+
+	/* Descriptor for a dummy 5GUL code block processing */
+	union acc100_dma_desc *desc = NULL;
+	desc = d->sw_rings;
+	desc->req.data_ptrs[0].address = d->sw_rings_phys +
+			ACC100_DESC_FCW_OFFSET;
+	desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
+	desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
+	desc->req.data_ptrs[0].last = 0;
+	desc->req.data_ptrs[0].dma_ext = 0;
+	desc->req.data_ptrs[1].address = d->sw_rings_phys + 512;
+	desc->req.data_ptrs[1].blkid = ACC100_DMA_BLKID_IN;
+	desc->req.data_ptrs[1].last = 1;
+	desc->req.data_ptrs[1].dma_ext = 0;
+	desc->req.data_ptrs[1].blen = 44;
+	desc->req.data_ptrs[2].address = d->sw_rings_phys + 1024;
+	desc->req.data_ptrs[2].blkid = ACC100_DMA_BLKID_OUT_ENC;
+	desc->req.data_ptrs[2].last = 1;
+	desc->req.data_ptrs[2].dma_ext = 0;
+	desc->req.data_ptrs[2].blen = 5;
+	/* Dummy FCW */
+	desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
+	desc->req.fcw_ld.qm = 1;
+	desc->req.fcw_ld.nfiller = 30;
+	desc->req.fcw_ld.BG = 2 - 1;
+	desc->req.fcw_ld.Zc = 7;
+	desc->req.fcw_ld.ncb = 350;
+	desc->req.fcw_ld.rm_e = 4;
+	desc->req.fcw_ld.itmax = 10;
+	desc->req.fcw_ld.gain_i = 1;
+	desc->req.fcw_ld.gain_h = 1;
+
+	int engines_to_restart[SIG_UL_5G_LAST + 1] = {0};
+	int num_failed_engine = 0;
+	/* Detect engines in undefined state */
+	for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
+			template_idx++) {
+		/* Check engine power-on status */
+		address = HwPfFecUl5gIbDebugReg +
+				ACC100_ENGINE_OFFSET * template_idx;
+		status = (acc100_reg_read(d, address) >> 4) & 0xF;
+		if (status == 0) {
+			engines_to_restart[num_failed_engine] = template_idx;
+			num_failed_engine++;
+		}
+	}
+
+	int numQqsAcc = conf->q_ul_5g.num_qgroups;
+	int numQgs = conf->q_ul_5g.num_qgroups;
+	payload = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		payload |= (1 << qg_idx);
+	/* Force each engine which is in unspecified state */
+	for (i = 0; i < num_failed_engine; i++) {
+		int failed_engine = engines_to_restart[i];
+		printf("Force engine %d\n", failed_engine);
+		for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
+				template_idx++) {
+			address = HWPfQmgrGrpTmplateReg4Indx
+					+ BYTES_IN_WORD * template_idx;
+			if (template_idx == failed_engine)
+				acc100_reg_write(d, address, payload);
+			else
+				acc100_reg_write(d, address, 0);
+		}
+		/* Reset descriptor header */
+		desc->req.word0 = ACC100_DMA_DESC_TYPE;
+		desc->req.word1 = 0;
+		desc->req.word2 = 0;
+		desc->req.word3 = 0;
+		desc->req.numCBs = 1;
+		desc->req.m2dlen = 2;
+		desc->req.d2mlen = 1;
+		/* Enqueue the code block for processing */
+		union acc100_enqueue_reg_fmt enq_req;
+		enq_req.val = 0;
+		enq_req.addr_offset = ACC100_DESC_OFFSET;
+		enq_req.num_elem = 1;
+		enq_req.req_elem_addr = 0;
+		rte_wmb();
+		acc100_reg_write(d, HWPfQmgrIngressAq + 0x100, enq_req.val);
+		usleep(LONG_WAIT * 100);
+		if (desc->req.word0 != 2)
+			printf("DMA Response %#"PRIx32"\n", desc->req.word0);
+	}
+
+	/* Reset LDPC Cores */
+	for (i = 0; i < ACC100_ENGINES_MAX; i++)
+		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+				ACC100_ENGINE_OFFSET * i, ACC100_RESET_HI);
+	usleep(LONG_WAIT);
+	for (i = 0; i < ACC100_ENGINES_MAX; i++)
+		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+				ACC100_ENGINE_OFFSET * i, ACC100_RESET_LO);
+	usleep(LONG_WAIT);
+	acc100_reg_write(d, HWPfHi5GHardResetReg, ACC100_RESET_HARD);
+	usleep(LONG_WAIT);
+	int numEngines = 0;
+	/* Check engine power-on status again */
+	for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
+			template_idx++) {
+		address = HwPfFecUl5gIbDebugReg +
+				ACC100_ENGINE_OFFSET * template_idx;
+		status = (acc100_reg_read(d, address) >> 4) & 0xF;
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD * template_idx;
+		if (status == 1) {
+			acc100_reg_write(d, address, payload);
+			numEngines++;
+		} else
+			acc100_reg_write(d, address, 0);
+	}
+	printf("Number of 5GUL engines %d\n", numEngines);
+
+	if (d->sw_rings_base != NULL)
+		rte_free(d->sw_rings_base);
+	usleep(LONG_WAIT);
+}
+
+/* Initial configuration of an ACC100 device prior to running configure() */
+int
+acc100_configure(const char *dev_name, struct acc100_conf *conf)
+{
+	rte_bbdev_log(INFO, "acc100_configure");
+	uint32_t payload, address, status;
+	int qg_idx, template_idx, vf_idx, acc, i;
+	struct rte_bbdev *bbdev = rte_bbdev_get_named_dev(dev_name);
+
+	/* Compile time checks */
+	RTE_BUILD_BUG_ON(sizeof(struct acc100_dma_req_desc) != 256);
+	RTE_BUILD_BUG_ON(sizeof(union acc100_dma_desc) != 256);
+	RTE_BUILD_BUG_ON(sizeof(struct acc100_fcw_td) != 24);
+	RTE_BUILD_BUG_ON(sizeof(struct acc100_fcw_te) != 32);
+
+	if (bbdev == NULL) {
+		rte_bbdev_log(ERR,
+		"Invalid dev_name (%s), or device is not yet initialised",
+		dev_name);
+		return -ENODEV;
+	}
+	struct acc100_device *d = bbdev->data->dev_private;
+
+	/* Store configuration */
+	rte_memcpy(&d->acc100_conf, conf, sizeof(d->acc100_conf));
+
+	/* PCIe Bridge configuration */
+	acc100_reg_write(d, HwPfPcieGpexBridgeControl, ACC100_CFG_PCI_BRIDGE);
+	for (i = 1; i < 17; i++)
+		acc100_reg_write(d,
+				HwPfPcieGpexAxiAddrMappingWindowPexBaseHigh
+				+ i * 16, 0);
+
+	/* PCIe Link Training and Status State Machine */
+	acc100_reg_write(d, HwPfPcieGpexLtssmStateCntrl, 0xDFC00000);
+
+	/* Prevent blocking AXI read on BRESP for AXI Write */
+	address = HwPfPcieGpexAxiPioControl;
+	payload = ACC100_CFG_PCI_AXI;
+	acc100_reg_write(d, address, payload);
+
+	/* 5GDL PLL phase shift */
+	acc100_reg_write(d, HWPfChaDl5gPllPhshft0, 0x1);
+
+	/* Explicitly releasing AXI as this may be stopped after PF FLR/BME */
+	address = HWPfDmaAxiControl;
+	payload = 1;
+	acc100_reg_write(d, address, payload);
+
+	/* DDR Configuration */
+	address = HWPfDdrBcTim6;
+	payload = acc100_reg_read(d, address);
+	payload &= 0xFFFFFFFB; /* Clear bit 2 */
+#ifdef ACC100_DDR_ECC_ENABLE
+	payload |= 0x4;
+#endif
+	acc100_reg_write(d, address, payload);
+	address = HWPfDdrPhyDqsCountNum;
+#ifdef ACC100_DDR_ECC_ENABLE
+	payload = 9;
+#else
+	payload = 8;
+#endif
+	acc100_reg_write(d, address, payload);
+
+	/* Set default descriptor signature */
+	address = HWPfDmaDescriptorSignatuture;
+	payload = 0;
+	acc100_reg_write(d, address, payload);
+
+	/* Enable the Error Detection in DMA */
+	payload = ACC100_CFG_DMA_ERROR;
+	address = HWPfDmaErrorDetectionEn;
+	acc100_reg_write(d, address, payload);
+
+	/* AXI Cache configuration */
+	payload = ACC100_CFG_AXI_CACHE;
+	address = HWPfDmaAxcacheReg;
+	acc100_reg_write(d, address, payload);
+
+	/* Default DMA Configuration (Qmgr Enabled) */
+	address = HWPfDmaConfig0Reg;
+	payload = 0;
+	acc100_reg_write(d, address, payload);
+	address = HWPfDmaQmanen;
+	payload = 0;
+	acc100_reg_write(d, address, payload);
+
+	/* Default RLIM/ALEN configuration */
+	address = HWPfDmaConfig1Reg;
+	payload = (1 << 31) + (23 << 8) + (1 << 6) + 7;
+	acc100_reg_write(d, address, payload);
+
+	/* Configure DMA Qmanager addresses */
+	address = HWPfDmaQmgrAddrReg;
+	payload = HWPfQmgrEgressQueuesTemplate;
+	acc100_reg_write(d, address, payload);
+
+	/* ===== Qmgr Configuration ===== */
+	/* Configuration of the AQueue Depth QMGR_GRP_0_DEPTH_LOG2 for UL */
+	int totalQgs = conf->q_ul_4g.num_qgroups +
+			conf->q_ul_5g.num_qgroups +
+			conf->q_dl_4g.num_qgroups +
+			conf->q_dl_5g.num_qgroups;
+	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
+		address = HWPfQmgrDepthLog2Grp + BYTES_IN_WORD * qg_idx;
+		payload = aqDepth(qg_idx, conf);
+		acc100_reg_write(d, address, payload);
+		address = HWPfQmgrTholdGrp + BYTES_IN_WORD * qg_idx;
+		payload = (1 << 16) + (1 << (aqDepth(qg_idx, conf) - 1));
+		acc100_reg_write(d, address, payload);
+	}
+
+	/* Template Priority in incremental order */
+	for (template_idx = 0; template_idx < ACC100_NUM_TMPL;
+			template_idx++) {
+		address = HWPfQmgrGrpTmplateReg0Indx +
+				BYTES_IN_WORD * (template_idx % 8);
+		payload = TMPL_PRI_0;
+		acc100_reg_write(d, address, payload);
+		address = HWPfQmgrGrpTmplateReg1Indx +
+				BYTES_IN_WORD * (template_idx % 8);
+		payload = TMPL_PRI_1;
+		acc100_reg_write(d, address, payload);
+		address = HWPfQmgrGrpTmplateReg2indx +
+				BYTES_IN_WORD * (template_idx % 8);
+		payload = TMPL_PRI_2;
+		acc100_reg_write(d, address, payload);
+		address = HWPfQmgrGrpTmplateReg3Indx +
+				BYTES_IN_WORD * (template_idx % 8);
+		payload = TMPL_PRI_3;
+		acc100_reg_write(d, address, payload);
+	}
+
+	address = HWPfQmgrGrpPriority;
+	payload = ACC100_CFG_QMGR_HI_P;
+	acc100_reg_write(d, address, payload);
+
+	/* Template Configuration */
+	for (template_idx = 0; template_idx < ACC100_NUM_TMPL; template_idx++) {
+		payload = 0;
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD * template_idx;
+		acc100_reg_write(d, address, payload);
+	}
+	/* 4GUL */
+	int numQgs = conf->q_ul_4g.num_qgroups;
+	int numQqsAcc = 0;
+	payload = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		payload |= (1 << qg_idx);
+	for (template_idx = SIG_UL_4G; template_idx <= SIG_UL_4G_LAST;
+			template_idx++) {
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD*template_idx;
+		acc100_reg_write(d, address, payload);
+	}
+	/* 5GUL */
+	numQqsAcc += numQgs;
+	numQgs	= conf->q_ul_5g.num_qgroups;
+	payload = 0;
+	int numEngines = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		payload |= (1 << qg_idx);
+	for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
+			template_idx++) {
+		/* Check engine power-on status */
+		address = HwPfFecUl5gIbDebugReg +
+				ACC100_ENGINE_OFFSET * template_idx;
+		status = (acc100_reg_read(d, address) >> 4) & 0xF;
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD * template_idx;
+		if (status == 1) {
+			acc100_reg_write(d, address, payload);
+			numEngines++;
+		} else
+			acc100_reg_write(d, address, 0);
+		#if RTE_ACC100_SINGLE_FEC == 1
+		payload = 0;
+		#endif
+	}
+	printf("Number of 5GUL engines %d\n", numEngines);
+	/* 4GDL */
+	numQqsAcc += numQgs;
+	numQgs	= conf->q_dl_4g.num_qgroups;
+	payload = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		payload |= (1 << qg_idx);
+	for (template_idx = SIG_DL_4G; template_idx <= SIG_DL_4G_LAST;
+			template_idx++) {
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD*template_idx;
+		acc100_reg_write(d, address, payload);
+		#if RTE_ACC100_SINGLE_FEC == 1
+			payload = 0;
+		#endif
+	}
+	/* 5GDL */
+	numQqsAcc += numQgs;
+	numQgs	= conf->q_dl_5g.num_qgroups;
+	payload = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		payload |= (1 << qg_idx);
+	for (template_idx = SIG_DL_5G; template_idx <= SIG_DL_5G_LAST;
+			template_idx++) {
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD*template_idx;
+		acc100_reg_write(d, address, payload);
+		#if RTE_ACC100_SINGLE_FEC == 1
+		payload = 0;
+		#endif
+	}
+
+	/* Queue Group Function mapping */
+	int qman_func_id[5] = {0, 2, 1, 3, 4};
+	address = HWPfQmgrGrpFunction0;
+	payload = 0;
+	for (qg_idx = 0; qg_idx < 8; qg_idx++) {
+		acc = accFromQgid(qg_idx, conf);
+		payload |= qman_func_id[acc]<<(qg_idx * 4);
+	}
+	acc100_reg_write(d, address, payload);
+
+	/* Configuration of the Arbitration QGroup depth to 1 */
+	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
+		address = HWPfQmgrArbQDepthGrp + BYTES_IN_WORD * qg_idx;
+		payload = 0;
+		acc100_reg_write(d, address, payload);
+	}
+
+	/* Enabling AQueues through the Queue hierarchy*/
+	for (vf_idx = 0; vf_idx < ACC100_NUM_VFS; vf_idx++) {
+		for (qg_idx = 0; qg_idx < ACC100_NUM_QGRPS; qg_idx++) {
+			payload = 0;
+			if (vf_idx < conf->num_vf_bundles &&
+					qg_idx < totalQgs)
+				payload = (1 << aqNum(qg_idx, conf)) - 1;
+			address = HWPfQmgrAqEnableVf
+					+ vf_idx * BYTES_IN_WORD;
+			payload += (qg_idx << 16);
+			acc100_reg_write(d, address, payload);
+		}
+	}
+
+	/* This pointer to ARAM (256kB) is shifted by 2 (4B per register) */
+	uint32_t aram_address = 0;
+	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
+		for (vf_idx = 0; vf_idx < conf->num_vf_bundles; vf_idx++) {
+			address = HWPfQmgrVfBaseAddr +
+					vf_idx * BYTES_IN_WORD +
+					qg_idx * BYTES_IN_WORD * 64;
+			payload = aram_address;
+			acc100_reg_write(d, address, payload);
+			/* Offset ARAM Address for next memory bank
+			 * - increment of 4B
+			 */
+			aram_address += aqNum(qg_idx, conf) *
+					(1 << aqDepth(qg_idx, conf));
+		}
+	}
+
+	if (aram_address > WORDS_IN_ARAM_SIZE) {
+		rte_bbdev_log(ERR, "ARAM Configuration not fitting %d %d\n",
+				aram_address, WORDS_IN_ARAM_SIZE);
+		return -EINVAL;
+	}
+
+	/* ==== HI Configuration ==== */
+
+	/* Prevent Block on Transmit Error */
+	address = HWPfHiBlockTransmitOnErrorEn;
+	payload = 0;
+	acc100_reg_write(d, address, payload);
+	/* Prevent MSI from being dropped */
+	address = HWPfHiMsiDropEnableReg;
+	payload = 0;
+	acc100_reg_write(d, address, payload);
+	/* Set the PF Mode register */
+	address = HWPfHiPfMode;
+	payload = (conf->pf_mode_en) ? 2 : 0;
+	acc100_reg_write(d, address, payload);
+	/* Enable Error Detection in HW */
+	address = HWPfDmaErrorDetectionEn;
+	payload = 0x3D7;
+	acc100_reg_write(d, address, payload);
+
+	/* QoS overflow init */
+	payload = 1;
+	address = HWPfQosmonAEvalOverflow0;
+	acc100_reg_write(d, address, payload);
+	address = HWPfQosmonBEvalOverflow0;
+	acc100_reg_write(d, address, payload);
+
+	/* HARQ DDR Configuration */
+	unsigned int ddrSizeInMb = 512; /* Fixed to 512 MB per VF for now */
+	for (vf_idx = 0; vf_idx < conf->num_vf_bundles; vf_idx++) {
+		address = HWPfDmaVfDdrBaseRw + vf_idx * 0x10;
+		payload = ((vf_idx * (ddrSizeInMb / 64)) << 16) +
+				(ddrSizeInMb - 1);
+		acc100_reg_write(d, address, payload);
+	}
+	usleep(LONG_WAIT);
+
+	if (numEngines < (SIG_UL_5G_LAST + 1))
+		poweron_cleanup(bbdev, d, conf);
+
+	rte_bbdev_log_debug("PF Tip configuration complete for %s", dev_name);
+	return 0;
+}
diff --git a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
index 4a76d1d..91c234d 100644
--- a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
+++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
@@ -1,3 +1,10 @@
 DPDK_21 {
 	local: *;
 };
+
+EXPERIMENTAL {
+	global:
+
+	acc100_configure;
+
+};
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v3 11/11] doc: update bbdev feature table
  2020-08-19  0:25 [dpdk-dev] [PATCH v3 00/11] bbdev PMD ACC100 Nicolas Chautru
                   ` (9 preceding siblings ...)
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 10/11] baseband/acc100: add configure function Nicolas Chautru
@ 2020-08-19  0:25 ` Nicolas Chautru
  2020-09-04 17:53   ` [dpdk-dev] [PATCH v4 00/11] bbdev PMD ACC100 Nicolas Chautru
                     ` (8 more replies)
  10 siblings, 9 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-08-19  0:25 UTC (permalink / raw)
  To: dev, akhil.goyal; +Cc: bruce.richardson, Nicolas Chautru

Correcting overview matrix to use acc100 name

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 doc/guides/bbdevs/features/acc100.ini | 14 ++++++++++++++
 doc/guides/bbdevs/features/mbc.ini    | 14 --------------
 2 files changed, 14 insertions(+), 14 deletions(-)
 create mode 100644 doc/guides/bbdevs/features/acc100.ini
 delete mode 100644 doc/guides/bbdevs/features/mbc.ini

diff --git a/doc/guides/bbdevs/features/acc100.ini b/doc/guides/bbdevs/features/acc100.ini
new file mode 100644
index 0000000..642cd48
--- /dev/null
+++ b/doc/guides/bbdevs/features/acc100.ini
@@ -0,0 +1,14 @@
+;
+; Supported features of the 'acc100' bbdev driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+Turbo Decoder (4G)     = Y
+Turbo Encoder (4G)     = Y
+LDPC Decoder (5G)      = Y
+LDPC Encoder (5G)      = Y
+LLR/HARQ Compression   = Y
+External DDR Access    = Y
+HW Accelerated         = Y
+BBDEV API              = Y
diff --git a/doc/guides/bbdevs/features/mbc.ini b/doc/guides/bbdevs/features/mbc.ini
deleted file mode 100644
index 78a7b95..0000000
--- a/doc/guides/bbdevs/features/mbc.ini
+++ /dev/null
@@ -1,14 +0,0 @@
-;
-; Supported features of the 'mbc' bbdev driver.
-;
-; Refer to default.ini for the full list of available PMD features.
-;
-[Features]
-Turbo Decoder (4G)     = Y
-Turbo Encoder (4G)     = Y
-LDPC Decoder (5G)      = Y
-LDPC Encoder (5G)      = Y
-LLR/HARQ Compression   = Y
-External DDR Access    = Y
-HW Accelerated         = Y
-BBDEV API              = Y
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions Nicolas Chautru
@ 2020-08-20 14:38   ` Dave Burley
  2020-08-20 14:52     ` Chautru, Nicolas
  2020-08-29 11:10   ` Xu, Rosen
  1 sibling, 1 reply; 213+ messages in thread
From: Dave Burley @ 2020-08-20 14:38 UTC (permalink / raw)
  To: Nicolas Chautru, dev; +Cc: bruce.richardson

Hi Nic,

As you've now specified the use of RTE_BBDEV_LDPC_LLR_COMPRESSION for this PMD, please could you confirm what the packed format of the LLRs in memory looks like?

Best Regards

Dave Burley


From: dev <dev-bounces@dpdk.org> on behalf of Nicolas Chautru <nicolas.chautru@intel.com>
Sent: 19 August 2020 01:25
To: dev@dpdk.org <dev@dpdk.org>; akhil.goyal@nxp.com <akhil.goyal@nxp.com>
Cc: bruce.richardson@intel.com <bruce.richardson@intel.com>; Nicolas Chautru <nicolas.chautru@intel.com>
Subject: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions 
 
Adding LDPC decode and encode processing operations

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 1625 +++++++++++++++++++++++++++++-
 drivers/baseband/acc100/rte_acc100_pmd.h |    3 +
 2 files changed, 1626 insertions(+), 2 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 7a21c57..5f32813 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -15,6 +15,9 @@
 #include <rte_hexdump.h>
 #include <rte_pci.h>
 #include <rte_bus_pci.h>
+#ifdef RTE_BBDEV_OFFLOAD_COST
+#include <rte_cycles.h>
+#endif
 
 #include <rte_bbdev.h>
 #include <rte_bbdev_pmd.h>
@@ -449,7 +452,6 @@
         return 0;
 }
 
-
 /**
  * Report a ACC100 queue index which is free
  * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
@@ -634,6 +636,46 @@
         struct acc100_device *d = dev->data->dev_private;
 
         static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
+               {
+                       .type   = RTE_BBDEV_OP_LDPC_ENC,
+                       .cap.ldpc_enc = {
+                               .capability_flags =
+                                       RTE_BBDEV_LDPC_RATE_MATCH |
+                                       RTE_BBDEV_LDPC_CRC_24B_ATTACH |
+                                       RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
+                               .num_buffers_src =
+                                               RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+                               .num_buffers_dst =
+                                               RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+                       }
+               },
+               {
+                       .type   = RTE_BBDEV_OP_LDPC_DEC,
+                       .cap.ldpc_dec = {
+                       .capability_flags =
+                               RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK |
+                               RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP |
+                               RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
+                               RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
+#ifdef ACC100_EXT_MEM
+                               RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
+                               RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
+#endif
+                               RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE |
+                               RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS |
+                               RTE_BBDEV_LDPC_DECODE_BYPASS |
+                               RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
+                               RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
+                               RTE_BBDEV_LDPC_LLR_COMPRESSION,
+                       .llr_size = 8,
+                       .llr_decimals = 1,
+                       .num_buffers_src =
+                                       RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+                       .num_buffers_hard_out =
+                                       RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+                       .num_buffers_soft_out = 0,
+                       }
+               },
                 RTE_BBDEV_END_OF_CAPABILITIES_LIST()
         };
 
@@ -669,9 +711,14 @@
         dev_info->cpu_flag_reqs = NULL;
         dev_info->min_alignment = 64;
         dev_info->capabilities = bbdev_capabilities;
+#ifdef ACC100_EXT_MEM
         dev_info->harq_buffer_size = d->ddr_size;
+#else
+       dev_info->harq_buffer_size = 0;
+#endif
 }
 
+
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
         .setup_queues = acc100_setup_queues,
         .close = acc100_dev_close,
@@ -696,6 +743,1577 @@
         {.device_id = 0},
 };
 
+/* Read flag value 0/1 from bitmap */
+static inline bool
+check_bit(uint32_t bitmap, uint32_t bitmask)
+{
+       return bitmap & bitmask;
+}
+
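+/* Append len bytes to the tail segment m, accounting for them in the head
+ * mbuf's pkt_len; returns a pointer to the appended region, or NULL when
+ * the tail segment lacks sufficient tailroom.
+ */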
+static inline char *
+mbuf_append(struct rte_mbuf *m_head, struct rte_mbuf *m, uint16_t len)
+{
+       if (unlikely(len > rte_pktmbuf_tailroom(m)))
+               return NULL;
+
+       char *tail = (char *)m->buf_addr + m->data_off + m->data_len;
+       m->data_len = (uint16_t)(m->data_len + len);
+       m_head->pkt_len  = (m_head->pkt_len + len);
+       return tail;
+}
+
+/* Compute value of k0.
+ * Based on 3GPP 38.212 Table 5.4.2.1-2
+ * Starting position of different redundancy versions, k0
+ */
+static inline uint16_t
+get_k0(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv_index)
+{
+       if (rv_index == 0)
+               return 0;
+       uint16_t n = (bg == 1 ? N_ZC_1 : N_ZC_2) * z_c;
+       if (n_cb == n) {
+               if (rv_index == 1)
+                       return (bg == 1 ? K0_1_1 : K0_1_2) * z_c;
+               else if (rv_index == 2)
+                       return (bg == 1 ? K0_2_1 : K0_2_2) * z_c;
+               else
+                       return (bg == 1 ? K0_3_1 : K0_3_2) * z_c;
+       }
+       /* LBRM case - includes a division by N */
+       if (rv_index == 1)
+               return (((bg == 1 ? K0_1_1 : K0_1_2) * n_cb)
+                               / n) * z_c;
+       else if (rv_index == 2)
+               return (((bg == 1 ? K0_2_1 : K0_2_2) * n_cb)
+                               / n) * z_c;
+       else
+               return (((bg == 1 ? K0_3_1 : K0_3_2) * n_cb)
+                               / n) * z_c;
+}
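+/*
+ * Worked example (illustrative, assuming N_ZC_1 = 66 and that the K0_*
+ * constants hold the numerators of 3GPP 38.212 Table 5.4.2.1-2, e.g.
+ * K0_2_1 = 33 for BG1 at rv 2): for bg = 1, z_c = 128 and
+ * n_cb = n = 66 * 128 = 8448, rv_index = 2 gives
+ * k0 = K0_2_1 * z_c = 33 * 128 = 4224. In the LBRM case (n_cb < n) the
+ * same numerator is first scaled by n_cb / n.
+ */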
+
+/* Fill in a frame control word for LDPC encoding. */
+static inline void
+acc100_fcw_le_fill(const struct rte_bbdev_enc_op *op,
+               struct acc100_fcw_le *fcw, int num_cb)
+{
+       fcw->qm = op->ldpc_enc.q_m;
+       fcw->nfiller = op->ldpc_enc.n_filler;
+       fcw->BG = (op->ldpc_enc.basegraph - 1);
+       fcw->Zc = op->ldpc_enc.z_c;
+       fcw->ncb = op->ldpc_enc.n_cb;
+       fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_enc.basegraph,
+                       op->ldpc_enc.rv_index);
+       fcw->rm_e = op->ldpc_enc.cb_params.e;
+       fcw->crc_select = check_bit(op->ldpc_enc.op_flags,
+                       RTE_BBDEV_LDPC_CRC_24B_ATTACH);
+       fcw->bypass_intlv = check_bit(op->ldpc_enc.op_flags,
+                       RTE_BBDEV_LDPC_INTERLEAVER_BYPASS);
+       fcw->mcb_count = num_cb;
+}
+
+/* Fill in a frame control word for LDPC decoding. */
+static inline void
+acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
+               union acc100_harq_layout_data *harq_layout)
+{
+       uint16_t harq_out_length, harq_in_length, ncb_p, k0_p, parity_offset;
+       uint16_t harq_index;
+       uint32_t l;
+       bool harq_prun = false;
+
+       fcw->qm = op->ldpc_dec.q_m;
+       fcw->nfiller = op->ldpc_dec.n_filler;
+       fcw->BG = (op->ldpc_dec.basegraph - 1);
+       fcw->Zc = op->ldpc_dec.z_c;
+       fcw->ncb = op->ldpc_dec.n_cb;
+       fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_dec.basegraph,
+                       op->ldpc_dec.rv_index);
+       if (op->ldpc_dec.code_block_mode == 1)
+               fcw->rm_e = op->ldpc_dec.cb_params.e;
+       else
+               fcw->rm_e = (op->ldpc_dec.tb_params.r <
+                               op->ldpc_dec.tb_params.cab) ?
+                                               op->ldpc_dec.tb_params.ea :
+                                               op->ldpc_dec.tb_params.eb;
+
+       fcw->hcin_en = check_bit(op->ldpc_dec.op_flags,
+                       RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE);
+       fcw->hcout_en = check_bit(op->ldpc_dec.op_flags,
+                       RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE);
+       fcw->crc_select = check_bit(op->ldpc_dec.op_flags,
+                       RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK);
+       fcw->bypass_dec = check_bit(op->ldpc_dec.op_flags,
+                       RTE_BBDEV_LDPC_DECODE_BYPASS);
+       fcw->bypass_intlv = check_bit(op->ldpc_dec.op_flags,
+                       RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS);
+       if (op->ldpc_dec.q_m == 1) {
+               fcw->bypass_intlv = 1;
+               fcw->qm = 2;
+       }
+       fcw->hcin_decomp_mode = check_bit(op->ldpc_dec.op_flags,
+                       RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+       fcw->hcout_comp_mode = check_bit(op->ldpc_dec.op_flags,
+                       RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+       fcw->llr_pack_mode = check_bit(op->ldpc_dec.op_flags,
+                       RTE_BBDEV_LDPC_LLR_COMPRESSION);
+       harq_index = op->ldpc_dec.harq_combined_output.offset /
+                       ACC100_HARQ_OFFSET;
+#ifdef ACC100_EXT_MEM
+       /* Limit cases when HARQ pruning is valid */
+       harq_prun = ((op->ldpc_dec.harq_combined_output.offset %
+                       ACC100_HARQ_OFFSET) == 0) &&
+                       (op->ldpc_dec.harq_combined_output.offset <= UINT16_MAX
+                       * ACC100_HARQ_OFFSET);
+#endif
+       if (fcw->hcin_en > 0) {
+               harq_in_length = op->ldpc_dec.harq_combined_input.length;
+               if (fcw->hcin_decomp_mode > 0)
+                       harq_in_length = harq_in_length * 8 / 6;
+               harq_in_length = RTE_ALIGN(harq_in_length, 64);
+               if ((harq_layout[harq_index].offset > 0) && harq_prun) {
+                       rte_bbdev_log_debug("HARQ IN offset unexpected for now\n");
+                       fcw->hcin_size0 = harq_layout[harq_index].size0;
+                       fcw->hcin_offset = harq_layout[harq_index].offset;
+                       fcw->hcin_size1 = harq_in_length -
+                                       harq_layout[harq_index].offset;
+               } else {
+                       fcw->hcin_size0 = harq_in_length;
+                       fcw->hcin_offset = 0;
+                       fcw->hcin_size1 = 0;
+               }
+       } else {
+               fcw->hcin_size0 = 0;
+               fcw->hcin_offset = 0;
+               fcw->hcin_size1 = 0;
+       }
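+       /*
+        * Sizing example for the block above (illustrative): a 6-bit
+        * compressed HARQ input of 768 bytes expands to 768 * 8 / 6 = 1024
+        * bytes; 1024 is already 64-byte aligned, so hcin_size0 = 1024 when
+        * no previously pruned layout is reused.
+        */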
+
+       fcw->itmax = op->ldpc_dec.iter_max;
+       fcw->itstop = check_bit(op->ldpc_dec.op_flags,
+                       RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE);
+       fcw->synd_precoder = fcw->itstop;
+       /*
+        * These are all implicitly set
+        * fcw->synd_post = 0;
+        * fcw->so_en = 0;
+        * fcw->so_bypass_rm = 0;
+        * fcw->so_bypass_intlv = 0;
+        * fcw->dec_convllr = 0;
+        * fcw->hcout_convllr = 0;
+        * fcw->hcout_size1 = 0;
+        * fcw->so_it = 0;
+        * fcw->hcout_offset = 0;
+        * fcw->negstop_th = 0;
+        * fcw->negstop_it = 0;
+        * fcw->negstop_en = 0;
+        * fcw->gain_i = 1;
+        * fcw->gain_h = 1;
+        */
+       if (fcw->hcout_en > 0) {
+               parity_offset = (op->ldpc_dec.basegraph == 1 ? 20 : 8)
+                       * op->ldpc_dec.z_c - op->ldpc_dec.n_filler;
+               k0_p = (fcw->k0 > parity_offset) ?
+                               fcw->k0 - op->ldpc_dec.n_filler : fcw->k0;
+               ncb_p = fcw->ncb - op->ldpc_dec.n_filler;
+               l = k0_p + fcw->rm_e;
+               harq_out_length = (uint16_t) fcw->hcin_size0;
+               harq_out_length = RTE_MIN(RTE_MAX(harq_out_length, l), ncb_p);
+               harq_out_length = (harq_out_length + 0x3F) & 0xFFC0;
+               if ((k0_p > fcw->hcin_size0 + ACC100_HARQ_OFFSET_THRESHOLD) &&
+                               harq_prun) {
+                       fcw->hcout_size0 = (uint16_t) fcw->hcin_size0;
+                       fcw->hcout_offset = k0_p & 0xFFC0;
+                       fcw->hcout_size1 = harq_out_length - fcw->hcout_offset;
+               } else {
+                       fcw->hcout_size0 = harq_out_length;
+                       fcw->hcout_size1 = 0;
+                       fcw->hcout_offset = 0;
+               }
+               harq_layout[harq_index].offset = fcw->hcout_offset;
+               harq_layout[harq_index].size0 = fcw->hcout_size0;
+       } else {
+               fcw->hcout_size0 = 0;
+               fcw->hcout_size1 = 0;
+               fcw->hcout_offset = 0;
+       }
+}
+
+/**
+ * Fills descriptor with data pointers of one block type.
+ *
+ * @param desc
+ *   Pointer to DMA descriptor.
+ * @param input
+ *   Pointer to pointer to input data which will be encoded. It can be changed
+ *   and points to next segment in scatter-gather case.
+ * @param offset
+ *   Input offset in rte_mbuf structure. It is used for calculating the point
+ *   where data is starting.
+ * @param cb_len
+ *   Length of currently processed Code Block
+ * @param seg_total_left
+ *   It indicates how many bytes still left in segment (mbuf) for further
+ *   processing.
+ * @param op_flags
+ *   Store information about device capabilities
+ * @param next_triplet
+ *   Index for ACC100 DMA Descriptor triplet
+ *
+ * @return
+ *   Returns index of next triplet on success, other value if lengths of
+ *   pkt and processed cb do not match.
+ *
+ */
+static inline int
+acc100_dma_fill_blk_type_in(struct acc100_dma_req_desc *desc,
+               struct rte_mbuf **input, uint32_t *offset, uint32_t cb_len,
+               uint32_t *seg_total_left, int next_triplet)
+{
+       uint32_t part_len;
+       struct rte_mbuf *m = *input;
+
+       part_len = (*seg_total_left < cb_len) ? *seg_total_left : cb_len;
+       cb_len -= part_len;
+       *seg_total_left -= part_len;
+
+       desc->data_ptrs[next_triplet].address =
+                       rte_pktmbuf_iova_offset(m, *offset);
+       desc->data_ptrs[next_triplet].blen = part_len;
+       desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
+       desc->data_ptrs[next_triplet].last = 0;
+       desc->data_ptrs[next_triplet].dma_ext = 0;
+       *offset += part_len;
+       next_triplet++;
+
+       while (cb_len > 0) {
+               if (next_triplet < ACC100_DMA_MAX_NUM_POINTERS &&
+                               m->next != NULL) {
+
+                       m = m->next;
+                       *seg_total_left = rte_pktmbuf_data_len(m);
+                       part_len = (*seg_total_left < cb_len) ?
+                                       *seg_total_left :
+                                       cb_len;
+                       desc->data_ptrs[next_triplet].address =
+                                       rte_pktmbuf_mtophys(m);
+                       desc->data_ptrs[next_triplet].blen = part_len;
+                       desc->data_ptrs[next_triplet].blkid =
+                                       ACC100_DMA_BLKID_IN;
+                       desc->data_ptrs[next_triplet].last = 0;
+                       desc->data_ptrs[next_triplet].dma_ext = 0;
+                       cb_len -= part_len;
+                       *seg_total_left -= part_len;
+                       /* Initializing offset for next segment (mbuf) */
+                       *offset = part_len;
+                       next_triplet++;
+               } else {
+                       rte_bbdev_log(ERR,
+                               "Some data still left for processing: "
+                               "data_left: %u, next_triplet: %u, next_mbuf: %p",
+                               cb_len, next_triplet, m->next);
+                       return -EINVAL;
+               }
+       }
+       /* Store the new mbuf as it may have changed in the scatter-gather case */
+       *input = m;
+
+       return next_triplet;
+}
+
+/* Fills descriptor with data pointers of one block type.
+ * Returns index of next triplet on success, other value if lengths of
+ * output data and processed mbuf do not match.
+ */
+static inline int
+acc100_dma_fill_blk_type_out(struct acc100_dma_req_desc *desc,
+               struct rte_mbuf *output, uint32_t out_offset,
+               uint32_t output_len, int next_triplet, int blk_id)
+{
+       desc->data_ptrs[next_triplet].address =
+                       rte_pktmbuf_iova_offset(output, out_offset);
+       desc->data_ptrs[next_triplet].blen = output_len;
+       desc->data_ptrs[next_triplet].blkid = blk_id;
+       desc->data_ptrs[next_triplet].last = 0;
+       desc->data_ptrs[next_triplet].dma_ext = 0;
+       next_triplet++;
+
+       return next_triplet;
+}
+
+static inline int
+acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
+               struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
+               struct rte_mbuf *output, uint32_t *in_offset,
+               uint32_t *out_offset, uint32_t *out_length,
+               uint32_t *mbuf_total_left, uint32_t *seg_total_left)
+{
+       int next_triplet = 1; /* FCW already done */
+       uint16_t K, in_length_in_bits, in_length_in_bytes;
+       struct rte_bbdev_op_ldpc_enc *enc = &op->ldpc_enc;
+
+       desc->word0 = ACC100_DMA_DESC_TYPE;
+       desc->word1 = 0; /**< Timestamp could be disabled */
+       desc->word2 = 0;
+       desc->word3 = 0;
+       desc->numCBs = 1;
+
+       K = (enc->basegraph == 1 ? 22 : 10) * enc->z_c;
+       in_length_in_bits = K - enc->n_filler;
+       if ((enc->op_flags & RTE_BBDEV_LDPC_CRC_24A_ATTACH) ||
+                       (enc->op_flags & RTE_BBDEV_LDPC_CRC_24B_ATTACH))
+               in_length_in_bits -= 24;
+       in_length_in_bytes = in_length_in_bits >> 3;
+
+       if (unlikely((*mbuf_total_left == 0) ||
+                       (*mbuf_total_left < in_length_in_bytes))) {
+               rte_bbdev_log(ERR,
+                               "Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+                               *mbuf_total_left, in_length_in_bytes);
+               return -1;
+       }
+
+       next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
+                       in_length_in_bytes,
+                       seg_total_left, next_triplet);
+       if (unlikely(next_triplet < 0)) {
+               rte_bbdev_log(ERR,
+                               "Mismatch between data to process and mbuf data length in bbdev_op: %p",
+                               op);
+               return -1;
+       }
+       desc->data_ptrs[next_triplet - 1].last = 1;
+       desc->m2dlen = next_triplet;
+       *mbuf_total_left -= in_length_in_bytes;
+
+       /* Set output length */
+       /* Integer round up division by 8 */
+       *out_length = (enc->cb_params.e + 7) >> 3;
+
+       next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
+                       *out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
+       if (unlikely(next_triplet < 0)) {
+               rte_bbdev_log(ERR,
+                               "Mismatch between data to process and mbuf data length in bbdev_op: %p",
+                               op);
+               return -1;
+       }
+       op->ldpc_enc.output.length += *out_length;
+       *out_offset += *out_length;
+       desc->data_ptrs[next_triplet - 1].last = 1;
+       desc->data_ptrs[next_triplet - 1].dma_ext = 0;
+       desc->d2mlen = next_triplet - desc->m2dlen;
+
+       desc->op_addr = op;
+
+       return 0;
+}
+
+static inline int
+acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
+               struct acc100_dma_req_desc *desc,
+               struct rte_mbuf **input, struct rte_mbuf *h_output,
+               uint32_t *in_offset, uint32_t *h_out_offset,
+               uint32_t *h_out_length, uint32_t *mbuf_total_left,
+               uint32_t *seg_total_left,
+               struct acc100_fcw_ld *fcw)
+{
+       struct rte_bbdev_op_ldpc_dec *dec = &op->ldpc_dec;
+       int next_triplet = 1; /* FCW already done */
+       uint32_t input_length;
+       uint16_t output_length, crc24_overlap = 0;
+       uint16_t sys_cols, K, h_p_size, h_np_size;
+       bool h_comp = check_bit(dec->op_flags,
+                       RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+
+       desc->word0 = ACC100_DMA_DESC_TYPE;
+       desc->word1 = 0; /**< Timestamp could be disabled */
+       desc->word2 = 0;
+       desc->word3 = 0;
+       desc->numCBs = 1;
+
+       if (check_bit(op->ldpc_dec.op_flags,
+                       RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP))
+               crc24_overlap = 24;
+
+       /* Compute some LDPC BG lengths */
+       input_length = dec->cb_params.e;
+       if (check_bit(op->ldpc_dec.op_flags,
+                       RTE_BBDEV_LDPC_LLR_COMPRESSION))
+               input_length = (input_length * 3 + 3) / 4;
+       sys_cols = (dec->basegraph == 1) ? 22 : 10;
+       K = sys_cols * dec->z_c;
+       output_length = K - dec->n_filler - crc24_overlap;
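+       /*
+        * Packing example (illustrative): with LLR compression enabled,
+        * e = 1000 8-bit LLRs shrink to (1000 * 3 + 3) / 4 = 750 bytes,
+        * consistent with packing each LLR into 6 bits.
+        */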
+
+       if (unlikely((*mbuf_total_left == 0) ||
+                       (*mbuf_total_left < input_length))) {
+               rte_bbdev_log(ERR,
+                               "Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+                               *mbuf_total_left, input_length);
+               return -1;
+       }
+
+       next_triplet = acc100_dma_fill_blk_type_in(desc, input,
+                       in_offset, input_length,
+                       seg_total_left, next_triplet);
+
+       if (unlikely(next_triplet < 0)) {
+               rte_bbdev_log(ERR,
+                               "Mismatch between data to process and mbuf data length in bbdev_op: %p",
+                               op);
+               return -1;
+       }
+
+       if (check_bit(op->ldpc_dec.op_flags,
+                               RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
+               h_p_size = fcw->hcin_size0 + fcw->hcin_size1;
+               if (h_comp)
+                       h_p_size = (h_p_size * 3 + 3) / 4;
+               desc->data_ptrs[next_triplet].address =
+                               dec->harq_combined_input.offset;
+               desc->data_ptrs[next_triplet].blen = h_p_size;
+               desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN_HARQ;
+               desc->data_ptrs[next_triplet].dma_ext = 1;
+#ifndef ACC100_EXT_MEM
+               acc100_dma_fill_blk_type_out(
+                               desc,
+                               op->ldpc_dec.harq_combined_input.data,
+                               op->ldpc_dec.harq_combined_input.offset,
+                               h_p_size,
+                               next_triplet,
+                               ACC100_DMA_BLKID_IN_HARQ);
+#endif
+               next_triplet++;
+       }
+
+       desc->data_ptrs[next_triplet - 1].last = 1;
+       desc->m2dlen = next_triplet;
+       *mbuf_total_left -= input_length;
+
+       next_triplet = acc100_dma_fill_blk_type_out(desc, h_output,
+                       *h_out_offset, output_length >> 3, next_triplet,
+                       ACC100_DMA_BLKID_OUT_HARD);
+       if (unlikely(next_triplet < 0)) {
+               rte_bbdev_log(ERR,
+                               "Mismatch between data to process and mbuf data length in bbdev_op: %p",
+                               op);
+               return -1;
+       }
+
+       if (check_bit(op->ldpc_dec.op_flags,
+                               RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
+               /* Pruned size of the HARQ */
+               h_p_size = fcw->hcout_size0 + fcw->hcout_size1;
+               /* Non-Pruned size of the HARQ */
+               h_np_size = fcw->hcout_offset > 0 ?
+                               fcw->hcout_offset + fcw->hcout_size1 :
+                               h_p_size;
+               if (h_comp) {
+                       h_np_size = (h_np_size * 3 + 3) / 4;
+                       h_p_size = (h_p_size * 3 + 3) / 4;
+               }
+               dec->harq_combined_output.length = h_np_size;
+               desc->data_ptrs[next_triplet].address =
+                               dec->harq_combined_output.offset;
+               desc->data_ptrs[next_triplet].blen = h_p_size;
+               desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_OUT_HARQ;
+               desc->data_ptrs[next_triplet].dma_ext = 1;
+#ifndef ACC100_EXT_MEM
+               acc100_dma_fill_blk_type_out(
+                               desc,
+                               dec->harq_combined_output.data,
+                               dec->harq_combined_output.offset,
+                               h_p_size,
+                               next_triplet,
+                               ACC100_DMA_BLKID_OUT_HARQ);
+#endif
+               next_triplet++;
+       }
+
+       *h_out_length = output_length >> 3;
+       dec->hard_output.length += *h_out_length;
+       *h_out_offset += *h_out_length;
+       desc->data_ptrs[next_triplet - 1].last = 1;
+       desc->d2mlen = next_triplet - desc->m2dlen;
+
+       desc->op_addr = op;
+
+       return 0;
+}
+
+static inline void
+acc100_dma_desc_ld_update(struct rte_bbdev_dec_op *op,
+               struct acc100_dma_req_desc *desc,
+               struct rte_mbuf *input, struct rte_mbuf *h_output,
+               uint32_t *in_offset, uint32_t *h_out_offset,
+               uint32_t *h_out_length,
+               union acc100_harq_layout_data *harq_layout)
+{
+       int next_triplet = 1; /* FCW already done */
+       desc->data_ptrs[next_triplet].address =
+                       rte_pktmbuf_iova_offset(input, *in_offset);
+       next_triplet++;
+
+       if (check_bit(op->ldpc_dec.op_flags,
+                               RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
+               struct rte_bbdev_op_data hi = op->ldpc_dec.harq_combined_input;
+               desc->data_ptrs[next_triplet].address = hi.offset;
+#ifndef ACC100_EXT_MEM
+               desc->data_ptrs[next_triplet].address =
+                               rte_pktmbuf_iova_offset(hi.data, hi.offset);
+#endif
+               next_triplet++;
+       }
+
+       desc->data_ptrs[next_triplet].address =
+                       rte_pktmbuf_iova_offset(h_output, *h_out_offset);
+       *h_out_length = desc->data_ptrs[next_triplet].blen;
+       next_triplet++;
+
+       if (check_bit(op->ldpc_dec.op_flags,
+                               RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
+               desc->data_ptrs[next_triplet].address =
+                               op->ldpc_dec.harq_combined_output.offset;
+               /* Adjust based on previous operation */
+               struct rte_bbdev_dec_op *prev_op = desc->op_addr;
+               op->ldpc_dec.harq_combined_output.length =
+                               prev_op->ldpc_dec.harq_combined_output.length;
+               int16_t hq_idx = op->ldpc_dec.harq_combined_output.offset /
+                               ACC100_HARQ_OFFSET;
+               int16_t prev_hq_idx =
+                               prev_op->ldpc_dec.harq_combined_output.offset
+                               / ACC100_HARQ_OFFSET;
+               harq_layout[hq_idx].val = harq_layout[prev_hq_idx].val;
+#ifndef ACC100_EXT_MEM
+               struct rte_bbdev_op_data ho =
+                               op->ldpc_dec.harq_combined_output;
+               desc->data_ptrs[next_triplet].address =
+                               rte_pktmbuf_iova_offset(ho.data, ho.offset);
+#endif
+               next_triplet++;
+       }
+
+       op->ldpc_dec.hard_output.length += *h_out_length;
+       desc->op_addr = op;
+}
+
+
+/* Enqueue a number of operations to HW and update software rings */
+static inline void
+acc100_dma_enqueue(struct acc100_queue *q, uint16_t n,
+               struct rte_bbdev_stats *queue_stats)
+{
+       union acc100_enqueue_reg_fmt enq_req;
+#ifdef RTE_BBDEV_OFFLOAD_COST
+       uint64_t start_time = 0;
+       queue_stats->acc_offload_cycles = 0;
+#else
+       RTE_SET_USED(queue_stats);
+#endif
+
+       enq_req.val = 0;
+       /* Setting offset, 100b for 256 DMA Desc */
+       enq_req.addr_offset = ACC100_DESC_OFFSET;
+
+       /* Split ops into batches */
+       do {
+               union acc100_dma_desc *desc;
+               uint16_t enq_batch_size;
+               uint64_t offset;
+               rte_iova_t req_elem_addr;
+
+               enq_batch_size = RTE_MIN(n, MAX_ENQ_BATCH_SIZE);
+
+               /* Set flag on last descriptor in a batch */
+               desc = q->ring_addr + ((q->sw_ring_head + enq_batch_size - 1) &
+                               q->sw_ring_wrap_mask);
+               desc->req.last_desc_in_batch = 1;
+
+               /* Calculate the 1st descriptor's address */
+               offset = ((q->sw_ring_head & q->sw_ring_wrap_mask) *
+                               sizeof(union acc100_dma_desc));
+               req_elem_addr = q->ring_addr_phys + offset;
+
+               /* Fill enqueue struct */
+               enq_req.num_elem = enq_batch_size;
+               /* low 6 bits are not needed */
+               enq_req.req_elem_addr = (uint32_t)(req_elem_addr >> 6);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+               rte_memdump(stderr, "Req sdone", desc, sizeof(*desc));
+#endif
+               rte_bbdev_log_debug(
+                               "Enqueue %u reqs (phys %#"PRIx64") to reg %p",
+                               enq_batch_size,
+                               req_elem_addr,
+                               (void *)q->mmio_reg_enqueue);
+
+               rte_wmb();
+
+#ifdef RTE_BBDEV_OFFLOAD_COST
+               /* Start time measurement for enqueue function offload. */
+               start_time = rte_rdtsc_precise();
+#endif
+               rte_bbdev_log(DEBUG, "Debug: MMIO Enqueue");
+               mmio_write(q->mmio_reg_enqueue, enq_req.val);
+
+#ifdef RTE_BBDEV_OFFLOAD_COST
+               queue_stats->acc_offload_cycles +=
+                               rte_rdtsc_precise() - start_time;
+#endif
+
+               q->aq_enqueued++;
+               q->sw_ring_head += enq_batch_size;
+               n -= enq_batch_size;
+
+       } while (n);
+}
+
+/* Enqueue a group of encode operations for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
+               uint16_t total_enqueued_cbs, int16_t num)
+{
+       union acc100_dma_desc *desc = NULL;
+       uint32_t out_length;
+       struct rte_mbuf *output_head, *output;
+       int i, next_triplet;
+       uint16_t  in_length_in_bytes;
+       struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
+
+       uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+                       & q->sw_ring_wrap_mask);
+       desc = q->ring_addr + desc_idx;
+       acc100_fcw_le_fill(ops[0], &desc->req.fcw_le, num);
+
+       /* This could be done at polling */
+       desc->req.word0 = ACC100_DMA_DESC_TYPE;
+       desc->req.word1 = 0; /**< Timestamp could be disabled */
+       desc->req.word2 = 0;
+       desc->req.word3 = 0;
+       desc->req.numCBs = num;
+
+       in_length_in_bytes = ops[0]->ldpc_enc.input.data->data_len;
+       out_length = (enc->cb_params.e + 7) >> 3;
+       desc->req.m2dlen = 1 + num;
+       desc->req.d2mlen = num;
+       next_triplet = 1;
+
+       for (i = 0; i < num; i++) {
+               desc->req.data_ptrs[next_triplet].address =
+                       rte_pktmbuf_iova_offset(ops[i]->ldpc_enc.input.data, 0);
+               desc->req.data_ptrs[next_triplet].blen = in_length_in_bytes;
+               next_triplet++;
+               desc->req.data_ptrs[next_triplet].address =
+                               rte_pktmbuf_iova_offset(
+                               ops[i]->ldpc_enc.output.data, 0);
+               desc->req.data_ptrs[next_triplet].blen = out_length;
+               next_triplet++;
+               ops[i]->ldpc_enc.output.length = out_length;
+               output_head = output = ops[i]->ldpc_enc.output.data;
+               mbuf_append(output_head, output, out_length);
+               output->data_len = out_length;
+       }
+
+       desc->req.op_addr = ops[0];
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+       rte_memdump(stderr, "FCW", &desc->req.fcw_le,
+                       sizeof(desc->req.fcw_le) - 8);
+       rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+       /* Multiple CBs (ops) were successfully prepared to enqueue */
+       return num;
+}
+
+/* Enqueue one encode operation for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+               uint16_t total_enqueued_cbs)
+{
+       union acc100_dma_desc *desc = NULL;
+       int ret;
+       uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+               seg_total_left;
+       struct rte_mbuf *input, *output_head, *output;
+
+       uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+                       & q->sw_ring_wrap_mask);
+       desc = q->ring_addr + desc_idx;
+       acc100_fcw_le_fill(op, &desc->req.fcw_le, 1);
+
+       input = op->ldpc_enc.input.data;
+       output_head = output = op->ldpc_enc.output.data;
+       in_offset = op->ldpc_enc.input.offset;
+       out_offset = op->ldpc_enc.output.offset;
+       out_length = 0;
+       mbuf_total_left = op->ldpc_enc.input.length;
+       seg_total_left = rte_pktmbuf_data_len(op->ldpc_enc.input.data)
+                       - in_offset;
+
+       ret = acc100_dma_desc_le_fill(op, &desc->req, &input, output,
+                       &in_offset, &out_offset, &out_length, &mbuf_total_left,
+                       &seg_total_left);
+
+       if (unlikely(ret < 0))
+               return ret;
+
+       mbuf_append(output_head, output, out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+       rte_memdump(stderr, "FCW", &desc->req.fcw_le,
+                       sizeof(desc->req.fcw_le) - 8);
+       rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+
+       /* Check if any data left after processing one CB */
+       if (mbuf_total_left != 0) {
+               rte_bbdev_log(ERR,
+                               "Some date still left after processing one CB: mbuf_total_left = %u",
+                               mbuf_total_left);
+               return -EINVAL;
+       }
+#endif
+       /* One CB (one op) was successfully prepared to enqueue */
+       return 1;
+}
+
+/* Enqueue one decode operation for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+               uint16_t total_enqueued_cbs, bool same_op)
+{
+       int ret;
+
+       union acc100_dma_desc *desc;
+       uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+                       & q->sw_ring_wrap_mask);
+       desc = q->ring_addr + desc_idx;
+       struct rte_mbuf *input, *h_output_head, *h_output;
+       uint32_t in_offset, h_out_offset, h_out_length, mbuf_total_left;
+       input = op->ldpc_dec.input.data;
+       h_output_head = h_output = op->ldpc_dec.hard_output.data;
+       in_offset = op->ldpc_dec.input.offset;
+       h_out_offset = op->ldpc_dec.hard_output.offset;
+       mbuf_total_left = op->ldpc_dec.input.length;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+       if (unlikely(input == NULL)) {
+               rte_bbdev_log(ERR, "Invalid mbuf pointer");
+               return -EFAULT;
+       }
+#endif
+       union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+
+       if (same_op) {
+               union acc100_dma_desc *prev_desc;
+               desc_idx = ((q->sw_ring_head + total_enqueued_cbs - 1)
+                               & q->sw_ring_wrap_mask);
+               prev_desc = q->ring_addr + desc_idx;
+               uint8_t *prev_ptr = (uint8_t *) prev_desc;
+               uint8_t *new_ptr = (uint8_t *) desc;
+               /* Copy first 4 words and BDESCs */
+               rte_memcpy(new_ptr, prev_ptr, 16);
+               rte_memcpy(new_ptr + 36, prev_ptr + 36, 40);
+               desc->req.op_addr = prev_desc->req.op_addr;
+               /* Copy FCW */
+               rte_memcpy(new_ptr + ACC100_DESC_FCW_OFFSET,
+                               prev_ptr + ACC100_DESC_FCW_OFFSET,
+                               ACC100_FCW_LD_BLEN);
+               acc100_dma_desc_ld_update(op, &desc->req, input, h_output,
+                               &in_offset, &h_out_offset,
+                               &h_out_length, harq_layout);
+       } else {
+               struct acc100_fcw_ld *fcw;
+               uint32_t seg_total_left;
+               fcw = &desc->req.fcw_ld;
+               acc100_fcw_ld_fill(op, fcw, harq_layout);
+
+               /* Special handling when the mbuf is overcommitted (rm_e >= MAX_E_MBUF) */
+               if (fcw->rm_e < MAX_E_MBUF)
+                       seg_total_left = rte_pktmbuf_data_len(input)
+                                       - in_offset;
+               else
+                       seg_total_left = fcw->rm_e;
+
+               ret = acc100_dma_desc_ld_fill(op, &desc->req, &input, h_output,
+                               &in_offset, &h_out_offset,
+                               &h_out_length, &mbuf_total_left,
+                               &seg_total_left, fcw);
+               if (unlikely(ret < 0))
+                       return ret;
+       }
+
+       /* Hard output */
+       mbuf_append(h_output_head, h_output, h_out_length);
+#ifndef ACC100_EXT_MEM
+       if (op->ldpc_dec.harq_combined_output.length > 0) {
+               /* Push the HARQ output into host memory */
+               struct rte_mbuf *hq_output_head, *hq_output;
+               hq_output_head = op->ldpc_dec.harq_combined_output.data;
+               hq_output = op->ldpc_dec.harq_combined_output.data;
+               mbuf_append(hq_output_head, hq_output,
+                               op->ldpc_dec.harq_combined_output.length);
+       }
+#endif
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+       rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
+                       sizeof(desc->req.fcw_ld) - 8);
+       rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+       /* One CB (one op) was successfully prepared to enqueue */
+       return 1;
+}
+
+
+/* Enqueue one decode operation for ACC100 device in TB mode */
+static inline int
+enqueue_ldpc_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+               uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
+{
+       union acc100_dma_desc *desc = NULL;
+       int ret;
+       uint8_t r, c;
+       uint32_t in_offset, h_out_offset,
+               h_out_length, mbuf_total_left, seg_total_left;
+       struct rte_mbuf *input, *h_output_head, *h_output;
+       uint16_t current_enqueued_cbs = 0;
+
+       uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+                       & q->sw_ring_wrap_mask);
+       desc = q->ring_addr + desc_idx;
+       uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+       union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+       acc100_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout);
+
+       input = op->ldpc_dec.input.data;
+       h_output_head = h_output = op->ldpc_dec.hard_output.data;
+       in_offset = op->ldpc_dec.input.offset;
+       h_out_offset = op->ldpc_dec.hard_output.offset;
+       h_out_length = 0;
+       mbuf_total_left = op->ldpc_dec.input.length;
+       c = op->ldpc_dec.tb_params.c;
+       r = op->ldpc_dec.tb_params.r;
+
+       while (mbuf_total_left > 0 && r < c) {
+
+               seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+
+               /* Set up DMA descriptor */
+               desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+                               & q->sw_ring_wrap_mask);
+               desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+               desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
+               ret = acc100_dma_desc_ld_fill(op, &desc->req, &input,
+                               h_output, &in_offset, &h_out_offset,
+                               &h_out_length,
+                               &mbuf_total_left, &seg_total_left,
+                               &desc->req.fcw_ld);
+
+               if (unlikely(ret < 0))
+                       return ret;
+
+               /* Hard output */
+               mbuf_append(h_output_head, h_output, h_out_length);
+
+               /* Set total number of CBs in TB */
+               desc->req.cbs_in_tb = cbs_in_tb;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+               rte_memdump(stderr, "FCW", &desc->req.fcw_td,
+                               sizeof(desc->req.fcw_td) - 8);
+               rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+               if (seg_total_left == 0) {
+                       /* Go to the next mbuf */
+                       input = input->next;
+                       in_offset = 0;
+                       h_output = h_output->next;
+                       h_out_offset = 0;
+               }
+               total_enqueued_cbs++;
+               current_enqueued_cbs++;
+               r++;
+       }
+
+       if (unlikely(desc == NULL))
+               return current_enqueued_cbs;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+       /* Check if any CBs left for processing */
+       if (mbuf_total_left != 0) {
+               rte_bbdev_log(ERR,
+                               "Some date still left for processing: mbuf_total_left = %u",
+                               mbuf_total_left);
+               return -EINVAL;
+       }
+#endif
+       /* Set SDone on last CB descriptor for TB mode */
+       desc->req.sdone_enable = 1;
+       desc->req.irq_enable = q->irq_enable;
+
+       return current_enqueued_cbs;
+}
+
+
+/* Calculates number of CBs in processed turbo encoder TB based on 'r' and input
+ * length.
+ */
+static inline uint8_t
+get_num_cbs_in_tb_enc(struct rte_bbdev_op_turbo_enc *turbo_enc)
+{
+       uint8_t c, c_neg, r, crc24_bits = 0;
+       uint16_t k, k_neg, k_pos;
+       uint8_t cbs_in_tb = 0;
+       int32_t length;
+
+       length = turbo_enc->input.length;
+       r = turbo_enc->tb_params.r;
+       c = turbo_enc->tb_params.c;
+       c_neg = turbo_enc->tb_params.c_neg;
+       k_neg = turbo_enc->tb_params.k_neg;
+       k_pos = turbo_enc->tb_params.k_pos;
+       crc24_bits = 0;
+       if (check_bit(turbo_enc->op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH))
+               crc24_bits = 24;
+       while (length > 0 && r < c) {
+               k = (r < c_neg) ? k_neg : k_pos;
+               length -= (k - crc24_bits) >> 3;
+               r++;
+               cbs_in_tb++;
+       }
+
+       return cbs_in_tb;
+}
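+/*
+ * Worked example (illustrative): with r = 0, c = 2, c_neg = 0, k_pos = 6144
+ * and RTE_BBDEV_TURBO_CRC_24B_ATTACH set, each CB consumes (6144 - 24) / 8 =
+ * 765 bytes of input, so an input length of 1530 yields cbs_in_tb = 2.
+ */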
+
+/* Calculates number of CBs in processed turbo decoder TB based on 'r' and input
+ * length.
+ */
+static inline uint16_t
+get_num_cbs_in_tb_dec(struct rte_bbdev_op_turbo_dec *turbo_dec)
+{
+       uint8_t c, c_neg, r = 0;
+       uint16_t kw, k, k_neg, k_pos, cbs_in_tb = 0;
+       int32_t length;
+
+       length = turbo_dec->input.length;
+       r = turbo_dec->tb_params.r;
+       c = turbo_dec->tb_params.c;
+       c_neg = turbo_dec->tb_params.c_neg;
+       k_neg = turbo_dec->tb_params.k_neg;
+       k_pos = turbo_dec->tb_params.k_pos;
+       while (length > 0 && r < c) {
+               k = (r < c_neg) ? k_neg : k_pos;
+               kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
+               length -= kw;
+               r++;
+               cbs_in_tb++;
+       }
+
+       return cbs_in_tb;
+}
+
+/* Calculates number of CBs in processed LDPC decoder TB based on 'r' and input
+ * length.
+ */
+static inline uint16_t
+get_num_cbs_in_tb_ldpc_dec(struct rte_bbdev_op_ldpc_dec *ldpc_dec)
+{
+       uint16_t r, cbs_in_tb = 0;
+       int32_t length = ldpc_dec->input.length;
+       r = ldpc_dec->tb_params.r;
+       while (length > 0 && r < ldpc_dec->tb_params.c) {
+               length -=  (r < ldpc_dec->tb_params.cab) ?
+                               ldpc_dec->tb_params.ea :
+                               ldpc_dec->tb_params.eb;
+               r++;
+               cbs_in_tb++;
+       }
+       return cbs_in_tb;
+}
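+/*
+ * Worked example (illustrative): starting at r = 0 with tb_params c = 3,
+ * cab = 1, ea = 1000 and eb = 900, an input of length 2800 is consumed as
+ * 1000 + 900 + 900, yielding cbs_in_tb = 3.
+ */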
+
+/* Check that we can mux encode operations with a common FCW */
+static inline bool
+check_mux(struct rte_bbdev_enc_op **ops, uint16_t num) {
+       uint16_t i;
+       if (num == 1)
+               return false;
+       for (i = 1; i < num; ++i) {
+               /* Only mux compatible code blocks */
+               if (memcmp((uint8_t *)(&ops[i]->ldpc_enc) + ENC_OFFSET,
+                               (uint8_t *)(&ops[0]->ldpc_enc) + ENC_OFFSET,
+                               CMP_ENC_SIZE) != 0)
+                       return false;
+       }
+       return true;
+}
+
+/* Enqueue encode operations for ACC100 device in CB mode. */
+static inline uint16_t
+acc100_enqueue_ldpc_enc_cb(struct rte_bbdev_queue_data *q_data,
+               struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+       struct acc100_queue *q = q_data->queue_private;
+       int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+       uint16_t i = 0;
+       union acc100_dma_desc *desc;
+       int ret, desc_idx = 0;
+       int16_t enq, left = num;
+
+       while (left > 0) {
+               if (unlikely(avail - 1 < 0))
+                       break;
+               avail--;
+               enq = RTE_MIN(left, MUX_5GDL_DESC);
+               if (check_mux(&ops[i], enq)) {
+                       ret = enqueue_ldpc_enc_n_op_cb(q, &ops[i],
+                                       desc_idx, enq);
+                       if (ret < 0)
+                               break;
+                       i += enq;
+               } else {
+                       ret = enqueue_ldpc_enc_one_op_cb(q, ops[i], desc_idx);
+                       if (ret < 0)
+                               break;
+                       i++;
+               }
+               desc_idx++;
+               left = num - i;
+       }
+
+       if (unlikely(i == 0))
+               return 0; /* Nothing to enqueue */
+
+       /* Set SDone in last CB in enqueued ops for CB mode */
+       desc = q->ring_addr + ((q->sw_ring_head + desc_idx - 1)
+                       & q->sw_ring_wrap_mask);
+       desc->req.sdone_enable = 1;
+       desc->req.irq_enable = q->irq_enable;
+
+       acc100_dma_enqueue(q, desc_idx, &q_data->queue_stats);
+
+       /* Update stats */
+       q_data->queue_stats.enqueued_count += i;
+       q_data->queue_stats.enqueue_err_count += num - i;
+
+       return i;
+}
+
+/* Enqueue encode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
+               struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+       if (unlikely(num == 0))
+               return 0;
+       return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
+}
+
+/* Check whether consecutive LDPC decode ops can be muxed (common FCW) */
+static inline bool
+cmp_ldpc_dec_op(struct rte_bbdev_dec_op **ops) {
+       /* Only mux compatible code blocks */
+       if (memcmp((uint8_t *)(&ops[0]->ldpc_dec) + DEC_OFFSET,
+                       (uint8_t *)(&ops[1]->ldpc_dec) +
+                       DEC_OFFSET, CMP_DEC_SIZE) != 0)
+               return false;
+       return true;
+}
+
+
+/* Enqueue decode operations for ACC100 device in TB mode */
+static uint16_t
+acc100_enqueue_ldpc_dec_tb(struct rte_bbdev_queue_data *q_data,
+               struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+       struct acc100_queue *q = q_data->queue_private;
+       int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+       uint16_t i, enqueued_cbs = 0;
+       uint8_t cbs_in_tb;
+       int ret;
+
+       for (i = 0; i < num; ++i) {
+               cbs_in_tb = get_num_cbs_in_tb_ldpc_dec(&ops[i]->ldpc_dec);
+               /* Check if there is available space for further processing */
+               if (unlikely(avail - cbs_in_tb < 0))
+                       break;
+               avail -= cbs_in_tb;
+
+               ret = enqueue_ldpc_dec_one_op_tb(q, ops[i],
+                               enqueued_cbs, cbs_in_tb);
+               if (ret < 0)
+                       break;
+               enqueued_cbs += ret;
+       }
+
+       acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
+
+       /* Update stats */
+       q_data->queue_stats.enqueued_count += i;
+       q_data->queue_stats.enqueue_err_count += num - i;
+       return i;
+}
+
+/* Enqueue decode operations for ACC100 device in CB mode */
+static uint16_t
+acc100_enqueue_ldpc_dec_cb(struct rte_bbdev_queue_data *q_data,
+               struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+       struct acc100_queue *q = q_data->queue_private;
+       int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+       uint16_t i;
+       union acc100_dma_desc *desc;
+       int ret;
+       bool same_op = false;
+       for (i = 0; i < num; ++i) {
+               /* Check if there is available space for further processing */
+               if (unlikely(avail - 1 < 0))
+                       break;
+               avail -= 1;
+
+               if (i > 0)
+                       same_op = cmp_ldpc_dec_op(&ops[i-1]);
+               rte_bbdev_log(INFO, "Op %d %d %d %d %d %d %d %d %d %d %d %d\n",
+                       i, ops[i]->ldpc_dec.op_flags, ops[i]->ldpc_dec.rv_index,
+                       ops[i]->ldpc_dec.iter_max, ops[i]->ldpc_dec.iter_count,
+                       ops[i]->ldpc_dec.basegraph, ops[i]->ldpc_dec.z_c,
+                       ops[i]->ldpc_dec.n_cb, ops[i]->ldpc_dec.q_m,
+                       ops[i]->ldpc_dec.n_filler, ops[i]->ldpc_dec.cb_params.e,
+                       same_op);
+               ret = enqueue_ldpc_dec_one_op_cb(q, ops[i], i, same_op);
+               if (ret < 0)
+                       break;
+       }
+
+       if (unlikely(i == 0))
+               return 0; /* Nothing to enqueue */
+
+       /* Set SDone in last CB in enqueued ops for CB mode */
+       desc = q->ring_addr + ((q->sw_ring_head + i - 1)
+                       & q->sw_ring_wrap_mask);
+
+       desc->req.sdone_enable = 1;
+       desc->req.irq_enable = q->irq_enable;
+
+       acc100_dma_enqueue(q, i, &q_data->queue_stats);
+
+       /* Update stats */
+       q_data->queue_stats.enqueued_count += i;
+       q_data->queue_stats.enqueue_err_count += num - i;
+       return i;
+}
+
+/* Enqueue decode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
+               struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+       struct acc100_queue *q = q_data->queue_private;
+       int32_t aq_avail = q->aq_depth +
+                       (q->aq_dequeued - q->aq_enqueued) / 128;
+
+       if (unlikely((aq_avail == 0) || (num == 0)))
+               return 0;
+
+       if (ops[0]->ldpc_dec.code_block_mode == 0)
+               return acc100_enqueue_ldpc_dec_tb(q_data, ops, num);
+       else
+               return acc100_enqueue_ldpc_dec_cb(q_data, ops, num);
+}
+
+
+/* Dequeue one encode operation from ACC100 device in CB mode */
+static inline int
+dequeue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
+               uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
+{
+       union acc100_dma_desc *desc, atom_desc;
+       union acc100_dma_rsp_desc rsp;
+       struct rte_bbdev_enc_op *op;
+       int i;
+
+       desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
+                       & q->sw_ring_wrap_mask);
+       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+                       __ATOMIC_RELAXED);
+
+       /* Check fdone bit */
+       if (!(atom_desc.rsp.val & ACC100_FDONE))
+               return -1;
+
+       rsp.val = atom_desc.rsp.val;
+       rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
+
+       /* Dequeue */
+       op = desc->req.op_addr;
+
+       /* Clearing status, it will be set based on response */
+       op->status = 0;
+
+       op->status |= ((rsp.input_err)
+                       ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+       op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+       op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+       if (desc->req.last_desc_in_batch) {
+               (*aq_dequeued)++;
+               desc->req.last_desc_in_batch = 0;
+       }
+       desc->rsp.val = ACC100_DMA_DESC_TYPE;
+       desc->rsp.add_info_0 = 0; /* Reserved bits */
+       desc->rsp.add_info_1 = 0; /* Reserved bits */
+
+       /* Flag that the muxing causes loss of opaque data */
+       op->opaque_data = (void *)-1;
+       for (i = 0 ; i < desc->req.numCBs; i++)
+               ref_op[i] = op;
+
+       /* One CB (op) was successfully dequeued */
+       return desc->req.numCBs;
+}
+
+/* Dequeue one encode operation from ACC100 device in TB mode */
+static inline int
+dequeue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
+               uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
+{
+       union acc100_dma_desc *desc, *last_desc, atom_desc;
+       union acc100_dma_rsp_desc rsp;
+       struct rte_bbdev_enc_op *op;
+       uint8_t i = 0;
+       uint16_t current_dequeued_cbs = 0, cbs_in_tb;
+
+       desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
+                       & q->sw_ring_wrap_mask);
+       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+                       __ATOMIC_RELAXED);
+
+       /* Check fdone bit */
+       if (!(atom_desc.rsp.val & ACC100_FDONE))
+               return -1;
+
+       /* Get number of CBs in dequeued TB */
+       cbs_in_tb = desc->req.cbs_in_tb;
+       /* Get last CB */
+       last_desc = q->ring_addr + ((q->sw_ring_tail
+                       + total_dequeued_cbs + cbs_in_tb - 1)
+                       & q->sw_ring_wrap_mask);
+       /* Check if last CB in TB is ready to dequeue (and thus
+        * the whole TB) - checking sdone bit. If not return.
+        */
+       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
+                       __ATOMIC_RELAXED);
+       if (!(atom_desc.rsp.val & ACC100_SDONE))
+               return -1;
+
+       /* Dequeue */
+       op = desc->req.op_addr;
+
+       /* Clearing status, it will be set based on response */
+       op->status = 0;
+
+       while (i < cbs_in_tb) {
+               desc = q->ring_addr + ((q->sw_ring_tail
+                               + total_dequeued_cbs)
+                               & q->sw_ring_wrap_mask);
+               atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+                               __ATOMIC_RELAXED);
+               rsp.val = atom_desc.rsp.val;
+               rte_bbdev_log_debug("Resp. desc %p: %x", desc,
+                               rsp.val);
+
+               op->status |= ((rsp.input_err)
+                               ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+               op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+               op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+               if (desc->req.last_desc_in_batch) {
+                       (*aq_dequeued)++;
+                       desc->req.last_desc_in_batch = 0;
+               }
+               desc->rsp.val = ACC100_DMA_DESC_TYPE;
+               desc->rsp.add_info_0 = 0;
+               desc->rsp.add_info_1 = 0;
+               total_dequeued_cbs++;
+               current_dequeued_cbs++;
+               i++;
+       }
+
+       *ref_op = op;
+
+       return current_dequeued_cbs;
+}
+
+/* Dequeue one decode operation from ACC100 device in CB mode */
+static inline int
+dequeue_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
+               struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+               uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+       union acc100_dma_desc *desc, atom_desc;
+       union acc100_dma_rsp_desc rsp;
+       struct rte_bbdev_dec_op *op;
+
+       desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+                       & q->sw_ring_wrap_mask);
+       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+                       __ATOMIC_RELAXED);
+
+       /* Check fdone bit */
+       if (!(atom_desc.rsp.val & ACC100_FDONE))
+               return -1;
+
+       rsp.val = atom_desc.rsp.val;
+       rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
+
+       /* Dequeue */
+       op = desc->req.op_addr;
+
+       /* Clearing status, it will be set based on response */
+       op->status = 0;
+       op->status |= ((rsp.input_err)
+                       ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+       op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+       op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+       if (op->status != 0)
+               q_data->queue_stats.dequeue_err_count++;
+
+       /* CRC status is only valid when no other error was flagged */
+       if (!op->status)
+               op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+       op->turbo_dec.iter_count = (uint8_t) rsp.iter_cnt / 2;
+       /* Check if this is the last desc in batch (Atomic Queue) */
+       if (desc->req.last_desc_in_batch) {
+               (*aq_dequeued)++;
+               desc->req.last_desc_in_batch = 0;
+       }
+       desc->rsp.val = ACC100_DMA_DESC_TYPE;
+       desc->rsp.add_info_0 = 0;
+       desc->rsp.add_info_1 = 0;
+       *ref_op = op;
+
+       /* One CB (op) was successfully dequeued */
+       return 1;
+}
+
+/* Dequeue one LDPC decode operation from ACC100 device in CB mode */
+static inline int
+dequeue_ldpc_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
+               struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+               uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+       union acc100_dma_desc *desc, atom_desc;
+       union acc100_dma_rsp_desc rsp;
+       struct rte_bbdev_dec_op *op;
+
+       desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+                       & q->sw_ring_wrap_mask);
+       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+                       __ATOMIC_RELAXED);
+
+       /* Check fdone bit */
+       if (!(atom_desc.rsp.val & ACC100_FDONE))
+               return -1;
+
+       rsp.val = atom_desc.rsp.val;
+
+       /* Dequeue */
+       op = desc->req.op_addr;
+
+       /* Clearing status, it will be set based on response */
+       op->status = 0;
+       op->status |= rsp.input_err << RTE_BBDEV_DATA_ERROR;
+       op->status |= rsp.dma_err << RTE_BBDEV_DRV_ERROR;
+       op->status |= rsp.fcw_err << RTE_BBDEV_DRV_ERROR;
+       if (op->status != 0)
+               q_data->queue_stats.dequeue_err_count++;
+
+       op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+       if (op->ldpc_dec.hard_output.length > 0 && !rsp.synd_ok)
+               op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
+       op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
+
+       /* Check if this is the last desc in batch (Atomic Queue) */
+       if (desc->req.last_desc_in_batch) {
+               (*aq_dequeued)++;
+               desc->req.last_desc_in_batch = 0;
+       }
+
+       desc->rsp.val = ACC100_DMA_DESC_TYPE;
+       desc->rsp.add_info_0 = 0;
+       desc->rsp.add_info_1 = 0;
+
+       *ref_op = op;
+
+       /* One CB (op) was successfully dequeued */
+       return 1;
+}
+
+/* Dequeue one decode operation from ACC100 device in TB mode. */
+static inline int
+dequeue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+               uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+       union acc100_dma_desc *desc, *last_desc, atom_desc;
+       union acc100_dma_rsp_desc rsp;
+       struct rte_bbdev_dec_op *op;
+       uint8_t cbs_in_tb = 1, cb_idx = 0;
+
+       desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+                       & q->sw_ring_wrap_mask);
+       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+                       __ATOMIC_RELAXED);
+
+       /* Check fdone bit */
+       if (!(atom_desc.rsp.val & ACC100_FDONE))
+               return -1;
+
+       /* Dequeue */
+       op = desc->req.op_addr;
+
+       /* Get number of CBs in dequeued TB */
+       cbs_in_tb = desc->req.cbs_in_tb;
+       /* Get last CB */
+       last_desc = q->ring_addr + ((q->sw_ring_tail
+                       + dequeued_cbs + cbs_in_tb - 1)
+                       & q->sw_ring_wrap_mask);
+       /* Check if last CB in TB is ready to dequeue (and thus
+        * the whole TB) - checking sdone bit. If not return.
+        */
+       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
+                       __ATOMIC_RELAXED);
+       if (!(atom_desc.rsp.val & ACC100_SDONE))
+               return -1;
+
+       /* Clearing status, it will be set based on response */
+       op->status = 0;
+
+       /* Read remaining CBs if any */
+       while (cb_idx < cbs_in_tb) {
+               desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+                               & q->sw_ring_wrap_mask);
+               atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+                               __ATOMIC_RELAXED);
+               rsp.val = atom_desc.rsp.val;
+               rte_bbdev_log_debug("Resp. desc %p: %x", desc,
+                               rsp.val);
+
+               op->status |= ((rsp.input_err)
+                               ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+               op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+               op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+               /* CRC status is only valid when no other error was flagged */
+               if (!op->status)
+                       op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+               op->turbo_dec.iter_count = RTE_MAX((uint8_t) rsp.iter_cnt,
+                               op->turbo_dec.iter_count);
+
+               /* Check if this is the last desc in batch (Atomic Queue) */
+               if (desc->req.last_desc_in_batch) {
+                       (*aq_dequeued)++;
+                       desc->req.last_desc_in_batch = 0;
+               }
+               desc->rsp.val = ACC100_DMA_DESC_TYPE;
+               desc->rsp.add_info_0 = 0;
+               desc->rsp.add_info_1 = 0;
+               dequeued_cbs++;
+               cb_idx++;
+       }
+
+       *ref_op = op;
+
+       return cb_idx;
+}
+
+/* Dequeue LDPC encode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
+               struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+       struct acc100_queue *q = q_data->queue_private;
+       uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+       uint32_t aq_dequeued = 0;
+       uint16_t dequeue_num, i, dequeued_cbs = 0, dequeued_descs = 0;
+       int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+       if (unlikely(ops == NULL || q == NULL))
+               return 0;
+#endif
+
+       dequeue_num = (avail < num) ? avail : num;
+
+       for (i = 0; i < dequeue_num; i++) {
+               ret = dequeue_enc_one_op_cb(q, &ops[dequeued_cbs],
+                               dequeued_descs, &aq_dequeued);
+               if (ret < 0)
+                       break;
+               dequeued_cbs += ret;
+               dequeued_descs++;
+               if (dequeued_cbs >= num)
+                       break;
+       }
+
+       q->aq_dequeued += aq_dequeued;
+       q->sw_ring_tail += dequeued_descs;
+
+       /* Update dequeue stats */
+       q_data->queue_stats.dequeued_count += dequeued_cbs;
+
+       return dequeued_cbs;
+}
+
+/* Dequeue LDPC decode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
+               struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+       struct acc100_queue *q = q_data->queue_private;
+       uint16_t dequeue_num;
+       uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+       uint32_t aq_dequeued = 0;
+       uint16_t i;
+       uint16_t dequeued_cbs = 0;
+       struct rte_bbdev_dec_op *op;
+       int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+       if (unlikely(ops == NULL || q == NULL))
+               return 0;
+#endif
+
+       dequeue_num = (avail < num) ? avail : num;
+
+       for (i = 0; i < dequeue_num; ++i) {
+               op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+                       & q->sw_ring_wrap_mask))->req.op_addr;
+               if (op->ldpc_dec.code_block_mode == 0)
+                       ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
+                                       &aq_dequeued);
+               else
+                       ret = dequeue_ldpc_dec_one_op_cb(
+                                       q_data, q, &ops[i], dequeued_cbs,
+                                       &aq_dequeued);
+
+               if (ret < 0)
+                       break;
+               dequeued_cbs += ret;
+       }
+
+       q->aq_dequeued += aq_dequeued;
+       q->sw_ring_tail += dequeued_cbs;
+
+       /* Update dequeue stats */
+       q_data->queue_stats.dequeued_count += i;
+
+       return i;
+}
+
 /* Initialization Function */
 static void
 acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
@@ -703,6 +2321,10 @@
         struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
 
         dev->dev_ops = &acc100_bbdev_ops;
+       dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
+       dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
+       dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
+       dev->dequeue_ldpc_dec_ops = acc100_dequeue_ldpc_dec;
 
         ((struct acc100_device *) dev->data->dev_private)->pf_device =
                         !strcmp(drv->driver.name,
@@ -815,4 +2437,3 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
 RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
 RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
 RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
-
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 0e2b79c..78686c1 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -88,6 +88,8 @@
 #define TMPL_PRI_3      0x0f0e0d0c
 #define QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
 #define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
+#define ACC100_FDONE    0x80000000
+#define ACC100_SDONE    0x40000000
 
 #define ACC100_NUM_TMPL  32
 #define VF_OFFSET_QOS 16 /* offset in Memory Space specific to QoS Mon */
@@ -398,6 +400,7 @@ struct __rte_packed acc100_dma_req_desc {
 union acc100_dma_desc {
         struct acc100_dma_req_desc req;
         union acc100_dma_rsp_desc rsp;
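+       /* Raw 64-bit view of the descriptor header; lets the dequeue
+        * path read the response header with a single atomic load.
+        */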
+       uint64_t atom_hdr;
 };
 
 
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions
  2020-08-20 14:38   ` Dave Burley
@ 2020-08-20 14:52     ` Chautru, Nicolas
  2020-08-20 14:57       ` Dave Burley
  0 siblings, 1 reply; 213+ messages in thread
From: Chautru, Nicolas @ 2020-08-20 14:52 UTC (permalink / raw)
  To: Dave Burley, dev; +Cc: Richardson, Bruce

Hi Dave, 
This assumes 6-bit LLR compression packing (i.e. the first 2 MSBs are dropped), similar to HARQ compression.
Let me know if this is unclear; I can clarify further in the documentation if it is not explicit enough.
Thanks
Nic
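
For reference, a minimal sketch of one plausible packing under that
assumption: each 8-bit LLR keeps its 6 LSBs and the 6-bit fields are
concatenated LSB-first, 4 LLRs per 3 bytes, which matches the
(input_length * 3 + 3) / 4 scaling used for
RTE_BBDEV_LDPC_LLR_COMPRESSION in the patch. The helper name and the
bit order within a byte are illustrative assumptions, not the device's
documented format.

    #include <stdint.h>
    #include <stddef.h>

    /* Hypothetical 6-bit LLR packing: drop the 2 MSBs of each 8-bit LLR
     * and concatenate the 6-bit remainders LSB-first. Produces
     * ceil(n * 6 / 8) output bytes; 'out' must be zero-initialized by
     * the caller.
     */
    static size_t
    pack_llr_6bit(const int8_t *in, size_t n, uint8_t *out)
    {
            size_t bitpos = 0, i;

            for (i = 0; i < n; i++) {
                    uint8_t v = ((uint8_t)in[i]) & 0x3F; /* keep 6 LSBs */
                    size_t byte = bitpos >> 3, off = bitpos & 7;

                    out[byte] |= (uint8_t)(v << off);
                    if (off > 2) /* field straddles a byte boundary */
                            out[byte + 1] |= (uint8_t)(v >> (8 - off));
                    bitpos += 6;
            }
            return (bitpos + 7) >> 3; /* e.g. 4 LLRs -> 3 bytes */
    }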

> -----Original Message-----
> From: Dave Burley <dave.burley@accelercomm.com>
> Sent: Thursday, August 20, 2020 7:39 AM
> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org
> Cc: Richardson, Bruce <bruce.richardson@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC
> processing functions
> 
> Hi Nic,
> 
> As you've now specified the use of RTE_BBDEV_LDPC_LLR_COMPRESSION for
> this PMD, please could you confirm what the packed format of the LLRs in
> memory looks like?
> 
> Best Regards
> 
> Dave Burley
> 
> 
> From: dev <dev-bounces@dpdk.org> on behalf of Nicolas Chautru
> <nicolas.chautru@intel.com>
> Sent: 19 August 2020 01:25
> To: dev@dpdk.org <dev@dpdk.org>; akhil.goyal@nxp.com
> <akhil.goyal@nxp.com>
> Cc: bruce.richardson@intel.com <bruce.richardson@intel.com>; Nicolas
> Chautru <nicolas.chautru@intel.com>
> Subject: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing
> functions
> 
> Adding LDPC decode and encode processing operations
> 
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
>  drivers/baseband/acc100/rte_acc100_pmd.c | 1625 +++++++++++++++++++++++++++++-
>  drivers/baseband/acc100/rte_acc100_pmd.h |    3 +
>  2 files changed, 1626 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
> index 7a21c57..5f32813 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> @@ -15,6 +15,9 @@
>  #include <rte_hexdump.h>
>  #include <rte_pci.h>
>  #include <rte_bus_pci.h>
> +#ifdef RTE_BBDEV_OFFLOAD_COST
> +#include <rte_cycles.h>
> +#endif
> 
>  #include <rte_bbdev.h>
>  #include <rte_bbdev_pmd.h>
> @@ -449,7 +452,6 @@
>          return 0;
>  }
> 
> -
>  /**
> + * Report an ACC100 queue index which is free
>   * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
> @@ -634,6 +636,46 @@
>          struct acc100_device *d = dev->data->dev_private;
> 
>          static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
> +               {
> +                       .type   = RTE_BBDEV_OP_LDPC_ENC,
> +                       .cap.ldpc_enc = {
> +                               .capability_flags =
> +                                       RTE_BBDEV_LDPC_RATE_MATCH |
> +                                       RTE_BBDEV_LDPC_CRC_24B_ATTACH |
> +                                       RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
> +                               .num_buffers_src =
> +                                               RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> +                               .num_buffers_dst =
> +                                               RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> +                       }
> +               },
> +               {
> +                       .type   = RTE_BBDEV_OP_LDPC_DEC,
> +                       .cap.ldpc_dec = {
> +                       .capability_flags =
> +                               RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK |
> +                               RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP |
> +                               RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
> +                               RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
> +#ifdef ACC100_EXT_MEM
> +                               RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
> +                               RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
> +#endif
> +                               RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE |
> +                               RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS |
> +                               RTE_BBDEV_LDPC_DECODE_BYPASS |
> +                               RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
> +                               RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
> +                               RTE_BBDEV_LDPC_LLR_COMPRESSION,
> +                       .llr_size = 8,
> +                       .llr_decimals = 1,
> +                       .num_buffers_src =
> +                                       RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> +                       .num_buffers_hard_out =
> +                                       RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> +                       .num_buffers_soft_out = 0,
> +                       }
> +               },
>                  RTE_BBDEV_END_OF_CAPABILITIES_LIST()
>          };
> 
> @@ -669,9 +711,14 @@
>          dev_info->cpu_flag_reqs = NULL;
>          dev_info->min_alignment = 64;
>          dev_info->capabilities = bbdev_capabilities;
> +#ifdef ACC100_EXT_MEM
>          dev_info->harq_buffer_size = d->ddr_size;
> +#else
> +       dev_info->harq_buffer_size = 0;
> +#endif
>  }
> 
> +
>  static const struct rte_bbdev_ops acc100_bbdev_ops = {
>          .setup_queues = acc100_setup_queues,
>          .close = acc100_dev_close,
> @@ -696,6 +743,1577 @@
>          {.device_id = 0},
>  };
> 
> +/* Read flag value 0/1 from bitmap */
> +static inline bool
> +check_bit(uint32_t bitmap, uint32_t bitmask)
> +{
> +       return bitmap & bitmask;
> +}
> +
> +static inline char *
> +mbuf_append(struct rte_mbuf *m_head, struct rte_mbuf *m, uint16_t len)
> +{
> +       if (unlikely(len > rte_pktmbuf_tailroom(m)))
> +               return NULL;
> +
> +       char *tail = (char *)m->buf_addr + m->data_off + m->data_len;
> +       m->data_len = (uint16_t)(m->data_len + len);
> +       m_head->pkt_len  = (m_head->pkt_len + len);
> +       return tail;
> +}
> +
> +/* Compute value of k0.
> + * Based on 3GPP 38.212 Table 5.4.2.1-2
> + * Starting position of different redundancy versions, k0
> + */
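> +/* In the LBRM case below this reduces to
> + *     k0 = floor(j * n_cb / N) * Zc,
> + * with N = N_ZC_1 * Zc (BG1) or N_ZC_2 * Zc (BG2) and j taken from
> + * Table 5.4.2.1-2; the K0_* constants are assumed to encode these j
> + * values (e.g. j = 17 for BG1 at rv_index 1).
> + */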
> +static inline uint16_t
> +get_k0(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv_index)
> +{
> +       if (rv_index == 0)
> +               return 0;
> +       uint16_t n = (bg == 1 ? N_ZC_1 : N_ZC_2) * z_c;
> +       if (n_cb == n) {
> +               if (rv_index == 1)
> +                       return (bg == 1 ? K0_1_1 : K0_1_2) * z_c;
> +               else if (rv_index == 2)
> +                       return (bg == 1 ? K0_2_1 : K0_2_2) * z_c;
> +               else
> +                       return (bg == 1 ? K0_3_1 : K0_3_2) * z_c;
> +       }
> +       /* LBRM case - includes a division by N */
> +       if (rv_index == 1)
> +               return (((bg == 1 ? K0_1_1 : K0_1_2) * n_cb)
> +                               / n) * z_c;
> +       else if (rv_index == 2)
> +               return (((bg == 1 ? K0_2_1 : K0_2_2) * n_cb)
> +                               / n) * z_c;
> +       else
> +               return (((bg == 1 ? K0_3_1 : K0_3_2) * n_cb)
> +                               / n) * z_c;
> +}
> +
> +/* Fill in a frame control word for LDPC encoding. */
> +static inline void
> +acc100_fcw_le_fill(const struct rte_bbdev_enc_op *op,
> +               struct acc100_fcw_le *fcw, int num_cb)
> +{
> +       fcw->qm = op->ldpc_enc.q_m;
> +       fcw->nfiller = op->ldpc_enc.n_filler;
> +       fcw->BG = (op->ldpc_enc.basegraph - 1);
> +       fcw->Zc = op->ldpc_enc.z_c;
> +       fcw->ncb = op->ldpc_enc.n_cb;
> +       fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_enc.basegraph,
> +                       op->ldpc_enc.rv_index);
> +       fcw->rm_e = op->ldpc_enc.cb_params.e;
> +       fcw->crc_select = check_bit(op->ldpc_enc.op_flags,
> +                       RTE_BBDEV_LDPC_CRC_24B_ATTACH);
> +       fcw->bypass_intlv = check_bit(op->ldpc_enc.op_flags,
> +                       RTE_BBDEV_LDPC_INTERLEAVER_BYPASS);
> +       fcw->mcb_count = num_cb;
> +}
> +
> +/* Fill in a frame control word for LDPC decoding. */
> +static inline void
> +acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld
> *fcw,
> +               union acc100_harq_layout_data *harq_layout)
> +{
> +       uint16_t harq_out_length, harq_in_length, ncb_p, k0_p, parity_offset;
> +       uint16_t harq_index;
> +       uint32_t l;
> +       bool harq_prun = false;
> +
> +       fcw->qm = op->ldpc_dec.q_m;
> +       fcw->nfiller = op->ldpc_dec.n_filler;
> +       fcw->BG = (op->ldpc_dec.basegraph - 1);
> +       fcw->Zc = op->ldpc_dec.z_c;
> +       fcw->ncb = op->ldpc_dec.n_cb;
> +       fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_dec.basegraph,
> +                       op->ldpc_dec.rv_index);
> +       if (op->ldpc_dec.code_block_mode == 1)
> +               fcw->rm_e = op->ldpc_dec.cb_params.e;
> +       else
> +               fcw->rm_e = (op->ldpc_dec.tb_params.r <
> +                               op->ldpc_dec.tb_params.cab) ?
> +                                               op->ldpc_dec.tb_params.ea :
> +                                               op->ldpc_dec.tb_params.eb;
> +
> +       fcw->hcin_en = check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE);
> +       fcw->hcout_en = check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE);
> +       fcw->crc_select = check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK);
> +       fcw->bypass_dec = check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_DECODE_BYPASS);
> +       fcw->bypass_intlv = check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS);
> +       if (op->ldpc_dec.q_m == 1) {
> +               fcw->bypass_intlv = 1;
> +               fcw->qm = 2;
> +       }
> +       fcw->hcin_decomp_mode = check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> +       fcw->hcout_comp_mode = check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> +       fcw->llr_pack_mode = check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_LLR_COMPRESSION);
> +       harq_index = op->ldpc_dec.harq_combined_output.offset /
> +                       ACC100_HARQ_OFFSET;
> +#ifdef ACC100_EXT_MEM
> +       /* Limit cases when HARQ pruning is valid */
> +       harq_prun = ((op->ldpc_dec.harq_combined_output.offset %
> +                       ACC100_HARQ_OFFSET) == 0) &&
> +                       (op->ldpc_dec.harq_combined_output.offset <= UINT16_MAX
> +                       * ACC100_HARQ_OFFSET);
> +#endif
> +       if (fcw->hcin_en > 0) {
> +               harq_in_length = op->ldpc_dec.harq_combined_input.length;
> +               if (fcw->hcin_decomp_mode > 0)
> +                       harq_in_length = harq_in_length * 8 / 6;
> +               harq_in_length = RTE_ALIGN(harq_in_length, 64);
> +               if ((harq_layout[harq_index].offset > 0) & harq_prun) {
> +                       rte_bbdev_log_debug("HARQ IN offset unexpected for now\n");
> +                       fcw->hcin_size0 = harq_layout[harq_index].size0;
> +                       fcw->hcin_offset = harq_layout[harq_index].offset;
> +                       fcw->hcin_size1 = harq_in_length -
> +                                       harq_layout[harq_index].offset;
> +               } else {
> +                       fcw->hcin_size0 = harq_in_length;
> +                       fcw->hcin_offset = 0;
> +                       fcw->hcin_size1 = 0;
> +               }
> +       } else {
> +               fcw->hcin_size0 = 0;
> +               fcw->hcin_offset = 0;
> +               fcw->hcin_size1 = 0;
> +       }
> +
> +       fcw->itmax = op->ldpc_dec.iter_max;
> +       fcw->itstop = check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE);
> +       fcw->synd_precoder = fcw->itstop;
> +       /*
> +        * These are all implicitly set
> +        * fcw->synd_post = 0;
> +        * fcw->so_en = 0;
> +        * fcw->so_bypass_rm = 0;
> +        * fcw->so_bypass_intlv = 0;
> +        * fcw->dec_convllr = 0;
> +        * fcw->hcout_convllr = 0;
> +        * fcw->hcout_size1 = 0;
> +        * fcw->so_it = 0;
> +        * fcw->hcout_offset = 0;
> +        * fcw->negstop_th = 0;
> +        * fcw->negstop_it = 0;
> +        * fcw->negstop_en = 0;
> +        * fcw->gain_i = 1;
> +        * fcw->gain_h = 1;
> +        */
> +       if (fcw->hcout_en > 0) {
> +               parity_offset = (op->ldpc_dec.basegraph == 1 ? 20 : 8)
> +                       * op->ldpc_dec.z_c - op->ldpc_dec.n_filler;
> +               k0_p = (fcw->k0 > parity_offset) ?
> +                               fcw->k0 - op->ldpc_dec.n_filler : fcw->k0;
> +               ncb_p = fcw->ncb - op->ldpc_dec.n_filler;
> +               l = k0_p + fcw->rm_e;
> +               harq_out_length = (uint16_t) fcw->hcin_size0;
> +               harq_out_length = RTE_MIN(RTE_MAX(harq_out_length, l), ncb_p);
> +               harq_out_length = (harq_out_length + 0x3F) & 0xFFC0;
> +               if ((k0_p > fcw->hcin_size0 + ACC100_HARQ_OFFSET_THRESHOLD)
> &&
> +                               harq_prun) {
> +                       fcw->hcout_size0 = (uint16_t) fcw->hcin_size0;
> +                       fcw->hcout_offset = k0_p & 0xFFC0;
> +                       fcw->hcout_size1 = harq_out_length - fcw->hcout_offset;
> +               } else {
> +                       fcw->hcout_size0 = harq_out_length;
> +                       fcw->hcout_size1 = 0;
> +                       fcw->hcout_offset = 0;
> +               }
> +               harq_layout[harq_index].offset = fcw->hcout_offset;
> +               harq_layout[harq_index].size0 = fcw->hcout_size0;
> +       } else {
> +               fcw->hcout_size0 = 0;
> +               fcw->hcout_size1 = 0;
> +               fcw->hcout_offset = 0;
> +       }
> +}
> +
> +/**
> + * Fills descriptor with data pointers of one block type.
> + *
> + * @param desc
> + *   Pointer to DMA descriptor.
> + * @param input
> + *   Pointer to pointer to input data which will be encoded. It can be
> + *   changed and will point to the next segment in the scatter-gather case.
> + * @param offset
> + *   Input offset in rte_mbuf structure. It is used for calculating the point
> + *   where data is starting.
> + * @param cb_len
> + *   Length of currently processed Code Block
> + * @param seg_total_left
> + *   It indicates how many bytes still left in segment (mbuf) for further
> + *   processing.
> + * @param op_flags
> + *   Store information about device capabilities
> + * @param next_triplet
> + *   Index for ACC100 DMA Descriptor triplet
> + *
> + * @return
> + *   Returns index of next triplet on success, other value if lengths of
> + *   pkt and processed cb do not match.
> + *
> + */
> +static inline int
> +acc100_dma_fill_blk_type_in(struct acc100_dma_req_desc *desc,
> +               struct rte_mbuf **input, uint32_t *offset, uint32_t cb_len,
> +               uint32_t *seg_total_left, int next_triplet)
> +{
> +       uint32_t part_len;
> +       struct rte_mbuf *m = *input;
> +
> +       part_len = (*seg_total_left < cb_len) ? *seg_total_left : cb_len;
> +       cb_len -= part_len;
> +       *seg_total_left -= part_len;
> +
> +       desc->data_ptrs[next_triplet].address =
> +                       rte_pktmbuf_iova_offset(m, *offset);
> +       desc->data_ptrs[next_triplet].blen = part_len;
> +       desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
> +       desc->data_ptrs[next_triplet].last = 0;
> +       desc->data_ptrs[next_triplet].dma_ext = 0;
> +       *offset += part_len;
> +       next_triplet++;
> +
> +       while (cb_len > 0) {
> +               if (next_triplet < ACC100_DMA_MAX_NUM_POINTERS &&
> +                               m->next != NULL) {
> +
> +                       m = m->next;
> +                       *seg_total_left = rte_pktmbuf_data_len(m);
> +                       part_len = (*seg_total_left < cb_len) ?
> +                                       *seg_total_left :
> +                                       cb_len;
> +                       desc->data_ptrs[next_triplet].address =
> +                                       rte_pktmbuf_mtophys(m);
> +                       desc->data_ptrs[next_triplet].blen = part_len;
> +                       desc->data_ptrs[next_triplet].blkid =
> +                                       ACC100_DMA_BLKID_IN;
> +                       desc->data_ptrs[next_triplet].last = 0;
> +                       desc->data_ptrs[next_triplet].dma_ext = 0;
> +                       cb_len -= part_len;
> +                       *seg_total_left -= part_len;
> +                       /* Initializing offset for next segment (mbuf) */
> +                       *offset = part_len;
> +                       next_triplet++;
> +               } else {
> +                       rte_bbdev_log(ERR,
> +                               "Some data still left for processing: "
> +                               "data_left: %u, next_triplet: %u, next_mbuf: %p",
> +                               cb_len, next_triplet, m->next);
> +                       return -EINVAL;
> +               }
> +       }
> +       /* Storing new mbuf as it could be changed in scatter-gather case */
> +       *input = m;
> +
> +       return next_triplet;
> +}
> +
> +/* Fills descriptor with data pointers of one block type.
> + * Returns index of next triplet on success, other value if lengths of
> + * output data and processed mbuf do not match.
> + */
> +static inline int
> +acc100_dma_fill_blk_type_out(struct acc100_dma_req_desc *desc,
> +               struct rte_mbuf *output, uint32_t out_offset,
> +               uint32_t output_len, int next_triplet, int blk_id)
> +{
> +       desc->data_ptrs[next_triplet].address =
> +                       rte_pktmbuf_iova_offset(output, out_offset);
> +       desc->data_ptrs[next_triplet].blen = output_len;
> +       desc->data_ptrs[next_triplet].blkid = blk_id;
> +       desc->data_ptrs[next_triplet].last = 0;
> +       desc->data_ptrs[next_triplet].dma_ext = 0;
> +       next_triplet++;
> +
> +       return next_triplet;
> +}
> +
> +static inline int
> +acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
> +               struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
> +               struct rte_mbuf *output, uint32_t *in_offset,
> +               uint32_t *out_offset, uint32_t *out_length,
> +               uint32_t *mbuf_total_left, uint32_t *seg_total_left)
> +{
> +       int next_triplet = 1; /* FCW already done */
> +       uint16_t K, in_length_in_bits, in_length_in_bytes;
> +       struct rte_bbdev_op_ldpc_enc *enc = &op->ldpc_enc;
> +
> +       desc->word0 = ACC100_DMA_DESC_TYPE;
> +       desc->word1 = 0; /**< Timestamp could be disabled */
> +       desc->word2 = 0;
> +       desc->word3 = 0;
> +       desc->numCBs = 1;
> +
> +       K = (enc->basegraph == 1 ? 22 : 10) * enc->z_c;
> +       in_length_in_bits = K - enc->n_filler;
> +       if ((enc->op_flags & RTE_BBDEV_LDPC_CRC_24A_ATTACH) ||
> +                       (enc->op_flags & RTE_BBDEV_LDPC_CRC_24B_ATTACH))
> +               in_length_in_bits -= 24;
> +       in_length_in_bytes = in_length_in_bits >> 3;
> +
> +       if (unlikely((*mbuf_total_left == 0) ||
> +                       (*mbuf_total_left < in_length_in_bytes))) {
> +               rte_bbdev_log(ERR,
> +                               "Mismatch between mbuf length and included CB sizes:
> mbuf len %u, cb len %u",
> +                               *mbuf_total_left, in_length_in_bytes);
> +               return -1;
> +       }
> +
> +       next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
> +                       in_length_in_bytes,
> +                       seg_total_left, next_triplet);
> +       if (unlikely(next_triplet < 0)) {
> +               rte_bbdev_log(ERR,
> +                               "Mismatch between data to process and mbuf data length
> in bbdev_op: %p",
> +                               op);
> +               return -1;
> +       }
> +       desc->data_ptrs[next_triplet - 1].last = 1;
> +       desc->m2dlen = next_triplet;
> +       *mbuf_total_left -= in_length_in_bytes;
> +
> +       /* Set output length */
> +       /* Integer round up division by 8 */
> +       *out_length = (enc->cb_params.e + 7) >> 3;
> +
> +       next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
> +                       *out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
> +       if (unlikely(next_triplet < 0)) {
> +               rte_bbdev_log(ERR,
> +                               "Mismatch between data to process and mbuf data length
> in bbdev_op: %p",
> +                               op);
> +               return -1;
> +       }
> +       op->ldpc_enc.output.length += *out_length;
> +       *out_offset += *out_length;
> +       desc->data_ptrs[next_triplet - 1].last = 1;
> +       desc->data_ptrs[next_triplet - 1].dma_ext = 0;
> +       desc->d2mlen = next_triplet - desc->m2dlen;
> +
> +       desc->op_addr = op;
> +
> +       return 0;
> +}
> +
> +static inline int
> +acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
> +               struct acc100_dma_req_desc *desc,
> +               struct rte_mbuf **input, struct rte_mbuf *h_output,
> +               uint32_t *in_offset, uint32_t *h_out_offset,
> +               uint32_t *h_out_length, uint32_t *mbuf_total_left,
> +               uint32_t *seg_total_left,
> +               struct acc100_fcw_ld *fcw)
> +{
> +       struct rte_bbdev_op_ldpc_dec *dec = &op->ldpc_dec;
> +       int next_triplet = 1; /* FCW already done */
> +       uint32_t input_length;
> +       uint16_t output_length, crc24_overlap = 0;
> +       uint16_t sys_cols, K, h_p_size, h_np_size;
> +       bool h_comp = check_bit(dec->op_flags,
> +                       RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> +
> +       desc->word0 = ACC100_DMA_DESC_TYPE;
> +       desc->word1 = 0; /**< Timestamp could be disabled */
> +       desc->word2 = 0;
> +       desc->word3 = 0;
> +       desc->numCBs = 1;
> +
> +       if (check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP))
> +               crc24_overlap = 24;
> +
> +       /* Compute some LDPC BG lengths */
> +       input_length = dec->cb_params.e;
> +       if (check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_LLR_COMPRESSION))
> +               input_length = (input_length * 3 + 3) / 4;
> +       sys_cols = (dec->basegraph == 1) ? 22 : 10;
> +       K = sys_cols * dec->z_c;
> +       output_length = K - dec->n_filler - crc24_overlap;
> +
> +       if (unlikely((*mbuf_total_left == 0) ||
> +                       (*mbuf_total_left < input_length))) {
> +               rte_bbdev_log(ERR,
> +                               "Mismatch between mbuf length and included CB sizes:
> mbuf len %u, cb len %u",
> +                               *mbuf_total_left, input_length);
> +               return -1;
> +       }
> +
> +       next_triplet = acc100_dma_fill_blk_type_in(desc, input,
> +                       in_offset, input_length,
> +                       seg_total_left, next_triplet);
> +
> +       if (unlikely(next_triplet < 0)) {
> +               rte_bbdev_log(ERR,
> +                               "Mismatch between data to process and mbuf data length
> in bbdev_op: %p",
> +                               op);
> +               return -1;
> +       }
> +
> +       if (check_bit(op->ldpc_dec.op_flags,
> +                               RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
> +               h_p_size = fcw->hcin_size0 + fcw->hcin_size1;
> +               if (h_comp)
> +                       h_p_size = (h_p_size * 3 + 3) / 4;
> +               desc->data_ptrs[next_triplet].address =
> +                               dec->harq_combined_input.offset;
> +               desc->data_ptrs[next_triplet].blen = h_p_size;
> +               desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN_HARQ;
> +               desc->data_ptrs[next_triplet].dma_ext = 1;
> +#ifndef ACC100_EXT_MEM
> +               acc100_dma_fill_blk_type_out(
> +                               desc,
> +                               op->ldpc_dec.harq_combined_input.data,
> +                               op->ldpc_dec.harq_combined_input.offset,
> +                               h_p_size,
> +                               next_triplet,
> +                               ACC100_DMA_BLKID_IN_HARQ);
> +#endif
> +               next_triplet++;
> +       }
> +
> +       desc->data_ptrs[next_triplet - 1].last = 1;
> +       desc->m2dlen = next_triplet;
> +       *mbuf_total_left -= input_length;
> +
> +       next_triplet = acc100_dma_fill_blk_type_out(desc, h_output,
> +                       *h_out_offset, output_length >> 3, next_triplet,
> +                       ACC100_DMA_BLKID_OUT_HARD);
> +       if (unlikely(next_triplet < 0)) {
> +               rte_bbdev_log(ERR,
> +                               "Mismatch between data to process and mbuf data length
> in bbdev_op: %p",
> +                               op);
> +               return -1;
> +       }
> +
> +       if (check_bit(op->ldpc_dec.op_flags,
> +                               RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
> +               /* Pruned size of the HARQ */
> +               h_p_size = fcw->hcout_size0 + fcw->hcout_size1;
> +               /* Non-Pruned size of the HARQ */
> +               h_np_size = fcw->hcout_offset > 0 ?
> +                               fcw->hcout_offset + fcw->hcout_size1 :
> +                               h_p_size;
> +               if (h_comp) {
> +                       h_np_size = (h_np_size * 3 + 3) / 4;
> +                       h_p_size = (h_p_size * 3 + 3) / 4;
> +               }
> +               dec->harq_combined_output.length = h_np_size;
> +               desc->data_ptrs[next_triplet].address =
> +                               dec->harq_combined_output.offset;
> +               desc->data_ptrs[next_triplet].blen = h_p_size;
> +               desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_OUT_HARQ;
> +               desc->data_ptrs[next_triplet].dma_ext = 1;
> +#ifndef ACC100_EXT_MEM
> +               acc100_dma_fill_blk_type_out(
> +                               desc,
> +                               dec->harq_combined_output.data,
> +                               dec->harq_combined_output.offset,
> +                               h_p_size,
> +                               next_triplet,
> +                               ACC100_DMA_BLKID_OUT_HARQ);
> +#endif
> +               next_triplet++;
> +       }
> +
> +       *h_out_length = output_length >> 3;
> +       dec->hard_output.length += *h_out_length;
> +       *h_out_offset += *h_out_length;
> +       desc->data_ptrs[next_triplet - 1].last = 1;
> +       desc->d2mlen = next_triplet - desc->m2dlen;
> +
> +       desc->op_addr = op;
> +
> +       return 0;
> +}
> +
> +static inline void
> +acc100_dma_desc_ld_update(struct rte_bbdev_dec_op *op,
> +               struct acc100_dma_req_desc *desc,
> +               struct rte_mbuf *input, struct rte_mbuf *h_output,
> +               uint32_t *in_offset, uint32_t *h_out_offset,
> +               uint32_t *h_out_length,
> +               union acc100_harq_layout_data *harq_layout)
> +{
> +       int next_triplet = 1; /* FCW already done */
> +       desc->data_ptrs[next_triplet].address =
> +                       rte_pktmbuf_iova_offset(input, *in_offset);
> +       next_triplet++;
> +
> +       if (check_bit(op->ldpc_dec.op_flags,
> +                               RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
> +               struct rte_bbdev_op_data hi = op->ldpc_dec.harq_combined_input;
> +               desc->data_ptrs[next_triplet].address = hi.offset;
> +#ifndef ACC100_EXT_MEM
> +               desc->data_ptrs[next_triplet].address =
> +                               rte_pktmbuf_iova_offset(hi.data, hi.offset);
> +#endif
> +               next_triplet++;
> +       }
> +
> +       desc->data_ptrs[next_triplet].address =
> +                       rte_pktmbuf_iova_offset(h_output, *h_out_offset);
> +       *h_out_length = desc->data_ptrs[next_triplet].blen;
> +       next_triplet++;
> +
> +       if (check_bit(op->ldpc_dec.op_flags,
> +                               RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
> +               desc->data_ptrs[next_triplet].address =
> +                               op->ldpc_dec.harq_combined_output.offset;
> +               /* Adjust based on previous operation */
> +               struct rte_bbdev_dec_op *prev_op = desc->op_addr;
> +               op->ldpc_dec.harq_combined_output.length =
> +                               prev_op->ldpc_dec.harq_combined_output.length;
> +               int16_t hq_idx = op->ldpc_dec.harq_combined_output.offset /
> +                               ACC100_HARQ_OFFSET;
> +               int16_t prev_hq_idx =
> +                               prev_op->ldpc_dec.harq_combined_output.offset
> +                               / ACC100_HARQ_OFFSET;
> +               harq_layout[hq_idx].val = harq_layout[prev_hq_idx].val;
> +#ifndef ACC100_EXT_MEM
> +               struct rte_bbdev_op_data ho =
> +                               op->ldpc_dec.harq_combined_output;
> +               desc->data_ptrs[next_triplet].address =
> +                               rte_pktmbuf_iova_offset(ho.data, ho.offset);
> +#endif
> +               next_triplet++;
> +       }
> +
> +       op->ldpc_dec.hard_output.length += *h_out_length;
> +       desc->op_addr = op;
> +}
> +
> +
> +/* Enqueue a number of operations to HW and update software rings */
> +static inline void
> +acc100_dma_enqueue(struct acc100_queue *q, uint16_t n,
> +               struct rte_bbdev_stats *queue_stats)
> +{
> +       union acc100_enqueue_reg_fmt enq_req;
> +#ifdef RTE_BBDEV_OFFLOAD_COST
> +       uint64_t start_time = 0;
> +       queue_stats->acc_offload_cycles = 0;
> +       RTE_SET_USED(queue_stats);
> +#else
> +       RTE_SET_USED(queue_stats);
> +#endif
> +
> +       enq_req.val = 0;
> +       /* Setting offset, 100b for 256 DMA Desc */
> +       enq_req.addr_offset = ACC100_DESC_OFFSET;
> +
> +       /* Split ops into batches */
> +       do {
> +               union acc100_dma_desc *desc;
> +               uint16_t enq_batch_size;
> +               uint64_t offset;
> +               rte_iova_t req_elem_addr;
> +
> +               enq_batch_size = RTE_MIN(n, MAX_ENQ_BATCH_SIZE);
> +
> +               /* Set flag on last descriptor in a batch */
> +               desc = q->ring_addr + ((q->sw_ring_head + enq_batch_size - 1) &
> +                               q->sw_ring_wrap_mask);
> +               desc->req.last_desc_in_batch = 1;
> +
> +               /* Calculate the 1st descriptor's address */
> +               offset = ((q->sw_ring_head & q->sw_ring_wrap_mask) *
> +                               sizeof(union acc100_dma_desc));
> +               req_elem_addr = q->ring_addr_phys + offset;
> +
> +               /* Fill enqueue struct */
> +               enq_req.num_elem = enq_batch_size;
> +               /* low 6 bits are not needed */
> +               enq_req.req_elem_addr = (uint32_t)(req_elem_addr >> 6);
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +               rte_memdump(stderr, "Req sdone", desc, sizeof(*desc));
> +#endif
> +               rte_bbdev_log_debug(
> +                               "Enqueue %u reqs (phys %#"PRIx64") to reg %p",
> +                               enq_batch_size,
> +                               req_elem_addr,
> +                               (void *)q->mmio_reg_enqueue);
> +
> +               rte_wmb();
> +
> +#ifdef RTE_BBDEV_OFFLOAD_COST
> +               /* Start time measurement for enqueue function offload. */
> +               start_time = rte_rdtsc_precise();
> +#endif
> +               rte_bbdev_log(DEBUG, "Debug : MMIO Enqueue");
> +               mmio_write(q->mmio_reg_enqueue, enq_req.val);
> +
> +#ifdef RTE_BBDEV_OFFLOAD_COST
> +               queue_stats->acc_offload_cycles +=
> +                               rte_rdtsc_precise() - start_time;
> +#endif
> +
> +               q->aq_enqueued++;
> +               q->sw_ring_head += enq_batch_size;
> +               n -= enq_batch_size;
> +
> +       } while (n);
> +}
> +
> +/* Enqueue a number of encode operations for ACC100 device in CB mode */
> +static inline int
> +enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
> +               uint16_t total_enqueued_cbs, int16_t num)
> +{
> +       union acc100_dma_desc *desc = NULL;
> +       uint32_t out_length;
> +       struct rte_mbuf *output_head, *output;
> +       int i, next_triplet;
> +       uint16_t  in_length_in_bytes;
> +       struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
> +
> +       uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> +                       & q->sw_ring_wrap_mask);
> +       desc = q->ring_addr + desc_idx;
> +       acc100_fcw_le_fill(ops[0], &desc->req.fcw_le, num);
> +
> +       /** This could be done at polling */
> +       desc->req.word0 = ACC100_DMA_DESC_TYPE;
> +       desc->req.word1 = 0; /**< Timestamp could be disabled */
> +       desc->req.word2 = 0;
> +       desc->req.word3 = 0;
> +       desc->req.numCBs = num;
> +
> +       in_length_in_bytes = ops[0]->ldpc_enc.input.data->data_len;
> +       out_length = (enc->cb_params.e + 7) >> 3;
> +       desc->req.m2dlen = 1 + num;
> +       desc->req.d2mlen = num;
> +       next_triplet = 1;
> +
> +       for (i = 0; i < num; i++) {
> +               desc->req.data_ptrs[next_triplet].address =
> +                       rte_pktmbuf_iova_offset(ops[i]->ldpc_enc.input.data, 0);
> +               desc->req.data_ptrs[next_triplet].blen = in_length_in_bytes;
> +               next_triplet++;
> +               desc->req.data_ptrs[next_triplet].address =
> +                               rte_pktmbuf_iova_offset(
> +                               ops[i]->ldpc_enc.output.data, 0);
> +               desc->req.data_ptrs[next_triplet].blen = out_length;
> +               next_triplet++;
> +               ops[i]->ldpc_enc.output.length = out_length;
> +               output_head = output = ops[i]->ldpc_enc.output.data;
> +               mbuf_append(output_head, output, out_length);
> +               output->data_len = out_length;
> +       }
> +
> +       desc->req.op_addr = ops[0];
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +       rte_memdump(stderr, "FCW", &desc->req.fcw_le,
> +                       sizeof(desc->req.fcw_le) - 8);
> +       rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> +#endif
> +
> +       /* Multiple CBs (ops) were muxed and successfully prepared to enqueue */
> +       return num;
> +}
> +
> +/* Enqueue one encode operation for ACC100 device in CB mode */
> +static inline int
> +enqueue_ldpc_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
> +               uint16_t total_enqueued_cbs)
> +{
> +       union acc100_dma_desc *desc = NULL;
> +       int ret;
> +       uint32_t in_offset, out_offset, out_length, mbuf_total_left,
> +               seg_total_left;
> +       struct rte_mbuf *input, *output_head, *output;
> +
> +       uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> +                       & q->sw_ring_wrap_mask);
> +       desc = q->ring_addr + desc_idx;
> +       acc100_fcw_le_fill(op, &desc->req.fcw_le, 1);
> +
> +       input = op->ldpc_enc.input.data;
> +       output_head = output = op->ldpc_enc.output.data;
> +       in_offset = op->ldpc_enc.input.offset;
> +       out_offset = op->ldpc_enc.output.offset;
> +       out_length = 0;
> +       mbuf_total_left = op->ldpc_enc.input.length;
> +       seg_total_left = rte_pktmbuf_data_len(op->ldpc_enc.input.data)
> +                       - in_offset;
> +
> +       ret = acc100_dma_desc_le_fill(op, &desc->req, &input, output,
> +                       &in_offset, &out_offset, &out_length, &mbuf_total_left,
> +                       &seg_total_left);
> +
> +       if (unlikely(ret < 0))
> +               return ret;
> +
> +       mbuf_append(output_head, output, out_length);
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +       rte_memdump(stderr, "FCW", &desc->req.fcw_le,
> +                       sizeof(desc->req.fcw_le) - 8);
> +       rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> +
> +       /* Check if any data left after processing one CB */
> +       if (mbuf_total_left != 0) {
> +               rte_bbdev_log(ERR,
> +                               "Some date still left after processing one CB:
> mbuf_total_left = %u",
> +                               mbuf_total_left);
> +               return -EINVAL;
> +       }
> +#endif
> +       /* One CB (one op) was successfully prepared to enqueue */
> +       return 1;
> +}
> +
> +/** Enqueue one decode operation for ACC100 device in CB mode */
> +static inline int
> +enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
> +               uint16_t total_enqueued_cbs, bool same_op)
> +{
> +       int ret;
> +
> +       union acc100_dma_desc *desc;
> +       uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> +                       & q->sw_ring_wrap_mask);
> +       desc = q->ring_addr + desc_idx;
> +       struct rte_mbuf *input, *h_output_head, *h_output;
> +       uint32_t in_offset, h_out_offset, h_out_length, mbuf_total_left;
> +       input = op->ldpc_dec.input.data;
> +       h_output_head = h_output = op->ldpc_dec.hard_output.data;
> +       in_offset = op->ldpc_dec.input.offset;
> +       h_out_offset = op->ldpc_dec.hard_output.offset;
> +       mbuf_total_left = op->ldpc_dec.input.length;
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +       if (unlikely(input == NULL)) {
> +               rte_bbdev_log(ERR, "Invalid mbuf pointer");
> +               return -EFAULT;
> +       }
> +#endif
> +       union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
> +
> +       if (same_op) {
> +               union acc100_dma_desc *prev_desc;
> +               desc_idx = ((q->sw_ring_head + total_enqueued_cbs - 1)
> +                               & q->sw_ring_wrap_mask);
> +               prev_desc = q->ring_addr + desc_idx;
> +               uint8_t *prev_ptr = (uint8_t *) prev_desc;
> +               uint8_t *new_ptr = (uint8_t *) desc;
> +               /* Copy first 4 words and BDESCs */
> +               rte_memcpy(new_ptr, prev_ptr, 16);
> +               rte_memcpy(new_ptr + 36, prev_ptr + 36, 40);
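> +               /* Offsets/lengths above are in bytes: 16B covers the
> +                * first 4 32-bit words; the BDESC region is assumed to
> +                * start at byte 36 and span 40B.
> +                */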
> +               desc->req.op_addr = prev_desc->req.op_addr;
> +               /* Copy FCW */
> +               rte_memcpy(new_ptr + ACC100_DESC_FCW_OFFSET,
> +                               prev_ptr + ACC100_DESC_FCW_OFFSET,
> +                               ACC100_FCW_LD_BLEN);
> +               acc100_dma_desc_ld_update(op, &desc->req, input, h_output,
> +                               &in_offset, &h_out_offset,
> +                               &h_out_length, harq_layout);
> +       } else {
> +               struct acc100_fcw_ld *fcw;
> +               uint32_t seg_total_left;
> +               fcw = &desc->req.fcw_ld;
> +               acc100_fcw_ld_fill(op, fcw, harq_layout);
> +
> +               /* Special handling when overusing mbuf */
> +               if (fcw->rm_e < MAX_E_MBUF)
> +                       seg_total_left = rte_pktmbuf_data_len(input)
> +                                       - in_offset;
> +               else
> +                       seg_total_left = fcw->rm_e;
> +
> +               ret = acc100_dma_desc_ld_fill(op, &desc->req, &input, h_output,
> +                               &in_offset, &h_out_offset,
> +                               &h_out_length, &mbuf_total_left,
> +                               &seg_total_left, fcw);
> +               if (unlikely(ret < 0))
> +                       return ret;
> +       }
> +
> +       /* Hard output */
> +       mbuf_append(h_output_head, h_output, h_out_length);
> +#ifndef ACC100_EXT_MEM
> +       if (op->ldpc_dec.harq_combined_output.length > 0) {
> +               /* Push the HARQ output into host memory */
> +               struct rte_mbuf *hq_output_head, *hq_output;
> +               hq_output_head = op->ldpc_dec.harq_combined_output.data;
> +               hq_output = op->ldpc_dec.harq_combined_output.data;
> +               mbuf_append(hq_output_head, hq_output,
> +                               op->ldpc_dec.harq_combined_output.length);
> +       }
> +#endif
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +       rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
> +                       sizeof(desc->req.fcw_ld) - 8);
> +       rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> +#endif
> +
> +       /* One CB (one op) was successfully prepared to enqueue */
> +       return 1;
> +}
> +
> +
> +/* Enqueue one decode operation for ACC100 device in TB mode */
> +static inline int
> +enqueue_ldpc_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
> +               uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
> +{
> +       union acc100_dma_desc *desc = NULL;
> +       int ret;
> +       uint8_t r, c;
> +       uint32_t in_offset, h_out_offset,
> +               h_out_length, mbuf_total_left, seg_total_left;
> +       struct rte_mbuf *input, *h_output_head, *h_output;
> +       uint16_t current_enqueued_cbs = 0;
> +
> +       uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> +                       & q->sw_ring_wrap_mask);
> +       desc = q->ring_addr + desc_idx;
> +       uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
> +       union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
> +       acc100_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout);
> +
> +       input = op->ldpc_dec.input.data;
> +       h_output_head = h_output = op->ldpc_dec.hard_output.data;
> +       in_offset = op->ldpc_dec.input.offset;
> +       h_out_offset = op->ldpc_dec.hard_output.offset;
> +       h_out_length = 0;
> +       mbuf_total_left = op->ldpc_dec.input.length;
> +       c = op->ldpc_dec.tb_params.c;
> +       r = op->ldpc_dec.tb_params.r;
> +
> +       while (mbuf_total_left > 0 && r < c) {
> +
> +               seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
> +
> +               /* Set up DMA descriptor */
> +               desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
> +                               & q->sw_ring_wrap_mask);
> +               desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
> +               desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
> +               ret = acc100_dma_desc_ld_fill(op, &desc->req, &input,
> +                               h_output, &in_offset, &h_out_offset,
> +                               &h_out_length,
> +                               &mbuf_total_left, &seg_total_left,
> +                               &desc->req.fcw_ld);
> +
> +               if (unlikely(ret < 0))
> +                       return ret;
> +
> +               /* Hard output */
> +               mbuf_append(h_output_head, h_output, h_out_length);
> +
> +               /* Set total number of CBs in TB */
> +               desc->req.cbs_in_tb = cbs_in_tb;
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +               rte_memdump(stderr, "FCW", &desc->req.fcw_td,
> +                               sizeof(desc->req.fcw_td) - 8);
> +               rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> +#endif
> +
> +               if (seg_total_left == 0) {
> +                       /* Go to the next mbuf */
> +                       input = input->next;
> +                       in_offset = 0;
> +                       h_output = h_output->next;
> +                       h_out_offset = 0;
> +               }
> +               total_enqueued_cbs++;
> +               current_enqueued_cbs++;
> +               r++;
> +       }
> +
> +       if (unlikely(desc == NULL))
> +               return current_enqueued_cbs;
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +       /* Check if any CBs left for processing */
> +       if (mbuf_total_left != 0) {
> +               rte_bbdev_log(ERR,
> +                               "Some date still left for processing: mbuf_total_left = %u",
> +                               mbuf_total_left);
> +               return -EINVAL;
> +       }
> +#endif
> +       /* Set SDone on last CB descriptor for TB mode */
> +       desc->req.sdone_enable = 1;
> +       desc->req.irq_enable = q->irq_enable;
> +
> +       return current_enqueued_cbs;
> +}
> +
> +
> +/* Calculates number of CBs in processed encoder TB based on 'r' and input
> + * length.
> + */
> +static inline uint8_t
> +get_num_cbs_in_tb_enc(struct rte_bbdev_op_turbo_enc *turbo_enc)
> +{
> +       uint8_t c, c_neg, r, crc24_bits = 0;
> +       uint16_t k, k_neg, k_pos;
> +       uint8_t cbs_in_tb = 0;
> +       int32_t length;
> +
> +       length = turbo_enc->input.length;
> +       r = turbo_enc->tb_params.r;
> +       c = turbo_enc->tb_params.c;
> +       c_neg = turbo_enc->tb_params.c_neg;
> +       k_neg = turbo_enc->tb_params.k_neg;
> +       k_pos = turbo_enc->tb_params.k_pos;
> +       crc24_bits = 0;
> +       if (check_bit(turbo_enc->op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH))
> +               crc24_bits = 24;
> +       while (length > 0 && r < c) {
> +               k = (r < c_neg) ? k_neg : k_pos;
> +               length -= (k - crc24_bits) >> 3;
> +               r++;
> +               cbs_in_tb++;
> +       }
> +
> +       return cbs_in_tb;
> +}
> +
> +/* Calculates number of CBs in processed decoder TB based on 'r' and input
> + * length.
> + */
> +static inline uint16_t
> +get_num_cbs_in_tb_dec(struct rte_bbdev_op_turbo_dec *turbo_dec)
> +{
> +       uint8_t c, c_neg, r = 0;
> +       uint16_t kw, k, k_neg, k_pos, cbs_in_tb = 0;
> +       int32_t length;
> +
> +       length = turbo_dec->input.length;
> +       r = turbo_dec->tb_params.r;
> +       c = turbo_dec->tb_params.c;
> +       c_neg = turbo_dec->tb_params.c_neg;
> +       k_neg = turbo_dec->tb_params.k_neg;
> +       k_pos = turbo_dec->tb_params.k_pos;
> +       while (length > 0 && r < c) {
> +               k = (r < c_neg) ? k_neg : k_pos;
> +               kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
> +               length -= kw;
> +               r++;
> +               cbs_in_tb++;
> +       }
> +
> +       return cbs_in_tb;
> +}
> +
> +/* Calculates number of CBs in processed LDPC decoder TB based on 'r' and input
> + * length.
> + */
> +static inline uint16_t
> +get_num_cbs_in_tb_ldpc_dec(struct rte_bbdev_op_ldpc_dec *ldpc_dec)
> +{
> +       uint16_t r, cbs_in_tb = 0;
> +       int32_t length = ldpc_dec->input.length;
> +       r = ldpc_dec->tb_params.r;
> +       while (length > 0 && r < ldpc_dec->tb_params.c) {
> +               length -=  (r < ldpc_dec->tb_params.cab) ?
> +                               ldpc_dec->tb_params.ea :
> +                               ldpc_dec->tb_params.eb;
> +               r++;
> +               cbs_in_tb++;
> +       }
> +       return cbs_in_tb;
> +}
> +
> +/* Check we can mux encode operations with common FCW */
> +static inline bool
> +check_mux(struct rte_bbdev_enc_op **ops, uint16_t num)
> +{
> +       uint16_t i;
> +       if (num == 1)
> +               return false;
> +       for (i = 1; i < num; ++i) {
> +               /* Only mux compatible code blocks */
> +               if (memcmp((uint8_t *)(&ops[i]->ldpc_enc) + ENC_OFFSET,
> +                               (uint8_t *)(&ops[0]->ldpc_enc) + ENC_OFFSET,
> +                               CMP_ENC_SIZE) != 0)
> +                       return false;
> +       }
> +       return true;
> +}
> +
> +/* Enqueue LDPC encode operations for ACC100 device in CB mode. */
> +static inline uint16_t
> +acc100_enqueue_ldpc_enc_cb(struct rte_bbdev_queue_data *q_data,
> +               struct rte_bbdev_enc_op **ops, uint16_t num)
> +{
> +       struct acc100_queue *q = q_data->queue_private;
> +       int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> +       uint16_t i = 0;
> +       union acc100_dma_desc *desc;
> +       int ret, desc_idx = 0;
> +       int16_t enq, left = num;
> +
> +       while (left > 0) {
> +               if (unlikely(avail - 1 < 0))
> +                       break;
> +               avail--;
> +               enq = RTE_MIN(left, MUX_5GDL_DESC);
> +               if (check_mux(&ops[i], enq)) {
> +                       ret = enqueue_ldpc_enc_n_op_cb(q, &ops[i],
> +                                       desc_idx, enq);
> +                       if (ret < 0)
> +                               break;
> +                       i += enq;
> +               } else {
> +                       ret = enqueue_ldpc_enc_one_op_cb(q, ops[i], desc_idx);
> +                       if (ret < 0)
> +                               break;
> +                       i++;
> +               }
> +               desc_idx++;
> +               left = num - i;
> +       }
> +
> +       if (unlikely(i == 0))
> +               return 0; /* Nothing to enqueue */
> +
> +       /* Set SDone in last CB in enqueued ops for CB mode */
> +       desc = q->ring_addr + ((q->sw_ring_head + desc_idx - 1)
> +                       & q->sw_ring_wrap_mask);
> +       desc->req.sdone_enable = 1;
> +       desc->req.irq_enable = q->irq_enable;
> +
> +       acc100_dma_enqueue(q, desc_idx, &q_data->queue_stats);
> +
> +       /* Update stats */
> +       q_data->queue_stats.enqueued_count += i;
> +       q_data->queue_stats.enqueue_err_count += num - i;
> +
> +       return i;
> +}
> +
> +/* Enqueue LDPC encode operations for ACC100 device. */
> +static uint16_t
> +acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
> +               struct rte_bbdev_enc_op **ops, uint16_t num)
> +{
> +       if (unlikely(num == 0))
> +               return 0;
> +       return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
> +}
> +
> +/* Check we can mux decode operations with common FCW */
> +static inline bool
> +cmp_ldpc_dec_op(struct rte_bbdev_dec_op **ops)
> +{
> +       /* Only mux compatible code blocks */
> +       return memcmp((uint8_t *)(&ops[0]->ldpc_dec) + DEC_OFFSET,
> +                       (uint8_t *)(&ops[1]->ldpc_dec) + DEC_OFFSET,
> +                       CMP_DEC_SIZE) == 0;
> +}
> +
> +
> +/* Enqueue decode operations for ACC100 device in TB mode */
> +static uint16_t
> +acc100_enqueue_ldpc_dec_tb(struct rte_bbdev_queue_data *q_data,
> +               struct rte_bbdev_dec_op **ops, uint16_t num)
> +{
> +       struct acc100_queue *q = q_data->queue_private;
> +       int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> +       uint16_t i, enqueued_cbs = 0;
> +       uint8_t cbs_in_tb;
> +       int ret;
> +
> +       for (i = 0; i < num; ++i) {
> +               cbs_in_tb = get_num_cbs_in_tb_ldpc_dec(&ops[i]->ldpc_dec);
> +               /* Check if there is space available for further processing */
> +               if (unlikely(avail - cbs_in_tb < 0))
> +                       break;
> +               avail -= cbs_in_tb;
> +
> +               ret = enqueue_ldpc_dec_one_op_tb(q, ops[i],
> +                               enqueued_cbs, cbs_in_tb);
> +               if (ret < 0)
> +                       break;
> +               enqueued_cbs += ret;
> +       }
> +
> +       acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
> +
> +       /* Update stats */
> +       q_data->queue_stats.enqueued_count += i;
> +       q_data->queue_stats.enqueue_err_count += num - i;
> +       return i;
> +}
> +
> +/* Enqueue decode operations for ACC100 device in CB mode */
> +static uint16_t
> +acc100_enqueue_ldpc_dec_cb(struct rte_bbdev_queue_data *q_data,
> +               struct rte_bbdev_dec_op **ops, uint16_t num)
> +{
> +       struct acc100_queue *q = q_data->queue_private;
> +       int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> +       uint16_t i;
> +       union acc100_dma_desc *desc;
> +       int ret;
> +       bool same_op = false;
> +       for (i = 0; i < num; ++i) {
> +               /* Check if there is space available for further processing */
> +               if (unlikely(avail - 1 < 0))
> +                       break;
> +               avail -= 1;
> +
> +               if (i > 0)
> +                       same_op = cmp_ldpc_dec_op(&ops[i-1]);
> +               rte_bbdev_log(INFO, "Op %d %d %d %d %d %d %d %d %d %d %d %d\n",
> +                       i, ops[i]->ldpc_dec.op_flags, ops[i]->ldpc_dec.rv_index,
> +                       ops[i]->ldpc_dec.iter_max, ops[i]->ldpc_dec.iter_count,
> +                       ops[i]->ldpc_dec.basegraph, ops[i]->ldpc_dec.z_c,
> +                       ops[i]->ldpc_dec.n_cb, ops[i]->ldpc_dec.q_m,
> +                       ops[i]->ldpc_dec.n_filler, ops[i]->ldpc_dec.cb_params.e,
> +                       same_op);
> +               ret = enqueue_ldpc_dec_one_op_cb(q, ops[i], i, same_op);
> +               if (ret < 0)
> +                       break;
> +       }
> +
> +       if (unlikely(i == 0))
> +               return 0; /* Nothing to enqueue */
> +
> +       /* Set SDone in last CB in enqueued ops for CB mode */
> +       desc = q->ring_addr + ((q->sw_ring_head + i - 1)
> +                       & q->sw_ring_wrap_mask);
> +
> +       desc->req.sdone_enable = 1;
> +       desc->req.irq_enable = q->irq_enable;
> +
> +       acc100_dma_enqueue(q, i, &q_data->queue_stats);
> +
> +       /* Update stats */
> +       q_data->queue_stats.enqueued_count += i;
> +       q_data->queue_stats.enqueue_err_count += num - i;
> +       return i;
> +}
> +
> +/* Enqueue decode operations for ACC100 device. */
> +static uint16_t
> +acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
> +               struct rte_bbdev_dec_op **ops, uint16_t num)
> +{
> +       struct acc100_queue *q = q_data->queue_private;
> +       int32_t aq_avail = q->aq_depth +
> +                       (q->aq_dequeued - q->aq_enqueued) / 128;
> +
> +       if (unlikely((aq_avail == 0) || (num == 0)))
> +               return 0;
> +
> +       if (ops[0]->ldpc_dec.code_block_mode == 0)
> +               return acc100_enqueue_ldpc_dec_tb(q_data, ops, num);
> +       else
> +               return acc100_enqueue_ldpc_dec_cb(q_data, ops, num);
> +}
> +
> +
> +/* Dequeue one encode operation from ACC100 device in CB mode */
> +static inline int
> +dequeue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
> +               uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
> +{
> +       union acc100_dma_desc *desc, atom_desc;
> +       union acc100_dma_rsp_desc rsp;
> +       struct rte_bbdev_enc_op *op;
> +       int i;
> +
> +       desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
> +                       & q->sw_ring_wrap_mask);
> +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +                       __ATOMIC_RELAXED);
> +
> +       /* Check fdone bit */
> +       if (!(atom_desc.rsp.val & ACC100_FDONE))
> +               return -1;
> +
> +       rsp.val = atom_desc.rsp.val;
> +       rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
> +
> +       /* Dequeue */
> +       op = desc->req.op_addr;
> +
> +       /* Clearing status, it will be set based on response */
> +       op->status = 0;
> +
> +       op->status |= ((rsp.input_err)
> +                       ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> +       op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +       op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +
> +       if (desc->req.last_desc_in_batch) {
> +               (*aq_dequeued)++;
> +               desc->req.last_desc_in_batch = 0;
> +       }
> +       desc->rsp.val = ACC100_DMA_DESC_TYPE;
> +       desc->rsp.add_info_0 = 0; /* Reserved bits */
> +       desc->rsp.add_info_1 = 0; /* Reserved bits */
> +
> +       /* Flag that the muxing causes loss of opaque data */
> +       op->opaque_data = (void *)-1;
> +       for (i = 0 ; i < desc->req.numCBs; i++)
> +               ref_op[i] = op;
> +
> +       /* Report the number of CBs in the dequeued (muxed) op */
> +       return desc->req.numCBs;
> +}
> +
> +/* Dequeue one encode operation from ACC100 device in TB mode */
> +static inline int
> +dequeue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
> +               uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
> +{
> +       union acc100_dma_desc *desc, *last_desc, atom_desc;
> +       union acc100_dma_rsp_desc rsp;
> +       struct rte_bbdev_enc_op *op;
> +       uint8_t i = 0;
> +       uint16_t current_dequeued_cbs = 0, cbs_in_tb;
> +
> +       desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
> +                       & q->sw_ring_wrap_mask);
> +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +                       __ATOMIC_RELAXED);
> +
> +       /* Check fdone bit */
> +       if (!(atom_desc.rsp.val & ACC100_FDONE))
> +               return -1;
> +
> +       /* Get number of CBs in dequeued TB */
> +       cbs_in_tb = desc->req.cbs_in_tb;
> +       /* Get last CB */
> +       last_desc = q->ring_addr + ((q->sw_ring_tail
> +                       + total_dequeued_cbs + cbs_in_tb - 1)
> +                       & q->sw_ring_wrap_mask);
> +       /* Check if last CB in TB is ready to dequeue (and thus
> +        * the whole TB) - checking sdone bit. If not return.
> +        */
> +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
> +                       __ATOMIC_RELAXED);
> +       if (!(atom_desc.rsp.val & ACC100_SDONE))
> +               return -1;
> +
> +       /* Dequeue */
> +       op = desc->req.op_addr;
> +
> +       /* Clearing status, it will be set based on response */
> +       op->status = 0;
> +
> +       while (i < cbs_in_tb) {
> +               desc = q->ring_addr + ((q->sw_ring_tail
> +                               + total_dequeued_cbs)
> +                               & q->sw_ring_wrap_mask);
> +               atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +                               __ATOMIC_RELAXED);
> +               rsp.val = atom_desc.rsp.val;
> +               rte_bbdev_log_debug("Resp. desc %p: %x", desc,
> +                               rsp.val);
> +
> +               op->status |= ((rsp.input_err)
> +                               ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> +               op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +               op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +
> +               if (desc->req.last_desc_in_batch) {
> +                       (*aq_dequeued)++;
> +                       desc->req.last_desc_in_batch = 0;
> +               }
> +               desc->rsp.val = ACC100_DMA_DESC_TYPE;
> +               desc->rsp.add_info_0 = 0;
> +               desc->rsp.add_info_1 = 0;
> +               total_dequeued_cbs++;
> +               current_dequeued_cbs++;
> +               i++;
> +       }
> +
> +       *ref_op = op;
> +
> +       return current_dequeued_cbs;
> +}
> +
> +/* Dequeue one decode operation from ACC100 device in CB mode */
> +static inline int
> +dequeue_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
> +               struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
> +               uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> +{
> +       union acc100_dma_desc *desc, atom_desc;
> +       union acc100_dma_rsp_desc rsp;
> +       struct rte_bbdev_dec_op *op;
> +
> +       desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> +                       & q->sw_ring_wrap_mask);
> +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +                       __ATOMIC_RELAXED);
> +
> +       /* Check fdone bit */
> +       if (!(atom_desc.rsp.val & ACC100_FDONE))
> +               return -1;
> +
> +       rsp.val = atom_desc.rsp.val;
> +       rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
> +
> +       /* Dequeue */
> +       op = desc->req.op_addr;
> +
> +       /* Clearing status, it will be set based on response */
> +       op->status = 0;
> +       op->status |= ((rsp.input_err)
> +                       ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> +       op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +       op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +       if (op->status != 0)
> +               q_data->queue_stats.dequeue_err_count++;
> +
> +       /* Report CRC status only if no other error was flagged */
> +       if (!op->status)
> +               op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
> +       op->turbo_dec.iter_count = (uint8_t) rsp.iter_cnt / 2;
> +       /* Check if this is the last desc in batch (Atomic Queue) */
> +       if (desc->req.last_desc_in_batch) {
> +               (*aq_dequeued)++;
> +               desc->req.last_desc_in_batch = 0;
> +       }
> +       desc->rsp.val = ACC100_DMA_DESC_TYPE;
> +       desc->rsp.add_info_0 = 0;
> +       desc->rsp.add_info_1 = 0;
> +       *ref_op = op;
> +
> +       /* One CB (op) was successfully dequeued */
> +       return 1;
> +}
> +
> +/* Dequeue one LDPC decode operation from ACC100 device in CB mode */
> +static inline int
> +dequeue_ldpc_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
> +               struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
> +               uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> +{
> +       union acc100_dma_desc *desc, atom_desc;
> +       union acc100_dma_rsp_desc rsp;
> +       struct rte_bbdev_dec_op *op;
> +
> +       desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> +                       & q->sw_ring_wrap_mask);
> +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +                       __ATOMIC_RELAXED);
> +
> +       /* Check fdone bit */
> +       if (!(atom_desc.rsp.val & ACC100_FDONE))
> +               return -1;
> +
> +       rsp.val = atom_desc.rsp.val;
> +
> +       /* Dequeue */
> +       op = desc->req.op_addr;
> +
> +       /* Clearing status, it will be set based on response */
> +       op->status = 0;
> +       op->status |= rsp.input_err << RTE_BBDEV_DATA_ERROR;
> +       op->status |= rsp.dma_err << RTE_BBDEV_DRV_ERROR;
> +       op->status |= rsp.fcw_err << RTE_BBDEV_DRV_ERROR;
> +       if (op->status != 0)
> +               q_data->queue_stats.dequeue_err_count++;
> +
> +       op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
> +       if (op->ldpc_dec.hard_output.length > 0 && !rsp.synd_ok)
> +               op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
> +       op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
> +
> +       /* Check if this is the last desc in batch (Atomic Queue) */
> +       if (desc->req.last_desc_in_batch) {
> +               (*aq_dequeued)++;
> +               desc->req.last_desc_in_batch = 0;
> +       }
> +
> +       desc->rsp.val = ACC100_DMA_DESC_TYPE;
> +       desc->rsp.add_info_0 = 0;
> +       desc->rsp.add_info_1 = 0;
> +
> +       *ref_op = op;
> +
> +       /* One CB (op) was successfully dequeued */
> +       return 1;
> +}
> +
> +/* Dequeue one decode operation from ACC100 device in TB mode. */
> +static inline int
> +dequeue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
> +               uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> +{
> +       union acc100_dma_desc *desc, *last_desc, atom_desc;
> +       union acc100_dma_rsp_desc rsp;
> +       struct rte_bbdev_dec_op *op;
> +       uint8_t cbs_in_tb = 1, cb_idx = 0;
> +
> +       desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> +                       & q->sw_ring_wrap_mask);
> +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +                       __ATOMIC_RELAXED);
> +
> +       /* Check fdone bit */
> +       if (!(atom_desc.rsp.val & ACC100_FDONE))
> +               return -1;
> +
> +       /* Dequeue */
> +       op = desc->req.op_addr;
> +
> +       /* Get number of CBs in dequeued TB */
> +       cbs_in_tb = desc->req.cbs_in_tb;
> +       /* Get last CB */
> +       last_desc = q->ring_addr + ((q->sw_ring_tail
> +                       + dequeued_cbs + cbs_in_tb - 1)
> +                       & q->sw_ring_wrap_mask);
> +       /* Check if last CB in TB is ready to dequeue (and thus
> +        * the whole TB) - checking sdone bit. If not return.
> +        */
> +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
> +                       __ATOMIC_RELAXED);
> +       if (!(atom_desc.rsp.val & ACC100_SDONE))
> +               return -1;
> +
> +       /* Clearing status, it will be set based on response */
> +       op->status = 0;
> +
> +       /* Read remaining CBs if any */
> +       while (cb_idx < cbs_in_tb) {
> +               desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> +                               & q->sw_ring_wrap_mask);
> +               atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +                               __ATOMIC_RELAXED);
> +               rsp.val = atom_desc.rsp.val;
> +               rte_bbdev_log_debug("Resp. desc %p: %x", desc,
> +                               rsp.val);
> +
> +               op->status |= ((rsp.input_err)
> +                               ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> +               op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +               op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +
> +               /* Report CRC status only if no other error was flagged */
> +               if (!op->status)
> +                       op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
> +               op->turbo_dec.iter_count = RTE_MAX((uint8_t) rsp.iter_cnt,
> +                               op->turbo_dec.iter_count);
> +
> +               /* Check if this is the last desc in batch (Atomic Queue) */
> +               if (desc->req.last_desc_in_batch) {
> +                       (*aq_dequeued)++;
> +                       desc->req.last_desc_in_batch = 0;
> +               }
> +               desc->rsp.val = ACC100_DMA_DESC_TYPE;
> +               desc->rsp.add_info_0 = 0;
> +               desc->rsp.add_info_1 = 0;
> +               dequeued_cbs++;
> +               cb_idx++;
> +       }
> +
> +       *ref_op = op;
> +
> +       return cb_idx;
> +}
> +
> +/* Dequeue LDPC encode operations from ACC100 device. */
> +static uint16_t
> +acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
> +               struct rte_bbdev_enc_op **ops, uint16_t num)
> +{
> +       struct acc100_queue *q = q_data->queue_private;
> +       uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
> +       uint32_t aq_dequeued = 0;
> +       uint16_t dequeue_num, i, dequeued_cbs = 0, dequeued_descs = 0;
> +       int ret;
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +       if (unlikely(ops == NULL || q == NULL))
> +               return 0;
> +#endif
> +
> +       dequeue_num = (avail < num) ? avail : num;
> +
> +       for (i = 0; i < dequeue_num; i++) {
> +               ret = dequeue_enc_one_op_cb(q, &ops[dequeued_cbs],
> +                               dequeued_descs, &aq_dequeued);
> +               if (ret < 0)
> +                       break;
> +               dequeued_cbs += ret;
> +               dequeued_descs++;
> +               if (dequeued_cbs >= num)
> +                       break;
> +       }
> +
> +       q->aq_dequeued += aq_dequeued;
> +       q->sw_ring_tail += dequeued_descs;
> +
> +       /* Update dequeue stats */
> +       q_data->queue_stats.dequeued_count += dequeued_cbs;
> +
> +       return dequeued_cbs;
> +}
> +
> +/* Dequeue decode operations from ACC100 device. */
> +static uint16_t
> +acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
> +               struct rte_bbdev_dec_op **ops, uint16_t num)
> +{
> +       struct acc100_queue *q = q_data->queue_private;
> +       uint16_t dequeue_num;
> +       uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
> +       uint32_t aq_dequeued = 0;
> +       uint16_t i;
> +       uint16_t dequeued_cbs = 0;
> +       struct rte_bbdev_dec_op *op;
> +       int ret;
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +       if (unlikely(ops == NULL || q == NULL))
> +               return 0;
> +#endif
> +
> +       dequeue_num = (avail < num) ? avail : num;
> +
> +       for (i = 0; i < dequeue_num; ++i) {
> +               op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> +                       & q->sw_ring_wrap_mask))->req.op_addr;
> +               if (op->ldpc_dec.code_block_mode == 0)
> +                       ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
> +                                       &aq_dequeued);
> +               else
> +                       ret = dequeue_ldpc_dec_one_op_cb(
> +                                       q_data, q, &ops[i], dequeued_cbs,
> +                                       &aq_dequeued);
> +
> +               if (ret < 0)
> +                       break;
> +               dequeued_cbs += ret;
> +       }
> +
> +       q->aq_dequeued += aq_dequeued;
> +       q->sw_ring_tail += dequeued_cbs;
> +
> +       /* Update dequeue stats */
> +       q_data->queue_stats.dequeued_count += i;
> +
> +       return i;
> +}
> +
>  /* Initialization Function */
>  static void
>  acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
> @@ -703,6 +2321,10 @@
>          struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
> 
>          dev->dev_ops = &acc100_bbdev_ops;
> +       dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
> +       dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
> +       dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
> +       dev->dequeue_ldpc_dec_ops = acc100_dequeue_ldpc_dec;
> 
>          ((struct acc100_device *) dev->data->dev_private)->pf_device =
>                          !strcmp(drv->driver.name,
> @@ -815,4 +2437,3 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
>  RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
>  RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
>  RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
> -
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
> index 0e2b79c..78686c1 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> @@ -88,6 +88,8 @@
>  #define TMPL_PRI_3      0x0f0e0d0c
>  #define QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
>  #define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
> +#define ACC100_FDONE    0x80000000
> +#define ACC100_SDONE    0x40000000
> 
>  #define ACC100_NUM_TMPL  32
>  #define VF_OFFSET_QOS 16 /* offset in Memory Space specific to QoS Mon */
> @@ -398,6 +400,7 @@ struct __rte_packed acc100_dma_req_desc {
>  union acc100_dma_desc {
>          struct acc100_dma_req_desc req;
>          union acc100_dma_rsp_desc rsp;
> +       uint64_t atom_hdr;
>  };
> 
> 
> --
> 1.8.3.1

^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions
  2020-08-20 14:52     ` Chautru, Nicolas
@ 2020-08-20 14:57       ` Dave Burley
  2020-08-20 21:05         ` Chautru, Nicolas
  0 siblings, 1 reply; 213+ messages in thread
From: Dave Burley @ 2020-08-20 14:57 UTC (permalink / raw)
  To: Chautru, Nicolas, dev; +Cc: Richardson, Bruce

Hi Nic

Thank you - it would be useful to have further documentation for clarification, as the data format isn't explicitly documented in BBDEV.
Best Regards

Dave


From: Chautru, Nicolas <nicolas.chautru@intel.com>
Sent: 20 August 2020 15:52
To: Dave Burley <dave.burley@accelercomm.com>; dev@dpdk.org <dev@dpdk.org>
Cc: Richardson, Bruce <bruce.richardson@intel.com>
Subject: RE: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions 
 
Hi Dave, 
This assumes 6-bit LLR compression packing (i.e. the first 2 MSBs are dropped), similar to HARQ compression.
Let me know if anything is unclear; I can clarify further in the documentation if it is not explicit enough.
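
For illustration only, here is a rough sketch of what that packing could look like (assuming 8-bit input LLRs; pack_llrs_6bit is a hypothetical helper written for this email, not a function from the PMD):

    #include <stdint.h>

    /* Pack four 8-bit LLRs into three bytes by dropping the 2 MSBs of
     * each value and concatenating the remaining 6-bit fields.
     */
    static void
    pack_llrs_6bit(const int8_t *in, uint8_t *out, unsigned int n)
    {
            unsigned int i;

            for (i = 0; i + 3 < n; i += 4) {
                    uint8_t a = in[i] & 0x3F, b = in[i + 1] & 0x3F;
                    uint8_t c = in[i + 2] & 0x3F, d = in[i + 3] & 0x3F;

                    *out++ = (uint8_t)((a << 2) | (b >> 4)); /* a[5:0] b[5:4] */
                    *out++ = (uint8_t)((b << 4) | (c >> 2)); /* b[3:0] c[5:2] */
                    *out++ = (uint8_t)((c << 6) | d);        /* c[1:0] d[5:0] */
            }
    }

This is consistent with the 3/4 buffer sizing the PMD applies when RTE_BBDEV_LDPC_LLR_COMPRESSION is set, i.e. (length * 3 + 3) / 4.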
Thanks
Nic

> -----Original Message-----
> From: Dave Burley <dave.burley@accelercomm.com>
> Sent: Thursday, August 20, 2020 7:39 AM
> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org
> Cc: Richardson, Bruce <bruce.richardson@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC
> processing functions
> 
> Hi Nic,
> 
> As you've now specified the use of RTE_BBDEV_LDPC_LLR_COMPRESSION for
> this PMD, please could you confirm what the packed format of the LLRs in
> memory looks like?
> 
> Best Regards
> 
> Dave Burley
> 
> 
> From: dev <dev-bounces@dpdk.org> on behalf of Nicolas Chautru
> <nicolas.chautru@intel.com>
> Sent: 19 August 2020 01:25
> To: dev@dpdk.org <dev@dpdk.org>; akhil.goyal@nxp.com
> <akhil.goyal@nxp.com>
> Cc: bruce.richardson@intel.com <bruce.richardson@intel.com>; Nicolas
> Chautru <nicolas.chautru@intel.com>
> Subject: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions
> 
> Adding LDPC decode and encode processing operations
> 
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
>  drivers/baseband/acc100/rte_acc100_pmd.c | 1625 +++++++++++++++++++++++++++++-
>  drivers/baseband/acc100/rte_acc100_pmd.h |    3 +
>  2 files changed, 1626 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
> index 7a21c57..5f32813 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> @@ -15,6 +15,9 @@
>  #include <rte_hexdump.h>
>  #include <rte_pci.h>
>  #include <rte_bus_pci.h>
> +#ifdef RTE_BBDEV_OFFLOAD_COST
> +#include <rte_cycles.h>
> +#endif
> 
>  #include <rte_bbdev.h>
>  #include <rte_bbdev_pmd.h>
> @@ -449,7 +452,6 @@
>          return 0;
>  }
> 
> -
>  /**
>   * Report a ACC100 queue index which is free
>   * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
> @@ -634,6 +636,46 @@
>          struct acc100_device *d = dev->data->dev_private;
> 
>          static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
> +               {
> +                       .type   = RTE_BBDEV_OP_LDPC_ENC,
> +                       .cap.ldpc_enc = {
> +                               .capability_flags =
> +                                       RTE_BBDEV_LDPC_RATE_MATCH |
> +                                       RTE_BBDEV_LDPC_CRC_24B_ATTACH |
> +                                       RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
> +                               .num_buffers_src =
> +                                               RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> +                               .num_buffers_dst =
> +                                               RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> +                       }
> +               },
> +               {
> +                       .type   = RTE_BBDEV_OP_LDPC_DEC,
> +                       .cap.ldpc_dec = {
> +                       .capability_flags =
> +                               RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK |
> +                               RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP |
> +                               RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
> +                               RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
> +#ifdef ACC100_EXT_MEM
> +                               RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
> +                               RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
> +#endif
> +                               RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE |
> +                               RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS |
> +                               RTE_BBDEV_LDPC_DECODE_BYPASS |
> +                               RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
> +                               RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
> +                               RTE_BBDEV_LDPC_LLR_COMPRESSION,
> +                       .llr_size = 8,
> +                       .llr_decimals = 1,
> +                       .num_buffers_src =
> +                                       RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> +                       .num_buffers_hard_out =
> +                                       RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> +                       .num_buffers_soft_out = 0,
> +                       }
> +               },
>                  RTE_BBDEV_END_OF_CAPABILITIES_LIST()
>          };
> 
> @@ -669,9 +711,14 @@
>          dev_info->cpu_flag_reqs = NULL;
>          dev_info->min_alignment = 64;
>          dev_info->capabilities = bbdev_capabilities;
> +#ifdef ACC100_EXT_MEM
>          dev_info->harq_buffer_size = d->ddr_size;
> +#else
> +       dev_info->harq_buffer_size = 0;
> +#endif
>  }
> 
> +
>  static const struct rte_bbdev_ops acc100_bbdev_ops = {
>          .setup_queues = acc100_setup_queues,
>          .close = acc100_dev_close,
> @@ -696,6 +743,1577 @@
>          {.device_id = 0},
>  };
> 
> +/* Read flag value 0/1 from bitmap */
> +static inline bool
> +check_bit(uint32_t bitmap, uint32_t bitmask)
> +{
> +       return bitmap & bitmask;
> +}
> +
> +static inline char *
> +mbuf_append(struct rte_mbuf *m_head, struct rte_mbuf *m, uint16_t len)
> +{
> +       if (unlikely(len > rte_pktmbuf_tailroom(m)))
> +               return NULL;
> +
> +       char *tail = (char *)m->buf_addr + m->data_off + m->data_len;
> +       m->data_len = (uint16_t)(m->data_len + len);
> +       m_head->pkt_len  = (m_head->pkt_len + len);
> +       return tail;
> +}
> +
> +/* Compute value of k0.
> + * Based on 3GPP 38.212 Table 5.4.2.1-2
> + * Starting position of different redundancy versions, k0
> + */
> +static inline uint16_t
> +get_k0(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv_index)
> +{
> +       if (rv_index == 0)
> +               return 0;
> +       uint16_t n = (bg == 1 ? N_ZC_1 : N_ZC_2) * z_c;
> +       if (n_cb == n) {
> +               if (rv_index == 1)
> +                       return (bg == 1 ? K0_1_1 : K0_1_2) * z_c;
> +               else if (rv_index == 2)
> +                       return (bg == 1 ? K0_2_1 : K0_2_2) * z_c;
> +               else
> +                       return (bg == 1 ? K0_3_1 : K0_3_2) * z_c;
> +       }
> +       /* LBRM case - includes a division by N */
> +       if (rv_index == 1)
> +               return (((bg == 1 ? K0_1_1 : K0_1_2) * n_cb)
> +                               / n) * z_c;
> +       else if (rv_index == 2)
> +               return (((bg == 1 ? K0_2_1 : K0_2_2) * n_cb)
> +                               / n) * z_c;
> +       else
> +               return (((bg == 1 ? K0_3_1 : K0_3_2) * n_cb)
> +                               / n) * z_c;
> +}
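> +
> +/* Illustrative example, not part of the driver logic: for BG1 with a
> + * full-size circular buffer (n_cb == 66 * z_c), rv_index 2 starts at
> + * k0 = K0_2_1 * z_c, where K0_2_1 is assumed to be the constant from
> + * 3GPP 38.212 Table 5.4.2.1-2 (33 for BG1).
> + */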
> +
> +/* Fill in a frame control word for LDPC encoding. */
> +static inline void
> +acc100_fcw_le_fill(const struct rte_bbdev_enc_op *op,
> +               struct acc100_fcw_le *fcw, int num_cb)
> +{
> +       fcw->qm = op->ldpc_enc.q_m;
> +       fcw->nfiller = op->ldpc_enc.n_filler;
> +       fcw->BG = (op->ldpc_enc.basegraph - 1);
> +       fcw->Zc = op->ldpc_enc.z_c;
> +       fcw->ncb = op->ldpc_enc.n_cb;
> +       fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_enc.basegraph,
> +                       op->ldpc_enc.rv_index);
> +       fcw->rm_e = op->ldpc_enc.cb_params.e;
> +       fcw->crc_select = check_bit(op->ldpc_enc.op_flags,
> +                       RTE_BBDEV_LDPC_CRC_24B_ATTACH);
> +       fcw->bypass_intlv = check_bit(op->ldpc_enc.op_flags,
> +                       RTE_BBDEV_LDPC_INTERLEAVER_BYPASS);
> +       fcw->mcb_count = num_cb;
> +}
> +
> +/* Fill in a frame control word for LDPC decoding. */
> +static inline void
> +acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
> +               union acc100_harq_layout_data *harq_layout)
> +{
> +       uint16_t harq_out_length, harq_in_length, ncb_p, k0_p, parity_offset;
> +       uint16_t harq_index;
> +       uint32_t l;
> +       bool harq_prun = false;
> +
> +       fcw->qm = op->ldpc_dec.q_m;
> +       fcw->nfiller = op->ldpc_dec.n_filler;
> +       fcw->BG = (op->ldpc_dec.basegraph - 1);
> +       fcw->Zc = op->ldpc_dec.z_c;
> +       fcw->ncb = op->ldpc_dec.n_cb;
> +       fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_dec.basegraph,
> +                       op->ldpc_dec.rv_index);
> +       if (op->ldpc_dec.code_block_mode == 1)
> +               fcw->rm_e = op->ldpc_dec.cb_params.e;
> +       else
> +               fcw->rm_e = (op->ldpc_dec.tb_params.r <
> +                               op->ldpc_dec.tb_params.cab) ?
> +                                               op->ldpc_dec.tb_params.ea :
> +                                               op->ldpc_dec.tb_params.eb;
> +
> +       fcw->hcin_en = check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE);
> +       fcw->hcout_en = check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE);
> +       fcw->crc_select = check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK);
> +       fcw->bypass_dec = check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_DECODE_BYPASS);
> +       fcw->bypass_intlv = check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS);
> +       if (op->ldpc_dec.q_m == 1) {
> +               fcw->bypass_intlv = 1;
> +               fcw->qm = 2;
> +       }
> +       fcw->hcin_decomp_mode = check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> +       fcw->hcout_comp_mode = check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> +       fcw->llr_pack_mode = check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_LLR_COMPRESSION);
> +       harq_index = op->ldpc_dec.harq_combined_output.offset /
> +                       ACC100_HARQ_OFFSET;
> +#ifdef ACC100_EXT_MEM
> +       /* Limit cases when HARQ pruning is valid */
> +       harq_prun = ((op->ldpc_dec.harq_combined_output.offset %
> +                       ACC100_HARQ_OFFSET) == 0) &&
> +                       (op->ldpc_dec.harq_combined_output.offset <= UINT16_MAX
> +                       * ACC100_HARQ_OFFSET);
> +#endif
> +       if (fcw->hcin_en > 0) {
> +               harq_in_length = op->ldpc_dec.harq_combined_input.length;
> +               if (fcw->hcin_decomp_mode > 0)
> +                       harq_in_length = harq_in_length * 8 / 6;
> +               harq_in_length = RTE_ALIGN(harq_in_length, 64);
> +               if ((harq_layout[harq_index].offset > 0) && harq_prun) {
> +                       rte_bbdev_log_debug("HARQ IN offset unexpected for now\n");
> +                       fcw->hcin_size0 = harq_layout[harq_index].size0;
> +                       fcw->hcin_offset = harq_layout[harq_index].offset;
> +                       fcw->hcin_size1 = harq_in_length -
> +                                       harq_layout[harq_index].offset;
> +               } else {
> +                       fcw->hcin_size0 = harq_in_length;
> +                       fcw->hcin_offset = 0;
> +                       fcw->hcin_size1 = 0;
> +               }
> +       } else {
> +               fcw->hcin_size0 = 0;
> +               fcw->hcin_offset = 0;
> +               fcw->hcin_size1 = 0;
> +       }
> +
> +       fcw->itmax = op->ldpc_dec.iter_max;
> +       fcw->itstop = check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE);
> +       fcw->synd_precoder = fcw->itstop;
> +       /*
> +        * These are all implicitly set
> +        * fcw->synd_post = 0;
> +        * fcw->so_en = 0;
> +        * fcw->so_bypass_rm = 0;
> +        * fcw->so_bypass_intlv = 0;
> +        * fcw->dec_convllr = 0;
> +        * fcw->hcout_convllr = 0;
> +        * fcw->hcout_size1 = 0;
> +        * fcw->so_it = 0;
> +        * fcw->hcout_offset = 0;
> +        * fcw->negstop_th = 0;
> +        * fcw->negstop_it = 0;
> +        * fcw->negstop_en = 0;
> +        * fcw->gain_i = 1;
> +        * fcw->gain_h = 1;
> +        */
> +       if (fcw->hcout_en > 0) {
> +               parity_offset = (op->ldpc_dec.basegraph == 1 ? 20 : 8)
> +                       * op->ldpc_dec.z_c - op->ldpc_dec.n_filler;
> +               k0_p = (fcw->k0 > parity_offset) ?
> +                               fcw->k0 - op->ldpc_dec.n_filler : fcw->k0;
> +               ncb_p = fcw->ncb - op->ldpc_dec.n_filler;
> +               l = k0_p + fcw->rm_e;
> +               harq_out_length = (uint16_t) fcw->hcin_size0;
> +               harq_out_length = RTE_MIN(RTE_MAX(harq_out_length, l), ncb_p);
> +               harq_out_length = (harq_out_length + 0x3F) & 0xFFC0;
> +               if ((k0_p > fcw->hcin_size0 + ACC100_HARQ_OFFSET_THRESHOLD) &&
> +                               harq_prun) {
> +                       fcw->hcout_size0 = (uint16_t) fcw->hcin_size0;
> +                       fcw->hcout_offset = k0_p & 0xFFC0;
> +                       fcw->hcout_size1 = harq_out_length - fcw->hcout_offset;
> +               } else {
> +                       fcw->hcout_size0 = harq_out_length;
> +                       fcw->hcout_size1 = 0;
> +                       fcw->hcout_offset = 0;
> +               }
> +               harq_layout[harq_index].offset = fcw->hcout_offset;
> +               harq_layout[harq_index].size0 = fcw->hcout_size0;
> +       } else {
> +               fcw->hcout_size0 = 0;
> +               fcw->hcout_size1 = 0;
> +               fcw->hcout_offset = 0;
> +       }
> +}
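> +
> +/* Worked example for the HARQ input sizing above (illustrative only):
> + * with RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION set and a
> + * harq_combined_input.length of 6144, harq_in_length becomes
> + * 6144 * 8 / 6 = 8192, which is already 64-byte aligned so the
> + * RTE_ALIGN leaves it unchanged.
> + */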
> +
> +/**
> + * Fills descriptor with data pointers of one block type.
> + *
> + * @param desc
> + *   Pointer to DMA descriptor.
> + * @param input
> + *   Pointer to pointer to input data which will be encoded. It can be changed
> + *   and points to next segment in scatter-gather case.
> + * @param offset
> + *   Input offset in rte_mbuf structure. It is used for calculating the point
> + *   where data is starting.
> + * @param cb_len
> + *   Length of currently processed Code Block
> + * @param seg_total_left
> + *   It indicates how many bytes still left in segment (mbuf) for further
> + *   processing.
> + * @param op_flags
> + *   Store information about device capabilities
> + * @param next_triplet
> + *   Index for ACC100 DMA Descriptor triplet
> + *
> + * @return
> + *   Returns index of next triplet on success, other value if lengths of
> + *   pkt and processed cb do not match.
> + *
> + */
> +static inline int
> +acc100_dma_fill_blk_type_in(struct acc100_dma_req_desc *desc,
> +               struct rte_mbuf **input, uint32_t *offset, uint32_t cb_len,
> +               uint32_t *seg_total_left, int next_triplet)
> +{
> +       uint32_t part_len;
> +       struct rte_mbuf *m = *input;
> +
> +       part_len = (*seg_total_left < cb_len) ? *seg_total_left : cb_len;
> +       cb_len -= part_len;
> +       *seg_total_left -= part_len;
> +
> +       desc->data_ptrs[next_triplet].address =
> +                       rte_pktmbuf_iova_offset(m, *offset);
> +       desc->data_ptrs[next_triplet].blen = part_len;
> +       desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
> +       desc->data_ptrs[next_triplet].last = 0;
> +       desc->data_ptrs[next_triplet].dma_ext = 0;
> +       *offset += part_len;
> +       next_triplet++;
> +
> +       while (cb_len > 0) {
> +               if (next_triplet < ACC100_DMA_MAX_NUM_POINTERS &&
> +                               m->next != NULL) {
> +
> +                       m = m->next;
> +                       *seg_total_left = rte_pktmbuf_data_len(m);
> +                       part_len = (*seg_total_left < cb_len) ?
> +                                       *seg_total_left :
> +                                       cb_len;
> +                       desc->data_ptrs[next_triplet].address =
> +                                       rte_pktmbuf_mtophys(m);
> +                       desc->data_ptrs[next_triplet].blen = part_len;
> +                       desc->data_ptrs[next_triplet].blkid =
> +                                       ACC100_DMA_BLKID_IN;
> +                       desc->data_ptrs[next_triplet].last = 0;
> +                       desc->data_ptrs[next_triplet].dma_ext = 0;
> +                       cb_len -= part_len;
> +                       *seg_total_left -= part_len;
> +                       /* Initializing offset for next segment (mbuf) */
> +                       *offset = part_len;
> +                       next_triplet++;
> +               } else {
> +                       rte_bbdev_log(ERR,
> +                               "Some data still left for processing: "
> +                               "data_left: %u, next_triplet: %u, next_mbuf: %p",
> +                               cb_len, next_triplet, m->next);
> +                       return -EINVAL;
> +               }
> +       }
> +       /* Storing new mbuf as it could be changed in scatter-gather case*/
> +       *input = m;
> +
> +       return next_triplet;
> +}
> +
> +/* Fills descriptor with data pointers of one block type.
> + * Returns index of next triplet on success, other value if lengths of
> + * output data and processed mbuf do not match.
> + */
> +static inline int
> +acc100_dma_fill_blk_type_out(struct acc100_dma_req_desc *desc,
> +               struct rte_mbuf *output, uint32_t out_offset,
> +               uint32_t output_len, int next_triplet, int blk_id)
> +{
> +       desc->data_ptrs[next_triplet].address =
> +                       rte_pktmbuf_iova_offset(output, out_offset);
> +       desc->data_ptrs[next_triplet].blen = output_len;
> +       desc->data_ptrs[next_triplet].blkid = blk_id;
> +       desc->data_ptrs[next_triplet].last = 0;
> +       desc->data_ptrs[next_triplet].dma_ext = 0;
> +       next_triplet++;
> +
> +       return next_triplet;
> +}
> +
> +static inline int
> +acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
> +               struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
> +               struct rte_mbuf *output, uint32_t *in_offset,
> +               uint32_t *out_offset, uint32_t *out_length,
> +               uint32_t *mbuf_total_left, uint32_t *seg_total_left)
> +{
> +       int next_triplet = 1; /* FCW already done */
> +       uint16_t K, in_length_in_bits, in_length_in_bytes;
> +       struct rte_bbdev_op_ldpc_enc *enc = &op->ldpc_enc;
> +
> +       desc->word0 = ACC100_DMA_DESC_TYPE;
> +       desc->word1 = 0; /**< Timestamp could be disabled */
> +       desc->word2 = 0;
> +       desc->word3 = 0;
> +       desc->numCBs = 1;
> +
> +       K = (enc->basegraph == 1 ? 22 : 10) * enc->z_c;
> +       in_length_in_bits = K - enc->n_filler;
> +       if ((enc->op_flags & RTE_BBDEV_LDPC_CRC_24A_ATTACH) ||
> +                       (enc->op_flags & RTE_BBDEV_LDPC_CRC_24B_ATTACH))
> +               in_length_in_bits -= 24;
> +       in_length_in_bytes = in_length_in_bits >> 3;
> +
> +       if (unlikely((*mbuf_total_left == 0) ||
> +                       (*mbuf_total_left < in_length_in_bytes))) {
> +               rte_bbdev_log(ERR,
> +                               "Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
> +                               *mbuf_total_left, in_length_in_bytes);
> +               return -1;
> +       }
> +
> +       next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
> +                       in_length_in_bytes,
> +                       seg_total_left, next_triplet);
> +       if (unlikely(next_triplet < 0)) {
> +               rte_bbdev_log(ERR,
> +                               "Mismatch between data to process and mbuf data length in bbdev_op: %p",
> +                               op);
> +               return -1;
> +       }
> +       desc->data_ptrs[next_triplet - 1].last = 1;
> +       desc->m2dlen = next_triplet;
> +       *mbuf_total_left -= in_length_in_bytes;
> +
> +       /* Set output length */
> +       /* Integer round up division by 8 */
> +       *out_length = (enc->cb_params.e + 7) >> 3;
> +
> +       next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
> +                       *out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
> +       if (unlikely(next_triplet < 0)) {
> +               rte_bbdev_log(ERR,
> +                               "Mismatch between data to process and mbuf data length in bbdev_op: %p",
> +                               op);
> +               return -1;
> +       }
> +       op->ldpc_enc.output.length += *out_length;
> +       *out_offset += *out_length;
> +       desc->data_ptrs[next_triplet - 1].last = 1;
> +       desc->data_ptrs[next_triplet - 1].dma_ext = 0;
> +       desc->d2mlen = next_triplet - desc->m2dlen;
> +
> +       desc->op_addr = op;
> +
> +       return 0;
> +}
> +
> +static inline int
> +acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
> +               struct acc100_dma_req_desc *desc,
> +               struct rte_mbuf **input, struct rte_mbuf *h_output,
> +               uint32_t *in_offset, uint32_t *h_out_offset,
> +               uint32_t *h_out_length, uint32_t *mbuf_total_left,
> +               uint32_t *seg_total_left,
> +               struct acc100_fcw_ld *fcw)
> +{
> +       struct rte_bbdev_op_ldpc_dec *dec = &op->ldpc_dec;
> +       int next_triplet = 1; /* FCW already done */
> +       uint32_t input_length;
> +       uint16_t output_length, crc24_overlap = 0;
> +       uint16_t sys_cols, K, h_p_size, h_np_size;
> +       bool h_comp = check_bit(dec->op_flags,
> +                       RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> +
> +       desc->word0 = ACC100_DMA_DESC_TYPE;
> +       desc->word1 = 0; /**< Timestamp could be disabled */
> +       desc->word2 = 0;
> +       desc->word3 = 0;
> +       desc->numCBs = 1;
> +
> +       if (check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP))
> +               crc24_overlap = 24;
> +
> +       /* Compute some LDPC BG lengths */
> +       input_length = dec->cb_params.e;
> +       if (check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_LLR_COMPRESSION))
> +               input_length = (input_length * 3 + 3) / 4;
> +       sys_cols = (dec->basegraph == 1) ? 22 : 10;
> +       K = sys_cols * dec->z_c;
> +       output_length = K - dec->n_filler - crc24_overlap;
> +
> +       if (unlikely((*mbuf_total_left == 0) ||
> +                       (*mbuf_total_left < input_length))) {
> +               rte_bbdev_log(ERR,
> +                               "Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
> +                               *mbuf_total_left, input_length);
> +               return -1;
> +       }
> +
> +       next_triplet = acc100_dma_fill_blk_type_in(desc, input,
> +                       in_offset, input_length,
> +                       seg_total_left, next_triplet);
> +
> +       if (unlikely(next_triplet < 0)) {
> +               rte_bbdev_log(ERR,
> +                               "Mismatch between data to process and mbuf data length in bbdev_op: %p",
> +                               op);
> +               return -1;
> +       }
> +
> +       if (check_bit(op->ldpc_dec.op_flags,
> +                               RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
> +               h_p_size = fcw->hcin_size0 + fcw->hcin_size1;
> +               if (h_comp)
> +                       h_p_size = (h_p_size * 3 + 3) / 4;
> +               desc->data_ptrs[next_triplet].address =
> +                               dec->harq_combined_input.offset;
> +               desc->data_ptrs[next_triplet].blen = h_p_size;
> +               desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN_HARQ;
> +               desc->data_ptrs[next_triplet].dma_ext = 1;
> +#ifndef ACC100_EXT_MEM
> +               acc100_dma_fill_blk_type_out(
> +                               desc,
> +                               op->ldpc_dec.harq_combined_input.data,
> +                               op->ldpc_dec.harq_combined_input.offset,
> +                               h_p_size,
> +                               next_triplet,
> +                               ACC100_DMA_BLKID_IN_HARQ);
> +#endif
> +               next_triplet++;
> +       }
> +
> +       desc->data_ptrs[next_triplet - 1].last = 1;
> +       desc->m2dlen = next_triplet;
> +       *mbuf_total_left -= input_length;
> +
> +       next_triplet = acc100_dma_fill_blk_type_out(desc, h_output,
> +                       *h_out_offset, output_length >> 3, next_triplet,
> +                       ACC100_DMA_BLKID_OUT_HARD);
> +       if (unlikely(next_triplet < 0)) {
> +               rte_bbdev_log(ERR,
> +                               "Mismatch between data to process and mbuf data length in bbdev_op: %p",
> +                               op);
> +               return -1;
> +       }
> +
> +       if (check_bit(op->ldpc_dec.op_flags,
> +                               RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
> +               /* Pruned size of the HARQ */
> +               h_p_size = fcw->hcout_size0 + fcw->hcout_size1;
> +               /* Non-Pruned size of the HARQ */
> +               h_np_size = fcw->hcout_offset > 0 ?
> +                               fcw->hcout_offset + fcw->hcout_size1 :
> +                               h_p_size;
> +               if (h_comp) {
> +                       h_np_size = (h_np_size * 3 + 3) / 4;
> +                       h_p_size = (h_p_size * 3 + 3) / 4;
> +               }
> +               dec->harq_combined_output.length = h_np_size;
> +               desc->data_ptrs[next_triplet].address =
> +                               dec->harq_combined_output.offset;
> +               desc->data_ptrs[next_triplet].blen = h_p_size;
> +               desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_OUT_HARQ;
> +               desc->data_ptrs[next_triplet].dma_ext = 1;
> +#ifndef ACC100_EXT_MEM
> +               acc100_dma_fill_blk_type_out(
> +                               desc,
> +                               dec->harq_combined_output.data,
> +                               dec->harq_combined_output.offset,
> +                               h_p_size,
> +                               next_triplet,
> +                               ACC100_DMA_BLKID_OUT_HARQ);
> +#endif
> +               next_triplet++;
> +       }
> +
> +       *h_out_length = output_length >> 3;
> +       dec->hard_output.length += *h_out_length;
> +       *h_out_offset += *h_out_length;
> +       desc->data_ptrs[next_triplet - 1].last = 1;
> +       desc->d2mlen = next_triplet - desc->m2dlen;
> +
> +       desc->op_addr = op;
> +
> +       return 0;
> +}
> +
> +static inline void
> +acc100_dma_desc_ld_update(struct rte_bbdev_dec_op *op,
> +               struct acc100_dma_req_desc *desc,
> +               struct rte_mbuf *input, struct rte_mbuf *h_output,
> +               uint32_t *in_offset, uint32_t *h_out_offset,
> +               uint32_t *h_out_length,
> +               union acc100_harq_layout_data *harq_layout)
> +{
> +       int next_triplet = 1; /* FCW already done */
> +       desc->data_ptrs[next_triplet].address =
> +                       rte_pktmbuf_iova_offset(input, *in_offset);
> +       next_triplet++;
> +
> +       if (check_bit(op->ldpc_dec.op_flags,
> +                               RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
> +               struct rte_bbdev_op_data hi = op->ldpc_dec.harq_combined_input;
> +               desc->data_ptrs[next_triplet].address = hi.offset;
> +#ifndef ACC100_EXT_MEM
> +               desc->data_ptrs[next_triplet].address =
> +                               rte_pktmbuf_iova_offset(hi.data, hi.offset);
> +#endif
> +               next_triplet++;
> +       }
> +
> +       desc->data_ptrs[next_triplet].address =
> +                       rte_pktmbuf_iova_offset(h_output, *h_out_offset);
> +       *h_out_length = desc->data_ptrs[next_triplet].blen;
> +       next_triplet++;
> +
> +       if (check_bit(op->ldpc_dec.op_flags,
> +                               RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
> +               desc->data_ptrs[next_triplet].address =
> +                               op->ldpc_dec.harq_combined_output.offset;
> +               /* Adjust based on previous operation */
> +               struct rte_bbdev_dec_op *prev_op = desc->op_addr;
> +               op->ldpc_dec.harq_combined_output.length =
> +                               prev_op->ldpc_dec.harq_combined_output.length;
> +               int16_t hq_idx = op->ldpc_dec.harq_combined_output.offset /
> +                               ACC100_HARQ_OFFSET;
> +               int16_t prev_hq_idx =
> +                               prev_op->ldpc_dec.harq_combined_output.offset
> +                               / ACC100_HARQ_OFFSET;
> +               harq_layout[hq_idx].val = harq_layout[prev_hq_idx].val;
> +#ifndef ACC100_EXT_MEM
> +               struct rte_bbdev_op_data ho =
> +                               op->ldpc_dec.harq_combined_output;
> +               desc->data_ptrs[next_triplet].address =
> +                               rte_pktmbuf_iova_offset(ho.data, ho.offset);
> +#endif
> +               next_triplet++;
> +       }
> +
> +       op->ldpc_dec.hard_output.length += *h_out_length;
> +       desc->op_addr = op;
> +}
> +
> +
> +/* Enqueue a number of operations to HW and update software rings */
> +static inline void
> +acc100_dma_enqueue(struct acc100_queue *q, uint16_t n,
> +               struct rte_bbdev_stats *queue_stats)
> +{
> +       union acc100_enqueue_reg_fmt enq_req;
> +#ifdef RTE_BBDEV_OFFLOAD_COST
> +       uint64_t start_time = 0;
> +       queue_stats->acc_offload_cycles = 0;
> +#else
> +       RTE_SET_USED(queue_stats);
> +#endif
> +
> +       enq_req.val = 0;
> +       /* Setting offset, 100b for 256 DMA Desc */
> +       enq_req.addr_offset = ACC100_DESC_OFFSET;
> +
> +       /* Split ops into batches */
> +       do {
> +               union acc100_dma_desc *desc;
> +               uint16_t enq_batch_size;
> +               uint64_t offset;
> +               rte_iova_t req_elem_addr;
> +
> +               enq_batch_size = RTE_MIN(n, MAX_ENQ_BATCH_SIZE);
> +
> +               /* Set flag on last descriptor in a batch */
> +               desc = q->ring_addr + ((q->sw_ring_head + enq_batch_size - 1) &
> +                               q->sw_ring_wrap_mask);
> +               desc->req.last_desc_in_batch = 1;
> +
> +               /* Calculate the 1st descriptor's address */
> +               offset = ((q->sw_ring_head & q->sw_ring_wrap_mask) *
> +                               sizeof(union acc100_dma_desc));
> +               req_elem_addr = q->ring_addr_phys + offset;
> +
> +               /* Fill enqueue struct */
> +               enq_req.num_elem = enq_batch_size;
> +               /* low 6 bits are not needed */
> +               enq_req.req_elem_addr = (uint32_t)(req_elem_addr >> 6);
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +               rte_memdump(stderr, "Req sdone", desc, sizeof(*desc));
> +#endif
> +               rte_bbdev_log_debug(
> +                               "Enqueue %u reqs (phys %#"PRIx64") to reg %p",
> +                               enq_batch_size,
> +                               req_elem_addr,
> +                               (void *)q->mmio_reg_enqueue);
> +
> +               rte_wmb();
> +
> +#ifdef RTE_BBDEV_OFFLOAD_COST
> +               /* Start time measurement for enqueue function offload. */
> +               start_time = rte_rdtsc_precise();
> +#endif
> +               rte_bbdev_log(DEBUG, "Debug : MMIO Enqueue");
> +               mmio_write(q->mmio_reg_enqueue, enq_req.val);
> +
> +#ifdef RTE_BBDEV_OFFLOAD_COST
> +               queue_stats->acc_offload_cycles +=
> +                               rte_rdtsc_precise() - start_time;
> +#endif
> +
> +               q->aq_enqueued++;
> +               q->sw_ring_head += enq_batch_size;
> +               n -= enq_batch_size;
> +
> +       } while (n);
> +}
> +
> +/* Enqueue a group of LDPC encode operations muxed into one descriptor, CB mode */
> +static inline int
> +enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
> +               uint16_t total_enqueued_cbs, int16_t num)
> +{
> +       union acc100_dma_desc *desc = NULL;
> +       uint32_t out_length;
> +       struct rte_mbuf *output_head, *output;
> +       int i, next_triplet;
> +       uint16_t in_length_in_bytes;
> +       struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
> +
> +       uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> +                       & q->sw_ring_wrap_mask);
> +       desc = q->ring_addr + desc_idx;
> +       acc100_fcw_le_fill(ops[0], &desc->req.fcw_le, num);
> +
> +       /** This could be done at polling */
> +       desc->req.word0 = ACC100_DMA_DESC_TYPE;
> +       desc->req.word1 = 0; /**< Timestamp could be disabled */
> +       desc->req.word2 = 0;
> +       desc->req.word3 = 0;
> +       desc->req.numCBs = num;
> +
> +       in_length_in_bytes = ops[0]->ldpc_enc.input.data->data_len;
> +       out_length = (enc->cb_params.e + 7) >> 3;
> +       desc->req.m2dlen = 1 + num;
> +       desc->req.d2mlen = num;
> +       next_triplet = 1;
> +
> +       for (i = 0; i < num; i++) {
> +               desc->req.data_ptrs[next_triplet].address =
> +                       rte_pktmbuf_iova_offset(ops[i]->ldpc_enc.input.data, 0);
> +               desc->req.data_ptrs[next_triplet].blen = in_length_in_bytes;
> +               next_triplet++;
> +               desc->req.data_ptrs[next_triplet].address =
> +                               rte_pktmbuf_iova_offset(
> +                               ops[i]->ldpc_enc.output.data, 0);
> +               desc->req.data_ptrs[next_triplet].blen = out_length;
> +               next_triplet++;
> +               ops[i]->ldpc_enc.output.length = out_length;
> +               output_head = output = ops[i]->ldpc_enc.output.data;
> +               mbuf_append(output_head, output, out_length);
> +               output->data_len = out_length;
> +       }
> +
> +       desc->req.op_addr = ops[0];
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +       rte_memdump(stderr, "FCW", &desc->req.fcw_le,
> +                       sizeof(desc->req.fcw_le) - 8);
> +       rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> +#endif
> +
> +       /* Several CBs (one per op) were successfully prepared to enqueue */
> +       return num;
> +}
> +
> +/* Enqueue one encode operation for ACC100 device in CB mode */
> +static inline int
> +enqueue_ldpc_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
> +               uint16_t total_enqueued_cbs)
> +{
> +       union acc100_dma_desc *desc = NULL;
> +       int ret;
> +       uint32_t in_offset, out_offset, out_length, mbuf_total_left,
> +               seg_total_left;
> +       struct rte_mbuf *input, *output_head, *output;
> +
> +       uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> +                       & q->sw_ring_wrap_mask);
> +       desc = q->ring_addr + desc_idx;
> +       acc100_fcw_le_fill(op, &desc->req.fcw_le, 1);
> +
> +       input = op->ldpc_enc.input.data;
> +       output_head = output = op->ldpc_enc.output.data;
> +       in_offset = op->ldpc_enc.input.offset;
> +       out_offset = op->ldpc_enc.output.offset;
> +       out_length = 0;
> +       mbuf_total_left = op->ldpc_enc.input.length;
> +       seg_total_left = rte_pktmbuf_data_len(op->ldpc_enc.input.data)
> +                       - in_offset;
> +
> +       ret = acc100_dma_desc_le_fill(op, &desc->req, &input, output,
> +                       &in_offset, &out_offset, &out_length, &mbuf_total_left,
> +                       &seg_total_left);
> +
> +       if (unlikely(ret < 0))
> +               return ret;
> +
> +       mbuf_append(output_head, output, out_length);
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +       rte_memdump(stderr, "FCW", &desc->req.fcw_le,
> +                       sizeof(desc->req.fcw_le) - 8);
> +       rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> +
> +       /* Check if any data left after processing one CB */
> +       if (mbuf_total_left != 0) {
> +               rte_bbdev_log(ERR,
> +                               "Some date still left after processing one CB:
> mbuf_total_left = %u",
> +                               mbuf_total_left);
> +               return -EINVAL;
> +       }
> +#endif
> +       /* One CB (one op) was successfully prepared to enqueue */
> +       return 1;
> +}
> +
> +/* Enqueue one decode operation for ACC100 device in CB mode */
> +static inline int
> +enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
> +               uint16_t total_enqueued_cbs, bool same_op)
> +{
> +       int ret;
> +
> +       union acc100_dma_desc *desc;
> +       uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> +                       & q->sw_ring_wrap_mask);
> +       desc = q->ring_addr + desc_idx;
> +       struct rte_mbuf *input, *h_output_head, *h_output;
> +       uint32_t in_offset, h_out_offset, h_out_length, mbuf_total_left;
> +       input = op->ldpc_dec.input.data;
> +       h_output_head = h_output = op->ldpc_dec.hard_output.data;
> +       in_offset = op->ldpc_dec.input.offset;
> +       h_out_offset = op->ldpc_dec.hard_output.offset;
> +       mbuf_total_left = op->ldpc_dec.input.length;
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +       if (unlikely(input == NULL)) {
> +               rte_bbdev_log(ERR, "Invalid mbuf pointer");
> +               return -EFAULT;
> +       }
> +#endif
> +       union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
> +
> +       if (same_op) {
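> +               /* Reuse the previous descriptor: all FCW-relevant fields match */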
> +               union acc100_dma_desc *prev_desc;
> +               desc_idx = ((q->sw_ring_head + total_enqueued_cbs - 1)
> +                               & q->sw_ring_wrap_mask);
> +               prev_desc = q->ring_addr + desc_idx;
> +               uint8_t *prev_ptr = (uint8_t *) prev_desc;
> +               uint8_t *new_ptr = (uint8_t *) desc;
> +               /* Copy first 4 words and BDESCs */
> +               rte_memcpy(new_ptr, prev_ptr, 16);
> +               rte_memcpy(new_ptr + 36, prev_ptr + 36, 40);
> +               desc->req.op_addr = prev_desc->req.op_addr;
> +               /* Copy FCW */
> +               rte_memcpy(new_ptr + ACC100_DESC_FCW_OFFSET,
> +                               prev_ptr + ACC100_DESC_FCW_OFFSET,
> +                               ACC100_FCW_LD_BLEN);
> +               acc100_dma_desc_ld_update(op, &desc->req, input, h_output,
> +                               &in_offset, &h_out_offset,
> +                               &h_out_length, harq_layout);
> +       } else {
> +               struct acc100_fcw_ld *fcw;
> +               uint32_t seg_total_left;
> +               fcw = &desc->req.fcw_ld;
> +               acc100_fcw_ld_fill(op, fcw, harq_layout);
> +
> +               /* Special handling when overusing mbuf */
> +               if (fcw->rm_e < MAX_E_MBUF)
> +                       seg_total_left = rte_pktmbuf_data_len(input)
> +                                       - in_offset;
> +               else
> +                       seg_total_left = fcw->rm_e;
> +
> +               ret = acc100_dma_desc_ld_fill(op, &desc->req, &input, h_output,
> +                               &in_offset, &h_out_offset,
> +                               &h_out_length, &mbuf_total_left,
> +                               &seg_total_left, fcw);
> +               if (unlikely(ret < 0))
> +                       return ret;
> +       }
> +
> +       /* Hard output */
> +       mbuf_append(h_output_head, h_output, h_out_length);
> +#ifndef ACC100_EXT_MEM
> +       if (op->ldpc_dec.harq_combined_output.length > 0) {
> +               /* Push the HARQ output into host memory */
> +               struct rte_mbuf *hq_output_head, *hq_output;
> +               hq_output_head = op->ldpc_dec.harq_combined_output.data;
> +               hq_output = op->ldpc_dec.harq_combined_output.data;
> +               mbuf_append(hq_output_head, hq_output,
> +                               op->ldpc_dec.harq_combined_output.length);
> +       }
> +#endif
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +       rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
> +                       sizeof(desc->req.fcw_ld) - 8);
> +       rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> +#endif
> +
> +       /* One CB (one op) was successfully prepared to enqueue */
> +       return 1;
> +}
> +
> +/* Enqueue one decode operation for ACC100 device in TB mode */
> +static inline int
> +enqueue_ldpc_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
> +               uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
> +{
> +       union acc100_dma_desc *desc = NULL;
> +       int ret;
> +       uint8_t r, c;
> +       uint32_t in_offset, h_out_offset,
> +               h_out_length, mbuf_total_left, seg_total_left;
> +       struct rte_mbuf *input, *h_output_head, *h_output;
> +       uint16_t current_enqueued_cbs = 0;
> +
> +       uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> +                       & q->sw_ring_wrap_mask);
> +       desc = q->ring_addr + desc_idx;
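> +       /* Each DMA descriptor is 256 bytes, so desc_idx << 8 is its byte offset in the ring */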
> +       uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
> +       union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
> +       acc100_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout);
> +
> +       input = op->ldpc_dec.input.data;
> +       h_output_head = h_output = op->ldpc_dec.hard_output.data;
> +       in_offset = op->ldpc_dec.input.offset;
> +       h_out_offset = op->ldpc_dec.hard_output.offset;
> +       h_out_length = 0;
> +       mbuf_total_left = op->ldpc_dec.input.length;
> +       c = op->ldpc_dec.tb_params.c;
> +       r = op->ldpc_dec.tb_params.r;
> +
> +       while (mbuf_total_left > 0 && r < c) {
> +
> +               seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
> +
> +               /* Set up DMA descriptor */
> +               desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
> +                               & q->sw_ring_wrap_mask);
> +               desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
> +               desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
> +               ret = acc100_dma_desc_ld_fill(op, &desc->req, &input,
> +                               h_output, &in_offset, &h_out_offset,
> +                               &h_out_length,
> +                               &mbuf_total_left, &seg_total_left,
> +                               &desc->req.fcw_ld);
> +
> +               if (unlikely(ret < 0))
> +                       return ret;
> +
> +               /* Hard output */
> +               mbuf_append(h_output_head, h_output, h_out_length);
> +
> +               /* Set total number of CBs in TB */
> +               desc->req.cbs_in_tb = cbs_in_tb;
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +               rte_memdump(stderr, "FCW", &desc->req.fcw_td,
> +                               sizeof(desc->req.fcw_td) - 8);
> +               rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> +#endif
> +
> +               if (seg_total_left == 0) {
> +                       /* Go to the next mbuf */
> +                       input = input->next;
> +                       in_offset = 0;
> +                       h_output = h_output->next;
> +                       h_out_offset = 0;
> +               }
> +               total_enqueued_cbs++;
> +               current_enqueued_cbs++;
> +               r++;
> +       }
> +
> +       if (unlikely(desc == NULL))
> +               return current_enqueued_cbs;
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +       /* Check if any CBs left for processing */
> +       if (mbuf_total_left != 0) {
> +               rte_bbdev_log(ERR,
> +                               "Some date still left for processing: mbuf_total_left = %u",
> +                               mbuf_total_left);
> +               return -EINVAL;
> +       }
> +#endif
> +       /* Set SDone on last CB descriptor for TB mode */
> +       desc->req.sdone_enable = 1;
> +       desc->req.irq_enable = q->irq_enable;
> +
> +       return current_enqueued_cbs;
> +}
> +
> +/* Calculates number of CBs in processed encoder TB based on 'r' and input
> + * length.
> + */
> +static inline uint8_t
> +get_num_cbs_in_tb_enc(struct rte_bbdev_op_turbo_enc *turbo_enc)
> +{
> +       uint8_t c, c_neg, r, crc24_bits = 0;
> +       uint16_t k, k_neg, k_pos;
> +       uint8_t cbs_in_tb = 0;
> +       int32_t length;
> +
> +       length = turbo_enc->input.length;
> +       r = turbo_enc->tb_params.r;
> +       c = turbo_enc->tb_params.c;
> +       c_neg = turbo_enc->tb_params.c_neg;
> +       k_neg = turbo_enc->tb_params.k_neg;
> +       k_pos = turbo_enc->tb_params.k_pos;
> +       crc24_bits = 0;
> +       if (check_bit(turbo_enc->op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH))
> +               crc24_bits = 24;
> +       while (length > 0 && r < c) {
> +               k = (r < c_neg) ? k_neg : k_pos;
> +               length -= (k - crc24_bits) >> 3;
> +               r++;
> +               cbs_in_tb++;
> +       }
> +
> +       return cbs_in_tb;
> +}
> +
> +/* Calculates number of CBs in processed decoder TB based on 'r' and input
> + * length.
> + */
> +static inline uint16_t
> +get_num_cbs_in_tb_dec(struct rte_bbdev_op_turbo_dec *turbo_dec)
> +{
> +       uint8_t c, c_neg, r = 0;
> +       uint16_t kw, k, k_neg, k_pos, cbs_in_tb = 0;
> +       int32_t length;
> +
> +       length = turbo_dec->input.length;
> +       r = turbo_dec->tb_params.r;
> +       c = turbo_dec->tb_params.c;
> +       c_neg = turbo_dec->tb_params.c_neg;
> +       k_neg = turbo_dec->tb_params.k_neg;
> +       k_pos = turbo_dec->tb_params.k_pos;
> +       while (length > 0 && r < c) {
> +               k = (r < c_neg) ? k_neg : k_pos;
> +               kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
> +               length -= kw;
> +               r++;
> +               cbs_in_tb++;
> +       }
> +
> +       return cbs_in_tb;
> +}
> +
> +/* Calculates number of CBs in processed decoder TB based on 'r' and input
> + * length.
> + */
> +static inline uint16_t
> +get_num_cbs_in_tb_ldpc_dec(struct rte_bbdev_op_ldpc_dec *ldpc_dec)
> +{
> +       uint16_t r, cbs_in_tb = 0;
> +       int32_t length = ldpc_dec->input.length;
> +       r = ldpc_dec->tb_params.r;
> +       while (length > 0 && r < ldpc_dec->tb_params.c) {
> +               length -=  (r < ldpc_dec->tb_params.cab) ?
> +                               ldpc_dec->tb_params.ea :
> +                               ldpc_dec->tb_params.eb;
> +               r++;
> +               cbs_in_tb++;
> +       }
> +       return cbs_in_tb;
> +}
> +
> +/* Check if encode operations can be muxed with a common FCW:
> + * all parameters past the per-CB data buffers must match
> + */
> +static inline bool
> +check_mux(struct rte_bbdev_enc_op **ops, uint16_t num) {
> +       uint16_t i;
> +       if (num == 1)
> +               return false;
> +       for (i = 1; i < num; ++i) {
> +               /* Only mux compatible code blocks */
> +               if (memcmp((uint8_t *)(&ops[i]->ldpc_enc) + ENC_OFFSET,
> +                               (uint8_t *)(&ops[0]->ldpc_enc) + ENC_OFFSET,
> +                               CMP_ENC_SIZE) != 0)
> +                       return false;
> +       }
> +       return true;
> +}
> +
> +/* Enqueue LDPC encode operations for ACC100 device in CB mode. */
> +static inline uint16_t
> +acc100_enqueue_ldpc_enc_cb(struct rte_bbdev_queue_data *q_data,
> +               struct rte_bbdev_enc_op **ops, uint16_t num)
> +{
> +       struct acc100_queue *q = q_data->queue_private;
> +       int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> +       uint16_t i = 0;
> +       union acc100_dma_desc *desc;
> +       int ret, desc_idx = 0;
> +       int16_t enq, left = num;
> +
> +       while (left > 0) {
> +               if (unlikely(avail - 1 < 0))
> +                       break;
> +               avail--;
> +               enq = RTE_MIN(left, MUX_5GDL_DESC);
> +               if (check_mux(&ops[i], enq)) {
> +                       ret = enqueue_ldpc_enc_n_op_cb(q, &ops[i],
> +                                       desc_idx, enq);
> +                       if (ret < 0)
> +                               break;
> +                       i += enq;
> +               } else {
> +                       ret = enqueue_ldpc_enc_one_op_cb(q, ops[i], desc_idx);
> +                       if (ret < 0)
> +                               break;
> +                       i++;
> +               }
> +               desc_idx++;
> +               left = num - i;
> +       }
> +
> +       if (unlikely(i == 0))
> +               return 0; /* Nothing to enqueue */
> +
> +       /* Set SDone in last CB in enqueued ops for CB mode */
> +       desc = q->ring_addr + ((q->sw_ring_head + desc_idx - 1)
> +                       & q->sw_ring_wrap_mask);
> +       desc->req.sdone_enable = 1;
> +       desc->req.irq_enable = q->irq_enable;
> +
> +       acc100_dma_enqueue(q, desc_idx, &q_data->queue_stats);
> +
> +       /* Update stats */
> +       q_data->queue_stats.enqueued_count += i;
> +       q_data->queue_stats.enqueue_err_count += num - i;
> +
> +       return i;
> +}
> +
> +/* Enqueue LDPC encode operations for ACC100 device. */
> +static uint16_t
> +acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
> +               struct rte_bbdev_enc_op **ops, uint16_t num)
> +{
> +       if (unlikely(num == 0))
> +               return 0;
> +       return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
> +}
> +
> +/* Check if two consecutive LDPC decode operations can share the same descriptor contents */
> +static inline bool
> +cmp_ldpc_dec_op(struct rte_bbdev_dec_op **ops) {
> +       /* Only mux compatible code blocks */
> +       if (memcmp((uint8_t *)(&ops[0]->ldpc_dec) + DEC_OFFSET,
> +                       (uint8_t *)(&ops[1]->ldpc_dec) +
> +                       DEC_OFFSET, CMP_DEC_SIZE) != 0) {
> +               return false;
> +       } else
> +               return true;
> +}
> +
> +/* Enqueue LDPC decode operations for ACC100 device in TB mode */
> +static uint16_t
> +acc100_enqueue_ldpc_dec_tb(struct rte_bbdev_queue_data *q_data,
> +               struct rte_bbdev_dec_op **ops, uint16_t num)
> +{
> +       struct acc100_queue *q = q_data->queue_private;
> +       int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> +       uint16_t i, enqueued_cbs = 0;
> +       uint8_t cbs_in_tb;
> +       int ret;
> +
> +       for (i = 0; i < num; ++i) {
> +               cbs_in_tb = get_num_cbs_in_tb_ldpc_dec(&ops[i]->ldpc_dec);
> +               /* Check if there is space available for further processing */
> +               if (unlikely(avail - cbs_in_tb < 0))
> +                       break;
> +               avail -= cbs_in_tb;
> +
> +               ret = enqueue_ldpc_dec_one_op_tb(q, ops[i],
> +                               enqueued_cbs, cbs_in_tb);
> +               if (ret < 0)
> +                       break;
> +               enqueued_cbs += ret;
> +       }
> +
> +       acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
> +
> +       /* Update stats */
> +       q_data->queue_stats.enqueued_count += i;
> +       q_data->queue_stats.enqueue_err_count += num - i;
> +       return i;
> +}
> +
> +/* Enqueue LDPC decode operations for ACC100 device in CB mode */
> +static uint16_t
> +acc100_enqueue_ldpc_dec_cb(struct rte_bbdev_queue_data *q_data,
> +               struct rte_bbdev_dec_op **ops, uint16_t num)
> +{
> +       struct acc100_queue *q = q_data->queue_private;
> +       int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> +       uint16_t i;
> +       union acc100_dma_desc *desc;
> +       int ret;
> +       bool same_op = false;
> +       for (i = 0; i < num; ++i) {
> +               /* Check if there is space available for further processing */
> +               if (unlikely(avail - 1 < 0))
> +                       break;
> +               avail -= 1;
> +
> +               if (i > 0)
> +                       same_op = cmp_ldpc_dec_op(&ops[i-1]);
> +               rte_bbdev_log(INFO, "Op %d %d %d %d %d %d %d %d %d %d %d
> %d\n",
> +                       i, ops[i]->ldpc_dec.op_flags, ops[i]->ldpc_dec.rv_index,
> +                       ops[i]->ldpc_dec.iter_max, ops[i]->ldpc_dec.iter_count,
> +                       ops[i]->ldpc_dec.basegraph, ops[i]->ldpc_dec.z_c,
> +                       ops[i]->ldpc_dec.n_cb, ops[i]->ldpc_dec.q_m,
> +                       ops[i]->ldpc_dec.n_filler, ops[i]->ldpc_dec.cb_params.e,
> +                       same_op);
> +               ret = enqueue_ldpc_dec_one_op_cb(q, ops[i], i, same_op);
> +               if (ret < 0)
> +                       break;
> +       }
> +
> +       if (unlikely(i == 0))
> +               return 0; /* Nothing to enqueue */
> +
> +       /* Set SDone in last CB in enqueued ops for CB mode */
> +       desc = q->ring_addr + ((q->sw_ring_head + i - 1)
> +                       & q->sw_ring_wrap_mask);
> +
> +       desc->req.sdone_enable = 1;
> +       desc->req.irq_enable = q->irq_enable;
> +
> +       acc100_dma_enqueue(q, i, &q_data->queue_stats);
> +
> +       /* Update stats */
> +       q_data->queue_stats.enqueued_count += i;
> +       q_data->queue_stats.enqueue_err_count += num - i;
> +       return i;
> +}
> +
> +/* Enqueue LDPC decode operations for ACC100 device. */
> +static uint16_t
> +acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
> +               struct rte_bbdev_dec_op **ops, uint16_t num)
> +{
> +       struct acc100_queue *q = q_data->queue_private;
> +       int32_t aq_avail = q->aq_depth +
> +                       (q->aq_dequeued - q->aq_enqueued) / 128;
> +
> +       if (unlikely((aq_avail == 0) || (num == 0)))
> +               return 0;
> +
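> +       /* code_block_mode == 0 corresponds to transport block mode, else code block mode */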
> +       if (ops[0]->ldpc_dec.code_block_mode == 0)
> +               return acc100_enqueue_ldpc_dec_tb(q_data, ops, num);
> +       else
> +               return acc100_enqueue_ldpc_dec_cb(q_data, ops, num);
> +}
> +
> +/* Dequeue one descriptor of encode operations from ACC100 device in CB mode */
> +static inline int
> +dequeue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
> +               uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
> +{
> +       union acc100_dma_desc *desc, atom_desc;
> +       union acc100_dma_rsp_desc rsp;
> +       struct rte_bbdev_enc_op *op;
> +       int i;
> +
> +       desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
> +                       & q->sw_ring_wrap_mask);
> +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +                       __ATOMIC_RELAXED);
> +
> +       /* Check fdone bit */
> +       if (!(atom_desc.rsp.val & ACC100_FDONE))
> +               return -1;
> +
> +       rsp.val = atom_desc.rsp.val;
> +       rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
> +
> +       /* Dequeue */
> +       op = desc->req.op_addr;
> +
> +       /* Clearing status, it will be set based on response */
> +       op->status = 0;
> +
> +       op->status |= ((rsp.input_err)
> +                       ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> +       op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +       op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +
> +       if (desc->req.last_desc_in_batch) {
> +               (*aq_dequeued)++;
> +               desc->req.last_desc_in_batch = 0;
> +       }
> +       desc->rsp.val = ACC100_DMA_DESC_TYPE;
> +       desc->rsp.add_info_0 = 0; /* Reserved bits */
> +       desc->rsp.add_info_1 = 0; /* Reserved bits */
> +
> +       /* Flag that the muxing causes loss of opaque data */
> +       op->opaque_data = (void *)-1;
> +       for (i = 0 ; i < desc->req.numCBs; i++)
> +               ref_op[i] = op;
> +
> +       /* All CBs (ops) muxed into this descriptor were successfully dequeued */
> +       return desc->req.numCBs;
> +}
> +
> +/* Dequeue one encode operation from ACC100 device in TB mode */
> +static inline int
> +dequeue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
> +               uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
> +{
> +       union acc100_dma_desc *desc, *last_desc, atom_desc;
> +       union acc100_dma_rsp_desc rsp;
> +       struct rte_bbdev_enc_op *op;
> +       uint8_t i = 0;
> +       uint16_t current_dequeued_cbs = 0, cbs_in_tb;
> +
> +       desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
> +                       & q->sw_ring_wrap_mask);
> +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +                       __ATOMIC_RELAXED);
> +
> +       /* Check fdone bit */
> +       if (!(atom_desc.rsp.val & ACC100_FDONE))
> +               return -1;
> +
> +       /* Get number of CBs in dequeued TB */
> +       cbs_in_tb = desc->req.cbs_in_tb;
> +       /* Get last CB */
> +       last_desc = q->ring_addr + ((q->sw_ring_tail
> +                       + total_dequeued_cbs + cbs_in_tb - 1)
> +                       & q->sw_ring_wrap_mask);
> +       /* Check if last CB in TB is ready to dequeue (and thus
> +        * the whole TB) - checking sdone bit. If not return.
> +        */
> +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
> +                       __ATOMIC_RELAXED);
> +       if (!(atom_desc.rsp.val & ACC100_SDONE))
> +               return -1;
> +
> +       /* Dequeue */
> +       op = desc->req.op_addr;
> +
> +       /* Clearing status, it will be set based on response */
> +       op->status = 0;
> +
> +       while (i < cbs_in_tb) {
> +               desc = q->ring_addr + ((q->sw_ring_tail
> +                               + total_dequeued_cbs)
> +                               & q->sw_ring_wrap_mask);
> +               atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +                               __ATOMIC_RELAXED);
> +               rsp.val = atom_desc.rsp.val;
> +               rte_bbdev_log_debug("Resp. desc %p: %x", desc,
> +                               rsp.val);
> +
> +               op->status |= ((rsp.input_err)
> +                               ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> +               op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +               op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +
> +               if (desc->req.last_desc_in_batch) {
> +                       (*aq_dequeued)++;
> +                       desc->req.last_desc_in_batch = 0;
> +               }
> +               desc->rsp.val = ACC100_DMA_DESC_TYPE;
> +               desc->rsp.add_info_0 = 0;
> +               desc->rsp.add_info_1 = 0;
> +               total_dequeued_cbs++;
> +               current_dequeued_cbs++;
> +               i++;
> +       }
> +
> +       *ref_op = op;
> +
> +       return current_dequeued_cbs;
> +}
> +
> +/* Dequeue one decode operation from ACC100 device in CB mode */
> +static inline int
> +dequeue_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
> +               struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
> +               uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> +{
> +       union acc100_dma_desc *desc, atom_desc;
> +       union acc100_dma_rsp_desc rsp;
> +       struct rte_bbdev_dec_op *op;
> +
> +       desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> +                       & q->sw_ring_wrap_mask);
> +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +                       __ATOMIC_RELAXED);
> +
> +       /* Check fdone bit */
> +       if (!(atom_desc.rsp.val & ACC100_FDONE))
> +               return -1;
> +
> +       rsp.val = atom_desc.rsp.val;
> +       rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
> +
> +       /* Dequeue */
> +       op = desc->req.op_addr;
> +
> +       /* Clearing status, it will be set based on response */
> +       op->status = 0;
> +       op->status |= ((rsp.input_err)
> +                       ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> +       op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +       op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +       if (op->status != 0)
> +               q_data->queue_stats.dequeue_err_count++;
> +
> +       /* CRC invalid if error exists */
> +       if (!op->status)
> +               op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
> +       op->turbo_dec.iter_count = (uint8_t) rsp.iter_cnt / 2;
> +       /* Check if this is the last desc in batch (Atomic Queue) */
> +       if (desc->req.last_desc_in_batch) {
> +               (*aq_dequeued)++;
> +               desc->req.last_desc_in_batch = 0;
> +       }
> +       desc->rsp.val = ACC100_DMA_DESC_TYPE;
> +       desc->rsp.add_info_0 = 0;
> +       desc->rsp.add_info_1 = 0;
> +       *ref_op = op;
> +
> +       /* One CB (op) was successfully dequeued */
> +       return 1;
> +}
> +
> +/* Dequeue one decode operation from ACC100 device in CB mode */
> +static inline int
> +dequeue_ldpc_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
> +               struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
> +               uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> +{
> +       union acc100_dma_desc *desc, atom_desc;
> +       union acc100_dma_rsp_desc rsp;
> +       struct rte_bbdev_dec_op *op;
> +
> +       desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> +                       & q->sw_ring_wrap_mask);
> +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +                       __ATOMIC_RELAXED);
> +
> +       /* Check fdone bit */
> +       if (!(atom_desc.rsp.val & ACC100_FDONE))
> +               return -1;
> +
> +       rsp.val = atom_desc.rsp.val;
> +
> +       /* Dequeue */
> +       op = desc->req.op_addr;
> +
> +       /* Clearing status, it will be set based on response */
> +       op->status = 0;
> +       op->status |= rsp.input_err << RTE_BBDEV_DATA_ERROR;
> +       op->status |= rsp.dma_err << RTE_BBDEV_DRV_ERROR;
> +       op->status |= rsp.fcw_err << RTE_BBDEV_DRV_ERROR;
> +       if (op->status != 0)
> +               q_data->queue_stats.dequeue_err_count++;
> +
> +       op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
> +       if (op->ldpc_dec.hard_output.length > 0 && !rsp.synd_ok)
> +               op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
> +       op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
> +
> +       /* Check if this is the last desc in batch (Atomic Queue) */
> +       if (desc->req.last_desc_in_batch) {
> +               (*aq_dequeued)++;
> +               desc->req.last_desc_in_batch = 0;
> +       }
> +
> +       desc->rsp.val = ACC100_DMA_DESC_TYPE;
> +       desc->rsp.add_info_0 = 0;
> +       desc->rsp.add_info_1 = 0;
> +
> +       *ref_op = op;
> +
> +       /* One CB (op) was successfully dequeued */
> +       return 1;
> +}
> +
> +/* Dequeue one decode operation from ACC100 device in TB mode. */
> +static inline int
> +dequeue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
> +               uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> +{
> +       union acc100_dma_desc *desc, *last_desc, atom_desc;
> +       union acc100_dma_rsp_desc rsp;
> +       struct rte_bbdev_dec_op *op;
> +       uint8_t cbs_in_tb = 1, cb_idx = 0;
> +
> +       desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> +                       & q->sw_ring_wrap_mask);
> +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +                       __ATOMIC_RELAXED);
> +
> +       /* Check fdone bit */
> +       if (!(atom_desc.rsp.val & ACC100_FDONE))
> +               return -1;
> +
> +       /* Dequeue */
> +       op = desc->req.op_addr;
> +
> +       /* Get number of CBs in dequeued TB */
> +       cbs_in_tb = desc->req.cbs_in_tb;
> +       /* Get last CB */
> +       last_desc = q->ring_addr + ((q->sw_ring_tail
> +                       + dequeued_cbs + cbs_in_tb - 1)
> +                       & q->sw_ring_wrap_mask);
> +       /* Check if last CB in TB is ready to dequeue (and thus
> +        * the whole TB) - checking sdone bit. If not return.
> +        */
> +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
> +                       __ATOMIC_RELAXED);
> +       if (!(atom_desc.rsp.val & ACC100_SDONE))
> +               return -1;
> +
> +       /* Clearing status, it will be set based on response */
> +       op->status = 0;
> +
> +       /* Read remaining CBs if any */
> +       while (cb_idx < cbs_in_tb) {
> +               desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> +                               & q->sw_ring_wrap_mask);
> +               atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +                               __ATOMIC_RELAXED);
> +               rsp.val = atom_desc.rsp.val;
> +               rte_bbdev_log_debug("Resp. desc %p: %x", desc,
> +                               rsp.val);
> +
> +               op->status |= ((rsp.input_err)
> +                               ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> +               op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +               op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +
> +               /* CRC invalid if error exists */
> +               if (!op->status)
> +                       op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
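> +               /* Keep the worst-case iteration count across the CBs of the TB */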
> +               op->turbo_dec.iter_count = RTE_MAX((uint8_t) rsp.iter_cnt,
> +                               op->turbo_dec.iter_count);
> +
> +               /* Check if this is the last desc in batch (Atomic Queue) */
> +               if (desc->req.last_desc_in_batch) {
> +                       (*aq_dequeued)++;
> +                       desc->req.last_desc_in_batch = 0;
> +               }
> +               desc->rsp.val = ACC100_DMA_DESC_TYPE;
> +               desc->rsp.add_info_0 = 0;
> +               desc->rsp.add_info_1 = 0;
> +               dequeued_cbs++;
> +               cb_idx++;
> +       }
> +
> +       *ref_op = op;
> +
> +       return cb_idx;
> +}
> +
> +/* Dequeue LDPC encode operations from ACC100 device. */
> +static uint16_t
> +acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
> +               struct rte_bbdev_enc_op **ops, uint16_t num)
> +{
> +       struct acc100_queue *q = q_data->queue_private;
> +       uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
> +       uint32_t aq_dequeued = 0;
> +       uint16_t dequeue_num, i, dequeued_cbs = 0, dequeued_descs = 0;
> +       int ret;
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +       if (unlikely(ops == NULL || q == NULL))
> +               return 0;
> +#endif
> +
> +       dequeue_num = (avail < num) ? avail : num;
> +
> +       for (i = 0; i < dequeue_num; i++) {
> +               ret = dequeue_enc_one_op_cb(q, &ops[dequeued_cbs],
> +                               dequeued_descs, &aq_dequeued);
> +               if (ret < 0)
> +                       break;
> +               dequeued_cbs += ret;
> +               dequeued_descs++;
> +               if (dequeued_cbs >= num)
> +                       break;
> +       }
> +
> +       q->aq_dequeued += aq_dequeued;
> +       q->sw_ring_tail += dequeued_descs;
> +
> +       /* Update dequeue stats */
> +       q_data->queue_stats.dequeued_count += dequeued_cbs;
> +
> +       return dequeued_cbs;
> +}
> +
> +/* Dequeue LDPC decode operations from ACC100 device. */
> +static uint16_t
> +acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
> +               struct rte_bbdev_dec_op **ops, uint16_t num)
> +{
> +       struct acc100_queue *q = q_data->queue_private;
> +       uint16_t dequeue_num;
> +       uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
> +       uint32_t aq_dequeued = 0;
> +       uint16_t i;
> +       uint16_t dequeued_cbs = 0;
> +       struct rte_bbdev_dec_op *op;
> +       int ret;
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +       if (unlikely(ops == NULL || q == NULL))
> +               return 0;
> +#endif
> +
> +       dequeue_num = (avail < num) ? avail : num;
> +
> +       for (i = 0; i < dequeue_num; ++i) {
> +               op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> +                       & q->sw_ring_wrap_mask))->req.op_addr;
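> +               /* Dispatch to the TB or CB dequeue path based on how the op was enqueued */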
> +               if (op->ldpc_dec.code_block_mode == 0)
> +                       ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
> +                                       &aq_dequeued);
> +               else
> +                       ret = dequeue_ldpc_dec_one_op_cb(
> +                                       q_data, q, &ops[i], dequeued_cbs,
> +                                       &aq_dequeued);
> +
> +               if (ret < 0)
> +                       break;
> +               dequeued_cbs += ret;
> +       }
> +
> +       q->aq_dequeued += aq_dequeued;
> +       q->sw_ring_tail += dequeued_cbs;
> +
> +       /* Update dequeue stats */
> +       q_data->queue_stats.dequeued_count += i;
> +
> +       return i;
> +}
> +
>  /* Initialization Function */
>  static void
>  acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
> @@ -703,6 +2321,10 @@
>          struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
> 
>          dev->dev_ops = &acc100_bbdev_ops;
> +       dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
> +       dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
> +       dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
> +       dev->dequeue_ldpc_dec_ops = acc100_dequeue_ldpc_dec;
> 
>          ((struct acc100_device *) dev->data->dev_private)->pf_device =
>                          !strcmp(drv->driver.name,
> @@ -815,4 +2437,3 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
>  RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
>  RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
>  RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
> -
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
> index 0e2b79c..78686c1 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> @@ -88,6 +88,8 @@
>  #define TMPL_PRI_3      0x0f0e0d0c
>  #define QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
>  #define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
> +#define ACC100_FDONE    0x80000000
> +#define ACC100_SDONE    0x40000000
> 
>  #define ACC100_NUM_TMPL  32
>  #define VF_OFFSET_QOS 16 /* offset in Memory Space specific to QoS Mon */
> @@ -398,6 +400,7 @@ struct __rte_packed acc100_dma_req_desc {
>  union acc100_dma_desc {
>          struct acc100_dma_req_desc req;
>          union acc100_dma_rsp_desc rsp;
> +       uint64_t atom_hdr;
>  };
> 
> 
> --
> 1.8.3.1

^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions
  2020-08-20 14:57       ` Dave Burley
@ 2020-08-20 21:05         ` Chautru, Nicolas
  2020-09-03  8:06           ` Dave Burley
  0 siblings, 1 reply; 213+ messages in thread
From: Chautru, Nicolas @ 2020-08-20 21:05 UTC (permalink / raw)
  To: Dave Burley, dev; +Cc: Richardson, Bruce


> From: Dave Burley <dave.burley@accelercomm.com>
>
> Hi Nic
> 
> Thank you - it would be useful to have further documentation for clarification
> as the data format isn't explicitly documented in BBDEV.

Thanks Dave. Just updated on this other patch -> https://patches.dpdk.org/patch/75793/
Feel free to ack or let me know if you need more details. 
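
To make the 6-bit packing concrete, here is a minimal sketch (illustrative
only, not the PMD code; the helper name and the MSB-first bit order within
each byte are assumptions on my side): each 8-bit LLR drops its two MSBs and
the remaining 6 bits are packed back to back, so four LLRs fit in three bytes.

    #include <stdint.h>
    #include <stddef.h>

    /* Pack 8-bit LLRs into consecutive 6-bit fields (2 MSBs dropped). */
    static void
    pack_llr_6bit(const int8_t *in, uint8_t *out, size_t num_llrs)
    {
            uint32_t acc = 0; /* bit accumulator */
            int bits = 0;     /* valid bits currently held in acc */
            size_t i, o = 0;

            for (i = 0; i < num_llrs; i++) {
                    /* Keep the 6 LSBs of the LLR */
                    acc = (acc << 6) | ((uint8_t)in[i] & 0x3F);
                    bits += 6;
                    while (bits >= 8) {
                            bits -= 8;
                            out[o++] = (uint8_t)(acc >> bits);
                    }
            }
            if (bits > 0) /* flush the tail, zero-padded */
                    out[o++] = (uint8_t)(acc << (8 - bits));
    }

The corresponding size adjustment is the 8/6 length scaling you can see in
acc100_fcw_ld_fill() for the compressed HARQ input.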

> Best Regards
> 
> Dave
> 
> 
> From: Chautru, Nicolas <nicolas.chautru@intel.com>
> Sent: 20 August 2020 15:52
> To: Dave Burley <dave.burley@accelercomm.com>; dev@dpdk.org
> <dev@dpdk.org>
> Cc: Richardson, Bruce <bruce.richardson@intel.com>
> Subject: RE: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC
> processing functions
> 
> Hi Dave,
> This is assuming 6 bits LLR compression packing (ie. first 2 MSB dropped).
> Similar to HARQ compression.
> Let me know if unclear, I can clarify further in documentation if not explicit
> enough.
> Thanks
> Nic
> 
> > -----Original Message-----
> > From: Dave Burley <dave.burley@accelercomm.com>
> > Sent: Thursday, August 20, 2020 7:39 AM
> > To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org
> > Cc: Richardson, Bruce <bruce.richardson@intel.com>
> > Subject: Re: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC
> > processing functions
> >
> > Hi Nic,
> >
> > As you've now specified the use of RTE_BBDEV_LDPC_LLR_COMPRESSION for
> > this PMD, please could you confirm what the packed format of the LLRs in
> > memory looks like?
> >
> > Best Regards
> >
> > Dave Burley
> >
> >
> > From: dev <dev-bounces@dpdk.org> on behalf of Nicolas Chautru
> > <nicolas.chautru@intel.com>
> > Sent: 19 August 2020 01:25
> > To: dev@dpdk.org <dev@dpdk.org>; akhil.goyal@nxp.com
> > <akhil.goyal@nxp.com>
> > Cc: bruce.richardson@intel.com <bruce.richardson@intel.com>; Nicolas
> > Chautru <nicolas.chautru@intel.com>
> > Subject: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing
> > functions
> >
> > Adding LDPC decode and encode processing operations
> >
> > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > ---
> >  drivers/baseband/acc100/rte_acc100_pmd.c | 1625 +++++++++++++++++++++++++++++-
> >  drivers/baseband/acc100/rte_acc100_pmd.h |    3 +
> >  2 files changed, 1626 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
> > index 7a21c57..5f32813 100644
> > --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> > +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> > @@ -15,6 +15,9 @@
> >  #include <rte_hexdump.h>
> >  #include <rte_pci.h>
> >  #include <rte_bus_pci.h>
> > +#ifdef RTE_BBDEV_OFFLOAD_COST
> > +#include <rte_cycles.h>
> > +#endif
> >
> >  #include <rte_bbdev.h>
> >  #include <rte_bbdev_pmd.h>
> > @@ -449,7 +452,6 @@
> >          return 0;
> >  }
> >
> > -
> >  /**
> >   * Report a ACC100 queue index which is free
> >   * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
> > @@ -634,6 +636,46 @@
> >          struct acc100_device *d = dev->data->dev_private;
> >
> >          static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
> > +               {
> > +                       .type   = RTE_BBDEV_OP_LDPC_ENC,
> > +                       .cap.ldpc_enc = {
> > +                               .capability_flags =
> > +                                       RTE_BBDEV_LDPC_RATE_MATCH |
> > +                                       RTE_BBDEV_LDPC_CRC_24B_ATTACH |
> > +                                       RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
> > +                               .num_buffers_src =
> > +                                               RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > +                               .num_buffers_dst =
> > +                                               RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > +                       }
> > +               },
> > +               {
> > +                       .type   = RTE_BBDEV_OP_LDPC_DEC,
> > +                       .cap.ldpc_dec = {
> > +                       .capability_flags =
> > +                               RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK |
> > +                               RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP |
> > +                               RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
> > +                               RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
> > +#ifdef ACC100_EXT_MEM
> > +                               RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
> > +                               RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
> > +#endif
> > +                               RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE |
> > +                               RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS |
> > +                               RTE_BBDEV_LDPC_DECODE_BYPASS |
> > +                               RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
> > +                               RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
> > +                               RTE_BBDEV_LDPC_LLR_COMPRESSION,
> > +                       .llr_size = 8,
> > +                       .llr_decimals = 1,
> > +                       .num_buffers_src =
> > +                                       RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > +                       .num_buffers_hard_out =
> > +                                       RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > +                       .num_buffers_soft_out = 0,
> > +                       }
> > +               },
> >                  RTE_BBDEV_END_OF_CAPABILITIES_LIST()
> >          };
> >
> > @@ -669,9 +711,14 @@
> >          dev_info->cpu_flag_reqs = NULL;
> >          dev_info->min_alignment = 64;
> >          dev_info->capabilities = bbdev_capabilities;
> > +#ifdef ACC100_EXT_MEM
> >          dev_info->harq_buffer_size = d->ddr_size;
> > +#else
> > +       dev_info->harq_buffer_size = 0;
> > +#endif
> >  }
> >
> > +
> >  static const struct rte_bbdev_ops acc100_bbdev_ops = {
> >          .setup_queues = acc100_setup_queues,
> >          .close = acc100_dev_close,
> > @@ -696,6 +743,1577 @@
> >          {.device_id = 0},
> >  };
> >
> > +/* Read flag value 0/1 from bitmap */
> > +static inline bool
> > +check_bit(uint32_t bitmap, uint32_t bitmask)
> > +{
> > +       return bitmap & bitmask;
> > +}
> > +
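> > +/* Append len bytes to the tail segment and update the head mbuf total packet length */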
> > +static inline char *
> > +mbuf_append(struct rte_mbuf *m_head, struct rte_mbuf *m, uint16_t len)
> > +{
> > +       if (unlikely(len > rte_pktmbuf_tailroom(m)))
> > +               return NULL;
> > +
> > +       char *tail = (char *)m->buf_addr + m->data_off + m->data_len;
> > +       m->data_len = (uint16_t)(m->data_len + len);
> > +       m_head->pkt_len  = (m_head->pkt_len + len);
> > +       return tail;
> > +}
> > +
> > +/* Compute value of k0.
> > + * Based on 3GPP 38.212 Table 5.4.2.1-2
> > + * Starting position of different redundancy versions, k0
> > + */
> > +static inline uint16_t
> > +get_k0(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv_index)
> > +{
> > +       if (rv_index == 0)
> > +               return 0;
> > +       uint16_t n = (bg == 1 ? N_ZC_1 : N_ZC_2) * z_c;
> > +       if (n_cb == n) {
> > +               if (rv_index == 1)
> > +                       return (bg == 1 ? K0_1_1 : K0_1_2) * z_c;
> > +               else if (rv_index == 2)
> > +                       return (bg == 1 ? K0_2_1 : K0_2_2) * z_c;
> > +               else
> > +                       return (bg == 1 ? K0_3_1 : K0_3_2) * z_c;
> > +       }
> > +       /* LBRM case - includes a division by N */
> > +       if (rv_index == 1)
> > +               return (((bg == 1 ? K0_1_1 : K0_1_2) * n_cb)
> > +                               / n) * z_c;
> > +       else if (rv_index == 2)
> > +               return (((bg == 1 ? K0_2_1 : K0_2_2) * n_cb)
> > +                               / n) * z_c;
> > +       else
> > +               return (((bg == 1 ? K0_3_1 : K0_3_2) * n_cb)
> > +                               / n) * z_c;
> > +}
> > +
> > +/* Fill in a frame control word for LDPC encoding. */
> > +static inline void
> > +acc100_fcw_le_fill(const struct rte_bbdev_enc_op *op,
> > +               struct acc100_fcw_le *fcw, int num_cb)
> > +{
> > +       fcw->qm = op->ldpc_enc.q_m;
> > +       fcw->nfiller = op->ldpc_enc.n_filler;
> > +       fcw->BG = (op->ldpc_enc.basegraph - 1);
> > +       fcw->Zc = op->ldpc_enc.z_c;
> > +       fcw->ncb = op->ldpc_enc.n_cb;
> > +       fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_enc.basegraph,
> > +                       op->ldpc_enc.rv_index);
> > +       fcw->rm_e = op->ldpc_enc.cb_params.e;
> > +       fcw->crc_select = check_bit(op->ldpc_enc.op_flags,
> > +                       RTE_BBDEV_LDPC_CRC_24B_ATTACH);
> > +       fcw->bypass_intlv = check_bit(op->ldpc_enc.op_flags,
> > +                       RTE_BBDEV_LDPC_INTERLEAVER_BYPASS);
> > +       fcw->mcb_count = num_cb;
> > +}
> > +
> > +/* Fill in a frame control word for LDPC decoding. */
> > +static inline void
> > +acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
> > +               union acc100_harq_layout_data *harq_layout)
> > +{
> > +       uint16_t harq_out_length, harq_in_length, ncb_p, k0_p, parity_offset;
> > +       uint16_t harq_index;
> > +       uint32_t l;
> > +       bool harq_prun = false;
> > +
> > +       fcw->qm = op->ldpc_dec.q_m;
> > +       fcw->nfiller = op->ldpc_dec.n_filler;
> > +       fcw->BG = (op->ldpc_dec.basegraph - 1);
> > +       fcw->Zc = op->ldpc_dec.z_c;
> > +       fcw->ncb = op->ldpc_dec.n_cb;
> > +       fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_dec.basegraph,
> > +                       op->ldpc_dec.rv_index);
> > +       if (op->ldpc_dec.code_block_mode == 1)
> > +               fcw->rm_e = op->ldpc_dec.cb_params.e;
> > +       else
> > +               fcw->rm_e = (op->ldpc_dec.tb_params.r <
> > +                               op->ldpc_dec.tb_params.cab) ?
> > +                                               op->ldpc_dec.tb_params.ea :
> > +                                               op->ldpc_dec.tb_params.eb;
> > +
> > +       fcw->hcin_en = check_bit(op->ldpc_dec.op_flags,
> > +                       RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE);
> > +       fcw->hcout_en = check_bit(op->ldpc_dec.op_flags,
> > +                       RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE);
> > +       fcw->crc_select = check_bit(op->ldpc_dec.op_flags,
> > +                       RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK);
> > +       fcw->bypass_dec = check_bit(op->ldpc_dec.op_flags,
> > +                       RTE_BBDEV_LDPC_DECODE_BYPASS);
> > +       fcw->bypass_intlv = check_bit(op->ldpc_dec.op_flags,
> > +                       RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS);
> > +       if (op->ldpc_dec.q_m == 1) {
> > +               fcw->bypass_intlv = 1;
> > +               fcw->qm = 2;
> > +       }
> > +       fcw->hcin_decomp_mode = check_bit(op->ldpc_dec.op_flags,
> > +                       RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> > +       fcw->hcout_comp_mode = check_bit(op->ldpc_dec.op_flags,
> > +                       RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> > +       fcw->llr_pack_mode = check_bit(op->ldpc_dec.op_flags,
> > +                       RTE_BBDEV_LDPC_LLR_COMPRESSION);
> > +       harq_index = op->ldpc_dec.harq_combined_output.offset /
> > +                       ACC100_HARQ_OFFSET;
> > +#ifdef ACC100_EXT_MEM
> > +       /* Limit cases when HARQ pruning is valid */
> > +       harq_prun = ((op->ldpc_dec.harq_combined_output.offset %
> > +                       ACC100_HARQ_OFFSET) == 0) &&
> > +                       (op->ldpc_dec.harq_combined_output.offset <= UINT16_MAX
> > +                       * ACC100_HARQ_OFFSET);
> > +#endif
> > +       if (fcw->hcin_en > 0) {
> > +               harq_in_length = op->ldpc_dec.harq_combined_input.length;
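> > +               /* 6-bit compressed input: scale length to its 8-bit equivalent */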
> > +               if (fcw->hcin_decomp_mode > 0)
> > +                       harq_in_length = harq_in_length * 8 / 6;
> > +               harq_in_length = RTE_ALIGN(harq_in_length, 64);
> > +               if ((harq_layout[harq_index].offset > 0) && harq_prun) {
> > +                       rte_bbdev_log_debug("HARQ IN offset unexpected for
> now\n");
> > +                       fcw->hcin_size0 = harq_layout[harq_index].size0;
> > +                       fcw->hcin_offset = harq_layout[harq_index].offset;
> > +                       fcw->hcin_size1 = harq_in_length -
> > +                                       harq_layout[harq_index].offset;
> > +               } else {
> > +                       fcw->hcin_size0 = harq_in_length;
> > +                       fcw->hcin_offset = 0;
> > +                       fcw->hcin_size1 = 0;
> > +               }
> > +       } else {
> > +               fcw->hcin_size0 = 0;
> > +               fcw->hcin_offset = 0;
> > +               fcw->hcin_size1 = 0;
> > +       }
> > +
> > +       fcw->itmax = op->ldpc_dec.iter_max;
> > +       fcw->itstop = check_bit(op->ldpc_dec.op_flags,
> > +                       RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE);
> > +       fcw->synd_precoder = fcw->itstop;
> > +       /*
> > +        * These are all implicitly set
> > +        * fcw->synd_post = 0;
> > +        * fcw->so_en = 0;
> > +        * fcw->so_bypass_rm = 0;
> > +        * fcw->so_bypass_intlv = 0;
> > +        * fcw->dec_convllr = 0;
> > +        * fcw->hcout_convllr = 0;
> > +        * fcw->hcout_size1 = 0;
> > +        * fcw->so_it = 0;
> > +        * fcw->hcout_offset = 0;
> > +        * fcw->negstop_th = 0;
> > +        * fcw->negstop_it = 0;
> > +        * fcw->negstop_en = 0;
> > +        * fcw->gain_i = 1;
> > +        * fcw->gain_h = 1;
> > +        */
> > +       if (fcw->hcout_en > 0) {
> > +               parity_offset = (op->ldpc_dec.basegraph == 1 ? 20 : 8)
> > +                       * op->ldpc_dec.z_c - op->ldpc_dec.n_filler;
> > +               k0_p = (fcw->k0 > parity_offset) ?
> > +                               fcw->k0 - op->ldpc_dec.n_filler : fcw->k0;
> > +               ncb_p = fcw->ncb - op->ldpc_dec.n_filler;
> > +               l = k0_p + fcw->rm_e;
> > +               harq_out_length = (uint16_t) fcw->hcin_size0;
> > +               harq_out_length = RTE_MIN(RTE_MAX(harq_out_length, l), ncb_p);
> > +               harq_out_length = (harq_out_length + 0x3F) & 0xFFC0;
> > +               if ((k0_p > fcw->hcin_size0 + ACC100_HARQ_OFFSET_THRESHOLD) &&
> > +                               harq_prun) {
> > +                       fcw->hcout_size0 = (uint16_t) fcw->hcin_size0;
> > +                       fcw->hcout_offset = k0_p & 0xFFC0;
> > +                       fcw->hcout_size1 = harq_out_length - fcw->hcout_offset;
> > +               } else {
> > +                       fcw->hcout_size0 = harq_out_length;
> > +                       fcw->hcout_size1 = 0;
> > +                       fcw->hcout_offset = 0;
> > +               }
> > +               harq_layout[harq_index].offset = fcw->hcout_offset;
> > +               harq_layout[harq_index].size0 = fcw->hcout_size0;
> > +       } else {
> > +               fcw->hcout_size0 = 0;
> > +               fcw->hcout_size1 = 0;
> > +               fcw->hcout_offset = 0;
> > +       }
> > +}
> > +
> > +/**
> > + * Fills descriptor with data pointers of one block type.
> > + *
> > + * @param desc
> > + *   Pointer to DMA descriptor.
> > + * @param input
> > + *   Pointer to pointer to input data which will be encoded. It can be changed
> > + *   and points to next segment in scatter-gather case.
> > + * @param offset
> > + *   Input offset in rte_mbuf structure. It is used for calculating the point
> > + *   where data is starting.
> > + * @param cb_len
> > + *   Length of currently processed Code Block
> > + * @param seg_total_left
> > + *   Indicates how many bytes are still left in the segment (mbuf) for
> > + *   further processing.
> > + * @param next_triplet
> > + *   Index for ACC100 DMA Descriptor triplet
> > + *
> > + * @return
> > + *   Index of the next triplet on success, a negative value if the lengths
> > + *   of the packet and the processed CB do not match.
> > + *
> > + */
> > +static inline int
> > +acc100_dma_fill_blk_type_in(struct acc100_dma_req_desc *desc,
> > +               struct rte_mbuf **input, uint32_t *offset, uint32_t cb_len,
> > +               uint32_t *seg_total_left, int next_triplet)
> > +{
> > +       uint32_t part_len;
> > +       struct rte_mbuf *m = *input;
> > +
> > +       part_len = (*seg_total_left < cb_len) ? *seg_total_left : cb_len;
> > +       cb_len -= part_len;
> > +       *seg_total_left -= part_len;
> > +
> > +       desc->data_ptrs[next_triplet].address =
> > +                       rte_pktmbuf_iova_offset(m, *offset);
> > +       desc->data_ptrs[next_triplet].blen = part_len;
> > +       desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
> > +       desc->data_ptrs[next_triplet].last = 0;
> > +       desc->data_ptrs[next_triplet].dma_ext = 0;
> > +       *offset += part_len;
> > +       next_triplet++;
> > +
> > +       while (cb_len > 0) {
> > +               if (next_triplet < ACC100_DMA_MAX_NUM_POINTERS &&
> > +                               m->next != NULL) {
> > +
> > +                       m = m->next;
> > +                       *seg_total_left = rte_pktmbuf_data_len(m);
> > +                       part_len = (*seg_total_left < cb_len) ?
> > +                                       *seg_total_left :
> > +                                       cb_len;
> > +                       desc->data_ptrs[next_triplet].address =
> > +                                       rte_pktmbuf_mtophys(m);
> > +                       desc->data_ptrs[next_triplet].blen = part_len;
> > +                       desc->data_ptrs[next_triplet].blkid =
> > +                                       ACC100_DMA_BLKID_IN;
> > +                       desc->data_ptrs[next_triplet].last = 0;
> > +                       desc->data_ptrs[next_triplet].dma_ext = 0;
> > +                       cb_len -= part_len;
> > +                       *seg_total_left -= part_len;
> > +                       /* Initializing offset for next segment (mbuf) */
> > +                       *offset = part_len;
> > +                       next_triplet++;
> > +               } else {
> > +                       rte_bbdev_log(ERR,
> > +                               "Some data still left for processing: "
> > +                               "data_left: %u, next_triplet: %u, next_mbuf: %p",
> > +                               cb_len, next_triplet, m->next);
> > +                       return -EINVAL;
> > +               }
> > +       }
> > +       /* Storing new mbuf as it could be changed in scatter-gather case */
> > +       *input = m;
> > +
> > +       return next_triplet;
> > +}
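
For reference, the in-block fill walks the mbuf chain, emitting one
(address, length) triplet per segment until cb_len bytes are mapped, and
fails if the chain runs out first. A simplified self-contained model of
that split (a plain seg_len[] array stands in for the mbuf chain; none of
these names are the driver API):

    #include <stdio.h>
    #include <stdint.h>

    /* Print the (segment, offset, length) triplets that cb_len bytes
     * starting at 'offset' in segment 0 would be split into.
     * Assumes offset lies within the first segment. */
    static int fill_triplets(const uint32_t *seg_len, int nsegs,
            uint32_t offset, uint32_t cb_len)
    {
        int s = 0, triplet = 1; /* triplet 0 holds the FCW in the driver */
        uint32_t left = seg_len[0] - offset;

        while (cb_len > 0) {
            uint32_t part = left < cb_len ? left : cb_len;

            printf("triplet %d: seg %d off %u len %u\n",
                    triplet++, s, offset, part);
            cb_len -= part;
            if (cb_len == 0)
                break;
            if (++s == nsegs)
                return -1; /* data left but no next segment */
            offset = 0;
            left = seg_len[s];
        }
        return triplet;
    }

    int main(void)
    {
        uint32_t segs[] = {512, 512, 256};

        fill_triplets(segs, 3, 100, 900); /* spans segments 0 and 1 */
        return 0;
    }
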
> > +
> > +/* Fills descriptor with data pointers of one block type.
> > + * Returns index of next triplet on success, other value if lengths of
> > + * output data and processed mbuf do not match.
> > + */
> > +static inline int
> > +acc100_dma_fill_blk_type_out(struct acc100_dma_req_desc *desc,
> > +               struct rte_mbuf *output, uint32_t out_offset,
> > +               uint32_t output_len, int next_triplet, int blk_id)
> > +{
> > +       desc->data_ptrs[next_triplet].address =
> > +                       rte_pktmbuf_iova_offset(output, out_offset);
> > +       desc->data_ptrs[next_triplet].blen = output_len;
> > +       desc->data_ptrs[next_triplet].blkid = blk_id;
> > +       desc->data_ptrs[next_triplet].last = 0;
> > +       desc->data_ptrs[next_triplet].dma_ext = 0;
> > +       next_triplet++;
> > +
> > +       return next_triplet;
> > +}
> > +
> > +static inline int
> > +acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
> > +               struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
> > +               struct rte_mbuf *output, uint32_t *in_offset,
> > +               uint32_t *out_offset, uint32_t *out_length,
> > +               uint32_t *mbuf_total_left, uint32_t *seg_total_left)
> > +{
> > +       int next_triplet = 1; /* FCW already done */
> > +       uint16_t K, in_length_in_bits, in_length_in_bytes;
> > +       struct rte_bbdev_op_ldpc_enc *enc = &op->ldpc_enc;
> > +
> > +       desc->word0 = ACC100_DMA_DESC_TYPE;
> > +       desc->word1 = 0; /**< Timestamp could be disabled */
> > +       desc->word2 = 0;
> > +       desc->word3 = 0;
> > +       desc->numCBs = 1;
> > +
> > +       K = (enc->basegraph == 1 ? 22 : 10) * enc->z_c;
> > +       in_length_in_bits = K - enc->n_filler;
> > +       if ((enc->op_flags & RTE_BBDEV_LDPC_CRC_24A_ATTACH) ||
> > +                       (enc->op_flags & RTE_BBDEV_LDPC_CRC_24B_ATTACH))
> > +               in_length_in_bits -= 24;
> > +       in_length_in_bytes = in_length_in_bits >> 3;
> > +
> > +       if (unlikely((*mbuf_total_left == 0) ||
> > +                       (*mbuf_total_left < in_length_in_bytes))) {
> > +               rte_bbdev_log(ERR,
> > +                               "Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
> > +                               *mbuf_total_left, in_length_in_bytes);
> > +               return -1;
> > +       }
> > +
> > +       next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
> > +                       in_length_in_bytes,
> > +                       seg_total_left, next_triplet);
> > +       if (unlikely(next_triplet < 0)) {
> > +               rte_bbdev_log(ERR,
> > +                               "Mismatch between data to process and mbuf data length in bbdev_op: %p",
> > +                               op);
> > +               return -1;
> > +       }
> > +       desc->data_ptrs[next_triplet - 1].last = 1;
> > +       desc->m2dlen = next_triplet;
> > +       *mbuf_total_left -= in_length_in_bytes;
> > +
> > +       /* Set output length */
> > +       /* Integer round up division by 8 */
> > +       *out_length = (enc->cb_params.e + 7) >> 3;
> > +
> > +       next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
> > +                       *out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
> > +       if (unlikely(next_triplet < 0)) {
> > +               rte_bbdev_log(ERR,
> > +                               "Mismatch between data to process and mbuf data length in bbdev_op: %p",
> > +                               op);
> > +               return -1;
> > +       }
> > +       op->ldpc_enc.output.length += *out_length;
> > +       *out_offset += *out_length;
> > +       desc->data_ptrs[next_triplet - 1].last = 1;
> > +       desc->data_ptrs[next_triplet - 1].dma_ext = 0;
> > +       desc->d2mlen = next_triplet - desc->m2dlen;
> > +
> > +       desc->op_addr = op;
> > +
> > +       return 0;
> > +}
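
A worked example of the length math in acc100_dma_desc_le_fill(), using
illustrative values (BG1, Z = 384, n_filler = 0, CRC24 attached,
e = 25344): K = 22 * 384 = 8448 bits, so the input payload is
(8448 - 24) / 8 = 1053 bytes and the rate-matched output is
(25344 + 7) / 8 = 3168 bytes. The mbuf_total_left check above rejects
any op whose input is shorter than that computed payload.
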
> > +
> > +static inline int
> > +acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
> > +               struct acc100_dma_req_desc *desc,
> > +               struct rte_mbuf **input, struct rte_mbuf *h_output,
> > +               uint32_t *in_offset, uint32_t *h_out_offset,
> > +               uint32_t *h_out_length, uint32_t *mbuf_total_left,
> > +               uint32_t *seg_total_left,
> > +               struct acc100_fcw_ld *fcw)
> > +{
> > +       struct rte_bbdev_op_ldpc_dec *dec = &op->ldpc_dec;
> > +       int next_triplet = 1; /* FCW already done */
> > +       uint32_t input_length;
> > +       uint16_t output_length, crc24_overlap = 0;
> > +       uint16_t sys_cols, K, h_p_size, h_np_size;
> > +       bool h_comp = check_bit(dec->op_flags,
> > +                       RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> > +
> > +       desc->word0 = ACC100_DMA_DESC_TYPE;
> > +       desc->word1 = 0; /**< Timestamp could be disabled */
> > +       desc->word2 = 0;
> > +       desc->word3 = 0;
> > +       desc->numCBs = 1;
> > +
> > +       if (check_bit(op->ldpc_dec.op_flags,
> > +                       RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP))
> > +               crc24_overlap = 24;
> > +
> > +       /* Compute some LDPC BG lengths */
> > +       input_length = dec->cb_params.e;
> > +       if (check_bit(op->ldpc_dec.op_flags,
> > +                       RTE_BBDEV_LDPC_LLR_COMPRESSION))
> > +               input_length = (input_length * 3 + 3) / 4;
> > +       sys_cols = (dec->basegraph == 1) ? 22 : 10;
> > +       K = sys_cols * dec->z_c;
> > +       output_length = K - dec->n_filler - crc24_overlap;
> > +
> > +       if (unlikely((*mbuf_total_left == 0) ||
> > +                       (*mbuf_total_left < input_length))) {
> > +               rte_bbdev_log(ERR,
> > +                               "Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
> > +                               *mbuf_total_left, input_length);
> > +               return -1;
> > +       }
> > +
> > +       next_triplet = acc100_dma_fill_blk_type_in(desc, input,
> > +                       in_offset, input_length,
> > +                       seg_total_left, next_triplet);
> > +
> > +       if (unlikely(next_triplet < 0)) {
> > +               rte_bbdev_log(ERR,
> > +                               "Mismatch between data to process and mbuf data length in bbdev_op: %p",
> > +                               op);
> > +               return -1;
> > +       }
> > +
> > +       if (check_bit(op->ldpc_dec.op_flags,
> > +                               RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
> > +               h_p_size = fcw->hcin_size0 + fcw->hcin_size1;
> > +               if (h_comp)
> > +                       h_p_size = (h_p_size * 3 + 3) / 4;
> > +               desc->data_ptrs[next_triplet].address =
> > +                               dec->harq_combined_input.offset;
> > +               desc->data_ptrs[next_triplet].blen = h_p_size;
> > +               desc->data_ptrs[next_triplet].blkid =
> > +                               ACC100_DMA_BLKID_IN_HARQ;
> > +               desc->data_ptrs[next_triplet].dma_ext = 1;
> > +#ifndef ACC100_EXT_MEM
> > +               acc100_dma_fill_blk_type_out(
> > +                               desc,
> > +                               op->ldpc_dec.harq_combined_input.data,
> > +                               op->ldpc_dec.harq_combined_input.offset,
> > +                               h_p_size,
> > +                               next_triplet,
> > +                               ACC100_DMA_BLKID_IN_HARQ);
> > +#endif
> > +               next_triplet++;
> > +       }
> > +
> > +       desc->data_ptrs[next_triplet - 1].last = 1;
> > +       desc->m2dlen = next_triplet;
> > +       *mbuf_total_left -= input_length;
> > +
> > +       next_triplet = acc100_dma_fill_blk_type_out(desc, h_output,
> > +                       *h_out_offset, output_length >> 3, next_triplet,
> > +                       ACC100_DMA_BLKID_OUT_HARD);
> > +       if (unlikely(next_triplet < 0)) {
> > +               rte_bbdev_log(ERR,
> > +                               "Mismatch between data to process and mbuf data length in bbdev_op: %p",
> > +                               op);
> > +               return -1;
> > +       }
> > +
> > +       if (check_bit(op->ldpc_dec.op_flags,
> > +                               RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
> > +               /* Pruned size of the HARQ */
> > +               h_p_size = fcw->hcout_size0 + fcw->hcout_size1;
> > +               /* Non-Pruned size of the HARQ */
> > +               h_np_size = fcw->hcout_offset > 0 ?
> > +                               fcw->hcout_offset + fcw->hcout_size1 :
> > +                               h_p_size;
> > +               if (h_comp) {
> > +                       h_np_size = (h_np_size * 3 + 3) / 4;
> > +                       h_p_size = (h_p_size * 3 + 3) / 4;
> > +               }
> > +               dec->harq_combined_output.length = h_np_size;
> > +               desc->data_ptrs[next_triplet].address =
> > +                               dec->harq_combined_output.offset;
> > +               desc->data_ptrs[next_triplet].blen = h_p_size;
> > +               desc->data_ptrs[next_triplet].blkid =
> > +                               ACC100_DMA_BLKID_OUT_HARQ;
> > +               desc->data_ptrs[next_triplet].dma_ext = 1;
> > +#ifndef ACC100_EXT_MEM
> > +               acc100_dma_fill_blk_type_out(
> > +                               desc,
> > +                               dec->harq_combined_output.data,
> > +                               dec->harq_combined_output.offset,
> > +                               h_p_size,
> > +                               next_triplet,
> > +                               ACC100_DMA_BLKID_OUT_HARQ);
> > +#endif
> > +               next_triplet++;
> > +       }
> > +
> > +       *h_out_length = output_length >> 3;
> > +       dec->hard_output.length += *h_out_length;
> > +       *h_out_offset += *h_out_length;
> > +       desc->data_ptrs[next_triplet - 1].last = 1;
> > +       desc->d2mlen = next_triplet - desc->m2dlen;
> > +
> > +       desc->op_addr = op;
> > +
> > +       return 0;
> > +}
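
To make the HARQ output sizes concrete (illustrative values): with
hcout_offset = 0 and hcout_size1 = 0, the non-pruned size equals the
pruned size h_p_size = hcout_size0, say 6144 bytes; with a prune offset
of 2048 and hcout_size1 = 4096, the pruned size is size0 + size1 while
the non-pruned size is offset + size1 = 6144. When 6-bit compression is
enabled both are scaled by ceil(3x/4), e.g. (6144 * 3 + 3) / 4 = 4608.
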
> > +
> > +static inline void
> > +acc100_dma_desc_ld_update(struct rte_bbdev_dec_op *op,
> > +               struct acc100_dma_req_desc *desc,
> > +               struct rte_mbuf *input, struct rte_mbuf *h_output,
> > +               uint32_t *in_offset, uint32_t *h_out_offset,
> > +               uint32_t *h_out_length,
> > +               union acc100_harq_layout_data *harq_layout)
> > +{
> > +       int next_triplet = 1; /* FCW already done */
> > +       desc->data_ptrs[next_triplet].address =
> > +                       rte_pktmbuf_iova_offset(input, *in_offset);
> > +       next_triplet++;
> > +
> > +       if (check_bit(op->ldpc_dec.op_flags,
> > +                               RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
> > +               struct rte_bbdev_op_data hi = op->ldpc_dec.harq_combined_input;
> > +               desc->data_ptrs[next_triplet].address = hi.offset;
> > +#ifndef ACC100_EXT_MEM
> > +               desc->data_ptrs[next_triplet].address =
> > +                               rte_pktmbuf_iova_offset(hi.data, hi.offset);
> > +#endif
> > +               next_triplet++;
> > +       }
> > +
> > +       desc->data_ptrs[next_triplet].address =
> > +                       rte_pktmbuf_iova_offset(h_output, *h_out_offset);
> > +       *h_out_length = desc->data_ptrs[next_triplet].blen;
> > +       next_triplet++;
> > +
> > +       if (check_bit(op->ldpc_dec.op_flags,
> > +                               RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
> > +               desc->data_ptrs[next_triplet].address =
> > +                               op->ldpc_dec.harq_combined_output.offset;
> > +               /* Adjust based on previous operation */
> > +               struct rte_bbdev_dec_op *prev_op = desc->op_addr;
> > +               op->ldpc_dec.harq_combined_output.length =
> > +                               prev_op->ldpc_dec.harq_combined_output.length;
> > +               int16_t hq_idx = op->ldpc_dec.harq_combined_output.offset /
> > +                               ACC100_HARQ_OFFSET;
> > +               int16_t prev_hq_idx =
> > +                               prev_op->ldpc_dec.harq_combined_output.offset
> > +                               / ACC100_HARQ_OFFSET;
> > +               harq_layout[hq_idx].val = harq_layout[prev_hq_idx].val;
> > +#ifndef ACC100_EXT_MEM
> > +               struct rte_bbdev_op_data ho =
> > +                               op->ldpc_dec.harq_combined_output;
> > +               desc->data_ptrs[next_triplet].address =
> > +                               rte_pktmbuf_iova_offset(ho.data, ho.offset);
> > +#endif
> > +               next_triplet++;
> > +       }
> > +
> > +       op->ldpc_dec.hard_output.length += *h_out_length;
> > +       desc->op_addr = op;
> > +}
> > +
> > +
> > +/* Enqueue a number of operations to HW and update software rings */
> > +static inline void
> > +acc100_dma_enqueue(struct acc100_queue *q, uint16_t n,
> > +               struct rte_bbdev_stats *queue_stats)
> > +{
> > +       union acc100_enqueue_reg_fmt enq_req;
> > +#ifdef RTE_BBDEV_OFFLOAD_COST
> > +       uint64_t start_time = 0;
> > +       queue_stats->acc_offload_cycles = 0;
> > +#else
> > +       RTE_SET_USED(queue_stats);
> > +#endif
> > +
> > +       enq_req.val = 0;
> > +       /* Setting offset, 100b for 256 DMA Desc */
> > +       enq_req.addr_offset = ACC100_DESC_OFFSET;
> > +
> > +       /* Split ops into batches */
> > +       do {
> > +               union acc100_dma_desc *desc;
> > +               uint16_t enq_batch_size;
> > +               uint64_t offset;
> > +               rte_iova_t req_elem_addr;
> > +
> > +               enq_batch_size = RTE_MIN(n, MAX_ENQ_BATCH_SIZE);
> > +
> > +               /* Set flag on last descriptor in a batch */
> > +               desc = q->ring_addr + ((q->sw_ring_head + enq_batch_size - 1) &
> > +                               q->sw_ring_wrap_mask);
> > +               desc->req.last_desc_in_batch = 1;
> > +
> > +               /* Calculate the 1st descriptor's address */
> > +               offset = ((q->sw_ring_head & q->sw_ring_wrap_mask) *
> > +                               sizeof(union acc100_dma_desc));
> > +               req_elem_addr = q->ring_addr_phys + offset;
> > +
> > +               /* Fill enqueue struct */
> > +               enq_req.num_elem = enq_batch_size;
> > +               /* low 6 bits are not needed */
> > +               enq_req.req_elem_addr = (uint32_t)(req_elem_addr >> 6);
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +               rte_memdump(stderr, "Req sdone", desc, sizeof(*desc));
> > +#endif
> > +               rte_bbdev_log_debug(
> > +                               "Enqueue %u reqs (phys %#"PRIx64") to reg %p",
> > +                               enq_batch_size,
> > +                               req_elem_addr,
> > +                               (void *)q->mmio_reg_enqueue);
> > +
> > +               rte_wmb();
> > +
> > +#ifdef RTE_BBDEV_OFFLOAD_COST
> > +               /* Start time measurement for enqueue function offload. */
> > +               start_time = rte_rdtsc_precise();
> > +#endif
> > +               rte_bbdev_log(DEBUG, "Debug : MMIO Enqueue");
> > +               mmio_write(q->mmio_reg_enqueue, enq_req.val);
> > +
> > +#ifdef RTE_BBDEV_OFFLOAD_COST
> > +               queue_stats->acc_offload_cycles +=
> > +                               rte_rdtsc_precise() - start_time;
> > +#endif
> > +
> > +               q->aq_enqueued++;
> > +               q->sw_ring_head += enq_batch_size;
> > +               n -= enq_batch_size;
> > +
> > +       } while (n);
> > +}
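
One detail worth spelling out in acc100_dma_enqueue(): descriptors are
64-byte aligned, so only req_elem_addr >> 6 is written to the doorbell
register, per the "low 6 bits are not needed" comment. For example, a
first descriptor at IOVA 0x12345680 is encoded as 0x48D15A and the
device recovers the address by shifting left by 6 again. num_elem
carries the batch size, capped at MAX_ENQ_BATCH_SIZE per doorbell write.
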
> > +
> > +/* Enqueue a number of encode operations for ACC100 device in CB mode */
> > +static inline int
> > +enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
> > +               uint16_t total_enqueued_cbs, int16_t num)
> > +{
> > +       union acc100_dma_desc *desc = NULL;
> > +       uint32_t out_length;
> > +       struct rte_mbuf *output_head, *output;
> > +       int i, next_triplet;
> > +       uint16_t  in_length_in_bytes;
> > +       struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
> > +
> > +       uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> > +                       & q->sw_ring_wrap_mask);
> > +       desc = q->ring_addr + desc_idx;
> > +       acc100_fcw_le_fill(ops[0], &desc->req.fcw_le, num);
> > +
> > +       /* This could be done at polling time */
> > +       desc->req.word0 = ACC100_DMA_DESC_TYPE;
> > +       desc->req.word1 = 0; /**< Timestamp could be disabled */
> > +       desc->req.word2 = 0;
> > +       desc->req.word3 = 0;
> > +       desc->req.numCBs = num;
> > +
> > +       in_length_in_bytes = ops[0]->ldpc_enc.input.data->data_len;
> > +       out_length = (enc->cb_params.e + 7) >> 3;
> > +       desc->req.m2dlen = 1 + num;
> > +       desc->req.d2mlen = num;
> > +       next_triplet = 1;
> > +
> > +       for (i = 0; i < num; i++) {
> > +               desc->req.data_ptrs[next_triplet].address =
> > +                       rte_pktmbuf_iova_offset(ops[i]->ldpc_enc.input.data, 0);
> > +               desc->req.data_ptrs[next_triplet].blen = in_length_in_bytes;
> > +               next_triplet++;
> > +               desc->req.data_ptrs[next_triplet].address =
> > +                               rte_pktmbuf_iova_offset(
> > +                               ops[i]->ldpc_enc.output.data, 0);
> > +               desc->req.data_ptrs[next_triplet].blen = out_length;
> > +               next_triplet++;
> > +               ops[i]->ldpc_enc.output.length = out_length;
> > +               output_head = output = ops[i]->ldpc_enc.output.data;
> > +               mbuf_append(output_head, output, out_length);
> > +               output->data_len = out_length;
> > +       }
> > +
> > +       desc->req.op_addr = ops[0];
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +       rte_memdump(stderr, "FCW", &desc->req.fcw_le,
> > +                       sizeof(desc->req.fcw_le) - 8);
> > +       rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> > +#endif
> > +
> > +       /* Multiple CBs (num ops) were successfully prepared to enqueue */
> > +       return num;
> > +}
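
For the muxed path above, a single descriptor carries one FCW triplet
plus one input pointer per CB toward the device (m2dlen = 1 + num) and
one output pointer per CB back (d2mlen = num); with num = 4 that uses
triplets 0 through 8 of data_ptrs. All muxed CBs inherit the input
length of ops[0], which the check_mux() test further down is expected
to have guaranteed.
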
> > +
> > +/* Enqueue one encode operation for ACC100 device in CB mode */
> > +static inline int
> > +enqueue_ldpc_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
> > +               uint16_t total_enqueued_cbs)
> > +{
> > +       union acc100_dma_desc *desc = NULL;
> > +       int ret;
> > +       uint32_t in_offset, out_offset, out_length, mbuf_total_left,
> > +               seg_total_left;
> > +       struct rte_mbuf *input, *output_head, *output;
> > +
> > +       uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> > +                       & q->sw_ring_wrap_mask);
> > +       desc = q->ring_addr + desc_idx;
> > +       acc100_fcw_le_fill(op, &desc->req.fcw_le, 1);
> > +
> > +       input = op->ldpc_enc.input.data;
> > +       output_head = output = op->ldpc_enc.output.data;
> > +       in_offset = op->ldpc_enc.input.offset;
> > +       out_offset = op->ldpc_enc.output.offset;
> > +       out_length = 0;
> > +       mbuf_total_left = op->ldpc_enc.input.length;
> > +       seg_total_left = rte_pktmbuf_data_len(op->ldpc_enc.input.data)
> > +                       - in_offset;
> > +
> > +       ret = acc100_dma_desc_le_fill(op, &desc->req, &input, output,
> > +                       &in_offset, &out_offset, &out_length, &mbuf_total_left,
> > +                       &seg_total_left);
> > +
> > +       if (unlikely(ret < 0))
> > +               return ret;
> > +
> > +       mbuf_append(output_head, output, out_length);
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +       rte_memdump(stderr, "FCW", &desc->req.fcw_le,
> > +                       sizeof(desc->req.fcw_le) - 8);
> > +       rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> > +
> > +       /* Check if any data left after processing one CB */
> > +       if (mbuf_total_left != 0) {
> > +               rte_bbdev_log(ERR,
> > +                               "Some data still left after processing one CB: mbuf_total_left = %u",
> > +                               mbuf_total_left);
> > +               return -EINVAL;
> > +       }
> > +#endif
> > +       /* One CB (one op) was successfully prepared to enqueue */
> > +       return 1;
> > +}
> > +
> > +/* Enqueue one decode operation for ACC100 device in CB mode */
> > +static inline int
> > +enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
> > +               uint16_t total_enqueued_cbs, bool same_op)
> > +{
> > +       int ret;
> > +
> > +       union acc100_dma_desc *desc;
> > +       uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> > +                       & q->sw_ring_wrap_mask);
> > +       desc = q->ring_addr + desc_idx;
> > +       struct rte_mbuf *input, *h_output_head, *h_output;
> > +       uint32_t in_offset, h_out_offset, h_out_length, mbuf_total_left;
> > +       input = op->ldpc_dec.input.data;
> > +       h_output_head = h_output = op->ldpc_dec.hard_output.data;
> > +       in_offset = op->ldpc_dec.input.offset;
> > +       h_out_offset = op->ldpc_dec.hard_output.offset;
> > +       mbuf_total_left = op->ldpc_dec.input.length;
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +       if (unlikely(input == NULL)) {
> > +               rte_bbdev_log(ERR, "Invalid mbuf pointer");
> > +               return -EFAULT;
> > +       }
> > +#endif
> > +       union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
> > +
> > +       if (same_op) {
> > +               union acc100_dma_desc *prev_desc;
> > +               desc_idx = ((q->sw_ring_head + total_enqueued_cbs - 1)
> > +                               & q->sw_ring_wrap_mask);
> > +               prev_desc = q->ring_addr + desc_idx;
> > +               uint8_t *prev_ptr = (uint8_t *) prev_desc;
> > +               uint8_t *new_ptr = (uint8_t *) desc;
> > +               /* Copy first 4 words and BDESCs */
> > +               rte_memcpy(new_ptr, prev_ptr, 16);
> > +               rte_memcpy(new_ptr + 36, prev_ptr + 36, 40);
> > +               desc->req.op_addr = prev_desc->req.op_addr;
> > +               /* Copy FCW */
> > +               rte_memcpy(new_ptr + ACC100_DESC_FCW_OFFSET,
> > +                               prev_ptr + ACC100_DESC_FCW_OFFSET,
> > +                               ACC100_FCW_LD_BLEN);
> > +               acc100_dma_desc_ld_update(op, &desc->req, input, h_output,
> > +                               &in_offset, &h_out_offset,
> > +                               &h_out_length, harq_layout);
> > +       } else {
> > +               struct acc100_fcw_ld *fcw;
> > +               uint32_t seg_total_left;
> > +               fcw = &desc->req.fcw_ld;
> > +               acc100_fcw_ld_fill(op, fcw, harq_layout);
> > +
> > +               /* Special handling when overusing mbuf */
> > +               if (fcw->rm_e < MAX_E_MBUF)
> > +                       seg_total_left = rte_pktmbuf_data_len(input)
> > +                                       - in_offset;
> > +               else
> > +                       seg_total_left = fcw->rm_e;
> > +
> > +               ret = acc100_dma_desc_ld_fill(op, &desc->req, &input, h_output,
> > +                               &in_offset, &h_out_offset,
> > +                               &h_out_length, &mbuf_total_left,
> > +                               &seg_total_left, fcw);
> > +               if (unlikely(ret < 0))
> > +                       return ret;
> > +       }
> > +
> > +       /* Hard output */
> > +       mbuf_append(h_output_head, h_output, h_out_length);
> > +#ifndef ACC100_EXT_MEM
> > +       if (op->ldpc_dec.harq_combined_output.length > 0) {
> > +               /* Push the HARQ output into host memory */
> > +               struct rte_mbuf *hq_output_head, *hq_output;
> > +               hq_output_head = op->ldpc_dec.harq_combined_output.data;
> > +               hq_output = op->ldpc_dec.harq_combined_output.data;
> > +               mbuf_append(hq_output_head, hq_output,
> > +                               op->ldpc_dec.harq_combined_output.length);
> > +       }
> > +#endif
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +       rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
> > +                       sizeof(desc->req.fcw_ld) - 8);
> > +       rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> > +#endif
> > +
> > +       /* One CB (one op) was successfully prepared to enqueue */
> > +       return 1;
> > +}
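
On the same_op fast path above: when consecutive ops share all decode
parameters, the driver clones the previous descriptor instead of
rebuilding it, copying bytes 0-15 (the four configuration words), bytes
36-75 (the BDESC pointer triplets) and the FCW block at
ACC100_DESC_FCW_OFFSET, then patching only the per-op data pointers in
acc100_dma_desc_ld_update(). This trades a copy of roughly 100 bytes
for skipping the full FCW and length computation.
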
> > +
> > +
> > +/* Enqueue one decode operation for ACC100 device in TB mode */
> > +static inline int
> > +enqueue_ldpc_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
> > +               uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
> > +{
> > +       union acc100_dma_desc *desc = NULL;
> > +       int ret;
> > +       uint8_t r, c;
> > +       uint32_t in_offset, h_out_offset,
> > +               h_out_length, mbuf_total_left, seg_total_left;
> > +       struct rte_mbuf *input, *h_output_head, *h_output;
> > +       uint16_t current_enqueued_cbs = 0;
> > +
> > +       uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> > +                       & q->sw_ring_wrap_mask);
> > +       desc = q->ring_addr + desc_idx;
> > +       uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
> > +       union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
> > +       acc100_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout);
> > +
> > +       input = op->ldpc_dec.input.data;
> > +       h_output_head = h_output = op->ldpc_dec.hard_output.data;
> > +       in_offset = op->ldpc_dec.input.offset;
> > +       h_out_offset = op->ldpc_dec.hard_output.offset;
> > +       h_out_length = 0;
> > +       mbuf_total_left = op->ldpc_dec.input.length;
> > +       c = op->ldpc_dec.tb_params.c;
> > +       r = op->ldpc_dec.tb_params.r;
> > +
> > +       while (mbuf_total_left > 0 && r < c) {
> > +
> > +               seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
> > +
> > +               /* Set up DMA descriptor */
> > +               desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
> > +                               & q->sw_ring_wrap_mask);
> > +               desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
> > +               desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
> > +               ret = acc100_dma_desc_ld_fill(op, &desc->req, &input,
> > +                               h_output, &in_offset, &h_out_offset,
> > +                               &h_out_length,
> > +                               &mbuf_total_left, &seg_total_left,
> > +                               &desc->req.fcw_ld);
> > +
> > +               if (unlikely(ret < 0))
> > +                       return ret;
> > +
> > +               /* Hard output */
> > +               mbuf_append(h_output_head, h_output, h_out_length);
> > +
> > +               /* Set total number of CBs in TB */
> > +               desc->req.cbs_in_tb = cbs_in_tb;
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +               rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
> > +                               sizeof(desc->req.fcw_ld) - 8);
> > +               rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> > +#endif
> > +
> > +               if (seg_total_left == 0) {
> > +                       /* Go to the next mbuf */
> > +                       input = input->next;
> > +                       in_offset = 0;
> > +                       h_output = h_output->next;
> > +                       h_out_offset = 0;
> > +               }
> > +               total_enqueued_cbs++;
> > +               current_enqueued_cbs++;
> > +               r++;
> > +       }
> > +
> > +       if (unlikely(desc == NULL))
> > +               return current_enqueued_cbs;
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +       /* Check if any CBs left for processing */
> > +       if (mbuf_total_left != 0) {
> > +               rte_bbdev_log(ERR,
> > +                               "Some data still left for processing: mbuf_total_left = %u",
> > +                               mbuf_total_left);
> > +               return -EINVAL;
> > +       }
> > +#endif
> > +       /* Set SDone on last CB descriptor for TB mode */
> > +       desc->req.sdone_enable = 1;
> > +       desc->req.irq_enable = q->irq_enable;
> > +
> > +       return current_enqueued_cbs;
> > +}
> > +
> > +
> > +/* Calculates number of CBs in processed encoder TB based on 'r' and input
> > + * length.
> > + */
> > +static inline uint8_t
> > +get_num_cbs_in_tb_enc(struct rte_bbdev_op_turbo_enc *turbo_enc)
> > +{
> > +       uint8_t c, c_neg, r, crc24_bits = 0;
> > +       uint16_t k, k_neg, k_pos;
> > +       uint8_t cbs_in_tb = 0;
> > +       int32_t length;
> > +
> > +       length = turbo_enc->input.length;
> > +       r = turbo_enc->tb_params.r;
> > +       c = turbo_enc->tb_params.c;
> > +       c_neg = turbo_enc->tb_params.c_neg;
> > +       k_neg = turbo_enc->tb_params.k_neg;
> > +       k_pos = turbo_enc->tb_params.k_pos;
> > +       crc24_bits = 0;
> > +       if (check_bit(turbo_enc->op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH))
> > +               crc24_bits = 24;
> > +       while (length > 0 && r < c) {
> > +               k = (r < c_neg) ? k_neg : k_pos;
> > +               length -= (k - crc24_bits) >> 3;
> > +               r++;
> > +               cbs_in_tb++;
> > +       }
> > +
> > +       return cbs_in_tb;
> > +}
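
A quick standalone check of the CB-count logic (the k/c values below are
illustrative, chosen so the TB length is an exact fit):

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        /* TB split into c = 3 CBs; the first c_neg = 1 CBs use k_neg. */
        uint16_t k_neg = 4096, k_pos = 4160, k;
        uint8_t c = 3, c_neg = 1, r = 0, cbs = 0;
        int32_t length = (4096 - 24) / 8 + 2 * ((4160 - 24) / 8);

        while (length > 0 && r < c) {
            k = (r < c_neg) ? k_neg : k_pos;
            length -= (k - 24) >> 3; /* 24-bit CRC per CB */
            r++;
            cbs++;
        }
        printf("cbs_in_tb = %u\n", cbs); /* prints 3 */
        return 0;
    }
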
> > +
> > +/* Calculates number of CBs in processed decoder TB based on 'r' and input
> > + * length.
> > + */
> > +static inline uint16_t
> > +get_num_cbs_in_tb_dec(struct rte_bbdev_op_turbo_dec *turbo_dec)
> > +{
> > +       uint8_t c, c_neg, r = 0;
> > +       uint16_t kw, k, k_neg, k_pos, cbs_in_tb = 0;
> > +       int32_t length;
> > +
> > +       length = turbo_dec->input.length;
> > +       r = turbo_dec->tb_params.r;
> > +       c = turbo_dec->tb_params.c;
> > +       c_neg = turbo_dec->tb_params.c_neg;
> > +       k_neg = turbo_dec->tb_params.k_neg;
> > +       k_pos = turbo_dec->tb_params.k_pos;
> > +       while (length > 0 && r < c) {
> > +               k = (r < c_neg) ? k_neg : k_pos;
> > +               kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
> > +               length -= kw;
> > +               r++;
> > +               cbs_in_tb++;
> > +       }
> > +
> > +       return cbs_in_tb;
> > +}
> > +
> > +/* Calculates number of CBs in processed decoder TB based on 'r' and input
> > + * length.
> > + */
> > +static inline uint16_t
> > +get_num_cbs_in_tb_ldpc_dec(struct rte_bbdev_op_ldpc_dec *ldpc_dec)
> > +{
> > +       uint16_t r, cbs_in_tb = 0;
> > +       int32_t length = ldpc_dec->input.length;
> > +       r = ldpc_dec->tb_params.r;
> > +       while (length > 0 && r < ldpc_dec->tb_params.c) {
> > +               length -=  (r < ldpc_dec->tb_params.cab) ?
> > +                               ldpc_dec->tb_params.ea :
> > +                               ldpc_dec->tb_params.eb;
> > +               r++;
> > +               cbs_in_tb++;
> > +       }
> > +       return cbs_in_tb;
> > +}
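
Worked example for the LDPC variant: with tb_params c = 2, cab = 1,
ea = 1000, eb = 1200 and input.length = 2200, the first CB consumes ea
and the second eb, giving exactly 2 CBs; a shorter input such as 1500
still counts 2, since the second subtraction simply drives length
negative.
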
> > +
> > +/* Check if we can mux encode operations with a common FCW */
> > +static inline bool
> > +check_mux(struct rte_bbdev_enc_op **ops, uint16_t num) {
> > +       uint16_t i;
> > +       if (num == 1)
> > +               return false;
> > +       for (i = 1; i < num; ++i) {
> > +               /* Only mux compatible code blocks */
> > +               if (memcmp((uint8_t *)(&ops[i]->ldpc_enc) + ENC_OFFSET,
> > +                               (uint8_t *)(&ops[0]->ldpc_enc) + ENC_OFFSET,
> > +                               CMP_ENC_SIZE) != 0)
> > +                       return false;
> > +       }
> > +       return true;
> > +}
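
The mux test is a raw memcmp over the tail of the ldpc_enc structure,
skipping the leading per-op fields (the data pointers) via ENC_OFFSET. A
reduced sketch of the idea (enc_params is a stand-in struct, not the
bbdev layout; ENC_OFFSET and CMP_ENC_SIZE come from the driver header):

    #include <stdbool.h>
    #include <string.h>
    #include <stdint.h>

    /* Only the shared code-block parameters take part in the compare;
     * per-op input/output pointers are deliberately excluded. */
    struct enc_params {
        uint32_t e, z_c, n_filler, op_flags;
    };

    static bool can_mux(const struct enc_params *a,
            const struct enc_params *b)
    {
        return memcmp(a, b, sizeof(*a)) == 0;
    }
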
> > +
> > +/* Enqueue encode operations for ACC100 device in CB mode. */
> > +static inline uint16_t
> > +acc100_enqueue_ldpc_enc_cb(struct rte_bbdev_queue_data *q_data,
> > +               struct rte_bbdev_enc_op **ops, uint16_t num)
> > +{
> > +       struct acc100_queue *q = q_data->queue_private;
> > +       int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> > +       uint16_t i = 0;
> > +       union acc100_dma_desc *desc;
> > +       int ret, desc_idx = 0;
> > +       int16_t enq, left = num;
> > +
> > +       while (left > 0) {
> > +               if (unlikely(avail - 1 < 0))
> > +                       break;
> > +               avail--;
> > +               enq = RTE_MIN(left, MUX_5GDL_DESC);
> > +               if (check_mux(&ops[i], enq)) {
> > +                       ret = enqueue_ldpc_enc_n_op_cb(q, &ops[i],
> > +                                       desc_idx, enq);
> > +                       if (ret < 0)
> > +                               break;
> > +                       i += enq;
> > +               } else {
> > +                       ret = enqueue_ldpc_enc_one_op_cb(q, ops[i], desc_idx);
> > +                       if (ret < 0)
> > +                               break;
> > +                       i++;
> > +               }
> > +               desc_idx++;
> > +               left = num - i;
> > +       }
> > +
> > +       if (unlikely(i == 0))
> > +               return 0; /* Nothing to enqueue */
> > +
> > +       /* Set SDone in last CB in enqueued ops for CB mode */
> > +       desc = q->ring_addr + ((q->sw_ring_head + desc_idx - 1)
> > +                       & q->sw_ring_wrap_mask);
> > +       desc->req.sdone_enable = 1;
> > +       desc->req.irq_enable = q->irq_enable;
> > +
> > +       acc100_dma_enqueue(q, desc_idx, &q_data->queue_stats);
> > +
> > +       /* Update stats */
> > +       q_data->queue_stats.enqueued_count += i;
> > +       q_data->queue_stats.enqueue_err_count += num - i;
> > +
> > +       return i;
> > +}
> > +
> > +/* Enqueue encode operations for ACC100 device. */
> > +static uint16_t
> > +acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
> > +               struct rte_bbdev_enc_op **ops, uint16_t num)
> > +{
> > +       if (unlikely(num == 0))
> > +               return 0;
> > +       return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
> > +}
> > +
> > +/* Check if we can mux decode operations with a common FCW */
> > +static inline bool
> > +cmp_ldpc_dec_op(struct rte_bbdev_dec_op **ops) {
> > +       /* Only mux compatible code blocks */
> > +       if (memcmp((uint8_t *)(&ops[0]->ldpc_dec) + DEC_OFFSET,
> > +                       (uint8_t *)(&ops[1]->ldpc_dec) +
> > +                       DEC_OFFSET, CMP_DEC_SIZE) != 0) {
> > +               return false;
> > +       } else
> > +               return true;
> > +}
> > +
> > +
> > +/* Enqueue decode operations for ACC100 device in TB mode */
> > +static uint16_t
> > +acc100_enqueue_ldpc_dec_tb(struct rte_bbdev_queue_data *q_data,
> > +               struct rte_bbdev_dec_op **ops, uint16_t num)
> > +{
> > +       struct acc100_queue *q = q_data->queue_private;
> > +       int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> > +       uint16_t i, enqueued_cbs = 0;
> > +       uint8_t cbs_in_tb;
> > +       int ret;
> > +
> > +       for (i = 0; i < num; ++i) {
> > +               cbs_in_tb = get_num_cbs_in_tb_ldpc_dec(&ops[i]->ldpc_dec);
> > +               /* Check if there is space available for further processing */
> > +               if (unlikely(avail - cbs_in_tb < 0))
> > +                       break;
> > +               avail -= cbs_in_tb;
> > +
> > +               ret = enqueue_ldpc_dec_one_op_tb(q, ops[i],
> > +                               enqueued_cbs, cbs_in_tb);
> > +               if (ret < 0)
> > +                       break;
> > +               enqueued_cbs += ret;
> > +       }
> > +
> > +       acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
> > +
> > +       /* Update stats */
> > +       q_data->queue_stats.enqueued_count += i;
> > +       q_data->queue_stats.enqueue_err_count += num - i;
> > +       return i;
> > +}
> > +
> > +/* Enqueue decode operations for ACC100 device in CB mode */
> > +static uint16_t
> > +acc100_enqueue_ldpc_dec_cb(struct rte_bbdev_queue_data *q_data,
> > +               struct rte_bbdev_dec_op **ops, uint16_t num)
> > +{
> > +       struct acc100_queue *q = q_data->queue_private;
> > +       int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> > +       uint16_t i;
> > +       union acc100_dma_desc *desc;
> > +       int ret;
> > +       bool same_op = false;
> > +       for (i = 0; i < num; ++i) {
> > +               /* Check if there is space available for further processing */
> > +               if (unlikely(avail - 1 < 0))
> > +                       break;
> > +               avail -= 1;
> > +
> > +               if (i > 0)
> > +                       same_op = cmp_ldpc_dec_op(&ops[i-1]);
> > +               rte_bbdev_log(INFO, "Op %d %d %d %d %d %d %d %d %d %d %d %d\n",
> > +                       i, ops[i]->ldpc_dec.op_flags, ops[i]->ldpc_dec.rv_index,
> > +                       ops[i]->ldpc_dec.iter_max, ops[i]->ldpc_dec.iter_count,
> > +                       ops[i]->ldpc_dec.basegraph, ops[i]->ldpc_dec.z_c,
> > +                       ops[i]->ldpc_dec.n_cb, ops[i]->ldpc_dec.q_m,
> > +                       ops[i]->ldpc_dec.n_filler, ops[i]->ldpc_dec.cb_params.e,
> > +                       same_op);
> > +               ret = enqueue_ldpc_dec_one_op_cb(q, ops[i], i, same_op);
> > +               if (ret < 0)
> > +                       break;
> > +       }
> > +
> > +       if (unlikely(i == 0))
> > +               return 0; /* Nothing to enqueue */
> > +
> > +       /* Set SDone in last CB in enqueued ops for CB mode */
> > +       desc = q->ring_addr + ((q->sw_ring_head + i - 1)
> > +                       & q->sw_ring_wrap_mask);
> > +
> > +       desc->req.sdone_enable = 1;
> > +       desc->req.irq_enable = q->irq_enable;
> > +
> > +       acc100_dma_enqueue(q, i, &q_data->queue_stats);
> > +
> > +       /* Update stats */
> > +       q_data->queue_stats.enqueued_count += i;
> > +       q_data->queue_stats.enqueue_err_count += num - i;
> > +       return i;
> > +}
> > +
> > +/* Enqueue decode operations for ACC100 device. */
> > +static uint16_t
> > +acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
> > +               struct rte_bbdev_dec_op **ops, uint16_t num)
> > +{
> > +       struct acc100_queue *q = q_data->queue_private;
> > +       int32_t aq_avail = q->aq_depth +
> > +                       (q->aq_dequeued - q->aq_enqueued) / 128;
> > +
> > +       if (unlikely((aq_avail == 0) || (num == 0)))
> > +               return 0;
> > +
> > +       if (ops[0]->ldpc_dec.code_block_mode == 0)
> > +               return acc100_enqueue_ldpc_dec_tb(q_data, ops, num);
> > +       else
> > +               return acc100_enqueue_ldpc_dec_cb(q_data, ops, num);
> > +}
> > +
> > +
> > +/* Dequeue one encode operation from ACC100 device in CB mode */
> > +static inline int
> > +dequeue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
> > +               uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
> > +{
> > +       union acc100_dma_desc *desc, atom_desc;
> > +       union acc100_dma_rsp_desc rsp;
> > +       struct rte_bbdev_enc_op *op;
> > +       int i;
> > +
> > +       desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
> > +                       & q->sw_ring_wrap_mask);
> > +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> > +                       __ATOMIC_RELAXED);
> > +
> > +       /* Check fdone bit */
> > +       if (!(atom_desc.rsp.val & ACC100_FDONE))
> > +               return -1;
> > +
> > +       rsp.val = atom_desc.rsp.val;
> > +       rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
> > +
> > +       /* Dequeue */
> > +       op = desc->req.op_addr;
> > +
> > +       /* Clearing status, it will be set based on response */
> > +       op->status = 0;
> > +
> > +       op->status |= ((rsp.input_err)
> > +                       ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> > +       op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> > +       op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> > +
> > +       if (desc->req.last_desc_in_batch) {
> > +               (*aq_dequeued)++;
> > +               desc->req.last_desc_in_batch = 0;
> > +       }
> > +       desc->rsp.val = ACC100_DMA_DESC_TYPE;
> > +       desc->rsp.add_info_0 = 0; /* Reserved bits */
> > +       desc->rsp.add_info_1 = 0; /* Reserved bits */
> > +
> > +       /* Flag that the muxing caused loss of opaque data */
> > +       op->opaque_data = (void *)-1;
> > +       for (i = 0; i < desc->req.numCBs; i++)
> > +               ref_op[i] = op;
> > +
> > +       /* One descriptor (numCBs ops) was successfully dequeued */
> > +       return desc->req.numCBs;
> > +}
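
Note the contract this creates for callers: every ref_op[] entry
returned for a muxed descriptor aliases the same op, and opaque_data is
poisoned to (void *)-1, so an application that uses opaque_data for
per-op bookkeeping cannot rely on it once muxing kicks in.
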
> > +
> > +/* Dequeue one encode operation from ACC100 device in TB mode */
> > +static inline int
> > +dequeue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
> > +               uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
> > +{
> > +       union acc100_dma_desc *desc, *last_desc, atom_desc;
> > +       union acc100_dma_rsp_desc rsp;
> > +       struct rte_bbdev_enc_op *op;
> > +       uint8_t i = 0;
> > +       uint16_t current_dequeued_cbs = 0, cbs_in_tb;
> > +
> > +       desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
> > +                       & q->sw_ring_wrap_mask);
> > +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> > +                       __ATOMIC_RELAXED);
> > +
> > +       /* Check fdone bit */
> > +       if (!(atom_desc.rsp.val & ACC100_FDONE))
> > +               return -1;
> > +
> > +       /* Get number of CBs in dequeued TB */
> > +       cbs_in_tb = desc->req.cbs_in_tb;
> > +       /* Get last CB */
> > +       last_desc = q->ring_addr + ((q->sw_ring_tail
> > +                       + total_dequeued_cbs + cbs_in_tb - 1)
> > +                       & q->sw_ring_wrap_mask);
> > +       /* Check if last CB in TB is ready to dequeue (and thus
> > +        * the whole TB) - checking sdone bit. If not return.
> > +        */
> > +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
> > +                       __ATOMIC_RELAXED);
> > +       if (!(atom_desc.rsp.val & ACC100_SDONE))
> > +               return -1;
> > +
> > +       /* Dequeue */
> > +       op = desc->req.op_addr;
> > +
> > +       /* Clearing status, it will be set based on response */
> > +       op->status = 0;
> > +
> > +       while (i < cbs_in_tb) {
> > +               desc = q->ring_addr + ((q->sw_ring_tail
> > +                               + total_dequeued_cbs)
> > +                               & q->sw_ring_wrap_mask);
> > +               atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> > +                               __ATOMIC_RELAXED);
> > +               rsp.val = atom_desc.rsp.val;
> > +               rte_bbdev_log_debug("Resp. desc %p: %x", desc,
> > +                               rsp.val);
> > +
> > +               op->status |= ((rsp.input_err)
> > +                               ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> > +               op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> > +               op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> > +
> > +               if (desc->req.last_desc_in_batch) {
> > +                       (*aq_dequeued)++;
> > +                       desc->req.last_desc_in_batch = 0;
> > +               }
> > +               desc->rsp.val = ACC100_DMA_DESC_TYPE;
> > +               desc->rsp.add_info_0 = 0;
> > +               desc->rsp.add_info_1 = 0;
> > +               total_dequeued_cbs++;
> > +               current_dequeued_cbs++;
> > +               i++;
> > +       }
> > +
> > +       *ref_op = op;
> > +
> > +       return current_dequeued_cbs;
> > +}
> > +
> > +/* Dequeue one decode operation from ACC100 device in CB mode */
> > +static inline int
> > +dequeue_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
> > +               struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
> > +               uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> > +{
> > +       union acc100_dma_desc *desc, atom_desc;
> > +       union acc100_dma_rsp_desc rsp;
> > +       struct rte_bbdev_dec_op *op;
> > +
> > +       desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> > +                       & q->sw_ring_wrap_mask);
> > +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> > +                       __ATOMIC_RELAXED);
> > +
> > +       /* Check fdone bit */
> > +       if (!(atom_desc.rsp.val & ACC100_FDONE))
> > +               return -1;
> > +
> > +       rsp.val = atom_desc.rsp.val;
> > +       rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
> > +
> > +       /* Dequeue */
> > +       op = desc->req.op_addr;
> > +
> > +       /* Clearing status, it will be set based on response */
> > +       op->status = 0;
> > +       op->status |= ((rsp.input_err)
> > +                       ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> > +       op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> > +       op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> > +       if (op->status != 0)
> > +               q_data->queue_stats.dequeue_err_count++;
> > +
> > +       /* Set CRC status only when no other error has been flagged */
> > +       if (!op->status)
> > +               op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
> > +       op->turbo_dec.iter_count = (uint8_t) rsp.iter_cnt / 2;
> > +       /* Check if this is the last desc in batch (Atomic Queue) */
> > +       if (desc->req.last_desc_in_batch) {
> > +               (*aq_dequeued)++;
> > +               desc->req.last_desc_in_batch = 0;
> > +       }
> > +       desc->rsp.val = ACC100_DMA_DESC_TYPE;
> > +       desc->rsp.add_info_0 = 0;
> > +       desc->rsp.add_info_1 = 0;
> > +       *ref_op = op;
> > +
> > +       /* One CB (op) was successfully dequeued */
> > +       return 1;
> > +}
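
To make the status mapping concrete (assuming rsp.crc_status reads 1 on
a CRC mismatch): a clean decode leaves op->status == 0; a CRC failure
alone yields op->status == (1 << RTE_BBDEV_CRC_ERROR); and any
input/DMA/FCW error sets the corresponding RTE_BBDEV_* bit while
suppressing the CRC bit, since the CRC result is not meaningful after a
transport error.
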
> > +
> > +/* Dequeue one decode operation from ACC100 device in CB mode */
> > +static inline int
> > +dequeue_ldpc_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
> > +               struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
> > +               uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> > +{
> > +       union acc100_dma_desc *desc, atom_desc;
> > +       union acc100_dma_rsp_desc rsp;
> > +       struct rte_bbdev_dec_op *op;
> > +
> > +       desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> > +                       & q->sw_ring_wrap_mask);
> > +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> > +                       __ATOMIC_RELAXED);
> > +
> > +       /* Check fdone bit */
> > +       if (!(atom_desc.rsp.val & ACC100_FDONE))
> > +               return -1;
> > +
> > +       rsp.val = atom_desc.rsp.val;
> > +
> > +       /* Dequeue */
> > +       op = desc->req.op_addr;
> > +
> > +       /* Clearing status, it will be set based on response */
> > +       op->status = 0;
> > +       op->status |= rsp.input_err << RTE_BBDEV_DATA_ERROR;
> > +       op->status |= rsp.dma_err << RTE_BBDEV_DRV_ERROR;
> > +       op->status |= rsp.fcw_err << RTE_BBDEV_DRV_ERROR;
> > +       if (op->status != 0)
> > +               q_data->queue_stats.dequeue_err_count++;
> > +
> > +       op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
> > +       if (op->ldpc_dec.hard_output.length > 0 && !rsp.synd_ok)
> > +               op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
> > +       op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
> > +
> > +       /* Check if this is the last desc in batch (Atomic Queue) */
> > +       if (desc->req.last_desc_in_batch) {
> > +               (*aq_dequeued)++;
> > +               desc->req.last_desc_in_batch = 0;
> > +       }
> > +
> > +       desc->rsp.val = ACC100_DMA_DESC_TYPE;
> > +       desc->rsp.add_info_0 = 0;
> > +       desc->rsp.add_info_1 = 0;
> > +
> > +       *ref_op = op;
> > +
> > +       /* One CB (op) was successfully dequeued */
> > +       return 1;
> > +}
> > +
> > +/* Dequeue one decode operation from ACC100 device in TB mode. */
> > +static inline int
> > +dequeue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
> > +               uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> > +{
> > +       union acc100_dma_desc *desc, *last_desc, atom_desc;
> > +       union acc100_dma_rsp_desc rsp;
> > +       struct rte_bbdev_dec_op *op;
> > +       uint8_t cbs_in_tb = 1, cb_idx = 0;
> > +
> > +       desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> > +                       & q->sw_ring_wrap_mask);
> > +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> > +                       __ATOMIC_RELAXED);
> > +
> > +       /* Check fdone bit */
> > +       if (!(atom_desc.rsp.val & ACC100_FDONE))
> > +               return -1;
> > +
> > +       /* Dequeue */
> > +       op = desc->req.op_addr;
> > +
> > +       /* Get number of CBs in dequeued TB */
> > +       cbs_in_tb = desc->req.cbs_in_tb;
> > +       /* Get last CB */
> > +       last_desc = q->ring_addr + ((q->sw_ring_tail
> > +                       + dequeued_cbs + cbs_in_tb - 1)
> > +                       & q->sw_ring_wrap_mask);
> > +       /* Check if last CB in TB is ready to dequeue (and thus
> > +        * the whole TB) - checking sdone bit. If not return.
> > +        */
> > +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
> > +                       __ATOMIC_RELAXED);
> > +       if (!(atom_desc.rsp.val & ACC100_SDONE))
> > +               return -1;
> > +
> > +       /* Clearing status, it will be set based on response */
> > +       op->status = 0;
> > +
> > +       /* Read remaining CBs if any */
> > +       while (cb_idx < cbs_in_tb) {
> > +               desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> > +                               & q->sw_ring_wrap_mask);
> > +               atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> > +                               __ATOMIC_RELAXED);
> > +               rsp.val = atom_desc.rsp.val;
> > +               rte_bbdev_log_debug("Resp. desc %p: %x", desc,
> > +                               rsp.val);
> > +
> > +               op->status |= ((rsp.input_err)
> > +                               ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> > +               op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> > +               op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> > +
> > +               /* CRC invalid if error exists */
> > +               if (!op->status)
> > +                       op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
> > +               op->turbo_dec.iter_count = RTE_MAX((uint8_t) rsp.iter_cnt,
> > +                               op->turbo_dec.iter_count);
> > +
> > +               /* Check if this is the last desc in batch (Atomic Queue) */
> > +               if (desc->req.last_desc_in_batch) {
> > +                       (*aq_dequeued)++;
> > +                       desc->req.last_desc_in_batch = 0;
> > +               }
> > +               desc->rsp.val = ACC100_DMA_DESC_TYPE;
> > +               desc->rsp.add_info_0 = 0;
> > +               desc->rsp.add_info_1 = 0;
> > +               dequeued_cbs++;
> > +               cb_idx++;
> > +       }
> > +
> > +       *ref_op = op;
> > +
> > +       return cb_idx;
> > +}
> > +
> > +/* Dequeue LDPC encode operations from ACC100 device. */
> > +static uint16_t
> > +acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
> > +               struct rte_bbdev_enc_op **ops, uint16_t num)
> > +{
> > +       struct acc100_queue *q = q_data->queue_private;
> > +       uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
> > +       uint32_t aq_dequeued = 0;
> > +       uint16_t dequeue_num, i, dequeued_cbs = 0, dequeued_descs = 0;
> > +       int ret;
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +       if (unlikely(ops == NULL || q == NULL))
> > +               return 0;
> > +#endif
> > +
> > +       dequeue_num = (avail < num) ? avail : num;
> > +
> > +       for (i = 0; i < dequeue_num; i++) {
> > +               ret = dequeue_enc_one_op_cb(q, &ops[dequeued_cbs],
> > +                               dequeued_descs, &aq_dequeued);
> > +               if (ret < 0)
> > +                       break;
> > +               dequeued_cbs += ret;
> > +               dequeued_descs++;
> > +               if (dequeued_cbs >= num)
> > +                       break;
> > +       }
> > +
> > +       q->aq_dequeued += aq_dequeued;
> > +       q->sw_ring_tail += dequeued_descs;
> > +
> > +       /* Update dequeue stats */
> > +       q_data->queue_stats.dequeued_count += dequeued_cbs;
> > +
> > +       return dequeued_cbs;
> > +}
> > +
> > +/* Dequeue decode operations from ACC100 device. */
> > +static uint16_t
> > +acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
> > +               struct rte_bbdev_dec_op **ops, uint16_t num)
> > +{
> > +       struct acc100_queue *q = q_data->queue_private;
> > +       uint16_t dequeue_num;
> > +       uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
> > +       uint32_t aq_dequeued = 0;
> > +       uint16_t i;
> > +       uint16_t dequeued_cbs = 0;
> > +       struct rte_bbdev_dec_op *op;
> > +       int ret;
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +       if (unlikely(ops == NULL || q == NULL))
> > +               return 0;
> > +#endif
> > +
> > +       dequeue_num = (avail < num) ? avail : num;
> > +
> > +       for (i = 0; i < dequeue_num; ++i) {
> > +               op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> > +                       & q->sw_ring_wrap_mask))->req.op_addr;
> > +               if (op->ldpc_dec.code_block_mode == 0)
> > +                       ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
> > +                                       &aq_dequeued);
> > +               else
> > +                       ret = dequeue_ldpc_dec_one_op_cb(
> > +                                       q_data, q, &ops[i], dequeued_cbs,
> > +                                       &aq_dequeued);
> > +
> > +               if (ret < 0)
> > +                       break;
> > +               dequeued_cbs += ret;
> > +       }
> > +
> > +       q->aq_dequeued += aq_dequeued;
> > +       q->sw_ring_tail += dequeued_cbs;
> > +
> > +       /* Update dequeue stats */
> > +       q_data->queue_stats.dequeued_count += i;
> > +
> > +       return i;
> > +}
> > +
> >  /* Initialization Function */
> >  static void
> >  acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
> > @@ -703,6 +2321,10 @@
> >          struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
> >
> >          dev->dev_ops = &acc100_bbdev_ops;
> > +       dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
> > +       dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
> > +       dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
> > +       dev->dequeue_ldpc_dec_ops = acc100_dequeue_ldpc_dec;
> >
> >          ((struct acc100_device *) dev->data->dev_private)->pf_device =
> >                          !strcmp(drv->driver.name,
> > @@ -815,4 +2437,3 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
> >  RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
> >  RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
> >  RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
> > -
> > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
> > index 0e2b79c..78686c1 100644
> > --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> > +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> > @@ -88,6 +88,8 @@
> >  #define TMPL_PRI_3      0x0f0e0d0c
> >  #define QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
> >  #define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
> > +#define ACC100_FDONE    0x80000000
> > +#define ACC100_SDONE    0x40000000
> >
> >  #define ACC100_NUM_TMPL  32
> >  #define VF_OFFSET_QOS 16 /* offset in Memory Space specific to QoS Mon */
> > @@ -398,6 +400,7 @@ struct __rte_packed acc100_dma_req_desc {
> >  union acc100_dma_desc {
> >          struct acc100_dma_req_desc req;
> >          union acc100_dma_rsp_desc rsp;
> > +       uint64_t atom_hdr;
> >  };
> >
> >
> > --
> > 1.8.3.1

^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v3 01/11] drivers/baseband: add PMD for ACC100
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 01/11] drivers/baseband: add PMD for ACC100 Nicolas Chautru
@ 2020-08-29  9:44   ` Xu, Rosen
  2020-09-04 16:44     ` Chautru, Nicolas
  0 siblings, 1 reply; 213+ messages in thread
From: Xu, Rosen @ 2020-08-29  9:44 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, akhil.goyal
  Cc: Richardson, Bruce, Chautru, Nicolas, Xu, Rosen

Hi,

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Nicolas Chautru
> Sent: Wednesday, August 19, 2020 8:25
> To: dev@dpdk.org; akhil.goyal@nxp.com
> Cc: Richardson, Bruce <bruce.richardson@intel.com>; Chautru, Nicolas
> <nicolas.chautru@intel.com>
> Subject: [dpdk-dev] [PATCH v3 01/11] drivers/baseband: add PMD for
> ACC100
> 
> Add stubs for the ACC100 PMD
> 
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
>  config/common_base                                 |   4 +
>  doc/guides/bbdevs/acc100.rst                       | 233 +++++++++++++++++++++
>  doc/guides/bbdevs/index.rst                        |   1 +
>  doc/guides/rel_notes/release_20_11.rst             |   6 +
>  drivers/baseband/Makefile                          |   2 +
>  drivers/baseband/acc100/Makefile                   |  25 +++
>  drivers/baseband/acc100/meson.build                |   6 +
>  drivers/baseband/acc100/rte_acc100_pmd.c           | 175 ++++++++++++++++
>  drivers/baseband/acc100/rte_acc100_pmd.h           |  37 ++++
>  .../acc100/rte_pmd_bbdev_acc100_version.map        |   3 +
>  drivers/baseband/meson.build                       |   2 +-
>  mk/rte.app.mk                                      |   1 +
>  12 files changed, 494 insertions(+), 1 deletion(-)
>  create mode 100644 doc/guides/bbdevs/acc100.rst
>  create mode 100644 drivers/baseband/acc100/Makefile
>  create mode 100644 drivers/baseband/acc100/meson.build
>  create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.c
>  create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.h
>  create mode 100644 drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> 
> diff --git a/config/common_base b/config/common_base
> index fbf0ee7..218ab16 100644
> --- a/config/common_base
> +++ b/config/common_base
> @@ -584,6 +584,10 @@ CONFIG_RTE_LIBRTE_PMD_BBDEV_NULL=y
>  #
>  CONFIG_RTE_LIBRTE_PMD_BBDEV_TURBO_SW=y
> 
> +# Compile PMD for ACC100 bbdev device
> +#
> +CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100=y
> +
>  #
>  # Compile PMD for Intel FPGA LTE FEC bbdev device
>  #
> diff --git a/doc/guides/bbdevs/acc100.rst b/doc/guides/bbdevs/acc100.rst
> new file mode 100644
> index 0000000..f87ee09
> --- /dev/null
> +++ b/doc/guides/bbdevs/acc100.rst
> @@ -0,0 +1,233 @@
> +..  SPDX-License-Identifier: BSD-3-Clause
> +    Copyright(c) 2020 Intel Corporation
> +
> +Intel(R) ACC100 5G/4G FEC Poll Mode Driver
> +==========================================
> +
> +The BBDEV ACC100 5G/4G FEC poll mode driver (PMD) supports an
> +implementation of a VRAN FEC wireless acceleration function.
> +This device is also known as Mount Bryce.
> +
> +Features
> +--------
> +
> +ACC100 5G/4G FEC PMD supports the following features:
> +
> +- LDPC Encode in the DL (5GNR)
> +- LDPC Decode in the UL (5GNR)
> +- Turbo Encode in the DL (4G)
> +- Turbo Decode in the UL (4G)
> +- 16 VFs per PF (physical device)
> +- Maximum of 128 queues per VF
> +- PCIe Gen-3 x16 Interface
> +- MSI
> +- SR-IOV
> +
> +ACC100 5G/4G FEC PMD supports the following BBDEV capabilities:
> +
> +* For the LDPC encode operation:
> +   - ``RTE_BBDEV_LDPC_CRC_24B_ATTACH`` :  set to attach CRC24B to CB(s)
> +   - ``RTE_BBDEV_LDPC_RATE_MATCH`` :  if set then do not do Rate Match bypass
> +   - ``RTE_BBDEV_LDPC_INTERLEAVER_BYPASS`` : if set then bypass interleaver
> +
> +* For the LDPC decode operation:
> +   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK`` :  check CRC24B from CB(s)
> +   - ``RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE`` :  disable early termination
> +   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP`` :  drops CRC24B bits appended while decoding
> +   - ``RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE`` :  provides an input for HARQ combining
> +   - ``RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE`` :  provides an output for HARQ combining
> +   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE`` :  HARQ memory input is internal
> +   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE`` :  HARQ memory output is internal
> +   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK`` :  loopback data to/from HARQ memory
> +   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_FILLERS`` :  HARQ memory includes the filler bits
> +   - ``RTE_BBDEV_LDPC_DEC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
> +   - ``RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION`` :  supports compression of the HARQ input/output
> +   - ``RTE_BBDEV_LDPC_LLR_COMPRESSION`` :  supports LLR input compression
> +
> +* For the turbo encode operation:
> +   - ``RTE_BBDEV_TURBO_CRC_24B_ATTACH`` :  set to attach CRC24B to CB(s)
> +   - ``RTE_BBDEV_TURBO_RATE_MATCH`` :  if set then do not do Rate Match bypass
> +   - ``RTE_BBDEV_TURBO_ENC_INTERRUPTS`` :  set for encoder dequeue interrupts
> +   - ``RTE_BBDEV_TURBO_RV_INDEX_BYPASS`` :  set to bypass RV index
> +   - ``RTE_BBDEV_TURBO_ENC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
> +
> +* For the turbo decode operation:
> +   - ``RTE_BBDEV_TURBO_CRC_TYPE_24B`` :  check CRC24B from CB(s)
> +   - ``RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE`` :  perform subblock de-interleave
> +   - ``RTE_BBDEV_TURBO_DEC_INTERRUPTS`` :  set for decoder dequeue interrupts
> +   - ``RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN`` :  set if negative LLR decoder i/p is supported
> +   - ``RTE_BBDEV_TURBO_POS_LLR_1_BIT_IN`` :  set if positive LLR decoder i/p is supported
> +   - ``RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP`` :  keep CRC24B bits appended while decoding
> +   - ``RTE_BBDEV_TURBO_EARLY_TERMINATION`` :  set early termination feature
> +   - ``RTE_BBDEV_TURBO_DEC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
> +   - ``RTE_BBDEV_TURBO_HALF_ITERATION_EVEN`` :  set half iteration granularity
> +
> +Installation
> +------------
> +
> +Section 3 of the DPDK manual provides instructions on installing and
> +compiling DPDK. The default set of bbdev compile flags may be found in
> +config/common_base, where for example the flag to build the ACC100
> +5G/4G FEC device, ``CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100``,
> +is already set.
> +
> +DPDK requires hugepages to be configured as detailed in section 2 of
> +the DPDK manual.
> +The bbdev test application has been tested with a configuration of 40 x
> +1GB hugepages. The hugepage configuration of a server may be examined
> +using:
> +
> +.. code-block:: console
> +
> +   grep Huge* /proc/meminfo
> +
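> +As an illustrative example (the values below are an assumption, not a
> +requirement of the device), 1GB hugepages can typically be reserved at
> +boot time by adding kernel parameters such as:
> +
> +.. code-block:: console
> +
> +   default_hugepagesz=1G hugepagesz=1G hugepages=40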
> +
> +Initialization
> +--------------
> +
> +When the device first powers up, its PCI Physical Functions (PF) can be
> listed through this command:
> +
> +.. code-block:: console
> +
> +  sudo lspci -vd8086:0d5c
> +
> +The physical and virtual functions are compatible with Linux UIO drivers:
> +``vfio`` and ``igb_uio``. However, in order to work, the ACC100 5G/4G
> +FEC device first needs to be bound to one of these Linux drivers through DPDK.
> +
> +
> +Bind PF UIO driver(s)
> +~~~~~~~~~~~~~~~~~~~~~
> +
> +Install the DPDK igb_uio driver, bind it with the PF PCI device ID and
> +use ``lspci`` to confirm the PF device is under use by the ``igb_uio``
> +DPDK UIO driver.
> +
> +The igb_uio driver may be bound to the PF PCI device using one of three
> +methods:
> +
> +
> +1. PCI functions (physical or virtual, depending on the use case) can
> +be bound to the UIO driver by repeating this command for every function.
> +
> +.. code-block:: console
> +
> +  cd <dpdk-top-level-directory>
> +  insmod ./build/kmod/igb_uio.ko
> +  echo "8086 0d5c" > /sys/bus/pci/drivers/igb_uio/new_id
> +  lspci -vd8086:0d5c
> +
> +
> +2. Another way to bind the PF to the DPDK UIO driver is by using the
> +``dpdk-devbind.py`` tool
> +
> +.. code-block:: console
> +
> +  cd <dpdk-top-level-directory>
> +  ./usertools/dpdk-devbind.py -b igb_uio 0000:06:00.0
> +
> +where the PCI device ID (example: 0000:06:00.0) is obtained using
> +``lspci -vd8086:0d5c``.
> +
> +
> +3. A third way to bind is to use the ``dpdk-setup.sh`` tool
> +
> +.. code-block:: console
> +
> +  cd <dpdk-top-level-directory>
> +  ./usertools/dpdk-setup.sh
> +
> +  select 'Bind Ethernet/Crypto/Baseband device to IGB UIO module'
> +  or
> +  select 'Bind Ethernet/Crypto/Baseband device to VFIO module' depending on driver required
> +  enter PCI device ID
> +  select 'Display current Ethernet/Crypto/Baseband device settings' to confirm binding
> +
> +
> +In the same way the ACC100 5G/4G FEC PF can be bound with vfio, but the
> +vfio driver does not support SR-IOV configuration right out of the box,
> +so it will need to be patched.
> +
> +
> +Enable Virtual Functions
> +~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +Now, it should be visible in the printouts that the PCI PF is under
> +igb_uio control "``Kernel driver in use: igb_uio``".
> +
> +To show the number of available VFs on the device, read the
> +``sriov_totalvfs`` file.
> +
> +.. code-block:: console
> +
> +  cat /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_totalvfs
> +
> +  where 0000\:<b>\:<d>.<f> is the PCI device ID
> +
> +
> +To enable VFs via igb_uio, echo the number of virtual functions to be
> +enabled to the ``max_vfs`` file.
> +
> +.. code-block:: console
> +
> +  echo <num-of-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/max_vfs
> +
> +
> +Afterwards, all VFs must be bound to appropriate UIO drivers as
> +required, in the same way as was done for the physical function previously.
> +
> +Enabling SR-IOV via vfio driver is pretty much the same, except that
> +the file name is different:
> +
> +.. code-block:: console
> +
> +  echo <num-of-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_numvfs
> +
> +
> +Configure the VFs through PF
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +The PCI virtual functions must be configured before working or getting
> +assigned to VMs/Containers. The configuration involves allocating the
> +number of hardware queues, priorities, load balance, bandwidth and
> +other settings necessary for the device to perform FEC functions.
> +
> +This configuration needs to be executed at least once after reboot or
> +PCI FLR and can be achieved by using the function
> +``acc100_configure()``, which sets up the parameters defined in the
> +``acc100_conf`` structure.
> +
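> +As an illustrative sketch only (the field name used below is an
> +assumption based on ``rte_acc100_cfg.h``, not a reference
> +configuration), a one-off PF configuration could look like:
> +
> +.. code-block:: c
> +
> +   /* Sketch: configure the PF once after reboot or PCI FLR. */
> +   struct acc100_conf conf = {0};
> +   int ret;
> +
> +   conf.pf_mode_en = 1;  /* assumed field: run queues from the PF */
> +   ret = acc100_configure("0000:06:00.0", &conf);
> +   if (ret != 0)
> +           printf("ACC100 configuration failed: %d\n", ret);
> +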
> +Test Application
> +----------------
> +
> +BBDEV provides a test application, ``test-bbdev.py``, and a range of test
> +data for testing the functionality of ACC100 5G/4G FEC encode and
> +decode, depending on the device's capabilities. The test application is
> +located under the app/test-bbdev folder and has the following options:
> +
> +.. code-block:: console
> +
> +  "-p", "--testapp-path": specifies path to the bbdev test app.
> +  "-e", "--eal-params"	: EAL arguments which are passed to the test app.
> +  "-t", "--timeout"	: Timeout in seconds (default=300).
> +  "-c", "--test-cases"	: Defines test cases to run. Run all if not specified.
> +  "-v", "--test-vector"	: Test vector path (default=dpdk_path+/app/test-
> bbdev/test_vectors/bbdev_null.data).
> +  "-n", "--num-ops"	: Number of operations to process on device
> (default=32).
> +  "-b", "--burst-size"	: Operations enqueue/dequeue burst size
> (default=32).
> +  "-s", "--snr"		: SNR in dB used when generating LLRs for bler tests.
> +  "-s", "--iter_max"	: Number of iterations for LDPC decoder.
> +  "-l", "--num-lcores"	: Number of lcores to run (default=16).
> +  "-i", "--init-device" : Initialise PF device with default values.
> +
> +
> +To execute the test application tool using simple decode or encode
> +data, type one of the following:
> +
> +.. code-block:: console
> +
> +  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_dec_default.data
> +  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_enc_default.data
> +
> +
> +The test application ``test-bbdev.py`` supports the ability to
> +configure the PF device with a default set of values, if the "-i" or
> +"--init-device" option is included. The default values are defined in
> +test_bbdev_perf.c.
> +
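> +For example (illustrative only, combining options already listed above),
> +the PF can be initialised with those defaults as part of a validation run:
> +
> +.. code-block:: console
> +
> +  ./test-bbdev.py -i -c validation -n 64 -b 1 -v ./ldpc_enc_default.data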
> +
> +Test Vectors
> +~~~~~~~~~~~~
> +
> +In addition to the simple LDPC decoder and LDPC encoder tests, bbdev
> +also provides a range of additional tests under the test_vectors
> +folder, which may be useful. The results of these tests will depend on
> +the ACC100 5G/4G FEC capabilities, which may cause some test cases to
> +be skipped, but no failure should be reported.
> diff --git a/doc/guides/bbdevs/index.rst b/doc/guides/bbdevs/index.rst
> index a8092dd..4445cbd 100644
> --- a/doc/guides/bbdevs/index.rst
> +++ b/doc/guides/bbdevs/index.rst
> @@ -13,3 +13,4 @@ Baseband Device Drivers
>      turbo_sw
>      fpga_lte_fec
>      fpga_5gnr_fec
> +    acc100
> diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
> index df227a1..b3ab614 100644
> --- a/doc/guides/rel_notes/release_20_11.rst
> +++ b/doc/guides/rel_notes/release_20_11.rst
> @@ -55,6 +55,12 @@ New Features
>       Also, make sure to start the actual text at the margin.
>       =======================================================
> 
> +* **Added Intel ACC100 bbdev PMD.**
> +
> +  Added a new ``acc100`` bbdev driver for the Intel\ |reg| ACC100
> +  accelerator, also known as Mount Bryce. See the
> +  :doc:`../bbdevs/acc100` BBDEV guide for more details on this new driver.
> +
> 
>  Removed Items
>  -------------
> diff --git a/drivers/baseband/Makefile b/drivers/baseband/Makefile
> index dcc0969..b640294 100644
> --- a/drivers/baseband/Makefile
> +++ b/drivers/baseband/Makefile
> @@ -10,6 +10,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_NULL) += null
>  DEPDIRS-null = $(core-libs)
>  DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_TURBO_SW) += turbo_sw
>  DEPDIRS-turbo_sw = $(core-libs)
> +DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100) += acc100
> +DEPDIRS-acc100 = $(core-libs)
>  DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_LTE_FEC) += fpga_lte_fec
>  DEPDIRS-fpga_lte_fec = $(core-libs)
>  DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC) += fpga_5gnr_fec
> diff --git a/drivers/baseband/acc100/Makefile b/drivers/baseband/acc100/Makefile
> new file mode 100644
> index 0000000..c79e487
> --- /dev/null
> +++ b/drivers/baseband/acc100/Makefile
> @@ -0,0 +1,25 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(c) 2020 Intel Corporation
> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +# library name
> +LIB = librte_pmd_bbdev_acc100.a
> +
> +# build flags
> +CFLAGS += -O3
> +CFLAGS += $(WERROR_FLAGS)
> +LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring -lrte_cfgfile
> +LDLIBS += -lrte_bbdev
> +LDLIBS += -lrte_pci -lrte_bus_pci
> +
> +# versioning export map
> +EXPORT_MAP := rte_pmd_bbdev_acc100_version.map
> +
> +# library version
> +LIBABIVER := 1
> +
> +# library source files
> +SRCS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100) += rte_acc100_pmd.c
> +
> +include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/drivers/baseband/acc100/meson.build b/drivers/baseband/acc100/meson.build
> new file mode 100644
> index 0000000..8afafc2
> --- /dev/null
> +++ b/drivers/baseband/acc100/meson.build
> @@ -0,0 +1,6 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(c) 2020 Intel Corporation
> +
> +deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci']
> +
> +sources = files('rte_acc100_pmd.c')
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
> new file mode 100644
> index 0000000..1b4cd13
> --- /dev/null
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> @@ -0,0 +1,175 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation
> + */
> +
> +#include <unistd.h>
> +
> +#include <rte_common.h>
> +#include <rte_log.h>
> +#include <rte_dev.h>
> +#include <rte_malloc.h>
> +#include <rte_mempool.h>
> +#include <rte_byteorder.h>
> +#include <rte_errno.h>
> +#include <rte_branch_prediction.h>
> +#include <rte_hexdump.h>
> +#include <rte_pci.h>
> +#include <rte_bus_pci.h>
> +
> +#include <rte_bbdev.h>
> +#include <rte_bbdev_pmd.h>
> +#include "rte_acc100_pmd.h"
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, DEBUG);
> +#else
> +RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
> +#endif
> +
> +/* Free 64MB memory used for software rings */
> +static int
> +acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
> +{
> +	return 0;
> +}
> +
> +static const struct rte_bbdev_ops acc100_bbdev_ops = {
> +	.close = acc100_dev_close,
> +};
> +
> +/* ACC100 PCI PF address map */
> +static struct rte_pci_id pci_id_acc100_pf_map[] = {
> +	{
> +		RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID, RTE_ACC100_PF_DEVICE_ID)
> +	},
> +	{.device_id = 0},
> +};
> +
> +/* ACC100 PCI VF address map */
> +static struct rte_pci_id pci_id_acc100_vf_map[] = {
> +	{
> +		RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID, RTE_ACC100_VF_DEVICE_ID)
> +	},
> +	{.device_id = 0},
> +};
> +
> +/* Initialization Function */
> +static void
> +acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
> +{
> +	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
> +
> +	dev->dev_ops = &acc100_bbdev_ops;
> +
> +	((struct acc100_device *) dev->data->dev_private)->pf_device =
> +			!strcmp(drv->driver.name,
> +					RTE_STR(ACC100PF_DRIVER_NAME));
> +	((struct acc100_device *) dev->data->dev_private)->mmio_base =
> +			pci_dev->mem_resource[0].addr;
> +
> +	rte_bbdev_log_debug("Init device %s [%s] @ vaddr %p
> paddr %#"PRIx64"",
> +			drv->driver.name, dev->data->name,
> +			(void *)pci_dev->mem_resource[0].addr,
> +			pci_dev->mem_resource[0].phys_addr);
> +}
> +
> +static int acc100_pci_probe(struct rte_pci_driver *pci_drv,
> +	struct rte_pci_device *pci_dev)
> +{
> +	struct rte_bbdev *bbdev = NULL;
> +	char dev_name[RTE_BBDEV_NAME_MAX_LEN];
> +
> +	if (pci_dev == NULL) {
> +		rte_bbdev_log(ERR, "NULL PCI device");
> +		return -EINVAL;
> +	}
> +
> +	rte_pci_device_name(&pci_dev->addr, dev_name, sizeof(dev_name));
> +
> +	/* Allocate memory to be used privately by drivers */
> +	bbdev = rte_bbdev_allocate(pci_dev->device.name);
> +	if (bbdev == NULL)
> +		return -ENODEV;
> +
> +	/* allocate device private memory */
> +	bbdev->data->dev_private = rte_zmalloc_socket(dev_name,
> +			sizeof(struct acc100_device), RTE_CACHE_LINE_SIZE,
> +			pci_dev->device.numa_node);
> +
> +	if (bbdev->data->dev_private == NULL) {
> +		rte_bbdev_log(CRIT,
> +				"Allocate of %zu bytes for device \"%s\"
> failed",
> +				sizeof(struct acc100_device), dev_name);
> +				rte_bbdev_release(bbdev);
> +			return -ENOMEM;
> +	}
> +
> +	/* Fill HW specific part of device structure */
> +	bbdev->device = &pci_dev->device;
> +	bbdev->intr_handle = &pci_dev->intr_handle;
> +	bbdev->data->socket_id = pci_dev->device.numa_node;
> +
> +	/* Invoke ACC100 device initialization function */
> +	acc100_bbdev_init(bbdev, pci_drv);
> +
> +	rte_bbdev_log_debug("Initialised bbdev %s (id = %u)",
> +			dev_name, bbdev->data->dev_id);
> +	return 0;
> +}
> +
> +static int acc100_pci_remove(struct rte_pci_device *pci_dev)
> +{
> +	struct rte_bbdev *bbdev;
> +	int ret;
> +	uint8_t dev_id;
> +
> +	if (pci_dev == NULL)
> +		return -EINVAL;
> +
> +	/* Find device */
> +	bbdev = rte_bbdev_get_named_dev(pci_dev->device.name);
> +	if (bbdev == NULL) {
> +		rte_bbdev_log(CRIT,
> +				"Couldn't find HW dev \"%s\" to uninitialise
> it",
> +				pci_dev->device.name);
> +		return -ENODEV;
> +	}
> +	dev_id = bbdev->data->dev_id;
> +
> +	/* free device private memory before close */
> +	rte_free(bbdev->data->dev_private);
> +
> +	/* Close device */
> +	ret = rte_bbdev_close(dev_id);
> +	if (ret < 0)
> +		rte_bbdev_log(ERR,
> +				"Device %i failed to close during uninit: %i",
> +				dev_id, ret);
> +
> +	/* release bbdev from library */
> +	rte_bbdev_release(bbdev);
> +
> +	rte_bbdev_log_debug("Destroyed bbdev = %u", dev_id);
> +
> +	return 0;
> +}
> +
> +static struct rte_pci_driver acc100_pci_pf_driver = {
> +		.probe = acc100_pci_probe,
> +		.remove = acc100_pci_remove,
> +		.id_table = pci_id_acc100_pf_map,
> +		.drv_flags = RTE_PCI_DRV_NEED_MAPPING
> +};
> +
> +static struct rte_pci_driver acc100_pci_vf_driver = {
> +		.probe = acc100_pci_probe,
> +		.remove = acc100_pci_remove,
> +		.id_table = pci_id_acc100_vf_map,
> +		.drv_flags = RTE_PCI_DRV_NEED_MAPPING
> +};
> +
> +RTE_PMD_REGISTER_PCI(ACC100PF_DRIVER_NAME, acc100_pci_pf_driver);
> +RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
> +RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
> +RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);

It seems both PF and VF share the same data for rte_pci_driver;
it's strange to duplicate the code.
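
For instance, an untested sketch of what sharing the common fields could
look like (the macro name below is hypothetical):

	/* Shared initializer; only the id_table differs per function type */
	#define ACC100_PCI_DRV_COMMON \
		.probe = acc100_pci_probe, \
		.remove = acc100_pci_remove, \
		.drv_flags = RTE_PCI_DRV_NEED_MAPPING

	static struct rte_pci_driver acc100_pci_pf_driver = {
		ACC100_PCI_DRV_COMMON,
		.id_table = pci_id_acc100_pf_map,
	};

	static struct rte_pci_driver acc100_pci_vf_driver = {
		ACC100_PCI_DRV_COMMON,
		.id_table = pci_id_acc100_vf_map,
	};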

> +
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
> new file mode 100644
> index 0000000..6f46df0
> --- /dev/null
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> @@ -0,0 +1,37 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation
> + */
> +
> +#ifndef _RTE_ACC100_PMD_H_
> +#define _RTE_ACC100_PMD_H_
> +
> +/* Helper macro for logging */
> +#define rte_bbdev_log(level, fmt, ...) \
> +	rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \
> +		##__VA_ARGS__)
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +#define rte_bbdev_log_debug(fmt, ...) \
> +		rte_bbdev_log(DEBUG, "acc100_pmd: " fmt, \
> +		##__VA_ARGS__)
> +#else
> +#define rte_bbdev_log_debug(fmt, ...)
> +#endif
> +
> +/* ACC100 PF and VF driver names */
> +#define ACC100PF_DRIVER_NAME           intel_acc100_pf
> +#define ACC100VF_DRIVER_NAME           intel_acc100_vf
> +
> +/* ACC100 PCI vendor & device IDs */
> +#define RTE_ACC100_VENDOR_ID           (0x8086)
> +#define RTE_ACC100_PF_DEVICE_ID        (0x0d5c)
> +#define RTE_ACC100_VF_DEVICE_ID        (0x0d5d)
> +
> +/* Private data structure for each ACC100 device */
> +struct acc100_device {
> +	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
> +	bool pf_device; /**< True if this is a PF ACC100 device */
> +	bool configured; /**< True if this ACC100 device is configured */
> +};
> +
> +#endif /* _RTE_ACC100_PMD_H_ */
> diff --git a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> new file mode 100644
> index 0000000..4a76d1d
> --- /dev/null
> +++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> @@ -0,0 +1,3 @@
> +DPDK_21 {
> +	local: *;
> +};
> diff --git a/drivers/baseband/meson.build b/drivers/baseband/meson.build
> index 415b672..72301ce 100644
> --- a/drivers/baseband/meson.build
> +++ b/drivers/baseband/meson.build
> @@ -5,7 +5,7 @@ if is_windows
>  	subdir_done()
>  endif
> 
> -drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec']
> +drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec', 'acc100']
> 
>  config_flag_fmt = 'RTE_LIBRTE_PMD_BBDEV_@0@'
>  driver_name_fmt = 'rte_pmd_bbdev_@0@'
> diff --git a/mk/rte.app.mk b/mk/rte.app.mk
> index a544259..a77f538 100644
> --- a/mk/rte.app.mk
> +++ b/mk/rte.app.mk
> @@ -254,6 +254,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_NETVSC_PMD)     += -lrte_pmd_netvsc
> 
>  ifeq ($(CONFIG_RTE_LIBRTE_BBDEV),y)
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_NULL)     += -lrte_pmd_bbdev_null
> +_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100)    += -lrte_pmd_bbdev_acc100
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_LTE_FEC) += -lrte_pmd_bbdev_fpga_lte_fec
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC) += -lrte_pmd_bbdev_fpga_5gnr_fec
> 
> --
> 1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v3 02/11] baseband/acc100: add register definition file
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 02/11] baseband/acc100: add register definition file Nicolas Chautru
@ 2020-08-29  9:55   ` Xu, Rosen
  2020-08-29 17:39     ` Chautru, Nicolas
  0 siblings, 1 reply; 213+ messages in thread
From: Xu, Rosen @ 2020-08-29  9:55 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, akhil.goyal; +Cc: Richardson, Bruce, Chautru, Nicolas

Hi,

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Nicolas Chautru
> Sent: Wednesday, August 19, 2020 8:25
> To: dev@dpdk.org; akhil.goyal@nxp.com
> Cc: Richardson, Bruce <bruce.richardson@intel.com>; Chautru, Nicolas
> <nicolas.chautru@intel.com>
> Subject: [dpdk-dev] [PATCH v3 02/11] baseband/acc100: add register
> definition file
> 
> Add in the list of registers for the device and related
> HW specs definitions.
> 
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
>  drivers/baseband/acc100/acc100_pf_enum.h | 1068 ++++++++++++++++++++++++++++++
>  drivers/baseband/acc100/acc100_vf_enum.h |   73 ++
>  drivers/baseband/acc100/rte_acc100_pmd.h |  490 ++++++++++++++
>  3 files changed, 1631 insertions(+)
>  create mode 100644 drivers/baseband/acc100/acc100_pf_enum.h
>  create mode 100644 drivers/baseband/acc100/acc100_vf_enum.h
> 
> diff --git a/drivers/baseband/acc100/acc100_pf_enum.h b/drivers/baseband/acc100/acc100_pf_enum.h
> new file mode 100644
> index 0000000..a1ee416
> --- /dev/null
> +++ b/drivers/baseband/acc100/acc100_pf_enum.h
> @@ -0,0 +1,1068 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2017 Intel Corporation
> + */
> +
> +#ifndef ACC100_PF_ENUM_H
> +#define ACC100_PF_ENUM_H
> +
> +/*
> + * ACC100 Register mapping on PF BAR0
> + * This is automatically generated from RDL, format may change with new
> + * RDL release.
> + * Variable names are as is
> + */
> +enum {
> +	HWPfQmgrEgressQueuesTemplate          =  0x0007FE00,
> +	HWPfQmgrIngressAq                     =  0x00080000,
> +	HWPfQmgrArbQAvail                     =  0x00A00010,
> +	HWPfQmgrArbQBlock                     =  0x00A00014,
> +	HWPfQmgrAqueueDropNotifEn             =  0x00A00024,
> +	HWPfQmgrAqueueDisableNotifEn          =  0x00A00028,
> +	HWPfQmgrSoftReset                     =  0x00A00038,
> +	HWPfQmgrInitStatus                    =  0x00A0003C,
> +	HWPfQmgrAramWatchdogCount             =  0x00A00040,
> +	HWPfQmgrAramWatchdogCounterEn         =  0x00A00044,
> +	HWPfQmgrAxiWatchdogCount              =  0x00A00048,
> +	HWPfQmgrAxiWatchdogCounterEn          =  0x00A0004C,
> +	HWPfQmgrProcessWatchdogCount          =  0x00A00050,
> +	HWPfQmgrProcessWatchdogCounterEn      =  0x00A00054,
> +	HWPfQmgrProcessUl4GWatchdogCounter    =  0x00A00058,
> +	HWPfQmgrProcessDl4GWatchdogCounter    =  0x00A0005C,
> +	HWPfQmgrProcessUl5GWatchdogCounter    =  0x00A00060,
> +	HWPfQmgrProcessDl5GWatchdogCounter    =  0x00A00064,
> +	HWPfQmgrProcessMldWatchdogCounter     =  0x00A00068,
> +	HWPfQmgrMsiOverflowUpperVf            =  0x00A00070,
> +	HWPfQmgrMsiOverflowLowerVf            =  0x00A00074,
> +	HWPfQmgrMsiWatchdogOverflow           =  0x00A00078,
> +	HWPfQmgrMsiOverflowEnable             =  0x00A0007C,
> +	HWPfQmgrDebugAqPointerMemGrp          =  0x00A00100,
> +	HWPfQmgrDebugOutputArbQFifoGrp        =  0x00A00140,
> +	HWPfQmgrDebugMsiFifoGrp               =  0x00A00180,
> +	HWPfQmgrDebugAxiWdTimeoutMsiFifo      =  0x00A001C0,
> +	HWPfQmgrDebugProcessWdTimeoutMsiFifo  =  0x00A001C4,
> +	HWPfQmgrDepthLog2Grp                  =  0x00A00200,
> +	HWPfQmgrTholdGrp                      =  0x00A00300,
> +	HWPfQmgrGrpTmplateReg0Indx            =  0x00A00600,
> +	HWPfQmgrGrpTmplateReg1Indx            =  0x00A00680,
> +	HWPfQmgrGrpTmplateReg2indx            =  0x00A00700,
> +	HWPfQmgrGrpTmplateReg3Indx            =  0x00A00780,
> +	HWPfQmgrGrpTmplateReg4Indx            =  0x00A00800,
> +	HWPfQmgrVfBaseAddr                    =  0x00A01000,
> +	HWPfQmgrUl4GWeightRrVf                =  0x00A02000,
> +	HWPfQmgrDl4GWeightRrVf                =  0x00A02100,
> +	HWPfQmgrUl5GWeightRrVf                =  0x00A02200,
> +	HWPfQmgrDl5GWeightRrVf                =  0x00A02300,
> +	HWPfQmgrMldWeightRrVf                 =  0x00A02400,
> +	HWPfQmgrArbQDepthGrp                  =  0x00A02F00,
> +	HWPfQmgrGrpFunction0                  =  0x00A02F40,
> +	HWPfQmgrGrpFunction1                  =  0x00A02F44,
> +	HWPfQmgrGrpPriority                   =  0x00A02F48,
> +	HWPfQmgrWeightSync                    =  0x00A03000,
> +	HWPfQmgrAqEnableVf                    =  0x00A10000,
> +	HWPfQmgrAqResetVf                     =  0x00A20000,
> +	HWPfQmgrRingSizeVf                    =  0x00A20004,
> +	HWPfQmgrGrpDepthLog20Vf               =  0x00A20008,
> +	HWPfQmgrGrpDepthLog21Vf               =  0x00A2000C,
> +	HWPfQmgrGrpFunction0Vf                =  0x00A20010,
> +	HWPfQmgrGrpFunction1Vf                =  0x00A20014,
> +	HWPfDmaConfig0Reg                     =  0x00B80000,
> +	HWPfDmaConfig1Reg                     =  0x00B80004,
> +	HWPfDmaQmgrAddrReg                    =  0x00B80008,
> +	HWPfDmaSoftResetReg                   =  0x00B8000C,
> +	HWPfDmaAxcacheReg                     =  0x00B80010,
> +	HWPfDmaVersionReg                     =  0x00B80014,
> +	HWPfDmaFrameThreshold                 =  0x00B80018,
> +	HWPfDmaTimestampLo                    =  0x00B8001C,
> +	HWPfDmaTimestampHi                    =  0x00B80020,
> +	HWPfDmaAxiStatus                      =  0x00B80028,
> +	HWPfDmaAxiControl                     =  0x00B8002C,
> +	HWPfDmaNoQmgr                         =  0x00B80030,
> +	HWPfDmaQosScale                       =  0x00B80034,
> +	HWPfDmaQmanen                         =  0x00B80040,
> +	HWPfDmaQmgrQosBase                    =  0x00B80060,
> +	HWPfDmaFecClkGatingEnable             =  0x00B80080,
> +	HWPfDmaPmEnable                       =  0x00B80084,
> +	HWPfDmaQosEnable                      =  0x00B80088,
> +	HWPfDmaHarqWeightedRrFrameThreshold   =  0x00B800B0,
> +	HWPfDmaDataSmallWeightedRrFrameThresh  = 0x00B800B4,
> +	HWPfDmaDataLargeWeightedRrFrameThresh  = 0x00B800B8,
> +	HWPfDmaInboundCbMaxSize               =  0x00B800BC,
> +	HWPfDmaInboundDrainDataSize           =  0x00B800C0,
> +	HWPfDmaVfDdrBaseRw                    =  0x00B80400,
> +	HWPfDmaCmplTmOutCnt                   =  0x00B80800,
> +	HWPfDmaProcTmOutCnt                   =  0x00B80804,
> +	HWPfDmaStatusRrespBresp               =  0x00B80810,
> +	HWPfDmaCfgRrespBresp                  =  0x00B80814,
> +	HWPfDmaStatusMemParErr                =  0x00B80818,
> +	HWPfDmaCfgMemParErrEn                 =  0x00B8081C,
> +	HWPfDmaStatusDmaHwErr                 =  0x00B80820,
> +	HWPfDmaCfgDmaHwErrEn                  =  0x00B80824,
> +	HWPfDmaStatusFecCoreErr               =  0x00B80828,
> +	HWPfDmaCfgFecCoreErrEn                =  0x00B8082C,
> +	HWPfDmaStatusFcwDescrErr              =  0x00B80830,
> +	HWPfDmaCfgFcwDescrErrEn               =  0x00B80834,
> +	HWPfDmaStatusBlockTransmit            =  0x00B80838,
> +	HWPfDmaBlockOnErrEn                   =  0x00B8083C,
> +	HWPfDmaStatusFlushDma                 =  0x00B80840,
> +	HWPfDmaFlushDmaOnErrEn                =  0x00B80844,
> +	HWPfDmaStatusSdoneFifoFull            =  0x00B80848,
> +	HWPfDmaStatusDescriptorErrLoVf        =  0x00B8084C,
> +	HWPfDmaStatusDescriptorErrHiVf        =  0x00B80850,
> +	HWPfDmaStatusFcwErrLoVf               =  0x00B80854,
> +	HWPfDmaStatusFcwErrHiVf               =  0x00B80858,
> +	HWPfDmaStatusDataErrLoVf              =  0x00B8085C,
> +	HWPfDmaStatusDataErrHiVf              =  0x00B80860,
> +	HWPfDmaCfgMsiEnSoftwareErr            =  0x00B80864,
> +	HWPfDmaDescriptorSignatuture          =  0x00B80868,
> +	HWPfDmaFcwSignature                   =  0x00B8086C,
> +	HWPfDmaErrorDetectionEn               =  0x00B80870,
> +	HWPfDmaErrCntrlFifoDebug              =  0x00B8087C,
> +	HWPfDmaStatusToutData                 =  0x00B80880,
> +	HWPfDmaStatusToutDesc                 =  0x00B80884,
> +	HWPfDmaStatusToutUnexpData            =  0x00B80888,
> +	HWPfDmaStatusToutUnexpDesc            =  0x00B8088C,
> +	HWPfDmaStatusToutProcess              =  0x00B80890,
> +	HWPfDmaConfigCtoutOutDataEn           =  0x00B808A0,
> +	HWPfDmaConfigCtoutOutDescrEn          =  0x00B808A4,
> +	HWPfDmaConfigUnexpComplDataEn         =  0x00B808A8,
> +	HWPfDmaConfigUnexpComplDescrEn        =  0x00B808AC,
> +	HWPfDmaConfigPtoutOutEn               =  0x00B808B0,
> +	HWPfDmaFec5GulDescBaseLoRegVf         =  0x00B88020,
> +	HWPfDmaFec5GulDescBaseHiRegVf         =  0x00B88024,
> +	HWPfDmaFec5GulRespPtrLoRegVf          =  0x00B88028,
> +	HWPfDmaFec5GulRespPtrHiRegVf          =  0x00B8802C,
> +	HWPfDmaFec5GdlDescBaseLoRegVf         =  0x00B88040,
> +	HWPfDmaFec5GdlDescBaseHiRegVf         =  0x00B88044,
> +	HWPfDmaFec5GdlRespPtrLoRegVf          =  0x00B88048,
> +	HWPfDmaFec5GdlRespPtrHiRegVf          =  0x00B8804C,
> +	HWPfDmaFec4GulDescBaseLoRegVf         =  0x00B88060,
> +	HWPfDmaFec4GulDescBaseHiRegVf         =  0x00B88064,
> +	HWPfDmaFec4GulRespPtrLoRegVf          =  0x00B88068,
> +	HWPfDmaFec4GulRespPtrHiRegVf          =  0x00B8806C,
> +	HWPfDmaFec4GdlDescBaseLoRegVf         =  0x00B88080,
> +	HWPfDmaFec4GdlDescBaseHiRegVf         =  0x00B88084,
> +	HWPfDmaFec4GdlRespPtrLoRegVf          =  0x00B88088,
> +	HWPfDmaFec4GdlRespPtrHiRegVf          =  0x00B8808C,
> +	HWPfDmaVfDdrBaseRangeRo               =  0x00B880A0,
> +	HWPfQosmonACntrlReg                   =  0x00B90000,
> +	HWPfQosmonAEvalOverflow0              =  0x00B90008,
> +	HWPfQosmonAEvalOverflow1              =  0x00B9000C,
> +	HWPfQosmonADivTerm                    =  0x00B90010,
> +	HWPfQosmonATickTerm                   =  0x00B90014,
> +	HWPfQosmonAEvalTerm                   =  0x00B90018,
> +	HWPfQosmonAAveTerm                    =  0x00B9001C,
> +	HWPfQosmonAForceEccErr                =  0x00B90020,
> +	HWPfQosmonAEccErrDetect               =  0x00B90024,
> +	HWPfQosmonAIterationConfig0Low        =  0x00B90060,
> +	HWPfQosmonAIterationConfig0High       =  0x00B90064,
> +	HWPfQosmonAIterationConfig1Low        =  0x00B90068,
> +	HWPfQosmonAIterationConfig1High       =  0x00B9006C,
> +	HWPfQosmonAIterationConfig2Low        =  0x00B90070,
> +	HWPfQosmonAIterationConfig2High       =  0x00B90074,
> +	HWPfQosmonAIterationConfig3Low        =  0x00B90078,
> +	HWPfQosmonAIterationConfig3High       =  0x00B9007C,
> +	HWPfQosmonAEvalMemAddr                =  0x00B90080,
> +	HWPfQosmonAEvalMemData                =  0x00B90084,
> +	HWPfQosmonAXaction                    =  0x00B900C0,
> +	HWPfQosmonARemThres1Vf                =  0x00B90400,
> +	HWPfQosmonAThres2Vf                   =  0x00B90404,
> +	HWPfQosmonAWeiFracVf                  =  0x00B90408,
> +	HWPfQosmonARrWeiVf                    =  0x00B9040C,
> +	HWPfPermonACntrlRegVf                 =  0x00B98000,
> +	HWPfPermonACountVf                    =  0x00B98008,
> +	HWPfPermonAKCntLoVf                   =  0x00B98010,
> +	HWPfPermonAKCntHiVf                   =  0x00B98014,
> +	HWPfPermonADeltaCntLoVf               =  0x00B98020,
> +	HWPfPermonADeltaCntHiVf               =  0x00B98024,
> +	HWPfPermonAVersionReg                 =  0x00B9C000,
> +	HWPfPermonACbControlFec               =  0x00B9C0F0,
> +	HWPfPermonADltTimerLoFec              =  0x00B9C0F4,
> +	HWPfPermonADltTimerHiFec              =  0x00B9C0F8,
> +	HWPfPermonACbCountFec                 =  0x00B9C100,
> +	HWPfPermonAAccExecTimerLoFec          =  0x00B9C104,
> +	HWPfPermonAAccExecTimerHiFec          =  0x00B9C108,
> +	HWPfPermonAExecTimerMinFec            =  0x00B9C200,
> +	HWPfPermonAExecTimerMaxFec            =  0x00B9C204,
> +	HWPfPermonAControlBusMon              =  0x00B9C400,
> +	HWPfPermonAConfigBusMon               =  0x00B9C404,
> +	HWPfPermonASkipCountBusMon            =  0x00B9C408,
> +	HWPfPermonAMinLatBusMon               =  0x00B9C40C,
> +	HWPfPermonAMaxLatBusMon               =  0x00B9C500,
> +	HWPfPermonATotalLatLowBusMon          =  0x00B9C504,
> +	HWPfPermonATotalLatUpperBusMon        =  0x00B9C508,
> +	HWPfPermonATotalReqCntBusMon          =  0x00B9C50C,
> +	HWPfQosmonBCntrlReg                   =  0x00BA0000,
> +	HWPfQosmonBEvalOverflow0              =  0x00BA0008,
> +	HWPfQosmonBEvalOverflow1              =  0x00BA000C,
> +	HWPfQosmonBDivTerm                    =  0x00BA0010,
> +	HWPfQosmonBTickTerm                   =  0x00BA0014,
> +	HWPfQosmonBEvalTerm                   =  0x00BA0018,
> +	HWPfQosmonBAveTerm                    =  0x00BA001C,
> +	HWPfQosmonBForceEccErr                =  0x00BA0020,
> +	HWPfQosmonBEccErrDetect               =  0x00BA0024,
> +	HWPfQosmonBIterationConfig0Low        =  0x00BA0060,
> +	HWPfQosmonBIterationConfig0High       =  0x00BA0064,
> +	HWPfQosmonBIterationConfig1Low        =  0x00BA0068,
> +	HWPfQosmonBIterationConfig1High       =  0x00BA006C,
> +	HWPfQosmonBIterationConfig2Low        =  0x00BA0070,
> +	HWPfQosmonBIterationConfig2High       =  0x00BA0074,
> +	HWPfQosmonBIterationConfig3Low        =  0x00BA0078,
> +	HWPfQosmonBIterationConfig3High       =  0x00BA007C,
> +	HWPfQosmonBEvalMemAddr                =  0x00BA0080,
> +	HWPfQosmonBEvalMemData                =  0x00BA0084,
> +	HWPfQosmonBXaction                    =  0x00BA00C0,
> +	HWPfQosmonBRemThres1Vf                =  0x00BA0400,
> +	HWPfQosmonBThres2Vf                   =  0x00BA0404,
> +	HWPfQosmonBWeiFracVf                  =  0x00BA0408,
> +	HWPfQosmonBRrWeiVf                    =  0x00BA040C,
> +	HWPfPermonBCntrlRegVf                 =  0x00BA8000,
> +	HWPfPermonBCountVf                    =  0x00BA8008,
> +	HWPfPermonBKCntLoVf                   =  0x00BA8010,
> +	HWPfPermonBKCntHiVf                   =  0x00BA8014,
> +	HWPfPermonBDeltaCntLoVf               =  0x00BA8020,
> +	HWPfPermonBDeltaCntHiVf               =  0x00BA8024,
> +	HWPfPermonBVersionReg                 =  0x00BAC000,
> +	HWPfPermonBCbControlFec               =  0x00BAC0F0,
> +	HWPfPermonBDltTimerLoFec              =  0x00BAC0F4,
> +	HWPfPermonBDltTimerHiFec              =  0x00BAC0F8,
> +	HWPfPermonBCbCountFec                 =  0x00BAC100,
> +	HWPfPermonBAccExecTimerLoFec          =  0x00BAC104,
> +	HWPfPermonBAccExecTimerHiFec          =  0x00BAC108,
> +	HWPfPermonBExecTimerMinFec            =  0x00BAC200,
> +	HWPfPermonBExecTimerMaxFec            =  0x00BAC204,
> +	HWPfPermonBControlBusMon              =  0x00BAC400,
> +	HWPfPermonBConfigBusMon               =  0x00BAC404,
> +	HWPfPermonBSkipCountBusMon            =  0x00BAC408,
> +	HWPfPermonBMinLatBusMon               =  0x00BAC40C,
> +	HWPfPermonBMaxLatBusMon               =  0x00BAC500,
> +	HWPfPermonBTotalLatLowBusMon          =  0x00BAC504,
> +	HWPfPermonBTotalLatUpperBusMon        =  0x00BAC508,
> +	HWPfPermonBTotalReqCntBusMon          =  0x00BAC50C,
> +	HWPfFecUl5gCntrlReg                   =  0x00BC0000,
> +	HWPfFecUl5gI2MThreshReg               =  0x00BC0004,
> +	HWPfFecUl5gVersionReg                 =  0x00BC0100,
> +	HWPfFecUl5gFcwStatusReg               =  0x00BC0104,
> +	HWPfFecUl5gWarnReg                    =  0x00BC0108,
> +	HwPfFecUl5gIbDebugReg                 =  0x00BC0200,
> +	HwPfFecUl5gObLlrDebugReg              =  0x00BC0204,
> +	HwPfFecUl5gObHarqDebugReg             =  0x00BC0208,
> +	HwPfFecUl5g1CntrlReg                  =  0x00BC1000,
> +	HwPfFecUl5g1I2MThreshReg              =  0x00BC1004,
> +	HwPfFecUl5g1VersionReg                =  0x00BC1100,
> +	HwPfFecUl5g1FcwStatusReg              =  0x00BC1104,
> +	HwPfFecUl5g1WarnReg                   =  0x00BC1108,
> +	HwPfFecUl5g1IbDebugReg                =  0x00BC1200,
> +	HwPfFecUl5g1ObLlrDebugReg             =  0x00BC1204,
> +	HwPfFecUl5g1ObHarqDebugReg            =  0x00BC1208,
> +	HwPfFecUl5g2CntrlReg                  =  0x00BC2000,
> +	HwPfFecUl5g2I2MThreshReg              =  0x00BC2004,
> +	HwPfFecUl5g2VersionReg                =  0x00BC2100,
> +	HwPfFecUl5g2FcwStatusReg              =  0x00BC2104,
> +	HwPfFecUl5g2WarnReg                   =  0x00BC2108,
> +	HwPfFecUl5g2IbDebugReg                =  0x00BC2200,
> +	HwPfFecUl5g2ObLlrDebugReg             =  0x00BC2204,
> +	HwPfFecUl5g2ObHarqDebugReg            =  0x00BC2208,
> +	HwPfFecUl5g3CntrlReg                  =  0x00BC3000,
> +	HwPfFecUl5g3I2MThreshReg              =  0x00BC3004,
> +	HwPfFecUl5g3VersionReg                =  0x00BC3100,
> +	HwPfFecUl5g3FcwStatusReg              =  0x00BC3104,
> +	HwPfFecUl5g3WarnReg                   =  0x00BC3108,
> +	HwPfFecUl5g3IbDebugReg                =  0x00BC3200,
> +	HwPfFecUl5g3ObLlrDebugReg             =  0x00BC3204,
> +	HwPfFecUl5g3ObHarqDebugReg            =  0x00BC3208,
> +	HwPfFecUl5g4CntrlReg                  =  0x00BC4000,
> +	HwPfFecUl5g4I2MThreshReg              =  0x00BC4004,
> +	HwPfFecUl5g4VersionReg                =  0x00BC4100,
> +	HwPfFecUl5g4FcwStatusReg              =  0x00BC4104,
> +	HwPfFecUl5g4WarnReg                   =  0x00BC4108,
> +	HwPfFecUl5g4IbDebugReg                =  0x00BC4200,
> +	HwPfFecUl5g4ObLlrDebugReg             =  0x00BC4204,
> +	HwPfFecUl5g4ObHarqDebugReg            =  0x00BC4208,
> +	HwPfFecUl5g5CntrlReg                  =  0x00BC5000,
> +	HwPfFecUl5g5I2MThreshReg              =  0x00BC5004,
> +	HwPfFecUl5g5VersionReg                =  0x00BC5100,
> +	HwPfFecUl5g5FcwStatusReg              =  0x00BC5104,
> +	HwPfFecUl5g5WarnReg                   =  0x00BC5108,
> +	HwPfFecUl5g5IbDebugReg                =  0x00BC5200,
> +	HwPfFecUl5g5ObLlrDebugReg             =  0x00BC5204,
> +	HwPfFecUl5g5ObHarqDebugReg            =  0x00BC5208,
> +	HwPfFecUl5g6CntrlReg                  =  0x00BC6000,
> +	HwPfFecUl5g6I2MThreshReg              =  0x00BC6004,
> +	HwPfFecUl5g6VersionReg                =  0x00BC6100,
> +	HwPfFecUl5g6FcwStatusReg              =  0x00BC6104,
> +	HwPfFecUl5g6WarnReg                   =  0x00BC6108,
> +	HwPfFecUl5g6IbDebugReg                =  0x00BC6200,
> +	HwPfFecUl5g6ObLlrDebugReg             =  0x00BC6204,
> +	HwPfFecUl5g6ObHarqDebugReg            =  0x00BC6208,
> +	HwPfFecUl5g7CntrlReg                  =  0x00BC7000,
> +	HwPfFecUl5g7I2MThreshReg              =  0x00BC7004,
> +	HwPfFecUl5g7VersionReg                =  0x00BC7100,
> +	HwPfFecUl5g7FcwStatusReg              =  0x00BC7104,
> +	HwPfFecUl5g7WarnReg                   =  0x00BC7108,
> +	HwPfFecUl5g7IbDebugReg                =  0x00BC7200,
> +	HwPfFecUl5g7ObLlrDebugReg             =  0x00BC7204,
> +	HwPfFecUl5g7ObHarqDebugReg            =  0x00BC7208,
> +	HwPfFecUl5g8CntrlReg                  =  0x00BC8000,
> +	HwPfFecUl5g8I2MThreshReg              =  0x00BC8004,
> +	HwPfFecUl5g8VersionReg                =  0x00BC8100,
> +	HwPfFecUl5g8FcwStatusReg              =  0x00BC8104,
> +	HwPfFecUl5g8WarnReg                   =  0x00BC8108,
> +	HwPfFecUl5g8IbDebugReg                =  0x00BC8200,
> +	HwPfFecUl5g8ObLlrDebugReg             =  0x00BC8204,
> +	HwPfFecUl5g8ObHarqDebugReg            =  0x00BC8208,
> +	HWPfFecDl5gCntrlReg                   =  0x00BCF000,
> +	HWPfFecDl5gI2MThreshReg               =  0x00BCF004,
> +	HWPfFecDl5gVersionReg                 =  0x00BCF100,
> +	HWPfFecDl5gFcwStatusReg               =  0x00BCF104,
> +	HWPfFecDl5gWarnReg                    =  0x00BCF108,
> +	HWPfFecUlVersionReg                   =  0x00BD0000,
> +	HWPfFecUlControlReg                   =  0x00BD0004,
> +	HWPfFecUlStatusReg                    =  0x00BD0008,
> +	HWPfFecDlVersionReg                   =  0x00BDF000,
> +	HWPfFecDlClusterConfigReg             =  0x00BDF004,
> +	HWPfFecDlBurstThres                   =  0x00BDF00C,
> +	HWPfFecDlClusterStatusReg0            =  0x00BDF040,
> +	HWPfFecDlClusterStatusReg1            =  0x00BDF044,
> +	HWPfFecDlClusterStatusReg2            =  0x00BDF048,
> +	HWPfFecDlClusterStatusReg3            =  0x00BDF04C,
> +	HWPfFecDlClusterStatusReg4            =  0x00BDF050,
> +	HWPfFecDlClusterStatusReg5            =  0x00BDF054,
> +	HWPfChaFabPllPllrst                   =  0x00C40000,
> +	HWPfChaFabPllClk0                     =  0x00C40004,
> +	HWPfChaFabPllClk1                     =  0x00C40008,
> +	HWPfChaFabPllBwadj                    =  0x00C4000C,
> +	HWPfChaFabPllLbw                      =  0x00C40010,
> +	HWPfChaFabPllResetq                   =  0x00C40014,
> +	HWPfChaFabPllPhshft0                  =  0x00C40018,
> +	HWPfChaFabPllPhshft1                  =  0x00C4001C,
> +	HWPfChaFabPllDivq0                    =  0x00C40020,
> +	HWPfChaFabPllDivq1                    =  0x00C40024,
> +	HWPfChaFabPllDivq2                    =  0x00C40028,
> +	HWPfChaFabPllDivq3                    =  0x00C4002C,
> +	HWPfChaFabPllDivq4                    =  0x00C40030,
> +	HWPfChaFabPllDivq5                    =  0x00C40034,
> +	HWPfChaFabPllDivq6                    =  0x00C40038,
> +	HWPfChaFabPllDivq7                    =  0x00C4003C,
> +	HWPfChaDl5gPllPllrst                  =  0x00C40080,
> +	HWPfChaDl5gPllClk0                    =  0x00C40084,
> +	HWPfChaDl5gPllClk1                    =  0x00C40088,
> +	HWPfChaDl5gPllBwadj                   =  0x00C4008C,
> +	HWPfChaDl5gPllLbw                     =  0x00C40090,
> +	HWPfChaDl5gPllResetq                  =  0x00C40094,
> +	HWPfChaDl5gPllPhshft0                 =  0x00C40098,
> +	HWPfChaDl5gPllPhshft1                 =  0x00C4009C,
> +	HWPfChaDl5gPllDivq0                   =  0x00C400A0,
> +	HWPfChaDl5gPllDivq1                   =  0x00C400A4,
> +	HWPfChaDl5gPllDivq2                   =  0x00C400A8,
> +	HWPfChaDl5gPllDivq3                   =  0x00C400AC,
> +	HWPfChaDl5gPllDivq4                   =  0x00C400B0,
> +	HWPfChaDl5gPllDivq5                   =  0x00C400B4,
> +	HWPfChaDl5gPllDivq6                   =  0x00C400B8,
> +	HWPfChaDl5gPllDivq7                   =  0x00C400BC,
> +	HWPfChaDl4gPllPllrst                  =  0x00C40100,
> +	HWPfChaDl4gPllClk0                    =  0x00C40104,
> +	HWPfChaDl4gPllClk1                    =  0x00C40108,
> +	HWPfChaDl4gPllBwadj                   =  0x00C4010C,
> +	HWPfChaDl4gPllLbw                     =  0x00C40110,
> +	HWPfChaDl4gPllResetq                  =  0x00C40114,
> +	HWPfChaDl4gPllPhshft0                 =  0x00C40118,
> +	HWPfChaDl4gPllPhshft1                 =  0x00C4011C,
> +	HWPfChaDl4gPllDivq0                   =  0x00C40120,
> +	HWPfChaDl4gPllDivq1                   =  0x00C40124,
> +	HWPfChaDl4gPllDivq2                   =  0x00C40128,
> +	HWPfChaDl4gPllDivq3                   =  0x00C4012C,
> +	HWPfChaDl4gPllDivq4                   =  0x00C40130,
> +	HWPfChaDl4gPllDivq5                   =  0x00C40134,
> +	HWPfChaDl4gPllDivq6                   =  0x00C40138,
> +	HWPfChaDl4gPllDivq7                   =  0x00C4013C,
> +	HWPfChaUl5gPllPllrst                  =  0x00C40180,
> +	HWPfChaUl5gPllClk0                    =  0x00C40184,
> +	HWPfChaUl5gPllClk1                    =  0x00C40188,
> +	HWPfChaUl5gPllBwadj                   =  0x00C4018C,
> +	HWPfChaUl5gPllLbw                     =  0x00C40190,
> +	HWPfChaUl5gPllResetq                  =  0x00C40194,
> +	HWPfChaUl5gPllPhshft0                 =  0x00C40198,
> +	HWPfChaUl5gPllPhshft1                 =  0x00C4019C,
> +	HWPfChaUl5gPllDivq0                   =  0x00C401A0,
> +	HWPfChaUl5gPllDivq1                   =  0x00C401A4,
> +	HWPfChaUl5gPllDivq2                   =  0x00C401A8,
> +	HWPfChaUl5gPllDivq3                   =  0x00C401AC,
> +	HWPfChaUl5gPllDivq4                   =  0x00C401B0,
> +	HWPfChaUl5gPllDivq5                   =  0x00C401B4,
> +	HWPfChaUl5gPllDivq6                   =  0x00C401B8,
> +	HWPfChaUl5gPllDivq7                   =  0x00C401BC,
> +	HWPfChaUl4gPllPllrst                  =  0x00C40200,
> +	HWPfChaUl4gPllClk0                    =  0x00C40204,
> +	HWPfChaUl4gPllClk1                    =  0x00C40208,
> +	HWPfChaUl4gPllBwadj                   =  0x00C4020C,
> +	HWPfChaUl4gPllLbw                     =  0x00C40210,
> +	HWPfChaUl4gPllResetq                  =  0x00C40214,
> +	HWPfChaUl4gPllPhshft0                 =  0x00C40218,
> +	HWPfChaUl4gPllPhshft1                 =  0x00C4021C,
> +	HWPfChaUl4gPllDivq0                   =  0x00C40220,
> +	HWPfChaUl4gPllDivq1                   =  0x00C40224,
> +	HWPfChaUl4gPllDivq2                   =  0x00C40228,
> +	HWPfChaUl4gPllDivq3                   =  0x00C4022C,
> +	HWPfChaUl4gPllDivq4                   =  0x00C40230,
> +	HWPfChaUl4gPllDivq5                   =  0x00C40234,
> +	HWPfChaUl4gPllDivq6                   =  0x00C40238,
> +	HWPfChaUl4gPllDivq7                   =  0x00C4023C,
> +	HWPfChaDdrPllPllrst                   =  0x00C40280,
> +	HWPfChaDdrPllClk0                     =  0x00C40284,
> +	HWPfChaDdrPllClk1                     =  0x00C40288,
> +	HWPfChaDdrPllBwadj                    =  0x00C4028C,
> +	HWPfChaDdrPllLbw                      =  0x00C40290,
> +	HWPfChaDdrPllResetq                   =  0x00C40294,
> +	HWPfChaDdrPllPhshft0                  =  0x00C40298,
> +	HWPfChaDdrPllPhshft1                  =  0x00C4029C,
> +	HWPfChaDdrPllDivq0                    =  0x00C402A0,
> +	HWPfChaDdrPllDivq1                    =  0x00C402A4,
> +	HWPfChaDdrPllDivq2                    =  0x00C402A8,
> +	HWPfChaDdrPllDivq3                    =  0x00C402AC,
> +	HWPfChaDdrPllDivq4                    =  0x00C402B0,
> +	HWPfChaDdrPllDivq5                    =  0x00C402B4,
> +	HWPfChaDdrPllDivq6                    =  0x00C402B8,
> +	HWPfChaDdrPllDivq7                    =  0x00C402BC,
> +	HWPfChaErrStatus                      =  0x00C40400,
> +	HWPfChaErrMask                        =  0x00C40404,
> +	HWPfChaDebugPcieMsiFifo               =  0x00C40410,
> +	HWPfChaDebugDdrMsiFifo                =  0x00C40414,
> +	HWPfChaDebugMiscMsiFifo               =  0x00C40418,
> +	HWPfChaPwmSet                         =  0x00C40420,
> +	HWPfChaDdrRstStatus                   =  0x00C40430,
> +	HWPfChaDdrStDoneStatus                =  0x00C40434,
> +	HWPfChaDdrWbRstCfg                    =  0x00C40438,
> +	HWPfChaDdrApbRstCfg                   =  0x00C4043C,
> +	HWPfChaDdrPhyRstCfg                   =  0x00C40440,
> +	HWPfChaDdrCpuRstCfg                   =  0x00C40444,
> +	HWPfChaDdrSifRstCfg                   =  0x00C40448,
> +	HWPfChaPadcfgPcomp0                   =  0x00C41000,
> +	HWPfChaPadcfgNcomp0                   =  0x00C41004,
> +	HWPfChaPadcfgOdt0                     =  0x00C41008,
> +	HWPfChaPadcfgProtect0                 =  0x00C4100C,
> +	HWPfChaPreemphasisProtect0            =  0x00C41010,
> +	HWPfChaPreemphasisCompen0             =  0x00C41040,
> +	HWPfChaPreemphasisOdten0              =  0x00C41044,
> +	HWPfChaPadcfgPcomp1                   =  0x00C41100,
> +	HWPfChaPadcfgNcomp1                   =  0x00C41104,
> +	HWPfChaPadcfgOdt1                     =  0x00C41108,
> +	HWPfChaPadcfgProtect1                 =  0x00C4110C,
> +	HWPfChaPreemphasisProtect1            =  0x00C41110,
> +	HWPfChaPreemphasisCompen1             =  0x00C41140,
> +	HWPfChaPreemphasisOdten1              =  0x00C41144,
> +	HWPfChaPadcfgPcomp2                   =  0x00C41200,
> +	HWPfChaPadcfgNcomp2                   =  0x00C41204,
> +	HWPfChaPadcfgOdt2                     =  0x00C41208,
> +	HWPfChaPadcfgProtect2                 =  0x00C4120C,
> +	HWPfChaPreemphasisProtect2            =  0x00C41210,
> +	HWPfChaPreemphasisCompen2             =  0x00C41240,
> +	HWPfChaPreemphasisOdten4              =  0x00C41444,
> +	HWPfChaPreemphasisOdten2              =  0x00C41244,
> +	HWPfChaPadcfgPcomp3                   =  0x00C41300,
> +	HWPfChaPadcfgNcomp3                   =  0x00C41304,
> +	HWPfChaPadcfgOdt3                     =  0x00C41308,
> +	HWPfChaPadcfgProtect3                 =  0x00C4130C,
> +	HWPfChaPreemphasisProtect3            =  0x00C41310,
> +	HWPfChaPreemphasisCompen3             =  0x00C41340,
> +	HWPfChaPreemphasisOdten3              =  0x00C41344,
> +	HWPfChaPadcfgPcomp4                   =  0x00C41400,
> +	HWPfChaPadcfgNcomp4                   =  0x00C41404,
> +	HWPfChaPadcfgOdt4                     =  0x00C41408,
> +	HWPfChaPadcfgProtect4                 =  0x00C4140C,
> +	HWPfChaPreemphasisProtect4            =  0x00C41410,
> +	HWPfChaPreemphasisCompen4             =  0x00C41440,
> +	HWPfHiVfToPfDbellVf                   =  0x00C80000,
> +	HWPfHiPfToVfDbellVf                   =  0x00C80008,
> +	HWPfHiInfoRingBaseLoVf                =  0x00C80010,
> +	HWPfHiInfoRingBaseHiVf                =  0x00C80014,
> +	HWPfHiInfoRingPointerVf               =  0x00C80018,
> +	HWPfHiInfoRingIntWrEnVf               =  0x00C80020,
> +	HWPfHiInfoRingPf2VfWrEnVf             =  0x00C80024,
> +	HWPfHiMsixVectorMapperVf              =  0x00C80060,
> +	HWPfHiModuleVersionReg                =  0x00C84000,
> +	HWPfHiIosf2axiErrLogReg               =  0x00C84004,
> +	HWPfHiHardResetReg                    =  0x00C84008,
> +	HWPfHi5GHardResetReg                  =  0x00C8400C,
> +	HWPfHiInfoRingBaseLoRegPf             =  0x00C84010,
> +	HWPfHiInfoRingBaseHiRegPf             =  0x00C84014,
> +	HWPfHiInfoRingPointerRegPf            =  0x00C84018,
> +	HWPfHiInfoRingIntWrEnRegPf            =  0x00C84020,
> +	HWPfHiInfoRingVf2pfLoWrEnReg          =  0x00C84024,
> +	HWPfHiInfoRingVf2pfHiWrEnReg          =  0x00C84028,
> +	HWPfHiLogParityErrStatusReg           =  0x00C8402C,
> +	HWPfHiLogDataParityErrorVfStatusLo    =  0x00C84030,
> +	HWPfHiLogDataParityErrorVfStatusHi    =  0x00C84034,
> +	HWPfHiBlockTransmitOnErrorEn          =  0x00C84038,
> +	HWPfHiCfgMsiIntWrEnRegPf              =  0x00C84040,
> +	HWPfHiCfgMsiVf2pfLoWrEnReg            =  0x00C84044,
> +	HWPfHiCfgMsiVf2pfHighWrEnReg          =  0x00C84048,
> +	HWPfHiMsixVectorMapperPf              =  0x00C84060,
> +	HWPfHiApbWrWaitTime                   =  0x00C84100,
> +	HWPfHiXCounterMaxValue                =  0x00C84104,
> +	HWPfHiPfMode                          =  0x00C84108,
> +	HWPfHiClkGateHystReg                  =  0x00C8410C,
> +	HWPfHiSnoopBitsReg                    =  0x00C84110,
> +	HWPfHiMsiDropEnableReg                =  0x00C84114,
> +	HWPfHiMsiStatReg                      =  0x00C84120,
> +	HWPfHiFifoOflStatReg                  =  0x00C84124,
> +	HWPfHiHiDebugReg                      =  0x00C841F4,
> +	HWPfHiDebugMemSnoopMsiFifo            =  0x00C841F8,
> +	HWPfHiDebugMemSnoopInputFifo          =  0x00C841FC,
> +	HWPfHiMsixMappingConfig               =  0x00C84200,
> +	HWPfHiJunkReg                         =  0x00C8FF00,
> +	HWPfDdrUmmcVer                        =  0x00D00000,
> +	HWPfDdrUmmcCap                        =  0x00D00010,
> +	HWPfDdrUmmcCtrl                       =  0x00D00020,
> +	HWPfDdrMpcPe                          =  0x00D00080,
> +	HWPfDdrMpcPpri3                       =  0x00D00090,
> +	HWPfDdrMpcPpri2                       =  0x00D000A0,
> +	HWPfDdrMpcPpri1                       =  0x00D000B0,
> +	HWPfDdrMpcPpri0                       =  0x00D000C0,
> +	HWPfDdrMpcPrwgrpCtrl                  =  0x00D000D0,
> +	HWPfDdrMpcPbw7                        =  0x00D000E0,
> +	HWPfDdrMpcPbw6                        =  0x00D000F0,
> +	HWPfDdrMpcPbw5                        =  0x00D00100,
> +	HWPfDdrMpcPbw4                        =  0x00D00110,
> +	HWPfDdrMpcPbw3                        =  0x00D00120,
> +	HWPfDdrMpcPbw2                        =  0x00D00130,
> +	HWPfDdrMpcPbw1                        =  0x00D00140,
> +	HWPfDdrMpcPbw0                        =  0x00D00150,
> +	HWPfDdrMemoryInit                     =  0x00D00200,
> +	HWPfDdrMemoryInitDone                 =  0x00D00210,
> +	HWPfDdrMemInitPhyTrng0                =  0x00D00240,
> +	HWPfDdrMemInitPhyTrng1                =  0x00D00250,
> +	HWPfDdrMemInitPhyTrng2                =  0x00D00260,
> +	HWPfDdrMemInitPhyTrng3                =  0x00D00270,
> +	HWPfDdrBcDram                         =  0x00D003C0,
> +	HWPfDdrBcAddrMap                      =  0x00D003D0,
> +	HWPfDdrBcRef                          =  0x00D003E0,
> +	HWPfDdrBcTim0                         =  0x00D00400,
> +	HWPfDdrBcTim1                         =  0x00D00410,
> +	HWPfDdrBcTim2                         =  0x00D00420,
> +	HWPfDdrBcTim3                         =  0x00D00430,
> +	HWPfDdrBcTim4                         =  0x00D00440,
> +	HWPfDdrBcTim5                         =  0x00D00450,
> +	HWPfDdrBcTim6                         =  0x00D00460,
> +	HWPfDdrBcTim7                         =  0x00D00470,
> +	HWPfDdrBcTim8                         =  0x00D00480,
> +	HWPfDdrBcTim9                         =  0x00D00490,
> +	HWPfDdrBcTim10                        =  0x00D004A0,
> +	HWPfDdrBcTim12                        =  0x00D004C0,
> +	HWPfDdrDfiInit                        =  0x00D004D0,
> +	HWPfDdrDfiInitComplete                =  0x00D004E0,
> +	HWPfDdrDfiTim0                        =  0x00D004F0,
> +	HWPfDdrDfiTim1                        =  0x00D00500,
> +	HWPfDdrDfiPhyUpdEn                    =  0x00D00530,
> +	HWPfDdrMemStatus                      =  0x00D00540,
> +	HWPfDdrUmmcErrStatus                  =  0x00D00550,
> +	HWPfDdrUmmcIntStatus                  =  0x00D00560,
> +	HWPfDdrUmmcIntEn                      =  0x00D00570,
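> +
> +	/* 0x00D48xxx-0x00D7xxxx: apparently the DDR PHY - read/write
> +	 * latency, MRS timing, training, compensation and per-rank DCDL
> +	 * delay lines (inferred from the DdrPhy prefix).
> +	 */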
> +	HWPfDdrPhyRdLatency                   =  0x00D48400,
> +	HWPfDdrPhyRdLatencyDbi                =  0x00D48410,
> +	HWPfDdrPhyWrLatency                   =  0x00D48420,
> +	HWPfDdrPhyTrngType                    =  0x00D48430,
> +	HWPfDdrPhyMrsTiming2                  =  0x00D48440,
> +	HWPfDdrPhyMrsTiming0                  =  0x00D48450,
> +	HWPfDdrPhyMrsTiming1                  =  0x00D48460,
> +	HWPfDdrPhyDramTmrd                    =  0x00D48470,
> +	HWPfDdrPhyDramTmod                    =  0x00D48480,
> +	HWPfDdrPhyDramTwpre                   =  0x00D48490,
> +	HWPfDdrPhyDramTrfc                    =  0x00D484A0,
> +	HWPfDdrPhyDramTrwtp                   =  0x00D484B0,
> +	HWPfDdrPhyMr01Dimm                    =  0x00D484C0,
> +	HWPfDdrPhyMr01DimmDbi                 =  0x00D484D0,
> +	HWPfDdrPhyMr23Dimm                    =  0x00D484E0,
> +	HWPfDdrPhyMr45Dimm                    =  0x00D484F0,
> +	HWPfDdrPhyMr67Dimm                    =  0x00D48500,
> +	HWPfDdrPhyWrlvlWwRdlvlRr              =  0x00D48510,
> +	HWPfDdrPhyOdtEn                       =  0x00D48520,
> +	HWPfDdrPhyFastTrng                    =  0x00D48530,
> +	HWPfDdrPhyDynTrngGap                  =  0x00D48540,
> +	HWPfDdrPhyDynRcalGap                  =  0x00D48550,
> +	HWPfDdrPhyIdletimeout                 =  0x00D48560,
> +	HWPfDdrPhyRstCkeGap                   =  0x00D48570,
> +	HWPfDdrPhyCkeMrsGap                   =  0x00D48580,
> +	HWPfDdrPhyMemVrefMidVal               =  0x00D48590,
> +	HWPfDdrPhyVrefStep                    =  0x00D485A0,
> +	HWPfDdrPhyVrefThreshold               =  0x00D485B0,
> +	HWPfDdrPhyPhyVrefMidVal               =  0x00D485C0,
> +	HWPfDdrPhyDqsCountMax                 =  0x00D485D0,
> +	HWPfDdrPhyDqsCountNum                 =  0x00D485E0,
> +	HWPfDdrPhyDramRow                     =  0x00D485F0,
> +	HWPfDdrPhyDramCol                     =  0x00D48600,
> +	HWPfDdrPhyDramBgBa                    =  0x00D48610,
> +	HWPfDdrPhyDynamicUpdreqrel            =  0x00D48620,
> +	HWPfDdrPhyVrefLimits                  =  0x00D48630,
> +	HWPfDdrPhyIdtmTcStatus                =  0x00D6C020,
> +	HWPfDdrPhyIdtmFwVersion               =  0x00D6C410,
> +	HWPfDdrPhyRdlvlGateInitDelay          =  0x00D70000,
> +	HWPfDdrPhyRdenSmplabc                 =  0x00D70008,
> +	HWPfDdrPhyVrefNibble0                 =  0x00D7000C,
> +	HWPfDdrPhyVrefNibble1                 =  0x00D70010,
> +	HWPfDdrPhyRdlvlGateDqsSmpl0           =  0x00D70014,
> +	HWPfDdrPhyRdlvlGateDqsSmpl1           =  0x00D70018,
> +	HWPfDdrPhyRdlvlGateDqsSmpl2           =  0x00D7001C,
> +	HWPfDdrPhyDqsCount                    =  0x00D70020,
> +	HWPfDdrPhyWrlvlRdlvlGateStatus        =  0x00D70024,
> +	HWPfDdrPhyErrorFlags                  =  0x00D70028,
> +	HWPfDdrPhyPowerDown                   =  0x00D70030,
> +	HWPfDdrPhyPrbsSeedByte0               =  0x00D70034,
> +	HWPfDdrPhyPrbsSeedByte1               =  0x00D70038,
> +	HWPfDdrPhyPcompDq                     =  0x00D70040,
> +	HWPfDdrPhyNcompDq                     =  0x00D70044,
> +	HWPfDdrPhyPcompDqs                    =  0x00D70048,
> +	HWPfDdrPhyNcompDqs                    =  0x00D7004C,
> +	HWPfDdrPhyPcompCmd                    =  0x00D70050,
> +	HWPfDdrPhyNcompCmd                    =  0x00D70054,
> +	HWPfDdrPhyPcompCk                     =  0x00D70058,
> +	HWPfDdrPhyNcompCk                     =  0x00D7005C,
> +	HWPfDdrPhyRcalOdtDq                   =  0x00D70060,
> +	HWPfDdrPhyRcalOdtDqs                  =  0x00D70064,
> +	HWPfDdrPhyRcalMask1                   =  0x00D70068,
> +	HWPfDdrPhyRcalMask2                   =  0x00D7006C,
> +	HWPfDdrPhyRcalCtrl                    =  0x00D70070,
> +	HWPfDdrPhyRcalCnt                     =  0x00D70074,
> +	HWPfDdrPhyRcalOverride                =  0x00D70078,
> +	HWPfDdrPhyRcalGateen                  =  0x00D7007C,
> +	HWPfDdrPhyCtrl                        =  0x00D70080,
> +	HWPfDdrPhyWrlvlAlg                    =  0x00D70084,
> +	HWPfDdrPhyRcalVreftTxcmdOdt           =  0x00D70088,
> +	HWPfDdrPhyRdlvlGateParam              =  0x00D7008C,
> +	HWPfDdrPhyRdlvlGateParam2             =  0x00D70090,
> +	HWPfDdrPhyRcalVreftTxdata             =  0x00D70094,
> +	HWPfDdrPhyCmdIntDelay                 =  0x00D700A4,
> +	HWPfDdrPhyAlertN                      =  0x00D700A8,
> +	HWPfDdrPhyTrngReqWpre2tck             =  0x00D700AC,
> +	HWPfDdrPhyCmdPhaseSel                 =  0x00D700B4,
> +	HWPfDdrPhyCmdDcdl                     =  0x00D700B8,
> +	HWPfDdrPhyCkDcdl                      =  0x00D700BC,
> +	HWPfDdrPhySwTrngCtrl1                 =  0x00D700C0,
> +	HWPfDdrPhySwTrngCtrl2                 =  0x00D700C4,
> +	HWPfDdrPhyRcalPcompRden               =  0x00D700C8,
> +	HWPfDdrPhyRcalNcompRden               =  0x00D700CC,
> +	HWPfDdrPhyRcalCompen                  =  0x00D700D0,
> +	HWPfDdrPhySwTrngRdqs                  =  0x00D700D4,
> +	HWPfDdrPhySwTrngWdqs                  =  0x00D700D8,
> +	HWPfDdrPhySwTrngRdena                 =  0x00D700DC,
> +	HWPfDdrPhySwTrngRdenb                 =  0x00D700E0,
> +	HWPfDdrPhySwTrngRdenc                 =  0x00D700E4,
> +	HWPfDdrPhySwTrngWdq                   =  0x00D700E8,
> +	HWPfDdrPhySwTrngRdq                   =  0x00D700EC,
> +	HWPfDdrPhyPcfgHmValue                 =  0x00D700F0,
> +	HWPfDdrPhyPcfgTimerValue              =  0x00D700F4,
> +	HWPfDdrPhyPcfgSoftwareTraining        =  0x00D700F8,
> +	HWPfDdrPhyPcfgMcStatus                =  0x00D700FC,
> +	HWPfDdrPhyWrlvlPhRank0                =  0x00D70100,
> +	HWPfDdrPhyRdenPhRank0                 =  0x00D70104,
> +	HWPfDdrPhyRdenIntRank0                =  0x00D70108,
> +	HWPfDdrPhyRdqsDcdlRank0               =  0x00D7010C,
> +	HWPfDdrPhyRdqsShadowDcdlRank0         =  0x00D70110,
> +	HWPfDdrPhyWdqsDcdlRank0               =  0x00D70114,
> +	HWPfDdrPhyWdmDcdlShadowRank0          =  0x00D70118,
> +	HWPfDdrPhyWdmDcdlRank0                =  0x00D7011C,
> +	HWPfDdrPhyDbiDcdlRank0                =  0x00D70120,
> +	HWPfDdrPhyRdenDcdlaRank0              =  0x00D70124,
> +	HWPfDdrPhyDbiDcdlShadowRank0          =  0x00D70128,
> +	HWPfDdrPhyRdenDcdlbRank0              =  0x00D7012C,
> +	HWPfDdrPhyWdqsShadowDcdlRank0         =  0x00D70130,
> +	HWPfDdrPhyRdenDcdlcRank0              =  0x00D70134,
> +	HWPfDdrPhyRdenShadowDcdlaRank0        =  0x00D70138,
> +	HWPfDdrPhyWrlvlIntRank0               =  0x00D7013C,
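> +
> +	/* Per-bit DCDL delays for rank 0: the four registers below repeat
> +	 * for bits 0-7 at a 0x40 stride (Rdq at +0x0, RdqShadow at +0x4,
> +	 * Wdq at +0x8, WdqShadow at +0xC), as the addresses show.
> +	 */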
> +	HWPfDdrPhyRdqDcdlBit0Rank0            =  0x00D70200,
> +	HWPfDdrPhyRdqDcdlShadowBit0Rank0      =  0x00D70204,
> +	HWPfDdrPhyWdqDcdlBit0Rank0            =  0x00D70208,
> +	HWPfDdrPhyWdqDcdlShadowBit0Rank0      =  0x00D7020C,
> +	HWPfDdrPhyRdqDcdlBit1Rank0            =  0x00D70240,
> +	HWPfDdrPhyRdqDcdlShadowBit1Rank0      =  0x00D70244,
> +	HWPfDdrPhyWdqDcdlBit1Rank0            =  0x00D70248,
> +	HWPfDdrPhyWdqDcdlShadowBit1Rank0      =  0x00D7024C,
> +	HWPfDdrPhyRdqDcdlBit2Rank0            =  0x00D70280,
> +	HWPfDdrPhyRdqDcdlShadowBit2Rank0      =  0x00D70284,
> +	HWPfDdrPhyWdqDcdlBit2Rank0            =  0x00D70288,
> +	HWPfDdrPhyWdqDcdlShadowBit2Rank0      =  0x00D7028C,
> +	HWPfDdrPhyRdqDcdlBit3Rank0            =  0x00D702C0,
> +	HWPfDdrPhyRdqDcdlShadowBit3Rank0      =  0x00D702C4,
> +	HWPfDdrPhyWdqDcdlBit3Rank0            =  0x00D702C8,
> +	HWPfDdrPhyWdqDcdlShadowBit3Rank0      =  0x00D702CC,
> +	HWPfDdrPhyRdqDcdlBit4Rank0            =  0x00D70300,
> +	HWPfDdrPhyRdqDcdlShadowBit4Rank0      =  0x00D70304,
> +	HWPfDdrPhyWdqDcdlBit4Rank0            =  0x00D70308,
> +	HWPfDdrPhyWdqDcdlShadowBit4Rank0      =  0x00D7030C,
> +	HWPfDdrPhyRdqDcdlBit5Rank0            =  0x00D70340,
> +	HWPfDdrPhyRdqDcdlShadowBit5Rank0      =  0x00D70344,
> +	HWPfDdrPhyWdqDcdlBit5Rank0            =  0x00D70348,
> +	HWPfDdrPhyWdqDcdlShadowBit5Rank0      =  0x00D7034C,
> +	HWPfDdrPhyRdqDcdlBit6Rank0            =  0x00D70380,
> +	HWPfDdrPhyRdqDcdlShadowBit6Rank0      =  0x00D70384,
> +	HWPfDdrPhyWdqDcdlBit6Rank0            =  0x00D70388,
> +	HWPfDdrPhyWdqDcdlShadowBit6Rank0      =  0x00D7038C,
> +	HWPfDdrPhyRdqDcdlBit7Rank0            =  0x00D703C0,
> +	HWPfDdrPhyRdqDcdlShadowBit7Rank0      =  0x00D703C4,
> +	HWPfDdrPhyWdqDcdlBit7Rank0            =  0x00D703C8,
> +	HWPfDdrPhyWdqDcdlShadowBit7Rank0      =  0x00D703CC,
> +	HWPfDdrPhyIdtmStatus                  =  0x00D740D0,
> +	HWPfDdrPhyIdtmError                   =  0x00D74110,
> +	HWPfDdrPhyIdtmDebug                   =  0x00D74120,
> +	HWPfDdrPhyIdtmDebugInt                =  0x00D74130,
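> +
> +	/* 0x00D8xxxx: presumably per-lane PCIe SerDes tuning - DCC, CDR
> +	 * and equalization controls (inferred from the PcieLn prefix).
> +	 */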
> +	HwPfPcieLnAsicCfgovr                  =  0x00D80000,
> +	HwPfPcieLnAclkmixer                   =  0x00D80004,
> +	HwPfPcieLnTxrampfreq                  =  0x00D80008,
> +	HwPfPcieLnLanetest                    =  0x00D8000C,
> +	HwPfPcieLnDcctrl                      =  0x00D80010,
> +	HwPfPcieLnDccmeas                     =  0x00D80014,
> +	HwPfPcieLnDccovrAclk                  =  0x00D80018,
> +	HwPfPcieLnDccovrTxa                   =  0x00D8001C,
> +	HwPfPcieLnDccovrTxk                   =  0x00D80020,
> +	HwPfPcieLnDccovrDclk                  =  0x00D80024,
> +	HwPfPcieLnDccovrEclk                  =  0x00D80028,
> +	HwPfPcieLnDcctrimAclk                 =  0x00D8002C,
> +	HwPfPcieLnDcctrimTx                   =  0x00D80030,
> +	HwPfPcieLnDcctrimDclk                 =  0x00D80034,
> +	HwPfPcieLnDcctrimEclk                 =  0x00D80038,
> +	HwPfPcieLnQuadCtrl                    =  0x00D8003C,
> +	HwPfPcieLnQuadCorrIndex               =  0x00D80040,
> +	HwPfPcieLnQuadCorrStatus              =  0x00D80044,
> +	HwPfPcieLnAsicRxovr1                  =  0x00D80048,
> +	HwPfPcieLnAsicRxovr2                  =  0x00D8004C,
> +	HwPfPcieLnAsicEqinfovr                =  0x00D80050,
> +	HwPfPcieLnRxcsr                       =  0x00D80054,
> +	HwPfPcieLnRxfectrl                    =  0x00D80058,
> +	HwPfPcieLnRxtest                      =  0x00D8005C,
> +	HwPfPcieLnEscount                     =  0x00D80060,
> +	HwPfPcieLnCdrctrl                     =  0x00D80064,
> +	HwPfPcieLnCdrctrl2                    =  0x00D80068,
> +	HwPfPcieLnCdrcfg0Ctrl0                =  0x00D8006C,
> +	HwPfPcieLnCdrcfg0Ctrl1                =  0x00D80070,
> +	HwPfPcieLnCdrcfg0Ctrl2                =  0x00D80074,
> +	HwPfPc