DPDK patches and discussions
* [dpdk-dev] [PATCH v3 00/11] bbdev PMD ACC100
@ 2020-08-19  0:25 Nicolas Chautru
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 01/11] drivers/baseband: add PMD for ACC100 Nicolas Chautru
                   ` (10 more replies)
  0 siblings, 11 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-08-19  0:25 UTC (permalink / raw)
  To: dev, akhil.goyal; +Cc: bruce.richardson, Nicolas Chautru

v3: missed a change during rebase
v2: includes clean up from latest CI checks.

This set adds a new PMD for the ACC100 accelerator
for 4G+5G FEC, targeting the 20.11 release.
Documentation is updated accordingly.
Existing unit tests are all still supported.
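
The BBDEV capabilities listed in the series (``RTE_BBDEV_LDPC_*``, ``RTE_BBDEV_TURBO_*``) are bit flags that an application must check against the device's advertised capability mask before enqueueing work. A minimal sketch of that check, with illustrative stand-in flag values rather than the actual constants from ``rte_bbdev_op.h``:

```c
/* Sketch of the capability-mask check an application performs before
 * enqueueing LDPC work. The flag values below are illustrative
 * stand-ins, not the real RTE_BBDEV_LDPC_* values. */
#include <stdint.h>
#include <assert.h>

#define LDPC_CRC_24B_ATTACH   (1U << 0) /* stand-in for RTE_BBDEV_LDPC_CRC_24B_ATTACH */
#define LDPC_RATE_MATCH       (1U << 1) /* stand-in for RTE_BBDEV_LDPC_RATE_MATCH */
#define LDPC_LLR_COMPRESSION  (1U << 2) /* stand-in for RTE_BBDEV_LDPC_LLR_COMPRESSION */

/* Return nonzero when every flag requested by the operation is also
 * advertised in the device's capability mask. */
static int ldpc_op_supported(uint32_t dev_caps, uint32_t op_flags)
{
	return (op_flags & ~dev_caps) == 0;
}
```

An op requesting only flags the device advertises passes; an op requesting any unadvertised flag fails the check.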


Nicolas Chautru (11):
  drivers/baseband: add PMD for ACC100
  baseband/acc100: add register definition file
  baseband/acc100: add info get function
  baseband/acc100: add queue configuration
  baseband/acc100: add LDPC processing functions
  baseband/acc100: add HARQ loopback support
  baseband/acc100: add support for 4G processing
  baseband/acc100: add interrupt support to PMD
  baseband/acc100: add debug function to validate input
  baseband/acc100: add configure function
  doc: update bbdev feature table

 app/test-bbdev/Makefile                            |    3 +
 app/test-bbdev/meson.build                         |    3 +
 app/test-bbdev/test_bbdev_perf.c                   |   72 +
 config/common_base                                 |    4 +
 doc/guides/bbdevs/acc100.rst                       |  233 +
 doc/guides/bbdevs/features/acc100.ini              |   14 +
 doc/guides/bbdevs/features/mbc.ini                 |   14 -
 doc/guides/bbdevs/index.rst                        |    1 +
 doc/guides/rel_notes/release_20_11.rst             |    6 +
 drivers/baseband/Makefile                          |    2 +
 drivers/baseband/acc100/Makefile                   |   28 +
 drivers/baseband/acc100/acc100_pf_enum.h           | 1068 +++++
 drivers/baseband/acc100/acc100_vf_enum.h           |   73 +
 drivers/baseband/acc100/meson.build                |    8 +
 drivers/baseband/acc100/rte_acc100_cfg.h           |  113 +
 drivers/baseband/acc100/rte_acc100_pmd.c           | 4684 ++++++++++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.h           |  593 +++
 .../acc100/rte_pmd_bbdev_acc100_version.map        |   10 +
 drivers/baseband/meson.build                       |    2 +-
 mk/rte.app.mk                                      |    1 +
 20 files changed, 6917 insertions(+), 15 deletions(-)
 create mode 100644 doc/guides/bbdevs/acc100.rst
 create mode 100644 doc/guides/bbdevs/features/acc100.ini
 delete mode 100644 doc/guides/bbdevs/features/mbc.ini
 create mode 100644 drivers/baseband/acc100/Makefile
 create mode 100644 drivers/baseband/acc100/acc100_pf_enum.h
 create mode 100644 drivers/baseband/acc100/acc100_vf_enum.h
 create mode 100644 drivers/baseband/acc100/meson.build
 create mode 100644 drivers/baseband/acc100/rte_acc100_cfg.h
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.c
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.h
 create mode 100644 drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map

-- 
1.8.3.1



* [dpdk-dev] [PATCH v3 01/11] drivers/baseband: add PMD for ACC100
  2020-08-19  0:25 [dpdk-dev] [PATCH v3 00/11] bbdev PMD ACC100 Nicolas Chautru
@ 2020-08-19  0:25 ` Nicolas Chautru
  2020-08-29  9:44   ` Xu, Rosen
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 02/11] baseband/acc100: add register definition file Nicolas Chautru
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 213+ messages in thread
From: Nicolas Chautru @ 2020-08-19  0:25 UTC (permalink / raw)
  To: dev, akhil.goyal; +Cc: bruce.richardson, Nicolas Chautru

Add stubs for the ACC100 PMD

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 config/common_base                                 |   4 +
 doc/guides/bbdevs/acc100.rst                       | 233 +++++++++++++++++++++
 doc/guides/bbdevs/index.rst                        |   1 +
 doc/guides/rel_notes/release_20_11.rst             |   6 +
 drivers/baseband/Makefile                          |   2 +
 drivers/baseband/acc100/Makefile                   |  25 +++
 drivers/baseband/acc100/meson.build                |   6 +
 drivers/baseband/acc100/rte_acc100_pmd.c           | 175 ++++++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.h           |  37 ++++
 .../acc100/rte_pmd_bbdev_acc100_version.map        |   3 +
 drivers/baseband/meson.build                       |   2 +-
 mk/rte.app.mk                                      |   1 +
 12 files changed, 494 insertions(+), 1 deletion(-)
 create mode 100644 doc/guides/bbdevs/acc100.rst
 create mode 100644 drivers/baseband/acc100/Makefile
 create mode 100644 drivers/baseband/acc100/meson.build
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.c
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.h
 create mode 100644 drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map

diff --git a/config/common_base b/config/common_base
index fbf0ee7..218ab16 100644
--- a/config/common_base
+++ b/config/common_base
@@ -584,6 +584,10 @@ CONFIG_RTE_LIBRTE_PMD_BBDEV_NULL=y
 #
 CONFIG_RTE_LIBRTE_PMD_BBDEV_TURBO_SW=y
 
+# Compile PMD for ACC100 bbdev device
+#
+CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100=y
+
 #
 # Compile PMD for Intel FPGA LTE FEC bbdev device
 #
diff --git a/doc/guides/bbdevs/acc100.rst b/doc/guides/bbdevs/acc100.rst
new file mode 100644
index 0000000..f87ee09
--- /dev/null
+++ b/doc/guides/bbdevs/acc100.rst
@@ -0,0 +1,233 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2020 Intel Corporation
+
+Intel(R) ACC100 5G/4G FEC Poll Mode Driver
+==========================================
+
+The BBDEV ACC100 5G/4G FEC poll mode driver (PMD) supports an
+implementation of a VRAN FEC wireless acceleration function.
+This device is also known as Mount Bryce.
+
+Features
+--------
+
+ACC100 5G/4G FEC PMD supports the following features:
+
+- LDPC Encode in the DL (5GNR)
+- LDPC Decode in the UL (5GNR)
+- Turbo Encode in the DL (4G)
+- Turbo Decode in the UL (4G)
+- 16 VFs per PF (physical device)
+- Maximum of 128 queues per VF
+- PCIe Gen-3 x16 Interface
+- MSI
+- SR-IOV
+
+ACC100 5G/4G FEC PMD supports the following BBDEV capabilities:
+
+* For the LDPC encode operation:
+   - ``RTE_BBDEV_LDPC_CRC_24B_ATTACH`` :  set to attach CRC24B to CB(s)
+   - ``RTE_BBDEV_LDPC_RATE_MATCH`` :  if set then do not do Rate Match bypass
+   - ``RTE_BBDEV_LDPC_INTERLEAVER_BYPASS`` : if set then bypass interleaver
+
+* For the LDPC decode operation:
+   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK`` :  check CRC24B from CB(s)
+   - ``RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE`` :  disable early termination
+   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP`` :  drops CRC24B bits appended while decoding
+   - ``RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE`` :  provides an input for HARQ combining
+   - ``RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE`` :  provides an output for HARQ combining
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE`` :  HARQ memory input is internal
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE`` :  HARQ memory output is internal
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK`` :  loopback data to/from HARQ memory
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_FILLERS`` :  HARQ memory includes the filler bits
+   - ``RTE_BBDEV_LDPC_DEC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
+   - ``RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION`` :  supports compression of the HARQ input/output
+   - ``RTE_BBDEV_LDPC_LLR_COMPRESSION`` :  supports LLR input compression
+
+* For the turbo encode operation:
+   - ``RTE_BBDEV_TURBO_CRC_24B_ATTACH`` :  set to attach CRC24B to CB(s)
+   - ``RTE_BBDEV_TURBO_RATE_MATCH`` :  if set then do not do Rate Match bypass
+   - ``RTE_BBDEV_TURBO_ENC_INTERRUPTS`` :  set for encoder dequeue interrupts
+   - ``RTE_BBDEV_TURBO_RV_INDEX_BYPASS`` :  set to bypass RV index
+   - ``RTE_BBDEV_TURBO_ENC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
+
+* For the turbo decode operation:
+   - ``RTE_BBDEV_TURBO_CRC_TYPE_24B`` :  check CRC24B from CB(s)
+   - ``RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE`` :  perform subblock de-interleave
+   - ``RTE_BBDEV_TURBO_DEC_INTERRUPTS`` :  set for decoder dequeue interrupts
+   - ``RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN`` :  set if negative LLR decoder i/p is supported
+   - ``RTE_BBDEV_TURBO_POS_LLR_1_BIT_IN`` :  set if positive LLR decoder i/p is supported
+   - ``RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP`` :  keep CRC24B bits appended while decoding
+   - ``RTE_BBDEV_TURBO_EARLY_TERMINATION`` :  set early termination feature
+   - ``RTE_BBDEV_TURBO_DEC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
+   - ``RTE_BBDEV_TURBO_HALF_ITERATION_EVEN`` :  set half iteration granularity
+
+Installation
+------------
+
+Section 3 of the DPDK manual provides instructions on installing and compiling DPDK. The
+default set of bbdev compile flags may be found in config/common_base, where for example
+the flag to build the ACC100 5G/4G FEC device, ``CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100``,
+is already set.
+
+DPDK requires hugepages to be configured as detailed in section 2 of the DPDK manual.
+The bbdev test application has been tested with a configuration of 40 x 1GB hugepages. The
+hugepage configuration of a server may be examined using:
+
+.. code-block:: console
+
+   grep Huge* /proc/meminfo
+
+
+Initialization
+--------------
+
+When the device first powers up, its PCI Physical Function (PF) can be listed through this command:
+
+.. code-block:: console
+
+  sudo lspci -vd8086:0d5c
+
+The physical and virtual functions are compatible with Linux UIO drivers:
+``vfio`` and ``igb_uio``. However, before it can be used, the ACC100 5G/4G
+FEC device must first be bound to one of these Linux drivers through DPDK.
+
+
+Bind PF UIO driver(s)
+~~~~~~~~~~~~~~~~~~~~~
+
+Install the DPDK igb_uio driver, bind it with the PF PCI device ID and use
+``lspci`` to confirm the PF device is under use by ``igb_uio`` DPDK UIO driver.
+
+The igb_uio driver may be bound to the PF PCI device using one of three methods:
+
+
+1. PCI functions (physical or virtual, depending on the use case) can be bound to
+the UIO driver by repeating this command for every function.
+
+.. code-block:: console
+
+  cd <dpdk-top-level-directory>
+  insmod ./build/kmod/igb_uio.ko
+  echo "8086 0d5c" > /sys/bus/pci/drivers/igb_uio/new_id
+  lspci -vd8086:0d5c
+
+
+2. Another way to bind the PF to the DPDK UIO driver is to use the ``dpdk-devbind.py`` tool
+
+.. code-block:: console
+
+  cd <dpdk-top-level-directory>
+  ./usertools/dpdk-devbind.py -b igb_uio 0000:06:00.0
+
+where the PCI device ID (example: ``0000:06:00.0``) is obtained using ``lspci -vd8086:0d5c``.
+
+
+3. A third way to bind is to use the ``dpdk-setup.sh`` tool
+
+.. code-block:: console
+
+  cd <dpdk-top-level-directory>
+  ./usertools/dpdk-setup.sh
+
+  select 'Bind Ethernet/Crypto/Baseband device to IGB UIO module'
+  or
+  select 'Bind Ethernet/Crypto/Baseband device to VFIO module' depending on driver required
+  enter PCI device ID
+  select 'Display current Ethernet/Crypto/Baseband device settings' to confirm binding
+
+
+The ACC100 5G/4G FEC PF can be bound to vfio in the same way, but the vfio
+driver does not support SR-IOV configuration out of the box, so it needs to be patched.
+
+
+Enable Virtual Functions
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+The lspci printout should now show that the PCI PF is under igb_uio control:
+"``Kernel driver in use: igb_uio``"
+
+To show the number of available VFs on the device, read the ``sriov_totalvfs`` file:
+
+.. code-block:: console
+
+  cat /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_totalvfs
+
+  where 0000\:<b>\:<d>.<f> is the PCI device ID
+
+
+To enable VFs via igb_uio, echo the number of virtual functions to be
+enabled to the ``max_vfs`` file:
+
+.. code-block:: console
+
+  echo <num-of-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/max_vfs
+
+
+Afterwards, all VFs must be bound to the appropriate UIO drivers as required,
+in the same way as was done for the physical function previously.
+
+Enabling SR-IOV via the vfio driver is similar, except that the file
+name is different:
+
+.. code-block:: console
+
+  echo <num-of-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_numvfs
+
+
+Configure the VFs through PF
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The PCI virtual functions must be configured before working or getting assigned
+to VMs/Containers. The configuration involves allocating the number of hardware
+queues, priorities, load balance, bandwidth and other settings necessary for the
+device to perform FEC functions.
+
+This configuration needs to be executed at least once after reboot or PCI FLR and can
+be achieved by using the function ``acc100_configure()``, which sets up the
+parameters defined in ``acc100_conf`` structure.
+
+Test Application
+----------------
+
+BBDEV provides a test application, ``test-bbdev.py``, and a range of test data for
+testing the functionality of ACC100 5G/4G FEC encode and decode, depending on the
+device's capabilities. The test application is located under the app/test-bbdev
+folder and has the following options:
+
+.. code-block:: console
+
+  "-p", "--testapp-path": specifies path to the bbdev test app.
+  "-e", "--eal-params"	: EAL arguments which are passed to the test app.
+  "-t", "--timeout"	: Timeout in seconds (default=300).
+  "-c", "--test-cases"	: Defines test cases to run. Run all if not specified.
+  "-v", "--test-vector"	: Test vector path (default=dpdk_path+/app/test-bbdev/test_vectors/bbdev_null.data).
+  "-n", "--num-ops"	: Number of operations to process on device (default=32).
+  "-b", "--burst-size"	: Operations enqueue/dequeue burst size (default=32).
+  "-s", "--snr"		: SNR in dB used when generating LLRs for bler tests.
+  "-s", "--iter_max"	: Number of iterations for LDPC decoder.
+  "-l", "--num-lcores"	: Number of lcores to run (default=16).
+  "-i", "--init-device" : Initialise PF device with default values.
+
+
+To execute the test application tool using simple decode or encode data,
+type one of the following:
+
+.. code-block:: console
+
+  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_dec_default.data
+  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_enc_default.data
+
+
+The test application ``test-bbdev.py`` supports the ability to configure the PF device
+with a default set of values, if the ``-i`` or ``--init-device`` option is included.
+The default values are defined in ``test_bbdev_perf.c``.
+
+
+Test Vectors
+~~~~~~~~~~~~
+
+In addition to the simple LDPC decoder and LDPC encoder tests, bbdev also provides
+a range of additional tests under the test_vectors folder, which may be useful. The results
+of these tests will depend on the ACC100 5G/4G FEC capabilities which may cause some
+testcases to be skipped, but no failure should be reported.
diff --git a/doc/guides/bbdevs/index.rst b/doc/guides/bbdevs/index.rst
index a8092dd..4445cbd 100644
--- a/doc/guides/bbdevs/index.rst
+++ b/doc/guides/bbdevs/index.rst
@@ -13,3 +13,4 @@ Baseband Device Drivers
     turbo_sw
     fpga_lte_fec
     fpga_5gnr_fec
+    acc100
diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
index df227a1..b3ab614 100644
--- a/doc/guides/rel_notes/release_20_11.rst
+++ b/doc/guides/rel_notes/release_20_11.rst
@@ -55,6 +55,12 @@ New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+* **Added Intel ACC100 bbdev PMD.**
+
+  Added a new ``acc100`` bbdev driver for the Intel\ |reg| ACC100 accelerator
+  also known as Mount Bryce.  See the
+  :doc:`../bbdevs/acc100` BBDEV guide for more details on this new driver.
+
 
 Removed Items
 -------------
diff --git a/drivers/baseband/Makefile b/drivers/baseband/Makefile
index dcc0969..b640294 100644
--- a/drivers/baseband/Makefile
+++ b/drivers/baseband/Makefile
@@ -10,6 +10,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_NULL) += null
 DEPDIRS-null = $(core-libs)
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_TURBO_SW) += turbo_sw
 DEPDIRS-turbo_sw = $(core-libs)
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100) += acc100
+DEPDIRS-acc100 = $(core-libs)
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_LTE_FEC) += fpga_lte_fec
 DEPDIRS-fpga_lte_fec = $(core-libs)
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC) += fpga_5gnr_fec
diff --git a/drivers/baseband/acc100/Makefile b/drivers/baseband/acc100/Makefile
new file mode 100644
index 0000000..c79e487
--- /dev/null
+++ b/drivers/baseband/acc100/Makefile
@@ -0,0 +1,25 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2020 Intel Corporation
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_pmd_bbdev_acc100.a
+
+# build flags
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring -lrte_cfgfile
+LDLIBS += -lrte_bbdev
+LDLIBS += -lrte_pci -lrte_bus_pci
+
+# versioning export map
+EXPORT_MAP := rte_pmd_bbdev_acc100_version.map
+
+# library version
+LIBABIVER := 1
+
+# library source files
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100) += rte_acc100_pmd.c
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/baseband/acc100/meson.build b/drivers/baseband/acc100/meson.build
new file mode 100644
index 0000000..8afafc2
--- /dev/null
+++ b/drivers/baseband/acc100/meson.build
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2020 Intel Corporation
+
+deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci']
+
+sources = files('rte_acc100_pmd.c')
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
new file mode 100644
index 0000000..1b4cd13
--- /dev/null
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -0,0 +1,175 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include <unistd.h>
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_dev.h>
+#include <rte_malloc.h>
+#include <rte_mempool.h>
+#include <rte_byteorder.h>
+#include <rte_errno.h>
+#include <rte_branch_prediction.h>
+#include <rte_hexdump.h>
+#include <rte_pci.h>
+#include <rte_bus_pci.h>
+
+#include <rte_bbdev.h>
+#include <rte_bbdev_pmd.h>
+#include "rte_acc100_pmd.h"
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, DEBUG);
+#else
+RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
+#endif
+
+/* Free 64MB memory used for software rings */
+static int
+acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
+{
+	return 0;
+}
+
+static const struct rte_bbdev_ops acc100_bbdev_ops = {
+	.close = acc100_dev_close,
+};
+
+/* ACC100 PCI PF address map */
+static struct rte_pci_id pci_id_acc100_pf_map[] = {
+	{
+		RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID, RTE_ACC100_PF_DEVICE_ID)
+	},
+	{.device_id = 0},
+};
+
+/* ACC100 PCI VF address map */
+static struct rte_pci_id pci_id_acc100_vf_map[] = {
+	{
+		RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID, RTE_ACC100_VF_DEVICE_ID)
+	},
+	{.device_id = 0},
+};
+
+/* Initialization Function */
+static void
+acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
+{
+	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
+
+	dev->dev_ops = &acc100_bbdev_ops;
+
+	((struct acc100_device *) dev->data->dev_private)->pf_device =
+			!strcmp(drv->driver.name,
+					RTE_STR(ACC100PF_DRIVER_NAME));
+	((struct acc100_device *) dev->data->dev_private)->mmio_base =
+			pci_dev->mem_resource[0].addr;
+
+	rte_bbdev_log_debug("Init device %s [%s] @ vaddr %p paddr %#"PRIx64"",
+			drv->driver.name, dev->data->name,
+			(void *)pci_dev->mem_resource[0].addr,
+			pci_dev->mem_resource[0].phys_addr);
+}
+
+static int acc100_pci_probe(struct rte_pci_driver *pci_drv,
+	struct rte_pci_device *pci_dev)
+{
+	struct rte_bbdev *bbdev = NULL;
+	char dev_name[RTE_BBDEV_NAME_MAX_LEN];
+
+	if (pci_dev == NULL) {
+		rte_bbdev_log(ERR, "NULL PCI device");
+		return -EINVAL;
+	}
+
+	rte_pci_device_name(&pci_dev->addr, dev_name, sizeof(dev_name));
+
+	/* Allocate memory to be used privately by drivers */
+	bbdev = rte_bbdev_allocate(pci_dev->device.name);
+	if (bbdev == NULL)
+		return -ENODEV;
+
+	/* allocate device private memory */
+	bbdev->data->dev_private = rte_zmalloc_socket(dev_name,
+			sizeof(struct acc100_device), RTE_CACHE_LINE_SIZE,
+			pci_dev->device.numa_node);
+
+	if (bbdev->data->dev_private == NULL) {
+		rte_bbdev_log(CRIT,
+				"Allocate of %zu bytes for device \"%s\" failed",
+				sizeof(struct acc100_device), dev_name);
+		rte_bbdev_release(bbdev);
+		return -ENOMEM;
+	}
+
+	/* Fill HW specific part of device structure */
+	bbdev->device = &pci_dev->device;
+	bbdev->intr_handle = &pci_dev->intr_handle;
+	bbdev->data->socket_id = pci_dev->device.numa_node;
+
+	/* Invoke ACC100 device initialization function */
+	acc100_bbdev_init(bbdev, pci_drv);
+
+	rte_bbdev_log_debug("Initialised bbdev %s (id = %u)",
+			dev_name, bbdev->data->dev_id);
+	return 0;
+}
+
+static int acc100_pci_remove(struct rte_pci_device *pci_dev)
+{
+	struct rte_bbdev *bbdev;
+	int ret;
+	uint8_t dev_id;
+
+	if (pci_dev == NULL)
+		return -EINVAL;
+
+	/* Find device */
+	bbdev = rte_bbdev_get_named_dev(pci_dev->device.name);
+	if (bbdev == NULL) {
+		rte_bbdev_log(CRIT,
+				"Couldn't find HW dev \"%s\" to uninitialise it",
+				pci_dev->device.name);
+		return -ENODEV;
+	}
+	dev_id = bbdev->data->dev_id;
+
+	/* free device private memory before close */
+	rte_free(bbdev->data->dev_private);
+
+	/* Close device */
+	ret = rte_bbdev_close(dev_id);
+	if (ret < 0)
+		rte_bbdev_log(ERR,
+				"Device %i failed to close during uninit: %i",
+				dev_id, ret);
+
+	/* release bbdev from library */
+	rte_bbdev_release(bbdev);
+
+	rte_bbdev_log_debug("Destroyed bbdev = %u", dev_id);
+
+	return 0;
+}
+
+static struct rte_pci_driver acc100_pci_pf_driver = {
+		.probe = acc100_pci_probe,
+		.remove = acc100_pci_remove,
+		.id_table = pci_id_acc100_pf_map,
+		.drv_flags = RTE_PCI_DRV_NEED_MAPPING
+};
+
+static struct rte_pci_driver acc100_pci_vf_driver = {
+		.probe = acc100_pci_probe,
+		.remove = acc100_pci_remove,
+		.id_table = pci_id_acc100_vf_map,
+		.drv_flags = RTE_PCI_DRV_NEED_MAPPING
+};
+
+RTE_PMD_REGISTER_PCI(ACC100PF_DRIVER_NAME, acc100_pci_pf_driver);
+RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
+RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
+RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
+
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
new file mode 100644
index 0000000..6f46df0
--- /dev/null
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -0,0 +1,37 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_ACC100_PMD_H_
+#define _RTE_ACC100_PMD_H_
+
+/* Helper macro for logging */
+#define rte_bbdev_log(level, fmt, ...) \
+	rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \
+		##__VA_ARGS__)
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+#define rte_bbdev_log_debug(fmt, ...) \
+		rte_bbdev_log(DEBUG, "acc100_pmd: " fmt, \
+		##__VA_ARGS__)
+#else
+#define rte_bbdev_log_debug(fmt, ...)
+#endif
+
+/* ACC100 PF and VF driver names */
+#define ACC100PF_DRIVER_NAME           intel_acc100_pf
+#define ACC100VF_DRIVER_NAME           intel_acc100_vf
+
+/* ACC100 PCI vendor & device IDs */
+#define RTE_ACC100_VENDOR_ID           (0x8086)
+#define RTE_ACC100_PF_DEVICE_ID        (0x0d5c)
+#define RTE_ACC100_VF_DEVICE_ID        (0x0d5d)
+
+/* Private data structure for each ACC100 device */
+struct acc100_device {
+	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
+	bool pf_device; /**< True if this is a PF ACC100 device */
+	bool configured; /**< True if this ACC100 device is configured */
+};
+
+#endif /* _RTE_ACC100_PMD_H_ */
diff --git a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
new file mode 100644
index 0000000..4a76d1d
--- /dev/null
+++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
@@ -0,0 +1,3 @@
+DPDK_21 {
+	local: *;
+};
diff --git a/drivers/baseband/meson.build b/drivers/baseband/meson.build
index 415b672..72301ce 100644
--- a/drivers/baseband/meson.build
+++ b/drivers/baseband/meson.build
@@ -5,7 +5,7 @@ if is_windows
 	subdir_done()
 endif
 
-drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec']
+drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec', 'acc100']
 
 config_flag_fmt = 'RTE_LIBRTE_PMD_BBDEV_@0@'
 driver_name_fmt = 'rte_pmd_bbdev_@0@'
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index a544259..a77f538 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -254,6 +254,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_NETVSC_PMD)     += -lrte_pmd_netvsc
 
 ifeq ($(CONFIG_RTE_LIBRTE_BBDEV),y)
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_NULL)     += -lrte_pmd_bbdev_null
+_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100)    += -lrte_pmd_bbdev_acc100
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_LTE_FEC) += -lrte_pmd_bbdev_fpga_lte_fec
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC) += -lrte_pmd_bbdev_fpga_5gnr_fec
 
-- 
1.8.3.1



* [dpdk-dev] [PATCH v3 02/11] baseband/acc100: add register definition file
  2020-08-19  0:25 [dpdk-dev] [PATCH v3 00/11] bbdev PMD ACC100 Nicolas Chautru
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 01/11] drivers/baseband: add PMD for ACC100 Nicolas Chautru
@ 2020-08-19  0:25 ` Nicolas Chautru
  2020-08-29  9:55   ` Xu, Rosen
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 03/11] baseband/acc100: add info get function Nicolas Chautru
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 213+ messages in thread
From: Nicolas Chautru @ 2020-08-19  0:25 UTC (permalink / raw)
  To: dev, akhil.goyal; +Cc: bruce.richardson, Nicolas Chautru

Add in the list of registers for the device and related
HW specs definitions.
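
The enums added here are BAR0 byte offsets; the PMD uses them by adding an offset to the ``mmio_base`` pointer stored in ``struct acc100_device`` and performing a 32-bit access. A self-contained sketch of that offset arithmetic, where the helper names are illustrative (the real driver would use volatile accessors such as ``rte_read32``/``rte_write32``) and only ``HWPfQmgrSoftReset`` is taken from the patch:

```c
/* Sketch of how the generated BAR0 offsets are used: register address =
 * mapped mmio_base + enum offset, accessed as a 32-bit word. Helper
 * names are illustrative; a real driver would use volatile MMIO
 * accessors rather than memcpy. */
#include <stdint.h>
#include <string.h>
#include <assert.h>

#define HWPfQmgrSoftReset 0x00A00038 /* offset value taken from the patch */

/* 32-bit register write at base virtual address + register offset. */
static inline void acc100_reg_write(uint8_t *mmio_base, uint32_t offset,
		uint32_t val)
{
	memcpy(mmio_base + offset, &val, sizeof(val));
}

/* 32-bit register read at base virtual address + register offset. */
static inline uint32_t acc100_reg_read(uint8_t *mmio_base, uint32_t offset)
{
	uint32_t val;

	memcpy(&val, mmio_base + offset, sizeof(val));
	return val;
}
```

Against a mapped BAR, a soft reset would then be a single write of the reset bit to ``mmio_base + HWPfQmgrSoftReset``.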

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 drivers/baseband/acc100/acc100_pf_enum.h | 1068 ++++++++++++++++++++++++++++++
 drivers/baseband/acc100/acc100_vf_enum.h |   73 ++
 drivers/baseband/acc100/rte_acc100_pmd.h |  490 ++++++++++++++
 3 files changed, 1631 insertions(+)
 create mode 100644 drivers/baseband/acc100/acc100_pf_enum.h
 create mode 100644 drivers/baseband/acc100/acc100_vf_enum.h

diff --git a/drivers/baseband/acc100/acc100_pf_enum.h b/drivers/baseband/acc100/acc100_pf_enum.h
new file mode 100644
index 0000000..a1ee416
--- /dev/null
+++ b/drivers/baseband/acc100/acc100_pf_enum.h
@@ -0,0 +1,1068 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017 Intel Corporation
+ */
+
+#ifndef ACC100_PF_ENUM_H
+#define ACC100_PF_ENUM_H
+
+/*
+ * ACC100 Register mapping on PF BAR0
+ * This is automatically generated from RDL, format may change with new RDL
+ * Release.
+ * Variable names are as is
+ */
+enum {
+	HWPfQmgrEgressQueuesTemplate          =  0x0007FE00,
+	HWPfQmgrIngressAq                     =  0x00080000,
+	HWPfQmgrArbQAvail                     =  0x00A00010,
+	HWPfQmgrArbQBlock                     =  0x00A00014,
+	HWPfQmgrAqueueDropNotifEn             =  0x00A00024,
+	HWPfQmgrAqueueDisableNotifEn          =  0x00A00028,
+	HWPfQmgrSoftReset                     =  0x00A00038,
+	HWPfQmgrInitStatus                    =  0x00A0003C,
+	HWPfQmgrAramWatchdogCount             =  0x00A00040,
+	HWPfQmgrAramWatchdogCounterEn         =  0x00A00044,
+	HWPfQmgrAxiWatchdogCount              =  0x00A00048,
+	HWPfQmgrAxiWatchdogCounterEn          =  0x00A0004C,
+	HWPfQmgrProcessWatchdogCount          =  0x00A00050,
+	HWPfQmgrProcessWatchdogCounterEn      =  0x00A00054,
+	HWPfQmgrProcessUl4GWatchdogCounter    =  0x00A00058,
+	HWPfQmgrProcessDl4GWatchdogCounter    =  0x00A0005C,
+	HWPfQmgrProcessUl5GWatchdogCounter    =  0x00A00060,
+	HWPfQmgrProcessDl5GWatchdogCounter    =  0x00A00064,
+	HWPfQmgrProcessMldWatchdogCounter     =  0x00A00068,
+	HWPfQmgrMsiOverflowUpperVf            =  0x00A00070,
+	HWPfQmgrMsiOverflowLowerVf            =  0x00A00074,
+	HWPfQmgrMsiWatchdogOverflow           =  0x00A00078,
+	HWPfQmgrMsiOverflowEnable             =  0x00A0007C,
+	HWPfQmgrDebugAqPointerMemGrp          =  0x00A00100,
+	HWPfQmgrDebugOutputArbQFifoGrp        =  0x00A00140,
+	HWPfQmgrDebugMsiFifoGrp               =  0x00A00180,
+	HWPfQmgrDebugAxiWdTimeoutMsiFifo      =  0x00A001C0,
+	HWPfQmgrDebugProcessWdTimeoutMsiFifo  =  0x00A001C4,
+	HWPfQmgrDepthLog2Grp                  =  0x00A00200,
+	HWPfQmgrTholdGrp                      =  0x00A00300,
+	HWPfQmgrGrpTmplateReg0Indx            =  0x00A00600,
+	HWPfQmgrGrpTmplateReg1Indx            =  0x00A00680,
+	HWPfQmgrGrpTmplateReg2indx            =  0x00A00700,
+	HWPfQmgrGrpTmplateReg3Indx            =  0x00A00780,
+	HWPfQmgrGrpTmplateReg4Indx            =  0x00A00800,
+	HWPfQmgrVfBaseAddr                    =  0x00A01000,
+	HWPfQmgrUl4GWeightRrVf                =  0x00A02000,
+	HWPfQmgrDl4GWeightRrVf                =  0x00A02100,
+	HWPfQmgrUl5GWeightRrVf                =  0x00A02200,
+	HWPfQmgrDl5GWeightRrVf                =  0x00A02300,
+	HWPfQmgrMldWeightRrVf                 =  0x00A02400,
+	HWPfQmgrArbQDepthGrp                  =  0x00A02F00,
+	HWPfQmgrGrpFunction0                  =  0x00A02F40,
+	HWPfQmgrGrpFunction1                  =  0x00A02F44,
+	HWPfQmgrGrpPriority                   =  0x00A02F48,
+	HWPfQmgrWeightSync                    =  0x00A03000,
+	HWPfQmgrAqEnableVf                    =  0x00A10000,
+	HWPfQmgrAqResetVf                     =  0x00A20000,
+	HWPfQmgrRingSizeVf                    =  0x00A20004,
+	HWPfQmgrGrpDepthLog20Vf               =  0x00A20008,
+	HWPfQmgrGrpDepthLog21Vf               =  0x00A2000C,
+	HWPfQmgrGrpFunction0Vf                =  0x00A20010,
+	HWPfQmgrGrpFunction1Vf                =  0x00A20014,
+	HWPfDmaConfig0Reg                     =  0x00B80000,
+	HWPfDmaConfig1Reg                     =  0x00B80004,
+	HWPfDmaQmgrAddrReg                    =  0x00B80008,
+	HWPfDmaSoftResetReg                   =  0x00B8000C,
+	HWPfDmaAxcacheReg                     =  0x00B80010,
+	HWPfDmaVersionReg                     =  0x00B80014,
+	HWPfDmaFrameThreshold                 =  0x00B80018,
+	HWPfDmaTimestampLo                    =  0x00B8001C,
+	HWPfDmaTimestampHi                    =  0x00B80020,
+	HWPfDmaAxiStatus                      =  0x00B80028,
+	HWPfDmaAxiControl                     =  0x00B8002C,
+	HWPfDmaNoQmgr                         =  0x00B80030,
+	HWPfDmaQosScale                       =  0x00B80034,
+	HWPfDmaQmanen                         =  0x00B80040,
+	HWPfDmaQmgrQosBase                    =  0x00B80060,
+	HWPfDmaFecClkGatingEnable             =  0x00B80080,
+	HWPfDmaPmEnable                       =  0x00B80084,
+	HWPfDmaQosEnable                      =  0x00B80088,
+	HWPfDmaHarqWeightedRrFrameThreshold   =  0x00B800B0,
+	HWPfDmaDataSmallWeightedRrFrameThresh  = 0x00B800B4,
+	HWPfDmaDataLargeWeightedRrFrameThresh  = 0x00B800B8,
+	HWPfDmaInboundCbMaxSize               =  0x00B800BC,
+	HWPfDmaInboundDrainDataSize           =  0x00B800C0,
+	HWPfDmaVfDdrBaseRw                    =  0x00B80400,
+	HWPfDmaCmplTmOutCnt                   =  0x00B80800,
+	HWPfDmaProcTmOutCnt                   =  0x00B80804,
+	HWPfDmaStatusRrespBresp               =  0x00B80810,
+	HWPfDmaCfgRrespBresp                  =  0x00B80814,
+	HWPfDmaStatusMemParErr                =  0x00B80818,
+	HWPfDmaCfgMemParErrEn                 =  0x00B8081C,
+	HWPfDmaStatusDmaHwErr                 =  0x00B80820,
+	HWPfDmaCfgDmaHwErrEn                  =  0x00B80824,
+	HWPfDmaStatusFecCoreErr               =  0x00B80828,
+	HWPfDmaCfgFecCoreErrEn                =  0x00B8082C,
+	HWPfDmaStatusFcwDescrErr              =  0x00B80830,
+	HWPfDmaCfgFcwDescrErrEn               =  0x00B80834,
+	HWPfDmaStatusBlockTransmit            =  0x00B80838,
+	HWPfDmaBlockOnErrEn                   =  0x00B8083C,
+	HWPfDmaStatusFlushDma                 =  0x00B80840,
+	HWPfDmaFlushDmaOnErrEn                =  0x00B80844,
+	HWPfDmaStatusSdoneFifoFull            =  0x00B80848,
+	HWPfDmaStatusDescriptorErrLoVf        =  0x00B8084C,
+	HWPfDmaStatusDescriptorErrHiVf        =  0x00B80850,
+	HWPfDmaStatusFcwErrLoVf               =  0x00B80854,
+	HWPfDmaStatusFcwErrHiVf               =  0x00B80858,
+	HWPfDmaStatusDataErrLoVf              =  0x00B8085C,
+	HWPfDmaStatusDataErrHiVf              =  0x00B80860,
+	HWPfDmaCfgMsiEnSoftwareErr            =  0x00B80864,
+	HWPfDmaDescriptorSignature            =  0x00B80868,
+	HWPfDmaFcwSignature                   =  0x00B8086C,
+	HWPfDmaErrorDetectionEn               =  0x00B80870,
+	HWPfDmaErrCntrlFifoDebug              =  0x00B8087C,
+	HWPfDmaStatusToutData                 =  0x00B80880,
+	HWPfDmaStatusToutDesc                 =  0x00B80884,
+	HWPfDmaStatusToutUnexpData            =  0x00B80888,
+	HWPfDmaStatusToutUnexpDesc            =  0x00B8088C,
+	HWPfDmaStatusToutProcess              =  0x00B80890,
+	HWPfDmaConfigCtoutOutDataEn           =  0x00B808A0,
+	HWPfDmaConfigCtoutOutDescrEn          =  0x00B808A4,
+	HWPfDmaConfigUnexpComplDataEn         =  0x00B808A8,
+	HWPfDmaConfigUnexpComplDescrEn        =  0x00B808AC,
+	HWPfDmaConfigPtoutOutEn               =  0x00B808B0,
+	HWPfDmaFec5GulDescBaseLoRegVf         =  0x00B88020,
+	HWPfDmaFec5GulDescBaseHiRegVf         =  0x00B88024,
+	HWPfDmaFec5GulRespPtrLoRegVf          =  0x00B88028,
+	HWPfDmaFec5GulRespPtrHiRegVf          =  0x00B8802C,
+	HWPfDmaFec5GdlDescBaseLoRegVf         =  0x00B88040,
+	HWPfDmaFec5GdlDescBaseHiRegVf         =  0x00B88044,
+	HWPfDmaFec5GdlRespPtrLoRegVf          =  0x00B88048,
+	HWPfDmaFec5GdlRespPtrHiRegVf          =  0x00B8804C,
+	HWPfDmaFec4GulDescBaseLoRegVf         =  0x00B88060,
+	HWPfDmaFec4GulDescBaseHiRegVf         =  0x00B88064,
+	HWPfDmaFec4GulRespPtrLoRegVf          =  0x00B88068,
+	HWPfDmaFec4GulRespPtrHiRegVf          =  0x00B8806C,
+	HWPfDmaFec4GdlDescBaseLoRegVf         =  0x00B88080,
+	HWPfDmaFec4GdlDescBaseHiRegVf         =  0x00B88084,
+	HWPfDmaFec4GdlRespPtrLoRegVf          =  0x00B88088,
+	HWPfDmaFec4GdlRespPtrHiRegVf          =  0x00B8808C,
+	HWPfDmaVfDdrBaseRangeRo               =  0x00B880A0,
+	HWPfQosmonACntrlReg                   =  0x00B90000,
+	HWPfQosmonAEvalOverflow0              =  0x00B90008,
+	HWPfQosmonAEvalOverflow1              =  0x00B9000C,
+	HWPfQosmonADivTerm                    =  0x00B90010,
+	HWPfQosmonATickTerm                   =  0x00B90014,
+	HWPfQosmonAEvalTerm                   =  0x00B90018,
+	HWPfQosmonAAveTerm                    =  0x00B9001C,
+	HWPfQosmonAForceEccErr                =  0x00B90020,
+	HWPfQosmonAEccErrDetect               =  0x00B90024,
+	HWPfQosmonAIterationConfig0Low        =  0x00B90060,
+	HWPfQosmonAIterationConfig0High       =  0x00B90064,
+	HWPfQosmonAIterationConfig1Low        =  0x00B90068,
+	HWPfQosmonAIterationConfig1High       =  0x00B9006C,
+	HWPfQosmonAIterationConfig2Low        =  0x00B90070,
+	HWPfQosmonAIterationConfig2High       =  0x00B90074,
+	HWPfQosmonAIterationConfig3Low        =  0x00B90078,
+	HWPfQosmonAIterationConfig3High       =  0x00B9007C,
+	HWPfQosmonAEvalMemAddr                =  0x00B90080,
+	HWPfQosmonAEvalMemData                =  0x00B90084,
+	HWPfQosmonAXaction                    =  0x00B900C0,
+	HWPfQosmonARemThres1Vf                =  0x00B90400,
+	HWPfQosmonAThres2Vf                   =  0x00B90404,
+	HWPfQosmonAWeiFracVf                  =  0x00B90408,
+	HWPfQosmonARrWeiVf                    =  0x00B9040C,
+	HWPfPermonACntrlRegVf                 =  0x00B98000,
+	HWPfPermonACountVf                    =  0x00B98008,
+	HWPfPermonAKCntLoVf                   =  0x00B98010,
+	HWPfPermonAKCntHiVf                   =  0x00B98014,
+	HWPfPermonADeltaCntLoVf               =  0x00B98020,
+	HWPfPermonADeltaCntHiVf               =  0x00B98024,
+	HWPfPermonAVersionReg                 =  0x00B9C000,
+	HWPfPermonACbControlFec               =  0x00B9C0F0,
+	HWPfPermonADltTimerLoFec              =  0x00B9C0F4,
+	HWPfPermonADltTimerHiFec              =  0x00B9C0F8,
+	HWPfPermonACbCountFec                 =  0x00B9C100,
+	HWPfPermonAAccExecTimerLoFec          =  0x00B9C104,
+	HWPfPermonAAccExecTimerHiFec          =  0x00B9C108,
+	HWPfPermonAExecTimerMinFec            =  0x00B9C200,
+	HWPfPermonAExecTimerMaxFec            =  0x00B9C204,
+	HWPfPermonAControlBusMon              =  0x00B9C400,
+	HWPfPermonAConfigBusMon               =  0x00B9C404,
+	HWPfPermonASkipCountBusMon            =  0x00B9C408,
+	HWPfPermonAMinLatBusMon               =  0x00B9C40C,
+	HWPfPermonAMaxLatBusMon               =  0x00B9C500,
+	HWPfPermonATotalLatLowBusMon          =  0x00B9C504,
+	HWPfPermonATotalLatUpperBusMon        =  0x00B9C508,
+	HWPfPermonATotalReqCntBusMon          =  0x00B9C50C,
+	HWPfQosmonBCntrlReg                   =  0x00BA0000,
+	HWPfQosmonBEvalOverflow0              =  0x00BA0008,
+	HWPfQosmonBEvalOverflow1              =  0x00BA000C,
+	HWPfQosmonBDivTerm                    =  0x00BA0010,
+	HWPfQosmonBTickTerm                   =  0x00BA0014,
+	HWPfQosmonBEvalTerm                   =  0x00BA0018,
+	HWPfQosmonBAveTerm                    =  0x00BA001C,
+	HWPfQosmonBForceEccErr                =  0x00BA0020,
+	HWPfQosmonBEccErrDetect               =  0x00BA0024,
+	HWPfQosmonBIterationConfig0Low        =  0x00BA0060,
+	HWPfQosmonBIterationConfig0High       =  0x00BA0064,
+	HWPfQosmonBIterationConfig1Low        =  0x00BA0068,
+	HWPfQosmonBIterationConfig1High       =  0x00BA006C,
+	HWPfQosmonBIterationConfig2Low        =  0x00BA0070,
+	HWPfQosmonBIterationConfig2High       =  0x00BA0074,
+	HWPfQosmonBIterationConfig3Low        =  0x00BA0078,
+	HWPfQosmonBIterationConfig3High       =  0x00BA007C,
+	HWPfQosmonBEvalMemAddr                =  0x00BA0080,
+	HWPfQosmonBEvalMemData                =  0x00BA0084,
+	HWPfQosmonBXaction                    =  0x00BA00C0,
+	HWPfQosmonBRemThres1Vf                =  0x00BA0400,
+	HWPfQosmonBThres2Vf                   =  0x00BA0404,
+	HWPfQosmonBWeiFracVf                  =  0x00BA0408,
+	HWPfQosmonBRrWeiVf                    =  0x00BA040C,
+	HWPfPermonBCntrlRegVf                 =  0x00BA8000,
+	HWPfPermonBCountVf                    =  0x00BA8008,
+	HWPfPermonBKCntLoVf                   =  0x00BA8010,
+	HWPfPermonBKCntHiVf                   =  0x00BA8014,
+	HWPfPermonBDeltaCntLoVf               =  0x00BA8020,
+	HWPfPermonBDeltaCntHiVf               =  0x00BA8024,
+	HWPfPermonBVersionReg                 =  0x00BAC000,
+	HWPfPermonBCbControlFec               =  0x00BAC0F0,
+	HWPfPermonBDltTimerLoFec              =  0x00BAC0F4,
+	HWPfPermonBDltTimerHiFec              =  0x00BAC0F8,
+	HWPfPermonBCbCountFec                 =  0x00BAC100,
+	HWPfPermonBAccExecTimerLoFec          =  0x00BAC104,
+	HWPfPermonBAccExecTimerHiFec          =  0x00BAC108,
+	HWPfPermonBExecTimerMinFec            =  0x00BAC200,
+	HWPfPermonBExecTimerMaxFec            =  0x00BAC204,
+	HWPfPermonBControlBusMon              =  0x00BAC400,
+	HWPfPermonBConfigBusMon               =  0x00BAC404,
+	HWPfPermonBSkipCountBusMon            =  0x00BAC408,
+	HWPfPermonBMinLatBusMon               =  0x00BAC40C,
+	HWPfPermonBMaxLatBusMon               =  0x00BAC500,
+	HWPfPermonBTotalLatLowBusMon          =  0x00BAC504,
+	HWPfPermonBTotalLatUpperBusMon        =  0x00BAC508,
+	HWPfPermonBTotalReqCntBusMon          =  0x00BAC50C,
+	HWPfFecUl5gCntrlReg                   =  0x00BC0000,
+	HWPfFecUl5gI2MThreshReg               =  0x00BC0004,
+	HWPfFecUl5gVersionReg                 =  0x00BC0100,
+	HWPfFecUl5gFcwStatusReg               =  0x00BC0104,
+	HWPfFecUl5gWarnReg                    =  0x00BC0108,
+	HwPfFecUl5gIbDebugReg                 =  0x00BC0200,
+	HwPfFecUl5gObLlrDebugReg              =  0x00BC0204,
+	HwPfFecUl5gObHarqDebugReg             =  0x00BC0208,
+	HwPfFecUl5g1CntrlReg                  =  0x00BC1000,
+	HwPfFecUl5g1I2MThreshReg              =  0x00BC1004,
+	HwPfFecUl5g1VersionReg                =  0x00BC1100,
+	HwPfFecUl5g1FcwStatusReg              =  0x00BC1104,
+	HwPfFecUl5g1WarnReg                   =  0x00BC1108,
+	HwPfFecUl5g1IbDebugReg                =  0x00BC1200,
+	HwPfFecUl5g1ObLlrDebugReg             =  0x00BC1204,
+	HwPfFecUl5g1ObHarqDebugReg            =  0x00BC1208,
+	HwPfFecUl5g2CntrlReg                  =  0x00BC2000,
+	HwPfFecUl5g2I2MThreshReg              =  0x00BC2004,
+	HwPfFecUl5g2VersionReg                =  0x00BC2100,
+	HwPfFecUl5g2FcwStatusReg              =  0x00BC2104,
+	HwPfFecUl5g2WarnReg                   =  0x00BC2108,
+	HwPfFecUl5g2IbDebugReg                =  0x00BC2200,
+	HwPfFecUl5g2ObLlrDebugReg             =  0x00BC2204,
+	HwPfFecUl5g2ObHarqDebugReg            =  0x00BC2208,
+	HwPfFecUl5g3CntrlReg                  =  0x00BC3000,
+	HwPfFecUl5g3I2MThreshReg              =  0x00BC3004,
+	HwPfFecUl5g3VersionReg                =  0x00BC3100,
+	HwPfFecUl5g3FcwStatusReg              =  0x00BC3104,
+	HwPfFecUl5g3WarnReg                   =  0x00BC3108,
+	HwPfFecUl5g3IbDebugReg                =  0x00BC3200,
+	HwPfFecUl5g3ObLlrDebugReg             =  0x00BC3204,
+	HwPfFecUl5g3ObHarqDebugReg            =  0x00BC3208,
+	HwPfFecUl5g4CntrlReg                  =  0x00BC4000,
+	HwPfFecUl5g4I2MThreshReg              =  0x00BC4004,
+	HwPfFecUl5g4VersionReg                =  0x00BC4100,
+	HwPfFecUl5g4FcwStatusReg              =  0x00BC4104,
+	HwPfFecUl5g4WarnReg                   =  0x00BC4108,
+	HwPfFecUl5g4IbDebugReg                =  0x00BC4200,
+	HwPfFecUl5g4ObLlrDebugReg             =  0x00BC4204,
+	HwPfFecUl5g4ObHarqDebugReg            =  0x00BC4208,
+	HwPfFecUl5g5CntrlReg                  =  0x00BC5000,
+	HwPfFecUl5g5I2MThreshReg              =  0x00BC5004,
+	HwPfFecUl5g5VersionReg                =  0x00BC5100,
+	HwPfFecUl5g5FcwStatusReg              =  0x00BC5104,
+	HwPfFecUl5g5WarnReg                   =  0x00BC5108,
+	HwPfFecUl5g5IbDebugReg                =  0x00BC5200,
+	HwPfFecUl5g5ObLlrDebugReg             =  0x00BC5204,
+	HwPfFecUl5g5ObHarqDebugReg            =  0x00BC5208,
+	HwPfFecUl5g6CntrlReg                  =  0x00BC6000,
+	HwPfFecUl5g6I2MThreshReg              =  0x00BC6004,
+	HwPfFecUl5g6VersionReg                =  0x00BC6100,
+	HwPfFecUl5g6FcwStatusReg              =  0x00BC6104,
+	HwPfFecUl5g6WarnReg                   =  0x00BC6108,
+	HwPfFecUl5g6IbDebugReg                =  0x00BC6200,
+	HwPfFecUl5g6ObLlrDebugReg             =  0x00BC6204,
+	HwPfFecUl5g6ObHarqDebugReg            =  0x00BC6208,
+	HwPfFecUl5g7CntrlReg                  =  0x00BC7000,
+	HwPfFecUl5g7I2MThreshReg              =  0x00BC7004,
+	HwPfFecUl5g7VersionReg                =  0x00BC7100,
+	HwPfFecUl5g7FcwStatusReg              =  0x00BC7104,
+	HwPfFecUl5g7WarnReg                   =  0x00BC7108,
+	HwPfFecUl5g7IbDebugReg                =  0x00BC7200,
+	HwPfFecUl5g7ObLlrDebugReg             =  0x00BC7204,
+	HwPfFecUl5g7ObHarqDebugReg            =  0x00BC7208,
+	HwPfFecUl5g8CntrlReg                  =  0x00BC8000,
+	HwPfFecUl5g8I2MThreshReg              =  0x00BC8004,
+	HwPfFecUl5g8VersionReg                =  0x00BC8100,
+	HwPfFecUl5g8FcwStatusReg              =  0x00BC8104,
+	HwPfFecUl5g8WarnReg                   =  0x00BC8108,
+	HwPfFecUl5g8IbDebugReg                =  0x00BC8200,
+	HwPfFecUl5g8ObLlrDebugReg             =  0x00BC8204,
+	HwPfFecUl5g8ObHarqDebugReg            =  0x00BC8208,
+	HWPfFecDl5gCntrlReg                   =  0x00BCF000,
+	HWPfFecDl5gI2MThreshReg               =  0x00BCF004,
+	HWPfFecDl5gVersionReg                 =  0x00BCF100,
+	HWPfFecDl5gFcwStatusReg               =  0x00BCF104,
+	HWPfFecDl5gWarnReg                    =  0x00BCF108,
+	HWPfFecUlVersionReg                   =  0x00BD0000,
+	HWPfFecUlControlReg                   =  0x00BD0004,
+	HWPfFecUlStatusReg                    =  0x00BD0008,
+	HWPfFecDlVersionReg                   =  0x00BDF000,
+	HWPfFecDlClusterConfigReg             =  0x00BDF004,
+	HWPfFecDlBurstThres                   =  0x00BDF00C,
+	HWPfFecDlClusterStatusReg0            =  0x00BDF040,
+	HWPfFecDlClusterStatusReg1            =  0x00BDF044,
+	HWPfFecDlClusterStatusReg2            =  0x00BDF048,
+	HWPfFecDlClusterStatusReg3            =  0x00BDF04C,
+	HWPfFecDlClusterStatusReg4            =  0x00BDF050,
+	HWPfFecDlClusterStatusReg5            =  0x00BDF054,
+	HWPfChaFabPllPllrst                   =  0x00C40000,
+	HWPfChaFabPllClk0                     =  0x00C40004,
+	HWPfChaFabPllClk1                     =  0x00C40008,
+	HWPfChaFabPllBwadj                    =  0x00C4000C,
+	HWPfChaFabPllLbw                      =  0x00C40010,
+	HWPfChaFabPllResetq                   =  0x00C40014,
+	HWPfChaFabPllPhshft0                  =  0x00C40018,
+	HWPfChaFabPllPhshft1                  =  0x00C4001C,
+	HWPfChaFabPllDivq0                    =  0x00C40020,
+	HWPfChaFabPllDivq1                    =  0x00C40024,
+	HWPfChaFabPllDivq2                    =  0x00C40028,
+	HWPfChaFabPllDivq3                    =  0x00C4002C,
+	HWPfChaFabPllDivq4                    =  0x00C40030,
+	HWPfChaFabPllDivq5                    =  0x00C40034,
+	HWPfChaFabPllDivq6                    =  0x00C40038,
+	HWPfChaFabPllDivq7                    =  0x00C4003C,
+	HWPfChaDl5gPllPllrst                  =  0x00C40080,
+	HWPfChaDl5gPllClk0                    =  0x00C40084,
+	HWPfChaDl5gPllClk1                    =  0x00C40088,
+	HWPfChaDl5gPllBwadj                   =  0x00C4008C,
+	HWPfChaDl5gPllLbw                     =  0x00C40090,
+	HWPfChaDl5gPllResetq                  =  0x00C40094,
+	HWPfChaDl5gPllPhshft0                 =  0x00C40098,
+	HWPfChaDl5gPllPhshft1                 =  0x00C4009C,
+	HWPfChaDl5gPllDivq0                   =  0x00C400A0,
+	HWPfChaDl5gPllDivq1                   =  0x00C400A4,
+	HWPfChaDl5gPllDivq2                   =  0x00C400A8,
+	HWPfChaDl5gPllDivq3                   =  0x00C400AC,
+	HWPfChaDl5gPllDivq4                   =  0x00C400B0,
+	HWPfChaDl5gPllDivq5                   =  0x00C400B4,
+	HWPfChaDl5gPllDivq6                   =  0x00C400B8,
+	HWPfChaDl5gPllDivq7                   =  0x00C400BC,
+	HWPfChaDl4gPllPllrst                  =  0x00C40100,
+	HWPfChaDl4gPllClk0                    =  0x00C40104,
+	HWPfChaDl4gPllClk1                    =  0x00C40108,
+	HWPfChaDl4gPllBwadj                   =  0x00C4010C,
+	HWPfChaDl4gPllLbw                     =  0x00C40110,
+	HWPfChaDl4gPllResetq                  =  0x00C40114,
+	HWPfChaDl4gPllPhshft0                 =  0x00C40118,
+	HWPfChaDl4gPllPhshft1                 =  0x00C4011C,
+	HWPfChaDl4gPllDivq0                   =  0x00C40120,
+	HWPfChaDl4gPllDivq1                   =  0x00C40124,
+	HWPfChaDl4gPllDivq2                   =  0x00C40128,
+	HWPfChaDl4gPllDivq3                   =  0x00C4012C,
+	HWPfChaDl4gPllDivq4                   =  0x00C40130,
+	HWPfChaDl4gPllDivq5                   =  0x00C40134,
+	HWPfChaDl4gPllDivq6                   =  0x00C40138,
+	HWPfChaDl4gPllDivq7                   =  0x00C4013C,
+	HWPfChaUl5gPllPllrst                  =  0x00C40180,
+	HWPfChaUl5gPllClk0                    =  0x00C40184,
+	HWPfChaUl5gPllClk1                    =  0x00C40188,
+	HWPfChaUl5gPllBwadj                   =  0x00C4018C,
+	HWPfChaUl5gPllLbw                     =  0x00C40190,
+	HWPfChaUl5gPllResetq                  =  0x00C40194,
+	HWPfChaUl5gPllPhshft0                 =  0x00C40198,
+	HWPfChaUl5gPllPhshft1                 =  0x00C4019C,
+	HWPfChaUl5gPllDivq0                   =  0x00C401A0,
+	HWPfChaUl5gPllDivq1                   =  0x00C401A4,
+	HWPfChaUl5gPllDivq2                   =  0x00C401A8,
+	HWPfChaUl5gPllDivq3                   =  0x00C401AC,
+	HWPfChaUl5gPllDivq4                   =  0x00C401B0,
+	HWPfChaUl5gPllDivq5                   =  0x00C401B4,
+	HWPfChaUl5gPllDivq6                   =  0x00C401B8,
+	HWPfChaUl5gPllDivq7                   =  0x00C401BC,
+	HWPfChaUl4gPllPllrst                  =  0x00C40200,
+	HWPfChaUl4gPllClk0                    =  0x00C40204,
+	HWPfChaUl4gPllClk1                    =  0x00C40208,
+	HWPfChaUl4gPllBwadj                   =  0x00C4020C,
+	HWPfChaUl4gPllLbw                     =  0x00C40210,
+	HWPfChaUl4gPllResetq                  =  0x00C40214,
+	HWPfChaUl4gPllPhshft0                 =  0x00C40218,
+	HWPfChaUl4gPllPhshft1                 =  0x00C4021C,
+	HWPfChaUl4gPllDivq0                   =  0x00C40220,
+	HWPfChaUl4gPllDivq1                   =  0x00C40224,
+	HWPfChaUl4gPllDivq2                   =  0x00C40228,
+	HWPfChaUl4gPllDivq3                   =  0x00C4022C,
+	HWPfChaUl4gPllDivq4                   =  0x00C40230,
+	HWPfChaUl4gPllDivq5                   =  0x00C40234,
+	HWPfChaUl4gPllDivq6                   =  0x00C40238,
+	HWPfChaUl4gPllDivq7                   =  0x00C4023C,
+	HWPfChaDdrPllPllrst                   =  0x00C40280,
+	HWPfChaDdrPllClk0                     =  0x00C40284,
+	HWPfChaDdrPllClk1                     =  0x00C40288,
+	HWPfChaDdrPllBwadj                    =  0x00C4028C,
+	HWPfChaDdrPllLbw                      =  0x00C40290,
+	HWPfChaDdrPllResetq                   =  0x00C40294,
+	HWPfChaDdrPllPhshft0                  =  0x00C40298,
+	HWPfChaDdrPllPhshft1                  =  0x00C4029C,
+	HWPfChaDdrPllDivq0                    =  0x00C402A0,
+	HWPfChaDdrPllDivq1                    =  0x00C402A4,
+	HWPfChaDdrPllDivq2                    =  0x00C402A8,
+	HWPfChaDdrPllDivq3                    =  0x00C402AC,
+	HWPfChaDdrPllDivq4                    =  0x00C402B0,
+	HWPfChaDdrPllDivq5                    =  0x00C402B4,
+	HWPfChaDdrPllDivq6                    =  0x00C402B8,
+	HWPfChaDdrPllDivq7                    =  0x00C402BC,
+	HWPfChaErrStatus                      =  0x00C40400,
+	HWPfChaErrMask                        =  0x00C40404,
+	HWPfChaDebugPcieMsiFifo               =  0x00C40410,
+	HWPfChaDebugDdrMsiFifo                =  0x00C40414,
+	HWPfChaDebugMiscMsiFifo               =  0x00C40418,
+	HWPfChaPwmSet                         =  0x00C40420,
+	HWPfChaDdrRstStatus                   =  0x00C40430,
+	HWPfChaDdrStDoneStatus                =  0x00C40434,
+	HWPfChaDdrWbRstCfg                    =  0x00C40438,
+	HWPfChaDdrApbRstCfg                   =  0x00C4043C,
+	HWPfChaDdrPhyRstCfg                   =  0x00C40440,
+	HWPfChaDdrCpuRstCfg                   =  0x00C40444,
+	HWPfChaDdrSifRstCfg                   =  0x00C40448,
+	HWPfChaPadcfgPcomp0                   =  0x00C41000,
+	HWPfChaPadcfgNcomp0                   =  0x00C41004,
+	HWPfChaPadcfgOdt0                     =  0x00C41008,
+	HWPfChaPadcfgProtect0                 =  0x00C4100C,
+	HWPfChaPreemphasisProtect0            =  0x00C41010,
+	HWPfChaPreemphasisCompen0             =  0x00C41040,
+	HWPfChaPreemphasisOdten0              =  0x00C41044,
+	HWPfChaPadcfgPcomp1                   =  0x00C41100,
+	HWPfChaPadcfgNcomp1                   =  0x00C41104,
+	HWPfChaPadcfgOdt1                     =  0x00C41108,
+	HWPfChaPadcfgProtect1                 =  0x00C4110C,
+	HWPfChaPreemphasisProtect1            =  0x00C41110,
+	HWPfChaPreemphasisCompen1             =  0x00C41140,
+	HWPfChaPreemphasisOdten1              =  0x00C41144,
+	HWPfChaPadcfgPcomp2                   =  0x00C41200,
+	HWPfChaPadcfgNcomp2                   =  0x00C41204,
+	HWPfChaPadcfgOdt2                     =  0x00C41208,
+	HWPfChaPadcfgProtect2                 =  0x00C4120C,
+	HWPfChaPreemphasisProtect2            =  0x00C41210,
+	HWPfChaPreemphasisCompen2             =  0x00C41240,
+	HWPfChaPreemphasisOdten2              =  0x00C41244,
+	HWPfChaPadcfgPcomp3                   =  0x00C41300,
+	HWPfChaPadcfgNcomp3                   =  0x00C41304,
+	HWPfChaPadcfgOdt3                     =  0x00C41308,
+	HWPfChaPadcfgProtect3                 =  0x00C4130C,
+	HWPfChaPreemphasisProtect3            =  0x00C41310,
+	HWPfChaPreemphasisCompen3             =  0x00C41340,
+	HWPfChaPreemphasisOdten3              =  0x00C41344,
+	HWPfChaPadcfgPcomp4                   =  0x00C41400,
+	HWPfChaPadcfgNcomp4                   =  0x00C41404,
+	HWPfChaPadcfgOdt4                     =  0x00C41408,
+	HWPfChaPadcfgProtect4                 =  0x00C4140C,
+	HWPfChaPreemphasisProtect4            =  0x00C41410,
+	HWPfChaPreemphasisCompen4             =  0x00C41440,
+	HWPfChaPreemphasisOdten4              =  0x00C41444,
+	HWPfHiVfToPfDbellVf                   =  0x00C80000,
+	HWPfHiPfToVfDbellVf                   =  0x00C80008,
+	HWPfHiInfoRingBaseLoVf                =  0x00C80010,
+	HWPfHiInfoRingBaseHiVf                =  0x00C80014,
+	HWPfHiInfoRingPointerVf               =  0x00C80018,
+	HWPfHiInfoRingIntWrEnVf               =  0x00C80020,
+	HWPfHiInfoRingPf2VfWrEnVf             =  0x00C80024,
+	HWPfHiMsixVectorMapperVf              =  0x00C80060,
+	HWPfHiModuleVersionReg                =  0x00C84000,
+	HWPfHiIosf2axiErrLogReg               =  0x00C84004,
+	HWPfHiHardResetReg                    =  0x00C84008,
+	HWPfHi5GHardResetReg                  =  0x00C8400C,
+	HWPfHiInfoRingBaseLoRegPf             =  0x00C84010,
+	HWPfHiInfoRingBaseHiRegPf             =  0x00C84014,
+	HWPfHiInfoRingPointerRegPf            =  0x00C84018,
+	HWPfHiInfoRingIntWrEnRegPf            =  0x00C84020,
+	HWPfHiInfoRingVf2pfLoWrEnReg          =  0x00C84024,
+	HWPfHiInfoRingVf2pfHiWrEnReg          =  0x00C84028,
+	HWPfHiLogParityErrStatusReg           =  0x00C8402C,
+	HWPfHiLogDataParityErrorVfStatusLo    =  0x00C84030,
+	HWPfHiLogDataParityErrorVfStatusHi    =  0x00C84034,
+	HWPfHiBlockTransmitOnErrorEn          =  0x00C84038,
+	HWPfHiCfgMsiIntWrEnRegPf              =  0x00C84040,
+	HWPfHiCfgMsiVf2pfLoWrEnReg            =  0x00C84044,
+	HWPfHiCfgMsiVf2pfHighWrEnReg          =  0x00C84048,
+	HWPfHiMsixVectorMapperPf              =  0x00C84060,
+	HWPfHiApbWrWaitTime                   =  0x00C84100,
+	HWPfHiXCounterMaxValue                =  0x00C84104,
+	HWPfHiPfMode                          =  0x00C84108,
+	HWPfHiClkGateHystReg                  =  0x00C8410C,
+	HWPfHiSnoopBitsReg                    =  0x00C84110,
+	HWPfHiMsiDropEnableReg                =  0x00C84114,
+	HWPfHiMsiStatReg                      =  0x00C84120,
+	HWPfHiFifoOflStatReg                  =  0x00C84124,
+	HWPfHiHiDebugReg                      =  0x00C841F4,
+	HWPfHiDebugMemSnoopMsiFifo            =  0x00C841F8,
+	HWPfHiDebugMemSnoopInputFifo          =  0x00C841FC,
+	HWPfHiMsixMappingConfig               =  0x00C84200,
+	HWPfHiJunkReg                         =  0x00C8FF00,
+	HWPfDdrUmmcVer                        =  0x00D00000,
+	HWPfDdrUmmcCap                        =  0x00D00010,
+	HWPfDdrUmmcCtrl                       =  0x00D00020,
+	HWPfDdrMpcPe                          =  0x00D00080,
+	HWPfDdrMpcPpri3                       =  0x00D00090,
+	HWPfDdrMpcPpri2                       =  0x00D000A0,
+	HWPfDdrMpcPpri1                       =  0x00D000B0,
+	HWPfDdrMpcPpri0                       =  0x00D000C0,
+	HWPfDdrMpcPrwgrpCtrl                  =  0x00D000D0,
+	HWPfDdrMpcPbw7                        =  0x00D000E0,
+	HWPfDdrMpcPbw6                        =  0x00D000F0,
+	HWPfDdrMpcPbw5                        =  0x00D00100,
+	HWPfDdrMpcPbw4                        =  0x00D00110,
+	HWPfDdrMpcPbw3                        =  0x00D00120,
+	HWPfDdrMpcPbw2                        =  0x00D00130,
+	HWPfDdrMpcPbw1                        =  0x00D00140,
+	HWPfDdrMpcPbw0                        =  0x00D00150,
+	HWPfDdrMemoryInit                     =  0x00D00200,
+	HWPfDdrMemoryInitDone                 =  0x00D00210,
+	HWPfDdrMemInitPhyTrng0                =  0x00D00240,
+	HWPfDdrMemInitPhyTrng1                =  0x00D00250,
+	HWPfDdrMemInitPhyTrng2                =  0x00D00260,
+	HWPfDdrMemInitPhyTrng3                =  0x00D00270,
+	HWPfDdrBcDram                         =  0x00D003C0,
+	HWPfDdrBcAddrMap                      =  0x00D003D0,
+	HWPfDdrBcRef                          =  0x00D003E0,
+	HWPfDdrBcTim0                         =  0x00D00400,
+	HWPfDdrBcTim1                         =  0x00D00410,
+	HWPfDdrBcTim2                         =  0x00D00420,
+	HWPfDdrBcTim3                         =  0x00D00430,
+	HWPfDdrBcTim4                         =  0x00D00440,
+	HWPfDdrBcTim5                         =  0x00D00450,
+	HWPfDdrBcTim6                         =  0x00D00460,
+	HWPfDdrBcTim7                         =  0x00D00470,
+	HWPfDdrBcTim8                         =  0x00D00480,
+	HWPfDdrBcTim9                         =  0x00D00490,
+	HWPfDdrBcTim10                        =  0x00D004A0,
+	HWPfDdrBcTim12                        =  0x00D004C0,
+	HWPfDdrDfiInit                        =  0x00D004D0,
+	HWPfDdrDfiInitComplete                =  0x00D004E0,
+	HWPfDdrDfiTim0                        =  0x00D004F0,
+	HWPfDdrDfiTim1                        =  0x00D00500,
+	HWPfDdrDfiPhyUpdEn                    =  0x00D00530,
+	HWPfDdrMemStatus                      =  0x00D00540,
+	HWPfDdrUmmcErrStatus                  =  0x00D00550,
+	HWPfDdrUmmcIntStatus                  =  0x00D00560,
+	HWPfDdrUmmcIntEn                      =  0x00D00570,
+	HWPfDdrPhyRdLatency                   =  0x00D48400,
+	HWPfDdrPhyRdLatencyDbi                =  0x00D48410,
+	HWPfDdrPhyWrLatency                   =  0x00D48420,
+	HWPfDdrPhyTrngType                    =  0x00D48430,
+	HWPfDdrPhyMrsTiming2                  =  0x00D48440,
+	HWPfDdrPhyMrsTiming0                  =  0x00D48450,
+	HWPfDdrPhyMrsTiming1                  =  0x00D48460,
+	HWPfDdrPhyDramTmrd                    =  0x00D48470,
+	HWPfDdrPhyDramTmod                    =  0x00D48480,
+	HWPfDdrPhyDramTwpre                   =  0x00D48490,
+	HWPfDdrPhyDramTrfc                    =  0x00D484A0,
+	HWPfDdrPhyDramTrwtp                   =  0x00D484B0,
+	HWPfDdrPhyMr01Dimm                    =  0x00D484C0,
+	HWPfDdrPhyMr01DimmDbi                 =  0x00D484D0,
+	HWPfDdrPhyMr23Dimm                    =  0x00D484E0,
+	HWPfDdrPhyMr45Dimm                    =  0x00D484F0,
+	HWPfDdrPhyMr67Dimm                    =  0x00D48500,
+	HWPfDdrPhyWrlvlWwRdlvlRr              =  0x00D48510,
+	HWPfDdrPhyOdtEn                       =  0x00D48520,
+	HWPfDdrPhyFastTrng                    =  0x00D48530,
+	HWPfDdrPhyDynTrngGap                  =  0x00D48540,
+	HWPfDdrPhyDynRcalGap                  =  0x00D48550,
+	HWPfDdrPhyIdletimeout                 =  0x00D48560,
+	HWPfDdrPhyRstCkeGap                   =  0x00D48570,
+	HWPfDdrPhyCkeMrsGap                   =  0x00D48580,
+	HWPfDdrPhyMemVrefMidVal               =  0x00D48590,
+	HWPfDdrPhyVrefStep                    =  0x00D485A0,
+	HWPfDdrPhyVrefThreshold               =  0x00D485B0,
+	HWPfDdrPhyPhyVrefMidVal               =  0x00D485C0,
+	HWPfDdrPhyDqsCountMax                 =  0x00D485D0,
+	HWPfDdrPhyDqsCountNum                 =  0x00D485E0,
+	HWPfDdrPhyDramRow                     =  0x00D485F0,
+	HWPfDdrPhyDramCol                     =  0x00D48600,
+	HWPfDdrPhyDramBgBa                    =  0x00D48610,
+	HWPfDdrPhyDynamicUpdreqrel            =  0x00D48620,
+	HWPfDdrPhyVrefLimits                  =  0x00D48630,
+	HWPfDdrPhyIdtmTcStatus                =  0x00D6C020,
+	HWPfDdrPhyIdtmFwVersion               =  0x00D6C410,
+	HWPfDdrPhyRdlvlGateInitDelay          =  0x00D70000,
+	HWPfDdrPhyRdenSmplabc                 =  0x00D70008,
+	HWPfDdrPhyVrefNibble0                 =  0x00D7000C,
+	HWPfDdrPhyVrefNibble1                 =  0x00D70010,
+	HWPfDdrPhyRdlvlGateDqsSmpl0           =  0x00D70014,
+	HWPfDdrPhyRdlvlGateDqsSmpl1           =  0x00D70018,
+	HWPfDdrPhyRdlvlGateDqsSmpl2           =  0x00D7001C,
+	HWPfDdrPhyDqsCount                    =  0x00D70020,
+	HWPfDdrPhyWrlvlRdlvlGateStatus        =  0x00D70024,
+	HWPfDdrPhyErrorFlags                  =  0x00D70028,
+	HWPfDdrPhyPowerDown                   =  0x00D70030,
+	HWPfDdrPhyPrbsSeedByte0               =  0x00D70034,
+	HWPfDdrPhyPrbsSeedByte1               =  0x00D70038,
+	HWPfDdrPhyPcompDq                     =  0x00D70040,
+	HWPfDdrPhyNcompDq                     =  0x00D70044,
+	HWPfDdrPhyPcompDqs                    =  0x00D70048,
+	HWPfDdrPhyNcompDqs                    =  0x00D7004C,
+	HWPfDdrPhyPcompCmd                    =  0x00D70050,
+	HWPfDdrPhyNcompCmd                    =  0x00D70054,
+	HWPfDdrPhyPcompCk                     =  0x00D70058,
+	HWPfDdrPhyNcompCk                     =  0x00D7005C,
+	HWPfDdrPhyRcalOdtDq                   =  0x00D70060,
+	HWPfDdrPhyRcalOdtDqs                  =  0x00D70064,
+	HWPfDdrPhyRcalMask1                   =  0x00D70068,
+	HWPfDdrPhyRcalMask2                   =  0x00D7006C,
+	HWPfDdrPhyRcalCtrl                    =  0x00D70070,
+	HWPfDdrPhyRcalCnt                     =  0x00D70074,
+	HWPfDdrPhyRcalOverride                =  0x00D70078,
+	HWPfDdrPhyRcalGateen                  =  0x00D7007C,
+	HWPfDdrPhyCtrl                        =  0x00D70080,
+	HWPfDdrPhyWrlvlAlg                    =  0x00D70084,
+	HWPfDdrPhyRcalVreftTxcmdOdt           =  0x00D70088,
+	HWPfDdrPhyRdlvlGateParam              =  0x00D7008C,
+	HWPfDdrPhyRdlvlGateParam2             =  0x00D70090,
+	HWPfDdrPhyRcalVreftTxdata             =  0x00D70094,
+	HWPfDdrPhyCmdIntDelay                 =  0x00D700A4,
+	HWPfDdrPhyAlertN                      =  0x00D700A8,
+	HWPfDdrPhyTrngReqWpre2tck             =  0x00D700AC,
+	HWPfDdrPhyCmdPhaseSel                 =  0x00D700B4,
+	HWPfDdrPhyCmdDcdl                     =  0x00D700B8,
+	HWPfDdrPhyCkDcdl                      =  0x00D700BC,
+	HWPfDdrPhySwTrngCtrl1                 =  0x00D700C0,
+	HWPfDdrPhySwTrngCtrl2                 =  0x00D700C4,
+	HWPfDdrPhyRcalPcompRden               =  0x00D700C8,
+	HWPfDdrPhyRcalNcompRden               =  0x00D700CC,
+	HWPfDdrPhyRcalCompen                  =  0x00D700D0,
+	HWPfDdrPhySwTrngRdqs                  =  0x00D700D4,
+	HWPfDdrPhySwTrngWdqs                  =  0x00D700D8,
+	HWPfDdrPhySwTrngRdena                 =  0x00D700DC,
+	HWPfDdrPhySwTrngRdenb                 =  0x00D700E0,
+	HWPfDdrPhySwTrngRdenc                 =  0x00D700E4,
+	HWPfDdrPhySwTrngWdq                   =  0x00D700E8,
+	HWPfDdrPhySwTrngRdq                   =  0x00D700EC,
+	HWPfDdrPhyPcfgHmValue                 =  0x00D700F0,
+	HWPfDdrPhyPcfgTimerValue              =  0x00D700F4,
+	HWPfDdrPhyPcfgSoftwareTraining        =  0x00D700F8,
+	HWPfDdrPhyPcfgMcStatus                =  0x00D700FC,
+	HWPfDdrPhyWrlvlPhRank0                =  0x00D70100,
+	HWPfDdrPhyRdenPhRank0                 =  0x00D70104,
+	HWPfDdrPhyRdenIntRank0                =  0x00D70108,
+	HWPfDdrPhyRdqsDcdlRank0               =  0x00D7010C,
+	HWPfDdrPhyRdqsShadowDcdlRank0         =  0x00D70110,
+	HWPfDdrPhyWdqsDcdlRank0               =  0x00D70114,
+	HWPfDdrPhyWdmDcdlShadowRank0          =  0x00D70118,
+	HWPfDdrPhyWdmDcdlRank0                =  0x00D7011C,
+	HWPfDdrPhyDbiDcdlRank0                =  0x00D70120,
+	HWPfDdrPhyRdenDcdlaRank0              =  0x00D70124,
+	HWPfDdrPhyDbiDcdlShadowRank0          =  0x00D70128,
+	HWPfDdrPhyRdenDcdlbRank0              =  0x00D7012C,
+	HWPfDdrPhyWdqsShadowDcdlRank0         =  0x00D70130,
+	HWPfDdrPhyRdenDcdlcRank0              =  0x00D70134,
+	HWPfDdrPhyRdenShadowDcdlaRank0        =  0x00D70138,
+	HWPfDdrPhyWrlvlIntRank0               =  0x00D7013C,
+	HWPfDdrPhyRdqDcdlBit0Rank0            =  0x00D70200,
+	HWPfDdrPhyRdqDcdlShadowBit0Rank0      =  0x00D70204,
+	HWPfDdrPhyWdqDcdlBit0Rank0            =  0x00D70208,
+	HWPfDdrPhyWdqDcdlShadowBit0Rank0      =  0x00D7020C,
+	HWPfDdrPhyRdqDcdlBit1Rank0            =  0x00D70240,
+	HWPfDdrPhyRdqDcdlShadowBit1Rank0      =  0x00D70244,
+	HWPfDdrPhyWdqDcdlBit1Rank0            =  0x00D70248,
+	HWPfDdrPhyWdqDcdlShadowBit1Rank0      =  0x00D7024C,
+	HWPfDdrPhyRdqDcdlBit2Rank0            =  0x00D70280,
+	HWPfDdrPhyRdqDcdlShadowBit2Rank0      =  0x00D70284,
+	HWPfDdrPhyWdqDcdlBit2Rank0            =  0x00D70288,
+	HWPfDdrPhyWdqDcdlShadowBit2Rank0      =  0x00D7028C,
+	HWPfDdrPhyRdqDcdlBit3Rank0            =  0x00D702C0,
+	HWPfDdrPhyRdqDcdlShadowBit3Rank0      =  0x00D702C4,
+	HWPfDdrPhyWdqDcdlBit3Rank0            =  0x00D702C8,
+	HWPfDdrPhyWdqDcdlShadowBit3Rank0      =  0x00D702CC,
+	HWPfDdrPhyRdqDcdlBit4Rank0            =  0x00D70300,
+	HWPfDdrPhyRdqDcdlShadowBit4Rank0      =  0x00D70304,
+	HWPfDdrPhyWdqDcdlBit4Rank0            =  0x00D70308,
+	HWPfDdrPhyWdqDcdlShadowBit4Rank0      =  0x00D7030C,
+	HWPfDdrPhyRdqDcdlBit5Rank0            =  0x00D70340,
+	HWPfDdrPhyRdqDcdlShadowBit5Rank0      =  0x00D70344,
+	HWPfDdrPhyWdqDcdlBit5Rank0            =  0x00D70348,
+	HWPfDdrPhyWdqDcdlShadowBit5Rank0      =  0x00D7034C,
+	HWPfDdrPhyRdqDcdlBit6Rank0            =  0x00D70380,
+	HWPfDdrPhyRdqDcdlShadowBit6Rank0      =  0x00D70384,
+	HWPfDdrPhyWdqDcdlBit6Rank0            =  0x00D70388,
+	HWPfDdrPhyWdqDcdlShadowBit6Rank0      =  0x00D7038C,
+	HWPfDdrPhyRdqDcdlBit7Rank0            =  0x00D703C0,
+	HWPfDdrPhyRdqDcdlShadowBit7Rank0      =  0x00D703C4,
+	HWPfDdrPhyWdqDcdlBit7Rank0            =  0x00D703C8,
+	HWPfDdrPhyWdqDcdlShadowBit7Rank0      =  0x00D703CC,
+	HWPfDdrPhyIdtmStatus                  =  0x00D740D0,
+	HWPfDdrPhyIdtmError                   =  0x00D74110,
+	HWPfDdrPhyIdtmDebug                   =  0x00D74120,
+	HWPfDdrPhyIdtmDebugInt                =  0x00D74130,
+	HwPfPcieLnAsicCfgovr                  =  0x00D80000,
+	HwPfPcieLnAclkmixer                   =  0x00D80004,
+	HwPfPcieLnTxrampfreq                  =  0x00D80008,
+	HwPfPcieLnLanetest                    =  0x00D8000C,
+	HwPfPcieLnDcctrl                      =  0x00D80010,
+	HwPfPcieLnDccmeas                     =  0x00D80014,
+	HwPfPcieLnDccovrAclk                  =  0x00D80018,
+	HwPfPcieLnDccovrTxa                   =  0x00D8001C,
+	HwPfPcieLnDccovrTxk                   =  0x00D80020,
+	HwPfPcieLnDccovrDclk                  =  0x00D80024,
+	HwPfPcieLnDccovrEclk                  =  0x00D80028,
+	HwPfPcieLnDcctrimAclk                 =  0x00D8002C,
+	HwPfPcieLnDcctrimTx                   =  0x00D80030,
+	HwPfPcieLnDcctrimDclk                 =  0x00D80034,
+	HwPfPcieLnDcctrimEclk                 =  0x00D80038,
+	HwPfPcieLnQuadCtrl                    =  0x00D8003C,
+	HwPfPcieLnQuadCorrIndex               =  0x00D80040,
+	HwPfPcieLnQuadCorrStatus              =  0x00D80044,
+	HwPfPcieLnAsicRxovr1                  =  0x00D80048,
+	HwPfPcieLnAsicRxovr2                  =  0x00D8004C,
+	HwPfPcieLnAsicEqinfovr                =  0x00D80050,
+	HwPfPcieLnRxcsr                       =  0x00D80054,
+	HwPfPcieLnRxfectrl                    =  0x00D80058,
+	HwPfPcieLnRxtest                      =  0x00D8005C,
+	HwPfPcieLnEscount                     =  0x00D80060,
+	HwPfPcieLnCdrctrl                     =  0x00D80064,
+	HwPfPcieLnCdrctrl2                    =  0x00D80068,
+	HwPfPcieLnCdrcfg0Ctrl0                =  0x00D8006C,
+	HwPfPcieLnCdrcfg0Ctrl1                =  0x00D80070,
+	HwPfPcieLnCdrcfg0Ctrl2                =  0x00D80074,
+	HwPfPcieLnCdrcfg1Ctrl0                =  0x00D80078,
+	HwPfPcieLnCdrcfg1Ctrl1                =  0x00D8007C,
+	HwPfPcieLnCdrcfg1Ctrl2                =  0x00D80080,
+	HwPfPcieLnCdrcfg2Ctrl0                =  0x00D80084,
+	HwPfPcieLnCdrcfg2Ctrl1                =  0x00D80088,
+	HwPfPcieLnCdrcfg2Ctrl2                =  0x00D8008C,
+	HwPfPcieLnCdrcfg3Ctrl0                =  0x00D80090,
+	HwPfPcieLnCdrcfg3Ctrl1                =  0x00D80094,
+	HwPfPcieLnCdrcfg3Ctrl2                =  0x00D80098,
+	HwPfPcieLnCdrphase                    =  0x00D8009C,
+	HwPfPcieLnCdrfreq                     =  0x00D800A0,
+	HwPfPcieLnCdrstatusPhase              =  0x00D800A4,
+	HwPfPcieLnCdrstatusFreq               =  0x00D800A8,
+	HwPfPcieLnCdroffset                   =  0x00D800AC,
+	HwPfPcieLnRxvosctl                    =  0x00D800B0,
+	HwPfPcieLnRxvosctl2                   =  0x00D800B4,
+	HwPfPcieLnRxlosctl                    =  0x00D800B8,
+	HwPfPcieLnRxlos                       =  0x00D800BC,
+	HwPfPcieLnRxlosvval                   =  0x00D800C0,
+	HwPfPcieLnRxvosd0                     =  0x00D800C4,
+	HwPfPcieLnRxvosd1                     =  0x00D800C8,
+	HwPfPcieLnRxvosep0                    =  0x00D800CC,
+	HwPfPcieLnRxvosep1                    =  0x00D800D0,
+	HwPfPcieLnRxvosen0                    =  0x00D800D4,
+	HwPfPcieLnRxvosen1                    =  0x00D800D8,
+	HwPfPcieLnRxvosafe                    =  0x00D800DC,
+	HwPfPcieLnRxvosa0                     =  0x00D800E0,
+	HwPfPcieLnRxvosa0Out                  =  0x00D800E4,
+	HwPfPcieLnRxvosa1                     =  0x00D800E8,
+	HwPfPcieLnRxvosa1Out                  =  0x00D800EC,
+	HwPfPcieLnRxmisc                      =  0x00D800F0,
+	HwPfPcieLnRxbeacon                    =  0x00D800F4,
+	HwPfPcieLnRxdssout                    =  0x00D800F8,
+	HwPfPcieLnRxdssout2                   =  0x00D800FC,
+	HwPfPcieLnAlphapctrl                  =  0x00D80100,
+	HwPfPcieLnAlphanctrl                  =  0x00D80104,
+	HwPfPcieLnAdaptctrl                   =  0x00D80108,
+	HwPfPcieLnAdaptctrl1                  =  0x00D8010C,
+	HwPfPcieLnAdaptstatus                 =  0x00D80110,
+	HwPfPcieLnAdaptvga1                   =  0x00D80114,
+	HwPfPcieLnAdaptvga2                   =  0x00D80118,
+	HwPfPcieLnAdaptvga3                   =  0x00D8011C,
+	HwPfPcieLnAdaptvga4                   =  0x00D80120,
+	HwPfPcieLnAdaptboost1                 =  0x00D80124,
+	HwPfPcieLnAdaptboost2                 =  0x00D80128,
+	HwPfPcieLnAdaptboost3                 =  0x00D8012C,
+	HwPfPcieLnAdaptboost4                 =  0x00D80130,
+	HwPfPcieLnAdaptsslms1                 =  0x00D80134,
+	HwPfPcieLnAdaptsslms2                 =  0x00D80138,
+	HwPfPcieLnAdaptvgaStatus              =  0x00D8013C,
+	HwPfPcieLnAdaptboostStatus            =  0x00D80140,
+	HwPfPcieLnAdaptsslmsStatus1           =  0x00D80144,
+	HwPfPcieLnAdaptsslmsStatus2           =  0x00D80148,
+	HwPfPcieLnAfectrl1                    =  0x00D8014C,
+	HwPfPcieLnAfectrl2                    =  0x00D80150,
+	HwPfPcieLnAfectrl3                    =  0x00D80154,
+	HwPfPcieLnAfedefault1                 =  0x00D80158,
+	HwPfPcieLnAfedefault2                 =  0x00D8015C,
+	HwPfPcieLnDfectrl1                    =  0x00D80160,
+	HwPfPcieLnDfectrl2                    =  0x00D80164,
+	HwPfPcieLnDfectrl3                    =  0x00D80168,
+	HwPfPcieLnDfectrl4                    =  0x00D8016C,
+	HwPfPcieLnDfectrl5                    =  0x00D80170,
+	HwPfPcieLnDfectrl6                    =  0x00D80174,
+	HwPfPcieLnAfestatus1                  =  0x00D80178,
+	HwPfPcieLnAfestatus2                  =  0x00D8017C,
+	HwPfPcieLnDfestatus1                  =  0x00D80180,
+	HwPfPcieLnDfestatus2                  =  0x00D80184,
+	HwPfPcieLnDfestatus3                  =  0x00D80188,
+	HwPfPcieLnDfestatus4                  =  0x00D8018C,
+	HwPfPcieLnDfestatus5                  =  0x00D80190,
+	HwPfPcieLnAlphastatus                 =  0x00D80194,
+	HwPfPcieLnFomctrl1                    =  0x00D80198,
+	HwPfPcieLnFomctrl2                    =  0x00D8019C,
+	HwPfPcieLnFomctrl3                    =  0x00D801A0,
+	HwPfPcieLnAclkcalStatus               =  0x00D801A4,
+	HwPfPcieLnOffscorrStatus              =  0x00D801A8,
+	HwPfPcieLnEyewidthStatus              =  0x00D801AC,
+	HwPfPcieLnEyeheightStatus             =  0x00D801B0,
+	HwPfPcieLnAsicTxovr1                  =  0x00D801B4,
+	HwPfPcieLnAsicTxovr2                  =  0x00D801B8,
+	HwPfPcieLnAsicTxovr3                  =  0x00D801BC,
+	HwPfPcieLnTxbiasadjOvr                =  0x00D801C0,
+	HwPfPcieLnTxcsr                       =  0x00D801C4,
+	HwPfPcieLnTxtest                      =  0x00D801C8,
+	HwPfPcieLnTxtestword                  =  0x00D801CC,
+	HwPfPcieLnTxtestwordHigh              =  0x00D801D0,
+	HwPfPcieLnTxdrive                     =  0x00D801D4,
+	HwPfPcieLnMtcsLn                      =  0x00D801D8,
+	HwPfPcieLnStatsumLn                   =  0x00D801DC,
+	HwPfPcieLnRcbusScratch                =  0x00D801E0,
+	HwPfPcieLnRcbusMinorrev               =  0x00D801F0,
+	HwPfPcieLnRcbusMajorrev               =  0x00D801F4,
+	HwPfPcieLnRcbusBlocktype              =  0x00D801F8,
+	HwPfPcieSupPllcsr                     =  0x00D80800,
+	HwPfPcieSupPlldiv                     =  0x00D80804,
+	HwPfPcieSupPllcal                     =  0x00D80808,
+	HwPfPcieSupPllcalsts                  =  0x00D8080C,
+	HwPfPcieSupPllmeas                    =  0x00D80810,
+	HwPfPcieSupPlldactrim                 =  0x00D80814,
+	HwPfPcieSupPllbiastrim                =  0x00D80818,
+	HwPfPcieSupPllbwtrim                  =  0x00D8081C,
+	HwPfPcieSupPllcaldly                  =  0x00D80820,
+	HwPfPcieSupRefclkonpclkctrl           =  0x00D80824,
+	HwPfPcieSupPclkdelay                  =  0x00D80828,
+	HwPfPcieSupPhyconfig                  =  0x00D8082C,
+	HwPfPcieSupRcalIntf                   =  0x00D80830,
+	HwPfPcieSupAuxcsr                     =  0x00D80834,
+	HwPfPcieSupVref                       =  0x00D80838,
+	HwPfPcieSupLinkmode                   =  0x00D8083C,
+	HwPfPcieSupRrefcalctl                 =  0x00D80840,
+	HwPfPcieSupRrefcal                    =  0x00D80844,
+	HwPfPcieSupRrefcaldly                 =  0x00D80848,
+	HwPfPcieSupTximpcalctl                =  0x00D8084C,
+	HwPfPcieSupTximpcal                   =  0x00D80850,
+	HwPfPcieSupTximpoffset                =  0x00D80854,
+	HwPfPcieSupTximpcaldly                =  0x00D80858,
+	HwPfPcieSupRximpcalctl                =  0x00D8085C,
+	HwPfPcieSupRximpcal                   =  0x00D80860,
+	HwPfPcieSupRximpoffset                =  0x00D80864,
+	HwPfPcieSupRximpcaldly                =  0x00D80868,
+	HwPfPcieSupFence                      =  0x00D8086C,
+	HwPfPcieSupMtcs                       =  0x00D80870,
+	HwPfPcieSupStatsum                    =  0x00D809B8,
+	HwPfPciePcsDpStatus0                  =  0x00D81000,
+	HwPfPciePcsDpControl0                 =  0x00D81004,
+	HwPfPciePcsPmaStatusLane0             =  0x00D81008,
+	HwPfPciePcsPipeStatusLane0            =  0x00D8100C,
+	HwPfPciePcsTxdeemph0Lane0             =  0x00D81010,
+	HwPfPciePcsTxdeemph1Lane0             =  0x00D81014,
+	HwPfPciePcsInternalStatusLane0        =  0x00D81018,
+	HwPfPciePcsDpStatus1                  =  0x00D8101C,
+	HwPfPciePcsDpControl1                 =  0x00D81020,
+	HwPfPciePcsPmaStatusLane1             =  0x00D81024,
+	HwPfPciePcsPipeStatusLane1            =  0x00D81028,
+	HwPfPciePcsTxdeemph0Lane1             =  0x00D8102C,
+	HwPfPciePcsTxdeemph1Lane1             =  0x00D81030,
+	HwPfPciePcsInternalStatusLane1        =  0x00D81034,
+	HwPfPciePcsDpStatus2                  =  0x00D81038,
+	HwPfPciePcsDpControl2                 =  0x00D8103C,
+	HwPfPciePcsPmaStatusLane2             =  0x00D81040,
+	HwPfPciePcsPipeStatusLane2            =  0x00D81044,
+	HwPfPciePcsTxdeemph0Lane2             =  0x00D81048,
+	HwPfPciePcsTxdeemph1Lane2             =  0x00D8104C,
+	HwPfPciePcsInternalStatusLane2        =  0x00D81050,
+	HwPfPciePcsDpStatus3                  =  0x00D81054,
+	HwPfPciePcsDpControl3                 =  0x00D81058,
+	HwPfPciePcsPmaStatusLane3             =  0x00D8105C,
+	HwPfPciePcsPipeStatusLane3            =  0x00D81060,
+	HwPfPciePcsTxdeemph0Lane3             =  0x00D81064,
+	HwPfPciePcsTxdeemph1Lane3             =  0x00D81068,
+	HwPfPciePcsInternalStatusLane3        =  0x00D8106C,
+	HwPfPciePcsEbStatus0                  =  0x00D81070,
+	HwPfPciePcsEbStatus1                  =  0x00D81074,
+	HwPfPciePcsEbStatus2                  =  0x00D81078,
+	HwPfPciePcsEbStatus3                  =  0x00D8107C,
+	HwPfPciePcsPllSettingPcieG1           =  0x00D81088,
+	HwPfPciePcsPllSettingPcieG2           =  0x00D8108C,
+	HwPfPciePcsPllSettingPcieG3           =  0x00D81090,
+	HwPfPciePcsControl                    =  0x00D81094,
+	HwPfPciePcsEqControl                  =  0x00D81098,
+	HwPfPciePcsEqTimer                    =  0x00D8109C,
+	HwPfPciePcsEqErrStatus                =  0x00D810A0,
+	HwPfPciePcsEqErrCount                 =  0x00D810A4,
+	HwPfPciePcsStatus                     =  0x00D810A8,
+	HwPfPciePcsMiscRegister               =  0x00D810AC,
+	HwPfPciePcsObsControl                 =  0x00D810B0,
+	HwPfPciePcsPrbsCount0                 =  0x00D81200,
+	HwPfPciePcsBistControl0               =  0x00D81204,
+	HwPfPciePcsBistStaticWord00           =  0x00D81208,
+	HwPfPciePcsBistStaticWord10           =  0x00D8120C,
+	HwPfPciePcsBistStaticWord20           =  0x00D81210,
+	HwPfPciePcsBistStaticWord30           =  0x00D81214,
+	HwPfPciePcsPrbsCount1                 =  0x00D81220,
+	HwPfPciePcsBistControl1               =  0x00D81224,
+	HwPfPciePcsBistStaticWord01           =  0x00D81228,
+	HwPfPciePcsBistStaticWord11           =  0x00D8122C,
+	HwPfPciePcsBistStaticWord21           =  0x00D81230,
+	HwPfPciePcsBistStaticWord31           =  0x00D81234,
+	HwPfPciePcsPrbsCount2                 =  0x00D81240,
+	HwPfPciePcsBistControl2               =  0x00D81244,
+	HwPfPciePcsBistStaticWord02           =  0x00D81248,
+	HwPfPciePcsBistStaticWord12           =  0x00D8124C,
+	HwPfPciePcsBistStaticWord22           =  0x00D81250,
+	HwPfPciePcsBistStaticWord32           =  0x00D81254,
+	HwPfPciePcsPrbsCount3                 =  0x00D81260,
+	HwPfPciePcsBistControl3               =  0x00D81264,
+	HwPfPciePcsBistStaticWord03           =  0x00D81268,
+	HwPfPciePcsBistStaticWord13           =  0x00D8126C,
+	HwPfPciePcsBistStaticWord23           =  0x00D81270,
+	HwPfPciePcsBistStaticWord33           =  0x00D81274,
+	HwPfPcieGpexLtssmStateCntrl           =  0x00D90400,
+	HwPfPcieGpexLtssmStateStatus          =  0x00D90404,
+	HwPfPcieGpexSkipFreqTimer             =  0x00D90408,
+	HwPfPcieGpexLaneSelect                =  0x00D9040C,
+	HwPfPcieGpexLaneDeskew                =  0x00D90410,
+	HwPfPcieGpexRxErrorStatus             =  0x00D90414,
+	HwPfPcieGpexLaneNumControl            =  0x00D90418,
+	HwPfPcieGpexNFstControl               =  0x00D9041C,
+	HwPfPcieGpexLinkStatus                =  0x00D90420,
+	HwPfPcieGpexAckReplayTimeout          =  0x00D90438,
+	HwPfPcieGpexSeqNumberStatus           =  0x00D9043C,
+	HwPfPcieGpexCoreClkRatio              =  0x00D90440,
+	HwPfPcieGpexDllTholdControl           =  0x00D90448,
+	HwPfPcieGpexPmTimer                   =  0x00D90450,
+	HwPfPcieGpexPmeTimeout                =  0x00D90454,
+	HwPfPcieGpexAspmL1Timer               =  0x00D90458,
+	HwPfPcieGpexAspmReqTimer              =  0x00D9045C,
+	HwPfPcieGpexAspmL1Dis                 =  0x00D90460,
+	HwPfPcieGpexAdvisoryErrorControl      =  0x00D90468,
+	HwPfPcieGpexId                        =  0x00D90470,
+	HwPfPcieGpexClasscode                 =  0x00D90474,
+	HwPfPcieGpexSubsystemId               =  0x00D90478,
+	HwPfPcieGpexDeviceCapabilities        =  0x00D9047C,
+	HwPfPcieGpexLinkCapabilities          =  0x00D90480,
+	HwPfPcieGpexFunctionNumber            =  0x00D90484,
+	HwPfPcieGpexPmCapabilities            =  0x00D90488,
+	HwPfPcieGpexFunctionSelect            =  0x00D9048C,
+	HwPfPcieGpexErrorCounter              =  0x00D904AC,
+	HwPfPcieGpexConfigReady               =  0x00D904B0,
+	HwPfPcieGpexFcUpdateTimeout           =  0x00D904B8,
+	HwPfPcieGpexFcUpdateTimer             =  0x00D904BC,
+	HwPfPcieGpexVcBufferLoad              =  0x00D904C8,
+	HwPfPcieGpexVcBufferSizeThold         =  0x00D904CC,
+	HwPfPcieGpexVcBufferSelect            =  0x00D904D0,
+	HwPfPcieGpexBarEnable                 =  0x00D904D4,
+	HwPfPcieGpexBarDwordLower             =  0x00D904D8,
+	HwPfPcieGpexBarDwordUpper             =  0x00D904DC,
+	HwPfPcieGpexBarSelect                 =  0x00D904E0,
+	HwPfPcieGpexCreditCounterSelect       =  0x00D904E4,
+	HwPfPcieGpexCreditCounterStatus       =  0x00D904E8,
+	HwPfPcieGpexTlpHeaderSelect           =  0x00D904EC,
+	HwPfPcieGpexTlpHeaderDword0           =  0x00D904F0,
+	HwPfPcieGpexTlpHeaderDword1           =  0x00D904F4,
+	HwPfPcieGpexTlpHeaderDword2           =  0x00D904F8,
+	HwPfPcieGpexTlpHeaderDword3           =  0x00D904FC,
+	HwPfPcieGpexRelaxOrderControl         =  0x00D90500,
+	HwPfPcieGpexBarPrefetch               =  0x00D90504,
+	HwPfPcieGpexFcCheckControl            =  0x00D90508,
+	HwPfPcieGpexFcUpdateTimerTraffic      =  0x00D90518,
+	HwPfPcieGpexPhyControl0               =  0x00D9053C,
+	HwPfPcieGpexPhyControl1               =  0x00D90544,
+	HwPfPcieGpexPhyControl2               =  0x00D9054C,
+	HwPfPcieGpexUserControl0              =  0x00D9055C,
+	HwPfPcieGpexUncorrErrorStatus         =  0x00D905F0,
+	HwPfPcieGpexRxCplError                =  0x00D90620,
+	HwPfPcieGpexRxCplErrorDword0          =  0x00D90624,
+	HwPfPcieGpexRxCplErrorDword1          =  0x00D90628,
+	HwPfPcieGpexRxCplErrorDword2          =  0x00D9062C,
+	HwPfPcieGpexPabSwResetEn              =  0x00D90630,
+	HwPfPcieGpexGen3Control0              =  0x00D90634,
+	HwPfPcieGpexGen3Control1              =  0x00D90638,
+	HwPfPcieGpexGen3Control2              =  0x00D9063C,
+	HwPfPcieGpexGen2ControlCsr            =  0x00D90640,
+	HwPfPcieGpexTotalVfInitialVf0         =  0x00D90644,
+	HwPfPcieGpexTotalVfInitialVf1         =  0x00D90648,
+	HwPfPcieGpexSriovLinkDevId0           =  0x00D90684,
+	HwPfPcieGpexSriovLinkDevId1           =  0x00D90688,
+	HwPfPcieGpexSriovPageSize0            =  0x00D906C4,
+	HwPfPcieGpexSriovPageSize1            =  0x00D906C8,
+	HwPfPcieGpexIdVersion                 =  0x00D906FC,
+	HwPfPcieGpexSriovVfOffsetStride0      =  0x00D90704,
+	HwPfPcieGpexSriovVfOffsetStride1      =  0x00D90708,
+	HwPfPcieGpexGen3DeskewControl         =  0x00D907B4,
+	HwPfPcieGpexGen3EqControl             =  0x00D907B8,
+	HwPfPcieGpexBridgeVersion             =  0x00D90800,
+	HwPfPcieGpexBridgeCapability          =  0x00D90804,
+	HwPfPcieGpexBridgeControl             =  0x00D90808,
+	HwPfPcieGpexBridgeStatus              =  0x00D9080C,
+	HwPfPcieGpexEngineActivityStatus      =  0x00D9081C,
+	HwPfPcieGpexEngineResetControl        =  0x00D90820,
+	HwPfPcieGpexAxiPioControl             =  0x00D90840,
+	HwPfPcieGpexAxiPioStatus              =  0x00D90844,
+	HwPfPcieGpexAmbaSlaveCmdStatus        =  0x00D90848,
+	HwPfPcieGpexPexPioControl             =  0x00D908C0,
+	HwPfPcieGpexPexPioStatus              =  0x00D908C4,
+	HwPfPcieGpexAmbaMasterStatus          =  0x00D908C8,
+	HwPfPcieGpexCsrSlaveCmdStatus         =  0x00D90920,
+	HwPfPcieGpexMailboxAxiControl         =  0x00D90A50,
+	HwPfPcieGpexMailboxAxiData            =  0x00D90A54,
+	HwPfPcieGpexMailboxPexControl         =  0x00D90A90,
+	HwPfPcieGpexMailboxPexData            =  0x00D90A94,
+	HwPfPcieGpexPexInterruptEnable        =  0x00D90AD0,
+	HwPfPcieGpexPexInterruptStatus        =  0x00D90AD4,
+	HwPfPcieGpexPexInterruptAxiPioVector  =  0x00D90AD8,
+	HwPfPcieGpexPexInterruptPexPioVector  =  0x00D90AE0,
+	HwPfPcieGpexPexInterruptMiscVector    =  0x00D90AF8,
+	HwPfPcieGpexAmbaInterruptPioEnable    =  0x00D90B00,
+	HwPfPcieGpexAmbaInterruptMiscEnable   =  0x00D90B0C,
+	HwPfPcieGpexAmbaInterruptPioStatus    =  0x00D90B10,
+	HwPfPcieGpexAmbaInterruptMiscStatus   =  0x00D90B1C,
+	HwPfPcieGpexPexPmControl              =  0x00D90B80,
+	HwPfPcieGpexSlotMisc                  =  0x00D90B88,
+	HwPfPcieGpexAxiAddrMappingControl     =  0x00D90BA0,
+	HwPfPcieGpexAxiAddrMappingWindowAxiBase     =  0x00D90BA4,
+	HwPfPcieGpexAxiAddrMappingWindowPexBaseLow  =  0x00D90BA8,
+	HwPfPcieGpexAxiAddrMappingWindowPexBaseHigh =  0x00D90BAC,
+	HwPfPcieGpexPexBarAddrFunc0Bar0       =  0x00D91BA0,
+	HwPfPcieGpexPexBarAddrFunc0Bar1       =  0x00D91BA4,
+	HwPfPcieGpexAxiAddrMappingPcieHdrParam =  0x00D95BA0,
+	HwPfPcieGpexExtAxiAddrMappingAxiBase  =  0x00D980A0,
+	HwPfPcieGpexPexExtBarAddrFunc0Bar0    =  0x00D984A0,
+	HwPfPcieGpexPexExtBarAddrFunc0Bar1    =  0x00D984A4,
+	HwPfPcieGpexAmbaInterruptFlrEnable    =  0x00D9B960,
+	HwPfPcieGpexAmbaInterruptFlrStatus    =  0x00D9B9A0,
+	HwPfPcieGpexExtAxiAddrMappingSize     =  0x00D9BAF0,
+	HwPfPcieGpexPexPioAwcacheControl      =  0x00D9C300,
+	HwPfPcieGpexPexPioArcacheControl      =  0x00D9C304,
+	HwPfPcieGpexPabObSizeControlVc0       =  0x00D9C310
+};
+
+/* TIP PF Interrupt numbers */
+enum {
+	ACC100_PF_INT_QMGR_AQ_OVERFLOW = 0,
+	ACC100_PF_INT_DOORBELL_VF_2_PF = 1,
+	ACC100_PF_INT_DMA_DL_DESC_IRQ = 2,
+	ACC100_PF_INT_DMA_UL_DESC_IRQ = 3,
+	ACC100_PF_INT_DMA_MLD_DESC_IRQ = 4,
+	ACC100_PF_INT_DMA_UL5G_DESC_IRQ = 5,
+	ACC100_PF_INT_DMA_DL5G_DESC_IRQ = 6,
+	ACC100_PF_INT_ILLEGAL_FORMAT = 7,
+	ACC100_PF_INT_QMGR_DISABLED_ACCESS = 8,
+	ACC100_PF_INT_QMGR_AQ_OVERTHRESHOLD = 9,
+	ACC100_PF_INT_ARAM_ACCESS_ERR = 10,
+	ACC100_PF_INT_ARAM_ECC_1BIT_ERR = 11,
+	ACC100_PF_INT_PARITY_ERR = 12,
+	ACC100_PF_INT_QMGR_ERR = 13,
+	ACC100_PF_INT_INT_REQ_OVERFLOW = 14,
+	ACC100_PF_INT_APB_TIMEOUT = 15,
+};
+
+#endif /* ACC100_PF_ENUM_H */
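
The PF enum above is a flat list of byte offsets into BAR0 of the physical function, so register access reduces to base-plus-offset arithmetic on the mapped BAR. A minimal sketch of that mapping (not driver code: `reg_addr()` is a hypothetical helper, and the two offsets are copied from the enum above):

```c
#include <assert.h>
#include <stdint.h>

/* Two sample offsets copied from acc100_pf_enum.h; the full enum is
 * auto-generated from RDL. */
enum {
	HWPfDdrPhyIdtmStatus   = 0x00D740D0,
	HWPfPcieGpexLinkStatus = 0x00D90420,
};

/* Hypothetical accessor: add the enum offset to the mapped BAR0 base.
 * Registers are 32 bits wide, hence the uint32_t view. */
static inline volatile uint32_t *
reg_addr(void *bar0, uint32_t offset)
{
	return (volatile uint32_t *)((uintptr_t)bar0 + offset);
}
```

The `volatile uint32_t *` return type matches the 32-bit register width noted later in `rte_acc100_pmd.h`.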
diff --git a/drivers/baseband/acc100/acc100_vf_enum.h b/drivers/baseband/acc100/acc100_vf_enum.h
new file mode 100644
index 0000000..b512af3
--- /dev/null
+++ b/drivers/baseband/acc100/acc100_vf_enum.h
@@ -0,0 +1,73 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017 Intel Corporation
+ */
+
+#ifndef ACC100_VF_ENUM_H
+#define ACC100_VF_ENUM_H
+
+/*
+ * ACC100 Register mapping on VF BAR0
+ * This is automatically generated from RDL; the format may change with a new RDL
+ */
+enum {
+	HWVfQmgrIngressAq             =  0x00000000,
+	HWVfHiVfToPfDbellVf           =  0x00000800,
+	HWVfHiPfToVfDbellVf           =  0x00000808,
+	HWVfHiInfoRingBaseLoVf        =  0x00000810,
+	HWVfHiInfoRingBaseHiVf        =  0x00000814,
+	HWVfHiInfoRingPointerVf       =  0x00000818,
+	HWVfHiInfoRingIntWrEnVf       =  0x00000820,
+	HWVfHiInfoRingPf2VfWrEnVf     =  0x00000824,
+	HWVfHiMsixVectorMapperVf      =  0x00000860,
+	HWVfDmaFec5GulDescBaseLoRegVf =  0x00000920,
+	HWVfDmaFec5GulDescBaseHiRegVf =  0x00000924,
+	HWVfDmaFec5GulRespPtrLoRegVf  =  0x00000928,
+	HWVfDmaFec5GulRespPtrHiRegVf  =  0x0000092C,
+	HWVfDmaFec5GdlDescBaseLoRegVf =  0x00000940,
+	HWVfDmaFec5GdlDescBaseHiRegVf =  0x00000944,
+	HWVfDmaFec5GdlRespPtrLoRegVf  =  0x00000948,
+	HWVfDmaFec5GdlRespPtrHiRegVf  =  0x0000094C,
+	HWVfDmaFec4GulDescBaseLoRegVf =  0x00000960,
+	HWVfDmaFec4GulDescBaseHiRegVf =  0x00000964,
+	HWVfDmaFec4GulRespPtrLoRegVf  =  0x00000968,
+	HWVfDmaFec4GulRespPtrHiRegVf  =  0x0000096C,
+	HWVfDmaFec4GdlDescBaseLoRegVf =  0x00000980,
+	HWVfDmaFec4GdlDescBaseHiRegVf =  0x00000984,
+	HWVfDmaFec4GdlRespPtrLoRegVf  =  0x00000988,
+	HWVfDmaFec4GdlRespPtrHiRegVf  =  0x0000098C,
+	HWVfDmaDdrBaseRangeRoVf       =  0x000009A0,
+	HWVfQmgrAqResetVf             =  0x00000E00,
+	HWVfQmgrRingSizeVf            =  0x00000E04,
+	HWVfQmgrGrpDepthLog20Vf       =  0x00000E08,
+	HWVfQmgrGrpDepthLog21Vf       =  0x00000E0C,
+	HWVfQmgrGrpFunction0Vf        =  0x00000E10,
+	HWVfQmgrGrpFunction1Vf        =  0x00000E14,
+	HWVfPmACntrlRegVf             =  0x00000F40,
+	HWVfPmACountVf                =  0x00000F48,
+	HWVfPmAKCntLoVf               =  0x00000F50,
+	HWVfPmAKCntHiVf               =  0x00000F54,
+	HWVfPmADeltaCntLoVf           =  0x00000F60,
+	HWVfPmADeltaCntHiVf           =  0x00000F64,
+	HWVfPmBCntrlRegVf             =  0x00000F80,
+	HWVfPmBCountVf                =  0x00000F88,
+	HWVfPmBKCntLoVf               =  0x00000F90,
+	HWVfPmBKCntHiVf               =  0x00000F94,
+	HWVfPmBDeltaCntLoVf           =  0x00000FA0,
+	HWVfPmBDeltaCntHiVf           =  0x00000FA4
+};
+
+/* TIP VF Interrupt numbers */
+enum {
+	ACC100_VF_INT_QMGR_AQ_OVERFLOW = 0,
+	ACC100_VF_INT_DOORBELL_VF_2_PF = 1,
+	ACC100_VF_INT_DMA_DL_DESC_IRQ = 2,
+	ACC100_VF_INT_DMA_UL_DESC_IRQ = 3,
+	ACC100_VF_INT_DMA_MLD_DESC_IRQ = 4,
+	ACC100_VF_INT_DMA_UL5G_DESC_IRQ = 5,
+	ACC100_VF_INT_DMA_DL5G_DESC_IRQ = 6,
+	ACC100_VF_INT_ILLEGAL_FORMAT = 7,
+	ACC100_VF_INT_QMGR_DISABLED_ACCESS = 8,
+	ACC100_VF_INT_QMGR_AQ_OVERTHRESHOLD = 9,
+};
+
+#endif /* ACC100_VF_ENUM_H */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 6f46df0..cd77570 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -5,6 +5,9 @@
 #ifndef _RTE_ACC100_PMD_H_
 #define _RTE_ACC100_PMD_H_
 
+#include "acc100_pf_enum.h"
+#include "acc100_vf_enum.h"
+
 /* Helper macro for logging */
 #define rte_bbdev_log(level, fmt, ...) \
 	rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \
@@ -27,6 +30,493 @@
 #define RTE_ACC100_PF_DEVICE_ID        (0x0d5c)
 #define RTE_ACC100_VF_DEVICE_ID        (0x0d5d)
 
+/* Define as 1 to use only a single FEC engine */
+#ifndef RTE_ACC100_SINGLE_FEC
+#define RTE_ACC100_SINGLE_FEC 0
+#endif
+
+/* Values used in filling in descriptors */
+#define ACC100_DMA_DESC_TYPE           2
+#define ACC100_DMA_CODE_BLK_MODE       0
+#define ACC100_DMA_BLKID_FCW           1
+#define ACC100_DMA_BLKID_IN            2
+#define ACC100_DMA_BLKID_OUT_ENC       1
+#define ACC100_DMA_BLKID_OUT_HARD      1
+#define ACC100_DMA_BLKID_OUT_SOFT      2
+#define ACC100_DMA_BLKID_OUT_HARQ      3
+#define ACC100_DMA_BLKID_IN_HARQ       3
+
+/* Values used in filling in decode FCWs */
+#define ACC100_FCW_TD_VER              1
+#define ACC100_FCW_TD_EXT_COLD_REG_EN  1
+#define ACC100_FCW_TD_AUTOMAP          0x0f
+#define ACC100_FCW_TD_RVIDX_0          2
+#define ACC100_FCW_TD_RVIDX_1          26
+#define ACC100_FCW_TD_RVIDX_2          50
+#define ACC100_FCW_TD_RVIDX_3          74
+
+/* Values used in writing to the registers */
+#define ACC100_REG_IRQ_EN_ALL          0x1FF83FF  /* Enable all interrupts */
+
+/* ACC100 Specific Dimensioning */
+#define ACC100_SIZE_64MBYTE            (64*1024*1024)
+/* Number of elements in an Info Ring */
+#define ACC100_INFO_RING_NUM_ENTRIES   1024
+/* Number of elements in HARQ layout memory */
+#define ACC100_HARQ_LAYOUT             (64*1024*1024)
+/* Assumed offset for HARQ in memory */
+#define ACC100_HARQ_OFFSET             (32*1024)
+/* Mask used to calculate an index in an Info Ring array (not a byte offset) */
+#define ACC100_INFO_RING_MASK          (ACC100_INFO_RING_NUM_ENTRIES-1)
+/* Number of Virtual Functions ACC100 supports */
+#define ACC100_NUM_VFS                  16
+#define ACC100_NUM_QGRPS                 8
+#define ACC100_NUM_QGRPS_PER_WORD        8
+#define ACC100_NUM_AQS                  16
+#define MAX_ENQ_BATCH_SIZE             255
+/* All ACC100 registers are 32 bits wide (4B aligned) */
+#define BYTES_IN_WORD                    4
+#define MAX_E_MBUF                   64000
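
Because `ACC100_INFO_RING_NUM_ENTRIES` is a power of two, the mask defined above converts a free-running counter into an Info Ring array index without a modulo. A minimal sketch of that idiom; `info_ring_idx()` is an illustration, not a driver function, and the two defines are copied from the header:

```c
#include <assert.h>
#include <stdint.h>

/* Copied from rte_acc100_pmd.h: ring size and the derived index mask. */
#define ACC100_INFO_RING_NUM_ENTRIES   1024
#define ACC100_INFO_RING_MASK          (ACC100_INFO_RING_NUM_ENTRIES - 1)

/* Hypothetical helper: wrap a free-running tail counter into an array
 * index. Works only because the ring size is a power of two. */
static inline uint16_t
info_ring_idx(uint32_t tail)
{
	return (uint16_t)(tail & ACC100_INFO_RING_MASK);
}
```

As the comment on the define stresses, the mask yields an index into the ring array, not a byte offset.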
+
+#define GRP_ID_SHIFT    10 /* Queue Index Hierarchy */
+#define VF_ID_SHIFT     4  /* Queue Index Hierarchy */
+#define VF_OFFSET_QOS   16 /* offset in Memory Space specific to QoS Mon */
+#define TMPL_PRI_0      0x03020100
+#define TMPL_PRI_1      0x07060504
+#define TMPL_PRI_2      0x0b0a0908
+#define TMPL_PRI_3      0x0f0e0d0c
+#define QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
+#define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
+
+#define ACC100_NUM_TMPL  32
+#define VF_OFFSET_QOS 16 /* offset in Memory Space specific to QoS Mon */
+/* Mapping of signals for the available engines */
+#define SIG_UL_5G      0
+#define SIG_UL_5G_LAST 7
+#define SIG_DL_5G      13
+#define SIG_DL_5G_LAST 15
+#define SIG_UL_4G      16
+#define SIG_UL_4G_LAST 21
+#define SIG_DL_4G      27
+#define SIG_DL_4G_LAST 31
+
+/* Maximum number of attempts to allocate a memory block for all rings */
+#define SW_RING_MEM_ALLOC_ATTEMPTS 5
+#define MAX_QUEUE_DEPTH           1024
+#define ACC100_DMA_MAX_NUM_POINTERS  14
+#define ACC100_DMA_DESC_PADDING      8
+#define ACC100_FCW_PADDING           12
+#define ACC100_DESC_FCW_OFFSET       192
+#define ACC100_DESC_SIZE             256
+#define ACC100_DESC_OFFSET           (ACC100_DESC_SIZE / 64)
+#define ACC100_FCW_TE_BLEN     32
+#define ACC100_FCW_TD_BLEN     24
+#define ACC100_FCW_LE_BLEN     32
+#define ACC100_FCW_LD_BLEN     36
+
+#define ACC100_FCW_VER         2
+#define MUX_5GDL_DESC 6
+#define CMP_ENC_SIZE 20
+#define CMP_DEC_SIZE 24
+#define ENC_OFFSET (32)
+#define DEC_OFFSET (80)
+#define ACC100_EXT_MEM
+#define ACC100_HARQ_OFFSET_THRESHOLD 1024
+
+/* Constants from K0 computation from 3GPP 38.212 Table 5.4.2.1-2 */
+#define N_ZC_1 66 /* N = 66 Zc for BG 1 */
+#define N_ZC_2 50 /* N = 50 Zc for BG 2 */
+#define K0_1_1 17 /* K0 fraction numerator for rv 1 and BG 1 */
+#define K0_1_2 13 /* K0 fraction numerator for rv 1 and BG 2 */
+#define K0_2_1 33 /* K0 fraction numerator for rv 2 and BG 1 */
+#define K0_2_2 25 /* K0 fraction numerator for rv 2 and BG 2 */
+#define K0_3_1 56 /* K0 fraction numerator for rv 3 and BG 1 */
+#define K0_3_2 43 /* K0 fraction numerator for rv 3 and BG 2 */
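
In the full-size circular buffer case (Ncb equal to N = N_zc * Zc), the floor in the k0 expression of 3GPP 38.212 Table 5.4.2.1-2 cancels and k0 reduces to the numerator constant times Zc. A sketch of that reduction under this assumption; `get_k0_full()` is a hypothetical helper, not the driver's k0 routine, and the constants are copied from the defines above:

```c
#include <assert.h>
#include <stdint.h>

/* K0 fraction numerators from 3GPP 38.212 Table 5.4.2.1-2, copied from
 * rte_acc100_pmd.h (K0_<rv>_<bg>). */
#define K0_1_1 17
#define K0_1_2 13
#define K0_2_1 33
#define K0_2_2 25
#define K0_3_1 56
#define K0_3_2 43

/* Hypothetical helper for the Ncb == N case only:
 * k0 = floor(num * Ncb / (N_zc * Zc)) * Zc, which collapses to num * Zc
 * when Ncb == N_zc * Zc. Limited-buffer rate matching is not covered. */
static inline uint32_t
get_k0_full(uint16_t z_c, uint8_t bg, uint8_t rv_index)
{
	static const uint8_t num_bg1[4] = {0, K0_1_1, K0_2_1, K0_3_1};
	static const uint8_t num_bg2[4] = {0, K0_1_2, K0_2_2, K0_3_2};

	return (uint32_t)(bg == 1 ? num_bg1[rv_index]
				  : num_bg2[rv_index]) * z_c;
}
```

For rv_index 0 the starting position is always 0, which is why no numerator constant is defined for it.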
+
+/* ACC100 Configuration */
+#define ACC100_DDR_ECC_ENABLE
+#define ACC100_CFG_DMA_ERROR 0x3D7
+#define ACC100_CFG_AXI_CACHE 0x11
+#define ACC100_CFG_QMGR_HI_P 0x0F0F
+#define ACC100_CFG_PCI_AXI 0xC003
+#define ACC100_CFG_PCI_BRIDGE 0x40006033
+#define ACC100_ENGINE_OFFSET 0x1000
+#define ACC100_RESET_HI 0x20100
+#define ACC100_RESET_LO 0x20000
+#define ACC100_RESET_HARD 0x1FF
+#define ACC100_ENGINES_MAX 9
+#define LONG_WAIT 1000
+
+/* ACC100 DMA Descriptor triplet */
+struct acc100_dma_triplet {
+	uint64_t address;
+	uint32_t blen:20,
+		res0:4,
+		last:1,
+		dma_ext:1,
+		res1:2,
+		blkid:4;
+} __rte_packed;
+
+
+
+/* ACC100 DMA Response Descriptor */
+union acc100_dma_rsp_desc {
+	uint32_t val;
+	struct {
+		uint32_t crc_status:1,
+			synd_ok:1,
+			dma_err:1,
+			neg_stop:1,
+			fcw_err:1,
+			output_err:1,
+			input_err:1,
+			timestampEn:1,
+			iterCountFrac:8,
+			iter_cnt:8,
+			rsrvd3:6,
+			sdone:1,
+			fdone:1;
+		uint32_t add_info_0;
+		uint32_t add_info_1;
+	};
+};
+
+
+/* ACC100 Queue Manager Enqueue PCI Register */
+union acc100_enqueue_reg_fmt {
+	uint32_t val;
+	struct {
+		uint32_t num_elem:8,
+			addr_offset:3,
+			rsrvd:1,
+			req_elem_addr:20;
+	};
+};
+
+/* FEC 4G Uplink Frame Control Word */
+struct __rte_packed acc100_fcw_td {
+	uint8_t fcw_ver:4,
+		num_maps:4; /* Unused */
+	uint8_t filler:6, /* Unused */
+		rsrvd0:1,
+		bypass_sb_deint:1;
+	uint16_t k_pos;
+	uint16_t k_neg; /* Unused */
+	uint8_t c_neg; /* Unused */
+	uint8_t c; /* Unused */
+	uint32_t ea; /* Unused */
+	uint32_t eb; /* Unused */
+	uint8_t cab; /* Unused */
+	uint8_t k0_start_col; /* Unused */
+	uint8_t rsrvd1;
+	uint8_t code_block_mode:1, /* Unused */
+		turbo_crc_type:1,
+		rsrvd2:3,
+		bypass_teq:1, /* Unused */
+		soft_output_en:1, /* Unused */
+		ext_td_cold_reg_en:1;
+	union { /* External Cold register */
+		uint32_t ext_td_cold_reg;
+		struct {
+			uint32_t min_iter:4, /* Unused */
+				max_iter:4,
+				ext_scale:5, /* Unused */
+				rsrvd3:3,
+				early_stop_en:1, /* Unused */
+				sw_soft_out_dis:1, /* Unused */
+				sw_et_cont:1, /* Unused */
+				sw_soft_out_saturation:1, /* Unused */
+				half_iter_on:1, /* Unused */
+				raw_decoder_input_on:1, /* Unused */
+				rsrvd4:10;
+		};
+	};
+};
+
+/* FEC 5GNR Uplink Frame Control Word */
+struct __rte_packed acc100_fcw_ld {
+	uint32_t FCWversion:4,
+		qm:4,
+		nfiller:11,
+		BG:1,
+		Zc:9,
+		res0:1,
+		synd_precoder:1,
+		synd_post:1;
+	uint32_t ncb:16,
+		k0:16;
+	uint32_t rm_e:24,
+		hcin_en:1,
+		hcout_en:1,
+		crc_select:1,
+		bypass_dec:1,
+		bypass_intlv:1,
+		so_en:1,
+		so_bypass_rm:1,
+		so_bypass_intlv:1;
+	uint32_t hcin_offset:16,
+		hcin_size0:16;
+	uint32_t hcin_size1:16,
+		hcin_decomp_mode:3,
+		llr_pack_mode:1,
+		hcout_comp_mode:3,
+		res2:1,
+		dec_convllr:4,
+		hcout_convllr:4;
+	uint32_t itmax:7,
+		itstop:1,
+		so_it:7,
+		res3:1,
+		hcout_offset:16;
+	uint32_t hcout_size0:16,
+		hcout_size1:16;
+	uint32_t gain_i:8,
+		gain_h:8,
+		negstop_th:16;
+	uint32_t negstop_it:7,
+		negstop_en:1,
+		res4:24;
+};
+
+/* FEC 4G Downlink Frame Control Word */
+struct __rte_packed acc100_fcw_te {
+	uint16_t k_neg;
+	uint16_t k_pos;
+	uint8_t c_neg;
+	uint8_t c;
+	uint8_t filler;
+	uint8_t cab;
+	uint32_t ea:17,
+		rsrvd0:15;
+	uint32_t eb:17,
+		rsrvd1:15;
+	uint16_t ncb_neg;
+	uint16_t ncb_pos;
+	uint8_t rv_idx0:2,
+		rsrvd2:2,
+		rv_idx1:2,
+		rsrvd3:2;
+	uint8_t bypass_rv_idx0:1,
+		bypass_rv_idx1:1,
+		bypass_rm:1,
+		rsrvd4:5;
+	uint8_t rsrvd5:1,
+		rsrvd6:3,
+		code_block_crc:1,
+		rsrvd7:3;
+	uint8_t code_block_mode:1,
+		rsrvd8:7;
+	uint64_t rsrvd9;
+};
+
+/* FEC 5GNR Downlink Frame Control Word */
+struct __rte_packed acc100_fcw_le {
+	uint32_t FCWversion:4,
+		qm:4,
+		nfiller:11,
+		BG:1,
+		Zc:9,
+		res0:3;
+	uint32_t ncb:16,
+		k0:16;
+	uint32_t rm_e:24,
+		res1:2,
+		crc_select:1,
+		res2:1,
+		bypass_intlv:1,
+		res3:3;
+	uint32_t res4_a:12,
+		mcb_count:3,
+		res4_b:17;
+	uint32_t res5;
+	uint32_t res6;
+	uint32_t res7;
+	uint32_t res8;
+};
+
+/* ACC100 DMA Request Descriptor */
+struct __rte_packed acc100_dma_req_desc {
+	union {
+		struct{
+			uint32_t type:4,
+				rsrvd0:26,
+				sdone:1,
+				fdone:1;
+			uint32_t rsrvd1;
+			uint32_t rsrvd2;
+			uint32_t pass_param:8,
+				sdone_enable:1,
+				irq_enable:1,
+				timeStampEn:1,
+				res0:5,
+				numCBs:4,
+				res1:4,
+				m2dlen:4,
+				d2mlen:4;
+		};
+		struct{
+			uint32_t word0;
+			uint32_t word1;
+			uint32_t word2;
+			uint32_t word3;
+		};
+	};
+	struct acc100_dma_triplet data_ptrs[ACC100_DMA_MAX_NUM_POINTERS];
+
+	/* Virtual addresses used to retrieve SW context info */
+	union {
+		void *op_addr;
+		uint64_t pad1;  /* pad to 64 bits */
+	};
+	/*
+	 * Stores additional information needed for driver processing:
+	 * - last_desc_in_batch - flag used to mark last descriptor (CB)
+	 *                        in batch
+	 * - cbs_in_tb - stores information about total number of Code Blocks
+	 *               in currently processed Transport Block
+	 */
+	union {
+		struct {
+			union {
+				struct acc100_fcw_ld fcw_ld;
+				struct acc100_fcw_td fcw_td;
+				struct acc100_fcw_le fcw_le;
+				struct acc100_fcw_te fcw_te;
+				uint32_t pad2[ACC100_FCW_PADDING];
+			};
+			uint32_t last_desc_in_batch :8,
+				cbs_in_tb:8,
+				pad4 : 16;
+		};
+		uint64_t pad3[ACC100_DMA_DESC_PADDING]; /* pad to 64 bits */
+	};
+};
+
+/* ACC100 DMA Descriptor */
+union acc100_dma_desc {
+	struct acc100_dma_req_desc req;
+	union acc100_dma_rsp_desc rsp;
+};
+
+
+/* Union describing HARQ layout entry */
+union acc100_harq_layout_data {
+	uint32_t val;
+	struct {
+		uint16_t offset;
+		uint16_t size0;
+	};
+} __rte_packed;
+
+
+/* Union describing Info Ring entry */
+union acc100_info_ring_data {
+	uint32_t val;
+	struct {
+		union {
+			uint16_t detailed_info;
+			struct {
+				uint16_t aq_id: 4;
+				uint16_t qg_id: 4;
+				uint16_t vf_id: 6;
+				uint16_t reserved: 2;
+			};
+		};
+		uint16_t int_nb: 7;
+		uint16_t msi_0: 1;
+		uint16_t vf2pf: 6;
+		uint16_t loop: 1;
+		uint16_t valid: 1;
+	};
+} __rte_packed;
+
+struct acc100_registry_addr {
+	unsigned int dma_ring_dl5g_hi;
+	unsigned int dma_ring_dl5g_lo;
+	unsigned int dma_ring_ul5g_hi;
+	unsigned int dma_ring_ul5g_lo;
+	unsigned int dma_ring_dl4g_hi;
+	unsigned int dma_ring_dl4g_lo;
+	unsigned int dma_ring_ul4g_hi;
+	unsigned int dma_ring_ul4g_lo;
+	unsigned int ring_size;
+	unsigned int info_ring_hi;
+	unsigned int info_ring_lo;
+	unsigned int info_ring_en;
+	unsigned int info_ring_ptr;
+	unsigned int tail_ptrs_dl5g_hi;
+	unsigned int tail_ptrs_dl5g_lo;
+	unsigned int tail_ptrs_ul5g_hi;
+	unsigned int tail_ptrs_ul5g_lo;
+	unsigned int tail_ptrs_dl4g_hi;
+	unsigned int tail_ptrs_dl4g_lo;
+	unsigned int tail_ptrs_ul4g_hi;
+	unsigned int tail_ptrs_ul4g_lo;
+	unsigned int depth_log0_offset;
+	unsigned int depth_log1_offset;
+	unsigned int qman_group_func;
+	unsigned int ddr_range;
+};
+
+/* Structure holding registry addresses for PF */
+static const struct acc100_registry_addr pf_reg_addr = {
+	.dma_ring_dl5g_hi = HWPfDmaFec5GdlDescBaseHiRegVf,
+	.dma_ring_dl5g_lo = HWPfDmaFec5GdlDescBaseLoRegVf,
+	.dma_ring_ul5g_hi = HWPfDmaFec5GulDescBaseHiRegVf,
+	.dma_ring_ul5g_lo = HWPfDmaFec5GulDescBaseLoRegVf,
+	.dma_ring_dl4g_hi = HWPfDmaFec4GdlDescBaseHiRegVf,
+	.dma_ring_dl4g_lo = HWPfDmaFec4GdlDescBaseLoRegVf,
+	.dma_ring_ul4g_hi = HWPfDmaFec4GulDescBaseHiRegVf,
+	.dma_ring_ul4g_lo = HWPfDmaFec4GulDescBaseLoRegVf,
+	.ring_size = HWPfQmgrRingSizeVf,
+	.info_ring_hi = HWPfHiInfoRingBaseHiRegPf,
+	.info_ring_lo = HWPfHiInfoRingBaseLoRegPf,
+	.info_ring_en = HWPfHiInfoRingIntWrEnRegPf,
+	.info_ring_ptr = HWPfHiInfoRingPointerRegPf,
+	.tail_ptrs_dl5g_hi = HWPfDmaFec5GdlRespPtrHiRegVf,
+	.tail_ptrs_dl5g_lo = HWPfDmaFec5GdlRespPtrLoRegVf,
+	.tail_ptrs_ul5g_hi = HWPfDmaFec5GulRespPtrHiRegVf,
+	.tail_ptrs_ul5g_lo = HWPfDmaFec5GulRespPtrLoRegVf,
+	.tail_ptrs_dl4g_hi = HWPfDmaFec4GdlRespPtrHiRegVf,
+	.tail_ptrs_dl4g_lo = HWPfDmaFec4GdlRespPtrLoRegVf,
+	.tail_ptrs_ul4g_hi = HWPfDmaFec4GulRespPtrHiRegVf,
+	.tail_ptrs_ul4g_lo = HWPfDmaFec4GulRespPtrLoRegVf,
+	.depth_log0_offset = HWPfQmgrGrpDepthLog20Vf,
+	.depth_log1_offset = HWPfQmgrGrpDepthLog21Vf,
+	.qman_group_func = HWPfQmgrGrpFunction0,
+	.ddr_range = HWPfDmaVfDdrBaseRw,
+};
+
+/* Structure holding registry addresses for VF */
+static const struct acc100_registry_addr vf_reg_addr = {
+	.dma_ring_dl5g_hi = HWVfDmaFec5GdlDescBaseHiRegVf,
+	.dma_ring_dl5g_lo = HWVfDmaFec5GdlDescBaseLoRegVf,
+	.dma_ring_ul5g_hi = HWVfDmaFec5GulDescBaseHiRegVf,
+	.dma_ring_ul5g_lo = HWVfDmaFec5GulDescBaseLoRegVf,
+	.dma_ring_dl4g_hi = HWVfDmaFec4GdlDescBaseHiRegVf,
+	.dma_ring_dl4g_lo = HWVfDmaFec4GdlDescBaseLoRegVf,
+	.dma_ring_ul4g_hi = HWVfDmaFec4GulDescBaseHiRegVf,
+	.dma_ring_ul4g_lo = HWVfDmaFec4GulDescBaseLoRegVf,
+	.ring_size = HWVfQmgrRingSizeVf,
+	.info_ring_hi = HWVfHiInfoRingBaseHiVf,
+	.info_ring_lo = HWVfHiInfoRingBaseLoVf,
+	.info_ring_en = HWVfHiInfoRingIntWrEnVf,
+	.info_ring_ptr = HWVfHiInfoRingPointerVf,
+	.tail_ptrs_dl5g_hi = HWVfDmaFec5GdlRespPtrHiRegVf,
+	.tail_ptrs_dl5g_lo = HWVfDmaFec5GdlRespPtrLoRegVf,
+	.tail_ptrs_ul5g_hi = HWVfDmaFec5GulRespPtrHiRegVf,
+	.tail_ptrs_ul5g_lo = HWVfDmaFec5GulRespPtrLoRegVf,
+	.tail_ptrs_dl4g_hi = HWVfDmaFec4GdlRespPtrHiRegVf,
+	.tail_ptrs_dl4g_lo = HWVfDmaFec4GdlRespPtrLoRegVf,
+	.tail_ptrs_ul4g_hi = HWVfDmaFec4GulRespPtrHiRegVf,
+	.tail_ptrs_ul4g_lo = HWVfDmaFec4GulRespPtrLoRegVf,
+	.depth_log0_offset = HWVfQmgrGrpDepthLog20Vf,
+	.depth_log1_offset = HWVfQmgrGrpDepthLog21Vf,
+	.qman_group_func = HWVfQmgrGrpFunction0Vf,
+	.ddr_range = HWVfDmaDdrBaseRangeRoVf,
+};
+
 /* Private data structure for each ACC100 device */
 struct acc100_device {
 	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v3 03/11] baseband/acc100: add info get function
  2020-08-19  0:25 [dpdk-dev] [PATCH v3 00/11] bbdev PMD ACC100 Nicolas Chautru
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 01/11] drivers/baseband: add PMD for ACC100 Nicolas Chautru
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 02/11] baseband/acc100: add register definition file Nicolas Chautru
@ 2020-08-19  0:25 ` Nicolas Chautru
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 04/11] baseband/acc100: add queue configuration Nicolas Chautru
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-08-19  0:25 UTC (permalink / raw)
  To: dev, akhil.goyal; +Cc: bruce.richardson, Nicolas Chautru

Add the "info_get" function to the driver to allow the device to be
queried.
No processing capabilities are exposed yet.
bbdev-test is linked against the PMD so it can run with a null
capability list.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 app/test-bbdev/Makefile                  |   3 +
 app/test-bbdev/meson.build               |   3 +
 drivers/baseband/acc100/rte_acc100_cfg.h |  96 +++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.c | 225 +++++++++++++++++++++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.h |   3 +
 5 files changed, 330 insertions(+)
 create mode 100644 drivers/baseband/acc100/rte_acc100_cfg.h

diff --git a/app/test-bbdev/Makefile b/app/test-bbdev/Makefile
index dc29557..dbc3437 100644
--- a/app/test-bbdev/Makefile
+++ b/app/test-bbdev/Makefile
@@ -26,5 +26,8 @@ endif
 ifeq ($(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC),y)
 LDLIBS += -lrte_pmd_bbdev_fpga_5gnr_fec
 endif
+ifeq ($(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100),y)
+LDLIBS += -lrte_pmd_bbdev_acc100
+endif
 
 include $(RTE_SDK)/mk/rte.app.mk
diff --git a/app/test-bbdev/meson.build b/app/test-bbdev/meson.build
index 18ab6a8..fbd8ae3 100644
--- a/app/test-bbdev/meson.build
+++ b/app/test-bbdev/meson.build
@@ -12,3 +12,6 @@ endif
 if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC')
 	deps += ['pmd_bbdev_fpga_5gnr_fec']
 endif
+if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_ACC100')
+	deps += ['pmd_bbdev_acc100']
+endif
\ No newline at end of file
diff --git a/drivers/baseband/acc100/rte_acc100_cfg.h b/drivers/baseband/acc100/rte_acc100_cfg.h
new file mode 100644
index 0000000..73bbe36
--- /dev/null
+++ b/drivers/baseband/acc100/rte_acc100_cfg.h
@@ -0,0 +1,96 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_ACC100_CFG_H_
+#define _RTE_ACC100_CFG_H_
+
+/**
+ * @file rte_acc100_cfg.h
+ *
+ * Functions for configuring ACC100 HW, exposed directly to applications.
+ * Configuration related to encoding/decoding is done through the
+ * librte_bbdev library.
+ *
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ */
+
+#include <stdint.h>
+#include <stdbool.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+/** Number of Virtual Functions ACC100 supports */
+#define RTE_ACC100_NUM_VFS 16
+
+/**
+ * Definition of Queue Topology for ACC100 Configuration
+ * Some level of details is abstracted out to expose a clean interface
+ * given that comprehensive flexibility is not required
+ */
+struct rte_q_topology_t {
+	/** Number of QGroups in incremental order of priority */
+	uint16_t num_qgroups;
+	/**
+	 * All QGroups have the same number of AQs here.
+	 * Note : Could be made a 16-array if more flexibility is really
+	 * required
+	 */
+	uint16_t num_aqs_per_groups;
+	/**
+	 * Depth of the AQs is the same for all QGroups here. Log2 Enum : 2^N
+	 * Note : Could be made a 16-array if more flexibility is really
+	 * required
+	 */
+	uint16_t aq_depth_log2;
+	/**
+	 * Index of the first Queue Group - assuming contiguity
+	 * Initialized as -1
+	 */
+	int8_t first_qgroup_index;
+};
+
+/**
+ * Definition of Arbitration related parameters for ACC100 Configuration
+ */
+struct rte_arbitration_t {
+	/** Default Weight for VF Fairness Arbitration */
+	uint16_t round_robin_weight;
+	uint32_t gbr_threshold1; /**< Guaranteed Bitrate Threshold 1 */
+	uint32_t gbr_threshold2; /**< Guaranteed Bitrate Threshold 2 */
+};
+
+/**
+ * Structure to pass ACC100 configuration.
+ * Note: all VF Bundles will have the same configuration.
+ */
+struct acc100_conf {
+	bool pf_mode_en; /**< 1 if PF is used for dataplane, 0 for VFs */
+	/** 1 if input '1' bit is represented by a positive LLR value, 0 if '1'
+	 * bit is represented by a negative value.
+	 */
+	bool input_pos_llr_1_bit;
+	/** 1 if output '1' bit is represented by a positive value, 0 if '1'
+	 * bit is represented by a negative value.
+	 */
+	bool output_pos_llr_1_bit;
+	uint16_t num_vf_bundles; /**< Number of VF bundles to setup */
+	/** Queue topology for each operation type */
+	struct rte_q_topology_t q_ul_4g;
+	struct rte_q_topology_t q_dl_4g;
+	struct rte_q_topology_t q_ul_5g;
+	struct rte_q_topology_t q_dl_5g;
+	/** Arbitration configuration for each operation type */
+	struct rte_arbitration_t arb_ul_4g[RTE_ACC100_NUM_VFS];
+	struct rte_arbitration_t arb_dl_4g[RTE_ACC100_NUM_VFS];
+	struct rte_arbitration_t arb_ul_5g[RTE_ACC100_NUM_VFS];
+	struct rte_arbitration_t arb_dl_5g[RTE_ACC100_NUM_VFS];
+};
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_ACC100_CFG_H_ */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 1b4cd13..7807a30 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -26,6 +26,184 @@
 RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
 #endif
 
+/* Read a register of an ACC100 device */
+static inline uint32_t
+acc100_reg_read(struct acc100_device *d, uint32_t offset)
+{
+
+	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
+	uint32_t ret = *((volatile uint32_t *)(reg_addr));
+	return rte_le_to_cpu_32(ret);
+}
+
+/* Calculate the offset of the enqueue register */
+static inline uint32_t
+queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
+{
+	if (pf_device)
+		return ((vf_id << 12) + (qgrp_id << 7) + (aq_id << 3) +
+				HWPfQmgrIngressAq);
+	else
+		return ((qgrp_id << 7) + (aq_id << 3) +
+				HWVfQmgrIngressAq);
+}
+
+enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC};
+
+/* Return the queue topology for a Queue Group Index */
+static inline void
+qtopFromAcc(struct rte_q_topology_t **qtop, int acc_enum,
+		struct acc100_conf *acc100_conf)
+{
+	struct rte_q_topology_t *p_qtop;
+	p_qtop = NULL;
+	switch (acc_enum) {
+	case UL_4G:
+		p_qtop = &(acc100_conf->q_ul_4g);
+		break;
+	case UL_5G:
+		p_qtop = &(acc100_conf->q_ul_5g);
+		break;
+	case DL_4G:
+		p_qtop = &(acc100_conf->q_dl_4g);
+		break;
+	case DL_5G:
+		p_qtop = &(acc100_conf->q_dl_5g);
+		break;
+	default:
+		/* NOTREACHED */
+		rte_bbdev_log(ERR, "Unexpected error evaluating qtopFromAcc");
+		break;
+	}
+	*qtop = p_qtop;
+}
+
+static void
+initQTop(struct acc100_conf *acc100_conf)
+{
+	acc100_conf->q_ul_4g.num_aqs_per_groups = 0;
+	acc100_conf->q_ul_4g.num_qgroups = 0;
+	acc100_conf->q_ul_4g.first_qgroup_index = -1;
+	acc100_conf->q_ul_5g.num_aqs_per_groups = 0;
+	acc100_conf->q_ul_5g.num_qgroups = 0;
+	acc100_conf->q_ul_5g.first_qgroup_index = -1;
+	acc100_conf->q_dl_4g.num_aqs_per_groups = 0;
+	acc100_conf->q_dl_4g.num_qgroups = 0;
+	acc100_conf->q_dl_4g.first_qgroup_index = -1;
+	acc100_conf->q_dl_5g.num_aqs_per_groups = 0;
+	acc100_conf->q_dl_5g.num_qgroups = 0;
+	acc100_conf->q_dl_5g.first_qgroup_index = -1;
+}
+
+static inline void
+updateQtop(uint8_t acc, uint8_t qg, struct acc100_conf *acc100_conf,
+		struct acc100_device *d) {
+	uint32_t reg;
+	struct rte_q_topology_t *q_top = NULL;
+	qtopFromAcc(&q_top, acc, acc100_conf);
+	if (unlikely(q_top == NULL))
+		return;
+	uint16_t aq;
+	q_top->num_qgroups++;
+	if (q_top->first_qgroup_index == -1) {
+		q_top->first_qgroup_index = qg;
+		/* Can be optimized to assume all are enabled by default */
+		reg = acc100_reg_read(d, queue_offset(d->pf_device,
+				0, qg, ACC100_NUM_AQS - 1));
+		if (reg & QUEUE_ENABLE) {
+			q_top->num_aqs_per_groups = ACC100_NUM_AQS;
+			return;
+		}
+		q_top->num_aqs_per_groups = 0;
+		for (aq = 0; aq < ACC100_NUM_AQS; aq++) {
+			reg = acc100_reg_read(d, queue_offset(d->pf_device,
+					0, qg, aq));
+			if (reg & QUEUE_ENABLE)
+				q_top->num_aqs_per_groups++;
+		}
+	}
+}
+
+/* Fetch configuration enabled for the PF/VF using MMIO Read (slow) */
+static inline void
+fetch_acc100_config(struct rte_bbdev *dev)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	struct acc100_conf *acc100_conf = &d->acc100_conf;
+	const struct acc100_registry_addr *reg_addr;
+	uint8_t acc, qg;
+	uint32_t reg, reg_aq, reg_len0, reg_len1;
+	uint32_t reg_mode;
+
+	/* No need to retrieve the configuration if it is already done */
+	if (d->configured)
+		return;
+
+	/* Choose correct registry addresses for the device type */
+	if (d->pf_device)
+		reg_addr = &pf_reg_addr;
+	else
+		reg_addr = &vf_reg_addr;
+
+	d->ddr_size = (1 + acc100_reg_read(d, reg_addr->ddr_range)) << 10;
+
+	/* Single VF Bundle by VF */
+	acc100_conf->num_vf_bundles = 1;
+	initQTop(acc100_conf);
+
+	struct rte_q_topology_t *q_top = NULL;
+	int qman_func_id[5] = {0, 2, 1, 3, 4};
+	reg = acc100_reg_read(d, reg_addr->qman_group_func);
+	for (qg = 0; qg < ACC100_NUM_QGRPS_PER_WORD; qg++) {
+		reg_aq = acc100_reg_read(d,
+				queue_offset(d->pf_device, 0, qg, 0));
+		if (reg_aq & QUEUE_ENABLE) {
+			acc = qman_func_id[(reg >> (qg * 4)) & 0x7];
+			updateQtop(acc, qg, acc100_conf, d);
+		}
+	}
+
+	/* Check the depth of the AQs */
+	reg_len0 = acc100_reg_read(d, reg_addr->depth_log0_offset);
+	reg_len1 = acc100_reg_read(d, reg_addr->depth_log1_offset);
+	for (acc = 0; acc < NUM_ACC; acc++) {
+		qtopFromAcc(&q_top, acc, acc100_conf);
+		if (q_top->first_qgroup_index < ACC100_NUM_QGRPS_PER_WORD)
+			q_top->aq_depth_log2 = (reg_len0 >>
+					(q_top->first_qgroup_index * 4))
+					& 0xF;
+		else
+			q_top->aq_depth_log2 = (reg_len1 >>
+					((q_top->first_qgroup_index -
+					ACC100_NUM_QGRPS_PER_WORD) * 4))
+					& 0xF;
+	}
+
+	/* Read PF mode */
+	if (d->pf_device) {
+		reg_mode = acc100_reg_read(d, HWPfHiPfMode);
+		acc100_conf->pf_mode_en = (reg_mode == 2) ? 1 : 0;
+	}
+
+	rte_bbdev_log_debug(
+			"%s Config LLR SIGN IN/OUT %s %s QG %u %u %u %u AQ %u %u %u %u Len %u %u %u %u\n",
+			(d->pf_device) ? "PF" : "VF",
+			(acc100_conf->input_pos_llr_1_bit) ? "POS" : "NEG",
+			(acc100_conf->output_pos_llr_1_bit) ? "POS" : "NEG",
+			acc100_conf->q_ul_4g.num_qgroups,
+			acc100_conf->q_dl_4g.num_qgroups,
+			acc100_conf->q_ul_5g.num_qgroups,
+			acc100_conf->q_dl_5g.num_qgroups,
+			acc100_conf->q_ul_4g.num_aqs_per_groups,
+			acc100_conf->q_dl_4g.num_aqs_per_groups,
+			acc100_conf->q_ul_5g.num_aqs_per_groups,
+			acc100_conf->q_dl_5g.num_aqs_per_groups,
+			acc100_conf->q_ul_4g.aq_depth_log2,
+			acc100_conf->q_dl_4g.aq_depth_log2,
+			acc100_conf->q_ul_5g.aq_depth_log2,
+			acc100_conf->q_dl_5g.aq_depth_log2);
+}
+
 /* Free 64MB memory used for software rings */
 static int
 acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
@@ -33,8 +211,55 @@
 	return 0;
 }
 
+/* Get ACC100 device info */
+static void
+acc100_dev_info_get(struct rte_bbdev *dev,
+		struct rte_bbdev_driver_info *dev_info)
+{
+	struct acc100_device *d = dev->data->dev_private;
+
+	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
+		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
+	};
+
+	static struct rte_bbdev_queue_conf default_queue_conf;
+	default_queue_conf.socket = dev->data->socket_id;
+	default_queue_conf.queue_size = MAX_QUEUE_DEPTH;
+
+	dev_info->driver_name = dev->device->driver->name;
+
+	/* Read and save the populated config from ACC100 registers */
+	fetch_acc100_config(dev);
+
+	/* This isn't ideal because it reports the maximum number of queues but
+	 * does not provide info on how many can be uplink/downlink or of
+	 * different priorities
+	 */
+	dev_info->max_num_queues =
+			d->acc100_conf.q_dl_5g.num_aqs_per_groups *
+			d->acc100_conf.q_dl_5g.num_qgroups +
+			d->acc100_conf.q_ul_5g.num_aqs_per_groups *
+			d->acc100_conf.q_ul_5g.num_qgroups +
+			d->acc100_conf.q_dl_4g.num_aqs_per_groups *
+			d->acc100_conf.q_dl_4g.num_qgroups +
+			d->acc100_conf.q_ul_4g.num_aqs_per_groups *
+			d->acc100_conf.q_ul_4g.num_qgroups;
+	dev_info->queue_size_lim = MAX_QUEUE_DEPTH;
+	dev_info->hardware_accelerated = true;
+	dev_info->max_dl_queue_priority =
+			d->acc100_conf.q_dl_4g.num_qgroups - 1;
+	dev_info->max_ul_queue_priority =
+			d->acc100_conf.q_ul_4g.num_qgroups - 1;
+	dev_info->default_queue_conf = default_queue_conf;
+	dev_info->cpu_flag_reqs = NULL;
+	dev_info->min_alignment = 64;
+	dev_info->capabilities = bbdev_capabilities;
+	dev_info->harq_buffer_size = d->ddr_size;
+}
+
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.close = acc100_dev_close,
+	.info_get = acc100_dev_info_get,
 };
 
 /* ACC100 PCI PF address map */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index cd77570..662e2c8 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -7,6 +7,7 @@
 
 #include "acc100_pf_enum.h"
 #include "acc100_vf_enum.h"
+#include "rte_acc100_cfg.h"
 
 /* Helper macro for logging */
 #define rte_bbdev_log(level, fmt, ...) \
@@ -520,6 +521,8 @@ struct acc100_registry_addr {
 /* Private data structure for each ACC100 device */
 struct acc100_device {
 	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
+	uint32_t ddr_size; /* Size in kB */
+	struct acc100_conf acc100_conf; /* ACC100 Initial configuration */
 	bool pf_device; /**< True if this is a PF ACC100 device */
 	bool configured; /**< True if this ACC100 device is configured */
 };
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v3 04/11] baseband/acc100: add queue configuration
  2020-08-19  0:25 [dpdk-dev] [PATCH v3 00/11] bbdev PMD ACC100 Nicolas Chautru
                   ` (2 preceding siblings ...)
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 03/11] baseband/acc100: add info get function Nicolas Chautru
@ 2020-08-19  0:25 ` Nicolas Chautru
  2020-08-29 10:39   ` Xu, Rosen
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions Nicolas Chautru
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 213+ messages in thread
From: Nicolas Chautru @ 2020-08-19  0:25 UTC (permalink / raw)
  To: dev, akhil.goyal; +Cc: bruce.richardson, Nicolas Chautru

Add functions to create and configure queues for the device.
No capabilities are exposed yet.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 420 ++++++++++++++++++++++++++++++-
 drivers/baseband/acc100/rte_acc100_pmd.h |  45 ++++
 2 files changed, 464 insertions(+), 1 deletion(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 7807a30..7a21c57 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -26,6 +26,22 @@
 RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
 #endif
 
+/* Write to MMIO register address */
+static inline void
+mmio_write(void *addr, uint32_t value)
+{
+	*((volatile uint32_t *)(addr)) = rte_cpu_to_le_32(value);
+}
+
+/* Write a register of an ACC100 device */
+static inline void
+acc100_reg_write(struct acc100_device *d, uint32_t offset, uint32_t payload)
+{
+	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
+	mmio_write(reg_addr, payload);
+	usleep(1000);
+}
+
 /* Read a register of an ACC100 device */
 static inline uint32_t
 acc100_reg_read(struct acc100_device *d, uint32_t offset)
@@ -36,6 +52,22 @@
 	return rte_le_to_cpu_32(ret);
 }
 
+/* Basic Implementation of Log2 for exact 2^N */
+static inline uint32_t
+log2_basic(uint32_t value)
+{
+	return (value == 0) ? 0 : __builtin_ctz(value);
+}
+
+/* Calculate memory alignment offset assuming alignment is 2^N */
+static inline uint32_t
+calc_mem_alignment_offset(void *unaligned_virt_mem, uint32_t alignment)
+{
+	rte_iova_t unaligned_phy_mem = rte_malloc_virt2iova(unaligned_virt_mem);
+	return (uint32_t)(alignment -
+			(unaligned_phy_mem & (alignment-1)));
+}
+
 /* Calculate the offset of the enqueue register */
 static inline uint32_t
 queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
@@ -204,10 +236,393 @@
 			acc100_conf->q_dl_5g.aq_depth_log2);
 }
 
+static void
+free_base_addresses(void **base_addrs, int size)
+{
+	int i;
+	for (i = 0; i < size; i++)
+		rte_free(base_addrs[i]);
+}
+
+static inline uint32_t
+get_desc_len(void)
+{
+	return sizeof(union acc100_dma_desc);
+}
+
+/* Allocate the 2 * 64MB block for the sw rings */
+static int
+alloc_2x64mb_sw_rings_mem(struct rte_bbdev *dev, struct acc100_device *d,
+		int socket)
+{
+	uint32_t sw_ring_size = ACC100_SIZE_64MBYTE;
+	d->sw_rings_base = rte_zmalloc_socket(dev->device->driver->name,
+			2 * sw_ring_size, RTE_CACHE_LINE_SIZE, socket);
+	if (d->sw_rings_base == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate memory for %s:%u",
+				dev->device->driver->name,
+				dev->data->dev_id);
+		return -ENOMEM;
+	}
+	memset(d->sw_rings_base, 0, ACC100_SIZE_64MBYTE);
+	uint32_t next_64mb_align_offset = calc_mem_alignment_offset(
+			d->sw_rings_base, ACC100_SIZE_64MBYTE);
+	d->sw_rings = RTE_PTR_ADD(d->sw_rings_base, next_64mb_align_offset);
+	d->sw_rings_phys = rte_malloc_virt2iova(d->sw_rings_base) +
+			next_64mb_align_offset;
+	d->sw_ring_size = MAX_QUEUE_DEPTH * get_desc_len();
+	d->sw_ring_max_depth = d->sw_ring_size / get_desc_len();
+
+	return 0;
+}
+
+/* Attempt to allocate minimised memory space for sw rings */
+static void
+alloc_sw_rings_min_mem(struct rte_bbdev *dev, struct acc100_device *d,
+		uint16_t num_queues, int socket)
+{
+	rte_iova_t sw_rings_base_phy, next_64mb_align_addr_phy;
+	uint32_t next_64mb_align_offset;
+	rte_iova_t sw_ring_phys_end_addr;
+	void *base_addrs[SW_RING_MEM_ALLOC_ATTEMPTS];
+	void *sw_rings_base;
+	int i = 0;
+	uint32_t q_sw_ring_size = MAX_QUEUE_DEPTH * get_desc_len();
+	uint32_t dev_sw_ring_size = q_sw_ring_size * num_queues;
+
+	/* Find an aligned block of memory to store sw rings */
+	while (i < SW_RING_MEM_ALLOC_ATTEMPTS) {
+		/*
+		 * sw_ring allocated memory is guaranteed to be aligned to
+		 * q_sw_ring_size at the condition that the requested size is
+		 * less than the page size
+		 */
+		sw_rings_base = rte_zmalloc_socket(
+				dev->device->driver->name,
+				dev_sw_ring_size, q_sw_ring_size, socket);
+
+		if (sw_rings_base == NULL) {
+			rte_bbdev_log(ERR,
+					"Failed to allocate memory for %s:%u",
+					dev->device->driver->name,
+					dev->data->dev_id);
+			break;
+		}
+
+		sw_rings_base_phy = rte_malloc_virt2iova(sw_rings_base);
+		next_64mb_align_offset = calc_mem_alignment_offset(
+				sw_rings_base, ACC100_SIZE_64MBYTE);
+		next_64mb_align_addr_phy = sw_rings_base_phy +
+				next_64mb_align_offset;
+		sw_ring_phys_end_addr = sw_rings_base_phy + dev_sw_ring_size;
+
+		/* Check if the end of the sw ring memory block is before the
+		 * start of next 64MB aligned mem address
+		 */
+		if (sw_ring_phys_end_addr < next_64mb_align_addr_phy) {
+			d->sw_rings_phys = sw_rings_base_phy;
+			d->sw_rings = sw_rings_base;
+			d->sw_rings_base = sw_rings_base;
+			d->sw_ring_size = q_sw_ring_size;
+			d->sw_ring_max_depth = MAX_QUEUE_DEPTH;
+			break;
+		}
+		/* Store the address of the unaligned mem block */
+		base_addrs[i] = sw_rings_base;
+		i++;
+	}
+
+	/* Free all unaligned blocks of mem allocated in the loop */
+	free_base_addresses(base_addrs, i);
+}
+
+
+/* Allocate 64MB memory used for all software rings */
+static int
+acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
+{
+	uint32_t phys_low, phys_high, payload;
+	struct acc100_device *d = dev->data->dev_private;
+	const struct acc100_registry_addr *reg_addr;
+
+	if (d->pf_device && !d->acc100_conf.pf_mode_en) {
+		rte_bbdev_log(NOTICE,
+				"%s has PF mode disabled. This PF can't be used.",
+				dev->data->name);
+		return -ENODEV;
+	}
+
+	alloc_sw_rings_min_mem(dev, d, num_queues, socket_id);
+
+	/* If the minimal memory space approach failed, then allocate
+	 * the 2 * 64MB block for the sw rings
+	 */
+	if (d->sw_rings == NULL)
+		alloc_2x64mb_sw_rings_mem(dev, d, socket_id);
+
+	/* Configure ACC100 with the base address for DMA descriptor rings
+	 * Same descriptor rings used for UL and DL DMA Engines
+	 * Note : Assuming only VF0 bundle is used for PF mode
+	 */
+	phys_high = (uint32_t)(d->sw_rings_phys >> 32);
+	phys_low  = (uint32_t)(d->sw_rings_phys & ~(ACC100_SIZE_64MBYTE-1));
+
+	/* Choose correct registry addresses for the device type */
+	if (d->pf_device)
+		reg_addr = &pf_reg_addr;
+	else
+		reg_addr = &vf_reg_addr;
+
+	/* Read the populated cfg from ACC100 registers */
+	fetch_acc100_config(dev);
+
+	/* Mark as configured properly */
+	d->configured = true;
+
+	/* Release AXI from PF */
+	if (d->pf_device)
+		acc100_reg_write(d, HWPfDmaAxiControl, 1);
+
+	acc100_reg_write(d, reg_addr->dma_ring_ul5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_ul5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->dma_ring_dl5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_dl5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->dma_ring_ul4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_ul4g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->dma_ring_dl4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_dl4g_lo, phys_low);
+
+	/*
+	 * Configure Ring Size to the max queue ring size
+	 * (used for wrapping purpose)
+	 */
+	payload = log2_basic(d->sw_ring_size / 64);
+	acc100_reg_write(d, reg_addr->ring_size, payload);
+
+	/* Configure tail pointer for use when SDONE enabled */
+	d->tail_ptrs = rte_zmalloc_socket(
+			dev->device->driver->name,
+			ACC100_NUM_QGRPS * ACC100_NUM_AQS * sizeof(uint32_t),
+			RTE_CACHE_LINE_SIZE, socket_id);
+	if (d->tail_ptrs == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate tail ptr for %s:%u",
+				dev->device->driver->name,
+				dev->data->dev_id);
+		rte_free(d->sw_rings);
+		return -ENOMEM;
+	}
+	d->tail_ptr_phys = rte_malloc_virt2iova(d->tail_ptrs);
+
+	phys_high = (uint32_t)(d->tail_ptr_phys >> 32);
+	phys_low  = (uint32_t)(d->tail_ptr_phys);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
+
+	d->harq_layout = rte_zmalloc_socket("HARQ Layout",
+			ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
+			RTE_CACHE_LINE_SIZE, dev->data->socket_id);
+
+	rte_bbdev_log_debug(
+			"ACC100 (%s) configured sw_rings = %p, sw_rings_phys = %#
+			PRIx64, dev->data->name, d->sw_rings, d->sw_rings_phys);
+
+	return 0;
+}
+
 /* Free 64MB memory used for software rings */
 static int
-acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
+acc100_dev_close(struct rte_bbdev *dev)
 {
+	struct acc100_device *d = dev->data->dev_private;
+	if (d->sw_rings_base != NULL) {
+		rte_free(d->tail_ptrs);
+		rte_free(d->sw_rings_base);
+		d->sw_rings_base = NULL;
+	}
+	usleep(1000);
+	return 0;
+}
+
+
+/**
+ * Report an ACC100 queue index which is free
+ * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
+ * Note : Only supporting VF0 Bundle for PF mode
+ */
+static int
+acc100_find_free_queue_idx(struct rte_bbdev *dev,
+		const struct rte_bbdev_queue_conf *conf)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	int op_2_acc[5] = {0, UL_4G, DL_4G, UL_5G, DL_5G};
+	int acc = op_2_acc[conf->op_type];
+	struct rte_q_topology_t *qtop = NULL;
+	qtopFromAcc(&qtop, acc, &(d->acc100_conf));
+	if (qtop == NULL)
+		return -1;
+	/* Identify matching QGroup Index which are sorted in priority order */
+	uint16_t group_idx = qtop->first_qgroup_index;
+	group_idx += conf->priority;
+	if (group_idx >= ACC100_NUM_QGRPS ||
+			conf->priority >= qtop->num_qgroups) {
+		rte_bbdev_log(INFO, "Invalid Priority on %s, priority %u",
+				dev->data->name, conf->priority);
+		return -1;
+	}
+	/* Find a free AQ_idx  */
+	uint16_t aq_idx;
+	for (aq_idx = 0; aq_idx < qtop->num_aqs_per_groups; aq_idx++) {
+		if (((d->q_assigned_bit_map[group_idx] >> aq_idx) & 0x1) == 0) {
+			/* Mark the Queue as assigned */
+			d->q_assigned_bit_map[group_idx] |= (1 << aq_idx);
+			/* Report the AQ Index */
+			return (group_idx << GRP_ID_SHIFT) + aq_idx;
+		}
+	}
+	rte_bbdev_log(INFO, "Failed to find free queue on %s, priority %u",
+			dev->data->name, conf->priority);
+	return -1;
+}
+
+/* Setup ACC100 queue */
+static int
+acc100_queue_setup(struct rte_bbdev *dev, uint16_t queue_id,
+		const struct rte_bbdev_queue_conf *conf)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	struct acc100_queue *q;
+	int16_t q_idx;
+
+	/* Allocate the queue data structure. */
+	q = rte_zmalloc_socket(dev->device->driver->name, sizeof(*q),
+			RTE_CACHE_LINE_SIZE, conf->socket);
+	if (q == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate queue memory");
+		return -ENOMEM;
+	}
+
+	q->d = d;
+	q->ring_addr = RTE_PTR_ADD(d->sw_rings, (d->sw_ring_size * queue_id));
+	q->ring_addr_phys = d->sw_rings_phys + (d->sw_ring_size * queue_id);
+
+	/* Prepare the Ring with default descriptor format */
+	union acc100_dma_desc *desc = NULL;
+	unsigned int desc_idx, b_idx;
+	int fcw_len = (conf->op_type == RTE_BBDEV_OP_LDPC_ENC ?
+		ACC100_FCW_LE_BLEN : (conf->op_type == RTE_BBDEV_OP_TURBO_DEC ?
+		ACC100_FCW_TD_BLEN : ACC100_FCW_LD_BLEN));
+
+	for (desc_idx = 0; desc_idx < d->sw_ring_max_depth; desc_idx++) {
+		desc = q->ring_addr + desc_idx;
+		desc->req.word0 = ACC100_DMA_DESC_TYPE;
+		desc->req.word1 = 0; /**< Timestamp */
+		desc->req.word2 = 0;
+		desc->req.word3 = 0;
+		uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+		desc->req.data_ptrs[0].blen = fcw_len;
+		desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
+		desc->req.data_ptrs[0].last = 0;
+		desc->req.data_ptrs[0].dma_ext = 0;
+		for (b_idx = 1; b_idx < ACC100_DMA_MAX_NUM_POINTERS - 1;
+				b_idx++) {
+			desc->req.data_ptrs[b_idx].blkid = ACC100_DMA_BLKID_IN;
+			desc->req.data_ptrs[b_idx].last = 1;
+			desc->req.data_ptrs[b_idx].dma_ext = 0;
+			b_idx++;
+			desc->req.data_ptrs[b_idx].blkid =
+					ACC100_DMA_BLKID_OUT_ENC;
+			desc->req.data_ptrs[b_idx].last = 1;
+			desc->req.data_ptrs[b_idx].dma_ext = 0;
+		}
+		/* Preset some fields of LDPC FCW */
+		desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
+		desc->req.fcw_ld.gain_i = 1;
+		desc->req.fcw_ld.gain_h = 1;
+	}
+
+	q->lb_in = rte_zmalloc_socket(dev->device->driver->name,
+			RTE_CACHE_LINE_SIZE,
+			RTE_CACHE_LINE_SIZE, conf->socket);
+	if (q->lb_in == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate lb_in memory");
+		rte_free(q);
+		return -ENOMEM;
+	}
+	q->lb_in_addr_phys = rte_malloc_virt2iova(q->lb_in);
+	q->lb_out = rte_zmalloc_socket(dev->device->driver->name,
+			RTE_CACHE_LINE_SIZE,
+			RTE_CACHE_LINE_SIZE, conf->socket);
+	if (q->lb_out == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate lb_out memory");
+		rte_free(q->lb_in);
+		rte_free(q);
+		return -ENOMEM;
+	}
+	q->lb_out_addr_phys = rte_malloc_virt2iova(q->lb_out);
+
+	/*
+	 * Software queue ring wraps synchronously with the HW when it reaches
+	 * the boundary of the maximum allocated queue size, no matter what the
+	 * sw queue size is. This wrapping is guarded by setting the wrap_mask
+	 * to represent the maximum queue size as allocated at the time when
+	 * the device has been setup (in configure()).
+	 *
+	 * The queue depth is set to the queue size value (conf->queue_size).
+	 * This limits the occupancy of the queue at any point of time, so that
+	 * the queue does not get swamped with enqueue requests.
+	 */
+	q->sw_ring_depth = conf->queue_size;
+	q->sw_ring_wrap_mask = d->sw_ring_max_depth - 1;
+
+	q->op_type = conf->op_type;
+
+	q_idx = acc100_find_free_queue_idx(dev, conf);
+	if (q_idx == -1) {
+		rte_free(q->lb_in);
+		rte_free(q->lb_out);
+		rte_free(q);
+		return -1;
+	}
+
+	q->qgrp_id = (q_idx >> GRP_ID_SHIFT) & 0xF;
+	q->vf_id = (q_idx >> VF_ID_SHIFT) & 0x3F;
+	q->aq_id = q_idx & 0xF;
+	q->aq_depth = (conf->op_type == RTE_BBDEV_OP_TURBO_DEC) ?
+			(1 << d->acc100_conf.q_ul_4g.aq_depth_log2) :
+			(1 << d->acc100_conf.q_dl_4g.aq_depth_log2);
+
+	q->mmio_reg_enqueue = RTE_PTR_ADD(d->mmio_base,
+			queue_offset(d->pf_device,
+					q->vf_id, q->qgrp_id, q->aq_id));
+
+	rte_bbdev_log_debug(
+			"Setup dev%u q%u: qgrp_id=%u, vf_id=%u, aq_id=%u, aq_depth=%u, mmio_reg_enqueue=%p",
+			dev->data->dev_id, queue_id, q->qgrp_id, q->vf_id,
+			q->aq_id, q->aq_depth, q->mmio_reg_enqueue);
+
+	dev->data->queues[queue_id].queue_private = q;
+	return 0;
+}
+
+/* Release ACC100 queue */
+static int
+acc100_queue_release(struct rte_bbdev *dev, uint16_t q_id)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	struct acc100_queue *q = dev->data->queues[q_id].queue_private;
+
+	if (q != NULL) {
+		/* Mark the Queue as un-assigned */
+		d->q_assigned_bit_map[q->qgrp_id] &= ~(1 << q->aq_id);
+		rte_free(q->lb_in);
+		rte_free(q->lb_out);
+		rte_free(q);
+		dev->data->queues[q_id].queue_private = NULL;
+	}
+
 	return 0;
 }
 
@@ -258,8 +673,11 @@
 }
 
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
+	.setup_queues = acc100_setup_queues,
 	.close = acc100_dev_close,
 	.info_get = acc100_dev_info_get,
+	.queue_setup = acc100_queue_setup,
+	.queue_release = acc100_queue_release,
 };
 
 /* ACC100 PCI PF address map */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 662e2c8..0e2b79c 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -518,11 +518,56 @@ struct acc100_registry_addr {
 	.ddr_range = HWVfDmaDdrBaseRangeRoVf,
 };
 
+/* Structure associated with each queue. */
+struct __rte_cache_aligned acc100_queue {
+	union acc100_dma_desc *ring_addr;  /* Virtual address of sw ring */
+	rte_iova_t ring_addr_phys;  /* Physical address of software ring */
+	uint32_t sw_ring_head;  /* software ring head */
+	uint32_t sw_ring_tail;  /* software ring tail */
+	/* software ring size (descriptors, not bytes) */
+	uint32_t sw_ring_depth;
+	/* mask used to wrap enqueued descriptors on the sw ring */
+	uint32_t sw_ring_wrap_mask;
+	/* MMIO register used to enqueue descriptors */
+	void *mmio_reg_enqueue;
+	uint8_t vf_id;  /* VF ID (max = 63) */
+	uint8_t qgrp_id;  /* Queue Group ID */
+	uint16_t aq_id;  /* Atomic Queue ID */
+	uint16_t aq_depth;  /* Depth of atomic queue */
+	uint32_t aq_enqueued;  /* Count how many "batches" have been enqueued */
+	uint32_t aq_dequeued;  /* Count how many "batches" have been dequeued */
+	uint32_t irq_enable;  /* Enable ops dequeue interrupts if set to 1 */
+	struct rte_mempool *fcw_mempool;  /* FCW mempool */
+	enum rte_bbdev_op_type op_type;  /* Type of this Queue: TE or TD */
+	/* Internal Buffers for loopback input */
+	uint8_t *lb_in;
+	uint8_t *lb_out;
+	rte_iova_t lb_in_addr_phys;
+	rte_iova_t lb_out_addr_phys;
+	struct acc100_device *d;
+};
+
 /* Private data structure for each ACC100 device */
 struct acc100_device {
 	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
+	void *sw_rings_base;  /* Base addr of unaligned memory for sw rings */
+	void *sw_rings;  /* 64 MB of 64 MB-aligned memory for sw rings */
+	rte_iova_t sw_rings_phys;  /* Physical address of sw_rings */
+	/* Virtual address of the info memory routed to this function,
+	 * whether it is operating as PF or VF.
+	 */
+	union acc100_harq_layout_data *harq_layout;
+	uint32_t sw_ring_size;
 	uint32_t ddr_size; /* Size in kB */
+	uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */
+	rte_iova_t tail_ptr_phys; /* Physical address of tail pointers */
+	/* Max number of entries available for each queue in device, depending
+	 * on how many queues are enabled with configure()
+	 */
+	uint32_t sw_ring_max_depth;
 	struct acc100_conf acc100_conf; /* ACC100 Initial configuration */
+	/* Bitmap capturing which Queues have already been assigned */
+	uint16_t q_assigned_bit_map[ACC100_NUM_QGRPS];
 	bool pf_device; /**< True if this is a PF ACC100 device */
 	bool configured; /**< True if this ACC100 device is configured */
 };
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread
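The free-queue search in `acc100_find_free_queue_idx()` above packs the selected queue-group and atomic-queue indices into a single return value, with a per-group bitmap tracking which AQs are taken. A minimal standalone sketch of that scheme (illustrative only: `GRP_ID_SHIFT_EX` and `NUM_AQS_EX` are assumed stand-ins for the driver's `GRP_ID_SHIFT` and per-group AQ count, not the real macros):

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-ins, not the driver's macros. */
#define GRP_ID_SHIFT_EX 10
#define NUM_AQS_EX 16

/* Scan the group's bitmap for a free atomic queue; mark it assigned and
 * return the packed (group, AQ) index, or -1 if the group is exhausted. */
static int alloc_aq(uint16_t *bitmap, uint16_t group_idx)
{
	uint16_t aq_idx;

	for (aq_idx = 0; aq_idx < NUM_AQS_EX; aq_idx++) {
		if (((bitmap[group_idx] >> aq_idx) & 0x1) == 0) {
			bitmap[group_idx] |= (uint16_t)(1 << aq_idx);
			return (group_idx << GRP_ID_SHIFT_EX) + aq_idx;
		}
	}
	return -1; /* all AQs in this group are taken */
}

/* Release side: clear the bit, as acc100_queue_release() does. */
static void free_aq(uint16_t *bitmap, uint16_t group_idx, uint16_t aq_idx)
{
	bitmap[group_idx] &= (uint16_t)~(1 << aq_idx);
}
```

The packed index is later unpacked in `acc100_queue_setup()` with the same shift and masks to derive `qgrp_id` and `aq_id`.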

* [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions
  2020-08-19  0:25 [dpdk-dev] [PATCH v3 00/11] bbdev PMD ACC100 Nicolas Chautru
                   ` (3 preceding siblings ...)
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 04/11] baseband/acc100: add queue configuration Nicolas Chautru
@ 2020-08-19  0:25 ` Nicolas Chautru
  2020-08-20 14:38   ` Dave Burley
  2020-08-29 11:10   ` Xu, Rosen
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 06/11] baseband/acc100: add HARQ loopback support Nicolas Chautru
                   ` (5 subsequent siblings)
  10 siblings, 2 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-08-19  0:25 UTC (permalink / raw)
  To: dev, akhil.goyal; +Cc: bruce.richardson, Nicolas Chautru

Adding LDPC decode and encode processing operations

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 1625 +++++++++++++++++++++++++++++-
 drivers/baseband/acc100/rte_acc100_pmd.h |    3 +
 2 files changed, 1626 insertions(+), 2 deletions(-)

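For reference, the rate-matching start position k0 derived by `get_k0()` in this patch follows 3GPP TS 38.212 Table 5.4.2.1-2. A standalone sketch using the table's numerators directly, assuming the driver's `K0_*` macros equal those numerators (the single LBRM formula below also covers the full-buffer case that the driver special-cases):

```c
#include <assert.h>
#include <stdint.h>

/* k0 = floor(num * n_cb / N) * z_c, with N = 66*Zc (BG1) or 50*Zc (BG2)
 * and num from TS 38.212 Table 5.4.2.1-2. Covers both the full-buffer
 * case (n_cb == N) and limited-buffer rate matching (n_cb < N). */
static uint16_t get_k0_ref(uint16_t n_cb, uint16_t z_c, uint8_t bg,
		uint8_t rv_index)
{
	static const uint16_t num[2][4] = {
		{0, 17, 33, 56},	/* basegraph 1 */
		{0, 13, 25, 43},	/* basegraph 2 */
	};
	uint32_t n = (bg == 1 ? 66u : 50u) * z_c;

	if (rv_index == 0)
		return 0;
	return (uint16_t)(((uint32_t)num[bg - 1][rv_index] * n_cb / n) * z_c);
}
```

The division truncates to a multiple of z_c, which is exactly the behavior the hardware frame control word expects for the `k0` field.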
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 7a21c57..5f32813 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -15,6 +15,9 @@
 #include <rte_hexdump.h>
 #include <rte_pci.h>
 #include <rte_bus_pci.h>
+#ifdef RTE_BBDEV_OFFLOAD_COST
+#include <rte_cycles.h>
+#endif
 
 #include <rte_bbdev.h>
 #include <rte_bbdev_pmd.h>
@@ -449,7 +452,6 @@
 	return 0;
 }
 
-
 /**
  * Report an ACC100 queue index which is free
  * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
@@ -634,6 +636,46 @@
 	struct acc100_device *d = dev->data->dev_private;
 
 	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
+		{
+			.type   = RTE_BBDEV_OP_LDPC_ENC,
+			.cap.ldpc_enc = {
+				.capability_flags =
+					RTE_BBDEV_LDPC_RATE_MATCH |
+					RTE_BBDEV_LDPC_CRC_24B_ATTACH |
+					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
+				.num_buffers_src =
+						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+				.num_buffers_dst =
+						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+			}
+		},
+		{
+			.type   = RTE_BBDEV_OP_LDPC_DEC,
+			.cap.ldpc_dec = {
+			.capability_flags =
+				RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK |
+				RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP |
+				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
+				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
+#ifdef ACC100_EXT_MEM
+				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
+				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
+#endif
+				RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE |
+				RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS |
+				RTE_BBDEV_LDPC_DECODE_BYPASS |
+				RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
+				RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
+				RTE_BBDEV_LDPC_LLR_COMPRESSION,
+			.llr_size = 8,
+			.llr_decimals = 1,
+			.num_buffers_src =
+					RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+			.num_buffers_hard_out =
+					RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+			.num_buffers_soft_out = 0,
+			}
+		},
 		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
 	};
 
@@ -669,9 +711,14 @@
 	dev_info->cpu_flag_reqs = NULL;
 	dev_info->min_alignment = 64;
 	dev_info->capabilities = bbdev_capabilities;
+#ifdef ACC100_EXT_MEM
 	dev_info->harq_buffer_size = d->ddr_size;
+#else
+	dev_info->harq_buffer_size = 0;
+#endif
 }
 
+
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.setup_queues = acc100_setup_queues,
 	.close = acc100_dev_close,
@@ -696,6 +743,1577 @@
 	{.device_id = 0},
 };
 
+/* Read flag value 0/1 from bitmap */
+static inline bool
+check_bit(uint32_t bitmap, uint32_t bitmask)
+{
+	return bitmap & bitmask;
+}
+
+static inline char *
+mbuf_append(struct rte_mbuf *m_head, struct rte_mbuf *m, uint16_t len)
+{
+	if (unlikely(len > rte_pktmbuf_tailroom(m)))
+		return NULL;
+
+	char *tail = (char *)m->buf_addr + m->data_off + m->data_len;
+	m->data_len = (uint16_t)(m->data_len + len);
+	m_head->pkt_len  = (m_head->pkt_len + len);
+	return tail;
+}
+
+/* Compute value of k0.
+ * Based on 3GPP 38.212 Table 5.4.2.1-2
+ * Starting position of different redundancy versions, k0
+ */
+static inline uint16_t
+get_k0(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv_index)
+{
+	if (rv_index == 0)
+		return 0;
+	uint16_t n = (bg == 1 ? N_ZC_1 : N_ZC_2) * z_c;
+	if (n_cb == n) {
+		if (rv_index == 1)
+			return (bg == 1 ? K0_1_1 : K0_1_2) * z_c;
+		else if (rv_index == 2)
+			return (bg == 1 ? K0_2_1 : K0_2_2) * z_c;
+		else
+			return (bg == 1 ? K0_3_1 : K0_3_2) * z_c;
+	}
+	/* LBRM case - includes a division by N */
+	if (rv_index == 1)
+		return (((bg == 1 ? K0_1_1 : K0_1_2) * n_cb)
+				/ n) * z_c;
+	else if (rv_index == 2)
+		return (((bg == 1 ? K0_2_1 : K0_2_2) * n_cb)
+				/ n) * z_c;
+	else
+		return (((bg == 1 ? K0_3_1 : K0_3_2) * n_cb)
+				/ n) * z_c;
+}
+
+/* Fill in a frame control word for LDPC encoding. */
+static inline void
+acc100_fcw_le_fill(const struct rte_bbdev_enc_op *op,
+		struct acc100_fcw_le *fcw, int num_cb)
+{
+	fcw->qm = op->ldpc_enc.q_m;
+	fcw->nfiller = op->ldpc_enc.n_filler;
+	fcw->BG = (op->ldpc_enc.basegraph - 1);
+	fcw->Zc = op->ldpc_enc.z_c;
+	fcw->ncb = op->ldpc_enc.n_cb;
+	fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_enc.basegraph,
+			op->ldpc_enc.rv_index);
+	fcw->rm_e = op->ldpc_enc.cb_params.e;
+	fcw->crc_select = check_bit(op->ldpc_enc.op_flags,
+			RTE_BBDEV_LDPC_CRC_24B_ATTACH);
+	fcw->bypass_intlv = check_bit(op->ldpc_enc.op_flags,
+			RTE_BBDEV_LDPC_INTERLEAVER_BYPASS);
+	fcw->mcb_count = num_cb;
+}
+
+/* Fill in a frame control word for LDPC decoding. */
+static inline void
+acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
+		union acc100_harq_layout_data *harq_layout)
+{
+	uint16_t harq_out_length, harq_in_length, ncb_p, k0_p, parity_offset;
+	uint16_t harq_index;
+	uint32_t l;
+	bool harq_prun = false;
+
+	fcw->qm = op->ldpc_dec.q_m;
+	fcw->nfiller = op->ldpc_dec.n_filler;
+	fcw->BG = (op->ldpc_dec.basegraph - 1);
+	fcw->Zc = op->ldpc_dec.z_c;
+	fcw->ncb = op->ldpc_dec.n_cb;
+	fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_dec.basegraph,
+			op->ldpc_dec.rv_index);
+	if (op->ldpc_dec.code_block_mode == 1)
+		fcw->rm_e = op->ldpc_dec.cb_params.e;
+	else
+		fcw->rm_e = (op->ldpc_dec.tb_params.r <
+				op->ldpc_dec.tb_params.cab) ?
+						op->ldpc_dec.tb_params.ea :
+						op->ldpc_dec.tb_params.eb;
+
+	fcw->hcin_en = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE);
+	fcw->hcout_en = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE);
+	fcw->crc_select = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK);
+	fcw->bypass_dec = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_DECODE_BYPASS);
+	fcw->bypass_intlv = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS);
+	if (op->ldpc_dec.q_m == 1) {
+		fcw->bypass_intlv = 1;
+		fcw->qm = 2;
+	}
+	fcw->hcin_decomp_mode = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+	fcw->hcout_comp_mode = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+	fcw->llr_pack_mode = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_LLR_COMPRESSION);
+	harq_index = op->ldpc_dec.harq_combined_output.offset /
+			ACC100_HARQ_OFFSET;
+#ifdef ACC100_EXT_MEM
+	/* Limit cases when HARQ pruning is valid */
+	harq_prun = ((op->ldpc_dec.harq_combined_output.offset %
+			ACC100_HARQ_OFFSET) == 0) &&
+			(op->ldpc_dec.harq_combined_output.offset <= UINT16_MAX
+			* ACC100_HARQ_OFFSET);
+#endif
+	if (fcw->hcin_en > 0) {
+		harq_in_length = op->ldpc_dec.harq_combined_input.length;
+		if (fcw->hcin_decomp_mode > 0)
+			harq_in_length = harq_in_length * 8 / 6;
+		harq_in_length = RTE_ALIGN(harq_in_length, 64);
+		if ((harq_layout[harq_index].offset > 0) && harq_prun) {
+			rte_bbdev_log_debug("HARQ IN offset unexpected for now\n");
+			fcw->hcin_size0 = harq_layout[harq_index].size0;
+			fcw->hcin_offset = harq_layout[harq_index].offset;
+			fcw->hcin_size1 = harq_in_length -
+					harq_layout[harq_index].offset;
+		} else {
+			fcw->hcin_size0 = harq_in_length;
+			fcw->hcin_offset = 0;
+			fcw->hcin_size1 = 0;
+		}
+	} else {
+		fcw->hcin_size0 = 0;
+		fcw->hcin_offset = 0;
+		fcw->hcin_size1 = 0;
+	}
+
+	fcw->itmax = op->ldpc_dec.iter_max;
+	fcw->itstop = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE);
+	fcw->synd_precoder = fcw->itstop;
+	/*
+	 * These are all implicitly set
+	 * fcw->synd_post = 0;
+	 * fcw->so_en = 0;
+	 * fcw->so_bypass_rm = 0;
+	 * fcw->so_bypass_intlv = 0;
+	 * fcw->dec_convllr = 0;
+	 * fcw->hcout_convllr = 0;
+	 * fcw->hcout_size1 = 0;
+	 * fcw->so_it = 0;
+	 * fcw->hcout_offset = 0;
+	 * fcw->negstop_th = 0;
+	 * fcw->negstop_it = 0;
+	 * fcw->negstop_en = 0;
+	 * fcw->gain_i = 1;
+	 * fcw->gain_h = 1;
+	 */
+	if (fcw->hcout_en > 0) {
+		parity_offset = (op->ldpc_dec.basegraph == 1 ? 20 : 8)
+			* op->ldpc_dec.z_c - op->ldpc_dec.n_filler;
+		k0_p = (fcw->k0 > parity_offset) ?
+				fcw->k0 - op->ldpc_dec.n_filler : fcw->k0;
+		ncb_p = fcw->ncb - op->ldpc_dec.n_filler;
+		l = k0_p + fcw->rm_e;
+		harq_out_length = (uint16_t) fcw->hcin_size0;
+		harq_out_length = RTE_MIN(RTE_MAX(harq_out_length, l), ncb_p);
+		harq_out_length = (harq_out_length + 0x3F) & 0xFFC0;
+		if ((k0_p > fcw->hcin_size0 + ACC100_HARQ_OFFSET_THRESHOLD) &&
+				harq_prun) {
+			fcw->hcout_size0 = (uint16_t) fcw->hcin_size0;
+			fcw->hcout_offset = k0_p & 0xFFC0;
+			fcw->hcout_size1 = harq_out_length - fcw->hcout_offset;
+		} else {
+			fcw->hcout_size0 = harq_out_length;
+			fcw->hcout_size1 = 0;
+			fcw->hcout_offset = 0;
+		}
+		harq_layout[harq_index].offset = fcw->hcout_offset;
+		harq_layout[harq_index].size0 = fcw->hcout_size0;
+	} else {
+		fcw->hcout_size0 = 0;
+		fcw->hcout_size1 = 0;
+		fcw->hcout_offset = 0;
+	}
+}
+
+/**
+ * Fills descriptor with data pointers of one block type.
+ *
+ * @param desc
+ *   Pointer to DMA descriptor.
+ * @param input
+ *   Pointer to pointer to input data which will be encoded. It can be changed
+ *   and points to next segment in scatter-gather case.
+ * @param offset
+ *   Input offset in rte_mbuf structure. It is used for calculating the point
+ *   where data is starting.
+ * @param cb_len
+ *   Length of currently processed Code Block
+ * @param seg_total_left
+ *   It indicates how many bytes still left in segment (mbuf) for further
+ *   processing.
+ * @param op_flags
+ *   Store information about device capabilities
+ * @param next_triplet
+ *   Index for ACC100 DMA Descriptor triplet
+ *
+ * @return
+ *   Returns index of next triplet on success, other value if lengths of
+ *   pkt and processed cb do not match.
+ *
+ */
+static inline int
+acc100_dma_fill_blk_type_in(struct acc100_dma_req_desc *desc,
+		struct rte_mbuf **input, uint32_t *offset, uint32_t cb_len,
+		uint32_t *seg_total_left, int next_triplet)
+{
+	uint32_t part_len;
+	struct rte_mbuf *m = *input;
+
+	part_len = (*seg_total_left < cb_len) ? *seg_total_left : cb_len;
+	cb_len -= part_len;
+	*seg_total_left -= part_len;
+
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(m, *offset);
+	desc->data_ptrs[next_triplet].blen = part_len;
+	desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
+	desc->data_ptrs[next_triplet].last = 0;
+	desc->data_ptrs[next_triplet].dma_ext = 0;
+	*offset += part_len;
+	next_triplet++;
+
+	while (cb_len > 0) {
+		if (next_triplet < ACC100_DMA_MAX_NUM_POINTERS &&
+				m->next != NULL) {
+
+			m = m->next;
+			*seg_total_left = rte_pktmbuf_data_len(m);
+			part_len = (*seg_total_left < cb_len) ?
+					*seg_total_left :
+					cb_len;
+			desc->data_ptrs[next_triplet].address =
+					rte_pktmbuf_iova(m);
+			desc->data_ptrs[next_triplet].blen = part_len;
+			desc->data_ptrs[next_triplet].blkid =
+					ACC100_DMA_BLKID_IN;
+			desc->data_ptrs[next_triplet].last = 0;
+			desc->data_ptrs[next_triplet].dma_ext = 0;
+			cb_len -= part_len;
+			*seg_total_left -= part_len;
+			/* Initializing offset for next segment (mbuf) */
+			*offset = part_len;
+			next_triplet++;
+		} else {
+			rte_bbdev_log(ERR,
+				"Some data still left for processing: "
+				"data_left: %u, next_triplet: %u, next_mbuf: %p",
+				cb_len, next_triplet, m->next);
+			return -EINVAL;
+		}
+	}
+	/* Store the new mbuf as it could have changed in scatter-gather case */
+	*input = m;
+
+	return next_triplet;
+}
+
+/* Fills descriptor with data pointers of one block type.
+ * Returns index of next triplet on success, other value if lengths of
+ * output data and processed mbuf do not match.
+ */
+static inline int
+acc100_dma_fill_blk_type_out(struct acc100_dma_req_desc *desc,
+		struct rte_mbuf *output, uint32_t out_offset,
+		uint32_t output_len, int next_triplet, int blk_id)
+{
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(output, out_offset);
+	desc->data_ptrs[next_triplet].blen = output_len;
+	desc->data_ptrs[next_triplet].blkid = blk_id;
+	desc->data_ptrs[next_triplet].last = 0;
+	desc->data_ptrs[next_triplet].dma_ext = 0;
+	next_triplet++;
+
+	return next_triplet;
+}
+
+static inline int
+acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
+		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
+		struct rte_mbuf *output, uint32_t *in_offset,
+		uint32_t *out_offset, uint32_t *out_length,
+		uint32_t *mbuf_total_left, uint32_t *seg_total_left)
+{
+	int next_triplet = 1; /* FCW already done */
+	uint16_t K, in_length_in_bits, in_length_in_bytes;
+	struct rte_bbdev_op_ldpc_enc *enc = &op->ldpc_enc;
+
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /**< Timestamp could be disabled */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+
+	K = (enc->basegraph == 1 ? 22 : 10) * enc->z_c;
+	in_length_in_bits = K - enc->n_filler;
+	if ((enc->op_flags & RTE_BBDEV_LDPC_CRC_24A_ATTACH) ||
+			(enc->op_flags & RTE_BBDEV_LDPC_CRC_24B_ATTACH))
+		in_length_in_bits -= 24;
+	in_length_in_bytes = in_length_in_bits >> 3;
+
+	if (unlikely((*mbuf_total_left == 0) ||
+			(*mbuf_total_left < in_length_in_bytes))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, in_length_in_bytes);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
+			in_length_in_bytes,
+			seg_total_left, next_triplet);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= in_length_in_bytes;
+
+	/* Set output length: integer round-up division by 8 */
+	*out_length = (enc->cb_params.e + 7) >> 3;
+
+	next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
+			*out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	op->ldpc_enc.output.length += *out_length;
+	*out_offset += *out_length;
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->data_ptrs[next_triplet - 1].dma_ext = 0;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
+
+static inline int
+acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
+		struct acc100_dma_req_desc *desc,
+		struct rte_mbuf **input, struct rte_mbuf *h_output,
+		uint32_t *in_offset, uint32_t *h_out_offset,
+		uint32_t *h_out_length, uint32_t *mbuf_total_left,
+		uint32_t *seg_total_left,
+		struct acc100_fcw_ld *fcw)
+{
+	struct rte_bbdev_op_ldpc_dec *dec = &op->ldpc_dec;
+	int next_triplet = 1; /* FCW already done */
+	uint32_t input_length;
+	uint16_t output_length, crc24_overlap = 0;
+	uint16_t sys_cols, K, h_p_size, h_np_size;
+	bool h_comp = check_bit(dec->op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /**< Timestamp could be disabled */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+
+	if (check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP))
+		crc24_overlap = 24;
+
+	/* Compute some LDPC BG lengths */
+	input_length = dec->cb_params.e;
+	if (check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_LLR_COMPRESSION))
+		input_length = (input_length * 3 + 3) / 4;
+	sys_cols = (dec->basegraph == 1) ? 22 : 10;
+	K = sys_cols * dec->z_c;
+	output_length = K - dec->n_filler - crc24_overlap;
+
+	if (unlikely((*mbuf_total_left == 0) ||
+			(*mbuf_total_left < input_length))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, input_length);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input,
+			in_offset, input_length,
+			seg_total_left, next_triplet);
+
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
+		h_p_size = fcw->hcin_size0 + fcw->hcin_size1;
+		if (h_comp)
+			h_p_size = (h_p_size * 3 + 3) / 4;
+		desc->data_ptrs[next_triplet].address =
+				dec->harq_combined_input.offset;
+		desc->data_ptrs[next_triplet].blen = h_p_size;
+		desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN_HARQ;
+		desc->data_ptrs[next_triplet].dma_ext = 1;
+#ifndef ACC100_EXT_MEM
+		acc100_dma_fill_blk_type_out(
+				desc,
+				op->ldpc_dec.harq_combined_input.data,
+				op->ldpc_dec.harq_combined_input.offset,
+				h_p_size,
+				next_triplet,
+				ACC100_DMA_BLKID_IN_HARQ);
+#endif
+		next_triplet++;
+	}
+
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= input_length;
+
+	next_triplet = acc100_dma_fill_blk_type_out(desc, h_output,
+			*h_out_offset, output_length >> 3, next_triplet,
+			ACC100_DMA_BLKID_OUT_HARD);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
+		/* Pruned size of the HARQ */
+		h_p_size = fcw->hcout_size0 + fcw->hcout_size1;
+		/* Non-Pruned size of the HARQ */
+		h_np_size = fcw->hcout_offset > 0 ?
+				fcw->hcout_offset + fcw->hcout_size1 :
+				h_p_size;
+		if (h_comp) {
+			h_np_size = (h_np_size * 3 + 3) / 4;
+			h_p_size = (h_p_size * 3 + 3) / 4;
+		}
+		dec->harq_combined_output.length = h_np_size;
+		desc->data_ptrs[next_triplet].address =
+				dec->harq_combined_output.offset;
+		desc->data_ptrs[next_triplet].blen = h_p_size;
+		desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_OUT_HARQ;
+		desc->data_ptrs[next_triplet].dma_ext = 1;
+#ifndef ACC100_EXT_MEM
+		acc100_dma_fill_blk_type_out(
+				desc,
+				dec->harq_combined_output.data,
+				dec->harq_combined_output.offset,
+				h_p_size,
+				next_triplet,
+				ACC100_DMA_BLKID_OUT_HARQ);
+#endif
+		next_triplet++;
+	}
+
+	*h_out_length = output_length >> 3;
+	dec->hard_output.length += *h_out_length;
+	*h_out_offset += *h_out_length;
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
+
+static inline void
+acc100_dma_desc_ld_update(struct rte_bbdev_dec_op *op,
+		struct acc100_dma_req_desc *desc,
+		struct rte_mbuf *input, struct rte_mbuf *h_output,
+		uint32_t *in_offset, uint32_t *h_out_offset,
+		uint32_t *h_out_length,
+		union acc100_harq_layout_data *harq_layout)
+{
+	int next_triplet = 1; /* FCW already done */
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(input, *in_offset);
+	next_triplet++;
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
+		struct rte_bbdev_op_data hi = op->ldpc_dec.harq_combined_input;
+		desc->data_ptrs[next_triplet].address = hi.offset;
+#ifndef ACC100_EXT_MEM
+		desc->data_ptrs[next_triplet].address =
+				rte_pktmbuf_iova_offset(hi.data, hi.offset);
+#endif
+		next_triplet++;
+	}
+
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(h_output, *h_out_offset);
+	*h_out_length = desc->data_ptrs[next_triplet].blen;
+	next_triplet++;
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
+		desc->data_ptrs[next_triplet].address =
+				op->ldpc_dec.harq_combined_output.offset;
+		/* Adjust based on previous operation */
+		struct rte_bbdev_dec_op *prev_op = desc->op_addr;
+		op->ldpc_dec.harq_combined_output.length =
+				prev_op->ldpc_dec.harq_combined_output.length;
+		int16_t hq_idx = op->ldpc_dec.harq_combined_output.offset /
+				ACC100_HARQ_OFFSET;
+		int16_t prev_hq_idx =
+				prev_op->ldpc_dec.harq_combined_output.offset
+				/ ACC100_HARQ_OFFSET;
+		harq_layout[hq_idx].val = harq_layout[prev_hq_idx].val;
+#ifndef ACC100_EXT_MEM
+		struct rte_bbdev_op_data ho =
+				op->ldpc_dec.harq_combined_output;
+		desc->data_ptrs[next_triplet].address =
+				rte_pktmbuf_iova_offset(ho.data, ho.offset);
+#endif
+		next_triplet++;
+	}
+
+	op->ldpc_dec.hard_output.length += *h_out_length;
+	desc->op_addr = op;
+}
+
+
+/* Enqueue a number of operations to HW and update software rings */
+static inline void
+acc100_dma_enqueue(struct acc100_queue *q, uint16_t n,
+		struct rte_bbdev_stats *queue_stats)
+{
+	union acc100_enqueue_reg_fmt enq_req;
+#ifdef RTE_BBDEV_OFFLOAD_COST
+	uint64_t start_time = 0;
+	queue_stats->acc_offload_cycles = 0;
+#else
+	RTE_SET_USED(queue_stats);
+#endif
+
+	enq_req.val = 0;
+	/* Setting offset, 100b for 256 DMA Desc */
+	enq_req.addr_offset = ACC100_DESC_OFFSET;
+
+	/* Split ops into batches */
+	do {
+		union acc100_dma_desc *desc;
+		uint16_t enq_batch_size;
+		uint64_t offset;
+		rte_iova_t req_elem_addr;
+
+		enq_batch_size = RTE_MIN(n, MAX_ENQ_BATCH_SIZE);
+
+		/* Set flag on last descriptor in a batch */
+		desc = q->ring_addr + ((q->sw_ring_head + enq_batch_size - 1) &
+				q->sw_ring_wrap_mask);
+		desc->req.last_desc_in_batch = 1;
+
+		/* Calculate the 1st descriptor's address */
+		offset = ((q->sw_ring_head & q->sw_ring_wrap_mask) *
+				sizeof(union acc100_dma_desc));
+		req_elem_addr = q->ring_addr_phys + offset;
+
+		/* Fill enqueue struct */
+		enq_req.num_elem = enq_batch_size;
+		/* low 6 bits are not needed */
+		enq_req.req_elem_addr = (uint32_t)(req_elem_addr >> 6);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "Req sdone", desc, sizeof(*desc));
+#endif
+		rte_bbdev_log_debug(
+				"Enqueue %u reqs (phys %#"PRIx64") to reg %p",
+				enq_batch_size,
+				req_elem_addr,
+				(void *)q->mmio_reg_enqueue);
+
+		rte_wmb();
+
+#ifdef RTE_BBDEV_OFFLOAD_COST
+		/* Start time measurement for enqueue function offload. */
+		start_time = rte_rdtsc_precise();
+#endif
+		rte_bbdev_log(DEBUG, "Debug : MMIO Enqueue");
+		mmio_write(q->mmio_reg_enqueue, enq_req.val);
+
+#ifdef RTE_BBDEV_OFFLOAD_COST
+		queue_stats->acc_offload_cycles +=
+				rte_rdtsc_precise() - start_time;
+#endif
+
+		q->aq_enqueued++;
+		q->sw_ring_head += enq_batch_size;
+		n -= enq_batch_size;
+
+	} while (n);
+}
+
+/* Enqueue a group of encode operations for ACC100 device in CB mode,
+ * multiplexed on a single descriptor
+ */
+static inline int
+enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
+		uint16_t total_enqueued_cbs, int16_t num)
+{
+	union acc100_dma_desc *desc = NULL;
+	uint32_t out_length;
+	struct rte_mbuf *output_head, *output;
+	int i, next_triplet;
+	uint16_t  in_length_in_bytes;
+	struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_le_fill(ops[0], &desc->req.fcw_le, num);
+
+	/* This could be done at polling time */
+	desc->req.word0 = ACC100_DMA_DESC_TYPE;
+	desc->req.word1 = 0; /**< Timestamp could be disabled */
+	desc->req.word2 = 0;
+	desc->req.word3 = 0;
+	desc->req.numCBs = num;
+
+	in_length_in_bytes = ops[0]->ldpc_enc.input.data->data_len;
+	out_length = (enc->cb_params.e + 7) >> 3;
+	desc->req.m2dlen = 1 + num;
+	desc->req.d2mlen = num;
+	next_triplet = 1;
+
+	for (i = 0; i < num; i++) {
+		desc->req.data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(ops[i]->ldpc_enc.input.data, 0);
+		desc->req.data_ptrs[next_triplet].blen = in_length_in_bytes;
+		next_triplet++;
+		desc->req.data_ptrs[next_triplet].address =
+				rte_pktmbuf_iova_offset(
+				ops[i]->ldpc_enc.output.data, 0);
+		desc->req.data_ptrs[next_triplet].blen = out_length;
+		next_triplet++;
+		ops[i]->ldpc_enc.output.length = out_length;
+		output_head = output = ops[i]->ldpc_enc.output.data;
+		mbuf_append(output_head, output, out_length);
+		output->data_len = out_length;
+	}
+
+	desc->req.op_addr = ops[0];
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_le,
+			sizeof(desc->req.fcw_le) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+	/* Multiple CBs in one op were successfully prepared to enqueue */
+	return num;
+}
+
+/* Enqueue one encode operation for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+		uint16_t total_enqueued_cbs)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+		seg_total_left;
+	struct rte_mbuf *input, *output_head, *output;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_le_fill(op, &desc->req.fcw_le, 1);
+
+	input = op->ldpc_enc.input.data;
+	output_head = output = op->ldpc_enc.output.data;
+	in_offset = op->ldpc_enc.input.offset;
+	out_offset = op->ldpc_enc.output.offset;
+	out_length = 0;
+	mbuf_total_left = op->ldpc_enc.input.length;
+	seg_total_left = rte_pktmbuf_data_len(op->ldpc_enc.input.data)
+			- in_offset;
+
+	ret = acc100_dma_desc_le_fill(op, &desc->req, &input, output,
+			&in_offset, &out_offset, &out_length, &mbuf_total_left,
+			&seg_total_left);
+
+	if (unlikely(ret < 0))
+		return ret;
+
+	mbuf_append(output_head, output, out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_le,
+			sizeof(desc->req.fcw_le) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+
+	/* Check if any data left after processing one CB */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left after processing one CB: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+/* Enqueue one decode operation for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs, bool same_op)
+{
+	int ret;
+
+	union acc100_dma_desc *desc;
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	struct rte_mbuf *input, *h_output_head, *h_output;
+	uint32_t in_offset, h_out_offset, h_out_length, mbuf_total_left;
+	input = op->ldpc_dec.input.data;
+	h_output_head = h_output = op->ldpc_dec.hard_output.data;
+	in_offset = op->ldpc_dec.input.offset;
+	h_out_offset = op->ldpc_dec.hard_output.offset;
+	mbuf_total_left = op->ldpc_dec.input.length;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(input == NULL)) {
+		rte_bbdev_log(ERR, "Invalid mbuf pointer");
+		return -EFAULT;
+	}
+#endif
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+
+	if (same_op) {
+		union acc100_dma_desc *prev_desc;
+		desc_idx = ((q->sw_ring_head + total_enqueued_cbs - 1)
+				& q->sw_ring_wrap_mask);
+		prev_desc = q->ring_addr + desc_idx;
+		uint8_t *prev_ptr = (uint8_t *) prev_desc;
+		uint8_t *new_ptr = (uint8_t *) desc;
+		/* Copy first 4 words and BDESCs */
+		rte_memcpy(new_ptr, prev_ptr, 16);
+		rte_memcpy(new_ptr + 36, prev_ptr + 36, 40);
+		desc->req.op_addr = prev_desc->req.op_addr;
+		/* Copy FCW */
+		rte_memcpy(new_ptr + ACC100_DESC_FCW_OFFSET,
+				prev_ptr + ACC100_DESC_FCW_OFFSET,
+				ACC100_FCW_LD_BLEN);
+		acc100_dma_desc_ld_update(op, &desc->req, input, h_output,
+				&in_offset, &h_out_offset,
+				&h_out_length, harq_layout);
+	} else {
+		struct acc100_fcw_ld *fcw;
+		uint32_t seg_total_left;
+		fcw = &desc->req.fcw_ld;
+		acc100_fcw_ld_fill(op, fcw, harq_layout);
+
+		/* Special handling when overusing mbuf */
+		if (fcw->rm_e < MAX_E_MBUF)
+			seg_total_left = rte_pktmbuf_data_len(input)
+					- in_offset;
+		else
+			seg_total_left = fcw->rm_e;
+
+		ret = acc100_dma_desc_ld_fill(op, &desc->req, &input, h_output,
+				&in_offset, &h_out_offset,
+				&h_out_length, &mbuf_total_left,
+				&seg_total_left, fcw);
+		if (unlikely(ret < 0))
+			return ret;
+	}
+
+	/* Hard output */
+	mbuf_append(h_output_head, h_output, h_out_length);
+#ifndef ACC100_EXT_MEM
+	if (op->ldpc_dec.harq_combined_output.length > 0) {
+		/* Push the HARQ output into host memory */
+		struct rte_mbuf *hq_output_head, *hq_output;
+		hq_output_head = op->ldpc_dec.harq_combined_output.data;
+		hq_output = op->ldpc_dec.harq_combined_output.data;
+		mbuf_append(hq_output_head, hq_output,
+				op->ldpc_dec.harq_combined_output.length);
+	}
+#endif
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
+			sizeof(desc->req.fcw_ld) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+
+/* Enqueue one decode operation for ACC100 device in TB mode */
+static inline int
+enqueue_ldpc_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint8_t r, c;
+	uint32_t in_offset, h_out_offset,
+		h_out_length, mbuf_total_left, seg_total_left;
+	struct rte_mbuf *input, *h_output_head, *h_output;
+	uint16_t current_enqueued_cbs = 0;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+	acc100_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout);
+
+	input = op->ldpc_dec.input.data;
+	h_output_head = h_output = op->ldpc_dec.hard_output.data;
+	in_offset = op->ldpc_dec.input.offset;
+	h_out_offset = op->ldpc_dec.hard_output.offset;
+	h_out_length = 0;
+	mbuf_total_left = op->ldpc_dec.input.length;
+	c = op->ldpc_dec.tb_params.c;
+	r = op->ldpc_dec.tb_params.r;
+
+	while (mbuf_total_left > 0 && r < c) {
+
+		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+
+		/* Set up DMA descriptor */
+		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+				& q->sw_ring_wrap_mask);
+		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+		desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
+		ret = acc100_dma_desc_ld_fill(op, &desc->req, &input,
+				h_output, &in_offset, &h_out_offset,
+				&h_out_length,
+				&mbuf_total_left, &seg_total_left,
+				&desc->req.fcw_ld);
+
+		if (unlikely(ret < 0))
+			return ret;
+
+		/* Hard output */
+		mbuf_append(h_output_head, h_output, h_out_length);
+
+		/* Set total number of CBs in TB */
+		desc->req.cbs_in_tb = cbs_in_tb;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
+				sizeof(desc->req.fcw_ld) - 8);
+		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+		if (seg_total_left == 0) {
+			/* Go to the next mbuf */
+			input = input->next;
+			in_offset = 0;
+			h_output = h_output->next;
+			h_out_offset = 0;
+		}
+		total_enqueued_cbs++;
+		current_enqueued_cbs++;
+		r++;
+	}
+
+	if (unlikely(desc == NULL))
+		return current_enqueued_cbs;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Check if any CBs left for processing */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left for processing: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+	/* Set SDone on last CB descriptor for TB mode */
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	return current_enqueued_cbs;
+}
+
+
+/* Calculates number of CBs in processed encoder TB based on 'r' and input
+ * length.
+ */
+static inline uint8_t
+get_num_cbs_in_tb_enc(struct rte_bbdev_op_turbo_enc *turbo_enc)
+{
+	uint8_t c, c_neg, r, crc24_bits = 0;
+	uint16_t k, k_neg, k_pos;
+	uint8_t cbs_in_tb = 0;
+	int32_t length;
+
+	length = turbo_enc->input.length;
+	r = turbo_enc->tb_params.r;
+	c = turbo_enc->tb_params.c;
+	c_neg = turbo_enc->tb_params.c_neg;
+	k_neg = turbo_enc->tb_params.k_neg;
+	k_pos = turbo_enc->tb_params.k_pos;
+	crc24_bits = 0;
+	if (check_bit(turbo_enc->op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH))
+		crc24_bits = 24;
+	while (length > 0 && r < c) {
+		k = (r < c_neg) ? k_neg : k_pos;
+		length -= (k - crc24_bits) >> 3;
+		r++;
+		cbs_in_tb++;
+	}
+
+	return cbs_in_tb;
+}
+
+/* Calculates number of CBs in processed decoder TB based on 'r' and input
+ * length.
+ */
+static inline uint16_t
+get_num_cbs_in_tb_dec(struct rte_bbdev_op_turbo_dec *turbo_dec)
+{
+	uint8_t c, c_neg, r = 0;
+	uint16_t kw, k, k_neg, k_pos, cbs_in_tb = 0;
+	int32_t length;
+
+	length = turbo_dec->input.length;
+	r = turbo_dec->tb_params.r;
+	c = turbo_dec->tb_params.c;
+	c_neg = turbo_dec->tb_params.c_neg;
+	k_neg = turbo_dec->tb_params.k_neg;
+	k_pos = turbo_dec->tb_params.k_pos;
+	while (length > 0 && r < c) {
+		k = (r < c_neg) ? k_neg : k_pos;
+		kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
+		length -= kw;
+		r++;
+		cbs_in_tb++;
+	}
+
+	return cbs_in_tb;
+}
+
+/* Calculates number of CBs in processed LDPC decoder TB based on 'r' and input
+ * length.
+ */
+static inline uint16_t
+get_num_cbs_in_tb_ldpc_dec(struct rte_bbdev_op_ldpc_dec *ldpc_dec)
+{
+	uint16_t r, cbs_in_tb = 0;
+	int32_t length = ldpc_dec->input.length;
+	r = ldpc_dec->tb_params.r;
+	while (length > 0 && r < ldpc_dec->tb_params.c) {
+		length -=  (r < ldpc_dec->tb_params.cab) ?
+				ldpc_dec->tb_params.ea :
+				ldpc_dec->tb_params.eb;
+		r++;
+		cbs_in_tb++;
+	}
+	return cbs_in_tb;
+}
+
+/* Check we can mux encode operations with common FCW */
+static inline bool
+check_mux(struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	uint16_t i;
+	if (num == 1)
+		return false;
+	for (i = 1; i < num; ++i) {
+		/* Only mux compatible code blocks */
+		if (memcmp((uint8_t *)(&ops[i]->ldpc_enc) + ENC_OFFSET,
+				(uint8_t *)(&ops[0]->ldpc_enc) + ENC_OFFSET,
+				CMP_ENC_SIZE) != 0)
+			return false;
+	}
+	return true;
+}
+
+/* Enqueue LDPC encode operations for ACC100 device in CB mode. */
+static inline uint16_t
+acc100_enqueue_ldpc_enc_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i = 0;
+	union acc100_dma_desc *desc;
+	int ret, desc_idx = 0;
+	int16_t enq, left = num;
+
+	while (left > 0) {
+		if (unlikely(avail - 1 < 0))
+			break;
+		avail--;
+		enq = RTE_MIN(left, MUX_5GDL_DESC);
+		if (check_mux(&ops[i], enq)) {
+			ret = enqueue_ldpc_enc_n_op_cb(q, &ops[i],
+					desc_idx, enq);
+			if (ret < 0)
+				break;
+			i += enq;
+		} else {
+			ret = enqueue_ldpc_enc_one_op_cb(q, ops[i], desc_idx);
+			if (ret < 0)
+				break;
+			i++;
+		}
+		desc_idx++;
+		left = num - i;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + desc_idx - 1)
+			& q->sw_ring_wrap_mask);
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, desc_idx, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
+}
+
+/* Enqueue encode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	if (unlikely(num == 0))
+		return 0;
+	return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
+}
+
+/* Check we can mux decode operations with common FCW */
+static inline bool
+cmp_ldpc_dec_op(struct rte_bbdev_dec_op **ops)
+{
+	/* Only mux compatible code blocks */
+	return (memcmp((uint8_t *)(&ops[0]->ldpc_dec) + DEC_OFFSET,
+			(uint8_t *)(&ops[1]->ldpc_dec) + DEC_OFFSET,
+			CMP_DEC_SIZE) == 0);
+}
+
+
+/* Enqueue decode operations for ACC100 device in TB mode */
+static uint16_t
+acc100_enqueue_ldpc_dec_tb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i, enqueued_cbs = 0;
+	uint8_t cbs_in_tb;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		cbs_in_tb = get_num_cbs_in_tb_ldpc_dec(&ops[i]->ldpc_dec);
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - cbs_in_tb < 0))
+			break;
+		avail -= cbs_in_tb;
+
+		ret = enqueue_ldpc_dec_one_op_tb(q, ops[i],
+				enqueued_cbs, cbs_in_tb);
+		if (ret < 0)
+			break;
+		enqueued_cbs += ret;
+	}
+
+	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+	return i;
+}
+
+/* Enqueue decode operations for ACC100 device in CB mode */
+static uint16_t
+acc100_enqueue_ldpc_dec_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i;
+	union acc100_dma_desc *desc;
+	int ret;
+	bool same_op = false;
+	for (i = 0; i < num; ++i) {
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - 1 < 0))
+			break;
+		avail -= 1;
+
+		if (i > 0)
+			same_op = cmp_ldpc_dec_op(&ops[i-1]);
+		rte_bbdev_log(INFO, "Op %d %d %d %d %d %d %d %d %d %d %d %d",
+			i, ops[i]->ldpc_dec.op_flags, ops[i]->ldpc_dec.rv_index,
+			ops[i]->ldpc_dec.iter_max, ops[i]->ldpc_dec.iter_count,
+			ops[i]->ldpc_dec.basegraph, ops[i]->ldpc_dec.z_c,
+			ops[i]->ldpc_dec.n_cb, ops[i]->ldpc_dec.q_m,
+			ops[i]->ldpc_dec.n_filler, ops[i]->ldpc_dec.cb_params.e,
+			same_op);
+		ret = enqueue_ldpc_dec_one_op_cb(q, ops[i], i, same_op);
+		if (ret < 0)
+			break;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
+			& q->sw_ring_wrap_mask);
+
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, i, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+	return i;
+}
+
+/* Enqueue decode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t aq_avail = q->aq_depth +
+			(q->aq_dequeued - q->aq_enqueued) / 128;
+
+	if (unlikely((aq_avail == 0) || (num == 0)))
+		return 0;
+
+	if (ops[0]->ldpc_dec.code_block_mode == 0)
+		return acc100_enqueue_ldpc_dec_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_ldpc_dec_cb(q_data, ops, num);
+}
+
+
+/* Dequeue one encode operation from ACC100 device in CB mode */
+static inline int
+dequeue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
+		uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_enc_op *op;
+	int i;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	rsp.val = atom_desc.rsp.val;
+	rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+
+	op->status |= ((rsp.input_err)
+			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+	if (desc->req.last_desc_in_batch) {
+		(*aq_dequeued)++;
+		desc->req.last_desc_in_batch = 0;
+	}
+	desc->rsp.val = ACC100_DMA_DESC_TYPE;
+	desc->rsp.add_info_0 = 0; /* Reserved bits */
+	desc->rsp.add_info_1 = 0; /* Reserved bits */
+
+	/* Flag that the muxing causes loss of opaque data */
+	op->opaque_data = (void *)-1;
+	for (i = 0 ; i < desc->req.numCBs; i++)
+		ref_op[i] = op;
+
+	/* All CBs muxed into the descriptor were successfully dequeued */
+	return desc->req.numCBs;
+}
+
+/* Dequeue one encode operation from ACC100 device in TB mode */
+static inline int
+dequeue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
+		uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, *last_desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_enc_op *op;
+	uint8_t i = 0;
+	uint16_t current_dequeued_cbs = 0, cbs_in_tb;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	/* Get number of CBs in dequeued TB */
+	cbs_in_tb = desc->req.cbs_in_tb;
+	/* Get last CB */
+	last_desc = q->ring_addr + ((q->sw_ring_tail
+			+ total_dequeued_cbs + cbs_in_tb - 1)
+			& q->sw_ring_wrap_mask);
+	/* Check if last CB in TB is ready to dequeue (and thus
+	 * the whole TB) - checking sdone bit. If not return.
+	 */
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
+			__ATOMIC_RELAXED);
+	if (!(atom_desc.rsp.val & ACC100_SDONE))
+		return -1;
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+
+	while (i < cbs_in_tb) {
+		desc = q->ring_addr + ((q->sw_ring_tail
+				+ total_dequeued_cbs)
+				& q->sw_ring_wrap_mask);
+		atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+				__ATOMIC_RELAXED);
+		rsp.val = atom_desc.rsp.val;
+		rte_bbdev_log_debug("Resp. desc %p: %x", desc,
+				rsp.val);
+
+		op->status |= ((rsp.input_err)
+				? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+		op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+		op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+		if (desc->req.last_desc_in_batch) {
+			(*aq_dequeued)++;
+			desc->req.last_desc_in_batch = 0;
+		}
+		desc->rsp.val = ACC100_DMA_DESC_TYPE;
+		desc->rsp.add_info_0 = 0;
+		desc->rsp.add_info_1 = 0;
+		total_dequeued_cbs++;
+		current_dequeued_cbs++;
+		i++;
+	}
+
+	*ref_op = op;
+
+	return current_dequeued_cbs;
+}
+
+/* Dequeue one decode operation from ACC100 device in CB mode */
+static inline int
+dequeue_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
+		struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_dec_op *op;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	rsp.val = atom_desc.rsp.val;
+	rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+	op->status |= ((rsp.input_err)
+			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+	if (op->status != 0)
+		q_data->queue_stats.dequeue_err_count++;
+
+	/* Report CRC status only when no other error is set */
+	if (!op->status)
+		op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+	op->turbo_dec.iter_count = (uint8_t) rsp.iter_cnt / 2;
+	/* Check if this is the last desc in batch (Atomic Queue) */
+	if (desc->req.last_desc_in_batch) {
+		(*aq_dequeued)++;
+		desc->req.last_desc_in_batch = 0;
+	}
+	desc->rsp.val = ACC100_DMA_DESC_TYPE;
+	desc->rsp.add_info_0 = 0;
+	desc->rsp.add_info_1 = 0;
+	*ref_op = op;
+
+	/* One CB (op) was successfully dequeued */
+	return 1;
+}
+
+/* Dequeue one LDPC decode operation from ACC100 device in CB mode */
+static inline int
+dequeue_ldpc_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
+		struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_dec_op *op;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	rsp.val = atom_desc.rsp.val;
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+	op->status |= rsp.input_err << RTE_BBDEV_DATA_ERROR;
+	op->status |= rsp.dma_err << RTE_BBDEV_DRV_ERROR;
+	op->status |= rsp.fcw_err << RTE_BBDEV_DRV_ERROR;
+	if (op->status != 0)
+		q_data->queue_stats.dequeue_err_count++;
+
+	op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+	if (op->ldpc_dec.hard_output.length > 0 && !rsp.synd_ok)
+		op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
+	op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
+
+	/* Check if this is the last desc in batch (Atomic Queue) */
+	if (desc->req.last_desc_in_batch) {
+		(*aq_dequeued)++;
+		desc->req.last_desc_in_batch = 0;
+	}
+
+	desc->rsp.val = ACC100_DMA_DESC_TYPE;
+	desc->rsp.add_info_0 = 0;
+	desc->rsp.add_info_1 = 0;
+
+	*ref_op = op;
+
+	/* One CB (op) was successfully dequeued */
+	return 1;
+}
+
+/* Dequeue one decode operation from ACC100 device in TB mode. */
+static inline int
+dequeue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, *last_desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_dec_op *op;
+	uint8_t cbs_in_tb = 1, cb_idx = 0;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Get number of CBs in dequeued TB */
+	cbs_in_tb = desc->req.cbs_in_tb;
+	/* Get last CB */
+	last_desc = q->ring_addr + ((q->sw_ring_tail
+			+ dequeued_cbs + cbs_in_tb - 1)
+			& q->sw_ring_wrap_mask);
+	/* Check if last CB in TB is ready to dequeue (and thus
+	 * the whole TB) - checking sdone bit. If not return.
+	 */
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
+			__ATOMIC_RELAXED);
+	if (!(atom_desc.rsp.val & ACC100_SDONE))
+		return -1;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+
+	/* Read remaining CBs if exists */
+	while (cb_idx < cbs_in_tb) {
+		desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+				& q->sw_ring_wrap_mask);
+		atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+				__ATOMIC_RELAXED);
+		rsp.val = atom_desc.rsp.val;
+		rte_bbdev_log_debug("Resp. desc %p: %x", desc,
+				rsp.val);
+
+		op->status |= ((rsp.input_err)
+				? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+		op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+		op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+		/* Report CRC status only when no other error is set */
+		if (!op->status)
+			op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+		op->turbo_dec.iter_count = RTE_MAX((uint8_t) rsp.iter_cnt,
+				op->turbo_dec.iter_count);
+
+		/* Check if this is the last desc in batch (Atomic Queue) */
+		if (desc->req.last_desc_in_batch) {
+			(*aq_dequeued)++;
+			desc->req.last_desc_in_batch = 0;
+		}
+		desc->rsp.val = ACC100_DMA_DESC_TYPE;
+		desc->rsp.add_info_0 = 0;
+		desc->rsp.add_info_1 = 0;
+		dequeued_cbs++;
+		cb_idx++;
+	}
+
+	*ref_op = op;
+
+	return cb_idx;
+}
+
+/* Dequeue LDPC encode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t dequeue_num, i, dequeued_cbs = 0, dequeued_descs = 0;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = (avail < num) ? avail : num;
+
+	for (i = 0; i < dequeue_num; i++) {
+		ret = dequeue_enc_one_op_cb(q, &ops[dequeued_cbs],
+				dequeued_descs, &aq_dequeued);
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+		dequeued_descs++;
+		if (dequeued_cbs >= num)
+			break;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_descs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += dequeued_cbs;
+
+	return dequeued_cbs;
+}
+
+/* Dequeue decode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint16_t dequeue_num;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t i;
+	uint16_t dequeued_cbs = 0;
+	struct rte_bbdev_dec_op *op;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = (avail < num) ? avail : num;
+
+	for (i = 0; i < dequeue_num; ++i) {
+		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask))->req.op_addr;
+		if (op->ldpc_dec.code_block_mode == 0)
+			ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+		else
+			ret = dequeue_ldpc_dec_one_op_cb(
+					q_data, q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_cbs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += i;
+
+	return i;
+}
+
 /* Initialization Function */
 static void
 acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
@@ -703,6 +2321,10 @@
 	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
 
 	dev->dev_ops = &acc100_bbdev_ops;
+	dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
+	dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
+	dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
+	dev->dequeue_ldpc_dec_ops = acc100_dequeue_ldpc_dec;
 
 	((struct acc100_device *) dev->data->dev_private)->pf_device =
 			!strcmp(drv->driver.name,
@@ -815,4 +2437,3 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
 RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
 RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
 RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
-
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 0e2b79c..78686c1 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -88,6 +88,8 @@
 #define TMPL_PRI_3      0x0f0e0d0c
 #define QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
 #define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
+#define ACC100_FDONE    0x80000000
+#define ACC100_SDONE    0x40000000
 
 #define ACC100_NUM_TMPL  32
 #define VF_OFFSET_QOS 16 /* offset in Memory Space specific to QoS Mon */
@@ -398,6 +400,7 @@ struct __rte_packed acc100_dma_req_desc {
 union acc100_dma_desc {
 	struct acc100_dma_req_desc req;
 	union acc100_dma_rsp_desc rsp;
+	uint64_t atom_hdr;
 };
 
 
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v3 06/11] baseband/acc100: add HARQ loopback support
  2020-08-19  0:25 [dpdk-dev] [PATCH v3 00/11] bbdev PMD ACC100 Nicolas Chautru
                   ` (4 preceding siblings ...)
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions Nicolas Chautru
@ 2020-08-19  0:25 ` Nicolas Chautru
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 07/11] baseband/acc100: add support for 4G processing Nicolas Chautru
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-08-19  0:25 UTC (permalink / raw)
  To: dev, akhil.goyal; +Cc: bruce.richardson, Nicolas Chautru

Additional support for HARQ memory loopback

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 158 +++++++++++++++++++++++++++++++
 1 file changed, 158 insertions(+)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 5f32813..b44b2f5 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -658,6 +658,7 @@
 				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
 				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
 #ifdef ACC100_EXT_MEM
+				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK |
 				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
 				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
 #endif
@@ -1480,12 +1481,169 @@
 	return 1;
 }
 
+static inline int
+harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs) {
+	struct acc100_fcw_ld *fcw;
+	union acc100_dma_desc *desc;
+	int next_triplet = 1;
+	struct rte_mbuf *hq_output_head, *hq_output;
+	uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
+	if (harq_in_length == 0) {
+		rte_bbdev_log(ERR, "Loopback of invalid null size");
+		return -EINVAL;
+	}
+
+	int h_comp = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
+			) ? 1 : 0;
+	if (h_comp == 1)
+		harq_in_length = harq_in_length * 8 / 6;
+	harq_in_length = RTE_ALIGN(harq_in_length, 64);
+	uint16_t harq_dma_length_in = (h_comp == 0) ?
+			harq_in_length :
+			harq_in_length * 6 / 8;
+	uint16_t harq_dma_length_out = harq_dma_length_in;
+	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+	uint16_t harq_index = (ddr_mem_in ?
+			op->ldpc_dec.harq_combined_input.offset :
+			op->ldpc_dec.harq_combined_output.offset)
+			/ ACC100_HARQ_OFFSET;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	fcw = &desc->req.fcw_ld;
+	/* Set the FCW from loopback into DDR */
+	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
+	fcw->FCWversion = ACC100_FCW_VER;
+	fcw->qm = 2;
+	fcw->Zc = 384;
+	if (harq_in_length < 16 * N_ZC_1)
+		fcw->Zc = 16;
+	fcw->ncb = fcw->Zc * N_ZC_1;
+	fcw->rm_e = 2;
+	fcw->hcin_en = 1;
+	fcw->hcout_en = 1;
+
+	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d",
+			ddr_mem_in, harq_index,
+			harq_layout[harq_index].offset, harq_in_length,
+			harq_dma_length_in);
+
+	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
+		fcw->hcin_size0 = harq_layout[harq_index].size0;
+		fcw->hcin_offset = harq_layout[harq_index].offset;
+		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
+		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
+		if (h_comp == 1)
+			harq_dma_length_in = harq_dma_length_in * 6 / 8;
+	} else {
+		fcw->hcin_size0 = harq_in_length;
+	}
+	harq_layout[harq_index].val = 0;
+	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d",
+			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
+	fcw->hcout_size0 = harq_in_length;
+	fcw->hcin_decomp_mode = h_comp;
+	fcw->hcout_comp_mode = h_comp;
+	fcw->gain_i = 1;
+	fcw->gain_h = 1;
+
+	/* Set the prefix of descriptor. This could be done at polling */
+	desc->req.word0 = ACC100_DMA_DESC_TYPE;
+	desc->req.word1 = 0; /* Timestamp could be disabled */
+	desc->req.word2 = 0;
+	desc->req.word3 = 0;
+	desc->req.numCBs = 1;
+
+	/* Null LLR input for Decoder */
+	desc->req.data_ptrs[next_triplet].address =
+			q->lb_in_addr_phys;
+	desc->req.data_ptrs[next_triplet].blen = 2;
+	desc->req.data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
+	desc->req.data_ptrs[next_triplet].last = 0;
+	desc->req.data_ptrs[next_triplet].dma_ext = 0;
+	next_triplet++;
+
+	/* HARQ Combine input from either Memory interface */
+	if (!ddr_mem_in) {
+		next_triplet = acc100_dma_fill_blk_type_out(&desc->req,
+				op->ldpc_dec.harq_combined_input.data,
+				op->ldpc_dec.harq_combined_input.offset,
+				harq_dma_length_in,
+				next_triplet,
+				ACC100_DMA_BLKID_IN_HARQ);
+	} else {
+		desc->req.data_ptrs[next_triplet].address =
+				op->ldpc_dec.harq_combined_input.offset;
+		desc->req.data_ptrs[next_triplet].blen =
+				harq_dma_length_in;
+		desc->req.data_ptrs[next_triplet].blkid =
+				ACC100_DMA_BLKID_IN_HARQ;
+		desc->req.data_ptrs[next_triplet].dma_ext = 1;
+		next_triplet++;
+	}
+	desc->req.data_ptrs[next_triplet - 1].last = 1;
+	desc->req.m2dlen = next_triplet;
+
+	/* Dropped decoder hard output */
+	desc->req.data_ptrs[next_triplet].address =
+			q->lb_out_addr_phys;
+	desc->req.data_ptrs[next_triplet].blen = BYTES_IN_WORD;
+	desc->req.data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_OUT_HARD;
+	desc->req.data_ptrs[next_triplet].last = 0;
+	desc->req.data_ptrs[next_triplet].dma_ext = 0;
+	next_triplet++;
+
+	/* HARQ Combine output to either Memory interface */
+	if (check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE
+			)) {
+		desc->req.data_ptrs[next_triplet].address =
+				op->ldpc_dec.harq_combined_output.offset;
+		desc->req.data_ptrs[next_triplet].blen =
+				harq_dma_length_out;
+		desc->req.data_ptrs[next_triplet].blkid =
+				ACC100_DMA_BLKID_OUT_HARQ;
+		desc->req.data_ptrs[next_triplet].dma_ext = 1;
+		next_triplet++;
+	} else {
+		hq_output_head = op->ldpc_dec.harq_combined_output.data;
+		hq_output = op->ldpc_dec.harq_combined_output.data;
+		next_triplet = acc100_dma_fill_blk_type_out(
+				&desc->req,
+				op->ldpc_dec.harq_combined_output.data,
+				op->ldpc_dec.harq_combined_output.offset,
+				harq_dma_length_out,
+				next_triplet,
+				ACC100_DMA_BLKID_OUT_HARQ);
+		/* HARQ output */
+		mbuf_append(hq_output_head, hq_output, harq_dma_length_out);
+		op->ldpc_dec.harq_combined_output.length =
+				harq_dma_length_out;
+	}
+	desc->req.data_ptrs[next_triplet - 1].last = 1;
+	desc->req.d2mlen = next_triplet - desc->req.m2dlen;
+	desc->req.op_addr = op;
+
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
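As an aside for reviewers: the length arithmetic at the top of harq_loopback() (expand by 8/6 for the 6-bit compressed LLR format, align to 64, then shrink back by 6/8 for the DMA transfer) can be checked in isolation. The sketch below is illustrative only; the helper names are not part of the PMD.

```c
#include <assert.h>
#include <stdint.h>

/* Round len up to the next multiple of align (a power of two),
 * mirroring what RTE_ALIGN() does in the driver.
 */
static inline uint16_t
align_up(uint16_t len, uint16_t align)
{
	return (len + align - 1) & ~(align - 1);
}

/* Expanded HARQ length: 6-bit compressed LLRs occupy 6/8 of a byte
 * each, so the uncompressed equivalent is len * 8 / 6, then aligned
 * up to 64 bytes.
 */
static inline uint16_t
harq_expanded_len(uint16_t len, int h_comp)
{
	if (h_comp)
		len = len * 8 / 6;
	return align_up(len, 64);
}

/* DMA length actually moved over the bus: compressed again by 6/8
 * when compression is enabled.
 */
static inline uint16_t
harq_dma_len(uint16_t expanded_len, int h_comp)
{
	return h_comp ? expanded_len * 6 / 8 : expanded_len;
}
```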
+
 /** Enqueue one decode operations for ACC100 device in CB mode */
 static inline int
 enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
 		uint16_t total_enqueued_cbs, bool same_op)
 {
 	int ret;
+	if (unlikely(check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK))) {
+		ret = harq_loopback(q, op, total_enqueued_cbs);
+		return ret;
+	}
 
 	union acc100_dma_desc *desc;
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v3 07/11] baseband/acc100: add support for 4G processing
  2020-08-19  0:25 [dpdk-dev] [PATCH v3 00/11] bbdev PMD ACC100 Nicolas Chautru
                   ` (5 preceding siblings ...)
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 06/11] baseband/acc100: add HARQ loopback support Nicolas Chautru
@ 2020-08-19  0:25 ` Nicolas Chautru
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 08/11] baseband/acc100: add interrupt support to PMD Nicolas Chautru
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-08-19  0:25 UTC (permalink / raw)
  To: dev, akhil.goyal; +Cc: bruce.richardson, Nicolas Chautru

Add capability for 4G encode and decode processing.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 1010 ++++++++++++++++++++++++++++--
 1 file changed, 943 insertions(+), 67 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index b44b2f5..1de7531 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -339,7 +339,6 @@
 	free_base_addresses(base_addrs, i);
 }
 
-
 /* Allocate 64MB memory used for all software rings */
 static int
 acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
@@ -637,6 +636,41 @@
 
 	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
 		{
+			.type = RTE_BBDEV_OP_TURBO_DEC,
+			.cap.turbo_dec = {
+				.capability_flags =
+					RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE |
+					RTE_BBDEV_TURBO_CRC_TYPE_24B |
+					RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
+					RTE_BBDEV_TURBO_EARLY_TERMINATION |
+					RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
+					RTE_BBDEV_TURBO_MAP_DEC |
+					RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP |
+					RTE_BBDEV_TURBO_DEC_SCATTER_GATHER,
+				.max_llr_modulus = INT8_MAX,
+				.num_buffers_src =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+				.num_buffers_hard_out =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+				.num_buffers_soft_out =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+			}
+		},
+		{
+			.type = RTE_BBDEV_OP_TURBO_ENC,
+			.cap.turbo_enc = {
+				.capability_flags =
+					RTE_BBDEV_TURBO_CRC_24B_ATTACH |
+					RTE_BBDEV_TURBO_RV_INDEX_BYPASS |
+					RTE_BBDEV_TURBO_RATE_MATCH |
+					RTE_BBDEV_TURBO_ENC_SCATTER_GATHER,
+				.num_buffers_src =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+				.num_buffers_dst =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+			}
+		},
+		{
 			.type   = RTE_BBDEV_OP_LDPC_ENC,
 			.cap.ldpc_enc = {
 				.capability_flags =
@@ -719,7 +753,6 @@
 #endif
 }
 
-
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.setup_queues = acc100_setup_queues,
 	.close = acc100_dev_close,
@@ -763,6 +796,58 @@
 	return tail;
 }
 
+/* Fill in a frame control word for turbo encoding. */
+static inline void
+acc100_fcw_te_fill(const struct rte_bbdev_enc_op *op, struct acc100_fcw_te *fcw)
+{
+	fcw->code_block_mode = op->turbo_enc.code_block_mode;
+	if (fcw->code_block_mode == 0) { /* For TB mode */
+		fcw->k_neg = op->turbo_enc.tb_params.k_neg;
+		fcw->k_pos = op->turbo_enc.tb_params.k_pos;
+		fcw->c_neg = op->turbo_enc.tb_params.c_neg;
+		fcw->c = op->turbo_enc.tb_params.c;
+		fcw->ncb_neg = op->turbo_enc.tb_params.ncb_neg;
+		fcw->ncb_pos = op->turbo_enc.tb_params.ncb_pos;
+
+		if (check_bit(op->turbo_enc.op_flags,
+				RTE_BBDEV_TURBO_RATE_MATCH)) {
+			fcw->bypass_rm = 0;
+			fcw->cab = op->turbo_enc.tb_params.cab;
+			fcw->ea = op->turbo_enc.tb_params.ea;
+			fcw->eb = op->turbo_enc.tb_params.eb;
+		} else {
+			/* E is set to the encoding output size when RM is
+			 * bypassed.
+			 */
+			fcw->bypass_rm = 1;
+			fcw->cab = fcw->c_neg;
+			fcw->ea = 3 * fcw->k_neg + 12;
+			fcw->eb = 3 * fcw->k_pos + 12;
+		}
+	} else { /* For CB mode */
+		fcw->k_pos = op->turbo_enc.cb_params.k;
+		fcw->ncb_pos = op->turbo_enc.cb_params.ncb;
+
+		if (check_bit(op->turbo_enc.op_flags,
+				RTE_BBDEV_TURBO_RATE_MATCH)) {
+			fcw->bypass_rm = 0;
+			fcw->eb = op->turbo_enc.cb_params.e;
+		} else {
+			/* E is set to the encoding output size when RM is
+			 * bypassed.
+			 */
+			fcw->bypass_rm = 1;
+			fcw->eb = 3 * fcw->k_pos + 12;
+		}
+	}
+
+	fcw->bypass_rv_idx1 = check_bit(op->turbo_enc.op_flags,
+			RTE_BBDEV_TURBO_RV_INDEX_BYPASS);
+	fcw->code_block_crc = check_bit(op->turbo_enc.op_flags,
+			RTE_BBDEV_TURBO_CRC_24B_ATTACH);
+	fcw->rv_idx1 = op->turbo_enc.rv_index;
+}
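For reference, the `3 * k + 12` expressions above come from the LTE turbo mother code: three output streams (systematic plus two parity), each carrying K + 4 tail bits, so the un-rate-matched output is 3 * (K + 4) = 3K + 12 bits. A minimal sketch of that identity (illustrative helper, not PMD code):

```c
#include <assert.h>
#include <stdint.h>

/* LTE turbo encoder output size in bits when rate matching is
 * bypassed: systematic + two parity streams of K + 4 bits each,
 * i.e. 3 * (K + 4) = 3 * K + 12.
 */
static inline uint32_t
turbo_enc_bypass_e(uint16_t k)
{
	return 3 * (uint32_t)k + 12;
}
```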
+
 /* Compute value of k0.
  * Based on 3GPP 38.212 Table 5.4.2.1-2
  * Starting position of different redundancy versions, k0
@@ -813,6 +898,25 @@
 	fcw->mcb_count = num_cb;
 }
 
+/* Fill in a frame control word for turbo decoding. */
+static inline void
+acc100_fcw_td_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_td *fcw)
+{
+	/* Note: Early termination is always enabled for 4GUL */
+	fcw->fcw_ver = 1;
+	if (op->turbo_dec.code_block_mode == 0)
+		fcw->k_pos = op->turbo_dec.tb_params.k_pos;
+	else
+		fcw->k_pos = op->turbo_dec.cb_params.k;
+	fcw->turbo_crc_type = check_bit(op->turbo_dec.op_flags,
+			RTE_BBDEV_TURBO_CRC_TYPE_24B);
+	fcw->bypass_sb_deint = 0;
+	fcw->raw_decoder_input_on = 0;
+	fcw->max_iter = op->turbo_dec.iter_max;
+	fcw->half_iter_on = !check_bit(op->turbo_dec.op_flags,
+			RTE_BBDEV_TURBO_HALF_ITERATION_EVEN);
+}
+
 /* Fill in a frame control word for LDPC decoding. */
 static inline void
 acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
@@ -1042,6 +1146,87 @@
 }
 
 static inline int
+acc100_dma_desc_te_fill(struct rte_bbdev_enc_op *op,
+		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
+		struct rte_mbuf *output, uint32_t *in_offset,
+		uint32_t *out_offset, uint32_t *out_length,
+		uint32_t *mbuf_total_left, uint32_t *seg_total_left, uint8_t r)
+{
+	int next_triplet = 1; /* FCW already done */
+	uint32_t e, ea, eb, length;
+	uint16_t k, k_neg, k_pos;
+	uint8_t cab, c_neg;
+
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /**< Timestamp could be disabled */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+
+	if (op->turbo_enc.code_block_mode == 0) {
+		ea = op->turbo_enc.tb_params.ea;
+		eb = op->turbo_enc.tb_params.eb;
+		cab = op->turbo_enc.tb_params.cab;
+		k_neg = op->turbo_enc.tb_params.k_neg;
+		k_pos = op->turbo_enc.tb_params.k_pos;
+		c_neg = op->turbo_enc.tb_params.c_neg;
+		e = (r < cab) ? ea : eb;
+		k = (r < c_neg) ? k_neg : k_pos;
+	} else {
+		e = op->turbo_enc.cb_params.e;
+		k = op->turbo_enc.cb_params.k;
+	}
+
+	if (check_bit(op->turbo_enc.op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH))
+		length = (k - 24) >> 3;
+	else
+		length = k >> 3;
+
+	if (unlikely((*mbuf_total_left == 0) || (*mbuf_total_left < length))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, length);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
+			length, seg_total_left, next_triplet);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= length;
+
+	/* Set output length */
+	if (check_bit(op->turbo_enc.op_flags, RTE_BBDEV_TURBO_RATE_MATCH))
+		/* Integer round up division by 8 */
+		*out_length = (e + 7) >> 3;
+	else
+		*out_length = (k >> 3) * 3 + 2;
+
+	next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
+			*out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	op->turbo_enc.output.length += *out_length;
+	*out_offset += *out_length;
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
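The byte-length computations in acc100_dma_desc_te_fill() can be summarized as follows: when CRC24B attach is enabled the device appends the CRC itself, so only K - 24 bits are read from the input mbuf; the output is ceil(E/8) bytes when rate matching runs, else the full mother code rounded up to (K/8)*3 + 2 bytes. The helpers below are an illustrative sketch under those assumptions, not PMD code.

```c
#include <assert.h>
#include <stdint.h>

/* Input bytes read per CB: the CRC24B is generated by the device
 * when CRC attach is enabled, so only K - 24 bits come from the
 * input mbuf.
 */
static inline uint32_t
turbo_enc_in_bytes(uint16_t k, int crc24b_attach)
{
	return (crc24b_attach ? (uint32_t)(k - 24) : k) >> 3;
}

/* Output bytes: ceil(e / 8) when rate matching runs; otherwise the
 * full mother code of 3 * K + 12 bits, rounded up to (K/8)*3 + 2.
 */
static inline uint32_t
turbo_enc_out_bytes(uint16_t k, uint32_t e, int rate_match)
{
	return rate_match ? (e + 7) >> 3 : ((uint32_t)k >> 3) * 3 + 2;
}
```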
+
+static inline int
 acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
 		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
 		struct rte_mbuf *output, uint32_t *in_offset,
@@ -1110,6 +1295,117 @@
 }
 
 static inline int
+acc100_dma_desc_td_fill(struct rte_bbdev_dec_op *op,
+		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
+		struct rte_mbuf *h_output, struct rte_mbuf *s_output,
+		uint32_t *in_offset, uint32_t *h_out_offset,
+		uint32_t *s_out_offset, uint32_t *h_out_length,
+		uint32_t *s_out_length, uint32_t *mbuf_total_left,
+		uint32_t *seg_total_left, uint8_t r)
+{
+	int next_triplet = 1; /* FCW already done */
+	uint16_t k;
+	uint16_t crc24_overlap = 0;
+	uint32_t e, kw;
+
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /**< Timestamp could be disabled */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+
+	if (op->turbo_dec.code_block_mode == 0) {
+		k = (r < op->turbo_dec.tb_params.c_neg)
+			? op->turbo_dec.tb_params.k_neg
+			: op->turbo_dec.tb_params.k_pos;
+		e = (r < op->turbo_dec.tb_params.cab)
+			? op->turbo_dec.tb_params.ea
+			: op->turbo_dec.tb_params.eb;
+	} else {
+		k = op->turbo_dec.cb_params.k;
+		e = op->turbo_dec.cb_params.e;
+	}
+
+	if ((op->turbo_dec.code_block_mode == 0)
+		&& !check_bit(op->turbo_dec.op_flags,
+		RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP))
+		crc24_overlap = 24;
+
+	/* Calculate the circular buffer size.
+	 * According to 3GPP TS 36.212 section 5.1.4.2:
+	 *   Kw = 3 * Kpi,
+	 * where:
+	 *   Kpi = nCol * nRow
+	 * where nCol is 32 and nRow can be calculated from:
+	 *   D <= nCol * nRow
+	 * where D is the size of each output from turbo encoder block (k + 4).
+	 */
+	kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
+
+	if (unlikely((*mbuf_total_left == 0) || (*mbuf_total_left < kw))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, kw);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset, kw,
+			seg_total_left, next_triplet);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= kw;
+
+	next_triplet = acc100_dma_fill_blk_type_out(
+			desc, h_output, *h_out_offset,
+			k >> 3, next_triplet, ACC100_DMA_BLKID_OUT_HARD);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+
+	*h_out_length = ((k - crc24_overlap) >> 3);
+	op->turbo_dec.hard_output.length += *h_out_length;
+	*h_out_offset += *h_out_length;
+
+	/* Soft output */
+	if (check_bit(op->turbo_dec.op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT)) {
+		if (check_bit(op->turbo_dec.op_flags,
+				RTE_BBDEV_TURBO_EQUALIZER))
+			*s_out_length = e;
+		else
+			*s_out_length = (k * 3) + 12;
+
+		next_triplet = acc100_dma_fill_blk_type_out(desc, s_output,
+				*s_out_offset, *s_out_length, next_triplet,
+				ACC100_DMA_BLKID_OUT_SOFT);
+		if (unlikely(next_triplet < 0)) {
+			rte_bbdev_log(ERR,
+					"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+					op);
+			return -1;
+		}
+
+		op->turbo_dec.soft_output.length += *s_out_length;
+		*s_out_offset += *s_out_length;
+	}
+
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
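The circular buffer size Kw computed above follows from the comment in the code: D = K + 4 bits per encoder output stream are laid out in nCol = 32 columns, so Kpi is D rounded up to a multiple of 32 and Kw = 3 * Kpi. A standalone sketch mirroring the `kw` computation in acc100_dma_desc_td_fill() (illustrative only):

```c
#include <assert.h>
#include <stdint.h>

/* Turbo decoder circular buffer size in bits: round D = K + 4 up to
 * a multiple of the 32 sub-block interleaver columns, then multiply
 * by the three streams. Equivalent to RTE_ALIGN_CEIL(k + 4, 32) * 3.
 */
static inline uint32_t
turbo_dec_kw(uint16_t k)
{
	uint32_t d = (uint32_t)k + 4;
	uint32_t kpi = ((d + 31) / 32) * 32;
	return 3 * kpi;
}
```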
+
+static inline int
 acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
 		struct acc100_dma_req_desc *desc,
 		struct rte_mbuf **input, struct rte_mbuf *h_output,
@@ -1374,6 +1670,57 @@
 
 /* Enqueue one encode operations for ACC100 device in CB mode */
 static inline int
+enqueue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+		uint16_t total_enqueued_cbs)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+		seg_total_left;
+	struct rte_mbuf *input, *output_head, *output;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_te_fill(op, &desc->req.fcw_te);
+
+	input = op->turbo_enc.input.data;
+	output_head = output = op->turbo_enc.output.data;
+	in_offset = op->turbo_enc.input.offset;
+	out_offset = op->turbo_enc.output.offset;
+	out_length = 0;
+	mbuf_total_left = op->turbo_enc.input.length;
+	seg_total_left = rte_pktmbuf_data_len(op->turbo_enc.input.data)
+			- in_offset;
+
+	ret = acc100_dma_desc_te_fill(op, &desc->req, &input, output,
+			&in_offset, &out_offset, &out_length, &mbuf_total_left,
+			&seg_total_left, 0);
+
+	if (unlikely(ret < 0))
+		return ret;
+
+	mbuf_append(output_head, output, out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_te,
+			sizeof(desc->req.fcw_te) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+
+	/* Check if any data left after processing one CB */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left after processing one CB: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+/* Enqueue multiple LDPC encode operations for ACC100 device in CB mode */
+static inline int
 enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
 		uint16_t total_enqueued_cbs, int16_t num)
 {
@@ -1481,78 +1828,235 @@
 	return 1;
 }
 
-static inline int
-harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
-		uint16_t total_enqueued_cbs) {
-	struct acc100_fcw_ld *fcw;
-	union acc100_dma_desc *desc;
-	int next_triplet = 1;
-	struct rte_mbuf *hq_output_head, *hq_output;
-	uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
-	if (harq_in_length == 0) {
-		rte_bbdev_log(ERR, "Loopback of invalid null size\n");
-		return -EINVAL;
-	}
 
-	int h_comp = check_bit(op->ldpc_dec.op_flags,
-			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
-			) ? 1 : 0;
-	if (h_comp == 1)
-		harq_in_length = harq_in_length * 8 / 6;
-	harq_in_length = RTE_ALIGN(harq_in_length, 64);
-	uint16_t harq_dma_length_in = (h_comp == 0) ?
-			harq_in_length :
-			harq_in_length * 6 / 8;
-	uint16_t harq_dma_length_out = harq_dma_length_in;
-	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
-			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
-	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
-	uint16_t harq_index = (ddr_mem_in ?
-			op->ldpc_dec.harq_combined_input.offset :
-			op->ldpc_dec.harq_combined_output.offset)
-			/ ACC100_HARQ_OFFSET;
+/* Enqueue one encode operation for ACC100 device in TB mode. */
+static inline int
+enqueue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint8_t r, c;
+	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+		seg_total_left;
+	struct rte_mbuf *input, *output_head, *output;
+	uint16_t current_enqueued_cbs = 0;
 
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
-	fcw = &desc->req.fcw_ld;
-	/* Set the FCW from loopback into DDR */
-	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
-	fcw->FCWversion = ACC100_FCW_VER;
-	fcw->qm = 2;
-	fcw->Zc = 384;
-	if (harq_in_length < 16 * N_ZC_1)
-		fcw->Zc = 16;
-	fcw->ncb = fcw->Zc * N_ZC_1;
-	fcw->rm_e = 2;
-	fcw->hcin_en = 1;
-	fcw->hcout_en = 1;
+	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+	acc100_fcw_te_fill(op, &desc->req.fcw_te);
 
-	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d\n",
-			ddr_mem_in, harq_index,
-			harq_layout[harq_index].offset, harq_in_length,
-			harq_dma_length_in);
+	input = op->turbo_enc.input.data;
+	output_head = output = op->turbo_enc.output.data;
+	in_offset = op->turbo_enc.input.offset;
+	out_offset = op->turbo_enc.output.offset;
+	out_length = 0;
+	mbuf_total_left = op->turbo_enc.input.length;
 
-	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
-		fcw->hcin_size0 = harq_layout[harq_index].size0;
-		fcw->hcin_offset = harq_layout[harq_index].offset;
-		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
-		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
-		if (h_comp == 1)
-			harq_dma_length_in = harq_dma_length_in * 6 / 8;
-	} else {
-		fcw->hcin_size0 = harq_in_length;
-	}
-	harq_layout[harq_index].val = 0;
-	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
-			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
-	fcw->hcout_size0 = harq_in_length;
-	fcw->hcin_decomp_mode = h_comp;
-	fcw->hcout_comp_mode = h_comp;
-	fcw->gain_i = 1;
-	fcw->gain_h = 1;
+	c = op->turbo_enc.tb_params.c;
+	r = op->turbo_enc.tb_params.r;
 
-	/* Set the prefix of descriptor. This could be done at polling */
+	while (mbuf_total_left > 0 && r < c) {
+		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+		/* Set up DMA descriptor */
+		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+				& q->sw_ring_wrap_mask);
+		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+		desc->req.data_ptrs[0].blen = ACC100_FCW_TE_BLEN;
+
+		ret = acc100_dma_desc_te_fill(op, &desc->req, &input, output,
+				&in_offset, &out_offset, &out_length,
+				&mbuf_total_left, &seg_total_left, r);
+		if (unlikely(ret < 0))
+			return ret;
+		mbuf_append(output_head, output, out_length);
+
+		/* Set total number of CBs in TB */
+		desc->req.cbs_in_tb = cbs_in_tb;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "FCW", &desc->req.fcw_te,
+				sizeof(desc->req.fcw_te) - 8);
+		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+		if (seg_total_left == 0) {
+			/* Go to the next mbuf */
+			input = input->next;
+			in_offset = 0;
+			output = output->next;
+			out_offset = 0;
+		}
+
+		total_enqueued_cbs++;
+		current_enqueued_cbs++;
+		r++;
+	}
+
+	if (unlikely(desc == NULL))
+		return current_enqueued_cbs;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Check if any CBs left for processing */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left for processing: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+
+	/* Set SDone on last CB descriptor for TB mode. */
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	return current_enqueued_cbs;
+}
+
+/* Enqueue one decode operation for ACC100 device in CB mode */
+static inline int
+enqueue_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint32_t in_offset, h_out_offset, s_out_offset, s_out_length,
+		h_out_length, mbuf_total_left, seg_total_left;
+	struct rte_mbuf *input, *h_output_head, *h_output,
+		*s_output_head, *s_output;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_td_fill(op, &desc->req.fcw_td);
+
+	input = op->turbo_dec.input.data;
+	h_output_head = h_output = op->turbo_dec.hard_output.data;
+	s_output_head = s_output = op->turbo_dec.soft_output.data;
+	in_offset = op->turbo_dec.input.offset;
+	h_out_offset = op->turbo_dec.hard_output.offset;
+	s_out_offset = op->turbo_dec.soft_output.offset;
+	h_out_length = s_out_length = 0;
+	mbuf_total_left = op->turbo_dec.input.length;
+	seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(input == NULL)) {
+		rte_bbdev_log(ERR, "Invalid mbuf pointer");
+		return -EFAULT;
+	}
+#endif
+
+	/* Set up DMA descriptor */
+	desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+
+	ret = acc100_dma_desc_td_fill(op, &desc->req, &input, h_output,
+			s_output, &in_offset, &h_out_offset, &s_out_offset,
+			&h_out_length, &s_out_length, &mbuf_total_left,
+			&seg_total_left, 0);
+
+	if (unlikely(ret < 0))
+		return ret;
+
+	/* Hard output */
+	mbuf_append(h_output_head, h_output, h_out_length);
+
+	/* Soft output */
+	if (check_bit(op->turbo_dec.op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT))
+		mbuf_append(s_output_head, s_output, s_out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_td,
+			sizeof(desc->req.fcw_td) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+
+	/* Check if any CBs left for processing */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left after processing one CB: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+static inline int
+harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs) {
+	struct acc100_fcw_ld *fcw;
+	union acc100_dma_desc *desc;
+	int next_triplet = 1;
+	struct rte_mbuf *hq_output_head, *hq_output;
+	uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
+	if (harq_in_length == 0) {
+		rte_bbdev_log(ERR, "Loopback of invalid null size\n");
+		return -EINVAL;
+	}
+
+	int h_comp = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
+			) ? 1 : 0;
+	if (h_comp == 1)
+		harq_in_length = harq_in_length * 8 / 6;
+	harq_in_length = RTE_ALIGN(harq_in_length, 64);
+	uint16_t harq_dma_length_in = (h_comp == 0) ?
+			harq_in_length :
+			harq_in_length * 6 / 8;
+	uint16_t harq_dma_length_out = harq_dma_length_in;
+	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+	uint16_t harq_index = (ddr_mem_in ?
+			op->ldpc_dec.harq_combined_input.offset :
+			op->ldpc_dec.harq_combined_output.offset)
+			/ ACC100_HARQ_OFFSET;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	fcw = &desc->req.fcw_ld;
+	/* Set the FCW from loopback into DDR */
+	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
+	fcw->FCWversion = ACC100_FCW_VER;
+	fcw->qm = 2;
+	fcw->Zc = 384;
+	if (harq_in_length < 16 * N_ZC_1)
+		fcw->Zc = 16;
+	fcw->ncb = fcw->Zc * N_ZC_1;
+	fcw->rm_e = 2;
+	fcw->hcin_en = 1;
+	fcw->hcout_en = 1;
+
+	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d\n",
+			ddr_mem_in, harq_index,
+			harq_layout[harq_index].offset, harq_in_length,
+			harq_dma_length_in);
+
+	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
+		fcw->hcin_size0 = harq_layout[harq_index].size0;
+		fcw->hcin_offset = harq_layout[harq_index].offset;
+		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
+		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
+		if (h_comp == 1)
+			harq_dma_length_in = harq_dma_length_in * 6 / 8;
+	} else {
+		fcw->hcin_size0 = harq_in_length;
+	}
+	harq_layout[harq_index].val = 0;
+	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
+			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
+	fcw->hcout_size0 = harq_in_length;
+	fcw->hcin_decomp_mode = h_comp;
+	fcw->hcout_comp_mode = h_comp;
+	fcw->gain_i = 1;
+	fcw->gain_h = 1;
+
+	/* Set the prefix of descriptor. This could be done at polling */
 	desc->req.word0 = ACC100_DMA_DESC_TYPE;
 	desc->req.word1 = 0; /**< Timestamp could be disabled */
 	desc->req.word2 = 0;
@@ -1816,6 +2320,107 @@
 	return current_enqueued_cbs;
 }
 
+/* Enqueue one decode operation for ACC100 device in TB mode */
+static inline int
+enqueue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint8_t r, c;
+	uint32_t in_offset, h_out_offset, s_out_offset, s_out_length,
+		h_out_length, mbuf_total_left, seg_total_left;
+	struct rte_mbuf *input, *h_output_head, *h_output,
+		*s_output_head, *s_output;
+	uint16_t current_enqueued_cbs = 0;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+	acc100_fcw_td_fill(op, &desc->req.fcw_td);
+
+	input = op->turbo_dec.input.data;
+	h_output_head = h_output = op->turbo_dec.hard_output.data;
+	s_output_head = s_output = op->turbo_dec.soft_output.data;
+	in_offset = op->turbo_dec.input.offset;
+	h_out_offset = op->turbo_dec.hard_output.offset;
+	s_out_offset = op->turbo_dec.soft_output.offset;
+	h_out_length = s_out_length = 0;
+	mbuf_total_left = op->turbo_dec.input.length;
+	c = op->turbo_dec.tb_params.c;
+	r = op->turbo_dec.tb_params.r;
+
+	while (mbuf_total_left > 0 && r < c) {
+
+		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+
+		/* Set up DMA descriptor */
+		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+				& q->sw_ring_wrap_mask);
+		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+		desc->req.data_ptrs[0].blen = ACC100_FCW_TD_BLEN;
+		ret = acc100_dma_desc_td_fill(op, &desc->req, &input,
+				h_output, s_output, &in_offset, &h_out_offset,
+				&s_out_offset, &h_out_length, &s_out_length,
+				&mbuf_total_left, &seg_total_left, r);
+
+		if (unlikely(ret < 0))
+			return ret;
+
+		/* Hard output */
+		mbuf_append(h_output_head, h_output, h_out_length);
+
+		/* Soft output */
+		if (check_bit(op->turbo_dec.op_flags,
+				RTE_BBDEV_TURBO_SOFT_OUTPUT))
+			mbuf_append(s_output_head, s_output, s_out_length);
+
+		/* Set total number of CBs in TB */
+		desc->req.cbs_in_tb = cbs_in_tb;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "FCW", &desc->req.fcw_td,
+				sizeof(desc->req.fcw_td) - 8);
+		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+		if (seg_total_left == 0) {
+			/* Go to the next mbuf */
+			input = input->next;
+			in_offset = 0;
+			h_output = h_output->next;
+			h_out_offset = 0;
+
+			if (check_bit(op->turbo_dec.op_flags,
+					RTE_BBDEV_TURBO_SOFT_OUTPUT)) {
+				s_output = s_output->next;
+				s_out_offset = 0;
+			}
+		}
+
+		total_enqueued_cbs++;
+		current_enqueued_cbs++;
+		r++;
+	}
+
+	if (unlikely(desc == NULL))
+		return current_enqueued_cbs;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Check if any CBs left for processing */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left for processing: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+	/* Set SDone on last CB descriptor for TB mode */
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	return current_enqueued_cbs;
+}
 
 /* Calculates number of CBs in processed encoder TB based on 'r' and input
  * length.
@@ -1893,6 +2498,45 @@
 	return cbs_in_tb;
 }
 
+/* Enqueue encode operations for ACC100 device in CB mode. */
+static uint16_t
+acc100_enqueue_enc_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i;
+	union acc100_dma_desc *desc;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		/* Check if there is space available for further processing */
+		if (unlikely(avail - 1 < 0))
+			break;
+		avail -= 1;
+
+		ret = enqueue_enc_one_op_cb(q, ops[i], i);
+		if (ret < 0)
+			break;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
+			& q->sw_ring_wrap_mask);
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, i, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+	return i;
+}
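The availability check in the enqueue functions treats `sw_ring_head` and `sw_ring_tail` as free-running counters: their difference against the ring depth gives the free descriptors, and indices wrap via `sw_ring_wrap_mask`, which assumes a power-of-two ring depth. A minimal sketch of that accounting (the struct and helper names below are hypothetical, not the PMD's):

```c
#include <assert.h>
#include <stdint.h>

/* Free-running enqueue/dequeue counters; depth must be a power of
 * two so that (depth - 1) can serve as the wrap mask.
 */
struct sw_ring {
	uint16_t depth;
	uint16_t head; /* total enqueued */
	uint16_t tail; /* total dequeued */
};

/* Descriptors still free: ring capacity minus in-flight entries. */
static inline int32_t
ring_avail(const struct sw_ring *r)
{
	return (int32_t)r->depth + r->tail - r->head;
}

/* Physical slot for the next descriptor at a given offset. */
static inline uint16_t
ring_idx(const struct sw_ring *r, uint16_t offset)
{
	return (r->head + offset) & (r->depth - 1);
}
```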
+
 /* Check we can mux encode operations with common FCW */
 static inline bool
 check_mux(struct rte_bbdev_enc_op **ops, uint16_t num) {
@@ -1960,6 +2604,52 @@
 	return i;
 }
 
+/* Enqueue encode operations for ACC100 device in TB mode. */
+static uint16_t
+acc100_enqueue_enc_tb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i, enqueued_cbs = 0;
+	uint8_t cbs_in_tb;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		cbs_in_tb = get_num_cbs_in_tb_enc(&ops[i]->turbo_enc);
+		/* Check if there is space available for further processing */
+		if (unlikely(avail - cbs_in_tb < 0))
+			break;
+		avail -= cbs_in_tb;
+
+		ret = enqueue_enc_one_op_tb(q, ops[i], enqueued_cbs, cbs_in_tb);
+		if (ret < 0)
+			break;
+		enqueued_cbs += ret;
+	}
+
+	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
+}
+
+/* Enqueue encode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	if (unlikely(num == 0))
+		return 0;
+	if (ops[0]->turbo_enc.code_block_mode == 0)
+		return acc100_enqueue_enc_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_enc_cb(q_data, ops, num);
+}
+
 /* Enqueue encode operations for ACC100 device. */
 static uint16_t
 acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
@@ -1967,7 +2657,51 @@
 {
 	if (unlikely(num == 0))
 		return 0;
-	return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
+	if (ops[0]->ldpc_enc.code_block_mode == 0)
+		return acc100_enqueue_enc_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
+}
+
+
+/* Enqueue decode operations for ACC100 device in CB mode */
+static uint16_t
+acc100_enqueue_dec_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i;
+	union acc100_dma_desc *desc;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		/* Check if there is enough space for further processing */
+		if (unlikely(avail - 1 < 0))
+			break;
+		avail -= 1;
+
+		ret = enqueue_dec_one_op_cb(q, ops[i], i);
+		if (ret < 0)
+			break;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
+			& q->sw_ring_wrap_mask);
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, i, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
 }
 
 /* Check we can mux encode operations with common FCW */
@@ -2065,6 +2799,53 @@
 	return i;
 }
 
+
+/* Enqueue decode operations for ACC100 device in TB mode */
+static uint16_t
+acc100_enqueue_dec_tb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i, enqueued_cbs = 0;
+	uint8_t cbs_in_tb;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		cbs_in_tb = get_num_cbs_in_tb_dec(&ops[i]->turbo_dec);
+		/* Check if there is enough space for further processing */
+		if (unlikely(avail - cbs_in_tb < 0))
+			break;
+		avail -= cbs_in_tb;
+
+		ret = enqueue_dec_one_op_tb(q, ops[i], enqueued_cbs, cbs_in_tb);
+		if (ret < 0)
+			break;
+		enqueued_cbs += ret;
+	}
+
+	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
+}
+
+/* Enqueue decode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	if (unlikely(num == 0))
+		return 0;
+	if (ops[0]->turbo_dec.code_block_mode == 0)
+		return acc100_enqueue_dec_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_dec_cb(q_data, ops, num);
+}
+
 /* Enqueue LDPC decode operations for ACC100 device. */
 static uint16_t
 acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
@@ -2388,6 +3169,51 @@
 	return cb_idx;
 }
 
+/* Dequeue encode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint16_t dequeue_num;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t i;
+	uint16_t dequeued_cbs = 0;
+	struct rte_bbdev_enc_op *op;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = (avail < num) ? avail : num;
+
+	for (i = 0; i < dequeue_num; ++i) {
+		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask))->req.op_addr;
+		if (op->turbo_enc.code_block_mode == 0)
+			ret = dequeue_enc_one_op_tb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+		else
+			ret = dequeue_enc_one_op_cb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_cbs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += i;
+
+	return i;
+}
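The `sw_ring_wrap_mask` arithmetic above maps a free-running counter onto a physical descriptor slot. A minimal sketch of that mapping, assuming a power-of-two ring depth (names illustrative):

```c
#include <assert.h>
#include <stdint.h>

/* Map a free-running counter onto a physical ring slot; the wrap mask
 * is depth - 1, so depth must be a power of two.
 */
static uint16_t ring_slot(uint16_t counter, uint16_t depth)
{
	return counter & (depth - 1);
}
```

Because the counter only ever increments (and wraps naturally at 16 bits), head minus tail stays a valid occupancy count even across wrap-around.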
+
 /* Dequeue LDPC encode operations from ACC100 device. */
 static uint16_t
 acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
@@ -2426,6 +3252,52 @@
 	return dequeued_cbs;
 }
 
+
+/* Dequeue decode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint16_t dequeue_num;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t i;
+	uint16_t dequeued_cbs = 0;
+	struct rte_bbdev_dec_op *op;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = (avail < num) ? avail : num;
+
+	for (i = 0; i < dequeue_num; ++i) {
+		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask))->req.op_addr;
+		if (op->turbo_dec.code_block_mode == 0)
+			ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+		else
+			ret = dequeue_dec_one_op_cb(q_data, q, &ops[i],
+					dequeued_cbs, &aq_dequeued);
+
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_cbs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += i;
+
+	return i;
+}
+
 /* Dequeue LDPC decode operations from ACC100 device. */
 static uint16_t
 acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
@@ -2479,6 +3351,10 @@
 	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
 
 	dev->dev_ops = &acc100_bbdev_ops;
+	dev->enqueue_enc_ops = acc100_enqueue_enc;
+	dev->enqueue_dec_ops = acc100_enqueue_dec;
+	dev->dequeue_enc_ops = acc100_dequeue_enc;
+	dev->dequeue_dec_ops = acc100_dequeue_dec;
 	dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
 	dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
 	dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v3 08/11] baseband/acc100: add interrupt support to PMD
  2020-08-19  0:25 [dpdk-dev] [PATCH v3 00/11] bbdev PMD ACC100 Nicolas Chautru
                   ` (6 preceding siblings ...)
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 07/11] baseband/acc100: add support for 4G processing Nicolas Chautru
@ 2020-08-19  0:25 ` Nicolas Chautru
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 09/11] baseband/acc100: add debug function to validate input Nicolas Chautru
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-08-19  0:25 UTC (permalink / raw)
  To: dev, akhil.goyal; +Cc: bruce.richardson, Nicolas Chautru

Add capability and functions to support MSI
interrupts, callbacks and the info ring.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 288 ++++++++++++++++++++++++++++++-
 drivers/baseband/acc100/rte_acc100_pmd.h |  15 ++
 2 files changed, 300 insertions(+), 3 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 1de7531..ba8e1d8 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -339,6 +339,213 @@
 	free_base_addresses(base_addrs, i);
 }
 
+/*
+ * Find queue_id of a device queue based on details from the Info Ring.
+ * If a queue isn't found, UINT16_MAX is returned.
+ */
+static inline uint16_t
+get_queue_id_from_ring_info(struct rte_bbdev_data *data,
+		const union acc100_info_ring_data ring_data)
+{
+	uint16_t queue_id;
+
+	for (queue_id = 0; queue_id < data->num_queues; ++queue_id) {
+		struct acc100_queue *acc100_q =
+				data->queues[queue_id].queue_private;
+		if (acc100_q != NULL && acc100_q->aq_id == ring_data.aq_id &&
+				acc100_q->qgrp_id == ring_data.qg_id &&
+				acc100_q->vf_id == ring_data.vf_id)
+			return queue_id;
+	}
+
+	return UINT16_MAX;
+}
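The lookup above matches the (aq_id, qg_id, vf_id) triple from an info-ring entry against every configured queue. A standalone sketch of the same scan, using illustrative structures rather than the PMD's:

```c
#include <assert.h>
#include <stdint.h>

/* Queue identity as reported by an info-ring entry. */
struct q_ids {
	uint16_t aq, qg, vf;
};

/* Linear scan for the matching queue id; UINT16_MAX when absent. */
static uint16_t find_queue(const struct q_ids *qs, uint16_t n,
		uint16_t aq, uint16_t qg, uint16_t vf)
{
	uint16_t i;

	for (i = 0; i < n; i++)
		if (qs[i].aq == aq && qs[i].qg == qg && qs[i].vf == vf)
			return i;
	return UINT16_MAX;
}

/* Fixed example: two queues, look up the second one. */
static uint16_t fq_demo(void)
{
	const struct q_ids qs[2] = { {1, 2, 0}, {3, 2, 0} };

	return find_queue(qs, 2, 3, 2, 0);
}

/* Fixed example: no queue carries these IDs. */
static uint16_t fq_miss(void)
{
	const struct q_ids qs[1] = { {1, 2, 0} };

	return find_queue(qs, 1, 9, 9, 9);
}
```

A linear scan is acceptable here because it only runs on the interrupt path and the number of queues per function is small.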
+
+/* Checks PF Info Ring to find the interrupt cause and handles it accordingly */
+static inline void
+acc100_check_ir(struct acc100_device *acc100_dev)
+{
+	volatile union acc100_info_ring_data *ring_data;
+	uint16_t info_ring_head = acc100_dev->info_ring_head;
+	if (acc100_dev->info_ring == NULL)
+		return;
+
+	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
+			ACC100_INFO_RING_MASK);
+
+	while (ring_data->valid) {
+		if ((ring_data->int_nb < ACC100_PF_INT_DMA_DL_DESC_IRQ) || (
+				ring_data->int_nb >
+				ACC100_PF_INT_DMA_DL5G_DESC_IRQ))
+			rte_bbdev_log(WARNING, "InfoRing: ITR:%d Info:0x%x",
+				ring_data->int_nb, ring_data->detailed_info);
+		/* Initialize Info Ring entry and move forward */
+		ring_data->val = 0;
+		info_ring_head++;
+		ring_data = acc100_dev->info_ring +
+				(info_ring_head & ACC100_INFO_RING_MASK);
+	}
+}
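The consume pattern above -- test the valid flag, clear the entry, advance a free-running head under the ring mask -- can be sketched standalone (sizes and names are illustrative, not the PMD's):

```c
#include <assert.h>
#include <stdint.h>

#define IR_ENTRIES 16
#define IR_MASK (IR_ENTRIES - 1)

struct ir_entry {
	uint32_t valid;
	uint32_t val;
};

/* Drain valid entries, clearing each and advancing the head;
 * returns the number of entries consumed.
 */
static int ir_drain(struct ir_entry *ring, uint16_t *head)
{
	int n = 0;
	struct ir_entry *e = &ring[*head & IR_MASK];

	while (e->valid) {
		e->valid = 0;
		e->val = 0;
		(*head)++;
		e = &ring[*head & IR_MASK];
		n++;
	}
	return n;
}

/* Three entries pending starting near the end of the ring: the head
 * keeps counting up while the slot index wraps back to 0.
 */
static int ir_demo(void)
{
	struct ir_entry ring[IR_ENTRIES] = {0};
	uint16_t head = 14;

	ring[14].valid = 1;
	ring[15].valid = 1;
	ring[0].valid = 1;
	return ir_drain(ring, &head) == 3 && head == 17;
}
```

Clearing each entry before advancing is what lets the hardware reuse the slot and lets a later pass stop at the first invalid entry.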
+
+/* Checks PF Info Ring to find the interrupt cause and handles it accordingly */
+static inline void
+acc100_pf_interrupt_handler(struct rte_bbdev *dev)
+{
+	struct acc100_device *acc100_dev = dev->data->dev_private;
+	volatile union acc100_info_ring_data *ring_data;
+	struct acc100_deq_intr_details deq_intr_det;
+
+	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
+			ACC100_INFO_RING_MASK);
+
+	while (ring_data->valid) {
+
+		rte_bbdev_log_debug(
+				"ACC100 PF Interrupt received, Info Ring data: 0x%x",
+				ring_data->val);
+
+		switch (ring_data->int_nb) {
+		case ACC100_PF_INT_DMA_DL_DESC_IRQ:
+		case ACC100_PF_INT_DMA_UL_DESC_IRQ:
+		case ACC100_PF_INT_DMA_UL5G_DESC_IRQ:
+		case ACC100_PF_INT_DMA_DL5G_DESC_IRQ:
+			deq_intr_det.queue_id = get_queue_id_from_ring_info(
+					dev->data, *ring_data);
+			if (deq_intr_det.queue_id == UINT16_MAX) {
+				rte_bbdev_log(ERR,
+						"Couldn't find queue: aq_id: %u, qg_id: %u, vf_id: %u",
+						ring_data->aq_id,
+						ring_data->qg_id,
+						ring_data->vf_id);
+				return;
+			}
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_DEQUEUE, &deq_intr_det);
+			break;
+		default:
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_ERROR, NULL);
+			break;
+		}
+
+		/* Initialize Info Ring entry and move forward */
+		ring_data->val = 0;
+		++acc100_dev->info_ring_head;
+		ring_data = acc100_dev->info_ring +
+				(acc100_dev->info_ring_head &
+				ACC100_INFO_RING_MASK);
+	}
+}
+
+/* Checks VF Info Ring to find the interrupt cause and handles it accordingly */
+static inline void
+acc100_vf_interrupt_handler(struct rte_bbdev *dev)
+{
+	struct acc100_device *acc100_dev = dev->data->dev_private;
+	volatile union acc100_info_ring_data *ring_data;
+	struct acc100_deq_intr_details deq_intr_det;
+
+	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
+			ACC100_INFO_RING_MASK);
+
+	while (ring_data->valid) {
+
+		rte_bbdev_log_debug(
+				"ACC100 VF Interrupt received, Info Ring data: 0x%x",
+				ring_data->val);
+
+		switch (ring_data->int_nb) {
+		case ACC100_VF_INT_DMA_DL_DESC_IRQ:
+		case ACC100_VF_INT_DMA_UL_DESC_IRQ:
+		case ACC100_VF_INT_DMA_UL5G_DESC_IRQ:
+		case ACC100_VF_INT_DMA_DL5G_DESC_IRQ:
+			/* VFs are not aware of their vf_id - it's set to 0 in
+			 * queue structures.
+			 */
+			ring_data->vf_id = 0;
+			deq_intr_det.queue_id = get_queue_id_from_ring_info(
+					dev->data, *ring_data);
+			if (deq_intr_det.queue_id == UINT16_MAX) {
+				rte_bbdev_log(ERR,
+						"Couldn't find queue: aq_id: %u, qg_id: %u",
+						ring_data->aq_id,
+						ring_data->qg_id);
+				return;
+			}
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_DEQUEUE, &deq_intr_det);
+			break;
+		default:
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_ERROR, NULL);
+			break;
+		}
+
+		/* Initialize Info Ring entry and move forward */
+		ring_data->valid = 0;
+		++acc100_dev->info_ring_head;
+		ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head
+				& ACC100_INFO_RING_MASK);
+	}
+}
+
+/* Interrupt handler triggered by ACC100 dev for handling specific interrupt */
+static void
+acc100_dev_interrupt_handler(void *cb_arg)
+{
+	struct rte_bbdev *dev = cb_arg;
+	struct acc100_device *acc100_dev = dev->data->dev_private;
+
+	/* Read info ring */
+	if (acc100_dev->pf_device)
+		acc100_pf_interrupt_handler(dev);
+	else
+		acc100_vf_interrupt_handler(dev);
+}
+
+/* Allocate and set up the info ring */
+static int
+allocate_inforing(struct rte_bbdev *dev)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	const struct acc100_registry_addr *reg_addr;
+	rte_iova_t info_ring_phys;
+	uint32_t phys_low, phys_high;
+
+	if (d->info_ring != NULL)
+		return 0; /* Already configured */
+
+	/* Choose correct registry addresses for the device type */
+	if (d->pf_device)
+		reg_addr = &pf_reg_addr;
+	else
+		reg_addr = &vf_reg_addr;
+	/* Allocate InfoRing */
+	d->info_ring = rte_zmalloc_socket("Info Ring",
+			ACC100_INFO_RING_NUM_ENTRIES *
+			sizeof(*d->info_ring), RTE_CACHE_LINE_SIZE,
+			dev->data->socket_id);
+	if (d->info_ring == NULL) {
+		rte_bbdev_log(ERR,
+				"Failed to allocate Info Ring for %s:%u",
+				dev->device->driver->name,
+				dev->data->dev_id);
+		return -ENOMEM;
+	}
+	info_ring_phys = rte_malloc_virt2iova(d->info_ring);
+
+	/* Setup Info Ring */
+	phys_high = (uint32_t)(info_ring_phys >> 32);
+	phys_low  = (uint32_t)(info_ring_phys);
+	acc100_reg_write(d, reg_addr->info_ring_hi, phys_high);
+	acc100_reg_write(d, reg_addr->info_ring_lo, phys_low);
+	acc100_reg_write(d, reg_addr->info_ring_en, ACC100_REG_IRQ_EN_ALL);
+	d->info_ring_head = (acc100_reg_read(d, reg_addr->info_ring_ptr) &
+			0xFFF) / sizeof(union acc100_info_ring_data);
+	return 0;
+}
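The hi/lo register writes above program a 64-bit IOVA through two 32-bit registers. A sketch of the split and how the device-side recombination works (helper names are illustrative):

```c
#include <assert.h>
#include <stdint.h>

/* Split a 64-bit bus address into the 32-bit halves written to a
 * _hi/_lo register pair.
 */
static uint32_t addr_hi(uint64_t iova)
{
	return (uint32_t)(iova >> 32);
}

static uint32_t addr_lo(uint64_t iova)
{
	return (uint32_t)iova;
}

/* Recombine the halves, as the device would. */
static uint64_t addr_join(uint32_t hi, uint32_t lo)
{
	return ((uint64_t)hi << 32) | lo;
}
```

The truncating cast in `addr_lo()` keeps only the low 32 bits, so `addr_join(addr_hi(x), addr_lo(x))` round-trips any 64-bit address.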
+
+
 /* Allocate 64MB memory used for all software rings */
 static int
 acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
@@ -426,6 +633,7 @@
 	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
 	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
 
+	allocate_inforing(dev);
 	d->harq_layout = rte_zmalloc_socket("HARQ Layout",
 			ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
 			RTE_CACHE_LINE_SIZE, dev->data->socket_id);
@@ -437,13 +645,53 @@
 	return 0;
 }
 
+static int
+acc100_intr_enable(struct rte_bbdev *dev)
+{
+	int ret;
+	struct acc100_device *d = dev->data->dev_private;
+
+	/* Only MSI interrupts are currently supported */
+	if (dev->intr_handle->type == RTE_INTR_HANDLE_VFIO_MSI ||
+			dev->intr_handle->type == RTE_INTR_HANDLE_UIO) {
+
+		allocate_inforing(dev);
+
+		ret = rte_intr_enable(dev->intr_handle);
+		if (ret < 0) {
+			rte_bbdev_log(ERR,
+					"Couldn't enable interrupts for device: %s",
+					dev->data->name);
+			rte_free(d->info_ring);
+			return ret;
+		}
+		ret = rte_intr_callback_register(dev->intr_handle,
+				acc100_dev_interrupt_handler, dev);
+		if (ret < 0) {
+			rte_bbdev_log(ERR,
+					"Couldn't register interrupt callback for device: %s",
+					dev->data->name);
+			rte_free(d->info_ring);
+			return ret;
+		}
+
+		return 0;
+	}
+
+	rte_bbdev_log(ERR, "ACC100 (%s) supports only VFIO MSI/UIO interrupts",
+			dev->data->name);
+	return -ENOTSUP;
+}
+
 /* Free 64MB memory used for software rings */
 static int
 acc100_dev_close(struct rte_bbdev *dev)
 {
 	struct acc100_device *d = dev->data->dev_private;
+	acc100_check_ir(d);
 	if (d->sw_rings_base != NULL) {
 		rte_free(d->tail_ptrs);
+		rte_free(d->info_ring);
 		rte_free(d->sw_rings_base);
 		d->sw_rings_base = NULL;
 	}
@@ -643,6 +891,7 @@
 					RTE_BBDEV_TURBO_CRC_TYPE_24B |
 					RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
 					RTE_BBDEV_TURBO_EARLY_TERMINATION |
+					RTE_BBDEV_TURBO_DEC_INTERRUPTS |
 					RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
 					RTE_BBDEV_TURBO_MAP_DEC |
 					RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP |
@@ -663,6 +912,7 @@
 					RTE_BBDEV_TURBO_CRC_24B_ATTACH |
 					RTE_BBDEV_TURBO_RV_INDEX_BYPASS |
 					RTE_BBDEV_TURBO_RATE_MATCH |
+					RTE_BBDEV_TURBO_ENC_INTERRUPTS |
 					RTE_BBDEV_TURBO_ENC_SCATTER_GATHER,
 				.num_buffers_src =
 						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
@@ -676,7 +926,8 @@
 				.capability_flags =
 					RTE_BBDEV_LDPC_RATE_MATCH |
 					RTE_BBDEV_LDPC_CRC_24B_ATTACH |
-					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
+					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS |
+					RTE_BBDEV_LDPC_ENC_INTERRUPTS,
 				.num_buffers_src =
 						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
 				.num_buffers_dst =
@@ -701,7 +952,8 @@
 				RTE_BBDEV_LDPC_DECODE_BYPASS |
 				RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
 				RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
-				RTE_BBDEV_LDPC_LLR_COMPRESSION,
+				RTE_BBDEV_LDPC_LLR_COMPRESSION |
+				RTE_BBDEV_LDPC_DEC_INTERRUPTS,
 			.llr_size = 8,
 			.llr_decimals = 1,
 			.num_buffers_src =
@@ -751,14 +1003,39 @@
 #else
 	dev_info->harq_buffer_size = 0;
 #endif
+	acc100_check_ir(d);
+}
+
+static int
+acc100_queue_intr_enable(struct rte_bbdev *dev, uint16_t queue_id)
+{
+	struct acc100_queue *q = dev->data->queues[queue_id].queue_private;
+
+	if (dev->intr_handle->type != RTE_INTR_HANDLE_VFIO_MSI &&
+			dev->intr_handle->type != RTE_INTR_HANDLE_UIO)
+		return -ENOTSUP;
+
+	q->irq_enable = 1;
+	return 0;
+}
+
+static int
+acc100_queue_intr_disable(struct rte_bbdev *dev, uint16_t queue_id)
+{
+	struct acc100_queue *q = dev->data->queues[queue_id].queue_private;
+	q->irq_enable = 0;
+	return 0;
 }
 
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.setup_queues = acc100_setup_queues,
+	.intr_enable = acc100_intr_enable,
 	.close = acc100_dev_close,
 	.info_get = acc100_dev_info_get,
 	.queue_setup = acc100_queue_setup,
 	.queue_release = acc100_queue_release,
+	.queue_intr_enable = acc100_queue_intr_enable,
+	.queue_intr_disable = acc100_queue_intr_disable
 };
 
 /* ACC100 PCI PF address map */
@@ -3018,8 +3295,10 @@
 			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
 	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
 	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
-	if (op->status != 0)
+	if (op->status != 0) {
 		q_data->queue_stats.dequeue_err_count++;
+		acc100_check_ir(q->d);
+	}
 
 	/* CRC invalid if error exists */
 	if (!op->status)
@@ -3076,6 +3355,9 @@
 		op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
 	op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
 
+	if (op->status & (1 << RTE_BBDEV_DRV_ERROR))
+		acc100_check_ir(q->d);
+
 	/* Check if this is the last desc in batch (Atomic Queue) */
 	if (desc->req.last_desc_in_batch) {
 		(*aq_dequeued)++;
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 78686c1..8980fa5 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -559,7 +559,14 @@ struct acc100_device {
 	/* Virtual address of the info memory routed to this function under
 	 * operation, whether it is PF or VF.
 	 */
+	union acc100_info_ring_data *info_ring;
+
 	union acc100_harq_layout_data *harq_layout;
+	/* Virtual Info Ring head */
+	uint16_t info_ring_head;
+	/* Number of bytes available for each queue in device, depending on
+	 * how many queues are enabled with configure()
+	 */
 	uint32_t sw_ring_size;
 	uint32_t ddr_size; /* Size in kB */
 	uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */
@@ -575,4 +582,12 @@ struct acc100_device {
 	bool configured; /**< True if this ACC100 device is configured */
 };
 
+/**
+ * Structure with details about RTE_BBDEV_EVENT_DEQUEUE event. It's passed to
+ * the callback function.
+ */
+struct acc100_deq_intr_details {
+	uint16_t queue_id;
+};
+
 #endif /* _RTE_ACC100_PMD_H_ */
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v3 09/11] baseband/acc100: add debug function to validate input
  2020-08-19  0:25 [dpdk-dev] [PATCH v3 00/11] bbdev PMD ACC100 Nicolas Chautru
                   ` (7 preceding siblings ...)
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 08/11] baseband/acc100: add interrupt support to PMD Nicolas Chautru
@ 2020-08-19  0:25 ` Nicolas Chautru
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 10/11] baseband/acc100: add configure function Nicolas Chautru
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 11/11] doc: update bbdev feature table Nicolas Chautru
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-08-19  0:25 UTC (permalink / raw)
  To: dev, akhil.goyal; +Cc: bruce.richardson, Nicolas Chautru

Add debug functions to validate the input provided by the user
through the API. These are only enabled in DEBUG mode at build time.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 424 +++++++++++++++++++++++++++++++
 1 file changed, 424 insertions(+)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index ba8e1d8..dc14079 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -1945,6 +1945,231 @@
 
 }
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+/* Validates turbo encoder parameters */
+static inline int
+validate_enc_op(struct rte_bbdev_enc_op *op)
+{
+	struct rte_bbdev_op_turbo_enc *turbo_enc = &op->turbo_enc;
+	struct rte_bbdev_op_enc_turbo_cb_params *cb = NULL;
+	struct rte_bbdev_op_enc_turbo_tb_params *tb = NULL;
+	uint16_t kw, kw_neg, kw_pos;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if (turbo_enc->input.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid input pointer");
+		return -1;
+	}
+	if (turbo_enc->output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid output pointer");
+		return -1;
+	}
+	if (turbo_enc->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				turbo_enc->rv_index);
+		return -1;
+	}
+	if (turbo_enc->code_block_mode != 0 &&
+			turbo_enc->code_block_mode != 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				turbo_enc->code_block_mode);
+		return -1;
+	}
+
+	if (turbo_enc->code_block_mode == 0) {
+		tb = &turbo_enc->tb_params;
+		if ((tb->k_neg < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_neg > RTE_BBDEV_TURBO_MAX_CB_SIZE)
+				&& tb->c_neg > 0) {
+			rte_bbdev_log(ERR,
+					"k_neg (%u) is out of range %u <= value <= %u",
+					tb->k_neg, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (tb->k_pos < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_pos > RTE_BBDEV_TURBO_MAX_CB_SIZE) {
+			rte_bbdev_log(ERR,
+					"k_pos (%u) is out of range %u <= value <= %u",
+					tb->k_pos, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (tb->c_neg > (RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1))
+			rte_bbdev_log(ERR,
+					"c_neg (%u) is out of range 0 <= value <= %u",
+					tb->c_neg,
+					RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1);
+		if (tb->c < 1 || tb->c > RTE_BBDEV_TURBO_MAX_CODE_BLOCKS) {
+			rte_bbdev_log(ERR,
+					"c (%u) is out of range 1 <= value <= %u",
+					tb->c, RTE_BBDEV_TURBO_MAX_CODE_BLOCKS);
+			return -1;
+		}
+		if (tb->cab > tb->c) {
+			rte_bbdev_log(ERR,
+					"cab (%u) is greater than c (%u)",
+					tb->cab, tb->c);
+			return -1;
+		}
+		if ((tb->ea < RTE_BBDEV_TURBO_MIN_CB_SIZE || (tb->ea % 2))
+				&& tb->r < tb->cab) {
+			rte_bbdev_log(ERR,
+					"ea (%u) is less than %u or it is not even",
+					tb->ea, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+		if ((tb->eb < RTE_BBDEV_TURBO_MIN_CB_SIZE || (tb->eb % 2))
+				&& tb->c > tb->cab) {
+			rte_bbdev_log(ERR,
+					"eb (%u) is less than %u or it is not even",
+					tb->eb, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+
+		kw_neg = 3 * RTE_ALIGN_CEIL(tb->k_neg + 4,
+					RTE_BBDEV_TURBO_C_SUBBLOCK);
+		if (tb->ncb_neg < tb->k_neg || tb->ncb_neg > kw_neg) {
+			rte_bbdev_log(ERR,
+					"ncb_neg (%u) is out of range (%u) k_neg <= value <= (%u) kw_neg",
+					tb->ncb_neg, tb->k_neg, kw_neg);
+			return -1;
+		}
+
+		kw_pos = 3 * RTE_ALIGN_CEIL(tb->k_pos + 4,
+					RTE_BBDEV_TURBO_C_SUBBLOCK);
+		if (tb->ncb_pos < tb->k_pos || tb->ncb_pos > kw_pos) {
+			rte_bbdev_log(ERR,
+					"ncb_pos (%u) is out of range (%u) k_pos <= value <= (%u) kw_pos",
+					tb->ncb_pos, tb->k_pos, kw_pos);
+			return -1;
+		}
+		if (tb->r > (tb->c - 1)) {
+			rte_bbdev_log(ERR,
+					"r (%u) is greater than c - 1 (%u)",
+					tb->r, tb->c - 1);
+			return -1;
+		}
+	} else {
+		cb = &turbo_enc->cb_params;
+		if (cb->k < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| cb->k > RTE_BBDEV_TURBO_MAX_CB_SIZE) {
+			rte_bbdev_log(ERR,
+					"k (%u) is out of range %u <= value <= %u",
+					cb->k, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+
+		if (cb->e < RTE_BBDEV_TURBO_MIN_CB_SIZE || (cb->e % 2)) {
+			rte_bbdev_log(ERR,
+					"e (%u) is less than %u or it is not even",
+					cb->e, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+
+		kw = RTE_ALIGN_CEIL(cb->k + 4, RTE_BBDEV_TURBO_C_SUBBLOCK) * 3;
+		if (cb->ncb < cb->k || cb->ncb > kw) {
+			rte_bbdev_log(ERR,
+					"ncb (%u) is out of range (%u) k <= value <= (%u) kw",
+					cb->ncb, cb->k, kw);
+			return -1;
+		}
+	}
+
+	return 0;
+}
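The `kw` bound used in the ncb checks above is the LTE rate-matching circular buffer size. A standalone sketch of the arithmetic; the 32-column sub-block width mirrors RTE_BBDEV_TURBO_C_SUBBLOCK and is an assumption here:

```c
#include <assert.h>
#include <stdint.h>

/* Turbo sub-block interleaver width (assumed value of
 * RTE_BBDEV_TURBO_C_SUBBLOCK).
 */
#define C_SUBBLOCK 32

/* Align x up to a multiple of a (a power of two). */
static uint32_t align_ceil(uint32_t x, uint32_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Circular buffer size kw = 3 * ceil(k + 4, 32): the upper bound the
 * validation applies to ncb, covering the three interleaved streams
 * of k + 4 tail-bit-terminated bits each.
 */
static uint32_t turbo_kw(uint32_t k)
{
	return 3 * align_ceil(k + 4, C_SUBBLOCK);
}
```

The valid ncb range then is k <= ncb <= kw, exactly the interval the checks above enforce for both the per-CB and per-TB cases.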
+
+/* Validates LDPC encoder parameters */
+static inline int
+validate_ldpc_enc_op(struct rte_bbdev_enc_op *op)
+{
+	struct rte_bbdev_op_ldpc_enc *ldpc_enc = &op->ldpc_enc;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if (ldpc_enc->input.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid input pointer");
+		return -1;
+	}
+	if (ldpc_enc->output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid output pointer");
+		return -1;
+	}
+	if (ldpc_enc->input.length >
+			RTE_BBDEV_LDPC_MAX_CB_SIZE >> 3) {
+		rte_bbdev_log(ERR, "CB size (%u) is too big, max: %d",
+				ldpc_enc->input.length,
+				RTE_BBDEV_LDPC_MAX_CB_SIZE);
+		return -1;
+	}
+	if ((ldpc_enc->basegraph > 2) || (ldpc_enc->basegraph == 0)) {
+		rte_bbdev_log(ERR,
+				"BG (%u) is out of range 1 <= value <= 2",
+				ldpc_enc->basegraph);
+		return -1;
+	}
+	if (ldpc_enc->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				ldpc_enc->rv_index);
+		return -1;
+	}
+	if (ldpc_enc->code_block_mode > 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				ldpc_enc->code_block_mode);
+		return -1;
+	}
+
+	return 0;
+}
+
+/* Validates LDPC decoder parameters */
+static inline int
+validate_ldpc_dec_op(struct rte_bbdev_dec_op *op)
+{
+	struct rte_bbdev_op_ldpc_dec *ldpc_dec = &op->ldpc_dec;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if ((ldpc_dec->basegraph > 2) || (ldpc_dec->basegraph == 0)) {
+		rte_bbdev_log(ERR,
+				"BG (%u) is out of range 1 <= value <= 2",
+				ldpc_dec->basegraph);
+		return -1;
+	}
+	if (ldpc_dec->iter_max == 0) {
+		rte_bbdev_log(ERR,
+				"iter_max (%u) is equal to 0",
+				ldpc_dec->iter_max);
+		return -1;
+	}
+	if (ldpc_dec->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				ldpc_dec->rv_index);
+		return -1;
+	}
+	if (ldpc_dec->code_block_mode > 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				ldpc_dec->code_block_mode);
+		return -1;
+	}
+
+	return 0;
+}
+#endif
+
 /* Enqueue one encode operations for ACC100 device in CB mode */
 static inline int
 enqueue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
@@ -1956,6 +2181,14 @@
 		seg_total_left;
 	struct rte_mbuf *input, *output_head, *output;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_enc_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2008,6 +2241,14 @@
 	uint16_t  in_length_in_bytes;
 	struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_enc_op(ops[0]) == -1) {
+		rte_bbdev_log(ERR, "LDPC encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2065,6 +2306,14 @@
 		seg_total_left;
 	struct rte_mbuf *input, *output_head, *output;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_enc_op(op) == -1) {
+		rte_bbdev_log(ERR, "LDPC encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2119,6 +2368,14 @@
 	struct rte_mbuf *input, *output_head, *output;
 	uint16_t current_enqueued_cbs = 0;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_enc_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2191,6 +2448,142 @@
 	return current_enqueued_cbs;
 }
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+/* Validates turbo decoder parameters */
+static inline int
+validate_dec_op(struct rte_bbdev_dec_op *op)
+{
+	struct rte_bbdev_op_turbo_dec *turbo_dec = &op->turbo_dec;
+	struct rte_bbdev_op_dec_turbo_cb_params *cb = NULL;
+	struct rte_bbdev_op_dec_turbo_tb_params *tb = NULL;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if (turbo_dec->input.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid input pointer");
+		return -1;
+	}
+	if (turbo_dec->hard_output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid hard_output pointer");
+		return -1;
+	}
+	if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT) &&
+			turbo_dec->soft_output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid soft_output pointer");
+		return -1;
+	}
+	if (turbo_dec->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				turbo_dec->rv_index);
+		return -1;
+	}
+	if (turbo_dec->iter_min < 1) {
+		rte_bbdev_log(ERR,
+				"iter_min (%u) is less than 1",
+				turbo_dec->iter_min);
+		return -1;
+	}
+	if (turbo_dec->iter_max <= 2) {
+		rte_bbdev_log(ERR,
+				"iter_max (%u) is less than or equal to 2",
+				turbo_dec->iter_max);
+		return -1;
+	}
+	if (turbo_dec->iter_min > turbo_dec->iter_max) {
+		rte_bbdev_log(ERR,
+				"iter_min (%u) is greater than iter_max (%u)",
+				turbo_dec->iter_min, turbo_dec->iter_max);
+		return -1;
+	}
+	if (turbo_dec->code_block_mode != 0 &&
+			turbo_dec->code_block_mode != 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				turbo_dec->code_block_mode);
+		return -1;
+	}
+
+	if (turbo_dec->code_block_mode == 0) {
+		tb = &turbo_dec->tb_params;
+		if ((tb->k_neg < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_neg > RTE_BBDEV_TURBO_MAX_CB_SIZE)
+				&& tb->c_neg > 0) {
+			rte_bbdev_log(ERR,
+					"k_neg (%u) is out of range %u <= value <= %u",
+					tb->k_neg, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if ((tb->k_pos < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_pos > RTE_BBDEV_TURBO_MAX_CB_SIZE)
+				&& tb->c > tb->c_neg) {
+			rte_bbdev_log(ERR,
+					"k_pos (%u) is out of range %u <= value <= %u",
+					tb->k_pos, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (tb->c_neg > (RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1))
+			rte_bbdev_log(ERR,
+					"c_neg (%u) is out of range 0 <= value <= %u",
+					tb->c_neg,
+					RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1);
+		if (tb->c < 1 || tb->c > RTE_BBDEV_TURBO_MAX_CODE_BLOCKS) {
+			rte_bbdev_log(ERR,
+					"c (%u) is out of range 1 <= value <= %u",
+					tb->c, RTE_BBDEV_TURBO_MAX_CODE_BLOCKS);
+			return -1;
+		}
+		if (tb->cab > tb->c) {
+			rte_bbdev_log(ERR,
+					"cab (%u) is greater than c (%u)",
+					tb->cab, tb->c);
+			return -1;
+		}
+		if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) &&
+				(tb->ea < RTE_BBDEV_TURBO_MIN_CB_SIZE
+						|| (tb->ea % 2))
+				&& tb->cab > 0) {
+			rte_bbdev_log(ERR,
+					"ea (%u) is less than %u or it is not even",
+					tb->ea, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+		if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) &&
+				(tb->eb < RTE_BBDEV_TURBO_MIN_CB_SIZE
+						|| (tb->eb % 2))
+				&& tb->c > tb->cab) {
+			rte_bbdev_log(ERR,
+					"eb (%u) is less than %u or it is not even",
+					tb->eb, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+	} else {
+		cb = &turbo_dec->cb_params;
+		if (cb->k < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| cb->k > RTE_BBDEV_TURBO_MAX_CB_SIZE) {
+			rte_bbdev_log(ERR,
+					"k (%u) is out of range %u <= value <= %u",
+					cb->k, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) &&
+				(cb->e < RTE_BBDEV_TURBO_MIN_CB_SIZE ||
+				(cb->e % 2))) {
+			rte_bbdev_log(ERR,
+					"e (%u) is less than %u or it is not even",
+					cb->e, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+	}
+
+	return 0;
+}
+#endif
+
 /** Enqueue one decode operations for ACC100 device in CB mode */
 static inline int
 enqueue_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
@@ -2203,6 +2596,14 @@
 	struct rte_mbuf *input, *h_output_head, *h_output,
 		*s_output_head, *s_output;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo decoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2426,6 +2827,13 @@
 		return ret;
 	}
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "LDPC decoder validation failed");
+		return -EINVAL;
+	}
+#endif
 	union acc100_dma_desc *desc;
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
@@ -2521,6 +2929,14 @@
 	struct rte_mbuf *input, *h_output_head, *h_output;
 	uint16_t current_enqueued_cbs = 0;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "LDPC decoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2611,6 +3027,14 @@
 		*s_output_head, *s_output;
 	uint16_t current_enqueued_cbs = 0;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo decoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v3 10/11] baseband/acc100: add configure function
  2020-08-19  0:25 [dpdk-dev] [PATCH v3 00/11] bbdev PMD ACC100 Nicolas Chautru
                   ` (8 preceding siblings ...)
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 09/11] baseband/acc100: add debug function to validate input Nicolas Chautru
@ 2020-08-19  0:25 ` Nicolas Chautru
  2020-09-03 10:06   ` Aidan Goddard
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 11/11] doc: update bbdev feature table Nicolas Chautru
  10 siblings, 1 reply; 213+ messages in thread
From: Nicolas Chautru @ 2020-08-19  0:25 UTC (permalink / raw)
  To: dev, akhil.goyal; +Cc: bruce.richardson, Nicolas Chautru

Add configure function to configure the PF from within
bbdev-test itself, without requiring an external
application to configure the device.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 app/test-bbdev/test_bbdev_perf.c                   |  72 +++
 drivers/baseband/acc100/Makefile                   |   3 +
 drivers/baseband/acc100/meson.build                |   2 +
 drivers/baseband/acc100/rte_acc100_cfg.h           |  17 +
 drivers/baseband/acc100/rte_acc100_pmd.c           | 505 +++++++++++++++++++++
 .../acc100/rte_pmd_bbdev_acc100_version.map        |   7 +
 6 files changed, 606 insertions(+)

diff --git a/app/test-bbdev/test_bbdev_perf.c b/app/test-bbdev/test_bbdev_perf.c
index 45c0d62..32f23ff 100644
--- a/app/test-bbdev/test_bbdev_perf.c
+++ b/app/test-bbdev/test_bbdev_perf.c
@@ -52,6 +52,18 @@
 #define FLR_5G_TIMEOUT 610
 #endif
 
+#ifdef RTE_LIBRTE_PMD_BBDEV_ACC100
+#include <rte_acc100_cfg.h>
+#define ACC100PF_DRIVER_NAME   ("intel_acc100_pf")
+#define ACC100VF_DRIVER_NAME   ("intel_acc100_vf")
+#define ACC100_QMGR_NUM_AQS 16
+#define ACC100_QMGR_NUM_QGS 2
+#define ACC100_QMGR_AQ_DEPTH 5
+#define ACC100_QMGR_INVALID_IDX -1
+#define ACC100_QMGR_RR 1
+#define ACC100_QOS_GBR 0
+#endif
+
 #define OPS_CACHE_SIZE 256U
 #define OPS_POOL_SIZE_MIN 511U /* 0.5K per queue */
 
@@ -653,6 +665,66 @@ typedef int (test_case_function)(struct active_device *ad,
 				info->dev_name);
 	}
 #endif
+#ifdef RTE_LIBRTE_PMD_BBDEV_ACC100
+	if ((get_init_device() == true) &&
+		(!strcmp(info->drv.driver_name, ACC100PF_DRIVER_NAME))) {
+		struct acc100_conf conf;
+		unsigned int i;
+
+		printf("Configure ACC100 FEC Driver %s with default values\n",
+				info->drv.driver_name);
+
+		/* clear default configuration before initialization */
+		memset(&conf, 0, sizeof(struct acc100_conf));
+
+		/* Always set in PF mode for built-in configuration */
+		conf.pf_mode_en = true;
+		for (i = 0; i < RTE_ACC100_NUM_VFS; ++i) {
+			conf.arb_dl_4g[i].gbr_threshold1 = ACC100_QOS_GBR;
+			conf.arb_dl_4g[i].gbr_threshold2 = ACC100_QOS_GBR;
+			conf.arb_dl_4g[i].round_robin_weight = ACC100_QMGR_RR;
+			conf.arb_ul_4g[i].gbr_threshold1 = ACC100_QOS_GBR;
+			conf.arb_ul_4g[i].gbr_threshold2 = ACC100_QOS_GBR;
+			conf.arb_ul_4g[i].round_robin_weight = ACC100_QMGR_RR;
+			conf.arb_dl_5g[i].gbr_threshold1 = ACC100_QOS_GBR;
+			conf.arb_dl_5g[i].gbr_threshold2 = ACC100_QOS_GBR;
+			conf.arb_dl_5g[i].round_robin_weight = ACC100_QMGR_RR;
+			conf.arb_ul_5g[i].gbr_threshold1 = ACC100_QOS_GBR;
+			conf.arb_ul_5g[i].gbr_threshold2 = ACC100_QOS_GBR;
+			conf.arb_ul_5g[i].round_robin_weight = ACC100_QMGR_RR;
+		}
+
+		conf.input_pos_llr_1_bit = true;
+		conf.output_pos_llr_1_bit = true;
+		conf.num_vf_bundles = 1; /* Number of VF bundles to set up */
+
+		conf.q_ul_4g.num_qgroups = ACC100_QMGR_NUM_QGS;
+		conf.q_ul_4g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+		conf.q_ul_4g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+		conf.q_ul_4g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+		conf.q_dl_4g.num_qgroups = ACC100_QMGR_NUM_QGS;
+		conf.q_dl_4g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+		conf.q_dl_4g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+		conf.q_dl_4g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+		conf.q_ul_5g.num_qgroups = ACC100_QMGR_NUM_QGS;
+		conf.q_ul_5g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+		conf.q_ul_5g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+		conf.q_ul_5g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+		conf.q_dl_5g.num_qgroups = ACC100_QMGR_NUM_QGS;
+		conf.q_dl_5g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+		conf.q_dl_5g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+		conf.q_dl_5g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+
+		/* setup PF with configuration information */
+		ret = acc100_configure(info->dev_name, &conf);
+		TEST_ASSERT_SUCCESS(ret,
+				"Failed to configure ACC100 PF for bbdev %s",
+				info->dev_name);
+		/* Refresh device info now that the PF is configured */
+	}
+	rte_bbdev_info_get(dev_id, info);
+#endif
+
 	nb_queues = RTE_MIN(rte_lcore_count(), info->drv.max_num_queues);
 	nb_queues = RTE_MIN(nb_queues, (unsigned int) MAX_QUEUES);
 
diff --git a/drivers/baseband/acc100/Makefile b/drivers/baseband/acc100/Makefile
index c79e487..37e73af 100644
--- a/drivers/baseband/acc100/Makefile
+++ b/drivers/baseband/acc100/Makefile
@@ -22,4 +22,7 @@ LIBABIVER := 1
 # library source files
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100) += rte_acc100_pmd.c
 
+# export include files
+SYMLINK-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100)-include += rte_acc100_cfg.h
+
 include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/baseband/acc100/meson.build b/drivers/baseband/acc100/meson.build
index 8afafc2..7ac44dc 100644
--- a/drivers/baseband/acc100/meson.build
+++ b/drivers/baseband/acc100/meson.build
@@ -4,3 +4,5 @@
 deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci']
 
 sources = files('rte_acc100_pmd.c')
+
+install_headers('rte_acc100_cfg.h')
diff --git a/drivers/baseband/acc100/rte_acc100_cfg.h b/drivers/baseband/acc100/rte_acc100_cfg.h
index 73bbe36..7f523bc 100644
--- a/drivers/baseband/acc100/rte_acc100_cfg.h
+++ b/drivers/baseband/acc100/rte_acc100_cfg.h
@@ -89,6 +89,23 @@ struct acc100_conf {
 	struct rte_arbitration_t arb_dl_5g[RTE_ACC100_NUM_VFS];
 };
 
+/**
+ * Configure an ACC100 device
+ *
+ * @param dev_name
+ *   The name of the device. This is the short form of PCI BDF, e.g. 00:01.0.
+ *   It can also be retrieved for a bbdev device from the dev_name field in the
+ *   rte_bbdev_info structure returned by rte_bbdev_info_get().
+ * @param conf
+ *   Configuration to apply to ACC100 HW.
+ *
+ * @return
+ *   Zero on success, negative value on failure.
+ */
+__rte_experimental
+int
+acc100_configure(const char *dev_name, struct acc100_conf *conf);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index dc14079..43f664b 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -85,6 +85,26 @@
 
 enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC};
 
+/* Return the accelerator enum for a Queue Group Index */
+static inline int
+accFromQgid(int qg_idx, const struct acc100_conf *acc100_conf)
+{
+	int accQg[ACC100_NUM_QGRPS];
+	int NumQGroupsPerFn[NUM_ACC];
+	int acc, qgIdx, qgIndex = 0;
+	for (qgIdx = 0; qgIdx < ACC100_NUM_QGRPS; qgIdx++)
+		accQg[qgIdx] = 0;
+	NumQGroupsPerFn[UL_4G] = acc100_conf->q_ul_4g.num_qgroups;
+	NumQGroupsPerFn[UL_5G] = acc100_conf->q_ul_5g.num_qgroups;
+	NumQGroupsPerFn[DL_4G] = acc100_conf->q_dl_4g.num_qgroups;
+	NumQGroupsPerFn[DL_5G] = acc100_conf->q_dl_5g.num_qgroups;
+	for (acc = UL_4G;  acc < NUM_ACC; acc++)
+		for (qgIdx = 0; qgIdx < NumQGroupsPerFn[acc]; qgIdx++)
+			accQg[qgIndex++] = acc;
+	acc = accQg[qg_idx];
+	return acc;
+}
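For reference, the queue-group-to-accelerator walk above can be reproduced standalone: queue groups are laid out contiguously in fixed UL_4G, UL_5G, DL_4G, DL_5G order, so the owning accelerator is found by walking the per-function counts. This is an illustrative sketch only (the function name and the flat count array are not driver API):

```c
#include <assert.h>

enum { UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC };

/* Illustrative sample layout: two queue groups per function */
static const int demo_qgroups[NUM_ACC] = {2, 2, 2, 2};

/* Walk queue groups in UL_4G, UL_5G, DL_4G, DL_5G order and return
 * the accelerator owning qg_idx, or -1 when out of range. */
static int acc_from_qgid(int qg_idx, const int num_qgroups[NUM_ACC])
{
	int acc, qg, idx = 0;

	for (acc = UL_4G; acc < NUM_ACC; acc++)
		for (qg = 0; qg < num_qgroups[acc]; qg++)
			if (idx++ == qg_idx)
				return acc;
	return -1;
}
```

With two groups per function, indices 0-1 map to UL_4G, 2-3 to UL_5G and so on, matching what accFromQgid() computes through its accQg[] table.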
+
 /* Return the queue topology for a Queue Group Index */
 static inline void
 qtopFromAcc(struct rte_q_topology_t **qtop, int acc_enum,
@@ -113,6 +133,30 @@
 	*qtop = p_qtop;
 }
 
+/* Return the AQ depth for a Queue Group Index */
+static inline int
+aqDepth(int qg_idx, struct acc100_conf *acc100_conf)
+{
+	struct rte_q_topology_t *q_top = NULL;
+	int acc_enum = accFromQgid(qg_idx, acc100_conf);
+	qtopFromAcc(&q_top, acc_enum, acc100_conf);
+	if (unlikely(q_top == NULL))
+		return 0;
+	return q_top->aq_depth_log2;
+}
+
+/* Return the number of AQs for a Queue Group Index */
+static inline int
+aqNum(int qg_idx, struct acc100_conf *acc100_conf)
+{
+	struct rte_q_topology_t *q_top = NULL;
+	int acc_enum = accFromQgid(qg_idx, acc100_conf);
+	qtopFromAcc(&q_top, acc_enum, acc100_conf);
+	if (unlikely(q_top == NULL))
+		return 0;
+	return q_top->num_aqs_per_groups;
+}
+
 static void
 initQTop(struct acc100_conf *acc100_conf)
 {
@@ -4177,3 +4221,464 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
 RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
 RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
 RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
+
+/*
+ * Implementation to fix the power-on status of some 5GUL engines
+ * This requires DMA permission if ported outside DPDK
+ */
+static void
+poweron_cleanup(struct rte_bbdev *bbdev, struct acc100_device *d,
+		struct acc100_conf *conf)
+{
+	int i, template_idx, qg_idx;
+	uint32_t address, status, payload;
+	printf("Need to clear power-on 5GUL status in internal memory\n");
+	/* Reset LDPC Cores */
+	for (i = 0; i < ACC100_ENGINES_MAX; i++)
+		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+				ACC100_ENGINE_OFFSET * i, ACC100_RESET_HI);
+	usleep(LONG_WAIT);
+	for (i = 0; i < ACC100_ENGINES_MAX; i++)
+		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+				ACC100_ENGINE_OFFSET * i, ACC100_RESET_LO);
+	usleep(LONG_WAIT);
+	/* Prepare dummy workload */
+	alloc_2x64mb_sw_rings_mem(bbdev, d, 0);
+	/* Set base addresses */
+	uint32_t phys_high = (uint32_t)(d->sw_rings_phys >> 32);
+	uint32_t phys_low  = (uint32_t)(d->sw_rings_phys &
+			~(ACC100_SIZE_64MBYTE-1));
+	acc100_reg_write(d, HWPfDmaFec5GulDescBaseHiRegVf, phys_high);
+	acc100_reg_write(d, HWPfDmaFec5GulDescBaseLoRegVf, phys_low);
+
+	/* Descriptor for a dummy 5GUL code block processing */
+	union acc100_dma_desc *desc = NULL;
+	desc = d->sw_rings;
+	desc->req.data_ptrs[0].address = d->sw_rings_phys +
+			ACC100_DESC_FCW_OFFSET;
+	desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
+	desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
+	desc->req.data_ptrs[0].last = 0;
+	desc->req.data_ptrs[0].dma_ext = 0;
+	desc->req.data_ptrs[1].address = d->sw_rings_phys + 512;
+	desc->req.data_ptrs[1].blkid = ACC100_DMA_BLKID_IN;
+	desc->req.data_ptrs[1].last = 1;
+	desc->req.data_ptrs[1].dma_ext = 0;
+	desc->req.data_ptrs[1].blen = 44;
+	desc->req.data_ptrs[2].address = d->sw_rings_phys + 1024;
+	desc->req.data_ptrs[2].blkid = ACC100_DMA_BLKID_OUT_ENC;
+	desc->req.data_ptrs[2].last = 1;
+	desc->req.data_ptrs[2].dma_ext = 0;
+	desc->req.data_ptrs[2].blen = 5;
+	/* Dummy FCW */
+	desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
+	desc->req.fcw_ld.qm = 1;
+	desc->req.fcw_ld.nfiller = 30;
+	desc->req.fcw_ld.BG = 2 - 1;
+	desc->req.fcw_ld.Zc = 7;
+	desc->req.fcw_ld.ncb = 350;
+	desc->req.fcw_ld.rm_e = 4;
+	desc->req.fcw_ld.itmax = 10;
+	desc->req.fcw_ld.gain_i = 1;
+	desc->req.fcw_ld.gain_h = 1;
+
+	int engines_to_restart[SIG_UL_5G_LAST + 1] = {0};
+	int num_failed_engine = 0;
+	/* Detect engines in undefined state */
+	for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
+			template_idx++) {
+		/* Check engine power-on status */
+		address = HwPfFecUl5gIbDebugReg +
+				ACC100_ENGINE_OFFSET * template_idx;
+		status = (acc100_reg_read(d, address) >> 4) & 0xF;
+		if (status == 0) {
+			engines_to_restart[num_failed_engine] = template_idx;
+			num_failed_engine++;
+		}
+	}
+
+	int numQqsAcc = conf->q_ul_4g.num_qgroups;
+	int numQgs = conf->q_ul_5g.num_qgroups;
+	payload = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		payload |= (1 << qg_idx);
+	/* Force each engine which is in unspecified state */
+	for (i = 0; i < num_failed_engine; i++) {
+		int failed_engine = engines_to_restart[i];
+		printf("Force engine %d\n", failed_engine);
+		for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
+				template_idx++) {
+			address = HWPfQmgrGrpTmplateReg4Indx
+					+ BYTES_IN_WORD * template_idx;
+			if (template_idx == failed_engine)
+				acc100_reg_write(d, address, payload);
+			else
+				acc100_reg_write(d, address, 0);
+		}
+		/* Reset descriptor header */
+		desc->req.word0 = ACC100_DMA_DESC_TYPE;
+		desc->req.word1 = 0;
+		desc->req.word2 = 0;
+		desc->req.word3 = 0;
+		desc->req.numCBs = 1;
+		desc->req.m2dlen = 2;
+		desc->req.d2mlen = 1;
+		/* Enqueue the code block for processing */
+		union acc100_enqueue_reg_fmt enq_req;
+		enq_req.val = 0;
+		enq_req.addr_offset = ACC100_DESC_OFFSET;
+		enq_req.num_elem = 1;
+		enq_req.req_elem_addr = 0;
+		rte_wmb();
+		acc100_reg_write(d, HWPfQmgrIngressAq + 0x100, enq_req.val);
+		usleep(LONG_WAIT * 100);
+		if (desc->req.word0 != 2)
+			printf("DMA Response %#"PRIx32"\n", desc->req.word0);
+	}
+
+	/* Reset LDPC Cores */
+	for (i = 0; i < ACC100_ENGINES_MAX; i++)
+		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+				ACC100_ENGINE_OFFSET * i, ACC100_RESET_HI);
+	usleep(LONG_WAIT);
+	for (i = 0; i < ACC100_ENGINES_MAX; i++)
+		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+				ACC100_ENGINE_OFFSET * i, ACC100_RESET_LO);
+	usleep(LONG_WAIT);
+	acc100_reg_write(d, HWPfHi5GHardResetReg, ACC100_RESET_HARD);
+	usleep(LONG_WAIT);
+	int numEngines = 0;
+	/* Check engine power-on status again */
+	for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
+			template_idx++) {
+		address = HwPfFecUl5gIbDebugReg +
+				ACC100_ENGINE_OFFSET * template_idx;
+		status = (acc100_reg_read(d, address) >> 4) & 0xF;
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD * template_idx;
+		if (status == 1) {
+			acc100_reg_write(d, address, payload);
+			numEngines++;
+		} else
+			acc100_reg_write(d, address, 0);
+	}
+	printf("Number of 5GUL engines %d\n", numEngines);
+
+	if (d->sw_rings_base != NULL)
+		rte_free(d->sw_rings_base);
+	usleep(LONG_WAIT);
+}
+
+/* Initial configuration of an ACC100 device prior to running configure() */
+int
+acc100_configure(const char *dev_name, struct acc100_conf *conf)
+{
+	rte_bbdev_log(INFO, "acc100_configure");
+	uint32_t payload, address, status;
+	int qg_idx, template_idx, vf_idx, acc, i;
+	struct rte_bbdev *bbdev = rte_bbdev_get_named_dev(dev_name);
+
+	/* Compile time checks */
+	RTE_BUILD_BUG_ON(sizeof(struct acc100_dma_req_desc) != 256);
+	RTE_BUILD_BUG_ON(sizeof(union acc100_dma_desc) != 256);
+	RTE_BUILD_BUG_ON(sizeof(struct acc100_fcw_td) != 24);
+	RTE_BUILD_BUG_ON(sizeof(struct acc100_fcw_te) != 32);
+
+	if (bbdev == NULL) {
+		rte_bbdev_log(ERR,
+		"Invalid dev_name (%s), or device is not yet initialised",
+		dev_name);
+		return -ENODEV;
+	}
+	struct acc100_device *d = bbdev->data->dev_private;
+
+	/* Store configuration */
+	rte_memcpy(&d->acc100_conf, conf, sizeof(d->acc100_conf));
+
+	/* PCIe Bridge configuration */
+	acc100_reg_write(d, HwPfPcieGpexBridgeControl, ACC100_CFG_PCI_BRIDGE);
+	for (i = 1; i < 17; i++)
+		acc100_reg_write(d,
+				HwPfPcieGpexAxiAddrMappingWindowPexBaseHigh
+				+ i * 16, 0);
+
+	/* PCIe Link Training and Status State Machine */
+	acc100_reg_write(d, HwPfPcieGpexLtssmStateCntrl, 0xDFC00000);
+
+	/* Prevent blocking AXI read on BRESP for AXI Write */
+	address = HwPfPcieGpexAxiPioControl;
+	payload = ACC100_CFG_PCI_AXI;
+	acc100_reg_write(d, address, payload);
+
+	/* 5GDL PLL phase shift */
+	acc100_reg_write(d, HWPfChaDl5gPllPhshft0, 0x1);
+
+	/* Explicitly releasing AXI as this may be stopped after PF FLR/BME */
+	address = HWPfDmaAxiControl;
+	payload = 1;
+	acc100_reg_write(d, address, payload);
+
+	/* DDR Configuration */
+	address = HWPfDdrBcTim6;
+	payload = acc100_reg_read(d, address);
+	payload &= 0xFFFFFFFB; /* Bit 2 */
+#ifdef ACC100_DDR_ECC_ENABLE
+	payload |= 0x4;
+#endif
+	acc100_reg_write(d, address, payload);
+	address = HWPfDdrPhyDqsCountNum;
+#ifdef ACC100_DDR_ECC_ENABLE
+	payload = 9;
+#else
+	payload = 8;
+#endif
+	acc100_reg_write(d, address, payload);
+
+	/* Set default descriptor signature */
+	address = HWPfDmaDescriptorSignatuture;
+	payload = 0;
+	acc100_reg_write(d, address, payload);
+
+	/* Enable the Error Detection in DMA */
+	payload = ACC100_CFG_DMA_ERROR;
+	address = HWPfDmaErrorDetectionEn;
+	acc100_reg_write(d, address, payload);
+
+	/* AXI Cache configuration */
+	payload = ACC100_CFG_AXI_CACHE;
+	address = HWPfDmaAxcacheReg;
+	acc100_reg_write(d, address, payload);
+
+	/* Default DMA Configuration (Qmgr Enabled) */
+	address = HWPfDmaConfig0Reg;
+	payload = 0;
+	acc100_reg_write(d, address, payload);
+	address = HWPfDmaQmanen;
+	payload = 0;
+	acc100_reg_write(d, address, payload);
+
+	/* Default RLIM/ALEN configuration */
+	address = HWPfDmaConfig1Reg;
+	payload = (1 << 31) + (23 << 8) + (1 << 6) + 7;
+	acc100_reg_write(d, address, payload);
+
+	/* Configure DMA Qmanager addresses */
+	address = HWPfDmaQmgrAddrReg;
+	payload = HWPfQmgrEgressQueuesTemplate;
+	acc100_reg_write(d, address, payload);
+
+	/* ===== Qmgr Configuration ===== */
+	/* Configuration of the AQueue Depth QMGR_GRP_0_DEPTH_LOG2 for UL */
+	int totalQgs = conf->q_ul_4g.num_qgroups +
+			conf->q_ul_5g.num_qgroups +
+			conf->q_dl_4g.num_qgroups +
+			conf->q_dl_5g.num_qgroups;
+	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
+		address = HWPfQmgrDepthLog2Grp +
+		BYTES_IN_WORD * qg_idx;
+		payload = aqDepth(qg_idx, conf);
+		acc100_reg_write(d, address, payload);
+		address = HWPfQmgrTholdGrp +
+		BYTES_IN_WORD * qg_idx;
+		payload = (1 << 16) + (1 << (aqDepth(qg_idx, conf) - 1));
+		acc100_reg_write(d, address, payload);
+	}
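The threshold word written per group above packs an enable bit together with half the AQ depth; since the depth is stored as log2, halving is a shift by one less. A small sketch of that packing (the register semantics are inferred from the code above, not from a datasheet):

```c
#include <assert.h>
#include <stdint.h>

/* Pack the HWPfQmgrTholdGrp payload: enable bit at bit 16 plus a
 * threshold of half the AQ depth (the depth is 2^aq_depth_log2). */
static uint32_t qmgr_thold_payload(int aq_depth_log2)
{
	return (1u << 16) + (1u << (aq_depth_log2 - 1));
}
```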
+
+	/* Template Priority in incremental order */
+	for (template_idx = 0; template_idx < ACC100_NUM_TMPL;
+			template_idx++) {
+		address = HWPfQmgrGrpTmplateReg0Indx +
+		BYTES_IN_WORD * (template_idx % 8);
+		payload = TMPL_PRI_0;
+		acc100_reg_write(d, address, payload);
+		address = HWPfQmgrGrpTmplateReg1Indx +
+		BYTES_IN_WORD * (template_idx % 8);
+		payload = TMPL_PRI_1;
+		acc100_reg_write(d, address, payload);
+		address = HWPfQmgrGrpTmplateReg2indx +
+		BYTES_IN_WORD * (template_idx % 8);
+		payload = TMPL_PRI_2;
+		acc100_reg_write(d, address, payload);
+		address = HWPfQmgrGrpTmplateReg3Indx +
+		BYTES_IN_WORD * (template_idx % 8);
+		payload = TMPL_PRI_3;
+		acc100_reg_write(d, address, payload);
+	}
+
+	address = HWPfQmgrGrpPriority;
+	payload = ACC100_CFG_QMGR_HI_P;
+	acc100_reg_write(d, address, payload);
+
+	/* Template Configuration */
+	for (template_idx = 0; template_idx < ACC100_NUM_TMPL; template_idx++) {
+		payload = 0;
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD * template_idx;
+		acc100_reg_write(d, address, payload);
+	}
+	/* 4GUL */
+	int numQgs = conf->q_ul_4g.num_qgroups;
+	int numQqsAcc = 0;
+	payload = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		payload |= (1 << qg_idx);
+	for (template_idx = SIG_UL_4G; template_idx <= SIG_UL_4G_LAST;
+			template_idx++) {
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD*template_idx;
+		acc100_reg_write(d, address, payload);
+	}
+	/* 5GUL */
+	numQqsAcc += numQgs;
+	numQgs	= conf->q_ul_5g.num_qgroups;
+	payload = 0;
+	int numEngines = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		payload |= (1 << qg_idx);
+	for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
+			template_idx++) {
+		/* Check engine power-on status */
+		address = HwPfFecUl5gIbDebugReg +
+				ACC100_ENGINE_OFFSET * template_idx;
+		status = (acc100_reg_read(d, address) >> 4) & 0xF;
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD * template_idx;
+		if (status == 1) {
+			acc100_reg_write(d, address, payload);
+			numEngines++;
+		} else
+			acc100_reg_write(d, address, 0);
+		#if RTE_ACC100_SINGLE_FEC == 1
+		payload = 0;
+		#endif
+	}
+	printf("Number of 5GUL engines %d\n", numEngines);
+	/* 4GDL */
+	numQqsAcc += numQgs;
+	numQgs	= conf->q_dl_4g.num_qgroups;
+	payload = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		payload |= (1 << qg_idx);
+	for (template_idx = SIG_DL_4G; template_idx <= SIG_DL_4G_LAST;
+			template_idx++) {
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD*template_idx;
+		acc100_reg_write(d, address, payload);
+		#if RTE_ACC100_SINGLE_FEC == 1
+			payload = 0;
+		#endif
+	}
+	/* 5GDL */
+	numQqsAcc += numQgs;
+	numQgs	= conf->q_dl_5g.num_qgroups;
+	payload = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		payload |= (1 << qg_idx);
+	for (template_idx = SIG_DL_5G; template_idx <= SIG_DL_5G_LAST;
+			template_idx++) {
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD*template_idx;
+		acc100_reg_write(d, address, payload);
+		#if RTE_ACC100_SINGLE_FEC == 1
+		payload = 0;
+		#endif
+	}
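Each of the four blocks above (4GUL, 5GUL, 4GDL, 5GDL) builds the same kind of template payload: a contiguous bitmask of the queue groups assigned to that function, starting at a running offset. A sketch of the mask construction (names are illustrative, not driver API):

```c
#include <assert.h>
#include <stdint.h>

/* Set one bit per queue group in [first_qg, first_qg + num_qgs),
 * mirroring the payload |= (1 << qg_idx) loops in acc100_configure(). */
static uint32_t qgrp_enable_mask(int first_qg, int num_qgs)
{
	uint32_t payload = 0;
	int qg;

	for (qg = first_qg; qg < first_qg + num_qgs; qg++)
		payload |= (1u << qg);
	return payload;
}
```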
+
+	/* Queue Group Function mapping */
+	int qman_func_id[5] = {0, 2, 1, 3, 4};
+	address = HWPfQmgrGrpFunction0;
+	payload = 0;
+	for (qg_idx = 0; qg_idx < 8; qg_idx++) {
+		acc = accFromQgid(qg_idx, conf);
+		payload |= qman_func_id[acc]<<(qg_idx * 4);
+	}
+	acc100_reg_write(d, address, payload);
+
+	/* Configuration of the Arbitration QGroup depth to 1 */
+	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
+		address = HWPfQmgrArbQDepthGrp +
+		BYTES_IN_WORD * qg_idx;
+		payload = 0;
+		acc100_reg_write(d, address, payload);
+	}
+
+	/* Enabling AQueues through the Queue hierarchy */
+	for (vf_idx = 0; vf_idx < ACC100_NUM_VFS; vf_idx++) {
+		for (qg_idx = 0; qg_idx < ACC100_NUM_QGRPS; qg_idx++) {
+			payload = 0;
+			if (vf_idx < conf->num_vf_bundles &&
+					qg_idx < totalQgs)
+				payload = (1 << aqNum(qg_idx, conf)) - 1;
+			address = HWPfQmgrAqEnableVf
+					+ vf_idx * BYTES_IN_WORD;
+			payload += (qg_idx << 16);
+			acc100_reg_write(d, address, payload);
+		}
+	}
+
+	/* This pointer to ARAM (256kB) is shifted by 2 (4B per register) */
+	uint32_t aram_address = 0;
+	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
+		for (vf_idx = 0; vf_idx < conf->num_vf_bundles; vf_idx++) {
+			address = HWPfQmgrVfBaseAddr + vf_idx
+					* BYTES_IN_WORD + qg_idx
+					* BYTES_IN_WORD * 64;
+			payload = aram_address;
+			acc100_reg_write(d, address, payload);
+			/* Offset ARAM Address for next memory bank
+			 * - increment of 4B
+			 */
+			aram_address += aqNum(qg_idx, conf) *
+					(1 << aqDepth(qg_idx, conf));
+		}
+	}
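The ARAM loop above accumulates one bank per (queue group, VF) pair, each consuming num_aqs * 2^aq_depth_log2 words, and the total is then checked against the 256 kB ARAM (WORDS_IN_ARAM_SIZE). Assuming a uniform topology across groups, the budget check reduces to the product sketched here (this is an illustration, not the driver's exact accounting for mixed topologies):

```c
#include <assert.h>
#include <stdint.h>

/* Total ARAM words consumed by a uniform topology: one bank of
 * num_aqs * 2^aq_depth_log2 words per (queue group, VF) pair. */
static uint32_t aram_words_needed(int num_qgroups, int num_vfs,
		int num_aqs, int aq_depth_log2)
{
	uint32_t addr = 0;
	int qg, vf;

	for (qg = 0; qg < num_qgroups; qg++)
		for (vf = 0; vf < num_vfs; vf++)
			addr += (uint32_t)num_aqs * (1u << aq_depth_log2);
	return addr;
}
```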
+
+	if (aram_address > WORDS_IN_ARAM_SIZE) {
+		rte_bbdev_log(ERR,
+				"ARAM configuration does not fit: %u words needed, %u available",
+				aram_address, WORDS_IN_ARAM_SIZE);
+		return -EINVAL;
+	}
+
+	/* ==== HI Configuration ==== */
+
+	/* Prevent Block on Transmit Error */
+	address = HWPfHiBlockTransmitOnErrorEn;
+	payload = 0;
+	acc100_reg_write(d, address, payload);
+	/* Prevent MSI from being dropped */
+	address = HWPfHiMsiDropEnableReg;
+	payload = 0;
+	acc100_reg_write(d, address, payload);
+	/* Set the PF Mode register */
+	address = HWPfHiPfMode;
+	payload = (conf->pf_mode_en) ? 2 : 0;
+	acc100_reg_write(d, address, payload);
+	/* Enable Error Detection in HW */
+	address = HWPfDmaErrorDetectionEn;
+	payload = 0x3D7;
+	acc100_reg_write(d, address, payload);
+
+	/* QoS overflow init */
+	payload = 1;
+	address = HWPfQosmonAEvalOverflow0;
+	acc100_reg_write(d, address, payload);
+	address = HWPfQosmonBEvalOverflow0;
+	acc100_reg_write(d, address, payload);
+
+	/* HARQ DDR Configuration */
+	unsigned int ddrSizeInMb = 512; /* Fixed to 512 MB per VF for now */
+	for (vf_idx = 0; vf_idx < conf->num_vf_bundles; vf_idx++) {
+		address = HWPfDmaVfDdrBaseRw + vf_idx
+				* 0x10;
+		payload = ((vf_idx * (ddrSizeInMb / 64)) << 16) +
+				(ddrSizeInMb - 1);
+		acc100_reg_write(d, address, payload);
+	}
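The per-VF HARQ DDR window word above packs a base offset in 64 MB units into the upper 16 bits and the window size in MB minus one into the lower bits; this layout is inferred from the loop itself, not from register documentation. A sketch of the packing:

```c
#include <assert.h>
#include <stdint.h>

/* Pack the HWPfDmaVfDdrBaseRw payload for one VF: base offset in
 * 64 MB units (upper 16 bits) plus window size in MB minus one. */
static uint32_t harq_ddr_payload(int vf_idx, unsigned int ddr_size_mb)
{
	return (uint32_t)((vf_idx * (ddr_size_mb / 64)) << 16) +
			(ddr_size_mb - 1);
}
```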
+	usleep(LONG_WAIT);
+
+	if (numEngines < (SIG_UL_5G_LAST + 1))
+		poweron_cleanup(bbdev, d, conf);
+
+	rte_bbdev_log_debug("PF Tip configuration complete for %s", dev_name);
+	return 0;
+}
diff --git a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
index 4a76d1d..91c234d 100644
--- a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
+++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
@@ -1,3 +1,10 @@
 DPDK_21 {
 	local: *;
 };
+
+EXPERIMENTAL {
+	global:
+
+	acc100_configure;
+
+};
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v3 11/11] doc: update bbdev feature table
  2020-08-19  0:25 [dpdk-dev] [PATCH v3 00/11] bbdev PMD ACC100 Nicolas Chautru
                   ` (9 preceding siblings ...)
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 10/11] baseband/acc100: add configure function Nicolas Chautru
@ 2020-08-19  0:25 ` Nicolas Chautru
  2020-09-04 17:53   ` [dpdk-dev] [PATCH v4 00/11] bbdev PMD ACC100 Nicolas Chautru
                     ` (8 more replies)
  10 siblings, 9 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-08-19  0:25 UTC (permalink / raw)
  To: dev, akhil.goyal; +Cc: bruce.richardson, Nicolas Chautru

Correct the overview matrix to use the acc100 name

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 doc/guides/bbdevs/features/acc100.ini | 14 ++++++++++++++
 doc/guides/bbdevs/features/mbc.ini    | 14 --------------
 2 files changed, 14 insertions(+), 14 deletions(-)
 create mode 100644 doc/guides/bbdevs/features/acc100.ini
 delete mode 100644 doc/guides/bbdevs/features/mbc.ini

diff --git a/doc/guides/bbdevs/features/acc100.ini b/doc/guides/bbdevs/features/acc100.ini
new file mode 100644
index 0000000..642cd48
--- /dev/null
+++ b/doc/guides/bbdevs/features/acc100.ini
@@ -0,0 +1,14 @@
+;
+; Supported features of the 'acc100' bbdev driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+Turbo Decoder (4G)     = Y
+Turbo Encoder (4G)     = Y
+LDPC Decoder (5G)      = Y
+LDPC Encoder (5G)      = Y
+LLR/HARQ Compression   = Y
+External DDR Access    = Y
+HW Accelerated         = Y
+BBDEV API              = Y
diff --git a/doc/guides/bbdevs/features/mbc.ini b/doc/guides/bbdevs/features/mbc.ini
deleted file mode 100644
index 78a7b95..0000000
--- a/doc/guides/bbdevs/features/mbc.ini
+++ /dev/null
@@ -1,14 +0,0 @@
-;
-; Supported features of the 'mbc' bbdev driver.
-;
-; Refer to default.ini for the full list of available PMD features.
-;
-[Features]
-Turbo Decoder (4G)     = Y
-Turbo Encoder (4G)     = Y
-LDPC Decoder (5G)      = Y
-LDPC Encoder (5G)      = Y
-LLR/HARQ Compression   = Y
-External DDR Access    = Y
-HW Accelerated         = Y
-BBDEV API              = Y
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions Nicolas Chautru
@ 2020-08-20 14:38   ` Dave Burley
  2020-08-20 14:52     ` Chautru, Nicolas
  2020-08-29 11:10   ` Xu, Rosen
  1 sibling, 1 reply; 213+ messages in thread
From: Dave Burley @ 2020-08-20 14:38 UTC (permalink / raw)
  To: Nicolas Chautru, dev; +Cc: bruce.richardson

Hi Nic,

As you've now specified the use of RTE_BBDEV_LDPC_LLR_COMPRESSION for this PMD, please could you confirm what the packed format of the LLRs in memory looks like? 

Best Regards

Dave Burley


From: dev <dev-bounces@dpdk.org> on behalf of Nicolas Chautru <nicolas.chautru@intel.com>
Sent: 19 August 2020 01:25
To: dev@dpdk.org <dev@dpdk.org>; akhil.goyal@nxp.com <akhil.goyal@nxp.com>
Cc: bruce.richardson@intel.com <bruce.richardson@intel.com>; Nicolas Chautru <nicolas.chautru@intel.com>
Subject: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions 
 
Adding LDPC decode and encode processing operations

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 1625 +++++++++++++++++++++++++++++-
 drivers/baseband/acc100/rte_acc100_pmd.h |    3 +
 2 files changed, 1626 insertions(+), 2 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 7a21c57..5f32813 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -15,6 +15,9 @@
 #include <rte_hexdump.h>
 #include <rte_pci.h>
 #include <rte_bus_pci.h>
+#ifdef RTE_BBDEV_OFFLOAD_COST
+#include <rte_cycles.h>
+#endif
 
 #include <rte_bbdev.h>
 #include <rte_bbdev_pmd.h>
@@ -449,7 +452,6 @@
         return 0;
 }
 
-
 /**
  * Report a ACC100 queue index which is free
  * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
@@ -634,6 +636,46 @@
         struct acc100_device *d = dev->data->dev_private;
 
         static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
+               {
+                       .type   = RTE_BBDEV_OP_LDPC_ENC,
+                       .cap.ldpc_enc = {
+                               .capability_flags =
+                                       RTE_BBDEV_LDPC_RATE_MATCH |
+                                       RTE_BBDEV_LDPC_CRC_24B_ATTACH |
+                                       RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
+                               .num_buffers_src =
+                                               RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+                               .num_buffers_dst =
+                                               RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+                       }
+               },
+               {
+                       .type   = RTE_BBDEV_OP_LDPC_DEC,
+                       .cap.ldpc_dec = {
+                       .capability_flags =
+                               RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK |
+                               RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP |
+                               RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
+                               RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
+#ifdef ACC100_EXT_MEM
+                               RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
+                               RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
+#endif
+                               RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE |
+                               RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS |
+                               RTE_BBDEV_LDPC_DECODE_BYPASS |
+                               RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
+                               RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
+                               RTE_BBDEV_LDPC_LLR_COMPRESSION,
+                       .llr_size = 8,
+                       .llr_decimals = 1,
+                       .num_buffers_src =
+                                       RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+                       .num_buffers_hard_out =
+                                       RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+                       .num_buffers_soft_out = 0,
+                       }
+               },
                 RTE_BBDEV_END_OF_CAPABILITIES_LIST()
         };
 
@@ -669,9 +711,14 @@
         dev_info->cpu_flag_reqs = NULL;
         dev_info->min_alignment = 64;
         dev_info->capabilities = bbdev_capabilities;
+#ifdef ACC100_EXT_MEM
         dev_info->harq_buffer_size = d->ddr_size;
+#else
+       dev_info->harq_buffer_size = 0;
+#endif
 }
 
+
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
         .setup_queues = acc100_setup_queues,
         .close = acc100_dev_close,
@@ -696,6 +743,1577 @@
         {.device_id = 0},
 };
 
+/* Read flag value 0/1 from bitmap */
+static inline bool
+check_bit(uint32_t bitmap, uint32_t bitmask)
+{
+       return bitmap & bitmask;
+}
+
+static inline char *
+mbuf_append(struct rte_mbuf *m_head, struct rte_mbuf *m, uint16_t len)
+{
+       if (unlikely(len > rte_pktmbuf_tailroom(m)))
+               return NULL;
+
+       char *tail = (char *)m->buf_addr + m->data_off + m->data_len;
+       m->data_len = (uint16_t)(m->data_len + len);
+       m_head->pkt_len  = (m_head->pkt_len + len);
+       return tail;
+}
+
+/* Compute value of k0.
+ * Starting position of the different redundancy versions, k0,
+ * based on 3GPP TS 38.212 Table 5.4.2.1-2.
+ */
+static inline uint16_t
+get_k0(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv_index)
+{
+       if (rv_index == 0)
+               return 0;
+       uint16_t n = (bg == 1 ? N_ZC_1 : N_ZC_2) * z_c;
+       if (n_cb == n) {
+               if (rv_index == 1)
+                       return (bg == 1 ? K0_1_1 : K0_1_2) * z_c;
+               else if (rv_index == 2)
+                       return (bg == 1 ? K0_2_1 : K0_2_2) * z_c;
+               else
+                       return (bg == 1 ? K0_3_1 : K0_3_2) * z_c;
+       }
+       /* LBRM case - includes a division by N */
+       if (rv_index == 1)
+               return (((bg == 1 ? K0_1_1 : K0_1_2) * n_cb)
+                               / n) * z_c;
+       else if (rv_index == 2)
+               return (((bg == 1 ? K0_2_1 : K0_2_2) * n_cb)
+                               / n) * z_c;
+       else
+               return (((bg == 1 ? K0_3_1 : K0_3_2) * n_cb)
+                               / n) * z_c;
+}
+
+/* Fill in a frame control word for LDPC encoding. */
+static inline void
+acc100_fcw_le_fill(const struct rte_bbdev_enc_op *op,
+               struct acc100_fcw_le *fcw, int num_cb)
+{
+       fcw->qm = op->ldpc_enc.q_m;
+       fcw->nfiller = op->ldpc_enc.n_filler;
+       fcw->BG = (op->ldpc_enc.basegraph - 1);
+       fcw->Zc = op->ldpc_enc.z_c;
+       fcw->ncb = op->ldpc_enc.n_cb;
+       fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_enc.basegraph,
+                       op->ldpc_enc.rv_index);
+       fcw->rm_e = op->ldpc_enc.cb_params.e;
+       fcw->crc_select = check_bit(op->ldpc_enc.op_flags,
+                       RTE_BBDEV_LDPC_CRC_24B_ATTACH);
+       fcw->bypass_intlv = check_bit(op->ldpc_enc.op_flags,
+                       RTE_BBDEV_LDPC_INTERLEAVER_BYPASS);
+       fcw->mcb_count = num_cb;
+}
+
+/* Fill in a frame control word for LDPC decoding. */
+static inline void
+acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
+               union acc100_harq_layout_data *harq_layout)
+{
+       uint16_t harq_out_length, harq_in_length, ncb_p, k0_p, parity_offset;
+       uint16_t harq_index;
+       uint32_t l;
+       bool harq_prun = false;
+
+       fcw->qm = op->ldpc_dec.q_m;
+       fcw->nfiller = op->ldpc_dec.n_filler;
+       fcw->BG = (op->ldpc_dec.basegraph - 1);
+       fcw->Zc = op->ldpc_dec.z_c;
+       fcw->ncb = op->ldpc_dec.n_cb;
+       fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_dec.basegraph,
+                       op->ldpc_dec.rv_index);
+       if (op->ldpc_dec.code_block_mode == 1)
+               fcw->rm_e = op->ldpc_dec.cb_params.e;
+       else
+               fcw->rm_e = (op->ldpc_dec.tb_params.r <
+                               op->ldpc_dec.tb_params.cab) ?
+                                               op->ldpc_dec.tb_params.ea :
+                                               op->ldpc_dec.tb_params.eb;
+
+       fcw->hcin_en = check_bit(op->ldpc_dec.op_flags,
+                       RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE);
+       fcw->hcout_en = check_bit(op->ldpc_dec.op_flags,
+                       RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE);
+       fcw->crc_select = check_bit(op->ldpc_dec.op_flags,
+                       RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK);
+       fcw->bypass_dec = check_bit(op->ldpc_dec.op_flags,
+                       RTE_BBDEV_LDPC_DECODE_BYPASS);
+       fcw->bypass_intlv = check_bit(op->ldpc_dec.op_flags,
+                       RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS);
+       if (op->ldpc_dec.q_m == 1) {
+               fcw->bypass_intlv = 1;
+               fcw->qm = 2;
+       }
+       fcw->hcin_decomp_mode = check_bit(op->ldpc_dec.op_flags,
+                       RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+       fcw->hcout_comp_mode = check_bit(op->ldpc_dec.op_flags,
+                       RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+       fcw->llr_pack_mode = check_bit(op->ldpc_dec.op_flags,
+                       RTE_BBDEV_LDPC_LLR_COMPRESSION);
+       harq_index = op->ldpc_dec.harq_combined_output.offset /
+                       ACC100_HARQ_OFFSET;
+#ifdef ACC100_EXT_MEM
+       /* Limit cases when HARQ pruning is valid */
+       harq_prun = ((op->ldpc_dec.harq_combined_output.offset %
+                       ACC100_HARQ_OFFSET) == 0) &&
+                       (op->ldpc_dec.harq_combined_output.offset <= UINT16_MAX
+                       * ACC100_HARQ_OFFSET);
+#endif
+       if (fcw->hcin_en > 0) {
+               harq_in_length = op->ldpc_dec.harq_combined_input.length;
+               if (fcw->hcin_decomp_mode > 0)
+                       harq_in_length = harq_in_length * 8 / 6;
+               harq_in_length = RTE_ALIGN(harq_in_length, 64);
+               if ((harq_layout[harq_index].offset > 0) & harq_prun) {
+                       rte_bbdev_log_debug("HARQ IN offset unexpected for now\n");
+                       fcw->hcin_size0 = harq_layout[harq_index].size0;
+                       fcw->hcin_offset = harq_layout[harq_index].offset;
+                       fcw->hcin_size1 = harq_in_length -
+                                       harq_layout[harq_index].offset;
+               } else {
+                       fcw->hcin_size0 = harq_in_length;
+                       fcw->hcin_offset = 0;
+                       fcw->hcin_size1 = 0;
+               }
+       } else {
+               fcw->hcin_size0 = 0;
+               fcw->hcin_offset = 0;
+               fcw->hcin_size1 = 0;
+       }
+
+       fcw->itmax = op->ldpc_dec.iter_max;
+       fcw->itstop = check_bit(op->ldpc_dec.op_flags,
+                       RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE);
+       fcw->synd_precoder = fcw->itstop;
+       /*
+        * These are all implicitly set
+        * fcw->synd_post = 0;
+        * fcw->so_en = 0;
+        * fcw->so_bypass_rm = 0;
+        * fcw->so_bypass_intlv = 0;
+        * fcw->dec_convllr = 0;
+        * fcw->hcout_convllr = 0;
+        * fcw->hcout_size1 = 0;
+        * fcw->so_it = 0;
+        * fcw->hcout_offset = 0;
+        * fcw->negstop_th = 0;
+        * fcw->negstop_it = 0;
+        * fcw->negstop_en = 0;
+        * fcw->gain_i = 1;
+        * fcw->gain_h = 1;
+        */
+       if (fcw->hcout_en > 0) {
+               parity_offset = (op->ldpc_dec.basegraph == 1 ? 20 : 8)
+                       * op->ldpc_dec.z_c - op->ldpc_dec.n_filler;
+               k0_p = (fcw->k0 > parity_offset) ?
+                               fcw->k0 - op->ldpc_dec.n_filler : fcw->k0;
+               ncb_p = fcw->ncb - op->ldpc_dec.n_filler;
+               l = k0_p + fcw->rm_e;
+               harq_out_length = (uint16_t) fcw->hcin_size0;
+               harq_out_length = RTE_MIN(RTE_MAX(harq_out_length, l), ncb_p);
+               harq_out_length = (harq_out_length + 0x3F) & 0xFFC0;
+               if ((k0_p > fcw->hcin_size0 + ACC100_HARQ_OFFSET_THRESHOLD) &&
+                               harq_prun) {
+                       fcw->hcout_size0 = (uint16_t) fcw->hcin_size0;
+                       fcw->hcout_offset = k0_p & 0xFFC0;
+                       fcw->hcout_size1 = harq_out_length - fcw->hcout_offset;
+               } else {
+                       fcw->hcout_size0 = harq_out_length;
+                       fcw->hcout_size1 = 0;
+                       fcw->hcout_offset = 0;
+               }
+               harq_layout[harq_index].offset = fcw->hcout_offset;
+               harq_layout[harq_index].size0 = fcw->hcout_size0;
+       } else {
+               fcw->hcout_size0 = 0;
+               fcw->hcout_size1 = 0;
+               fcw->hcout_offset = 0;
+       }
+}
+
+/**
+ * Fills descriptor with data pointers of one block type.
+ *
+ * @param desc
+ *   Pointer to DMA descriptor.
+ * @param input
+ *   Pointer to pointer to input data which will be encoded. It can be changed
+ *   and points to next segment in scatter-gather case.
+ * @param offset
+ *   Input offset in rte_mbuf structure. It is used for calculating the point
+ *   where data is starting.
+ * @param cb_len
+ *   Length of currently processed Code Block
+ * @param seg_total_left
+ *   It indicates how many bytes still left in segment (mbuf) for further
+ *   processing.
+ * @param op_flags
+ *   Store information about device capabilities
+ * @param next_triplet
+ *   Index for ACC100 DMA Descriptor triplet
+ *
+ * @return
+ *   Returns index of next triplet on success, other value if lengths of
+ *   pkt and processed cb do not match.
+ *
+ */
+static inline int
+acc100_dma_fill_blk_type_in(struct acc100_dma_req_desc *desc,
+               struct rte_mbuf **input, uint32_t *offset, uint32_t cb_len,
+               uint32_t *seg_total_left, int next_triplet)
+{
+       uint32_t part_len;
+       struct rte_mbuf *m = *input;
+
+       part_len = (*seg_total_left < cb_len) ? *seg_total_left : cb_len;
+       cb_len -= part_len;
+       *seg_total_left -= part_len;
+
+       desc->data_ptrs[next_triplet].address =
+                       rte_pktmbuf_iova_offset(m, *offset);
+       desc->data_ptrs[next_triplet].blen = part_len;
+       desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
+       desc->data_ptrs[next_triplet].last = 0;
+       desc->data_ptrs[next_triplet].dma_ext = 0;
+       *offset += part_len;
+       next_triplet++;
+
+       while (cb_len > 0) {
+               if (next_triplet < ACC100_DMA_MAX_NUM_POINTERS &&
+                               m->next != NULL) {
+
+                       m = m->next;
+                       *seg_total_left = rte_pktmbuf_data_len(m);
+                       part_len = (*seg_total_left < cb_len) ?
+                                       *seg_total_left :
+                                       cb_len;
+                       desc->data_ptrs[next_triplet].address =
+                                       rte_pktmbuf_mtophys(m);
+                       desc->data_ptrs[next_triplet].blen = part_len;
+                       desc->data_ptrs[next_triplet].blkid =
+                                       ACC100_DMA_BLKID_IN;
+                       desc->data_ptrs[next_triplet].last = 0;
+                       desc->data_ptrs[next_triplet].dma_ext = 0;
+                       cb_len -= part_len;
+                       *seg_total_left -= part_len;
+                       /* Initializing offset for next segment (mbuf) */
+                       *offset = part_len;
+                       next_triplet++;
+               } else {
+                       rte_bbdev_log(ERR,
+                               "Some data still left for processing: "
+                               "data_left: %u, next_triplet: %u, next_mbuf: %p",
+                               cb_len, next_triplet, m->next);
+                       return -EINVAL;
+               }
+       }
+       /* Storing new mbuf as it could be changed in scatter-gather case */
+       *input = m;
+
+       return next_triplet;
+}
+
+/* Fills descriptor with data pointers of one block type.
+ * Returns index of next triplet on success, other value if lengths of
+ * output data and processed mbuf do not match.
+ */
+static inline int
+acc100_dma_fill_blk_type_out(struct acc100_dma_req_desc *desc,
+               struct rte_mbuf *output, uint32_t out_offset,
+               uint32_t output_len, int next_triplet, int blk_id)
+{
+       desc->data_ptrs[next_triplet].address =
+                       rte_pktmbuf_iova_offset(output, out_offset);
+       desc->data_ptrs[next_triplet].blen = output_len;
+       desc->data_ptrs[next_triplet].blkid = blk_id;
+       desc->data_ptrs[next_triplet].last = 0;
+       desc->data_ptrs[next_triplet].dma_ext = 0;
+       next_triplet++;
+
+       return next_triplet;
+}
+
+static inline int
+acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
+               struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
+               struct rte_mbuf *output, uint32_t *in_offset,
+               uint32_t *out_offset, uint32_t *out_length,
+               uint32_t *mbuf_total_left, uint32_t *seg_total_left)
+{
+       int next_triplet = 1; /* FCW already done */
+       uint16_t K, in_length_in_bits, in_length_in_bytes;
+       struct rte_bbdev_op_ldpc_enc *enc = &op->ldpc_enc;
+
+       desc->word0 = ACC100_DMA_DESC_TYPE;
+       desc->word1 = 0; /**< Timestamp could be disabled */
+       desc->word2 = 0;
+       desc->word3 = 0;
+       desc->numCBs = 1;
+
+       K = (enc->basegraph == 1 ? 22 : 10) * enc->z_c;
+       in_length_in_bits = K - enc->n_filler;
+       if ((enc->op_flags & RTE_BBDEV_LDPC_CRC_24A_ATTACH) ||
+                       (enc->op_flags & RTE_BBDEV_LDPC_CRC_24B_ATTACH))
+               in_length_in_bits -= 24;
+       in_length_in_bytes = in_length_in_bits >> 3;
+
+       if (unlikely((*mbuf_total_left == 0) ||
+                       (*mbuf_total_left < in_length_in_bytes))) {
+               rte_bbdev_log(ERR,
+                               "Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+                               *mbuf_total_left, in_length_in_bytes);
+               return -1;
+       }
+
+       next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
+                       in_length_in_bytes,
+                       seg_total_left, next_triplet);
+       if (unlikely(next_triplet < 0)) {
+               rte_bbdev_log(ERR,
+                               "Mismatch between data to process and mbuf data length in bbdev_op: %p",
+                               op);
+               return -1;
+       }
+       desc->data_ptrs[next_triplet - 1].last = 1;
+       desc->m2dlen = next_triplet;
+       *mbuf_total_left -= in_length_in_bytes;
+
+       /* Set output length */
+       /* Integer round up division by 8 */
+       *out_length = (enc->cb_params.e + 7) >> 3;
+
+       next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
+                       *out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
+       if (unlikely(next_triplet < 0)) {
+               rte_bbdev_log(ERR,
+                               "Mismatch between data to process and mbuf data length in bbdev_op: %p",
+                               op);
+               return -1;
+       }
+       op->ldpc_enc.output.length += *out_length;
+       *out_offset += *out_length;
+       desc->data_ptrs[next_triplet - 1].last = 1;
+       desc->data_ptrs[next_triplet - 1].dma_ext = 0;
+       desc->d2mlen = next_triplet - desc->m2dlen;
+
+       desc->op_addr = op;
+
+       return 0;
+}
+
+static inline int
+acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
+               struct acc100_dma_req_desc *desc,
+               struct rte_mbuf **input, struct rte_mbuf *h_output,
+               uint32_t *in_offset, uint32_t *h_out_offset,
+               uint32_t *h_out_length, uint32_t *mbuf_total_left,
+               uint32_t *seg_total_left,
+               struct acc100_fcw_ld *fcw)
+{
+       struct rte_bbdev_op_ldpc_dec *dec = &op->ldpc_dec;
+       int next_triplet = 1; /* FCW already done */
+       uint32_t input_length;
+       uint16_t output_length, crc24_overlap = 0;
+       uint16_t sys_cols, K, h_p_size, h_np_size;
+       bool h_comp = check_bit(dec->op_flags,
+                       RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+
+       desc->word0 = ACC100_DMA_DESC_TYPE;
+       desc->word1 = 0; /**< Timestamp could be disabled */
+       desc->word2 = 0;
+       desc->word3 = 0;
+       desc->numCBs = 1;
+
+       if (check_bit(op->ldpc_dec.op_flags,
+                       RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP))
+               crc24_overlap = 24;
+
+       /* Compute some LDPC BG lengths */
+       input_length = dec->cb_params.e;
+       if (check_bit(op->ldpc_dec.op_flags,
+                       RTE_BBDEV_LDPC_LLR_COMPRESSION))
+               input_length = (input_length * 3 + 3) / 4;
+       sys_cols = (dec->basegraph == 1) ? 22 : 10;
+       K = sys_cols * dec->z_c;
+       output_length = K - dec->n_filler - crc24_overlap;
+
+       if (unlikely((*mbuf_total_left == 0) ||
+                       (*mbuf_total_left < input_length))) {
+               rte_bbdev_log(ERR,
+                               "Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+                               *mbuf_total_left, input_length);
+               return -1;
+       }
+
+       next_triplet = acc100_dma_fill_blk_type_in(desc, input,
+                       in_offset, input_length,
+                       seg_total_left, next_triplet);
+
+       if (unlikely(next_triplet < 0)) {
+               rte_bbdev_log(ERR,
+                               "Mismatch between data to process and mbuf data length in bbdev_op: %p",
+                               op);
+               return -1;
+       }
+
+       if (check_bit(op->ldpc_dec.op_flags,
+                               RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
+               h_p_size = fcw->hcin_size0 + fcw->hcin_size1;
+               if (h_comp)
+                       h_p_size = (h_p_size * 3 + 3) / 4;
+               desc->data_ptrs[next_triplet].address =
+                               dec->harq_combined_input.offset;
+               desc->data_ptrs[next_triplet].blen = h_p_size;
+               desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN_HARQ;
+               desc->data_ptrs[next_triplet].dma_ext = 1;
+#ifndef ACC100_EXT_MEM
+               acc100_dma_fill_blk_type_out(
+                               desc,
+                               op->ldpc_dec.harq_combined_input.data,
+                               op->ldpc_dec.harq_combined_input.offset,
+                               h_p_size,
+                               next_triplet,
+                               ACC100_DMA_BLKID_IN_HARQ);
+#endif
+               next_triplet++;
+       }
+
+       desc->data_ptrs[next_triplet - 1].last = 1;
+       desc->m2dlen = next_triplet;
+       *mbuf_total_left -= input_length;
+
+       next_triplet = acc100_dma_fill_blk_type_out(desc, h_output,
+                       *h_out_offset, output_length >> 3, next_triplet,
+                       ACC100_DMA_BLKID_OUT_HARD);
+       if (unlikely(next_triplet < 0)) {
+               rte_bbdev_log(ERR,
+                               "Mismatch between data to process and mbuf data length in bbdev_op: %p",
+                               op);
+               return -1;
+       }
+
+       if (check_bit(op->ldpc_dec.op_flags,
+                               RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
+               /* Pruned size of the HARQ */
+               h_p_size = fcw->hcout_size0 + fcw->hcout_size1;
+               /* Non-Pruned size of the HARQ */
+               h_np_size = fcw->hcout_offset > 0 ?
+                               fcw->hcout_offset + fcw->hcout_size1 :
+                               h_p_size;
+               if (h_comp) {
+                       h_np_size = (h_np_size * 3 + 3) / 4;
+                       h_p_size = (h_p_size * 3 + 3) / 4;
+               }
+               dec->harq_combined_output.length = h_np_size;
+               desc->data_ptrs[next_triplet].address =
+                               dec->harq_combined_output.offset;
+               desc->data_ptrs[next_triplet].blen = h_p_size;
+               desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_OUT_HARQ;
+               desc->data_ptrs[next_triplet].dma_ext = 1;
+#ifndef ACC100_EXT_MEM
+               acc100_dma_fill_blk_type_out(
+                               desc,
+                               dec->harq_combined_output.data,
+                               dec->harq_combined_output.offset,
+                               h_p_size,
+                               next_triplet,
+                               ACC100_DMA_BLKID_OUT_HARQ);
+#endif
+               next_triplet++;
+       }
+
+       *h_out_length = output_length >> 3;
+       dec->hard_output.length += *h_out_length;
+       *h_out_offset += *h_out_length;
+       desc->data_ptrs[next_triplet - 1].last = 1;
+       desc->d2mlen = next_triplet - desc->m2dlen;
+
+       desc->op_addr = op;
+
+       return 0;
+}
+
+static inline void
+acc100_dma_desc_ld_update(struct rte_bbdev_dec_op *op,
+               struct acc100_dma_req_desc *desc,
+               struct rte_mbuf *input, struct rte_mbuf *h_output,
+               uint32_t *in_offset, uint32_t *h_out_offset,
+               uint32_t *h_out_length,
+               union acc100_harq_layout_data *harq_layout)
+{
+       int next_triplet = 1; /* FCW already done */
+       desc->data_ptrs[next_triplet].address =
+                       rte_pktmbuf_iova_offset(input, *in_offset);
+       next_triplet++;
+
+       if (check_bit(op->ldpc_dec.op_flags,
+                               RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
+               struct rte_bbdev_op_data hi = op->ldpc_dec.harq_combined_input;
+               desc->data_ptrs[next_triplet].address = hi.offset;
+#ifndef ACC100_EXT_MEM
+               desc->data_ptrs[next_triplet].address =
+                               rte_pktmbuf_iova_offset(hi.data, hi.offset);
+#endif
+               next_triplet++;
+       }
+
+       desc->data_ptrs[next_triplet].address =
+                       rte_pktmbuf_iova_offset(h_output, *h_out_offset);
+       *h_out_length = desc->data_ptrs[next_triplet].blen;
+       next_triplet++;
+
+       if (check_bit(op->ldpc_dec.op_flags,
+                               RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
+               desc->data_ptrs[next_triplet].address =
+                               op->ldpc_dec.harq_combined_output.offset;
+               /* Adjust based on previous operation */
+               struct rte_bbdev_dec_op *prev_op = desc->op_addr;
+               op->ldpc_dec.harq_combined_output.length =
+                               prev_op->ldpc_dec.harq_combined_output.length;
+               int16_t hq_idx = op->ldpc_dec.harq_combined_output.offset /
+                               ACC100_HARQ_OFFSET;
+               int16_t prev_hq_idx =
+                               prev_op->ldpc_dec.harq_combined_output.offset
+                               / ACC100_HARQ_OFFSET;
+               harq_layout[hq_idx].val = harq_layout[prev_hq_idx].val;
+#ifndef ACC100_EXT_MEM
+               struct rte_bbdev_op_data ho =
+                               op->ldpc_dec.harq_combined_output;
+               desc->data_ptrs[next_triplet].address =
+                               rte_pktmbuf_iova_offset(ho.data, ho.offset);
+#endif
+               next_triplet++;
+       }
+
+       op->ldpc_dec.hard_output.length += *h_out_length;
+       desc->op_addr = op;
+}
+
+
+/* Enqueue a number of operations to HW and update software rings */
+static inline void
+acc100_dma_enqueue(struct acc100_queue *q, uint16_t n,
+               struct rte_bbdev_stats *queue_stats)
+{
+       union acc100_enqueue_reg_fmt enq_req;
+#ifdef RTE_BBDEV_OFFLOAD_COST
+       uint64_t start_time = 0;
+       queue_stats->acc_offload_cycles = 0;
+       RTE_SET_USED(queue_stats);
+#else
+       RTE_SET_USED(queue_stats);
+#endif
+
+       enq_req.val = 0;
+       /* Setting offset, 100b for 256 DMA Desc */
+       enq_req.addr_offset = ACC100_DESC_OFFSET;
+
+       /* Split ops into batches */
+       do {
+               union acc100_dma_desc *desc;
+               uint16_t enq_batch_size;
+               uint64_t offset;
+               rte_iova_t req_elem_addr;
+
+               enq_batch_size = RTE_MIN(n, MAX_ENQ_BATCH_SIZE);
+
+               /* Set flag on last descriptor in a batch */
+               desc = q->ring_addr + ((q->sw_ring_head + enq_batch_size - 1) &
+                               q->sw_ring_wrap_mask);
+               desc->req.last_desc_in_batch = 1;
+
+               /* Calculate the 1st descriptor's address */
+               offset = ((q->sw_ring_head & q->sw_ring_wrap_mask) *
+                               sizeof(union acc100_dma_desc));
+               req_elem_addr = q->ring_addr_phys + offset;
+
+               /* Fill enqueue struct */
+               enq_req.num_elem = enq_batch_size;
+               /* low 6 bits are not needed */
+               enq_req.req_elem_addr = (uint32_t)(req_elem_addr >> 6);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+               rte_memdump(stderr, "Req sdone", desc, sizeof(*desc));
+#endif
+               rte_bbdev_log_debug(
+                               "Enqueue %u reqs (phys %#"PRIx64") to reg %p",
+                               enq_batch_size,
+                               req_elem_addr,
+                               (void *)q->mmio_reg_enqueue);
+
+               rte_wmb();
+
+#ifdef RTE_BBDEV_OFFLOAD_COST
+               /* Start time measurement for enqueue function offload. */
+               start_time = rte_rdtsc_precise();
+#endif
+               rte_bbdev_log(DEBUG, "Debug : MMIO Enqueue");
+               mmio_write(q->mmio_reg_enqueue, enq_req.val);
+
+#ifdef RTE_BBDEV_OFFLOAD_COST
+               queue_stats->acc_offload_cycles +=
+                               rte_rdtsc_precise() - start_time;
+#endif
+
+               q->aq_enqueued++;
+               q->sw_ring_head += enq_batch_size;
+               n -= enq_batch_size;
+
+       } while (n);
+
+
+}
+
+/* Enqueue a number of encode operations for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
+               uint16_t total_enqueued_cbs, int16_t num)
+{
+       union acc100_dma_desc *desc = NULL;
+       uint32_t out_length;
+       struct rte_mbuf *output_head, *output;
+       int i, next_triplet;
+       uint16_t  in_length_in_bytes;
+       struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
+
+       uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+                       & q->sw_ring_wrap_mask);
+       desc = q->ring_addr + desc_idx;
+       acc100_fcw_le_fill(ops[0], &desc->req.fcw_le, num);
+
+       /* This could be done at polling */
+       desc->req.word0 = ACC100_DMA_DESC_TYPE;
+       desc->req.word1 = 0; /**< Timestamp could be disabled */
+       desc->req.word2 = 0;
+       desc->req.word3 = 0;
+       desc->req.numCBs = num;
+
+       in_length_in_bytes = ops[0]->ldpc_enc.input.data->data_len;
+       out_length = (enc->cb_params.e + 7) >> 3;
+       desc->req.m2dlen = 1 + num;
+       desc->req.d2mlen = num;
+       next_triplet = 1;
+
+       for (i = 0; i < num; i++) {
+               desc->req.data_ptrs[next_triplet].address =
+                       rte_pktmbuf_iova_offset(ops[i]->ldpc_enc.input.data, 0);
+               desc->req.data_ptrs[next_triplet].blen = in_length_in_bytes;
+               next_triplet++;
+               desc->req.data_ptrs[next_triplet].address =
+                               rte_pktmbuf_iova_offset(
+                               ops[i]->ldpc_enc.output.data, 0);
+               desc->req.data_ptrs[next_triplet].blen = out_length;
+               next_triplet++;
+               ops[i]->ldpc_enc.output.length = out_length;
+               output_head = output = ops[i]->ldpc_enc.output.data;
+               mbuf_append(output_head, output, out_length);
+               output->data_len = out_length;
+       }
+
+       desc->req.op_addr = ops[0];
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+       rte_memdump(stderr, "FCW", &desc->req.fcw_le,
+                       sizeof(desc->req.fcw_le) - 8);
+       rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+       /* Multiple CBs (ops) were successfully prepared to enqueue */
+       return num;
+}
+
+/* Enqueue one encode operation for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+               uint16_t total_enqueued_cbs)
+{
+       union acc100_dma_desc *desc = NULL;
+       int ret;
+       uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+               seg_total_left;
+       struct rte_mbuf *input, *output_head, *output;
+
+       uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+                       & q->sw_ring_wrap_mask);
+       desc = q->ring_addr + desc_idx;
+       acc100_fcw_le_fill(op, &desc->req.fcw_le, 1);
+
+       input = op->ldpc_enc.input.data;
+       output_head = output = op->ldpc_enc.output.data;
+       in_offset = op->ldpc_enc.input.offset;
+       out_offset = op->ldpc_enc.output.offset;
+       out_length = 0;
+       mbuf_total_left = op->ldpc_enc.input.length;
+       seg_total_left = rte_pktmbuf_data_len(op->ldpc_enc.input.data)
+                       - in_offset;
+
+       ret = acc100_dma_desc_le_fill(op, &desc->req, &input, output,
+                       &in_offset, &out_offset, &out_length, &mbuf_total_left,
+                       &seg_total_left);
+
+       if (unlikely(ret < 0))
+               return ret;
+
+       mbuf_append(output_head, output, out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+       rte_memdump(stderr, "FCW", &desc->req.fcw_le,
+                       sizeof(desc->req.fcw_le) - 8);
+       rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+
+       /* Check if any data left after processing one CB */
+       if (mbuf_total_left != 0) {
+               rte_bbdev_log(ERR,
                               "Some data still left after processing one CB: mbuf_total_left = %u",
+                               mbuf_total_left);
+               return -EINVAL;
+       }
+#endif
+       /* One CB (one op) was successfully prepared to enqueue */
+       return 1;
+}
+
+/* Enqueue one decode operation for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+               uint16_t total_enqueued_cbs, bool same_op)
+{
+       int ret;
+
+       union acc100_dma_desc *desc;
+       uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+                       & q->sw_ring_wrap_mask);
+       desc = q->ring_addr + desc_idx;
+       struct rte_mbuf *input, *h_output_head, *h_output;
+       uint32_t in_offset, h_out_offset, h_out_length, mbuf_total_left;
+       input = op->ldpc_dec.input.data;
+       h_output_head = h_output = op->ldpc_dec.hard_output.data;
+       in_offset = op->ldpc_dec.input.offset;
+       h_out_offset = op->ldpc_dec.hard_output.offset;
+       mbuf_total_left = op->ldpc_dec.input.length;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+       if (unlikely(input == NULL)) {
+               rte_bbdev_log(ERR, "Invalid mbuf pointer");
+               return -EFAULT;
+       }
+#endif
+       union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+
+       if (same_op) {
+               union acc100_dma_desc *prev_desc;
+               desc_idx = ((q->sw_ring_head + total_enqueued_cbs - 1)
+                               & q->sw_ring_wrap_mask);
+               prev_desc = q->ring_addr + desc_idx;
+               uint8_t *prev_ptr = (uint8_t *) prev_desc;
+               uint8_t *new_ptr = (uint8_t *) desc;
+               /* Copy first 4 words and BDESCs */
+               rte_memcpy(new_ptr, prev_ptr, 16);
+               rte_memcpy(new_ptr + 36, prev_ptr + 36, 40);
+               desc->req.op_addr = prev_desc->req.op_addr;
+               /* Copy FCW */
+               rte_memcpy(new_ptr + ACC100_DESC_FCW_OFFSET,
+                               prev_ptr + ACC100_DESC_FCW_OFFSET,
+                               ACC100_FCW_LD_BLEN);
+               acc100_dma_desc_ld_update(op, &desc->req, input, h_output,
+                               &in_offset, &h_out_offset,
+                               &h_out_length, harq_layout);
+       } else {
+               struct acc100_fcw_ld *fcw;
+               uint32_t seg_total_left;
+               fcw = &desc->req.fcw_ld;
+               acc100_fcw_ld_fill(op, fcw, harq_layout);
+
+               /* Special handling when overusing mbuf */
+               if (fcw->rm_e < MAX_E_MBUF)
+                       seg_total_left = rte_pktmbuf_data_len(input)
+                                       - in_offset;
+               else
+                       seg_total_left = fcw->rm_e;
+
+               ret = acc100_dma_desc_ld_fill(op, &desc->req, &input, h_output,
+                               &in_offset, &h_out_offset,
+                               &h_out_length, &mbuf_total_left,
+                               &seg_total_left, fcw);
+               if (unlikely(ret < 0))
+                       return ret;
+       }
+
+       /* Hard output */
+       mbuf_append(h_output_head, h_output, h_out_length);
+#ifndef ACC100_EXT_MEM
+       if (op->ldpc_dec.harq_combined_output.length > 0) {
+               /* Push the HARQ output into host memory */
+               struct rte_mbuf *hq_output_head, *hq_output;
+               hq_output_head = op->ldpc_dec.harq_combined_output.data;
+               hq_output = op->ldpc_dec.harq_combined_output.data;
+               mbuf_append(hq_output_head, hq_output,
+                               op->ldpc_dec.harq_combined_output.length);
+       }
+#endif
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+       rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
+                       sizeof(desc->req.fcw_ld) - 8);
+       rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+       /* One CB (one op) was successfully prepared to enqueue */
+       return 1;
+}
+
+
+/* Enqueue one decode operation for ACC100 device in TB mode */
+static inline int
+enqueue_ldpc_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+               uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
+{
+       union acc100_dma_desc *desc = NULL;
+       int ret;
+       uint8_t r, c;
+       uint32_t in_offset, h_out_offset,
+               h_out_length, mbuf_total_left, seg_total_left;
+       struct rte_mbuf *input, *h_output_head, *h_output;
+       uint16_t current_enqueued_cbs = 0;
+
+       uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+                       & q->sw_ring_wrap_mask);
+       desc = q->ring_addr + desc_idx;
+       uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+       union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+       acc100_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout);
+
+       input = op->ldpc_dec.input.data;
+       h_output_head = h_output = op->ldpc_dec.hard_output.data;
+       in_offset = op->ldpc_dec.input.offset;
+       h_out_offset = op->ldpc_dec.hard_output.offset;
+       h_out_length = 0;
+       mbuf_total_left = op->ldpc_dec.input.length;
+       c = op->ldpc_dec.tb_params.c;
+       r = op->ldpc_dec.tb_params.r;
+
+       while (mbuf_total_left > 0 && r < c) {
+
+               seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+
+               /* Set up DMA descriptor */
+               desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+                               & q->sw_ring_wrap_mask);
+               desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+               desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
+               ret = acc100_dma_desc_ld_fill(op, &desc->req, &input,
+                               h_output, &in_offset, &h_out_offset,
+                               &h_out_length,
+                               &mbuf_total_left, &seg_total_left,
+                               &desc->req.fcw_ld);
+
+               if (unlikely(ret < 0))
+                       return ret;
+
+               /* Hard output */
+               mbuf_append(h_output_head, h_output, h_out_length);
+
+               /* Set total number of CBs in TB */
+               desc->req.cbs_in_tb = cbs_in_tb;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+               rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
+                               sizeof(desc->req.fcw_ld) - 8);
+               rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+               if (seg_total_left == 0) {
+                       /* Go to the next mbuf */
+                       input = input->next;
+                       in_offset = 0;
+                       h_output = h_output->next;
+                       h_out_offset = 0;
+               }
+               total_enqueued_cbs++;
+               current_enqueued_cbs++;
+               r++;
+       }
+
+       if (unlikely(desc == NULL))
+               return current_enqueued_cbs;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+       /* Check if any CBs left for processing */
+       if (mbuf_total_left != 0) {
+               rte_bbdev_log(ERR,
+                               "Some data still left for processing: mbuf_total_left = %u",
+                               mbuf_total_left);
+               return -EINVAL;
+       }
+#endif
+       /* Set SDone on last CB descriptor for TB mode */
+       desc->req.sdone_enable = 1;
+       desc->req.irq_enable = q->irq_enable;
+
+       return current_enqueued_cbs;
+}
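All of the descriptor index computations above assume a power-of-two software ring, so `sw_ring_wrap_mask` is `depth - 1` and `(head + n) & mask` wraps without a division. A minimal standalone sketch of that invariant (names are illustrative, not the driver's own):

```c
#include <assert.h>
#include <stdint.h>

/* Wrap a monotonically increasing head+offset into a power-of-two ring */
static uint16_t
ring_desc_idx(uint16_t sw_ring_head, uint16_t n, uint16_t ring_depth)
{
	uint16_t wrap_mask = ring_depth - 1; /* ring_depth must be 2^k */
	return (sw_ring_head + n) & wrap_mask;
}
```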
+
+
+/* Calculates number of CBs in processed encoder TB based on 'r' and input
+ * length.
+ */
+static inline uint8_t
+get_num_cbs_in_tb_enc(struct rte_bbdev_op_turbo_enc *turbo_enc)
+{
+       uint8_t c, c_neg, r, crc24_bits = 0;
+       uint16_t k, k_neg, k_pos;
+       uint8_t cbs_in_tb = 0;
+       int32_t length;
+
+       length = turbo_enc->input.length;
+       r = turbo_enc->tb_params.r;
+       c = turbo_enc->tb_params.c;
+       c_neg = turbo_enc->tb_params.c_neg;
+       k_neg = turbo_enc->tb_params.k_neg;
+       k_pos = turbo_enc->tb_params.k_pos;
+       crc24_bits = 0;
+       if (check_bit(turbo_enc->op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH))
+               crc24_bits = 24;
+       while (length > 0 && r < c) {
+               k = (r < c_neg) ? k_neg : k_pos;
+               length -= (k - crc24_bits) >> 3;
+               r++;
+               cbs_in_tb++;
+       }
+
+       return cbs_in_tb;
+}
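The CB-count loop above can be exercised on its own. The sketch below restates the same arithmetic outside the driver; the parameter values used in the test are purely illustrative, not taken from any real 3GPP configuration:

```c
#include <assert.h>
#include <stdint.h>

/* Standalone restatement of the loop above: consume the TB payload in
 * code blocks of k_neg/k_pos bits (minus attached CRC), counting CBs.
 */
static uint8_t
toy_num_cbs_in_tb_enc(int32_t length, uint8_t r, uint8_t c, uint8_t c_neg,
		uint16_t k_neg, uint16_t k_pos, uint8_t crc24_bits)
{
	uint8_t cbs_in_tb = 0;
	while (length > 0 && r < c) {
		uint16_t k = (r < c_neg) ? k_neg : k_pos;
		length -= (k - crc24_bits) >> 3; /* bits to bytes */
		r++;
		cbs_in_tb++;
	}
	return cbs_in_tb;
}
```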
+
+/* Calculates number of CBs in processed decoder TB based on 'r' and input
+ * length.
+ */
+static inline uint16_t
+get_num_cbs_in_tb_dec(struct rte_bbdev_op_turbo_dec *turbo_dec)
+{
+       uint8_t c, c_neg, r = 0;
+       uint16_t kw, k, k_neg, k_pos, cbs_in_tb = 0;
+       int32_t length;
+
+       length = turbo_dec->input.length;
+       r = turbo_dec->tb_params.r;
+       c = turbo_dec->tb_params.c;
+       c_neg = turbo_dec->tb_params.c_neg;
+       k_neg = turbo_dec->tb_params.k_neg;
+       k_pos = turbo_dec->tb_params.k_pos;
+       while (length > 0 && r < c) {
+               k = (r < c_neg) ? k_neg : k_pos;
+               kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
+               length -= kw;
+               r++;
+               cbs_in_tb++;
+       }
+
+       return cbs_in_tb;
+}
+
+/* Calculates number of CBs in processed LDPC decoder TB based on 'r' and
+ * input length.
+ */
+static inline uint16_t
+get_num_cbs_in_tb_ldpc_dec(struct rte_bbdev_op_ldpc_dec *ldpc_dec)
+{
+       uint16_t r, cbs_in_tb = 0;
+       int32_t length = ldpc_dec->input.length;
+       r = ldpc_dec->tb_params.r;
+       while (length > 0 && r < ldpc_dec->tb_params.c) {
+               length -=  (r < ldpc_dec->tb_params.cab) ?
+                               ldpc_dec->tb_params.ea :
+                               ldpc_dec->tb_params.eb;
+               r++;
+               cbs_in_tb++;
+       }
+       return cbs_in_tb;
+}
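The LDPC variant differs from the turbo case only in how per-CB sizes are chosen: the first `cab` code blocks carry `ea` bytes of rate-matched data and the remainder carry `eb`. A standalone sketch with made-up sizes:

```c
#include <assert.h>
#include <stdint.h>

/* Same walk as above with the LDPC ea/eb split (sizes are illustrative) */
static uint16_t
toy_num_cbs_in_tb_ldpc_dec(int32_t length, uint16_t r, uint16_t c,
		uint16_t cab, uint32_t ea, uint32_t eb)
{
	uint16_t cbs_in_tb = 0;
	while (length > 0 && r < c) {
		length -= (r < cab) ? ea : eb;
		r++;
		cbs_in_tb++;
	}
	return cbs_in_tb;
}
```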
+
+/* Check we can mux encode operations with common FCW */
+static inline bool
+check_mux(struct rte_bbdev_enc_op **ops, uint16_t num) {
+       uint16_t i;
+       if (num == 1)
+               return false;
+       for (i = 1; i < num; ++i) {
+               /* Only mux compatible code blocks */
+               if (memcmp((uint8_t *)(&ops[i]->ldpc_enc) + ENC_OFFSET,
+                               (uint8_t *)(&ops[0]->ldpc_enc) + ENC_OFFSET,
+                               CMP_ENC_SIZE) != 0)
+                       return false;
+       }
+       return true;
+}
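`check_mux()` relies on all FCW-relevant fields being laid out contiguously in the op structure, so a single `memcmp()` over a fixed window decides whether code blocks can share one FCW. A toy illustration of the idea; the struct, `TOY_MUX_OFFSET` and `TOY_MUX_SIZE` are placeholders standing in for the real `ldpc_enc` layout, `ENC_OFFSET` and `CMP_ENC_SIZE`:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy op layout: the first field is per-op (data pointer), everything
 * after it feeds the FCW and must match for two ops to be muxed.
 */
struct toy_ldpc_enc {
	uintptr_t data;     /* per-op, excluded from the comparison */
	uint32_t e;         /* FCW-relevant fields start here */
	uint32_t basegraph;
	uint32_t z_c;
};
#define TOY_MUX_OFFSET offsetof(struct toy_ldpc_enc, e)
/* compare up to the end of the last field, avoiding tail padding */
#define TOY_MUX_SIZE (offsetof(struct toy_ldpc_enc, z_c) \
		+ sizeof(uint32_t) - TOY_MUX_OFFSET)

static int
toy_check_mux(const struct toy_ldpc_enc *a, const struct toy_ldpc_enc *b)
{
	return memcmp((const uint8_t *)a + TOY_MUX_OFFSET,
			(const uint8_t *)b + TOY_MUX_OFFSET,
			TOY_MUX_SIZE) == 0;
}
```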
+
+/* Enqueue encode operations for ACC100 device in CB mode. */
+static inline uint16_t
+acc100_enqueue_ldpc_enc_cb(struct rte_bbdev_queue_data *q_data,
+               struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+       struct acc100_queue *q = q_data->queue_private;
+       int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+       uint16_t i = 0;
+       union acc100_dma_desc *desc;
+       int ret, desc_idx = 0;
+       int16_t enq, left = num;
+
+       while (left > 0) {
+               if (unlikely(avail - 1 < 0))
+                       break;
+               avail--;
+               enq = RTE_MIN(left, MUX_5GDL_DESC);
+               if (check_mux(&ops[i], enq)) {
+                       ret = enqueue_ldpc_enc_n_op_cb(q, &ops[i],
+                                       desc_idx, enq);
+                       if (ret < 0)
+                               break;
+                       i += enq;
+               } else {
+                       ret = enqueue_ldpc_enc_one_op_cb(q, ops[i], desc_idx);
+                       if (ret < 0)
+                               break;
+                       i++;
+               }
+               desc_idx++;
+               left = num - i;
+       }
+
+       if (unlikely(i == 0))
+               return 0; /* Nothing to enqueue */
+
+       /* Set SDone in last CB in enqueued ops for CB mode */
+       desc = q->ring_addr + ((q->sw_ring_head + desc_idx - 1)
+                       & q->sw_ring_wrap_mask);
+       desc->req.sdone_enable = 1;
+       desc->req.irq_enable = q->irq_enable;
+
+       acc100_dma_enqueue(q, desc_idx, &q_data->queue_stats);
+
+       /* Update stats */
+       q_data->queue_stats.enqueued_count += i;
+       q_data->queue_stats.enqueue_err_count += num - i;
+
+       return i;
+}
+
+/* Enqueue encode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
+               struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+       if (unlikely(num == 0))
+               return 0;
+       return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
+}
+
+/* Check we can mux decode operations with common FCW */
+static inline bool
+cmp_ldpc_dec_op(struct rte_bbdev_dec_op **ops) {
+       /* Only mux compatible code blocks */
+       if (memcmp((uint8_t *)(&ops[0]->ldpc_dec) + DEC_OFFSET,
+                       (uint8_t *)(&ops[1]->ldpc_dec) +
+                       DEC_OFFSET, CMP_DEC_SIZE) != 0) {
+               return false;
+       } else
+               return true;
+}
+
+
+/* Enqueue decode operations for ACC100 device in TB mode */
+static uint16_t
+acc100_enqueue_ldpc_dec_tb(struct rte_bbdev_queue_data *q_data,
+               struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+       struct acc100_queue *q = q_data->queue_private;
+       int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+       uint16_t i, enqueued_cbs = 0;
+       uint8_t cbs_in_tb;
+       int ret;
+
+       for (i = 0; i < num; ++i) {
+               cbs_in_tb = get_num_cbs_in_tb_ldpc_dec(&ops[i]->ldpc_dec);
+               /* Check if there is available space for further processing */
+               if (unlikely(avail - cbs_in_tb < 0))
+                       break;
+               avail -= cbs_in_tb;
+
+               ret = enqueue_ldpc_dec_one_op_tb(q, ops[i],
+                               enqueued_cbs, cbs_in_tb);
+               if (ret < 0)
+                       break;
+               enqueued_cbs += ret;
+       }
+
+       acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
+
+       /* Update stats */
+       q_data->queue_stats.enqueued_count += i;
+       q_data->queue_stats.enqueue_err_count += num - i;
+       return i;
+}
+
+/* Enqueue decode operations for ACC100 device in CB mode */
+static uint16_t
+acc100_enqueue_ldpc_dec_cb(struct rte_bbdev_queue_data *q_data,
+               struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+       struct acc100_queue *q = q_data->queue_private;
+       int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+       uint16_t i;
+       union acc100_dma_desc *desc;
+       int ret;
+       bool same_op = false;
+       for (i = 0; i < num; ++i) {
+               /* Check if there is available space for further processing */
+               if (unlikely(avail - 1 < 0))
+                       break;
+               avail -= 1;
+
+               if (i > 0)
+                       same_op = cmp_ldpc_dec_op(&ops[i-1]);
+               rte_bbdev_log(INFO, "Op %d %d %d %d %d %d %d %d %d %d %d %d",
+                       i, ops[i]->ldpc_dec.op_flags, ops[i]->ldpc_dec.rv_index,
+                       ops[i]->ldpc_dec.iter_max, ops[i]->ldpc_dec.iter_count,
+                       ops[i]->ldpc_dec.basegraph, ops[i]->ldpc_dec.z_c,
+                       ops[i]->ldpc_dec.n_cb, ops[i]->ldpc_dec.q_m,
+                       ops[i]->ldpc_dec.n_filler, ops[i]->ldpc_dec.cb_params.e,
+                       same_op);
+               ret = enqueue_ldpc_dec_one_op_cb(q, ops[i], i, same_op);
+               if (ret < 0)
+                       break;
+       }
+
+       if (unlikely(i == 0))
+               return 0; /* Nothing to enqueue */
+
+       /* Set SDone in last CB in enqueued ops for CB mode */
+       desc = q->ring_addr + ((q->sw_ring_head + i - 1)
+                       & q->sw_ring_wrap_mask);
+
+       desc->req.sdone_enable = 1;
+       desc->req.irq_enable = q->irq_enable;
+
+       acc100_dma_enqueue(q, i, &q_data->queue_stats);
+
+       /* Update stats */
+       q_data->queue_stats.enqueued_count += i;
+       q_data->queue_stats.enqueue_err_count += num - i;
+       return i;
+}
+
+/* Enqueue decode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
+               struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+       struct acc100_queue *q = q_data->queue_private;
+       int32_t aq_avail = q->aq_depth +
+                       (q->aq_dequeued - q->aq_enqueued) / 128;
+
+       if (unlikely((aq_avail == 0) || (num == 0)))
+               return 0;
+
+       if (ops[0]->ldpc_dec.code_block_mode == 0)
+               return acc100_enqueue_ldpc_dec_tb(q_data, ops, num);
+       else
+               return acc100_enqueue_ldpc_dec_cb(q_data, ops, num);
+}
+
+
+/* Dequeue one encode operation from ACC100 device in CB mode */
+static inline int
+dequeue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
+               uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
+{
+       union acc100_dma_desc *desc, atom_desc;
+       union acc100_dma_rsp_desc rsp;
+       struct rte_bbdev_enc_op *op;
+       int i;
+
+       desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
+                       & q->sw_ring_wrap_mask);
+       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+                       __ATOMIC_RELAXED);
+
+       /* Check fdone bit */
+       if (!(atom_desc.rsp.val & ACC100_FDONE))
+               return -1;
+
+       rsp.val = atom_desc.rsp.val;
+       rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
+
+       /* Dequeue */
+       op = desc->req.op_addr;
+
+       /* Clearing status, it will be set based on response */
+       op->status = 0;
+
+       op->status |= ((rsp.input_err)
+                       ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+       op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+       op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+       if (desc->req.last_desc_in_batch) {
+               (*aq_dequeued)++;
+               desc->req.last_desc_in_batch = 0;
+       }
+       desc->rsp.val = ACC100_DMA_DESC_TYPE;
+       desc->rsp.add_info_0 = 0; /* Reserved bits */
+       desc->rsp.add_info_1 = 0; /* Reserved bits */
+
+       /* Flag that muxing causes loss of opaque data */
+       op->opaque_data = (void *)-1;
+       for (i = 0 ; i < desc->req.numCBs; i++)
+               ref_op[i] = op;
+
+       /* One or more CBs (ops) were successfully dequeued */
+       return desc->req.numCBs;
+}
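The status accumulation in the dequeue paths follows one pattern: each response error flag sets a bit in `op->status`, and the DMA and FCW errors share the driver-error bit. A toy restatement of that mapping (the bit positions are stand-ins, not the real `rte_bbdev_op_status` values):

```c
#include <assert.h>
#include <stdint.h>

enum { TOY_DATA_ERROR = 0, TOY_DRV_ERROR = 1 }; /* illustrative positions */

static uint32_t
toy_rsp_to_status(int input_err, int dma_err, int fcw_err)
{
	uint32_t status = 0;
	status |= input_err ? (1u << TOY_DATA_ERROR) : 0;
	status |= dma_err ? (1u << TOY_DRV_ERROR) : 0;
	status |= fcw_err ? (1u << TOY_DRV_ERROR) : 0; /* shares DRV bit */
	return status;
}
```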
+
+/* Dequeue one encode operation from ACC100 device in TB mode */
+static inline int
+dequeue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
+               uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
+{
+       union acc100_dma_desc *desc, *last_desc, atom_desc;
+       union acc100_dma_rsp_desc rsp;
+       struct rte_bbdev_enc_op *op;
+       uint8_t i = 0;
+       uint16_t current_dequeued_cbs = 0, cbs_in_tb;
+
+       desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
+                       & q->sw_ring_wrap_mask);
+       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+                       __ATOMIC_RELAXED);
+
+       /* Check fdone bit */
+       if (!(atom_desc.rsp.val & ACC100_FDONE))
+               return -1;
+
+       /* Get number of CBs in dequeued TB */
+       cbs_in_tb = desc->req.cbs_in_tb;
+       /* Get last CB */
+       last_desc = q->ring_addr + ((q->sw_ring_tail
+                       + total_dequeued_cbs + cbs_in_tb - 1)
+                       & q->sw_ring_wrap_mask);
+       /* Check if last CB in TB is ready to dequeue (and thus
+        * the whole TB) - checking sdone bit. If not return.
+        */
+       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
+                       __ATOMIC_RELAXED);
+       if (!(atom_desc.rsp.val & ACC100_SDONE))
+               return -1;
+
+       /* Dequeue */
+       op = desc->req.op_addr;
+
+       /* Clearing status, it will be set based on response */
+       op->status = 0;
+
+       while (i < cbs_in_tb) {
+               desc = q->ring_addr + ((q->sw_ring_tail
+                               + total_dequeued_cbs)
+                               & q->sw_ring_wrap_mask);
+               atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+                               __ATOMIC_RELAXED);
+               rsp.val = atom_desc.rsp.val;
+               rte_bbdev_log_debug("Resp. desc %p: %x", desc,
+                               rsp.val);
+
+               op->status |= ((rsp.input_err)
+                               ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+               op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+               op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+               if (desc->req.last_desc_in_batch) {
+                       (*aq_dequeued)++;
+                       desc->req.last_desc_in_batch = 0;
+               }
+               desc->rsp.val = ACC100_DMA_DESC_TYPE;
+               desc->rsp.add_info_0 = 0;
+               desc->rsp.add_info_1 = 0;
+               total_dequeued_cbs++;
+               current_dequeued_cbs++;
+               i++;
+       }
+
+       *ref_op = op;
+
+       return current_dequeued_cbs;
+}
+
+/* Dequeue one decode operation from ACC100 device in CB mode */
+static inline int
+dequeue_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
+               struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+               uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+       union acc100_dma_desc *desc, atom_desc;
+       union acc100_dma_rsp_desc rsp;
+       struct rte_bbdev_dec_op *op;
+
+       desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+                       & q->sw_ring_wrap_mask);
+       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+                       __ATOMIC_RELAXED);
+
+       /* Check fdone bit */
+       if (!(atom_desc.rsp.val & ACC100_FDONE))
+               return -1;
+
+       rsp.val = atom_desc.rsp.val;
+       rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
+
+       /* Dequeue */
+       op = desc->req.op_addr;
+
+       /* Clearing status, it will be set based on response */
+       op->status = 0;
+       op->status |= ((rsp.input_err)
+                       ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+       op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+       op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+       if (op->status != 0)
+               q_data->queue_stats.dequeue_err_count++;
+
+       /* CRC invalid if error exists */
+       if (!op->status)
+               op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+       op->turbo_dec.iter_count = (uint8_t) rsp.iter_cnt / 2;
+       /* Check if this is the last desc in batch (Atomic Queue) */
+       if (desc->req.last_desc_in_batch) {
+               (*aq_dequeued)++;
+               desc->req.last_desc_in_batch = 0;
+       }
+       desc->rsp.val = ACC100_DMA_DESC_TYPE;
+       desc->rsp.add_info_0 = 0;
+       desc->rsp.add_info_1 = 0;
+       *ref_op = op;
+
+       /* One CB (op) was successfully dequeued */
+       return 1;
+}
+
+/* Dequeue one decode operation from ACC100 device in CB mode */
+static inline int
+dequeue_ldpc_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
+               struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+               uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+       union acc100_dma_desc *desc, atom_desc;
+       union acc100_dma_rsp_desc rsp;
+       struct rte_bbdev_dec_op *op;
+
+       desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+                       & q->sw_ring_wrap_mask);
+       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+                       __ATOMIC_RELAXED);
+
+       /* Check fdone bit */
+       if (!(atom_desc.rsp.val & ACC100_FDONE))
+               return -1;
+
+       rsp.val = atom_desc.rsp.val;
+
+       /* Dequeue */
+       op = desc->req.op_addr;
+
+       /* Clearing status, it will be set based on response */
+       op->status = 0;
+       op->status |= rsp.input_err << RTE_BBDEV_DATA_ERROR;
+       op->status |= rsp.dma_err << RTE_BBDEV_DRV_ERROR;
+       op->status |= rsp.fcw_err << RTE_BBDEV_DRV_ERROR;
+       if (op->status != 0)
+               q_data->queue_stats.dequeue_err_count++;
+
+       op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+       if (op->ldpc_dec.hard_output.length > 0 && !rsp.synd_ok)
+               op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
+       op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
+
+       /* Check if this is the last desc in batch (Atomic Queue) */
+       if (desc->req.last_desc_in_batch) {
+               (*aq_dequeued)++;
+               desc->req.last_desc_in_batch = 0;
+       }
+
+       desc->rsp.val = ACC100_DMA_DESC_TYPE;
+       desc->rsp.add_info_0 = 0;
+       desc->rsp.add_info_1 = 0;
+
+       *ref_op = op;
+
+       /* One CB (op) was successfully dequeued */
+       return 1;
+}
+
+/* Dequeue one decode operation from ACC100 device in TB mode. */
+static inline int
+dequeue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+               uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+       union acc100_dma_desc *desc, *last_desc, atom_desc;
+       union acc100_dma_rsp_desc rsp;
+       struct rte_bbdev_dec_op *op;
+       uint8_t cbs_in_tb = 1, cb_idx = 0;
+
+       desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+                       & q->sw_ring_wrap_mask);
+       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+                       __ATOMIC_RELAXED);
+
+       /* Check fdone bit */
+       if (!(atom_desc.rsp.val & ACC100_FDONE))
+               return -1;
+
+       /* Dequeue */
+       op = desc->req.op_addr;
+
+       /* Get number of CBs in dequeued TB */
+       cbs_in_tb = desc->req.cbs_in_tb;
+       /* Get last CB */
+       last_desc = q->ring_addr + ((q->sw_ring_tail
+                       + dequeued_cbs + cbs_in_tb - 1)
+                       & q->sw_ring_wrap_mask);
+       /* Check if last CB in TB is ready to dequeue (and thus
+        * the whole TB) - checking sdone bit. If not return.
+        */
+       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
+                       __ATOMIC_RELAXED);
+       if (!(atom_desc.rsp.val & ACC100_SDONE))
+               return -1;
+
+       /* Clearing status, it will be set based on response */
+       op->status = 0;
+
+       /* Read remaining CBs if any exist */
+       while (cb_idx < cbs_in_tb) {
+               desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+                               & q->sw_ring_wrap_mask);
+               atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+                               __ATOMIC_RELAXED);
+               rsp.val = atom_desc.rsp.val;
+               rte_bbdev_log_debug("Resp. desc %p: %x", desc,
+                               rsp.val);
+
+               op->status |= ((rsp.input_err)
+                               ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+               op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+               op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+               /* CRC invalid if error exists */
+               if (!op->status)
+                       op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+               op->turbo_dec.iter_count = RTE_MAX((uint8_t) rsp.iter_cnt,
+                               op->turbo_dec.iter_count);
+
+               /* Check if this is the last desc in batch (Atomic Queue) */
+               if (desc->req.last_desc_in_batch) {
+                       (*aq_dequeued)++;
+                       desc->req.last_desc_in_batch = 0;
+               }
+               desc->rsp.val = ACC100_DMA_DESC_TYPE;
+               desc->rsp.add_info_0 = 0;
+               desc->rsp.add_info_1 = 0;
+               dequeued_cbs++;
+               cb_idx++;
+       }
+
+       *ref_op = op;
+
+       return cb_idx;
+}
+
+/* Dequeue LDPC encode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
+               struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+       struct acc100_queue *q = q_data->queue_private;
+       uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+       uint32_t aq_dequeued = 0;
+       uint16_t dequeue_num, i, dequeued_cbs = 0, dequeued_descs = 0;
+       int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+       if (unlikely(ops == NULL || q == NULL))
+               return 0;
+#endif
+
+       dequeue_num = (avail < num) ? avail : num;
+
+       for (i = 0; i < dequeue_num; i++) {
+               ret = dequeue_enc_one_op_cb(q, &ops[dequeued_cbs],
+                               dequeued_descs, &aq_dequeued);
+               if (ret < 0)
+                       break;
+               dequeued_cbs += ret;
+               dequeued_descs++;
+               if (dequeued_cbs >= num)
+                       break;
+       }
+
+       q->aq_dequeued += aq_dequeued;
+       q->sw_ring_tail += dequeued_descs;
+
+       /* Update dequeue stats */
+       q_data->queue_stats.dequeued_count += dequeued_cbs;
+
+       return dequeued_cbs;
+}
+
+/* Dequeue decode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
+               struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+       struct acc100_queue *q = q_data->queue_private;
+       uint16_t dequeue_num;
+       uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+       uint32_t aq_dequeued = 0;
+       uint16_t i;
+       uint16_t dequeued_cbs = 0;
+       struct rte_bbdev_dec_op *op;
+       int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+       if (unlikely(ops == NULL || q == NULL))
+               return 0;
+#endif
+
+       dequeue_num = (avail < num) ? avail : num;
+
+       for (i = 0; i < dequeue_num; ++i) {
+               op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+                       & q->sw_ring_wrap_mask))->req.op_addr;
+               if (op->ldpc_dec.code_block_mode == 0)
+                       ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
+                                       &aq_dequeued);
+               else
+                       ret = dequeue_ldpc_dec_one_op_cb(
+                                       q_data, q, &ops[i], dequeued_cbs,
+                                       &aq_dequeued);
+
+               if (ret < 0)
+                       break;
+               dequeued_cbs += ret;
+       }
+
+       q->aq_dequeued += aq_dequeued;
+       q->sw_ring_tail += dequeued_cbs;
+
+       /* Update dequeue stats */
+       q_data->queue_stats.dequeued_count += i;
+
+       return i;
+}
+
 /* Initialization Function */
 static void
 acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
@@ -703,6 +2321,10 @@
         struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
 
         dev->dev_ops = &acc100_bbdev_ops;
+       dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
+       dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
+       dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
+       dev->dequeue_ldpc_dec_ops = acc100_dequeue_ldpc_dec;
 
         ((struct acc100_device *) dev->data->dev_private)->pf_device =
                         !strcmp(drv->driver.name,
@@ -815,4 +2437,3 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
 RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
 RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
 RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
-
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 0e2b79c..78686c1 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -88,6 +88,8 @@
 #define TMPL_PRI_3      0x0f0e0d0c
 #define QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
 #define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
+#define ACC100_FDONE    0x80000000
+#define ACC100_SDONE    0x40000000
 
 #define ACC100_NUM_TMPL  32
 #define VF_OFFSET_QOS 16 /* offset in Memory Space specific to QoS Mon */
@@ -398,6 +400,7 @@ struct __rte_packed acc100_dma_req_desc {
 union acc100_dma_desc {
         struct acc100_dma_req_desc req;
         union acc100_dma_rsp_desc rsp;
+       uint64_t atom_hdr;
 };
 
 
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions
  2020-08-20 14:38   ` Dave Burley
@ 2020-08-20 14:52     ` Chautru, Nicolas
  2020-08-20 14:57       ` Dave Burley
  0 siblings, 1 reply; 213+ messages in thread
From: Chautru, Nicolas @ 2020-08-20 14:52 UTC (permalink / raw)
  To: Dave Burley, dev; +Cc: Richardson, Bruce

Hi Dave, 
This assumes 6-bit LLR compression packing (i.e. the first 2 MSBs are dropped), similar to HARQ compression.
Let me know if this is still unclear; I can clarify further in the documentation if it is not explicit enough.
Thanks
Nic
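
For readers following the thread, the packing described above can be sketched roughly as follows. This is an illustrative Python model under the stated assumption (each 8-bit signed LLR keeps only its 6 LSBs, packed contiguously MSB-first); `pack_llrs_6bit` is a hypothetical helper for explanation, not part of the driver or bbdev API:

```python
def pack_llrs_6bit(llrs):
    """Pack 8-bit signed LLRs into a 6-bit-per-LLR byte stream.

    Each LLR is truncated to its 6 LSBs (the 2 MSBs are dropped) and the
    6-bit fields are concatenated MSB-first, so every 4 LLRs fit in 3 bytes.
    """
    bitbuf = 0
    nbits = 0
    out = bytearray()
    for llr in llrs:
        bitbuf = (bitbuf << 6) | (llr & 0x3F)  # keep 6 LSBs only
        nbits += 6
        while nbits >= 8:
            nbits -= 8
            out.append((bitbuf >> nbits) & 0xFF)
    if nbits:
        out.append((bitbuf << (8 - nbits)) & 0xFF)  # zero-pad final byte
    return bytes(out)
```

Note the resulting 3/4 size ratio matches the `(input_length * 3 + 3) / 4` sizing used in the patch when RTE_BBDEV_LDPC_LLR_COMPRESSION is set.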

> -----Original Message-----
> From: Dave Burley <dave.burley@accelercomm.com>
> Sent: Thursday, August 20, 2020 7:39 AM
> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org
> Cc: Richardson, Bruce <bruce.richardson@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC
> processing functions
> 
> Hi Nic,
> 
> As you've now specified the use of RTE_BBDEV_LDPC_LLR_COMPRESSION for
> this PMB, please could you confirm what the packed format of the LLRs in
> memory looks like?
> 
> Best Regards
> 
> Dave Burley
> 
> 
> From: dev <dev-bounces@dpdk.org> on behalf of Nicolas Chautru
> <nicolas.chautru@intel.com>
> Sent: 19 August 2020 01:25
> To: dev@dpdk.org <dev@dpdk.org>; akhil.goyal@nxp.com
> <akhil.goyal@nxp.com>
> Cc: bruce.richardson@intel.com <bruce.richardson@intel.com>; Nicolas
> Chautru <nicolas.chautru@intel.com>
> Subject: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing
> functions
> 
> Adding LDPC decode and encode processing operations
> 
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
>  drivers/baseband/acc100/rte_acc100_pmd.c | 1625 +++++++++++++++++++++++-
>  drivers/baseband/acc100/rte_acc100_pmd.h |    3 +
>  2 files changed, 1626 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
> index 7a21c57..5f32813 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> @@ -15,6 +15,9 @@
>  #include <rte_hexdump.h>
>  #include <rte_pci.h>
>  #include <rte_bus_pci.h>
> +#ifdef RTE_BBDEV_OFFLOAD_COST
> +#include <rte_cycles.h>
> +#endif
> 
>  #include <rte_bbdev.h>
>  #include <rte_bbdev_pmd.h>
> @@ -449,7 +452,6 @@
>          return 0;
>  }
> 
> -
>  /**
>   * Report a ACC100 queue index which is free
>   * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
> @@ -634,6 +636,46 @@
>          struct acc100_device *d = dev->data->dev_private;
> 
>          static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
> +               {
> +                       .type   = RTE_BBDEV_OP_LDPC_ENC,
> +                       .cap.ldpc_enc = {
> +                               .capability_flags =
> +                                       RTE_BBDEV_LDPC_RATE_MATCH |
> +                                       RTE_BBDEV_LDPC_CRC_24B_ATTACH |
> +                                       RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
> +                               .num_buffers_src =
> +                                               RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> +                               .num_buffers_dst =
> +                                               RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> +                       }
> +               },
> +               {
> +                       .type   = RTE_BBDEV_OP_LDPC_DEC,
> +                       .cap.ldpc_dec = {
> +                       .capability_flags =
> +                               RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK |
> +                               RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP |
> +                               RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
> +                               RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
> +#ifdef ACC100_EXT_MEM
> +                               RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
> +                               RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
> +#endif
> +                               RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE |
> +                               RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS |
> +                               RTE_BBDEV_LDPC_DECODE_BYPASS |
> +                               RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
> +                               RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
> +                               RTE_BBDEV_LDPC_LLR_COMPRESSION,
> +                       .llr_size = 8,
> +                       .llr_decimals = 1,
> +                       .num_buffers_src =
> +                                       RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> +                       .num_buffers_hard_out =
> +                                       RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> +                       .num_buffers_soft_out = 0,
> +                       }
> +               },
>                  RTE_BBDEV_END_OF_CAPABILITIES_LIST()
>          };
> 
> @@ -669,9 +711,14 @@
>          dev_info->cpu_flag_reqs = NULL;
>          dev_info->min_alignment = 64;
>          dev_info->capabilities = bbdev_capabilities;
> +#ifdef ACC100_EXT_MEM
>          dev_info->harq_buffer_size = d->ddr_size;
> +#else
> +       dev_info->harq_buffer_size = 0;
> +#endif
>  }
> 
> +
>  static const struct rte_bbdev_ops acc100_bbdev_ops = {
>          .setup_queues = acc100_setup_queues,
>          .close = acc100_dev_close,
> @@ -696,6 +743,1577 @@
>          {.device_id = 0},
>  };
> 
> +/* Read flag value 0/1 from bitmap */
> +static inline bool
> +check_bit(uint32_t bitmap, uint32_t bitmask)
> +{
> +       return bitmap & bitmask;
> +}
> +
> +static inline char *
> +mbuf_append(struct rte_mbuf *m_head, struct rte_mbuf *m, uint16_t len)
> +{
> +       if (unlikely(len > rte_pktmbuf_tailroom(m)))
> +               return NULL;
> +
> +       char *tail = (char *)m->buf_addr + m->data_off + m->data_len;
> +       m->data_len = (uint16_t)(m->data_len + len);
> +       m_head->pkt_len  = (m_head->pkt_len + len);
> +       return tail;
> +}
> +
> +/* Compute value of k0.
> + * Based on 3GPP 38.212 Table 5.4.2.1-2
> + * Starting position of different redundancy versions, k0
> + */
> +static inline uint16_t
> +get_k0(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv_index)
> +{
> +       if (rv_index == 0)
> +               return 0;
> +       uint16_t n = (bg == 1 ? N_ZC_1 : N_ZC_2) * z_c;
> +       if (n_cb == n) {
> +               if (rv_index == 1)
> +                       return (bg == 1 ? K0_1_1 : K0_1_2) * z_c;
> +               else if (rv_index == 2)
> +                       return (bg == 1 ? K0_2_1 : K0_2_2) * z_c;
> +               else
> +                       return (bg == 1 ? K0_3_1 : K0_3_2) * z_c;
> +       }
> +       /* LBRM case - includes a division by N */
> +       if (rv_index == 1)
> +               return (((bg == 1 ? K0_1_1 : K0_1_2) * n_cb)
> +                               / n) * z_c;
> +       else if (rv_index == 2)
> +               return (((bg == 1 ? K0_2_1 : K0_2_2) * n_cb)
> +                               / n) * z_c;
> +       else
> +               return (((bg == 1 ? K0_3_1 : K0_3_2) * n_cb)
> +                               / n) * z_c;
> +}
> +
> +/* Fill in a frame control word for LDPC encoding. */
> +static inline void
> +acc100_fcw_le_fill(const struct rte_bbdev_enc_op *op,
> +               struct acc100_fcw_le *fcw, int num_cb)
> +{
> +       fcw->qm = op->ldpc_enc.q_m;
> +       fcw->nfiller = op->ldpc_enc.n_filler;
> +       fcw->BG = (op->ldpc_enc.basegraph - 1);
> +       fcw->Zc = op->ldpc_enc.z_c;
> +       fcw->ncb = op->ldpc_enc.n_cb;
> +       fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_enc.basegraph,
> +                       op->ldpc_enc.rv_index);
> +       fcw->rm_e = op->ldpc_enc.cb_params.e;
> +       fcw->crc_select = check_bit(op->ldpc_enc.op_flags,
> +                       RTE_BBDEV_LDPC_CRC_24B_ATTACH);
> +       fcw->bypass_intlv = check_bit(op->ldpc_enc.op_flags,
> +                       RTE_BBDEV_LDPC_INTERLEAVER_BYPASS);
> +       fcw->mcb_count = num_cb;
> +}
> +
> +/* Fill in a frame control word for LDPC decoding. */
> +static inline void
> +acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
> +               union acc100_harq_layout_data *harq_layout)
> +{
> +       uint16_t harq_out_length, harq_in_length, ncb_p, k0_p, parity_offset;
> +       uint16_t harq_index;
> +       uint32_t l;
> +       bool harq_prun = false;
> +
> +       fcw->qm = op->ldpc_dec.q_m;
> +       fcw->nfiller = op->ldpc_dec.n_filler;
> +       fcw->BG = (op->ldpc_dec.basegraph - 1);
> +       fcw->Zc = op->ldpc_dec.z_c;
> +       fcw->ncb = op->ldpc_dec.n_cb;
> +       fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_dec.basegraph,
> +                       op->ldpc_dec.rv_index);
> +       if (op->ldpc_dec.code_block_mode == 1)
> +               fcw->rm_e = op->ldpc_dec.cb_params.e;
> +       else
> +               fcw->rm_e = (op->ldpc_dec.tb_params.r <
> +                               op->ldpc_dec.tb_params.cab) ?
> +                                               op->ldpc_dec.tb_params.ea :
> +                                               op->ldpc_dec.tb_params.eb;
> +
> +       fcw->hcin_en = check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE);
> +       fcw->hcout_en = check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE);
> +       fcw->crc_select = check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK);
> +       fcw->bypass_dec = check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_DECODE_BYPASS);
> +       fcw->bypass_intlv = check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS);
> +       if (op->ldpc_dec.q_m == 1) {
> +               fcw->bypass_intlv = 1;
> +               fcw->qm = 2;
> +       }
> +       fcw->hcin_decomp_mode = check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> +       fcw->hcout_comp_mode = check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> +       fcw->llr_pack_mode = check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_LLR_COMPRESSION);
> +       harq_index = op->ldpc_dec.harq_combined_output.offset /
> +                       ACC100_HARQ_OFFSET;
> +#ifdef ACC100_EXT_MEM
> +       /* Limit cases when HARQ pruning is valid */
> +       harq_prun = ((op->ldpc_dec.harq_combined_output.offset %
> +                       ACC100_HARQ_OFFSET) == 0) &&
> +                       (op->ldpc_dec.harq_combined_output.offset <= UINT16_MAX
> +                       * ACC100_HARQ_OFFSET);
> +#endif
> +       if (fcw->hcin_en > 0) {
> +               harq_in_length = op->ldpc_dec.harq_combined_input.length;
> +               if (fcw->hcin_decomp_mode > 0)
> +                       harq_in_length = harq_in_length * 8 / 6;
> +               harq_in_length = RTE_ALIGN(harq_in_length, 64);
> +               if ((harq_layout[harq_index].offset > 0) && harq_prun) {
> +                       rte_bbdev_log_debug("HARQ IN offset unexpected for now\n");
> +                       fcw->hcin_size0 = harq_layout[harq_index].size0;
> +                       fcw->hcin_offset = harq_layout[harq_index].offset;
> +                       fcw->hcin_size1 = harq_in_length -
> +                                       harq_layout[harq_index].offset;
> +               } else {
> +                       fcw->hcin_size0 = harq_in_length;
> +                       fcw->hcin_offset = 0;
> +                       fcw->hcin_size1 = 0;
> +               }
> +       } else {
> +               fcw->hcin_size0 = 0;
> +               fcw->hcin_offset = 0;
> +               fcw->hcin_size1 = 0;
> +       }
> +
> +       fcw->itmax = op->ldpc_dec.iter_max;
> +       fcw->itstop = check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE);
> +       fcw->synd_precoder = fcw->itstop;
> +       /*
> +        * These are all implicitly set
> +        * fcw->synd_post = 0;
> +        * fcw->so_en = 0;
> +        * fcw->so_bypass_rm = 0;
> +        * fcw->so_bypass_intlv = 0;
> +        * fcw->dec_convllr = 0;
> +        * fcw->hcout_convllr = 0;
> +        * fcw->hcout_size1 = 0;
> +        * fcw->so_it = 0;
> +        * fcw->hcout_offset = 0;
> +        * fcw->negstop_th = 0;
> +        * fcw->negstop_it = 0;
> +        * fcw->negstop_en = 0;
> +        * fcw->gain_i = 1;
> +        * fcw->gain_h = 1;
> +        */
> +       if (fcw->hcout_en > 0) {
> +               parity_offset = (op->ldpc_dec.basegraph == 1 ? 20 : 8)
> +                       * op->ldpc_dec.z_c - op->ldpc_dec.n_filler;
> +               k0_p = (fcw->k0 > parity_offset) ?
> +                               fcw->k0 - op->ldpc_dec.n_filler : fcw->k0;
> +               ncb_p = fcw->ncb - op->ldpc_dec.n_filler;
> +               l = k0_p + fcw->rm_e;
> +               harq_out_length = (uint16_t) fcw->hcin_size0;
> +               harq_out_length = RTE_MIN(RTE_MAX(harq_out_length, l), ncb_p);
> +               harq_out_length = (harq_out_length + 0x3F) & 0xFFC0;
> +               if ((k0_p > fcw->hcin_size0 + ACC100_HARQ_OFFSET_THRESHOLD) &&
> +                               harq_prun) {
> +                       fcw->hcout_size0 = (uint16_t) fcw->hcin_size0;
> +                       fcw->hcout_offset = k0_p & 0xFFC0;
> +                       fcw->hcout_size1 = harq_out_length - fcw->hcout_offset;
> +               } else {
> +                       fcw->hcout_size0 = harq_out_length;
> +                       fcw->hcout_size1 = 0;
> +                       fcw->hcout_offset = 0;
> +               }
> +               harq_layout[harq_index].offset = fcw->hcout_offset;
> +               harq_layout[harq_index].size0 = fcw->hcout_size0;
> +       } else {
> +               fcw->hcout_size0 = 0;
> +               fcw->hcout_size1 = 0;
> +               fcw->hcout_offset = 0;
> +       }
> +}
> +
> +/**
> + * Fills descriptor with data pointers of one block type.
> + *
> + * @param desc
> + *   Pointer to DMA descriptor.
> + * @param input
> + *   Pointer to pointer to input data which will be encoded. It can be changed
> + *   and points to next segment in scatter-gather case.
> + * @param offset
> + *   Input offset in rte_mbuf structure. It is used for calculating the point
> + *   where data is starting.
> + * @param cb_len
> + *   Length of currently processed Code Block
> + * @param seg_total_left
> + *   It indicates how many bytes still left in segment (mbuf) for further
> + *   processing.
> + * @param op_flags
> + *   Store information about device capabilities
> + * @param next_triplet
> + *   Index for ACC100 DMA Descriptor triplet
> + *
> + * @return
> + *   Returns index of next triplet on success, other value if lengths of
> + *   pkt and processed cb do not match.
> + *
> + */
> +static inline int
> +acc100_dma_fill_blk_type_in(struct acc100_dma_req_desc *desc,
> +               struct rte_mbuf **input, uint32_t *offset, uint32_t cb_len,
> +               uint32_t *seg_total_left, int next_triplet)
> +{
> +       uint32_t part_len;
> +       struct rte_mbuf *m = *input;
> +
> +       part_len = (*seg_total_left < cb_len) ? *seg_total_left : cb_len;
> +       cb_len -= part_len;
> +       *seg_total_left -= part_len;
> +
> +       desc->data_ptrs[next_triplet].address =
> +                       rte_pktmbuf_iova_offset(m, *offset);
> +       desc->data_ptrs[next_triplet].blen = part_len;
> +       desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
> +       desc->data_ptrs[next_triplet].last = 0;
> +       desc->data_ptrs[next_triplet].dma_ext = 0;
> +       *offset += part_len;
> +       next_triplet++;
> +
> +       while (cb_len > 0) {
> +               if (next_triplet < ACC100_DMA_MAX_NUM_POINTERS &&
> +                               m->next != NULL) {
> +
> +                       m = m->next;
> +                       *seg_total_left = rte_pktmbuf_data_len(m);
> +                       part_len = (*seg_total_left < cb_len) ?
> +                                       *seg_total_left :
> +                                       cb_len;
> +                       desc->data_ptrs[next_triplet].address =
> +                                       rte_pktmbuf_iova(m);
> +                       desc->data_ptrs[next_triplet].blen = part_len;
> +                       desc->data_ptrs[next_triplet].blkid =
> +                                       ACC100_DMA_BLKID_IN;
> +                       desc->data_ptrs[next_triplet].last = 0;
> +                       desc->data_ptrs[next_triplet].dma_ext = 0;
> +                       cb_len -= part_len;
> +                       *seg_total_left -= part_len;
> +                       /* Initializing offset for next segment (mbuf) */
> +                       *offset = part_len;
> +                       next_triplet++;
> +               } else {
> +                       rte_bbdev_log(ERR,
> +                               "Some data still left for processing: "
> +                               "data_left: %u, next_triplet: %u, next_mbuf: %p",
> +                               cb_len, next_triplet, m->next);
> +                       return -EINVAL;
> +               }
> +       }
> +       /* Storing new mbuf as it could be changed in scatter-gather case*/
> +       *input = m;
> +
> +       return next_triplet;
> +}
> +
> +/* Fills descriptor with data pointers of one block type.
> + * Returns index of next triplet on success, other value if lengths of
> + * output data and processed mbuf do not match.
> + */
> +static inline int
> +acc100_dma_fill_blk_type_out(struct acc100_dma_req_desc *desc,
> +               struct rte_mbuf *output, uint32_t out_offset,
> +               uint32_t output_len, int next_triplet, int blk_id)
> +{
> +       desc->data_ptrs[next_triplet].address =
> +                       rte_pktmbuf_iova_offset(output, out_offset);
> +       desc->data_ptrs[next_triplet].blen = output_len;
> +       desc->data_ptrs[next_triplet].blkid = blk_id;
> +       desc->data_ptrs[next_triplet].last = 0;
> +       desc->data_ptrs[next_triplet].dma_ext = 0;
> +       next_triplet++;
> +
> +       return next_triplet;
> +}
> +
> +static inline int
> +acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
> +               struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
> +               struct rte_mbuf *output, uint32_t *in_offset,
> +               uint32_t *out_offset, uint32_t *out_length,
> +               uint32_t *mbuf_total_left, uint32_t *seg_total_left)
> +{
> +       int next_triplet = 1; /* FCW already done */
> +       uint16_t K, in_length_in_bits, in_length_in_bytes;
> +       struct rte_bbdev_op_ldpc_enc *enc = &op->ldpc_enc;
> +
> +       desc->word0 = ACC100_DMA_DESC_TYPE;
> +       desc->word1 = 0; /**< Timestamp could be disabled */
> +       desc->word2 = 0;
> +       desc->word3 = 0;
> +       desc->numCBs = 1;
> +
> +       K = (enc->basegraph == 1 ? 22 : 10) * enc->z_c;
> +       in_length_in_bits = K - enc->n_filler;
> +       if ((enc->op_flags & RTE_BBDEV_LDPC_CRC_24A_ATTACH) ||
> +                       (enc->op_flags & RTE_BBDEV_LDPC_CRC_24B_ATTACH))
> +               in_length_in_bits -= 24;
> +       in_length_in_bytes = in_length_in_bits >> 3;
> +
> +       if (unlikely((*mbuf_total_left == 0) ||
> +                       (*mbuf_total_left < in_length_in_bytes))) {
> +               rte_bbdev_log(ERR,
> +                               "Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
> +                               *mbuf_total_left, in_length_in_bytes);
> +               return -1;
> +       }
> +
> +       next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
> +                       in_length_in_bytes,
> +                       seg_total_left, next_triplet);
> +       if (unlikely(next_triplet < 0)) {
> +               rte_bbdev_log(ERR,
> +                               "Mismatch between data to process and mbuf data length in bbdev_op: %p",
> +                               op);
> +               return -1;
> +       }
> +       desc->data_ptrs[next_triplet - 1].last = 1;
> +       desc->m2dlen = next_triplet;
> +       *mbuf_total_left -= in_length_in_bytes;
> +
> +       /* Set output length */
> +       /* Integer round up division by 8 */
> +       *out_length = (enc->cb_params.e + 7) >> 3;
> +
> +       next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
> +                       *out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
> +       if (unlikely(next_triplet < 0)) {
> +               rte_bbdev_log(ERR,
> +                               "Mismatch between data to process and mbuf data length in bbdev_op: %p",
> +                               op);
> +               return -1;
> +       }
> +       op->ldpc_enc.output.length += *out_length;
> +       *out_offset += *out_length;
> +       desc->data_ptrs[next_triplet - 1].last = 1;
> +       desc->data_ptrs[next_triplet - 1].dma_ext = 0;
> +       desc->d2mlen = next_triplet - desc->m2dlen;
> +
> +       desc->op_addr = op;
> +
> +       return 0;
> +}
> +
> +static inline int
> +acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
> +               struct acc100_dma_req_desc *desc,
> +               struct rte_mbuf **input, struct rte_mbuf *h_output,
> +               uint32_t *in_offset, uint32_t *h_out_offset,
> +               uint32_t *h_out_length, uint32_t *mbuf_total_left,
> +               uint32_t *seg_total_left,
> +               struct acc100_fcw_ld *fcw)
> +{
> +       struct rte_bbdev_op_ldpc_dec *dec = &op->ldpc_dec;
> +       int next_triplet = 1; /* FCW already done */
> +       uint32_t input_length;
> +       uint16_t output_length, crc24_overlap = 0;
> +       uint16_t sys_cols, K, h_p_size, h_np_size;
> +       bool h_comp = check_bit(dec->op_flags,
> +                       RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> +
> +       desc->word0 = ACC100_DMA_DESC_TYPE;
> +       desc->word1 = 0; /**< Timestamp could be disabled */
> +       desc->word2 = 0;
> +       desc->word3 = 0;
> +       desc->numCBs = 1;
> +
> +       if (check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP))
> +               crc24_overlap = 24;
> +
> +       /* Compute some LDPC BG lengths */
> +       input_length = dec->cb_params.e;
> +       if (check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_LLR_COMPRESSION))
> +               input_length = (input_length * 3 + 3) / 4;
> +       sys_cols = (dec->basegraph == 1) ? 22 : 10;
> +       K = sys_cols * dec->z_c;
> +       output_length = K - dec->n_filler - crc24_overlap;
> +
> +       if (unlikely((*mbuf_total_left == 0) ||
> +                       (*mbuf_total_left < input_length))) {
> +               rte_bbdev_log(ERR,
> +                               "Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
> +                               *mbuf_total_left, input_length);
> +               return -1;
> +       }
> +
> +       next_triplet = acc100_dma_fill_blk_type_in(desc, input,
> +                       in_offset, input_length,
> +                       seg_total_left, next_triplet);
> +
> +       if (unlikely(next_triplet < 0)) {
> +               rte_bbdev_log(ERR,
> +                               "Mismatch between data to process and mbuf data length in bbdev_op: %p",
> +                               op);
> +               return -1;
> +       }
> +
> +       if (check_bit(op->ldpc_dec.op_flags,
> +                               RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
> +               h_p_size = fcw->hcin_size0 + fcw->hcin_size1;
> +               if (h_comp)
> +                       h_p_size = (h_p_size * 3 + 3) / 4;
> +               desc->data_ptrs[next_triplet].address =
> +                               dec->harq_combined_input.offset;
> +               desc->data_ptrs[next_triplet].blen = h_p_size;
> +               desc->data_ptrs[next_triplet].blkid =
> +                               ACC100_DMA_BLKID_IN_HARQ;
> +               desc->data_ptrs[next_triplet].dma_ext = 1;
> +#ifndef ACC100_EXT_MEM
> +               acc100_dma_fill_blk_type_out(
> +                               desc,
> +                               op->ldpc_dec.harq_combined_input.data,
> +                               op->ldpc_dec.harq_combined_input.offset,
> +                               h_p_size,
> +                               next_triplet,
> +                               ACC100_DMA_BLKID_IN_HARQ);
> +#endif
> +               next_triplet++;
> +       }
> +
> +       desc->data_ptrs[next_triplet - 1].last = 1;
> +       desc->m2dlen = next_triplet;
> +       *mbuf_total_left -= input_length;
> +
> +       next_triplet = acc100_dma_fill_blk_type_out(desc, h_output,
> +                       *h_out_offset, output_length >> 3, next_triplet,
> +                       ACC100_DMA_BLKID_OUT_HARD);
> +       if (unlikely(next_triplet < 0)) {
> +               rte_bbdev_log(ERR,
> +                               "Mismatch between data to process and mbuf data length in bbdev_op: %p",
> +                               op);
> +               return -1;
> +       }
> +
> +       if (check_bit(op->ldpc_dec.op_flags,
> +                               RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
> +               /* Pruned size of the HARQ */
> +               h_p_size = fcw->hcout_size0 + fcw->hcout_size1;
> +               /* Non-Pruned size of the HARQ */
> +               h_np_size = fcw->hcout_offset > 0 ?
> +                               fcw->hcout_offset + fcw->hcout_size1 :
> +                               h_p_size;
> +               if (h_comp) {
> +                       h_np_size = (h_np_size * 3 + 3) / 4;
> +                       h_p_size = (h_p_size * 3 + 3) / 4;
> +               }
> +               dec->harq_combined_output.length = h_np_size;
> +               desc->data_ptrs[next_triplet].address =
> +                               dec->harq_combined_output.offset;
> +               desc->data_ptrs[next_triplet].blen = h_p_size;
> +               desc->data_ptrs[next_triplet].blkid =
> +                               ACC100_DMA_BLKID_OUT_HARQ;
> +               desc->data_ptrs[next_triplet].dma_ext = 1;
> +#ifndef ACC100_EXT_MEM
> +               acc100_dma_fill_blk_type_out(
> +                               desc,
> +                               dec->harq_combined_output.data,
> +                               dec->harq_combined_output.offset,
> +                               h_p_size,
> +                               next_triplet,
> +                               ACC100_DMA_BLKID_OUT_HARQ);
> +#endif
> +               next_triplet++;
> +       }
> +
> +       *h_out_length = output_length >> 3;
> +       dec->hard_output.length += *h_out_length;
> +       *h_out_offset += *h_out_length;
> +       desc->data_ptrs[next_triplet - 1].last = 1;
> +       desc->d2mlen = next_triplet - desc->m2dlen;
> +
> +       desc->op_addr = op;
> +
> +       return 0;
> +}
> +
> +static inline void
> +acc100_dma_desc_ld_update(struct rte_bbdev_dec_op *op,
> +               struct acc100_dma_req_desc *desc,
> +               struct rte_mbuf *input, struct rte_mbuf *h_output,
> +               uint32_t *in_offset, uint32_t *h_out_offset,
> +               uint32_t *h_out_length,
> +               union acc100_harq_layout_data *harq_layout)
> +{
> +       int next_triplet = 1; /* FCW already done */
> +       desc->data_ptrs[next_triplet].address =
> +                       rte_pktmbuf_iova_offset(input, *in_offset);
> +       next_triplet++;
> +
> +       if (check_bit(op->ldpc_dec.op_flags,
> +                               RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
> +               struct rte_bbdev_op_data hi = op->ldpc_dec.harq_combined_input;
> +               desc->data_ptrs[next_triplet].address = hi.offset;
> +#ifndef ACC100_EXT_MEM
> +               desc->data_ptrs[next_triplet].address =
> +                               rte_pktmbuf_iova_offset(hi.data, hi.offset);
> +#endif
> +               next_triplet++;
> +       }
> +
> +       desc->data_ptrs[next_triplet].address =
> +                       rte_pktmbuf_iova_offset(h_output, *h_out_offset);
> +       *h_out_length = desc->data_ptrs[next_triplet].blen;
> +       next_triplet++;
> +
> +       if (check_bit(op->ldpc_dec.op_flags,
> +                               RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
> +               desc->data_ptrs[next_triplet].address =
> +                               op->ldpc_dec.harq_combined_output.offset;
> +               /* Adjust based on previous operation */
> +               struct rte_bbdev_dec_op *prev_op = desc->op_addr;
> +               op->ldpc_dec.harq_combined_output.length =
> +                               prev_op->ldpc_dec.harq_combined_output.length;
> +               int16_t hq_idx = op->ldpc_dec.harq_combined_output.offset /
> +                               ACC100_HARQ_OFFSET;
> +               int16_t prev_hq_idx =
> +                               prev_op->ldpc_dec.harq_combined_output.offset
> +                               / ACC100_HARQ_OFFSET;
> +               harq_layout[hq_idx].val = harq_layout[prev_hq_idx].val;
> +#ifndef ACC100_EXT_MEM
> +               struct rte_bbdev_op_data ho =
> +                               op->ldpc_dec.harq_combined_output;
> +               desc->data_ptrs[next_triplet].address =
> +                               rte_pktmbuf_iova_offset(ho.data, ho.offset);
> +#endif
> +               next_triplet++;
> +       }
> +
> +       op->ldpc_dec.hard_output.length += *h_out_length;
> +       desc->op_addr = op;
> +}
> +
> +
> +/* Enqueue a number of operations to HW and update software rings */
> +static inline void
> +acc100_dma_enqueue(struct acc100_queue *q, uint16_t n,
> +               struct rte_bbdev_stats *queue_stats)
> +{
> +       union acc100_enqueue_reg_fmt enq_req;
> +#ifdef RTE_BBDEV_OFFLOAD_COST
> +       uint64_t start_time = 0;
> +       queue_stats->acc_offload_cycles = 0;
> +#else
> +       RTE_SET_USED(queue_stats);
> +#endif
> +
> +       enq_req.val = 0;
> +       /* Setting offset, 100b for 256 DMA Desc */
> +       enq_req.addr_offset = ACC100_DESC_OFFSET;
> +
> +       /* Split ops into batches */
> +       do {
> +               union acc100_dma_desc *desc;
> +               uint16_t enq_batch_size;
> +               uint64_t offset;
> +               rte_iova_t req_elem_addr;
> +
> +               enq_batch_size = RTE_MIN(n, MAX_ENQ_BATCH_SIZE);
> +
> +               /* Set flag on last descriptor in a batch */
> +               desc = q->ring_addr + ((q->sw_ring_head + enq_batch_size - 1) &
> +                               q->sw_ring_wrap_mask);
> +               desc->req.last_desc_in_batch = 1;
> +
> +               /* Calculate the 1st descriptor's address */
> +               offset = ((q->sw_ring_head & q->sw_ring_wrap_mask) *
> +                               sizeof(union acc100_dma_desc));
> +               req_elem_addr = q->ring_addr_phys + offset;
> +
> +               /* Fill enqueue struct */
> +               enq_req.num_elem = enq_batch_size;
> +               /* low 6 bits are not needed */
> +               enq_req.req_elem_addr = (uint32_t)(req_elem_addr >> 6);
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +               rte_memdump(stderr, "Req sdone", desc, sizeof(*desc));
> +#endif
> +               rte_bbdev_log_debug(
> +                               "Enqueue %u reqs (phys %#"PRIx64") to reg %p",
> +                               enq_batch_size,
> +                               req_elem_addr,
> +                               (void *)q->mmio_reg_enqueue);
> +
> +               rte_wmb();
> +
> +#ifdef RTE_BBDEV_OFFLOAD_COST
> +               /* Start time measurement for enqueue function offload. */
> +               start_time = rte_rdtsc_precise();
> +#endif
> +               mmio_write(q->mmio_reg_enqueue, enq_req.val);
> +
> +#ifdef RTE_BBDEV_OFFLOAD_COST
> +               queue_stats->acc_offload_cycles +=
> +                               rte_rdtsc_precise() - start_time;
> +#endif
> +
> +               q->aq_enqueued++;
> +               q->sw_ring_head += enq_batch_size;
> +               n -= enq_batch_size;
> +
> +       } while (n);
> +}
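The do/while above issues one MMIO doorbell write per batch of at most MAX_ENQ_BATCH_SIZE descriptors, advancing the head under the power-of-two wrap mask. A standalone sketch of that split (the helper name and batch cap here are illustrative stand-ins, not driver symbols):

```c
#include <stdint.h>

/* Mirror of the batch-splitting loop in acc100_dma_enqueue():
 * one doorbell write per chunk of at most `max_batch` descriptors. */
static unsigned int
num_doorbells(uint16_t n, uint16_t max_batch)
{
	unsigned int writes = 0;

	do {
		uint16_t batch = (n < max_batch) ? n : max_batch;

		n -= batch;
		writes++;
	} while (n);
	return writes;
}
```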
> +
> +/* Enqueue a number of encode operations for ACC100 device in CB mode */
> +static inline int
> +enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
> +               uint16_t total_enqueued_cbs, int16_t num)
> +{
> +       union acc100_dma_desc *desc = NULL;
> +       uint32_t out_length;
> +       struct rte_mbuf *output_head, *output;
> +       int i, next_triplet;
> +       uint16_t  in_length_in_bytes;
> +       struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
> +
> +       uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> +                       & q->sw_ring_wrap_mask);
> +       desc = q->ring_addr + desc_idx;
> +       acc100_fcw_le_fill(ops[0], &desc->req.fcw_le, num);
> +
> +       /* This could be done at polling */
> +       desc->req.word0 = ACC100_DMA_DESC_TYPE;
> +       desc->req.word1 = 0; /* Timestamp could be disabled */
> +       desc->req.word2 = 0;
> +       desc->req.word3 = 0;
> +       desc->req.numCBs = num;
> +
> +       in_length_in_bytes = ops[0]->ldpc_enc.input.data->data_len;
> +       out_length = (enc->cb_params.e + 7) >> 3;
> +       desc->req.m2dlen = 1 + num;
> +       desc->req.d2mlen = num;
> +       next_triplet = 1;
> +
> +       for (i = 0; i < num; i++) {
> +               desc->req.data_ptrs[next_triplet].address =
> +                       rte_pktmbuf_iova_offset(ops[i]->ldpc_enc.input.data, 0);
> +               desc->req.data_ptrs[next_triplet].blen = in_length_in_bytes;
> +               next_triplet++;
> +               desc->req.data_ptrs[next_triplet].address =
> +                               rte_pktmbuf_iova_offset(
> +                               ops[i]->ldpc_enc.output.data, 0);
> +               desc->req.data_ptrs[next_triplet].blen = out_length;
> +               next_triplet++;
> +               ops[i]->ldpc_enc.output.length = out_length;
> +               output_head = output = ops[i]->ldpc_enc.output.data;
> +               mbuf_append(output_head, output, out_length);
> +               output->data_len = out_length;
> +       }
> +
> +       desc->req.op_addr = ops[0];
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +       rte_memdump(stderr, "FCW", &desc->req.fcw_le,
> +                       sizeof(desc->req.fcw_le) - 8);
> +       rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> +#endif
> +
> +       /* `num` CBs (ops) were successfully prepared to enqueue */
> +       return num;
> +}
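The `out_length` computation above converts the rate-matched size E (in bits) to whole output bytes with `(e + 7) >> 3`. A minimal sketch of that rounding (helper name is a stand-in):

```c
#include <stdint.h>

/* Mirror of the out_length computation above: round the rate-matched
 * size E, given in bits, up to whole bytes. */
static uint32_t
ldpc_e_to_bytes(uint32_t e)
{
	return (e + 7) >> 3;
}
```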
> +
> +/* Enqueue one encode operation for ACC100 device in CB mode */
> +static inline int
> +enqueue_ldpc_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
> +               uint16_t total_enqueued_cbs)
> +               uint16_t total_enqueued_cbs)
> +{
> +       union acc100_dma_desc *desc = NULL;
> +       int ret;
> +       uint32_t in_offset, out_offset, out_length, mbuf_total_left,
> +               seg_total_left;
> +       struct rte_mbuf *input, *output_head, *output;
> +
> +       uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> +                       & q->sw_ring_wrap_mask);
> +       desc = q->ring_addr + desc_idx;
> +       acc100_fcw_le_fill(op, &desc->req.fcw_le, 1);
> +
> +       input = op->ldpc_enc.input.data;
> +       output_head = output = op->ldpc_enc.output.data;
> +       in_offset = op->ldpc_enc.input.offset;
> +       out_offset = op->ldpc_enc.output.offset;
> +       out_length = 0;
> +       mbuf_total_left = op->ldpc_enc.input.length;
> +       seg_total_left = rte_pktmbuf_data_len(op->ldpc_enc.input.data)
> +                       - in_offset;
> +
> +       ret = acc100_dma_desc_le_fill(op, &desc->req, &input, output,
> +                       &in_offset, &out_offset, &out_length, &mbuf_total_left,
> +                       &seg_total_left);
> +
> +       if (unlikely(ret < 0))
> +               return ret;
> +
> +       mbuf_append(output_head, output, out_length);
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +       rte_memdump(stderr, "FCW", &desc->req.fcw_le,
> +                       sizeof(desc->req.fcw_le) - 8);
> +       rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> +
> +       /* Check if any data left after processing one CB */
> +       if (mbuf_total_left != 0) {
> +               rte_bbdev_log(ERR,
> +                               "Some data still left after processing one CB: mbuf_total_left = %u",
> +                               mbuf_total_left);
> +               return -EINVAL;
> +       }
> +#endif
> +       /* One CB (one op) was successfully prepared to enqueue */
> +       return 1;
> +}
> +
> +/* Enqueue one decode operation for ACC100 device in CB mode */
> +static inline int
> +enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
> +               uint16_t total_enqueued_cbs, bool same_op)
> +               uint16_t total_enqueued_cbs, bool same_op)
> +{
> +       int ret;
> +
> +       union acc100_dma_desc *desc;
> +       uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> +                       & q->sw_ring_wrap_mask);
> +       desc = q->ring_addr + desc_idx;
> +       struct rte_mbuf *input, *h_output_head, *h_output;
> +       uint32_t in_offset, h_out_offset, h_out_length, mbuf_total_left;
> +       input = op->ldpc_dec.input.data;
> +       h_output_head = h_output = op->ldpc_dec.hard_output.data;
> +       in_offset = op->ldpc_dec.input.offset;
> +       h_out_offset = op->ldpc_dec.hard_output.offset;
> +       mbuf_total_left = op->ldpc_dec.input.length;
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +       if (unlikely(input == NULL)) {
> +               rte_bbdev_log(ERR, "Invalid mbuf pointer");
> +               return -EFAULT;
> +       }
> +#endif
> +       union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
> +
> +       if (same_op) {
> +               union acc100_dma_desc *prev_desc;
> +               desc_idx = ((q->sw_ring_head + total_enqueued_cbs - 1)
> +                               & q->sw_ring_wrap_mask);
> +               prev_desc = q->ring_addr + desc_idx;
> +               uint8_t *prev_ptr = (uint8_t *) prev_desc;
> +               uint8_t *new_ptr = (uint8_t *) desc;
> +               /* Copy first 4 words and BDESCs */
> +               rte_memcpy(new_ptr, prev_ptr, 16);
> +               rte_memcpy(new_ptr + 36, prev_ptr + 36, 40);
> +               desc->req.op_addr = prev_desc->req.op_addr;
> +               /* Copy FCW */
> +               rte_memcpy(new_ptr + ACC100_DESC_FCW_OFFSET,
> +                               prev_ptr + ACC100_DESC_FCW_OFFSET,
> +                               ACC100_FCW_LD_BLEN);
> +               acc100_dma_desc_ld_update(op, &desc->req, input, h_output,
> +                               &in_offset, &h_out_offset,
> +                               &h_out_length, harq_layout);
> +       } else {
> +               struct acc100_fcw_ld *fcw;
> +               uint32_t seg_total_left;
> +               fcw = &desc->req.fcw_ld;
> +               acc100_fcw_ld_fill(op, fcw, harq_layout);
> +
> +               /* Special handling when overusing mbuf */
> +               if (fcw->rm_e < MAX_E_MBUF)
> +                       seg_total_left = rte_pktmbuf_data_len(input)
> +                                       - in_offset;
> +               else
> +                       seg_total_left = fcw->rm_e;
> +
> +               ret = acc100_dma_desc_ld_fill(op, &desc->req, &input, h_output,
> +                               &in_offset, &h_out_offset,
> +                               &h_out_length, &mbuf_total_left,
> +                               &seg_total_left, fcw);
> +               if (unlikely(ret < 0))
> +                       return ret;
> +       }
> +
> +       /* Hard output */
> +       mbuf_append(h_output_head, h_output, h_out_length);
> +#ifndef ACC100_EXT_MEM
> +       if (op->ldpc_dec.harq_combined_output.length > 0) {
> +               /* Push the HARQ output into host memory */
> +               struct rte_mbuf *hq_output_head, *hq_output;
> +               hq_output_head = op->ldpc_dec.harq_combined_output.data;
> +               hq_output = op->ldpc_dec.harq_combined_output.data;
> +               mbuf_append(hq_output_head, hq_output,
> +                               op->ldpc_dec.harq_combined_output.length);
> +       }
> +#endif
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +       rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
> +                       sizeof(desc->req.fcw_ld) - 8);
> +       rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> +#endif
> +
> +       /* One CB (one op) was successfully prepared to enqueue */
> +       return 1;
> +}
> +
> +
> +/* Enqueue one decode operation for ACC100 device in TB mode */
> +static inline int
> +enqueue_ldpc_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
> +               uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
> +               uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
> +{
> +       union acc100_dma_desc *desc = NULL;
> +       int ret;
> +       uint8_t r, c;
> +       uint32_t in_offset, h_out_offset,
> +               h_out_length, mbuf_total_left, seg_total_left;
> +       struct rte_mbuf *input, *h_output_head, *h_output;
> +       uint16_t current_enqueued_cbs = 0;
> +
> +       uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> +                       & q->sw_ring_wrap_mask);
> +       desc = q->ring_addr + desc_idx;
> +       uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
> +       union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
> +       acc100_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout);
> +
> +       input = op->ldpc_dec.input.data;
> +       h_output_head = h_output = op->ldpc_dec.hard_output.data;
> +       in_offset = op->ldpc_dec.input.offset;
> +       h_out_offset = op->ldpc_dec.hard_output.offset;
> +       h_out_length = 0;
> +       mbuf_total_left = op->ldpc_dec.input.length;
> +       c = op->ldpc_dec.tb_params.c;
> +       r = op->ldpc_dec.tb_params.r;
> +
> +       while (mbuf_total_left > 0 && r < c) {
> +
> +               seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
> +
> +               /* Set up DMA descriptor */
> +               desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
> +                               & q->sw_ring_wrap_mask);
> +               desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
> +               desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
> +               ret = acc100_dma_desc_ld_fill(op, &desc->req, &input,
> +                               h_output, &in_offset, &h_out_offset,
> +                               &h_out_length,
> +                               &mbuf_total_left, &seg_total_left,
> +                               &desc->req.fcw_ld);
> +
> +               if (unlikely(ret < 0))
> +                       return ret;
> +
> +               /* Hard output */
> +               mbuf_append(h_output_head, h_output, h_out_length);
> +
> +               /* Set total number of CBs in TB */
> +               desc->req.cbs_in_tb = cbs_in_tb;
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +               rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
> +                               sizeof(desc->req.fcw_ld) - 8);
> +               rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> +#endif
> +
> +               if (seg_total_left == 0) {
> +                       /* Go to the next mbuf */
> +                       input = input->next;
> +                       in_offset = 0;
> +                       h_output = h_output->next;
> +                       h_out_offset = 0;
> +               }
> +               total_enqueued_cbs++;
> +               current_enqueued_cbs++;
> +               r++;
> +       }
> +
> +       if (unlikely(desc == NULL))
> +               return current_enqueued_cbs;
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +       /* Check if any CBs left for processing */
> +       if (mbuf_total_left != 0) {
> +               rte_bbdev_log(ERR,
> +                               "Some data still left for processing: mbuf_total_left = %u",
> +                               mbuf_total_left);
> +               return -EINVAL;
> +       }
> +#endif
> +       /* Set SDone on last CB descriptor for TB mode */
> +       desc->req.sdone_enable = 1;
> +       desc->req.irq_enable = q->irq_enable;
> +
> +       return current_enqueued_cbs;
> +}
> +
> +
> +/* Calculates number of CBs in processed encoder TB based on 'r' and input
> + * length.
> + */
> +static inline uint8_t
> +get_num_cbs_in_tb_enc(struct rte_bbdev_op_turbo_enc *turbo_enc)
> +{
> +       uint8_t c, c_neg, r, crc24_bits = 0;
> +       uint16_t k, k_neg, k_pos;
> +       uint8_t cbs_in_tb = 0;
> +       int32_t length;
> +
> +       length = turbo_enc->input.length;
> +       r = turbo_enc->tb_params.r;
> +       c = turbo_enc->tb_params.c;
> +       c_neg = turbo_enc->tb_params.c_neg;
> +       k_neg = turbo_enc->tb_params.k_neg;
> +       k_pos = turbo_enc->tb_params.k_pos;
> +       crc24_bits = 0;
> +       if (check_bit(turbo_enc->op_flags,
> +                       RTE_BBDEV_TURBO_CRC_24B_ATTACH))
> +               crc24_bits = 24;
> +       while (length > 0 && r < c) {
> +               k = (r < c_neg) ? k_neg : k_pos;
> +               length -= (k - crc24_bits) >> 3;
> +               r++;
> +               cbs_in_tb++;
> +       }
> +
> +       return cbs_in_tb;
> +}
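For reference, the CB-count walk above can be exercised standalone: each code block consumes `(k - CRC bits) / 8` payload bytes, with K switching from `k_neg` to `k_pos` at index `c_neg`. This sketch mirrors that logic with the parameters passed explicitly (the test values, K = 6144 with CRC24B attached, are illustrative):

```c
#include <stdint.h>

/* Standalone mirror of get_num_cbs_in_tb_enc(): walk the remaining TB
 * length, consuming (k - crc24_bits) / 8 payload bytes per code block. */
static uint8_t
cbs_in_tb_enc(int32_t length, uint8_t r, uint8_t c, uint8_t c_neg,
		uint16_t k_neg, uint16_t k_pos, int crc24_attached)
{
	uint8_t crc24_bits = crc24_attached ? 24 : 0;
	uint8_t cbs_in_tb = 0;
	uint16_t k;

	while (length > 0 && r < c) {
		k = (r < c_neg) ? k_neg : k_pos;
		length -= (k - crc24_bits) >> 3;
		r++;
		cbs_in_tb++;
	}
	return cbs_in_tb;
}
```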
> +
> +/* Calculates number of CBs in processed decoder TB based on 'r' and input
> + * length.
> + */
> +static inline uint16_t
> +get_num_cbs_in_tb_dec(struct rte_bbdev_op_turbo_dec *turbo_dec)
> +{
> +       uint8_t c, c_neg, r = 0;
> +       uint16_t kw, k, k_neg, k_pos, cbs_in_tb = 0;
> +       int32_t length;
> +
> +       length = turbo_dec->input.length;
> +       r = turbo_dec->tb_params.r;
> +       c = turbo_dec->tb_params.c;
> +       c_neg = turbo_dec->tb_params.c_neg;
> +       k_neg = turbo_dec->tb_params.k_neg;
> +       k_pos = turbo_dec->tb_params.k_pos;
> +       while (length > 0 && r < c) {
> +               k = (r < c_neg) ? k_neg : k_pos;
> +               kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
> +               length -= kw;
> +               r++;
> +               cbs_in_tb++;
> +       }
> +
> +       return cbs_in_tb;
> +}
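The `kw` term above is the turbo circular-buffer width: K plus the 4 tail bits, rounded up to a multiple of 32, times the 3 rate-matching streams. A standalone sketch (the helper name is a stand-in):

```c
#include <stdint.h>

/* Mirror of the kw computation in get_num_cbs_in_tb_dec():
 * kw = 3 * ceil32(k + 4), i.e. three rate-matching streams, each
 * padded up to a multiple of 32. */
static uint16_t
turbo_dec_kw(uint16_t k)
{
	uint16_t padded = (uint16_t)((k + 4 + 31) & ~31);

	return (uint16_t)(padded * 3);
}
```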
> +
> +/* Calculates number of CBs in processed LDPC decoder TB based on 'r' and
> + * input length.
> + */
> +static inline uint16_t
> +get_num_cbs_in_tb_ldpc_dec(struct rte_bbdev_op_ldpc_dec *ldpc_dec)
> +{
> +       uint16_t r, cbs_in_tb = 0;
> +       int32_t length = ldpc_dec->input.length;
> +       r = ldpc_dec->tb_params.r;
> +       while (length > 0 && r < ldpc_dec->tb_params.c) {
> +               length -=  (r < ldpc_dec->tb_params.cab) ?
> +                               ldpc_dec->tb_params.ea :
> +                               ldpc_dec->tb_params.eb;
> +               r++;
> +               cbs_in_tb++;
> +       }
> +       return cbs_in_tb;
> +}
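Same idea for LDPC: the first `cab` code blocks consume `ea` bytes of input each, the remaining ones `eb`. A standalone mirror (test values illustrative):

```c
#include <stdint.h>

/* Standalone mirror of get_num_cbs_in_tb_ldpc_dec(): CBs before index
 * `cab` consume `ea` input bytes each, the rest consume `eb`. */
static uint16_t
cbs_in_tb_ldpc_dec(int32_t length, uint16_t r, uint16_t c, uint16_t cab,
		uint32_t ea, uint32_t eb)
{
	uint16_t cbs_in_tb = 0;

	while (length > 0 && r < c) {
		length -= (r < cab) ? (int32_t)ea : (int32_t)eb;
		r++;
		cbs_in_tb++;
	}
	return cbs_in_tb;
}
```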
> +
> +/* Check if we can mux encode operations with a common FCW */
> +static inline bool
> +check_mux(struct rte_bbdev_enc_op **ops, uint16_t num)
> +{
> +       uint16_t i;
> +       if (num == 1)
> +               return false;
> +       for (i = 1; i < num; ++i) {
> +               /* Only mux compatible code blocks */
> +               if (memcmp((uint8_t *)(&ops[i]->ldpc_enc) + ENC_OFFSET,
> +                               (uint8_t *)(&ops[0]->ldpc_enc) + ENC_OFFSET,
> +                               CMP_ENC_SIZE) != 0)
> +                       return false;
> +       }
> +       return true;
> +}
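check_mux() relies on a byte-wise compare of everything past the per-CB fields (the real offsets come from ENC_OFFSET/CMP_ENC_SIZE over the bbdev op layout). The struct below is a simplified stand-in to show the pattern, not the actual `rte_bbdev_op_ldpc_enc` layout:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for the encode op: `e` is per-CB and excluded
 * from the compare; the remaining fields must match byte-wise for two
 * ops to share one FCW. All-uint32_t layout avoids struct padding in
 * the compared region. */
struct enc_params {
	uint32_t e;	/* per-CB rate-matched size, not compared */
	uint32_t basegraph, z_c, n_cb, q_m;
};

static int
can_mux(const struct enc_params *a, const struct enc_params *b)
{
	const size_t off = offsetof(struct enc_params, basegraph);

	return memcmp((const uint8_t *)a + off, (const uint8_t *)b + off,
			sizeof(*a) - off) == 0;
}
```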
> +
> +/* Enqueue encode operations for ACC100 device in CB mode. */
> +static inline uint16_t
> +acc100_enqueue_ldpc_enc_cb(struct rte_bbdev_queue_data *q_data,
> +               struct rte_bbdev_enc_op **ops, uint16_t num)
> +{
> +       struct acc100_queue *q = q_data->queue_private;
> +       int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> +       uint16_t i = 0;
> +       union acc100_dma_desc *desc;
> +       int ret, desc_idx = 0;
> +       int16_t enq, left = num;
> +
> +       while (left > 0) {
> +               if (unlikely(avail - 1 < 0))
> +                       break;
> +               avail--;
> +               enq = RTE_MIN(left, MUX_5GDL_DESC);
> +               if (check_mux(&ops[i], enq)) {
> +                       ret = enqueue_ldpc_enc_n_op_cb(q, &ops[i],
> +                                       desc_idx, enq);
> +                       if (ret < 0)
> +                               break;
> +                       i += enq;
> +               } else {
> +                       ret = enqueue_ldpc_enc_one_op_cb(q, ops[i], desc_idx);
> +                       if (ret < 0)
> +                               break;
> +                       i++;
> +               }
> +               desc_idx++;
> +               left = num - i;
> +       }
> +
> +       if (unlikely(i == 0))
> +               return 0; /* Nothing to enqueue */
> +
> +       /* Set SDone in last CB in enqueued ops for CB mode */
> +       desc = q->ring_addr + ((q->sw_ring_head + desc_idx - 1)
> +                       & q->sw_ring_wrap_mask);
> +       desc->req.sdone_enable = 1;
> +       desc->req.irq_enable = q->irq_enable;
> +
> +       acc100_dma_enqueue(q, desc_idx, &q_data->queue_stats);
> +
> +       /* Update stats */
> +       q_data->queue_stats.enqueued_count += i;
> +       q_data->queue_stats.enqueue_err_count += num - i;
> +
> +       return i;
> +}
> +
> +/* Enqueue encode operations for ACC100 device. */
> +static uint16_t
> +acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
> +               struct rte_bbdev_enc_op **ops, uint16_t num)
> +{
> +       if (unlikely(num == 0))
> +               return 0;
> +       return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
> +}
> +
> +/* Check if we can mux decode operations sharing the same descriptor data */
> +static inline bool
> +cmp_ldpc_dec_op(struct rte_bbdev_dec_op **ops)
> +{
> +       /* Only mux compatible code blocks */
> +       return memcmp((uint8_t *)(&ops[0]->ldpc_dec) + DEC_OFFSET,
> +                       (uint8_t *)(&ops[1]->ldpc_dec) + DEC_OFFSET,
> +                       CMP_DEC_SIZE) == 0;
> +}
> +
> +
> +/* Enqueue decode operations for ACC100 device in TB mode */
> +static uint16_t
> +acc100_enqueue_ldpc_dec_tb(struct rte_bbdev_queue_data *q_data,
> +               struct rte_bbdev_dec_op **ops, uint16_t num)
> +{
> +       struct acc100_queue *q = q_data->queue_private;
> +       int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> +       uint16_t i, enqueued_cbs = 0;
> +       uint8_t cbs_in_tb;
> +       int ret;
> +
> +       for (i = 0; i < num; ++i) {
> +               cbs_in_tb = get_num_cbs_in_tb_ldpc_dec(&ops[i]->ldpc_dec);
> +               /* Check if there is space available for further processing */
> +               if (unlikely(avail - cbs_in_tb < 0))
> +                       break;
> +               avail -= cbs_in_tb;
> +
> +               ret = enqueue_ldpc_dec_one_op_tb(q, ops[i],
> +                               enqueued_cbs, cbs_in_tb);
> +               if (ret < 0)
> +                       break;
> +               enqueued_cbs += ret;
> +       }
> +
> +       acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
> +
> +       /* Update stats */
> +       q_data->queue_stats.enqueued_count += i;
> +       q_data->queue_stats.enqueue_err_count += num - i;
> +       return i;
> +}
> +
> +/* Enqueue decode operations for ACC100 device in CB mode */
> +static uint16_t
> +acc100_enqueue_ldpc_dec_cb(struct rte_bbdev_queue_data *q_data,
> +               struct rte_bbdev_dec_op **ops, uint16_t num)
> +{
> +       struct acc100_queue *q = q_data->queue_private;
> +       int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> +       uint16_t i;
> +       union acc100_dma_desc *desc;
> +       int ret;
> +       bool same_op = false;
> +       for (i = 0; i < num; ++i) {
> +               /* Check if there is space available for further processing */
> +               if (unlikely(avail - 1 < 0))
> +                       break;
> +               avail -= 1;
> +
> +               if (i > 0)
> +                       same_op = cmp_ldpc_dec_op(&ops[i-1]);
> +               rte_bbdev_log_debug("Op %d %d %d %d %d %d %d %d %d %d %d %d",
> +                       i, ops[i]->ldpc_dec.op_flags, ops[i]->ldpc_dec.rv_index,
> +                       ops[i]->ldpc_dec.iter_max, ops[i]->ldpc_dec.iter_count,
> +                       ops[i]->ldpc_dec.basegraph, ops[i]->ldpc_dec.z_c,
> +                       ops[i]->ldpc_dec.n_cb, ops[i]->ldpc_dec.q_m,
> +                       ops[i]->ldpc_dec.n_filler, ops[i]->ldpc_dec.cb_params.e,
> +                       same_op);
> +               ret = enqueue_ldpc_dec_one_op_cb(q, ops[i], i, same_op);
> +               if (ret < 0)
> +                       break;
> +       }
> +
> +       if (unlikely(i == 0))
> +               return 0; /* Nothing to enqueue */
> +
> +       /* Set SDone in last CB in enqueued ops for CB mode */
> +       desc = q->ring_addr + ((q->sw_ring_head + i - 1)
> +                       & q->sw_ring_wrap_mask);
> +
> +       desc->req.sdone_enable = 1;
> +       desc->req.irq_enable = q->irq_enable;
> +
> +       acc100_dma_enqueue(q, i, &q_data->queue_stats);
> +
> +       /* Update stats */
> +       q_data->queue_stats.enqueued_count += i;
> +       q_data->queue_stats.enqueue_err_count += num - i;
> +       return i;
> +}
> +
> +/* Enqueue decode operations for ACC100 device. */
> +static uint16_t
> +acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
> +               struct rte_bbdev_dec_op **ops, uint16_t num)
> +{
> +       struct acc100_queue *q = q_data->queue_private;
> +       int32_t aq_avail = q->aq_depth +
> +                       (q->aq_dequeued - q->aq_enqueued) / 128;
> +
> +       if (unlikely((aq_avail == 0) || (num == 0)))
> +               return 0;
> +
> +       if (ops[0]->ldpc_dec.code_block_mode == 0)
> +               return acc100_enqueue_ldpc_dec_tb(q_data, ops, num);
> +       else
> +               return acc100_enqueue_ldpc_dec_cb(q_data, ops, num);
> +}
> +
> +
> +/* Dequeue one encode operation from ACC100 device in CB mode */
> +static inline int
> +dequeue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
> +               uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
> +               uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
> +{
> +       union acc100_dma_desc *desc, atom_desc;
> +       union acc100_dma_rsp_desc rsp;
> +       struct rte_bbdev_enc_op *op;
> +       int i;
> +
> +       desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
> +                       & q->sw_ring_wrap_mask);
> +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +                       __ATOMIC_RELAXED);
> +
> +       /* Check fdone bit */
> +       if (!(atom_desc.rsp.val & ACC100_FDONE))
> +               return -1;
> +
> +       rsp.val = atom_desc.rsp.val;
> +       rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
> +
> +       /* Dequeue */
> +       op = desc->req.op_addr;
> +
> +       /* Clearing status, it will be set based on response */
> +       op->status = 0;
> +
> +       op->status |= ((rsp.input_err)
> +                       ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> +       op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +       op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +
> +       if (desc->req.last_desc_in_batch) {
> +               (*aq_dequeued)++;
> +               desc->req.last_desc_in_batch = 0;
> +       }
> +       desc->rsp.val = ACC100_DMA_DESC_TYPE;
> +       desc->rsp.add_info_0 = 0; /* Reserved bits */
> +       desc->rsp.add_info_1 = 0; /* Reserved bits */
> +
> +       /* Flag that the muxing causes loss of opaque data */
> +       op->opaque_data = (void *)-1;
> +       for (i = 0 ; i < desc->req.numCBs; i++)
> +               ref_op[i] = op;
> +
> +       /* Return the number of CBs (`numCBs`) muxed into this op */
> +       return desc->req.numCBs;
> +}
> +
> +/* Dequeue one encode operation from ACC100 device in TB mode */
> +static inline int
> +dequeue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
> +               uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
> +               uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
> +{
> +       union acc100_dma_desc *desc, *last_desc, atom_desc;
> +       union acc100_dma_rsp_desc rsp;
> +       struct rte_bbdev_enc_op *op;
> +       uint8_t i = 0;
> +       uint16_t current_dequeued_cbs = 0, cbs_in_tb;
> +
> +       desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
> +                       & q->sw_ring_wrap_mask);
> +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +                       __ATOMIC_RELAXED);
> +
> +       /* Check fdone bit */
> +       if (!(atom_desc.rsp.val & ACC100_FDONE))
> +               return -1;
> +
> +       /* Get number of CBs in dequeued TB */
> +       cbs_in_tb = desc->req.cbs_in_tb;
> +       /* Get last CB */
> +       last_desc = q->ring_addr + ((q->sw_ring_tail
> +                       + total_dequeued_cbs + cbs_in_tb - 1)
> +                       & q->sw_ring_wrap_mask);
> +       /* Check if last CB in TB is ready to dequeue (and thus
> +        * the whole TB) - checking sdone bit. If not return.
> +        */
> +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
> +                       __ATOMIC_RELAXED);
> +       if (!(atom_desc.rsp.val & ACC100_SDONE))
> +               return -1;
> +
> +       /* Dequeue */
> +       op = desc->req.op_addr;
> +
> +       /* Clearing status, it will be set based on response */
> +       op->status = 0;
> +
> +       while (i < cbs_in_tb) {
> +               desc = q->ring_addr + ((q->sw_ring_tail
> +                               + total_dequeued_cbs)
> +                               & q->sw_ring_wrap_mask);
> +               atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +                               __ATOMIC_RELAXED);
> +               rsp.val = atom_desc.rsp.val;
> +               rte_bbdev_log_debug("Resp. desc %p: %x", desc,
> +                               rsp.val);
> +
> +               op->status |= ((rsp.input_err)
> +                               ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> +               op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +               op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +
> +               if (desc->req.last_desc_in_batch) {
> +                       (*aq_dequeued)++;
> +                       desc->req.last_desc_in_batch = 0;
> +               }
> +               desc->rsp.val = ACC100_DMA_DESC_TYPE;
> +               desc->rsp.add_info_0 = 0;
> +               desc->rsp.add_info_1 = 0;
> +               total_dequeued_cbs++;
> +               current_dequeued_cbs++;
> +               i++;
> +       }
> +
> +       *ref_op = op;
> +
> +       return current_dequeued_cbs;
> +}
> +
> +/* Dequeue one decode operation from ACC100 device in CB mode */
> +static inline int
> +dequeue_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
> +               struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
> +               uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> +{
> +       union acc100_dma_desc *desc, atom_desc;
> +       union acc100_dma_rsp_desc rsp;
> +       struct rte_bbdev_dec_op *op;
> +
> +       desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> +                       & q->sw_ring_wrap_mask);
> +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +                       __ATOMIC_RELAXED);
> +
> +       /* Check fdone bit */
> +       if (!(atom_desc.rsp.val & ACC100_FDONE))
> +               return -1;
> +
> +       rsp.val = atom_desc.rsp.val;
> +       rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
> +
> +       /* Dequeue */
> +       op = desc->req.op_addr;
> +
> +       /* Clearing status, it will be set based on response */
> +       op->status = 0;
> +       op->status |= ((rsp.input_err)
> +                       ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> +       op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +       op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +       if (op->status != 0)
> +               q_data->queue_stats.dequeue_err_count++;
> +
> +       /* CRC invalid if error exists */
> +       if (!op->status)
> +               op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
> +       op->turbo_dec.iter_count = (uint8_t) rsp.iter_cnt / 2;
> +       /* Check if this is the last desc in batch (Atomic Queue) */
> +       if (desc->req.last_desc_in_batch) {
> +               (*aq_dequeued)++;
> +               desc->req.last_desc_in_batch = 0;
> +       }
> +       desc->rsp.val = ACC100_DMA_DESC_TYPE;
> +       desc->rsp.add_info_0 = 0;
> +       desc->rsp.add_info_1 = 0;
> +       *ref_op = op;
> +
> +       /* One CB (op) was successfully dequeued */
> +       return 1;
> +}
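The dequeue paths all map the hardware response flags to bbdev status bits the same way: input errors become data errors, DMA and FCW errors become driver errors. A sketch with illustrative bit positions (the driver uses the real values from `enum rte_bbdev_op_status` in rte_bbdev_op.h):

```c
/* Illustrative bit positions only; the driver uses RTE_BBDEV_DATA_ERROR
 * and RTE_BBDEV_DRV_ERROR from rte_bbdev_op.h. */
enum { DATA_ERROR = 0, DRV_ERROR = 1 };

/* Mirror of the rsp -> op->status mapping in the dequeue functions:
 * DMA and FCW errors fold into the same driver-error bit. */
static int
rsp_to_status(int input_err, int dma_err, int fcw_err)
{
	int status = 0;

	status |= input_err ? (1 << DATA_ERROR) : 0;
	status |= dma_err ? (1 << DRV_ERROR) : 0;
	status |= fcw_err ? (1 << DRV_ERROR) : 0;
	return status;
}
```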
> +
> +/* Dequeue one decode operation from ACC100 device in CB mode */
> +static inline int
> +dequeue_ldpc_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
> +               struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
> +               uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> +{
> +       union acc100_dma_desc *desc, atom_desc;
> +       union acc100_dma_rsp_desc rsp;
> +       struct rte_bbdev_dec_op *op;
> +
> +       desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> +                       & q->sw_ring_wrap_mask);
> +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +                       __ATOMIC_RELAXED);
> +
> +       /* Check fdone bit */
> +       if (!(atom_desc.rsp.val & ACC100_FDONE))
> +               return -1;
> +
> +       rsp.val = atom_desc.rsp.val;
> +
> +       /* Dequeue */
> +       op = desc->req.op_addr;
> +
> +       /* Clearing status, it will be set based on response */
> +       op->status = 0;
> +       op->status |= rsp.input_err << RTE_BBDEV_DATA_ERROR;
> +       op->status |= rsp.dma_err << RTE_BBDEV_DRV_ERROR;
> +       op->status |= rsp.fcw_err << RTE_BBDEV_DRV_ERROR;
> +       if (op->status != 0)
> +               q_data->queue_stats.dequeue_err_count++;
> +
> +       op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
> +       if (op->ldpc_dec.hard_output.length > 0 && !rsp.synd_ok)
> +               op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
> +       op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
> +
> +       /* Check if this is the last desc in batch (Atomic Queue) */
> +       if (desc->req.last_desc_in_batch) {
> +               (*aq_dequeued)++;
> +               desc->req.last_desc_in_batch = 0;
> +       }
> +
> +       desc->rsp.val = ACC100_DMA_DESC_TYPE;
> +       desc->rsp.add_info_0 = 0;
> +       desc->rsp.add_info_1 = 0;
> +
> +       *ref_op = op;
> +
> +       /* One CB (op) was successfully dequeued */
> +       return 1;
> +}
> +
> +/* Dequeue one decode operation from ACC100 device in TB mode. */
> +static inline int
> +dequeue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
> +               uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> +{
> +       union acc100_dma_desc *desc, *last_desc, atom_desc;
> +       union acc100_dma_rsp_desc rsp;
> +       struct rte_bbdev_dec_op *op;
> +       uint8_t cbs_in_tb = 1, cb_idx = 0;
> +
> +       desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> +                       & q->sw_ring_wrap_mask);
> +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +                       __ATOMIC_RELAXED);
> +
> +       /* Check fdone bit */
> +       if (!(atom_desc.rsp.val & ACC100_FDONE))
> +               return -1;
> +
> +       /* Dequeue */
> +       op = desc->req.op_addr;
> +
> +       /* Get number of CBs in dequeued TB */
> +       cbs_in_tb = desc->req.cbs_in_tb;
> +       /* Get last CB */
> +       last_desc = q->ring_addr + ((q->sw_ring_tail
> +                       + dequeued_cbs + cbs_in_tb - 1)
> +                       & q->sw_ring_wrap_mask);
> +       /* Check if last CB in TB is ready to dequeue (and thus
> +        * the whole TB) - checking sdone bit. If not return.
> +        */
> +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
> +                       __ATOMIC_RELAXED);
> +       if (!(atom_desc.rsp.val & ACC100_SDONE))
> +               return -1;
> +
> +       /* Clearing status, it will be set based on response */
> +       op->status = 0;
> +
> +       /* Read remaining CBs if any */
> +       while (cb_idx < cbs_in_tb) {
> +               desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> +                               & q->sw_ring_wrap_mask);
> +               atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +                               __ATOMIC_RELAXED);
> +               rsp.val = atom_desc.rsp.val;
> +               rte_bbdev_log_debug("Resp. desc %p: %x", desc,
> +                               rsp.val);
> +
> +               op->status |= ((rsp.input_err)
> +                               ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> +               op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +               op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +
> +               /* CRC status is only meaningful when no other error is set */
> +               if (!op->status)
> +                       op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
> +               op->turbo_dec.iter_count = RTE_MAX((uint8_t) rsp.iter_cnt,
> +                               op->turbo_dec.iter_count);
> +
> +               /* Check if this is the last desc in batch (Atomic Queue) */
> +               if (desc->req.last_desc_in_batch) {
> +                       (*aq_dequeued)++;
> +                       desc->req.last_desc_in_batch = 0;
> +               }
> +               desc->rsp.val = ACC100_DMA_DESC_TYPE;
> +               desc->rsp.add_info_0 = 0;
> +               desc->rsp.add_info_1 = 0;
> +               dequeued_cbs++;
> +               cb_idx++;
> +       }
> +
> +       *ref_op = op;
> +
> +       return cb_idx;
> +}
> +
> +/* Dequeue LDPC encode operations from ACC100 device. */
> +static uint16_t
> +acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
> +               struct rte_bbdev_enc_op **ops, uint16_t num)
> +{
> +       struct acc100_queue *q = q_data->queue_private;
> +       uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
> +       uint32_t aq_dequeued = 0;
> +       uint16_t dequeue_num, i, dequeued_cbs = 0, dequeued_descs = 0;
> +       int ret;
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +       if (unlikely(ops == NULL || q == NULL))
> +               return 0;
> +#endif
> +
> +       dequeue_num = (avail < num) ? avail : num;
> +
> +       for (i = 0; i < dequeue_num; i++) {
> +               ret = dequeue_enc_one_op_cb(q, &ops[dequeued_cbs],
> +                               dequeued_descs, &aq_dequeued);
> +               if (ret < 0)
> +                       break;
> +               dequeued_cbs += ret;
> +               dequeued_descs++;
> +               if (dequeued_cbs >= num)
> +                       break;
> +       }
> +
> +       q->aq_dequeued += aq_dequeued;
> +       q->sw_ring_tail += dequeued_descs;
> +
> +       /* Update dequeue stats */
> +       q_data->queue_stats.dequeued_count += dequeued_cbs;
> +
> +       return dequeued_cbs;
> +}
> +
> +/* Dequeue LDPC decode operations from ACC100 device. */
> +static uint16_t
> +acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
> +               struct rte_bbdev_dec_op **ops, uint16_t num)
> +{
> +       struct acc100_queue *q = q_data->queue_private;
> +       uint16_t dequeue_num;
> +       uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
> +       uint32_t aq_dequeued = 0;
> +       uint16_t i;
> +       uint16_t dequeued_cbs = 0;
> +       struct rte_bbdev_dec_op *op;
> +       int ret;
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +       if (unlikely(ops == NULL || q == NULL))
> +               return 0;
> +#endif
> +
> +       dequeue_num = (avail < num) ? avail : num;
> +
> +       for (i = 0; i < dequeue_num; ++i) {
> +               op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> +                       & q->sw_ring_wrap_mask))->req.op_addr;
> +               if (op->ldpc_dec.code_block_mode == 0)
> +                       ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
> +                                       &aq_dequeued);
> +               else
> +                       ret = dequeue_ldpc_dec_one_op_cb(
> +                                       q_data, q, &ops[i], dequeued_cbs,
> +                                       &aq_dequeued);
> +
> +               if (ret < 0)
> +                       break;
> +               dequeued_cbs += ret;
> +       }
> +
> +       q->aq_dequeued += aq_dequeued;
> +       q->sw_ring_tail += dequeued_cbs;
> +
> +       /* Update dequeue stats */
> +       q_data->queue_stats.dequeued_count += i;
> +
> +       return i;
> +}
> +
>  /* Initialization Function */
>  static void
>  acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
> @@ -703,6 +2321,10 @@
>          struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
> 
>          dev->dev_ops = &acc100_bbdev_ops;
> +       dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
> +       dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
> +       dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
> +       dev->dequeue_ldpc_dec_ops = acc100_dequeue_ldpc_dec;
> 
>          ((struct acc100_device *) dev->data->dev_private)->pf_device =
>                          !strcmp(drv->driver.name,
> @@ -815,4 +2437,3 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
>  RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
>  RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
>  RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
> -
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
> index 0e2b79c..78686c1 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> @@ -88,6 +88,8 @@
>  #define TMPL_PRI_3      0x0f0e0d0c
>  #define QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
>  #define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
> +#define ACC100_FDONE    0x80000000
> +#define ACC100_SDONE    0x40000000
> 
>  #define ACC100_NUM_TMPL  32
>  #define VF_OFFSET_QOS 16 /* offset in Memory Space specific to QoS Mon */
> @@ -398,6 +400,7 @@ struct __rte_packed acc100_dma_req_desc {
>  union acc100_dma_desc {
>          struct acc100_dma_req_desc req;
>          union acc100_dma_rsp_desc rsp;
> +       uint64_t atom_hdr;
>  };
> 
> 
> --
> 1.8.3.1

^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions
  2020-08-20 14:52     ` Chautru, Nicolas
@ 2020-08-20 14:57       ` Dave Burley
  2020-08-20 21:05         ` Chautru, Nicolas
  0 siblings, 1 reply; 213+ messages in thread
From: Dave Burley @ 2020-08-20 14:57 UTC (permalink / raw)
  To: Chautru, Nicolas, dev; +Cc: Richardson, Bruce

Hi Nic

Thank you - further documentation would be useful for clarification, as the data format isn't explicitly documented in BBDEV.
Best Regards

Dave


From: Chautru, Nicolas <nicolas.chautru@intel.com>
Sent: 20 August 2020 15:52
To: Dave Burley <dave.burley@accelercomm.com>; dev@dpdk.org <dev@dpdk.org>
Cc: Richardson, Bruce <bruce.richardson@intel.com>
Subject: RE: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions 
 
Hi Dave, 
This assumes 6-bit LLR compression packing (i.e. the first 2 MSBs of each LLR are dropped), similar to HARQ compression.
Let me know if this is unclear; I can clarify further in the documentation if it is not explicit enough.
Thanks
Nic
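To make the idea above concrete, here is an illustrative sketch (not code from the patch; the exact bit and byte ordering used by the ACC100 hardware is an assumption here): 6-bit compression drops the two MSBs of each 8-bit LLR and concatenates the remaining 6-bit fields, so four LLRs fit in three bytes.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative sketch only: pack 8-bit LLRs down to 6 bits by dropping
 * the two MSBs of each sample and concatenating the 6-bit fields
 * LSB-first. The precise in-memory layout used by the hardware is an
 * assumption; the point is the 4:3 size ratio 6-bit compression implies.
 */
static size_t
pack_llr_6bit(const int8_t *in, size_t n, uint8_t *out)
{
	size_t i, bitpos = 0;

	for (i = 0; i < n; i++) {
		uint8_t v = (uint8_t)in[i] & 0x3F; /* keep the 6 LSBs */
		size_t byte = bitpos >> 3;
		unsigned int shift = bitpos & 7;

		out[byte] |= (uint8_t)(v << shift);
		if (shift > 2) /* 6-bit field spills into the next byte */
			out[byte + 1] |= (uint8_t)(v >> (8 - shift));
		bitpos += 6;
	}
	return (bitpos + 7) >> 3; /* number of bytes written */
}
```

A buffer of N such LLRs then occupies ceil(6 * N / 8) bytes, consistent with the `(input_length * 3 + 3) / 4` sizing visible in the patch when RTE_BBDEV_LDPC_LLR_COMPRESSION is set.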

> -----Original Message-----
> From: Dave Burley <dave.burley@accelercomm.com>
> Sent: Thursday, August 20, 2020 7:39 AM
> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org
> Cc: Richardson, Bruce <bruce.richardson@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC
> processing functions
> 
> Hi Nic,
> 
> As you've now specified the use of RTE_BBDEV_LDPC_LLR_COMPRESSION for
> this PMB, please could you confirm what the packed format of the LLRs in
> memory looks like?
> 
> Best Regards
> 
> Dave Burley
> 
> 
> From: dev <dev-bounces@dpdk.org> on behalf of Nicolas Chautru
> <nicolas.chautru@intel.com>
> Sent: 19 August 2020 01:25
> To: dev@dpdk.org <dev@dpdk.org>; akhil.goyal@nxp.com
> <akhil.goyal@nxp.com>
> Cc: bruce.richardson@intel.com <bruce.richardson@intel.com>; Nicolas
> Chautru <nicolas.chautru@intel.com>
> Subject: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing
> functions
> 
> Adding LDPC decode and encode processing operations
> 
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
>  drivers/baseband/acc100/rte_acc100_pmd.c | 1625 +++++++++++++++++++++++++++++-
>  drivers/baseband/acc100/rte_acc100_pmd.h |    3 +
>  2 files changed, 1626 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
> index 7a21c57..5f32813 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> @@ -15,6 +15,9 @@
>  #include <rte_hexdump.h>
>  #include <rte_pci.h>
>  #include <rte_bus_pci.h>
> +#ifdef RTE_BBDEV_OFFLOAD_COST
> +#include <rte_cycles.h>
> +#endif
> 
>  #include <rte_bbdev.h>
>  #include <rte_bbdev_pmd.h>
> @@ -449,7 +452,6 @@
>          return 0;
>  }
> 
> -
>  /**
>   * Report a ACC100 queue index which is free
>   * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
> @@ -634,6 +636,46 @@
>          struct acc100_device *d = dev->data->dev_private;
> 
>          static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
> +               {
> +                       .type   = RTE_BBDEV_OP_LDPC_ENC,
> +                       .cap.ldpc_enc = {
> +                               .capability_flags =
> +                                       RTE_BBDEV_LDPC_RATE_MATCH |
> +                                       RTE_BBDEV_LDPC_CRC_24B_ATTACH |
> +                                       RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
> +                               .num_buffers_src =
> +                                               RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> +                               .num_buffers_dst =
> +                                               RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> +                       }
> +               },
> +               {
> +                       .type   = RTE_BBDEV_OP_LDPC_DEC,
> +                       .cap.ldpc_dec = {
> +                       .capability_flags =
> +                               RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK |
> +                               RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP |
> +                               RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
> +                               RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
> +#ifdef ACC100_EXT_MEM
> +                               RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
> +                               RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
> +#endif
> +                               RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE |
> +                               RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS |
> +                               RTE_BBDEV_LDPC_DECODE_BYPASS |
> +                               RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
> +                               RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
> +                               RTE_BBDEV_LDPC_LLR_COMPRESSION,
> +                       .llr_size = 8,
> +                       .llr_decimals = 1,
> +                       .num_buffers_src =
> +                                       RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> +                       .num_buffers_hard_out =
> +                                       RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> +                       .num_buffers_soft_out = 0,
> +                       }
> +               },
>                  RTE_BBDEV_END_OF_CAPABILITIES_LIST()
>          };
> 
> @@ -669,9 +711,14 @@
>          dev_info->cpu_flag_reqs = NULL;
>          dev_info->min_alignment = 64;
>          dev_info->capabilities = bbdev_capabilities;
> +#ifdef ACC100_EXT_MEM
>          dev_info->harq_buffer_size = d->ddr_size;
> +#else
> +       dev_info->harq_buffer_size = 0;
> +#endif
>  }
> 
> +
>  static const struct rte_bbdev_ops acc100_bbdev_ops = {
>          .setup_queues = acc100_setup_queues,
>          .close = acc100_dev_close,
> @@ -696,6 +743,1577 @@
>          {.device_id = 0},
>  };
> 
> +/* Read flag value 0/1 from bitmap */
> +static inline bool
> +check_bit(uint32_t bitmap, uint32_t bitmask)
> +{
> +       return bitmap & bitmask;
> +}
> +
> +static inline char *
> +mbuf_append(struct rte_mbuf *m_head, struct rte_mbuf *m, uint16_t len)
> +{
> +       if (unlikely(len > rte_pktmbuf_tailroom(m)))
> +               return NULL;
> +
> +       char *tail = (char *)m->buf_addr + m->data_off + m->data_len;
> +       m->data_len = (uint16_t)(m->data_len + len);
> +       m_head->pkt_len  = (m_head->pkt_len + len);
> +       return tail;
> +}
> +
> +/* Compute value of k0.
> + * Based on 3GPP 38.212 Table 5.4.2.1-2
> + * Starting position of different redundancy versions, k0
> + */
> +static inline uint16_t
> +get_k0(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv_index)
> +{
> +       if (rv_index == 0)
> +               return 0;
> +       uint16_t n = (bg == 1 ? N_ZC_1 : N_ZC_2) * z_c;
> +       if (n_cb == n) {
> +               if (rv_index == 1)
> +                       return (bg == 1 ? K0_1_1 : K0_1_2) * z_c;
> +               else if (rv_index == 2)
> +                       return (bg == 1 ? K0_2_1 : K0_2_2) * z_c;
> +               else
> +                       return (bg == 1 ? K0_3_1 : K0_3_2) * z_c;
> +       }
> +       /* LBRM case - includes a division by N */
> +       if (rv_index == 1)
> +               return (((bg == 1 ? K0_1_1 : K0_1_2) * n_cb)
> +                               / n) * z_c;
> +       else if (rv_index == 2)
> +               return (((bg == 1 ? K0_2_1 : K0_2_2) * n_cb)
> +                               / n) * z_c;
> +       else
> +               return (((bg == 1 ? K0_3_1 : K0_3_2) * n_cb)
> +                               / n) * z_c;
> +}
> +
> +/* Fill in a frame control word for LDPC encoding. */
> +static inline void
> +acc100_fcw_le_fill(const struct rte_bbdev_enc_op *op,
> +               struct acc100_fcw_le *fcw, int num_cb)
> +{
> +       fcw->qm = op->ldpc_enc.q_m;
> +       fcw->nfiller = op->ldpc_enc.n_filler;
> +       fcw->BG = (op->ldpc_enc.basegraph - 1);
> +       fcw->Zc = op->ldpc_enc.z_c;
> +       fcw->ncb = op->ldpc_enc.n_cb;
> +       fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_enc.basegraph,
> +                       op->ldpc_enc.rv_index);
> +       fcw->rm_e = op->ldpc_enc.cb_params.e;
> +       fcw->crc_select = check_bit(op->ldpc_enc.op_flags,
> +                       RTE_BBDEV_LDPC_CRC_24B_ATTACH);
> +       fcw->bypass_intlv = check_bit(op->ldpc_enc.op_flags,
> +                       RTE_BBDEV_LDPC_INTERLEAVER_BYPASS);
> +       fcw->mcb_count = num_cb;
> +}
> +
> +/* Fill in a frame control word for LDPC decoding. */
> +static inline void
> +acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld
> *fcw,
> +               union acc100_harq_layout_data *harq_layout)
> +{
> +       uint16_t harq_out_length, harq_in_length, ncb_p, k0_p, parity_offset;
> +       uint16_t harq_index;
> +       uint32_t l;
> +       bool harq_prun = false;
> +
> +       fcw->qm = op->ldpc_dec.q_m;
> +       fcw->nfiller = op->ldpc_dec.n_filler;
> +       fcw->BG = (op->ldpc_dec.basegraph - 1);
> +       fcw->Zc = op->ldpc_dec.z_c;
> +       fcw->ncb = op->ldpc_dec.n_cb;
> +       fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_dec.basegraph,
> +                       op->ldpc_dec.rv_index);
> +       if (op->ldpc_dec.code_block_mode == 1)
> +               fcw->rm_e = op->ldpc_dec.cb_params.e;
> +       else
> +               fcw->rm_e = (op->ldpc_dec.tb_params.r <
> +                               op->ldpc_dec.tb_params.cab) ?
> +                                               op->ldpc_dec.tb_params.ea :
> +                                               op->ldpc_dec.tb_params.eb;
> +
> +       fcw->hcin_en = check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE);
> +       fcw->hcout_en = check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE);
> +       fcw->crc_select = check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK);
> +       fcw->bypass_dec = check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_DECODE_BYPASS);
> +       fcw->bypass_intlv = check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS);
> +       if (op->ldpc_dec.q_m == 1) {
> +               fcw->bypass_intlv = 1;
> +               fcw->qm = 2;
> +       }
> +       fcw->hcin_decomp_mode = check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> +       fcw->hcout_comp_mode = check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> +       fcw->llr_pack_mode = check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_LLR_COMPRESSION);
> +       harq_index = op->ldpc_dec.harq_combined_output.offset /
> +                       ACC100_HARQ_OFFSET;
> +#ifdef ACC100_EXT_MEM
> +       /* Limit cases when HARQ pruning is valid */
> +       harq_prun = ((op->ldpc_dec.harq_combined_output.offset %
> +                       ACC100_HARQ_OFFSET) == 0) &&
> +                       (op->ldpc_dec.harq_combined_output.offset <= UINT16_MAX
> +                       * ACC100_HARQ_OFFSET);
> +#endif
> +       if (fcw->hcin_en > 0) {
> +               harq_in_length = op->ldpc_dec.harq_combined_input.length;
> +               if (fcw->hcin_decomp_mode > 0)
> +                       harq_in_length = harq_in_length * 8 / 6;
> +               harq_in_length = RTE_ALIGN(harq_in_length, 64);
> +               if ((harq_layout[harq_index].offset > 0) & harq_prun) {
> +                       rte_bbdev_log_debug("HARQ IN offset unexpected for now\n");
> +                       fcw->hcin_size0 = harq_layout[harq_index].size0;
> +                       fcw->hcin_offset = harq_layout[harq_index].offset;
> +                       fcw->hcin_size1 = harq_in_length -
> +                                       harq_layout[harq_index].offset;
> +               } else {
> +                       fcw->hcin_size0 = harq_in_length;
> +                       fcw->hcin_offset = 0;
> +                       fcw->hcin_size1 = 0;
> +               }
> +       } else {
> +               fcw->hcin_size0 = 0;
> +               fcw->hcin_offset = 0;
> +               fcw->hcin_size1 = 0;
> +       }
> +
> +       fcw->itmax = op->ldpc_dec.iter_max;
> +       fcw->itstop = check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE);
> +       fcw->synd_precoder = fcw->itstop;
> +       /*
> +        * These are all implicitly set
> +        * fcw->synd_post = 0;
> +        * fcw->so_en = 0;
> +        * fcw->so_bypass_rm = 0;
> +        * fcw->so_bypass_intlv = 0;
> +        * fcw->dec_convllr = 0;
> +        * fcw->hcout_convllr = 0;
> +        * fcw->hcout_size1 = 0;
> +        * fcw->so_it = 0;
> +        * fcw->hcout_offset = 0;
> +        * fcw->negstop_th = 0;
> +        * fcw->negstop_it = 0;
> +        * fcw->negstop_en = 0;
> +        * fcw->gain_i = 1;
> +        * fcw->gain_h = 1;
> +        */
> +       if (fcw->hcout_en > 0) {
> +               parity_offset = (op->ldpc_dec.basegraph == 1 ? 20 : 8)
> +                       * op->ldpc_dec.z_c - op->ldpc_dec.n_filler;
> +               k0_p = (fcw->k0 > parity_offset) ?
> +                               fcw->k0 - op->ldpc_dec.n_filler : fcw->k0;
> +               ncb_p = fcw->ncb - op->ldpc_dec.n_filler;
> +               l = k0_p + fcw->rm_e;
> +               harq_out_length = (uint16_t) fcw->hcin_size0;
> +               harq_out_length = RTE_MIN(RTE_MAX(harq_out_length, l), ncb_p);
> +               harq_out_length = (harq_out_length + 0x3F) & 0xFFC0;
> +               if ((k0_p > fcw->hcin_size0 + ACC100_HARQ_OFFSET_THRESHOLD) &&
> +                               harq_prun) {
> +                       fcw->hcout_size0 = (uint16_t) fcw->hcin_size0;
> +                       fcw->hcout_offset = k0_p & 0xFFC0;
> +                       fcw->hcout_size1 = harq_out_length - fcw->hcout_offset;
> +               } else {
> +                       fcw->hcout_size0 = harq_out_length;
> +                       fcw->hcout_size1 = 0;
> +                       fcw->hcout_offset = 0;
> +               }
> +               harq_layout[harq_index].offset = fcw->hcout_offset;
> +               harq_layout[harq_index].size0 = fcw->hcout_size0;
> +       } else {
> +               fcw->hcout_size0 = 0;
> +               fcw->hcout_size1 = 0;
> +               fcw->hcout_offset = 0;
> +       }
> +}
> +
> +/**
> + * Fills descriptor with data pointers of one block type.
> + *
> + * @param desc
> + *   Pointer to DMA descriptor.
> + * @param input
> + *   Pointer to pointer to input data which will be encoded. It can be changed
> + *   and points to next segment in scatter-gather case.
> + * @param offset
> + *   Input offset in rte_mbuf structure. It is used for calculating the point
> + *   where data is starting.
> + * @param cb_len
> + *   Length of currently processed Code Block
> + * @param seg_total_left
> + *   It indicates how many bytes still left in segment (mbuf) for further
> + *   processing.
> + * @param op_flags
> + *   Store information about device capabilities
> + * @param next_triplet
> + *   Index for ACC100 DMA Descriptor triplet
> + *
> + * @return
> + *   Returns index of next triplet on success, other value if lengths of
> + *   pkt and processed cb do not match.
> + *
> + */
> +static inline int
> +acc100_dma_fill_blk_type_in(struct acc100_dma_req_desc *desc,
> +               struct rte_mbuf **input, uint32_t *offset, uint32_t cb_len,
> +               uint32_t *seg_total_left, int next_triplet)
> +{
> +       uint32_t part_len;
> +       struct rte_mbuf *m = *input;
> +
> +       part_len = (*seg_total_left < cb_len) ? *seg_total_left : cb_len;
> +       cb_len -= part_len;
> +       *seg_total_left -= part_len;
> +
> +       desc->data_ptrs[next_triplet].address =
> +                       rte_pktmbuf_iova_offset(m, *offset);
> +       desc->data_ptrs[next_triplet].blen = part_len;
> +       desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
> +       desc->data_ptrs[next_triplet].last = 0;
> +       desc->data_ptrs[next_triplet].dma_ext = 0;
> +       *offset += part_len;
> +       next_triplet++;
> +
> +       while (cb_len > 0) {
> +               if (next_triplet < ACC100_DMA_MAX_NUM_POINTERS &&
> +                               m->next != NULL) {
> +
> +                       m = m->next;
> +                       *seg_total_left = rte_pktmbuf_data_len(m);
> +                       part_len = (*seg_total_left < cb_len) ?
> +                                       *seg_total_left :
> +                                       cb_len;
> +                       desc->data_ptrs[next_triplet].address =
> +                                       rte_pktmbuf_mtophys(m);
> +                       desc->data_ptrs[next_triplet].blen = part_len;
> +                       desc->data_ptrs[next_triplet].blkid =
> +                                       ACC100_DMA_BLKID_IN;
> +                       desc->data_ptrs[next_triplet].last = 0;
> +                       desc->data_ptrs[next_triplet].dma_ext = 0;
> +                       cb_len -= part_len;
> +                       *seg_total_left -= part_len;
> +                       /* Initializing offset for next segment (mbuf) */
> +                       *offset = part_len;
> +                       next_triplet++;
> +               } else {
> +                       rte_bbdev_log(ERR,
> +                               "Some data still left for processing: "
> +                               "data_left: %u, next_triplet: %u, next_mbuf: %p",
> +                               cb_len, next_triplet, m->next);
> +                       return -EINVAL;
> +               }
> +       }
> +       /* Storing new mbuf as it could be changed in scatter-gather case*/
> +       *input = m;
> +
> +       return next_triplet;
> +}
> +
> +/* Fills descriptor with data pointers of one block type.
> + * Returns index of next triplet on success, other value if lengths of
> + * output data and processed mbuf do not match.
> + */
> +static inline int
> +acc100_dma_fill_blk_type_out(struct acc100_dma_req_desc *desc,
> +               struct rte_mbuf *output, uint32_t out_offset,
> +               uint32_t output_len, int next_triplet, int blk_id)
> +{
> +       desc->data_ptrs[next_triplet].address =
> +                       rte_pktmbuf_iova_offset(output, out_offset);
> +       desc->data_ptrs[next_triplet].blen = output_len;
> +       desc->data_ptrs[next_triplet].blkid = blk_id;
> +       desc->data_ptrs[next_triplet].last = 0;
> +       desc->data_ptrs[next_triplet].dma_ext = 0;
> +       next_triplet++;
> +
> +       return next_triplet;
> +}
> +
> +static inline int
> +acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
> +               struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
> +               struct rte_mbuf *output, uint32_t *in_offset,
> +               uint32_t *out_offset, uint32_t *out_length,
> +               uint32_t *mbuf_total_left, uint32_t *seg_total_left)
> +{
> +       int next_triplet = 1; /* FCW already done */
> +       uint16_t K, in_length_in_bits, in_length_in_bytes;
> +       struct rte_bbdev_op_ldpc_enc *enc = &op->ldpc_enc;
> +
> +       desc->word0 = ACC100_DMA_DESC_TYPE;
> +       desc->word1 = 0; /**< Timestamp could be disabled */
> +       desc->word2 = 0;
> +       desc->word3 = 0;
> +       desc->numCBs = 1;
> +
> +       K = (enc->basegraph == 1 ? 22 : 10) * enc->z_c;
> +       in_length_in_bits = K - enc->n_filler;
> +       if ((enc->op_flags & RTE_BBDEV_LDPC_CRC_24A_ATTACH) ||
> +                       (enc->op_flags & RTE_BBDEV_LDPC_CRC_24B_ATTACH))
> +               in_length_in_bits -= 24;
> +       in_length_in_bytes = in_length_in_bits >> 3;
> +
> +       if (unlikely((*mbuf_total_left == 0) ||
> +                       (*mbuf_total_left < in_length_in_bytes))) {
> +               rte_bbdev_log(ERR,
> +                               "Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
> +                               *mbuf_total_left, in_length_in_bytes);
> +               return -1;
> +       }
> +
> +       next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
> +                       in_length_in_bytes,
> +                       seg_total_left, next_triplet);
> +       if (unlikely(next_triplet < 0)) {
> +               rte_bbdev_log(ERR,
> +                               "Mismatch between data to process and mbuf data length in bbdev_op: %p",
> +                               op);
> +               return -1;
> +       }
> +       desc->data_ptrs[next_triplet - 1].last = 1;
> +       desc->m2dlen = next_triplet;
> +       *mbuf_total_left -= in_length_in_bytes;
> +
> +       /* Set output length */
> +       /* Integer round up division by 8 */
> +       *out_length = (enc->cb_params.e + 7) >> 3;
> +
> +       next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
> +                       *out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
> +       if (unlikely(next_triplet < 0)) {
> +               rte_bbdev_log(ERR,
> +                               "Mismatch between data to process and mbuf data length in bbdev_op: %p",
> +                               op);
> +               return -1;
> +       }
> +       op->ldpc_enc.output.length += *out_length;
> +       *out_offset += *out_length;
> +       desc->data_ptrs[next_triplet - 1].last = 1;
> +       desc->data_ptrs[next_triplet - 1].dma_ext = 0;
> +       desc->d2mlen = next_triplet - desc->m2dlen;
> +
> +       desc->op_addr = op;
> +
> +       return 0;
> +}
> +
> +static inline int
> +acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
> +               struct acc100_dma_req_desc *desc,
> +               struct rte_mbuf **input, struct rte_mbuf *h_output,
> +               uint32_t *in_offset, uint32_t *h_out_offset,
> +               uint32_t *h_out_length, uint32_t *mbuf_total_left,
> +               uint32_t *seg_total_left,
> +               struct acc100_fcw_ld *fcw)
> +{
> +       struct rte_bbdev_op_ldpc_dec *dec = &op->ldpc_dec;
> +       int next_triplet = 1; /* FCW already done */
> +       uint32_t input_length;
> +       uint16_t output_length, crc24_overlap = 0;
> +       uint16_t sys_cols, K, h_p_size, h_np_size;
> +       bool h_comp = check_bit(dec->op_flags,
> +                       RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> +
> +       desc->word0 = ACC100_DMA_DESC_TYPE;
> +       desc->word1 = 0; /**< Timestamp could be disabled */
> +       desc->word2 = 0;
> +       desc->word3 = 0;
> +       desc->numCBs = 1;
> +
> +       if (check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP))
> +               crc24_overlap = 24;
> +
> +       /* Compute some LDPC BG lengths */
> +       input_length = dec->cb_params.e;
> +       if (check_bit(op->ldpc_dec.op_flags,
> +                       RTE_BBDEV_LDPC_LLR_COMPRESSION))
> +               input_length = (input_length * 3 + 3) / 4;
> +       sys_cols = (dec->basegraph == 1) ? 22 : 10;
> +       K = sys_cols * dec->z_c;
> +       output_length = K - dec->n_filler - crc24_overlap;
> +
> +       if (unlikely((*mbuf_total_left == 0) ||
> +                       (*mbuf_total_left < input_length))) {
> +               rte_bbdev_log(ERR,
> +                               "Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
> +                               *mbuf_total_left, input_length);
> +               return -1;
> +       }
> +
> +       next_triplet = acc100_dma_fill_blk_type_in(desc, input,
> +                       in_offset, input_length,
> +                       seg_total_left, next_triplet);
> +
> +       if (unlikely(next_triplet < 0)) {
> +               rte_bbdev_log(ERR,
> +                               "Mismatch between data to process and mbuf data length in bbdev_op: %p",
> +                               op);
> +               return -1;
> +       }
> +
> +       if (check_bit(op->ldpc_dec.op_flags,
> +                               RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
> +               h_p_size = fcw->hcin_size0 + fcw->hcin_size1;
> +               if (h_comp)
> +                       h_p_size = (h_p_size * 3 + 3) / 4;
> +               desc->data_ptrs[next_triplet].address =
> +                               dec->harq_combined_input.offset;
> +               desc->data_ptrs[next_triplet].blen = h_p_size;
> +               desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN_HARQ;
> +               desc->data_ptrs[next_triplet].dma_ext = 1;
> +#ifndef ACC100_EXT_MEM
> +               acc100_dma_fill_blk_type_out(
> +                               desc,
> +                               op->ldpc_dec.harq_combined_input.data,
> +                               op->ldpc_dec.harq_combined_input.offset,
> +                               h_p_size,
> +                               next_triplet,
> +                               ACC100_DMA_BLKID_IN_HARQ);
> +#endif
> +               next_triplet++;
> +       }
> +
> +       desc->data_ptrs[next_triplet - 1].last = 1;
> +       desc->m2dlen = next_triplet;
> +       *mbuf_total_left -= input_length;
> +
> +       next_triplet = acc100_dma_fill_blk_type_out(desc, h_output,
> +                       *h_out_offset, output_length >> 3, next_triplet,
> +                       ACC100_DMA_BLKID_OUT_HARD);
> +       if (unlikely(next_triplet < 0)) {
> +               rte_bbdev_log(ERR,
> +                               "Mismatch between data to process and mbuf data length in bbdev_op: %p",
> +                               op);
> +               return -1;
> +       }
> +
> +       if (check_bit(op->ldpc_dec.op_flags,
> +                               RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
> +               /* Pruned size of the HARQ */
> +               h_p_size = fcw->hcout_size0 + fcw->hcout_size1;
> +               /* Non-Pruned size of the HARQ */
> +               h_np_size = fcw->hcout_offset > 0 ?
> +                               fcw->hcout_offset + fcw->hcout_size1 :
> +                               h_p_size;
> +               if (h_comp) {
> +                       h_np_size = (h_np_size * 3 + 3) / 4;
> +                       h_p_size = (h_p_size * 3 + 3) / 4;
> +               }
> +               dec->harq_combined_output.length = h_np_size;
> +               desc->data_ptrs[next_triplet].address =
> +                               dec->harq_combined_output.offset;
> +               desc->data_ptrs[next_triplet].blen = h_p_size;
> +               desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_OUT_HARQ;
> +               desc->data_ptrs[next_triplet].dma_ext = 1;
> +#ifndef ACC100_EXT_MEM
> +               acc100_dma_fill_blk_type_out(
> +                               desc,
> +                               dec->harq_combined_output.data,
> +                               dec->harq_combined_output.offset,
> +                               h_p_size,
> +                               next_triplet,
> +                               ACC100_DMA_BLKID_OUT_HARQ);
> +#endif
> +               next_triplet++;
> +       }
> +
> +       *h_out_length = output_length >> 3;
> +       dec->hard_output.length += *h_out_length;
> +       *h_out_offset += *h_out_length;
> +       desc->data_ptrs[next_triplet - 1].last = 1;
> +       desc->d2mlen = next_triplet - desc->m2dlen;
> +
> +       desc->op_addr = op;
> +
> +       return 0;
> +}
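A note on the size arithmetic above: the `(x * 3 + 3) / 4` expression used for both `RTE_BBDEV_LDPC_LLR_COMPRESSION` and 6-bit HARQ compression is an integer ceiling of 3x/4, i.e. 8-bit values repacked into 6 bits. A minimal standalone sketch (the helper name is mine, not part of the patch):

```c
#include <stdint.h>

/* Illustrative helper (not in the patch): size in bytes of a buffer
 * once 8-bit LLRs are compressed to 6 bits, i.e. ceil(3x/4), matching
 * the (x * 3 + 3) / 4 expression in the descriptor fill above.
 */
static inline uint16_t
harq_compressed_size(uint16_t size)
{
	return (size * 3 + 3) / 4;
}
```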
> +
> +static inline void
> +acc100_dma_desc_ld_update(struct rte_bbdev_dec_op *op,
> +               struct acc100_dma_req_desc *desc,
> +               struct rte_mbuf *input, struct rte_mbuf *h_output,
> +               uint32_t *in_offset, uint32_t *h_out_offset,
> +               uint32_t *h_out_length,
> +               union acc100_harq_layout_data *harq_layout)
> +{
> +       int next_triplet = 1; /* FCW already done */
> +       desc->data_ptrs[next_triplet].address =
> +                       rte_pktmbuf_iova_offset(input, *in_offset);
> +       next_triplet++;
> +
> +       if (check_bit(op->ldpc_dec.op_flags,
> +                               RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
> +               struct rte_bbdev_op_data hi = op->ldpc_dec.harq_combined_input;
> +               desc->data_ptrs[next_triplet].address = hi.offset;
> +#ifndef ACC100_EXT_MEM
> +               desc->data_ptrs[next_triplet].address =
> +                               rte_pktmbuf_iova_offset(hi.data, hi.offset);
> +#endif
> +               next_triplet++;
> +       }
> +
> +       desc->data_ptrs[next_triplet].address =
> +                       rte_pktmbuf_iova_offset(h_output, *h_out_offset);
> +       *h_out_length = desc->data_ptrs[next_triplet].blen;
> +       next_triplet++;
> +
> +       if (check_bit(op->ldpc_dec.op_flags,
> +                               RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
> +               desc->data_ptrs[next_triplet].address =
> +                               op->ldpc_dec.harq_combined_output.offset;
> +               /* Adjust based on previous operation */
> +               struct rte_bbdev_dec_op *prev_op = desc->op_addr;
> +               op->ldpc_dec.harq_combined_output.length =
> +                               prev_op->ldpc_dec.harq_combined_output.length;
> +               int16_t hq_idx = op->ldpc_dec.harq_combined_output.offset /
> +                               ACC100_HARQ_OFFSET;
> +               int16_t prev_hq_idx =
> +                               prev_op->ldpc_dec.harq_combined_output.offset
> +                               / ACC100_HARQ_OFFSET;
> +               harq_layout[hq_idx].val = harq_layout[prev_hq_idx].val;
> +#ifndef ACC100_EXT_MEM
> +               struct rte_bbdev_op_data ho =
> +                               op->ldpc_dec.harq_combined_output;
> +               desc->data_ptrs[next_triplet].address =
> +                               rte_pktmbuf_iova_offset(ho.data, ho.offset);
> +#endif
> +               next_triplet++;
> +       }
> +
> +       op->ldpc_dec.hard_output.length += *h_out_length;
> +       desc->op_addr = op;
> +}
> +
> +
> +/* Enqueue a number of operations to HW and update software rings */
> +static inline void
> +acc100_dma_enqueue(struct acc100_queue *q, uint16_t n,
> +               struct rte_bbdev_stats *queue_stats)
> +{
> +       union acc100_enqueue_reg_fmt enq_req;
> +#ifdef RTE_BBDEV_OFFLOAD_COST
> +       uint64_t start_time = 0;
> +       queue_stats->acc_offload_cycles = 0;
> +#else
> +       RTE_SET_USED(queue_stats);
> +#endif
> +
> +       enq_req.val = 0;
> +       /* Setting offset, 100b for 256 DMA Desc */
> +       enq_req.addr_offset = ACC100_DESC_OFFSET;
> +
> +       /* Split ops into batches */
> +       do {
> +               union acc100_dma_desc *desc;
> +               uint16_t enq_batch_size;
> +               uint64_t offset;
> +               rte_iova_t req_elem_addr;
> +
> +               enq_batch_size = RTE_MIN(n, MAX_ENQ_BATCH_SIZE);
> +
> +               /* Set flag on last descriptor in a batch */
> +               desc = q->ring_addr + ((q->sw_ring_head + enq_batch_size - 1) &
> +                               q->sw_ring_wrap_mask);
> +               desc->req.last_desc_in_batch = 1;
> +
> +               /* Calculate the 1st descriptor's address */
> +               offset = ((q->sw_ring_head & q->sw_ring_wrap_mask) *
> +                               sizeof(union acc100_dma_desc));
> +               req_elem_addr = q->ring_addr_phys + offset;
> +
> +               /* Fill enqueue struct */
> +               enq_req.num_elem = enq_batch_size;
> +               /* low 6 bits are not needed */
> +               enq_req.req_elem_addr = (uint32_t)(req_elem_addr >> 6);
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +               rte_memdump(stderr, "Req sdone", desc, sizeof(*desc));
> +#endif
> +               rte_bbdev_log_debug(
> +                               "Enqueue %u reqs (phys %#"PRIx64") to reg %p",
> +                               enq_batch_size,
> +                               req_elem_addr,
> +                               (void *)q->mmio_reg_enqueue);
> +
> +               rte_wmb();
> +
> +#ifdef RTE_BBDEV_OFFLOAD_COST
> +               /* Start time measurement for enqueue function offload. */
> +               start_time = rte_rdtsc_precise();
> +#endif
> +               rte_bbdev_log(DEBUG, "Debug : MMIO Enqueue");
> +               mmio_write(q->mmio_reg_enqueue, enq_req.val);
> +
> +#ifdef RTE_BBDEV_OFFLOAD_COST
> +               queue_stats->acc_offload_cycles +=
> +                               rte_rdtsc_precise() - start_time;
> +#endif
> +
> +               q->aq_enqueued++;
> +               q->sw_ring_head += enq_batch_size;
> +               n -= enq_batch_size;
> +
> +       } while (n);
> +}
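The do/while loop above splits the n descriptors into doorbell (MMIO) writes of at most MAX_ENQ_BATCH_SIZE each. A toy model of just the batching arithmetic (the constant's value here is assumed for illustration only):

```c
#include <stdint.h>

#define TOY_MAX_ENQ_BATCH_SIZE 255 /* assumed value, illustration only */

/* Number of doorbell writes the enqueue loop performs for n
 * descriptors: one write per batch of up to the max batch size.
 */
static unsigned int
toy_num_batches(uint16_t n)
{
	unsigned int batches = 0;

	while (n != 0) {
		uint16_t b = (n < TOY_MAX_ENQ_BATCH_SIZE) ?
				n : TOY_MAX_ENQ_BATCH_SIZE;
		n -= b;
		batches++;
	}
	return batches;
}
```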
> +
> +/* Enqueue a group of encode operations for ACC100 device in CB mode */
> +static inline int
> +enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
> +               uint16_t total_enqueued_cbs, int16_t num)
> +{
> +       union acc100_dma_desc *desc = NULL;
> +       uint32_t out_length;
> +       struct rte_mbuf *output_head, *output;
> +       int i, next_triplet;
> +       uint16_t  in_length_in_bytes;
> +       struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
> +
> +       uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> +                       & q->sw_ring_wrap_mask);
> +       desc = q->ring_addr + desc_idx;
> +       acc100_fcw_le_fill(ops[0], &desc->req.fcw_le, num);
> +
> +       /* This could be done at polling */
> +       desc->req.word0 = ACC100_DMA_DESC_TYPE;
> +       desc->req.word1 = 0; /**< Timestamp could be disabled */
> +       desc->req.word2 = 0;
> +       desc->req.word3 = 0;
> +       desc->req.numCBs = num;
> +
> +       in_length_in_bytes = ops[0]->ldpc_enc.input.data->data_len;
> +       out_length = (enc->cb_params.e + 7) >> 3;
> +       desc->req.m2dlen = 1 + num;
> +       desc->req.d2mlen = num;
> +       next_triplet = 1;
> +
> +       for (i = 0; i < num; i++) {
> +               desc->req.data_ptrs[next_triplet].address =
> +                       rte_pktmbuf_iova_offset(ops[i]->ldpc_enc.input.data, 0);
> +               desc->req.data_ptrs[next_triplet].blen = in_length_in_bytes;
> +               next_triplet++;
> +               desc->req.data_ptrs[next_triplet].address =
> +                               rte_pktmbuf_iova_offset(
> +                               ops[i]->ldpc_enc.output.data, 0);
> +               desc->req.data_ptrs[next_triplet].blen = out_length;
> +               next_triplet++;
> +               ops[i]->ldpc_enc.output.length = out_length;
> +               output_head = output = ops[i]->ldpc_enc.output.data;
> +               mbuf_append(output_head, output, out_length);
> +               output->data_len = out_length;
> +       }
> +
> +       desc->req.op_addr = ops[0];
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +       rte_memdump(stderr, "FCW", &desc->req.fcw_le,
> +                       sizeof(desc->req.fcw_le) - 8);
> +       rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> +#endif
> +
> +       /* Number of compatible CBs/ops successfully prepared to enqueue */
> +       return num;
> +}
> +
> +/* Enqueue one encode operation for ACC100 device in CB mode */
> +static inline int
> +enqueue_ldpc_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
> +               uint16_t total_enqueued_cbs)
> +{
> +       union acc100_dma_desc *desc = NULL;
> +       int ret;
> +       uint32_t in_offset, out_offset, out_length, mbuf_total_left,
> +               seg_total_left;
> +       struct rte_mbuf *input, *output_head, *output;
> +
> +       uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> +                       & q->sw_ring_wrap_mask);
> +       desc = q->ring_addr + desc_idx;
> +       acc100_fcw_le_fill(op, &desc->req.fcw_le, 1);
> +
> +       input = op->ldpc_enc.input.data;
> +       output_head = output = op->ldpc_enc.output.data;
> +       in_offset = op->ldpc_enc.input.offset;
> +       out_offset = op->ldpc_enc.output.offset;
> +       out_length = 0;
> +       mbuf_total_left = op->ldpc_enc.input.length;
> +       seg_total_left = rte_pktmbuf_data_len(op->ldpc_enc.input.data)
> +                       - in_offset;
> +
> +       ret = acc100_dma_desc_le_fill(op, &desc->req, &input, output,
> +                       &in_offset, &out_offset, &out_length, &mbuf_total_left,
> +                       &seg_total_left);
> +
> +       if (unlikely(ret < 0))
> +               return ret;
> +
> +       mbuf_append(output_head, output, out_length);
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +       rte_memdump(stderr, "FCW", &desc->req.fcw_le,
> +                       sizeof(desc->req.fcw_le) - 8);
> +       rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> +
> +       /* Check if any data left after processing one CB */
> +       if (mbuf_total_left != 0) {
> +               rte_bbdev_log(ERR,
> +                               "Some data still left after processing one CB: mbuf_total_left = %u",
> +                               mbuf_total_left);
> +               return -EINVAL;
> +       }
> +#endif
> +       /* One CB (one op) was successfully prepared to enqueue */
> +       return 1;
> +}
> +
> +/* Enqueue one decode operation for ACC100 device in CB mode */
> +static inline int
> +enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
> +               uint16_t total_enqueued_cbs, bool same_op)
> +{
> +       int ret;
> +
> +       union acc100_dma_desc *desc;
> +       uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> +                       & q->sw_ring_wrap_mask);
> +       desc = q->ring_addr + desc_idx;
> +       struct rte_mbuf *input, *h_output_head, *h_output;
> +       uint32_t in_offset, h_out_offset, h_out_length, mbuf_total_left;
> +       input = op->ldpc_dec.input.data;
> +       h_output_head = h_output = op->ldpc_dec.hard_output.data;
> +       in_offset = op->ldpc_dec.input.offset;
> +       h_out_offset = op->ldpc_dec.hard_output.offset;
> +       mbuf_total_left = op->ldpc_dec.input.length;
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +       if (unlikely(input == NULL)) {
> +               rte_bbdev_log(ERR, "Invalid mbuf pointer");
> +               return -EFAULT;
> +       }
> +#endif
> +       union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
> +
> +       if (same_op) {
> +               union acc100_dma_desc *prev_desc;
> +               desc_idx = ((q->sw_ring_head + total_enqueued_cbs - 1)
> +                               & q->sw_ring_wrap_mask);
> +               prev_desc = q->ring_addr + desc_idx;
> +               uint8_t *prev_ptr = (uint8_t *) prev_desc;
> +               uint8_t *new_ptr = (uint8_t *) desc;
> +               /* Copy first 4 words and BDESCs */
> +               rte_memcpy(new_ptr, prev_ptr, 16);
> +               rte_memcpy(new_ptr + 36, prev_ptr + 36, 40);
> +               desc->req.op_addr = prev_desc->req.op_addr;
> +               /* Copy FCW */
> +               rte_memcpy(new_ptr + ACC100_DESC_FCW_OFFSET,
> +                               prev_ptr + ACC100_DESC_FCW_OFFSET,
> +                               ACC100_FCW_LD_BLEN);
> +               acc100_dma_desc_ld_update(op, &desc->req, input, h_output,
> +                               &in_offset, &h_out_offset,
> +                               &h_out_length, harq_layout);
> +       } else {
> +               struct acc100_fcw_ld *fcw;
> +               uint32_t seg_total_left;
> +               fcw = &desc->req.fcw_ld;
> +               acc100_fcw_ld_fill(op, fcw, harq_layout);
> +
> +               /* Special handling when overusing mbuf */
> +               if (fcw->rm_e < MAX_E_MBUF)
> +                       seg_total_left = rte_pktmbuf_data_len(input)
> +                                       - in_offset;
> +               else
> +                       seg_total_left = fcw->rm_e;
> +
> +               ret = acc100_dma_desc_ld_fill(op, &desc->req, &input, h_output,
> +                               &in_offset, &h_out_offset,
> +                               &h_out_length, &mbuf_total_left,
> +                               &seg_total_left, fcw);
> +               if (unlikely(ret < 0))
> +                       return ret;
> +       }
> +
> +       /* Hard output */
> +       mbuf_append(h_output_head, h_output, h_out_length);
> +#ifndef ACC100_EXT_MEM
> +       if (op->ldpc_dec.harq_combined_output.length > 0) {
> +               /* Push the HARQ output into host memory */
> +               struct rte_mbuf *hq_output_head, *hq_output;
> +               hq_output_head = op->ldpc_dec.harq_combined_output.data;
> +               hq_output = op->ldpc_dec.harq_combined_output.data;
> +               mbuf_append(hq_output_head, hq_output,
> +                               op->ldpc_dec.harq_combined_output.length);
> +       }
> +#endif
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +       rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
> +                       sizeof(desc->req.fcw_ld) - 8);
> +       rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> +#endif
> +
> +       /* One CB (one op) was successfully prepared to enqueue */
> +       return 1;
> +}
> +
> +
> +/* Enqueue one decode operation for ACC100 device in TB mode */
> +static inline int
> +enqueue_ldpc_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
> +               uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
> +{
> +       union acc100_dma_desc *desc = NULL;
> +       int ret;
> +       uint8_t r, c;
> +       uint32_t in_offset, h_out_offset,
> +               h_out_length, mbuf_total_left, seg_total_left;
> +       struct rte_mbuf *input, *h_output_head, *h_output;
> +       uint16_t current_enqueued_cbs = 0;
> +
> +       uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> +                       & q->sw_ring_wrap_mask);
> +       desc = q->ring_addr + desc_idx;
> +       uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
> +       union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
> +       acc100_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout);
> +
> +       input = op->ldpc_dec.input.data;
> +       h_output_head = h_output = op->ldpc_dec.hard_output.data;
> +       in_offset = op->ldpc_dec.input.offset;
> +       h_out_offset = op->ldpc_dec.hard_output.offset;
> +       h_out_length = 0;
> +       mbuf_total_left = op->ldpc_dec.input.length;
> +       c = op->ldpc_dec.tb_params.c;
> +       r = op->ldpc_dec.tb_params.r;
> +
> +       while (mbuf_total_left > 0 && r < c) {
> +
> +               seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
> +
> +               /* Set up DMA descriptor */
> +               desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
> +                               & q->sw_ring_wrap_mask);
> +               desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
> +               desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
> +               ret = acc100_dma_desc_ld_fill(op, &desc->req, &input,
> +                               h_output, &in_offset, &h_out_offset,
> +                               &h_out_length,
> +                               &mbuf_total_left, &seg_total_left,
> +                               &desc->req.fcw_ld);
> +
> +               if (unlikely(ret < 0))
> +                       return ret;
> +
> +               /* Hard output */
> +               mbuf_append(h_output_head, h_output, h_out_length);
> +
> +               /* Set total number of CBs in TB */
> +               desc->req.cbs_in_tb = cbs_in_tb;
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +               rte_memdump(stderr, "FCW", &desc->req.fcw_td,
> +                               sizeof(desc->req.fcw_td) - 8);
> +               rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> +#endif
> +
> +               if (seg_total_left == 0) {
> +                       /* Go to the next mbuf */
> +                       input = input->next;
> +                       in_offset = 0;
> +                       h_output = h_output->next;
> +                       h_out_offset = 0;
> +               }
> +               total_enqueued_cbs++;
> +               current_enqueued_cbs++;
> +               r++;
> +       }
> +
> +       if (unlikely(desc == NULL))
> +               return current_enqueued_cbs;
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +       /* Check if any CBs left for processing */
> +       if (mbuf_total_left != 0) {
> +               rte_bbdev_log(ERR,
> +                               "Some data still left for processing: mbuf_total_left = %u",
> +                               mbuf_total_left);
> +               return -EINVAL;
> +       }
> +#endif
> +       /* Set SDone on last CB descriptor for TB mode */
> +       desc->req.sdone_enable = 1;
> +       desc->req.irq_enable = q->irq_enable;
> +
> +       return current_enqueued_cbs;
> +}
> +
> +
> +/* Calculates number of CBs in processed encoder TB based on 'r' and input
> + * length.
> + */
> +static inline uint8_t
> +get_num_cbs_in_tb_enc(struct rte_bbdev_op_turbo_enc *turbo_enc)
> +{
> +       uint8_t c, c_neg, r, crc24_bits = 0;
> +       uint16_t k, k_neg, k_pos;
> +       uint8_t cbs_in_tb = 0;
> +       int32_t length;
> +
> +       length = turbo_enc->input.length;
> +       r = turbo_enc->tb_params.r;
> +       c = turbo_enc->tb_params.c;
> +       c_neg = turbo_enc->tb_params.c_neg;
> +       k_neg = turbo_enc->tb_params.k_neg;
> +       k_pos = turbo_enc->tb_params.k_pos;
> +       crc24_bits = 0;
> +       if (check_bit(turbo_enc->op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH))
> +               crc24_bits = 24;
> +       while (length > 0 && r < c) {
> +               k = (r < c_neg) ? k_neg : k_pos;
> +               length -= (k - crc24_bits) >> 3;
> +               r++;
> +               cbs_in_tb++;
> +       }
> +
> +       return cbs_in_tb;
> +}
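A worked example for the loop above, extracted into a standalone function: with k_neg = k_pos = 6144 bits and no CRC24 attach, each CB consumes 768 bytes of input, so a 1536-byte TB yields 2 CBs. (Parameter passing flattened for illustration; not the patch's signature.)

```c
#include <stdint.h>

/* Standalone copy of the encoder CB-count walk above.
 * length is in bytes, k_neg/k_pos/crc24_bits in bits.
 */
static uint8_t
toy_cbs_in_tb_enc(int32_t length, uint8_t r, uint8_t c, uint8_t c_neg,
		uint16_t k_neg, uint16_t k_pos, uint8_t crc24_bits)
{
	uint8_t cbs_in_tb = 0;
	uint16_t k;

	while (length > 0 && r < c) {
		k = (r < c_neg) ? k_neg : k_pos;
		/* Each CB consumes (k - crc24_bits) / 8 input bytes */
		length -= (k - crc24_bits) >> 3;
		r++;
		cbs_in_tb++;
	}
	return cbs_in_tb;
}
```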
> +
> +/* Calculates number of CBs in processed decoder TB based on 'r' and input
> + * length.
> + */
> +static inline uint16_t
> +get_num_cbs_in_tb_dec(struct rte_bbdev_op_turbo_dec *turbo_dec)
> +{
> +       uint8_t c, c_neg, r = 0;
> +       uint16_t kw, k, k_neg, k_pos, cbs_in_tb = 0;
> +       int32_t length;
> +
> +       length = turbo_dec->input.length;
> +       r = turbo_dec->tb_params.r;
> +       c = turbo_dec->tb_params.c;
> +       c_neg = turbo_dec->tb_params.c_neg;
> +       k_neg = turbo_dec->tb_params.k_neg;
> +       k_pos = turbo_dec->tb_params.k_pos;
> +       while (length > 0 && r < c) {
> +               k = (r < c_neg) ? k_neg : k_pos;
> +               kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
> +               length -= kw;
> +               r++;
> +               cbs_in_tb++;
> +       }
> +
> +       return cbs_in_tb;
> +}
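The kw term above is the per-CB soft input size: three streams of k + 4 bits each, padded up to a multiple of 32, where RTE_ALIGN_CEIL(x, 32) reduces to (x + 31) & ~31. Expanded as plain arithmetic (helper names are mine):

```c
#include <stdint.h>

/* RTE_ALIGN_CEIL(x, 32) expanded for illustration */
static inline uint32_t
toy_align_ceil_32(uint32_t x)
{
	return (x + 31) & ~(uint32_t)31;
}

/* Per-CB soft input size used in the decoder loop above:
 * three streams of k + 4 bits, each padded to a multiple of 32.
 */
static inline uint32_t
toy_turbo_kw(uint16_t k)
{
	return toy_align_ceil_32(k + 4) * 3;
}
```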
> +
> +/* Calculates number of CBs in processed decoder TB based on 'r' and input
> + * length.
> + */
> +static inline uint16_t
> +get_num_cbs_in_tb_ldpc_dec(struct rte_bbdev_op_ldpc_dec *ldpc_dec)
> +{
> +       uint16_t r, cbs_in_tb = 0;
> +       int32_t length = ldpc_dec->input.length;
> +       r = ldpc_dec->tb_params.r;
> +       while (length > 0 && r < ldpc_dec->tb_params.c) {
> +               length -=  (r < ldpc_dec->tb_params.cab) ?
> +                               ldpc_dec->tb_params.ea :
> +                               ldpc_dec->tb_params.eb;
> +               r++;
> +               cbs_in_tb++;
> +       }
> +       return cbs_in_tb;
> +}
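Same walk for LDPC: the first cab CBs consume ea bytes each and the remaining ones eb. For example, with ea = 300, eb = 200, cab = 2 and c = 4, a 1000-byte input maps onto all 4 CBs. A standalone sketch (flattened parameters, not the patch's signature):

```c
#include <stdint.h>

/* Standalone version of the LDPC CB-count loop above. */
static uint16_t
toy_cbs_in_tb_ldpc_dec(int32_t length, uint16_t r, uint16_t c,
		uint16_t cab, uint32_t ea, uint32_t eb)
{
	uint16_t cbs_in_tb = 0;

	while (length > 0 && r < c) {
		/* First cab CBs use ea bytes, the rest eb */
		length -= (r < cab) ? ea : eb;
		r++;
		cbs_in_tb++;
	}
	return cbs_in_tb;
}
```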
> +
> +/* Check whether we can mux encode operations with a common FCW */
> +static inline bool
> +check_mux(struct rte_bbdev_enc_op **ops, uint16_t num) {
> +       uint16_t i;
> +       if (num == 1)
> +               return false;
> +       for (i = 1; i < num; ++i) {
> +               /* Only mux compatible code blocks */
> +               if (memcmp((uint8_t *)(&ops[i]->ldpc_enc) + ENC_OFFSET,
> +                               (uint8_t *)(&ops[0]->ldpc_enc) + ENC_OFFSET,
> +                               CMP_ENC_SIZE) != 0)
> +                       return false;
> +       }
> +       return true;
> +}
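check_mux() relies on the fields that must match for muxing being laid out contiguously after ENC_OFFSET, so a single memcmp covers them all. A toy illustration of the offset-based comparison (the struct and constants below are mine, not the bbdev op layout):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical op layout: everything from 'basegraph' onward must
 * match for two ops to share one FCW; 'e' (per-CB rate-matched
 * length) may differ and sits before the compared region.
 */
struct toy_enc_op {
	uint32_t e;		/* excluded from the comparison */
	uint32_t basegraph;
	uint32_t z_c;
	uint32_t rv_index;
};

#define TOY_ENC_OFFSET	offsetof(struct toy_enc_op, basegraph)
#define TOY_CMP_SIZE	(sizeof(struct toy_enc_op) - TOY_ENC_OFFSET)

/* Return 1 when the two ops are mux-compatible, 0 otherwise */
static int
toy_check_mux(const struct toy_enc_op *a, const struct toy_enc_op *b)
{
	return memcmp((const uint8_t *)a + TOY_ENC_OFFSET,
			(const uint8_t *)b + TOY_ENC_OFFSET,
			TOY_CMP_SIZE) == 0;
}
```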
> +
> +/* Enqueue encode operations for ACC100 device in CB mode. */
> +static inline uint16_t
> +acc100_enqueue_ldpc_enc_cb(struct rte_bbdev_queue_data *q_data,
> +               struct rte_bbdev_enc_op **ops, uint16_t num)
> +{
> +       struct acc100_queue *q = q_data->queue_private;
> +       int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> +       uint16_t i = 0;
> +       union acc100_dma_desc *desc;
> +       int ret, desc_idx = 0;
> +       int16_t enq, left = num;
> +
> +       while (left > 0) {
> +               if (unlikely(avail - 1 < 0))
> +                       break;
> +               avail--;
> +               enq = RTE_MIN(left, MUX_5GDL_DESC);
> +               if (check_mux(&ops[i], enq)) {
> +                       ret = enqueue_ldpc_enc_n_op_cb(q, &ops[i],
> +                                       desc_idx, enq);
> +                       if (ret < 0)
> +                               break;
> +                       i += enq;
> +               } else {
> +                       ret = enqueue_ldpc_enc_one_op_cb(q, ops[i], desc_idx);
> +                       if (ret < 0)
> +                               break;
> +                       i++;
> +               }
> +               desc_idx++;
> +               left = num - i;
> +       }
> +
> +       if (unlikely(i == 0))
> +               return 0; /* Nothing to enqueue */
> +
> +       /* Set SDone in last CB in enqueued ops for CB mode */
> +       desc = q->ring_addr + ((q->sw_ring_head + desc_idx - 1)
> +                       & q->sw_ring_wrap_mask);
> +       desc->req.sdone_enable = 1;
> +       desc->req.irq_enable = q->irq_enable;
> +
> +       acc100_dma_enqueue(q, desc_idx, &q_data->queue_stats);
> +
> +       /* Update stats */
> +       q_data->queue_stats.enqueued_count += i;
> +       q_data->queue_stats.enqueue_err_count += num - i;
> +
> +       return i;
> +}
> +
> +/* Enqueue encode operations for ACC100 device. */
> +static uint16_t
> +acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
> +               struct rte_bbdev_enc_op **ops, uint16_t num)
> +{
> +       if (unlikely(num == 0))
> +               return 0;
> +       return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
> +}
> +
> +/* Check whether two consecutive decode operations can be muxed with a common FCW */
> +static inline bool
> +cmp_ldpc_dec_op(struct rte_bbdev_dec_op **ops) {
> +       /* Only mux compatible code blocks */
> +       return memcmp((uint8_t *)(&ops[0]->ldpc_dec) + DEC_OFFSET,
> +                       (uint8_t *)(&ops[1]->ldpc_dec) + DEC_OFFSET,
> +                       CMP_DEC_SIZE) == 0;
> +}
> +
> +
> +/* Enqueue decode operations for ACC100 device in TB mode */
> +static uint16_t
> +acc100_enqueue_ldpc_dec_tb(struct rte_bbdev_queue_data *q_data,
> +               struct rte_bbdev_dec_op **ops, uint16_t num)
> +{
> +       struct acc100_queue *q = q_data->queue_private;
> +       int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> +       uint16_t i, enqueued_cbs = 0;
> +       uint8_t cbs_in_tb;
> +       int ret;
> +
> +       for (i = 0; i < num; ++i) {
> +               cbs_in_tb = get_num_cbs_in_tb_ldpc_dec(&ops[i]->ldpc_dec);
> +               /* Check if there is space available for further processing */
> +               if (unlikely(avail - cbs_in_tb < 0))
> +                       break;
> +               avail -= cbs_in_tb;
> +
> +               ret = enqueue_ldpc_dec_one_op_tb(q, ops[i],
> +                               enqueued_cbs, cbs_in_tb);
> +               if (ret < 0)
> +                       break;
> +               enqueued_cbs += ret;
> +       }
> +
> +       acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
> +
> +       /* Update stats */
> +       q_data->queue_stats.enqueued_count += i;
> +       q_data->queue_stats.enqueue_err_count += num - i;
> +       return i;
> +}
> +
> +/* Enqueue decode operations for ACC100 device in CB mode */
> +static uint16_t
> +acc100_enqueue_ldpc_dec_cb(struct rte_bbdev_queue_data *q_data,
> +               struct rte_bbdev_dec_op **ops, uint16_t num)
> +{
> +       struct acc100_queue *q = q_data->queue_private;
> +       int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> +       uint16_t i;
> +       union acc100_dma_desc *desc;
> +       int ret;
> +       bool same_op = false;
> +       for (i = 0; i < num; ++i) {
> +               /* Check if there is space available for further processing */
> +               if (unlikely(avail - 1 < 0))
> +                       break;
> +               avail -= 1;
> +
> +               if (i > 0)
> +                       same_op = cmp_ldpc_dec_op(&ops[i-1]);
> +               rte_bbdev_log(INFO, "Op %d %d %d %d %d %d %d %d %d %d %d %d\n",
> +                       i, ops[i]->ldpc_dec.op_flags, ops[i]->ldpc_dec.rv_index,
> +                       ops[i]->ldpc_dec.iter_max, ops[i]->ldpc_dec.iter_count,
> +                       ops[i]->ldpc_dec.basegraph, ops[i]->ldpc_dec.z_c,
> +                       ops[i]->ldpc_dec.n_cb, ops[i]->ldpc_dec.q_m,
> +                       ops[i]->ldpc_dec.n_filler, ops[i]->ldpc_dec.cb_params.e,
> +                       same_op);
> +               ret = enqueue_ldpc_dec_one_op_cb(q, ops[i], i, same_op);
> +               if (ret < 0)
> +                       break;
> +       }
> +
> +       if (unlikely(i == 0))
> +               return 0; /* Nothing to enqueue */
> +
> +       /* Set SDone in last CB in enqueued ops for CB mode */
> +       desc = q->ring_addr + ((q->sw_ring_head + i - 1)
> +                       & q->sw_ring_wrap_mask);
> +
> +       desc->req.sdone_enable = 1;
> +       desc->req.irq_enable = q->irq_enable;
> +
> +       acc100_dma_enqueue(q, i, &q_data->queue_stats);
> +
> +       /* Update stats */
> +       q_data->queue_stats.enqueued_count += i;
> +       q_data->queue_stats.enqueue_err_count += num - i;
> +       return i;
> +}
> +
> +/* Enqueue decode operations for ACC100 device. */
> +static uint16_t
> +acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
> +               struct rte_bbdev_dec_op **ops, uint16_t num)
> +{
> +       struct acc100_queue *q = q_data->queue_private;
> +       int32_t aq_avail = q->aq_depth +
> +                       (q->aq_dequeued - q->aq_enqueued) / 128;
> +
> +       if (unlikely((aq_avail == 0) || (num == 0)))
> +               return 0;
> +
> +       if (ops[0]->ldpc_dec.code_block_mode == 0)
> +               return acc100_enqueue_ldpc_dec_tb(q_data, ops, num);
> +       else
> +               return acc100_enqueue_ldpc_dec_cb(q_data, ops, num);
> +}
> +
> +
> +/* Dequeue one encode operation from ACC100 device in CB mode */
> +static inline int
> +dequeue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
> +               uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
> +{
> +       union acc100_dma_desc *desc, atom_desc;
> +       union acc100_dma_rsp_desc rsp;
> +       struct rte_bbdev_enc_op *op;
> +       int i;
> +
> +       desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
> +                       & q->sw_ring_wrap_mask);
> +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +                       __ATOMIC_RELAXED);
> +
> +       /* Check fdone bit */
> +       if (!(atom_desc.rsp.val & ACC100_FDONE))
> +               return -1;
> +
> +       rsp.val = atom_desc.rsp.val;
> +       rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
> +
> +       /* Dequeue */
> +       op = desc->req.op_addr;
> +
> +       /* Clear the status; it will be set based on the response */
> +       op->status = 0;
> +
> +       op->status |= ((rsp.input_err)
> +                       ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> +       op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +       op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +
> +       if (desc->req.last_desc_in_batch) {
> +               (*aq_dequeued)++;
> +               desc->req.last_desc_in_batch = 0;
> +       }
> +       desc->rsp.val = ACC100_DMA_DESC_TYPE;
> +       desc->rsp.add_info_0 = 0; /* Reserved bits */
> +       desc->rsp.add_info_1 = 0; /* Reserved bits */
> +
> +       /* Flag that the muxing causes loss of opaque data */
> +       op->opaque_data = (void *)-1;
> +       for (i = 0 ; i < desc->req.numCBs; i++)
> +               ref_op[i] = op;
> +
> +       /* One CB (op) was successfully dequeued */
> +       return desc->req.numCBs;
> +}
> +
> +/* Dequeue one encode operation from ACC100 device in TB mode */
> +static inline int
> +dequeue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
> +               uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
> +{
> +       union acc100_dma_desc *desc, *last_desc, atom_desc;
> +       union acc100_dma_rsp_desc rsp;
> +       struct rte_bbdev_enc_op *op;
> +       uint8_t i = 0;
> +       uint16_t current_dequeued_cbs = 0, cbs_in_tb;
> +
> +       desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
> +                       & q->sw_ring_wrap_mask);
> +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +                       __ATOMIC_RELAXED);
> +
> +       /* Check fdone bit */
> +       if (!(atom_desc.rsp.val & ACC100_FDONE))
> +               return -1;
> +
> +       /* Get number of CBs in dequeued TB */
> +       cbs_in_tb = desc->req.cbs_in_tb;
> +       /* Get last CB */
> +       last_desc = q->ring_addr + ((q->sw_ring_tail
> +                       + total_dequeued_cbs + cbs_in_tb - 1)
> +                       & q->sw_ring_wrap_mask);
> +       /* Check if last CB in TB is ready to dequeue (and thus
> +        * the whole TB) - checking sdone bit. If not return.
> +        */
> +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
> +                       __ATOMIC_RELAXED);
> +       if (!(atom_desc.rsp.val & ACC100_SDONE))
> +               return -1;
> +
> +       /* Dequeue */
> +       op = desc->req.op_addr;
> +
> +       /* Clear the status; it will be set based on the response */
> +       op->status = 0;
> +
> +       while (i < cbs_in_tb) {
> +               desc = q->ring_addr + ((q->sw_ring_tail
> +                               + total_dequeued_cbs)
> +                               & q->sw_ring_wrap_mask);
> +               atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +                               __ATOMIC_RELAXED);
> +               rsp.val = atom_desc.rsp.val;
> +               rte_bbdev_log_debug("Resp. desc %p: %x", desc,
> +                               rsp.val);
> +
> +               op->status |= ((rsp.input_err)
> +                               ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> +               op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +               op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +
> +               if (desc->req.last_desc_in_batch) {
> +                       (*aq_dequeued)++;
> +                       desc->req.last_desc_in_batch = 0;
> +               }
> +               desc->rsp.val = ACC100_DMA_DESC_TYPE;
> +               desc->rsp.add_info_0 = 0;
> +               desc->rsp.add_info_1 = 0;
> +               total_dequeued_cbs++;
> +               current_dequeued_cbs++;
> +               i++;
> +       }
> +
> +       *ref_op = op;
> +
> +       return current_dequeued_cbs;
> +}
> +
> +/* Dequeue one decode operation from ACC100 device in CB mode */
> +static inline int
> +dequeue_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
> +               struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
> +               uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> +{
> +       union acc100_dma_desc *desc, atom_desc;
> +       union acc100_dma_rsp_desc rsp;
> +       struct rte_bbdev_dec_op *op;
> +
> +       desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> +                       & q->sw_ring_wrap_mask);
> +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +                       __ATOMIC_RELAXED);
> +
> +       /* Check fdone bit */
> +       if (!(atom_desc.rsp.val & ACC100_FDONE))
> +               return -1;
> +
> +       rsp.val = atom_desc.rsp.val;
> +       rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
> +
> +       /* Dequeue */
> +       op = desc->req.op_addr;
> +
> +       /* Clear the status; it will be set based on the response */
> +       op->status = 0;
> +       op->status |= ((rsp.input_err)
> +                       ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> +       op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +       op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +       if (op->status != 0)
> +               q_data->queue_stats.dequeue_err_count++;
> +
> +       /* The CRC status is only meaningful when no other error is set */
> +       if (!op->status)
> +               op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
> +       op->turbo_dec.iter_count = (uint8_t) rsp.iter_cnt / 2;
> +       /* Check if this is the last desc in batch (Atomic Queue) */
> +       if (desc->req.last_desc_in_batch) {
> +               (*aq_dequeued)++;
> +               desc->req.last_desc_in_batch = 0;
> +       }
> +       desc->rsp.val = ACC100_DMA_DESC_TYPE;
> +       desc->rsp.add_info_0 = 0;
> +       desc->rsp.add_info_1 = 0;
> +       *ref_op = op;
> +
> +       /* One CB (op) was successfully dequeued */
> +       return 1;
> +}
> +
> +/* Dequeue one LDPC decode operation from ACC100 device in CB mode */
> +static inline int
> +dequeue_ldpc_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
> +               struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
> +               uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> +{
> +       union acc100_dma_desc *desc, atom_desc;
> +       union acc100_dma_rsp_desc rsp;
> +       struct rte_bbdev_dec_op *op;
> +
> +       desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> +                       & q->sw_ring_wrap_mask);
> +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +                       __ATOMIC_RELAXED);
> +
> +       /* Check fdone bit */
> +       if (!(atom_desc.rsp.val & ACC100_FDONE))
> +               return -1;
> +
> +       rsp.val = atom_desc.rsp.val;
> +
> +       /* Dequeue */
> +       op = desc->req.op_addr;
> +
> +       /* Clear the status; it will be set based on the response */
> +       op->status = 0;
> +       op->status |= rsp.input_err << RTE_BBDEV_DATA_ERROR;
> +       op->status |= rsp.dma_err << RTE_BBDEV_DRV_ERROR;
> +       op->status |= rsp.fcw_err << RTE_BBDEV_DRV_ERROR;
> +       if (op->status != 0)
> +               q_data->queue_stats.dequeue_err_count++;
> +
> +       op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
> +       if (op->ldpc_dec.hard_output.length > 0 && !rsp.synd_ok)
> +               op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
> +       op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
> +
> +       /* Check if this is the last desc in batch (Atomic Queue) */
> +       if (desc->req.last_desc_in_batch) {
> +               (*aq_dequeued)++;
> +               desc->req.last_desc_in_batch = 0;
> +       }
> +
> +       desc->rsp.val = ACC100_DMA_DESC_TYPE;
> +       desc->rsp.add_info_0 = 0;
> +       desc->rsp.add_info_1 = 0;
> +
> +       *ref_op = op;
> +
> +       /* One CB (op) was successfully dequeued */
> +       return 1;
> +}
> +
> +/* Dequeue one decode operation from ACC100 device in TB mode. */
> +static inline int
> +dequeue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
> +               uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> +{
> +       union acc100_dma_desc *desc, *last_desc, atom_desc;
> +       union acc100_dma_rsp_desc rsp;
> +       struct rte_bbdev_dec_op *op;
> +       uint8_t cbs_in_tb = 1, cb_idx = 0;
> +
> +       desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> +                       & q->sw_ring_wrap_mask);
> +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +                       __ATOMIC_RELAXED);
> +
> +       /* Check fdone bit */
> +       if (!(atom_desc.rsp.val & ACC100_FDONE))
> +               return -1;
> +
> +       /* Dequeue */
> +       op = desc->req.op_addr;
> +
> +       /* Get number of CBs in dequeued TB */
> +       cbs_in_tb = desc->req.cbs_in_tb;
> +       /* Get last CB */
> +       last_desc = q->ring_addr + ((q->sw_ring_tail
> +                       + dequeued_cbs + cbs_in_tb - 1)
> +                       & q->sw_ring_wrap_mask);
> +       /* Check if last CB in TB is ready to dequeue (and thus
> +        * the whole TB) - checking sdone bit. If not return.
> +        */
> +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
> +                       __ATOMIC_RELAXED);
> +       if (!(atom_desc.rsp.val & ACC100_SDONE))
> +               return -1;
> +
> +       /* Clear the status; it will be set based on the response */
> +       op->status = 0;
> +
> +       /* Read remaining CBs, if any */
> +       while (cb_idx < cbs_in_tb) {
> +               desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> +                               & q->sw_ring_wrap_mask);
> +               atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +                               __ATOMIC_RELAXED);
> +               rsp.val = atom_desc.rsp.val;
> +               rte_bbdev_log_debug("Resp. desc %p: %x", desc,
> +                               rsp.val);
> +
> +               op->status |= ((rsp.input_err)
> +                               ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> +               op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +               op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +
> +               /* CRC invalid if error exists */
> +               if (!op->status)
> +                       op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
> +               op->turbo_dec.iter_count = RTE_MAX((uint8_t) rsp.iter_cnt,
> +                               op->turbo_dec.iter_count);
> +
> +               /* Check if this is the last desc in batch (Atomic Queue) */
> +               if (desc->req.last_desc_in_batch) {
> +                       (*aq_dequeued)++;
> +                       desc->req.last_desc_in_batch = 0;
> +               }
> +               desc->rsp.val = ACC100_DMA_DESC_TYPE;
> +               desc->rsp.add_info_0 = 0;
> +               desc->rsp.add_info_1 = 0;
> +               dequeued_cbs++;
> +               cb_idx++;
> +       }
> +
> +       *ref_op = op;
> +
> +       return cb_idx;
> +}
> +
> +/* Dequeue LDPC encode operations from ACC100 device. */
> +static uint16_t
> +acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
> +               struct rte_bbdev_enc_op **ops, uint16_t num)
> +{
> +       struct acc100_queue *q = q_data->queue_private;
> +       uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
> +       uint32_t aq_dequeued = 0;
> +       uint16_t dequeue_num, i, dequeued_cbs = 0, dequeued_descs = 0;
> +       int ret;
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +       if (unlikely(ops == NULL || q == NULL))
> +               return 0;
> +#endif
> +
> +       dequeue_num = (avail < num) ? avail : num;
> +
> +       for (i = 0; i < dequeue_num; i++) {
> +               ret = dequeue_enc_one_op_cb(q, &ops[dequeued_cbs],
> +                               dequeued_descs, &aq_dequeued);
> +               if (ret < 0)
> +                       break;
> +               dequeued_cbs += ret;
> +               dequeued_descs++;
> +               if (dequeued_cbs >= num)
> +                       break;
> +       }
> +
> +       q->aq_dequeued += aq_dequeued;
> +       q->sw_ring_tail += dequeued_descs;
> +
> +       /* Update dequeue stats */
> +       q_data->queue_stats.dequeued_count += dequeued_cbs;
> +
> +       return dequeued_cbs;
> +}
> +
> +/* Dequeue LDPC decode operations from ACC100 device. */
> +static uint16_t
> +acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
> +               struct rte_bbdev_dec_op **ops, uint16_t num)
> +{
> +       struct acc100_queue *q = q_data->queue_private;
> +       uint16_t dequeue_num;
> +       uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
> +       uint32_t aq_dequeued = 0;
> +       uint16_t i;
> +       uint16_t dequeued_cbs = 0;
> +       struct rte_bbdev_dec_op *op;
> +       int ret;
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +       if (unlikely(ops == NULL || q == NULL))
> +               return 0;
> +#endif
> +
> +       dequeue_num = (avail < num) ? avail : num;
> +
> +       for (i = 0; i < dequeue_num; ++i) {
> +               op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> +                       & q->sw_ring_wrap_mask))->req.op_addr;
> +               if (op->ldpc_dec.code_block_mode == 0)
> +                       ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
> +                                       &aq_dequeued);
> +               else
> +                       ret = dequeue_ldpc_dec_one_op_cb(
> +                                       q_data, q, &ops[i], dequeued_cbs,
> +                                       &aq_dequeued);
> +
> +               if (ret < 0)
> +                       break;
> +               dequeued_cbs += ret;
> +       }
> +
> +       q->aq_dequeued += aq_dequeued;
> +       q->sw_ring_tail += dequeued_cbs;
> +
> +       /* Update dequeue stats */
> +       q_data->queue_stats.dequeued_count += i;
> +
> +       return i;
> +}
> +
>  /* Initialization Function */
>  static void
>  acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
> @@ -703,6 +2321,10 @@
>          struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
> 
>          dev->dev_ops = &acc100_bbdev_ops;
> +       dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
> +       dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
> +       dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
> +       dev->dequeue_ldpc_dec_ops = acc100_dequeue_ldpc_dec;
> 
>          ((struct acc100_device *) dev->data->dev_private)->pf_device =
>                          !strcmp(drv->driver.name,
> @@ -815,4 +2437,3 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
>  RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
>  RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
>  RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
> -
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
> index 0e2b79c..78686c1 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> @@ -88,6 +88,8 @@
>  #define TMPL_PRI_3      0x0f0e0d0c
>  #define QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
>  #define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
> +#define ACC100_FDONE    0x80000000
> +#define ACC100_SDONE    0x40000000
> 
>  #define ACC100_NUM_TMPL  32
>  #define VF_OFFSET_QOS 16 /* offset in Memory Space specific to QoS Mon */
> @@ -398,6 +400,7 @@ struct __rte_packed acc100_dma_req_desc {
>  union acc100_dma_desc {
>          struct acc100_dma_req_desc req;
>          union acc100_dma_rsp_desc rsp;
> +       uint64_t atom_hdr;
>  };
> 
> 
> --
> 1.8.3.1

^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions
  2020-08-20 14:57       ` Dave Burley
@ 2020-08-20 21:05         ` Chautru, Nicolas
  2020-09-03  8:06           ` Dave Burley
  0 siblings, 1 reply; 213+ messages in thread
From: Chautru, Nicolas @ 2020-08-20 21:05 UTC (permalink / raw)
  To: Dave Burley, dev; +Cc: Richardson, Bruce


> From: Dave Burley <dave.burley@accelercomm.com>
> Hi Nic
> 
> Thank you - it would be useful to have further documentation for clarification
> as the data format isn't explicitly documented in BBDEV.

Thanks Dave. Just updated on this other patch -> https://patches.dpdk.org/patch/75793/
Feel free to ack or let me know if you need more details. 

> Best Regards
> 
> Dave
> 
> 
> From: Chautru, Nicolas <nicolas.chautru@intel.com>
> Sent: 20 August 2020 15:52
> To: Dave Burley <dave.burley@accelercomm.com>; dev@dpdk.org
> <dev@dpdk.org>
> Cc: Richardson, Bruce <bruce.richardson@intel.com>
> Subject: RE: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC
> processing functions
> 
> Hi Dave,
> This is assuming 6 bits LLR compression packing (ie. first 2 MSB dropped).
> Similar to HARQ compression.
> Let me know if unclear, I can clarify further in documentation if not explicit
> enough.
> Thanks
> Nic
> 
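As a rough illustration of the 6-bit LLR compression packing described above (each 8-bit LLR keeping its 6 LSBs, so four values pack into three bytes): this is a sketch of one plausible bit ordering, not the documented BBDEV memory format, and `pack_llr_6bit` is a made-up helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sketch of 6-bit LLR compression: drop the top 2 bits of
 * each 8-bit LLR and pack the remaining 6 bits MSB-first. The exact bit
 * order of the real hardware format is an assumption here.
 */
static size_t
pack_llr_6bit(const uint8_t *llr, size_t n, uint8_t *out)
{
	size_t in = 0, o = 0;
	uint32_t acc = 0; /* bit accumulator */
	unsigned int bits = 0;

	while (in < n) {
		acc = (acc << 6) | (llr[in++] & 0x3F); /* keep low 6 bits */
		bits += 6;
		if (bits >= 8) {
			bits -= 8;
			out[o++] = (uint8_t)(acc >> bits);
			acc &= (1u << bits) - 1; /* drop emitted bits */
		}
	}
	if (bits > 0) /* pad the final partial byte with zeros */
		out[o++] = (uint8_t)(acc << (8 - bits));
	return o; /* ceil(n * 6 / 8) bytes */
}
```

With this packing, input lengths that are a multiple of four bytes compress to exactly three quarters of their original size.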
> > -----Original Message-----
> > From: Dave Burley <dave.burley@accelercomm.com>
> > Sent: Thursday, August 20, 2020 7:39 AM
> > To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org
> > Cc: Richardson, Bruce <bruce.richardson@intel.com>
> > Subject: Re: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC
> > processing functions
> >
> > Hi Nic,
> >
> > As you've now specified the use of RTE_BBDEV_LDPC_LLR_COMPRESSION for
> > this PMB, please could you confirm what the packed format of the LLRs in
> > memory looks like?
> >
> > Best Regards
> >
> > Dave Burley
> >
> >
> > From: dev <dev-bounces@dpdk.org> on behalf of Nicolas Chautru
> > <nicolas.chautru@intel.com>
> > Sent: 19 August 2020 01:25
> > To: dev@dpdk.org <dev@dpdk.org>; akhil.goyal@nxp.com
> > <akhil.goyal@nxp.com>
> > Cc: bruce.richardson@intel.com <bruce.richardson@intel.com>; Nicolas
> > Chautru <nicolas.chautru@intel.com>
> > Subject: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing
> > functions
> >
> > Adding LDPC decode and encode processing operations
> >
> > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > ---
> >  drivers/baseband/acc100/rte_acc100_pmd.c | 1625 +++++++++++++++++++++++++++++-
> >  drivers/baseband/acc100/rte_acc100_pmd.h |    3 +
> >  2 files changed, 1626 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
> > index 7a21c57..5f32813 100644
> > --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> > +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> > @@ -15,6 +15,9 @@
> >  #include <rte_hexdump.h>
> >  #include <rte_pci.h>
> >  #include <rte_bus_pci.h>
> > +#ifdef RTE_BBDEV_OFFLOAD_COST
> > +#include <rte_cycles.h>
> > +#endif
> >
> >  #include <rte_bbdev.h>
> >  #include <rte_bbdev_pmd.h>
> > @@ -449,7 +452,6 @@
> >          return 0;
> >  }
> >
> > -
> >  /**
> >   * Report a ACC100 queue index which is free
> >   * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
> > @@ -634,6 +636,46 @@
> >          struct acc100_device *d = dev->data->dev_private;
> >
> >          static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
> > +               {
> > +                       .type   = RTE_BBDEV_OP_LDPC_ENC,
> > +                       .cap.ldpc_enc = {
> > +                               .capability_flags =
> > +                                       RTE_BBDEV_LDPC_RATE_MATCH |
> > +                                       RTE_BBDEV_LDPC_CRC_24B_ATTACH |
> > +                                       RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
> > +                               .num_buffers_src =
> > +                                               RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > +                               .num_buffers_dst =
> > +                                               RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > +                       }
> > +               },
> > +               {
> > +                       .type   = RTE_BBDEV_OP_LDPC_DEC,
> > +                       .cap.ldpc_dec = {
> > +                       .capability_flags =
> > +                               RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK |
> > +                               RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP |
> > +                               RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
> > +                               RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
> > +#ifdef ACC100_EXT_MEM
> > +                               RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
> > +                               RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
> > +#endif
> > +                               RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE |
> > +                               RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS |
> > +                               RTE_BBDEV_LDPC_DECODE_BYPASS |
> > +                               RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
> > +                               RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
> > +                               RTE_BBDEV_LDPC_LLR_COMPRESSION,
> > +                       .llr_size = 8,
> > +                       .llr_decimals = 1,
> > +                       .num_buffers_src =
> > +                                       RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > +                       .num_buffers_hard_out =
> > +                                       RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > +                       .num_buffers_soft_out = 0,
> > +                       }
> > +               },
> >                  RTE_BBDEV_END_OF_CAPABILITIES_LIST()
> >          };
> >
> > @@ -669,9 +711,14 @@
> >          dev_info->cpu_flag_reqs = NULL;
> >          dev_info->min_alignment = 64;
> >          dev_info->capabilities = bbdev_capabilities;
> > +#ifdef ACC100_EXT_MEM
> >          dev_info->harq_buffer_size = d->ddr_size;
> > +#else
> > +       dev_info->harq_buffer_size = 0;
> > +#endif
> >  }
> >
> > +
> >  static const struct rte_bbdev_ops acc100_bbdev_ops = {
> >          .setup_queues = acc100_setup_queues,
> >          .close = acc100_dev_close,
> > @@ -696,6 +743,1577 @@
> >          {.device_id = 0},
> >  };
> >
> > +/* Read flag value 0/1 from bitmap */
> > +static inline bool
> > +check_bit(uint32_t bitmap, uint32_t bitmask)
> > +{
> > +       return bitmap & bitmask;
> > +}
> > +
> > +static inline char *
> > +mbuf_append(struct rte_mbuf *m_head, struct rte_mbuf *m, uint16_t len)
> > +{
> > +       if (unlikely(len > rte_pktmbuf_tailroom(m)))
> > +               return NULL;
> > +
> > +       char *tail = (char *)m->buf_addr + m->data_off + m->data_len;
> > +       m->data_len = (uint16_t)(m->data_len + len);
> > +       m_head->pkt_len  = (m_head->pkt_len + len);
> > +       return tail;
> > +}
> > +
> > +/* Compute value of k0.
> > + * Based on 3GPP 38.212 Table 5.4.2.1-2
> > + * Starting position of different redundancy versions, k0
> > + */
> > +static inline uint16_t
> > +get_k0(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv_index)
> > +{
> > +       if (rv_index == 0)
> > +               return 0;
> > +       uint16_t n = (bg == 1 ? N_ZC_1 : N_ZC_2) * z_c;
> > +       if (n_cb == n) {
> > +               if (rv_index == 1)
> > +                       return (bg == 1 ? K0_1_1 : K0_1_2) * z_c;
> > +               else if (rv_index == 2)
> > +                       return (bg == 1 ? K0_2_1 : K0_2_2) * z_c;
> > +               else
> > +                       return (bg == 1 ? K0_3_1 : K0_3_2) * z_c;
> > +       }
> > +       /* LBRM case - includes a division by N */
> > +       if (rv_index == 1)
> > +               return (((bg == 1 ? K0_1_1 : K0_1_2) * n_cb)
> > +                               / n) * z_c;
> > +       else if (rv_index == 2)
> > +               return (((bg == 1 ? K0_2_1 : K0_2_2) * n_cb)
> > +                               / n) * z_c;
> > +       else
> > +               return (((bg == 1 ? K0_3_1 : K0_3_2) * n_cb)
> > +                               / n) * z_c;
> > +}
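For reference, get_k0() above mirrors Table 5.4.2.1-2 of 3GPP TS 38.212 through the K0_x_y and N_ZC_x constants defined elsewhere in the driver. A standalone sketch with the table numerators written out (the numerators and denominators are from the spec; the helper name is illustrative):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Standalone sketch of the 3GPP TS 38.212 Table 5.4.2.1-2 starting
 * positions: k0 = floor(num[rv] * n_cb / (denom * z_c)) * z_c, where
 * num/denom is {0, 17, 33, 56}/66 for base graph 1 and
 * {0, 13, 25, 43}/50 for base graph 2.
 */
static uint16_t
k0_38212(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv)
{
	static const uint16_t num[2][4] = {
		{0, 17, 33, 56}, /* base graph 1 */
		{0, 13, 25, 43}, /* base graph 2 */
	};
	uint16_t denom = (bg == 1) ? 66 : 50;

	if (rv == 0)
		return 0;
	return (uint16_t)((((uint32_t)num[bg == 1 ? 0 : 1][rv] * n_cb) /
			((uint32_t)denom * z_c)) * z_c);
}
```

In the non-LBRM case (n_cb equal to 66·Zc for BG1 or 50·Zc for BG2) the division is exact and this reduces to the K0_x_y·Zc branch of get_k0().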
> > +
> > +/* Fill in a frame control word for LDPC encoding. */
> > +static inline void
> > +acc100_fcw_le_fill(const struct rte_bbdev_enc_op *op,
> > +               struct acc100_fcw_le *fcw, int num_cb)
> > +{
> > +       fcw->qm = op->ldpc_enc.q_m;
> > +       fcw->nfiller = op->ldpc_enc.n_filler;
> > +       fcw->BG = (op->ldpc_enc.basegraph - 1);
> > +       fcw->Zc = op->ldpc_enc.z_c;
> > +       fcw->ncb = op->ldpc_enc.n_cb;
> > +       fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_enc.basegraph,
> > +                       op->ldpc_enc.rv_index);
> > +       fcw->rm_e = op->ldpc_enc.cb_params.e;
> > +       fcw->crc_select = check_bit(op->ldpc_enc.op_flags,
> > +                       RTE_BBDEV_LDPC_CRC_24B_ATTACH);
> > +       fcw->bypass_intlv = check_bit(op->ldpc_enc.op_flags,
> > +                       RTE_BBDEV_LDPC_INTERLEAVER_BYPASS);
> > +       fcw->mcb_count = num_cb;
> > +}
> > +
> > +/* Fill in a frame control word for LDPC decoding. */
> > +static inline void
> > +acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
> > +               union acc100_harq_layout_data *harq_layout)
> > +{
> > +       uint16_t harq_out_length, harq_in_length, ncb_p, k0_p, parity_offset;
> > +       uint16_t harq_index;
> > +       uint32_t l;
> > +       bool harq_prun = false;
> > +
> > +       fcw->qm = op->ldpc_dec.q_m;
> > +       fcw->nfiller = op->ldpc_dec.n_filler;
> > +       fcw->BG = (op->ldpc_dec.basegraph - 1);
> > +       fcw->Zc = op->ldpc_dec.z_c;
> > +       fcw->ncb = op->ldpc_dec.n_cb;
> > +       fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_dec.basegraph,
> > +                       op->ldpc_dec.rv_index);
> > +       if (op->ldpc_dec.code_block_mode == 1)
> > +               fcw->rm_e = op->ldpc_dec.cb_params.e;
> > +       else
> > +               fcw->rm_e = (op->ldpc_dec.tb_params.r <
> > +                               op->ldpc_dec.tb_params.cab) ?
> > +                                               op->ldpc_dec.tb_params.ea :
> > +                                               op->ldpc_dec.tb_params.eb;
> > +
> > +       fcw->hcin_en = check_bit(op->ldpc_dec.op_flags,
> > +                       RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE);
> > +       fcw->hcout_en = check_bit(op->ldpc_dec.op_flags,
> > +                       RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE);
> > +       fcw->crc_select = check_bit(op->ldpc_dec.op_flags,
> > +                       RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK);
> > +       fcw->bypass_dec = check_bit(op->ldpc_dec.op_flags,
> > +                       RTE_BBDEV_LDPC_DECODE_BYPASS);
> > +       fcw->bypass_intlv = check_bit(op->ldpc_dec.op_flags,
> > +                       RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS);
> > +       if (op->ldpc_dec.q_m == 1) {
> > +               fcw->bypass_intlv = 1;
> > +               fcw->qm = 2;
> > +       }
> > +       fcw->hcin_decomp_mode = check_bit(op->ldpc_dec.op_flags,
> > +                       RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> > +       fcw->hcout_comp_mode = check_bit(op->ldpc_dec.op_flags,
> > +                       RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> > +       fcw->llr_pack_mode = check_bit(op->ldpc_dec.op_flags,
> > +                       RTE_BBDEV_LDPC_LLR_COMPRESSION);
> > +       harq_index = op->ldpc_dec.harq_combined_output.offset /
> > +                       ACC100_HARQ_OFFSET;
> > +#ifdef ACC100_EXT_MEM
> > +       /* Limit cases when HARQ pruning is valid */
> > +       harq_prun = ((op->ldpc_dec.harq_combined_output.offset %
> > +                       ACC100_HARQ_OFFSET) == 0) &&
> > +                       (op->ldpc_dec.harq_combined_output.offset <= UINT16_MAX
> > +                       * ACC100_HARQ_OFFSET);
> > +#endif
> > +       if (fcw->hcin_en > 0) {
> > +               harq_in_length = op->ldpc_dec.harq_combined_input.length;
> > +               if (fcw->hcin_decomp_mode > 0)
> > +                       harq_in_length = harq_in_length * 8 / 6;
> > +               harq_in_length = RTE_ALIGN(harq_in_length, 64);
> > +               if ((harq_layout[harq_index].offset > 0) & harq_prun) {
> > +                       rte_bbdev_log_debug("HARQ IN offset unexpected for now\n");
> > +                       fcw->hcin_size0 = harq_layout[harq_index].size0;
> > +                       fcw->hcin_offset = harq_layout[harq_index].offset;
> > +                       fcw->hcin_size1 = harq_in_length -
> > +                                       harq_layout[harq_index].offset;
> > +               } else {
> > +                       fcw->hcin_size0 = harq_in_length;
> > +                       fcw->hcin_offset = 0;
> > +                       fcw->hcin_size1 = 0;
> > +               }
> > +       } else {
> > +               fcw->hcin_size0 = 0;
> > +               fcw->hcin_offset = 0;
> > +               fcw->hcin_size1 = 0;
> > +       }
> > +
> > +       fcw->itmax = op->ldpc_dec.iter_max;
> > +       fcw->itstop = check_bit(op->ldpc_dec.op_flags,
> > +                       RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE);
> > +       fcw->synd_precoder = fcw->itstop;
> > +       /*
> > +        * These are all implicitly set
> > +        * fcw->synd_post = 0;
> > +        * fcw->so_en = 0;
> > +        * fcw->so_bypass_rm = 0;
> > +        * fcw->so_bypass_intlv = 0;
> > +        * fcw->dec_convllr = 0;
> > +        * fcw->hcout_convllr = 0;
> > +        * fcw->hcout_size1 = 0;
> > +        * fcw->so_it = 0;
> > +        * fcw->hcout_offset = 0;
> > +        * fcw->negstop_th = 0;
> > +        * fcw->negstop_it = 0;
> > +        * fcw->negstop_en = 0;
> > +        * fcw->gain_i = 1;
> > +        * fcw->gain_h = 1;
> > +        */
> > +       if (fcw->hcout_en > 0) {
> > +               parity_offset = (op->ldpc_dec.basegraph == 1 ? 20 : 8)
> > +                       * op->ldpc_dec.z_c - op->ldpc_dec.n_filler;
> > +               k0_p = (fcw->k0 > parity_offset) ?
> > +                               fcw->k0 - op->ldpc_dec.n_filler : fcw->k0;
> > +               ncb_p = fcw->ncb - op->ldpc_dec.n_filler;
> > +               l = k0_p + fcw->rm_e;
> > +               harq_out_length = (uint16_t) fcw->hcin_size0;
> > +               harq_out_length = RTE_MIN(RTE_MAX(harq_out_length, l), ncb_p);
> > +               harq_out_length = (harq_out_length + 0x3F) & 0xFFC0;
> > +               if ((k0_p > fcw->hcin_size0 + ACC100_HARQ_OFFSET_THRESHOLD) &&
> > +                               harq_prun) {
> > +                       fcw->hcout_size0 = (uint16_t) fcw->hcin_size0;
> > +                       fcw->hcout_offset = k0_p & 0xFFC0;
> > +                       fcw->hcout_size1 = harq_out_length - fcw->hcout_offset;
> > +               } else {
> > +                       fcw->hcout_size0 = harq_out_length;
> > +                       fcw->hcout_size1 = 0;
> > +                       fcw->hcout_offset = 0;
> > +               }
> > +               harq_layout[harq_index].offset = fcw->hcout_offset;
> > +               harq_layout[harq_index].size0 = fcw->hcout_size0;
> > +       } else {
> > +               fcw->hcout_size0 = 0;
> > +               fcw->hcout_size1 = 0;
> > +               fcw->hcout_offset = 0;
> > +       }
> > +}
> > +
> > +/**
> > + * Fills descriptor with data pointers of one block type.
> > + *
> > + * @param desc
> > + *   Pointer to DMA descriptor.
> > + * @param input
> > + *   Pointer to the mbuf pointer holding the input data to be encoded. It
> > + *   is updated to point to the next segment in the scatter-gather case.
> > + * @param offset
> > + *   Input offset within the rte_mbuf data, used to compute the address
> > + *   where the data starts.
> > + * @param cb_len
> > + *   Length of the currently processed Code Block.
> > + * @param seg_total_left
> > + *   Number of bytes still left in the current segment (mbuf) for further
> > + *   processing.
> > + * @param next_triplet
> > + *   Index of the next ACC100 DMA descriptor triplet to fill.
> > + *
> > + * @return
> > + *   Index of the next triplet on success, a negative value if the packet
> > + *   and processed CB lengths do not match.
> > + *
> > + */
> > +static inline int
> > +acc100_dma_fill_blk_type_in(struct acc100_dma_req_desc *desc,
> > +               struct rte_mbuf **input, uint32_t *offset, uint32_t cb_len,
> > +               uint32_t *seg_total_left, int next_triplet)
> > +{
> > +       uint32_t part_len;
> > +       struct rte_mbuf *m = *input;
> > +
> > +       part_len = (*seg_total_left < cb_len) ? *seg_total_left : cb_len;
> > +       cb_len -= part_len;
> > +       *seg_total_left -= part_len;
> > +
> > +       desc->data_ptrs[next_triplet].address =
> > +                       rte_pktmbuf_iova_offset(m, *offset);
> > +       desc->data_ptrs[next_triplet].blen = part_len;
> > +       desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
> > +       desc->data_ptrs[next_triplet].last = 0;
> > +       desc->data_ptrs[next_triplet].dma_ext = 0;
> > +       *offset += part_len;
> > +       next_triplet++;
> > +
> > +       while (cb_len > 0) {
> > +               if (next_triplet < ACC100_DMA_MAX_NUM_POINTERS &&
> > +                               m->next != NULL) {
> > +
> > +                       m = m->next;
> > +                       *seg_total_left = rte_pktmbuf_data_len(m);
> > +                       part_len = (*seg_total_left < cb_len) ?
> > +                                       *seg_total_left :
> > +                                       cb_len;
> > +                       desc->data_ptrs[next_triplet].address =
> > +                                       rte_pktmbuf_mtophys(m);
> > +                       desc->data_ptrs[next_triplet].blen = part_len;
> > +                       desc->data_ptrs[next_triplet].blkid =
> > +                                       ACC100_DMA_BLKID_IN;
> > +                       desc->data_ptrs[next_triplet].last = 0;
> > +                       desc->data_ptrs[next_triplet].dma_ext = 0;
> > +                       cb_len -= part_len;
> > +                       *seg_total_left -= part_len;
> > +                       /* Initializing offset for next segment (mbuf) */
> > +                       *offset = part_len;
> > +                       next_triplet++;
> > +               } else {
> > +                       rte_bbdev_log(ERR,
> > +                               "Some data still left for processing: "
> > +                               "data_left: %u, next_triplet: %u, next_mbuf: %p",
> > +                               cb_len, next_triplet, m->next);
> > +                       return -EINVAL;
> > +               }
> > +       }
> > +       /* Storing new mbuf as it could be changed in scatter-gather case */
> > +       *input = m;
> > +
> > +       return next_triplet;
> > +}
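
As a rough illustration of the segment walk above, this standalone sketch (hypothetical `count_triplets` helper, not PMD code) consumes `cb_len` bytes across mbuf-like segments and returns the number of descriptor triplets used, or -1 when the data cannot be covered, mirroring the -EINVAL path:

```c
#include <assert.h>
#include <stdint.h>

/* Count how many DMA triplets are needed to cover cb_len bytes,
 * starting with first_seg_left bytes in the current segment and
 * continuing through next_segs[], capped at max_triplets.
 * Returns -1 when the segments cannot cover cb_len. */
static int count_triplets(uint32_t first_seg_left, const uint32_t *next_segs,
		int n_next, uint32_t cb_len, int max_triplets)
{
	uint32_t part = (first_seg_left < cb_len) ? first_seg_left : cb_len;
	int triplets = 1, i = 0;

	cb_len -= part;
	while (cb_len > 0) {
		if (triplets >= max_triplets || i >= n_next)
			return -1; /* ran out of pointers or segments */
		part = (next_segs[i] < cb_len) ? next_segs[i] : cb_len;
		cb_len -= part;
		i++;
		triplets++;
	}
	return triplets;
}
```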
> > +
> > +/* Fills descriptor with data pointers of one block type.
> > + * Returns index of next triplet on success, other value if lengths of
> > + * output data and processed mbuf do not match.
> > + */
> > +static inline int
> > +acc100_dma_fill_blk_type_out(struct acc100_dma_req_desc *desc,
> > +               struct rte_mbuf *output, uint32_t out_offset,
> > +               uint32_t output_len, int next_triplet, int blk_id)
> > +{
> > +       desc->data_ptrs[next_triplet].address =
> > +                       rte_pktmbuf_iova_offset(output, out_offset);
> > +       desc->data_ptrs[next_triplet].blen = output_len;
> > +       desc->data_ptrs[next_triplet].blkid = blk_id;
> > +       desc->data_ptrs[next_triplet].last = 0;
> > +       desc->data_ptrs[next_triplet].dma_ext = 0;
> > +       next_triplet++;
> > +
> > +       return next_triplet;
> > +}
> > +
> > +static inline int
> > +acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
> > +               struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
> > +               struct rte_mbuf *output, uint32_t *in_offset,
> > +               uint32_t *out_offset, uint32_t *out_length,
> > +               uint32_t *mbuf_total_left, uint32_t *seg_total_left)
> > +{
> > +       int next_triplet = 1; /* FCW already done */
> > +       uint16_t K, in_length_in_bits, in_length_in_bytes;
> > +       struct rte_bbdev_op_ldpc_enc *enc = &op->ldpc_enc;
> > +
> > +       desc->word0 = ACC100_DMA_DESC_TYPE;
> > +       desc->word1 = 0; /**< Timestamp could be disabled */
> > +       desc->word2 = 0;
> > +       desc->word3 = 0;
> > +       desc->numCBs = 1;
> > +
> > +       K = (enc->basegraph == 1 ? 22 : 10) * enc->z_c;
> > +       in_length_in_bits = K - enc->n_filler;
> > +       if ((enc->op_flags & RTE_BBDEV_LDPC_CRC_24A_ATTACH) ||
> > +                       (enc->op_flags & RTE_BBDEV_LDPC_CRC_24B_ATTACH))
> > +               in_length_in_bits -= 24;
> > +       in_length_in_bytes = in_length_in_bits >> 3;
> > +
> > +       if (unlikely((*mbuf_total_left == 0) ||
> > +                       (*mbuf_total_left < in_length_in_bytes))) {
> > +               rte_bbdev_log(ERR,
> > +                               "Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
> > +                               *mbuf_total_left, in_length_in_bytes);
> > +               return -1;
> > +       }
> > +
> > +       next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
> > +                       in_length_in_bytes,
> > +                       seg_total_left, next_triplet);
> > +       if (unlikely(next_triplet < 0)) {
> > +               rte_bbdev_log(ERR,
> > +                               "Mismatch between data to process and mbuf data length in bbdev_op: %p",
> > +                               op);
> > +               return -1;
> > +       }
> > +       desc->data_ptrs[next_triplet - 1].last = 1;
> > +       desc->m2dlen = next_triplet;
> > +       *mbuf_total_left -= in_length_in_bytes;
> > +
> > +       /* Set output length */
> > +       /* Integer round up division by 8 */
> > +       *out_length = (enc->cb_params.e + 7) >> 3;
> > +
> > +       next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
> > +                       *out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
> > +       if (unlikely(next_triplet < 0)) {
> > +               rte_bbdev_log(ERR,
> > +                               "Mismatch between data to process and mbuf data length in bbdev_op: %p",
> > +                               op);
> > +               return -1;
> > +       }
> > +       op->ldpc_enc.output.length += *out_length;
> > +       *out_offset += *out_length;
> > +       desc->data_ptrs[next_triplet - 1].last = 1;
> > +       desc->data_ptrs[next_triplet - 1].dma_ext = 0;
> > +       desc->d2mlen = next_triplet - desc->m2dlen;
> > +
> > +       desc->op_addr = op;
> > +
> > +       return 0;
> > +}
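
The input-length computation above can be checked in isolation. A sketch under the same convention (K = 22*Zc for BG1, 10*Zc for BG2; filler bits and an attached CRC24 are excluded from the input; `ldpc_enc_input_bytes` is a hypothetical name):

```c
#include <assert.h>
#include <stdint.h>

/* Bytes of uncoded input per code block for the LDPC encoder,
 * following the K / n_filler / CRC arithmetic in acc100_dma_desc_le_fill(). */
static uint16_t ldpc_enc_input_bytes(int basegraph, uint16_t z_c,
		uint16_t n_filler, int crc24_attached)
{
	uint16_t K = (basegraph == 1 ? 22 : 10) * z_c;
	uint16_t bits = K - n_filler;

	if (crc24_attached)
		bits -= 24;
	return bits >> 3;
}
```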
> > +
> > +static inline int
> > +acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
> > +               struct acc100_dma_req_desc *desc,
> > +               struct rte_mbuf **input, struct rte_mbuf *h_output,
> > +               uint32_t *in_offset, uint32_t *h_out_offset,
> > +               uint32_t *h_out_length, uint32_t *mbuf_total_left,
> > +               uint32_t *seg_total_left,
> > +               struct acc100_fcw_ld *fcw)
> > +{
> > +       struct rte_bbdev_op_ldpc_dec *dec = &op->ldpc_dec;
> > +       int next_triplet = 1; /* FCW already done */
> > +       uint32_t input_length;
> > +       uint16_t output_length, crc24_overlap = 0;
> > +       uint16_t sys_cols, K, h_p_size, h_np_size;
> > +       bool h_comp = check_bit(dec->op_flags,
> > +                       RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> > +
> > +       desc->word0 = ACC100_DMA_DESC_TYPE;
> > +       desc->word1 = 0; /**< Timestamp could be disabled */
> > +       desc->word2 = 0;
> > +       desc->word3 = 0;
> > +       desc->numCBs = 1;
> > +
> > +       if (check_bit(op->ldpc_dec.op_flags,
> > +                       RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP))
> > +               crc24_overlap = 24;
> > +
> > +       /* Compute some LDPC BG lengths */
> > +       input_length = dec->cb_params.e;
> > +       if (check_bit(op->ldpc_dec.op_flags,
> > +                       RTE_BBDEV_LDPC_LLR_COMPRESSION))
> > +               input_length = (input_length * 3 + 3) / 4;
> > +       sys_cols = (dec->basegraph == 1) ? 22 : 10;
> > +       K = sys_cols * dec->z_c;
> > +       output_length = K - dec->n_filler - crc24_overlap;
> > +
> > +       if (unlikely((*mbuf_total_left == 0) ||
> > +                       (*mbuf_total_left < input_length))) {
> > +               rte_bbdev_log(ERR,
> > +                               "Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
> > +                               *mbuf_total_left, input_length);
> > +               return -1;
> > +       }
> > +
> > +       next_triplet = acc100_dma_fill_blk_type_in(desc, input,
> > +                       in_offset, input_length,
> > +                       seg_total_left, next_triplet);
> > +
> > +       if (unlikely(next_triplet < 0)) {
> > +               rte_bbdev_log(ERR,
> > +                               "Mismatch between data to process and mbuf data length in bbdev_op: %p",
> > +                               op);
> > +               return -1;
> > +       }
> > +
> > +       if (check_bit(op->ldpc_dec.op_flags,
> > +                               RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
> > +               h_p_size = fcw->hcin_size0 + fcw->hcin_size1;
> > +               if (h_comp)
> > +                       h_p_size = (h_p_size * 3 + 3) / 4;
> > +               desc->data_ptrs[next_triplet].address =
> > +                               dec->harq_combined_input.offset;
> > +               desc->data_ptrs[next_triplet].blen = h_p_size;
> > +               desc->data_ptrs[next_triplet].blkid =
> > +                               ACC100_DMA_BLKID_IN_HARQ;
> > +               desc->data_ptrs[next_triplet].dma_ext = 1;
> > +#ifndef ACC100_EXT_MEM
> > +               acc100_dma_fill_blk_type_out(
> > +                               desc,
> > +                               op->ldpc_dec.harq_combined_input.data,
> > +                               op->ldpc_dec.harq_combined_input.offset,
> > +                               h_p_size,
> > +                               next_triplet,
> > +                               ACC100_DMA_BLKID_IN_HARQ);
> > +#endif
> > +               next_triplet++;
> > +       }
> > +
> > +       desc->data_ptrs[next_triplet - 1].last = 1;
> > +       desc->m2dlen = next_triplet;
> > +       *mbuf_total_left -= input_length;
> > +
> > +       next_triplet = acc100_dma_fill_blk_type_out(desc, h_output,
> > +                       *h_out_offset, output_length >> 3, next_triplet,
> > +                       ACC100_DMA_BLKID_OUT_HARD);
> > +       if (unlikely(next_triplet < 0)) {
> > +               rte_bbdev_log(ERR,
> > +                               "Mismatch between data to process and mbuf data length in bbdev_op: %p",
> > +                               op);
> > +               return -1;
> > +       }
> > +
> > +       if (check_bit(op->ldpc_dec.op_flags,
> > +                               RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
> > +               /* Pruned size of the HARQ */
> > +               h_p_size = fcw->hcout_size0 + fcw->hcout_size1;
> > +               /* Non-Pruned size of the HARQ */
> > +               h_np_size = fcw->hcout_offset > 0 ?
> > +                               fcw->hcout_offset + fcw->hcout_size1 :
> > +                               h_p_size;
> > +               if (h_comp) {
> > +                       h_np_size = (h_np_size * 3 + 3) / 4;
> > +                       h_p_size = (h_p_size * 3 + 3) / 4;
> > +               }
> > +               dec->harq_combined_output.length = h_np_size;
> > +               desc->data_ptrs[next_triplet].address =
> > +                               dec->harq_combined_output.offset;
> > +               desc->data_ptrs[next_triplet].blen = h_p_size;
> > +               desc->data_ptrs[next_triplet].blkid =
> > +                               ACC100_DMA_BLKID_OUT_HARQ;
> > +               desc->data_ptrs[next_triplet].dma_ext = 1;
> > +#ifndef ACC100_EXT_MEM
> > +               acc100_dma_fill_blk_type_out(
> > +                               desc,
> > +                               dec->harq_combined_output.data,
> > +                               dec->harq_combined_output.offset,
> > +                               h_p_size,
> > +                               next_triplet,
> > +                               ACC100_DMA_BLKID_OUT_HARQ);
> > +#endif
> > +               next_triplet++;
> > +       }
> > +
> > +       *h_out_length = output_length >> 3;
> > +       dec->hard_output.length += *h_out_length;
> > +       *h_out_offset += *h_out_length;
> > +       desc->data_ptrs[next_triplet - 1].last = 1;
> > +       desc->d2mlen = next_triplet - desc->m2dlen;
> > +
> > +       desc->op_addr = op;
> > +
> > +       return 0;
> > +}
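
A standalone sketch of the HARQ size arithmetic above (both helper names are hypothetical): 6-bit compression shrinks sizes by a factor of 6/8, rounded up, i.e. `(h * 3 + 3) / 4`, and the non-pruned output size adds the pruning offset back in when present.

```c
#include <assert.h>
#include <stdint.h>

/* Scale a HARQ size for 6-bit compression: ceil(h * 6 / 8). */
static uint16_t harq_comp_size(uint16_t h)
{
	return (h * 3 + 3) / 4;
}

/* Non-pruned HARQ output size, as in acc100_dma_desc_ld_fill():
 * when pruning is in effect (hcout_offset > 0), the skipped region
 * up to the offset still counts toward the stored length. */
static uint16_t harq_np_size(uint16_t size0, uint16_t size1, uint16_t offset)
{
	return offset > 0 ? offset + size1 : size0 + size1;
}
```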
> > +
> > +static inline void
> > +acc100_dma_desc_ld_update(struct rte_bbdev_dec_op *op,
> > +               struct acc100_dma_req_desc *desc,
> > +               struct rte_mbuf *input, struct rte_mbuf *h_output,
> > +               uint32_t *in_offset, uint32_t *h_out_offset,
> > +               uint32_t *h_out_length,
> > +               union acc100_harq_layout_data *harq_layout)
> > +{
> > +       int next_triplet = 1; /* FCW already done */
> > +       desc->data_ptrs[next_triplet].address =
> > +                       rte_pktmbuf_iova_offset(input, *in_offset);
> > +       next_triplet++;
> > +
> > +       if (check_bit(op->ldpc_dec.op_flags,
> > +                               RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
> > +               struct rte_bbdev_op_data hi = op->ldpc_dec.harq_combined_input;
> > +               desc->data_ptrs[next_triplet].address = hi.offset;
> > +#ifndef ACC100_EXT_MEM
> > +               desc->data_ptrs[next_triplet].address =
> > +                               rte_pktmbuf_iova_offset(hi.data, hi.offset);
> > +#endif
> > +               next_triplet++;
> > +       }
> > +
> > +       desc->data_ptrs[next_triplet].address =
> > +                       rte_pktmbuf_iova_offset(h_output, *h_out_offset);
> > +       *h_out_length = desc->data_ptrs[next_triplet].blen;
> > +       next_triplet++;
> > +
> > +       if (check_bit(op->ldpc_dec.op_flags,
> > +                               RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
> > +               desc->data_ptrs[next_triplet].address =
> > +                               op->ldpc_dec.harq_combined_output.offset;
> > +               /* Adjust based on previous operation */
> > +               struct rte_bbdev_dec_op *prev_op = desc->op_addr;
> > +               op->ldpc_dec.harq_combined_output.length =
> > +                               prev_op->ldpc_dec.harq_combined_output.length;
> > +               int16_t hq_idx = op->ldpc_dec.harq_combined_output.offset /
> > +                               ACC100_HARQ_OFFSET;
> > +               int16_t prev_hq_idx =
> > +                               prev_op->ldpc_dec.harq_combined_output.offset
> > +                               / ACC100_HARQ_OFFSET;
> > +               harq_layout[hq_idx].val = harq_layout[prev_hq_idx].val;
> > +#ifndef ACC100_EXT_MEM
> > +               struct rte_bbdev_op_data ho =
> > +                               op->ldpc_dec.harq_combined_output;
> > +               desc->data_ptrs[next_triplet].address =
> > +                               rte_pktmbuf_iova_offset(ho.data, ho.offset);
> > +#endif
> > +               next_triplet++;
> > +       }
> > +
> > +       op->ldpc_dec.hard_output.length += *h_out_length;
> > +       desc->op_addr = op;
> > +}
> > +
> > +
> > +/* Enqueue a number of operations to HW and update software rings */
> > +static inline void
> > +acc100_dma_enqueue(struct acc100_queue *q, uint16_t n,
> > +               struct rte_bbdev_stats *queue_stats)
> > +{
> > +       union acc100_enqueue_reg_fmt enq_req;
> > +#ifdef RTE_BBDEV_OFFLOAD_COST
> > +       uint64_t start_time = 0;
> > +       queue_stats->acc_offload_cycles = 0;
> > +#else
> > +       RTE_SET_USED(queue_stats);
> > +#endif
> > +
> > +       enq_req.val = 0;
> > +       /* Setting offset, 100b for 256 DMA Desc */
> > +       enq_req.addr_offset = ACC100_DESC_OFFSET;
> > +
> > +       /* Split ops into batches */
> > +       do {
> > +               union acc100_dma_desc *desc;
> > +               uint16_t enq_batch_size;
> > +               uint64_t offset;
> > +               rte_iova_t req_elem_addr;
> > +
> > +               enq_batch_size = RTE_MIN(n, MAX_ENQ_BATCH_SIZE);
> > +
> > +               /* Set flag on last descriptor in a batch */
> > +               desc = q->ring_addr + ((q->sw_ring_head + enq_batch_size - 1) &
> > +                               q->sw_ring_wrap_mask);
> > +               desc->req.last_desc_in_batch = 1;
> > +
> > +               /* Calculate the 1st descriptor's address */
> > +               offset = ((q->sw_ring_head & q->sw_ring_wrap_mask) *
> > +                               sizeof(union acc100_dma_desc));
> > +               req_elem_addr = q->ring_addr_phys + offset;
> > +
> > +               /* Fill enqueue struct */
> > +               enq_req.num_elem = enq_batch_size;
> > +               /* low 6 bits are not needed */
> > +               enq_req.req_elem_addr = (uint32_t)(req_elem_addr >> 6);
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +               rte_memdump(stderr, "Req sdone", desc, sizeof(*desc));
> > +#endif
> > +               rte_bbdev_log_debug(
> > +                               "Enqueue %u reqs (phys %#"PRIx64") to reg %p",
> > +                               enq_batch_size,
> > +                               req_elem_addr,
> > +                               (void *)q->mmio_reg_enqueue);
> > +
> > +               rte_wmb();
> > +
> > +#ifdef RTE_BBDEV_OFFLOAD_COST
> > +               /* Start time measurement for enqueue function offload. */
> > +               start_time = rte_rdtsc_precise();
> > +#endif
> > +               rte_bbdev_log(DEBUG, "Debug : MMIO Enqueue");
> > +               mmio_write(q->mmio_reg_enqueue, enq_req.val);
> > +
> > +#ifdef RTE_BBDEV_OFFLOAD_COST
> > +               queue_stats->acc_offload_cycles +=
> > +                               rte_rdtsc_precise() - start_time;
> > +#endif
> > +
> > +               q->aq_enqueued++;
> > +               q->sw_ring_head += enq_batch_size;
> > +               n -= enq_batch_size;
> > +
> > +       } while (n);
> > +}
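
The batching loop above issues one MMIO doorbell write per batch of at most MAX_ENQ_BATCH_SIZE descriptors. A sketch of the split (assuming n > 0 on entry, as the callers guarantee; `num_doorbells` is a hypothetical name):

```c
#include <assert.h>
#include <stdint.h>

/* Number of doorbell writes needed to enqueue n descriptors when
 * each MMIO write can kick at most max_batch of them. */
static unsigned int num_doorbells(uint16_t n, uint16_t max_batch)
{
	unsigned int writes = 0;

	while (n > 0) {
		uint16_t batch = (n < max_batch) ? n : max_batch;
		n -= batch;
		writes++;
	}
	return writes;
}
```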
> > +
> > +/* Enqueue a group of encode operations for ACC100 device in CB mode */
> > +static inline int
> > +enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
> > +               uint16_t total_enqueued_cbs, int16_t num)
> > +{
> > +       union acc100_dma_desc *desc = NULL;
> > +       uint32_t out_length;
> > +       struct rte_mbuf *output_head, *output;
> > +       int i, next_triplet;
> > +       uint16_t  in_length_in_bytes;
> > +       struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
> > +
> > +       uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> > +                       & q->sw_ring_wrap_mask);
> > +       desc = q->ring_addr + desc_idx;
> > +       acc100_fcw_le_fill(ops[0], &desc->req.fcw_le, num);
> > +
> > +       /* This could be done at polling */
> > +       desc->req.word0 = ACC100_DMA_DESC_TYPE;
> > +       desc->req.word1 = 0; /**< Timestamp could be disabled */
> > +       desc->req.word2 = 0;
> > +       desc->req.word3 = 0;
> > +       desc->req.numCBs = num;
> > +
> > +       in_length_in_bytes = ops[0]->ldpc_enc.input.data->data_len;
> > +       out_length = (enc->cb_params.e + 7) >> 3;
> > +       desc->req.m2dlen = 1 + num;
> > +       desc->req.d2mlen = num;
> > +       next_triplet = 1;
> > +
> > +       for (i = 0; i < num; i++) {
> > +               desc->req.data_ptrs[next_triplet].address =
> > +                       rte_pktmbuf_iova_offset(ops[i]->ldpc_enc.input.data, 0);
> > +               desc->req.data_ptrs[next_triplet].blen = in_length_in_bytes;
> > +               next_triplet++;
> > +               desc->req.data_ptrs[next_triplet].address =
> > +                               rte_pktmbuf_iova_offset(
> > +                               ops[i]->ldpc_enc.output.data, 0);
> > +               desc->req.data_ptrs[next_triplet].blen = out_length;
> > +               next_triplet++;
> > +               ops[i]->ldpc_enc.output.length = out_length;
> > +               output_head = output = ops[i]->ldpc_enc.output.data;
> > +               mbuf_append(output_head, output, out_length);
> > +               output->data_len = out_length;
> > +       }
> > +
> > +       desc->req.op_addr = ops[0];
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +       rte_memdump(stderr, "FCW", &desc->req.fcw_le,
> > +                       sizeof(desc->req.fcw_le) - 8);
> > +       rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> > +#endif
> > +
> > +       /* Multiple CBs (one op each) were successfully prepared to enqueue */
> > +       return num;
> > +}
> > +
> > +/* Enqueue one encode operation for ACC100 device in CB mode */
> > +static inline int
> > +enqueue_ldpc_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
> > +               uint16_t total_enqueued_cbs)
> > +{
> > +       union acc100_dma_desc *desc = NULL;
> > +       int ret;
> > +       uint32_t in_offset, out_offset, out_length, mbuf_total_left,
> > +               seg_total_left;
> > +       struct rte_mbuf *input, *output_head, *output;
> > +
> > +       uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> > +                       & q->sw_ring_wrap_mask);
> > +       desc = q->ring_addr + desc_idx;
> > +       acc100_fcw_le_fill(op, &desc->req.fcw_le, 1);
> > +
> > +       input = op->ldpc_enc.input.data;
> > +       output_head = output = op->ldpc_enc.output.data;
> > +       in_offset = op->ldpc_enc.input.offset;
> > +       out_offset = op->ldpc_enc.output.offset;
> > +       out_length = 0;
> > +       mbuf_total_left = op->ldpc_enc.input.length;
> > +       seg_total_left = rte_pktmbuf_data_len(op->ldpc_enc.input.data)
> > +                       - in_offset;
> > +
> > +       ret = acc100_dma_desc_le_fill(op, &desc->req, &input, output,
> > +                       &in_offset, &out_offset, &out_length, &mbuf_total_left,
> > +                       &seg_total_left);
> > +
> > +       if (unlikely(ret < 0))
> > +               return ret;
> > +
> > +       mbuf_append(output_head, output, out_length);
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +       rte_memdump(stderr, "FCW", &desc->req.fcw_le,
> > +                       sizeof(desc->req.fcw_le) - 8);
> > +       rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> > +
> > +       /* Check if any data left after processing one CB */
> > +       if (mbuf_total_left != 0) {
> > +               rte_bbdev_log(ERR,
> > +                               "Some data still left after processing one CB: mbuf_total_left = %u",
> > +                               mbuf_total_left);
> > +               return -EINVAL;
> > +       }
> > +#endif
> > +       /* One CB (one op) was successfully prepared to enqueue */
> > +       return 1;
> > +}
> > +
> > +/* Enqueue one decode operation for ACC100 device in CB mode */
> > +static inline int
> > +enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
> > +               uint16_t total_enqueued_cbs, bool same_op)
> > +{
> > +       int ret;
> > +
> > +       union acc100_dma_desc *desc;
> > +       uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> > +                       & q->sw_ring_wrap_mask);
> > +       desc = q->ring_addr + desc_idx;
> > +       struct rte_mbuf *input, *h_output_head, *h_output;
> > +       uint32_t in_offset, h_out_offset, h_out_length, mbuf_total_left;
> > +       input = op->ldpc_dec.input.data;
> > +       h_output_head = h_output = op->ldpc_dec.hard_output.data;
> > +       in_offset = op->ldpc_dec.input.offset;
> > +       h_out_offset = op->ldpc_dec.hard_output.offset;
> > +       mbuf_total_left = op->ldpc_dec.input.length;
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +       if (unlikely(input == NULL)) {
> > +               rte_bbdev_log(ERR, "Invalid mbuf pointer");
> > +               return -EFAULT;
> > +       }
> > +#endif
> > +       union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
> > +
> > +       if (same_op) {
> > +               union acc100_dma_desc *prev_desc;
> > +               desc_idx = ((q->sw_ring_head + total_enqueued_cbs - 1)
> > +                               & q->sw_ring_wrap_mask);
> > +               prev_desc = q->ring_addr + desc_idx;
> > +               uint8_t *prev_ptr = (uint8_t *) prev_desc;
> > +               uint8_t *new_ptr = (uint8_t *) desc;
> > +               /* Copy first 4 words and BDESCs */
> > +               rte_memcpy(new_ptr, prev_ptr, 16);
> > +               rte_memcpy(new_ptr + 36, prev_ptr + 36, 40);
> > +               desc->req.op_addr = prev_desc->req.op_addr;
> > +               /* Copy FCW */
> > +               rte_memcpy(new_ptr + ACC100_DESC_FCW_OFFSET,
> > +                               prev_ptr + ACC100_DESC_FCW_OFFSET,
> > +                               ACC100_FCW_LD_BLEN);
> > +               acc100_dma_desc_ld_update(op, &desc->req, input, h_output,
> > +                               &in_offset, &h_out_offset,
> > +                               &h_out_length, harq_layout);
> > +       } else {
> > +               struct acc100_fcw_ld *fcw;
> > +               uint32_t seg_total_left;
> > +               fcw = &desc->req.fcw_ld;
> > +               acc100_fcw_ld_fill(op, fcw, harq_layout);
> > +
> > +               /* Special handling when overusing mbuf */
> > +               if (fcw->rm_e < MAX_E_MBUF)
> > +                       seg_total_left = rte_pktmbuf_data_len(input)
> > +                                       - in_offset;
> > +               else
> > +                       seg_total_left = fcw->rm_e;
> > +
> > +               ret = acc100_dma_desc_ld_fill(op, &desc->req, &input, h_output,
> > +                               &in_offset, &h_out_offset,
> > +                               &h_out_length, &mbuf_total_left,
> > +                               &seg_total_left, fcw);
> > +               if (unlikely(ret < 0))
> > +                       return ret;
> > +       }
> > +
> > +       /* Hard output */
> > +       mbuf_append(h_output_head, h_output, h_out_length);
> > +#ifndef ACC100_EXT_MEM
> > +       if (op->ldpc_dec.harq_combined_output.length > 0) {
> > +               /* Push the HARQ output into host memory */
> > +               struct rte_mbuf *hq_output_head, *hq_output;
> > +               hq_output_head = op->ldpc_dec.harq_combined_output.data;
> > +               hq_output = op->ldpc_dec.harq_combined_output.data;
> > +               mbuf_append(hq_output_head, hq_output,
> > +                               op->ldpc_dec.harq_combined_output.length);
> > +       }
> > +#endif
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +       rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
> > +                       sizeof(desc->req.fcw_ld) - 8);
> > +       rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> > +#endif
> > +
> > +       /* One CB (one op) was successfully prepared to enqueue */
> > +       return 1;
> > +}
> > +
> > +/* Enqueue one decode operation for ACC100 device in TB mode */
> > +static inline int
> > +enqueue_ldpc_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
> > +               uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
> > +{
> > +       union acc100_dma_desc *desc = NULL;
> > +       int ret;
> > +       uint8_t r, c;
> > +       uint32_t in_offset, h_out_offset,
> > +               h_out_length, mbuf_total_left, seg_total_left;
> > +       struct rte_mbuf *input, *h_output_head, *h_output;
> > +       uint16_t current_enqueued_cbs = 0;
> > +
> > +       uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> > +                       & q->sw_ring_wrap_mask);
> > +       desc = q->ring_addr + desc_idx;
> > +       uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
> > +       union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
> > +       acc100_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout);
> > +
> > +       input = op->ldpc_dec.input.data;
> > +       h_output_head = h_output = op->ldpc_dec.hard_output.data;
> > +       in_offset = op->ldpc_dec.input.offset;
> > +       h_out_offset = op->ldpc_dec.hard_output.offset;
> > +       h_out_length = 0;
> > +       mbuf_total_left = op->ldpc_dec.input.length;
> > +       c = op->ldpc_dec.tb_params.c;
> > +       r = op->ldpc_dec.tb_params.r;
> > +
> > +       while (mbuf_total_left > 0 && r < c) {
> > +
> > +               seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
> > +
> > +               /* Set up DMA descriptor */
> > +               desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
> > +                               & q->sw_ring_wrap_mask);
> > +               desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
> > +               desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
> > +               ret = acc100_dma_desc_ld_fill(op, &desc->req, &input,
> > +                               h_output, &in_offset, &h_out_offset,
> > +                               &h_out_length,
> > +                               &mbuf_total_left, &seg_total_left,
> > +                               &desc->req.fcw_ld);
> > +
> > +               if (unlikely(ret < 0))
> > +                       return ret;
> > +
> > +               /* Hard output */
> > +               mbuf_append(h_output_head, h_output, h_out_length);
> > +
> > +               /* Set total number of CBs in TB */
> > +               desc->req.cbs_in_tb = cbs_in_tb;
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +               rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
> > +                               sizeof(desc->req.fcw_ld) - 8);
> > +               rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> > +#endif
> > +
> > +               if (seg_total_left == 0) {
> > +                       /* Go to the next mbuf */
> > +                       input = input->next;
> > +                       in_offset = 0;
> > +                       h_output = h_output->next;
> > +                       h_out_offset = 0;
> > +               }
> > +               total_enqueued_cbs++;
> > +               current_enqueued_cbs++;
> > +               r++;
> > +       }
> > +
> > +       if (unlikely(desc == NULL))
> > +               return current_enqueued_cbs;
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +       /* Check if any CBs left for processing */
> > +       if (mbuf_total_left != 0) {
> > +               rte_bbdev_log(ERR,
> > +                               "Some data still left for processing: mbuf_total_left = %u",
> > +                               mbuf_total_left);
> > +               return -EINVAL;
> > +       }
> > +#endif
> > +       /* Set SDone on last CB descriptor for TB mode */
> > +       desc->req.sdone_enable = 1;
> > +       desc->req.irq_enable = q->irq_enable;
> > +
> > +       return current_enqueued_cbs;
> > +}
> > +
> > +/* Calculates number of CBs in processed encoder TB based on 'r' and input
> > + * length.
> > + */
> > +static inline uint8_t
> > +get_num_cbs_in_tb_enc(struct rte_bbdev_op_turbo_enc *turbo_enc)
> > +{
> > +       uint8_t c, c_neg, r, crc24_bits = 0;
> > +       uint16_t k, k_neg, k_pos;
> > +       uint8_t cbs_in_tb = 0;
> > +       int32_t length;
> > +
> > +       length = turbo_enc->input.length;
> > +       r = turbo_enc->tb_params.r;
> > +       c = turbo_enc->tb_params.c;
> > +       c_neg = turbo_enc->tb_params.c_neg;
> > +       k_neg = turbo_enc->tb_params.k_neg;
> > +       k_pos = turbo_enc->tb_params.k_pos;
> > +       crc24_bits = 0;
> > +       if (check_bit(turbo_enc->op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH))
> > +               crc24_bits = 24;
> > +       while (length > 0 && r < c) {
> > +               k = (r < c_neg) ? k_neg : k_pos;
> > +               length -= (k - crc24_bits) >> 3;
> > +               r++;
> > +               cbs_in_tb++;
> > +       }
> > +
> > +       return cbs_in_tb;
> > +}
> > +
> > +/* Calculates number of CBs in processed decoder TB based on 'r' and input
> > + * length.
> > + */
> > +static inline uint16_t
> > +get_num_cbs_in_tb_dec(struct rte_bbdev_op_turbo_dec *turbo_dec)
> > +{
> > +       uint8_t c, c_neg, r = 0;
> > +       uint16_t kw, k, k_neg, k_pos, cbs_in_tb = 0;
> > +       int32_t length;
> > +
> > +       length = turbo_dec->input.length;
> > +       r = turbo_dec->tb_params.r;
> > +       c = turbo_dec->tb_params.c;
> > +       c_neg = turbo_dec->tb_params.c_neg;
> > +       k_neg = turbo_dec->tb_params.k_neg;
> > +       k_pos = turbo_dec->tb_params.k_pos;
> > +       while (length > 0 && r < c) {
> > +               k = (r < c_neg) ? k_neg : k_pos;
> > +               kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
> > +               length -= kw;
> > +               r++;
> > +               cbs_in_tb++;
> > +       }
> > +
> > +       return cbs_in_tb;
> > +}
> > +
> > +/* Calculates number of CBs in processed decoder TB based on 'r' and input
> > + * length.
> > + */
> > +static inline uint16_t
> > +get_num_cbs_in_tb_ldpc_dec(struct rte_bbdev_op_ldpc_dec *ldpc_dec)
> > +{
> > +       uint16_t r, cbs_in_tb = 0;
> > +       int32_t length = ldpc_dec->input.length;
> > +       r = ldpc_dec->tb_params.r;
> > +       while (length > 0 && r < ldpc_dec->tb_params.c) {
> > +               length -=  (r < ldpc_dec->tb_params.cab) ?
> > +                               ldpc_dec->tb_params.ea :
> > +                               ldpc_dec->tb_params.eb;
> > +               r++;
> > +               cbs_in_tb++;
> > +       }
> > +       return cbs_in_tb;
> > +}
> > +
> > +/* Check whether the encode operations can be muxed with a common FCW */
> > +static inline bool
> > +check_mux(struct rte_bbdev_enc_op **ops, uint16_t num) {
> > +       uint16_t i;
> > +       if (num == 1)
> > +               return false;
> > +       for (i = 1; i < num; ++i) {
> > +               /* Only mux compatible code blocks */
> > +               if (memcmp((uint8_t *)(&ops[i]->ldpc_enc) + ENC_OFFSET,
> > +                               (uint8_t *)(&ops[0]->ldpc_enc) + ENC_OFFSET,
> > +                               CMP_ENC_SIZE) != 0)
> > +                       return false;
> > +       }
> > +       return true;
> > +}
> > +
> > +/* Enqueue LDPC encode operations for ACC100 device in CB mode. */
> > +static inline uint16_t
> > +acc100_enqueue_ldpc_enc_cb(struct rte_bbdev_queue_data *q_data,
> > +               struct rte_bbdev_enc_op **ops, uint16_t num)
> > +{
> > +       struct acc100_queue *q = q_data->queue_private;
> > +       int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> > +       uint16_t i = 0;
> > +       union acc100_dma_desc *desc;
> > +       int ret, desc_idx = 0;
> > +       int16_t enq, left = num;
> > +
> > +       while (left > 0) {
> > +               if (unlikely(avail - 1 < 0))
> > +                       break;
> > +               avail--;
> > +               enq = RTE_MIN(left, MUX_5GDL_DESC);
> > +               if (check_mux(&ops[i], enq)) {
> > +                       ret = enqueue_ldpc_enc_n_op_cb(q, &ops[i],
> > +                                       desc_idx, enq);
> > +                       if (ret < 0)
> > +                               break;
> > +                       i += enq;
> > +               } else {
> > +                       ret = enqueue_ldpc_enc_one_op_cb(q, ops[i], desc_idx);
> > +                       if (ret < 0)
> > +                               break;
> > +                       i++;
> > +               }
> > +               desc_idx++;
> > +               left = num - i;
> > +       }
> > +
> > +       if (unlikely(i == 0))
> > +               return 0; /* Nothing to enqueue */
> > +
> > +       /* Set SDone in last CB in enqueued ops for CB mode*/
> > +       desc = q->ring_addr + ((q->sw_ring_head + desc_idx - 1)
> > +                       & q->sw_ring_wrap_mask);
> > +       desc->req.sdone_enable = 1;
> > +       desc->req.irq_enable = q->irq_enable;
> > +
> > +       acc100_dma_enqueue(q, desc_idx, &q_data->queue_stats);
> > +
> > +       /* Update stats */
> > +       q_data->queue_stats.enqueued_count += i;
> > +       q_data->queue_stats.enqueue_err_count += num - i;
> > +
> > +       return i;
> > +}
> > +
> > +/* Enqueue LDPC encode operations for ACC100 device. */
> > +static uint16_t
> > +acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
> > +               struct rte_bbdev_enc_op **ops, uint16_t num)
> > +{
> > +       if (unlikely(num == 0))
> > +               return 0;
> > +       return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
> > +}
> > +
> > +/* Check whether two adjacent decode operations can be muxed with a common FCW */
> > +static inline bool
> > +cmp_ldpc_dec_op(struct rte_bbdev_dec_op **ops) {
> > +       /* Only mux compatible code blocks */
> > +       return (memcmp((uint8_t *)(&ops[0]->ldpc_dec) + DEC_OFFSET,
> > +                       (uint8_t *)(&ops[1]->ldpc_dec) + DEC_OFFSET,
> > +                       CMP_DEC_SIZE) == 0);
> > +}
> > +
> > +
> > +/* Enqueue decode operations for ACC100 device in TB mode */
> > +static uint16_t
> > +acc100_enqueue_ldpc_dec_tb(struct rte_bbdev_queue_data *q_data,
> > +               struct rte_bbdev_dec_op **ops, uint16_t num)
> > +{
> > +       struct acc100_queue *q = q_data->queue_private;
> > +       int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> > +       uint16_t i, enqueued_cbs = 0;
> > +       uint8_t cbs_in_tb;
> > +       int ret;
> > +
> > +       for (i = 0; i < num; ++i) {
> > +               cbs_in_tb = get_num_cbs_in_tb_ldpc_dec(&ops[i]->ldpc_dec);
> > +               /* Check if there is space available for further processing */
> > +               if (unlikely(avail - cbs_in_tb < 0))
> > +                       break;
> > +               avail -= cbs_in_tb;
> > +
> > +               ret = enqueue_ldpc_dec_one_op_tb(q, ops[i],
> > +                               enqueued_cbs, cbs_in_tb);
> > +               if (ret < 0)
> > +                       break;
> > +               enqueued_cbs += ret;
> > +       }
> > +
> > +       acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
> > +
> > +       /* Update stats */
> > +       q_data->queue_stats.enqueued_count += i;
> > +       q_data->queue_stats.enqueue_err_count += num - i;
> > +       return i;
> > +}
> > +
> > +/* Enqueue decode operations for ACC100 device in CB mode */
> > +static uint16_t
> > +acc100_enqueue_ldpc_dec_cb(struct rte_bbdev_queue_data *q_data,
> > +               struct rte_bbdev_dec_op **ops, uint16_t num)
> > +{
> > +       struct acc100_queue *q = q_data->queue_private;
> > +       int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> > +       uint16_t i;
> > +       union acc100_dma_desc *desc;
> > +       int ret;
> > +       bool same_op = false;
> > +       for (i = 0; i < num; ++i) {
> > +               /* Check if there is space available for further processing */
> > +               if (unlikely(avail - 1 < 0))
> > +                       break;
> > +               avail -= 1;
> > +
> > +               if (i > 0)
> > +                       same_op = cmp_ldpc_dec_op(&ops[i-1]);
> > +               rte_bbdev_log(INFO, "Op %d %d %d %d %d %d %d %d %d %d %d %d\n",
> > +                       i, ops[i]->ldpc_dec.op_flags, ops[i]->ldpc_dec.rv_index,
> > +                       ops[i]->ldpc_dec.iter_max, ops[i]->ldpc_dec.iter_count,
> > +                       ops[i]->ldpc_dec.basegraph, ops[i]->ldpc_dec.z_c,
> > +                       ops[i]->ldpc_dec.n_cb, ops[i]->ldpc_dec.q_m,
> > +                       ops[i]->ldpc_dec.n_filler, ops[i]->ldpc_dec.cb_params.e,
> > +                       same_op);
> > +               ret = enqueue_ldpc_dec_one_op_cb(q, ops[i], i, same_op);
> > +               if (ret < 0)
> > +                       break;
> > +       }
> > +
> > +       if (unlikely(i == 0))
> > +               return 0; /* Nothing to enqueue */
> > +
> > +       /* Set SDone in last CB in enqueued ops for CB mode*/
> > +       desc = q->ring_addr + ((q->sw_ring_head + i - 1)
> > +                       & q->sw_ring_wrap_mask);
> > +
> > +       desc->req.sdone_enable = 1;
> > +       desc->req.irq_enable = q->irq_enable;
> > +
> > +       acc100_dma_enqueue(q, i, &q_data->queue_stats);
> > +
> > +       /* Update stats */
> > +       q_data->queue_stats.enqueued_count += i;
> > +       q_data->queue_stats.enqueue_err_count += num - i;
> > +       return i;
> > +}
> > +
> > +/* Enqueue LDPC decode operations for ACC100 device. */
> > +static uint16_t
> > +acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
> > +               struct rte_bbdev_dec_op **ops, uint16_t num)
> > +{
> > +       struct acc100_queue *q = q_data->queue_private;
> > +       int32_t aq_avail = q->aq_depth +
> > +                       (q->aq_dequeued - q->aq_enqueued) / 128;
> > +
> > +       if (unlikely((aq_avail == 0) || (num == 0)))
> > +               return 0;
> > +
> > +       if (ops[0]->ldpc_dec.code_block_mode == 0)
> > +               return acc100_enqueue_ldpc_dec_tb(q_data, ops, num);
> > +       else
> > +               return acc100_enqueue_ldpc_dec_cb(q_data, ops, num);
> > +}
> > +
> > +
> > +/* Dequeue one encode operation from ACC100 device in CB mode */
> > +static inline int
> > +dequeue_enc_one_op_cb(struct acc100_queue *q,
> > +               struct rte_bbdev_enc_op **ref_op,
> > +               uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
> > +{
> > +       union acc100_dma_desc *desc, atom_desc;
> > +       union acc100_dma_rsp_desc rsp;
> > +       struct rte_bbdev_enc_op *op;
> > +       int i;
> > +
> > +       desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
> > +                       & q->sw_ring_wrap_mask);
> > +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> > +                       __ATOMIC_RELAXED);
> > +
> > +       /* Check fdone bit */
> > +       if (!(atom_desc.rsp.val & ACC100_FDONE))
> > +               return -1;
> > +
> > +       rsp.val = atom_desc.rsp.val;
> > +       rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
> > +
> > +       /* Dequeue */
> > +       op = desc->req.op_addr;
> > +
> > +       /* Clearing status, it will be set based on response */
> > +       op->status = 0;
> > +
> > +       op->status |= ((rsp.input_err)
> > +                       ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> > +       op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> > +       op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> > +
> > +       if (desc->req.last_desc_in_batch) {
> > +               (*aq_dequeued)++;
> > +               desc->req.last_desc_in_batch = 0;
> > +       }
> > +       desc->rsp.val = ACC100_DMA_DESC_TYPE;
> > +       desc->rsp.add_info_0 = 0; /*Reserved bits */
> > +       desc->rsp.add_info_1 = 0; /*Reserved bits */
> > +
> > +       /* Flag that the muxing cause loss of opaque data */
> > +       op->opaque_data = (void *)-1;
> > +       for (i = 0 ; i < desc->req.numCBs; i++)
> > +               ref_op[i] = op;
> > +
> > +       /* One op was successfully dequeued; it may carry several muxed CBs */
> > +       return desc->req.numCBs;
> > +}
> > +
> > +/* Dequeue one encode operation from ACC100 device in TB mode */
> > +static inline int
> > +dequeue_enc_one_op_tb(struct acc100_queue *q,
> > +               struct rte_bbdev_enc_op **ref_op,
> > +               uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
> > +{
> > +       union acc100_dma_desc *desc, *last_desc, atom_desc;
> > +       union acc100_dma_rsp_desc rsp;
> > +       struct rte_bbdev_enc_op *op;
> > +       uint8_t i = 0;
> > +       uint16_t current_dequeued_cbs = 0, cbs_in_tb;
> > +
> > +       desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
> > +                       & q->sw_ring_wrap_mask);
> > +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> > +                       __ATOMIC_RELAXED);
> > +
> > +       /* Check fdone bit */
> > +       if (!(atom_desc.rsp.val & ACC100_FDONE))
> > +               return -1;
> > +
> > +       /* Get number of CBs in dequeued TB */
> > +       cbs_in_tb = desc->req.cbs_in_tb;
> > +       /* Get last CB */
> > +       last_desc = q->ring_addr + ((q->sw_ring_tail
> > +                       + total_dequeued_cbs + cbs_in_tb - 1)
> > +                       & q->sw_ring_wrap_mask);
> > +       /* Check if last CB in TB is ready to dequeue (and thus
> > +        * the whole TB) - checking sdone bit. If not return.
> > +        */
> > +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
> > +                       __ATOMIC_RELAXED);
> > +       if (!(atom_desc.rsp.val & ACC100_SDONE))
> > +               return -1;
> > +
> > +       /* Dequeue */
> > +       op = desc->req.op_addr;
> > +
> > +       /* Clearing status, it will be set based on response */
> > +       op->status = 0;
> > +
> > +       while (i < cbs_in_tb) {
> > +               desc = q->ring_addr + ((q->sw_ring_tail
> > +                               + total_dequeued_cbs)
> > +                               & q->sw_ring_wrap_mask);
> > +               atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> > +                               __ATOMIC_RELAXED);
> > +               rsp.val = atom_desc.rsp.val;
> > +               rte_bbdev_log_debug("Resp. desc %p: %x", desc,
> > +                               rsp.val);
> > +
> > +               op->status |= ((rsp.input_err)
> > +                               ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> > +               op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> > +               op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> > +
> > +               if (desc->req.last_desc_in_batch) {
> > +                       (*aq_dequeued)++;
> > +                       desc->req.last_desc_in_batch = 0;
> > +               }
> > +               desc->rsp.val = ACC100_DMA_DESC_TYPE;
> > +               desc->rsp.add_info_0 = 0;
> > +               desc->rsp.add_info_1 = 0;
> > +               total_dequeued_cbs++;
> > +               current_dequeued_cbs++;
> > +               i++;
> > +       }
> > +
> > +       *ref_op = op;
> > +
> > +       return current_dequeued_cbs;
> > +}
> > +
> > +/* Dequeue one decode operation from ACC100 device in CB mode */
> > +static inline int
> > +dequeue_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
> > +               struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
> > +               uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> > +{
> > +       union acc100_dma_desc *desc, atom_desc;
> > +       union acc100_dma_rsp_desc rsp;
> > +       struct rte_bbdev_dec_op *op;
> > +
> > +       desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> > +                       & q->sw_ring_wrap_mask);
> > +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> > +                       __ATOMIC_RELAXED);
> > +
> > +       /* Check fdone bit */
> > +       if (!(atom_desc.rsp.val & ACC100_FDONE))
> > +               return -1;
> > +
> > +       rsp.val = atom_desc.rsp.val;
> > +       rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
> > +
> > +       /* Dequeue */
> > +       op = desc->req.op_addr;
> > +
> > +       /* Clearing status, it will be set based on response */
> > +       op->status = 0;
> > +       op->status |= ((rsp.input_err)
> > +                       ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> > +       op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> > +       op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> > +       if (op->status != 0)
> > +               q_data->queue_stats.dequeue_err_count++;
> > +
> > +       /* CRC invalid if error exists */
> > +       if (!op->status)
> > +               op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
> > +       op->turbo_dec.iter_count = (uint8_t) rsp.iter_cnt / 2;
> > +       /* Check if this is the last desc in batch (Atomic Queue) */
> > +       if (desc->req.last_desc_in_batch) {
> > +               (*aq_dequeued)++;
> > +               desc->req.last_desc_in_batch = 0;
> > +       }
> > +       desc->rsp.val = ACC100_DMA_DESC_TYPE;
> > +       desc->rsp.add_info_0 = 0;
> > +       desc->rsp.add_info_1 = 0;
> > +       *ref_op = op;
> > +
> > +       /* One CB (op) was successfully dequeued */
> > +       return 1;
> > +}
> > +
> > +/* Dequeue one LDPC decode operation from ACC100 device in CB mode */
> > +static inline int
> > +dequeue_ldpc_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
> > +               struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
> > +               uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> > +{
> > +       union acc100_dma_desc *desc, atom_desc;
> > +       union acc100_dma_rsp_desc rsp;
> > +       struct rte_bbdev_dec_op *op;
> > +
> > +       desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> > +                       & q->sw_ring_wrap_mask);
> > +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> > +                       __ATOMIC_RELAXED);
> > +
> > +       /* Check fdone bit */
> > +       if (!(atom_desc.rsp.val & ACC100_FDONE))
> > +               return -1;
> > +
> > +       rsp.val = atom_desc.rsp.val;
> > +
> > +       /* Dequeue */
> > +       op = desc->req.op_addr;
> > +
> > +       /* Clearing status, it will be set based on response */
> > +       op->status = 0;
> > +       op->status |= rsp.input_err << RTE_BBDEV_DATA_ERROR;
> > +       op->status |= rsp.dma_err << RTE_BBDEV_DRV_ERROR;
> > +       op->status |= rsp.fcw_err << RTE_BBDEV_DRV_ERROR;
> > +       if (op->status != 0)
> > +               q_data->queue_stats.dequeue_err_count++;
> > +
> > +       op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
> > +       if (op->ldpc_dec.hard_output.length > 0 && !rsp.synd_ok)
> > +               op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
> > +       op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
> > +
> > +       /* Check if this is the last desc in batch (Atomic Queue) */
> > +       if (desc->req.last_desc_in_batch) {
> > +               (*aq_dequeued)++;
> > +               desc->req.last_desc_in_batch = 0;
> > +       }
> > +
> > +       desc->rsp.val = ACC100_DMA_DESC_TYPE;
> > +       desc->rsp.add_info_0 = 0;
> > +       desc->rsp.add_info_1 = 0;
> > +
> > +       *ref_op = op;
> > +
> > +       /* One CB (op) was successfully dequeued */
> > +       return 1;
> > +}
> > +
> > +/* Dequeue one decode operation from ACC100 device in TB mode. */
> > +static inline int
> > +dequeue_dec_one_op_tb(struct acc100_queue *q,
> > +               struct rte_bbdev_dec_op **ref_op,
> > +               uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> > +{
> > +       union acc100_dma_desc *desc, *last_desc, atom_desc;
> > +       union acc100_dma_rsp_desc rsp;
> > +       struct rte_bbdev_dec_op *op;
> > +       uint8_t cbs_in_tb = 1, cb_idx = 0;
> > +
> > +       desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> > +                       & q->sw_ring_wrap_mask);
> > +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> > +                       __ATOMIC_RELAXED);
> > +
> > +       /* Check fdone bit */
> > +       if (!(atom_desc.rsp.val & ACC100_FDONE))
> > +               return -1;
> > +
> > +       /* Dequeue */
> > +       op = desc->req.op_addr;
> > +
> > +       /* Get number of CBs in dequeued TB */
> > +       cbs_in_tb = desc->req.cbs_in_tb;
> > +       /* Get last CB */
> > +       last_desc = q->ring_addr + ((q->sw_ring_tail
> > +                       + dequeued_cbs + cbs_in_tb - 1)
> > +                       & q->sw_ring_wrap_mask);
> > +       /* Check if last CB in TB is ready to dequeue (and thus
> > +        * the whole TB) - checking sdone bit. If not return.
> > +        */
> > +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
> > +                       __ATOMIC_RELAXED);
> > +       if (!(atom_desc.rsp.val & ACC100_SDONE))
> > +               return -1;
> > +
> > +       /* Clearing status, it will be set based on response */
> > +       op->status = 0;
> > +
> > +       /* Read remaining CBs, if any */
> > +       while (cb_idx < cbs_in_tb) {
> > +               desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> > +                               & q->sw_ring_wrap_mask);
> > +               atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> > +                               __ATOMIC_RELAXED);
> > +               rsp.val = atom_desc.rsp.val;
> > +               rte_bbdev_log_debug("Resp. desc %p: %x", desc,
> > +                               rsp.val);
> > +
> > +               op->status |= ((rsp.input_err)
> > +                               ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> > +               op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> > +               op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> > +
> > +               /* CRC invalid if error exists */
> > +               if (!op->status)
> > +                       op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
> > +               op->turbo_dec.iter_count = RTE_MAX((uint8_t) rsp.iter_cnt,
> > +                               op->turbo_dec.iter_count);
> > +
> > +               /* Check if this is the last desc in batch (Atomic Queue) */
> > +               if (desc->req.last_desc_in_batch) {
> > +                       (*aq_dequeued)++;
> > +                       desc->req.last_desc_in_batch = 0;
> > +               }
> > +               desc->rsp.val = ACC100_DMA_DESC_TYPE;
> > +               desc->rsp.add_info_0 = 0;
> > +               desc->rsp.add_info_1 = 0;
> > +               dequeued_cbs++;
> > +               cb_idx++;
> > +       }
> > +
> > +       *ref_op = op;
> > +
> > +       return cb_idx;
> > +}
> > +
> > +/* Dequeue LDPC encode operations from ACC100 device. */
> > +static uint16_t
> > +acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
> > +               struct rte_bbdev_enc_op **ops, uint16_t num)
> > +{
> > +       struct acc100_queue *q = q_data->queue_private;
> > +       uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
> > +       uint32_t aq_dequeued = 0;
> > +       uint16_t dequeue_num, i, dequeued_cbs = 0, dequeued_descs = 0;
> > +       int ret;
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +       if (unlikely(ops == NULL || q == NULL))
> > +               return 0;
> > +#endif
> > +
> > +       dequeue_num = (avail < num) ? avail : num;
> > +
> > +       for (i = 0; i < dequeue_num; i++) {
> > +               ret = dequeue_enc_one_op_cb(q, &ops[dequeued_cbs],
> > +                               dequeued_descs, &aq_dequeued);
> > +               if (ret < 0)
> > +                       break;
> > +               dequeued_cbs += ret;
> > +               dequeued_descs++;
> > +               if (dequeued_cbs >= num)
> > +                       break;
> > +       }
> > +
> > +       q->aq_dequeued += aq_dequeued;
> > +       q->sw_ring_tail += dequeued_descs;
> > +
> > +       /* Update dequeue stats */
> > +       q_data->queue_stats.dequeued_count += dequeued_cbs;
> > +
> > +       return dequeued_cbs;
> > +}
> > +
> > +/* Dequeue LDPC decode operations from ACC100 device. */
> > +static uint16_t
> > +acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
> > +               struct rte_bbdev_dec_op **ops, uint16_t num)
> > +{
> > +       struct acc100_queue *q = q_data->queue_private;
> > +       uint16_t dequeue_num;
> > +       uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
> > +       uint32_t aq_dequeued = 0;
> > +       uint16_t i;
> > +       uint16_t dequeued_cbs = 0;
> > +       struct rte_bbdev_dec_op *op;
> > +       int ret;
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +       if (unlikely(ops == NULL || q == NULL))
> > +               return 0;
> > +#endif
> > +
> > +       dequeue_num = (avail < num) ? avail : num;
> > +
> > +       for (i = 0; i < dequeue_num; ++i) {
> > +               op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> > +                       & q->sw_ring_wrap_mask))->req.op_addr;
> > +               if (op->ldpc_dec.code_block_mode == 0)
> > +                       ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
> > +                                       &aq_dequeued);
> > +               else
> > +                       ret = dequeue_ldpc_dec_one_op_cb(
> > +                                       q_data, q, &ops[i], dequeued_cbs,
> > +                                       &aq_dequeued);
> > +
> > +               if (ret < 0)
> > +                       break;
> > +               dequeued_cbs += ret;
> > +       }
> > +
> > +       q->aq_dequeued += aq_dequeued;
> > +       q->sw_ring_tail += dequeued_cbs;
> > +
> > +       /* Update dequeue stats */
> > +       q_data->queue_stats.dequeued_count += i;
> > +
> > +       return i;
> > +}
> > +
> >  /* Initialization Function */
> >  static void
> >  acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
> > @@ -703,6 +2321,10 @@
> >          struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
> >
> >          dev->dev_ops = &acc100_bbdev_ops;
> > +       dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
> > +       dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
> > +       dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
> > +       dev->dequeue_ldpc_dec_ops = acc100_dequeue_ldpc_dec;
> >
> >          ((struct acc100_device *) dev->data->dev_private)->pf_device =
> >                          !strcmp(drv->driver.name,
> > @@ -815,4 +2437,3 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
> >  RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
> >  RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
> >  RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
> > -
> > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
> > index 0e2b79c..78686c1 100644
> > --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> > +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> > @@ -88,6 +88,8 @@
> >  #define TMPL_PRI_3      0x0f0e0d0c
> >  #define QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
> >  #define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
> > +#define ACC100_FDONE    0x80000000
> > +#define ACC100_SDONE    0x40000000
> >
> >  #define ACC100_NUM_TMPL  32
> >  #define VF_OFFSET_QOS 16 /* offset in Memory Space specific to QoS Mon */
> > @@ -398,6 +400,7 @@ struct __rte_packed acc100_dma_req_desc {
> >  union acc100_dma_desc {
> >          struct acc100_dma_req_desc req;
> >          union acc100_dma_rsp_desc rsp;
> > +       uint64_t atom_hdr;
> >  };
> >
> >
> > --
> > 1.8.3.1

^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v3 01/11] drivers/baseband: add PMD for ACC100
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 01/11] drivers/baseband: add PMD for ACC100 Nicolas Chautru
@ 2020-08-29  9:44   ` Xu, Rosen
  2020-09-04 16:44     ` Chautru, Nicolas
  0 siblings, 1 reply; 213+ messages in thread
From: Xu, Rosen @ 2020-08-29  9:44 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, akhil.goyal
  Cc: Richardson, Bruce, Chautru, Nicolas, Xu, Rosen

Hi,

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Nicolas Chautru
> Sent: Wednesday, August 19, 2020 8:25
> To: dev@dpdk.org; akhil.goyal@nxp.com
> Cc: Richardson, Bruce <bruce.richardson@intel.com>; Chautru, Nicolas
> <nicolas.chautru@intel.com>
> Subject: [dpdk-dev] [PATCH v3 01/11] drivers/baseband: add PMD for
> ACC100
> 
> Add stubs for the ACC100 PMD
> 
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
>  config/common_base                                 |   4 +
>  doc/guides/bbdevs/acc100.rst                       | 233 +++++++++++++++++++++
>  doc/guides/bbdevs/index.rst                        |   1 +
>  doc/guides/rel_notes/release_20_11.rst             |   6 +
>  drivers/baseband/Makefile                          |   2 +
>  drivers/baseband/acc100/Makefile                   |  25 +++
>  drivers/baseband/acc100/meson.build                |   6 +
>  drivers/baseband/acc100/rte_acc100_pmd.c           | 175 ++++++++++++++++
>  drivers/baseband/acc100/rte_acc100_pmd.h           |  37 ++++
>  .../acc100/rte_pmd_bbdev_acc100_version.map        |   3 +
>  drivers/baseband/meson.build                       |   2 +-
>  mk/rte.app.mk                                      |   1 +
>  12 files changed, 494 insertions(+), 1 deletion(-)  create mode 100644
> doc/guides/bbdevs/acc100.rst  create mode 100644
> drivers/baseband/acc100/Makefile  create mode 100644
> drivers/baseband/acc100/meson.build
>  create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.c
>  create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.h
>  create mode 100644
> drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> 
> diff --git a/config/common_base b/config/common_base index
> fbf0ee7..218ab16 100644
> --- a/config/common_base
> +++ b/config/common_base
> @@ -584,6 +584,10 @@ CONFIG_RTE_LIBRTE_PMD_BBDEV_NULL=y
>  #
>  CONFIG_RTE_LIBRTE_PMD_BBDEV_TURBO_SW=y
> 
> +# Compile PMD for ACC100 bbdev device
> +#
> +CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100=y
> +
>  #
>  # Compile PMD for Intel FPGA LTE FEC bbdev device  # diff --git
> a/doc/guides/bbdevs/acc100.rst b/doc/guides/bbdevs/acc100.rst new file
> mode 100644 index 0000000..f87ee09
> --- /dev/null
> +++ b/doc/guides/bbdevs/acc100.rst
> @@ -0,0 +1,233 @@
> +..  SPDX-License-Identifier: BSD-3-Clause
> +    Copyright(c) 2020 Intel Corporation
> +
> +Intel(R) ACC100 5G/4G FEC Poll Mode Driver
> +==========================================
> +
> +The BBDEV ACC100 5G/4G FEC poll mode driver (PMD) supports an
> +implementation of a VRAN FEC wireless acceleration function.
> +This device is also known as Mount Bryce.
> +
> +Features
> +--------
> +
> +ACC100 5G/4G FEC PMD supports the following features:
> +
> +- LDPC Encode in the DL (5GNR)
> +- LDPC Decode in the UL (5GNR)
> +- Turbo Encode in the DL (4G)
> +- Turbo Decode in the UL (4G)
> +- 16 VFs per PF (physical device)
> +- Maximum of 128 queues per VF
> +- PCIe Gen-3 x16 Interface
> +- MSI
> +- SR-IOV
> +
> +ACC100 5G/4G FEC PMD supports the following BBDEV capabilities:
> +
> +* For the LDPC encode operation:
> +   - ``RTE_BBDEV_LDPC_CRC_24B_ATTACH`` :  set to attach CRC24B to CB(s)
> +   - ``RTE_BBDEV_LDPC_RATE_MATCH`` :  if set then do not do Rate Match bypass
> +   - ``RTE_BBDEV_LDPC_INTERLEAVER_BYPASS`` :  if set then bypass interleaver
> +
> +* For the LDPC decode operation:
> +   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK`` :  check CRC24B from CB(s)
> +   - ``RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE`` :  disable early termination
> +   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP`` :  drops CRC24B bits appended while decoding
> +   - ``RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE`` :  provides an input for HARQ combining
> +   - ``RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE`` :  provides an output for HARQ combining
> +   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE`` :  HARQ memory input is internal
> +   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE`` :  HARQ memory output is internal
> +   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK`` :  loopback data to/from HARQ memory
> +   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_FILLERS`` :  HARQ memory includes the filler bits
> +   - ``RTE_BBDEV_LDPC_DEC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
> +   - ``RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION`` :  supports compression of the HARQ input/output
> +   - ``RTE_BBDEV_LDPC_LLR_COMPRESSION`` :  supports LLR input compression
> +
> +* For the turbo encode operation:
> +   - ``RTE_BBDEV_TURBO_CRC_24B_ATTACH`` :  set to attach CRC24B to CB(s)
> +   - ``RTE_BBDEV_TURBO_RATE_MATCH`` :  if set then do not do Rate Match bypass
> +   - ``RTE_BBDEV_TURBO_ENC_INTERRUPTS`` :  set for encoder dequeue interrupts
> +   - ``RTE_BBDEV_TURBO_RV_INDEX_BYPASS`` :  set to bypass RV index
> +   - ``RTE_BBDEV_TURBO_ENC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
> +
> +* For the turbo decode operation:
> +   - ``RTE_BBDEV_TURBO_CRC_TYPE_24B`` :  check CRC24B from CB(s)
> +   - ``RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE`` :  perform subblock de-interleave
> +   - ``RTE_BBDEV_TURBO_DEC_INTERRUPTS`` :  set for decoder dequeue interrupts
> +   - ``RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN`` :  set if negative LLR decoder i/p is supported
> +   - ``RTE_BBDEV_TURBO_POS_LLR_1_BIT_IN`` :  set if positive LLR decoder i/p is supported
> +   - ``RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP`` :  keep CRC24B bits appended while decoding
> +   - ``RTE_BBDEV_TURBO_EARLY_TERMINATION`` :  set the early termination feature
> +   - ``RTE_BBDEV_TURBO_DEC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
> +   - ``RTE_BBDEV_TURBO_HALF_ITERATION_EVEN`` :  set half iteration granularity
> +
> +Installation
> +------------
> +
> +Section 3 of the DPDK manual provides instructions on installing and
> +compiling DPDK. The default set of bbdev compile flags may be found in
> +config/common_base, where for example the flag to build the ACC100
> +5G/4G FEC device, ``CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100``, is already set.
> +
> +DPDK requires hugepages to be configured as detailed in section 2 of the DPDK manual.
> +The bbdev test application has been tested with a configuration of 40 x 1GB hugepages.
> +The hugepage configuration of a server may be examined using:
> +
> +.. code-block:: console
> +
> +   grep Huge* /proc/meminfo
> +
> +
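[Editorial note: as a quick sanity check on the "40 x 1GB" figure above, the total
hugepage memory can be computed from the page count and the ``Hugepagesize`` value
(in kB) reported by /proc/meminfo. A minimal shell sketch; the helper name is
illustrative only and not part of DPDK:]

```shell
# Illustrative helper (not part of DPDK): total hugepage memory in MB
# from a page count and a page size in kB, as reported by /proc/meminfo.
hugepage_total_mb() {
    local count=$1 size_kb=$2
    echo $(( count * size_kb / 1024 ))
}

# The 40 x 1GB configuration mentioned above (1GB = 1048576 kB):
hugepage_total_mb 40 1048576
```

With 40 pages of 1048576 kB this prints 40960, i.e. 40 GB of hugepage memory.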
> +Initialization
> +--------------
> +
> +When the device first powers up, its PCI Physical Functions (PF) can be
> +listed through this command:
> +
> +.. code-block:: console
> +
> +  sudo lspci -vd8086:0d5c
> +
> +The physical and virtual functions are compatible with Linux UIO drivers:
> +``vfio`` and ``igb_uio``. However, in order to work the ACC100 5G/4G FEC
> +device first needs to be bound to one of these Linux drivers through DPDK.
> +
> +
> +Bind PF UIO driver(s)
> +~~~~~~~~~~~~~~~~~~~~~
> +
> +Install the DPDK igb_uio driver, bind it with the PF PCI device ID and
> +use ``lspci`` to confirm the PF device is in use by the ``igb_uio`` DPDK UIO driver.
> +
> +The igb_uio driver may be bound to the PF PCI device using one of three methods:
> +
> +
> +1. PCI functions (physical or virtual, depending on the use case) can
> +be bound to the UIO driver by repeating this command for every function.
> +
> +.. code-block:: console
> +
> +  cd <dpdk-top-level-directory>
> +  insmod ./build/kmod/igb_uio.ko
> +  echo "8086 0d5c" > /sys/bus/pci/drivers/igb_uio/new_id
> +  lspci -vd8086:0d5c
> +
> +
> +2. Another way to bind PF with DPDK UIO driver is by using the
> +``dpdk-devbind.py`` tool
> +
> +.. code-block:: console
> +
> +  cd <dpdk-top-level-directory>
> +  ./usertools/dpdk-devbind.py -b igb_uio 0000:06:00.0
> +
> +where the PCI device ID (example: 0000:06:00.0) is obtained using
> +``lspci -vd8086:0d5c``.
> +
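[Editorial note: when scripting the bind step, the full address passed to
``dpdk-devbind.py`` can be derived from the ``lspci`` output. A hedged sketch;
the sample output line below is an assumption for illustration, not captured
from real hardware (only the 8086:0d5c IDs come from the text above):]

```shell
# Illustrative only: derive the full PCI address (domain prepended) from a
# sample lspci line for the ACC100 PF. The sample line is an assumption.
sample_line='06:00.0 Processing accelerators: Intel Corporation Device 0d5c'
bdf="${sample_line%% *}"   # take the bus:device.function field
addr="0000:${bdf}"         # prepend the default PCI domain
echo "$addr"               # -> 0000:06:00.0
```

The resulting ``$addr`` is what would be passed as the device argument to
``dpdk-devbind.py -b igb_uio``.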
> +
> +3. A third way to bind is to use ``dpdk-setup.sh`` tool
> +
> +.. code-block:: console
> +
> +  cd <dpdk-top-level-directory>
> +  ./usertools/dpdk-setup.sh
> +
> +  select 'Bind Ethernet/Crypto/Baseband device to IGB UIO module'
> +  or
> +  select 'Bind Ethernet/Crypto/Baseband device to VFIO module'
> +  depending on driver required
> +  enter PCI device ID
> +  select 'Display current Ethernet/Crypto/Baseband device settings' to confirm binding
> +
> +
> +In the same way the ACC100 5G/4G FEC PF can be bound with vfio, but the
> +vfio driver does not support SR-IOV configuration right out of the box,
> +so it will need to be patched.
> +
> +
> +Enable Virtual Functions
> +~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +Now, it should be visible in the printouts that the PCI PF is under
> +igb_uio control: "``Kernel driver in use: igb_uio``".
> +
> +To show the number of available VFs on the device, read the
> +``sriov_totalvfs`` file:
> +
> +.. code-block:: console
> +
> +  cat /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_totalvfs
> +
> +  where 0000\:<b>\:<d>.<f> is the PCI device ID
> +
> +
> +To enable VFs via igb_uio, echo the number of virtual functions
> +intended to be enabled to the ``max_vfs`` file:
> +
> +.. code-block:: console
> +
> +  echo <num-of-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/max_vfs
> +
> +
> +Afterwards, all VFs must be bound to the appropriate UIO drivers as
> +required, in the same way as was done with the physical function previously.
> +
> +Enabling SR-IOV via vfio driver is pretty much the same, except that
> +the file name is different:
> +
> +.. code-block:: console
> +
> +  echo <num-of-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_numvfs
> +
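[Editorial note: the ``0000:<b>:<d>.<f>`` placeholders above can likewise be
filled in from a variable when scripting VF creation. A hedged sketch; the
address is a placeholder example, and the actual write needs root and real
hardware, so the path is only constructed and echoed here:]

```shell
# Illustrative only: build the sriov_numvfs sysfs path for a given PCI
# address; no write is performed since that requires root and hardware.
pci_addr="0000:06:00.0"
numvfs_path="/sys/bus/pci/devices/${pci_addr}/sriov_numvfs"
echo "$numvfs_path"
```

On a live system the final step would be e.g. ``echo 2 > "$numvfs_path"`` as root.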
> +
> +Configure the VFs through PF
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +The PCI virtual functions must be configured before working or getting
> +assigned to VMs/Containers. The configuration involves allocating the
> +number of hardware queues, priorities, load balance, bandwidth and
> +other settings necessary for the device to perform FEC functions.
> +
> +This configuration needs to be executed at least once after reboot or
> +PCI FLR and can be achieved by using the function ``acc100_configure()``,
> +which sets up the parameters defined in the ``acc100_conf`` structure.
> +
> +Test Application
> +----------------
> +
> +BBDEV provides a test application, ``test-bbdev.py``, and a range of test
> +data for testing the functionality of ACC100 5G/4G FEC encode and
> +decode, depending on the device's capabilities. The test application is
> +located under the app/test-bbdev folder and has the following options:
> +
> +.. code-block:: console
> +
> +  "-p", "--testapp-path": specifies path to the bbdev test app.
> +  "-e", "--eal-params"	: EAL arguments which are passed to the test app.
> +  "-t", "--timeout"	: Timeout in seconds (default=300).
> +  "-c", "--test-cases"	: Defines test cases to run. Run all if not specified.
> +  "-v", "--test-vector"	: Test vector path (default=dpdk_path+/app/test-bbdev/test_vectors/bbdev_null.data).
> +  "-n", "--num-ops"	: Number of operations to process on device (default=32).
> +  "-b", "--burst-size"	: Operations enqueue/dequeue burst size (default=32).
> +  "-s", "--snr"		: SNR in dB used when generating LLRs for bler tests.
> +  "-s", "--iter_max"	: Number of iterations for LDPC decoder.
> +  "-l", "--num-lcores"	: Number of lcores to run (default=16).
> +  "-i", "--init-device" : Initialise PF device with default values.
> +
> +
> +To execute the test application tool using simple decode or encode
> +data, type one of the following:
> +
> +.. code-block:: console
> +
> +  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_dec_default.data
> +  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_enc_default.data
> +
> +
> +The test application ``test-bbdev.py`` supports the ability to
> +configure the PF device with a default set of values, if the "-i" or
> +"--init-device" option is included. The default values are defined in
> +test_bbdev_perf.c.
> +
> +
> +Test Vectors
> +~~~~~~~~~~~~
> +
> +In addition to the simple LDPC decoder and LDPC encoder tests, bbdev
> +also provides a range of additional tests under the test_vectors
> +folder, which may be useful. The results of these tests will depend on
> +the ACC100 5G/4G FEC capabilities, which may cause some test cases to
> +be skipped, but no failure should be reported.
> diff --git a/doc/guides/bbdevs/index.rst b/doc/guides/bbdevs/index.rst
> index a8092dd..4445cbd 100644
> --- a/doc/guides/bbdevs/index.rst
> +++ b/doc/guides/bbdevs/index.rst
> @@ -13,3 +13,4 @@ Baseband Device Drivers
>      turbo_sw
>      fpga_lte_fec
>      fpga_5gnr_fec
> +    acc100
> diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
> index df227a1..b3ab614 100644
> --- a/doc/guides/rel_notes/release_20_11.rst
> +++ b/doc/guides/rel_notes/release_20_11.rst
> @@ -55,6 +55,12 @@ New Features
>       Also, make sure to start the actual text at the margin.
>       =======================================================
> 
> +* **Added Intel ACC100 bbdev PMD.**
> +
> +  Added a new ``acc100`` bbdev driver for the Intel\ |reg| ACC100 accelerator
> +  also known as Mount Bryce.  See the :doc:`../bbdevs/acc100` BBDEV guide
> +  for more details on this new driver.
> +
> 
>  Removed Items
>  -------------
> diff --git a/drivers/baseband/Makefile b/drivers/baseband/Makefile
> index dcc0969..b640294 100644
> --- a/drivers/baseband/Makefile
> +++ b/drivers/baseband/Makefile
> @@ -10,6 +10,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_NULL) += null
>  DEPDIRS-null = $(core-libs)
>  DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_TURBO_SW) += turbo_sw
>  DEPDIRS-turbo_sw = $(core-libs)
> +DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100) += acc100
> +DEPDIRS-acc100 = $(core-libs)
>  DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_LTE_FEC) += fpga_lte_fec
>  DEPDIRS-fpga_lte_fec = $(core-libs)
>  DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC) += fpga_5gnr_fec
> diff --git a/drivers/baseband/acc100/Makefile b/drivers/baseband/acc100/Makefile
> new file mode 100644
> index 0000000..c79e487
> --- /dev/null
> +++ b/drivers/baseband/acc100/Makefile
> @@ -0,0 +1,25 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(c) 2020 Intel Corporation
> +
> +include $(RTE_SDK)/mk/rte.vars.mk
> +
> +# library name
> +LIB = librte_pmd_bbdev_acc100.a
> +
> +# build flags
> +CFLAGS += -O3
> +CFLAGS += $(WERROR_FLAGS)
> +LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring -lrte_cfgfile
> +LDLIBS += -lrte_bbdev
> +LDLIBS += -lrte_pci -lrte_bus_pci
> +
> +# versioning export map
> +EXPORT_MAP := rte_pmd_bbdev_acc100_version.map
> +
> +# library version
> +LIBABIVER := 1
> +
> +# library source files
> +SRCS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100) += rte_acc100_pmd.c
> +
> +include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/drivers/baseband/acc100/meson.build
> b/drivers/baseband/acc100/meson.build
> new file mode 100644
> index 0000000..8afafc2
> --- /dev/null
> +++ b/drivers/baseband/acc100/meson.build
> @@ -0,0 +1,6 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(c) 2020 Intel Corporation
> +
> +deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci']
> +
> +sources = files('rte_acc100_pmd.c')
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> b/drivers/baseband/acc100/rte_acc100_pmd.c
> new file mode 100644
> index 0000000..1b4cd13
> --- /dev/null
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> @@ -0,0 +1,175 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation
> + */
> +
> +#include <unistd.h>
> +
> +#include <rte_common.h>
> +#include <rte_log.h>
> +#include <rte_dev.h>
> +#include <rte_malloc.h>
> +#include <rte_mempool.h>
> +#include <rte_byteorder.h>
> +#include <rte_errno.h>
> +#include <rte_branch_prediction.h>
> +#include <rte_hexdump.h>
> +#include <rte_pci.h>
> +#include <rte_bus_pci.h>
> +
> +#include <rte_bbdev.h>
> +#include <rte_bbdev_pmd.h>
> +#include "rte_acc100_pmd.h"
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, DEBUG);
> +#else
> +RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
> +#endif
> +
> +/* Free 64MB memory used for software rings */
> +static int
> +acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
> +{
> +	return 0;
> +}
> +
> +static const struct rte_bbdev_ops acc100_bbdev_ops = {
> +	.close = acc100_dev_close,
> +};
> +
> +/* ACC100 PCI PF address map */
> +static struct rte_pci_id pci_id_acc100_pf_map[] = {
> +	{
> +		RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID, RTE_ACC100_PF_DEVICE_ID)
> +	},
> +	{.device_id = 0},
> +};
> +
> +/* ACC100 PCI VF address map */
> +static struct rte_pci_id pci_id_acc100_vf_map[] = {
> +	{
> +		RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID, RTE_ACC100_VF_DEVICE_ID)
> +	},
> +	{.device_id = 0},
> +};
> +
> +/* Initialization Function */
> +static void
> +acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
> +{
> +	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
> +
> +	dev->dev_ops = &acc100_bbdev_ops;
> +
> +	((struct acc100_device *) dev->data->dev_private)->pf_device =
> +			!strcmp(drv->driver.name,
> +					RTE_STR(ACC100PF_DRIVER_NAME));
> +	((struct acc100_device *) dev->data->dev_private)->mmio_base =
> +			pci_dev->mem_resource[0].addr;
> +
> +	rte_bbdev_log_debug("Init device %s [%s] @ vaddr %p paddr %#"PRIx64"",
> +			drv->driver.name, dev->data->name,
> +			(void *)pci_dev->mem_resource[0].addr,
> +			pci_dev->mem_resource[0].phys_addr);
> +}
> +
> +static int acc100_pci_probe(struct rte_pci_driver *pci_drv,
> +	struct rte_pci_device *pci_dev)
> +{
> +	struct rte_bbdev *bbdev = NULL;
> +	char dev_name[RTE_BBDEV_NAME_MAX_LEN];
> +
> +	if (pci_dev == NULL) {
> +		rte_bbdev_log(ERR, "NULL PCI device");
> +		return -EINVAL;
> +	}
> +
> +	rte_pci_device_name(&pci_dev->addr, dev_name, sizeof(dev_name));
> +
> +	/* Allocate memory to be used privately by drivers */
> +	bbdev = rte_bbdev_allocate(pci_dev->device.name);
> +	if (bbdev == NULL)
> +		return -ENODEV;
> +
> +	/* allocate device private memory */
> +	bbdev->data->dev_private = rte_zmalloc_socket(dev_name,
> +			sizeof(struct acc100_device), RTE_CACHE_LINE_SIZE,
> +			pci_dev->device.numa_node);
> +
> +	if (bbdev->data->dev_private == NULL) {
> +		rte_bbdev_log(CRIT,
> +				"Allocate of %zu bytes for device \"%s\" failed",
> +				sizeof(struct acc100_device), dev_name);
> +		rte_bbdev_release(bbdev);
> +		return -ENOMEM;
> +	}
> +
> +	/* Fill HW specific part of device structure */
> +	bbdev->device = &pci_dev->device;
> +	bbdev->intr_handle = &pci_dev->intr_handle;
> +	bbdev->data->socket_id = pci_dev->device.numa_node;
> +
> +	/* Invoke ACC100 device initialization function */
> +	acc100_bbdev_init(bbdev, pci_drv);
> +
> +	rte_bbdev_log_debug("Initialised bbdev %s (id = %u)",
> +			dev_name, bbdev->data->dev_id);
> +	return 0;
> +}
> +
> +static int acc100_pci_remove(struct rte_pci_device *pci_dev)
> +{
> +	struct rte_bbdev *bbdev;
> +	int ret;
> +	uint8_t dev_id;
> +
> +	if (pci_dev == NULL)
> +		return -EINVAL;
> +
> +	/* Find device */
> +	bbdev = rte_bbdev_get_named_dev(pci_dev->device.name);
> +	if (bbdev == NULL) {
> +		rte_bbdev_log(CRIT,
> +				"Couldn't find HW dev \"%s\" to uninitialise it",
> +				pci_dev->device.name);
> +		return -ENODEV;
> +	}
> +	dev_id = bbdev->data->dev_id;
> +
> +	/* free device private memory before close */
> +	rte_free(bbdev->data->dev_private);
> +
> +	/* Close device */
> +	ret = rte_bbdev_close(dev_id);
> +	if (ret < 0)
> +		rte_bbdev_log(ERR,
> +				"Device %i failed to close during uninit: %i",
> +				dev_id, ret);
> +
> +	/* release bbdev from library */
> +	rte_bbdev_release(bbdev);
> +
> +	rte_bbdev_log_debug("Destroyed bbdev = %u", dev_id);
> +
> +	return 0;
> +}
> +
> +static struct rte_pci_driver acc100_pci_pf_driver = {
> +		.probe = acc100_pci_probe,
> +		.remove = acc100_pci_remove,
> +		.id_table = pci_id_acc100_pf_map,
> +		.drv_flags = RTE_PCI_DRV_NEED_MAPPING
> +};
> +
> +static struct rte_pci_driver acc100_pci_vf_driver = {
> +		.probe = acc100_pci_probe,
> +		.remove = acc100_pci_remove,
> +		.id_table = pci_id_acc100_vf_map,
> +		.drv_flags = RTE_PCI_DRV_NEED_MAPPING
> +};
> +
> +RTE_PMD_REGISTER_PCI(ACC100PF_DRIVER_NAME, acc100_pci_pf_driver);
> +RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
> +RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
> +RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);

It seems both PF and VF share the same data for rte_pci_driver;
it's strange to duplicate the code.

> +
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
> new file mode 100644
> index 0000000..6f46df0
> --- /dev/null
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> @@ -0,0 +1,37 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation
> + */
> +
> +#ifndef _RTE_ACC100_PMD_H_
> +#define _RTE_ACC100_PMD_H_
> +
> +/* Helper macro for logging */
> +#define rte_bbdev_log(level, fmt, ...) \
> +	rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \
> +		##__VA_ARGS__)
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +#define rte_bbdev_log_debug(fmt, ...) \
> +		rte_bbdev_log(DEBUG, "acc100_pmd: " fmt, \
> +		##__VA_ARGS__)
> +#else
> +#define rte_bbdev_log_debug(fmt, ...)
> +#endif
> +
> +/* ACC100 PF and VF driver names */
> +#define ACC100PF_DRIVER_NAME           intel_acc100_pf
> +#define ACC100VF_DRIVER_NAME           intel_acc100_vf
> +
> +/* ACC100 PCI vendor & device IDs */
> +#define RTE_ACC100_VENDOR_ID           (0x8086)
> +#define RTE_ACC100_PF_DEVICE_ID        (0x0d5c)
> +#define RTE_ACC100_VF_DEVICE_ID        (0x0d5d)
> +
> +/* Private data structure for each ACC100 device */
> +struct acc100_device {
> +	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
> +	bool pf_device; /**< True if this is a PF ACC100 device */
> +	bool configured; /**< True if this ACC100 device is configured */
> +};
> +
> +#endif /* _RTE_ACC100_PMD_H_ */
> diff --git a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> new file mode 100644
> index 0000000..4a76d1d
> --- /dev/null
> +++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> @@ -0,0 +1,3 @@
> +DPDK_21 {
> +	local: *;
> +};
> diff --git a/drivers/baseband/meson.build b/drivers/baseband/meson.build
> index 415b672..72301ce 100644
> --- a/drivers/baseband/meson.build
> +++ b/drivers/baseband/meson.build
> @@ -5,7 +5,7 @@ if is_windows
>  	subdir_done()
>  endif
> 
> -drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec']
> +drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec', 'acc100']
> 
>  config_flag_fmt = 'RTE_LIBRTE_PMD_BBDEV_@0@'
>  driver_name_fmt = 'rte_pmd_bbdev_@0@'
> diff --git a/mk/rte.app.mk b/mk/rte.app.mk
> index a544259..a77f538 100644
> --- a/mk/rte.app.mk
> +++ b/mk/rte.app.mk
> @@ -254,6 +254,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_NETVSC_PMD)     += -lrte_pmd_netvsc
> 
>  ifeq ($(CONFIG_RTE_LIBRTE_BBDEV),y)
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_NULL)     += -lrte_pmd_bbdev_null
> +_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100)   += -lrte_pmd_bbdev_acc100
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_LTE_FEC) += -lrte_pmd_bbdev_fpga_lte_fec
>  _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC) += -lrte_pmd_bbdev_fpga_5gnr_fec
> 
> --
> 1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v3 02/11] baseband/acc100: add register definition file
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 02/11] baseband/acc100: add register definition file Nicolas Chautru
@ 2020-08-29  9:55   ` Xu, Rosen
  2020-08-29 17:39     ` Chautru, Nicolas
  0 siblings, 1 reply; 213+ messages in thread
From: Xu, Rosen @ 2020-08-29  9:55 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, akhil.goyal; +Cc: Richardson, Bruce, Chautru, Nicolas

Hi,

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Nicolas Chautru
> Sent: Wednesday, August 19, 2020 8:25
> To: dev@dpdk.org; akhil.goyal@nxp.com
> Cc: Richardson, Bruce <bruce.richardson@intel.com>; Chautru, Nicolas
> <nicolas.chautru@intel.com>
> Subject: [dpdk-dev] [PATCH v3 02/11] baseband/acc100: add register
> definition file
> 
> Add in the list of registers for the device and related
> HW specs definitions.
> 
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
>  drivers/baseband/acc100/acc100_pf_enum.h | 1068 ++++++++++++++++++++++++++++++
>  drivers/baseband/acc100/acc100_vf_enum.h |   73 ++
>  drivers/baseband/acc100/rte_acc100_pmd.h |  490 ++++++++++++++
>  3 files changed, 1631 insertions(+)
>  create mode 100644 drivers/baseband/acc100/acc100_pf_enum.h
>  create mode 100644 drivers/baseband/acc100/acc100_vf_enum.h
> 
> diff --git a/drivers/baseband/acc100/acc100_pf_enum.h b/drivers/baseband/acc100/acc100_pf_enum.h
> new file mode 100644
> index 0000000..a1ee416
> --- /dev/null
> +++ b/drivers/baseband/acc100/acc100_pf_enum.h
> @@ -0,0 +1,1068 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2017 Intel Corporation
> + */
> +
> +#ifndef ACC100_PF_ENUM_H
> +#define ACC100_PF_ENUM_H
> +
> +/*
> + * ACC100 Register mapping on PF BAR0
> + * This is automatically generated from RDL; the format may change with a
> + * new RDL release.
> + * Variable names are kept as is.
> + */
> +enum {
> +	HWPfQmgrEgressQueuesTemplate          =  0x0007FE00,
> +	HWPfQmgrIngressAq                     =  0x00080000,
> +	HWPfQmgrArbQAvail                     =  0x00A00010,
> +	HWPfQmgrArbQBlock                     =  0x00A00014,
> +	HWPfQmgrAqueueDropNotifEn             =  0x00A00024,
> +	HWPfQmgrAqueueDisableNotifEn          =  0x00A00028,
> +	HWPfQmgrSoftReset                     =  0x00A00038,
> +	HWPfQmgrInitStatus                    =  0x00A0003C,
> +	HWPfQmgrAramWatchdogCount             =  0x00A00040,
> +	HWPfQmgrAramWatchdogCounterEn         =  0x00A00044,
> +	HWPfQmgrAxiWatchdogCount              =  0x00A00048,
> +	HWPfQmgrAxiWatchdogCounterEn          =  0x00A0004C,
> +	HWPfQmgrProcessWatchdogCount          =  0x00A00050,
> +	HWPfQmgrProcessWatchdogCounterEn      =  0x00A00054,
> +	HWPfQmgrProcessUl4GWatchdogCounter    =  0x00A00058,
> +	HWPfQmgrProcessDl4GWatchdogCounter    =  0x00A0005C,
> +	HWPfQmgrProcessUl5GWatchdogCounter    =  0x00A00060,
> +	HWPfQmgrProcessDl5GWatchdogCounter    =  0x00A00064,
> +	HWPfQmgrProcessMldWatchdogCounter     =  0x00A00068,
> +	HWPfQmgrMsiOverflowUpperVf            =  0x00A00070,
> +	HWPfQmgrMsiOverflowLowerVf            =  0x00A00074,
> +	HWPfQmgrMsiWatchdogOverflow           =  0x00A00078,
> +	HWPfQmgrMsiOverflowEnable             =  0x00A0007C,
> +	HWPfQmgrDebugAqPointerMemGrp          =  0x00A00100,
> +	HWPfQmgrDebugOutputArbQFifoGrp        =  0x00A00140,
> +	HWPfQmgrDebugMsiFifoGrp               =  0x00A00180,
> +	HWPfQmgrDebugAxiWdTimeoutMsiFifo      =  0x00A001C0,
> +	HWPfQmgrDebugProcessWdTimeoutMsiFifo  =  0x00A001C4,
> +	HWPfQmgrDepthLog2Grp                  =  0x00A00200,
> +	HWPfQmgrTholdGrp                      =  0x00A00300,
> +	HWPfQmgrGrpTmplateReg0Indx            =  0x00A00600,
> +	HWPfQmgrGrpTmplateReg1Indx            =  0x00A00680,
> +	HWPfQmgrGrpTmplateReg2indx            =  0x00A00700,
> +	HWPfQmgrGrpTmplateReg3Indx            =  0x00A00780,
> +	HWPfQmgrGrpTmplateReg4Indx            =  0x00A00800,
> +	HWPfQmgrVfBaseAddr                    =  0x00A01000,
> +	HWPfQmgrUl4GWeightRrVf                =  0x00A02000,
> +	HWPfQmgrDl4GWeightRrVf                =  0x00A02100,
> +	HWPfQmgrUl5GWeightRrVf                =  0x00A02200,
> +	HWPfQmgrDl5GWeightRrVf                =  0x00A02300,
> +	HWPfQmgrMldWeightRrVf                 =  0x00A02400,
> +	HWPfQmgrArbQDepthGrp                  =  0x00A02F00,
> +	HWPfQmgrGrpFunction0                  =  0x00A02F40,
> +	HWPfQmgrGrpFunction1                  =  0x00A02F44,
> +	HWPfQmgrGrpPriority                   =  0x00A02F48,
> +	HWPfQmgrWeightSync                    =  0x00A03000,
> +	HWPfQmgrAqEnableVf                    =  0x00A10000,
> +	HWPfQmgrAqResetVf                     =  0x00A20000,
> +	HWPfQmgrRingSizeVf                    =  0x00A20004,
> +	HWPfQmgrGrpDepthLog20Vf               =  0x00A20008,
> +	HWPfQmgrGrpDepthLog21Vf               =  0x00A2000C,
> +	HWPfQmgrGrpFunction0Vf                =  0x00A20010,
> +	HWPfQmgrGrpFunction1Vf                =  0x00A20014,
> +	HWPfDmaConfig0Reg                     =  0x00B80000,
> +	HWPfDmaConfig1Reg                     =  0x00B80004,
> +	HWPfDmaQmgrAddrReg                    =  0x00B80008,
> +	HWPfDmaSoftResetReg                   =  0x00B8000C,
> +	HWPfDmaAxcacheReg                     =  0x00B80010,
> +	HWPfDmaVersionReg                     =  0x00B80014,
> +	HWPfDmaFrameThreshold                 =  0x00B80018,
> +	HWPfDmaTimestampLo                    =  0x00B8001C,
> +	HWPfDmaTimestampHi                    =  0x00B80020,
> +	HWPfDmaAxiStatus                      =  0x00B80028,
> +	HWPfDmaAxiControl                     =  0x00B8002C,
> +	HWPfDmaNoQmgr                         =  0x00B80030,
> +	HWPfDmaQosScale                       =  0x00B80034,
> +	HWPfDmaQmanen                         =  0x00B80040,
> +	HWPfDmaQmgrQosBase                    =  0x00B80060,
> +	HWPfDmaFecClkGatingEnable             =  0x00B80080,
> +	HWPfDmaPmEnable                       =  0x00B80084,
> +	HWPfDmaQosEnable                      =  0x00B80088,
> +	HWPfDmaHarqWeightedRrFrameThreshold   =  0x00B800B0,
> +	HWPfDmaDataSmallWeightedRrFrameThresh  = 0x00B800B4,
> +	HWPfDmaDataLargeWeightedRrFrameThresh  = 0x00B800B8,
> +	HWPfDmaInboundCbMaxSize               =  0x00B800BC,
> +	HWPfDmaInboundDrainDataSize           =  0x00B800C0,
> +	HWPfDmaVfDdrBaseRw                    =  0x00B80400,
> +	HWPfDmaCmplTmOutCnt                   =  0x00B80800,
> +	HWPfDmaProcTmOutCnt                   =  0x00B80804,
> +	HWPfDmaStatusRrespBresp               =  0x00B80810,
> +	HWPfDmaCfgRrespBresp                  =  0x00B80814,
> +	HWPfDmaStatusMemParErr                =  0x00B80818,
> +	HWPfDmaCfgMemParErrEn                 =  0x00B8081C,
> +	HWPfDmaStatusDmaHwErr                 =  0x00B80820,
> +	HWPfDmaCfgDmaHwErrEn                  =  0x00B80824,
> +	HWPfDmaStatusFecCoreErr               =  0x00B80828,
> +	HWPfDmaCfgFecCoreErrEn                =  0x00B8082C,
> +	HWPfDmaStatusFcwDescrErr              =  0x00B80830,
> +	HWPfDmaCfgFcwDescrErrEn               =  0x00B80834,
> +	HWPfDmaStatusBlockTransmit            =  0x00B80838,
> +	HWPfDmaBlockOnErrEn                   =  0x00B8083C,
> +	HWPfDmaStatusFlushDma                 =  0x00B80840,
> +	HWPfDmaFlushDmaOnErrEn                =  0x00B80844,
> +	HWPfDmaStatusSdoneFifoFull            =  0x00B80848,
> +	HWPfDmaStatusDescriptorErrLoVf        =  0x00B8084C,
> +	HWPfDmaStatusDescriptorErrHiVf        =  0x00B80850,
> +	HWPfDmaStatusFcwErrLoVf               =  0x00B80854,
> +	HWPfDmaStatusFcwErrHiVf               =  0x00B80858,
> +	HWPfDmaStatusDataErrLoVf              =  0x00B8085C,
> +	HWPfDmaStatusDataErrHiVf              =  0x00B80860,
> +	HWPfDmaCfgMsiEnSoftwareErr            =  0x00B80864,
> +	HWPfDmaDescriptorSignatuture          =  0x00B80868,
> +	HWPfDmaFcwSignature                   =  0x00B8086C,
> +	HWPfDmaErrorDetectionEn               =  0x00B80870,
> +	HWPfDmaErrCntrlFifoDebug              =  0x00B8087C,
> +	HWPfDmaStatusToutData                 =  0x00B80880,
> +	HWPfDmaStatusToutDesc                 =  0x00B80884,
> +	HWPfDmaStatusToutUnexpData            =  0x00B80888,
> +	HWPfDmaStatusToutUnexpDesc            =  0x00B8088C,
> +	HWPfDmaStatusToutProcess              =  0x00B80890,
> +	HWPfDmaConfigCtoutOutDataEn           =  0x00B808A0,
> +	HWPfDmaConfigCtoutOutDescrEn          =  0x00B808A4,
> +	HWPfDmaConfigUnexpComplDataEn         =  0x00B808A8,
> +	HWPfDmaConfigUnexpComplDescrEn        =  0x00B808AC,
> +	HWPfDmaConfigPtoutOutEn               =  0x00B808B0,
> +	HWPfDmaFec5GulDescBaseLoRegVf         =  0x00B88020,
> +	HWPfDmaFec5GulDescBaseHiRegVf         =  0x00B88024,
> +	HWPfDmaFec5GulRespPtrLoRegVf          =  0x00B88028,
> +	HWPfDmaFec5GulRespPtrHiRegVf          =  0x00B8802C,
> +	HWPfDmaFec5GdlDescBaseLoRegVf         =  0x00B88040,
> +	HWPfDmaFec5GdlDescBaseHiRegVf         =  0x00B88044,
> +	HWPfDmaFec5GdlRespPtrLoRegVf          =  0x00B88048,
> +	HWPfDmaFec5GdlRespPtrHiRegVf          =  0x00B8804C,
> +	HWPfDmaFec4GulDescBaseLoRegVf         =  0x00B88060,
> +	HWPfDmaFec4GulDescBaseHiRegVf         =  0x00B88064,
> +	HWPfDmaFec4GulRespPtrLoRegVf          =  0x00B88068,
> +	HWPfDmaFec4GulRespPtrHiRegVf          =  0x00B8806C,
> +	HWPfDmaFec4GdlDescBaseLoRegVf         =  0x00B88080,
> +	HWPfDmaFec4GdlDescBaseHiRegVf         =  0x00B88084,
> +	HWPfDmaFec4GdlRespPtrLoRegVf          =  0x00B88088,
> +	HWPfDmaFec4GdlRespPtrHiRegVf          =  0x00B8808C,
> +	HWPfDmaVfDdrBaseRangeRo               =  0x00B880A0,
> +	HWPfQosmonACntrlReg                   =  0x00B90000,
> +	HWPfQosmonAEvalOverflow0              =  0x00B90008,
> +	HWPfQosmonAEvalOverflow1              =  0x00B9000C,
> +	HWPfQosmonADivTerm                    =  0x00B90010,
> +	HWPfQosmonATickTerm                   =  0x00B90014,
> +	HWPfQosmonAEvalTerm                   =  0x00B90018,
> +	HWPfQosmonAAveTerm                    =  0x00B9001C,
> +	HWPfQosmonAForceEccErr                =  0x00B90020,
> +	HWPfQosmonAEccErrDetect               =  0x00B90024,
> +	HWPfQosmonAIterationConfig0Low        =  0x00B90060,
> +	HWPfQosmonAIterationConfig0High       =  0x00B90064,
> +	HWPfQosmonAIterationConfig1Low        =  0x00B90068,
> +	HWPfQosmonAIterationConfig1High       =  0x00B9006C,
> +	HWPfQosmonAIterationConfig2Low        =  0x00B90070,
> +	HWPfQosmonAIterationConfig2High       =  0x00B90074,
> +	HWPfQosmonAIterationConfig3Low        =  0x00B90078,
> +	HWPfQosmonAIterationConfig3High       =  0x00B9007C,
> +	HWPfQosmonAEvalMemAddr                =  0x00B90080,
> +	HWPfQosmonAEvalMemData                =  0x00B90084,
> +	HWPfQosmonAXaction                    =  0x00B900C0,
> +	HWPfQosmonARemThres1Vf                =  0x00B90400,
> +	HWPfQosmonAThres2Vf                   =  0x00B90404,
> +	HWPfQosmonAWeiFracVf                  =  0x00B90408,
> +	HWPfQosmonARrWeiVf                    =  0x00B9040C,
> +	HWPfPermonACntrlRegVf                 =  0x00B98000,
> +	HWPfPermonACountVf                    =  0x00B98008,
> +	HWPfPermonAKCntLoVf                   =  0x00B98010,
> +	HWPfPermonAKCntHiVf                   =  0x00B98014,
> +	HWPfPermonADeltaCntLoVf               =  0x00B98020,
> +	HWPfPermonADeltaCntHiVf               =  0x00B98024,
> +	HWPfPermonAVersionReg                 =  0x00B9C000,
> +	HWPfPermonACbControlFec               =  0x00B9C0F0,
> +	HWPfPermonADltTimerLoFec              =  0x00B9C0F4,
> +	HWPfPermonADltTimerHiFec              =  0x00B9C0F8,
> +	HWPfPermonACbCountFec                 =  0x00B9C100,
> +	HWPfPermonAAccExecTimerLoFec          =  0x00B9C104,
> +	HWPfPermonAAccExecTimerHiFec          =  0x00B9C108,
> +	HWPfPermonAExecTimerMinFec            =  0x00B9C200,
> +	HWPfPermonAExecTimerMaxFec            =  0x00B9C204,
> +	HWPfPermonAControlBusMon              =  0x00B9C400,
> +	HWPfPermonAConfigBusMon               =  0x00B9C404,
> +	HWPfPermonASkipCountBusMon            =  0x00B9C408,
> +	HWPfPermonAMinLatBusMon               =  0x00B9C40C,
> +	HWPfPermonAMaxLatBusMon               =  0x00B9C500,
> +	HWPfPermonATotalLatLowBusMon          =  0x00B9C504,
> +	HWPfPermonATotalLatUpperBusMon        =  0x00B9C508,
> +	HWPfPermonATotalReqCntBusMon          =  0x00B9C50C,
> +	HWPfQosmonBCntrlReg                   =  0x00BA0000,
> +	HWPfQosmonBEvalOverflow0              =  0x00BA0008,
> +	HWPfQosmonBEvalOverflow1              =  0x00BA000C,
> +	HWPfQosmonBDivTerm                    =  0x00BA0010,
> +	HWPfQosmonBTickTerm                   =  0x00BA0014,
> +	HWPfQosmonBEvalTerm                   =  0x00BA0018,
> +	HWPfQosmonBAveTerm                    =  0x00BA001C,
> +	HWPfQosmonBForceEccErr                =  0x00BA0020,
> +	HWPfQosmonBEccErrDetect               =  0x00BA0024,
> +	HWPfQosmonBIterationConfig0Low        =  0x00BA0060,
> +	HWPfQosmonBIterationConfig0High       =  0x00BA0064,
> +	HWPfQosmonBIterationConfig1Low        =  0x00BA0068,
> +	HWPfQosmonBIterationConfig1High       =  0x00BA006C,
> +	HWPfQosmonBIterationConfig2Low        =  0x00BA0070,
> +	HWPfQosmonBIterationConfig2High       =  0x00BA0074,
> +	HWPfQosmonBIterationConfig3Low        =  0x00BA0078,
> +	HWPfQosmonBIterationConfig3High       =  0x00BA007C,
> +	HWPfQosmonBEvalMemAddr                =  0x00BA0080,
> +	HWPfQosmonBEvalMemData                =  0x00BA0084,
> +	HWPfQosmonBXaction                    =  0x00BA00C0,
> +	HWPfQosmonBRemThres1Vf                =  0x00BA0400,
> +	HWPfQosmonBThres2Vf                   =  0x00BA0404,
> +	HWPfQosmonBWeiFracVf                  =  0x00BA0408,
> +	HWPfQosmonBRrWeiVf                    =  0x00BA040C,
> +	HWPfPermonBCntrlRegVf                 =  0x00BA8000,
> +	HWPfPermonBCountVf                    =  0x00BA8008,
> +	HWPfPermonBKCntLoVf                   =  0x00BA8010,
> +	HWPfPermonBKCntHiVf                   =  0x00BA8014,
> +	HWPfPermonBDeltaCntLoVf               =  0x00BA8020,
> +	HWPfPermonBDeltaCntHiVf               =  0x00BA8024,
> +	HWPfPermonBVersionReg                 =  0x00BAC000,
> +	HWPfPermonBCbControlFec               =  0x00BAC0F0,
> +	HWPfPermonBDltTimerLoFec              =  0x00BAC0F4,
> +	HWPfPermonBDltTimerHiFec              =  0x00BAC0F8,
> +	HWPfPermonBCbCountFec                 =  0x00BAC100,
> +	HWPfPermonBAccExecTimerLoFec          =  0x00BAC104,
> +	HWPfPermonBAccExecTimerHiFec          =  0x00BAC108,
> +	HWPfPermonBExecTimerMinFec            =  0x00BAC200,
> +	HWPfPermonBExecTimerMaxFec            =  0x00BAC204,
> +	HWPfPermonBControlBusMon              =  0x00BAC400,
> +	HWPfPermonBConfigBusMon               =  0x00BAC404,
> +	HWPfPermonBSkipCountBusMon            =  0x00BAC408,
> +	HWPfPermonBMinLatBusMon               =  0x00BAC40C,
> +	HWPfPermonBMaxLatBusMon               =  0x00BAC500,
> +	HWPfPermonBTotalLatLowBusMon          =  0x00BAC504,
> +	HWPfPermonBTotalLatUpperBusMon        =  0x00BAC508,
> +	HWPfPermonBTotalReqCntBusMon          =  0x00BAC50C,
> +	HWPfFecUl5gCntrlReg                   =  0x00BC0000,
> +	HWPfFecUl5gI2MThreshReg               =  0x00BC0004,
> +	HWPfFecUl5gVersionReg                 =  0x00BC0100,
> +	HWPfFecUl5gFcwStatusReg               =  0x00BC0104,
> +	HWPfFecUl5gWarnReg                    =  0x00BC0108,
> +	HwPfFecUl5gIbDebugReg                 =  0x00BC0200,
> +	HwPfFecUl5gObLlrDebugReg              =  0x00BC0204,
> +	HwPfFecUl5gObHarqDebugReg             =  0x00BC0208,
> +	HwPfFecUl5g1CntrlReg                  =  0x00BC1000,
> +	HwPfFecUl5g1I2MThreshReg              =  0x00BC1004,
> +	HwPfFecUl5g1VersionReg                =  0x00BC1100,
> +	HwPfFecUl5g1FcwStatusReg              =  0x00BC1104,
> +	HwPfFecUl5g1WarnReg                   =  0x00BC1108,
> +	HwPfFecUl5g1IbDebugReg                =  0x00BC1200,
> +	HwPfFecUl5g1ObLlrDebugReg             =  0x00BC1204,
> +	HwPfFecUl5g1ObHarqDebugReg            =  0x00BC1208,
> +	HwPfFecUl5g2CntrlReg                  =  0x00BC2000,
> +	HwPfFecUl5g2I2MThreshReg              =  0x00BC2004,
> +	HwPfFecUl5g2VersionReg                =  0x00BC2100,
> +	HwPfFecUl5g2FcwStatusReg              =  0x00BC2104,
> +	HwPfFecUl5g2WarnReg                   =  0x00BC2108,
> +	HwPfFecUl5g2IbDebugReg                =  0x00BC2200,
> +	HwPfFecUl5g2ObLlrDebugReg             =  0x00BC2204,
> +	HwPfFecUl5g2ObHarqDebugReg            =  0x00BC2208,
> +	HwPfFecUl5g3CntrlReg                  =  0x00BC3000,
> +	HwPfFecUl5g3I2MThreshReg              =  0x00BC3004,
> +	HwPfFecUl5g3VersionReg                =  0x00BC3100,
> +	HwPfFecUl5g3FcwStatusReg              =  0x00BC3104,
> +	HwPfFecUl5g3WarnReg                   =  0x00BC3108,
> +	HwPfFecUl5g3IbDebugReg                =  0x00BC3200,
> +	HwPfFecUl5g3ObLlrDebugReg             =  0x00BC3204,
> +	HwPfFecUl5g3ObHarqDebugReg            =  0x00BC3208,
> +	HwPfFecUl5g4CntrlReg                  =  0x00BC4000,
> +	HwPfFecUl5g4I2MThreshReg              =  0x00BC4004,
> +	HwPfFecUl5g4VersionReg                =  0x00BC4100,
> +	HwPfFecUl5g4FcwStatusReg              =  0x00BC4104,
> +	HwPfFecUl5g4WarnReg                   =  0x00BC4108,
> +	HwPfFecUl5g4IbDebugReg                =  0x00BC4200,
> +	HwPfFecUl5g4ObLlrDebugReg             =  0x00BC4204,
> +	HwPfFecUl5g4ObHarqDebugReg            =  0x00BC4208,
> +	HwPfFecUl5g5CntrlReg                  =  0x00BC5000,
> +	HwPfFecUl5g5I2MThreshReg              =  0x00BC5004,
> +	HwPfFecUl5g5VersionReg                =  0x00BC5100,
> +	HwPfFecUl5g5FcwStatusReg              =  0x00BC5104,
> +	HwPfFecUl5g5WarnReg                   =  0x00BC5108,
> +	HwPfFecUl5g5IbDebugReg                =  0x00BC5200,
> +	HwPfFecUl5g5ObLlrDebugReg             =  0x00BC5204,
> +	HwPfFecUl5g5ObHarqDebugReg            =  0x00BC5208,
> +	HwPfFecUl5g6CntrlReg                  =  0x00BC6000,
> +	HwPfFecUl5g6I2MThreshReg              =  0x00BC6004,
> +	HwPfFecUl5g6VersionReg                =  0x00BC6100,
> +	HwPfFecUl5g6FcwStatusReg              =  0x00BC6104,
> +	HwPfFecUl5g6WarnReg                   =  0x00BC6108,
> +	HwPfFecUl5g6IbDebugReg                =  0x00BC6200,
> +	HwPfFecUl5g6ObLlrDebugReg             =  0x00BC6204,
> +	HwPfFecUl5g6ObHarqDebugReg            =  0x00BC6208,
> +	HwPfFecUl5g7CntrlReg                  =  0x00BC7000,
> +	HwPfFecUl5g7I2MThreshReg              =  0x00BC7004,
> +	HwPfFecUl5g7VersionReg                =  0x00BC7100,
> +	HwPfFecUl5g7FcwStatusReg              =  0x00BC7104,
> +	HwPfFecUl5g7WarnReg                   =  0x00BC7108,
> +	HwPfFecUl5g7IbDebugReg                =  0x00BC7200,
> +	HwPfFecUl5g7ObLlrDebugReg             =  0x00BC7204,
> +	HwPfFecUl5g7ObHarqDebugReg            =  0x00BC7208,
> +	HwPfFecUl5g8CntrlReg                  =  0x00BC8000,
> +	HwPfFecUl5g8I2MThreshReg              =  0x00BC8004,
> +	HwPfFecUl5g8VersionReg                =  0x00BC8100,
> +	HwPfFecUl5g8FcwStatusReg              =  0x00BC8104,
> +	HwPfFecUl5g8WarnReg                   =  0x00BC8108,
> +	HwPfFecUl5g8IbDebugReg                =  0x00BC8200,
> +	HwPfFecUl5g8ObLlrDebugReg             =  0x00BC8204,
> +	HwPfFecUl5g8ObHarqDebugReg            =  0x00BC8208,
> +	HWPfFecDl5gCntrlReg                   =  0x00BCF000,
> +	HWPfFecDl5gI2MThreshReg               =  0x00BCF004,
> +	HWPfFecDl5gVersionReg                 =  0x00BCF100,
> +	HWPfFecDl5gFcwStatusReg               =  0x00BCF104,
> +	HWPfFecDl5gWarnReg                    =  0x00BCF108,
> +	HWPfFecUlVersionReg                   =  0x00BD0000,
> +	HWPfFecUlControlReg                   =  0x00BD0004,
> +	HWPfFecUlStatusReg                    =  0x00BD0008,
> +	HWPfFecDlVersionReg                   =  0x00BDF000,
> +	HWPfFecDlClusterConfigReg             =  0x00BDF004,
> +	HWPfFecDlBurstThres                   =  0x00BDF00C,
> +	HWPfFecDlClusterStatusReg0            =  0x00BDF040,
> +	HWPfFecDlClusterStatusReg1            =  0x00BDF044,
> +	HWPfFecDlClusterStatusReg2            =  0x00BDF048,
> +	HWPfFecDlClusterStatusReg3            =  0x00BDF04C,
> +	HWPfFecDlClusterStatusReg4            =  0x00BDF050,
> +	HWPfFecDlClusterStatusReg5            =  0x00BDF054,
> +	HWPfChaFabPllPllrst                   =  0x00C40000,
> +	HWPfChaFabPllClk0                     =  0x00C40004,
> +	HWPfChaFabPllClk1                     =  0x00C40008,
> +	HWPfChaFabPllBwadj                    =  0x00C4000C,
> +	HWPfChaFabPllLbw                      =  0x00C40010,
> +	HWPfChaFabPllResetq                   =  0x00C40014,
> +	HWPfChaFabPllPhshft0                  =  0x00C40018,
> +	HWPfChaFabPllPhshft1                  =  0x00C4001C,
> +	HWPfChaFabPllDivq0                    =  0x00C40020,
> +	HWPfChaFabPllDivq1                    =  0x00C40024,
> +	HWPfChaFabPllDivq2                    =  0x00C40028,
> +	HWPfChaFabPllDivq3                    =  0x00C4002C,
> +	HWPfChaFabPllDivq4                    =  0x00C40030,
> +	HWPfChaFabPllDivq5                    =  0x00C40034,
> +	HWPfChaFabPllDivq6                    =  0x00C40038,
> +	HWPfChaFabPllDivq7                    =  0x00C4003C,
> +	HWPfChaDl5gPllPllrst                  =  0x00C40080,
> +	HWPfChaDl5gPllClk0                    =  0x00C40084,
> +	HWPfChaDl5gPllClk1                    =  0x00C40088,
> +	HWPfChaDl5gPllBwadj                   =  0x00C4008C,
> +	HWPfChaDl5gPllLbw                     =  0x00C40090,
> +	HWPfChaDl5gPllResetq                  =  0x00C40094,
> +	HWPfChaDl5gPllPhshft0                 =  0x00C40098,
> +	HWPfChaDl5gPllPhshft1                 =  0x00C4009C,
> +	HWPfChaDl5gPllDivq0                   =  0x00C400A0,
> +	HWPfChaDl5gPllDivq1                   =  0x00C400A4,
> +	HWPfChaDl5gPllDivq2                   =  0x00C400A8,
> +	HWPfChaDl5gPllDivq3                   =  0x00C400AC,
> +	HWPfChaDl5gPllDivq4                   =  0x00C400B0,
> +	HWPfChaDl5gPllDivq5                   =  0x00C400B4,
> +	HWPfChaDl5gPllDivq6                   =  0x00C400B8,
> +	HWPfChaDl5gPllDivq7                   =  0x00C400BC,
> +	HWPfChaDl4gPllPllrst                  =  0x00C40100,
> +	HWPfChaDl4gPllClk0                    =  0x00C40104,
> +	HWPfChaDl4gPllClk1                    =  0x00C40108,
> +	HWPfChaDl4gPllBwadj                   =  0x00C4010C,
> +	HWPfChaDl4gPllLbw                     =  0x00C40110,
> +	HWPfChaDl4gPllResetq                  =  0x00C40114,
> +	HWPfChaDl4gPllPhshft0                 =  0x00C40118,
> +	HWPfChaDl4gPllPhshft1                 =  0x00C4011C,
> +	HWPfChaDl4gPllDivq0                   =  0x00C40120,
> +	HWPfChaDl4gPllDivq1                   =  0x00C40124,
> +	HWPfChaDl4gPllDivq2                   =  0x00C40128,
> +	HWPfChaDl4gPllDivq3                   =  0x00C4012C,
> +	HWPfChaDl4gPllDivq4                   =  0x00C40130,
> +	HWPfChaDl4gPllDivq5                   =  0x00C40134,
> +	HWPfChaDl4gPllDivq6                   =  0x00C40138,
> +	HWPfChaDl4gPllDivq7                   =  0x00C4013C,
> +	HWPfChaUl5gPllPllrst                  =  0x00C40180,
> +	HWPfChaUl5gPllClk0                    =  0x00C40184,
> +	HWPfChaUl5gPllClk1                    =  0x00C40188,
> +	HWPfChaUl5gPllBwadj                   =  0x00C4018C,
> +	HWPfChaUl5gPllLbw                     =  0x00C40190,
> +	HWPfChaUl5gPllResetq                  =  0x00C40194,
> +	HWPfChaUl5gPllPhshft0                 =  0x00C40198,
> +	HWPfChaUl5gPllPhshft1                 =  0x00C4019C,
> +	HWPfChaUl5gPllDivq0                   =  0x00C401A0,
> +	HWPfChaUl5gPllDivq1                   =  0x00C401A4,
> +	HWPfChaUl5gPllDivq2                   =  0x00C401A8,
> +	HWPfChaUl5gPllDivq3                   =  0x00C401AC,
> +	HWPfChaUl5gPllDivq4                   =  0x00C401B0,
> +	HWPfChaUl5gPllDivq5                   =  0x00C401B4,
> +	HWPfChaUl5gPllDivq6                   =  0x00C401B8,
> +	HWPfChaUl5gPllDivq7                   =  0x00C401BC,
> +	HWPfChaUl4gPllPllrst                  =  0x00C40200,
> +	HWPfChaUl4gPllClk0                    =  0x00C40204,
> +	HWPfChaUl4gPllClk1                    =  0x00C40208,
> +	HWPfChaUl4gPllBwadj                   =  0x00C4020C,
> +	HWPfChaUl4gPllLbw                     =  0x00C40210,
> +	HWPfChaUl4gPllResetq                  =  0x00C40214,
> +	HWPfChaUl4gPllPhshft0                 =  0x00C40218,
> +	HWPfChaUl4gPllPhshft1                 =  0x00C4021C,
> +	HWPfChaUl4gPllDivq0                   =  0x00C40220,
> +	HWPfChaUl4gPllDivq1                   =  0x00C40224,
> +	HWPfChaUl4gPllDivq2                   =  0x00C40228,
> +	HWPfChaUl4gPllDivq3                   =  0x00C4022C,
> +	HWPfChaUl4gPllDivq4                   =  0x00C40230,
> +	HWPfChaUl4gPllDivq5                   =  0x00C40234,
> +	HWPfChaUl4gPllDivq6                   =  0x00C40238,
> +	HWPfChaUl4gPllDivq7                   =  0x00C4023C,
> +	HWPfChaDdrPllPllrst                   =  0x00C40280,
> +	HWPfChaDdrPllClk0                     =  0x00C40284,
> +	HWPfChaDdrPllClk1                     =  0x00C40288,
> +	HWPfChaDdrPllBwadj                    =  0x00C4028C,
> +	HWPfChaDdrPllLbw                      =  0x00C40290,
> +	HWPfChaDdrPllResetq                   =  0x00C40294,
> +	HWPfChaDdrPllPhshft0                  =  0x00C40298,
> +	HWPfChaDdrPllPhshft1                  =  0x00C4029C,
> +	HWPfChaDdrPllDivq0                    =  0x00C402A0,
> +	HWPfChaDdrPllDivq1                    =  0x00C402A4,
> +	HWPfChaDdrPllDivq2                    =  0x00C402A8,
> +	HWPfChaDdrPllDivq3                    =  0x00C402AC,
> +	HWPfChaDdrPllDivq4                    =  0x00C402B0,
> +	HWPfChaDdrPllDivq5                    =  0x00C402B4,
> +	HWPfChaDdrPllDivq6                    =  0x00C402B8,
> +	HWPfChaDdrPllDivq7                    =  0x00C402BC,
> +	HWPfChaErrStatus                      =  0x00C40400,
> +	HWPfChaErrMask                        =  0x00C40404,
> +	HWPfChaDebugPcieMsiFifo               =  0x00C40410,
> +	HWPfChaDebugDdrMsiFifo                =  0x00C40414,
> +	HWPfChaDebugMiscMsiFifo               =  0x00C40418,
> +	HWPfChaPwmSet                         =  0x00C40420,
> +	HWPfChaDdrRstStatus                   =  0x00C40430,
> +	HWPfChaDdrStDoneStatus                =  0x00C40434,
> +	HWPfChaDdrWbRstCfg                    =  0x00C40438,
> +	HWPfChaDdrApbRstCfg                   =  0x00C4043C,
> +	HWPfChaDdrPhyRstCfg                   =  0x00C40440,
> +	HWPfChaDdrCpuRstCfg                   =  0x00C40444,
> +	HWPfChaDdrSifRstCfg                   =  0x00C40448,
> +	HWPfChaPadcfgPcomp0                   =  0x00C41000,
> +	HWPfChaPadcfgNcomp0                   =  0x00C41004,
> +	HWPfChaPadcfgOdt0                     =  0x00C41008,
> +	HWPfChaPadcfgProtect0                 =  0x00C4100C,
> +	HWPfChaPreemphasisProtect0            =  0x00C41010,
> +	HWPfChaPreemphasisCompen0             =  0x00C41040,
> +	HWPfChaPreemphasisOdten0              =  0x00C41044,
> +	HWPfChaPadcfgPcomp1                   =  0x00C41100,
> +	HWPfChaPadcfgNcomp1                   =  0x00C41104,
> +	HWPfChaPadcfgOdt1                     =  0x00C41108,
> +	HWPfChaPadcfgProtect1                 =  0x00C4110C,
> +	HWPfChaPreemphasisProtect1            =  0x00C41110,
> +	HWPfChaPreemphasisCompen1             =  0x00C41140,
> +	HWPfChaPreemphasisOdten1              =  0x00C41144,
> +	HWPfChaPadcfgPcomp2                   =  0x00C41200,
> +	HWPfChaPadcfgNcomp2                   =  0x00C41204,
> +	HWPfChaPadcfgOdt2                     =  0x00C41208,
> +	HWPfChaPadcfgProtect2                 =  0x00C4120C,
> +	HWPfChaPreemphasisProtect2            =  0x00C41210,
> +	HWPfChaPreemphasisCompen2             =  0x00C41240,
> +	HWPfChaPreemphasisOdten2              =  0x00C41244,
> +	HWPfChaPadcfgPcomp3                   =  0x00C41300,
> +	HWPfChaPadcfgNcomp3                   =  0x00C41304,
> +	HWPfChaPadcfgOdt3                     =  0x00C41308,
> +	HWPfChaPadcfgProtect3                 =  0x00C4130C,
> +	HWPfChaPreemphasisProtect3            =  0x00C41310,
> +	HWPfChaPreemphasisCompen3             =  0x00C41340,
> +	HWPfChaPreemphasisOdten3              =  0x00C41344,
> +	HWPfChaPadcfgPcomp4                   =  0x00C41400,
> +	HWPfChaPadcfgNcomp4                   =  0x00C41404,
> +	HWPfChaPadcfgOdt4                     =  0x00C41408,
> +	HWPfChaPadcfgProtect4                 =  0x00C4140C,
> +	HWPfChaPreemphasisProtect4            =  0x00C41410,
> +	HWPfChaPreemphasisCompen4             =  0x00C41440,
> +	HWPfChaPreemphasisOdten4              =  0x00C41444,
> +	HWPfHiVfToPfDbellVf                   =  0x00C80000,
> +	HWPfHiPfToVfDbellVf                   =  0x00C80008,
> +	HWPfHiInfoRingBaseLoVf                =  0x00C80010,
> +	HWPfHiInfoRingBaseHiVf                =  0x00C80014,
> +	HWPfHiInfoRingPointerVf               =  0x00C80018,
> +	HWPfHiInfoRingIntWrEnVf               =  0x00C80020,
> +	HWPfHiInfoRingPf2VfWrEnVf             =  0x00C80024,
> +	HWPfHiMsixVectorMapperVf              =  0x00C80060,
> +	HWPfHiModuleVersionReg                =  0x00C84000,
> +	HWPfHiIosf2axiErrLogReg               =  0x00C84004,
> +	HWPfHiHardResetReg                    =  0x00C84008,
> +	HWPfHi5GHardResetReg                  =  0x00C8400C,
> +	HWPfHiInfoRingBaseLoRegPf             =  0x00C84010,
> +	HWPfHiInfoRingBaseHiRegPf             =  0x00C84014,
> +	HWPfHiInfoRingPointerRegPf            =  0x00C84018,
> +	HWPfHiInfoRingIntWrEnRegPf            =  0x00C84020,
> +	HWPfHiInfoRingVf2pfLoWrEnReg          =  0x00C84024,
> +	HWPfHiInfoRingVf2pfHiWrEnReg          =  0x00C84028,
> +	HWPfHiLogParityErrStatusReg           =  0x00C8402C,
> +	HWPfHiLogDataParityErrorVfStatusLo    =  0x00C84030,
> +	HWPfHiLogDataParityErrorVfStatusHi    =  0x00C84034,
> +	HWPfHiBlockTransmitOnErrorEn          =  0x00C84038,
> +	HWPfHiCfgMsiIntWrEnRegPf              =  0x00C84040,
> +	HWPfHiCfgMsiVf2pfLoWrEnReg            =  0x00C84044,
> +	HWPfHiCfgMsiVf2pfHighWrEnReg          =  0x00C84048,
> +	HWPfHiMsixVectorMapperPf              =  0x00C84060,
> +	HWPfHiApbWrWaitTime                   =  0x00C84100,
> +	HWPfHiXCounterMaxValue                =  0x00C84104,
> +	HWPfHiPfMode                          =  0x00C84108,
> +	HWPfHiClkGateHystReg                  =  0x00C8410C,
> +	HWPfHiSnoopBitsReg                    =  0x00C84110,
> +	HWPfHiMsiDropEnableReg                =  0x00C84114,
> +	HWPfHiMsiStatReg                      =  0x00C84120,
> +	HWPfHiFifoOflStatReg                  =  0x00C84124,
> +	HWPfHiHiDebugReg                      =  0x00C841F4,
> +	HWPfHiDebugMemSnoopMsiFifo            =  0x00C841F8,
> +	HWPfHiDebugMemSnoopInputFifo          =  0x00C841FC,
> +	HWPfHiMsixMappingConfig               =  0x00C84200,
> +	HWPfHiJunkReg                         =  0x00C8FF00,
> +	HWPfDdrUmmcVer                        =  0x00D00000,
> +	HWPfDdrUmmcCap                        =  0x00D00010,
> +	HWPfDdrUmmcCtrl                       =  0x00D00020,
> +	HWPfDdrMpcPe                          =  0x00D00080,
> +	HWPfDdrMpcPpri3                       =  0x00D00090,
> +	HWPfDdrMpcPpri2                       =  0x00D000A0,
> +	HWPfDdrMpcPpri1                       =  0x00D000B0,
> +	HWPfDdrMpcPpri0                       =  0x00D000C0,
> +	HWPfDdrMpcPrwgrpCtrl                  =  0x00D000D0,
> +	HWPfDdrMpcPbw7                        =  0x00D000E0,
> +	HWPfDdrMpcPbw6                        =  0x00D000F0,
> +	HWPfDdrMpcPbw5                        =  0x00D00100,
> +	HWPfDdrMpcPbw4                        =  0x00D00110,
> +	HWPfDdrMpcPbw3                        =  0x00D00120,
> +	HWPfDdrMpcPbw2                        =  0x00D00130,
> +	HWPfDdrMpcPbw1                        =  0x00D00140,
> +	HWPfDdrMpcPbw0                        =  0x00D00150,
> +	HWPfDdrMemoryInit                     =  0x00D00200,
> +	HWPfDdrMemoryInitDone                 =  0x00D00210,
> +	HWPfDdrMemInitPhyTrng0                =  0x00D00240,
> +	HWPfDdrMemInitPhyTrng1                =  0x00D00250,
> +	HWPfDdrMemInitPhyTrng2                =  0x00D00260,
> +	HWPfDdrMemInitPhyTrng3                =  0x00D00270,
> +	HWPfDdrBcDram                         =  0x00D003C0,
> +	HWPfDdrBcAddrMap                      =  0x00D003D0,
> +	HWPfDdrBcRef                          =  0x00D003E0,
> +	HWPfDdrBcTim0                         =  0x00D00400,
> +	HWPfDdrBcTim1                         =  0x00D00410,
> +	HWPfDdrBcTim2                         =  0x00D00420,
> +	HWPfDdrBcTim3                         =  0x00D00430,
> +	HWPfDdrBcTim4                         =  0x00D00440,
> +	HWPfDdrBcTim5                         =  0x00D00450,
> +	HWPfDdrBcTim6                         =  0x00D00460,
> +	HWPfDdrBcTim7                         =  0x00D00470,
> +	HWPfDdrBcTim8                         =  0x00D00480,
> +	HWPfDdrBcTim9                         =  0x00D00490,
> +	HWPfDdrBcTim10                        =  0x00D004A0,
> +	HWPfDdrBcTim12                        =  0x00D004C0,
> +	HWPfDdrDfiInit                        =  0x00D004D0,
> +	HWPfDdrDfiInitComplete                =  0x00D004E0,
> +	HWPfDdrDfiTim0                        =  0x00D004F0,
> +	HWPfDdrDfiTim1                        =  0x00D00500,
> +	HWPfDdrDfiPhyUpdEn                    =  0x00D00530,
> +	HWPfDdrMemStatus                      =  0x00D00540,
> +	HWPfDdrUmmcErrStatus                  =  0x00D00550,
> +	HWPfDdrUmmcIntStatus                  =  0x00D00560,
> +	HWPfDdrUmmcIntEn                      =  0x00D00570,
> +	HWPfDdrPhyRdLatency                   =  0x00D48400,
> +	HWPfDdrPhyRdLatencyDbi                =  0x00D48410,
> +	HWPfDdrPhyWrLatency                   =  0x00D48420,
> +	HWPfDdrPhyTrngType                    =  0x00D48430,
> +	HWPfDdrPhyMrsTiming2                  =  0x00D48440,
> +	HWPfDdrPhyMrsTiming0                  =  0x00D48450,
> +	HWPfDdrPhyMrsTiming1                  =  0x00D48460,
> +	HWPfDdrPhyDramTmrd                    =  0x00D48470,
> +	HWPfDdrPhyDramTmod                    =  0x00D48480,
> +	HWPfDdrPhyDramTwpre                   =  0x00D48490,
> +	HWPfDdrPhyDramTrfc                    =  0x00D484A0,
> +	HWPfDdrPhyDramTrwtp                   =  0x00D484B0,
> +	HWPfDdrPhyMr01Dimm                    =  0x00D484C0,
> +	HWPfDdrPhyMr01DimmDbi                 =  0x00D484D0,
> +	HWPfDdrPhyMr23Dimm                    =  0x00D484E0,
> +	HWPfDdrPhyMr45Dimm                    =  0x00D484F0,
> +	HWPfDdrPhyMr67Dimm                    =  0x00D48500,
> +	HWPfDdrPhyWrlvlWwRdlvlRr              =  0x00D48510,
> +	HWPfDdrPhyOdtEn                       =  0x00D48520,
> +	HWPfDdrPhyFastTrng                    =  0x00D48530,
> +	HWPfDdrPhyDynTrngGap                  =  0x00D48540,
> +	HWPfDdrPhyDynRcalGap                  =  0x00D48550,
> +	HWPfDdrPhyIdletimeout                 =  0x00D48560,
> +	HWPfDdrPhyRstCkeGap                   =  0x00D48570,
> +	HWPfDdrPhyCkeMrsGap                   =  0x00D48580,
> +	HWPfDdrPhyMemVrefMidVal               =  0x00D48590,
> +	HWPfDdrPhyVrefStep                    =  0x00D485A0,
> +	HWPfDdrPhyVrefThreshold               =  0x00D485B0,
> +	HWPfDdrPhyPhyVrefMidVal               =  0x00D485C0,
> +	HWPfDdrPhyDqsCountMax                 =  0x00D485D0,
> +	HWPfDdrPhyDqsCountNum                 =  0x00D485E0,
> +	HWPfDdrPhyDramRow                     =  0x00D485F0,
> +	HWPfDdrPhyDramCol                     =  0x00D48600,
> +	HWPfDdrPhyDramBgBa                    =  0x00D48610,
> +	HWPfDdrPhyDynamicUpdreqrel            =  0x00D48620,
> +	HWPfDdrPhyVrefLimits                  =  0x00D48630,
> +	HWPfDdrPhyIdtmTcStatus                =  0x00D6C020,
> +	HWPfDdrPhyIdtmFwVersion               =  0x00D6C410,
> +	HWPfDdrPhyRdlvlGateInitDelay          =  0x00D70000,
> +	HWPfDdrPhyRdenSmplabc                 =  0x00D70008,
> +	HWPfDdrPhyVrefNibble0                 =  0x00D7000C,
> +	HWPfDdrPhyVrefNibble1                 =  0x00D70010,
> +	HWPfDdrPhyRdlvlGateDqsSmpl0           =  0x00D70014,
> +	HWPfDdrPhyRdlvlGateDqsSmpl1           =  0x00D70018,
> +	HWPfDdrPhyRdlvlGateDqsSmpl2           =  0x00D7001C,
> +	HWPfDdrPhyDqsCount                    =  0x00D70020,
> +	HWPfDdrPhyWrlvlRdlvlGateStatus        =  0x00D70024,
> +	HWPfDdrPhyErrorFlags                  =  0x00D70028,
> +	HWPfDdrPhyPowerDown                   =  0x00D70030,
> +	HWPfDdrPhyPrbsSeedByte0               =  0x00D70034,
> +	HWPfDdrPhyPrbsSeedByte1               =  0x00D70038,
> +	HWPfDdrPhyPcompDq                     =  0x00D70040,
> +	HWPfDdrPhyNcompDq                     =  0x00D70044,
> +	HWPfDdrPhyPcompDqs                    =  0x00D70048,
> +	HWPfDdrPhyNcompDqs                    =  0x00D7004C,
> +	HWPfDdrPhyPcompCmd                    =  0x00D70050,
> +	HWPfDdrPhyNcompCmd                    =  0x00D70054,
> +	HWPfDdrPhyPcompCk                     =  0x00D70058,
> +	HWPfDdrPhyNcompCk                     =  0x00D7005C,
> +	HWPfDdrPhyRcalOdtDq                   =  0x00D70060,
> +	HWPfDdrPhyRcalOdtDqs                  =  0x00D70064,
> +	HWPfDdrPhyRcalMask1                   =  0x00D70068,
> +	HWPfDdrPhyRcalMask2                   =  0x00D7006C,
> +	HWPfDdrPhyRcalCtrl                    =  0x00D70070,
> +	HWPfDdrPhyRcalCnt                     =  0x00D70074,
> +	HWPfDdrPhyRcalOverride                =  0x00D70078,
> +	HWPfDdrPhyRcalGateen                  =  0x00D7007C,
> +	HWPfDdrPhyCtrl                        =  0x00D70080,
> +	HWPfDdrPhyWrlvlAlg                    =  0x00D70084,
> +	HWPfDdrPhyRcalVreftTxcmdOdt           =  0x00D70088,
> +	HWPfDdrPhyRdlvlGateParam              =  0x00D7008C,
> +	HWPfDdrPhyRdlvlGateParam2             =  0x00D70090,
> +	HWPfDdrPhyRcalVreftTxdata             =  0x00D70094,
> +	HWPfDdrPhyCmdIntDelay                 =  0x00D700A4,
> +	HWPfDdrPhyAlertN                      =  0x00D700A8,
> +	HWPfDdrPhyTrngReqWpre2tck             =  0x00D700AC,
> +	HWPfDdrPhyCmdPhaseSel                 =  0x00D700B4,
> +	HWPfDdrPhyCmdDcdl                     =  0x00D700B8,
> +	HWPfDdrPhyCkDcdl                      =  0x00D700BC,
> +	HWPfDdrPhySwTrngCtrl1                 =  0x00D700C0,
> +	HWPfDdrPhySwTrngCtrl2                 =  0x00D700C4,
> +	HWPfDdrPhyRcalPcompRden               =  0x00D700C8,
> +	HWPfDdrPhyRcalNcompRden               =  0x00D700CC,
> +	HWPfDdrPhyRcalCompen                  =  0x00D700D0,
> +	HWPfDdrPhySwTrngRdqs                  =  0x00D700D4,
> +	HWPfDdrPhySwTrngWdqs                  =  0x00D700D8,
> +	HWPfDdrPhySwTrngRdena                 =  0x00D700DC,
> +	HWPfDdrPhySwTrngRdenb                 =  0x00D700E0,
> +	HWPfDdrPhySwTrngRdenc                 =  0x00D700E4,
> +	HWPfDdrPhySwTrngWdq                   =  0x00D700E8,
> +	HWPfDdrPhySwTrngRdq                   =  0x00D700EC,
> +	HWPfDdrPhyPcfgHmValue                 =  0x00D700F0,
> +	HWPfDdrPhyPcfgTimerValue              =  0x00D700F4,
> +	HWPfDdrPhyPcfgSoftwareTraining        =  0x00D700F8,
> +	HWPfDdrPhyPcfgMcStatus                =  0x00D700FC,
> +	HWPfDdrPhyWrlvlPhRank0                =  0x00D70100,
> +	HWPfDdrPhyRdenPhRank0                 =  0x00D70104,
> +	HWPfDdrPhyRdenIntRank0                =  0x00D70108,
> +	HWPfDdrPhyRdqsDcdlRank0               =  0x00D7010C,
> +	HWPfDdrPhyRdqsShadowDcdlRank0         =  0x00D70110,
> +	HWPfDdrPhyWdqsDcdlRank0               =  0x00D70114,
> +	HWPfDdrPhyWdmDcdlShadowRank0          =  0x00D70118,
> +	HWPfDdrPhyWdmDcdlRank0                =  0x00D7011C,
> +	HWPfDdrPhyDbiDcdlRank0                =  0x00D70120,
> +	HWPfDdrPhyRdenDcdlaRank0              =  0x00D70124,
> +	HWPfDdrPhyDbiDcdlShadowRank0          =  0x00D70128,
> +	HWPfDdrPhyRdenDcdlbRank0              =  0x00D7012C,
> +	HWPfDdrPhyWdqsShadowDcdlRank0         =  0x00D70130,
> +	HWPfDdrPhyRdenDcdlcRank0              =  0x00D70134,
> +	HWPfDdrPhyRdenShadowDcdlaRank0        =  0x00D70138,
> +	HWPfDdrPhyWrlvlIntRank0               =  0x00D7013C,
> +	HWPfDdrPhyRdqDcdlBit0Rank0            =  0x00D70200,
> +	HWPfDdrPhyRdqDcdlShadowBit0Rank0      =  0x00D70204,
> +	HWPfDdrPhyWdqDcdlBit0Rank0            =  0x00D70208,
> +	HWPfDdrPhyWdqDcdlShadowBit0Rank0      =  0x00D7020C,
> +	HWPfDdrPhyRdqDcdlBit1Rank0            =  0x00D70240,
> +	HWPfDdrPhyRdqDcdlShadowBit1Rank0      =  0x00D70244,
> +	HWPfDdrPhyWdqDcdlBit1Rank0            =  0x00D70248,
> +	HWPfDdrPhyWdqDcdlShadowBit1Rank0      =  0x00D7024C,
> +	HWPfDdrPhyRdqDcdlBit2Rank0            =  0x00D70280,
> +	HWPfDdrPhyRdqDcdlShadowBit2Rank0      =  0x00D70284,
> +	HWPfDdrPhyWdqDcdlBit2Rank0            =  0x00D70288,
> +	HWPfDdrPhyWdqDcdlShadowBit2Rank0      =  0x00D7028C,
> +	HWPfDdrPhyRdqDcdlBit3Rank0            =  0x00D702C0,
> +	HWPfDdrPhyRdqDcdlShadowBit3Rank0      =  0x00D702C4,
> +	HWPfDdrPhyWdqDcdlBit3Rank0            =  0x00D702C8,
> +	HWPfDdrPhyWdqDcdlShadowBit3Rank0      =  0x00D702CC,
> +	HWPfDdrPhyRdqDcdlBit4Rank0            =  0x00D70300,
> +	HWPfDdrPhyRdqDcdlShadowBit4Rank0      =  0x00D70304,
> +	HWPfDdrPhyWdqDcdlBit4Rank0            =  0x00D70308,
> +	HWPfDdrPhyWdqDcdlShadowBit4Rank0      =  0x00D7030C,
> +	HWPfDdrPhyRdqDcdlBit5Rank0            =  0x00D70340,
> +	HWPfDdrPhyRdqDcdlShadowBit5Rank0      =  0x00D70344,
> +	HWPfDdrPhyWdqDcdlBit5Rank0            =  0x00D70348,
> +	HWPfDdrPhyWdqDcdlShadowBit5Rank0      =  0x00D7034C,
> +	HWPfDdrPhyRdqDcdlBit6Rank0            =  0x00D70380,
> +	HWPfDdrPhyRdqDcdlShadowBit6Rank0      =  0x00D70384,
> +	HWPfDdrPhyWdqDcdlBit6Rank0            =  0x00D70388,
> +	HWPfDdrPhyWdqDcdlShadowBit6Rank0      =  0x00D7038C,
> +	HWPfDdrPhyRdqDcdlBit7Rank0            =  0x00D703C0,
> +	HWPfDdrPhyRdqDcdlShadowBit7Rank0      =  0x00D703C4,
> +	HWPfDdrPhyWdqDcdlBit7Rank0            =  0x00D703C8,
> +	HWPfDdrPhyWdqDcdlShadowBit7Rank0      =  0x00D703CC,
> +	HWPfDdrPhyIdtmStatus                  =  0x00D740D0,
> +	HWPfDdrPhyIdtmError                   =  0x00D74110,
> +	HWPfDdrPhyIdtmDebug                   =  0x00D74120,
> +	HWPfDdrPhyIdtmDebugInt                =  0x00D74130,
> +	HwPfPcieLnAsicCfgovr                  =  0x00D80000,
> +	HwPfPcieLnAclkmixer                   =  0x00D80004,
> +	HwPfPcieLnTxrampfreq                  =  0x00D80008,
> +	HwPfPcieLnLanetest                    =  0x00D8000C,
> +	HwPfPcieLnDcctrl                      =  0x00D80010,
> +	HwPfPcieLnDccmeas                     =  0x00D80014,
> +	HwPfPcieLnDccovrAclk                  =  0x00D80018,
> +	HwPfPcieLnDccovrTxa                   =  0x00D8001C,
> +	HwPfPcieLnDccovrTxk                   =  0x00D80020,
> +	HwPfPcieLnDccovrDclk                  =  0x00D80024,
> +	HwPfPcieLnDccovrEclk                  =  0x00D80028,
> +	HwPfPcieLnDcctrimAclk                 =  0x00D8002C,
> +	HwPfPcieLnDcctrimTx                   =  0x00D80030,
> +	HwPfPcieLnDcctrimDclk                 =  0x00D80034,
> +	HwPfPcieLnDcctrimEclk                 =  0x00D80038,
> +	HwPfPcieLnQuadCtrl                    =  0x00D8003C,
> +	HwPfPcieLnQuadCorrIndex               =  0x00D80040,
> +	HwPfPcieLnQuadCorrStatus              =  0x00D80044,
> +	HwPfPcieLnAsicRxovr1                  =  0x00D80048,
> +	HwPfPcieLnAsicRxovr2                  =  0x00D8004C,
> +	HwPfPcieLnAsicEqinfovr                =  0x00D80050,
> +	HwPfPcieLnRxcsr                       =  0x00D80054,
> +	HwPfPcieLnRxfectrl                    =  0x00D80058,
> +	HwPfPcieLnRxtest                      =  0x00D8005C,
> +	HwPfPcieLnEscount                     =  0x00D80060,
> +	HwPfPcieLnCdrctrl                     =  0x00D80064,
> +	HwPfPcieLnCdrctrl2                    =  0x00D80068,
> +	HwPfPcieLnCdrcfg0Ctrl0                =  0x00D8006C,
> +	HwPfPcieLnCdrcfg0Ctrl1                =  0x00D80070,
> +	HwPfPcieLnCdrcfg0Ctrl2                =  0x00D80074,
> +	HwPfPcieLnCdrcfg1Ctrl0                =  0x00D80078,
> +	HwPfPcieLnCdrcfg1Ctrl1                =  0x00D8007C,
> +	HwPfPcieLnCdrcfg1Ctrl2                =  0x00D80080,
> +	HwPfPcieLnCdrcfg2Ctrl0                =  0x00D80084,
> +	HwPfPcieLnCdrcfg2Ctrl1                =  0x00D80088,
> +	HwPfPcieLnCdrcfg2Ctrl2                =  0x00D8008C,
> +	HwPfPcieLnCdrcfg3Ctrl0                =  0x00D80090,
> +	HwPfPcieLnCdrcfg3Ctrl1                =  0x00D80094,
> +	HwPfPcieLnCdrcfg3Ctrl2                =  0x00D80098,
> +	HwPfPcieLnCdrphase                    =  0x00D8009C,
> +	HwPfPcieLnCdrfreq                     =  0x00D800A0,
> +	HwPfPcieLnCdrstatusPhase              =  0x00D800A4,
> +	HwPfPcieLnCdrstatusFreq               =  0x00D800A8,
> +	HwPfPcieLnCdroffset                   =  0x00D800AC,
> +	HwPfPcieLnRxvosctl                    =  0x00D800B0,
> +	HwPfPcieLnRxvosctl2                   =  0x00D800B4,
> +	HwPfPcieLnRxlosctl                    =  0x00D800B8,
> +	HwPfPcieLnRxlos                       =  0x00D800BC,
> +	HwPfPcieLnRxlosvval                   =  0x00D800C0,
> +	HwPfPcieLnRxvosd0                     =  0x00D800C4,
> +	HwPfPcieLnRxvosd1                     =  0x00D800C8,
> +	HwPfPcieLnRxvosep0                    =  0x00D800CC,
> +	HwPfPcieLnRxvosep1                    =  0x00D800D0,
> +	HwPfPcieLnRxvosen0                    =  0x00D800D4,
> +	HwPfPcieLnRxvosen1                    =  0x00D800D8,
> +	HwPfPcieLnRxvosafe                    =  0x00D800DC,
> +	HwPfPcieLnRxvosa0                     =  0x00D800E0,
> +	HwPfPcieLnRxvosa0Out                  =  0x00D800E4,
> +	HwPfPcieLnRxvosa1                     =  0x00D800E8,
> +	HwPfPcieLnRxvosa1Out                  =  0x00D800EC,
> +	HwPfPcieLnRxmisc                      =  0x00D800F0,
> +	HwPfPcieLnRxbeacon                    =  0x00D800F4,
> +	HwPfPcieLnRxdssout                    =  0x00D800F8,
> +	HwPfPcieLnRxdssout2                   =  0x00D800FC,
> +	HwPfPcieLnAlphapctrl                  =  0x00D80100,
> +	HwPfPcieLnAlphanctrl                  =  0x00D80104,
> +	HwPfPcieLnAdaptctrl                   =  0x00D80108,
> +	HwPfPcieLnAdaptctrl1                  =  0x00D8010C,
> +	HwPfPcieLnAdaptstatus                 =  0x00D80110,
> +	HwPfPcieLnAdaptvga1                   =  0x00D80114,
> +	HwPfPcieLnAdaptvga2                   =  0x00D80118,
> +	HwPfPcieLnAdaptvga3                   =  0x00D8011C,
> +	HwPfPcieLnAdaptvga4                   =  0x00D80120,
> +	HwPfPcieLnAdaptboost1                 =  0x00D80124,
> +	HwPfPcieLnAdaptboost2                 =  0x00D80128,
> +	HwPfPcieLnAdaptboost3                 =  0x00D8012C,
> +	HwPfPcieLnAdaptboost4                 =  0x00D80130,
> +	HwPfPcieLnAdaptsslms1                 =  0x00D80134,
> +	HwPfPcieLnAdaptsslms2                 =  0x00D80138,
> +	HwPfPcieLnAdaptvgaStatus              =  0x00D8013C,
> +	HwPfPcieLnAdaptboostStatus            =  0x00D80140,
> +	HwPfPcieLnAdaptsslmsStatus1           =  0x00D80144,
> +	HwPfPcieLnAdaptsslmsStatus2           =  0x00D80148,
> +	HwPfPcieLnAfectrl1                    =  0x00D8014C,
> +	HwPfPcieLnAfectrl2                    =  0x00D80150,
> +	HwPfPcieLnAfectrl3                    =  0x00D80154,
> +	HwPfPcieLnAfedefault1                 =  0x00D80158,
> +	HwPfPcieLnAfedefault2                 =  0x00D8015C,
> +	HwPfPcieLnDfectrl1                    =  0x00D80160,
> +	HwPfPcieLnDfectrl2                    =  0x00D80164,
> +	HwPfPcieLnDfectrl3                    =  0x00D80168,
> +	HwPfPcieLnDfectrl4                    =  0x00D8016C,
> +	HwPfPcieLnDfectrl5                    =  0x00D80170,
> +	HwPfPcieLnDfectrl6                    =  0x00D80174,
> +	HwPfPcieLnAfestatus1                  =  0x00D80178,
> +	HwPfPcieLnAfestatus2                  =  0x00D8017C,
> +	HwPfPcieLnDfestatus1                  =  0x00D80180,
> +	HwPfPcieLnDfestatus2                  =  0x00D80184,
> +	HwPfPcieLnDfestatus3                  =  0x00D80188,
> +	HwPfPcieLnDfestatus4                  =  0x00D8018C,
> +	HwPfPcieLnDfestatus5                  =  0x00D80190,
> +	HwPfPcieLnAlphastatus                 =  0x00D80194,
> +	HwPfPcieLnFomctrl1                    =  0x00D80198,
> +	HwPfPcieLnFomctrl2                    =  0x00D8019C,
> +	HwPfPcieLnFomctrl3                    =  0x00D801A0,
> +	HwPfPcieLnAclkcalStatus               =  0x00D801A4,
> +	HwPfPcieLnOffscorrStatus              =  0x00D801A8,
> +	HwPfPcieLnEyewidthStatus              =  0x00D801AC,
> +	HwPfPcieLnEyeheightStatus             =  0x00D801B0,
> +	HwPfPcieLnAsicTxovr1                  =  0x00D801B4,
> +	HwPfPcieLnAsicTxovr2                  =  0x00D801B8,
> +	HwPfPcieLnAsicTxovr3                  =  0x00D801BC,
> +	HwPfPcieLnTxbiasadjOvr                =  0x00D801C0,
> +	HwPfPcieLnTxcsr                       =  0x00D801C4,
> +	HwPfPcieLnTxtest                      =  0x00D801C8,
> +	HwPfPcieLnTxtestword                  =  0x00D801CC,
> +	HwPfPcieLnTxtestwordHigh              =  0x00D801D0,
> +	HwPfPcieLnTxdrive                     =  0x00D801D4,
> +	HwPfPcieLnMtcsLn                      =  0x00D801D8,
> +	HwPfPcieLnStatsumLn                   =  0x00D801DC,
> +	HwPfPcieLnRcbusScratch                =  0x00D801E0,
> +	HwPfPcieLnRcbusMinorrev               =  0x00D801F0,
> +	HwPfPcieLnRcbusMajorrev               =  0x00D801F4,
> +	HwPfPcieLnRcbusBlocktype              =  0x00D801F8,
> +	HwPfPcieSupPllcsr                     =  0x00D80800,
> +	HwPfPcieSupPlldiv                     =  0x00D80804,
> +	HwPfPcieSupPllcal                     =  0x00D80808,
> +	HwPfPcieSupPllcalsts                  =  0x00D8080C,
> +	HwPfPcieSupPllmeas                    =  0x00D80810,
> +	HwPfPcieSupPlldactrim                 =  0x00D80814,
> +	HwPfPcieSupPllbiastrim                =  0x00D80818,
> +	HwPfPcieSupPllbwtrim                  =  0x00D8081C,
> +	HwPfPcieSupPllcaldly                  =  0x00D80820,
> +	HwPfPcieSupRefclkonpclkctrl           =  0x00D80824,
> +	HwPfPcieSupPclkdelay                  =  0x00D80828,
> +	HwPfPcieSupPhyconfig                  =  0x00D8082C,
> +	HwPfPcieSupRcalIntf                   =  0x00D80830,
> +	HwPfPcieSupAuxcsr                     =  0x00D80834,
> +	HwPfPcieSupVref                       =  0x00D80838,
> +	HwPfPcieSupLinkmode                   =  0x00D8083C,
> +	HwPfPcieSupRrefcalctl                 =  0x00D80840,
> +	HwPfPcieSupRrefcal                    =  0x00D80844,
> +	HwPfPcieSupRrefcaldly                 =  0x00D80848,
> +	HwPfPcieSupTximpcalctl                =  0x00D8084C,
> +	HwPfPcieSupTximpcal                   =  0x00D80850,
> +	HwPfPcieSupTximpoffset                =  0x00D80854,
> +	HwPfPcieSupTximpcaldly                =  0x00D80858,
> +	HwPfPcieSupRximpcalctl                =  0x00D8085C,
> +	HwPfPcieSupRximpcal                   =  0x00D80860,
> +	HwPfPcieSupRximpoffset                =  0x00D80864,
> +	HwPfPcieSupRximpcaldly                =  0x00D80868,
> +	HwPfPcieSupFence                      =  0x00D8086C,
> +	HwPfPcieSupMtcs                       =  0x00D80870,
> +	HwPfPcieSupStatsum                    =  0x00D809B8,
> +	HwPfPciePcsDpStatus0                  =  0x00D81000,
> +	HwPfPciePcsDpControl0                 =  0x00D81004,
> +	HwPfPciePcsPmaStatusLane0             =  0x00D81008,
> +	HwPfPciePcsPipeStatusLane0            =  0x00D8100C,
> +	HwPfPciePcsTxdeemph0Lane0             =  0x00D81010,
> +	HwPfPciePcsTxdeemph1Lane0             =  0x00D81014,
> +	HwPfPciePcsInternalStatusLane0        =  0x00D81018,
> +	HwPfPciePcsDpStatus1                  =  0x00D8101C,
> +	HwPfPciePcsDpControl1                 =  0x00D81020,
> +	HwPfPciePcsPmaStatusLane1             =  0x00D81024,
> +	HwPfPciePcsPipeStatusLane1            =  0x00D81028,
> +	HwPfPciePcsTxdeemph0Lane1             =  0x00D8102C,
> +	HwPfPciePcsTxdeemph1Lane1             =  0x00D81030,
> +	HwPfPciePcsInternalStatusLane1        =  0x00D81034,
> +	HwPfPciePcsDpStatus2                  =  0x00D81038,
> +	HwPfPciePcsDpControl2                 =  0x00D8103C,
> +	HwPfPciePcsPmaStatusLane2             =  0x00D81040,
> +	HwPfPciePcsPipeStatusLane2            =  0x00D81044,
> +	HwPfPciePcsTxdeemph0Lane2             =  0x00D81048,
> +	HwPfPciePcsTxdeemph1Lane2             =  0x00D8104C,
> +	HwPfPciePcsInternalStatusLane2        =  0x00D81050,
> +	HwPfPciePcsDpStatus3                  =  0x00D81054,
> +	HwPfPciePcsDpControl3                 =  0x00D81058,
> +	HwPfPciePcsPmaStatusLane3             =  0x00D8105C,
> +	HwPfPciePcsPipeStatusLane3            =  0x00D81060,
> +	HwPfPciePcsTxdeemph0Lane3             =  0x00D81064,
> +	HwPfPciePcsTxdeemph1Lane3             =  0x00D81068,
> +	HwPfPciePcsInternalStatusLane3        =  0x00D8106C,
> +	HwPfPciePcsEbStatus0                  =  0x00D81070,
> +	HwPfPciePcsEbStatus1                  =  0x00D81074,
> +	HwPfPciePcsEbStatus2                  =  0x00D81078,
> +	HwPfPciePcsEbStatus3                  =  0x00D8107C,
> +	HwPfPciePcsPllSettingPcieG1           =  0x00D81088,
> +	HwPfPciePcsPllSettingPcieG2           =  0x00D8108C,
> +	HwPfPciePcsPllSettingPcieG3           =  0x00D81090,
> +	HwPfPciePcsControl                    =  0x00D81094,
> +	HwPfPciePcsEqControl                  =  0x00D81098,
> +	HwPfPciePcsEqTimer                    =  0x00D8109C,
> +	HwPfPciePcsEqErrStatus                =  0x00D810A0,
> +	HwPfPciePcsEqErrCount                 =  0x00D810A4,
> +	HwPfPciePcsStatus                     =  0x00D810A8,
> +	HwPfPciePcsMiscRegister               =  0x00D810AC,
> +	HwPfPciePcsObsControl                 =  0x00D810B0,
> +	HwPfPciePcsPrbsCount0                 =  0x00D81200,
> +	HwPfPciePcsBistControl0               =  0x00D81204,
> +	HwPfPciePcsBistStaticWord00           =  0x00D81208,
> +	HwPfPciePcsBistStaticWord10           =  0x00D8120C,
> +	HwPfPciePcsBistStaticWord20           =  0x00D81210,
> +	HwPfPciePcsBistStaticWord30           =  0x00D81214,
> +	HwPfPciePcsPrbsCount1                 =  0x00D81220,
> +	HwPfPciePcsBistControl1               =  0x00D81224,
> +	HwPfPciePcsBistStaticWord01           =  0x00D81228,
> +	HwPfPciePcsBistStaticWord11           =  0x00D8122C,
> +	HwPfPciePcsBistStaticWord21           =  0x00D81230,
> +	HwPfPciePcsBistStaticWord31           =  0x00D81234,
> +	HwPfPciePcsPrbsCount2                 =  0x00D81240,
> +	HwPfPciePcsBistControl2               =  0x00D81244,
> +	HwPfPciePcsBistStaticWord02           =  0x00D81248,
> +	HwPfPciePcsBistStaticWord12           =  0x00D8124C,
> +	HwPfPciePcsBistStaticWord22           =  0x00D81250,
> +	HwPfPciePcsBistStaticWord32           =  0x00D81254,
> +	HwPfPciePcsPrbsCount3                 =  0x00D81260,
> +	HwPfPciePcsBistControl3               =  0x00D81264,
> +	HwPfPciePcsBistStaticWord03           =  0x00D81268,
> +	HwPfPciePcsBistStaticWord13           =  0x00D8126C,
> +	HwPfPciePcsBistStaticWord23           =  0x00D81270,
> +	HwPfPciePcsBistStaticWord33           =  0x00D81274,
> +	HwPfPcieGpexLtssmStateCntrl           =  0x00D90400,
> +	HwPfPcieGpexLtssmStateStatus          =  0x00D90404,
> +	HwPfPcieGpexSkipFreqTimer             =  0x00D90408,
> +	HwPfPcieGpexLaneSelect                =  0x00D9040C,
> +	HwPfPcieGpexLaneDeskew                =  0x00D90410,
> +	HwPfPcieGpexRxErrorStatus             =  0x00D90414,
> +	HwPfPcieGpexLaneNumControl            =  0x00D90418,
> +	HwPfPcieGpexNFstControl               =  0x00D9041C,
> +	HwPfPcieGpexLinkStatus                =  0x00D90420,
> +	HwPfPcieGpexAckReplayTimeout          =  0x00D90438,
> +	HwPfPcieGpexSeqNumberStatus           =  0x00D9043C,
> +	HwPfPcieGpexCoreClkRatio              =  0x00D90440,
> +	HwPfPcieGpexDllTholdControl           =  0x00D90448,
> +	HwPfPcieGpexPmTimer                   =  0x00D90450,
> +	HwPfPcieGpexPmeTimeout                =  0x00D90454,
> +	HwPfPcieGpexAspmL1Timer               =  0x00D90458,
> +	HwPfPcieGpexAspmReqTimer              =  0x00D9045C,
> +	HwPfPcieGpexAspmL1Dis                 =  0x00D90460,
> +	HwPfPcieGpexAdvisoryErrorControl      =  0x00D90468,
> +	HwPfPcieGpexId                        =  0x00D90470,
> +	HwPfPcieGpexClasscode                 =  0x00D90474,
> +	HwPfPcieGpexSubsystemId               =  0x00D90478,
> +	HwPfPcieGpexDeviceCapabilities        =  0x00D9047C,
> +	HwPfPcieGpexLinkCapabilities          =  0x00D90480,
> +	HwPfPcieGpexFunctionNumber            =  0x00D90484,
> +	HwPfPcieGpexPmCapabilities            =  0x00D90488,
> +	HwPfPcieGpexFunctionSelect            =  0x00D9048C,
> +	HwPfPcieGpexErrorCounter              =  0x00D904AC,
> +	HwPfPcieGpexConfigReady               =  0x00D904B0,
> +	HwPfPcieGpexFcUpdateTimeout           =  0x00D904B8,
> +	HwPfPcieGpexFcUpdateTimer             =  0x00D904BC,
> +	HwPfPcieGpexVcBufferLoad              =  0x00D904C8,
> +	HwPfPcieGpexVcBufferSizeThold         =  0x00D904CC,
> +	HwPfPcieGpexVcBufferSelect            =  0x00D904D0,
> +	HwPfPcieGpexBarEnable                 =  0x00D904D4,
> +	HwPfPcieGpexBarDwordLower             =  0x00D904D8,
> +	HwPfPcieGpexBarDwordUpper             =  0x00D904DC,
> +	HwPfPcieGpexBarSelect                 =  0x00D904E0,
> +	HwPfPcieGpexCreditCounterSelect       =  0x00D904E4,
> +	HwPfPcieGpexCreditCounterStatus       =  0x00D904E8,
> +	HwPfPcieGpexTlpHeaderSelect           =  0x00D904EC,
> +	HwPfPcieGpexTlpHeaderDword0           =  0x00D904F0,
> +	HwPfPcieGpexTlpHeaderDword1           =  0x00D904F4,
> +	HwPfPcieGpexTlpHeaderDword2           =  0x00D904F8,
> +	HwPfPcieGpexTlpHeaderDword3           =  0x00D904FC,
> +	HwPfPcieGpexRelaxOrderControl         =  0x00D90500,
> +	HwPfPcieGpexBarPrefetch               =  0x00D90504,
> +	HwPfPcieGpexFcCheckControl            =  0x00D90508,
> +	HwPfPcieGpexFcUpdateTimerTraffic      =  0x00D90518,
> +	HwPfPcieGpexPhyControl0               =  0x00D9053C,
> +	HwPfPcieGpexPhyControl1               =  0x00D90544,
> +	HwPfPcieGpexPhyControl2               =  0x00D9054C,
> +	HwPfPcieGpexUserControl0              =  0x00D9055C,
> +	HwPfPcieGpexUncorrErrorStatus         =  0x00D905F0,
> +	HwPfPcieGpexRxCplError                =  0x00D90620,
> +	HwPfPcieGpexRxCplErrorDword0          =  0x00D90624,
> +	HwPfPcieGpexRxCplErrorDword1          =  0x00D90628,
> +	HwPfPcieGpexRxCplErrorDword2          =  0x00D9062C,
> +	HwPfPcieGpexPabSwResetEn              =  0x00D90630,
> +	HwPfPcieGpexGen3Control0              =  0x00D90634,
> +	HwPfPcieGpexGen3Control1              =  0x00D90638,
> +	HwPfPcieGpexGen3Control2              =  0x00D9063C,
> +	HwPfPcieGpexGen2ControlCsr            =  0x00D90640,
> +	HwPfPcieGpexTotalVfInitialVf0         =  0x00D90644,
> +	HwPfPcieGpexTotalVfInitialVf1         =  0x00D90648,
> +	HwPfPcieGpexSriovLinkDevId0           =  0x00D90684,
> +	HwPfPcieGpexSriovLinkDevId1           =  0x00D90688,
> +	HwPfPcieGpexSriovPageSize0            =  0x00D906C4,
> +	HwPfPcieGpexSriovPageSize1            =  0x00D906C8,
> +	HwPfPcieGpexIdVersion                 =  0x00D906FC,
> +	HwPfPcieGpexSriovVfOffsetStride0      =  0x00D90704,
> +	HwPfPcieGpexSriovVfOffsetStride1      =  0x00D90708,
> +	HwPfPcieGpexGen3DeskewControl         =  0x00D907B4,
> +	HwPfPcieGpexGen3EqControl             =  0x00D907B8,
> +	HwPfPcieGpexBridgeVersion             =  0x00D90800,
> +	HwPfPcieGpexBridgeCapability          =  0x00D90804,
> +	HwPfPcieGpexBridgeControl             =  0x00D90808,
> +	HwPfPcieGpexBridgeStatus              =  0x00D9080C,
> +	HwPfPcieGpexEngineActivityStatus      =  0x00D9081C,
> +	HwPfPcieGpexEngineResetControl        =  0x00D90820,
> +	HwPfPcieGpexAxiPioControl             =  0x00D90840,
> +	HwPfPcieGpexAxiPioStatus              =  0x00D90844,
> +	HwPfPcieGpexAmbaSlaveCmdStatus        =  0x00D90848,
> +	HwPfPcieGpexPexPioControl             =  0x00D908C0,
> +	HwPfPcieGpexPexPioStatus              =  0x00D908C4,
> +	HwPfPcieGpexAmbaMasterStatus          =  0x00D908C8,
> +	HwPfPcieGpexCsrSlaveCmdStatus         =  0x00D90920,
> +	HwPfPcieGpexMailboxAxiControl         =  0x00D90A50,
> +	HwPfPcieGpexMailboxAxiData            =  0x00D90A54,
> +	HwPfPcieGpexMailboxPexControl         =  0x00D90A90,
> +	HwPfPcieGpexMailboxPexData            =  0x00D90A94,
> +	HwPfPcieGpexPexInterruptEnable        =  0x00D90AD0,
> +	HwPfPcieGpexPexInterruptStatus        =  0x00D90AD4,
> +	HwPfPcieGpexPexInterruptAxiPioVector  =  0x00D90AD8,
> +	HwPfPcieGpexPexInterruptPexPioVector  =  0x00D90AE0,
> +	HwPfPcieGpexPexInterruptMiscVector    =  0x00D90AF8,
> +	HwPfPcieGpexAmbaInterruptPioEnable    =  0x00D90B00,
> +	HwPfPcieGpexAmbaInterruptMiscEnable   =  0x00D90B0C,
> +	HwPfPcieGpexAmbaInterruptPioStatus    =  0x00D90B10,
> +	HwPfPcieGpexAmbaInterruptMiscStatus   =  0x00D90B1C,
> +	HwPfPcieGpexPexPmControl              =  0x00D90B80,
> +	HwPfPcieGpexSlotMisc                  =  0x00D90B88,
> +	HwPfPcieGpexAxiAddrMappingControl     =  0x00D90BA0,
> +	HwPfPcieGpexAxiAddrMappingWindowAxiBase     =  0x00D90BA4,
> +	HwPfPcieGpexAxiAddrMappingWindowPexBaseLow  =  0x00D90BA8,
> +	HwPfPcieGpexAxiAddrMappingWindowPexBaseHigh =  0x00D90BAC,
> +	HwPfPcieGpexPexBarAddrFunc0Bar0       =  0x00D91BA0,
> +	HwPfPcieGpexPexBarAddrFunc0Bar1       =  0x00D91BA4,
> +	HwPfPcieGpexAxiAddrMappingPcieHdrParam =  0x00D95BA0,
> +	HwPfPcieGpexExtAxiAddrMappingAxiBase  =  0x00D980A0,
> +	HwPfPcieGpexPexExtBarAddrFunc0Bar0    =  0x00D984A0,
> +	HwPfPcieGpexPexExtBarAddrFunc0Bar1    =  0x00D984A4,
> +	HwPfPcieGpexAmbaInterruptFlrEnable    =  0x00D9B960,
> +	HwPfPcieGpexAmbaInterruptFlrStatus    =  0x00D9B9A0,
> +	HwPfPcieGpexExtAxiAddrMappingSize     =  0x00D9BAF0,
> +	HwPfPcieGpexPexPioAwcacheControl      =  0x00D9C300,
> +	HwPfPcieGpexPexPioArcacheControl      =  0x00D9C304,
> +	HwPfPcieGpexPabObSizeControlVc0       =  0x00D9C310
> +};

Why are these register offsets defined as an enum rather than as macro definitions?

> +/* TIP PF Interrupt numbers */
> +enum {
> +	ACC100_PF_INT_QMGR_AQ_OVERFLOW = 0,
> +	ACC100_PF_INT_DOORBELL_VF_2_PF = 1,
> +	ACC100_PF_INT_DMA_DL_DESC_IRQ = 2,
> +	ACC100_PF_INT_DMA_UL_DESC_IRQ = 3,
> +	ACC100_PF_INT_DMA_MLD_DESC_IRQ = 4,
> +	ACC100_PF_INT_DMA_UL5G_DESC_IRQ = 5,
> +	ACC100_PF_INT_DMA_DL5G_DESC_IRQ = 6,
> +	ACC100_PF_INT_ILLEGAL_FORMAT = 7,
> +	ACC100_PF_INT_QMGR_DISABLED_ACCESS = 8,
> +	ACC100_PF_INT_QMGR_AQ_OVERTHRESHOLD = 9,
> +	ACC100_PF_INT_ARAM_ACCESS_ERR = 10,
> +	ACC100_PF_INT_ARAM_ECC_1BIT_ERR = 11,
> +	ACC100_PF_INT_PARITY_ERR = 12,
> +	ACC100_PF_INT_QMGR_ERR = 13,
> +	ACC100_PF_INT_INT_REQ_OVERFLOW = 14,
> +	ACC100_PF_INT_APB_TIMEOUT = 15,
> +};
> +
> +#endif /* ACC100_PF_ENUM_H */
> diff --git a/drivers/baseband/acc100/acc100_vf_enum.h b/drivers/baseband/acc100/acc100_vf_enum.h
> new file mode 100644
> index 0000000..b512af3
> --- /dev/null
> +++ b/drivers/baseband/acc100/acc100_vf_enum.h
> @@ -0,0 +1,73 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2017 Intel Corporation
> + */
> +
> +#ifndef ACC100_VF_ENUM_H
> +#define ACC100_VF_ENUM_H
> +
> +/*
> + * ACC100 Register mapping on VF BAR0
> + * This is automatically generated from RDL, format may change with new RDL
> + */
> +enum {
> +	HWVfQmgrIngressAq             =  0x00000000,
> +	HWVfHiVfToPfDbellVf           =  0x00000800,
> +	HWVfHiPfToVfDbellVf           =  0x00000808,
> +	HWVfHiInfoRingBaseLoVf        =  0x00000810,
> +	HWVfHiInfoRingBaseHiVf        =  0x00000814,
> +	HWVfHiInfoRingPointerVf       =  0x00000818,
> +	HWVfHiInfoRingIntWrEnVf       =  0x00000820,
> +	HWVfHiInfoRingPf2VfWrEnVf     =  0x00000824,
> +	HWVfHiMsixVectorMapperVf      =  0x00000860,
> +	HWVfDmaFec5GulDescBaseLoRegVf =  0x00000920,
> +	HWVfDmaFec5GulDescBaseHiRegVf =  0x00000924,
> +	HWVfDmaFec5GulRespPtrLoRegVf  =  0x00000928,
> +	HWVfDmaFec5GulRespPtrHiRegVf  =  0x0000092C,
> +	HWVfDmaFec5GdlDescBaseLoRegVf =  0x00000940,
> +	HWVfDmaFec5GdlDescBaseHiRegVf =  0x00000944,
> +	HWVfDmaFec5GdlRespPtrLoRegVf  =  0x00000948,
> +	HWVfDmaFec5GdlRespPtrHiRegVf  =  0x0000094C,
> +	HWVfDmaFec4GulDescBaseLoRegVf =  0x00000960,
> +	HWVfDmaFec4GulDescBaseHiRegVf =  0x00000964,
> +	HWVfDmaFec4GulRespPtrLoRegVf  =  0x00000968,
> +	HWVfDmaFec4GulRespPtrHiRegVf  =  0x0000096C,
> +	HWVfDmaFec4GdlDescBaseLoRegVf =  0x00000980,
> +	HWVfDmaFec4GdlDescBaseHiRegVf =  0x00000984,
> +	HWVfDmaFec4GdlRespPtrLoRegVf  =  0x00000988,
> +	HWVfDmaFec4GdlRespPtrHiRegVf  =  0x0000098C,
> +	HWVfDmaDdrBaseRangeRoVf       =  0x000009A0,
> +	HWVfQmgrAqResetVf             =  0x00000E00,
> +	HWVfQmgrRingSizeVf            =  0x00000E04,
> +	HWVfQmgrGrpDepthLog20Vf       =  0x00000E08,
> +	HWVfQmgrGrpDepthLog21Vf       =  0x00000E0C,
> +	HWVfQmgrGrpFunction0Vf        =  0x00000E10,
> +	HWVfQmgrGrpFunction1Vf        =  0x00000E14,
> +	HWVfPmACntrlRegVf             =  0x00000F40,
> +	HWVfPmACountVf                =  0x00000F48,
> +	HWVfPmAKCntLoVf               =  0x00000F50,
> +	HWVfPmAKCntHiVf               =  0x00000F54,
> +	HWVfPmADeltaCntLoVf           =  0x00000F60,
> +	HWVfPmADeltaCntHiVf           =  0x00000F64,
> +	HWVfPmBCntrlRegVf             =  0x00000F80,
> +	HWVfPmBCountVf                =  0x00000F88,
> +	HWVfPmBKCntLoVf               =  0x00000F90,
> +	HWVfPmBKCntHiVf               =  0x00000F94,
> +	HWVfPmBDeltaCntLoVf           =  0x00000FA0,
> +	HWVfPmBDeltaCntHiVf           =  0x00000FA4
> +};
> +
> +/* TIP VF Interrupt numbers */
> +enum {
> +	ACC100_VF_INT_QMGR_AQ_OVERFLOW = 0,
> +	ACC100_VF_INT_DOORBELL_VF_2_PF = 1,
> +	ACC100_VF_INT_DMA_DL_DESC_IRQ = 2,
> +	ACC100_VF_INT_DMA_UL_DESC_IRQ = 3,
> +	ACC100_VF_INT_DMA_MLD_DESC_IRQ = 4,
> +	ACC100_VF_INT_DMA_UL5G_DESC_IRQ = 5,
> +	ACC100_VF_INT_DMA_DL5G_DESC_IRQ = 6,
> +	ACC100_VF_INT_ILLEGAL_FORMAT = 7,
> +	ACC100_VF_INT_QMGR_DISABLED_ACCESS = 8,
> +	ACC100_VF_INT_QMGR_AQ_OVERTHRESHOLD = 9,
> +};
> +
> +#endif /* ACC100_VF_ENUM_H */
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
> index 6f46df0..cd77570 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> @@ -5,6 +5,9 @@
>  #ifndef _RTE_ACC100_PMD_H_
>  #define _RTE_ACC100_PMD_H_
> 
> +#include "acc100_pf_enum.h"
> +#include "acc100_vf_enum.h"
> +
>  /* Helper macro for logging */
>  #define rte_bbdev_log(level, fmt, ...) \
>  	rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \
> @@ -27,6 +30,493 @@
>  #define RTE_ACC100_PF_DEVICE_ID        (0x0d5c)
>  #define RTE_ACC100_VF_DEVICE_ID        (0x0d5d)
> 
> +/* Define as 1 to use only a single FEC engine */
> +#ifndef RTE_ACC100_SINGLE_FEC
> +#define RTE_ACC100_SINGLE_FEC 0
> +#endif
> +
> +/* Values used in filling in descriptors */
> +#define ACC100_DMA_DESC_TYPE           2
> +#define ACC100_DMA_CODE_BLK_MODE       0
> +#define ACC100_DMA_BLKID_FCW           1
> +#define ACC100_DMA_BLKID_IN            2
> +#define ACC100_DMA_BLKID_OUT_ENC       1
> +#define ACC100_DMA_BLKID_OUT_HARD      1
> +#define ACC100_DMA_BLKID_OUT_SOFT      2
> +#define ACC100_DMA_BLKID_OUT_HARQ      3
> +#define ACC100_DMA_BLKID_IN_HARQ       3
> +
> +/* Values used in filling in decode FCWs */
> +#define ACC100_FCW_TD_VER              1
> +#define ACC100_FCW_TD_EXT_COLD_REG_EN  1
> +#define ACC100_FCW_TD_AUTOMAP          0x0f
> +#define ACC100_FCW_TD_RVIDX_0          2
> +#define ACC100_FCW_TD_RVIDX_1          26
> +#define ACC100_FCW_TD_RVIDX_2          50
> +#define ACC100_FCW_TD_RVIDX_3          74
> +
> +/* Values used in writing to the registers */
> +#define ACC100_REG_IRQ_EN_ALL          0x1FF83FF  /* Enable all interrupts */
> +
> +/* ACC100 Specific Dimensioning */
> +#define ACC100_SIZE_64MBYTE            (64*1024*1024)
> +/* Number of elements in an Info Ring */
> +#define ACC100_INFO_RING_NUM_ENTRIES   1024
> +/* Number of elements in HARQ layout memory */
> +#define ACC100_HARQ_LAYOUT             (64*1024*1024)
> +/* Assume offset for HARQ in memory */
> +#define ACC100_HARQ_OFFSET             (32*1024)
> +/* Mask used to calculate an index in an Info Ring array (not a byte offset) */
> +#define ACC100_INFO_RING_MASK          (ACC100_INFO_RING_NUM_ENTRIES-1)
> +/* Number of Virtual Functions ACC100 supports */
> +#define ACC100_NUM_VFS                  16
> +#define ACC100_NUM_QGRPS                 8
> +#define ACC100_NUM_QGRPS_PER_WORD        8
> +#define ACC100_NUM_AQS                  16
> +#define MAX_ENQ_BATCH_SIZE          255
> +/* All ACC100 Registers alignment are 32bits = 4B */
> +#define BYTES_IN_WORD                 4
> +#define MAX_E_MBUF                64000
> +
> +#define GRP_ID_SHIFT    10 /* Queue Index Hierarchy */
> +#define VF_ID_SHIFT     4  /* Queue Index Hierarchy */
> +#define VF_OFFSET_QOS   16 /* offset in Memory Space specific to QoS Mon */
> +#define TMPL_PRI_0      0x03020100
> +#define TMPL_PRI_1      0x07060504
> +#define TMPL_PRI_2      0x0b0a0908
> +#define TMPL_PRI_3      0x0f0e0d0c
> +#define QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
> +#define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
> +
> +#define ACC100_NUM_TMPL  32
> +#define VF_OFFSET_QOS 16 /* offset in Memory Space specific to QoS Mon */
> +/* Mapping of signals for the available engines */
> +#define SIG_UL_5G      0
> +#define SIG_UL_5G_LAST 7
> +#define SIG_DL_5G      13
> +#define SIG_DL_5G_LAST 15
> +#define SIG_UL_4G      16
> +#define SIG_UL_4G_LAST 21
> +#define SIG_DL_4G      27
> +#define SIG_DL_4G_LAST 31
> +
> +/* max number of iterations to allocate memory block for all rings */
> +#define SW_RING_MEM_ALLOC_ATTEMPTS 5
> +#define MAX_QUEUE_DEPTH           1024
> +#define ACC100_DMA_MAX_NUM_POINTERS  14
> +#define ACC100_DMA_DESC_PADDING      8
> +#define ACC100_FCW_PADDING           12
> +#define ACC100_DESC_FCW_OFFSET       192
> +#define ACC100_DESC_SIZE             256
> +#define ACC100_DESC_OFFSET           (ACC100_DESC_SIZE / 64)
> +#define ACC100_FCW_TE_BLEN     32
> +#define ACC100_FCW_TD_BLEN     24
> +#define ACC100_FCW_LE_BLEN     32
> +#define ACC100_FCW_LD_BLEN     36
> +
> +#define ACC100_FCW_VER         2
> +#define MUX_5GDL_DESC 6
> +#define CMP_ENC_SIZE 20
> +#define CMP_DEC_SIZE 24
> +#define ENC_OFFSET (32)
> +#define DEC_OFFSET (80)
> +#define ACC100_EXT_MEM
> +#define ACC100_HARQ_OFFSET_THRESHOLD 1024
> +
> +/* Constants from K0 computation from 3GPP 38.212 Table 5.4.2.1-2 */
> +#define N_ZC_1 66 /* N = 66 Zc for BG 1 */
> +#define N_ZC_2 50 /* N = 50 Zc for BG 2 */
> +#define K0_1_1 17 /* K0 fraction numerator for rv 1 and BG 1 */
> +#define K0_1_2 13 /* K0 fraction numerator for rv 1 and BG 2 */
> +#define K0_2_1 33 /* K0 fraction numerator for rv 2 and BG 1 */
> +#define K0_2_2 25 /* K0 fraction numerator for rv 2 and BG 2 */
> +#define K0_3_1 56 /* K0 fraction numerator for rv 3 and BG 1 */
> +#define K0_3_2 43 /* K0 fraction numerator for rv 3 and BG 2 */
> +
> +/* ACC100 Configuration */
> +#define ACC100_DDR_ECC_ENABLE
> +#define ACC100_CFG_DMA_ERROR 0x3D7
> +#define ACC100_CFG_AXI_CACHE 0x11
> +#define ACC100_CFG_QMGR_HI_P 0x0F0F
> +#define ACC100_CFG_PCI_AXI 0xC003
> +#define ACC100_CFG_PCI_BRIDGE 0x40006033
> +#define ACC100_ENGINE_OFFSET 0x1000
> +#define ACC100_RESET_HI 0x20100
> +#define ACC100_RESET_LO 0x20000
> +#define ACC100_RESET_HARD 0x1FF
> +#define ACC100_ENGINES_MAX 9
> +#define LONG_WAIT 1000
> +
> +/* ACC100 DMA Descriptor triplet */
> +struct acc100_dma_triplet {
> +	uint64_t address;
> +	uint32_t blen:20,
> +		res0:4,
> +		last:1,
> +		dma_ext:1,
> +		res1:2,
> +		blkid:4;
> +} __rte_packed;
> +
> +
> +
> +/* ACC100 DMA Response Descriptor */
> +union acc100_dma_rsp_desc {
> +	uint32_t val;
> +	struct {
> +		uint32_t crc_status:1,
> +			synd_ok:1,
> +			dma_err:1,
> +			neg_stop:1,
> +			fcw_err:1,
> +			output_err:1,
> +			input_err:1,
> +			timestampEn:1,
> +			iterCountFrac:8,
> +			iter_cnt:8,
> +			rsrvd3:6,
> +			sdone:1,
> +			fdone:1;
> +		uint32_t add_info_0;
> +		uint32_t add_info_1;
> +	};
> +};
> +
> +
> +/* ACC100 Queue Manager Enqueue PCI Register */
> +union acc100_enqueue_reg_fmt {
> +	uint32_t val;
> +	struct {
> +		uint32_t num_elem:8,
> +			addr_offset:3,
> +			rsrvd:1,
> +			req_elem_addr:20;
> +	};
> +};
> +
> +/* FEC 4G Uplink Frame Control Word */
> +struct __rte_packed acc100_fcw_td {
> +	uint8_t fcw_ver:4,
> +		num_maps:4; /* Unused */
> +	uint8_t filler:6, /* Unused */
> +		rsrvd0:1,
> +		bypass_sb_deint:1;
> +	uint16_t k_pos;
> +	uint16_t k_neg; /* Unused */
> +	uint8_t c_neg; /* Unused */
> +	uint8_t c; /* Unused */
> +	uint32_t ea; /* Unused */
> +	uint32_t eb; /* Unused */
> +	uint8_t cab; /* Unused */
> +	uint8_t k0_start_col; /* Unused */
> +	uint8_t rsrvd1;
> +	uint8_t code_block_mode:1, /* Unused */
> +		turbo_crc_type:1,
> +		rsrvd2:3,
> +		bypass_teq:1, /* Unused */
> +		soft_output_en:1, /* Unused */
> +		ext_td_cold_reg_en:1;
> +	union { /* External Cold register */
> +		uint32_t ext_td_cold_reg;
> +		struct {
> +			uint32_t min_iter:4, /* Unused */
> +				max_iter:4,
> +				ext_scale:5, /* Unused */
> +				rsrvd3:3,
> +				early_stop_en:1, /* Unused */
> +				sw_soft_out_dis:1, /* Unused */
> +				sw_et_cont:1, /* Unused */
> +				sw_soft_out_saturation:1, /* Unused */
> +				half_iter_on:1, /* Unused */
> +				raw_decoder_input_on:1, /* Unused */
> +				rsrvd4:10;
> +		};
> +	};
> +};
> +
> +/* FEC 5GNR Uplink Frame Control Word */
> +struct __rte_packed acc100_fcw_ld {
> +	uint32_t FCWversion:4,
> +		qm:4,
> +		nfiller:11,
> +		BG:1,
> +		Zc:9,
> +		res0:1,
> +		synd_precoder:1,
> +		synd_post:1;
> +	uint32_t ncb:16,
> +		k0:16;
> +	uint32_t rm_e:24,
> +		hcin_en:1,
> +		hcout_en:1,
> +		crc_select:1,
> +		bypass_dec:1,
> +		bypass_intlv:1,
> +		so_en:1,
> +		so_bypass_rm:1,
> +		so_bypass_intlv:1;
> +	uint32_t hcin_offset:16,
> +		hcin_size0:16;
> +	uint32_t hcin_size1:16,
> +		hcin_decomp_mode:3,
> +		llr_pack_mode:1,
> +		hcout_comp_mode:3,
> +		res2:1,
> +		dec_convllr:4,
> +		hcout_convllr:4;
> +	uint32_t itmax:7,
> +		itstop:1,
> +		so_it:7,
> +		res3:1,
> +		hcout_offset:16;
> +	uint32_t hcout_size0:16,
> +		hcout_size1:16;
> +	uint32_t gain_i:8,
> +		gain_h:8,
> +		negstop_th:16;
> +	uint32_t negstop_it:7,
> +		negstop_en:1,
> +		res4:24;
> +};
> +
> +/* FEC 4G Downlink Frame Control Word */
> +struct __rte_packed acc100_fcw_te {
> +	uint16_t k_neg;
> +	uint16_t k_pos;
> +	uint8_t c_neg;
> +	uint8_t c;
> +	uint8_t filler;
> +	uint8_t cab;
> +	uint32_t ea:17,
> +		rsrvd0:15;
> +	uint32_t eb:17,
> +		rsrvd1:15;
> +	uint16_t ncb_neg;
> +	uint16_t ncb_pos;
> +	uint8_t rv_idx0:2,
> +		rsrvd2:2,
> +		rv_idx1:2,
> +		rsrvd3:2;
> +	uint8_t bypass_rv_idx0:1,
> +		bypass_rv_idx1:1,
> +		bypass_rm:1,
> +		rsrvd4:5;
> +	uint8_t rsrvd5:1,
> +		rsrvd6:3,
> +		code_block_crc:1,
> +		rsrvd7:3;
> +	uint8_t code_block_mode:1,
> +		rsrvd8:7;
> +	uint64_t rsrvd9;
> +};
> +
> +/* FEC 5GNR Downlink Frame Control Word */
> +struct __rte_packed acc100_fcw_le {
> +	uint32_t FCWversion:4,
> +		qm:4,
> +		nfiller:11,
> +		BG:1,
> +		Zc:9,
> +		res0:3;
> +	uint32_t ncb:16,
> +		k0:16;
> +	uint32_t rm_e:24,
> +		res1:2,
> +		crc_select:1,
> +		res2:1,
> +		bypass_intlv:1,
> +		res3:3;
> +	uint32_t res4_a:12,
> +		mcb_count:3,
> +		res4_b:17;
> +	uint32_t res5;
> +	uint32_t res6;
> +	uint32_t res7;
> +	uint32_t res8;
> +};
> +
> +/* ACC100 DMA Request Descriptor */
> +struct __rte_packed acc100_dma_req_desc {
> +	union {
> +		struct{
> +			uint32_t type:4,
> +				rsrvd0:26,
> +				sdone:1,
> +				fdone:1;
> +			uint32_t rsrvd1;
> +			uint32_t rsrvd2;
> +			uint32_t pass_param:8,
> +				sdone_enable:1,
> +				irq_enable:1,
> +				timeStampEn:1,
> +				res0:5,
> +				numCBs:4,
> +				res1:4,
> +				m2dlen:4,
> +				d2mlen:4;
> +		};
> +		struct{
> +			uint32_t word0;
> +			uint32_t word1;
> +			uint32_t word2;
> +			uint32_t word3;
> +		};
> +	};
> +	struct acc100_dma_triplet data_ptrs[ACC100_DMA_MAX_NUM_POINTERS];
> +
> +	/* Virtual addresses used to retrieve SW context info */
> +	union {
> +		void *op_addr;
> +		uint64_t pad1;  /* pad to 64 bits */
> +	};
> +	/*
> +	 * Stores additional information needed for driver processing:
> +	 * - last_desc_in_batch - flag used to mark last descriptor (CB)
> +	 *                        in batch
> +	 * - cbs_in_tb - stores information about total number of Code Blocks
> +	 *               in currently processed Transport Block
> +	 */
> +	union {
> +		struct {
> +			union {
> +				struct acc100_fcw_ld fcw_ld;
> +				struct acc100_fcw_td fcw_td;
> +				struct acc100_fcw_le fcw_le;
> +				struct acc100_fcw_te fcw_te;
> +				uint32_t pad2[ACC100_FCW_PADDING];
> +			};
> +			uint32_t last_desc_in_batch :8,
> +				cbs_in_tb:8,
> +				pad4 : 16;
> +		};
> +		uint64_t pad3[ACC100_DMA_DESC_PADDING]; /* pad to 64 bits */
> +	};
> +};
> +
> +/* ACC100 DMA Descriptor */
> +union acc100_dma_desc {
> +	struct acc100_dma_req_desc req;
> +	union acc100_dma_rsp_desc rsp;
> +};
> +
> +
> +/* Union describing HARQ layout entry */
> +union acc100_harq_layout_data {
> +	uint32_t val;
> +	struct {
> +		uint16_t offset;
> +		uint16_t size0;
> +	};
> +} __rte_packed;
> +
> +
> +/* Union describing Info Ring entry */
> +union acc100_info_ring_data {
> +	uint32_t val;
> +	struct {
> +		union {
> +			uint16_t detailed_info;
> +			struct {
> +				uint16_t aq_id: 4;
> +				uint16_t qg_id: 4;
> +				uint16_t vf_id: 6;
> +				uint16_t reserved: 2;
> +			};
> +		};
> +		uint16_t int_nb: 7;
> +		uint16_t msi_0: 1;
> +		uint16_t vf2pf: 6;
> +		uint16_t loop: 1;
> +		uint16_t valid: 1;
> +	};
> +} __rte_packed;
> +
> +struct acc100_registry_addr {
> +	unsigned int dma_ring_dl5g_hi;
> +	unsigned int dma_ring_dl5g_lo;
> +	unsigned int dma_ring_ul5g_hi;
> +	unsigned int dma_ring_ul5g_lo;
> +	unsigned int dma_ring_dl4g_hi;
> +	unsigned int dma_ring_dl4g_lo;
> +	unsigned int dma_ring_ul4g_hi;
> +	unsigned int dma_ring_ul4g_lo;
> +	unsigned int ring_size;
> +	unsigned int info_ring_hi;
> +	unsigned int info_ring_lo;
> +	unsigned int info_ring_en;
> +	unsigned int info_ring_ptr;
> +	unsigned int tail_ptrs_dl5g_hi;
> +	unsigned int tail_ptrs_dl5g_lo;
> +	unsigned int tail_ptrs_ul5g_hi;
> +	unsigned int tail_ptrs_ul5g_lo;
> +	unsigned int tail_ptrs_dl4g_hi;
> +	unsigned int tail_ptrs_dl4g_lo;
> +	unsigned int tail_ptrs_ul4g_hi;
> +	unsigned int tail_ptrs_ul4g_lo;
> +	unsigned int depth_log0_offset;
> +	unsigned int depth_log1_offset;
> +	unsigned int qman_group_func;
> +	unsigned int ddr_range;
> +};
> +
> +/* Structure holding registry addresses for PF */
> +static const struct acc100_registry_addr pf_reg_addr = {
> +	.dma_ring_dl5g_hi = HWPfDmaFec5GdlDescBaseHiRegVf,
> +	.dma_ring_dl5g_lo = HWPfDmaFec5GdlDescBaseLoRegVf,
> +	.dma_ring_ul5g_hi = HWPfDmaFec5GulDescBaseHiRegVf,
> +	.dma_ring_ul5g_lo = HWPfDmaFec5GulDescBaseLoRegVf,
> +	.dma_ring_dl4g_hi = HWPfDmaFec4GdlDescBaseHiRegVf,
> +	.dma_ring_dl4g_lo = HWPfDmaFec4GdlDescBaseLoRegVf,
> +	.dma_ring_ul4g_hi = HWPfDmaFec4GulDescBaseHiRegVf,
> +	.dma_ring_ul4g_lo = HWPfDmaFec4GulDescBaseLoRegVf,
> +	.ring_size = HWPfQmgrRingSizeVf,
> +	.info_ring_hi = HWPfHiInfoRingBaseHiRegPf,
> +	.info_ring_lo = HWPfHiInfoRingBaseLoRegPf,
> +	.info_ring_en = HWPfHiInfoRingIntWrEnRegPf,
> +	.info_ring_ptr = HWPfHiInfoRingPointerRegPf,
> +	.tail_ptrs_dl5g_hi = HWPfDmaFec5GdlRespPtrHiRegVf,
> +	.tail_ptrs_dl5g_lo = HWPfDmaFec5GdlRespPtrLoRegVf,
> +	.tail_ptrs_ul5g_hi = HWPfDmaFec5GulRespPtrHiRegVf,
> +	.tail_ptrs_ul5g_lo = HWPfDmaFec5GulRespPtrLoRegVf,
> +	.tail_ptrs_dl4g_hi = HWPfDmaFec4GdlRespPtrHiRegVf,
> +	.tail_ptrs_dl4g_lo = HWPfDmaFec4GdlRespPtrLoRegVf,
> +	.tail_ptrs_ul4g_hi = HWPfDmaFec4GulRespPtrHiRegVf,
> +	.tail_ptrs_ul4g_lo = HWPfDmaFec4GulRespPtrLoRegVf,
> +	.depth_log0_offset = HWPfQmgrGrpDepthLog20Vf,
> +	.depth_log1_offset = HWPfQmgrGrpDepthLog21Vf,
> +	.qman_group_func = HWPfQmgrGrpFunction0,
> +	.ddr_range = HWPfDmaVfDdrBaseRw,
> +};
> +
> +/* Structure holding registry addresses for VF */
> +static const struct acc100_registry_addr vf_reg_addr = {
> +	.dma_ring_dl5g_hi = HWVfDmaFec5GdlDescBaseHiRegVf,
> +	.dma_ring_dl5g_lo = HWVfDmaFec5GdlDescBaseLoRegVf,
> +	.dma_ring_ul5g_hi = HWVfDmaFec5GulDescBaseHiRegVf,
> +	.dma_ring_ul5g_lo = HWVfDmaFec5GulDescBaseLoRegVf,
> +	.dma_ring_dl4g_hi = HWVfDmaFec4GdlDescBaseHiRegVf,
> +	.dma_ring_dl4g_lo = HWVfDmaFec4GdlDescBaseLoRegVf,
> +	.dma_ring_ul4g_hi = HWVfDmaFec4GulDescBaseHiRegVf,
> +	.dma_ring_ul4g_lo = HWVfDmaFec4GulDescBaseLoRegVf,
> +	.ring_size = HWVfQmgrRingSizeVf,
> +	.info_ring_hi = HWVfHiInfoRingBaseHiVf,
> +	.info_ring_lo = HWVfHiInfoRingBaseLoVf,
> +	.info_ring_en = HWVfHiInfoRingIntWrEnVf,
> +	.info_ring_ptr = HWVfHiInfoRingPointerVf,
> +	.tail_ptrs_dl5g_hi = HWVfDmaFec5GdlRespPtrHiRegVf,
> +	.tail_ptrs_dl5g_lo = HWVfDmaFec5GdlRespPtrLoRegVf,
> +	.tail_ptrs_ul5g_hi = HWVfDmaFec5GulRespPtrHiRegVf,
> +	.tail_ptrs_ul5g_lo = HWVfDmaFec5GulRespPtrLoRegVf,
> +	.tail_ptrs_dl4g_hi = HWVfDmaFec4GdlRespPtrHiRegVf,
> +	.tail_ptrs_dl4g_lo = HWVfDmaFec4GdlRespPtrLoRegVf,
> +	.tail_ptrs_ul4g_hi = HWVfDmaFec4GulRespPtrHiRegVf,
> +	.tail_ptrs_ul4g_lo = HWVfDmaFec4GulRespPtrLoRegVf,
> +	.depth_log0_offset = HWVfQmgrGrpDepthLog20Vf,
> +	.depth_log1_offset = HWVfQmgrGrpDepthLog21Vf,
> +	.qman_group_func = HWVfQmgrGrpFunction0Vf,
> +	.ddr_range = HWVfDmaDdrBaseRangeRoVf,
> +};
> +
>  /* Private data structure for each ACC100 device */
>  struct acc100_device {
>  	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
> --
> 1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v3 04/11] baseband/acc100: add queue configuration
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 04/11] baseband/acc100: add queue configuration Nicolas Chautru
@ 2020-08-29 10:39   ` Xu, Rosen
  2020-08-29 17:48     ` Chautru, Nicolas
  0 siblings, 1 reply; 213+ messages in thread
From: Xu, Rosen @ 2020-08-29 10:39 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, akhil.goyal
  Cc: Richardson, Bruce, Chautru, Nicolas, Xu, Rosen

Hi,

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Nicolas Chautru
> Sent: Wednesday, August 19, 2020 8:25
> To: dev@dpdk.org; akhil.goyal@nxp.com
> Cc: Richardson, Bruce <bruce.richardson@intel.com>; Chautru, Nicolas
> <nicolas.chautru@intel.com>
> Subject: [dpdk-dev] [PATCH v3 04/11] baseband/acc100: add queue
> configuration
> 
> Adding function to create and configure queues for the device. Still no
> capability.
> 
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
>  drivers/baseband/acc100/rte_acc100_pmd.c | 420 ++++++++++++++++++++++++++++++-
>  drivers/baseband/acc100/rte_acc100_pmd.h |  45 ++++
>  2 files changed, 464 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> b/drivers/baseband/acc100/rte_acc100_pmd.c
> index 7807a30..7a21c57 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> @@ -26,6 +26,22 @@
>  RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);  #endif
> 
> +/* Write to MMIO register address */
> +static inline void
> +mmio_write(void *addr, uint32_t value)
> +{
> +	*((volatile uint32_t *)(addr)) = rte_cpu_to_le_32(value);
> +}
> +
> +/* Write a register of an ACC100 device */
> +static inline void
> +acc100_reg_write(struct acc100_device *d, uint32_t offset, uint32_t payload)
> +{
> +	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
> +	mmio_write(reg_addr, payload);
> +	usleep(1000);
> +}
> +
>  /* Read a register of a ACC100 device */
>  static inline uint32_t
>  acc100_reg_read(struct acc100_device *d, uint32_t offset)
> @@ -36,6 +52,22 @@
>  	return rte_le_to_cpu_32(ret);
>  }
> 
> +/* Basic Implementation of Log2 for exact 2^N */
> +static inline uint32_t
> +log2_basic(uint32_t value)
> +{
> +	return (value == 0) ? 0 : __builtin_ctz(value);
> +}
> +
> +/* Calculate memory alignment offset assuming alignment is 2^N */
> +static inline uint32_t
> +calc_mem_alignment_offset(void *unaligned_virt_mem, uint32_t alignment)
> +{
> +	rte_iova_t unaligned_phy_mem = rte_malloc_virt2iova(unaligned_virt_mem);
> +	return (uint32_t)(alignment - (unaligned_phy_mem & (alignment - 1)));
> +}
> +
>  /* Calculate the offset of the enqueue register */  static inline uint32_t
> queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
> @@ -204,10 +236,393 @@
>  			acc100_conf->q_dl_5g.aq_depth_log2);
>  }
> 
> +static void
> +free_base_addresses(void **base_addrs, int size)
> +{
> +	int i;
> +	for (i = 0; i < size; i++)
> +		rte_free(base_addrs[i]);
> +}
> +
> +static inline uint32_t
> +get_desc_len(void)
> +{
> +	return sizeof(union acc100_dma_desc);
> +}
> +
> +/* Allocate the 2 * 64MB block for the sw rings */
> +static int
> +alloc_2x64mb_sw_rings_mem(struct rte_bbdev *dev, struct acc100_device *d,
> +		int socket)
> +{
> +	uint32_t sw_ring_size = ACC100_SIZE_64MBYTE;
> +	d->sw_rings_base = rte_zmalloc_socket(dev->device->driver->name,
> +			2 * sw_ring_size, RTE_CACHE_LINE_SIZE, socket);
> +	if (d->sw_rings_base == NULL) {
> +		rte_bbdev_log(ERR, "Failed to allocate memory for %s:%u",
> +				dev->device->driver->name,
> +				dev->data->dev_id);
> +		return -ENOMEM;
> +	}
> +	memset(d->sw_rings_base, 0, ACC100_SIZE_64MBYTE);
> +	uint32_t next_64mb_align_offset = calc_mem_alignment_offset(
> +			d->sw_rings_base, ACC100_SIZE_64MBYTE);
> +	d->sw_rings = RTE_PTR_ADD(d->sw_rings_base, next_64mb_align_offset);
> +	d->sw_rings_phys = rte_malloc_virt2iova(d->sw_rings_base) +
> +			next_64mb_align_offset;
> +	d->sw_ring_size = MAX_QUEUE_DEPTH * get_desc_len();
> +	d->sw_ring_max_depth = d->sw_ring_size / get_desc_len();
> +
> +	return 0;
> +}

Why not use a common memory allocation function, instead of a special-purpose function for each memory size?
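For context on the question above: the 2 * 64MB variant relies on a standard over-allocation trick. Requesting twice the alignment size guarantees that a fully aligned window of the required size exists somewhere inside the block, whatever address the allocator returns. A minimal sketch with a smaller alignment for illustration (names are illustrative; the driver's calc_mem_alignment_offset() computes the offset on the IOVA rather than the virtual address, and note it yields `alignment`, not 0, for an already-aligned address):

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Distance from `addr` to the next 2^N boundary; returns `alignment`
 * (not 0) when `addr` is already aligned, mirroring the driver. */
static uint32_t align_offset(uintptr_t addr, uint32_t alignment)
{
	return (uint32_t)(alignment - (addr & (alignment - 1)));
}

/* Over-allocate 2x `alignment` so an aligned region of `alignment`
 * bytes is guaranteed to fit. The caller frees *base_out, never the
 * returned aligned pointer. */
static void *alloc_aligned_block(uint32_t alignment, void **base_out)
{
	void *base = malloc(2 * (size_t)alignment);
	if (base == NULL)
		return NULL;
	*base_out = base;
	return (char *)base + align_offset((uintptr_t)base, alignment);
}
```

The price is up to one extra `alignment` worth of memory per allocation, which is why the minimal-memory path is attempted first in the patch.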

> +/* Attempt to allocate minimised memory space for sw rings */
> +static void
> +alloc_sw_rings_min_mem(struct rte_bbdev *dev, struct acc100_device *d,
> +		uint16_t num_queues, int socket)
> +{
> +	rte_iova_t sw_rings_base_phy, next_64mb_align_addr_phy;
> +	uint32_t next_64mb_align_offset;
> +	rte_iova_t sw_ring_phys_end_addr;
> +	void *base_addrs[SW_RING_MEM_ALLOC_ATTEMPTS];
> +	void *sw_rings_base;
> +	int i = 0;
> +	uint32_t q_sw_ring_size = MAX_QUEUE_DEPTH * get_desc_len();
> +	uint32_t dev_sw_ring_size = q_sw_ring_size * num_queues;
> +
> +	/* Find an aligned block of memory to store sw rings */
> +	while (i < SW_RING_MEM_ALLOC_ATTEMPTS) {
> +		/*
> +		 * sw_ring allocated memory is guaranteed to be aligned to
> +		 * q_sw_ring_size at the condition that the requested size is
> +		 * less than the page size
> +		 */
> +		sw_rings_base = rte_zmalloc_socket(
> +				dev->device->driver->name,
> +				dev_sw_ring_size, q_sw_ring_size, socket);
> +
> +		if (sw_rings_base == NULL) {
> +			rte_bbdev_log(ERR,
> +					"Failed to allocate memory for %s:%u",
> +					dev->device->driver->name,
> +					dev->data->dev_id);
> +			break;
> +		}
> +
> +		sw_rings_base_phy = rte_malloc_virt2iova(sw_rings_base);
> +		next_64mb_align_offset = calc_mem_alignment_offset(
> +				sw_rings_base, ACC100_SIZE_64MBYTE);
> +		next_64mb_align_addr_phy = sw_rings_base_phy +
> +				next_64mb_align_offset;
> +		sw_ring_phys_end_addr = sw_rings_base_phy + dev_sw_ring_size;
> +
> +		/* Check if the end of the sw ring memory block is before the
> +		 * start of next 64MB aligned mem address
> +		 */
> +		if (sw_ring_phys_end_addr < next_64mb_align_addr_phy) {
> +			d->sw_rings_phys = sw_rings_base_phy;
> +			d->sw_rings = sw_rings_base;
> +			d->sw_rings_base = sw_rings_base;
> +			d->sw_ring_size = q_sw_ring_size;
> +			d->sw_ring_max_depth = MAX_QUEUE_DEPTH;
> +			break;
> +		}
> +		/* Store the address of the unaligned mem block */
> +		base_addrs[i] = sw_rings_base;
> +		i++;
> +	}
> +
> +	/* Free all unaligned blocks of mem allocated in the loop */
> +	free_base_addresses(base_addrs, i);
> +}

It looks strange to first allocate memory and then free it, without performing any operations on that memory.
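On the point above: the allocate-then-free sequence is deliberate. Each failed attempt must stay allocated until the loop finishes, otherwise the allocator could hand back the same unsuitable address on the next iteration. The acceptance test the loop applies — does the whole ring block end before the next 64MB physical boundary? — can be sketched as follows (illustrative names, not the driver's API):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if [start, start + size) ends before the next `boundary`-aligned
 * address above `start` (boundary must be a power of two). An already
 * aligned `start` counts the *next* boundary, as in the driver. */
static bool fits_before_boundary(uint64_t start, uint32_t size,
		uint32_t boundary)
{
	uint64_t next = start + (boundary - (start & (boundary - 1ull)));
	return start + size < next;
}
```

Only once an attempt passes this test (or the attempt budget runs out) are the recorded failed blocks released.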

> +
> +/* Allocate 64MB memory used for all software rings */
> +static int
> +acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
> +{
> +	uint32_t phys_low, phys_high, payload;
> +	struct acc100_device *d = dev->data->dev_private;
> +	const struct acc100_registry_addr *reg_addr;
> +
> +	if (d->pf_device && !d->acc100_conf.pf_mode_en) {
> +		rte_bbdev_log(NOTICE,
> +				"%s has PF mode disabled. This PF can't be used.",
> +				dev->data->name);
> +		return -ENODEV;
> +	}
> +
> +	alloc_sw_rings_min_mem(dev, d, num_queues, socket_id);
> +
> +	/* If minimal memory space approach failed, then allocate
> +	 * the 2 * 64MB block for the sw rings
> +	 */
> +	if (d->sw_rings == NULL)
> +		alloc_2x64mb_sw_rings_mem(dev, d, socket_id);
> +
> +	/* Configure ACC100 with the base address for DMA descriptor rings
> +	 * Same descriptor rings used for UL and DL DMA Engines
> +	 * Note : Assuming only VF0 bundle is used for PF mode
> +	 */
> +	phys_high = (uint32_t)(d->sw_rings_phys >> 32);
> +	phys_low  = (uint32_t)(d->sw_rings_phys & ~(ACC100_SIZE_64MBYTE-1));
> +
> +	/* Choose correct registry addresses for the device type */
> +	if (d->pf_device)
> +		reg_addr = &pf_reg_addr;
> +	else
> +		reg_addr = &vf_reg_addr;
> +
> +	/* Read the populated cfg from ACC100 registers */
> +	fetch_acc100_config(dev);
> +
> +	/* Mark as configured properly */
> +	d->configured = true;
> +
> +	/* Release AXI from PF */
> +	if (d->pf_device)
> +		acc100_reg_write(d, HWPfDmaAxiControl, 1);
> +
> +	acc100_reg_write(d, reg_addr->dma_ring_ul5g_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->dma_ring_ul5g_lo, phys_low);
> +	acc100_reg_write(d, reg_addr->dma_ring_dl5g_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->dma_ring_dl5g_lo, phys_low);
> +	acc100_reg_write(d, reg_addr->dma_ring_ul4g_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->dma_ring_ul4g_lo, phys_low);
> +	acc100_reg_write(d, reg_addr->dma_ring_dl4g_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->dma_ring_dl4g_lo, phys_low);
> +
> +	/*
> +	 * Configure Ring Size to the max queue ring size
> +	 * (used for wrapping purpose)
> +	 */
> +	payload = log2_basic(d->sw_ring_size / 64);
> +	acc100_reg_write(d, reg_addr->ring_size, payload);
> +
> +	/* Configure tail pointer for use when SDONE enabled */
> +	d->tail_ptrs = rte_zmalloc_socket(
> +			dev->device->driver->name,
> +			ACC100_NUM_QGRPS * ACC100_NUM_AQS * sizeof(uint32_t),
> +			RTE_CACHE_LINE_SIZE, socket_id);
> +	if (d->tail_ptrs == NULL) {
> +		rte_bbdev_log(ERR, "Failed to allocate tail ptr for %s:%u",
> +				dev->device->driver->name,
> +				dev->data->dev_id);
> +		rte_free(d->sw_rings);
> +		return -ENOMEM;
> +	}
> +	d->tail_ptr_phys = rte_malloc_virt2iova(d->tail_ptrs);
> +
> +	phys_high = (uint32_t)(d->tail_ptr_phys >> 32);
> +	phys_low  = (uint32_t)(d->tail_ptr_phys);
> +	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_lo, phys_low);
> +	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_lo, phys_low);
> +	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_lo, phys_low);
> +	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
> +
> +	d->harq_layout = rte_zmalloc_socket("HARQ Layout",
> +			ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
> +			RTE_CACHE_LINE_SIZE, dev->data->socket_id);
> +
> +	rte_bbdev_log_debug(
> +			"ACC100 (%s) configured  sw_rings = %p, sw_rings_phys = %#"
> +			PRIx64, dev->data->name, d->sw_rings, d->sw_rings_phys);
> +
> +	return 0;
> +}
> +
>  /* Free 64MB memory used for software rings */
>  static int
> -acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
> +acc100_dev_close(struct rte_bbdev *dev)
>  {
> +	struct acc100_device *d = dev->data->dev_private;
> +	if (d->sw_rings_base != NULL) {
> +		rte_free(d->tail_ptrs);
> +		rte_free(d->sw_rings_base);
> +		d->sw_rings_base = NULL;
> +	}
> +	usleep(1000);
> +	return 0;
> +}
> +
> +
> +/**
> + * Report an ACC100 queue index which is free
> + * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
> + * Note : Only supporting VF0 Bundle for PF mode
> + */
> +static int
> +acc100_find_free_queue_idx(struct rte_bbdev *dev,
> +		const struct rte_bbdev_queue_conf *conf)
> +{
> +	struct acc100_device *d = dev->data->dev_private;
> +	int op_2_acc[5] = {0, UL_4G, DL_4G, UL_5G, DL_5G};
> +	int acc = op_2_acc[conf->op_type];
> +	struct rte_q_topology_t *qtop = NULL;
> +	qtopFromAcc(&qtop, acc, &(d->acc100_conf));
> +	if (qtop == NULL)
> +		return -1;
> +	/* Identify matching QGroup Index which are sorted in priority order */
> +	uint16_t group_idx = qtop->first_qgroup_index;
> +	group_idx += conf->priority;
> +	if (group_idx >= ACC100_NUM_QGRPS ||
> +			conf->priority >= qtop->num_qgroups) {
> +		rte_bbdev_log(INFO, "Invalid Priority on %s, priority %u",
> +				dev->data->name, conf->priority);
> +		return -1;
> +	}
> +	/* Find a free AQ_idx  */
> +	uint16_t aq_idx;
> +	for (aq_idx = 0; aq_idx < qtop->num_aqs_per_groups; aq_idx++) {
> +		if (((d->q_assigned_bit_map[group_idx] >> aq_idx) & 0x1) == 0) {
> +			/* Mark the Queue as assigned */
> +			d->q_assigned_bit_map[group_idx] |= (1 << aq_idx);
> +			/* Report the AQ Index */
> +			return (group_idx << GRP_ID_SHIFT) + aq_idx;
> +		}
> +	}
> +	rte_bbdev_log(INFO, "Failed to find free queue on %s, priority %u",
> +			dev->data->name, conf->priority);
> +	return -1;
> +}
> +
> +/* Setup ACC100 queue */
> +static int
> +acc100_queue_setup(struct rte_bbdev *dev, uint16_t queue_id,
> +		const struct rte_bbdev_queue_conf *conf)
> +{
> +	struct acc100_device *d = dev->data->dev_private;
> +	struct acc100_queue *q;
> +	int16_t q_idx;
> +
> +	/* Allocate the queue data structure. */
> +	q = rte_zmalloc_socket(dev->device->driver->name, sizeof(*q),
> +			RTE_CACHE_LINE_SIZE, conf->socket);
> +	if (q == NULL) {
> +		rte_bbdev_log(ERR, "Failed to allocate queue memory");
> +		return -ENOMEM;
> +	}
> +
> +	q->d = d;
> +	q->ring_addr = RTE_PTR_ADD(d->sw_rings, (d->sw_ring_size * queue_id));
> +	q->ring_addr_phys = d->sw_rings_phys + (d->sw_ring_size * queue_id);
> +
> +	/* Prepare the Ring with default descriptor format */
> +	union acc100_dma_desc *desc = NULL;
> +	unsigned int desc_idx, b_idx;
> +	int fcw_len = (conf->op_type == RTE_BBDEV_OP_LDPC_ENC ?
> +		ACC100_FCW_LE_BLEN : (conf->op_type == RTE_BBDEV_OP_TURBO_DEC ?
> +		ACC100_FCW_TD_BLEN : ACC100_FCW_LD_BLEN));
> +
> +	for (desc_idx = 0; desc_idx < d->sw_ring_max_depth; desc_idx++) {
> +		desc = q->ring_addr + desc_idx;
> +		desc->req.word0 = ACC100_DMA_DESC_TYPE;
> +		desc->req.word1 = 0; /**< Timestamp */
> +		desc->req.word2 = 0;
> +		desc->req.word3 = 0;
> +		uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
> +		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
> +		desc->req.data_ptrs[0].blen = fcw_len;
> +		desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
> +		desc->req.data_ptrs[0].last = 0;
> +		desc->req.data_ptrs[0].dma_ext = 0;
> +		for (b_idx = 1; b_idx < ACC100_DMA_MAX_NUM_POINTERS - 1;
> +				b_idx++) {
> +			desc->req.data_ptrs[b_idx].blkid = ACC100_DMA_BLKID_IN;
> +			desc->req.data_ptrs[b_idx].last = 1;
> +			desc->req.data_ptrs[b_idx].dma_ext = 0;
> +			b_idx++;
> +			desc->req.data_ptrs[b_idx].blkid =
> +					ACC100_DMA_BLKID_OUT_ENC;
> +			desc->req.data_ptrs[b_idx].last = 1;
> +			desc->req.data_ptrs[b_idx].dma_ext = 0;
> +		}
> +		/* Preset some fields of LDPC FCW */
> +		desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
> +		desc->req.fcw_ld.gain_i = 1;
> +		desc->req.fcw_ld.gain_h = 1;
> +	}
> +
> +	q->lb_in = rte_zmalloc_socket(dev->device->driver->name,
> +			RTE_CACHE_LINE_SIZE,
> +			RTE_CACHE_LINE_SIZE, conf->socket);
> +	if (q->lb_in == NULL) {
> +		rte_bbdev_log(ERR, "Failed to allocate lb_in memory");
> +		return -ENOMEM;
> +	}
> +	q->lb_in_addr_phys = rte_malloc_virt2iova(q->lb_in);
> +	q->lb_out = rte_zmalloc_socket(dev->device->driver->name,
> +			RTE_CACHE_LINE_SIZE,
> +			RTE_CACHE_LINE_SIZE, conf->socket);
> +	if (q->lb_out == NULL) {
> +		rte_bbdev_log(ERR, "Failed to allocate lb_out memory");
> +		return -ENOMEM;
> +	}
> +	q->lb_out_addr_phys = rte_malloc_virt2iova(q->lb_out);
> +
> +	/*
> +	 * Software queue ring wraps synchronously with the HW when it reaches
> +	 * the boundary of the maximum allocated queue size, no matter what the
> +	 * sw queue size is. This wrapping is guarded by setting the wrap_mask
> +	 * to represent the maximum queue size as allocated at the time when
> +	 * the device has been setup (in configure()).
> +	 *
> +	 * The queue depth is set to the queue size value (conf->queue_size).
> +	 * This limits the occupancy of the queue at any point of time, so that
> +	 * the queue does not get swamped with enqueue requests.
> +	 */
> +	q->sw_ring_depth = conf->queue_size;
> +	q->sw_ring_wrap_mask = d->sw_ring_max_depth - 1;
> +
> +	q->op_type = conf->op_type;
> +
> +	q_idx = acc100_find_free_queue_idx(dev, conf);
> +	if (q_idx == -1) {
> +		rte_free(q);
> +		return -1;
> +	}
> +
> +	q->qgrp_id = (q_idx >> GRP_ID_SHIFT) & 0xF;
> +	q->vf_id = (q_idx >> VF_ID_SHIFT)  & 0x3F;
> +	q->aq_id = q_idx & 0xF;
> +	q->aq_depth = (conf->op_type ==  RTE_BBDEV_OP_TURBO_DEC) ?
> +			(1 << d->acc100_conf.q_ul_4g.aq_depth_log2) :
> +			(1 << d->acc100_conf.q_dl_4g.aq_depth_log2);
> +
> +	q->mmio_reg_enqueue = RTE_PTR_ADD(d->mmio_base,
> +			queue_offset(d->pf_device,
> +					q->vf_id, q->qgrp_id, q->aq_id));
> +
> +	rte_bbdev_log_debug(
> +			"Setup dev%u q%u: qgrp_id=%u, vf_id=%u, aq_id=%u, aq_depth=%u, mmio_reg_enqueue=%p",
> +			dev->data->dev_id, queue_id, q->qgrp_id, q->vf_id,
> +			q->aq_id, q->aq_depth, q->mmio_reg_enqueue);
> +
> +	dev->data->queues[queue_id].queue_private = q;
> +	return 0;
> +}
> +
> +/* Release ACC100 queue */
> +static int
> +acc100_queue_release(struct rte_bbdev *dev, uint16_t q_id)
> +{
> +	struct acc100_device *d = dev->data->dev_private;
> +	struct acc100_queue *q = dev->data->queues[q_id].queue_private;
> +
> +	if (q != NULL) {
> +		/* Mark the Queue as un-assigned */
> +		d->q_assigned_bit_map[q->qgrp_id] &= (0xFFFFFFFF -
> +				(1 << q->aq_id));
> +		rte_free(q->lb_in);
> +		rte_free(q->lb_out);
> +		rte_free(q);
> +		dev->data->queues[q_id].queue_private = NULL;
> +	}
> +
>  	return 0;
>  }
> 
> @@ -258,8 +673,11 @@
>  }
> 
>  static const struct rte_bbdev_ops acc100_bbdev_ops = {
> +	.setup_queues = acc100_setup_queues,
>  	.close = acc100_dev_close,
>  	.info_get = acc100_dev_info_get,
> +	.queue_setup = acc100_queue_setup,
> +	.queue_release = acc100_queue_release,
>  };
> 
>  /* ACC100 PCI PF address map */
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h
> b/drivers/baseband/acc100/rte_acc100_pmd.h
> index 662e2c8..0e2b79c 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> @@ -518,11 +518,56 @@ struct acc100_registry_addr {
>  	.ddr_range = HWVfDmaDdrBaseRangeRoVf,
>  };
> 
> +/* Structure associated with each queue. */
> +struct __rte_cache_aligned acc100_queue {
> +	union acc100_dma_desc *ring_addr;  /* Virtual address of sw ring */
> +	rte_iova_t ring_addr_phys;  /* Physical address of software ring */
> +	uint32_t sw_ring_head;  /* software ring head */
> +	uint32_t sw_ring_tail;  /* software ring tail */
> +	/* software ring size (descriptors, not bytes) */
> +	uint32_t sw_ring_depth;
> +	/* mask used to wrap enqueued descriptors on the sw ring */
> +	uint32_t sw_ring_wrap_mask;
> +	/* MMIO register used to enqueue descriptors */
> +	void *mmio_reg_enqueue;
> +	uint8_t vf_id;  /* VF ID (max = 63) */
> +	uint8_t qgrp_id;  /* Queue Group ID */
> +	uint16_t aq_id;  /* Atomic Queue ID */
> +	uint16_t aq_depth;  /* Depth of atomic queue */
> +	uint32_t aq_enqueued;  /* Count how many "batches" have been enqueued */
> +	uint32_t aq_dequeued;  /* Count how many "batches" have been dequeued */
> +	uint32_t irq_enable;  /* Enable ops dequeue interrupts if set to 1 */
> +	struct rte_mempool *fcw_mempool;  /* FCW mempool */
> +	enum rte_bbdev_op_type op_type;  /* Type of this Queue: TE or TD */
> +	/* Internal Buffers for loopback input */
> +	uint8_t *lb_in;
> +	uint8_t *lb_out;
> +	rte_iova_t lb_in_addr_phys;
> +	rte_iova_t lb_out_addr_phys;
> +	struct acc100_device *d;
> +};
> +
>  /* Private data structure for each ACC100 device */
>  struct acc100_device {
>  	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
> +	void *sw_rings_base;  /* Base addr of un-aligned memory for sw rings */
> +	void *sw_rings;  /* 64MBs of 64MB aligned memory for sw rings */
> +	rte_iova_t sw_rings_phys;  /* Physical address of sw_rings */
> +	/* Virtual address of the info memory routed to this function under
> +	 * operation, whether it is PF or VF.
> +	 */
> +	union acc100_harq_layout_data *harq_layout;
> +	uint32_t sw_ring_size;
>  	uint32_t ddr_size; /* Size in kB */
> +	uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */
> +	rte_iova_t tail_ptr_phys; /* Physical address of tail pointers */
> +	/* Max number of entries available for each queue in device, depending
> +	 * on how many queues are enabled with configure()
> +	 */
> +	uint32_t sw_ring_max_depth;
>  	struct acc100_conf acc100_conf; /* ACC100 Initial configuration */
> +	/* Bitmap capturing which Queues have already been assigned */
> +	uint16_t q_assigned_bit_map[ACC100_NUM_QGRPS];
>  	bool pf_device; /**< True if this is a PF ACC100 device */
>  	bool configured; /**< True if this ACC100 device is configured */
>  };
> --
> 1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions Nicolas Chautru
  2020-08-20 14:38   ` Dave Burley
@ 2020-08-29 11:10   ` Xu, Rosen
  2020-08-29 18:01     ` Chautru, Nicolas
  1 sibling, 1 reply; 213+ messages in thread
From: Xu, Rosen @ 2020-08-29 11:10 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, akhil.goyal
  Cc: Richardson, Bruce, Chautru, Nicolas, Xu, Rosen

Hi,

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Nicolas Chautru
> Sent: Wednesday, August 19, 2020 8:25
> To: dev@dpdk.org; akhil.goyal@nxp.com
> Cc: Richardson, Bruce <bruce.richardson@intel.com>; Chautru, Nicolas
> <nicolas.chautru@intel.com>
> Subject: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC
> processing functions
> 
> Adding LDPC decode and encode processing operations
> 
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
>  drivers/baseband/acc100/rte_acc100_pmd.c | 1625 +++++++++++++++++++++++++++++-
>  drivers/baseband/acc100/rte_acc100_pmd.h |    3 +
>  2 files changed, 1626 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> b/drivers/baseband/acc100/rte_acc100_pmd.c
> index 7a21c57..5f32813 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> @@ -15,6 +15,9 @@
>  #include <rte_hexdump.h>
>  #include <rte_pci.h>
>  #include <rte_bus_pci.h>
> +#ifdef RTE_BBDEV_OFFLOAD_COST
> +#include <rte_cycles.h>
> +#endif
> 
>  #include <rte_bbdev.h>
>  #include <rte_bbdev_pmd.h>
> @@ -449,7 +452,6 @@
>  	return 0;
>  }
> 
> -
>  /**
>   * Report a ACC100 queue index which is free
>   * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
> @@ -634,6 +636,46 @@
>  	struct acc100_device *d = dev->data->dev_private;
> 
>  	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
> +		{
> +			.type   = RTE_BBDEV_OP_LDPC_ENC,
> +			.cap.ldpc_enc = {
> +				.capability_flags =
> +					RTE_BBDEV_LDPC_RATE_MATCH |
> +					RTE_BBDEV_LDPC_CRC_24B_ATTACH |
> +					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
> +				.num_buffers_src =
> +					RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> +				.num_buffers_dst =
> +					RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> +			}
> +		},
> +		{
> +			.type   = RTE_BBDEV_OP_LDPC_DEC,
> +			.cap.ldpc_dec = {
> +			.capability_flags =
> +				RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK |
> +				RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP |
> +				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
> +				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
> +#ifdef ACC100_EXT_MEM
> +				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
> +				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
> +#endif
> +				RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE |
> +				RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS |
> +				RTE_BBDEV_LDPC_DECODE_BYPASS |
> +				RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
> +				RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
> +				RTE_BBDEV_LDPC_LLR_COMPRESSION,
> +			.llr_size = 8,
> +			.llr_decimals = 1,
> +			.num_buffers_src =
> +				RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> +			.num_buffers_hard_out =
> +				RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> +			.num_buffers_soft_out = 0,
> +			}
> +		},
>  		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
>  	};
> 
> @@ -669,9 +711,14 @@
>  	dev_info->cpu_flag_reqs = NULL;
>  	dev_info->min_alignment = 64;
>  	dev_info->capabilities = bbdev_capabilities;
> +#ifdef ACC100_EXT_MEM
>  	dev_info->harq_buffer_size = d->ddr_size;
> +#else
> +	dev_info->harq_buffer_size = 0;
> +#endif
>  }
> 
> +
>  static const struct rte_bbdev_ops acc100_bbdev_ops = {
>  	.setup_queues = acc100_setup_queues,
>  	.close = acc100_dev_close,
> @@ -696,6 +743,1577 @@
>  	{.device_id = 0},
>  };
> 
> +/* Read flag value 0/1 from bitmap */
> +static inline bool
> +check_bit(uint32_t bitmap, uint32_t bitmask)
> +{
> +	return bitmap & bitmask;
> +}
> +
> +static inline char *
> +mbuf_append(struct rte_mbuf *m_head, struct rte_mbuf *m, uint16_t len)
> +{
> +	if (unlikely(len > rte_pktmbuf_tailroom(m)))
> +		return NULL;
> +
> +	char *tail = (char *)m->buf_addr + m->data_off + m->data_len;
> +	m->data_len = (uint16_t)(m->data_len + len);
> +	m_head->pkt_len  = (m_head->pkt_len + len);
> +	return tail;
> +}

Is it reasonable to directly increment the data_len of the rte_mbuf?

> +/* Compute value of k0.
> + * Based on 3GPP 38.212 Table 5.4.2.1-2
> + * Starting position of different redundancy versions, k0
> + */
> +static inline uint16_t
> +get_k0(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv_index)
> +{
> +	if (rv_index == 0)
> +		return 0;
> +	uint16_t n = (bg == 1 ? N_ZC_1 : N_ZC_2) * z_c;
> +	if (n_cb == n) {
> +		if (rv_index == 1)
> +			return (bg == 1 ? K0_1_1 : K0_1_2) * z_c;
> +		else if (rv_index == 2)
> +			return (bg == 1 ? K0_2_1 : K0_2_2) * z_c;
> +		else
> +			return (bg == 1 ? K0_3_1 : K0_3_2) * z_c;
> +	}
> +	/* LBRM case - includes a division by N */
> +	if (rv_index == 1)
> +		return (((bg == 1 ? K0_1_1 : K0_1_2) * n_cb)
> +				/ n) * z_c;
> +	else if (rv_index == 2)
> +		return (((bg == 1 ? K0_2_1 : K0_2_2) * n_cb)
> +				/ n) * z_c;
> +	else
> +		return (((bg == 1 ? K0_3_1 : K0_3_2) * n_cb)
> +				/ n) * z_c;
> +}
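For readers checking get_k0() above against the spec: it encodes the redundancy-version starting positions of 3GPP TS 38.212 Table 5.4.2.1-2, i.e. k0 = floor(num * Ncb / (den * Zc)) * Zc. A table-driven sketch, on the assumption that the driver's K0_*_* and N_ZC_* constants carry the standard's 17/33/56-out-of-66 (BG1) and 13/25/43-out-of-50 (BG2) fractions:

```c
#include <assert.h>
#include <stdint.h>

/* k0 per 3GPP TS 38.212 Table 5.4.2.1-2 (assumed equivalent to the
 * driver's constants). n_cb: circular buffer length, z_c: lifting size,
 * bg: base graph (1 or 2), rv: redundancy version (0..3). */
static uint16_t k0_38212(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv)
{
	static const uint32_t num[2][4] = {
		{0, 17, 33, 56},	/* BG1, denominator 66 */
		{0, 13, 25, 43},	/* BG2, denominator 50 */
	};
	uint32_t den = (bg == 1) ? 66 : 50;

	if (rv == 0)
		return 0;
	return (uint16_t)((num[bg - 1][rv] * (uint32_t)n_cb /
			(den * z_c)) * z_c);
}
```

For a full buffer (n_cb == den * z_c) this reduces to num * z_c, which is the non-LBRM fast path in get_k0().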
> +
> +/* Fill in a frame control word for LDPC encoding. */
> +static inline void
> +acc100_fcw_le_fill(const struct rte_bbdev_enc_op *op,
> +		struct acc100_fcw_le *fcw, int num_cb)
> +{
> +	fcw->qm = op->ldpc_enc.q_m;
> +	fcw->nfiller = op->ldpc_enc.n_filler;
> +	fcw->BG = (op->ldpc_enc.basegraph - 1);
> +	fcw->Zc = op->ldpc_enc.z_c;
> +	fcw->ncb = op->ldpc_enc.n_cb;
> +	fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_enc.basegraph,
> +			op->ldpc_enc.rv_index);
> +	fcw->rm_e = op->ldpc_enc.cb_params.e;
> +	fcw->crc_select = check_bit(op->ldpc_enc.op_flags,
> +			RTE_BBDEV_LDPC_CRC_24B_ATTACH);
> +	fcw->bypass_intlv = check_bit(op->ldpc_enc.op_flags,
> +			RTE_BBDEV_LDPC_INTERLEAVER_BYPASS);
> +	fcw->mcb_count = num_cb;
> +}
> +
> +/* Fill in a frame control word for LDPC decoding. */
> +static inline void
> +acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
> +		union acc100_harq_layout_data *harq_layout)
> +{
> +	uint16_t harq_out_length, harq_in_length, ncb_p, k0_p, parity_offset;
> +	uint16_t harq_index;
> +	uint32_t l;
> +	bool harq_prun = false;
> +
> +	fcw->qm = op->ldpc_dec.q_m;
> +	fcw->nfiller = op->ldpc_dec.n_filler;
> +	fcw->BG = (op->ldpc_dec.basegraph - 1);
> +	fcw->Zc = op->ldpc_dec.z_c;
> +	fcw->ncb = op->ldpc_dec.n_cb;
> +	fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_dec.basegraph,
> +			op->ldpc_dec.rv_index);
> +	if (op->ldpc_dec.code_block_mode == 1)
> +		fcw->rm_e = op->ldpc_dec.cb_params.e;
> +	else
> +		fcw->rm_e = (op->ldpc_dec.tb_params.r <
> +				op->ldpc_dec.tb_params.cab) ?
> +						op->ldpc_dec.tb_params.ea :
> +						op->ldpc_dec.tb_params.eb;
> +
> +	fcw->hcin_en = check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE);
> +	fcw->hcout_en = check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE);
> +	fcw->crc_select = check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK);
> +	fcw->bypass_dec = check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_DECODE_BYPASS);
> +	fcw->bypass_intlv = check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS);
> +	if (op->ldpc_dec.q_m == 1) {
> +		fcw->bypass_intlv = 1;
> +		fcw->qm = 2;
> +	}
> +	fcw->hcin_decomp_mode = check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> +	fcw->hcout_comp_mode = check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> +	fcw->llr_pack_mode = check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_LLR_COMPRESSION);
> +	harq_index = op->ldpc_dec.harq_combined_output.offset /
> +			ACC100_HARQ_OFFSET;
> +#ifdef ACC100_EXT_MEM
> +	/* Limit cases when HARQ pruning is valid */
> +	harq_prun = ((op->ldpc_dec.harq_combined_output.offset %
> +			ACC100_HARQ_OFFSET) == 0) &&
> +			(op->ldpc_dec.harq_combined_output.offset <=
> +			UINT16_MAX * ACC100_HARQ_OFFSET);
> +#endif
> +	if (fcw->hcin_en > 0) {
> +		harq_in_length = op->ldpc_dec.harq_combined_input.length;
> +		if (fcw->hcin_decomp_mode > 0)
> +			harq_in_length = harq_in_length * 8 / 6;
> +		harq_in_length = RTE_ALIGN(harq_in_length, 64);
> +		if ((harq_layout[harq_index].offset > 0) & harq_prun) {
> +			rte_bbdev_log_debug("HARQ IN offset unexpected for now\n");
> +			fcw->hcin_size0 = harq_layout[harq_index].size0;
> +			fcw->hcin_offset = harq_layout[harq_index].offset;
> +			fcw->hcin_size1 = harq_in_length -
> +					harq_layout[harq_index].offset;
> +		} else {
> +			fcw->hcin_size0 = harq_in_length;
> +			fcw->hcin_offset = 0;
> +			fcw->hcin_size1 = 0;
> +		}
> +	} else {
> +		fcw->hcin_size0 = 0;
> +		fcw->hcin_offset = 0;
> +		fcw->hcin_size1 = 0;
> +	}
> +
> +	fcw->itmax = op->ldpc_dec.iter_max;
> +	fcw->itstop = check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE);
> +	fcw->synd_precoder = fcw->itstop;
> +	/*
> +	 * These are all implicitly set
> +	 * fcw->synd_post = 0;
> +	 * fcw->so_en = 0;
> +	 * fcw->so_bypass_rm = 0;
> +	 * fcw->so_bypass_intlv = 0;
> +	 * fcw->dec_convllr = 0;
> +	 * fcw->hcout_convllr = 0;
> +	 * fcw->hcout_size1 = 0;
> +	 * fcw->so_it = 0;
> +	 * fcw->hcout_offset = 0;
> +	 * fcw->negstop_th = 0;
> +	 * fcw->negstop_it = 0;
> +	 * fcw->negstop_en = 0;
> +	 * fcw->gain_i = 1;
> +	 * fcw->gain_h = 1;
> +	 */
> +	if (fcw->hcout_en > 0) {
> +		parity_offset = (op->ldpc_dec.basegraph == 1 ? 20 : 8)
> +			* op->ldpc_dec.z_c - op->ldpc_dec.n_filler;
> +		k0_p = (fcw->k0 > parity_offset) ?
> +				fcw->k0 - op->ldpc_dec.n_filler : fcw->k0;
> +		ncb_p = fcw->ncb - op->ldpc_dec.n_filler;
> +		l = k0_p + fcw->rm_e;
> +		harq_out_length = (uint16_t) fcw->hcin_size0;
> +		harq_out_length = RTE_MIN(RTE_MAX(harq_out_length, l),
> +				ncb_p);
> +		harq_out_length = (harq_out_length + 0x3F) & 0xFFC0;
> +		if ((k0_p > fcw->hcin_size0 + ACC100_HARQ_OFFSET_THRESHOLD)
> +				&& harq_prun) {
> +			fcw->hcout_size0 = (uint16_t) fcw->hcin_size0;
> +			fcw->hcout_offset = k0_p & 0xFFC0;
> +			fcw->hcout_size1 = harq_out_length -
> +					fcw->hcout_offset;
> +		} else {
> +			fcw->hcout_size0 = harq_out_length;
> +			fcw->hcout_size1 = 0;
> +			fcw->hcout_offset = 0;
> +		}
> +		harq_layout[harq_index].offset = fcw->hcout_offset;
> +		harq_layout[harq_index].size0 = fcw->hcout_size0;
> +	} else {
> +		fcw->hcout_size0 = 0;
> +		fcw->hcout_size1 = 0;
> +		fcw->hcout_offset = 0;
> +	}
> +}
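
The HARQ region bookkeeping above keys off integer division of the combine-buffer offset. As a minimal illustrative sketch (not driver code; `HARQ_OFFSET` here is a hypothetical stand-in for `ACC100_HARQ_OFFSET`, whose real value differs), the index and the pruning-validity check reduce to:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed per-region granularity; illustrative only. */
#define HARQ_OFFSET (32 * 1024)

/* Each HARQ combine buffer occupies one fixed-size region of device
 * memory; the layout-table index is the offset divided by the
 * granularity. */
static uint32_t harq_region_index(uint32_t offset)
{
	return offset / HARQ_OFFSET;
}

/* Pruning bookkeeping is only trusted when the offset is exactly
 * region-aligned and the resulting index fits the layout table. */
static bool harq_prune_valid(uint32_t offset)
{
	return (offset % HARQ_OFFSET) == 0 &&
		offset <= (uint64_t)UINT16_MAX * HARQ_OFFSET;
}
```

An offset that is not a multiple of the region size therefore disables pruning rather than indexing a wrong slot.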
> +
> +/**
> + * Fills descriptor with data pointers of one block type.
> + *
> + * @param desc
> + *   Pointer to DMA descriptor.
> + * @param input
> + *   Pointer to the input data mbuf pointer. Advanced to point to the next
> + *   segment in the scatter-gather case.
> + * @param offset
> + *   Input offset in the rte_mbuf structure, used to locate where the data
> + *   starts.
> + * @param cb_len
> + *   Length of the currently processed Code Block.
> + * @param seg_total_left
> + *   Number of bytes still left in the current segment (mbuf) for further
> + *   processing.
> + * @param next_triplet
> + *   Index of the next ACC100 DMA descriptor triplet.
> + *
> + * @return
> + *   Index of the next triplet on success, negative value if the lengths
> + *   of the packet and the processed CB do not match.
> + */
> +static inline int
> +acc100_dma_fill_blk_type_in(struct acc100_dma_req_desc *desc,
> +		struct rte_mbuf **input, uint32_t *offset, uint32_t cb_len,
> +		uint32_t *seg_total_left, int next_triplet)
> +{
> +	uint32_t part_len;
> +	struct rte_mbuf *m = *input;
> +
> +	part_len = (*seg_total_left < cb_len) ? *seg_total_left : cb_len;
> +	cb_len -= part_len;
> +	*seg_total_left -= part_len;
> +
> +	desc->data_ptrs[next_triplet].address =
> +			rte_pktmbuf_iova_offset(m, *offset);
> +	desc->data_ptrs[next_triplet].blen = part_len;
> +	desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
> +	desc->data_ptrs[next_triplet].last = 0;
> +	desc->data_ptrs[next_triplet].dma_ext = 0;
> +	*offset += part_len;
> +	next_triplet++;
> +
> +	while (cb_len > 0) {
> +		if (next_triplet < ACC100_DMA_MAX_NUM_POINTERS &&
> +				m->next != NULL) {
> +
> +			m = m->next;
> +			*seg_total_left = rte_pktmbuf_data_len(m);
> +			part_len = (*seg_total_left < cb_len) ?
> +					*seg_total_left :
> +					cb_len;
> +			desc->data_ptrs[next_triplet].address =
> +					rte_pktmbuf_iova(m);
> +			desc->data_ptrs[next_triplet].blen = part_len;
> +			desc->data_ptrs[next_triplet].blkid =
> +					ACC100_DMA_BLKID_IN;
> +			desc->data_ptrs[next_triplet].last = 0;
> +			desc->data_ptrs[next_triplet].dma_ext = 0;
> +			cb_len -= part_len;
> +			*seg_total_left -= part_len;
> +			/* Initializing offset for next segment (mbuf) */
> +			*offset = part_len;
> +			next_triplet++;
> +		} else {
> +			rte_bbdev_log(ERR,
> +				"Some data still left for processing: "
> +				"data_left: %u, next_triplet: %u, next_mbuf: %p",
> +				cb_len, next_triplet, m->next);
> +			return -EINVAL;
> +		}
> +	}
> +	/* Store the new mbuf as it may have advanced in the scatter-gather case */
> +	*input = m;
> +
> +	return next_triplet;
> +}
> +
> +/* Fills descriptor with data pointers of one block type.
> + * Returns index of next triplet on success, other value if lengths of
> + * output data and processed mbuf do not match.
> + */
> +static inline int
> +acc100_dma_fill_blk_type_out(struct acc100_dma_req_desc *desc,
> +		struct rte_mbuf *output, uint32_t out_offset,
> +		uint32_t output_len, int next_triplet, int blk_id)
> +{
> +	desc->data_ptrs[next_triplet].address =
> +			rte_pktmbuf_iova_offset(output, out_offset);
> +	desc->data_ptrs[next_triplet].blen = output_len;
> +	desc->data_ptrs[next_triplet].blkid = blk_id;
> +	desc->data_ptrs[next_triplet].last = 0;
> +	desc->data_ptrs[next_triplet].dma_ext = 0;
> +	next_triplet++;
> +
> +	return next_triplet;
> +}
> +
> +static inline int
> +acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
> +		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
> +		struct rte_mbuf *output, uint32_t *in_offset,
> +		uint32_t *out_offset, uint32_t *out_length,
> +		uint32_t *mbuf_total_left, uint32_t *seg_total_left)
> +{
> +	int next_triplet = 1; /* FCW already done */
> +	uint16_t K, in_length_in_bits, in_length_in_bytes;
> +	struct rte_bbdev_op_ldpc_enc *enc = &op->ldpc_enc;
> +
> +	desc->word0 = ACC100_DMA_DESC_TYPE;
> +	desc->word1 = 0; /**< Timestamp could be disabled */
> +	desc->word2 = 0;
> +	desc->word3 = 0;
> +	desc->numCBs = 1;
> +
> +	K = (enc->basegraph == 1 ? 22 : 10) * enc->z_c;
> +	in_length_in_bits = K - enc->n_filler;
> +	if ((enc->op_flags & RTE_BBDEV_LDPC_CRC_24A_ATTACH) ||
> +			(enc->op_flags & RTE_BBDEV_LDPC_CRC_24B_ATTACH))
> +		in_length_in_bits -= 24;
> +	in_length_in_bytes = in_length_in_bits >> 3;
> +
> +	if (unlikely((*mbuf_total_left == 0) ||
> +			(*mbuf_total_left < in_length_in_bytes))) {
> +		rte_bbdev_log(ERR,
> +				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
> +				*mbuf_total_left, in_length_in_bytes);
> +		return -1;
> +	}
> +
> +	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
> +			in_length_in_bytes,
> +			seg_total_left, next_triplet);
> +	if (unlikely(next_triplet < 0)) {
> +		rte_bbdev_log(ERR,
> +				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
> +				op);
> +		return -1;
> +	}
> +	desc->data_ptrs[next_triplet - 1].last = 1;
> +	desc->m2dlen = next_triplet;
> +	*mbuf_total_left -= in_length_in_bytes;
> +
> +	/* Set output length */
> +	/* Integer round up division by 8 */
> +	*out_length = (enc->cb_params.e + 7) >> 3;
> +
> +	next_triplet = acc100_dma_fill_blk_type_out(desc, output,
> +			*out_offset, *out_length, next_triplet,
> +			ACC100_DMA_BLKID_OUT_ENC);
> +	if (unlikely(next_triplet < 0)) {
> +		rte_bbdev_log(ERR,
> +				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
> +				op);
> +		return -1;
> +	}
> +	op->ldpc_enc.output.length += *out_length;
> +	*out_offset += *out_length;
> +	desc->data_ptrs[next_triplet - 1].last = 1;
> +	desc->data_ptrs[next_triplet - 1].dma_ext = 0;
> +	desc->d2mlen = next_triplet - desc->m2dlen;
> +
> +	desc->op_addr = op;
> +
> +	return 0;
> +}
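
The input sizing in `acc100_dma_desc_le_fill` follows the 3GPP LDPC dimensions: K systematic bits (22 columns for basegraph 1, 10 for basegraph 2, times the lifting size z_c), minus filler bits and minus 24 CRC bits when the device attaches the CRC. A hedged, standalone restatement of that arithmetic (helper name is illustrative, not driver API):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Payload bytes the encoder reads from the mbuf: filler bits and a
 * device-attached 24-bit CRC are not part of the input payload, so both
 * are subtracted from K before converting bits to bytes. */
static uint16_t ldpc_enc_input_bytes(uint8_t basegraph, uint16_t z_c,
		uint16_t n_filler, bool crc_attach)
{
	uint16_t K = (basegraph == 1 ? 22 : 10) * z_c;
	uint16_t bits = K - n_filler;

	if (crc_attach)
		bits -= 24;
	return bits >> 3;
}
```

The mbuf-length check in the descriptor fill is exactly a guard that the chain holds at least this many bytes.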
> +
> +static inline int
> +acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
> +		struct acc100_dma_req_desc *desc,
> +		struct rte_mbuf **input, struct rte_mbuf *h_output,
> +		uint32_t *in_offset, uint32_t *h_out_offset,
> +		uint32_t *h_out_length, uint32_t *mbuf_total_left,
> +		uint32_t *seg_total_left,
> +		struct acc100_fcw_ld *fcw)
> +{
> +	struct rte_bbdev_op_ldpc_dec *dec = &op->ldpc_dec;
> +	int next_triplet = 1; /* FCW already done */
> +	uint32_t input_length;
> +	uint16_t output_length, crc24_overlap = 0;
> +	uint16_t sys_cols, K, h_p_size, h_np_size;
> +	bool h_comp = check_bit(dec->op_flags,
> +			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> +
> +	desc->word0 = ACC100_DMA_DESC_TYPE;
> +	desc->word1 = 0; /**< Timestamp could be disabled */
> +	desc->word2 = 0;
> +	desc->word3 = 0;
> +	desc->numCBs = 1;
> +
> +	if (check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP))
> +		crc24_overlap = 24;
> +
> +	/* Compute some LDPC BG lengths */
> +	input_length = dec->cb_params.e;
> +	if (check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_LLR_COMPRESSION))
> +		input_length = (input_length * 3 + 3) / 4;
> +	sys_cols = (dec->basegraph == 1) ? 22 : 10;
> +	K = sys_cols * dec->z_c;
> +	output_length = K - dec->n_filler - crc24_overlap;
> +
> +	if (unlikely((*mbuf_total_left == 0) ||
> +			(*mbuf_total_left < input_length))) {
> +		rte_bbdev_log(ERR,
> +				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
> +				*mbuf_total_left, input_length);
> +		return -1;
> +	}
> +
> +	next_triplet = acc100_dma_fill_blk_type_in(desc, input,
> +			in_offset, input_length,
> +			seg_total_left, next_triplet);
> +
> +	if (unlikely(next_triplet < 0)) {
> +		rte_bbdev_log(ERR,
> +				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
> +				op);
> +		return -1;
> +	}
> +
> +	if (check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
> +		h_p_size = fcw->hcin_size0 + fcw->hcin_size1;
> +		if (h_comp)
> +			h_p_size = (h_p_size * 3 + 3) / 4;
> +		desc->data_ptrs[next_triplet].address =
> +				dec->harq_combined_input.offset;
> +		desc->data_ptrs[next_triplet].blen = h_p_size;
> +		desc->data_ptrs[next_triplet].blkid =
> +				ACC100_DMA_BLKID_IN_HARQ;
> +		desc->data_ptrs[next_triplet].dma_ext = 1;
> +#ifndef ACC100_EXT_MEM
> +		acc100_dma_fill_blk_type_out(
> +				desc,
> +				op->ldpc_dec.harq_combined_input.data,
> +				op->ldpc_dec.harq_combined_input.offset,
> +				h_p_size,
> +				next_triplet,
> +				ACC100_DMA_BLKID_IN_HARQ);
> +#endif
> +		next_triplet++;
> +	}
> +
> +	desc->data_ptrs[next_triplet - 1].last = 1;
> +	desc->m2dlen = next_triplet;
> +	*mbuf_total_left -= input_length;
> +
> +	next_triplet = acc100_dma_fill_blk_type_out(desc, h_output,
> +			*h_out_offset, output_length >> 3, next_triplet,
> +			ACC100_DMA_BLKID_OUT_HARD);
> +	if (unlikely(next_triplet < 0)) {
> +		rte_bbdev_log(ERR,
> +				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
> +				op);
> +		return -1;
> +	}
> +
> +	if (check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
> +		/* Pruned size of the HARQ */
> +		h_p_size = fcw->hcout_size0 + fcw->hcout_size1;
> +		/* Non-Pruned size of the HARQ */
> +		h_np_size = fcw->hcout_offset > 0 ?
> +				fcw->hcout_offset + fcw->hcout_size1 :
> +				h_p_size;
> +		if (h_comp) {
> +			h_np_size = (h_np_size * 3 + 3) / 4;
> +			h_p_size = (h_p_size * 3 + 3) / 4;
> +		}
> +		dec->harq_combined_output.length = h_np_size;
> +		desc->data_ptrs[next_triplet].address =
> +				dec->harq_combined_output.offset;
> +		desc->data_ptrs[next_triplet].blen = h_p_size;
> +		desc->data_ptrs[next_triplet].blkid =
> +				ACC100_DMA_BLKID_OUT_HARQ;
> +		desc->data_ptrs[next_triplet].dma_ext = 1;
> +#ifndef ACC100_EXT_MEM
> +		acc100_dma_fill_blk_type_out(
> +				desc,
> +				dec->harq_combined_output.data,
> +				dec->harq_combined_output.offset,
> +				h_p_size,
> +				next_triplet,
> +				ACC100_DMA_BLKID_OUT_HARQ);
> +#endif
> +		next_triplet++;
> +	}
> +
> +	*h_out_length = output_length >> 3;
> +	dec->hard_output.length += *h_out_length;
> +	*h_out_offset += *h_out_length;
> +	desc->data_ptrs[next_triplet - 1].last = 1;
> +	desc->d2mlen = next_triplet - desc->m2dlen;
> +
> +	desc->op_addr = op;
> +
> +	return 0;
> +}
> +
> +static inline void
> +acc100_dma_desc_ld_update(struct rte_bbdev_dec_op *op,
> +		struct acc100_dma_req_desc *desc,
> +		struct rte_mbuf *input, struct rte_mbuf *h_output,
> +		uint32_t *in_offset, uint32_t *h_out_offset,
> +		uint32_t *h_out_length,
> +		union acc100_harq_layout_data *harq_layout)
> +{
> +	int next_triplet = 1; /* FCW already done */
> +	desc->data_ptrs[next_triplet].address =
> +			rte_pktmbuf_iova_offset(input, *in_offset);
> +	next_triplet++;
> +
> +	if (check_bit(op->ldpc_dec.op_flags,
> +
> 	RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
> +		struct rte_bbdev_op_data hi = op->ldpc_dec.harq_combined_input;
> +		desc->data_ptrs[next_triplet].address = hi.offset;
> +#ifndef ACC100_EXT_MEM
> +		desc->data_ptrs[next_triplet].address =
> +				rte_pktmbuf_iova_offset(hi.data, hi.offset);
> +#endif
> +		next_triplet++;
> +	}
> +
> +	desc->data_ptrs[next_triplet].address =
> +			rte_pktmbuf_iova_offset(h_output, *h_out_offset);
> +	*h_out_length = desc->data_ptrs[next_triplet].blen;
> +	next_triplet++;
> +
> +	if (check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
> +		desc->data_ptrs[next_triplet].address =
> +				op->ldpc_dec.harq_combined_output.offset;
> +		/* Adjust based on previous operation */
> +		struct rte_bbdev_dec_op *prev_op = desc->op_addr;
> +		op->ldpc_dec.harq_combined_output.length =
> +				prev_op->ldpc_dec.harq_combined_output.length;
> +		int16_t hq_idx = op->ldpc_dec.harq_combined_output.offset /
> +				ACC100_HARQ_OFFSET;
> +		int16_t prev_hq_idx =
> +				prev_op->ldpc_dec.harq_combined_output.offset
> +				/ ACC100_HARQ_OFFSET;
> +		harq_layout[hq_idx].val = harq_layout[prev_hq_idx].val;
> +#ifndef ACC100_EXT_MEM
> +		struct rte_bbdev_op_data ho =
> +				op->ldpc_dec.harq_combined_output;
> +		desc->data_ptrs[next_triplet].address =
> +				rte_pktmbuf_iova_offset(ho.data, ho.offset);
> +#endif
> +		next_triplet++;
> +	}
> +
> +	op->ldpc_dec.hard_output.length += *h_out_length;
> +	desc->op_addr = op;
> +}
> +
> +
> +/* Enqueue a number of operations to HW and update software rings */
> +static inline void
> +acc100_dma_enqueue(struct acc100_queue *q, uint16_t n,
> +		struct rte_bbdev_stats *queue_stats)
> +{
> +	union acc100_enqueue_reg_fmt enq_req;
> +#ifdef RTE_BBDEV_OFFLOAD_COST
> +	uint64_t start_time = 0;
> +	queue_stats->acc_offload_cycles = 0;
> +#else
> +	RTE_SET_USED(queue_stats);
> +#endif
> +
> +	enq_req.val = 0;
> +	/* Setting offset, 100b for 256 DMA Desc */
> +	enq_req.addr_offset = ACC100_DESC_OFFSET;
> +
> +	/* Split ops into batches */
> +	do {
> +		union acc100_dma_desc *desc;
> +		uint16_t enq_batch_size;
> +		uint64_t offset;
> +		rte_iova_t req_elem_addr;
> +
> +		enq_batch_size = RTE_MIN(n, MAX_ENQ_BATCH_SIZE);
> +
> +		/* Set flag on last descriptor in a batch */
> +		desc = q->ring_addr + ((q->sw_ring_head + enq_batch_size - 1)
> +				& q->sw_ring_wrap_mask);
> +		desc->req.last_desc_in_batch = 1;
> +
> +		/* Calculate the 1st descriptor's address */
> +		offset = ((q->sw_ring_head & q->sw_ring_wrap_mask) *
> +				sizeof(union acc100_dma_desc));
> +		req_elem_addr = q->ring_addr_phys + offset;
> +
> +		/* Fill enqueue struct */
> +		enq_req.num_elem = enq_batch_size;
> +		/* low 6 bits are not needed */
> +		enq_req.req_elem_addr = (uint32_t)(req_elem_addr >> 6);
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +		rte_memdump(stderr, "Req sdone", desc, sizeof(*desc));
> +#endif
> +		rte_bbdev_log_debug(
> +				"Enqueue %u reqs (phys %#"PRIx64") to reg %p",
> +				enq_batch_size,
> +				req_elem_addr,
> +				(void *)q->mmio_reg_enqueue);
> +
> +		rte_wmb();
> +
> +#ifdef RTE_BBDEV_OFFLOAD_COST
> +		/* Start time measurement for enqueue function offload. */
> +		start_time = rte_rdtsc_precise();
> +#endif
> +		rte_bbdev_log(DEBUG, "Debug : MMIO Enqueue");
> +		mmio_write(q->mmio_reg_enqueue, enq_req.val);
> +
> +#ifdef RTE_BBDEV_OFFLOAD_COST
> +		queue_stats->acc_offload_cycles +=
> +				rte_rdtsc_precise() - start_time;
> +#endif
> +
> +		q->aq_enqueued++;
> +		q->sw_ring_head += enq_batch_size;
> +		n -= enq_batch_size;
> +
> +	} while (n);
> +}
> +
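
Note that the enqueue path never bounds-checks `sw_ring_head`; it relies on a power-of-two ring depth so that masking wraps the monotonically increasing head into a valid slot. A minimal sketch of that invariant (illustrative helper, not driver API):

```c
#include <assert.h>
#include <stdint.h>

/* With a power-of-two ring depth, (head & (depth - 1)) maps any
 * ever-increasing head counter onto a valid descriptor slot; this is
 * the role of sw_ring_wrap_mask in the enqueue loop above. */
static uint16_t ring_slot(uint32_t head, uint16_t depth_pow2)
{
	return (uint16_t)(head & (depth_pow2 - 1));
}
```

The same mask locates both the batch's first descriptor and the last descriptor that receives `last_desc_in_batch`.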
> +/* Enqueue a group of encode operations for ACC100 device in CB mode */
> +static inline int
> +enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q,
> +		struct rte_bbdev_enc_op **ops,
> +		uint16_t total_enqueued_cbs, int16_t num)
> +{
> +	union acc100_dma_desc *desc = NULL;
> +	uint32_t out_length;
> +	struct rte_mbuf *output_head, *output;
> +	int i, next_triplet;
> +	uint16_t in_length_in_bytes;
> +	struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
> +
> +	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> +			& q->sw_ring_wrap_mask);
> +	desc = q->ring_addr + desc_idx;
> +	acc100_fcw_le_fill(ops[0], &desc->req.fcw_le, num);
> +
> +	/** This could be done at polling */
> +	desc->req.word0 = ACC100_DMA_DESC_TYPE;
> +	desc->req.word1 = 0; /**< Timestamp could be disabled */
> +	desc->req.word2 = 0;
> +	desc->req.word3 = 0;
> +	desc->req.numCBs = num;
> +
> +	in_length_in_bytes = ops[0]->ldpc_enc.input.data->data_len;
> +	out_length = (enc->cb_params.e + 7) >> 3;
> +	desc->req.m2dlen = 1 + num;
> +	desc->req.d2mlen = num;
> +	next_triplet = 1;
> +
> +	for (i = 0; i < num; i++) {
> +		desc->req.data_ptrs[next_triplet].address =
> +			rte_pktmbuf_iova_offset(ops[i]->ldpc_enc.input.data, 0);
> +		desc->req.data_ptrs[next_triplet].blen = in_length_in_bytes;
> +		next_triplet++;
> +		desc->req.data_ptrs[next_triplet].address =
> +				rte_pktmbuf_iova_offset(
> +				ops[i]->ldpc_enc.output.data, 0);
> +		desc->req.data_ptrs[next_triplet].blen = out_length;
> +		next_triplet++;
> +		ops[i]->ldpc_enc.output.length = out_length;
> +		output_head = output = ops[i]->ldpc_enc.output.data;
> +		mbuf_append(output_head, output, out_length);
> +		output->data_len = out_length;
> +	}
> +
> +	desc->req.op_addr = ops[0];
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +	rte_memdump(stderr, "FCW", &desc->req.fcw_le,
> +			sizeof(desc->req.fcw_le) - 8);
> +	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> +#endif
> +
> +	/* Multiple CBs (num muxed ops) were successfully prepared to enqueue */
> +	return num;
> +}
> +
> +/* Enqueue one encode operation for ACC100 device in CB mode */
> +static inline int
> +enqueue_ldpc_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
> +		uint16_t total_enqueued_cbs)
> +{
> +	union acc100_dma_desc *desc = NULL;
> +	int ret;
> +	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
> +		seg_total_left;
> +	struct rte_mbuf *input, *output_head, *output;
> +
> +	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> +			& q->sw_ring_wrap_mask);
> +	desc = q->ring_addr + desc_idx;
> +	acc100_fcw_le_fill(op, &desc->req.fcw_le, 1);
> +
> +	input = op->ldpc_enc.input.data;
> +	output_head = output = op->ldpc_enc.output.data;
> +	in_offset = op->ldpc_enc.input.offset;
> +	out_offset = op->ldpc_enc.output.offset;
> +	out_length = 0;
> +	mbuf_total_left = op->ldpc_enc.input.length;
> +	seg_total_left = rte_pktmbuf_data_len(op->ldpc_enc.input.data)
> +			- in_offset;
> +
> +	ret = acc100_dma_desc_le_fill(op, &desc->req, &input, output,
> +			&in_offset, &out_offset, &out_length,
> +			&mbuf_total_left, &seg_total_left);
> +
> +	if (unlikely(ret < 0))
> +		return ret;
> +
> +	mbuf_append(output_head, output, out_length);
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +	rte_memdump(stderr, "FCW", &desc->req.fcw_le,
> +			sizeof(desc->req.fcw_le) - 8);
> +	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> +
> +	/* Check if any data left after processing one CB */
> +	if (mbuf_total_left != 0) {
> +		rte_bbdev_log(ERR,
> +				"Some data still left after processing one CB: mbuf_total_left = %u",
> +				mbuf_total_left);
> +		return -EINVAL;
> +	}
> +#endif
> +	/* One CB (one op) was successfully prepared to enqueue */
> +	return 1;
> +}
> +
> +/* Enqueue one decode operation for ACC100 device in CB mode */
> +static inline int
> +enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
> +		uint16_t total_enqueued_cbs, bool same_op)
> +{
> +	int ret;
> +
> +	union acc100_dma_desc *desc;
> +	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> +			& q->sw_ring_wrap_mask);
> +	desc = q->ring_addr + desc_idx;
> +	struct rte_mbuf *input, *h_output_head, *h_output;
> +	uint32_t in_offset, h_out_offset, h_out_length, mbuf_total_left;
> +	input = op->ldpc_dec.input.data;
> +	h_output_head = h_output = op->ldpc_dec.hard_output.data;
> +	in_offset = op->ldpc_dec.input.offset;
> +	h_out_offset = op->ldpc_dec.hard_output.offset;
> +	mbuf_total_left = op->ldpc_dec.input.length;
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +	if (unlikely(input == NULL)) {
> +		rte_bbdev_log(ERR, "Invalid mbuf pointer");
> +		return -EFAULT;
> +	}
> +#endif
> +	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
> +
> +	if (same_op) {
> +		union acc100_dma_desc *prev_desc;
> +		desc_idx = ((q->sw_ring_head + total_enqueued_cbs - 1)
> +				& q->sw_ring_wrap_mask);
> +		prev_desc = q->ring_addr + desc_idx;
> +		uint8_t *prev_ptr = (uint8_t *) prev_desc;
> +		uint8_t *new_ptr = (uint8_t *) desc;
> +		/* Copy first 4 words and BDESCs */
> +		rte_memcpy(new_ptr, prev_ptr, 16);
> +		rte_memcpy(new_ptr + 36, prev_ptr + 36, 40);
> +		desc->req.op_addr = prev_desc->req.op_addr;
> +		/* Copy FCW */
> +		rte_memcpy(new_ptr + ACC100_DESC_FCW_OFFSET,
> +				prev_ptr + ACC100_DESC_FCW_OFFSET,
> +				ACC100_FCW_LD_BLEN);
> +		acc100_dma_desc_ld_update(op, &desc->req, input,
> h_output,
> +				&in_offset, &h_out_offset,
> +				&h_out_length, harq_layout);
> +	} else {
> +		struct acc100_fcw_ld *fcw;
> +		uint32_t seg_total_left;
> +		fcw = &desc->req.fcw_ld;
> +		acc100_fcw_ld_fill(op, fcw, harq_layout);
> +
> +		/* Special handling when overusing mbuf */
> +		if (fcw->rm_e < MAX_E_MBUF)
> +			seg_total_left = rte_pktmbuf_data_len(input)
> +					- in_offset;
> +		else
> +			seg_total_left = fcw->rm_e;
> +
> +		ret = acc100_dma_desc_ld_fill(op, &desc->req, &input,
> h_output,
> +				&in_offset, &h_out_offset,
> +				&h_out_length, &mbuf_total_left,
> +				&seg_total_left, fcw);
> +		if (unlikely(ret < 0))
> +			return ret;
> +	}
> +
> +	/* Hard output */
> +	mbuf_append(h_output_head, h_output, h_out_length);
> +#ifndef ACC100_EXT_MEM
> +	if (op->ldpc_dec.harq_combined_output.length > 0) {
> +		/* Push the HARQ output into host memory */
> +		struct rte_mbuf *hq_output_head, *hq_output;
> +		hq_output_head = op->ldpc_dec.harq_combined_output.data;
> +		hq_output = op->ldpc_dec.harq_combined_output.data;
> +		mbuf_append(hq_output_head, hq_output,
> +				op->ldpc_dec.harq_combined_output.length);
> +	}
> +#endif
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +	rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
> +			sizeof(desc->req.fcw_ld) - 8);
> +	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> +#endif
> +
> +	/* One CB (one op) was successfully prepared to enqueue */
> +	return 1;
> +}
> +
> +
> +/* Enqueue one decode operation for ACC100 device in TB mode */
> +static inline int
> +enqueue_ldpc_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
> +		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
> +{
> +	union acc100_dma_desc *desc = NULL;
> +	int ret;
> +	uint8_t r, c;
> +	uint32_t in_offset, h_out_offset,
> +		h_out_length, mbuf_total_left, seg_total_left;
> +	struct rte_mbuf *input, *h_output_head, *h_output;
> +	uint16_t current_enqueued_cbs = 0;
> +
> +	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> +			& q->sw_ring_wrap_mask);
> +	desc = q->ring_addr + desc_idx;
> +	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
> +	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
> +	acc100_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout);
> +
> +	input = op->ldpc_dec.input.data;
> +	h_output_head = h_output = op->ldpc_dec.hard_output.data;
> +	in_offset = op->ldpc_dec.input.offset;
> +	h_out_offset = op->ldpc_dec.hard_output.offset;
> +	h_out_length = 0;
> +	mbuf_total_left = op->ldpc_dec.input.length;
> +	c = op->ldpc_dec.tb_params.c;
> +	r = op->ldpc_dec.tb_params.r;
> +
> +	while (mbuf_total_left > 0 && r < c) {
> +
> +		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
> +
> +		/* Set up DMA descriptor */
> +		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
> +				& q->sw_ring_wrap_mask);
> +		desc->req.data_ptrs[0].address = q->ring_addr_phys +
> +				fcw_offset;
> +		desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
> +		ret = acc100_dma_desc_ld_fill(op, &desc->req, &input,
> +				h_output, &in_offset, &h_out_offset,
> +				&h_out_length,
> +				&mbuf_total_left, &seg_total_left,
> +				&desc->req.fcw_ld);
> +
> +		if (unlikely(ret < 0))
> +			return ret;
> +
> +		/* Hard output */
> +		mbuf_append(h_output_head, h_output, h_out_length);
> +
> +		/* Set total number of CBs in TB */
> +		desc->req.cbs_in_tb = cbs_in_tb;
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +		rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
> +				sizeof(desc->req.fcw_ld) - 8);
> +		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> +#endif
> +
> +		if (seg_total_left == 0) {
> +			/* Go to the next mbuf */
> +			input = input->next;
> +			in_offset = 0;
> +			h_output = h_output->next;
> +			h_out_offset = 0;
> +		}
> +		total_enqueued_cbs++;
> +		current_enqueued_cbs++;
> +		r++;
> +	}
> +
> +	if (unlikely(desc == NULL))
> +		return current_enqueued_cbs;
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +	/* Check if any CBs left for processing */
> +	if (mbuf_total_left != 0) {
> +		rte_bbdev_log(ERR,
> +				"Some data still left for processing: mbuf_total_left = %u",
> +				mbuf_total_left);
> +		return -EINVAL;
> +	}
> +#endif
> +	/* Set SDone on last CB descriptor for TB mode */
> +	desc->req.sdone_enable = 1;
> +	desc->req.irq_enable = q->irq_enable;
> +
> +	return current_enqueued_cbs;
> +}
> +
> +
> +/* Calculates number of CBs in processed encoder TB based on 'r' and input
> + * length.
> + */
> +static inline uint8_t
> +get_num_cbs_in_tb_enc(struct rte_bbdev_op_turbo_enc *turbo_enc)
> +{
> +	uint8_t c, c_neg, r, crc24_bits = 0;
> +	uint16_t k, k_neg, k_pos;
> +	uint8_t cbs_in_tb = 0;
> +	int32_t length;
> +
> +	length = turbo_enc->input.length;
> +	r = turbo_enc->tb_params.r;
> +	c = turbo_enc->tb_params.c;
> +	c_neg = turbo_enc->tb_params.c_neg;
> +	k_neg = turbo_enc->tb_params.k_neg;
> +	k_pos = turbo_enc->tb_params.k_pos;
> +	crc24_bits = 0;
> +	if (check_bit(turbo_enc->op_flags,
> RTE_BBDEV_TURBO_CRC_24B_ATTACH))
> +		crc24_bits = 24;
> +	while (length > 0 && r < c) {
> +		k = (r < c_neg) ? k_neg : k_pos;
> +		length -= (k - crc24_bits) >> 3;
> +		r++;
> +		cbs_in_tb++;
> +	}
> +
> +	return cbs_in_tb;
> +}
> +
> +/* Calculates number of CBs in processed decoder TB based on 'r' and input
> + * length.
> + */
> +static inline uint16_t
> +get_num_cbs_in_tb_dec(struct rte_bbdev_op_turbo_dec *turbo_dec)
> +{
> +	uint8_t c, c_neg, r = 0;
> +	uint16_t kw, k, k_neg, k_pos, cbs_in_tb = 0;
> +	int32_t length;
> +
> +	length = turbo_dec->input.length;
> +	r = turbo_dec->tb_params.r;
> +	c = turbo_dec->tb_params.c;
> +	c_neg = turbo_dec->tb_params.c_neg;
> +	k_neg = turbo_dec->tb_params.k_neg;
> +	k_pos = turbo_dec->tb_params.k_pos;
> +	while (length > 0 && r < c) {
> +		k = (r < c_neg) ? k_neg : k_pos;
> +		kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
> +		length -= kw;
> +		r++;
> +		cbs_in_tb++;
> +	}
> +
> +	return cbs_in_tb;
> +}
> +
> +/* Calculates number of CBs in processed decoder TB based on 'r' and input
> + * length.
> + */
> +static inline uint16_t
> +get_num_cbs_in_tb_ldpc_dec(struct rte_bbdev_op_ldpc_dec *ldpc_dec)
> +{
> +	uint16_t r, cbs_in_tb = 0;
> +	int32_t length = ldpc_dec->input.length;
> +	r = ldpc_dec->tb_params.r;
> +	while (length > 0 && r < ldpc_dec->tb_params.c) {
> +		length -=  (r < ldpc_dec->tb_params.cab) ?
> +				ldpc_dec->tb_params.ea :
> +				ldpc_dec->tb_params.eb;
> +		r++;
> +		cbs_in_tb++;
> +	}
> +	return cbs_in_tb;
> +}
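
The decoder TB sizing above just counts how many rate-matched chunks fit into the input length: each code block r consumes `ea` bytes while `r < cab` and `eb` bytes afterwards. A standalone restatement of that loop, testable in isolation (parameter names mirror the tb_params fields; the helper itself is illustrative):

```c
#include <assert.h>
#include <stdint.h>

/* Count code blocks in an LDPC decode transport block: walk r from the
 * starting index, subtracting ea bytes per CB before the cab boundary
 * and eb bytes after it, until the input length is consumed. */
static uint16_t ldpc_dec_cbs_in_tb(int32_t length, uint16_t r, uint16_t c,
		uint16_t cab, uint32_t ea, uint32_t eb)
{
	uint16_t n = 0;

	while (length > 0 && r < c) {
		length -= (r < cab) ? (int32_t)ea : (int32_t)eb;
		r++;
		n++;
	}
	return n;
}
```

The enqueue path uses this count to reserve ring space (`avail -= cbs_in_tb`) before building any descriptor.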
> +
> +/* Check we can mux encode operations with common FCW */
> +static inline bool
> +check_mux(struct rte_bbdev_enc_op **ops, uint16_t num) {
> +	uint16_t i;
> +	if (num == 1)
> +		return false;
> +	for (i = 1; i < num; ++i) {
> +		/* Only mux compatible code blocks */
> +		if (memcmp((uint8_t *)(&ops[i]->ldpc_enc) + ENC_OFFSET,
> +				(uint8_t *)(&ops[0]->ldpc_enc) + ENC_OFFSET,
> +				CMP_ENC_SIZE) != 0)
> +			return false;
> +	}
> +	return true;
> +}
> +
> +/** Enqueue encode operations for ACC100 device in CB mode. */
> +static inline uint16_t
> +acc100_enqueue_ldpc_enc_cb(struct rte_bbdev_queue_data *q_data,
> +		struct rte_bbdev_enc_op **ops, uint16_t num)
> +{
> +	struct acc100_queue *q = q_data->queue_private;
> +	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> +	uint16_t i = 0;
> +	union acc100_dma_desc *desc;
> +	int ret, desc_idx = 0;
> +	int16_t enq, left = num;
> +
> +	while (left > 0) {
> +		if (unlikely(avail - 1 < 0))
> +			break;
> +		avail--;
> +		enq = RTE_MIN(left, MUX_5GDL_DESC);
> +		if (check_mux(&ops[i], enq)) {
> +			ret = enqueue_ldpc_enc_n_op_cb(q, &ops[i],
> +					desc_idx, enq);
> +			if (ret < 0)
> +				break;
> +			i += enq;
> +		} else {
> +			ret = enqueue_ldpc_enc_one_op_cb(q, ops[i],
> +					desc_idx);
> +			if (ret < 0)
> +				break;
> +			i++;
> +		}
> +		desc_idx++;
> +		left = num - i;
> +	}
> +
> +	if (unlikely(i == 0))
> +		return 0; /* Nothing to enqueue */
> +
> +	/* Set SDone in last CB in enqueued ops for CB mode */
> +	desc = q->ring_addr + ((q->sw_ring_head + desc_idx - 1)
> +			& q->sw_ring_wrap_mask);
> +	desc->req.sdone_enable = 1;
> +	desc->req.irq_enable = q->irq_enable;
> +
> +	acc100_dma_enqueue(q, desc_idx, &q_data->queue_stats);
> +
> +	/* Update stats */
> +	q_data->queue_stats.enqueued_count += i;
> +	q_data->queue_stats.enqueue_err_count += num - i;
> +
> +	return i;
> +}
> +
> +/* Enqueue encode operations for ACC100 device. */
> +static uint16_t
> +acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
> +		struct rte_bbdev_enc_op **ops, uint16_t num)
> +{
> +	if (unlikely(num == 0))
> +		return 0;
> +	return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
> +}
> +
> +/* Check we can mux decode operations with common FCW */
> +static inline bool
> +cmp_ldpc_dec_op(struct rte_bbdev_dec_op **ops) {
> +	/* Only mux compatible code blocks */
> +	return memcmp((uint8_t *)(&ops[0]->ldpc_dec) + DEC_OFFSET,
> +			(uint8_t *)(&ops[1]->ldpc_dec) + DEC_OFFSET,
> +			CMP_DEC_SIZE) == 0;
> +}
> +
> +
> +/* Enqueue decode operations for ACC100 device in TB mode */
> +static uint16_t
> +acc100_enqueue_ldpc_dec_tb(struct rte_bbdev_queue_data *q_data,
> +		struct rte_bbdev_dec_op **ops, uint16_t num)
> +{
> +	struct acc100_queue *q = q_data->queue_private;
> +	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> +	uint16_t i, enqueued_cbs = 0;
> +	uint8_t cbs_in_tb;
> +	int ret;
> +
> +	for (i = 0; i < num; ++i) {
> +		cbs_in_tb = get_num_cbs_in_tb_ldpc_dec(&ops[i]->ldpc_dec);
> +		/* Check if there is available space for further processing */
> +		if (unlikely(avail - cbs_in_tb < 0))
> +			break;
> +		avail -= cbs_in_tb;
> +
> +		ret = enqueue_ldpc_dec_one_op_tb(q, ops[i],
> +				enqueued_cbs, cbs_in_tb);
> +		if (ret < 0)
> +			break;
> +		enqueued_cbs += ret;
> +	}
> +
> +	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
> +
> +	/* Update stats */
> +	q_data->queue_stats.enqueued_count += i;
> +	q_data->queue_stats.enqueue_err_count += num - i;
> +	return i;
> +}
> +
> +/* Enqueue decode operations for ACC100 device in CB mode */
> +static uint16_t
> +acc100_enqueue_ldpc_dec_cb(struct rte_bbdev_queue_data *q_data,
> +		struct rte_bbdev_dec_op **ops, uint16_t num)
> +{
> +	struct acc100_queue *q = q_data->queue_private;
> +	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> +	uint16_t i;
> +	union acc100_dma_desc *desc;
> +	int ret;
> +	bool same_op = false;
> +	for (i = 0; i < num; ++i) {
> +		/* Check if there is available space for further processing */
> +		if (unlikely(avail - 1 < 0))
> +			break;
> +		avail -= 1;
> +
> +		if (i > 0)
> +			same_op = cmp_ldpc_dec_op(&ops[i-1]);
> +		rte_bbdev_log(INFO,
> +			"Op %d %d %d %d %d %d %d %d %d %d %d %d\n",
> +			i, ops[i]->ldpc_dec.op_flags, ops[i]->ldpc_dec.rv_index,
> +			ops[i]->ldpc_dec.iter_max, ops[i]->ldpc_dec.iter_count,
> +			ops[i]->ldpc_dec.basegraph, ops[i]->ldpc_dec.z_c,
> +			ops[i]->ldpc_dec.n_cb, ops[i]->ldpc_dec.q_m,
> +			ops[i]->ldpc_dec.n_filler, ops[i]->ldpc_dec.cb_params.e,
> +			same_op);
> +		ret = enqueue_ldpc_dec_one_op_cb(q, ops[i], i, same_op);
> +		if (ret < 0)
> +			break;
> +	}
> +
> +	if (unlikely(i == 0))
> +		return 0; /* Nothing to enqueue */
> +
> +	/* Set SDone in last CB in enqueued ops for CB mode */
> +	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
> +			& q->sw_ring_wrap_mask);
> +
> +	desc->req.sdone_enable = 1;
> +	desc->req.irq_enable = q->irq_enable;
> +
> +	acc100_dma_enqueue(q, i, &q_data->queue_stats);
> +
> +	/* Update stats */
> +	q_data->queue_stats.enqueued_count += i;
> +	q_data->queue_stats.enqueue_err_count += num - i;
> +	return i;
> +}
> +
> +/* Enqueue decode operations for ACC100 device. */
> +static uint16_t
> +acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
> +		struct rte_bbdev_dec_op **ops, uint16_t num)
> +{
> +	struct acc100_queue *q = q_data->queue_private;
> +	int32_t aq_avail = q->aq_depth +
> +			(q->aq_dequeued - q->aq_enqueued) / 128;
> +
> +	if (unlikely((aq_avail == 0) || (num == 0)))
> +		return 0;
> +
> +	if (ops[0]->ldpc_dec.code_block_mode == 0)
> +		return acc100_enqueue_ldpc_dec_tb(q_data, ops, num);
> +	else
> +		return acc100_enqueue_ldpc_dec_cb(q_data, ops, num);
> +}
> +
> +
> +/* Dequeue one encode operation from ACC100 device in CB mode */
> +static inline int
> +dequeue_enc_one_op_cb(struct acc100_queue *q,
> +		struct rte_bbdev_enc_op **ref_op,
> +		uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
> +{
> +	union acc100_dma_desc *desc, atom_desc;
> +	union acc100_dma_rsp_desc rsp;
> +	struct rte_bbdev_enc_op *op;
> +	int i;
> +
> +	desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
> +			& q->sw_ring_wrap_mask);
> +	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +			__ATOMIC_RELAXED);
> +
> +	/* Check fdone bit */
> +	if (!(atom_desc.rsp.val & ACC100_FDONE))
> +		return -1;
> +
> +	rsp.val = atom_desc.rsp.val;
> +	rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
> +
> +	/* Dequeue */
> +	op = desc->req.op_addr;
> +
> +	/* Clearing status, it will be set based on response */
> +	op->status = 0;
> +
> +	op->status |= ((rsp.input_err)
> +			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> +	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +
> +	if (desc->req.last_desc_in_batch) {
> +		(*aq_dequeued)++;
> +		desc->req.last_desc_in_batch = 0;
> +	}
> +	desc->rsp.val = ACC100_DMA_DESC_TYPE;
> +	desc->rsp.add_info_0 = 0; /* Reserved bits */
> +	desc->rsp.add_info_1 = 0; /* Reserved bits */
> +
> +	/* Flag that the muxing caused loss of opaque data */
> +	op->opaque_data = (void *)-1;
> +	for (i = 0 ; i < desc->req.numCBs; i++)
> +		ref_op[i] = op;
> +
> +	/* One op, covering numCBs muxed CBs, was successfully dequeued */
> +	return desc->req.numCBs;
> +}
> +
> +/* Dequeue one encode operation from ACC100 device in TB mode */
> +static inline int
> +dequeue_enc_one_op_tb(struct acc100_queue *q,
> +		struct rte_bbdev_enc_op **ref_op,
> +		uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
> +{
> +	union acc100_dma_desc *desc, *last_desc, atom_desc;
> +	union acc100_dma_rsp_desc rsp;
> +	struct rte_bbdev_enc_op *op;
> +	uint8_t i = 0;
> +	uint16_t current_dequeued_cbs = 0, cbs_in_tb;
> +
> +	desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
> +			& q->sw_ring_wrap_mask);
> +	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +			__ATOMIC_RELAXED);
> +
> +	/* Check fdone bit */
> +	if (!(atom_desc.rsp.val & ACC100_FDONE))
> +		return -1;
> +
> +	/* Get number of CBs in dequeued TB */
> +	cbs_in_tb = desc->req.cbs_in_tb;
> +	/* Get last CB */
> +	last_desc = q->ring_addr + ((q->sw_ring_tail
> +			+ total_dequeued_cbs + cbs_in_tb - 1)
> +			& q->sw_ring_wrap_mask);
> +	/* Check if last CB in TB is ready to dequeue (and thus
> +	 * the whole TB) - checking sdone bit. If not return.
> +	 */
> +	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
> +			__ATOMIC_RELAXED);
> +	if (!(atom_desc.rsp.val & ACC100_SDONE))
> +		return -1;
> +
> +	/* Dequeue */
> +	op = desc->req.op_addr;
> +
> +	/* Clearing status, it will be set based on response */
> +	op->status = 0;
> +
> +	while (i < cbs_in_tb) {
> +		desc = q->ring_addr + ((q->sw_ring_tail
> +				+ total_dequeued_cbs)
> +				& q->sw_ring_wrap_mask);
> +		atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +				__ATOMIC_RELAXED);
> +		rsp.val = atom_desc.rsp.val;
> +		rte_bbdev_log_debug("Resp. desc %p: %x", desc,
> +				rsp.val);
> +
> +		op->status |= ((rsp.input_err)
> +				? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> +		op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +		op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +
> +		if (desc->req.last_desc_in_batch) {
> +			(*aq_dequeued)++;
> +			desc->req.last_desc_in_batch = 0;
> +		}
> +		desc->rsp.val = ACC100_DMA_DESC_TYPE;
> +		desc->rsp.add_info_0 = 0;
> +		desc->rsp.add_info_1 = 0;
> +		total_dequeued_cbs++;
> +		current_dequeued_cbs++;
> +		i++;
> +	}
> +
> +	*ref_op = op;
> +
> +	return current_dequeued_cbs;
> +}
> +
> +/* Dequeue one decode operation from ACC100 device in CB mode */
> +static inline int
> +dequeue_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
> +		struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
> +		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> +{
> +	union acc100_dma_desc *desc, atom_desc;
> +	union acc100_dma_rsp_desc rsp;
> +	struct rte_bbdev_dec_op *op;
> +
> +	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> +			& q->sw_ring_wrap_mask);
> +	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +			__ATOMIC_RELAXED);
> +
> +	/* Check fdone bit */
> +	if (!(atom_desc.rsp.val & ACC100_FDONE))
> +		return -1;
> +
> +	rsp.val = atom_desc.rsp.val;
> +	rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
> +
> +	/* Dequeue */
> +	op = desc->req.op_addr;
> +
> +	/* Clearing status, it will be set based on response */
> +	op->status = 0;
> +	op->status |= ((rsp.input_err)
> +			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> +	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +	if (op->status != 0)
> +		q_data->queue_stats.dequeue_err_count++;
> +
> +	/* Report CRC status only when no other error is set */
> +	if (!op->status)
> +		op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
> +	op->turbo_dec.iter_count = (uint8_t) rsp.iter_cnt / 2;
> +	/* Check if this is the last desc in batch (Atomic Queue) */
> +	if (desc->req.last_desc_in_batch) {
> +		(*aq_dequeued)++;
> +		desc->req.last_desc_in_batch = 0;
> +	}
> +	desc->rsp.val = ACC100_DMA_DESC_TYPE;
> +	desc->rsp.add_info_0 = 0;
> +	desc->rsp.add_info_1 = 0;
> +	*ref_op = op;
> +
> +	/* One CB (op) was successfully dequeued */
> +	return 1;
> +}
> +
> +/* Dequeue one decode operation from ACC100 device in CB mode */
> +static inline int
> +dequeue_ldpc_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
> +		struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
> +		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> +{
> +	union acc100_dma_desc *desc, atom_desc;
> +	union acc100_dma_rsp_desc rsp;
> +	struct rte_bbdev_dec_op *op;
> +
> +	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> +			& q->sw_ring_wrap_mask);
> +	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +			__ATOMIC_RELAXED);
> +
> +	/* Check fdone bit */
> +	if (!(atom_desc.rsp.val & ACC100_FDONE))
> +		return -1;
> +
> +	rsp.val = atom_desc.rsp.val;
> +
> +	/* Dequeue */
> +	op = desc->req.op_addr;
> +
> +	/* Clearing status, it will be set based on response */
> +	op->status = 0;
> +	op->status |= rsp.input_err << RTE_BBDEV_DATA_ERROR;
> +	op->status |= rsp.dma_err << RTE_BBDEV_DRV_ERROR;
> +	op->status |= rsp.fcw_err << RTE_BBDEV_DRV_ERROR;
> +	if (op->status != 0)
> +		q_data->queue_stats.dequeue_err_count++;
> +
> +	op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
> +	if (op->ldpc_dec.hard_output.length > 0 && !rsp.synd_ok)
> +		op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
> +	op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
> +
> +	/* Check if this is the last desc in batch (Atomic Queue) */
> +	if (desc->req.last_desc_in_batch) {
> +		(*aq_dequeued)++;
> +		desc->req.last_desc_in_batch = 0;
> +	}
> +
> +	desc->rsp.val = ACC100_DMA_DESC_TYPE;
> +	desc->rsp.add_info_0 = 0;
> +	desc->rsp.add_info_1 = 0;
> +
> +	*ref_op = op;
> +
> +	/* One CB (op) was successfully dequeued */
> +	return 1;
> +}
> +
> +/* Dequeue one decode operation from ACC100 device in TB mode. */
> +static inline int
> +dequeue_dec_one_op_tb(struct acc100_queue *q,
> +		struct rte_bbdev_dec_op **ref_op,
> +		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> +{
> +	union acc100_dma_desc *desc, *last_desc, atom_desc;
> +	union acc100_dma_rsp_desc rsp;
> +	struct rte_bbdev_dec_op *op;
> +	uint8_t cbs_in_tb = 1, cb_idx = 0;
> +
> +	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> +			& q->sw_ring_wrap_mask);
> +	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +			__ATOMIC_RELAXED);
> +
> +	/* Check fdone bit */
> +	if (!(atom_desc.rsp.val & ACC100_FDONE))
> +		return -1;
> +
> +	/* Dequeue */
> +	op = desc->req.op_addr;
> +
> +	/* Get number of CBs in dequeued TB */
> +	cbs_in_tb = desc->req.cbs_in_tb;
> +	/* Get last CB */
> +	last_desc = q->ring_addr + ((q->sw_ring_tail
> +			+ dequeued_cbs + cbs_in_tb - 1)
> +			& q->sw_ring_wrap_mask);
> +	/* Check if last CB in TB is ready to dequeue (and thus
> +	 * the whole TB) - checking sdone bit. If not return.
> +	 */
> +	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
> +			__ATOMIC_RELAXED);
> +	if (!(atom_desc.rsp.val & ACC100_SDONE))
> +		return -1;
> +
> +	/* Clearing status, it will be set based on response */
> +	op->status = 0;
> +
> +	/* Read remaining CBs if they exist */
> +	while (cb_idx < cbs_in_tb) {
> +		desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> +				& q->sw_ring_wrap_mask);
> +		atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +				__ATOMIC_RELAXED);
> +		rsp.val = atom_desc.rsp.val;
> +		rte_bbdev_log_debug("Resp. desc %p: %x", desc,
> +				rsp.val);
> +
> +		op->status |= ((rsp.input_err)
> +				? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> +		op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +		op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +
> +		/* Report CRC status only when no other error is set */
> +		if (!op->status)
> +			op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
> +		op->turbo_dec.iter_count = RTE_MAX((uint8_t) rsp.iter_cnt,
> +				op->turbo_dec.iter_count);
> +
> +		/* Check if this is the last desc in batch (Atomic Queue) */
> +		if (desc->req.last_desc_in_batch) {
> +			(*aq_dequeued)++;
> +			desc->req.last_desc_in_batch = 0;
> +		}
> +		desc->rsp.val = ACC100_DMA_DESC_TYPE;
> +		desc->rsp.add_info_0 = 0;
> +		desc->rsp.add_info_1 = 0;
> +		dequeued_cbs++;
> +		cb_idx++;
> +	}
> +
> +	*ref_op = op;
> +
> +	return cb_idx;
> +}
> +
> +/* Dequeue LDPC encode operations from ACC100 device. */
> +static uint16_t
> +acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
> +		struct rte_bbdev_enc_op **ops, uint16_t num)
> +{
> +	struct acc100_queue *q = q_data->queue_private;
> +	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
> +	uint32_t aq_dequeued = 0;
> +	uint16_t dequeue_num, i, dequeued_cbs = 0, dequeued_descs = 0;
> +	int ret;
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +	if (unlikely(ops == NULL || q == NULL))
> +		return 0;
> +#endif
> +
> +	dequeue_num = (avail < num) ? avail : num;
> +
> +	for (i = 0; i < dequeue_num; i++) {
> +		ret = dequeue_enc_one_op_cb(q, &ops[dequeued_cbs],
> +				dequeued_descs, &aq_dequeued);
> +		if (ret < 0)
> +			break;
> +		dequeued_cbs += ret;
> +		dequeued_descs++;
> +		if (dequeued_cbs >= num)
> +			break;
> +	}
> +
> +	q->aq_dequeued += aq_dequeued;
> +	q->sw_ring_tail += dequeued_descs;
> +
> +	/* Update dequeue stats */
> +	q_data->queue_stats.dequeued_count += dequeued_cbs;
> +
> +	return dequeued_cbs;
> +}
> +
> +/* Dequeue decode operations from ACC100 device. */
> +static uint16_t
> +acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
> +		struct rte_bbdev_dec_op **ops, uint16_t num)
> +{
> +	struct acc100_queue *q = q_data->queue_private;
> +	uint16_t dequeue_num;
> +	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
> +	uint32_t aq_dequeued = 0;
> +	uint16_t i;
> +	uint16_t dequeued_cbs = 0;
> +	struct rte_bbdev_dec_op *op;
> +	int ret;
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +	if (unlikely(ops == NULL || q == NULL))
> +		return 0;
> +#endif
> +
> +	dequeue_num = (avail < num) ? avail : num;
> +
> +	for (i = 0; i < dequeue_num; ++i) {
> +		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> +			& q->sw_ring_wrap_mask))->req.op_addr;
> +		if (op->ldpc_dec.code_block_mode == 0)
> +			ret = dequeue_dec_one_op_tb(q, &ops[i],
> +					dequeued_cbs, &aq_dequeued);
> +		else
> +			ret = dequeue_ldpc_dec_one_op_cb(
> +					q_data, q, &ops[i], dequeued_cbs,
> +					&aq_dequeued);
> +
> +		if (ret < 0)
> +			break;
> +		dequeued_cbs += ret;
> +	}
> +
> +	q->aq_dequeued += aq_dequeued;
> +	q->sw_ring_tail += dequeued_cbs;
> +
> +	/* Update dequeue stats */
> +	q_data->queue_stats.dequeued_count += i;
> +
> +	return i;
> +}
> +
>  /* Initialization Function */
>  static void
>  acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
> @@ -703,6 +2321,10 @@
>  	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
> 
>  	dev->dev_ops = &acc100_bbdev_ops;
> +	dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
> +	dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
> +	dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
> +	dev->dequeue_ldpc_dec_ops = acc100_dequeue_ldpc_dec;
> 
>  	((struct acc100_device *) dev->data->dev_private)->pf_device =
>  			!strcmp(drv->driver.name,
> @@ -815,4 +2437,3 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
>  RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
>  RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
>  RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
> -
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
> index 0e2b79c..78686c1 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> @@ -88,6 +88,8 @@
>  #define TMPL_PRI_3      0x0f0e0d0c
>  #define QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
>  #define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
> +#define ACC100_FDONE    0x80000000
> +#define ACC100_SDONE    0x40000000
> 
>  #define ACC100_NUM_TMPL  32
>  #define VF_OFFSET_QOS 16 /* offset in Memory Space specific to QoS Mon */
> @@ -398,6 +400,7 @@ struct __rte_packed acc100_dma_req_desc {
>  union acc100_dma_desc {
>  	struct acc100_dma_req_desc req;
>  	union acc100_dma_rsp_desc rsp;
> +	uint64_t atom_hdr;
>  };
> 
> 
> --
> 1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v3 02/11] baseband/acc100: add register definition file
  2020-08-29  9:55   ` Xu, Rosen
@ 2020-08-29 17:39     ` Chautru, Nicolas
  2020-09-03  2:15       ` Xu, Rosen
  0 siblings, 1 reply; 213+ messages in thread
From: Chautru, Nicolas @ 2020-08-29 17:39 UTC (permalink / raw)
  To: Xu, Rosen, dev, akhil.goyal; +Cc: Richardson, Bruce

Hi Rosen, 

> From: Xu, Rosen <rosen.xu@intel.com>
> 
> Hi,
> 
> > -----Original Message-----
> > From: dev <dev-bounces@dpdk.org> On Behalf Of Nicolas Chautru
> > Sent: Wednesday, August 19, 2020 8:25
> > To: dev@dpdk.org; akhil.goyal@nxp.com
> > Cc: Richardson, Bruce <bruce.richardson@intel.com>; Chautru, Nicolas
> > <nicolas.chautru@intel.com>
> > Subject: [dpdk-dev] [PATCH v3 02/11] baseband/acc100: add register
> > definition file
> >
> > Add in the list of registers for the device and related
> > HW specs definitions.
> >
> > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > ---
> >  drivers/baseband/acc100/acc100_pf_enum.h | 1068
> > ++++++++++++++++++++++++++++++
> >  drivers/baseband/acc100/acc100_vf_enum.h |   73 ++
> >  drivers/baseband/acc100/rte_acc100_pmd.h |  490 ++++++++++++++
> >  3 files changed, 1631 insertions(+)
> >  create mode 100644 drivers/baseband/acc100/acc100_pf_enum.h
> >  create mode 100644 drivers/baseband/acc100/acc100_vf_enum.h
> >
> > diff --git a/drivers/baseband/acc100/acc100_pf_enum.h
> > b/drivers/baseband/acc100/acc100_pf_enum.h
> > new file mode 100644
> > index 0000000..a1ee416
> > --- /dev/null
> > +++ b/drivers/baseband/acc100/acc100_pf_enum.h
> > @@ -0,0 +1,1068 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2017 Intel Corporation
> > + */
> > +
> > +#ifndef ACC100_PF_ENUM_H
> > +#define ACC100_PF_ENUM_H
> > +
> > +/*
> > + * ACC100 Register mapping on PF BAR0
> > + * This is automatically generated from RDL, format may change with new
> > RDL
> > + * Release.
> > + * Variable names are as is
> > + */
> > +enum {
> > +	HWPfQmgrEgressQueuesTemplate          =  0x0007FE00,
> > +	HWPfQmgrIngressAq                     =  0x00080000,
> > +	HWPfQmgrArbQAvail                     =  0x00A00010,
> > +	HWPfQmgrArbQBlock                     =  0x00A00014,
> > +	HWPfQmgrAqueueDropNotifEn             =  0x00A00024,
> > +	HWPfQmgrAqueueDisableNotifEn          =  0x00A00028,
> > +	HWPfQmgrSoftReset                     =  0x00A00038,
> > +	HWPfQmgrInitStatus                    =  0x00A0003C,
> > +	HWPfQmgrAramWatchdogCount             =  0x00A00040,
> > +	HWPfQmgrAramWatchdogCounterEn         =  0x00A00044,
> > +	HWPfQmgrAxiWatchdogCount              =  0x00A00048,
> > +	HWPfQmgrAxiWatchdogCounterEn          =  0x00A0004C,
> > +	HWPfQmgrProcessWatchdogCount          =  0x00A00050,
> > +	HWPfQmgrProcessWatchdogCounterEn      =  0x00A00054,
> > +	HWPfQmgrProcessUl4GWatchdogCounter    =  0x00A00058,
> > +	HWPfQmgrProcessDl4GWatchdogCounter    =  0x00A0005C,
> > +	HWPfQmgrProcessUl5GWatchdogCounter    =  0x00A00060,
> > +	HWPfQmgrProcessDl5GWatchdogCounter    =  0x00A00064,
> > +	HWPfQmgrProcessMldWatchdogCounter     =  0x00A00068,
> > +	HWPfQmgrMsiOverflowUpperVf            =  0x00A00070,
> > +	HWPfQmgrMsiOverflowLowerVf            =  0x00A00074,
> > +	HWPfQmgrMsiWatchdogOverflow           =  0x00A00078,
> > +	HWPfQmgrMsiOverflowEnable             =  0x00A0007C,
> > +	HWPfQmgrDebugAqPointerMemGrp          =  0x00A00100,
> > +	HWPfQmgrDebugOutputArbQFifoGrp        =  0x00A00140,
> > +	HWPfQmgrDebugMsiFifoGrp               =  0x00A00180,
> > +	HWPfQmgrDebugAxiWdTimeoutMsiFifo      =  0x00A001C0,
> > +	HWPfQmgrDebugProcessWdTimeoutMsiFifo  =  0x00A001C4,
> > +	HWPfQmgrDepthLog2Grp                  =  0x00A00200,
> > +	HWPfQmgrTholdGrp                      =  0x00A00300,
> > +	HWPfQmgrGrpTmplateReg0Indx            =  0x00A00600,
> > +	HWPfQmgrGrpTmplateReg1Indx            =  0x00A00680,
> > +	HWPfQmgrGrpTmplateReg2indx            =  0x00A00700,
> > +	HWPfQmgrGrpTmplateReg3Indx            =  0x00A00780,
> > +	HWPfQmgrGrpTmplateReg4Indx            =  0x00A00800,
> > +	HWPfQmgrVfBaseAddr                    =  0x00A01000,
> > +	HWPfQmgrUl4GWeightRrVf                =  0x00A02000,
> > +	HWPfQmgrDl4GWeightRrVf                =  0x00A02100,
> > +	HWPfQmgrUl5GWeightRrVf                =  0x00A02200,
> > +	HWPfQmgrDl5GWeightRrVf                =  0x00A02300,
> > +	HWPfQmgrMldWeightRrVf                 =  0x00A02400,
> > +	HWPfQmgrArbQDepthGrp                  =  0x00A02F00,
> > +	HWPfQmgrGrpFunction0                  =  0x00A02F40,
> > +	HWPfQmgrGrpFunction1                  =  0x00A02F44,
> > +	HWPfQmgrGrpPriority                   =  0x00A02F48,
> > +	HWPfQmgrWeightSync                    =  0x00A03000,
> > +	HWPfQmgrAqEnableVf                    =  0x00A10000,
> > +	HWPfQmgrAqResetVf                     =  0x00A20000,
> > +	HWPfQmgrRingSizeVf                    =  0x00A20004,
> > +	HWPfQmgrGrpDepthLog20Vf               =  0x00A20008,
> > +	HWPfQmgrGrpDepthLog21Vf               =  0x00A2000C,
> > +	HWPfQmgrGrpFunction0Vf                =  0x00A20010,
> > +	HWPfQmgrGrpFunction1Vf                =  0x00A20014,
> > +	HWPfDmaConfig0Reg                     =  0x00B80000,
> > +	HWPfDmaConfig1Reg                     =  0x00B80004,
> > +	HWPfDmaQmgrAddrReg                    =  0x00B80008,
> > +	HWPfDmaSoftResetReg                   =  0x00B8000C,
> > +	HWPfDmaAxcacheReg                     =  0x00B80010,
> > +	HWPfDmaVersionReg                     =  0x00B80014,
> > +	HWPfDmaFrameThreshold                 =  0x00B80018,
> > +	HWPfDmaTimestampLo                    =  0x00B8001C,
> > +	HWPfDmaTimestampHi                    =  0x00B80020,
> > +	HWPfDmaAxiStatus                      =  0x00B80028,
> > +	HWPfDmaAxiControl                     =  0x00B8002C,
> > +	HWPfDmaNoQmgr                         =  0x00B80030,
> > +	HWPfDmaQosScale                       =  0x00B80034,
> > +	HWPfDmaQmanen                         =  0x00B80040,
> > +	HWPfDmaQmgrQosBase                    =  0x00B80060,
> > +	HWPfDmaFecClkGatingEnable             =  0x00B80080,
> > +	HWPfDmaPmEnable                       =  0x00B80084,
> > +	HWPfDmaQosEnable                      =  0x00B80088,
> > +	HWPfDmaHarqWeightedRrFrameThreshold   =  0x00B800B0,
> > +	HWPfDmaDataSmallWeightedRrFrameThresh  = 0x00B800B4,
> > +	HWPfDmaDataLargeWeightedRrFrameThresh  = 0x00B800B8,
> > +	HWPfDmaInboundCbMaxSize               =  0x00B800BC,
> > +	HWPfDmaInboundDrainDataSize           =  0x00B800C0,
> > +	HWPfDmaVfDdrBaseRw                    =  0x00B80400,
> > +	HWPfDmaCmplTmOutCnt                   =  0x00B80800,
> > +	HWPfDmaProcTmOutCnt                   =  0x00B80804,
> > +	HWPfDmaStatusRrespBresp               =  0x00B80810,
> > +	HWPfDmaCfgRrespBresp                  =  0x00B80814,
> > +	HWPfDmaStatusMemParErr                =  0x00B80818,
> > +	HWPfDmaCfgMemParErrEn                 =  0x00B8081C,
> > +	HWPfDmaStatusDmaHwErr                 =  0x00B80820,
> > +	HWPfDmaCfgDmaHwErrEn                  =  0x00B80824,
> > +	HWPfDmaStatusFecCoreErr               =  0x00B80828,
> > +	HWPfDmaCfgFecCoreErrEn                =  0x00B8082C,
> > +	HWPfDmaStatusFcwDescrErr              =  0x00B80830,
> > +	HWPfDmaCfgFcwDescrErrEn               =  0x00B80834,
> > +	HWPfDmaStatusBlockTransmit            =  0x00B80838,
> > +	HWPfDmaBlockOnErrEn                   =  0x00B8083C,
> > +	HWPfDmaStatusFlushDma                 =  0x00B80840,
> > +	HWPfDmaFlushDmaOnErrEn                =  0x00B80844,
> > +	HWPfDmaStatusSdoneFifoFull            =  0x00B80848,
> > +	HWPfDmaStatusDescriptorErrLoVf        =  0x00B8084C,
> > +	HWPfDmaStatusDescriptorErrHiVf        =  0x00B80850,
> > +	HWPfDmaStatusFcwErrLoVf               =  0x00B80854,
> > +	HWPfDmaStatusFcwErrHiVf               =  0x00B80858,
> > +	HWPfDmaStatusDataErrLoVf              =  0x00B8085C,
> > +	HWPfDmaStatusDataErrHiVf              =  0x00B80860,
> > +	HWPfDmaCfgMsiEnSoftwareErr            =  0x00B80864,
> > +	HWPfDmaDescriptorSignatuture          =  0x00B80868,
> > +	HWPfDmaFcwSignature                   =  0x00B8086C,
> > +	HWPfDmaErrorDetectionEn               =  0x00B80870,
> > +	HWPfDmaErrCntrlFifoDebug              =  0x00B8087C,
> > +	HWPfDmaStatusToutData                 =  0x00B80880,
> > +	HWPfDmaStatusToutDesc                 =  0x00B80884,
> > +	HWPfDmaStatusToutUnexpData            =  0x00B80888,
> > +	HWPfDmaStatusToutUnexpDesc            =  0x00B8088C,
> > +	HWPfDmaStatusToutProcess              =  0x00B80890,
> > +	HWPfDmaConfigCtoutOutDataEn           =  0x00B808A0,
> > +	HWPfDmaConfigCtoutOutDescrEn          =  0x00B808A4,
> > +	HWPfDmaConfigUnexpComplDataEn         =  0x00B808A8,
> > +	HWPfDmaConfigUnexpComplDescrEn        =  0x00B808AC,
> > +	HWPfDmaConfigPtoutOutEn               =  0x00B808B0,
> > +	HWPfDmaFec5GulDescBaseLoRegVf         =  0x00B88020,
> > +	HWPfDmaFec5GulDescBaseHiRegVf         =  0x00B88024,
> > +	HWPfDmaFec5GulRespPtrLoRegVf          =  0x00B88028,
> > +	HWPfDmaFec5GulRespPtrHiRegVf          =  0x00B8802C,
> > +	HWPfDmaFec5GdlDescBaseLoRegVf         =  0x00B88040,
> > +	HWPfDmaFec5GdlDescBaseHiRegVf         =  0x00B88044,
> > +	HWPfDmaFec5GdlRespPtrLoRegVf          =  0x00B88048,
> > +	HWPfDmaFec5GdlRespPtrHiRegVf          =  0x00B8804C,
> > +	HWPfDmaFec4GulDescBaseLoRegVf         =  0x00B88060,
> > +	HWPfDmaFec4GulDescBaseHiRegVf         =  0x00B88064,
> > +	HWPfDmaFec4GulRespPtrLoRegVf          =  0x00B88068,
> > +	HWPfDmaFec4GulRespPtrHiRegVf          =  0x00B8806C,
> > +	HWPfDmaFec4GdlDescBaseLoRegVf         =  0x00B88080,
> > +	HWPfDmaFec4GdlDescBaseHiRegVf         =  0x00B88084,
> > +	HWPfDmaFec4GdlRespPtrLoRegVf          =  0x00B88088,
> > +	HWPfDmaFec4GdlRespPtrHiRegVf          =  0x00B8808C,
> > +	HWPfDmaVfDdrBaseRangeRo               =  0x00B880A0,
> > +	HWPfQosmonACntrlReg                   =  0x00B90000,
> > +	HWPfQosmonAEvalOverflow0              =  0x00B90008,
> > +	HWPfQosmonAEvalOverflow1              =  0x00B9000C,
> > +	HWPfQosmonADivTerm                    =  0x00B90010,
> > +	HWPfQosmonATickTerm                   =  0x00B90014,
> > +	HWPfQosmonAEvalTerm                   =  0x00B90018,
> > +	HWPfQosmonAAveTerm                    =  0x00B9001C,
> > +	HWPfQosmonAForceEccErr                =  0x00B90020,
> > +	HWPfQosmonAEccErrDetect               =  0x00B90024,
> > +	HWPfQosmonAIterationConfig0Low        =  0x00B90060,
> > +	HWPfQosmonAIterationConfig0High       =  0x00B90064,
> > +	HWPfQosmonAIterationConfig1Low        =  0x00B90068,
> > +	HWPfQosmonAIterationConfig1High       =  0x00B9006C,
> > +	HWPfQosmonAIterationConfig2Low        =  0x00B90070,
> > +	HWPfQosmonAIterationConfig2High       =  0x00B90074,
> > +	HWPfQosmonAIterationConfig3Low        =  0x00B90078,
> > +	HWPfQosmonAIterationConfig3High       =  0x00B9007C,
> > +	HWPfQosmonAEvalMemAddr                =  0x00B90080,
> > +	HWPfQosmonAEvalMemData                =  0x00B90084,
> > +	HWPfQosmonAXaction                    =  0x00B900C0,
> > +	HWPfQosmonARemThres1Vf                =  0x00B90400,
> > +	HWPfQosmonAThres2Vf                   =  0x00B90404,
> > +	HWPfQosmonAWeiFracVf                  =  0x00B90408,
> > +	HWPfQosmonARrWeiVf                    =  0x00B9040C,
> > +	HWPfPermonACntrlRegVf                 =  0x00B98000,
> > +	HWPfPermonACountVf                    =  0x00B98008,
> > +	HWPfPermonAKCntLoVf                   =  0x00B98010,
> > +	HWPfPermonAKCntHiVf                   =  0x00B98014,
> > +	HWPfPermonADeltaCntLoVf               =  0x00B98020,
> > +	HWPfPermonADeltaCntHiVf               =  0x00B98024,
> > +	HWPfPermonAVersionReg                 =  0x00B9C000,
> > +	HWPfPermonACbControlFec               =  0x00B9C0F0,
> > +	HWPfPermonADltTimerLoFec              =  0x00B9C0F4,
> > +	HWPfPermonADltTimerHiFec              =  0x00B9C0F8,
> > +	HWPfPermonACbCountFec                 =  0x00B9C100,
> > +	HWPfPermonAAccExecTimerLoFec          =  0x00B9C104,
> > +	HWPfPermonAAccExecTimerHiFec          =  0x00B9C108,
> > +	HWPfPermonAExecTimerMinFec            =  0x00B9C200,
> > +	HWPfPermonAExecTimerMaxFec            =  0x00B9C204,
> > +	HWPfPermonAControlBusMon              =  0x00B9C400,
> > +	HWPfPermonAConfigBusMon               =  0x00B9C404,
> > +	HWPfPermonASkipCountBusMon            =  0x00B9C408,
> > +	HWPfPermonAMinLatBusMon               =  0x00B9C40C,
> > +	HWPfPermonAMaxLatBusMon               =  0x00B9C500,
> > +	HWPfPermonATotalLatLowBusMon          =  0x00B9C504,
> > +	HWPfPermonATotalLatUpperBusMon        =  0x00B9C508,
> > +	HWPfPermonATotalReqCntBusMon          =  0x00B9C50C,
> > +	HWPfQosmonBCntrlReg                   =  0x00BA0000,
> > +	HWPfQosmonBEvalOverflow0              =  0x00BA0008,
> > +	HWPfQosmonBEvalOverflow1              =  0x00BA000C,
> > +	HWPfQosmonBDivTerm                    =  0x00BA0010,
> > +	HWPfQosmonBTickTerm                   =  0x00BA0014,
> > +	HWPfQosmonBEvalTerm                   =  0x00BA0018,
> > +	HWPfQosmonBAveTerm                    =  0x00BA001C,
> > +	HWPfQosmonBForceEccErr                =  0x00BA0020,
> > +	HWPfQosmonBEccErrDetect               =  0x00BA0024,
> > +	HWPfQosmonBIterationConfig0Low        =  0x00BA0060,
> > +	HWPfQosmonBIterationConfig0High       =  0x00BA0064,
> > +	HWPfQosmonBIterationConfig1Low        =  0x00BA0068,
> > +	HWPfQosmonBIterationConfig1High       =  0x00BA006C,
> > +	HWPfQosmonBIterationConfig2Low        =  0x00BA0070,
> > +	HWPfQosmonBIterationConfig2High       =  0x00BA0074,
> > +	HWPfQosmonBIterationConfig3Low        =  0x00BA0078,
> > +	HWPfQosmonBIterationConfig3High       =  0x00BA007C,
> > +	HWPfQosmonBEvalMemAddr                =  0x00BA0080,
> > +	HWPfQosmonBEvalMemData                =  0x00BA0084,
> > +	HWPfQosmonBXaction                    =  0x00BA00C0,
> > +	HWPfQosmonBRemThres1Vf                =  0x00BA0400,
> > +	HWPfQosmonBThres2Vf                   =  0x00BA0404,
> > +	HWPfQosmonBWeiFracVf                  =  0x00BA0408,
> > +	HWPfQosmonBRrWeiVf                    =  0x00BA040C,
> > +	HWPfPermonBCntrlRegVf                 =  0x00BA8000,
> > +	HWPfPermonBCountVf                    =  0x00BA8008,
> > +	HWPfPermonBKCntLoVf                   =  0x00BA8010,
> > +	HWPfPermonBKCntHiVf                   =  0x00BA8014,
> > +	HWPfPermonBDeltaCntLoVf               =  0x00BA8020,
> > +	HWPfPermonBDeltaCntHiVf               =  0x00BA8024,
> > +	HWPfPermonBVersionReg                 =  0x00BAC000,
> > +	HWPfPermonBCbControlFec               =  0x00BAC0F0,
> > +	HWPfPermonBDltTimerLoFec              =  0x00BAC0F4,
> > +	HWPfPermonBDltTimerHiFec              =  0x00BAC0F8,
> > +	HWPfPermonBCbCountFec                 =  0x00BAC100,
> > +	HWPfPermonBAccExecTimerLoFec          =  0x00BAC104,
> > +	HWPfPermonBAccExecTimerHiFec          =  0x00BAC108,
> > +	HWPfPermonBExecTimerMinFec            =  0x00BAC200,
> > +	HWPfPermonBExecTimerMaxFec            =  0x00BAC204,
> > +	HWPfPermonBControlBusMon              =  0x00BAC400,
> > +	HWPfPermonBConfigBusMon               =  0x00BAC404,
> > +	HWPfPermonBSkipCountBusMon            =  0x00BAC408,
> > +	HWPfPermonBMinLatBusMon               =  0x00BAC40C,
> > +	HWPfPermonBMaxLatBusMon               =  0x00BAC500,
> > +	HWPfPermonBTotalLatLowBusMon          =  0x00BAC504,
> > +	HWPfPermonBTotalLatUpperBusMon        =  0x00BAC508,
> > +	HWPfPermonBTotalReqCntBusMon          =  0x00BAC50C,
> > +	HWPfFecUl5gCntrlReg                   =  0x00BC0000,
> > +	HWPfFecUl5gI2MThreshReg               =  0x00BC0004,
> > +	HWPfFecUl5gVersionReg                 =  0x00BC0100,
> > +	HWPfFecUl5gFcwStatusReg               =  0x00BC0104,
> > +	HWPfFecUl5gWarnReg                    =  0x00BC0108,
> > +	HwPfFecUl5gIbDebugReg                 =  0x00BC0200,
> > +	HwPfFecUl5gObLlrDebugReg              =  0x00BC0204,
> > +	HwPfFecUl5gObHarqDebugReg             =  0x00BC0208,
> > +	HwPfFecUl5g1CntrlReg                  =  0x00BC1000,
> > +	HwPfFecUl5g1I2MThreshReg              =  0x00BC1004,
> > +	HwPfFecUl5g1VersionReg                =  0x00BC1100,
> > +	HwPfFecUl5g1FcwStatusReg              =  0x00BC1104,
> > +	HwPfFecUl5g1WarnReg                   =  0x00BC1108,
> > +	HwPfFecUl5g1IbDebugReg                =  0x00BC1200,
> > +	HwPfFecUl5g1ObLlrDebugReg             =  0x00BC1204,
> > +	HwPfFecUl5g1ObHarqDebugReg            =  0x00BC1208,
> > +	HwPfFecUl5g2CntrlReg                  =  0x00BC2000,
> > +	HwPfFecUl5g2I2MThreshReg              =  0x00BC2004,
> > +	HwPfFecUl5g2VersionReg                =  0x00BC2100,
> > +	HwPfFecUl5g2FcwStatusReg              =  0x00BC2104,
> > +	HwPfFecUl5g2WarnReg                   =  0x00BC2108,
> > +	HwPfFecUl5g2IbDebugReg                =  0x00BC2200,
> > +	HwPfFecUl5g2ObLlrDebugReg             =  0x00BC2204,
> > +	HwPfFecUl5g2ObHarqDebugReg            =  0x00BC2208,
> > +	HwPfFecUl5g3CntrlReg                  =  0x00BC3000,
> > +	HwPfFecUl5g3I2MThreshReg              =  0x00BC3004,
> > +	HwPfFecUl5g3VersionReg                =  0x00BC3100,
> > +	HwPfFecUl5g3FcwStatusReg              =  0x00BC3104,
> > +	HwPfFecUl5g3WarnReg                   =  0x00BC3108,
> > +	HwPfFecUl5g3IbDebugReg                =  0x00BC3200,
> > +	HwPfFecUl5g3ObLlrDebugReg             =  0x00BC3204,
> > +	HwPfFecUl5g3ObHarqDebugReg            =  0x00BC3208,
> > +	HwPfFecUl5g4CntrlReg                  =  0x00BC4000,
> > +	HwPfFecUl5g4I2MThreshReg              =  0x00BC4004,
> > +	HwPfFecUl5g4VersionReg                =  0x00BC4100,
> > +	HwPfFecUl5g4FcwStatusReg              =  0x00BC4104,
> > +	HwPfFecUl5g4WarnReg                   =  0x00BC4108,
> > +	HwPfFecUl5g4IbDebugReg                =  0x00BC4200,
> > +	HwPfFecUl5g4ObLlrDebugReg             =  0x00BC4204,
> > +	HwPfFecUl5g4ObHarqDebugReg            =  0x00BC4208,
> > +	HwPfFecUl5g5CntrlReg                  =  0x00BC5000,
> > +	HwPfFecUl5g5I2MThreshReg              =  0x00BC5004,
> > +	HwPfFecUl5g5VersionReg                =  0x00BC5100,
> > +	HwPfFecUl5g5FcwStatusReg              =  0x00BC5104,
> > +	HwPfFecUl5g5WarnReg                   =  0x00BC5108,
> > +	HwPfFecUl5g5IbDebugReg                =  0x00BC5200,
> > +	HwPfFecUl5g5ObLlrDebugReg             =  0x00BC5204,
> > +	HwPfFecUl5g5ObHarqDebugReg            =  0x00BC5208,
> > +	HwPfFecUl5g6CntrlReg                  =  0x00BC6000,
> > +	HwPfFecUl5g6I2MThreshReg              =  0x00BC6004,
> > +	HwPfFecUl5g6VersionReg                =  0x00BC6100,
> > +	HwPfFecUl5g6FcwStatusReg              =  0x00BC6104,
> > +	HwPfFecUl5g6WarnReg                   =  0x00BC6108,
> > +	HwPfFecUl5g6IbDebugReg                =  0x00BC6200,
> > +	HwPfFecUl5g6ObLlrDebugReg             =  0x00BC6204,
> > +	HwPfFecUl5g6ObHarqDebugReg            =  0x00BC6208,
> > +	HwPfFecUl5g7CntrlReg                  =  0x00BC7000,
> > +	HwPfFecUl5g7I2MThreshReg              =  0x00BC7004,
> > +	HwPfFecUl5g7VersionReg                =  0x00BC7100,
> > +	HwPfFecUl5g7FcwStatusReg              =  0x00BC7104,
> > +	HwPfFecUl5g7WarnReg                   =  0x00BC7108,
> > +	HwPfFecUl5g7IbDebugReg                =  0x00BC7200,
> > +	HwPfFecUl5g7ObLlrDebugReg             =  0x00BC7204,
> > +	HwPfFecUl5g7ObHarqDebugReg            =  0x00BC7208,
> > +	HwPfFecUl5g8CntrlReg                  =  0x00BC8000,
> > +	HwPfFecUl5g8I2MThreshReg              =  0x00BC8004,
> > +	HwPfFecUl5g8VersionReg                =  0x00BC8100,
> > +	HwPfFecUl5g8FcwStatusReg              =  0x00BC8104,
> > +	HwPfFecUl5g8WarnReg                   =  0x00BC8108,
> > +	HwPfFecUl5g8IbDebugReg                =  0x00BC8200,
> > +	HwPfFecUl5g8ObLlrDebugReg             =  0x00BC8204,
> > +	HwPfFecUl5g8ObHarqDebugReg            =  0x00BC8208,
> > +	HWPfFecDl5gCntrlReg                   =  0x00BCF000,
> > +	HWPfFecDl5gI2MThreshReg               =  0x00BCF004,
> > +	HWPfFecDl5gVersionReg                 =  0x00BCF100,
> > +	HWPfFecDl5gFcwStatusReg               =  0x00BCF104,
> > +	HWPfFecDl5gWarnReg                    =  0x00BCF108,
> > +	HWPfFecUlVersionReg                   =  0x00BD0000,
> > +	HWPfFecUlControlReg                   =  0x00BD0004,
> > +	HWPfFecUlStatusReg                    =  0x00BD0008,
> > +	HWPfFecDlVersionReg                   =  0x00BDF000,
> > +	HWPfFecDlClusterConfigReg             =  0x00BDF004,
> > +	HWPfFecDlBurstThres                   =  0x00BDF00C,
> > +	HWPfFecDlClusterStatusReg0            =  0x00BDF040,
> > +	HWPfFecDlClusterStatusReg1            =  0x00BDF044,
> > +	HWPfFecDlClusterStatusReg2            =  0x00BDF048,
> > +	HWPfFecDlClusterStatusReg3            =  0x00BDF04C,
> > +	HWPfFecDlClusterStatusReg4            =  0x00BDF050,
> > +	HWPfFecDlClusterStatusReg5            =  0x00BDF054,
> > +	HWPfChaFabPllPllrst                   =  0x00C40000,
> > +	HWPfChaFabPllClk0                     =  0x00C40004,
> > +	HWPfChaFabPllClk1                     =  0x00C40008,
> > +	HWPfChaFabPllBwadj                    =  0x00C4000C,
> > +	HWPfChaFabPllLbw                      =  0x00C40010,
> > +	HWPfChaFabPllResetq                   =  0x00C40014,
> > +	HWPfChaFabPllPhshft0                  =  0x00C40018,
> > +	HWPfChaFabPllPhshft1                  =  0x00C4001C,
> > +	HWPfChaFabPllDivq0                    =  0x00C40020,
> > +	HWPfChaFabPllDivq1                    =  0x00C40024,
> > +	HWPfChaFabPllDivq2                    =  0x00C40028,
> > +	HWPfChaFabPllDivq3                    =  0x00C4002C,
> > +	HWPfChaFabPllDivq4                    =  0x00C40030,
> > +	HWPfChaFabPllDivq5                    =  0x00C40034,
> > +	HWPfChaFabPllDivq6                    =  0x00C40038,
> > +	HWPfChaFabPllDivq7                    =  0x00C4003C,
> > +	HWPfChaDl5gPllPllrst                  =  0x00C40080,
> > +	HWPfChaDl5gPllClk0                    =  0x00C40084,
> > +	HWPfChaDl5gPllClk1                    =  0x00C40088,
> > +	HWPfChaDl5gPllBwadj                   =  0x00C4008C,
> > +	HWPfChaDl5gPllLbw                     =  0x00C40090,
> > +	HWPfChaDl5gPllResetq                  =  0x00C40094,
> > +	HWPfChaDl5gPllPhshft0                 =  0x00C40098,
> > +	HWPfChaDl5gPllPhshft1                 =  0x00C4009C,
> > +	HWPfChaDl5gPllDivq0                   =  0x00C400A0,
> > +	HWPfChaDl5gPllDivq1                   =  0x00C400A4,
> > +	HWPfChaDl5gPllDivq2                   =  0x00C400A8,
> > +	HWPfChaDl5gPllDivq3                   =  0x00C400AC,
> > +	HWPfChaDl5gPllDivq4                   =  0x00C400B0,
> > +	HWPfChaDl5gPllDivq5                   =  0x00C400B4,
> > +	HWPfChaDl5gPllDivq6                   =  0x00C400B8,
> > +	HWPfChaDl5gPllDivq7                   =  0x00C400BC,
> > +	HWPfChaDl4gPllPllrst                  =  0x00C40100,
> > +	HWPfChaDl4gPllClk0                    =  0x00C40104,
> > +	HWPfChaDl4gPllClk1                    =  0x00C40108,
> > +	HWPfChaDl4gPllBwadj                   =  0x00C4010C,
> > +	HWPfChaDl4gPllLbw                     =  0x00C40110,
> > +	HWPfChaDl4gPllResetq                  =  0x00C40114,
> > +	HWPfChaDl4gPllPhshft0                 =  0x00C40118,
> > +	HWPfChaDl4gPllPhshft1                 =  0x00C4011C,
> > +	HWPfChaDl4gPllDivq0                   =  0x00C40120,
> > +	HWPfChaDl4gPllDivq1                   =  0x00C40124,
> > +	HWPfChaDl4gPllDivq2                   =  0x00C40128,
> > +	HWPfChaDl4gPllDivq3                   =  0x00C4012C,
> > +	HWPfChaDl4gPllDivq4                   =  0x00C40130,
> > +	HWPfChaDl4gPllDivq5                   =  0x00C40134,
> > +	HWPfChaDl4gPllDivq6                   =  0x00C40138,
> > +	HWPfChaDl4gPllDivq7                   =  0x00C4013C,
> > +	HWPfChaUl5gPllPllrst                  =  0x00C40180,
> > +	HWPfChaUl5gPllClk0                    =  0x00C40184,
> > +	HWPfChaUl5gPllClk1                    =  0x00C40188,
> > +	HWPfChaUl5gPllBwadj                   =  0x00C4018C,
> > +	HWPfChaUl5gPllLbw                     =  0x00C40190,
> > +	HWPfChaUl5gPllResetq                  =  0x00C40194,
> > +	HWPfChaUl5gPllPhshft0                 =  0x00C40198,
> > +	HWPfChaUl5gPllPhshft1                 =  0x00C4019C,
> > +	HWPfChaUl5gPllDivq0                   =  0x00C401A0,
> > +	HWPfChaUl5gPllDivq1                   =  0x00C401A4,
> > +	HWPfChaUl5gPllDivq2                   =  0x00C401A8,
> > +	HWPfChaUl5gPllDivq3                   =  0x00C401AC,
> > +	HWPfChaUl5gPllDivq4                   =  0x00C401B0,
> > +	HWPfChaUl5gPllDivq5                   =  0x00C401B4,
> > +	HWPfChaUl5gPllDivq6                   =  0x00C401B8,
> > +	HWPfChaUl5gPllDivq7                   =  0x00C401BC,
> > +	HWPfChaUl4gPllPllrst                  =  0x00C40200,
> > +	HWPfChaUl4gPllClk0                    =  0x00C40204,
> > +	HWPfChaUl4gPllClk1                    =  0x00C40208,
> > +	HWPfChaUl4gPllBwadj                   =  0x00C4020C,
> > +	HWPfChaUl4gPllLbw                     =  0x00C40210,
> > +	HWPfChaUl4gPllResetq                  =  0x00C40214,
> > +	HWPfChaUl4gPllPhshft0                 =  0x00C40218,
> > +	HWPfChaUl4gPllPhshft1                 =  0x00C4021C,
> > +	HWPfChaUl4gPllDivq0                   =  0x00C40220,
> > +	HWPfChaUl4gPllDivq1                   =  0x00C40224,
> > +	HWPfChaUl4gPllDivq2                   =  0x00C40228,
> > +	HWPfChaUl4gPllDivq3                   =  0x00C4022C,
> > +	HWPfChaUl4gPllDivq4                   =  0x00C40230,
> > +	HWPfChaUl4gPllDivq5                   =  0x00C40234,
> > +	HWPfChaUl4gPllDivq6                   =  0x00C40238,
> > +	HWPfChaUl4gPllDivq7                   =  0x00C4023C,
> > +	HWPfChaDdrPllPllrst                   =  0x00C40280,
> > +	HWPfChaDdrPllClk0                     =  0x00C40284,
> > +	HWPfChaDdrPllClk1                     =  0x00C40288,
> > +	HWPfChaDdrPllBwadj                    =  0x00C4028C,
> > +	HWPfChaDdrPllLbw                      =  0x00C40290,
> > +	HWPfChaDdrPllResetq                   =  0x00C40294,
> > +	HWPfChaDdrPllPhshft0                  =  0x00C40298,
> > +	HWPfChaDdrPllPhshft1                  =  0x00C4029C,
> > +	HWPfChaDdrPllDivq0                    =  0x00C402A0,
> > +	HWPfChaDdrPllDivq1                    =  0x00C402A4,
> > +	HWPfChaDdrPllDivq2                    =  0x00C402A8,
> > +	HWPfChaDdrPllDivq3                    =  0x00C402AC,
> > +	HWPfChaDdrPllDivq4                    =  0x00C402B0,
> > +	HWPfChaDdrPllDivq5                    =  0x00C402B4,
> > +	HWPfChaDdrPllDivq6                    =  0x00C402B8,
> > +	HWPfChaDdrPllDivq7                    =  0x00C402BC,
> > +	HWPfChaErrStatus                      =  0x00C40400,
> > +	HWPfChaErrMask                        =  0x00C40404,
> > +	HWPfChaDebugPcieMsiFifo               =  0x00C40410,
> > +	HWPfChaDebugDdrMsiFifo                =  0x00C40414,
> > +	HWPfChaDebugMiscMsiFifo               =  0x00C40418,
> > +	HWPfChaPwmSet                         =  0x00C40420,
> > +	HWPfChaDdrRstStatus                   =  0x00C40430,
> > +	HWPfChaDdrStDoneStatus                =  0x00C40434,
> > +	HWPfChaDdrWbRstCfg                    =  0x00C40438,
> > +	HWPfChaDdrApbRstCfg                   =  0x00C4043C,
> > +	HWPfChaDdrPhyRstCfg                   =  0x00C40440,
> > +	HWPfChaDdrCpuRstCfg                   =  0x00C40444,
> > +	HWPfChaDdrSifRstCfg                   =  0x00C40448,
> > +	HWPfChaPadcfgPcomp0                   =  0x00C41000,
> > +	HWPfChaPadcfgNcomp0                   =  0x00C41004,
> > +	HWPfChaPadcfgOdt0                     =  0x00C41008,
> > +	HWPfChaPadcfgProtect0                 =  0x00C4100C,
> > +	HWPfChaPreemphasisProtect0            =  0x00C41010,
> > +	HWPfChaPreemphasisCompen0             =  0x00C41040,
> > +	HWPfChaPreemphasisOdten0              =  0x00C41044,
> > +	HWPfChaPadcfgPcomp1                   =  0x00C41100,
> > +	HWPfChaPadcfgNcomp1                   =  0x00C41104,
> > +	HWPfChaPadcfgOdt1                     =  0x00C41108,
> > +	HWPfChaPadcfgProtect1                 =  0x00C4110C,
> > +	HWPfChaPreemphasisProtect1            =  0x00C41110,
> > +	HWPfChaPreemphasisCompen1             =  0x00C41140,
> > +	HWPfChaPreemphasisOdten1              =  0x00C41144,
> > +	HWPfChaPadcfgPcomp2                   =  0x00C41200,
> > +	HWPfChaPadcfgNcomp2                   =  0x00C41204,
> > +	HWPfChaPadcfgOdt2                     =  0x00C41208,
> > +	HWPfChaPadcfgProtect2                 =  0x00C4120C,
> > +	HWPfChaPreemphasisProtect2            =  0x00C41210,
> > +	HWPfChaPreemphasisCompen2             =  0x00C41240,
> > +	HWPfChaPreemphasisOdten2              =  0x00C41244,
> > +	HWPfChaPadcfgPcomp3                   =  0x00C41300,
> > +	HWPfChaPadcfgNcomp3                   =  0x00C41304,
> > +	HWPfChaPadcfgOdt3                     =  0x00C41308,
> > +	HWPfChaPadcfgProtect3                 =  0x00C4130C,
> > +	HWPfChaPreemphasisProtect3            =  0x00C41310,
> > +	HWPfChaPreemphasisCompen3             =  0x00C41340,
> > +	HWPfChaPreemphasisOdten3              =  0x00C41344,
> > +	HWPfChaPadcfgPcomp4                   =  0x00C41400,
> > +	HWPfChaPadcfgNcomp4                   =  0x00C41404,
> > +	HWPfChaPadcfgOdt4                     =  0x00C41408,
> > +	HWPfChaPadcfgProtect4                 =  0x00C4140C,
> > +	HWPfChaPreemphasisProtect4            =  0x00C41410,
> > +	HWPfChaPreemphasisCompen4             =  0x00C41440,
> > +	HWPfChaPreemphasisOdten4              =  0x00C41444,
> > +	HWPfHiVfToPfDbellVf                   =  0x00C80000,
> > +	HWPfHiPfToVfDbellVf                   =  0x00C80008,
> > +	HWPfHiInfoRingBaseLoVf                =  0x00C80010,
> > +	HWPfHiInfoRingBaseHiVf                =  0x00C80014,
> > +	HWPfHiInfoRingPointerVf               =  0x00C80018,
> > +	HWPfHiInfoRingIntWrEnVf               =  0x00C80020,
> > +	HWPfHiInfoRingPf2VfWrEnVf             =  0x00C80024,
> > +	HWPfHiMsixVectorMapperVf              =  0x00C80060,
> > +	HWPfHiModuleVersionReg                =  0x00C84000,
> > +	HWPfHiIosf2axiErrLogReg               =  0x00C84004,
> > +	HWPfHiHardResetReg                    =  0x00C84008,
> > +	HWPfHi5GHardResetReg                  =  0x00C8400C,
> > +	HWPfHiInfoRingBaseLoRegPf             =  0x00C84010,
> > +	HWPfHiInfoRingBaseHiRegPf             =  0x00C84014,
> > +	HWPfHiInfoRingPointerRegPf            =  0x00C84018,
> > +	HWPfHiInfoRingIntWrEnRegPf            =  0x00C84020,
> > +	HWPfHiInfoRingVf2pfLoWrEnReg          =  0x00C84024,
> > +	HWPfHiInfoRingVf2pfHiWrEnReg          =  0x00C84028,
> > +	HWPfHiLogParityErrStatusReg           =  0x00C8402C,
> > +	HWPfHiLogDataParityErrorVfStatusLo    =  0x00C84030,
> > +	HWPfHiLogDataParityErrorVfStatusHi    =  0x00C84034,
> > +	HWPfHiBlockTransmitOnErrorEn          =  0x00C84038,
> > +	HWPfHiCfgMsiIntWrEnRegPf              =  0x00C84040,
> > +	HWPfHiCfgMsiVf2pfLoWrEnReg            =  0x00C84044,
> > +	HWPfHiCfgMsiVf2pfHighWrEnReg          =  0x00C84048,
> > +	HWPfHiMsixVectorMapperPf              =  0x00C84060,
> > +	HWPfHiApbWrWaitTime                   =  0x00C84100,
> > +	HWPfHiXCounterMaxValue                =  0x00C84104,
> > +	HWPfHiPfMode                          =  0x00C84108,
> > +	HWPfHiClkGateHystReg                  =  0x00C8410C,
> > +	HWPfHiSnoopBitsReg                    =  0x00C84110,
> > +	HWPfHiMsiDropEnableReg                =  0x00C84114,
> > +	HWPfHiMsiStatReg                      =  0x00C84120,
> > +	HWPfHiFifoOflStatReg                  =  0x00C84124,
> > +	HWPfHiHiDebugReg                      =  0x00C841F4,
> > +	HWPfHiDebugMemSnoopMsiFifo            =  0x00C841F8,
> > +	HWPfHiDebugMemSnoopInputFifo          =  0x00C841FC,
> > +	HWPfHiMsixMappingConfig               =  0x00C84200,
> > +	HWPfHiJunkReg                         =  0x00C8FF00,
> > +	HWPfDdrUmmcVer                        =  0x00D00000,
> > +	HWPfDdrUmmcCap                        =  0x00D00010,
> > +	HWPfDdrUmmcCtrl                       =  0x00D00020,
> > +	HWPfDdrMpcPe                          =  0x00D00080,
> > +	HWPfDdrMpcPpri3                       =  0x00D00090,
> > +	HWPfDdrMpcPpri2                       =  0x00D000A0,
> > +	HWPfDdrMpcPpri1                       =  0x00D000B0,
> > +	HWPfDdrMpcPpri0                       =  0x00D000C0,
> > +	HWPfDdrMpcPrwgrpCtrl                  =  0x00D000D0,
> > +	HWPfDdrMpcPbw7                        =  0x00D000E0,
> > +	HWPfDdrMpcPbw6                        =  0x00D000F0,
> > +	HWPfDdrMpcPbw5                        =  0x00D00100,
> > +	HWPfDdrMpcPbw4                        =  0x00D00110,
> > +	HWPfDdrMpcPbw3                        =  0x00D00120,
> > +	HWPfDdrMpcPbw2                        =  0x00D00130,
> > +	HWPfDdrMpcPbw1                        =  0x00D00140,
> > +	HWPfDdrMpcPbw0                        =  0x00D00150,
> > +	HWPfDdrMemoryInit                     =  0x00D00200,
> > +	HWPfDdrMemoryInitDone                 =  0x00D00210,
> > +	HWPfDdrMemInitPhyTrng0                =  0x00D00240,
> > +	HWPfDdrMemInitPhyTrng1                =  0x00D00250,
> > +	HWPfDdrMemInitPhyTrng2                =  0x00D00260,
> > +	HWPfDdrMemInitPhyTrng3                =  0x00D00270,
> > +	HWPfDdrBcDram                         =  0x00D003C0,
> > +	HWPfDdrBcAddrMap                      =  0x00D003D0,
> > +	HWPfDdrBcRef                          =  0x00D003E0,
> > +	HWPfDdrBcTim0                         =  0x00D00400,
> > +	HWPfDdrBcTim1                         =  0x00D00410,
> > +	HWPfDdrBcTim2                         =  0x00D00420,
> > +	HWPfDdrBcTim3                         =  0x00D00430,
> > +	HWPfDdrBcTim4                         =  0x00D00440,
> > +	HWPfDdrBcTim5                         =  0x00D00450,
> > +	HWPfDdrBcTim6                         =  0x00D00460,
> > +	HWPfDdrBcTim7                         =  0x00D00470,
> > +	HWPfDdrBcTim8                         =  0x00D00480,
> > +	HWPfDdrBcTim9                         =  0x00D00490,
> > +	HWPfDdrBcTim10                        =  0x00D004A0,
> > +	HWPfDdrBcTim12                        =  0x00D004C0,
> > +	HWPfDdrDfiInit                        =  0x00D004D0,
> > +	HWPfDdrDfiInitComplete                =  0x00D004E0,
> > +	HWPfDdrDfiTim0                        =  0x00D004F0,
> > +	HWPfDdrDfiTim1                        =  0x00D00500,
> > +	HWPfDdrDfiPhyUpdEn                    =  0x00D00530,
> > +	HWPfDdrMemStatus                      =  0x00D00540,
> > +	HWPfDdrUmmcErrStatus                  =  0x00D00550,
> > +	HWPfDdrUmmcIntStatus                  =  0x00D00560,
> > +	HWPfDdrUmmcIntEn                      =  0x00D00570,
> > +	HWPfDdrPhyRdLatency                   =  0x00D48400,
> > +	HWPfDdrPhyRdLatencyDbi                =  0x00D48410,
> > +	HWPfDdrPhyWrLatency                   =  0x00D48420,
> > +	HWPfDdrPhyTrngType                    =  0x00D48430,
> > +	HWPfDdrPhyMrsTiming2                  =  0x00D48440,
> > +	HWPfDdrPhyMrsTiming0                  =  0x00D48450,
> > +	HWPfDdrPhyMrsTiming1                  =  0x00D48460,
> > +	HWPfDdrPhyDramTmrd                    =  0x00D48470,
> > +	HWPfDdrPhyDramTmod                    =  0x00D48480,
> > +	HWPfDdrPhyDramTwpre                   =  0x00D48490,
> > +	HWPfDdrPhyDramTrfc                    =  0x00D484A0,
> > +	HWPfDdrPhyDramTrwtp                   =  0x00D484B0,
> > +	HWPfDdrPhyMr01Dimm                    =  0x00D484C0,
> > +	HWPfDdrPhyMr01DimmDbi                 =  0x00D484D0,
> > +	HWPfDdrPhyMr23Dimm                    =  0x00D484E0,
> > +	HWPfDdrPhyMr45Dimm                    =  0x00D484F0,
> > +	HWPfDdrPhyMr67Dimm                    =  0x00D48500,
> > +	HWPfDdrPhyWrlvlWwRdlvlRr              =  0x00D48510,
> > +	HWPfDdrPhyOdtEn                       =  0x00D48520,
> > +	HWPfDdrPhyFastTrng                    =  0x00D48530,
> > +	HWPfDdrPhyDynTrngGap                  =  0x00D48540,
> > +	HWPfDdrPhyDynRcalGap                  =  0x00D48550,
> > +	HWPfDdrPhyIdletimeout                 =  0x00D48560,
> > +	HWPfDdrPhyRstCkeGap                   =  0x00D48570,
> > +	HWPfDdrPhyCkeMrsGap                   =  0x00D48580,
> > +	HWPfDdrPhyMemVrefMidVal               =  0x00D48590,
> > +	HWPfDdrPhyVrefStep                    =  0x00D485A0,
> > +	HWPfDdrPhyVrefThreshold               =  0x00D485B0,
> > +	HWPfDdrPhyPhyVrefMidVal               =  0x00D485C0,
> > +	HWPfDdrPhyDqsCountMax                 =  0x00D485D0,
> > +	HWPfDdrPhyDqsCountNum                 =  0x00D485E0,
> > +	HWPfDdrPhyDramRow                     =  0x00D485F0,
> > +	HWPfDdrPhyDramCol                     =  0x00D48600,
> > +	HWPfDdrPhyDramBgBa                    =  0x00D48610,
> > +	HWPfDdrPhyDynamicUpdreqrel            =  0x00D48620,
> > +	HWPfDdrPhyVrefLimits                  =  0x00D48630,
> > +	HWPfDdrPhyIdtmTcStatus                =  0x00D6C020,
> > +	HWPfDdrPhyIdtmFwVersion               =  0x00D6C410,
> > +	HWPfDdrPhyRdlvlGateInitDelay          =  0x00D70000,
> > +	HWPfDdrPhyRdenSmplabc                 =  0x00D70008,
> > +	HWPfDdrPhyVrefNibble0                 =  0x00D7000C,
> > +	HWPfDdrPhyVrefNibble1                 =  0x00D70010,
> > +	HWPfDdrPhyRdlvlGateDqsSmpl0           =  0x00D70014,
> > +	HWPfDdrPhyRdlvlGateDqsSmpl1           =  0x00D70018,
> > +	HWPfDdrPhyRdlvlGateDqsSmpl2           =  0x00D7001C,
> > +	HWPfDdrPhyDqsCount                    =  0x00D70020,
> > +	HWPfDdrPhyWrlvlRdlvlGateStatus        =  0x00D70024,
> > +	HWPfDdrPhyErrorFlags                  =  0x00D70028,
> > +	HWPfDdrPhyPowerDown                   =  0x00D70030,
> > +	HWPfDdrPhyPrbsSeedByte0               =  0x00D70034,
> > +	HWPfDdrPhyPrbsSeedByte1               =  0x00D70038,
> > +	HWPfDdrPhyPcompDq                     =  0x00D70040,
> > +	HWPfDdrPhyNcompDq                     =  0x00D70044,
> > +	HWPfDdrPhyPcompDqs                    =  0x00D70048,
> > +	HWPfDdrPhyNcompDqs                    =  0x00D7004C,
> > +	HWPfDdrPhyPcompCmd                    =  0x00D70050,
> > +	HWPfDdrPhyNcompCmd                    =  0x00D70054,
> > +	HWPfDdrPhyPcompCk                     =  0x00D70058,
> > +	HWPfDdrPhyNcompCk                     =  0x00D7005C,
> > +	HWPfDdrPhyRcalOdtDq                   =  0x00D70060,
> > +	HWPfDdrPhyRcalOdtDqs                  =  0x00D70064,
> > +	HWPfDdrPhyRcalMask1                   =  0x00D70068,
> > +	HWPfDdrPhyRcalMask2                   =  0x00D7006C,
> > +	HWPfDdrPhyRcalCtrl                    =  0x00D70070,
> > +	HWPfDdrPhyRcalCnt                     =  0x00D70074,
> > +	HWPfDdrPhyRcalOverride                =  0x00D70078,
> > +	HWPfDdrPhyRcalGateen                  =  0x00D7007C,
> > +	HWPfDdrPhyCtrl                        =  0x00D70080,
> > +	HWPfDdrPhyWrlvlAlg                    =  0x00D70084,
> > +	HWPfDdrPhyRcalVreftTxcmdOdt           =  0x00D70088,
> > +	HWPfDdrPhyRdlvlGateParam              =  0x00D7008C,
> > +	HWPfDdrPhyRdlvlGateParam2             =  0x00D70090,
> > +	HWPfDdrPhyRcalVreftTxdata             =  0x00D70094,
> > +	HWPfDdrPhyCmdIntDelay                 =  0x00D700A4,
> > +	HWPfDdrPhyAlertN                      =  0x00D700A8,
> > +	HWPfDdrPhyTrngReqWpre2tck             =  0x00D700AC,
> > +	HWPfDdrPhyCmdPhaseSel                 =  0x00D700B4,
> > +	HWPfDdrPhyCmdDcdl                     =  0x00D700B8,
> > +	HWPfDdrPhyCkDcdl                      =  0x00D700BC,
> > +	HWPfDdrPhySwTrngCtrl1                 =  0x00D700C0,
> > +	HWPfDdrPhySwTrngCtrl2                 =  0x00D700C4,
> > +	HWPfDdrPhyRcalPcompRden               =  0x00D700C8,
> > +	HWPfDdrPhyRcalNcompRden               =  0x00D700CC,
> > +	HWPfDdrPhyRcalCompen                  =  0x00D700D0,
> > +	HWPfDdrPhySwTrngRdqs                  =  0x00D700D4,
> > +	HWPfDdrPhySwTrngWdqs                  =  0x00D700D8,
> > +	HWPfDdrPhySwTrngRdena                 =  0x00D700DC,
> > +	HWPfDdrPhySwTrngRdenb                 =  0x00D700E0,
> > +	HWPfDdrPhySwTrngRdenc                 =  0x00D700E4,
> > +	HWPfDdrPhySwTrngWdq                   =  0x00D700E8,
> > +	HWPfDdrPhySwTrngRdq                   =  0x00D700EC,
> > +	HWPfDdrPhyPcfgHmValue                 =  0x00D700F0,
> > +	HWPfDdrPhyPcfgTimerValue              =  0x00D700F4,
> > +	HWPfDdrPhyPcfgSoftwareTraining        =  0x00D700F8,
> > +	HWPfDdrPhyPcfgMcStatus                =  0x00D700FC,
> > +	HWPfDdrPhyWrlvlPhRank0                =  0x00D70100,
> > +	HWPfDdrPhyRdenPhRank0                 =  0x00D70104,
> > +	HWPfDdrPhyRdenIntRank0                =  0x00D70108,
> > +	HWPfDdrPhyRdqsDcdlRank0               =  0x00D7010C,
> > +	HWPfDdrPhyRdqsShadowDcdlRank0         =  0x00D70110,
> > +	HWPfDdrPhyWdqsDcdlRank0               =  0x00D70114,
> > +	HWPfDdrPhyWdmDcdlShadowRank0          =  0x00D70118,
> > +	HWPfDdrPhyWdmDcdlRank0                =  0x00D7011C,
> > +	HWPfDdrPhyDbiDcdlRank0                =  0x00D70120,
> > +	HWPfDdrPhyRdenDcdlaRank0              =  0x00D70124,
> > +	HWPfDdrPhyDbiDcdlShadowRank0          =  0x00D70128,
> > +	HWPfDdrPhyRdenDcdlbRank0              =  0x00D7012C,
> > +	HWPfDdrPhyWdqsShadowDcdlRank0         =  0x00D70130,
> > +	HWPfDdrPhyRdenDcdlcRank0              =  0x00D70134,
> > +	HWPfDdrPhyRdenShadowDcdlaRank0        =  0x00D70138,
> > +	HWPfDdrPhyWrlvlIntRank0               =  0x00D7013C,
> > +	HWPfDdrPhyRdqDcdlBit0Rank0            =  0x00D70200,
> > +	HWPfDdrPhyRdqDcdlShadowBit0Rank0      =  0x00D70204,
> > +	HWPfDdrPhyWdqDcdlBit0Rank0            =  0x00D70208,
> > +	HWPfDdrPhyWdqDcdlShadowBit0Rank0      =  0x00D7020C,
> > +	HWPfDdrPhyRdqDcdlBit1Rank0            =  0x00D70240,
> > +	HWPfDdrPhyRdqDcdlShadowBit1Rank0      =  0x00D70244,
> > +	HWPfDdrPhyWdqDcdlBit1Rank0            =  0x00D70248,
> > +	HWPfDdrPhyWdqDcdlShadowBit1Rank0      =  0x00D7024C,
> > +	HWPfDdrPhyRdqDcdlBit2Rank0            =  0x00D70280,
> > +	HWPfDdrPhyRdqDcdlShadowBit2Rank0      =  0x00D70284,
> > +	HWPfDdrPhyWdqDcdlBit2Rank0            =  0x00D70288,
> > +	HWPfDdrPhyWdqDcdlShadowBit2Rank0      =  0x00D7028C,
> > +	HWPfDdrPhyRdqDcdlBit3Rank0            =  0x00D702C0,
> > +	HWPfDdrPhyRdqDcdlShadowBit3Rank0      =  0x00D702C4,
> > +	HWPfDdrPhyWdqDcdlBit3Rank0            =  0x00D702C8,
> > +	HWPfDdrPhyWdqDcdlShadowBit3Rank0      =  0x00D702CC,
> > +	HWPfDdrPhyRdqDcdlBit4Rank0            =  0x00D70300,
> > +	HWPfDdrPhyRdqDcdlShadowBit4Rank0      =  0x00D70304,
> > +	HWPfDdrPhyWdqDcdlBit4Rank0            =  0x00D70308,
> > +	HWPfDdrPhyWdqDcdlShadowBit4Rank0      =  0x00D7030C,
> > +	HWPfDdrPhyRdqDcdlBit5Rank0            =  0x00D70340,
> > +	HWPfDdrPhyRdqDcdlShadowBit5Rank0      =  0x00D70344,
> > +	HWPfDdrPhyWdqDcdlBit5Rank0            =  0x00D70348,
> > +	HWPfDdrPhyWdqDcdlShadowBit5Rank0      =  0x00D7034C,
> > +	HWPfDdrPhyRdqDcdlBit6Rank0            =  0x00D70380,
> > +	HWPfDdrPhyRdqDcdlShadowBit6Rank0      =  0x00D70384,
> > +	HWPfDdrPhyWdqDcdlBit6Rank0            =  0x00D70388,
> > +	HWPfDdrPhyWdqDcdlShadowBit6Rank0      =  0x00D7038C,
> > +	HWPfDdrPhyRdqDcdlBit7Rank0            =  0x00D703C0,
> > +	HWPfDdrPhyRdqDcdlShadowBit7Rank0      =  0x00D703C4,
> > +	HWPfDdrPhyWdqDcdlBit7Rank0            =  0x00D703C8,
> > +	HWPfDdrPhyWdqDcdlShadowBit7Rank0      =  0x00D703CC,
> > +	HWPfDdrPhyIdtmStatus                  =  0x00D740D0,
> > +	HWPfDdrPhyIdtmError                   =  0x00D74110,
> > +	HWPfDdrPhyIdtmDebug                   =  0x00D74120,
> > +	HWPfDdrPhyIdtmDebugInt                =  0x00D74130,
> > +	HwPfPcieLnAsicCfgovr                  =  0x00D80000,
> > +	HwPfPcieLnAclkmixer                   =  0x00D80004,
> > +	HwPfPcieLnTxrampfreq                  =  0x00D80008,
> > +	HwPfPcieLnLanetest                    =  0x00D8000C,
> > +	HwPfPcieLnDcctrl                      =  0x00D80010,
> > +	HwPfPcieLnDccmeas                     =  0x00D80014,
> > +	HwPfPcieLnDccovrAclk                  =  0x00D80018,
> > +	HwPfPcieLnDccovrTxa                   =  0x00D8001C,
> > +	HwPfPcieLnDccovrTxk                   =  0x00D80020,
> > +	HwPfPcieLnDccovrDclk                  =  0x00D80024,
> > +	HwPfPcieLnDccovrEclk                  =  0x00D80028,
> > +	HwPfPcieLnDcctrimAclk                 =  0x00D8002C,
> > +	HwPfPcieLnDcctrimTx                   =  0x00D80030,
> > +	HwPfPcieLnDcctrimDclk                 =  0x00D80034,
> > +	HwPfPcieLnDcctrimEclk                 =  0x00D80038,
> > +	HwPfPcieLnQuadCtrl                    =  0x00D8003C,
> > +	HwPfPcieLnQuadCorrIndex               =  0x00D80040,
> > +	HwPfPcieLnQuadCorrStatus              =  0x00D80044,
> > +	HwPfPcieLnAsicRxovr1                  =  0x00D80048,
> > +	HwPfPcieLnAsicRxovr2                  =  0x00D8004C,
> > +	HwPfPcieLnAsicEqinfovr                =  0x00D80050,
> > +	HwPfPcieLnRxcsr                       =  0x00D80054,
> > +	HwPfPcieLnRxfectrl                    =  0x00D80058,
> > +	HwPfPcieLnRxtest                      =  0x00D8005C,
> > +	HwPfPcieLnEscount                     =  0x00D80060,
> > +	HwPfPcieLnCdrctrl                     =  0x00D80064,
> > +	HwPfPcieLnCdrctrl2                    =  0x00D80068,
> > +	HwPfPcieLnCdrcfg0Ctrl0                =  0x00D8006C,
> > +	HwPfPcieLnCdrcfg0Ctrl1                =  0x00D80070,
> > +	HwPfPcieLnCdrcfg0Ctrl2                =  0x00D80074,
> > +	HwPfPcieLnCdrcfg1Ctrl0                =  0x00D80078,
> > +	HwPfPcieLnCdrcfg1Ctrl1                =  0x00D8007C,
> > +	HwPfPcieLnCdrcfg1Ctrl2                =  0x00D80080,
> > +	HwPfPcieLnCdrcfg2Ctrl0                =  0x00D80084,
> > +	HwPfPcieLnCdrcfg2Ctrl1                =  0x00D80088,
> > +	HwPfPcieLnCdrcfg2Ctrl2                =  0x00D8008C,
> > +	HwPfPcieLnCdrcfg3Ctrl0                =  0x00D80090,
> > +	HwPfPcieLnCdrcfg3Ctrl1                =  0x00D80094,
> > +	HwPfPcieLnCdrcfg3Ctrl2                =  0x00D80098,
> > +	HwPfPcieLnCdrphase                    =  0x00D8009C,
> > +	HwPfPcieLnCdrfreq                     =  0x00D800A0,
> > +	HwPfPcieLnCdrstatusPhase              =  0x00D800A4,
> > +	HwPfPcieLnCdrstatusFreq               =  0x00D800A8,
> > +	HwPfPcieLnCdroffset                   =  0x00D800AC,
> > +	HwPfPcieLnRxvosctl                    =  0x00D800B0,
> > +	HwPfPcieLnRxvosctl2                   =  0x00D800B4,
> > +	HwPfPcieLnRxlosctl                    =  0x00D800B8,
> > +	HwPfPcieLnRxlos                       =  0x00D800BC,
> > +	HwPfPcieLnRxlosvval                   =  0x00D800C0,
> > +	HwPfPcieLnRxvosd0                     =  0x00D800C4,
> > +	HwPfPcieLnRxvosd1                     =  0x00D800C8,
> > +	HwPfPcieLnRxvosep0                    =  0x00D800CC,
> > +	HwPfPcieLnRxvosep1                    =  0x00D800D0,
> > +	HwPfPcieLnRxvosen0                    =  0x00D800D4,
> > +	HwPfPcieLnRxvosen1                    =  0x00D800D8,
> > +	HwPfPcieLnRxvosafe                    =  0x00D800DC,
> > +	HwPfPcieLnRxvosa0                     =  0x00D800E0,
> > +	HwPfPcieLnRxvosa0Out                  =  0x00D800E4,
> > +	HwPfPcieLnRxvosa1                     =  0x00D800E8,
> > +	HwPfPcieLnRxvosa1Out                  =  0x00D800EC,
> > +	HwPfPcieLnRxmisc                      =  0x00D800F0,
> > +	HwPfPcieLnRxbeacon                    =  0x00D800F4,
> > +	HwPfPcieLnRxdssout                    =  0x00D800F8,
> > +	HwPfPcieLnRxdssout2                   =  0x00D800FC,
> > +	HwPfPcieLnAlphapctrl                  =  0x00D80100,
> > +	HwPfPcieLnAlphanctrl                  =  0x00D80104,
> > +	HwPfPcieLnAdaptctrl                   =  0x00D80108,
> > +	HwPfPcieLnAdaptctrl1                  =  0x00D8010C,
> > +	HwPfPcieLnAdaptstatus                 =  0x00D80110,
> > +	HwPfPcieLnAdaptvga1                   =  0x00D80114,
> > +	HwPfPcieLnAdaptvga2                   =  0x00D80118,
> > +	HwPfPcieLnAdaptvga3                   =  0x00D8011C,
> > +	HwPfPcieLnAdaptvga4                   =  0x00D80120,
> > +	HwPfPcieLnAdaptboost1                 =  0x00D80124,
> > +	HwPfPcieLnAdaptboost2                 =  0x00D80128,
> > +	HwPfPcieLnAdaptboost3                 =  0x00D8012C,
> > +	HwPfPcieLnAdaptboost4                 =  0x00D80130,
> > +	HwPfPcieLnAdaptsslms1                 =  0x00D80134,
> > +	HwPfPcieLnAdaptsslms2                 =  0x00D80138,
> > +	HwPfPcieLnAdaptvgaStatus              =  0x00D8013C,
> > +	HwPfPcieLnAdaptboostStatus            =  0x00D80140,
> > +	HwPfPcieLnAdaptsslmsStatus1           =  0x00D80144,
> > +	HwPfPcieLnAdaptsslmsStatus2           =  0x00D80148,
> > +	HwPfPcieLnAfectrl1                    =  0x00D8014C,
> > +	HwPfPcieLnAfectrl2                    =  0x00D80150,
> > +	HwPfPcieLnAfectrl3                    =  0x00D80154,
> > +	HwPfPcieLnAfedefault1                 =  0x00D80158,
> > +	HwPfPcieLnAfedefault2                 =  0x00D8015C,
> > +	HwPfPcieLnDfectrl1                    =  0x00D80160,
> > +	HwPfPcieLnDfectrl2                    =  0x00D80164,
> > +	HwPfPcieLnDfectrl3                    =  0x00D80168,
> > +	HwPfPcieLnDfectrl4                    =  0x00D8016C,
> > +	HwPfPcieLnDfectrl5                    =  0x00D80170,
> > +	HwPfPcieLnDfectrl6                    =  0x00D80174,
> > +	HwPfPcieLnAfestatus1                  =  0x00D80178,
> > +	HwPfPcieLnAfestatus2                  =  0x00D8017C,
> > +	HwPfPcieLnDfestatus1                  =  0x00D80180,
> > +	HwPfPcieLnDfestatus2                  =  0x00D80184,
> > +	HwPfPcieLnDfestatus3                  =  0x00D80188,
> > +	HwPfPcieLnDfestatus4                  =  0x00D8018C,
> > +	HwPfPcieLnDfestatus5                  =  0x00D80190,
> > +	HwPfPcieLnAlphastatus                 =  0x00D80194,
> > +	HwPfPcieLnFomctrl1                    =  0x00D80198,
> > +	HwPfPcieLnFomctrl2                    =  0x00D8019C,
> > +	HwPfPcieLnFomctrl3                    =  0x00D801A0,
> > +	HwPfPcieLnAclkcalStatus               =  0x00D801A4,
> > +	HwPfPcieLnOffscorrStatus              =  0x00D801A8,
> > +	HwPfPcieLnEyewidthStatus              =  0x00D801AC,
> > +	HwPfPcieLnEyeheightStatus             =  0x00D801B0,
> > +	HwPfPcieLnAsicTxovr1                  =  0x00D801B4,
> > +	HwPfPcieLnAsicTxovr2                  =  0x00D801B8,
> > +	HwPfPcieLnAsicTxovr3                  =  0x00D801BC,
> > +	HwPfPcieLnTxbiasadjOvr                =  0x00D801C0,
> > +	HwPfPcieLnTxcsr                       =  0x00D801C4,
> > +	HwPfPcieLnTxtest                      =  0x00D801C8,
> > +	HwPfPcieLnTxtestword                  =  0x00D801CC,
> > +	HwPfPcieLnTxtestwordHigh              =  0x00D801D0,
> > +	HwPfPcieLnTxdrive                     =  0x00D801D4,
> > +	HwPfPcieLnMtcsLn                      =  0x00D801D8,
> > +	HwPfPcieLnStatsumLn                   =  0x00D801DC,
> > +	HwPfPcieLnRcbusScratch                =  0x00D801E0,
> > +	HwPfPcieLnRcbusMinorrev               =  0x00D801F0,
> > +	HwPfPcieLnRcbusMajorrev               =  0x00D801F4,
> > +	HwPfPcieLnRcbusBlocktype              =  0x00D801F8,
> > +	HwPfPcieSupPllcsr                     =  0x00D80800,
> > +	HwPfPcieSupPlldiv                     =  0x00D80804,
> > +	HwPfPcieSupPllcal                     =  0x00D80808,
> > +	HwPfPcieSupPllcalsts                  =  0x00D8080C,
> > +	HwPfPcieSupPllmeas                    =  0x00D80810,
> > +	HwPfPcieSupPlldactrim                 =  0x00D80814,
> > +	HwPfPcieSupPllbiastrim                =  0x00D80818,
> > +	HwPfPcieSupPllbwtrim                  =  0x00D8081C,
> > +	HwPfPcieSupPllcaldly                  =  0x00D80820,
> > +	HwPfPcieSupRefclkonpclkctrl           =  0x00D80824,
> > +	HwPfPcieSupPclkdelay                  =  0x00D80828,
> > +	HwPfPcieSupPhyconfig                  =  0x00D8082C,
> > +	HwPfPcieSupRcalIntf                   =  0x00D80830,
> > +	HwPfPcieSupAuxcsr                     =  0x00D80834,
> > +	HwPfPcieSupVref                       =  0x00D80838,
> > +	HwPfPcieSupLinkmode                   =  0x00D8083C,
> > +	HwPfPcieSupRrefcalctl                 =  0x00D80840,
> > +	HwPfPcieSupRrefcal                    =  0x00D80844,
> > +	HwPfPcieSupRrefcaldly                 =  0x00D80848,
> > +	HwPfPcieSupTximpcalctl                =  0x00D8084C,
> > +	HwPfPcieSupTximpcal                   =  0x00D80850,
> > +	HwPfPcieSupTximpoffset                =  0x00D80854,
> > +	HwPfPcieSupTximpcaldly                =  0x00D80858,
> > +	HwPfPcieSupRximpcalctl                =  0x00D8085C,
> > +	HwPfPcieSupRximpcal                   =  0x00D80860,
> > +	HwPfPcieSupRximpoffset                =  0x00D80864,
> > +	HwPfPcieSupRximpcaldly                =  0x00D80868,
> > +	HwPfPcieSupFence                      =  0x00D8086C,
> > +	HwPfPcieSupMtcs                       =  0x00D80870,
> > +	HwPfPcieSupStatsum                    =  0x00D809B8,
> > +	HwPfPciePcsDpStatus0                  =  0x00D81000,
> > +	HwPfPciePcsDpControl0                 =  0x00D81004,
> > +	HwPfPciePcsPmaStatusLane0             =  0x00D81008,
> > +	HwPfPciePcsPipeStatusLane0            =  0x00D8100C,
> > +	HwPfPciePcsTxdeemph0Lane0             =  0x00D81010,
> > +	HwPfPciePcsTxdeemph1Lane0             =  0x00D81014,
> > +	HwPfPciePcsInternalStatusLane0        =  0x00D81018,
> > +	HwPfPciePcsDpStatus1                  =  0x00D8101C,
> > +	HwPfPciePcsDpControl1                 =  0x00D81020,
> > +	HwPfPciePcsPmaStatusLane1             =  0x00D81024,
> > +	HwPfPciePcsPipeStatusLane1            =  0x00D81028,
> > +	HwPfPciePcsTxdeemph0Lane1             =  0x00D8102C,
> > +	HwPfPciePcsTxdeemph1Lane1             =  0x00D81030,
> > +	HwPfPciePcsInternalStatusLane1        =  0x00D81034,
> > +	HwPfPciePcsDpStatus2                  =  0x00D81038,
> > +	HwPfPciePcsDpControl2                 =  0x00D8103C,
> > +	HwPfPciePcsPmaStatusLane2             =  0x00D81040,
> > +	HwPfPciePcsPipeStatusLane2            =  0x00D81044,
> > +	HwPfPciePcsTxdeemph0Lane2             =  0x00D81048,
> > +	HwPfPciePcsTxdeemph1Lane2             =  0x00D8104C,
> > +	HwPfPciePcsInternalStatusLane2        =  0x00D81050,
> > +	HwPfPciePcsDpStatus3                  =  0x00D81054,
> > +	HwPfPciePcsDpControl3                 =  0x00D81058,
> > +	HwPfPciePcsPmaStatusLane3             =  0x00D8105C,
> > +	HwPfPciePcsPipeStatusLane3            =  0x00D81060,
> > +	HwPfPciePcsTxdeemph0Lane3             =  0x00D81064,
> > +	HwPfPciePcsTxdeemph1Lane3             =  0x00D81068,
> > +	HwPfPciePcsInternalStatusLane3        =  0x00D8106C,
> > +	HwPfPciePcsEbStatus0                  =  0x00D81070,
> > +	HwPfPciePcsEbStatus1                  =  0x00D81074,
> > +	HwPfPciePcsEbStatus2                  =  0x00D81078,
> > +	HwPfPciePcsEbStatus3                  =  0x00D8107C,
> > +	HwPfPciePcsPllSettingPcieG1           =  0x00D81088,
> > +	HwPfPciePcsPllSettingPcieG2           =  0x00D8108C,
> > +	HwPfPciePcsPllSettingPcieG3           =  0x00D81090,
> > +	HwPfPciePcsControl                    =  0x00D81094,
> > +	HwPfPciePcsEqControl                  =  0x00D81098,
> > +	HwPfPciePcsEqTimer                    =  0x00D8109C,
> > +	HwPfPciePcsEqErrStatus                =  0x00D810A0,
> > +	HwPfPciePcsEqErrCount                 =  0x00D810A4,
> > +	HwPfPciePcsStatus                     =  0x00D810A8,
> > +	HwPfPciePcsMiscRegister               =  0x00D810AC,
> > +	HwPfPciePcsObsControl                 =  0x00D810B0,
> > +	HwPfPciePcsPrbsCount0                 =  0x00D81200,
> > +	HwPfPciePcsBistControl0               =  0x00D81204,
> > +	HwPfPciePcsBistStaticWord00           =  0x00D81208,
> > +	HwPfPciePcsBistStaticWord10           =  0x00D8120C,
> > +	HwPfPciePcsBistStaticWord20           =  0x00D81210,
> > +	HwPfPciePcsBistStaticWord30           =  0x00D81214,
> > +	HwPfPciePcsPrbsCount1                 =  0x00D81220,
> > +	HwPfPciePcsBistControl1               =  0x00D81224,
> > +	HwPfPciePcsBistStaticWord01           =  0x00D81228,
> > +	HwPfPciePcsBistStaticWord11           =  0x00D8122C,
> > +	HwPfPciePcsBistStaticWord21           =  0x00D81230,
> > +	HwPfPciePcsBistStaticWord31           =  0x00D81234,
> > +	HwPfPciePcsPrbsCount2                 =  0x00D81240,
> > +	HwPfPciePcsBistControl2               =  0x00D81244,
> > +	HwPfPciePcsBistStaticWord02           =  0x00D81248,
> > +	HwPfPciePcsBistStaticWord12           =  0x00D8124C,
> > +	HwPfPciePcsBistStaticWord22           =  0x00D81250,
> > +	HwPfPciePcsBistStaticWord32           =  0x00D81254,
> > +	HwPfPciePcsPrbsCount3                 =  0x00D81260,
> > +	HwPfPciePcsBistControl3               =  0x00D81264,
> > +	HwPfPciePcsBistStaticWord03           =  0x00D81268,
> > +	HwPfPciePcsBistStaticWord13           =  0x00D8126C,
> > +	HwPfPciePcsBistStaticWord23           =  0x00D81270,
> > +	HwPfPciePcsBistStaticWord33           =  0x00D81274,
> > +	HwPfPcieGpexLtssmStateCntrl           =  0x00D90400,
> > +	HwPfPcieGpexLtssmStateStatus          =  0x00D90404,
> > +	HwPfPcieGpexSkipFreqTimer             =  0x00D90408,
> > +	HwPfPcieGpexLaneSelect                =  0x00D9040C,
> > +	HwPfPcieGpexLaneDeskew                =  0x00D90410,
> > +	HwPfPcieGpexRxErrorStatus             =  0x00D90414,
> > +	HwPfPcieGpexLaneNumControl            =  0x00D90418,
> > +	HwPfPcieGpexNFstControl               =  0x00D9041C,
> > +	HwPfPcieGpexLinkStatus                =  0x00D90420,
> > +	HwPfPcieGpexAckReplayTimeout          =  0x00D90438,
> > +	HwPfPcieGpexSeqNumberStatus           =  0x00D9043C,
> > +	HwPfPcieGpexCoreClkRatio              =  0x00D90440,
> > +	HwPfPcieGpexDllTholdControl           =  0x00D90448,
> > +	HwPfPcieGpexPmTimer                   =  0x00D90450,
> > +	HwPfPcieGpexPmeTimeout                =  0x00D90454,
> > +	HwPfPcieGpexAspmL1Timer               =  0x00D90458,
> > +	HwPfPcieGpexAspmReqTimer              =  0x00D9045C,
> > +	HwPfPcieGpexAspmL1Dis                 =  0x00D90460,
> > +	HwPfPcieGpexAdvisoryErrorControl      =  0x00D90468,
> > +	HwPfPcieGpexId                        =  0x00D90470,
> > +	HwPfPcieGpexClasscode                 =  0x00D90474,
> > +	HwPfPcieGpexSubsystemId               =  0x00D90478,
> > +	HwPfPcieGpexDeviceCapabilities        =  0x00D9047C,
> > +	HwPfPcieGpexLinkCapabilities          =  0x00D90480,
> > +	HwPfPcieGpexFunctionNumber            =  0x00D90484,
> > +	HwPfPcieGpexPmCapabilities            =  0x00D90488,
> > +	HwPfPcieGpexFunctionSelect            =  0x00D9048C,
> > +	HwPfPcieGpexErrorCounter              =  0x00D904AC,
> > +	HwPfPcieGpexConfigReady               =  0x00D904B0,
> > +	HwPfPcieGpexFcUpdateTimeout           =  0x00D904B8,
> > +	HwPfPcieGpexFcUpdateTimer             =  0x00D904BC,
> > +	HwPfPcieGpexVcBufferLoad              =  0x00D904C8,
> > +	HwPfPcieGpexVcBufferSizeThold         =  0x00D904CC,
> > +	HwPfPcieGpexVcBufferSelect            =  0x00D904D0,
> > +	HwPfPcieGpexBarEnable                 =  0x00D904D4,
> > +	HwPfPcieGpexBarDwordLower             =  0x00D904D8,
> > +	HwPfPcieGpexBarDwordUpper             =  0x00D904DC,
> > +	HwPfPcieGpexBarSelect                 =  0x00D904E0,
> > +	HwPfPcieGpexCreditCounterSelect       =  0x00D904E4,
> > +	HwPfPcieGpexCreditCounterStatus       =  0x00D904E8,
> > +	HwPfPcieGpexTlpHeaderSelect           =  0x00D904EC,
> > +	HwPfPcieGpexTlpHeaderDword0           =  0x00D904F0,
> > +	HwPfPcieGpexTlpHeaderDword1           =  0x00D904F4,
> > +	HwPfPcieGpexTlpHeaderDword2           =  0x00D904F8,
> > +	HwPfPcieGpexTlpHeaderDword3           =  0x00D904FC,
> > +	HwPfPcieGpexRelaxOrderControl         =  0x00D90500,
> > +	HwPfPcieGpexBarPrefetch               =  0x00D90504,
> > +	HwPfPcieGpexFcCheckControl            =  0x00D90508,
> > +	HwPfPcieGpexFcUpdateTimerTraffic      =  0x00D90518,
> > +	HwPfPcieGpexPhyControl0               =  0x00D9053C,
> > +	HwPfPcieGpexPhyControl1               =  0x00D90544,
> > +	HwPfPcieGpexPhyControl2               =  0x00D9054C,
> > +	HwPfPcieGpexUserControl0              =  0x00D9055C,
> > +	HwPfPcieGpexUncorrErrorStatus         =  0x00D905F0,
> > +	HwPfPcieGpexRxCplError                =  0x00D90620,
> > +	HwPfPcieGpexRxCplErrorDword0          =  0x00D90624,
> > +	HwPfPcieGpexRxCplErrorDword1          =  0x00D90628,
> > +	HwPfPcieGpexRxCplErrorDword2          =  0x00D9062C,
> > +	HwPfPcieGpexPabSwResetEn              =  0x00D90630,
> > +	HwPfPcieGpexGen3Control0              =  0x00D90634,
> > +	HwPfPcieGpexGen3Control1              =  0x00D90638,
> > +	HwPfPcieGpexGen3Control2              =  0x00D9063C,
> > +	HwPfPcieGpexGen2ControlCsr            =  0x00D90640,
> > +	HwPfPcieGpexTotalVfInitialVf0         =  0x00D90644,
> > +	HwPfPcieGpexTotalVfInitialVf1         =  0x00D90648,
> > +	HwPfPcieGpexSriovLinkDevId0           =  0x00D90684,
> > +	HwPfPcieGpexSriovLinkDevId1           =  0x00D90688,
> > +	HwPfPcieGpexSriovPageSize0            =  0x00D906C4,
> > +	HwPfPcieGpexSriovPageSize1            =  0x00D906C8,
> > +	HwPfPcieGpexIdVersion                 =  0x00D906FC,
> > +	HwPfPcieGpexSriovVfOffsetStride0      =  0x00D90704,
> > +	HwPfPcieGpexSriovVfOffsetStride1      =  0x00D90708,
> > +	HwPfPcieGpexGen3DeskewControl         =  0x00D907B4,
> > +	HwPfPcieGpexGen3EqControl             =  0x00D907B8,
> > +	HwPfPcieGpexBridgeVersion             =  0x00D90800,
> > +	HwPfPcieGpexBridgeCapability          =  0x00D90804,
> > +	HwPfPcieGpexBridgeControl             =  0x00D90808,
> > +	HwPfPcieGpexBridgeStatus              =  0x00D9080C,
> > +	HwPfPcieGpexEngineActivityStatus      =  0x00D9081C,
> > +	HwPfPcieGpexEngineResetControl        =  0x00D90820,
> > +	HwPfPcieGpexAxiPioControl             =  0x00D90840,
> > +	HwPfPcieGpexAxiPioStatus              =  0x00D90844,
> > +	HwPfPcieGpexAmbaSlaveCmdStatus        =  0x00D90848,
> > +	HwPfPcieGpexPexPioControl             =  0x00D908C0,
> > +	HwPfPcieGpexPexPioStatus              =  0x00D908C4,
> > +	HwPfPcieGpexAmbaMasterStatus          =  0x00D908C8,
> > +	HwPfPcieGpexCsrSlaveCmdStatus         =  0x00D90920,
> > +	HwPfPcieGpexMailboxAxiControl         =  0x00D90A50,
> > +	HwPfPcieGpexMailboxAxiData            =  0x00D90A54,
> > +	HwPfPcieGpexMailboxPexControl         =  0x00D90A90,
> > +	HwPfPcieGpexMailboxPexData            =  0x00D90A94,
> > +	HwPfPcieGpexPexInterruptEnable        =  0x00D90AD0,
> > +	HwPfPcieGpexPexInterruptStatus        =  0x00D90AD4,
> > +	HwPfPcieGpexPexInterruptAxiPioVector  =  0x00D90AD8,
> > +	HwPfPcieGpexPexInterruptPexPioVector  =  0x00D90AE0,
> > +	HwPfPcieGpexPexInterruptMiscVector    =  0x00D90AF8,
> > +	HwPfPcieGpexAmbaInterruptPioEnable    =  0x00D90B00,
> > +	HwPfPcieGpexAmbaInterruptMiscEnable   =  0x00D90B0C,
> > +	HwPfPcieGpexAmbaInterruptPioStatus    =  0x00D90B10,
> > +	HwPfPcieGpexAmbaInterruptMiscStatus   =  0x00D90B1C,
> > +	HwPfPcieGpexPexPmControl              =  0x00D90B80,
> > +	HwPfPcieGpexSlotMisc                  =  0x00D90B88,
> > +	HwPfPcieGpexAxiAddrMappingControl     =  0x00D90BA0,
> > +	HwPfPcieGpexAxiAddrMappingWindowAxiBase     =  0x00D90BA4,
> > +	HwPfPcieGpexAxiAddrMappingWindowPexBaseLow  =  0x00D90BA8,
> > +	HwPfPcieGpexAxiAddrMappingWindowPexBaseHigh =  0x00D90BAC,
> > +	HwPfPcieGpexPexBarAddrFunc0Bar0       =  0x00D91BA0,
> > +	HwPfPcieGpexPexBarAddrFunc0Bar1       =  0x00D91BA4,
> > +	HwPfPcieGpexAxiAddrMappingPcieHdrParam =  0x00D95BA0,
> > +	HwPfPcieGpexExtAxiAddrMappingAxiBase  =  0x00D980A0,
> > +	HwPfPcieGpexPexExtBarAddrFunc0Bar0    =  0x00D984A0,
> > +	HwPfPcieGpexPexExtBarAddrFunc0Bar1    =  0x00D984A4,
> > +	HwPfPcieGpexAmbaInterruptFlrEnable    =  0x00D9B960,
> > +	HwPfPcieGpexAmbaInterruptFlrStatus    =  0x00D9B9A0,
> > +	HwPfPcieGpexExtAxiAddrMappingSize     =  0x00D9BAF0,
> > +	HwPfPcieGpexPexPioAwcacheControl      =  0x00D9C300,
> > +	HwPfPcieGpexPexPioArcacheControl      =  0x00D9C304,
> > +	HwPfPcieGpexPabObSizeControlVc0       =  0x00D9C310
> > +};
> 
> Why use an enum here rather than macro definitions?
> 

Well, both would "work". The main reason is that this long enum is automatically generated from the RDL output of the chip design.
Even so, I would argue an enum is cleaner here, as it keeps all these incremental addresses grouped together in one type.
It also helps when debugging, since the symbolic names are kept post compilation alongside the values.
Any concern, or any BKM from other PMDs?

> > +/* TIP PF Interrupt numbers */
> > +enum {
> > +	ACC100_PF_INT_QMGR_AQ_OVERFLOW = 0,
> > +	ACC100_PF_INT_DOORBELL_VF_2_PF = 1,
> > +	ACC100_PF_INT_DMA_DL_DESC_IRQ = 2,
> > +	ACC100_PF_INT_DMA_UL_DESC_IRQ = 3,
> > +	ACC100_PF_INT_DMA_MLD_DESC_IRQ = 4,
> > +	ACC100_PF_INT_DMA_UL5G_DESC_IRQ = 5,
> > +	ACC100_PF_INT_DMA_DL5G_DESC_IRQ = 6,
> > +	ACC100_PF_INT_ILLEGAL_FORMAT = 7,
> > +	ACC100_PF_INT_QMGR_DISABLED_ACCESS = 8,
> > +	ACC100_PF_INT_QMGR_AQ_OVERTHRESHOLD = 9,
> > +	ACC100_PF_INT_ARAM_ACCESS_ERR = 10,
> > +	ACC100_PF_INT_ARAM_ECC_1BIT_ERR = 11,
> > +	ACC100_PF_INT_PARITY_ERR = 12,
> > +	ACC100_PF_INT_QMGR_ERR = 13,
> > +	ACC100_PF_INT_INT_REQ_OVERFLOW = 14,
> > +	ACC100_PF_INT_APB_TIMEOUT = 15,
> > +};
> > +
> > +#endif /* ACC100_PF_ENUM_H */
> > diff --git a/drivers/baseband/acc100/acc100_vf_enum.h b/drivers/baseband/acc100/acc100_vf_enum.h
> > new file mode 100644
> > index 0000000..b512af3
> > --- /dev/null
> > +++ b/drivers/baseband/acc100/acc100_vf_enum.h
> > @@ -0,0 +1,73 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2017 Intel Corporation
> > + */
> > +
> > +#ifndef ACC100_VF_ENUM_H
> > +#define ACC100_VF_ENUM_H
> > +
> > +/*
> > + * ACC100 Register mapping on VF BAR0
> > + * This is automatically generated from RDL, format may change with new RDL
> > +enum {
> > +	HWVfQmgrIngressAq             =  0x00000000,
> > +	HWVfHiVfToPfDbellVf           =  0x00000800,
> > +	HWVfHiPfToVfDbellVf           =  0x00000808,
> > +	HWVfHiInfoRingBaseLoVf        =  0x00000810,
> > +	HWVfHiInfoRingBaseHiVf        =  0x00000814,
> > +	HWVfHiInfoRingPointerVf       =  0x00000818,
> > +	HWVfHiInfoRingIntWrEnVf       =  0x00000820,
> > +	HWVfHiInfoRingPf2VfWrEnVf     =  0x00000824,
> > +	HWVfHiMsixVectorMapperVf      =  0x00000860,
> > +	HWVfDmaFec5GulDescBaseLoRegVf =  0x00000920,
> > +	HWVfDmaFec5GulDescBaseHiRegVf =  0x00000924,
> > +	HWVfDmaFec5GulRespPtrLoRegVf  =  0x00000928,
> > +	HWVfDmaFec5GulRespPtrHiRegVf  =  0x0000092C,
> > +	HWVfDmaFec5GdlDescBaseLoRegVf =  0x00000940,
> > +	HWVfDmaFec5GdlDescBaseHiRegVf =  0x00000944,
> > +	HWVfDmaFec5GdlRespPtrLoRegVf  =  0x00000948,
> > +	HWVfDmaFec5GdlRespPtrHiRegVf  =  0x0000094C,
> > +	HWVfDmaFec4GulDescBaseLoRegVf =  0x00000960,
> > +	HWVfDmaFec4GulDescBaseHiRegVf =  0x00000964,
> > +	HWVfDmaFec4GulRespPtrLoRegVf  =  0x00000968,
> > +	HWVfDmaFec4GulRespPtrHiRegVf  =  0x0000096C,
> > +	HWVfDmaFec4GdlDescBaseLoRegVf =  0x00000980,
> > +	HWVfDmaFec4GdlDescBaseHiRegVf =  0x00000984,
> > +	HWVfDmaFec4GdlRespPtrLoRegVf  =  0x00000988,
> > +	HWVfDmaFec4GdlRespPtrHiRegVf  =  0x0000098C,
> > +	HWVfDmaDdrBaseRangeRoVf       =  0x000009A0,
> > +	HWVfQmgrAqResetVf             =  0x00000E00,
> > +	HWVfQmgrRingSizeVf            =  0x00000E04,
> > +	HWVfQmgrGrpDepthLog20Vf       =  0x00000E08,
> > +	HWVfQmgrGrpDepthLog21Vf       =  0x00000E0C,
> > +	HWVfQmgrGrpFunction0Vf        =  0x00000E10,
> > +	HWVfQmgrGrpFunction1Vf        =  0x00000E14,
> > +	HWVfPmACntrlRegVf             =  0x00000F40,
> > +	HWVfPmACountVf                =  0x00000F48,
> > +	HWVfPmAKCntLoVf               =  0x00000F50,
> > +	HWVfPmAKCntHiVf               =  0x00000F54,
> > +	HWVfPmADeltaCntLoVf           =  0x00000F60,
> > +	HWVfPmADeltaCntHiVf           =  0x00000F64,
> > +	HWVfPmBCntrlRegVf             =  0x00000F80,
> > +	HWVfPmBCountVf                =  0x00000F88,
> > +	HWVfPmBKCntLoVf               =  0x00000F90,
> > +	HWVfPmBKCntHiVf               =  0x00000F94,
> > +	HWVfPmBDeltaCntLoVf           =  0x00000FA0,
> > +	HWVfPmBDeltaCntHiVf           =  0x00000FA4
> > +};
> > +
> > +/* TIP VF Interrupt numbers */
> > +enum {
> > +	ACC100_VF_INT_QMGR_AQ_OVERFLOW = 0,
> > +	ACC100_VF_INT_DOORBELL_VF_2_PF = 1,
> > +	ACC100_VF_INT_DMA_DL_DESC_IRQ = 2,
> > +	ACC100_VF_INT_DMA_UL_DESC_IRQ = 3,
> > +	ACC100_VF_INT_DMA_MLD_DESC_IRQ = 4,
> > +	ACC100_VF_INT_DMA_UL5G_DESC_IRQ = 5,
> > +	ACC100_VF_INT_DMA_DL5G_DESC_IRQ = 6,
> > +	ACC100_VF_INT_ILLEGAL_FORMAT = 7,
> > +	ACC100_VF_INT_QMGR_DISABLED_ACCESS = 8,
> > +	ACC100_VF_INT_QMGR_AQ_OVERTHRESHOLD = 9,
> > +};
> > +
> > +#endif /* ACC100_VF_ENUM_H */
> > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
> > index 6f46df0..cd77570 100644
> > --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> > +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> > @@ -5,6 +5,9 @@
> >  #ifndef _RTE_ACC100_PMD_H_
> >  #define _RTE_ACC100_PMD_H_
> >
> > +#include "acc100_pf_enum.h"
> > +#include "acc100_vf_enum.h"
> > +
> >  /* Helper macro for logging */
> >  #define rte_bbdev_log(level, fmt, ...) \
> >  	rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \
> > @@ -27,6 +30,493 @@
> >  #define RTE_ACC100_PF_DEVICE_ID        (0x0d5c)
> >  #define RTE_ACC100_VF_DEVICE_ID        (0x0d5d)
> >
> > +/* Define as 1 to use only a single FEC engine */
> > +#ifndef RTE_ACC100_SINGLE_FEC
> > +#define RTE_ACC100_SINGLE_FEC 0
> > +#endif
> > +
> > +/* Values used in filling in descriptors */
> > +#define ACC100_DMA_DESC_TYPE           2
> > +#define ACC100_DMA_CODE_BLK_MODE       0
> > +#define ACC100_DMA_BLKID_FCW           1
> > +#define ACC100_DMA_BLKID_IN            2
> > +#define ACC100_DMA_BLKID_OUT_ENC       1
> > +#define ACC100_DMA_BLKID_OUT_HARD      1
> > +#define ACC100_DMA_BLKID_OUT_SOFT      2
> > +#define ACC100_DMA_BLKID_OUT_HARQ      3
> > +#define ACC100_DMA_BLKID_IN_HARQ       3
> > +
> > +/* Values used in filling in decode FCWs */
> > +#define ACC100_FCW_TD_VER              1
> > +#define ACC100_FCW_TD_EXT_COLD_REG_EN  1
> > +#define ACC100_FCW_TD_AUTOMAP          0x0f
> > +#define ACC100_FCW_TD_RVIDX_0          2
> > +#define ACC100_FCW_TD_RVIDX_1          26
> > +#define ACC100_FCW_TD_RVIDX_2          50
> > +#define ACC100_FCW_TD_RVIDX_3          74
> > +
> > +/* Values used in writing to the registers */
> > +#define ACC100_REG_IRQ_EN_ALL          0x1FF83FF  /* Enable all interrupts */
> > +
> > +/* ACC100 Specific Dimensioning */
> > +#define ACC100_SIZE_64MBYTE            (64*1024*1024)
> > +/* Number of elements in an Info Ring */
> > +#define ACC100_INFO_RING_NUM_ENTRIES   1024
> > +/* Number of elements in HARQ layout memory */
> > +#define ACC100_HARQ_LAYOUT             (64*1024*1024)
> > +/* Assume offset for HARQ in memory */
> > +#define ACC100_HARQ_OFFSET             (32*1024)
> > +/* Mask used to calculate an index in an Info Ring array (not a byte offset) */
> > +#define ACC100_INFO_RING_MASK          (ACC100_INFO_RING_NUM_ENTRIES-1)
> > +/* Number of Virtual Functions ACC100 supports */
> > +#define ACC100_NUM_VFS                  16
> > +#define ACC100_NUM_QGRPS                 8
> > +#define ACC100_NUM_QGRPS_PER_WORD        8
> > +#define ACC100_NUM_AQS                  16
> > +#define MAX_ENQ_BATCH_SIZE          255
> > +/* All ACC100 Registers alignment are 32bits = 4B */
> > +#define BYTES_IN_WORD                 4
> > +#define MAX_E_MBUF                64000
> > +
> > +#define GRP_ID_SHIFT    10 /* Queue Index Hierarchy */
> > +#define VF_ID_SHIFT     4  /* Queue Index Hierarchy */
> > +#define VF_OFFSET_QOS   16 /* offset in Memory Space specific to QoS Mon */
> > +#define TMPL_PRI_0      0x03020100
> > +#define TMPL_PRI_1      0x07060504
> > +#define TMPL_PRI_2      0x0b0a0908
> > +#define TMPL_PRI_3      0x0f0e0d0c
> > +#define QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
> > +#define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
> > +
> > +#define ACC100_NUM_TMPL  32
> > +#define VF_OFFSET_QOS 16 /* offset in Memory Space specific to QoS Mon */
> > +/* Mapping of signals for the available engines */
> > +#define SIG_UL_5G      0
> > +#define SIG_UL_5G_LAST 7
> > +#define SIG_DL_5G      13
> > +#define SIG_DL_5G_LAST 15
> > +#define SIG_UL_4G      16
> > +#define SIG_UL_4G_LAST 21
> > +#define SIG_DL_4G      27
> > +#define SIG_DL_4G_LAST 31
> > +
> > +/* max number of iterations to allocate memory block for all rings */
> > +#define SW_RING_MEM_ALLOC_ATTEMPTS 5
> > +#define MAX_QUEUE_DEPTH           1024
> > +#define ACC100_DMA_MAX_NUM_POINTERS  14
> > +#define ACC100_DMA_DESC_PADDING      8
> > +#define ACC100_FCW_PADDING           12
> > +#define ACC100_DESC_FCW_OFFSET       192
> > +#define ACC100_DESC_SIZE             256
> > +#define ACC100_DESC_OFFSET           (ACC100_DESC_SIZE / 64)
> > +#define ACC100_FCW_TE_BLEN     32
> > +#define ACC100_FCW_TD_BLEN     24
> > +#define ACC100_FCW_LE_BLEN     32
> > +#define ACC100_FCW_LD_BLEN     36
> > +
> > +#define ACC100_FCW_VER         2
> > +#define MUX_5GDL_DESC 6
> > +#define CMP_ENC_SIZE 20
> > +#define CMP_DEC_SIZE 24
> > +#define ENC_OFFSET (32)
> > +#define DEC_OFFSET (80)
> > +#define ACC100_EXT_MEM
> > +#define ACC100_HARQ_OFFSET_THRESHOLD 1024
> > +
> > +/* Constants from K0 computation from 3GPP 38.212 Table 5.4.2.1-2 */
> > +#define N_ZC_1 66 /* N = 66 Zc for BG 1 */
> > +#define N_ZC_2 50 /* N = 50 Zc for BG 2 */
> > +#define K0_1_1 17 /* K0 fraction numerator for rv 1 and BG 1 */
> > +#define K0_1_2 13 /* K0 fraction numerator for rv 1 and BG 2 */
> > +#define K0_2_1 33 /* K0 fraction numerator for rv 2 and BG 1 */
> > +#define K0_2_2 25 /* K0 fraction numerator for rv 2 and BG 2 */
> > +#define K0_3_1 56 /* K0 fraction numerator for rv 3 and BG 1 */
> > +#define K0_3_2 43 /* K0 fraction numerator for rv 3 and BG 2 */
> > +
> > +/* ACC100 Configuration */
> > +#define ACC100_DDR_ECC_ENABLE
> > +#define ACC100_CFG_DMA_ERROR 0x3D7
> > +#define ACC100_CFG_AXI_CACHE 0x11
> > +#define ACC100_CFG_QMGR_HI_P 0x0F0F
> > +#define ACC100_CFG_PCI_AXI 0xC003
> > +#define ACC100_CFG_PCI_BRIDGE 0x40006033
> > +#define ACC100_ENGINE_OFFSET 0x1000
> > +#define ACC100_RESET_HI 0x20100
> > +#define ACC100_RESET_LO 0x20000
> > +#define ACC100_RESET_HARD 0x1FF
> > +#define ACC100_ENGINES_MAX 9
> > +#define LONG_WAIT 1000
> > +
> > +/* ACC100 DMA Descriptor triplet */
> > +struct acc100_dma_triplet {
> > +	uint64_t address;
> > +	uint32_t blen:20,
> > +		res0:4,
> > +		last:1,
> > +		dma_ext:1,
> > +		res1:2,
> > +		blkid:4;
> > +} __rte_packed;
> > +
> > +
> > +
> > +/* ACC100 DMA Response Descriptor */
> > +union acc100_dma_rsp_desc {
> > +	uint32_t val;
> > +	struct {
> > +		uint32_t crc_status:1,
> > +			synd_ok:1,
> > +			dma_err:1,
> > +			neg_stop:1,
> > +			fcw_err:1,
> > +			output_err:1,
> > +			input_err:1,
> > +			timestampEn:1,
> > +			iterCountFrac:8,
> > +			iter_cnt:8,
> > +			rsrvd3:6,
> > +			sdone:1,
> > +			fdone:1;
> > +		uint32_t add_info_0;
> > +		uint32_t add_info_1;
> > +	};
> > +};
> > +
> > +
> > +/* ACC100 Queue Manager Enqueue PCI Register */
> > +union acc100_enqueue_reg_fmt {
> > +	uint32_t val;
> > +	struct {
> > +		uint32_t num_elem:8,
> > +			addr_offset:3,
> > +			rsrvd:1,
> > +			req_elem_addr:20;
> > +	};
> > +};
> > +
> > +/* FEC 4G Uplink Frame Control Word */
> > +struct __rte_packed acc100_fcw_td {
> > +	uint8_t fcw_ver:4,
> > +		num_maps:4; /* Unused */
> > +	uint8_t filler:6, /* Unused */
> > +		rsrvd0:1,
> > +		bypass_sb_deint:1;
> > +	uint16_t k_pos;
> > +	uint16_t k_neg; /* Unused */
> > +	uint8_t c_neg; /* Unused */
> > +	uint8_t c; /* Unused */
> > +	uint32_t ea; /* Unused */
> > +	uint32_t eb; /* Unused */
> > +	uint8_t cab; /* Unused */
> > +	uint8_t k0_start_col; /* Unused */
> > +	uint8_t rsrvd1;
> > +	uint8_t code_block_mode:1, /* Unused */
> > +		turbo_crc_type:1,
> > +		rsrvd2:3,
> > +		bypass_teq:1, /* Unused */
> > +		soft_output_en:1, /* Unused */
> > +		ext_td_cold_reg_en:1;
> > +	union { /* External Cold register */
> > +		uint32_t ext_td_cold_reg;
> > +		struct {
> > +			uint32_t min_iter:4, /* Unused */
> > +				max_iter:4,
> > +				ext_scale:5, /* Unused */
> > +				rsrvd3:3,
> > +				early_stop_en:1, /* Unused */
> > +				sw_soft_out_dis:1, /* Unused */
> > +				sw_et_cont:1, /* Unused */
> > +				sw_soft_out_saturation:1, /* Unused */
> > +				half_iter_on:1, /* Unused */
> > +				raw_decoder_input_on:1, /* Unused */
> > +				rsrvd4:10;
> > +		};
> > +	};
> > +};
> > +
> > +/* FEC 5GNR Uplink Frame Control Word */
> > +struct __rte_packed acc100_fcw_ld {
> > +	uint32_t FCWversion:4,
> > +		qm:4,
> > +		nfiller:11,
> > +		BG:1,
> > +		Zc:9,
> > +		res0:1,
> > +		synd_precoder:1,
> > +		synd_post:1;
> > +	uint32_t ncb:16,
> > +		k0:16;
> > +	uint32_t rm_e:24,
> > +		hcin_en:1,
> > +		hcout_en:1,
> > +		crc_select:1,
> > +		bypass_dec:1,
> > +		bypass_intlv:1,
> > +		so_en:1,
> > +		so_bypass_rm:1,
> > +		so_bypass_intlv:1;
> > +	uint32_t hcin_offset:16,
> > +		hcin_size0:16;
> > +	uint32_t hcin_size1:16,
> > +		hcin_decomp_mode:3,
> > +		llr_pack_mode:1,
> > +		hcout_comp_mode:3,
> > +		res2:1,
> > +		dec_convllr:4,
> > +		hcout_convllr:4;
> > +	uint32_t itmax:7,
> > +		itstop:1,
> > +		so_it:7,
> > +		res3:1,
> > +		hcout_offset:16;
> > +	uint32_t hcout_size0:16,
> > +		hcout_size1:16;
> > +	uint32_t gain_i:8,
> > +		gain_h:8,
> > +		negstop_th:16;
> > +	uint32_t negstop_it:7,
> > +		negstop_en:1,
> > +		res4:24;
> > +};
> > +
> > +/* FEC 4G Downlink Frame Control Word */
> > +struct __rte_packed acc100_fcw_te {
> > +	uint16_t k_neg;
> > +	uint16_t k_pos;
> > +	uint8_t c_neg;
> > +	uint8_t c;
> > +	uint8_t filler;
> > +	uint8_t cab;
> > +	uint32_t ea:17,
> > +		rsrvd0:15;
> > +	uint32_t eb:17,
> > +		rsrvd1:15;
> > +	uint16_t ncb_neg;
> > +	uint16_t ncb_pos;
> > +	uint8_t rv_idx0:2,
> > +		rsrvd2:2,
> > +		rv_idx1:2,
> > +		rsrvd3:2;
> > +	uint8_t bypass_rv_idx0:1,
> > +		bypass_rv_idx1:1,
> > +		bypass_rm:1,
> > +		rsrvd4:5;
> > +	uint8_t rsrvd5:1,
> > +		rsrvd6:3,
> > +		code_block_crc:1,
> > +		rsrvd7:3;
> > +	uint8_t code_block_mode:1,
> > +		rsrvd8:7;
> > +	uint64_t rsrvd9;
> > +};
> > +
> > +/* FEC 5GNR Downlink Frame Control Word */
> > +struct __rte_packed acc100_fcw_le {
> > +	uint32_t FCWversion:4,
> > +		qm:4,
> > +		nfiller:11,
> > +		BG:1,
> > +		Zc:9,
> > +		res0:3;
> > +	uint32_t ncb:16,
> > +		k0:16;
> > +	uint32_t rm_e:24,
> > +		res1:2,
> > +		crc_select:1,
> > +		res2:1,
> > +		bypass_intlv:1,
> > +		res3:3;
> > +	uint32_t res4_a:12,
> > +		mcb_count:3,
> > +		res4_b:17;
> > +	uint32_t res5;
> > +	uint32_t res6;
> > +	uint32_t res7;
> > +	uint32_t res8;
> > +};
> > +
> > +/* ACC100 DMA Request Descriptor */
> > +struct __rte_packed acc100_dma_req_desc {
> > +	union {
> > +		struct{
> > +			uint32_t type:4,
> > +				rsrvd0:26,
> > +				sdone:1,
> > +				fdone:1;
> > +			uint32_t rsrvd1;
> > +			uint32_t rsrvd2;
> > +			uint32_t pass_param:8,
> > +				sdone_enable:1,
> > +				irq_enable:1,
> > +				timeStampEn:1,
> > +				res0:5,
> > +				numCBs:4,
> > +				res1:4,
> > +				m2dlen:4,
> > +				d2mlen:4;
> > +		};
> > +		struct{
> > +			uint32_t word0;
> > +			uint32_t word1;
> > +			uint32_t word2;
> > +			uint32_t word3;
> > +		};
> > +	};
> > +	struct acc100_dma_triplet data_ptrs[ACC100_DMA_MAX_NUM_POINTERS];
> > +
> > +	/* Virtual addresses used to retrieve SW context info */
> > +	union {
> > +		void *op_addr;
> > +		uint64_t pad1;  /* pad to 64 bits */
> > +	};
> > +	/*
> > +	 * Stores additional information needed for driver processing:
> > +	 * - last_desc_in_batch - flag used to mark last descriptor (CB)
> > +	 *                        in batch
> > +	 * - cbs_in_tb - stores information about total number of Code Blocks
> > +	 *               in currently processed Transport Block
> > +	 */
> > +	union {
> > +		struct {
> > +			union {
> > +				struct acc100_fcw_ld fcw_ld;
> > +				struct acc100_fcw_td fcw_td;
> > +				struct acc100_fcw_le fcw_le;
> > +				struct acc100_fcw_te fcw_te;
> > +				uint32_t pad2[ACC100_FCW_PADDING];
> > +			};
> > +			uint32_t last_desc_in_batch :8,
> > +				cbs_in_tb:8,
> > +				pad4 : 16;
> > +		};
> > +		uint64_t pad3[ACC100_DMA_DESC_PADDING]; /* pad to 64 bits */
> > +	};
> > +};
> > +
> > +/* ACC100 DMA Descriptor */
> > +union acc100_dma_desc {
> > +	struct acc100_dma_req_desc req;
> > +	union acc100_dma_rsp_desc rsp;
> > +};
> > +
> > +
> > +/* Union describing HARQ layout entry */
> > +union acc100_harq_layout_data {
> > +	uint32_t val;
> > +	struct {
> > +		uint16_t offset;
> > +		uint16_t size0;
> > +	};
> > +} __rte_packed;
> > +
> > +
> > +/* Union describing Info Ring entry */
> > +union acc100_info_ring_data {
> > +	uint32_t val;
> > +	struct {
> > +		union {
> > +			uint16_t detailed_info;
> > +			struct {
> > +				uint16_t aq_id: 4;
> > +				uint16_t qg_id: 4;
> > +				uint16_t vf_id: 6;
> > +				uint16_t reserved: 2;
> > +			};
> > +		};
> > +		uint16_t int_nb: 7;
> > +		uint16_t msi_0: 1;
> > +		uint16_t vf2pf: 6;
> > +		uint16_t loop: 1;
> > +		uint16_t valid: 1;
> > +	};
> > +} __rte_packed;
> > +
> > +struct acc100_registry_addr {
> > +	unsigned int dma_ring_dl5g_hi;
> > +	unsigned int dma_ring_dl5g_lo;
> > +	unsigned int dma_ring_ul5g_hi;
> > +	unsigned int dma_ring_ul5g_lo;
> > +	unsigned int dma_ring_dl4g_hi;
> > +	unsigned int dma_ring_dl4g_lo;
> > +	unsigned int dma_ring_ul4g_hi;
> > +	unsigned int dma_ring_ul4g_lo;
> > +	unsigned int ring_size;
> > +	unsigned int info_ring_hi;
> > +	unsigned int info_ring_lo;
> > +	unsigned int info_ring_en;
> > +	unsigned int info_ring_ptr;
> > +	unsigned int tail_ptrs_dl5g_hi;
> > +	unsigned int tail_ptrs_dl5g_lo;
> > +	unsigned int tail_ptrs_ul5g_hi;
> > +	unsigned int tail_ptrs_ul5g_lo;
> > +	unsigned int tail_ptrs_dl4g_hi;
> > +	unsigned int tail_ptrs_dl4g_lo;
> > +	unsigned int tail_ptrs_ul4g_hi;
> > +	unsigned int tail_ptrs_ul4g_lo;
> > +	unsigned int depth_log0_offset;
> > +	unsigned int depth_log1_offset;
> > +	unsigned int qman_group_func;
> > +	unsigned int ddr_range;
> > +};
> > +
> > +/* Structure holding registry addresses for PF */
> > +static const struct acc100_registry_addr pf_reg_addr = {
> > +	.dma_ring_dl5g_hi = HWPfDmaFec5GdlDescBaseHiRegVf,
> > +	.dma_ring_dl5g_lo = HWPfDmaFec5GdlDescBaseLoRegVf,
> > +	.dma_ring_ul5g_hi = HWPfDmaFec5GulDescBaseHiRegVf,
> > +	.dma_ring_ul5g_lo = HWPfDmaFec5GulDescBaseLoRegVf,
> > +	.dma_ring_dl4g_hi = HWPfDmaFec4GdlDescBaseHiRegVf,
> > +	.dma_ring_dl4g_lo = HWPfDmaFec4GdlDescBaseLoRegVf,
> > +	.dma_ring_ul4g_hi = HWPfDmaFec4GulDescBaseHiRegVf,
> > +	.dma_ring_ul4g_lo = HWPfDmaFec4GulDescBaseLoRegVf,
> > +	.ring_size = HWPfQmgrRingSizeVf,
> > +	.info_ring_hi = HWPfHiInfoRingBaseHiRegPf,
> > +	.info_ring_lo = HWPfHiInfoRingBaseLoRegPf,
> > +	.info_ring_en = HWPfHiInfoRingIntWrEnRegPf,
> > +	.info_ring_ptr = HWPfHiInfoRingPointerRegPf,
> > +	.tail_ptrs_dl5g_hi = HWPfDmaFec5GdlRespPtrHiRegVf,
> > +	.tail_ptrs_dl5g_lo = HWPfDmaFec5GdlRespPtrLoRegVf,
> > +	.tail_ptrs_ul5g_hi = HWPfDmaFec5GulRespPtrHiRegVf,
> > +	.tail_ptrs_ul5g_lo = HWPfDmaFec5GulRespPtrLoRegVf,
> > +	.tail_ptrs_dl4g_hi = HWPfDmaFec4GdlRespPtrHiRegVf,
> > +	.tail_ptrs_dl4g_lo = HWPfDmaFec4GdlRespPtrLoRegVf,
> > +	.tail_ptrs_ul4g_hi = HWPfDmaFec4GulRespPtrHiRegVf,
> > +	.tail_ptrs_ul4g_lo = HWPfDmaFec4GulRespPtrLoRegVf,
> > +	.depth_log0_offset = HWPfQmgrGrpDepthLog20Vf,
> > +	.depth_log1_offset = HWPfQmgrGrpDepthLog21Vf,
> > +	.qman_group_func = HWPfQmgrGrpFunction0,
> > +	.ddr_range = HWPfDmaVfDdrBaseRw,
> > +};
> > +
> > +/* Structure holding registry addresses for VF */
> > +static const struct acc100_registry_addr vf_reg_addr = {
> > +	.dma_ring_dl5g_hi = HWVfDmaFec5GdlDescBaseHiRegVf,
> > +	.dma_ring_dl5g_lo = HWVfDmaFec5GdlDescBaseLoRegVf,
> > +	.dma_ring_ul5g_hi = HWVfDmaFec5GulDescBaseHiRegVf,
> > +	.dma_ring_ul5g_lo = HWVfDmaFec5GulDescBaseLoRegVf,
> > +	.dma_ring_dl4g_hi = HWVfDmaFec4GdlDescBaseHiRegVf,
> > +	.dma_ring_dl4g_lo = HWVfDmaFec4GdlDescBaseLoRegVf,
> > +	.dma_ring_ul4g_hi = HWVfDmaFec4GulDescBaseHiRegVf,
> > +	.dma_ring_ul4g_lo = HWVfDmaFec4GulDescBaseLoRegVf,
> > +	.ring_size = HWVfQmgrRingSizeVf,
> > +	.info_ring_hi = HWVfHiInfoRingBaseHiVf,
> > +	.info_ring_lo = HWVfHiInfoRingBaseLoVf,
> > +	.info_ring_en = HWVfHiInfoRingIntWrEnVf,
> > +	.info_ring_ptr = HWVfHiInfoRingPointerVf,
> > +	.tail_ptrs_dl5g_hi = HWVfDmaFec5GdlRespPtrHiRegVf,
> > +	.tail_ptrs_dl5g_lo = HWVfDmaFec5GdlRespPtrLoRegVf,
> > +	.tail_ptrs_ul5g_hi = HWVfDmaFec5GulRespPtrHiRegVf,
> > +	.tail_ptrs_ul5g_lo = HWVfDmaFec5GulRespPtrLoRegVf,
> > +	.tail_ptrs_dl4g_hi = HWVfDmaFec4GdlRespPtrHiRegVf,
> > +	.tail_ptrs_dl4g_lo = HWVfDmaFec4GdlRespPtrLoRegVf,
> > +	.tail_ptrs_ul4g_hi = HWVfDmaFec4GulRespPtrHiRegVf,
> > +	.tail_ptrs_ul4g_lo = HWVfDmaFec4GulRespPtrLoRegVf,
> > +	.depth_log0_offset = HWVfQmgrGrpDepthLog20Vf,
> > +	.depth_log1_offset = HWVfQmgrGrpDepthLog21Vf,
> > +	.qman_group_func = HWVfQmgrGrpFunction0Vf,
> > +	.ddr_range = HWVfDmaDdrBaseRangeRoVf,
> > +};
> > +
> >  /* Private data structure for each ACC100 device */
> >  struct acc100_device {
> >  	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
> > --
> > 1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v3 04/11] baseband/acc100: add queue configuration
  2020-08-29 10:39   ` Xu, Rosen
@ 2020-08-29 17:48     ` Chautru, Nicolas
  2020-09-03  2:30       ` Xu, Rosen
  0 siblings, 1 reply; 213+ messages in thread
From: Chautru, Nicolas @ 2020-08-29 17:48 UTC (permalink / raw)
  To: Xu, Rosen, dev, akhil.goyal; +Cc: Richardson, Bruce

Hi, 

> From: Xu, Rosen <rosen.xu@intel.com>
> 
> Hi,
> 
> > -----Original Message-----
> > From: dev <dev-bounces@dpdk.org> On Behalf Of Nicolas Chautru
> > Sent: Wednesday, August 19, 2020 8:25
> > To: dev@dpdk.org; akhil.goyal@nxp.com
> > Cc: Richardson, Bruce <bruce.richardson@intel.com>; Chautru, Nicolas
> > <nicolas.chautru@intel.com>
> > Subject: [dpdk-dev] [PATCH v3 04/11] baseband/acc100: add queue
> > configuration
> >
> > Adding function to create and configure queues for the device. Still
> > no capability.
> >
> > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > ---
> >  drivers/baseband/acc100/rte_acc100_pmd.c | 420
> > ++++++++++++++++++++++++++++++-
> > drivers/baseband/acc100/rte_acc100_pmd.h |  45 ++++
> >  2 files changed, 464 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> > b/drivers/baseband/acc100/rte_acc100_pmd.c
> > index 7807a30..7a21c57 100644
> > --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> > +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> > @@ -26,6 +26,22 @@
> >  RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);  #endif
> >
> > +/* Write to MMIO register address */
> > +static inline void
> > +mmio_write(void *addr, uint32_t value)
> > +{
> > +	*((volatile uint32_t *)(addr)) = rte_cpu_to_le_32(value);
> > +}
> > +
> > +/* Write a register of an ACC100 device */
> > +static inline void
> > +acc100_reg_write(struct acc100_device *d, uint32_t offset, uint32_t payload)
> > +{
> > +	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
> > +	mmio_write(reg_addr, payload);
> > +	usleep(1000);
> > +}
> > +
> >  /* Read a register of a ACC100 device */
> >  static inline uint32_t
> >  acc100_reg_read(struct acc100_device *d, uint32_t offset)
> > @@ -36,6 +52,22 @@
> >  	return rte_le_to_cpu_32(ret);
> >  }
> >
> > +/* Basic implementation of Log2 for exact 2^N */
> > +static inline uint32_t
> > +log2_basic(uint32_t value)
> > +{
> > +	return (value == 0) ? 0 : __builtin_ctz(value);
> > +}
> > +
> > +/* Calculate memory alignment offset assuming alignment is 2^N */
> > +static inline uint32_t
> > +calc_mem_alignment_offset(void *unaligned_virt_mem, uint32_t alignment)
> > +{
> > +	rte_iova_t unaligned_phy_mem = rte_malloc_virt2iova(unaligned_virt_mem);
> > +	return (uint32_t)(alignment - (unaligned_phy_mem & (alignment-1)));
> > +}
> > +
> >  /* Calculate the offset of the enqueue register */
> >  static inline uint32_t
> >  queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
> > @@ -204,10 +236,393 @@
> >  			acc100_conf->q_dl_5g.aq_depth_log2);
> >  }
> >
> > +static void
> > +free_base_addresses(void **base_addrs, int size)
> > +{
> > +	int i;
> > +	for (i = 0; i < size; i++)
> > +		rte_free(base_addrs[i]);
> > +}
> > +
> > +static inline uint32_t
> > +get_desc_len(void)
> > +{
> > +	return sizeof(union acc100_dma_desc);
> > +}
> > +
> > +/* Allocate the 2 * 64MB block for the sw rings */
> > +static int
> > +alloc_2x64mb_sw_rings_mem(struct rte_bbdev *dev, struct acc100_device *d,
> > +		int socket)
> > +{
> > +	uint32_t sw_ring_size = ACC100_SIZE_64MBYTE;
> > +	d->sw_rings_base = rte_zmalloc_socket(dev->device->driver->name,
> > +			2 * sw_ring_size, RTE_CACHE_LINE_SIZE, socket);
> > +	if (d->sw_rings_base == NULL) {
> > +		rte_bbdev_log(ERR, "Failed to allocate memory for %s:%u",
> > +				dev->device->driver->name,
> > +				dev->data->dev_id);
> > +		return -ENOMEM;
> > +	}
> > +	memset(d->sw_rings_base, 0, ACC100_SIZE_64MBYTE);
> > +	uint32_t next_64mb_align_offset = calc_mem_alignment_offset(
> > +			d->sw_rings_base, ACC100_SIZE_64MBYTE);
> > +	d->sw_rings = RTE_PTR_ADD(d->sw_rings_base,
> > next_64mb_align_offset);
> > +	d->sw_rings_phys = rte_malloc_virt2iova(d->sw_rings_base) +
> > +			next_64mb_align_offset;
> > +	d->sw_ring_size = MAX_QUEUE_DEPTH * get_desc_len();
> > +	d->sw_ring_max_depth = d->sw_ring_size / get_desc_len();
> > +
> > +	return 0;
> > +}
> 
> Why not a common alloc memory function but special function for different
> memory size?

This is a bit convoluted, but it comes from the fact that the first method, which is optimal (minimum footprint), may not always find suitably aligned memory.

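For reference, the alignment arithmetic itself is small; below is a standalone sketch in plain C (the function name and the use of a raw physical address are illustrative, not the driver's API) mirroring the alignment - (addr & (alignment - 1)) expression in calc_mem_alignment_offset() above:

```c
#include <stdint.h>

/* Distance from phys_addr up to the next 2^N alignment boundary.
 * As in the driver code, an already-aligned address yields the full
 * alignment value (not zero): the offset always points at the *next*
 * boundary, never at the address itself. */
static uint32_t align_offset(uint64_t phys_addr, uint32_t alignment)
{
	return (uint32_t)(alignment - (phys_addr & (alignment - 1ULL)));
}
```

With a 64-byte alignment, address 0x101 is 63 bytes short of the next boundary (0x140), while 0x140 itself reports a full 64 bytes up to 0x180.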

> 
> > +/* Attempt to allocate minimised memory space for sw rings */
> > +static void
> > +alloc_sw_rings_min_mem(struct rte_bbdev *dev, struct acc100_device *d,
> > +		uint16_t num_queues, int socket)
> > +{
> > +	rte_iova_t sw_rings_base_phy, next_64mb_align_addr_phy;
> > +	uint32_t next_64mb_align_offset;
> > +	rte_iova_t sw_ring_phys_end_addr;
> > +	void *base_addrs[SW_RING_MEM_ALLOC_ATTEMPTS];
> > +	void *sw_rings_base;
> > +	int i = 0;
> > +	uint32_t q_sw_ring_size = MAX_QUEUE_DEPTH * get_desc_len();
> > +	uint32_t dev_sw_ring_size = q_sw_ring_size * num_queues;
> > +
> > +	/* Find an aligned block of memory to store sw rings */
> > +	while (i < SW_RING_MEM_ALLOC_ATTEMPTS) {
> > +		/*
> > +		 * sw_ring allocated memory is guaranteed to be aligned to
> > +		 * q_sw_ring_size at the condition that the requested size is
> > +		 * less than the page size
> > +		 */
> > +		sw_rings_base = rte_zmalloc_socket(
> > +				dev->device->driver->name,
> > +				dev_sw_ring_size, q_sw_ring_size, socket);
> > +
> > +		if (sw_rings_base == NULL) {
> > +			rte_bbdev_log(ERR,
> > +					"Failed to allocate memory for %s:%u",
> > +					dev->device->driver->name,
> > +					dev->data->dev_id);
> > +			break;
> > +		}
> > +
> > +		sw_rings_base_phy = rte_malloc_virt2iova(sw_rings_base);
> > +		next_64mb_align_offset = calc_mem_alignment_offset(
> > +				sw_rings_base, ACC100_SIZE_64MBYTE);
> > +		next_64mb_align_addr_phy = sw_rings_base_phy +
> > +				next_64mb_align_offset;
> > +		sw_ring_phys_end_addr = sw_rings_base_phy +
> > dev_sw_ring_size;
> > +
> > +		/* Check if the end of the sw ring memory block is before the
> > +		 * start of next 64MB aligned mem address
> > +		 */
> > +		if (sw_ring_phys_end_addr < next_64mb_align_addr_phy) {
> > +			d->sw_rings_phys = sw_rings_base_phy;
> > +			d->sw_rings = sw_rings_base;
> > +			d->sw_rings_base = sw_rings_base;
> > +			d->sw_ring_size = q_sw_ring_size;
> > +			d->sw_ring_max_depth = MAX_QUEUE_DEPTH;
> > +			break;
> > +		}
> > +		/* Store the address of the unaligned mem block */
> > +		base_addrs[i] = sw_rings_base;
> > +		i++;
> > +	}
> > +
> > +	/* Free all unaligned blocks of mem allocated in the loop */
> > +	free_base_addresses(base_addrs, i);
> > +}
> 
> It's strange to firstly alloc memory and then free memory but on operations on
> this memory.

I may be missing your point. We are freeing the exact same memory we got from rte_zmalloc.
Note that the base_addrs array refers to multiple malloc attempts, not multiple operations on a ring.
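
To make the pattern concrete, here is a hypothetical standalone sketch in plain C: malloc stands in for rte_zmalloc_socket and all names are illustrative. Each rejected attempt is remembered so it can be freed once a block that does not straddle the next aligned boundary has been found (or all attempts fail):

```c
#include <stdint.h>
#include <stdlib.h>

#define ALLOC_ATTEMPTS 8

/* Try a handful of allocations; keep the first block that ends before
 * the next 2^N aligned boundary, then free every earlier reject. */
static void *alloc_nonstraddling(size_t size, uintptr_t alignment)
{
	void *rejects[ALLOC_ATTEMPTS];
	void *result = NULL;
	int i, n = 0;

	for (i = 0; i < ALLOC_ATTEMPTS; i++) {
		void *p = malloc(size);
		if (p == NULL)
			break;
		uintptr_t start = (uintptr_t)p;
		uintptr_t boundary = start +
				(alignment - (start & (alignment - 1)));
		if (start + size < boundary) {
			result = p; /* fits below the boundary: keep it */
			break;
		}
		rejects[n++] = p; /* straddling attempt, freed below */
	}
	while (n--)
		free(rejects[n]);
	return result;
}
```

The frees only ever touch the rejected attempts, never the block that is returned.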

> 
> > +
> > +/* Allocate 64MB memory used for all software rings */
> > +static int
> > +acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
> > +{
> > +	uint32_t phys_low, phys_high, payload;
> > +	struct acc100_device *d = dev->data->dev_private;
> > +	const struct acc100_registry_addr *reg_addr;
> > +
> > +	if (d->pf_device && !d->acc100_conf.pf_mode_en) {
> > +		rte_bbdev_log(NOTICE,
> > +				"%s has PF mode disabled. This PF can't be used.",
> > +				dev->data->name);
> > +		return -ENODEV;
> > +	}
> > +
> > +	alloc_sw_rings_min_mem(dev, d, num_queues, socket_id);
> > +
> > +	/* If minimal memory space approach failed, then allocate
> > +	 * the 2 * 64MB block for the sw rings
> > +	 */
> > +	if (d->sw_rings == NULL)
> > +		alloc_2x64mb_sw_rings_mem(dev, d, socket_id);
> > +
> > +	/* Configure ACC100 with the base address for DMA descriptor rings
> > +	 * Same descriptor rings used for UL and DL DMA Engines
> > +	 * Note : Assuming only VF0 bundle is used for PF mode
> > +	 */
> > +	phys_high = (uint32_t)(d->sw_rings_phys >> 32);
> > +	phys_low  = (uint32_t)(d->sw_rings_phys & ~(ACC100_SIZE_64MBYTE-1));
> > +
> > +	/* Choose correct registry addresses for the device type */
> > +	if (d->pf_device)
> > +		reg_addr = &pf_reg_addr;
> > +	else
> > +		reg_addr = &vf_reg_addr;
> > +
> > +	/* Read the populated cfg from ACC100 registers */
> > +	fetch_acc100_config(dev);
> > +
> > +	/* Mark as configured properly */
> > +	d->configured = true;
> > +
> > +	/* Release AXI from PF */
> > +	if (d->pf_device)
> > +		acc100_reg_write(d, HWPfDmaAxiControl, 1);
> > +
> > +	acc100_reg_write(d, reg_addr->dma_ring_ul5g_hi, phys_high);
> > +	acc100_reg_write(d, reg_addr->dma_ring_ul5g_lo, phys_low);
> > +	acc100_reg_write(d, reg_addr->dma_ring_dl5g_hi, phys_high);
> > +	acc100_reg_write(d, reg_addr->dma_ring_dl5g_lo, phys_low);
> > +	acc100_reg_write(d, reg_addr->dma_ring_ul4g_hi, phys_high);
> > +	acc100_reg_write(d, reg_addr->dma_ring_ul4g_lo, phys_low);
> > +	acc100_reg_write(d, reg_addr->dma_ring_dl4g_hi, phys_high);
> > +	acc100_reg_write(d, reg_addr->dma_ring_dl4g_lo, phys_low);
> > +
> > +	/*
> > +	 * Configure Ring Size to the max queue ring size
> > +	 * (used for wrapping purpose)
> > +	 */
> > +	payload = log2_basic(d->sw_ring_size / 64);
> > +	acc100_reg_write(d, reg_addr->ring_size, payload);
> > +
> > +	/* Configure tail pointer for use when SDONE enabled */
> > +	d->tail_ptrs = rte_zmalloc_socket(
> > +			dev->device->driver->name,
> > +			ACC100_NUM_QGRPS * ACC100_NUM_AQS * sizeof(uint32_t),
> > +			RTE_CACHE_LINE_SIZE, socket_id);
> > +	if (d->tail_ptrs == NULL) {
> > +		rte_bbdev_log(ERR, "Failed to allocate tail ptr for %s:%u",
> > +				dev->device->driver->name,
> > +				dev->data->dev_id);
> > +		rte_free(d->sw_rings);
> > +		return -ENOMEM;
> > +	}
> > +	d->tail_ptr_phys = rte_malloc_virt2iova(d->tail_ptrs);
> > +
> > +	phys_high = (uint32_t)(d->tail_ptr_phys >> 32);
> > +	phys_low  = (uint32_t)(d->tail_ptr_phys);
> > +	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_hi, phys_high);
> > +	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_lo, phys_low);
> > +	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_hi, phys_high);
> > +	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_lo, phys_low);
> > +	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_hi, phys_high);
> > +	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_lo, phys_low);
> > +	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
> > +	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
> > +
> > +	d->harq_layout = rte_zmalloc_socket("HARQ Layout",
> > +			ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
> > +			RTE_CACHE_LINE_SIZE, dev->data->socket_id);
> > +
> > +	rte_bbdev_log_debug(
> > +			"ACC100 (%s) configured  sw_rings = %p, sw_rings_phys = %#"
> > +			PRIx64, dev->data->name, d->sw_rings, d->sw_rings_phys);
> > +
> > +	return 0;
> > +}
> > +
> >  /* Free 64MB memory used for software rings */
> >  static int
> > -acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
> > +acc100_dev_close(struct rte_bbdev *dev)
> >  {
> > +	struct acc100_device *d = dev->data->dev_private;
> > +	if (d->sw_rings_base != NULL) {
> > +		rte_free(d->tail_ptrs);
> > +		rte_free(d->sw_rings_base);
> > +		d->sw_rings_base = NULL;
> > +	}
> > +	usleep(1000);
> > +	return 0;
> > +}
> > +
> > +
> > +/**
> > + * Report an ACC100 queue index which is free
> > + * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
> > + * Note : Only supporting VF0 Bundle for PF mode
> > + */
> > +static int
> > +acc100_find_free_queue_idx(struct rte_bbdev *dev,
> > +		const struct rte_bbdev_queue_conf *conf)
> > +{
> > +	struct acc100_device *d = dev->data->dev_private;
> > +	int op_2_acc[5] = {0, UL_4G, DL_4G, UL_5G, DL_5G};
> > +	int acc = op_2_acc[conf->op_type];
> > +	struct rte_q_topology_t *qtop = NULL;
> > +	qtopFromAcc(&qtop, acc, &(d->acc100_conf));
> > +	if (qtop == NULL)
> > +		return -1;
> > +	/* Identify matching QGroup Index which are sorted in priority order
> > */
> > +	uint16_t group_idx = qtop->first_qgroup_index;
> > +	group_idx += conf->priority;
> > +	if (group_idx >= ACC100_NUM_QGRPS ||
> > +			conf->priority >= qtop->num_qgroups) {
> > +		rte_bbdev_log(INFO, "Invalid Priority on %s, priority %u",
> > +				dev->data->name, conf->priority);
> > +		return -1;
> > +	}
> > +	/* Find a free AQ_idx  */
> > +	uint16_t aq_idx;
> > +	for (aq_idx = 0; aq_idx < qtop->num_aqs_per_groups; aq_idx++) {
> > +		if (((d->q_assigned_bit_map[group_idx] >> aq_idx) & 0x1)
> > == 0) {
> > +			/* Mark the Queue as assigned */
> > +			d->q_assigned_bit_map[group_idx] |= (1 << aq_idx);
> > +			/* Report the AQ Index */
> > +			return (group_idx << GRP_ID_SHIFT) + aq_idx;
> > +		}
> > +	}
> > +	rte_bbdev_log(INFO, "Failed to find free queue on %s, priority %u",
> > +			dev->data->name, conf->priority);
> > +	return -1;
> > +}
> > +
> > +/* Setup ACC100 queue */
> > +static int
> > +acc100_queue_setup(struct rte_bbdev *dev, uint16_t queue_id,
> > +		const struct rte_bbdev_queue_conf *conf)
> > +{
> > +	struct acc100_device *d = dev->data->dev_private;
> > +	struct acc100_queue *q;
> > +	int16_t q_idx;
> > +
> > +	/* Allocate the queue data structure. */
> > +	q = rte_zmalloc_socket(dev->device->driver->name, sizeof(*q),
> > +			RTE_CACHE_LINE_SIZE, conf->socket);
> > +	if (q == NULL) {
> > +		rte_bbdev_log(ERR, "Failed to allocate queue memory");
> > +		return -ENOMEM;
> > +	}
> > +
> > +	q->d = d;
> > +	q->ring_addr = RTE_PTR_ADD(d->sw_rings, (d->sw_ring_size * queue_id));
> > +	q->ring_addr_phys = d->sw_rings_phys + (d->sw_ring_size * queue_id);
> > +
> > +	/* Prepare the Ring with default descriptor format */
> > +	union acc100_dma_desc *desc = NULL;
> > +	unsigned int desc_idx, b_idx;
> > +	int fcw_len = (conf->op_type == RTE_BBDEV_OP_LDPC_ENC ?
> > +		ACC100_FCW_LE_BLEN : (conf->op_type == RTE_BBDEV_OP_TURBO_DEC ?
> > +		ACC100_FCW_TD_BLEN : ACC100_FCW_LD_BLEN));
> > +
> > +	for (desc_idx = 0; desc_idx < d->sw_ring_max_depth; desc_idx++) {
> > +		desc = q->ring_addr + desc_idx;
> > +		desc->req.word0 = ACC100_DMA_DESC_TYPE;
> > +		desc->req.word1 = 0; /**< Timestamp */
> > +		desc->req.word2 = 0;
> > +		desc->req.word3 = 0;
> > +		uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
> > +		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
> > +		desc->req.data_ptrs[0].blen = fcw_len;
> > +		desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
> > +		desc->req.data_ptrs[0].last = 0;
> > +		desc->req.data_ptrs[0].dma_ext = 0;
> > +		for (b_idx = 1; b_idx < ACC100_DMA_MAX_NUM_POINTERS - 1;
> > +				b_idx++) {
> > +			desc->req.data_ptrs[b_idx].blkid = ACC100_DMA_BLKID_IN;
> > +			desc->req.data_ptrs[b_idx].last = 1;
> > +			desc->req.data_ptrs[b_idx].dma_ext = 0;
> > +			b_idx++;
> > +			desc->req.data_ptrs[b_idx].blkid =
> > +					ACC100_DMA_BLKID_OUT_ENC;
> > +			desc->req.data_ptrs[b_idx].last = 1;
> > +			desc->req.data_ptrs[b_idx].dma_ext = 0;
> > +		}
> > +		/* Preset some fields of LDPC FCW */
> > +		desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
> > +		desc->req.fcw_ld.gain_i = 1;
> > +		desc->req.fcw_ld.gain_h = 1;
> > +	}
> > +
> > +	q->lb_in = rte_zmalloc_socket(dev->device->driver->name,
> > +			RTE_CACHE_LINE_SIZE,
> > +			RTE_CACHE_LINE_SIZE, conf->socket);
> > +	if (q->lb_in == NULL) {
> > +		rte_bbdev_log(ERR, "Failed to allocate lb_in memory");
> > +		return -ENOMEM;
> > +	}
> > +	q->lb_in_addr_phys = rte_malloc_virt2iova(q->lb_in);
> > +	q->lb_out = rte_zmalloc_socket(dev->device->driver->name,
> > +			RTE_CACHE_LINE_SIZE,
> > +			RTE_CACHE_LINE_SIZE, conf->socket);
> > +	if (q->lb_out == NULL) {
> > +		rte_bbdev_log(ERR, "Failed to allocate lb_out memory");
> > +		return -ENOMEM;
> > +	}
> > +	q->lb_out_addr_phys = rte_malloc_virt2iova(q->lb_out);
> > +
> > +	/*
> > +	 * Software queue ring wraps synchronously with the HW when it
> > +	 * reaches the boundary of the maximum allocated queue size, no
> > +	 * matter what the sw queue size is. This wrapping is guarded by
> > +	 * setting the wrap_mask to represent the maximum queue size as
> > +	 * allocated at the time when the device has been setup (in
> > +	 * configure()).
> > +	 *
> > +	 * The queue depth is set to the queue size value (conf->queue_size).
> > +	 * This limits the occupancy of the queue at any point of time, so that
> > +	 * the queue does not get swamped with enqueue requests.
> > +	 */
> > +	q->sw_ring_depth = conf->queue_size;
> > +	q->sw_ring_wrap_mask = d->sw_ring_max_depth - 1;
> > +
> > +	q->op_type = conf->op_type;
> > +
> > +	q_idx = acc100_find_free_queue_idx(dev, conf);
> > +	if (q_idx == -1) {
> > +		rte_free(q);
> > +		return -1;
> > +	}
> > +
> > +	q->qgrp_id = (q_idx >> GRP_ID_SHIFT) & 0xF;
> > +	q->vf_id = (q_idx >> VF_ID_SHIFT)  & 0x3F;
> > +	q->aq_id = q_idx & 0xF;
> > +	q->aq_depth = (conf->op_type ==  RTE_BBDEV_OP_TURBO_DEC) ?
> > +			(1 << d->acc100_conf.q_ul_4g.aq_depth_log2) :
> > +			(1 << d->acc100_conf.q_dl_4g.aq_depth_log2);
> > +
> > +	q->mmio_reg_enqueue = RTE_PTR_ADD(d->mmio_base,
> > +			queue_offset(d->pf_device,
> > +					q->vf_id, q->qgrp_id, q->aq_id));
> > +
> > +	rte_bbdev_log_debug(
> > +			"Setup dev%u q%u: qgrp_id=%u, vf_id=%u, aq_id=%u, aq_depth=%u, mmio_reg_enqueue=%p",
> > +			dev->data->dev_id, queue_id, q->qgrp_id, q->vf_id,
> > +			q->aq_id, q->aq_depth, q->mmio_reg_enqueue);
> > +
> > +	dev->data->queues[queue_id].queue_private = q;
> > +	return 0;
> > +}
> > +
> > +/* Release ACC100 queue */
> > +static int
> > +acc100_queue_release(struct rte_bbdev *dev, uint16_t q_id)
> > +{
> > +	struct acc100_device *d = dev->data->dev_private;
> > +	struct acc100_queue *q = dev->data->queues[q_id].queue_private;
> > +
> > +	if (q != NULL) {
> > +		/* Mark the Queue as un-assigned */
> > +		d->q_assigned_bit_map[q->qgrp_id] &= (0xFFFFFFFF -
> > +				(1 << q->aq_id));
> > +		rte_free(q->lb_in);
> > +		rte_free(q->lb_out);
> > +		rte_free(q);
> > +		dev->data->queues[q_id].queue_private = NULL;
> > +	}
> > +
> >  	return 0;
> >  }
> >
> > @@ -258,8 +673,11 @@
> >  }
> >
> >  static const struct rte_bbdev_ops acc100_bbdev_ops = {
> > +	.setup_queues = acc100_setup_queues,
> >  	.close = acc100_dev_close,
> >  	.info_get = acc100_dev_info_get,
> > +	.queue_setup = acc100_queue_setup,
> > +	.queue_release = acc100_queue_release,
> >  };
> >
> >  /* ACC100 PCI PF address map */
> > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h
> > b/drivers/baseband/acc100/rte_acc100_pmd.h
> > index 662e2c8..0e2b79c 100644
> > --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> > +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> > @@ -518,11 +518,56 @@ struct acc100_registry_addr {
> >  	.ddr_range = HWVfDmaDdrBaseRangeRoVf,  };
> >
> > +/* Structure associated with each queue. */ struct
> > +__rte_cache_aligned acc100_queue {
> > +	union acc100_dma_desc *ring_addr;  /* Virtual address of sw ring */
> > +	rte_iova_t ring_addr_phys;  /* Physical address of software ring */
> > +	uint32_t sw_ring_head;  /* software ring head */
> > +	uint32_t sw_ring_tail;  /* software ring tail */
> > +	/* software ring size (descriptors, not bytes) */
> > +	uint32_t sw_ring_depth;
> > +	/* mask used to wrap enqueued descriptors on the sw ring */
> > +	uint32_t sw_ring_wrap_mask;
> > +	/* MMIO register used to enqueue descriptors */
> > +	void *mmio_reg_enqueue;
> > +	uint8_t vf_id;  /* VF ID (max = 63) */
> > +	uint8_t qgrp_id;  /* Queue Group ID */
> > +	uint16_t aq_id;  /* Atomic Queue ID */
> > +	uint16_t aq_depth;  /* Depth of atomic queue */
> > +	uint32_t aq_enqueued;  /* Count how many "batches" have been enqueued */
> > +	uint32_t aq_dequeued;  /* Count how many "batches" have been dequeued */
> > +	uint32_t irq_enable;  /* Enable ops dequeue interrupts if set to 1 */
> > +	struct rte_mempool *fcw_mempool;  /* FCW mempool */
> > +	enum rte_bbdev_op_type op_type;  /* Type of this Queue: TE or TD */
> > +	/* Internal Buffers for loopback input */
> > +	uint8_t *lb_in;
> > +	uint8_t *lb_out;
> > +	rte_iova_t lb_in_addr_phys;
> > +	rte_iova_t lb_out_addr_phys;
> > +	struct acc100_device *d;
> > +};
> > +
> >  /* Private data structure for each ACC100 device */  struct acc100_device {
> >  	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
> > +	void *sw_rings_base;  /* Base addr of un-aligned memory for sw rings */
> > +	void *sw_rings;  /* 64MBs of 64MB aligned memory for sw rings */
> > +	rte_iova_t sw_rings_phys;  /* Physical address of sw_rings */
> > +	/* Virtual address of the info memory routed to this function under
> > +	 * operation, whether it is PF or VF.
> > +	 */
> > +	union acc100_harq_layout_data *harq_layout;
> > +	uint32_t sw_ring_size;
> >  	uint32_t ddr_size; /* Size in kB */
> > +	uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */
> > +	rte_iova_t tail_ptr_phys; /* Physical address of tail pointers */
> > +	/* Max number of entries available for each queue in device, depending
> > +	 * on how many queues are enabled with configure()
> > +	 */
> > +	uint32_t sw_ring_max_depth;
> >  	struct acc100_conf acc100_conf; /* ACC100 Initial configuration */
> > +	/* Bitmap capturing which Queues have already been assigned */
> > +	uint16_t q_assigned_bit_map[ACC100_NUM_QGRPS];
> >  	bool pf_device; /**< True if this is a PF ACC100 device */
> >  	bool configured; /**< True if this ACC100 device is configured */
> > };
> > --
> > 1.8.3.1



* Re: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions
  2020-08-29 11:10   ` Xu, Rosen
@ 2020-08-29 18:01     ` Chautru, Nicolas
  2020-09-03  2:34       ` Xu, Rosen
  0 siblings, 1 reply; 213+ messages in thread
From: Chautru, Nicolas @ 2020-08-29 18:01 UTC (permalink / raw)
  To: Xu, Rosen, dev, akhil.goyal; +Cc: Richardson, Bruce

Hi Rosen, 

> From: Xu, Rosen <rosen.xu@intel.com>
> 
> Hi,
> 
> > -----Original Message-----
> > From: dev <dev-bounces@dpdk.org> On Behalf Of Nicolas Chautru
> > Sent: Wednesday, August 19, 2020 8:25
> > To: dev@dpdk.org; akhil.goyal@nxp.com
> > Cc: Richardson, Bruce <bruce.richardson@intel.com>; Chautru, Nicolas
> > <nicolas.chautru@intel.com>
> > Subject: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC
> > processing functions
> >
> > Adding LDPC decode and encode processing operations
> >
> > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > ---
> >  drivers/baseband/acc100/rte_acc100_pmd.c | 1625
> > +++++++++++++++++++++++++++++-
> >  drivers/baseband/acc100/rte_acc100_pmd.h |    3 +
> >  2 files changed, 1626 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> > b/drivers/baseband/acc100/rte_acc100_pmd.c
> > index 7a21c57..5f32813 100644
> > --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> > +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> > @@ -15,6 +15,9 @@
> >  #include <rte_hexdump.h>
> >  #include <rte_pci.h>
> >  #include <rte_bus_pci.h>
> > +#ifdef RTE_BBDEV_OFFLOAD_COST
> > +#include <rte_cycles.h>
> > +#endif
> >
> >  #include <rte_bbdev.h>
> >  #include <rte_bbdev_pmd.h>
> > @@ -449,7 +452,6 @@
> >  	return 0;
> >  }
> >
> > -
> >  /**
> >   * Report a ACC100 queue index which is free
> >   * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
> > @@ -634,6 +636,46 @@
> >  	struct acc100_device *d = dev->data->dev_private;
> >
> >  	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
> > +		{
> > +			.type   = RTE_BBDEV_OP_LDPC_ENC,
> > +			.cap.ldpc_enc = {
> > +				.capability_flags =
> > +					RTE_BBDEV_LDPC_RATE_MATCH |
> > +					RTE_BBDEV_LDPC_CRC_24B_ATTACH |
> > +					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
> > +				.num_buffers_src =
> > +					RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > +				.num_buffers_dst =
> > +					RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > +			}
> > +		},
> > +		{
> > +			.type   = RTE_BBDEV_OP_LDPC_DEC,
> > +			.cap.ldpc_dec = {
> > +			.capability_flags =
> > +				RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK |
> > +				RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP |
> > +				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
> > +				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
> > +#ifdef ACC100_EXT_MEM
> > +				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
> > +				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
> > +#endif
> > +				RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE |
> > +				RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS |
> > +				RTE_BBDEV_LDPC_DECODE_BYPASS |
> > +				RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
> > +				RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
> > +				RTE_BBDEV_LDPC_LLR_COMPRESSION,
> > +			.llr_size = 8,
> > +			.llr_decimals = 1,
> > +			.num_buffers_src =
> > +				RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > +			.num_buffers_hard_out =
> > +				RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > +			.num_buffers_soft_out = 0,
> > +			}
> > +		},
> >  		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
> >  	};
> >
> > @@ -669,9 +711,14 @@
> >  	dev_info->cpu_flag_reqs = NULL;
> >  	dev_info->min_alignment = 64;
> >  	dev_info->capabilities = bbdev_capabilities;
> > +#ifdef ACC100_EXT_MEM
> >  	dev_info->harq_buffer_size = d->ddr_size;
> > +#else
> > +	dev_info->harq_buffer_size = 0;
> > +#endif
> >  }
> >
> > +
> >  static const struct rte_bbdev_ops acc100_bbdev_ops = {
> >  	.setup_queues = acc100_setup_queues,
> >  	.close = acc100_dev_close,
> > @@ -696,6 +743,1577 @@
> >  	{.device_id = 0},
> >  };
> >
> > +/* Read flag value 0/1 from bitmap */
> > +static inline bool
> > +check_bit(uint32_t bitmap, uint32_t bitmask)
> > +{
> > +	return bitmap & bitmask;
> > +}
> > +
> > +static inline char *
> > +mbuf_append(struct rte_mbuf *m_head, struct rte_mbuf *m, uint16_t len)
> > +{
> > +	if (unlikely(len > rte_pktmbuf_tailroom(m)))
> > +		return NULL;
> > +
> > +	char *tail = (char *)m->buf_addr + m->data_off + m->data_len;
> > +	m->data_len = (uint16_t)(m->data_len + len);
> > +	m_head->pkt_len  = (m_head->pkt_len + len);
> > +	return tail;
> > +}
> 
> Is it reasonable to direct add data_len of rte_mbuf?
> 

Do you suggest adding directly without checking that there is enough room in the mbuf? We cannot rely on the application providing an mbuf with enough tailroom.
In case you are asking about the two mbufs: this is because the function also supports segmented memory made of multiple mbuf segments.
Note that this function is also used in other existing bbdev PMDs. If you believe there is a better way to do this, we can certainly discuss it and change these in several PMDs through another series.
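
To illustrate just the tailroom guard in isolation, here is a hypothetical miniature in plain C: a toy single-segment buffer structure, not the DPDK mbuf API.

```c
#include <stdint.h>
#include <stddef.h>

/* Toy single-segment buffer; only the tailroom check is of interest. */
struct mini_buf {
	uint8_t data[128];
	uint16_t data_len; /* bytes currently in use */
};

/* Append len bytes only if the remaining tailroom allows it; return a
 * pointer to where the caller may write, or NULL on overflow. This
 * mirrors the rte_pktmbuf_tailroom() guard in mbuf_append(). */
static uint8_t *mini_append(struct mini_buf *b, uint16_t len)
{
	uint16_t tailroom = (uint16_t)(sizeof(b->data) - b->data_len);

	if (len > tailroom)
		return NULL; /* caller-provided buffer is too small */
	uint8_t *tail = b->data + b->data_len;
	b->data_len = (uint16_t)(b->data_len + len);
	return tail;
}
```

The real helper additionally bumps pkt_len on the head mbuf, because a packet may span several chained segments.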

Thanks for all the reviews and useful comments.
Nic


* Re: [dpdk-dev] [PATCH v3 02/11] baseband/acc100: add register definition file
  2020-08-29 17:39     ` Chautru, Nicolas
@ 2020-09-03  2:15       ` Xu, Rosen
  2020-09-03  9:17         ` Ferruh Yigit
  0 siblings, 1 reply; 213+ messages in thread
From: Xu, Rosen @ 2020-09-03  2:15 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, akhil.goyal; +Cc: Richardson, Bruce

Hi,

> -----Original Message-----
> From: Chautru, Nicolas <nicolas.chautru@intel.com>
> Sent: Sunday, August 30, 2020 1:40
> To: Xu, Rosen <rosen.xu@intel.com>; dev@dpdk.org; akhil.goyal@nxp.com
> Cc: Richardson, Bruce <bruce.richardson@intel.com>
> Subject: RE: [dpdk-dev] [PATCH v3 02/11] baseband/acc100: add register
> definition file
> 
> Hi Rosen,
> 
> > From: Xu, Rosen <rosen.xu@intel.com>
> >
> > Hi,
> >
> > > -----Original Message-----
> > > From: dev <dev-bounces@dpdk.org> On Behalf Of Nicolas Chautru
> > > Sent: Wednesday, August 19, 2020 8:25
> > > To: dev@dpdk.org; akhil.goyal@nxp.com
> > > Cc: Richardson, Bruce <bruce.richardson@intel.com>; Chautru, Nicolas
> > > <nicolas.chautru@intel.com>
> > > Subject: [dpdk-dev] [PATCH v3 02/11] baseband/acc100: add register
> > > definition file
> > >
> > > Add in the list of registers for the device and related
> > > HW specs definitions.
> > >
> > > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > > ---
> > >  drivers/baseband/acc100/acc100_pf_enum.h | 1068
> > > ++++++++++++++++++++++++++++++
> > >  drivers/baseband/acc100/acc100_vf_enum.h |   73 ++
> > >  drivers/baseband/acc100/rte_acc100_pmd.h |  490 ++++++++++++++
> > >  3 files changed, 1631 insertions(+)
> > >  create mode 100644 drivers/baseband/acc100/acc100_pf_enum.h
> > >  create mode 100644 drivers/baseband/acc100/acc100_vf_enum.h
> > >
> > > diff --git a/drivers/baseband/acc100/acc100_pf_enum.h
> > > b/drivers/baseband/acc100/acc100_pf_enum.h
> > > new file mode 100644
> > > index 0000000..a1ee416
> > > --- /dev/null
> > > +++ b/drivers/baseband/acc100/acc100_pf_enum.h
> > > @@ -0,0 +1,1068 @@
> > > +/* SPDX-License-Identifier: BSD-3-Clause
> > > + * Copyright(c) 2017 Intel Corporation
> > > + */
> > > +
> > > +#ifndef ACC100_PF_ENUM_H
> > > +#define ACC100_PF_ENUM_H
> > > +
> > > +/*
> > > + * ACC100 Register mapping on PF BAR0
> > > + * This is automatically generated from RDL, format may change with
> new
> > > RDL
> > > + * Release.
> > > + * Variable names are as is
> > > + */
> > > +enum {
> > > +	HWPfQmgrEgressQueuesTemplate          =  0x0007FE00,
> > > +	HWPfQmgrIngressAq                     =  0x00080000,
> > > +	HWPfQmgrArbQAvail                     =  0x00A00010,
> > > +	HWPfQmgrArbQBlock                     =  0x00A00014,
> > > +	HWPfQmgrAqueueDropNotifEn             =  0x00A00024,
> > > +	HWPfQmgrAqueueDisableNotifEn          =  0x00A00028,
> > > +	HWPfQmgrSoftReset                     =  0x00A00038,
> > > +	HWPfQmgrInitStatus                    =  0x00A0003C,
> > > +	HWPfQmgrAramWatchdogCount             =  0x00A00040,
> > > +	HWPfQmgrAramWatchdogCounterEn         =  0x00A00044,
> > > +	HWPfQmgrAxiWatchdogCount              =  0x00A00048,
> > > +	HWPfQmgrAxiWatchdogCounterEn          =  0x00A0004C,
> > > +	HWPfQmgrProcessWatchdogCount          =  0x00A00050,
> > > +	HWPfQmgrProcessWatchdogCounterEn      =  0x00A00054,
> > > +	HWPfQmgrProcessUl4GWatchdogCounter    =  0x00A00058,
> > > +	HWPfQmgrProcessDl4GWatchdogCounter    =  0x00A0005C,
> > > +	HWPfQmgrProcessUl5GWatchdogCounter    =  0x00A00060,
> > > +	HWPfQmgrProcessDl5GWatchdogCounter    =  0x00A00064,
> > > +	HWPfQmgrProcessMldWatchdogCounter     =  0x00A00068,
> > > +	HWPfQmgrMsiOverflowUpperVf            =  0x00A00070,
> > > +	HWPfQmgrMsiOverflowLowerVf            =  0x00A00074,
> > > +	HWPfQmgrMsiWatchdogOverflow           =  0x00A00078,
> > > +	HWPfQmgrMsiOverflowEnable             =  0x00A0007C,
> > > +	HWPfQmgrDebugAqPointerMemGrp          =  0x00A00100,
> > > +	HWPfQmgrDebugOutputArbQFifoGrp        =  0x00A00140,
> > > +	HWPfQmgrDebugMsiFifoGrp               =  0x00A00180,
> > > +	HWPfQmgrDebugAxiWdTimeoutMsiFifo      =  0x00A001C0,
> > > +	HWPfQmgrDebugProcessWdTimeoutMsiFifo  =  0x00A001C4,
> > > +	HWPfQmgrDepthLog2Grp                  =  0x00A00200,
> > > +	HWPfQmgrTholdGrp                      =  0x00A00300,
> > > +	HWPfQmgrGrpTmplateReg0Indx            =  0x00A00600,
> > > +	HWPfQmgrGrpTmplateReg1Indx            =  0x00A00680,
> > > +	HWPfQmgrGrpTmplateReg2indx            =  0x00A00700,
> > > +	HWPfQmgrGrpTmplateReg3Indx            =  0x00A00780,
> > > +	HWPfQmgrGrpTmplateReg4Indx            =  0x00A00800,
> > > +	HWPfQmgrVfBaseAddr                    =  0x00A01000,
> > > +	HWPfQmgrUl4GWeightRrVf                =  0x00A02000,
> > > +	HWPfQmgrDl4GWeightRrVf                =  0x00A02100,
> > > +	HWPfQmgrUl5GWeightRrVf                =  0x00A02200,
> > > +	HWPfQmgrDl5GWeightRrVf                =  0x00A02300,
> > > +	HWPfQmgrMldWeightRrVf                 =  0x00A02400,
> > > +	HWPfQmgrArbQDepthGrp                  =  0x00A02F00,
> > > +	HWPfQmgrGrpFunction0                  =  0x00A02F40,
> > > +	HWPfQmgrGrpFunction1                  =  0x00A02F44,
> > > +	HWPfQmgrGrpPriority                   =  0x00A02F48,
> > > +	HWPfQmgrWeightSync                    =  0x00A03000,
> > > +	HWPfQmgrAqEnableVf                    =  0x00A10000,
> > > +	HWPfQmgrAqResetVf                     =  0x00A20000,
> > > +	HWPfQmgrRingSizeVf                    =  0x00A20004,
> > > +	HWPfQmgrGrpDepthLog20Vf               =  0x00A20008,
> > > +	HWPfQmgrGrpDepthLog21Vf               =  0x00A2000C,
> > > +	HWPfQmgrGrpFunction0Vf                =  0x00A20010,
> > > +	HWPfQmgrGrpFunction1Vf                =  0x00A20014,
> > > +	HWPfDmaConfig0Reg                     =  0x00B80000,
> > > +	HWPfDmaConfig1Reg                     =  0x00B80004,
> > > +	HWPfDmaQmgrAddrReg                    =  0x00B80008,
> > > +	HWPfDmaSoftResetReg                   =  0x00B8000C,
> > > +	HWPfDmaAxcacheReg                     =  0x00B80010,
> > > +	HWPfDmaVersionReg                     =  0x00B80014,
> > > +	HWPfDmaFrameThreshold                 =  0x00B80018,
> > > +	HWPfDmaTimestampLo                    =  0x00B8001C,
> > > +	HWPfDmaTimestampHi                    =  0x00B80020,
> > > +	HWPfDmaAxiStatus                      =  0x00B80028,
> > > +	HWPfDmaAxiControl                     =  0x00B8002C,
> > > +	HWPfDmaNoQmgr                         =  0x00B80030,
> > > +	HWPfDmaQosScale                       =  0x00B80034,
> > > +	HWPfDmaQmanen                         =  0x00B80040,
> > > +	HWPfDmaQmgrQosBase                    =  0x00B80060,
> > > +	HWPfDmaFecClkGatingEnable             =  0x00B80080,
> > > +	HWPfDmaPmEnable                       =  0x00B80084,
> > > +	HWPfDmaQosEnable                      =  0x00B80088,
> > > +	HWPfDmaHarqWeightedRrFrameThreshold   =  0x00B800B0,
> > > +	HWPfDmaDataSmallWeightedRrFrameThresh  = 0x00B800B4,
> > > +	HWPfDmaDataLargeWeightedRrFrameThresh  = 0x00B800B8,
> > > +	HWPfDmaInboundCbMaxSize               =  0x00B800BC,
> > > +	HWPfDmaInboundDrainDataSize           =  0x00B800C0,
> > > +	HWPfDmaVfDdrBaseRw                    =  0x00B80400,
> > > +	HWPfDmaCmplTmOutCnt                   =  0x00B80800,
> > > +	HWPfDmaProcTmOutCnt                   =  0x00B80804,
> > > +	HWPfDmaStatusRrespBresp               =  0x00B80810,
> > > +	HWPfDmaCfgRrespBresp                  =  0x00B80814,
> > > +	HWPfDmaStatusMemParErr                =  0x00B80818,
> > > +	HWPfDmaCfgMemParErrEn                 =  0x00B8081C,
> > > +	HWPfDmaStatusDmaHwErr                 =  0x00B80820,
> > > +	HWPfDmaCfgDmaHwErrEn                  =  0x00B80824,
> > > +	HWPfDmaStatusFecCoreErr               =  0x00B80828,
> > > +	HWPfDmaCfgFecCoreErrEn                =  0x00B8082C,
> > > +	HWPfDmaStatusFcwDescrErr              =  0x00B80830,
> > > +	HWPfDmaCfgFcwDescrErrEn               =  0x00B80834,
> > > +	HWPfDmaStatusBlockTransmit            =  0x00B80838,
> > > +	HWPfDmaBlockOnErrEn                   =  0x00B8083C,
> > > +	HWPfDmaStatusFlushDma                 =  0x00B80840,
> > > +	HWPfDmaFlushDmaOnErrEn                =  0x00B80844,
> > > +	HWPfDmaStatusSdoneFifoFull            =  0x00B80848,
> > > +	HWPfDmaStatusDescriptorErrLoVf        =  0x00B8084C,
> > > +	HWPfDmaStatusDescriptorErrHiVf        =  0x00B80850,
> > > +	HWPfDmaStatusFcwErrLoVf               =  0x00B80854,
> > > +	HWPfDmaStatusFcwErrHiVf               =  0x00B80858,
> > > +	HWPfDmaStatusDataErrLoVf              =  0x00B8085C,
> > > +	HWPfDmaStatusDataErrHiVf              =  0x00B80860,
> > > +	HWPfDmaCfgMsiEnSoftwareErr            =  0x00B80864,
> > > +	HWPfDmaDescriptorSignatuture          =  0x00B80868,
> > > +	HWPfDmaFcwSignature                   =  0x00B8086C,
> > > +	HWPfDmaErrorDetectionEn               =  0x00B80870,
> > > +	HWPfDmaErrCntrlFifoDebug              =  0x00B8087C,
> > > +	HWPfDmaStatusToutData                 =  0x00B80880,
> > > +	HWPfDmaStatusToutDesc                 =  0x00B80884,
> > > +	HWPfDmaStatusToutUnexpData            =  0x00B80888,
> > > +	HWPfDmaStatusToutUnexpDesc            =  0x00B8088C,
> > > +	HWPfDmaStatusToutProcess              =  0x00B80890,
> > > +	HWPfDmaConfigCtoutOutDataEn           =  0x00B808A0,
> > > +	HWPfDmaConfigCtoutOutDescrEn          =  0x00B808A4,
> > > +	HWPfDmaConfigUnexpComplDataEn         =  0x00B808A8,
> > > +	HWPfDmaConfigUnexpComplDescrEn        =  0x00B808AC,
> > > +	HWPfDmaConfigPtoutOutEn               =  0x00B808B0,
> > > +	HWPfDmaFec5GulDescBaseLoRegVf         =  0x00B88020,
> > > +	HWPfDmaFec5GulDescBaseHiRegVf         =  0x00B88024,
> > > +	HWPfDmaFec5GulRespPtrLoRegVf          =  0x00B88028,
> > > +	HWPfDmaFec5GulRespPtrHiRegVf          =  0x00B8802C,
> > > +	HWPfDmaFec5GdlDescBaseLoRegVf         =  0x00B88040,
> > > +	HWPfDmaFec5GdlDescBaseHiRegVf         =  0x00B88044,
> > > +	HWPfDmaFec5GdlRespPtrLoRegVf          =  0x00B88048,
> > > +	HWPfDmaFec5GdlRespPtrHiRegVf          =  0x00B8804C,
> > > +	HWPfDmaFec4GulDescBaseLoRegVf         =  0x00B88060,
> > > +	HWPfDmaFec4GulDescBaseHiRegVf         =  0x00B88064,
> > > +	HWPfDmaFec4GulRespPtrLoRegVf          =  0x00B88068,
> > > +	HWPfDmaFec4GulRespPtrHiRegVf          =  0x00B8806C,
> > > +	HWPfDmaFec4GdlDescBaseLoRegVf         =  0x00B88080,
> > > +	HWPfDmaFec4GdlDescBaseHiRegVf         =  0x00B88084,
> > > +	HWPfDmaFec4GdlRespPtrLoRegVf          =  0x00B88088,
> > > +	HWPfDmaFec4GdlRespPtrHiRegVf          =  0x00B8808C,
> > > +	HWPfDmaVfDdrBaseRangeRo               =  0x00B880A0,
> > > +	HWPfQosmonACntrlReg                   =  0x00B90000,
> > > +	HWPfQosmonAEvalOverflow0              =  0x00B90008,
> > > +	HWPfQosmonAEvalOverflow1              =  0x00B9000C,
> > > +	HWPfQosmonADivTerm                    =  0x00B90010,
> > > +	HWPfQosmonATickTerm                   =  0x00B90014,
> > > +	HWPfQosmonAEvalTerm                   =  0x00B90018,
> > > +	HWPfQosmonAAveTerm                    =  0x00B9001C,
> > > +	HWPfQosmonAForceEccErr                =  0x00B90020,
> > > +	HWPfQosmonAEccErrDetect               =  0x00B90024,
> > > +	HWPfQosmonAIterationConfig0Low        =  0x00B90060,
> > > +	HWPfQosmonAIterationConfig0High       =  0x00B90064,
> > > +	HWPfQosmonAIterationConfig1Low        =  0x00B90068,
> > > +	HWPfQosmonAIterationConfig1High       =  0x00B9006C,
> > > +	HWPfQosmonAIterationConfig2Low        =  0x00B90070,
> > > +	HWPfQosmonAIterationConfig2High       =  0x00B90074,
> > > +	HWPfQosmonAIterationConfig3Low        =  0x00B90078,
> > > +	HWPfQosmonAIterationConfig3High       =  0x00B9007C,
> > > +	HWPfQosmonAEvalMemAddr                =  0x00B90080,
> > > +	HWPfQosmonAEvalMemData                =  0x00B90084,
> > > +	HWPfQosmonAXaction                    =  0x00B900C0,
> > > +	HWPfQosmonARemThres1Vf                =  0x00B90400,
> > > +	HWPfQosmonAThres2Vf                   =  0x00B90404,
> > > +	HWPfQosmonAWeiFracVf                  =  0x00B90408,
> > > +	HWPfQosmonARrWeiVf                    =  0x00B9040C,
> > > +	HWPfPermonACntrlRegVf                 =  0x00B98000,
> > > +	HWPfPermonACountVf                    =  0x00B98008,
> > > +	HWPfPermonAKCntLoVf                   =  0x00B98010,
> > > +	HWPfPermonAKCntHiVf                   =  0x00B98014,
> > > +	HWPfPermonADeltaCntLoVf               =  0x00B98020,
> > > +	HWPfPermonADeltaCntHiVf               =  0x00B98024,
> > > +	HWPfPermonAVersionReg                 =  0x00B9C000,
> > > +	HWPfPermonACbControlFec               =  0x00B9C0F0,
> > > +	HWPfPermonADltTimerLoFec              =  0x00B9C0F4,
> > > +	HWPfPermonADltTimerHiFec              =  0x00B9C0F8,
> > > +	HWPfPermonACbCountFec                 =  0x00B9C100,
> > > +	HWPfPermonAAccExecTimerLoFec          =  0x00B9C104,
> > > +	HWPfPermonAAccExecTimerHiFec          =  0x00B9C108,
> > > +	HWPfPermonAExecTimerMinFec            =  0x00B9C200,
> > > +	HWPfPermonAExecTimerMaxFec            =  0x00B9C204,
> > > +	HWPfPermonAControlBusMon              =  0x00B9C400,
> > > +	HWPfPermonAConfigBusMon               =  0x00B9C404,
> > > +	HWPfPermonASkipCountBusMon            =  0x00B9C408,
> > > +	HWPfPermonAMinLatBusMon               =  0x00B9C40C,
> > > +	HWPfPermonAMaxLatBusMon               =  0x00B9C500,
> > > +	HWPfPermonATotalLatLowBusMon          =  0x00B9C504,
> > > +	HWPfPermonATotalLatUpperBusMon        =  0x00B9C508,
> > > +	HWPfPermonATotalReqCntBusMon          =  0x00B9C50C,
> > > +	HWPfQosmonBCntrlReg                   =  0x00BA0000,
> > > +	HWPfQosmonBEvalOverflow0              =  0x00BA0008,
> > > +	HWPfQosmonBEvalOverflow1              =  0x00BA000C,
> > > +	HWPfQosmonBDivTerm                    =  0x00BA0010,
> > > +	HWPfQosmonBTickTerm                   =  0x00BA0014,
> > > +	HWPfQosmonBEvalTerm                   =  0x00BA0018,
> > > +	HWPfQosmonBAveTerm                    =  0x00BA001C,
> > > +	HWPfQosmonBForceEccErr                =  0x00BA0020,
> > > +	HWPfQosmonBEccErrDetect               =  0x00BA0024,
> > > +	HWPfQosmonBIterationConfig0Low        =  0x00BA0060,
> > > +	HWPfQosmonBIterationConfig0High       =  0x00BA0064,
> > > +	HWPfQosmonBIterationConfig1Low        =  0x00BA0068,
> > > +	HWPfQosmonBIterationConfig1High       =  0x00BA006C,
> > > +	HWPfQosmonBIterationConfig2Low        =  0x00BA0070,
> > > +	HWPfQosmonBIterationConfig2High       =  0x00BA0074,
> > > +	HWPfQosmonBIterationConfig3Low        =  0x00BA0078,
> > > +	HWPfQosmonBIterationConfig3High       =  0x00BA007C,
> > > +	HWPfQosmonBEvalMemAddr                =  0x00BA0080,
> > > +	HWPfQosmonBEvalMemData                =  0x00BA0084,
> > > +	HWPfQosmonBXaction                    =  0x00BA00C0,
> > > +	HWPfQosmonBRemThres1Vf                =  0x00BA0400,
> > > +	HWPfQosmonBThres2Vf                   =  0x00BA0404,
> > > +	HWPfQosmonBWeiFracVf                  =  0x00BA0408,
> > > +	HWPfQosmonBRrWeiVf                    =  0x00BA040C,
> > > +	HWPfPermonBCntrlRegVf                 =  0x00BA8000,
> > > +	HWPfPermonBCountVf                    =  0x00BA8008,
> > > +	HWPfPermonBKCntLoVf                   =  0x00BA8010,
> > > +	HWPfPermonBKCntHiVf                   =  0x00BA8014,
> > > +	HWPfPermonBDeltaCntLoVf               =  0x00BA8020,
> > > +	HWPfPermonBDeltaCntHiVf               =  0x00BA8024,
> > > +	HWPfPermonBVersionReg                 =  0x00BAC000,
> > > +	HWPfPermonBCbControlFec               =  0x00BAC0F0,
> > > +	HWPfPermonBDltTimerLoFec              =  0x00BAC0F4,
> > > +	HWPfPermonBDltTimerHiFec              =  0x00BAC0F8,
> > > +	HWPfPermonBCbCountFec                 =  0x00BAC100,
> > > +	HWPfPermonBAccExecTimerLoFec          =  0x00BAC104,
> > > +	HWPfPermonBAccExecTimerHiFec          =  0x00BAC108,
> > > +	HWPfPermonBExecTimerMinFec            =  0x00BAC200,
> > > +	HWPfPermonBExecTimerMaxFec            =  0x00BAC204,
> > > +	HWPfPermonBControlBusMon              =  0x00BAC400,
> > > +	HWPfPermonBConfigBusMon               =  0x00BAC404,
> > > +	HWPfPermonBSkipCountBusMon            =  0x00BAC408,
> > > +	HWPfPermonBMinLatBusMon               =  0x00BAC40C,
> > > +	HWPfPermonBMaxLatBusMon               =  0x00BAC500,
> > > +	HWPfPermonBTotalLatLowBusMon          =  0x00BAC504,
> > > +	HWPfPermonBTotalLatUpperBusMon        =  0x00BAC508,
> > > +	HWPfPermonBTotalReqCntBusMon          =  0x00BAC50C,
> > > +	HWPfFecUl5gCntrlReg                   =  0x00BC0000,
> > > +	HWPfFecUl5gI2MThreshReg               =  0x00BC0004,
> > > +	HWPfFecUl5gVersionReg                 =  0x00BC0100,
> > > +	HWPfFecUl5gFcwStatusReg               =  0x00BC0104,
> > > +	HWPfFecUl5gWarnReg                    =  0x00BC0108,
> > > +	HwPfFecUl5gIbDebugReg                 =  0x00BC0200,
> > > +	HwPfFecUl5gObLlrDebugReg              =  0x00BC0204,
> > > +	HwPfFecUl5gObHarqDebugReg             =  0x00BC0208,
> > > +	HwPfFecUl5g1CntrlReg                  =  0x00BC1000,
> > > +	HwPfFecUl5g1I2MThreshReg              =  0x00BC1004,
> > > +	HwPfFecUl5g1VersionReg                =  0x00BC1100,
> > > +	HwPfFecUl5g1FcwStatusReg              =  0x00BC1104,
> > > +	HwPfFecUl5g1WarnReg                   =  0x00BC1108,
> > > +	HwPfFecUl5g1IbDebugReg                =  0x00BC1200,
> > > +	HwPfFecUl5g1ObLlrDebugReg             =  0x00BC1204,
> > > +	HwPfFecUl5g1ObHarqDebugReg            =  0x00BC1208,
> > > +	HwPfFecUl5g2CntrlReg                  =  0x00BC2000,
> > > +	HwPfFecUl5g2I2MThreshReg              =  0x00BC2004,
> > > +	HwPfFecUl5g2VersionReg                =  0x00BC2100,
> > > +	HwPfFecUl5g2FcwStatusReg              =  0x00BC2104,
> > > +	HwPfFecUl5g2WarnReg                   =  0x00BC2108,
> > > +	HwPfFecUl5g2IbDebugReg                =  0x00BC2200,
> > > +	HwPfFecUl5g2ObLlrDebugReg             =  0x00BC2204,
> > > +	HwPfFecUl5g2ObHarqDebugReg            =  0x00BC2208,
> > > +	HwPfFecUl5g3CntrlReg                  =  0x00BC3000,
> > > +	HwPfFecUl5g3I2MThreshReg              =  0x00BC3004,
> > > +	HwPfFecUl5g3VersionReg                =  0x00BC3100,
> > > +	HwPfFecUl5g3FcwStatusReg              =  0x00BC3104,
> > > +	HwPfFecUl5g3WarnReg                   =  0x00BC3108,
> > > +	HwPfFecUl5g3IbDebugReg                =  0x00BC3200,
> > > +	HwPfFecUl5g3ObLlrDebugReg             =  0x00BC3204,
> > > +	HwPfFecUl5g3ObHarqDebugReg            =  0x00BC3208,
> > > +	HwPfFecUl5g4CntrlReg                  =  0x00BC4000,
> > > +	HwPfFecUl5g4I2MThreshReg              =  0x00BC4004,
> > > +	HwPfFecUl5g4VersionReg                =  0x00BC4100,
> > > +	HwPfFecUl5g4FcwStatusReg              =  0x00BC4104,
> > > +	HwPfFecUl5g4WarnReg                   =  0x00BC4108,
> > > +	HwPfFecUl5g4IbDebugReg                =  0x00BC4200,
> > > +	HwPfFecUl5g4ObLlrDebugReg             =  0x00BC4204,
> > > +	HwPfFecUl5g4ObHarqDebugReg            =  0x00BC4208,
> > > +	HwPfFecUl5g5CntrlReg                  =  0x00BC5000,
> > > +	HwPfFecUl5g5I2MThreshReg              =  0x00BC5004,
> > > +	HwPfFecUl5g5VersionReg                =  0x00BC5100,
> > > +	HwPfFecUl5g5FcwStatusReg              =  0x00BC5104,
> > > +	HwPfFecUl5g5WarnReg                   =  0x00BC5108,
> > > +	HwPfFecUl5g5IbDebugReg                =  0x00BC5200,
> > > +	HwPfFecUl5g5ObLlrDebugReg             =  0x00BC5204,
> > > +	HwPfFecUl5g5ObHarqDebugReg            =  0x00BC5208,
> > > +	HwPfFecUl5g6CntrlReg                  =  0x00BC6000,
> > > +	HwPfFecUl5g6I2MThreshReg              =  0x00BC6004,
> > > +	HwPfFecUl5g6VersionReg                =  0x00BC6100,
> > > +	HwPfFecUl5g6FcwStatusReg              =  0x00BC6104,
> > > +	HwPfFecUl5g6WarnReg                   =  0x00BC6108,
> > > +	HwPfFecUl5g6IbDebugReg                =  0x00BC6200,
> > > +	HwPfFecUl5g6ObLlrDebugReg             =  0x00BC6204,
> > > +	HwPfFecUl5g6ObHarqDebugReg            =  0x00BC6208,
> > > +	HwPfFecUl5g7CntrlReg                  =  0x00BC7000,
> > > +	HwPfFecUl5g7I2MThreshReg              =  0x00BC7004,
> > > +	HwPfFecUl5g7VersionReg                =  0x00BC7100,
> > > +	HwPfFecUl5g7FcwStatusReg              =  0x00BC7104,
> > > +	HwPfFecUl5g7WarnReg                   =  0x00BC7108,
> > > +	HwPfFecUl5g7IbDebugReg                =  0x00BC7200,
> > > +	HwPfFecUl5g7ObLlrDebugReg             =  0x00BC7204,
> > > +	HwPfFecUl5g7ObHarqDebugReg            =  0x00BC7208,
> > > +	HwPfFecUl5g8CntrlReg                  =  0x00BC8000,
> > > +	HwPfFecUl5g8I2MThreshReg              =  0x00BC8004,
> > > +	HwPfFecUl5g8VersionReg                =  0x00BC8100,
> > > +	HwPfFecUl5g8FcwStatusReg              =  0x00BC8104,
> > > +	HwPfFecUl5g8WarnReg                   =  0x00BC8108,
> > > +	HwPfFecUl5g8IbDebugReg                =  0x00BC8200,
> > > +	HwPfFecUl5g8ObLlrDebugReg             =  0x00BC8204,
> > > +	HwPfFecUl5g8ObHarqDebugReg            =  0x00BC8208,
> > > +	HWPfFecDl5gCntrlReg                   =  0x00BCF000,
> > > +	HWPfFecDl5gI2MThreshReg               =  0x00BCF004,
> > > +	HWPfFecDl5gVersionReg                 =  0x00BCF100,
> > > +	HWPfFecDl5gFcwStatusReg               =  0x00BCF104,
> > > +	HWPfFecDl5gWarnReg                    =  0x00BCF108,
> > > +	HWPfFecUlVersionReg                   =  0x00BD0000,
> > > +	HWPfFecUlControlReg                   =  0x00BD0004,
> > > +	HWPfFecUlStatusReg                    =  0x00BD0008,
> > > +	HWPfFecDlVersionReg                   =  0x00BDF000,
> > > +	HWPfFecDlClusterConfigReg             =  0x00BDF004,
> > > +	HWPfFecDlBurstThres                   =  0x00BDF00C,
> > > +	HWPfFecDlClusterStatusReg0            =  0x00BDF040,
> > > +	HWPfFecDlClusterStatusReg1            =  0x00BDF044,
> > > +	HWPfFecDlClusterStatusReg2            =  0x00BDF048,
> > > +	HWPfFecDlClusterStatusReg3            =  0x00BDF04C,
> > > +	HWPfFecDlClusterStatusReg4            =  0x00BDF050,
> > > +	HWPfFecDlClusterStatusReg5            =  0x00BDF054,
> > > +	HWPfChaFabPllPllrst                   =  0x00C40000,
> > > +	HWPfChaFabPllClk0                     =  0x00C40004,
> > > +	HWPfChaFabPllClk1                     =  0x00C40008,
> > > +	HWPfChaFabPllBwadj                    =  0x00C4000C,
> > > +	HWPfChaFabPllLbw                      =  0x00C40010,
> > > +	HWPfChaFabPllResetq                   =  0x00C40014,
> > > +	HWPfChaFabPllPhshft0                  =  0x00C40018,
> > > +	HWPfChaFabPllPhshft1                  =  0x00C4001C,
> > > +	HWPfChaFabPllDivq0                    =  0x00C40020,
> > > +	HWPfChaFabPllDivq1                    =  0x00C40024,
> > > +	HWPfChaFabPllDivq2                    =  0x00C40028,
> > > +	HWPfChaFabPllDivq3                    =  0x00C4002C,
> > > +	HWPfChaFabPllDivq4                    =  0x00C40030,
> > > +	HWPfChaFabPllDivq5                    =  0x00C40034,
> > > +	HWPfChaFabPllDivq6                    =  0x00C40038,
> > > +	HWPfChaFabPllDivq7                    =  0x00C4003C,
> > > +	HWPfChaDl5gPllPllrst                  =  0x00C40080,
> > > +	HWPfChaDl5gPllClk0                    =  0x00C40084,
> > > +	HWPfChaDl5gPllClk1                    =  0x00C40088,
> > > +	HWPfChaDl5gPllBwadj                   =  0x00C4008C,
> > > +	HWPfChaDl5gPllLbw                     =  0x00C40090,
> > > +	HWPfChaDl5gPllResetq                  =  0x00C40094,
> > > +	HWPfChaDl5gPllPhshft0                 =  0x00C40098,
> > > +	HWPfChaDl5gPllPhshft1                 =  0x00C4009C,
> > > +	HWPfChaDl5gPllDivq0                   =  0x00C400A0,
> > > +	HWPfChaDl5gPllDivq1                   =  0x00C400A4,
> > > +	HWPfChaDl5gPllDivq2                   =  0x00C400A8,
> > > +	HWPfChaDl5gPllDivq3                   =  0x00C400AC,
> > > +	HWPfChaDl5gPllDivq4                   =  0x00C400B0,
> > > +	HWPfChaDl5gPllDivq5                   =  0x00C400B4,
> > > +	HWPfChaDl5gPllDivq6                   =  0x00C400B8,
> > > +	HWPfChaDl5gPllDivq7                   =  0x00C400BC,
> > > +	HWPfChaDl4gPllPllrst                  =  0x00C40100,
> > > +	HWPfChaDl4gPllClk0                    =  0x00C40104,
> > > +	HWPfChaDl4gPllClk1                    =  0x00C40108,
> > > +	HWPfChaDl4gPllBwadj                   =  0x00C4010C,
> > > +	HWPfChaDl4gPllLbw                     =  0x00C40110,
> > > +	HWPfChaDl4gPllResetq                  =  0x00C40114,
> > > +	HWPfChaDl4gPllPhshft0                 =  0x00C40118,
> > > +	HWPfChaDl4gPllPhshft1                 =  0x00C4011C,
> > > +	HWPfChaDl4gPllDivq0                   =  0x00C40120,
> > > +	HWPfChaDl4gPllDivq1                   =  0x00C40124,
> > > +	HWPfChaDl4gPllDivq2                   =  0x00C40128,
> > > +	HWPfChaDl4gPllDivq3                   =  0x00C4012C,
> > > +	HWPfChaDl4gPllDivq4                   =  0x00C40130,
> > > +	HWPfChaDl4gPllDivq5                   =  0x00C40134,
> > > +	HWPfChaDl4gPllDivq6                   =  0x00C40138,
> > > +	HWPfChaDl4gPllDivq7                   =  0x00C4013C,
> > > +	HWPfChaUl5gPllPllrst                  =  0x00C40180,
> > > +	HWPfChaUl5gPllClk0                    =  0x00C40184,
> > > +	HWPfChaUl5gPllClk1                    =  0x00C40188,
> > > +	HWPfChaUl5gPllBwadj                   =  0x00C4018C,
> > > +	HWPfChaUl5gPllLbw                     =  0x00C40190,
> > > +	HWPfChaUl5gPllResetq                  =  0x00C40194,
> > > +	HWPfChaUl5gPllPhshft0                 =  0x00C40198,
> > > +	HWPfChaUl5gPllPhshft1                 =  0x00C4019C,
> > > +	HWPfChaUl5gPllDivq0                   =  0x00C401A0,
> > > +	HWPfChaUl5gPllDivq1                   =  0x00C401A4,
> > > +	HWPfChaUl5gPllDivq2                   =  0x00C401A8,
> > > +	HWPfChaUl5gPllDivq3                   =  0x00C401AC,
> > > +	HWPfChaUl5gPllDivq4                   =  0x00C401B0,
> > > +	HWPfChaUl5gPllDivq5                   =  0x00C401B4,
> > > +	HWPfChaUl5gPllDivq6                   =  0x00C401B8,
> > > +	HWPfChaUl5gPllDivq7                   =  0x00C401BC,
> > > +	HWPfChaUl4gPllPllrst                  =  0x00C40200,
> > > +	HWPfChaUl4gPllClk0                    =  0x00C40204,
> > > +	HWPfChaUl4gPllClk1                    =  0x00C40208,
> > > +	HWPfChaUl4gPllBwadj                   =  0x00C4020C,
> > > +	HWPfChaUl4gPllLbw                     =  0x00C40210,
> > > +	HWPfChaUl4gPllResetq                  =  0x00C40214,
> > > +	HWPfChaUl4gPllPhshft0                 =  0x00C40218,
> > > +	HWPfChaUl4gPllPhshft1                 =  0x00C4021C,
> > > +	HWPfChaUl4gPllDivq0                   =  0x00C40220,
> > > +	HWPfChaUl4gPllDivq1                   =  0x00C40224,
> > > +	HWPfChaUl4gPllDivq2                   =  0x00C40228,
> > > +	HWPfChaUl4gPllDivq3                   =  0x00C4022C,
> > > +	HWPfChaUl4gPllDivq4                   =  0x00C40230,
> > > +	HWPfChaUl4gPllDivq5                   =  0x00C40234,
> > > +	HWPfChaUl4gPllDivq6                   =  0x00C40238,
> > > +	HWPfChaUl4gPllDivq7                   =  0x00C4023C,
> > > +	HWPfChaDdrPllPllrst                   =  0x00C40280,
> > > +	HWPfChaDdrPllClk0                     =  0x00C40284,
> > > +	HWPfChaDdrPllClk1                     =  0x00C40288,
> > > +	HWPfChaDdrPllBwadj                    =  0x00C4028C,
> > > +	HWPfChaDdrPllLbw                      =  0x00C40290,
> > > +	HWPfChaDdrPllResetq                   =  0x00C40294,
> > > +	HWPfChaDdrPllPhshft0                  =  0x00C40298,
> > > +	HWPfChaDdrPllPhshft1                  =  0x00C4029C,
> > > +	HWPfChaDdrPllDivq0                    =  0x00C402A0,
> > > +	HWPfChaDdrPllDivq1                    =  0x00C402A4,
> > > +	HWPfChaDdrPllDivq2                    =  0x00C402A8,
> > > +	HWPfChaDdrPllDivq3                    =  0x00C402AC,
> > > +	HWPfChaDdrPllDivq4                    =  0x00C402B0,
> > > +	HWPfChaDdrPllDivq5                    =  0x00C402B4,
> > > +	HWPfChaDdrPllDivq6                    =  0x00C402B8,
> > > +	HWPfChaDdrPllDivq7                    =  0x00C402BC,
> > > +	HWPfChaErrStatus                      =  0x00C40400,
> > > +	HWPfChaErrMask                        =  0x00C40404,
> > > +	HWPfChaDebugPcieMsiFifo               =  0x00C40410,
> > > +	HWPfChaDebugDdrMsiFifo                =  0x00C40414,
> > > +	HWPfChaDebugMiscMsiFifo               =  0x00C40418,
> > > +	HWPfChaPwmSet                         =  0x00C40420,
> > > +	HWPfChaDdrRstStatus                   =  0x00C40430,
> > > +	HWPfChaDdrStDoneStatus                =  0x00C40434,
> > > +	HWPfChaDdrWbRstCfg                    =  0x00C40438,
> > > +	HWPfChaDdrApbRstCfg                   =  0x00C4043C,
> > > +	HWPfChaDdrPhyRstCfg                   =  0x00C40440,
> > > +	HWPfChaDdrCpuRstCfg                   =  0x00C40444,
> > > +	HWPfChaDdrSifRstCfg                   =  0x00C40448,
> > > +	HWPfChaPadcfgPcomp0                   =  0x00C41000,
> > > +	HWPfChaPadcfgNcomp0                   =  0x00C41004,
> > > +	HWPfChaPadcfgOdt0                     =  0x00C41008,
> > > +	HWPfChaPadcfgProtect0                 =  0x00C4100C,
> > > +	HWPfChaPreemphasisProtect0            =  0x00C41010,
> > > +	HWPfChaPreemphasisCompen0             =  0x00C41040,
> > > +	HWPfChaPreemphasisOdten0              =  0x00C41044,
> > > +	HWPfChaPadcfgPcomp1                   =  0x00C41100,
> > > +	HWPfChaPadcfgNcomp1                   =  0x00C41104,
> > > +	HWPfChaPadcfgOdt1                     =  0x00C41108,
> > > +	HWPfChaPadcfgProtect1                 =  0x00C4110C,
> > > +	HWPfChaPreemphasisProtect1            =  0x00C41110,
> > > +	HWPfChaPreemphasisCompen1             =  0x00C41140,
> > > +	HWPfChaPreemphasisOdten1              =  0x00C41144,
> > > +	HWPfChaPadcfgPcomp2                   =  0x00C41200,
> > > +	HWPfChaPadcfgNcomp2                   =  0x00C41204,
> > > +	HWPfChaPadcfgOdt2                     =  0x00C41208,
> > > +	HWPfChaPadcfgProtect2                 =  0x00C4120C,
> > > +	HWPfChaPreemphasisProtect2            =  0x00C41210,
> > > +	HWPfChaPreemphasisCompen2             =  0x00C41240,
> > > +	HWPfChaPreemphasisOdten4              =  0x00C41444,
> > > +	HWPfChaPreemphasisOdten2              =  0x00C41244,
> > > +	HWPfChaPadcfgPcomp3                   =  0x00C41300,
> > > +	HWPfChaPadcfgNcomp3                   =  0x00C41304,
> > > +	HWPfChaPadcfgOdt3                     =  0x00C41308,
> > > +	HWPfChaPadcfgProtect3                 =  0x00C4130C,
> > > +	HWPfChaPreemphasisProtect3            =  0x00C41310,
> > > +	HWPfChaPreemphasisCompen3             =  0x00C41340,
> > > +	HWPfChaPreemphasisOdten3              =  0x00C41344,
> > > +	HWPfChaPadcfgPcomp4                   =  0x00C41400,
> > > +	HWPfChaPadcfgNcomp4                   =  0x00C41404,
> > > +	HWPfChaPadcfgOdt4                     =  0x00C41408,
> > > +	HWPfChaPadcfgProtect4                 =  0x00C4140C,
> > > +	HWPfChaPreemphasisProtect4            =  0x00C41410,
> > > +	HWPfChaPreemphasisCompen4             =  0x00C41440,
> > > +	HWPfHiVfToPfDbellVf                   =  0x00C80000,
> > > +	HWPfHiPfToVfDbellVf                   =  0x00C80008,
> > > +	HWPfHiInfoRingBaseLoVf                =  0x00C80010,
> > > +	HWPfHiInfoRingBaseHiVf                =  0x00C80014,
> > > +	HWPfHiInfoRingPointerVf               =  0x00C80018,
> > > +	HWPfHiInfoRingIntWrEnVf               =  0x00C80020,
> > > +	HWPfHiInfoRingPf2VfWrEnVf             =  0x00C80024,
> > > +	HWPfHiMsixVectorMapperVf              =  0x00C80060,
> > > +	HWPfHiModuleVersionReg                =  0x00C84000,
> > > +	HWPfHiIosf2axiErrLogReg               =  0x00C84004,
> > > +	HWPfHiHardResetReg                    =  0x00C84008,
> > > +	HWPfHi5GHardResetReg                  =  0x00C8400C,
> > > +	HWPfHiInfoRingBaseLoRegPf             =  0x00C84010,
> > > +	HWPfHiInfoRingBaseHiRegPf             =  0x00C84014,
> > > +	HWPfHiInfoRingPointerRegPf            =  0x00C84018,
> > > +	HWPfHiInfoRingIntWrEnRegPf            =  0x00C84020,
> > > +	HWPfHiInfoRingVf2pfLoWrEnReg          =  0x00C84024,
> > > +	HWPfHiInfoRingVf2pfHiWrEnReg          =  0x00C84028,
> > > +	HWPfHiLogParityErrStatusReg           =  0x00C8402C,
> > > +	HWPfHiLogDataParityErrorVfStatusLo    =  0x00C84030,
> > > +	HWPfHiLogDataParityErrorVfStatusHi    =  0x00C84034,
> > > +	HWPfHiBlockTransmitOnErrorEn          =  0x00C84038,
> > > +	HWPfHiCfgMsiIntWrEnRegPf              =  0x00C84040,
> > > +	HWPfHiCfgMsiVf2pfLoWrEnReg            =  0x00C84044,
> > > +	HWPfHiCfgMsiVf2pfHighWrEnReg          =  0x00C84048,
> > > +	HWPfHiMsixVectorMapperPf              =  0x00C84060,
> > > +	HWPfHiApbWrWaitTime                   =  0x00C84100,
> > > +	HWPfHiXCounterMaxValue                =  0x00C84104,
> > > +	HWPfHiPfMode                          =  0x00C84108,
> > > +	HWPfHiClkGateHystReg                  =  0x00C8410C,
> > > +	HWPfHiSnoopBitsReg                    =  0x00C84110,
> > > +	HWPfHiMsiDropEnableReg                =  0x00C84114,
> > > +	HWPfHiMsiStatReg                      =  0x00C84120,
> > > +	HWPfHiFifoOflStatReg                  =  0x00C84124,
> > > +	HWPfHiHiDebugReg                      =  0x00C841F4,
> > > +	HWPfHiDebugMemSnoopMsiFifo            =  0x00C841F8,
> > > +	HWPfHiDebugMemSnoopInputFifo          =  0x00C841FC,
> > > +	HWPfHiMsixMappingConfig               =  0x00C84200,
> > > +	HWPfHiJunkReg                         =  0x00C8FF00,
> > > +	HWPfDdrUmmcVer                        =  0x00D00000,
> > > +	HWPfDdrUmmcCap                        =  0x00D00010,
> > > +	HWPfDdrUmmcCtrl                       =  0x00D00020,
> > > +	HWPfDdrMpcPe                          =  0x00D00080,
> > > +	HWPfDdrMpcPpri3                       =  0x00D00090,
> > > +	HWPfDdrMpcPpri2                       =  0x00D000A0,
> > > +	HWPfDdrMpcPpri1                       =  0x00D000B0,
> > > +	HWPfDdrMpcPpri0                       =  0x00D000C0,
> > > +	HWPfDdrMpcPrwgrpCtrl                  =  0x00D000D0,
> > > +	HWPfDdrMpcPbw7                        =  0x00D000E0,
> > > +	HWPfDdrMpcPbw6                        =  0x00D000F0,
> > > +	HWPfDdrMpcPbw5                        =  0x00D00100,
> > > +	HWPfDdrMpcPbw4                        =  0x00D00110,
> > > +	HWPfDdrMpcPbw3                        =  0x00D00120,
> > > +	HWPfDdrMpcPbw2                        =  0x00D00130,
> > > +	HWPfDdrMpcPbw1                        =  0x00D00140,
> > > +	HWPfDdrMpcPbw0                        =  0x00D00150,
> > > +	HWPfDdrMemoryInit                     =  0x00D00200,
> > > +	HWPfDdrMemoryInitDone                 =  0x00D00210,
> > > +	HWPfDdrMemInitPhyTrng0                =  0x00D00240,
> > > +	HWPfDdrMemInitPhyTrng1                =  0x00D00250,
> > > +	HWPfDdrMemInitPhyTrng2                =  0x00D00260,
> > > +	HWPfDdrMemInitPhyTrng3                =  0x00D00270,
> > > +	HWPfDdrBcDram                         =  0x00D003C0,
> > > +	HWPfDdrBcAddrMap                      =  0x00D003D0,
> > > +	HWPfDdrBcRef                          =  0x00D003E0,
> > > +	HWPfDdrBcTim0                         =  0x00D00400,
> > > +	HWPfDdrBcTim1                         =  0x00D00410,
> > > +	HWPfDdrBcTim2                         =  0x00D00420,
> > > +	HWPfDdrBcTim3                         =  0x00D00430,
> > > +	HWPfDdrBcTim4                         =  0x00D00440,
> > > +	HWPfDdrBcTim5                         =  0x00D00450,
> > > +	HWPfDdrBcTim6                         =  0x00D00460,
> > > +	HWPfDdrBcTim7                         =  0x00D00470,
> > > +	HWPfDdrBcTim8                         =  0x00D00480,
> > > +	HWPfDdrBcTim9                         =  0x00D00490,
> > > +	HWPfDdrBcTim10                        =  0x00D004A0,
> > > +	HWPfDdrBcTim12                        =  0x00D004C0,
> > > +	HWPfDdrDfiInit                        =  0x00D004D0,
> > > +	HWPfDdrDfiInitComplete                =  0x00D004E0,
> > > +	HWPfDdrDfiTim0                        =  0x00D004F0,
> > > +	HWPfDdrDfiTim1                        =  0x00D00500,
> > > +	HWPfDdrDfiPhyUpdEn                    =  0x00D00530,
> > > +	HWPfDdrMemStatus                      =  0x00D00540,
> > > +	HWPfDdrUmmcErrStatus                  =  0x00D00550,
> > > +	HWPfDdrUmmcIntStatus                  =  0x00D00560,
> > > +	HWPfDdrUmmcIntEn                      =  0x00D00570,
> > > +	HWPfDdrPhyRdLatency                   =  0x00D48400,
> > > +	HWPfDdrPhyRdLatencyDbi                =  0x00D48410,
> > > +	HWPfDdrPhyWrLatency                   =  0x00D48420,
> > > +	HWPfDdrPhyTrngType                    =  0x00D48430,
> > > +	HWPfDdrPhyMrsTiming2                  =  0x00D48440,
> > > +	HWPfDdrPhyMrsTiming0                  =  0x00D48450,
> > > +	HWPfDdrPhyMrsTiming1                  =  0x00D48460,
> > > +	HWPfDdrPhyDramTmrd                    =  0x00D48470,
> > > +	HWPfDdrPhyDramTmod                    =  0x00D48480,
> > > +	HWPfDdrPhyDramTwpre                   =  0x00D48490,
> > > +	HWPfDdrPhyDramTrfc                    =  0x00D484A0,
> > > +	HWPfDdrPhyDramTrwtp                   =  0x00D484B0,
> > > +	HWPfDdrPhyMr01Dimm                    =  0x00D484C0,
> > > +	HWPfDdrPhyMr01DimmDbi                 =  0x00D484D0,
> > > +	HWPfDdrPhyMr23Dimm                    =  0x00D484E0,
> > > +	HWPfDdrPhyMr45Dimm                    =  0x00D484F0,
> > > +	HWPfDdrPhyMr67Dimm                    =  0x00D48500,
> > > +	HWPfDdrPhyWrlvlWwRdlvlRr              =  0x00D48510,
> > > +	HWPfDdrPhyOdtEn                       =  0x00D48520,
> > > +	HWPfDdrPhyFastTrng                    =  0x00D48530,
> > > +	HWPfDdrPhyDynTrngGap                  =  0x00D48540,
> > > +	HWPfDdrPhyDynRcalGap                  =  0x00D48550,
> > > +	HWPfDdrPhyIdletimeout                 =  0x00D48560,
> > > +	HWPfDdrPhyRstCkeGap                   =  0x00D48570,
> > > +	HWPfDdrPhyCkeMrsGap                   =  0x00D48580,
> > > +	HWPfDdrPhyMemVrefMidVal               =  0x00D48590,
> > > +	HWPfDdrPhyVrefStep                    =  0x00D485A0,
> > > +	HWPfDdrPhyVrefThreshold               =  0x00D485B0,
> > > +	HWPfDdrPhyPhyVrefMidVal               =  0x00D485C0,
> > > +	HWPfDdrPhyDqsCountMax                 =  0x00D485D0,
> > > +	HWPfDdrPhyDqsCountNum                 =  0x00D485E0,
> > > +	HWPfDdrPhyDramRow                     =  0x00D485F0,
> > > +	HWPfDdrPhyDramCol                     =  0x00D48600,
> > > +	HWPfDdrPhyDramBgBa                    =  0x00D48610,
> > > +	HWPfDdrPhyDynamicUpdreqrel            =  0x00D48620,
> > > +	HWPfDdrPhyVrefLimits                  =  0x00D48630,
> > > +	HWPfDdrPhyIdtmTcStatus                =  0x00D6C020,
> > > +	HWPfDdrPhyIdtmFwVersion               =  0x00D6C410,
> > > +	HWPfDdrPhyRdlvlGateInitDelay          =  0x00D70000,
> > > +	HWPfDdrPhyRdenSmplabc                 =  0x00D70008,
> > > +	HWPfDdrPhyVrefNibble0                 =  0x00D7000C,
> > > +	HWPfDdrPhyVrefNibble1                 =  0x00D70010,
> > > +	HWPfDdrPhyRdlvlGateDqsSmpl0           =  0x00D70014,
> > > +	HWPfDdrPhyRdlvlGateDqsSmpl1           =  0x00D70018,
> > > +	HWPfDdrPhyRdlvlGateDqsSmpl2           =  0x00D7001C,
> > > +	HWPfDdrPhyDqsCount                    =  0x00D70020,
> > > +	HWPfDdrPhyWrlvlRdlvlGateStatus        =  0x00D70024,
> > > +	HWPfDdrPhyErrorFlags                  =  0x00D70028,
> > > +	HWPfDdrPhyPowerDown                   =  0x00D70030,
> > > +	HWPfDdrPhyPrbsSeedByte0               =  0x00D70034,
> > > +	HWPfDdrPhyPrbsSeedByte1               =  0x00D70038,
> > > +	HWPfDdrPhyPcompDq                     =  0x00D70040,
> > > +	HWPfDdrPhyNcompDq                     =  0x00D70044,
> > > +	HWPfDdrPhyPcompDqs                    =  0x00D70048,
> > > +	HWPfDdrPhyNcompDqs                    =  0x00D7004C,
> > > +	HWPfDdrPhyPcompCmd                    =  0x00D70050,
> > > +	HWPfDdrPhyNcompCmd                    =  0x00D70054,
> > > +	HWPfDdrPhyPcompCk                     =  0x00D70058,
> > > +	HWPfDdrPhyNcompCk                     =  0x00D7005C,
> > > +	HWPfDdrPhyRcalOdtDq                   =  0x00D70060,
> > > +	HWPfDdrPhyRcalOdtDqs                  =  0x00D70064,
> > > +	HWPfDdrPhyRcalMask1                   =  0x00D70068,
> > > +	HWPfDdrPhyRcalMask2                   =  0x00D7006C,
> > > +	HWPfDdrPhyRcalCtrl                    =  0x00D70070,
> > > +	HWPfDdrPhyRcalCnt                     =  0x00D70074,
> > > +	HWPfDdrPhyRcalOverride                =  0x00D70078,
> > > +	HWPfDdrPhyRcalGateen                  =  0x00D7007C,
> > > +	HWPfDdrPhyCtrl                        =  0x00D70080,
> > > +	HWPfDdrPhyWrlvlAlg                    =  0x00D70084,
> > > +	HWPfDdrPhyRcalVreftTxcmdOdt           =  0x00D70088,
> > > +	HWPfDdrPhyRdlvlGateParam              =  0x00D7008C,
> > > +	HWPfDdrPhyRdlvlGateParam2             =  0x00D70090,
> > > +	HWPfDdrPhyRcalVreftTxdata             =  0x00D70094,
> > > +	HWPfDdrPhyCmdIntDelay                 =  0x00D700A4,
> > > +	HWPfDdrPhyAlertN                      =  0x00D700A8,
> > > +	HWPfDdrPhyTrngReqWpre2tck             =  0x00D700AC,
> > > +	HWPfDdrPhyCmdPhaseSel                 =  0x00D700B4,
> > > +	HWPfDdrPhyCmdDcdl                     =  0x00D700B8,
> > > +	HWPfDdrPhyCkDcdl                      =  0x00D700BC,
> > > +	HWPfDdrPhySwTrngCtrl1                 =  0x00D700C0,
> > > +	HWPfDdrPhySwTrngCtrl2                 =  0x00D700C4,
> > > +	HWPfDdrPhyRcalPcompRden               =  0x00D700C8,
> > > +	HWPfDdrPhyRcalNcompRden               =  0x00D700CC,
> > > +	HWPfDdrPhyRcalCompen                  =  0x00D700D0,
> > > +	HWPfDdrPhySwTrngRdqs                  =  0x00D700D4,
> > > +	HWPfDdrPhySwTrngWdqs                  =  0x00D700D8,
> > > +	HWPfDdrPhySwTrngRdena                 =  0x00D700DC,
> > > +	HWPfDdrPhySwTrngRdenb                 =  0x00D700E0,
> > > +	HWPfDdrPhySwTrngRdenc                 =  0x00D700E4,
> > > +	HWPfDdrPhySwTrngWdq                   =  0x00D700E8,
> > > +	HWPfDdrPhySwTrngRdq                   =  0x00D700EC,
> > > +	HWPfDdrPhyPcfgHmValue                 =  0x00D700F0,
> > > +	HWPfDdrPhyPcfgTimerValue              =  0x00D700F4,
> > > +	HWPfDdrPhyPcfgSoftwareTraining        =  0x00D700F8,
> > > +	HWPfDdrPhyPcfgMcStatus                =  0x00D700FC,
> > > +	HWPfDdrPhyWrlvlPhRank0                =  0x00D70100,
> > > +	HWPfDdrPhyRdenPhRank0                 =  0x00D70104,
> > > +	HWPfDdrPhyRdenIntRank0                =  0x00D70108,
> > > +	HWPfDdrPhyRdqsDcdlRank0               =  0x00D7010C,
> > > +	HWPfDdrPhyRdqsShadowDcdlRank0         =  0x00D70110,
> > > +	HWPfDdrPhyWdqsDcdlRank0               =  0x00D70114,
> > > +	HWPfDdrPhyWdmDcdlShadowRank0          =  0x00D70118,
> > > +	HWPfDdrPhyWdmDcdlRank0                =  0x00D7011C,
> > > +	HWPfDdrPhyDbiDcdlRank0                =  0x00D70120,
> > > +	HWPfDdrPhyRdenDcdlaRank0              =  0x00D70124,
> > > +	HWPfDdrPhyDbiDcdlShadowRank0          =  0x00D70128,
> > > +	HWPfDdrPhyRdenDcdlbRank0              =  0x00D7012C,
> > > +	HWPfDdrPhyWdqsShadowDcdlRank0         =  0x00D70130,
> > > +	HWPfDdrPhyRdenDcdlcRank0              =  0x00D70134,
> > > +	HWPfDdrPhyRdenShadowDcdlaRank0        =  0x00D70138,
> > > +	HWPfDdrPhyWrlvlIntRank0               =  0x00D7013C,
> > > +	HWPfDdrPhyRdqDcdlBit0Rank0            =  0x00D70200,
> > > +	HWPfDdrPhyRdqDcdlShadowBit0Rank0      =  0x00D70204,
> > > +	HWPfDdrPhyWdqDcdlBit0Rank0            =  0x00D70208,
> > > +	HWPfDdrPhyWdqDcdlShadowBit0Rank0      =  0x00D7020C,
> > > +	HWPfDdrPhyRdqDcdlBit1Rank0            =  0x00D70240,
> > > +	HWPfDdrPhyRdqDcdlShadowBit1Rank0      =  0x00D70244,
> > > +	HWPfDdrPhyWdqDcdlBit1Rank0            =  0x00D70248,
> > > +	HWPfDdrPhyWdqDcdlShadowBit1Rank0      =  0x00D7024C,
> > > +	HWPfDdrPhyRdqDcdlBit2Rank0            =  0x00D70280,
> > > +	HWPfDdrPhyRdqDcdlShadowBit2Rank0      =  0x00D70284,
> > > +	HWPfDdrPhyWdqDcdlBit2Rank0            =  0x00D70288,
> > > +	HWPfDdrPhyWdqDcdlShadowBit2Rank0      =  0x00D7028C,
> > > +	HWPfDdrPhyRdqDcdlBit3Rank0            =  0x00D702C0,
> > > +	HWPfDdrPhyRdqDcdlShadowBit3Rank0      =  0x00D702C4,
> > > +	HWPfDdrPhyWdqDcdlBit3Rank0            =  0x00D702C8,
> > > +	HWPfDdrPhyWdqDcdlShadowBit3Rank0      =  0x00D702CC,
> > > +	HWPfDdrPhyRdqDcdlBit4Rank0            =  0x00D70300,
> > > +	HWPfDdrPhyRdqDcdlShadowBit4Rank0      =  0x00D70304,
> > > +	HWPfDdrPhyWdqDcdlBit4Rank0            =  0x00D70308,
> > > +	HWPfDdrPhyWdqDcdlShadowBit4Rank0      =  0x00D7030C,
> > > +	HWPfDdrPhyRdqDcdlBit5Rank0            =  0x00D70340,
> > > +	HWPfDdrPhyRdqDcdlShadowBit5Rank0      =  0x00D70344,
> > > +	HWPfDdrPhyWdqDcdlBit5Rank0            =  0x00D70348,
> > > +	HWPfDdrPhyWdqDcdlShadowBit5Rank0      =  0x00D7034C,
> > > +	HWPfDdrPhyRdqDcdlBit6Rank0            =  0x00D70380,
> > > +	HWPfDdrPhyRdqDcdlShadowBit6Rank0      =  0x00D70384,
> > > +	HWPfDdrPhyWdqDcdlBit6Rank0            =  0x00D70388,
> > > +	HWPfDdrPhyWdqDcdlShadowBit6Rank0      =  0x00D7038C,
> > > +	HWPfDdrPhyRdqDcdlBit7Rank0            =  0x00D703C0,
> > > +	HWPfDdrPhyRdqDcdlShadowBit7Rank0      =  0x00D703C4,
> > > +	HWPfDdrPhyWdqDcdlBit7Rank0            =  0x00D703C8,
> > > +	HWPfDdrPhyWdqDcdlShadowBit7Rank0      =  0x00D703CC,
> > > +	HWPfDdrPhyIdtmStatus                  =  0x00D740D0,
> > > +	HWPfDdrPhyIdtmError                   =  0x00D74110,
> > > +	HWPfDdrPhyIdtmDebug                   =  0x00D74120,
> > > +	HWPfDdrPhyIdtmDebugInt                =  0x00D74130,
> > > +	HwPfPcieLnAsicCfgovr                  =  0x00D80000,
> > > +	HwPfPcieLnAclkmixer                   =  0x00D80004,
> > > +	HwPfPcieLnTxrampfreq                  =  0x00D80008,
> > > +	HwPfPcieLnLanetest                    =  0x00D8000C,
> > > +	HwPfPcieLnDcctrl                      =  0x00D80010,
> > > +	HwPfPcieLnDccmeas                     =  0x00D80014,
> > > +	HwPfPcieLnDccovrAclk                  =  0x00D80018,
> > > +	HwPfPcieLnDccovrTxa                   =  0x00D8001C,
> > > +	HwPfPcieLnDccovrTxk                   =  0x00D80020,
> > > +	HwPfPcieLnDccovrDclk                  =  0x00D80024,
> > > +	HwPfPcieLnDccovrEclk                  =  0x00D80028,
> > > +	HwPfPcieLnDcctrimAclk                 =  0x00D8002C,
> > > +	HwPfPcieLnDcctrimTx                   =  0x00D80030,
> > > +	HwPfPcieLnDcctrimDclk                 =  0x00D80034,
> > > +	HwPfPcieLnDcctrimEclk                 =  0x00D80038,
> > > +	HwPfPcieLnQuadCtrl                    =  0x00D8003C,
> > > +	HwPfPcieLnQuadCorrIndex               =  0x00D80040,
> > > +	HwPfPcieLnQuadCorrStatus              =  0x00D80044,
> > > +	HwPfPcieLnAsicRxovr1                  =  0x00D80048,
> > > +	HwPfPcieLnAsicRxovr2                  =  0x00D8004C,
> > > +	HwPfPcieLnAsicEqinfovr                =  0x00D80050,
> > > +	HwPfPcieLnRxcsr                       =  0x00D80054,
> > > +	HwPfPcieLnRxfectrl                    =  0x00D80058,
> > > +	HwPfPcieLnRxtest                      =  0x00D8005C,
> > > +	HwPfPcieLnEscount                     =  0x00D80060,
> > > +	HwPfPcieLnCdrctrl                     =  0x00D80064,
> > > +	HwPfPcieLnCdrctrl2                    =  0x00D80068,
> > > +	HwPfPcieLnCdrcfg0Ctrl0                =  0x00D8006C,
> > > +	HwPfPcieLnCdrcfg0Ctrl1                =  0x00D80070,
> > > +	HwPfPcieLnCdrcfg0Ctrl2                =  0x00D80074,
> > > +	HwPfPcieLnCdrcfg1Ctrl0                =  0x00D80078,
> > > +	HwPfPcieLnCdrcfg1Ctrl1                =  0x00D8007C,
> > > +	HwPfPcieLnCdrcfg1Ctrl2                =  0x00D80080,
> > > +	HwPfPcieLnCdrcfg2Ctrl0                =  0x00D80084,
> > > +	HwPfPcieLnCdrcfg2Ctrl1                =  0x00D80088,
> > > +	HwPfPcieLnCdrcfg2Ctrl2                =  0x00D8008C,
> > > +	HwPfPcieLnCdrcfg3Ctrl0                =  0x00D80090,
> > > +	HwPfPcieLnCdrcfg3Ctrl1                =  0x00D80094,
> > > +	HwPfPcieLnCdrcfg3Ctrl2                =  0x00D80098,
> > > +	HwPfPcieLnCdrphase                    =  0x00D8009C,
> > > +	HwPfPcieLnCdrfreq                     =  0x00D800A0,
> > > +	HwPfPcieLnCdrstatusPhase              =  0x00D800A4,
> > > +	HwPfPcieLnCdrstatusFreq               =  0x00D800A8,
> > > +	HwPfPcieLnCdroffset                   =  0x00D800AC,
> > > +	HwPfPcieLnRxvosctl                    =  0x00D800B0,
> > > +	HwPfPcieLnRxvosctl2                   =  0x00D800B4,
> > > +	HwPfPcieLnRxlosctl                    =  0x00D800B8,
> > > +	HwPfPcieLnRxlos                       =  0x00D800BC,
> > > +	HwPfPcieLnRxlosvval                   =  0x00D800C0,
> > > +	HwPfPcieLnRxvosd0                     =  0x00D800C4,
> > > +	HwPfPcieLnRxvosd1                     =  0x00D800C8,
> > > +	HwPfPcieLnRxvosep0                    =  0x00D800CC,
> > > +	HwPfPcieLnRxvosep1                    =  0x00D800D0,
> > > +	HwPfPcieLnRxvosen0                    =  0x00D800D4,
> > > +	HwPfPcieLnRxvosen1                    =  0x00D800D8,
> > > +	HwPfPcieLnRxvosafe                    =  0x00D800DC,
> > > +	HwPfPcieLnRxvosa0                     =  0x00D800E0,
> > > +	HwPfPcieLnRxvosa0Out                  =  0x00D800E4,
> > > +	HwPfPcieLnRxvosa1                     =  0x00D800E8,
> > > +	HwPfPcieLnRxvosa1Out                  =  0x00D800EC,
> > > +	HwPfPcieLnRxmisc                      =  0x00D800F0,
> > > +	HwPfPcieLnRxbeacon                    =  0x00D800F4,
> > > +	HwPfPcieLnRxdssout                    =  0x00D800F8,
> > > +	HwPfPcieLnRxdssout2                   =  0x00D800FC,
> > > +	HwPfPcieLnAlphapctrl                  =  0x00D80100,
> > > +	HwPfPcieLnAlphanctrl                  =  0x00D80104,
> > > +	HwPfPcieLnAdaptctrl                   =  0x00D80108,
> > > +	HwPfPcieLnAdaptctrl1                  =  0x00D8010C,
> > > +	HwPfPcieLnAdaptstatus                 =  0x00D80110,
> > > +	HwPfPcieLnAdaptvga1                   =  0x00D80114,
> > > +	HwPfPcieLnAdaptvga2                   =  0x00D80118,
> > > +	HwPfPcieLnAdaptvga3                   =  0x00D8011C,
> > > +	HwPfPcieLnAdaptvga4                   =  0x00D80120,
> > > +	HwPfPcieLnAdaptboost1                 =  0x00D80124,
> > > +	HwPfPcieLnAdaptboost2                 =  0x00D80128,
> > > +	HwPfPcieLnAdaptboost3                 =  0x00D8012C,
> > > +	HwPfPcieLnAdaptboost4                 =  0x00D80130,
> > > +	HwPfPcieLnAdaptsslms1                 =  0x00D80134,
> > > +	HwPfPcieLnAdaptsslms2                 =  0x00D80138,
> > > +	HwPfPcieLnAdaptvgaStatus              =  0x00D8013C,
> > > +	HwPfPcieLnAdaptboostStatus            =  0x00D80140,
> > > +	HwPfPcieLnAdaptsslmsStatus1           =  0x00D80144,
> > > +	HwPfPcieLnAdaptsslmsStatus2           =  0x00D80148,
> > > +	HwPfPcieLnAfectrl1                    =  0x00D8014C,
> > > +	HwPfPcieLnAfectrl2                    =  0x00D80150,
> > > +	HwPfPcieLnAfectrl3                    =  0x00D80154,
> > > +	HwPfPcieLnAfedefault1                 =  0x00D80158,
> > > +	HwPfPcieLnAfedefault2                 =  0x00D8015C,
> > > +	HwPfPcieLnDfectrl1                    =  0x00D80160,
> > > +	HwPfPcieLnDfectrl2                    =  0x00D80164,
> > > +	HwPfPcieLnDfectrl3                    =  0x00D80168,
> > > +	HwPfPcieLnDfectrl4                    =  0x00D8016C,
> > > +	HwPfPcieLnDfectrl5                    =  0x00D80170,
> > > +	HwPfPcieLnDfectrl6                    =  0x00D80174,
> > > +	HwPfPcieLnAfestatus1                  =  0x00D80178,
> > > +	HwPfPcieLnAfestatus2                  =  0x00D8017C,
> > > +	HwPfPcieLnDfestatus1                  =  0x00D80180,
> > > +	HwPfPcieLnDfestatus2                  =  0x00D80184,
> > > +	HwPfPcieLnDfestatus3                  =  0x00D80188,
> > > +	HwPfPcieLnDfestatus4                  =  0x00D8018C,
> > > +	HwPfPcieLnDfestatus5                  =  0x00D80190,
> > > +	HwPfPcieLnAlphastatus                 =  0x00D80194,
> > > +	HwPfPcieLnFomctrl1                    =  0x00D80198,
> > > +	HwPfPcieLnFomctrl2                    =  0x00D8019C,
> > > +	HwPfPcieLnFomctrl3                    =  0x00D801A0,
> > > +	HwPfPcieLnAclkcalStatus               =  0x00D801A4,
> > > +	HwPfPcieLnOffscorrStatus              =  0x00D801A8,
> > > +	HwPfPcieLnEyewidthStatus              =  0x00D801AC,
> > > +	HwPfPcieLnEyeheightStatus             =  0x00D801B0,
> > > +	HwPfPcieLnAsicTxovr1                  =  0x00D801B4,
> > > +	HwPfPcieLnAsicTxovr2                  =  0x00D801B8,
> > > +	HwPfPcieLnAsicTxovr3                  =  0x00D801BC,
> > > +	HwPfPcieLnTxbiasadjOvr                =  0x00D801C0,
> > > +	HwPfPcieLnTxcsr                       =  0x00D801C4,
> > > +	HwPfPcieLnTxtest                      =  0x00D801C8,
> > > +	HwPfPcieLnTxtestword                  =  0x00D801CC,
> > > +	HwPfPcieLnTxtestwordHigh              =  0x00D801D0,
> > > +	HwPfPcieLnTxdrive                     =  0x00D801D4,
> > > +	HwPfPcieLnMtcsLn                      =  0x00D801D8,
> > > +	HwPfPcieLnStatsumLn                   =  0x00D801DC,
> > > +	HwPfPcieLnRcbusScratch                =  0x00D801E0,
> > > +	HwPfPcieLnRcbusMinorrev               =  0x00D801F0,
> > > +	HwPfPcieLnRcbusMajorrev               =  0x00D801F4,
> > > +	HwPfPcieLnRcbusBlocktype              =  0x00D801F8,
> > > +	HwPfPcieSupPllcsr                     =  0x00D80800,
> > > +	HwPfPcieSupPlldiv                     =  0x00D80804,
> > > +	HwPfPcieSupPllcal                     =  0x00D80808,
> > > +	HwPfPcieSupPllcalsts                  =  0x00D8080C,
> > > +	HwPfPcieSupPllmeas                    =  0x00D80810,
> > > +	HwPfPcieSupPlldactrim                 =  0x00D80814,
> > > +	HwPfPcieSupPllbiastrim                =  0x00D80818,
> > > +	HwPfPcieSupPllbwtrim                  =  0x00D8081C,
> > > +	HwPfPcieSupPllcaldly                  =  0x00D80820,
> > > +	HwPfPcieSupRefclkonpclkctrl           =  0x00D80824,
> > > +	HwPfPcieSupPclkdelay                  =  0x00D80828,
> > > +	HwPfPcieSupPhyconfig                  =  0x00D8082C,
> > > +	HwPfPcieSupRcalIntf                   =  0x00D80830,
> > > +	HwPfPcieSupAuxcsr                     =  0x00D80834,
> > > +	HwPfPcieSupVref                       =  0x00D80838,
> > > +	HwPfPcieSupLinkmode                   =  0x00D8083C,
> > > +	HwPfPcieSupRrefcalctl                 =  0x00D80840,
> > > +	HwPfPcieSupRrefcal                    =  0x00D80844,
> > > +	HwPfPcieSupRrefcaldly                 =  0x00D80848,
> > > +	HwPfPcieSupTximpcalctl                =  0x00D8084C,
> > > +	HwPfPcieSupTximpcal                   =  0x00D80850,
> > > +	HwPfPcieSupTximpoffset                =  0x00D80854,
> > > +	HwPfPcieSupTximpcaldly                =  0x00D80858,
> > > +	HwPfPcieSupRximpcalctl                =  0x00D8085C,
> > > +	HwPfPcieSupRximpcal                   =  0x00D80860,
> > > +	HwPfPcieSupRximpoffset                =  0x00D80864,
> > > +	HwPfPcieSupRximpcaldly                =  0x00D80868,
> > > +	HwPfPcieSupFence                      =  0x00D8086C,
> > > +	HwPfPcieSupMtcs                       =  0x00D80870,
> > > +	HwPfPcieSupStatsum                    =  0x00D809B8,
> > > +	HwPfPciePcsDpStatus0                  =  0x00D81000,
> > > +	HwPfPciePcsDpControl0                 =  0x00D81004,
> > > +	HwPfPciePcsPmaStatusLane0             =  0x00D81008,
> > > +	HwPfPciePcsPipeStatusLane0            =  0x00D8100C,
> > > +	HwPfPciePcsTxdeemph0Lane0             =  0x00D81010,
> > > +	HwPfPciePcsTxdeemph1Lane0             =  0x00D81014,
> > > +	HwPfPciePcsInternalStatusLane0        =  0x00D81018,
> > > +	HwPfPciePcsDpStatus1                  =  0x00D8101C,
> > > +	HwPfPciePcsDpControl1                 =  0x00D81020,
> > > +	HwPfPciePcsPmaStatusLane1             =  0x00D81024,
> > > +	HwPfPciePcsPipeStatusLane1            =  0x00D81028,
> > > +	HwPfPciePcsTxdeemph0Lane1             =  0x00D8102C,
> > > +	HwPfPciePcsTxdeemph1Lane1             =  0x00D81030,
> > > +	HwPfPciePcsInternalStatusLane1        =  0x00D81034,
> > > +	HwPfPciePcsDpStatus2                  =  0x00D81038,
> > > +	HwPfPciePcsDpControl2                 =  0x00D8103C,
> > > +	HwPfPciePcsPmaStatusLane2             =  0x00D81040,
> > > +	HwPfPciePcsPipeStatusLane2            =  0x00D81044,
> > > +	HwPfPciePcsTxdeemph0Lane2             =  0x00D81048,
> > > +	HwPfPciePcsTxdeemph1Lane2             =  0x00D8104C,
> > > +	HwPfPciePcsInternalStatusLane2        =  0x00D81050,
> > > +	HwPfPciePcsDpStatus3                  =  0x00D81054,
> > > +	HwPfPciePcsDpControl3                 =  0x00D81058,
> > > +	HwPfPciePcsPmaStatusLane3             =  0x00D8105C,
> > > +	HwPfPciePcsPipeStatusLane3            =  0x00D81060,
> > > +	HwPfPciePcsTxdeemph0Lane3             =  0x00D81064,
> > > +	HwPfPciePcsTxdeemph1Lane3             =  0x00D81068,
> > > +	HwPfPciePcsInternalStatusLane3        =  0x00D8106C,
> > > +	HwPfPciePcsEbStatus0                  =  0x00D81070,
> > > +	HwPfPciePcsEbStatus1                  =  0x00D81074,
> > > +	HwPfPciePcsEbStatus2                  =  0x00D81078,
> > > +	HwPfPciePcsEbStatus3                  =  0x00D8107C,
> > > +	HwPfPciePcsPllSettingPcieG1           =  0x00D81088,
> > > +	HwPfPciePcsPllSettingPcieG2           =  0x00D8108C,
> > > +	HwPfPciePcsPllSettingPcieG3           =  0x00D81090,
> > > +	HwPfPciePcsControl                    =  0x00D81094,
> > > +	HwPfPciePcsEqControl                  =  0x00D81098,
> > > +	HwPfPciePcsEqTimer                    =  0x00D8109C,
> > > +	HwPfPciePcsEqErrStatus                =  0x00D810A0,
> > > +	HwPfPciePcsEqErrCount                 =  0x00D810A4,
> > > +	HwPfPciePcsStatus                     =  0x00D810A8,
> > > +	HwPfPciePcsMiscRegister               =  0x00D810AC,
> > > +	HwPfPciePcsObsControl                 =  0x00D810B0,
> > > +	HwPfPciePcsPrbsCount0                 =  0x00D81200,
> > > +	HwPfPciePcsBistControl0               =  0x00D81204,
> > > +	HwPfPciePcsBistStaticWord00           =  0x00D81208,
> > > +	HwPfPciePcsBistStaticWord10           =  0x00D8120C,
> > > +	HwPfPciePcsBistStaticWord20           =  0x00D81210,
> > > +	HwPfPciePcsBistStaticWord30           =  0x00D81214,
> > > +	HwPfPciePcsPrbsCount1                 =  0x00D81220,
> > > +	HwPfPciePcsBistControl1               =  0x00D81224,
> > > +	HwPfPciePcsBistStaticWord01           =  0x00D81228,
> > > +	HwPfPciePcsBistStaticWord11           =  0x00D8122C,
> > > +	HwPfPciePcsBistStaticWord21           =  0x00D81230,
> > > +	HwPfPciePcsBistStaticWord31           =  0x00D81234,
> > > +	HwPfPciePcsPrbsCount2                 =  0x00D81240,
> > > +	HwPfPciePcsBistControl2               =  0x00D81244,
> > > +	HwPfPciePcsBistStaticWord02           =  0x00D81248,
> > > +	HwPfPciePcsBistStaticWord12           =  0x00D8124C,
> > > +	HwPfPciePcsBistStaticWord22           =  0x00D81250,
> > > +	HwPfPciePcsBistStaticWord32           =  0x00D81254,
> > > +	HwPfPciePcsPrbsCount3                 =  0x00D81260,
> > > +	HwPfPciePcsBistControl3               =  0x00D81264,
> > > +	HwPfPciePcsBistStaticWord03           =  0x00D81268,
> > > +	HwPfPciePcsBistStaticWord13           =  0x00D8126C,
> > > +	HwPfPciePcsBistStaticWord23           =  0x00D81270,
> > > +	HwPfPciePcsBistStaticWord33           =  0x00D81274,
> > > +	HwPfPcieGpexLtssmStateCntrl           =  0x00D90400,
> > > +	HwPfPcieGpexLtssmStateStatus          =  0x00D90404,
> > > +	HwPfPcieGpexSkipFreqTimer             =  0x00D90408,
> > > +	HwPfPcieGpexLaneSelect                =  0x00D9040C,
> > > +	HwPfPcieGpexLaneDeskew                =  0x00D90410,
> > > +	HwPfPcieGpexRxErrorStatus             =  0x00D90414,
> > > +	HwPfPcieGpexLaneNumControl            =  0x00D90418,
> > > +	HwPfPcieGpexNFstControl               =  0x00D9041C,
> > > +	HwPfPcieGpexLinkStatus                =  0x00D90420,
> > > +	HwPfPcieGpexAckReplayTimeout          =  0x00D90438,
> > > +	HwPfPcieGpexSeqNumberStatus           =  0x00D9043C,
> > > +	HwPfPcieGpexCoreClkRatio              =  0x00D90440,
> > > +	HwPfPcieGpexDllTholdControl           =  0x00D90448,
> > > +	HwPfPcieGpexPmTimer                   =  0x00D90450,
> > > +	HwPfPcieGpexPmeTimeout                =  0x00D90454,
> > > +	HwPfPcieGpexAspmL1Timer               =  0x00D90458,
> > > +	HwPfPcieGpexAspmReqTimer              =  0x00D9045C,
> > > +	HwPfPcieGpexAspmL1Dis                 =  0x00D90460,
> > > +	HwPfPcieGpexAdvisoryErrorControl      =  0x00D90468,
> > > +	HwPfPcieGpexId                        =  0x00D90470,
> > > +	HwPfPcieGpexClasscode                 =  0x00D90474,
> > > +	HwPfPcieGpexSubsystemId               =  0x00D90478,
> > > +	HwPfPcieGpexDeviceCapabilities        =  0x00D9047C,
> > > +	HwPfPcieGpexLinkCapabilities          =  0x00D90480,
> > > +	HwPfPcieGpexFunctionNumber            =  0x00D90484,
> > > +	HwPfPcieGpexPmCapabilities            =  0x00D90488,
> > > +	HwPfPcieGpexFunctionSelect            =  0x00D9048C,
> > > +	HwPfPcieGpexErrorCounter              =  0x00D904AC,
> > > +	HwPfPcieGpexConfigReady               =  0x00D904B0,
> > > +	HwPfPcieGpexFcUpdateTimeout           =  0x00D904B8,
> > > +	HwPfPcieGpexFcUpdateTimer             =  0x00D904BC,
> > > +	HwPfPcieGpexVcBufferLoad              =  0x00D904C8,
> > > +	HwPfPcieGpexVcBufferSizeThold         =  0x00D904CC,
> > > +	HwPfPcieGpexVcBufferSelect            =  0x00D904D0,
> > > +	HwPfPcieGpexBarEnable                 =  0x00D904D4,
> > > +	HwPfPcieGpexBarDwordLower             =  0x00D904D8,
> > > +	HwPfPcieGpexBarDwordUpper             =  0x00D904DC,
> > > +	HwPfPcieGpexBarSelect                 =  0x00D904E0,
> > > +	HwPfPcieGpexCreditCounterSelect       =  0x00D904E4,
> > > +	HwPfPcieGpexCreditCounterStatus       =  0x00D904E8,
> > > +	HwPfPcieGpexTlpHeaderSelect           =  0x00D904EC,
> > > +	HwPfPcieGpexTlpHeaderDword0           =  0x00D904F0,
> > > +	HwPfPcieGpexTlpHeaderDword1           =  0x00D904F4,
> > > +	HwPfPcieGpexTlpHeaderDword2           =  0x00D904F8,
> > > +	HwPfPcieGpexTlpHeaderDword3           =  0x00D904FC,
> > > +	HwPfPcieGpexRelaxOrderControl         =  0x00D90500,
> > > +	HwPfPcieGpexBarPrefetch               =  0x00D90504,
> > > +	HwPfPcieGpexFcCheckControl            =  0x00D90508,
> > > +	HwPfPcieGpexFcUpdateTimerTraffic      =  0x00D90518,
> > > +	HwPfPcieGpexPhyControl0               =  0x00D9053C,
> > > +	HwPfPcieGpexPhyControl1               =  0x00D90544,
> > > +	HwPfPcieGpexPhyControl2               =  0x00D9054C,
> > > +	HwPfPcieGpexUserControl0              =  0x00D9055C,
> > > +	HwPfPcieGpexUncorrErrorStatus         =  0x00D905F0,
> > > +	HwPfPcieGpexRxCplError                =  0x00D90620,
> > > +	HwPfPcieGpexRxCplErrorDword0          =  0x00D90624,
> > > +	HwPfPcieGpexRxCplErrorDword1          =  0x00D90628,
> > > +	HwPfPcieGpexRxCplErrorDword2          =  0x00D9062C,
> > > +	HwPfPcieGpexPabSwResetEn              =  0x00D90630,
> > > +	HwPfPcieGpexGen3Control0              =  0x00D90634,
> > > +	HwPfPcieGpexGen3Control1              =  0x00D90638,
> > > +	HwPfPcieGpexGen3Control2              =  0x00D9063C,
> > > +	HwPfPcieGpexGen2ControlCsr            =  0x00D90640,
> > > +	HwPfPcieGpexTotalVfInitialVf0         =  0x00D90644,
> > > +	HwPfPcieGpexTotalVfInitialVf1         =  0x00D90648,
> > > +	HwPfPcieGpexSriovLinkDevId0           =  0x00D90684,
> > > +	HwPfPcieGpexSriovLinkDevId1           =  0x00D90688,
> > > +	HwPfPcieGpexSriovPageSize0            =  0x00D906C4,
> > > +	HwPfPcieGpexSriovPageSize1            =  0x00D906C8,
> > > +	HwPfPcieGpexIdVersion                 =  0x00D906FC,
> > > +	HwPfPcieGpexSriovVfOffsetStride0      =  0x00D90704,
> > > +	HwPfPcieGpexSriovVfOffsetStride1      =  0x00D90708,
> > > +	HwPfPcieGpexGen3DeskewControl         =  0x00D907B4,
> > > +	HwPfPcieGpexGen3EqControl             =  0x00D907B8,
> > > +	HwPfPcieGpexBridgeVersion             =  0x00D90800,
> > > +	HwPfPcieGpexBridgeCapability          =  0x00D90804,
> > > +	HwPfPcieGpexBridgeControl             =  0x00D90808,
> > > +	HwPfPcieGpexBridgeStatus              =  0x00D9080C,
> > > +	HwPfPcieGpexEngineActivityStatus      =  0x00D9081C,
> > > +	HwPfPcieGpexEngineResetControl        =  0x00D90820,
> > > +	HwPfPcieGpexAxiPioControl             =  0x00D90840,
> > > +	HwPfPcieGpexAxiPioStatus              =  0x00D90844,
> > > +	HwPfPcieGpexAmbaSlaveCmdStatus        =  0x00D90848,
> > > +	HwPfPcieGpexPexPioControl             =  0x00D908C0,
> > > +	HwPfPcieGpexPexPioStatus              =  0x00D908C4,
> > > +	HwPfPcieGpexAmbaMasterStatus          =  0x00D908C8,
> > > +	HwPfPcieGpexCsrSlaveCmdStatus         =  0x00D90920,
> > > +	HwPfPcieGpexMailboxAxiControl         =  0x00D90A50,
> > > +	HwPfPcieGpexMailboxAxiData            =  0x00D90A54,
> > > +	HwPfPcieGpexMailboxPexControl         =  0x00D90A90,
> > > +	HwPfPcieGpexMailboxPexData            =  0x00D90A94,
> > > +	HwPfPcieGpexPexInterruptEnable        =  0x00D90AD0,
> > > +	HwPfPcieGpexPexInterruptStatus        =  0x00D90AD4,
> > > +	HwPfPcieGpexPexInterruptAxiPioVector  =  0x00D90AD8,
> > > +	HwPfPcieGpexPexInterruptPexPioVector  =  0x00D90AE0,
> > > +	HwPfPcieGpexPexInterruptMiscVector    =  0x00D90AF8,
> > > +	HwPfPcieGpexAmbaInterruptPioEnable    =  0x00D90B00,
> > > +	HwPfPcieGpexAmbaInterruptMiscEnable   =  0x00D90B0C,
> > > +	HwPfPcieGpexAmbaInterruptPioStatus    =  0x00D90B10,
> > > +	HwPfPcieGpexAmbaInterruptMiscStatus   =  0x00D90B1C,
> > > +	HwPfPcieGpexPexPmControl              =  0x00D90B80,
> > > +	HwPfPcieGpexSlotMisc                  =  0x00D90B88,
> > > +	HwPfPcieGpexAxiAddrMappingControl     =  0x00D90BA0,
> > > +	HwPfPcieGpexAxiAddrMappingWindowAxiBase     =  0x00D90BA4,
> > > +	HwPfPcieGpexAxiAddrMappingWindowPexBaseLow  =  0x00D90BA8,
> > > +	HwPfPcieGpexAxiAddrMappingWindowPexBaseHigh =  0x00D90BAC,
> > > +	HwPfPcieGpexPexBarAddrFunc0Bar0       =  0x00D91BA0,
> > > +	HwPfPcieGpexPexBarAddrFunc0Bar1       =  0x00D91BA4,
> > > +	HwPfPcieGpexAxiAddrMappingPcieHdrParam =  0x00D95BA0,
> > > +	HwPfPcieGpexExtAxiAddrMappingAxiBase  =  0x00D980A0,
> > > +	HwPfPcieGpexPexExtBarAddrFunc0Bar0    =  0x00D984A0,
> > > +	HwPfPcieGpexPexExtBarAddrFunc0Bar1    =  0x00D984A4,
> > > +	HwPfPcieGpexAmbaInterruptFlrEnable    =  0x00D9B960,
> > > +	HwPfPcieGpexAmbaInterruptFlrStatus    =  0x00D9B9A0,
> > > +	HwPfPcieGpexExtAxiAddrMappingSize     =  0x00D9BAF0,
> > > +	HwPfPcieGpexPexPioAwcacheControl      =  0x00D9C300,
> > > +	HwPfPcieGpexPexPioArcacheControl      =  0x00D9C304,
> > > +	HwPfPcieGpexPabObSizeControlVc0       =  0x00D9C310
> > > +};
> >
> > Why use an enum rather than macro definitions?
> >
> 
> Well, both would "work". The main reason is that this long enum is
> automatically generated from the RDL output of the chip design.
> Even so, I would argue an enum is cleaner here, as it groups all these
> incremental addresses together.
> It can also help when debugging, since enum symbols are kept post
> compilation as both a value and a name.
> Any concern, or any best-known method from other PMDs?

Could you read the DPDK coding style first?
https://doc.dpdk.org/guides-16.11/contributing/coding_style.html
It does not make sense to define HW addresses this way.
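For illustration of the debugging point (this sketch is not part of the patch): enumerators are real symbols that remain visible to a debugger, while macros are erased by the preprocessor. The register names and offsets below are copied from the VF map in the patch; the `vf_reg_addr()` helper is hypothetical, added only to show usage.

```c
#include <assert.h>
#include <stdint.h>

/* Register offsets grouped in one enum, mirroring acc100_vf_enum.h.
 * Unlike #define, these names survive into the debug info. */
enum acc100_vf_reg {
	HWVfQmgrIngressAq   = 0x00000000,
	HWVfHiVfToPfDbellVf = 0x00000800,
	HWVfHiPfToVfDbellVf = 0x00000808,
};

/* Hypothetical helper: absolute MMIO address of a register, given the
 * mapped BAR0 base pointer. */
static inline volatile uint32_t *
vf_reg_addr(uint8_t *bar0, enum acc100_vf_reg off)
{
	return (volatile uint32_t *)(bar0 + off);
}
```

Either form produces identical object code for register accesses; the difference is purely in type grouping and debuggability.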

> > > +/* TIP PF Interrupt numbers */
> > > +enum {
> > > +	ACC100_PF_INT_QMGR_AQ_OVERFLOW = 0,
> > > +	ACC100_PF_INT_DOORBELL_VF_2_PF = 1,
> > > +	ACC100_PF_INT_DMA_DL_DESC_IRQ = 2,
> > > +	ACC100_PF_INT_DMA_UL_DESC_IRQ = 3,
> > > +	ACC100_PF_INT_DMA_MLD_DESC_IRQ = 4,
> > > +	ACC100_PF_INT_DMA_UL5G_DESC_IRQ = 5,
> > > +	ACC100_PF_INT_DMA_DL5G_DESC_IRQ = 6,
> > > +	ACC100_PF_INT_ILLEGAL_FORMAT = 7,
> > > +	ACC100_PF_INT_QMGR_DISABLED_ACCESS = 8,
> > > +	ACC100_PF_INT_QMGR_AQ_OVERTHRESHOLD = 9,
> > > +	ACC100_PF_INT_ARAM_ACCESS_ERR = 10,
> > > +	ACC100_PF_INT_ARAM_ECC_1BIT_ERR = 11,
> > > +	ACC100_PF_INT_PARITY_ERR = 12,
> > > +	ACC100_PF_INT_QMGR_ERR = 13,
> > > +	ACC100_PF_INT_INT_REQ_OVERFLOW = 14,
> > > +	ACC100_PF_INT_APB_TIMEOUT = 15,
> > > +};
> > > +
> > > +#endif /* ACC100_PF_ENUM_H */
> > > diff --git a/drivers/baseband/acc100/acc100_vf_enum.h
> > > b/drivers/baseband/acc100/acc100_vf_enum.h
> > > new file mode 100644
> > > index 0000000..b512af3
> > > --- /dev/null
> > > +++ b/drivers/baseband/acc100/acc100_vf_enum.h
> > > @@ -0,0 +1,73 @@
> > > +/* SPDX-License-Identifier: BSD-3-Clause
> > > + * Copyright(c) 2017 Intel Corporation
> > > + */
> > > +
> > > +#ifndef ACC100_VF_ENUM_H
> > > +#define ACC100_VF_ENUM_H
> > > +
> > > +/*
> > > + * ACC100 Register mapping on VF BAR0
> > > + * This is automatically generated from RDL, format may change with new RDL
> > > + */
> > > +enum {
> > > +	HWVfQmgrIngressAq             =  0x00000000,
> > > +	HWVfHiVfToPfDbellVf           =  0x00000800,
> > > +	HWVfHiPfToVfDbellVf           =  0x00000808,
> > > +	HWVfHiInfoRingBaseLoVf        =  0x00000810,
> > > +	HWVfHiInfoRingBaseHiVf        =  0x00000814,
> > > +	HWVfHiInfoRingPointerVf       =  0x00000818,
> > > +	HWVfHiInfoRingIntWrEnVf       =  0x00000820,
> > > +	HWVfHiInfoRingPf2VfWrEnVf     =  0x00000824,
> > > +	HWVfHiMsixVectorMapperVf      =  0x00000860,
> > > +	HWVfDmaFec5GulDescBaseLoRegVf =  0x00000920,
> > > +	HWVfDmaFec5GulDescBaseHiRegVf =  0x00000924,
> > > +	HWVfDmaFec5GulRespPtrLoRegVf  =  0x00000928,
> > > +	HWVfDmaFec5GulRespPtrHiRegVf  =  0x0000092C,
> > > +	HWVfDmaFec5GdlDescBaseLoRegVf =  0x00000940,
> > > +	HWVfDmaFec5GdlDescBaseHiRegVf =  0x00000944,
> > > +	HWVfDmaFec5GdlRespPtrLoRegVf  =  0x00000948,
> > > +	HWVfDmaFec5GdlRespPtrHiRegVf  =  0x0000094C,
> > > +	HWVfDmaFec4GulDescBaseLoRegVf =  0x00000960,
> > > +	HWVfDmaFec4GulDescBaseHiRegVf =  0x00000964,
> > > +	HWVfDmaFec4GulRespPtrLoRegVf  =  0x00000968,
> > > +	HWVfDmaFec4GulRespPtrHiRegVf  =  0x0000096C,
> > > +	HWVfDmaFec4GdlDescBaseLoRegVf =  0x00000980,
> > > +	HWVfDmaFec4GdlDescBaseHiRegVf =  0x00000984,
> > > +	HWVfDmaFec4GdlRespPtrLoRegVf  =  0x00000988,
> > > +	HWVfDmaFec4GdlRespPtrHiRegVf  =  0x0000098C,
> > > +	HWVfDmaDdrBaseRangeRoVf       =  0x000009A0,
> > > +	HWVfQmgrAqResetVf             =  0x00000E00,
> > > +	HWVfQmgrRingSizeVf            =  0x00000E04,
> > > +	HWVfQmgrGrpDepthLog20Vf       =  0x00000E08,
> > > +	HWVfQmgrGrpDepthLog21Vf       =  0x00000E0C,
> > > +	HWVfQmgrGrpFunction0Vf        =  0x00000E10,
> > > +	HWVfQmgrGrpFunction1Vf        =  0x00000E14,
> > > +	HWVfPmACntrlRegVf             =  0x00000F40,
> > > +	HWVfPmACountVf                =  0x00000F48,
> > > +	HWVfPmAKCntLoVf               =  0x00000F50,
> > > +	HWVfPmAKCntHiVf               =  0x00000F54,
> > > +	HWVfPmADeltaCntLoVf           =  0x00000F60,
> > > +	HWVfPmADeltaCntHiVf           =  0x00000F64,
> > > +	HWVfPmBCntrlRegVf             =  0x00000F80,
> > > +	HWVfPmBCountVf                =  0x00000F88,
> > > +	HWVfPmBKCntLoVf               =  0x00000F90,
> > > +	HWVfPmBKCntHiVf               =  0x00000F94,
> > > +	HWVfPmBDeltaCntLoVf           =  0x00000FA0,
> > > +	HWVfPmBDeltaCntHiVf           =  0x00000FA4
> > > +};
> > > +
> > > +/* TIP VF Interrupt numbers */
> > > +enum {
> > > +	ACC100_VF_INT_QMGR_AQ_OVERFLOW = 0,
> > > +	ACC100_VF_INT_DOORBELL_VF_2_PF = 1,
> > > +	ACC100_VF_INT_DMA_DL_DESC_IRQ = 2,
> > > +	ACC100_VF_INT_DMA_UL_DESC_IRQ = 3,
> > > +	ACC100_VF_INT_DMA_MLD_DESC_IRQ = 4,
> > > +	ACC100_VF_INT_DMA_UL5G_DESC_IRQ = 5,
> > > +	ACC100_VF_INT_DMA_DL5G_DESC_IRQ = 6,
> > > +	ACC100_VF_INT_ILLEGAL_FORMAT = 7,
> > > +	ACC100_VF_INT_QMGR_DISABLED_ACCESS = 8,
> > > +	ACC100_VF_INT_QMGR_AQ_OVERTHRESHOLD = 9,
> > > +};
> > > +
> > > +#endif /* ACC100_VF_ENUM_H */
> > > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h
> > > b/drivers/baseband/acc100/rte_acc100_pmd.h
> > > index 6f46df0..cd77570 100644
> > > --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> > > +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> > > @@ -5,6 +5,9 @@
> > >  #ifndef _RTE_ACC100_PMD_H_
> > >  #define _RTE_ACC100_PMD_H_
> > >
> > > +#include "acc100_pf_enum.h"
> > > +#include "acc100_vf_enum.h"
> > > +
> > >  /* Helper macro for logging */
> > >  #define rte_bbdev_log(level, fmt, ...) \
> > >  	rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \
> > > @@ -27,6 +30,493 @@
> > >  #define RTE_ACC100_PF_DEVICE_ID        (0x0d5c)
> > >  #define RTE_ACC100_VF_DEVICE_ID        (0x0d5d)
> > >
> > > +/* Define as 1 to use only a single FEC engine */
> > > +#ifndef RTE_ACC100_SINGLE_FEC
> > > +#define RTE_ACC100_SINGLE_FEC 0
> > > +#endif
> > > +
> > > +/* Values used in filling in descriptors */
> > > +#define ACC100_DMA_DESC_TYPE           2
> > > +#define ACC100_DMA_CODE_BLK_MODE       0
> > > +#define ACC100_DMA_BLKID_FCW           1
> > > +#define ACC100_DMA_BLKID_IN            2
> > > +#define ACC100_DMA_BLKID_OUT_ENC       1
> > > +#define ACC100_DMA_BLKID_OUT_HARD      1
> > > +#define ACC100_DMA_BLKID_OUT_SOFT      2
> > > +#define ACC100_DMA_BLKID_OUT_HARQ      3
> > > +#define ACC100_DMA_BLKID_IN_HARQ       3
> > > +
> > > +/* Values used in filling in decode FCWs */
> > > +#define ACC100_FCW_TD_VER              1
> > > +#define ACC100_FCW_TD_EXT_COLD_REG_EN  1
> > > +#define ACC100_FCW_TD_AUTOMAP          0x0f
> > > +#define ACC100_FCW_TD_RVIDX_0          2
> > > +#define ACC100_FCW_TD_RVIDX_1          26
> > > +#define ACC100_FCW_TD_RVIDX_2          50
> > > +#define ACC100_FCW_TD_RVIDX_3          74
> > > +
> > > +/* Values used in writing to the registers */
> > > +#define ACC100_REG_IRQ_EN_ALL          0x1FF83FF  /* Enable all interrupts */
> > > +
> > > +/* ACC100 Specific Dimensioning */
> > > +#define ACC100_SIZE_64MBYTE            (64*1024*1024)
> > > +/* Number of elements in an Info Ring */
> > > +#define ACC100_INFO_RING_NUM_ENTRIES   1024
> > > +/* Number of elements in HARQ layout memory */
> > > +#define ACC100_HARQ_LAYOUT             (64*1024*1024)
> > > +/* Assume offset for HARQ in memory */
> > > +#define ACC100_HARQ_OFFSET             (32*1024)
> > > +/* Mask used to calculate an index in an Info Ring array (not a byte offset) */
> > > +#define ACC100_INFO_RING_MASK          (ACC100_INFO_RING_NUM_ENTRIES-1)
> > > +/* Number of Virtual Functions ACC100 supports */
> > > +#define ACC100_NUM_VFS                  16
> > > +#define ACC100_NUM_QGRPS                 8
> > > +#define ACC100_NUM_QGRPS_PER_WORD        8
> > > +#define ACC100_NUM_AQS                  16
> > > +#define MAX_ENQ_BATCH_SIZE          255
> > > +/* All ACC100 Registers alignment are 32bits = 4B */
> > > +#define BYTES_IN_WORD                 4
> > > +#define MAX_E_MBUF                64000
> > > +
> > > +#define GRP_ID_SHIFT    10 /* Queue Index Hierarchy */
> > > +#define VF_ID_SHIFT     4  /* Queue Index Hierarchy */
> > > +#define VF_OFFSET_QOS   16 /* offset in Memory Space specific to QoS Mon */
> > > +#define TMPL_PRI_0      0x03020100
> > > +#define TMPL_PRI_1      0x07060504
> > > +#define TMPL_PRI_2      0x0b0a0908
> > > +#define TMPL_PRI_3      0x0f0e0d0c
> > > +#define QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
> > > +#define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
> > > +
> > > +#define ACC100_NUM_TMPL  32
> > > +#define VF_OFFSET_QOS 16 /* offset in Memory Space specific to QoS Mon */
> > > +/* Mapping of signals for the available engines */
> > > +#define SIG_UL_5G      0
> > > +#define SIG_UL_5G_LAST 7
> > > +#define SIG_DL_5G      13
> > > +#define SIG_DL_5G_LAST 15
> > > +#define SIG_UL_4G      16
> > > +#define SIG_UL_4G_LAST 21
> > > +#define SIG_DL_4G      27
> > > +#define SIG_DL_4G_LAST 31
> > > +
> > > +/* max number of iterations to allocate memory block for all rings */
> > > +#define SW_RING_MEM_ALLOC_ATTEMPTS 5
> > > +#define MAX_QUEUE_DEPTH           1024
> > > +#define ACC100_DMA_MAX_NUM_POINTERS  14
> > > +#define ACC100_DMA_DESC_PADDING      8
> > > +#define ACC100_FCW_PADDING           12
> > > +#define ACC100_DESC_FCW_OFFSET       192
> > > +#define ACC100_DESC_SIZE             256
> > > +#define ACC100_DESC_OFFSET           (ACC100_DESC_SIZE / 64)
> > > +#define ACC100_FCW_TE_BLEN     32
> > > +#define ACC100_FCW_TD_BLEN     24
> > > +#define ACC100_FCW_LE_BLEN     32
> > > +#define ACC100_FCW_LD_BLEN     36
> > > +
> > > +#define ACC100_FCW_VER         2
> > > +#define MUX_5GDL_DESC 6
> > > +#define CMP_ENC_SIZE 20
> > > +#define CMP_DEC_SIZE 24
> > > +#define ENC_OFFSET (32)
> > > +#define DEC_OFFSET (80)
> > > +#define ACC100_EXT_MEM
> > > +#define ACC100_HARQ_OFFSET_THRESHOLD 1024
> > > +
> > > +/* Constants from K0 computation from 3GPP 38.212 Table 5.4.2.1-2 */
> > > +#define N_ZC_1 66 /* N = 66 Zc for BG 1 */
> > > +#define N_ZC_2 50 /* N = 50 Zc for BG 2 */
> > > +#define K0_1_1 17 /* K0 fraction numerator for rv 1 and BG 1 */
> > > +#define K0_1_2 13 /* K0 fraction numerator for rv 1 and BG 2 */
> > > +#define K0_2_1 33 /* K0 fraction numerator for rv 2 and BG 1 */
> > > +#define K0_2_2 25 /* K0 fraction numerator for rv 2 and BG 2 */
> > > +#define K0_3_1 56 /* K0 fraction numerator for rv 3 and BG 1 */
> > > +#define K0_3_2 43 /* K0 fraction numerator for rv 3 and BG 2 */
> > > +
> > > +/* ACC100 Configuration */
> > > +#define ACC100_DDR_ECC_ENABLE
> > > +#define ACC100_CFG_DMA_ERROR 0x3D7
> > > +#define ACC100_CFG_AXI_CACHE 0x11
> > > +#define ACC100_CFG_QMGR_HI_P 0x0F0F
> > > +#define ACC100_CFG_PCI_AXI 0xC003
> > > +#define ACC100_CFG_PCI_BRIDGE 0x40006033
> > > +#define ACC100_ENGINE_OFFSET 0x1000
> > > +#define ACC100_RESET_HI 0x20100
> > > +#define ACC100_RESET_LO 0x20000
> > > +#define ACC100_RESET_HARD 0x1FF
> > > +#define ACC100_ENGINES_MAX 9
> > > +#define LONG_WAIT 1000
> > > +
> > > +/* ACC100 DMA Descriptor triplet */
> > > +struct acc100_dma_triplet {
> > > +	uint64_t address;
> > > +	uint32_t blen:20,
> > > +		res0:4,
> > > +		last:1,
> > > +		dma_ext:1,
> > > +		res1:2,
> > > +		blkid:4;
> > > +} __rte_packed;
> > > +
> > > +
> > > +
> > > +/* ACC100 DMA Response Descriptor */
> > > +union acc100_dma_rsp_desc {
> > > +	uint32_t val;
> > > +	struct {
> > > +		uint32_t crc_status:1,
> > > +			synd_ok:1,
> > > +			dma_err:1,
> > > +			neg_stop:1,
> > > +			fcw_err:1,
> > > +			output_err:1,
> > > +			input_err:1,
> > > +			timestampEn:1,
> > > +			iterCountFrac:8,
> > > +			iter_cnt:8,
> > > +			rsrvd3:6,
> > > +			sdone:1,
> > > +			fdone:1;
> > > +		uint32_t add_info_0;
> > > +		uint32_t add_info_1;
> > > +	};
> > > +};
> > > +
> > > +
> > > +/* ACC100 Queue Manager Enqueue PCI Register */
> > > +union acc100_enqueue_reg_fmt {
> > > +	uint32_t val;
> > > +	struct {
> > > +		uint32_t num_elem:8,
> > > +			addr_offset:3,
> > > +			rsrvd:1,
> > > +			req_elem_addr:20;
> > > +	};
> > > +};
> > > +
> > > +/* FEC 4G Uplink Frame Control Word */
> > > +struct __rte_packed acc100_fcw_td {
> > > +	uint8_t fcw_ver:4,
> > > +		num_maps:4; /* Unused */
> > > +	uint8_t filler:6, /* Unused */
> > > +		rsrvd0:1,
> > > +		bypass_sb_deint:1;
> > > +	uint16_t k_pos;
> > > +	uint16_t k_neg; /* Unused */
> > > +	uint8_t c_neg; /* Unused */
> > > +	uint8_t c; /* Unused */
> > > +	uint32_t ea; /* Unused */
> > > +	uint32_t eb; /* Unused */
> > > +	uint8_t cab; /* Unused */
> > > +	uint8_t k0_start_col; /* Unused */
> > > +	uint8_t rsrvd1;
> > > +	uint8_t code_block_mode:1, /* Unused */
> > > +		turbo_crc_type:1,
> > > +		rsrvd2:3,
> > > +		bypass_teq:1, /* Unused */
> > > +		soft_output_en:1, /* Unused */
> > > +		ext_td_cold_reg_en:1;
> > > +	union { /* External Cold register */
> > > +		uint32_t ext_td_cold_reg;
> > > +		struct {
> > > +			uint32_t min_iter:4, /* Unused */
> > > +				max_iter:4,
> > > +				ext_scale:5, /* Unused */
> > > +				rsrvd3:3,
> > > +				early_stop_en:1, /* Unused */
> > > +				sw_soft_out_dis:1, /* Unused */
> > > +				sw_et_cont:1, /* Unused */
> > > +				sw_soft_out_saturation:1, /* Unused */
> > > +				half_iter_on:1, /* Unused */
> > > +				raw_decoder_input_on:1, /* Unused */
> > > +				rsrvd4:10;
> > > +		};
> > > +	};
> > > +};
> > > +
> > > +/* FEC 5GNR Uplink Frame Control Word */
> > > +struct __rte_packed acc100_fcw_ld {
> > > +	uint32_t FCWversion:4,
> > > +		qm:4,
> > > +		nfiller:11,
> > > +		BG:1,
> > > +		Zc:9,
> > > +		res0:1,
> > > +		synd_precoder:1,
> > > +		synd_post:1;
> > > +	uint32_t ncb:16,
> > > +		k0:16;
> > > +	uint32_t rm_e:24,
> > > +		hcin_en:1,
> > > +		hcout_en:1,
> > > +		crc_select:1,
> > > +		bypass_dec:1,
> > > +		bypass_intlv:1,
> > > +		so_en:1,
> > > +		so_bypass_rm:1,
> > > +		so_bypass_intlv:1;
> > > +	uint32_t hcin_offset:16,
> > > +		hcin_size0:16;
> > > +	uint32_t hcin_size1:16,
> > > +		hcin_decomp_mode:3,
> > > +		llr_pack_mode:1,
> > > +		hcout_comp_mode:3,
> > > +		res2:1,
> > > +		dec_convllr:4,
> > > +		hcout_convllr:4;
> > > +	uint32_t itmax:7,
> > > +		itstop:1,
> > > +		so_it:7,
> > > +		res3:1,
> > > +		hcout_offset:16;
> > > +	uint32_t hcout_size0:16,
> > > +		hcout_size1:16;
> > > +	uint32_t gain_i:8,
> > > +		gain_h:8,
> > > +		negstop_th:16;
> > > +	uint32_t negstop_it:7,
> > > +		negstop_en:1,
> > > +		res4:24;
> > > +};
> > > +
> > > +/* FEC 4G Downlink Frame Control Word */
> > > +struct __rte_packed acc100_fcw_te {
> > > +	uint16_t k_neg;
> > > +	uint16_t k_pos;
> > > +	uint8_t c_neg;
> > > +	uint8_t c;
> > > +	uint8_t filler;
> > > +	uint8_t cab;
> > > +	uint32_t ea:17,
> > > +		rsrvd0:15;
> > > +	uint32_t eb:17,
> > > +		rsrvd1:15;
> > > +	uint16_t ncb_neg;
> > > +	uint16_t ncb_pos;
> > > +	uint8_t rv_idx0:2,
> > > +		rsrvd2:2,
> > > +		rv_idx1:2,
> > > +		rsrvd3:2;
> > > +	uint8_t bypass_rv_idx0:1,
> > > +		bypass_rv_idx1:1,
> > > +		bypass_rm:1,
> > > +		rsrvd4:5;
> > > +	uint8_t rsrvd5:1,
> > > +		rsrvd6:3,
> > > +		code_block_crc:1,
> > > +		rsrvd7:3;
> > > +	uint8_t code_block_mode:1,
> > > +		rsrvd8:7;
> > > +	uint64_t rsrvd9;
> > > +};
> > > +
> > > +/* FEC 5GNR Downlink Frame Control Word */
> > > +struct __rte_packed acc100_fcw_le {
> > > +	uint32_t FCWversion:4,
> > > +		qm:4,
> > > +		nfiller:11,
> > > +		BG:1,
> > > +		Zc:9,
> > > +		res0:3;
> > > +	uint32_t ncb:16,
> > > +		k0:16;
> > > +	uint32_t rm_e:24,
> > > +		res1:2,
> > > +		crc_select:1,
> > > +		res2:1,
> > > +		bypass_intlv:1,
> > > +		res3:3;
> > > +	uint32_t res4_a:12,
> > > +		mcb_count:3,
> > > +		res4_b:17;
> > > +	uint32_t res5;
> > > +	uint32_t res6;
> > > +	uint32_t res7;
> > > +	uint32_t res8;
> > > +};
> > > +
> > > +/* ACC100 DMA Request Descriptor */
> > > +struct __rte_packed acc100_dma_req_desc {
> > > +	union {
> > > +		struct{
> > > +			uint32_t type:4,
> > > +				rsrvd0:26,
> > > +				sdone:1,
> > > +				fdone:1;
> > > +			uint32_t rsrvd1;
> > > +			uint32_t rsrvd2;
> > > +			uint32_t pass_param:8,
> > > +				sdone_enable:1,
> > > +				irq_enable:1,
> > > +				timeStampEn:1,
> > > +				res0:5,
> > > +				numCBs:4,
> > > +				res1:4,
> > > +				m2dlen:4,
> > > +				d2mlen:4;
> > > +		};
> > > +		struct{
> > > +			uint32_t word0;
> > > +			uint32_t word1;
> > > +			uint32_t word2;
> > > +			uint32_t word3;
> > > +		};
> > > +	};
> > > +	struct acc100_dma_triplet data_ptrs[ACC100_DMA_MAX_NUM_POINTERS];
> > > +
> > > +	/* Virtual addresses used to retrieve SW context info */
> > > +	union {
> > > +		void *op_addr;
> > > +		uint64_t pad1;  /* pad to 64 bits */
> > > +	};
> > > +	/*
> > > +	 * Stores additional information needed for driver processing:
> > > +	 * - last_desc_in_batch - flag used to mark last descriptor (CB)
> > > +	 *                        in batch
> > > +	 * - cbs_in_tb - stores information about total number of Code Blocks
> > > +	 *               in currently processed Transport Block
> > > +	 */
> > > +	union {
> > > +		struct {
> > > +			union {
> > > +				struct acc100_fcw_ld fcw_ld;
> > > +				struct acc100_fcw_td fcw_td;
> > > +				struct acc100_fcw_le fcw_le;
> > > +				struct acc100_fcw_te fcw_te;
> > > +				uint32_t pad2[ACC100_FCW_PADDING];
> > > +			};
> > > +			uint32_t last_desc_in_batch :8,
> > > +				cbs_in_tb:8,
> > > +				pad4 : 16;
> > > +		};
> > > +		uint64_t pad3[ACC100_DMA_DESC_PADDING]; /* pad to 64 bits */
> > > +	};
> > > +};
> > > +
> > > +/* ACC100 DMA Descriptor */
> > > +union acc100_dma_desc {
> > > +	struct acc100_dma_req_desc req;
> > > +	union acc100_dma_rsp_desc rsp;
> > > +};
> > > +
> > > +
> > > +/* Union describing Info Ring entry */
> > > +union acc100_harq_layout_data {
> > > +	uint32_t val;
> > > +	struct {
> > > +		uint16_t offset;
> > > +		uint16_t size0;
> > > +	};
> > > +} __rte_packed;
> > > +
> > > +
> > > +/* Union describing Info Ring entry */
> > > +union acc100_info_ring_data {
> > > +	uint32_t val;
> > > +	struct {
> > > +		union {
> > > +			uint16_t detailed_info;
> > > +			struct {
> > > +				uint16_t aq_id: 4;
> > > +				uint16_t qg_id: 4;
> > > +				uint16_t vf_id: 6;
> > > +				uint16_t reserved: 2;
> > > +			};
> > > +		};
> > > +		uint16_t int_nb: 7;
> > > +		uint16_t msi_0: 1;
> > > +		uint16_t vf2pf: 6;
> > > +		uint16_t loop: 1;
> > > +		uint16_t valid: 1;
> > > +	};
> > > +} __rte_packed;
> > > +
> > > +struct acc100_registry_addr {
> > > +	unsigned int dma_ring_dl5g_hi;
> > > +	unsigned int dma_ring_dl5g_lo;
> > > +	unsigned int dma_ring_ul5g_hi;
> > > +	unsigned int dma_ring_ul5g_lo;
> > > +	unsigned int dma_ring_dl4g_hi;
> > > +	unsigned int dma_ring_dl4g_lo;
> > > +	unsigned int dma_ring_ul4g_hi;
> > > +	unsigned int dma_ring_ul4g_lo;
> > > +	unsigned int ring_size;
> > > +	unsigned int info_ring_hi;
> > > +	unsigned int info_ring_lo;
> > > +	unsigned int info_ring_en;
> > > +	unsigned int info_ring_ptr;
> > > +	unsigned int tail_ptrs_dl5g_hi;
> > > +	unsigned int tail_ptrs_dl5g_lo;
> > > +	unsigned int tail_ptrs_ul5g_hi;
> > > +	unsigned int tail_ptrs_ul5g_lo;
> > > +	unsigned int tail_ptrs_dl4g_hi;
> > > +	unsigned int tail_ptrs_dl4g_lo;
> > > +	unsigned int tail_ptrs_ul4g_hi;
> > > +	unsigned int tail_ptrs_ul4g_lo;
> > > +	unsigned int depth_log0_offset;
> > > +	unsigned int depth_log1_offset;
> > > +	unsigned int qman_group_func;
> > > +	unsigned int ddr_range;
> > > +};
> > > +
> > > +/* Structure holding registry addresses for PF */
> > > +static const struct acc100_registry_addr pf_reg_addr = {
> > > +	.dma_ring_dl5g_hi = HWPfDmaFec5GdlDescBaseHiRegVf,
> > > +	.dma_ring_dl5g_lo = HWPfDmaFec5GdlDescBaseLoRegVf,
> > > +	.dma_ring_ul5g_hi = HWPfDmaFec5GulDescBaseHiRegVf,
> > > +	.dma_ring_ul5g_lo = HWPfDmaFec5GulDescBaseLoRegVf,
> > > +	.dma_ring_dl4g_hi = HWPfDmaFec4GdlDescBaseHiRegVf,
> > > +	.dma_ring_dl4g_lo = HWPfDmaFec4GdlDescBaseLoRegVf,
> > > +	.dma_ring_ul4g_hi = HWPfDmaFec4GulDescBaseHiRegVf,
> > > +	.dma_ring_ul4g_lo = HWPfDmaFec4GulDescBaseLoRegVf,
> > > +	.ring_size = HWPfQmgrRingSizeVf,
> > > +	.info_ring_hi = HWPfHiInfoRingBaseHiRegPf,
> > > +	.info_ring_lo = HWPfHiInfoRingBaseLoRegPf,
> > > +	.info_ring_en = HWPfHiInfoRingIntWrEnRegPf,
> > > +	.info_ring_ptr = HWPfHiInfoRingPointerRegPf,
> > > +	.tail_ptrs_dl5g_hi = HWPfDmaFec5GdlRespPtrHiRegVf,
> > > +	.tail_ptrs_dl5g_lo = HWPfDmaFec5GdlRespPtrLoRegVf,
> > > +	.tail_ptrs_ul5g_hi = HWPfDmaFec5GulRespPtrHiRegVf,
> > > +	.tail_ptrs_ul5g_lo = HWPfDmaFec5GulRespPtrLoRegVf,
> > > +	.tail_ptrs_dl4g_hi = HWPfDmaFec4GdlRespPtrHiRegVf,
> > > +	.tail_ptrs_dl4g_lo = HWPfDmaFec4GdlRespPtrLoRegVf,
> > > +	.tail_ptrs_ul4g_hi = HWPfDmaFec4GulRespPtrHiRegVf,
> > > +	.tail_ptrs_ul4g_lo = HWPfDmaFec4GulRespPtrLoRegVf,
> > > +	.depth_log0_offset = HWPfQmgrGrpDepthLog20Vf,
> > > +	.depth_log1_offset = HWPfQmgrGrpDepthLog21Vf,
> > > +	.qman_group_func = HWPfQmgrGrpFunction0,
> > > +	.ddr_range = HWPfDmaVfDdrBaseRw,
> > > +};
> > > +
> > > +/* Structure holding registry addresses for VF */
> > > +static const struct acc100_registry_addr vf_reg_addr = {
> > > +	.dma_ring_dl5g_hi = HWVfDmaFec5GdlDescBaseHiRegVf,
> > > +	.dma_ring_dl5g_lo = HWVfDmaFec5GdlDescBaseLoRegVf,
> > > +	.dma_ring_ul5g_hi = HWVfDmaFec5GulDescBaseHiRegVf,
> > > +	.dma_ring_ul5g_lo = HWVfDmaFec5GulDescBaseLoRegVf,
> > > +	.dma_ring_dl4g_hi = HWVfDmaFec4GdlDescBaseHiRegVf,
> > > +	.dma_ring_dl4g_lo = HWVfDmaFec4GdlDescBaseLoRegVf,
> > > +	.dma_ring_ul4g_hi = HWVfDmaFec4GulDescBaseHiRegVf,
> > > +	.dma_ring_ul4g_lo = HWVfDmaFec4GulDescBaseLoRegVf,
> > > +	.ring_size = HWVfQmgrRingSizeVf,
> > > +	.info_ring_hi = HWVfHiInfoRingBaseHiVf,
> > > +	.info_ring_lo = HWVfHiInfoRingBaseLoVf,
> > > +	.info_ring_en = HWVfHiInfoRingIntWrEnVf,
> > > +	.info_ring_ptr = HWVfHiInfoRingPointerVf,
> > > +	.tail_ptrs_dl5g_hi = HWVfDmaFec5GdlRespPtrHiRegVf,
> > > +	.tail_ptrs_dl5g_lo = HWVfDmaFec5GdlRespPtrLoRegVf,
> > > +	.tail_ptrs_ul5g_hi = HWVfDmaFec5GulRespPtrHiRegVf,
> > > +	.tail_ptrs_ul5g_lo = HWVfDmaFec5GulRespPtrLoRegVf,
> > > +	.tail_ptrs_dl4g_hi = HWVfDmaFec4GdlRespPtrHiRegVf,
> > > +	.tail_ptrs_dl4g_lo = HWVfDmaFec4GdlRespPtrLoRegVf,
> > > +	.tail_ptrs_ul4g_hi = HWVfDmaFec4GulRespPtrHiRegVf,
> > > +	.tail_ptrs_ul4g_lo = HWVfDmaFec4GulRespPtrLoRegVf,
> > > +	.depth_log0_offset = HWVfQmgrGrpDepthLog20Vf,
> > > +	.depth_log1_offset = HWVfQmgrGrpDepthLog21Vf,
> > > +	.qman_group_func = HWVfQmgrGrpFunction0Vf,
> > > +	.ddr_range = HWVfDmaDdrBaseRangeRoVf,
> > > +};
> > > +
> > >  /* Private data structure for each ACC100 device */
> > >  struct acc100_device {
> > >  	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
> > > --
> > > 1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v3 04/11] baseband/acc100: add queue configuration
  2020-08-29 17:48     ` Chautru, Nicolas
@ 2020-09-03  2:30       ` Xu, Rosen
  2020-09-03 22:48         ` Chautru, Nicolas
  0 siblings, 1 reply; 213+ messages in thread
From: Xu, Rosen @ 2020-09-03  2:30 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, akhil.goyal; +Cc: Richardson, Bruce

Hi,

> -----Original Message-----
> From: Chautru, Nicolas <nicolas.chautru@intel.com>
> Sent: Sunday, August 30, 2020 1:48
> To: Xu, Rosen <rosen.xu@intel.com>; dev@dpdk.org; akhil.goyal@nxp.com
> Cc: Richardson, Bruce <bruce.richardson@intel.com>
> Subject: RE: [dpdk-dev] [PATCH v3 04/11] baseband/acc100: add queue
> configuration
> 
> Hi,
> 
> > From: Xu, Rosen <rosen.xu@intel.com>
> >
> > Hi,
> >
> > > -----Original Message-----
> > > From: dev <dev-bounces@dpdk.org> On Behalf Of Nicolas Chautru
> > > Sent: Wednesday, August 19, 2020 8:25
> > > To: dev@dpdk.org; akhil.goyal@nxp.com
> > > Cc: Richardson, Bruce <bruce.richardson@intel.com>; Chautru, Nicolas
> > > <nicolas.chautru@intel.com>
> > > Subject: [dpdk-dev] [PATCH v3 04/11] baseband/acc100: add queue
> > > configuration
> > >
> > > Adding function to create and configure queues for the device. Still
> > > no capability.
> > >
> > > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > > ---
> > >  drivers/baseband/acc100/rte_acc100_pmd.c | 420 ++++++++++++++++++++++++++++++-
> > >  drivers/baseband/acc100/rte_acc100_pmd.h |  45 ++++
> > >  2 files changed, 464 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> > > b/drivers/baseband/acc100/rte_acc100_pmd.c
> > > index 7807a30..7a21c57 100644
> > > --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> > > +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> > > @@ -26,6 +26,22 @@
> > >  RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);  #endif
> > >
> > > +/* Write to MMIO register address */
> > > +static inline void
> > > +mmio_write(void *addr, uint32_t value)
> > > +{
> > > +	*((volatile uint32_t *)(addr)) = rte_cpu_to_le_32(value);
> > > +}
> > > +
> > > +/* Write a register of a ACC100 device */
> > > +static inline void
> > > +acc100_reg_write(struct acc100_device *d, uint32_t offset, uint32_t payload)
> > > +{
> > > +	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
> > > +	mmio_write(reg_addr, payload);
> > > +	usleep(1000);
> > > +}
> > > +
> > >  /* Read a register of a ACC100 device */  static inline uint32_t
> > > acc100_reg_read(struct acc100_device *d, uint32_t offset) @@ -36,6
> > > +52,22 @@
> > >  	return rte_le_to_cpu_32(ret);
> > >  }
> > >
> > > +/* Basic Implementation of Log2 for exact 2^N */
> > > +static inline uint32_t
> > > +log2_basic(uint32_t value)
> > > +{
> > > +	return (value == 0) ? 0 : __builtin_ctz(value);
> > > +}
> > > +
> > > +/* Calculate memory alignment offset assuming alignment is 2^N */
> > > +static inline uint32_t
> > > +calc_mem_alignment_offset(void *unaligned_virt_mem, uint32_t alignment)
> > > +{
> > > +	rte_iova_t unaligned_phy_mem = rte_malloc_virt2iova(unaligned_virt_mem);
> > > +	return (uint32_t)(alignment - (unaligned_phy_mem & (alignment-1)));
> > > +}
> > > +
> > >  /* Calculate the offset of the enqueue register */  static inline
> > > uint32_t queue_offset(bool pf_device, uint8_t vf_id, uint8_t
> > > qgrp_id, uint16_t aq_id) @@ -204,10 +236,393 @@
> > >  			acc100_conf->q_dl_5g.aq_depth_log2);
> > >  }
> > >
> > > +static void
> > > +free_base_addresses(void **base_addrs, int size) {
> > > +	int i;
> > > +	for (i = 0; i < size; i++)
> > > +		rte_free(base_addrs[i]);
> > > +}
> > > +
> > > +static inline uint32_t
> > > +get_desc_len(void)
> > > +{
> > > +	return sizeof(union acc100_dma_desc);
> > > +}
> > > +
> > > +/* Allocate the 2 * 64MB block for the sw rings */
> > > +static int
> > > +alloc_2x64mb_sw_rings_mem(struct rte_bbdev *dev, struct acc100_device *d,
> > > +		int socket)
> > > +{
> > > +	uint32_t sw_ring_size = ACC100_SIZE_64MBYTE;
> > > +	d->sw_rings_base = rte_zmalloc_socket(dev->device->driver->name,
> > > +			2 * sw_ring_size, RTE_CACHE_LINE_SIZE, socket);
> > > +	if (d->sw_rings_base == NULL) {
> > > +		rte_bbdev_log(ERR, "Failed to allocate memory for %s:%u",
> > > +				dev->device->driver->name,
> > > +				dev->data->dev_id);
> > > +		return -ENOMEM;
> > > +	}
> > > +	memset(d->sw_rings_base, 0, ACC100_SIZE_64MBYTE);
> > > +	uint32_t next_64mb_align_offset = calc_mem_alignment_offset(
> > > +			d->sw_rings_base, ACC100_SIZE_64MBYTE);
> > > +	d->sw_rings = RTE_PTR_ADD(d->sw_rings_base, next_64mb_align_offset);
> > > +	d->sw_rings_phys = rte_malloc_virt2iova(d->sw_rings_base) +
> > > +			next_64mb_align_offset;
> > > +	d->sw_ring_size = MAX_QUEUE_DEPTH * get_desc_len();
> > > +	d->sw_ring_max_depth = d->sw_ring_size / get_desc_len();
> > > +
> > > +	return 0;
> > > +}
> >
> > Why not a common memory allocation function, instead of a special
> > function for each memory size?
> 
> This is a bit convoluted, but it is because the first-attempt method, which
> is optimal (minimum footprint), may not always find aligned memory.

What's convoluted? Can you explain?
For packet processing, in most scenarios, don't we get aligned memory when we allocate it?
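The alignment arithmetic behind both allocation paths above can be sketched in isolation. This is a simplified illustration (not the driver code): it mirrors `calc_mem_alignment_offset()` from the patch but takes a plain address instead of calling `rte_malloc_virt2iova()`.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes to skip from phys_addr so the next address is aligned to
 * `alignment` (assumed to be a power of two). Note that an already
 * aligned address yields a full `alignment` skip, matching the
 * driver's arithmetic. */
static inline uint32_t
mem_alignment_offset(uint64_t phys_addr, uint64_t alignment)
{
	return (uint32_t)(alignment - (phys_addr & (alignment - 1)));
}
```

The point of the two paths: the minimal path hopes the allocator hands back a block that does not straddle a 64 MB boundary; the 2x64 MB fallback over-allocates so that a 64 MB-aligned region is guaranteed to fit somewhere inside it.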
> 
> >
> > > +/* Attempt to allocate minimised memory space for sw rings */
> > > +static void
> > > +alloc_sw_rings_min_mem(struct rte_bbdev *dev, struct acc100_device *d,
> > > +		uint16_t num_queues, int socket)
> > > +{
> > > +	rte_iova_t sw_rings_base_phy, next_64mb_align_addr_phy;
> > > +	uint32_t next_64mb_align_offset;
> > > +	rte_iova_t sw_ring_phys_end_addr;
> > > +	void *base_addrs[SW_RING_MEM_ALLOC_ATTEMPTS];
> > > +	void *sw_rings_base;
> > > +	int i = 0;
> > > +	uint32_t q_sw_ring_size = MAX_QUEUE_DEPTH * get_desc_len();
> > > +	uint32_t dev_sw_ring_size = q_sw_ring_size * num_queues;
> > > +
> > > +	/* Find an aligned block of memory to store sw rings */
> > > +	while (i < SW_RING_MEM_ALLOC_ATTEMPTS) {
> > > +		/*
> > > +		 * sw_ring allocated memory is guaranteed to be aligned to
> > > +		 * q_sw_ring_size at the condition that the requested size is
> > > +		 * less than the page size
> > > +		 */
> > > +		sw_rings_base = rte_zmalloc_socket(
> > > +				dev->device->driver->name,
> > > +				dev_sw_ring_size, q_sw_ring_size, socket);
> > > +
> > > +		if (sw_rings_base == NULL) {
> > > +			rte_bbdev_log(ERR,
> > > +					"Failed to allocate memory
> > > for %s:%u",
> > > +					dev->device->driver->name,
> > > +					dev->data->dev_id);
> > > +			break;
> > > +		}
> > > +
> > > +		sw_rings_base_phy = rte_malloc_virt2iova(sw_rings_base);
> > > +		next_64mb_align_offset = calc_mem_alignment_offset(
> > > +				sw_rings_base, ACC100_SIZE_64MBYTE);
> > > +		next_64mb_align_addr_phy = sw_rings_base_phy +
> > > +				next_64mb_align_offset;
> > > +		sw_ring_phys_end_addr = sw_rings_base_phy + dev_sw_ring_size;
> > > +
> > > +		/* Check if the end of the sw ring memory block is before the
> > > +		 * start of next 64MB aligned mem address
> > > +		 */
> > > +		if (sw_ring_phys_end_addr < next_64mb_align_addr_phy) {
> > > +			d->sw_rings_phys = sw_rings_base_phy;
> > > +			d->sw_rings = sw_rings_base;
> > > +			d->sw_rings_base = sw_rings_base;
> > > +			d->sw_ring_size = q_sw_ring_size;
> > > +			d->sw_ring_max_depth = MAX_QUEUE_DEPTH;
> > > +			break;
> > > +		}
> > > +		/* Store the address of the unaligned mem block */
> > > +		base_addrs[i] = sw_rings_base;
> > > +		i++;
> > > +	}
> > > +
> > > +	/* Free all unaligned blocks of mem allocated in the loop */
> > > +	free_base_addresses(base_addrs, i);
> > > +}
> >
> > It's strange to first allocate memory and then free it without any
> > operations on this memory.
> 
> I may be missing your point. We are freeing the exact same memory we got
> from rte_zmalloc.
> Note that the base_addrs array refers to multiple allocation attempts, not
> multiple operations in a ring.

You allocate memory in sw_rings_base, assign this memory (after some
translation) to acc100_device *d, and then free it before the function
returns.
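The allocate-test-keep-or-retry pattern under discussion can be sketched with plain `malloc()` standing in for `rte_zmalloc_socket()`. This is a simplified illustration, not the driver code: only blocks from failed attempts are freed, while the block that passes the boundary check is kept and returned.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define ALLOC_ATTEMPTS 5

/* Try up to ALLOC_ATTEMPTS allocations; keep the first block whose
 * range does not straddle the next `boundary`-aligned address
 * (boundary must be a power of two), then free every unsuitable
 * attempt. Returns NULL if no attempt succeeded. */
static void *
alloc_not_straddling(size_t size, uintptr_t boundary)
{
	void *attempts[ALLOC_ATTEMPTS];
	void *found = NULL;
	int i, n = 0;

	for (i = 0; i < ALLOC_ATTEMPTS; i++) {
		void *p = malloc(size);
		if (p == NULL)
			break;
		uintptr_t start = (uintptr_t)p;
		uintptr_t next_aligned = (start + boundary) & ~(boundary - 1);
		if (start + size <= next_aligned) {
			found = p; /* suitable: keep this block */
			break;
		}
		attempts[n++] = p; /* unsuitable: free after the loop */
	}
	while (n--)
		free(attempts[n]);
	return found;
}
```

In the driver, the unsuitable attempts must stay allocated until the loop ends; freeing one immediately could let the allocator hand the very same unaligned block back on the next attempt.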

> >
> > > +
> > > +/* Allocate 64MB memory used for all software rings */
> > > +static int
> > > +acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues,
> > > +		int socket_id)
> > > +{
> > > +	uint32_t phys_low, phys_high, payload;
> > > +	struct acc100_device *d = dev->data->dev_private;
> > > +	const struct acc100_registry_addr *reg_addr;
> > > +
> > > +	if (d->pf_device && !d->acc100_conf.pf_mode_en) {
> > > +		rte_bbdev_log(NOTICE,
> > > +				"%s has PF mode disabled. This PF can't be used.",
> > > +				dev->data->name);
> > > +		return -ENODEV;
> > > +	}
> > > +
> > > +	alloc_sw_rings_min_mem(dev, d, num_queues, socket_id);
> > > +
> > > +	/* If minimal memory space approach failed, then allocate
> > > +	 * the 2 * 64MB block for the sw rings
> > > +	 */
> > > +	if (d->sw_rings == NULL)
> > > +		alloc_2x64mb_sw_rings_mem(dev, d, socket_id);
> > > +
> > > +	/* Configure ACC100 with the base address for DMA descriptor rings
> > > +	 * Same descriptor rings used for UL and DL DMA Engines
> > > +	 * Note : Assuming only VF0 bundle is used for PF mode
> > > +	 */
> > > +	phys_high = (uint32_t)(d->sw_rings_phys >> 32);
> > > +	phys_low  = (uint32_t)(d->sw_rings_phys & ~(ACC100_SIZE_64MBYTE-1));
> > > +
> > > +	/* Choose correct registry addresses for the device type */
> > > +	if (d->pf_device)
> > > +		reg_addr = &pf_reg_addr;
> > > +	else
> > > +		reg_addr = &vf_reg_addr;
> > > +
> > > +	/* Read the populated cfg from ACC100 registers */
> > > +	fetch_acc100_config(dev);
> > > +
> > > +	/* Mark as configured properly */
> > > +	d->configured = true;
> > > +
> > > +	/* Release AXI from PF */
> > > +	if (d->pf_device)
> > > +		acc100_reg_write(d, HWPfDmaAxiControl, 1);
> > > +
> > > +	acc100_reg_write(d, reg_addr->dma_ring_ul5g_hi, phys_high);
> > > +	acc100_reg_write(d, reg_addr->dma_ring_ul5g_lo, phys_low);
> > > +	acc100_reg_write(d, reg_addr->dma_ring_dl5g_hi, phys_high);
> > > +	acc100_reg_write(d, reg_addr->dma_ring_dl5g_lo, phys_low);
> > > +	acc100_reg_write(d, reg_addr->dma_ring_ul4g_hi, phys_high);
> > > +	acc100_reg_write(d, reg_addr->dma_ring_ul4g_lo, phys_low);
> > > +	acc100_reg_write(d, reg_addr->dma_ring_dl4g_hi, phys_high);
> > > +	acc100_reg_write(d, reg_addr->dma_ring_dl4g_lo, phys_low);
> > > +
> > > +	/*
> > > +	 * Configure Ring Size to the max queue ring size
> > > +	 * (used for wrapping purpose)
> > > +	 */
> > > +	payload = log2_basic(d->sw_ring_size / 64);
> > > +	acc100_reg_write(d, reg_addr->ring_size, payload);
> > > +
> > > +	/* Configure tail pointer for use when SDONE enabled */
> > > +	d->tail_ptrs = rte_zmalloc_socket(
> > > +			dev->device->driver->name,
> > > +			ACC100_NUM_QGRPS * ACC100_NUM_AQS *
> > > sizeof(uint32_t),
> > > +			RTE_CACHE_LINE_SIZE, socket_id);
> > > +	if (d->tail_ptrs == NULL) {
> > > +		rte_bbdev_log(ERR, "Failed to allocate tail ptr for %s:%u",
> > > +				dev->device->driver->name,
> > > +				dev->data->dev_id);
> > > +		rte_free(d->sw_rings);
> > > +		return -ENOMEM;
> > > +	}
> > > +	d->tail_ptr_phys = rte_malloc_virt2iova(d->tail_ptrs);
> > > +
> > > +	phys_high = (uint32_t)(d->tail_ptr_phys >> 32);
> > > +	phys_low  = (uint32_t)(d->tail_ptr_phys);
> > > +	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_hi, phys_high);
> > > +	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_lo, phys_low);
> > > +	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_hi, phys_high);
> > > +	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_lo, phys_low);
> > > +	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_hi, phys_high);
> > > +	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_lo, phys_low);
> > > +	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
> > > +	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
> > > +
> > > +	d->harq_layout = rte_zmalloc_socket("HARQ Layout",
> > > +			ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
> > > +			RTE_CACHE_LINE_SIZE, dev->data->socket_id);
> > > +
> > > +	rte_bbdev_log_debug(
> > > +			"ACC100 (%s) configured  sw_rings = %p,
> > > sw_rings_phys = %#"
> > > +			PRIx64, dev->data->name, d->sw_rings, d-
> > > >sw_rings_phys);
> > > +
> > > +	return 0;
> > > +}
> > > +
> > >  /* Free 64MB memory used for software rings */  static int -
> > > acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
> > > +acc100_dev_close(struct rte_bbdev *dev)
> > >  {
> > > +	struct acc100_device *d = dev->data->dev_private;
> > > +	if (d->sw_rings_base != NULL) {
> > > +		rte_free(d->tail_ptrs);
> > > +		rte_free(d->sw_rings_base);
> > > +		d->sw_rings_base = NULL;
> > > +	}
> > > +	usleep(1000);
> > > +	return 0;
> > > +}
> > > +
> > > +
> > > +/**
> > > + * Report a ACC100 queue index which is free
> > > + * Return 0 to 16k for a valid queue_idx or -1 when no queue is
> > > +available
> > > + * Note : Only supporting VF0 Bundle for PF mode  */ static int
> > > +acc100_find_free_queue_idx(struct rte_bbdev *dev,
> > > +		const struct rte_bbdev_queue_conf *conf) {
> > > +	struct acc100_device *d = dev->data->dev_private;
> > > +	int op_2_acc[5] = {0, UL_4G, DL_4G, UL_5G, DL_5G};
> > > +	int acc = op_2_acc[conf->op_type];
> > > +	struct rte_q_topology_t *qtop = NULL;
> > > +	qtopFromAcc(&qtop, acc, &(d->acc100_conf));
> > > +	if (qtop == NULL)
> > > +		return -1;
> > > +	/* Identify matching QGroup Index which are sorted in priority
> > > +order
> > > */
> > > +	uint16_t group_idx = qtop->first_qgroup_index;
> > > +	group_idx += conf->priority;
> > > +	if (group_idx >= ACC100_NUM_QGRPS ||
> > > +			conf->priority >= qtop->num_qgroups) {
> > > +		rte_bbdev_log(INFO, "Invalid Priority on %s, priority %u",
> > > +				dev->data->name, conf->priority);
> > > +		return -1;
> > > +	}
> > > +	/* Find a free AQ_idx  */
> > > +	uint16_t aq_idx;
> > > +	for (aq_idx = 0; aq_idx < qtop->num_aqs_per_groups; aq_idx++) {
> > > +		if (((d->q_assigned_bit_map[group_idx] >> aq_idx) & 0x1)
> > > == 0) {
> > > +			/* Mark the Queue as assigned */
> > > +			d->q_assigned_bit_map[group_idx] |= (1 << aq_idx);
> > > +			/* Report the AQ Index */
> > > +			return (group_idx << GRP_ID_SHIFT) + aq_idx;
> > > +		}
> > > +	}
> > > +	rte_bbdev_log(INFO, "Failed to find free queue on %s, priority %u",
> > > +			dev->data->name, conf->priority);
> > > +	return -1;
> > > +}
> > > +
> > > +/* Setup ACC100 queue */
> > > +static int
> > > +acc100_queue_setup(struct rte_bbdev *dev, uint16_t queue_id,
> > > +		const struct rte_bbdev_queue_conf *conf) {
> > > +	struct acc100_device *d = dev->data->dev_private;
> > > +	struct acc100_queue *q;
> > > +	int16_t q_idx;
> > > +
> > > +	/* Allocate the queue data structure. */
> > > +	q = rte_zmalloc_socket(dev->device->driver->name, sizeof(*q),
> > > +			RTE_CACHE_LINE_SIZE, conf->socket);
> > > +	if (q == NULL) {
> > > +		rte_bbdev_log(ERR, "Failed to allocate queue memory");
> > > +		return -ENOMEM;
> > > +	}
> > > +
> > > +	q->d = d;
> > > +	q->ring_addr = RTE_PTR_ADD(d->sw_rings, (d->sw_ring_size *
> > > queue_id));
> > > +	q->ring_addr_phys = d->sw_rings_phys + (d->sw_ring_size *
> > > queue_id);
> > > +
> > > +	/* Prepare the Ring with default descriptor format */
> > > +	union acc100_dma_desc *desc = NULL;
> > > +	unsigned int desc_idx, b_idx;
> > > +	int fcw_len = (conf->op_type == RTE_BBDEV_OP_LDPC_ENC ?
> > > +		ACC100_FCW_LE_BLEN : (conf->op_type ==
> > > RTE_BBDEV_OP_TURBO_DEC ?
> > > +		ACC100_FCW_TD_BLEN : ACC100_FCW_LD_BLEN));
> > > +
> > > +	for (desc_idx = 0; desc_idx < d->sw_ring_max_depth; desc_idx++) {
> > > +		desc = q->ring_addr + desc_idx;
> > > +		desc->req.word0 = ACC100_DMA_DESC_TYPE;
> > > +		desc->req.word1 = 0; /**< Timestamp */
> > > +		desc->req.word2 = 0;
> > > +		desc->req.word3 = 0;
> > > +		uint64_t fcw_offset = (desc_idx << 8) +
> > > ACC100_DESC_FCW_OFFSET;
> > > +		desc->req.data_ptrs[0].address = q->ring_addr_phys +
> > > fcw_offset;
> > > +		desc->req.data_ptrs[0].blen = fcw_len;
> > > +		desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
> > > +		desc->req.data_ptrs[0].last = 0;
> > > +		desc->req.data_ptrs[0].dma_ext = 0;
> > > +		for (b_idx = 1; b_idx < ACC100_DMA_MAX_NUM_POINTERS
> > > - 1;
> > > +				b_idx++) {
> > > +			desc->req.data_ptrs[b_idx].blkid =
> > > ACC100_DMA_BLKID_IN;
> > > +			desc->req.data_ptrs[b_idx].last = 1;
> > > +			desc->req.data_ptrs[b_idx].dma_ext = 0;
> > > +			b_idx++;
> > > +			desc->req.data_ptrs[b_idx].blkid =
> > > +					ACC100_DMA_BLKID_OUT_ENC;
> > > +			desc->req.data_ptrs[b_idx].last = 1;
> > > +			desc->req.data_ptrs[b_idx].dma_ext = 0;
> > > +		}
> > > +		/* Preset some fields of LDPC FCW */
> > > +		desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
> > > +		desc->req.fcw_ld.gain_i = 1;
> > > +		desc->req.fcw_ld.gain_h = 1;
> > > +	}
> > > +
> > > +	q->lb_in = rte_zmalloc_socket(dev->device->driver->name,
> > > +			RTE_CACHE_LINE_SIZE,
> > > +			RTE_CACHE_LINE_SIZE, conf->socket);
> > > +	if (q->lb_in == NULL) {
> > > +		rte_bbdev_log(ERR, "Failed to allocate lb_in memory");
> > > +		return -ENOMEM;
> > > +	}
> > > +	q->lb_in_addr_phys = rte_malloc_virt2iova(q->lb_in);
> > > +	q->lb_out = rte_zmalloc_socket(dev->device->driver->name,
> > > +			RTE_CACHE_LINE_SIZE,
> > > +			RTE_CACHE_LINE_SIZE, conf->socket);
> > > +	if (q->lb_out == NULL) {
> > > +		rte_bbdev_log(ERR, "Failed to allocate lb_out memory");
> > > +		return -ENOMEM;
> > > +	}
> > > +	q->lb_out_addr_phys = rte_malloc_virt2iova(q->lb_out);
> > > +
> > > +	/*
> > > +	 * Software queue ring wraps synchronously with the HW when it
> > > reaches
> > > +	 * the boundary of the maximum allocated queue size, no matter
> > > what the
> > > +	 * sw queue size is. This wrapping is guarded by setting the
> > > wrap_mask
> > > +	 * to represent the maximum queue size as allocated at the time
> > > when
> > > +	 * the device has been setup (in configure()).
> > > +	 *
> > > +	 * The queue depth is set to the queue size value (conf-
> > > >queue_size).
> > > +	 * This limits the occupancy of the queue at any point of time, so that
> > > +	 * the queue does not get swamped with enqueue requests.
> > > +	 */
> > > +	q->sw_ring_depth = conf->queue_size;
> > > +	q->sw_ring_wrap_mask = d->sw_ring_max_depth - 1;
> > > +
> > > +	q->op_type = conf->op_type;
> > > +
> > > +	q_idx = acc100_find_free_queue_idx(dev, conf);
> > > +	if (q_idx == -1) {
> > > +		rte_free(q);
> > > +		return -1;
> > > +	}
> > > +
> > > +	q->qgrp_id = (q_idx >> GRP_ID_SHIFT) & 0xF;
> > > +	q->vf_id = (q_idx >> VF_ID_SHIFT)  & 0x3F;
> > > +	q->aq_id = q_idx & 0xF;
> > > +	q->aq_depth = (conf->op_type ==  RTE_BBDEV_OP_TURBO_DEC) ?
> > > +			(1 << d->acc100_conf.q_ul_4g.aq_depth_log2) :
> > > +			(1 << d->acc100_conf.q_dl_4g.aq_depth_log2);
> > > +
> > > +	q->mmio_reg_enqueue = RTE_PTR_ADD(d->mmio_base,
> > > +			queue_offset(d->pf_device,
> > > +					q->vf_id, q->qgrp_id, q->aq_id));
> > > +
> > > +	rte_bbdev_log_debug(
> > > +			"Setup dev%u q%u: qgrp_id=%u, vf_id=%u,
> > > aq_id=%u, aq_depth=%u, mmio_reg_enqueue=%p",
> > > +			dev->data->dev_id, queue_id, q->qgrp_id, q->vf_id,
> > > +			q->aq_id, q->aq_depth, q->mmio_reg_enqueue);
> > > +
> > > +	dev->data->queues[queue_id].queue_private = q;
> > > +	return 0;
> > > +}
> > > +
> > > +/* Release ACC100 queue */
> > > +static int
> > > +acc100_queue_release(struct rte_bbdev *dev, uint16_t q_id) {
> > > +	struct acc100_device *d = dev->data->dev_private;
> > > +	struct acc100_queue *q = dev->data->queues[q_id].queue_private;
> > > +
> > > +	if (q != NULL) {
> > > +		/* Mark the Queue as un-assigned */
> > > +		d->q_assigned_bit_map[q->qgrp_id] &= (0xFFFFFFFF -
> > > +				(1 << q->aq_id));
> > > +		rte_free(q->lb_in);
> > > +		rte_free(q->lb_out);
> > > +		rte_free(q);
> > > +		dev->data->queues[q_id].queue_private = NULL;
> > > +	}
> > > +
> > >  	return 0;
> > >  }
> > >
> > > @@ -258,8 +673,11 @@
> > >  }
> > >
> > >  static const struct rte_bbdev_ops acc100_bbdev_ops = {
> > > +	.setup_queues = acc100_setup_queues,
> > >  	.close = acc100_dev_close,
> > >  	.info_get = acc100_dev_info_get,
> > > +	.queue_setup = acc100_queue_setup,
> > > +	.queue_release = acc100_queue_release,
> > >  };
> > >
> > >  /* ACC100 PCI PF address map */
> > > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h
> > > b/drivers/baseband/acc100/rte_acc100_pmd.h
> > > index 662e2c8..0e2b79c 100644
> > > --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> > > +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> > > @@ -518,11 +518,56 @@ struct acc100_registry_addr {
> > >  	.ddr_range = HWVfDmaDdrBaseRangeRoVf,  };
> > >
> > > +/* Structure associated with each queue. */ struct
> > > +__rte_cache_aligned acc100_queue {
> > > +	union acc100_dma_desc *ring_addr;  /* Virtual address of sw ring */
> > > +	rte_iova_t ring_addr_phys;  /* Physical address of software ring */
> > > +	uint32_t sw_ring_head;  /* software ring head */
> > > +	uint32_t sw_ring_tail;  /* software ring tail */
> > > +	/* software ring size (descriptors, not bytes) */
> > > +	uint32_t sw_ring_depth;
> > > +	/* mask used to wrap enqueued descriptors on the sw ring */
> > > +	uint32_t sw_ring_wrap_mask;
> > > +	/* MMIO register used to enqueue descriptors */
> > > +	void *mmio_reg_enqueue;
> > > +	uint8_t vf_id;  /* VF ID (max = 63) */
> > > +	uint8_t qgrp_id;  /* Queue Group ID */
> > > +	uint16_t aq_id;  /* Atomic Queue ID */
> > > +	uint16_t aq_depth;  /* Depth of atomic queue */
> > > +	uint32_t aq_enqueued;  /* Count how many "batches" have been
> > > enqueued */
> > > +	uint32_t aq_dequeued;  /* Count how many "batches" have been
> > > dequeued */
> > > +	uint32_t irq_enable;  /* Enable ops dequeue interrupts if set to 1 */
> > > +	struct rte_mempool *fcw_mempool;  /* FCW mempool */
> > > +	enum rte_bbdev_op_type op_type;  /* Type of this Queue: TE or TD
> > > */
> > > +	/* Internal Buffers for loopback input */
> > > +	uint8_t *lb_in;
> > > +	uint8_t *lb_out;
> > > +	rte_iova_t lb_in_addr_phys;
> > > +	rte_iova_t lb_out_addr_phys;
> > > +	struct acc100_device *d;
> > > +};
> > > +
> > >  /* Private data structure for each ACC100 device */  struct acc100_device
> {
> > >  	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
> > > +	void *sw_rings_base;  /* Base addr of un-aligned memory for sw
> > > rings */
> > > +	void *sw_rings;  /* 64MBs of 64MB aligned memory for sw rings */
> > > +	rte_iova_t sw_rings_phys;  /* Physical address of sw_rings */
> > > +	/* Virtual address of the info memory routed to the this function
> > > under
> > > +	 * operation, whether it is PF or VF.
> > > +	 */
> > > +	union acc100_harq_layout_data *harq_layout;
> > > +	uint32_t sw_ring_size;
> > >  	uint32_t ddr_size; /* Size in kB */
> > > +	uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */
> > > +	rte_iova_t tail_ptr_phys; /* Physical address of tail pointers */
> > > +	/* Max number of entries available for each queue in device,
> > > depending
> > > +	 * on how many queues are enabled with configure()
> > > +	 */
> > > +	uint32_t sw_ring_max_depth;
> > >  	struct acc100_conf acc100_conf; /* ACC100 Initial configuration */
> > > +	/* Bitmap capturing which Queues have already been assigned */
> > > +	uint16_t q_assigned_bit_map[ACC100_NUM_QGRPS];
> > >  	bool pf_device; /**< True if this is a PF ACC100 device */
> > >  	bool configured; /**< True if this ACC100 device is configured */
> > > };
> > > --
> > > 1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions
  2020-08-29 18:01     ` Chautru, Nicolas
@ 2020-09-03  2:34       ` Xu, Rosen
  2020-09-03  9:09         ` Ananyev, Konstantin
  0 siblings, 1 reply; 213+ messages in thread
From: Xu, Rosen @ 2020-09-03  2:34 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, akhil.goyal; +Cc: Richardson, Bruce

Hi,

> -----Original Message-----
> From: Chautru, Nicolas <nicolas.chautru@intel.com>
> Sent: Sunday, August 30, 2020 2:01
> To: Xu, Rosen <rosen.xu@intel.com>; dev@dpdk.org; akhil.goyal@nxp.com
> Cc: Richardson, Bruce <bruce.richardson@intel.com>
> Subject: RE: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC
> processing functions
> 
> Hi Rosen,
> 
> > From: Xu, Rosen <rosen.xu@intel.com>
> >
> > Hi,
> >
> > > -----Original Message-----
> > > From: dev <dev-bounces@dpdk.org> On Behalf Of Nicolas Chautru
> > > Sent: Wednesday, August 19, 2020 8:25
> > > To: dev@dpdk.org; akhil.goyal@nxp.com
> > > Cc: Richardson, Bruce <bruce.richardson@intel.com>; Chautru, Nicolas
> > > <nicolas.chautru@intel.com>
> > > Subject: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC
> > > processing functions
> > >
> > > Adding LDPC decode and encode processing operations
> > >
> > > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > > ---
> > >  drivers/baseband/acc100/rte_acc100_pmd.c | 1625
> > > +++++++++++++++++++++++++++++-
> > >  drivers/baseband/acc100/rte_acc100_pmd.h |    3 +
> > >  2 files changed, 1626 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> > > b/drivers/baseband/acc100/rte_acc100_pmd.c
> > > index 7a21c57..5f32813 100644
> > > --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> > > +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> > > @@ -15,6 +15,9 @@
> > >  #include <rte_hexdump.h>
> > >  #include <rte_pci.h>
> > >  #include <rte_bus_pci.h>
> > > +#ifdef RTE_BBDEV_OFFLOAD_COST
> > > +#include <rte_cycles.h>
> > > +#endif
> > >
> > >  #include <rte_bbdev.h>
> > >  #include <rte_bbdev_pmd.h>
> > > @@ -449,7 +452,6 @@
> > >  	return 0;
> > >  }
> > >
> > > -
> > >  /**
> > >   * Report a ACC100 queue index which is free
> > >   * Return 0 to 16k for a valid queue_idx or -1 when no queue is
> > > available @@ -634,6 +636,46 @@
> > >  	struct acc100_device *d = dev->data->dev_private;
> > >
> > >  	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
> > > +		{
> > > +			.type   = RTE_BBDEV_OP_LDPC_ENC,
> > > +			.cap.ldpc_enc = {
> > > +				.capability_flags =
> > > +					RTE_BBDEV_LDPC_RATE_MATCH |
> > > +					RTE_BBDEV_LDPC_CRC_24B_ATTACH
> > > |
> > > +
> > > 	RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
> > > +				.num_buffers_src =
> > > +
> > > 	RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > > +				.num_buffers_dst =
> > > +
> > > 	RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > > +			}
> > > +		},
> > > +		{
> > > +			.type   = RTE_BBDEV_OP_LDPC_DEC,
> > > +			.cap.ldpc_dec = {
> > > +			.capability_flags =
> > > +				RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK |
> > > +				RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP |
> > > +
> > > 	RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
> > > +
> > > 	RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
> > > +#ifdef ACC100_EXT_MEM
> > > +
> > > 	RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
> > > +
> > > 	RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
> > > +#endif
> > > +
> > > 	RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE |
> > > +				RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS
> > > |
> > > +				RTE_BBDEV_LDPC_DECODE_BYPASS |
> > > +				RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
> > > +
> > > 	RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
> > > +				RTE_BBDEV_LDPC_LLR_COMPRESSION,
> > > +			.llr_size = 8,
> > > +			.llr_decimals = 1,
> > > +			.num_buffers_src =
> > > +
> > > 	RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > > +			.num_buffers_hard_out =
> > > +
> > > 	RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > > +			.num_buffers_soft_out = 0,
> > > +			}
> > > +		},
> > >  		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
> > >  	};
> > >
> > > @@ -669,9 +711,14 @@
> > >  	dev_info->cpu_flag_reqs = NULL;
> > >  	dev_info->min_alignment = 64;
> > >  	dev_info->capabilities = bbdev_capabilities;
> > > +#ifdef ACC100_EXT_MEM
> > >  	dev_info->harq_buffer_size = d->ddr_size;
> > > +#else
> > > +	dev_info->harq_buffer_size = 0;
> > > +#endif
> > >  }
> > >
> > > +
> > >  static const struct rte_bbdev_ops acc100_bbdev_ops = {
> > >  	.setup_queues = acc100_setup_queues,
> > >  	.close = acc100_dev_close,
> > > @@ -696,6 +743,1577 @@
> > >  	{.device_id = 0},
> > >  };
> > >
> > > +/* Read flag value 0/1 from bitmap */ static inline bool
> > > +check_bit(uint32_t bitmap, uint32_t bitmask) {
> > > +	return bitmap & bitmask;
> > > +}
> > > +
> > > +static inline char *
> > > +mbuf_append(struct rte_mbuf *m_head, struct rte_mbuf *m, uint16_t
> > > +len) {
> > > +	if (unlikely(len > rte_pktmbuf_tailroom(m)))
> > > +		return NULL;
> > > +
> > > +	char *tail = (char *)m->buf_addr + m->data_off + m->data_len;
> > > +	m->data_len = (uint16_t)(m->data_len + len);
> > > +	m_head->pkt_len  = (m_head->pkt_len + len);
> > > +	return tail;
> > > +}
> >
> > Is it reasonable to directly modify the data_len of an rte_mbuf?
> >
> 
> Do you suggest adding it directly without checking that there is enough room
> in the mbuf? We cannot rely on the application providing an mbuf with enough
> tailroom.

What I meant is that these mbuf changes should be moved into librte_mbuf.
And it's better to align with Olivier Matz.

> In case you ask about the 2 mbufs, this is because this function is also used
> to support segmented memory made of multiple mbuf segments.
> Note that this function is also used in other existing bbdev PMDs. In case you
> believe there is a better way to do this, we can certainly discuss and change
> these in several PMDs through another series.
> 
> Thanks for all the reviews and useful comments.
> Nic
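For context, the semantics of the helper under discussion can be sketched outside of DPDK with a simplified stand-in struct (only the fields mbuf_append touches; buffer size and names here are illustrative, not the real rte_mbuf layout): the tailroom check guards the write, data_len grows on the tail segment, and pkt_len is accounted on the head segment so it stays correct for chained mbufs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for rte_mbuf: only the fields the append touches. */
struct mbuf {
	char buf[256];     /* data buffer (fixed size for the sketch) */
	uint16_t data_off; /* start of valid data in buf */
	uint16_t data_len; /* valid data length in this segment */
	uint32_t pkt_len;  /* total length across segments (head only) */
};

static uint16_t tailroom(const struct mbuf *m)
{
	return (uint16_t)(sizeof(m->buf) - m->data_off - m->data_len);
}

/* Append len bytes to segment m, accounting the total on head. */
static char *mbuf_append_sketch(struct mbuf *head, struct mbuf *m,
		uint16_t len)
{
	if (len > tailroom(m)) /* never grow past the buffer end */
		return NULL;
	char *tail = m->buf + m->data_off + m->data_len;
	m->data_len = (uint16_t)(m->data_len + len);
	head->pkt_len += len;  /* pkt_len is tracked on the head segment */
	return tail;
}
```

The two-mbuf signature exists purely so the same helper works when the tail segment is not the head of the chain.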

^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions
  2020-08-20 21:05         ` Chautru, Nicolas
@ 2020-09-03  8:06           ` Dave Burley
  0 siblings, 0 replies; 213+ messages in thread
From: Dave Burley @ 2020-09-03  8:06 UTC (permalink / raw)
  To: Chautru, Nicolas, dev; +Cc: Richardson, Bruce

Acked-by: Dave Burley <dave.burley@accelercomm.com>


From: Chautru, Nicolas <nicolas.chautru@intel.com>
Sent: 20 August 2020 22:05
To: Dave Burley <dave.burley@accelercomm.com>; dev@dpdk.org <dev@dpdk.org>
Cc: Richardson, Bruce <bruce.richardson@intel.com>
Subject: RE: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions 
 

> From: Dave Burley <dave.burley@accelercomm.com>> 
> Hi Nic
> 
> Thank you - it would be useful to have further documentation for clarification
> as the data format isn't explicitly documented in BBDEV.

Thanks Dave. Just updated on this other patch -> https://patches.dpdk.org/patch/75793/
Feel free to ack or let me know if you need more details. 

> Best Regards
> 
> Dave
> 
> 
> From: Chautru, Nicolas <nicolas.chautru@intel.com>
> Sent: 20 August 2020 15:52
> To: Dave Burley <dave.burley@accelercomm.com>; dev@dpdk.org
> <dev@dpdk.org>
> Cc: Richardson, Bruce <bruce.richardson@intel.com>
> Subject: RE: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC
> processing functions
> 
> Hi Dave,
> This is assuming 6-bit LLR compression packing (i.e. the first 2 MSBs are
> dropped). Similar to HARQ compression.
> Let me know if unclear, I can clarify further in documentation if not explicit
> enough.
> Thanks
> Nic
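The 8-to-6-bit packing described above can be sketched as follows. This is a hedged illustration only: the 8/6 size ratio matches the driver's HARQ length scaling (harq_in_length * 8 / 6), but the exact bit order the hardware expects is an assumption here, not confirmed by the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack 8-bit LLRs into a dense 6-bit stream by dropping the 2 MSBs
 * of each sample and concatenating the remaining 6-bit fields
 * MSB-first (assumed ordering). Returns the number of bytes written,
 * i.e. n * 6 bits rounded up to a byte.
 */
static size_t pack_llr_6bit(const uint8_t *in, size_t n, uint8_t *out)
{
	size_t bitpos = 0;
	for (size_t i = 0; i < n; i++) {
		uint8_t v = in[i] & 0x3F;      /* keep the 6 LSBs */
		for (int b = 5; b >= 0; b--) { /* emit MSB-first */
			if (bitpos % 8 == 0)
				out[bitpos / 8] = 0;
			out[bitpos / 8] |= (uint8_t)
				(((v >> b) & 1) << (7 - bitpos % 8));
			bitpos++;
		}
	}
	return (bitpos + 7) / 8;
}
```

With this scheme four 8-bit LLRs compress into three bytes, which is the space saving the RTE_BBDEV_LDPC_LLR_COMPRESSION flag advertises.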
> 
> > -----Original Message-----
> > From: Dave Burley <dave.burley@accelercomm.com>
> > Sent: Thursday, August 20, 2020 7:39 AM
> > To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org
> > Cc: Richardson, Bruce <bruce.richardson@intel.com>
> > Subject: Re: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC
> > processing functions
> >
> > Hi Nic,
> >
> > As you've now specified the use of RTE_BBDEV_LDPC_LLR_COMPRESSION for
> > this PMB, please could you confirm what the packed format of the LLRs in
> > memory looks like?
> >
> > Best Regards
> >
> > Dave Burley
> >
> >
> > From: dev <dev-bounces@dpdk.org> on behalf of Nicolas Chautru
> > <nicolas.chautru@intel.com>
> > Sent: 19 August 2020 01:25
> > To: dev@dpdk.org <dev@dpdk.org>; akhil.goyal@nxp.com
> > <akhil.goyal@nxp.com>
> > Cc: bruce.richardson@intel.com <bruce.richardson@intel.com>; Nicolas
> > Chautru <nicolas.chautru@intel.com>
> > Subject: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing
> > functions
> >
> > Adding LDPC decode and encode processing operations
> >
> > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > ---
> >  drivers/baseband/acc100/rte_acc100_pmd.c | 1625
> > +++++++++++++++++++++++++++++-
> >  drivers/baseband/acc100/rte_acc100_pmd.h |    3 +
> >  2 files changed, 1626 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> > b/drivers/baseband/acc100/rte_acc100_pmd.c
> > index 7a21c57..5f32813 100644
> > --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> > +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> > @@ -15,6 +15,9 @@
> >  #include <rte_hexdump.h>
> >  #include <rte_pci.h>
> >  #include <rte_bus_pci.h>
> > +#ifdef RTE_BBDEV_OFFLOAD_COST
> > +#include <rte_cycles.h>
> > +#endif
> >
> >  #include <rte_bbdev.h>
> >  #include <rte_bbdev_pmd.h>
> > @@ -449,7 +452,6 @@
> >          return 0;
> >  }
> >
> > -
> >  /**
> >   * Report a ACC100 queue index which is free
> >   * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
> > @@ -634,6 +636,46 @@
> >          struct acc100_device *d = dev->data->dev_private;
> >
> >          static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
> > +               {
> > +                       .type   = RTE_BBDEV_OP_LDPC_ENC,
> > +                       .cap.ldpc_enc = {
> > +                               .capability_flags =
> > +                                       RTE_BBDEV_LDPC_RATE_MATCH |
> > +                                       RTE_BBDEV_LDPC_CRC_24B_ATTACH |
> > +                                       RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
> > +                               .num_buffers_src =
> > +                                               RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > +                               .num_buffers_dst =
> > +                                               RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > +                       }
> > +               },
> > +               {
> > +                       .type   = RTE_BBDEV_OP_LDPC_DEC,
> > +                       .cap.ldpc_dec = {
> > +                       .capability_flags =
> > +                               RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK |
> > +                               RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP |
> > +                               RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
> > +                               RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
> > +#ifdef ACC100_EXT_MEM
> >
> +                               RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABL
> > E |
> >
> +                               RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENA
> > BLE |
> > +#endif
> > +                               RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE |
> > +                               RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS |
> > +                               RTE_BBDEV_LDPC_DECODE_BYPASS |
> > +                               RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
> > +                               RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
> > +                               RTE_BBDEV_LDPC_LLR_COMPRESSION,
> > +                       .llr_size = 8,
> > +                       .llr_decimals = 1,
> > +                       .num_buffers_src =
> > +                                       RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > +                       .num_buffers_hard_out =
> > +                                       RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > +                       .num_buffers_soft_out = 0,
> > +                       }
> > +               },
> >                  RTE_BBDEV_END_OF_CAPABILITIES_LIST()
> >          };
> >
> > @@ -669,9 +711,14 @@
> >          dev_info->cpu_flag_reqs = NULL;
> >          dev_info->min_alignment = 64;
> >          dev_info->capabilities = bbdev_capabilities;
> > +#ifdef ACC100_EXT_MEM
> >          dev_info->harq_buffer_size = d->ddr_size;
> > +#else
> > +       dev_info->harq_buffer_size = 0;
> > +#endif
> >  }
> >
> > +
> >  static const struct rte_bbdev_ops acc100_bbdev_ops = {
> >          .setup_queues = acc100_setup_queues,
> >          .close = acc100_dev_close,
> > @@ -696,6 +743,1577 @@
> >          {.device_id = 0},
> >  };
> >
> > +/* Read flag value 0/1 from bitmap */
> > +static inline bool
> > +check_bit(uint32_t bitmap, uint32_t bitmask)
> > +{
> > +       return bitmap & bitmask;
> > +}
> > +
> > +static inline char *
> > +mbuf_append(struct rte_mbuf *m_head, struct rte_mbuf *m, uint16_t len)
> > +{
> > +       if (unlikely(len > rte_pktmbuf_tailroom(m)))
> > +               return NULL;
> > +
> > +       char *tail = (char *)m->buf_addr + m->data_off + m->data_len;
> > +       m->data_len = (uint16_t)(m->data_len + len);
> > +       m_head->pkt_len  = (m_head->pkt_len + len);
> > +       return tail;
> > +}
> > +
> > +/* Compute value of k0.
> > + * Based on 3GPP 38.212 Table 5.4.2.1-2
> > + * Starting position of different redundancy versions, k0
> > + */
> > +static inline uint16_t
> > +get_k0(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv_index)
> > +{
> > +       if (rv_index == 0)
> > +               return 0;
> > +       uint16_t n = (bg == 1 ? N_ZC_1 : N_ZC_2) * z_c;
> > +       if (n_cb == n) {
> > +               if (rv_index == 1)
> > +                       return (bg == 1 ? K0_1_1 : K0_1_2) * z_c;
> > +               else if (rv_index == 2)
> > +                       return (bg == 1 ? K0_2_1 : K0_2_2) * z_c;
> > +               else
> > +                       return (bg == 1 ? K0_3_1 : K0_3_2) * z_c;
> > +       }
> > +       /* LBRM case - includes a division by N */
> > +       if (rv_index == 1)
> > +               return (((bg == 1 ? K0_1_1 : K0_1_2) * n_cb)
> > +                               / n) * z_c;
> > +       else if (rv_index == 2)
> > +               return (((bg == 1 ? K0_2_1 : K0_2_2) * n_cb)
> > +                               / n) * z_c;
> > +       else
> > +               return (((bg == 1 ? K0_3_1 : K0_3_2) * n_cb)
> > +                               / n) * z_c;
> > +}
> > +
> > +/* Fill in a frame control word for LDPC encoding. */
> > +static inline void
> > +acc100_fcw_le_fill(const struct rte_bbdev_enc_op *op,
> > +               struct acc100_fcw_le *fcw, int num_cb)
> > +{
> > +       fcw->qm = op->ldpc_enc.q_m;
> > +       fcw->nfiller = op->ldpc_enc.n_filler;
> > +       fcw->BG = (op->ldpc_enc.basegraph - 1);
> > +       fcw->Zc = op->ldpc_enc.z_c;
> > +       fcw->ncb = op->ldpc_enc.n_cb;
> > +       fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_enc.basegraph,
> > +                       op->ldpc_enc.rv_index);
> > +       fcw->rm_e = op->ldpc_enc.cb_params.e;
> > +       fcw->crc_select = check_bit(op->ldpc_enc.op_flags,
> > +                       RTE_BBDEV_LDPC_CRC_24B_ATTACH);
> > +       fcw->bypass_intlv = check_bit(op->ldpc_enc.op_flags,
> > +                       RTE_BBDEV_LDPC_INTERLEAVER_BYPASS);
> > +       fcw->mcb_count = num_cb;
> > +}
> > +
> > +/* Fill in a frame control word for LDPC decoding. */
> > +static inline void
> > +acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct
> acc100_fcw_ld
> > *fcw,
> > +               union acc100_harq_layout_data *harq_layout)
> > +{
> > +       uint16_t harq_out_length, harq_in_length, ncb_p, k0_p, parity_offset;
> > +       uint16_t harq_index;
> > +       uint32_t l;
> > +       bool harq_prun = false;
> > +
> > +       fcw->qm = op->ldpc_dec.q_m;
> > +       fcw->nfiller = op->ldpc_dec.n_filler;
> > +       fcw->BG = (op->ldpc_dec.basegraph - 1);
> > +       fcw->Zc = op->ldpc_dec.z_c;
> > +       fcw->ncb = op->ldpc_dec.n_cb;
> > +       fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_dec.basegraph,
> > +                       op->ldpc_dec.rv_index);
> > +       if (op->ldpc_dec.code_block_mode == 1)
> > +               fcw->rm_e = op->ldpc_dec.cb_params.e;
> > +       else
> > +               fcw->rm_e = (op->ldpc_dec.tb_params.r <
> > +                               op->ldpc_dec.tb_params.cab) ?
> > +                                               op->ldpc_dec.tb_params.ea :
> > +                                               op->ldpc_dec.tb_params.eb;
> > +
> > +       fcw->hcin_en = check_bit(op->ldpc_dec.op_flags,
> > +                       RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE);
> > +       fcw->hcout_en = check_bit(op->ldpc_dec.op_flags,
> > +                       RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE);
> > +       fcw->crc_select = check_bit(op->ldpc_dec.op_flags,
> > +                       RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK);
> > +       fcw->bypass_dec = check_bit(op->ldpc_dec.op_flags,
> > +                       RTE_BBDEV_LDPC_DECODE_BYPASS);
> > +       fcw->bypass_intlv = check_bit(op->ldpc_dec.op_flags,
> > +                       RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS);
> > +       if (op->ldpc_dec.q_m == 1) {
> > +               fcw->bypass_intlv = 1;
> > +               fcw->qm = 2;
> > +       }
> > +       fcw->hcin_decomp_mode = check_bit(op->ldpc_dec.op_flags,
> > +                       RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> > +       fcw->hcout_comp_mode = check_bit(op->ldpc_dec.op_flags,
> > +                       RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> > +       fcw->llr_pack_mode = check_bit(op->ldpc_dec.op_flags,
> > +                       RTE_BBDEV_LDPC_LLR_COMPRESSION);
> > +       harq_index = op->ldpc_dec.harq_combined_output.offset /
> > +                       ACC100_HARQ_OFFSET;
> > +#ifdef ACC100_EXT_MEM
> > +       /* Limit cases when HARQ pruning is valid */
> > +       harq_prun = ((op->ldpc_dec.harq_combined_output.offset %
> > +                       ACC100_HARQ_OFFSET) == 0) &&
> > +                       (op->ldpc_dec.harq_combined_output.offset <= UINT16_MAX
> > +                       * ACC100_HARQ_OFFSET);
> > +#endif
> > +       if (fcw->hcin_en > 0) {
> > +               harq_in_length = op->ldpc_dec.harq_combined_input.length;
> > +               if (fcw->hcin_decomp_mode > 0)
> > +                       harq_in_length = harq_in_length * 8 / 6;
> > +               harq_in_length = RTE_ALIGN(harq_in_length, 64);
> > +               if ((harq_layout[harq_index].offset > 0) && harq_prun) {
> > +                       rte_bbdev_log_debug("HARQ IN offset unexpected for now\n");
> > +                       fcw->hcin_size0 = harq_layout[harq_index].size0;
> > +                       fcw->hcin_offset = harq_layout[harq_index].offset;
> > +                       fcw->hcin_size1 = harq_in_length -
> > +                                       harq_layout[harq_index].offset;
> > +               } else {
> > +                       fcw->hcin_size0 = harq_in_length;
> > +                       fcw->hcin_offset = 0;
> > +                       fcw->hcin_size1 = 0;
> > +               }
> > +       } else {
> > +               fcw->hcin_size0 = 0;
> > +               fcw->hcin_offset = 0;
> > +               fcw->hcin_size1 = 0;
> > +       }
> > +
> > +       fcw->itmax = op->ldpc_dec.iter_max;
> > +       fcw->itstop = check_bit(op->ldpc_dec.op_flags,
> > +                       RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE);
> > +       fcw->synd_precoder = fcw->itstop;
> > +       /*
> > +        * These are all implicitly set
> > +        * fcw->synd_post = 0;
> > +        * fcw->so_en = 0;
> > +        * fcw->so_bypass_rm = 0;
> > +        * fcw->so_bypass_intlv = 0;
> > +        * fcw->dec_convllr = 0;
> > +        * fcw->hcout_convllr = 0;
> > +        * fcw->hcout_size1 = 0;
> > +        * fcw->so_it = 0;
> > +        * fcw->hcout_offset = 0;
> > +        * fcw->negstop_th = 0;
> > +        * fcw->negstop_it = 0;
> > +        * fcw->negstop_en = 0;
> > +        * fcw->gain_i = 1;
> > +        * fcw->gain_h = 1;
> > +        */
> > +       if (fcw->hcout_en > 0) {
> > +               parity_offset = (op->ldpc_dec.basegraph == 1 ? 20 : 8)
> > +                       * op->ldpc_dec.z_c - op->ldpc_dec.n_filler;
> > +               k0_p = (fcw->k0 > parity_offset) ?
> > +                               fcw->k0 - op->ldpc_dec.n_filler : fcw->k0;
> > +               ncb_p = fcw->ncb - op->ldpc_dec.n_filler;
> > +               l = k0_p + fcw->rm_e;
> > +               harq_out_length = (uint16_t) fcw->hcin_size0;
> > +               harq_out_length = RTE_MIN(RTE_MAX(harq_out_length, l), ncb_p);
> > +               harq_out_length = (harq_out_length + 0x3F) & 0xFFC0;
> > +               if ((k0_p > fcw->hcin_size0 + ACC100_HARQ_OFFSET_THRESHOLD) &&
> > +                               harq_prun) {
> > +                       fcw->hcout_size0 = (uint16_t) fcw->hcin_size0;
> > +                       fcw->hcout_offset = k0_p & 0xFFC0;
> > +                       fcw->hcout_size1 = harq_out_length - fcw->hcout_offset;
> > +               } else {
> > +                       fcw->hcout_size0 = harq_out_length;
> > +                       fcw->hcout_size1 = 0;
> > +                       fcw->hcout_offset = 0;
> > +               }
> > +               harq_layout[harq_index].offset = fcw->hcout_offset;
> > +               harq_layout[harq_index].size0 = fcw->hcout_size0;
> > +       } else {
> > +               fcw->hcout_size0 = 0;
> > +               fcw->hcout_size1 = 0;
> > +               fcw->hcout_offset = 0;
> > +       }
> > +}
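The HARQ region indexing and pruning-validity checks above are quite compact; for review purposes, here is a standalone sketch of the same arithmetic. The region granularity constant is assumed for illustration only and is not taken from the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed region granularity, for illustration only; the real value is
 * ACC100_HARQ_OFFSET in the driver headers. */
#define HARQ_OFFSET_GRANULARITY 32768U

/* Region index derived from the combined-output offset, as in
 * acc100_fcw_ld_fill(). */
static uint32_t harq_region_index(uint32_t offset)
{
	return offset / HARQ_OFFSET_GRANULARITY;
}

/* Pruning is only trusted when the offset is region-aligned and the
 * derived index fits the 16-bit layout table. */
static bool harq_prune_valid(uint64_t offset)
{
	return (offset % HARQ_OFFSET_GRANULARITY) == 0 &&
		offset <= (uint64_t)UINT16_MAX * HARQ_OFFSET_GRANULARITY;
}
```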
> > +
> > +/**
> > + * Fills descriptor with data pointers of one block type.
> > + *
> > + * @param desc
> > + *   Pointer to DMA descriptor.
> > + * @param input
> > + *   Pointer to pointer to input data which will be encoded. It can be changed
> > + *   and points to next segment in scatter-gather case.
> > + * @param offset
> > + *   Input offset in rte_mbuf structure. It is used for calculating the point
> > + *   where data is starting.
> > + * @param cb_len
> > + *   Length of currently processed Code Block
> > + * @param seg_total_left
> > + *   It indicates how many bytes still left in segment (mbuf) for further
> > + *   processing.
> > + * @param op_flags
> > + *   Store information about device capabilities
> > + * @param next_triplet
> > + *   Index for ACC100 DMA Descriptor triplet
> > + *
> > + * @return
> > + *   Returns index of next triplet on success, other value if lengths of
> > + *   pkt and processed cb do not match.
> > + *
> > + */
> > +static inline int
> > +acc100_dma_fill_blk_type_in(struct acc100_dma_req_desc *desc,
> > +               struct rte_mbuf **input, uint32_t *offset, uint32_t cb_len,
> > +               uint32_t *seg_total_left, int next_triplet)
> > +{
> > +       uint32_t part_len;
> > +       struct rte_mbuf *m = *input;
> > +
> > +       part_len = (*seg_total_left < cb_len) ? *seg_total_left : cb_len;
> > +       cb_len -= part_len;
> > +       *seg_total_left -= part_len;
> > +
> > +       desc->data_ptrs[next_triplet].address =
> > +                       rte_pktmbuf_iova_offset(m, *offset);
> > +       desc->data_ptrs[next_triplet].blen = part_len;
> > +       desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
> > +       desc->data_ptrs[next_triplet].last = 0;
> > +       desc->data_ptrs[next_triplet].dma_ext = 0;
> > +       *offset += part_len;
> > +       next_triplet++;
> > +
> > +       while (cb_len > 0) {
> > +               if (next_triplet < ACC100_DMA_MAX_NUM_POINTERS &&
> > +                               m->next != NULL) {
> > +
> > +                       m = m->next;
> > +                       *seg_total_left = rte_pktmbuf_data_len(m);
> > +                       part_len = (*seg_total_left < cb_len) ?
> > +                                       *seg_total_left :
> > +                                       cb_len;
> > +                       desc->data_ptrs[next_triplet].address =
> > +                                       rte_pktmbuf_iova(m);
> > +                       desc->data_ptrs[next_triplet].blen = part_len;
> > +                       desc->data_ptrs[next_triplet].blkid =
> > +                                       ACC100_DMA_BLKID_IN;
> > +                       desc->data_ptrs[next_triplet].last = 0;
> > +                       desc->data_ptrs[next_triplet].dma_ext = 0;
> > +                       cb_len -= part_len;
> > +                       *seg_total_left -= part_len;
> > +                       /* Initializing offset for next segment (mbuf) */
> > +                       *offset = part_len;
> > +                       next_triplet++;
> > +               } else {
> > +                       rte_bbdev_log(ERR,
> > +                               "Some data still left for processing: "
> > +                               "data_left: %u, next_triplet: %u, next_mbuf: %p",
> > +                               cb_len, next_triplet, m->next);
> > +                       return -EINVAL;
> > +               }
> > +       }
> > +       /* Storing new mbuf as it could be changed in scatter-gather case */
> > +       *input = m;
> > +
> > +       return next_triplet;
> > +}
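As a reference for the scatter-gather walk in acc100_dma_fill_blk_type_in(), a simplified host-side model of the same splitting logic, with plain arrays standing in for the mbuf chain (names and signature are illustrative, not from the driver):

```c
#include <assert.h>
#include <stdint.h>

/* Simplified model of acc100_dma_fill_blk_type_in(): split cb_len across
 * the remaining bytes of each segment, one DMA triplet per segment
 * touched. Returns the number of triplets used, or -1 if the segments
 * run out while data is still left (the -EINVAL path in the driver). */
static int split_cb_over_segments(uint32_t cb_len, const uint32_t *seg_left,
		int nb_segs, uint32_t *triplet_len)
{
	int triplets = 0;
	int seg = 0;

	while (cb_len > 0) {
		uint32_t part;

		if (seg >= nb_segs)
			return -1; /* data left but no next segment */
		part = seg_left[seg] < cb_len ? seg_left[seg] : cb_len;
		triplet_len[triplets++] = part;
		cb_len -= part;
		seg++;
	}
	return triplets;
}
```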
> > +
> > +/* Fills descriptor with data pointers of one block type.
> > + * Returns index of next triplet on success, other value if lengths of
> > + * output data and processed mbuf do not match.
> > + */
> > +static inline int
> > +acc100_dma_fill_blk_type_out(struct acc100_dma_req_desc *desc,
> > +               struct rte_mbuf *output, uint32_t out_offset,
> > +               uint32_t output_len, int next_triplet, int blk_id)
> > +{
> > +       desc->data_ptrs[next_triplet].address =
> > +                       rte_pktmbuf_iova_offset(output, out_offset);
> > +       desc->data_ptrs[next_triplet].blen = output_len;
> > +       desc->data_ptrs[next_triplet].blkid = blk_id;
> > +       desc->data_ptrs[next_triplet].last = 0;
> > +       desc->data_ptrs[next_triplet].dma_ext = 0;
> > +       next_triplet++;
> > +
> > +       return next_triplet;
> > +}
> > +
> > +static inline int
> > +acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
> > +               struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
> > +               struct rte_mbuf *output, uint32_t *in_offset,
> > +               uint32_t *out_offset, uint32_t *out_length,
> > +               uint32_t *mbuf_total_left, uint32_t *seg_total_left)
> > +{
> > +       int next_triplet = 1; /* FCW already done */
> > +       uint16_t K, in_length_in_bits, in_length_in_bytes;
> > +       struct rte_bbdev_op_ldpc_enc *enc = &op->ldpc_enc;
> > +
> > +       desc->word0 = ACC100_DMA_DESC_TYPE;
> > +       desc->word1 = 0; /**< Timestamp could be disabled */
> > +       desc->word2 = 0;
> > +       desc->word3 = 0;
> > +       desc->numCBs = 1;
> > +
> > +       K = (enc->basegraph == 1 ? 22 : 10) * enc->z_c;
> > +       in_length_in_bits = K - enc->n_filler;
> > +       if ((enc->op_flags & RTE_BBDEV_LDPC_CRC_24A_ATTACH) ||
> > +                       (enc->op_flags & RTE_BBDEV_LDPC_CRC_24B_ATTACH))
> > +               in_length_in_bits -= 24;
> > +       in_length_in_bytes = in_length_in_bits >> 3;
> > +
> > +       if (unlikely((*mbuf_total_left == 0) ||
> > +                       (*mbuf_total_left < in_length_in_bytes))) {
> > +               rte_bbdev_log(ERR,
> > +                               "Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
> > +                               *mbuf_total_left, in_length_in_bytes);
> > +               return -1;
> > +       }
> > +
> > +       next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
> > +                       in_length_in_bytes,
> > +                       seg_total_left, next_triplet);
> > +       if (unlikely(next_triplet < 0)) {
> > +               rte_bbdev_log(ERR,
> > +                               "Mismatch between data to process and mbuf data length in bbdev_op: %p",
> > +                               op);
> > +               return -1;
> > +       }
> > +       desc->data_ptrs[next_triplet - 1].last = 1;
> > +       desc->m2dlen = next_triplet;
> > +       *mbuf_total_left -= in_length_in_bytes;
> > +
> > +       /* Set output length */
> > +       /* Integer round up division by 8 */
> > +       *out_length = (enc->cb_params.e + 7) >> 3;
> > +
> > +       next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
> > +                       *out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
> > +       if (unlikely(next_triplet < 0)) {
> > +               rte_bbdev_log(ERR,
> > +                               "Mismatch between data to process and mbuf data length in bbdev_op: %p",
> > +                               op);
> > +               return -1;
> > +       }
> > +       op->ldpc_enc.output.length += *out_length;
> > +       *out_offset += *out_length;
> > +       desc->data_ptrs[next_triplet - 1].last = 1;
> > +       desc->data_ptrs[next_triplet - 1].dma_ext = 0;
> > +       desc->d2mlen = next_triplet - desc->m2dlen;
> > +
> > +       desc->op_addr = op;
> > +
> > +       return 0;
> > +}
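The input/output length derivation in acc100_dma_desc_le_fill() follows the LDPC base-graph sizes; a small isolated sketch of the same math, useful for cross-checking the descriptor lengths:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* LDPC encoder input payload as computed in acc100_dma_desc_le_fill():
 * K systematic bits from the base graph (22*Zc for BG1, 10*Zc for BG2),
 * minus filler bits and, when a 24-bit CRC is attached, minus 24, then
 * expressed in bytes. */
static uint16_t ldpc_enc_in_bytes(uint8_t basegraph, uint16_t z_c,
		uint16_t n_filler, bool crc24_attach)
{
	uint16_t K = (basegraph == 1 ? 22 : 10) * z_c;
	uint16_t bits = K - n_filler - (crc24_attach ? 24 : 0);

	return bits >> 3;
}

/* Rate-matched output length e, rounded up to whole bytes. */
static uint32_t ldpc_enc_out_bytes(uint32_t e)
{
	return (e + 7) >> 3;
}
```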
> > +
> > +static inline int
> > +acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
> > +               struct acc100_dma_req_desc *desc,
> > +               struct rte_mbuf **input, struct rte_mbuf *h_output,
> > +               uint32_t *in_offset, uint32_t *h_out_offset,
> > +               uint32_t *h_out_length, uint32_t *mbuf_total_left,
> > +               uint32_t *seg_total_left,
> > +               struct acc100_fcw_ld *fcw)
> > +{
> > +       struct rte_bbdev_op_ldpc_dec *dec = &op->ldpc_dec;
> > +       int next_triplet = 1; /* FCW already done */
> > +       uint32_t input_length;
> > +       uint16_t output_length, crc24_overlap = 0;
> > +       uint16_t sys_cols, K, h_p_size, h_np_size;
> > +       bool h_comp = check_bit(dec->op_flags,
> > +                       RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> > +
> > +       desc->word0 = ACC100_DMA_DESC_TYPE;
> > +       desc->word1 = 0; /**< Timestamp could be disabled */
> > +       desc->word2 = 0;
> > +       desc->word3 = 0;
> > +       desc->numCBs = 1;
> > +
> > +       if (check_bit(op->ldpc_dec.op_flags,
> > +                       RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP))
> > +               crc24_overlap = 24;
> > +
> > +       /* Compute some LDPC BG lengths */
> > +       input_length = dec->cb_params.e;
> > +       if (check_bit(op->ldpc_dec.op_flags,
> > +                       RTE_BBDEV_LDPC_LLR_COMPRESSION))
> > +               input_length = (input_length * 3 + 3) / 4;
> > +       sys_cols = (dec->basegraph == 1) ? 22 : 10;
> > +       K = sys_cols * dec->z_c;
> > +       output_length = K - dec->n_filler - crc24_overlap;
> > +
> > +       if (unlikely((*mbuf_total_left == 0) ||
> > +                       (*mbuf_total_left < input_length))) {
> > +               rte_bbdev_log(ERR,
> > +                               "Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
> > +                               *mbuf_total_left, input_length);
> > +               return -1;
> > +       }
> > +
> > +       next_triplet = acc100_dma_fill_blk_type_in(desc, input,
> > +                       in_offset, input_length,
> > +                       seg_total_left, next_triplet);
> > +
> > +       if (unlikely(next_triplet < 0)) {
> > +               rte_bbdev_log(ERR,
> > +                               "Mismatch between data to process and mbuf data length in bbdev_op: %p",
> > +                               op);
> > +               return -1;
> > +       }
> > +
> > +       if (check_bit(op->ldpc_dec.op_flags,
> > +                               RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
> > +               h_p_size = fcw->hcin_size0 + fcw->hcin_size1;
> > +               if (h_comp)
> > +                       h_p_size = (h_p_size * 3 + 3) / 4;
> > +               desc->data_ptrs[next_triplet].address =
> > +                               dec->harq_combined_input.offset;
> > +               desc->data_ptrs[next_triplet].blen = h_p_size;
> > +               desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN_HARQ;
> > +               desc->data_ptrs[next_triplet].dma_ext = 1;
> > +#ifndef ACC100_EXT_MEM
> > +               acc100_dma_fill_blk_type_out(
> > +                               desc,
> > +                               op->ldpc_dec.harq_combined_input.data,
> > +                               op->ldpc_dec.harq_combined_input.offset,
> > +                               h_p_size,
> > +                               next_triplet,
> > +                               ACC100_DMA_BLKID_IN_HARQ);
> > +#endif
> > +               next_triplet++;
> > +       }
> > +
> > +       desc->data_ptrs[next_triplet - 1].last = 1;
> > +       desc->m2dlen = next_triplet;
> > +       *mbuf_total_left -= input_length;
> > +
> > +       next_triplet = acc100_dma_fill_blk_type_out(desc, h_output,
> > +                       *h_out_offset, output_length >> 3, next_triplet,
> > +                       ACC100_DMA_BLKID_OUT_HARD);
> > +       if (unlikely(next_triplet < 0)) {
> > +               rte_bbdev_log(ERR,
> > +                               "Mismatch between data to process and mbuf data length in bbdev_op: %p",
> > +                               op);
> > +               return -1;
> > +       }
> > +
> > +       if (check_bit(op->ldpc_dec.op_flags,
> > +                               RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
> > +               /* Pruned size of the HARQ */
> > +               h_p_size = fcw->hcout_size0 + fcw->hcout_size1;
> > +               /* Non-Pruned size of the HARQ */
> > +               h_np_size = fcw->hcout_offset > 0 ?
> > +                               fcw->hcout_offset + fcw->hcout_size1 :
> > +                               h_p_size;
> > +               if (h_comp) {
> > +                       h_np_size = (h_np_size * 3 + 3) / 4;
> > +                       h_p_size = (h_p_size * 3 + 3) / 4;
> > +               }
> > +               dec->harq_combined_output.length = h_np_size;
> > +               desc->data_ptrs[next_triplet].address =
> > +                               dec->harq_combined_output.offset;
> > +               desc->data_ptrs[next_triplet].blen = h_p_size;
> > +               desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_OUT_HARQ;
> > +               desc->data_ptrs[next_triplet].dma_ext = 1;
> > +#ifndef ACC100_EXT_MEM
> > +               acc100_dma_fill_blk_type_out(
> > +                               desc,
> > +                               dec->harq_combined_output.data,
> > +                               dec->harq_combined_output.offset,
> > +                               h_p_size,
> > +                               next_triplet,
> > +                               ACC100_DMA_BLKID_OUT_HARQ);
> > +#endif
> > +               next_triplet++;
> > +       }
> > +
> > +       *h_out_length = output_length >> 3;
> > +       dec->hard_output.length += *h_out_length;
> > +       *h_out_offset += *h_out_length;
> > +       desc->data_ptrs[next_triplet - 1].last = 1;
> > +       desc->d2mlen = next_triplet - desc->m2dlen;
> > +
> > +       desc->op_addr = op;
> > +
> > +       return 0;
> > +}
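The 6-bit HARQ compression factors used above come in a matched pair: sizes shrink by 3/4 rounded up on the way out, and grow back by 8/6 on the way in. A sketch of just that arithmetic, to make the (x * 3 + 3) / 4 term easier to review:

```c
#include <assert.h>
#include <stdint.h>

/* With RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION each 8-bit LLR is stored in
 * 6 bits: compressed sizes shrink by 3/4 rounded up (the (x * 3 + 3) / 4
 * term in acc100_dma_desc_ld_fill()), and the decompressed input grows
 * back by 8/6 (the harq_in_length * 8 / 6 term in acc100_fcw_ld_fill()). */
static uint32_t harq_compressed_size(uint32_t llrs)
{
	return (llrs * 3 + 3) / 4;
}

static uint32_t harq_decompressed_size(uint32_t bytes)
{
	return bytes * 8 / 6;
}
```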
> > +
> > +static inline void
> > +acc100_dma_desc_ld_update(struct rte_bbdev_dec_op *op,
> > +               struct acc100_dma_req_desc *desc,
> > +               struct rte_mbuf *input, struct rte_mbuf *h_output,
> > +               uint32_t *in_offset, uint32_t *h_out_offset,
> > +               uint32_t *h_out_length,
> > +               union acc100_harq_layout_data *harq_layout)
> > +{
> > +       int next_triplet = 1; /* FCW already done */
> > +       desc->data_ptrs[next_triplet].address =
> > +                       rte_pktmbuf_iova_offset(input, *in_offset);
> > +       next_triplet++;
> > +
> > +       if (check_bit(op->ldpc_dec.op_flags,
> > +                               RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
> > +               struct rte_bbdev_op_data hi = op->ldpc_dec.harq_combined_input;
> > +               desc->data_ptrs[next_triplet].address = hi.offset;
> > +#ifndef ACC100_EXT_MEM
> > +               desc->data_ptrs[next_triplet].address =
> > +                               rte_pktmbuf_iova_offset(hi.data, hi.offset);
> > +#endif
> > +               next_triplet++;
> > +       }
> > +
> > +       desc->data_ptrs[next_triplet].address =
> > +                       rte_pktmbuf_iova_offset(h_output, *h_out_offset);
> > +       *h_out_length = desc->data_ptrs[next_triplet].blen;
> > +       next_triplet++;
> > +
> > +       if (check_bit(op->ldpc_dec.op_flags,
> > +                               RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
> > +               desc->data_ptrs[next_triplet].address =
> > +                               op->ldpc_dec.harq_combined_output.offset;
> > +               /* Adjust based on previous operation */
> > +               struct rte_bbdev_dec_op *prev_op = desc->op_addr;
> > +               op->ldpc_dec.harq_combined_output.length =
> > +                               prev_op->ldpc_dec.harq_combined_output.length;
> > +               int16_t hq_idx = op->ldpc_dec.harq_combined_output.offset /
> > +                               ACC100_HARQ_OFFSET;
> > +               int16_t prev_hq_idx =
> > +                               prev_op->ldpc_dec.harq_combined_output.offset
> > +                               / ACC100_HARQ_OFFSET;
> > +               harq_layout[hq_idx].val = harq_layout[prev_hq_idx].val;
> > +#ifndef ACC100_EXT_MEM
> > +               struct rte_bbdev_op_data ho =
> > +                               op->ldpc_dec.harq_combined_output;
> > +               desc->data_ptrs[next_triplet].address =
> > +                               rte_pktmbuf_iova_offset(ho.data, ho.offset);
> > +#endif
> > +               next_triplet++;
> > +       }
> > +
> > +       op->ldpc_dec.hard_output.length += *h_out_length;
> > +       desc->op_addr = op;
> > +}
> > +
> > +
> > +/* Enqueue a number of operations to HW and update software rings */
> > +static inline void
> > +acc100_dma_enqueue(struct acc100_queue *q, uint16_t n,
> > +               struct rte_bbdev_stats *queue_stats)
> > +{
> > +       union acc100_enqueue_reg_fmt enq_req;
> > +#ifdef RTE_BBDEV_OFFLOAD_COST
> > +       uint64_t start_time = 0;
> > +       queue_stats->acc_offload_cycles = 0;
> > +#else
> > +       RTE_SET_USED(queue_stats);
> > +#endif
> > +
> > +       enq_req.val = 0;
> > +       /* Setting offset, 100b for 256 DMA Desc */
> > +       enq_req.addr_offset = ACC100_DESC_OFFSET;
> > +
> > +       /* Split ops into batches */
> > +       do {
> > +               union acc100_dma_desc *desc;
> > +               uint16_t enq_batch_size;
> > +               uint64_t offset;
> > +               rte_iova_t req_elem_addr;
> > +
> > +               enq_batch_size = RTE_MIN(n, MAX_ENQ_BATCH_SIZE);
> > +
> > +               /* Set flag on last descriptor in a batch */
> > +               desc = q->ring_addr + ((q->sw_ring_head + enq_batch_size - 1) &
> > +                               q->sw_ring_wrap_mask);
> > +               desc->req.last_desc_in_batch = 1;
> > +
> > +               /* Calculate the 1st descriptor's address */
> > +               offset = ((q->sw_ring_head & q->sw_ring_wrap_mask) *
> > +                               sizeof(union acc100_dma_desc));
> > +               req_elem_addr = q->ring_addr_phys + offset;
> > +
> > +               /* Fill enqueue struct */
> > +               enq_req.num_elem = enq_batch_size;
> > +               /* low 6 bits are not needed */
> > +               enq_req.req_elem_addr = (uint32_t)(req_elem_addr >> 6);
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +               rte_memdump(stderr, "Req sdone", desc, sizeof(*desc));
> > +#endif
> > +               rte_bbdev_log_debug(
> > +                               "Enqueue %u reqs (phys %#"PRIx64") to reg %p",
> > +                               enq_batch_size,
> > +                               req_elem_addr,
> > +                               (void *)q->mmio_reg_enqueue);
> > +
> > +               rte_wmb();
> > +
> > +#ifdef RTE_BBDEV_OFFLOAD_COST
> > +               /* Start time measurement for enqueue function offload. */
> > +               start_time = rte_rdtsc_precise();
> > +#endif
> > +               rte_bbdev_log(DEBUG, "Debug : MMIO Enqueue");
> > +               mmio_write(q->mmio_reg_enqueue, enq_req.val);
> > +
> > +#ifdef RTE_BBDEV_OFFLOAD_COST
> > +               queue_stats->acc_offload_cycles +=
> > +                               rte_rdtsc_precise() - start_time;
> > +#endif
> > +
> > +               q->aq_enqueued++;
> > +               q->sw_ring_head += enq_batch_size;
> > +               n -= enq_batch_size;
> > +
> > +       } while (n);
> > +}
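The sw_ring_head arithmetic above (and in the enqueue helpers below) relies on masking rather than a modulo, which is only correct for a power-of-two ring. A minimal sketch of that indexing, with an illustrative ring size rather than the driver's:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative ring size; the driver derives sw_ring_wrap_mask from its
 * own (power-of-two) ring depth. */
#define RING_SIZE 256U
#define RING_WRAP_MASK (RING_SIZE - 1)

/* Slot index for the Nth descriptor past the current head, wrapping as
 * (head + n) & mask does in enqueue_ldpc_*_op_*(). */
static uint32_t ring_slot(uint32_t sw_ring_head, uint32_t enqueued)
{
	return (sw_ring_head + enqueued) & RING_WRAP_MASK;
}
```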
> > +
> > +/* Enqueue a number of encode operations for ACC100 device in CB mode */
> > +static inline int
> > +enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
> > +               uint16_t total_enqueued_cbs, int16_t num)
> > +{
> > +       union acc100_dma_desc *desc = NULL;
> > +       uint32_t out_length;
> > +       struct rte_mbuf *output_head, *output;
> > +       int i, next_triplet;
> > +       uint16_t  in_length_in_bytes;
> > +       struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
> > +
> > +       uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> > +                       & q->sw_ring_wrap_mask);
> > +       desc = q->ring_addr + desc_idx;
> > +       acc100_fcw_le_fill(ops[0], &desc->req.fcw_le, num);
> > +
> > +       /* This could be done at polling */
> > +       desc->req.word0 = ACC100_DMA_DESC_TYPE;
> > +       desc->req.word1 = 0; /**< Timestamp could be disabled */
> > +       desc->req.word2 = 0;
> > +       desc->req.word3 = 0;
> > +       desc->req.numCBs = num;
> > +
> > +       in_length_in_bytes = ops[0]->ldpc_enc.input.data->data_len;
> > +       out_length = (enc->cb_params.e + 7) >> 3;
> > +       desc->req.m2dlen = 1 + num;
> > +       desc->req.d2mlen = num;
> > +       next_triplet = 1;
> > +
> > +       for (i = 0; i < num; i++) {
> > +               desc->req.data_ptrs[next_triplet].address =
> > +                       rte_pktmbuf_iova_offset(ops[i]->ldpc_enc.input.data, 0);
> > +               desc->req.data_ptrs[next_triplet].blen = in_length_in_bytes;
> > +               next_triplet++;
> > +               desc->req.data_ptrs[next_triplet].address =
> > +                               rte_pktmbuf_iova_offset(
> > +                               ops[i]->ldpc_enc.output.data, 0);
> > +               desc->req.data_ptrs[next_triplet].blen = out_length;
> > +               next_triplet++;
> > +               ops[i]->ldpc_enc.output.length = out_length;
> > +               output_head = output = ops[i]->ldpc_enc.output.data;
> > +               mbuf_append(output_head, output, out_length);
> > +               output->data_len = out_length;
> > +       }
> > +
> > +       desc->req.op_addr = ops[0];
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +       rte_memdump(stderr, "FCW", &desc->req.fcw_le,
> > +                       sizeof(desc->req.fcw_le) - 8);
> > +       rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> > +#endif
> > +
> > +       /* Multiple CBs (ops) were successfully prepared to enqueue */
> > +       return num;
> > +}
> > +
> > +/* Enqueue one encode operation for ACC100 device in CB mode */
> > +static inline int
> > +enqueue_ldpc_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
> > +               uint16_t total_enqueued_cbs)
> > +{
> > +       union acc100_dma_desc *desc = NULL;
> > +       int ret;
> > +       uint32_t in_offset, out_offset, out_length, mbuf_total_left,
> > +               seg_total_left;
> > +       struct rte_mbuf *input, *output_head, *output;
> > +
> > +       uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> > +                       & q->sw_ring_wrap_mask);
> > +       desc = q->ring_addr + desc_idx;
> > +       acc100_fcw_le_fill(op, &desc->req.fcw_le, 1);
> > +
> > +       input = op->ldpc_enc.input.data;
> > +       output_head = output = op->ldpc_enc.output.data;
> > +       in_offset = op->ldpc_enc.input.offset;
> > +       out_offset = op->ldpc_enc.output.offset;
> > +       out_length = 0;
> > +       mbuf_total_left = op->ldpc_enc.input.length;
> > +       seg_total_left = rte_pktmbuf_data_len(op->ldpc_enc.input.data)
> > +                       - in_offset;
> > +
> > +       ret = acc100_dma_desc_le_fill(op, &desc->req, &input, output,
> > +                       &in_offset, &out_offset, &out_length, &mbuf_total_left,
> > +                       &seg_total_left);
> > +
> > +       if (unlikely(ret < 0))
> > +               return ret;
> > +
> > +       mbuf_append(output_head, output, out_length);
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +       rte_memdump(stderr, "FCW", &desc->req.fcw_le,
> > +                       sizeof(desc->req.fcw_le) - 8);
> > +       rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> > +
> > +       /* Check if any data left after processing one CB */
> > +       if (mbuf_total_left != 0) {
> > +               rte_bbdev_log(ERR,
> > +                               "Some data still left after processing one CB: mbuf_total_left = %u",
> > +                               mbuf_total_left);
> > +               return -EINVAL;
> > +       }
> > +#endif
> > +       /* One CB (one op) was successfully prepared to enqueue */
> > +       return 1;
> > +}
> > +
> > +/* Enqueue one decode operation for ACC100 device in CB mode */
> > +static inline int
> > +enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
> > +               uint16_t total_enqueued_cbs, bool same_op)
> > +{
> > +       int ret;
> > +
> > +       union acc100_dma_desc *desc;
> > +       uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> > +                       & q->sw_ring_wrap_mask);
> > +       desc = q->ring_addr + desc_idx;
> > +       struct rte_mbuf *input, *h_output_head, *h_output;
> > +       uint32_t in_offset, h_out_offset, h_out_length, mbuf_total_left;
> > +       input = op->ldpc_dec.input.data;
> > +       h_output_head = h_output = op->ldpc_dec.hard_output.data;
> > +       in_offset = op->ldpc_dec.input.offset;
> > +       h_out_offset = op->ldpc_dec.hard_output.offset;
> > +       mbuf_total_left = op->ldpc_dec.input.length;
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +       if (unlikely(input == NULL)) {
> > +               rte_bbdev_log(ERR, "Invalid mbuf pointer");
> > +               return -EFAULT;
> > +       }
> > +#endif
> > +       union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
> > +
> > +       if (same_op) {
> > +               union acc100_dma_desc *prev_desc;
> > +               desc_idx = ((q->sw_ring_head + total_enqueued_cbs - 1)
> > +                               & q->sw_ring_wrap_mask);
> > +               prev_desc = q->ring_addr + desc_idx;
> > +               uint8_t *prev_ptr = (uint8_t *) prev_desc;
> > +               uint8_t *new_ptr = (uint8_t *) desc;
> > +               /* Copy first 4 words and BDESCs */
> > +               rte_memcpy(new_ptr, prev_ptr, 16);
> > +               rte_memcpy(new_ptr + 36, prev_ptr + 36, 40);
> > +               desc->req.op_addr = prev_desc->req.op_addr;
> > +               /* Copy FCW */
> > +               rte_memcpy(new_ptr + ACC100_DESC_FCW_OFFSET,
> > +                               prev_ptr + ACC100_DESC_FCW_OFFSET,
> > +                               ACC100_FCW_LD_BLEN);
> > +               acc100_dma_desc_ld_update(op, &desc->req, input, h_output,
> > +                               &in_offset, &h_out_offset,
> > +                               &h_out_length, harq_layout);
> > +       } else {
> > +               struct acc100_fcw_ld *fcw;
> > +               uint32_t seg_total_left;
> > +               fcw = &desc->req.fcw_ld;
> > +               acc100_fcw_ld_fill(op, fcw, harq_layout);
> > +
> > +               /* Special handling when overusing mbuf */
> > +               if (fcw->rm_e < MAX_E_MBUF)
> > +                       seg_total_left = rte_pktmbuf_data_len(input)
> > +                                       - in_offset;
> > +               else
> > +                       seg_total_left = fcw->rm_e;
> > +
> > +               ret = acc100_dma_desc_ld_fill(op, &desc->req, &input, h_output,
> > +                               &in_offset, &h_out_offset,
> > +                               &h_out_length, &mbuf_total_left,
> > +                               &seg_total_left, fcw);
> > +               if (unlikely(ret < 0))
> > +                       return ret;
> > +       }
> > +
> > +       /* Hard output */
> > +       mbuf_append(h_output_head, h_output, h_out_length);
> > +#ifndef ACC100_EXT_MEM
> > +       if (op->ldpc_dec.harq_combined_output.length > 0) {
> > +               /* Push the HARQ output into host memory */
> > +               struct rte_mbuf *hq_output_head, *hq_output;
> > +               hq_output_head = op->ldpc_dec.harq_combined_output.data;
> > +               hq_output = op->ldpc_dec.harq_combined_output.data;
> > +               mbuf_append(hq_output_head, hq_output,
> > +                               op->ldpc_dec.harq_combined_output.length);
> > +       }
> > +#endif
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +       rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
> > +                       sizeof(desc->req.fcw_ld) - 8);
> > +       rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> > +#endif
> > +
> > +       /* One CB (one op) was successfully prepared to enqueue */
> > +       return 1;
> > +}
> > +
> > +
> > +/* Enqueue one decode operation for ACC100 device in TB mode */
> > +static inline int
> > +enqueue_ldpc_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
> > +               uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
> > +{
> > +       union acc100_dma_desc *desc = NULL;
> > +       int ret;
> > +       uint8_t r, c;
> > +       uint32_t in_offset, h_out_offset,
> > +               h_out_length, mbuf_total_left, seg_total_left;
> > +       struct rte_mbuf *input, *h_output_head, *h_output;
> > +       uint16_t current_enqueued_cbs = 0;
> > +
> > +       uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> > +                       & q->sw_ring_wrap_mask);
> > +       desc = q->ring_addr + desc_idx;
> > +       uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
> > +       union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
> > +       acc100_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout);
> > +
> > +       input = op->ldpc_dec.input.data;
> > +       h_output_head = h_output = op->ldpc_dec.hard_output.data;
> > +       in_offset = op->ldpc_dec.input.offset;
> > +       h_out_offset = op->ldpc_dec.hard_output.offset;
> > +       h_out_length = 0;
> > +       mbuf_total_left = op->ldpc_dec.input.length;
> > +       c = op->ldpc_dec.tb_params.c;
> > +       r = op->ldpc_dec.tb_params.r;
> > +
> > +       while (mbuf_total_left > 0 && r < c) {
> > +
> > +               seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
> > +
> > +               /* Set up DMA descriptor */
> > +               desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
> > +                               & q->sw_ring_wrap_mask);
> > +               desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
> > +               desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
> > +               ret = acc100_dma_desc_ld_fill(op, &desc->req, &input,
> > +                               h_output, &in_offset, &h_out_offset,
> > +                               &h_out_length,
> > +                               &mbuf_total_left, &seg_total_left,
> > +                               &desc->req.fcw_ld);
> > +
> > +               if (unlikely(ret < 0))
> > +                       return ret;
> > +
> > +               /* Hard output */
> > +               mbuf_append(h_output_head, h_output, h_out_length);
> > +
> > +               /* Set total number of CBs in TB */
> > +               desc->req.cbs_in_tb = cbs_in_tb;
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +               rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
> > +                               sizeof(desc->req.fcw_ld) - 8);
> > +               rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> > +#endif
> > +
> > +               if (seg_total_left == 0) {
> > +                       /* Go to the next mbuf */
> > +                       input = input->next;
> > +                       in_offset = 0;
> > +                       h_output = h_output->next;
> > +                       h_out_offset = 0;
> > +               }
> > +               total_enqueued_cbs++;
> > +               current_enqueued_cbs++;
> > +               r++;
> > +       }
> > +
> > +       if (unlikely(desc == NULL))
> > +               return current_enqueued_cbs;
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +       /* Check if any CBs left for processing */
> > +       if (mbuf_total_left != 0) {
> > +               rte_bbdev_log(ERR,
> > +                               "Some data still left for processing: mbuf_total_left = %u",
> > +                               mbuf_total_left);
> > +               return -EINVAL;
> > +       }
> > +#endif
> > +       /* Set SDone on last CB descriptor for TB mode */
> > +       desc->req.sdone_enable = 1;
> > +       desc->req.irq_enable = q->irq_enable;
> > +
> > +       return current_enqueued_cbs;
> > +}
> > +
> > +
> > +/* Calculates number of CBs in processed encoder TB based on 'r' and input
> > + * length.
> > + */
> > +static inline uint8_t
> > +get_num_cbs_in_tb_enc(struct rte_bbdev_op_turbo_enc *turbo_enc)
> > +{
> > +       uint8_t c, c_neg, r, crc24_bits = 0;
> > +       uint16_t k, k_neg, k_pos;
> > +       uint8_t cbs_in_tb = 0;
> > +       int32_t length;
> > +
> > +       length = turbo_enc->input.length;
> > +       r = turbo_enc->tb_params.r;
> > +       c = turbo_enc->tb_params.c;
> > +       c_neg = turbo_enc->tb_params.c_neg;
> > +       k_neg = turbo_enc->tb_params.k_neg;
> > +       k_pos = turbo_enc->tb_params.k_pos;
> > +       crc24_bits = 0;
> > +       if (check_bit(turbo_enc->op_flags,
> > +                       RTE_BBDEV_TURBO_CRC_24B_ATTACH))
> > +               crc24_bits = 24;
> > +       while (length > 0 && r < c) {
> > +               k = (r < c_neg) ? k_neg : k_pos;
> > +               length -= (k - crc24_bits) >> 3;
> > +               r++;
> > +               cbs_in_tb++;
> > +       }
> > +
> > +       return cbs_in_tb;
> > +}
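For reference, the CB-counting arithmetic above can be exercised stand-alone. This is an illustrative sketch, not the driver's API: each code block consumes (k - crc24_bits) / 8 bytes of TB input, where k is k_neg for the first c_neg blocks and k_pos afterwards; all names below are invented for the example.

```c
#include <stdint.h>

/* Hypothetical stand-alone version of the loop in get_num_cbs_in_tb_enc():
 * counts how many code blocks the TB input length covers. */
static inline uint8_t
count_cbs_enc_sketch(int32_t length_bytes, uint8_t r, uint8_t c,
		uint8_t c_neg, uint16_t k_neg, uint16_t k_pos,
		uint8_t crc24_bits)
{
	uint8_t cbs = 0;
	uint16_t k;

	while (length_bytes > 0 && r < c) {
		k = (r < c_neg) ? k_neg : k_pos;
		length_bytes -= (k - crc24_bits) >> 3; /* bits -> bytes */
		r++;
		cbs++;
	}
	return cbs;
}
```

With k_neg = k_pos = 6144 and a 24-bit CRC, each CB covers 765 bytes of input, so a 1530-byte TB counts as two CBs.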
> > +
> > +/* Calculates number of CBs in processed decoder TB based on 'r' and input
> > + * length.
> > + */
> > +static inline uint16_t
> > +get_num_cbs_in_tb_dec(struct rte_bbdev_op_turbo_dec *turbo_dec)
> > +{
> > +       uint8_t c, c_neg, r = 0;
> > +       uint16_t kw, k, k_neg, k_pos, cbs_in_tb = 0;
> > +       int32_t length;
> > +
> > +       length = turbo_dec->input.length;
> > +       r = turbo_dec->tb_params.r;
> > +       c = turbo_dec->tb_params.c;
> > +       c_neg = turbo_dec->tb_params.c_neg;
> > +       k_neg = turbo_dec->tb_params.k_neg;
> > +       k_pos = turbo_dec->tb_params.k_pos;
> > +       while (length > 0 && r < c) {
> > +               k = (r < c_neg) ? k_neg : k_pos;
> > +               kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
> > +               length -= kw;
> > +               r++;
> > +               cbs_in_tb++;
> > +       }
> > +
> > +       return cbs_in_tb;
> > +}
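The per-CB size consumed above is the Turbo circular-buffer length Kw = 3 * ceil32(K + 4), i.e. K information bits plus 4 tail bits, three streams, rounded up to a 32 boundary (consistent with the soft-buffer sizing of 3GPP TS 36.212). A small sketch with a local stand-in for RTE_ALIGN_CEIL:

```c
#include <stdint.h>

/* ALIGN32 is a local stand-in for RTE_ALIGN_CEIL(x, 32). */
#define ALIGN32(x) (((x) + 31u) & ~31u)

/* Illustrative recomputation of the kw term used by
 * get_num_cbs_in_tb_dec(): Kw = 3 * ceil32(K + 4). */
static inline uint32_t
turbo_dec_kw_sketch(uint16_t k)
{
	return ALIGN32((uint32_t)k + 4) * 3;
}
```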
> > +
> > +/* Calculates number of CBs in processed LDPC decoder TB based on 'r'
> > + * and input length.
> > + */
> > +static inline uint16_t
> > +get_num_cbs_in_tb_ldpc_dec(struct rte_bbdev_op_ldpc_dec *ldpc_dec)
> > +{
> > +       uint16_t r, cbs_in_tb = 0;
> > +       int32_t length = ldpc_dec->input.length;
> > +       r = ldpc_dec->tb_params.r;
> > +       while (length > 0 && r < ldpc_dec->tb_params.c) {
> > +               length -=  (r < ldpc_dec->tb_params.cab) ?
> > +                               ldpc_dec->tb_params.ea :
> > +                               ldpc_dec->tb_params.eb;
> > +               r++;
> > +               cbs_in_tb++;
> > +       }
> > +       return cbs_in_tb;
> > +}
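Same idea for LDPC: the first 'cab' code blocks carry 'ea' rate-matched bytes each and the remaining ones 'eb', until the TB input is consumed. A stand-alone sketch (names illustrative, not the driver's API):

```c
#include <stdint.h>

/* Hypothetical stand-alone version of get_num_cbs_in_tb_ldpc_dec(). */
static inline uint16_t
ldpc_dec_cbs_sketch(int32_t length, uint16_t r, uint16_t c,
		uint16_t cab, uint32_t ea, uint32_t eb)
{
	uint16_t cbs = 0;

	while (length > 0 && r < c) {
		length -= (r < cab) ? ea : eb;
		r++;
		cbs++;
	}
	return cbs;
}
```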
> > +
> > +/* Check we can mux encode operations with common FCW */
> > +static inline bool
> > +check_mux(struct rte_bbdev_enc_op **ops, uint16_t num) {
> > +       uint16_t i;
> > +       if (num == 1)
> > +               return false;
> > +       for (i = 1; i < num; ++i) {
> > +               /* Only mux compatible code blocks */
> > +               if (memcmp((uint8_t *)(&ops[i]->ldpc_enc) + ENC_OFFSET,
> > +                               (uint8_t *)(&ops[0]->ldpc_enc) + ENC_OFFSET,
> > +                               CMP_ENC_SIZE) != 0)
> > +                       return false;
> > +       }
> > +       return true;
> > +}
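The mux test above relies on comparing only the tail of the op structure past a fixed offset, so per-op fields (data pointers, opaque data) may differ while everything that feeds the FCW must match byte-for-byte. A toy illustration of the pattern — the struct layout and the TOY_OFFSET/TOY_SIZE analogues of ENC_OFFSET/CMP_ENC_SIZE are invented for the sketch:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy op: fields before 'shared_a' may differ per op; fields from
 * 'shared_a' onward must be identical for ops to share one FCW. */
struct toy_enc {
	uint32_t per_op_field;
	uint32_t shared_a;
	uint32_t shared_b;
};
#define TOY_OFFSET offsetof(struct toy_enc, shared_a)
#define TOY_SIZE   (sizeof(struct toy_enc) - TOY_OFFSET)

static bool
toy_check_mux(struct toy_enc **ops, uint16_t num)
{
	uint16_t i;

	if (num == 1)
		return false; /* a single op has nothing to mux with */
	for (i = 1; i < num; ++i)
		if (memcmp((uint8_t *)ops[i] + TOY_OFFSET,
				(uint8_t *)ops[0] + TOY_OFFSET,
				TOY_SIZE) != 0)
			return false;
	return true;
}
```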
> > +
> > +/* Enqueue LDPC encode operations for ACC100 device in CB mode */
> > +static inline uint16_t
> > +acc100_enqueue_ldpc_enc_cb(struct rte_bbdev_queue_data *q_data,
> > +               struct rte_bbdev_enc_op **ops, uint16_t num)
> > +{
> > +       struct acc100_queue *q = q_data->queue_private;
> > +       int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> > +       uint16_t i = 0;
> > +       union acc100_dma_desc *desc;
> > +       int ret, desc_idx = 0;
> > +       int16_t enq, left = num;
> > +
> > +       while (left > 0) {
> > +               if (unlikely(avail - 1 < 0))
> > +                       break;
> > +               avail--;
> > +               enq = RTE_MIN(left, MUX_5GDL_DESC);
> > +               if (check_mux(&ops[i], enq)) {
> > +                       ret = enqueue_ldpc_enc_n_op_cb(q, &ops[i],
> > +                                       desc_idx, enq);
> > +                       if (ret < 0)
> > +                               break;
> > +                       i += enq;
> > +               } else {
> > +                       ret = enqueue_ldpc_enc_one_op_cb(q, ops[i], desc_idx);
> > +                       if (ret < 0)
> > +                               break;
> > +                       i++;
> > +               }
> > +               desc_idx++;
> > +               left = num - i;
> > +       }
> > +
> > +       if (unlikely(i == 0))
> > +               return 0; /* Nothing to enqueue */
> > +
> > +       /* Set SDone in last CB in enqueued ops for CB mode */
> > +       desc = q->ring_addr + ((q->sw_ring_head + desc_idx - 1)
> > +                       & q->sw_ring_wrap_mask);
> > +       desc->req.sdone_enable = 1;
> > +       desc->req.irq_enable = q->irq_enable;
> > +
> > +       acc100_dma_enqueue(q, desc_idx, &q_data->queue_stats);
> > +
> > +       /* Update stats */
> > +       q_data->queue_stats.enqueued_count += i;
> > +       q_data->queue_stats.enqueue_err_count += num - i;
> > +
> > +       return i;
> > +}
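The `avail` computation at the top of the enqueue paths models the software ring with free-running head/tail indices that are masked only on access, so free space is simply depth + tail - head. A minimal stand-alone model (field names mirror the driver, but this is a sketch, not the real queue struct):

```c
#include <stdint.h>

/* Toy software-ring bookkeeping: head advances on enqueue,
 * tail on dequeue; both are free-running. */
struct ring_sketch {
	uint16_t sw_ring_depth;
	uint16_t sw_ring_head;	/* free-running enqueue index */
	uint16_t sw_ring_tail;	/* free-running dequeue index */
};

/* Free descriptors left in the ring. */
static inline int32_t
ring_free_sketch(const struct ring_sketch *q)
{
	return (int32_t)q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
}
```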
> > +
> > +/* Enqueue LDPC encode operations for ACC100 device. */
> > +static uint16_t
> > +acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
> > +               struct rte_bbdev_enc_op **ops, uint16_t num)
> > +{
> > +       if (unlikely(num == 0))
> > +               return 0;
> > +       return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
> > +}
> > +
> > +/* Check if two consecutive LDPC decode operations can share a common FCW */
> > +static inline bool
> > +cmp_ldpc_dec_op(struct rte_bbdev_dec_op **ops) {
> > +       /* Only mux compatible code blocks */
> > +       return memcmp((uint8_t *)(&ops[0]->ldpc_dec) + DEC_OFFSET,
> > +                       (uint8_t *)(&ops[1]->ldpc_dec) + DEC_OFFSET,
> > +                       CMP_DEC_SIZE) == 0;
> > +}
> > +
> > +
> > +/* Enqueue LDPC decode operations for ACC100 device in TB mode */
> > +static uint16_t
> > +acc100_enqueue_ldpc_dec_tb(struct rte_bbdev_queue_data *q_data,
> > +               struct rte_bbdev_dec_op **ops, uint16_t num)
> > +{
> > +       struct acc100_queue *q = q_data->queue_private;
> > +       int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> > +       uint16_t i, enqueued_cbs = 0;
> > +       uint8_t cbs_in_tb;
> > +       int ret;
> > +
> > +       for (i = 0; i < num; ++i) {
> > +               cbs_in_tb = get_num_cbs_in_tb_ldpc_dec(&ops[i]->ldpc_dec);
> > +               /* Check if there is available space for further processing */
> > +               if (unlikely(avail - cbs_in_tb < 0))
> > +                       break;
> > +               avail -= cbs_in_tb;
> > +
> > +               ret = enqueue_ldpc_dec_one_op_tb(q, ops[i],
> > +                               enqueued_cbs, cbs_in_tb);
> > +               if (ret < 0)
> > +                       break;
> > +               enqueued_cbs += ret;
> > +       }
> > +
> > +       acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
> > +
> > +       /* Update stats */
> > +       q_data->queue_stats.enqueued_count += i;
> > +       q_data->queue_stats.enqueue_err_count += num - i;
> > +       return i;
> > +}
> > +
> > +/* Enqueue LDPC decode operations for ACC100 device in CB mode */
> > +static uint16_t
> > +acc100_enqueue_ldpc_dec_cb(struct rte_bbdev_queue_data *q_data,
> > +               struct rte_bbdev_dec_op **ops, uint16_t num)
> > +{
> > +       struct acc100_queue *q = q_data->queue_private;
> > +       int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> > +       uint16_t i;
> > +       union acc100_dma_desc *desc;
> > +       int ret;
> > +       bool same_op = false;
> > +       for (i = 0; i < num; ++i) {
> > +               /* Check if there is available space for further processing */
> > +               if (unlikely(avail - 1 < 0))
> > +                       break;
> > +               avail -= 1;
> > +
> > +               if (i > 0)
> > +                       same_op = cmp_ldpc_dec_op(&ops[i-1]);
> > +               rte_bbdev_log(INFO,
> > +                       "Op %d %d %d %d %d %d %d %d %d %d %d %d\n",
> > +                       i, ops[i]->ldpc_dec.op_flags, ops[i]->ldpc_dec.rv_index,
> > +                       ops[i]->ldpc_dec.iter_max, ops[i]->ldpc_dec.iter_count,
> > +                       ops[i]->ldpc_dec.basegraph, ops[i]->ldpc_dec.z_c,
> > +                       ops[i]->ldpc_dec.n_cb, ops[i]->ldpc_dec.q_m,
> > +                       ops[i]->ldpc_dec.n_filler, ops[i]->ldpc_dec.cb_params.e,
> > +                       same_op);
> > +               ret = enqueue_ldpc_dec_one_op_cb(q, ops[i], i, same_op);
> > +               if (ret < 0)
> > +                       break;
> > +       }
> > +
> > +       if (unlikely(i == 0))
> > +               return 0; /* Nothing to enqueue */
> > +
> > +       /* Set SDone in last CB in enqueued ops for CB mode */
> > +       desc = q->ring_addr + ((q->sw_ring_head + i - 1)
> > +                       & q->sw_ring_wrap_mask);
> > +
> > +       desc->req.sdone_enable = 1;
> > +       desc->req.irq_enable = q->irq_enable;
> > +
> > +       acc100_dma_enqueue(q, i, &q_data->queue_stats);
> > +
> > +       /* Update stats */
> > +       q_data->queue_stats.enqueued_count += i;
> > +       q_data->queue_stats.enqueue_err_count += num - i;
> > +       return i;
> > +}
> > +
> > +/* Enqueue LDPC decode operations for ACC100 device. */
> > +static uint16_t
> > +acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
> > +               struct rte_bbdev_dec_op **ops, uint16_t num)
> > +{
> > +       struct acc100_queue *q = q_data->queue_private;
> > +       int32_t aq_avail = q->aq_depth +
> > +                       (q->aq_dequeued - q->aq_enqueued) / 128;
> > +
> > +       if (unlikely((aq_avail == 0) || (num == 0)))
> > +               return 0;
> > +
> > +       if (ops[0]->ldpc_dec.code_block_mode == 0)
> > +               return acc100_enqueue_ldpc_dec_tb(q_data, ops, num);
> > +       else
> > +               return acc100_enqueue_ldpc_dec_cb(q_data, ops, num);
> > +}
> > +
> > +
> > +/* Dequeue one encode operation from ACC100 device in CB mode */
> > +static inline int
> > +dequeue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
> > +               uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
> > +{
> > +       union acc100_dma_desc *desc, atom_desc;
> > +       union acc100_dma_rsp_desc rsp;
> > +       struct rte_bbdev_enc_op *op;
> > +       int i;
> > +
> > +       desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
> > +                       & q->sw_ring_wrap_mask);
> > +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> > +                       __ATOMIC_RELAXED);
> > +
> > +       /* Check fdone bit */
> > +       if (!(atom_desc.rsp.val & ACC100_FDONE))
> > +               return -1;
> > +
> > +       rsp.val = atom_desc.rsp.val;
> > +       rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
> > +
> > +       /* Dequeue */
> > +       op = desc->req.op_addr;
> > +
> > +       /* Clearing status, it will be set based on response */
> > +       op->status = 0;
> > +
> > +       op->status |= ((rsp.input_err)
> > +                       ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> > +       op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> > +       op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> > +
> > +       if (desc->req.last_desc_in_batch) {
> > +               (*aq_dequeued)++;
> > +               desc->req.last_desc_in_batch = 0;
> > +       }
> > +       desc->rsp.val = ACC100_DMA_DESC_TYPE;
> > +       desc->rsp.add_info_0 = 0; /* Reserved bits */
> > +       desc->rsp.add_info_1 = 0; /* Reserved bits */
> > +
> > +       /* Flag that the muxing causes loss of opaque data */
> > +       op->opaque_data = (void *)-1;
> > +       for (i = 0 ; i < desc->req.numCBs; i++)
> > +               ref_op[i] = op;
> > +
> > +       /* One op, covering numCBs muxed CBs, was successfully dequeued */
> > +       return desc->req.numCBs;
> > +}
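All the dequeue paths above gate on the same two completion flags polled from the descriptor response word: FDONE (this descriptor finished) and SDONE (the whole batch/TB finished). A sketch using the bit values this series adds to rte_acc100_pmd.h:

```c
#include <stdbool.h>
#include <stdint.h>

/* Completion flags in the descriptor response word
 * (values from rte_acc100_pmd.h in this patch set). */
#define ACC100_FDONE 0x80000000u
#define ACC100_SDONE 0x40000000u

/* This descriptor has completed. */
static inline bool
desc_fdone(uint32_t rsp_val)
{
	return (rsp_val & ACC100_FDONE) != 0;
}

/* The whole batch/TB ending at this descriptor has completed. */
static inline bool
desc_sdone(uint32_t rsp_val)
{
	return (rsp_val & ACC100_SDONE) != 0;
}
```

In TB mode the driver first checks FDONE on the first descriptor, then SDONE on the last descriptor of the TB before dequeuing any of it.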
> > +
> > +/* Dequeue one encode operation from ACC100 device in TB mode */
> > +static inline int
> > +dequeue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
> > +               uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
> > +{
> > +       union acc100_dma_desc *desc, *last_desc, atom_desc;
> > +       union acc100_dma_rsp_desc rsp;
> > +       struct rte_bbdev_enc_op *op;
> > +       uint8_t i = 0;
> > +       uint16_t current_dequeued_cbs = 0, cbs_in_tb;
> > +
> > +       desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
> > +                       & q->sw_ring_wrap_mask);
> > +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> > +                       __ATOMIC_RELAXED);
> > +
> > +       /* Check fdone bit */
> > +       if (!(atom_desc.rsp.val & ACC100_FDONE))
> > +               return -1;
> > +
> > +       /* Get number of CBs in dequeued TB */
> > +       cbs_in_tb = desc->req.cbs_in_tb;
> > +       /* Get last CB */
> > +       last_desc = q->ring_addr + ((q->sw_ring_tail
> > +                       + total_dequeued_cbs + cbs_in_tb - 1)
> > +                       & q->sw_ring_wrap_mask);
> > +       /* Check if last CB in TB is ready to dequeue (and thus
> > +        * the whole TB) - checking sdone bit. If not return.
> > +        */
> > +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
> > +                       __ATOMIC_RELAXED);
> > +       if (!(atom_desc.rsp.val & ACC100_SDONE))
> > +               return -1;
> > +
> > +       /* Dequeue */
> > +       op = desc->req.op_addr;
> > +
> > +       /* Clearing status, it will be set based on response */
> > +       op->status = 0;
> > +
> > +       while (i < cbs_in_tb) {
> > +               desc = q->ring_addr + ((q->sw_ring_tail
> > +                               + total_dequeued_cbs)
> > +                               & q->sw_ring_wrap_mask);
> > +               atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> > +                               __ATOMIC_RELAXED);
> > +               rsp.val = atom_desc.rsp.val;
> > +               rte_bbdev_log_debug("Resp. desc %p: %x", desc,
> > +                               rsp.val);
> > +
> > +               op->status |= ((rsp.input_err)
> > +                               ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> > +               op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> > +               op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> > +
> > +               if (desc->req.last_desc_in_batch) {
> > +                       (*aq_dequeued)++;
> > +                       desc->req.last_desc_in_batch = 0;
> > +               }
> > +               desc->rsp.val = ACC100_DMA_DESC_TYPE;
> > +               desc->rsp.add_info_0 = 0;
> > +               desc->rsp.add_info_1 = 0;
> > +               total_dequeued_cbs++;
> > +               current_dequeued_cbs++;
> > +               i++;
> > +       }
> > +
> > +       *ref_op = op;
> > +
> > +       return current_dequeued_cbs;
> > +}
> > +
> > +/* Dequeue one decode operation from ACC100 device in CB mode */
> > +static inline int
> > +dequeue_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
> > +               struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
> > +               uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> > +{
> > +       union acc100_dma_desc *desc, atom_desc;
> > +       union acc100_dma_rsp_desc rsp;
> > +       struct rte_bbdev_dec_op *op;
> > +
> > +       desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> > +                       & q->sw_ring_wrap_mask);
> > +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> > +                       __ATOMIC_RELAXED);
> > +
> > +       /* Check fdone bit */
> > +       if (!(atom_desc.rsp.val & ACC100_FDONE))
> > +               return -1;
> > +
> > +       rsp.val = atom_desc.rsp.val;
> > +       rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
> > +
> > +       /* Dequeue */
> > +       op = desc->req.op_addr;
> > +
> > +       /* Clearing status, it will be set based on response */
> > +       op->status = 0;
> > +       op->status |= ((rsp.input_err)
> > +                       ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> > +       op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> > +       op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> > +       if (op->status != 0)
> > +               q_data->queue_stats.dequeue_err_count++;
> > +
> > +       /* CRC invalid if error exists */
> > +       if (!op->status)
> > +               op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
> > +       op->turbo_dec.iter_count = (uint8_t) rsp.iter_cnt / 2;
> > +       /* Check if this is the last desc in batch (Atomic Queue) */
> > +       if (desc->req.last_desc_in_batch) {
> > +               (*aq_dequeued)++;
> > +               desc->req.last_desc_in_batch = 0;
> > +       }
> > +       desc->rsp.val = ACC100_DMA_DESC_TYPE;
> > +       desc->rsp.add_info_0 = 0;
> > +       desc->rsp.add_info_1 = 0;
> > +       *ref_op = op;
> > +
> > +       /* One CB (op) was successfully dequeued */
> > +       return 1;
> > +}
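The status composition above follows one rule worth calling out: each response error flag sets the matching rte_bbdev status bit, and the CRC result is only folded in when no other error was raised. A stand-alone sketch — the *_BIT values are local stand-ins, not the actual rte_bbdev enum values:

```c
#include <stdint.h>

/* Local stand-ins for RTE_BBDEV_DATA_ERROR, RTE_BBDEV_DRV_ERROR
 * and RTE_BBDEV_CRC_ERROR bit positions (illustrative only). */
#define DATA_ERROR_BIT 2
#define DRV_ERROR_BIT  3
#define CRC_ERROR_BIT  0

/* Sketch of the dequeue-path status-word composition. */
static inline uint32_t
dec_status_sketch(int input_err, int dma_err, int fcw_err, int crc_fail)
{
	uint32_t status = 0;

	status |= input_err ? (1u << DATA_ERROR_BIT) : 0;
	status |= dma_err ? (1u << DRV_ERROR_BIT) : 0;
	status |= fcw_err ? (1u << DRV_ERROR_BIT) : 0;
	/* CRC result is only meaningful when the op itself succeeded */
	if (status == 0)
		status |= (uint32_t)crc_fail << CRC_ERROR_BIT;
	return status;
}
```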
> > +
> > +/* Dequeue one LDPC decode operation from ACC100 device in CB mode */
> > +static inline int
> > +dequeue_ldpc_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
> > +               struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
> > +               uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> > +{
> > +       union acc100_dma_desc *desc, atom_desc;
> > +       union acc100_dma_rsp_desc rsp;
> > +       struct rte_bbdev_dec_op *op;
> > +
> > +       desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> > +                       & q->sw_ring_wrap_mask);
> > +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> > +                       __ATOMIC_RELAXED);
> > +
> > +       /* Check fdone bit */
> > +       if (!(atom_desc.rsp.val & ACC100_FDONE))
> > +               return -1;
> > +
> > +       rsp.val = atom_desc.rsp.val;
> > +
> > +       /* Dequeue */
> > +       op = desc->req.op_addr;
> > +
> > +       /* Clearing status, it will be set based on response */
> > +       op->status = 0;
> > +       op->status |= rsp.input_err << RTE_BBDEV_DATA_ERROR;
> > +       op->status |= rsp.dma_err << RTE_BBDEV_DRV_ERROR;
> > +       op->status |= rsp.fcw_err << RTE_BBDEV_DRV_ERROR;
> > +       if (op->status != 0)
> > +               q_data->queue_stats.dequeue_err_count++;
> > +
> > +       op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
> > +       if (op->ldpc_dec.hard_output.length > 0 && !rsp.synd_ok)
> > +               op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
> > +       op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
> > +
> > +       /* Check if this is the last desc in batch (Atomic Queue) */
> > +       if (desc->req.last_desc_in_batch) {
> > +               (*aq_dequeued)++;
> > +               desc->req.last_desc_in_batch = 0;
> > +       }
> > +
> > +       desc->rsp.val = ACC100_DMA_DESC_TYPE;
> > +       desc->rsp.add_info_0 = 0;
> > +       desc->rsp.add_info_1 = 0;
> > +
> > +       *ref_op = op;
> > +
> > +       /* One CB (op) was successfully dequeued */
> > +       return 1;
> > +}
> > +
> > +/* Dequeue one decode operation from ACC100 device in TB mode. */
> > +static inline int
> > +dequeue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
> > +               uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> > +{
> > +       union acc100_dma_desc *desc, *last_desc, atom_desc;
> > +       union acc100_dma_rsp_desc rsp;
> > +       struct rte_bbdev_dec_op *op;
> > +       uint8_t cbs_in_tb = 1, cb_idx = 0;
> > +
> > +       desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> > +                       & q->sw_ring_wrap_mask);
> > +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> > +                       __ATOMIC_RELAXED);
> > +
> > +       /* Check fdone bit */
> > +       if (!(atom_desc.rsp.val & ACC100_FDONE))
> > +               return -1;
> > +
> > +       /* Dequeue */
> > +       op = desc->req.op_addr;
> > +
> > +       /* Get number of CBs in dequeued TB */
> > +       cbs_in_tb = desc->req.cbs_in_tb;
> > +       /* Get last CB */
> > +       last_desc = q->ring_addr + ((q->sw_ring_tail
> > +                       + dequeued_cbs + cbs_in_tb - 1)
> > +                       & q->sw_ring_wrap_mask);
> > +       /* Check if last CB in TB is ready to dequeue (and thus
> > +        * the whole TB) - checking sdone bit. If not return.
> > +        */
> > +       atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
> > +                       __ATOMIC_RELAXED);
> > +       if (!(atom_desc.rsp.val & ACC100_SDONE))
> > +               return -1;
> > +
> > +       /* Clearing status, it will be set based on response */
> > +       op->status = 0;
> > +
> > +       /* Read remaining CBs if exists */
> > +       while (cb_idx < cbs_in_tb) {
> > +               desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> > +                               & q->sw_ring_wrap_mask);
> > +               atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> > +                               __ATOMIC_RELAXED);
> > +               rsp.val = atom_desc.rsp.val;
> > +               rte_bbdev_log_debug("Resp. desc %p: %x", desc,
> > +                               rsp.val);
> > +
> > +               op->status |= ((rsp.input_err)
> > +                               ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> > +               op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> > +               op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> > +
> > +               /* CRC invalid if error exists */
> > +               if (!op->status)
> > +                       op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
> > +               op->turbo_dec.iter_count = RTE_MAX((uint8_t) rsp.iter_cnt,
> > +                               op->turbo_dec.iter_count);
> > +
> > +               /* Check if this is the last desc in batch (Atomic Queue) */
> > +               if (desc->req.last_desc_in_batch) {
> > +                       (*aq_dequeued)++;
> > +                       desc->req.last_desc_in_batch = 0;
> > +               }
> > +               desc->rsp.val = ACC100_DMA_DESC_TYPE;
> > +               desc->rsp.add_info_0 = 0;
> > +               desc->rsp.add_info_1 = 0;
> > +               dequeued_cbs++;
> > +               cb_idx++;
> > +       }
> > +
> > +       *ref_op = op;
> > +
> > +       return cb_idx;
> > +}
> > +
> > +/* Dequeue LDPC encode operations from ACC100 device. */
> > +static uint16_t
> > +acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
> > +               struct rte_bbdev_enc_op **ops, uint16_t num)
> > +{
> > +       struct acc100_queue *q = q_data->queue_private;
> > +       uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
> > +       uint32_t aq_dequeued = 0;
> > +       uint16_t dequeue_num, i, dequeued_cbs = 0, dequeued_descs = 0;
> > +       int ret;
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +       if (unlikely(ops == NULL || q == NULL))
> > +               return 0;
> > +#endif
> > +
> > +       dequeue_num = (avail < num) ? avail : num;
> > +
> > +       for (i = 0; i < dequeue_num; i++) {
> > +               ret = dequeue_enc_one_op_cb(q, &ops[dequeued_cbs],
> > +                               dequeued_descs, &aq_dequeued);
> > +               if (ret < 0)
> > +                       break;
> > +               dequeued_cbs += ret;
> > +               dequeued_descs++;
> > +               if (dequeued_cbs >= num)
> > +                       break;
> > +       }
> > +
> > +       q->aq_dequeued += aq_dequeued;
> > +       q->sw_ring_tail += dequeued_descs;
> > +
> > +       /* Update dequeue stats */
> > +       q_data->queue_stats.dequeued_count += dequeued_cbs;
> > +
> > +       return dequeued_cbs;
> > +}
> > +
> > +/* Dequeue LDPC decode operations from ACC100 device. */
> > +static uint16_t
> > +acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
> > +               struct rte_bbdev_dec_op **ops, uint16_t num)
> > +{
> > +       struct acc100_queue *q = q_data->queue_private;
> > +       uint16_t dequeue_num;
> > +       uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
> > +       uint32_t aq_dequeued = 0;
> > +       uint16_t i;
> > +       uint16_t dequeued_cbs = 0;
> > +       struct rte_bbdev_dec_op *op;
> > +       int ret;
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +       if (unlikely(ops == NULL || q == NULL))
> > +               return 0;
> > +#endif
> > +
> > +       dequeue_num = (avail < num) ? avail : num;
> > +
> > +       for (i = 0; i < dequeue_num; ++i) {
> > +               op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> > +                       & q->sw_ring_wrap_mask))->req.op_addr;
> > +               if (op->ldpc_dec.code_block_mode == 0)
> > +                       ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
> > +                                       &aq_dequeued);
> > +               else
> > +                       ret = dequeue_ldpc_dec_one_op_cb(
> > +                                       q_data, q, &ops[i], dequeued_cbs,
> > +                                       &aq_dequeued);
> > +
> > +               if (ret < 0)
> > +                       break;
> > +               dequeued_cbs += ret;
> > +       }
> > +
> > +       q->aq_dequeued += aq_dequeued;
> > +       q->sw_ring_tail += dequeued_cbs;
> > +
> > +       /* Update dequeue stats */
> > +       q_data->queue_stats.dequeued_count += i;
> > +
> > +       return i;
> > +}
> > +
> >  /* Initialization Function */
> >  static void
> >  acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
> > @@ -703,6 +2321,10 @@
> >          struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
> >
> >          dev->dev_ops = &acc100_bbdev_ops;
> > +       dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
> > +       dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
> > +       dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
> > +       dev->dequeue_ldpc_dec_ops = acc100_dequeue_ldpc_dec;
> >
> >          ((struct acc100_device *) dev->data->dev_private)->pf_device =
> >                          !strcmp(drv->driver.name,
> > @@ -815,4 +2437,3 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
> >  RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
> >  RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
> >  RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
> > -
> > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
> > index 0e2b79c..78686c1 100644
> > --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> > +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> > @@ -88,6 +88,8 @@
> >  #define TMPL_PRI_3      0x0f0e0d0c
> >  #define QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
> >  #define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
> > +#define ACC100_FDONE    0x80000000
> > +#define ACC100_SDONE    0x40000000
> >
> >  #define ACC100_NUM_TMPL  32
> >  #define VF_OFFSET_QOS 16 /* offset in Memory Space specific to QoS Mon */
> > @@ -398,6 +400,7 @@ struct __rte_packed acc100_dma_req_desc {
> >  union acc100_dma_desc {
> >          struct acc100_dma_req_desc req;
> >          union acc100_dma_rsp_desc rsp;
> > +       uint64_t atom_hdr;
> >  };
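The new atom_hdr member overlays the first 8 bytes of the descriptor, so the dequeue paths can fetch the response word with a single atomic 64-bit load (the `__atomic_load_n((uint64_t *)desc, ...)` calls above) instead of two 32-bit reads. A toy version of the aliasing — the layout below is illustrative, not the real descriptor:

```c
#include <stdint.h>

/* Toy analogue of union acc100_dma_desc: the rsp view and the
 * 64-bit atom_hdr alias the same leading bytes of the descriptor. */
union toy_desc {
	struct {
		uint32_t rsp_val;	/* completion/response word */
		uint32_t add_info;
	} rsp;
	uint64_t atom_hdr;		/* single-load alias of the above */
};
```

The union guarantees both views have the same address, which is what makes the one-shot atomic snapshot of the response word valid.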
> >
> >
> > --
> > 1.8.3.1

^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions
  2020-09-03  2:34       ` Xu, Rosen
@ 2020-09-03  9:09         ` Ananyev, Konstantin
  2020-09-03 20:45           ` Chautru, Nicolas
  0 siblings, 1 reply; 213+ messages in thread
From: Ananyev, Konstantin @ 2020-09-03  9:09 UTC (permalink / raw)
  To: Xu, Rosen, Chautru, Nicolas, dev, akhil.goyal; +Cc: Richardson, Bruce



> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Xu, Rosen
> Sent: Thursday, September 3, 2020 3:34 AM
> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org; akhil.goyal@nxp.com
> Cc: Richardson, Bruce <bruce.richardson@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions
> 
> Hi,
> 
> > -----Original Message-----
> > From: Chautru, Nicolas <nicolas.chautru@intel.com>
> > Sent: Sunday, August 30, 2020 2:01
> > To: Xu, Rosen <rosen.xu@intel.com>; dev@dpdk.org; akhil.goyal@nxp.com
> > Cc: Richardson, Bruce <bruce.richardson@intel.com>
> > Subject: RE: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC
> > processing functions
> >
> > Hi Rosen,
> >
> > > From: Xu, Rosen <rosen.xu@intel.com>
> > >
> > > Hi,
> > >
> > > > -----Original Message-----
> > > > From: dev <dev-bounces@dpdk.org> On Behalf Of Nicolas Chautru
> > > > Sent: Wednesday, August 19, 2020 8:25
> > > > To: dev@dpdk.org; akhil.goyal@nxp.com
> > > > Cc: Richardson, Bruce <bruce.richardson@intel.com>; Chautru, Nicolas
> > > > <nicolas.chautru@intel.com>
> > > > Subject: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC
> > > > processing functions
> > > >
> > > > Adding LDPC decode and encode processing operations
> > > >
> > > > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > > > ---
> > > >  drivers/baseband/acc100/rte_acc100_pmd.c | 1625
> > > > +++++++++++++++++++++++++++++-
> > > >  drivers/baseband/acc100/rte_acc100_pmd.h |    3 +
> > > >  2 files changed, 1626 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> > > > b/drivers/baseband/acc100/rte_acc100_pmd.c
> > > > index 7a21c57..5f32813 100644
> > > > --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> > > > +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> > > > @@ -15,6 +15,9 @@
> > > >  #include <rte_hexdump.h>
> > > >  #include <rte_pci.h>
> > > >  #include <rte_bus_pci.h>
> > > > +#ifdef RTE_BBDEV_OFFLOAD_COST
> > > > +#include <rte_cycles.h>
> > > > +#endif
> > > >
> > > >  #include <rte_bbdev.h>
> > > >  #include <rte_bbdev_pmd.h>
> > > > @@ -449,7 +452,6 @@
> > > >  	return 0;
> > > >  }
> > > >
> > > > -
> > > >  /**
> > > >   * Report a ACC100 queue index which is free
> > > >   * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
> > > >   */
> > > > @@ -634,6 +636,46 @@
> > > >  	struct acc100_device *d = dev->data->dev_private;
> > > >
> > > >  	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
> > > > +		{
> > > > +			.type   = RTE_BBDEV_OP_LDPC_ENC,
> > > > +			.cap.ldpc_enc = {
> > > > +				.capability_flags =
> > > > +					RTE_BBDEV_LDPC_RATE_MATCH |
> > > > +					RTE_BBDEV_LDPC_CRC_24B_ATTACH |
> > > > +					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
> > > > +				.num_buffers_src =
> > > > +					RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > > > +				.num_buffers_dst =
> > > > +					RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > > > +			}
> > > > +		},
> > > > +		{
> > > > +			.type   = RTE_BBDEV_OP_LDPC_DEC,
> > > > +			.cap.ldpc_dec = {
> > > > +			.capability_flags =
> > > > +				RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK |
> > > > +				RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP |
> > > > +				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
> > > > +				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
> > > > +#ifdef ACC100_EXT_MEM
> > > > +				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
> > > > +				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
> > > > +#endif
> > > > +				RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE |
> > > > +				RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS |
> > > > +				RTE_BBDEV_LDPC_DECODE_BYPASS |
> > > > +				RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
> > > > +				RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
> > > > +				RTE_BBDEV_LDPC_LLR_COMPRESSION,
> > > > +			.llr_size = 8,
> > > > +			.llr_decimals = 1,
> > > > +			.num_buffers_src =
> > > > +				RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > > > +			.num_buffers_hard_out =
> > > > +				RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > > > +			.num_buffers_soft_out = 0,
> > > > +			}
> > > > +		},
> > > >  		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
> > > >  	};
> > > >
> > > > @@ -669,9 +711,14 @@
> > > >  	dev_info->cpu_flag_reqs = NULL;
> > > >  	dev_info->min_alignment = 64;
> > > >  	dev_info->capabilities = bbdev_capabilities;
> > > > +#ifdef ACC100_EXT_MEM
> > > >  	dev_info->harq_buffer_size = d->ddr_size;
> > > > +#else
> > > > +	dev_info->harq_buffer_size = 0;
> > > > +#endif
> > > >  }
> > > >
> > > > +
> > > >  static const struct rte_bbdev_ops acc100_bbdev_ops = {
> > > >  	.setup_queues = acc100_setup_queues,
> > > >  	.close = acc100_dev_close,
> > > > @@ -696,6 +743,1577 @@
> > > >  	{.device_id = 0},
> > > >  };
> > > >
> > > > +/* Read flag value 0/1 from bitmap */
> > > > +static inline bool
> > > > +check_bit(uint32_t bitmap, uint32_t bitmask)
> > > > +{
> > > > +	return bitmap & bitmask;
> > > > +}
> > > > +
> > > > +static inline char *
> > > > +mbuf_append(struct rte_mbuf *m_head, struct rte_mbuf *m, uint16_t len)
> > > > +{
> > > > +	if (unlikely(len > rte_pktmbuf_tailroom(m)))
> > > > +		return NULL;
> > > > +
> > > > +	char *tail = (char *)m->buf_addr + m->data_off + m->data_len;
> > > > +	m->data_len = (uint16_t)(m->data_len + len);
> > > > +	m_head->pkt_len  = (m_head->pkt_len + len);
> > > > +	return tail;
> > > > +}
> > >
> > > Is it reasonable to direct add data_len of rte_mbuf?
> > >
> >
> > Do you suggest adding directly without checking that there is enough room
> > in the mbuf? We cannot rely on the application providing an mbuf with
> > enough tailroom.
> 
> What I meant is that these mbuf changes should move into librte_mbuf.
> And it's better to align with Olivier Matz.

There is already rte_pktmbuf_append() inside rte_mbuf.h.
Wouldn't it suit?

> 
> > In case you ask about the 2 mbufs, this is because this function is also used
> > to support segmented memory made of multiple mbuf segments.
> > Note that this function is also used in other existing bbdev PMDs. In case you
> > believe there is a better way to do this, we can certainly discuss and change
> > these in several PMDs through another series.
> >
> > Thanks for all the reviews and useful comments.
> > Nic

^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v3 02/11] baseband/acc100: add register definition file
  2020-09-03  2:15       ` Xu, Rosen
@ 2020-09-03  9:17         ` Ferruh Yigit
  0 siblings, 0 replies; 213+ messages in thread
From: Ferruh Yigit @ 2020-09-03  9:17 UTC (permalink / raw)
  To: Xu, Rosen, Chautru, Nicolas, dev, akhil.goyal; +Cc: Richardson, Bruce

On 9/3/2020 3:15 AM, Xu, Rosen wrote:
> Hi,
> 
>> -----Original Message-----
>> From: Chautru, Nicolas <nicolas.chautru@intel.com>
>> Sent: Sunday, August 30, 2020 1:40
>> To: Xu, Rosen <rosen.xu@intel.com>; dev@dpdk.org; akhil.goyal@nxp.com
>> Cc: Richardson, Bruce <bruce.richardson@intel.com>
>> Subject: RE: [dpdk-dev] [PATCH v3 02/11] baseband/acc100: add register
>> definition file
>>
>> Hi Rosen,
>>
>>> From: Xu, Rosen <rosen.xu@intel.com>
>>>
>>> Hi,
>>>
>>>> -----Original Message-----
>>>> From: dev <dev-bounces@dpdk.org> On Behalf Of Nicolas Chautru
>>>> Sent: Wednesday, August 19, 2020 8:25
>>>> To: dev@dpdk.org; akhil.goyal@nxp.com
>>>> Cc: Richardson, Bruce <bruce.richardson@intel.com>; Chautru, Nicolas
>>>> <nicolas.chautru@intel.com>
>>>> Subject: [dpdk-dev] [PATCH v3 02/11] baseband/acc100: add register
>>>> definition file
>>>>
>>>> Add in the list of registers for the device and related
>>>> HW specs definitions.
>>>>
>>>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>

<...>

>>>> @@ -0,0 +1,1068 @@
>>>> +/* SPDX-License-Identifier: BSD-3-Clause
>>>> + * Copyright(c) 2017 Intel Corporation
>>>> + */
>>>> +
>>>> +#ifndef ACC100_PF_ENUM_H
>>>> +#define ACC100_PF_ENUM_H
>>>> +
>>>> +/*
>>>> + * ACC100 Register mapping on PF BAR0
>>>> + * This is automatically generated from RDL, format may change with
>> new
>>>> RDL
>>>> + * Release.
>>>> + * Variable names are as is
>>>> + */
>>>> +enum {
>>>> +	HWPfQmgrEgressQueuesTemplate          =  0x0007FE00,
>>>> +	HWPfQmgrIngressAq                     =  0x00080000,
>>>> +	HWPfQmgrArbQAvail                     =  0x00A00010,

<...>

>>>> +	HwPfPcieGpexPexPioArcacheControl      =  0x00D9C304,
>>>> +	HwPfPcieGpexPabObSizeControlVc0       =  0x00D9C310
>>>> +};
>>>
>>> Why not macro definition but enum?
>>>
>>
>> Well both would "work". The main reason really is that this long enum is
>> automatically generated from RDL output from the chip design.
>> But even so I would argue the enum is cleaner, as it groups all these
>> incremental addresses together.
>> This also helps when debugging, as the names are kept post-compilation as
>> both value and enum variable.
>> Any concern or any BKM from other PMDs?
> 
> Can you read the DPDK coding style first?
> https://doc.dpdk.org/guides-16.11/contributing/coding_style.html
> It doesn't make sense to define HW addresses this way.

Both works as Nicolas said, and I agree enum is better for the reasons Nicolas
mentioned.

Also coding style says:
"
Wherever possible, enums and inline functions should be preferred to macros,
since they provide additional degrees of type-safety and can allow compilers to
emit extra warnings about unsafe code.
"

What is the concern to have enum instead of macro?

^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v3 10/11] baseband/acc100: add configure function
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 10/11] baseband/acc100: add configure function Nicolas Chautru
@ 2020-09-03 10:06   ` Aidan Goddard
  2020-09-03 18:53     ` Chautru, Nicolas
  0 siblings, 1 reply; 213+ messages in thread
From: Aidan Goddard @ 2020-09-03 10:06 UTC (permalink / raw)
  To: Nicolas Chautru, dev, akhil.goyal; +Cc: bruce.richardson, Dave Burley

Hi Nic, 

Does the ACC100-specific configuration code have to go into test_bbdev_perf.c? Would it be better to avoid having this device-specific code in test-bbdev, or is there a good reason for doing so?

Thanks,

Aidan Goddard


From: dev <dev-bounces@dpdk.org> on behalf of Nicolas Chautru <nicolas.chautru@intel.com>
Sent: 19 August 2020 01:25
To: dev@dpdk.org <dev@dpdk.org>; akhil.goyal@nxp.com <akhil.goyal@nxp.com>
Cc: bruce.richardson@intel.com <bruce.richardson@intel.com>; Nicolas Chautru <nicolas.chautru@intel.com>
Subject: [dpdk-dev] [PATCH v3 10/11] baseband/acc100: add configure function 
 
Add a configure function to configure the PF from within
bbdev-test itself, without requiring an external application
to configure the device.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 app/test-bbdev/test_bbdev_perf.c                   |  72 +++
 drivers/baseband/acc100/Makefile                   |   3 +
 drivers/baseband/acc100/meson.build                |   2 +
 drivers/baseband/acc100/rte_acc100_cfg.h           |  17 +
 drivers/baseband/acc100/rte_acc100_pmd.c           | 505 +++++++++++++++++++++
 .../acc100/rte_pmd_bbdev_acc100_version.map        |   7 +
 6 files changed, 606 insertions(+)

diff --git a/app/test-bbdev/test_bbdev_perf.c b/app/test-bbdev/test_bbdev_perf.c
index 45c0d62..32f23ff 100644
--- a/app/test-bbdev/test_bbdev_perf.c
+++ b/app/test-bbdev/test_bbdev_perf.c
@@ -52,6 +52,18 @@
 #define FLR_5G_TIMEOUT 610
 #endif
 
+#ifdef RTE_LIBRTE_PMD_BBDEV_ACC100
+#include <rte_acc100_cfg.h>
+#define ACC100PF_DRIVER_NAME   ("intel_acc100_pf")
+#define ACC100VF_DRIVER_NAME   ("intel_acc100_vf")
+#define ACC100_QMGR_NUM_AQS 16
+#define ACC100_QMGR_NUM_QGS 2
+#define ACC100_QMGR_AQ_DEPTH 5
+#define ACC100_QMGR_INVALID_IDX -1
+#define ACC100_QMGR_RR 1
+#define ACC100_QOS_GBR 0
+#endif
+
 #define OPS_CACHE_SIZE 256U
 #define OPS_POOL_SIZE_MIN 511U /* 0.5K per queue */
 
@@ -653,6 +665,66 @@ typedef int (test_case_function)(struct active_device *ad,
                                 info->dev_name);
         }
 #endif
+#ifdef RTE_LIBRTE_PMD_BBDEV_ACC100
+       if ((get_init_device() == true) &&
+               (!strcmp(info->drv.driver_name, ACC100PF_DRIVER_NAME))) {
+               struct acc100_conf conf;
+               unsigned int i;
+
+               printf("Configure ACC100 FEC Driver %s with default values\n",
+                               info->drv.driver_name);
+
+               /* clear default configuration before initialization */
+               memset(&conf, 0, sizeof(struct acc100_conf));
+
+               /* Always set in PF mode for built-in configuration */
+               conf.pf_mode_en = true;
+               for (i = 0; i < RTE_ACC100_NUM_VFS; ++i) {
+                       conf.arb_dl_4g[i].gbr_threshold1 = ACC100_QOS_GBR;
+                       conf.arb_dl_4g[i].gbr_threshold2 = ACC100_QOS_GBR;
+                       conf.arb_dl_4g[i].round_robin_weight = ACC100_QMGR_RR;
+                       conf.arb_ul_4g[i].gbr_threshold1 = ACC100_QOS_GBR;
+                       conf.arb_ul_4g[i].gbr_threshold2 = ACC100_QOS_GBR;
+                       conf.arb_ul_4g[i].round_robin_weight = ACC100_QMGR_RR;
+                       conf.arb_dl_5g[i].gbr_threshold1 = ACC100_QOS_GBR;
+                       conf.arb_dl_5g[i].gbr_threshold2 = ACC100_QOS_GBR;
+                       conf.arb_dl_5g[i].round_robin_weight = ACC100_QMGR_RR;
+                       conf.arb_ul_5g[i].gbr_threshold1 = ACC100_QOS_GBR;
+                       conf.arb_ul_5g[i].gbr_threshold2 = ACC100_QOS_GBR;
+                       conf.arb_ul_5g[i].round_robin_weight = ACC100_QMGR_RR;
+               }
+
+               conf.input_pos_llr_1_bit = true;
+               conf.output_pos_llr_1_bit = true;
+               conf.num_vf_bundles = 1; /**< Number of VF bundles to setup */
+
+               conf.q_ul_4g.num_qgroups = ACC100_QMGR_NUM_QGS;
+               conf.q_ul_4g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+               conf.q_ul_4g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+               conf.q_ul_4g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+               conf.q_dl_4g.num_qgroups = ACC100_QMGR_NUM_QGS;
+               conf.q_dl_4g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+               conf.q_dl_4g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+               conf.q_dl_4g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+               conf.q_ul_5g.num_qgroups = ACC100_QMGR_NUM_QGS;
+               conf.q_ul_5g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+               conf.q_ul_5g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+               conf.q_ul_5g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+               conf.q_dl_5g.num_qgroups = ACC100_QMGR_NUM_QGS;
+               conf.q_dl_5g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+               conf.q_dl_5g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+               conf.q_dl_5g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+
+               /* setup PF with configuration information */
+               ret = acc100_configure(info->dev_name, &conf);
+               TEST_ASSERT_SUCCESS(ret,
+                               "Failed to configure ACC100 PF for bbdev %s",
+                               info->dev_name);
+               /* Let's refresh this now that this is configured */
+       }
+       rte_bbdev_info_get(dev_id, info);
+#endif
+
         nb_queues = RTE_MIN(rte_lcore_count(), info->drv.max_num_queues);
         nb_queues = RTE_MIN(nb_queues, (unsigned int) MAX_QUEUES);
 
diff --git a/drivers/baseband/acc100/Makefile b/drivers/baseband/acc100/Makefile
index c79e487..37e73af 100644
--- a/drivers/baseband/acc100/Makefile
+++ b/drivers/baseband/acc100/Makefile
@@ -22,4 +22,7 @@ LIBABIVER := 1
 # library source files
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100) += rte_acc100_pmd.c
 
+# export include files
+SYMLINK-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100)-include += rte_acc100_cfg.h
+
 include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/baseband/acc100/meson.build b/drivers/baseband/acc100/meson.build
index 8afafc2..7ac44dc 100644
--- a/drivers/baseband/acc100/meson.build
+++ b/drivers/baseband/acc100/meson.build
@@ -4,3 +4,5 @@
 deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci']
 
 sources = files('rte_acc100_pmd.c')
+
+install_headers('rte_acc100_cfg.h')
diff --git a/drivers/baseband/acc100/rte_acc100_cfg.h b/drivers/baseband/acc100/rte_acc100_cfg.h
index 73bbe36..7f523bc 100644
--- a/drivers/baseband/acc100/rte_acc100_cfg.h
+++ b/drivers/baseband/acc100/rte_acc100_cfg.h
@@ -89,6 +89,23 @@ struct acc100_conf {
         struct rte_arbitration_t arb_dl_5g[RTE_ACC100_NUM_VFS];
 };
 
+/**
+ * Configure an ACC100 device
+ *
+ * @param dev_name
+ *   The name of the device. This is the short form of PCI BDF, e.g. 00:01.0.
+ *   It can also be retrieved for a bbdev device from the dev_name field in the
+ *   rte_bbdev_info structure returned by rte_bbdev_info_get().
+ * @param conf
+ *   Configuration to apply to ACC100 HW.
+ *
+ * @return
+ *   Zero on success, negative value on failure.
+ */
+__rte_experimental
+int
+acc100_configure(const char *dev_name, struct acc100_conf *conf);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index dc14079..43f664b 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -85,6 +85,26 @@
 
 enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC};
 
+/* Return the accelerator enum for a Queue Group Index */
+static inline int
+accFromQgid(int qg_idx, const struct acc100_conf *acc100_conf)
+{
+       int accQg[ACC100_NUM_QGRPS];
+       int NumQGroupsPerFn[NUM_ACC];
+       int acc, qgIdx, qgIndex = 0;
+       for (qgIdx = 0; qgIdx < ACC100_NUM_QGRPS; qgIdx++)
+               accQg[qgIdx] = 0;
+       NumQGroupsPerFn[UL_4G] = acc100_conf->q_ul_4g.num_qgroups;
+       NumQGroupsPerFn[UL_5G] = acc100_conf->q_ul_5g.num_qgroups;
+       NumQGroupsPerFn[DL_4G] = acc100_conf->q_dl_4g.num_qgroups;
+       NumQGroupsPerFn[DL_5G] = acc100_conf->q_dl_5g.num_qgroups;
+       for (acc = UL_4G;  acc < NUM_ACC; acc++)
+               for (qgIdx = 0; qgIdx < NumQGroupsPerFn[acc]; qgIdx++)
+                       accQg[qgIndex++] = acc;
+       acc = accQg[qg_idx];
+       return acc;
+}
+
 /* Return the queue topology for a Queue Group Index */
 static inline void
 qtopFromAcc(struct rte_q_topology_t **qtop, int acc_enum,
@@ -113,6 +133,30 @@
         *qtop = p_qtop;
 }
 
+/* Return the AQ depth for a Queue Group Index */
+static inline int
+aqDepth(int qg_idx, struct acc100_conf *acc100_conf)
+{
+       struct rte_q_topology_t *q_top = NULL;
+       int acc_enum = accFromQgid(qg_idx, acc100_conf);
+       qtopFromAcc(&q_top, acc_enum, acc100_conf);
+       if (unlikely(q_top == NULL))
+               return 0;
+       return q_top->aq_depth_log2;
+}
+
+/* Return the number of AQs for a Queue Group Index */
+static inline int
+aqNum(int qg_idx, struct acc100_conf *acc100_conf)
+{
+       struct rte_q_topology_t *q_top = NULL;
+       int acc_enum = accFromQgid(qg_idx, acc100_conf);
+       qtopFromAcc(&q_top, acc_enum, acc100_conf);
+       if (unlikely(q_top == NULL))
+               return 0;
+       return q_top->num_aqs_per_groups;
+}
+
 static void
 initQTop(struct acc100_conf *acc100_conf)
 {
@@ -4177,3 +4221,464 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
 RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
 RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
 RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
+
+/*
+ * Implementation to fix the power-on status of some 5GUL engines.
+ * This requires DMA permission if ported outside of DPDK.
+ */
+static void
+poweron_cleanup(struct rte_bbdev *bbdev, struct acc100_device *d,
+               struct acc100_conf *conf)
+{
+       int i, template_idx, qg_idx;
+       uint32_t address, status, payload;
+       printf("Need to clear power-on 5GUL status in internal memory\n");
+       /* Reset LDPC Cores */
+       for (i = 0; i < ACC100_ENGINES_MAX; i++)
+               acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+                               ACC100_ENGINE_OFFSET * i, ACC100_RESET_HI);
+       usleep(LONG_WAIT);
+       for (i = 0; i < ACC100_ENGINES_MAX; i++)
+               acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+                               ACC100_ENGINE_OFFSET * i, ACC100_RESET_LO);
+       usleep(LONG_WAIT);
+       /* Prepare dummy workload */
+       alloc_2x64mb_sw_rings_mem(bbdev, d, 0);
+       /* Set base addresses */
+       uint32_t phys_high = (uint32_t)(d->sw_rings_phys >> 32);
+       uint32_t phys_low  = (uint32_t)(d->sw_rings_phys &
+                       ~(ACC100_SIZE_64MBYTE-1));
+       acc100_reg_write(d, HWPfDmaFec5GulDescBaseHiRegVf, phys_high);
+       acc100_reg_write(d, HWPfDmaFec5GulDescBaseLoRegVf, phys_low);
+
+       /* Descriptor for a dummy 5GUL code block processing*/
+       union acc100_dma_desc *desc = NULL;
+       desc = d->sw_rings;
+       desc->req.data_ptrs[0].address = d->sw_rings_phys +
+                       ACC100_DESC_FCW_OFFSET;
+       desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
+       desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
+       desc->req.data_ptrs[0].last = 0;
+       desc->req.data_ptrs[0].dma_ext = 0;
+       desc->req.data_ptrs[1].address = d->sw_rings_phys + 512;
+       desc->req.data_ptrs[1].blkid = ACC100_DMA_BLKID_IN;
+       desc->req.data_ptrs[1].last = 1;
+       desc->req.data_ptrs[1].dma_ext = 0;
+       desc->req.data_ptrs[1].blen = 44;
+       desc->req.data_ptrs[2].address = d->sw_rings_phys + 1024;
+       desc->req.data_ptrs[2].blkid = ACC100_DMA_BLKID_OUT_ENC;
+       desc->req.data_ptrs[2].last = 1;
+       desc->req.data_ptrs[2].dma_ext = 0;
+       desc->req.data_ptrs[2].blen = 5;
+       /* Dummy FCW */
+       desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
+       desc->req.fcw_ld.qm = 1;
+       desc->req.fcw_ld.nfiller = 30;
+       desc->req.fcw_ld.BG = 2 - 1;
+       desc->req.fcw_ld.Zc = 7;
+       desc->req.fcw_ld.ncb = 350;
+       desc->req.fcw_ld.rm_e = 4;
+       desc->req.fcw_ld.itmax = 10;
+       desc->req.fcw_ld.gain_i = 1;
+       desc->req.fcw_ld.gain_h = 1;
+
+       int engines_to_restart[SIG_UL_5G_LAST + 1] = {0};
+       int num_failed_engine = 0;
+       /* Detect engines in undefined state */
+       for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
+                       template_idx++) {
+               /* Check engine power-on status */
+               address = HwPfFecUl5gIbDebugReg +
+                               ACC100_ENGINE_OFFSET * template_idx;
+               status = (acc100_reg_read(d, address) >> 4) & 0xF;
+               if (status == 0) {
+                       engines_to_restart[num_failed_engine] = template_idx;
+                       num_failed_engine++;
+               }
+       }
+
+       int numQqsAcc = conf->q_ul_5g.num_qgroups;
+       int numQgs = conf->q_ul_5g.num_qgroups;
+       payload = 0;
+       for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+               payload |= (1 << qg_idx);
+       /* Force each engine which is in unspecified state */
+       for (i = 0; i < num_failed_engine; i++) {
+               int failed_engine = engines_to_restart[i];
+               printf("Force engine %d\n", failed_engine);
+               for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
+                               template_idx++) {
+                       address = HWPfQmgrGrpTmplateReg4Indx
+                                       + BYTES_IN_WORD * template_idx;
+                       if (template_idx == failed_engine)
+                               acc100_reg_write(d, address, payload);
+                       else
+                               acc100_reg_write(d, address, 0);
+               }
+               /* Reset descriptor header */
+               desc->req.word0 = ACC100_DMA_DESC_TYPE;
+               desc->req.word1 = 0;
+               desc->req.word2 = 0;
+               desc->req.word3 = 0;
+               desc->req.numCBs = 1;
+               desc->req.m2dlen = 2;
+               desc->req.d2mlen = 1;
+               /* Enqueue the code block for processing */
+               union acc100_enqueue_reg_fmt enq_req;
+               enq_req.val = 0;
+               enq_req.addr_offset = ACC100_DESC_OFFSET;
+               enq_req.num_elem = 1;
+               enq_req.req_elem_addr = 0;
+               rte_wmb();
+               acc100_reg_write(d, HWPfQmgrIngressAq + 0x100, enq_req.val);
+               usleep(LONG_WAIT * 100);
+               if (desc->req.word0 != 2)
+                       printf("DMA Response %#"PRIx32"\n", desc->req.word0);
+       }
+
+       /* Reset LDPC Cores */
+       for (i = 0; i < ACC100_ENGINES_MAX; i++)
+               acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+                               ACC100_ENGINE_OFFSET * i, ACC100_RESET_HI);
+       usleep(LONG_WAIT);
+       for (i = 0; i < ACC100_ENGINES_MAX; i++)
+               acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+                               ACC100_ENGINE_OFFSET * i, ACC100_RESET_LO);
+       usleep(LONG_WAIT);
+       acc100_reg_write(d, HWPfHi5GHardResetReg, ACC100_RESET_HARD);
+       usleep(LONG_WAIT);
+       int numEngines = 0;
+       /* Check engine power-on status again */
+       for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
+                       template_idx++) {
+               address = HwPfFecUl5gIbDebugReg +
+                               ACC100_ENGINE_OFFSET * template_idx;
+               status = (acc100_reg_read(d, address) >> 4) & 0xF;
+               address = HWPfQmgrGrpTmplateReg4Indx
+                               + BYTES_IN_WORD * template_idx;
+               if (status == 1) {
+                       acc100_reg_write(d, address, payload);
+                       numEngines++;
+               } else
+                       acc100_reg_write(d, address, 0);
+       }
+       printf("Number of 5GUL engines %d\n", numEngines);
+
+       if (d->sw_rings_base != NULL)
+               rte_free(d->sw_rings_base);
+       usleep(LONG_WAIT);
+}
+
+/* Initial configuration of an ACC100 device prior to running configure() */
+int
+acc100_configure(const char *dev_name, struct acc100_conf *conf)
+{
+       rte_bbdev_log(INFO, "acc100_configure");
+       uint32_t payload, address, status;
+       int qg_idx, template_idx, vf_idx, acc, i;
+       struct rte_bbdev *bbdev = rte_bbdev_get_named_dev(dev_name);
+
+       /* Compile time checks */
+       RTE_BUILD_BUG_ON(sizeof(struct acc100_dma_req_desc) != 256);
+       RTE_BUILD_BUG_ON(sizeof(union acc100_dma_desc) != 256);
+       RTE_BUILD_BUG_ON(sizeof(struct acc100_fcw_td) != 24);
+       RTE_BUILD_BUG_ON(sizeof(struct acc100_fcw_te) != 32);
+
+       if (bbdev == NULL) {
+               rte_bbdev_log(ERR,
+               "Invalid dev_name (%s), or device is not yet initialised",
+               dev_name);
+               return -ENODEV;
+       }
+       struct acc100_device *d = bbdev->data->dev_private;
+
+       /* Store configuration */
+       rte_memcpy(&d->acc100_conf, conf, sizeof(d->acc100_conf));
+
+       /* PCIe Bridge configuration */
+       acc100_reg_write(d, HwPfPcieGpexBridgeControl, ACC100_CFG_PCI_BRIDGE);
+       for (i = 1; i < 17; i++)
+               acc100_reg_write(d,
+                               HwPfPcieGpexAxiAddrMappingWindowPexBaseHigh
+                               + i * 16, 0);
+
+       /* PCIe Link Training and Status State Machine */
+       acc100_reg_write(d, HwPfPcieGpexLtssmStateCntrl, 0xDFC00000);
+
+       /* Prevent blocking AXI read on BRESP for AXI Write */
+       address = HwPfPcieGpexAxiPioControl;
+       payload = ACC100_CFG_PCI_AXI;
+       acc100_reg_write(d, address, payload);
+
+       /* 5GDL PLL phase shift */
+       acc100_reg_write(d, HWPfChaDl5gPllPhshft0, 0x1);
+
+       /* Explicitly releasing AXI as this may be stopped after PF FLR/BME */
+       address = HWPfDmaAxiControl;
+       payload = 1;
+       acc100_reg_write(d, address, payload);
+
+       /* DDR Configuration */
+       address = HWPfDdrBcTim6;
+       payload = acc100_reg_read(d, address);
+       payload &= 0xFFFFFFFB; /* Bit 2 */
+#ifdef ACC100_DDR_ECC_ENABLE
+       payload |= 0x4;
+#endif
+       acc100_reg_write(d, address, payload);
+       address = HWPfDdrPhyDqsCountNum;
+#ifdef ACC100_DDR_ECC_ENABLE
+       payload = 9;
+#else
+       payload = 8;
+#endif
+       acc100_reg_write(d, address, payload);
+
+       /* Set default descriptor signature */
+       address = HWPfDmaDescriptorSignatuture;
+       payload = 0;
+       acc100_reg_write(d, address, payload);
+
+       /* Enable the Error Detection in DMA */
+       payload = ACC100_CFG_DMA_ERROR;
+       address = HWPfDmaErrorDetectionEn;
+       acc100_reg_write(d, address, payload);
+
+       /* AXI Cache configuration */
+       payload = ACC100_CFG_AXI_CACHE;
+       address = HWPfDmaAxcacheReg;
+       acc100_reg_write(d, address, payload);
+
+       /* Default DMA Configuration (Qmgr Enabled) */
+       address = HWPfDmaConfig0Reg;
+       payload = 0;
+       acc100_reg_write(d, address, payload);
+       address = HWPfDmaQmanen;
+       payload = 0;
+       acc100_reg_write(d, address, payload);
+
+       /* Default RLIM/ALEN configuration */
+       address = HWPfDmaConfig1Reg;
+       payload = (1 << 31) + (23 << 8) + (1 << 6) + 7;
+       acc100_reg_write(d, address, payload);
+
+       /* Configure DMA Qmanager addresses */
+       address = HWPfDmaQmgrAddrReg;
+       payload = HWPfQmgrEgressQueuesTemplate;
+       acc100_reg_write(d, address, payload);
+
+       /* ===== Qmgr Configuration ===== */
+       /* Configuration of the AQueue Depth QMGR_GRP_0_DEPTH_LOG2 for UL */
+       int totalQgs = conf->q_ul_4g.num_qgroups +
+                       conf->q_ul_5g.num_qgroups +
+                       conf->q_dl_4g.num_qgroups +
+                       conf->q_dl_5g.num_qgroups;
+       for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
+               address = HWPfQmgrDepthLog2Grp +
+               BYTES_IN_WORD * qg_idx;
+               payload = aqDepth(qg_idx, conf);
+               acc100_reg_write(d, address, payload);
+               address = HWPfQmgrTholdGrp +
+               BYTES_IN_WORD * qg_idx;
+               payload = (1 << 16) + (1 << (aqDepth(qg_idx, conf) - 1));
+               acc100_reg_write(d, address, payload);
+       }
+
+       /* Template Priority in incremental order */
+       for (template_idx = 0; template_idx < ACC100_NUM_TMPL;
+                       template_idx++) {
+               address = HWPfQmgrGrpTmplateReg0Indx +
+               BYTES_IN_WORD * (template_idx % 8);
+               payload = TMPL_PRI_0;
+               acc100_reg_write(d, address, payload);
+               address = HWPfQmgrGrpTmplateReg1Indx +
+               BYTES_IN_WORD * (template_idx % 8);
+               payload = TMPL_PRI_1;
+               acc100_reg_write(d, address, payload);
+               address = HWPfQmgrGrpTmplateReg2indx +
+               BYTES_IN_WORD * (template_idx % 8);
+               payload = TMPL_PRI_2;
+               acc100_reg_write(d, address, payload);
+               address = HWPfQmgrGrpTmplateReg3Indx +
+               BYTES_IN_WORD * (template_idx % 8);
+               payload = TMPL_PRI_3;
+               acc100_reg_write(d, address, payload);
+       }
+
+       address = HWPfQmgrGrpPriority;
+       payload = ACC100_CFG_QMGR_HI_P;
+       acc100_reg_write(d, address, payload);
+
+       /* Template Configuration */
+       for (template_idx = 0; template_idx < ACC100_NUM_TMPL; template_idx++) {
+               payload = 0;
+               address = HWPfQmgrGrpTmplateReg4Indx
+                               + BYTES_IN_WORD * template_idx;
+               acc100_reg_write(d, address, payload);
+       }
+       /* 4GUL */
+       int numQgs = conf->q_ul_4g.num_qgroups;
+       int numQqsAcc = 0;
+       payload = 0;
+       for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+               payload |= (1 << qg_idx);
+       for (template_idx = SIG_UL_4G; template_idx <= SIG_UL_4G_LAST;
+                       template_idx++) {
+               address = HWPfQmgrGrpTmplateReg4Indx
+                               + BYTES_IN_WORD*template_idx;
+               acc100_reg_write(d, address, payload);
+       }
+       /* 5GUL */
+       numQqsAcc += numQgs;
+       numQgs  = conf->q_ul_5g.num_qgroups;
+       payload = 0;
+       int numEngines = 0;
+       for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+               payload |= (1 << qg_idx);
+       for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
+                       template_idx++) {
+               /* Check engine power-on status */
+               address = HwPfFecUl5gIbDebugReg +
+                               ACC100_ENGINE_OFFSET * template_idx;
+               status = (acc100_reg_read(d, address) >> 4) & 0xF;
+               address = HWPfQmgrGrpTmplateReg4Indx
+                               + BYTES_IN_WORD * template_idx;
+               if (status == 1) {
+                       acc100_reg_write(d, address, payload);
+                       numEngines++;
+               } else
+                       acc100_reg_write(d, address, 0);
+               #if RTE_ACC100_SINGLE_FEC == 1
+               payload = 0;
+               #endif
+       }
+       printf("Number of 5GUL engines %d\n", numEngines);
+       /* 4GDL */
+       numQqsAcc += numQgs;
+       numQgs  = conf->q_dl_4g.num_qgroups;
+       payload = 0;
+       for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+               payload |= (1 << qg_idx);
+       for (template_idx = SIG_DL_4G; template_idx <= SIG_DL_4G_LAST;
+                       template_idx++) {
+               address = HWPfQmgrGrpTmplateReg4Indx
+                               + BYTES_IN_WORD*template_idx;
+               acc100_reg_write(d, address, payload);
+               #if RTE_ACC100_SINGLE_FEC == 1
+                       payload = 0;
+               #endif
+       }
+       /* 5GDL */
+       numQqsAcc += numQgs;
+       numQgs  = conf->q_dl_5g.num_qgroups;
+       payload = 0;
+       for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+               payload |= (1 << qg_idx);
+       for (template_idx = SIG_DL_5G; template_idx <= SIG_DL_5G_LAST;
+                       template_idx++) {
+               address = HWPfQmgrGrpTmplateReg4Indx
+                               + BYTES_IN_WORD*template_idx;
+               acc100_reg_write(d, address, payload);
+               #if RTE_ACC100_SINGLE_FEC == 1
+               payload = 0;
+               #endif
+       }
+
+       /* Queue Group Function mapping */
+       int qman_func_id[5] = {0, 2, 1, 3, 4};
+       address = HWPfQmgrGrpFunction0;
+       payload = 0;
+       for (qg_idx = 0; qg_idx < 8; qg_idx++) {
+               acc = accFromQgid(qg_idx, conf);
+               payload |= qman_func_id[acc]<<(qg_idx * 4);
+       }
+       acc100_reg_write(d, address, payload);
+
+       /* Configuration of the Arbitration QGroup depth to 1 */
+       for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
+               address = HWPfQmgrArbQDepthGrp +
+               BYTES_IN_WORD * qg_idx;
+               payload = 0;
+               acc100_reg_write(d, address, payload);
+       }
+
+       /* Enabling AQueues through the Queue hierarchy */
+       for (vf_idx = 0; vf_idx < ACC100_NUM_VFS; vf_idx++) {
+               for (qg_idx = 0; qg_idx < ACC100_NUM_QGRPS; qg_idx++) {
+                       payload = 0;
+                       if (vf_idx < conf->num_vf_bundles &&
+                                       qg_idx < totalQgs)
+                               payload = (1 << aqNum(qg_idx, conf)) - 1;
+                       address = HWPfQmgrAqEnableVf
+                                       + vf_idx * BYTES_IN_WORD;
+                       payload += (qg_idx << 16);
+                       acc100_reg_write(d, address, payload);
+               }
+       }
+
+       /* This pointer to ARAM (256kB) is shifted by 2 (4B per register) */
+       uint32_t aram_address = 0;
+       for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
+               for (vf_idx = 0; vf_idx < conf->num_vf_bundles; vf_idx++) {
+                       address = HWPfQmgrVfBaseAddr + vf_idx
+                                       * BYTES_IN_WORD + qg_idx
+                                       * BYTES_IN_WORD * 64;
+                       payload = aram_address;
+                       acc100_reg_write(d, address, payload);
+                       /* Offset ARAM Address for next memory bank
+                        * - increment of 4B
+                        */
+                       aram_address += aqNum(qg_idx, conf) *
+                                       (1 << aqDepth(qg_idx, conf));
+               }
+       }
+
+       if (aram_address > WORDS_IN_ARAM_SIZE) {
+               rte_bbdev_log(ERR, "ARAM Configuration not fitting %d %d\n",
+                               aram_address, WORDS_IN_ARAM_SIZE);
+               return -EINVAL;
+       }
+
+       /* ==== HI Configuration ==== */
+
+       /* Prevent Block on Transmit Error */
+       address = HWPfHiBlockTransmitOnErrorEn;
+       payload = 0;
+       acc100_reg_write(d, address, payload);
+       /* Prevent MSI from being dropped */
+       address = HWPfHiMsiDropEnableReg;
+       payload = 0;
+       acc100_reg_write(d, address, payload);
+       /* Set the PF Mode register */
+       address = HWPfHiPfMode;
+       payload = (conf->pf_mode_en) ? 2 : 0;
+       acc100_reg_write(d, address, payload);
+       /* Enable Error Detection in HW */
+       address = HWPfDmaErrorDetectionEn;
+       payload = 0x3D7;
+       acc100_reg_write(d, address, payload);
+
+       /* QoS overflow init */
+       payload = 1;
+       address = HWPfQosmonAEvalOverflow0;
+       acc100_reg_write(d, address, payload);
+       address = HWPfQosmonBEvalOverflow0;
+       acc100_reg_write(d, address, payload);
+
+       /* HARQ DDR Configuration */
+       unsigned int ddrSizeInMb = 512; /* Fixed to 512 MB per VF for now */
+       for (vf_idx = 0; vf_idx < conf->num_vf_bundles; vf_idx++) {
+               address = HWPfDmaVfDdrBaseRw + vf_idx
+                               * 0x10;
+               payload = ((vf_idx * (ddrSizeInMb / 64)) << 16) +
+                               (ddrSizeInMb - 1);
+               acc100_reg_write(d, address, payload);
+       }
+       usleep(LONG_WAIT);
+
+       if (numEngines < (SIG_UL_5G_LAST + 1))
+               poweron_cleanup(bbdev, d, conf);
+
+       rte_bbdev_log_debug("PF Tip configuration complete for %s", dev_name);
+       return 0;
+}
diff --git a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
index 4a76d1d..91c234d 100644
--- a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
+++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
@@ -1,3 +1,10 @@
 DPDK_21 {
         local: *;
 };
+
+EXPERIMENTAL {
+       global:
+
+       acc100_configure;
+
+};
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v3 10/11] baseband/acc100: add configure function
  2020-09-03 10:06   ` Aidan Goddard
@ 2020-09-03 18:53     ` Chautru, Nicolas
  0 siblings, 0 replies; 213+ messages in thread
From: Chautru, Nicolas @ 2020-09-03 18:53 UTC (permalink / raw)
  To: Aidan Goddard, dev, akhil.goyal; +Cc: Richardson, Bruce, Dave Burley

Hi Aidan, 

> From: Aidan Goddard <aidan.goddard@accelercomm.com>
> 
> Hi Nic,
> 
> Does the ACC100-specific configuration code have to go into
> test_bbdev_perf.c? Would it be better to avoid having this device specific code
> in test-bbdev or is there a good reason for doing so?

The test-bbdev application provides the option to initialize the HW device when running from the PF (the -i command-line option), as this HW initialization is required to run any test on the HW (it has to be done at least once after boot-up or HW reset).
The actual implementation of the related function is within each of the BBDEV PMD directories, as you can see below.
The patch purely consists of:
 - defining a default input config structure for the test
 - calling acc100_configure(info->dev_name, &conf); (the actual implementation is not within bbdev-test).

This is done the exact same way for the other existing BBDEV PMDs and HW variants, so it is kept fairly seamless, self-contained and scalable.
Once configured, all HW devices can be run with the exact same test cases (some cases may be skipped when an HW variant doesn't support a given capability).
I expect your BBDEV PMD to follow the same principle, but let me know if you have any concern or suggestion. If your device somehow boots up self-configured, then nothing would need to be called.

Thanks
Nic

> 
> Thanks,
> 
> Aidan Goddard
> 
> 
> From: dev <dev-bounces@dpdk.org> on behalf of Nicolas Chautru
> <nicolas.chautru@intel.com>
> Sent: 19 August 2020 01:25
> To: dev@dpdk.org <dev@dpdk.org>; akhil.goyal@nxp.com
> <akhil.goyal@nxp.com>
> Cc: bruce.richardson@intel.com <bruce.richardson@intel.com>; Nicolas
> Chautru <nicolas.chautru@intel.com>
> Subject: [dpdk-dev] [PATCH v3 10/11] baseband/acc100: add configure function
> 
> Add a configure function to configure the PF from within the bbdev-test
> itself, without requiring an external application to configure the device.
> 
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
>  app/test-bbdev/test_bbdev_perf.c                   |  72 +++
>  drivers/baseband/acc100/Makefile                   |   3 +
>  drivers/baseband/acc100/meson.build                |   2 +
>  drivers/baseband/acc100/rte_acc100_cfg.h           |  17 +
>  drivers/baseband/acc100/rte_acc100_pmd.c           | 505
> +++++++++++++++++++++
>  .../acc100/rte_pmd_bbdev_acc100_version.map        |   7 +
>  6 files changed, 606 insertions(+)
> 
> diff --git a/app/test-bbdev/test_bbdev_perf.c b/app/test-
> bbdev/test_bbdev_perf.c
> index 45c0d62..32f23ff 100644
> --- a/app/test-bbdev/test_bbdev_perf.c
> +++ b/app/test-bbdev/test_bbdev_perf.c
> @@ -52,6 +52,18 @@
>  #define FLR_5G_TIMEOUT 610
>  #endif
> 
> +#ifdef RTE_LIBRTE_PMD_BBDEV_ACC100
> +#include <rte_acc100_cfg.h>
> +#define ACC100PF_DRIVER_NAME   ("intel_acc100_pf")
> +#define ACC100VF_DRIVER_NAME   ("intel_acc100_vf")
> +#define ACC100_QMGR_NUM_AQS 16
> +#define ACC100_QMGR_NUM_QGS 2
> +#define ACC100_QMGR_AQ_DEPTH 5
> +#define ACC100_QMGR_INVALID_IDX -1
> +#define ACC100_QMGR_RR 1
> +#define ACC100_QOS_GBR 0
> +#endif
> +
>  #define OPS_CACHE_SIZE 256U
>  #define OPS_POOL_SIZE_MIN 511U /* 0.5K per queue */
> 
> @@ -653,6 +665,66 @@ typedef int (test_case_function)(struct active_device
> *ad,
>                                  info->dev_name);
>          }
>  #endif
> +#ifdef RTE_LIBRTE_PMD_BBDEV_ACC100
> +       if ((get_init_device() == true) &&
> +               (!strcmp(info->drv.driver_name, ACC100PF_DRIVER_NAME))) {
> +               struct acc100_conf conf;
> +               unsigned int i;
> +
> +               printf("Configure ACC100 FEC Driver %s with default values\n",
> +                               info->drv.driver_name);
> +
> +               /* clear default configuration before initialization */
> +               memset(&conf, 0, sizeof(struct acc100_conf));
> +
> +               /* Always set in PF mode for built-in configuration */
> +               conf.pf_mode_en = true;
> +               for (i = 0; i < RTE_ACC100_NUM_VFS; ++i) {
> +                       conf.arb_dl_4g[i].gbr_threshold1 = ACC100_QOS_GBR;
> +                       conf.arb_dl_4g[i].gbr_threshold2 = ACC100_QOS_GBR;
> +                       conf.arb_dl_4g[i].round_robin_weight = ACC100_QMGR_RR;
> +                       conf.arb_ul_4g[i].gbr_threshold1 = ACC100_QOS_GBR;
> +                       conf.arb_ul_4g[i].gbr_threshold2 = ACC100_QOS_GBR;
> +                       conf.arb_ul_4g[i].round_robin_weight = ACC100_QMGR_RR;
> +                       conf.arb_dl_5g[i].gbr_threshold1 = ACC100_QOS_GBR;
> +                       conf.arb_dl_5g[i].gbr_threshold2 = ACC100_QOS_GBR;
> +                       conf.arb_dl_5g[i].round_robin_weight = ACC100_QMGR_RR;
> +                       conf.arb_ul_5g[i].gbr_threshold1 = ACC100_QOS_GBR;
> +                       conf.arb_ul_5g[i].gbr_threshold2 = ACC100_QOS_GBR;
> +                       conf.arb_ul_5g[i].round_robin_weight = ACC100_QMGR_RR;
> +               }
> +
> +               conf.input_pos_llr_1_bit = true;
> +               conf.output_pos_llr_1_bit = true;
> +               conf.num_vf_bundles = 1; /* Number of VF bundles to setup */
> +
> +               conf.q_ul_4g.num_qgroups = ACC100_QMGR_NUM_QGS;
> +               conf.q_ul_4g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
> +               conf.q_ul_4g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
> +               conf.q_ul_4g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
> +               conf.q_dl_4g.num_qgroups = ACC100_QMGR_NUM_QGS;
> +               conf.q_dl_4g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
> +               conf.q_dl_4g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
> +               conf.q_dl_4g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
> +               conf.q_ul_5g.num_qgroups = ACC100_QMGR_NUM_QGS;
> +               conf.q_ul_5g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
> +               conf.q_ul_5g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
> +               conf.q_ul_5g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
> +               conf.q_dl_5g.num_qgroups = ACC100_QMGR_NUM_QGS;
> +               conf.q_dl_5g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
> +               conf.q_dl_5g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
> +               conf.q_dl_5g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
> +
> +               /* setup PF with configuration information */
> +               ret = acc100_configure(info->dev_name, &conf);
> +               TEST_ASSERT_SUCCESS(ret,
> +                               "Failed to configure ACC100 PF for bbdev
> +%s",
> +                               info->dev_name);
> +               /* Let's refresh this now this is configured */
> +       }
> +       rte_bbdev_info_get(dev_id, info); #endif
> +
>          nb_queues = RTE_MIN(rte_lcore_count(), info->drv.max_num_queues);
>          nb_queues = RTE_MIN(nb_queues, (unsigned int) MAX_QUEUES);
> 
> diff --git a/drivers/baseband/acc100/Makefile
> b/drivers/baseband/acc100/Makefile
> index c79e487..37e73af 100644
> --- a/drivers/baseband/acc100/Makefile
> +++ b/drivers/baseband/acc100/Makefile
> @@ -22,4 +22,7 @@ LIBABIVER := 1
>  # library source files
>  SRCS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100) += rte_acc100_pmd.c
> 
> +# export include files
> +SYMLINK-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100)-include +=
> +rte_acc100_cfg.h
> +
>  include $(RTE_SDK)/mk/rte.lib.mk
> diff --git a/drivers/baseband/acc100/meson.build
> b/drivers/baseband/acc100/meson.build
> index 8afafc2..7ac44dc 100644
> --- a/drivers/baseband/acc100/meson.build
> +++ b/drivers/baseband/acc100/meson.build
> @@ -4,3 +4,5 @@
>  deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci']
> 
>  sources = files('rte_acc100_pmd.c')
> +
> +install_headers('rte_acc100_cfg.h')
> diff --git a/drivers/baseband/acc100/rte_acc100_cfg.h
> b/drivers/baseband/acc100/rte_acc100_cfg.h
> index 73bbe36..7f523bc 100644
> --- a/drivers/baseband/acc100/rte_acc100_cfg.h
> +++ b/drivers/baseband/acc100/rte_acc100_cfg.h
> @@ -89,6 +89,23 @@ struct acc100_conf {
>          struct rte_arbitration_t arb_dl_5g[RTE_ACC100_NUM_VFS];
>  };
> 
> +/**
> + * Configure a ACC100 device
> + *
> + * @param dev_name
> + *   The name of the device. This is the short form of PCI BDF, e.g. 00:01.0.
> + *   It can also be retrieved for a bbdev device from the dev_name
> +field in the
> + *   rte_bbdev_info structure returned by rte_bbdev_info_get().
> + * @param conf
> + *   Configuration to apply to ACC100 HW.
> + *
> + * @return
> + *   Zero on success, negative value on failure.
> + */
> +__rte_experimental
> +int
> +acc100_configure(const char *dev_name, struct acc100_conf *conf);
> +
>  #ifdef __cplusplus
>  }
>  #endif
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> b/drivers/baseband/acc100/rte_acc100_pmd.c
> index dc14079..43f664b 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> @@ -85,6 +85,26 @@
> 
>  enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC};
> 
> +/* Return the accelerator enum for a Queue Group Index */
> +static inline int
> +accFromQgid(int qg_idx, const struct acc100_conf *acc100_conf)
> +{
> +       int accQg[ACC100_NUM_QGRPS];
> +       int NumQGroupsPerFn[NUM_ACC];
> +       int acc, qgIdx, qgIndex = 0;
> +       for (qgIdx = 0; qgIdx < ACC100_NUM_QGRPS; qgIdx++)
> +               accQg[qgIdx] = 0;
> +       NumQGroupsPerFn[UL_4G] = acc100_conf->q_ul_4g.num_qgroups;
> +       NumQGroupsPerFn[UL_5G] = acc100_conf->q_ul_5g.num_qgroups;
> +       NumQGroupsPerFn[DL_4G] = acc100_conf->q_dl_4g.num_qgroups;
> +       NumQGroupsPerFn[DL_5G] = acc100_conf->q_dl_5g.num_qgroups;
> +       for (acc = UL_4G;  acc < NUM_ACC; acc++)
> +               for (qgIdx = 0; qgIdx < NumQGroupsPerFn[acc]; qgIdx++)
> +                       accQg[qgIndex++] = acc;
> +       acc = accQg[qg_idx];
> +       return acc;
> +}
> +
>  /* Return the queue topology for a Queue Group Index */
>  static inline void
>  qtopFromAcc(struct rte_q_topology_t **qtop, int acc_enum,
> @@ -113,6 +133,30 @@
>          *qtop = p_qtop;
>  }
> 
> +/* Return the AQ depth for a Queue Group Index */
> +static inline int
> +aqDepth(int qg_idx, struct acc100_conf *acc100_conf)
> +{
> +       struct rte_q_topology_t *q_top = NULL;
> +       int acc_enum = accFromQgid(qg_idx, acc100_conf);
> +       qtopFromAcc(&q_top, acc_enum, acc100_conf);
> +       if (unlikely(q_top == NULL))
> +               return 0;
> +       return q_top->aq_depth_log2;
> +}
> +
> +/* Return the number of AQs per group for a Queue Group Index */
> +static inline int
> +aqNum(int qg_idx, struct acc100_conf *acc100_conf)
> +{
> +       struct rte_q_topology_t *q_top = NULL;
> +       int acc_enum = accFromQgid(qg_idx, acc100_conf);
> +       qtopFromAcc(&q_top, acc_enum, acc100_conf);
> +       if (unlikely(q_top == NULL))
> +               return 0;
> +       return q_top->num_aqs_per_groups;
> +}
> +
>  static void
>  initQTop(struct acc100_conf *acc100_conf)
>  {
> @@ -4177,3 +4221,464 @@ static int acc100_pci_remove(struct
> rte_pci_device *pci_dev)
>  RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME,
> pci_id_acc100_pf_map);
>  RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
>  RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME,
> pci_id_acc100_vf_map);
> +
> +/*
> + * Implementation to fix the power on status of some 5GUL engines
> + * This requires DMA permission if ported outside DPDK.
> + */
> +static void
> +poweron_cleanup(struct rte_bbdev *bbdev, struct acc100_device *d,
> +               struct acc100_conf *conf)
> +{
> +       int i, template_idx, qg_idx;
> +       uint32_t address, status, payload;
> +       printf("Need to clear power-on 5GUL status in internal memory\n");
> +       /* Reset LDPC Cores */
> +       for (i = 0; i < ACC100_ENGINES_MAX; i++)
> +               acc100_reg_write(d, HWPfFecUl5gCntrlReg +
> +                               ACC100_ENGINE_OFFSET * i,
> +ACC100_RESET_HI);
> +       usleep(LONG_WAIT);
> +       for (i = 0; i < ACC100_ENGINES_MAX; i++)
> +               acc100_reg_write(d, HWPfFecUl5gCntrlReg +
> +                               ACC100_ENGINE_OFFSET * i,
> +ACC100_RESET_LO);
> +       usleep(LONG_WAIT);
> +       /* Prepare dummy workload */
> +       alloc_2x64mb_sw_rings_mem(bbdev, d, 0);
> +       /* Set base addresses */
> +       uint32_t phys_high = (uint32_t)(d->sw_rings_phys >> 32);
> +       uint32_t phys_low  = (uint32_t)(d->sw_rings_phys &
> +                       ~(ACC100_SIZE_64MBYTE-1));
> +       acc100_reg_write(d, HWPfDmaFec5GulDescBaseHiRegVf, phys_high);
> +       acc100_reg_write(d, HWPfDmaFec5GulDescBaseLoRegVf, phys_low);
> +
> +       /* Descriptor for a dummy 5GUL code block processing*/
> +       union acc100_dma_desc *desc = NULL;
> +       desc = d->sw_rings;
> +       desc->req.data_ptrs[0].address = d->sw_rings_phys +
> +                       ACC100_DESC_FCW_OFFSET;
> +       desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
> +       desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
> +       desc->req.data_ptrs[0].last = 0;
> +       desc->req.data_ptrs[0].dma_ext = 0;
> +       desc->req.data_ptrs[1].address = d->sw_rings_phys + 512;
> +       desc->req.data_ptrs[1].blkid = ACC100_DMA_BLKID_IN;
> +       desc->req.data_ptrs[1].last = 1;
> +       desc->req.data_ptrs[1].dma_ext = 0;
> +       desc->req.data_ptrs[1].blen = 44;
> +       desc->req.data_ptrs[2].address = d->sw_rings_phys + 1024;
> +       desc->req.data_ptrs[2].blkid = ACC100_DMA_BLKID_OUT_ENC;
> +       desc->req.data_ptrs[2].last = 1;
> +       desc->req.data_ptrs[2].dma_ext = 0;
> +       desc->req.data_ptrs[2].blen = 5;
> +       /* Dummy FCW */
> +       desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
> +       desc->req.fcw_ld.qm = 1;
> +       desc->req.fcw_ld.nfiller = 30;
> +       desc->req.fcw_ld.BG = 2 - 1;
> +       desc->req.fcw_ld.Zc = 7;
> +       desc->req.fcw_ld.ncb = 350;
> +       desc->req.fcw_ld.rm_e = 4;
> +       desc->req.fcw_ld.itmax = 10;
> +       desc->req.fcw_ld.gain_i = 1;
> +       desc->req.fcw_ld.gain_h = 1;
> +
> +       int engines_to_restart[SIG_UL_5G_LAST + 1] = {0};
> +       int num_failed_engine = 0;
> +       /* Detect engines in undefined state */
> +       for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
> +                       template_idx++) {
> +               /* Check engine power-on status */
> +               address = HwPfFecUl5gIbDebugReg +
> +                               ACC100_ENGINE_OFFSET * template_idx;
> +               status = (acc100_reg_read(d, address) >> 4) & 0xF;
> +               if (status == 0) {
> +                       engines_to_restart[num_failed_engine] =
> +template_idx;
> +                       num_failed_engine++;
> +               }
> +       }
> +
> +       int numQqsAcc = conf->q_ul_4g.num_qgroups;
> +       int numQgs = conf->q_ul_5g.num_qgroups;
> +       payload = 0;
> +       for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc);
> +qg_idx++)
> +               payload |= (1 << qg_idx);
> +       /* Force each engine which is in unspecified state */
> +       for (i = 0; i < num_failed_engine; i++) {
> +               int failed_engine = engines_to_restart[i];
> +               printf("Force engine %d\n", failed_engine);
> +               for (template_idx = SIG_UL_5G; template_idx <=
> +SIG_UL_5G_LAST;
> +                               template_idx++) {
> +                       address = HWPfQmgrGrpTmplateReg4Indx
> +                                       + BYTES_IN_WORD * template_idx;
> +                       if (template_idx == failed_engine)
> +                               acc100_reg_write(d, address, payload);
> +                       else
> +                               acc100_reg_write(d, address, 0);
> +               }
> +               /* Reset descriptor header */
> +               desc->req.word0 = ACC100_DMA_DESC_TYPE;
> +               desc->req.word1 = 0;
> +               desc->req.word2 = 0;
> +               desc->req.word3 = 0;
> +               desc->req.numCBs = 1;
> +               desc->req.m2dlen = 2;
> +               desc->req.d2mlen = 1;
> +               /* Enqueue the code block for processing */
> +               union acc100_enqueue_reg_fmt enq_req;
> +               enq_req.val = 0;
> +               enq_req.addr_offset = ACC100_DESC_OFFSET;
> +               enq_req.num_elem = 1;
> +               enq_req.req_elem_addr = 0;
> +               rte_wmb();
> +               acc100_reg_write(d, HWPfQmgrIngressAq + 0x100,
> +enq_req.val);
> +               usleep(LONG_WAIT * 100);
> +               if (desc->req.word0 != 2)
> +                       printf("DMA Response %#"PRIx32"\n",
> +desc->req.word0);
> +       }
> +
> +       /* Reset LDPC Cores */
> +       for (i = 0; i < ACC100_ENGINES_MAX; i++)
> +               acc100_reg_write(d, HWPfFecUl5gCntrlReg +
> +                               ACC100_ENGINE_OFFSET * i,
> +ACC100_RESET_HI);
> +       usleep(LONG_WAIT);
> +       for (i = 0; i < ACC100_ENGINES_MAX; i++)
> +               acc100_reg_write(d, HWPfFecUl5gCntrlReg +
> +                               ACC100_ENGINE_OFFSET * i,
> +ACC100_RESET_LO);
> +       usleep(LONG_WAIT);
> +       acc100_reg_write(d, HWPfHi5GHardResetReg, ACC100_RESET_HARD);
> +       usleep(LONG_WAIT);
> +       int numEngines = 0;
> +       /* Check engine power-on status again */
> +       for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
> +                       template_idx++) {
> +               address = HwPfFecUl5gIbDebugReg +
> +                               ACC100_ENGINE_OFFSET * template_idx;
> +               status = (acc100_reg_read(d, address) >> 4) & 0xF;
> +               address = HWPfQmgrGrpTmplateReg4Indx
> +                               + BYTES_IN_WORD * template_idx;
> +               if (status == 1) {
> +                       acc100_reg_write(d, address, payload);
> +                       numEngines++;
> +               } else
> +                       acc100_reg_write(d, address, 0);
> +       }
> +       printf("Number of 5GUL engines %d\n", numEngines);
> +
> +       if (d->sw_rings_base != NULL)
> +               rte_free(d->sw_rings_base);
> +       usleep(LONG_WAIT);
> +}
> +
> +/* Initial configuration of an ACC100 device prior to running configure() */
> +int
> +acc100_configure(const char *dev_name, struct acc100_conf *conf)
> +{
> +       rte_bbdev_log(INFO, "acc100_configure");
> +       uint32_t payload, address, status;
> +       int qg_idx, template_idx, vf_idx, acc, i;
> +       struct rte_bbdev *bbdev = rte_bbdev_get_named_dev(dev_name);
> +
> +       /* Compile time checks */
> +       RTE_BUILD_BUG_ON(sizeof(struct acc100_dma_req_desc) != 256);
> +       RTE_BUILD_BUG_ON(sizeof(union acc100_dma_desc) != 256);
> +       RTE_BUILD_BUG_ON(sizeof(struct acc100_fcw_td) != 24);
> +       RTE_BUILD_BUG_ON(sizeof(struct acc100_fcw_te) != 32);
> +
> +       if (bbdev == NULL) {
> +               rte_bbdev_log(ERR,
> +               "Invalid dev_name (%s), or device is not yet initialised",
> +               dev_name);
> +               return -ENODEV;
> +       }
> +       struct acc100_device *d = bbdev->data->dev_private;
> +
> +       /* Store configuration */
> +       rte_memcpy(&d->acc100_conf, conf, sizeof(d->acc100_conf));
> +
> +       /* PCIe Bridge configuration */
> +       acc100_reg_write(d, HwPfPcieGpexBridgeControl,
> +ACC100_CFG_PCI_BRIDGE);
> +       for (i = 1; i < 17; i++)
> +               acc100_reg_write(d,
> +                               HwPfPcieGpexAxiAddrMappingWindowPexBaseHigh
> +                               + i * 16, 0);
> +
> +       /* PCIe Link Training and Status State Machine */
> +       acc100_reg_write(d, HwPfPcieGpexLtssmStateCntrl, 0xDFC00000);
> +
> +       /* Prevent blocking AXI read on BRESP for AXI Write */
> +       address = HwPfPcieGpexAxiPioControl;
> +       payload = ACC100_CFG_PCI_AXI;
> +       acc100_reg_write(d, address, payload);
> +
> +       /* 5GDL PLL phase shift */
> +       acc100_reg_write(d, HWPfChaDl5gPllPhshft0, 0x1);
> +
> +       /* Explicitly releasing AXI as this may be stopped after PF
> +FLR/BME */
> +       address = HWPfDmaAxiControl;
> +       payload = 1;
> +       acc100_reg_write(d, address, payload);
> +
> +       /* DDR Configuration */
> +       address = HWPfDdrBcTim6;
> +       payload = acc100_reg_read(d, address);
> +       payload &= 0xFFFFFFFB; /* Bit 2 */
> +#ifdef ACC100_DDR_ECC_ENABLE
> +       payload |= 0x4;
> +#endif
> +       acc100_reg_write(d, address, payload);
> +       address = HWPfDdrPhyDqsCountNum;
> +#ifdef ACC100_DDR_ECC_ENABLE
> +       payload = 9;
> +#else
> +       payload = 8;
> +#endif
> +       acc100_reg_write(d, address, payload);
> +
> +       /* Set default descriptor signature */
> +       address = HWPfDmaDescriptorSignatuture;
> +       payload = 0;
> +       acc100_reg_write(d, address, payload);
> +
> +       /* Enable the Error Detection in DMA */
> +       payload = ACC100_CFG_DMA_ERROR;
> +       address = HWPfDmaErrorDetectionEn;
> +       acc100_reg_write(d, address, payload);
> +
> +       /* AXI Cache configuration */
> +       payload = ACC100_CFG_AXI_CACHE;
> +       address = HWPfDmaAxcacheReg;
> +       acc100_reg_write(d, address, payload);
> +
> +       /* Default DMA Configuration (Qmgr Enabled) */
> +       address = HWPfDmaConfig0Reg;
> +       payload = 0;
> +       acc100_reg_write(d, address, payload);
> +       address = HWPfDmaQmanen;
> +       payload = 0;
> +       acc100_reg_write(d, address, payload);
> +
> +       /* Default RLIM/ALEN configuration */
> +       address = HWPfDmaConfig1Reg;
> +       payload = (1 << 31) + (23 << 8) + (1 << 6) + 7;
> +       acc100_reg_write(d, address, payload);
> +
> +       /* Configure DMA Qmanager addresses */
> +       address = HWPfDmaQmgrAddrReg;
> +       payload = HWPfQmgrEgressQueuesTemplate;
> +       acc100_reg_write(d, address, payload);
> +
> +       /* ===== Qmgr Configuration ===== */
> +       /* Configuration of the AQueue Depth QMGR_GRP_0_DEPTH_LOG2 for UL */
> +       int totalQgs = conf->q_ul_4g.num_qgroups +
> +                       conf->q_ul_5g.num_qgroups +
> +                       conf->q_dl_4g.num_qgroups +
> +                       conf->q_dl_5g.num_qgroups;
> +       for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
> +               address = HWPfQmgrDepthLog2Grp +
> +               BYTES_IN_WORD * qg_idx;
> +               payload = aqDepth(qg_idx, conf);
> +               acc100_reg_write(d, address, payload);
> +               address = HWPfQmgrTholdGrp +
> +               BYTES_IN_WORD * qg_idx;
> +               payload = (1 << 16) + (1 << (aqDepth(qg_idx, conf) - 1));
> +               acc100_reg_write(d, address, payload);
> +       }
> +
> +       /* Template Priority in incremental order */
> +       for (template_idx = 0; template_idx < ACC100_NUM_TMPL;
> +                       template_idx++) {
> +               address = HWPfQmgrGrpTmplateReg0Indx +
> +               BYTES_IN_WORD * (template_idx % 8);
> +               payload = TMPL_PRI_0;
> +               acc100_reg_write(d, address, payload);
> +               address = HWPfQmgrGrpTmplateReg1Indx +
> +               BYTES_IN_WORD * (template_idx % 8);
> +               payload = TMPL_PRI_1;
> +               acc100_reg_write(d, address, payload);
> +               address = HWPfQmgrGrpTmplateReg2indx +
> +               BYTES_IN_WORD * (template_idx % 8);
> +               payload = TMPL_PRI_2;
> +               acc100_reg_write(d, address, payload);
> +               address = HWPfQmgrGrpTmplateReg3Indx +
> +               BYTES_IN_WORD * (template_idx % 8);
> +               payload = TMPL_PRI_3;
> +               acc100_reg_write(d, address, payload);
> +       }
> +
> +       address = HWPfQmgrGrpPriority;
> +       payload = ACC100_CFG_QMGR_HI_P;
> +       acc100_reg_write(d, address, payload);
> +
> +       /* Template Configuration */
> +       for (template_idx = 0; template_idx < ACC100_NUM_TMPL;
> +template_idx++) {
> +               payload = 0;
> +               address = HWPfQmgrGrpTmplateReg4Indx
> +                               + BYTES_IN_WORD * template_idx;
> +               acc100_reg_write(d, address, payload);
> +       }
> +       /* 4GUL */
> +       int numQgs = conf->q_ul_4g.num_qgroups;
> +       int numQqsAcc = 0;
> +       payload = 0;
> +       for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc);
> +                       qg_idx++)
> +               payload |= (1 << qg_idx);
> +       for (template_idx = SIG_UL_4G; template_idx <= SIG_UL_4G_LAST;
> +                       template_idx++) {
> +               address = HWPfQmgrGrpTmplateReg4Indx
> +                               + BYTES_IN_WORD*template_idx;
> +               acc100_reg_write(d, address, payload);
> +       }
> +       /* 5GUL */
> +       numQqsAcc += numQgs;
> +       numQgs  = conf->q_ul_5g.num_qgroups;
> +       payload = 0;
> +       int numEngines = 0;
> +       for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc);
> +                       qg_idx++)
> +               payload |= (1 << qg_idx);
> +       for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
> +                       template_idx++) {
> +               /* Check engine power-on status */
> +               address = HwPfFecUl5gIbDebugReg +
> +                               ACC100_ENGINE_OFFSET * template_idx;
> +               status = (acc100_reg_read(d, address) >> 4) & 0xF;
> +               address = HWPfQmgrGrpTmplateReg4Indx
> +                               + BYTES_IN_WORD * template_idx;
> +               if (status == 1) {
> +                       acc100_reg_write(d, address, payload);
> +                       numEngines++;
> +               } else
> +                       acc100_reg_write(d, address, 0);
> +               #if RTE_ACC100_SINGLE_FEC == 1
> +               payload = 0;
> +               #endif
> +       }
> +       printf("Number of 5GUL engines %d\n", numEngines);
> +       /* 4GDL */
> +       numQqsAcc += numQgs;
> +       numQgs  = conf->q_dl_4g.num_qgroups;
> +       payload = 0;
> +       for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc);
> +                       qg_idx++)
> +               payload |= (1 << qg_idx);
> +       for (template_idx = SIG_DL_4G; template_idx <= SIG_DL_4G_LAST;
> +                       template_idx++) {
> +               address = HWPfQmgrGrpTmplateReg4Indx
> +                               + BYTES_IN_WORD*template_idx;
> +               acc100_reg_write(d, address, payload);
> +               #if RTE_ACC100_SINGLE_FEC == 1
> +                       payload = 0;
> +               #endif
> +       }
> +       /* 5GDL */
> +       numQqsAcc += numQgs;
> +       numQgs  = conf->q_dl_5g.num_qgroups;
> +       payload = 0;
> +       for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc);
> +                       qg_idx++)
> +               payload |= (1 << qg_idx);
> +       for (template_idx = SIG_DL_5G; template_idx <= SIG_DL_5G_LAST;
> +                       template_idx++) {
> +               address = HWPfQmgrGrpTmplateReg4Indx
> +                               + BYTES_IN_WORD*template_idx;
> +               acc100_reg_write(d, address, payload);
> +               #if RTE_ACC100_SINGLE_FEC == 1
> +               payload = 0;
> +               #endif
> +       }
> +
> +       /* Queue Group Function mapping */
> +       int qman_func_id[5] = {0, 2, 1, 3, 4};
> +       address = HWPfQmgrGrpFunction0;
> +       payload = 0;
> +       for (qg_idx = 0; qg_idx < 8; qg_idx++) {
> +               acc = accFromQgid(qg_idx, conf);
> +               payload |= qman_func_id[acc]<<(qg_idx * 4);
> +       }
> +       acc100_reg_write(d, address, payload);
> +
> +       /* Configuration of the Arbitration QGroup depth to 1 */
> +       for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
> +               address = HWPfQmgrArbQDepthGrp +
> +               BYTES_IN_WORD * qg_idx;
> +               payload = 0;
> +               acc100_reg_write(d, address, payload);
> +       }
> +
> +       /* Enabling AQueues through the Queue hierarchy*/
> +       for (vf_idx = 0; vf_idx < ACC100_NUM_VFS; vf_idx++) {
> +               for (qg_idx = 0; qg_idx < ACC100_NUM_QGRPS; qg_idx++) {
> +                       payload = 0;
> +                       if (vf_idx < conf->num_vf_bundles &&
> +                                       qg_idx < totalQgs)
> +                               payload = (1 << aqNum(qg_idx, conf)) - 1;
> +                       address = HWPfQmgrAqEnableVf
> +                                       + vf_idx * BYTES_IN_WORD;
> +                       payload += (qg_idx << 16);
> +                       acc100_reg_write(d, address, payload);
> +               }
> +       }
> +
> +       /* This pointer to ARAM (256kB) is shifted by 2 (4B per register) */
> +       uint32_t aram_address = 0;
> +       for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
> +               for (vf_idx = 0; vf_idx < conf->num_vf_bundles;
> +                               vf_idx++) {
> +                       address = HWPfQmgrVfBaseAddr + vf_idx
> +                                       * BYTES_IN_WORD + qg_idx
> +                                       * BYTES_IN_WORD * 64;
> +                       payload = aram_address;
> +                       acc100_reg_write(d, address, payload);
> +                       /* Offset ARAM Address for next memory bank
> +                        * - increment of 4B
> +                        */
> +                       aram_address += aqNum(qg_idx, conf) *
> +                                       (1 << aqDepth(qg_idx, conf));
> +               }
> +       }
> +
> +       if (aram_address > WORDS_IN_ARAM_SIZE) {
> +               rte_bbdev_log(ERR, "ARAM Configuration not fitting %d %d\n",
> +                               aram_address, WORDS_IN_ARAM_SIZE);
> +               return -EINVAL;
> +       }
> +
> +       /* ==== HI Configuration ==== */
> +
> +       /* Prevent Block on Transmit Error */
> +       address = HWPfHiBlockTransmitOnErrorEn;
> +       payload = 0;
> +       acc100_reg_write(d, address, payload);
> +       /* Prevents to drop MSI */
> +       address = HWPfHiMsiDropEnableReg;
> +       payload = 0;
> +       acc100_reg_write(d, address, payload);
> +       /* Set the PF Mode register */
> +       address = HWPfHiPfMode;
> +       payload = (conf->pf_mode_en) ? 2 : 0;
> +       acc100_reg_write(d, address, payload);
> +       /* Enable Error Detection in HW */
> +       address = HWPfDmaErrorDetectionEn;
> +       payload = 0x3D7;
> +       acc100_reg_write(d, address, payload);
> +
> +       /* QoS overflow init */
> +       payload = 1;
> +       address = HWPfQosmonAEvalOverflow0;
> +       acc100_reg_write(d, address, payload);
> +       address = HWPfQosmonBEvalOverflow0;
> +       acc100_reg_write(d, address, payload);
> +
> +       /* HARQ DDR Configuration */
> +       unsigned int ddrSizeInMb = 512; /* Fixed to 512 MB per VF for now */
> +       for (vf_idx = 0; vf_idx < conf->num_vf_bundles; vf_idx++) {
> +               address = HWPfDmaVfDdrBaseRw + vf_idx
> +                               * 0x10;
> +               payload = ((vf_idx * (ddrSizeInMb / 64)) << 16) +
> +                               (ddrSizeInMb - 1);
> +               acc100_reg_write(d, address, payload);
> +       }
> +       usleep(LONG_WAIT);
> +
> +       if (numEngines < (SIG_UL_5G_LAST + 1))
> +               poweron_cleanup(bbdev, d, conf);
> +
> +       rte_bbdev_log_debug("PF Tip configuration complete for %s",
> +                       dev_name);
> +       return 0;
> +}
> diff --git a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> index 4a76d1d..91c234d 100644
> --- a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> +++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> @@ -1,3 +1,10 @@
>  DPDK_21 {
>          local: *;
>  };
> +
> +EXPERIMENTAL {
> +       global:
> +
> +       acc100_configure;
> +
> +};
> --
> 1.8.3.1

^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions
  2020-09-03  9:09         ` Ananyev, Konstantin
@ 2020-09-03 20:45           ` Chautru, Nicolas
  2020-09-15  1:45             ` Chautru, Nicolas
  2020-09-15 10:21             ` Ananyev, Konstantin
  0 siblings, 2 replies; 213+ messages in thread
From: Chautru, Nicolas @ 2020-09-03 20:45 UTC (permalink / raw)
  To: Ananyev, Konstantin, Xu, Rosen, dev, akhil.goyal; +Cc: Richardson, Bruce

> From: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> 
> 
> 
> > -----Original Message-----
> > From: dev <dev-bounces@dpdk.org> On Behalf Of Xu, Rosen
> > Sent: Thursday, September 3, 2020 3:34 AM
> > To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org;
> > akhil.goyal@nxp.com
> > Cc: Richardson, Bruce <bruce.richardson@intel.com>
> > Subject: Re: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC
> > processing functions
> >
> > Hi,
> >
> > > -----Original Message-----
> > > From: Chautru, Nicolas <nicolas.chautru@intel.com>
> > > Sent: Sunday, August 30, 2020 2:01
> > > To: Xu, Rosen <rosen.xu@intel.com>; dev@dpdk.org;
> > > akhil.goyal@nxp.com
> > > Cc: Richardson, Bruce <bruce.richardson@intel.com>
> > > Subject: RE: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC
> > > processing functions
> > >
> > > Hi Rosen,
> > >
> > > > From: Xu, Rosen <rosen.xu@intel.com>
> > > >
> > > > Hi,
> > > >
> > > > > -----Original Message-----
> > > > > From: dev <dev-bounces@dpdk.org> On Behalf Of Nicolas Chautru
> > > > > Sent: Wednesday, August 19, 2020 8:25
> > > > > To: dev@dpdk.org; akhil.goyal@nxp.com
> > > > > Cc: Richardson, Bruce <bruce.richardson@intel.com>; Chautru,
> > > > > Nicolas <nicolas.chautru@intel.com>
> > > > > Subject: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC
> > > > > processing functions
> > > > >
> > > > > Adding LDPC decode and encode processing operations
> > > > >
> > > > > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > > > > ---
> > > > >  drivers/baseband/acc100/rte_acc100_pmd.c | 1625
> > > > > +++++++++++++++++++++++++++++-
> > > > >  drivers/baseband/acc100/rte_acc100_pmd.h |    3 +
> > > > >  2 files changed, 1626 insertions(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> > > > > b/drivers/baseband/acc100/rte_acc100_pmd.c
> > > > > index 7a21c57..5f32813 100644
> > > > > --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> > > > > +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> > > > > @@ -15,6 +15,9 @@
> > > > >  #include <rte_hexdump.h>
> > > > >  #include <rte_pci.h>
> > > > >  #include <rte_bus_pci.h>
> > > > > +#ifdef RTE_BBDEV_OFFLOAD_COST
> > > > > +#include <rte_cycles.h>
> > > > > +#endif
> > > > >
> > > > >  #include <rte_bbdev.h>
> > > > >  #include <rte_bbdev_pmd.h>
> > > > > @@ -449,7 +452,6 @@
> > > > >  	return 0;
> > > > >  }
> > > > >
> > > > > -
> > > > >  /**
> > > > >   * Report a ACC100 queue index which is free
> > > > >   * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
> > > > > @@ -634,6 +636,46 @@
> > > > >  	struct acc100_device *d = dev->data->dev_private;
> > > > >
> > > > >  	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
> > > > > +		{
> > > > > +			.type   = RTE_BBDEV_OP_LDPC_ENC,
> > > > > +			.cap.ldpc_enc = {
> > > > > +				.capability_flags =
> > > > > +					RTE_BBDEV_LDPC_RATE_MATCH |
> > > > > +					RTE_BBDEV_LDPC_CRC_24B_ATTACH |
> > > > > +					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
> > > > > +				.num_buffers_src =
> > > > > +					RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > > > > +				.num_buffers_dst =
> > > > > +					RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > > > > +			}
> > > > > +		},
> > > > > +		{
> > > > > +			.type   = RTE_BBDEV_OP_LDPC_DEC,
> > > > > +			.cap.ldpc_dec = {
> > > > > +			.capability_flags =
> > > > > +				RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK |
> > > > > +				RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP |
> > > > > +				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
> > > > > +				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
> > > > > +#ifdef ACC100_EXT_MEM
> > > > > +				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
> > > > > +				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
> > > > > +#endif
> > > > > +				RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE |
> > > > > +				RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS |
> > > > > +				RTE_BBDEV_LDPC_DECODE_BYPASS |
> > > > > +				RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
> > > > > +				RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
> > > > > +				RTE_BBDEV_LDPC_LLR_COMPRESSION,
> > > > > +			.llr_size = 8,
> > > > > +			.llr_decimals = 1,
> > > > > +			.num_buffers_src =
> > > > > +					RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > > > > +			.num_buffers_hard_out =
> > > > > +					RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > > > > +			.num_buffers_soft_out = 0,
> > > > > +			}
> > > > > +		},
> > > > >  		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
> > > > >  	};
> > > > >
> > > > > @@ -669,9 +711,14 @@
> > > > >  	dev_info->cpu_flag_reqs = NULL;
> > > > >  	dev_info->min_alignment = 64;
> > > > >  	dev_info->capabilities = bbdev_capabilities;
> > > > > +#ifdef ACC100_EXT_MEM
> > > > >  	dev_info->harq_buffer_size = d->ddr_size;
> > > > > +#else
> > > > > +	dev_info->harq_buffer_size = 0;
> > > > > +#endif
> > > > >  }
> > > > >
> > > > > +
> > > > >  static const struct rte_bbdev_ops acc100_bbdev_ops = {
> > > > >  	.setup_queues = acc100_setup_queues,
> > > > >  	.close = acc100_dev_close,
> > > > > @@ -696,6 +743,1577 @@
> > > > >  	{.device_id = 0},
> > > > >  };
> > > > >
> > > > > +/* Read flag value 0/1 from bitmap */
> > > > > +static inline bool
> > > > > +check_bit(uint32_t bitmap, uint32_t bitmask)
> > > > > +{
> > > > > +	return bitmap & bitmask;
> > > > > +}
> > > > > +
> > > > > +static inline char *
> > > > > +mbuf_append(struct rte_mbuf *m_head, struct rte_mbuf *m, uint16_t len)
> > > > > +{
> > > > > +	if (unlikely(len > rte_pktmbuf_tailroom(m)))
> > > > > +		return NULL;
> > > > > +
> > > > > +	char *tail = (char *)m->buf_addr + m->data_off + m->data_len;
> > > > > +	m->data_len = (uint16_t)(m->data_len + len);
> > > > > +	m_head->pkt_len  = (m_head->pkt_len + len);
> > > > > +	return tail;
> > > > > +}
> > > >
> > > > Is it reasonable to direct add data_len of rte_mbuf?
> > > >
> > >
> > > Do you suggest to add directly without checking there is enough room
> > > in the mbuf? We cannot rely on the application providing mbuf with
> > > enough tailroom.
> >
> > What I mentioned is this changes about mbuf should move to librte_mbuf.
> > And it's better to align Olivier Matz.
> 
> There is already rte_pktmbuf_append() inside rte_mbuf.h.
> Wouldn't it suit?
> 

Hi Ananyev, Rosen, 
I agree that this can be confusing at first look, notably compared to packet processing. 
Note first that this same existing syntax is already used in all bbdev PMDs when manipulating outbound mbufs in the context of baseband signal processing (not really a packet as for a NIC or other devices). 
Nothing new in this very PMD, as it follows existing logic already in the DPDK bbdev PMDs. 

This function basically differs from the typical rte_pktmbuf_append() in that it does not append data to the last mbuf, but is used to sequentially update data in potentially any mbuf in the middle of the chain from preallocated data, hence it takes 2 arguments, for both the head and the current mbuf segment in the list. 
There may be a more elegant way to do this down the line, notably once there is a proposal to handle large mbufs gracefully (another use case we have to handle in a slightly custom way). But I believe that is orthogonal to this very PMD series, which keeps relying on existing logic. 
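For illustration, the two-argument append described above can be sketched with a simplified stand-in struct (hypothetical `struct seg` and `seg_append` names, mirroring only the mbuf fields the logic touches; the real PMD operates on struct rte_mbuf):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for rte_mbuf (hypothetical; only the fields the
 * append logic touches). The real PMD uses struct rte_mbuf directly. */
struct seg {
	char *buf_addr;    /* start of the segment's data buffer */
	uint16_t data_off; /* offset of valid data within the buffer */
	uint16_t data_len; /* bytes of valid data in this segment */
	uint16_t tailroom; /* free bytes remaining after the data */
	uint32_t pkt_len;  /* total length, maintained on the head only */
};

/* Grow segment 'm' (possibly a middle segment of a chain) by 'len'
 * bytes, while the chain-wide pkt_len is updated on 'head'. This is
 * why the driver's mbuf_append() takes two mbuf arguments, unlike
 * rte_pktmbuf_append(), which always appends to the last segment. */
static char *
seg_append(struct seg *head, struct seg *m, uint16_t len)
{
	if (len > m->tailroom)
		return NULL; /* caller did not provide enough tailroom */
	char *tail = m->buf_addr + m->data_off + m->data_len;
	m->data_len = (uint16_t)(m->data_len + len);
	m->tailroom = (uint16_t)(m->tailroom - len);
	head->pkt_len += len; /* only the head tracks the total */
	return tail;
}
```

The returned tail pointer is where the caller writes the next chunk of processed output, which is how the PMD fills preallocated segmented output mbufs.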




> >
> > > In case you ask about the 2 mbufs, this is because this function is
> > > used to also support segmented memory made of multiple mbufs segments.
> > > Note that this function is also used in other existing bbdev PMDs.
> > > In case you believe there is a better way to do this, we can
> > > certainly discuss and change these in several PMDs through another serie.
> > >
> > > Thanks for all the reviews and useful comments.
> > > Nic

^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v3 04/11] baseband/acc100: add queue configuration
  2020-09-03  2:30       ` Xu, Rosen
@ 2020-09-03 22:48         ` Chautru, Nicolas
  2020-09-04  2:01           ` Xu, Rosen
  0 siblings, 1 reply; 213+ messages in thread
From: Chautru, Nicolas @ 2020-09-03 22:48 UTC (permalink / raw)
  To: Xu, Rosen, dev, akhil.goyal; +Cc: Richardson, Bruce

> From: Xu, Rosen <rosen.xu@intel.com>
> 
> Hi,
> 
> > -----Original Message-----
> > From: Chautru, Nicolas <nicolas.chautru@intel.com>
> > Sent: Sunday, August 30, 2020 1:48
> > To: Xu, Rosen <rosen.xu@intel.com>; dev@dpdk.org; akhil.goyal@nxp.com
> > Cc: Richardson, Bruce <bruce.richardson@intel.com>
> > Subject: RE: [dpdk-dev] [PATCH v3 04/11] baseband/acc100: add queue
> > configuration
> >
> > Hi,
> >
> > > From: Xu, Rosen <rosen.xu@intel.com>
> > >
> > > Hi,
> > >
> > > > -----Original Message-----
> > > > From: dev <dev-bounces@dpdk.org> On Behalf Of Nicolas Chautru
> > > > Sent: Wednesday, August 19, 2020 8:25
> > > > To: dev@dpdk.org; akhil.goyal@nxp.com
> > > > Cc: Richardson, Bruce <bruce.richardson@intel.com>; Chautru,
> > > > Nicolas <nicolas.chautru@intel.com>
> > > > Subject: [dpdk-dev] [PATCH v3 04/11] baseband/acc100: add queue
> > > > configuration
> > > >
> > > > Adding function to create and configure queues for the device.
> > > > Still no capability.
> > > >
> > > > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > > > ---
> > > >  drivers/baseband/acc100/rte_acc100_pmd.c | 420
> > > > ++++++++++++++++++++++++++++++-
> > > > drivers/baseband/acc100/rte_acc100_pmd.h |  45 ++++
> > > >  2 files changed, 464 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> > > > b/drivers/baseband/acc100/rte_acc100_pmd.c
> > > > index 7807a30..7a21c57 100644
> > > > --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> > > > +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> > > > @@ -26,6 +26,22 @@
> > > >  RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
> > > >  #endif
> > > >
> > > > +/* Write to MMIO register address */
> > > > +static inline void
> > > > +mmio_write(void *addr, uint32_t value)
> > > > +{
> > > > +	*((volatile uint32_t *)(addr)) = rte_cpu_to_le_32(value);
> > > > +}
> > > > +
> > > > +/* Write a register of a ACC100 device */
> > > > +static inline void
> > > > +acc100_reg_write(struct acc100_device *d, uint32_t offset,
> > > > +		uint32_t payload)
> > > > +{
> > > > +	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
> > > > +	mmio_write(reg_addr, payload);
> > > > +	usleep(1000);
> > > > +}
> > > > +
> > > >  /* Read a register of a ACC100 device */
> > > >  static inline uint32_t
> > > >  acc100_reg_read(struct acc100_device *d, uint32_t offset)
> > > > @@ -36,6 +52,22 @@
> > > >  	return rte_le_to_cpu_32(ret);
> > > >  }
> > > >
> > > > +/* Basic Implementation of Log2 for exact 2^N */
> > > > +static inline uint32_t
> > > > +log2_basic(uint32_t value)
> > > > +{
> > > > +	return (value == 0) ? 0 : __builtin_ctz(value);
> > > > +}
> > > > +
> > > > +/* Calculate memory alignment offset assuming alignment is 2^N */
> > > > +static inline uint32_t
> > > > +calc_mem_alignment_offset(void *unaligned_virt_mem, uint32_t alignment)
> > > > +{
> > > > +	rte_iova_t unaligned_phy_mem = rte_malloc_virt2iova(unaligned_virt_mem);
> > > > +	return (uint32_t)(alignment -
> > > > +			(unaligned_phy_mem & (alignment-1)));
> > > > +}
> > > > +
> > > >  /* Calculate the offset of the enqueue register */
> > > >  static inline uint32_t
> > > >  queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
> > > > @@ -204,10 +236,393 @@
> > > >  			acc100_conf->q_dl_5g.aq_depth_log2);
> > > >  }
> > > >
> > > > +static void
> > > > +free_base_addresses(void **base_addrs, int size)
> > > > +{
> > > > +	int i;
> > > > +	for (i = 0; i < size; i++)
> > > > +		rte_free(base_addrs[i]);
> > > > +}
> > > > +
> > > > +static inline uint32_t
> > > > +get_desc_len(void)
> > > > +{
> > > > +	return sizeof(union acc100_dma_desc);
> > > > +}
> > > > +
> > > > +/* Allocate the 2 * 64MB block for the sw rings */
> > > > +static int
> > > > +alloc_2x64mb_sw_rings_mem(struct rte_bbdev *dev, struct acc100_device *d,
> > > > +		int socket)
> > > > +{
> > > > +	uint32_t sw_ring_size = ACC100_SIZE_64MBYTE;
> > > > +	d->sw_rings_base = rte_zmalloc_socket(dev->device->driver->name,
> > > > +			2 * sw_ring_size, RTE_CACHE_LINE_SIZE, socket);
> > > > +	if (d->sw_rings_base == NULL) {
> > > > +		rte_bbdev_log(ERR, "Failed to allocate memory for %s:%u",
> > > > +				dev->device->driver->name,
> > > > +				dev->data->dev_id);
> > > > +		return -ENOMEM;
> > > > +	}
> > > > +	memset(d->sw_rings_base, 0, ACC100_SIZE_64MBYTE);
> > > > +	uint32_t next_64mb_align_offset = calc_mem_alignment_offset(
> > > > +			d->sw_rings_base, ACC100_SIZE_64MBYTE);
> > > > +	d->sw_rings = RTE_PTR_ADD(d->sw_rings_base, next_64mb_align_offset);
> > > > +	d->sw_rings_phys = rte_malloc_virt2iova(d->sw_rings_base) +
> > > > +			next_64mb_align_offset;
> > > > +	d->sw_ring_size = MAX_QUEUE_DEPTH * get_desc_len();
> > > > +	d->sw_ring_max_depth = d->sw_ring_size / get_desc_len();
> > > > +
> > > > +	return 0;
> > > > +}
> > >
> > > Why not a common alloc memory function but special function for
> > > different memory size?
> >
> > This is a bit convoluted but due to the fact the first attempt method
> > which is optimal (minimum) may not always find aligned memory.
> 
> What's convoluted? Can you explain?
> For packet processing, in most scenarios, aren't we aligned memory when we
> alloc memory?

Hi Rosen, 
This is related to both the alignment and the size of the contiguous amount of data in pinned-down memory: a 64MB contiguous block aligned on a 64MB boundary of the physical address (not linear). 
The first method can potentially fail, hence it is run incrementally, while the 2nd version may be used as a safe fallback but is more wasteful in terms of footprint (hence not used by default).
That is the part that I considered "convoluted" in this way to reliably allocate memory. It is possible to only use the 2nd version, which would look cleaner in terms of code but is more wasteful in memory usage. 
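As a sketch of the alignment arithmetic involved (a hypothetical `align_offset` helper mirroring the shape of calc_mem_alignment_offset() in the patch, which applies the same computation to the IOVA returned by rte_malloc_virt2iova()):

```c
#include <assert.h>
#include <stdint.h>

/* Offset to add to a physical address to reach the next 2^N-byte
 * boundary. Note that an already-aligned address yields a full
 * 'alignment' offset rather than 0; allocating 2 * 64MB therefore
 * always leaves a 64MB-aligned 64MB window somewhere inside the
 * block, which is why the fallback path over-allocates. */
static inline uint32_t
align_offset(uint64_t phys_addr, uint64_t alignment /* must be 2^N */)
{
	return (uint32_t)(alignment - (phys_addr & (alignment - 1)));
}
```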



> >
> > >
> > > > +/* Attempt to allocate minimised memory space for sw rings */
> > > > +static void
> > > > +alloc_sw_rings_min_mem(struct rte_bbdev *dev, struct acc100_device *d,
> > > > +		uint16_t num_queues, int socket)
> > > > +{
> > > > +	rte_iova_t sw_rings_base_phy, next_64mb_align_addr_phy;
> > > > +	uint32_t next_64mb_align_offset;
> > > > +	rte_iova_t sw_ring_phys_end_addr;
> > > > +	void *base_addrs[SW_RING_MEM_ALLOC_ATTEMPTS];
> > > > +	void *sw_rings_base;
> > > > +	int i = 0;
> > > > +	uint32_t q_sw_ring_size = MAX_QUEUE_DEPTH * get_desc_len();
> > > > +	uint32_t dev_sw_ring_size = q_sw_ring_size * num_queues;
> > > > +
> > > > +	/* Find an aligned block of memory to store sw rings */
> > > > +	while (i < SW_RING_MEM_ALLOC_ATTEMPTS) {
> > > > +		/*
> > > > +		 * sw_ring allocated memory is guaranteed to be aligned to
> > > > +		 * q_sw_ring_size at the condition that the requested size is
> > > > +		 * less than the page size
> > > > +		 */
> > > > +		sw_rings_base = rte_zmalloc_socket(
> > > > +				dev->device->driver->name,
> > > > +				dev_sw_ring_size, q_sw_ring_size, socket);
> > > > +
> > > > +		if (sw_rings_base == NULL) {
> > > > +			rte_bbdev_log(ERR,
> > > > +					"Failed to allocate memory for %s:%u",
> > > > +					dev->device->driver->name,
> > > > +					dev->data->dev_id);
> > > > +			break;
> > > > +		}
> > > > +
> > > > +		sw_rings_base_phy = rte_malloc_virt2iova(sw_rings_base);
> > > > +		next_64mb_align_offset = calc_mem_alignment_offset(
> > > > +				sw_rings_base, ACC100_SIZE_64MBYTE);
> > > > +		next_64mb_align_addr_phy = sw_rings_base_phy +
> > > > +				next_64mb_align_offset;
> > > > +		sw_ring_phys_end_addr = sw_rings_base_phy + dev_sw_ring_size;
> > > > +
> > > > +		/* Check if the end of the sw ring memory block is before the
> > > > +		 * start of next 64MB aligned mem address
> > > > +		 */
> > > > +		if (sw_ring_phys_end_addr < next_64mb_align_addr_phy) {
> > > > +			d->sw_rings_phys = sw_rings_base_phy;
> > > > +			d->sw_rings = sw_rings_base;
> > > > +			d->sw_rings_base = sw_rings_base;
> > > > +			d->sw_ring_size = q_sw_ring_size;
> > > > +			d->sw_ring_max_depth = MAX_QUEUE_DEPTH;
> > > > +			break;
> > > > +		}
> > > > +		/* Store the address of the unaligned mem block */
> > > > +		base_addrs[i] = sw_rings_base;
> > > > +		i++;
> > > > +	}
> > > > +
> > > > +	/* Free all unaligned blocks of mem allocated in the loop */
> > > > +	free_base_addresses(base_addrs, i);
> > > > +}
> > >
> > > It's strange to firstly alloc memory and then free memory but on
> > > operations on this memory.
> >
> > I may miss your point. We are freeing the exact same mem we did get
> > from rte_zmalloc.
> > Note that the base_addrs array refers to multiple attempts of mallocs,
> > not multiple operations in a ring.
> 
> You alloc memory sw_rings_base, after some translate, assign this memory to
> cc100_device *d, and before the function return, this memory has been freed.

If you follow the logic, this actually only frees the memory from the attempts which did not end up successfully well aligned, not the one which is in fact used for the sw rings.  
The actual memory for the sw rings is obviously used, and it gets freed when closing the device below => ie. rte_free(d->sw_rings_base);
Let me know if unclear. I could add more comments if this is not obvious from the code. Ie. /* Free all _unaligned_ blocks of mem allocated in the loop */

Thanks for your review. I can see how it can look a bit odd initially. 
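The retry-and-free pattern can be sketched in isolation (hypothetical helper names, plain malloc/free standing in for rte_zmalloc_socket()/rte_free(), and a virtual-address boundary check standing in for the physical-address one in the driver):

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define ATTEMPTS 5 /* stand-in for SW_RING_MEM_ALLOC_ATTEMPTS */

/* Hypothetical predicate standing in for the driver's physical-address
 * check (sw_ring_phys_end_addr < next_64mb_align_addr_phy): true when
 * the block [p, p + size) does not straddle a 'boundary' multiple. */
static int
fits_before_boundary(void *p, size_t size, size_t boundary)
{
	uintptr_t start = (uintptr_t)p;
	return (start / boundary) == ((start + size - 1) / boundary);
}

/* Allocate up to ATTEMPTS times, keep the first block that does not
 * straddle the boundary, and free only the failed attempts. The
 * retained block stays live, as in alloc_sw_rings_min_mem(). */
static void *
alloc_not_straddling(size_t size, size_t boundary)
{
	void *failed[ATTEMPTS];
	int n = 0;
	void *keep = NULL;

	while (n < ATTEMPTS) {
		void *p = malloc(size);
		if (p == NULL)
			break;
		if (fits_before_boundary(p, size, boundary)) {
			keep = p;
			break;
		}
		failed[n++] = p; /* remember for cleanup, then retry */
	}
	for (int i = 0; i < n; i++)
		free(failed[i]); /* only the unsuitable attempts are freed */
	return keep;
}
```

As in the driver, the caller is expected to fall back to an over-sized allocation when all attempts fail.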

> 
> > >
> > > > +
> > > > +/* Allocate 64MB memory used for all software rings */
> > > > +static int
> > > > +acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues,
> > > > +		int socket_id)
> > > > +{
> > > > +	uint32_t phys_low, phys_high, payload;
> > > > +	struct acc100_device *d = dev->data->dev_private;
> > > > +	const struct acc100_registry_addr *reg_addr;
> > > > +
> > > > +	if (d->pf_device && !d->acc100_conf.pf_mode_en) {
> > > > +		rte_bbdev_log(NOTICE,
> > > > +				"%s has PF mode disabled. This PF can't be used.",
> > > > +				dev->data->name);
> > > > +		return -ENODEV;
> > > > +	}
> > > > +
> > > > +	alloc_sw_rings_min_mem(dev, d, num_queues, socket_id);
> > > > +
> > > > +	/* If minimal memory space approach failed, then allocate
> > > > +	 * the 2 * 64MB block for the sw rings
> > > > +	 */
> > > > +	if (d->sw_rings == NULL)
> > > > +		alloc_2x64mb_sw_rings_mem(dev, d, socket_id);
> > > > +
> > > > +	/* Configure ACC100 with the base address for DMA descriptor rings
> > > > +	 * Same descriptor rings used for UL and DL DMA Engines
> > > > +	 * Note : Assuming only VF0 bundle is used for PF mode
> > > > +	 */
> > > > +	phys_high = (uint32_t)(d->sw_rings_phys >> 32);
> > > > +	phys_low  = (uint32_t)(d->sw_rings_phys & ~(ACC100_SIZE_64MBYTE-1));
> > > > +
> > > > +	/* Choose correct registry addresses for the device type */
> > > > +	if (d->pf_device)
> > > > +		reg_addr = &pf_reg_addr;
> > > > +	else
> > > > +		reg_addr = &vf_reg_addr;
> > > > +
> > > > +	/* Read the populated cfg from ACC100 registers */
> > > > +	fetch_acc100_config(dev);
> > > > +
> > > > +	/* Mark as configured properly */
> > > > +	d->configured = true;
> > > > +
> > > > +	/* Release AXI from PF */
> > > > +	if (d->pf_device)
> > > > +		acc100_reg_write(d, HWPfDmaAxiControl, 1);
> > > > +
> > > > +	acc100_reg_write(d, reg_addr->dma_ring_ul5g_hi, phys_high);
> > > > +	acc100_reg_write(d, reg_addr->dma_ring_ul5g_lo, phys_low);
> > > > +	acc100_reg_write(d, reg_addr->dma_ring_dl5g_hi, phys_high);
> > > > +	acc100_reg_write(d, reg_addr->dma_ring_dl5g_lo, phys_low);
> > > > +	acc100_reg_write(d, reg_addr->dma_ring_ul4g_hi, phys_high);
> > > > +	acc100_reg_write(d, reg_addr->dma_ring_ul4g_lo, phys_low);
> > > > +	acc100_reg_write(d, reg_addr->dma_ring_dl4g_hi, phys_high);
> > > > +	acc100_reg_write(d, reg_addr->dma_ring_dl4g_lo, phys_low);
> > > > +
> > > > +	/*
> > > > +	 * Configure Ring Size to the max queue ring size
> > > > +	 * (used for wrapping purpose)
> > > > +	 */
> > > > +	payload = log2_basic(d->sw_ring_size / 64);
> > > > +	acc100_reg_write(d, reg_addr->ring_size, payload);
> > > > +
> > > > +	/* Configure tail pointer for use when SDONE enabled */
> > > > +	d->tail_ptrs = rte_zmalloc_socket(
> > > > +			dev->device->driver->name,
> > > > +			ACC100_NUM_QGRPS * ACC100_NUM_AQS * sizeof(uint32_t),
> > > > +			RTE_CACHE_LINE_SIZE, socket_id);
> > > > +	if (d->tail_ptrs == NULL) {
> > > > +		rte_bbdev_log(ERR, "Failed to allocate tail ptr for %s:%u",
> > > > +				dev->device->driver->name,
> > > > +				dev->data->dev_id);
> > > > +		rte_free(d->sw_rings);
> > > > +		return -ENOMEM;
> > > > +	}
> > > > +	d->tail_ptr_phys = rte_malloc_virt2iova(d->tail_ptrs);
> > > > +
> > > > +	phys_high = (uint32_t)(d->tail_ptr_phys >> 32);
> > > > +	phys_low  = (uint32_t)(d->tail_ptr_phys);
> > > > +	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_hi, phys_high);
> > > > +	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_lo, phys_low);
> > > > +	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_hi, phys_high);
> > > > +	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_lo, phys_low);
> > > > +	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_hi, phys_high);
> > > > +	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_lo, phys_low);
> > > > +	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
> > > > +	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
> > > > +
> > > > +	d->harq_layout = rte_zmalloc_socket("HARQ Layout",
> > > > +			ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
> > > > +			RTE_CACHE_LINE_SIZE, dev->data->socket_id);
> > > > +
> > > > +	rte_bbdev_log_debug(
> > > > +			"ACC100 (%s) configured  sw_rings = %p, sw_rings_phys = %#"
> > > > +			PRIx64, dev->data->name, d->sw_rings, d->sw_rings_phys);
> > > > +
> > > > +	return 0;
> > > > +}
> > > > +
> > > >  /* Free 64MB memory used for software rings */
> > > >  static int
> > > > -acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
> > > > +acc100_dev_close(struct rte_bbdev *dev)
> > > >  {
> > > > +	struct acc100_device *d = dev->data->dev_private;
> > > > +	if (d->sw_rings_base != NULL) {
> > > > +		rte_free(d->tail_ptrs);
> > > > +		rte_free(d->sw_rings_base);
> > > > +		d->sw_rings_base = NULL;
> > > > +	}
> > > > +	usleep(1000);
> > > > +	return 0;
> > > > +}
> > > > +
> > > > +
> > > > +/**
> > > > + * Report a ACC100 queue index which is free
> > > > + * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
> > > > + * Note : Only supporting VF0 Bundle for PF mode
> > > > + */
> > > > +static int
> > > > +acc100_find_free_queue_idx(struct rte_bbdev *dev,
> > > > +		const struct rte_bbdev_queue_conf *conf)
> > > > +{
> > > > +	struct acc100_device *d = dev->data->dev_private;
> > > > +	int op_2_acc[5] = {0, UL_4G, DL_4G, UL_5G, DL_5G};
> > > > +	int acc = op_2_acc[conf->op_type];
> > > > +	struct rte_q_topology_t *qtop = NULL;
> > > > +	qtopFromAcc(&qtop, acc, &(d->acc100_conf));
> > > > +	if (qtop == NULL)
> > > > +		return -1;
> > > > +	/* Identify matching QGroup Index which are sorted in priority order */
> > > > +	uint16_t group_idx = qtop->first_qgroup_index;
> > > > +	group_idx += conf->priority;
> > > > +	if (group_idx >= ACC100_NUM_QGRPS ||
> > > > +			conf->priority >= qtop->num_qgroups) {
> > > > +		rte_bbdev_log(INFO, "Invalid Priority on %s, priority %u",
> > > > +				dev->data->name, conf->priority);
> > > > +		return -1;
> > > > +	}
> > > > +	/* Find a free AQ_idx  */
> > > > +	uint16_t aq_idx;
> > > > +	for (aq_idx = 0; aq_idx < qtop->num_aqs_per_groups; aq_idx++) {
> > > > +		if (((d->q_assigned_bit_map[group_idx] >> aq_idx) & 0x1) == 0) {
> > > > +			/* Mark the Queue as assigned */
> > > > +			d->q_assigned_bit_map[group_idx] |= (1 << aq_idx);
> > > > +			/* Report the AQ Index */
> > > > +			return (group_idx << GRP_ID_SHIFT) + aq_idx;
> > > > +		}
> > > > +	}
> > > > +	rte_bbdev_log(INFO, "Failed to find free queue on %s, priority %u",
> > > > +			dev->data->name, conf->priority);
> > > > +	return -1;
> > > > +}
> > > > +
> > > > +/* Setup ACC100 queue */
> > > > +static int
> > > > +acc100_queue_setup(struct rte_bbdev *dev, uint16_t queue_id,
> > > > +		const struct rte_bbdev_queue_conf *conf)
> > > > +{
> > > > +	struct acc100_device *d = dev->data->dev_private;
> > > > +	struct acc100_queue *q;
> > > > +	int16_t q_idx;
> > > > +
> > > > +	/* Allocate the queue data structure. */
> > > > +	q = rte_zmalloc_socket(dev->device->driver->name, sizeof(*q),
> > > > +			RTE_CACHE_LINE_SIZE, conf->socket);
> > > > +	if (q == NULL) {
> > > > +		rte_bbdev_log(ERR, "Failed to allocate queue memory");
> > > > +		return -ENOMEM;
> > > > +	}
> > > > +
> > > > +	q->d = d;
> > > > +	q->ring_addr = RTE_PTR_ADD(d->sw_rings, (d->sw_ring_size * queue_id));
> > > > +	q->ring_addr_phys = d->sw_rings_phys + (d->sw_ring_size * queue_id);
> > > > +
> > > > +	/* Prepare the Ring with default descriptor format */
> > > > +	union acc100_dma_desc *desc = NULL;
> > > > +	unsigned int desc_idx, b_idx;
> > > > +	int fcw_len = (conf->op_type == RTE_BBDEV_OP_LDPC_ENC ?
> > > > +		ACC100_FCW_LE_BLEN : (conf->op_type == RTE_BBDEV_OP_TURBO_DEC ?
> > > > +		ACC100_FCW_TD_BLEN : ACC100_FCW_LD_BLEN));
> > > > +
> > > > +	for (desc_idx = 0; desc_idx < d->sw_ring_max_depth; desc_idx++) {
> > > > +		desc = q->ring_addr + desc_idx;
> > > > +		desc->req.word0 = ACC100_DMA_DESC_TYPE;
> > > > +		desc->req.word1 = 0; /**< Timestamp */
> > > > +		desc->req.word2 = 0;
> > > > +		desc->req.word3 = 0;
> > > > +		uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
> > > > +		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
> > > > +		desc->req.data_ptrs[0].blen = fcw_len;
> > > > +		desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
> > > > +		desc->req.data_ptrs[0].last = 0;
> > > > +		desc->req.data_ptrs[0].dma_ext = 0;
> > > > +		for (b_idx = 1; b_idx < ACC100_DMA_MAX_NUM_POINTERS - 1;
> > > > +				b_idx++) {
> > > > +			desc->req.data_ptrs[b_idx].blkid = ACC100_DMA_BLKID_IN;
> > > > +			desc->req.data_ptrs[b_idx].last = 1;
> > > > +			desc->req.data_ptrs[b_idx].dma_ext = 0;
> > > > +			b_idx++;
> > > > +			desc->req.data_ptrs[b_idx].blkid =
> > > > +					ACC100_DMA_BLKID_OUT_ENC;
> > > > +			desc->req.data_ptrs[b_idx].last = 1;
> > > > +			desc->req.data_ptrs[b_idx].dma_ext = 0;
> > > > +		}
> > > > +		/* Preset some fields of LDPC FCW */
> > > > +		desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
> > > > +		desc->req.fcw_ld.gain_i = 1;
> > > > +		desc->req.fcw_ld.gain_h = 1;
> > > > +	}
> > > > +
> > > > +	q->lb_in = rte_zmalloc_socket(dev->device->driver->name,
> > > > +			RTE_CACHE_LINE_SIZE,
> > > > +			RTE_CACHE_LINE_SIZE, conf->socket);
> > > > +	if (q->lb_in == NULL) {
> > > > +		rte_bbdev_log(ERR, "Failed to allocate lb_in memory");
> > > > +		return -ENOMEM;
> > > > +	}
> > > > +	q->lb_in_addr_phys = rte_malloc_virt2iova(q->lb_in);
> > > > +	q->lb_out = rte_zmalloc_socket(dev->device->driver->name,
> > > > +			RTE_CACHE_LINE_SIZE,
> > > > +			RTE_CACHE_LINE_SIZE, conf->socket);
> > > > +	if (q->lb_out == NULL) {
> > > > +		rte_bbdev_log(ERR, "Failed to allocate lb_out memory");
> > > > +		return -ENOMEM;
> > > > +	}
> > > > +	q->lb_out_addr_phys = rte_malloc_virt2iova(q->lb_out);
> > > > +
> > > > +	/*
> > > > +	 * Software queue ring wraps synchronously with the HW when it reaches
> > > > +	 * the boundary of the maximum allocated queue size, no matter what the
> > > > +	 * sw queue size is. This wrapping is guarded by setting the wrap_mask
> > > > +	 * to represent the maximum queue size as allocated at the time when
> > > > +	 * the device has been setup (in configure()).
> > > > +	 *
> > > > +	 * The queue depth is set to the queue size value (conf->queue_size).
> > > > +	 * This limits the occupancy of the queue at any point of time, so that
> > > > +	 * the queue does not get swamped with enqueue requests.
> > > > +	 */
> > > > +	q->sw_ring_depth = conf->queue_size;
> > > > +	q->sw_ring_wrap_mask = d->sw_ring_max_depth - 1;
> > > > +
> > > > +	q->op_type = conf->op_type;
> > > > +
> > > > +	q_idx = acc100_find_free_queue_idx(dev, conf);
> > > > +	if (q_idx == -1) {
> > > > +		rte_free(q);
> > > > +		return -1;
> > > > +	}
> > > > +
> > > > +	q->qgrp_id = (q_idx >> GRP_ID_SHIFT) & 0xF;
> > > > +	q->vf_id = (q_idx >> VF_ID_SHIFT)  & 0x3F;
> > > > +	q->aq_id = q_idx & 0xF;
> > > > +	q->aq_depth = (conf->op_type ==  RTE_BBDEV_OP_TURBO_DEC) ?
> > > > +			(1 << d->acc100_conf.q_ul_4g.aq_depth_log2) :
> > > > +			(1 << d->acc100_conf.q_dl_4g.aq_depth_log2);
> > > > +
> > > > +	q->mmio_reg_enqueue = RTE_PTR_ADD(d->mmio_base,
> > > > +			queue_offset(d->pf_device,
> > > > +					q->vf_id, q->qgrp_id, q->aq_id));
> > > > +
> > > > +	rte_bbdev_log_debug(
> > > > +			"Setup dev%u q%u: qgrp_id=%u, vf_id=%u, aq_id=%u, aq_depth=%u, mmio_reg_enqueue=%p",
> > > > +			dev->data->dev_id, queue_id, q->qgrp_id, q->vf_id,
> > > > +			q->aq_id, q->aq_depth, q->mmio_reg_enqueue);
> > > > +
> > > > +	dev->data->queues[queue_id].queue_private = q;
> > > > +	return 0;
> > > > +}
> > > > +
> > > > +/* Release ACC100 queue */
> > > > +static int
> > > > +acc100_queue_release(struct rte_bbdev *dev, uint16_t q_id)
> > > > +{
> > > > +	struct acc100_device *d = dev->data->dev_private;
> > > > +	struct acc100_queue *q = dev->data->queues[q_id].queue_private;
> > > > +
> > > > +	if (q != NULL) {
> > > > +		/* Mark the Queue as un-assigned */
> > > > +		d->q_assigned_bit_map[q->qgrp_id] &= (0xFFFFFFFF -
> > > > +				(1 << q->aq_id));
> > > > +		rte_free(q->lb_in);
> > > > +		rte_free(q->lb_out);
> > > > +		rte_free(q);
> > > > +		dev->data->queues[q_id].queue_private = NULL;
> > > > +	}
> > > > +
> > > >  	return 0;
> > > >  }
> > > >
> > > > @@ -258,8 +673,11 @@
> > > >  }
> > > >
> > > >  static const struct rte_bbdev_ops acc100_bbdev_ops = {
> > > > +	.setup_queues = acc100_setup_queues,
> > > >  	.close = acc100_dev_close,
> > > >  	.info_get = acc100_dev_info_get,
> > > > +	.queue_setup = acc100_queue_setup,
> > > > +	.queue_release = acc100_queue_release,
> > > >  };
> > > >
> > > >  /* ACC100 PCI PF address map */
> > > > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h
> > > > b/drivers/baseband/acc100/rte_acc100_pmd.h
> > > > index 662e2c8..0e2b79c 100644
> > > > --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> > > > +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> > > > @@ -518,11 +518,56 @@ struct acc100_registry_addr {
> > > >  	.ddr_range = HWVfDmaDdrBaseRangeRoVf,
> > > >  };
> > > >
> > > > +/* Structure associated with each queue. */
> > > > +struct __rte_cache_aligned acc100_queue {
> > > > +	union acc100_dma_desc *ring_addr;  /* Virtual address of sw ring */
> > > > +	rte_iova_t ring_addr_phys;  /* Physical address of software ring */
> > > > +	uint32_t sw_ring_head;  /* software ring head */
> > > > +	uint32_t sw_ring_tail;  /* software ring tail */
> > > > +	/* software ring size (descriptors, not bytes) */
> > > > +	uint32_t sw_ring_depth;
> > > > +	/* mask used to wrap enqueued descriptors on the sw ring */
> > > > +	uint32_t sw_ring_wrap_mask;
> > > > +	/* MMIO register used to enqueue descriptors */
> > > > +	void *mmio_reg_enqueue;
> > > > +	uint8_t vf_id;  /* VF ID (max = 63) */
> > > > +	uint8_t qgrp_id;  /* Queue Group ID */
> > > > +	uint16_t aq_id;  /* Atomic Queue ID */
> > > > +	uint16_t aq_depth;  /* Depth of atomic queue */
> > > > +	uint32_t aq_enqueued;  /* Count how many "batches" have been enqueued */
> > > > +	uint32_t aq_dequeued;  /* Count how many "batches" have been dequeued */
> > > > +	uint32_t irq_enable;  /* Enable ops dequeue interrupts if set to 1 */
> > > > +	struct rte_mempool *fcw_mempool;  /* FCW mempool */
> > > > +	enum rte_bbdev_op_type op_type;  /* Type of this Queue: TE or TD */
> > > > +	/* Internal Buffers for loopback input */
> > > > +	uint8_t *lb_in;
> > > > +	uint8_t *lb_out;
> > > > +	rte_iova_t lb_in_addr_phys;
> > > > +	rte_iova_t lb_out_addr_phys;
> > > > +	struct acc100_device *d;
> > > > +};
> > > > +
> > > >  /* Private data structure for each ACC100 device */
> > > >  struct acc100_device {
> > > >  	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
> > > > +	void *sw_rings_base;  /* Base addr of un-aligned memory for sw rings */
> > > > +	void *sw_rings;  /* 64MBs of 64MB aligned memory for sw rings */
> > > > +	rte_iova_t sw_rings_phys;  /* Physical address of sw_rings */
> > > > +	/* Virtual address of the info memory routed to this function
> > > > +	 * under operation, whether it is PF or VF.
> > > > +	 */
> > > > +	union acc100_harq_layout_data *harq_layout;
> > > > +	uint32_t sw_ring_size;
> > > >  	uint32_t ddr_size; /* Size in kB */
> > > > +	uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */
> > > > +	rte_iova_t tail_ptr_phys; /* Physical address of tail pointers */
> > > > +	/* Max number of entries available for each queue in device,
> > > > +	 * depending on how many queues are enabled with configure()
> > > > +	 */
> > > > +	uint32_t sw_ring_max_depth;
> > > > +	struct acc100_conf acc100_conf; /* ACC100 Initial configuration */
> > > > +	/* Bitmap capturing which Queues have already been assigned */
> > > > +	uint16_t q_assigned_bit_map[ACC100_NUM_QGRPS];
> > > >  	bool pf_device; /**< True if this is a PF ACC100 device */
> > > >  	bool configured; /**< True if this ACC100 device is configured */
> > > >  };
> > > > --
> > > > 1.8.3.1



* Re: [dpdk-dev] [PATCH v3 04/11] baseband/acc100: add queue configuration
  2020-09-03 22:48         ` Chautru, Nicolas
@ 2020-09-04  2:01           ` Xu, Rosen
  0 siblings, 0 replies; 213+ messages in thread
From: Xu, Rosen @ 2020-09-04  2:01 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, akhil.goyal; +Cc: Richardson, Bruce

Hi,

> -----Original Message-----
> From: Chautru, Nicolas <nicolas.chautru@intel.com>
> Sent: Friday, September 04, 2020 6:49
> To: Xu, Rosen <rosen.xu@intel.com>; dev@dpdk.org; akhil.goyal@nxp.com
> Cc: Richardson, Bruce <bruce.richardson@intel.com>
> Subject: RE: [dpdk-dev] [PATCH v3 04/11] baseband/acc100: add queue
> configuration
> 
> > From: Xu, Rosen <rosen.xu@intel.com>
> >
> > Hi,
> >
> > > -----Original Message-----
> > > From: Chautru, Nicolas <nicolas.chautru@intel.com>
> > > Sent: Sunday, August 30, 2020 1:48
> > > To: Xu, Rosen <rosen.xu@intel.com>; dev@dpdk.org;
> > > akhil.goyal@nxp.com
> > > Cc: Richardson, Bruce <bruce.richardson@intel.com>
> > > Subject: RE: [dpdk-dev] [PATCH v3 04/11] baseband/acc100: add queue
> > > configuration
> > >
> > > Hi,
> > >
> > > > From: Xu, Rosen <rosen.xu@intel.com>
> > > >
> > > > Hi,
> > > >
> > > > > -----Original Message-----
> > > > > From: dev <dev-bounces@dpdk.org> On Behalf Of Nicolas Chautru
> > > > > Sent: Wednesday, August 19, 2020 8:25
> > > > > To: dev@dpdk.org; akhil.goyal@nxp.com
> > > > > Cc: Richardson, Bruce <bruce.richardson@intel.com>; Chautru,
> > > > > Nicolas <nicolas.chautru@intel.com>
> > > > > Subject: [dpdk-dev] [PATCH v3 04/11] baseband/acc100: add queue
> > > > > configuration
> > > > >
> > > > > Adding function to create and configure queues for the device.
> > > > > Still no capability.
> > > > >
> > > > > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > > > > ---
> > > > >  drivers/baseband/acc100/rte_acc100_pmd.c | 420
> > > > > ++++++++++++++++++++++++++++++-
> > > > > drivers/baseband/acc100/rte_acc100_pmd.h |  45 ++++
> > > > >  2 files changed, 464 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> > > > > b/drivers/baseband/acc100/rte_acc100_pmd.c
> > > > > index 7807a30..7a21c57 100644
> > > > > --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> > > > > +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> > > > > @@ -26,6 +26,22 @@
> > > > >  RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
> > > > > #endif
> > > > >
> > > > > +/* Write to MMIO register address */
> > > > > +static inline void
> > > > > +mmio_write(void *addr, uint32_t value)
> > > > > +{
> > > > > +	*((volatile uint32_t *)(addr)) = rte_cpu_to_le_32(value);
> > > > > +}
> > > > > +
> > > > > +/* Write a register of a ACC100 device */
> > > > > +static inline void
> > > > > +acc100_reg_write(struct acc100_device *d, uint32_t offset, uint32_t payload)
> > > > > +{
> > > > > +	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
> > > > > +	mmio_write(reg_addr, payload);
> > > > > +	usleep(1000);
> > > > > +}
> > > > > +
> > > > >  /* Read a register of a ACC100 device */
> > > > >  static inline uint32_t
> > > > >  acc100_reg_read(struct acc100_device *d, uint32_t offset)
> > > > > @@ -36,6 +52,22 @@
> > > > >  	return rte_le_to_cpu_32(ret);
> > > > >  }
> > > > >
> > > > > +/* Basic Implementation of Log2 for exact 2^N */
> > > > > +static inline uint32_t
> > > > > +log2_basic(uint32_t value)
> > > > > +{
> > > > > +	return (value == 0) ? 0 : __builtin_ctz(value);
> > > > > +}
> > > > > +
> > > > > +/* Calculate memory alignment offset assuming alignment is 2^N */
> > > > > +static inline uint32_t
> > > > > +calc_mem_alignment_offset(void *unaligned_virt_mem, uint32_t alignment)
> > > > > +{
> > > > > +	rte_iova_t unaligned_phy_mem = rte_malloc_virt2iova(unaligned_virt_mem);
> > > > > +	return (uint32_t)(alignment -
> > > > > +			(unaligned_phy_mem & (alignment-1)));
> > > > > +}
> > > > > +
> > > > >  /* Calculate the offset of the enqueue register */  static
> > > > > inline uint32_t queue_offset(bool pf_device, uint8_t vf_id,
> > > > > uint8_t qgrp_id, uint16_t aq_id) @@ -204,10 +236,393 @@
> > > > >  			acc100_conf->q_dl_5g.aq_depth_log2);
> > > > >  }
> > > > >
> > > > > +static void
> > > > > +free_base_addresses(void **base_addrs, int size)
> > > > > +{
> > > > > +	int i;
> > > > > +	for (i = 0; i < size; i++)
> > > > > +		rte_free(base_addrs[i]);
> > > > > +}
> > > > > +
> > > > > +static inline uint32_t
> > > > > +get_desc_len(void)
> > > > > +{
> > > > > +	return sizeof(union acc100_dma_desc);
> > > > > +}
> > > > > +
> > > > > +/* Allocate the 2 * 64MB block for the sw rings */
> > > > > +static int
> > > > > +alloc_2x64mb_sw_rings_mem(struct rte_bbdev *dev, struct acc100_device *d,
> > > > > +		int socket)
> > > > > +{
> > > > > +	uint32_t sw_ring_size = ACC100_SIZE_64MBYTE;
> > > > > +	d->sw_rings_base = rte_zmalloc_socket(dev->device->driver->name,
> > > > > +			2 * sw_ring_size, RTE_CACHE_LINE_SIZE, socket);
> > > > > +	if (d->sw_rings_base == NULL) {
> > > > > +		rte_bbdev_log(ERR, "Failed to allocate memory for %s:%u",
> > > > > +				dev->device->driver->name,
> > > > > +				dev->data->dev_id);
> > > > > +		return -ENOMEM;
> > > > > +	}
> > > > > +	memset(d->sw_rings_base, 0, ACC100_SIZE_64MBYTE);
> > > > > +	uint32_t next_64mb_align_offset = calc_mem_alignment_offset(
> > > > > +			d->sw_rings_base, ACC100_SIZE_64MBYTE);
> > > > > +	d->sw_rings = RTE_PTR_ADD(d->sw_rings_base, next_64mb_align_offset);
> > > > > +	d->sw_rings_phys = rte_malloc_virt2iova(d->sw_rings_base) +
> > > > > +			next_64mb_align_offset;
> > > > > +	d->sw_ring_size = MAX_QUEUE_DEPTH * get_desc_len();
> > > > > +	d->sw_ring_max_depth = d->sw_ring_size / get_desc_len();
> > > > > +
> > > > > +	return 0;
> > > > > +}
> > > >
> > > > Why not a common memory allocation function rather than special
> > > > functions for different memory sizes?
> > >
> > > This is a bit convoluted, but it is due to the fact that the first
> > > method, which is optimal (minimum footprint), may not always find
> > > aligned memory.
> >
> > What's convoluted? Can you explain?
> > For packet processing, in most scenarios, don't we get aligned memory
> > when we allocate it?
> 
> Hi Rosen,
> This is related to both the alignment and the size of the contiguous amount
> of data in pinned-down memory = a 64MB contiguous block aligned on a 64MB
> boundary of the physical address (not linear).
> The first method can potentially fail, hence it is run incrementally, while
> the 2nd version may be used as a safe fallback but is more wasteful in terms
> of footprint (hence not used as the default).
> That is the part I considered "convoluted" in this way of reliably allocating
> memory. It is possible to only use the 2nd version, which would look cleaner
> in terms of code but be more wasteful in memory usage.

As you mentioned, it's not cleaner; looking forward to your next version of the patch.

> 
> 
> > >
> > > >
> > > > > +/* Attempt to allocate minimised memory space for sw rings */
> > > > > +static void alloc_sw_rings_min_mem(struct rte_bbdev *dev, struct acc100_device *d,
> > > > > +		uint16_t num_queues, int socket)
> > > > > +{
> > > > > +	rte_iova_t sw_rings_base_phy, next_64mb_align_addr_phy;
> > > > > +	uint32_t next_64mb_align_offset;
> > > > > +	rte_iova_t sw_ring_phys_end_addr;
> > > > > +	void *base_addrs[SW_RING_MEM_ALLOC_ATTEMPTS];
> > > > > +	void *sw_rings_base;
> > > > > +	int i = 0;
> > > > > +	uint32_t q_sw_ring_size = MAX_QUEUE_DEPTH * get_desc_len();
> > > > > +	uint32_t dev_sw_ring_size = q_sw_ring_size * num_queues;
> > > > > +
> > > > > +	/* Find an aligned block of memory to store sw rings */
> > > > > +	while (i < SW_RING_MEM_ALLOC_ATTEMPTS) {
> > > > > +		/*
> > > > > +		 * sw_ring allocated memory is guaranteed to be aligned to
> > > > > +		 * q_sw_ring_size at the condition that the requested size is
> > > > > +		 * less than the page size
> > > > > +		 */
> > > > > +		sw_rings_base = rte_zmalloc_socket(
> > > > > +				dev->device->driver->name,
> > > > > +				dev_sw_ring_size, q_sw_ring_size, socket);
> > > > > +
> > > > > +		if (sw_rings_base == NULL) {
> > > > > +			rte_bbdev_log(ERR,
> > > > > +					"Failed to allocate memory for %s:%u",
> > > > > +					dev->device->driver->name,
> > > > > +					dev->data->dev_id);
> > > > > +			break;
> > > > > +		}
> > > > > +
> > > > > +		sw_rings_base_phy = rte_malloc_virt2iova(sw_rings_base);
> > > > > +		next_64mb_align_offset = calc_mem_alignment_offset(
> > > > > +				sw_rings_base, ACC100_SIZE_64MBYTE);
> > > > > +		next_64mb_align_addr_phy = sw_rings_base_phy +
> > > > > +				next_64mb_align_offset;
> > > > > +		sw_ring_phys_end_addr = sw_rings_base_phy + dev_sw_ring_size;
> > > > > +
> > > > > +		/* Check if the end of the sw ring memory block is before the
> > > > > +		 * start of next 64MB aligned mem address
> > > > > +		 */
> > > > > +		if (sw_ring_phys_end_addr < next_64mb_align_addr_phy) {
> > > > > +			d->sw_rings_phys = sw_rings_base_phy;
> > > > > +			d->sw_rings = sw_rings_base;
> > > > > +			d->sw_rings_base = sw_rings_base;
> > > > > +			d->sw_ring_size = q_sw_ring_size;
> > > > > +			d->sw_ring_max_depth = MAX_QUEUE_DEPTH;
> > > > > +			break;
> > > > > +		}
> > > > > +		/* Store the address of the unaligned mem block */
> > > > > +		base_addrs[i] = sw_rings_base;
> > > > > +		i++;
> > > > > +	}
> > > > > +
> > > > > +	/* Free all unaligned blocks of mem allocated in the loop */
> > > > > +	free_base_addresses(base_addrs, i);
> > > > > +}
> > > >
> > > > It's strange to first allocate memory and then free it without any
> > > > operations on this memory.
> > >
> > > I may be missing your point. We are freeing the exact same memory we got
> > > from rte_zmalloc.
> > > Note that the base_addrs array refers to multiple attempts at mallocs,
> > > not multiple operations on a ring.
> >
> > You allocate memory sw_rings_base, and after some translation, assign this
> > memory to acc100_device *d; yet before the function returns, this memory
> > has been freed.
> 
> If you follow the logic, this actually only frees the memory from the
> attempts which were not successfully aligned, not the one which ends up
> actually being used for the sw rings.
> The actual memory for the sw rings is obviously used and gets freed when
> closing the device below, i.e. rte_free(d->sw_rings_base);
> Let me know if this is unclear. I could add more comments if it is not
> obvious from the code, e.g. /* Free all _unaligned_ blocks of mem allocated in the loop */
> 
> Thanks for your review. I can see how it can look a bit odd initially.

Please make sure your code works well in each branch.

> >
> > > >
> > > > > +
> > > > > +/* Allocate 64MB memory used for all software rings */
> > > > > +static int
> > > > > +acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
> > > > > +{
> > > > > +	uint32_t phys_low, phys_high, payload;
> > > > > +	struct acc100_device *d = dev->data->dev_private;
> > > > > +	const struct acc100_registry_addr *reg_addr;
> > > > > +
> > > > > +	if (d->pf_device && !d->acc100_conf.pf_mode_en) {
> > > > > +		rte_bbdev_log(NOTICE,
> > > > > +				"%s has PF mode disabled. This PF can't be used.",
> > > > > +				dev->data->name);
> > > > > +		return -ENODEV;
> > > > > +	}
> > > > > +
> > > > > +	alloc_sw_rings_min_mem(dev, d, num_queues, socket_id);
> > > > > +
> > > > > +	/* If minimal memory space approach failed, then allocate
> > > > > +	 * the 2 * 64MB block for the sw rings
> > > > > +	 */
> > > > > +	if (d->sw_rings == NULL)
> > > > > +		alloc_2x64mb_sw_rings_mem(dev, d, socket_id);
> > > > > +
> > > > > +	/* Configure ACC100 with the base address for DMA descriptor rings
> > > > > +	 * Same descriptor rings used for UL and DL DMA Engines
> > > > > +	 * Note : Assuming only VF0 bundle is used for PF mode
> > > > > +	 */
> > > > > +	phys_high = (uint32_t)(d->sw_rings_phys >> 32);
> > > > > +	phys_low  = (uint32_t)(d->sw_rings_phys & ~(ACC100_SIZE_64MBYTE-1));
> > > > > +
> > > > > +	/* Choose correct registry addresses for the device type */
> > > > > +	if (d->pf_device)
> > > > > +		reg_addr = &pf_reg_addr;
> > > > > +	else
> > > > > +		reg_addr = &vf_reg_addr;
> > > > > +
> > > > > +	/* Read the populated cfg from ACC100 registers */
> > > > > +	fetch_acc100_config(dev);
> > > > > +
> > > > > +	/* Mark as configured properly */
> > > > > +	d->configured = true;
> > > > > +
> > > > > +	/* Release AXI from PF */
> > > > > +	if (d->pf_device)
> > > > > +		acc100_reg_write(d, HWPfDmaAxiControl, 1);
> > > > > +
> > > > > +	acc100_reg_write(d, reg_addr->dma_ring_ul5g_hi, phys_high);
> > > > > +	acc100_reg_write(d, reg_addr->dma_ring_ul5g_lo, phys_low);
> > > > > +	acc100_reg_write(d, reg_addr->dma_ring_dl5g_hi, phys_high);
> > > > > +	acc100_reg_write(d, reg_addr->dma_ring_dl5g_lo, phys_low);
> > > > > +	acc100_reg_write(d, reg_addr->dma_ring_ul4g_hi, phys_high);
> > > > > +	acc100_reg_write(d, reg_addr->dma_ring_ul4g_lo, phys_low);
> > > > > +	acc100_reg_write(d, reg_addr->dma_ring_dl4g_hi, phys_high);
> > > > > +	acc100_reg_write(d, reg_addr->dma_ring_dl4g_lo, phys_low);
> > > > > +
> > > > > +	/*
> > > > > +	 * Configure Ring Size to the max queue ring size
> > > > > +	 * (used for wrapping purpose)
> > > > > +	 */
> > > > > +	payload = log2_basic(d->sw_ring_size / 64);
> > > > > +	acc100_reg_write(d, reg_addr->ring_size, payload);
> > > > > +
> > > > > +	/* Configure tail pointer for use when SDONE enabled */
> > > > > +	d->tail_ptrs = rte_zmalloc_socket(
> > > > > +			dev->device->driver->name,
> > > > > +			ACC100_NUM_QGRPS * ACC100_NUM_AQS * sizeof(uint32_t),
> > > > > +			RTE_CACHE_LINE_SIZE, socket_id);
> > > > > +	if (d->tail_ptrs == NULL) {
> > > > > +		rte_bbdev_log(ERR, "Failed to allocate tail ptr for %s:%u",
> > > > > +				dev->device->driver->name,
> > > > > +				dev->data->dev_id);
> > > > > +		rte_free(d->sw_rings);
> > > > > +		return -ENOMEM;
> > > > > +	}
> > > > > +	d->tail_ptr_phys = rte_malloc_virt2iova(d->tail_ptrs);
> > > > > +
> > > > > +	phys_high = (uint32_t)(d->tail_ptr_phys >> 32);
> > > > > +	phys_low  = (uint32_t)(d->tail_ptr_phys);
> > > > > +	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_hi, phys_high);
> > > > > +	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_lo, phys_low);
> > > > > +	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_hi, phys_high);
> > > > > +	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_lo, phys_low);
> > > > > +	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_hi, phys_high);
> > > > > +	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_lo, phys_low);
> > > > > +	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
> > > > > +	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
> > > > > +
> > > > > +	d->harq_layout = rte_zmalloc_socket("HARQ Layout",
> > > > > +			ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
> > > > > +			RTE_CACHE_LINE_SIZE, dev->data->socket_id);
> > > > > +
> > > > > +	rte_bbdev_log_debug(
> > > > > +			"ACC100 (%s) configured  sw_rings = %p, sw_rings_phys = %#"
> > > > > +			PRIx64, dev->data->name, d->sw_rings, d->sw_rings_phys);
> > > > > +
> > > > > +	return 0;
> > > > > +}
> > > > > +
> > > > >  /* Free 64MB memory used for software rings */  static int -
> > > > > acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
> > > > > +acc100_dev_close(struct rte_bbdev *dev)
> > > > >  {
> > > > > +	struct acc100_device *d = dev->data->dev_private;
> > > > > +	if (d->sw_rings_base != NULL) {
> > > > > +		rte_free(d->tail_ptrs);
> > > > > +		rte_free(d->sw_rings_base);
> > > > > +		d->sw_rings_base = NULL;
> > > > > +	}
> > > > > +	usleep(1000);
> > > > > +	return 0;
> > > > > +}
> > > > > +
> > > > > +
> > > > > +/**
> > > > > + * Report a ACC100 queue index which is free
> > > > > + * Return 0 to 16k for a valid queue_idx or -1 when no queue is
> > > > > +available
> > > > > + * Note : Only supporting VF0 Bundle for PF mode  */ static int
> > > > > +acc100_find_free_queue_idx(struct rte_bbdev *dev,
> > > > > +		const struct rte_bbdev_queue_conf *conf) {
> > > > > +	struct acc100_device *d = dev->data->dev_private;
> > > > > +	int op_2_acc[5] = {0, UL_4G, DL_4G, UL_5G, DL_5G};
> > > > > +	int acc = op_2_acc[conf->op_type];
> > > > > +	struct rte_q_topology_t *qtop = NULL;
> > > > > +	qtopFromAcc(&qtop, acc, &(d->acc100_conf));
> > > > > +	if (qtop == NULL)
> > > > > +		return -1;
> > > > > +	/* Identify matching QGroup Index which are sorted in
> priority
> > > > > +order
> > > > > */
> > > > > +	uint16_t group_idx = qtop->first_qgroup_index;
> > > > > +	group_idx += conf->priority;
> > > > > +	if (group_idx >= ACC100_NUM_QGRPS ||
> > > > > +			conf->priority >= qtop->num_qgroups) {
> > > > > +		rte_bbdev_log(INFO, "Invalid Priority on %s,
> priority %u",
> > > > > +				dev->data->name, conf->priority);
> > > > > +		return -1;
> > > > > +	}
> > > > > +	/* Find a free AQ_idx  */
> > > > > +	uint16_t aq_idx;
> > > > > +	for (aq_idx = 0; aq_idx < qtop->num_aqs_per_groups;
> aq_idx++) {
> > > > > +		if (((d->q_assigned_bit_map[group_idx] >> aq_idx)
> & 0x1)
> > > > > == 0) {
> > > > > +			/* Mark the Queue as assigned */
> > > > > +			d->q_assigned_bit_map[group_idx] |= (1 <<
> aq_idx);
> > > > > +			/* Report the AQ Index */
> > > > > +			return (group_idx << GRP_ID_SHIFT) +
> aq_idx;
> > > > > +		}
> > > > > +	}
> > > > > +	rte_bbdev_log(INFO, "Failed to find free queue on %s,
> priority %u",
> > > > > +			dev->data->name, conf->priority);
> > > > > +	return -1;
> > > > > +}
> > > > > +
> > > > > +/* Setup ACC100 queue */
> > > > > +static int
> > > > > +acc100_queue_setup(struct rte_bbdev *dev, uint16_t queue_id,
> > > > > +		const struct rte_bbdev_queue_conf *conf) {
> > > > > +	struct acc100_device *d = dev->data->dev_private;
> > > > > +	struct acc100_queue *q;
> > > > > +	int16_t q_idx;
> > > > > +
> > > > > +	/* Allocate the queue data structure. */
> > > > > +	q = rte_zmalloc_socket(dev->device->driver->name,
> sizeof(*q),
> > > > > +			RTE_CACHE_LINE_SIZE, conf->socket);
> > > > > +	if (q == NULL) {
> > > > > +		rte_bbdev_log(ERR, "Failed to allocate queue
> memory");
> > > > > +		return -ENOMEM;
> > > > > +	}
> > > > > +
> > > > > +	q->d = d;
> > > > > +	q->ring_addr = RTE_PTR_ADD(d->sw_rings, (d->sw_ring_size
> *
> > > > > queue_id));
> > > > > +	q->ring_addr_phys = d->sw_rings_phys + (d->sw_ring_size *
> > > > > queue_id);
> > > > > +
> > > > > +	/* Prepare the Ring with default descriptor format */
> > > > > +	union acc100_dma_desc *desc = NULL;
> > > > > +	unsigned int desc_idx, b_idx;
> > > > > +	int fcw_len = (conf->op_type == RTE_BBDEV_OP_LDPC_ENC ?
> > > > > +		ACC100_FCW_LE_BLEN : (conf->op_type ==
> > > > > RTE_BBDEV_OP_TURBO_DEC ?
> > > > > +		ACC100_FCW_TD_BLEN : ACC100_FCW_LD_BLEN));
> > > > > +
> > > > > +	for (desc_idx = 0; desc_idx < d->sw_ring_max_depth;
> desc_idx++) {
> > > > > +		desc = q->ring_addr + desc_idx;
> > > > > +		desc->req.word0 = ACC100_DMA_DESC_TYPE;
> > > > > +		desc->req.word1 = 0; /**< Timestamp */
> > > > > +		desc->req.word2 = 0;
> > > > > +		desc->req.word3 = 0;
> > > > > +		uint64_t fcw_offset = (desc_idx << 8) +
> > > > > ACC100_DESC_FCW_OFFSET;
> > > > > +		desc->req.data_ptrs[0].address = q->ring_addr_phys
> +
> > > > > fcw_offset;
> > > > > +		desc->req.data_ptrs[0].blen = fcw_len;
> > > > > +		desc->req.data_ptrs[0].blkid =
> ACC100_DMA_BLKID_FCW;
> > > > > +		desc->req.data_ptrs[0].last = 0;
> > > > > +		desc->req.data_ptrs[0].dma_ext = 0;
> > > > > +		for (b_idx = 1; b_idx <
> ACC100_DMA_MAX_NUM_POINTERS
> > > > > - 1;
> > > > > +				b_idx++) {
> > > > > +			desc->req.data_ptrs[b_idx].blkid =
> > > > > ACC100_DMA_BLKID_IN;
> > > > > +			desc->req.data_ptrs[b_idx].last = 1;
> > > > > +			desc->req.data_ptrs[b_idx].dma_ext = 0;
> > > > > +			b_idx++;
> > > > > +			desc->req.data_ptrs[b_idx].blkid =
> > > > > +
> 	ACC100_DMA_BLKID_OUT_ENC;
> > > > > +			desc->req.data_ptrs[b_idx].last = 1;
> > > > > +			desc->req.data_ptrs[b_idx].dma_ext = 0;
> > > > > +		}
> > > > > +		/* Preset some fields of LDPC FCW */
> > > > > +		desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
> > > > > +		desc->req.fcw_ld.gain_i = 1;
> > > > > +		desc->req.fcw_ld.gain_h = 1;
> > > > > +	}
> > > > > +
> > > > > +	q->lb_in = rte_zmalloc_socket(dev->device->driver->name,
> > > > > +			RTE_CACHE_LINE_SIZE,
> > > > > +			RTE_CACHE_LINE_SIZE, conf->socket);
> > > > > +	if (q->lb_in == NULL) {
> > > > > +		rte_bbdev_log(ERR, "Failed to allocate lb_in
> memory");
> > > > > +		return -ENOMEM;
> > > > > +	}
> > > > > +	q->lb_in_addr_phys = rte_malloc_virt2iova(q->lb_in);
> > > > > +	q->lb_out = rte_zmalloc_socket(dev->device->driver->name,
> > > > > +			RTE_CACHE_LINE_SIZE,
> > > > > +			RTE_CACHE_LINE_SIZE, conf->socket);
> > > > > +	if (q->lb_out == NULL) {
> > > > > +		rte_bbdev_log(ERR, "Failed to allocate lb_out
> memory");
> > > > > +		return -ENOMEM;
> > > > > +	}
> > > > > +	q->lb_out_addr_phys = rte_malloc_virt2iova(q->lb_out);
> > > > > +
> > > > > +	/*
> > > > > +	 * Software queue ring wraps synchronously with the HW
> when it
> > > > > reaches
> > > > > +	 * the boundary of the maximum allocated queue size, no
> matter
> > > > > what the
> > > > > +	 * sw queue size is. This wrapping is guarded by setting the
> > > > > wrap_mask
> > > > > +	 * to represent the maximum queue size as allocated at the
> time
> > > > > when
> > > > > +	 * the device has been setup (in configure()).
> > > > > +	 *
> > > > > +	 * The queue depth is set to the queue size value (conf-
> > > > > >queue_size).
> > > > > +	 * This limits the occupancy of the queue at any point of time,
> so that
> > > > > +	 * the queue does not get swamped with enqueue requests.
> > > > > +	 */
> > > > > +	q->sw_ring_depth = conf->queue_size;
> > > > > +	q->sw_ring_wrap_mask = d->sw_ring_max_depth - 1;
> > > > > +
> > > > > +	q->op_type = conf->op_type;
> > > > > +
> > > > > +	q_idx = acc100_find_free_queue_idx(dev, conf);
> > > > > +	if (q_idx == -1) {
> > > > > +		rte_free(q);
> > > > > +		return -1;
> > > > > +	}
> > > > > +
> > > > > +	q->qgrp_id = (q_idx >> GRP_ID_SHIFT) & 0xF;
> > > > > +	q->vf_id = (q_idx >> VF_ID_SHIFT)  & 0x3F;
> > > > > +	q->aq_id = q_idx & 0xF;
> > > > > +	q->aq_depth = (conf->op_type == RTE_BBDEV_OP_TURBO_DEC) ?
> > > > > +			(1 << d->acc100_conf.q_ul_4g.aq_depth_log2) :
> > > > > +			(1 << d->acc100_conf.q_dl_4g.aq_depth_log2);
> > > > > +
> > > > > +	q->mmio_reg_enqueue = RTE_PTR_ADD(d->mmio_base,
> > > > > +			queue_offset(d->pf_device,
> > > > > +					q->vf_id, q->qgrp_id, q->aq_id));
> > > > > +
> > > > > +	rte_bbdev_log_debug(
> > > > > +			"Setup dev%u q%u: qgrp_id=%u, vf_id=%u, aq_id=%u, aq_depth=%u, mmio_reg_enqueue=%p",
> > > > > +			dev->data->dev_id, queue_id, q->qgrp_id, q->vf_id,
> > > > > +			q->aq_id, q->aq_depth, q->mmio_reg_enqueue);
> > > > > +
> > > > > +	dev->data->queues[queue_id].queue_private = q;
> > > > > +	return 0;
> > > > > +}
> > > > > +
> > > > > +/* Release ACC100 queue */
> > > > > +static int
> > > > > +acc100_queue_release(struct rte_bbdev *dev, uint16_t q_id) {
> > > > > +	struct acc100_device *d = dev->data->dev_private;
> > > > > +	struct acc100_queue *q = dev->data->queues[q_id].queue_private;
> > > > > +
> > > > > +	if (q != NULL) {
> > > > > +		/* Mark the Queue as un-assigned */
> > > > > +		d->q_assigned_bit_map[q->qgrp_id] &= (0xFFFFFFFF -
> > > > > +				(1 << q->aq_id));
> > > > > +		rte_free(q->lb_in);
> > > > > +		rte_free(q->lb_out);
> > > > > +		rte_free(q);
> > > > > +		dev->data->queues[q_id].queue_private = NULL;
> > > > > +	}
> > > > > +
> > > > >  	return 0;
> > > > >  }
> > > > >
> > > > > @@ -258,8 +673,11 @@
> > > > >  }
> > > > >
> > > > >  static const struct rte_bbdev_ops acc100_bbdev_ops = {
> > > > > +	.setup_queues = acc100_setup_queues,
> > > > >  	.close = acc100_dev_close,
> > > > >  	.info_get = acc100_dev_info_get,
> > > > > +	.queue_setup = acc100_queue_setup,
> > > > > +	.queue_release = acc100_queue_release,
> > > > >  };
> > > > >
> > > > >  /* ACC100 PCI PF address map */
> > > > > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h
> > > > > b/drivers/baseband/acc100/rte_acc100_pmd.h
> > > > > index 662e2c8..0e2b79c 100644
> > > > > --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> > > > > +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> > > > > @@ -518,11 +518,56 @@ struct acc100_registry_addr {
> > > > >  	.ddr_range = HWVfDmaDdrBaseRangeRoVf,
> > > > >  };
> > > > >
> > > > > +/* Structure associated with each queue. */
> > > > > +struct __rte_cache_aligned acc100_queue {
> > > > > +	union acc100_dma_desc *ring_addr;  /* Virtual address of sw ring */
> > > > > +	rte_iova_t ring_addr_phys;  /* Physical address of software ring */
> > > > > +	uint32_t sw_ring_head;  /* software ring head */
> > > > > +	uint32_t sw_ring_tail;  /* software ring tail */
> > > > > +	/* software ring size (descriptors, not bytes) */
> > > > > +	uint32_t sw_ring_depth;
> > > > > +	/* mask used to wrap enqueued descriptors on the sw ring */
> > > > > +	uint32_t sw_ring_wrap_mask;
> > > > > +	/* MMIO register used to enqueue descriptors */
> > > > > +	void *mmio_reg_enqueue;
> > > > > +	uint8_t vf_id;  /* VF ID (max = 63) */
> > > > > +	uint8_t qgrp_id;  /* Queue Group ID */
> > > > > +	uint16_t aq_id;  /* Atomic Queue ID */
> > > > > +	uint16_t aq_depth;  /* Depth of atomic queue */
> > > > > +	uint32_t aq_enqueued;  /* Count how many "batches" have been enqueued */
> > > > > +	uint32_t aq_dequeued;  /* Count how many "batches" have been dequeued */
> > > > > +	uint32_t irq_enable;  /* Enable ops dequeue interrupts if set to 1 */
> > > > > +	struct rte_mempool *fcw_mempool;  /* FCW mempool */
> > > > > +	enum rte_bbdev_op_type op_type;  /* Type of this Queue: TE or TD */
> > > > > +	/* Internal Buffers for loopback input */
> > > > > +	uint8_t *lb_in;
> > > > > +	uint8_t *lb_out;
> > > > > +	rte_iova_t lb_in_addr_phys;
> > > > > +	rte_iova_t lb_out_addr_phys;
> > > > > +	struct acc100_device *d;
> > > > > +};
> > > > > +
> > > > >  /* Private data structure for each ACC100 device */
> > > > >  struct acc100_device {
> > > > >  	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
> > > > > +	void *sw_rings_base;  /* Base addr of un-aligned memory for sw rings */
> > > > > +	void *sw_rings;  /* 64MBs of 64MB aligned memory for sw rings */
> > > > > +	rte_iova_t sw_rings_phys;  /* Physical address of sw_rings */
> > > > > +	/* Virtual address of the info memory routed to this function
> > > > > +	 * under operation, whether it is PF or VF.
> > > > > +	 */
> > > > > +	union acc100_harq_layout_data *harq_layout;
> > > > > +	uint32_t sw_ring_size;
> > > > >  	uint32_t ddr_size; /* Size in kB */
> > > > > +	uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */
> > > > > +	rte_iova_t tail_ptr_phys; /* Physical address of tail pointers */
> > > > > +	/* Max number of entries available for each queue in device, depending
> > > > > +	 * on how many queues are enabled with configure()
> > > > > +	 */
> > > > > +	uint32_t sw_ring_max_depth;
> > > > >  	struct acc100_conf acc100_conf; /* ACC100 Initial configuration */
> > > > > +	/* Bitmap capturing which Queues have already been assigned */
> > > > > +	uint16_t q_assigned_bit_map[ACC100_NUM_QGRPS];
> > > > >  	bool pf_device; /**< True if this is a PF ACC100 device */
> > > > >  	bool configured; /**< True if this ACC100 device is configured */
> > > > >  };
> > > > > --
> > > > > 1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v3 01/11] drivers/baseband: add PMD for ACC100
  2020-08-29  9:44   ` Xu, Rosen
@ 2020-09-04 16:44     ` Chautru, Nicolas
  0 siblings, 0 replies; 213+ messages in thread
From: Chautru, Nicolas @ 2020-09-04 16:44 UTC (permalink / raw)
  To: Xu, Rosen, dev, akhil.goyal; +Cc: Richardson, Bruce

Hi, 

> -----Original Message-----
> From: Xu, Rosen <rosen.xu@intel.com>
> 
> Hi,
> 
> > -----Original Message-----
> > From: dev <dev-bounces@dpdk.org> On Behalf Of Nicolas Chautru
> > Sent: Wednesday, August 19, 2020 8:25
> > To: dev@dpdk.org; akhil.goyal@nxp.com
> > Cc: Richardson, Bruce <bruce.richardson@intel.com>; Chautru, Nicolas
> > <nicolas.chautru@intel.com>
> > Subject: [dpdk-dev] [PATCH v3 01/11] drivers/baseband: add PMD for
> > ACC100
> >
> > Add stubs for the ACC100 PMD
> >
> > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > ---
> >  config/common_base                                 |   4 +
> >  doc/guides/bbdevs/acc100.rst                       | 233 +++++++++++++++++++++
> >  doc/guides/bbdevs/index.rst                        |   1 +
> >  doc/guides/rel_notes/release_20_11.rst             |   6 +
> >  drivers/baseband/Makefile                          |   2 +
> >  drivers/baseband/acc100/Makefile                   |  25 +++
> >  drivers/baseband/acc100/meson.build                |   6 +
> >  drivers/baseband/acc100/rte_acc100_pmd.c           | 175 ++++++++++++++++
> >  drivers/baseband/acc100/rte_acc100_pmd.h           |  37 ++++
> >  .../acc100/rte_pmd_bbdev_acc100_version.map        |   3 +
> >  drivers/baseband/meson.build                       |   2 +-
> >  mk/rte.app.mk                                      |   1 +
> >  12 files changed, 494 insertions(+), 1 deletion(-)  create mode
> > 100644 doc/guides/bbdevs/acc100.rst  create mode 100644
> > drivers/baseband/acc100/Makefile  create mode 100644
> > drivers/baseband/acc100/meson.build
> >  create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.c
> >  create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.h
> >  create mode 100644
> > drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> >
> > diff --git a/config/common_base b/config/common_base
> > index fbf0ee7..218ab16 100644
> > --- a/config/common_base
> > +++ b/config/common_base
> > @@ -584,6 +584,10 @@ CONFIG_RTE_LIBRTE_PMD_BBDEV_NULL=y
> >  #
> >  CONFIG_RTE_LIBRTE_PMD_BBDEV_TURBO_SW=y
> >
> > +# Compile PMD for ACC100 bbdev device #
> > +CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100=y
> > +
> >  #
> >  # Compile PMD for Intel FPGA LTE FEC bbdev device
> >  #
> > diff --git a/doc/guides/bbdevs/acc100.rst b/doc/guides/bbdevs/acc100.rst
> > new file mode 100644
> > index 0000000..f87ee09
> > --- /dev/null
> > +++ b/doc/guides/bbdevs/acc100.rst
> > @@ -0,0 +1,233 @@
> > +..  SPDX-License-Identifier: BSD-3-Clause
> > +    Copyright(c) 2020 Intel Corporation
> > +
> > +Intel(R) ACC100 5G/4G FEC Poll Mode Driver
> > +==========================================
> > +
> > +The BBDEV ACC100 5G/4G FEC poll mode driver (PMD) supports an
> > +implementation of a VRAN FEC wireless acceleration function.
> > +This device is also known as Mount Bryce.
> > +
> > +Features
> > +--------
> > +
> > +ACC100 5G/4G FEC PMD supports the following features:
> > +
> > +- LDPC Encode in the DL (5GNR)
> > +- LDPC Decode in the UL (5GNR)
> > +- Turbo Encode in the DL (4G)
> > +- Turbo Decode in the UL (4G)
> > +- 16 VFs per PF (physical device)
> > +- Maximum of 128 queues per VF
> > +- PCIe Gen-3 x16 Interface
> > +- MSI
> > +- SR-IOV
> > +
> > +ACC100 5G/4G FEC PMD supports the following BBDEV capabilities:
> > +
> > +* For the LDPC encode operation:
> > +   - ``RTE_BBDEV_LDPC_CRC_24B_ATTACH`` :  set to attach CRC24B to CB(s)
> > +   - ``RTE_BBDEV_LDPC_RATE_MATCH`` :  if set then do not do Rate Match bypass
> > +   - ``RTE_BBDEV_LDPC_INTERLEAVER_BYPASS`` : if set then bypass interleaver
> > +
> > +* For the LDPC decode operation:
> > +   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK`` :  check CRC24B from CB(s)
> > +   - ``RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE`` :  disable early termination
> > +   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP`` :  drops CRC24B bits appended while decoding
> > +   - ``RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE`` :  provides an input for HARQ combining
> > +   - ``RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE`` :  provides an output for HARQ combining
> > +   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE`` :  HARQ memory input is internal
> > +   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE`` :  HARQ memory output is internal
> > +   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK`` :  loopback data to/from HARQ memory
> > +   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_FILLERS`` :  HARQ memory includes the filler bits
> > +   - ``RTE_BBDEV_LDPC_DEC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
> > +   - ``RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION`` :  supports compression of the HARQ input/output
> > +   - ``RTE_BBDEV_LDPC_LLR_COMPRESSION`` :  supports LLR input compression
> > +
> > +* For the turbo encode operation:
> > +   - ``RTE_BBDEV_TURBO_CRC_24B_ATTACH`` :  set to attach CRC24B to CB(s)
> > +   - ``RTE_BBDEV_TURBO_RATE_MATCH`` :  if set then do not do Rate Match bypass
> > +   - ``RTE_BBDEV_TURBO_ENC_INTERRUPTS`` :  set for encoder dequeue interrupts
> > +   - ``RTE_BBDEV_TURBO_RV_INDEX_BYPASS`` :  set to bypass RV index
> > +   - ``RTE_BBDEV_TURBO_ENC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
> > +
> > +* For the turbo decode operation:
> > +   - ``RTE_BBDEV_TURBO_CRC_TYPE_24B`` :  check CRC24B from CB(s)
> > +   - ``RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE`` :  perform subblock de-interleave
> > +   - ``RTE_BBDEV_TURBO_DEC_INTERRUPTS`` :  set for decoder dequeue interrupts
> > +   - ``RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN`` :  set if negative LLR encoder i/p is supported
> > +   - ``RTE_BBDEV_TURBO_POS_LLR_1_BIT_IN`` :  set if positive LLR encoder i/p is supported
> > +   - ``RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP`` :  keep CRC24B bits appended while decoding
> > +   - ``RTE_BBDEV_TURBO_EARLY_TERMINATION`` :  set the early termination feature
> > +   - ``RTE_BBDEV_TURBO_DEC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
> > +   - ``RTE_BBDEV_TURBO_HALF_ITERATION_EVEN`` :  set half iteration granularity
> > +
> > +Installation
> > +------------
> > +
> > +Section 3 of the DPDK manual provides instructions on installing and
> > +compiling DPDK. The default set of bbdev compile flags may be found
> > +in config/common_base, where for example the flag to build the ACC100
> > +5G/4G FEC device, ``CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100``, is already set.
> > +
> > +DPDK requires hugepages to be configured as detailed in section 2 of
> > +the DPDK manual.
> > +The bbdev test application has been tested with a configuration of 40 x
> > +1GB hugepages. The hugepage configuration of a server may be examined using:
> > +
> > +.. code-block:: console
> > +
> > +   grep Huge* /proc/meminfo
> > +
> > +
> > +Initialization
> > +--------------
> > +
> > +When the device first powers up, its PCI Physical Functions (PF) can
> > +be listed through this command:
> > +
> > +.. code-block:: console
> > +
> > +  sudo lspci -vd8086:0d5c
> > +
> > +The physical and virtual functions are compatible with Linux UIO drivers:
> > +``vfio`` and ``igb_uio``. However, in order to work, the ACC100 5G/4G
> > +FEC device first needs to be bound to one of these Linux drivers through DPDK.
> > +
> > +
> > +Bind PF UIO driver(s)
> > +~~~~~~~~~~~~~~~~~~~~~
> > +
> > +Install the DPDK igb_uio driver, bind it with the PF PCI device ID
> > +and use ``lspci`` to confirm the PF device is in use by the
> > +``igb_uio`` DPDK UIO driver.
> > +
> > +The igb_uio driver may be bound to the PF PCI device using one of
> > +three methods:
> > +
> > +
> > +1. PCI functions (physical or virtual, depending on the use case) can
> > +be bound to the UIO driver by repeating this command for every function.
> > +
> > +.. code-block:: console
> > +
> > +  cd <dpdk-top-level-directory>
> > +  insmod ./build/kmod/igb_uio.ko
> > +  echo "8086 0d5c" > /sys/bus/pci/drivers/igb_uio/new_id
> > +  lspci -vd8086:0d5c
> > +
> > +
> > +2. Another way to bind PF with DPDK UIO driver is by using the
> > +``dpdk-devbind.py`` tool
> > +
> > +.. code-block:: console
> > +
> > +  cd <dpdk-top-level-directory>
> > +  ./usertools/dpdk-devbind.py -b igb_uio 0000:06:00.0
> > +
> > +where the PCI device ID (example: 0000:06:00.0) is obtained using
> > +lspci -vd8086:0d5c
> > +
> > +
> > +3. A third way to bind is to use ``dpdk-setup.sh`` tool
> > +
> > +.. code-block:: console
> > +
> > +  cd <dpdk-top-level-directory>
> > +  ./usertools/dpdk-setup.sh
> > +
> > +  select 'Bind Ethernet/Crypto/Baseband device to IGB UIO module'
> > +  or
> > +  select 'Bind Ethernet/Crypto/Baseband device to VFIO module' depending on driver required
> > +  enter PCI device ID
> > +  select 'Display current Ethernet/Crypto/Baseband device settings' to confirm binding
> > +
> > +
> > +In the same way the ACC100 5G/4G FEC PF can be bound with vfio, but
> > +the vfio driver does not support SR-IOV configuration right out of
> > +the box, so it will need to be patched.
> > +
> > +
> > +Enable Virtual Functions
> > +~~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +Now, it should be visible in the printouts that the PCI PF is under
> > +igb_uio control: "``Kernel driver in use: igb_uio``"
> > +
> > +To show the number of available VFs on the device, read the
> > +``sriov_totalvfs`` file:
> > +
> > +.. code-block:: console
> > +
> > +  cat /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_totalvfs
> > +
> > +  where 0000\:<b>\:<d>.<f> is the PCI device ID
> > +
> > +
> > +To enable VFs via igb_uio, echo the number of virtual functions
> > +intended to be enabled to the ``max_vfs`` file:
> > +
> > +.. code-block:: console
> > +
> > +  echo <num-of-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/max_vfs
> > +
> > +
> > +Afterwards, all VFs must be bound to appropriate UIO drivers as
> > +required, in the same way as was done for the physical function previously.
> > +
> > +Enabling SR-IOV via the vfio driver is pretty much the same, except
> > +that the file name is different:
> > +
> > +.. code-block:: console
> > +
> > +  echo <num-of-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_numvfs
> > +
> > +
> > +Configure the VFs through PF
> > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +The PCI virtual functions must be configured before use or before
> > +being assigned to VMs/Containers. The configuration involves
> > +allocating the number of hardware queues, priorities, load balance,
> > +bandwidth and other settings necessary for the device to perform FEC
> > +functions.
> > +
> > +This configuration needs to be executed at least once after reboot or
> > +PCI FLR and can be achieved by using the function
> > +``acc100_configure()``, which sets up the parameters defined in the
> > +``acc100_conf`` structure.
> > +
> > +Test Application
> > +----------------
> > +
> > +BBDEV provides a test application, ``test-bbdev.py``, and a range of
> > +test data for testing the functionality of ACC100 5G/4G FEC encode
> > +and decode, depending on the device's capabilities. The test
> > +application is located under the app/test-bbdev folder and has the
> > +following options:
> > +
> > +.. code-block:: console
> > +
> > +  "-p", "--testapp-path": specifies path to the bbdev test app.
> > +  "-e", "--eal-params"	: EAL arguments which are passed to the test app.
> > +  "-t", "--timeout"	: Timeout in seconds (default=300).
> > +  "-c", "--test-cases"	: Defines test cases to run. Run all if not specified.
> > +  "-v", "--test-vector"	: Test vector path (default=dpdk_path+/app/test-
> > bbdev/test_vectors/bbdev_null.data).
> > +  "-n", "--num-ops"	: Number of operations to process on device
> > (default=32).
> > +  "-b", "--burst-size"	: Operations enqueue/dequeue burst size
> > (default=32).
> > +  "-s", "--snr"		: SNR in dB used when generating LLRs for bler tests.
> > +  "-s", "--iter_max"	: Number of iterations for LDPC decoder.
> > +  "-l", "--num-lcores"	: Number of lcores to run (default=16).
> > +  "-i", "--init-device" : Initialise PF device with default values.
> > +
> > +
> > +To execute the test application tool using simple decode or encode
> > +data, type one of the following:
> > +
> > +.. code-block:: console
> > +
> > +  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_dec_default.data
> > +  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_enc_default.data
> > +
> > +
> > +The test application ``test-bbdev.py`` supports the ability to
> > +configure the PF device with a default set of values, if the "-i" or
> > +"--init-device" option is included. The default values are defined
> > +in test_bbdev_perf.c.
> > +
> > +
> > +Test Vectors
> > +~~~~~~~~~~~~
> > +
> > +In addition to the simple LDPC decoder and LDPC encoder tests, bbdev
> > +also provides a range of additional tests under the test_vectors
> > +folder, which may be useful. The results of these tests will depend
> > +on the ACC100 5G/4G FEC capabilities, which may cause some testcases
> > +to be skipped, but no failure should be reported.
> > diff --git a/doc/guides/bbdevs/index.rst b/doc/guides/bbdevs/index.rst
> > index a8092dd..4445cbd 100644
> > --- a/doc/guides/bbdevs/index.rst
> > +++ b/doc/guides/bbdevs/index.rst
> > @@ -13,3 +13,4 @@ Baseband Device Drivers
> >      turbo_sw
> >      fpga_lte_fec
> >      fpga_5gnr_fec
> > +    acc100
> > diff --git a/doc/guides/rel_notes/release_20_11.rst
> > b/doc/guides/rel_notes/release_20_11.rst
> > index df227a1..b3ab614 100644
> > --- a/doc/guides/rel_notes/release_20_11.rst
> > +++ b/doc/guides/rel_notes/release_20_11.rst
> > @@ -55,6 +55,12 @@ New Features
> >       Also, make sure to start the actual text at the margin.
> >       =======================================================
> >
> > +* **Added Intel ACC100 bbdev PMD.**
> > +
> > +  Added a new ``acc100`` bbdev driver for the Intel\ |reg| ACC100
> > +  accelerator, also known as Mount Bryce. See the
> > +  :doc:`../bbdevs/acc100` BBDEV guide for more details on this new driver.
> > +
> >
> >  Removed Items
> >  -------------
> > diff --git a/drivers/baseband/Makefile b/drivers/baseband/Makefile
> > index
> > dcc0969..b640294 100644
> > --- a/drivers/baseband/Makefile
> > +++ b/drivers/baseband/Makefile
> > @@ -10,6 +10,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_NULL) +=
> null
> > DEPDIRS-null = $(core-libs)
> >  DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_TURBO_SW) += turbo_sw
> > DEPDIRS-turbo_sw = $(core-libs)
> > +DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100) += acc100
> > +DEPDIRS-acc100 = $(core-libs)
> >  DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_LTE_FEC) += fpga_lte_fec
> > DEPDIRS-fpga_lte_fec = $(core-libs)
> >  DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC) +=
> fpga_5gnr_fec
> > diff --git a/drivers/baseband/acc100/Makefile
> > b/drivers/baseband/acc100/Makefile
> > new file mode 100644
> > index 0000000..c79e487
> > --- /dev/null
> > +++ b/drivers/baseband/acc100/Makefile
> > @@ -0,0 +1,25 @@
> > +# SPDX-License-Identifier: BSD-3-Clause
> > +# Copyright(c) 2020 Intel Corporation
> > +
> > +include $(RTE_SDK)/mk/rte.vars.mk
> > +
> > +# library name
> > +LIB = librte_pmd_bbdev_acc100.a
> > +
> > +# build flags
> > +CFLAGS += -O3
> > +CFLAGS += $(WERROR_FLAGS)
> > +LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring -lrte_cfgfile
> > +LDLIBS += -lrte_bbdev
> > +LDLIBS += -lrte_pci -lrte_bus_pci
> > +
> > +# versioning export map
> > +EXPORT_MAP := rte_pmd_bbdev_acc100_version.map
> > +
> > +# library version
> > +LIBABIVER := 1
> > +
> > +# library source files
> > +SRCS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100) += rte_acc100_pmd.c
> > +
> > +include $(RTE_SDK)/mk/rte.lib.mk
> > diff --git a/drivers/baseband/acc100/meson.build
> > b/drivers/baseband/acc100/meson.build
> > new file mode 100644
> > index 0000000..8afafc2
> > --- /dev/null
> > +++ b/drivers/baseband/acc100/meson.build
> > @@ -0,0 +1,6 @@
> > +# SPDX-License-Identifier: BSD-3-Clause
> > +# Copyright(c) 2020 Intel Corporation
> > +
> > +deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci']
> > +
> > +sources = files('rte_acc100_pmd.c')
> > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> > b/drivers/baseband/acc100/rte_acc100_pmd.c
> > new file mode 100644
> > index 0000000..1b4cd13
> > --- /dev/null
> > +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> > @@ -0,0 +1,175 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2020 Intel Corporation
> > + */
> > +
> > +#include <unistd.h>
> > +
> > +#include <rte_common.h>
> > +#include <rte_log.h>
> > +#include <rte_dev.h>
> > +#include <rte_malloc.h>
> > +#include <rte_mempool.h>
> > +#include <rte_byteorder.h>
> > +#include <rte_errno.h>
> > +#include <rte_branch_prediction.h>
> > +#include <rte_hexdump.h>
> > +#include <rte_pci.h>
> > +#include <rte_bus_pci.h>
> > +
> > +#include <rte_bbdev.h>
> > +#include <rte_bbdev_pmd.h>
> > +#include "rte_acc100_pmd.h"
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, DEBUG);
> > +#else
> > +RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
> > +#endif
> > +
> > +/* Free 64MB memory used for software rings */
> > +static int
> > +acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
> > +{
> > +	return 0;
> > +}
> > +
> > +static const struct rte_bbdev_ops acc100_bbdev_ops = {
> > +	.close = acc100_dev_close,
> > +};
> > +
> > +/* ACC100 PCI PF address map */
> > +static struct rte_pci_id pci_id_acc100_pf_map[] = {
> > +	{
> > +		RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID,
> > RTE_ACC100_PF_DEVICE_ID)
> > +	},
> > +	{.device_id = 0},
> > +};
> > +
> > +/* ACC100 PCI VF address map */
> > +static struct rte_pci_id pci_id_acc100_vf_map[] = {
> > +	{
> > +		RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID,
> > RTE_ACC100_VF_DEVICE_ID)
> > +	},
> > +	{.device_id = 0},
> > +};
> > +
> > +/* Initialization Function */
> > +static void
> > +acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
> > +{
> > +	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
> > +
> > +	dev->dev_ops = &acc100_bbdev_ops;
> > +
> > +	((struct acc100_device *) dev->data->dev_private)->pf_device =
> > +			!strcmp(drv->driver.name,
> > +					RTE_STR(ACC100PF_DRIVER_NAME));
> > +	((struct acc100_device *) dev->data->dev_private)->mmio_base =
> > +			pci_dev->mem_resource[0].addr;
> > +
> > +	rte_bbdev_log_debug("Init device %s [%s] @ vaddr %p paddr %#"PRIx64"",
> > +			drv->driver.name, dev->data->name,
> > +			(void *)pci_dev->mem_resource[0].addr,
> > +			pci_dev->mem_resource[0].phys_addr);
> > +}
> > +
> > +static int acc100_pci_probe(struct rte_pci_driver *pci_drv,
> > +	struct rte_pci_device *pci_dev)
> > +{
> > +	struct rte_bbdev *bbdev = NULL;
> > +	char dev_name[RTE_BBDEV_NAME_MAX_LEN];
> > +
> > +	if (pci_dev == NULL) {
> > +		rte_bbdev_log(ERR, "NULL PCI device");
> > +		return -EINVAL;
> > +	}
> > +
> > +	rte_pci_device_name(&pci_dev->addr, dev_name, sizeof(dev_name));
> > +
> > +	/* Allocate memory to be used privately by drivers */
> > +	bbdev = rte_bbdev_allocate(pci_dev->device.name);
> > +	if (bbdev == NULL)
> > +		return -ENODEV;
> > +
> > +	/* allocate device private memory */
> > +	bbdev->data->dev_private = rte_zmalloc_socket(dev_name,
> > +			sizeof(struct acc100_device), RTE_CACHE_LINE_SIZE,
> > +			pci_dev->device.numa_node);
> > +
> > +	if (bbdev->data->dev_private == NULL) {
> > +		rte_bbdev_log(CRIT,
> > +				"Allocate of %zu bytes for device \"%s\" failed",
> > +				sizeof(struct acc100_device), dev_name);
> > +		rte_bbdev_release(bbdev);
> > +		return -ENOMEM;
> > +	}
> > +
> > +	/* Fill HW specific part of device structure */
> > +	bbdev->device = &pci_dev->device;
> > +	bbdev->intr_handle = &pci_dev->intr_handle;
> > +	bbdev->data->socket_id = pci_dev->device.numa_node;
> > +
> > +	/* Invoke ACC100 device initialization function */
> > +	acc100_bbdev_init(bbdev, pci_drv);
> > +
> > +	rte_bbdev_log_debug("Initialised bbdev %s (id = %u)",
> > +			dev_name, bbdev->data->dev_id);
> > +	return 0;
> > +}
> > +
> > +static int acc100_pci_remove(struct rte_pci_device *pci_dev)
> > +{
> > +	struct rte_bbdev *bbdev;
> > +	int ret;
> > +	uint8_t dev_id;
> > +
> > +	if (pci_dev == NULL)
> > +		return -EINVAL;
> > +
> > +	/* Find device */
> > +	bbdev = rte_bbdev_get_named_dev(pci_dev->device.name);
> > +	if (bbdev == NULL) {
> > +		rte_bbdev_log(CRIT,
> > +				"Couldn't find HW dev \"%s\" to uninitialise it",
> > +				pci_dev->device.name);
> > +		return -ENODEV;
> > +	}
> > +	dev_id = bbdev->data->dev_id;
> > +
> > +	/* free device private memory before close */
> > +	rte_free(bbdev->data->dev_private);
> > +
> > +	/* Close device */
> > +	ret = rte_bbdev_close(dev_id);
> > +	if (ret < 0)
> > +		rte_bbdev_log(ERR,
> > +				"Device %i failed to close during uninit: %i",
> > +				dev_id, ret);
> > +
> > +	/* release bbdev from library */
> > +	rte_bbdev_release(bbdev);
> > +
> > +	rte_bbdev_log_debug("Destroyed bbdev = %u", dev_id);
> > +
> > +	return 0;
> > +}
> > +
> > +static struct rte_pci_driver acc100_pci_pf_driver = {
> > +		.probe = acc100_pci_probe,
> > +		.remove = acc100_pci_remove,
> > +		.id_table = pci_id_acc100_pf_map,
> > +		.drv_flags = RTE_PCI_DRV_NEED_MAPPING
> > +};
> > +
> > +static struct rte_pci_driver acc100_pci_vf_driver = {
> > +		.probe = acc100_pci_probe,
> > +		.remove = acc100_pci_remove,
> > +		.id_table = pci_id_acc100_vf_map,
> > +		.drv_flags = RTE_PCI_DRV_NEED_MAPPING
> > +};
> > +
> > +RTE_PMD_REGISTER_PCI(ACC100PF_DRIVER_NAME, acc100_pci_pf_driver);
> > +RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
> > +RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
> > +RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
> 
> It seems both PF and VF share the same data for rte_pci_driver; it's
> strange to duplicate code.

Hi Rosen, 

These 2 small structures differ since they map different device IDs to different driver names, while the 2 probe/remove functions share the same function calls, with the differences handled within the underlying implementation when required, based on whether the device driver is actually for the PF or the VF.
Basically this is used to clearly expose and carry the differences between the PF and VF drivers, which have slightly different functionality and implementation, even though they share the same actual probe/remove functions to limit code duplication (similarly for other functions: you will see a difference within the code only where required, when it assesses whether the underlying driver is for the PF or the VF; there is in effect no duplicate code this way).
I believe this is the best way to limit code duplication to only the places where it brings value and clarity, while keeping the implementations as generic as possible.

Note also that we use the same exact model for the other existing bbdev PMDs, and I want to carry the same split moving forward when possible (existing users are used to this, notably from the RTE EAL probing print), on top of keeping the code clean.

I hope this makes sense from these different aspects. 
Thanks again for your thorough review

Nic

> 
> > +
> > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h
> > b/drivers/baseband/acc100/rte_acc100_pmd.h
> > new file mode 100644
> > index 0000000..6f46df0
> > --- /dev/null
> > +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> > @@ -0,0 +1,37 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2020 Intel Corporation
> > + */
> > +
> > +#ifndef _RTE_ACC100_PMD_H_
> > +#define _RTE_ACC100_PMD_H_
> > +
> > +/* Helper macro for logging */
> > +#define rte_bbdev_log(level, fmt, ...) \
> > +	rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \
> > +		##__VA_ARGS__)
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +#define rte_bbdev_log_debug(fmt, ...) \
> > +		rte_bbdev_log(DEBUG, "acc100_pmd: " fmt, \
> > +		##__VA_ARGS__)
> > +#else
> > +#define rte_bbdev_log_debug(fmt, ...)
> > +#endif
> > +
> > +/* ACC100 PF and VF driver names */
> > +#define ACC100PF_DRIVER_NAME           intel_acc100_pf
> > +#define ACC100VF_DRIVER_NAME           intel_acc100_vf
> > +
> > +/* ACC100 PCI vendor & device IDs */
> > +#define RTE_ACC100_VENDOR_ID           (0x8086)
> > +#define RTE_ACC100_PF_DEVICE_ID        (0x0d5c)
> > +#define RTE_ACC100_VF_DEVICE_ID        (0x0d5d)
> > +
> > +/* Private data structure for each ACC100 device */
> > +struct acc100_device {
> > +	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
> > +	bool pf_device; /**< True if this is a PF ACC100 device */
> > +	bool configured; /**< True if this ACC100 device is configured */
> > +};
> > +
> > +#endif /* _RTE_ACC100_PMD_H_ */
> > diff --git a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> > b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> > new file mode 100644
> > index 0000000..4a76d1d
> > --- /dev/null
> > +++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> > @@ -0,0 +1,3 @@
> > +DPDK_21 {
> > +	local: *;
> > +};
> > diff --git a/drivers/baseband/meson.build
> > b/drivers/baseband/meson.build index 415b672..72301ce 100644
> > --- a/drivers/baseband/meson.build
> > +++ b/drivers/baseband/meson.build
> > @@ -5,7 +5,7 @@ if is_windows
> >  	subdir_done()
> >  endif
> >
> > -drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec']
> > +drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec',
> > +'acc100']
> >
> >  config_flag_fmt = 'RTE_LIBRTE_PMD_BBDEV_@0@'
> >  driver_name_fmt = 'rte_pmd_bbdev_@0@'
> > diff --git a/mk/rte.app.mk b/mk/rte.app.mk index a544259..a77f538
> > 100644
> > --- a/mk/rte.app.mk
> > +++ b/mk/rte.app.mk
> > @@ -254,6 +254,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_NETVSC_PMD)     +=
> > -lrte_pmd_netvsc
> >
> >  ifeq ($(CONFIG_RTE_LIBRTE_BBDEV),y)
> >  _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_NULL)     += -lrte_pmd_bbdev_null
> > +_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100)    += -lrte_pmd_bbdev_acc100
> >  _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_LTE_FEC) += -lrte_pmd_bbdev_fpga_lte_fec
> >  _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC) += -lrte_pmd_bbdev_fpga_5gnr_fec
> >
> > --
> > 1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v4 00/11] bbdev PMD ACC100
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 11/11] doc: update bbdev feature table Nicolas Chautru
@ 2020-09-04 17:53   ` Nicolas Chautru
  2020-09-04 17:53     ` [dpdk-dev] [PATCH v4 01/11] drivers/baseband: add PMD for ACC100 Nicolas Chautru
                       ` (11 more replies)
  2020-09-23  2:12   ` [dpdk-dev] [PATCH v5 " Nicolas Chautru
                     ` (7 subsequent siblings)
  8 siblings, 12 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-04 17:53 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

v4: an odd compilation error is reported for one CI variant
using "gcc latest", which looks to me like a false positive of
maybe-undeclared.
http://mails.dpdk.org/archives/test-report/2020-August/148936.html
Still forcing a dummy declaration to remove this CI warning.
I will check with ci@dpdk.org in parallel.
v3: missed a change during rebase
v2: includes clean up from latest CI checks.

This set includes a new PMD for the ACC100 accelerator
for 4G+5G FEC in 20.11.
The documentation is updated accordingly.
All existing unit tests are still supported.


Nicolas Chautru (11):
  drivers/baseband: add PMD for ACC100
  baseband/acc100: add register definition file
  baseband/acc100: add info get function
  baseband/acc100: add queue configuration
  baseband/acc100: add LDPC processing functions
  baseband/acc100: add HARQ loopback support
  baseband/acc100: add support for 4G processing
  baseband/acc100: add interrupt support to PMD
  baseband/acc100: add debug function to validate input
  baseband/acc100: add configure function
  doc: update bbdev feature table

 app/test-bbdev/Makefile                            |    3 +
 app/test-bbdev/meson.build                         |    3 +
 app/test-bbdev/test_bbdev_perf.c                   |   72 +
 config/common_base                                 |    4 +
 doc/guides/bbdevs/acc100.rst                       |  233 +
 doc/guides/bbdevs/features/acc100.ini              |   14 +
 doc/guides/bbdevs/features/mbc.ini                 |   14 -
 doc/guides/bbdevs/index.rst                        |    1 +
 doc/guides/rel_notes/release_20_11.rst             |    6 +
 drivers/baseband/Makefile                          |    2 +
 drivers/baseband/acc100/Makefile                   |   28 +
 drivers/baseband/acc100/acc100_pf_enum.h           | 1068 +++++
 drivers/baseband/acc100/acc100_vf_enum.h           |   73 +
 drivers/baseband/acc100/meson.build                |    8 +
 drivers/baseband/acc100/rte_acc100_cfg.h           |  113 +
 drivers/baseband/acc100/rte_acc100_pmd.c           | 4684 ++++++++++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.h           |  593 +++
 .../acc100/rte_pmd_bbdev_acc100_version.map        |   10 +
 drivers/baseband/meson.build                       |    2 +-
 mk/rte.app.mk                                      |    1 +
 20 files changed, 6917 insertions(+), 15 deletions(-)
 create mode 100644 doc/guides/bbdevs/acc100.rst
 create mode 100644 doc/guides/bbdevs/features/acc100.ini
 delete mode 100644 doc/guides/bbdevs/features/mbc.ini
 create mode 100644 drivers/baseband/acc100/Makefile
 create mode 100644 drivers/baseband/acc100/acc100_pf_enum.h
 create mode 100644 drivers/baseband/acc100/acc100_vf_enum.h
 create mode 100644 drivers/baseband/acc100/meson.build
 create mode 100644 drivers/baseband/acc100/rte_acc100_cfg.h
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.c
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.h
 create mode 100644 drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v4 01/11] drivers/baseband: add PMD for ACC100
  2020-09-04 17:53   ` [dpdk-dev] [PATCH v4 00/11] bbdev PMD ACC100 Nicolas Chautru
@ 2020-09-04 17:53     ` Nicolas Chautru
  2020-09-08  3:10       ` Liu, Tianjiao
  2020-09-04 17:53     ` [dpdk-dev] [PATCH v4 02/11] baseband/acc100: add register definition file Nicolas Chautru
                       ` (10 subsequent siblings)
  11 siblings, 1 reply; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-04 17:53 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add stubs for the ACC100 PMD

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 config/common_base                                 |   4 +
 doc/guides/bbdevs/acc100.rst                       | 233 +++++++++++++++++++++
 doc/guides/bbdevs/index.rst                        |   1 +
 doc/guides/rel_notes/release_20_11.rst             |   6 +
 drivers/baseband/Makefile                          |   2 +
 drivers/baseband/acc100/Makefile                   |  25 +++
 drivers/baseband/acc100/meson.build                |   6 +
 drivers/baseband/acc100/rte_acc100_pmd.c           | 175 ++++++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.h           |  37 ++++
 .../acc100/rte_pmd_bbdev_acc100_version.map        |   3 +
 drivers/baseband/meson.build                       |   2 +-
 mk/rte.app.mk                                      |   1 +
 12 files changed, 494 insertions(+), 1 deletion(-)
 create mode 100644 doc/guides/bbdevs/acc100.rst
 create mode 100644 drivers/baseband/acc100/Makefile
 create mode 100644 drivers/baseband/acc100/meson.build
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.c
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.h
 create mode 100644 drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map

diff --git a/config/common_base b/config/common_base
index fbf0ee7..218ab16 100644
--- a/config/common_base
+++ b/config/common_base
@@ -584,6 +584,10 @@ CONFIG_RTE_LIBRTE_PMD_BBDEV_NULL=y
 #
 CONFIG_RTE_LIBRTE_PMD_BBDEV_TURBO_SW=y
 
+# Compile PMD for ACC100 bbdev device
+#
+CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100=y
+
 #
 # Compile PMD for Intel FPGA LTE FEC bbdev device
 #
diff --git a/doc/guides/bbdevs/acc100.rst b/doc/guides/bbdevs/acc100.rst
new file mode 100644
index 0000000..f87ee09
--- /dev/null
+++ b/doc/guides/bbdevs/acc100.rst
@@ -0,0 +1,233 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2020 Intel Corporation
+
+Intel(R) ACC100 5G/4G FEC Poll Mode Driver
+==========================================
+
+The BBDEV ACC100 5G/4G FEC poll mode driver (PMD) supports an
+implementation of a VRAN FEC wireless acceleration function.
+This device is also known as Mount Bryce.
+
+Features
+--------
+
+ACC100 5G/4G FEC PMD supports the following features:
+
+- LDPC Encode in the DL (5GNR)
+- LDPC Decode in the UL (5GNR)
+- Turbo Encode in the DL (4G)
+- Turbo Decode in the UL (4G)
+- 16 VFs per PF (physical device)
+- Maximum of 128 queues per VF
+- PCIe Gen-3 x16 Interface
+- MSI
+- SR-IOV
+
+ACC100 5G/4G FEC PMD supports the following BBDEV capabilities:
+
+* For the LDPC encode operation:
+   - ``RTE_BBDEV_LDPC_CRC_24B_ATTACH`` :  set to attach CRC24B to CB(s)
+   - ``RTE_BBDEV_LDPC_RATE_MATCH`` :  if set then do not do Rate Match bypass
+   - ``RTE_BBDEV_LDPC_INTERLEAVER_BYPASS`` : if set then bypass interleaver
+
+* For the LDPC decode operation:
+   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK`` :  check CRC24B from CB(s)
+   - ``RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE`` :  disable early termination
+   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP`` :  drops CRC24B bits appended while decoding
+   - ``RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE`` :  provides an input for HARQ combining
+   - ``RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE`` :  provides an output for HARQ combining
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE`` :  HARQ memory input is internal
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE`` :  HARQ memory output is internal
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK`` :  loopback data to/from HARQ memory
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_FILLERS`` :  HARQ memory includes the filler bits
+   - ``RTE_BBDEV_LDPC_DEC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
+   - ``RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION`` :  supports compression of the HARQ input/output
+   - ``RTE_BBDEV_LDPC_LLR_COMPRESSION`` :  supports LLR input compression
+
+* For the turbo encode operation:
+   - ``RTE_BBDEV_TURBO_CRC_24B_ATTACH`` :  set to attach CRC24B to CB(s)
+   - ``RTE_BBDEV_TURBO_RATE_MATCH`` :  if set then do not do Rate Match bypass
+   - ``RTE_BBDEV_TURBO_ENC_INTERRUPTS`` :  set for encoder dequeue interrupts
+   - ``RTE_BBDEV_TURBO_RV_INDEX_BYPASS`` :  set to bypass RV index
+   - ``RTE_BBDEV_TURBO_ENC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
+
+* For the turbo decode operation:
+   - ``RTE_BBDEV_TURBO_CRC_TYPE_24B`` :  check CRC24B from CB(s)
+   - ``RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE`` :  perform subblock de-interleave
+   - ``RTE_BBDEV_TURBO_DEC_INTERRUPTS`` :  set for decoder dequeue interrupts
+   - ``RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN`` :  set if negative LLR encoder i/p is supported
+   - ``RTE_BBDEV_TURBO_POS_LLR_1_BIT_IN`` :  set if positive LLR encoder i/p is supported
+   - ``RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP`` :  keep CRC24B bits appended while decoding
+   - ``RTE_BBDEV_TURBO_EARLY_TERMINATION`` :  set the early termination feature
+   - ``RTE_BBDEV_TURBO_DEC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
+   - ``RTE_BBDEV_TURBO_HALF_ITERATION_EVEN`` :  set half iteration granularity
+
+Installation
+------------
+
+Section 3 of the DPDK manual provides instructions on installing and compiling DPDK. The
+default set of bbdev compile flags may be found in config/common_base, where for example
+the flag to build the ACC100 5G/4G FEC device, ``CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100``,
+is already set.
+
+DPDK requires hugepages to be configured as detailed in section 2 of the DPDK manual.
+The bbdev test application has been tested with a configuration of 40 x 1GB hugepages. The
+hugepage configuration of a server may be examined using:
+
+.. code-block:: console
+
+   grep Huge* /proc/meminfo
+
+
+Initialization
+--------------
+
+When the device first powers up, its PCI Physical Functions (PF) can be listed through this command:
+
+.. code-block:: console
+
+  sudo lspci -vd8086:0d5c
+
+The physical and virtual functions are compatible with Linux UIO drivers:
+``vfio`` and ``igb_uio``. However, before it can be used the ACC100 5G/4G
+FEC device first needs to be bound to one of these Linux drivers through DPDK.
+
+
+Bind PF UIO driver(s)
+~~~~~~~~~~~~~~~~~~~~~
+
+Install the DPDK igb_uio driver, bind it with the PF PCI device ID and use
+``lspci`` to confirm the PF device is in use by the ``igb_uio`` DPDK UIO driver.
+
+The igb_uio driver may be bound to the PF PCI device using one of three methods:
+
+
+1. PCI functions (physical or virtual, depending on the use case) can be bound to
+the UIO driver by repeating this command for every function.
+
+.. code-block:: console
+
+  cd <dpdk-top-level-directory>
+  insmod ./build/kmod/igb_uio.ko
+  echo "8086 0d5c" > /sys/bus/pci/drivers/igb_uio/new_id
+  lspci -vd8086:0d5c
+
+
+2. Another way to bind the PF to the DPDK UIO driver is to use the ``dpdk-devbind.py`` tool
+
+.. code-block:: console
+
+  cd <dpdk-top-level-directory>
+  ./usertools/dpdk-devbind.py -b igb_uio 0000:06:00.0
+
+where the PCI device ID (example: 0000:06:00.0) is obtained using ``lspci -vd8086:0d5c``.
+
+
+3. A third way to bind is to use ``dpdk-setup.sh`` tool
+
+.. code-block:: console
+
+  cd <dpdk-top-level-directory>
+  ./usertools/dpdk-setup.sh
+
+  select 'Bind Ethernet/Crypto/Baseband device to IGB UIO module'
+  or
+  select 'Bind Ethernet/Crypto/Baseband device to VFIO module' depending on driver required
+  enter PCI device ID
+  select 'Display current Ethernet/Crypto/Baseband device settings' to confirm binding
+
+
+In the same way the ACC100 5G/4G FEC PF can be bound with vfio, but vfio driver does not
+support SR-IOV configuration right out of the box, so it will need to be patched.
+
+
+Enable Virtual Functions
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+The ``lspci`` printout should now show that the PCI PF is under igb_uio control:
+"``Kernel driver in use: igb_uio``"
+
+To show the number of available VFs on the device, read the ``sriov_totalvfs`` file:
+
+.. code-block:: console
+
+  cat /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_totalvfs
+
+  where 0000\:<b>\:<d>.<f> is the PCI device ID
+
+
+To enable VFs via igb_uio, echo the number of virtual functions to enable
+to the ``max_vfs`` file:
+
+.. code-block:: console
+
+  echo <num-of-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/max_vfs
+
+
+Afterwards, all VFs must be bound to the appropriate UIO drivers as required,
+in the same way as was done for the physical function previously.
+
+Enabling SR-IOV via the vfio driver works much the same way, except that the
+file name is different:
+
+.. code-block:: console
+
+  echo <num-of-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_numvfs
+
+
+Configure the VFs through PF
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The PCI virtual functions must be configured before use or before being assigned
+to VMs/containers. The configuration involves allocating the number of hardware
+queues, priorities, load balance, bandwidth and other settings necessary for the
+device to perform FEC functions.
+
+This configuration needs to be executed at least once after reboot or PCI FLR and can
+be achieved by using the function ``acc100_configure()``, which sets up the
+parameters defined in ``acc100_conf`` structure.
+
+Test Application
+----------------
+
+BBDEV provides a test application, ``test-bbdev.py``, and a range of test data for testing
+the functionality of ACC100 5G/4G FEC encode and decode, depending on the device's
+capabilities. The test application is located in the ``app/test-bbdev`` folder and has the
+following options:
+
+.. code-block:: console
+
+  "-p", "--testapp-path": specifies path to the bbdev test app.
+  "-e", "--eal-params"	: EAL arguments which are passed to the test app.
+  "-t", "--timeout"	: Timeout in seconds (default=300).
+  "-c", "--test-cases"	: Defines test cases to run. Run all if not specified.
+  "-v", "--test-vector"	: Test vector path (default=dpdk_path+/app/test-bbdev/test_vectors/bbdev_null.data).
+  "-n", "--num-ops"	: Number of operations to process on device (default=32).
+  "-b", "--burst-size"	: Operations enqueue/dequeue burst size (default=32).
+  "-s", "--snr"		: SNR in dB used when generating LLRs for bler tests.
+  "-s", "--iter_max"	: Number of iterations for LDPC decoder.
+  "-l", "--num-lcores"	: Number of lcores to run (default=16).
+  "-i", "--init-device" : Initialise PF device with default values.
+
+
+To execute the test application tool using simple decode or encode data,
+type one of the following:
+
+.. code-block:: console
+
+  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_dec_default.data
+  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_enc_default.data
+
+
+The test application ``test-bbdev.py`` supports configuring the PF device with
+a default set of values when the ``-i`` or ``--init-device`` option is included. The
+default values are defined in test_bbdev_perf.c.
+
+
+Test Vectors
+~~~~~~~~~~~~
+
+In addition to the simple LDPC decoder and LDPC encoder tests, bbdev also provides
+a range of additional tests under the test_vectors folder, which may be useful. The results
+of these tests will depend on the ACC100 5G/4G FEC capabilities, which may cause some
+test cases to be skipped, but no failure should be reported.
diff --git a/doc/guides/bbdevs/index.rst b/doc/guides/bbdevs/index.rst
index a8092dd..4445cbd 100644
--- a/doc/guides/bbdevs/index.rst
+++ b/doc/guides/bbdevs/index.rst
@@ -13,3 +13,4 @@ Baseband Device Drivers
     turbo_sw
     fpga_lte_fec
     fpga_5gnr_fec
+    acc100
diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
index df227a1..b3ab614 100644
--- a/doc/guides/rel_notes/release_20_11.rst
+++ b/doc/guides/rel_notes/release_20_11.rst
@@ -55,6 +55,12 @@ New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+* **Added Intel ACC100 bbdev PMD.**
+
+  Added a new ``acc100`` bbdev driver for the Intel\ |reg| ACC100 accelerator
+  also known as Mount Bryce.  See the
+  :doc:`../bbdevs/acc100` BBDEV guide for more details on this new driver.
+
 
 Removed Items
 -------------
diff --git a/drivers/baseband/Makefile b/drivers/baseband/Makefile
index dcc0969..b640294 100644
--- a/drivers/baseband/Makefile
+++ b/drivers/baseband/Makefile
@@ -10,6 +10,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_NULL) += null
 DEPDIRS-null = $(core-libs)
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_TURBO_SW) += turbo_sw
 DEPDIRS-turbo_sw = $(core-libs)
+DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100) += acc100
+DEPDIRS-acc100 = $(core-libs)
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_LTE_FEC) += fpga_lte_fec
 DEPDIRS-fpga_lte_fec = $(core-libs)
 DIRS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC) += fpga_5gnr_fec
diff --git a/drivers/baseband/acc100/Makefile b/drivers/baseband/acc100/Makefile
new file mode 100644
index 0000000..c79e487
--- /dev/null
+++ b/drivers/baseband/acc100/Makefile
@@ -0,0 +1,25 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2020 Intel Corporation
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_pmd_bbdev_acc100.a
+
+# build flags
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring -lrte_cfgfile
+LDLIBS += -lrte_bbdev
+LDLIBS += -lrte_pci -lrte_bus_pci
+
+# versioning export map
+EXPORT_MAP := rte_pmd_bbdev_acc100_version.map
+
+# library version
+LIBABIVER := 1
+
+# library source files
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100) += rte_acc100_pmd.c
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/baseband/acc100/meson.build b/drivers/baseband/acc100/meson.build
new file mode 100644
index 0000000..8afafc2
--- /dev/null
+++ b/drivers/baseband/acc100/meson.build
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2020 Intel Corporation
+
+deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci']
+
+sources = files('rte_acc100_pmd.c')
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
new file mode 100644
index 0000000..1b4cd13
--- /dev/null
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -0,0 +1,175 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include <unistd.h>
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_dev.h>
+#include <rte_malloc.h>
+#include <rte_mempool.h>
+#include <rte_byteorder.h>
+#include <rte_errno.h>
+#include <rte_branch_prediction.h>
+#include <rte_hexdump.h>
+#include <rte_pci.h>
+#include <rte_bus_pci.h>
+
+#include <rte_bbdev.h>
+#include <rte_bbdev_pmd.h>
+#include "rte_acc100_pmd.h"
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, DEBUG);
+#else
+RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
+#endif
+
+/* Stub for device close; no resources are allocated yet to free */
+static int
+acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
+{
+	return 0;
+}
+
+static const struct rte_bbdev_ops acc100_bbdev_ops = {
+	.close = acc100_dev_close,
+};
+
+/* ACC100 PCI PF address map */
+static struct rte_pci_id pci_id_acc100_pf_map[] = {
+	{
+		RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID, RTE_ACC100_PF_DEVICE_ID)
+	},
+	{.device_id = 0},
+};
+
+/* ACC100 PCI VF address map */
+static struct rte_pci_id pci_id_acc100_vf_map[] = {
+	{
+		RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID, RTE_ACC100_VF_DEVICE_ID)
+	},
+	{.device_id = 0},
+};
+
+/* Initialization Function */
+static void
+acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
+{
+	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
+
+	dev->dev_ops = &acc100_bbdev_ops;
+
+	((struct acc100_device *) dev->data->dev_private)->pf_device =
+			!strcmp(drv->driver.name,
+					RTE_STR(ACC100PF_DRIVER_NAME));
+	((struct acc100_device *) dev->data->dev_private)->mmio_base =
+			pci_dev->mem_resource[0].addr;
+
+	rte_bbdev_log_debug("Init device %s [%s] @ vaddr %p paddr %#"PRIx64"",
+			drv->driver.name, dev->data->name,
+			(void *)pci_dev->mem_resource[0].addr,
+			pci_dev->mem_resource[0].phys_addr);
+}
+
+static int acc100_pci_probe(struct rte_pci_driver *pci_drv,
+	struct rte_pci_device *pci_dev)
+{
+	struct rte_bbdev *bbdev = NULL;
+	char dev_name[RTE_BBDEV_NAME_MAX_LEN];
+
+	if (pci_dev == NULL) {
+		rte_bbdev_log(ERR, "NULL PCI device");
+		return -EINVAL;
+	}
+
+	rte_pci_device_name(&pci_dev->addr, dev_name, sizeof(dev_name));
+
+	/* Allocate memory to be used privately by drivers */
+	bbdev = rte_bbdev_allocate(pci_dev->device.name);
+	if (bbdev == NULL)
+		return -ENODEV;
+
+	/* allocate device private memory */
+	bbdev->data->dev_private = rte_zmalloc_socket(dev_name,
+			sizeof(struct acc100_device), RTE_CACHE_LINE_SIZE,
+			pci_dev->device.numa_node);
+
+	if (bbdev->data->dev_private == NULL) {
+		rte_bbdev_log(CRIT,
+				"Allocate of %zu bytes for device \"%s\" failed",
+				sizeof(struct acc100_device), dev_name);
+		rte_bbdev_release(bbdev);
+		return -ENOMEM;
+	}
+
+	/* Fill HW specific part of device structure */
+	bbdev->device = &pci_dev->device;
+	bbdev->intr_handle = &pci_dev->intr_handle;
+	bbdev->data->socket_id = pci_dev->device.numa_node;
+
+	/* Invoke ACC100 device initialization function */
+	acc100_bbdev_init(bbdev, pci_drv);
+
+	rte_bbdev_log_debug("Initialised bbdev %s (id = %u)",
+			dev_name, bbdev->data->dev_id);
+	return 0;
+}
+
+static int acc100_pci_remove(struct rte_pci_device *pci_dev)
+{
+	struct rte_bbdev *bbdev;
+	int ret;
+	uint8_t dev_id;
+
+	if (pci_dev == NULL)
+		return -EINVAL;
+
+	/* Find device */
+	bbdev = rte_bbdev_get_named_dev(pci_dev->device.name);
+	if (bbdev == NULL) {
+		rte_bbdev_log(CRIT,
+				"Couldn't find HW dev \"%s\" to uninitialise it",
+				pci_dev->device.name);
+		return -ENODEV;
+	}
+	dev_id = bbdev->data->dev_id;
+
+	/* free device private memory before close */
+	rte_free(bbdev->data->dev_private);
+
+	/* Close device */
+	ret = rte_bbdev_close(dev_id);
+	if (ret < 0)
+		rte_bbdev_log(ERR,
+				"Device %i failed to close during uninit: %i",
+				dev_id, ret);
+
+	/* release bbdev from library */
+	rte_bbdev_release(bbdev);
+
+	rte_bbdev_log_debug("Destroyed bbdev = %u", dev_id);
+
+	return 0;
+}
+
+static struct rte_pci_driver acc100_pci_pf_driver = {
+		.probe = acc100_pci_probe,
+		.remove = acc100_pci_remove,
+		.id_table = pci_id_acc100_pf_map,
+		.drv_flags = RTE_PCI_DRV_NEED_MAPPING
+};
+
+static struct rte_pci_driver acc100_pci_vf_driver = {
+		.probe = acc100_pci_probe,
+		.remove = acc100_pci_remove,
+		.id_table = pci_id_acc100_vf_map,
+		.drv_flags = RTE_PCI_DRV_NEED_MAPPING
+};
+
+RTE_PMD_REGISTER_PCI(ACC100PF_DRIVER_NAME, acc100_pci_pf_driver);
+RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
+RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
+RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
+
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
new file mode 100644
index 0000000..6f46df0
--- /dev/null
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -0,0 +1,37 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_ACC100_PMD_H_
+#define _RTE_ACC100_PMD_H_
+
+/* Helper macro for logging */
+#define rte_bbdev_log(level, fmt, ...) \
+	rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \
+		##__VA_ARGS__)
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+#define rte_bbdev_log_debug(fmt, ...) \
+		rte_bbdev_log(DEBUG, "acc100_pmd: " fmt, \
+		##__VA_ARGS__)
+#else
+#define rte_bbdev_log_debug(fmt, ...)
+#endif
+
+/* ACC100 PF and VF driver names */
+#define ACC100PF_DRIVER_NAME           intel_acc100_pf
+#define ACC100VF_DRIVER_NAME           intel_acc100_vf
+
+/* ACC100 PCI vendor & device IDs */
+#define RTE_ACC100_VENDOR_ID           (0x8086)
+#define RTE_ACC100_PF_DEVICE_ID        (0x0d5c)
+#define RTE_ACC100_VF_DEVICE_ID        (0x0d5d)
+
+/* Private data structure for each ACC100 device */
+struct acc100_device {
+	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
+	bool pf_device; /**< True if this is a PF ACC100 device */
+	bool configured; /**< True if this ACC100 device is configured */
+};
+
+#endif /* _RTE_ACC100_PMD_H_ */
diff --git a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
new file mode 100644
index 0000000..4a76d1d
--- /dev/null
+++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
@@ -0,0 +1,3 @@
+DPDK_21 {
+	local: *;
+};
diff --git a/drivers/baseband/meson.build b/drivers/baseband/meson.build
index 415b672..72301ce 100644
--- a/drivers/baseband/meson.build
+++ b/drivers/baseband/meson.build
@@ -5,7 +5,7 @@ if is_windows
 	subdir_done()
 endif
 
-drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec']
+drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec', 'acc100']
 
 config_flag_fmt = 'RTE_LIBRTE_PMD_BBDEV_@0@'
 driver_name_fmt = 'rte_pmd_bbdev_@0@'
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index a544259..a77f538 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -254,6 +254,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_NETVSC_PMD)     += -lrte_pmd_netvsc
 
 ifeq ($(CONFIG_RTE_LIBRTE_BBDEV),y)
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_NULL)     += -lrte_pmd_bbdev_null
+_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100)    += -lrte_pmd_bbdev_acc100
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_LTE_FEC) += -lrte_pmd_bbdev_fpga_lte_fec
 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC) += -lrte_pmd_bbdev_fpga_5gnr_fec
 
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v4 02/11] baseband/acc100: add register definition file
  2020-09-04 17:53   ` [dpdk-dev] [PATCH v4 00/11] bbdev PMD ACC100 Nicolas Chautru
  2020-09-04 17:53     ` [dpdk-dev] [PATCH v4 01/11] drivers/baseband: add PMD for ACC100 Nicolas Chautru
@ 2020-09-04 17:53     ` Nicolas Chautru
  2020-09-15  2:31       ` Xu, Rosen
  2020-09-18  2:39       ` Liu, Tianjiao
  2020-09-04 17:53     ` [dpdk-dev] [PATCH v4 03/11] baseband/acc100: add info get function Nicolas Chautru
                       ` (9 subsequent siblings)
  11 siblings, 2 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-04 17:53 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add in the list of registers for the device and related
HW specs definitions.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 drivers/baseband/acc100/acc100_pf_enum.h | 1068 ++++++++++++++++++++++++++++++
 drivers/baseband/acc100/acc100_vf_enum.h |   73 ++
 drivers/baseband/acc100/rte_acc100_pmd.h |  490 ++++++++++++++
 3 files changed, 1631 insertions(+)
 create mode 100644 drivers/baseband/acc100/acc100_pf_enum.h
 create mode 100644 drivers/baseband/acc100/acc100_vf_enum.h

diff --git a/drivers/baseband/acc100/acc100_pf_enum.h b/drivers/baseband/acc100/acc100_pf_enum.h
new file mode 100644
index 0000000..a1ee416
--- /dev/null
+++ b/drivers/baseband/acc100/acc100_pf_enum.h
@@ -0,0 +1,1068 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017 Intel Corporation
+ */
+
+#ifndef ACC100_PF_ENUM_H
+#define ACC100_PF_ENUM_H
+
+/*
+ * ACC100 Register mapping on PF BAR0
+ * This is automatically generated from RDL, format may change with new RDL
+ * Release.
+ * Variable names are as is
+ */
+enum {
+	HWPfQmgrEgressQueuesTemplate          =  0x0007FE00,
+	HWPfQmgrIngressAq                     =  0x00080000,
+	HWPfQmgrArbQAvail                     =  0x00A00010,
+	HWPfQmgrArbQBlock                     =  0x00A00014,
+	HWPfQmgrAqueueDropNotifEn             =  0x00A00024,
+	HWPfQmgrAqueueDisableNotifEn          =  0x00A00028,
+	HWPfQmgrSoftReset                     =  0x00A00038,
+	HWPfQmgrInitStatus                    =  0x00A0003C,
+	HWPfQmgrAramWatchdogCount             =  0x00A00040,
+	HWPfQmgrAramWatchdogCounterEn         =  0x00A00044,
+	HWPfQmgrAxiWatchdogCount              =  0x00A00048,
+	HWPfQmgrAxiWatchdogCounterEn          =  0x00A0004C,
+	HWPfQmgrProcessWatchdogCount          =  0x00A00050,
+	HWPfQmgrProcessWatchdogCounterEn      =  0x00A00054,
+	HWPfQmgrProcessUl4GWatchdogCounter    =  0x00A00058,
+	HWPfQmgrProcessDl4GWatchdogCounter    =  0x00A0005C,
+	HWPfQmgrProcessUl5GWatchdogCounter    =  0x00A00060,
+	HWPfQmgrProcessDl5GWatchdogCounter    =  0x00A00064,
+	HWPfQmgrProcessMldWatchdogCounter     =  0x00A00068,
+	HWPfQmgrMsiOverflowUpperVf            =  0x00A00070,
+	HWPfQmgrMsiOverflowLowerVf            =  0x00A00074,
+	HWPfQmgrMsiWatchdogOverflow           =  0x00A00078,
+	HWPfQmgrMsiOverflowEnable             =  0x00A0007C,
+	HWPfQmgrDebugAqPointerMemGrp          =  0x00A00100,
+	HWPfQmgrDebugOutputArbQFifoGrp        =  0x00A00140,
+	HWPfQmgrDebugMsiFifoGrp               =  0x00A00180,
+	HWPfQmgrDebugAxiWdTimeoutMsiFifo      =  0x00A001C0,
+	HWPfQmgrDebugProcessWdTimeoutMsiFifo  =  0x00A001C4,
+	HWPfQmgrDepthLog2Grp                  =  0x00A00200,
+	HWPfQmgrTholdGrp                      =  0x00A00300,
+	HWPfQmgrGrpTmplateReg0Indx            =  0x00A00600,
+	HWPfQmgrGrpTmplateReg1Indx            =  0x00A00680,
+	HWPfQmgrGrpTmplateReg2indx            =  0x00A00700,
+	HWPfQmgrGrpTmplateReg3Indx            =  0x00A00780,
+	HWPfQmgrGrpTmplateReg4Indx            =  0x00A00800,
+	HWPfQmgrVfBaseAddr                    =  0x00A01000,
+	HWPfQmgrUl4GWeightRrVf                =  0x00A02000,
+	HWPfQmgrDl4GWeightRrVf                =  0x00A02100,
+	HWPfQmgrUl5GWeightRrVf                =  0x00A02200,
+	HWPfQmgrDl5GWeightRrVf                =  0x00A02300,
+	HWPfQmgrMldWeightRrVf                 =  0x00A02400,
+	HWPfQmgrArbQDepthGrp                  =  0x00A02F00,
+	HWPfQmgrGrpFunction0                  =  0x00A02F40,
+	HWPfQmgrGrpFunction1                  =  0x00A02F44,
+	HWPfQmgrGrpPriority                   =  0x00A02F48,
+	HWPfQmgrWeightSync                    =  0x00A03000,
+	HWPfQmgrAqEnableVf                    =  0x00A10000,
+	HWPfQmgrAqResetVf                     =  0x00A20000,
+	HWPfQmgrRingSizeVf                    =  0x00A20004,
+	HWPfQmgrGrpDepthLog20Vf               =  0x00A20008,
+	HWPfQmgrGrpDepthLog21Vf               =  0x00A2000C,
+	HWPfQmgrGrpFunction0Vf                =  0x00A20010,
+	HWPfQmgrGrpFunction1Vf                =  0x00A20014,
+	HWPfDmaConfig0Reg                     =  0x00B80000,
+	HWPfDmaConfig1Reg                     =  0x00B80004,
+	HWPfDmaQmgrAddrReg                    =  0x00B80008,
+	HWPfDmaSoftResetReg                   =  0x00B8000C,
+	HWPfDmaAxcacheReg                     =  0x00B80010,
+	HWPfDmaVersionReg                     =  0x00B80014,
+	HWPfDmaFrameThreshold                 =  0x00B80018,
+	HWPfDmaTimestampLo                    =  0x00B8001C,
+	HWPfDmaTimestampHi                    =  0x00B80020,
+	HWPfDmaAxiStatus                      =  0x00B80028,
+	HWPfDmaAxiControl                     =  0x00B8002C,
+	HWPfDmaNoQmgr                         =  0x00B80030,
+	HWPfDmaQosScale                       =  0x00B80034,
+	HWPfDmaQmanen                         =  0x00B80040,
+	HWPfDmaQmgrQosBase                    =  0x00B80060,
+	HWPfDmaFecClkGatingEnable             =  0x00B80080,
+	HWPfDmaPmEnable                       =  0x00B80084,
+	HWPfDmaQosEnable                      =  0x00B80088,
+	HWPfDmaHarqWeightedRrFrameThreshold   =  0x00B800B0,
+	HWPfDmaDataSmallWeightedRrFrameThresh  = 0x00B800B4,
+	HWPfDmaDataLargeWeightedRrFrameThresh  = 0x00B800B8,
+	HWPfDmaInboundCbMaxSize               =  0x00B800BC,
+	HWPfDmaInboundDrainDataSize           =  0x00B800C0,
+	HWPfDmaVfDdrBaseRw                    =  0x00B80400,
+	HWPfDmaCmplTmOutCnt                   =  0x00B80800,
+	HWPfDmaProcTmOutCnt                   =  0x00B80804,
+	HWPfDmaStatusRrespBresp               =  0x00B80810,
+	HWPfDmaCfgRrespBresp                  =  0x00B80814,
+	HWPfDmaStatusMemParErr                =  0x00B80818,
+	HWPfDmaCfgMemParErrEn                 =  0x00B8081C,
+	HWPfDmaStatusDmaHwErr                 =  0x00B80820,
+	HWPfDmaCfgDmaHwErrEn                  =  0x00B80824,
+	HWPfDmaStatusFecCoreErr               =  0x00B80828,
+	HWPfDmaCfgFecCoreErrEn                =  0x00B8082C,
+	HWPfDmaStatusFcwDescrErr              =  0x00B80830,
+	HWPfDmaCfgFcwDescrErrEn               =  0x00B80834,
+	HWPfDmaStatusBlockTransmit            =  0x00B80838,
+	HWPfDmaBlockOnErrEn                   =  0x00B8083C,
+	HWPfDmaStatusFlushDma                 =  0x00B80840,
+	HWPfDmaFlushDmaOnErrEn                =  0x00B80844,
+	HWPfDmaStatusSdoneFifoFull            =  0x00B80848,
+	HWPfDmaStatusDescriptorErrLoVf        =  0x00B8084C,
+	HWPfDmaStatusDescriptorErrHiVf        =  0x00B80850,
+	HWPfDmaStatusFcwErrLoVf               =  0x00B80854,
+	HWPfDmaStatusFcwErrHiVf               =  0x00B80858,
+	HWPfDmaStatusDataErrLoVf              =  0x00B8085C,
+	HWPfDmaStatusDataErrHiVf              =  0x00B80860,
+	HWPfDmaCfgMsiEnSoftwareErr            =  0x00B80864,
+	HWPfDmaDescriptorSignatuture          =  0x00B80868,
+	HWPfDmaFcwSignature                   =  0x00B8086C,
+	HWPfDmaErrorDetectionEn               =  0x00B80870,
+	HWPfDmaErrCntrlFifoDebug              =  0x00B8087C,
+	HWPfDmaStatusToutData                 =  0x00B80880,
+	HWPfDmaStatusToutDesc                 =  0x00B80884,
+	HWPfDmaStatusToutUnexpData            =  0x00B80888,
+	HWPfDmaStatusToutUnexpDesc            =  0x00B8088C,
+	HWPfDmaStatusToutProcess              =  0x00B80890,
+	HWPfDmaConfigCtoutOutDataEn           =  0x00B808A0,
+	HWPfDmaConfigCtoutOutDescrEn          =  0x00B808A4,
+	HWPfDmaConfigUnexpComplDataEn         =  0x00B808A8,
+	HWPfDmaConfigUnexpComplDescrEn        =  0x00B808AC,
+	HWPfDmaConfigPtoutOutEn               =  0x00B808B0,
+	HWPfDmaFec5GulDescBaseLoRegVf         =  0x00B88020,
+	HWPfDmaFec5GulDescBaseHiRegVf         =  0x00B88024,
+	HWPfDmaFec5GulRespPtrLoRegVf          =  0x00B88028,
+	HWPfDmaFec5GulRespPtrHiRegVf          =  0x00B8802C,
+	HWPfDmaFec5GdlDescBaseLoRegVf         =  0x00B88040,
+	HWPfDmaFec5GdlDescBaseHiRegVf         =  0x00B88044,
+	HWPfDmaFec5GdlRespPtrLoRegVf          =  0x00B88048,
+	HWPfDmaFec5GdlRespPtrHiRegVf          =  0x00B8804C,
+	HWPfDmaFec4GulDescBaseLoRegVf         =  0x00B88060,
+	HWPfDmaFec4GulDescBaseHiRegVf         =  0x00B88064,
+	HWPfDmaFec4GulRespPtrLoRegVf          =  0x00B88068,
+	HWPfDmaFec4GulRespPtrHiRegVf          =  0x00B8806C,
+	HWPfDmaFec4GdlDescBaseLoRegVf         =  0x00B88080,
+	HWPfDmaFec4GdlDescBaseHiRegVf         =  0x00B88084,
+	HWPfDmaFec4GdlRespPtrLoRegVf          =  0x00B88088,
+	HWPfDmaFec4GdlRespPtrHiRegVf          =  0x00B8808C,
+	HWPfDmaVfDdrBaseRangeRo               =  0x00B880A0,
+	HWPfQosmonACntrlReg                   =  0x00B90000,
+	HWPfQosmonAEvalOverflow0              =  0x00B90008,
+	HWPfQosmonAEvalOverflow1              =  0x00B9000C,
+	HWPfQosmonADivTerm                    =  0x00B90010,
+	HWPfQosmonATickTerm                   =  0x00B90014,
+	HWPfQosmonAEvalTerm                   =  0x00B90018,
+	HWPfQosmonAAveTerm                    =  0x00B9001C,
+	HWPfQosmonAForceEccErr                =  0x00B90020,
+	HWPfQosmonAEccErrDetect               =  0x00B90024,
+	HWPfQosmonAIterationConfig0Low        =  0x00B90060,
+	HWPfQosmonAIterationConfig0High       =  0x00B90064,
+	HWPfQosmonAIterationConfig1Low        =  0x00B90068,
+	HWPfQosmonAIterationConfig1High       =  0x00B9006C,
+	HWPfQosmonAIterationConfig2Low        =  0x00B90070,
+	HWPfQosmonAIterationConfig2High       =  0x00B90074,
+	HWPfQosmonAIterationConfig3Low        =  0x00B90078,
+	HWPfQosmonAIterationConfig3High       =  0x00B9007C,
+	HWPfQosmonAEvalMemAddr                =  0x00B90080,
+	HWPfQosmonAEvalMemData                =  0x00B90084,
+	HWPfQosmonAXaction                    =  0x00B900C0,
+	HWPfQosmonARemThres1Vf                =  0x00B90400,
+	HWPfQosmonAThres2Vf                   =  0x00B90404,
+	HWPfQosmonAWeiFracVf                  =  0x00B90408,
+	HWPfQosmonARrWeiVf                    =  0x00B9040C,
+	HWPfPermonACntrlRegVf                 =  0x00B98000,
+	HWPfPermonACountVf                    =  0x00B98008,
+	HWPfPermonAKCntLoVf                   =  0x00B98010,
+	HWPfPermonAKCntHiVf                   =  0x00B98014,
+	HWPfPermonADeltaCntLoVf               =  0x00B98020,
+	HWPfPermonADeltaCntHiVf               =  0x00B98024,
+	HWPfPermonAVersionReg                 =  0x00B9C000,
+	HWPfPermonACbControlFec               =  0x00B9C0F0,
+	HWPfPermonADltTimerLoFec              =  0x00B9C0F4,
+	HWPfPermonADltTimerHiFec              =  0x00B9C0F8,
+	HWPfPermonACbCountFec                 =  0x00B9C100,
+	HWPfPermonAAccExecTimerLoFec          =  0x00B9C104,
+	HWPfPermonAAccExecTimerHiFec          =  0x00B9C108,
+	HWPfPermonAExecTimerMinFec            =  0x00B9C200,
+	HWPfPermonAExecTimerMaxFec            =  0x00B9C204,
+	HWPfPermonAControlBusMon              =  0x00B9C400,
+	HWPfPermonAConfigBusMon               =  0x00B9C404,
+	HWPfPermonASkipCountBusMon            =  0x00B9C408,
+	HWPfPermonAMinLatBusMon               =  0x00B9C40C,
+	HWPfPermonAMaxLatBusMon               =  0x00B9C500,
+	HWPfPermonATotalLatLowBusMon          =  0x00B9C504,
+	HWPfPermonATotalLatUpperBusMon        =  0x00B9C508,
+	HWPfPermonATotalReqCntBusMon          =  0x00B9C50C,
+	HWPfQosmonBCntrlReg                   =  0x00BA0000,
+	HWPfQosmonBEvalOverflow0              =  0x00BA0008,
+	HWPfQosmonBEvalOverflow1              =  0x00BA000C,
+	HWPfQosmonBDivTerm                    =  0x00BA0010,
+	HWPfQosmonBTickTerm                   =  0x00BA0014,
+	HWPfQosmonBEvalTerm                   =  0x00BA0018,
+	HWPfQosmonBAveTerm                    =  0x00BA001C,
+	HWPfQosmonBForceEccErr                =  0x00BA0020,
+	HWPfQosmonBEccErrDetect               =  0x00BA0024,
+	HWPfQosmonBIterationConfig0Low        =  0x00BA0060,
+	HWPfQosmonBIterationConfig0High       =  0x00BA0064,
+	HWPfQosmonBIterationConfig1Low        =  0x00BA0068,
+	HWPfQosmonBIterationConfig1High       =  0x00BA006C,
+	HWPfQosmonBIterationConfig2Low        =  0x00BA0070,
+	HWPfQosmonBIterationConfig2High       =  0x00BA0074,
+	HWPfQosmonBIterationConfig3Low        =  0x00BA0078,
+	HWPfQosmonBIterationConfig3High       =  0x00BA007C,
+	HWPfQosmonBEvalMemAddr                =  0x00BA0080,
+	HWPfQosmonBEvalMemData                =  0x00BA0084,
+	HWPfQosmonBXaction                    =  0x00BA00C0,
+	HWPfQosmonBRemThres1Vf                =  0x00BA0400,
+	HWPfQosmonBThres2Vf                   =  0x00BA0404,
+	HWPfQosmonBWeiFracVf                  =  0x00BA0408,
+	HWPfQosmonBRrWeiVf                    =  0x00BA040C,
+	HWPfPermonBCntrlRegVf                 =  0x00BA8000,
+	HWPfPermonBCountVf                    =  0x00BA8008,
+	HWPfPermonBKCntLoVf                   =  0x00BA8010,
+	HWPfPermonBKCntHiVf                   =  0x00BA8014,
+	HWPfPermonBDeltaCntLoVf               =  0x00BA8020,
+	HWPfPermonBDeltaCntHiVf               =  0x00BA8024,
+	HWPfPermonBVersionReg                 =  0x00BAC000,
+	HWPfPermonBCbControlFec               =  0x00BAC0F0,
+	HWPfPermonBDltTimerLoFec              =  0x00BAC0F4,
+	HWPfPermonBDltTimerHiFec              =  0x00BAC0F8,
+	HWPfPermonBCbCountFec                 =  0x00BAC100,
+	HWPfPermonBAccExecTimerLoFec          =  0x00BAC104,
+	HWPfPermonBAccExecTimerHiFec          =  0x00BAC108,
+	HWPfPermonBExecTimerMinFec            =  0x00BAC200,
+	HWPfPermonBExecTimerMaxFec            =  0x00BAC204,
+	HWPfPermonBControlBusMon              =  0x00BAC400,
+	HWPfPermonBConfigBusMon               =  0x00BAC404,
+	HWPfPermonBSkipCountBusMon            =  0x00BAC408,
+	HWPfPermonBMinLatBusMon               =  0x00BAC40C,
+	HWPfPermonBMaxLatBusMon               =  0x00BAC500,
+	HWPfPermonBTotalLatLowBusMon          =  0x00BAC504,
+	HWPfPermonBTotalLatUpperBusMon        =  0x00BAC508,
+	HWPfPermonBTotalReqCntBusMon          =  0x00BAC50C,
+	HWPfFecUl5gCntrlReg                   =  0x00BC0000,
+	HWPfFecUl5gI2MThreshReg               =  0x00BC0004,
+	HWPfFecUl5gVersionReg                 =  0x00BC0100,
+	HWPfFecUl5gFcwStatusReg               =  0x00BC0104,
+	HWPfFecUl5gWarnReg                    =  0x00BC0108,
+	HwPfFecUl5gIbDebugReg                 =  0x00BC0200,
+	HwPfFecUl5gObLlrDebugReg              =  0x00BC0204,
+	HwPfFecUl5gObHarqDebugReg             =  0x00BC0208,
+	HwPfFecUl5g1CntrlReg                  =  0x00BC1000,
+	HwPfFecUl5g1I2MThreshReg              =  0x00BC1004,
+	HwPfFecUl5g1VersionReg                =  0x00BC1100,
+	HwPfFecUl5g1FcwStatusReg              =  0x00BC1104,
+	HwPfFecUl5g1WarnReg                   =  0x00BC1108,
+	HwPfFecUl5g1IbDebugReg                =  0x00BC1200,
+	HwPfFecUl5g1ObLlrDebugReg             =  0x00BC1204,
+	HwPfFecUl5g1ObHarqDebugReg            =  0x00BC1208,
+	HwPfFecUl5g2CntrlReg                  =  0x00BC2000,
+	HwPfFecUl5g2I2MThreshReg              =  0x00BC2004,
+	HwPfFecUl5g2VersionReg                =  0x00BC2100,
+	HwPfFecUl5g2FcwStatusReg              =  0x00BC2104,
+	HwPfFecUl5g2WarnReg                   =  0x00BC2108,
+	HwPfFecUl5g2IbDebugReg                =  0x00BC2200,
+	HwPfFecUl5g2ObLlrDebugReg             =  0x00BC2204,
+	HwPfFecUl5g2ObHarqDebugReg            =  0x00BC2208,
+	HwPfFecUl5g3CntrlReg                  =  0x00BC3000,
+	HwPfFecUl5g3I2MThreshReg              =  0x00BC3004,
+	HwPfFecUl5g3VersionReg                =  0x00BC3100,
+	HwPfFecUl5g3FcwStatusReg              =  0x00BC3104,
+	HwPfFecUl5g3WarnReg                   =  0x00BC3108,
+	HwPfFecUl5g3IbDebugReg                =  0x00BC3200,
+	HwPfFecUl5g3ObLlrDebugReg             =  0x00BC3204,
+	HwPfFecUl5g3ObHarqDebugReg            =  0x00BC3208,
+	HwPfFecUl5g4CntrlReg                  =  0x00BC4000,
+	HwPfFecUl5g4I2MThreshReg              =  0x00BC4004,
+	HwPfFecUl5g4VersionReg                =  0x00BC4100,
+	HwPfFecUl5g4FcwStatusReg              =  0x00BC4104,
+	HwPfFecUl5g4WarnReg                   =  0x00BC4108,
+	HwPfFecUl5g4IbDebugReg                =  0x00BC4200,
+	HwPfFecUl5g4ObLlrDebugReg             =  0x00BC4204,
+	HwPfFecUl5g4ObHarqDebugReg            =  0x00BC4208,
+	HwPfFecUl5g5CntrlReg                  =  0x00BC5000,
+	HwPfFecUl5g5I2MThreshReg              =  0x00BC5004,
+	HwPfFecUl5g5VersionReg                =  0x00BC5100,
+	HwPfFecUl5g5FcwStatusReg              =  0x00BC5104,
+	HwPfFecUl5g5WarnReg                   =  0x00BC5108,
+	HwPfFecUl5g5IbDebugReg                =  0x00BC5200,
+	HwPfFecUl5g5ObLlrDebugReg             =  0x00BC5204,
+	HwPfFecUl5g5ObHarqDebugReg            =  0x00BC5208,
+	HwPfFecUl5g6CntrlReg                  =  0x00BC6000,
+	HwPfFecUl5g6I2MThreshReg              =  0x00BC6004,
+	HwPfFecUl5g6VersionReg                =  0x00BC6100,
+	HwPfFecUl5g6FcwStatusReg              =  0x00BC6104,
+	HwPfFecUl5g6WarnReg                   =  0x00BC6108,
+	HwPfFecUl5g6IbDebugReg                =  0x00BC6200,
+	HwPfFecUl5g6ObLlrDebugReg             =  0x00BC6204,
+	HwPfFecUl5g6ObHarqDebugReg            =  0x00BC6208,
+	HwPfFecUl5g7CntrlReg                  =  0x00BC7000,
+	HwPfFecUl5g7I2MThreshReg              =  0x00BC7004,
+	HwPfFecUl5g7VersionReg                =  0x00BC7100,
+	HwPfFecUl5g7FcwStatusReg              =  0x00BC7104,
+	HwPfFecUl5g7WarnReg                   =  0x00BC7108,
+	HwPfFecUl5g7IbDebugReg                =  0x00BC7200,
+	HwPfFecUl5g7ObLlrDebugReg             =  0x00BC7204,
+	HwPfFecUl5g7ObHarqDebugReg            =  0x00BC7208,
+	HwPfFecUl5g8CntrlReg                  =  0x00BC8000,
+	HwPfFecUl5g8I2MThreshReg              =  0x00BC8004,
+	HwPfFecUl5g8VersionReg                =  0x00BC8100,
+	HwPfFecUl5g8FcwStatusReg              =  0x00BC8104,
+	HwPfFecUl5g8WarnReg                   =  0x00BC8108,
+	HwPfFecUl5g8IbDebugReg                =  0x00BC8200,
+	HwPfFecUl5g8ObLlrDebugReg             =  0x00BC8204,
+	HwPfFecUl5g8ObHarqDebugReg            =  0x00BC8208,
+	HWPfFecDl5gCntrlReg                   =  0x00BCF000,
+	HWPfFecDl5gI2MThreshReg               =  0x00BCF004,
+	HWPfFecDl5gVersionReg                 =  0x00BCF100,
+	HWPfFecDl5gFcwStatusReg               =  0x00BCF104,
+	HWPfFecDl5gWarnReg                    =  0x00BCF108,
+	HWPfFecUlVersionReg                   =  0x00BD0000,
+	HWPfFecUlControlReg                   =  0x00BD0004,
+	HWPfFecUlStatusReg                    =  0x00BD0008,
+	HWPfFecDlVersionReg                   =  0x00BDF000,
+	HWPfFecDlClusterConfigReg             =  0x00BDF004,
+	HWPfFecDlBurstThres                   =  0x00BDF00C,
+	HWPfFecDlClusterStatusReg0            =  0x00BDF040,
+	HWPfFecDlClusterStatusReg1            =  0x00BDF044,
+	HWPfFecDlClusterStatusReg2            =  0x00BDF048,
+	HWPfFecDlClusterStatusReg3            =  0x00BDF04C,
+	HWPfFecDlClusterStatusReg4            =  0x00BDF050,
+	HWPfFecDlClusterStatusReg5            =  0x00BDF054,
+	HWPfChaFabPllPllrst                   =  0x00C40000,
+	HWPfChaFabPllClk0                     =  0x00C40004,
+	HWPfChaFabPllClk1                     =  0x00C40008,
+	HWPfChaFabPllBwadj                    =  0x00C4000C,
+	HWPfChaFabPllLbw                      =  0x00C40010,
+	HWPfChaFabPllResetq                   =  0x00C40014,
+	HWPfChaFabPllPhshft0                  =  0x00C40018,
+	HWPfChaFabPllPhshft1                  =  0x00C4001C,
+	HWPfChaFabPllDivq0                    =  0x00C40020,
+	HWPfChaFabPllDivq1                    =  0x00C40024,
+	HWPfChaFabPllDivq2                    =  0x00C40028,
+	HWPfChaFabPllDivq3                    =  0x00C4002C,
+	HWPfChaFabPllDivq4                    =  0x00C40030,
+	HWPfChaFabPllDivq5                    =  0x00C40034,
+	HWPfChaFabPllDivq6                    =  0x00C40038,
+	HWPfChaFabPllDivq7                    =  0x00C4003C,
+	HWPfChaDl5gPllPllrst                  =  0x00C40080,
+	HWPfChaDl5gPllClk0                    =  0x00C40084,
+	HWPfChaDl5gPllClk1                    =  0x00C40088,
+	HWPfChaDl5gPllBwadj                   =  0x00C4008C,
+	HWPfChaDl5gPllLbw                     =  0x00C40090,
+	HWPfChaDl5gPllResetq                  =  0x00C40094,
+	HWPfChaDl5gPllPhshft0                 =  0x00C40098,
+	HWPfChaDl5gPllPhshft1                 =  0x00C4009C,
+	HWPfChaDl5gPllDivq0                   =  0x00C400A0,
+	HWPfChaDl5gPllDivq1                   =  0x00C400A4,
+	HWPfChaDl5gPllDivq2                   =  0x00C400A8,
+	HWPfChaDl5gPllDivq3                   =  0x00C400AC,
+	HWPfChaDl5gPllDivq4                   =  0x00C400B0,
+	HWPfChaDl5gPllDivq5                   =  0x00C400B4,
+	HWPfChaDl5gPllDivq6                   =  0x00C400B8,
+	HWPfChaDl5gPllDivq7                   =  0x00C400BC,
+	HWPfChaDl4gPllPllrst                  =  0x00C40100,
+	HWPfChaDl4gPllClk0                    =  0x00C40104,
+	HWPfChaDl4gPllClk1                    =  0x00C40108,
+	HWPfChaDl4gPllBwadj                   =  0x00C4010C,
+	HWPfChaDl4gPllLbw                     =  0x00C40110,
+	HWPfChaDl4gPllResetq                  =  0x00C40114,
+	HWPfChaDl4gPllPhshft0                 =  0x00C40118,
+	HWPfChaDl4gPllPhshft1                 =  0x00C4011C,
+	HWPfChaDl4gPllDivq0                   =  0x00C40120,
+	HWPfChaDl4gPllDivq1                   =  0x00C40124,
+	HWPfChaDl4gPllDivq2                   =  0x00C40128,
+	HWPfChaDl4gPllDivq3                   =  0x00C4012C,
+	HWPfChaDl4gPllDivq4                   =  0x00C40130,
+	HWPfChaDl4gPllDivq5                   =  0x00C40134,
+	HWPfChaDl4gPllDivq6                   =  0x00C40138,
+	HWPfChaDl4gPllDivq7                   =  0x00C4013C,
+	HWPfChaUl5gPllPllrst                  =  0x00C40180,
+	HWPfChaUl5gPllClk0                    =  0x00C40184,
+	HWPfChaUl5gPllClk1                    =  0x00C40188,
+	HWPfChaUl5gPllBwadj                   =  0x00C4018C,
+	HWPfChaUl5gPllLbw                     =  0x00C40190,
+	HWPfChaUl5gPllResetq                  =  0x00C40194,
+	HWPfChaUl5gPllPhshft0                 =  0x00C40198,
+	HWPfChaUl5gPllPhshft1                 =  0x00C4019C,
+	HWPfChaUl5gPllDivq0                   =  0x00C401A0,
+	HWPfChaUl5gPllDivq1                   =  0x00C401A4,
+	HWPfChaUl5gPllDivq2                   =  0x00C401A8,
+	HWPfChaUl5gPllDivq3                   =  0x00C401AC,
+	HWPfChaUl5gPllDivq4                   =  0x00C401B0,
+	HWPfChaUl5gPllDivq5                   =  0x00C401B4,
+	HWPfChaUl5gPllDivq6                   =  0x00C401B8,
+	HWPfChaUl5gPllDivq7                   =  0x00C401BC,
+	HWPfChaUl4gPllPllrst                  =  0x00C40200,
+	HWPfChaUl4gPllClk0                    =  0x00C40204,
+	HWPfChaUl4gPllClk1                    =  0x00C40208,
+	HWPfChaUl4gPllBwadj                   =  0x00C4020C,
+	HWPfChaUl4gPllLbw                     =  0x00C40210,
+	HWPfChaUl4gPllResetq                  =  0x00C40214,
+	HWPfChaUl4gPllPhshft0                 =  0x00C40218,
+	HWPfChaUl4gPllPhshft1                 =  0x00C4021C,
+	HWPfChaUl4gPllDivq0                   =  0x00C40220,
+	HWPfChaUl4gPllDivq1                   =  0x00C40224,
+	HWPfChaUl4gPllDivq2                   =  0x00C40228,
+	HWPfChaUl4gPllDivq3                   =  0x00C4022C,
+	HWPfChaUl4gPllDivq4                   =  0x00C40230,
+	HWPfChaUl4gPllDivq5                   =  0x00C40234,
+	HWPfChaUl4gPllDivq6                   =  0x00C40238,
+	HWPfChaUl4gPllDivq7                   =  0x00C4023C,
+	HWPfChaDdrPllPllrst                   =  0x00C40280,
+	HWPfChaDdrPllClk0                     =  0x00C40284,
+	HWPfChaDdrPllClk1                     =  0x00C40288,
+	HWPfChaDdrPllBwadj                    =  0x00C4028C,
+	HWPfChaDdrPllLbw                      =  0x00C40290,
+	HWPfChaDdrPllResetq                   =  0x00C40294,
+	HWPfChaDdrPllPhshft0                  =  0x00C40298,
+	HWPfChaDdrPllPhshft1                  =  0x00C4029C,
+	HWPfChaDdrPllDivq0                    =  0x00C402A0,
+	HWPfChaDdrPllDivq1                    =  0x00C402A4,
+	HWPfChaDdrPllDivq2                    =  0x00C402A8,
+	HWPfChaDdrPllDivq3                    =  0x00C402AC,
+	HWPfChaDdrPllDivq4                    =  0x00C402B0,
+	HWPfChaDdrPllDivq5                    =  0x00C402B4,
+	HWPfChaDdrPllDivq6                    =  0x00C402B8,
+	HWPfChaDdrPllDivq7                    =  0x00C402BC,
+	HWPfChaErrStatus                      =  0x00C40400,
+	HWPfChaErrMask                        =  0x00C40404,
+	HWPfChaDebugPcieMsiFifo               =  0x00C40410,
+	HWPfChaDebugDdrMsiFifo                =  0x00C40414,
+	HWPfChaDebugMiscMsiFifo               =  0x00C40418,
+	HWPfChaPwmSet                         =  0x00C40420,
+	HWPfChaDdrRstStatus                   =  0x00C40430,
+	HWPfChaDdrStDoneStatus                =  0x00C40434,
+	HWPfChaDdrWbRstCfg                    =  0x00C40438,
+	HWPfChaDdrApbRstCfg                   =  0x00C4043C,
+	HWPfChaDdrPhyRstCfg                   =  0x00C40440,
+	HWPfChaDdrCpuRstCfg                   =  0x00C40444,
+	HWPfChaDdrSifRstCfg                   =  0x00C40448,
+	HWPfChaPadcfgPcomp0                   =  0x00C41000,
+	HWPfChaPadcfgNcomp0                   =  0x00C41004,
+	HWPfChaPadcfgOdt0                     =  0x00C41008,
+	HWPfChaPadcfgProtect0                 =  0x00C4100C,
+	HWPfChaPreemphasisProtect0            =  0x00C41010,
+	HWPfChaPreemphasisCompen0             =  0x00C41040,
+	HWPfChaPreemphasisOdten0              =  0x00C41044,
+	HWPfChaPadcfgPcomp1                   =  0x00C41100,
+	HWPfChaPadcfgNcomp1                   =  0x00C41104,
+	HWPfChaPadcfgOdt1                     =  0x00C41108,
+	HWPfChaPadcfgProtect1                 =  0x00C4110C,
+	HWPfChaPreemphasisProtect1            =  0x00C41110,
+	HWPfChaPreemphasisCompen1             =  0x00C41140,
+	HWPfChaPreemphasisOdten1              =  0x00C41144,
+	HWPfChaPadcfgPcomp2                   =  0x00C41200,
+	HWPfChaPadcfgNcomp2                   =  0x00C41204,
+	HWPfChaPadcfgOdt2                     =  0x00C41208,
+	HWPfChaPadcfgProtect2                 =  0x00C4120C,
+	HWPfChaPreemphasisProtect2            =  0x00C41210,
+	HWPfChaPreemphasisCompen2             =  0x00C41240,
+	HWPfChaPreemphasisOdten2              =  0x00C41244,
+	HWPfChaPadcfgPcomp3                   =  0x00C41300,
+	HWPfChaPadcfgNcomp3                   =  0x00C41304,
+	HWPfChaPadcfgOdt3                     =  0x00C41308,
+	HWPfChaPadcfgProtect3                 =  0x00C4130C,
+	HWPfChaPreemphasisProtect3            =  0x00C41310,
+	HWPfChaPreemphasisCompen3             =  0x00C41340,
+	HWPfChaPreemphasisOdten3              =  0x00C41344,
+	HWPfChaPadcfgPcomp4                   =  0x00C41400,
+	HWPfChaPadcfgNcomp4                   =  0x00C41404,
+	HWPfChaPadcfgOdt4                     =  0x00C41408,
+	HWPfChaPadcfgProtect4                 =  0x00C4140C,
+	HWPfChaPreemphasisProtect4            =  0x00C41410,
+	HWPfChaPreemphasisCompen4             =  0x00C41440,
+	HWPfChaPreemphasisOdten4              =  0x00C41444,
+	HWPfHiVfToPfDbellVf                   =  0x00C80000,
+	HWPfHiPfToVfDbellVf                   =  0x00C80008,
+	HWPfHiInfoRingBaseLoVf                =  0x00C80010,
+	HWPfHiInfoRingBaseHiVf                =  0x00C80014,
+	HWPfHiInfoRingPointerVf               =  0x00C80018,
+	HWPfHiInfoRingIntWrEnVf               =  0x00C80020,
+	HWPfHiInfoRingPf2VfWrEnVf             =  0x00C80024,
+	HWPfHiMsixVectorMapperVf              =  0x00C80060,
+	HWPfHiModuleVersionReg                =  0x00C84000,
+	HWPfHiIosf2axiErrLogReg               =  0x00C84004,
+	HWPfHiHardResetReg                    =  0x00C84008,
+	HWPfHi5GHardResetReg                  =  0x00C8400C,
+	HWPfHiInfoRingBaseLoRegPf             =  0x00C84010,
+	HWPfHiInfoRingBaseHiRegPf             =  0x00C84014,
+	HWPfHiInfoRingPointerRegPf            =  0x00C84018,
+	HWPfHiInfoRingIntWrEnRegPf            =  0x00C84020,
+	HWPfHiInfoRingVf2pfLoWrEnReg          =  0x00C84024,
+	HWPfHiInfoRingVf2pfHiWrEnReg          =  0x00C84028,
+	HWPfHiLogParityErrStatusReg           =  0x00C8402C,
+	HWPfHiLogDataParityErrorVfStatusLo    =  0x00C84030,
+	HWPfHiLogDataParityErrorVfStatusHi    =  0x00C84034,
+	HWPfHiBlockTransmitOnErrorEn          =  0x00C84038,
+	HWPfHiCfgMsiIntWrEnRegPf              =  0x00C84040,
+	HWPfHiCfgMsiVf2pfLoWrEnReg            =  0x00C84044,
+	HWPfHiCfgMsiVf2pfHighWrEnReg          =  0x00C84048,
+	HWPfHiMsixVectorMapperPf              =  0x00C84060,
+	HWPfHiApbWrWaitTime                   =  0x00C84100,
+	HWPfHiXCounterMaxValue                =  0x00C84104,
+	HWPfHiPfMode                          =  0x00C84108,
+	HWPfHiClkGateHystReg                  =  0x00C8410C,
+	HWPfHiSnoopBitsReg                    =  0x00C84110,
+	HWPfHiMsiDropEnableReg                =  0x00C84114,
+	HWPfHiMsiStatReg                      =  0x00C84120,
+	HWPfHiFifoOflStatReg                  =  0x00C84124,
+	HWPfHiHiDebugReg                      =  0x00C841F4,
+	HWPfHiDebugMemSnoopMsiFifo            =  0x00C841F8,
+	HWPfHiDebugMemSnoopInputFifo          =  0x00C841FC,
+	HWPfHiMsixMappingConfig               =  0x00C84200,
+	HWPfHiJunkReg                         =  0x00C8FF00,
+	HWPfDdrUmmcVer                        =  0x00D00000,
+	HWPfDdrUmmcCap                        =  0x00D00010,
+	HWPfDdrUmmcCtrl                       =  0x00D00020,
+	HWPfDdrMpcPe                          =  0x00D00080,
+	HWPfDdrMpcPpri3                       =  0x00D00090,
+	HWPfDdrMpcPpri2                       =  0x00D000A0,
+	HWPfDdrMpcPpri1                       =  0x00D000B0,
+	HWPfDdrMpcPpri0                       =  0x00D000C0,
+	HWPfDdrMpcPrwgrpCtrl                  =  0x00D000D0,
+	HWPfDdrMpcPbw7                        =  0x00D000E0,
+	HWPfDdrMpcPbw6                        =  0x00D000F0,
+	HWPfDdrMpcPbw5                        =  0x00D00100,
+	HWPfDdrMpcPbw4                        =  0x00D00110,
+	HWPfDdrMpcPbw3                        =  0x00D00120,
+	HWPfDdrMpcPbw2                        =  0x00D00130,
+	HWPfDdrMpcPbw1                        =  0x00D00140,
+	HWPfDdrMpcPbw0                        =  0x00D00150,
+	HWPfDdrMemoryInit                     =  0x00D00200,
+	HWPfDdrMemoryInitDone                 =  0x00D00210,
+	HWPfDdrMemInitPhyTrng0                =  0x00D00240,
+	HWPfDdrMemInitPhyTrng1                =  0x00D00250,
+	HWPfDdrMemInitPhyTrng2                =  0x00D00260,
+	HWPfDdrMemInitPhyTrng3                =  0x00D00270,
+	HWPfDdrBcDram                         =  0x00D003C0,
+	HWPfDdrBcAddrMap                      =  0x00D003D0,
+	HWPfDdrBcRef                          =  0x00D003E0,
+	HWPfDdrBcTim0                         =  0x00D00400,
+	HWPfDdrBcTim1                         =  0x00D00410,
+	HWPfDdrBcTim2                         =  0x00D00420,
+	HWPfDdrBcTim3                         =  0x00D00430,
+	HWPfDdrBcTim4                         =  0x00D00440,
+	HWPfDdrBcTim5                         =  0x00D00450,
+	HWPfDdrBcTim6                         =  0x00D00460,
+	HWPfDdrBcTim7                         =  0x00D00470,
+	HWPfDdrBcTim8                         =  0x00D00480,
+	HWPfDdrBcTim9                         =  0x00D00490,
+	HWPfDdrBcTim10                        =  0x00D004A0,
+	HWPfDdrBcTim12                        =  0x00D004C0,
+	HWPfDdrDfiInit                        =  0x00D004D0,
+	HWPfDdrDfiInitComplete                =  0x00D004E0,
+	HWPfDdrDfiTim0                        =  0x00D004F0,
+	HWPfDdrDfiTim1                        =  0x00D00500,
+	HWPfDdrDfiPhyUpdEn                    =  0x00D00530,
+	HWPfDdrMemStatus                      =  0x00D00540,
+	HWPfDdrUmmcErrStatus                  =  0x00D00550,
+	HWPfDdrUmmcIntStatus                  =  0x00D00560,
+	HWPfDdrUmmcIntEn                      =  0x00D00570,
+	HWPfDdrPhyRdLatency                   =  0x00D48400,
+	HWPfDdrPhyRdLatencyDbi                =  0x00D48410,
+	HWPfDdrPhyWrLatency                   =  0x00D48420,
+	HWPfDdrPhyTrngType                    =  0x00D48430,
+	HWPfDdrPhyMrsTiming2                  =  0x00D48440,
+	HWPfDdrPhyMrsTiming0                  =  0x00D48450,
+	HWPfDdrPhyMrsTiming1                  =  0x00D48460,
+	HWPfDdrPhyDramTmrd                    =  0x00D48470,
+	HWPfDdrPhyDramTmod                    =  0x00D48480,
+	HWPfDdrPhyDramTwpre                   =  0x00D48490,
+	HWPfDdrPhyDramTrfc                    =  0x00D484A0,
+	HWPfDdrPhyDramTrwtp                   =  0x00D484B0,
+	HWPfDdrPhyMr01Dimm                    =  0x00D484C0,
+	HWPfDdrPhyMr01DimmDbi                 =  0x00D484D0,
+	HWPfDdrPhyMr23Dimm                    =  0x00D484E0,
+	HWPfDdrPhyMr45Dimm                    =  0x00D484F0,
+	HWPfDdrPhyMr67Dimm                    =  0x00D48500,
+	HWPfDdrPhyWrlvlWwRdlvlRr              =  0x00D48510,
+	HWPfDdrPhyOdtEn                       =  0x00D48520,
+	HWPfDdrPhyFastTrng                    =  0x00D48530,
+	HWPfDdrPhyDynTrngGap                  =  0x00D48540,
+	HWPfDdrPhyDynRcalGap                  =  0x00D48550,
+	HWPfDdrPhyIdletimeout                 =  0x00D48560,
+	HWPfDdrPhyRstCkeGap                   =  0x00D48570,
+	HWPfDdrPhyCkeMrsGap                   =  0x00D48580,
+	HWPfDdrPhyMemVrefMidVal               =  0x00D48590,
+	HWPfDdrPhyVrefStep                    =  0x00D485A0,
+	HWPfDdrPhyVrefThreshold               =  0x00D485B0,
+	HWPfDdrPhyPhyVrefMidVal               =  0x00D485C0,
+	HWPfDdrPhyDqsCountMax                 =  0x00D485D0,
+	HWPfDdrPhyDqsCountNum                 =  0x00D485E0,
+	HWPfDdrPhyDramRow                     =  0x00D485F0,
+	HWPfDdrPhyDramCol                     =  0x00D48600,
+	HWPfDdrPhyDramBgBa                    =  0x00D48610,
+	HWPfDdrPhyDynamicUpdreqrel            =  0x00D48620,
+	HWPfDdrPhyVrefLimits                  =  0x00D48630,
+	HWPfDdrPhyIdtmTcStatus                =  0x00D6C020,
+	HWPfDdrPhyIdtmFwVersion               =  0x00D6C410,
+	HWPfDdrPhyRdlvlGateInitDelay          =  0x00D70000,
+	HWPfDdrPhyRdenSmplabc                 =  0x00D70008,
+	HWPfDdrPhyVrefNibble0                 =  0x00D7000C,
+	HWPfDdrPhyVrefNibble1                 =  0x00D70010,
+	HWPfDdrPhyRdlvlGateDqsSmpl0           =  0x00D70014,
+	HWPfDdrPhyRdlvlGateDqsSmpl1           =  0x00D70018,
+	HWPfDdrPhyRdlvlGateDqsSmpl2           =  0x00D7001C,
+	HWPfDdrPhyDqsCount                    =  0x00D70020,
+	HWPfDdrPhyWrlvlRdlvlGateStatus        =  0x00D70024,
+	HWPfDdrPhyErrorFlags                  =  0x00D70028,
+	HWPfDdrPhyPowerDown                   =  0x00D70030,
+	HWPfDdrPhyPrbsSeedByte0               =  0x00D70034,
+	HWPfDdrPhyPrbsSeedByte1               =  0x00D70038,
+	HWPfDdrPhyPcompDq                     =  0x00D70040,
+	HWPfDdrPhyNcompDq                     =  0x00D70044,
+	HWPfDdrPhyPcompDqs                    =  0x00D70048,
+	HWPfDdrPhyNcompDqs                    =  0x00D7004C,
+	HWPfDdrPhyPcompCmd                    =  0x00D70050,
+	HWPfDdrPhyNcompCmd                    =  0x00D70054,
+	HWPfDdrPhyPcompCk                     =  0x00D70058,
+	HWPfDdrPhyNcompCk                     =  0x00D7005C,
+	HWPfDdrPhyRcalOdtDq                   =  0x00D70060,
+	HWPfDdrPhyRcalOdtDqs                  =  0x00D70064,
+	HWPfDdrPhyRcalMask1                   =  0x00D70068,
+	HWPfDdrPhyRcalMask2                   =  0x00D7006C,
+	HWPfDdrPhyRcalCtrl                    =  0x00D70070,
+	HWPfDdrPhyRcalCnt                     =  0x00D70074,
+	HWPfDdrPhyRcalOverride                =  0x00D70078,
+	HWPfDdrPhyRcalGateen                  =  0x00D7007C,
+	HWPfDdrPhyCtrl                        =  0x00D70080,
+	HWPfDdrPhyWrlvlAlg                    =  0x00D70084,
+	HWPfDdrPhyRcalVreftTxcmdOdt           =  0x00D70088,
+	HWPfDdrPhyRdlvlGateParam              =  0x00D7008C,
+	HWPfDdrPhyRdlvlGateParam2             =  0x00D70090,
+	HWPfDdrPhyRcalVreftTxdata             =  0x00D70094,
+	HWPfDdrPhyCmdIntDelay                 =  0x00D700A4,
+	HWPfDdrPhyAlertN                      =  0x00D700A8,
+	HWPfDdrPhyTrngReqWpre2tck             =  0x00D700AC,
+	HWPfDdrPhyCmdPhaseSel                 =  0x00D700B4,
+	HWPfDdrPhyCmdDcdl                     =  0x00D700B8,
+	HWPfDdrPhyCkDcdl                      =  0x00D700BC,
+	HWPfDdrPhySwTrngCtrl1                 =  0x00D700C0,
+	HWPfDdrPhySwTrngCtrl2                 =  0x00D700C4,
+	HWPfDdrPhyRcalPcompRden               =  0x00D700C8,
+	HWPfDdrPhyRcalNcompRden               =  0x00D700CC,
+	HWPfDdrPhyRcalCompen                  =  0x00D700D0,
+	HWPfDdrPhySwTrngRdqs                  =  0x00D700D4,
+	HWPfDdrPhySwTrngWdqs                  =  0x00D700D8,
+	HWPfDdrPhySwTrngRdena                 =  0x00D700DC,
+	HWPfDdrPhySwTrngRdenb                 =  0x00D700E0,
+	HWPfDdrPhySwTrngRdenc                 =  0x00D700E4,
+	HWPfDdrPhySwTrngWdq                   =  0x00D700E8,
+	HWPfDdrPhySwTrngRdq                   =  0x00D700EC,
+	HWPfDdrPhyPcfgHmValue                 =  0x00D700F0,
+	HWPfDdrPhyPcfgTimerValue              =  0x00D700F4,
+	HWPfDdrPhyPcfgSoftwareTraining        =  0x00D700F8,
+	HWPfDdrPhyPcfgMcStatus                =  0x00D700FC,
+	HWPfDdrPhyWrlvlPhRank0                =  0x00D70100,
+	HWPfDdrPhyRdenPhRank0                 =  0x00D70104,
+	HWPfDdrPhyRdenIntRank0                =  0x00D70108,
+	HWPfDdrPhyRdqsDcdlRank0               =  0x00D7010C,
+	HWPfDdrPhyRdqsShadowDcdlRank0         =  0x00D70110,
+	HWPfDdrPhyWdqsDcdlRank0               =  0x00D70114,
+	HWPfDdrPhyWdmDcdlShadowRank0          =  0x00D70118,
+	HWPfDdrPhyWdmDcdlRank0                =  0x00D7011C,
+	HWPfDdrPhyDbiDcdlRank0                =  0x00D70120,
+	HWPfDdrPhyRdenDcdlaRank0              =  0x00D70124,
+	HWPfDdrPhyDbiDcdlShadowRank0          =  0x00D70128,
+	HWPfDdrPhyRdenDcdlbRank0              =  0x00D7012C,
+	HWPfDdrPhyWdqsShadowDcdlRank0         =  0x00D70130,
+	HWPfDdrPhyRdenDcdlcRank0              =  0x00D70134,
+	HWPfDdrPhyRdenShadowDcdlaRank0        =  0x00D70138,
+	HWPfDdrPhyWrlvlIntRank0               =  0x00D7013C,
+	HWPfDdrPhyRdqDcdlBit0Rank0            =  0x00D70200,
+	HWPfDdrPhyRdqDcdlShadowBit0Rank0      =  0x00D70204,
+	HWPfDdrPhyWdqDcdlBit0Rank0            =  0x00D70208,
+	HWPfDdrPhyWdqDcdlShadowBit0Rank0      =  0x00D7020C,
+	HWPfDdrPhyRdqDcdlBit1Rank0            =  0x00D70240,
+	HWPfDdrPhyRdqDcdlShadowBit1Rank0      =  0x00D70244,
+	HWPfDdrPhyWdqDcdlBit1Rank0            =  0x00D70248,
+	HWPfDdrPhyWdqDcdlShadowBit1Rank0      =  0x00D7024C,
+	HWPfDdrPhyRdqDcdlBit2Rank0            =  0x00D70280,
+	HWPfDdrPhyRdqDcdlShadowBit2Rank0      =  0x00D70284,
+	HWPfDdrPhyWdqDcdlBit2Rank0            =  0x00D70288,
+	HWPfDdrPhyWdqDcdlShadowBit2Rank0      =  0x00D7028C,
+	HWPfDdrPhyRdqDcdlBit3Rank0            =  0x00D702C0,
+	HWPfDdrPhyRdqDcdlShadowBit3Rank0      =  0x00D702C4,
+	HWPfDdrPhyWdqDcdlBit3Rank0            =  0x00D702C8,
+	HWPfDdrPhyWdqDcdlShadowBit3Rank0      =  0x00D702CC,
+	HWPfDdrPhyRdqDcdlBit4Rank0            =  0x00D70300,
+	HWPfDdrPhyRdqDcdlShadowBit4Rank0      =  0x00D70304,
+	HWPfDdrPhyWdqDcdlBit4Rank0            =  0x00D70308,
+	HWPfDdrPhyWdqDcdlShadowBit4Rank0      =  0x00D7030C,
+	HWPfDdrPhyRdqDcdlBit5Rank0            =  0x00D70340,
+	HWPfDdrPhyRdqDcdlShadowBit5Rank0      =  0x00D70344,
+	HWPfDdrPhyWdqDcdlBit5Rank0            =  0x00D70348,
+	HWPfDdrPhyWdqDcdlShadowBit5Rank0      =  0x00D7034C,
+	HWPfDdrPhyRdqDcdlBit6Rank0            =  0x00D70380,
+	HWPfDdrPhyRdqDcdlShadowBit6Rank0      =  0x00D70384,
+	HWPfDdrPhyWdqDcdlBit6Rank0            =  0x00D70388,
+	HWPfDdrPhyWdqDcdlShadowBit6Rank0      =  0x00D7038C,
+	HWPfDdrPhyRdqDcdlBit7Rank0            =  0x00D703C0,
+	HWPfDdrPhyRdqDcdlShadowBit7Rank0      =  0x00D703C4,
+	HWPfDdrPhyWdqDcdlBit7Rank0            =  0x00D703C8,
+	HWPfDdrPhyWdqDcdlShadowBit7Rank0      =  0x00D703CC,
+	HWPfDdrPhyIdtmStatus                  =  0x00D740D0,
+	HWPfDdrPhyIdtmError                   =  0x00D74110,
+	HWPfDdrPhyIdtmDebug                   =  0x00D74120,
+	HWPfDdrPhyIdtmDebugInt                =  0x00D74130,
+	HwPfPcieLnAsicCfgovr                  =  0x00D80000,
+	HwPfPcieLnAclkmixer                   =  0x00D80004,
+	HwPfPcieLnTxrampfreq                  =  0x00D80008,
+	HwPfPcieLnLanetest                    =  0x00D8000C,
+	HwPfPcieLnDcctrl                      =  0x00D80010,
+	HwPfPcieLnDccmeas                     =  0x00D80014,
+	HwPfPcieLnDccovrAclk                  =  0x00D80018,
+	HwPfPcieLnDccovrTxa                   =  0x00D8001C,
+	HwPfPcieLnDccovrTxk                   =  0x00D80020,
+	HwPfPcieLnDccovrDclk                  =  0x00D80024,
+	HwPfPcieLnDccovrEclk                  =  0x00D80028,
+	HwPfPcieLnDcctrimAclk                 =  0x00D8002C,
+	HwPfPcieLnDcctrimTx                   =  0x00D80030,
+	HwPfPcieLnDcctrimDclk                 =  0x00D80034,
+	HwPfPcieLnDcctrimEclk                 =  0x00D80038,
+	HwPfPcieLnQuadCtrl                    =  0x00D8003C,
+	HwPfPcieLnQuadCorrIndex               =  0x00D80040,
+	HwPfPcieLnQuadCorrStatus              =  0x00D80044,
+	HwPfPcieLnAsicRxovr1                  =  0x00D80048,
+	HwPfPcieLnAsicRxovr2                  =  0x00D8004C,
+	HwPfPcieLnAsicEqinfovr                =  0x00D80050,
+	HwPfPcieLnRxcsr                       =  0x00D80054,
+	HwPfPcieLnRxfectrl                    =  0x00D80058,
+	HwPfPcieLnRxtest                      =  0x00D8005C,
+	HwPfPcieLnEscount                     =  0x00D80060,
+	HwPfPcieLnCdrctrl                     =  0x00D80064,
+	HwPfPcieLnCdrctrl2                    =  0x00D80068,
+	HwPfPcieLnCdrcfg0Ctrl0                =  0x00D8006C,
+	HwPfPcieLnCdrcfg0Ctrl1                =  0x00D80070,
+	HwPfPcieLnCdrcfg0Ctrl2                =  0x00D80074,
+	HwPfPcieLnCdrcfg1Ctrl0                =  0x00D80078,
+	HwPfPcieLnCdrcfg1Ctrl1                =  0x00D8007C,
+	HwPfPcieLnCdrcfg1Ctrl2                =  0x00D80080,
+	HwPfPcieLnCdrcfg2Ctrl0                =  0x00D80084,
+	HwPfPcieLnCdrcfg2Ctrl1                =  0x00D80088,
+	HwPfPcieLnCdrcfg2Ctrl2                =  0x00D8008C,
+	HwPfPcieLnCdrcfg3Ctrl0                =  0x00D80090,
+	HwPfPcieLnCdrcfg3Ctrl1                =  0x00D80094,
+	HwPfPcieLnCdrcfg3Ctrl2                =  0x00D80098,
+	HwPfPcieLnCdrphase                    =  0x00D8009C,
+	HwPfPcieLnCdrfreq                     =  0x00D800A0,
+	HwPfPcieLnCdrstatusPhase              =  0x00D800A4,
+	HwPfPcieLnCdrstatusFreq               =  0x00D800A8,
+	HwPfPcieLnCdroffset                   =  0x00D800AC,
+	HwPfPcieLnRxvosctl                    =  0x00D800B0,
+	HwPfPcieLnRxvosctl2                   =  0x00D800B4,
+	HwPfPcieLnRxlosctl                    =  0x00D800B8,
+	HwPfPcieLnRxlos                       =  0x00D800BC,
+	HwPfPcieLnRxlosvval                   =  0x00D800C0,
+	HwPfPcieLnRxvosd0                     =  0x00D800C4,
+	HwPfPcieLnRxvosd1                     =  0x00D800C8,
+	HwPfPcieLnRxvosep0                    =  0x00D800CC,
+	HwPfPcieLnRxvosep1                    =  0x00D800D0,
+	HwPfPcieLnRxvosen0                    =  0x00D800D4,
+	HwPfPcieLnRxvosen1                    =  0x00D800D8,
+	HwPfPcieLnRxvosafe                    =  0x00D800DC,
+	HwPfPcieLnRxvosa0                     =  0x00D800E0,
+	HwPfPcieLnRxvosa0Out                  =  0x00D800E4,
+	HwPfPcieLnRxvosa1                     =  0x00D800E8,
+	HwPfPcieLnRxvosa1Out                  =  0x00D800EC,
+	HwPfPcieLnRxmisc                      =  0x00D800F0,
+	HwPfPcieLnRxbeacon                    =  0x00D800F4,
+	HwPfPcieLnRxdssout                    =  0x00D800F8,
+	HwPfPcieLnRxdssout2                   =  0x00D800FC,
+	HwPfPcieLnAlphapctrl                  =  0x00D80100,
+	HwPfPcieLnAlphanctrl                  =  0x00D80104,
+	HwPfPcieLnAdaptctrl                   =  0x00D80108,
+	HwPfPcieLnAdaptctrl1                  =  0x00D8010C,
+	HwPfPcieLnAdaptstatus                 =  0x00D80110,
+	HwPfPcieLnAdaptvga1                   =  0x00D80114,
+	HwPfPcieLnAdaptvga2                   =  0x00D80118,
+	HwPfPcieLnAdaptvga3                   =  0x00D8011C,
+	HwPfPcieLnAdaptvga4                   =  0x00D80120,
+	HwPfPcieLnAdaptboost1                 =  0x00D80124,
+	HwPfPcieLnAdaptboost2                 =  0x00D80128,
+	HwPfPcieLnAdaptboost3                 =  0x00D8012C,
+	HwPfPcieLnAdaptboost4                 =  0x00D80130,
+	HwPfPcieLnAdaptsslms1                 =  0x00D80134,
+	HwPfPcieLnAdaptsslms2                 =  0x00D80138,
+	HwPfPcieLnAdaptvgaStatus              =  0x00D8013C,
+	HwPfPcieLnAdaptboostStatus            =  0x00D80140,
+	HwPfPcieLnAdaptsslmsStatus1           =  0x00D80144,
+	HwPfPcieLnAdaptsslmsStatus2           =  0x00D80148,
+	HwPfPcieLnAfectrl1                    =  0x00D8014C,
+	HwPfPcieLnAfectrl2                    =  0x00D80150,
+	HwPfPcieLnAfectrl3                    =  0x00D80154,
+	HwPfPcieLnAfedefault1                 =  0x00D80158,
+	HwPfPcieLnAfedefault2                 =  0x00D8015C,
+	HwPfPcieLnDfectrl1                    =  0x00D80160,
+	HwPfPcieLnDfectrl2                    =  0x00D80164,
+	HwPfPcieLnDfectrl3                    =  0x00D80168,
+	HwPfPcieLnDfectrl4                    =  0x00D8016C,
+	HwPfPcieLnDfectrl5                    =  0x00D80170,
+	HwPfPcieLnDfectrl6                    =  0x00D80174,
+	HwPfPcieLnAfestatus1                  =  0x00D80178,
+	HwPfPcieLnAfestatus2                  =  0x00D8017C,
+	HwPfPcieLnDfestatus1                  =  0x00D80180,
+	HwPfPcieLnDfestatus2                  =  0x00D80184,
+	HwPfPcieLnDfestatus3                  =  0x00D80188,
+	HwPfPcieLnDfestatus4                  =  0x00D8018C,
+	HwPfPcieLnDfestatus5                  =  0x00D80190,
+	HwPfPcieLnAlphastatus                 =  0x00D80194,
+	HwPfPcieLnFomctrl1                    =  0x00D80198,
+	HwPfPcieLnFomctrl2                    =  0x00D8019C,
+	HwPfPcieLnFomctrl3                    =  0x00D801A0,
+	HwPfPcieLnAclkcalStatus               =  0x00D801A4,
+	HwPfPcieLnOffscorrStatus              =  0x00D801A8,
+	HwPfPcieLnEyewidthStatus              =  0x00D801AC,
+	HwPfPcieLnEyeheightStatus             =  0x00D801B0,
+	HwPfPcieLnAsicTxovr1                  =  0x00D801B4,
+	HwPfPcieLnAsicTxovr2                  =  0x00D801B8,
+	HwPfPcieLnAsicTxovr3                  =  0x00D801BC,
+	HwPfPcieLnTxbiasadjOvr                =  0x00D801C0,
+	HwPfPcieLnTxcsr                       =  0x00D801C4,
+	HwPfPcieLnTxtest                      =  0x00D801C8,
+	HwPfPcieLnTxtestword                  =  0x00D801CC,
+	HwPfPcieLnTxtestwordHigh              =  0x00D801D0,
+	HwPfPcieLnTxdrive                     =  0x00D801D4,
+	HwPfPcieLnMtcsLn                      =  0x00D801D8,
+	HwPfPcieLnStatsumLn                   =  0x00D801DC,
+	HwPfPcieLnRcbusScratch                =  0x00D801E0,
+	HwPfPcieLnRcbusMinorrev               =  0x00D801F0,
+	HwPfPcieLnRcbusMajorrev               =  0x00D801F4,
+	HwPfPcieLnRcbusBlocktype              =  0x00D801F8,
+	HwPfPcieSupPllcsr                     =  0x00D80800,
+	HwPfPcieSupPlldiv                     =  0x00D80804,
+	HwPfPcieSupPllcal                     =  0x00D80808,
+	HwPfPcieSupPllcalsts                  =  0x00D8080C,
+	HwPfPcieSupPllmeas                    =  0x00D80810,
+	HwPfPcieSupPlldactrim                 =  0x00D80814,
+	HwPfPcieSupPllbiastrim                =  0x00D80818,
+	HwPfPcieSupPllbwtrim                  =  0x00D8081C,
+	HwPfPcieSupPllcaldly                  =  0x00D80820,
+	HwPfPcieSupRefclkonpclkctrl           =  0x00D80824,
+	HwPfPcieSupPclkdelay                  =  0x00D80828,
+	HwPfPcieSupPhyconfig                  =  0x00D8082C,
+	HwPfPcieSupRcalIntf                   =  0x00D80830,
+	HwPfPcieSupAuxcsr                     =  0x00D80834,
+	HwPfPcieSupVref                       =  0x00D80838,
+	HwPfPcieSupLinkmode                   =  0x00D8083C,
+	HwPfPcieSupRrefcalctl                 =  0x00D80840,
+	HwPfPcieSupRrefcal                    =  0x00D80844,
+	HwPfPcieSupRrefcaldly                 =  0x00D80848,
+	HwPfPcieSupTximpcalctl                =  0x00D8084C,
+	HwPfPcieSupTximpcal                   =  0x00D80850,
+	HwPfPcieSupTximpoffset                =  0x00D80854,
+	HwPfPcieSupTximpcaldly                =  0x00D80858,
+	HwPfPcieSupRximpcalctl                =  0x00D8085C,
+	HwPfPcieSupRximpcal                   =  0x00D80860,
+	HwPfPcieSupRximpoffset                =  0x00D80864,
+	HwPfPcieSupRximpcaldly                =  0x00D80868,
+	HwPfPcieSupFence                      =  0x00D8086C,
+	HwPfPcieSupMtcs                       =  0x00D80870,
+	HwPfPcieSupStatsum                    =  0x00D809B8,
+	HwPfPciePcsDpStatus0                  =  0x00D81000,
+	HwPfPciePcsDpControl0                 =  0x00D81004,
+	HwPfPciePcsPmaStatusLane0             =  0x00D81008,
+	HwPfPciePcsPipeStatusLane0            =  0x00D8100C,
+	HwPfPciePcsTxdeemph0Lane0             =  0x00D81010,
+	HwPfPciePcsTxdeemph1Lane0             =  0x00D81014,
+	HwPfPciePcsInternalStatusLane0        =  0x00D81018,
+	HwPfPciePcsDpStatus1                  =  0x00D8101C,
+	HwPfPciePcsDpControl1                 =  0x00D81020,
+	HwPfPciePcsPmaStatusLane1             =  0x00D81024,
+	HwPfPciePcsPipeStatusLane1            =  0x00D81028,
+	HwPfPciePcsTxdeemph0Lane1             =  0x00D8102C,
+	HwPfPciePcsTxdeemph1Lane1             =  0x00D81030,
+	HwPfPciePcsInternalStatusLane1        =  0x00D81034,
+	HwPfPciePcsDpStatus2                  =  0x00D81038,
+	HwPfPciePcsDpControl2                 =  0x00D8103C,
+	HwPfPciePcsPmaStatusLane2             =  0x00D81040,
+	HwPfPciePcsPipeStatusLane2            =  0x00D81044,
+	HwPfPciePcsTxdeemph0Lane2             =  0x00D81048,
+	HwPfPciePcsTxdeemph1Lane2             =  0x00D8104C,
+	HwPfPciePcsInternalStatusLane2        =  0x00D81050,
+	HwPfPciePcsDpStatus3                  =  0x00D81054,
+	HwPfPciePcsDpControl3                 =  0x00D81058,
+	HwPfPciePcsPmaStatusLane3             =  0x00D8105C,
+	HwPfPciePcsPipeStatusLane3            =  0x00D81060,
+	HwPfPciePcsTxdeemph0Lane3             =  0x00D81064,
+	HwPfPciePcsTxdeemph1Lane3             =  0x00D81068,
+	HwPfPciePcsInternalStatusLane3        =  0x00D8106C,
+	HwPfPciePcsEbStatus0                  =  0x00D81070,
+	HwPfPciePcsEbStatus1                  =  0x00D81074,
+	HwPfPciePcsEbStatus2                  =  0x00D81078,
+	HwPfPciePcsEbStatus3                  =  0x00D8107C,
+	HwPfPciePcsPllSettingPcieG1           =  0x00D81088,
+	HwPfPciePcsPllSettingPcieG2           =  0x00D8108C,
+	HwPfPciePcsPllSettingPcieG3           =  0x00D81090,
+	HwPfPciePcsControl                    =  0x00D81094,
+	HwPfPciePcsEqControl                  =  0x00D81098,
+	HwPfPciePcsEqTimer                    =  0x00D8109C,
+	HwPfPciePcsEqErrStatus                =  0x00D810A0,
+	HwPfPciePcsEqErrCount                 =  0x00D810A4,
+	HwPfPciePcsStatus                     =  0x00D810A8,
+	HwPfPciePcsMiscRegister               =  0x00D810AC,
+	HwPfPciePcsObsControl                 =  0x00D810B0,
+	HwPfPciePcsPrbsCount0                 =  0x00D81200,
+	HwPfPciePcsBistControl0               =  0x00D81204,
+	HwPfPciePcsBistStaticWord00           =  0x00D81208,
+	HwPfPciePcsBistStaticWord10           =  0x00D8120C,
+	HwPfPciePcsBistStaticWord20           =  0x00D81210,
+	HwPfPciePcsBistStaticWord30           =  0x00D81214,
+	HwPfPciePcsPrbsCount1                 =  0x00D81220,
+	HwPfPciePcsBistControl1               =  0x00D81224,
+	HwPfPciePcsBistStaticWord01           =  0x00D81228,
+	HwPfPciePcsBistStaticWord11           =  0x00D8122C,
+	HwPfPciePcsBistStaticWord21           =  0x00D81230,
+	HwPfPciePcsBistStaticWord31           =  0x00D81234,
+	HwPfPciePcsPrbsCount2                 =  0x00D81240,
+	HwPfPciePcsBistControl2               =  0x00D81244,
+	HwPfPciePcsBistStaticWord02           =  0x00D81248,
+	HwPfPciePcsBistStaticWord12           =  0x00D8124C,
+	HwPfPciePcsBistStaticWord22           =  0x00D81250,
+	HwPfPciePcsBistStaticWord32           =  0x00D81254,
+	HwPfPciePcsPrbsCount3                 =  0x00D81260,
+	HwPfPciePcsBistControl3               =  0x00D81264,
+	HwPfPciePcsBistStaticWord03           =  0x00D81268,
+	HwPfPciePcsBistStaticWord13           =  0x00D8126C,
+	HwPfPciePcsBistStaticWord23           =  0x00D81270,
+	HwPfPciePcsBistStaticWord33           =  0x00D81274,
+	HwPfPcieGpexLtssmStateCntrl           =  0x00D90400,
+	HwPfPcieGpexLtssmStateStatus          =  0x00D90404,
+	HwPfPcieGpexSkipFreqTimer             =  0x00D90408,
+	HwPfPcieGpexLaneSelect                =  0x00D9040C,
+	HwPfPcieGpexLaneDeskew                =  0x00D90410,
+	HwPfPcieGpexRxErrorStatus             =  0x00D90414,
+	HwPfPcieGpexLaneNumControl            =  0x00D90418,
+	HwPfPcieGpexNFstControl               =  0x00D9041C,
+	HwPfPcieGpexLinkStatus                =  0x00D90420,
+	HwPfPcieGpexAckReplayTimeout          =  0x00D90438,
+	HwPfPcieGpexSeqNumberStatus           =  0x00D9043C,
+	HwPfPcieGpexCoreClkRatio              =  0x00D90440,
+	HwPfPcieGpexDllTholdControl           =  0x00D90448,
+	HwPfPcieGpexPmTimer                   =  0x00D90450,
+	HwPfPcieGpexPmeTimeout                =  0x00D90454,
+	HwPfPcieGpexAspmL1Timer               =  0x00D90458,
+	HwPfPcieGpexAspmReqTimer              =  0x00D9045C,
+	HwPfPcieGpexAspmL1Dis                 =  0x00D90460,
+	HwPfPcieGpexAdvisoryErrorControl      =  0x00D90468,
+	HwPfPcieGpexId                        =  0x00D90470,
+	HwPfPcieGpexClasscode                 =  0x00D90474,
+	HwPfPcieGpexSubsystemId               =  0x00D90478,
+	HwPfPcieGpexDeviceCapabilities        =  0x00D9047C,
+	HwPfPcieGpexLinkCapabilities          =  0x00D90480,
+	HwPfPcieGpexFunctionNumber            =  0x00D90484,
+	HwPfPcieGpexPmCapabilities            =  0x00D90488,
+	HwPfPcieGpexFunctionSelect            =  0x00D9048C,
+	HwPfPcieGpexErrorCounter              =  0x00D904AC,
+	HwPfPcieGpexConfigReady               =  0x00D904B0,
+	HwPfPcieGpexFcUpdateTimeout           =  0x00D904B8,
+	HwPfPcieGpexFcUpdateTimer             =  0x00D904BC,
+	HwPfPcieGpexVcBufferLoad              =  0x00D904C8,
+	HwPfPcieGpexVcBufferSizeThold         =  0x00D904CC,
+	HwPfPcieGpexVcBufferSelect            =  0x00D904D0,
+	HwPfPcieGpexBarEnable                 =  0x00D904D4,
+	HwPfPcieGpexBarDwordLower             =  0x00D904D8,
+	HwPfPcieGpexBarDwordUpper             =  0x00D904DC,
+	HwPfPcieGpexBarSelect                 =  0x00D904E0,
+	HwPfPcieGpexCreditCounterSelect       =  0x00D904E4,
+	HwPfPcieGpexCreditCounterStatus       =  0x00D904E8,
+	HwPfPcieGpexTlpHeaderSelect           =  0x00D904EC,
+	HwPfPcieGpexTlpHeaderDword0           =  0x00D904F0,
+	HwPfPcieGpexTlpHeaderDword1           =  0x00D904F4,
+	HwPfPcieGpexTlpHeaderDword2           =  0x00D904F8,
+	HwPfPcieGpexTlpHeaderDword3           =  0x00D904FC,
+	HwPfPcieGpexRelaxOrderControl         =  0x00D90500,
+	HwPfPcieGpexBarPrefetch               =  0x00D90504,
+	HwPfPcieGpexFcCheckControl            =  0x00D90508,
+	HwPfPcieGpexFcUpdateTimerTraffic      =  0x00D90518,
+	HwPfPcieGpexPhyControl0               =  0x00D9053C,
+	HwPfPcieGpexPhyControl1               =  0x00D90544,
+	HwPfPcieGpexPhyControl2               =  0x00D9054C,
+	HwPfPcieGpexUserControl0              =  0x00D9055C,
+	HwPfPcieGpexUncorrErrorStatus         =  0x00D905F0,
+	HwPfPcieGpexRxCplError                =  0x00D90620,
+	HwPfPcieGpexRxCplErrorDword0          =  0x00D90624,
+	HwPfPcieGpexRxCplErrorDword1          =  0x00D90628,
+	HwPfPcieGpexRxCplErrorDword2          =  0x00D9062C,
+	HwPfPcieGpexPabSwResetEn              =  0x00D90630,
+	HwPfPcieGpexGen3Control0              =  0x00D90634,
+	HwPfPcieGpexGen3Control1              =  0x00D90638,
+	HwPfPcieGpexGen3Control2              =  0x00D9063C,
+	HwPfPcieGpexGen2ControlCsr            =  0x00D90640,
+	HwPfPcieGpexTotalVfInitialVf0         =  0x00D90644,
+	HwPfPcieGpexTotalVfInitialVf1         =  0x00D90648,
+	HwPfPcieGpexSriovLinkDevId0           =  0x00D90684,
+	HwPfPcieGpexSriovLinkDevId1           =  0x00D90688,
+	HwPfPcieGpexSriovPageSize0            =  0x00D906C4,
+	HwPfPcieGpexSriovPageSize1            =  0x00D906C8,
+	HwPfPcieGpexIdVersion                 =  0x00D906FC,
+	HwPfPcieGpexSriovVfOffsetStride0      =  0x00D90704,
+	HwPfPcieGpexSriovVfOffsetStride1      =  0x00D90708,
+	HwPfPcieGpexGen3DeskewControl         =  0x00D907B4,
+	HwPfPcieGpexGen3EqControl             =  0x00D907B8,
+	HwPfPcieGpexBridgeVersion             =  0x00D90800,
+	HwPfPcieGpexBridgeCapability          =  0x00D90804,
+	HwPfPcieGpexBridgeControl             =  0x00D90808,
+	HwPfPcieGpexBridgeStatus              =  0x00D9080C,
+	HwPfPcieGpexEngineActivityStatus      =  0x00D9081C,
+	HwPfPcieGpexEngineResetControl        =  0x00D90820,
+	HwPfPcieGpexAxiPioControl             =  0x00D90840,
+	HwPfPcieGpexAxiPioStatus              =  0x00D90844,
+	HwPfPcieGpexAmbaSlaveCmdStatus        =  0x00D90848,
+	HwPfPcieGpexPexPioControl             =  0x00D908C0,
+	HwPfPcieGpexPexPioStatus              =  0x00D908C4,
+	HwPfPcieGpexAmbaMasterStatus          =  0x00D908C8,
+	HwPfPcieGpexCsrSlaveCmdStatus         =  0x00D90920,
+	HwPfPcieGpexMailboxAxiControl         =  0x00D90A50,
+	HwPfPcieGpexMailboxAxiData            =  0x00D90A54,
+	HwPfPcieGpexMailboxPexControl         =  0x00D90A90,
+	HwPfPcieGpexMailboxPexData            =  0x00D90A94,
+	HwPfPcieGpexPexInterruptEnable        =  0x00D90AD0,
+	HwPfPcieGpexPexInterruptStatus        =  0x00D90AD4,
+	HwPfPcieGpexPexInterruptAxiPioVector  =  0x00D90AD8,
+	HwPfPcieGpexPexInterruptPexPioVector  =  0x00D90AE0,
+	HwPfPcieGpexPexInterruptMiscVector    =  0x00D90AF8,
+	HwPfPcieGpexAmbaInterruptPioEnable    =  0x00D90B00,
+	HwPfPcieGpexAmbaInterruptMiscEnable   =  0x00D90B0C,
+	HwPfPcieGpexAmbaInterruptPioStatus    =  0x00D90B10,
+	HwPfPcieGpexAmbaInterruptMiscStatus   =  0x00D90B1C,
+	HwPfPcieGpexPexPmControl              =  0x00D90B80,
+	HwPfPcieGpexSlotMisc                  =  0x00D90B88,
+	HwPfPcieGpexAxiAddrMappingControl     =  0x00D90BA0,
+	HwPfPcieGpexAxiAddrMappingWindowAxiBase     =  0x00D90BA4,
+	HwPfPcieGpexAxiAddrMappingWindowPexBaseLow  =  0x00D90BA8,
+	HwPfPcieGpexAxiAddrMappingWindowPexBaseHigh =  0x00D90BAC,
+	HwPfPcieGpexPexBarAddrFunc0Bar0       =  0x00D91BA0,
+	HwPfPcieGpexPexBarAddrFunc0Bar1       =  0x00D91BA4,
+	HwPfPcieGpexAxiAddrMappingPcieHdrParam =  0x00D95BA0,
+	HwPfPcieGpexExtAxiAddrMappingAxiBase  =  0x00D980A0,
+	HwPfPcieGpexPexExtBarAddrFunc0Bar0    =  0x00D984A0,
+	HwPfPcieGpexPexExtBarAddrFunc0Bar1    =  0x00D984A4,
+	HwPfPcieGpexAmbaInterruptFlrEnable    =  0x00D9B960,
+	HwPfPcieGpexAmbaInterruptFlrStatus    =  0x00D9B9A0,
+	HwPfPcieGpexExtAxiAddrMappingSize     =  0x00D9BAF0,
+	HwPfPcieGpexPexPioAwcacheControl      =  0x00D9C300,
+	HwPfPcieGpexPexPioArcacheControl      =  0x00D9C304,
+	HwPfPcieGpexPabObSizeControlVc0       =  0x00D9C310
+};
+
+/* TIP PF Interrupt numbers */
+enum {
+	ACC100_PF_INT_QMGR_AQ_OVERFLOW = 0,
+	ACC100_PF_INT_DOORBELL_VF_2_PF = 1,
+	ACC100_PF_INT_DMA_DL_DESC_IRQ = 2,
+	ACC100_PF_INT_DMA_UL_DESC_IRQ = 3,
+	ACC100_PF_INT_DMA_MLD_DESC_IRQ = 4,
+	ACC100_PF_INT_DMA_UL5G_DESC_IRQ = 5,
+	ACC100_PF_INT_DMA_DL5G_DESC_IRQ = 6,
+	ACC100_PF_INT_ILLEGAL_FORMAT = 7,
+	ACC100_PF_INT_QMGR_DISABLED_ACCESS = 8,
+	ACC100_PF_INT_QMGR_AQ_OVERTHRESHOLD = 9,
+	ACC100_PF_INT_ARAM_ACCESS_ERR = 10,
+	ACC100_PF_INT_ARAM_ECC_1BIT_ERR = 11,
+	ACC100_PF_INT_PARITY_ERR = 12,
+	ACC100_PF_INT_QMGR_ERR = 13,
+	ACC100_PF_INT_INT_REQ_OVERFLOW = 14,
+	ACC100_PF_INT_APB_TIMEOUT = 15,
+};
+
+#endif /* ACC100_PF_ENUM_H */
diff --git a/drivers/baseband/acc100/acc100_vf_enum.h b/drivers/baseband/acc100/acc100_vf_enum.h
new file mode 100644
index 0000000..b512af3
--- /dev/null
+++ b/drivers/baseband/acc100/acc100_vf_enum.h
@@ -0,0 +1,73 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017 Intel Corporation
+ */
+
+#ifndef ACC100_VF_ENUM_H
+#define ACC100_VF_ENUM_H
+
+/*
+ * ACC100 Register mapping on VF BAR0
+ * This is automatically generated from RDL; the format may change with a new RDL
+ */
+enum {
+	HWVfQmgrIngressAq             =  0x00000000,
+	HWVfHiVfToPfDbellVf           =  0x00000800,
+	HWVfHiPfToVfDbellVf           =  0x00000808,
+	HWVfHiInfoRingBaseLoVf        =  0x00000810,
+	HWVfHiInfoRingBaseHiVf        =  0x00000814,
+	HWVfHiInfoRingPointerVf       =  0x00000818,
+	HWVfHiInfoRingIntWrEnVf       =  0x00000820,
+	HWVfHiInfoRingPf2VfWrEnVf     =  0x00000824,
+	HWVfHiMsixVectorMapperVf      =  0x00000860,
+	HWVfDmaFec5GulDescBaseLoRegVf =  0x00000920,
+	HWVfDmaFec5GulDescBaseHiRegVf =  0x00000924,
+	HWVfDmaFec5GulRespPtrLoRegVf  =  0x00000928,
+	HWVfDmaFec5GulRespPtrHiRegVf  =  0x0000092C,
+	HWVfDmaFec5GdlDescBaseLoRegVf =  0x00000940,
+	HWVfDmaFec5GdlDescBaseHiRegVf =  0x00000944,
+	HWVfDmaFec5GdlRespPtrLoRegVf  =  0x00000948,
+	HWVfDmaFec5GdlRespPtrHiRegVf  =  0x0000094C,
+	HWVfDmaFec4GulDescBaseLoRegVf =  0x00000960,
+	HWVfDmaFec4GulDescBaseHiRegVf =  0x00000964,
+	HWVfDmaFec4GulRespPtrLoRegVf  =  0x00000968,
+	HWVfDmaFec4GulRespPtrHiRegVf  =  0x0000096C,
+	HWVfDmaFec4GdlDescBaseLoRegVf =  0x00000980,
+	HWVfDmaFec4GdlDescBaseHiRegVf =  0x00000984,
+	HWVfDmaFec4GdlRespPtrLoRegVf  =  0x00000988,
+	HWVfDmaFec4GdlRespPtrHiRegVf  =  0x0000098C,
+	HWVfDmaDdrBaseRangeRoVf       =  0x000009A0,
+	HWVfQmgrAqResetVf             =  0x00000E00,
+	HWVfQmgrRingSizeVf            =  0x00000E04,
+	HWVfQmgrGrpDepthLog20Vf       =  0x00000E08,
+	HWVfQmgrGrpDepthLog21Vf       =  0x00000E0C,
+	HWVfQmgrGrpFunction0Vf        =  0x00000E10,
+	HWVfQmgrGrpFunction1Vf        =  0x00000E14,
+	HWVfPmACntrlRegVf             =  0x00000F40,
+	HWVfPmACountVf                =  0x00000F48,
+	HWVfPmAKCntLoVf               =  0x00000F50,
+	HWVfPmAKCntHiVf               =  0x00000F54,
+	HWVfPmADeltaCntLoVf           =  0x00000F60,
+	HWVfPmADeltaCntHiVf           =  0x00000F64,
+	HWVfPmBCntrlRegVf             =  0x00000F80,
+	HWVfPmBCountVf                =  0x00000F88,
+	HWVfPmBKCntLoVf               =  0x00000F90,
+	HWVfPmBKCntHiVf               =  0x00000F94,
+	HWVfPmBDeltaCntLoVf           =  0x00000FA0,
+	HWVfPmBDeltaCntHiVf           =  0x00000FA4
+};
+
+/* TIP VF Interrupt numbers */
+enum {
+	ACC100_VF_INT_QMGR_AQ_OVERFLOW = 0,
+	ACC100_VF_INT_DOORBELL_VF_2_PF = 1,
+	ACC100_VF_INT_DMA_DL_DESC_IRQ = 2,
+	ACC100_VF_INT_DMA_UL_DESC_IRQ = 3,
+	ACC100_VF_INT_DMA_MLD_DESC_IRQ = 4,
+	ACC100_VF_INT_DMA_UL5G_DESC_IRQ = 5,
+	ACC100_VF_INT_DMA_DL5G_DESC_IRQ = 6,
+	ACC100_VF_INT_ILLEGAL_FORMAT = 7,
+	ACC100_VF_INT_QMGR_DISABLED_ACCESS = 8,
+	ACC100_VF_INT_QMGR_AQ_OVERTHRESHOLD = 9,
+};
+
+#endif /* ACC100_VF_ENUM_H */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 6f46df0..cd77570 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -5,6 +5,9 @@
 #ifndef _RTE_ACC100_PMD_H_
 #define _RTE_ACC100_PMD_H_
 
+#include "acc100_pf_enum.h"
+#include "acc100_vf_enum.h"
+
 /* Helper macro for logging */
 #define rte_bbdev_log(level, fmt, ...) \
 	rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \
@@ -27,6 +30,493 @@
 #define RTE_ACC100_PF_DEVICE_ID        (0x0d5c)
 #define RTE_ACC100_VF_DEVICE_ID        (0x0d5d)
 
+/* Define as 1 to use only a single FEC engine */
+#ifndef RTE_ACC100_SINGLE_FEC
+#define RTE_ACC100_SINGLE_FEC 0
+#endif
+
+/* Values used in filling in descriptors */
+#define ACC100_DMA_DESC_TYPE           2
+#define ACC100_DMA_CODE_BLK_MODE       0
+#define ACC100_DMA_BLKID_FCW           1
+#define ACC100_DMA_BLKID_IN            2
+#define ACC100_DMA_BLKID_OUT_ENC       1
+#define ACC100_DMA_BLKID_OUT_HARD      1
+#define ACC100_DMA_BLKID_OUT_SOFT      2
+#define ACC100_DMA_BLKID_OUT_HARQ      3
+#define ACC100_DMA_BLKID_IN_HARQ       3
+
+/* Values used in filling in decode FCWs */
+#define ACC100_FCW_TD_VER              1
+#define ACC100_FCW_TD_EXT_COLD_REG_EN  1
+#define ACC100_FCW_TD_AUTOMAP          0x0f
+#define ACC100_FCW_TD_RVIDX_0          2
+#define ACC100_FCW_TD_RVIDX_1          26
+#define ACC100_FCW_TD_RVIDX_2          50
+#define ACC100_FCW_TD_RVIDX_3          74
+
+/* Values used in writing to the registers */
+#define ACC100_REG_IRQ_EN_ALL          0x1FF83FF  /* Enable all interrupts */
+
+/* ACC100 Specific Dimensioning */
+#define ACC100_SIZE_64MBYTE            (64*1024*1024)
+/* Number of elements in an Info Ring */
+#define ACC100_INFO_RING_NUM_ENTRIES   1024
+/* Number of elements in HARQ layout memory */
+#define ACC100_HARQ_LAYOUT             (64*1024*1024)
+/* Assumed offset for HARQ in memory */
+#define ACC100_HARQ_OFFSET             (32*1024)
+/* Mask used to calculate an index in an Info Ring array (not a byte offset) */
+#define ACC100_INFO_RING_MASK          (ACC100_INFO_RING_NUM_ENTRIES-1)
+/* Number of Virtual Functions ACC100 supports */
+#define ACC100_NUM_VFS                  16
+#define ACC100_NUM_QGRPS                 8
+#define ACC100_NUM_QGRPS_PER_WORD        8
+#define ACC100_NUM_AQS                  16
+#define MAX_ENQ_BATCH_SIZE          255
+/* All ACC100 registers are aligned to 32 bits = 4 bytes */
+#define BYTES_IN_WORD                 4
+#define MAX_E_MBUF                64000
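The Info Ring sizing above relies on the entry count being a power of two, so a free-running counter can be reduced to a ring index with a mask rather than a modulo. A minimal sketch of that arithmetic (the helper name is hypothetical, not part of this patch):

```c
#include <stdint.h>

#define ACC100_INFO_RING_NUM_ENTRIES   1024
#define ACC100_INFO_RING_MASK          (ACC100_INFO_RING_NUM_ENTRIES - 1)

/* Hypothetical helper: reduce a free-running tail counter to an index in
 * the Info Ring array. Since the ring size is a power of two, masking is
 * equivalent to (tail % ACC100_INFO_RING_NUM_ENTRIES) but cheaper. */
static inline uint16_t
info_ring_index(uint32_t tail)
{
	return (uint16_t)(tail & ACC100_INFO_RING_MASK);
}
```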
+
+#define GRP_ID_SHIFT    10 /* Queue Index Hierarchy */
+#define VF_ID_SHIFT     4  /* Queue Index Hierarchy */
+#define VF_OFFSET_QOS   16 /* offset in Memory Space specific to QoS Mon */
+#define TMPL_PRI_0      0x03020100
+#define TMPL_PRI_1      0x07060504
+#define TMPL_PRI_2      0x0b0a0908
+#define TMPL_PRI_3      0x0f0e0d0c
+#define QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
+#define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
+
+#define ACC100_NUM_TMPL  32
+/* Mapping of signals for the available engines */
+#define SIG_UL_5G      0
+#define SIG_UL_5G_LAST 7
+#define SIG_DL_5G      13
+#define SIG_DL_5G_LAST 15
+#define SIG_UL_4G      16
+#define SIG_UL_4G_LAST 21
+#define SIG_DL_4G      27
+#define SIG_DL_4G_LAST 31
+
+/* Maximum number of attempts to allocate a memory block for all rings */
+#define SW_RING_MEM_ALLOC_ATTEMPTS 5
+#define MAX_QUEUE_DEPTH           1024
+#define ACC100_DMA_MAX_NUM_POINTERS  14
+#define ACC100_DMA_DESC_PADDING      8
+#define ACC100_FCW_PADDING           12
+#define ACC100_DESC_FCW_OFFSET       192
+#define ACC100_DESC_SIZE             256
+#define ACC100_DESC_OFFSET           (ACC100_DESC_SIZE / 64)
+#define ACC100_FCW_TE_BLEN     32
+#define ACC100_FCW_TD_BLEN     24
+#define ACC100_FCW_LE_BLEN     32
+#define ACC100_FCW_LD_BLEN     36
+
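The descriptor constants above imply a fixed layout: 256 B per descriptor with the FCW at byte 192, which leaves room for the largest FCW (LDPC decode, 36 B). A hedged sketch of the resulting address arithmetic (the helper is illustrative, not part of the patch):

```c
#define ACC100_DESC_FCW_OFFSET       192
#define ACC100_DESC_SIZE             256
#define ACC100_FCW_LD_BLEN           36

/* Illustrative helper: byte offset of the FCW inside the i-th descriptor
 * of a ring of ACC100_DESC_SIZE-byte descriptors. */
static inline unsigned int
desc_fcw_byte_offset(unsigned int i)
{
	return i * ACC100_DESC_SIZE + ACC100_DESC_FCW_OFFSET;
}
```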
+#define ACC100_FCW_VER         2
+#define MUX_5GDL_DESC 6
+#define CMP_ENC_SIZE 20
+#define CMP_DEC_SIZE 24
+#define ENC_OFFSET (32)
+#define DEC_OFFSET (80)
+#define ACC100_EXT_MEM
+#define ACC100_HARQ_OFFSET_THRESHOLD 1024
+
+/* Constants from K0 computation from 3GPP 38.212 Table 5.4.2.1-2 */
+#define N_ZC_1 66 /* N = 66 Zc for BG 1 */
+#define N_ZC_2 50 /* N = 50 Zc for BG 2 */
+#define K0_1_1 17 /* K0 fraction numerator for rv 1 and BG 1 */
+#define K0_1_2 13 /* K0 fraction numerator for rv 1 and BG 2 */
+#define K0_2_1 33 /* K0 fraction numerator for rv 2 and BG 1 */
+#define K0_2_2 25 /* K0 fraction numerator for rv 2 and BG 2 */
+#define K0_3_1 56 /* K0 fraction numerator for rv 3 and BG 1 */
+#define K0_3_2 43 /* K0 fraction numerator for rv 3 and BG 2 */
+
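The K0 numerators above come from circular-buffer rate matching in 3GPP 38.212: in the full-buffer case (ncb equal to N = 66*Zc for BG1 or 50*Zc for BG2) the starting position reduces to numerator * Zc. A sketch under that assumption (get_k0_full_buffer is a hypothetical name; a complete helper would also handle limited-buffer rate matching, where ncb < N):

```c
#include <stdint.h>

#define K0_1_1 17
#define K0_1_2 13
#define K0_2_1 33
#define K0_2_2 25
#define K0_3_1 56
#define K0_3_2 43

/* Sketch: starting position k0 for a given redundancy version, assuming
 * the full-buffer case (ncb == N). rv_index 0 always starts at 0. */
static inline uint16_t
get_k0_full_buffer(uint16_t z_c, uint8_t bg, uint8_t rv_index)
{
	if (rv_index == 0)
		return 0;
	if (rv_index == 1)
		return (bg == 1 ? K0_1_1 : K0_1_2) * z_c;
	if (rv_index == 2)
		return (bg == 1 ? K0_2_1 : K0_2_2) * z_c;
	return (bg == 1 ? K0_3_1 : K0_3_2) * z_c;
}
```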
+/* ACC100 Configuration */
+#define ACC100_DDR_ECC_ENABLE
+#define ACC100_CFG_DMA_ERROR 0x3D7
+#define ACC100_CFG_AXI_CACHE 0x11
+#define ACC100_CFG_QMGR_HI_P 0x0F0F
+#define ACC100_CFG_PCI_AXI 0xC003
+#define ACC100_CFG_PCI_BRIDGE 0x40006033
+#define ACC100_ENGINE_OFFSET 0x1000
+#define ACC100_RESET_HI 0x20100
+#define ACC100_RESET_LO 0x20000
+#define ACC100_RESET_HARD 0x1FF
+#define ACC100_ENGINES_MAX 9
+#define LONG_WAIT 1000
+
+/* ACC100 DMA Descriptor triplet */
+struct acc100_dma_triplet {
+	uint64_t address;
+	uint32_t blen:20,
+		res0:4,
+		last:1,
+		dma_ext:1,
+		res1:2,
+		blkid:4;
+} __rte_packed;
+
+/* ACC100 DMA Response Descriptor */
+union acc100_dma_rsp_desc {
+	uint32_t val;
+	struct {
+		uint32_t crc_status:1,
+			synd_ok:1,
+			dma_err:1,
+			neg_stop:1,
+			fcw_err:1,
+			output_err:1,
+			input_err:1,
+			timestampEn:1,
+			iterCountFrac:8,
+			iter_cnt:8,
+			rsrvd3:6,
+			sdone:1,
+			fdone:1;
+		uint32_t add_info_0;
+		uint32_t add_info_1;
+	};
+};
+
+/* ACC100 Queue Manager Enqueue PCI Register */
+union acc100_enqueue_reg_fmt {
+	uint32_t val;
+	struct {
+		uint32_t num_elem:8,
+			addr_offset:3,
+			rsrvd:1,
+			req_elem_addr:20;
+	};
+};
+
+/* FEC 4G Uplink Frame Control Word */
+struct __rte_packed acc100_fcw_td {
+	uint8_t fcw_ver:4,
+		num_maps:4; /* Unused */
+	uint8_t filler:6, /* Unused */
+		rsrvd0:1,
+		bypass_sb_deint:1;
+	uint16_t k_pos;
+	uint16_t k_neg; /* Unused */
+	uint8_t c_neg; /* Unused */
+	uint8_t c; /* Unused */
+	uint32_t ea; /* Unused */
+	uint32_t eb; /* Unused */
+	uint8_t cab; /* Unused */
+	uint8_t k0_start_col; /* Unused */
+	uint8_t rsrvd1;
+	uint8_t code_block_mode:1, /* Unused */
+		turbo_crc_type:1,
+		rsrvd2:3,
+		bypass_teq:1, /* Unused */
+		soft_output_en:1, /* Unused */
+		ext_td_cold_reg_en:1;
+	union { /* External Cold register */
+		uint32_t ext_td_cold_reg;
+		struct {
+			uint32_t min_iter:4, /* Unused */
+				max_iter:4,
+				ext_scale:5, /* Unused */
+				rsrvd3:3,
+				early_stop_en:1, /* Unused */
+				sw_soft_out_dis:1, /* Unused */
+				sw_et_cont:1, /* Unused */
+				sw_soft_out_saturation:1, /* Unused */
+				half_iter_on:1, /* Unused */
+				raw_decoder_input_on:1, /* Unused */
+				rsrvd4:10;
+		};
+	};
+};
+
+/* FEC 5GNR Uplink Frame Control Word */
+struct __rte_packed acc100_fcw_ld {
+	uint32_t FCWversion:4,
+		qm:4,
+		nfiller:11,
+		BG:1,
+		Zc:9,
+		res0:1,
+		synd_precoder:1,
+		synd_post:1;
+	uint32_t ncb:16,
+		k0:16;
+	uint32_t rm_e:24,
+		hcin_en:1,
+		hcout_en:1,
+		crc_select:1,
+		bypass_dec:1,
+		bypass_intlv:1,
+		so_en:1,
+		so_bypass_rm:1,
+		so_bypass_intlv:1;
+	uint32_t hcin_offset:16,
+		hcin_size0:16;
+	uint32_t hcin_size1:16,
+		hcin_decomp_mode:3,
+		llr_pack_mode:1,
+		hcout_comp_mode:3,
+		res2:1,
+		dec_convllr:4,
+		hcout_convllr:4;
+	uint32_t itmax:7,
+		itstop:1,
+		so_it:7,
+		res3:1,
+		hcout_offset:16;
+	uint32_t hcout_size0:16,
+		hcout_size1:16;
+	uint32_t gain_i:8,
+		gain_h:8,
+		negstop_th:16;
+	uint32_t negstop_it:7,
+		negstop_en:1,
+		res4:24;
+};
+
+/* FEC 4G Downlink Frame Control Word */
+struct __rte_packed acc100_fcw_te {
+	uint16_t k_neg;
+	uint16_t k_pos;
+	uint8_t c_neg;
+	uint8_t c;
+	uint8_t filler;
+	uint8_t cab;
+	uint32_t ea:17,
+		rsrvd0:15;
+	uint32_t eb:17,
+		rsrvd1:15;
+	uint16_t ncb_neg;
+	uint16_t ncb_pos;
+	uint8_t rv_idx0:2,
+		rsrvd2:2,
+		rv_idx1:2,
+		rsrvd3:2;
+	uint8_t bypass_rv_idx0:1,
+		bypass_rv_idx1:1,
+		bypass_rm:1,
+		rsrvd4:5;
+	uint8_t rsrvd5:1,
+		rsrvd6:3,
+		code_block_crc:1,
+		rsrvd7:3;
+	uint8_t code_block_mode:1,
+		rsrvd8:7;
+	uint64_t rsrvd9;
+};
+
+/* FEC 5GNR Downlink Frame Control Word */
+struct __rte_packed acc100_fcw_le {
+	uint32_t FCWversion:4,
+		qm:4,
+		nfiller:11,
+		BG:1,
+		Zc:9,
+		res0:3;
+	uint32_t ncb:16,
+		k0:16;
+	uint32_t rm_e:24,
+		res1:2,
+		crc_select:1,
+		res2:1,
+		bypass_intlv:1,
+		res3:3;
+	uint32_t res4_a:12,
+		mcb_count:3,
+		res4_b:17;
+	uint32_t res5;
+	uint32_t res6;
+	uint32_t res7;
+	uint32_t res8;
+};
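As a standalone sketch of how these packed bit-fields behave, the fragment below re-declares only the first word of the 5GNR downlink FCW (a stand-in for illustration, not the driver's struct) and checks that the fields pack into a single 32-bit word on typical GCC/Clang ABIs:

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the first 32-bit word of acc100_fcw_le: six bit-fields
 * (4+4+11+1+9+3 bits) that together occupy exactly one uint32_t. */
struct fcw_le_word0 {
	uint32_t fcw_version:4,
		qm:4,        /* modulation order */
		nfiller:11,  /* number of filler bits */
		bg:1,        /* LDPC base graph selection */
		zc:9,        /* lifting size */
		res0:3;      /* reserved */
};
```

Values written to a field round-trip as long as they fit the declared width (e.g. `zc` can hold up to 511).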
+
+/* ACC100 DMA Request Descriptor */
+struct __rte_packed acc100_dma_req_desc {
+	union {
+		struct{
+			uint32_t type:4,
+				rsrvd0:26,
+				sdone:1,
+				fdone:1;
+			uint32_t rsrvd1;
+			uint32_t rsrvd2;
+			uint32_t pass_param:8,
+				sdone_enable:1,
+				irq_enable:1,
+				timeStampEn:1,
+				res0:5,
+				numCBs:4,
+				res1:4,
+				m2dlen:4,
+				d2mlen:4;
+		};
+		struct{
+			uint32_t word0;
+			uint32_t word1;
+			uint32_t word2;
+			uint32_t word3;
+		};
+	};
+	struct acc100_dma_triplet data_ptrs[ACC100_DMA_MAX_NUM_POINTERS];
+
+	/* Virtual addresses used to retrieve SW context info */
+	union {
+		void *op_addr;
+		uint64_t pad1;  /* pad to 64 bits */
+	};
+	/*
+	 * Stores additional information needed for driver processing:
+	 * - last_desc_in_batch - flag used to mark last descriptor (CB)
+	 *                        in batch
+	 * - cbs_in_tb - stores information about total number of Code Blocks
+	 *               in currently processed Transport Block
+	 */
+	union {
+		struct {
+			union {
+				struct acc100_fcw_ld fcw_ld;
+				struct acc100_fcw_td fcw_td;
+				struct acc100_fcw_le fcw_le;
+				struct acc100_fcw_te fcw_te;
+				uint32_t pad2[ACC100_FCW_PADDING];
+			};
+			uint32_t last_desc_in_batch :8,
+				cbs_in_tb:8,
+				pad4 : 16;
+		};
+		uint64_t pad3[ACC100_DMA_DESC_PADDING]; /* pad to 64 bits */
+	};
+};
+
+/* ACC100 DMA Descriptor */
+union acc100_dma_desc {
+	struct acc100_dma_req_desc req;
+	union acc100_dma_rsp_desc rsp;
+};
+
+
+/* Union describing HARQ layout entry */
+union acc100_harq_layout_data {
+	uint32_t val;
+	struct {
+		uint16_t offset;
+		uint16_t size0;
+	};
+} __rte_packed;
+
+
+/* Union describing Info Ring entry */
+union acc100_info_ring_data {
+	uint32_t val;
+	struct {
+		union {
+			uint16_t detailed_info;
+			struct {
+				uint16_t aq_id: 4;
+				uint16_t qg_id: 4;
+				uint16_t vf_id: 6;
+				uint16_t reserved: 2;
+			};
+		};
+		uint16_t int_nb: 7;
+		uint16_t msi_0: 1;
+		uint16_t vf2pf: 6;
+		uint16_t loop: 1;
+		uint16_t valid: 1;
+	};
+} __rte_packed;
+
+struct acc100_registry_addr {
+	unsigned int dma_ring_dl5g_hi;
+	unsigned int dma_ring_dl5g_lo;
+	unsigned int dma_ring_ul5g_hi;
+	unsigned int dma_ring_ul5g_lo;
+	unsigned int dma_ring_dl4g_hi;
+	unsigned int dma_ring_dl4g_lo;
+	unsigned int dma_ring_ul4g_hi;
+	unsigned int dma_ring_ul4g_lo;
+	unsigned int ring_size;
+	unsigned int info_ring_hi;
+	unsigned int info_ring_lo;
+	unsigned int info_ring_en;
+	unsigned int info_ring_ptr;
+	unsigned int tail_ptrs_dl5g_hi;
+	unsigned int tail_ptrs_dl5g_lo;
+	unsigned int tail_ptrs_ul5g_hi;
+	unsigned int tail_ptrs_ul5g_lo;
+	unsigned int tail_ptrs_dl4g_hi;
+	unsigned int tail_ptrs_dl4g_lo;
+	unsigned int tail_ptrs_ul4g_hi;
+	unsigned int tail_ptrs_ul4g_lo;
+	unsigned int depth_log0_offset;
+	unsigned int depth_log1_offset;
+	unsigned int qman_group_func;
+	unsigned int ddr_range;
+};
+
+/* Structure holding registry addresses for PF */
+static const struct acc100_registry_addr pf_reg_addr = {
+	.dma_ring_dl5g_hi = HWPfDmaFec5GdlDescBaseHiRegVf,
+	.dma_ring_dl5g_lo = HWPfDmaFec5GdlDescBaseLoRegVf,
+	.dma_ring_ul5g_hi = HWPfDmaFec5GulDescBaseHiRegVf,
+	.dma_ring_ul5g_lo = HWPfDmaFec5GulDescBaseLoRegVf,
+	.dma_ring_dl4g_hi = HWPfDmaFec4GdlDescBaseHiRegVf,
+	.dma_ring_dl4g_lo = HWPfDmaFec4GdlDescBaseLoRegVf,
+	.dma_ring_ul4g_hi = HWPfDmaFec4GulDescBaseHiRegVf,
+	.dma_ring_ul4g_lo = HWPfDmaFec4GulDescBaseLoRegVf,
+	.ring_size = HWPfQmgrRingSizeVf,
+	.info_ring_hi = HWPfHiInfoRingBaseHiRegPf,
+	.info_ring_lo = HWPfHiInfoRingBaseLoRegPf,
+	.info_ring_en = HWPfHiInfoRingIntWrEnRegPf,
+	.info_ring_ptr = HWPfHiInfoRingPointerRegPf,
+	.tail_ptrs_dl5g_hi = HWPfDmaFec5GdlRespPtrHiRegVf,
+	.tail_ptrs_dl5g_lo = HWPfDmaFec5GdlRespPtrLoRegVf,
+	.tail_ptrs_ul5g_hi = HWPfDmaFec5GulRespPtrHiRegVf,
+	.tail_ptrs_ul5g_lo = HWPfDmaFec5GulRespPtrLoRegVf,
+	.tail_ptrs_dl4g_hi = HWPfDmaFec4GdlRespPtrHiRegVf,
+	.tail_ptrs_dl4g_lo = HWPfDmaFec4GdlRespPtrLoRegVf,
+	.tail_ptrs_ul4g_hi = HWPfDmaFec4GulRespPtrHiRegVf,
+	.tail_ptrs_ul4g_lo = HWPfDmaFec4GulRespPtrLoRegVf,
+	.depth_log0_offset = HWPfQmgrGrpDepthLog20Vf,
+	.depth_log1_offset = HWPfQmgrGrpDepthLog21Vf,
+	.qman_group_func = HWPfQmgrGrpFunction0,
+	.ddr_range = HWPfDmaVfDdrBaseRw,
+};
+
+/* Structure holding registry addresses for VF */
+static const struct acc100_registry_addr vf_reg_addr = {
+	.dma_ring_dl5g_hi = HWVfDmaFec5GdlDescBaseHiRegVf,
+	.dma_ring_dl5g_lo = HWVfDmaFec5GdlDescBaseLoRegVf,
+	.dma_ring_ul5g_hi = HWVfDmaFec5GulDescBaseHiRegVf,
+	.dma_ring_ul5g_lo = HWVfDmaFec5GulDescBaseLoRegVf,
+	.dma_ring_dl4g_hi = HWVfDmaFec4GdlDescBaseHiRegVf,
+	.dma_ring_dl4g_lo = HWVfDmaFec4GdlDescBaseLoRegVf,
+	.dma_ring_ul4g_hi = HWVfDmaFec4GulDescBaseHiRegVf,
+	.dma_ring_ul4g_lo = HWVfDmaFec4GulDescBaseLoRegVf,
+	.ring_size = HWVfQmgrRingSizeVf,
+	.info_ring_hi = HWVfHiInfoRingBaseHiVf,
+	.info_ring_lo = HWVfHiInfoRingBaseLoVf,
+	.info_ring_en = HWVfHiInfoRingIntWrEnVf,
+	.info_ring_ptr = HWVfHiInfoRingPointerVf,
+	.tail_ptrs_dl5g_hi = HWVfDmaFec5GdlRespPtrHiRegVf,
+	.tail_ptrs_dl5g_lo = HWVfDmaFec5GdlRespPtrLoRegVf,
+	.tail_ptrs_ul5g_hi = HWVfDmaFec5GulRespPtrHiRegVf,
+	.tail_ptrs_ul5g_lo = HWVfDmaFec5GulRespPtrLoRegVf,
+	.tail_ptrs_dl4g_hi = HWVfDmaFec4GdlRespPtrHiRegVf,
+	.tail_ptrs_dl4g_lo = HWVfDmaFec4GdlRespPtrLoRegVf,
+	.tail_ptrs_ul4g_hi = HWVfDmaFec4GulRespPtrHiRegVf,
+	.tail_ptrs_ul4g_lo = HWVfDmaFec4GulRespPtrLoRegVf,
+	.depth_log0_offset = HWVfQmgrGrpDepthLog20Vf,
+	.depth_log1_offset = HWVfQmgrGrpDepthLog21Vf,
+	.qman_group_func = HWVfQmgrGrpFunction0Vf,
+	.ddr_range = HWVfDmaDdrBaseRangeRoVf,
+};
+
 /* Private data structure for each ACC100 device */
 struct acc100_device {
 	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v4 03/11] baseband/acc100: add info get function
  2020-09-04 17:53   ` [dpdk-dev] [PATCH v4 00/11] bbdev PMD ACC100 Nicolas Chautru
  2020-09-04 17:53     ` [dpdk-dev] [PATCH v4 01/11] drivers/baseband: add PMD for ACC100 Nicolas Chautru
  2020-09-04 17:53     ` [dpdk-dev] [PATCH v4 02/11] baseband/acc100: add register definition file Nicolas Chautru
@ 2020-09-04 17:53     ` Nicolas Chautru
  2020-09-18  2:47       ` Liu, Tianjiao
  2020-09-04 17:54     ` [dpdk-dev] [PATCH v4 04/11] baseband/acc100: add queue configuration Nicolas Chautru
                       ` (8 subsequent siblings)
  11 siblings, 1 reply; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-04 17:53 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add the "info_get" function to the driver, to allow the
device to be queried.
No processing capabilities are available yet.
bbdev-test is linked to support the PMD with null capability.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 app/test-bbdev/Makefile                  |   3 +
 app/test-bbdev/meson.build               |   3 +
 drivers/baseband/acc100/rte_acc100_cfg.h |  96 +++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.c | 225 +++++++++++++++++++++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.h |   3 +
 5 files changed, 330 insertions(+)
 create mode 100644 drivers/baseband/acc100/rte_acc100_cfg.h

diff --git a/app/test-bbdev/Makefile b/app/test-bbdev/Makefile
index dc29557..dbc3437 100644
--- a/app/test-bbdev/Makefile
+++ b/app/test-bbdev/Makefile
@@ -26,5 +26,8 @@ endif
 ifeq ($(CONFIG_RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC),y)
 LDLIBS += -lrte_pmd_bbdev_fpga_5gnr_fec
 endif
+ifeq ($(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100),y)
+LDLIBS += -lrte_pmd_bbdev_acc100
+endif
 
 include $(RTE_SDK)/mk/rte.app.mk
diff --git a/app/test-bbdev/meson.build b/app/test-bbdev/meson.build
index 18ab6a8..fbd8ae3 100644
--- a/app/test-bbdev/meson.build
+++ b/app/test-bbdev/meson.build
@@ -12,3 +12,6 @@ endif
 if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC')
 	deps += ['pmd_bbdev_fpga_5gnr_fec']
 endif
+if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_ACC100')
+	deps += ['pmd_bbdev_acc100']
+endif
\ No newline at end of file
diff --git a/drivers/baseband/acc100/rte_acc100_cfg.h b/drivers/baseband/acc100/rte_acc100_cfg.h
new file mode 100644
index 0000000..73bbe36
--- /dev/null
+++ b/drivers/baseband/acc100/rte_acc100_cfg.h
@@ -0,0 +1,96 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_ACC100_CFG_H_
+#define _RTE_ACC100_CFG_H_
+
+/**
+ * @file rte_acc100_cfg.h
+ *
+ * Functions for configuring ACC100 HW, exposed directly to applications.
+ * Configuration related to encoding/decoding is done through the
+ * librte_bbdev library.
+ *
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ */
+
+#include <stdint.h>
+#include <stdbool.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+/** Number of Virtual Functions ACC100 supports */
+#define RTE_ACC100_NUM_VFS 16
+
+/**
+ * Definition of Queue Topology for ACC100 Configuration
+ * Some details are abstracted out to expose a clean interface,
+ * given that comprehensive flexibility is not required.
+ */
+struct rte_q_topology_t {
+	/** Number of QGroups in incremental order of priority */
+	uint16_t num_qgroups;
+	/**
+	 * All QGroups have the same number of AQs here.
+	 * Note : Could be made a 16-array if more flexibility is really
+	 * required
+	 */
+	uint16_t num_aqs_per_groups;
+	/**
+	 * Depth of the AQs is the same for all QGroups here. Log2 Enum : 2^N
+	 * Note : Could be made a 16-array if more flexibility is really
+	 * required
+	 */
+	uint16_t aq_depth_log2;
+	/**
+	 * Index of the first Queue Group - assuming contiguity
+	 * Initialized as -1
+	 */
+	int8_t first_qgroup_index;
+};
+
+/**
+ * Definition of Arbitration related parameters for ACC100 Configuration
+ */
+struct rte_arbitration_t {
+	/** Default Weight for VF Fairness Arbitration */
+	uint16_t round_robin_weight;
+	uint32_t gbr_threshold1; /**< Guaranteed Bitrate Threshold 1 */
+	uint32_t gbr_threshold2; /**< Guaranteed Bitrate Threshold 2 */
+};
+
+/**
+ * Structure to pass ACC100 configuration.
+ * Note: all VF Bundles will have the same configuration.
+ */
+struct acc100_conf {
+	bool pf_mode_en; /**< 1 if PF is used for dataplane, 0 for VFs */
+	/** 1 if input '1' bit is represented by a positive LLR value, 0 if '1'
+	 * bit is represented by a negative value.
+	 */
+	bool input_pos_llr_1_bit;
+	/** 1 if output '1' bit is represented by a positive value, 0 if '1'
+	 * bit is represented by a negative value.
+	 */
+	bool output_pos_llr_1_bit;
+	uint16_t num_vf_bundles; /**< Number of VF bundles to setup */
+	/** Queue topology for each operation type */
+	struct rte_q_topology_t q_ul_4g;
+	struct rte_q_topology_t q_dl_4g;
+	struct rte_q_topology_t q_ul_5g;
+	struct rte_q_topology_t q_dl_5g;
+	/** Arbitration configuration for each operation type */
+	struct rte_arbitration_t arb_ul_4g[RTE_ACC100_NUM_VFS];
+	struct rte_arbitration_t arb_dl_4g[RTE_ACC100_NUM_VFS];
+	struct rte_arbitration_t arb_ul_5g[RTE_ACC100_NUM_VFS];
+	struct rte_arbitration_t arb_dl_5g[RTE_ACC100_NUM_VFS];
+};
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_ACC100_CFG_H_ */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 1b4cd13..7807a30 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -26,6 +26,184 @@
 RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
 #endif
 
+/* Read a register of an ACC100 device */
+static inline uint32_t
+acc100_reg_read(struct acc100_device *d, uint32_t offset)
+{
+	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
+	uint32_t ret = *((volatile uint32_t *)(reg_addr));
+	return rte_le_to_cpu_32(ret);
+}
+
+/* Calculate the offset of the enqueue register */
+static inline uint32_t
+queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
+{
+	if (pf_device)
+		return ((vf_id << 12) + (qgrp_id << 7) + (aq_id << 3) +
+				HWPfQmgrIngressAq);
+	else
+		return ((qgrp_id << 7) + (aq_id << 3) +
+				HWVfQmgrIngressAq);
+}
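The address arithmetic in queue_offset() can be exercised standalone. The base constants below are placeholders for illustration only, not the real HWPfQmgrIngressAq/HWVfQmgrIngressAq register values from the ACC100 enum headers:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder base offsets for the ingress-queue register window. */
#define PF_INGRESS_AQ_BASE 0x100000u
#define VF_INGRESS_AQ_BASE 0x1000u

/* Mirror of queue_offset(): per VF the window spans 4 KB (vf_id << 12),
 * each QGroup 128 B (qgrp_id << 7), each AQ 8 B (aq_id << 3). */
static uint32_t
example_queue_offset(bool pf, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
{
	if (pf)
		return (vf_id << 12) + (qgrp_id << 7) + (aq_id << 3) +
				PF_INGRESS_AQ_BASE;
	return (qgrp_id << 7) + (aq_id << 3) + VF_INGRESS_AQ_BASE;
}
```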
+
+enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC};
+
+/* Return the queue topology for a Queue Group Index */
+static inline void
+qtopFromAcc(struct rte_q_topology_t **qtop, int acc_enum,
+		struct acc100_conf *acc100_conf)
+{
+	struct rte_q_topology_t *p_qtop;
+	p_qtop = NULL;
+	switch (acc_enum) {
+	case UL_4G:
+		p_qtop = &(acc100_conf->q_ul_4g);
+		break;
+	case UL_5G:
+		p_qtop = &(acc100_conf->q_ul_5g);
+		break;
+	case DL_4G:
+		p_qtop = &(acc100_conf->q_dl_4g);
+		break;
+	case DL_5G:
+		p_qtop = &(acc100_conf->q_dl_5g);
+		break;
+	default:
+		/* NOTREACHED */
+		rte_bbdev_log(ERR, "Unexpected error evaluating qtopFromAcc");
+		break;
+	}
+	*qtop = p_qtop;
+}
+
+static void
+initQTop(struct acc100_conf *acc100_conf)
+{
+	acc100_conf->q_ul_4g.num_aqs_per_groups = 0;
+	acc100_conf->q_ul_4g.num_qgroups = 0;
+	acc100_conf->q_ul_4g.first_qgroup_index = -1;
+	acc100_conf->q_ul_5g.num_aqs_per_groups = 0;
+	acc100_conf->q_ul_5g.num_qgroups = 0;
+	acc100_conf->q_ul_5g.first_qgroup_index = -1;
+	acc100_conf->q_dl_4g.num_aqs_per_groups = 0;
+	acc100_conf->q_dl_4g.num_qgroups = 0;
+	acc100_conf->q_dl_4g.first_qgroup_index = -1;
+	acc100_conf->q_dl_5g.num_aqs_per_groups = 0;
+	acc100_conf->q_dl_5g.num_qgroups = 0;
+	acc100_conf->q_dl_5g.first_qgroup_index = -1;
+}
+
+static inline void
+updateQtop(uint8_t acc, uint8_t qg, struct acc100_conf *acc100_conf,
+		struct acc100_device *d) {
+	uint32_t reg;
+	struct rte_q_topology_t *q_top = NULL;
+	qtopFromAcc(&q_top, acc, acc100_conf);
+	if (unlikely(q_top == NULL))
+		return;
+	uint16_t aq;
+	q_top->num_qgroups++;
+	if (q_top->first_qgroup_index == -1) {
+		q_top->first_qgroup_index = qg;
+		/* Can be optimized to assume all are enabled by default */
+		reg = acc100_reg_read(d, queue_offset(d->pf_device,
+				0, qg, ACC100_NUM_AQS - 1));
+		if (reg & QUEUE_ENABLE) {
+			q_top->num_aqs_per_groups = ACC100_NUM_AQS;
+			return;
+		}
+		q_top->num_aqs_per_groups = 0;
+		for (aq = 0; aq < ACC100_NUM_AQS; aq++) {
+			reg = acc100_reg_read(d, queue_offset(d->pf_device,
+					0, qg, aq));
+			if (reg & QUEUE_ENABLE)
+				q_top->num_aqs_per_groups++;
+		}
+	}
+}
+
+/* Fetch configuration enabled for the PF/VF using MMIO Read (slow) */
+static inline void
+fetch_acc100_config(struct rte_bbdev *dev)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	struct acc100_conf *acc100_conf = &d->acc100_conf;
+	const struct acc100_registry_addr *reg_addr;
+	uint8_t acc, qg;
+	uint32_t reg, reg_aq, reg_len0, reg_len1;
+	uint32_t reg_mode;
+
+	/* No need to retrieve the configuration if it is already done */
+	if (d->configured)
+		return;
+
+	/* Choose correct registry addresses for the device type */
+	if (d->pf_device)
+		reg_addr = &pf_reg_addr;
+	else
+		reg_addr = &vf_reg_addr;
+
+	d->ddr_size = (1 + acc100_reg_read(d, reg_addr->ddr_range)) << 10;
+
+	/* Single VF Bundle by VF */
+	acc100_conf->num_vf_bundles = 1;
+	initQTop(acc100_conf);
+
+	struct rte_q_topology_t *q_top = NULL;
+	int qman_func_id[5] = {0, 2, 1, 3, 4};
+	reg = acc100_reg_read(d, reg_addr->qman_group_func);
+	for (qg = 0; qg < ACC100_NUM_QGRPS_PER_WORD; qg++) {
+		reg_aq = acc100_reg_read(d,
+				queue_offset(d->pf_device, 0, qg, 0));
+		if (reg_aq & QUEUE_ENABLE) {
+			acc = qman_func_id[(reg >> (qg * 4)) & 0x7];
+			updateQtop(acc, qg, acc100_conf, d);
+		}
+	}
+
+	/* Check the depth of the AQs */
+	reg_len0 = acc100_reg_read(d, reg_addr->depth_log0_offset);
+	reg_len1 = acc100_reg_read(d, reg_addr->depth_log1_offset);
+	for (acc = 0; acc < NUM_ACC; acc++) {
+		qtopFromAcc(&q_top, acc, acc100_conf);
+		if (q_top->first_qgroup_index < ACC100_NUM_QGRPS_PER_WORD)
+			q_top->aq_depth_log2 = (reg_len0 >>
+					(q_top->first_qgroup_index * 4))
+					& 0xF;
+		else
+			q_top->aq_depth_log2 = (reg_len1 >>
+					((q_top->first_qgroup_index -
+					ACC100_NUM_QGRPS_PER_WORD) * 4))
+					& 0xF;
+	}
+
+	/* Read PF mode */
+	if (d->pf_device) {
+		reg_mode = acc100_reg_read(d, HWPfHiPfMode);
+		acc100_conf->pf_mode_en = (reg_mode == 2) ? 1 : 0;
+	}
+
+	rte_bbdev_log_debug(
+			"%s Config LLR SIGN IN/OUT %s %s QG %u %u %u %u AQ %u %u %u %u Len %u %u %u %u\n",
+			(d->pf_device) ? "PF" : "VF",
+			(acc100_conf->input_pos_llr_1_bit) ? "POS" : "NEG",
+			(acc100_conf->output_pos_llr_1_bit) ? "POS" : "NEG",
+			acc100_conf->q_ul_4g.num_qgroups,
+			acc100_conf->q_dl_4g.num_qgroups,
+			acc100_conf->q_ul_5g.num_qgroups,
+			acc100_conf->q_dl_5g.num_qgroups,
+			acc100_conf->q_ul_4g.num_aqs_per_groups,
+			acc100_conf->q_dl_4g.num_aqs_per_groups,
+			acc100_conf->q_ul_5g.num_aqs_per_groups,
+			acc100_conf->q_dl_5g.num_aqs_per_groups,
+			acc100_conf->q_ul_4g.aq_depth_log2,
+			acc100_conf->q_dl_4g.aq_depth_log2,
+			acc100_conf->q_ul_5g.aq_depth_log2,
+			acc100_conf->q_dl_5g.aq_depth_log2);
+}
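The per-QGroup depth decode above packs one 4-bit log2 value per queue group, eight groups per 32-bit register; a minimal sketch (assuming ACC100_NUM_QGRPS_PER_WORD is 8):

```c
#include <assert.h>
#include <stdint.h>

#define QGRPS_PER_WORD 8 /* assumed value of ACC100_NUM_QGRPS_PER_WORD */

/* Extract the 4-bit log2 AQ depth for queue group `qg` from the pair of
 * depth registers, mirroring the loop in fetch_acc100_config().
 * The actual AQ depth is then 1u << returned_value. */
static uint8_t
qg_depth_log2(uint32_t reg_len0, uint32_t reg_len1, int qg)
{
	if (qg < QGRPS_PER_WORD)
		return (reg_len0 >> (qg * 4)) & 0xF;
	return (reg_len1 >> ((qg - QGRPS_PER_WORD) * 4)) & 0xF;
}
```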
+
 /* Free 64MB memory used for software rings */
 static int
 acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
@@ -33,8 +211,55 @@
 	return 0;
 }
 
+/* Get ACC100 device info */
+static void
+acc100_dev_info_get(struct rte_bbdev *dev,
+		struct rte_bbdev_driver_info *dev_info)
+{
+	struct acc100_device *d = dev->data->dev_private;
+
+	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
+		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
+	};
+
+	static struct rte_bbdev_queue_conf default_queue_conf;
+	default_queue_conf.socket = dev->data->socket_id;
+	default_queue_conf.queue_size = MAX_QUEUE_DEPTH;
+
+	dev_info->driver_name = dev->device->driver->name;
+
+	/* Read and save the populated config from ACC100 registers */
+	fetch_acc100_config(dev);
+
+	/* This isn't ideal because it reports the maximum number of queues but
+	 * does not provide info on how many can be uplink/downlink or at which
+	 * priorities
+	 */
+	dev_info->max_num_queues =
+			d->acc100_conf.q_dl_5g.num_aqs_per_groups *
+			d->acc100_conf.q_dl_5g.num_qgroups +
+			d->acc100_conf.q_ul_5g.num_aqs_per_groups *
+			d->acc100_conf.q_ul_5g.num_qgroups +
+			d->acc100_conf.q_dl_4g.num_aqs_per_groups *
+			d->acc100_conf.q_dl_4g.num_qgroups +
+			d->acc100_conf.q_ul_4g.num_aqs_per_groups *
+			d->acc100_conf.q_ul_4g.num_qgroups;
+	dev_info->queue_size_lim = MAX_QUEUE_DEPTH;
+	dev_info->hardware_accelerated = true;
+	dev_info->max_dl_queue_priority =
+			d->acc100_conf.q_dl_4g.num_qgroups - 1;
+	dev_info->max_ul_queue_priority =
+			d->acc100_conf.q_ul_4g.num_qgroups - 1;
+	dev_info->default_queue_conf = default_queue_conf;
+	dev_info->cpu_flag_reqs = NULL;
+	dev_info->min_alignment = 64;
+	dev_info->capabilities = bbdev_capabilities;
+	dev_info->harq_buffer_size = d->ddr_size;
+}
+
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.close = acc100_dev_close,
+	.info_get = acc100_dev_info_get,
 };
 
 /* ACC100 PCI PF address map */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index cd77570..662e2c8 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -7,6 +7,7 @@
 
 #include "acc100_pf_enum.h"
 #include "acc100_vf_enum.h"
+#include "rte_acc100_cfg.h"
 
 /* Helper macro for logging */
 #define rte_bbdev_log(level, fmt, ...) \
@@ -520,6 +521,8 @@ struct acc100_registry_addr {
 /* Private data structure for each ACC100 device */
 struct acc100_device {
 	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
+	uint32_t ddr_size; /* Size in kB */
+	struct acc100_conf acc100_conf; /* ACC100 Initial configuration */
 	bool pf_device; /**< True if this is a PF ACC100 device */
 	bool configured; /**< True if this ACC100 device is configured */
 };
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v4 04/11] baseband/acc100: add queue configuration
  2020-09-04 17:53   ` [dpdk-dev] [PATCH v4 00/11] bbdev PMD ACC100 Nicolas Chautru
                       ` (2 preceding siblings ...)
  2020-09-04 17:53     ` [dpdk-dev] [PATCH v4 03/11] baseband/acc100: add info get function Nicolas Chautru
@ 2020-09-04 17:54     ` Nicolas Chautru
  2020-09-15  2:31       ` Xu, Rosen
  2020-09-18  3:01       ` Liu, Tianjiao
  2020-09-04 17:54     ` [dpdk-dev] [PATCH v4 05/11] baseband/acc100: add LDPC processing functions Nicolas Chautru
                       ` (7 subsequent siblings)
  11 siblings, 2 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-04 17:54 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add functions to create and configure queues for
the device. Still no processing capabilities are exposed.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 420 ++++++++++++++++++++++++++++++-
 drivers/baseband/acc100/rte_acc100_pmd.h |  45 ++++
 2 files changed, 464 insertions(+), 1 deletion(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 7807a30..7a21c57 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -26,6 +26,22 @@
 RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
 #endif
 
+/* Write to MMIO register address */
+static inline void
+mmio_write(void *addr, uint32_t value)
+{
+	*((volatile uint32_t *)(addr)) = rte_cpu_to_le_32(value);
+}
+
+/* Write a register of an ACC100 device */
+static inline void
+acc100_reg_write(struct acc100_device *d, uint32_t offset, uint32_t payload)
+{
+	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
+	mmio_write(reg_addr, payload);
+	usleep(1000);
+}
+
 /* Read a register of a ACC100 device */
 static inline uint32_t
 acc100_reg_read(struct acc100_device *d, uint32_t offset)
@@ -36,6 +52,22 @@
 	return rte_le_to_cpu_32(ret);
 }
 
+/* Basic implementation of log2 for exact powers of two (2^N) */
+static inline uint32_t
+log2_basic(uint32_t value)
+{
+	return (value == 0) ? 0 : __builtin_ctz(value);
+}
+
+/* Calculate memory alignment offset assuming alignment is 2^N */
+static inline uint32_t
+calc_mem_alignment_offset(void *unaligned_virt_mem, uint32_t alignment)
+{
+	rte_iova_t unaligned_phy_mem = rte_malloc_virt2iova(unaligned_virt_mem);
+	return (uint32_t)(alignment -
+			(unaligned_phy_mem & (alignment-1)));
+}
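A minimal sketch of the two helpers above: the log2 of an exact power of two, and the byte offset needed to reach the next 2^N boundary. Note that, like the driver code, the offset helper returns `alignment` rather than 0 for an already-aligned address; `__builtin_ctz` assumes GCC/Clang:

```c
#include <assert.h>
#include <stdint.h>

/* Same idea as log2_basic(): count trailing zeros of an exact 2^N. */
static uint32_t
log2_exact(uint32_t value)
{
	return (value == 0) ? 0 : (uint32_t)__builtin_ctz(value);
}

/* Bytes to add to `addr` to reach the next multiple of `alignment`
 * (a power of two), as in calc_mem_alignment_offset(); the driver
 * applies this to the IOVA of the allocation, not the VA. */
static uint32_t
align_offset(uint64_t addr, uint32_t alignment)
{
	return (uint32_t)(alignment - (addr & (alignment - 1)));
}
```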
+
 /* Calculate the offset of the enqueue register */
 static inline uint32_t
 queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
@@ -204,10 +236,393 @@
 			acc100_conf->q_dl_5g.aq_depth_log2);
 }
 
+static void
+free_base_addresses(void **base_addrs, int size)
+{
+	int i;
+	for (i = 0; i < size; i++)
+		rte_free(base_addrs[i]);
+}
+
+static inline uint32_t
+get_desc_len(void)
+{
+	return sizeof(union acc100_dma_desc);
+}
+
+/* Allocate the 2 * 64MB block for the sw rings */
+static int
+alloc_2x64mb_sw_rings_mem(struct rte_bbdev *dev, struct acc100_device *d,
+		int socket)
+{
+	uint32_t sw_ring_size = ACC100_SIZE_64MBYTE;
+	d->sw_rings_base = rte_zmalloc_socket(dev->device->driver->name,
+			2 * sw_ring_size, RTE_CACHE_LINE_SIZE, socket);
+	if (d->sw_rings_base == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate memory for %s:%u",
+				dev->device->driver->name,
+				dev->data->dev_id);
+		return -ENOMEM;
+	}
+	memset(d->sw_rings_base, 0, ACC100_SIZE_64MBYTE);
+	uint32_t next_64mb_align_offset = calc_mem_alignment_offset(
+			d->sw_rings_base, ACC100_SIZE_64MBYTE);
+	d->sw_rings = RTE_PTR_ADD(d->sw_rings_base, next_64mb_align_offset);
+	d->sw_rings_phys = rte_malloc_virt2iova(d->sw_rings_base) +
+			next_64mb_align_offset;
+	d->sw_ring_size = MAX_QUEUE_DEPTH * get_desc_len();
+	d->sw_ring_max_depth = d->sw_ring_size / get_desc_len();
+
+	return 0;
+}
+
+/* Attempt to allocate minimised memory space for sw rings */
+static void
+alloc_sw_rings_min_mem(struct rte_bbdev *dev, struct acc100_device *d,
+		uint16_t num_queues, int socket)
+{
+	rte_iova_t sw_rings_base_phy, next_64mb_align_addr_phy;
+	uint32_t next_64mb_align_offset;
+	rte_iova_t sw_ring_phys_end_addr;
+	void *base_addrs[SW_RING_MEM_ALLOC_ATTEMPTS];
+	void *sw_rings_base;
+	int i = 0;
+	uint32_t q_sw_ring_size = MAX_QUEUE_DEPTH * get_desc_len();
+	uint32_t dev_sw_ring_size = q_sw_ring_size * num_queues;
+
+	/* Find an aligned block of memory to store sw rings */
+	while (i < SW_RING_MEM_ALLOC_ATTEMPTS) {
+		/*
+		 * sw_ring allocated memory is guaranteed to be aligned to
+		 * q_sw_ring_size at the condition that the requested size is
+		 * less than the page size
+		 */
+		sw_rings_base = rte_zmalloc_socket(
+				dev->device->driver->name,
+				dev_sw_ring_size, q_sw_ring_size, socket);
+
+		if (sw_rings_base == NULL) {
+			rte_bbdev_log(ERR,
+					"Failed to allocate memory for %s:%u",
+					dev->device->driver->name,
+					dev->data->dev_id);
+			break;
+		}
+
+		sw_rings_base_phy = rte_malloc_virt2iova(sw_rings_base);
+		next_64mb_align_offset = calc_mem_alignment_offset(
+				sw_rings_base, ACC100_SIZE_64MBYTE);
+		next_64mb_align_addr_phy = sw_rings_base_phy +
+				next_64mb_align_offset;
+		sw_ring_phys_end_addr = sw_rings_base_phy + dev_sw_ring_size;
+
+		/* Check if the end of the sw ring memory block is before the
+		 * start of next 64MB aligned mem address
+		 */
+		if (sw_ring_phys_end_addr < next_64mb_align_addr_phy) {
+			d->sw_rings_phys = sw_rings_base_phy;
+			d->sw_rings = sw_rings_base;
+			d->sw_rings_base = sw_rings_base;
+			d->sw_ring_size = q_sw_ring_size;
+			d->sw_ring_max_depth = MAX_QUEUE_DEPTH;
+			break;
+		}
+		/* Store the address of the unaligned mem block */
+		base_addrs[i] = sw_rings_base;
+		i++;
+	}
+
+	/* Free all unaligned blocks of mem allocated in the loop */
+	free_base_addresses(base_addrs, i);
+}
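The accept/retry condition in the loop above reduces to checking whether the ring block would straddle the next 64 MB physical boundary; as pure arithmetic (using `<`, as the driver does, so a block ending exactly at the boundary is retried):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SIZE_64MB (64ULL * 1024 * 1024)

/* True when [phys_base, phys_base + size) ends before the next 64 MB
 * boundary, i.e. the allocation is usable without the 2x64MB fallback. */
static bool
fits_before_64mb_boundary(uint64_t phys_base, uint64_t size)
{
	uint64_t next_boundary = phys_base +
			(SIZE_64MB - (phys_base & (SIZE_64MB - 1)));
	return phys_base + size < next_boundary;
}
```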
+
+
+/* Allocate 64MB memory used for all software rings */
+static int
+acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
+{
+	uint32_t phys_low, phys_high, payload;
+	struct acc100_device *d = dev->data->dev_private;
+	const struct acc100_registry_addr *reg_addr;
+
+	if (d->pf_device && !d->acc100_conf.pf_mode_en) {
+		rte_bbdev_log(NOTICE,
+				"%s has PF mode disabled. This PF can't be used.",
+				dev->data->name);
+		return -ENODEV;
+	}
+
+	alloc_sw_rings_min_mem(dev, d, num_queues, socket_id);
+
+	/* If minimal memory space approach failed, then allocate
+	 * the 2 * 64MB block for the sw rings
+	 */
+	if (d->sw_rings == NULL)
+		alloc_2x64mb_sw_rings_mem(dev, d, socket_id);
+
+	/* Configure ACC100 with the base address for DMA descriptor rings
+	 * Same descriptor rings used for UL and DL DMA Engines
+	 * Note : Assuming only VF0 bundle is used for PF mode
+	 */
+	phys_high = (uint32_t)(d->sw_rings_phys >> 32);
+	phys_low  = (uint32_t)(d->sw_rings_phys & ~(ACC100_SIZE_64MBYTE-1));
+
+	/* Choose correct registry addresses for the device type */
+	if (d->pf_device)
+		reg_addr = &pf_reg_addr;
+	else
+		reg_addr = &vf_reg_addr;
+
+	/* Read the populated cfg from ACC100 registers */
+	fetch_acc100_config(dev);
+
+	/* Mark as configured properly */
+	d->configured = true;
+
+	/* Release AXI from PF */
+	if (d->pf_device)
+		acc100_reg_write(d, HWPfDmaAxiControl, 1);
+
+	acc100_reg_write(d, reg_addr->dma_ring_ul5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_ul5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->dma_ring_dl5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_dl5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->dma_ring_ul4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_ul4g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->dma_ring_dl4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_dl4g_lo, phys_low);
+
+	/*
+	 * Configure Ring Size to the max queue ring size
+	 * (used for wrapping purpose)
+	 */
+	payload = log2_basic(d->sw_ring_size / 64);
+	acc100_reg_write(d, reg_addr->ring_size, payload);
+
+	/* Configure tail pointer for use when SDONE enabled */
+	d->tail_ptrs = rte_zmalloc_socket(
+			dev->device->driver->name,
+			ACC100_NUM_QGRPS * ACC100_NUM_AQS * sizeof(uint32_t),
+			RTE_CACHE_LINE_SIZE, socket_id);
+	if (d->tail_ptrs == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate tail ptr for %s:%u",
+				dev->device->driver->name,
+				dev->data->dev_id);
+		rte_free(d->sw_rings);
+		return -ENOMEM;
+	}
+	d->tail_ptr_phys = rte_malloc_virt2iova(d->tail_ptrs);
+
+	phys_high = (uint32_t)(d->tail_ptr_phys >> 32);
+	phys_low  = (uint32_t)(d->tail_ptr_phys);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
+
+	d->harq_layout = rte_zmalloc_socket("HARQ Layout",
+			ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
+			RTE_CACHE_LINE_SIZE, dev->data->socket_id);
+
+	rte_bbdev_log_debug(
+			"ACC100 (%s) configured sw_rings = %p, sw_rings_phys = %#"
+			PRIx64, dev->data->name, d->sw_rings, d->sw_rings_phys);
+
+	return 0;
+}
+
 /* Free 64MB memory used for software rings */
 static int
-acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
+acc100_dev_close(struct rte_bbdev *dev)
 {
+	struct acc100_device *d = dev->data->dev_private;
+	if (d->sw_rings_base != NULL) {
+		rte_free(d->tail_ptrs);
+		rte_free(d->sw_rings_base);
+		d->sw_rings_base = NULL;
+	}
+	usleep(1000);
+	return 0;
+}
+
+
+/**
+ * Report an ACC100 queue index which is free
+ * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
+ * Note : Only supporting VF0 Bundle for PF mode
+ */
+static int
+acc100_find_free_queue_idx(struct rte_bbdev *dev,
+		const struct rte_bbdev_queue_conf *conf)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	int op_2_acc[5] = {0, UL_4G, DL_4G, UL_5G, DL_5G};
+	int acc = op_2_acc[conf->op_type];
+	struct rte_q_topology_t *qtop = NULL;
+	qtopFromAcc(&qtop, acc, &(d->acc100_conf));
+	if (qtop == NULL)
+		return -1;
+	/* Identify matching QGroup Index which are sorted in priority order */
+	uint16_t group_idx = qtop->first_qgroup_index;
+	group_idx += conf->priority;
+	if (group_idx >= ACC100_NUM_QGRPS ||
+			conf->priority >= qtop->num_qgroups) {
+		rte_bbdev_log(INFO, "Invalid Priority on %s, priority %u",
+				dev->data->name, conf->priority);
+		return -1;
+	}
+	/* Find a free AQ_idx  */
+	uint16_t aq_idx;
+	for (aq_idx = 0; aq_idx < qtop->num_aqs_per_groups; aq_idx++) {
+		if (((d->q_assigned_bit_map[group_idx] >> aq_idx) & 0x1) == 0) {
+			/* Mark the Queue as assigned */
+			d->q_assigned_bit_map[group_idx] |= (1 << aq_idx);
+			/* Report the AQ Index */
+			return (group_idx << GRP_ID_SHIFT) + aq_idx;
+		}
+	}
+	rte_bbdev_log(INFO, "Failed to find free queue on %s, priority %u",
+			dev->data->name, conf->priority);
+	return -1;
+}
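The free-AQ scan above is a find-and-claim over a per-QGroup assignment bitmap; a minimal sketch of that inner loop:

```c
#include <assert.h>
#include <stdint.h>

/* Find the first clear bit in `bitmap`, set it, and return its index,
 * as in acc100_find_free_queue_idx(); -1 when all AQs are assigned. */
static int
claim_free_aq(uint64_t *bitmap, int num_aqs)
{
	int aq;

	for (aq = 0; aq < num_aqs; aq++) {
		if (((*bitmap >> aq) & 0x1) == 0) {
			*bitmap |= 1ULL << aq; /* mark the queue assigned */
			return aq;
		}
	}
	return -1;
}
```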
+
+/* Setup ACC100 queue */
+static int
+acc100_queue_setup(struct rte_bbdev *dev, uint16_t queue_id,
+		const struct rte_bbdev_queue_conf *conf)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	struct acc100_queue *q;
+	int16_t q_idx;
+
+	/* Allocate the queue data structure. */
+	q = rte_zmalloc_socket(dev->device->driver->name, sizeof(*q),
+			RTE_CACHE_LINE_SIZE, conf->socket);
+	if (q == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate queue memory");
+		return -ENOMEM;
+	}
+
+	q->d = d;
+	q->ring_addr = RTE_PTR_ADD(d->sw_rings, (d->sw_ring_size * queue_id));
+	q->ring_addr_phys = d->sw_rings_phys + (d->sw_ring_size * queue_id);
+
+	/* Prepare the Ring with default descriptor format */
+	union acc100_dma_desc *desc = NULL;
+	unsigned int desc_idx, b_idx;
+	int fcw_len = (conf->op_type == RTE_BBDEV_OP_LDPC_ENC ?
+		ACC100_FCW_LE_BLEN : (conf->op_type == RTE_BBDEV_OP_TURBO_DEC ?
+		ACC100_FCW_TD_BLEN : ACC100_FCW_LD_BLEN));
+
+	for (desc_idx = 0; desc_idx < d->sw_ring_max_depth; desc_idx++) {
+		desc = q->ring_addr + desc_idx;
+		desc->req.word0 = ACC100_DMA_DESC_TYPE;
+		desc->req.word1 = 0; /**< Timestamp */
+		desc->req.word2 = 0;
+		desc->req.word3 = 0;
+		uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+		desc->req.data_ptrs[0].blen = fcw_len;
+		desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
+		desc->req.data_ptrs[0].last = 0;
+		desc->req.data_ptrs[0].dma_ext = 0;
+		for (b_idx = 1; b_idx < ACC100_DMA_MAX_NUM_POINTERS - 1;
+				b_idx++) {
+			desc->req.data_ptrs[b_idx].blkid = ACC100_DMA_BLKID_IN;
+			desc->req.data_ptrs[b_idx].last = 1;
+			desc->req.data_ptrs[b_idx].dma_ext = 0;
+			b_idx++;
+			desc->req.data_ptrs[b_idx].blkid =
+					ACC100_DMA_BLKID_OUT_ENC;
+			desc->req.data_ptrs[b_idx].last = 1;
+			desc->req.data_ptrs[b_idx].dma_ext = 0;
+		}
+		/* Preset some fields of LDPC FCW */
+		desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
+		desc->req.fcw_ld.gain_i = 1;
+		desc->req.fcw_ld.gain_h = 1;
+	}
+
+	q->lb_in = rte_zmalloc_socket(dev->device->driver->name,
+			RTE_CACHE_LINE_SIZE,
+			RTE_CACHE_LINE_SIZE, conf->socket);
+	if (q->lb_in == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate lb_in memory");
+		rte_free(q);
+		return -ENOMEM;
+	}
+	q->lb_in_addr_phys = rte_malloc_virt2iova(q->lb_in);
+	q->lb_out = rte_zmalloc_socket(dev->device->driver->name,
+			RTE_CACHE_LINE_SIZE,
+			RTE_CACHE_LINE_SIZE, conf->socket);
+	if (q->lb_out == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate lb_out memory");
+		rte_free(q->lb_in);
+		rte_free(q);
+		return -ENOMEM;
+	}
+	q->lb_out_addr_phys = rte_malloc_virt2iova(q->lb_out);
+
+	/*
+	 * Software queue ring wraps synchronously with the HW when it reaches
+	 * the boundary of the maximum allocated queue size, no matter what the
+	 * sw queue size is. This wrapping is guarded by setting the wrap_mask
+	 * to represent the maximum queue size as allocated at the time when
+	 * the device has been setup (in configure()).
+	 *
+	 * The queue depth is set to the queue size value (conf->queue_size).
+	 * This limits the occupancy of the queue at any point of time, so that
+	 * the queue does not get swamped with enqueue requests.
+	 */
+	q->sw_ring_depth = conf->queue_size;
+	q->sw_ring_wrap_mask = d->sw_ring_max_depth - 1;
+
+	q->op_type = conf->op_type;
+
+	q_idx = acc100_find_free_queue_idx(dev, conf);
+	if (q_idx == -1) {
+		rte_free(q->lb_in);
+		rte_free(q->lb_out);
+		rte_free(q);
+		return -1;
+	}
+
+	q->qgrp_id = (q_idx >> GRP_ID_SHIFT) & 0xF;
+	q->vf_id = (q_idx >> VF_ID_SHIFT)  & 0x3F;
+	q->aq_id = q_idx & 0xF;
+	q->aq_depth = (conf->op_type ==  RTE_BBDEV_OP_TURBO_DEC) ?
+			(1 << d->acc100_conf.q_ul_4g.aq_depth_log2) :
+			(1 << d->acc100_conf.q_dl_4g.aq_depth_log2);
+
+	q->mmio_reg_enqueue = RTE_PTR_ADD(d->mmio_base,
+			queue_offset(d->pf_device,
+					q->vf_id, q->qgrp_id, q->aq_id));
+
+	rte_bbdev_log_debug(
+			"Setup dev%u q%u: qgrp_id=%u, vf_id=%u, aq_id=%u, aq_depth=%u, mmio_reg_enqueue=%p",
+			dev->data->dev_id, queue_id, q->qgrp_id, q->vf_id,
+			q->aq_id, q->aq_depth, q->mmio_reg_enqueue);
+
+	dev->data->queues[queue_id].queue_private = q;
+	return 0;
+}
+
+/* Release ACC100 queue */
+static int
+acc100_queue_release(struct rte_bbdev *dev, uint16_t q_id)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	struct acc100_queue *q = dev->data->queues[q_id].queue_private;
+
+	if (q != NULL) {
+		/* Mark the Queue as un-assigned */
+		d->q_assigned_bit_map[q->qgrp_id] &= ~(1 << q->aq_id);
+		rte_free(q->lb_in);
+		rte_free(q->lb_out);
+		rte_free(q);
+		dev->data->queues[q_id].queue_private = NULL;
+	}
+
 	return 0;
 }
 
@@ -258,8 +673,11 @@
 }
 
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
+	.setup_queues = acc100_setup_queues,
 	.close = acc100_dev_close,
 	.info_get = acc100_dev_info_get,
+	.queue_setup = acc100_queue_setup,
+	.queue_release = acc100_queue_release,
 };
 
 /* ACC100 PCI PF address map */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 662e2c8..0e2b79c 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -518,11 +518,56 @@ struct acc100_registry_addr {
 	.ddr_range = HWVfDmaDdrBaseRangeRoVf,
 };
 
+/* Structure associated with each queue. */
+struct __rte_cache_aligned acc100_queue {
+	union acc100_dma_desc *ring_addr;  /* Virtual address of sw ring */
+	rte_iova_t ring_addr_phys;  /* Physical address of software ring */
+	uint32_t sw_ring_head;  /* software ring head */
+	uint32_t sw_ring_tail;  /* software ring tail */
+	/* software ring size (descriptors, not bytes) */
+	uint32_t sw_ring_depth;
+	/* mask used to wrap enqueued descriptors on the sw ring */
+	uint32_t sw_ring_wrap_mask;
+	/* MMIO register used to enqueue descriptors */
+	void *mmio_reg_enqueue;
+	uint8_t vf_id;  /* VF ID (max = 63) */
+	uint8_t qgrp_id;  /* Queue Group ID */
+	uint16_t aq_id;  /* Atomic Queue ID */
+	uint16_t aq_depth;  /* Depth of atomic queue */
+	uint32_t aq_enqueued;  /* Count how many "batches" have been enqueued */
+	uint32_t aq_dequeued;  /* Count how many "batches" have been dequeued */
+	uint32_t irq_enable;  /* Enable ops dequeue interrupts if set to 1 */
+	struct rte_mempool *fcw_mempool;  /* FCW mempool */
+	enum rte_bbdev_op_type op_type;  /* Type of this Queue: TE or TD */
+	/* Internal buffers for loopback input/output */
+	uint8_t *lb_in;
+	uint8_t *lb_out;
+	rte_iova_t lb_in_addr_phys;
+	rte_iova_t lb_out_addr_phys;
+	struct acc100_device *d;
+};
+
 /* Private data structure for each ACC100 device */
 struct acc100_device {
 	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
+	void *sw_rings_base;  /* Base addr of un-aligned memory for sw rings */
+	void *sw_rings;  /* 64MB of 64MB-aligned memory for sw rings */
+	rte_iova_t sw_rings_phys;  /* Physical address of sw_rings */
+	/* Virtual address of the info memory routed to this function under
+	 * operation, whether it is PF or VF.
+	 */
+	union acc100_harq_layout_data *harq_layout;
+	uint32_t sw_ring_size;
 	uint32_t ddr_size; /* Size in kB */
+	uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */
+	rte_iova_t tail_ptr_phys; /* Physical address of tail pointers */
+	/* Max number of entries available for each queue in device, depending
+	 * on how many queues are enabled with configure()
+	 */
+	uint32_t sw_ring_max_depth;
 	struct acc100_conf acc100_conf; /* ACC100 Initial configuration */
+	/* Bitmap capturing which Queues have already been assigned */
+	uint16_t q_assigned_bit_map[ACC100_NUM_QGRPS];
 	bool pf_device; /**< True if this is a PF ACC100 device */
 	bool configured; /**< True if this ACC100 device is configured */
 };
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v4 05/11] baseband/acc100: add LDPC processing functions
  2020-09-04 17:53   ` [dpdk-dev] [PATCH v4 00/11] bbdev PMD ACC100 Nicolas Chautru
                       ` (3 preceding siblings ...)
  2020-09-04 17:54     ` [dpdk-dev] [PATCH v4 04/11] baseband/acc100: add queue configuration Nicolas Chautru
@ 2020-09-04 17:54     ` Nicolas Chautru
  2020-09-21  1:40       ` Liu, Tianjiao
  2020-09-04 17:54     ` [dpdk-dev] [PATCH v4 06/11] baseband/acc100: add HARQ loopback support Nicolas Chautru
                       ` (6 subsequent siblings)
  11 siblings, 1 reply; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-04 17:54 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Adding LDPC decode and encode processing operations

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 1625 +++++++++++++++++++++++++++++-
 drivers/baseband/acc100/rte_acc100_pmd.h |    3 +
 2 files changed, 1626 insertions(+), 2 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 7a21c57..7f64695 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -15,6 +15,9 @@
 #include <rte_hexdump.h>
 #include <rte_pci.h>
 #include <rte_bus_pci.h>
+#ifdef RTE_BBDEV_OFFLOAD_COST
+#include <rte_cycles.h>
+#endif
 
 #include <rte_bbdev.h>
 #include <rte_bbdev_pmd.h>
@@ -449,7 +452,6 @@
 	return 0;
 }
 
-
 /**
  * Report a free ACC100 queue index.
  * Return 0 to 16k for a valid queue_idx or -1 when no queue is available.
@@ -634,6 +636,46 @@
 	struct acc100_device *d = dev->data->dev_private;
 
 	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
+		{
+			.type   = RTE_BBDEV_OP_LDPC_ENC,
+			.cap.ldpc_enc = {
+				.capability_flags =
+					RTE_BBDEV_LDPC_RATE_MATCH |
+					RTE_BBDEV_LDPC_CRC_24B_ATTACH |
+					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
+				.num_buffers_src =
+						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+				.num_buffers_dst =
+						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+			}
+		},
+		{
+			.type   = RTE_BBDEV_OP_LDPC_DEC,
+			.cap.ldpc_dec = {
+			.capability_flags =
+				RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK |
+				RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP |
+				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
+				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
+#ifdef ACC100_EXT_MEM
+				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
+				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
+#endif
+				RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE |
+				RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS |
+				RTE_BBDEV_LDPC_DECODE_BYPASS |
+				RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
+				RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
+				RTE_BBDEV_LDPC_LLR_COMPRESSION,
+			.llr_size = 8,
+			.llr_decimals = 1,
+			.num_buffers_src =
+					RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+			.num_buffers_hard_out =
+					RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+			.num_buffers_soft_out = 0,
+			}
+		},
 		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
 	};
 
@@ -669,9 +711,14 @@
 	dev_info->cpu_flag_reqs = NULL;
 	dev_info->min_alignment = 64;
 	dev_info->capabilities = bbdev_capabilities;
+#ifdef ACC100_EXT_MEM
 	dev_info->harq_buffer_size = d->ddr_size;
+#else
+	dev_info->harq_buffer_size = 0;
+#endif
 }
 
+
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.setup_queues = acc100_setup_queues,
 	.close = acc100_dev_close,
@@ -696,6 +743,1577 @@
 	{.device_id = 0},
 };
 
+/* Read flag value 0/1 from bitmap */
+static inline bool
+check_bit(uint32_t bitmap, uint32_t bitmask)
+{
+	return bitmap & bitmask;
+}
+
+static inline char *
+mbuf_append(struct rte_mbuf *m_head, struct rte_mbuf *m, uint16_t len)
+{
+	if (unlikely(len > rte_pktmbuf_tailroom(m)))
+		return NULL;
+
+	char *tail = (char *)m->buf_addr + m->data_off + m->data_len;
+	m->data_len = (uint16_t)(m->data_len + len);
+	m_head->pkt_len  = (m_head->pkt_len + len);
+	return tail;
+}
+
+/* Compute value of k0.
+ * Based on 3GPP 38.212 Table 5.4.2.1-2
+ * Starting position of different redundancy versions, k0
+ */
+static inline uint16_t
+get_k0(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv_index)
+{
+	if (rv_index == 0)
+		return 0;
+	uint16_t n = (bg == 1 ? N_ZC_1 : N_ZC_2) * z_c;
+	if (n_cb == n) {
+		if (rv_index == 1)
+			return (bg == 1 ? K0_1_1 : K0_1_2) * z_c;
+		else if (rv_index == 2)
+			return (bg == 1 ? K0_2_1 : K0_2_2) * z_c;
+		else
+			return (bg == 1 ? K0_3_1 : K0_3_2) * z_c;
+	}
+	/* LBRM case - includes a division by N */
+	if (rv_index == 1)
+		return (((bg == 1 ? K0_1_1 : K0_1_2) * n_cb)
+				/ n) * z_c;
+	else if (rv_index == 2)
+		return (((bg == 1 ? K0_2_1 : K0_2_2) * n_cb)
+				/ n) * z_c;
+	else
+		return (((bg == 1 ? K0_3_1 : K0_3_2) * n_cb)
+				/ n) * z_c;
+}
+
+/* Fill in a frame control word for LDPC encoding. */
+static inline void
+acc100_fcw_le_fill(const struct rte_bbdev_enc_op *op,
+		struct acc100_fcw_le *fcw, int num_cb)
+{
+	fcw->qm = op->ldpc_enc.q_m;
+	fcw->nfiller = op->ldpc_enc.n_filler;
+	fcw->BG = (op->ldpc_enc.basegraph - 1);
+	fcw->Zc = op->ldpc_enc.z_c;
+	fcw->ncb = op->ldpc_enc.n_cb;
+	fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_enc.basegraph,
+			op->ldpc_enc.rv_index);
+	fcw->rm_e = op->ldpc_enc.cb_params.e;
+	fcw->crc_select = check_bit(op->ldpc_enc.op_flags,
+			RTE_BBDEV_LDPC_CRC_24B_ATTACH);
+	fcw->bypass_intlv = check_bit(op->ldpc_enc.op_flags,
+			RTE_BBDEV_LDPC_INTERLEAVER_BYPASS);
+	fcw->mcb_count = num_cb;
+}
+
+/* Fill in a frame control word for LDPC decoding. */
+static inline void
+acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
+		union acc100_harq_layout_data *harq_layout)
+{
+	uint16_t harq_out_length, harq_in_length, ncb_p, k0_p, parity_offset;
+	uint16_t harq_index;
+	uint32_t l;
+	bool harq_prun = false;
+
+	fcw->qm = op->ldpc_dec.q_m;
+	fcw->nfiller = op->ldpc_dec.n_filler;
+	fcw->BG = (op->ldpc_dec.basegraph - 1);
+	fcw->Zc = op->ldpc_dec.z_c;
+	fcw->ncb = op->ldpc_dec.n_cb;
+	fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_dec.basegraph,
+			op->ldpc_dec.rv_index);
+	if (op->ldpc_dec.code_block_mode == 1)
+		fcw->rm_e = op->ldpc_dec.cb_params.e;
+	else
+		fcw->rm_e = (op->ldpc_dec.tb_params.r <
+				op->ldpc_dec.tb_params.cab) ?
+						op->ldpc_dec.tb_params.ea :
+						op->ldpc_dec.tb_params.eb;
+
+	fcw->hcin_en = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE);
+	fcw->hcout_en = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE);
+	fcw->crc_select = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK);
+	fcw->bypass_dec = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_DECODE_BYPASS);
+	fcw->bypass_intlv = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS);
+	if (op->ldpc_dec.q_m == 1) {
+		fcw->bypass_intlv = 1;
+		fcw->qm = 2;
+	}
+	fcw->hcin_decomp_mode = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+	fcw->hcout_comp_mode = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+	fcw->llr_pack_mode = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_LLR_COMPRESSION);
+	harq_index = op->ldpc_dec.harq_combined_output.offset /
+			ACC100_HARQ_OFFSET;
+#ifdef ACC100_EXT_MEM
+	/* Limit cases when HARQ pruning is valid */
+	harq_prun = ((op->ldpc_dec.harq_combined_output.offset %
+			ACC100_HARQ_OFFSET) == 0) &&
+			(op->ldpc_dec.harq_combined_output.offset <= UINT16_MAX
+			* ACC100_HARQ_OFFSET);
+#endif
+	if (fcw->hcin_en > 0) {
+		harq_in_length = op->ldpc_dec.harq_combined_input.length;
+		if (fcw->hcin_decomp_mode > 0)
+			harq_in_length = harq_in_length * 8 / 6;
+		harq_in_length = RTE_ALIGN(harq_in_length, 64);
+		if ((harq_layout[harq_index].offset > 0) && harq_prun) {
+			rte_bbdev_log_debug("HARQ IN offset unexpected for now\n");
+			fcw->hcin_size0 = harq_layout[harq_index].size0;
+			fcw->hcin_offset = harq_layout[harq_index].offset;
+			fcw->hcin_size1 = harq_in_length -
+					harq_layout[harq_index].offset;
+		} else {
+			fcw->hcin_size0 = harq_in_length;
+			fcw->hcin_offset = 0;
+			fcw->hcin_size1 = 0;
+		}
+	} else {
+		fcw->hcin_size0 = 0;
+		fcw->hcin_offset = 0;
+		fcw->hcin_size1 = 0;
+	}
+
+	fcw->itmax = op->ldpc_dec.iter_max;
+	fcw->itstop = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE);
+	fcw->synd_precoder = fcw->itstop;
+	/*
+	 * These are all implicitly set
+	 * fcw->synd_post = 0;
+	 * fcw->so_en = 0;
+	 * fcw->so_bypass_rm = 0;
+	 * fcw->so_bypass_intlv = 0;
+	 * fcw->dec_convllr = 0;
+	 * fcw->hcout_convllr = 0;
+	 * fcw->hcout_size1 = 0;
+	 * fcw->so_it = 0;
+	 * fcw->hcout_offset = 0;
+	 * fcw->negstop_th = 0;
+	 * fcw->negstop_it = 0;
+	 * fcw->negstop_en = 0;
+	 * fcw->gain_i = 1;
+	 * fcw->gain_h = 1;
+	 */
+	if (fcw->hcout_en > 0) {
+		parity_offset = (op->ldpc_dec.basegraph == 1 ? 20 : 8)
+			* op->ldpc_dec.z_c - op->ldpc_dec.n_filler;
+		k0_p = (fcw->k0 > parity_offset) ?
+				fcw->k0 - op->ldpc_dec.n_filler : fcw->k0;
+		ncb_p = fcw->ncb - op->ldpc_dec.n_filler;
+		l = k0_p + fcw->rm_e;
+		harq_out_length = (uint16_t) fcw->hcin_size0;
+		harq_out_length = RTE_MIN(RTE_MAX(harq_out_length, l), ncb_p);
+		harq_out_length = (harq_out_length + 0x3F) & 0xFFC0;
+		if ((k0_p > fcw->hcin_size0 + ACC100_HARQ_OFFSET_THRESHOLD) &&
+				harq_prun) {
+			fcw->hcout_size0 = (uint16_t) fcw->hcin_size0;
+			fcw->hcout_offset = k0_p & 0xFFC0;
+			fcw->hcout_size1 = harq_out_length - fcw->hcout_offset;
+		} else {
+			fcw->hcout_size0 = harq_out_length;
+			fcw->hcout_size1 = 0;
+			fcw->hcout_offset = 0;
+		}
+		harq_layout[harq_index].offset = fcw->hcout_offset;
+		harq_layout[harq_index].size0 = fcw->hcout_size0;
+	} else {
+		fcw->hcout_size0 = 0;
+		fcw->hcout_size1 = 0;
+		fcw->hcout_offset = 0;
+	}
+}
+
+/**
+ * Fills descriptor with data pointers of one block type.
+ *
+ * @param desc
+ *   Pointer to DMA descriptor.
+ * @param input
+ *   Double pointer to the input mbuf; in the scatter-gather case it is
+ *   advanced to the next segment on return.
+ * @param offset
+ *   Offset into the input mbuf, marking where the data to process starts.
+ * @param cb_len
+ *   Length of the currently processed Code Block.
+ * @param seg_total_left
+ *   Number of bytes still left in the current segment (mbuf) for further
+ *   processing.
+ * @param next_triplet
+ *   Index of the next free ACC100 DMA descriptor triplet.
+ *
+ * @return
+ *   Index of the next triplet on success, a negative value if the lengths
+ *   of the mbuf data and the processed CB do not match.
+ */
+static inline int
+acc100_dma_fill_blk_type_in(struct acc100_dma_req_desc *desc,
+		struct rte_mbuf **input, uint32_t *offset, uint32_t cb_len,
+		uint32_t *seg_total_left, int next_triplet)
+{
+	uint32_t part_len;
+	struct rte_mbuf *m = *input;
+
+	part_len = (*seg_total_left < cb_len) ? *seg_total_left : cb_len;
+	cb_len -= part_len;
+	*seg_total_left -= part_len;
+
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(m, *offset);
+	desc->data_ptrs[next_triplet].blen = part_len;
+	desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
+	desc->data_ptrs[next_triplet].last = 0;
+	desc->data_ptrs[next_triplet].dma_ext = 0;
+	*offset += part_len;
+	next_triplet++;
+
+	while (cb_len > 0) {
+		if (next_triplet < ACC100_DMA_MAX_NUM_POINTERS &&
+				m->next != NULL) {
+
+			m = m->next;
+			*seg_total_left = rte_pktmbuf_data_len(m);
+			part_len = (*seg_total_left < cb_len) ?
+					*seg_total_left :
+					cb_len;
+			desc->data_ptrs[next_triplet].address =
+					rte_pktmbuf_mtophys(m);
+			desc->data_ptrs[next_triplet].blen = part_len;
+			desc->data_ptrs[next_triplet].blkid =
+					ACC100_DMA_BLKID_IN;
+			desc->data_ptrs[next_triplet].last = 0;
+			desc->data_ptrs[next_triplet].dma_ext = 0;
+			cb_len -= part_len;
+			*seg_total_left -= part_len;
+			/* Initializing offset for next segment (mbuf) */
+			*offset = part_len;
+			next_triplet++;
+		} else {
+			rte_bbdev_log(ERR,
+				"Some data still left for processing: "
+				"data_left: %u, next_triplet: %u, next_mbuf: %p",
+				cb_len, next_triplet, m->next);
+			return -EINVAL;
+		}
+	}
+	/* Storing new mbuf as it could be changed in scatter-gather case */
+	*input = m;
+
+	return next_triplet;
+}
+
+/* Fills descriptor with data pointers of one block type.
+ * Returns index of next triplet on success, other value if lengths of
+ * output data and processed mbuf do not match.
+ */
+static inline int
+acc100_dma_fill_blk_type_out(struct acc100_dma_req_desc *desc,
+		struct rte_mbuf *output, uint32_t out_offset,
+		uint32_t output_len, int next_triplet, int blk_id)
+{
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(output, out_offset);
+	desc->data_ptrs[next_triplet].blen = output_len;
+	desc->data_ptrs[next_triplet].blkid = blk_id;
+	desc->data_ptrs[next_triplet].last = 0;
+	desc->data_ptrs[next_triplet].dma_ext = 0;
+	next_triplet++;
+
+	return next_triplet;
+}
+
+static inline int
+acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
+		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
+		struct rte_mbuf *output, uint32_t *in_offset,
+		uint32_t *out_offset, uint32_t *out_length,
+		uint32_t *mbuf_total_left, uint32_t *seg_total_left)
+{
+	int next_triplet = 1; /* FCW already done */
+	uint16_t K, in_length_in_bits, in_length_in_bytes;
+	struct rte_bbdev_op_ldpc_enc *enc = &op->ldpc_enc;
+
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /**< Timestamp could be disabled */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+
+	K = (enc->basegraph == 1 ? 22 : 10) * enc->z_c;
+	in_length_in_bits = K - enc->n_filler;
+	if ((enc->op_flags & RTE_BBDEV_LDPC_CRC_24A_ATTACH) ||
+			(enc->op_flags & RTE_BBDEV_LDPC_CRC_24B_ATTACH))
+		in_length_in_bits -= 24;
+	in_length_in_bytes = in_length_in_bits >> 3;
+
+	if (unlikely((*mbuf_total_left == 0) ||
+			(*mbuf_total_left < in_length_in_bytes))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, in_length_in_bytes);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
+			in_length_in_bytes,
+			seg_total_left, next_triplet);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= in_length_in_bytes;
+
+	/* Set output length */
+	/* Integer round up division by 8 */
+	*out_length = (enc->cb_params.e + 7) >> 3;
+
+	next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
+			*out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	op->ldpc_enc.output.length += *out_length;
+	*out_offset += *out_length;
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->data_ptrs[next_triplet - 1].dma_ext = 0;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
+
+static inline int
+acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
+		struct acc100_dma_req_desc *desc,
+		struct rte_mbuf **input, struct rte_mbuf *h_output,
+		uint32_t *in_offset, uint32_t *h_out_offset,
+		uint32_t *h_out_length, uint32_t *mbuf_total_left,
+		uint32_t *seg_total_left,
+		struct acc100_fcw_ld *fcw)
+{
+	struct rte_bbdev_op_ldpc_dec *dec = &op->ldpc_dec;
+	int next_triplet = 1; /* FCW already done */
+	uint32_t input_length;
+	uint16_t output_length, crc24_overlap = 0;
+	uint16_t sys_cols, K, h_p_size, h_np_size;
+	bool h_comp = check_bit(dec->op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /**< Timestamp could be disabled */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+
+	if (check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP))
+		crc24_overlap = 24;
+
+	/* Compute some LDPC BG lengths */
+	input_length = dec->cb_params.e;
+	if (check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_LLR_COMPRESSION))
+		input_length = (input_length * 3 + 3) / 4;
+	sys_cols = (dec->basegraph == 1) ? 22 : 10;
+	K = sys_cols * dec->z_c;
+	output_length = K - dec->n_filler - crc24_overlap;
+
+	if (unlikely((*mbuf_total_left == 0) ||
+			(*mbuf_total_left < input_length))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, input_length);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input,
+			in_offset, input_length,
+			seg_total_left, next_triplet);
+
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
+		h_p_size = fcw->hcin_size0 + fcw->hcin_size1;
+		if (h_comp)
+			h_p_size = (h_p_size * 3 + 3) / 4;
+		desc->data_ptrs[next_triplet].address =
+				dec->harq_combined_input.offset;
+		desc->data_ptrs[next_triplet].blen = h_p_size;
+		desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN_HARQ;
+		desc->data_ptrs[next_triplet].dma_ext = 1;
+#ifndef ACC100_EXT_MEM
+		acc100_dma_fill_blk_type_out(
+				desc,
+				op->ldpc_dec.harq_combined_input.data,
+				op->ldpc_dec.harq_combined_input.offset,
+				h_p_size,
+				next_triplet,
+				ACC100_DMA_BLKID_IN_HARQ);
+#endif
+		next_triplet++;
+	}
+
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= input_length;
+
+	next_triplet = acc100_dma_fill_blk_type_out(desc, h_output,
+			*h_out_offset, output_length >> 3, next_triplet,
+			ACC100_DMA_BLKID_OUT_HARD);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
+		/* Pruned size of the HARQ */
+		h_p_size = fcw->hcout_size0 + fcw->hcout_size1;
+		/* Non-Pruned size of the HARQ */
+		h_np_size = fcw->hcout_offset > 0 ?
+				fcw->hcout_offset + fcw->hcout_size1 :
+				h_p_size;
+		if (h_comp) {
+			h_np_size = (h_np_size * 3 + 3) / 4;
+			h_p_size = (h_p_size * 3 + 3) / 4;
+		}
+		dec->harq_combined_output.length = h_np_size;
+		desc->data_ptrs[next_triplet].address =
+				dec->harq_combined_output.offset;
+		desc->data_ptrs[next_triplet].blen = h_p_size;
+		desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_OUT_HARQ;
+		desc->data_ptrs[next_triplet].dma_ext = 1;
+#ifndef ACC100_EXT_MEM
+		acc100_dma_fill_blk_type_out(
+				desc,
+				dec->harq_combined_output.data,
+				dec->harq_combined_output.offset,
+				h_p_size,
+				next_triplet,
+				ACC100_DMA_BLKID_OUT_HARQ);
+#endif
+		next_triplet++;
+	}
+
+	*h_out_length = output_length >> 3;
+	dec->hard_output.length += *h_out_length;
+	*h_out_offset += *h_out_length;
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
+
+static inline void
+acc100_dma_desc_ld_update(struct rte_bbdev_dec_op *op,
+		struct acc100_dma_req_desc *desc,
+		struct rte_mbuf *input, struct rte_mbuf *h_output,
+		uint32_t *in_offset, uint32_t *h_out_offset,
+		uint32_t *h_out_length,
+		union acc100_harq_layout_data *harq_layout)
+{
+	int next_triplet = 1; /* FCW already done */
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(input, *in_offset);
+	next_triplet++;
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
+		struct rte_bbdev_op_data hi = op->ldpc_dec.harq_combined_input;
+		desc->data_ptrs[next_triplet].address = hi.offset;
+#ifndef ACC100_EXT_MEM
+		desc->data_ptrs[next_triplet].address =
+				rte_pktmbuf_iova_offset(hi.data, hi.offset);
+#endif
+		next_triplet++;
+	}
+
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(h_output, *h_out_offset);
+	*h_out_length = desc->data_ptrs[next_triplet].blen;
+	next_triplet++;
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
+		desc->data_ptrs[next_triplet].address =
+				op->ldpc_dec.harq_combined_output.offset;
+		/* Adjust based on previous operation */
+		struct rte_bbdev_dec_op *prev_op = desc->op_addr;
+		op->ldpc_dec.harq_combined_output.length =
+				prev_op->ldpc_dec.harq_combined_output.length;
+		int16_t hq_idx = op->ldpc_dec.harq_combined_output.offset /
+				ACC100_HARQ_OFFSET;
+		int16_t prev_hq_idx =
+				prev_op->ldpc_dec.harq_combined_output.offset
+				/ ACC100_HARQ_OFFSET;
+		harq_layout[hq_idx].val = harq_layout[prev_hq_idx].val;
+#ifndef ACC100_EXT_MEM
+		struct rte_bbdev_op_data ho =
+				op->ldpc_dec.harq_combined_output;
+		desc->data_ptrs[next_triplet].address =
+				rte_pktmbuf_iova_offset(ho.data, ho.offset);
+#endif
+		next_triplet++;
+	}
+
+	op->ldpc_dec.hard_output.length += *h_out_length;
+	desc->op_addr = op;
+}
+
+
+/* Enqueue a number of operations to HW and update software rings */
+static inline void
+acc100_dma_enqueue(struct acc100_queue *q, uint16_t n,
+		struct rte_bbdev_stats *queue_stats)
+{
+	union acc100_enqueue_reg_fmt enq_req;
+#ifdef RTE_BBDEV_OFFLOAD_COST
+	uint64_t start_time = 0;
+	queue_stats->acc_offload_cycles = 0;
+#else
+	RTE_SET_USED(queue_stats);
+#endif
+
+	enq_req.val = 0;
+	/* Setting offset, 100b for 256 DMA Desc */
+	enq_req.addr_offset = ACC100_DESC_OFFSET;
+
+	/* Split ops into batches */
+	do {
+		union acc100_dma_desc *desc;
+		uint16_t enq_batch_size;
+		uint64_t offset;
+		rte_iova_t req_elem_addr;
+
+		enq_batch_size = RTE_MIN(n, MAX_ENQ_BATCH_SIZE);
+
+		/* Set flag on last descriptor in a batch */
+		desc = q->ring_addr + ((q->sw_ring_head + enq_batch_size - 1) &
+				q->sw_ring_wrap_mask);
+		desc->req.last_desc_in_batch = 1;
+
+		/* Calculate the 1st descriptor's address */
+		offset = ((q->sw_ring_head & q->sw_ring_wrap_mask) *
+				sizeof(union acc100_dma_desc));
+		req_elem_addr = q->ring_addr_phys + offset;
+
+		/* Fill enqueue struct */
+		enq_req.num_elem = enq_batch_size;
+		/* low 6 bits are not needed */
+		enq_req.req_elem_addr = (uint32_t)(req_elem_addr >> 6);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "Req sdone", desc, sizeof(*desc));
+#endif
+		rte_bbdev_log_debug(
+				"Enqueue %u reqs (phys %#"PRIx64") to reg %p",
+				enq_batch_size,
+				req_elem_addr,
+				(void *)q->mmio_reg_enqueue);
+
+		rte_wmb();
+
+#ifdef RTE_BBDEV_OFFLOAD_COST
+		/* Start time measurement for enqueue function offload. */
+		start_time = rte_rdtsc_precise();
+#endif
+		rte_bbdev_log(DEBUG, "Debug : MMIO Enqueue");
+		mmio_write(q->mmio_reg_enqueue, enq_req.val);
+
+#ifdef RTE_BBDEV_OFFLOAD_COST
+		queue_stats->acc_offload_cycles +=
+				rte_rdtsc_precise() - start_time;
+#endif
+
+		q->aq_enqueued++;
+		q->sw_ring_head += enq_batch_size;
+		n -= enq_batch_size;
+
+	} while (n);
+}
+
+/* Enqueue a number of encode operations for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
+		uint16_t total_enqueued_cbs, int16_t num)
+{
+	union acc100_dma_desc *desc = NULL;
+	uint32_t out_length;
+	struct rte_mbuf *output_head, *output;
+	int i, next_triplet;
+	uint16_t  in_length_in_bytes;
+	struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_le_fill(ops[0], &desc->req.fcw_le, num);
+
+	/* This could be done at polling time */
+	desc->req.word0 = ACC100_DMA_DESC_TYPE;
+	desc->req.word1 = 0; /**< Timestamp could be disabled */
+	desc->req.word2 = 0;
+	desc->req.word3 = 0;
+	desc->req.numCBs = num;
+
+	in_length_in_bytes = ops[0]->ldpc_enc.input.data->data_len;
+	out_length = (enc->cb_params.e + 7) >> 3;
+	desc->req.m2dlen = 1 + num;
+	desc->req.d2mlen = num;
+	next_triplet = 1;
+
+	for (i = 0; i < num; i++) {
+		desc->req.data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(ops[i]->ldpc_enc.input.data, 0);
+		desc->req.data_ptrs[next_triplet].blen = in_length_in_bytes;
+		next_triplet++;
+		desc->req.data_ptrs[next_triplet].address =
+				rte_pktmbuf_iova_offset(
+				ops[i]->ldpc_enc.output.data, 0);
+		desc->req.data_ptrs[next_triplet].blen = out_length;
+		next_triplet++;
+		ops[i]->ldpc_enc.output.length = out_length;
+		output_head = output = ops[i]->ldpc_enc.output.data;
+		mbuf_append(output_head, output, out_length);
+		output->data_len = out_length;
+	}
+
+	desc->req.op_addr = ops[0];
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_le,
+			sizeof(desc->req.fcw_le) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+	/* Several CBs (ops) were muxed into a single descriptor */
+	return num;
+}
+
+/* Enqueue one encode operation for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+		uint16_t total_enqueued_cbs)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+		seg_total_left;
+	struct rte_mbuf *input, *output_head, *output;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_le_fill(op, &desc->req.fcw_le, 1);
+
+	input = op->ldpc_enc.input.data;
+	output_head = output = op->ldpc_enc.output.data;
+	in_offset = op->ldpc_enc.input.offset;
+	out_offset = op->ldpc_enc.output.offset;
+	out_length = 0;
+	mbuf_total_left = op->ldpc_enc.input.length;
+	seg_total_left = rte_pktmbuf_data_len(op->ldpc_enc.input.data)
+			- in_offset;
+
+	ret = acc100_dma_desc_le_fill(op, &desc->req, &input, output,
+			&in_offset, &out_offset, &out_length, &mbuf_total_left,
+			&seg_total_left);
+
+	if (unlikely(ret < 0))
+		return ret;
+
+	mbuf_append(output_head, output, out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_le,
+			sizeof(desc->req.fcw_le) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+
+	/* Check if any data left after processing one CB */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left after processing one CB: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+/* Enqueue one decode operation for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs, bool same_op)
+{
+	int ret;
+
+	union acc100_dma_desc *desc;
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	struct rte_mbuf *input, *h_output_head, *h_output;
+	uint32_t in_offset, h_out_offset, mbuf_total_left, h_out_length = 0;
+	input = op->ldpc_dec.input.data;
+	h_output_head = h_output = op->ldpc_dec.hard_output.data;
+	in_offset = op->ldpc_dec.input.offset;
+	h_out_offset = op->ldpc_dec.hard_output.offset;
+	mbuf_total_left = op->ldpc_dec.input.length;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(input == NULL)) {
+		rte_bbdev_log(ERR, "Invalid mbuf pointer");
+		return -EFAULT;
+	}
+#endif
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+
+	if (same_op) {
+		union acc100_dma_desc *prev_desc;
+		desc_idx = ((q->sw_ring_head + total_enqueued_cbs - 1)
+				& q->sw_ring_wrap_mask);
+		prev_desc = q->ring_addr + desc_idx;
+		uint8_t *prev_ptr = (uint8_t *) prev_desc;
+		uint8_t *new_ptr = (uint8_t *) desc;
+		/* Copy first 4 words and BDESCs */
+		rte_memcpy(new_ptr, prev_ptr, 16);
+		rte_memcpy(new_ptr + 36, prev_ptr + 36, 40);
+		desc->req.op_addr = prev_desc->req.op_addr;
+		/* Copy FCW */
+		rte_memcpy(new_ptr + ACC100_DESC_FCW_OFFSET,
+				prev_ptr + ACC100_DESC_FCW_OFFSET,
+				ACC100_FCW_LD_BLEN);
+		acc100_dma_desc_ld_update(op, &desc->req, input, h_output,
+				&in_offset, &h_out_offset,
+				&h_out_length, harq_layout);
+	} else {
+		struct acc100_fcw_ld *fcw;
+		uint32_t seg_total_left;
+		fcw = &desc->req.fcw_ld;
+		acc100_fcw_ld_fill(op, fcw, harq_layout);
+
+		/* Special handling when overusing mbuf */
+		if (fcw->rm_e < MAX_E_MBUF)
+			seg_total_left = rte_pktmbuf_data_len(input)
+					- in_offset;
+		else
+			seg_total_left = fcw->rm_e;
+
+		ret = acc100_dma_desc_ld_fill(op, &desc->req, &input, h_output,
+				&in_offset, &h_out_offset,
+				&h_out_length, &mbuf_total_left,
+				&seg_total_left, fcw);
+		if (unlikely(ret < 0))
+			return ret;
+	}
+
+	/* Hard output */
+	mbuf_append(h_output_head, h_output, h_out_length);
+#ifndef ACC100_EXT_MEM
+	if (op->ldpc_dec.harq_combined_output.length > 0) {
+		/* Push the HARQ output into host memory */
+		struct rte_mbuf *hq_output_head, *hq_output;
+		hq_output_head = op->ldpc_dec.harq_combined_output.data;
+		hq_output = op->ldpc_dec.harq_combined_output.data;
+		mbuf_append(hq_output_head, hq_output,
+				op->ldpc_dec.harq_combined_output.length);
+	}
+#endif
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
+			sizeof(desc->req.fcw_ld) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+/* Enqueue one decode operation for ACC100 device in TB mode */
+static inline int
+enqueue_ldpc_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint8_t r, c;
+	uint32_t in_offset, h_out_offset,
+		h_out_length, mbuf_total_left, seg_total_left;
+	struct rte_mbuf *input, *h_output_head, *h_output;
+	uint16_t current_enqueued_cbs = 0;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+	acc100_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout);
+
+	input = op->ldpc_dec.input.data;
+	h_output_head = h_output = op->ldpc_dec.hard_output.data;
+	in_offset = op->ldpc_dec.input.offset;
+	h_out_offset = op->ldpc_dec.hard_output.offset;
+	h_out_length = 0;
+	mbuf_total_left = op->ldpc_dec.input.length;
+	c = op->ldpc_dec.tb_params.c;
+	r = op->ldpc_dec.tb_params.r;
+
+	while (mbuf_total_left > 0 && r < c) {
+
+		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+
+		/* Set up DMA descriptor */
+		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+				& q->sw_ring_wrap_mask);
+		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+		desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
+		ret = acc100_dma_desc_ld_fill(op, &desc->req, &input,
+				h_output, &in_offset, &h_out_offset,
+				&h_out_length,
+				&mbuf_total_left, &seg_total_left,
+				&desc->req.fcw_ld);
+
+		if (unlikely(ret < 0))
+			return ret;
+
+		/* Hard output */
+		mbuf_append(h_output_head, h_output, h_out_length);
+
+		/* Set total number of CBs in TB */
+		desc->req.cbs_in_tb = cbs_in_tb;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
+				sizeof(desc->req.fcw_ld) - 8);
+		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+		if (seg_total_left == 0) {
+			/* Go to the next mbuf */
+			input = input->next;
+			in_offset = 0;
+			h_output = h_output->next;
+			h_out_offset = 0;
+		}
+		total_enqueued_cbs++;
+		current_enqueued_cbs++;
+		r++;
+	}
+
+	if (unlikely(desc == NULL))
+		return current_enqueued_cbs;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Check if any CBs left for processing */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left for processing: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+	/* Set SDone on last CB descriptor for TB mode */
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	return current_enqueued_cbs;
+}
+
+/* Calculates number of CBs in processed encoder TB based on 'r' and input
+ * length.
+ */
+static inline uint8_t
+get_num_cbs_in_tb_enc(struct rte_bbdev_op_turbo_enc *turbo_enc)
+{
+	uint8_t c, c_neg, r, crc24_bits = 0;
+	uint16_t k, k_neg, k_pos;
+	uint8_t cbs_in_tb = 0;
+	int32_t length;
+
+	length = turbo_enc->input.length;
+	r = turbo_enc->tb_params.r;
+	c = turbo_enc->tb_params.c;
+	c_neg = turbo_enc->tb_params.c_neg;
+	k_neg = turbo_enc->tb_params.k_neg;
+	k_pos = turbo_enc->tb_params.k_pos;
+	crc24_bits = 0;
+	if (check_bit(turbo_enc->op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH))
+		crc24_bits = 24;
+	while (length > 0 && r < c) {
+		k = (r < c_neg) ? k_neg : k_pos;
+		length -= (k - crc24_bits) >> 3;
+		r++;
+		cbs_in_tb++;
+	}
+
+	return cbs_in_tb;
+}
+
+/* Calculates number of CBs in processed decoder TB based on 'r' and input
+ * length.
+ */
+static inline uint16_t
+get_num_cbs_in_tb_dec(struct rte_bbdev_op_turbo_dec *turbo_dec)
+{
+	uint8_t c, c_neg, r = 0;
+	uint16_t kw, k, k_neg, k_pos, cbs_in_tb = 0;
+	int32_t length;
+
+	length = turbo_dec->input.length;
+	r = turbo_dec->tb_params.r;
+	c = turbo_dec->tb_params.c;
+	c_neg = turbo_dec->tb_params.c_neg;
+	k_neg = turbo_dec->tb_params.k_neg;
+	k_pos = turbo_dec->tb_params.k_pos;
+	while (length > 0 && r < c) {
+		k = (r < c_neg) ? k_neg : k_pos;
+		kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
+		length -= kw;
+		r++;
+		cbs_in_tb++;
+	}
+
+	return cbs_in_tb;
+}
+
+/* Calculates number of CBs in processed LDPC decoder TB based on 'r' and
+ * input length.
+ */
+static inline uint16_t
+get_num_cbs_in_tb_ldpc_dec(struct rte_bbdev_op_ldpc_dec *ldpc_dec)
+{
+	uint16_t r, cbs_in_tb = 0;
+	int32_t length = ldpc_dec->input.length;
+	r = ldpc_dec->tb_params.r;
+	while (length > 0 && r < ldpc_dec->tb_params.c) {
+		length -=  (r < ldpc_dec->tb_params.cab) ?
+				ldpc_dec->tb_params.ea :
+				ldpc_dec->tb_params.eb;
+		r++;
+		cbs_in_tb++;
+	}
+	return cbs_in_tb;
+}
+
+/* Check if the encode operations can be muxed under a common FCW */
+static inline bool
+check_mux(struct rte_bbdev_enc_op **ops, uint16_t num) {
+	uint16_t i;
+	if (num == 1)
+		return false;
+	for (i = 1; i < num; ++i) {
+		/* Only mux compatible code blocks */
+		if (memcmp((uint8_t *)(&ops[i]->ldpc_enc) + ENC_OFFSET,
+				(uint8_t *)(&ops[0]->ldpc_enc) + ENC_OFFSET,
+				CMP_ENC_SIZE) != 0)
+			return false;
+	}
+	return true;
+}
+
+/* Enqueue LDPC encode operations for ACC100 device in CB mode. */
+static inline uint16_t
+acc100_enqueue_ldpc_enc_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i = 0;
+	union acc100_dma_desc *desc;
+	int ret, desc_idx = 0;
+	int16_t enq, left = num;
+
+	while (left > 0) {
+		if (unlikely(avail - 1 < 0))
+			break;
+		avail--;
+		enq = RTE_MIN(left, MUX_5GDL_DESC);
+		if (check_mux(&ops[i], enq)) {
+			ret = enqueue_ldpc_enc_n_op_cb(q, &ops[i],
+					desc_idx, enq);
+			if (ret < 0)
+				break;
+			i += enq;
+		} else {
+			ret = enqueue_ldpc_enc_one_op_cb(q, ops[i], desc_idx);
+			if (ret < 0)
+				break;
+			i++;
+		}
+		desc_idx++;
+		left = num - i;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + desc_idx - 1)
+			& q->sw_ring_wrap_mask);
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, desc_idx, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
+}
+
+/* Enqueue encode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	if (unlikely(num == 0))
+		return 0;
+	return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
+}
+
+/* Check that two consecutive LDPC decode operations are compatible
+ * (common FCW), so the previous descriptor content can be reused.
+ */
+static inline bool
+cmp_ldpc_dec_op(struct rte_bbdev_dec_op **ops) {
+	/* Only mux compatible code blocks */
+	if (memcmp((uint8_t *)(&ops[0]->ldpc_dec) + DEC_OFFSET,
+			(uint8_t *)(&ops[1]->ldpc_dec) + DEC_OFFSET,
+			CMP_DEC_SIZE) != 0)
+		return false;
+	return true;
+}
+
+/* Enqueue decode operations for ACC100 device in TB mode */
+static uint16_t
+acc100_enqueue_ldpc_dec_tb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i, enqueued_cbs = 0;
+	uint8_t cbs_in_tb;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		cbs_in_tb = get_num_cbs_in_tb_ldpc_dec(&ops[i]->ldpc_dec);
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - cbs_in_tb < 0))
+			break;
+		avail -= cbs_in_tb;
+
+		ret = enqueue_ldpc_dec_one_op_tb(q, ops[i],
+				enqueued_cbs, cbs_in_tb);
+		if (ret < 0)
+			break;
+		enqueued_cbs += ret;
+	}
+
+	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+	return i;
+}
+
+/* Enqueue decode operations for ACC100 device in CB mode */
+static uint16_t
+acc100_enqueue_ldpc_dec_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i;
+	union acc100_dma_desc *desc;
+	int ret;
+	bool same_op = false;
+	for (i = 0; i < num; ++i) {
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - 1 < 0))
+			break;
+		avail -= 1;
+
+		if (i > 0)
+			same_op = cmp_ldpc_dec_op(&ops[i-1]);
+		rte_bbdev_log(INFO, "Op %d %d %d %d %d %d %d %d %d %d %d %d",
+			i, ops[i]->ldpc_dec.op_flags, ops[i]->ldpc_dec.rv_index,
+			ops[i]->ldpc_dec.iter_max, ops[i]->ldpc_dec.iter_count,
+			ops[i]->ldpc_dec.basegraph, ops[i]->ldpc_dec.z_c,
+			ops[i]->ldpc_dec.n_cb, ops[i]->ldpc_dec.q_m,
+			ops[i]->ldpc_dec.n_filler, ops[i]->ldpc_dec.cb_params.e,
+			same_op);
+		ret = enqueue_ldpc_dec_one_op_cb(q, ops[i], i, same_op);
+		if (ret < 0)
+			break;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
+			& q->sw_ring_wrap_mask);
+
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, i, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+	return i;
+}
+
+/* Enqueue decode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t aq_avail = q->aq_depth +
+			(q->aq_dequeued - q->aq_enqueued) / 128;
+
+	if (unlikely((aq_avail == 0) || (num == 0)))
+		return 0;
+
+	if (ops[0]->ldpc_dec.code_block_mode == 0)
+		return acc100_enqueue_ldpc_dec_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_ldpc_dec_cb(q_data, ops, num);
+}
+
+/* Dequeue one encode operation from ACC100 device in CB mode */
+static inline int
+dequeue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
+		uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_enc_op *op;
+	int i;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	rsp.val = atom_desc.rsp.val;
+	rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+
+	op->status |= ((rsp.input_err)
+			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+	if (desc->req.last_desc_in_batch) {
+		(*aq_dequeued)++;
+		desc->req.last_desc_in_batch = 0;
+	}
+	desc->rsp.val = ACC100_DMA_DESC_TYPE;
+	desc->rsp.add_info_0 = 0; /*Reserved bits */
+	desc->rsp.add_info_1 = 0; /*Reserved bits */
+
+	/* Flag that the muxing causes loss of opaque data */
+	op->opaque_data = (void *)-1;
+	for (i = 0 ; i < desc->req.numCBs; i++)
+		ref_op[i] = op;
+
+	/* One CB (op) was successfully dequeued */
+	return desc->req.numCBs;
+}
+
+/* Dequeue one encode operation from ACC100 device in TB mode */
+static inline int
+dequeue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
+		uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, *last_desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_enc_op *op;
+	uint8_t i = 0;
+	uint16_t current_dequeued_cbs = 0, cbs_in_tb;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	/* Get number of CBs in dequeued TB */
+	cbs_in_tb = desc->req.cbs_in_tb;
+	/* Get last CB */
+	last_desc = q->ring_addr + ((q->sw_ring_tail
+			+ total_dequeued_cbs + cbs_in_tb - 1)
+			& q->sw_ring_wrap_mask);
+	/* Check if last CB in TB is ready to dequeue (and thus
+	 * the whole TB) - checking sdone bit. If not return.
+	 */
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
+			__ATOMIC_RELAXED);
+	if (!(atom_desc.rsp.val & ACC100_SDONE))
+		return -1;
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+
+	while (i < cbs_in_tb) {
+		desc = q->ring_addr + ((q->sw_ring_tail
+				+ total_dequeued_cbs)
+				& q->sw_ring_wrap_mask);
+		atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+				__ATOMIC_RELAXED);
+		rsp.val = atom_desc.rsp.val;
+		rte_bbdev_log_debug("Resp. desc %p: %x", desc,
+				rsp.val);
+
+		op->status |= ((rsp.input_err)
+				? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+		op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+		op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+		if (desc->req.last_desc_in_batch) {
+			(*aq_dequeued)++;
+			desc->req.last_desc_in_batch = 0;
+		}
+		desc->rsp.val = ACC100_DMA_DESC_TYPE;
+		desc->rsp.add_info_0 = 0;
+		desc->rsp.add_info_1 = 0;
+		total_dequeued_cbs++;
+		current_dequeued_cbs++;
+		i++;
+	}
+
+	*ref_op = op;
+
+	return current_dequeued_cbs;
+}
+
+/* Dequeue one decode operation from ACC100 device in CB mode */
+static inline int
+dequeue_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
+		struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_dec_op *op;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	rsp.val = atom_desc.rsp.val;
+	rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+	op->status |= ((rsp.input_err)
+			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+	if (op->status != 0)
+		q_data->queue_stats.dequeue_err_count++;
+
+	/* Report CRC status only when no other error has been flagged */
+	if (!op->status)
+		op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+	op->turbo_dec.iter_count = (uint8_t) rsp.iter_cnt / 2;
+	/* Check if this is the last desc in batch (Atomic Queue) */
+	if (desc->req.last_desc_in_batch) {
+		(*aq_dequeued)++;
+		desc->req.last_desc_in_batch = 0;
+	}
+	desc->rsp.val = ACC100_DMA_DESC_TYPE;
+	desc->rsp.add_info_0 = 0;
+	desc->rsp.add_info_1 = 0;
+	*ref_op = op;
+
+	/* One CB (op) was successfully dequeued */
+	return 1;
+}
+
+/* Dequeue one LDPC decode operation from ACC100 device in CB mode */
+static inline int
+dequeue_ldpc_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
+		struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_dec_op *op;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	rsp.val = atom_desc.rsp.val;
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+	op->status |= rsp.input_err << RTE_BBDEV_DATA_ERROR;
+	op->status |= rsp.dma_err << RTE_BBDEV_DRV_ERROR;
+	op->status |= rsp.fcw_err << RTE_BBDEV_DRV_ERROR;
+	if (op->status != 0)
+		q_data->queue_stats.dequeue_err_count++;
+
+	op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+	if (op->ldpc_dec.hard_output.length > 0 && !rsp.synd_ok)
+		op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
+	op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
+
+	/* Check if this is the last desc in batch (Atomic Queue) */
+	if (desc->req.last_desc_in_batch) {
+		(*aq_dequeued)++;
+		desc->req.last_desc_in_batch = 0;
+	}
+
+	desc->rsp.val = ACC100_DMA_DESC_TYPE;
+	desc->rsp.add_info_0 = 0;
+	desc->rsp.add_info_1 = 0;
+
+	*ref_op = op;
+
+	/* One CB (op) was successfully dequeued */
+	return 1;
+}
+
+/* Dequeue one decode operation from ACC100 device in TB mode. */
+static inline int
+dequeue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, *last_desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_dec_op *op;
+	uint8_t cbs_in_tb = 1, cb_idx = 0;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Get number of CBs in dequeued TB */
+	cbs_in_tb = desc->req.cbs_in_tb;
+	/* Get last CB */
+	last_desc = q->ring_addr + ((q->sw_ring_tail
+			+ dequeued_cbs + cbs_in_tb - 1)
+			& q->sw_ring_wrap_mask);
+	/* Check if last CB in TB is ready to dequeue (and thus
+	 * the whole TB) - checking sdone bit. If not return.
+	 */
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
+			__ATOMIC_RELAXED);
+	if (!(atom_desc.rsp.val & ACC100_SDONE))
+		return -1;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+
+	/* Read remaining CBs if exists */
+	while (cb_idx < cbs_in_tb) {
+		desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+				& q->sw_ring_wrap_mask);
+		atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+				__ATOMIC_RELAXED);
+		rsp.val = atom_desc.rsp.val;
+		rte_bbdev_log_debug("Resp. desc %p: %x", desc,
+				rsp.val);
+
+		op->status |= ((rsp.input_err)
+				? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+		op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+		op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+		/* Report CRC status only when no other error has been flagged */
+		if (!op->status)
+			op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+		op->turbo_dec.iter_count = RTE_MAX((uint8_t) rsp.iter_cnt,
+				op->turbo_dec.iter_count);
+
+		/* Check if this is the last desc in batch (Atomic Queue) */
+		if (desc->req.last_desc_in_batch) {
+			(*aq_dequeued)++;
+			desc->req.last_desc_in_batch = 0;
+		}
+		desc->rsp.val = ACC100_DMA_DESC_TYPE;
+		desc->rsp.add_info_0 = 0;
+		desc->rsp.add_info_1 = 0;
+		dequeued_cbs++;
+		cb_idx++;
+	}
+
+	*ref_op = op;
+
+	return cb_idx;
+}
+
+/* Dequeue LDPC encode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t dequeue_num, i, dequeued_cbs = 0, dequeued_descs = 0;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = (avail < num) ? avail : num;
+
+	for (i = 0; i < dequeue_num; i++) {
+		ret = dequeue_enc_one_op_cb(q, &ops[dequeued_cbs],
+				dequeued_descs, &aq_dequeued);
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+		dequeued_descs++;
+		if (dequeued_cbs >= num)
+			break;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_descs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += dequeued_cbs;
+
+	return dequeued_cbs;
+}
+
+/* Dequeue decode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint16_t dequeue_num;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t i;
+	uint16_t dequeued_cbs = 0;
+	struct rte_bbdev_dec_op *op;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = (avail < num) ? avail : num;
+
+	for (i = 0; i < dequeue_num; ++i) {
+		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask))->req.op_addr;
+		if (op->ldpc_dec.code_block_mode == 0)
+			ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+		else
+			ret = dequeue_ldpc_dec_one_op_cb(
+					q_data, q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_cbs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += i;
+
+	return i;
+}
+
 /* Initialization Function */
 static void
 acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
@@ -703,6 +2321,10 @@
 	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
 
 	dev->dev_ops = &acc100_bbdev_ops;
+	dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
+	dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
+	dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
+	dev->dequeue_ldpc_dec_ops = acc100_dequeue_ldpc_dec;
 
 	((struct acc100_device *) dev->data->dev_private)->pf_device =
 			!strcmp(drv->driver.name,
@@ -815,4 +2437,3 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
 RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
 RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
 RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
-
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 0e2b79c..78686c1 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -88,6 +88,8 @@
 #define TMPL_PRI_3      0x0f0e0d0c
 #define QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
 #define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
+#define ACC100_FDONE    0x80000000
+#define ACC100_SDONE    0x40000000
 
 #define ACC100_NUM_TMPL  32
 #define VF_OFFSET_QOS 16 /* offset in Memory Space specific to QoS Mon */
@@ -398,6 +400,7 @@ struct __rte_packed acc100_dma_req_desc {
 union acc100_dma_desc {
 	struct acc100_dma_req_desc req;
 	union acc100_dma_rsp_desc rsp;
+	uint64_t atom_hdr;
 };
 
 
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v4 06/11] baseband/acc100: add HARQ loopback support
  2020-09-04 17:53   ` [dpdk-dev] [PATCH v4 00/11] bbdev PMD ACC100 Nicolas Chautru
                       ` (4 preceding siblings ...)
  2020-09-04 17:54     ` [dpdk-dev] [PATCH v4 05/11] baseband/acc100: add LDPC processing functions Nicolas Chautru
@ 2020-09-04 17:54     ` Nicolas Chautru
  2020-09-21  1:41       ` Liu, Tianjiao
  2020-09-04 17:54     ` [dpdk-dev] [PATCH v4 07/11] baseband/acc100: add support for 4G processing Nicolas Chautru
                       ` (5 subsequent siblings)
  11 siblings, 1 reply; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-04 17:54 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add support for HARQ memory loopback.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 158 +++++++++++++++++++++++++++++++
 1 file changed, 158 insertions(+)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 7f64695..5b011a1 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -658,6 +658,7 @@
 				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
 				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
 #ifdef ACC100_EXT_MEM
+				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK |
 				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
 				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
 #endif
@@ -1480,12 +1481,169 @@
 	return 1;
 }
 
+static inline int
+harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs) {
+	struct acc100_fcw_ld *fcw;
+	union acc100_dma_desc *desc;
+	int next_triplet = 1;
+	struct rte_mbuf *hq_output_head, *hq_output;
+	uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
+	if (harq_in_length == 0) {
+		rte_bbdev_log(ERR, "Loopback of invalid null size");
+		return -EINVAL;
+	}
+
+	int h_comp = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
+			) ? 1 : 0;
+	if (h_comp == 1)
+		harq_in_length = harq_in_length * 8 / 6;
+	harq_in_length = RTE_ALIGN(harq_in_length, 64);
+	uint16_t harq_dma_length_in = (h_comp == 0) ?
+			harq_in_length :
+			harq_in_length * 6 / 8;
+	uint16_t harq_dma_length_out = harq_dma_length_in;
+	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+	uint16_t harq_index = (ddr_mem_in ?
+			op->ldpc_dec.harq_combined_input.offset :
+			op->ldpc_dec.harq_combined_output.offset)
+			/ ACC100_HARQ_OFFSET;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	fcw = &desc->req.fcw_ld;
+	/* Set the FCW from loopback into DDR */
+	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
+	fcw->FCWversion = ACC100_FCW_VER;
+	fcw->qm = 2;
+	fcw->Zc = 384;
+	if (harq_in_length < 16 * N_ZC_1)
+		fcw->Zc = 16;
+	fcw->ncb = fcw->Zc * N_ZC_1;
+	fcw->rm_e = 2;
+	fcw->hcin_en = 1;
+	fcw->hcout_en = 1;
+
+	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d",
+			ddr_mem_in, harq_index,
+			harq_layout[harq_index].offset, harq_in_length,
+			harq_dma_length_in);
+
+	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
+		fcw->hcin_size0 = harq_layout[harq_index].size0;
+		fcw->hcin_offset = harq_layout[harq_index].offset;
+		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
+		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
+		if (h_comp == 1)
+			harq_dma_length_in = harq_dma_length_in * 6 / 8;
+	} else {
+		fcw->hcin_size0 = harq_in_length;
+	}
+	harq_layout[harq_index].val = 0;
+	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d",
+			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
+	fcw->hcout_size0 = harq_in_length;
+	fcw->hcin_decomp_mode = h_comp;
+	fcw->hcout_comp_mode = h_comp;
+	fcw->gain_i = 1;
+	fcw->gain_h = 1;
+
+	/* Set the prefix of descriptor. This could be done at polling */
+	desc->req.word0 = ACC100_DMA_DESC_TYPE;
+	desc->req.word1 = 0; /**< Timestamp could be disabled */
+	desc->req.word2 = 0;
+	desc->req.word3 = 0;
+	desc->req.numCBs = 1;
+
+	/* Null LLR input for Decoder */
+	desc->req.data_ptrs[next_triplet].address =
+			q->lb_in_addr_phys;
+	desc->req.data_ptrs[next_triplet].blen = 2;
+	desc->req.data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
+	desc->req.data_ptrs[next_triplet].last = 0;
+	desc->req.data_ptrs[next_triplet].dma_ext = 0;
+	next_triplet++;
+
+	/* HARQ Combine input from either Memory interface */
+	if (!ddr_mem_in) {
+		next_triplet = acc100_dma_fill_blk_type_out(&desc->req,
+				op->ldpc_dec.harq_combined_input.data,
+				op->ldpc_dec.harq_combined_input.offset,
+				harq_dma_length_in,
+				next_triplet,
+				ACC100_DMA_BLKID_IN_HARQ);
+	} else {
+		desc->req.data_ptrs[next_triplet].address =
+				op->ldpc_dec.harq_combined_input.offset;
+		desc->req.data_ptrs[next_triplet].blen =
+				harq_dma_length_in;
+		desc->req.data_ptrs[next_triplet].blkid =
+				ACC100_DMA_BLKID_IN_HARQ;
+		desc->req.data_ptrs[next_triplet].dma_ext = 1;
+		next_triplet++;
+	}
+	desc->req.data_ptrs[next_triplet - 1].last = 1;
+	desc->req.m2dlen = next_triplet;
+
+	/* Dropped decoder hard output */
+	desc->req.data_ptrs[next_triplet].address =
+			q->lb_out_addr_phys;
+	desc->req.data_ptrs[next_triplet].blen = BYTES_IN_WORD;
+	desc->req.data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_OUT_HARD;
+	desc->req.data_ptrs[next_triplet].last = 0;
+	desc->req.data_ptrs[next_triplet].dma_ext = 0;
+	next_triplet++;
+
+	/* HARQ Combine output to either Memory interface */
+	if (check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE
+			)) {
+		desc->req.data_ptrs[next_triplet].address =
+				op->ldpc_dec.harq_combined_output.offset;
+		desc->req.data_ptrs[next_triplet].blen =
+				harq_dma_length_out;
+		desc->req.data_ptrs[next_triplet].blkid =
+				ACC100_DMA_BLKID_OUT_HARQ;
+		desc->req.data_ptrs[next_triplet].dma_ext = 1;
+		next_triplet++;
+	} else {
+		hq_output_head = op->ldpc_dec.harq_combined_output.data;
+		hq_output = op->ldpc_dec.harq_combined_output.data;
+		next_triplet = acc100_dma_fill_blk_type_out(
+				&desc->req,
+				op->ldpc_dec.harq_combined_output.data,
+				op->ldpc_dec.harq_combined_output.offset,
+				harq_dma_length_out,
+				next_triplet,
+				ACC100_DMA_BLKID_OUT_HARQ);
+		/* HARQ output */
+		mbuf_append(hq_output_head, hq_output, harq_dma_length_out);
+		op->ldpc_dec.harq_combined_output.length =
+				harq_dma_length_out;
+	}
+	desc->req.data_ptrs[next_triplet - 1].last = 1;
+	desc->req.d2mlen = next_triplet - desc->req.m2dlen;
+	desc->req.op_addr = op;
+
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
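The 6-bit HARQ compression arithmetic above (expand the stored length by 8/6, align to a 64-byte boundary, then shrink the DMA length back by 6/8) can be sketched as a standalone helper. `harq_dma_len` and `ALIGN_UP` are illustrative names, not part of the PMD:

```c
#include <assert.h>
#include <stdint.h>

/* Round x up to a multiple of a (a power of two), as RTE_ALIGN() does. */
#define ALIGN_UP(x, a) (((x) + (a) - 1) & ~((a) - 1))

/* Hypothetical helper mirroring the sizing logic in harq_loopback():
 * expand a 6-bit-compressed HARQ length to its uncompressed size,
 * align it to 64 bytes, then derive the DMA transfer length in
 * compressed bytes again.
 */
static uint16_t
harq_dma_len(uint16_t harq_in_length, int h_comp)
{
	if (h_comp)
		harq_in_length = harq_in_length * 8 / 6;
	harq_in_length = ALIGN_UP(harq_in_length, 64);
	return h_comp ? harq_in_length * 6 / 8 : harq_in_length;
}
```

For example, a compressed input of 96 bytes expands to 128, stays 64-byte aligned, and transfers as 96 compressed bytes again, while an uncompressed 100-byte input simply rounds up to 128.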
+
 /** Enqueue one decode operations for ACC100 device in CB mode */
 static inline int
 enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
 		uint16_t total_enqueued_cbs, bool same_op)
 {
 	int ret;
+	if (unlikely(check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK))) {
+		ret = harq_loopback(q, op, total_enqueued_cbs);
+		return ret;
+	}
 
 	union acc100_dma_desc *desc;
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v4 07/11] baseband/acc100: add support for 4G processing
  2020-09-04 17:53   ` [dpdk-dev] [PATCH v4 00/11] bbdev PMD ACC100 Nicolas Chautru
                       ` (5 preceding siblings ...)
  2020-09-04 17:54     ` [dpdk-dev] [PATCH v4 06/11] baseband/acc100: add HARQ loopback support Nicolas Chautru
@ 2020-09-04 17:54     ` Nicolas Chautru
  2020-09-21  1:43       ` Liu, Tianjiao
  2020-09-04 17:54     ` [dpdk-dev] [PATCH v4 08/11] baseband/acc100: add interrupt support to PMD Nicolas Chautru
                       ` (4 subsequent siblings)
  11 siblings, 1 reply; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-04 17:54 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Adding capability for 4G encoder and decoder processing

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 1010 ++++++++++++++++++++++++++++--
 1 file changed, 943 insertions(+), 67 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 5b011a1..bd07def 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -339,7 +339,6 @@
 	free_base_addresses(base_addrs, i);
 }
 
-
 /* Allocate 64MB memory used for all software rings */
 static int
 acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
@@ -637,6 +636,41 @@
 
 	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
 		{
+			.type = RTE_BBDEV_OP_TURBO_DEC,
+			.cap.turbo_dec = {
+				.capability_flags =
+					RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE |
+					RTE_BBDEV_TURBO_CRC_TYPE_24B |
+					RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
+					RTE_BBDEV_TURBO_EARLY_TERMINATION |
+					RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
+					RTE_BBDEV_TURBO_MAP_DEC |
+					RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP |
+					RTE_BBDEV_TURBO_DEC_SCATTER_GATHER,
+				.max_llr_modulus = INT8_MAX,
+				.num_buffers_src =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+				.num_buffers_hard_out =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+				.num_buffers_soft_out =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+			}
+		},
+		{
+			.type = RTE_BBDEV_OP_TURBO_ENC,
+			.cap.turbo_enc = {
+				.capability_flags =
+					RTE_BBDEV_TURBO_CRC_24B_ATTACH |
+					RTE_BBDEV_TURBO_RV_INDEX_BYPASS |
+					RTE_BBDEV_TURBO_RATE_MATCH |
+					RTE_BBDEV_TURBO_ENC_SCATTER_GATHER,
+				.num_buffers_src =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+				.num_buffers_dst =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+			}
+		},
+		{
 			.type   = RTE_BBDEV_OP_LDPC_ENC,
 			.cap.ldpc_enc = {
 				.capability_flags =
@@ -719,7 +753,6 @@
 #endif
 }
 
-
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.setup_queues = acc100_setup_queues,
 	.close = acc100_dev_close,
@@ -763,6 +796,58 @@
 	return tail;
 }
 
+/* Fill in a frame control word for turbo encoding. */
+static inline void
+acc100_fcw_te_fill(const struct rte_bbdev_enc_op *op, struct acc100_fcw_te *fcw)
+{
+	fcw->code_block_mode = op->turbo_enc.code_block_mode;
+	if (fcw->code_block_mode == 0) { /* For TB mode */
+		fcw->k_neg = op->turbo_enc.tb_params.k_neg;
+		fcw->k_pos = op->turbo_enc.tb_params.k_pos;
+		fcw->c_neg = op->turbo_enc.tb_params.c_neg;
+		fcw->c = op->turbo_enc.tb_params.c;
+		fcw->ncb_neg = op->turbo_enc.tb_params.ncb_neg;
+		fcw->ncb_pos = op->turbo_enc.tb_params.ncb_pos;
+
+		if (check_bit(op->turbo_enc.op_flags,
+				RTE_BBDEV_TURBO_RATE_MATCH)) {
+			fcw->bypass_rm = 0;
+			fcw->cab = op->turbo_enc.tb_params.cab;
+			fcw->ea = op->turbo_enc.tb_params.ea;
+			fcw->eb = op->turbo_enc.tb_params.eb;
+		} else {
+			/* E is set to the encoding output size when RM is
+			 * bypassed.
+			 */
+			fcw->bypass_rm = 1;
+			fcw->cab = fcw->c_neg;
+			fcw->ea = 3 * fcw->k_neg + 12;
+			fcw->eb = 3 * fcw->k_pos + 12;
+		}
+	} else { /* For CB mode */
+		fcw->k_pos = op->turbo_enc.cb_params.k;
+		fcw->ncb_pos = op->turbo_enc.cb_params.ncb;
+
+		if (check_bit(op->turbo_enc.op_flags,
+				RTE_BBDEV_TURBO_RATE_MATCH)) {
+			fcw->bypass_rm = 0;
+			fcw->eb = op->turbo_enc.cb_params.e;
+		} else {
+			/* E is set to the encoding output size when RM is
+			 * bypassed.
+			 */
+			fcw->bypass_rm = 1;
+			fcw->eb = 3 * fcw->k_pos + 12;
+		}
+	}
+
+	fcw->bypass_rv_idx1 = check_bit(op->turbo_enc.op_flags,
+			RTE_BBDEV_TURBO_RV_INDEX_BYPASS);
+	fcw->code_block_crc = check_bit(op->turbo_enc.op_flags,
+			RTE_BBDEV_TURBO_CRC_24B_ATTACH);
+	fcw->rv_idx1 = op->turbo_enc.rv_index;
+}
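When `RTE_BBDEV_TURBO_RATE_MATCH` is not set, the FCW fill above falls back to the full turbo encoder output size, E = 3*K + 12: three streams (systematic plus two parity) of K bits, plus 4 tail bits per stream. A minimal sketch of that relation, with `turbo_enc_bypass_e` as a hypothetical name:

```c
#include <assert.h>
#include <stdint.h>

/* Full turbo encoder output size in bits when rate matching is
 * bypassed: 3 streams of K bits plus 3 * 4 tail bits (3GPP 36.212).
 * Illustrative helper only; the PMD computes this inline in
 * acc100_fcw_te_fill().
 */
static uint32_t
turbo_enc_bypass_e(uint16_t k)
{
	return 3 * (uint32_t)k + 12;
}
```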
+
 /* Compute value of k0.
  * Based on 3GPP 38.212 Table 5.4.2.1-2
  * Starting position of different redundancy versions, k0
@@ -813,6 +898,25 @@
 	fcw->mcb_count = num_cb;
 }
 
+/* Fill in a frame control word for turbo decoding. */
+static inline void
+acc100_fcw_td_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_td *fcw)
+{
+	/* Note: Early termination is always enabled for 4GUL */
+	fcw->fcw_ver = 1;
+	if (op->turbo_dec.code_block_mode == 0)
+		fcw->k_pos = op->turbo_dec.tb_params.k_pos;
+	else
+		fcw->k_pos = op->turbo_dec.cb_params.k;
+	fcw->turbo_crc_type = check_bit(op->turbo_dec.op_flags,
+			RTE_BBDEV_TURBO_CRC_TYPE_24B);
+	fcw->bypass_sb_deint = 0;
+	fcw->raw_decoder_input_on = 0;
+	fcw->max_iter = op->turbo_dec.iter_max;
+	fcw->half_iter_on = !check_bit(op->turbo_dec.op_flags,
+			RTE_BBDEV_TURBO_HALF_ITERATION_EVEN);
+}
+
 /* Fill in a frame control word for LDPC decoding. */
 static inline void
 acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
@@ -1042,6 +1146,87 @@
 }
 
 static inline int
+acc100_dma_desc_te_fill(struct rte_bbdev_enc_op *op,
+		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
+		struct rte_mbuf *output, uint32_t *in_offset,
+		uint32_t *out_offset, uint32_t *out_length,
+		uint32_t *mbuf_total_left, uint32_t *seg_total_left, uint8_t r)
+{
+	int next_triplet = 1; /* FCW already done */
+	uint32_t e, ea, eb, length;
+	uint16_t k, k_neg, k_pos;
+	uint8_t cab, c_neg;
+
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /**< Timestamp could be disabled */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+
+	if (op->turbo_enc.code_block_mode == 0) {
+		ea = op->turbo_enc.tb_params.ea;
+		eb = op->turbo_enc.tb_params.eb;
+		cab = op->turbo_enc.tb_params.cab;
+		k_neg = op->turbo_enc.tb_params.k_neg;
+		k_pos = op->turbo_enc.tb_params.k_pos;
+		c_neg = op->turbo_enc.tb_params.c_neg;
+		e = (r < cab) ? ea : eb;
+		k = (r < c_neg) ? k_neg : k_pos;
+	} else {
+		e = op->turbo_enc.cb_params.e;
+		k = op->turbo_enc.cb_params.k;
+	}
+
+	if (check_bit(op->turbo_enc.op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH))
+		length = (k - 24) >> 3;
+	else
+		length = k >> 3;
+
+	if (unlikely((*mbuf_total_left == 0) || (*mbuf_total_left < length))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, length);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
+			length, seg_total_left, next_triplet);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= length;
+
+	/* Set output length */
+	if (check_bit(op->turbo_enc.op_flags, RTE_BBDEV_TURBO_RATE_MATCH))
+		/* Integer round up division by 8 */
+		*out_length = (e + 7) >> 3;
+	else
+		*out_length = (k >> 3) * 3 + 2;
+
+	next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
+			*out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	op->turbo_enc.output.length += *out_length;
+	*out_offset += *out_length;
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
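The two output-length branches above reduce to a small pure function: the rate-matched output is E bits rounded up to whole bytes, while the bypass path carries the full 3*K + 12-bit stream, which the code expresses as (K/8)*3 + 2 bytes for byte-aligned K. A sketch under those assumptions (`enc_out_bytes` is an illustrative name):

```c
#include <assert.h>
#include <stdint.h>

/* Encoder output byte length, mirroring acc100_dma_desc_te_fill(). */
static uint32_t
enc_out_bytes(uint32_t e, uint16_t k, int rate_match)
{
	if (rate_match)
		return (e + 7) >> 3;      /* integer round-up division by 8 */
	return ((uint32_t)k >> 3) * 3 + 2; /* (3*K + 12) bits in bytes */
}
```

For K = 6144 the bypass path gives 768*3 + 2 = 2306 bytes, matching ceil((3*6144 + 12)/8).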
+
+static inline int
 acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
 		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
 		struct rte_mbuf *output, uint32_t *in_offset,
@@ -1110,6 +1295,117 @@
 }
 
 static inline int
+acc100_dma_desc_td_fill(struct rte_bbdev_dec_op *op,
+		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
+		struct rte_mbuf *h_output, struct rte_mbuf *s_output,
+		uint32_t *in_offset, uint32_t *h_out_offset,
+		uint32_t *s_out_offset, uint32_t *h_out_length,
+		uint32_t *s_out_length, uint32_t *mbuf_total_left,
+		uint32_t *seg_total_left, uint8_t r)
+{
+	int next_triplet = 1; /* FCW already done */
+	uint16_t k;
+	uint16_t crc24_overlap = 0;
+	uint32_t e, kw;
+
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /**< Timestamp could be disabled */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+
+	if (op->turbo_dec.code_block_mode == 0) {
+		k = (r < op->turbo_dec.tb_params.c_neg)
+			? op->turbo_dec.tb_params.k_neg
+			: op->turbo_dec.tb_params.k_pos;
+		e = (r < op->turbo_dec.tb_params.cab)
+			? op->turbo_dec.tb_params.ea
+			: op->turbo_dec.tb_params.eb;
+	} else {
+		k = op->turbo_dec.cb_params.k;
+		e = op->turbo_dec.cb_params.e;
+	}
+
+	if ((op->turbo_dec.code_block_mode == 0)
+		&& !check_bit(op->turbo_dec.op_flags,
+		RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP))
+		crc24_overlap = 24;
+
+	/* Calculates circular buffer size.
+	 * According to 3gpp 36.212 section 5.1.4.2
+	 *   Kw = 3 * Kpi,
+	 * where:
+	 *   Kpi = nCol * nRow
+	 * where nCol is 32 and nRow can be calculated from:
+	 *   D <= nCol * nRow
+	 * where D is the size of each output from turbo encoder block (k + 4).
+	 */
+	kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
+
+	if (unlikely((*mbuf_total_left == 0) || (*mbuf_total_left < kw))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, kw);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset, kw,
+			seg_total_left, next_triplet);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= kw;
+
+	next_triplet = acc100_dma_fill_blk_type_out(
+			desc, h_output, *h_out_offset,
+			k >> 3, next_triplet, ACC100_DMA_BLKID_OUT_HARD);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+
+	*h_out_length = ((k - crc24_overlap) >> 3);
+	op->turbo_dec.hard_output.length += *h_out_length;
+	*h_out_offset += *h_out_length;
+
+	/* Soft output */
+	if (check_bit(op->turbo_dec.op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT)) {
+		if (check_bit(op->turbo_dec.op_flags,
+				RTE_BBDEV_TURBO_EQUALIZER))
+			*s_out_length = e;
+		else
+			*s_out_length = (k * 3) + 12;
+
+		next_triplet = acc100_dma_fill_blk_type_out(desc, s_output,
+				*s_out_offset, *s_out_length, next_triplet,
+				ACC100_DMA_BLKID_OUT_SOFT);
+		if (unlikely(next_triplet < 0)) {
+			rte_bbdev_log(ERR,
+					"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+					op);
+			return -1;
+		}
+
+		op->turbo_dec.soft_output.length += *s_out_length;
+		*s_out_offset += *s_out_length;
+	}
+
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
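The circular buffer size computed above follows 3GPP 36.212 section 5.1.4.2: the encoder output per code block is D = K + 4 bits, the interleaver has 32 columns, so Kw = 3 * Kpi with Kpi = D rounded up to a multiple of 32. A small helper makes the formula checkable (`turbo_dec_kw` and `ALIGN_CEIL` are illustrative names):

```c
#include <assert.h>
#include <stdint.h>

/* Round x up to a multiple of a, as RTE_ALIGN_CEIL() does. */
#define ALIGN_CEIL(x, a) ((((x) + (a) - 1) / (a)) * (a))

/* Circular buffer size Kw for turbo decoding, mirroring
 * acc100_dma_desc_td_fill(): Kw = 3 * ceil32(K + 4).
 */
static uint32_t
turbo_dec_kw(uint16_t k)
{
	return ALIGN_CEIL((uint32_t)k + 4, 32) * 3;
}
```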
+
+static inline int
 acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
 		struct acc100_dma_req_desc *desc,
 		struct rte_mbuf **input, struct rte_mbuf *h_output,
@@ -1374,6 +1670,57 @@
 
 /* Enqueue one encode operations for ACC100 device in CB mode */
 static inline int
+enqueue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+		uint16_t total_enqueued_cbs)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+		seg_total_left;
+	struct rte_mbuf *input, *output_head, *output;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_te_fill(op, &desc->req.fcw_te);
+
+	input = op->turbo_enc.input.data;
+	output_head = output = op->turbo_enc.output.data;
+	in_offset = op->turbo_enc.input.offset;
+	out_offset = op->turbo_enc.output.offset;
+	out_length = 0;
+	mbuf_total_left = op->turbo_enc.input.length;
+	seg_total_left = rte_pktmbuf_data_len(op->turbo_enc.input.data)
+			- in_offset;
+
+	ret = acc100_dma_desc_te_fill(op, &desc->req, &input, output,
+			&in_offset, &out_offset, &out_length, &mbuf_total_left,
+			&seg_total_left, 0);
+
+	if (unlikely(ret < 0))
+		return ret;
+
+	mbuf_append(output_head, output, out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_te,
+			sizeof(desc->req.fcw_te) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+
+	/* Check if any data left after processing one CB */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left after processing one CB: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+/* Enqueue multiple LDPC encode operations for ACC100 device in CB mode */
+static inline int
 enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
 		uint16_t total_enqueued_cbs, int16_t num)
 {
@@ -1481,78 +1828,235 @@
 	return 1;
 }
 
-static inline int
-harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
-		uint16_t total_enqueued_cbs) {
-	struct acc100_fcw_ld *fcw;
-	union acc100_dma_desc *desc;
-	int next_triplet = 1;
-	struct rte_mbuf *hq_output_head, *hq_output;
-	uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
-	if (harq_in_length == 0) {
-		rte_bbdev_log(ERR, "Loopback of invalid null size\n");
-		return -EINVAL;
-	}
 
-	int h_comp = check_bit(op->ldpc_dec.op_flags,
-			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
-			) ? 1 : 0;
-	if (h_comp == 1)
-		harq_in_length = harq_in_length * 8 / 6;
-	harq_in_length = RTE_ALIGN(harq_in_length, 64);
-	uint16_t harq_dma_length_in = (h_comp == 0) ?
-			harq_in_length :
-			harq_in_length * 6 / 8;
-	uint16_t harq_dma_length_out = harq_dma_length_in;
-	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
-			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
-	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
-	uint16_t harq_index = (ddr_mem_in ?
-			op->ldpc_dec.harq_combined_input.offset :
-			op->ldpc_dec.harq_combined_output.offset)
-			/ ACC100_HARQ_OFFSET;
+/* Enqueue one encode operation for ACC100 device in TB mode. */
+static inline int
+enqueue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint8_t r, c;
+	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+		seg_total_left;
+	struct rte_mbuf *input, *output_head, *output;
+	uint16_t current_enqueued_cbs = 0;
 
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
-	fcw = &desc->req.fcw_ld;
-	/* Set the FCW from loopback into DDR */
-	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
-	fcw->FCWversion = ACC100_FCW_VER;
-	fcw->qm = 2;
-	fcw->Zc = 384;
-	if (harq_in_length < 16 * N_ZC_1)
-		fcw->Zc = 16;
-	fcw->ncb = fcw->Zc * N_ZC_1;
-	fcw->rm_e = 2;
-	fcw->hcin_en = 1;
-	fcw->hcout_en = 1;
+	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+	acc100_fcw_te_fill(op, &desc->req.fcw_te);
 
-	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d\n",
-			ddr_mem_in, harq_index,
-			harq_layout[harq_index].offset, harq_in_length,
-			harq_dma_length_in);
+	input = op->turbo_enc.input.data;
+	output_head = output = op->turbo_enc.output.data;
+	in_offset = op->turbo_enc.input.offset;
+	out_offset = op->turbo_enc.output.offset;
+	out_length = 0;
+	mbuf_total_left = op->turbo_enc.input.length;
 
-	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
-		fcw->hcin_size0 = harq_layout[harq_index].size0;
-		fcw->hcin_offset = harq_layout[harq_index].offset;
-		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
-		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
-		if (h_comp == 1)
-			harq_dma_length_in = harq_dma_length_in * 6 / 8;
-	} else {
-		fcw->hcin_size0 = harq_in_length;
-	}
-	harq_layout[harq_index].val = 0;
-	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
-			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
-	fcw->hcout_size0 = harq_in_length;
-	fcw->hcin_decomp_mode = h_comp;
-	fcw->hcout_comp_mode = h_comp;
-	fcw->gain_i = 1;
-	fcw->gain_h = 1;
+	c = op->turbo_enc.tb_params.c;
+	r = op->turbo_enc.tb_params.r;
 
-	/* Set the prefix of descriptor. This could be done at polling */
+	while (mbuf_total_left > 0 && r < c) {
+		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+		/* Set up DMA descriptor */
+		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+				& q->sw_ring_wrap_mask);
+		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+		desc->req.data_ptrs[0].blen = ACC100_FCW_TE_BLEN;
+
+		ret = acc100_dma_desc_te_fill(op, &desc->req, &input, output,
+				&in_offset, &out_offset, &out_length,
+				&mbuf_total_left, &seg_total_left, r);
+		if (unlikely(ret < 0))
+			return ret;
+		mbuf_append(output_head, output, out_length);
+
+		/* Set total number of CBs in TB */
+		desc->req.cbs_in_tb = cbs_in_tb;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "FCW", &desc->req.fcw_te,
+				sizeof(desc->req.fcw_te) - 8);
+		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+		if (seg_total_left == 0) {
+			/* Go to the next mbuf */
+			input = input->next;
+			in_offset = 0;
+			output = output->next;
+			out_offset = 0;
+		}
+
+		total_enqueued_cbs++;
+		current_enqueued_cbs++;
+		r++;
+	}
+
+	if (unlikely(desc == NULL))
+		return current_enqueued_cbs;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Check if any CBs left for processing */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left for processing: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+
+	/* Set SDone on last CB descriptor for TB mode. */
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	return current_enqueued_cbs;
+}
+
+/** Enqueue one decode operation for ACC100 device in CB mode */
+static inline int
+enqueue_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint32_t in_offset, h_out_offset, s_out_offset, s_out_length,
+		h_out_length, mbuf_total_left, seg_total_left;
+	struct rte_mbuf *input, *h_output_head, *h_output,
+		*s_output_head, *s_output;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_td_fill(op, &desc->req.fcw_td);
+
+	input = op->turbo_dec.input.data;
+	h_output_head = h_output = op->turbo_dec.hard_output.data;
+	s_output_head = s_output = op->turbo_dec.soft_output.data;
+	in_offset = op->turbo_dec.input.offset;
+	h_out_offset = op->turbo_dec.hard_output.offset;
+	s_out_offset = op->turbo_dec.soft_output.offset;
+	h_out_length = s_out_length = 0;
+	mbuf_total_left = op->turbo_dec.input.length;
+	seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(input == NULL)) {
+		rte_bbdev_log(ERR, "Invalid mbuf pointer");
+		return -EFAULT;
+	}
+#endif
+
+	/* Set up DMA descriptor */
+	desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+
+	ret = acc100_dma_desc_td_fill(op, &desc->req, &input, h_output,
+			s_output, &in_offset, &h_out_offset, &s_out_offset,
+			&h_out_length, &s_out_length, &mbuf_total_left,
+			&seg_total_left, 0);
+
+	if (unlikely(ret < 0))
+		return ret;
+
+	/* Hard output */
+	mbuf_append(h_output_head, h_output, h_out_length);
+
+	/* Soft output */
+	if (check_bit(op->turbo_dec.op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT))
+		mbuf_append(s_output_head, s_output, s_out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_td,
+			sizeof(desc->req.fcw_td) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+
+	/* Check if any CBs left for processing */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left after processing one CB: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+static inline int
+harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs) {
+	struct acc100_fcw_ld *fcw;
+	union acc100_dma_desc *desc;
+	int next_triplet = 1;
+	struct rte_mbuf *hq_output_head, *hq_output;
+	uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
+	if (harq_in_length == 0) {
+		rte_bbdev_log(ERR, "Loopback of invalid null size\n");
+		return -EINVAL;
+	}
+
+	int h_comp = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
+			) ? 1 : 0;
+	if (h_comp == 1)
+		harq_in_length = harq_in_length * 8 / 6;
+	harq_in_length = RTE_ALIGN(harq_in_length, 64);
+	uint16_t harq_dma_length_in = (h_comp == 0) ?
+			harq_in_length :
+			harq_in_length * 6 / 8;
+	uint16_t harq_dma_length_out = harq_dma_length_in;
+	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+	uint16_t harq_index = (ddr_mem_in ?
+			op->ldpc_dec.harq_combined_input.offset :
+			op->ldpc_dec.harq_combined_output.offset)
+			/ ACC100_HARQ_OFFSET;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	fcw = &desc->req.fcw_ld;
+	/* Set the FCW from loopback into DDR */
+	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
+	fcw->FCWversion = ACC100_FCW_VER;
+	fcw->qm = 2;
+	fcw->Zc = 384;
+	if (harq_in_length < 16 * N_ZC_1)
+		fcw->Zc = 16;
+	fcw->ncb = fcw->Zc * N_ZC_1;
+	fcw->rm_e = 2;
+	fcw->hcin_en = 1;
+	fcw->hcout_en = 1;
+
+	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d\n",
+			ddr_mem_in, harq_index,
+			harq_layout[harq_index].offset, harq_in_length,
+			harq_dma_length_in);
+
+	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
+		fcw->hcin_size0 = harq_layout[harq_index].size0;
+		fcw->hcin_offset = harq_layout[harq_index].offset;
+		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
+		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
+		if (h_comp == 1)
+			harq_dma_length_in = harq_dma_length_in * 6 / 8;
+	} else {
+		fcw->hcin_size0 = harq_in_length;
+	}
+	harq_layout[harq_index].val = 0;
+	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
+			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
+	fcw->hcout_size0 = harq_in_length;
+	fcw->hcin_decomp_mode = h_comp;
+	fcw->hcout_comp_mode = h_comp;
+	fcw->gain_i = 1;
+	fcw->gain_h = 1;
+
+	/* Set the prefix of descriptor. This could be done at polling */
 	desc->req.word0 = ACC100_DMA_DESC_TYPE;
 	desc->req.word1 = 0; /**< Timestamp could be disabled */
 	desc->req.word2 = 0;
@@ -1816,6 +2320,107 @@
 	return current_enqueued_cbs;
 }
 
+/* Enqueue one decode operation for ACC100 device in TB mode */
+static inline int
+enqueue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint8_t r, c;
+	uint32_t in_offset, h_out_offset, s_out_offset, s_out_length,
+		h_out_length, mbuf_total_left, seg_total_left;
+	struct rte_mbuf *input, *h_output_head, *h_output,
+		*s_output_head, *s_output;
+	uint16_t current_enqueued_cbs = 0;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+	acc100_fcw_td_fill(op, &desc->req.fcw_td);
+
+	input = op->turbo_dec.input.data;
+	h_output_head = h_output = op->turbo_dec.hard_output.data;
+	s_output_head = s_output = op->turbo_dec.soft_output.data;
+	in_offset = op->turbo_dec.input.offset;
+	h_out_offset = op->turbo_dec.hard_output.offset;
+	s_out_offset = op->turbo_dec.soft_output.offset;
+	h_out_length = s_out_length = 0;
+	mbuf_total_left = op->turbo_dec.input.length;
+	c = op->turbo_dec.tb_params.c;
+	r = op->turbo_dec.tb_params.r;
+
+	while (mbuf_total_left > 0 && r < c) {
+
+		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+
+		/* Set up DMA descriptor */
+		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+				& q->sw_ring_wrap_mask);
+		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+		desc->req.data_ptrs[0].blen = ACC100_FCW_TD_BLEN;
+		ret = acc100_dma_desc_td_fill(op, &desc->req, &input,
+				h_output, s_output, &in_offset, &h_out_offset,
+				&s_out_offset, &h_out_length, &s_out_length,
+				&mbuf_total_left, &seg_total_left, r);
+
+		if (unlikely(ret < 0))
+			return ret;
+
+		/* Hard output */
+		mbuf_append(h_output_head, h_output, h_out_length);
+
+		/* Soft output */
+		if (check_bit(op->turbo_dec.op_flags,
+				RTE_BBDEV_TURBO_SOFT_OUTPUT))
+			mbuf_append(s_output_head, s_output, s_out_length);
+
+		/* Set total number of CBs in TB */
+		desc->req.cbs_in_tb = cbs_in_tb;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "FCW", &desc->req.fcw_td,
+				sizeof(desc->req.fcw_td) - 8);
+		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+		if (seg_total_left == 0) {
+			/* Go to the next mbuf */
+			input = input->next;
+			in_offset = 0;
+			h_output = h_output->next;
+			h_out_offset = 0;
+
+			if (check_bit(op->turbo_dec.op_flags,
+					RTE_BBDEV_TURBO_SOFT_OUTPUT)) {
+				s_output = s_output->next;
+				s_out_offset = 0;
+			}
+		}
+
+		total_enqueued_cbs++;
+		current_enqueued_cbs++;
+		r++;
+	}
+
+	if (unlikely(desc == NULL))
+		return current_enqueued_cbs;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Check if any CBs left for processing */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left for processing: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+	/* Set SDone on last CB descriptor for TB mode */
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	return current_enqueued_cbs;
+}
 
 /* Calculates number of CBs in processed encoder TB based on 'r' and input
  * length.
@@ -1893,6 +2498,45 @@
 	return cbs_in_tb;
 }
 
+/* Enqueue encode operations for ACC100 device in CB mode. */
+static uint16_t
+acc100_enqueue_enc_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i;
+	union acc100_dma_desc *desc;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - 1 < 0))
+			break;
+		avail -= 1;
+
+		ret = enqueue_enc_one_op_cb(q, ops[i], i);
+		if (ret < 0)
+			break;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
+			& q->sw_ring_wrap_mask);
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, i, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+	return i;
+}
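The availability and indexing arithmetic used by this and the other enqueue paths assumes a power-of-two ring depth with free-running head/tail counters (hence `sw_ring_wrap_mask`). A hedged sketch with illustrative names, not the PMD's actual structures:

```c
#include <assert.h>
#include <stdint.h>

/* Minimal model of the software ring bookkeeping. */
struct sw_ring {
	int32_t depth;  /* ring size, assumed a power of two */
	int32_t head;   /* producer counter (free-running) */
	int32_t tail;   /* consumer counter (free-running) */
};

/* Free descriptors remaining: depth + tail - head stays correct even
 * as the counters advance past depth.
 */
static int32_t
ring_avail(const struct sw_ring *r)
{
	return r->depth + r->tail - r->head;
}

/* Physical slot of the n-th op enqueued in the current burst, wrapped
 * with the depth-1 mask, as done with sw_ring_wrap_mask above.
 */
static int32_t
ring_slot(const struct sw_ring *r, int32_t n)
{
	return (r->head + n) & (r->depth - 1);
}
```

With depth 16, head 14, and tail 2 this reports 4 free slots, and the 3rd op of a burst wraps to slot 1.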
+
 /* Check we can mux encode operations with common FCW */
 static inline bool
 check_mux(struct rte_bbdev_enc_op **ops, uint16_t num) {
@@ -1960,6 +2604,52 @@
 	return i;
 }
 
+/* Enqueue encode operations for ACC100 device in TB mode. */
+static uint16_t
+acc100_enqueue_enc_tb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i, enqueued_cbs = 0;
+	uint8_t cbs_in_tb;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		cbs_in_tb = get_num_cbs_in_tb_enc(&ops[i]->turbo_enc);
+		/* Check if there is space available for further processing */
+		if (unlikely(avail - cbs_in_tb < 0))
+			break;
+		avail -= cbs_in_tb;
+
+		ret = enqueue_enc_one_op_tb(q, ops[i], enqueued_cbs, cbs_in_tb);
+		if (ret < 0)
+			break;
+		enqueued_cbs += ret;
+	}
+
+	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
+}
+
+/* Enqueue encode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	if (unlikely(num == 0))
+		return 0;
+	if (ops[0]->turbo_enc.code_block_mode == 0)
+		return acc100_enqueue_enc_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_enc_cb(q_data, ops, num);
+}
+
 /* Enqueue encode operations for ACC100 device. */
 static uint16_t
 acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
@@ -1967,7 +2657,51 @@
 {
 	if (unlikely(num == 0))
 		return 0;
-	return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
+	if (ops[0]->ldpc_enc.code_block_mode == 0)
+		return acc100_enqueue_enc_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
+}
+
+
+/* Enqueue decode operations for ACC100 device in CB mode */
+static uint16_t
+acc100_enqueue_dec_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i;
+	union acc100_dma_desc *desc;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		/* Check if there is space available for further processing */
+		if (unlikely(avail - 1 < 0))
+			break;
+		avail -= 1;
+
+		ret = enqueue_dec_one_op_cb(q, ops[i], i);
+		if (ret < 0)
+			break;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
+			& q->sw_ring_wrap_mask);
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, i, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
 }
 
 /* Check we can mux encode operations with common FCW */
@@ -2065,6 +2799,53 @@
 	return i;
 }
 
+
+/* Enqueue decode operations for ACC100 device in TB mode */
+static uint16_t
+acc100_enqueue_dec_tb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i, enqueued_cbs = 0;
+	uint8_t cbs_in_tb;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		cbs_in_tb = get_num_cbs_in_tb_dec(&ops[i]->turbo_dec);
+		/* Check if there is space available for further processing */
+		if (unlikely(avail - cbs_in_tb < 0))
+			break;
+		avail -= cbs_in_tb;
+
+		ret = enqueue_dec_one_op_tb(q, ops[i], enqueued_cbs, cbs_in_tb);
+		if (ret < 0)
+			break;
+		enqueued_cbs += ret;
+	}
+
+	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
+}
+
+/* Enqueue decode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	if (unlikely(num == 0))
+		return 0;
+	if (ops[0]->turbo_dec.code_block_mode == 0)
+		return acc100_enqueue_dec_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_dec_cb(q_data, ops, num);
+}
+
 /* Enqueue decode operations for ACC100 device. */
 static uint16_t
 acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
@@ -2388,6 +3169,51 @@
 	return cb_idx;
 }
 
+/* Dequeue encode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint16_t dequeue_num;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t i;
+	uint16_t dequeued_cbs = 0;
+	struct rte_bbdev_enc_op *op;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = (avail < num) ? avail : num;
+
+	for (i = 0; i < dequeue_num; ++i) {
+		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask))->req.op_addr;
+		if (op->turbo_enc.code_block_mode == 0)
+			ret = dequeue_enc_one_op_tb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+		else
+			ret = dequeue_enc_one_op_cb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_cbs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += i;
+
+	return i;
+}
+
 /* Dequeue LDPC encode operations from ACC100 device. */
 static uint16_t
 acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
@@ -2426,6 +3252,52 @@
 	return dequeued_cbs;
 }
 
+
+/* Dequeue decode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint16_t dequeue_num;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t i;
+	uint16_t dequeued_cbs = 0;
+	struct rte_bbdev_dec_op *op;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = (avail < num) ? avail : num;
+
+	for (i = 0; i < dequeue_num; ++i) {
+		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask))->req.op_addr;
+		if (op->turbo_dec.code_block_mode == 0)
+			ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+		else
+			ret = dequeue_dec_one_op_cb(q_data, q, &ops[i],
+					dequeued_cbs, &aq_dequeued);
+
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_cbs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += i;
+
+	return i;
+}
+
 /* Dequeue decode operations from ACC100 device. */
 static uint16_t
 acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
@@ -2479,6 +3351,10 @@
 	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
 
 	dev->dev_ops = &acc100_bbdev_ops;
+	dev->enqueue_enc_ops = acc100_enqueue_enc;
+	dev->enqueue_dec_ops = acc100_enqueue_dec;
+	dev->dequeue_enc_ops = acc100_dequeue_enc;
+	dev->dequeue_dec_ops = acc100_dequeue_dec;
 	dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
 	dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
 	dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v4 08/11] baseband/acc100: add interrupt support to PMD
  2020-09-04 17:53   ` [dpdk-dev] [PATCH v4 00/11] bbdev PMD ACC100 Nicolas Chautru
                       ` (6 preceding siblings ...)
  2020-09-04 17:54     ` [dpdk-dev] [PATCH v4 07/11] baseband/acc100: add support for 4G processing Nicolas Chautru
@ 2020-09-04 17:54     ` Nicolas Chautru
  2020-09-21  1:45       ` Liu, Tianjiao
  2020-09-04 17:54     ` [dpdk-dev] [PATCH v4 09/11] baseband/acc100: add debug function to validate input Nicolas Chautru
                       ` (3 subsequent siblings)
  11 siblings, 1 reply; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-04 17:54 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add capability and functions to support MSI interrupts,
callbacks and the Info Ring.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 288 ++++++++++++++++++++++++++++++-
 drivers/baseband/acc100/rte_acc100_pmd.h |  15 ++
 2 files changed, 300 insertions(+), 3 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index bd07def..54b5917 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -339,6 +339,213 @@
 	free_base_addresses(base_addrs, i);
 }
 
+/*
+ * Find queue_id of a device queue based on details from the Info Ring.
+ * If a queue is not found, UINT16_MAX is returned.
+ */
+static inline uint16_t
+get_queue_id_from_ring_info(struct rte_bbdev_data *data,
+		const union acc100_info_ring_data ring_data)
+{
+	uint16_t queue_id;
+
+	for (queue_id = 0; queue_id < data->num_queues; ++queue_id) {
+		struct acc100_queue *acc100_q =
+				data->queues[queue_id].queue_private;
+		if (acc100_q != NULL && acc100_q->aq_id == ring_data.aq_id &&
+				acc100_q->qgrp_id == ring_data.qg_id &&
+				acc100_q->vf_id == ring_data.vf_id)
+			return queue_id;
+	}
+
+	return UINT16_MAX;
+}
+
+/* Checks PF Info Ring to find the interrupt cause and handles it accordingly */
+static inline void
+acc100_check_ir(struct acc100_device *acc100_dev)
+{
+	volatile union acc100_info_ring_data *ring_data;
+	uint16_t info_ring_head = acc100_dev->info_ring_head;
+	if (acc100_dev->info_ring == NULL)
+		return;
+
+	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
+			ACC100_INFO_RING_MASK);
+
+	while (ring_data->valid) {
+		if ((ring_data->int_nb < ACC100_PF_INT_DMA_DL_DESC_IRQ) || (
+				ring_data->int_nb >
+				ACC100_PF_INT_DMA_DL5G_DESC_IRQ))
+			rte_bbdev_log(WARNING, "InfoRing: ITR:%d Info:0x%x",
+				ring_data->int_nb, ring_data->detailed_info);
+		/* Initialize Info Ring entry and move forward */
+		ring_data->val = 0;
+		info_ring_head++;
+		ring_data = acc100_dev->info_ring +
+				(info_ring_head & ACC100_INFO_RING_MASK);
+	}
+}
+
+/* Checks PF Info Ring to find the interrupt cause and handles it accordingly */
+static inline void
+acc100_pf_interrupt_handler(struct rte_bbdev *dev)
+{
+	struct acc100_device *acc100_dev = dev->data->dev_private;
+	volatile union acc100_info_ring_data *ring_data;
+	struct acc100_deq_intr_details deq_intr_det;
+
+	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
+			ACC100_INFO_RING_MASK);
+
+	while (ring_data->valid) {
+
+		rte_bbdev_log_debug(
+				"ACC100 PF Interrupt received, Info Ring data: 0x%x",
+				ring_data->val);
+
+		switch (ring_data->int_nb) {
+		case ACC100_PF_INT_DMA_DL_DESC_IRQ:
+		case ACC100_PF_INT_DMA_UL_DESC_IRQ:
+		case ACC100_PF_INT_DMA_UL5G_DESC_IRQ:
+		case ACC100_PF_INT_DMA_DL5G_DESC_IRQ:
+			deq_intr_det.queue_id = get_queue_id_from_ring_info(
+					dev->data, *ring_data);
+			if (deq_intr_det.queue_id == UINT16_MAX) {
+				rte_bbdev_log(ERR,
+						"Couldn't find queue: aq_id: %u, qg_id: %u, vf_id: %u",
+						ring_data->aq_id,
+						ring_data->qg_id,
+						ring_data->vf_id);
+				return;
+			}
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_DEQUEUE, &deq_intr_det);
+			break;
+		default:
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_ERROR, NULL);
+			break;
+		}
+
+		/* Initialize Info Ring entry and move forward */
+		ring_data->val = 0;
+		++acc100_dev->info_ring_head;
+		ring_data = acc100_dev->info_ring +
+				(acc100_dev->info_ring_head &
+				ACC100_INFO_RING_MASK);
+	}
+}
+
+/* Checks VF Info Ring to find the interrupt cause and handles it accordingly */
+static inline void
+acc100_vf_interrupt_handler(struct rte_bbdev *dev)
+{
+	struct acc100_device *acc100_dev = dev->data->dev_private;
+	volatile union acc100_info_ring_data *ring_data;
+	struct acc100_deq_intr_details deq_intr_det;
+
+	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
+			ACC100_INFO_RING_MASK);
+
+	while (ring_data->valid) {
+
+		rte_bbdev_log_debug(
+				"ACC100 VF Interrupt received, Info Ring data: 0x%x",
+				ring_data->val);
+
+		switch (ring_data->int_nb) {
+		case ACC100_VF_INT_DMA_DL_DESC_IRQ:
+		case ACC100_VF_INT_DMA_UL_DESC_IRQ:
+		case ACC100_VF_INT_DMA_UL5G_DESC_IRQ:
+		case ACC100_VF_INT_DMA_DL5G_DESC_IRQ:
+			/* VFs are not aware of their vf_id - it's set to 0 in
+			 * queue structures.
+			 */
+			ring_data->vf_id = 0;
+			deq_intr_det.queue_id = get_queue_id_from_ring_info(
+					dev->data, *ring_data);
+			if (deq_intr_det.queue_id == UINT16_MAX) {
+				rte_bbdev_log(ERR,
+						"Couldn't find queue: aq_id: %u, qg_id: %u",
+						ring_data->aq_id,
+						ring_data->qg_id);
+				return;
+			}
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_DEQUEUE, &deq_intr_det);
+			break;
+		default:
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_ERROR, NULL);
+			break;
+		}
+
+		/* Initialize Info Ring entry and move forward */
+		ring_data->valid = 0;
+		++acc100_dev->info_ring_head;
+		ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head
+				& ACC100_INFO_RING_MASK);
+	}
+}
+
+/* Interrupt handler triggered by ACC100 dev for handling specific interrupt */
+static void
+acc100_dev_interrupt_handler(void *cb_arg)
+{
+	struct rte_bbdev *dev = cb_arg;
+	struct acc100_device *acc100_dev = dev->data->dev_private;
+
+	/* Read info ring */
+	if (acc100_dev->pf_device)
+		acc100_pf_interrupt_handler(dev);
+	else
+		acc100_vf_interrupt_handler(dev);
+}
+
+/* Allocate and set up the Info Ring */
+static int
+allocate_inforing(struct rte_bbdev *dev)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	const struct acc100_registry_addr *reg_addr;
+	rte_iova_t info_ring_phys;
+	uint32_t phys_low, phys_high;
+
+	if (d->info_ring != NULL)
+		return 0; /* Already configured */
+
+	/* Choose correct registry addresses for the device type */
+	if (d->pf_device)
+		reg_addr = &pf_reg_addr;
+	else
+		reg_addr = &vf_reg_addr;
+	/* Allocate InfoRing */
+	d->info_ring = rte_zmalloc_socket("Info Ring",
+			ACC100_INFO_RING_NUM_ENTRIES *
+			sizeof(*d->info_ring), RTE_CACHE_LINE_SIZE,
+			dev->data->socket_id);
+	if (d->info_ring == NULL) {
+		rte_bbdev_log(ERR,
+				"Failed to allocate Info Ring for %s:%u",
+				dev->device->driver->name,
+				dev->data->dev_id);
+		return -ENOMEM;
+	}
+	info_ring_phys = rte_malloc_virt2iova(d->info_ring);
+
+	/* Setup Info Ring */
+	phys_high = (uint32_t)(info_ring_phys >> 32);
+	phys_low  = (uint32_t)(info_ring_phys);
+	acc100_reg_write(d, reg_addr->info_ring_hi, phys_high);
+	acc100_reg_write(d, reg_addr->info_ring_lo, phys_low);
+	acc100_reg_write(d, reg_addr->info_ring_en, ACC100_REG_IRQ_EN_ALL);
+	d->info_ring_head = (acc100_reg_read(d, reg_addr->info_ring_ptr) &
+			0xFFF) / sizeof(union acc100_info_ring_data);
+	return 0;
+}
+
+
 /* Allocate 64MB memory used for all software rings */
 static int
 acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
@@ -426,6 +633,7 @@
 	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
 	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
 
+	allocate_inforing(dev);
 	d->harq_layout = rte_zmalloc_socket("HARQ Layout",
 			ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
 			RTE_CACHE_LINE_SIZE, dev->data->socket_id);
@@ -437,13 +645,53 @@
 	return 0;
 }
 
+static int
+acc100_intr_enable(struct rte_bbdev *dev)
+{
+	int ret;
+	struct acc100_device *d = dev->data->dev_private;
+
+	/* Only MSI interrupts are currently supported */
+	if (dev->intr_handle->type == RTE_INTR_HANDLE_VFIO_MSI ||
+			dev->intr_handle->type == RTE_INTR_HANDLE_UIO) {
+
+		allocate_inforing(dev);
+
+		ret = rte_intr_enable(dev->intr_handle);
+		if (ret < 0) {
+			rte_bbdev_log(ERR,
+					"Couldn't enable interrupts for device: %s",
+					dev->data->name);
+			rte_free(d->info_ring);
+			return ret;
+		}
+		ret = rte_intr_callback_register(dev->intr_handle,
+				acc100_dev_interrupt_handler, dev);
+		if (ret < 0) {
+			rte_bbdev_log(ERR,
+					"Couldn't register interrupt callback for device: %s",
+					dev->data->name);
+			rte_free(d->info_ring);
+			return ret;
+		}
+
+		return 0;
+	}
+
+	rte_bbdev_log(ERR, "ACC100 (%s) supports only VFIO MSI interrupts",
+			dev->data->name);
+	return -ENOTSUP;
+}
+
 /* Free 64MB memory used for software rings */
 static int
 acc100_dev_close(struct rte_bbdev *dev)
 {
 	struct acc100_device *d = dev->data->dev_private;
+	acc100_check_ir(d);
 	if (d->sw_rings_base != NULL) {
 		rte_free(d->tail_ptrs);
+		rte_free(d->info_ring);
 		rte_free(d->sw_rings_base);
 		d->sw_rings_base = NULL;
 	}
@@ -643,6 +891,7 @@
 					RTE_BBDEV_TURBO_CRC_TYPE_24B |
 					RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
 					RTE_BBDEV_TURBO_EARLY_TERMINATION |
+					RTE_BBDEV_TURBO_DEC_INTERRUPTS |
 					RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
 					RTE_BBDEV_TURBO_MAP_DEC |
 					RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP |
@@ -663,6 +912,7 @@
 					RTE_BBDEV_TURBO_CRC_24B_ATTACH |
 					RTE_BBDEV_TURBO_RV_INDEX_BYPASS |
 					RTE_BBDEV_TURBO_RATE_MATCH |
+					RTE_BBDEV_TURBO_ENC_INTERRUPTS |
 					RTE_BBDEV_TURBO_ENC_SCATTER_GATHER,
 				.num_buffers_src =
 						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
@@ -676,7 +926,8 @@
 				.capability_flags =
 					RTE_BBDEV_LDPC_RATE_MATCH |
 					RTE_BBDEV_LDPC_CRC_24B_ATTACH |
-					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
+					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS |
+					RTE_BBDEV_LDPC_ENC_INTERRUPTS,
 				.num_buffers_src =
 						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
 				.num_buffers_dst =
@@ -701,7 +952,8 @@
 				RTE_BBDEV_LDPC_DECODE_BYPASS |
 				RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
 				RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
-				RTE_BBDEV_LDPC_LLR_COMPRESSION,
+				RTE_BBDEV_LDPC_LLR_COMPRESSION |
+				RTE_BBDEV_LDPC_DEC_INTERRUPTS,
 			.llr_size = 8,
 			.llr_decimals = 1,
 			.num_buffers_src =
@@ -751,14 +1003,39 @@
 #else
 	dev_info->harq_buffer_size = 0;
 #endif
+	acc100_check_ir(d);
+}
+
+static int
+acc100_queue_intr_enable(struct rte_bbdev *dev, uint16_t queue_id)
+{
+	struct acc100_queue *q = dev->data->queues[queue_id].queue_private;
+
+	if (dev->intr_handle->type != RTE_INTR_HANDLE_VFIO_MSI &&
+			dev->intr_handle->type != RTE_INTR_HANDLE_UIO)
+		return -ENOTSUP;
+
+	q->irq_enable = 1;
+	return 0;
+}
+
+static int
+acc100_queue_intr_disable(struct rte_bbdev *dev, uint16_t queue_id)
+{
+	struct acc100_queue *q = dev->data->queues[queue_id].queue_private;
+	q->irq_enable = 0;
+	return 0;
 }
 
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.setup_queues = acc100_setup_queues,
+	.intr_enable = acc100_intr_enable,
 	.close = acc100_dev_close,
 	.info_get = acc100_dev_info_get,
 	.queue_setup = acc100_queue_setup,
 	.queue_release = acc100_queue_release,
+	.queue_intr_enable = acc100_queue_intr_enable,
+	.queue_intr_disable = acc100_queue_intr_disable
 };
 
 /* ACC100 PCI PF address map */
@@ -3018,8 +3295,10 @@
 			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
 	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
 	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
-	if (op->status != 0)
+	if (op->status != 0) {
 		q_data->queue_stats.dequeue_err_count++;
+		acc100_check_ir(q->d);
+	}
 
 	/* CRC invalid if error exists */
 	if (!op->status)
@@ -3076,6 +3355,9 @@
 		op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
 	op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
 
+	if (op->status & (1 << RTE_BBDEV_DRV_ERROR))
+		acc100_check_ir(q->d);
+
 	/* Check if this is the last desc in batch (Atomic Queue) */
 	if (desc->req.last_desc_in_batch) {
 		(*aq_dequeued)++;
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 78686c1..8980fa5 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -559,7 +559,14 @@ struct acc100_device {
 	/* Virtual address of the info memory routed to this function under
 	 * operation, whether it is PF or VF.
 	 */
+	union acc100_info_ring_data *info_ring;
+
 	union acc100_harq_layout_data *harq_layout;
+	/* Virtual Info Ring head */
+	uint16_t info_ring_head;
+	/* Number of bytes available for each queue in device, depending on
+	 * how many queues are enabled with configure()
+	 */
 	uint32_t sw_ring_size;
 	uint32_t ddr_size; /* Size in kB */
 	uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */
@@ -575,4 +582,12 @@ struct acc100_device {
 	bool configured; /**< True if this ACC100 device is configured */
 };
 
+/**
+ * Structure with details about RTE_BBDEV_EVENT_DEQUEUE event. It's passed to
+ * the callback function.
+ */
+struct acc100_deq_intr_details {
+	uint16_t queue_id;
+};
+
 #endif /* _RTE_ACC100_PMD_H_ */
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v4 09/11] baseband/acc100: add debug function to validate input
  2020-09-04 17:53   ` [dpdk-dev] [PATCH v4 00/11] bbdev PMD ACC100 Nicolas Chautru
                       ` (7 preceding siblings ...)
  2020-09-04 17:54     ` [dpdk-dev] [PATCH v4 08/11] baseband/acc100: add interrupt support to PMD Nicolas Chautru
@ 2020-09-04 17:54     ` Nicolas Chautru
  2020-09-21  1:46       ` Liu, Tianjiao
  2020-09-04 17:54     ` [dpdk-dev] [PATCH v4 10/11] baseband/acc100: add configure function Nicolas Chautru
                       ` (2 subsequent siblings)
  11 siblings, 1 reply; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-04 17:54 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add debug functions to validate the input parameters provided by the
user through the API. Only enabled in DEBUG mode at build time.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 424 +++++++++++++++++++++++++++++++
 1 file changed, 424 insertions(+)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 54b5917..e64d5e2 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -1945,6 +1945,231 @@
 
 }
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+/* Validates turbo encoder parameters */
+static inline int
+validate_enc_op(struct rte_bbdev_enc_op *op)
+{
+	struct rte_bbdev_op_turbo_enc *turbo_enc = &op->turbo_enc;
+	struct rte_bbdev_op_enc_turbo_cb_params *cb = NULL;
+	struct rte_bbdev_op_enc_turbo_tb_params *tb = NULL;
+	uint16_t kw, kw_neg, kw_pos;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if (turbo_enc->input.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid input pointer");
+		return -1;
+	}
+	if (turbo_enc->output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid output pointer");
+		return -1;
+	}
+	if (turbo_enc->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				turbo_enc->rv_index);
+		return -1;
+	}
+	if (turbo_enc->code_block_mode != 0 &&
+			turbo_enc->code_block_mode != 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				turbo_enc->code_block_mode);
+		return -1;
+	}
+
+	if (turbo_enc->code_block_mode == 0) {
+		tb = &turbo_enc->tb_params;
+		if ((tb->k_neg < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_neg > RTE_BBDEV_TURBO_MAX_CB_SIZE)
+				&& tb->c_neg > 0) {
+			rte_bbdev_log(ERR,
+					"k_neg (%u) is out of range %u <= value <= %u",
+					tb->k_neg, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (tb->k_pos < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_pos > RTE_BBDEV_TURBO_MAX_CB_SIZE) {
+			rte_bbdev_log(ERR,
+					"k_pos (%u) is out of range %u <= value <= %u",
+					tb->k_pos, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (tb->c_neg > (RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1))
+			rte_bbdev_log(ERR,
+					"c_neg (%u) is out of range 0 <= value <= %u",
+					tb->c_neg,
+					RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1);
+		if (tb->c < 1 || tb->c > RTE_BBDEV_TURBO_MAX_CODE_BLOCKS) {
+			rte_bbdev_log(ERR,
+					"c (%u) is out of range 1 <= value <= %u",
+					tb->c, RTE_BBDEV_TURBO_MAX_CODE_BLOCKS);
+			return -1;
+		}
+		if (tb->cab > tb->c) {
+			rte_bbdev_log(ERR,
+					"cab (%u) is greater than c (%u)",
+					tb->cab, tb->c);
+			return -1;
+		}
+		if ((tb->ea < RTE_BBDEV_TURBO_MIN_CB_SIZE || (tb->ea % 2))
+				&& tb->r < tb->cab) {
+			rte_bbdev_log(ERR,
+					"ea (%u) is less than %u or it is not even",
+					tb->ea, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+		if ((tb->eb < RTE_BBDEV_TURBO_MIN_CB_SIZE || (tb->eb % 2))
+				&& tb->c > tb->cab) {
+			rte_bbdev_log(ERR,
+					"eb (%u) is less than %u or it is not even",
+					tb->eb, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+
+		kw_neg = 3 * RTE_ALIGN_CEIL(tb->k_neg + 4,
+					RTE_BBDEV_TURBO_C_SUBBLOCK);
+		if (tb->ncb_neg < tb->k_neg || tb->ncb_neg > kw_neg) {
+			rte_bbdev_log(ERR,
+					"ncb_neg (%u) is out of range (%u) k_neg <= value <= (%u) kw_neg",
+					tb->ncb_neg, tb->k_neg, kw_neg);
+			return -1;
+		}
+
+		kw_pos = 3 * RTE_ALIGN_CEIL(tb->k_pos + 4,
+					RTE_BBDEV_TURBO_C_SUBBLOCK);
+		if (tb->ncb_pos < tb->k_pos || tb->ncb_pos > kw_pos) {
+			rte_bbdev_log(ERR,
+					"ncb_pos (%u) is out of range (%u) k_pos <= value <= (%u) kw_pos",
+					tb->ncb_pos, tb->k_pos, kw_pos);
+			return -1;
+		}
+		if (tb->r > (tb->c - 1)) {
+			rte_bbdev_log(ERR,
+					"r (%u) is greater than c - 1 (%u)",
+					tb->r, tb->c - 1);
+			return -1;
+		}
+	} else {
+		cb = &turbo_enc->cb_params;
+		if (cb->k < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| cb->k > RTE_BBDEV_TURBO_MAX_CB_SIZE) {
+			rte_bbdev_log(ERR,
+					"k (%u) is out of range %u <= value <= %u",
+					cb->k, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+
+		if (cb->e < RTE_BBDEV_TURBO_MIN_CB_SIZE || (cb->e % 2)) {
+			rte_bbdev_log(ERR,
+					"e (%u) is less than %u or it is not even",
+					cb->e, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+
+		kw = RTE_ALIGN_CEIL(cb->k + 4, RTE_BBDEV_TURBO_C_SUBBLOCK) * 3;
+		if (cb->ncb < cb->k || cb->ncb > kw) {
+			rte_bbdev_log(ERR,
+					"ncb (%u) is out of range (%u) k <= value <= (%u) kw",
+					cb->ncb, cb->k, kw);
+			return -1;
+		}
+	}
+
+	return 0;
+}
+/* Validates LDPC encoder parameters */
+static inline int
+validate_ldpc_enc_op(struct rte_bbdev_enc_op *op)
+{
+	struct rte_bbdev_op_ldpc_enc *ldpc_enc = &op->ldpc_enc;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if (ldpc_enc->input.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid input pointer");
+		return -1;
+	}
+	if (ldpc_enc->output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid output pointer");
+		return -1;
+	}
+	if (ldpc_enc->input.length >
+			RTE_BBDEV_LDPC_MAX_CB_SIZE >> 3) {
+		rte_bbdev_log(ERR, "CB size (%u) is too big, max: %d",
+				ldpc_enc->input.length,
+				RTE_BBDEV_LDPC_MAX_CB_SIZE);
+		return -1;
+	}
+	if ((ldpc_enc->basegraph > 2) || (ldpc_enc->basegraph == 0)) {
+		rte_bbdev_log(ERR,
+				"BG (%u) is out of range 1 <= value <= 2",
+				ldpc_enc->basegraph);
+		return -1;
+	}
+	if (ldpc_enc->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				ldpc_enc->rv_index);
+		return -1;
+	}
+	if (ldpc_enc->code_block_mode > 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				ldpc_enc->code_block_mode);
+		return -1;
+	}
+
+	return 0;
+}
+
+/* Validates LDPC decoder parameters */
+static inline int
+validate_ldpc_dec_op(struct rte_bbdev_dec_op *op)
+{
+	struct rte_bbdev_op_ldpc_dec *ldpc_dec = &op->ldpc_dec;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if ((ldpc_dec->basegraph > 2) || (ldpc_dec->basegraph == 0)) {
+		rte_bbdev_log(ERR,
+				"BG (%u) is out of range 1 <= value <= 2",
+				ldpc_dec->basegraph);
+		return -1;
+	}
+	if (ldpc_dec->iter_max == 0) {
+		rte_bbdev_log(ERR,
+				"iter_max (%u) is equal to 0",
+				ldpc_dec->iter_max);
+		return -1;
+	}
+	if (ldpc_dec->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				ldpc_dec->rv_index);
+		return -1;
+	}
+	if (ldpc_dec->code_block_mode > 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				ldpc_dec->code_block_mode);
+		return -1;
+	}
+
+	return 0;
+}
+#endif
+
 /* Enqueue one encode operations for ACC100 device in CB mode */
 static inline int
 enqueue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
@@ -1956,6 +2181,14 @@
 		seg_total_left;
 	struct rte_mbuf *input, *output_head, *output;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_enc_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2008,6 +2241,14 @@
 	uint16_t  in_length_in_bytes;
 	struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_enc_op(ops[0]) == -1) {
+		rte_bbdev_log(ERR, "LDPC encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2065,6 +2306,14 @@
 		seg_total_left;
 	struct rte_mbuf *input, *output_head, *output;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_enc_op(op) == -1) {
+		rte_bbdev_log(ERR, "LDPC encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2119,6 +2368,14 @@
 	struct rte_mbuf *input, *output_head, *output;
 	uint16_t current_enqueued_cbs = 0;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_enc_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2191,6 +2448,142 @@
 	return current_enqueued_cbs;
 }
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+/* Validates turbo decoder parameters */
+static inline int
+validate_dec_op(struct rte_bbdev_dec_op *op)
+{
+	struct rte_bbdev_op_turbo_dec *turbo_dec = &op->turbo_dec;
+	struct rte_bbdev_op_dec_turbo_cb_params *cb = NULL;
+	struct rte_bbdev_op_dec_turbo_tb_params *tb = NULL;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if (turbo_dec->input.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid input pointer");
+		return -1;
+	}
+	if (turbo_dec->hard_output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid hard_output pointer");
+		return -1;
+	}
+	if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT) &&
+			turbo_dec->soft_output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid soft_output pointer");
+		return -1;
+	}
+	if (turbo_dec->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				turbo_dec->rv_index);
+		return -1;
+	}
+	if (turbo_dec->iter_min < 1) {
+		rte_bbdev_log(ERR,
+				"iter_min (%u) is less than 1",
+				turbo_dec->iter_min);
+		return -1;
+	}
+	if (turbo_dec->iter_max <= 2) {
+		rte_bbdev_log(ERR,
+				"iter_max (%u) is less than or equal to 2",
+				turbo_dec->iter_max);
+		return -1;
+	}
+	if (turbo_dec->iter_min > turbo_dec->iter_max) {
+		rte_bbdev_log(ERR,
+				"iter_min (%u) is greater than iter_max (%u)",
+				turbo_dec->iter_min, turbo_dec->iter_max);
+		return -1;
+	}
+	if (turbo_dec->code_block_mode != 0 &&
+			turbo_dec->code_block_mode != 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				turbo_dec->code_block_mode);
+		return -1;
+	}
+
+	if (turbo_dec->code_block_mode == 0) {
+		tb = &turbo_dec->tb_params;
+		if ((tb->k_neg < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_neg > RTE_BBDEV_TURBO_MAX_CB_SIZE)
+				&& tb->c_neg > 0) {
+			rte_bbdev_log(ERR,
+					"k_neg (%u) is out of range %u <= value <= %u",
+					tb->k_neg, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if ((tb->k_pos < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_pos > RTE_BBDEV_TURBO_MAX_CB_SIZE)
+				&& tb->c > tb->c_neg) {
+			rte_bbdev_log(ERR,
+					"k_pos (%u) is out of range %u <= value <= %u",
+					tb->k_pos, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (tb->c_neg > (RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1)) {
+			rte_bbdev_log(ERR,
+					"c_neg (%u) is out of range 0 <= value <= %u",
+					tb->c_neg,
+					RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1);
+			return -1;
+		}
+		if (tb->c < 1 || tb->c > RTE_BBDEV_TURBO_MAX_CODE_BLOCKS) {
+			rte_bbdev_log(ERR,
+					"c (%u) is out of range 1 <= value <= %u",
+					tb->c, RTE_BBDEV_TURBO_MAX_CODE_BLOCKS);
+			return -1;
+		}
+		if (tb->cab > tb->c) {
+			rte_bbdev_log(ERR,
+					"cab (%u) is greater than c (%u)",
+					tb->cab, tb->c);
+			return -1;
+		}
+		if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) &&
+				(tb->ea < RTE_BBDEV_TURBO_MIN_CB_SIZE
+						|| (tb->ea % 2))
+				&& tb->cab > 0) {
+			rte_bbdev_log(ERR,
+					"ea (%u) is less than %u or it is not even",
+					tb->ea, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+		if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) &&
+				(tb->eb < RTE_BBDEV_TURBO_MIN_CB_SIZE
+						|| (tb->eb % 2))
+				&& tb->c > tb->cab) {
+			rte_bbdev_log(ERR,
+					"eb (%u) is less than %u or it is not even",
+					tb->eb, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+	} else {
+		cb = &turbo_dec->cb_params;
+		if (cb->k < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| cb->k > RTE_BBDEV_TURBO_MAX_CB_SIZE) {
+			rte_bbdev_log(ERR,
+					"k (%u) is out of range %u <= value <= %u",
+					cb->k, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) &&
+				(cb->e < RTE_BBDEV_TURBO_MIN_CB_SIZE ||
+				(cb->e % 2))) {
+			rte_bbdev_log(ERR,
+					"e (%u) is less than %u or it is not even",
+					cb->e, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+	}
+
+	return 0;
+}
+#endif
+
 /** Enqueue one decode operations for ACC100 device in CB mode */
 static inline int
 enqueue_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
@@ -2203,6 +2596,14 @@
 	struct rte_mbuf *input, *h_output_head, *h_output,
 		*s_output_head, *s_output;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo decoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2426,6 +2827,13 @@
 		return ret;
 	}
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "LDPC decoder validation failed");
+		return -EINVAL;
+	}
+#endif
 	union acc100_dma_desc *desc;
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
@@ -2521,6 +2929,14 @@
 	struct rte_mbuf *input, *h_output_head, *h_output;
 	uint16_t current_enqueued_cbs = 0;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "LDPC decoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2611,6 +3027,14 @@
 		*s_output_head, *s_output;
 	uint16_t current_enqueued_cbs = 0;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo decoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v4 10/11] baseband/acc100: add configure function
  2020-09-04 17:53   ` [dpdk-dev] [PATCH v4 00/11] bbdev PMD ACC100 Nicolas Chautru
                       ` (8 preceding siblings ...)
  2020-09-04 17:54     ` [dpdk-dev] [PATCH v4 09/11] baseband/acc100: add debug function to validate input Nicolas Chautru
@ 2020-09-04 17:54     ` Nicolas Chautru
  2020-09-21  1:48       ` Liu, Tianjiao
  2020-09-04 17:54     ` [dpdk-dev] [PATCH v4 11/11] doc: update bbdev feature table Nicolas Chautru
  2020-09-21 14:36     ` [dpdk-dev] [PATCH v4 00/11] bbdev PMD ACC100 Chautru, Nicolas
  11 siblings, 1 reply; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-04 17:54 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add configure function to configure the PF from within
the bbdev-test itself, without requiring an external
application to configure the device.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 app/test-bbdev/test_bbdev_perf.c                   |  72 +++
 drivers/baseband/acc100/Makefile                   |   3 +
 drivers/baseband/acc100/meson.build                |   2 +
 drivers/baseband/acc100/rte_acc100_cfg.h           |  17 +
 drivers/baseband/acc100/rte_acc100_pmd.c           | 505 +++++++++++++++++++++
 .../acc100/rte_pmd_bbdev_acc100_version.map        |   7 +
 6 files changed, 606 insertions(+)

diff --git a/app/test-bbdev/test_bbdev_perf.c b/app/test-bbdev/test_bbdev_perf.c
index 45c0d62..32f23ff 100644
--- a/app/test-bbdev/test_bbdev_perf.c
+++ b/app/test-bbdev/test_bbdev_perf.c
@@ -52,6 +52,18 @@
 #define FLR_5G_TIMEOUT 610
 #endif
 
+#ifdef RTE_LIBRTE_PMD_BBDEV_ACC100
+#include <rte_acc100_cfg.h>
+#define ACC100PF_DRIVER_NAME   ("intel_acc100_pf")
+#define ACC100VF_DRIVER_NAME   ("intel_acc100_vf")
+#define ACC100_QMGR_NUM_AQS 16
+#define ACC100_QMGR_NUM_QGS 2
+#define ACC100_QMGR_AQ_DEPTH 5
+#define ACC100_QMGR_INVALID_IDX -1
+#define ACC100_QMGR_RR 1
+#define ACC100_QOS_GBR 0
+#endif
+
 #define OPS_CACHE_SIZE 256U
 #define OPS_POOL_SIZE_MIN 511U /* 0.5K per queue */
 
@@ -653,6 +665,66 @@ typedef int (test_case_function)(struct active_device *ad,
 				info->dev_name);
 	}
 #endif
+#ifdef RTE_LIBRTE_PMD_BBDEV_ACC100
+	if ((get_init_device() == true) &&
+		(!strcmp(info->drv.driver_name, ACC100PF_DRIVER_NAME))) {
+		struct acc100_conf conf;
+		unsigned int i;
+
+		printf("Configure ACC100 FEC Driver %s with default values\n",
+				info->drv.driver_name);
+
+		/* clear default configuration before initialization */
+		memset(&conf, 0, sizeof(struct acc100_conf));
+
+		/* Always set in PF mode for built-in configuration */
+		conf.pf_mode_en = true;
+		for (i = 0; i < RTE_ACC100_NUM_VFS; ++i) {
+			conf.arb_dl_4g[i].gbr_threshold1 = ACC100_QOS_GBR;
+			conf.arb_dl_4g[i].gbr_threshold2 = ACC100_QOS_GBR;
+			conf.arb_dl_4g[i].round_robin_weight = ACC100_QMGR_RR;
+			conf.arb_ul_4g[i].gbr_threshold1 = ACC100_QOS_GBR;
+			conf.arb_ul_4g[i].gbr_threshold2 = ACC100_QOS_GBR;
+			conf.arb_ul_4g[i].round_robin_weight = ACC100_QMGR_RR;
+			conf.arb_dl_5g[i].gbr_threshold1 = ACC100_QOS_GBR;
+			conf.arb_dl_5g[i].gbr_threshold2 = ACC100_QOS_GBR;
+			conf.arb_dl_5g[i].round_robin_weight = ACC100_QMGR_RR;
+			conf.arb_ul_5g[i].gbr_threshold1 = ACC100_QOS_GBR;
+			conf.arb_ul_5g[i].gbr_threshold2 = ACC100_QOS_GBR;
+			conf.arb_ul_5g[i].round_robin_weight = ACC100_QMGR_RR;
+		}
+
+		conf.input_pos_llr_1_bit = true;
+		conf.output_pos_llr_1_bit = true;
+		conf.num_vf_bundles = 1; /**< Number of VF bundles to setup */
+
+		conf.q_ul_4g.num_qgroups = ACC100_QMGR_NUM_QGS;
+		conf.q_ul_4g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+		conf.q_ul_4g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+		conf.q_ul_4g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+		conf.q_dl_4g.num_qgroups = ACC100_QMGR_NUM_QGS;
+		conf.q_dl_4g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+		conf.q_dl_4g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+		conf.q_dl_4g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+		conf.q_ul_5g.num_qgroups = ACC100_QMGR_NUM_QGS;
+		conf.q_ul_5g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+		conf.q_ul_5g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+		conf.q_ul_5g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+		conf.q_dl_5g.num_qgroups = ACC100_QMGR_NUM_QGS;
+		conf.q_dl_5g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+		conf.q_dl_5g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+		conf.q_dl_5g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+
+		/* setup PF with configuration information */
+		ret = acc100_configure(info->dev_name, &conf);
+		TEST_ASSERT_SUCCESS(ret,
+				"Failed to configure ACC100 PF for bbdev %s",
+				info->dev_name);
+		/* Refresh device info now that the PF is configured */
+	}
+	rte_bbdev_info_get(dev_id, info);
+#endif
+
 	nb_queues = RTE_MIN(rte_lcore_count(), info->drv.max_num_queues);
 	nb_queues = RTE_MIN(nb_queues, (unsigned int) MAX_QUEUES);
 
diff --git a/drivers/baseband/acc100/Makefile b/drivers/baseband/acc100/Makefile
index c79e487..37e73af 100644
--- a/drivers/baseband/acc100/Makefile
+++ b/drivers/baseband/acc100/Makefile
@@ -22,4 +22,7 @@ LIBABIVER := 1
 # library source files
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100) += rte_acc100_pmd.c
 
+# export include files
+SYMLINK-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100)-include += rte_acc100_cfg.h
+
 include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/baseband/acc100/meson.build b/drivers/baseband/acc100/meson.build
index 8afafc2..7ac44dc 100644
--- a/drivers/baseband/acc100/meson.build
+++ b/drivers/baseband/acc100/meson.build
@@ -4,3 +4,5 @@
 deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci']
 
 sources = files('rte_acc100_pmd.c')
+
+install_headers('rte_acc100_cfg.h')
diff --git a/drivers/baseband/acc100/rte_acc100_cfg.h b/drivers/baseband/acc100/rte_acc100_cfg.h
index 73bbe36..7f523bc 100644
--- a/drivers/baseband/acc100/rte_acc100_cfg.h
+++ b/drivers/baseband/acc100/rte_acc100_cfg.h
@@ -89,6 +89,23 @@ struct acc100_conf {
 	struct rte_arbitration_t arb_dl_5g[RTE_ACC100_NUM_VFS];
 };
 
+/**
+ * Configure an ACC100 device
+ *
+ * @param dev_name
+ *   The name of the device. This is the short form of PCI BDF, e.g. 00:01.0.
+ *   It can also be retrieved for a bbdev device from the dev_name field in the
+ *   rte_bbdev_info structure returned by rte_bbdev_info_get().
+ * @param conf
+ *   Configuration to apply to ACC100 HW.
+ *
+ * @return
+ *   Zero on success, negative value on failure.
+ */
+__rte_experimental
+int
+acc100_configure(const char *dev_name, struct acc100_conf *conf);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index e64d5e2..f039ed3 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -85,6 +85,26 @@
 
 enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC};
 
+/* Return the accelerator enum for a Queue Group Index */
+static inline int
+accFromQgid(int qg_idx, const struct acc100_conf *acc100_conf)
+{
+	int accQg[ACC100_NUM_QGRPS];
+	int NumQGroupsPerFn[NUM_ACC];
+	int acc, qgIdx, qgIndex = 0;
+	for (qgIdx = 0; qgIdx < ACC100_NUM_QGRPS; qgIdx++)
+		accQg[qgIdx] = 0;
+	NumQGroupsPerFn[UL_4G] = acc100_conf->q_ul_4g.num_qgroups;
+	NumQGroupsPerFn[UL_5G] = acc100_conf->q_ul_5g.num_qgroups;
+	NumQGroupsPerFn[DL_4G] = acc100_conf->q_dl_4g.num_qgroups;
+	NumQGroupsPerFn[DL_5G] = acc100_conf->q_dl_5g.num_qgroups;
+	for (acc = UL_4G;  acc < NUM_ACC; acc++)
+		for (qgIdx = 0; qgIdx < NumQGroupsPerFn[acc]; qgIdx++)
+			accQg[qgIndex++] = acc;
+	acc = accQg[qg_idx];
+	return acc;
+}
+
 /* Return the queue topology for a Queue Group Index */
 static inline void
 qtopFromAcc(struct rte_q_topology_t **qtop, int acc_enum,
@@ -113,6 +133,30 @@
 	*qtop = p_qtop;
 }
 
+/* Return the AQ depth for a Queue Group Index */
+static inline int
+aqDepth(int qg_idx, struct acc100_conf *acc100_conf)
+{
+	struct rte_q_topology_t *q_top = NULL;
+	int acc_enum = accFromQgid(qg_idx, acc100_conf);
+	qtopFromAcc(&q_top, acc_enum, acc100_conf);
+	if (unlikely(q_top == NULL))
+		return 0;
+	return q_top->aq_depth_log2;
+}
+
+/* Return the AQ depth for a Queue Group Index */
+static inline int
+aqNum(int qg_idx, struct acc100_conf *acc100_conf)
+{
+	struct rte_q_topology_t *q_top = NULL;
+	int acc_enum = accFromQgid(qg_idx, acc100_conf);
+	qtopFromAcc(&q_top, acc_enum, acc100_conf);
+	if (unlikely(q_top == NULL))
+		return 0;
+	return q_top->num_aqs_per_groups;
+}
+
 static void
 initQTop(struct acc100_conf *acc100_conf)
 {
@@ -4177,3 +4221,464 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
 RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
 RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
 RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
+
+/*
+ * Implementation to fix the power on status of some 5GUL engines
+ * This requires DMA permission if ported outside DPDK
+ */
+static void
+poweron_cleanup(struct rte_bbdev *bbdev, struct acc100_device *d,
+		struct acc100_conf *conf)
+{
+	int i, template_idx, qg_idx;
+	uint32_t address, status, payload;
+	printf("Need to clear power-on 5GUL status in internal memory\n");
+	/* Reset LDPC Cores */
+	for (i = 0; i < ACC100_ENGINES_MAX; i++)
+		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+				ACC100_ENGINE_OFFSET * i, ACC100_RESET_HI);
+	usleep(LONG_WAIT);
+	for (i = 0; i < ACC100_ENGINES_MAX; i++)
+		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+				ACC100_ENGINE_OFFSET * i, ACC100_RESET_LO);
+	usleep(LONG_WAIT);
+	/* Prepare dummy workload */
+	alloc_2x64mb_sw_rings_mem(bbdev, d, 0);
+	/* Set base addresses */
+	uint32_t phys_high = (uint32_t)(d->sw_rings_phys >> 32);
+	uint32_t phys_low  = (uint32_t)(d->sw_rings_phys &
+			~(ACC100_SIZE_64MBYTE-1));
+	acc100_reg_write(d, HWPfDmaFec5GulDescBaseHiRegVf, phys_high);
+	acc100_reg_write(d, HWPfDmaFec5GulDescBaseLoRegVf, phys_low);
+
+	/* Descriptor for dummy 5GUL code block processing */
+	union acc100_dma_desc *desc = NULL;
+	desc = d->sw_rings;
+	desc->req.data_ptrs[0].address = d->sw_rings_phys +
+			ACC100_DESC_FCW_OFFSET;
+	desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
+	desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
+	desc->req.data_ptrs[0].last = 0;
+	desc->req.data_ptrs[0].dma_ext = 0;
+	desc->req.data_ptrs[1].address = d->sw_rings_phys + 512;
+	desc->req.data_ptrs[1].blkid = ACC100_DMA_BLKID_IN;
+	desc->req.data_ptrs[1].last = 1;
+	desc->req.data_ptrs[1].dma_ext = 0;
+	desc->req.data_ptrs[1].blen = 44;
+	desc->req.data_ptrs[2].address = d->sw_rings_phys + 1024;
+	desc->req.data_ptrs[2].blkid = ACC100_DMA_BLKID_OUT_ENC;
+	desc->req.data_ptrs[2].last = 1;
+	desc->req.data_ptrs[2].dma_ext = 0;
+	desc->req.data_ptrs[2].blen = 5;
+	/* Dummy FCW */
+	desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
+	desc->req.fcw_ld.qm = 1;
+	desc->req.fcw_ld.nfiller = 30;
+	desc->req.fcw_ld.BG = 2 - 1;
+	desc->req.fcw_ld.Zc = 7;
+	desc->req.fcw_ld.ncb = 350;
+	desc->req.fcw_ld.rm_e = 4;
+	desc->req.fcw_ld.itmax = 10;
+	desc->req.fcw_ld.gain_i = 1;
+	desc->req.fcw_ld.gain_h = 1;
+
+	int engines_to_restart[SIG_UL_5G_LAST + 1] = {0};
+	int num_failed_engine = 0;
+	/* Detect engines in undefined state */
+	for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
+			template_idx++) {
+		/* Check engine power-on status */
+		address = HwPfFecUl5gIbDebugReg +
+				ACC100_ENGINE_OFFSET * template_idx;
+		status = (acc100_reg_read(d, address) >> 4) & 0xF;
+		if (status == 0) {
+			engines_to_restart[num_failed_engine] = template_idx;
+			num_failed_engine++;
+		}
+	}
+
+	int numQqsAcc = conf->q_ul_5g.num_qgroups;
+	int numQgs = conf->q_ul_5g.num_qgroups;
+	payload = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		payload |= (1 << qg_idx);
+	/* Force each engine which is in unspecified state */
+	for (i = 0; i < num_failed_engine; i++) {
+		int failed_engine = engines_to_restart[i];
+		printf("Force engine %d\n", failed_engine);
+		for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
+				template_idx++) {
+			address = HWPfQmgrGrpTmplateReg4Indx
+					+ BYTES_IN_WORD * template_idx;
+			if (template_idx == failed_engine)
+				acc100_reg_write(d, address, payload);
+			else
+				acc100_reg_write(d, address, 0);
+		}
+		/* Reset descriptor header */
+		desc->req.word0 = ACC100_DMA_DESC_TYPE;
+		desc->req.word1 = 0;
+		desc->req.word2 = 0;
+		desc->req.word3 = 0;
+		desc->req.numCBs = 1;
+		desc->req.m2dlen = 2;
+		desc->req.d2mlen = 1;
+		/* Enqueue the code block for processing */
+		union acc100_enqueue_reg_fmt enq_req;
+		enq_req.val = 0;
+		enq_req.addr_offset = ACC100_DESC_OFFSET;
+		enq_req.num_elem = 1;
+		enq_req.req_elem_addr = 0;
+		rte_wmb();
+		acc100_reg_write(d, HWPfQmgrIngressAq + 0x100, enq_req.val);
+		usleep(LONG_WAIT * 100);
+		if (desc->req.word0 != 2)
+			printf("DMA Response %#"PRIx32"\n", desc->req.word0);
+	}
+
+	/* Reset LDPC Cores */
+	for (i = 0; i < ACC100_ENGINES_MAX; i++)
+		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+				ACC100_ENGINE_OFFSET * i, ACC100_RESET_HI);
+	usleep(LONG_WAIT);
+	for (i = 0; i < ACC100_ENGINES_MAX; i++)
+		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+				ACC100_ENGINE_OFFSET * i, ACC100_RESET_LO);
+	usleep(LONG_WAIT);
+	acc100_reg_write(d, HWPfHi5GHardResetReg, ACC100_RESET_HARD);
+	usleep(LONG_WAIT);
+	int numEngines = 0;
+	/* Check engine power-on status again */
+	for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
+			template_idx++) {
+		address = HwPfFecUl5gIbDebugReg +
+				ACC100_ENGINE_OFFSET * template_idx;
+		status = (acc100_reg_read(d, address) >> 4) & 0xF;
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD * template_idx;
+		if (status == 1) {
+			acc100_reg_write(d, address, payload);
+			numEngines++;
+		} else
+			acc100_reg_write(d, address, 0);
+	}
+	printf("Number of 5GUL engines %d\n", numEngines);
+
+	if (d->sw_rings_base != NULL)
+		rte_free(d->sw_rings_base);
+	usleep(LONG_WAIT);
+}
+
+/* Initial configuration of an ACC100 device prior to running configure() */
+int
+acc100_configure(const char *dev_name, struct acc100_conf *conf)
+{
+	rte_bbdev_log(INFO, "acc100_configure");
+	uint32_t payload, address, status;
+	int qg_idx, template_idx, vf_idx, acc, i;
+	struct rte_bbdev *bbdev = rte_bbdev_get_named_dev(dev_name);
+
+	/* Compile time checks */
+	RTE_BUILD_BUG_ON(sizeof(struct acc100_dma_req_desc) != 256);
+	RTE_BUILD_BUG_ON(sizeof(union acc100_dma_desc) != 256);
+	RTE_BUILD_BUG_ON(sizeof(struct acc100_fcw_td) != 24);
+	RTE_BUILD_BUG_ON(sizeof(struct acc100_fcw_te) != 32);
+
+	if (bbdev == NULL) {
+		rte_bbdev_log(ERR,
+		"Invalid dev_name (%s), or device is not yet initialised",
+		dev_name);
+		return -ENODEV;
+	}
+	struct acc100_device *d = bbdev->data->dev_private;
+
+	/* Store configuration */
+	rte_memcpy(&d->acc100_conf, conf, sizeof(d->acc100_conf));
+
+	/* PCIe Bridge configuration */
+	acc100_reg_write(d, HwPfPcieGpexBridgeControl, ACC100_CFG_PCI_BRIDGE);
+	for (i = 1; i < 17; i++)
+		acc100_reg_write(d,
+				HwPfPcieGpexAxiAddrMappingWindowPexBaseHigh
+				+ i * 16, 0);
+
+	/* PCIe Link Training and Status State Machine */
+	acc100_reg_write(d, HwPfPcieGpexLtssmStateCntrl, 0xDFC00000);
+
+	/* Prevent blocking AXI read on BRESP for AXI Write */
+	address = HwPfPcieGpexAxiPioControl;
+	payload = ACC100_CFG_PCI_AXI;
+	acc100_reg_write(d, address, payload);
+
+	/* 5GDL PLL phase shift */
+	acc100_reg_write(d, HWPfChaDl5gPllPhshft0, 0x1);
+
+	/* Explicitly releasing AXI as this may be stopped after PF FLR/BME */
+	address = HWPfDmaAxiControl;
+	payload = 1;
+	acc100_reg_write(d, address, payload);
+
+	/* DDR Configuration */
+	address = HWPfDdrBcTim6;
+	payload = acc100_reg_read(d, address);
+	payload &= 0xFFFFFFFB; /* Bit 2 */
+#ifdef ACC100_DDR_ECC_ENABLE
+	payload |= 0x4;
+#endif
+	acc100_reg_write(d, address, payload);
+	address = HWPfDdrPhyDqsCountNum;
+#ifdef ACC100_DDR_ECC_ENABLE
+	payload = 9;
+#else
+	payload = 8;
+#endif
+	acc100_reg_write(d, address, payload);
+
+	/* Set default descriptor signature */
+	address = HWPfDmaDescriptorSignatuture;
+	payload = 0;
+	acc100_reg_write(d, address, payload);
+
+	/* Enable the Error Detection in DMA */
+	payload = ACC100_CFG_DMA_ERROR;
+	address = HWPfDmaErrorDetectionEn;
+	acc100_reg_write(d, address, payload);
+
+	/* AXI Cache configuration */
+	payload = ACC100_CFG_AXI_CACHE;
+	address = HWPfDmaAxcacheReg;
+	acc100_reg_write(d, address, payload);
+
+	/* Default DMA Configuration (Qmgr Enabled) */
+	address = HWPfDmaConfig0Reg;
+	payload = 0;
+	acc100_reg_write(d, address, payload);
+	address = HWPfDmaQmanen;
+	payload = 0;
+	acc100_reg_write(d, address, payload);
+
+	/* Default RLIM/ALEN configuration */
+	address = HWPfDmaConfig1Reg;
+	payload = (1 << 31) + (23 << 8) + (1 << 6) + 7;
+	acc100_reg_write(d, address, payload);
+
+	/* Configure DMA Qmanager addresses */
+	address = HWPfDmaQmgrAddrReg;
+	payload = HWPfQmgrEgressQueuesTemplate;
+	acc100_reg_write(d, address, payload);
+
+	/* ===== Qmgr Configuration ===== */
+	/* Configuration of the AQueue Depth QMGR_GRP_0_DEPTH_LOG2 for UL */
+	int totalQgs = conf->q_ul_4g.num_qgroups +
+			conf->q_ul_5g.num_qgroups +
+			conf->q_dl_4g.num_qgroups +
+			conf->q_dl_5g.num_qgroups;
+	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
+		address = HWPfQmgrDepthLog2Grp +
+		BYTES_IN_WORD * qg_idx;
+		payload = aqDepth(qg_idx, conf);
+		acc100_reg_write(d, address, payload);
+		address = HWPfQmgrTholdGrp +
+		BYTES_IN_WORD * qg_idx;
+		payload = (1 << 16) + (1 << (aqDepth(qg_idx, conf) - 1));
+		acc100_reg_write(d, address, payload);
+	}
+
+	/* Template Priority in incremental order */
+	for (template_idx = 0; template_idx < ACC100_NUM_TMPL;
+			template_idx++) {
+		address = HWPfQmgrGrpTmplateReg0Indx +
+		BYTES_IN_WORD * (template_idx % 8);
+		payload = TMPL_PRI_0;
+		acc100_reg_write(d, address, payload);
+		address = HWPfQmgrGrpTmplateReg1Indx +
+		BYTES_IN_WORD * (template_idx % 8);
+		payload = TMPL_PRI_1;
+		acc100_reg_write(d, address, payload);
+		address = HWPfQmgrGrpTmplateReg2indx +
+		BYTES_IN_WORD * (template_idx % 8);
+		payload = TMPL_PRI_2;
+		acc100_reg_write(d, address, payload);
+		address = HWPfQmgrGrpTmplateReg3Indx +
+		BYTES_IN_WORD * (template_idx % 8);
+		payload = TMPL_PRI_3;
+		acc100_reg_write(d, address, payload);
+	}
+
+	address = HWPfQmgrGrpPriority;
+	payload = ACC100_CFG_QMGR_HI_P;
+	acc100_reg_write(d, address, payload);
+
+	/* Template Configuration */
+	for (template_idx = 0; template_idx < ACC100_NUM_TMPL; template_idx++) {
+		payload = 0;
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD * template_idx;
+		acc100_reg_write(d, address, payload);
+	}
+	/* 4GUL */
+	int numQgs = conf->q_ul_4g.num_qgroups;
+	int numQqsAcc = 0;
+	payload = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		payload |= (1 << qg_idx);
+	for (template_idx = SIG_UL_4G; template_idx <= SIG_UL_4G_LAST;
+			template_idx++) {
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD*template_idx;
+		acc100_reg_write(d, address, payload);
+	}
+	/* 5GUL */
+	numQqsAcc += numQgs;
+	numQgs	= conf->q_ul_5g.num_qgroups;
+	payload = 0;
+	int numEngines = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		payload |= (1 << qg_idx);
+	for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
+			template_idx++) {
+		/* Check engine power-on status */
+		address = HwPfFecUl5gIbDebugReg +
+				ACC100_ENGINE_OFFSET * template_idx;
+		status = (acc100_reg_read(d, address) >> 4) & 0xF;
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD * template_idx;
+		if (status == 1) {
+			acc100_reg_write(d, address, payload);
+			numEngines++;
+		} else
+			acc100_reg_write(d, address, 0);
+		#if RTE_ACC100_SINGLE_FEC == 1
+		payload = 0;
+		#endif
+	}
+	printf("Number of 5GUL engines %d\n", numEngines);
+	/* 4GDL */
+	numQqsAcc += numQgs;
+	numQgs	= conf->q_dl_4g.num_qgroups;
+	payload = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		payload |= (1 << qg_idx);
+	for (template_idx = SIG_DL_4G; template_idx <= SIG_DL_4G_LAST;
+			template_idx++) {
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD*template_idx;
+		acc100_reg_write(d, address, payload);
+		#if RTE_ACC100_SINGLE_FEC == 1
+			payload = 0;
+		#endif
+	}
+	/* 5GDL */
+	numQqsAcc += numQgs;
+	numQgs	= conf->q_dl_5g.num_qgroups;
+	payload = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		payload |= (1 << qg_idx);
+	for (template_idx = SIG_DL_5G; template_idx <= SIG_DL_5G_LAST;
+			template_idx++) {
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD*template_idx;
+		acc100_reg_write(d, address, payload);
+		#if RTE_ACC100_SINGLE_FEC == 1
+		payload = 0;
+		#endif
+	}
+
+	/* Queue Group Function mapping */
+	int qman_func_id[5] = {0, 2, 1, 3, 4};
+	address = HWPfQmgrGrpFunction0;
+	payload = 0;
+	for (qg_idx = 0; qg_idx < 8; qg_idx++) {
+		acc = accFromQgid(qg_idx, conf);
+		payload |= qman_func_id[acc]<<(qg_idx * 4);
+	}
+	acc100_reg_write(d, address, payload);
+
+	/* Configuration of the Arbitration QGroup depth to 1 */
+	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
+		address = HWPfQmgrArbQDepthGrp +
+		BYTES_IN_WORD * qg_idx;
+		payload = 0;
+		acc100_reg_write(d, address, payload);
+	}
+
+	/* Enable AQueues through the queue hierarchy */
+	for (vf_idx = 0; vf_idx < ACC100_NUM_VFS; vf_idx++) {
+		for (qg_idx = 0; qg_idx < ACC100_NUM_QGRPS; qg_idx++) {
+			payload = 0;
+			if (vf_idx < conf->num_vf_bundles &&
+					qg_idx < totalQgs)
+				payload = (1 << aqNum(qg_idx, conf)) - 1;
+			address = HWPfQmgrAqEnableVf
+					+ vf_idx * BYTES_IN_WORD;
+			payload += (qg_idx << 16);
+			acc100_reg_write(d, address, payload);
+		}
+	}
+
+	/* This pointer to ARAM (256kB) is shifted by 2 (4B per register) */
+	uint32_t aram_address = 0;
+	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
+		for (vf_idx = 0; vf_idx < conf->num_vf_bundles; vf_idx++) {
+			address = HWPfQmgrVfBaseAddr + vf_idx
+					* BYTES_IN_WORD + qg_idx
+					* BYTES_IN_WORD * 64;
+			payload = aram_address;
+			acc100_reg_write(d, address, payload);
+			/* Offset ARAM Address for next memory bank
+			 * - increment of 4B
+			 */
+			aram_address += aqNum(qg_idx, conf) *
+					(1 << aqDepth(qg_idx, conf));
+		}
+	}
+
+	if (aram_address > WORDS_IN_ARAM_SIZE) {
+		rte_bbdev_log(ERR, "ARAM Configuration not fitting %d %d\n",
+				aram_address, WORDS_IN_ARAM_SIZE);
+		return -EINVAL;
+	}
+
+	/* ==== HI Configuration ==== */
+
+	/* Prevent Block on Transmit Error */
+	address = HWPfHiBlockTransmitOnErrorEn;
+	payload = 0;
+	acc100_reg_write(d, address, payload);
+	/* Prevent MSI from being dropped */
+	address = HWPfHiMsiDropEnableReg;
+	payload = 0;
+	acc100_reg_write(d, address, payload);
+	/* Set the PF Mode register */
+	address = HWPfHiPfMode;
+	payload = (conf->pf_mode_en) ? 2 : 0;
+	acc100_reg_write(d, address, payload);
+	/* Enable Error Detection in HW */
+	address = HWPfDmaErrorDetectionEn;
+	payload = 0x3D7;
+	acc100_reg_write(d, address, payload);
+
+	/* QoS overflow init */
+	payload = 1;
+	address = HWPfQosmonAEvalOverflow0;
+	acc100_reg_write(d, address, payload);
+	address = HWPfQosmonBEvalOverflow0;
+	acc100_reg_write(d, address, payload);
+
+	/* HARQ DDR Configuration */
+	unsigned int ddrSizeInMb = 512; /* Fixed to 512 MB per VF for now */
+	for (vf_idx = 0; vf_idx < conf->num_vf_bundles; vf_idx++) {
+		address = HWPfDmaVfDdrBaseRw + vf_idx
+				* 0x10;
+		payload = ((vf_idx * (ddrSizeInMb / 64)) << 16) +
+				(ddrSizeInMb - 1);
+		acc100_reg_write(d, address, payload);
+	}
+	usleep(LONG_WAIT);
+
+	if (numEngines < (SIG_UL_5G_LAST + 1))
+		poweron_cleanup(bbdev, d, conf);
+
+	rte_bbdev_log_debug("PF Tip configuration complete for %s", dev_name);
+	return 0;
+}
diff --git a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
index 4a76d1d..91c234d 100644
--- a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
+++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
@@ -1,3 +1,10 @@
 DPDK_21 {
 	local: *;
 };
+
+EXPERIMENTAL {
+	global:
+
+	acc100_configure;
+
+};
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v4 11/11] doc: update bbdev feature table
  2020-09-04 17:53   ` [dpdk-dev] [PATCH v4 00/11] bbdev PMD ACC100 Nicolas Chautru
                       ` (9 preceding siblings ...)
  2020-09-04 17:54     ` [dpdk-dev] [PATCH v4 10/11] baseband/acc100: add configure function Nicolas Chautru
@ 2020-09-04 17:54     ` Nicolas Chautru
  2020-09-21  1:50       ` Liu, Tianjiao
  2020-09-21 14:36     ` [dpdk-dev] [PATCH v4 00/11] bbdev PMD ACC100 Chautru, Nicolas
  11 siblings, 1 reply; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-04 17:54 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Correct the overview matrix to use the acc100 driver name

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 doc/guides/bbdevs/features/acc100.ini | 14 ++++++++++++++
 doc/guides/bbdevs/features/mbc.ini    | 14 --------------
 2 files changed, 14 insertions(+), 14 deletions(-)
 create mode 100644 doc/guides/bbdevs/features/acc100.ini
 delete mode 100644 doc/guides/bbdevs/features/mbc.ini

diff --git a/doc/guides/bbdevs/features/acc100.ini b/doc/guides/bbdevs/features/acc100.ini
new file mode 100644
index 0000000..642cd48
--- /dev/null
+++ b/doc/guides/bbdevs/features/acc100.ini
@@ -0,0 +1,14 @@
+;
+; Supported features of the 'acc100' bbdev driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+Turbo Decoder (4G)     = Y
+Turbo Encoder (4G)     = Y
+LDPC Decoder (5G)      = Y
+LDPC Encoder (5G)      = Y
+LLR/HARQ Compression   = Y
+External DDR Access    = Y
+HW Accelerated         = Y
+BBDEV API              = Y
diff --git a/doc/guides/bbdevs/features/mbc.ini b/doc/guides/bbdevs/features/mbc.ini
deleted file mode 100644
index 78a7b95..0000000
--- a/doc/guides/bbdevs/features/mbc.ini
+++ /dev/null
@@ -1,14 +0,0 @@
-;
-; Supported features of the 'mbc' bbdev driver.
-;
-; Refer to default.ini for the full list of available PMD features.
-;
-[Features]
-Turbo Decoder (4G)     = Y
-Turbo Encoder (4G)     = Y
-LDPC Decoder (5G)      = Y
-LDPC Encoder (5G)      = Y
-LLR/HARQ Compression   = Y
-External DDR Access    = Y
-HW Accelerated         = Y
-BBDEV API              = Y
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v4 01/11] drivers/baseband: add PMD for ACC100
  2020-09-04 17:53     ` [dpdk-dev] [PATCH v4 01/11] drivers/baseband: add PMD for ACC100 Nicolas Chautru
@ 2020-09-08  3:10       ` Liu, Tianjiao
  0 siblings, 0 replies; 213+ messages in thread
From: Liu, Tianjiao @ 2020-09-08  3:10 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, akhil.goyal

Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>

-----Original Message-----
From: Chautru, Nicolas <nicolas.chautru@intel.com> 
Sent: Saturday, September 5, 2020 1:54 AM
To: dev@dpdk.org; akhil.goyal@nxp.com
Cc: Richardson, Bruce <bruce.richardson@intel.com>; Xu, Rosen <rosen.xu@intel.com>; dave.burley@accelercomm.com; aidan.goddard@accelercomm.com; Yigit, Ferruh <ferruh.yigit@intel.com>; Liu, Tianjiao <tianjiao.liu@intel.com>; Chautru, Nicolas <nicolas.chautru@intel.com>
Subject: [PATCH v4 01/11] drivers/baseband: add PMD for ACC100

Add stubs for the ACC100 PMD

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>

^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions
  2020-09-03 20:45           ` Chautru, Nicolas
@ 2020-09-15  1:45             ` Chautru, Nicolas
  2020-09-15 10:21             ` Ananyev, Konstantin
  1 sibling, 0 replies; 213+ messages in thread
From: Chautru, Nicolas @ 2020-09-15  1:45 UTC (permalink / raw)
  To: Ananyev, Konstantin, Xu, Rosen, dev; +Cc: Richardson, Bruce, akhil.goyal

Hi Konstantin, Rosen, 
Replying to my own email. 
Can you confirm that the previous explanation below makes sense and can you ack this patch?

Thanks and regards, 
Nic

> From: Chautru, Nicolas
> 
> > From: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> >
> >
> >
> > > -----Original Message-----
> > > From: dev <dev-bounces@dpdk.org> On Behalf Of Xu, Rosen
> > > Sent: Thursday, September 3, 2020 3:34 AM
> > > To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org;
> > > akhil.goyal@nxp.com
> > > Cc: Richardson, Bruce <bruce.richardson@intel.com>
> > > Subject: Re: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC
> > > processing functions
> > >
> > > Hi,
> > >
> > > > -----Original Message-----
> > > > From: Chautru, Nicolas <nicolas.chautru@intel.com>
> > > > Sent: Sunday, August 30, 2020 2:01
> > > > To: Xu, Rosen <rosen.xu@intel.com>; dev@dpdk.org;
> > > > akhil.goyal@nxp.com
> > > > Cc: Richardson, Bruce <bruce.richardson@intel.com>
> > > > Subject: RE: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC
> > > > processing functions
> > > >
> > > > Hi Rosen,
> > > >
> > > > > From: Xu, Rosen <rosen.xu@intel.com>
> > > > >
> > > > > Hi,
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: dev <dev-bounces@dpdk.org> On Behalf Of Nicolas Chautru
> > > > > > Sent: Wednesday, August 19, 2020 8:25
> > > > > > To: dev@dpdk.org; akhil.goyal@nxp.com
> > > > > > Cc: Richardson, Bruce <bruce.richardson@intel.com>; Chautru,
> > > > > > Nicolas <nicolas.chautru@intel.com>
> > > > > > Subject: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC
> > > > > > processing functions
> > > > > >
> > > > > > Adding LDPC decode and encode processing operations
> > > > > >
> > > > > > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > > > > > ---
> > > > > >  drivers/baseband/acc100/rte_acc100_pmd.c | 1625
> > > > > > +++++++++++++++++++++++++++++-
> > > > > >  drivers/baseband/acc100/rte_acc100_pmd.h |    3 +
> > > > > >  2 files changed, 1626 insertions(+), 2 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> > > > > > b/drivers/baseband/acc100/rte_acc100_pmd.c
> > > > > > index 7a21c57..5f32813 100644
> > > > > > --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> > > > > > +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> > > > > > @@ -15,6 +15,9 @@
> > > > > >  #include <rte_hexdump.h>
> > > > > >  #include <rte_pci.h>
> > > > > >  #include <rte_bus_pci.h>
> > > > > > +#ifdef RTE_BBDEV_OFFLOAD_COST
> > > > > > +#include <rte_cycles.h>
> > > > > > +#endif
> > > > > >
> > > > > >  #include <rte_bbdev.h>
> > > > > >  #include <rte_bbdev_pmd.h>
> > > > > > @@ -449,7 +452,6 @@
> > > > > >  	return 0;
> > > > > >  }
> > > > > >
> > > > > > -
> > > > > >  /**
> > > > > >   * Report an ACC100 queue index which is free
> > > > > >   * Return 0 to 16k for a valid queue_idx or -1 when no queue
> > > > > > is available @@ -634,6 +636,46 @@
> > > > > >  	struct acc100_device *d = dev->data->dev_private;
> > > > > >
> > > > > >  	static const struct rte_bbdev_op_cap bbdev_capabilities[] =
> > > > > > {
> > > > > > +		{
> > > > > > +			.type   = RTE_BBDEV_OP_LDPC_ENC,
> > > > > > +			.cap.ldpc_enc = {
> > > > > > +				.capability_flags =
> > > > > > +					RTE_BBDEV_LDPC_RATE_MATCH |
> > > > > > +					RTE_BBDEV_LDPC_CRC_24B_ATTACH |
> > > > > > +					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
> > > > > > +				.num_buffers_src =
> > > > > > +					RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > > > > > +				.num_buffers_dst =
> > > > > > +					RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > > > > > +			}
> > > > > > +		},
> > > > > > +		{
> > > > > > +			.type   = RTE_BBDEV_OP_LDPC_DEC,
> > > > > > +			.cap.ldpc_dec = {
> > > > > > +			.capability_flags =
> > > > > > +				RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK |
> > > > > > +				RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP |
> > > > > > +				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
> > > > > > +				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
> > > > > > +#ifdef ACC100_EXT_MEM
> > > > > > +				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
> > > > > > +				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
> > > > > > +#endif
> > > > > > +				RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE |
> > > > > > +				RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS |
> > > > > > +				RTE_BBDEV_LDPC_DECODE_BYPASS |
> > > > > > +				RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
> > > > > > +				RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
> > > > > > +				RTE_BBDEV_LDPC_LLR_COMPRESSION,
> > > > > > +			.llr_size = 8,
> > > > > > +			.llr_decimals = 1,
> > > > > > +			.num_buffers_src =
> > > > > > +				RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > > > > > +			.num_buffers_hard_out =
> > > > > > +				RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > > > > > +			.num_buffers_soft_out = 0,
> > > > > > +			}
> > > > > > +		},
> > > > > >  		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
> > > > > >  	};
> > > > > >
> > > > > > @@ -669,9 +711,14 @@
> > > > > >  	dev_info->cpu_flag_reqs = NULL;
> > > > > >  	dev_info->min_alignment = 64;
> > > > > >  	dev_info->capabilities = bbdev_capabilities;
> > > > > > +#ifdef ACC100_EXT_MEM
> > > > > >  	dev_info->harq_buffer_size = d->ddr_size;
> > > > > > +#else
> > > > > > +	dev_info->harq_buffer_size = 0;
> > > > > > +#endif
> > > > > >  }
> > > > > >
> > > > > > +
> > > > > >  static const struct rte_bbdev_ops acc100_bbdev_ops = {
> > > > > >  	.setup_queues = acc100_setup_queues,
> > > > > >  	.close = acc100_dev_close,
> > > > > > @@ -696,6 +743,1577 @@
> > > > > >  	{.device_id = 0},
> > > > > >  };
> > > > > >
> > > > > > +/* Read flag value 0/1 from bitmap */
> > > > > > +static inline bool
> > > > > > +check_bit(uint32_t bitmap, uint32_t bitmask)
> > > > > > +{
> > > > > > +	return bitmap & bitmask;
> > > > > > +}
> > > > > > +
> > > > > > +static inline char *
> > > > > > +mbuf_append(struct rte_mbuf *m_head, struct rte_mbuf *m, uint16_t len)
> > > > > > +{
> > > > > > +	if (unlikely(len > rte_pktmbuf_tailroom(m)))
> > > > > > +		return NULL;
> > > > > > +
> > > > > > +	char *tail = (char *)m->buf_addr + m->data_off + m->data_len;
> > > > > > +	m->data_len = (uint16_t)(m->data_len + len);
> > > > > > +	m_head->pkt_len  = (m_head->pkt_len + len);
> > > > > > +	return tail;
> > > > > > +}
> > > > >
> > > > > Is it reasonable to direct add data_len of rte_mbuf?
> > > > >
> > > >
> > > > Do you suggest to add directly without checking there is enough
> > > > room in the mbuf? We cannot rely on the application providing mbuf
> > > > with enough tailroom.
> > >
> > > What I mentioned is this changes about mbuf should move to librte_mbuf.
> > > And it's better to align Olivier Matz.
> >
> > There is already rte_pktmbuf_append() inside rte_mbuf.h.
> > Wouldn't it suit?
> >
> 
> Hi Ananyev, Rosen,
> I agree that this can be confusing at first glance, notably compared to packet
> processing.
> Note first that this same syntax is already used in all bbdev PMDs
> when manipulating outbound mbufs in the context of baseband signal
> processing (not really a packet as for a NIC or other devices).
> Nothing new in this PMD, as it follows existing logic already present in DPDK
> bbdev PMDs.
> 
> This function basically differs from the typical rte_pktmbuf_append() in that it
> does not append data to the last mbuf but is used to sequentially update
> data in any mbuf in the middle of the list from preallocated data, hence it
> takes 2 arguments for both the head and the current mbuf segment in the list.
> There may be a more elegant way to do this down the line, notably once there is
> a proposal to handle large mbufs gracefully (another use case we have to handle
> in a slightly custom way). But I believe that is orthogonal to this PMD series,
> which keeps relying on existing logic.
> 
> 
> 
> 
> > >
> > > > In case you ask about the 2 mbufs, this is because this function
> > > > is also used to support segmented memory made of multiple mbuf
> > > > segments.
> > > > Note that this function is also used in other existing bbdev PMDs.
> > > > In case you believe there is a better way to do this, we can
> > > > certainly discuss and change these in several PMDs through another series.
> > > >
> > > > Thanks for all the reviews and useful comments.
> > > > Nic

^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v4 02/11] baseband/acc100: add register definition file
  2020-09-04 17:53     ` [dpdk-dev] [PATCH v4 02/11] baseband/acc100: add register definition file Nicolas Chautru
@ 2020-09-15  2:31       ` Xu, Rosen
  2020-09-18  2:39       ` Liu, Tianjiao
  1 sibling, 0 replies; 213+ messages in thread
From: Xu, Rosen @ 2020-09-15  2:31 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, akhil.goyal
  Cc: Richardson, Bruce, dave.burley, aidan.goddard, Yigit, Ferruh,
	Liu, Tianjiao

Hi,

> -----Original Message-----
> From: Chautru, Nicolas <nicolas.chautru@intel.com>
> Sent: Saturday, September 05, 2020 1:54
> To: dev@dpdk.org; akhil.goyal@nxp.com
> Cc: Richardson, Bruce <bruce.richardson@intel.com>; Xu, Rosen
> <rosen.xu@intel.com>; dave.burley@accelercomm.com;
> aidan.goddard@accelercomm.com; Yigit, Ferruh <ferruh.yigit@intel.com>;
> Liu, Tianjiao <tianjiao.liu@intel.com>; Chautru, Nicolas
> <nicolas.chautru@intel.com>
> Subject: [PATCH v4 02/11] baseband/acc100: add register definition file
> 
> Add the list of registers for the device and related
> HW spec definitions.
> 
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
>  drivers/baseband/acc100/acc100_pf_enum.h | 1068
> ++++++++++++++++++++++++++++++
>  drivers/baseband/acc100/acc100_vf_enum.h |   73 ++
>  drivers/baseband/acc100/rte_acc100_pmd.h |  490 ++++++++++++++
>  3 files changed, 1631 insertions(+)
>  create mode 100644 drivers/baseband/acc100/acc100_pf_enum.h
>  create mode 100644 drivers/baseband/acc100/acc100_vf_enum.h
> 
> diff --git a/drivers/baseband/acc100/acc100_pf_enum.h
> b/drivers/baseband/acc100/acc100_pf_enum.h
> new file mode 100644
> index 0000000..a1ee416
> --- /dev/null
> +++ b/drivers/baseband/acc100/acc100_pf_enum.h
> @@ -0,0 +1,1068 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2017 Intel Corporation
> + */
> +
> +#ifndef ACC100_PF_ENUM_H
> +#define ACC100_PF_ENUM_H
> +
> +/*
> + * ACC100 Register mapping on PF BAR0
> + * This is automatically generated from RDL, format may change with new
> RDL
> + * Release.
> + * Variable names are as is
> + */
> +enum {
> +	HWPfQmgrEgressQueuesTemplate          =  0x0007FE00,
> +	HWPfQmgrIngressAq                     =  0x00080000,
> +	HWPfQmgrArbQAvail                     =  0x00A00010,
> +	HWPfQmgrArbQBlock                     =  0x00A00014,
> +	HWPfQmgrAqueueDropNotifEn             =  0x00A00024,
> +	HWPfQmgrAqueueDisableNotifEn          =  0x00A00028,
> +	HWPfQmgrSoftReset                     =  0x00A00038,
> +	HWPfQmgrInitStatus                    =  0x00A0003C,
> +	HWPfQmgrAramWatchdogCount             =  0x00A00040,
> +	HWPfQmgrAramWatchdogCounterEn         =  0x00A00044,
> +	HWPfQmgrAxiWatchdogCount              =  0x00A00048,
> +	HWPfQmgrAxiWatchdogCounterEn          =  0x00A0004C,
> +	HWPfQmgrProcessWatchdogCount          =  0x00A00050,
> +	HWPfQmgrProcessWatchdogCounterEn      =  0x00A00054,
> +	HWPfQmgrProcessUl4GWatchdogCounter    =  0x00A00058,
> +	HWPfQmgrProcessDl4GWatchdogCounter    =  0x00A0005C,
> +	HWPfQmgrProcessUl5GWatchdogCounter    =  0x00A00060,
> +	HWPfQmgrProcessDl5GWatchdogCounter    =  0x00A00064,
> +	HWPfQmgrProcessMldWatchdogCounter     =  0x00A00068,
> +	HWPfQmgrMsiOverflowUpperVf            =  0x00A00070,
> +	HWPfQmgrMsiOverflowLowerVf            =  0x00A00074,
> +	HWPfQmgrMsiWatchdogOverflow           =  0x00A00078,
> +	HWPfQmgrMsiOverflowEnable             =  0x00A0007C,
> +	HWPfQmgrDebugAqPointerMemGrp          =  0x00A00100,
> +	HWPfQmgrDebugOutputArbQFifoGrp        =  0x00A00140,
> +	HWPfQmgrDebugMsiFifoGrp               =  0x00A00180,
> +	HWPfQmgrDebugAxiWdTimeoutMsiFifo      =  0x00A001C0,
> +	HWPfQmgrDebugProcessWdTimeoutMsiFifo  =  0x00A001C4,
> +	HWPfQmgrDepthLog2Grp                  =  0x00A00200,
> +	HWPfQmgrTholdGrp                      =  0x00A00300,
> +	HWPfQmgrGrpTmplateReg0Indx            =  0x00A00600,
> +	HWPfQmgrGrpTmplateReg1Indx            =  0x00A00680,
> +	HWPfQmgrGrpTmplateReg2indx            =  0x00A00700,
> +	HWPfQmgrGrpTmplateReg3Indx            =  0x00A00780,
> +	HWPfQmgrGrpTmplateReg4Indx            =  0x00A00800,
> +	HWPfQmgrVfBaseAddr                    =  0x00A01000,
> +	HWPfQmgrUl4GWeightRrVf                =  0x00A02000,
> +	HWPfQmgrDl4GWeightRrVf                =  0x00A02100,
> +	HWPfQmgrUl5GWeightRrVf                =  0x00A02200,
> +	HWPfQmgrDl5GWeightRrVf                =  0x00A02300,
> +	HWPfQmgrMldWeightRrVf                 =  0x00A02400,
> +	HWPfQmgrArbQDepthGrp                  =  0x00A02F00,
> +	HWPfQmgrGrpFunction0                  =  0x00A02F40,
> +	HWPfQmgrGrpFunction1                  =  0x00A02F44,
> +	HWPfQmgrGrpPriority                   =  0x00A02F48,
> +	HWPfQmgrWeightSync                    =  0x00A03000,
> +	HWPfQmgrAqEnableVf                    =  0x00A10000,
> +	HWPfQmgrAqResetVf                     =  0x00A20000,
> +	HWPfQmgrRingSizeVf                    =  0x00A20004,
> +	HWPfQmgrGrpDepthLog20Vf               =  0x00A20008,
> +	HWPfQmgrGrpDepthLog21Vf               =  0x00A2000C,
> +	HWPfQmgrGrpFunction0Vf                =  0x00A20010,
> +	HWPfQmgrGrpFunction1Vf                =  0x00A20014,
> +	HWPfDmaConfig0Reg                     =  0x00B80000,
> +	HWPfDmaConfig1Reg                     =  0x00B80004,
> +	HWPfDmaQmgrAddrReg                    =  0x00B80008,
> +	HWPfDmaSoftResetReg                   =  0x00B8000C,
> +	HWPfDmaAxcacheReg                     =  0x00B80010,
> +	HWPfDmaVersionReg                     =  0x00B80014,
> +	HWPfDmaFrameThreshold                 =  0x00B80018,
> +	HWPfDmaTimestampLo                    =  0x00B8001C,
> +	HWPfDmaTimestampHi                    =  0x00B80020,
> +	HWPfDmaAxiStatus                      =  0x00B80028,
> +	HWPfDmaAxiControl                     =  0x00B8002C,
> +	HWPfDmaNoQmgr                         =  0x00B80030,
> +	HWPfDmaQosScale                       =  0x00B80034,
> +	HWPfDmaQmanen                         =  0x00B80040,
> +	HWPfDmaQmgrQosBase                    =  0x00B80060,
> +	HWPfDmaFecClkGatingEnable             =  0x00B80080,
> +	HWPfDmaPmEnable                       =  0x00B80084,
> +	HWPfDmaQosEnable                      =  0x00B80088,
> +	HWPfDmaHarqWeightedRrFrameThreshold   =  0x00B800B0,
> +	HWPfDmaDataSmallWeightedRrFrameThresh  = 0x00B800B4,
> +	HWPfDmaDataLargeWeightedRrFrameThresh  = 0x00B800B8,
> +	HWPfDmaInboundCbMaxSize               =  0x00B800BC,
> +	HWPfDmaInboundDrainDataSize           =  0x00B800C0,
> +	HWPfDmaVfDdrBaseRw                    =  0x00B80400,
> +	HWPfDmaCmplTmOutCnt                   =  0x00B80800,
> +	HWPfDmaProcTmOutCnt                   =  0x00B80804,
> +	HWPfDmaStatusRrespBresp               =  0x00B80810,
> +	HWPfDmaCfgRrespBresp                  =  0x00B80814,
> +	HWPfDmaStatusMemParErr                =  0x00B80818,
> +	HWPfDmaCfgMemParErrEn                 =  0x00B8081C,
> +	HWPfDmaStatusDmaHwErr                 =  0x00B80820,
> +	HWPfDmaCfgDmaHwErrEn                  =  0x00B80824,
> +	HWPfDmaStatusFecCoreErr               =  0x00B80828,
> +	HWPfDmaCfgFecCoreErrEn                =  0x00B8082C,
> +	HWPfDmaStatusFcwDescrErr              =  0x00B80830,
> +	HWPfDmaCfgFcwDescrErrEn               =  0x00B80834,
> +	HWPfDmaStatusBlockTransmit            =  0x00B80838,
> +	HWPfDmaBlockOnErrEn                   =  0x00B8083C,
> +	HWPfDmaStatusFlushDma                 =  0x00B80840,
> +	HWPfDmaFlushDmaOnErrEn                =  0x00B80844,
> +	HWPfDmaStatusSdoneFifoFull            =  0x00B80848,
> +	HWPfDmaStatusDescriptorErrLoVf        =  0x00B8084C,
> +	HWPfDmaStatusDescriptorErrHiVf        =  0x00B80850,
> +	HWPfDmaStatusFcwErrLoVf               =  0x00B80854,
> +	HWPfDmaStatusFcwErrHiVf               =  0x00B80858,
> +	HWPfDmaStatusDataErrLoVf              =  0x00B8085C,
> +	HWPfDmaStatusDataErrHiVf              =  0x00B80860,
> +	HWPfDmaCfgMsiEnSoftwareErr            =  0x00B80864,
> +	HWPfDmaDescriptorSignatuture          =  0x00B80868,
> +	HWPfDmaFcwSignature                   =  0x00B8086C,
> +	HWPfDmaErrorDetectionEn               =  0x00B80870,
> +	HWPfDmaErrCntrlFifoDebug              =  0x00B8087C,
> +	HWPfDmaStatusToutData                 =  0x00B80880,
> +	HWPfDmaStatusToutDesc                 =  0x00B80884,
> +	HWPfDmaStatusToutUnexpData            =  0x00B80888,
> +	HWPfDmaStatusToutUnexpDesc            =  0x00B8088C,
> +	HWPfDmaStatusToutProcess              =  0x00B80890,
> +	HWPfDmaConfigCtoutOutDataEn           =  0x00B808A0,
> +	HWPfDmaConfigCtoutOutDescrEn          =  0x00B808A4,
> +	HWPfDmaConfigUnexpComplDataEn         =  0x00B808A8,
> +	HWPfDmaConfigUnexpComplDescrEn        =  0x00B808AC,
> +	HWPfDmaConfigPtoutOutEn               =  0x00B808B0,
> +	HWPfDmaFec5GulDescBaseLoRegVf         =  0x00B88020,
> +	HWPfDmaFec5GulDescBaseHiRegVf         =  0x00B88024,
> +	HWPfDmaFec5GulRespPtrLoRegVf          =  0x00B88028,
> +	HWPfDmaFec5GulRespPtrHiRegVf          =  0x00B8802C,
> +	HWPfDmaFec5GdlDescBaseLoRegVf         =  0x00B88040,
> +	HWPfDmaFec5GdlDescBaseHiRegVf         =  0x00B88044,
> +	HWPfDmaFec5GdlRespPtrLoRegVf          =  0x00B88048,
> +	HWPfDmaFec5GdlRespPtrHiRegVf          =  0x00B8804C,
> +	HWPfDmaFec4GulDescBaseLoRegVf         =  0x00B88060,
> +	HWPfDmaFec4GulDescBaseHiRegVf         =  0x00B88064,
> +	HWPfDmaFec4GulRespPtrLoRegVf          =  0x00B88068,
> +	HWPfDmaFec4GulRespPtrHiRegVf          =  0x00B8806C,
> +	HWPfDmaFec4GdlDescBaseLoRegVf         =  0x00B88080,
> +	HWPfDmaFec4GdlDescBaseHiRegVf         =  0x00B88084,
> +	HWPfDmaFec4GdlRespPtrLoRegVf          =  0x00B88088,
> +	HWPfDmaFec4GdlRespPtrHiRegVf          =  0x00B8808C,
> +	HWPfDmaVfDdrBaseRangeRo               =  0x00B880A0,
> +	HWPfQosmonACntrlReg                   =  0x00B90000,
> +	HWPfQosmonAEvalOverflow0              =  0x00B90008,
> +	HWPfQosmonAEvalOverflow1              =  0x00B9000C,
> +	HWPfQosmonADivTerm                    =  0x00B90010,
> +	HWPfQosmonATickTerm                   =  0x00B90014,
> +	HWPfQosmonAEvalTerm                   =  0x00B90018,
> +	HWPfQosmonAAveTerm                    =  0x00B9001C,
> +	HWPfQosmonAForceEccErr                =  0x00B90020,
> +	HWPfQosmonAEccErrDetect               =  0x00B90024,
> +	HWPfQosmonAIterationConfig0Low        =  0x00B90060,
> +	HWPfQosmonAIterationConfig0High       =  0x00B90064,
> +	HWPfQosmonAIterationConfig1Low        =  0x00B90068,
> +	HWPfQosmonAIterationConfig1High       =  0x00B9006C,
> +	HWPfQosmonAIterationConfig2Low        =  0x00B90070,
> +	HWPfQosmonAIterationConfig2High       =  0x00B90074,
> +	HWPfQosmonAIterationConfig3Low        =  0x00B90078,
> +	HWPfQosmonAIterationConfig3High       =  0x00B9007C,
> +	HWPfQosmonAEvalMemAddr                =  0x00B90080,
> +	HWPfQosmonAEvalMemData                =  0x00B90084,
> +	HWPfQosmonAXaction                    =  0x00B900C0,
> +	HWPfQosmonARemThres1Vf                =  0x00B90400,
> +	HWPfQosmonAThres2Vf                   =  0x00B90404,
> +	HWPfQosmonAWeiFracVf                  =  0x00B90408,
> +	HWPfQosmonARrWeiVf                    =  0x00B9040C,
> +	HWPfPermonACntrlRegVf                 =  0x00B98000,
> +	HWPfPermonACountVf                    =  0x00B98008,
> +	HWPfPermonAKCntLoVf                   =  0x00B98010,
> +	HWPfPermonAKCntHiVf                   =  0x00B98014,
> +	HWPfPermonADeltaCntLoVf               =  0x00B98020,
> +	HWPfPermonADeltaCntHiVf               =  0x00B98024,
> +	HWPfPermonAVersionReg                 =  0x00B9C000,
> +	HWPfPermonACbControlFec               =  0x00B9C0F0,
> +	HWPfPermonADltTimerLoFec              =  0x00B9C0F4,
> +	HWPfPermonADltTimerHiFec              =  0x00B9C0F8,
> +	HWPfPermonACbCountFec                 =  0x00B9C100,
> +	HWPfPermonAAccExecTimerLoFec          =  0x00B9C104,
> +	HWPfPermonAAccExecTimerHiFec          =  0x00B9C108,
> +	HWPfPermonAExecTimerMinFec            =  0x00B9C200,
> +	HWPfPermonAExecTimerMaxFec            =  0x00B9C204,
> +	HWPfPermonAControlBusMon              =  0x00B9C400,
> +	HWPfPermonAConfigBusMon               =  0x00B9C404,
> +	HWPfPermonASkipCountBusMon            =  0x00B9C408,
> +	HWPfPermonAMinLatBusMon               =  0x00B9C40C,
> +	HWPfPermonAMaxLatBusMon               =  0x00B9C500,
> +	HWPfPermonATotalLatLowBusMon          =  0x00B9C504,
> +	HWPfPermonATotalLatUpperBusMon        =  0x00B9C508,
> +	HWPfPermonATotalReqCntBusMon          =  0x00B9C50C,
> +	HWPfQosmonBCntrlReg                   =  0x00BA0000,
> +	HWPfQosmonBEvalOverflow0              =  0x00BA0008,
> +	HWPfQosmonBEvalOverflow1              =  0x00BA000C,
> +	HWPfQosmonBDivTerm                    =  0x00BA0010,
> +	HWPfQosmonBTickTerm                   =  0x00BA0014,
> +	HWPfQosmonBEvalTerm                   =  0x00BA0018,
> +	HWPfQosmonBAveTerm                    =  0x00BA001C,
> +	HWPfQosmonBForceEccErr                =  0x00BA0020,
> +	HWPfQosmonBEccErrDetect               =  0x00BA0024,
> +	HWPfQosmonBIterationConfig0Low        =  0x00BA0060,
> +	HWPfQosmonBIterationConfig0High       =  0x00BA0064,
> +	HWPfQosmonBIterationConfig1Low        =  0x00BA0068,
> +	HWPfQosmonBIterationConfig1High       =  0x00BA006C,
> +	HWPfQosmonBIterationConfig2Low        =  0x00BA0070,
> +	HWPfQosmonBIterationConfig2High       =  0x00BA0074,
> +	HWPfQosmonBIterationConfig3Low        =  0x00BA0078,
> +	HWPfQosmonBIterationConfig3High       =  0x00BA007C,
> +	HWPfQosmonBEvalMemAddr                =  0x00BA0080,
> +	HWPfQosmonBEvalMemData                =  0x00BA0084,
> +	HWPfQosmonBXaction                    =  0x00BA00C0,
> +	HWPfQosmonBRemThres1Vf                =  0x00BA0400,
> +	HWPfQosmonBThres2Vf                   =  0x00BA0404,
> +	HWPfQosmonBWeiFracVf                  =  0x00BA0408,
> +	HWPfQosmonBRrWeiVf                    =  0x00BA040C,
> +	HWPfPermonBCntrlRegVf                 =  0x00BA8000,
> +	HWPfPermonBCountVf                    =  0x00BA8008,
> +	HWPfPermonBKCntLoVf                   =  0x00BA8010,
> +	HWPfPermonBKCntHiVf                   =  0x00BA8014,
> +	HWPfPermonBDeltaCntLoVf               =  0x00BA8020,
> +	HWPfPermonBDeltaCntHiVf               =  0x00BA8024,
> +	HWPfPermonBVersionReg                 =  0x00BAC000,
> +	HWPfPermonBCbControlFec               =  0x00BAC0F0,
> +	HWPfPermonBDltTimerLoFec              =  0x00BAC0F4,
> +	HWPfPermonBDltTimerHiFec              =  0x00BAC0F8,
> +	HWPfPermonBCbCountFec                 =  0x00BAC100,
> +	HWPfPermonBAccExecTimerLoFec          =  0x00BAC104,
> +	HWPfPermonBAccExecTimerHiFec          =  0x00BAC108,
> +	HWPfPermonBExecTimerMinFec            =  0x00BAC200,
> +	HWPfPermonBExecTimerMaxFec            =  0x00BAC204,
> +	HWPfPermonBControlBusMon              =  0x00BAC400,
> +	HWPfPermonBConfigBusMon               =  0x00BAC404,
> +	HWPfPermonBSkipCountBusMon            =  0x00BAC408,
> +	HWPfPermonBMinLatBusMon               =  0x00BAC40C,
> +	HWPfPermonBMaxLatBusMon               =  0x00BAC500,
> +	HWPfPermonBTotalLatLowBusMon          =  0x00BAC504,
> +	HWPfPermonBTotalLatUpperBusMon        =  0x00BAC508,
> +	HWPfPermonBTotalReqCntBusMon          =  0x00BAC50C,
> +	HWPfFecUl5gCntrlReg                   =  0x00BC0000,
> +	HWPfFecUl5gI2MThreshReg               =  0x00BC0004,
> +	HWPfFecUl5gVersionReg                 =  0x00BC0100,
> +	HWPfFecUl5gFcwStatusReg               =  0x00BC0104,
> +	HWPfFecUl5gWarnReg                    =  0x00BC0108,
> +	HwPfFecUl5gIbDebugReg                 =  0x00BC0200,
> +	HwPfFecUl5gObLlrDebugReg              =  0x00BC0204,
> +	HwPfFecUl5gObHarqDebugReg             =  0x00BC0208,
> +	HwPfFecUl5g1CntrlReg                  =  0x00BC1000,
> +	HwPfFecUl5g1I2MThreshReg              =  0x00BC1004,
> +	HwPfFecUl5g1VersionReg                =  0x00BC1100,
> +	HwPfFecUl5g1FcwStatusReg              =  0x00BC1104,
> +	HwPfFecUl5g1WarnReg                   =  0x00BC1108,
> +	HwPfFecUl5g1IbDebugReg                =  0x00BC1200,
> +	HwPfFecUl5g1ObLlrDebugReg             =  0x00BC1204,
> +	HwPfFecUl5g1ObHarqDebugReg            =  0x00BC1208,
> +	HwPfFecUl5g2CntrlReg                  =  0x00BC2000,
> +	HwPfFecUl5g2I2MThreshReg              =  0x00BC2004,
> +	HwPfFecUl5g2VersionReg                =  0x00BC2100,
> +	HwPfFecUl5g2FcwStatusReg              =  0x00BC2104,
> +	HwPfFecUl5g2WarnReg                   =  0x00BC2108,
> +	HwPfFecUl5g2IbDebugReg                =  0x00BC2200,
> +	HwPfFecUl5g2ObLlrDebugReg             =  0x00BC2204,
> +	HwPfFecUl5g2ObHarqDebugReg            =  0x00BC2208,
> +	HwPfFecUl5g3CntrlReg                  =  0x00BC3000,
> +	HwPfFecUl5g3I2MThreshReg              =  0x00BC3004,
> +	HwPfFecUl5g3VersionReg                =  0x00BC3100,
> +	HwPfFecUl5g3FcwStatusReg              =  0x00BC3104,
> +	HwPfFecUl5g3WarnReg                   =  0x00BC3108,
> +	HwPfFecUl5g3IbDebugReg                =  0x00BC3200,
> +	HwPfFecUl5g3ObLlrDebugReg             =  0x00BC3204,
> +	HwPfFecUl5g3ObHarqDebugReg            =  0x00BC3208,
> +	HwPfFecUl5g4CntrlReg                  =  0x00BC4000,
> +	HwPfFecUl5g4I2MThreshReg              =  0x00BC4004,
> +	HwPfFecUl5g4VersionReg                =  0x00BC4100,
> +	HwPfFecUl5g4FcwStatusReg              =  0x00BC4104,
> +	HwPfFecUl5g4WarnReg                   =  0x00BC4108,
> +	HwPfFecUl5g4IbDebugReg                =  0x00BC4200,
> +	HwPfFecUl5g4ObLlrDebugReg             =  0x00BC4204,
> +	HwPfFecUl5g4ObHarqDebugReg            =  0x00BC4208,
> +	HwPfFecUl5g5CntrlReg                  =  0x00BC5000,
> +	HwPfFecUl5g5I2MThreshReg              =  0x00BC5004,
> +	HwPfFecUl5g5VersionReg                =  0x00BC5100,
> +	HwPfFecUl5g5FcwStatusReg              =  0x00BC5104,
> +	HwPfFecUl5g5WarnReg                   =  0x00BC5108,
> +	HwPfFecUl5g5IbDebugReg                =  0x00BC5200,
> +	HwPfFecUl5g5ObLlrDebugReg             =  0x00BC5204,
> +	HwPfFecUl5g5ObHarqDebugReg            =  0x00BC5208,
> +	HwPfFecUl5g6CntrlReg                  =  0x00BC6000,
> +	HwPfFecUl5g6I2MThreshReg              =  0x00BC6004,
> +	HwPfFecUl5g6VersionReg                =  0x00BC6100,
> +	HwPfFecUl5g6FcwStatusReg              =  0x00BC6104,
> +	HwPfFecUl5g6WarnReg                   =  0x00BC6108,
> +	HwPfFecUl5g6IbDebugReg                =  0x00BC6200,
> +	HwPfFecUl5g6ObLlrDebugReg             =  0x00BC6204,
> +	HwPfFecUl5g6ObHarqDebugReg            =  0x00BC6208,
> +	HwPfFecUl5g7CntrlReg                  =  0x00BC7000,
> +	HwPfFecUl5g7I2MThreshReg              =  0x00BC7004,
> +	HwPfFecUl5g7VersionReg                =  0x00BC7100,
> +	HwPfFecUl5g7FcwStatusReg              =  0x00BC7104,
> +	HwPfFecUl5g7WarnReg                   =  0x00BC7108,
> +	HwPfFecUl5g7IbDebugReg                =  0x00BC7200,
> +	HwPfFecUl5g7ObLlrDebugReg             =  0x00BC7204,
> +	HwPfFecUl5g7ObHarqDebugReg            =  0x00BC7208,
> +	HwPfFecUl5g8CntrlReg                  =  0x00BC8000,
> +	HwPfFecUl5g8I2MThreshReg              =  0x00BC8004,
> +	HwPfFecUl5g8VersionReg                =  0x00BC8100,
> +	HwPfFecUl5g8FcwStatusReg              =  0x00BC8104,
> +	HwPfFecUl5g8WarnReg                   =  0x00BC8108,
> +	HwPfFecUl5g8IbDebugReg                =  0x00BC8200,
> +	HwPfFecUl5g8ObLlrDebugReg             =  0x00BC8204,
> +	HwPfFecUl5g8ObHarqDebugReg            =  0x00BC8208,
> +	HWPfFecDl5gCntrlReg                   =  0x00BCF000,
> +	HWPfFecDl5gI2MThreshReg               =  0x00BCF004,
> +	HWPfFecDl5gVersionReg                 =  0x00BCF100,
> +	HWPfFecDl5gFcwStatusReg               =  0x00BCF104,
> +	HWPfFecDl5gWarnReg                    =  0x00BCF108,
> +	HWPfFecUlVersionReg                   =  0x00BD0000,
> +	HWPfFecUlControlReg                   =  0x00BD0004,
> +	HWPfFecUlStatusReg                    =  0x00BD0008,
> +	HWPfFecDlVersionReg                   =  0x00BDF000,
> +	HWPfFecDlClusterConfigReg             =  0x00BDF004,
> +	HWPfFecDlBurstThres                   =  0x00BDF00C,
> +	HWPfFecDlClusterStatusReg0            =  0x00BDF040,
> +	HWPfFecDlClusterStatusReg1            =  0x00BDF044,
> +	HWPfFecDlClusterStatusReg2            =  0x00BDF048,
> +	HWPfFecDlClusterStatusReg3            =  0x00BDF04C,
> +	HWPfFecDlClusterStatusReg4            =  0x00BDF050,
> +	HWPfFecDlClusterStatusReg5            =  0x00BDF054,
> +	HWPfChaFabPllPllrst                   =  0x00C40000,
> +	HWPfChaFabPllClk0                     =  0x00C40004,
> +	HWPfChaFabPllClk1                     =  0x00C40008,
> +	HWPfChaFabPllBwadj                    =  0x00C4000C,
> +	HWPfChaFabPllLbw                      =  0x00C40010,
> +	HWPfChaFabPllResetq                   =  0x00C40014,
> +	HWPfChaFabPllPhshft0                  =  0x00C40018,
> +	HWPfChaFabPllPhshft1                  =  0x00C4001C,
> +	HWPfChaFabPllDivq0                    =  0x00C40020,
> +	HWPfChaFabPllDivq1                    =  0x00C40024,
> +	HWPfChaFabPllDivq2                    =  0x00C40028,
> +	HWPfChaFabPllDivq3                    =  0x00C4002C,
> +	HWPfChaFabPllDivq4                    =  0x00C40030,
> +	HWPfChaFabPllDivq5                    =  0x00C40034,
> +	HWPfChaFabPllDivq6                    =  0x00C40038,
> +	HWPfChaFabPllDivq7                    =  0x00C4003C,
> +	HWPfChaDl5gPllPllrst                  =  0x00C40080,
> +	HWPfChaDl5gPllClk0                    =  0x00C40084,
> +	HWPfChaDl5gPllClk1                    =  0x00C40088,
> +	HWPfChaDl5gPllBwadj                   =  0x00C4008C,
> +	HWPfChaDl5gPllLbw                     =  0x00C40090,
> +	HWPfChaDl5gPllResetq                  =  0x00C40094,
> +	HWPfChaDl5gPllPhshft0                 =  0x00C40098,
> +	HWPfChaDl5gPllPhshft1                 =  0x00C4009C,
> +	HWPfChaDl5gPllDivq0                   =  0x00C400A0,
> +	HWPfChaDl5gPllDivq1                   =  0x00C400A4,
> +	HWPfChaDl5gPllDivq2                   =  0x00C400A8,
> +	HWPfChaDl5gPllDivq3                   =  0x00C400AC,
> +	HWPfChaDl5gPllDivq4                   =  0x00C400B0,
> +	HWPfChaDl5gPllDivq5                   =  0x00C400B4,
> +	HWPfChaDl5gPllDivq6                   =  0x00C400B8,
> +	HWPfChaDl5gPllDivq7                   =  0x00C400BC,
> +	HWPfChaDl4gPllPllrst                  =  0x00C40100,
> +	HWPfChaDl4gPllClk0                    =  0x00C40104,
> +	HWPfChaDl4gPllClk1                    =  0x00C40108,
> +	HWPfChaDl4gPllBwadj                   =  0x00C4010C,
> +	HWPfChaDl4gPllLbw                     =  0x00C40110,
> +	HWPfChaDl4gPllResetq                  =  0x00C40114,
> +	HWPfChaDl4gPllPhshft0                 =  0x00C40118,
> +	HWPfChaDl4gPllPhshft1                 =  0x00C4011C,
> +	HWPfChaDl4gPllDivq0                   =  0x00C40120,
> +	HWPfChaDl4gPllDivq1                   =  0x00C40124,
> +	HWPfChaDl4gPllDivq2                   =  0x00C40128,
> +	HWPfChaDl4gPllDivq3                   =  0x00C4012C,
> +	HWPfChaDl4gPllDivq4                   =  0x00C40130,
> +	HWPfChaDl4gPllDivq5                   =  0x00C40134,
> +	HWPfChaDl4gPllDivq6                   =  0x00C40138,
> +	HWPfChaDl4gPllDivq7                   =  0x00C4013C,
> +	HWPfChaUl5gPllPllrst                  =  0x00C40180,
> +	HWPfChaUl5gPllClk0                    =  0x00C40184,
> +	HWPfChaUl5gPllClk1                    =  0x00C40188,
> +	HWPfChaUl5gPllBwadj                   =  0x00C4018C,
> +	HWPfChaUl5gPllLbw                     =  0x00C40190,
> +	HWPfChaUl5gPllResetq                  =  0x00C40194,
> +	HWPfChaUl5gPllPhshft0                 =  0x00C40198,
> +	HWPfChaUl5gPllPhshft1                 =  0x00C4019C,
> +	HWPfChaUl5gPllDivq0                   =  0x00C401A0,
> +	HWPfChaUl5gPllDivq1                   =  0x00C401A4,
> +	HWPfChaUl5gPllDivq2                   =  0x00C401A8,
> +	HWPfChaUl5gPllDivq3                   =  0x00C401AC,
> +	HWPfChaUl5gPllDivq4                   =  0x00C401B0,
> +	HWPfChaUl5gPllDivq5                   =  0x00C401B4,
> +	HWPfChaUl5gPllDivq6                   =  0x00C401B8,
> +	HWPfChaUl5gPllDivq7                   =  0x00C401BC,
> +	HWPfChaUl4gPllPllrst                  =  0x00C40200,
> +	HWPfChaUl4gPllClk0                    =  0x00C40204,
> +	HWPfChaUl4gPllClk1                    =  0x00C40208,
> +	HWPfChaUl4gPllBwadj                   =  0x00C4020C,
> +	HWPfChaUl4gPllLbw                     =  0x00C40210,
> +	HWPfChaUl4gPllResetq                  =  0x00C40214,
> +	HWPfChaUl4gPllPhshft0                 =  0x00C40218,
> +	HWPfChaUl4gPllPhshft1                 =  0x00C4021C,
> +	HWPfChaUl4gPllDivq0                   =  0x00C40220,
> +	HWPfChaUl4gPllDivq1                   =  0x00C40224,
> +	HWPfChaUl4gPllDivq2                   =  0x00C40228,
> +	HWPfChaUl4gPllDivq3                   =  0x00C4022C,
> +	HWPfChaUl4gPllDivq4                   =  0x00C40230,
> +	HWPfChaUl4gPllDivq5                   =  0x00C40234,
> +	HWPfChaUl4gPllDivq6                   =  0x00C40238,
> +	HWPfChaUl4gPllDivq7                   =  0x00C4023C,
> +	HWPfChaDdrPllPllrst                   =  0x00C40280,
> +	HWPfChaDdrPllClk0                     =  0x00C40284,
> +	HWPfChaDdrPllClk1                     =  0x00C40288,
> +	HWPfChaDdrPllBwadj                    =  0x00C4028C,
> +	HWPfChaDdrPllLbw                      =  0x00C40290,
> +	HWPfChaDdrPllResetq                   =  0x00C40294,
> +	HWPfChaDdrPllPhshft0                  =  0x00C40298,
> +	HWPfChaDdrPllPhshft1                  =  0x00C4029C,
> +	HWPfChaDdrPllDivq0                    =  0x00C402A0,
> +	HWPfChaDdrPllDivq1                    =  0x00C402A4,
> +	HWPfChaDdrPllDivq2                    =  0x00C402A8,
> +	HWPfChaDdrPllDivq3                    =  0x00C402AC,
> +	HWPfChaDdrPllDivq4                    =  0x00C402B0,
> +	HWPfChaDdrPllDivq5                    =  0x00C402B4,
> +	HWPfChaDdrPllDivq6                    =  0x00C402B8,
> +	HWPfChaDdrPllDivq7                    =  0x00C402BC,
> +	HWPfChaErrStatus                      =  0x00C40400,
> +	HWPfChaErrMask                        =  0x00C40404,
> +	HWPfChaDebugPcieMsiFifo               =  0x00C40410,
> +	HWPfChaDebugDdrMsiFifo                =  0x00C40414,
> +	HWPfChaDebugMiscMsiFifo               =  0x00C40418,
> +	HWPfChaPwmSet                         =  0x00C40420,
> +	HWPfChaDdrRstStatus                   =  0x00C40430,
> +	HWPfChaDdrStDoneStatus                =  0x00C40434,
> +	HWPfChaDdrWbRstCfg                    =  0x00C40438,
> +	HWPfChaDdrApbRstCfg                   =  0x00C4043C,
> +	HWPfChaDdrPhyRstCfg                   =  0x00C40440,
> +	HWPfChaDdrCpuRstCfg                   =  0x00C40444,
> +	HWPfChaDdrSifRstCfg                   =  0x00C40448,
> +	HWPfChaPadcfgPcomp0                   =  0x00C41000,
> +	HWPfChaPadcfgNcomp0                   =  0x00C41004,
> +	HWPfChaPadcfgOdt0                     =  0x00C41008,
> +	HWPfChaPadcfgProtect0                 =  0x00C4100C,
> +	HWPfChaPreemphasisProtect0            =  0x00C41010,
> +	HWPfChaPreemphasisCompen0             =  0x00C41040,
> +	HWPfChaPreemphasisOdten0              =  0x00C41044,
> +	HWPfChaPadcfgPcomp1                   =  0x00C41100,
> +	HWPfChaPadcfgNcomp1                   =  0x00C41104,
> +	HWPfChaPadcfgOdt1                     =  0x00C41108,
> +	HWPfChaPadcfgProtect1                 =  0x00C4110C,
> +	HWPfChaPreemphasisProtect1            =  0x00C41110,
> +	HWPfChaPreemphasisCompen1             =  0x00C41140,
> +	HWPfChaPreemphasisOdten1              =  0x00C41144,
> +	HWPfChaPadcfgPcomp2                   =  0x00C41200,
> +	HWPfChaPadcfgNcomp2                   =  0x00C41204,
> +	HWPfChaPadcfgOdt2                     =  0x00C41208,
> +	HWPfChaPadcfgProtect2                 =  0x00C4120C,
> +	HWPfChaPreemphasisProtect2            =  0x00C41210,
> +	HWPfChaPreemphasisCompen2             =  0x00C41240,
> +	HWPfChaPreemphasisOdten2              =  0x00C41244,
> +	HWPfChaPadcfgPcomp3                   =  0x00C41300,
> +	HWPfChaPadcfgNcomp3                   =  0x00C41304,
> +	HWPfChaPadcfgOdt3                     =  0x00C41308,
> +	HWPfChaPadcfgProtect3                 =  0x00C4130C,
> +	HWPfChaPreemphasisProtect3            =  0x00C41310,
> +	HWPfChaPreemphasisCompen3             =  0x00C41340,
> +	HWPfChaPreemphasisOdten3              =  0x00C41344,
> +	HWPfChaPadcfgPcomp4                   =  0x00C41400,
> +	HWPfChaPadcfgNcomp4                   =  0x00C41404,
> +	HWPfChaPadcfgOdt4                     =  0x00C41408,
> +	HWPfChaPadcfgProtect4                 =  0x00C4140C,
> +	HWPfChaPreemphasisProtect4            =  0x00C41410,
> +	HWPfChaPreemphasisCompen4             =  0x00C41440,
> +	HWPfChaPreemphasisOdten4              =  0x00C41444,
> +	HWPfHiVfToPfDbellVf                   =  0x00C80000,
> +	HWPfHiPfToVfDbellVf                   =  0x00C80008,
> +	HWPfHiInfoRingBaseLoVf                =  0x00C80010,
> +	HWPfHiInfoRingBaseHiVf                =  0x00C80014,
> +	HWPfHiInfoRingPointerVf               =  0x00C80018,
> +	HWPfHiInfoRingIntWrEnVf               =  0x00C80020,
> +	HWPfHiInfoRingPf2VfWrEnVf             =  0x00C80024,
> +	HWPfHiMsixVectorMapperVf              =  0x00C80060,
> +	HWPfHiModuleVersionReg                =  0x00C84000,
> +	HWPfHiIosf2axiErrLogReg               =  0x00C84004,
> +	HWPfHiHardResetReg                    =  0x00C84008,
> +	HWPfHi5GHardResetReg                  =  0x00C8400C,
> +	HWPfHiInfoRingBaseLoRegPf             =  0x00C84010,
> +	HWPfHiInfoRingBaseHiRegPf             =  0x00C84014,
> +	HWPfHiInfoRingPointerRegPf            =  0x00C84018,
> +	HWPfHiInfoRingIntWrEnRegPf            =  0x00C84020,
> +	HWPfHiInfoRingVf2pfLoWrEnReg          =  0x00C84024,
> +	HWPfHiInfoRingVf2pfHiWrEnReg          =  0x00C84028,
> +	HWPfHiLogParityErrStatusReg           =  0x00C8402C,
> +	HWPfHiLogDataParityErrorVfStatusLo    =  0x00C84030,
> +	HWPfHiLogDataParityErrorVfStatusHi    =  0x00C84034,
> +	HWPfHiBlockTransmitOnErrorEn          =  0x00C84038,
> +	HWPfHiCfgMsiIntWrEnRegPf              =  0x00C84040,
> +	HWPfHiCfgMsiVf2pfLoWrEnReg            =  0x00C84044,
> +	HWPfHiCfgMsiVf2pfHighWrEnReg          =  0x00C84048,
> +	HWPfHiMsixVectorMapperPf              =  0x00C84060,
> +	HWPfHiApbWrWaitTime                   =  0x00C84100,
> +	HWPfHiXCounterMaxValue                =  0x00C84104,
> +	HWPfHiPfMode                          =  0x00C84108,
> +	HWPfHiClkGateHystReg                  =  0x00C8410C,
> +	HWPfHiSnoopBitsReg                    =  0x00C84110,
> +	HWPfHiMsiDropEnableReg                =  0x00C84114,
> +	HWPfHiMsiStatReg                      =  0x00C84120,
> +	HWPfHiFifoOflStatReg                  =  0x00C84124,
> +	HWPfHiHiDebugReg                      =  0x00C841F4,
> +	HWPfHiDebugMemSnoopMsiFifo            =  0x00C841F8,
> +	HWPfHiDebugMemSnoopInputFifo          =  0x00C841FC,
> +	HWPfHiMsixMappingConfig               =  0x00C84200,
> +	HWPfHiJunkReg                         =  0x00C8FF00,
> +	HWPfDdrUmmcVer                        =  0x00D00000,
> +	HWPfDdrUmmcCap                        =  0x00D00010,
> +	HWPfDdrUmmcCtrl                       =  0x00D00020,
> +	HWPfDdrMpcPe                          =  0x00D00080,
> +	HWPfDdrMpcPpri3                       =  0x00D00090,
> +	HWPfDdrMpcPpri2                       =  0x00D000A0,
> +	HWPfDdrMpcPpri1                       =  0x00D000B0,
> +	HWPfDdrMpcPpri0                       =  0x00D000C0,
> +	HWPfDdrMpcPrwgrpCtrl                  =  0x00D000D0,
> +	HWPfDdrMpcPbw7                        =  0x00D000E0,
> +	HWPfDdrMpcPbw6                        =  0x00D000F0,
> +	HWPfDdrMpcPbw5                        =  0x00D00100,
> +	HWPfDdrMpcPbw4                        =  0x00D00110,
> +	HWPfDdrMpcPbw3                        =  0x00D00120,
> +	HWPfDdrMpcPbw2                        =  0x00D00130,
> +	HWPfDdrMpcPbw1                        =  0x00D00140,
> +	HWPfDdrMpcPbw0                        =  0x00D00150,
> +	HWPfDdrMemoryInit                     =  0x00D00200,
> +	HWPfDdrMemoryInitDone                 =  0x00D00210,
> +	HWPfDdrMemInitPhyTrng0                =  0x00D00240,
> +	HWPfDdrMemInitPhyTrng1                =  0x00D00250,
> +	HWPfDdrMemInitPhyTrng2                =  0x00D00260,
> +	HWPfDdrMemInitPhyTrng3                =  0x00D00270,
> +	HWPfDdrBcDram                         =  0x00D003C0,
> +	HWPfDdrBcAddrMap                      =  0x00D003D0,
> +	HWPfDdrBcRef                          =  0x00D003E0,
> +	HWPfDdrBcTim0                         =  0x00D00400,
> +	HWPfDdrBcTim1                         =  0x00D00410,
> +	HWPfDdrBcTim2                         =  0x00D00420,
> +	HWPfDdrBcTim3                         =  0x00D00430,
> +	HWPfDdrBcTim4                         =  0x00D00440,
> +	HWPfDdrBcTim5                         =  0x00D00450,
> +	HWPfDdrBcTim6                         =  0x00D00460,
> +	HWPfDdrBcTim7                         =  0x00D00470,
> +	HWPfDdrBcTim8                         =  0x00D00480,
> +	HWPfDdrBcTim9                         =  0x00D00490,
> +	HWPfDdrBcTim10                        =  0x00D004A0,
> +	HWPfDdrBcTim12                        =  0x00D004C0,
> +	HWPfDdrDfiInit                        =  0x00D004D0,
> +	HWPfDdrDfiInitComplete                =  0x00D004E0,
> +	HWPfDdrDfiTim0                        =  0x00D004F0,
> +	HWPfDdrDfiTim1                        =  0x00D00500,
> +	HWPfDdrDfiPhyUpdEn                    =  0x00D00530,
> +	HWPfDdrMemStatus                      =  0x00D00540,
> +	HWPfDdrUmmcErrStatus                  =  0x00D00550,
> +	HWPfDdrUmmcIntStatus                  =  0x00D00560,
> +	HWPfDdrUmmcIntEn                      =  0x00D00570,
> +	HWPfDdrPhyRdLatency                   =  0x00D48400,
> +	HWPfDdrPhyRdLatencyDbi                =  0x00D48410,
> +	HWPfDdrPhyWrLatency                   =  0x00D48420,
> +	HWPfDdrPhyTrngType                    =  0x00D48430,
> +	HWPfDdrPhyMrsTiming2                  =  0x00D48440,
> +	HWPfDdrPhyMrsTiming0                  =  0x00D48450,
> +	HWPfDdrPhyMrsTiming1                  =  0x00D48460,
> +	HWPfDdrPhyDramTmrd                    =  0x00D48470,
> +	HWPfDdrPhyDramTmod                    =  0x00D48480,
> +	HWPfDdrPhyDramTwpre                   =  0x00D48490,
> +	HWPfDdrPhyDramTrfc                    =  0x00D484A0,
> +	HWPfDdrPhyDramTrwtp                   =  0x00D484B0,
> +	HWPfDdrPhyMr01Dimm                    =  0x00D484C0,
> +	HWPfDdrPhyMr01DimmDbi                 =  0x00D484D0,
> +	HWPfDdrPhyMr23Dimm                    =  0x00D484E0,
> +	HWPfDdrPhyMr45Dimm                    =  0x00D484F0,
> +	HWPfDdrPhyMr67Dimm                    =  0x00D48500,
> +	HWPfDdrPhyWrlvlWwRdlvlRr              =  0x00D48510,
> +	HWPfDdrPhyOdtEn                       =  0x00D48520,
> +	HWPfDdrPhyFastTrng                    =  0x00D48530,
> +	HWPfDdrPhyDynTrngGap                  =  0x00D48540,
> +	HWPfDdrPhyDynRcalGap                  =  0x00D48550,
> +	HWPfDdrPhyIdletimeout                 =  0x00D48560,
> +	HWPfDdrPhyRstCkeGap                   =  0x00D48570,
> +	HWPfDdrPhyCkeMrsGap                   =  0x00D48580,
> +	HWPfDdrPhyMemVrefMidVal               =  0x00D48590,
> +	HWPfDdrPhyVrefStep                    =  0x00D485A0,
> +	HWPfDdrPhyVrefThreshold               =  0x00D485B0,
> +	HWPfDdrPhyPhyVrefMidVal               =  0x00D485C0,
> +	HWPfDdrPhyDqsCountMax                 =  0x00D485D0,
> +	HWPfDdrPhyDqsCountNum                 =  0x00D485E0,
> +	HWPfDdrPhyDramRow                     =  0x00D485F0,
> +	HWPfDdrPhyDramCol                     =  0x00D48600,
> +	HWPfDdrPhyDramBgBa                    =  0x00D48610,
> +	HWPfDdrPhyDynamicUpdreqrel            =  0x00D48620,
> +	HWPfDdrPhyVrefLimits                  =  0x00D48630,
> +	HWPfDdrPhyIdtmTcStatus                =  0x00D6C020,
> +	HWPfDdrPhyIdtmFwVersion               =  0x00D6C410,
> +	HWPfDdrPhyRdlvlGateInitDelay          =  0x00D70000,
> +	HWPfDdrPhyRdenSmplabc                 =  0x00D70008,
> +	HWPfDdrPhyVrefNibble0                 =  0x00D7000C,
> +	HWPfDdrPhyVrefNibble1                 =  0x00D70010,
> +	HWPfDdrPhyRdlvlGateDqsSmpl0           =  0x00D70014,
> +	HWPfDdrPhyRdlvlGateDqsSmpl1           =  0x00D70018,
> +	HWPfDdrPhyRdlvlGateDqsSmpl2           =  0x00D7001C,
> +	HWPfDdrPhyDqsCount                    =  0x00D70020,
> +	HWPfDdrPhyWrlvlRdlvlGateStatus        =  0x00D70024,
> +	HWPfDdrPhyErrorFlags                  =  0x00D70028,
> +	HWPfDdrPhyPowerDown                   =  0x00D70030,
> +	HWPfDdrPhyPrbsSeedByte0               =  0x00D70034,
> +	HWPfDdrPhyPrbsSeedByte1               =  0x00D70038,
> +	HWPfDdrPhyPcompDq                     =  0x00D70040,
> +	HWPfDdrPhyNcompDq                     =  0x00D70044,
> +	HWPfDdrPhyPcompDqs                    =  0x00D70048,
> +	HWPfDdrPhyNcompDqs                    =  0x00D7004C,
> +	HWPfDdrPhyPcompCmd                    =  0x00D70050,
> +	HWPfDdrPhyNcompCmd                    =  0x00D70054,
> +	HWPfDdrPhyPcompCk                     =  0x00D70058,
> +	HWPfDdrPhyNcompCk                     =  0x00D7005C,
> +	HWPfDdrPhyRcalOdtDq                   =  0x00D70060,
> +	HWPfDdrPhyRcalOdtDqs                  =  0x00D70064,
> +	HWPfDdrPhyRcalMask1                   =  0x00D70068,
> +	HWPfDdrPhyRcalMask2                   =  0x00D7006C,
> +	HWPfDdrPhyRcalCtrl                    =  0x00D70070,
> +	HWPfDdrPhyRcalCnt                     =  0x00D70074,
> +	HWPfDdrPhyRcalOverride                =  0x00D70078,
> +	HWPfDdrPhyRcalGateen                  =  0x00D7007C,
> +	HWPfDdrPhyCtrl                        =  0x00D70080,
> +	HWPfDdrPhyWrlvlAlg                    =  0x00D70084,
> +	HWPfDdrPhyRcalVreftTxcmdOdt           =  0x00D70088,
> +	HWPfDdrPhyRdlvlGateParam              =  0x00D7008C,
> +	HWPfDdrPhyRdlvlGateParam2             =  0x00D70090,
> +	HWPfDdrPhyRcalVreftTxdata             =  0x00D70094,
> +	HWPfDdrPhyCmdIntDelay                 =  0x00D700A4,
> +	HWPfDdrPhyAlertN                      =  0x00D700A8,
> +	HWPfDdrPhyTrngReqWpre2tck             =  0x00D700AC,
> +	HWPfDdrPhyCmdPhaseSel                 =  0x00D700B4,
> +	HWPfDdrPhyCmdDcdl                     =  0x00D700B8,
> +	HWPfDdrPhyCkDcdl                      =  0x00D700BC,
> +	HWPfDdrPhySwTrngCtrl1                 =  0x00D700C0,
> +	HWPfDdrPhySwTrngCtrl2                 =  0x00D700C4,
> +	HWPfDdrPhyRcalPcompRden               =  0x00D700C8,
> +	HWPfDdrPhyRcalNcompRden               =  0x00D700CC,
> +	HWPfDdrPhyRcalCompen                  =  0x00D700D0,
> +	HWPfDdrPhySwTrngRdqs                  =  0x00D700D4,
> +	HWPfDdrPhySwTrngWdqs                  =  0x00D700D8,
> +	HWPfDdrPhySwTrngRdena                 =  0x00D700DC,
> +	HWPfDdrPhySwTrngRdenb                 =  0x00D700E0,
> +	HWPfDdrPhySwTrngRdenc                 =  0x00D700E4,
> +	HWPfDdrPhySwTrngWdq                   =  0x00D700E8,
> +	HWPfDdrPhySwTrngRdq                   =  0x00D700EC,
> +	HWPfDdrPhyPcfgHmValue                 =  0x00D700F0,
> +	HWPfDdrPhyPcfgTimerValue              =  0x00D700F4,
> +	HWPfDdrPhyPcfgSoftwareTraining        =  0x00D700F8,
> +	HWPfDdrPhyPcfgMcStatus                =  0x00D700FC,
> +	HWPfDdrPhyWrlvlPhRank0                =  0x00D70100,
> +	HWPfDdrPhyRdenPhRank0                 =  0x00D70104,
> +	HWPfDdrPhyRdenIntRank0                =  0x00D70108,
> +	HWPfDdrPhyRdqsDcdlRank0               =  0x00D7010C,
> +	HWPfDdrPhyRdqsShadowDcdlRank0         =  0x00D70110,
> +	HWPfDdrPhyWdqsDcdlRank0               =  0x00D70114,
> +	HWPfDdrPhyWdmDcdlShadowRank0          =  0x00D70118,
> +	HWPfDdrPhyWdmDcdlRank0                =  0x00D7011C,
> +	HWPfDdrPhyDbiDcdlRank0                =  0x00D70120,
> +	HWPfDdrPhyRdenDcdlaRank0              =  0x00D70124,
> +	HWPfDdrPhyDbiDcdlShadowRank0          =  0x00D70128,
> +	HWPfDdrPhyRdenDcdlbRank0              =  0x00D7012C,
> +	HWPfDdrPhyWdqsShadowDcdlRank0         =  0x00D70130,
> +	HWPfDdrPhyRdenDcdlcRank0              =  0x00D70134,
> +	HWPfDdrPhyRdenShadowDcdlaRank0        =  0x00D70138,
> +	HWPfDdrPhyWrlvlIntRank0               =  0x00D7013C,
> +	HWPfDdrPhyRdqDcdlBit0Rank0            =  0x00D70200,
> +	HWPfDdrPhyRdqDcdlShadowBit0Rank0      =  0x00D70204,
> +	HWPfDdrPhyWdqDcdlBit0Rank0            =  0x00D70208,
> +	HWPfDdrPhyWdqDcdlShadowBit0Rank0      =  0x00D7020C,
> +	HWPfDdrPhyRdqDcdlBit1Rank0            =  0x00D70240,
> +	HWPfDdrPhyRdqDcdlShadowBit1Rank0      =  0x00D70244,
> +	HWPfDdrPhyWdqDcdlBit1Rank0            =  0x00D70248,
> +	HWPfDdrPhyWdqDcdlShadowBit1Rank0      =  0x00D7024C,
> +	HWPfDdrPhyRdqDcdlBit2Rank0            =  0x00D70280,
> +	HWPfDdrPhyRdqDcdlShadowBit2Rank0      =  0x00D70284,
> +	HWPfDdrPhyWdqDcdlBit2Rank0            =  0x00D70288,
> +	HWPfDdrPhyWdqDcdlShadowBit2Rank0      =  0x00D7028C,
> +	HWPfDdrPhyRdqDcdlBit3Rank0            =  0x00D702C0,
> +	HWPfDdrPhyRdqDcdlShadowBit3Rank0      =  0x00D702C4,
> +	HWPfDdrPhyWdqDcdlBit3Rank0            =  0x00D702C8,
> +	HWPfDdrPhyWdqDcdlShadowBit3Rank0      =  0x00D702CC,
> +	HWPfDdrPhyRdqDcdlBit4Rank0            =  0x00D70300,
> +	HWPfDdrPhyRdqDcdlShadowBit4Rank0      =  0x00D70304,
> +	HWPfDdrPhyWdqDcdlBit4Rank0            =  0x00D70308,
> +	HWPfDdrPhyWdqDcdlShadowBit4Rank0      =  0x00D7030C,
> +	HWPfDdrPhyRdqDcdlBit5Rank0            =  0x00D70340,
> +	HWPfDdrPhyRdqDcdlShadowBit5Rank0      =  0x00D70344,
> +	HWPfDdrPhyWdqDcdlBit5Rank0            =  0x00D70348,
> +	HWPfDdrPhyWdqDcdlShadowBit5Rank0      =  0x00D7034C,
> +	HWPfDdrPhyRdqDcdlBit6Rank0            =  0x00D70380,
> +	HWPfDdrPhyRdqDcdlShadowBit6Rank0      =  0x00D70384,
> +	HWPfDdrPhyWdqDcdlBit6Rank0            =  0x00D70388,
> +	HWPfDdrPhyWdqDcdlShadowBit6Rank0      =  0x00D7038C,
> +	HWPfDdrPhyRdqDcdlBit7Rank0            =  0x00D703C0,
> +	HWPfDdrPhyRdqDcdlShadowBit7Rank0      =  0x00D703C4,
> +	HWPfDdrPhyWdqDcdlBit7Rank0            =  0x00D703C8,
> +	HWPfDdrPhyWdqDcdlShadowBit7Rank0      =  0x00D703CC,
> +	HWPfDdrPhyIdtmStatus                  =  0x00D740D0,
> +	HWPfDdrPhyIdtmError                   =  0x00D74110,
> +	HWPfDdrPhyIdtmDebug                   =  0x00D74120,
> +	HWPfDdrPhyIdtmDebugInt                =  0x00D74130,
> +	HwPfPcieLnAsicCfgovr                  =  0x00D80000,
> +	HwPfPcieLnAclkmixer                   =  0x00D80004,
> +	HwPfPcieLnTxrampfreq                  =  0x00D80008,
> +	HwPfPcieLnLanetest                    =  0x00D8000C,
> +	HwPfPcieLnDcctrl                      =  0x00D80010,
> +	HwPfPcieLnDccmeas                     =  0x00D80014,
> +	HwPfPcieLnDccovrAclk                  =  0x00D80018,
> +	HwPfPcieLnDccovrTxa                   =  0x00D8001C,
> +	HwPfPcieLnDccovrTxk                   =  0x00D80020,
> +	HwPfPcieLnDccovrDclk                  =  0x00D80024,
> +	HwPfPcieLnDccovrEclk                  =  0x00D80028,
> +	HwPfPcieLnDcctrimAclk                 =  0x00D8002C,
> +	HwPfPcieLnDcctrimTx                   =  0x00D80030,
> +	HwPfPcieLnDcctrimDclk                 =  0x00D80034,
> +	HwPfPcieLnDcctrimEclk                 =  0x00D80038,
> +	HwPfPcieLnQuadCtrl                    =  0x00D8003C,
> +	HwPfPcieLnQuadCorrIndex               =  0x00D80040,
> +	HwPfPcieLnQuadCorrStatus              =  0x00D80044,
> +	HwPfPcieLnAsicRxovr1                  =  0x00D80048,
> +	HwPfPcieLnAsicRxovr2                  =  0x00D8004C,
> +	HwPfPcieLnAsicEqinfovr                =  0x00D80050,
> +	HwPfPcieLnRxcsr                       =  0x00D80054,
> +	HwPfPcieLnRxfectrl                    =  0x00D80058,
> +	HwPfPcieLnRxtest                      =  0x00D8005C,
> +	HwPfPcieLnEscount                     =  0x00D80060,
> +	HwPfPcieLnCdrctrl                     =  0x00D80064,
> +	HwPfPcieLnCdrctrl2                    =  0x00D80068,
> +	HwPfPcieLnCdrcfg0Ctrl0                =  0x00D8006C,
> +	HwPfPcieLnCdrcfg0Ctrl1                =  0x00D80070,
> +	HwPfPcieLnCdrcfg0Ctrl2                =  0x00D80074,
> +	HwPfPcieLnCdrcfg1Ctrl0                =  0x00D80078,
> +	HwPfPcieLnCdrcfg1Ctrl1                =  0x00D8007C,
> +	HwPfPcieLnCdrcfg1Ctrl2                =  0x00D80080,
> +	HwPfPcieLnCdrcfg2Ctrl0                =  0x00D80084,
> +	HwPfPcieLnCdrcfg2Ctrl1                =  0x00D80088,
> +	HwPfPcieLnCdrcfg2Ctrl2                =  0x00D8008C,
> +	HwPfPcieLnCdrcfg3Ctrl0                =  0x00D80090,
> +	HwPfPcieLnCdrcfg3Ctrl1                =  0x00D80094,
> +	HwPfPcieLnCdrcfg3Ctrl2                =  0x00D80098,
> +	HwPfPcieLnCdrphase                    =  0x00D8009C,
> +	HwPfPcieLnCdrfreq                     =  0x00D800A0,
> +	HwPfPcieLnCdrstatusPhase              =  0x00D800A4,
> +	HwPfPcieLnCdrstatusFreq               =  0x00D800A8,
> +	HwPfPcieLnCdroffset                   =  0x00D800AC,
> +	HwPfPcieLnRxvosctl                    =  0x00D800B0,
> +	HwPfPcieLnRxvosctl2                   =  0x00D800B4,
> +	HwPfPcieLnRxlosctl                    =  0x00D800B8,
> +	HwPfPcieLnRxlos                       =  0x00D800BC,
> +	HwPfPcieLnRxlosvval                   =  0x00D800C0,
> +	HwPfPcieLnRxvosd0                     =  0x00D800C4,
> +	HwPfPcieLnRxvosd1                     =  0x00D800C8,
> +	HwPfPcieLnRxvosep0                    =  0x00D800CC,
> +	HwPfPcieLnRxvosep1                    =  0x00D800D0,
> +	HwPfPcieLnRxvosen0                    =  0x00D800D4,
> +	HwPfPcieLnRxvosen1                    =  0x00D800D8,
> +	HwPfPcieLnRxvosafe                    =  0x00D800DC,
> +	HwPfPcieLnRxvosa0                     =  0x00D800E0,
> +	HwPfPcieLnRxvosa0Out                  =  0x00D800E4,
> +	HwPfPcieLnRxvosa1                     =  0x00D800E8,
> +	HwPfPcieLnRxvosa1Out                  =  0x00D800EC,
> +	HwPfPcieLnRxmisc                      =  0x00D800F0,
> +	HwPfPcieLnRxbeacon                    =  0x00D800F4,
> +	HwPfPcieLnRxdssout                    =  0x00D800F8,
> +	HwPfPcieLnRxdssout2                   =  0x00D800FC,
> +	HwPfPcieLnAlphapctrl                  =  0x00D80100,
> +	HwPfPcieLnAlphanctrl                  =  0x00D80104,
> +	HwPfPcieLnAdaptctrl                   =  0x00D80108,
> +	HwPfPcieLnAdaptctrl1                  =  0x00D8010C,
> +	HwPfPcieLnAdaptstatus                 =  0x00D80110,
> +	HwPfPcieLnAdaptvga1                   =  0x00D80114,
> +	HwPfPcieLnAdaptvga2                   =  0x00D80118,
> +	HwPfPcieLnAdaptvga3                   =  0x00D8011C,
> +	HwPfPcieLnAdaptvga4                   =  0x00D80120,
> +	HwPfPcieLnAdaptboost1                 =  0x00D80124,
> +	HwPfPcieLnAdaptboost2                 =  0x00D80128,
> +	HwPfPcieLnAdaptboost3                 =  0x00D8012C,
> +	HwPfPcieLnAdaptboost4                 =  0x00D80130,
> +	HwPfPcieLnAdaptsslms1                 =  0x00D80134,
> +	HwPfPcieLnAdaptsslms2                 =  0x00D80138,
> +	HwPfPcieLnAdaptvgaStatus              =  0x00D8013C,
> +	HwPfPcieLnAdaptboostStatus            =  0x00D80140,
> +	HwPfPcieLnAdaptsslmsStatus1           =  0x00D80144,
> +	HwPfPcieLnAdaptsslmsStatus2           =  0x00D80148,
> +	HwPfPcieLnAfectrl1                    =  0x00D8014C,
> +	HwPfPcieLnAfectrl2                    =  0x00D80150,
> +	HwPfPcieLnAfectrl3                    =  0x00D80154,
> +	HwPfPcieLnAfedefault1                 =  0x00D80158,
> +	HwPfPcieLnAfedefault2                 =  0x00D8015C,
> +	HwPfPcieLnDfectrl1                    =  0x00D80160,
> +	HwPfPcieLnDfectrl2                    =  0x00D80164,
> +	HwPfPcieLnDfectrl3                    =  0x00D80168,
> +	HwPfPcieLnDfectrl4                    =  0x00D8016C,
> +	HwPfPcieLnDfectrl5                    =  0x00D80170,
> +	HwPfPcieLnDfectrl6                    =  0x00D80174,
> +	HwPfPcieLnAfestatus1                  =  0x00D80178,
> +	HwPfPcieLnAfestatus2                  =  0x00D8017C,
> +	HwPfPcieLnDfestatus1                  =  0x00D80180,
> +	HwPfPcieLnDfestatus2                  =  0x00D80184,
> +	HwPfPcieLnDfestatus3                  =  0x00D80188,
> +	HwPfPcieLnDfestatus4                  =  0x00D8018C,
> +	HwPfPcieLnDfestatus5                  =  0x00D80190,
> +	HwPfPcieLnAlphastatus                 =  0x00D80194,
> +	HwPfPcieLnFomctrl1                    =  0x00D80198,
> +	HwPfPcieLnFomctrl2                    =  0x00D8019C,
> +	HwPfPcieLnFomctrl3                    =  0x00D801A0,
> +	HwPfPcieLnAclkcalStatus               =  0x00D801A4,
> +	HwPfPcieLnOffscorrStatus              =  0x00D801A8,
> +	HwPfPcieLnEyewidthStatus              =  0x00D801AC,
> +	HwPfPcieLnEyeheightStatus             =  0x00D801B0,
> +	HwPfPcieLnAsicTxovr1                  =  0x00D801B4,
> +	HwPfPcieLnAsicTxovr2                  =  0x00D801B8,
> +	HwPfPcieLnAsicTxovr3                  =  0x00D801BC,
> +	HwPfPcieLnTxbiasadjOvr                =  0x00D801C0,
> +	HwPfPcieLnTxcsr                       =  0x00D801C4,
> +	HwPfPcieLnTxtest                      =  0x00D801C8,
> +	HwPfPcieLnTxtestword                  =  0x00D801CC,
> +	HwPfPcieLnTxtestwordHigh              =  0x00D801D0,
> +	HwPfPcieLnTxdrive                     =  0x00D801D4,
> +	HwPfPcieLnMtcsLn                      =  0x00D801D8,
> +	HwPfPcieLnStatsumLn                   =  0x00D801DC,
> +	HwPfPcieLnRcbusScratch                =  0x00D801E0,
> +	HwPfPcieLnRcbusMinorrev               =  0x00D801F0,
> +	HwPfPcieLnRcbusMajorrev               =  0x00D801F4,
> +	HwPfPcieLnRcbusBlocktype              =  0x00D801F8,
> +	HwPfPcieSupPllcsr                     =  0x00D80800,
> +	HwPfPcieSupPlldiv                     =  0x00D80804,
> +	HwPfPcieSupPllcal                     =  0x00D80808,
> +	HwPfPcieSupPllcalsts                  =  0x00D8080C,
> +	HwPfPcieSupPllmeas                    =  0x00D80810,
> +	HwPfPcieSupPlldactrim                 =  0x00D80814,
> +	HwPfPcieSupPllbiastrim                =  0x00D80818,
> +	HwPfPcieSupPllbwtrim                  =  0x00D8081C,
> +	HwPfPcieSupPllcaldly                  =  0x00D80820,
> +	HwPfPcieSupRefclkonpclkctrl           =  0x00D80824,
> +	HwPfPcieSupPclkdelay                  =  0x00D80828,
> +	HwPfPcieSupPhyconfig                  =  0x00D8082C,
> +	HwPfPcieSupRcalIntf                   =  0x00D80830,
> +	HwPfPcieSupAuxcsr                     =  0x00D80834,
> +	HwPfPcieSupVref                       =  0x00D80838,
> +	HwPfPcieSupLinkmode                   =  0x00D8083C,
> +	HwPfPcieSupRrefcalctl                 =  0x00D80840,
> +	HwPfPcieSupRrefcal                    =  0x00D80844,
> +	HwPfPcieSupRrefcaldly                 =  0x00D80848,
> +	HwPfPcieSupTximpcalctl                =  0x00D8084C,
> +	HwPfPcieSupTximpcal                   =  0x00D80850,
> +	HwPfPcieSupTximpoffset                =  0x00D80854,
> +	HwPfPcieSupTximpcaldly                =  0x00D80858,
> +	HwPfPcieSupRximpcalctl                =  0x00D8085C,
> +	HwPfPcieSupRximpcal                   =  0x00D80860,
> +	HwPfPcieSupRximpoffset                =  0x00D80864,
> +	HwPfPcieSupRximpcaldly                =  0x00D80868,
> +	HwPfPcieSupFence                      =  0x00D8086C,
> +	HwPfPcieSupMtcs                       =  0x00D80870,
> +	HwPfPcieSupStatsum                    =  0x00D809B8,
> +	HwPfPciePcsDpStatus0                  =  0x00D81000,
> +	HwPfPciePcsDpControl0                 =  0x00D81004,
> +	HwPfPciePcsPmaStatusLane0             =  0x00D81008,
> +	HwPfPciePcsPipeStatusLane0            =  0x00D8100C,
> +	HwPfPciePcsTxdeemph0Lane0             =  0x00D81010,
> +	HwPfPciePcsTxdeemph1Lane0             =  0x00D81014,
> +	HwPfPciePcsInternalStatusLane0        =  0x00D81018,
> +	HwPfPciePcsDpStatus1                  =  0x00D8101C,
> +	HwPfPciePcsDpControl1                 =  0x00D81020,
> +	HwPfPciePcsPmaStatusLane1             =  0x00D81024,
> +	HwPfPciePcsPipeStatusLane1            =  0x00D81028,
> +	HwPfPciePcsTxdeemph0Lane1             =  0x00D8102C,
> +	HwPfPciePcsTxdeemph1Lane1             =  0x00D81030,
> +	HwPfPciePcsInternalStatusLane1        =  0x00D81034,
> +	HwPfPciePcsDpStatus2                  =  0x00D81038,
> +	HwPfPciePcsDpControl2                 =  0x00D8103C,
> +	HwPfPciePcsPmaStatusLane2             =  0x00D81040,
> +	HwPfPciePcsPipeStatusLane2            =  0x00D81044,
> +	HwPfPciePcsTxdeemph0Lane2             =  0x00D81048,
> +	HwPfPciePcsTxdeemph1Lane2             =  0x00D8104C,
> +	HwPfPciePcsInternalStatusLane2        =  0x00D81050,
> +	HwPfPciePcsDpStatus3                  =  0x00D81054,
> +	HwPfPciePcsDpControl3                 =  0x00D81058,
> +	HwPfPciePcsPmaStatusLane3             =  0x00D8105C,
> +	HwPfPciePcsPipeStatusLane3            =  0x00D81060,
> +	HwPfPciePcsTxdeemph0Lane3             =  0x00D81064,
> +	HwPfPciePcsTxdeemph1Lane3             =  0x00D81068,
> +	HwPfPciePcsInternalStatusLane3        =  0x00D8106C,
> +	HwPfPciePcsEbStatus0                  =  0x00D81070,
> +	HwPfPciePcsEbStatus1                  =  0x00D81074,
> +	HwPfPciePcsEbStatus2                  =  0x00D81078,
> +	HwPfPciePcsEbStatus3                  =  0x00D8107C,
> +	HwPfPciePcsPllSettingPcieG1           =  0x00D81088,
> +	HwPfPciePcsPllSettingPcieG2           =  0x00D8108C,
> +	HwPfPciePcsPllSettingPcieG3           =  0x00D81090,
> +	HwPfPciePcsControl                    =  0x00D81094,
> +	HwPfPciePcsEqControl                  =  0x00D81098,
> +	HwPfPciePcsEqTimer                    =  0x00D8109C,
> +	HwPfPciePcsEqErrStatus                =  0x00D810A0,
> +	HwPfPciePcsEqErrCount                 =  0x00D810A4,
> +	HwPfPciePcsStatus                     =  0x00D810A8,
> +	HwPfPciePcsMiscRegister               =  0x00D810AC,
> +	HwPfPciePcsObsControl                 =  0x00D810B0,
> +	HwPfPciePcsPrbsCount0                 =  0x00D81200,
> +	HwPfPciePcsBistControl0               =  0x00D81204,
> +	HwPfPciePcsBistStaticWord00           =  0x00D81208,
> +	HwPfPciePcsBistStaticWord10           =  0x00D8120C,
> +	HwPfPciePcsBistStaticWord20           =  0x00D81210,
> +	HwPfPciePcsBistStaticWord30           =  0x00D81214,
> +	HwPfPciePcsPrbsCount1                 =  0x00D81220,
> +	HwPfPciePcsBistControl1               =  0x00D81224,
> +	HwPfPciePcsBistStaticWord01           =  0x00D81228,
> +	HwPfPciePcsBistStaticWord11           =  0x00D8122C,
> +	HwPfPciePcsBistStaticWord21           =  0x00D81230,
> +	HwPfPciePcsBistStaticWord31           =  0x00D81234,
> +	HwPfPciePcsPrbsCount2                 =  0x00D81240,
> +	HwPfPciePcsBistControl2               =  0x00D81244,
> +	HwPfPciePcsBistStaticWord02           =  0x00D81248,
> +	HwPfPciePcsBistStaticWord12           =  0x00D8124C,
> +	HwPfPciePcsBistStaticWord22           =  0x00D81250,
> +	HwPfPciePcsBistStaticWord32           =  0x00D81254,
> +	HwPfPciePcsPrbsCount3                 =  0x00D81260,
> +	HwPfPciePcsBistControl3               =  0x00D81264,
> +	HwPfPciePcsBistStaticWord03           =  0x00D81268,
> +	HwPfPciePcsBistStaticWord13           =  0x00D8126C,
> +	HwPfPciePcsBistStaticWord23           =  0x00D81270,
> +	HwPfPciePcsBistStaticWord33           =  0x00D81274,
> +	HwPfPcieGpexLtssmStateCntrl           =  0x00D90400,
> +	HwPfPcieGpexLtssmStateStatus          =  0x00D90404,
> +	HwPfPcieGpexSkipFreqTimer             =  0x00D90408,
> +	HwPfPcieGpexLaneSelect                =  0x00D9040C,
> +	HwPfPcieGpexLaneDeskew                =  0x00D90410,
> +	HwPfPcieGpexRxErrorStatus             =  0x00D90414,
> +	HwPfPcieGpexLaneNumControl            =  0x00D90418,
> +	HwPfPcieGpexNFstControl               =  0x00D9041C,
> +	HwPfPcieGpexLinkStatus                =  0x00D90420,
> +	HwPfPcieGpexAckReplayTimeout          =  0x00D90438,
> +	HwPfPcieGpexSeqNumberStatus           =  0x00D9043C,
> +	HwPfPcieGpexCoreClkRatio              =  0x00D90440,
> +	HwPfPcieGpexDllTholdControl           =  0x00D90448,
> +	HwPfPcieGpexPmTimer                   =  0x00D90450,
> +	HwPfPcieGpexPmeTimeout                =  0x00D90454,
> +	HwPfPcieGpexAspmL1Timer               =  0x00D90458,
> +	HwPfPcieGpexAspmReqTimer              =  0x00D9045C,
> +	HwPfPcieGpexAspmL1Dis                 =  0x00D90460,
> +	HwPfPcieGpexAdvisoryErrorControl      =  0x00D90468,
> +	HwPfPcieGpexId                        =  0x00D90470,
> +	HwPfPcieGpexClasscode                 =  0x00D90474,
> +	HwPfPcieGpexSubsystemId               =  0x00D90478,
> +	HwPfPcieGpexDeviceCapabilities        =  0x00D9047C,
> +	HwPfPcieGpexLinkCapabilities          =  0x00D90480,
> +	HwPfPcieGpexFunctionNumber            =  0x00D90484,
> +	HwPfPcieGpexPmCapabilities            =  0x00D90488,
> +	HwPfPcieGpexFunctionSelect            =  0x00D9048C,
> +	HwPfPcieGpexErrorCounter              =  0x00D904AC,
> +	HwPfPcieGpexConfigReady               =  0x00D904B0,
> +	HwPfPcieGpexFcUpdateTimeout           =  0x00D904B8,
> +	HwPfPcieGpexFcUpdateTimer             =  0x00D904BC,
> +	HwPfPcieGpexVcBufferLoad              =  0x00D904C8,
> +	HwPfPcieGpexVcBufferSizeThold         =  0x00D904CC,
> +	HwPfPcieGpexVcBufferSelect            =  0x00D904D0,
> +	HwPfPcieGpexBarEnable                 =  0x00D904D4,
> +	HwPfPcieGpexBarDwordLower             =  0x00D904D8,
> +	HwPfPcieGpexBarDwordUpper             =  0x00D904DC,
> +	HwPfPcieGpexBarSelect                 =  0x00D904E0,
> +	HwPfPcieGpexCreditCounterSelect       =  0x00D904E4,
> +	HwPfPcieGpexCreditCounterStatus       =  0x00D904E8,
> +	HwPfPcieGpexTlpHeaderSelect           =  0x00D904EC,
> +	HwPfPcieGpexTlpHeaderDword0           =  0x00D904F0,
> +	HwPfPcieGpexTlpHeaderDword1           =  0x00D904F4,
> +	HwPfPcieGpexTlpHeaderDword2           =  0x00D904F8,
> +	HwPfPcieGpexTlpHeaderDword3           =  0x00D904FC,
> +	HwPfPcieGpexRelaxOrderControl         =  0x00D90500,
> +	HwPfPcieGpexBarPrefetch               =  0x00D90504,
> +	HwPfPcieGpexFcCheckControl            =  0x00D90508,
> +	HwPfPcieGpexFcUpdateTimerTraffic      =  0x00D90518,
> +	HwPfPcieGpexPhyControl0               =  0x00D9053C,
> +	HwPfPcieGpexPhyControl1               =  0x00D90544,
> +	HwPfPcieGpexPhyControl2               =  0x00D9054C,
> +	HwPfPcieGpexUserControl0              =  0x00D9055C,
> +	HwPfPcieGpexUncorrErrorStatus         =  0x00D905F0,
> +	HwPfPcieGpexRxCplError                =  0x00D90620,
> +	HwPfPcieGpexRxCplErrorDword0          =  0x00D90624,
> +	HwPfPcieGpexRxCplErrorDword1          =  0x00D90628,
> +	HwPfPcieGpexRxCplErrorDword2          =  0x00D9062C,
> +	HwPfPcieGpexPabSwResetEn              =  0x00D90630,
> +	HwPfPcieGpexGen3Control0              =  0x00D90634,
> +	HwPfPcieGpexGen3Control1              =  0x00D90638,
> +	HwPfPcieGpexGen3Control2              =  0x00D9063C,
> +	HwPfPcieGpexGen2ControlCsr            =  0x00D90640,
> +	HwPfPcieGpexTotalVfInitialVf0         =  0x00D90644,
> +	HwPfPcieGpexTotalVfInitialVf1         =  0x00D90648,
> +	HwPfPcieGpexSriovLinkDevId0           =  0x00D90684,
> +	HwPfPcieGpexSriovLinkDevId1           =  0x00D90688,
> +	HwPfPcieGpexSriovPageSize0            =  0x00D906C4,
> +	HwPfPcieGpexSriovPageSize1            =  0x00D906C8,
> +	HwPfPcieGpexIdVersion                 =  0x00D906FC,
> +	HwPfPcieGpexSriovVfOffsetStride0      =  0x00D90704,
> +	HwPfPcieGpexSriovVfOffsetStride1      =  0x00D90708,
> +	HwPfPcieGpexGen3DeskewControl         =  0x00D907B4,
> +	HwPfPcieGpexGen3EqControl             =  0x00D907B8,
> +	HwPfPcieGpexBridgeVersion             =  0x00D90800,
> +	HwPfPcieGpexBridgeCapability          =  0x00D90804,
> +	HwPfPcieGpexBridgeControl             =  0x00D90808,
> +	HwPfPcieGpexBridgeStatus              =  0x00D9080C,
> +	HwPfPcieGpexEngineActivityStatus      =  0x00D9081C,
> +	HwPfPcieGpexEngineResetControl        =  0x00D90820,
> +	HwPfPcieGpexAxiPioControl             =  0x00D90840,
> +	HwPfPcieGpexAxiPioStatus              =  0x00D90844,
> +	HwPfPcieGpexAmbaSlaveCmdStatus        =  0x00D90848,
> +	HwPfPcieGpexPexPioControl             =  0x00D908C0,
> +	HwPfPcieGpexPexPioStatus              =  0x00D908C4,
> +	HwPfPcieGpexAmbaMasterStatus          =  0x00D908C8,
> +	HwPfPcieGpexCsrSlaveCmdStatus         =  0x00D90920,
> +	HwPfPcieGpexMailboxAxiControl         =  0x00D90A50,
> +	HwPfPcieGpexMailboxAxiData            =  0x00D90A54,
> +	HwPfPcieGpexMailboxPexControl         =  0x00D90A90,
> +	HwPfPcieGpexMailboxPexData            =  0x00D90A94,
> +	HwPfPcieGpexPexInterruptEnable        =  0x00D90AD0,
> +	HwPfPcieGpexPexInterruptStatus        =  0x00D90AD4,
> +	HwPfPcieGpexPexInterruptAxiPioVector  =  0x00D90AD8,
> +	HwPfPcieGpexPexInterruptPexPioVector  =  0x00D90AE0,
> +	HwPfPcieGpexPexInterruptMiscVector    =  0x00D90AF8,
> +	HwPfPcieGpexAmbaInterruptPioEnable    =  0x00D90B00,
> +	HwPfPcieGpexAmbaInterruptMiscEnable   =  0x00D90B0C,
> +	HwPfPcieGpexAmbaInterruptPioStatus    =  0x00D90B10,
> +	HwPfPcieGpexAmbaInterruptMiscStatus   =  0x00D90B1C,
> +	HwPfPcieGpexPexPmControl              =  0x00D90B80,
> +	HwPfPcieGpexSlotMisc                  =  0x00D90B88,
> +	HwPfPcieGpexAxiAddrMappingControl     =  0x00D90BA0,
> +	HwPfPcieGpexAxiAddrMappingWindowAxiBase     =  0x00D90BA4,
> +	HwPfPcieGpexAxiAddrMappingWindowPexBaseLow  =  0x00D90BA8,
> +	HwPfPcieGpexAxiAddrMappingWindowPexBaseHigh =  0x00D90BAC,
> +	HwPfPcieGpexPexBarAddrFunc0Bar0       =  0x00D91BA0,
> +	HwPfPcieGpexPexBarAddrFunc0Bar1       =  0x00D91BA4,
> +	HwPfPcieGpexAxiAddrMappingPcieHdrParam =  0x00D95BA0,
> +	HwPfPcieGpexExtAxiAddrMappingAxiBase  =  0x00D980A0,
> +	HwPfPcieGpexPexExtBarAddrFunc0Bar0    =  0x00D984A0,
> +	HwPfPcieGpexPexExtBarAddrFunc0Bar1    =  0x00D984A4,
> +	HwPfPcieGpexAmbaInterruptFlrEnable    =  0x00D9B960,
> +	HwPfPcieGpexAmbaInterruptFlrStatus    =  0x00D9B9A0,
> +	HwPfPcieGpexExtAxiAddrMappingSize     =  0x00D9BAF0,
> +	HwPfPcieGpexPexPioAwcacheControl      =  0x00D9C300,
> +	HwPfPcieGpexPexPioArcacheControl      =  0x00D9C304,
> +	HwPfPcieGpexPabObSizeControlVc0       =  0x00D9C310
> +};
> +
> +/* TIP PF Interrupt numbers */
> +enum {
> +	ACC100_PF_INT_QMGR_AQ_OVERFLOW = 0,
> +	ACC100_PF_INT_DOORBELL_VF_2_PF = 1,
> +	ACC100_PF_INT_DMA_DL_DESC_IRQ = 2,
> +	ACC100_PF_INT_DMA_UL_DESC_IRQ = 3,
> +	ACC100_PF_INT_DMA_MLD_DESC_IRQ = 4,
> +	ACC100_PF_INT_DMA_UL5G_DESC_IRQ = 5,
> +	ACC100_PF_INT_DMA_DL5G_DESC_IRQ = 6,
> +	ACC100_PF_INT_ILLEGAL_FORMAT = 7,
> +	ACC100_PF_INT_QMGR_DISABLED_ACCESS = 8,
> +	ACC100_PF_INT_QMGR_AQ_OVERTHRESHOLD = 9,
> +	ACC100_PF_INT_ARAM_ACCESS_ERR = 10,
> +	ACC100_PF_INT_ARAM_ECC_1BIT_ERR = 11,
> +	ACC100_PF_INT_PARITY_ERR = 12,
> +	ACC100_PF_INT_QMGR_ERR = 13,
> +	ACC100_PF_INT_INT_REQ_OVERFLOW = 14,
> +	ACC100_PF_INT_APB_TIMEOUT = 15,
> +};
> +
> +#endif /* ACC100_PF_ENUM_H */
> diff --git a/drivers/baseband/acc100/acc100_vf_enum.h b/drivers/baseband/acc100/acc100_vf_enum.h
> new file mode 100644
> index 0000000..b512af3
> --- /dev/null
> +++ b/drivers/baseband/acc100/acc100_vf_enum.h
> @@ -0,0 +1,73 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2017 Intel Corporation
> + */
> +
> +#ifndef ACC100_VF_ENUM_H
> +#define ACC100_VF_ENUM_H
> +
> +/*
> + * ACC100 Register mapping on VF BAR0
> + * This is automatically generated from RDL, format may change with new RDL
> + */
> +enum {
> +	HWVfQmgrIngressAq             =  0x00000000,
> +	HWVfHiVfToPfDbellVf           =  0x00000800,
> +	HWVfHiPfToVfDbellVf           =  0x00000808,
> +	HWVfHiInfoRingBaseLoVf        =  0x00000810,
> +	HWVfHiInfoRingBaseHiVf        =  0x00000814,
> +	HWVfHiInfoRingPointerVf       =  0x00000818,
> +	HWVfHiInfoRingIntWrEnVf       =  0x00000820,
> +	HWVfHiInfoRingPf2VfWrEnVf     =  0x00000824,
> +	HWVfHiMsixVectorMapperVf      =  0x00000860,
> +	HWVfDmaFec5GulDescBaseLoRegVf =  0x00000920,
> +	HWVfDmaFec5GulDescBaseHiRegVf =  0x00000924,
> +	HWVfDmaFec5GulRespPtrLoRegVf  =  0x00000928,
> +	HWVfDmaFec5GulRespPtrHiRegVf  =  0x0000092C,
> +	HWVfDmaFec5GdlDescBaseLoRegVf =  0x00000940,
> +	HWVfDmaFec5GdlDescBaseHiRegVf =  0x00000944,
> +	HWVfDmaFec5GdlRespPtrLoRegVf  =  0x00000948,
> +	HWVfDmaFec5GdlRespPtrHiRegVf  =  0x0000094C,
> +	HWVfDmaFec4GulDescBaseLoRegVf =  0x00000960,
> +	HWVfDmaFec4GulDescBaseHiRegVf =  0x00000964,
> +	HWVfDmaFec4GulRespPtrLoRegVf  =  0x00000968,
> +	HWVfDmaFec4GulRespPtrHiRegVf  =  0x0000096C,
> +	HWVfDmaFec4GdlDescBaseLoRegVf =  0x00000980,
> +	HWVfDmaFec4GdlDescBaseHiRegVf =  0x00000984,
> +	HWVfDmaFec4GdlRespPtrLoRegVf  =  0x00000988,
> +	HWVfDmaFec4GdlRespPtrHiRegVf  =  0x0000098C,
> +	HWVfDmaDdrBaseRangeRoVf       =  0x000009A0,
> +	HWVfQmgrAqResetVf             =  0x00000E00,
> +	HWVfQmgrRingSizeVf            =  0x00000E04,
> +	HWVfQmgrGrpDepthLog20Vf       =  0x00000E08,
> +	HWVfQmgrGrpDepthLog21Vf       =  0x00000E0C,
> +	HWVfQmgrGrpFunction0Vf        =  0x00000E10,
> +	HWVfQmgrGrpFunction1Vf        =  0x00000E14,
> +	HWVfPmACntrlRegVf             =  0x00000F40,
> +	HWVfPmACountVf                =  0x00000F48,
> +	HWVfPmAKCntLoVf               =  0x00000F50,
> +	HWVfPmAKCntHiVf               =  0x00000F54,
> +	HWVfPmADeltaCntLoVf           =  0x00000F60,
> +	HWVfPmADeltaCntHiVf           =  0x00000F64,
> +	HWVfPmBCntrlRegVf             =  0x00000F80,
> +	HWVfPmBCountVf                =  0x00000F88,
> +	HWVfPmBKCntLoVf               =  0x00000F90,
> +	HWVfPmBKCntHiVf               =  0x00000F94,
> +	HWVfPmBDeltaCntLoVf           =  0x00000FA0,
> +	HWVfPmBDeltaCntHiVf           =  0x00000FA4
> +};
> +
> +/* TIP VF Interrupt numbers */
> +enum {
> +	ACC100_VF_INT_QMGR_AQ_OVERFLOW = 0,
> +	ACC100_VF_INT_DOORBELL_VF_2_PF = 1,
> +	ACC100_VF_INT_DMA_DL_DESC_IRQ = 2,
> +	ACC100_VF_INT_DMA_UL_DESC_IRQ = 3,
> +	ACC100_VF_INT_DMA_MLD_DESC_IRQ = 4,
> +	ACC100_VF_INT_DMA_UL5G_DESC_IRQ = 5,
> +	ACC100_VF_INT_DMA_DL5G_DESC_IRQ = 6,
> +	ACC100_VF_INT_ILLEGAL_FORMAT = 7,
> +	ACC100_VF_INT_QMGR_DISABLED_ACCESS = 8,
> +	ACC100_VF_INT_QMGR_AQ_OVERTHRESHOLD = 9,
> +};
> +
> +#endif /* ACC100_VF_ENUM_H */
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
> index 6f46df0..cd77570 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> @@ -5,6 +5,9 @@
>  #ifndef _RTE_ACC100_PMD_H_
>  #define _RTE_ACC100_PMD_H_
> 
> +#include "acc100_pf_enum.h"
> +#include "acc100_vf_enum.h"
> +
>  /* Helper macro for logging */
>  #define rte_bbdev_log(level, fmt, ...) \
>  	rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \
> @@ -27,6 +30,493 @@
>  #define RTE_ACC100_PF_DEVICE_ID        (0x0d5c)
>  #define RTE_ACC100_VF_DEVICE_ID        (0x0d5d)
> 
> +/* Define as 1 to use only a single FEC engine */
> +#ifndef RTE_ACC100_SINGLE_FEC
> +#define RTE_ACC100_SINGLE_FEC 0
> +#endif
> +
> +/* Values used in filling in descriptors */
> +#define ACC100_DMA_DESC_TYPE           2
> +#define ACC100_DMA_CODE_BLK_MODE       0
> +#define ACC100_DMA_BLKID_FCW           1
> +#define ACC100_DMA_BLKID_IN            2
> +#define ACC100_DMA_BLKID_OUT_ENC       1
> +#define ACC100_DMA_BLKID_OUT_HARD      1
> +#define ACC100_DMA_BLKID_OUT_SOFT      2
> +#define ACC100_DMA_BLKID_OUT_HARQ      3
> +#define ACC100_DMA_BLKID_IN_HARQ       3
> +
> +/* Values used in filling in decode FCWs */
> +#define ACC100_FCW_TD_VER              1
> +#define ACC100_FCW_TD_EXT_COLD_REG_EN  1
> +#define ACC100_FCW_TD_AUTOMAP          0x0f
> +#define ACC100_FCW_TD_RVIDX_0          2
> +#define ACC100_FCW_TD_RVIDX_1          26
> +#define ACC100_FCW_TD_RVIDX_2          50
> +#define ACC100_FCW_TD_RVIDX_3          74
> +
> +/* Values used in writing to the registers */
> +#define ACC100_REG_IRQ_EN_ALL          0x1FF83FF  /* Enable all interrupts */
> +
> +/* ACC100 Specific Dimensioning */
> +#define ACC100_SIZE_64MBYTE            (64*1024*1024)
> +/* Number of elements in an Info Ring */
> +#define ACC100_INFO_RING_NUM_ENTRIES   1024
> +/* Number of elements in HARQ layout memory */
> +#define ACC100_HARQ_LAYOUT             (64*1024*1024)
> +/* Assume offset for HARQ in memory */
> +#define ACC100_HARQ_OFFSET             (32*1024)
> +/* Mask used to calculate an index in an Info Ring array (not a byte offset) */
> +#define ACC100_INFO_RING_MASK          (ACC100_INFO_RING_NUM_ENTRIES - 1)
> +/* Number of Virtual Functions ACC100 supports */
> +#define ACC100_NUM_VFS                  16
> +#define ACC100_NUM_QGRPS                 8
> +#define ACC100_NUM_QGRPS_PER_WORD        8
> +#define ACC100_NUM_AQS                  16
> +#define MAX_ENQ_BATCH_SIZE          255
> +/* All ACC100 Registers alignment are 32bits = 4B */
> +#define BYTES_IN_WORD                 4
> +#define MAX_E_MBUF                64000
> +
> +#define GRP_ID_SHIFT    10 /* Queue Index Hierarchy */
> +#define VF_ID_SHIFT     4  /* Queue Index Hierarchy */
> +#define VF_OFFSET_QOS   16 /* offset in Memory Space specific to QoS Mon */
> +#define TMPL_PRI_0      0x03020100
> +#define TMPL_PRI_1      0x07060504
> +#define TMPL_PRI_2      0x0b0a0908
> +#define TMPL_PRI_3      0x0f0e0d0c
> +#define QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
> +#define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
> +
> +#define ACC100_NUM_TMPL  32
> +/* Mapping of signals for the available engines */
> +#define SIG_UL_5G      0
> +#define SIG_UL_5G_LAST 7
> +#define SIG_DL_5G      13
> +#define SIG_DL_5G_LAST 15
> +#define SIG_UL_4G      16
> +#define SIG_UL_4G_LAST 21
> +#define SIG_DL_4G      27
> +#define SIG_DL_4G_LAST 31
> +
> +/* max number of iterations to allocate memory block for all rings */
> +#define SW_RING_MEM_ALLOC_ATTEMPTS 5
> +#define MAX_QUEUE_DEPTH           1024
> +#define ACC100_DMA_MAX_NUM_POINTERS  14
> +#define ACC100_DMA_DESC_PADDING      8
> +#define ACC100_FCW_PADDING           12
> +#define ACC100_DESC_FCW_OFFSET       192
> +#define ACC100_DESC_SIZE             256
> +#define ACC100_DESC_OFFSET           (ACC100_DESC_SIZE / 64)
> +#define ACC100_FCW_TE_BLEN     32
> +#define ACC100_FCW_TD_BLEN     24
> +#define ACC100_FCW_LE_BLEN     32
> +#define ACC100_FCW_LD_BLEN     36
> +
> +#define ACC100_FCW_VER         2
> +#define MUX_5GDL_DESC 6
> +#define CMP_ENC_SIZE 20
> +#define CMP_DEC_SIZE 24
> +#define ENC_OFFSET (32)
> +#define DEC_OFFSET (80)
> +#define ACC100_EXT_MEM
> +#define ACC100_HARQ_OFFSET_THRESHOLD 1024
> +
> +/* Constants from K0 computation from 3GPP 38.212 Table 5.4.2.1-2 */
> +#define N_ZC_1 66 /* N = 66 Zc for BG 1 */
> +#define N_ZC_2 50 /* N = 50 Zc for BG 2 */
> +#define K0_1_1 17 /* K0 fraction numerator for rv 1 and BG 1 */
> +#define K0_1_2 13 /* K0 fraction numerator for rv 1 and BG 2 */
> +#define K0_2_1 33 /* K0 fraction numerator for rv 2 and BG 1 */
> +#define K0_2_2 25 /* K0 fraction numerator for rv 2 and BG 2 */
> +#define K0_3_1 56 /* K0 fraction numerator for rv 3 and BG 1 */
> +#define K0_3_2 43 /* K0 fraction numerator for rv 3 and BG 2 */
> +
> +/* ACC100 Configuration */
> +#define ACC100_DDR_ECC_ENABLE
> +#define ACC100_CFG_DMA_ERROR 0x3D7
> +#define ACC100_CFG_AXI_CACHE 0x11
> +#define ACC100_CFG_QMGR_HI_P 0x0F0F
> +#define ACC100_CFG_PCI_AXI 0xC003
> +#define ACC100_CFG_PCI_BRIDGE 0x40006033
> +#define ACC100_ENGINE_OFFSET 0x1000
> +#define ACC100_RESET_HI 0x20100
> +#define ACC100_RESET_LO 0x20000
> +#define ACC100_RESET_HARD 0x1FF
> +#define ACC100_ENGINES_MAX 9
> +#define LONG_WAIT 1000
> +
> +/* ACC100 DMA Descriptor triplet */
> +struct acc100_dma_triplet {
> +	uint64_t address;
> +	uint32_t blen:20,
> +		res0:4,
> +		last:1,
> +		dma_ext:1,
> +		res1:2,
> +		blkid:4;
> +} __rte_packed;
> +
> +
> +
> +/* ACC100 DMA Response Descriptor */
> +union acc100_dma_rsp_desc {
> +	uint32_t val;
> +	struct {
> +		uint32_t crc_status:1,
> +			synd_ok:1,
> +			dma_err:1,
> +			neg_stop:1,
> +			fcw_err:1,
> +			output_err:1,
> +			input_err:1,
> +			timestampEn:1,
> +			iterCountFrac:8,
> +			iter_cnt:8,
> +			rsrvd3:6,
> +			sdone:1,
> +			fdone:1;
> +		uint32_t add_info_0;
> +		uint32_t add_info_1;
> +	};
> +};
> +
> +
> +/* ACC100 Queue Manager Enqueue PCI Register */
> +union acc100_enqueue_reg_fmt {
> +	uint32_t val;
> +	struct {
> +		uint32_t num_elem:8,
> +			addr_offset:3,
> +			rsrvd:1,
> +			req_elem_addr:20;
> +	};
> +};
> +
> +/* FEC 4G Uplink Frame Control Word */
> +struct __rte_packed acc100_fcw_td {
> +	uint8_t fcw_ver:4,
> +		num_maps:4; /* Unused */
> +	uint8_t filler:6, /* Unused */
> +		rsrvd0:1,
> +		bypass_sb_deint:1;
> +	uint16_t k_pos;
> +	uint16_t k_neg; /* Unused */
> +	uint8_t c_neg; /* Unused */
> +	uint8_t c; /* Unused */
> +	uint32_t ea; /* Unused */
> +	uint32_t eb; /* Unused */
> +	uint8_t cab; /* Unused */
> +	uint8_t k0_start_col; /* Unused */
> +	uint8_t rsrvd1;
> +	uint8_t code_block_mode:1, /* Unused */
> +		turbo_crc_type:1,
> +		rsrvd2:3,
> +		bypass_teq:1, /* Unused */
> +		soft_output_en:1, /* Unused */
> +		ext_td_cold_reg_en:1;
> +	union { /* External Cold register */
> +		uint32_t ext_td_cold_reg;
> +		struct {
> +			uint32_t min_iter:4, /* Unused */
> +				max_iter:4,
> +				ext_scale:5, /* Unused */
> +				rsrvd3:3,
> +				early_stop_en:1, /* Unused */
> +				sw_soft_out_dis:1, /* Unused */
> +				sw_et_cont:1, /* Unused */
> +				sw_soft_out_saturation:1, /* Unused */
> +				half_iter_on:1, /* Unused */
> +				raw_decoder_input_on:1, /* Unused */
> +				rsrvd4:10;
> +		};
> +	};
> +};
> +
> +/* FEC 5GNR Uplink Frame Control Word */
> +struct __rte_packed acc100_fcw_ld {
> +	uint32_t FCWversion:4,
> +		qm:4,
> +		nfiller:11,
> +		BG:1,
> +		Zc:9,
> +		res0:1,
> +		synd_precoder:1,
> +		synd_post:1;
> +	uint32_t ncb:16,
> +		k0:16;
> +	uint32_t rm_e:24,
> +		hcin_en:1,
> +		hcout_en:1,
> +		crc_select:1,
> +		bypass_dec:1,
> +		bypass_intlv:1,
> +		so_en:1,
> +		so_bypass_rm:1,
> +		so_bypass_intlv:1;
> +	uint32_t hcin_offset:16,
> +		hcin_size0:16;
> +	uint32_t hcin_size1:16,
> +		hcin_decomp_mode:3,
> +		llr_pack_mode:1,
> +		hcout_comp_mode:3,
> +		res2:1,
> +		dec_convllr:4,
> +		hcout_convllr:4;
> +	uint32_t itmax:7,
> +		itstop:1,
> +		so_it:7,
> +		res3:1,
> +		hcout_offset:16;
> +	uint32_t hcout_size0:16,
> +		hcout_size1:16;
> +	uint32_t gain_i:8,
> +		gain_h:8,
> +		negstop_th:16;
> +	uint32_t negstop_it:7,
> +		negstop_en:1,
> +		res4:24;
> +};
> +
> +/* FEC 4G Downlink Frame Control Word */
> +struct __rte_packed acc100_fcw_te {
> +	uint16_t k_neg;
> +	uint16_t k_pos;
> +	uint8_t c_neg;
> +	uint8_t c;
> +	uint8_t filler;
> +	uint8_t cab;
> +	uint32_t ea:17,
> +		rsrvd0:15;
> +	uint32_t eb:17,
> +		rsrvd1:15;
> +	uint16_t ncb_neg;
> +	uint16_t ncb_pos;
> +	uint8_t rv_idx0:2,
> +		rsrvd2:2,
> +		rv_idx1:2,
> +		rsrvd3:2;
> +	uint8_t bypass_rv_idx0:1,
> +		bypass_rv_idx1:1,
> +		bypass_rm:1,
> +		rsrvd4:5;
> +	uint8_t rsrvd5:1,
> +		rsrvd6:3,
> +		code_block_crc:1,
> +		rsrvd7:3;
> +	uint8_t code_block_mode:1,
> +		rsrvd8:7;
> +	uint64_t rsrvd9;
> +};
> +
> +/* FEC 5GNR Downlink Frame Control Word */
> +struct __rte_packed acc100_fcw_le {
> +	uint32_t FCWversion:4,
> +		qm:4,
> +		nfiller:11,
> +		BG:1,
> +		Zc:9,
> +		res0:3;
> +	uint32_t ncb:16,
> +		k0:16;
> +	uint32_t rm_e:24,
> +		res1:2,
> +		crc_select:1,
> +		res2:1,
> +		bypass_intlv:1,
> +		res3:3;
> +	uint32_t res4_a:12,
> +		mcb_count:3,
> +		res4_b:17;
> +	uint32_t res5;
> +	uint32_t res6;
> +	uint32_t res7;
> +	uint32_t res8;
> +};
> +
> +/* ACC100 DMA Request Descriptor */
> +struct __rte_packed acc100_dma_req_desc {
> +	union {
> +		struct{
> +			uint32_t type:4,
> +				rsrvd0:26,
> +				sdone:1,
> +				fdone:1;
> +			uint32_t rsrvd1;
> +			uint32_t rsrvd2;
> +			uint32_t pass_param:8,
> +				sdone_enable:1,
> +				irq_enable:1,
> +				timeStampEn:1,
> +				res0:5,
> +				numCBs:4,
> +				res1:4,
> +				m2dlen:4,
> +				d2mlen:4;
> +		};
> +		struct{
> +			uint32_t word0;
> +			uint32_t word1;
> +			uint32_t word2;
> +			uint32_t word3;
> +		};
> +	};
> +	struct acc100_dma_triplet data_ptrs[ACC100_DMA_MAX_NUM_POINTERS];
> +
> +	/* Virtual addresses used to retrieve SW context info */
> +	union {
> +		void *op_addr;
> +		uint64_t pad1;  /* pad to 64 bits */
> +	};
> +	/*
> +	 * Stores additional information needed for driver processing:
> +	 * - last_desc_in_batch - flag used to mark last descriptor (CB)
> +	 *                        in batch
> +	 * - cbs_in_tb - stores information about total number of Code Blocks
> +	 *               in currently processed Transport Block
> +	 */
> +	union {
> +		struct {
> +			union {
> +				struct acc100_fcw_ld fcw_ld;
> +				struct acc100_fcw_td fcw_td;
> +				struct acc100_fcw_le fcw_le;
> +				struct acc100_fcw_te fcw_te;
> +				uint32_t pad2[ACC100_FCW_PADDING];
> +			};
> +			uint32_t last_desc_in_batch :8,
> +				cbs_in_tb:8,
> +				pad4 : 16;
> +		};
> +		uint64_t pad3[ACC100_DMA_DESC_PADDING]; /* pad to 64 bits */
> +	};
> +};
> +
> +/* ACC100 DMA Descriptor */
> +union acc100_dma_desc {
> +	struct acc100_dma_req_desc req;
> +	union acc100_dma_rsp_desc rsp;
> +};
> +
> +
> +/* Union describing HARQ layout entry */
> +union acc100_harq_layout_data {
> +	uint32_t val;
> +	struct {
> +		uint16_t offset;
> +		uint16_t size0;
> +	};
> +} __rte_packed;
> +
> +
> +/* Union describing Info Ring entry */
> +union acc100_info_ring_data {
> +	uint32_t val;
> +	struct {
> +		union {
> +			uint16_t detailed_info;
> +			struct {
> +				uint16_t aq_id: 4;
> +				uint16_t qg_id: 4;
> +				uint16_t vf_id: 6;
> +				uint16_t reserved: 2;
> +			};
> +		};
> +		uint16_t int_nb: 7;
> +		uint16_t msi_0: 1;
> +		uint16_t vf2pf: 6;
> +		uint16_t loop: 1;
> +		uint16_t valid: 1;
> +	};
> +} __rte_packed;
> +
> +struct acc100_registry_addr {
> +	unsigned int dma_ring_dl5g_hi;
> +	unsigned int dma_ring_dl5g_lo;
> +	unsigned int dma_ring_ul5g_hi;
> +	unsigned int dma_ring_ul5g_lo;
> +	unsigned int dma_ring_dl4g_hi;
> +	unsigned int dma_ring_dl4g_lo;
> +	unsigned int dma_ring_ul4g_hi;
> +	unsigned int dma_ring_ul4g_lo;
> +	unsigned int ring_size;
> +	unsigned int info_ring_hi;
> +	unsigned int info_ring_lo;
> +	unsigned int info_ring_en;
> +	unsigned int info_ring_ptr;
> +	unsigned int tail_ptrs_dl5g_hi;
> +	unsigned int tail_ptrs_dl5g_lo;
> +	unsigned int tail_ptrs_ul5g_hi;
> +	unsigned int tail_ptrs_ul5g_lo;
> +	unsigned int tail_ptrs_dl4g_hi;
> +	unsigned int tail_ptrs_dl4g_lo;
> +	unsigned int tail_ptrs_ul4g_hi;
> +	unsigned int tail_ptrs_ul4g_lo;
> +	unsigned int depth_log0_offset;
> +	unsigned int depth_log1_offset;
> +	unsigned int qman_group_func;
> +	unsigned int ddr_range;
> +};
> +
> +/* Structure holding registry addresses for PF */
> +static const struct acc100_registry_addr pf_reg_addr = {
> +	.dma_ring_dl5g_hi = HWPfDmaFec5GdlDescBaseHiRegVf,
> +	.dma_ring_dl5g_lo = HWPfDmaFec5GdlDescBaseLoRegVf,
> +	.dma_ring_ul5g_hi = HWPfDmaFec5GulDescBaseHiRegVf,
> +	.dma_ring_ul5g_lo = HWPfDmaFec5GulDescBaseLoRegVf,
> +	.dma_ring_dl4g_hi = HWPfDmaFec4GdlDescBaseHiRegVf,
> +	.dma_ring_dl4g_lo = HWPfDmaFec4GdlDescBaseLoRegVf,
> +	.dma_ring_ul4g_hi = HWPfDmaFec4GulDescBaseHiRegVf,
> +	.dma_ring_ul4g_lo = HWPfDmaFec4GulDescBaseLoRegVf,
> +	.ring_size = HWPfQmgrRingSizeVf,
> +	.info_ring_hi = HWPfHiInfoRingBaseHiRegPf,
> +	.info_ring_lo = HWPfHiInfoRingBaseLoRegPf,
> +	.info_ring_en = HWPfHiInfoRingIntWrEnRegPf,
> +	.info_ring_ptr = HWPfHiInfoRingPointerRegPf,
> +	.tail_ptrs_dl5g_hi = HWPfDmaFec5GdlRespPtrHiRegVf,
> +	.tail_ptrs_dl5g_lo = HWPfDmaFec5GdlRespPtrLoRegVf,
> +	.tail_ptrs_ul5g_hi = HWPfDmaFec5GulRespPtrHiRegVf,
> +	.tail_ptrs_ul5g_lo = HWPfDmaFec5GulRespPtrLoRegVf,
> +	.tail_ptrs_dl4g_hi = HWPfDmaFec4GdlRespPtrHiRegVf,
> +	.tail_ptrs_dl4g_lo = HWPfDmaFec4GdlRespPtrLoRegVf,
> +	.tail_ptrs_ul4g_hi = HWPfDmaFec4GulRespPtrHiRegVf,
> +	.tail_ptrs_ul4g_lo = HWPfDmaFec4GulRespPtrLoRegVf,
> +	.depth_log0_offset = HWPfQmgrGrpDepthLog20Vf,
> +	.depth_log1_offset = HWPfQmgrGrpDepthLog21Vf,
> +	.qman_group_func = HWPfQmgrGrpFunction0,
> +	.ddr_range = HWPfDmaVfDdrBaseRw,
> +};
> +
> +/* Structure holding registry addresses for VF */
> +static const struct acc100_registry_addr vf_reg_addr = {
> +	.dma_ring_dl5g_hi = HWVfDmaFec5GdlDescBaseHiRegVf,
> +	.dma_ring_dl5g_lo = HWVfDmaFec5GdlDescBaseLoRegVf,
> +	.dma_ring_ul5g_hi = HWVfDmaFec5GulDescBaseHiRegVf,
> +	.dma_ring_ul5g_lo = HWVfDmaFec5GulDescBaseLoRegVf,
> +	.dma_ring_dl4g_hi = HWVfDmaFec4GdlDescBaseHiRegVf,
> +	.dma_ring_dl4g_lo = HWVfDmaFec4GdlDescBaseLoRegVf,
> +	.dma_ring_ul4g_hi = HWVfDmaFec4GulDescBaseHiRegVf,
> +	.dma_ring_ul4g_lo = HWVfDmaFec4GulDescBaseLoRegVf,
> +	.ring_size = HWVfQmgrRingSizeVf,
> +	.info_ring_hi = HWVfHiInfoRingBaseHiVf,
> +	.info_ring_lo = HWVfHiInfoRingBaseLoVf,
> +	.info_ring_en = HWVfHiInfoRingIntWrEnVf,
> +	.info_ring_ptr = HWVfHiInfoRingPointerVf,
> +	.tail_ptrs_dl5g_hi = HWVfDmaFec5GdlRespPtrHiRegVf,
> +	.tail_ptrs_dl5g_lo = HWVfDmaFec5GdlRespPtrLoRegVf,
> +	.tail_ptrs_ul5g_hi = HWVfDmaFec5GulRespPtrHiRegVf,
> +	.tail_ptrs_ul5g_lo = HWVfDmaFec5GulRespPtrLoRegVf,
> +	.tail_ptrs_dl4g_hi = HWVfDmaFec4GdlRespPtrHiRegVf,
> +	.tail_ptrs_dl4g_lo = HWVfDmaFec4GdlRespPtrLoRegVf,
> +	.tail_ptrs_ul4g_hi = HWVfDmaFec4GulRespPtrHiRegVf,
> +	.tail_ptrs_ul4g_lo = HWVfDmaFec4GulRespPtrLoRegVf,
> +	.depth_log0_offset = HWVfQmgrGrpDepthLog20Vf,
> +	.depth_log1_offset = HWVfQmgrGrpDepthLog21Vf,
> +	.qman_group_func = HWVfQmgrGrpFunction0Vf,
> +	.ddr_range = HWVfDmaDdrBaseRangeRoVf,
> +};
> +
>  /* Private data structure for each ACC100 device */
>  struct acc100_device {
>  	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
> --
> 1.8.3.1
Reviewed-by: Rosen Xu <rosen.xu@intel.com>

^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v4 04/11] baseband/acc100: add queue configuration
  2020-09-04 17:54     ` [dpdk-dev] [PATCH v4 04/11] baseband/acc100: add queue configuration Nicolas Chautru
@ 2020-09-15  2:31       ` Xu, Rosen
  2020-09-18  3:01       ` Liu, Tianjiao
  1 sibling, 0 replies; 213+ messages in thread
From: Xu, Rosen @ 2020-09-15  2:31 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, akhil.goyal
  Cc: Richardson, Bruce, dave.burley, aidan.goddard, Yigit, Ferruh,
	Liu, Tianjiao

Hi,

> -----Original Message-----
> From: Chautru, Nicolas <nicolas.chautru@intel.com>
> Sent: Saturday, September 05, 2020 1:54
> To: dev@dpdk.org; akhil.goyal@nxp.com
> Cc: Richardson, Bruce <bruce.richardson@intel.com>; Xu, Rosen
> <rosen.xu@intel.com>; dave.burley@accelercomm.com;
> aidan.goddard@accelercomm.com; Yigit, Ferruh <ferruh.yigit@intel.com>;
> Liu, Tianjiao <tianjiao.liu@intel.com>; Chautru, Nicolas
> <nicolas.chautru@intel.com>
> Subject: [PATCH v4 04/11] baseband/acc100: add queue configuration
> 
> Adding function to create and configure queues for the device. Still no
> capability.
> 
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
>  drivers/baseband/acc100/rte_acc100_pmd.c | 420 ++++++++++++++++++++++++++++++-
>  drivers/baseband/acc100/rte_acc100_pmd.h |  45 ++++
>  2 files changed, 464 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
> index 7807a30..7a21c57 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> @@ -26,6 +26,22 @@
>  RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
>  #endif
> 
> +/* Write to MMIO register address */
> +static inline void
> +mmio_write(void *addr, uint32_t value)
> +{
> +	*((volatile uint32_t *)(addr)) = rte_cpu_to_le_32(value);
> +}
> +
> +/* Write a register of an ACC100 device */
> +static inline void
> +acc100_reg_write(struct acc100_device *d, uint32_t offset, uint32_t payload)
> +{
> +	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
> +	mmio_write(reg_addr, payload);
> +	usleep(1000);
> +}
> +
>  /* Read a register of a ACC100 device */
>  static inline uint32_t
>  acc100_reg_read(struct acc100_device *d, uint32_t offset)
> @@ -36,6 +52,22 @@
>  	return rte_le_to_cpu_32(ret);
>  }
> 
> +/* Basic Implementation of Log2 for exact 2^N */
> +static inline uint32_t
> +log2_basic(uint32_t value)
> +{
> +	return (value == 0) ? 0 : __builtin_ctz(value);
> +}
> +
> +/* Calculate memory alignment offset assuming alignment is 2^N */
> +static inline uint32_t
> +calc_mem_alignment_offset(void *unaligned_virt_mem, uint32_t alignment)
> +{
> +	rte_iova_t unaligned_phy_mem = rte_malloc_virt2iova(unaligned_virt_mem);
> +	return (uint32_t)(alignment - (unaligned_phy_mem & (alignment-1)));
> +}
> +
>  /* Calculate the offset of the enqueue register */
>  static inline uint32_t
>  queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
> @@ -204,10 +236,393 @@
>  			acc100_conf->q_dl_5g.aq_depth_log2);
>  }
> 
> +static void
> +free_base_addresses(void **base_addrs, int size)
> +{
> +	int i;
> +	for (i = 0; i < size; i++)
> +		rte_free(base_addrs[i]);
> +}
> +
> +static inline uint32_t
> +get_desc_len(void)
> +{
> +	return sizeof(union acc100_dma_desc);
> +}
> +
> +/* Allocate the 2 * 64MB block for the sw rings */
> +static int
> +alloc_2x64mb_sw_rings_mem(struct rte_bbdev *dev, struct acc100_device *d,
> +		int socket)
> +{
> +	uint32_t sw_ring_size = ACC100_SIZE_64MBYTE;
> +	d->sw_rings_base = rte_zmalloc_socket(dev->device->driver->name,
> +			2 * sw_ring_size, RTE_CACHE_LINE_SIZE, socket);
> +	if (d->sw_rings_base == NULL) {
> +		rte_bbdev_log(ERR, "Failed to allocate memory for %s:%u",
> +				dev->device->driver->name,
> +				dev->data->dev_id);
> +		return -ENOMEM;
> +	}
> +	memset(d->sw_rings_base, 0, ACC100_SIZE_64MBYTE);
> +	uint32_t next_64mb_align_offset = calc_mem_alignment_offset(
> +			d->sw_rings_base, ACC100_SIZE_64MBYTE);
> +	d->sw_rings = RTE_PTR_ADD(d->sw_rings_base, next_64mb_align_offset);
> +	d->sw_rings_phys = rte_malloc_virt2iova(d->sw_rings_base) +
> +			next_64mb_align_offset;
> +	d->sw_ring_size = MAX_QUEUE_DEPTH * get_desc_len();
> +	d->sw_ring_max_depth = d->sw_ring_size / get_desc_len();
> +
> +	return 0;
> +}
> +
> +/* Attempt to allocate minimised memory space for sw rings */
> +static void
> +alloc_sw_rings_min_mem(struct rte_bbdev *dev, struct acc100_device *d,
> +		uint16_t num_queues, int socket)
> +{
> +	rte_iova_t sw_rings_base_phy, next_64mb_align_addr_phy;
> +	uint32_t next_64mb_align_offset;
> +	rte_iova_t sw_ring_phys_end_addr;
> +	void *base_addrs[SW_RING_MEM_ALLOC_ATTEMPTS];
> +	void *sw_rings_base;
> +	int i = 0;
> +	uint32_t q_sw_ring_size = MAX_QUEUE_DEPTH * get_desc_len();
> +	uint32_t dev_sw_ring_size = q_sw_ring_size * num_queues;
> +
> +	/* Find an aligned block of memory to store sw rings */
> +	while (i < SW_RING_MEM_ALLOC_ATTEMPTS) {
> +		/*
> +		 * sw_ring allocated memory is guaranteed to be aligned to
> +		 * q_sw_ring_size at the condition that the requested size is
> +		 * less than the page size
> +		 */
> +		sw_rings_base = rte_zmalloc_socket(
> +				dev->device->driver->name,
> +				dev_sw_ring_size, q_sw_ring_size, socket);
> +
> +		if (sw_rings_base == NULL) {
> +			rte_bbdev_log(ERR,
> +					"Failed to allocate memory for %s:%u",
> +					dev->device->driver->name,
> +					dev->data->dev_id);
> +			break;
> +		}
> +
> +		sw_rings_base_phy = rte_malloc_virt2iova(sw_rings_base);
> +		next_64mb_align_offset = calc_mem_alignment_offset(
> +				sw_rings_base, ACC100_SIZE_64MBYTE);
> +		next_64mb_align_addr_phy = sw_rings_base_phy +
> +				next_64mb_align_offset;
> +		sw_ring_phys_end_addr = sw_rings_base_phy + dev_sw_ring_size;
> +
> +		/* Check if the end of the sw ring memory block is before the
> +		 * start of next 64MB aligned mem address
> +		 */
> +		if (sw_ring_phys_end_addr < next_64mb_align_addr_phy) {
> +			d->sw_rings_phys = sw_rings_base_phy;
> +			d->sw_rings = sw_rings_base;
> +			d->sw_rings_base = sw_rings_base;
> +			d->sw_ring_size = q_sw_ring_size;
> +			d->sw_ring_max_depth = MAX_QUEUE_DEPTH;
> +			break;
> +		}
> +		/* Store the address of the unaligned mem block */
> +		base_addrs[i] = sw_rings_base;
> +		i++;
> +	}
> +
> +	/* Free all unaligned blocks of mem allocated in the loop */
> +	free_base_addresses(base_addrs, i);
> +}
> +
> +
> +/* Allocate 64MB memory used for all software rings */
> +static int
> +acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
> +{
> +	uint32_t phys_low, phys_high, payload;
> +	struct acc100_device *d = dev->data->dev_private;
> +	const struct acc100_registry_addr *reg_addr;
> +
> +	if (d->pf_device && !d->acc100_conf.pf_mode_en) {
> +		rte_bbdev_log(NOTICE,
> +				"%s has PF mode disabled. This PF can't be used.",
> +				dev->data->name);
> +		return -ENODEV;
> +	}
> +
> +	alloc_sw_rings_min_mem(dev, d, num_queues, socket_id);
> +
> +	/* If minimal memory space approach failed, then allocate
> +	 * the 2 * 64MB block for the sw rings
> +	 */
> +	if (d->sw_rings == NULL)
> +		alloc_2x64mb_sw_rings_mem(dev, d, socket_id);
> +
> +	/* Configure ACC100 with the base address for DMA descriptor rings
> +	 * Same descriptor rings used for UL and DL DMA Engines
> +	 * Note : Assuming only VF0 bundle is used for PF mode
> +	 */
> +	phys_high = (uint32_t)(d->sw_rings_phys >> 32);
> +	phys_low  = (uint32_t)(d->sw_rings_phys & ~(ACC100_SIZE_64MBYTE-1));
> +
> +	/* Choose correct registry addresses for the device type */
> +	if (d->pf_device)
> +		reg_addr = &pf_reg_addr;
> +	else
> +		reg_addr = &vf_reg_addr;
> +
> +	/* Read the populated cfg from ACC100 registers */
> +	fetch_acc100_config(dev);
> +
> +	/* Mark as configured properly */
> +	d->configured = true;
> +
> +	/* Release AXI from PF */
> +	if (d->pf_device)
> +		acc100_reg_write(d, HWPfDmaAxiControl, 1);
> +
> +	acc100_reg_write(d, reg_addr->dma_ring_ul5g_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->dma_ring_ul5g_lo, phys_low);
> +	acc100_reg_write(d, reg_addr->dma_ring_dl5g_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->dma_ring_dl5g_lo, phys_low);
> +	acc100_reg_write(d, reg_addr->dma_ring_ul4g_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->dma_ring_ul4g_lo, phys_low);
> +	acc100_reg_write(d, reg_addr->dma_ring_dl4g_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->dma_ring_dl4g_lo, phys_low);
> +
> +	/*
> +	 * Configure Ring Size to the max queue ring size
> +	 * (used for wrapping purpose)
> +	 */
> +	payload = log2_basic(d->sw_ring_size / 64);
> +	acc100_reg_write(d, reg_addr->ring_size, payload);
> +
> +	/* Configure tail pointer for use when SDONE enabled */
> +	d->tail_ptrs = rte_zmalloc_socket(
> +			dev->device->driver->name,
> +			ACC100_NUM_QGRPS * ACC100_NUM_AQS * sizeof(uint32_t),
> +			RTE_CACHE_LINE_SIZE, socket_id);
> +	if (d->tail_ptrs == NULL) {
> +		rte_bbdev_log(ERR, "Failed to allocate tail ptr for %s:%u",
> +				dev->device->driver->name,
> +				dev->data->dev_id);
> +		rte_free(d->sw_rings);
> +		return -ENOMEM;
> +	}
> +	d->tail_ptr_phys = rte_malloc_virt2iova(d->tail_ptrs);
> +
> +	phys_high = (uint32_t)(d->tail_ptr_phys >> 32);
> +	phys_low  = (uint32_t)(d->tail_ptr_phys);
> +	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_lo, phys_low);
> +	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_lo, phys_low);
> +	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_lo, phys_low);
> +	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
> +
> +	d->harq_layout = rte_zmalloc_socket("HARQ Layout",
> +			ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
> +			RTE_CACHE_LINE_SIZE, dev->data->socket_id);
> +
> +	rte_bbdev_log_debug(
> +			"ACC100 (%s) configured  sw_rings = %p, sw_rings_phys = %#"
> +			PRIx64, dev->data->name, d->sw_rings, d->sw_rings_phys);
> +
> +	return 0;
> +}
> +
>  /* Free 64MB memory used for software rings */
>  static int
> -acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
> +acc100_dev_close(struct rte_bbdev *dev)
>  {
> +	struct acc100_device *d = dev->data->dev_private;
> +	if (d->sw_rings_base != NULL) {
> +		rte_free(d->tail_ptrs);
> +		rte_free(d->sw_rings_base);
> +		d->sw_rings_base = NULL;
> +	}
> +	usleep(1000);
> +	return 0;
> +}
> +
> +
> +/**
> + * Report a ACC100 queue index which is free
> + * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
> + * Note : Only supporting VF0 Bundle for PF mode
> + */
> +static int
> +acc100_find_free_queue_idx(struct rte_bbdev *dev,
> +		const struct rte_bbdev_queue_conf *conf)
> +{
> +	struct acc100_device *d = dev->data->dev_private;
> +	int op_2_acc[5] = {0, UL_4G, DL_4G, UL_5G, DL_5G};
> +	int acc = op_2_acc[conf->op_type];
> +	struct rte_q_topology_t *qtop = NULL;
> +	qtopFromAcc(&qtop, acc, &(d->acc100_conf));
> +	if (qtop == NULL)
> +		return -1;
> +	/* Identify matching QGroup Index which are sorted in priority order */
> +	uint16_t group_idx = qtop->first_qgroup_index;
> +	group_idx += conf->priority;
> +	if (group_idx >= ACC100_NUM_QGRPS ||
> +			conf->priority >= qtop->num_qgroups) {
> +		rte_bbdev_log(INFO, "Invalid Priority on %s, priority %u",
> +				dev->data->name, conf->priority);
> +		return -1;
> +	}
> +	/* Find a free AQ_idx  */
> +	uint16_t aq_idx;
> +	for (aq_idx = 0; aq_idx < qtop->num_aqs_per_groups; aq_idx++) {
> +		if (((d->q_assigned_bit_map[group_idx] >> aq_idx) & 0x1) == 0) {
> +			/* Mark the Queue as assigned */
> +			d->q_assigned_bit_map[group_idx] |= (1 << aq_idx);
> +			/* Report the AQ Index */
> +			return (group_idx << GRP_ID_SHIFT) + aq_idx;
> +		}
> +	}
> +	rte_bbdev_log(INFO, "Failed to find free queue on %s, priority %u",
> +			dev->data->name, conf->priority);
> +	return -1;
> +}
> +
> +/* Setup ACC100 queue */
> +static int
> +acc100_queue_setup(struct rte_bbdev *dev, uint16_t queue_id,
> +		const struct rte_bbdev_queue_conf *conf)
> +{
> +	struct acc100_device *d = dev->data->dev_private;
> +	struct acc100_queue *q;
> +	int16_t q_idx;
> +
> +	/* Allocate the queue data structure. */
> +	q = rte_zmalloc_socket(dev->device->driver->name, sizeof(*q),
> +			RTE_CACHE_LINE_SIZE, conf->socket);
> +	if (q == NULL) {
> +		rte_bbdev_log(ERR, "Failed to allocate queue memory");
> +		return -ENOMEM;
> +	}
> +
> +	q->d = d;
> +	q->ring_addr = RTE_PTR_ADD(d->sw_rings, (d->sw_ring_size * queue_id));
> +	q->ring_addr_phys = d->sw_rings_phys + (d->sw_ring_size * queue_id);
> +
> +	/* Prepare the Ring with default descriptor format */
> +	union acc100_dma_desc *desc = NULL;
> +	unsigned int desc_idx, b_idx;
> +	int fcw_len = (conf->op_type == RTE_BBDEV_OP_LDPC_ENC ?
> +		ACC100_FCW_LE_BLEN : (conf->op_type == RTE_BBDEV_OP_TURBO_DEC ?
> +		ACC100_FCW_TD_BLEN : ACC100_FCW_LD_BLEN));
> +
> +	for (desc_idx = 0; desc_idx < d->sw_ring_max_depth; desc_idx++) {
> +		desc = q->ring_addr + desc_idx;
> +		desc->req.word0 = ACC100_DMA_DESC_TYPE;
> +		desc->req.word1 = 0; /**< Timestamp */
> +		desc->req.word2 = 0;
> +		desc->req.word3 = 0;
> +		uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
> +		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
> +		desc->req.data_ptrs[0].blen = fcw_len;
> +		desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
> +		desc->req.data_ptrs[0].last = 0;
> +		desc->req.data_ptrs[0].dma_ext = 0;
> +		for (b_idx = 1; b_idx < ACC100_DMA_MAX_NUM_POINTERS - 1;
> +				b_idx++) {
> +			desc->req.data_ptrs[b_idx].blkid = ACC100_DMA_BLKID_IN;
> +			desc->req.data_ptrs[b_idx].last = 1;
> +			desc->req.data_ptrs[b_idx].dma_ext = 0;
> +			b_idx++;
> +			desc->req.data_ptrs[b_idx].blkid =
> +					ACC100_DMA_BLKID_OUT_ENC;
> +			desc->req.data_ptrs[b_idx].last = 1;
> +			desc->req.data_ptrs[b_idx].dma_ext = 0;
> +		}
> +		/* Preset some fields of LDPC FCW */
> +		desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
> +		desc->req.fcw_ld.gain_i = 1;
> +		desc->req.fcw_ld.gain_h = 1;
> +	}
> +
> +	q->lb_in = rte_zmalloc_socket(dev->device->driver->name,
> +			RTE_CACHE_LINE_SIZE,
> +			RTE_CACHE_LINE_SIZE, conf->socket);
> +	if (q->lb_in == NULL) {
> +		rte_bbdev_log(ERR, "Failed to allocate lb_in memory");
> +		return -ENOMEM;
> +	}
> +	q->lb_in_addr_phys = rte_malloc_virt2iova(q->lb_in);
> +	q->lb_out = rte_zmalloc_socket(dev->device->driver->name,
> +			RTE_CACHE_LINE_SIZE,
> +			RTE_CACHE_LINE_SIZE, conf->socket);
> +	if (q->lb_out == NULL) {
> +		rte_bbdev_log(ERR, "Failed to allocate lb_out memory");
> +		return -ENOMEM;
> +	}
> +	q->lb_out_addr_phys = rte_malloc_virt2iova(q->lb_out);
> +
> +	/*
> +	 * Software queue ring wraps synchronously with the HW when it reaches
> +	 * the boundary of the maximum allocated queue size, no matter what the
> +	 * sw queue size is. This wrapping is guarded by setting the wrap_mask
> +	 * to represent the maximum queue size as allocated at the time when
> +	 * the device has been setup (in configure()).
> +	 *
> +	 * The queue depth is set to the queue size value (conf->queue_size).
> +	 * This limits the occupancy of the queue at any point of time, so that
> +	 * the queue does not get swamped with enqueue requests.
> +	 */
> +	q->sw_ring_depth = conf->queue_size;
> +	q->sw_ring_wrap_mask = d->sw_ring_max_depth - 1;
> +
> +	q->op_type = conf->op_type;
> +
> +	q_idx = acc100_find_free_queue_idx(dev, conf);
> +	if (q_idx == -1) {
> +		rte_free(q);
> +		return -1;
> +	}
> +
> +	q->qgrp_id = (q_idx >> GRP_ID_SHIFT) & 0xF;
> +	q->vf_id = (q_idx >> VF_ID_SHIFT)  & 0x3F;
> +	q->aq_id = q_idx & 0xF;
> +	q->aq_depth = (conf->op_type ==  RTE_BBDEV_OP_TURBO_DEC) ?
> +			(1 << d->acc100_conf.q_ul_4g.aq_depth_log2) :
> +			(1 << d->acc100_conf.q_dl_4g.aq_depth_log2);
> +
> +	q->mmio_reg_enqueue = RTE_PTR_ADD(d->mmio_base,
> +			queue_offset(d->pf_device,
> +					q->vf_id, q->qgrp_id, q->aq_id));
> +
> +	rte_bbdev_log_debug(
> +			"Setup dev%u q%u: qgrp_id=%u, vf_id=%u, aq_id=%u, aq_depth=%u, mmio_reg_enqueue=%p",
> +			dev->data->dev_id, queue_id, q->qgrp_id, q->vf_id,
> +			q->aq_id, q->aq_depth, q->mmio_reg_enqueue);
> +
> +	dev->data->queues[queue_id].queue_private = q;
> +	return 0;
> +}
> +
> +/* Release ACC100 queue */
> +static int
> +acc100_queue_release(struct rte_bbdev *dev, uint16_t q_id)
> +{
> +	struct acc100_device *d = dev->data->dev_private;
> +	struct acc100_queue *q = dev->data->queues[q_id].queue_private;
> +
> +	if (q != NULL) {
> +		/* Mark the Queue as un-assigned */
> +		d->q_assigned_bit_map[q->qgrp_id] &= (0xFFFFFFFF -
> +				(1 << q->aq_id));
> +		rte_free(q->lb_in);
> +		rte_free(q->lb_out);
> +		rte_free(q);
> +		dev->data->queues[q_id].queue_private = NULL;
> +	}
> +
>  	return 0;
>  }
> 
> @@ -258,8 +673,11 @@
>  }
> 
>  static const struct rte_bbdev_ops acc100_bbdev_ops = {
> +	.setup_queues = acc100_setup_queues,
>  	.close = acc100_dev_close,
>  	.info_get = acc100_dev_info_get,
> +	.queue_setup = acc100_queue_setup,
> +	.queue_release = acc100_queue_release,
>  };
> 
>  /* ACC100 PCI PF address map */
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h
> b/drivers/baseband/acc100/rte_acc100_pmd.h
> index 662e2c8..0e2b79c 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> @@ -518,11 +518,56 @@ struct acc100_registry_addr {
>  	.ddr_range = HWVfDmaDdrBaseRangeRoVf,
>  };
> 
> +/* Structure associated with each queue. */
> +struct __rte_cache_aligned acc100_queue {
> +	union acc100_dma_desc *ring_addr;  /* Virtual address of sw ring */
> +	rte_iova_t ring_addr_phys;  /* Physical address of software ring */
> +	uint32_t sw_ring_head;  /* software ring head */
> +	uint32_t sw_ring_tail;  /* software ring tail */
> +	/* software ring size (descriptors, not bytes) */
> +	uint32_t sw_ring_depth;
> +	/* mask used to wrap enqueued descriptors on the sw ring */
> +	uint32_t sw_ring_wrap_mask;
> +	/* MMIO register used to enqueue descriptors */
> +	void *mmio_reg_enqueue;
> +	uint8_t vf_id;  /* VF ID (max = 63) */
> +	uint8_t qgrp_id;  /* Queue Group ID */
> +	uint16_t aq_id;  /* Atomic Queue ID */
> +	uint16_t aq_depth;  /* Depth of atomic queue */
> +	uint32_t aq_enqueued;  /* Count how many "batches" have been enqueued */
> +	uint32_t aq_dequeued;  /* Count how many "batches" have been dequeued */
> +	uint32_t irq_enable;  /* Enable ops dequeue interrupts if set to 1 */
> +	struct rte_mempool *fcw_mempool;  /* FCW mempool */
> +	enum rte_bbdev_op_type op_type;  /* Type of this Queue: TE or TD */
> +	/* Internal Buffers for loopback input */
> +	uint8_t *lb_in;
> +	uint8_t *lb_out;
> +	rte_iova_t lb_in_addr_phys;
> +	rte_iova_t lb_out_addr_phys;
> +	struct acc100_device *d;
> +};
> +
>  /* Private data structure for each ACC100 device */
>  struct acc100_device {
>  	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
> +	void *sw_rings_base;  /* Base addr of un-aligned memory for sw rings */
> +	void *sw_rings;  /* 64MBs of 64MB aligned memory for sw rings */
> +	rte_iova_t sw_rings_phys;  /* Physical address of sw_rings */
> +	/* Virtual address of the info memory routed to this function under
> +	 * operation, whether it is PF or VF.
> +	 */
> +	union acc100_harq_layout_data *harq_layout;
> +	uint32_t sw_ring_size;
>  	uint32_t ddr_size; /* Size in kB */
> +	uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */
> +	rte_iova_t tail_ptr_phys; /* Physical address of tail pointers */
> +	/* Max number of entries available for each queue in device, depending
> +	 * on how many queues are enabled with configure()
> +	 */
> +	uint32_t sw_ring_max_depth;
>  	struct acc100_conf acc100_conf; /* ACC100 Initial configuration */
> +	/* Bitmap capturing which Queues have already been assigned */
> +	uint16_t q_assigned_bit_map[ACC100_NUM_QGRPS];
>  	bool pf_device; /**< True if this is a PF ACC100 device */
>  	bool configured; /**< True if this ACC100 device is configured */
>  };
> --
> 1.8.3.1
Reviewed-by: Rosen Xu<rosen.xu@intel.com>

^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions
  2020-09-03 20:45           ` Chautru, Nicolas
  2020-09-15  1:45             ` Chautru, Nicolas
@ 2020-09-15 10:21             ` Ananyev, Konstantin
  1 sibling, 0 replies; 213+ messages in thread
From: Ananyev, Konstantin @ 2020-09-15 10:21 UTC (permalink / raw)
  To: Chautru, Nicolas, Xu, Rosen, dev, akhil.goyal; +Cc: Richardson, Bruce


> > > > > > +
> > > > > > +static inline char *
> > > > > > +mbuf_append(struct rte_mbuf *m_head, struct rte_mbuf *m,
> > > > > > +		uint16_t len)
> > > > > > +{
> > > > > > +	if (unlikely(len > rte_pktmbuf_tailroom(m)))
> > > > > > +		return NULL;
> > > > > > +
> > > > > > +	char *tail = (char *)m->buf_addr + m->data_off + m->data_len;
> > > > > > +	m->data_len = (uint16_t)(m->data_len + len);
> > > > > > +	m_head->pkt_len  = (m_head->pkt_len + len);
> > > > > > +	return tail;
> > > > > > +}
> > > > >
> > > > > Is it reasonable to direct add data_len of rte_mbuf?
> > > > >
> > > >
> > > > Do you suggest to add directly without checking there is enough room
> > > > in the mbuf? We cannot rely on the application providing mbuf with
> > > > enough tailroom.
> > >
> > > What I mentioned is this changes about mbuf should move to librte_mbuf.
> > > And it's better to align Olivier Matz.
> >
> > There is already rte_pktmbuf_append() inside rte_mbuf.h.
> > Wouldn't it suit?
> >
> 
> Hi Ananyev, Rosen,
> I agree that this can be confusing at first look and notably compared to packet processing.
> Note first that this same existing syntax is already used in all bbdev PMDs when manipulating outbound mbufs in the context of baseband
> signal processing (not really a packet as for NIC or other devices).
> Nothing new in that very PMD as this follows existing logic already in DPDK bbdev PMDs.
> 
> This function basically differs from the typical rte_pktmbuf_append() as it is not appending data to the last mbuf but is used to potentially
> update data sequentially for any mbuf in the middle of the chain from preallocated data, hence it takes 2 arguments for both the head and
> the current mbuf segment in the list.

Ok, thanks for explanation.

> There may be a more elegant way to do this down the line, notably once there is a proposal to handle large mbufs gracefully (another
> use case we have to handle in a slightly custom way). But I believe that is orthogonal to this very PMD series, which keeps relying on
> existing logic.
> 
> 
> 
> 
> > >
> > > > In case you ask about the 2 mbufs, this is because this function is
> > > > used to also support segmented memory made of multiple mbufs segments.
> > > > Note that this function is also used in other existing bbdev PMDs.
> > > > In case you believe there is a better way to do this, we can
> > > > certainly discuss and change these in several PMDs through another serie.
> > > >
> > > > Thanks for all the reviews and useful comments.
> > > > Nic

^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v4 02/11] baseband/acc100: add register definition file
  2020-09-04 17:53     ` [dpdk-dev] [PATCH v4 02/11] baseband/acc100: add register definition file Nicolas Chautru
  2020-09-15  2:31       ` Xu, Rosen
@ 2020-09-18  2:39       ` Liu, Tianjiao
  1 sibling, 0 replies; 213+ messages in thread
From: Liu, Tianjiao @ 2020-09-18  2:39 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, akhil.goyal

On Fri,  4 Sep 2020 10:53:58 -0700, Nicolas Chautru wrote:

> Add in the list of registers for the device and related
> HW specs definitions.

> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
>  drivers/baseband/acc100/acc100_pf_enum.h | 1068 ++++++++++++++++++++++++++++++
>  drivers/baseband/acc100/acc100_vf_enum.h |   73 ++
>  drivers/baseband/acc100/rte_acc100_pmd.h |  490 ++++++++++++++
>  3 files changed, 1631 insertions(+)
>  create mode 100644 drivers/baseband/acc100/acc100_pf_enum.h
>  create mode 100644 drivers/baseband/acc100/acc100_vf_enum.h

Acked-by: Liu Tianjiao <tianjiao.liu@intel.com>

^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v4 03/11] baseband/acc100: add info get function
  2020-09-04 17:53     ` [dpdk-dev] [PATCH v4 03/11] baseband/acc100: add info get function Nicolas Chautru
@ 2020-09-18  2:47       ` Liu, Tianjiao
  0 siblings, 0 replies; 213+ messages in thread
From: Liu, Tianjiao @ 2020-09-18  2:47 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, akhil.goyal

On Fri,  4 Sep 2020 10:53:59 -0700, Nicolas Chautru wrote:

> Add in the "info_get" function to the driver, to allow us to query the device.
> No processing capabilities are available yet.
> Linking bbdev-test to support the PMD with null capability.

> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
>  app/test-bbdev/Makefile                  |   3 +
>  app/test-bbdev/meson.build               |   3 +
>  drivers/baseband/acc100/rte_acc100_cfg.h |  96 +++++++++++++
>  drivers/baseband/acc100/rte_acc100_pmd.c | 225 +++++++++++++++++++++++++++++++
>  drivers/baseband/acc100/rte_acc100_pmd.h |   3 +
>  5 files changed, 330 insertions(+)
>  create mode 100644 drivers/baseband/acc100/rte_acc100_cfg.h

Acked-by: Liu Tianjiao <tianjiao.liu@intel.com>

^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v4 04/11] baseband/acc100: add queue configuration
  2020-09-04 17:54     ` [dpdk-dev] [PATCH v4 04/11] baseband/acc100: add queue configuration Nicolas Chautru
  2020-09-15  2:31       ` Xu, Rosen
@ 2020-09-18  3:01       ` Liu, Tianjiao
  1 sibling, 0 replies; 213+ messages in thread
From: Liu, Tianjiao @ 2020-09-18  3:01 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, akhil.goyal

On Fri,  4 Sep 2020 10:53:58 -0700, Nicolas Chautru wrote:

> Adding function to create and configure queues for the device. Still no capability.

> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
>  drivers/baseband/acc100/rte_acc100_pmd.c | 420 ++++++++++++++++++++++++++++++-
>  drivers/baseband/acc100/rte_acc100_pmd.h |  45 ++++
>  2 files changed, 464 insertions(+), 1 deletion(-)

Acked-by: Liu Tianjiao <tianjiao.liu@intel.com>

^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v4 05/11] baseband/acc100: add LDPC processing functions
  2020-09-04 17:54     ` [dpdk-dev] [PATCH v4 05/11] baseband/acc100: add LDPC processing functions Nicolas Chautru
@ 2020-09-21  1:40       ` Liu, Tianjiao
  0 siblings, 0 replies; 213+ messages in thread
From: Liu, Tianjiao @ 2020-09-21  1:40 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, akhil.goyal


On Fri,  4 Sep 2020 10:54:01 -0700, Nicolas Chautru wrote:

> Adding LDPC decode and encode processing operations

> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
>  drivers/baseband/acc100/rte_acc100_pmd.c | 1625 +++++++++++++++++++++++++++++-
>  drivers/baseband/acc100/rte_acc100_pmd.h |    3 +
>  2 files changed, 1626 insertions(+), 2 deletions(-)

Acked-by: Liu Tianjiao <tianjiao.liu@intel.com>

^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v4 06/11] baseband/acc100: add HARQ loopback support
  2020-09-04 17:54     ` [dpdk-dev] [PATCH v4 06/11] baseband/acc100: add HARQ loopback support Nicolas Chautru
@ 2020-09-21  1:41       ` Liu, Tianjiao
  0 siblings, 0 replies; 213+ messages in thread
From: Liu, Tianjiao @ 2020-09-21  1:41 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, akhil.goyal

On Fri,  4 Sep 2020 10:54:02 -0700, Nicolas Chautru wrote:

> Additional support for HARQ memory loopback

> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
>  drivers/baseband/acc100/rte_acc100_pmd.c | 158 +++++++++++++++++++++++++++++++
>  1 file changed, 158 insertions(+)

Acked-by: Liu Tianjiao <tianjiao.liu@intel.com>


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v4 07/11] baseband/acc100: add support for 4G processing
  2020-09-04 17:54     ` [dpdk-dev] [PATCH v4 07/11] baseband/acc100: add support for 4G processing Nicolas Chautru
@ 2020-09-21  1:43       ` Liu, Tianjiao
  0 siblings, 0 replies; 213+ messages in thread
From: Liu, Tianjiao @ 2020-09-21  1:43 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, akhil.goyal

On Fri,  4 Sep 2020 10:54:03 -0700, Nicolas Chautru wrote:

> Adding capability for 4G encode and decoder processing

> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
>  drivers/baseband/acc100/rte_acc100_pmd.c | 1010 ++++++++++++++++++++++++++++--
>  1 file changed, 943 insertions(+), 67 deletions(-)


Acked-by: Liu Tianjiao <tianjiao.liu@intel.com>

^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v4 08/11] baseband/acc100: add interrupt support to PMD
  2020-09-04 17:54     ` [dpdk-dev] [PATCH v4 08/11] baseband/acc100: add interrupt support to PMD Nicolas Chautru
@ 2020-09-21  1:45       ` Liu, Tianjiao
  0 siblings, 0 replies; 213+ messages in thread
From: Liu, Tianjiao @ 2020-09-21  1:45 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, akhil.goyal

On Fri,  4 Sep 2020 10:54:04 -0700, Nicolas Chautru wrote:



> Adding capability and functions to support MSI interrupts, callbacks and
> the info ring.

> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
>  drivers/baseband/acc100/rte_acc100_pmd.c | 288 ++++++++++++++++++++++++++++++-
>  drivers/baseband/acc100/rte_acc100_pmd.h |  15 ++
>  2 files changed, 300 insertions(+), 3 deletions(-)

Acked-by: Liu Tianjiao <tianjiao.liu@intel.com>

^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v4 09/11] baseband/acc100: add debug function to validate input
  2020-09-04 17:54     ` [dpdk-dev] [PATCH v4 09/11] baseband/acc100: add debug function to validate input Nicolas Chautru
@ 2020-09-21  1:46       ` Liu, Tianjiao
  0 siblings, 0 replies; 213+ messages in thread
From: Liu, Tianjiao @ 2020-09-21  1:46 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, akhil.goyal

On Fri,  4 Sep 2020 10:54:05 -0700, Nicolas Chautru wrote:

> Debug functions to validate the input API from the user. Only enabled in DEBUG mode at build time.

> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
>  drivers/baseband/acc100/rte_acc100_pmd.c | 424 +++++++++++++++++++++++++++++++
>  1 file changed, 424 insertions(+)

Acked-by: Liu Tianjiao <tianjiao.liu@intel.com>

^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v4 10/11] baseband/acc100: add configure function
  2020-09-04 17:54     ` [dpdk-dev] [PATCH v4 10/11] baseband/acc100: add configure function Nicolas Chautru
@ 2020-09-21  1:48       ` Liu, Tianjiao
  0 siblings, 0 replies; 213+ messages in thread
From: Liu, Tianjiao @ 2020-09-21  1:48 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, akhil.goyal

On Fri,  4 Sep 2020 10:54:06 -0700, Nicolas Chautru wrote:

> Add configure function to configure the PF from within the bbdev-test itself without an external application configuring the device.
> 
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
>  app/test-bbdev/test_bbdev_perf.c                   |  72 +++
>  drivers/baseband/acc100/Makefile                   |   3 +
>  drivers/baseband/acc100/meson.build                |   2 +
>  drivers/baseband/acc100/rte_acc100_cfg.h           |  17 +
>  drivers/baseband/acc100/rte_acc100_pmd.c           | 505 +++++++++++++++++++++
>  .../acc100/rte_pmd_bbdev_acc100_version.map        |   7 +
>  6 files changed, 606 insertions(+)


Acked-by: Liu Tianjiao <tianjiao.liu@intel.com>

^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v4 11/11] doc: update bbdev feature table
  2020-09-04 17:54     ` [dpdk-dev] [PATCH v4 11/11] doc: update bbdev feature table Nicolas Chautru
@ 2020-09-21  1:50       ` Liu, Tianjiao
  0 siblings, 0 replies; 213+ messages in thread
From: Liu, Tianjiao @ 2020-09-21  1:50 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, akhil.goyal

On Fri,  4 Sep 2020 10:54:07 -0700, Nicolas Chautru wrote:

> Correcting overview matrix to use acc100 name
> 
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
>  doc/guides/bbdevs/features/acc100.ini | 14 ++++++++++++++
>  doc/guides/bbdevs/features/mbc.ini    | 14 --------------
>  2 files changed, 14 insertions(+), 14 deletions(-)
>  create mode 100644 doc/guides/bbdevs/features/acc100.ini
>  delete mode 100644 doc/guides/bbdevs/features/mbc.ini

Acked-by: Liu Tianjiao <tianjiao.liu@intel.com>

^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v4 00/11] bbdev PMD ACC100
  2020-09-04 17:53   ` [dpdk-dev] [PATCH v4 00/11] bbdev PMD ACC100 Nicolas Chautru
                       ` (10 preceding siblings ...)
  2020-09-04 17:54     ` [dpdk-dev] [PATCH v4 11/11] doc: update bbdev feature table Nicolas Chautru
@ 2020-09-21 14:36     ` Chautru, Nicolas
  2020-09-22 19:32       ` Akhil Goyal
  11 siblings, 1 reply; 213+ messages in thread
From: Chautru, Nicolas @ 2020-09-21 14:36 UTC (permalink / raw)
  To: dev, akhil.goyal, Thomas Monjalon
  Cc: Richardson, Bruce, Xu, Rosen, dave.burley, aidan.goddard, Yigit,
	Ferruh, Liu, Tianjiao, Ananyev, Konstantin

Hi Akhil, 
Just a heads up on this bbdev PMD which is ready and was reviewed for some time by the community.
There is one warning on patchwork but it can be ignored (one ack email sent with bad formatting). 
Thanks and best regards, 
Nic

> -----Original Message-----
> From: Chautru, Nicolas <nicolas.chautru@intel.com>
> Sent: Friday, September 4, 2020 10:54 AM
> To: dev@dpdk.org; akhil.goyal@nxp.com
> Cc: Richardson, Bruce <bruce.richardson@intel.com>; Xu, Rosen
> <rosen.xu@intel.com>; dave.burley@accelercomm.com;
> aidan.goddard@accelercomm.com; Yigit, Ferruh <ferruh.yigit@intel.com>; Liu,
> Tianjiao <tianjiao.liu@intel.com>; Chautru, Nicolas
> <nicolas.chautru@intel.com>
> Subject: [PATCH v4 00/11] bbdev PMD ACC100
> 
> v4: an odd compilation error is reported for one CI variant using "gcc latest"
> which looks to me like a false positive of maybe-undeclared.
> http://mails.dpdk.org/archives/test-report/2020-August/148936.html
> Still forcing a dummy declare to remove this CI warning I will check with
> ci@dpdk.org in parallel.
> v3: missed a change during rebase
> v2: includes clean up from latest CI checks.
> 
> This set includes a new PMD for the accelerator
> ACC100 for 4G+5G FEC in 20.11.
> Documentation is updated as well accordingly.
> Existing unit tests are all still supported.
> 
> 
> Nicolas Chautru (11):
>   drivers/baseband: add PMD for ACC100
>   baseband/acc100: add register definition file
>   baseband/acc100: add info get function
>   baseband/acc100: add queue configuration
>   baseband/acc100: add LDPC processing functions
>   baseband/acc100: add HARQ loopback support
>   baseband/acc100: add support for 4G processing
>   baseband/acc100: add interrupt support to PMD
>   baseband/acc100: add debug function to validate input
>   baseband/acc100: add configure function
>   doc: update bbdev feature table
> 
>  app/test-bbdev/Makefile                            |    3 +
>  app/test-bbdev/meson.build                         |    3 +
>  app/test-bbdev/test_bbdev_perf.c                   |   72 +
>  config/common_base                                 |    4 +
>  doc/guides/bbdevs/acc100.rst                       |  233 +
>  doc/guides/bbdevs/features/acc100.ini              |   14 +
>  doc/guides/bbdevs/features/mbc.ini                 |   14 -
>  doc/guides/bbdevs/index.rst                        |    1 +
>  doc/guides/rel_notes/release_20_11.rst             |    6 +
>  drivers/baseband/Makefile                          |    2 +
>  drivers/baseband/acc100/Makefile                   |   28 +
>  drivers/baseband/acc100/acc100_pf_enum.h           | 1068 +++++
>  drivers/baseband/acc100/acc100_vf_enum.h           |   73 +
>  drivers/baseband/acc100/meson.build                |    8 +
>  drivers/baseband/acc100/rte_acc100_cfg.h           |  113 +
>  drivers/baseband/acc100/rte_acc100_pmd.c           | 4684
> ++++++++++++++++++++
>  drivers/baseband/acc100/rte_acc100_pmd.h           |  593 +++
>  .../acc100/rte_pmd_bbdev_acc100_version.map        |   10 +
>  drivers/baseband/meson.build                       |    2 +-
>  mk/rte.app.mk                                      |    1 +
>  20 files changed, 6917 insertions(+), 15 deletions(-)
>  create mode 100644 doc/guides/bbdevs/acc100.rst
>  create mode 100644 doc/guides/bbdevs/features/acc100.ini
>  delete mode 100644 doc/guides/bbdevs/features/mbc.ini
>  create mode 100644 drivers/baseband/acc100/Makefile
>  create mode 100644 drivers/baseband/acc100/acc100_pf_enum.h
>  create mode 100644 drivers/baseband/acc100/acc100_vf_enum.h
>  create mode 100644 drivers/baseband/acc100/meson.build
>  create mode 100644 drivers/baseband/acc100/rte_acc100_cfg.h
>  create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.c
>  create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.h
>  create mode 100644
> drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> 
> --
> 1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v4 00/11] bbdev PMD ACC100
  2020-09-21 14:36     ` [dpdk-dev] [PATCH v4 00/11] bbdev PMD ACC100 Chautru, Nicolas
@ 2020-09-22 19:32       ` Akhil Goyal
  2020-09-23  2:21         ` Chautru, Nicolas
  0 siblings, 1 reply; 213+ messages in thread
From: Akhil Goyal @ 2020-09-22 19:32 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, Thomas Monjalon
  Cc: Richardson, Bruce, Xu, Rosen, dave.burley, aidan.goddard, Yigit,
	Ferruh, Liu, Tianjiao, Ananyev, Konstantin

Hi Nicolas,

> 
> Hi Akhil,
> Just a heads up on this bbdev PMD which is ready and was reviewed for some
> time by the community.
> There is one warning on patchwork but it can be ignored (one ack email sent
> with bad formatting).
> Thanks and best regards,
> Nic
There are changes in Makefiles, which are not required as all makefiles are removed
now that we have moved to the meson build.

Could you please update the series.

Thanks,
Akhil


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v5 00/11] bbdev PMD ACC100
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 11/11] doc: update bbdev feature table Nicolas Chautru
  2020-09-04 17:53   ` [dpdk-dev] [PATCH v4 00/11] bbdev PMD ACC100 Nicolas Chautru
@ 2020-09-23  2:12   ` Nicolas Chautru
  2020-09-23  2:12     ` [dpdk-dev] [PATCH v5 01/11] drivers/baseband: add PMD for ACC100 Nicolas Chautru
                       ` (10 more replies)
  2020-09-23  2:19   ` [dpdk-dev] [PATCH v6 00/11] bbdev PMD ACC100 Nicolas Chautru
                     ` (6 subsequent siblings)
  8 siblings, 11 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-23  2:12 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

v5: rebased on latest main. The legacy Makefiles are removed.
v4: an odd compilation error is reported for one CI variant using "gcc latest", which looks to me like a false positive (maybe-undeclared).
http://mails.dpdk.org/archives/test-report/2020-August/148936.html
Still forcing a dummy declaration to remove this CI warning; I will check with ci@dpdk.org in parallel.
v3: missed a change during rebase
v2: includes clean up from latest CI checks.

This set includes a new PMD for the ACC100 accelerator
for 4G+5G FEC in 20.11.
Documentation is updated accordingly.
All existing unit tests are still supported.


Nicolas Chautru (11):
  drivers/baseband: add PMD for ACC100
  baseband/acc100: add register definition file
  baseband/acc100: add info get function
  baseband/acc100: add queue configuration
  baseband/acc100: add LDPC processing functions
  baseband/acc100: add HARQ loopback support
  baseband/acc100: add support for 4G processing
  baseband/acc100: add interrupt support to PMD
  baseband/acc100: add debug function to validate input
  baseband/acc100: add configure function
  doc: update bbdev feature table

 app/test-bbdev/meson.build                         |    3 +
 app/test-bbdev/test_bbdev_perf.c                   |   72 +
 doc/guides/bbdevs/acc100.rst                       |  233 +
 doc/guides/bbdevs/features/acc100.ini              |   14 +
 doc/guides/bbdevs/features/mbc.ini                 |   14 -
 doc/guides/bbdevs/index.rst                        |    1 +
 doc/guides/rel_notes/release_20_11.rst             |    6 +
 drivers/baseband/acc100/Makefile                   |   28 +
 drivers/baseband/acc100/acc100_pf_enum.h           | 1068 +++++
 drivers/baseband/acc100/acc100_vf_enum.h           |   73 +
 drivers/baseband/acc100/meson.build                |    8 +
 drivers/baseband/acc100/rte_acc100_cfg.h           |  113 +
 drivers/baseband/acc100/rte_acc100_pmd.c           | 4684 ++++++++++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.h           |  593 +++
 .../acc100/rte_pmd_bbdev_acc100_version.map        |   10 +
 drivers/baseband/meson.build                       |    2 +-
 16 files changed, 6907 insertions(+), 15 deletions(-)
 create mode 100644 doc/guides/bbdevs/acc100.rst
 create mode 100644 doc/guides/bbdevs/features/acc100.ini
 delete mode 100644 doc/guides/bbdevs/features/mbc.ini
 create mode 100644 drivers/baseband/acc100/Makefile
 create mode 100644 drivers/baseband/acc100/acc100_pf_enum.h
 create mode 100644 drivers/baseband/acc100/acc100_vf_enum.h
 create mode 100644 drivers/baseband/acc100/meson.build
 create mode 100644 drivers/baseband/acc100/rte_acc100_cfg.h
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.c
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.h
 create mode 100644 drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v5 01/11] drivers/baseband: add PMD for ACC100
  2020-09-23  2:12   ` [dpdk-dev] [PATCH v5 " Nicolas Chautru
@ 2020-09-23  2:12     ` Nicolas Chautru
  2020-09-23  2:12     ` [dpdk-dev] [PATCH v5 02/11] baseband/acc100: add register definition file Nicolas Chautru
                       ` (9 subsequent siblings)
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-23  2:12 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add stubs for the ACC100 PMD

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 doc/guides/bbdevs/acc100.rst                       | 233 +++++++++++++++++++++
 doc/guides/bbdevs/index.rst                        |   1 +
 doc/guides/rel_notes/release_20_11.rst             |   6 +
 drivers/baseband/acc100/Makefile                   |  25 +++
 drivers/baseband/acc100/meson.build                |   6 +
 drivers/baseband/acc100/rte_acc100_pmd.c           | 175 ++++++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.h           |  37 ++++
 .../acc100/rte_pmd_bbdev_acc100_version.map        |   3 +
 drivers/baseband/meson.build                       |   2 +-
 9 files changed, 487 insertions(+), 1 deletion(-)
 create mode 100644 doc/guides/bbdevs/acc100.rst
 create mode 100644 drivers/baseband/acc100/Makefile
 create mode 100644 drivers/baseband/acc100/meson.build
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.c
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.h
 create mode 100644 drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map

diff --git a/doc/guides/bbdevs/acc100.rst b/doc/guides/bbdevs/acc100.rst
new file mode 100644
index 0000000..f87ee09
--- /dev/null
+++ b/doc/guides/bbdevs/acc100.rst
@@ -0,0 +1,233 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2020 Intel Corporation
+
+Intel(R) ACC100 5G/4G FEC Poll Mode Driver
+==========================================
+
+The BBDEV ACC100 5G/4G FEC poll mode driver (PMD) supports an
+implementation of a VRAN FEC wireless acceleration function.
+This device is also known as Mount Bryce.
+
+Features
+--------
+
+ACC100 5G/4G FEC PMD supports the following features:
+
+- LDPC Encode in the DL (5GNR)
+- LDPC Decode in the UL (5GNR)
+- Turbo Encode in the DL (4G)
+- Turbo Decode in the UL (4G)
+- 16 VFs per PF (physical device)
+- Maximum of 128 queues per VF
+- PCIe Gen-3 x16 Interface
+- MSI
+- SR-IOV
+
+ACC100 5G/4G FEC PMD supports the following BBDEV capabilities:
+
+* For the LDPC encode operation:
+   - ``RTE_BBDEV_LDPC_CRC_24B_ATTACH`` :  set to attach CRC24B to CB(s)
+   - ``RTE_BBDEV_LDPC_RATE_MATCH`` :  if set then do not do Rate Match bypass
+   - ``RTE_BBDEV_LDPC_INTERLEAVER_BYPASS`` : if set then bypass interleaver
+
+* For the LDPC decode operation:
+   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK`` :  check CRC24B from CB(s)
+   - ``RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE`` :  disable early termination
+   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP`` :  drops CRC24B bits appended while decoding
+   - ``RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE`` :  provides an input for HARQ combining
+   - ``RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE`` :  provides an output for HARQ combining
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE`` :  HARQ memory input is internal
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE`` :  HARQ memory output is internal
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK`` :  loopback data to/from HARQ memory
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_FILLERS`` :  HARQ memory includes the filler bits
+   - ``RTE_BBDEV_LDPC_DEC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
+   - ``RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION`` :  supports compression of the HARQ input/output
+   - ``RTE_BBDEV_LDPC_LLR_COMPRESSION`` :  supports LLR input compression
+
+* For the turbo encode operation:
+   - ``RTE_BBDEV_TURBO_CRC_24B_ATTACH`` :  set to attach CRC24B to CB(s)
+   - ``RTE_BBDEV_TURBO_RATE_MATCH`` :  if set then do not do Rate Match bypass
+   - ``RTE_BBDEV_TURBO_ENC_INTERRUPTS`` :  set for encoder dequeue interrupts
+   - ``RTE_BBDEV_TURBO_RV_INDEX_BYPASS`` :  set to bypass RV index
+   - ``RTE_BBDEV_TURBO_ENC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
+
+* For the turbo decode operation:
+   - ``RTE_BBDEV_TURBO_CRC_TYPE_24B`` :  check CRC24B from CB(s)
+   - ``RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE`` :  perform subblock de-interleave
+   - ``RTE_BBDEV_TURBO_DEC_INTERRUPTS`` :  set for decoder dequeue interrupts
+   - ``RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN`` :  set if negative LLR decoder input is supported
+   - ``RTE_BBDEV_TURBO_POS_LLR_1_BIT_IN`` :  set if positive LLR decoder input is supported
+   - ``RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP`` :  keep CRC24B bits appended while decoding
+   - ``RTE_BBDEV_TURBO_EARLY_TERMINATION`` :  set the early termination feature
+   - ``RTE_BBDEV_TURBO_DEC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
+   - ``RTE_BBDEV_TURBO_HALF_ITERATION_EVEN`` :  set half iteration granularity
+
+Installation
+------------
+
+Section 3 of the DPDK manual provides instructions on installing and compiling DPDK. The
+default set of bbdev compile flags may be found in config/common_base, where for example
+the flag to build the ACC100 5G/4G FEC device, ``CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100``,
+is already set.
+
+DPDK requires hugepages to be configured as detailed in section 2 of the DPDK manual.
+The bbdev test application has been tested with a configuration of 40 x 1GB hugepages. The
+hugepage configuration of a server may be examined using:
+
+.. code-block:: console
+
+   grep Huge* /proc/meminfo
+
+
+Initialization
+--------------
+
+When the device first powers up, its PCI physical function (PF) can be listed with this command:
+
+.. code-block:: console
+
+  sudo lspci -vd8086:0d5c
+
+The physical and virtual functions are compatible with Linux UIO drivers:
+``vfio`` and ``igb_uio``. However, before it can be used, the ACC100 5G/4G
+FEC device must first be bound to one of these Linux drivers through DPDK.
+
+
+Bind PF UIO driver(s)
+~~~~~~~~~~~~~~~~~~~~~
+
+Install the DPDK igb_uio driver, bind it with the PF PCI device ID and use
+``lspci`` to confirm the PF device is under use by ``igb_uio`` DPDK UIO driver.
+
+The igb_uio driver may be bound to the PF PCI device using one of three methods:
+
+
+1. PCI functions (physical or virtual, depending on the use case) can be bound to
+the UIO driver by repeating this command for every function.
+
+.. code-block:: console
+
+  cd <dpdk-top-level-directory>
+  insmod ./build/kmod/igb_uio.ko
+  echo "8086 0d5c" > /sys/bus/pci/drivers/igb_uio/new_id
+  lspci -vd8086:0d5c
+
+
+2. Another way to bind the PF to the DPDK UIO driver is by using the ``dpdk-devbind.py`` tool:
+
+.. code-block:: console
+
+  cd <dpdk-top-level-directory>
+  ./usertools/dpdk-devbind.py -b igb_uio 0000:06:00.0
+
+where the PCI device ID (example: 0000:06:00.0) is obtained using ``lspci -vd8086:0d5c``.
+
+
+3. A third way to bind is to use ``dpdk-setup.sh`` tool
+
+.. code-block:: console
+
+  cd <dpdk-top-level-directory>
+  ./usertools/dpdk-setup.sh
+
+  select 'Bind Ethernet/Crypto/Baseband device to IGB UIO module'
+  or
+  select 'Bind Ethernet/Crypto/Baseband device to VFIO module' depending on driver required
+  enter PCI device ID
+  select 'Display current Ethernet/Crypto/Baseband device settings' to confirm binding
+
+
+In the same way the ACC100 5G/4G FEC PF can be bound with vfio; however, the vfio
+driver does not support SR-IOV configuration out of the box, so it needs to be patched.
+
+
+Enable Virtual Functions
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+The ``lspci`` printout should now show that the PCI PF is under igb_uio control:
+"``Kernel driver in use: igb_uio``"
+
+To show the number of available VFs on the device, read the ``sriov_totalvfs`` file:
+
+.. code-block:: console
+
+  cat /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_totalvfs
+
+  where 0000\:<b>\:<d>.<f> is the PCI device ID
+
+
+To enable VFs via igb_uio, write the number of virtual functions to be
+enabled to the ``max_vfs`` file:
+
+.. code-block:: console
+
+  echo <num-of-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/max_vfs
+
+
+Afterwards, all VFs must be bound to the appropriate UIO drivers as required,
+in the same way as was done for the physical function above.
+
+Enabling SR-IOV via the vfio driver works in much the same way, except that the
+file name is different:
+
+.. code-block:: console
+
+  echo <num-of-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_numvfs
+
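The igb_uio and vfio paths above differ only in the sysfs file name. A minimal
sketch that composes both paths (the PCI address ``0000:06:00.0`` and the VF
count are illustrative assumptions, not values from this patch):

```shell
# Sketch only: build the sysfs paths used above; no writes are performed.
pci=0000:06:00.0
num_vfs=2

sysfs_dev=/sys/bus/pci/devices/$pci

# igb_uio exposes a driver-specific max_vfs file; vfio uses the
# standard SR-IOV sriov_numvfs file.
echo "igb_uio: echo $num_vfs > $sysfs_dev/max_vfs"
echo "vfio:    echo $num_vfs > $sysfs_dev/sriov_numvfs"
```

Writing to these files requires root and only takes effect on a device whose PF
is already bound to the corresponding driver.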
+
+Configure the VFs through PF
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The PCI virtual functions must be configured before use or before being assigned
+to VMs/containers. The configuration involves allocating the number of hardware
+queues, priorities, load balance, bandwidth and other settings necessary for the
+device to perform FEC functions.
+
+This configuration needs to be executed at least once after reboot or PCI FLR and can
+be achieved by using the function ``acc100_configure()``, which sets up the
+parameters defined in ``acc100_conf`` structure.
+
+Test Application
+----------------
+
+BBDEV provides a test application, ``test-bbdev.py``, and a range of test data for testing
+the functionality of ACC100 5G/4G FEC encode and decode, depending on the device's
+capabilities. The test application is located in the ``app/test-bbdev`` directory and has the
+following options:
+
+.. code-block:: console
+
+  "-p", "--testapp-path": specifies path to the bbdev test app.
+  "-e", "--eal-params"	: EAL arguments which are passed to the test app.
+  "-t", "--timeout"	: Timeout in seconds (default=300).
+  "-c", "--test-cases"	: Defines test cases to run. Run all if not specified.
+  "-v", "--test-vector"	: Test vector path (default=dpdk_path+/app/test-bbdev/test_vectors/bbdev_null.data).
+  "-n", "--num-ops"	: Number of operations to process on device (default=32).
+  "-b", "--burst-size"	: Operations enqueue/dequeue burst size (default=32).
+  "-s", "--snr"		: SNR in dB used when generating LLRs for bler tests.
+  "-s", "--iter_max"	: Number of iterations for LDPC decoder.
+  "-l", "--num-lcores"	: Number of lcores to run (default=16).
+  "-i", "--init-device" : Initialise PF device with default values.
+
+
+To execute the test application tool using simple decode or encode data,
+type one of the following:
+
+.. code-block:: console
+
+  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_dec_default.data
+  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_enc_default.data
+
+
+The test application ``test-bbdev.py`` can also configure the PF device with
+a default set of values when the ``-i`` or ``--init-device`` option is included. The
+default values are defined in test_bbdev_perf.c.
+
+
+Test Vectors
+~~~~~~~~~~~~
+
+In addition to the simple LDPC decoder and LDPC encoder tests, bbdev also provides
+a range of additional tests under the test_vectors folder, which may be useful. The results
+of these tests will depend on the ACC100 5G/4G FEC capabilities, which may cause some
+test cases to be skipped, but no failure should be reported.
diff --git a/doc/guides/bbdevs/index.rst b/doc/guides/bbdevs/index.rst
index a8092dd..4445cbd 100644
--- a/doc/guides/bbdevs/index.rst
+++ b/doc/guides/bbdevs/index.rst
@@ -13,3 +13,4 @@ Baseband Device Drivers
     turbo_sw
     fpga_lte_fec
     fpga_5gnr_fec
+    acc100
diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
index 73ac08f..20639ea 100644
--- a/doc/guides/rel_notes/release_20_11.rst
+++ b/doc/guides/rel_notes/release_20_11.rst
@@ -55,6 +55,12 @@ New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+* **Added Intel ACC100 bbdev PMD.**
+
+  Added a new ``acc100`` bbdev driver for the Intel\ |reg| ACC100 accelerator
+  also known as Mount Bryce.  See the
+  :doc:`../bbdevs/acc100` BBDEV guide for more details on this new driver.
+
 
 Removed Items
 -------------
diff --git a/drivers/baseband/acc100/Makefile b/drivers/baseband/acc100/Makefile
new file mode 100644
index 0000000..c79e487
--- /dev/null
+++ b/drivers/baseband/acc100/Makefile
@@ -0,0 +1,25 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2020 Intel Corporation
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_pmd_bbdev_acc100.a
+
+# build flags
+CFLAGS += -O3
+CFLAGS += $(WERROR_FLAGS)
+LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring -lrte_cfgfile
+LDLIBS += -lrte_bbdev
+LDLIBS += -lrte_pci -lrte_bus_pci
+
+# versioning export map
+EXPORT_MAP := rte_pmd_bbdev_acc100_version.map
+
+# library version
+LIBABIVER := 1
+
+# library source files
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100) += rte_acc100_pmd.c
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/baseband/acc100/meson.build b/drivers/baseband/acc100/meson.build
new file mode 100644
index 0000000..8afafc2
--- /dev/null
+++ b/drivers/baseband/acc100/meson.build
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2020 Intel Corporation
+
+deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci']
+
+sources = files('rte_acc100_pmd.c')
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
new file mode 100644
index 0000000..1b4cd13
--- /dev/null
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -0,0 +1,175 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include <unistd.h>
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_dev.h>
+#include <rte_malloc.h>
+#include <rte_mempool.h>
+#include <rte_byteorder.h>
+#include <rte_errno.h>
+#include <rte_branch_prediction.h>
+#include <rte_hexdump.h>
+#include <rte_pci.h>
+#include <rte_bus_pci.h>
+
+#include <rte_bbdev.h>
+#include <rte_bbdev_pmd.h>
+#include "rte_acc100_pmd.h"
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, DEBUG);
+#else
+RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
+#endif
+
+/* Free 64MB memory used for software rings */
+static int
+acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
+{
+	return 0;
+}
+
+static const struct rte_bbdev_ops acc100_bbdev_ops = {
+	.close = acc100_dev_close,
+};
+
+/* ACC100 PCI PF address map */
+static struct rte_pci_id pci_id_acc100_pf_map[] = {
+	{
+		RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID, RTE_ACC100_PF_DEVICE_ID)
+	},
+	{.device_id = 0},
+};
+
+/* ACC100 PCI VF address map */
+static struct rte_pci_id pci_id_acc100_vf_map[] = {
+	{
+		RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID, RTE_ACC100_VF_DEVICE_ID)
+	},
+	{.device_id = 0},
+};
+
+/* Initialization Function */
+static void
+acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
+{
+	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
+
+	dev->dev_ops = &acc100_bbdev_ops;
+
+	((struct acc100_device *) dev->data->dev_private)->pf_device =
+			!strcmp(drv->driver.name,
+					RTE_STR(ACC100PF_DRIVER_NAME));
+	((struct acc100_device *) dev->data->dev_private)->mmio_base =
+			pci_dev->mem_resource[0].addr;
+
+	rte_bbdev_log_debug("Init device %s [%s] @ vaddr %p paddr %#"PRIx64"",
+			drv->driver.name, dev->data->name,
+			(void *)pci_dev->mem_resource[0].addr,
+			pci_dev->mem_resource[0].phys_addr);
+}
+
+static int acc100_pci_probe(struct rte_pci_driver *pci_drv,
+	struct rte_pci_device *pci_dev)
+{
+	struct rte_bbdev *bbdev = NULL;
+	char dev_name[RTE_BBDEV_NAME_MAX_LEN];
+
+	if (pci_dev == NULL) {
+		rte_bbdev_log(ERR, "NULL PCI device");
+		return -EINVAL;
+	}
+
+	rte_pci_device_name(&pci_dev->addr, dev_name, sizeof(dev_name));
+
+	/* Allocate memory to be used privately by drivers */
+	bbdev = rte_bbdev_allocate(pci_dev->device.name);
+	if (bbdev == NULL)
+		return -ENODEV;
+
+	/* allocate device private memory */
+	bbdev->data->dev_private = rte_zmalloc_socket(dev_name,
+			sizeof(struct acc100_device), RTE_CACHE_LINE_SIZE,
+			pci_dev->device.numa_node);
+
+	if (bbdev->data->dev_private == NULL) {
+		rte_bbdev_log(CRIT,
+				"Allocation of %zu bytes for device \"%s\" failed",
+				sizeof(struct acc100_device), dev_name);
+		rte_bbdev_release(bbdev);
+		return -ENOMEM;
+	}
+
+	/* Fill HW specific part of device structure */
+	bbdev->device = &pci_dev->device;
+	bbdev->intr_handle = &pci_dev->intr_handle;
+	bbdev->data->socket_id = pci_dev->device.numa_node;
+
+	/* Invoke ACC100 device initialization function */
+	acc100_bbdev_init(bbdev, pci_drv);
+
+	rte_bbdev_log_debug("Initialised bbdev %s (id = %u)",
+			dev_name, bbdev->data->dev_id);
+	return 0;
+}
+
+static int acc100_pci_remove(struct rte_pci_device *pci_dev)
+{
+	struct rte_bbdev *bbdev;
+	int ret;
+	uint8_t dev_id;
+
+	if (pci_dev == NULL)
+		return -EINVAL;
+
+	/* Find device */
+	bbdev = rte_bbdev_get_named_dev(pci_dev->device.name);
+	if (bbdev == NULL) {
+		rte_bbdev_log(CRIT,
+				"Couldn't find HW dev \"%s\" to uninitialise it",
+				pci_dev->device.name);
+		return -ENODEV;
+	}
+	dev_id = bbdev->data->dev_id;
+
+	/* free device private memory before close */
+	rte_free(bbdev->data->dev_private);
+
+	/* Close device */
+	ret = rte_bbdev_close(dev_id);
+	if (ret < 0)
+		rte_bbdev_log(ERR,
+				"Device %i failed to close during uninit: %i",
+				dev_id, ret);
+
+	/* release bbdev from library */
+	rte_bbdev_release(bbdev);
+
+	rte_bbdev_log_debug("Destroyed bbdev = %u", dev_id);
+
+	return 0;
+}
+
+static struct rte_pci_driver acc100_pci_pf_driver = {
+		.probe = acc100_pci_probe,
+		.remove = acc100_pci_remove,
+		.id_table = pci_id_acc100_pf_map,
+		.drv_flags = RTE_PCI_DRV_NEED_MAPPING
+};
+
+static struct rte_pci_driver acc100_pci_vf_driver = {
+		.probe = acc100_pci_probe,
+		.remove = acc100_pci_remove,
+		.id_table = pci_id_acc100_vf_map,
+		.drv_flags = RTE_PCI_DRV_NEED_MAPPING
+};
+
+RTE_PMD_REGISTER_PCI(ACC100PF_DRIVER_NAME, acc100_pci_pf_driver);
+RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
+RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
+RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
+
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
new file mode 100644
index 0000000..6f46df0
--- /dev/null
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -0,0 +1,37 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_ACC100_PMD_H_
+#define _RTE_ACC100_PMD_H_
+
+/* Helper macro for logging */
+#define rte_bbdev_log(level, fmt, ...) \
+	rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \
+		##__VA_ARGS__)
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+#define rte_bbdev_log_debug(fmt, ...) \
+		rte_bbdev_log(DEBUG, "acc100_pmd: " fmt, \
+		##__VA_ARGS__)
+#else
+#define rte_bbdev_log_debug(fmt, ...)
+#endif
+
+/* ACC100 PF and VF driver names */
+#define ACC100PF_DRIVER_NAME           intel_acc100_pf
+#define ACC100VF_DRIVER_NAME           intel_acc100_vf
+
+/* ACC100 PCI vendor & device IDs */
+#define RTE_ACC100_VENDOR_ID           (0x8086)
+#define RTE_ACC100_PF_DEVICE_ID        (0x0d5c)
+#define RTE_ACC100_VF_DEVICE_ID        (0x0d5d)
+
+/* Private data structure for each ACC100 device */
+struct acc100_device {
+	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
+	bool pf_device; /**< True if this is a PF ACC100 device */
+	bool configured; /**< True if this ACC100 device is configured */
+};
+
+#endif /* _RTE_ACC100_PMD_H_ */
diff --git a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
new file mode 100644
index 0000000..4a76d1d
--- /dev/null
+++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
@@ -0,0 +1,3 @@
+DPDK_21 {
+	local: *;
+};
diff --git a/drivers/baseband/meson.build b/drivers/baseband/meson.build
index 415b672..72301ce 100644
--- a/drivers/baseband/meson.build
+++ b/drivers/baseband/meson.build
@@ -5,7 +5,7 @@ if is_windows
 	subdir_done()
 endif
 
-drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec']
+drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec', 'acc100']
 
 config_flag_fmt = 'RTE_LIBRTE_PMD_BBDEV_@0@'
 driver_name_fmt = 'rte_pmd_bbdev_@0@'
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v5 02/11] baseband/acc100: add register definition file
  2020-09-23  2:12   ` [dpdk-dev] [PATCH v5 " Nicolas Chautru
  2020-09-23  2:12     ` [dpdk-dev] [PATCH v5 01/11] drivers/baseband: add PMD for ACC100 Nicolas Chautru
@ 2020-09-23  2:12     ` Nicolas Chautru
  2020-09-23  2:12     ` [dpdk-dev] [PATCH v5 03/11] baseband/acc100: add info get function Nicolas Chautru
                       ` (8 subsequent siblings)
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-23  2:12 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add the list of registers for the device and the related
HW spec definitions.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 drivers/baseband/acc100/acc100_pf_enum.h | 1068 ++++++++++++++++++++++++++++++
 drivers/baseband/acc100/acc100_vf_enum.h |   73 ++
 drivers/baseband/acc100/rte_acc100_pmd.h |  490 ++++++++++++++
 3 files changed, 1631 insertions(+)
 create mode 100644 drivers/baseband/acc100/acc100_pf_enum.h
 create mode 100644 drivers/baseband/acc100/acc100_vf_enum.h

diff --git a/drivers/baseband/acc100/acc100_pf_enum.h b/drivers/baseband/acc100/acc100_pf_enum.h
new file mode 100644
index 0000000..a1ee416
--- /dev/null
+++ b/drivers/baseband/acc100/acc100_pf_enum.h
@@ -0,0 +1,1068 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017 Intel Corporation
+ */
+
+#ifndef ACC100_PF_ENUM_H
+#define ACC100_PF_ENUM_H
+
+/*
+ * ACC100 Register mapping on PF BAR0
+ * This is automatically generated from RDL, format may change with new RDL
+ * Release.
+ * Variable names are as is
+ */
+enum {
+	HWPfQmgrEgressQueuesTemplate          =  0x0007FE00,
+	HWPfQmgrIngressAq                     =  0x00080000,
+	HWPfQmgrArbQAvail                     =  0x00A00010,
+	HWPfQmgrArbQBlock                     =  0x00A00014,
+	HWPfQmgrAqueueDropNotifEn             =  0x00A00024,
+	HWPfQmgrAqueueDisableNotifEn          =  0x00A00028,
+	HWPfQmgrSoftReset                     =  0x00A00038,
+	HWPfQmgrInitStatus                    =  0x00A0003C,
+	HWPfQmgrAramWatchdogCount             =  0x00A00040,
+	HWPfQmgrAramWatchdogCounterEn         =  0x00A00044,
+	HWPfQmgrAxiWatchdogCount              =  0x00A00048,
+	HWPfQmgrAxiWatchdogCounterEn          =  0x00A0004C,
+	HWPfQmgrProcessWatchdogCount          =  0x00A00050,
+	HWPfQmgrProcessWatchdogCounterEn      =  0x00A00054,
+	HWPfQmgrProcessUl4GWatchdogCounter    =  0x00A00058,
+	HWPfQmgrProcessDl4GWatchdogCounter    =  0x00A0005C,
+	HWPfQmgrProcessUl5GWatchdogCounter    =  0x00A00060,
+	HWPfQmgrProcessDl5GWatchdogCounter    =  0x00A00064,
+	HWPfQmgrProcessMldWatchdogCounter     =  0x00A00068,
+	HWPfQmgrMsiOverflowUpperVf            =  0x00A00070,
+	HWPfQmgrMsiOverflowLowerVf            =  0x00A00074,
+	HWPfQmgrMsiWatchdogOverflow           =  0x00A00078,
+	HWPfQmgrMsiOverflowEnable             =  0x00A0007C,
+	HWPfQmgrDebugAqPointerMemGrp          =  0x00A00100,
+	HWPfQmgrDebugOutputArbQFifoGrp        =  0x00A00140,
+	HWPfQmgrDebugMsiFifoGrp               =  0x00A00180,
+	HWPfQmgrDebugAxiWdTimeoutMsiFifo      =  0x00A001C0,
+	HWPfQmgrDebugProcessWdTimeoutMsiFifo  =  0x00A001C4,
+	HWPfQmgrDepthLog2Grp                  =  0x00A00200,
+	HWPfQmgrTholdGrp                      =  0x00A00300,
+	HWPfQmgrGrpTmplateReg0Indx            =  0x00A00600,
+	HWPfQmgrGrpTmplateReg1Indx            =  0x00A00680,
+	HWPfQmgrGrpTmplateReg2indx            =  0x00A00700,
+	HWPfQmgrGrpTmplateReg3Indx            =  0x00A00780,
+	HWPfQmgrGrpTmplateReg4Indx            =  0x00A00800,
+	HWPfQmgrVfBaseAddr                    =  0x00A01000,
+	HWPfQmgrUl4GWeightRrVf                =  0x00A02000,
+	HWPfQmgrDl4GWeightRrVf                =  0x00A02100,
+	HWPfQmgrUl5GWeightRrVf                =  0x00A02200,
+	HWPfQmgrDl5GWeightRrVf                =  0x00A02300,
+	HWPfQmgrMldWeightRrVf                 =  0x00A02400,
+	HWPfQmgrArbQDepthGrp                  =  0x00A02F00,
+	HWPfQmgrGrpFunction0                  =  0x00A02F40,
+	HWPfQmgrGrpFunction1                  =  0x00A02F44,
+	HWPfQmgrGrpPriority                   =  0x00A02F48,
+	HWPfQmgrWeightSync                    =  0x00A03000,
+	HWPfQmgrAqEnableVf                    =  0x00A10000,
+	HWPfQmgrAqResetVf                     =  0x00A20000,
+	HWPfQmgrRingSizeVf                    =  0x00A20004,
+	HWPfQmgrGrpDepthLog20Vf               =  0x00A20008,
+	HWPfQmgrGrpDepthLog21Vf               =  0x00A2000C,
+	HWPfQmgrGrpFunction0Vf                =  0x00A20010,
+	HWPfQmgrGrpFunction1Vf                =  0x00A20014,
+	HWPfDmaConfig0Reg                     =  0x00B80000,
+	HWPfDmaConfig1Reg                     =  0x00B80004,
+	HWPfDmaQmgrAddrReg                    =  0x00B80008,
+	HWPfDmaSoftResetReg                   =  0x00B8000C,
+	HWPfDmaAxcacheReg                     =  0x00B80010,
+	HWPfDmaVersionReg                     =  0x00B80014,
+	HWPfDmaFrameThreshold                 =  0x00B80018,
+	HWPfDmaTimestampLo                    =  0x00B8001C,
+	HWPfDmaTimestampHi                    =  0x00B80020,
+	HWPfDmaAxiStatus                      =  0x00B80028,
+	HWPfDmaAxiControl                     =  0x00B8002C,
+	HWPfDmaNoQmgr                         =  0x00B80030,
+	HWPfDmaQosScale                       =  0x00B80034,
+	HWPfDmaQmanen                         =  0x00B80040,
+	HWPfDmaQmgrQosBase                    =  0x00B80060,
+	HWPfDmaFecClkGatingEnable             =  0x00B80080,
+	HWPfDmaPmEnable                       =  0x00B80084,
+	HWPfDmaQosEnable                      =  0x00B80088,
+	HWPfDmaHarqWeightedRrFrameThreshold   =  0x00B800B0,
+	HWPfDmaDataSmallWeightedRrFrameThresh  = 0x00B800B4,
+	HWPfDmaDataLargeWeightedRrFrameThresh  = 0x00B800B8,
+	HWPfDmaInboundCbMaxSize               =  0x00B800BC,
+	HWPfDmaInboundDrainDataSize           =  0x00B800C0,
+	HWPfDmaVfDdrBaseRw                    =  0x00B80400,
+	HWPfDmaCmplTmOutCnt                   =  0x00B80800,
+	HWPfDmaProcTmOutCnt                   =  0x00B80804,
+	HWPfDmaStatusRrespBresp               =  0x00B80810,
+	HWPfDmaCfgRrespBresp                  =  0x00B80814,
+	HWPfDmaStatusMemParErr                =  0x00B80818,
+	HWPfDmaCfgMemParErrEn                 =  0x00B8081C,
+	HWPfDmaStatusDmaHwErr                 =  0x00B80820,
+	HWPfDmaCfgDmaHwErrEn                  =  0x00B80824,
+	HWPfDmaStatusFecCoreErr               =  0x00B80828,
+	HWPfDmaCfgFecCoreErrEn                =  0x00B8082C,
+	HWPfDmaStatusFcwDescrErr              =  0x00B80830,
+	HWPfDmaCfgFcwDescrErrEn               =  0x00B80834,
+	HWPfDmaStatusBlockTransmit            =  0x00B80838,
+	HWPfDmaBlockOnErrEn                   =  0x00B8083C,
+	HWPfDmaStatusFlushDma                 =  0x00B80840,
+	HWPfDmaFlushDmaOnErrEn                =  0x00B80844,
+	HWPfDmaStatusSdoneFifoFull            =  0x00B80848,
+	HWPfDmaStatusDescriptorErrLoVf        =  0x00B8084C,
+	HWPfDmaStatusDescriptorErrHiVf        =  0x00B80850,
+	HWPfDmaStatusFcwErrLoVf               =  0x00B80854,
+	HWPfDmaStatusFcwErrHiVf               =  0x00B80858,
+	HWPfDmaStatusDataErrLoVf              =  0x00B8085C,
+	HWPfDmaStatusDataErrHiVf              =  0x00B80860,
+	HWPfDmaCfgMsiEnSoftwareErr            =  0x00B80864,
+	HWPfDmaDescriptorSignatuture          =  0x00B80868,
+	HWPfDmaFcwSignature                   =  0x00B8086C,
+	HWPfDmaErrorDetectionEn               =  0x00B80870,
+	HWPfDmaErrCntrlFifoDebug              =  0x00B8087C,
+	HWPfDmaStatusToutData                 =  0x00B80880,
+	HWPfDmaStatusToutDesc                 =  0x00B80884,
+	HWPfDmaStatusToutUnexpData            =  0x00B80888,
+	HWPfDmaStatusToutUnexpDesc            =  0x00B8088C,
+	HWPfDmaStatusToutProcess              =  0x00B80890,
+	HWPfDmaConfigCtoutOutDataEn           =  0x00B808A0,
+	HWPfDmaConfigCtoutOutDescrEn          =  0x00B808A4,
+	HWPfDmaConfigUnexpComplDataEn         =  0x00B808A8,
+	HWPfDmaConfigUnexpComplDescrEn        =  0x00B808AC,
+	HWPfDmaConfigPtoutOutEn               =  0x00B808B0,
+	HWPfDmaFec5GulDescBaseLoRegVf         =  0x00B88020,
+	HWPfDmaFec5GulDescBaseHiRegVf         =  0x00B88024,
+	HWPfDmaFec5GulRespPtrLoRegVf          =  0x00B88028,
+	HWPfDmaFec5GulRespPtrHiRegVf          =  0x00B8802C,
+	HWPfDmaFec5GdlDescBaseLoRegVf         =  0x00B88040,
+	HWPfDmaFec5GdlDescBaseHiRegVf         =  0x00B88044,
+	HWPfDmaFec5GdlRespPtrLoRegVf          =  0x00B88048,
+	HWPfDmaFec5GdlRespPtrHiRegVf          =  0x00B8804C,
+	HWPfDmaFec4GulDescBaseLoRegVf         =  0x00B88060,
+	HWPfDmaFec4GulDescBaseHiRegVf         =  0x00B88064,
+	HWPfDmaFec4GulRespPtrLoRegVf          =  0x00B88068,
+	HWPfDmaFec4GulRespPtrHiRegVf          =  0x00B8806C,
+	HWPfDmaFec4GdlDescBaseLoRegVf         =  0x00B88080,
+	HWPfDmaFec4GdlDescBaseHiRegVf         =  0x00B88084,
+	HWPfDmaFec4GdlRespPtrLoRegVf          =  0x00B88088,
+	HWPfDmaFec4GdlRespPtrHiRegVf          =  0x00B8808C,
+	HWPfDmaVfDdrBaseRangeRo               =  0x00B880A0,
+	HWPfQosmonACntrlReg                   =  0x00B90000,
+	HWPfQosmonAEvalOverflow0              =  0x00B90008,
+	HWPfQosmonAEvalOverflow1              =  0x00B9000C,
+	HWPfQosmonADivTerm                    =  0x00B90010,
+	HWPfQosmonATickTerm                   =  0x00B90014,
+	HWPfQosmonAEvalTerm                   =  0x00B90018,
+	HWPfQosmonAAveTerm                    =  0x00B9001C,
+	HWPfQosmonAForceEccErr                =  0x00B90020,
+	HWPfQosmonAEccErrDetect               =  0x00B90024,
+	HWPfQosmonAIterationConfig0Low        =  0x00B90060,
+	HWPfQosmonAIterationConfig0High       =  0x00B90064,
+	HWPfQosmonAIterationConfig1Low        =  0x00B90068,
+	HWPfQosmonAIterationConfig1High       =  0x00B9006C,
+	HWPfQosmonAIterationConfig2Low        =  0x00B90070,
+	HWPfQosmonAIterationConfig2High       =  0x00B90074,
+	HWPfQosmonAIterationConfig3Low        =  0x00B90078,
+	HWPfQosmonAIterationConfig3High       =  0x00B9007C,
+	HWPfQosmonAEvalMemAddr                =  0x00B90080,
+	HWPfQosmonAEvalMemData                =  0x00B90084,
+	HWPfQosmonAXaction                    =  0x00B900C0,
+	HWPfQosmonARemThres1Vf                =  0x00B90400,
+	HWPfQosmonAThres2Vf                   =  0x00B90404,
+	HWPfQosmonAWeiFracVf                  =  0x00B90408,
+	HWPfQosmonARrWeiVf                    =  0x00B9040C,
+	HWPfPermonACntrlRegVf                 =  0x00B98000,
+	HWPfPermonACountVf                    =  0x00B98008,
+	HWPfPermonAKCntLoVf                   =  0x00B98010,
+	HWPfPermonAKCntHiVf                   =  0x00B98014,
+	HWPfPermonADeltaCntLoVf               =  0x00B98020,
+	HWPfPermonADeltaCntHiVf               =  0x00B98024,
+	HWPfPermonAVersionReg                 =  0x00B9C000,
+	HWPfPermonACbControlFec               =  0x00B9C0F0,
+	HWPfPermonADltTimerLoFec              =  0x00B9C0F4,
+	HWPfPermonADltTimerHiFec              =  0x00B9C0F8,
+	HWPfPermonACbCountFec                 =  0x00B9C100,
+	HWPfPermonAAccExecTimerLoFec          =  0x00B9C104,
+	HWPfPermonAAccExecTimerHiFec          =  0x00B9C108,
+	HWPfPermonAExecTimerMinFec            =  0x00B9C200,
+	HWPfPermonAExecTimerMaxFec            =  0x00B9C204,
+	HWPfPermonAControlBusMon              =  0x00B9C400,
+	HWPfPermonAConfigBusMon               =  0x00B9C404,
+	HWPfPermonASkipCountBusMon            =  0x00B9C408,
+	HWPfPermonAMinLatBusMon               =  0x00B9C40C,
+	HWPfPermonAMaxLatBusMon               =  0x00B9C500,
+	HWPfPermonATotalLatLowBusMon          =  0x00B9C504,
+	HWPfPermonATotalLatUpperBusMon        =  0x00B9C508,
+	HWPfPermonATotalReqCntBusMon          =  0x00B9C50C,
+	HWPfQosmonBCntrlReg                   =  0x00BA0000,
+	HWPfQosmonBEvalOverflow0              =  0x00BA0008,
+	HWPfQosmonBEvalOverflow1              =  0x00BA000C,
+	HWPfQosmonBDivTerm                    =  0x00BA0010,
+	HWPfQosmonBTickTerm                   =  0x00BA0014,
+	HWPfQosmonBEvalTerm                   =  0x00BA0018,
+	HWPfQosmonBAveTerm                    =  0x00BA001C,
+	HWPfQosmonBForceEccErr                =  0x00BA0020,
+	HWPfQosmonBEccErrDetect               =  0x00BA0024,
+	HWPfQosmonBIterationConfig0Low        =  0x00BA0060,
+	HWPfQosmonBIterationConfig0High       =  0x00BA0064,
+	HWPfQosmonBIterationConfig1Low        =  0x00BA0068,
+	HWPfQosmonBIterationConfig1High       =  0x00BA006C,
+	HWPfQosmonBIterationConfig2Low        =  0x00BA0070,
+	HWPfQosmonBIterationConfig2High       =  0x00BA0074,
+	HWPfQosmonBIterationConfig3Low        =  0x00BA0078,
+	HWPfQosmonBIterationConfig3High       =  0x00BA007C,
+	HWPfQosmonBEvalMemAddr                =  0x00BA0080,
+	HWPfQosmonBEvalMemData                =  0x00BA0084,
+	HWPfQosmonBXaction                    =  0x00BA00C0,
+	HWPfQosmonBRemThres1Vf                =  0x00BA0400,
+	HWPfQosmonBThres2Vf                   =  0x00BA0404,
+	HWPfQosmonBWeiFracVf                  =  0x00BA0408,
+	HWPfQosmonBRrWeiVf                    =  0x00BA040C,
+	HWPfPermonBCntrlRegVf                 =  0x00BA8000,
+	HWPfPermonBCountVf                    =  0x00BA8008,
+	HWPfPermonBKCntLoVf                   =  0x00BA8010,
+	HWPfPermonBKCntHiVf                   =  0x00BA8014,
+	HWPfPermonBDeltaCntLoVf               =  0x00BA8020,
+	HWPfPermonBDeltaCntHiVf               =  0x00BA8024,
+	HWPfPermonBVersionReg                 =  0x00BAC000,
+	HWPfPermonBCbControlFec               =  0x00BAC0F0,
+	HWPfPermonBDltTimerLoFec              =  0x00BAC0F4,
+	HWPfPermonBDltTimerHiFec              =  0x00BAC0F8,
+	HWPfPermonBCbCountFec                 =  0x00BAC100,
+	HWPfPermonBAccExecTimerLoFec          =  0x00BAC104,
+	HWPfPermonBAccExecTimerHiFec          =  0x00BAC108,
+	HWPfPermonBExecTimerMinFec            =  0x00BAC200,
+	HWPfPermonBExecTimerMaxFec            =  0x00BAC204,
+	HWPfPermonBControlBusMon              =  0x00BAC400,
+	HWPfPermonBConfigBusMon               =  0x00BAC404,
+	HWPfPermonBSkipCountBusMon            =  0x00BAC408,
+	HWPfPermonBMinLatBusMon               =  0x00BAC40C,
+	HWPfPermonBMaxLatBusMon               =  0x00BAC500,
+	HWPfPermonBTotalLatLowBusMon          =  0x00BAC504,
+	HWPfPermonBTotalLatUpperBusMon        =  0x00BAC508,
+	HWPfPermonBTotalReqCntBusMon          =  0x00BAC50C,
+	HWPfFecUl5gCntrlReg                   =  0x00BC0000,
+	HWPfFecUl5gI2MThreshReg               =  0x00BC0004,
+	HWPfFecUl5gVersionReg                 =  0x00BC0100,
+	HWPfFecUl5gFcwStatusReg               =  0x00BC0104,
+	HWPfFecUl5gWarnReg                    =  0x00BC0108,
+	HwPfFecUl5gIbDebugReg                 =  0x00BC0200,
+	HwPfFecUl5gObLlrDebugReg              =  0x00BC0204,
+	HwPfFecUl5gObHarqDebugReg             =  0x00BC0208,
+	HwPfFecUl5g1CntrlReg                  =  0x00BC1000,
+	HwPfFecUl5g1I2MThreshReg              =  0x00BC1004,
+	HwPfFecUl5g1VersionReg                =  0x00BC1100,
+	HwPfFecUl5g1FcwStatusReg              =  0x00BC1104,
+	HwPfFecUl5g1WarnReg                   =  0x00BC1108,
+	HwPfFecUl5g1IbDebugReg                =  0x00BC1200,
+	HwPfFecUl5g1ObLlrDebugReg             =  0x00BC1204,
+	HwPfFecUl5g1ObHarqDebugReg            =  0x00BC1208,
+	HwPfFecUl5g2CntrlReg                  =  0x00BC2000,
+	HwPfFecUl5g2I2MThreshReg              =  0x00BC2004,
+	HwPfFecUl5g2VersionReg                =  0x00BC2100,
+	HwPfFecUl5g2FcwStatusReg              =  0x00BC2104,
+	HwPfFecUl5g2WarnReg                   =  0x00BC2108,
+	HwPfFecUl5g2IbDebugReg                =  0x00BC2200,
+	HwPfFecUl5g2ObLlrDebugReg             =  0x00BC2204,
+	HwPfFecUl5g2ObHarqDebugReg            =  0x00BC2208,
+	HwPfFecUl5g3CntrlReg                  =  0x00BC3000,
+	HwPfFecUl5g3I2MThreshReg              =  0x00BC3004,
+	HwPfFecUl5g3VersionReg                =  0x00BC3100,
+	HwPfFecUl5g3FcwStatusReg              =  0x00BC3104,
+	HwPfFecUl5g3WarnReg                   =  0x00BC3108,
+	HwPfFecUl5g3IbDebugReg                =  0x00BC3200,
+	HwPfFecUl5g3ObLlrDebugReg             =  0x00BC3204,
+	HwPfFecUl5g3ObHarqDebugReg            =  0x00BC3208,
+	HwPfFecUl5g4CntrlReg                  =  0x00BC4000,
+	HwPfFecUl5g4I2MThreshReg              =  0x00BC4004,
+	HwPfFecUl5g4VersionReg                =  0x00BC4100,
+	HwPfFecUl5g4FcwStatusReg              =  0x00BC4104,
+	HwPfFecUl5g4WarnReg                   =  0x00BC4108,
+	HwPfFecUl5g4IbDebugReg                =  0x00BC4200,
+	HwPfFecUl5g4ObLlrDebugReg             =  0x00BC4204,
+	HwPfFecUl5g4ObHarqDebugReg            =  0x00BC4208,
+	HwPfFecUl5g5CntrlReg                  =  0x00BC5000,
+	HwPfFecUl5g5I2MThreshReg              =  0x00BC5004,
+	HwPfFecUl5g5VersionReg                =  0x00BC5100,
+	HwPfFecUl5g5FcwStatusReg              =  0x00BC5104,
+	HwPfFecUl5g5WarnReg                   =  0x00BC5108,
+	HwPfFecUl5g5IbDebugReg                =  0x00BC5200,
+	HwPfFecUl5g5ObLlrDebugReg             =  0x00BC5204,
+	HwPfFecUl5g5ObHarqDebugReg            =  0x00BC5208,
+	HwPfFecUl5g6CntrlReg                  =  0x00BC6000,
+	HwPfFecUl5g6I2MThreshReg              =  0x00BC6004,
+	HwPfFecUl5g6VersionReg                =  0x00BC6100,
+	HwPfFecUl5g6FcwStatusReg              =  0x00BC6104,
+	HwPfFecUl5g6WarnReg                   =  0x00BC6108,
+	HwPfFecUl5g6IbDebugReg                =  0x00BC6200,
+	HwPfFecUl5g6ObLlrDebugReg             =  0x00BC6204,
+	HwPfFecUl5g6ObHarqDebugReg            =  0x00BC6208,
+	HwPfFecUl5g7CntrlReg                  =  0x00BC7000,
+	HwPfFecUl5g7I2MThreshReg              =  0x00BC7004,
+	HwPfFecUl5g7VersionReg                =  0x00BC7100,
+	HwPfFecUl5g7FcwStatusReg              =  0x00BC7104,
+	HwPfFecUl5g7WarnReg                   =  0x00BC7108,
+	HwPfFecUl5g7IbDebugReg                =  0x00BC7200,
+	HwPfFecUl5g7ObLlrDebugReg             =  0x00BC7204,
+	HwPfFecUl5g7ObHarqDebugReg            =  0x00BC7208,
+	HwPfFecUl5g8CntrlReg                  =  0x00BC8000,
+	HwPfFecUl5g8I2MThreshReg              =  0x00BC8004,
+	HwPfFecUl5g8VersionReg                =  0x00BC8100,
+	HwPfFecUl5g8FcwStatusReg              =  0x00BC8104,
+	HwPfFecUl5g8WarnReg                   =  0x00BC8108,
+	HwPfFecUl5g8IbDebugReg                =  0x00BC8200,
+	HwPfFecUl5g8ObLlrDebugReg             =  0x00BC8204,
+	HwPfFecUl5g8ObHarqDebugReg            =  0x00BC8208,
+	HWPfFecDl5gCntrlReg                   =  0x00BCF000,
+	HWPfFecDl5gI2MThreshReg               =  0x00BCF004,
+	HWPfFecDl5gVersionReg                 =  0x00BCF100,
+	HWPfFecDl5gFcwStatusReg               =  0x00BCF104,
+	HWPfFecDl5gWarnReg                    =  0x00BCF108,
+	HWPfFecUlVersionReg                   =  0x00BD0000,
+	HWPfFecUlControlReg                   =  0x00BD0004,
+	HWPfFecUlStatusReg                    =  0x00BD0008,
+	HWPfFecDlVersionReg                   =  0x00BDF000,
+	HWPfFecDlClusterConfigReg             =  0x00BDF004,
+	HWPfFecDlBurstThres                   =  0x00BDF00C,
+	HWPfFecDlClusterStatusReg0            =  0x00BDF040,
+	HWPfFecDlClusterStatusReg1            =  0x00BDF044,
+	HWPfFecDlClusterStatusReg2            =  0x00BDF048,
+	HWPfFecDlClusterStatusReg3            =  0x00BDF04C,
+	HWPfFecDlClusterStatusReg4            =  0x00BDF050,
+	HWPfFecDlClusterStatusReg5            =  0x00BDF054,
+	HWPfChaFabPllPllrst                   =  0x00C40000,
+	HWPfChaFabPllClk0                     =  0x00C40004,
+	HWPfChaFabPllClk1                     =  0x00C40008,
+	HWPfChaFabPllBwadj                    =  0x00C4000C,
+	HWPfChaFabPllLbw                      =  0x00C40010,
+	HWPfChaFabPllResetq                   =  0x00C40014,
+	HWPfChaFabPllPhshft0                  =  0x00C40018,
+	HWPfChaFabPllPhshft1                  =  0x00C4001C,
+	HWPfChaFabPllDivq0                    =  0x00C40020,
+	HWPfChaFabPllDivq1                    =  0x00C40024,
+	HWPfChaFabPllDivq2                    =  0x00C40028,
+	HWPfChaFabPllDivq3                    =  0x00C4002C,
+	HWPfChaFabPllDivq4                    =  0x00C40030,
+	HWPfChaFabPllDivq5                    =  0x00C40034,
+	HWPfChaFabPllDivq6                    =  0x00C40038,
+	HWPfChaFabPllDivq7                    =  0x00C4003C,
+	HWPfChaDl5gPllPllrst                  =  0x00C40080,
+	HWPfChaDl5gPllClk0                    =  0x00C40084,
+	HWPfChaDl5gPllClk1                    =  0x00C40088,
+	HWPfChaDl5gPllBwadj                   =  0x00C4008C,
+	HWPfChaDl5gPllLbw                     =  0x00C40090,
+	HWPfChaDl5gPllResetq                  =  0x00C40094,
+	HWPfChaDl5gPllPhshft0                 =  0x00C40098,
+	HWPfChaDl5gPllPhshft1                 =  0x00C4009C,
+	HWPfChaDl5gPllDivq0                   =  0x00C400A0,
+	HWPfChaDl5gPllDivq1                   =  0x00C400A4,
+	HWPfChaDl5gPllDivq2                   =  0x00C400A8,
+	HWPfChaDl5gPllDivq3                   =  0x00C400AC,
+	HWPfChaDl5gPllDivq4                   =  0x00C400B0,
+	HWPfChaDl5gPllDivq5                   =  0x00C400B4,
+	HWPfChaDl5gPllDivq6                   =  0x00C400B8,
+	HWPfChaDl5gPllDivq7                   =  0x00C400BC,
+	HWPfChaDl4gPllPllrst                  =  0x00C40100,
+	HWPfChaDl4gPllClk0                    =  0x00C40104,
+	HWPfChaDl4gPllClk1                    =  0x00C40108,
+	HWPfChaDl4gPllBwadj                   =  0x00C4010C,
+	HWPfChaDl4gPllLbw                     =  0x00C40110,
+	HWPfChaDl4gPllResetq                  =  0x00C40114,
+	HWPfChaDl4gPllPhshft0                 =  0x00C40118,
+	HWPfChaDl4gPllPhshft1                 =  0x00C4011C,
+	HWPfChaDl4gPllDivq0                   =  0x00C40120,
+	HWPfChaDl4gPllDivq1                   =  0x00C40124,
+	HWPfChaDl4gPllDivq2                   =  0x00C40128,
+	HWPfChaDl4gPllDivq3                   =  0x00C4012C,
+	HWPfChaDl4gPllDivq4                   =  0x00C40130,
+	HWPfChaDl4gPllDivq5                   =  0x00C40134,
+	HWPfChaDl4gPllDivq6                   =  0x00C40138,
+	HWPfChaDl4gPllDivq7                   =  0x00C4013C,
+	HWPfChaUl5gPllPllrst                  =  0x00C40180,
+	HWPfChaUl5gPllClk0                    =  0x00C40184,
+	HWPfChaUl5gPllClk1                    =  0x00C40188,
+	HWPfChaUl5gPllBwadj                   =  0x00C4018C,
+	HWPfChaUl5gPllLbw                     =  0x00C40190,
+	HWPfChaUl5gPllResetq                  =  0x00C40194,
+	HWPfChaUl5gPllPhshft0                 =  0x00C40198,
+	HWPfChaUl5gPllPhshft1                 =  0x00C4019C,
+	HWPfChaUl5gPllDivq0                   =  0x00C401A0,
+	HWPfChaUl5gPllDivq1                   =  0x00C401A4,
+	HWPfChaUl5gPllDivq2                   =  0x00C401A8,
+	HWPfChaUl5gPllDivq3                   =  0x00C401AC,
+	HWPfChaUl5gPllDivq4                   =  0x00C401B0,
+	HWPfChaUl5gPllDivq5                   =  0x00C401B4,
+	HWPfChaUl5gPllDivq6                   =  0x00C401B8,
+	HWPfChaUl5gPllDivq7                   =  0x00C401BC,
+	HWPfChaUl4gPllPllrst                  =  0x00C40200,
+	HWPfChaUl4gPllClk0                    =  0x00C40204,
+	HWPfChaUl4gPllClk1                    =  0x00C40208,
+	HWPfChaUl4gPllBwadj                   =  0x00C4020C,
+	HWPfChaUl4gPllLbw                     =  0x00C40210,
+	HWPfChaUl4gPllResetq                  =  0x00C40214,
+	HWPfChaUl4gPllPhshft0                 =  0x00C40218,
+	HWPfChaUl4gPllPhshft1                 =  0x00C4021C,
+	HWPfChaUl4gPllDivq0                   =  0x00C40220,
+	HWPfChaUl4gPllDivq1                   =  0x00C40224,
+	HWPfChaUl4gPllDivq2                   =  0x00C40228,
+	HWPfChaUl4gPllDivq3                   =  0x00C4022C,
+	HWPfChaUl4gPllDivq4                   =  0x00C40230,
+	HWPfChaUl4gPllDivq5                   =  0x00C40234,
+	HWPfChaUl4gPllDivq6                   =  0x00C40238,
+	HWPfChaUl4gPllDivq7                   =  0x00C4023C,
+	HWPfChaDdrPllPllrst                   =  0x00C40280,
+	HWPfChaDdrPllClk0                     =  0x00C40284,
+	HWPfChaDdrPllClk1                     =  0x00C40288,
+	HWPfChaDdrPllBwadj                    =  0x00C4028C,
+	HWPfChaDdrPllLbw                      =  0x00C40290,
+	HWPfChaDdrPllResetq                   =  0x00C40294,
+	HWPfChaDdrPllPhshft0                  =  0x00C40298,
+	HWPfChaDdrPllPhshft1                  =  0x00C4029C,
+	HWPfChaDdrPllDivq0                    =  0x00C402A0,
+	HWPfChaDdrPllDivq1                    =  0x00C402A4,
+	HWPfChaDdrPllDivq2                    =  0x00C402A8,
+	HWPfChaDdrPllDivq3                    =  0x00C402AC,
+	HWPfChaDdrPllDivq4                    =  0x00C402B0,
+	HWPfChaDdrPllDivq5                    =  0x00C402B4,
+	HWPfChaDdrPllDivq6                    =  0x00C402B8,
+	HWPfChaDdrPllDivq7                    =  0x00C402BC,
+	HWPfChaErrStatus                      =  0x00C40400,
+	HWPfChaErrMask                        =  0x00C40404,
+	HWPfChaDebugPcieMsiFifo               =  0x00C40410,
+	HWPfChaDebugDdrMsiFifo                =  0x00C40414,
+	HWPfChaDebugMiscMsiFifo               =  0x00C40418,
+	HWPfChaPwmSet                         =  0x00C40420,
+	HWPfChaDdrRstStatus                   =  0x00C40430,
+	HWPfChaDdrStDoneStatus                =  0x00C40434,
+	HWPfChaDdrWbRstCfg                    =  0x00C40438,
+	HWPfChaDdrApbRstCfg                   =  0x00C4043C,
+	HWPfChaDdrPhyRstCfg                   =  0x00C40440,
+	HWPfChaDdrCpuRstCfg                   =  0x00C40444,
+	HWPfChaDdrSifRstCfg                   =  0x00C40448,
+	HWPfChaPadcfgPcomp0                   =  0x00C41000,
+	HWPfChaPadcfgNcomp0                   =  0x00C41004,
+	HWPfChaPadcfgOdt0                     =  0x00C41008,
+	HWPfChaPadcfgProtect0                 =  0x00C4100C,
+	HWPfChaPreemphasisProtect0            =  0x00C41010,
+	HWPfChaPreemphasisCompen0             =  0x00C41040,
+	HWPfChaPreemphasisOdten0              =  0x00C41044,
+	HWPfChaPadcfgPcomp1                   =  0x00C41100,
+	HWPfChaPadcfgNcomp1                   =  0x00C41104,
+	HWPfChaPadcfgOdt1                     =  0x00C41108,
+	HWPfChaPadcfgProtect1                 =  0x00C4110C,
+	HWPfChaPreemphasisProtect1            =  0x00C41110,
+	HWPfChaPreemphasisCompen1             =  0x00C41140,
+	HWPfChaPreemphasisOdten1              =  0x00C41144,
+	HWPfChaPadcfgPcomp2                   =  0x00C41200,
+	HWPfChaPadcfgNcomp2                   =  0x00C41204,
+	HWPfChaPadcfgOdt2                     =  0x00C41208,
+	HWPfChaPadcfgProtect2                 =  0x00C4120C,
+	HWPfChaPreemphasisProtect2            =  0x00C41210,
+	HWPfChaPreemphasisCompen2             =  0x00C41240,
+	HWPfChaPreemphasisOdten2              =  0x00C41244,
+	HWPfChaPadcfgPcomp3                   =  0x00C41300,
+	HWPfChaPadcfgNcomp3                   =  0x00C41304,
+	HWPfChaPadcfgOdt3                     =  0x00C41308,
+	HWPfChaPadcfgProtect3                 =  0x00C4130C,
+	HWPfChaPreemphasisProtect3            =  0x00C41310,
+	HWPfChaPreemphasisCompen3             =  0x00C41340,
+	HWPfChaPreemphasisOdten3              =  0x00C41344,
+	HWPfChaPadcfgPcomp4                   =  0x00C41400,
+	HWPfChaPadcfgNcomp4                   =  0x00C41404,
+	HWPfChaPadcfgOdt4                     =  0x00C41408,
+	HWPfChaPadcfgProtect4                 =  0x00C4140C,
+	HWPfChaPreemphasisProtect4            =  0x00C41410,
+	HWPfChaPreemphasisCompen4             =  0x00C41440,
+	HWPfChaPreemphasisOdten4              =  0x00C41444,
+	HWPfHiVfToPfDbellVf                   =  0x00C80000,
+	HWPfHiPfToVfDbellVf                   =  0x00C80008,
+	HWPfHiInfoRingBaseLoVf                =  0x00C80010,
+	HWPfHiInfoRingBaseHiVf                =  0x00C80014,
+	HWPfHiInfoRingPointerVf               =  0x00C80018,
+	HWPfHiInfoRingIntWrEnVf               =  0x00C80020,
+	HWPfHiInfoRingPf2VfWrEnVf             =  0x00C80024,
+	HWPfHiMsixVectorMapperVf              =  0x00C80060,
+	HWPfHiModuleVersionReg                =  0x00C84000,
+	HWPfHiIosf2axiErrLogReg               =  0x00C84004,
+	HWPfHiHardResetReg                    =  0x00C84008,
+	HWPfHi5GHardResetReg                  =  0x00C8400C,
+	HWPfHiInfoRingBaseLoRegPf             =  0x00C84010,
+	HWPfHiInfoRingBaseHiRegPf             =  0x00C84014,
+	HWPfHiInfoRingPointerRegPf            =  0x00C84018,
+	HWPfHiInfoRingIntWrEnRegPf            =  0x00C84020,
+	HWPfHiInfoRingVf2pfLoWrEnReg          =  0x00C84024,
+	HWPfHiInfoRingVf2pfHiWrEnReg          =  0x00C84028,
+	HWPfHiLogParityErrStatusReg           =  0x00C8402C,
+	HWPfHiLogDataParityErrorVfStatusLo    =  0x00C84030,
+	HWPfHiLogDataParityErrorVfStatusHi    =  0x00C84034,
+	HWPfHiBlockTransmitOnErrorEn          =  0x00C84038,
+	HWPfHiCfgMsiIntWrEnRegPf              =  0x00C84040,
+	HWPfHiCfgMsiVf2pfLoWrEnReg            =  0x00C84044,
+	HWPfHiCfgMsiVf2pfHighWrEnReg          =  0x00C84048,
+	HWPfHiMsixVectorMapperPf              =  0x00C84060,
+	HWPfHiApbWrWaitTime                   =  0x00C84100,
+	HWPfHiXCounterMaxValue                =  0x00C84104,
+	HWPfHiPfMode                          =  0x00C84108,
+	HWPfHiClkGateHystReg                  =  0x00C8410C,
+	HWPfHiSnoopBitsReg                    =  0x00C84110,
+	HWPfHiMsiDropEnableReg                =  0x00C84114,
+	HWPfHiMsiStatReg                      =  0x00C84120,
+	HWPfHiFifoOflStatReg                  =  0x00C84124,
+	HWPfHiHiDebugReg                      =  0x00C841F4,
+	HWPfHiDebugMemSnoopMsiFifo            =  0x00C841F8,
+	HWPfHiDebugMemSnoopInputFifo          =  0x00C841FC,
+	HWPfHiMsixMappingConfig               =  0x00C84200,
+	HWPfHiJunkReg                         =  0x00C8FF00,
+	HWPfDdrUmmcVer                        =  0x00D00000,
+	HWPfDdrUmmcCap                        =  0x00D00010,
+	HWPfDdrUmmcCtrl                       =  0x00D00020,
+	HWPfDdrMpcPe                          =  0x00D00080,
+	HWPfDdrMpcPpri3                       =  0x00D00090,
+	HWPfDdrMpcPpri2                       =  0x00D000A0,
+	HWPfDdrMpcPpri1                       =  0x00D000B0,
+	HWPfDdrMpcPpri0                       =  0x00D000C0,
+	HWPfDdrMpcPrwgrpCtrl                  =  0x00D000D0,
+	HWPfDdrMpcPbw7                        =  0x00D000E0,
+	HWPfDdrMpcPbw6                        =  0x00D000F0,
+	HWPfDdrMpcPbw5                        =  0x00D00100,
+	HWPfDdrMpcPbw4                        =  0x00D00110,
+	HWPfDdrMpcPbw3                        =  0x00D00120,
+	HWPfDdrMpcPbw2                        =  0x00D00130,
+	HWPfDdrMpcPbw1                        =  0x00D00140,
+	HWPfDdrMpcPbw0                        =  0x00D00150,
+	HWPfDdrMemoryInit                     =  0x00D00200,
+	HWPfDdrMemoryInitDone                 =  0x00D00210,
+	HWPfDdrMemInitPhyTrng0                =  0x00D00240,
+	HWPfDdrMemInitPhyTrng1                =  0x00D00250,
+	HWPfDdrMemInitPhyTrng2                =  0x00D00260,
+	HWPfDdrMemInitPhyTrng3                =  0x00D00270,
+	HWPfDdrBcDram                         =  0x00D003C0,
+	HWPfDdrBcAddrMap                      =  0x00D003D0,
+	HWPfDdrBcRef                          =  0x00D003E0,
+	HWPfDdrBcTim0                         =  0x00D00400,
+	HWPfDdrBcTim1                         =  0x00D00410,
+	HWPfDdrBcTim2                         =  0x00D00420,
+	HWPfDdrBcTim3                         =  0x00D00430,
+	HWPfDdrBcTim4                         =  0x00D00440,
+	HWPfDdrBcTim5                         =  0x00D00450,
+	HWPfDdrBcTim6                         =  0x00D00460,
+	HWPfDdrBcTim7                         =  0x00D00470,
+	HWPfDdrBcTim8                         =  0x00D00480,
+	HWPfDdrBcTim9                         =  0x00D00490,
+	HWPfDdrBcTim10                        =  0x00D004A0,
+	HWPfDdrBcTim12                        =  0x00D004C0,
+	HWPfDdrDfiInit                        =  0x00D004D0,
+	HWPfDdrDfiInitComplete                =  0x00D004E0,
+	HWPfDdrDfiTim0                        =  0x00D004F0,
+	HWPfDdrDfiTim1                        =  0x00D00500,
+	HWPfDdrDfiPhyUpdEn                    =  0x00D00530,
+	HWPfDdrMemStatus                      =  0x00D00540,
+	HWPfDdrUmmcErrStatus                  =  0x00D00550,
+	HWPfDdrUmmcIntStatus                  =  0x00D00560,
+	HWPfDdrUmmcIntEn                      =  0x00D00570,
+	HWPfDdrPhyRdLatency                   =  0x00D48400,
+	HWPfDdrPhyRdLatencyDbi                =  0x00D48410,
+	HWPfDdrPhyWrLatency                   =  0x00D48420,
+	HWPfDdrPhyTrngType                    =  0x00D48430,
+	HWPfDdrPhyMrsTiming2                  =  0x00D48440,
+	HWPfDdrPhyMrsTiming0                  =  0x00D48450,
+	HWPfDdrPhyMrsTiming1                  =  0x00D48460,
+	HWPfDdrPhyDramTmrd                    =  0x00D48470,
+	HWPfDdrPhyDramTmod                    =  0x00D48480,
+	HWPfDdrPhyDramTwpre                   =  0x00D48490,
+	HWPfDdrPhyDramTrfc                    =  0x00D484A0,
+	HWPfDdrPhyDramTrwtp                   =  0x00D484B0,
+	HWPfDdrPhyMr01Dimm                    =  0x00D484C0,
+	HWPfDdrPhyMr01DimmDbi                 =  0x00D484D0,
+	HWPfDdrPhyMr23Dimm                    =  0x00D484E0,
+	HWPfDdrPhyMr45Dimm                    =  0x00D484F0,
+	HWPfDdrPhyMr67Dimm                    =  0x00D48500,
+	HWPfDdrPhyWrlvlWwRdlvlRr              =  0x00D48510,
+	HWPfDdrPhyOdtEn                       =  0x00D48520,
+	HWPfDdrPhyFastTrng                    =  0x00D48530,
+	HWPfDdrPhyDynTrngGap                  =  0x00D48540,
+	HWPfDdrPhyDynRcalGap                  =  0x00D48550,
+	HWPfDdrPhyIdletimeout                 =  0x00D48560,
+	HWPfDdrPhyRstCkeGap                   =  0x00D48570,
+	HWPfDdrPhyCkeMrsGap                   =  0x00D48580,
+	HWPfDdrPhyMemVrefMidVal               =  0x00D48590,
+	HWPfDdrPhyVrefStep                    =  0x00D485A0,
+	HWPfDdrPhyVrefThreshold               =  0x00D485B0,
+	HWPfDdrPhyPhyVrefMidVal               =  0x00D485C0,
+	HWPfDdrPhyDqsCountMax                 =  0x00D485D0,
+	HWPfDdrPhyDqsCountNum                 =  0x00D485E0,
+	HWPfDdrPhyDramRow                     =  0x00D485F0,
+	HWPfDdrPhyDramCol                     =  0x00D48600,
+	HWPfDdrPhyDramBgBa                    =  0x00D48610,
+	HWPfDdrPhyDynamicUpdreqrel            =  0x00D48620,
+	HWPfDdrPhyVrefLimits                  =  0x00D48630,
+	HWPfDdrPhyIdtmTcStatus                =  0x00D6C020,
+	HWPfDdrPhyIdtmFwVersion               =  0x00D6C410,
+	HWPfDdrPhyRdlvlGateInitDelay          =  0x00D70000,
+	HWPfDdrPhyRdenSmplabc                 =  0x00D70008,
+	HWPfDdrPhyVrefNibble0                 =  0x00D7000C,
+	HWPfDdrPhyVrefNibble1                 =  0x00D70010,
+	HWPfDdrPhyRdlvlGateDqsSmpl0           =  0x00D70014,
+	HWPfDdrPhyRdlvlGateDqsSmpl1           =  0x00D70018,
+	HWPfDdrPhyRdlvlGateDqsSmpl2           =  0x00D7001C,
+	HWPfDdrPhyDqsCount                    =  0x00D70020,
+	HWPfDdrPhyWrlvlRdlvlGateStatus        =  0x00D70024,
+	HWPfDdrPhyErrorFlags                  =  0x00D70028,
+	HWPfDdrPhyPowerDown                   =  0x00D70030,
+	HWPfDdrPhyPrbsSeedByte0               =  0x00D70034,
+	HWPfDdrPhyPrbsSeedByte1               =  0x00D70038,
+	HWPfDdrPhyPcompDq                     =  0x00D70040,
+	HWPfDdrPhyNcompDq                     =  0x00D70044,
+	HWPfDdrPhyPcompDqs                    =  0x00D70048,
+	HWPfDdrPhyNcompDqs                    =  0x00D7004C,
+	HWPfDdrPhyPcompCmd                    =  0x00D70050,
+	HWPfDdrPhyNcompCmd                    =  0x00D70054,
+	HWPfDdrPhyPcompCk                     =  0x00D70058,
+	HWPfDdrPhyNcompCk                     =  0x00D7005C,
+	HWPfDdrPhyRcalOdtDq                   =  0x00D70060,
+	HWPfDdrPhyRcalOdtDqs                  =  0x00D70064,
+	HWPfDdrPhyRcalMask1                   =  0x00D70068,
+	HWPfDdrPhyRcalMask2                   =  0x00D7006C,
+	HWPfDdrPhyRcalCtrl                    =  0x00D70070,
+	HWPfDdrPhyRcalCnt                     =  0x00D70074,
+	HWPfDdrPhyRcalOverride                =  0x00D70078,
+	HWPfDdrPhyRcalGateen                  =  0x00D7007C,
+	HWPfDdrPhyCtrl                        =  0x00D70080,
+	HWPfDdrPhyWrlvlAlg                    =  0x00D70084,
+	HWPfDdrPhyRcalVreftTxcmdOdt           =  0x00D70088,
+	HWPfDdrPhyRdlvlGateParam              =  0x00D7008C,
+	HWPfDdrPhyRdlvlGateParam2             =  0x00D70090,
+	HWPfDdrPhyRcalVreftTxdata             =  0x00D70094,
+	HWPfDdrPhyCmdIntDelay                 =  0x00D700A4,
+	HWPfDdrPhyAlertN                      =  0x00D700A8,
+	HWPfDdrPhyTrngReqWpre2tck             =  0x00D700AC,
+	HWPfDdrPhyCmdPhaseSel                 =  0x00D700B4,
+	HWPfDdrPhyCmdDcdl                     =  0x00D700B8,
+	HWPfDdrPhyCkDcdl                      =  0x00D700BC,
+	HWPfDdrPhySwTrngCtrl1                 =  0x00D700C0,
+	HWPfDdrPhySwTrngCtrl2                 =  0x00D700C4,
+	HWPfDdrPhyRcalPcompRden               =  0x00D700C8,
+	HWPfDdrPhyRcalNcompRden               =  0x00D700CC,
+	HWPfDdrPhyRcalCompen                  =  0x00D700D0,
+	HWPfDdrPhySwTrngRdqs                  =  0x00D700D4,
+	HWPfDdrPhySwTrngWdqs                  =  0x00D700D8,
+	HWPfDdrPhySwTrngRdena                 =  0x00D700DC,
+	HWPfDdrPhySwTrngRdenb                 =  0x00D700E0,
+	HWPfDdrPhySwTrngRdenc                 =  0x00D700E4,
+	HWPfDdrPhySwTrngWdq                   =  0x00D700E8,
+	HWPfDdrPhySwTrngRdq                   =  0x00D700EC,
+	HWPfDdrPhyPcfgHmValue                 =  0x00D700F0,
+	HWPfDdrPhyPcfgTimerValue              =  0x00D700F4,
+	HWPfDdrPhyPcfgSoftwareTraining        =  0x00D700F8,
+	HWPfDdrPhyPcfgMcStatus                =  0x00D700FC,
+	HWPfDdrPhyWrlvlPhRank0                =  0x00D70100,
+	HWPfDdrPhyRdenPhRank0                 =  0x00D70104,
+	HWPfDdrPhyRdenIntRank0                =  0x00D70108,
+	HWPfDdrPhyRdqsDcdlRank0               =  0x00D7010C,
+	HWPfDdrPhyRdqsShadowDcdlRank0         =  0x00D70110,
+	HWPfDdrPhyWdqsDcdlRank0               =  0x00D70114,
+	HWPfDdrPhyWdmDcdlShadowRank0          =  0x00D70118,
+	HWPfDdrPhyWdmDcdlRank0                =  0x00D7011C,
+	HWPfDdrPhyDbiDcdlRank0                =  0x00D70120,
+	HWPfDdrPhyRdenDcdlaRank0              =  0x00D70124,
+	HWPfDdrPhyDbiDcdlShadowRank0          =  0x00D70128,
+	HWPfDdrPhyRdenDcdlbRank0              =  0x00D7012C,
+	HWPfDdrPhyWdqsShadowDcdlRank0         =  0x00D70130,
+	HWPfDdrPhyRdenDcdlcRank0              =  0x00D70134,
+	HWPfDdrPhyRdenShadowDcdlaRank0        =  0x00D70138,
+	HWPfDdrPhyWrlvlIntRank0               =  0x00D7013C,
+	HWPfDdrPhyRdqDcdlBit0Rank0            =  0x00D70200,
+	HWPfDdrPhyRdqDcdlShadowBit0Rank0      =  0x00D70204,
+	HWPfDdrPhyWdqDcdlBit0Rank0            =  0x00D70208,
+	HWPfDdrPhyWdqDcdlShadowBit0Rank0      =  0x00D7020C,
+	HWPfDdrPhyRdqDcdlBit1Rank0            =  0x00D70240,
+	HWPfDdrPhyRdqDcdlShadowBit1Rank0      =  0x00D70244,
+	HWPfDdrPhyWdqDcdlBit1Rank0            =  0x00D70248,
+	HWPfDdrPhyWdqDcdlShadowBit1Rank0      =  0x00D7024C,
+	HWPfDdrPhyRdqDcdlBit2Rank0            =  0x00D70280,
+	HWPfDdrPhyRdqDcdlShadowBit2Rank0      =  0x00D70284,
+	HWPfDdrPhyWdqDcdlBit2Rank0            =  0x00D70288,
+	HWPfDdrPhyWdqDcdlShadowBit2Rank0      =  0x00D7028C,
+	HWPfDdrPhyRdqDcdlBit3Rank0            =  0x00D702C0,
+	HWPfDdrPhyRdqDcdlShadowBit3Rank0      =  0x00D702C4,
+	HWPfDdrPhyWdqDcdlBit3Rank0            =  0x00D702C8,
+	HWPfDdrPhyWdqDcdlShadowBit3Rank0      =  0x00D702CC,
+	HWPfDdrPhyRdqDcdlBit4Rank0            =  0x00D70300,
+	HWPfDdrPhyRdqDcdlShadowBit4Rank0      =  0x00D70304,
+	HWPfDdrPhyWdqDcdlBit4Rank0            =  0x00D70308,
+	HWPfDdrPhyWdqDcdlShadowBit4Rank0      =  0x00D7030C,
+	HWPfDdrPhyRdqDcdlBit5Rank0            =  0x00D70340,
+	HWPfDdrPhyRdqDcdlShadowBit5Rank0      =  0x00D70344,
+	HWPfDdrPhyWdqDcdlBit5Rank0            =  0x00D70348,
+	HWPfDdrPhyWdqDcdlShadowBit5Rank0      =  0x00D7034C,
+	HWPfDdrPhyRdqDcdlBit6Rank0            =  0x00D70380,
+	HWPfDdrPhyRdqDcdlShadowBit6Rank0      =  0x00D70384,
+	HWPfDdrPhyWdqDcdlBit6Rank0            =  0x00D70388,
+	HWPfDdrPhyWdqDcdlShadowBit6Rank0      =  0x00D7038C,
+	HWPfDdrPhyRdqDcdlBit7Rank0            =  0x00D703C0,
+	HWPfDdrPhyRdqDcdlShadowBit7Rank0      =  0x00D703C4,
+	HWPfDdrPhyWdqDcdlBit7Rank0            =  0x00D703C8,
+	HWPfDdrPhyWdqDcdlShadowBit7Rank0      =  0x00D703CC,
+	HWPfDdrPhyIdtmStatus                  =  0x00D740D0,
+	HWPfDdrPhyIdtmError                   =  0x00D74110,
+	HWPfDdrPhyIdtmDebug                   =  0x00D74120,
+	HWPfDdrPhyIdtmDebugInt                =  0x00D74130,
+	HwPfPcieLnAsicCfgovr                  =  0x00D80000,
+	HwPfPcieLnAclkmixer                   =  0x00D80004,
+	HwPfPcieLnTxrampfreq                  =  0x00D80008,
+	HwPfPcieLnLanetest                    =  0x00D8000C,
+	HwPfPcieLnDcctrl                      =  0x00D80010,
+	HwPfPcieLnDccmeas                     =  0x00D80014,
+	HwPfPcieLnDccovrAclk                  =  0x00D80018,
+	HwPfPcieLnDccovrTxa                   =  0x00D8001C,
+	HwPfPcieLnDccovrTxk                   =  0x00D80020,
+	HwPfPcieLnDccovrDclk                  =  0x00D80024,
+	HwPfPcieLnDccovrEclk                  =  0x00D80028,
+	HwPfPcieLnDcctrimAclk                 =  0x00D8002C,
+	HwPfPcieLnDcctrimTx                   =  0x00D80030,
+	HwPfPcieLnDcctrimDclk                 =  0x00D80034,
+	HwPfPcieLnDcctrimEclk                 =  0x00D80038,
+	HwPfPcieLnQuadCtrl                    =  0x00D8003C,
+	HwPfPcieLnQuadCorrIndex               =  0x00D80040,
+	HwPfPcieLnQuadCorrStatus              =  0x00D80044,
+	HwPfPcieLnAsicRxovr1                  =  0x00D80048,
+	HwPfPcieLnAsicRxovr2                  =  0x00D8004C,
+	HwPfPcieLnAsicEqinfovr                =  0x00D80050,
+	HwPfPcieLnRxcsr                       =  0x00D80054,
+	HwPfPcieLnRxfectrl                    =  0x00D80058,
+	HwPfPcieLnRxtest                      =  0x00D8005C,
+	HwPfPcieLnEscount                     =  0x00D80060,
+	HwPfPcieLnCdrctrl                     =  0x00D80064,
+	HwPfPcieLnCdrctrl2                    =  0x00D80068,
+	HwPfPcieLnCdrcfg0Ctrl0                =  0x00D8006C,
+	HwPfPcieLnCdrcfg0Ctrl1                =  0x00D80070,
+	HwPfPcieLnCdrcfg0Ctrl2                =  0x00D80074,
+	HwPfPcieLnCdrcfg1Ctrl0                =  0x00D80078,
+	HwPfPcieLnCdrcfg1Ctrl1                =  0x00D8007C,
+	HwPfPcieLnCdrcfg1Ctrl2                =  0x00D80080,
+	HwPfPcieLnCdrcfg2Ctrl0                =  0x00D80084,
+	HwPfPcieLnCdrcfg2Ctrl1                =  0x00D80088,
+	HwPfPcieLnCdrcfg2Ctrl2                =  0x00D8008C,
+	HwPfPcieLnCdrcfg3Ctrl0                =  0x00D80090,
+	HwPfPcieLnCdrcfg3Ctrl1                =  0x00D80094,
+	HwPfPcieLnCdrcfg3Ctrl2                =  0x00D80098,
+	HwPfPcieLnCdrphase                    =  0x00D8009C,
+	HwPfPcieLnCdrfreq                     =  0x00D800A0,
+	HwPfPcieLnCdrstatusPhase              =  0x00D800A4,
+	HwPfPcieLnCdrstatusFreq               =  0x00D800A8,
+	HwPfPcieLnCdroffset                   =  0x00D800AC,
+	HwPfPcieLnRxvosctl                    =  0x00D800B0,
+	HwPfPcieLnRxvosctl2                   =  0x00D800B4,
+	HwPfPcieLnRxlosctl                    =  0x00D800B8,
+	HwPfPcieLnRxlos                       =  0x00D800BC,
+	HwPfPcieLnRxlosvval                   =  0x00D800C0,
+	HwPfPcieLnRxvosd0                     =  0x00D800C4,
+	HwPfPcieLnRxvosd1                     =  0x00D800C8,
+	HwPfPcieLnRxvosep0                    =  0x00D800CC,
+	HwPfPcieLnRxvosep1                    =  0x00D800D0,
+	HwPfPcieLnRxvosen0                    =  0x00D800D4,
+	HwPfPcieLnRxvosen1                    =  0x00D800D8,
+	HwPfPcieLnRxvosafe                    =  0x00D800DC,
+	HwPfPcieLnRxvosa0                     =  0x00D800E0,
+	HwPfPcieLnRxvosa0Out                  =  0x00D800E4,
+	HwPfPcieLnRxvosa1                     =  0x00D800E8,
+	HwPfPcieLnRxvosa1Out                  =  0x00D800EC,
+	HwPfPcieLnRxmisc                      =  0x00D800F0,
+	HwPfPcieLnRxbeacon                    =  0x00D800F4,
+	HwPfPcieLnRxdssout                    =  0x00D800F8,
+	HwPfPcieLnRxdssout2                   =  0x00D800FC,
+	HwPfPcieLnAlphapctrl                  =  0x00D80100,
+	HwPfPcieLnAlphanctrl                  =  0x00D80104,
+	HwPfPcieLnAdaptctrl                   =  0x00D80108,
+	HwPfPcieLnAdaptctrl1                  =  0x00D8010C,
+	HwPfPcieLnAdaptstatus                 =  0x00D80110,
+	HwPfPcieLnAdaptvga1                   =  0x00D80114,
+	HwPfPcieLnAdaptvga2                   =  0x00D80118,
+	HwPfPcieLnAdaptvga3                   =  0x00D8011C,
+	HwPfPcieLnAdaptvga4                   =  0x00D80120,
+	HwPfPcieLnAdaptboost1                 =  0x00D80124,
+	HwPfPcieLnAdaptboost2                 =  0x00D80128,
+	HwPfPcieLnAdaptboost3                 =  0x00D8012C,
+	HwPfPcieLnAdaptboost4                 =  0x00D80130,
+	HwPfPcieLnAdaptsslms1                 =  0x00D80134,
+	HwPfPcieLnAdaptsslms2                 =  0x00D80138,
+	HwPfPcieLnAdaptvgaStatus              =  0x00D8013C,
+	HwPfPcieLnAdaptboostStatus            =  0x00D80140,
+	HwPfPcieLnAdaptsslmsStatus1           =  0x00D80144,
+	HwPfPcieLnAdaptsslmsStatus2           =  0x00D80148,
+	HwPfPcieLnAfectrl1                    =  0x00D8014C,
+	HwPfPcieLnAfectrl2                    =  0x00D80150,
+	HwPfPcieLnAfectrl3                    =  0x00D80154,
+	HwPfPcieLnAfedefault1                 =  0x00D80158,
+	HwPfPcieLnAfedefault2                 =  0x00D8015C,
+	HwPfPcieLnDfectrl1                    =  0x00D80160,
+	HwPfPcieLnDfectrl2                    =  0x00D80164,
+	HwPfPcieLnDfectrl3                    =  0x00D80168,
+	HwPfPcieLnDfectrl4                    =  0x00D8016C,
+	HwPfPcieLnDfectrl5                    =  0x00D80170,
+	HwPfPcieLnDfectrl6                    =  0x00D80174,
+	HwPfPcieLnAfestatus1                  =  0x00D80178,
+	HwPfPcieLnAfestatus2                  =  0x00D8017C,
+	HwPfPcieLnDfestatus1                  =  0x00D80180,
+	HwPfPcieLnDfestatus2                  =  0x00D80184,
+	HwPfPcieLnDfestatus3                  =  0x00D80188,
+	HwPfPcieLnDfestatus4                  =  0x00D8018C,
+	HwPfPcieLnDfestatus5                  =  0x00D80190,
+	HwPfPcieLnAlphastatus                 =  0x00D80194,
+	HwPfPcieLnFomctrl1                    =  0x00D80198,
+	HwPfPcieLnFomctrl2                    =  0x00D8019C,
+	HwPfPcieLnFomctrl3                    =  0x00D801A0,
+	HwPfPcieLnAclkcalStatus               =  0x00D801A4,
+	HwPfPcieLnOffscorrStatus              =  0x00D801A8,
+	HwPfPcieLnEyewidthStatus              =  0x00D801AC,
+	HwPfPcieLnEyeheightStatus             =  0x00D801B0,
+	HwPfPcieLnAsicTxovr1                  =  0x00D801B4,
+	HwPfPcieLnAsicTxovr2                  =  0x00D801B8,
+	HwPfPcieLnAsicTxovr3                  =  0x00D801BC,
+	HwPfPcieLnTxbiasadjOvr                =  0x00D801C0,
+	HwPfPcieLnTxcsr                       =  0x00D801C4,
+	HwPfPcieLnTxtest                      =  0x00D801C8,
+	HwPfPcieLnTxtestword                  =  0x00D801CC,
+	HwPfPcieLnTxtestwordHigh              =  0x00D801D0,
+	HwPfPcieLnTxdrive                     =  0x00D801D4,
+	HwPfPcieLnMtcsLn                      =  0x00D801D8,
+	HwPfPcieLnStatsumLn                   =  0x00D801DC,
+	HwPfPcieLnRcbusScratch                =  0x00D801E0,
+	HwPfPcieLnRcbusMinorrev               =  0x00D801F0,
+	HwPfPcieLnRcbusMajorrev               =  0x00D801F4,
+	HwPfPcieLnRcbusBlocktype              =  0x00D801F8,
+	HwPfPcieSupPllcsr                     =  0x00D80800,
+	HwPfPcieSupPlldiv                     =  0x00D80804,
+	HwPfPcieSupPllcal                     =  0x00D80808,
+	HwPfPcieSupPllcalsts                  =  0x00D8080C,
+	HwPfPcieSupPllmeas                    =  0x00D80810,
+	HwPfPcieSupPlldactrim                 =  0x00D80814,
+	HwPfPcieSupPllbiastrim                =  0x00D80818,
+	HwPfPcieSupPllbwtrim                  =  0x00D8081C,
+	HwPfPcieSupPllcaldly                  =  0x00D80820,
+	HwPfPcieSupRefclkonpclkctrl           =  0x00D80824,
+	HwPfPcieSupPclkdelay                  =  0x00D80828,
+	HwPfPcieSupPhyconfig                  =  0x00D8082C,
+	HwPfPcieSupRcalIntf                   =  0x00D80830,
+	HwPfPcieSupAuxcsr                     =  0x00D80834,
+	HwPfPcieSupVref                       =  0x00D80838,
+	HwPfPcieSupLinkmode                   =  0x00D8083C,
+	HwPfPcieSupRrefcalctl                 =  0x00D80840,
+	HwPfPcieSupRrefcal                    =  0x00D80844,
+	HwPfPcieSupRrefcaldly                 =  0x00D80848,
+	HwPfPcieSupTximpcalctl                =  0x00D8084C,
+	HwPfPcieSupTximpcal                   =  0x00D80850,
+	HwPfPcieSupTximpoffset                =  0x00D80854,
+	HwPfPcieSupTximpcaldly                =  0x00D80858,
+	HwPfPcieSupRximpcalctl                =  0x00D8085C,
+	HwPfPcieSupRximpcal                   =  0x00D80860,
+	HwPfPcieSupRximpoffset                =  0x00D80864,
+	HwPfPcieSupRximpcaldly                =  0x00D80868,
+	HwPfPcieSupFence                      =  0x00D8086C,
+	HwPfPcieSupMtcs                       =  0x00D80870,
+	HwPfPcieSupStatsum                    =  0x00D809B8,
+	HwPfPciePcsDpStatus0                  =  0x00D81000,
+	HwPfPciePcsDpControl0                 =  0x00D81004,
+	HwPfPciePcsPmaStatusLane0             =  0x00D81008,
+	HwPfPciePcsPipeStatusLane0            =  0x00D8100C,
+	HwPfPciePcsTxdeemph0Lane0             =  0x00D81010,
+	HwPfPciePcsTxdeemph1Lane0             =  0x00D81014,
+	HwPfPciePcsInternalStatusLane0        =  0x00D81018,
+	HwPfPciePcsDpStatus1                  =  0x00D8101C,
+	HwPfPciePcsDpControl1                 =  0x00D81020,
+	HwPfPciePcsPmaStatusLane1             =  0x00D81024,
+	HwPfPciePcsPipeStatusLane1            =  0x00D81028,
+	HwPfPciePcsTxdeemph0Lane1             =  0x00D8102C,
+	HwPfPciePcsTxdeemph1Lane1             =  0x00D81030,
+	HwPfPciePcsInternalStatusLane1        =  0x00D81034,
+	HwPfPciePcsDpStatus2                  =  0x00D81038,
+	HwPfPciePcsDpControl2                 =  0x00D8103C,
+	HwPfPciePcsPmaStatusLane2             =  0x00D81040,
+	HwPfPciePcsPipeStatusLane2            =  0x00D81044,
+	HwPfPciePcsTxdeemph0Lane2             =  0x00D81048,
+	HwPfPciePcsTxdeemph1Lane2             =  0x00D8104C,
+	HwPfPciePcsInternalStatusLane2        =  0x00D81050,
+	HwPfPciePcsDpStatus3                  =  0x00D81054,
+	HwPfPciePcsDpControl3                 =  0x00D81058,
+	HwPfPciePcsPmaStatusLane3             =  0x00D8105C,
+	HwPfPciePcsPipeStatusLane3            =  0x00D81060,
+	HwPfPciePcsTxdeemph0Lane3             =  0x00D81064,
+	HwPfPciePcsTxdeemph1Lane3             =  0x00D81068,
+	HwPfPciePcsInternalStatusLane3        =  0x00D8106C,
+	HwPfPciePcsEbStatus0                  =  0x00D81070,
+	HwPfPciePcsEbStatus1                  =  0x00D81074,
+	HwPfPciePcsEbStatus2                  =  0x00D81078,
+	HwPfPciePcsEbStatus3                  =  0x00D8107C,
+	HwPfPciePcsPllSettingPcieG1           =  0x00D81088,
+	HwPfPciePcsPllSettingPcieG2           =  0x00D8108C,
+	HwPfPciePcsPllSettingPcieG3           =  0x00D81090,
+	HwPfPciePcsControl                    =  0x00D81094,
+	HwPfPciePcsEqControl                  =  0x00D81098,
+	HwPfPciePcsEqTimer                    =  0x00D8109C,
+	HwPfPciePcsEqErrStatus                =  0x00D810A0,
+	HwPfPciePcsEqErrCount                 =  0x00D810A4,
+	HwPfPciePcsStatus                     =  0x00D810A8,
+	HwPfPciePcsMiscRegister               =  0x00D810AC,
+	HwPfPciePcsObsControl                 =  0x00D810B0,
+	HwPfPciePcsPrbsCount0                 =  0x00D81200,
+	HwPfPciePcsBistControl0               =  0x00D81204,
+	HwPfPciePcsBistStaticWord00           =  0x00D81208,
+	HwPfPciePcsBistStaticWord10           =  0x00D8120C,
+	HwPfPciePcsBistStaticWord20           =  0x00D81210,
+	HwPfPciePcsBistStaticWord30           =  0x00D81214,
+	HwPfPciePcsPrbsCount1                 =  0x00D81220,
+	HwPfPciePcsBistControl1               =  0x00D81224,
+	HwPfPciePcsBistStaticWord01           =  0x00D81228,
+	HwPfPciePcsBistStaticWord11           =  0x00D8122C,
+	HwPfPciePcsBistStaticWord21           =  0x00D81230,
+	HwPfPciePcsBistStaticWord31           =  0x00D81234,
+	HwPfPciePcsPrbsCount2                 =  0x00D81240,
+	HwPfPciePcsBistControl2               =  0x00D81244,
+	HwPfPciePcsBistStaticWord02           =  0x00D81248,
+	HwPfPciePcsBistStaticWord12           =  0x00D8124C,
+	HwPfPciePcsBistStaticWord22           =  0x00D81250,
+	HwPfPciePcsBistStaticWord32           =  0x00D81254,
+	HwPfPciePcsPrbsCount3                 =  0x00D81260,
+	HwPfPciePcsBistControl3               =  0x00D81264,
+	HwPfPciePcsBistStaticWord03           =  0x00D81268,
+	HwPfPciePcsBistStaticWord13           =  0x00D8126C,
+	HwPfPciePcsBistStaticWord23           =  0x00D81270,
+	HwPfPciePcsBistStaticWord33           =  0x00D81274,
+	HwPfPcieGpexLtssmStateCntrl           =  0x00D90400,
+	HwPfPcieGpexLtssmStateStatus          =  0x00D90404,
+	HwPfPcieGpexSkipFreqTimer             =  0x00D90408,
+	HwPfPcieGpexLaneSelect                =  0x00D9040C,
+	HwPfPcieGpexLaneDeskew                =  0x00D90410,
+	HwPfPcieGpexRxErrorStatus             =  0x00D90414,
+	HwPfPcieGpexLaneNumControl            =  0x00D90418,
+	HwPfPcieGpexNFstControl               =  0x00D9041C,
+	HwPfPcieGpexLinkStatus                =  0x00D90420,
+	HwPfPcieGpexAckReplayTimeout          =  0x00D90438,
+	HwPfPcieGpexSeqNumberStatus           =  0x00D9043C,
+	HwPfPcieGpexCoreClkRatio              =  0x00D90440,
+	HwPfPcieGpexDllTholdControl           =  0x00D90448,
+	HwPfPcieGpexPmTimer                   =  0x00D90450,
+	HwPfPcieGpexPmeTimeout                =  0x00D90454,
+	HwPfPcieGpexAspmL1Timer               =  0x00D90458,
+	HwPfPcieGpexAspmReqTimer              =  0x00D9045C,
+	HwPfPcieGpexAspmL1Dis                 =  0x00D90460,
+	HwPfPcieGpexAdvisoryErrorControl      =  0x00D90468,
+	HwPfPcieGpexId                        =  0x00D90470,
+	HwPfPcieGpexClasscode                 =  0x00D90474,
+	HwPfPcieGpexSubsystemId               =  0x00D90478,
+	HwPfPcieGpexDeviceCapabilities        =  0x00D9047C,
+	HwPfPcieGpexLinkCapabilities          =  0x00D90480,
+	HwPfPcieGpexFunctionNumber            =  0x00D90484,
+	HwPfPcieGpexPmCapabilities            =  0x00D90488,
+	HwPfPcieGpexFunctionSelect            =  0x00D9048C,
+	HwPfPcieGpexErrorCounter              =  0x00D904AC,
+	HwPfPcieGpexConfigReady               =  0x00D904B0,
+	HwPfPcieGpexFcUpdateTimeout           =  0x00D904B8,
+	HwPfPcieGpexFcUpdateTimer             =  0x00D904BC,
+	HwPfPcieGpexVcBufferLoad              =  0x00D904C8,
+	HwPfPcieGpexVcBufferSizeThold         =  0x00D904CC,
+	HwPfPcieGpexVcBufferSelect            =  0x00D904D0,
+	HwPfPcieGpexBarEnable                 =  0x00D904D4,
+	HwPfPcieGpexBarDwordLower             =  0x00D904D8,
+	HwPfPcieGpexBarDwordUpper             =  0x00D904DC,
+	HwPfPcieGpexBarSelect                 =  0x00D904E0,
+	HwPfPcieGpexCreditCounterSelect       =  0x00D904E4,
+	HwPfPcieGpexCreditCounterStatus       =  0x00D904E8,
+	HwPfPcieGpexTlpHeaderSelect           =  0x00D904EC,
+	HwPfPcieGpexTlpHeaderDword0           =  0x00D904F0,
+	HwPfPcieGpexTlpHeaderDword1           =  0x00D904F4,
+	HwPfPcieGpexTlpHeaderDword2           =  0x00D904F8,
+	HwPfPcieGpexTlpHeaderDword3           =  0x00D904FC,
+	HwPfPcieGpexRelaxOrderControl         =  0x00D90500,
+	HwPfPcieGpexBarPrefetch               =  0x00D90504,
+	HwPfPcieGpexFcCheckControl            =  0x00D90508,
+	HwPfPcieGpexFcUpdateTimerTraffic      =  0x00D90518,
+	HwPfPcieGpexPhyControl0               =  0x00D9053C,
+	HwPfPcieGpexPhyControl1               =  0x00D90544,
+	HwPfPcieGpexPhyControl2               =  0x00D9054C,
+	HwPfPcieGpexUserControl0              =  0x00D9055C,
+	HwPfPcieGpexUncorrErrorStatus         =  0x00D905F0,
+	HwPfPcieGpexRxCplError                =  0x00D90620,
+	HwPfPcieGpexRxCplErrorDword0          =  0x00D90624,
+	HwPfPcieGpexRxCplErrorDword1          =  0x00D90628,
+	HwPfPcieGpexRxCplErrorDword2          =  0x00D9062C,
+	HwPfPcieGpexPabSwResetEn              =  0x00D90630,
+	HwPfPcieGpexGen3Control0              =  0x00D90634,
+	HwPfPcieGpexGen3Control1              =  0x00D90638,
+	HwPfPcieGpexGen3Control2              =  0x00D9063C,
+	HwPfPcieGpexGen2ControlCsr            =  0x00D90640,
+	HwPfPcieGpexTotalVfInitialVf0         =  0x00D90644,
+	HwPfPcieGpexTotalVfInitialVf1         =  0x00D90648,
+	HwPfPcieGpexSriovLinkDevId0           =  0x00D90684,
+	HwPfPcieGpexSriovLinkDevId1           =  0x00D90688,
+	HwPfPcieGpexSriovPageSize0            =  0x00D906C4,
+	HwPfPcieGpexSriovPageSize1            =  0x00D906C8,
+	HwPfPcieGpexIdVersion                 =  0x00D906FC,
+	HwPfPcieGpexSriovVfOffsetStride0      =  0x00D90704,
+	HwPfPcieGpexSriovVfOffsetStride1      =  0x00D90708,
+	HwPfPcieGpexGen3DeskewControl         =  0x00D907B4,
+	HwPfPcieGpexGen3EqControl             =  0x00D907B8,
+	HwPfPcieGpexBridgeVersion             =  0x00D90800,
+	HwPfPcieGpexBridgeCapability          =  0x00D90804,
+	HwPfPcieGpexBridgeControl             =  0x00D90808,
+	HwPfPcieGpexBridgeStatus              =  0x00D9080C,
+	HwPfPcieGpexEngineActivityStatus      =  0x00D9081C,
+	HwPfPcieGpexEngineResetControl        =  0x00D90820,
+	HwPfPcieGpexAxiPioControl             =  0x00D90840,
+	HwPfPcieGpexAxiPioStatus              =  0x00D90844,
+	HwPfPcieGpexAmbaSlaveCmdStatus        =  0x00D90848,
+	HwPfPcieGpexPexPioControl             =  0x00D908C0,
+	HwPfPcieGpexPexPioStatus              =  0x00D908C4,
+	HwPfPcieGpexAmbaMasterStatus          =  0x00D908C8,
+	HwPfPcieGpexCsrSlaveCmdStatus         =  0x00D90920,
+	HwPfPcieGpexMailboxAxiControl         =  0x00D90A50,
+	HwPfPcieGpexMailboxAxiData            =  0x00D90A54,
+	HwPfPcieGpexMailboxPexControl         =  0x00D90A90,
+	HwPfPcieGpexMailboxPexData            =  0x00D90A94,
+	HwPfPcieGpexPexInterruptEnable        =  0x00D90AD0,
+	HwPfPcieGpexPexInterruptStatus        =  0x00D90AD4,
+	HwPfPcieGpexPexInterruptAxiPioVector  =  0x00D90AD8,
+	HwPfPcieGpexPexInterruptPexPioVector  =  0x00D90AE0,
+	HwPfPcieGpexPexInterruptMiscVector    =  0x00D90AF8,
+	HwPfPcieGpexAmbaInterruptPioEnable    =  0x00D90B00,
+	HwPfPcieGpexAmbaInterruptMiscEnable   =  0x00D90B0C,
+	HwPfPcieGpexAmbaInterruptPioStatus    =  0x00D90B10,
+	HwPfPcieGpexAmbaInterruptMiscStatus   =  0x00D90B1C,
+	HwPfPcieGpexPexPmControl              =  0x00D90B80,
+	HwPfPcieGpexSlotMisc                  =  0x00D90B88,
+	HwPfPcieGpexAxiAddrMappingControl     =  0x00D90BA0,
+	HwPfPcieGpexAxiAddrMappingWindowAxiBase     =  0x00D90BA4,
+	HwPfPcieGpexAxiAddrMappingWindowPexBaseLow  =  0x00D90BA8,
+	HwPfPcieGpexAxiAddrMappingWindowPexBaseHigh =  0x00D90BAC,
+	HwPfPcieGpexPexBarAddrFunc0Bar0       =  0x00D91BA0,
+	HwPfPcieGpexPexBarAddrFunc0Bar1       =  0x00D91BA4,
+	HwPfPcieGpexAxiAddrMappingPcieHdrParam =  0x00D95BA0,
+	HwPfPcieGpexExtAxiAddrMappingAxiBase  =  0x00D980A0,
+	HwPfPcieGpexPexExtBarAddrFunc0Bar0    =  0x00D984A0,
+	HwPfPcieGpexPexExtBarAddrFunc0Bar1    =  0x00D984A4,
+	HwPfPcieGpexAmbaInterruptFlrEnable    =  0x00D9B960,
+	HwPfPcieGpexAmbaInterruptFlrStatus    =  0x00D9B9A0,
+	HwPfPcieGpexExtAxiAddrMappingSize     =  0x00D9BAF0,
+	HwPfPcieGpexPexPioAwcacheControl      =  0x00D9C300,
+	HwPfPcieGpexPexPioArcacheControl      =  0x00D9C304,
+	HwPfPcieGpexPabObSizeControlVc0       =  0x00D9C310
+};
+
+/* TIP PF Interrupt numbers */
+enum {
+	ACC100_PF_INT_QMGR_AQ_OVERFLOW = 0,
+	ACC100_PF_INT_DOORBELL_VF_2_PF = 1,
+	ACC100_PF_INT_DMA_DL_DESC_IRQ = 2,
+	ACC100_PF_INT_DMA_UL_DESC_IRQ = 3,
+	ACC100_PF_INT_DMA_MLD_DESC_IRQ = 4,
+	ACC100_PF_INT_DMA_UL5G_DESC_IRQ = 5,
+	ACC100_PF_INT_DMA_DL5G_DESC_IRQ = 6,
+	ACC100_PF_INT_ILLEGAL_FORMAT = 7,
+	ACC100_PF_INT_QMGR_DISABLED_ACCESS = 8,
+	ACC100_PF_INT_QMGR_AQ_OVERTHRESHOLD = 9,
+	ACC100_PF_INT_ARAM_ACCESS_ERR = 10,
+	ACC100_PF_INT_ARAM_ECC_1BIT_ERR = 11,
+	ACC100_PF_INT_PARITY_ERR = 12,
+	ACC100_PF_INT_QMGR_ERR = 13,
+	ACC100_PF_INT_INT_REQ_OVERFLOW = 14,
+	ACC100_PF_INT_APB_TIMEOUT = 15,
+};
+
+#endif /* ACC100_PF_ENUM_H */
diff --git a/drivers/baseband/acc100/acc100_vf_enum.h b/drivers/baseband/acc100/acc100_vf_enum.h
new file mode 100644
index 0000000..b512af3
--- /dev/null
+++ b/drivers/baseband/acc100/acc100_vf_enum.h
@@ -0,0 +1,73 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017 Intel Corporation
+ */
+
+#ifndef ACC100_VF_ENUM_H
+#define ACC100_VF_ENUM_H
+
+/*
+ * ACC100 Register mapping on VF BAR0
+ * This is automatically generated from RDL, format may change with new RDL
+ */
+enum {
+	HWVfQmgrIngressAq             =  0x00000000,
+	HWVfHiVfToPfDbellVf           =  0x00000800,
+	HWVfHiPfToVfDbellVf           =  0x00000808,
+	HWVfHiInfoRingBaseLoVf        =  0x00000810,
+	HWVfHiInfoRingBaseHiVf        =  0x00000814,
+	HWVfHiInfoRingPointerVf       =  0x00000818,
+	HWVfHiInfoRingIntWrEnVf       =  0x00000820,
+	HWVfHiInfoRingPf2VfWrEnVf     =  0x00000824,
+	HWVfHiMsixVectorMapperVf      =  0x00000860,
+	HWVfDmaFec5GulDescBaseLoRegVf =  0x00000920,
+	HWVfDmaFec5GulDescBaseHiRegVf =  0x00000924,
+	HWVfDmaFec5GulRespPtrLoRegVf  =  0x00000928,
+	HWVfDmaFec5GulRespPtrHiRegVf  =  0x0000092C,
+	HWVfDmaFec5GdlDescBaseLoRegVf =  0x00000940,
+	HWVfDmaFec5GdlDescBaseHiRegVf =  0x00000944,
+	HWVfDmaFec5GdlRespPtrLoRegVf  =  0x00000948,
+	HWVfDmaFec5GdlRespPtrHiRegVf  =  0x0000094C,
+	HWVfDmaFec4GulDescBaseLoRegVf =  0x00000960,
+	HWVfDmaFec4GulDescBaseHiRegVf =  0x00000964,
+	HWVfDmaFec4GulRespPtrLoRegVf  =  0x00000968,
+	HWVfDmaFec4GulRespPtrHiRegVf  =  0x0000096C,
+	HWVfDmaFec4GdlDescBaseLoRegVf =  0x00000980,
+	HWVfDmaFec4GdlDescBaseHiRegVf =  0x00000984,
+	HWVfDmaFec4GdlRespPtrLoRegVf  =  0x00000988,
+	HWVfDmaFec4GdlRespPtrHiRegVf  =  0x0000098C,
+	HWVfDmaDdrBaseRangeRoVf       =  0x000009A0,
+	HWVfQmgrAqResetVf             =  0x00000E00,
+	HWVfQmgrRingSizeVf            =  0x00000E04,
+	HWVfQmgrGrpDepthLog20Vf       =  0x00000E08,
+	HWVfQmgrGrpDepthLog21Vf       =  0x00000E0C,
+	HWVfQmgrGrpFunction0Vf        =  0x00000E10,
+	HWVfQmgrGrpFunction1Vf        =  0x00000E14,
+	HWVfPmACntrlRegVf             =  0x00000F40,
+	HWVfPmACountVf                =  0x00000F48,
+	HWVfPmAKCntLoVf               =  0x00000F50,
+	HWVfPmAKCntHiVf               =  0x00000F54,
+	HWVfPmADeltaCntLoVf           =  0x00000F60,
+	HWVfPmADeltaCntHiVf           =  0x00000F64,
+	HWVfPmBCntrlRegVf             =  0x00000F80,
+	HWVfPmBCountVf                =  0x00000F88,
+	HWVfPmBKCntLoVf               =  0x00000F90,
+	HWVfPmBKCntHiVf               =  0x00000F94,
+	HWVfPmBDeltaCntLoVf           =  0x00000FA0,
+	HWVfPmBDeltaCntHiVf           =  0x00000FA4
+};
+
+/* TIP VF Interrupt numbers */
+enum {
+	ACC100_VF_INT_QMGR_AQ_OVERFLOW = 0,
+	ACC100_VF_INT_DOORBELL_VF_2_PF = 1,
+	ACC100_VF_INT_DMA_DL_DESC_IRQ = 2,
+	ACC100_VF_INT_DMA_UL_DESC_IRQ = 3,
+	ACC100_VF_INT_DMA_MLD_DESC_IRQ = 4,
+	ACC100_VF_INT_DMA_UL5G_DESC_IRQ = 5,
+	ACC100_VF_INT_DMA_DL5G_DESC_IRQ = 6,
+	ACC100_VF_INT_ILLEGAL_FORMAT = 7,
+	ACC100_VF_INT_QMGR_DISABLED_ACCESS = 8,
+	ACC100_VF_INT_QMGR_AQ_OVERTHRESHOLD = 9,
+};
+
+#endif /* ACC100_VF_ENUM_H */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 6f46df0..cd77570 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -5,6 +5,9 @@
 #ifndef _RTE_ACC100_PMD_H_
 #define _RTE_ACC100_PMD_H_
 
+#include "acc100_pf_enum.h"
+#include "acc100_vf_enum.h"
+
 /* Helper macro for logging */
 #define rte_bbdev_log(level, fmt, ...) \
 	rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \
@@ -27,6 +30,493 @@
 #define RTE_ACC100_PF_DEVICE_ID        (0x0d5c)
 #define RTE_ACC100_VF_DEVICE_ID        (0x0d5d)
 
+/* Define as 1 to use only a single FEC engine */
+#ifndef RTE_ACC100_SINGLE_FEC
+#define RTE_ACC100_SINGLE_FEC 0
+#endif
+
+/* Values used in filling in descriptors */
+#define ACC100_DMA_DESC_TYPE           2
+#define ACC100_DMA_CODE_BLK_MODE       0
+#define ACC100_DMA_BLKID_FCW           1
+#define ACC100_DMA_BLKID_IN            2
+#define ACC100_DMA_BLKID_OUT_ENC       1
+#define ACC100_DMA_BLKID_OUT_HARD      1
+#define ACC100_DMA_BLKID_OUT_SOFT      2
+#define ACC100_DMA_BLKID_OUT_HARQ      3
+#define ACC100_DMA_BLKID_IN_HARQ       3
+
+/* Values used in filling in decode FCWs */
+#define ACC100_FCW_TD_VER              1
+#define ACC100_FCW_TD_EXT_COLD_REG_EN  1
+#define ACC100_FCW_TD_AUTOMAP          0x0f
+#define ACC100_FCW_TD_RVIDX_0          2
+#define ACC100_FCW_TD_RVIDX_1          26
+#define ACC100_FCW_TD_RVIDX_2          50
+#define ACC100_FCW_TD_RVIDX_3          74
+
+/* Values used in writing to the registers */
+#define ACC100_REG_IRQ_EN_ALL          0x1FF83FF  /* Enable all interrupts */
+
+/* ACC100 Specific Dimensioning */
+#define ACC100_SIZE_64MBYTE            (64*1024*1024)
+/* Number of elements in an Info Ring */
+#define ACC100_INFO_RING_NUM_ENTRIES   1024
+/* Number of elements in HARQ layout memory */
+#define ACC100_HARQ_LAYOUT             (64*1024*1024)
+/* Assume offset for HARQ in memory */
+#define ACC100_HARQ_OFFSET             (32*1024)
+/* Mask used to calculate an index in an Info Ring array (not a byte offset) */
+#define ACC100_INFO_RING_MASK          (ACC100_INFO_RING_NUM_ENTRIES-1)
+/* Number of Virtual Functions ACC100 supports */
+#define ACC100_NUM_VFS                  16
+#define ACC100_NUM_QGRPS                 8
+#define ACC100_NUM_QGRPS_PER_WORD        8
+#define ACC100_NUM_AQS                  16
+#define MAX_ENQ_BATCH_SIZE          255
+/* All ACC100 Registers alignment are 32bits = 4B */
+#define BYTES_IN_WORD                 4
+#define MAX_E_MBUF                64000
+
+#define GRP_ID_SHIFT    10 /* Queue Index Hierarchy */
+#define VF_ID_SHIFT     4  /* Queue Index Hierarchy */
+#define VF_OFFSET_QOS   16 /* offset in Memory Space specific to QoS Mon */
+#define TMPL_PRI_0      0x03020100
+#define TMPL_PRI_1      0x07060504
+#define TMPL_PRI_2      0x0b0a0908
+#define TMPL_PRI_3      0x0f0e0d0c
+#define QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
+#define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
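The `GRP_ID_SHIFT`/`VF_ID_SHIFT` comments describe a queue index hierarchy. A minimal sketch of how such an index could be composed, assuming (from the shift names only, not confirmed by this patch) that bits [3:0] select the atomic queue, bits [9:4] the VF and the bits from 10 up the queue group:

```c
#include <stdint.h>

/* Mirrors the defines above */
#define GRP_ID_SHIFT 10 /* Queue Index Hierarchy */
#define VF_ID_SHIFT   4 /* Queue Index Hierarchy */

/* Hypothetical helper: compose a flat queue index from its
 * {queue group, VF, atomic queue} hierarchy fields.
 */
static inline uint32_t
acc100_queue_index(uint32_t qgrp, uint32_t vf, uint32_t aq)
{
	return (qgrp << GRP_ID_SHIFT) | (vf << VF_ID_SHIFT) | aq;
}
```

With 16 AQs per VF and 16 VFs per group, this packs without overlap into the two shift boundaries.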
+
+#define ACC100_NUM_TMPL  32
+/* Mapping of signals for the available engines */
+#define SIG_UL_5G      0
+#define SIG_UL_5G_LAST 7
+#define SIG_DL_5G      13
+#define SIG_DL_5G_LAST 15
+#define SIG_UL_4G      16
+#define SIG_UL_4G_LAST 21
+#define SIG_DL_4G      27
+#define SIG_DL_4G_LAST 31
+
+/* max number of iterations to allocate memory block for all rings */
+#define SW_RING_MEM_ALLOC_ATTEMPTS 5
+#define MAX_QUEUE_DEPTH           1024
+#define ACC100_DMA_MAX_NUM_POINTERS  14
+#define ACC100_DMA_DESC_PADDING      8
+#define ACC100_FCW_PADDING           12
+#define ACC100_DESC_FCW_OFFSET       192
+#define ACC100_DESC_SIZE             256
+#define ACC100_DESC_OFFSET           (ACC100_DESC_SIZE / 64)
+#define ACC100_FCW_TE_BLEN     32
+#define ACC100_FCW_TD_BLEN     24
+#define ACC100_FCW_LE_BLEN     32
+#define ACC100_FCW_LD_BLEN     36
+
+#define ACC100_FCW_VER         2
+#define MUX_5GDL_DESC 6
+#define CMP_ENC_SIZE 20
+#define CMP_DEC_SIZE 24
+#define ENC_OFFSET (32)
+#define DEC_OFFSET (80)
+#define ACC100_EXT_MEM
+#define ACC100_HARQ_OFFSET_THRESHOLD 1024
+
+/* Constants from K0 computation from 3GPP 38.212 Table 5.4.2.1-2 */
+#define N_ZC_1 66 /* N = 66 Zc for BG 1 */
+#define N_ZC_2 50 /* N = 50 Zc for BG 2 */
+#define K0_1_1 17 /* K0 fraction numerator for rv 1 and BG 1 */
+#define K0_1_2 13 /* K0 fraction numerator for rv 1 and BG 2 */
+#define K0_2_1 33 /* K0 fraction numerator for rv 2 and BG 1 */
+#define K0_2_2 25 /* K0 fraction numerator for rv 2 and BG 2 */
+#define K0_3_1 56 /* K0 fraction numerator for rv 3 and BG 1 */
+#define K0_3_2 43 /* K0 fraction numerator for rv 3 and BG 2 */
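These constants are the per-rv fraction numerators and the N/Zc ratios from 3GPP TS 38.212 Table 5.4.2.1-2, where the rate-matching start position is k0 = floor(num * Ncb / (N_zc * Zc)) * Zc. A sketch of that computation using the defines above (the function name and argument order here are illustrative, not the driver's API):

```c
#include <stdint.h>

/* Mirrors the defines above (3GPP 38.212 Table 5.4.2.1-2) */
#define N_ZC_1 66 /* N = 66 Zc for BG 1 */
#define N_ZC_2 50 /* N = 50 Zc for BG 2 */
#define K0_1_1 17
#define K0_1_2 13
#define K0_2_1 33
#define K0_2_2 25
#define K0_3_1 56
#define K0_3_2 43

/* Sketch: k0 = floor(num * Ncb / (N_zc * Zc)) * Zc, num per rv/BG */
static inline uint16_t
get_k0(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv_index)
{
	uint32_t n, num;

	if (rv_index == 0)
		return 0;
	n = (bg == 1 ? N_ZC_1 : N_ZC_2) * (uint32_t)z_c;
	if (rv_index == 1)
		num = bg == 1 ? K0_1_1 : K0_1_2;
	else if (rv_index == 2)
		num = bg == 1 ? K0_2_1 : K0_2_2;
	else
		num = bg == 1 ? K0_3_1 : K0_3_2;
	return (num * n_cb / n) * z_c;
}
```

For the full-buffer case (Ncb = N) the floor term reduces to the bare numerator, e.g. rv1/BG1 gives k0 = 17 * Zc.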
+
+/* ACC100 Configuration */
+#define ACC100_DDR_ECC_ENABLE
+#define ACC100_CFG_DMA_ERROR 0x3D7
+#define ACC100_CFG_AXI_CACHE 0x11
+#define ACC100_CFG_QMGR_HI_P 0x0F0F
+#define ACC100_CFG_PCI_AXI 0xC003
+#define ACC100_CFG_PCI_BRIDGE 0x40006033
+#define ACC100_ENGINE_OFFSET 0x1000
+#define ACC100_RESET_HI 0x20100
+#define ACC100_RESET_LO 0x20000
+#define ACC100_RESET_HARD 0x1FF
+#define ACC100_ENGINES_MAX 9
+#define LONG_WAIT 1000
+
+/* ACC100 DMA Descriptor triplet */
+struct acc100_dma_triplet {
+	uint64_t address;
+	uint32_t blen:20,
+		res0:4,
+		last:1,
+		dma_ext:1,
+		res1:2,
+		blkid:4;
+} __rte_packed;
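The triplet packs a 64-bit IOVA with a 20-bit block length, a last-in-list flag and a 4-bit block id into 12 bytes. A standalone sketch of filling one triplet for an operation-input block (the struct is copied here with `__rte_packed` spelled as the GCC attribute it expands to; the helper is hypothetical, not part of the patch):

```c
#include <stdint.h>

/* Standalone copy of the triplet layout above, for illustration */
struct acc100_dma_triplet {
	uint64_t address;
	uint32_t blen:20,
		res0:4,
		last:1,
		dma_ext:1,
		res1:2,
		blkid:4;
} __attribute__((__packed__));

/* Hypothetical helper: point one triplet at an input buffer.
 * blkid 2 corresponds to ACC100_DMA_BLKID_IN below.
 */
static inline void
fill_input_triplet(struct acc100_dma_triplet *t,
		uint64_t iova, uint32_t len, int is_last)
{
	t->address = iova;
	t->blen = len;          /* 20-bit length, so under 1 MB */
	t->res0 = 0;
	t->last = is_last;      /* final triplet of this list */
	t->dma_ext = 0;
	t->res1 = 0;
	t->blkid = 2;           /* ACC100_DMA_BLKID_IN */
}
```

The packed attribute matters: without it the bitfield word would be aligned to 8 bytes and the triplet would no longer match the 12-byte hardware stride.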
+
+/* ACC100 DMA Response Descriptor */
+union acc100_dma_rsp_desc {
+	uint32_t val;
+	struct {
+		uint32_t crc_status:1,
+			synd_ok:1,
+			dma_err:1,
+			neg_stop:1,
+			fcw_err:1,
+			output_err:1,
+			input_err:1,
+			timestampEn:1,
+			iterCountFrac:8,
+			iter_cnt:8,
+			rsrvd3:6,
+			sdone:1,
+			fdone:1;
+		uint32_t add_info_0;
+		uint32_t add_info_1;
+	};
+};
+
+
+/* ACC100 Queue Manager Enqueue PCI Register */
+union acc100_enqueue_reg_fmt {
+	uint32_t val;
+	struct {
+		uint32_t num_elem:8,
+			addr_offset:3,
+			rsrvd:1,
+			req_elem_addr:20;
+	};
+};
+
+/* FEC 4G Uplink Frame Control Word */
+struct __rte_packed acc100_fcw_td {
+	uint8_t fcw_ver:4,
+		num_maps:4; /* Unused */
+	uint8_t filler:6, /* Unused */
+		rsrvd0:1,
+		bypass_sb_deint:1;
+	uint16_t k_pos;
+	uint16_t k_neg; /* Unused */
+	uint8_t c_neg; /* Unused */
+	uint8_t c; /* Unused */
+	uint32_t ea; /* Unused */
+	uint32_t eb; /* Unused */
+	uint8_t cab; /* Unused */
+	uint8_t k0_start_col; /* Unused */
+	uint8_t rsrvd1;
+	uint8_t code_block_mode:1, /* Unused */
+		turbo_crc_type:1,
+		rsrvd2:3,
+		bypass_teq:1, /* Unused */
+		soft_output_en:1, /* Unused */
+		ext_td_cold_reg_en:1;
+	union { /* External Cold register */
+		uint32_t ext_td_cold_reg;
+		struct {
+			uint32_t min_iter:4, /* Unused */
+				max_iter:4,
+				ext_scale:5, /* Unused */
+				rsrvd3:3,
+				early_stop_en:1, /* Unused */
+				sw_soft_out_dis:1, /* Unused */
+				sw_et_cont:1, /* Unused */
+				sw_soft_out_saturation:1, /* Unused */
+				half_iter_on:1, /* Unused */
+				raw_decoder_input_on:1, /* Unused */
+				rsrvd4:10;
+		};
+	};
+};
+
+/* FEC 5GNR Uplink Frame Control Word */
+struct __rte_packed acc100_fcw_ld {
+	uint32_t FCWversion:4,
+		qm:4,
+		nfiller:11,
+		BG:1,
+		Zc:9,
+		res0:1,
+		synd_precoder:1,
+		synd_post:1;
+	uint32_t ncb:16,
+		k0:16;
+	uint32_t rm_e:24,
+		hcin_en:1,
+		hcout_en:1,
+		crc_select:1,
+		bypass_dec:1,
+		bypass_intlv:1,
+		so_en:1,
+		so_bypass_rm:1,
+		so_bypass_intlv:1;
+	uint32_t hcin_offset:16,
+		hcin_size0:16;
+	uint32_t hcin_size1:16,
+		hcin_decomp_mode:3,
+		llr_pack_mode:1,
+		hcout_comp_mode:3,
+		res2:1,
+		dec_convllr:4,
+		hcout_convllr:4;
+	uint32_t itmax:7,
+		itstop:1,
+		so_it:7,
+		res3:1,
+		hcout_offset:16;
+	uint32_t hcout_size0:16,
+		hcout_size1:16;
+	uint32_t gain_i:8,
+		gain_h:8,
+		negstop_th:16;
+	uint32_t negstop_it:7,
+		negstop_en:1,
+		res4:24;
+};
+
+/* FEC 4G Downlink Frame Control Word */
+struct __rte_packed acc100_fcw_te {
+	uint16_t k_neg;
+	uint16_t k_pos;
+	uint8_t c_neg;
+	uint8_t c;
+	uint8_t filler;
+	uint8_t cab;
+	uint32_t ea:17,
+		rsrvd0:15;
+	uint32_t eb:17,
+		rsrvd1:15;
+	uint16_t ncb_neg;
+	uint16_t ncb_pos;
+	uint8_t rv_idx0:2,
+		rsrvd2:2,
+		rv_idx1:2,
+		rsrvd3:2;
+	uint8_t bypass_rv_idx0:1,
+		bypass_rv_idx1:1,
+		bypass_rm:1,
+		rsrvd4:5;
+	uint8_t rsrvd5:1,
+		rsrvd6:3,
+		code_block_crc:1,
+		rsrvd7:3;
+	uint8_t code_block_mode:1,
+		rsrvd8:7;
+	uint64_t rsrvd9;
+};
+
+/* FEC 5GNR Downlink Frame Control Word */
+struct __rte_packed acc100_fcw_le {
+	uint32_t FCWversion:4,
+		qm:4,
+		nfiller:11,
+		BG:1,
+		Zc:9,
+		res0:3;
+	uint32_t ncb:16,
+		k0:16;
+	uint32_t rm_e:24,
+		res1:2,
+		crc_select:1,
+		res2:1,
+		bypass_intlv:1,
+		res3:3;
+	uint32_t res4_a:12,
+		mcb_count:3,
+		res4_b:17;
+	uint32_t res5;
+	uint32_t res6;
+	uint32_t res7;
+	uint32_t res8;
+};
+
+/* ACC100 DMA Request Descriptor */
+struct __rte_packed acc100_dma_req_desc {
+	union {
+		struct{
+			uint32_t type:4,
+				rsrvd0:26,
+				sdone:1,
+				fdone:1;
+			uint32_t rsrvd1;
+			uint32_t rsrvd2;
+			uint32_t pass_param:8,
+				sdone_enable:1,
+				irq_enable:1,
+				timeStampEn:1,
+				res0:5,
+				numCBs:4,
+				res1:4,
+				m2dlen:4,
+				d2mlen:4;
+		};
+		struct{
+			uint32_t word0;
+			uint32_t word1;
+			uint32_t word2;
+			uint32_t word3;
+		};
+	};
+	struct acc100_dma_triplet data_ptrs[ACC100_DMA_MAX_NUM_POINTERS];
+
+	/* Virtual addresses used to retrieve SW context info */
+	union {
+		void *op_addr;
+		uint64_t pad1;  /* pad to 64 bits */
+	};
+	/*
+	 * Stores additional information needed for driver processing:
+	 * - last_desc_in_batch - flag used to mark last descriptor (CB)
+	 *                        in batch
+	 * - cbs_in_tb - stores information about total number of Code Blocks
+	 *               in currently processed Transport Block
+	 */
+	union {
+		struct {
+			union {
+				struct acc100_fcw_ld fcw_ld;
+				struct acc100_fcw_td fcw_td;
+				struct acc100_fcw_le fcw_le;
+				struct acc100_fcw_te fcw_te;
+				uint32_t pad2[ACC100_FCW_PADDING];
+			};
+			uint32_t last_desc_in_batch :8,
+				cbs_in_tb:8,
+				pad4 : 16;
+		};
+		uint64_t pad3[ACC100_DMA_DESC_PADDING]; /* pad to 64 bits */
+	};
+};
+
+/* ACC100 DMA Descriptor */
+union acc100_dma_desc {
+	struct acc100_dma_req_desc req;
+	union acc100_dma_rsp_desc rsp;
+};
+
+
+/* Union describing HARQ layout entry */
+union acc100_harq_layout_data {
+	uint32_t val;
+	struct {
+		uint16_t offset;
+		uint16_t size0;
+	};
+} __rte_packed;
+
+
+/* Union describing Info Ring entry */
+union acc100_info_ring_data {
+	uint32_t val;
+	struct {
+		union {
+			uint16_t detailed_info;
+			struct {
+				uint16_t aq_id: 4;
+				uint16_t qg_id: 4;
+				uint16_t vf_id: 6;
+				uint16_t reserved: 2;
+			};
+		};
+		uint16_t int_nb: 7;
+		uint16_t msi_0: 1;
+		uint16_t vf2pf: 6;
+		uint16_t loop: 1;
+		uint16_t valid: 1;
+	};
+} __rte_packed;
+
+struct acc100_registry_addr {
+	unsigned int dma_ring_dl5g_hi;
+	unsigned int dma_ring_dl5g_lo;
+	unsigned int dma_ring_ul5g_hi;
+	unsigned int dma_ring_ul5g_lo;
+	unsigned int dma_ring_dl4g_hi;
+	unsigned int dma_ring_dl4g_lo;
+	unsigned int dma_ring_ul4g_hi;
+	unsigned int dma_ring_ul4g_lo;
+	unsigned int ring_size;
+	unsigned int info_ring_hi;
+	unsigned int info_ring_lo;
+	unsigned int info_ring_en;
+	unsigned int info_ring_ptr;
+	unsigned int tail_ptrs_dl5g_hi;
+	unsigned int tail_ptrs_dl5g_lo;
+	unsigned int tail_ptrs_ul5g_hi;
+	unsigned int tail_ptrs_ul5g_lo;
+	unsigned int tail_ptrs_dl4g_hi;
+	unsigned int tail_ptrs_dl4g_lo;
+	unsigned int tail_ptrs_ul4g_hi;
+	unsigned int tail_ptrs_ul4g_lo;
+	unsigned int depth_log0_offset;
+	unsigned int depth_log1_offset;
+	unsigned int qman_group_func;
+	unsigned int ddr_range;
+};
+
+/* Structure holding registry addresses for PF */
+static const struct acc100_registry_addr pf_reg_addr = {
+	.dma_ring_dl5g_hi = HWPfDmaFec5GdlDescBaseHiRegVf,
+	.dma_ring_dl5g_lo = HWPfDmaFec5GdlDescBaseLoRegVf,
+	.dma_ring_ul5g_hi = HWPfDmaFec5GulDescBaseHiRegVf,
+	.dma_ring_ul5g_lo = HWPfDmaFec5GulDescBaseLoRegVf,
+	.dma_ring_dl4g_hi = HWPfDmaFec4GdlDescBaseHiRegVf,
+	.dma_ring_dl4g_lo = HWPfDmaFec4GdlDescBaseLoRegVf,
+	.dma_ring_ul4g_hi = HWPfDmaFec4GulDescBaseHiRegVf,
+	.dma_ring_ul4g_lo = HWPfDmaFec4GulDescBaseLoRegVf,
+	.ring_size = HWPfQmgrRingSizeVf,
+	.info_ring_hi = HWPfHiInfoRingBaseHiRegPf,
+	.info_ring_lo = HWPfHiInfoRingBaseLoRegPf,
+	.info_ring_en = HWPfHiInfoRingIntWrEnRegPf,
+	.info_ring_ptr = HWPfHiInfoRingPointerRegPf,
+	.tail_ptrs_dl5g_hi = HWPfDmaFec5GdlRespPtrHiRegVf,
+	.tail_ptrs_dl5g_lo = HWPfDmaFec5GdlRespPtrLoRegVf,
+	.tail_ptrs_ul5g_hi = HWPfDmaFec5GulRespPtrHiRegVf,
+	.tail_ptrs_ul5g_lo = HWPfDmaFec5GulRespPtrLoRegVf,
+	.tail_ptrs_dl4g_hi = HWPfDmaFec4GdlRespPtrHiRegVf,
+	.tail_ptrs_dl4g_lo = HWPfDmaFec4GdlRespPtrLoRegVf,
+	.tail_ptrs_ul4g_hi = HWPfDmaFec4GulRespPtrHiRegVf,
+	.tail_ptrs_ul4g_lo = HWPfDmaFec4GulRespPtrLoRegVf,
+	.depth_log0_offset = HWPfQmgrGrpDepthLog20Vf,
+	.depth_log1_offset = HWPfQmgrGrpDepthLog21Vf,
+	.qman_group_func = HWPfQmgrGrpFunction0,
+	.ddr_range = HWPfDmaVfDdrBaseRw,
+};
+
+/* Structure holding registry addresses for VF */
+static const struct acc100_registry_addr vf_reg_addr = {
+	.dma_ring_dl5g_hi = HWVfDmaFec5GdlDescBaseHiRegVf,
+	.dma_ring_dl5g_lo = HWVfDmaFec5GdlDescBaseLoRegVf,
+	.dma_ring_ul5g_hi = HWVfDmaFec5GulDescBaseHiRegVf,
+	.dma_ring_ul5g_lo = HWVfDmaFec5GulDescBaseLoRegVf,
+	.dma_ring_dl4g_hi = HWVfDmaFec4GdlDescBaseHiRegVf,
+	.dma_ring_dl4g_lo = HWVfDmaFec4GdlDescBaseLoRegVf,
+	.dma_ring_ul4g_hi = HWVfDmaFec4GulDescBaseHiRegVf,
+	.dma_ring_ul4g_lo = HWVfDmaFec4GulDescBaseLoRegVf,
+	.ring_size = HWVfQmgrRingSizeVf,
+	.info_ring_hi = HWVfHiInfoRingBaseHiVf,
+	.info_ring_lo = HWVfHiInfoRingBaseLoVf,
+	.info_ring_en = HWVfHiInfoRingIntWrEnVf,
+	.info_ring_ptr = HWVfHiInfoRingPointerVf,
+	.tail_ptrs_dl5g_hi = HWVfDmaFec5GdlRespPtrHiRegVf,
+	.tail_ptrs_dl5g_lo = HWVfDmaFec5GdlRespPtrLoRegVf,
+	.tail_ptrs_ul5g_hi = HWVfDmaFec5GulRespPtrHiRegVf,
+	.tail_ptrs_ul5g_lo = HWVfDmaFec5GulRespPtrLoRegVf,
+	.tail_ptrs_dl4g_hi = HWVfDmaFec4GdlRespPtrHiRegVf,
+	.tail_ptrs_dl4g_lo = HWVfDmaFec4GdlRespPtrLoRegVf,
+	.tail_ptrs_ul4g_hi = HWVfDmaFec4GulRespPtrHiRegVf,
+	.tail_ptrs_ul4g_lo = HWVfDmaFec4GulRespPtrLoRegVf,
+	.depth_log0_offset = HWVfQmgrGrpDepthLog20Vf,
+	.depth_log1_offset = HWVfQmgrGrpDepthLog21Vf,
+	.qman_group_func = HWVfQmgrGrpFunction0Vf,
+	.ddr_range = HWVfDmaDdrBaseRangeRoVf,
+};
+
 /* Private data structure for each ACC100 device */
 struct acc100_device {
 	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread
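The packed `acc100_info_ring_data` union above depends on compiler bitfield layout. As a rough, standalone illustration (not driver code, and assuming GCC-style LSB-first bitfield allocation on a little-endian target), the same fields can be pulled out of the raw 32-bit word with shifts and masks:

```c
#include <assert.h>
#include <stdint.h>

/* Decode the raw info-ring word; widths mirror the union above:
 * aq_id:4, qg_id:4, vf_id:6, reserved:2, int_nb:7, msi_0:1,
 * vf2pf:6, loop:1, valid:1 (LSB first). Illustration only.
 */
static inline uint16_t ir_aq_id(uint32_t v)  { return v & 0xF; }
static inline uint16_t ir_qg_id(uint32_t v)  { return (v >> 4) & 0xF; }
static inline uint16_t ir_vf_id(uint32_t v)  { return (v >> 8) & 0x3F; }
static inline uint16_t ir_int_nb(uint32_t v) { return (v >> 16) & 0x7F; }
static inline uint16_t ir_valid(uint32_t v)  { return (v >> 31) & 0x1; }
```

The driver itself relies on the bitfield declaration rather than explicit shifts; the masks here simply restate the field widths shown in the union.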

* [dpdk-dev] [PATCH v5 03/11] baseband/acc100: add info get function
  2020-09-23  2:12   ` [dpdk-dev] [PATCH v5 " Nicolas Chautru
  2020-09-23  2:12     ` [dpdk-dev] [PATCH v5 01/11] drivers/baseband: add PMD for ACC100 Nicolas Chautru
  2020-09-23  2:12     ` [dpdk-dev] [PATCH v5 02/11] baseband/acc100: add register definition file Nicolas Chautru
@ 2020-09-23  2:12     ` Nicolas Chautru
  2020-09-23  2:12     ` [dpdk-dev] [PATCH v5 04/11] baseband/acc100: add queue configuration Nicolas Chautru
                       ` (7 subsequent siblings)
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-23  2:12 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add the "info_get" function to the driver, to allow applications to
query the device.
No processing capabilities are available yet.
Link bbdev-test to support the PMD with null capability.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 app/test-bbdev/meson.build               |   3 +
 drivers/baseband/acc100/rte_acc100_cfg.h |  96 +++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.c | 225 +++++++++++++++++++++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.h |   3 +
 4 files changed, 327 insertions(+)
 create mode 100644 drivers/baseband/acc100/rte_acc100_cfg.h

diff --git a/app/test-bbdev/meson.build b/app/test-bbdev/meson.build
index 18ab6a8..fbd8ae3 100644
--- a/app/test-bbdev/meson.build
+++ b/app/test-bbdev/meson.build
@@ -12,3 +12,6 @@ endif
 if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC')
 	deps += ['pmd_bbdev_fpga_5gnr_fec']
 endif
+if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_ACC100')
+	deps += ['pmd_bbdev_acc100']
+endif
\ No newline at end of file
diff --git a/drivers/baseband/acc100/rte_acc100_cfg.h b/drivers/baseband/acc100/rte_acc100_cfg.h
new file mode 100644
index 0000000..73bbe36
--- /dev/null
+++ b/drivers/baseband/acc100/rte_acc100_cfg.h
@@ -0,0 +1,96 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_ACC100_CFG_H_
+#define _RTE_ACC100_CFG_H_
+
+/**
+ * @file rte_acc100_cfg.h
+ *
+ * Functions for configuring ACC100 HW, exposed directly to applications.
+ * Configuration related to encoding/decoding is done through the
+ * librte_bbdev library.
+ *
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ */
+
+#include <stdint.h>
+#include <stdbool.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+/** Number of Virtual Functions ACC100 supports */
+#define RTE_ACC100_NUM_VFS 16
+
+/**
+ * Definition of Queue Topology for ACC100 Configuration
+ * Some level of details is abstracted out to expose a clean interface
+ * given that comprehensive flexibility is not required
+ */
+struct rte_q_topology_t {
+	/** Number of QGroups in incremental order of priority */
+	uint16_t num_qgroups;
+	/**
+	 * All QGroups have the same number of AQs here.
+	 * Note : Could be made a 16-array if more flexibility is really
+	 * required
+	 */
+	uint16_t num_aqs_per_groups;
+	/**
+	 * Depth of the AQs is the same for all QGroups here. Log2 Enum : 2^N
+	 * Note : Could be made a 16-array if more flexibility is really
+	 * required
+	 */
+	uint16_t aq_depth_log2;
+	/**
+	 * Index of the first Queue Group - assuming contiguity
+	 * Initialized as -1
+	 */
+	int8_t first_qgroup_index;
+};
+
+/**
+ * Definition of Arbitration related parameters for ACC100 Configuration
+ */
+struct rte_arbitration_t {
+	/** Default Weight for VF Fairness Arbitration */
+	uint16_t round_robin_weight;
+	uint32_t gbr_threshold1; /**< Guaranteed Bitrate Threshold 1 */
+	uint32_t gbr_threshold2; /**< Guaranteed Bitrate Threshold 2 */
+};
+
+/**
+ * Structure to pass ACC100 configuration.
+ * Note: all VF Bundles will have the same configuration.
+ */
+struct acc100_conf {
+	bool pf_mode_en; /**< 1 if PF is used for dataplane, 0 for VFs */
+	/** 1 if input '1' bit is represented by a positive LLR value, 0 if '1'
+	 * bit is represented by a negative value.
+	 */
+	bool input_pos_llr_1_bit;
+	/** 1 if output '1' bit is represented by a positive value, 0 if '1'
+	 * bit is represented by a negative value.
+	 */
+	bool output_pos_llr_1_bit;
+	uint16_t num_vf_bundles; /**< Number of VF bundles to setup */
+	/** Queue topology for each operation type */
+	struct rte_q_topology_t q_ul_4g;
+	struct rte_q_topology_t q_dl_4g;
+	struct rte_q_topology_t q_ul_5g;
+	struct rte_q_topology_t q_dl_5g;
+	/** Arbitration configuration for each operation type */
+	struct rte_arbitration_t arb_ul_4g[RTE_ACC100_NUM_VFS];
+	struct rte_arbitration_t arb_dl_4g[RTE_ACC100_NUM_VFS];
+	struct rte_arbitration_t arb_ul_5g[RTE_ACC100_NUM_VFS];
+	struct rte_arbitration_t arb_dl_5g[RTE_ACC100_NUM_VFS];
+};
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_ACC100_CFG_H_ */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 1b4cd13..7807a30 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -26,6 +26,184 @@
 RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
 #endif
 
+/* Read a register of an ACC100 device */
+static inline uint32_t
+acc100_reg_read(struct acc100_device *d, uint32_t offset)
+{
+
+	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
+	uint32_t ret = *((volatile uint32_t *)(reg_addr));
+	return rte_le_to_cpu_32(ret);
+}
+
+/* Calculate the offset of the enqueue register */
+static inline uint32_t
+queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
+{
+	if (pf_device)
+		return ((vf_id << 12) + (qgrp_id << 7) + (aq_id << 3) +
+				HWPfQmgrIngressAq);
+	else
+		return ((qgrp_id << 7) + (aq_id << 3) +
+				HWVfQmgrIngressAq);
+}
+
+enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC};
+
+/* Return the queue topology for a given accelerator enum */
+static inline void
+qtopFromAcc(struct rte_q_topology_t **qtop, int acc_enum,
+		struct acc100_conf *acc100_conf)
+{
+	struct rte_q_topology_t *p_qtop;
+	p_qtop = NULL;
+	switch (acc_enum) {
+	case UL_4G:
+		p_qtop = &(acc100_conf->q_ul_4g);
+		break;
+	case UL_5G:
+		p_qtop = &(acc100_conf->q_ul_5g);
+		break;
+	case DL_4G:
+		p_qtop = &(acc100_conf->q_dl_4g);
+		break;
+	case DL_5G:
+		p_qtop = &(acc100_conf->q_dl_5g);
+		break;
+	default:
+		/* NOTREACHED */
+		rte_bbdev_log(ERR, "Unexpected error evaluating qtopFromAcc");
+		break;
+	}
+	*qtop = p_qtop;
+}
+
+static void
+initQTop(struct acc100_conf *acc100_conf)
+{
+	acc100_conf->q_ul_4g.num_aqs_per_groups = 0;
+	acc100_conf->q_ul_4g.num_qgroups = 0;
+	acc100_conf->q_ul_4g.first_qgroup_index = -1;
+	acc100_conf->q_ul_5g.num_aqs_per_groups = 0;
+	acc100_conf->q_ul_5g.num_qgroups = 0;
+	acc100_conf->q_ul_5g.first_qgroup_index = -1;
+	acc100_conf->q_dl_4g.num_aqs_per_groups = 0;
+	acc100_conf->q_dl_4g.num_qgroups = 0;
+	acc100_conf->q_dl_4g.first_qgroup_index = -1;
+	acc100_conf->q_dl_5g.num_aqs_per_groups = 0;
+	acc100_conf->q_dl_5g.num_qgroups = 0;
+	acc100_conf->q_dl_5g.first_qgroup_index = -1;
+}
+
+static inline void
+updateQtop(uint8_t acc, uint8_t qg, struct acc100_conf *acc100_conf,
+		struct acc100_device *d) {
+	uint32_t reg;
+	struct rte_q_topology_t *q_top = NULL;
+	qtopFromAcc(&q_top, acc, acc100_conf);
+	if (unlikely(q_top == NULL))
+		return;
+	uint16_t aq;
+	q_top->num_qgroups++;
+	if (q_top->first_qgroup_index == -1) {
+		q_top->first_qgroup_index = qg;
+		/* Can be optimized to assume all are enabled by default */
+		reg = acc100_reg_read(d, queue_offset(d->pf_device,
+				0, qg, ACC100_NUM_AQS - 1));
+		if (reg & QUEUE_ENABLE) {
+			q_top->num_aqs_per_groups = ACC100_NUM_AQS;
+			return;
+		}
+		q_top->num_aqs_per_groups = 0;
+		for (aq = 0; aq < ACC100_NUM_AQS; aq++) {
+			reg = acc100_reg_read(d, queue_offset(d->pf_device,
+					0, qg, aq));
+			if (reg & QUEUE_ENABLE)
+				q_top->num_aqs_per_groups++;
+		}
+	}
+}
+
+/* Fetch configuration enabled for the PF/VF using MMIO Read (slow) */
+static inline void
+fetch_acc100_config(struct rte_bbdev *dev)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	struct acc100_conf *acc100_conf = &d->acc100_conf;
+	const struct acc100_registry_addr *reg_addr;
+	uint8_t acc, qg;
+	uint32_t reg, reg_aq, reg_len0, reg_len1;
+	uint32_t reg_mode;
+
+	/* No need to retrieve the configuration if it is already done */
+	if (d->configured)
+		return;
+
+	/* Choose correct registry addresses for the device type */
+	if (d->pf_device)
+		reg_addr = &pf_reg_addr;
+	else
+		reg_addr = &vf_reg_addr;
+
+	d->ddr_size = (1 + acc100_reg_read(d, reg_addr->ddr_range)) << 10;
+
+	/* Single VF Bundle by VF */
+	acc100_conf->num_vf_bundles = 1;
+	initQTop(acc100_conf);
+
+	struct rte_q_topology_t *q_top = NULL;
+	int qman_func_id[5] = {0, 2, 1, 3, 4};
+	reg = acc100_reg_read(d, reg_addr->qman_group_func);
+	for (qg = 0; qg < ACC100_NUM_QGRPS_PER_WORD; qg++) {
+		reg_aq = acc100_reg_read(d,
+				queue_offset(d->pf_device, 0, qg, 0));
+		if (reg_aq & QUEUE_ENABLE) {
+			acc = qman_func_id[(reg >> (qg * 4)) & 0x7];
+			updateQtop(acc, qg, acc100_conf, d);
+		}
+	}
+
+	/* Check the depth of the AQs */
+	reg_len0 = acc100_reg_read(d, reg_addr->depth_log0_offset);
+	reg_len1 = acc100_reg_read(d, reg_addr->depth_log1_offset);
+	for (acc = 0; acc < NUM_ACC; acc++) {
+		qtopFromAcc(&q_top, acc, acc100_conf);
+		if (q_top->first_qgroup_index < ACC100_NUM_QGRPS_PER_WORD)
+			q_top->aq_depth_log2 = (reg_len0 >>
+					(q_top->first_qgroup_index * 4))
+					& 0xF;
+		else
+			q_top->aq_depth_log2 = (reg_len1 >>
+					((q_top->first_qgroup_index -
+					ACC100_NUM_QGRPS_PER_WORD) * 4))
+					& 0xF;
+	}
+
+	/* Read PF mode */
+	if (d->pf_device) {
+		reg_mode = acc100_reg_read(d, HWPfHiPfMode);
+		acc100_conf->pf_mode_en = (reg_mode == 2) ? 1 : 0;
+	}
+
+	rte_bbdev_log_debug(
+			"%s Config LLR SIGN IN/OUT %s %s QG %u %u %u %u AQ %u %u %u %u Len %u %u %u %u\n",
+			(d->pf_device) ? "PF" : "VF",
+			(acc100_conf->input_pos_llr_1_bit) ? "POS" : "NEG",
+			(acc100_conf->output_pos_llr_1_bit) ? "POS" : "NEG",
+			acc100_conf->q_ul_4g.num_qgroups,
+			acc100_conf->q_dl_4g.num_qgroups,
+			acc100_conf->q_ul_5g.num_qgroups,
+			acc100_conf->q_dl_5g.num_qgroups,
+			acc100_conf->q_ul_4g.num_aqs_per_groups,
+			acc100_conf->q_dl_4g.num_aqs_per_groups,
+			acc100_conf->q_ul_5g.num_aqs_per_groups,
+			acc100_conf->q_dl_5g.num_aqs_per_groups,
+			acc100_conf->q_ul_4g.aq_depth_log2,
+			acc100_conf->q_dl_4g.aq_depth_log2,
+			acc100_conf->q_ul_5g.aq_depth_log2,
+			acc100_conf->q_dl_5g.aq_depth_log2);
+}
+
 /* Free 64MB memory used for software rings */
 static int
 acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
@@ -33,8 +211,55 @@
 	return 0;
 }
 
+/* Get ACC100 device info */
+static void
+acc100_dev_info_get(struct rte_bbdev *dev,
+		struct rte_bbdev_driver_info *dev_info)
+{
+	struct acc100_device *d = dev->data->dev_private;
+
+	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
+		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
+	};
+
+	static struct rte_bbdev_queue_conf default_queue_conf;
+	default_queue_conf.socket = dev->data->socket_id;
+	default_queue_conf.queue_size = MAX_QUEUE_DEPTH;
+
+	dev_info->driver_name = dev->device->driver->name;
+
+	/* Read and save the populated config from ACC100 registers */
+	fetch_acc100_config(dev);
+
+	/* This isn't ideal because it reports the maximum number of queues
+	 * but does not provide info on how many can be uplink/downlink or
+	 * at different priorities
+	 */
+	dev_info->max_num_queues =
+			d->acc100_conf.q_dl_5g.num_aqs_per_groups *
+			d->acc100_conf.q_dl_5g.num_qgroups +
+			d->acc100_conf.q_ul_5g.num_aqs_per_groups *
+			d->acc100_conf.q_ul_5g.num_qgroups +
+			d->acc100_conf.q_dl_4g.num_aqs_per_groups *
+			d->acc100_conf.q_dl_4g.num_qgroups +
+			d->acc100_conf.q_ul_4g.num_aqs_per_groups *
+			d->acc100_conf.q_ul_4g.num_qgroups;
+	dev_info->queue_size_lim = MAX_QUEUE_DEPTH;
+	dev_info->hardware_accelerated = true;
+	dev_info->max_dl_queue_priority =
+			d->acc100_conf.q_dl_4g.num_qgroups - 1;
+	dev_info->max_ul_queue_priority =
+			d->acc100_conf.q_ul_4g.num_qgroups - 1;
+	dev_info->default_queue_conf = default_queue_conf;
+	dev_info->cpu_flag_reqs = NULL;
+	dev_info->min_alignment = 64;
+	dev_info->capabilities = bbdev_capabilities;
+	dev_info->harq_buffer_size = d->ddr_size;
+}
+
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.close = acc100_dev_close,
+	.info_get = acc100_dev_info_get,
 };
 
 /* ACC100 PCI PF address map */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index cd77570..662e2c8 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -7,6 +7,7 @@
 
 #include "acc100_pf_enum.h"
 #include "acc100_vf_enum.h"
+#include "rte_acc100_cfg.h"
 
 /* Helper macro for logging */
 #define rte_bbdev_log(level, fmt, ...) \
@@ -520,6 +521,8 @@ struct acc100_registry_addr {
 /* Private data structure for each ACC100 device */
 struct acc100_device {
 	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
+	uint32_t ddr_size; /* Size in kB */
+	struct acc100_conf acc100_conf; /* ACC100 Initial configuration */
 	bool pf_device; /**< True if this is a PF ACC100 device */
 	bool configured; /**< True if this ACC100 device is configured */
 };
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread
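The `queue_offset()` helper added above packs (vf, qgroup, aq) into a register offset: a 4 KB stride per VF, 128 B per queue group, and 8 B per atomic queue, added to the PF or VF ingress base. A self-contained sketch of the same arithmetic follows; the base addresses are placeholders standing in for the real `HWPfQmgrIngressAq`/`HWVfQmgrIngressAq` enum values:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bases; the real values come from acc100_pf_enum.h and
 * acc100_vf_enum.h and are assumptions here. */
#define QMGR_INGRESS_AQ_PF 0x100000u
#define QMGR_INGRESS_AQ_VF 0x0u

/* Same shifts as the driver's queue_offset(): vf_id << 12,
 * qgrp_id << 7, aq_id << 3; the VF view drops the vf_id term. */
static inline uint32_t
enqueue_reg_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id,
		uint16_t aq_id)
{
	uint32_t off = ((uint32_t)qgrp_id << 7) + ((uint32_t)aq_id << 3);

	if (pf_device)
		return off + ((uint32_t)vf_id << 12) + QMGR_INGRESS_AQ_PF;
	return off + QMGR_INGRESS_AQ_VF;
}
```

For example, (vf 1, qgroup 2, aq 3) on the PF lands 4096 + 256 + 24 bytes past the PF ingress base, which is the address `fetch_acc100_config()` polls for the `QUEUE_ENABLE` bit.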

* [dpdk-dev] [PATCH v5 04/11] baseband/acc100: add queue configuration
  2020-09-23  2:12   ` [dpdk-dev] [PATCH v5 " Nicolas Chautru
                       ` (2 preceding siblings ...)
  2020-09-23  2:12     ` [dpdk-dev] [PATCH v5 03/11] baseband/acc100: add info get function Nicolas Chautru
@ 2020-09-23  2:12     ` Nicolas Chautru
  2020-09-23  2:12     ` [dpdk-dev] [PATCH v5 05/11] baseband/acc100: add LDPC processing functions Nicolas Chautru
                       ` (6 subsequent siblings)
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-23  2:12 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add a function to create and configure queues for
the device. Still no processing capabilities are exposed.
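This patch also introduces two small arithmetic helpers, `log2_basic()` and `calc_mem_alignment_offset()`, used to size the rings and to find the next 64 MB boundary when placing them. A standalone sketch of the same logic (taking a raw physical address instead of calling `rte_malloc_virt2iova()`):

```c
#include <assert.h>
#include <stdint.h>

/* Log2 for exact powers of two, as in the patch (GCC/Clang builtin) */
static inline uint32_t log2_basic(uint32_t v)
{
	return (v == 0) ? 0 : (uint32_t)__builtin_ctz(v);
}

/* Bytes to add to reach the next 2^N boundary; mirrors
 * calc_mem_alignment_offset() but on a plain address value */
static inline uint32_t
mem_alignment_offset(uint64_t phys_addr, uint32_t alignment)
{
	return (uint32_t)(alignment - (phys_addr & (alignment - 1)));
}
```

Note that, like the helper in the patch, this returns `alignment` rather than 0 for an already-aligned address; the 2 * 64 MB fallback allocation leaves room for that full-sized offset.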

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 420 ++++++++++++++++++++++++++++++-
 drivers/baseband/acc100/rte_acc100_pmd.h |  45 ++++
 2 files changed, 464 insertions(+), 1 deletion(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 7807a30..7a21c57 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -26,6 +26,22 @@
 RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
 #endif
 
+/* Write to MMIO register address */
+static inline void
+mmio_write(void *addr, uint32_t value)
+{
+	*((volatile uint32_t *)(addr)) = rte_cpu_to_le_32(value);
+}
+
+/* Write a register of an ACC100 device */
+static inline void
+acc100_reg_write(struct acc100_device *d, uint32_t offset, uint32_t payload)
+{
+	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
+	mmio_write(reg_addr, payload);
+	usleep(1000);
+}
+
 /* Read a register of an ACC100 device */
 static inline uint32_t
 acc100_reg_read(struct acc100_device *d, uint32_t offset)
@@ -36,6 +52,22 @@
 	return rte_le_to_cpu_32(ret);
 }
 
+/* Basic Implementation of Log2 for exact 2^N */
+static inline uint32_t
+log2_basic(uint32_t value)
+{
+	return (value == 0) ? 0 : __builtin_ctz(value);
+}
+
+/* Calculate memory alignment offset assuming alignment is 2^N */
+static inline uint32_t
+calc_mem_alignment_offset(void *unaligned_virt_mem, uint32_t alignment)
+{
+	rte_iova_t unaligned_phy_mem = rte_malloc_virt2iova(unaligned_virt_mem);
+	return (uint32_t)(alignment -
+			(unaligned_phy_mem & (alignment-1)));
+}
+
 /* Calculate the offset of the enqueue register */
 static inline uint32_t
 queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
@@ -204,10 +236,393 @@
 			acc100_conf->q_dl_5g.aq_depth_log2);
 }
 
+static void
+free_base_addresses(void **base_addrs, int size)
+{
+	int i;
+	for (i = 0; i < size; i++)
+		rte_free(base_addrs[i]);
+}
+
+static inline uint32_t
+get_desc_len(void)
+{
+	return sizeof(union acc100_dma_desc);
+}
+
+/* Allocate the 2 * 64MB block for the sw rings */
+static int
+alloc_2x64mb_sw_rings_mem(struct rte_bbdev *dev, struct acc100_device *d,
+		int socket)
+{
+	uint32_t sw_ring_size = ACC100_SIZE_64MBYTE;
+	d->sw_rings_base = rte_zmalloc_socket(dev->device->driver->name,
+			2 * sw_ring_size, RTE_CACHE_LINE_SIZE, socket);
+	if (d->sw_rings_base == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate memory for %s:%u",
+				dev->device->driver->name,
+				dev->data->dev_id);
+		return -ENOMEM;
+	}
+	memset(d->sw_rings_base, 0, ACC100_SIZE_64MBYTE);
+	uint32_t next_64mb_align_offset = calc_mem_alignment_offset(
+			d->sw_rings_base, ACC100_SIZE_64MBYTE);
+	d->sw_rings = RTE_PTR_ADD(d->sw_rings_base, next_64mb_align_offset);
+	d->sw_rings_phys = rte_malloc_virt2iova(d->sw_rings_base) +
+			next_64mb_align_offset;
+	d->sw_ring_size = MAX_QUEUE_DEPTH * get_desc_len();
+	d->sw_ring_max_depth = d->sw_ring_size / get_desc_len();
+
+	return 0;
+}
+
+/* Attempt to allocate minimised memory space for sw rings */
+static void
+alloc_sw_rings_min_mem(struct rte_bbdev *dev, struct acc100_device *d,
+		uint16_t num_queues, int socket)
+{
+	rte_iova_t sw_rings_base_phy, next_64mb_align_addr_phy;
+	uint32_t next_64mb_align_offset;
+	rte_iova_t sw_ring_phys_end_addr;
+	void *base_addrs[SW_RING_MEM_ALLOC_ATTEMPTS];
+	void *sw_rings_base;
+	int i = 0;
+	uint32_t q_sw_ring_size = MAX_QUEUE_DEPTH * get_desc_len();
+	uint32_t dev_sw_ring_size = q_sw_ring_size * num_queues;
+
+	/* Find an aligned block of memory to store sw rings */
+	while (i < SW_RING_MEM_ALLOC_ATTEMPTS) {
+		/*
+		 * sw_ring allocated memory is guaranteed to be aligned to
+		 * q_sw_ring_size at the condition that the requested size is
+		 * less than the page size
+		 */
+		sw_rings_base = rte_zmalloc_socket(
+				dev->device->driver->name,
+				dev_sw_ring_size, q_sw_ring_size, socket);
+
+		if (sw_rings_base == NULL) {
+			rte_bbdev_log(ERR,
+					"Failed to allocate memory for %s:%u",
+					dev->device->driver->name,
+					dev->data->dev_id);
+			break;
+		}
+
+		sw_rings_base_phy = rte_malloc_virt2iova(sw_rings_base);
+		next_64mb_align_offset = calc_mem_alignment_offset(
+				sw_rings_base, ACC100_SIZE_64MBYTE);
+		next_64mb_align_addr_phy = sw_rings_base_phy +
+				next_64mb_align_offset;
+		sw_ring_phys_end_addr = sw_rings_base_phy + dev_sw_ring_size;
+
+		/* Check if the end of the sw ring memory block is before the
+		 * start of next 64MB aligned mem address
+		 */
+		if (sw_ring_phys_end_addr < next_64mb_align_addr_phy) {
+			d->sw_rings_phys = sw_rings_base_phy;
+			d->sw_rings = sw_rings_base;
+			d->sw_rings_base = sw_rings_base;
+			d->sw_ring_size = q_sw_ring_size;
+			d->sw_ring_max_depth = MAX_QUEUE_DEPTH;
+			break;
+		}
+		/* Store the address of the unaligned mem block */
+		base_addrs[i] = sw_rings_base;
+		i++;
+	}
+
+	/* Free all unaligned blocks of mem allocated in the loop */
+	free_base_addresses(base_addrs, i);
+}
+
+
+/* Set up the device: allocate 64MB memory used for all software rings */
+static int
+acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
+{
+	uint32_t phys_low, phys_high, payload;
+	struct acc100_device *d = dev->data->dev_private;
+	const struct acc100_registry_addr *reg_addr;
+
+	if (d->pf_device && !d->acc100_conf.pf_mode_en) {
+		rte_bbdev_log(NOTICE,
+				"%s has PF mode disabled. This PF can't be used.",
+				dev->data->name);
+		return -ENODEV;
+	}
+
+	alloc_sw_rings_min_mem(dev, d, num_queues, socket_id);
+
+	/* If minimal memory space approach failed, then allocate
+	 * the 2 * 64MB block for the sw rings
+	 */
+	if (d->sw_rings == NULL)
+		alloc_2x64mb_sw_rings_mem(dev, d, socket_id);
+
+	/* Configure ACC100 with the base address for DMA descriptor rings
+	 * Same descriptor rings used for UL and DL DMA Engines
+	 * Note : Assuming only VF0 bundle is used for PF mode
+	 */
+	phys_high = (uint32_t)(d->sw_rings_phys >> 32);
+	phys_low  = (uint32_t)(d->sw_rings_phys & ~(ACC100_SIZE_64MBYTE-1));
+
+	/* Choose correct registry addresses for the device type */
+	if (d->pf_device)
+		reg_addr = &pf_reg_addr;
+	else
+		reg_addr = &vf_reg_addr;
+
+	/* Read the populated cfg from ACC100 registers */
+	fetch_acc100_config(dev);
+
+	/* Mark as configured properly */
+	d->configured = true;
+
+	/* Release AXI from PF */
+	if (d->pf_device)
+		acc100_reg_write(d, HWPfDmaAxiControl, 1);
+
+	acc100_reg_write(d, reg_addr->dma_ring_ul5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_ul5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->dma_ring_dl5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_dl5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->dma_ring_ul4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_ul4g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->dma_ring_dl4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_dl4g_lo, phys_low);
+
+	/*
+	 * Configure Ring Size to the max queue ring size
+	 * (used for wrapping purpose)
+	 */
+	payload = log2_basic(d->sw_ring_size / 64);
+	acc100_reg_write(d, reg_addr->ring_size, payload);
+
+	/* Configure tail pointer for use when SDONE enabled */
+	d->tail_ptrs = rte_zmalloc_socket(
+			dev->device->driver->name,
+			ACC100_NUM_QGRPS * ACC100_NUM_AQS * sizeof(uint32_t),
+			RTE_CACHE_LINE_SIZE, socket_id);
+	if (d->tail_ptrs == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate tail ptr for %s:%u",
+				dev->device->driver->name,
+				dev->data->dev_id);
+		rte_free(d->sw_rings);
+		return -ENOMEM;
+	}
+	d->tail_ptr_phys = rte_malloc_virt2iova(d->tail_ptrs);
+
+	phys_high = (uint32_t)(d->tail_ptr_phys >> 32);
+	phys_low  = (uint32_t)(d->tail_ptr_phys);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
+
+	d->harq_layout = rte_zmalloc_socket("HARQ Layout",
+			ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
+			RTE_CACHE_LINE_SIZE, dev->data->socket_id);
+
+	rte_bbdev_log_debug(
+			"ACC100 (%s) configured sw_rings = %p, sw_rings_phys = %#"
+			PRIx64, dev->data->name, d->sw_rings, d->sw_rings_phys);
+
+	return 0;
+}
+
 /* Free 64MB memory used for software rings */
 static int
-acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
+acc100_dev_close(struct rte_bbdev *dev)
 {
+	struct acc100_device *d = dev->data->dev_private;
+	if (d->sw_rings_base != NULL) {
+		rte_free(d->tail_ptrs);
+		rte_free(d->sw_rings_base);
+		d->sw_rings_base = NULL;
+	}
+	usleep(1000);
+	return 0;
+}
+
+
+/**
+ * Report an ACC100 queue index which is free.
+ * Returns 0 to 16k for a valid queue_idx or -1 when no queue is available.
+ * Note: Only supporting VF0 Bundle for PF mode.
+ */
+static int
+acc100_find_free_queue_idx(struct rte_bbdev *dev,
+		const struct rte_bbdev_queue_conf *conf)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	int op_2_acc[5] = {0, UL_4G, DL_4G, UL_5G, DL_5G};
+	int acc = op_2_acc[conf->op_type];
+	struct rte_q_topology_t *qtop = NULL;
+	qtopFromAcc(&qtop, acc, &(d->acc100_conf));
+	if (qtop == NULL)
+		return -1;
+	/* Identify matching QGroup Index which are sorted in priority order */
+	uint16_t group_idx = qtop->first_qgroup_index;
+	group_idx += conf->priority;
+	if (group_idx >= ACC100_NUM_QGRPS ||
+			conf->priority >= qtop->num_qgroups) {
+		rte_bbdev_log(INFO, "Invalid Priority on %s, priority %u",
+				dev->data->name, conf->priority);
+		return -1;
+	}
+	/* Find a free AQ_idx  */
+	uint16_t aq_idx;
+	for (aq_idx = 0; aq_idx < qtop->num_aqs_per_groups; aq_idx++) {
+		if (((d->q_assigned_bit_map[group_idx] >> aq_idx) & 0x1) == 0) {
+			/* Mark the Queue as assigned */
+			d->q_assigned_bit_map[group_idx] |= (1 << aq_idx);
+			/* Report the AQ Index */
+			return (group_idx << GRP_ID_SHIFT) + aq_idx;
+		}
+	}
+	rte_bbdev_log(INFO, "Failed to find free queue on %s, priority %u",
+			dev->data->name, conf->priority);
+	return -1;
+}
+
+/* Setup ACC100 queue */
+static int
+acc100_queue_setup(struct rte_bbdev *dev, uint16_t queue_id,
+		const struct rte_bbdev_queue_conf *conf)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	struct acc100_queue *q;
+	int16_t q_idx;
+
+	/* Allocate the queue data structure. */
+	q = rte_zmalloc_socket(dev->device->driver->name, sizeof(*q),
+			RTE_CACHE_LINE_SIZE, conf->socket);
+	if (q == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate queue memory");
+		return -ENOMEM;
+	}
+
+	q->d = d;
+	q->ring_addr = RTE_PTR_ADD(d->sw_rings, (d->sw_ring_size * queue_id));
+	q->ring_addr_phys = d->sw_rings_phys + (d->sw_ring_size * queue_id);
+
+	/* Prepare the Ring with default descriptor format */
+	union acc100_dma_desc *desc = NULL;
+	unsigned int desc_idx, b_idx;
+	int fcw_len = (conf->op_type == RTE_BBDEV_OP_LDPC_ENC ?
+		ACC100_FCW_LE_BLEN : (conf->op_type == RTE_BBDEV_OP_TURBO_DEC ?
+		ACC100_FCW_TD_BLEN : ACC100_FCW_LD_BLEN));
+
+	for (desc_idx = 0; desc_idx < d->sw_ring_max_depth; desc_idx++) {
+		desc = q->ring_addr + desc_idx;
+		desc->req.word0 = ACC100_DMA_DESC_TYPE;
+		desc->req.word1 = 0; /**< Timestamp */
+		desc->req.word2 = 0;
+		desc->req.word3 = 0;
+		uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+		desc->req.data_ptrs[0].blen = fcw_len;
+		desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
+		desc->req.data_ptrs[0].last = 0;
+		desc->req.data_ptrs[0].dma_ext = 0;
+		for (b_idx = 1; b_idx < ACC100_DMA_MAX_NUM_POINTERS - 1;
+				b_idx++) {
+			desc->req.data_ptrs[b_idx].blkid = ACC100_DMA_BLKID_IN;
+			desc->req.data_ptrs[b_idx].last = 1;
+			desc->req.data_ptrs[b_idx].dma_ext = 0;
+			b_idx++;
+			desc->req.data_ptrs[b_idx].blkid =
+					ACC100_DMA_BLKID_OUT_ENC;
+			desc->req.data_ptrs[b_idx].last = 1;
+			desc->req.data_ptrs[b_idx].dma_ext = 0;
+		}
+		/* Preset some fields of LDPC FCW */
+		desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
+		desc->req.fcw_ld.gain_i = 1;
+		desc->req.fcw_ld.gain_h = 1;
+	}
+
+	q->lb_in = rte_zmalloc_socket(dev->device->driver->name,
+			RTE_CACHE_LINE_SIZE,
+			RTE_CACHE_LINE_SIZE, conf->socket);
+	if (q->lb_in == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate lb_in memory");
+		return -ENOMEM;
+	}
+	q->lb_in_addr_phys = rte_malloc_virt2iova(q->lb_in);
+	q->lb_out = rte_zmalloc_socket(dev->device->driver->name,
+			RTE_CACHE_LINE_SIZE,
+			RTE_CACHE_LINE_SIZE, conf->socket);
+	if (q->lb_out == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate lb_out memory");
+		return -ENOMEM;
+	}
+	q->lb_out_addr_phys = rte_malloc_virt2iova(q->lb_out);
+
+	/*
+	 * The software ring wraps synchronously with the HW ring when it
+	 * reaches the boundary of the maximum allocated queue size, no matter
+	 * what the sw queue size is. This wrapping is guarded by setting the
+	 * wrap_mask to represent the maximum queue size as allocated when the
+	 * device was set up (in configure()).
+	 *
+	 * The queue depth is set to the queue size value (conf->queue_size).
+	 * This limits the occupancy of the queue at any point in time, so
+	 * that the queue does not get swamped with enqueue requests.
+	 */
+	q->sw_ring_depth = conf->queue_size;
+	q->sw_ring_wrap_mask = d->sw_ring_max_depth - 1;
+
+	q->op_type = conf->op_type;
+
+	q_idx = acc100_find_free_queue_idx(dev, conf);
+	if (q_idx == -1) {
+		rte_free(q);
+		return -1;
+	}
+
+	q->qgrp_id = (q_idx >> GRP_ID_SHIFT) & 0xF;
+	q->vf_id = (q_idx >> VF_ID_SHIFT) & 0x3F;
+	q->aq_id = q_idx & 0xF;
+	q->aq_depth = (conf->op_type ==  RTE_BBDEV_OP_TURBO_DEC) ?
+			(1 << d->acc100_conf.q_ul_4g.aq_depth_log2) :
+			(1 << d->acc100_conf.q_dl_4g.aq_depth_log2);
+
+	q->mmio_reg_enqueue = RTE_PTR_ADD(d->mmio_base,
+			queue_offset(d->pf_device,
+					q->vf_id, q->qgrp_id, q->aq_id));
+
+	rte_bbdev_log_debug(
+			"Setup dev%u q%u: qgrp_id=%u, vf_id=%u, aq_id=%u, aq_depth=%u, mmio_reg_enqueue=%p",
+			dev->data->dev_id, queue_id, q->qgrp_id, q->vf_id,
+			q->aq_id, q->aq_depth, q->mmio_reg_enqueue);
+
+	dev->data->queues[queue_id].queue_private = q;
+	return 0;
+}
+
+/* Release ACC100 queue */
+static int
+acc100_queue_release(struct rte_bbdev *dev, uint16_t q_id)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	struct acc100_queue *q = dev->data->queues[q_id].queue_private;
+
+	if (q != NULL) {
+		/* Mark the queue as unassigned */
+		d->q_assigned_bit_map[q->qgrp_id] &= ~(1 << q->aq_id);
+		rte_free(q->lb_in);
+		rte_free(q->lb_out);
+		rte_free(q);
+		dev->data->queues[q_id].queue_private = NULL;
+	}
+
 	return 0;
 }
 
@@ -258,8 +673,11 @@
 }
 
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
+	.setup_queues = acc100_setup_queues,
 	.close = acc100_dev_close,
 	.info_get = acc100_dev_info_get,
+	.queue_setup = acc100_queue_setup,
+	.queue_release = acc100_queue_release,
 };
 
 /* ACC100 PCI PF address map */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 662e2c8..0e2b79c 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -518,11 +518,56 @@ struct acc100_registry_addr {
 	.ddr_range = HWVfDmaDdrBaseRangeRoVf,
 };
 
+/* Structure associated with each queue. */
+struct __rte_cache_aligned acc100_queue {
+	union acc100_dma_desc *ring_addr;  /* Virtual address of sw ring */
+	rte_iova_t ring_addr_phys;  /* Physical address of software ring */
+	uint32_t sw_ring_head;  /* software ring head */
+	uint32_t sw_ring_tail;  /* software ring tail */
+	/* software ring size (descriptors, not bytes) */
+	uint32_t sw_ring_depth;
+	/* mask used to wrap enqueued descriptors on the sw ring */
+	uint32_t sw_ring_wrap_mask;
+	/* MMIO register used to enqueue descriptors */
+	void *mmio_reg_enqueue;
+	uint8_t vf_id;  /* VF ID (max = 63) */
+	uint8_t qgrp_id;  /* Queue Group ID */
+	uint16_t aq_id;  /* Atomic Queue ID */
+	uint16_t aq_depth;  /* Depth of atomic queue */
+	uint32_t aq_enqueued;  /* Count how many "batches" have been enqueued */
+	uint32_t aq_dequeued;  /* Count how many "batches" have been dequeued */
+	uint32_t irq_enable;  /* Enable ops dequeue interrupts if set to 1 */
+	struct rte_mempool *fcw_mempool;  /* FCW mempool */
+	enum rte_bbdev_op_type op_type;  /* Operation type of this queue */
+	/* Internal Buffers for loopback input */
+	uint8_t *lb_in;
+	uint8_t *lb_out;
+	rte_iova_t lb_in_addr_phys;
+	rte_iova_t lb_out_addr_phys;
+	struct acc100_device *d;
+};
+
 /* Private data structure for each ACC100 device */
 struct acc100_device {
 	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
+	void *sw_rings_base;  /* Base addr of unaligned memory for sw rings */
+	void *sw_rings;  /* 64MB of 64MB-aligned memory for sw rings */
+	rte_iova_t sw_rings_phys;  /* Physical address of sw_rings */
+	/* Virtual address of the info memory routed to this function under
+	 * operation, whether it is PF or VF.
+	 */
+	union acc100_harq_layout_data *harq_layout;
+	uint32_t sw_ring_size;
 	uint32_t ddr_size; /* Size in kB */
+	uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */
+	rte_iova_t tail_ptr_phys; /* Physical address of tail pointers */
+	/* Max number of entries available for each queue in device, depending
+	 * on how many queues are enabled with configure()
+	 */
+	uint32_t sw_ring_max_depth;
 	struct acc100_conf acc100_conf; /* ACC100 Initial configuration */
+	/* Bitmap capturing which queues have already been assigned */
+	uint16_t q_assigned_bit_map[ACC100_NUM_QGRPS];
 	bool pf_device; /**< True if this is a PF ACC100 device */
 	bool configured; /**< True if this ACC100 device is configured */
 };
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v5 05/11] baseband/acc100: add LDPC processing functions
  2020-09-23  2:12   ` [dpdk-dev] [PATCH v5 " Nicolas Chautru
                       ` (3 preceding siblings ...)
  2020-09-23  2:12     ` [dpdk-dev] [PATCH v5 04/11] baseband/acc100: add queue configuration Nicolas Chautru
@ 2020-09-23  2:12     ` Nicolas Chautru
  2020-09-23  2:12     ` [dpdk-dev] [PATCH v5 06/11] baseband/acc100: add HARQ loopback support Nicolas Chautru
                       ` (5 subsequent siblings)
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-23  2:12 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Adding LDPC decode and encode processing operations

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
Acked-by: Dave Burley <dave.burley@accelercomm.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 1625 +++++++++++++++++++++++++++++-
 drivers/baseband/acc100/rte_acc100_pmd.h |    3 +
 2 files changed, 1626 insertions(+), 2 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 7a21c57..b223547 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -15,6 +15,9 @@
 #include <rte_hexdump.h>
 #include <rte_pci.h>
 #include <rte_bus_pci.h>
+#ifdef RTE_BBDEV_OFFLOAD_COST
+#include <rte_cycles.h>
+#endif
 
 #include <rte_bbdev.h>
 #include <rte_bbdev_pmd.h>
@@ -449,7 +452,6 @@
 	return 0;
 }
 
-
 /**
  * Report an ACC100 queue index which is free.
  * Return 0 to 16k for a valid queue_idx or -1 when no queue is available.
@@ -634,6 +636,46 @@
 	struct acc100_device *d = dev->data->dev_private;
 
 	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
+		{
+			.type   = RTE_BBDEV_OP_LDPC_ENC,
+			.cap.ldpc_enc = {
+				.capability_flags =
+					RTE_BBDEV_LDPC_RATE_MATCH |
+					RTE_BBDEV_LDPC_CRC_24B_ATTACH |
+					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
+				.num_buffers_src =
+						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+				.num_buffers_dst =
+						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+			}
+		},
+		{
+			.type   = RTE_BBDEV_OP_LDPC_DEC,
+			.cap.ldpc_dec = {
+			.capability_flags =
+				RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK |
+				RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP |
+				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
+				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
+#ifdef ACC100_EXT_MEM
+				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
+				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
+#endif
+				RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE |
+				RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS |
+				RTE_BBDEV_LDPC_DECODE_BYPASS |
+				RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
+				RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
+				RTE_BBDEV_LDPC_LLR_COMPRESSION,
+			.llr_size = 8,
+			.llr_decimals = 1,
+			.num_buffers_src =
+					RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+			.num_buffers_hard_out =
+					RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+			.num_buffers_soft_out = 0,
+			}
+		},
 		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
 	};
 
@@ -669,9 +711,14 @@
 	dev_info->cpu_flag_reqs = NULL;
 	dev_info->min_alignment = 64;
 	dev_info->capabilities = bbdev_capabilities;
+#ifdef ACC100_EXT_MEM
 	dev_info->harq_buffer_size = d->ddr_size;
+#else
+	dev_info->harq_buffer_size = 0;
+#endif
 }
 
+
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.setup_queues = acc100_setup_queues,
 	.close = acc100_dev_close,
@@ -696,6 +743,1577 @@
 	{.device_id = 0},
 };
 
+/* Read flag value 0/1 from bitmap */
+static inline bool
+check_bit(uint32_t bitmap, uint32_t bitmask)
+{
+	return bitmap & bitmask;
+}
+
+static inline char *
+mbuf_append(struct rte_mbuf *m_head, struct rte_mbuf *m, uint16_t len)
+{
+	if (unlikely(len > rte_pktmbuf_tailroom(m)))
+		return NULL;
+
+	char *tail = (char *)m->buf_addr + m->data_off + m->data_len;
+	m->data_len = (uint16_t)(m->data_len + len);
+	m_head->pkt_len += len;
+	return tail;
+}
+
+/* Compute value of k0.
+ * Based on 3GPP 38.212 Table 5.4.2.1-2
+ * Starting position of different redundancy versions, k0
+ */
+static inline uint16_t
+get_k0(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv_index)
+{
+	if (rv_index == 0)
+		return 0;
+	uint16_t n = (bg == 1 ? N_ZC_1 : N_ZC_2) * z_c;
+	if (n_cb == n) {
+		if (rv_index == 1)
+			return (bg == 1 ? K0_1_1 : K0_1_2) * z_c;
+		else if (rv_index == 2)
+			return (bg == 1 ? K0_2_1 : K0_2_2) * z_c;
+		else
+			return (bg == 1 ? K0_3_1 : K0_3_2) * z_c;
+	}
+	/* LBRM case - includes a division by N */
+	if (rv_index == 1)
+		return (((bg == 1 ? K0_1_1 : K0_1_2) * n_cb)
+				/ n) * z_c;
+	else if (rv_index == 2)
+		return (((bg == 1 ? K0_2_1 : K0_2_2) * n_cb)
+				/ n) * z_c;
+	else
+		return (((bg == 1 ? K0_3_1 : K0_3_2) * n_cb)
+				/ n) * z_c;
+}
+
+/* Fill in a frame control word for LDPC encoding. */
+static inline void
+acc100_fcw_le_fill(const struct rte_bbdev_enc_op *op,
+		struct acc100_fcw_le *fcw, int num_cb)
+{
+	fcw->qm = op->ldpc_enc.q_m;
+	fcw->nfiller = op->ldpc_enc.n_filler;
+	fcw->BG = (op->ldpc_enc.basegraph - 1);
+	fcw->Zc = op->ldpc_enc.z_c;
+	fcw->ncb = op->ldpc_enc.n_cb;
+	fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_enc.basegraph,
+			op->ldpc_enc.rv_index);
+	fcw->rm_e = op->ldpc_enc.cb_params.e;
+	fcw->crc_select = check_bit(op->ldpc_enc.op_flags,
+			RTE_BBDEV_LDPC_CRC_24B_ATTACH);
+	fcw->bypass_intlv = check_bit(op->ldpc_enc.op_flags,
+			RTE_BBDEV_LDPC_INTERLEAVER_BYPASS);
+	fcw->mcb_count = num_cb;
+}
+
+/* Fill in a frame control word for LDPC decoding. */
+static inline void
+acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
+		union acc100_harq_layout_data *harq_layout)
+{
+	uint16_t harq_out_length, harq_in_length, ncb_p, k0_p, parity_offset;
+	uint16_t harq_index;
+	uint32_t l;
+	bool harq_prun = false;
+
+	fcw->qm = op->ldpc_dec.q_m;
+	fcw->nfiller = op->ldpc_dec.n_filler;
+	fcw->BG = (op->ldpc_dec.basegraph - 1);
+	fcw->Zc = op->ldpc_dec.z_c;
+	fcw->ncb = op->ldpc_dec.n_cb;
+	fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_dec.basegraph,
+			op->ldpc_dec.rv_index);
+	if (op->ldpc_dec.code_block_mode == 1)
+		fcw->rm_e = op->ldpc_dec.cb_params.e;
+	else
+		fcw->rm_e = (op->ldpc_dec.tb_params.r <
+				op->ldpc_dec.tb_params.cab) ?
+						op->ldpc_dec.tb_params.ea :
+						op->ldpc_dec.tb_params.eb;
+
+	fcw->hcin_en = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE);
+	fcw->hcout_en = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE);
+	fcw->crc_select = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK);
+	fcw->bypass_dec = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_DECODE_BYPASS);
+	fcw->bypass_intlv = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS);
+	if (op->ldpc_dec.q_m == 1) {
+		fcw->bypass_intlv = 1;
+		fcw->qm = 2;
+	}
+	fcw->hcin_decomp_mode = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+	fcw->hcout_comp_mode = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+	fcw->llr_pack_mode = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_LLR_COMPRESSION);
+	harq_index = op->ldpc_dec.harq_combined_output.offset /
+			ACC100_HARQ_OFFSET;
+#ifdef ACC100_EXT_MEM
+	/* Limit cases when HARQ pruning is valid */
+	harq_prun = ((op->ldpc_dec.harq_combined_output.offset %
+			ACC100_HARQ_OFFSET) == 0) &&
+			(op->ldpc_dec.harq_combined_output.offset <= UINT16_MAX
+			* ACC100_HARQ_OFFSET);
+#endif
+	if (fcw->hcin_en > 0) {
+		harq_in_length = op->ldpc_dec.harq_combined_input.length;
+		if (fcw->hcin_decomp_mode > 0)
+			harq_in_length = harq_in_length * 8 / 6;
+		harq_in_length = RTE_ALIGN(harq_in_length, 64);
+		if ((harq_layout[harq_index].offset > 0) && harq_prun) {
+			rte_bbdev_log_debug("HARQ IN offset unexpected for now\n");
+			fcw->hcin_size0 = harq_layout[harq_index].size0;
+			fcw->hcin_offset = harq_layout[harq_index].offset;
+			fcw->hcin_size1 = harq_in_length -
+					harq_layout[harq_index].offset;
+		} else {
+			fcw->hcin_size0 = harq_in_length;
+			fcw->hcin_offset = 0;
+			fcw->hcin_size1 = 0;
+		}
+	} else {
+		fcw->hcin_size0 = 0;
+		fcw->hcin_offset = 0;
+		fcw->hcin_size1 = 0;
+	}
+
+	fcw->itmax = op->ldpc_dec.iter_max;
+	fcw->itstop = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE);
+	fcw->synd_precoder = fcw->itstop;
+	/*
+	 * These are all implicitly set
+	 * fcw->synd_post = 0;
+	 * fcw->so_en = 0;
+	 * fcw->so_bypass_rm = 0;
+	 * fcw->so_bypass_intlv = 0;
+	 * fcw->dec_convllr = 0;
+	 * fcw->hcout_convllr = 0;
+	 * fcw->hcout_size1 = 0;
+	 * fcw->so_it = 0;
+	 * fcw->hcout_offset = 0;
+	 * fcw->negstop_th = 0;
+	 * fcw->negstop_it = 0;
+	 * fcw->negstop_en = 0;
+	 * fcw->gain_i = 1;
+	 * fcw->gain_h = 1;
+	 */
+	if (fcw->hcout_en > 0) {
+		parity_offset = (op->ldpc_dec.basegraph == 1 ? 20 : 8)
+			* op->ldpc_dec.z_c - op->ldpc_dec.n_filler;
+		k0_p = (fcw->k0 > parity_offset) ?
+				fcw->k0 - op->ldpc_dec.n_filler : fcw->k0;
+		ncb_p = fcw->ncb - op->ldpc_dec.n_filler;
+		l = k0_p + fcw->rm_e;
+		harq_out_length = (uint16_t) fcw->hcin_size0;
+		harq_out_length = RTE_MIN(RTE_MAX(harq_out_length, l), ncb_p);
+		/* Round up to the next multiple of 64 */
+		harq_out_length = (harq_out_length + 0x3F) & 0xFFC0;
+		if ((k0_p > fcw->hcin_size0 + ACC100_HARQ_OFFSET_THRESHOLD) &&
+				harq_prun) {
+			fcw->hcout_size0 = (uint16_t) fcw->hcin_size0;
+			fcw->hcout_offset = k0_p & 0xFFC0;
+			fcw->hcout_size1 = harq_out_length - fcw->hcout_offset;
+		} else {
+			fcw->hcout_size0 = harq_out_length;
+			fcw->hcout_size1 = 0;
+			fcw->hcout_offset = 0;
+		}
+		harq_layout[harq_index].offset = fcw->hcout_offset;
+		harq_layout[harq_index].size0 = fcw->hcout_size0;
+	} else {
+		fcw->hcout_size0 = 0;
+		fcw->hcout_size1 = 0;
+		fcw->hcout_offset = 0;
+	}
+}
+
+/**
+ * Fills descriptor with data pointers of one block type.
+ *
+ * @param desc
+ *   Pointer to DMA descriptor.
+ * @param input
+ *   Pointer to pointer to input data which will be encoded. It can be
+ *   changed and then points to the next segment in the scatter-gather case.
+ * @param offset
+ *   Input offset in rte_mbuf structure. It is used for calculating the point
+ *   where data starts.
+ * @param cb_len
+ *   Length of currently processed Code Block.
+ * @param seg_total_left
+ *   Indicates how many bytes are left in the segment (mbuf) for further
+ *   processing.
+ * @param next_triplet
+ *   Index for ACC100 DMA Descriptor triplet.
+ *
+ * @return
+ *   Index of next triplet on success, a negative value if the lengths of
+ *   the pkt and the processed CB do not match.
+ *
+ */
+static inline int
+acc100_dma_fill_blk_type_in(struct acc100_dma_req_desc *desc,
+		struct rte_mbuf **input, uint32_t *offset, uint32_t cb_len,
+		uint32_t *seg_total_left, int next_triplet)
+{
+	uint32_t part_len;
+	struct rte_mbuf *m = *input;
+
+	part_len = (*seg_total_left < cb_len) ? *seg_total_left : cb_len;
+	cb_len -= part_len;
+	*seg_total_left -= part_len;
+
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(m, *offset);
+	desc->data_ptrs[next_triplet].blen = part_len;
+	desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
+	desc->data_ptrs[next_triplet].last = 0;
+	desc->data_ptrs[next_triplet].dma_ext = 0;
+	*offset += part_len;
+	next_triplet++;
+
+	while (cb_len > 0) {
+		if (next_triplet < ACC100_DMA_MAX_NUM_POINTERS &&
+				m->next != NULL) {
+
+			m = m->next;
+			*seg_total_left = rte_pktmbuf_data_len(m);
+			part_len = (*seg_total_left < cb_len) ?
+					*seg_total_left :
+					cb_len;
+			desc->data_ptrs[next_triplet].address =
+					rte_pktmbuf_iova_offset(m, 0);
+			desc->data_ptrs[next_triplet].blen = part_len;
+			desc->data_ptrs[next_triplet].blkid =
+					ACC100_DMA_BLKID_IN;
+			desc->data_ptrs[next_triplet].last = 0;
+			desc->data_ptrs[next_triplet].dma_ext = 0;
+			cb_len -= part_len;
+			*seg_total_left -= part_len;
+			/* Initializing offset for next segment (mbuf) */
+			*offset = part_len;
+			next_triplet++;
+		} else {
+			rte_bbdev_log(ERR,
+				"Some data still left for processing: "
+				"data_left: %u, next_triplet: %u, next_mbuf: %p",
+				cb_len, next_triplet, m->next);
+			return -EINVAL;
+		}
+	}
+	/* Store the new mbuf, as it may have advanced in the scatter-gather case */
+	*input = m;
+
+	return next_triplet;
+}
+
+/* Fills descriptor with data pointers of one block type.
+ * Returns index of next triplet on success, other value if lengths of
+ * output data and processed mbuf do not match.
+ */
+static inline int
+acc100_dma_fill_blk_type_out(struct acc100_dma_req_desc *desc,
+		struct rte_mbuf *output, uint32_t out_offset,
+		uint32_t output_len, int next_triplet, int blk_id)
+{
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(output, out_offset);
+	desc->data_ptrs[next_triplet].blen = output_len;
+	desc->data_ptrs[next_triplet].blkid = blk_id;
+	desc->data_ptrs[next_triplet].last = 0;
+	desc->data_ptrs[next_triplet].dma_ext = 0;
+	next_triplet++;
+
+	return next_triplet;
+}
+
+static inline int
+acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
+		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
+		struct rte_mbuf *output, uint32_t *in_offset,
+		uint32_t *out_offset, uint32_t *out_length,
+		uint32_t *mbuf_total_left, uint32_t *seg_total_left)
+{
+	int next_triplet = 1; /* FCW already done */
+	uint16_t K, in_length_in_bits, in_length_in_bytes;
+	struct rte_bbdev_op_ldpc_enc *enc = &op->ldpc_enc;
+
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /* Timestamp (could be disabled) */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+
+	K = (enc->basegraph == 1 ? 22 : 10) * enc->z_c;
+	in_length_in_bits = K - enc->n_filler;
+	if ((enc->op_flags & RTE_BBDEV_LDPC_CRC_24A_ATTACH) ||
+			(enc->op_flags & RTE_BBDEV_LDPC_CRC_24B_ATTACH))
+		in_length_in_bits -= 24;
+	in_length_in_bytes = in_length_in_bits >> 3;
+
+	if (unlikely((*mbuf_total_left == 0) ||
+			(*mbuf_total_left < in_length_in_bytes))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, in_length_in_bytes);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
+			in_length_in_bytes,
+			seg_total_left, next_triplet);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= in_length_in_bytes;
+
+	/* Set output length: integer round-up division by 8 */
+	*out_length = (enc->cb_params.e + 7) >> 3;
+
+	next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
+			*out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	op->ldpc_enc.output.length += *out_length;
+	*out_offset += *out_length;
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->data_ptrs[next_triplet - 1].dma_ext = 0;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
+
+static inline int
+acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
+		struct acc100_dma_req_desc *desc,
+		struct rte_mbuf **input, struct rte_mbuf *h_output,
+		uint32_t *in_offset, uint32_t *h_out_offset,
+		uint32_t *h_out_length, uint32_t *mbuf_total_left,
+		uint32_t *seg_total_left,
+		struct acc100_fcw_ld *fcw)
+{
+	struct rte_bbdev_op_ldpc_dec *dec = &op->ldpc_dec;
+	int next_triplet = 1; /* FCW already done */
+	uint32_t input_length;
+	uint16_t output_length, crc24_overlap = 0;
+	uint16_t sys_cols, K, h_p_size, h_np_size;
+	bool h_comp = check_bit(dec->op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /* Timestamp (could be disabled) */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+
+	if (check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP))
+		crc24_overlap = 24;
+
+	/* Compute some LDPC BG lengths */
+	input_length = dec->cb_params.e;
+	if (check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_LLR_COMPRESSION))
+		input_length = (input_length * 3 + 3) / 4;
+	sys_cols = (dec->basegraph == 1) ? 22 : 10;
+	K = sys_cols * dec->z_c;
+	output_length = K - dec->n_filler - crc24_overlap;
+
+	if (unlikely((*mbuf_total_left == 0) ||
+			(*mbuf_total_left < input_length))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, input_length);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input,
+			in_offset, input_length,
+			seg_total_left, next_triplet);
+
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
+		h_p_size = fcw->hcin_size0 + fcw->hcin_size1;
+		if (h_comp)
+			h_p_size = (h_p_size * 3 + 3) / 4;
+		desc->data_ptrs[next_triplet].address =
+				dec->harq_combined_input.offset;
+		desc->data_ptrs[next_triplet].blen = h_p_size;
+		desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN_HARQ;
+		desc->data_ptrs[next_triplet].dma_ext = 1;
+#ifndef ACC100_EXT_MEM
+		acc100_dma_fill_blk_type_out(
+				desc,
+				op->ldpc_dec.harq_combined_input.data,
+				op->ldpc_dec.harq_combined_input.offset,
+				h_p_size,
+				next_triplet,
+				ACC100_DMA_BLKID_IN_HARQ);
+#endif
+		next_triplet++;
+	}
+
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= input_length;
+
+	next_triplet = acc100_dma_fill_blk_type_out(desc, h_output,
+			*h_out_offset, output_length >> 3, next_triplet,
+			ACC100_DMA_BLKID_OUT_HARD);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
+		/* Pruned size of the HARQ */
+		h_p_size = fcw->hcout_size0 + fcw->hcout_size1;
+		/* Non-Pruned size of the HARQ */
+		h_np_size = fcw->hcout_offset > 0 ?
+				fcw->hcout_offset + fcw->hcout_size1 :
+				h_p_size;
+		if (h_comp) {
+			h_np_size = (h_np_size * 3 + 3) / 4;
+			h_p_size = (h_p_size * 3 + 3) / 4;
+		}
+		dec->harq_combined_output.length = h_np_size;
+		desc->data_ptrs[next_triplet].address =
+				dec->harq_combined_output.offset;
+		desc->data_ptrs[next_triplet].blen = h_p_size;
+		desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_OUT_HARQ;
+		desc->data_ptrs[next_triplet].dma_ext = 1;
+#ifndef ACC100_EXT_MEM
+		acc100_dma_fill_blk_type_out(
+				desc,
+				dec->harq_combined_output.data,
+				dec->harq_combined_output.offset,
+				h_p_size,
+				next_triplet,
+				ACC100_DMA_BLKID_OUT_HARQ);
+#endif
+		next_triplet++;
+	}
+
+	*h_out_length = output_length >> 3;
+	dec->hard_output.length += *h_out_length;
+	*h_out_offset += *h_out_length;
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
+
+static inline void
+acc100_dma_desc_ld_update(struct rte_bbdev_dec_op *op,
+		struct acc100_dma_req_desc *desc,
+		struct rte_mbuf *input, struct rte_mbuf *h_output,
+		uint32_t *in_offset, uint32_t *h_out_offset,
+		uint32_t *h_out_length,
+		union acc100_harq_layout_data *harq_layout)
+{
+	int next_triplet = 1; /* FCW already done */
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(input, *in_offset);
+	next_triplet++;
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
+		struct rte_bbdev_op_data hi = op->ldpc_dec.harq_combined_input;
+		desc->data_ptrs[next_triplet].address = hi.offset;
+#ifndef ACC100_EXT_MEM
+		desc->data_ptrs[next_triplet].address =
+				rte_pktmbuf_iova_offset(hi.data, hi.offset);
+#endif
+		next_triplet++;
+	}
+
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(h_output, *h_out_offset);
+	*h_out_length = desc->data_ptrs[next_triplet].blen;
+	next_triplet++;
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
+		desc->data_ptrs[next_triplet].address =
+				op->ldpc_dec.harq_combined_output.offset;
+		/* Adjust based on previous operation */
+		struct rte_bbdev_dec_op *prev_op = desc->op_addr;
+		op->ldpc_dec.harq_combined_output.length =
+				prev_op->ldpc_dec.harq_combined_output.length;
+		int16_t hq_idx = op->ldpc_dec.harq_combined_output.offset /
+				ACC100_HARQ_OFFSET;
+		int16_t prev_hq_idx =
+				prev_op->ldpc_dec.harq_combined_output.offset
+				/ ACC100_HARQ_OFFSET;
+		harq_layout[hq_idx].val = harq_layout[prev_hq_idx].val;
+#ifndef ACC100_EXT_MEM
+		struct rte_bbdev_op_data ho =
+				op->ldpc_dec.harq_combined_output;
+		desc->data_ptrs[next_triplet].address =
+				rte_pktmbuf_iova_offset(ho.data, ho.offset);
+#endif
+		next_triplet++;
+	}
+
+	op->ldpc_dec.hard_output.length += *h_out_length;
+	desc->op_addr = op;
+}
+
+
+/* Enqueue a number of operations to HW and update software rings */
+static inline void
+acc100_dma_enqueue(struct acc100_queue *q, uint16_t n,
+		struct rte_bbdev_stats *queue_stats)
+{
+	union acc100_enqueue_reg_fmt enq_req;
+#ifdef RTE_BBDEV_OFFLOAD_COST
+	uint64_t start_time = 0;
+	queue_stats->acc_offload_cycles = 0;
+#else
+	RTE_SET_USED(queue_stats);
+#endif
+
+	enq_req.val = 0;
+	/* Setting offset, 100b for 256 DMA Desc */
+	enq_req.addr_offset = ACC100_DESC_OFFSET;
+
+	/* Split ops into batches */
+	do {
+		union acc100_dma_desc *desc;
+		uint16_t enq_batch_size;
+		uint64_t offset;
+		rte_iova_t req_elem_addr;
+
+		enq_batch_size = RTE_MIN(n, MAX_ENQ_BATCH_SIZE);
+
+		/* Set flag on last descriptor in a batch */
+		desc = q->ring_addr + ((q->sw_ring_head + enq_batch_size - 1) &
+				q->sw_ring_wrap_mask);
+		desc->req.last_desc_in_batch = 1;
+
+		/* Calculate the 1st descriptor's address */
+		offset = ((q->sw_ring_head & q->sw_ring_wrap_mask) *
+				sizeof(union acc100_dma_desc));
+		req_elem_addr = q->ring_addr_phys + offset;
+
+		/* Fill enqueue struct */
+		enq_req.num_elem = enq_batch_size;
+		/* low 6 bits are not needed */
+		enq_req.req_elem_addr = (uint32_t)(req_elem_addr >> 6);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "Req sdone", desc, sizeof(*desc));
+#endif
+		rte_bbdev_log_debug(
+				"Enqueue %u reqs (phys %#"PRIx64") to reg %p",
+				enq_batch_size,
+				req_elem_addr,
+				(void *)q->mmio_reg_enqueue);
+
+		rte_wmb();
+
+#ifdef RTE_BBDEV_OFFLOAD_COST
+		/* Start time measurement for enqueue function offload. */
+		start_time = rte_rdtsc_precise();
+#endif
+		rte_bbdev_log(DEBUG, "Debug: MMIO Enqueue");
+		mmio_write(q->mmio_reg_enqueue, enq_req.val);
+
+#ifdef RTE_BBDEV_OFFLOAD_COST
+		queue_stats->acc_offload_cycles +=
+				rte_rdtsc_precise() - start_time;
+#endif
+
+		q->aq_enqueued++;
+		q->sw_ring_head += enq_batch_size;
+		n -= enq_batch_size;
+
+	} while (n);
+}
+
+/* Enqueue a number of LDPC encode operations for the ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
+		uint16_t total_enqueued_cbs, int16_t num)
+{
+	union acc100_dma_desc *desc = NULL;
+	uint32_t out_length;
+	struct rte_mbuf *output_head, *output;
+	int i, next_triplet;
+	uint16_t  in_length_in_bytes;
+	struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_le_fill(ops[0], &desc->req.fcw_le, num);
+
+	/* This could be done at polling time */
+	desc->req.word0 = ACC100_DMA_DESC_TYPE;
+	desc->req.word1 = 0; /* Timestamp (could be disabled) */
+	desc->req.word2 = 0;
+	desc->req.word3 = 0;
+	desc->req.numCBs = num;
+
+	in_length_in_bytes = ops[0]->ldpc_enc.input.data->data_len;
+	out_length = (enc->cb_params.e + 7) >> 3;
+	desc->req.m2dlen = 1 + num;
+	desc->req.d2mlen = num;
+	next_triplet = 1;
+
+	for (i = 0; i < num; i++) {
+		desc->req.data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(ops[i]->ldpc_enc.input.data, 0);
+		desc->req.data_ptrs[next_triplet].blen = in_length_in_bytes;
+		next_triplet++;
+		desc->req.data_ptrs[next_triplet].address =
+				rte_pktmbuf_iova_offset(
+				ops[i]->ldpc_enc.output.data, 0);
+		desc->req.data_ptrs[next_triplet].blen = out_length;
+		next_triplet++;
+		ops[i]->ldpc_enc.output.length = out_length;
+		output_head = output = ops[i]->ldpc_enc.output.data;
+		mbuf_append(output_head, output, out_length);
+		output->data_len = out_length;
+	}
+
+	desc->req.op_addr = ops[0];
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_le,
+			sizeof(desc->req.fcw_le) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+	/* Multiple CBs (one per op) were successfully prepared to enqueue */
+	return num;
+}
+
+/* Enqueue one LDPC encode operation for the ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+		uint16_t total_enqueued_cbs)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+		seg_total_left;
+	struct rte_mbuf *input, *output_head, *output;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_le_fill(op, &desc->req.fcw_le, 1);
+
+	input = op->ldpc_enc.input.data;
+	output_head = output = op->ldpc_enc.output.data;
+	in_offset = op->ldpc_enc.input.offset;
+	out_offset = op->ldpc_enc.output.offset;
+	out_length = 0;
+	mbuf_total_left = op->ldpc_enc.input.length;
+	seg_total_left = rte_pktmbuf_data_len(op->ldpc_enc.input.data)
+			- in_offset;
+
+	ret = acc100_dma_desc_le_fill(op, &desc->req, &input, output,
+			&in_offset, &out_offset, &out_length, &mbuf_total_left,
+			&seg_total_left);
+
+	if (unlikely(ret < 0))
+		return ret;
+
+	mbuf_append(output_head, output, out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_le,
+			sizeof(desc->req.fcw_le) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+
+	/* Check if any data left after processing one CB */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
				"Some data still left after processing one CB: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+/* Enqueue one LDPC decode operation for the ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs, bool same_op)
+{
+	int ret;
+
+	union acc100_dma_desc *desc;
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	struct rte_mbuf *input, *h_output_head, *h_output;
+	uint32_t in_offset, h_out_offset, mbuf_total_left, h_out_length = 0;
+	input = op->ldpc_dec.input.data;
+	h_output_head = h_output = op->ldpc_dec.hard_output.data;
+	in_offset = op->ldpc_dec.input.offset;
+	h_out_offset = op->ldpc_dec.hard_output.offset;
+	mbuf_total_left = op->ldpc_dec.input.length;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(input == NULL)) {
+		rte_bbdev_log(ERR, "Invalid mbuf pointer");
+		return -EFAULT;
+	}
+#endif
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+
+	if (same_op) {
+		union acc100_dma_desc *prev_desc;
+		desc_idx = ((q->sw_ring_head + total_enqueued_cbs - 1)
+				& q->sw_ring_wrap_mask);
+		prev_desc = q->ring_addr + desc_idx;
+		uint8_t *prev_ptr = (uint8_t *) prev_desc;
+		uint8_t *new_ptr = (uint8_t *) desc;
+		/* Copy first 4 words and BDESCs */
+		rte_memcpy(new_ptr, prev_ptr, 16);
+		rte_memcpy(new_ptr + 36, prev_ptr + 36, 40);
+		desc->req.op_addr = prev_desc->req.op_addr;
+		/* Copy FCW */
+		rte_memcpy(new_ptr + ACC100_DESC_FCW_OFFSET,
+				prev_ptr + ACC100_DESC_FCW_OFFSET,
+				ACC100_FCW_LD_BLEN);
+		acc100_dma_desc_ld_update(op, &desc->req, input, h_output,
+				&in_offset, &h_out_offset,
+				&h_out_length, harq_layout);
+	} else {
+		struct acc100_fcw_ld *fcw;
+		uint32_t seg_total_left;
+		fcw = &desc->req.fcw_ld;
+		acc100_fcw_ld_fill(op, fcw, harq_layout);
+
+		/* Special handling when overusing mbuf */
+		if (fcw->rm_e < MAX_E_MBUF)
+			seg_total_left = rte_pktmbuf_data_len(input)
+					- in_offset;
+		else
+			seg_total_left = fcw->rm_e;
+
+		ret = acc100_dma_desc_ld_fill(op, &desc->req, &input, h_output,
+				&in_offset, &h_out_offset,
+				&h_out_length, &mbuf_total_left,
+				&seg_total_left, fcw);
+		if (unlikely(ret < 0))
+			return ret;
+	}
+
+	/* Hard output */
+	mbuf_append(h_output_head, h_output, h_out_length);
+#ifndef ACC100_EXT_MEM
+	if (op->ldpc_dec.harq_combined_output.length > 0) {
+		/* Push the HARQ output into host memory */
+		struct rte_mbuf *hq_output_head, *hq_output;
+		hq_output_head = op->ldpc_dec.harq_combined_output.data;
+		hq_output = op->ldpc_dec.harq_combined_output.data;
+		mbuf_append(hq_output_head, hq_output,
+				op->ldpc_dec.harq_combined_output.length);
+	}
+#endif
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
+			sizeof(desc->req.fcw_ld) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+
+/* Enqueue one LDPC decode operation for ACC100 device in TB mode */
+static inline int
+enqueue_ldpc_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint8_t r, c;
+	uint32_t in_offset, h_out_offset,
+		h_out_length, mbuf_total_left, seg_total_left;
+	struct rte_mbuf *input, *h_output_head, *h_output;
+	uint16_t current_enqueued_cbs = 0;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+	acc100_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout);
+
+	input = op->ldpc_dec.input.data;
+	h_output_head = h_output = op->ldpc_dec.hard_output.data;
+	in_offset = op->ldpc_dec.input.offset;
+	h_out_offset = op->ldpc_dec.hard_output.offset;
+	h_out_length = 0;
+	mbuf_total_left = op->ldpc_dec.input.length;
+	c = op->ldpc_dec.tb_params.c;
+	r = op->ldpc_dec.tb_params.r;
+
+	while (mbuf_total_left > 0 && r < c) {
+
+		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+
+		/* Set up DMA descriptor */
+		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+				& q->sw_ring_wrap_mask);
+		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+		desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
+		ret = acc100_dma_desc_ld_fill(op, &desc->req, &input,
+				h_output, &in_offset, &h_out_offset,
+				&h_out_length,
+				&mbuf_total_left, &seg_total_left,
+				&desc->req.fcw_ld);
+
+		if (unlikely(ret < 0))
+			return ret;
+
+		/* Hard output */
+		mbuf_append(h_output_head, h_output, h_out_length);
+
+		/* Set total number of CBs in TB */
+		desc->req.cbs_in_tb = cbs_in_tb;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
+				sizeof(desc->req.fcw_ld) - 8);
+		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+		if (seg_total_left == 0) {
+			/* Go to the next mbuf */
+			input = input->next;
+			in_offset = 0;
+			h_output = h_output->next;
+			h_out_offset = 0;
+		}
+		total_enqueued_cbs++;
+		current_enqueued_cbs++;
+		r++;
+	}
+
+	if (unlikely(desc == NULL))
+		return current_enqueued_cbs;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Check if any CBs left for processing */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left for processing: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+	/* Set SDone on last CB descriptor for TB mode */
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	return current_enqueued_cbs;
+}
+
+
+/* Calculates number of CBs in processed encoder TB based on 'r' and input
+ * length.
+ */
+static inline uint8_t
+get_num_cbs_in_tb_enc(struct rte_bbdev_op_turbo_enc *turbo_enc)
+{
+	uint8_t c, c_neg, r, crc24_bits = 0;
+	uint16_t k, k_neg, k_pos;
+	uint8_t cbs_in_tb = 0;
+	int32_t length;
+
+	length = turbo_enc->input.length;
+	r = turbo_enc->tb_params.r;
+	c = turbo_enc->tb_params.c;
+	c_neg = turbo_enc->tb_params.c_neg;
+	k_neg = turbo_enc->tb_params.k_neg;
+	k_pos = turbo_enc->tb_params.k_pos;
+	crc24_bits = 0;
+	if (check_bit(turbo_enc->op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH))
+		crc24_bits = 24;
+	while (length > 0 && r < c) {
+		k = (r < c_neg) ? k_neg : k_pos;
+		length -= (k - crc24_bits) >> 3;
+		r++;
+		cbs_in_tb++;
+	}
+
+	return cbs_in_tb;
+}
+
+/* Calculates number of CBs in processed decoder TB based on 'r' and input
+ * length.
+ */
+static inline uint16_t
+get_num_cbs_in_tb_dec(struct rte_bbdev_op_turbo_dec *turbo_dec)
+{
+	uint8_t c, c_neg, r = 0;
+	uint16_t kw, k, k_neg, k_pos, cbs_in_tb = 0;
+	int32_t length;
+
+	length = turbo_dec->input.length;
+	r = turbo_dec->tb_params.r;
+	c = turbo_dec->tb_params.c;
+	c_neg = turbo_dec->tb_params.c_neg;
+	k_neg = turbo_dec->tb_params.k_neg;
+	k_pos = turbo_dec->tb_params.k_pos;
+	while (length > 0 && r < c) {
+		k = (r < c_neg) ? k_neg : k_pos;
+		kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
+		length -= kw;
+		r++;
+		cbs_in_tb++;
+	}
+
+	return cbs_in_tb;
+}
+
+/* Calculates number of CBs in processed LDPC decoder TB based on 'r' and input
+ * length.
+ */
+static inline uint16_t
+get_num_cbs_in_tb_ldpc_dec(struct rte_bbdev_op_ldpc_dec *ldpc_dec)
+{
+	uint16_t r, cbs_in_tb = 0;
+	int32_t length = ldpc_dec->input.length;
+	r = ldpc_dec->tb_params.r;
+	while (length > 0 && r < ldpc_dec->tb_params.c) {
+		length -=  (r < ldpc_dec->tb_params.cab) ?
+				ldpc_dec->tb_params.ea :
+				ldpc_dec->tb_params.eb;
+		r++;
+		cbs_in_tb++;
+	}
+	return cbs_in_tb;
+}
+
+/* Check if we can mux encode operations with a common FCW */
+static inline bool
+check_mux(struct rte_bbdev_enc_op **ops, uint16_t num) {
+	uint16_t i;
+	if (num == 1)
+		return false;
+	for (i = 1; i < num; ++i) {
+		/* Only mux compatible code blocks */
+		if (memcmp((uint8_t *)(&ops[i]->ldpc_enc) + ENC_OFFSET,
+				(uint8_t *)(&ops[0]->ldpc_enc) + ENC_OFFSET,
+				CMP_ENC_SIZE) != 0)
+			return false;
+	}
+	return true;
+}
+
+/* Enqueue encode operations for ACC100 device in CB mode. */
+static inline uint16_t
+acc100_enqueue_ldpc_enc_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i = 0;
+	union acc100_dma_desc *desc;
+	int ret, desc_idx = 0;
+	int16_t enq, left = num;
+
+	while (left > 0) {
+		if (unlikely(avail - 1 < 0))
+			break;
+		avail--;
+		enq = RTE_MIN(left, MUX_5GDL_DESC);
+		if (check_mux(&ops[i], enq)) {
+			ret = enqueue_ldpc_enc_n_op_cb(q, &ops[i],
+					desc_idx, enq);
+			if (ret < 0)
+				break;
+			i += enq;
+		} else {
+			ret = enqueue_ldpc_enc_one_op_cb(q, ops[i], desc_idx);
+			if (ret < 0)
+				break;
+			i++;
+		}
+		desc_idx++;
+		left = num - i;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + desc_idx - 1)
+			& q->sw_ring_wrap_mask);
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, desc_idx, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
+}
+
+/* Enqueue encode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	if (unlikely(num == 0))
+		return 0;
+	return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
+}
+
+/* Check if two consecutive LDPC decode operations can share a common FCW */
+static inline bool
+cmp_ldpc_dec_op(struct rte_bbdev_dec_op **ops) {
+	/* Only mux compatible code blocks */
+	return memcmp((uint8_t *)(&ops[0]->ldpc_dec) + DEC_OFFSET,
+			(uint8_t *)(&ops[1]->ldpc_dec) +
+			DEC_OFFSET, CMP_DEC_SIZE) == 0;
+}
+
+
+/* Enqueue decode operations for ACC100 device in TB mode */
+static uint16_t
+acc100_enqueue_ldpc_dec_tb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i, enqueued_cbs = 0;
+	uint8_t cbs_in_tb;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		cbs_in_tb = get_num_cbs_in_tb_ldpc_dec(&ops[i]->ldpc_dec);
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - cbs_in_tb < 0))
+			break;
+		avail -= cbs_in_tb;
+
+		ret = enqueue_ldpc_dec_one_op_tb(q, ops[i],
+				enqueued_cbs, cbs_in_tb);
+		if (ret < 0)
+			break;
+		enqueued_cbs += ret;
+	}
+
+	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+	return i;
+}
+
+/* Enqueue decode operations for ACC100 device in CB mode */
+static uint16_t
+acc100_enqueue_ldpc_dec_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i;
+	union acc100_dma_desc *desc;
+	int ret;
+	bool same_op = false;
+	for (i = 0; i < num; ++i) {
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - 1 < 0))
+			break;
+		avail -= 1;
+
+		if (i > 0)
+			same_op = cmp_ldpc_dec_op(&ops[i-1]);
+		rte_bbdev_log(INFO, "Op %d %d %d %d %d %d %d %d %d %d %d %d",
+			i, ops[i]->ldpc_dec.op_flags, ops[i]->ldpc_dec.rv_index,
+			ops[i]->ldpc_dec.iter_max, ops[i]->ldpc_dec.iter_count,
+			ops[i]->ldpc_dec.basegraph, ops[i]->ldpc_dec.z_c,
+			ops[i]->ldpc_dec.n_cb, ops[i]->ldpc_dec.q_m,
+			ops[i]->ldpc_dec.n_filler, ops[i]->ldpc_dec.cb_params.e,
+			same_op);
+		ret = enqueue_ldpc_dec_one_op_cb(q, ops[i], i, same_op);
+		if (ret < 0)
+			break;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
+			& q->sw_ring_wrap_mask);
+
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, i, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+	return i;
+}
+
+/* Enqueue decode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t aq_avail = q->aq_depth +
+			(q->aq_dequeued - q->aq_enqueued) / 128;
+
+	if (unlikely((aq_avail == 0) || (num == 0)))
+		return 0;
+
+	if (ops[0]->ldpc_dec.code_block_mode == 0)
+		return acc100_enqueue_ldpc_dec_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_ldpc_dec_cb(q_data, ops, num);
+}
+
+
+/* Dequeue one encode operation from ACC100 device in CB mode */
+static inline int
+dequeue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
+		uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_enc_op *op;
+	int i;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	rsp.val = atom_desc.rsp.val;
+	rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+
+	op->status |= ((rsp.input_err)
+			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+	if (desc->req.last_desc_in_batch) {
+		(*aq_dequeued)++;
+		desc->req.last_desc_in_batch = 0;
+	}
+	desc->rsp.val = ACC100_DMA_DESC_TYPE;
+	desc->rsp.add_info_0 = 0; /* Reserved bits */
+	desc->rsp.add_info_1 = 0; /* Reserved bits */
+
+	/* Flag that the muxing causes the loss of opaque data */
+	op->opaque_data = (void *)-1;
+	for (i = 0 ; i < desc->req.numCBs; i++)
+		ref_op[i] = op;
+
+	/* One op was successfully dequeued, possibly containing several muxed CBs */
+	return desc->req.numCBs;
+}
+
+/* Dequeue one encode operation from ACC100 device in TB mode */
+static inline int
+dequeue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
+		uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, *last_desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_enc_op *op;
+	uint8_t i = 0;
+	uint16_t current_dequeued_cbs = 0, cbs_in_tb;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	/* Get number of CBs in dequeued TB */
+	cbs_in_tb = desc->req.cbs_in_tb;
+	/* Get last CB */
+	last_desc = q->ring_addr + ((q->sw_ring_tail
+			+ total_dequeued_cbs + cbs_in_tb - 1)
+			& q->sw_ring_wrap_mask);
+	/* Check if last CB in TB is ready to dequeue (and thus
+	 * the whole TB) - checking sdone bit. If not return.
+	 */
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
+			__ATOMIC_RELAXED);
+	if (!(atom_desc.rsp.val & ACC100_SDONE))
+		return -1;
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+
+	while (i < cbs_in_tb) {
+		desc = q->ring_addr + ((q->sw_ring_tail
+				+ total_dequeued_cbs)
+				& q->sw_ring_wrap_mask);
+		atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+				__ATOMIC_RELAXED);
+		rsp.val = atom_desc.rsp.val;
+		rte_bbdev_log_debug("Resp. desc %p: %x", desc,
+				rsp.val);
+
+		op->status |= ((rsp.input_err)
+				? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+		op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+		op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+		if (desc->req.last_desc_in_batch) {
+			(*aq_dequeued)++;
+			desc->req.last_desc_in_batch = 0;
+		}
+		desc->rsp.val = ACC100_DMA_DESC_TYPE;
+		desc->rsp.add_info_0 = 0;
+		desc->rsp.add_info_1 = 0;
+		total_dequeued_cbs++;
+		current_dequeued_cbs++;
+		i++;
+	}
+
+	*ref_op = op;
+
+	return current_dequeued_cbs;
+}
+
+/* Dequeue one decode operation from ACC100 device in CB mode */
+static inline int
+dequeue_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
+		struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_dec_op *op;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	rsp.val = atom_desc.rsp.val;
+	rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+	op->status |= ((rsp.input_err)
+			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+	if (op->status != 0)
+		q_data->queue_stats.dequeue_err_count++;
+
+	/* CRC invalid if error exists */
+	if (!op->status)
+		op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+	op->turbo_dec.iter_count = (uint8_t) rsp.iter_cnt / 2;
+	/* Check if this is the last desc in batch (Atomic Queue) */
+	if (desc->req.last_desc_in_batch) {
+		(*aq_dequeued)++;
+		desc->req.last_desc_in_batch = 0;
+	}
+	desc->rsp.val = ACC100_DMA_DESC_TYPE;
+	desc->rsp.add_info_0 = 0;
+	desc->rsp.add_info_1 = 0;
+	*ref_op = op;
+
+	/* One CB (op) was successfully dequeued */
+	return 1;
+}
+
+/* Dequeue one LDPC decode operation from ACC100 device in CB mode */
+static inline int
+dequeue_ldpc_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
+		struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_dec_op *op;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	rsp.val = atom_desc.rsp.val;
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+	op->status |= rsp.input_err << RTE_BBDEV_DATA_ERROR;
+	op->status |= rsp.dma_err << RTE_BBDEV_DRV_ERROR;
+	op->status |= rsp.fcw_err << RTE_BBDEV_DRV_ERROR;
+	if (op->status != 0)
+		q_data->queue_stats.dequeue_err_count++;
+
+	op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+	if (op->ldpc_dec.hard_output.length > 0 && !rsp.synd_ok)
+		op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
+	op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
+
+	/* Check if this is the last desc in batch (Atomic Queue) */
+	if (desc->req.last_desc_in_batch) {
+		(*aq_dequeued)++;
+		desc->req.last_desc_in_batch = 0;
+	}
+
+	desc->rsp.val = ACC100_DMA_DESC_TYPE;
+	desc->rsp.add_info_0 = 0;
+	desc->rsp.add_info_1 = 0;
+
+	*ref_op = op;
+
+	/* One CB (op) was successfully dequeued */
+	return 1;
+}
+
+/* Dequeue one decode operation from ACC100 device in TB mode. */
+static inline int
+dequeue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, *last_desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_dec_op *op;
+	uint8_t cbs_in_tb = 1, cb_idx = 0;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Get number of CBs in dequeued TB */
+	cbs_in_tb = desc->req.cbs_in_tb;
+	/* Get last CB */
+	last_desc = q->ring_addr + ((q->sw_ring_tail
+			+ dequeued_cbs + cbs_in_tb - 1)
+			& q->sw_ring_wrap_mask);
+	/* Check if last CB in TB is ready to dequeue (and thus
+	 * the whole TB) - checking sdone bit. If not return.
+	 */
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
+			__ATOMIC_RELAXED);
+	if (!(atom_desc.rsp.val & ACC100_SDONE))
+		return -1;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+
+	/* Read remaining CBs if exists */
+	while (cb_idx < cbs_in_tb) {
+		desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+				& q->sw_ring_wrap_mask);
+		atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+				__ATOMIC_RELAXED);
+		rsp.val = atom_desc.rsp.val;
+		rte_bbdev_log_debug("Resp. desc %p: %x", desc,
+				rsp.val);
+
+		op->status |= ((rsp.input_err)
+				? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+		op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+		op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+		/* CRC invalid if error exists */
+		if (!op->status)
+			op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+		op->turbo_dec.iter_count = RTE_MAX((uint8_t) rsp.iter_cnt,
+				op->turbo_dec.iter_count);
+
+		/* Check if this is the last desc in batch (Atomic Queue) */
+		if (desc->req.last_desc_in_batch) {
+			(*aq_dequeued)++;
+			desc->req.last_desc_in_batch = 0;
+		}
+		desc->rsp.val = ACC100_DMA_DESC_TYPE;
+		desc->rsp.add_info_0 = 0;
+		desc->rsp.add_info_1 = 0;
+		dequeued_cbs++;
+		cb_idx++;
+	}
+
+	*ref_op = op;
+
+	return cb_idx;
+}
+
+/* Dequeue LDPC encode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t dequeue_num, i, dequeued_cbs = 0, dequeued_descs = 0;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = (avail < num) ? avail : num;
+
+	for (i = 0; i < dequeue_num; i++) {
+		ret = dequeue_enc_one_op_cb(q, &ops[dequeued_cbs],
+				dequeued_descs, &aq_dequeued);
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+		dequeued_descs++;
+		if (dequeued_cbs >= num)
+			break;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_descs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += dequeued_cbs;
+
+	return dequeued_cbs;
+}
+
+/* Dequeue decode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint16_t dequeue_num;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t i;
+	uint16_t dequeued_cbs = 0;
+	struct rte_bbdev_dec_op *op;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = (avail < num) ? avail : num;
+
+	for (i = 0; i < dequeue_num; ++i) {
+		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask))->req.op_addr;
+		if (op->ldpc_dec.code_block_mode == 0)
+			ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+		else
+			ret = dequeue_ldpc_dec_one_op_cb(
+					q_data, q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_cbs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += i;
+
+	return i;
+}
+
 /* Initialization Function */
 static void
 acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
@@ -703,6 +2321,10 @@
 	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
 
 	dev->dev_ops = &acc100_bbdev_ops;
+	dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
+	dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
+	dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
+	dev->dequeue_ldpc_dec_ops = acc100_dequeue_ldpc_dec;
 
 	((struct acc100_device *) dev->data->dev_private)->pf_device =
 			!strcmp(drv->driver.name,
@@ -815,4 +2437,3 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
 RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
 RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
 RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
-
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 0e2b79c..78686c1 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -88,6 +88,8 @@
 #define TMPL_PRI_3      0x0f0e0d0c
 #define QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
 #define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
+#define ACC100_FDONE    0x80000000
+#define ACC100_SDONE    0x40000000
 
 #define ACC100_NUM_TMPL  32
 #define VF_OFFSET_QOS 16 /* offset in Memory Space specific to QoS Mon */
@@ -398,6 +400,7 @@ struct __rte_packed acc100_dma_req_desc {
 union acc100_dma_desc {
 	struct acc100_dma_req_desc req;
 	union acc100_dma_rsp_desc rsp;
+	uint64_t atom_hdr;
 };
 
 
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v5 06/11] baseband/acc100: add HARQ loopback support
  2020-09-23  2:12   ` [dpdk-dev] [PATCH v5 " Nicolas Chautru
                       ` (4 preceding siblings ...)
  2020-09-23  2:12     ` [dpdk-dev] [PATCH v5 05/11] baseband/acc100: add LDPC processing functions Nicolas Chautru
@ 2020-09-23  2:12     ` Nicolas Chautru
  2020-09-23  2:12     ` [dpdk-dev] [PATCH v5 07/11] baseband/acc100: add support for 4G processing Nicolas Chautru
                       ` (4 subsequent siblings)
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-23  2:12 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Additional support for HARQ memory loopback

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 158 +++++++++++++++++++++++++++++++
 1 file changed, 158 insertions(+)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index b223547..e484c0a 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -658,6 +658,7 @@
 				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
 				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
 #ifdef ACC100_EXT_MEM
+				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK |
 				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
 				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
 #endif
@@ -1480,12 +1481,169 @@
 	return 1;
 }
 
+static inline int
+harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs) {
+	struct acc100_fcw_ld *fcw;
+	union acc100_dma_desc *desc;
+	int next_triplet = 1;
+	struct rte_mbuf *hq_output_head, *hq_output;
+	uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
+	if (harq_in_length == 0) {
+		rte_bbdev_log(ERR, "Loopback of invalid null size\n");
+		return -EINVAL;
+	}
+
+	int h_comp = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
+			) ? 1 : 0;
+	if (h_comp == 1)
+		harq_in_length = harq_in_length * 8 / 6;
+	harq_in_length = RTE_ALIGN(harq_in_length, 64);
+	uint16_t harq_dma_length_in = (h_comp == 0) ?
+			harq_in_length :
+			harq_in_length * 6 / 8;
+	uint16_t harq_dma_length_out = harq_dma_length_in;
+	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+	uint16_t harq_index = (ddr_mem_in ?
+			op->ldpc_dec.harq_combined_input.offset :
+			op->ldpc_dec.harq_combined_output.offset)
+			/ ACC100_HARQ_OFFSET;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	fcw = &desc->req.fcw_ld;
+	/* Set the FCW from loopback into DDR */
+	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
+	fcw->FCWversion = ACC100_FCW_VER;
+	fcw->qm = 2;
+	fcw->Zc = 384;
+	if (harq_in_length < 16 * N_ZC_1)
+		fcw->Zc = 16;
+	fcw->ncb = fcw->Zc * N_ZC_1;
+	fcw->rm_e = 2;
+	fcw->hcin_en = 1;
+	fcw->hcout_en = 1;
+
+	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d\n",
+			ddr_mem_in, harq_index,
+			harq_layout[harq_index].offset, harq_in_length,
+			harq_dma_length_in);
+
+	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
+		fcw->hcin_size0 = harq_layout[harq_index].size0;
+		fcw->hcin_offset = harq_layout[harq_index].offset;
+		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
+		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
+		if (h_comp == 1)
+			harq_dma_length_in = harq_dma_length_in * 6 / 8;
+	} else {
+		fcw->hcin_size0 = harq_in_length;
+	}
+	harq_layout[harq_index].val = 0;
+	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
+			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
+	fcw->hcout_size0 = harq_in_length;
+	fcw->hcin_decomp_mode = h_comp;
+	fcw->hcout_comp_mode = h_comp;
+	fcw->gain_i = 1;
+	fcw->gain_h = 1;
+
+	/* Set the prefix of the descriptor. This could be done at polling time */
+	desc->req.word0 = ACC100_DMA_DESC_TYPE;
+	desc->req.word1 = 0; /**< Timestamp could be disabled */
+	desc->req.word2 = 0;
+	desc->req.word3 = 0;
+	desc->req.numCBs = 1;
+
+	/* Null LLR input for Decoder */
+	desc->req.data_ptrs[next_triplet].address =
+			q->lb_in_addr_phys;
+	desc->req.data_ptrs[next_triplet].blen = 2;
+	desc->req.data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
+	desc->req.data_ptrs[next_triplet].last = 0;
+	desc->req.data_ptrs[next_triplet].dma_ext = 0;
+	next_triplet++;
+
+	/* HARQ Combine input from either Memory interface */
+	if (!ddr_mem_in) {
+		next_triplet = acc100_dma_fill_blk_type_out(&desc->req,
+				op->ldpc_dec.harq_combined_input.data,
+				op->ldpc_dec.harq_combined_input.offset,
+				harq_dma_length_in,
+				next_triplet,
+				ACC100_DMA_BLKID_IN_HARQ);
+	} else {
+		desc->req.data_ptrs[next_triplet].address =
+				op->ldpc_dec.harq_combined_input.offset;
+		desc->req.data_ptrs[next_triplet].blen =
+				harq_dma_length_in;
+		desc->req.data_ptrs[next_triplet].blkid =
+				ACC100_DMA_BLKID_IN_HARQ;
+		desc->req.data_ptrs[next_triplet].dma_ext = 1;
+		next_triplet++;
+	}
+	desc->req.data_ptrs[next_triplet - 1].last = 1;
+	desc->req.m2dlen = next_triplet;
+
+	/* Dropped decoder hard output */
+	desc->req.data_ptrs[next_triplet].address =
+			q->lb_out_addr_phys;
+	desc->req.data_ptrs[next_triplet].blen = BYTES_IN_WORD;
+	desc->req.data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_OUT_HARD;
+	desc->req.data_ptrs[next_triplet].last = 0;
+	desc->req.data_ptrs[next_triplet].dma_ext = 0;
+	next_triplet++;
+
+	/* HARQ Combine output to either Memory interface */
+	if (check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE
+			)) {
+		desc->req.data_ptrs[next_triplet].address =
+				op->ldpc_dec.harq_combined_output.offset;
+		desc->req.data_ptrs[next_triplet].blen =
+				harq_dma_length_out;
+		desc->req.data_ptrs[next_triplet].blkid =
+				ACC100_DMA_BLKID_OUT_HARQ;
+		desc->req.data_ptrs[next_triplet].dma_ext = 1;
+		next_triplet++;
+	} else {
+		hq_output_head = op->ldpc_dec.harq_combined_output.data;
+		hq_output = op->ldpc_dec.harq_combined_output.data;
+		next_triplet = acc100_dma_fill_blk_type_out(
+				&desc->req,
+				op->ldpc_dec.harq_combined_output.data,
+				op->ldpc_dec.harq_combined_output.offset,
+				harq_dma_length_out,
+				next_triplet,
+				ACC100_DMA_BLKID_OUT_HARQ);
+		/* HARQ output */
+		mbuf_append(hq_output_head, hq_output, harq_dma_length_out);
+		op->ldpc_dec.harq_combined_output.length =
+				harq_dma_length_out;
+	}
+	desc->req.data_ptrs[next_triplet - 1].last = 1;
+	desc->req.d2mlen = next_triplet - desc->req.m2dlen;
+	desc->req.op_addr = op;
+
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
 /** Enqueue one decode operations for ACC100 device in CB mode */
 static inline int
 enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
 		uint16_t total_enqueued_cbs, bool same_op)
 {
 	int ret;
+	if (unlikely(check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK))) {
+		ret = harq_loopback(q, op, total_enqueued_cbs);
+		return ret;
+	}
 
 	union acc100_dma_desc *desc;
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
-- 
1.8.3.1


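[Editorial note] The HARQ loopback path above sizes its DMA transfers around the optional 6-bit LLR compression: the stored length is expanded by 8/6, aligned to 64 bytes, and the DMA length is the compressed equivalent of that aligned size. A minimal standalone sketch of that arithmetic (hypothetical helper names, plain C stand-ins for the RTE_ALIGN macro, not driver code):

```c
#include <assert.h>
#include <stdint.h>

/* Expanded, 64-byte-aligned HARQ input length. With 6-bit compression
 * enabled (h_comp), every 6 stored bits correspond to 8 LLR bits. */
static uint16_t harq_align_in(uint16_t len, int h_comp)
{
	if (h_comp)
		len = len * 8 / 6;
	return (len + 63) / 64 * 64; /* RTE_ALIGN(len, 64) equivalent */
}

/* DMA length actually moved over the bus for an aligned length. */
static uint16_t harq_dma_len(uint16_t aligned_len, int h_comp)
{
	return h_comp ? aligned_len * 6 / 8 : aligned_len;
}
```

For example, a 96-byte compressed buffer expands to 128 LLR bytes (already 64-byte aligned), and the matching DMA length comes back out as 96 bytes.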
^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v5 07/11] baseband/acc100: add support for 4G processing
  2020-09-23  2:12   ` [dpdk-dev] [PATCH v5 " Nicolas Chautru
                       ` (5 preceding siblings ...)
  2020-09-23  2:12     ` [dpdk-dev] [PATCH v5 06/11] baseband/acc100: add HARQ loopback support Nicolas Chautru
@ 2020-09-23  2:12     ` Nicolas Chautru
  2020-09-23  2:12     ` [dpdk-dev] [PATCH v5 08/11] baseband/acc100: add interrupt support to PMD Nicolas Chautru
                       ` (3 subsequent siblings)
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-23  2:12 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Adding capability for 4G encode and decode processing

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 1010 ++++++++++++++++++++++++++++--
 1 file changed, 943 insertions(+), 67 deletions(-)
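[Editorial note] Two size computations recur in the descriptor-fill routines below: the 3GPP TS 36.212 circular buffer size used for turbo decode input, and the encoder output size when rate matching is bypassed. A small self-contained sketch (hypothetical helper names, not part of the driver):

```c
#include <assert.h>
#include <stdint.h>

/* Round up to a multiple of a power-of-two, as RTE_ALIGN_CEIL does. */
static uint32_t align_ceil(uint32_t x, uint32_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/* Circular buffer size per 3GPP TS 36.212 section 5.1.4.2:
 * Kw = 3 * Kpi, Kpi = nCol * nRow, nCol = 32, D = K + 4 <= nCol * nRow,
 * mirroring the kw computation in acc100_dma_desc_td_fill(). */
static uint32_t turbo_kw(uint32_t k)
{
	return align_ceil(k + 4, 32) * 3;
}

/* Encoder output size when rate matching is bypassed: rate-1/3 turbo
 * output plus 12 tail bits, as used in acc100_fcw_te_fill(). */
static uint32_t turbo_e_bypass(uint32_t k)
{
	return 3 * k + 12;
}
```

For the maximum LTE code block K = 6144, this gives Kw = 18528 LLR bytes and a bypass output of 18444 bits.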

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index e484c0a..7d4c3df 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -339,7 +339,6 @@
 	free_base_addresses(base_addrs, i);
 }
 
-
 /* Allocate 64MB memory used for all software rings */
 static int
 acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
@@ -637,6 +636,41 @@
 
 	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
 		{
+			.type = RTE_BBDEV_OP_TURBO_DEC,
+			.cap.turbo_dec = {
+				.capability_flags =
+					RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE |
+					RTE_BBDEV_TURBO_CRC_TYPE_24B |
+					RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
+					RTE_BBDEV_TURBO_EARLY_TERMINATION |
+					RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
+					RTE_BBDEV_TURBO_MAP_DEC |
+					RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP |
+					RTE_BBDEV_TURBO_DEC_SCATTER_GATHER,
+				.max_llr_modulus = INT8_MAX,
+				.num_buffers_src =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+				.num_buffers_hard_out =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+				.num_buffers_soft_out =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+			}
+		},
+		{
+			.type = RTE_BBDEV_OP_TURBO_ENC,
+			.cap.turbo_enc = {
+				.capability_flags =
+					RTE_BBDEV_TURBO_CRC_24B_ATTACH |
+					RTE_BBDEV_TURBO_RV_INDEX_BYPASS |
+					RTE_BBDEV_TURBO_RATE_MATCH |
+					RTE_BBDEV_TURBO_ENC_SCATTER_GATHER,
+				.num_buffers_src =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+				.num_buffers_dst =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+			}
+		},
+		{
 			.type   = RTE_BBDEV_OP_LDPC_ENC,
 			.cap.ldpc_enc = {
 				.capability_flags =
@@ -719,7 +753,6 @@
 #endif
 }
 
-
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.setup_queues = acc100_setup_queues,
 	.close = acc100_dev_close,
@@ -763,6 +796,58 @@
 	return tail;
 }
 
+/* Fill in a frame control word for turbo encoding. */
+static inline void
+acc100_fcw_te_fill(const struct rte_bbdev_enc_op *op, struct acc100_fcw_te *fcw)
+{
+	fcw->code_block_mode = op->turbo_enc.code_block_mode;
+	if (fcw->code_block_mode == 0) { /* For TB mode */
+		fcw->k_neg = op->turbo_enc.tb_params.k_neg;
+		fcw->k_pos = op->turbo_enc.tb_params.k_pos;
+		fcw->c_neg = op->turbo_enc.tb_params.c_neg;
+		fcw->c = op->turbo_enc.tb_params.c;
+		fcw->ncb_neg = op->turbo_enc.tb_params.ncb_neg;
+		fcw->ncb_pos = op->turbo_enc.tb_params.ncb_pos;
+
+		if (check_bit(op->turbo_enc.op_flags,
+				RTE_BBDEV_TURBO_RATE_MATCH)) {
+			fcw->bypass_rm = 0;
+			fcw->cab = op->turbo_enc.tb_params.cab;
+			fcw->ea = op->turbo_enc.tb_params.ea;
+			fcw->eb = op->turbo_enc.tb_params.eb;
+		} else {
+			/* E is set to the encoding output size when RM is
+			 * bypassed.
+			 */
+			fcw->bypass_rm = 1;
+			fcw->cab = fcw->c_neg;
+			fcw->ea = 3 * fcw->k_neg + 12;
+			fcw->eb = 3 * fcw->k_pos + 12;
+		}
+	} else { /* For CB mode */
+		fcw->k_pos = op->turbo_enc.cb_params.k;
+		fcw->ncb_pos = op->turbo_enc.cb_params.ncb;
+
+		if (check_bit(op->turbo_enc.op_flags,
+				RTE_BBDEV_TURBO_RATE_MATCH)) {
+			fcw->bypass_rm = 0;
+			fcw->eb = op->turbo_enc.cb_params.e;
+		} else {
+			/* E is set to the encoding output size when RM is
+			 * bypassed.
+			 */
+			fcw->bypass_rm = 1;
+			fcw->eb = 3 * fcw->k_pos + 12;
+		}
+	}
+
+	fcw->bypass_rv_idx1 = check_bit(op->turbo_enc.op_flags,
+			RTE_BBDEV_TURBO_RV_INDEX_BYPASS);
+	fcw->code_block_crc = check_bit(op->turbo_enc.op_flags,
+			RTE_BBDEV_TURBO_CRC_24B_ATTACH);
+	fcw->rv_idx1 = op->turbo_enc.rv_index;
+}
+
 /* Compute value of k0.
  * Based on 3GPP 38.212 Table 5.4.2.1-2
  * Starting position of different redundancy versions, k0
@@ -813,6 +898,25 @@
 	fcw->mcb_count = num_cb;
 }
 
+/* Fill in a frame control word for turbo decoding. */
+static inline void
+acc100_fcw_td_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_td *fcw)
+{
+	/* Note: Early termination is always enabled for 4GUL */
+	fcw->fcw_ver = 1;
+	if (op->turbo_dec.code_block_mode == 0)
+		fcw->k_pos = op->turbo_dec.tb_params.k_pos;
+	else
+		fcw->k_pos = op->turbo_dec.cb_params.k;
+	fcw->turbo_crc_type = check_bit(op->turbo_dec.op_flags,
+			RTE_BBDEV_TURBO_CRC_TYPE_24B);
+	fcw->bypass_sb_deint = 0;
+	fcw->raw_decoder_input_on = 0;
+	fcw->max_iter = op->turbo_dec.iter_max;
+	fcw->half_iter_on = !check_bit(op->turbo_dec.op_flags,
+			RTE_BBDEV_TURBO_HALF_ITERATION_EVEN);
+}
+
 /* Fill in a frame control word for LDPC decoding. */
 static inline void
 acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
@@ -1042,6 +1146,87 @@
 }
 
 static inline int
+acc100_dma_desc_te_fill(struct rte_bbdev_enc_op *op,
+		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
+		struct rte_mbuf *output, uint32_t *in_offset,
+		uint32_t *out_offset, uint32_t *out_length,
+		uint32_t *mbuf_total_left, uint32_t *seg_total_left, uint8_t r)
+{
+	int next_triplet = 1; /* FCW already done */
+	uint32_t e, ea, eb, length;
+	uint16_t k, k_neg, k_pos;
+	uint8_t cab, c_neg;
+
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /**< Timestamp could be disabled */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+
+	if (op->turbo_enc.code_block_mode == 0) {
+		ea = op->turbo_enc.tb_params.ea;
+		eb = op->turbo_enc.tb_params.eb;
+		cab = op->turbo_enc.tb_params.cab;
+		k_neg = op->turbo_enc.tb_params.k_neg;
+		k_pos = op->turbo_enc.tb_params.k_pos;
+		c_neg = op->turbo_enc.tb_params.c_neg;
+		e = (r < cab) ? ea : eb;
+		k = (r < c_neg) ? k_neg : k_pos;
+	} else {
+		e = op->turbo_enc.cb_params.e;
+		k = op->turbo_enc.cb_params.k;
+	}
+
+	if (check_bit(op->turbo_enc.op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH))
+		length = (k - 24) >> 3;
+	else
+		length = k >> 3;
+
+	if (unlikely((*mbuf_total_left == 0) || (*mbuf_total_left < length))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, length);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
+			length, seg_total_left, next_triplet);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= length;
+
+	/* Set output length */
+	if (check_bit(op->turbo_enc.op_flags, RTE_BBDEV_TURBO_RATE_MATCH))
+		/* Integer round up division by 8 */
+		*out_length = (e + 7) >> 3;
+	else
+		*out_length = (k >> 3) * 3 + 2;
+
+	next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
+			*out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	op->turbo_enc.output.length += *out_length;
+	*out_offset += *out_length;
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
+
+static inline int
 acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
 		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
 		struct rte_mbuf *output, uint32_t *in_offset,
@@ -1110,6 +1295,117 @@
 }
 
 static inline int
+acc100_dma_desc_td_fill(struct rte_bbdev_dec_op *op,
+		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
+		struct rte_mbuf *h_output, struct rte_mbuf *s_output,
+		uint32_t *in_offset, uint32_t *h_out_offset,
+		uint32_t *s_out_offset, uint32_t *h_out_length,
+		uint32_t *s_out_length, uint32_t *mbuf_total_left,
+		uint32_t *seg_total_left, uint8_t r)
+{
+	int next_triplet = 1; /* FCW already done */
+	uint16_t k;
+	uint16_t crc24_overlap = 0;
+	uint32_t e, kw;
+
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /**< Timestamp could be disabled */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+
+	if (op->turbo_dec.code_block_mode == 0) {
+		k = (r < op->turbo_dec.tb_params.c_neg)
+			? op->turbo_dec.tb_params.k_neg
+			: op->turbo_dec.tb_params.k_pos;
+		e = (r < op->turbo_dec.tb_params.cab)
+			? op->turbo_dec.tb_params.ea
+			: op->turbo_dec.tb_params.eb;
+	} else {
+		k = op->turbo_dec.cb_params.k;
+		e = op->turbo_dec.cb_params.e;
+	}
+
+	if ((op->turbo_dec.code_block_mode == 0)
+		&& !check_bit(op->turbo_dec.op_flags,
+		RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP))
+		crc24_overlap = 24;
+
+	/* Calculates circular buffer size.
+	 * According to 3GPP 36.212 section 5.1.4.2
+	 *   Kw = 3 * Kpi,
+	 * where:
+	 *   Kpi = nCol * nRow
+	 * where nCol is 32 and nRow can be calculated from:
+	 *   D <= nCol * nRow
+	 * where D is the size of each output from turbo encoder block (k + 4).
+	 */
+	kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
+
+	if (unlikely((*mbuf_total_left == 0) || (*mbuf_total_left < kw))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, kw);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset, kw,
+			seg_total_left, next_triplet);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= kw;
+
+	next_triplet = acc100_dma_fill_blk_type_out(
+			desc, h_output, *h_out_offset,
+			k >> 3, next_triplet, ACC100_DMA_BLKID_OUT_HARD);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+
+	*h_out_length = ((k - crc24_overlap) >> 3);
+	op->turbo_dec.hard_output.length += *h_out_length;
+	*h_out_offset += *h_out_length;
+
+	/* Soft output */
+	if (check_bit(op->turbo_dec.op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT)) {
+		if (check_bit(op->turbo_dec.op_flags,
+				RTE_BBDEV_TURBO_EQUALIZER))
+			*s_out_length = e;
+		else
+			*s_out_length = (k * 3) + 12;
+
+		next_triplet = acc100_dma_fill_blk_type_out(desc, s_output,
+				*s_out_offset, *s_out_length, next_triplet,
+				ACC100_DMA_BLKID_OUT_SOFT);
+		if (unlikely(next_triplet < 0)) {
+			rte_bbdev_log(ERR,
+					"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+					op);
+			return -1;
+		}
+
+		op->turbo_dec.soft_output.length += *s_out_length;
+		*s_out_offset += *s_out_length;
+	}
+
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
+
+static inline int
 acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
 		struct acc100_dma_req_desc *desc,
 		struct rte_mbuf **input, struct rte_mbuf *h_output,
@@ -1374,6 +1670,57 @@
 
 /* Enqueue one encode operations for ACC100 device in CB mode */
 static inline int
+enqueue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+		uint16_t total_enqueued_cbs)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+		seg_total_left;
+	struct rte_mbuf *input, *output_head, *output;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_te_fill(op, &desc->req.fcw_te);
+
+	input = op->turbo_enc.input.data;
+	output_head = output = op->turbo_enc.output.data;
+	in_offset = op->turbo_enc.input.offset;
+	out_offset = op->turbo_enc.output.offset;
+	out_length = 0;
+	mbuf_total_left = op->turbo_enc.input.length;
+	seg_total_left = rte_pktmbuf_data_len(op->turbo_enc.input.data)
+			- in_offset;
+
+	ret = acc100_dma_desc_te_fill(op, &desc->req, &input, output,
+			&in_offset, &out_offset, &out_length, &mbuf_total_left,
+			&seg_total_left, 0);
+
+	if (unlikely(ret < 0))
+		return ret;
+
+	mbuf_append(output_head, output, out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_te,
+			sizeof(desc->req.fcw_te) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+
+	/* Check if any data left after processing one CB */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left after processing one CB: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+/* Enqueue multiple LDPC encode operations for ACC100 device in CB mode */
+static inline int
 enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
 		uint16_t total_enqueued_cbs, int16_t num)
 {
@@ -1481,78 +1828,235 @@
 	return 1;
 }
 
-static inline int
-harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
-		uint16_t total_enqueued_cbs) {
-	struct acc100_fcw_ld *fcw;
-	union acc100_dma_desc *desc;
-	int next_triplet = 1;
-	struct rte_mbuf *hq_output_head, *hq_output;
-	uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
-	if (harq_in_length == 0) {
-		rte_bbdev_log(ERR, "Loopback of invalid null size\n");
-		return -EINVAL;
-	}
 
-	int h_comp = check_bit(op->ldpc_dec.op_flags,
-			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
-			) ? 1 : 0;
-	if (h_comp == 1)
-		harq_in_length = harq_in_length * 8 / 6;
-	harq_in_length = RTE_ALIGN(harq_in_length, 64);
-	uint16_t harq_dma_length_in = (h_comp == 0) ?
-			harq_in_length :
-			harq_in_length * 6 / 8;
-	uint16_t harq_dma_length_out = harq_dma_length_in;
-	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
-			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
-	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
-	uint16_t harq_index = (ddr_mem_in ?
-			op->ldpc_dec.harq_combined_input.offset :
-			op->ldpc_dec.harq_combined_output.offset)
-			/ ACC100_HARQ_OFFSET;
+/* Enqueue one encode operation for ACC100 device in TB mode. */
+static inline int
+enqueue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint8_t r, c;
+	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+		seg_total_left;
+	struct rte_mbuf *input, *output_head, *output;
+	uint16_t current_enqueued_cbs = 0;
 
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
-	fcw = &desc->req.fcw_ld;
-	/* Set the FCW from loopback into DDR */
-	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
-	fcw->FCWversion = ACC100_FCW_VER;
-	fcw->qm = 2;
-	fcw->Zc = 384;
-	if (harq_in_length < 16 * N_ZC_1)
-		fcw->Zc = 16;
-	fcw->ncb = fcw->Zc * N_ZC_1;
-	fcw->rm_e = 2;
-	fcw->hcin_en = 1;
-	fcw->hcout_en = 1;
+	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+	acc100_fcw_te_fill(op, &desc->req.fcw_te);
 
-	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d\n",
-			ddr_mem_in, harq_index,
-			harq_layout[harq_index].offset, harq_in_length,
-			harq_dma_length_in);
+	input = op->turbo_enc.input.data;
+	output_head = output = op->turbo_enc.output.data;
+	in_offset = op->turbo_enc.input.offset;
+	out_offset = op->turbo_enc.output.offset;
+	out_length = 0;
+	mbuf_total_left = op->turbo_enc.input.length;
 
-	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
-		fcw->hcin_size0 = harq_layout[harq_index].size0;
-		fcw->hcin_offset = harq_layout[harq_index].offset;
-		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
-		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
-		if (h_comp == 1)
-			harq_dma_length_in = harq_dma_length_in * 6 / 8;
-	} else {
-		fcw->hcin_size0 = harq_in_length;
-	}
-	harq_layout[harq_index].val = 0;
-	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
-			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
-	fcw->hcout_size0 = harq_in_length;
-	fcw->hcin_decomp_mode = h_comp;
-	fcw->hcout_comp_mode = h_comp;
-	fcw->gain_i = 1;
-	fcw->gain_h = 1;
+	c = op->turbo_enc.tb_params.c;
+	r = op->turbo_enc.tb_params.r;
 
-	/* Set the prefix of descriptor. This could be done at polling */
+	while (mbuf_total_left > 0 && r < c) {
+		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+		/* Set up DMA descriptor */
+		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+				& q->sw_ring_wrap_mask);
+		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+		desc->req.data_ptrs[0].blen = ACC100_FCW_TE_BLEN;
+
+		ret = acc100_dma_desc_te_fill(op, &desc->req, &input, output,
+				&in_offset, &out_offset, &out_length,
+				&mbuf_total_left, &seg_total_left, r);
+		if (unlikely(ret < 0))
+			return ret;
+		mbuf_append(output_head, output, out_length);
+
+		/* Set total number of CBs in TB */
+		desc->req.cbs_in_tb = cbs_in_tb;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "FCW", &desc->req.fcw_te,
+				sizeof(desc->req.fcw_te) - 8);
+		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+		if (seg_total_left == 0) {
+			/* Go to the next mbuf */
+			input = input->next;
+			in_offset = 0;
+			output = output->next;
+			out_offset = 0;
+		}
+
+		total_enqueued_cbs++;
+		current_enqueued_cbs++;
+		r++;
+	}
+
+	if (unlikely(desc == NULL))
+		return current_enqueued_cbs;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Check if any CBs left for processing */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left for processing: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+
+	/* Set SDone on last CB descriptor for TB mode. */
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	return current_enqueued_cbs;
+}
+
+/** Enqueue one decode operation for ACC100 device in CB mode */
+static inline int
+enqueue_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint32_t in_offset, h_out_offset, s_out_offset, s_out_length,
+		h_out_length, mbuf_total_left, seg_total_left;
+	struct rte_mbuf *input, *h_output_head, *h_output,
+		*s_output_head, *s_output;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_td_fill(op, &desc->req.fcw_td);
+
+	input = op->turbo_dec.input.data;
+	h_output_head = h_output = op->turbo_dec.hard_output.data;
+	s_output_head = s_output = op->turbo_dec.soft_output.data;
+	in_offset = op->turbo_dec.input.offset;
+	h_out_offset = op->turbo_dec.hard_output.offset;
+	s_out_offset = op->turbo_dec.soft_output.offset;
+	h_out_length = s_out_length = 0;
+	mbuf_total_left = op->turbo_dec.input.length;
+	seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(input == NULL)) {
+		rte_bbdev_log(ERR, "Invalid mbuf pointer");
+		return -EFAULT;
+	}
+#endif
+
+	/* Set up DMA descriptor */
+	desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+
+	ret = acc100_dma_desc_td_fill(op, &desc->req, &input, h_output,
+			s_output, &in_offset, &h_out_offset, &s_out_offset,
+			&h_out_length, &s_out_length, &mbuf_total_left,
+			&seg_total_left, 0);
+
+	if (unlikely(ret < 0))
+		return ret;
+
+	/* Hard output */
+	mbuf_append(h_output_head, h_output, h_out_length);
+
+	/* Soft output */
+	if (check_bit(op->turbo_dec.op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT))
+		mbuf_append(s_output_head, s_output, s_out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_td,
+			sizeof(desc->req.fcw_td) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+
+	/* Check if any CBs left for processing */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left after processing one CB: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+static inline int
+harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs) {
+	struct acc100_fcw_ld *fcw;
+	union acc100_dma_desc *desc;
+	int next_triplet = 1;
+	struct rte_mbuf *hq_output_head, *hq_output;
+	uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
+	if (harq_in_length == 0) {
+		rte_bbdev_log(ERR, "Loopback of invalid null size\n");
+		return -EINVAL;
+	}
+
+	int h_comp = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
+			) ? 1 : 0;
+	if (h_comp == 1)
+		harq_in_length = harq_in_length * 8 / 6;
+	harq_in_length = RTE_ALIGN(harq_in_length, 64);
+	uint16_t harq_dma_length_in = (h_comp == 0) ?
+			harq_in_length :
+			harq_in_length * 6 / 8;
+	uint16_t harq_dma_length_out = harq_dma_length_in;
+	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+	uint16_t harq_index = (ddr_mem_in ?
+			op->ldpc_dec.harq_combined_input.offset :
+			op->ldpc_dec.harq_combined_output.offset)
+			/ ACC100_HARQ_OFFSET;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	fcw = &desc->req.fcw_ld;
+	/* Set the FCW from loopback into DDR */
+	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
+	fcw->FCWversion = ACC100_FCW_VER;
+	fcw->qm = 2;
+	fcw->Zc = 384;
+	if (harq_in_length < 16 * N_ZC_1)
+		fcw->Zc = 16;
+	fcw->ncb = fcw->Zc * N_ZC_1;
+	fcw->rm_e = 2;
+	fcw->hcin_en = 1;
+	fcw->hcout_en = 1;
+
+	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d\n",
+			ddr_mem_in, harq_index,
+			harq_layout[harq_index].offset, harq_in_length,
+			harq_dma_length_in);
+
+	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
+		fcw->hcin_size0 = harq_layout[harq_index].size0;
+		fcw->hcin_offset = harq_layout[harq_index].offset;
+		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
+		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
+		if (h_comp == 1)
+			harq_dma_length_in = harq_dma_length_in * 6 / 8;
+	} else {
+		fcw->hcin_size0 = harq_in_length;
+	}
+	harq_layout[harq_index].val = 0;
+	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
+			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
+	fcw->hcout_size0 = harq_in_length;
+	fcw->hcin_decomp_mode = h_comp;
+	fcw->hcout_comp_mode = h_comp;
+	fcw->gain_i = 1;
+	fcw->gain_h = 1;
+
+	/* Set the prefix of descriptor. This could be done at polling */
 	desc->req.word0 = ACC100_DMA_DESC_TYPE;
 	desc->req.word1 = 0; /**< Timestamp could be disabled */
 	desc->req.word2 = 0;
@@ -1816,6 +2320,107 @@
 	return current_enqueued_cbs;
 }
 
+/* Enqueue one decode operation for ACC100 device in TB mode */
+static inline int
+enqueue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint8_t r, c;
+	uint32_t in_offset, h_out_offset, s_out_offset, s_out_length,
+		h_out_length, mbuf_total_left, seg_total_left;
+	struct rte_mbuf *input, *h_output_head, *h_output,
+		*s_output_head, *s_output;
+	uint16_t current_enqueued_cbs = 0;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+	acc100_fcw_td_fill(op, &desc->req.fcw_td);
+
+	input = op->turbo_dec.input.data;
+	h_output_head = h_output = op->turbo_dec.hard_output.data;
+	s_output_head = s_output = op->turbo_dec.soft_output.data;
+	in_offset = op->turbo_dec.input.offset;
+	h_out_offset = op->turbo_dec.hard_output.offset;
+	s_out_offset = op->turbo_dec.soft_output.offset;
+	h_out_length = s_out_length = 0;
+	mbuf_total_left = op->turbo_dec.input.length;
+	c = op->turbo_dec.tb_params.c;
+	r = op->turbo_dec.tb_params.r;
+
+	while (mbuf_total_left > 0 && r < c) {
+
+		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+
+		/* Set up DMA descriptor */
+		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+				& q->sw_ring_wrap_mask);
+		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+		desc->req.data_ptrs[0].blen = ACC100_FCW_TD_BLEN;
+		ret = acc100_dma_desc_td_fill(op, &desc->req, &input,
+				h_output, s_output, &in_offset, &h_out_offset,
+				&s_out_offset, &h_out_length, &s_out_length,
+				&mbuf_total_left, &seg_total_left, r);
+
+		if (unlikely(ret < 0))
+			return ret;
+
+		/* Hard output */
+		mbuf_append(h_output_head, h_output, h_out_length);
+
+		/* Soft output */
+		if (check_bit(op->turbo_dec.op_flags,
+				RTE_BBDEV_TURBO_SOFT_OUTPUT))
+			mbuf_append(s_output_head, s_output, s_out_length);
+
+		/* Set total number of CBs in TB */
+		desc->req.cbs_in_tb = cbs_in_tb;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "FCW", &desc->req.fcw_td,
+				sizeof(desc->req.fcw_td) - 8);
+		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+		if (seg_total_left == 0) {
+			/* Go to the next mbuf */
+			input = input->next;
+			in_offset = 0;
+			h_output = h_output->next;
+			h_out_offset = 0;
+
+			if (check_bit(op->turbo_dec.op_flags,
+					RTE_BBDEV_TURBO_SOFT_OUTPUT)) {
+				s_output = s_output->next;
+				s_out_offset = 0;
+			}
+		}
+
+		total_enqueued_cbs++;
+		current_enqueued_cbs++;
+		r++;
+	}
+
+	if (unlikely(desc == NULL))
+		return current_enqueued_cbs;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Check if any CBs left for processing */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left for processing: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+	/* Set SDone on last CB descriptor for TB mode */
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	return current_enqueued_cbs;
+}
 
 /* Calculates number of CBs in processed encoder TB based on 'r' and input
  * length.
@@ -1893,6 +2498,45 @@
 	return cbs_in_tb;
 }
 
+/* Enqueue encode operations for ACC100 device in CB mode. */
+static uint16_t
+acc100_enqueue_enc_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i;
+	union acc100_dma_desc *desc;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - 1 < 0))
+			break;
+		avail -= 1;
+
+		ret = enqueue_enc_one_op_cb(q, ops[i], i);
+		if (ret < 0)
+			break;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
+			& q->sw_ring_wrap_mask);
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, i, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+	return i;
+}
+
 /* Check we can mux encode operations with common FCW */
 static inline bool
 check_mux(struct rte_bbdev_enc_op **ops, uint16_t num) {
@@ -1960,6 +2604,52 @@
 	return i;
 }
 
+/* Enqueue encode operations for ACC100 device in TB mode. */
+static uint16_t
+acc100_enqueue_enc_tb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i, enqueued_cbs = 0;
+	uint8_t cbs_in_tb;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		cbs_in_tb = get_num_cbs_in_tb_enc(&ops[i]->turbo_enc);
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - cbs_in_tb < 0))
+			break;
+		avail -= cbs_in_tb;
+
+		ret = enqueue_enc_one_op_tb(q, ops[i], enqueued_cbs, cbs_in_tb);
+		if (ret < 0)
+			break;
+		enqueued_cbs += ret;
+	}
+
+	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
+}
+
+/* Enqueue encode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	if (unlikely(num == 0))
+		return 0;
+	if (ops[0]->turbo_enc.code_block_mode == 0)
+		return acc100_enqueue_enc_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_enc_cb(q_data, ops, num);
+}
+
 /* Enqueue encode operations for ACC100 device. */
 static uint16_t
 acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
@@ -1967,7 +2657,51 @@
 {
 	if (unlikely(num == 0))
 		return 0;
-	return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
+	if (ops[0]->ldpc_enc.code_block_mode == 0)
+		return acc100_enqueue_enc_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
+}
+
+
+/* Enqueue decode operations for ACC100 device in CB mode */
+static uint16_t
+acc100_enqueue_dec_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i;
+	union acc100_dma_desc *desc;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - 1 < 0))
+			break;
+		avail -= 1;
+
+		ret = enqueue_dec_one_op_cb(q, ops[i], i);
+		if (ret < 0)
+			break;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
+			& q->sw_ring_wrap_mask);
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, i, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
 }
 
 /* Check we can mux encode operations with common FCW */
@@ -2065,6 +2799,53 @@
 	return i;
 }
 
+
+/* Enqueue decode operations for ACC100 device in TB mode */
+static uint16_t
+acc100_enqueue_dec_tb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i, enqueued_cbs = 0;
+	uint8_t cbs_in_tb;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		cbs_in_tb = get_num_cbs_in_tb_dec(&ops[i]->turbo_dec);
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - cbs_in_tb < 0))
+			break;
+		avail -= cbs_in_tb;
+
+		ret = enqueue_dec_one_op_tb(q, ops[i], enqueued_cbs, cbs_in_tb);
+		if (ret < 0)
+			break;
+		enqueued_cbs += ret;
+	}
+
+	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
+}
+
+/* Enqueue decode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	if (unlikely(num == 0))
+		return 0;
+	if (ops[0]->turbo_dec.code_block_mode == 0)
+		return acc100_enqueue_dec_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_dec_cb(q_data, ops, num);
+}
+
 /* Enqueue LDPC decode operations for ACC100 device. */
 static uint16_t
 acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
@@ -2388,6 +3169,51 @@
 	return cb_idx;
 }
 
+/* Dequeue encode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint16_t dequeue_num;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t i;
+	uint16_t dequeued_cbs = 0;
+	struct rte_bbdev_enc_op *op;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = (avail < num) ? avail : num;
+
+	for (i = 0; i < dequeue_num; ++i) {
+		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask))->req.op_addr;
+		if (op->turbo_enc.code_block_mode == 0)
+			ret = dequeue_enc_one_op_tb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+		else
+			ret = dequeue_enc_one_op_cb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_cbs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += i;
+
+	return i;
+}
+
 /* Dequeue LDPC encode operations from ACC100 device. */
 static uint16_t
 acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
@@ -2426,6 +3252,52 @@
 	return dequeued_cbs;
 }
 
+
+/* Dequeue decode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint16_t dequeue_num;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t i;
+	uint16_t dequeued_cbs = 0;
+	struct rte_bbdev_dec_op *op;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = (avail < num) ? avail : num;
+
+	for (i = 0; i < dequeue_num; ++i) {
+		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask))->req.op_addr;
+		if (op->turbo_dec.code_block_mode == 0)
+			ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+		else
+			ret = dequeue_dec_one_op_cb(q_data, q, &ops[i],
+					dequeued_cbs, &aq_dequeued);
+
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_cbs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += i;
+
+	return i;
+}
+
 /* Dequeue LDPC decode operations from ACC100 device. */
 static uint16_t
 acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
@@ -2479,6 +3351,10 @@
 	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
 
 	dev->dev_ops = &acc100_bbdev_ops;
+	dev->enqueue_enc_ops = acc100_enqueue_enc;
+	dev->enqueue_dec_ops = acc100_enqueue_dec;
+	dev->dequeue_enc_ops = acc100_dequeue_enc;
+	dev->dequeue_dec_ops = acc100_dequeue_dec;
 	dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
 	dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
 	dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v5 08/11] baseband/acc100: add interrupt support to PMD
  2020-09-23  2:12   ` [dpdk-dev] [PATCH v5 " Nicolas Chautru
                       ` (6 preceding siblings ...)
  2020-09-23  2:12     ` [dpdk-dev] [PATCH v5 07/11] baseband/acc100: add support for 4G processing Nicolas Chautru
@ 2020-09-23  2:12     ` Nicolas Chautru
  2020-09-23  2:12     ` [dpdk-dev] [PATCH v5 09/11] baseband/acc100: add debug function to validate input Nicolas Chautru
                       ` (2 subsequent siblings)
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-23  2:12 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add capability and functions to support MSI
interrupts, callbacks and the info ring.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 288 ++++++++++++++++++++++++++++++-
 drivers/baseband/acc100/rte_acc100_pmd.h |  15 ++
 2 files changed, 300 insertions(+), 3 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 7d4c3df..b6d9e7c 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -339,6 +339,213 @@
 	free_base_addresses(base_addrs, i);
 }
 
+/*
+ * Find queue_id of a device queue based on details from the Info Ring.
+ * If a queue isn't found UINT16_MAX is returned.
+ */
+static inline uint16_t
+get_queue_id_from_ring_info(struct rte_bbdev_data *data,
+		const union acc100_info_ring_data ring_data)
+{
+	uint16_t queue_id;
+
+	for (queue_id = 0; queue_id < data->num_queues; ++queue_id) {
+		struct acc100_queue *acc100_q =
+				data->queues[queue_id].queue_private;
+		if (acc100_q != NULL && acc100_q->aq_id == ring_data.aq_id &&
+				acc100_q->qgrp_id == ring_data.qg_id &&
+				acc100_q->vf_id == ring_data.vf_id)
+			return queue_id;
+	}
+
+	return UINT16_MAX;
+}
+
+/* Checks PF Info Ring to find the interrupt cause and handles it accordingly */
+static inline void
+acc100_check_ir(struct acc100_device *acc100_dev)
+{
+	volatile union acc100_info_ring_data *ring_data;
+	uint16_t info_ring_head = acc100_dev->info_ring_head;
+	if (acc100_dev->info_ring == NULL)
+		return;
+
+	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
+			ACC100_INFO_RING_MASK);
+
+	while (ring_data->valid) {
+		if ((ring_data->int_nb < ACC100_PF_INT_DMA_DL_DESC_IRQ) || (
+				ring_data->int_nb >
+				ACC100_PF_INT_DMA_DL5G_DESC_IRQ))
+			rte_bbdev_log(WARNING, "InfoRing: ITR:%d Info:0x%x",
+				ring_data->int_nb, ring_data->detailed_info);
+		/* Initialize Info Ring entry and move forward */
+		ring_data->val = 0;
+		info_ring_head++;
+		ring_data = acc100_dev->info_ring +
+				(info_ring_head & ACC100_INFO_RING_MASK);
+	}
+}
+
+/* Checks PF Info Ring to find the interrupt cause and handles it accordingly */
+static inline void
+acc100_pf_interrupt_handler(struct rte_bbdev *dev)
+{
+	struct acc100_device *acc100_dev = dev->data->dev_private;
+	volatile union acc100_info_ring_data *ring_data;
+	struct acc100_deq_intr_details deq_intr_det;
+
+	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
+			ACC100_INFO_RING_MASK);
+
+	while (ring_data->valid) {
+
+		rte_bbdev_log_debug(
+				"ACC100 PF Interrupt received, Info Ring data: 0x%x",
+				ring_data->val);
+
+		switch (ring_data->int_nb) {
+		case ACC100_PF_INT_DMA_DL_DESC_IRQ:
+		case ACC100_PF_INT_DMA_UL_DESC_IRQ:
+		case ACC100_PF_INT_DMA_UL5G_DESC_IRQ:
+		case ACC100_PF_INT_DMA_DL5G_DESC_IRQ:
+			deq_intr_det.queue_id = get_queue_id_from_ring_info(
+					dev->data, *ring_data);
+			if (deq_intr_det.queue_id == UINT16_MAX) {
+				rte_bbdev_log(ERR,
+						"Couldn't find queue: aq_id: %u, qg_id: %u, vf_id: %u",
+						ring_data->aq_id,
+						ring_data->qg_id,
+						ring_data->vf_id);
+				return;
+			}
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_DEQUEUE, &deq_intr_det);
+			break;
+		default:
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_ERROR, NULL);
+			break;
+		}
+
+		/* Initialize Info Ring entry and move forward */
+		ring_data->val = 0;
+		++acc100_dev->info_ring_head;
+		ring_data = acc100_dev->info_ring +
+				(acc100_dev->info_ring_head &
+				ACC100_INFO_RING_MASK);
+	}
+}
+
+/* Checks VF Info Ring to find the interrupt cause and handles it accordingly */
+static inline void
+acc100_vf_interrupt_handler(struct rte_bbdev *dev)
+{
+	struct acc100_device *acc100_dev = dev->data->dev_private;
+	volatile union acc100_info_ring_data *ring_data;
+	struct acc100_deq_intr_details deq_intr_det;
+
+	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
+			ACC100_INFO_RING_MASK);
+
+	while (ring_data->valid) {
+
+		rte_bbdev_log_debug(
+				"ACC100 VF Interrupt received, Info Ring data: 0x%x",
+				ring_data->val);
+
+		switch (ring_data->int_nb) {
+		case ACC100_VF_INT_DMA_DL_DESC_IRQ:
+		case ACC100_VF_INT_DMA_UL_DESC_IRQ:
+		case ACC100_VF_INT_DMA_UL5G_DESC_IRQ:
+		case ACC100_VF_INT_DMA_DL5G_DESC_IRQ:
+			/* VFs are not aware of their vf_id - it's set to 0 in
+			 * queue structures.
+			 */
+			ring_data->vf_id = 0;
+			deq_intr_det.queue_id = get_queue_id_from_ring_info(
+					dev->data, *ring_data);
+			if (deq_intr_det.queue_id == UINT16_MAX) {
+				rte_bbdev_log(ERR,
+						"Couldn't find queue: aq_id: %u, qg_id: %u",
+						ring_data->aq_id,
+						ring_data->qg_id);
+				return;
+			}
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_DEQUEUE, &deq_intr_det);
+			break;
+		default:
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_ERROR, NULL);
+			break;
+		}
+
+		/* Initialize Info Ring entry and move forward */
+		ring_data->valid = 0;
+		++acc100_dev->info_ring_head;
+		ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head
+				& ACC100_INFO_RING_MASK);
+	}
+}
+
+/* Interrupt handler triggered by ACC100 dev for handling specific interrupt */
+static void
+acc100_dev_interrupt_handler(void *cb_arg)
+{
+	struct rte_bbdev *dev = cb_arg;
+	struct acc100_device *acc100_dev = dev->data->dev_private;
+
+	/* Read info ring */
+	if (acc100_dev->pf_device)
+		acc100_pf_interrupt_handler(dev);
+	else
+		acc100_vf_interrupt_handler(dev);
+}
+
+/* Allocate and set up the info ring */
+static int
+allocate_inforing(struct rte_bbdev *dev)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	const struct acc100_registry_addr *reg_addr;
+	rte_iova_t info_ring_phys;
+	uint32_t phys_low, phys_high;
+
+	if (d->info_ring != NULL)
+		return 0; /* Already configured */
+
+	/* Choose correct registry addresses for the device type */
+	if (d->pf_device)
+		reg_addr = &pf_reg_addr;
+	else
+		reg_addr = &vf_reg_addr;
+	/* Allocate InfoRing */
+	d->info_ring = rte_zmalloc_socket("Info Ring",
+			ACC100_INFO_RING_NUM_ENTRIES *
+			sizeof(*d->info_ring), RTE_CACHE_LINE_SIZE,
+			dev->data->socket_id);
+	if (d->info_ring == NULL) {
+		rte_bbdev_log(ERR,
+				"Failed to allocate Info Ring for %s:%u",
+				dev->device->driver->name,
+				dev->data->dev_id);
+		return -ENOMEM;
+	}
+	info_ring_phys = rte_malloc_virt2iova(d->info_ring);
+
+	/* Setup Info Ring */
+	phys_high = (uint32_t)(info_ring_phys >> 32);
+	phys_low  = (uint32_t)(info_ring_phys);
+	acc100_reg_write(d, reg_addr->info_ring_hi, phys_high);
+	acc100_reg_write(d, reg_addr->info_ring_lo, phys_low);
+	acc100_reg_write(d, reg_addr->info_ring_en, ACC100_REG_IRQ_EN_ALL);
+	d->info_ring_head = (acc100_reg_read(d, reg_addr->info_ring_ptr) &
+			0xFFF) / sizeof(union acc100_info_ring_data);
+	return 0;
+}
+
+
 /* Allocate 64MB memory used for all software rings */
 static int
 acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
@@ -426,6 +633,7 @@
 	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
 	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
 
+	allocate_inforing(dev);
 	d->harq_layout = rte_zmalloc_socket("HARQ Layout",
 			ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
 			RTE_CACHE_LINE_SIZE, dev->data->socket_id);
@@ -437,13 +645,53 @@
 	return 0;
 }
 
+static int
+acc100_intr_enable(struct rte_bbdev *dev)
+{
+	int ret;
+	struct acc100_device *d = dev->data->dev_private;
+
+	/* Only MSI are currently supported */
+	if (dev->intr_handle->type == RTE_INTR_HANDLE_VFIO_MSI ||
+			dev->intr_handle->type == RTE_INTR_HANDLE_UIO) {
+
+		allocate_inforing(dev);
+
+		ret = rte_intr_enable(dev->intr_handle);
+		if (ret < 0) {
+			rte_bbdev_log(ERR,
+					"Couldn't enable interrupts for device: %s",
+					dev->data->name);
+			rte_free(d->info_ring);
+			return ret;
+		}
+		ret = rte_intr_callback_register(dev->intr_handle,
+				acc100_dev_interrupt_handler, dev);
+		if (ret < 0) {
+			rte_bbdev_log(ERR,
+					"Couldn't register interrupt callback for device: %s",
+					dev->data->name);
+			rte_free(d->info_ring);
+			return ret;
+		}
+
+		return 0;
+	}
+
+	rte_bbdev_log(ERR, "ACC100 (%s) supports only VFIO MSI interrupts",
+			dev->data->name);
+	return -ENOTSUP;
+}
+
 /* Free 64MB memory used for software rings */
 static int
 acc100_dev_close(struct rte_bbdev *dev)
 {
 	struct acc100_device *d = dev->data->dev_private;
+	acc100_check_ir(d);
 	if (d->sw_rings_base != NULL) {
 		rte_free(d->tail_ptrs);
+		rte_free(d->info_ring);
 		rte_free(d->sw_rings_base);
 		d->sw_rings_base = NULL;
 	}
@@ -643,6 +891,7 @@
 					RTE_BBDEV_TURBO_CRC_TYPE_24B |
 					RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
 					RTE_BBDEV_TURBO_EARLY_TERMINATION |
+					RTE_BBDEV_TURBO_DEC_INTERRUPTS |
 					RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
 					RTE_BBDEV_TURBO_MAP_DEC |
 					RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP |
@@ -663,6 +912,7 @@
 					RTE_BBDEV_TURBO_CRC_24B_ATTACH |
 					RTE_BBDEV_TURBO_RV_INDEX_BYPASS |
 					RTE_BBDEV_TURBO_RATE_MATCH |
+					RTE_BBDEV_TURBO_ENC_INTERRUPTS |
 					RTE_BBDEV_TURBO_ENC_SCATTER_GATHER,
 				.num_buffers_src =
 						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
@@ -676,7 +926,8 @@
 				.capability_flags =
 					RTE_BBDEV_LDPC_RATE_MATCH |
 					RTE_BBDEV_LDPC_CRC_24B_ATTACH |
-					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
+					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS |
+					RTE_BBDEV_LDPC_ENC_INTERRUPTS,
 				.num_buffers_src =
 						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
 				.num_buffers_dst =
@@ -701,7 +952,8 @@
 				RTE_BBDEV_LDPC_DECODE_BYPASS |
 				RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
 				RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
-				RTE_BBDEV_LDPC_LLR_COMPRESSION,
+				RTE_BBDEV_LDPC_LLR_COMPRESSION |
+				RTE_BBDEV_LDPC_DEC_INTERRUPTS,
 			.llr_size = 8,
 			.llr_decimals = 1,
 			.num_buffers_src =
@@ -751,14 +1003,39 @@
 #else
 	dev_info->harq_buffer_size = 0;
 #endif
+	acc100_check_ir(d);
+}
+
+static int
+acc100_queue_intr_enable(struct rte_bbdev *dev, uint16_t queue_id)
+{
+	struct acc100_queue *q = dev->data->queues[queue_id].queue_private;
+
+	if (dev->intr_handle->type != RTE_INTR_HANDLE_VFIO_MSI &&
+			dev->intr_handle->type != RTE_INTR_HANDLE_UIO)
+		return -ENOTSUP;
+
+	q->irq_enable = 1;
+	return 0;
+}
+
+static int
+acc100_queue_intr_disable(struct rte_bbdev *dev, uint16_t queue_id)
+{
+	struct acc100_queue *q = dev->data->queues[queue_id].queue_private;
+	q->irq_enable = 0;
+	return 0;
 }
 
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.setup_queues = acc100_setup_queues,
+	.intr_enable = acc100_intr_enable,
 	.close = acc100_dev_close,
 	.info_get = acc100_dev_info_get,
 	.queue_setup = acc100_queue_setup,
 	.queue_release = acc100_queue_release,
+	.queue_intr_enable = acc100_queue_intr_enable,
+	.queue_intr_disable = acc100_queue_intr_disable
 };
 
 /* ACC100 PCI PF address map */
@@ -3018,8 +3295,10 @@
 			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
 	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
 	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
-	if (op->status != 0)
+	if (op->status != 0) {
 		q_data->queue_stats.dequeue_err_count++;
+		acc100_check_ir(q->d);
+	}
 
 	/* CRC invalid if error exists */
 	if (!op->status)
@@ -3076,6 +3355,9 @@
 		op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
 	op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
 
+	if (op->status & (1 << RTE_BBDEV_DRV_ERROR))
+		acc100_check_ir(q->d);
+
 	/* Check if this is the last desc in batch (Atomic Queue) */
 	if (desc->req.last_desc_in_batch) {
 		(*aq_dequeued)++;
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 78686c1..8980fa5 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -559,7 +559,14 @@ struct acc100_device {
 	/* Virtual address of the info memory routed to this function under
 	 * operation, whether it is PF or VF.
 	 */
+	union acc100_info_ring_data *info_ring;
+
 	union acc100_harq_layout_data *harq_layout;
+	/* Virtual Info Ring head */
+	uint16_t info_ring_head;
+	/* Number of bytes available for each queue in device, depending on
+	 * how many queues are enabled with configure()
+	 */
 	uint32_t sw_ring_size;
 	uint32_t ddr_size; /* Size in kB */
 	uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */
@@ -575,4 +582,12 @@ struct acc100_device {
 	bool configured; /**< True if this ACC100 device is configured */
 };
 
+/**
+ * Structure with details about RTE_BBDEV_EVENT_DEQUEUE event. It's passed to
+ * the callback function.
+ */
+struct acc100_deq_intr_details {
+	uint16_t queue_id;
+};
+
 #endif /* _RTE_ACC100_PMD_H_ */
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v5 09/11] baseband/acc100: add debug function to validate input
  2020-09-23  2:12   ` [dpdk-dev] [PATCH v5 " Nicolas Chautru
                       ` (7 preceding siblings ...)
  2020-09-23  2:12     ` [dpdk-dev] [PATCH v5 08/11] baseband/acc100: add interrupt support to PMD Nicolas Chautru
@ 2020-09-23  2:12     ` Nicolas Chautru
  2020-09-23  2:12     ` [dpdk-dev] [PATCH v5 10/11] baseband/acc100: add configure function Nicolas Chautru
  2020-09-23  2:12     ` [dpdk-dev] [PATCH v5 11/11] doc: update bbdev feature table Nicolas Chautru
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-23  2:12 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add debug functions to validate the input provided by the user through the API.
They are only enabled in DEBUG mode at build time.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 424 +++++++++++++++++++++++++++++++
 1 file changed, 424 insertions(+)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index b6d9e7c..3589814 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -1945,6 +1945,231 @@
 
 }
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+/* Validates turbo encoder parameters */
+static inline int
+validate_enc_op(struct rte_bbdev_enc_op *op)
+{
+	struct rte_bbdev_op_turbo_enc *turbo_enc = &op->turbo_enc;
+	struct rte_bbdev_op_enc_turbo_cb_params *cb = NULL;
+	struct rte_bbdev_op_enc_turbo_tb_params *tb = NULL;
+	uint16_t kw, kw_neg, kw_pos;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if (turbo_enc->input.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid input pointer");
+		return -1;
+	}
+	if (turbo_enc->output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid output pointer");
+		return -1;
+	}
+	if (turbo_enc->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				turbo_enc->rv_index);
+		return -1;
+	}
+	if (turbo_enc->code_block_mode != 0 &&
+			turbo_enc->code_block_mode != 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				turbo_enc->code_block_mode);
+		return -1;
+	}
+
+	if (turbo_enc->code_block_mode == 0) {
+		tb = &turbo_enc->tb_params;
+		if ((tb->k_neg < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_neg > RTE_BBDEV_TURBO_MAX_CB_SIZE)
+				&& tb->c_neg > 0) {
+			rte_bbdev_log(ERR,
+					"k_neg (%u) is out of range %u <= value <= %u",
+					tb->k_neg, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (tb->k_pos < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_pos > RTE_BBDEV_TURBO_MAX_CB_SIZE) {
+			rte_bbdev_log(ERR,
+					"k_pos (%u) is out of range %u <= value <= %u",
+					tb->k_pos, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (tb->c_neg > (RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1))
+			rte_bbdev_log(ERR,
+					"c_neg (%u) is out of range 0 <= value <= %u",
+					tb->c_neg,
+					RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1);
+		if (tb->c < 1 || tb->c > RTE_BBDEV_TURBO_MAX_CODE_BLOCKS) {
+			rte_bbdev_log(ERR,
+					"c (%u) is out of range 1 <= value <= %u",
+					tb->c, RTE_BBDEV_TURBO_MAX_CODE_BLOCKS);
+			return -1;
+		}
+		if (tb->cab > tb->c) {
+			rte_bbdev_log(ERR,
+					"cab (%u) is greater than c (%u)",
+					tb->cab, tb->c);
+			return -1;
+		}
+		if ((tb->ea < RTE_BBDEV_TURBO_MIN_CB_SIZE || (tb->ea % 2))
+				&& tb->r < tb->cab) {
+			rte_bbdev_log(ERR,
+					"ea (%u) is less than %u or it is not even",
+					tb->ea, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+		if ((tb->eb < RTE_BBDEV_TURBO_MIN_CB_SIZE || (tb->eb % 2))
+				&& tb->c > tb->cab) {
+			rte_bbdev_log(ERR,
+					"eb (%u) is less than %u or it is not even",
+					tb->eb, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+
+		kw_neg = 3 * RTE_ALIGN_CEIL(tb->k_neg + 4,
+					RTE_BBDEV_TURBO_C_SUBBLOCK);
+		if (tb->ncb_neg < tb->k_neg || tb->ncb_neg > kw_neg) {
+			rte_bbdev_log(ERR,
+					"ncb_neg (%u) is out of range (%u) k_neg <= value <= (%u) kw_neg",
+					tb->ncb_neg, tb->k_neg, kw_neg);
+			return -1;
+		}
+
+		kw_pos = 3 * RTE_ALIGN_CEIL(tb->k_pos + 4,
+					RTE_BBDEV_TURBO_C_SUBBLOCK);
+		if (tb->ncb_pos < tb->k_pos || tb->ncb_pos > kw_pos) {
+			rte_bbdev_log(ERR,
+					"ncb_pos (%u) is out of range (%u) k_pos <= value <= (%u) kw_pos",
+					tb->ncb_pos, tb->k_pos, kw_pos);
+			return -1;
+		}
+		if (tb->r > (tb->c - 1)) {
+			rte_bbdev_log(ERR,
+					"r (%u) is greater than c - 1 (%u)",
+					tb->r, tb->c - 1);
+			return -1;
+		}
+	} else {
+		cb = &turbo_enc->cb_params;
+		if (cb->k < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| cb->k > RTE_BBDEV_TURBO_MAX_CB_SIZE) {
+			rte_bbdev_log(ERR,
+					"k (%u) is out of range %u <= value <= %u",
+					cb->k, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+
+		if (cb->e < RTE_BBDEV_TURBO_MIN_CB_SIZE || (cb->e % 2)) {
+			rte_bbdev_log(ERR,
+					"e (%u) is less than %u or it is not even",
+					cb->e, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+
+		kw = RTE_ALIGN_CEIL(cb->k + 4, RTE_BBDEV_TURBO_C_SUBBLOCK) * 3;
+		if (cb->ncb < cb->k || cb->ncb > kw) {
+			rte_bbdev_log(ERR,
+					"ncb (%u) is out of range (%u) k <= value <= (%u) kw",
+					cb->ncb, cb->k, kw);
+			return -1;
+		}
+	}
+
+	return 0;
+}
+/* Validates LDPC encoder parameters */
+static inline int
+validate_ldpc_enc_op(struct rte_bbdev_enc_op *op)
+{
+	struct rte_bbdev_op_ldpc_enc *ldpc_enc = &op->ldpc_enc;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if (ldpc_enc->input.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid input pointer");
+		return -1;
+	}
+	if (ldpc_enc->output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid output pointer");
+		return -1;
+	}
+	if (ldpc_enc->input.length >
+			RTE_BBDEV_LDPC_MAX_CB_SIZE >> 3) {
+		rte_bbdev_log(ERR, "CB size (%u) is too big, max: %d",
+				ldpc_enc->input.length,
+				RTE_BBDEV_LDPC_MAX_CB_SIZE);
+		return -1;
+	}
+	if ((ldpc_enc->basegraph > 2) || (ldpc_enc->basegraph == 0)) {
+		rte_bbdev_log(ERR,
+				"BG (%u) is out of range 1 <= value <= 2",
+				ldpc_enc->basegraph);
+		return -1;
+	}
+	if (ldpc_enc->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				ldpc_enc->rv_index);
+		return -1;
+	}
+	if (ldpc_enc->code_block_mode > 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				ldpc_enc->code_block_mode);
+		return -1;
+	}
+
+	return 0;
+}
+
+/* Validates LDPC decoder parameters */
+static inline int
+validate_ldpc_dec_op(struct rte_bbdev_dec_op *op)
+{
+	struct rte_bbdev_op_ldpc_dec *ldpc_dec = &op->ldpc_dec;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if ((ldpc_dec->basegraph > 2) || (ldpc_dec->basegraph == 0)) {
+		rte_bbdev_log(ERR,
+				"BG (%u) is out of range 1 <= value <= 2",
+				ldpc_dec->basegraph);
+		return -1;
+	}
+	if (ldpc_dec->iter_max == 0) {
+		rte_bbdev_log(ERR,
+				"iter_max (%u) is equal to 0",
+				ldpc_dec->iter_max);
+		return -1;
+	}
+	if (ldpc_dec->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				ldpc_dec->rv_index);
+		return -1;
+	}
+	if (ldpc_dec->code_block_mode > 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				ldpc_dec->code_block_mode);
+		return -1;
+	}
+
+	return 0;
+}
+#endif
+
 /* Enqueue one encode operations for ACC100 device in CB mode */
 static inline int
 enqueue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
@@ -1956,6 +2181,14 @@
 		seg_total_left;
 	struct rte_mbuf *input, *output_head, *output;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_enc_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2008,6 +2241,14 @@
 	uint16_t  in_length_in_bytes;
 	struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_enc_op(ops[0]) == -1) {
+		rte_bbdev_log(ERR, "LDPC encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2065,6 +2306,14 @@
 		seg_total_left;
 	struct rte_mbuf *input, *output_head, *output;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_enc_op(op) == -1) {
+		rte_bbdev_log(ERR, "LDPC encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2119,6 +2368,14 @@
 	struct rte_mbuf *input, *output_head, *output;
 	uint16_t current_enqueued_cbs = 0;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_enc_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2191,6 +2448,142 @@
 	return current_enqueued_cbs;
 }
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+/* Validates turbo decoder parameters */
+static inline int
+validate_dec_op(struct rte_bbdev_dec_op *op)
+{
+	struct rte_bbdev_op_turbo_dec *turbo_dec = &op->turbo_dec;
+	struct rte_bbdev_op_dec_turbo_cb_params *cb = NULL;
+	struct rte_bbdev_op_dec_turbo_tb_params *tb = NULL;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if (turbo_dec->input.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid input pointer");
+		return -1;
+	}
+	if (turbo_dec->hard_output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid hard_output pointer");
+		return -1;
+	}
+	if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT) &&
+			turbo_dec->soft_output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid soft_output pointer");
+		return -1;
+	}
+	if (turbo_dec->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				turbo_dec->rv_index);
+		return -1;
+	}
+	if (turbo_dec->iter_min < 1) {
+		rte_bbdev_log(ERR,
+				"iter_min (%u) is less than 1",
+				turbo_dec->iter_min);
+		return -1;
+	}
+	if (turbo_dec->iter_max <= 2) {
+		rte_bbdev_log(ERR,
+				"iter_max (%u) is less than or equal to 2",
+				turbo_dec->iter_max);
+		return -1;
+	}
+	if (turbo_dec->iter_min > turbo_dec->iter_max) {
+		rte_bbdev_log(ERR,
+				"iter_min (%u) is greater than iter_max (%u)",
+				turbo_dec->iter_min, turbo_dec->iter_max);
+		return -1;
+	}
+	if (turbo_dec->code_block_mode != 0 &&
+			turbo_dec->code_block_mode != 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				turbo_dec->code_block_mode);
+		return -1;
+	}
+
+	if (turbo_dec->code_block_mode == 0) {
+		tb = &turbo_dec->tb_params;
+		if ((tb->k_neg < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_neg > RTE_BBDEV_TURBO_MAX_CB_SIZE)
+				&& tb->c_neg > 0) {
+			rte_bbdev_log(ERR,
+					"k_neg (%u) is out of range %u <= value <= %u",
+					tb->k_neg, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if ((tb->k_pos < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_pos > RTE_BBDEV_TURBO_MAX_CB_SIZE)
+				&& tb->c > tb->c_neg) {
+			rte_bbdev_log(ERR,
+					"k_pos (%u) is out of range %u <= value <= %u",
+					tb->k_pos, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (tb->c_neg > (RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1))
+			rte_bbdev_log(ERR,
+					"c_neg (%u) is out of range 0 <= value <= %u",
+					tb->c_neg,
+					RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1);
+		if (tb->c < 1 || tb->c > RTE_BBDEV_TURBO_MAX_CODE_BLOCKS) {
+			rte_bbdev_log(ERR,
+					"c (%u) is out of range 1 <= value <= %u",
+					tb->c, RTE_BBDEV_TURBO_MAX_CODE_BLOCKS);
+			return -1;
+		}
+		if (tb->cab > tb->c) {
+			rte_bbdev_log(ERR,
+					"cab (%u) is greater than c (%u)",
+					tb->cab, tb->c);
+			return -1;
+		}
+		if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) &&
+				(tb->ea < RTE_BBDEV_TURBO_MIN_CB_SIZE
+						|| (tb->ea % 2))
+				&& tb->cab > 0) {
+			rte_bbdev_log(ERR,
+					"ea (%u) is less than %u or it is not even",
+					tb->ea, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+		if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) &&
+				(tb->eb < RTE_BBDEV_TURBO_MIN_CB_SIZE
+						|| (tb->eb % 2))
+				&& tb->c > tb->cab) {
+			rte_bbdev_log(ERR,
+					"eb (%u) is less than %u or it is not even",
+					tb->eb, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+		}
+	} else {
+		cb = &turbo_dec->cb_params;
+		if (cb->k < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| cb->k > RTE_BBDEV_TURBO_MAX_CB_SIZE) {
+			rte_bbdev_log(ERR,
+					"k (%u) is out of range %u <= value <= %u",
+					cb->k, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) &&
+				(cb->e < RTE_BBDEV_TURBO_MIN_CB_SIZE ||
+				(cb->e % 2))) {
+			rte_bbdev_log(ERR,
+					"e (%u) is less than %u or it is not even",
+					cb->e, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+	}
+
+	return 0;
+}
+#endif
+
 /** Enqueue one decode operations for ACC100 device in CB mode */
 static inline int
 enqueue_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
@@ -2203,6 +2596,14 @@
 	struct rte_mbuf *input, *h_output_head, *h_output,
 		*s_output_head, *s_output;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo decoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2426,6 +2827,13 @@
 		return ret;
 	}
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "LDPC decoder validation failed");
+		return -EINVAL;
+	}
+#endif
 	union acc100_dma_desc *desc;
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
@@ -2521,6 +2929,14 @@
 	struct rte_mbuf *input, *h_output_head, *h_output;
 	uint16_t current_enqueued_cbs = 0;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "LDPC decoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2611,6 +3027,14 @@
 		*s_output_head, *s_output;
 	uint16_t current_enqueued_cbs = 0;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo decoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v5 10/11] baseband/acc100: add configure function
  2020-09-23  2:12   ` [dpdk-dev] [PATCH v5 " Nicolas Chautru
                       ` (8 preceding siblings ...)
  2020-09-23  2:12     ` [dpdk-dev] [PATCH v5 09/11] baseband/acc100: add debug function to validate input Nicolas Chautru
@ 2020-09-23  2:12     ` Nicolas Chautru
  2020-09-23  2:12     ` [dpdk-dev] [PATCH v5 11/11] doc: update bbdev feature table Nicolas Chautru
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-23  2:12 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add a configure function to set up the PF from within
bbdev-test itself, without requiring an external
application to configure the device.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 app/test-bbdev/test_bbdev_perf.c                   |  72 +++
 drivers/baseband/acc100/Makefile                   |   3 +
 drivers/baseband/acc100/meson.build                |   2 +
 drivers/baseband/acc100/rte_acc100_cfg.h           |  17 +
 drivers/baseband/acc100/rte_acc100_pmd.c           | 505 +++++++++++++++++++++
 .../acc100/rte_pmd_bbdev_acc100_version.map        |   7 +
 6 files changed, 606 insertions(+)

diff --git a/app/test-bbdev/test_bbdev_perf.c b/app/test-bbdev/test_bbdev_perf.c
index 45c0d62..32f23ff 100644
--- a/app/test-bbdev/test_bbdev_perf.c
+++ b/app/test-bbdev/test_bbdev_perf.c
@@ -52,6 +52,18 @@
 #define FLR_5G_TIMEOUT 610
 #endif
 
+#ifdef RTE_LIBRTE_PMD_BBDEV_ACC100
+#include <rte_acc100_cfg.h>
+#define ACC100PF_DRIVER_NAME   ("intel_acc100_pf")
+#define ACC100VF_DRIVER_NAME   ("intel_acc100_vf")
+#define ACC100_QMGR_NUM_AQS 16
+#define ACC100_QMGR_NUM_QGS 2
+#define ACC100_QMGR_AQ_DEPTH 5
+#define ACC100_QMGR_INVALID_IDX -1
+#define ACC100_QMGR_RR 1
+#define ACC100_QOS_GBR 0
+#endif
+
 #define OPS_CACHE_SIZE 256U
 #define OPS_POOL_SIZE_MIN 511U /* 0.5K per queue */
 
@@ -653,6 +665,66 @@ typedef int (test_case_function)(struct active_device *ad,
 				info->dev_name);
 	}
 #endif
+#ifdef RTE_LIBRTE_PMD_BBDEV_ACC100
+	if ((get_init_device() == true) &&
+		(!strcmp(info->drv.driver_name, ACC100PF_DRIVER_NAME))) {
+		struct acc100_conf conf;
+		unsigned int i;
+
+		printf("Configure ACC100 FEC Driver %s with default values\n",
+				info->drv.driver_name);
+
+		/* clear default configuration before initialization */
+		memset(&conf, 0, sizeof(struct acc100_conf));
+
+		/* Always set in PF mode for built-in configuration */
+		conf.pf_mode_en = true;
+		for (i = 0; i < RTE_ACC100_NUM_VFS; ++i) {
+			conf.arb_dl_4g[i].gbr_threshold1 = ACC100_QOS_GBR;
+			conf.arb_dl_4g[i].gbr_threshold2 = ACC100_QOS_GBR;
+			conf.arb_dl_4g[i].round_robin_weight = ACC100_QMGR_RR;
+			conf.arb_ul_4g[i].gbr_threshold1 = ACC100_QOS_GBR;
+			conf.arb_ul_4g[i].gbr_threshold2 = ACC100_QOS_GBR;
+			conf.arb_ul_4g[i].round_robin_weight = ACC100_QMGR_RR;
+			conf.arb_dl_5g[i].gbr_threshold1 = ACC100_QOS_GBR;
+			conf.arb_dl_5g[i].gbr_threshold2 = ACC100_QOS_GBR;
+			conf.arb_dl_5g[i].round_robin_weight = ACC100_QMGR_RR;
+			conf.arb_ul_5g[i].gbr_threshold1 = ACC100_QOS_GBR;
+			conf.arb_ul_5g[i].gbr_threshold2 = ACC100_QOS_GBR;
+			conf.arb_ul_5g[i].round_robin_weight = ACC100_QMGR_RR;
+		}
+
+		conf.input_pos_llr_1_bit = true;
+		conf.output_pos_llr_1_bit = true;
+		conf.num_vf_bundles = 1; /**< Number of VF bundles to setup */
+
+		conf.q_ul_4g.num_qgroups = ACC100_QMGR_NUM_QGS;
+		conf.q_ul_4g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+		conf.q_ul_4g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+		conf.q_ul_4g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+		conf.q_dl_4g.num_qgroups = ACC100_QMGR_NUM_QGS;
+		conf.q_dl_4g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+		conf.q_dl_4g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+		conf.q_dl_4g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+		conf.q_ul_5g.num_qgroups = ACC100_QMGR_NUM_QGS;
+		conf.q_ul_5g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+		conf.q_ul_5g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+		conf.q_ul_5g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+		conf.q_dl_5g.num_qgroups = ACC100_QMGR_NUM_QGS;
+		conf.q_dl_5g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+		conf.q_dl_5g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+		conf.q_dl_5g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+
+		/* setup PF with configuration information */
+		ret = acc100_configure(info->dev_name, &conf);
+		TEST_ASSERT_SUCCESS(ret,
+				"Failed to configure ACC100 PF for bbdev %s",
+				info->dev_name);
+		/* Refresh the device info now that the PF is configured */
+	}
+	rte_bbdev_info_get(dev_id, info);
+#endif
+
 	nb_queues = RTE_MIN(rte_lcore_count(), info->drv.max_num_queues);
 	nb_queues = RTE_MIN(nb_queues, (unsigned int) MAX_QUEUES);
 
diff --git a/drivers/baseband/acc100/Makefile b/drivers/baseband/acc100/Makefile
index c79e487..37e73af 100644
--- a/drivers/baseband/acc100/Makefile
+++ b/drivers/baseband/acc100/Makefile
@@ -22,4 +22,7 @@ LIBABIVER := 1
 # library source files
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100) += rte_acc100_pmd.c
 
+# export include files
+SYMLINK-$(CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100)-include += rte_acc100_cfg.h
+
 include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/drivers/baseband/acc100/meson.build b/drivers/baseband/acc100/meson.build
index 8afafc2..7ac44dc 100644
--- a/drivers/baseband/acc100/meson.build
+++ b/drivers/baseband/acc100/meson.build
@@ -4,3 +4,5 @@
 deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci']
 
 sources = files('rte_acc100_pmd.c')
+
+install_headers('rte_acc100_cfg.h')
diff --git a/drivers/baseband/acc100/rte_acc100_cfg.h b/drivers/baseband/acc100/rte_acc100_cfg.h
index 73bbe36..7f523bc 100644
--- a/drivers/baseband/acc100/rte_acc100_cfg.h
+++ b/drivers/baseband/acc100/rte_acc100_cfg.h
@@ -89,6 +89,23 @@ struct acc100_conf {
 	struct rte_arbitration_t arb_dl_5g[RTE_ACC100_NUM_VFS];
 };
 
+/**
+ * Configure an ACC100 device
+ *
+ * @param dev_name
+ *   The name of the device. This is the short form of PCI BDF, e.g. 00:01.0.
+ *   It can also be retrieved for a bbdev device from the dev_name field in the
+ *   rte_bbdev_info structure returned by rte_bbdev_info_get().
+ * @param conf
+ *   Configuration to apply to ACC100 HW.
+ *
+ * @return
+ *   Zero on success, negative value on failure.
+ */
+__rte_experimental
+int
+acc100_configure(const char *dev_name, struct acc100_conf *conf);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 3589814..b50dd32 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -85,6 +85,26 @@
 
 enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC};
 
+/* Return the accelerator enum for a Queue Group Index */
+static inline int
+accFromQgid(int qg_idx, const struct acc100_conf *acc100_conf)
+{
+	int accQg[ACC100_NUM_QGRPS];
+	int NumQGroupsPerFn[NUM_ACC];
+	int acc, qgIdx, qgIndex = 0;
+	for (qgIdx = 0; qgIdx < ACC100_NUM_QGRPS; qgIdx++)
+		accQg[qgIdx] = 0;
+	NumQGroupsPerFn[UL_4G] = acc100_conf->q_ul_4g.num_qgroups;
+	NumQGroupsPerFn[UL_5G] = acc100_conf->q_ul_5g.num_qgroups;
+	NumQGroupsPerFn[DL_4G] = acc100_conf->q_dl_4g.num_qgroups;
+	NumQGroupsPerFn[DL_5G] = acc100_conf->q_dl_5g.num_qgroups;
+	for (acc = UL_4G;  acc < NUM_ACC; acc++)
+		for (qgIdx = 0; qgIdx < NumQGroupsPerFn[acc]; qgIdx++)
+			accQg[qgIndex++] = acc;
+	acc = accQg[qg_idx];
+	return acc;
+}
+
 /* Return the queue topology for a Queue Group Index */
 static inline void
 qtopFromAcc(struct rte_q_topology_t **qtop, int acc_enum,
@@ -113,6 +133,30 @@
 	*qtop = p_qtop;
 }
 
+/* Return the AQ depth for a Queue Group Index */
+static inline int
+aqDepth(int qg_idx, struct acc100_conf *acc100_conf)
+{
+	struct rte_q_topology_t *q_top = NULL;
+	int acc_enum = accFromQgid(qg_idx, acc100_conf);
+	qtopFromAcc(&q_top, acc_enum, acc100_conf);
+	if (unlikely(q_top == NULL))
+		return 0;
+	return q_top->aq_depth_log2;
+}
+
+/* Return the number of AQs per group for a Queue Group Index */
+static inline int
+aqNum(int qg_idx, struct acc100_conf *acc100_conf)
+{
+	struct rte_q_topology_t *q_top = NULL;
+	int acc_enum = accFromQgid(qg_idx, acc100_conf);
+	qtopFromAcc(&q_top, acc_enum, acc100_conf);
+	if (unlikely(q_top == NULL))
+		return 0;
+	return q_top->num_aqs_per_groups;
+}
+
 static void
 initQTop(struct acc100_conf *acc100_conf)
 {
@@ -4177,3 +4221,464 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
 RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
 RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
 RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
+
+/*
+ * Implementation to fix the power-on status of some 5GUL engines.
+ * This requires DMA permission if ported outside DPDK.
+ */
+static void
+poweron_cleanup(struct rte_bbdev *bbdev, struct acc100_device *d,
+		struct acc100_conf *conf)
+{
+	int i, template_idx, qg_idx;
+	uint32_t address, status, payload;
+	printf("Need to clear power-on 5GUL status in internal memory\n");
+	/* Reset LDPC Cores */
+	for (i = 0; i < ACC100_ENGINES_MAX; i++)
+		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+				ACC100_ENGINE_OFFSET * i, ACC100_RESET_HI);
+	usleep(LONG_WAIT);
+	for (i = 0; i < ACC100_ENGINES_MAX; i++)
+		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+				ACC100_ENGINE_OFFSET * i, ACC100_RESET_LO);
+	usleep(LONG_WAIT);
+	/* Prepare dummy workload */
+	alloc_2x64mb_sw_rings_mem(bbdev, d, 0);
+	/* Set base addresses */
+	uint32_t phys_high = (uint32_t)(d->sw_rings_phys >> 32);
+	uint32_t phys_low  = (uint32_t)(d->sw_rings_phys &
+			~(ACC100_SIZE_64MBYTE-1));
+	acc100_reg_write(d, HWPfDmaFec5GulDescBaseHiRegVf, phys_high);
+	acc100_reg_write(d, HWPfDmaFec5GulDescBaseLoRegVf, phys_low);
+
+	/* Descriptor for a dummy 5GUL code block processing */
+	union acc100_dma_desc *desc = NULL;
+	desc = d->sw_rings;
+	desc->req.data_ptrs[0].address = d->sw_rings_phys +
+			ACC100_DESC_FCW_OFFSET;
+	desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
+	desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
+	desc->req.data_ptrs[0].last = 0;
+	desc->req.data_ptrs[0].dma_ext = 0;
+	desc->req.data_ptrs[1].address = d->sw_rings_phys + 512;
+	desc->req.data_ptrs[1].blkid = ACC100_DMA_BLKID_IN;
+	desc->req.data_ptrs[1].last = 1;
+	desc->req.data_ptrs[1].dma_ext = 0;
+	desc->req.data_ptrs[1].blen = 44;
+	desc->req.data_ptrs[2].address = d->sw_rings_phys + 1024;
+	desc->req.data_ptrs[2].blkid = ACC100_DMA_BLKID_OUT_ENC;
+	desc->req.data_ptrs[2].last = 1;
+	desc->req.data_ptrs[2].dma_ext = 0;
+	desc->req.data_ptrs[2].blen = 5;
+	/* Dummy FCW */
+	desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
+	desc->req.fcw_ld.qm = 1;
+	desc->req.fcw_ld.nfiller = 30;
+	desc->req.fcw_ld.BG = 2 - 1;
+	desc->req.fcw_ld.Zc = 7;
+	desc->req.fcw_ld.ncb = 350;
+	desc->req.fcw_ld.rm_e = 4;
+	desc->req.fcw_ld.itmax = 10;
+	desc->req.fcw_ld.gain_i = 1;
+	desc->req.fcw_ld.gain_h = 1;
+
+	int engines_to_restart[SIG_UL_5G_LAST + 1] = {0};
+	int num_failed_engine = 0;
+	/* Detect engines in undefined state */
+	for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
+			template_idx++) {
+		/* Check engine power-on status */
+		address = HwPfFecUl5gIbDebugReg +
+				ACC100_ENGINE_OFFSET * template_idx;
+		status = (acc100_reg_read(d, address) >> 4) & 0xF;
+		if (status == 0) {
+			engines_to_restart[num_failed_engine] = template_idx;
+			num_failed_engine++;
+		}
+	}
+
+	int numQqsAcc = conf->q_ul_4g.num_qgroups; /* QGs before the 5G UL ones */
+	int numQgs = conf->q_ul_5g.num_qgroups;
+	payload = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		payload |= (1 << qg_idx);
+	/* Force each engine which is in unspecified state */
+	for (i = 0; i < num_failed_engine; i++) {
+		int failed_engine = engines_to_restart[i];
+		printf("Force engine %d\n", failed_engine);
+		for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
+				template_idx++) {
+			address = HWPfQmgrGrpTmplateReg4Indx
+					+ BYTES_IN_WORD * template_idx;
+			if (template_idx == failed_engine)
+				acc100_reg_write(d, address, payload);
+			else
+				acc100_reg_write(d, address, 0);
+		}
+		/* Reset descriptor header */
+		desc->req.word0 = ACC100_DMA_DESC_TYPE;
+		desc->req.word1 = 0;
+		desc->req.word2 = 0;
+		desc->req.word3 = 0;
+		desc->req.numCBs = 1;
+		desc->req.m2dlen = 2;
+		desc->req.d2mlen = 1;
+		/* Enqueue the code block for processing */
+		union acc100_enqueue_reg_fmt enq_req;
+		enq_req.val = 0;
+		enq_req.addr_offset = ACC100_DESC_OFFSET;
+		enq_req.num_elem = 1;
+		enq_req.req_elem_addr = 0;
+		rte_wmb();
+		acc100_reg_write(d, HWPfQmgrIngressAq + 0x100, enq_req.val);
+		usleep(LONG_WAIT * 100);
+		if (desc->req.word0 != 2)
+			printf("DMA Response %#"PRIx32"\n", desc->req.word0);
+	}
+
+	/* Reset LDPC Cores */
+	for (i = 0; i < ACC100_ENGINES_MAX; i++)
+		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+				ACC100_ENGINE_OFFSET * i, ACC100_RESET_HI);
+	usleep(LONG_WAIT);
+	for (i = 0; i < ACC100_ENGINES_MAX; i++)
+		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+				ACC100_ENGINE_OFFSET * i, ACC100_RESET_LO);
+	usleep(LONG_WAIT);
+	acc100_reg_write(d, HWPfHi5GHardResetReg, ACC100_RESET_HARD);
+	usleep(LONG_WAIT);
+	int numEngines = 0;
+	/* Check engine power-on status again */
+	for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
+			template_idx++) {
+		address = HwPfFecUl5gIbDebugReg +
+				ACC100_ENGINE_OFFSET * template_idx;
+		status = (acc100_reg_read(d, address) >> 4) & 0xF;
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD * template_idx;
+		if (status == 1) {
+			acc100_reg_write(d, address, payload);
+			numEngines++;
+		} else
+			acc100_reg_write(d, address, 0);
+	}
+	printf("Number of 5GUL engines %d\n", numEngines);
+
+	if (d->sw_rings_base != NULL)
+		rte_free(d->sw_rings_base);
+	usleep(LONG_WAIT);
+}
+
+/* Initial configuration of an ACC100 device prior to running configure() */
+int
+acc100_configure(const char *dev_name, struct acc100_conf *conf)
+{
+	rte_bbdev_log(INFO, "acc100_configure");
+	uint32_t payload, address, status;
+	int qg_idx, template_idx, vf_idx, acc, i;
+	struct rte_bbdev *bbdev = rte_bbdev_get_named_dev(dev_name);
+
+	/* Compile time checks */
+	RTE_BUILD_BUG_ON(sizeof(struct acc100_dma_req_desc) != 256);
+	RTE_BUILD_BUG_ON(sizeof(union acc100_dma_desc) != 256);
+	RTE_BUILD_BUG_ON(sizeof(struct acc100_fcw_td) != 24);
+	RTE_BUILD_BUG_ON(sizeof(struct acc100_fcw_te) != 32);
+
+	if (bbdev == NULL) {
+		rte_bbdev_log(ERR,
+		"Invalid dev_name (%s), or device is not yet initialised",
+		dev_name);
+		return -ENODEV;
+	}
+	struct acc100_device *d = bbdev->data->dev_private;
+
+	/* Store configuration */
+	rte_memcpy(&d->acc100_conf, conf, sizeof(d->acc100_conf));
+
+	/* PCIe Bridge configuration */
+	acc100_reg_write(d, HwPfPcieGpexBridgeControl, ACC100_CFG_PCI_BRIDGE);
+	for (i = 1; i < 17; i++)
+		acc100_reg_write(d,
+				HwPfPcieGpexAxiAddrMappingWindowPexBaseHigh
+				+ i * 16, 0);
+
+	/* PCIe Link Training and Status State Machine */
+	acc100_reg_write(d, HwPfPcieGpexLtssmStateCntrl, 0xDFC00000);
+
+	/* Prevent blocking AXI read on BRESP for AXI Write */
+	address = HwPfPcieGpexAxiPioControl;
+	payload = ACC100_CFG_PCI_AXI;
+	acc100_reg_write(d, address, payload);
+
+	/* 5GDL PLL phase shift */
+	acc100_reg_write(d, HWPfChaDl5gPllPhshft0, 0x1);
+
+	/* Explicitly releasing AXI as this may be stopped after PF FLR/BME */
+	address = HWPfDmaAxiControl;
+	payload = 1;
+	acc100_reg_write(d, address, payload);
+
+	/* DDR Configuration */
+	address = HWPfDdrBcTim6;
+	payload = acc100_reg_read(d, address);
+	payload &= 0xFFFFFFFB; /* Bit 2 */
+#ifdef ACC100_DDR_ECC_ENABLE
+	payload |= 0x4;
+#endif
+	acc100_reg_write(d, address, payload);
+	address = HWPfDdrPhyDqsCountNum;
+#ifdef ACC100_DDR_ECC_ENABLE
+	payload = 9;
+#else
+	payload = 8;
+#endif
+	acc100_reg_write(d, address, payload);
+
+	/* Set default descriptor signature */
+	address = HWPfDmaDescriptorSignatuture;
+	payload = 0;
+	acc100_reg_write(d, address, payload);
+
+	/* Enable the Error Detection in DMA */
+	payload = ACC100_CFG_DMA_ERROR;
+	address = HWPfDmaErrorDetectionEn;
+	acc100_reg_write(d, address, payload);
+
+	/* AXI Cache configuration */
+	payload = ACC100_CFG_AXI_CACHE;
+	address = HWPfDmaAxcacheReg;
+	acc100_reg_write(d, address, payload);
+
+	/* Default DMA Configuration (Qmgr Enabled) */
+	address = HWPfDmaConfig0Reg;
+	payload = 0;
+	acc100_reg_write(d, address, payload);
+	address = HWPfDmaQmanen;
+	payload = 0;
+	acc100_reg_write(d, address, payload);
+
+	/* Default RLIM/ALEN configuration */
+	address = HWPfDmaConfig1Reg;
+	payload = (1 << 31) + (23 << 8) + (1 << 6) + 7;
+	acc100_reg_write(d, address, payload);
+
+	/* Configure DMA Qmanager addresses */
+	address = HWPfDmaQmgrAddrReg;
+	payload = HWPfQmgrEgressQueuesTemplate;
+	acc100_reg_write(d, address, payload);
+
+	/* ===== Qmgr Configuration ===== */
+	/* Configuration of the AQueue Depth QMGR_GRP_0_DEPTH_LOG2 for UL */
+	int totalQgs = conf->q_ul_4g.num_qgroups +
+			conf->q_ul_5g.num_qgroups +
+			conf->q_dl_4g.num_qgroups +
+			conf->q_dl_5g.num_qgroups;
+	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
+		address = HWPfQmgrDepthLog2Grp +
+		BYTES_IN_WORD * qg_idx;
+		payload = aqDepth(qg_idx, conf);
+		acc100_reg_write(d, address, payload);
+		address = HWPfQmgrTholdGrp +
+		BYTES_IN_WORD * qg_idx;
+		payload = (1 << 16) + (1 << (aqDepth(qg_idx, conf) - 1));
+		acc100_reg_write(d, address, payload);
+	}
+
+	/* Template Priority in incremental order */
+	for (template_idx = 0; template_idx < ACC100_NUM_TMPL;
+			template_idx++) {
+		address = HWPfQmgrGrpTmplateReg0Indx +
+		BYTES_IN_WORD * (template_idx % 8);
+		payload = TMPL_PRI_0;
+		acc100_reg_write(d, address, payload);
+		address = HWPfQmgrGrpTmplateReg1Indx +
+		BYTES_IN_WORD * (template_idx % 8);
+		payload = TMPL_PRI_1;
+		acc100_reg_write(d, address, payload);
+		address = HWPfQmgrGrpTmplateReg2indx +
+		BYTES_IN_WORD * (template_idx % 8);
+		payload = TMPL_PRI_2;
+		acc100_reg_write(d, address, payload);
+		address = HWPfQmgrGrpTmplateReg3Indx +
+		BYTES_IN_WORD * (template_idx % 8);
+		payload = TMPL_PRI_3;
+		acc100_reg_write(d, address, payload);
+	}
+
+	address = HWPfQmgrGrpPriority;
+	payload = ACC100_CFG_QMGR_HI_P;
+	acc100_reg_write(d, address, payload);
+
+	/* Template Configuration */
+	for (template_idx = 0; template_idx < ACC100_NUM_TMPL; template_idx++) {
+		payload = 0;
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD * template_idx;
+		acc100_reg_write(d, address, payload);
+	}
+	/* 4GUL */
+	int numQgs = conf->q_ul_4g.num_qgroups;
+	int numQqsAcc = 0;
+	payload = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		payload |= (1 << qg_idx);
+	for (template_idx = SIG_UL_4G; template_idx <= SIG_UL_4G_LAST;
+			template_idx++) {
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD*template_idx;
+		acc100_reg_write(d, address, payload);
+	}
+	/* 5GUL */
+	numQqsAcc += numQgs;
+	numQgs	= conf->q_ul_5g.num_qgroups;
+	payload = 0;
+	int numEngines = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		payload |= (1 << qg_idx);
+	for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
+			template_idx++) {
+		/* Check engine power-on status */
+		address = HwPfFecUl5gIbDebugReg +
+				ACC100_ENGINE_OFFSET * template_idx;
+		status = (acc100_reg_read(d, address) >> 4) & 0xF;
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD * template_idx;
+		if (status == 1) {
+			acc100_reg_write(d, address, payload);
+			numEngines++;
+		} else
+			acc100_reg_write(d, address, 0);
+		#if RTE_ACC100_SINGLE_FEC == 1
+		payload = 0;
+		#endif
+	}
+	printf("Number of 5GUL engines %d\n", numEngines);
+	/* 4GDL */
+	numQqsAcc += numQgs;
+	numQgs	= conf->q_dl_4g.num_qgroups;
+	payload = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		payload |= (1 << qg_idx);
+	for (template_idx = SIG_DL_4G; template_idx <= SIG_DL_4G_LAST;
+			template_idx++) {
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD*template_idx;
+		acc100_reg_write(d, address, payload);
+		#if RTE_ACC100_SINGLE_FEC == 1
+			payload = 0;
+		#endif
+	}
+	/* 5GDL */
+	numQqsAcc += numQgs;
+	numQgs	= conf->q_dl_5g.num_qgroups;
+	payload = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		payload |= (1 << qg_idx);
+	for (template_idx = SIG_DL_5G; template_idx <= SIG_DL_5G_LAST;
+			template_idx++) {
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD*template_idx;
+		acc100_reg_write(d, address, payload);
+		#if RTE_ACC100_SINGLE_FEC == 1
+		payload = 0;
+		#endif
+	}
+
+	/* Queue Group Function mapping */
+	int qman_func_id[5] = {0, 2, 1, 3, 4};
+	address = HWPfQmgrGrpFunction0;
+	payload = 0;
+	for (qg_idx = 0; qg_idx < 8; qg_idx++) {
+		acc = accFromQgid(qg_idx, conf);
+		payload |= qman_func_id[acc]<<(qg_idx * 4);
+	}
+	acc100_reg_write(d, address, payload);
+
+	/* Configuration of the Arbitration QGroup depth to 1 */
+	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
+		address = HWPfQmgrArbQDepthGrp +
+		BYTES_IN_WORD * qg_idx;
+		payload = 0;
+		acc100_reg_write(d, address, payload);
+	}
+
+	/* Enabling AQueues through the Queue hierarchy*/
+	for (vf_idx = 0; vf_idx < ACC100_NUM_VFS; vf_idx++) {
+		for (qg_idx = 0; qg_idx < ACC100_NUM_QGRPS; qg_idx++) {
+			payload = 0;
+			if (vf_idx < conf->num_vf_bundles &&
+					qg_idx < totalQgs)
+				payload = (1 << aqNum(qg_idx, conf)) - 1;
+			address = HWPfQmgrAqEnableVf
+					+ vf_idx * BYTES_IN_WORD;
+			payload += (qg_idx << 16);
+			acc100_reg_write(d, address, payload);
+		}
+	}
+
+	/* This pointer to ARAM (256kB) is shifted by 2 (4B per register) */
+	uint32_t aram_address = 0;
+	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
+		for (vf_idx = 0; vf_idx < conf->num_vf_bundles; vf_idx++) {
+			address = HWPfQmgrVfBaseAddr + vf_idx
+					* BYTES_IN_WORD + qg_idx
+					* BYTES_IN_WORD * 64;
+			payload = aram_address;
+			acc100_reg_write(d, address, payload);
+			/* Offset ARAM Address for next memory bank
+			 * - increment of 4B
+			 */
+			aram_address += aqNum(qg_idx, conf) *
+					(1 << aqDepth(qg_idx, conf));
+		}
+	}
+
+	if (aram_address > WORDS_IN_ARAM_SIZE) {
+		rte_bbdev_log(ERR, "ARAM Configuration not fitting %d %d\n",
+				aram_address, WORDS_IN_ARAM_SIZE);
+		return -EINVAL;
+	}
+
+	/* ==== HI Configuration ==== */
+
+	/* Prevent Block on Transmit Error */
+	address = HWPfHiBlockTransmitOnErrorEn;
+	payload = 0;
+	acc100_reg_write(d, address, payload);
+	/* Prevent MSI from being dropped */
+	address = HWPfHiMsiDropEnableReg;
+	payload = 0;
+	acc100_reg_write(d, address, payload);
+	/* Set the PF Mode register */
+	address = HWPfHiPfMode;
+	payload = (conf->pf_mode_en) ? 2 : 0;
+	acc100_reg_write(d, address, payload);
+	/* Enable Error Detection in HW */
+	address = HWPfDmaErrorDetectionEn;
+	payload = 0x3D7;
+	acc100_reg_write(d, address, payload);
+
+	/* QoS overflow init */
+	payload = 1;
+	address = HWPfQosmonAEvalOverflow0;
+	acc100_reg_write(d, address, payload);
+	address = HWPfQosmonBEvalOverflow0;
+	acc100_reg_write(d, address, payload);
+
+	/* HARQ DDR Configuration */
+	unsigned int ddrSizeInMb = 512; /* Fixed to 512 MB per VF for now */
+	for (vf_idx = 0; vf_idx < conf->num_vf_bundles; vf_idx++) {
+		address = HWPfDmaVfDdrBaseRw + vf_idx
+				* 0x10;
+		payload = ((vf_idx * (ddrSizeInMb / 64)) << 16) +
+				(ddrSizeInMb - 1);
+		acc100_reg_write(d, address, payload);
+	}
+	usleep(LONG_WAIT);
+
+	if (numEngines < (SIG_UL_5G_LAST + 1))
+		poweron_cleanup(bbdev, d, conf);
+
+	rte_bbdev_log_debug("PF Tip configuration complete for %s", dev_name);
+	return 0;
+}
diff --git a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
index 4a76d1d..91c234d 100644
--- a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
+++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
@@ -1,3 +1,10 @@
 DPDK_21 {
 	local: *;
 };
+
+EXPERIMENTAL {
+	global:
+
+	acc100_configure;
+
+};
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v5 11/11] doc: update bbdev feature table
  2020-09-23  2:12   ` [dpdk-dev] [PATCH v5 " Nicolas Chautru
                       ` (9 preceding siblings ...)
  2020-09-23  2:12     ` [dpdk-dev] [PATCH v5 10/11] baseband/acc100: add configure function Nicolas Chautru
@ 2020-09-23  2:12     ` Nicolas Chautru
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-23  2:12 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Correct the overview matrix to use the acc100 name.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 doc/guides/bbdevs/features/acc100.ini | 14 ++++++++++++++
 doc/guides/bbdevs/features/mbc.ini    | 14 --------------
 2 files changed, 14 insertions(+), 14 deletions(-)
 create mode 100644 doc/guides/bbdevs/features/acc100.ini
 delete mode 100644 doc/guides/bbdevs/features/mbc.ini

diff --git a/doc/guides/bbdevs/features/acc100.ini b/doc/guides/bbdevs/features/acc100.ini
new file mode 100644
index 0000000..642cd48
--- /dev/null
+++ b/doc/guides/bbdevs/features/acc100.ini
@@ -0,0 +1,14 @@
+;
+; Supported features of the 'acc100' bbdev driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+Turbo Decoder (4G)     = Y
+Turbo Encoder (4G)     = Y
+LDPC Decoder (5G)      = Y
+LDPC Encoder (5G)      = Y
+LLR/HARQ Compression   = Y
+External DDR Access    = Y
+HW Accelerated         = Y
+BBDEV API              = Y
diff --git a/doc/guides/bbdevs/features/mbc.ini b/doc/guides/bbdevs/features/mbc.ini
deleted file mode 100644
index 78a7b95..0000000
--- a/doc/guides/bbdevs/features/mbc.ini
+++ /dev/null
@@ -1,14 +0,0 @@
-;
-; Supported features of the 'mbc' bbdev driver.
-;
-; Refer to default.ini for the full list of available PMD features.
-;
-[Features]
-Turbo Decoder (4G)     = Y
-Turbo Encoder (4G)     = Y
-LDPC Decoder (5G)      = Y
-LDPC Encoder (5G)      = Y
-LLR/HARQ Compression   = Y
-External DDR Access    = Y
-HW Accelerated         = Y
-BBDEV API              = Y
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v6 00/11] bbdev PMD ACC100
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 11/11] doc: update bbdev feature table Nicolas Chautru
  2020-09-04 17:53   ` [dpdk-dev] [PATCH v4 00/11] bbdev PMD ACC100 Nicolas Chautru
  2020-09-23  2:12   ` [dpdk-dev] [PATCH v5 " Nicolas Chautru
@ 2020-09-23  2:19   ` Nicolas Chautru
  2020-09-23  2:19     ` [dpdk-dev] [PATCH v6 01/11] service: retrieve lcore active state Nicolas Chautru
                       ` (10 more replies)
  2020-09-23  2:24   ` [dpdk-dev] [PATCH v7 00/11] bbdev PMD ACC100 Nicolas Chautru
                     ` (5 subsequent siblings)
  8 siblings, 11 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-23  2:19 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

v6: removed a legacy makefile no longer required
v5: rebase based on latest on main. The legacy makefiles are removed. 
v4: an odd compilation error is reported for one CI variant using "gcc latest", which looks to me like a false positive of maybe-undeclared:
http://mails.dpdk.org/archives/test-report/2020-August/148936.html
Still forcing a dummy declare to remove this CI warning; I will check with ci@dpdk.org in parallel.
v3: missed a change during rebase
v2: includes clean up from latest CI checks.

This set includes a new PMD for the accelerator
ACC100 for 4G+5G FEC in 20.11. 
Documentation is updated as well accordingly.
Existing unit tests are all still supported.


Harry van Haaren (2):
  service: retrieve lcore active state
  test/service: fix race condition on stopping lcore

Nicolas Chautru (9):
  drivers/baseband: add PMD for ACC100
  baseband/acc100: add register definition file
  baseband/acc100: add info get function
  baseband/acc100: add queue configuration
  baseband/acc100: add LDPC processing functions
  baseband/acc100: add HARQ loopback support
  baseband/acc100: add support for 4G processing
  baseband/acc100: add interrupt support to PMD
  baseband/acc100: add debug function to validate input

 app/test-bbdev/meson.build                         |    3 +
 app/test/test_service_cores.c                      |   21 +-
 doc/guides/bbdevs/acc100.rst                       |  233 ++
 doc/guides/bbdevs/index.rst                        |    1 +
 doc/guides/rel_notes/release_20_11.rst             |    6 +
 drivers/baseband/acc100/acc100_pf_enum.h           | 1068 +++++
 drivers/baseband/acc100/acc100_vf_enum.h           |   73 +
 drivers/baseband/acc100/meson.build                |    6 +
 drivers/baseband/acc100/rte_acc100_cfg.h           |   96 +
 drivers/baseband/acc100/rte_acc100_pmd.c           | 4179 ++++++++++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.h           |  593 +++
 .../acc100/rte_pmd_bbdev_acc100_version.map        |    3 +
 drivers/baseband/meson.build                       |    2 +-
 lib/librte_eal/common/rte_service.c                |   21 +
 lib/librte_eal/include/rte_service.h               |   22 +-
 lib/librte_eal/rte_eal_version.map                 |    1 +
 16 files changed, 6325 insertions(+), 3 deletions(-)
 create mode 100644 doc/guides/bbdevs/acc100.rst
 create mode 100644 drivers/baseband/acc100/acc100_pf_enum.h
 create mode 100644 drivers/baseband/acc100/acc100_vf_enum.h
 create mode 100644 drivers/baseband/acc100/meson.build
 create mode 100644 drivers/baseband/acc100/rte_acc100_cfg.h
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.c
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.h
 create mode 100644 drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v6 01/11] service: retrieve lcore active state
  2020-09-23  2:19   ` [dpdk-dev] [PATCH v6 00/11] bbdev PMD ACC100 Nicolas Chautru
@ 2020-09-23  2:19     ` Nicolas Chautru
  2020-09-23  2:19     ` [dpdk-dev] [PATCH v6 02/11] test/service: fix race condition on stopping lcore Nicolas Chautru
                       ` (9 subsequent siblings)
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-23  2:19 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Harry van Haaren

From: Harry van Haaren <harry.van.haaren@intel.com>

This commit adds a new experimental API which allows the user
to retrieve the active state of an lcore. Knowing when a service
lcore has completed its polling loop can be useful to applications
that want to avoid race conditions when e.g. finalizing statistics.

The service thread itself now has a variable to indicate whether it
is active. When zero, the service thread has completed its service
and has returned from the service_runner_func() function.

Suggested-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
---
 lib/librte_eal/common/rte_service.c  | 21 +++++++++++++++++++++
 lib/librte_eal/include/rte_service.h | 22 +++++++++++++++++++++-
 lib/librte_eal/rte_eal_version.map   |  1 +
 3 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/rte_service.c b/lib/librte_eal/common/rte_service.c
index 6a0e0ff..98565bb 100644
--- a/lib/librte_eal/common/rte_service.c
+++ b/lib/librte_eal/common/rte_service.c
@@ -65,6 +65,7 @@ struct core_state {
 	/* map of services IDs are run on this core */
 	uint64_t service_mask;
 	uint8_t runstate; /* running or stopped */
+	uint8_t thread_active; /* indicates when thread is in service_run() */
 	uint8_t is_service_core; /* set if core is currently a service core */
 	uint8_t service_active_on_lcore[RTE_SERVICE_NUM_MAX];
 	uint64_t loops;
@@ -457,6 +458,8 @@ struct core_state {
 	const int lcore = rte_lcore_id();
 	struct core_state *cs = &lcore_states[lcore];
 
+	__atomic_store_n(&cs->thread_active, 1, __ATOMIC_SEQ_CST);
+
 	/* runstate act as the guard variable. Use load-acquire
 	 * memory order here to synchronize with store-release
 	 * in runstate update functions.
@@ -475,10 +478,28 @@ struct core_state {
 		cs->loops++;
 	}
 
+	/* Use SEQ CST memory ordering to avoid any re-ordering around
+	 * this store, ensuring that once this store is visible, the service
+	 * lcore thread really is done in service cores code.
+	 */
+	__atomic_store_n(&cs->thread_active, 0, __ATOMIC_SEQ_CST);
 	return 0;
 }
 
 int32_t
+rte_service_lcore_may_be_active(uint32_t lcore)
+{
+	if (lcore >= RTE_MAX_LCORE || !lcore_states[lcore].is_service_core)
+		return -EINVAL;
+
+	/* Load thread_active using ACQUIRE to avoid instructions dependent on
+	 * the result being re-ordered before this load completes.
+	 */
+	return __atomic_load_n(&lcore_states[lcore].thread_active,
+			       __ATOMIC_ACQUIRE);
+}
+
+int32_t
 rte_service_lcore_count(void)
 {
 	int32_t count = 0;
diff --git a/lib/librte_eal/include/rte_service.h b/lib/librte_eal/include/rte_service.h
index e2d0a6d..ca9950d 100644
--- a/lib/librte_eal/include/rte_service.h
+++ b/lib/librte_eal/include/rte_service.h
@@ -249,7 +249,11 @@ int32_t rte_service_run_iter_on_app_lcore(uint32_t id,
  * Stop a service core.
  *
  * Stopping a core makes the core become idle, but remains  assigned as a
- * service core.
+ * service core. Note that the service lcore thread may not have returned from
+ * the service it is running when this API returns.
+ *
+ * The *rte_service_lcore_may_be_active* API can be used to check if the
+ * service lcore is still active.
  *
  * @retval 0 Success
  * @retval -EINVAL Invalid *lcore_id* provided
@@ -262,6 +266,22 @@ int32_t rte_service_run_iter_on_app_lcore(uint32_t id,
 int32_t rte_service_lcore_stop(uint32_t lcore_id);
 
 /**
+ * Reports if a service lcore is currently running.
+ *
+ * This function reports whether the lcore has finished running service code
+ * and has returned to EAL control. If *rte_service_lcore_stop* has been called
+ * but the lcore has not returned to EAL yet, it might be required to wait and call
+ * this function again. The amount of time to wait before the core returns
+ * depends on the duration of the services being run.
+ *
+ * @retval 0 Service thread is not active, and lcore has been returned to EAL.
+ * @retval 1 Service thread is in the service core polling loop.
+ * @retval -EINVAL Invalid *lcore_id* provided.
+ */
+__rte_experimental
+int32_t rte_service_lcore_may_be_active(uint32_t lcore_id);
+
+/**
  * Adds lcore to the list of service cores.
  *
  * This functions can be used at runtime in order to modify the service core
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index eba868e..c32461c 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -394,6 +394,7 @@ EXPERIMENTAL {
 	rte_lcore_dump;
 	rte_lcore_iterate;
 	rte_mp_disable;
+	rte_service_lcore_may_be_active;
 	rte_thread_register;
 	rte_thread_unregister;
 };
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v6 02/11] test/service: fix race condition on stopping lcore
  2020-09-23  2:19   ` [dpdk-dev] [PATCH v6 00/11] bbdev PMD ACC100 Nicolas Chautru
  2020-09-23  2:19     ` [dpdk-dev] [PATCH v6 01/11] service: retrieve lcore active state Nicolas Chautru
@ 2020-09-23  2:19     ` Nicolas Chautru
  2020-09-23  2:19     ` [dpdk-dev] [PATCH v6 03/11] drivers/baseband: add PMD for ACC100 Nicolas Chautru
                       ` (8 subsequent siblings)
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-23  2:19 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Harry van Haaren

From: Harry van Haaren <harry.van.haaren@intel.com>

This commit fixes a potential race condition in the tests
where the lcore running a service would increment a counter
that was already reset by the test-suite thread. The resulting
incorrect counter value could cause CI failures, as
indicated by DPDK's CI.

This patch fixes the race condition by making use of the
newly added rte_service_lcore_may_be_active() API, which
indicates when a service core is no longer in its polling loop.

The unit test makes use of the above function to detect when
all statistics increments are done in the service-core thread,
and then the unit test continues finalizing and checking state.

Fixes: f28f3594ded2 ("service: add attribute API")

Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
---
 app/test/test_service_cores.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/app/test/test_service_cores.c b/app/test/test_service_cores.c
index ef1d8fc..5d92bea 100644
--- a/app/test/test_service_cores.c
+++ b/app/test/test_service_cores.c
@@ -362,6 +362,9 @@ static int32_t dummy_mt_safe_cb(void *args)
 			"Service core add did not return zero");
 	TEST_ASSERT_EQUAL(0, rte_service_map_lcore_set(id, slcore_id, 1),
 			"Enabling valid service and core failed");
+	/* Ensure service is not active before starting */
+	TEST_ASSERT_EQUAL(0, rte_service_lcore_may_be_active(slcore_id),
+			"Not-active service core reported as active");
 	TEST_ASSERT_EQUAL(0, rte_service_lcore_start(slcore_id),
 			"Starting service core failed");
 
@@ -382,7 +385,23 @@ static int32_t dummy_mt_safe_cb(void *args)
 			lcore_attr_id, &lcore_attr_value),
 			"Invalid lcore attr didn't return -EINVAL");
 
-	rte_service_lcore_stop(slcore_id);
+	/* Ensure service is active */
+	TEST_ASSERT_EQUAL(1, rte_service_lcore_may_be_active(slcore_id),
+			"Active service core reported as not-active");
+
+	TEST_ASSERT_EQUAL(0, rte_service_map_lcore_set(id, slcore_id, 0),
+			"Disabling valid service and core failed");
+	TEST_ASSERT_EQUAL(0, rte_service_lcore_stop(slcore_id),
+			"Failed to stop service lcore");
+
+	/* Wait until service lcore not active, or for 100x SERVICE_DELAY */
+	int i;
+	for (i = 0; rte_service_lcore_may_be_active(slcore_id) == 1 &&
+			i < 100; i++)
+		rte_delay_ms(SERVICE_DELAY);
+
+	TEST_ASSERT_EQUAL(0, rte_service_lcore_may_be_active(slcore_id),
+			  "Service lcore not stopped after waiting.");
 
 	TEST_ASSERT_EQUAL(0, rte_service_lcore_attr_reset_all(slcore_id),
 			  "Valid lcore_attr_reset_all() didn't return success");
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v6 03/11] drivers/baseband: add PMD for ACC100
  2020-09-23  2:19   ` [dpdk-dev] [PATCH v6 00/11] bbdev PMD ACC100 Nicolas Chautru
  2020-09-23  2:19     ` [dpdk-dev] [PATCH v6 01/11] service: retrieve lcore active state Nicolas Chautru
  2020-09-23  2:19     ` [dpdk-dev] [PATCH v6 02/11] test/service: fix race condition on stopping lcore Nicolas Chautru
@ 2020-09-23  2:19     ` Nicolas Chautru
  2020-09-23  2:19     ` [dpdk-dev] [PATCH v6 04/11] baseband/acc100: add register definition file Nicolas Chautru
                       ` (7 subsequent siblings)
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-23  2:19 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add stubs for the ACC100 PMD

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 doc/guides/bbdevs/acc100.rst                       | 233 +++++++++++++++++++++
 doc/guides/bbdevs/index.rst                        |   1 +
 doc/guides/rel_notes/release_20_11.rst             |   6 +
 drivers/baseband/acc100/meson.build                |   6 +
 drivers/baseband/acc100/rte_acc100_pmd.c           | 175 ++++++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.h           |  37 ++++
 .../acc100/rte_pmd_bbdev_acc100_version.map        |   3 +
 drivers/baseband/meson.build                       |   2 +-
 8 files changed, 462 insertions(+), 1 deletion(-)
 create mode 100644 doc/guides/bbdevs/acc100.rst
 create mode 100644 drivers/baseband/acc100/meson.build
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.c
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.h
 create mode 100644 drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map

diff --git a/doc/guides/bbdevs/acc100.rst b/doc/guides/bbdevs/acc100.rst
new file mode 100644
index 0000000..f87ee09
--- /dev/null
+++ b/doc/guides/bbdevs/acc100.rst
@@ -0,0 +1,233 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2020 Intel Corporation
+
+Intel(R) ACC100 5G/4G FEC Poll Mode Driver
+==========================================
+
+The BBDEV ACC100 5G/4G FEC poll mode driver (PMD) supports an
+implementation of a VRAN FEC wireless acceleration function.
+This device is also known as Mount Bryce.
+
+Features
+--------
+
+ACC100 5G/4G FEC PMD supports the following features:
+
+- LDPC Encode in the DL (5GNR)
+- LDPC Decode in the UL (5GNR)
+- Turbo Encode in the DL (4G)
+- Turbo Decode in the UL (4G)
+- 16 VFs per PF (physical device)
+- Maximum of 128 queues per VF
+- PCIe Gen-3 x16 Interface
+- MSI
+- SR-IOV
+
+ACC100 5G/4G FEC PMD supports the following BBDEV capabilities:
+
+* For the LDPC encode operation:
+   - ``RTE_BBDEV_LDPC_CRC_24B_ATTACH`` :  set to attach CRC24B to CB(s)
+   - ``RTE_BBDEV_LDPC_RATE_MATCH`` :  if set then do not do Rate Match bypass
+   - ``RTE_BBDEV_LDPC_INTERLEAVER_BYPASS`` : if set then bypass interleaver
+
+* For the LDPC decode operation:
+   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK`` :  check CRC24B from CB(s)
+   - ``RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE`` :  disable early termination
+   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP`` :  drops CRC24B bits appended while decoding
+   - ``RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE`` :  provides an input for HARQ combining
+   - ``RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE`` :  provides an output for HARQ combining
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE`` :  HARQ memory input is internal
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE`` :  HARQ memory output is internal
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK`` :  loopback data to/from HARQ memory
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_FILLERS`` :  HARQ memory includes the fillers bits
+   - ``RTE_BBDEV_LDPC_DEC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
+   - ``RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION`` :  supports compression of the HARQ input/output
+   - ``RTE_BBDEV_LDPC_LLR_COMPRESSION`` :  supports LLR input compression
+
+* For the turbo encode operation:
+   - ``RTE_BBDEV_TURBO_CRC_24B_ATTACH`` :  set to attach CRC24B to CB(s)
+   - ``RTE_BBDEV_TURBO_RATE_MATCH`` :  if set then do not do Rate Match bypass
+   - ``RTE_BBDEV_TURBO_ENC_INTERRUPTS`` :  set for encoder dequeue interrupts
+   - ``RTE_BBDEV_TURBO_RV_INDEX_BYPASS`` :  set to bypass RV index
+   - ``RTE_BBDEV_TURBO_ENC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
+
+* For the turbo decode operation:
+   - ``RTE_BBDEV_TURBO_CRC_TYPE_24B`` :  check CRC24B from CB(s)
+   - ``RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE`` :  perform subblock de-interleave
+   - ``RTE_BBDEV_TURBO_DEC_INTERRUPTS`` :  set for decoder dequeue interrupts
+   - ``RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN`` :  set if negative LLR encoder i/p is supported
+   - ``RTE_BBDEV_TURBO_POS_LLR_1_BIT_IN`` :  set if positive LLR encoder i/p is supported
+   - ``RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP`` :  keep CRC24B bits appended while decoding
+   - ``RTE_BBDEV_TURBO_EARLY_TERMINATION`` :  set the early termination feature
+   - ``RTE_BBDEV_TURBO_DEC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
+   - ``RTE_BBDEV_TURBO_HALF_ITERATION_EVEN`` :  set half iteration granularity
+
+Installation
+------------
+
+Section 3 of the DPDK manual provides instructions on installing and compiling DPDK. The
+default set of bbdev compile flags may be found in config/common_base, where for example
+the flag to build the ACC100 5G/4G FEC device, ``CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100``,
+is already set.
+
+DPDK requires hugepages to be configured as detailed in section 2 of the DPDK manual.
+The bbdev test application has been tested with a configuration of 40 x 1GB hugepages. The
+hugepage configuration of a server may be examined using:
+
+.. code-block:: console
+
+   grep Huge* /proc/meminfo
+
+
+Initialization
+--------------
+
+When the device first powers up, its PCI Physical Functions (PF) can be listed through this command:
+
+.. code-block:: console
+
+  sudo lspci -vd8086:0d5c
+
+The physical and virtual functions are compatible with Linux UIO drivers:
+``vfio`` and ``igb_uio``. However, in order to work, the ACC100 5G/4G
+FEC device first needs to be bound to one of these Linux drivers through DPDK.
+
+
+Bind PF UIO driver(s)
+~~~~~~~~~~~~~~~~~~~~~
+
+Install the DPDK igb_uio driver, bind it with the PF PCI device ID and use
+``lspci`` to confirm the PF device is in use by the ``igb_uio`` DPDK UIO driver.
+
+The igb_uio driver may be bound to the PF PCI device using one of three methods:
+
+
+1. PCI functions (physical or virtual, depending on the use case) can be bound to
+the UIO driver by repeating this command for every function.
+
+.. code-block:: console
+
+  cd <dpdk-top-level-directory>
+  insmod ./build/kmod/igb_uio.ko
+  echo "8086 0d5c" > /sys/bus/pci/drivers/igb_uio/new_id
+  lspci -vd8086:0d5c
+
+
+2. Another way to bind PF with DPDK UIO driver is by using the ``dpdk-devbind.py`` tool
+
+.. code-block:: console
+
+  cd <dpdk-top-level-directory>
+  ./usertools/dpdk-devbind.py -b igb_uio 0000:06:00.0
+
+where the PCI device ID (example: 0000:06:00.0) is obtained using ``lspci -vd8086:0d5c``.
+
+
+3. A third way to bind is to use ``dpdk-setup.sh`` tool
+
+.. code-block:: console
+
+  cd <dpdk-top-level-directory>
+  ./usertools/dpdk-setup.sh
+
+  select 'Bind Ethernet/Crypto/Baseband device to IGB UIO module'
+  or
+  select 'Bind Ethernet/Crypto/Baseband device to VFIO module' depending on driver required
+  enter PCI device ID
+  select 'Display current Ethernet/Crypto/Baseband device settings' to confirm binding
+
+
+In the same way, the ACC100 5G/4G FEC PF can be bound with vfio; however, the vfio
+driver does not support SR-IOV configuration out of the box, so it will need to be patched.
+
+
+Enable Virtual Functions
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Now it should be visible in the printouts that the PCI PF is under igb_uio control:
+"``Kernel driver in use: igb_uio``"
+
+To show the number of available VFs on the device, read the ``sriov_totalvfs`` file:
+
+.. code-block:: console
+
+  cat /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_totalvfs
+
+  where 0000\:<b>\:<d>.<f> is the PCI device ID
+
+
+To enable VFs via igb_uio, write the number of virtual functions intended to
+be enabled to the ``max_vfs`` file:
+
+.. code-block:: console
+
+  echo <num-of-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/max_vfs
+
+
+Afterwards, all VFs must be bound to the appropriate UIO drivers as required, in the
+same way as was done with the physical function previously.
+
+Enabling SR-IOV via the vfio driver is much the same, except that the file
+name is different:
+
+.. code-block:: console
+
+  echo <num-of-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_numvfs
+
+
+Configure the VFs through PF
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The PCI virtual functions must be configured before working or getting assigned
+to VMs/Containers. The configuration involves allocating the number of hardware
+queues, priorities, load balance, bandwidth and other settings necessary for the
+device to perform FEC functions.
+
+This configuration needs to be executed at least once after reboot or PCI FLR and can
+be achieved by using the function ``acc100_configure()``, which sets up the
+parameters defined in ``acc100_conf`` structure.
+
+Test Application
+----------------
+
+BBDEV provides a test application, ``test-bbdev.py``, and a range of test data for testing
+the functionality of ACC100 5G/4G FEC encode and decode, depending on the device's
+capabilities. The test application is located under the app/test-bbdev folder and has the
+following options:
+
+.. code-block:: console
+
+  "-p", "--testapp-path": specifies path to the bbdev test app.
+  "-e", "--eal-params"	: EAL arguments which are passed to the test app.
+  "-t", "--timeout"	: Timeout in seconds (default=300).
+  "-c", "--test-cases"	: Defines test cases to run. Run all if not specified.
+  "-v", "--test-vector"	: Test vector path (default=dpdk_path+/app/test-bbdev/test_vectors/bbdev_null.data).
+  "-n", "--num-ops"	: Number of operations to process on device (default=32).
+  "-b", "--burst-size"	: Operations enqueue/dequeue burst size (default=32).
+  "-s", "--snr"		: SNR in dB used when generating LLRs for bler tests.
+  "-s", "--iter_max"	: Number of iterations for LDPC decoder.
+  "-l", "--num-lcores"	: Number of lcores to run (default=16).
+  "-i", "--init-device" : Initialise PF device with default values.
+
+
+To execute the test application tool using simple decode or encode data,
+type one of the following:
+
+.. code-block:: console
+
+  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_dec_default.data
+  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_enc_default.data
+
+
+The test application ``test-bbdev.py`` supports the ability to configure the PF device with
+a default set of values, if the ``-i`` or ``--init-device`` option is included. The default
+values are defined in test_bbdev_perf.c.
+
+
+Test Vectors
+~~~~~~~~~~~~
+
+In addition to the simple LDPC decoder and LDPC encoder tests, bbdev also provides
+a range of additional tests under the test_vectors folder, which may be useful. The results
+of these tests will depend on the ACC100 5G/4G FEC capabilities, which may cause some
+test cases to be skipped, but no failure should be reported.
diff --git a/doc/guides/bbdevs/index.rst b/doc/guides/bbdevs/index.rst
index a8092dd..4445cbd 100644
--- a/doc/guides/bbdevs/index.rst
+++ b/doc/guides/bbdevs/index.rst
@@ -13,3 +13,4 @@ Baseband Device Drivers
     turbo_sw
     fpga_lte_fec
     fpga_5gnr_fec
+    acc100
diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
index 73ac08f..20639ea 100644
--- a/doc/guides/rel_notes/release_20_11.rst
+++ b/doc/guides/rel_notes/release_20_11.rst
@@ -55,6 +55,12 @@ New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+* **Added Intel ACC100 bbdev PMD.**
+
+  Added a new ``acc100`` bbdev driver for the Intel\ |reg| ACC100 accelerator,
+  also known as Mount Bryce. See the
+  :doc:`../bbdevs/acc100` BBDEV guide for more details on this new driver.
+
 
 Removed Items
 -------------
diff --git a/drivers/baseband/acc100/meson.build b/drivers/baseband/acc100/meson.build
new file mode 100644
index 0000000..8afafc2
--- /dev/null
+++ b/drivers/baseband/acc100/meson.build
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2020 Intel Corporation
+
+deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci']
+
+sources = files('rte_acc100_pmd.c')
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
new file mode 100644
index 0000000..1b4cd13
--- /dev/null
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -0,0 +1,175 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include <unistd.h>
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_dev.h>
+#include <rte_malloc.h>
+#include <rte_mempool.h>
+#include <rte_byteorder.h>
+#include <rte_errno.h>
+#include <rte_branch_prediction.h>
+#include <rte_hexdump.h>
+#include <rte_pci.h>
+#include <rte_bus_pci.h>
+
+#include <rte_bbdev.h>
+#include <rte_bbdev_pmd.h>
+#include "rte_acc100_pmd.h"
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, DEBUG);
+#else
+RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
+#endif
+
+/* Free 64MB memory used for software rings */
+static int
+acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
+{
+	return 0;
+}
+
+static const struct rte_bbdev_ops acc100_bbdev_ops = {
+	.close = acc100_dev_close,
+};
+
+/* ACC100 PCI PF address map */
+static struct rte_pci_id pci_id_acc100_pf_map[] = {
+	{
+		RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID, RTE_ACC100_PF_DEVICE_ID)
+	},
+	{.device_id = 0},
+};
+
+/* ACC100 PCI VF address map */
+static struct rte_pci_id pci_id_acc100_vf_map[] = {
+	{
+		RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID, RTE_ACC100_VF_DEVICE_ID)
+	},
+	{.device_id = 0},
+};
+
+/* Initialization Function */
+static void
+acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
+{
+	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
+
+	dev->dev_ops = &acc100_bbdev_ops;
+
+	((struct acc100_device *) dev->data->dev_private)->pf_device =
+			!strcmp(drv->driver.name,
+					RTE_STR(ACC100PF_DRIVER_NAME));
+	((struct acc100_device *) dev->data->dev_private)->mmio_base =
+			pci_dev->mem_resource[0].addr;
+
+	rte_bbdev_log_debug("Init device %s [%s] @ vaddr %p paddr %#"PRIx64"",
+			drv->driver.name, dev->data->name,
+			(void *)pci_dev->mem_resource[0].addr,
+			pci_dev->mem_resource[0].phys_addr);
+}
+
+static int acc100_pci_probe(struct rte_pci_driver *pci_drv,
+	struct rte_pci_device *pci_dev)
+{
+	struct rte_bbdev *bbdev = NULL;
+	char dev_name[RTE_BBDEV_NAME_MAX_LEN];
+
+	if (pci_dev == NULL) {
+		rte_bbdev_log(ERR, "NULL PCI device");
+		return -EINVAL;
+	}
+
+	rte_pci_device_name(&pci_dev->addr, dev_name, sizeof(dev_name));
+
+	/* Allocate memory to be used privately by drivers */
+	bbdev = rte_bbdev_allocate(pci_dev->device.name);
+	if (bbdev == NULL)
+		return -ENODEV;
+
+	/* allocate device private memory */
+	bbdev->data->dev_private = rte_zmalloc_socket(dev_name,
+			sizeof(struct acc100_device), RTE_CACHE_LINE_SIZE,
+			pci_dev->device.numa_node);
+
+	if (bbdev->data->dev_private == NULL) {
+		rte_bbdev_log(CRIT,
+				"Allocate of %zu bytes for device \"%s\" failed",
+				sizeof(struct acc100_device), dev_name);
+				rte_bbdev_release(bbdev);
+			return -ENOMEM;
+	}
+
+	/* Fill HW specific part of device structure */
+	bbdev->device = &pci_dev->device;
+	bbdev->intr_handle = &pci_dev->intr_handle;
+	bbdev->data->socket_id = pci_dev->device.numa_node;
+
+	/* Invoke ACC100 device initialization function */
+	acc100_bbdev_init(bbdev, pci_drv);
+
+	rte_bbdev_log_debug("Initialised bbdev %s (id = %u)",
+			dev_name, bbdev->data->dev_id);
+	return 0;
+}
+
+static int acc100_pci_remove(struct rte_pci_device *pci_dev)
+{
+	struct rte_bbdev *bbdev;
+	int ret;
+	uint8_t dev_id;
+
+	if (pci_dev == NULL)
+		return -EINVAL;
+
+	/* Find device */
+	bbdev = rte_bbdev_get_named_dev(pci_dev->device.name);
+	if (bbdev == NULL) {
+		rte_bbdev_log(CRIT,
+				"Couldn't find HW dev \"%s\" to uninitialise it",
+				pci_dev->device.name);
+		return -ENODEV;
+	}
+	dev_id = bbdev->data->dev_id;
+
+	/* free device private memory before close */
+	rte_free(bbdev->data->dev_private);
+
+	/* Close device */
+	ret = rte_bbdev_close(dev_id);
+	if (ret < 0)
+		rte_bbdev_log(ERR,
+				"Device %i failed to close during uninit: %i",
+				dev_id, ret);
+
+	/* release bbdev from library */
+	rte_bbdev_release(bbdev);
+
+	rte_bbdev_log_debug("Destroyed bbdev = %u", dev_id);
+
+	return 0;
+}
+
+static struct rte_pci_driver acc100_pci_pf_driver = {
+		.probe = acc100_pci_probe,
+		.remove = acc100_pci_remove,
+		.id_table = pci_id_acc100_pf_map,
+		.drv_flags = RTE_PCI_DRV_NEED_MAPPING
+};
+
+static struct rte_pci_driver acc100_pci_vf_driver = {
+		.probe = acc100_pci_probe,
+		.remove = acc100_pci_remove,
+		.id_table = pci_id_acc100_vf_map,
+		.drv_flags = RTE_PCI_DRV_NEED_MAPPING
+};
+
+RTE_PMD_REGISTER_PCI(ACC100PF_DRIVER_NAME, acc100_pci_pf_driver);
+RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
+RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
+RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
+
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
new file mode 100644
index 0000000..6f46df0
--- /dev/null
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -0,0 +1,37 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_ACC100_PMD_H_
+#define _RTE_ACC100_PMD_H_
+
+/* Helper macro for logging */
+#define rte_bbdev_log(level, fmt, ...) \
+	rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \
+		##__VA_ARGS__)
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+#define rte_bbdev_log_debug(fmt, ...) \
+		rte_bbdev_log(DEBUG, "acc100_pmd: " fmt, \
+		##__VA_ARGS__)
+#else
+#define rte_bbdev_log_debug(fmt, ...)
+#endif
+
+/* ACC100 PF and VF driver names */
+#define ACC100PF_DRIVER_NAME           intel_acc100_pf
+#define ACC100VF_DRIVER_NAME           intel_acc100_vf
+
+/* ACC100 PCI vendor & device IDs */
+#define RTE_ACC100_VENDOR_ID           (0x8086)
+#define RTE_ACC100_PF_DEVICE_ID        (0x0d5c)
+#define RTE_ACC100_VF_DEVICE_ID        (0x0d5d)
+
+/* Private data structure for each ACC100 device */
+struct acc100_device {
+	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
+	bool pf_device; /**< True if this is a PF ACC100 device */
+	bool configured; /**< True if this ACC100 device is configured */
+};
+
+#endif /* _RTE_ACC100_PMD_H_ */
diff --git a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
new file mode 100644
index 0000000..4a76d1d
--- /dev/null
+++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
@@ -0,0 +1,3 @@
+DPDK_21 {
+	local: *;
+};
diff --git a/drivers/baseband/meson.build b/drivers/baseband/meson.build
index 415b672..72301ce 100644
--- a/drivers/baseband/meson.build
+++ b/drivers/baseband/meson.build
@@ -5,7 +5,7 @@ if is_windows
 	subdir_done()
 endif
 
-drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec']
+drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec', 'acc100']
 
 config_flag_fmt = 'RTE_LIBRTE_PMD_BBDEV_@0@'
 driver_name_fmt = 'rte_pmd_bbdev_@0@'
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v6 04/11] baseband/acc100: add register definition file
  2020-09-23  2:19   ` [dpdk-dev] [PATCH v6 00/11] bbdev PMD ACC100 Nicolas Chautru
                       ` (2 preceding siblings ...)
  2020-09-23  2:19     ` [dpdk-dev] [PATCH v6 03/11] drivers/baseband: add PMD for ACC100 Nicolas Chautru
@ 2020-09-23  2:19     ` Nicolas Chautru
  2020-09-23  2:19     ` [dpdk-dev] [PATCH v6 05/11] baseband/acc100: add info get function Nicolas Chautru
                       ` (6 subsequent siblings)
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-23  2:19 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add the list of registers for the device and related
HW spec definitions.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 drivers/baseband/acc100/acc100_pf_enum.h | 1068 ++++++++++++++++++++++++++++++
 drivers/baseband/acc100/acc100_vf_enum.h |   73 ++
 drivers/baseband/acc100/rte_acc100_pmd.h |  490 ++++++++++++++
 3 files changed, 1631 insertions(+)
 create mode 100644 drivers/baseband/acc100/acc100_pf_enum.h
 create mode 100644 drivers/baseband/acc100/acc100_vf_enum.h

diff --git a/drivers/baseband/acc100/acc100_pf_enum.h b/drivers/baseband/acc100/acc100_pf_enum.h
new file mode 100644
index 0000000..a1ee416
--- /dev/null
+++ b/drivers/baseband/acc100/acc100_pf_enum.h
@@ -0,0 +1,1068 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017 Intel Corporation
+ */
+
+#ifndef ACC100_PF_ENUM_H
+#define ACC100_PF_ENUM_H
+
+/*
+ * ACC100 Register mapping on PF BAR0
+ * This is automatically generated from RDL, format may change with new RDL
+ * Release.
+ * Variable names are as is
+ */
+enum {
+	HWPfQmgrEgressQueuesTemplate          =  0x0007FE00,
+	HWPfQmgrIngressAq                     =  0x00080000,
+	HWPfQmgrArbQAvail                     =  0x00A00010,
+	HWPfQmgrArbQBlock                     =  0x00A00014,
+	HWPfQmgrAqueueDropNotifEn             =  0x00A00024,
+	HWPfQmgrAqueueDisableNotifEn          =  0x00A00028,
+	HWPfQmgrSoftReset                     =  0x00A00038,
+	HWPfQmgrInitStatus                    =  0x00A0003C,
+	HWPfQmgrAramWatchdogCount             =  0x00A00040,
+	HWPfQmgrAramWatchdogCounterEn         =  0x00A00044,
+	HWPfQmgrAxiWatchdogCount              =  0x00A00048,
+	HWPfQmgrAxiWatchdogCounterEn          =  0x00A0004C,
+	HWPfQmgrProcessWatchdogCount          =  0x00A00050,
+	HWPfQmgrProcessWatchdogCounterEn      =  0x00A00054,
+	HWPfQmgrProcessUl4GWatchdogCounter    =  0x00A00058,
+	HWPfQmgrProcessDl4GWatchdogCounter    =  0x00A0005C,
+	HWPfQmgrProcessUl5GWatchdogCounter    =  0x00A00060,
+	HWPfQmgrProcessDl5GWatchdogCounter    =  0x00A00064,
+	HWPfQmgrProcessMldWatchdogCounter     =  0x00A00068,
+	HWPfQmgrMsiOverflowUpperVf            =  0x00A00070,
+	HWPfQmgrMsiOverflowLowerVf            =  0x00A00074,
+	HWPfQmgrMsiWatchdogOverflow           =  0x00A00078,
+	HWPfQmgrMsiOverflowEnable             =  0x00A0007C,
+	HWPfQmgrDebugAqPointerMemGrp          =  0x00A00100,
+	HWPfQmgrDebugOutputArbQFifoGrp        =  0x00A00140,
+	HWPfQmgrDebugMsiFifoGrp               =  0x00A00180,
+	HWPfQmgrDebugAxiWdTimeoutMsiFifo      =  0x00A001C0,
+	HWPfQmgrDebugProcessWdTimeoutMsiFifo  =  0x00A001C4,
+	HWPfQmgrDepthLog2Grp                  =  0x00A00200,
+	HWPfQmgrTholdGrp                      =  0x00A00300,
+	HWPfQmgrGrpTmplateReg0Indx            =  0x00A00600,
+	HWPfQmgrGrpTmplateReg1Indx            =  0x00A00680,
+	HWPfQmgrGrpTmplateReg2indx            =  0x00A00700,
+	HWPfQmgrGrpTmplateReg3Indx            =  0x00A00780,
+	HWPfQmgrGrpTmplateReg4Indx            =  0x00A00800,
+	HWPfQmgrVfBaseAddr                    =  0x00A01000,
+	HWPfQmgrUl4GWeightRrVf                =  0x00A02000,
+	HWPfQmgrDl4GWeightRrVf                =  0x00A02100,
+	HWPfQmgrUl5GWeightRrVf                =  0x00A02200,
+	HWPfQmgrDl5GWeightRrVf                =  0x00A02300,
+	HWPfQmgrMldWeightRrVf                 =  0x00A02400,
+	HWPfQmgrArbQDepthGrp                  =  0x00A02F00,
+	HWPfQmgrGrpFunction0                  =  0x00A02F40,
+	HWPfQmgrGrpFunction1                  =  0x00A02F44,
+	HWPfQmgrGrpPriority                   =  0x00A02F48,
+	HWPfQmgrWeightSync                    =  0x00A03000,
+	HWPfQmgrAqEnableVf                    =  0x00A10000,
+	HWPfQmgrAqResetVf                     =  0x00A20000,
+	HWPfQmgrRingSizeVf                    =  0x00A20004,
+	HWPfQmgrGrpDepthLog20Vf               =  0x00A20008,
+	HWPfQmgrGrpDepthLog21Vf               =  0x00A2000C,
+	HWPfQmgrGrpFunction0Vf                =  0x00A20010,
+	HWPfQmgrGrpFunction1Vf                =  0x00A20014,
+	HWPfDmaConfig0Reg                     =  0x00B80000,
+	HWPfDmaConfig1Reg                     =  0x00B80004,
+	HWPfDmaQmgrAddrReg                    =  0x00B80008,
+	HWPfDmaSoftResetReg                   =  0x00B8000C,
+	HWPfDmaAxcacheReg                     =  0x00B80010,
+	HWPfDmaVersionReg                     =  0x00B80014,
+	HWPfDmaFrameThreshold                 =  0x00B80018,
+	HWPfDmaTimestampLo                    =  0x00B8001C,
+	HWPfDmaTimestampHi                    =  0x00B80020,
+	HWPfDmaAxiStatus                      =  0x00B80028,
+	HWPfDmaAxiControl                     =  0x00B8002C,
+	HWPfDmaNoQmgr                         =  0x00B80030,
+	HWPfDmaQosScale                       =  0x00B80034,
+	HWPfDmaQmanen                         =  0x00B80040,
+	HWPfDmaQmgrQosBase                    =  0x00B80060,
+	HWPfDmaFecClkGatingEnable             =  0x00B80080,
+	HWPfDmaPmEnable                       =  0x00B80084,
+	HWPfDmaQosEnable                      =  0x00B80088,
+	HWPfDmaHarqWeightedRrFrameThreshold   =  0x00B800B0,
+	HWPfDmaDataSmallWeightedRrFrameThresh =  0x00B800B4,
+	HWPfDmaDataLargeWeightedRrFrameThresh =  0x00B800B8,
+	HWPfDmaInboundCbMaxSize               =  0x00B800BC,
+	HWPfDmaInboundDrainDataSize           =  0x00B800C0,
+	HWPfDmaVfDdrBaseRw                    =  0x00B80400,
+	HWPfDmaCmplTmOutCnt                   =  0x00B80800,
+	HWPfDmaProcTmOutCnt                   =  0x00B80804,
+	HWPfDmaStatusRrespBresp               =  0x00B80810,
+	HWPfDmaCfgRrespBresp                  =  0x00B80814,
+	HWPfDmaStatusMemParErr                =  0x00B80818,
+	HWPfDmaCfgMemParErrEn                 =  0x00B8081C,
+	HWPfDmaStatusDmaHwErr                 =  0x00B80820,
+	HWPfDmaCfgDmaHwErrEn                  =  0x00B80824,
+	HWPfDmaStatusFecCoreErr               =  0x00B80828,
+	HWPfDmaCfgFecCoreErrEn                =  0x00B8082C,
+	HWPfDmaStatusFcwDescrErr              =  0x00B80830,
+	HWPfDmaCfgFcwDescrErrEn               =  0x00B80834,
+	HWPfDmaStatusBlockTransmit            =  0x00B80838,
+	HWPfDmaBlockOnErrEn                   =  0x00B8083C,
+	HWPfDmaStatusFlushDma                 =  0x00B80840,
+	HWPfDmaFlushDmaOnErrEn                =  0x00B80844,
+	HWPfDmaStatusSdoneFifoFull            =  0x00B80848,
+	HWPfDmaStatusDescriptorErrLoVf        =  0x00B8084C,
+	HWPfDmaStatusDescriptorErrHiVf        =  0x00B80850,
+	HWPfDmaStatusFcwErrLoVf               =  0x00B80854,
+	HWPfDmaStatusFcwErrHiVf               =  0x00B80858,
+	HWPfDmaStatusDataErrLoVf              =  0x00B8085C,
+	HWPfDmaStatusDataErrHiVf              =  0x00B80860,
+	HWPfDmaCfgMsiEnSoftwareErr            =  0x00B80864,
+	HWPfDmaDescriptorSignatuture          =  0x00B80868,
+	HWPfDmaFcwSignature                   =  0x00B8086C,
+	HWPfDmaErrorDetectionEn               =  0x00B80870,
+	HWPfDmaErrCntrlFifoDebug              =  0x00B8087C,
+	HWPfDmaStatusToutData                 =  0x00B80880,
+	HWPfDmaStatusToutDesc                 =  0x00B80884,
+	HWPfDmaStatusToutUnexpData            =  0x00B80888,
+	HWPfDmaStatusToutUnexpDesc            =  0x00B8088C,
+	HWPfDmaStatusToutProcess              =  0x00B80890,
+	HWPfDmaConfigCtoutOutDataEn           =  0x00B808A0,
+	HWPfDmaConfigCtoutOutDescrEn          =  0x00B808A4,
+	HWPfDmaConfigUnexpComplDataEn         =  0x00B808A8,
+	HWPfDmaConfigUnexpComplDescrEn        =  0x00B808AC,
+	HWPfDmaConfigPtoutOutEn               =  0x00B808B0,
+	HWPfDmaFec5GulDescBaseLoRegVf         =  0x00B88020,
+	HWPfDmaFec5GulDescBaseHiRegVf         =  0x00B88024,
+	HWPfDmaFec5GulRespPtrLoRegVf          =  0x00B88028,
+	HWPfDmaFec5GulRespPtrHiRegVf          =  0x00B8802C,
+	HWPfDmaFec5GdlDescBaseLoRegVf         =  0x00B88040,
+	HWPfDmaFec5GdlDescBaseHiRegVf         =  0x00B88044,
+	HWPfDmaFec5GdlRespPtrLoRegVf          =  0x00B88048,
+	HWPfDmaFec5GdlRespPtrHiRegVf          =  0x00B8804C,
+	HWPfDmaFec4GulDescBaseLoRegVf         =  0x00B88060,
+	HWPfDmaFec4GulDescBaseHiRegVf         =  0x00B88064,
+	HWPfDmaFec4GulRespPtrLoRegVf          =  0x00B88068,
+	HWPfDmaFec4GulRespPtrHiRegVf          =  0x00B8806C,
+	HWPfDmaFec4GdlDescBaseLoRegVf         =  0x00B88080,
+	HWPfDmaFec4GdlDescBaseHiRegVf         =  0x00B88084,
+	HWPfDmaFec4GdlRespPtrLoRegVf          =  0x00B88088,
+	HWPfDmaFec4GdlRespPtrHiRegVf          =  0x00B8808C,
+	HWPfDmaVfDdrBaseRangeRo               =  0x00B880A0,
+	HWPfQosmonACntrlReg                   =  0x00B90000,
+	HWPfQosmonAEvalOverflow0              =  0x00B90008,
+	HWPfQosmonAEvalOverflow1              =  0x00B9000C,
+	HWPfQosmonADivTerm                    =  0x00B90010,
+	HWPfQosmonATickTerm                   =  0x00B90014,
+	HWPfQosmonAEvalTerm                   =  0x00B90018,
+	HWPfQosmonAAveTerm                    =  0x00B9001C,
+	HWPfQosmonAForceEccErr                =  0x00B90020,
+	HWPfQosmonAEccErrDetect               =  0x00B90024,
+	HWPfQosmonAIterationConfig0Low        =  0x00B90060,
+	HWPfQosmonAIterationConfig0High       =  0x00B90064,
+	HWPfQosmonAIterationConfig1Low        =  0x00B90068,
+	HWPfQosmonAIterationConfig1High       =  0x00B9006C,
+	HWPfQosmonAIterationConfig2Low        =  0x00B90070,
+	HWPfQosmonAIterationConfig2High       =  0x00B90074,
+	HWPfQosmonAIterationConfig3Low        =  0x00B90078,
+	HWPfQosmonAIterationConfig3High       =  0x00B9007C,
+	HWPfQosmonAEvalMemAddr                =  0x00B90080,
+	HWPfQosmonAEvalMemData                =  0x00B90084,
+	HWPfQosmonAXaction                    =  0x00B900C0,
+	HWPfQosmonARemThres1Vf                =  0x00B90400,
+	HWPfQosmonAThres2Vf                   =  0x00B90404,
+	HWPfQosmonAWeiFracVf                  =  0x00B90408,
+	HWPfQosmonARrWeiVf                    =  0x00B9040C,
+	HWPfPermonACntrlRegVf                 =  0x00B98000,
+	HWPfPermonACountVf                    =  0x00B98008,
+	HWPfPermonAKCntLoVf                   =  0x00B98010,
+	HWPfPermonAKCntHiVf                   =  0x00B98014,
+	HWPfPermonADeltaCntLoVf               =  0x00B98020,
+	HWPfPermonADeltaCntHiVf               =  0x00B98024,
+	HWPfPermonAVersionReg                 =  0x00B9C000,
+	HWPfPermonACbControlFec               =  0x00B9C0F0,
+	HWPfPermonADltTimerLoFec              =  0x00B9C0F4,
+	HWPfPermonADltTimerHiFec              =  0x00B9C0F8,
+	HWPfPermonACbCountFec                 =  0x00B9C100,
+	HWPfPermonAAccExecTimerLoFec          =  0x00B9C104,
+	HWPfPermonAAccExecTimerHiFec          =  0x00B9C108,
+	HWPfPermonAExecTimerMinFec            =  0x00B9C200,
+	HWPfPermonAExecTimerMaxFec            =  0x00B9C204,
+	HWPfPermonAControlBusMon              =  0x00B9C400,
+	HWPfPermonAConfigBusMon               =  0x00B9C404,
+	HWPfPermonASkipCountBusMon            =  0x00B9C408,
+	HWPfPermonAMinLatBusMon               =  0x00B9C40C,
+	HWPfPermonAMaxLatBusMon               =  0x00B9C500,
+	HWPfPermonATotalLatLowBusMon          =  0x00B9C504,
+	HWPfPermonATotalLatUpperBusMon        =  0x00B9C508,
+	HWPfPermonATotalReqCntBusMon          =  0x00B9C50C,
+	HWPfQosmonBCntrlReg                   =  0x00BA0000,
+	HWPfQosmonBEvalOverflow0              =  0x00BA0008,
+	HWPfQosmonBEvalOverflow1              =  0x00BA000C,
+	HWPfQosmonBDivTerm                    =  0x00BA0010,
+	HWPfQosmonBTickTerm                   =  0x00BA0014,
+	HWPfQosmonBEvalTerm                   =  0x00BA0018,
+	HWPfQosmonBAveTerm                    =  0x00BA001C,
+	HWPfQosmonBForceEccErr                =  0x00BA0020,
+	HWPfQosmonBEccErrDetect               =  0x00BA0024,
+	HWPfQosmonBIterationConfig0Low        =  0x00BA0060,
+	HWPfQosmonBIterationConfig0High       =  0x00BA0064,
+	HWPfQosmonBIterationConfig1Low        =  0x00BA0068,
+	HWPfQosmonBIterationConfig1High       =  0x00BA006C,
+	HWPfQosmonBIterationConfig2Low        =  0x00BA0070,
+	HWPfQosmonBIterationConfig2High       =  0x00BA0074,
+	HWPfQosmonBIterationConfig3Low        =  0x00BA0078,
+	HWPfQosmonBIterationConfig3High       =  0x00BA007C,
+	HWPfQosmonBEvalMemAddr                =  0x00BA0080,
+	HWPfQosmonBEvalMemData                =  0x00BA0084,
+	HWPfQosmonBXaction                    =  0x00BA00C0,
+	HWPfQosmonBRemThres1Vf                =  0x00BA0400,
+	HWPfQosmonBThres2Vf                   =  0x00BA0404,
+	HWPfQosmonBWeiFracVf                  =  0x00BA0408,
+	HWPfQosmonBRrWeiVf                    =  0x00BA040C,
+	HWPfPermonBCntrlRegVf                 =  0x00BA8000,
+	HWPfPermonBCountVf                    =  0x00BA8008,
+	HWPfPermonBKCntLoVf                   =  0x00BA8010,
+	HWPfPermonBKCntHiVf                   =  0x00BA8014,
+	HWPfPermonBDeltaCntLoVf               =  0x00BA8020,
+	HWPfPermonBDeltaCntHiVf               =  0x00BA8024,
+	HWPfPermonBVersionReg                 =  0x00BAC000,
+	HWPfPermonBCbControlFec               =  0x00BAC0F0,
+	HWPfPermonBDltTimerLoFec              =  0x00BAC0F4,
+	HWPfPermonBDltTimerHiFec              =  0x00BAC0F8,
+	HWPfPermonBCbCountFec                 =  0x00BAC100,
+	HWPfPermonBAccExecTimerLoFec          =  0x00BAC104,
+	HWPfPermonBAccExecTimerHiFec          =  0x00BAC108,
+	HWPfPermonBExecTimerMinFec            =  0x00BAC200,
+	HWPfPermonBExecTimerMaxFec            =  0x00BAC204,
+	HWPfPermonBControlBusMon              =  0x00BAC400,
+	HWPfPermonBConfigBusMon               =  0x00BAC404,
+	HWPfPermonBSkipCountBusMon            =  0x00BAC408,
+	HWPfPermonBMinLatBusMon               =  0x00BAC40C,
+	HWPfPermonBMaxLatBusMon               =  0x00BAC500,
+	HWPfPermonBTotalLatLowBusMon          =  0x00BAC504,
+	HWPfPermonBTotalLatUpperBusMon        =  0x00BAC508,
+	HWPfPermonBTotalReqCntBusMon          =  0x00BAC50C,
+	HWPfFecUl5gCntrlReg                   =  0x00BC0000,
+	HWPfFecUl5gI2MThreshReg               =  0x00BC0004,
+	HWPfFecUl5gVersionReg                 =  0x00BC0100,
+	HWPfFecUl5gFcwStatusReg               =  0x00BC0104,
+	HWPfFecUl5gWarnReg                    =  0x00BC0108,
+	HwPfFecUl5gIbDebugReg                 =  0x00BC0200,
+	HwPfFecUl5gObLlrDebugReg              =  0x00BC0204,
+	HwPfFecUl5gObHarqDebugReg             =  0x00BC0208,
+	HwPfFecUl5g1CntrlReg                  =  0x00BC1000,
+	HwPfFecUl5g1I2MThreshReg              =  0x00BC1004,
+	HwPfFecUl5g1VersionReg                =  0x00BC1100,
+	HwPfFecUl5g1FcwStatusReg              =  0x00BC1104,
+	HwPfFecUl5g1WarnReg                   =  0x00BC1108,
+	HwPfFecUl5g1IbDebugReg                =  0x00BC1200,
+	HwPfFecUl5g1ObLlrDebugReg             =  0x00BC1204,
+	HwPfFecUl5g1ObHarqDebugReg            =  0x00BC1208,
+	HwPfFecUl5g2CntrlReg                  =  0x00BC2000,
+	HwPfFecUl5g2I2MThreshReg              =  0x00BC2004,
+	HwPfFecUl5g2VersionReg                =  0x00BC2100,
+	HwPfFecUl5g2FcwStatusReg              =  0x00BC2104,
+	HwPfFecUl5g2WarnReg                   =  0x00BC2108,
+	HwPfFecUl5g2IbDebugReg                =  0x00BC2200,
+	HwPfFecUl5g2ObLlrDebugReg             =  0x00BC2204,
+	HwPfFecUl5g2ObHarqDebugReg            =  0x00BC2208,
+	HwPfFecUl5g3CntrlReg                  =  0x00BC3000,
+	HwPfFecUl5g3I2MThreshReg              =  0x00BC3004,
+	HwPfFecUl5g3VersionReg                =  0x00BC3100,
+	HwPfFecUl5g3FcwStatusReg              =  0x00BC3104,
+	HwPfFecUl5g3WarnReg                   =  0x00BC3108,
+	HwPfFecUl5g3IbDebugReg                =  0x00BC3200,
+	HwPfFecUl5g3ObLlrDebugReg             =  0x00BC3204,
+	HwPfFecUl5g3ObHarqDebugReg            =  0x00BC3208,
+	HwPfFecUl5g4CntrlReg                  =  0x00BC4000,
+	HwPfFecUl5g4I2MThreshReg              =  0x00BC4004,
+	HwPfFecUl5g4VersionReg                =  0x00BC4100,
+	HwPfFecUl5g4FcwStatusReg              =  0x00BC4104,
+	HwPfFecUl5g4WarnReg                   =  0x00BC4108,
+	HwPfFecUl5g4IbDebugReg                =  0x00BC4200,
+	HwPfFecUl5g4ObLlrDebugReg             =  0x00BC4204,
+	HwPfFecUl5g4ObHarqDebugReg            =  0x00BC4208,
+	HwPfFecUl5g5CntrlReg                  =  0x00BC5000,
+	HwPfFecUl5g5I2MThreshReg              =  0x00BC5004,
+	HwPfFecUl5g5VersionReg                =  0x00BC5100,
+	HwPfFecUl5g5FcwStatusReg              =  0x00BC5104,
+	HwPfFecUl5g5WarnReg                   =  0x00BC5108,
+	HwPfFecUl5g5IbDebugReg                =  0x00BC5200,
+	HwPfFecUl5g5ObLlrDebugReg             =  0x00BC5204,
+	HwPfFecUl5g5ObHarqDebugReg            =  0x00BC5208,
+	HwPfFecUl5g6CntrlReg                  =  0x00BC6000,
+	HwPfFecUl5g6I2MThreshReg              =  0x00BC6004,
+	HwPfFecUl5g6VersionReg                =  0x00BC6100,
+	HwPfFecUl5g6FcwStatusReg              =  0x00BC6104,
+	HwPfFecUl5g6WarnReg                   =  0x00BC6108,
+	HwPfFecUl5g6IbDebugReg                =  0x00BC6200,
+	HwPfFecUl5g6ObLlrDebugReg             =  0x00BC6204,
+	HwPfFecUl5g6ObHarqDebugReg            =  0x00BC6208,
+	HwPfFecUl5g7CntrlReg                  =  0x00BC7000,
+	HwPfFecUl5g7I2MThreshReg              =  0x00BC7004,
+	HwPfFecUl5g7VersionReg                =  0x00BC7100,
+	HwPfFecUl5g7FcwStatusReg              =  0x00BC7104,
+	HwPfFecUl5g7WarnReg                   =  0x00BC7108,
+	HwPfFecUl5g7IbDebugReg                =  0x00BC7200,
+	HwPfFecUl5g7ObLlrDebugReg             =  0x00BC7204,
+	HwPfFecUl5g7ObHarqDebugReg            =  0x00BC7208,
+	HwPfFecUl5g8CntrlReg                  =  0x00BC8000,
+	HwPfFecUl5g8I2MThreshReg              =  0x00BC8004,
+	HwPfFecUl5g8VersionReg                =  0x00BC8100,
+	HwPfFecUl5g8FcwStatusReg              =  0x00BC8104,
+	HwPfFecUl5g8WarnReg                   =  0x00BC8108,
+	HwPfFecUl5g8IbDebugReg                =  0x00BC8200,
+	HwPfFecUl5g8ObLlrDebugReg             =  0x00BC8204,
+	HwPfFecUl5g8ObHarqDebugReg            =  0x00BC8208,
+	HWPfFecDl5gCntrlReg                   =  0x00BCF000,
+	HWPfFecDl5gI2MThreshReg               =  0x00BCF004,
+	HWPfFecDl5gVersionReg                 =  0x00BCF100,
+	HWPfFecDl5gFcwStatusReg               =  0x00BCF104,
+	HWPfFecDl5gWarnReg                    =  0x00BCF108,
+	HWPfFecUlVersionReg                   =  0x00BD0000,
+	HWPfFecUlControlReg                   =  0x00BD0004,
+	HWPfFecUlStatusReg                    =  0x00BD0008,
+	HWPfFecDlVersionReg                   =  0x00BDF000,
+	HWPfFecDlClusterConfigReg             =  0x00BDF004,
+	HWPfFecDlBurstThres                   =  0x00BDF00C,
+	HWPfFecDlClusterStatusReg0            =  0x00BDF040,
+	HWPfFecDlClusterStatusReg1            =  0x00BDF044,
+	HWPfFecDlClusterStatusReg2            =  0x00BDF048,
+	HWPfFecDlClusterStatusReg3            =  0x00BDF04C,
+	HWPfFecDlClusterStatusReg4            =  0x00BDF050,
+	HWPfFecDlClusterStatusReg5            =  0x00BDF054,
+	HWPfChaFabPllPllrst                   =  0x00C40000,
+	HWPfChaFabPllClk0                     =  0x00C40004,
+	HWPfChaFabPllClk1                     =  0x00C40008,
+	HWPfChaFabPllBwadj                    =  0x00C4000C,
+	HWPfChaFabPllLbw                      =  0x00C40010,
+	HWPfChaFabPllResetq                   =  0x00C40014,
+	HWPfChaFabPllPhshft0                  =  0x00C40018,
+	HWPfChaFabPllPhshft1                  =  0x00C4001C,
+	HWPfChaFabPllDivq0                    =  0x00C40020,
+	HWPfChaFabPllDivq1                    =  0x00C40024,
+	HWPfChaFabPllDivq2                    =  0x00C40028,
+	HWPfChaFabPllDivq3                    =  0x00C4002C,
+	HWPfChaFabPllDivq4                    =  0x00C40030,
+	HWPfChaFabPllDivq5                    =  0x00C40034,
+	HWPfChaFabPllDivq6                    =  0x00C40038,
+	HWPfChaFabPllDivq7                    =  0x00C4003C,
+	HWPfChaDl5gPllPllrst                  =  0x00C40080,
+	HWPfChaDl5gPllClk0                    =  0x00C40084,
+	HWPfChaDl5gPllClk1                    =  0x00C40088,
+	HWPfChaDl5gPllBwadj                   =  0x00C4008C,
+	HWPfChaDl5gPllLbw                     =  0x00C40090,
+	HWPfChaDl5gPllResetq                  =  0x00C40094,
+	HWPfChaDl5gPllPhshft0                 =  0x00C40098,
+	HWPfChaDl5gPllPhshft1                 =  0x00C4009C,
+	HWPfChaDl5gPllDivq0                   =  0x00C400A0,
+	HWPfChaDl5gPllDivq1                   =  0x00C400A4,
+	HWPfChaDl5gPllDivq2                   =  0x00C400A8,
+	HWPfChaDl5gPllDivq3                   =  0x00C400AC,
+	HWPfChaDl5gPllDivq4                   =  0x00C400B0,
+	HWPfChaDl5gPllDivq5                   =  0x00C400B4,
+	HWPfChaDl5gPllDivq6                   =  0x00C400B8,
+	HWPfChaDl5gPllDivq7                   =  0x00C400BC,
+	HWPfChaDl4gPllPllrst                  =  0x00C40100,
+	HWPfChaDl4gPllClk0                    =  0x00C40104,
+	HWPfChaDl4gPllClk1                    =  0x00C40108,
+	HWPfChaDl4gPllBwadj                   =  0x00C4010C,
+	HWPfChaDl4gPllLbw                     =  0x00C40110,
+	HWPfChaDl4gPllResetq                  =  0x00C40114,
+	HWPfChaDl4gPllPhshft0                 =  0x00C40118,
+	HWPfChaDl4gPllPhshft1                 =  0x00C4011C,
+	HWPfChaDl4gPllDivq0                   =  0x00C40120,
+	HWPfChaDl4gPllDivq1                   =  0x00C40124,
+	HWPfChaDl4gPllDivq2                   =  0x00C40128,
+	HWPfChaDl4gPllDivq3                   =  0x00C4012C,
+	HWPfChaDl4gPllDivq4                   =  0x00C40130,
+	HWPfChaDl4gPllDivq5                   =  0x00C40134,
+	HWPfChaDl4gPllDivq6                   =  0x00C40138,
+	HWPfChaDl4gPllDivq7                   =  0x00C4013C,
+	HWPfChaUl5gPllPllrst                  =  0x00C40180,
+	HWPfChaUl5gPllClk0                    =  0x00C40184,
+	HWPfChaUl5gPllClk1                    =  0x00C40188,
+	HWPfChaUl5gPllBwadj                   =  0x00C4018C,
+	HWPfChaUl5gPllLbw                     =  0x00C40190,
+	HWPfChaUl5gPllResetq                  =  0x00C40194,
+	HWPfChaUl5gPllPhshft0                 =  0x00C40198,
+	HWPfChaUl5gPllPhshft1                 =  0x00C4019C,
+	HWPfChaUl5gPllDivq0                   =  0x00C401A0,
+	HWPfChaUl5gPllDivq1                   =  0x00C401A4,
+	HWPfChaUl5gPllDivq2                   =  0x00C401A8,
+	HWPfChaUl5gPllDivq3                   =  0x00C401AC,
+	HWPfChaUl5gPllDivq4                   =  0x00C401B0,
+	HWPfChaUl5gPllDivq5                   =  0x00C401B4,
+	HWPfChaUl5gPllDivq6                   =  0x00C401B8,
+	HWPfChaUl5gPllDivq7                   =  0x00C401BC,
+	HWPfChaUl4gPllPllrst                  =  0x00C40200,
+	HWPfChaUl4gPllClk0                    =  0x00C40204,
+	HWPfChaUl4gPllClk1                    =  0x00C40208,
+	HWPfChaUl4gPllBwadj                   =  0x00C4020C,
+	HWPfChaUl4gPllLbw                     =  0x00C40210,
+	HWPfChaUl4gPllResetq                  =  0x00C40214,
+	HWPfChaUl4gPllPhshft0                 =  0x00C40218,
+	HWPfChaUl4gPllPhshft1                 =  0x00C4021C,
+	HWPfChaUl4gPllDivq0                   =  0x00C40220,
+	HWPfChaUl4gPllDivq1                   =  0x00C40224,
+	HWPfChaUl4gPllDivq2                   =  0x00C40228,
+	HWPfChaUl4gPllDivq3                   =  0x00C4022C,
+	HWPfChaUl4gPllDivq4                   =  0x00C40230,
+	HWPfChaUl4gPllDivq5                   =  0x00C40234,
+	HWPfChaUl4gPllDivq6                   =  0x00C40238,
+	HWPfChaUl4gPllDivq7                   =  0x00C4023C,
+	HWPfChaDdrPllPllrst                   =  0x00C40280,
+	HWPfChaDdrPllClk0                     =  0x00C40284,
+	HWPfChaDdrPllClk1                     =  0x00C40288,
+	HWPfChaDdrPllBwadj                    =  0x00C4028C,
+	HWPfChaDdrPllLbw                      =  0x00C40290,
+	HWPfChaDdrPllResetq                   =  0x00C40294,
+	HWPfChaDdrPllPhshft0                  =  0x00C40298,
+	HWPfChaDdrPllPhshft1                  =  0x00C4029C,
+	HWPfChaDdrPllDivq0                    =  0x00C402A0,
+	HWPfChaDdrPllDivq1                    =  0x00C402A4,
+	HWPfChaDdrPllDivq2                    =  0x00C402A8,
+	HWPfChaDdrPllDivq3                    =  0x00C402AC,
+	HWPfChaDdrPllDivq4                    =  0x00C402B0,
+	HWPfChaDdrPllDivq5                    =  0x00C402B4,
+	HWPfChaDdrPllDivq6                    =  0x00C402B8,
+	HWPfChaDdrPllDivq7                    =  0x00C402BC,
+	HWPfChaErrStatus                      =  0x00C40400,
+	HWPfChaErrMask                        =  0x00C40404,
+	HWPfChaDebugPcieMsiFifo               =  0x00C40410,
+	HWPfChaDebugDdrMsiFifo                =  0x00C40414,
+	HWPfChaDebugMiscMsiFifo               =  0x00C40418,
+	HWPfChaPwmSet                         =  0x00C40420,
+	HWPfChaDdrRstStatus                   =  0x00C40430,
+	HWPfChaDdrStDoneStatus                =  0x00C40434,
+	HWPfChaDdrWbRstCfg                    =  0x00C40438,
+	HWPfChaDdrApbRstCfg                   =  0x00C4043C,
+	HWPfChaDdrPhyRstCfg                   =  0x00C40440,
+	HWPfChaDdrCpuRstCfg                   =  0x00C40444,
+	HWPfChaDdrSifRstCfg                   =  0x00C40448,
+	HWPfChaPadcfgPcomp0                   =  0x00C41000,
+	HWPfChaPadcfgNcomp0                   =  0x00C41004,
+	HWPfChaPadcfgOdt0                     =  0x00C41008,
+	HWPfChaPadcfgProtect0                 =  0x00C4100C,
+	HWPfChaPreemphasisProtect0            =  0x00C41010,
+	HWPfChaPreemphasisCompen0             =  0x00C41040,
+	HWPfChaPreemphasisOdten0              =  0x00C41044,
+	HWPfChaPadcfgPcomp1                   =  0x00C41100,
+	HWPfChaPadcfgNcomp1                   =  0x00C41104,
+	HWPfChaPadcfgOdt1                     =  0x00C41108,
+	HWPfChaPadcfgProtect1                 =  0x00C4110C,
+	HWPfChaPreemphasisProtect1            =  0x00C41110,
+	HWPfChaPreemphasisCompen1             =  0x00C41140,
+	HWPfChaPreemphasisOdten1              =  0x00C41144,
+	HWPfChaPadcfgPcomp2                   =  0x00C41200,
+	HWPfChaPadcfgNcomp2                   =  0x00C41204,
+	HWPfChaPadcfgOdt2                     =  0x00C41208,
+	HWPfChaPadcfgProtect2                 =  0x00C4120C,
+	HWPfChaPreemphasisProtect2            =  0x00C41210,
+	HWPfChaPreemphasisCompen2             =  0x00C41240,
+	HWPfChaPreemphasisOdten2              =  0x00C41244,
+	HWPfChaPadcfgPcomp3                   =  0x00C41300,
+	HWPfChaPadcfgNcomp3                   =  0x00C41304,
+	HWPfChaPadcfgOdt3                     =  0x00C41308,
+	HWPfChaPadcfgProtect3                 =  0x00C4130C,
+	HWPfChaPreemphasisProtect3            =  0x00C41310,
+	HWPfChaPreemphasisCompen3             =  0x00C41340,
+	HWPfChaPreemphasisOdten3              =  0x00C41344,
+	HWPfChaPadcfgPcomp4                   =  0x00C41400,
+	HWPfChaPadcfgNcomp4                   =  0x00C41404,
+	HWPfChaPadcfgOdt4                     =  0x00C41408,
+	HWPfChaPadcfgProtect4                 =  0x00C4140C,
+	HWPfChaPreemphasisProtect4            =  0x00C41410,
+	HWPfChaPreemphasisCompen4             =  0x00C41440,
+	HWPfChaPreemphasisOdten4              =  0x00C41444,
+	HWPfHiVfToPfDbellVf                   =  0x00C80000,
+	HWPfHiPfToVfDbellVf                   =  0x00C80008,
+	HWPfHiInfoRingBaseLoVf                =  0x00C80010,
+	HWPfHiInfoRingBaseHiVf                =  0x00C80014,
+	HWPfHiInfoRingPointerVf               =  0x00C80018,
+	HWPfHiInfoRingIntWrEnVf               =  0x00C80020,
+	HWPfHiInfoRingPf2VfWrEnVf             =  0x00C80024,
+	HWPfHiMsixVectorMapperVf              =  0x00C80060,
+	HWPfHiModuleVersionReg                =  0x00C84000,
+	HWPfHiIosf2axiErrLogReg               =  0x00C84004,
+	HWPfHiHardResetReg                    =  0x00C84008,
+	HWPfHi5GHardResetReg                  =  0x00C8400C,
+	HWPfHiInfoRingBaseLoRegPf             =  0x00C84010,
+	HWPfHiInfoRingBaseHiRegPf             =  0x00C84014,
+	HWPfHiInfoRingPointerRegPf            =  0x00C84018,
+	HWPfHiInfoRingIntWrEnRegPf            =  0x00C84020,
+	HWPfHiInfoRingVf2pfLoWrEnReg          =  0x00C84024,
+	HWPfHiInfoRingVf2pfHiWrEnReg          =  0x00C84028,
+	HWPfHiLogParityErrStatusReg           =  0x00C8402C,
+	HWPfHiLogDataParityErrorVfStatusLo    =  0x00C84030,
+	HWPfHiLogDataParityErrorVfStatusHi    =  0x00C84034,
+	HWPfHiBlockTransmitOnErrorEn          =  0x00C84038,
+	HWPfHiCfgMsiIntWrEnRegPf              =  0x00C84040,
+	HWPfHiCfgMsiVf2pfLoWrEnReg            =  0x00C84044,
+	HWPfHiCfgMsiVf2pfHighWrEnReg          =  0x00C84048,
+	HWPfHiMsixVectorMapperPf              =  0x00C84060,
+	HWPfHiApbWrWaitTime                   =  0x00C84100,
+	HWPfHiXCounterMaxValue                =  0x00C84104,
+	HWPfHiPfMode                          =  0x00C84108,
+	HWPfHiClkGateHystReg                  =  0x00C8410C,
+	HWPfHiSnoopBitsReg                    =  0x00C84110,
+	HWPfHiMsiDropEnableReg                =  0x00C84114,
+	HWPfHiMsiStatReg                      =  0x00C84120,
+	HWPfHiFifoOflStatReg                  =  0x00C84124,
+	HWPfHiHiDebugReg                      =  0x00C841F4,
+	HWPfHiDebugMemSnoopMsiFifo            =  0x00C841F8,
+	HWPfHiDebugMemSnoopInputFifo          =  0x00C841FC,
+	HWPfHiMsixMappingConfig               =  0x00C84200,
+	HWPfHiJunkReg                         =  0x00C8FF00,
+	HWPfDdrUmmcVer                        =  0x00D00000,
+	HWPfDdrUmmcCap                        =  0x00D00010,
+	HWPfDdrUmmcCtrl                       =  0x00D00020,
+	HWPfDdrMpcPe                          =  0x00D00080,
+	HWPfDdrMpcPpri3                       =  0x00D00090,
+	HWPfDdrMpcPpri2                       =  0x00D000A0,
+	HWPfDdrMpcPpri1                       =  0x00D000B0,
+	HWPfDdrMpcPpri0                       =  0x00D000C0,
+	HWPfDdrMpcPrwgrpCtrl                  =  0x00D000D0,
+	HWPfDdrMpcPbw7                        =  0x00D000E0,
+	HWPfDdrMpcPbw6                        =  0x00D000F0,
+	HWPfDdrMpcPbw5                        =  0x00D00100,
+	HWPfDdrMpcPbw4                        =  0x00D00110,
+	HWPfDdrMpcPbw3                        =  0x00D00120,
+	HWPfDdrMpcPbw2                        =  0x00D00130,
+	HWPfDdrMpcPbw1                        =  0x00D00140,
+	HWPfDdrMpcPbw0                        =  0x00D00150,
+	HWPfDdrMemoryInit                     =  0x00D00200,
+	HWPfDdrMemoryInitDone                 =  0x00D00210,
+	HWPfDdrMemInitPhyTrng0                =  0x00D00240,
+	HWPfDdrMemInitPhyTrng1                =  0x00D00250,
+	HWPfDdrMemInitPhyTrng2                =  0x00D00260,
+	HWPfDdrMemInitPhyTrng3                =  0x00D00270,
+	HWPfDdrBcDram                         =  0x00D003C0,
+	HWPfDdrBcAddrMap                      =  0x00D003D0,
+	HWPfDdrBcRef                          =  0x00D003E0,
+	HWPfDdrBcTim0                         =  0x00D00400,
+	HWPfDdrBcTim1                         =  0x00D00410,
+	HWPfDdrBcTim2                         =  0x00D00420,
+	HWPfDdrBcTim3                         =  0x00D00430,
+	HWPfDdrBcTim4                         =  0x00D00440,
+	HWPfDdrBcTim5                         =  0x00D00450,
+	HWPfDdrBcTim6                         =  0x00D00460,
+	HWPfDdrBcTim7                         =  0x00D00470,
+	HWPfDdrBcTim8                         =  0x00D00480,
+	HWPfDdrBcTim9                         =  0x00D00490,
+	HWPfDdrBcTim10                        =  0x00D004A0,
+	HWPfDdrBcTim12                        =  0x00D004C0,
+	HWPfDdrDfiInit                        =  0x00D004D0,
+	HWPfDdrDfiInitComplete                =  0x00D004E0,
+	HWPfDdrDfiTim0                        =  0x00D004F0,
+	HWPfDdrDfiTim1                        =  0x00D00500,
+	HWPfDdrDfiPhyUpdEn                    =  0x00D00530,
+	HWPfDdrMemStatus                      =  0x00D00540,
+	HWPfDdrUmmcErrStatus                  =  0x00D00550,
+	HWPfDdrUmmcIntStatus                  =  0x00D00560,
+	HWPfDdrUmmcIntEn                      =  0x00D00570,
+	HWPfDdrPhyRdLatency                   =  0x00D48400,
+	HWPfDdrPhyRdLatencyDbi                =  0x00D48410,
+	HWPfDdrPhyWrLatency                   =  0x00D48420,
+	HWPfDdrPhyTrngType                    =  0x00D48430,
+	HWPfDdrPhyMrsTiming2                  =  0x00D48440,
+	HWPfDdrPhyMrsTiming0                  =  0x00D48450,
+	HWPfDdrPhyMrsTiming1                  =  0x00D48460,
+	HWPfDdrPhyDramTmrd                    =  0x00D48470,
+	HWPfDdrPhyDramTmod                    =  0x00D48480,
+	HWPfDdrPhyDramTwpre                   =  0x00D48490,
+	HWPfDdrPhyDramTrfc                    =  0x00D484A0,
+	HWPfDdrPhyDramTrwtp                   =  0x00D484B0,
+	HWPfDdrPhyMr01Dimm                    =  0x00D484C0,
+	HWPfDdrPhyMr01DimmDbi                 =  0x00D484D0,
+	HWPfDdrPhyMr23Dimm                    =  0x00D484E0,
+	HWPfDdrPhyMr45Dimm                    =  0x00D484F0,
+	HWPfDdrPhyMr67Dimm                    =  0x00D48500,
+	HWPfDdrPhyWrlvlWwRdlvlRr              =  0x00D48510,
+	HWPfDdrPhyOdtEn                       =  0x00D48520,
+	HWPfDdrPhyFastTrng                    =  0x00D48530,
+	HWPfDdrPhyDynTrngGap                  =  0x00D48540,
+	HWPfDdrPhyDynRcalGap                  =  0x00D48550,
+	HWPfDdrPhyIdletimeout                 =  0x00D48560,
+	HWPfDdrPhyRstCkeGap                   =  0x00D48570,
+	HWPfDdrPhyCkeMrsGap                   =  0x00D48580,
+	HWPfDdrPhyMemVrefMidVal               =  0x00D48590,
+	HWPfDdrPhyVrefStep                    =  0x00D485A0,
+	HWPfDdrPhyVrefThreshold               =  0x00D485B0,
+	HWPfDdrPhyPhyVrefMidVal               =  0x00D485C0,
+	HWPfDdrPhyDqsCountMax                 =  0x00D485D0,
+	HWPfDdrPhyDqsCountNum                 =  0x00D485E0,
+	HWPfDdrPhyDramRow                     =  0x00D485F0,
+	HWPfDdrPhyDramCol                     =  0x00D48600,
+	HWPfDdrPhyDramBgBa                    =  0x00D48610,
+	HWPfDdrPhyDynamicUpdreqrel            =  0x00D48620,
+	HWPfDdrPhyVrefLimits                  =  0x00D48630,
+	HWPfDdrPhyIdtmTcStatus                =  0x00D6C020,
+	HWPfDdrPhyIdtmFwVersion               =  0x00D6C410,
+	HWPfDdrPhyRdlvlGateInitDelay          =  0x00D70000,
+	HWPfDdrPhyRdenSmplabc                 =  0x00D70008,
+	HWPfDdrPhyVrefNibble0                 =  0x00D7000C,
+	HWPfDdrPhyVrefNibble1                 =  0x00D70010,
+	HWPfDdrPhyRdlvlGateDqsSmpl0           =  0x00D70014,
+	HWPfDdrPhyRdlvlGateDqsSmpl1           =  0x00D70018,
+	HWPfDdrPhyRdlvlGateDqsSmpl2           =  0x00D7001C,
+	HWPfDdrPhyDqsCount                    =  0x00D70020,
+	HWPfDdrPhyWrlvlRdlvlGateStatus        =  0x00D70024,
+	HWPfDdrPhyErrorFlags                  =  0x00D70028,
+	HWPfDdrPhyPowerDown                   =  0x00D70030,
+	HWPfDdrPhyPrbsSeedByte0               =  0x00D70034,
+	HWPfDdrPhyPrbsSeedByte1               =  0x00D70038,
+	HWPfDdrPhyPcompDq                     =  0x00D70040,
+	HWPfDdrPhyNcompDq                     =  0x00D70044,
+	HWPfDdrPhyPcompDqs                    =  0x00D70048,
+	HWPfDdrPhyNcompDqs                    =  0x00D7004C,
+	HWPfDdrPhyPcompCmd                    =  0x00D70050,
+	HWPfDdrPhyNcompCmd                    =  0x00D70054,
+	HWPfDdrPhyPcompCk                     =  0x00D70058,
+	HWPfDdrPhyNcompCk                     =  0x00D7005C,
+	HWPfDdrPhyRcalOdtDq                   =  0x00D70060,
+	HWPfDdrPhyRcalOdtDqs                  =  0x00D70064,
+	HWPfDdrPhyRcalMask1                   =  0x00D70068,
+	HWPfDdrPhyRcalMask2                   =  0x00D7006C,
+	HWPfDdrPhyRcalCtrl                    =  0x00D70070,
+	HWPfDdrPhyRcalCnt                     =  0x00D70074,
+	HWPfDdrPhyRcalOverride                =  0x00D70078,
+	HWPfDdrPhyRcalGateen                  =  0x00D7007C,
+	HWPfDdrPhyCtrl                        =  0x00D70080,
+	HWPfDdrPhyWrlvlAlg                    =  0x00D70084,
+	HWPfDdrPhyRcalVreftTxcmdOdt           =  0x00D70088,
+	HWPfDdrPhyRdlvlGateParam              =  0x00D7008C,
+	HWPfDdrPhyRdlvlGateParam2             =  0x00D70090,
+	HWPfDdrPhyRcalVreftTxdata             =  0x00D70094,
+	HWPfDdrPhyCmdIntDelay                 =  0x00D700A4,
+	HWPfDdrPhyAlertN                      =  0x00D700A8,
+	HWPfDdrPhyTrngReqWpre2tck             =  0x00D700AC,
+	HWPfDdrPhyCmdPhaseSel                 =  0x00D700B4,
+	HWPfDdrPhyCmdDcdl                     =  0x00D700B8,
+	HWPfDdrPhyCkDcdl                      =  0x00D700BC,
+	HWPfDdrPhySwTrngCtrl1                 =  0x00D700C0,
+	HWPfDdrPhySwTrngCtrl2                 =  0x00D700C4,
+	HWPfDdrPhyRcalPcompRden               =  0x00D700C8,
+	HWPfDdrPhyRcalNcompRden               =  0x00D700CC,
+	HWPfDdrPhyRcalCompen                  =  0x00D700D0,
+	HWPfDdrPhySwTrngRdqs                  =  0x00D700D4,
+	HWPfDdrPhySwTrngWdqs                  =  0x00D700D8,
+	HWPfDdrPhySwTrngRdena                 =  0x00D700DC,
+	HWPfDdrPhySwTrngRdenb                 =  0x00D700E0,
+	HWPfDdrPhySwTrngRdenc                 =  0x00D700E4,
+	HWPfDdrPhySwTrngWdq                   =  0x00D700E8,
+	HWPfDdrPhySwTrngRdq                   =  0x00D700EC,
+	HWPfDdrPhyPcfgHmValue                 =  0x00D700F0,
+	HWPfDdrPhyPcfgTimerValue              =  0x00D700F4,
+	HWPfDdrPhyPcfgSoftwareTraining        =  0x00D700F8,
+	HWPfDdrPhyPcfgMcStatus                =  0x00D700FC,
+	HWPfDdrPhyWrlvlPhRank0                =  0x00D70100,
+	HWPfDdrPhyRdenPhRank0                 =  0x00D70104,
+	HWPfDdrPhyRdenIntRank0                =  0x00D70108,
+	HWPfDdrPhyRdqsDcdlRank0               =  0x00D7010C,
+	HWPfDdrPhyRdqsShadowDcdlRank0         =  0x00D70110,
+	HWPfDdrPhyWdqsDcdlRank0               =  0x00D70114,
+	HWPfDdrPhyWdmDcdlShadowRank0          =  0x00D70118,
+	HWPfDdrPhyWdmDcdlRank0                =  0x00D7011C,
+	HWPfDdrPhyDbiDcdlRank0                =  0x00D70120,
+	HWPfDdrPhyRdenDcdlaRank0              =  0x00D70124,
+	HWPfDdrPhyDbiDcdlShadowRank0          =  0x00D70128,
+	HWPfDdrPhyRdenDcdlbRank0              =  0x00D7012C,
+	HWPfDdrPhyWdqsShadowDcdlRank0         =  0x00D70130,
+	HWPfDdrPhyRdenDcdlcRank0              =  0x00D70134,
+	HWPfDdrPhyRdenShadowDcdlaRank0        =  0x00D70138,
+	HWPfDdrPhyWrlvlIntRank0               =  0x00D7013C,
+	HWPfDdrPhyRdqDcdlBit0Rank0            =  0x00D70200,
+	HWPfDdrPhyRdqDcdlShadowBit0Rank0      =  0x00D70204,
+	HWPfDdrPhyWdqDcdlBit0Rank0            =  0x00D70208,
+	HWPfDdrPhyWdqDcdlShadowBit0Rank0      =  0x00D7020C,
+	HWPfDdrPhyRdqDcdlBit1Rank0            =  0x00D70240,
+	HWPfDdrPhyRdqDcdlShadowBit1Rank0      =  0x00D70244,
+	HWPfDdrPhyWdqDcdlBit1Rank0            =  0x00D70248,
+	HWPfDdrPhyWdqDcdlShadowBit1Rank0      =  0x00D7024C,
+	HWPfDdrPhyRdqDcdlBit2Rank0            =  0x00D70280,
+	HWPfDdrPhyRdqDcdlShadowBit2Rank0      =  0x00D70284,
+	HWPfDdrPhyWdqDcdlBit2Rank0            =  0x00D70288,
+	HWPfDdrPhyWdqDcdlShadowBit2Rank0      =  0x00D7028C,
+	HWPfDdrPhyRdqDcdlBit3Rank0            =  0x00D702C0,
+	HWPfDdrPhyRdqDcdlShadowBit3Rank0      =  0x00D702C4,
+	HWPfDdrPhyWdqDcdlBit3Rank0            =  0x00D702C8,
+	HWPfDdrPhyWdqDcdlShadowBit3Rank0      =  0x00D702CC,
+	HWPfDdrPhyRdqDcdlBit4Rank0            =  0x00D70300,
+	HWPfDdrPhyRdqDcdlShadowBit4Rank0      =  0x00D70304,
+	HWPfDdrPhyWdqDcdlBit4Rank0            =  0x00D70308,
+	HWPfDdrPhyWdqDcdlShadowBit4Rank0      =  0x00D7030C,
+	HWPfDdrPhyRdqDcdlBit5Rank0            =  0x00D70340,
+	HWPfDdrPhyRdqDcdlShadowBit5Rank0      =  0x00D70344,
+	HWPfDdrPhyWdqDcdlBit5Rank0            =  0x00D70348,
+	HWPfDdrPhyWdqDcdlShadowBit5Rank0      =  0x00D7034C,
+	HWPfDdrPhyRdqDcdlBit6Rank0            =  0x00D70380,
+	HWPfDdrPhyRdqDcdlShadowBit6Rank0      =  0x00D70384,
+	HWPfDdrPhyWdqDcdlBit6Rank0            =  0x00D70388,
+	HWPfDdrPhyWdqDcdlShadowBit6Rank0      =  0x00D7038C,
+	HWPfDdrPhyRdqDcdlBit7Rank0            =  0x00D703C0,
+	HWPfDdrPhyRdqDcdlShadowBit7Rank0      =  0x00D703C4,
+	HWPfDdrPhyWdqDcdlBit7Rank0            =  0x00D703C8,
+	HWPfDdrPhyWdqDcdlShadowBit7Rank0      =  0x00D703CC,
+	HWPfDdrPhyIdtmStatus                  =  0x00D740D0,
+	HWPfDdrPhyIdtmError                   =  0x00D74110,
+	HWPfDdrPhyIdtmDebug                   =  0x00D74120,
+	HWPfDdrPhyIdtmDebugInt                =  0x00D74130,
+	HwPfPcieLnAsicCfgovr                  =  0x00D80000,
+	HwPfPcieLnAclkmixer                   =  0x00D80004,
+	HwPfPcieLnTxrampfreq                  =  0x00D80008,
+	HwPfPcieLnLanetest                    =  0x00D8000C,
+	HwPfPcieLnDcctrl                      =  0x00D80010,
+	HwPfPcieLnDccmeas                     =  0x00D80014,
+	HwPfPcieLnDccovrAclk                  =  0x00D80018,
+	HwPfPcieLnDccovrTxa                   =  0x00D8001C,
+	HwPfPcieLnDccovrTxk                   =  0x00D80020,
+	HwPfPcieLnDccovrDclk                  =  0x00D80024,
+	HwPfPcieLnDccovrEclk                  =  0x00D80028,
+	HwPfPcieLnDcctrimAclk                 =  0x00D8002C,
+	HwPfPcieLnDcctrimTx                   =  0x00D80030,
+	HwPfPcieLnDcctrimDclk                 =  0x00D80034,
+	HwPfPcieLnDcctrimEclk                 =  0x00D80038,
+	HwPfPcieLnQuadCtrl                    =  0x00D8003C,
+	HwPfPcieLnQuadCorrIndex               =  0x00D80040,
+	HwPfPcieLnQuadCorrStatus              =  0x00D80044,
+	HwPfPcieLnAsicRxovr1                  =  0x00D80048,
+	HwPfPcieLnAsicRxovr2                  =  0x00D8004C,
+	HwPfPcieLnAsicEqinfovr                =  0x00D80050,
+	HwPfPcieLnRxcsr                       =  0x00D80054,
+	HwPfPcieLnRxfectrl                    =  0x00D80058,
+	HwPfPcieLnRxtest                      =  0x00D8005C,
+	HwPfPcieLnEscount                     =  0x00D80060,
+	HwPfPcieLnCdrctrl                     =  0x00D80064,
+	HwPfPcieLnCdrctrl2                    =  0x00D80068,
+	HwPfPcieLnCdrcfg0Ctrl0                =  0x00D8006C,
+	HwPfPcieLnCdrcfg0Ctrl1                =  0x00D80070,
+	HwPfPcieLnCdrcfg0Ctrl2                =  0x00D80074,
+	HwPfPcieLnCdrcfg1Ctrl0                =  0x00D80078,
+	HwPfPcieLnCdrcfg1Ctrl1                =  0x00D8007C,
+	HwPfPcieLnCdrcfg1Ctrl2                =  0x00D80080,
+	HwPfPcieLnCdrcfg2Ctrl0                =  0x00D80084,
+	HwPfPcieLnCdrcfg2Ctrl1                =  0x00D80088,
+	HwPfPcieLnCdrcfg2Ctrl2                =  0x00D8008C,
+	HwPfPcieLnCdrcfg3Ctrl0                =  0x00D80090,
+	HwPfPcieLnCdrcfg3Ctrl1                =  0x00D80094,
+	HwPfPcieLnCdrcfg3Ctrl2                =  0x00D80098,
+	HwPfPcieLnCdrphase                    =  0x00D8009C,
+	HwPfPcieLnCdrfreq                     =  0x00D800A0,
+	HwPfPcieLnCdrstatusPhase              =  0x00D800A4,
+	HwPfPcieLnCdrstatusFreq               =  0x00D800A8,
+	HwPfPcieLnCdroffset                   =  0x00D800AC,
+	HwPfPcieLnRxvosctl                    =  0x00D800B0,
+	HwPfPcieLnRxvosctl2                   =  0x00D800B4,
+	HwPfPcieLnRxlosctl                    =  0x00D800B8,
+	HwPfPcieLnRxlos                       =  0x00D800BC,
+	HwPfPcieLnRxlosvval                   =  0x00D800C0,
+	HwPfPcieLnRxvosd0                     =  0x00D800C4,
+	HwPfPcieLnRxvosd1                     =  0x00D800C8,
+	HwPfPcieLnRxvosep0                    =  0x00D800CC,
+	HwPfPcieLnRxvosep1                    =  0x00D800D0,
+	HwPfPcieLnRxvosen0                    =  0x00D800D4,
+	HwPfPcieLnRxvosen1                    =  0x00D800D8,
+	HwPfPcieLnRxvosafe                    =  0x00D800DC,
+	HwPfPcieLnRxvosa0                     =  0x00D800E0,
+	HwPfPcieLnRxvosa0Out                  =  0x00D800E4,
+	HwPfPcieLnRxvosa1                     =  0x00D800E8,
+	HwPfPcieLnRxvosa1Out                  =  0x00D800EC,
+	HwPfPcieLnRxmisc                      =  0x00D800F0,
+	HwPfPcieLnRxbeacon                    =  0x00D800F4,
+	HwPfPcieLnRxdssout                    =  0x00D800F8,
+	HwPfPcieLnRxdssout2                   =  0x00D800FC,
+	HwPfPcieLnAlphapctrl                  =  0x00D80100,
+	HwPfPcieLnAlphanctrl                  =  0x00D80104,
+	HwPfPcieLnAdaptctrl                   =  0x00D80108,
+	HwPfPcieLnAdaptctrl1                  =  0x00D8010C,
+	HwPfPcieLnAdaptstatus                 =  0x00D80110,
+	HwPfPcieLnAdaptvga1                   =  0x00D80114,
+	HwPfPcieLnAdaptvga2                   =  0x00D80118,
+	HwPfPcieLnAdaptvga3                   =  0x00D8011C,
+	HwPfPcieLnAdaptvga4                   =  0x00D80120,
+	HwPfPcieLnAdaptboost1                 =  0x00D80124,
+	HwPfPcieLnAdaptboost2                 =  0x00D80128,
+	HwPfPcieLnAdaptboost3                 =  0x00D8012C,
+	HwPfPcieLnAdaptboost4                 =  0x00D80130,
+	HwPfPcieLnAdaptsslms1                 =  0x00D80134,
+	HwPfPcieLnAdaptsslms2                 =  0x00D80138,
+	HwPfPcieLnAdaptvgaStatus              =  0x00D8013C,
+	HwPfPcieLnAdaptboostStatus            =  0x00D80140,
+	HwPfPcieLnAdaptsslmsStatus1           =  0x00D80144,
+	HwPfPcieLnAdaptsslmsStatus2           =  0x00D80148,
+	HwPfPcieLnAfectrl1                    =  0x00D8014C,
+	HwPfPcieLnAfectrl2                    =  0x00D80150,
+	HwPfPcieLnAfectrl3                    =  0x00D80154,
+	HwPfPcieLnAfedefault1                 =  0x00D80158,
+	HwPfPcieLnAfedefault2                 =  0x00D8015C,
+	HwPfPcieLnDfectrl1                    =  0x00D80160,
+	HwPfPcieLnDfectrl2                    =  0x00D80164,
+	HwPfPcieLnDfectrl3                    =  0x00D80168,
+	HwPfPcieLnDfectrl4                    =  0x00D8016C,
+	HwPfPcieLnDfectrl5                    =  0x00D80170,
+	HwPfPcieLnDfectrl6                    =  0x00D80174,
+	HwPfPcieLnAfestatus1                  =  0x00D80178,
+	HwPfPcieLnAfestatus2                  =  0x00D8017C,
+	HwPfPcieLnDfestatus1                  =  0x00D80180,
+	HwPfPcieLnDfestatus2                  =  0x00D80184,
+	HwPfPcieLnDfestatus3                  =  0x00D80188,
+	HwPfPcieLnDfestatus4                  =  0x00D8018C,
+	HwPfPcieLnDfestatus5                  =  0x00D80190,
+	HwPfPcieLnAlphastatus                 =  0x00D80194,
+	HwPfPcieLnFomctrl1                    =  0x00D80198,
+	HwPfPcieLnFomctrl2                    =  0x00D8019C,
+	HwPfPcieLnFomctrl3                    =  0x00D801A0,
+	HwPfPcieLnAclkcalStatus               =  0x00D801A4,
+	HwPfPcieLnOffscorrStatus              =  0x00D801A8,
+	HwPfPcieLnEyewidthStatus              =  0x00D801AC,
+	HwPfPcieLnEyeheightStatus             =  0x00D801B0,
+	HwPfPcieLnAsicTxovr1                  =  0x00D801B4,
+	HwPfPcieLnAsicTxovr2                  =  0x00D801B8,
+	HwPfPcieLnAsicTxovr3                  =  0x00D801BC,
+	HwPfPcieLnTxbiasadjOvr                =  0x00D801C0,
+	HwPfPcieLnTxcsr                       =  0x00D801C4,
+	HwPfPcieLnTxtest                      =  0x00D801C8,
+	HwPfPcieLnTxtestword                  =  0x00D801CC,
+	HwPfPcieLnTxtestwordHigh              =  0x00D801D0,
+	HwPfPcieLnTxdrive                     =  0x00D801D4,
+	HwPfPcieLnMtcsLn                      =  0x00D801D8,
+	HwPfPcieLnStatsumLn                   =  0x00D801DC,
+	HwPfPcieLnRcbusScratch                =  0x00D801E0,
+	HwPfPcieLnRcbusMinorrev               =  0x00D801F0,
+	HwPfPcieLnRcbusMajorrev               =  0x00D801F4,
+	HwPfPcieLnRcbusBlocktype              =  0x00D801F8,
+	HwPfPcieSupPllcsr                     =  0x00D80800,
+	HwPfPcieSupPlldiv                     =  0x00D80804,
+	HwPfPcieSupPllcal                     =  0x00D80808,
+	HwPfPcieSupPllcalsts                  =  0x00D8080C,
+	HwPfPcieSupPllmeas                    =  0x00D80810,
+	HwPfPcieSupPlldactrim                 =  0x00D80814,
+	HwPfPcieSupPllbiastrim                =  0x00D80818,
+	HwPfPcieSupPllbwtrim                  =  0x00D8081C,
+	HwPfPcieSupPllcaldly                  =  0x00D80820,
+	HwPfPcieSupRefclkonpclkctrl           =  0x00D80824,
+	HwPfPcieSupPclkdelay                  =  0x00D80828,
+	HwPfPcieSupPhyconfig                  =  0x00D8082C,
+	HwPfPcieSupRcalIntf                   =  0x00D80830,
+	HwPfPcieSupAuxcsr                     =  0x00D80834,
+	HwPfPcieSupVref                       =  0x00D80838,
+	HwPfPcieSupLinkmode                   =  0x00D8083C,
+	HwPfPcieSupRrefcalctl                 =  0x00D80840,
+	HwPfPcieSupRrefcal                    =  0x00D80844,
+	HwPfPcieSupRrefcaldly                 =  0x00D80848,
+	HwPfPcieSupTximpcalctl                =  0x00D8084C,
+	HwPfPcieSupTximpcal                   =  0x00D80850,
+	HwPfPcieSupTximpoffset                =  0x00D80854,
+	HwPfPcieSupTximpcaldly                =  0x00D80858,
+	HwPfPcieSupRximpcalctl                =  0x00D8085C,
+	HwPfPcieSupRximpcal                   =  0x00D80860,
+	HwPfPcieSupRximpoffset                =  0x00D80864,
+	HwPfPcieSupRximpcaldly                =  0x00D80868,
+	HwPfPcieSupFence                      =  0x00D8086C,
+	HwPfPcieSupMtcs                       =  0x00D80870,
+	HwPfPcieSupStatsum                    =  0x00D809B8,
+	HwPfPciePcsDpStatus0                  =  0x00D81000,
+	HwPfPciePcsDpControl0                 =  0x00D81004,
+	HwPfPciePcsPmaStatusLane0             =  0x00D81008,
+	HwPfPciePcsPipeStatusLane0            =  0x00D8100C,
+	HwPfPciePcsTxdeemph0Lane0             =  0x00D81010,
+	HwPfPciePcsTxdeemph1Lane0             =  0x00D81014,
+	HwPfPciePcsInternalStatusLane0        =  0x00D81018,
+	HwPfPciePcsDpStatus1                  =  0x00D8101C,
+	HwPfPciePcsDpControl1                 =  0x00D81020,
+	HwPfPciePcsPmaStatusLane1             =  0x00D81024,
+	HwPfPciePcsPipeStatusLane1            =  0x00D81028,
+	HwPfPciePcsTxdeemph0Lane1             =  0x00D8102C,
+	HwPfPciePcsTxdeemph1Lane1             =  0x00D81030,
+	HwPfPciePcsInternalStatusLane1        =  0x00D81034,
+	HwPfPciePcsDpStatus2                  =  0x00D81038,
+	HwPfPciePcsDpControl2                 =  0x00D8103C,
+	HwPfPciePcsPmaStatusLane2             =  0x00D81040,
+	HwPfPciePcsPipeStatusLane2            =  0x00D81044,
+	HwPfPciePcsTxdeemph0Lane2             =  0x00D81048,
+	HwPfPciePcsTxdeemph1Lane2             =  0x00D8104C,
+	HwPfPciePcsInternalStatusLane2        =  0x00D81050,
+	HwPfPciePcsDpStatus3                  =  0x00D81054,
+	HwPfPciePcsDpControl3                 =  0x00D81058,
+	HwPfPciePcsPmaStatusLane3             =  0x00D8105C,
+	HwPfPciePcsPipeStatusLane3            =  0x00D81060,
+	HwPfPciePcsTxdeemph0Lane3             =  0x00D81064,
+	HwPfPciePcsTxdeemph1Lane3             =  0x00D81068,
+	HwPfPciePcsInternalStatusLane3        =  0x00D8106C,
+	HwPfPciePcsEbStatus0                  =  0x00D81070,
+	HwPfPciePcsEbStatus1                  =  0x00D81074,
+	HwPfPciePcsEbStatus2                  =  0x00D81078,
+	HwPfPciePcsEbStatus3                  =  0x00D8107C,
+	HwPfPciePcsPllSettingPcieG1           =  0x00D81088,
+	HwPfPciePcsPllSettingPcieG2           =  0x00D8108C,
+	HwPfPciePcsPllSettingPcieG3           =  0x00D81090,
+	HwPfPciePcsControl                    =  0x00D81094,
+	HwPfPciePcsEqControl                  =  0x00D81098,
+	HwPfPciePcsEqTimer                    =  0x00D8109C,
+	HwPfPciePcsEqErrStatus                =  0x00D810A0,
+	HwPfPciePcsEqErrCount                 =  0x00D810A4,
+	HwPfPciePcsStatus                     =  0x00D810A8,
+	HwPfPciePcsMiscRegister               =  0x00D810AC,
+	HwPfPciePcsObsControl                 =  0x00D810B0,
+	HwPfPciePcsPrbsCount0                 =  0x00D81200,
+	HwPfPciePcsBistControl0               =  0x00D81204,
+	HwPfPciePcsBistStaticWord00           =  0x00D81208,
+	HwPfPciePcsBistStaticWord10           =  0x00D8120C,
+	HwPfPciePcsBistStaticWord20           =  0x00D81210,
+	HwPfPciePcsBistStaticWord30           =  0x00D81214,
+	HwPfPciePcsPrbsCount1                 =  0x00D81220,
+	HwPfPciePcsBistControl1               =  0x00D81224,
+	HwPfPciePcsBistStaticWord01           =  0x00D81228,
+	HwPfPciePcsBistStaticWord11           =  0x00D8122C,
+	HwPfPciePcsBistStaticWord21           =  0x00D81230,
+	HwPfPciePcsBistStaticWord31           =  0x00D81234,
+	HwPfPciePcsPrbsCount2                 =  0x00D81240,
+	HwPfPciePcsBistControl2               =  0x00D81244,
+	HwPfPciePcsBistStaticWord02           =  0x00D81248,
+	HwPfPciePcsBistStaticWord12           =  0x00D8124C,
+	HwPfPciePcsBistStaticWord22           =  0x00D81250,
+	HwPfPciePcsBistStaticWord32           =  0x00D81254,
+	HwPfPciePcsPrbsCount3                 =  0x00D81260,
+	HwPfPciePcsBistControl3               =  0x00D81264,
+	HwPfPciePcsBistStaticWord03           =  0x00D81268,
+	HwPfPciePcsBistStaticWord13           =  0x00D8126C,
+	HwPfPciePcsBistStaticWord23           =  0x00D81270,
+	HwPfPciePcsBistStaticWord33           =  0x00D81274,
+	HwPfPcieGpexLtssmStateCntrl           =  0x00D90400,
+	HwPfPcieGpexLtssmStateStatus          =  0x00D90404,
+	HwPfPcieGpexSkipFreqTimer             =  0x00D90408,
+	HwPfPcieGpexLaneSelect                =  0x00D9040C,
+	HwPfPcieGpexLaneDeskew                =  0x00D90410,
+	HwPfPcieGpexRxErrorStatus             =  0x00D90414,
+	HwPfPcieGpexLaneNumControl            =  0x00D90418,
+	HwPfPcieGpexNFstControl               =  0x00D9041C,
+	HwPfPcieGpexLinkStatus                =  0x00D90420,
+	HwPfPcieGpexAckReplayTimeout          =  0x00D90438,
+	HwPfPcieGpexSeqNumberStatus           =  0x00D9043C,
+	HwPfPcieGpexCoreClkRatio              =  0x00D90440,
+	HwPfPcieGpexDllTholdControl           =  0x00D90448,
+	HwPfPcieGpexPmTimer                   =  0x00D90450,
+	HwPfPcieGpexPmeTimeout                =  0x00D90454,
+	HwPfPcieGpexAspmL1Timer               =  0x00D90458,
+	HwPfPcieGpexAspmReqTimer              =  0x00D9045C,
+	HwPfPcieGpexAspmL1Dis                 =  0x00D90460,
+	HwPfPcieGpexAdvisoryErrorControl      =  0x00D90468,
+	HwPfPcieGpexId                        =  0x00D90470,
+	HwPfPcieGpexClasscode                 =  0x00D90474,
+	HwPfPcieGpexSubsystemId               =  0x00D90478,
+	HwPfPcieGpexDeviceCapabilities        =  0x00D9047C,
+	HwPfPcieGpexLinkCapabilities          =  0x00D90480,
+	HwPfPcieGpexFunctionNumber            =  0x00D90484,
+	HwPfPcieGpexPmCapabilities            =  0x00D90488,
+	HwPfPcieGpexFunctionSelect            =  0x00D9048C,
+	HwPfPcieGpexErrorCounter              =  0x00D904AC,
+	HwPfPcieGpexConfigReady               =  0x00D904B0,
+	HwPfPcieGpexFcUpdateTimeout           =  0x00D904B8,
+	HwPfPcieGpexFcUpdateTimer             =  0x00D904BC,
+	HwPfPcieGpexVcBufferLoad              =  0x00D904C8,
+	HwPfPcieGpexVcBufferSizeThold         =  0x00D904CC,
+	HwPfPcieGpexVcBufferSelect            =  0x00D904D0,
+	HwPfPcieGpexBarEnable                 =  0x00D904D4,
+	HwPfPcieGpexBarDwordLower             =  0x00D904D8,
+	HwPfPcieGpexBarDwordUpper             =  0x00D904DC,
+	HwPfPcieGpexBarSelect                 =  0x00D904E0,
+	HwPfPcieGpexCreditCounterSelect       =  0x00D904E4,
+	HwPfPcieGpexCreditCounterStatus       =  0x00D904E8,
+	HwPfPcieGpexTlpHeaderSelect           =  0x00D904EC,
+	HwPfPcieGpexTlpHeaderDword0           =  0x00D904F0,
+	HwPfPcieGpexTlpHeaderDword1           =  0x00D904F4,
+	HwPfPcieGpexTlpHeaderDword2           =  0x00D904F8,
+	HwPfPcieGpexTlpHeaderDword3           =  0x00D904FC,
+	HwPfPcieGpexRelaxOrderControl         =  0x00D90500,
+	HwPfPcieGpexBarPrefetch               =  0x00D90504,
+	HwPfPcieGpexFcCheckControl            =  0x00D90508,
+	HwPfPcieGpexFcUpdateTimerTraffic      =  0x00D90518,
+	HwPfPcieGpexPhyControl0               =  0x00D9053C,
+	HwPfPcieGpexPhyControl1               =  0x00D90544,
+	HwPfPcieGpexPhyControl2               =  0x00D9054C,
+	HwPfPcieGpexUserControl0              =  0x00D9055C,
+	HwPfPcieGpexUncorrErrorStatus         =  0x00D905F0,
+	HwPfPcieGpexRxCplError                =  0x00D90620,
+	HwPfPcieGpexRxCplErrorDword0          =  0x00D90624,
+	HwPfPcieGpexRxCplErrorDword1          =  0x00D90628,
+	HwPfPcieGpexRxCplErrorDword2          =  0x00D9062C,
+	HwPfPcieGpexPabSwResetEn              =  0x00D90630,
+	HwPfPcieGpexGen3Control0              =  0x00D90634,
+	HwPfPcieGpexGen3Control1              =  0x00D90638,
+	HwPfPcieGpexGen3Control2              =  0x00D9063C,
+	HwPfPcieGpexGen2ControlCsr            =  0x00D90640,
+	HwPfPcieGpexTotalVfInitialVf0         =  0x00D90644,
+	HwPfPcieGpexTotalVfInitialVf1         =  0x00D90648,
+	HwPfPcieGpexSriovLinkDevId0           =  0x00D90684,
+	HwPfPcieGpexSriovLinkDevId1           =  0x00D90688,
+	HwPfPcieGpexSriovPageSize0            =  0x00D906C4,
+	HwPfPcieGpexSriovPageSize1            =  0x00D906C8,
+	HwPfPcieGpexIdVersion                 =  0x00D906FC,
+	HwPfPcieGpexSriovVfOffsetStride0      =  0x00D90704,
+	HwPfPcieGpexSriovVfOffsetStride1      =  0x00D90708,
+	HwPfPcieGpexGen3DeskewControl         =  0x00D907B4,
+	HwPfPcieGpexGen3EqControl             =  0x00D907B8,
+	HwPfPcieGpexBridgeVersion             =  0x00D90800,
+	HwPfPcieGpexBridgeCapability          =  0x00D90804,
+	HwPfPcieGpexBridgeControl             =  0x00D90808,
+	HwPfPcieGpexBridgeStatus              =  0x00D9080C,
+	HwPfPcieGpexEngineActivityStatus      =  0x00D9081C,
+	HwPfPcieGpexEngineResetControl        =  0x00D90820,
+	HwPfPcieGpexAxiPioControl             =  0x00D90840,
+	HwPfPcieGpexAxiPioStatus              =  0x00D90844,
+	HwPfPcieGpexAmbaSlaveCmdStatus        =  0x00D90848,
+	HwPfPcieGpexPexPioControl             =  0x00D908C0,
+	HwPfPcieGpexPexPioStatus              =  0x00D908C4,
+	HwPfPcieGpexAmbaMasterStatus          =  0x00D908C8,
+	HwPfPcieGpexCsrSlaveCmdStatus         =  0x00D90920,
+	HwPfPcieGpexMailboxAxiControl         =  0x00D90A50,
+	HwPfPcieGpexMailboxAxiData            =  0x00D90A54,
+	HwPfPcieGpexMailboxPexControl         =  0x00D90A90,
+	HwPfPcieGpexMailboxPexData            =  0x00D90A94,
+	HwPfPcieGpexPexInterruptEnable        =  0x00D90AD0,
+	HwPfPcieGpexPexInterruptStatus        =  0x00D90AD4,
+	HwPfPcieGpexPexInterruptAxiPioVector  =  0x00D90AD8,
+	HwPfPcieGpexPexInterruptPexPioVector  =  0x00D90AE0,
+	HwPfPcieGpexPexInterruptMiscVector    =  0x00D90AF8,
+	HwPfPcieGpexAmbaInterruptPioEnable    =  0x00D90B00,
+	HwPfPcieGpexAmbaInterruptMiscEnable   =  0x00D90B0C,
+	HwPfPcieGpexAmbaInterruptPioStatus    =  0x00D90B10,
+	HwPfPcieGpexAmbaInterruptMiscStatus   =  0x00D90B1C,
+	HwPfPcieGpexPexPmControl              =  0x00D90B80,
+	HwPfPcieGpexSlotMisc                  =  0x00D90B88,
+	HwPfPcieGpexAxiAddrMappingControl     =  0x00D90BA0,
+	HwPfPcieGpexAxiAddrMappingWindowAxiBase     =  0x00D90BA4,
+	HwPfPcieGpexAxiAddrMappingWindowPexBaseLow  =  0x00D90BA8,
+	HwPfPcieGpexAxiAddrMappingWindowPexBaseHigh =  0x00D90BAC,
+	HwPfPcieGpexPexBarAddrFunc0Bar0       =  0x00D91BA0,
+	HwPfPcieGpexPexBarAddrFunc0Bar1       =  0x00D91BA4,
+	HwPfPcieGpexAxiAddrMappingPcieHdrParam =  0x00D95BA0,
+	HwPfPcieGpexExtAxiAddrMappingAxiBase  =  0x00D980A0,
+	HwPfPcieGpexPexExtBarAddrFunc0Bar0    =  0x00D984A0,
+	HwPfPcieGpexPexExtBarAddrFunc0Bar1    =  0x00D984A4,
+	HwPfPcieGpexAmbaInterruptFlrEnable    =  0x00D9B960,
+	HwPfPcieGpexAmbaInterruptFlrStatus    =  0x00D9B9A0,
+	HwPfPcieGpexExtAxiAddrMappingSize     =  0x00D9BAF0,
+	HwPfPcieGpexPexPioAwcacheControl      =  0x00D9C300,
+	HwPfPcieGpexPexPioArcacheControl      =  0x00D9C304,
+	HwPfPcieGpexPabObSizeControlVc0       =  0x00D9C310
+};
+
+/* TIP PF Interrupt numbers */
+enum {
+	ACC100_PF_INT_QMGR_AQ_OVERFLOW = 0,
+	ACC100_PF_INT_DOORBELL_VF_2_PF = 1,
+	ACC100_PF_INT_DMA_DL_DESC_IRQ = 2,
+	ACC100_PF_INT_DMA_UL_DESC_IRQ = 3,
+	ACC100_PF_INT_DMA_MLD_DESC_IRQ = 4,
+	ACC100_PF_INT_DMA_UL5G_DESC_IRQ = 5,
+	ACC100_PF_INT_DMA_DL5G_DESC_IRQ = 6,
+	ACC100_PF_INT_ILLEGAL_FORMAT = 7,
+	ACC100_PF_INT_QMGR_DISABLED_ACCESS = 8,
+	ACC100_PF_INT_QMGR_AQ_OVERTHRESHOLD = 9,
+	ACC100_PF_INT_ARAM_ACCESS_ERR = 10,
+	ACC100_PF_INT_ARAM_ECC_1BIT_ERR = 11,
+	ACC100_PF_INT_PARITY_ERR = 12,
+	ACC100_PF_INT_QMGR_ERR = 13,
+	ACC100_PF_INT_INT_REQ_OVERFLOW = 14,
+	ACC100_PF_INT_APB_TIMEOUT = 15,
+};
+
+#endif /* ACC100_PF_ENUM_H */
diff --git a/drivers/baseband/acc100/acc100_vf_enum.h b/drivers/baseband/acc100/acc100_vf_enum.h
new file mode 100644
index 0000000..b512af3
--- /dev/null
+++ b/drivers/baseband/acc100/acc100_vf_enum.h
@@ -0,0 +1,73 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017 Intel Corporation
+ */
+
+#ifndef ACC100_VF_ENUM_H
+#define ACC100_VF_ENUM_H
+
+/*
+ * ACC100 Register mapping on VF BAR0
+ * This is automatically generated from RDL; the format may change with a new RDL
+ */
+enum {
+	HWVfQmgrIngressAq             =  0x00000000,
+	HWVfHiVfToPfDbellVf           =  0x00000800,
+	HWVfHiPfToVfDbellVf           =  0x00000808,
+	HWVfHiInfoRingBaseLoVf        =  0x00000810,
+	HWVfHiInfoRingBaseHiVf        =  0x00000814,
+	HWVfHiInfoRingPointerVf       =  0x00000818,
+	HWVfHiInfoRingIntWrEnVf       =  0x00000820,
+	HWVfHiInfoRingPf2VfWrEnVf     =  0x00000824,
+	HWVfHiMsixVectorMapperVf      =  0x00000860,
+	HWVfDmaFec5GulDescBaseLoRegVf =  0x00000920,
+	HWVfDmaFec5GulDescBaseHiRegVf =  0x00000924,
+	HWVfDmaFec5GulRespPtrLoRegVf  =  0x00000928,
+	HWVfDmaFec5GulRespPtrHiRegVf  =  0x0000092C,
+	HWVfDmaFec5GdlDescBaseLoRegVf =  0x00000940,
+	HWVfDmaFec5GdlDescBaseHiRegVf =  0x00000944,
+	HWVfDmaFec5GdlRespPtrLoRegVf  =  0x00000948,
+	HWVfDmaFec5GdlRespPtrHiRegVf  =  0x0000094C,
+	HWVfDmaFec4GulDescBaseLoRegVf =  0x00000960,
+	HWVfDmaFec4GulDescBaseHiRegVf =  0x00000964,
+	HWVfDmaFec4GulRespPtrLoRegVf  =  0x00000968,
+	HWVfDmaFec4GulRespPtrHiRegVf  =  0x0000096C,
+	HWVfDmaFec4GdlDescBaseLoRegVf =  0x00000980,
+	HWVfDmaFec4GdlDescBaseHiRegVf =  0x00000984,
+	HWVfDmaFec4GdlRespPtrLoRegVf  =  0x00000988,
+	HWVfDmaFec4GdlRespPtrHiRegVf  =  0x0000098C,
+	HWVfDmaDdrBaseRangeRoVf       =  0x000009A0,
+	HWVfQmgrAqResetVf             =  0x00000E00,
+	HWVfQmgrRingSizeVf            =  0x00000E04,
+	HWVfQmgrGrpDepthLog20Vf       =  0x00000E08,
+	HWVfQmgrGrpDepthLog21Vf       =  0x00000E0C,
+	HWVfQmgrGrpFunction0Vf        =  0x00000E10,
+	HWVfQmgrGrpFunction1Vf        =  0x00000E14,
+	HWVfPmACntrlRegVf             =  0x00000F40,
+	HWVfPmACountVf                =  0x00000F48,
+	HWVfPmAKCntLoVf               =  0x00000F50,
+	HWVfPmAKCntHiVf               =  0x00000F54,
+	HWVfPmADeltaCntLoVf           =  0x00000F60,
+	HWVfPmADeltaCntHiVf           =  0x00000F64,
+	HWVfPmBCntrlRegVf             =  0x00000F80,
+	HWVfPmBCountVf                =  0x00000F88,
+	HWVfPmBKCntLoVf               =  0x00000F90,
+	HWVfPmBKCntHiVf               =  0x00000F94,
+	HWVfPmBDeltaCntLoVf           =  0x00000FA0,
+	HWVfPmBDeltaCntHiVf           =  0x00000FA4
+};
+
+/* TIP VF Interrupt numbers */
+enum {
+	ACC100_VF_INT_QMGR_AQ_OVERFLOW = 0,
+	ACC100_VF_INT_DOORBELL_VF_2_PF = 1,
+	ACC100_VF_INT_DMA_DL_DESC_IRQ = 2,
+	ACC100_VF_INT_DMA_UL_DESC_IRQ = 3,
+	ACC100_VF_INT_DMA_MLD_DESC_IRQ = 4,
+	ACC100_VF_INT_DMA_UL5G_DESC_IRQ = 5,
+	ACC100_VF_INT_DMA_DL5G_DESC_IRQ = 6,
+	ACC100_VF_INT_ILLEGAL_FORMAT = 7,
+	ACC100_VF_INT_QMGR_DISABLED_ACCESS = 8,
+	ACC100_VF_INT_QMGR_AQ_OVERTHRESHOLD = 9,
+};
+
+#endif /* ACC100_VF_ENUM_H */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 6f46df0..cd77570 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -5,6 +5,9 @@
 #ifndef _RTE_ACC100_PMD_H_
 #define _RTE_ACC100_PMD_H_
 
+#include "acc100_pf_enum.h"
+#include "acc100_vf_enum.h"
+
 /* Helper macro for logging */
 #define rte_bbdev_log(level, fmt, ...) \
 	rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \
@@ -27,6 +30,493 @@
 #define RTE_ACC100_PF_DEVICE_ID        (0x0d5c)
 #define RTE_ACC100_VF_DEVICE_ID        (0x0d5d)
 
+/* Define as 1 to use only a single FEC engine */
+#ifndef RTE_ACC100_SINGLE_FEC
+#define RTE_ACC100_SINGLE_FEC 0
+#endif
+
+/* Values used in filling in descriptors */
+#define ACC100_DMA_DESC_TYPE           2
+#define ACC100_DMA_CODE_BLK_MODE       0
+#define ACC100_DMA_BLKID_FCW           1
+#define ACC100_DMA_BLKID_IN            2
+#define ACC100_DMA_BLKID_OUT_ENC       1
+#define ACC100_DMA_BLKID_OUT_HARD      1
+#define ACC100_DMA_BLKID_OUT_SOFT      2
+#define ACC100_DMA_BLKID_OUT_HARQ      3
+#define ACC100_DMA_BLKID_IN_HARQ       3
+
+/* Values used in filling in decode FCWs */
+#define ACC100_FCW_TD_VER              1
+#define ACC100_FCW_TD_EXT_COLD_REG_EN  1
+#define ACC100_FCW_TD_AUTOMAP          0x0f
+#define ACC100_FCW_TD_RVIDX_0          2
+#define ACC100_FCW_TD_RVIDX_1          26
+#define ACC100_FCW_TD_RVIDX_2          50
+#define ACC100_FCW_TD_RVIDX_3          74
+
+/* Values used in writing to the registers */
+#define ACC100_REG_IRQ_EN_ALL          0x1FF83FF  /* Enable all interrupts */
+
+/* ACC100 Specific Dimensioning */
+#define ACC100_SIZE_64MBYTE            (64*1024*1024)
+/* Number of elements in an Info Ring */
+#define ACC100_INFO_RING_NUM_ENTRIES   1024
+/* Number of elements in HARQ layout memory */
+#define ACC100_HARQ_LAYOUT             (64*1024*1024)
+/* Assumed offset for HARQ in memory */
+#define ACC100_HARQ_OFFSET             (32*1024)
+/* Mask used to calculate an index in an Info Ring array (not a byte offset) */
+#define ACC100_INFO_RING_MASK          (ACC100_INFO_RING_NUM_ENTRIES-1)
+/* Number of Virtual Functions ACC100 supports */
+#define ACC100_NUM_VFS                  16
+#define ACC100_NUM_QGRPS                 8
+#define ACC100_NUM_QGRPS_PER_WORD        8
+#define ACC100_NUM_AQS                  16
+#define MAX_ENQ_BATCH_SIZE          255
+/* All ACC100 registers are 32-bit (4 byte) aligned */
+#define BYTES_IN_WORD                 4
+#define MAX_E_MBUF                64000
+
+#define GRP_ID_SHIFT    10 /* Queue Index Hierarchy */
+#define VF_ID_SHIFT     4  /* Queue Index Hierarchy */
+#define VF_OFFSET_QOS   16 /* offset in Memory Space specific to QoS Mon */
+#define TMPL_PRI_0      0x03020100
+#define TMPL_PRI_1      0x07060504
+#define TMPL_PRI_2      0x0b0a0908
+#define TMPL_PRI_3      0x0f0e0d0c
+#define QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
+#define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
+
+#define ACC100_NUM_TMPL  32
+#define VF_OFFSET_QOS 16 /* offset in Memory Space specific to QoS Mon */
+/* Mapping of signals for the available engines */
+#define SIG_UL_5G      0
+#define SIG_UL_5G_LAST 7
+#define SIG_DL_5G      13
+#define SIG_DL_5G_LAST 15
+#define SIG_UL_4G      16
+#define SIG_UL_4G_LAST 21
+#define SIG_DL_4G      27
+#define SIG_DL_4G_LAST 31
+
+/* Maximum number of attempts to allocate a memory block for all rings */
+#define SW_RING_MEM_ALLOC_ATTEMPTS 5
+#define MAX_QUEUE_DEPTH           1024
+#define ACC100_DMA_MAX_NUM_POINTERS  14
+#define ACC100_DMA_DESC_PADDING      8
+#define ACC100_FCW_PADDING           12
+#define ACC100_DESC_FCW_OFFSET       192
+#define ACC100_DESC_SIZE             256
+#define ACC100_DESC_OFFSET           (ACC100_DESC_SIZE / 64)
+#define ACC100_FCW_TE_BLEN     32
+#define ACC100_FCW_TD_BLEN     24
+#define ACC100_FCW_LE_BLEN     32
+#define ACC100_FCW_LD_BLEN     36
+
+#define ACC100_FCW_VER         2
+#define MUX_5GDL_DESC 6
+#define CMP_ENC_SIZE 20
+#define CMP_DEC_SIZE 24
+#define ENC_OFFSET (32)
+#define DEC_OFFSET (80)
+#define ACC100_EXT_MEM
+#define ACC100_HARQ_OFFSET_THRESHOLD 1024
+
+/* Constants for k0 computation from 3GPP TS 38.212 Table 5.4.2.1-2 */
+#define N_ZC_1 66 /* N = 66 Zc for BG 1 */
+#define N_ZC_2 50 /* N = 50 Zc for BG 2 */
+#define K0_1_1 17 /* K0 fraction numerator for rv 1 and BG 1 */
+#define K0_1_2 13 /* K0 fraction numerator for rv 1 and BG 2 */
+#define K0_2_1 33 /* K0 fraction numerator for rv 2 and BG 1 */
+#define K0_2_2 25 /* K0 fraction numerator for rv 2 and BG 2 */
+#define K0_3_1 56 /* K0 fraction numerator for rv 3 and BG 1 */
+#define K0_3_2 43 /* K0 fraction numerator for rv 3 and BG 2 */
+
+/* ACC100 Configuration */
+#define ACC100_DDR_ECC_ENABLE
+#define ACC100_CFG_DMA_ERROR 0x3D7
+#define ACC100_CFG_AXI_CACHE 0x11
+#define ACC100_CFG_QMGR_HI_P 0x0F0F
+#define ACC100_CFG_PCI_AXI 0xC003
+#define ACC100_CFG_PCI_BRIDGE 0x40006033
+#define ACC100_ENGINE_OFFSET 0x1000
+#define ACC100_RESET_HI 0x20100
+#define ACC100_RESET_LO 0x20000
+#define ACC100_RESET_HARD 0x1FF
+#define ACC100_ENGINES_MAX 9
+#define LONG_WAIT 1000
+
+/* ACC100 DMA Descriptor triplet */
+struct acc100_dma_triplet {
+	uint64_t address;
+	uint32_t blen:20,
+		res0:4,
+		last:1,
+		dma_ext:1,
+		res1:2,
+		blkid:4;
+} __rte_packed;
+
+
+
+/* ACC100 DMA Response Descriptor */
+union acc100_dma_rsp_desc {
+	uint32_t val;
+	struct {
+		uint32_t crc_status:1,
+			synd_ok:1,
+			dma_err:1,
+			neg_stop:1,
+			fcw_err:1,
+			output_err:1,
+			input_err:1,
+			timestampEn:1,
+			iterCountFrac:8,
+			iter_cnt:8,
+			rsrvd3:6,
+			sdone:1,
+			fdone:1;
+		uint32_t add_info_0;
+		uint32_t add_info_1;
+	};
+};
+
+
+/* ACC100 Queue Manager Enqueue PCI Register */
+union acc100_enqueue_reg_fmt {
+	uint32_t val;
+	struct {
+		uint32_t num_elem:8,
+			addr_offset:3,
+			rsrvd:1,
+			req_elem_addr:20;
+	};
+};
+
+/* FEC 4G Uplink Frame Control Word */
+struct __rte_packed acc100_fcw_td {
+	uint8_t fcw_ver:4,
+		num_maps:4; /* Unused */
+	uint8_t filler:6, /* Unused */
+		rsrvd0:1,
+		bypass_sb_deint:1;
+	uint16_t k_pos;
+	uint16_t k_neg; /* Unused */
+	uint8_t c_neg; /* Unused */
+	uint8_t c; /* Unused */
+	uint32_t ea; /* Unused */
+	uint32_t eb; /* Unused */
+	uint8_t cab; /* Unused */
+	uint8_t k0_start_col; /* Unused */
+	uint8_t rsrvd1;
+	uint8_t code_block_mode:1, /* Unused */
+		turbo_crc_type:1,
+		rsrvd2:3,
+		bypass_teq:1, /* Unused */
+		soft_output_en:1, /* Unused */
+		ext_td_cold_reg_en:1;
+	union { /* External Cold register */
+		uint32_t ext_td_cold_reg;
+		struct {
+			uint32_t min_iter:4, /* Unused */
+				max_iter:4,
+				ext_scale:5, /* Unused */
+				rsrvd3:3,
+				early_stop_en:1, /* Unused */
+				sw_soft_out_dis:1, /* Unused */
+				sw_et_cont:1, /* Unused */
+				sw_soft_out_saturation:1, /* Unused */
+				half_iter_on:1, /* Unused */
+				raw_decoder_input_on:1, /* Unused */
+				rsrvd4:10;
+		};
+	};
+};
+
+/* FEC 5GNR Uplink Frame Control Word */
+struct __rte_packed acc100_fcw_ld {
+	uint32_t FCWversion:4,
+		qm:4,
+		nfiller:11,
+		BG:1,
+		Zc:9,
+		res0:1,
+		synd_precoder:1,
+		synd_post:1;
+	uint32_t ncb:16,
+		k0:16;
+	uint32_t rm_e:24,
+		hcin_en:1,
+		hcout_en:1,
+		crc_select:1,
+		bypass_dec:1,
+		bypass_intlv:1,
+		so_en:1,
+		so_bypass_rm:1,
+		so_bypass_intlv:1;
+	uint32_t hcin_offset:16,
+		hcin_size0:16;
+	uint32_t hcin_size1:16,
+		hcin_decomp_mode:3,
+		llr_pack_mode:1,
+		hcout_comp_mode:3,
+		res2:1,
+		dec_convllr:4,
+		hcout_convllr:4;
+	uint32_t itmax:7,
+		itstop:1,
+		so_it:7,
+		res3:1,
+		hcout_offset:16;
+	uint32_t hcout_size0:16,
+		hcout_size1:16;
+	uint32_t gain_i:8,
+		gain_h:8,
+		negstop_th:16;
+	uint32_t negstop_it:7,
+		negstop_en:1,
+		res4:24;
+};
+
+/* FEC 4G Downlink Frame Control Word */
+struct __rte_packed acc100_fcw_te {
+	uint16_t k_neg;
+	uint16_t k_pos;
+	uint8_t c_neg;
+	uint8_t c;
+	uint8_t filler;
+	uint8_t cab;
+	uint32_t ea:17,
+		rsrvd0:15;
+	uint32_t eb:17,
+		rsrvd1:15;
+	uint16_t ncb_neg;
+	uint16_t ncb_pos;
+	uint8_t rv_idx0:2,
+		rsrvd2:2,
+		rv_idx1:2,
+		rsrvd3:2;
+	uint8_t bypass_rv_idx0:1,
+		bypass_rv_idx1:1,
+		bypass_rm:1,
+		rsrvd4:5;
+	uint8_t rsrvd5:1,
+		rsrvd6:3,
+		code_block_crc:1,
+		rsrvd7:3;
+	uint8_t code_block_mode:1,
+		rsrvd8:7;
+	uint64_t rsrvd9;
+};
+
+/* FEC 5GNR Downlink Frame Control Word */
+struct __rte_packed acc100_fcw_le {
+	uint32_t FCWversion:4,
+		qm:4,
+		nfiller:11,
+		BG:1,
+		Zc:9,
+		res0:3;
+	uint32_t ncb:16,
+		k0:16;
+	uint32_t rm_e:24,
+		res1:2,
+		crc_select:1,
+		res2:1,
+		bypass_intlv:1,
+		res3:3;
+	uint32_t res4_a:12,
+		mcb_count:3,
+		res4_b:17;
+	uint32_t res5;
+	uint32_t res6;
+	uint32_t res7;
+	uint32_t res8;
+};
+
+/* ACC100 DMA Request Descriptor */
+struct __rte_packed acc100_dma_req_desc {
+	union {
+		struct{
+			uint32_t type:4,
+				rsrvd0:26,
+				sdone:1,
+				fdone:1;
+			uint32_t rsrvd1;
+			uint32_t rsrvd2;
+			uint32_t pass_param:8,
+				sdone_enable:1,
+				irq_enable:1,
+				timeStampEn:1,
+				res0:5,
+				numCBs:4,
+				res1:4,
+				m2dlen:4,
+				d2mlen:4;
+		};
+		struct{
+			uint32_t word0;
+			uint32_t word1;
+			uint32_t word2;
+			uint32_t word3;
+		};
+	};
+	struct acc100_dma_triplet data_ptrs[ACC100_DMA_MAX_NUM_POINTERS];
+
+	/* Virtual addresses used to retrieve SW context info */
+	union {
+		void *op_addr;
+		uint64_t pad1;  /* pad to 64 bits */
+	};
+	/*
+	 * Stores additional information needed for driver processing:
+	 * - last_desc_in_batch - flag used to mark last descriptor (CB)
+	 *                        in batch
+	 * - cbs_in_tb - stores information about total number of Code Blocks
+	 *               in currently processed Transport Block
+	 */
+	union {
+		struct {
+			union {
+				struct acc100_fcw_ld fcw_ld;
+				struct acc100_fcw_td fcw_td;
+				struct acc100_fcw_le fcw_le;
+				struct acc100_fcw_te fcw_te;
+				uint32_t pad2[ACC100_FCW_PADDING];
+			};
+			uint32_t last_desc_in_batch :8,
+				cbs_in_tb:8,
+				pad4 : 16;
+		};
+		uint64_t pad3[ACC100_DMA_DESC_PADDING]; /* pad to 64 bits */
+	};
+};
+
+/* ACC100 DMA Descriptor */
+union acc100_dma_desc {
+	struct acc100_dma_req_desc req;
+	union acc100_dma_rsp_desc rsp;
+};
+
+
+/* Union describing HARQ layout entry */
+union acc100_harq_layout_data {
+	uint32_t val;
+	struct {
+		uint16_t offset;
+		uint16_t size0;
+	};
+} __rte_packed;
+
+
+/* Union describing Info Ring entry */
+union acc100_info_ring_data {
+	uint32_t val;
+	struct {
+		union {
+			uint16_t detailed_info;
+			struct {
+				uint16_t aq_id: 4;
+				uint16_t qg_id: 4;
+				uint16_t vf_id: 6;
+				uint16_t reserved: 2;
+			};
+		};
+		uint16_t int_nb: 7;
+		uint16_t msi_0: 1;
+		uint16_t vf2pf: 6;
+		uint16_t loop: 1;
+		uint16_t valid: 1;
+	};
+} __rte_packed;
+
+struct acc100_registry_addr {
+	unsigned int dma_ring_dl5g_hi;
+	unsigned int dma_ring_dl5g_lo;
+	unsigned int dma_ring_ul5g_hi;
+	unsigned int dma_ring_ul5g_lo;
+	unsigned int dma_ring_dl4g_hi;
+	unsigned int dma_ring_dl4g_lo;
+	unsigned int dma_ring_ul4g_hi;
+	unsigned int dma_ring_ul4g_lo;
+	unsigned int ring_size;
+	unsigned int info_ring_hi;
+	unsigned int info_ring_lo;
+	unsigned int info_ring_en;
+	unsigned int info_ring_ptr;
+	unsigned int tail_ptrs_dl5g_hi;
+	unsigned int tail_ptrs_dl5g_lo;
+	unsigned int tail_ptrs_ul5g_hi;
+	unsigned int tail_ptrs_ul5g_lo;
+	unsigned int tail_ptrs_dl4g_hi;
+	unsigned int tail_ptrs_dl4g_lo;
+	unsigned int tail_ptrs_ul4g_hi;
+	unsigned int tail_ptrs_ul4g_lo;
+	unsigned int depth_log0_offset;
+	unsigned int depth_log1_offset;
+	unsigned int qman_group_func;
+	unsigned int ddr_range;
+};
+
+/* Structure holding registry addresses for PF */
+static const struct acc100_registry_addr pf_reg_addr = {
+	.dma_ring_dl5g_hi = HWPfDmaFec5GdlDescBaseHiRegVf,
+	.dma_ring_dl5g_lo = HWPfDmaFec5GdlDescBaseLoRegVf,
+	.dma_ring_ul5g_hi = HWPfDmaFec5GulDescBaseHiRegVf,
+	.dma_ring_ul5g_lo = HWPfDmaFec5GulDescBaseLoRegVf,
+	.dma_ring_dl4g_hi = HWPfDmaFec4GdlDescBaseHiRegVf,
+	.dma_ring_dl4g_lo = HWPfDmaFec4GdlDescBaseLoRegVf,
+	.dma_ring_ul4g_hi = HWPfDmaFec4GulDescBaseHiRegVf,
+	.dma_ring_ul4g_lo = HWPfDmaFec4GulDescBaseLoRegVf,
+	.ring_size = HWPfQmgrRingSizeVf,
+	.info_ring_hi = HWPfHiInfoRingBaseHiRegPf,
+	.info_ring_lo = HWPfHiInfoRingBaseLoRegPf,
+	.info_ring_en = HWPfHiInfoRingIntWrEnRegPf,
+	.info_ring_ptr = HWPfHiInfoRingPointerRegPf,
+	.tail_ptrs_dl5g_hi = HWPfDmaFec5GdlRespPtrHiRegVf,
+	.tail_ptrs_dl5g_lo = HWPfDmaFec5GdlRespPtrLoRegVf,
+	.tail_ptrs_ul5g_hi = HWPfDmaFec5GulRespPtrHiRegVf,
+	.tail_ptrs_ul5g_lo = HWPfDmaFec5GulRespPtrLoRegVf,
+	.tail_ptrs_dl4g_hi = HWPfDmaFec4GdlRespPtrHiRegVf,
+	.tail_ptrs_dl4g_lo = HWPfDmaFec4GdlRespPtrLoRegVf,
+	.tail_ptrs_ul4g_hi = HWPfDmaFec4GulRespPtrHiRegVf,
+	.tail_ptrs_ul4g_lo = HWPfDmaFec4GulRespPtrLoRegVf,
+	.depth_log0_offset = HWPfQmgrGrpDepthLog20Vf,
+	.depth_log1_offset = HWPfQmgrGrpDepthLog21Vf,
+	.qman_group_func = HWPfQmgrGrpFunction0,
+	.ddr_range = HWPfDmaVfDdrBaseRw,
+};
+
+/* Structure holding registry addresses for VF */
+static const struct acc100_registry_addr vf_reg_addr = {
+	.dma_ring_dl5g_hi = HWVfDmaFec5GdlDescBaseHiRegVf,
+	.dma_ring_dl5g_lo = HWVfDmaFec5GdlDescBaseLoRegVf,
+	.dma_ring_ul5g_hi = HWVfDmaFec5GulDescBaseHiRegVf,
+	.dma_ring_ul5g_lo = HWVfDmaFec5GulDescBaseLoRegVf,
+	.dma_ring_dl4g_hi = HWVfDmaFec4GdlDescBaseHiRegVf,
+	.dma_ring_dl4g_lo = HWVfDmaFec4GdlDescBaseLoRegVf,
+	.dma_ring_ul4g_hi = HWVfDmaFec4GulDescBaseHiRegVf,
+	.dma_ring_ul4g_lo = HWVfDmaFec4GulDescBaseLoRegVf,
+	.ring_size = HWVfQmgrRingSizeVf,
+	.info_ring_hi = HWVfHiInfoRingBaseHiVf,
+	.info_ring_lo = HWVfHiInfoRingBaseLoVf,
+	.info_ring_en = HWVfHiInfoRingIntWrEnVf,
+	.info_ring_ptr = HWVfHiInfoRingPointerVf,
+	.tail_ptrs_dl5g_hi = HWVfDmaFec5GdlRespPtrHiRegVf,
+	.tail_ptrs_dl5g_lo = HWVfDmaFec5GdlRespPtrLoRegVf,
+	.tail_ptrs_ul5g_hi = HWVfDmaFec5GulRespPtrHiRegVf,
+	.tail_ptrs_ul5g_lo = HWVfDmaFec5GulRespPtrLoRegVf,
+	.tail_ptrs_dl4g_hi = HWVfDmaFec4GdlRespPtrHiRegVf,
+	.tail_ptrs_dl4g_lo = HWVfDmaFec4GdlRespPtrLoRegVf,
+	.tail_ptrs_ul4g_hi = HWVfDmaFec4GulRespPtrHiRegVf,
+	.tail_ptrs_ul4g_lo = HWVfDmaFec4GulRespPtrLoRegVf,
+	.depth_log0_offset = HWVfQmgrGrpDepthLog20Vf,
+	.depth_log1_offset = HWVfQmgrGrpDepthLog21Vf,
+	.qman_group_func = HWVfQmgrGrpFunction0Vf,
+	.ddr_range = HWVfDmaDdrBaseRangeRoVf,
+};
+
 /* Private data structure for each ACC100 device */
 struct acc100_device {
 	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread
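As a side note on the K0_x_y and N_ZC_x constants defined in the header above: they encode the circular-buffer starting-position fractions of 3GPP TS 38.212 Table 5.4.2.1-2. A minimal self-contained sketch of how a driver could derive k0 from them follows; the helper name and signature are illustrative and not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative helper (not from the patch): compute the LDPC circular
 * buffer starting position k0 per 3GPP TS 38.212 Table 5.4.2.1-2.
 * bg: base graph (1 or 2), z_c: lifting size, n_cb: circular buffer
 * length, rv_index: redundancy version (0..3).
 */
static inline uint32_t
get_k0(uint32_t n_cb, uint32_t z_c, uint8_t bg, uint8_t rv_index)
{
	if (rv_index == 0)
		return 0;
	if (bg == 1) {
		if (rv_index == 1)
			return (17 * n_cb / (66 * z_c)) * z_c; /* K0_1_1 / N_ZC_1 */
		if (rv_index == 2)
			return (33 * n_cb / (66 * z_c)) * z_c; /* K0_2_1 / N_ZC_1 */
		return (56 * n_cb / (66 * z_c)) * z_c;         /* K0_3_1 / N_ZC_1 */
	}
	if (rv_index == 1)
		return (13 * n_cb / (50 * z_c)) * z_c; /* K0_1_2 / N_ZC_2 */
	if (rv_index == 2)
		return (25 * n_cb / (50 * z_c)) * z_c; /* K0_2_2 / N_ZC_2 */
	return (43 * n_cb / (50 * z_c)) * z_c;         /* K0_3_2 / N_ZC_2 */
}
```

For example, with BG 1, Zc = 384 and a full buffer (n_cb = 66 * 384 = 25344), rv 1 starts at 17 * 384 = 6528.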

* [dpdk-dev] [PATCH v6 05/11] baseband/acc100: add info get function
  2020-09-23  2:19   ` [dpdk-dev] [PATCH v6 00/11] bbdev PMD ACC100 Nicolas Chautru
                       ` (3 preceding siblings ...)
  2020-09-23  2:19     ` [dpdk-dev] [PATCH v6 04/11] baseband/acc100: add register definition file Nicolas Chautru
@ 2020-09-23  2:19     ` Nicolas Chautru
  2020-09-23  2:19     ` [dpdk-dev] [PATCH v6 06/11] baseband/acc100: add queue configuration Nicolas Chautru
                       ` (5 subsequent siblings)
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-23  2:19 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add the "info_get" function to the driver to allow querying the
device.
No processing capabilities are available yet.
Link bbdev-test to support the PMD with null capability.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 app/test-bbdev/meson.build               |   3 +
 drivers/baseband/acc100/rte_acc100_cfg.h |  96 +++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.c | 225 +++++++++++++++++++++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.h |   3 +
 4 files changed, 327 insertions(+)
 create mode 100644 drivers/baseband/acc100/rte_acc100_cfg.h

diff --git a/app/test-bbdev/meson.build b/app/test-bbdev/meson.build
index 18ab6a8..fbd8ae3 100644
--- a/app/test-bbdev/meson.build
+++ b/app/test-bbdev/meson.build
@@ -12,3 +12,6 @@ endif
 if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC')
 	deps += ['pmd_bbdev_fpga_5gnr_fec']
 endif
+if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_ACC100')
+	deps += ['pmd_bbdev_acc100']
+endif
\ No newline at end of file
diff --git a/drivers/baseband/acc100/rte_acc100_cfg.h b/drivers/baseband/acc100/rte_acc100_cfg.h
new file mode 100644
index 0000000..73bbe36
--- /dev/null
+++ b/drivers/baseband/acc100/rte_acc100_cfg.h
@@ -0,0 +1,96 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_ACC100_CFG_H_
+#define _RTE_ACC100_CFG_H_
+
+/**
+ * @file rte_acc100_cfg.h
+ *
+ * Functions for configuring ACC100 HW, exposed directly to applications.
+ * Configuration related to encoding/decoding is done through the
+ * librte_bbdev library.
+ *
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ */
+
+#include <stdint.h>
+#include <stdbool.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+/** Number of Virtual Functions ACC100 supports */
+#define RTE_ACC100_NUM_VFS 16
+
+/**
+ * Definition of Queue Topology for ACC100 Configuration
+ * Some level of detail is abstracted out to expose a clean interface,
+ * given that comprehensive flexibility is not required
+ */
+struct rte_q_topology_t {
+	/** Number of QGroups in incremental order of priority */
+	uint16_t num_qgroups;
+	/**
+	 * All QGroups have the same number of AQs here.
+	 * Note : Could be made a 16-array if more flexibility is really
+	 * required
+	 */
+	uint16_t num_aqs_per_groups;
+	/**
+	 * Depth of the AQs is the same for all QGroups here. Log2 Enum : 2^N
+	 * Note : Could be made a 16-array if more flexibility is really
+	 * required
+	 */
+	uint16_t aq_depth_log2;
+	/**
+	 * Index of the first Queue Group - assuming contiguity
+	 * Initialized as -1
+	 */
+	int8_t first_qgroup_index;
+};
+
+/**
+ * Definition of Arbitration related parameters for ACC100 Configuration
+ */
+struct rte_arbitration_t {
+	/** Default Weight for VF Fairness Arbitration */
+	uint16_t round_robin_weight;
+	uint32_t gbr_threshold1; /**< Guaranteed Bitrate Threshold 1 */
+	uint32_t gbr_threshold2; /**< Guaranteed Bitrate Threshold 2 */
+};
+
+/**
+ * Structure to pass ACC100 configuration.
+ * Note: all VF Bundles will have the same configuration.
+ */
+struct acc100_conf {
+	bool pf_mode_en; /**< 1 if PF is used for dataplane, 0 for VFs */
+	/** 1 if input '1' bit is represented by a positive LLR value, 0 if '1'
+	 * bit is represented by a negative value.
+	 */
+	bool input_pos_llr_1_bit;
+	/** 1 if output '1' bit is represented by a positive value, 0 if '1'
+	 * bit is represented by a negative value.
+	 */
+	bool output_pos_llr_1_bit;
+	uint16_t num_vf_bundles; /**< Number of VF bundles to setup */
+	/** Queue topology for each operation type */
+	struct rte_q_topology_t q_ul_4g;
+	struct rte_q_topology_t q_dl_4g;
+	struct rte_q_topology_t q_ul_5g;
+	struct rte_q_topology_t q_dl_5g;
+	/** Arbitration configuration for each operation type */
+	struct rte_arbitration_t arb_ul_4g[RTE_ACC100_NUM_VFS];
+	struct rte_arbitration_t arb_dl_4g[RTE_ACC100_NUM_VFS];
+	struct rte_arbitration_t arb_ul_5g[RTE_ACC100_NUM_VFS];
+	struct rte_arbitration_t arb_dl_5g[RTE_ACC100_NUM_VFS];
+};
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_ACC100_CFG_H_ */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 1b4cd13..7807a30 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -26,6 +26,184 @@
 RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
 #endif
 
+/* Read a register of an ACC100 device */
+static inline uint32_t
+acc100_reg_read(struct acc100_device *d, uint32_t offset)
+{
+
+	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
+	uint32_t ret = *((volatile uint32_t *)(reg_addr));
+	return rte_le_to_cpu_32(ret);
+}
+
+/* Calculate the offset of the enqueue register */
+static inline uint32_t
+queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
+{
+	if (pf_device)
+		return ((vf_id << 12) + (qgrp_id << 7) + (aq_id << 3) +
+				HWPfQmgrIngressAq);
+	else
+		return ((qgrp_id << 7) + (aq_id << 3) +
+				HWVfQmgrIngressAq);
+}
+
+enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC};
+
+/* Return the queue topology for a Queue Group Index */
+static inline void
+qtopFromAcc(struct rte_q_topology_t **qtop, int acc_enum,
+		struct acc100_conf *acc100_conf)
+{
+	struct rte_q_topology_t *p_qtop;
+	p_qtop = NULL;
+	switch (acc_enum) {
+	case UL_4G:
+		p_qtop = &(acc100_conf->q_ul_4g);
+		break;
+	case UL_5G:
+		p_qtop = &(acc100_conf->q_ul_5g);
+		break;
+	case DL_4G:
+		p_qtop = &(acc100_conf->q_dl_4g);
+		break;
+	case DL_5G:
+		p_qtop = &(acc100_conf->q_dl_5g);
+		break;
+	default:
+		/* NOTREACHED */
+		rte_bbdev_log(ERR, "Unexpected error evaluating qtopFromAcc");
+		break;
+	}
+	*qtop = p_qtop;
+}
+
+static void
+initQTop(struct acc100_conf *acc100_conf)
+{
+	acc100_conf->q_ul_4g.num_aqs_per_groups = 0;
+	acc100_conf->q_ul_4g.num_qgroups = 0;
+	acc100_conf->q_ul_4g.first_qgroup_index = -1;
+	acc100_conf->q_ul_5g.num_aqs_per_groups = 0;
+	acc100_conf->q_ul_5g.num_qgroups = 0;
+	acc100_conf->q_ul_5g.first_qgroup_index = -1;
+	acc100_conf->q_dl_4g.num_aqs_per_groups = 0;
+	acc100_conf->q_dl_4g.num_qgroups = 0;
+	acc100_conf->q_dl_4g.first_qgroup_index = -1;
+	acc100_conf->q_dl_5g.num_aqs_per_groups = 0;
+	acc100_conf->q_dl_5g.num_qgroups = 0;
+	acc100_conf->q_dl_5g.first_qgroup_index = -1;
+}
+
+static inline void
+updateQtop(uint8_t acc, uint8_t qg, struct acc100_conf *acc100_conf,
+		struct acc100_device *d) {
+	uint32_t reg;
+	struct rte_q_topology_t *q_top = NULL;
+	qtopFromAcc(&q_top, acc, acc100_conf);
+	if (unlikely(q_top == NULL))
+		return;
+	uint16_t aq;
+	q_top->num_qgroups++;
+	if (q_top->first_qgroup_index == -1) {
+		q_top->first_qgroup_index = qg;
+		/* Can be optimized to assume all are enabled by default */
+		reg = acc100_reg_read(d, queue_offset(d->pf_device,
+				0, qg, ACC100_NUM_AQS - 1));
+		if (reg & QUEUE_ENABLE) {
+			q_top->num_aqs_per_groups = ACC100_NUM_AQS;
+			return;
+		}
+		q_top->num_aqs_per_groups = 0;
+		for (aq = 0; aq < ACC100_NUM_AQS; aq++) {
+			reg = acc100_reg_read(d, queue_offset(d->pf_device,
+					0, qg, aq));
+			if (reg & QUEUE_ENABLE)
+				q_top->num_aqs_per_groups++;
+		}
+	}
+}
+
+/* Fetch configuration enabled for the PF/VF using MMIO Read (slow) */
+static inline void
+fetch_acc100_config(struct rte_bbdev *dev)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	struct acc100_conf *acc100_conf = &d->acc100_conf;
+	const struct acc100_registry_addr *reg_addr;
+	uint8_t acc, qg;
+	uint32_t reg, reg_aq, reg_len0, reg_len1;
+	uint32_t reg_mode;
+
+	/* No need to retrieve the configuration if it is already done */
+	if (d->configured)
+		return;
+
+	/* Choose correct registry addresses for the device type */
+	if (d->pf_device)
+		reg_addr = &pf_reg_addr;
+	else
+		reg_addr = &vf_reg_addr;
+
+	d->ddr_size = (1 + acc100_reg_read(d, reg_addr->ddr_range)) << 10;
+
+	/* Single VF Bundle per VF */
+	acc100_conf->num_vf_bundles = 1;
+	initQTop(acc100_conf);
+
+	struct rte_q_topology_t *q_top = NULL;
+	int qman_func_id[5] = {0, 2, 1, 3, 4};
+	reg = acc100_reg_read(d, reg_addr->qman_group_func);
+	for (qg = 0; qg < ACC100_NUM_QGRPS_PER_WORD; qg++) {
+		reg_aq = acc100_reg_read(d,
+				queue_offset(d->pf_device, 0, qg, 0));
+		if (reg_aq & QUEUE_ENABLE) {
+			acc = qman_func_id[(reg >> (qg * 4)) & 0x7];
+			updateQtop(acc, qg, acc100_conf, d);
+		}
+	}
+
+	/* Check the depth of the AQs */
+	reg_len0 = acc100_reg_read(d, reg_addr->depth_log0_offset);
+	reg_len1 = acc100_reg_read(d, reg_addr->depth_log1_offset);
+	for (acc = 0; acc < NUM_ACC; acc++) {
+		qtopFromAcc(&q_top, acc, acc100_conf);
+		if (q_top->first_qgroup_index < ACC100_NUM_QGRPS_PER_WORD)
+			q_top->aq_depth_log2 = (reg_len0 >>
+					(q_top->first_qgroup_index * 4))
+					& 0xF;
+		else
+			q_top->aq_depth_log2 = (reg_len1 >>
+					((q_top->first_qgroup_index -
+					ACC100_NUM_QGRPS_PER_WORD) * 4))
+					& 0xF;
+	}
+
+	/* Read PF mode */
+	if (d->pf_device) {
+		reg_mode = acc100_reg_read(d, HWPfHiPfMode);
+		acc100_conf->pf_mode_en = (reg_mode == 2) ? 1 : 0;
+	}
+
+	rte_bbdev_log_debug(
+			"%s Config LLR SIGN IN/OUT %s %s QG %u %u %u %u AQ %u %u %u %u Len %u %u %u %u\n",
+			(d->pf_device) ? "PF" : "VF",
+			(acc100_conf->input_pos_llr_1_bit) ? "POS" : "NEG",
+			(acc100_conf->output_pos_llr_1_bit) ? "POS" : "NEG",
+			acc100_conf->q_ul_4g.num_qgroups,
+			acc100_conf->q_dl_4g.num_qgroups,
+			acc100_conf->q_ul_5g.num_qgroups,
+			acc100_conf->q_dl_5g.num_qgroups,
+			acc100_conf->q_ul_4g.num_aqs_per_groups,
+			acc100_conf->q_dl_4g.num_aqs_per_groups,
+			acc100_conf->q_ul_5g.num_aqs_per_groups,
+			acc100_conf->q_dl_5g.num_aqs_per_groups,
+			acc100_conf->q_ul_4g.aq_depth_log2,
+			acc100_conf->q_dl_4g.aq_depth_log2,
+			acc100_conf->q_ul_5g.aq_depth_log2,
+			acc100_conf->q_dl_5g.aq_depth_log2);
+}
+
 /* Free 64MB memory used for software rings */
 static int
 acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
@@ -33,8 +211,55 @@
 	return 0;
 }
 
+/* Get ACC100 device info */
+static void
+acc100_dev_info_get(struct rte_bbdev *dev,
+		struct rte_bbdev_driver_info *dev_info)
+{
+	struct acc100_device *d = dev->data->dev_private;
+
+	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
+		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
+	};
+
+	static struct rte_bbdev_queue_conf default_queue_conf;
+	default_queue_conf.socket = dev->data->socket_id;
+	default_queue_conf.queue_size = MAX_QUEUE_DEPTH;
+
+	dev_info->driver_name = dev->device->driver->name;
+
+	/* Read and save the populated config from ACC100 registers */
+	fetch_acc100_config(dev);
+
+	/* This isn't ideal because it reports the maximum number of queues,
+	 * but does not provide info on how many can be uplink/downlink or at
+	 * different priorities
+	 */
+	dev_info->max_num_queues =
+			d->acc100_conf.q_dl_5g.num_aqs_per_groups *
+			d->acc100_conf.q_dl_5g.num_qgroups +
+			d->acc100_conf.q_ul_5g.num_aqs_per_groups *
+			d->acc100_conf.q_ul_5g.num_qgroups +
+			d->acc100_conf.q_dl_4g.num_aqs_per_groups *
+			d->acc100_conf.q_dl_4g.num_qgroups +
+			d->acc100_conf.q_ul_4g.num_aqs_per_groups *
+			d->acc100_conf.q_ul_4g.num_qgroups;
+	dev_info->queue_size_lim = MAX_QUEUE_DEPTH;
+	dev_info->hardware_accelerated = true;
+	dev_info->max_dl_queue_priority =
+			d->acc100_conf.q_dl_4g.num_qgroups - 1;
+	dev_info->max_ul_queue_priority =
+			d->acc100_conf.q_ul_4g.num_qgroups - 1;
+	dev_info->default_queue_conf = default_queue_conf;
+	dev_info->cpu_flag_reqs = NULL;
+	dev_info->min_alignment = 64;
+	dev_info->capabilities = bbdev_capabilities;
+	dev_info->harq_buffer_size = d->ddr_size;
+}
+
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.close = acc100_dev_close,
+	.info_get = acc100_dev_info_get,
 };
 
 /* ACC100 PCI PF address map */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index cd77570..662e2c8 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -7,6 +7,7 @@
 
 #include "acc100_pf_enum.h"
 #include "acc100_vf_enum.h"
+#include "rte_acc100_cfg.h"
 
 /* Helper macro for logging */
 #define rte_bbdev_log(level, fmt, ...) \
@@ -520,6 +521,8 @@ struct acc100_registry_addr {
 /* Private data structure for each ACC100 device */
 struct acc100_device {
 	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
+	uint32_t ddr_size; /* Size in kB */
+	struct acc100_conf acc100_conf; /* ACC100 Initial configuration */
 	bool pf_device; /**< True if this is a PF ACC100 device */
 	bool configured; /**< True if this ACC100 device is configured */
 };
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread
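The register addressing computed by queue_offset() in the patch above can be sketched as a self-contained snippet; the EX_* base offsets below are hypothetical placeholders, since the real HWPfQmgrIngressAq / HWVfQmgrIngressAq values live in the acc100_pf_enum.h / acc100_vf_enum.h register files.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in base offsets for illustration only; not the real register
 * addresses from the ACC100 enum headers. */
#define EX_PF_QMGR_INGRESS_AQ 0x200000
#define EX_VF_QMGR_INGRESS_AQ 0x100

/* Mirrors queue_offset() from the patch: going by the shifts, each VF
 * owns a 4 kB window, each queue group 128 B and each atomic queue 8 B
 * of enqueue register space. */
static inline uint32_t
ex_queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id,
		uint16_t aq_id)
{
	if (pf_device)
		return (vf_id << 12) + (qgrp_id << 7) + (aq_id << 3) +
				EX_PF_QMGR_INGRESS_AQ;
	return (qgrp_id << 7) + (aq_id << 3) + EX_VF_QMGR_INGRESS_AQ;
}
```

On the PF, VF 1 / queue group 2 / AQ 3 lands at base + 0x1000 + 0x100 + 0x18; a VF only addresses its own window, so vf_id drops out of the VF path.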

* [dpdk-dev] [PATCH v6 06/11] baseband/acc100: add queue configuration
  2020-09-23  2:19   ` [dpdk-dev] [PATCH v6 00/11] bbdev PMD ACC100 Nicolas Chautru
                       ` (4 preceding siblings ...)
  2020-09-23  2:19     ` [dpdk-dev] [PATCH v6 05/11] baseband/acc100: add info get function Nicolas Chautru
@ 2020-09-23  2:19     ` Nicolas Chautru
  2020-09-23  2:19     ` [dpdk-dev] [PATCH v6 07/11] baseband/acc100: add LDPC processing functions Nicolas Chautru
                       ` (4 subsequent siblings)
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-23  2:19 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add a function to create and configure queues for
the device. Still no capabilities exposed.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 420 ++++++++++++++++++++++++++++++-
 drivers/baseband/acc100/rte_acc100_pmd.h |  45 ++++
 2 files changed, 464 insertions(+), 1 deletion(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 7807a30..7a21c57 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -26,6 +26,22 @@
 RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
 #endif
 
+/* Write to MMIO register address */
+static inline void
+mmio_write(void *addr, uint32_t value)
+{
+	*((volatile uint32_t *)(addr)) = rte_cpu_to_le_32(value);
+}
+
+/* Write a register of an ACC100 device */
+static inline void
+acc100_reg_write(struct acc100_device *d, uint32_t offset, uint32_t payload)
+{
+	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
+	mmio_write(reg_addr, payload);
+	usleep(1000);
+}
+
 /* Read a register of an ACC100 device */
 static inline uint32_t
 acc100_reg_read(struct acc100_device *d, uint32_t offset)
@@ -36,6 +52,22 @@
 	return rte_le_to_cpu_32(ret);
 }
 
+/* Basic Implementation of Log2 for exact 2^N */
+static inline uint32_t
+log2_basic(uint32_t value)
+{
+	return (value == 0) ? 0 : __builtin_ctz(value);
+}
+
+/* Calculate memory alignment offset assuming alignment is 2^N */
+static inline uint32_t
+calc_mem_alignment_offset(void *unaligned_virt_mem, uint32_t alignment)
+{
+	rte_iova_t unaligned_phy_mem = rte_malloc_virt2iova(unaligned_virt_mem);
+	return (uint32_t)(alignment -
+			(unaligned_phy_mem & (alignment-1)));
+}
+
 /* Calculate the offset of the enqueue register */
 static inline uint32_t
 queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
@@ -204,10 +236,393 @@
 			acc100_conf->q_dl_5g.aq_depth_log2);
 }
 
+static void
+free_base_addresses(void **base_addrs, int size)
+{
+	int i;
+	for (i = 0; i < size; i++)
+		rte_free(base_addrs[i]);
+}
+
+static inline uint32_t
+get_desc_len(void)
+{
+	return sizeof(union acc100_dma_desc);
+}
+
+/* Allocate the 2 * 64MB block for the sw rings */
+static int
+alloc_2x64mb_sw_rings_mem(struct rte_bbdev *dev, struct acc100_device *d,
+		int socket)
+{
+	uint32_t sw_ring_size = ACC100_SIZE_64MBYTE;
+	d->sw_rings_base = rte_zmalloc_socket(dev->device->driver->name,
+			2 * sw_ring_size, RTE_CACHE_LINE_SIZE, socket);
+	if (d->sw_rings_base == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate memory for %s:%u",
+				dev->device->driver->name,
+				dev->data->dev_id);
+		return -ENOMEM;
+	}
+	memset(d->sw_rings_base, 0, ACC100_SIZE_64MBYTE);
+	uint32_t next_64mb_align_offset = calc_mem_alignment_offset(
+			d->sw_rings_base, ACC100_SIZE_64MBYTE);
+	d->sw_rings = RTE_PTR_ADD(d->sw_rings_base, next_64mb_align_offset);
+	d->sw_rings_phys = rte_malloc_virt2iova(d->sw_rings_base) +
+			next_64mb_align_offset;
+	d->sw_ring_size = MAX_QUEUE_DEPTH * get_desc_len();
+	d->sw_ring_max_depth = d->sw_ring_size / get_desc_len();
+
+	return 0;
+}
+
+/* Attempt to allocate minimised memory space for sw rings */
+static void
+alloc_sw_rings_min_mem(struct rte_bbdev *dev, struct acc100_device *d,
+		uint16_t num_queues, int socket)
+{
+	rte_iova_t sw_rings_base_phy, next_64mb_align_addr_phy;
+	uint32_t next_64mb_align_offset;
+	rte_iova_t sw_ring_phys_end_addr;
+	void *base_addrs[SW_RING_MEM_ALLOC_ATTEMPTS];
+	void *sw_rings_base;
+	int i = 0;
+	uint32_t q_sw_ring_size = MAX_QUEUE_DEPTH * get_desc_len();
+	uint32_t dev_sw_ring_size = q_sw_ring_size * num_queues;
+
+	/* Find an aligned block of memory to store sw rings */
+	while (i < SW_RING_MEM_ALLOC_ATTEMPTS) {
+		/*
+		 * sw_ring allocated memory is guaranteed to be aligned to
+		 * q_sw_ring_size on the condition that the requested size is
+		 * less than the page size
+		 */
+		sw_rings_base = rte_zmalloc_socket(
+				dev->device->driver->name,
+				dev_sw_ring_size, q_sw_ring_size, socket);
+
+		if (sw_rings_base == NULL) {
+			rte_bbdev_log(ERR,
+					"Failed to allocate memory for %s:%u",
+					dev->device->driver->name,
+					dev->data->dev_id);
+			break;
+		}
+
+		sw_rings_base_phy = rte_malloc_virt2iova(sw_rings_base);
+		next_64mb_align_offset = calc_mem_alignment_offset(
+				sw_rings_base, ACC100_SIZE_64MBYTE);
+		next_64mb_align_addr_phy = sw_rings_base_phy +
+				next_64mb_align_offset;
+		sw_ring_phys_end_addr = sw_rings_base_phy + dev_sw_ring_size;
+
+		/* Check if the end of the sw ring memory block is before the
+		 * start of next 64MB aligned mem address
+		 */
+		if (sw_ring_phys_end_addr < next_64mb_align_addr_phy) {
+			d->sw_rings_phys = sw_rings_base_phy;
+			d->sw_rings = sw_rings_base;
+			d->sw_rings_base = sw_rings_base;
+			d->sw_ring_size = q_sw_ring_size;
+			d->sw_ring_max_depth = MAX_QUEUE_DEPTH;
+			break;
+		}
+		/* Store the address of the unaligned mem block */
+		base_addrs[i] = sw_rings_base;
+		i++;
+	}
+
+	/* Free all unaligned blocks of mem allocated in the loop */
+	free_base_addresses(base_addrs, i);
+}
+
+
+/* Allocate 64MB memory used for all software rings */
+static int
+acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
+{
+	uint32_t phys_low, phys_high, payload;
+	struct acc100_device *d = dev->data->dev_private;
+	const struct acc100_registry_addr *reg_addr;
+
+	if (d->pf_device && !d->acc100_conf.pf_mode_en) {
+		rte_bbdev_log(NOTICE,
+				"%s has PF mode disabled. This PF can't be used.",
+				dev->data->name);
+		return -ENODEV;
+	}
+
+	alloc_sw_rings_min_mem(dev, d, num_queues, socket_id);
+
+	/* If minimal memory space approach failed, then allocate
+	 * the 2 * 64MB block for the sw rings
+	 */
+	if (d->sw_rings == NULL)
+		alloc_2x64mb_sw_rings_mem(dev, d, socket_id);
+
+	/* Configure ACC100 with the base address for DMA descriptor rings
+	 * Same descriptor rings used for UL and DL DMA Engines
+	 * Note : Assuming only VF0 bundle is used for PF mode
+	 */
+	phys_high = (uint32_t)(d->sw_rings_phys >> 32);
+	phys_low  = (uint32_t)(d->sw_rings_phys & ~(ACC100_SIZE_64MBYTE-1));
+
+	/* Choose correct registry addresses for the device type */
+	if (d->pf_device)
+		reg_addr = &pf_reg_addr;
+	else
+		reg_addr = &vf_reg_addr;
+
+	/* Read the populated cfg from ACC100 registers */
+	fetch_acc100_config(dev);
+
+	/* Mark as configured properly */
+	d->configured = true;
+
+	/* Release AXI from PF */
+	if (d->pf_device)
+		acc100_reg_write(d, HWPfDmaAxiControl, 1);
+
+	acc100_reg_write(d, reg_addr->dma_ring_ul5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_ul5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->dma_ring_dl5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_dl5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->dma_ring_ul4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_ul4g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->dma_ring_dl4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_dl4g_lo, phys_low);
+
+	/*
+	 * Configure Ring Size to the max queue ring size
+	 * (used for wrapping purpose)
+	 */
+	payload = log2_basic(d->sw_ring_size / 64);
+	acc100_reg_write(d, reg_addr->ring_size, payload);
+
+	/* Configure tail pointer for use when SDONE enabled */
+	d->tail_ptrs = rte_zmalloc_socket(
+			dev->device->driver->name,
+			ACC100_NUM_QGRPS * ACC100_NUM_AQS * sizeof(uint32_t),
+			RTE_CACHE_LINE_SIZE, socket_id);
+	if (d->tail_ptrs == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate tail ptr for %s:%u",
+				dev->device->driver->name,
+				dev->data->dev_id);
+		rte_free(d->sw_rings);
+		return -ENOMEM;
+	}
+	d->tail_ptr_phys = rte_malloc_virt2iova(d->tail_ptrs);
+
+	phys_high = (uint32_t)(d->tail_ptr_phys >> 32);
+	phys_low  = (uint32_t)(d->tail_ptr_phys);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
+
+	d->harq_layout = rte_zmalloc_socket("HARQ Layout",
+			ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
+			RTE_CACHE_LINE_SIZE, dev->data->socket_id);
+
+	rte_bbdev_log_debug(
+			"ACC100 (%s) configured sw_rings = %p, sw_rings_phys = %#"
+			PRIx64, dev->data->name, d->sw_rings, d->sw_rings_phys);
+
+	return 0;
+}
+
 /* Free 64MB memory used for software rings */
 static int
-acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
+acc100_dev_close(struct rte_bbdev *dev)
 {
+	struct acc100_device *d = dev->data->dev_private;
+
+	if (d->sw_rings_base != NULL) {
+		rte_free(d->tail_ptrs);
+		rte_free(d->sw_rings_base);
+		d->sw_rings_base = NULL;
+	}
+	usleep(1000);
+	return 0;
+}
+
+/**
+ * Report an ACC100 queue index which is free.
+ * Return 0 to 16k for a valid queue_idx or -1 when no queue is available.
+ * Note: only the VF0 bundle is supported in PF mode.
+ */
+static int
+acc100_find_free_queue_idx(struct rte_bbdev *dev,
+		const struct rte_bbdev_queue_conf *conf)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	int op_2_acc[5] = {0, UL_4G, DL_4G, UL_5G, DL_5G};
+	int acc = op_2_acc[conf->op_type];
+	struct rte_q_topology_t *qtop = NULL;
+	qtopFromAcc(&qtop, acc, &(d->acc100_conf));
+	if (qtop == NULL)
+		return -1;
+	/* Identify matching QGroup Index which are sorted in priority order */
+	uint16_t group_idx = qtop->first_qgroup_index;
+	group_idx += conf->priority;
+	if (group_idx >= ACC100_NUM_QGRPS ||
+			conf->priority >= qtop->num_qgroups) {
+		rte_bbdev_log(INFO, "Invalid Priority on %s, priority %u",
+				dev->data->name, conf->priority);
+		return -1;
+	}
+	/* Find a free AQ_idx */
+	uint16_t aq_idx;
+	for (aq_idx = 0; aq_idx < qtop->num_aqs_per_groups; aq_idx++) {
+		if (((d->q_assigned_bit_map[group_idx] >> aq_idx) & 0x1) == 0) {
+			/* Mark the Queue as assigned */
+			d->q_assigned_bit_map[group_idx] |= (1 << aq_idx);
+			/* Report the AQ Index */
+			return (group_idx << GRP_ID_SHIFT) + aq_idx;
+		}
+	}
+	rte_bbdev_log(INFO, "Failed to find free queue on %s, priority %u",
+			dev->data->name, conf->priority);
+	return -1;
+}
+
+/* Setup ACC100 queue */
+static int
+acc100_queue_setup(struct rte_bbdev *dev, uint16_t queue_id,
+		const struct rte_bbdev_queue_conf *conf)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	struct acc100_queue *q;
+	int16_t q_idx;
+
+	/* Allocate the queue data structure. */
+	q = rte_zmalloc_socket(dev->device->driver->name, sizeof(*q),
+			RTE_CACHE_LINE_SIZE, conf->socket);
+	if (q == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate queue memory");
+		return -ENOMEM;
+	}
+
+	q->d = d;
+	q->ring_addr = RTE_PTR_ADD(d->sw_rings, (d->sw_ring_size * queue_id));
+	q->ring_addr_phys = d->sw_rings_phys + (d->sw_ring_size * queue_id);
+
+	/* Prepare the Ring with default descriptor format */
+	union acc100_dma_desc *desc = NULL;
+	unsigned int desc_idx, b_idx;
+	int fcw_len = (conf->op_type == RTE_BBDEV_OP_LDPC_ENC ?
+		ACC100_FCW_LE_BLEN : (conf->op_type == RTE_BBDEV_OP_TURBO_DEC ?
+		ACC100_FCW_TD_BLEN : ACC100_FCW_LD_BLEN));
+
+	for (desc_idx = 0; desc_idx < d->sw_ring_max_depth; desc_idx++) {
+		desc = q->ring_addr + desc_idx;
+		desc->req.word0 = ACC100_DMA_DESC_TYPE;
+		desc->req.word1 = 0; /**< Timestamp */
+		desc->req.word2 = 0;
+		desc->req.word3 = 0;
+		uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+		desc->req.data_ptrs[0].blen = fcw_len;
+		desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
+		desc->req.data_ptrs[0].last = 0;
+		desc->req.data_ptrs[0].dma_ext = 0;
+		for (b_idx = 1; b_idx < ACC100_DMA_MAX_NUM_POINTERS - 1;
+				b_idx++) {
+			desc->req.data_ptrs[b_idx].blkid = ACC100_DMA_BLKID_IN;
+			desc->req.data_ptrs[b_idx].last = 1;
+			desc->req.data_ptrs[b_idx].dma_ext = 0;
+			b_idx++;
+			desc->req.data_ptrs[b_idx].blkid =
+					ACC100_DMA_BLKID_OUT_ENC;
+			desc->req.data_ptrs[b_idx].last = 1;
+			desc->req.data_ptrs[b_idx].dma_ext = 0;
+		}
+		/* Preset some fields of LDPC FCW */
+		desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
+		desc->req.fcw_ld.gain_i = 1;
+		desc->req.fcw_ld.gain_h = 1;
+	}
+
+	q->lb_in = rte_zmalloc_socket(dev->device->driver->name,
+			RTE_CACHE_LINE_SIZE,
+			RTE_CACHE_LINE_SIZE, conf->socket);
+	if (q->lb_in == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate lb_in memory");
+		return -ENOMEM;
+	}
+	q->lb_in_addr_phys = rte_malloc_virt2iova(q->lb_in);
+	q->lb_out = rte_zmalloc_socket(dev->device->driver->name,
+			RTE_CACHE_LINE_SIZE,
+			RTE_CACHE_LINE_SIZE, conf->socket);
+	if (q->lb_out == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate lb_out memory");
+		return -ENOMEM;
+	}
+	q->lb_out_addr_phys = rte_malloc_virt2iova(q->lb_out);
+
+	/*
+	 * Software queue ring wraps synchronously with the HW when it reaches
+	 * the boundary of the maximum allocated queue size, no matter what the
+	 * sw queue size is. This wrapping is guarded by setting the wrap_mask
+	 * to represent the maximum queue size as allocated at the time when
+	 * the device has been setup (in configure()).
+	 *
+	 * The queue depth is set to the queue size value (conf->queue_size).
+	 * This limits the occupancy of the queue at any point of time, so that
+	 * the queue does not get swamped with enqueue requests.
+	 */
+	q->sw_ring_depth = conf->queue_size;
+	q->sw_ring_wrap_mask = d->sw_ring_max_depth - 1;
+
+	q->op_type = conf->op_type;
+
+	q_idx = acc100_find_free_queue_idx(dev, conf);
+	if (q_idx == -1) {
+		rte_free(q);
+		return -1;
+	}
+
+	q->qgrp_id = (q_idx >> GRP_ID_SHIFT) & 0xF;
+	q->vf_id = (q_idx >> VF_ID_SHIFT) & 0x3F;
+	q->aq_id = q_idx & 0xF;
+	q->aq_depth = (conf->op_type == RTE_BBDEV_OP_TURBO_DEC) ?
+			(1 << d->acc100_conf.q_ul_4g.aq_depth_log2) :
+			(1 << d->acc100_conf.q_dl_4g.aq_depth_log2);
+
+	q->mmio_reg_enqueue = RTE_PTR_ADD(d->mmio_base,
+			queue_offset(d->pf_device,
+					q->vf_id, q->qgrp_id, q->aq_id));
+
+	rte_bbdev_log_debug(
+			"Setup dev%u q%u: qgrp_id=%u, vf_id=%u, aq_id=%u, aq_depth=%u, mmio_reg_enqueue=%p",
+			dev->data->dev_id, queue_id, q->qgrp_id, q->vf_id,
+			q->aq_id, q->aq_depth, q->mmio_reg_enqueue);
+
+	dev->data->queues[queue_id].queue_private = q;
+	return 0;
+}
+
+/* Release ACC100 queue */
+static int
+acc100_queue_release(struct rte_bbdev *dev, uint16_t q_id)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	struct acc100_queue *q = dev->data->queues[q_id].queue_private;
+
+	if (q != NULL) {
+		/* Mark the Queue as un-assigned */
+		d->q_assigned_bit_map[q->qgrp_id] &= (0xFFFFFFFF -
+				(1 << q->aq_id));
+		rte_free(q->lb_in);
+		rte_free(q->lb_out);
+		rte_free(q);
+		dev->data->queues[q_id].queue_private = NULL;
+	}
+
 	return 0;
 }
 
@@ -258,8 +673,11 @@
 }
 
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
+	.setup_queues = acc100_setup_queues,
 	.close = acc100_dev_close,
 	.info_get = acc100_dev_info_get,
+	.queue_setup = acc100_queue_setup,
+	.queue_release = acc100_queue_release,
 };
 
 /* ACC100 PCI PF address map */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 662e2c8..0e2b79c 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -518,11 +518,56 @@ struct acc100_registry_addr {
 	.ddr_range = HWVfDmaDdrBaseRangeRoVf,
 };
 
+/* Structure associated with each queue. */
+struct __rte_cache_aligned acc100_queue {
+	union acc100_dma_desc *ring_addr;  /* Virtual address of sw ring */
+	rte_iova_t ring_addr_phys;  /* Physical address of software ring */
+	uint32_t sw_ring_head;  /* software ring head */
+	uint32_t sw_ring_tail;  /* software ring tail */
+	/* software ring size (descriptors, not bytes) */
+	uint32_t sw_ring_depth;
+	/* mask used to wrap enqueued descriptors on the sw ring */
+	uint32_t sw_ring_wrap_mask;
+	/* MMIO register used to enqueue descriptors */
+	void *mmio_reg_enqueue;
+	uint8_t vf_id;  /* VF ID (max = 63) */
+	uint8_t qgrp_id;  /* Queue Group ID */
+	uint16_t aq_id;  /* Atomic Queue ID */
+	uint16_t aq_depth;  /* Depth of atomic queue */
+	uint32_t aq_enqueued;  /* Count how many "batches" have been enqueued */
+	uint32_t aq_dequeued;  /* Count how many "batches" have been dequeued */
+	uint32_t irq_enable;  /* Enable ops dequeue interrupts if set to 1 */
+	struct rte_mempool *fcw_mempool;  /* FCW mempool */
+	enum rte_bbdev_op_type op_type;  /* Type of this Queue: TE or TD */
+	/* Internal Buffers for loopback input */
+	uint8_t *lb_in;
+	uint8_t *lb_out;
+	rte_iova_t lb_in_addr_phys;
+	rte_iova_t lb_out_addr_phys;
+	struct acc100_device *d;
+};
+
 /* Private data structure for each ACC100 device */
 struct acc100_device {
 	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
+	void *sw_rings_base;  /* Base addr of un-aligned memory for sw rings */
+	void *sw_rings;  /* 64 MB of 64MB-aligned memory for sw rings */
+	rte_iova_t sw_rings_phys;  /* Physical address of sw_rings */
+	/* Virtual address of the info memory routed to this function under
+	 * operation, whether it is PF or VF.
+	 */
+	union acc100_harq_layout_data *harq_layout;
+	uint32_t sw_ring_size;
 	uint32_t ddr_size; /* Size in kB */
+	uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */
+	rte_iova_t tail_ptr_phys; /* Physical address of tail pointers */
+	/* Max number of entries available for each queue in device, depending
+	 * on how many queues are enabled with configure()
+	 */
+	uint32_t sw_ring_max_depth;
 	struct acc100_conf acc100_conf; /* ACC100 Initial configuration */
+	/* Bitmap capturing which Queues have already been assigned */
+	uint16_t q_assigned_bit_map[ACC100_NUM_QGRPS];
 	bool pf_device; /**< True if this is a PF ACC100 device */
 	bool configured; /**< True if this ACC100 device is configured */
 };
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v6 07/11] baseband/acc100: add LDPC processing functions
  2020-09-23  2:19   ` [dpdk-dev] [PATCH v6 00/11] bbdev PMD ACC100 Nicolas Chautru
                       ` (5 preceding siblings ...)
  2020-09-23  2:19     ` [dpdk-dev] [PATCH v6 06/11] baseband/acc100: add queue configuration Nicolas Chautru
@ 2020-09-23  2:19     ` Nicolas Chautru
  2020-09-23  2:19     ` [dpdk-dev] [PATCH v6 08/11] baseband/acc100: add HARQ loopback support Nicolas Chautru
                       ` (3 subsequent siblings)
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-23  2:19 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Adding LDPC decode and encode processing operations

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
Acked-by: Dave Burley <dave.burley@accelercomm.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 1625 +++++++++++++++++++++++++++++-
 drivers/baseband/acc100/rte_acc100_pmd.h |    3 +
 2 files changed, 1626 insertions(+), 2 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 7a21c57..b223547 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -15,6 +15,9 @@
 #include <rte_hexdump.h>
 #include <rte_pci.h>
 #include <rte_bus_pci.h>
+#ifdef RTE_BBDEV_OFFLOAD_COST
+#include <rte_cycles.h>
+#endif
 
 #include <rte_bbdev.h>
 #include <rte_bbdev_pmd.h>
@@ -449,7 +452,6 @@
 	return 0;
 }
 
-
 /**
  * Report an ACC100 queue index which is free
  * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
@@ -634,6 +636,46 @@
 	struct acc100_device *d = dev->data->dev_private;
 
 	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
+		{
+			.type   = RTE_BBDEV_OP_LDPC_ENC,
+			.cap.ldpc_enc = {
+				.capability_flags =
+					RTE_BBDEV_LDPC_RATE_MATCH |
+					RTE_BBDEV_LDPC_CRC_24B_ATTACH |
+					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
+				.num_buffers_src =
+						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+				.num_buffers_dst =
+						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+			}
+		},
+		{
+			.type   = RTE_BBDEV_OP_LDPC_DEC,
+			.cap.ldpc_dec = {
+			.capability_flags =
+				RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK |
+				RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP |
+				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
+				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
+#ifdef ACC100_EXT_MEM
+				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
+				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
+#endif
+				RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE |
+				RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS |
+				RTE_BBDEV_LDPC_DECODE_BYPASS |
+				RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
+				RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
+				RTE_BBDEV_LDPC_LLR_COMPRESSION,
+			.llr_size = 8,
+			.llr_decimals = 1,
+			.num_buffers_src =
+					RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+			.num_buffers_hard_out =
+					RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+			.num_buffers_soft_out = 0,
+			}
+		},
 		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
 	};
 
@@ -669,9 +711,14 @@
 	dev_info->cpu_flag_reqs = NULL;
 	dev_info->min_alignment = 64;
 	dev_info->capabilities = bbdev_capabilities;
+#ifdef ACC100_EXT_MEM
 	dev_info->harq_buffer_size = d->ddr_size;
+#else
+	dev_info->harq_buffer_size = 0;
+#endif
 }
 
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.setup_queues = acc100_setup_queues,
 	.close = acc100_dev_close,
@@ -696,6 +743,1577 @@
 	{.device_id = 0},
 };
 
+/* Read flag value 0/1 from bitmap */
+static inline bool
+check_bit(uint32_t bitmap, uint32_t bitmask)
+{
+	return bitmap & bitmask;
+}
+
+static inline char *
+mbuf_append(struct rte_mbuf *m_head, struct rte_mbuf *m, uint16_t len)
+{
+	if (unlikely(len > rte_pktmbuf_tailroom(m)))
+		return NULL;
+
+	char *tail = (char *)m->buf_addr + m->data_off + m->data_len;
+	m->data_len = (uint16_t)(m->data_len + len);
+	m_head->pkt_len  = (m_head->pkt_len + len);
+	return tail;
+}
+
+/* Compute value of k0.
+ * Based on 3GPP 38.212 Table 5.4.2.1-2
+ * Starting position of different redundancy versions, k0
+ */
+static inline uint16_t
+get_k0(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv_index)
+{
+	if (rv_index == 0)
+		return 0;
+	uint16_t n = (bg == 1 ? N_ZC_1 : N_ZC_2) * z_c;
+	if (n_cb == n) {
+		if (rv_index == 1)
+			return (bg == 1 ? K0_1_1 : K0_1_2) * z_c;
+		else if (rv_index == 2)
+			return (bg == 1 ? K0_2_1 : K0_2_2) * z_c;
+		else
+			return (bg == 1 ? K0_3_1 : K0_3_2) * z_c;
+	}
+	/* LBRM case - includes a division by N */
+	if (rv_index == 1)
+		return (((bg == 1 ? K0_1_1 : K0_1_2) * n_cb)
+				/ n) * z_c;
+	else if (rv_index == 2)
+		return (((bg == 1 ? K0_2_1 : K0_2_2) * n_cb)
+				/ n) * z_c;
+	else
+		return (((bg == 1 ? K0_3_1 : K0_3_2) * n_cb)
+				/ n) * z_c;
+}
+
+/* Fill in a frame control word for LDPC encoding. */
+static inline void
+acc100_fcw_le_fill(const struct rte_bbdev_enc_op *op,
+		struct acc100_fcw_le *fcw, int num_cb)
+{
+	fcw->qm = op->ldpc_enc.q_m;
+	fcw->nfiller = op->ldpc_enc.n_filler;
+	fcw->BG = (op->ldpc_enc.basegraph - 1);
+	fcw->Zc = op->ldpc_enc.z_c;
+	fcw->ncb = op->ldpc_enc.n_cb;
+	fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_enc.basegraph,
+			op->ldpc_enc.rv_index);
+	fcw->rm_e = op->ldpc_enc.cb_params.e;
+	fcw->crc_select = check_bit(op->ldpc_enc.op_flags,
+			RTE_BBDEV_LDPC_CRC_24B_ATTACH);
+	fcw->bypass_intlv = check_bit(op->ldpc_enc.op_flags,
+			RTE_BBDEV_LDPC_INTERLEAVER_BYPASS);
+	fcw->mcb_count = num_cb;
+}
+
+/* Fill in a frame control word for LDPC decoding. */
+static inline void
+acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
+		union acc100_harq_layout_data *harq_layout)
+{
+	uint16_t harq_out_length, harq_in_length, ncb_p, k0_p, parity_offset;
+	uint16_t harq_index;
+	uint32_t l;
+	bool harq_prun = false;
+
+	fcw->qm = op->ldpc_dec.q_m;
+	fcw->nfiller = op->ldpc_dec.n_filler;
+	fcw->BG = (op->ldpc_dec.basegraph - 1);
+	fcw->Zc = op->ldpc_dec.z_c;
+	fcw->ncb = op->ldpc_dec.n_cb;
+	fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_dec.basegraph,
+			op->ldpc_dec.rv_index);
+	if (op->ldpc_dec.code_block_mode == 1)
+		fcw->rm_e = op->ldpc_dec.cb_params.e;
+	else
+		fcw->rm_e = (op->ldpc_dec.tb_params.r <
+				op->ldpc_dec.tb_params.cab) ?
+						op->ldpc_dec.tb_params.ea :
+						op->ldpc_dec.tb_params.eb;
+
+	fcw->hcin_en = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE);
+	fcw->hcout_en = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE);
+	fcw->crc_select = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK);
+	fcw->bypass_dec = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_DECODE_BYPASS);
+	fcw->bypass_intlv = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS);
+	if (op->ldpc_dec.q_m == 1) {
+		fcw->bypass_intlv = 1;
+		fcw->qm = 2;
+	}
+	fcw->hcin_decomp_mode = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+	fcw->hcout_comp_mode = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+	fcw->llr_pack_mode = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_LLR_COMPRESSION);
+	harq_index = op->ldpc_dec.harq_combined_output.offset /
+			ACC100_HARQ_OFFSET;
+#ifdef ACC100_EXT_MEM
+	/* Limit cases when HARQ pruning is valid */
+	harq_prun = ((op->ldpc_dec.harq_combined_output.offset %
+			ACC100_HARQ_OFFSET) == 0) &&
+			(op->ldpc_dec.harq_combined_output.offset <= UINT16_MAX
+			* ACC100_HARQ_OFFSET);
+#endif
+	if (fcw->hcin_en > 0) {
+		harq_in_length = op->ldpc_dec.harq_combined_input.length;
+		if (fcw->hcin_decomp_mode > 0)
+			harq_in_length = harq_in_length * 8 / 6;
+		harq_in_length = RTE_ALIGN(harq_in_length, 64);
+		if ((harq_layout[harq_index].offset > 0) && harq_prun) {
+			rte_bbdev_log_debug("HARQ IN offset unexpected for now\n");
+			fcw->hcin_size0 = harq_layout[harq_index].size0;
+			fcw->hcin_offset = harq_layout[harq_index].offset;
+			fcw->hcin_size1 = harq_in_length -
+					harq_layout[harq_index].offset;
+		} else {
+			fcw->hcin_size0 = harq_in_length;
+			fcw->hcin_offset = 0;
+			fcw->hcin_size1 = 0;
+		}
+	} else {
+		fcw->hcin_size0 = 0;
+		fcw->hcin_offset = 0;
+		fcw->hcin_size1 = 0;
+	}
+
+	fcw->itmax = op->ldpc_dec.iter_max;
+	fcw->itstop = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE);
+	fcw->synd_precoder = fcw->itstop;
+	/*
+	 * These are all implicitly set
+	 * fcw->synd_post = 0;
+	 * fcw->so_en = 0;
+	 * fcw->so_bypass_rm = 0;
+	 * fcw->so_bypass_intlv = 0;
+	 * fcw->dec_convllr = 0;
+	 * fcw->hcout_convllr = 0;
+	 * fcw->hcout_size1 = 0;
+	 * fcw->so_it = 0;
+	 * fcw->hcout_offset = 0;
+	 * fcw->negstop_th = 0;
+	 * fcw->negstop_it = 0;
+	 * fcw->negstop_en = 0;
+	 * fcw->gain_i = 1;
+	 * fcw->gain_h = 1;
+	 */
+	if (fcw->hcout_en > 0) {
+		parity_offset = (op->ldpc_dec.basegraph == 1 ? 20 : 8)
+			* op->ldpc_dec.z_c - op->ldpc_dec.n_filler;
+		k0_p = (fcw->k0 > parity_offset) ?
+				fcw->k0 - op->ldpc_dec.n_filler : fcw->k0;
+		ncb_p = fcw->ncb - op->ldpc_dec.n_filler;
+		l = k0_p + fcw->rm_e;
+		harq_out_length = (uint16_t) fcw->hcin_size0;
+		harq_out_length = RTE_MIN(RTE_MAX(harq_out_length, l), ncb_p);
+		harq_out_length = (harq_out_length + 0x3F) & 0xFFC0;
+		if ((k0_p > fcw->hcin_size0 + ACC100_HARQ_OFFSET_THRESHOLD) &&
+				harq_prun) {
+			fcw->hcout_size0 = (uint16_t) fcw->hcin_size0;
+			fcw->hcout_offset = k0_p & 0xFFC0;
+			fcw->hcout_size1 = harq_out_length - fcw->hcout_offset;
+		} else {
+			fcw->hcout_size0 = harq_out_length;
+			fcw->hcout_size1 = 0;
+			fcw->hcout_offset = 0;
+		}
+		harq_layout[harq_index].offset = fcw->hcout_offset;
+		harq_layout[harq_index].size0 = fcw->hcout_size0;
+	} else {
+		fcw->hcout_size0 = 0;
+		fcw->hcout_size1 = 0;
+		fcw->hcout_offset = 0;
+	}
+}
+
+/**
+ * Fills descriptor with data pointers of one block type.
+ *
+ * @param desc
+ *   Pointer to DMA descriptor.
+ * @param input
+ *   Pointer to pointer to input data which will be encoded. It can be
+ *   changed to point to the next segment in the scatter-gather case.
+ * @param offset
+ *   Input offset in rte_mbuf structure. It is used for calculating the point
+ *   where data is starting.
+ * @param cb_len
+ *   Length of currently processed Code Block
+ * @param seg_total_left
+ *   Indicates how many bytes are still left in the segment (mbuf) for
+ *   further processing.
+ * @param next_triplet
+ *   Index for ACC100 DMA Descriptor triplet
+ *
+ * @return
+ *   Index of the next triplet on success, a negative value if the lengths
+ *   of the pkt and the processed CB do not match.
+ *
+ */
+static inline int
+acc100_dma_fill_blk_type_in(struct acc100_dma_req_desc *desc,
+		struct rte_mbuf **input, uint32_t *offset, uint32_t cb_len,
+		uint32_t *seg_total_left, int next_triplet)
+{
+	uint32_t part_len;
+	struct rte_mbuf *m = *input;
+
+	part_len = (*seg_total_left < cb_len) ? *seg_total_left : cb_len;
+	cb_len -= part_len;
+	*seg_total_left -= part_len;
+
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(m, *offset);
+	desc->data_ptrs[next_triplet].blen = part_len;
+	desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
+	desc->data_ptrs[next_triplet].last = 0;
+	desc->data_ptrs[next_triplet].dma_ext = 0;
+	*offset += part_len;
+	next_triplet++;
+
+	while (cb_len > 0) {
+		if (next_triplet < ACC100_DMA_MAX_NUM_POINTERS &&
+				m->next != NULL) {
+
+			m = m->next;
+			*seg_total_left = rte_pktmbuf_data_len(m);
+			part_len = (*seg_total_left < cb_len) ?
+					*seg_total_left :
+					cb_len;
+			desc->data_ptrs[next_triplet].address =
+					rte_pktmbuf_iova_offset(m, 0);
+			desc->data_ptrs[next_triplet].blen = part_len;
+			desc->data_ptrs[next_triplet].blkid =
+					ACC100_DMA_BLKID_IN;
+			desc->data_ptrs[next_triplet].last = 0;
+			desc->data_ptrs[next_triplet].dma_ext = 0;
+			cb_len -= part_len;
+			*seg_total_left -= part_len;
+			/* Initializing offset for next segment (mbuf) */
+			*offset = part_len;
+			next_triplet++;
+		} else {
+			rte_bbdev_log(ERR,
+				"Some data still left for processing: "
+				"data_left: %u, next_triplet: %u, next_mbuf: %p",
+				cb_len, next_triplet, m->next);
+			return -EINVAL;
+		}
+	}
+	/* Storing new mbuf as it could be changed in scatter-gather case */
+	*input = m;
+
+	return next_triplet;
+}
+
+/* Fills descriptor with data pointers of one block type.
+ * Returns index of next triplet on success, other value if lengths of
+ * output data and processed mbuf do not match.
+ */
+static inline int
+acc100_dma_fill_blk_type_out(struct acc100_dma_req_desc *desc,
+		struct rte_mbuf *output, uint32_t out_offset,
+		uint32_t output_len, int next_triplet, int blk_id)
+{
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(output, out_offset);
+	desc->data_ptrs[next_triplet].blen = output_len;
+	desc->data_ptrs[next_triplet].blkid = blk_id;
+	desc->data_ptrs[next_triplet].last = 0;
+	desc->data_ptrs[next_triplet].dma_ext = 0;
+	next_triplet++;
+
+	return next_triplet;
+}
+
+static inline int
+acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
+		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
+		struct rte_mbuf *output, uint32_t *in_offset,
+		uint32_t *out_offset, uint32_t *out_length,
+		uint32_t *mbuf_total_left, uint32_t *seg_total_left)
+{
+	int next_triplet = 1; /* FCW already done */
+	uint16_t K, in_length_in_bits, in_length_in_bytes;
+	struct rte_bbdev_op_ldpc_enc *enc = &op->ldpc_enc;
+
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /**< Timestamp could be disabled */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+
+	K = (enc->basegraph == 1 ? 22 : 10) * enc->z_c;
+	in_length_in_bits = K - enc->n_filler;
+	if ((enc->op_flags & RTE_BBDEV_LDPC_CRC_24A_ATTACH) ||
+			(enc->op_flags & RTE_BBDEV_LDPC_CRC_24B_ATTACH))
+		in_length_in_bits -= 24;
+	in_length_in_bytes = in_length_in_bits >> 3;
+
+	if (unlikely((*mbuf_total_left == 0) ||
+			(*mbuf_total_left < in_length_in_bytes))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, in_length_in_bytes);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
+			in_length_in_bytes,
+			seg_total_left, next_triplet);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= in_length_in_bytes;
+
+	/* Set output length */
+	/* Integer round up division by 8 */
+	*out_length = (enc->cb_params.e + 7) >> 3;
+
+	next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
+			*out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	op->ldpc_enc.output.length += *out_length;
+	*out_offset += *out_length;
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->data_ptrs[next_triplet - 1].dma_ext = 0;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
+
+static inline int
+acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
+		struct acc100_dma_req_desc *desc,
+		struct rte_mbuf **input, struct rte_mbuf *h_output,
+		uint32_t *in_offset, uint32_t *h_out_offset,
+		uint32_t *h_out_length, uint32_t *mbuf_total_left,
+		uint32_t *seg_total_left,
+		struct acc100_fcw_ld *fcw)
+{
+	struct rte_bbdev_op_ldpc_dec *dec = &op->ldpc_dec;
+	int next_triplet = 1; /* FCW already done */
+	uint32_t input_length;
+	uint16_t output_length, crc24_overlap = 0;
+	uint16_t sys_cols, K, h_p_size, h_np_size;
+	bool h_comp = check_bit(dec->op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /**< Timestamp could be disabled */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+
+	if (check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP))
+		crc24_overlap = 24;
+
+	/* Compute some LDPC BG lengths */
+	input_length = dec->cb_params.e;
+	if (check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_LLR_COMPRESSION))
+		input_length = (input_length * 3 + 3) / 4;
+	sys_cols = (dec->basegraph == 1) ? 22 : 10;
+	K = sys_cols * dec->z_c;
+	output_length = K - dec->n_filler - crc24_overlap;
+
+	if (unlikely((*mbuf_total_left == 0) ||
+			(*mbuf_total_left < input_length))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, input_length);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input,
+			in_offset, input_length,
+			seg_total_left, next_triplet);
+
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
+		h_p_size = fcw->hcin_size0 + fcw->hcin_size1;
+		if (h_comp)
+			h_p_size = (h_p_size * 3 + 3) / 4;
+		desc->data_ptrs[next_triplet].address =
+				dec->harq_combined_input.offset;
+		desc->data_ptrs[next_triplet].blen = h_p_size;
+		desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN_HARQ;
+		desc->data_ptrs[next_triplet].dma_ext = 1;
+#ifndef ACC100_EXT_MEM
+		acc100_dma_fill_blk_type_out(
+				desc,
+				op->ldpc_dec.harq_combined_input.data,
+				op->ldpc_dec.harq_combined_input.offset,
+				h_p_size,
+				next_triplet,
+				ACC100_DMA_BLKID_IN_HARQ);
+#endif
+		next_triplet++;
+	}
+
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= input_length;
+
+	next_triplet = acc100_dma_fill_blk_type_out(desc, h_output,
+			*h_out_offset, output_length >> 3, next_triplet,
+			ACC100_DMA_BLKID_OUT_HARD);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
+		/* Pruned size of the HARQ */
+		h_p_size = fcw->hcout_size0 + fcw->hcout_size1;
+		/* Non-Pruned size of the HARQ */
+		h_np_size = fcw->hcout_offset > 0 ?
+				fcw->hcout_offset + fcw->hcout_size1 :
+				h_p_size;
+		if (h_comp) {
+			h_np_size = (h_np_size * 3 + 3) / 4;
+			h_p_size = (h_p_size * 3 + 3) / 4;
+		}
+		dec->harq_combined_output.length = h_np_size;
+		desc->data_ptrs[next_triplet].address =
+				dec->harq_combined_output.offset;
+		desc->data_ptrs[next_triplet].blen = h_p_size;
+		desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_OUT_HARQ;
+		desc->data_ptrs[next_triplet].dma_ext = 1;
+#ifndef ACC100_EXT_MEM
+		acc100_dma_fill_blk_type_out(
+				desc,
+				dec->harq_combined_output.data,
+				dec->harq_combined_output.offset,
+				h_p_size,
+				next_triplet,
+				ACC100_DMA_BLKID_OUT_HARQ);
+#endif
+		next_triplet++;
+	}
+
+	*h_out_length = output_length >> 3;
+	dec->hard_output.length += *h_out_length;
+	*h_out_offset += *h_out_length;
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
+
+static inline void
+acc100_dma_desc_ld_update(struct rte_bbdev_dec_op *op,
+		struct acc100_dma_req_desc *desc,
+		struct rte_mbuf *input, struct rte_mbuf *h_output,
+		uint32_t *in_offset, uint32_t *h_out_offset,
+		uint32_t *h_out_length,
+		union acc100_harq_layout_data *harq_layout)
+{
+	int next_triplet = 1; /* FCW already done */
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(input, *in_offset);
+	next_triplet++;
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
+		struct rte_bbdev_op_data hi = op->ldpc_dec.harq_combined_input;
+		desc->data_ptrs[next_triplet].address = hi.offset;
+#ifndef ACC100_EXT_MEM
+		desc->data_ptrs[next_triplet].address =
+				rte_pktmbuf_iova_offset(hi.data, hi.offset);
+#endif
+		next_triplet++;
+	}
+
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(h_output, *h_out_offset);
+	*h_out_length = desc->data_ptrs[next_triplet].blen;
+	next_triplet++;
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
+		desc->data_ptrs[next_triplet].address =
+				op->ldpc_dec.harq_combined_output.offset;
+		/* Adjust based on previous operation */
+		struct rte_bbdev_dec_op *prev_op = desc->op_addr;
+		op->ldpc_dec.harq_combined_output.length =
+				prev_op->ldpc_dec.harq_combined_output.length;
+		int16_t hq_idx = op->ldpc_dec.harq_combined_output.offset /
+				ACC100_HARQ_OFFSET;
+		int16_t prev_hq_idx =
+				prev_op->ldpc_dec.harq_combined_output.offset
+				/ ACC100_HARQ_OFFSET;
+		harq_layout[hq_idx].val = harq_layout[prev_hq_idx].val;
+#ifndef ACC100_EXT_MEM
+		struct rte_bbdev_op_data ho =
+				op->ldpc_dec.harq_combined_output;
+		desc->data_ptrs[next_triplet].address =
+				rte_pktmbuf_iova_offset(ho.data, ho.offset);
+#endif
+		next_triplet++;
+	}
+
+	op->ldpc_dec.hard_output.length += *h_out_length;
+	desc->op_addr = op;
+}
+
+
+/* Enqueue a number of operations to HW and update software rings */
+static inline void
+acc100_dma_enqueue(struct acc100_queue *q, uint16_t n,
+		struct rte_bbdev_stats *queue_stats)
+{
+	union acc100_enqueue_reg_fmt enq_req;
+#ifdef RTE_BBDEV_OFFLOAD_COST
+	uint64_t start_time = 0;
+	queue_stats->acc_offload_cycles = 0;
+#else
+	RTE_SET_USED(queue_stats);
+#endif
+
+	enq_req.val = 0;
+	/* Setting offset, 100b for 256 DMA Desc */
+	enq_req.addr_offset = ACC100_DESC_OFFSET;
+
+	/* Split ops into batches */
+	do {
+		union acc100_dma_desc *desc;
+		uint16_t enq_batch_size;
+		uint64_t offset;
+		rte_iova_t req_elem_addr;
+
+		enq_batch_size = RTE_MIN(n, MAX_ENQ_BATCH_SIZE);
+
+		/* Set flag on last descriptor in a batch */
+		desc = q->ring_addr + ((q->sw_ring_head + enq_batch_size - 1) &
+				q->sw_ring_wrap_mask);
+		desc->req.last_desc_in_batch = 1;
+
+		/* Calculate the 1st descriptor's address */
+		offset = ((q->sw_ring_head & q->sw_ring_wrap_mask) *
+				sizeof(union acc100_dma_desc));
+		req_elem_addr = q->ring_addr_phys + offset;
+
+		/* Fill enqueue struct */
+		enq_req.num_elem = enq_batch_size;
+		/* low 6 bits are not needed */
+		enq_req.req_elem_addr = (uint32_t)(req_elem_addr >> 6);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "Req sdone", desc, sizeof(*desc));
+#endif
+		rte_bbdev_log_debug(
+				"Enqueue %u reqs (phys %#"PRIx64") to reg %p",
+				enq_batch_size,
+				req_elem_addr,
+				(void *)q->mmio_reg_enqueue);
+
+		rte_wmb();
+
+		rte_bbdev_log_debug("MMIO enqueue doorbell write");
+#ifdef RTE_BBDEV_OFFLOAD_COST
+		/* Start time measurement for enqueue function offload. */
+		start_time = rte_rdtsc_precise();
+#endif
+		mmio_write(q->mmio_reg_enqueue, enq_req.val);
+
+#ifdef RTE_BBDEV_OFFLOAD_COST
+		queue_stats->acc_offload_cycles +=
+				rte_rdtsc_precise() - start_time;
+#endif
+
+		q->aq_enqueued++;
+		q->sw_ring_head += enq_batch_size;
+		n -= enq_batch_size;
+
+	} while (n);
+
+
+}
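The batching logic in acc100_dma_enqueue() can be sketched outside the driver. The helpers below are illustrative stand-ins, not driver API: `max_batch` plays the role of MAX_ENQ_BATCH_SIZE, and `ring_slot()` mirrors how the free-running `sw_ring_head` counter is masked back into the ring with `sw_ring_wrap_mask` (assumed to be a power-of-two size minus one).

```c
#include <stdint.h>
#include <assert.h>

/* Split `n` descriptors into doorbell batches of at most `max_batch`,
 * the way acc100_dma_enqueue() walks the software ring; returns the
 * number of MMIO doorbell writes that would be issued. */
static unsigned int
count_doorbell_writes(uint16_t n, uint16_t max_batch)
{
	unsigned int writes = 0;
	while (n > 0) {
		uint16_t batch = (n < max_batch) ? n : max_batch;
		n -= batch;
		writes++;
	}
	return writes;
}

/* Ring indices grow monotonically; a power-of-two wrap mask maps them
 * back into the ring, as done with q->sw_ring_wrap_mask. */
static uint16_t
ring_slot(uint16_t head, uint16_t wrap_mask)
{
	return head & wrap_mask;
}
```

One doorbell write per batch keeps MMIO traffic bounded regardless of burst size.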
+
+/* Enqueue a group of muxed LDPC encode operations for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
+		uint16_t total_enqueued_cbs, int16_t num)
+{
+	union acc100_dma_desc *desc = NULL;
+	uint32_t out_length;
+	struct rte_mbuf *output_head, *output;
+	int i, next_triplet;
+	uint16_t  in_length_in_bytes;
+	struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_le_fill(ops[0], &desc->req.fcw_le, num);
+
+	/* This could be done at polling */
+	desc->req.word0 = ACC100_DMA_DESC_TYPE;
+	desc->req.word1 = 0; /* Timestamp could be disabled */
+	desc->req.word2 = 0;
+	desc->req.word3 = 0;
+	desc->req.numCBs = num;
+
+	in_length_in_bytes = ops[0]->ldpc_enc.input.data->data_len;
+	out_length = (enc->cb_params.e + 7) >> 3;
+	desc->req.m2dlen = 1 + num;
+	desc->req.d2mlen = num;
+	next_triplet = 1;
+
+	for (i = 0; i < num; i++) {
+		desc->req.data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(ops[i]->ldpc_enc.input.data, 0);
+		desc->req.data_ptrs[next_triplet].blen = in_length_in_bytes;
+		next_triplet++;
+		desc->req.data_ptrs[next_triplet].address =
+				rte_pktmbuf_iova_offset(
+				ops[i]->ldpc_enc.output.data, 0);
+		desc->req.data_ptrs[next_triplet].blen = out_length;
+		next_triplet++;
+		ops[i]->ldpc_enc.output.length = out_length;
+		output_head = output = ops[i]->ldpc_enc.output.data;
+		mbuf_append(output_head, output, out_length);
+		output->data_len = out_length;
+	}
+
+	desc->req.op_addr = ops[0];
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_le,
+			sizeof(desc->req.fcw_le) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+	/* Several CBs (ops) were successfully prepared to enqueue */
+	return num;
+}
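The descriptor sizing arithmetic used above when muxing `num` encode CBs into one descriptor can be restated as a sketch. These are local illustrative helpers: the rate-matched output of E bits rounds up to bytes, and the DMA direction lengths count one FCW triplet plus one input triplet per CB (m2dlen) and one output triplet per CB (d2mlen).

```c
#include <stdint.h>
#include <assert.h>

/* Rate-matched output size in bytes for E bits, as in (enc->cb_params.e + 7) >> 3 */
static uint32_t e_bits_to_bytes(uint32_t e) { return (e + 7) >> 3; }

/* Host-to-device triplet count: 1 FCW + one input pointer per CB */
static uint32_t m2dlen_for(uint16_t num_cbs) { return 1 + num_cbs; }

/* Device-to-host triplet count: one output pointer per CB */
static uint32_t d2mlen_for(uint16_t num_cbs) { return num_cbs; }
```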
+
+/* Enqueue one LDPC encode operation for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+		uint16_t total_enqueued_cbs)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+		seg_total_left;
+	struct rte_mbuf *input, *output_head, *output;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_le_fill(op, &desc->req.fcw_le, 1);
+
+	input = op->ldpc_enc.input.data;
+	output_head = output = op->ldpc_enc.output.data;
+	in_offset = op->ldpc_enc.input.offset;
+	out_offset = op->ldpc_enc.output.offset;
+	out_length = 0;
+	mbuf_total_left = op->ldpc_enc.input.length;
+	seg_total_left = rte_pktmbuf_data_len(op->ldpc_enc.input.data)
+			- in_offset;
+
+	ret = acc100_dma_desc_le_fill(op, &desc->req, &input, output,
+			&in_offset, &out_offset, &out_length, &mbuf_total_left,
+			&seg_total_left);
+
+	if (unlikely(ret < 0))
+		return ret;
+
+	mbuf_append(output_head, output, out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_le,
+			sizeof(desc->req.fcw_le) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+
+	/* Check if any data left after processing one CB */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left after processing one CB: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+/* Enqueue one LDPC decode operation for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs, bool same_op)
+{
+	int ret;
+
+	union acc100_dma_desc *desc;
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	struct rte_mbuf *input, *h_output_head, *h_output;
+	uint32_t in_offset, h_out_offset, mbuf_total_left, h_out_length = 0;
+	input = op->ldpc_dec.input.data;
+	h_output_head = h_output = op->ldpc_dec.hard_output.data;
+	in_offset = op->ldpc_dec.input.offset;
+	h_out_offset = op->ldpc_dec.hard_output.offset;
+	mbuf_total_left = op->ldpc_dec.input.length;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(input == NULL)) {
+		rte_bbdev_log(ERR, "Invalid mbuf pointer");
+		return -EFAULT;
+	}
+#endif
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+
+	if (same_op) {
+		union acc100_dma_desc *prev_desc;
+		desc_idx = ((q->sw_ring_head + total_enqueued_cbs - 1)
+				& q->sw_ring_wrap_mask);
+		prev_desc = q->ring_addr + desc_idx;
+		uint8_t *prev_ptr = (uint8_t *) prev_desc;
+		uint8_t *new_ptr = (uint8_t *) desc;
+		/* Copy first 4 words and BDESCs */
+		rte_memcpy(new_ptr, prev_ptr, 16);
+		rte_memcpy(new_ptr + 36, prev_ptr + 36, 40);
+		desc->req.op_addr = prev_desc->req.op_addr;
+		/* Copy FCW */
+		rte_memcpy(new_ptr + ACC100_DESC_FCW_OFFSET,
+				prev_ptr + ACC100_DESC_FCW_OFFSET,
+				ACC100_FCW_LD_BLEN);
+		acc100_dma_desc_ld_update(op, &desc->req, input, h_output,
+				&in_offset, &h_out_offset,
+				&h_out_length, harq_layout);
+	} else {
+		struct acc100_fcw_ld *fcw;
+		uint32_t seg_total_left;
+		fcw = &desc->req.fcw_ld;
+		acc100_fcw_ld_fill(op, fcw, harq_layout);
+
+		/* Special handling when overusing mbuf */
+		if (fcw->rm_e < MAX_E_MBUF)
+			seg_total_left = rte_pktmbuf_data_len(input)
+					- in_offset;
+		else
+			seg_total_left = fcw->rm_e;
+
+		ret = acc100_dma_desc_ld_fill(op, &desc->req, &input, h_output,
+				&in_offset, &h_out_offset,
+				&h_out_length, &mbuf_total_left,
+				&seg_total_left, fcw);
+		if (unlikely(ret < 0))
+			return ret;
+	}
+
+	/* Hard output */
+	mbuf_append(h_output_head, h_output, h_out_length);
+#ifndef ACC100_EXT_MEM
+	if (op->ldpc_dec.harq_combined_output.length > 0) {
+		/* Push the HARQ output into host memory */
+		struct rte_mbuf *hq_output_head, *hq_output;
+		hq_output_head = op->ldpc_dec.harq_combined_output.data;
+		hq_output = op->ldpc_dec.harq_combined_output.data;
+		mbuf_append(hq_output_head, hq_output,
+				op->ldpc_dec.harq_combined_output.length);
+	}
+#endif
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
+			sizeof(desc->req.fcw_ld) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+
+/* Enqueue one LDPC decode operation for ACC100 device in TB mode */
+static inline int
+enqueue_ldpc_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint8_t r, c;
+	uint32_t in_offset, h_out_offset,
+		h_out_length, mbuf_total_left, seg_total_left;
+	struct rte_mbuf *input, *h_output_head, *h_output;
+	uint16_t current_enqueued_cbs = 0;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+	acc100_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout);
+
+	input = op->ldpc_dec.input.data;
+	h_output_head = h_output = op->ldpc_dec.hard_output.data;
+	in_offset = op->ldpc_dec.input.offset;
+	h_out_offset = op->ldpc_dec.hard_output.offset;
+	h_out_length = 0;
+	mbuf_total_left = op->ldpc_dec.input.length;
+	c = op->ldpc_dec.tb_params.c;
+	r = op->ldpc_dec.tb_params.r;
+
+	while (mbuf_total_left > 0 && r < c) {
+
+		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+
+		/* Set up DMA descriptor */
+		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+				& q->sw_ring_wrap_mask);
+		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+		desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
+		ret = acc100_dma_desc_ld_fill(op, &desc->req, &input,
+				h_output, &in_offset, &h_out_offset,
+				&h_out_length,
+				&mbuf_total_left, &seg_total_left,
+				&desc->req.fcw_ld);
+
+		if (unlikely(ret < 0))
+			return ret;
+
+		/* Hard output */
+		mbuf_append(h_output_head, h_output, h_out_length);
+
+		/* Set total number of CBs in TB */
+		desc->req.cbs_in_tb = cbs_in_tb;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "FCW", &desc->req.fcw_td,
+				sizeof(desc->req.fcw_td) - 8);
+		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+		if (seg_total_left == 0) {
+			/* Go to the next mbuf */
+			input = input->next;
+			in_offset = 0;
+			h_output = h_output->next;
+			h_out_offset = 0;
+		}
+		total_enqueued_cbs++;
+		current_enqueued_cbs++;
+		r++;
+	}
+
+	if (unlikely(desc == NULL))
+		return current_enqueued_cbs;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Check if any CBs left for processing */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left for processing: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+	/* Set SDone on last CB descriptor for TB mode */
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	return current_enqueued_cbs;
+}
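The FCW address computation above relies on each ACC100 ring descriptor occupying 256 bytes, hence the `desc_idx << 8` shift. A minimal sketch of that addressing, where the FCW offset inside a descriptor (ACC100_DESC_FCW_OFFSET) is taken as 32 purely as an illustrative assumption:

```c
#include <stdint.h>
#include <assert.h>

/* Physical offset of the FCW of descriptor `desc_idx` from the ring base:
 * one 256-byte descriptor per slot, plus the FCW offset inside it. */
static uint64_t
fcw_phys_offset(uint16_t desc_idx, uint64_t fcw_off_in_desc)
{
	return ((uint64_t)desc_idx << 8) + fcw_off_in_desc;
}
```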
+
+
+/* Calculates number of CBs in processed encoder TB based on 'r' and input
+ * length.
+ */
+static inline uint8_t
+get_num_cbs_in_tb_enc(struct rte_bbdev_op_turbo_enc *turbo_enc)
+{
+	uint8_t c, c_neg, r, crc24_bits = 0;
+	uint16_t k, k_neg, k_pos;
+	uint8_t cbs_in_tb = 0;
+	int32_t length;
+
+	length = turbo_enc->input.length;
+	r = turbo_enc->tb_params.r;
+	c = turbo_enc->tb_params.c;
+	c_neg = turbo_enc->tb_params.c_neg;
+	k_neg = turbo_enc->tb_params.k_neg;
+	k_pos = turbo_enc->tb_params.k_pos;
+	crc24_bits = 0;
+	if (check_bit(turbo_enc->op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH))
+		crc24_bits = 24;
+	while (length > 0 && r < c) {
+		k = (r < c_neg) ? k_neg : k_pos;
+		length -= (k - crc24_bits) >> 3;
+		r++;
+		cbs_in_tb++;
+	}
+
+	return cbs_in_tb;
+}
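The CB-counting walk above can be reproduced as a stand-alone sketch: the remaining TB length is reduced by each code block's payload in bytes (K minus the per-CB CRC bits), with `k_neg` applying to the first `c_neg` blocks and `k_pos` to the rest. All names mirror the driver; `crc24_bits` is 24 when CRC24B attachment is enabled.

```c
#include <stdint.h>
#include <assert.h>

/* Stand-alone mirror of get_num_cbs_in_tb_enc() */
static uint8_t
cbs_in_tb_enc(int32_t length_bytes, uint8_t r, uint8_t c, uint8_t c_neg,
		uint16_t k_neg, uint16_t k_pos, uint8_t crc24_bits)
{
	uint8_t cbs = 0;
	while (length_bytes > 0 && r < c) {
		uint16_t k = (r < c_neg) ? k_neg : k_pos;
		/* Payload per CB in bytes: (K - CRC) / 8 */
		length_bytes -= (k - crc24_bits) >> 3;
		r++;
		cbs++;
	}
	return cbs;
}
```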
+
+/* Calculates number of CBs in processed decoder TB based on 'r' and input
+ * length.
+ */
+static inline uint16_t
+get_num_cbs_in_tb_dec(struct rte_bbdev_op_turbo_dec *turbo_dec)
+{
+	uint8_t c, c_neg, r = 0;
+	uint16_t kw, k, k_neg, k_pos, cbs_in_tb = 0;
+	int32_t length;
+
+	length = turbo_dec->input.length;
+	r = turbo_dec->tb_params.r;
+	c = turbo_dec->tb_params.c;
+	c_neg = turbo_dec->tb_params.c_neg;
+	k_neg = turbo_dec->tb_params.k_neg;
+	k_pos = turbo_dec->tb_params.k_pos;
+	while (length > 0 && r < c) {
+		k = (r < c_neg) ? k_neg : k_pos;
+		kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
+		length -= kw;
+		r++;
+		cbs_in_tb++;
+	}
+
+	return cbs_in_tb;
+}
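The per-CB soft-buffer size used above, kw = RTE_ALIGN_CEIL(k + 4, 32) * 3, reflects the three constituent turbo streams each carrying K info bits plus 4 tail bits, aligned up to 32. A local sketch of that arithmetic (not the driver macro):

```c
#include <stdint.h>
#include <assert.h>

/* Mirror of kw = RTE_ALIGN_CEIL(k + 4, 32) * 3 with a local ceiling align */
static uint32_t
kw_for_k(uint16_t k)
{
	uint32_t aligned = (((uint32_t)k + 4) + 31) / 32 * 32;
	return aligned * 3;
}
```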
+
+/* Calculates number of CBs in processed LDPC decoder TB based on 'r' and input
+ * length.
+ */
+static inline uint16_t
+get_num_cbs_in_tb_ldpc_dec(struct rte_bbdev_op_ldpc_dec *ldpc_dec)
+{
+	uint16_t r, cbs_in_tb = 0;
+	int32_t length = ldpc_dec->input.length;
+	r = ldpc_dec->tb_params.r;
+	while (length > 0 && r < ldpc_dec->tb_params.c) {
+		length -=  (r < ldpc_dec->tb_params.cab) ?
+				ldpc_dec->tb_params.ea :
+				ldpc_dec->tb_params.eb;
+		r++;
+		cbs_in_tb++;
+	}
+	return cbs_in_tb;
+}
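The LDPC variant above differs from the turbo case in that the split point is `cab`: the first `cab` code blocks are rate-matched to `ea` bytes of input and the remainder to `eb`. A stand-alone mirror:

```c
#include <stdint.h>
#include <assert.h>

/* Stand-alone mirror of get_num_cbs_in_tb_ldpc_dec() */
static uint16_t
cbs_in_tb_ldpc_dec(int32_t length, uint16_t r, uint16_t c, uint16_t cab,
		uint32_t ea, uint32_t eb)
{
	uint16_t cbs = 0;
	while (length > 0 && r < c) {
		/* First `cab` blocks consume ea, the rest eb */
		length -= (r < cab) ? (int32_t)ea : (int32_t)eb;
		r++;
		cbs++;
	}
	return cbs;
}
```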
+
+/* Check if we can mux encode operations sharing a common FCW */
+static inline bool
+check_mux(struct rte_bbdev_enc_op **ops, uint16_t num) {
+	uint16_t i;
+	if (num == 1)
+		return false;
+	for (i = 1; i < num; ++i) {
+		/* Only mux compatible code blocks */
+		if (memcmp((uint8_t *)(&ops[i]->ldpc_enc) + ENC_OFFSET,
+				(uint8_t *)(&ops[0]->ldpc_enc) + ENC_OFFSET,
+				CMP_ENC_SIZE) != 0)
+			return false;
+	}
+	return true;
+}
+
+/* Enqueue LDPC encode operations for ACC100 device in CB mode. */
+static inline uint16_t
+acc100_enqueue_ldpc_enc_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i = 0;
+	union acc100_dma_desc *desc;
+	int ret, desc_idx = 0;
+	int16_t enq, left = num;
+
+	while (left > 0) {
+		if (unlikely(avail - 1 < 0))
+			break;
+		avail--;
+		enq = RTE_MIN(left, MUX_5GDL_DESC);
+		if (check_mux(&ops[i], enq)) {
+			ret = enqueue_ldpc_enc_n_op_cb(q, &ops[i],
+					desc_idx, enq);
+			if (ret < 0)
+				break;
+			i += enq;
+		} else {
+			ret = enqueue_ldpc_enc_one_op_cb(q, ops[i], desc_idx);
+			if (ret < 0)
+				break;
+			i++;
+		}
+		desc_idx++;
+		left = num - i;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + desc_idx - 1)
+			& q->sw_ring_wrap_mask);
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, desc_idx, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
+}
+
+/* Enqueue encode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	if (unlikely(num == 0))
+		return 0;
+	return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
+}
+
+/* Check if a decode operation can reuse the FCW of the previous operation */
+static inline bool
+cmp_ldpc_dec_op(struct rte_bbdev_dec_op **ops) {
+	/* Only mux compatible code blocks */
+	if (memcmp((uint8_t *)(&ops[0]->ldpc_dec) + DEC_OFFSET,
+			(uint8_t *)(&ops[1]->ldpc_dec) +
+			DEC_OFFSET, CMP_DEC_SIZE) != 0) {
+		return false;
+	} else
+		return true;
+}
+
+
+/* Enqueue decode operations for ACC100 device in TB mode */
+static uint16_t
+acc100_enqueue_ldpc_dec_tb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i, enqueued_cbs = 0;
+	uint8_t cbs_in_tb;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		cbs_in_tb = get_num_cbs_in_tb_ldpc_dec(&ops[i]->ldpc_dec);
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - cbs_in_tb < 0))
+			break;
+		avail -= cbs_in_tb;
+
+		ret = enqueue_ldpc_dec_one_op_tb(q, ops[i],
+				enqueued_cbs, cbs_in_tb);
+		if (ret < 0)
+			break;
+		enqueued_cbs += ret;
+	}
+
+	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+	return i;
+}
+
+/* Enqueue decode operations for ACC100 device in CB mode */
+static uint16_t
+acc100_enqueue_ldpc_dec_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i;
+	union acc100_dma_desc *desc;
+	int ret;
+	bool same_op = false;
+	for (i = 0; i < num; ++i) {
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - 1 < 0))
+			break;
+		avail -= 1;
+
+		if (i > 0)
+			same_op = cmp_ldpc_dec_op(&ops[i-1]);
+		rte_bbdev_log_debug("Op %d %d %d %d %d %d %d %d %d %d %d %d",
+			i, ops[i]->ldpc_dec.op_flags, ops[i]->ldpc_dec.rv_index,
+			ops[i]->ldpc_dec.iter_max, ops[i]->ldpc_dec.iter_count,
+			ops[i]->ldpc_dec.basegraph, ops[i]->ldpc_dec.z_c,
+			ops[i]->ldpc_dec.n_cb, ops[i]->ldpc_dec.q_m,
+			ops[i]->ldpc_dec.n_filler, ops[i]->ldpc_dec.cb_params.e,
+			same_op);
+		ret = enqueue_ldpc_dec_one_op_cb(q, ops[i], i, same_op);
+		if (ret < 0)
+			break;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
+			& q->sw_ring_wrap_mask);
+
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, i, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+	return i;
+}
+
+/* Enqueue decode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t aq_avail = q->aq_depth +
+			(q->aq_dequeued - q->aq_enqueued) / 128;
+
+	if (unlikely((aq_avail == 0) || (num == 0)))
+		return 0;
+
+	if (ops[0]->ldpc_dec.code_block_mode == 0)
+		return acc100_enqueue_ldpc_dec_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_ldpc_dec_cb(q_data, ops, num);
+}
+
+
+/* Dequeue one encode operation (may carry several muxed CBs) from ACC100 device in CB mode */
+static inline int
+dequeue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
+		uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_enc_op *op;
+	int i;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	rsp.val = atom_desc.rsp.val;
+	rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+
+	op->status |= ((rsp.input_err)
+			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+	if (desc->req.last_desc_in_batch) {
+		(*aq_dequeued)++;
+		desc->req.last_desc_in_batch = 0;
+	}
+	desc->rsp.val = ACC100_DMA_DESC_TYPE;
+	desc->rsp.add_info_0 = 0; /* Reserved bits */
+	desc->rsp.add_info_1 = 0; /* Reserved bits */
+
+	/* Flag that the muxing causes loss of opaque data */
+	op->opaque_data = (void *)-1;
+	for (i = 0 ; i < desc->req.numCBs; i++)
+		ref_op[i] = op;
+
+	/* One CB (op) was successfully dequeued */
+	return desc->req.numCBs;
+}
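The status composition above can be sketched in isolation: each hardware response flag maps onto one bit of the bbdev op status bitmap. The bit positions below are local constants chosen for illustration only, not the actual RTE_BBDEV_* enum values; note that both DMA and FCW errors fold into the same driver-error bit.

```c
#include <stdint.h>
#include <assert.h>

/* Illustrative bit positions (assumed, not the real bbdev enum values) */
enum { DATA_ERROR = 0, DRV_ERROR = 4 };

/* Fold response flags into an op status bitmap, as the dequeue path does */
static uint32_t
status_from_rsp(int input_err, int dma_err, int fcw_err)
{
	uint32_t status = 0;
	status |= input_err ? (1u << DATA_ERROR) : 0;
	status |= dma_err ? (1u << DRV_ERROR) : 0;
	status |= fcw_err ? (1u << DRV_ERROR) : 0;
	return status;
}
```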
+
+/* Dequeue one encode operation from ACC100 device in TB mode */
+static inline int
+dequeue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
+		uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, *last_desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_enc_op *op;
+	uint8_t i = 0;
+	uint16_t current_dequeued_cbs = 0, cbs_in_tb;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	/* Get number of CBs in dequeued TB */
+	cbs_in_tb = desc->req.cbs_in_tb;
+	/* Get last CB */
+	last_desc = q->ring_addr + ((q->sw_ring_tail
+			+ total_dequeued_cbs + cbs_in_tb - 1)
+			& q->sw_ring_wrap_mask);
+	/* Check if last CB in TB is ready to dequeue (and thus
+	 * the whole TB) - checking sdone bit. If not return.
+	 */
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
+			__ATOMIC_RELAXED);
+	if (!(atom_desc.rsp.val & ACC100_SDONE))
+		return -1;
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+
+	while (i < cbs_in_tb) {
+		desc = q->ring_addr + ((q->sw_ring_tail
+				+ total_dequeued_cbs)
+				& q->sw_ring_wrap_mask);
+		atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+				__ATOMIC_RELAXED);
+		rsp.val = atom_desc.rsp.val;
+		rte_bbdev_log_debug("Resp. desc %p: %x", desc,
+				rsp.val);
+
+		op->status |= ((rsp.input_err)
+				? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+		op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+		op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+		if (desc->req.last_desc_in_batch) {
+			(*aq_dequeued)++;
+			desc->req.last_desc_in_batch = 0;
+		}
+		desc->rsp.val = ACC100_DMA_DESC_TYPE;
+		desc->rsp.add_info_0 = 0;
+		desc->rsp.add_info_1 = 0;
+		total_dequeued_cbs++;
+		current_dequeued_cbs++;
+		i++;
+	}
+
+	*ref_op = op;
+
+	return current_dequeued_cbs;
+}
+
+/* Dequeue one decode operation from ACC100 device in CB mode */
+static inline int
+dequeue_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
+		struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_dec_op *op;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	rsp.val = atom_desc.rsp.val;
+	rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+	op->status |= ((rsp.input_err)
+			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+	if (op->status != 0)
+		q_data->queue_stats.dequeue_err_count++;
+
+	/* Report CRC status only when no other error was flagged */
+	if (!op->status)
+		op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+	op->turbo_dec.iter_count = (uint8_t) rsp.iter_cnt / 2;
+	/* Check if this is the last desc in batch (Atomic Queue) */
+	if (desc->req.last_desc_in_batch) {
+		(*aq_dequeued)++;
+		desc->req.last_desc_in_batch = 0;
+	}
+	desc->rsp.val = ACC100_DMA_DESC_TYPE;
+	desc->rsp.add_info_0 = 0;
+	desc->rsp.add_info_1 = 0;
+	*ref_op = op;
+
+	/* One CB (op) was successfully dequeued */
+	return 1;
+}
+
+/* Dequeue one LDPC decode operation from ACC100 device in CB mode */
+static inline int
+dequeue_ldpc_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
+		struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_dec_op *op;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	rsp.val = atom_desc.rsp.val;
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+	op->status |= rsp.input_err << RTE_BBDEV_DATA_ERROR;
+	op->status |= rsp.dma_err << RTE_BBDEV_DRV_ERROR;
+	op->status |= rsp.fcw_err << RTE_BBDEV_DRV_ERROR;
+	if (op->status != 0)
+		q_data->queue_stats.dequeue_err_count++;
+
+	op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+	if (op->ldpc_dec.hard_output.length > 0 && !rsp.synd_ok)
+		op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
+	op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
+
+	/* Check if this is the last desc in batch (Atomic Queue) */
+	if (desc->req.last_desc_in_batch) {
+		(*aq_dequeued)++;
+		desc->req.last_desc_in_batch = 0;
+	}
+
+	desc->rsp.val = ACC100_DMA_DESC_TYPE;
+	desc->rsp.add_info_0 = 0;
+	desc->rsp.add_info_1 = 0;
+
+	*ref_op = op;
+
+	/* One CB (op) was successfully dequeued */
+	return 1;
+}
+
+/* Dequeue one decode operation from ACC100 device in TB mode. */
+static inline int
+dequeue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, *last_desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_dec_op *op;
+	uint8_t cbs_in_tb = 1, cb_idx = 0;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Get number of CBs in dequeued TB */
+	cbs_in_tb = desc->req.cbs_in_tb;
+	/* Get last CB */
+	last_desc = q->ring_addr + ((q->sw_ring_tail
+			+ dequeued_cbs + cbs_in_tb - 1)
+			& q->sw_ring_wrap_mask);
+	/* Check if last CB in TB is ready to dequeue (and thus
+	 * the whole TB) - checking sdone bit. If not return.
+	 */
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
+			__ATOMIC_RELAXED);
+	if (!(atom_desc.rsp.val & ACC100_SDONE))
+		return -1;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+
+	/* Read remaining CBs if any */
+	while (cb_idx < cbs_in_tb) {
+		desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+				& q->sw_ring_wrap_mask);
+		atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+				__ATOMIC_RELAXED);
+		rsp.val = atom_desc.rsp.val;
+		rte_bbdev_log_debug("Resp. desc %p: %x", desc,
+				rsp.val);
+
+		op->status |= ((rsp.input_err)
+				? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+		op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+		op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+		/* Report CRC status only when no other error was flagged */
+		if (!op->status)
+			op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+		op->turbo_dec.iter_count = RTE_MAX((uint8_t) rsp.iter_cnt,
+				op->turbo_dec.iter_count);
+
+		/* Check if this is the last desc in batch (Atomic Queue) */
+		if (desc->req.last_desc_in_batch) {
+			(*aq_dequeued)++;
+			desc->req.last_desc_in_batch = 0;
+		}
+		desc->rsp.val = ACC100_DMA_DESC_TYPE;
+		desc->rsp.add_info_0 = 0;
+		desc->rsp.add_info_1 = 0;
+		dequeued_cbs++;
+		cb_idx++;
+	}
+
+	*ref_op = op;
+
+	return cb_idx;
+}
+
+/* Dequeue LDPC encode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t dequeue_num, i, dequeued_cbs = 0, dequeued_descs = 0;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = (avail < num) ? avail : num;
+
+	for (i = 0; i < dequeue_num; i++) {
+		ret = dequeue_enc_one_op_cb(q, &ops[dequeued_cbs],
+				dequeued_descs, &aq_dequeued);
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+		dequeued_descs++;
+		if (dequeued_cbs >= num)
+			break;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_descs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += dequeued_cbs;
+
+	return dequeued_cbs;
+}
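The `avail = q->sw_ring_head - q->sw_ring_tail` computation above works because head and tail are free-running unsigned counters: their difference in modular arithmetic stays correct even after the counters wrap. A minimal sketch of that property:

```c
#include <stdint.h>
#include <assert.h>

/* Number of descriptors awaiting dequeue from free-running counters;
 * unsigned subtraction is well-defined across 32-bit wraparound. */
static uint32_t
ready_descs(uint32_t head, uint32_t tail)
{
	return head - tail;
}
```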
+
+/* Dequeue decode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint16_t dequeue_num;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t i;
+	uint16_t dequeued_cbs = 0;
+	struct rte_bbdev_dec_op *op;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = (avail < num) ? avail : num;
+
+	for (i = 0; i < dequeue_num; ++i) {
+		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask))->req.op_addr;
+		if (op->ldpc_dec.code_block_mode == 0)
+			ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+		else
+			ret = dequeue_ldpc_dec_one_op_cb(
+					q_data, q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_cbs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += i;
+
+	return i;
+}
+
 /* Initialization Function */
 static void
 acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
@@ -703,6 +2321,10 @@
 	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
 
 	dev->dev_ops = &acc100_bbdev_ops;
+	dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
+	dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
+	dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
+	dev->dequeue_ldpc_dec_ops = acc100_dequeue_ldpc_dec;
 
 	((struct acc100_device *) dev->data->dev_private)->pf_device =
 			!strcmp(drv->driver.name,
@@ -815,4 +2437,3 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
 RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
 RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
 RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
-
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 0e2b79c..78686c1 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -88,6 +88,8 @@
 #define TMPL_PRI_3      0x0f0e0d0c
 #define QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
 #define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
+#define ACC100_FDONE    0x80000000
+#define ACC100_SDONE    0x40000000
 
 #define ACC100_NUM_TMPL  32
 #define VF_OFFSET_QOS 16 /* offset in Memory Space specific to QoS Mon */
@@ -398,6 +400,7 @@ struct __rte_packed acc100_dma_req_desc {
 union acc100_dma_desc {
 	struct acc100_dma_req_desc req;
 	union acc100_dma_rsp_desc rsp;
+	uint64_t atom_hdr;
 };
 
 
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v6 08/11] baseband/acc100: add HARQ loopback support
  2020-09-23  2:19   ` [dpdk-dev] [PATCH v6 00/11] bbdev PMD ACC100 Nicolas Chautru
                       ` (6 preceding siblings ...)
  2020-09-23  2:19     ` [dpdk-dev] [PATCH v6 07/11] baseband/acc100: add LDPC processing functions Nicolas Chautru
@ 2020-09-23  2:19     ` Nicolas Chautru
  2020-09-23  2:19     ` [dpdk-dev] [PATCH v6 09/11] baseband/acc100: add support for 4G processing Nicolas Chautru
                       ` (2 subsequent siblings)
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-23  2:19 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Additional support for HARQ memory loopback

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 158 +++++++++++++++++++++++++++++++
 1 file changed, 158 insertions(+)
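The HARQ loopback path in this patch juggles two lengths: the in-memory footprint (which expands by 8/6 under 6-bit LLR compression and is aligned up to 64 bytes) and the DMA transfer length (which shrinks back by 6/8). A stand-alone sketch of that arithmetic, with a local `align_up()` standing in for RTE_ALIGN:

```c
#include <stdint.h>
#include <assert.h>

/* Local stand-in for RTE_ALIGN: round v up to a multiple of a */
static uint16_t align_up(uint16_t v, uint16_t a) { return (v + a - 1) / a * a; }

/* In-memory HARQ length: expand 6-bit-compressed LLRs to 8-bit slots,
 * then align to the 64-byte granularity of the HARQ memory. */
static uint16_t
harq_mem_length(uint16_t in_len, int h_comp)
{
	if (h_comp)
		in_len = in_len * 8 / 6;
	return align_up(in_len, 64);
}

/* DMA length: compressed data moves 6 bits per LLR over the bus */
static uint16_t
harq_dma_length(uint16_t mem_len, int h_comp)
{
	return h_comp ? mem_len * 6 / 8 : mem_len;
}
```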

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index b223547..e484c0a 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -658,6 +658,7 @@
 				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
 				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
 #ifdef ACC100_EXT_MEM
+				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK |
 				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
 				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
 #endif
@@ -1480,12 +1481,169 @@
 	return 1;
 }
 
+static inline int
+harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs) {
+	struct acc100_fcw_ld *fcw;
+	union acc100_dma_desc *desc;
+	int next_triplet = 1;
+	struct rte_mbuf *hq_output_head, *hq_output;
+	uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
+	if (harq_in_length == 0) {
+		rte_bbdev_log(ERR, "Loopback of invalid null size\n");
+		return -EINVAL;
+	}
+
+	int h_comp = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
+			) ? 1 : 0;
+	if (h_comp == 1)
+		harq_in_length = harq_in_length * 8 / 6;
+	harq_in_length = RTE_ALIGN(harq_in_length, 64);
+	uint16_t harq_dma_length_in = (h_comp == 0) ?
+			harq_in_length :
+			harq_in_length * 6 / 8;
+	uint16_t harq_dma_length_out = harq_dma_length_in;
+	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+	uint16_t harq_index = (ddr_mem_in ?
+			op->ldpc_dec.harq_combined_input.offset :
+			op->ldpc_dec.harq_combined_output.offset)
+			/ ACC100_HARQ_OFFSET;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	fcw = &desc->req.fcw_ld;
+	/* Set the FCW from loopback into DDR */
+	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
+	fcw->FCWversion = ACC100_FCW_VER;
+	fcw->qm = 2;
+	fcw->Zc = 384;
+	if (harq_in_length < 16 * N_ZC_1)
+		fcw->Zc = 16;
+	fcw->ncb = fcw->Zc * N_ZC_1;
+	fcw->rm_e = 2;
+	fcw->hcin_en = 1;
+	fcw->hcout_en = 1;
+
+	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d\n",
+			ddr_mem_in, harq_index,
+			harq_layout[harq_index].offset, harq_in_length,
+			harq_dma_length_in);
+
+	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
+		fcw->hcin_size0 = harq_layout[harq_index].size0;
+		fcw->hcin_offset = harq_layout[harq_index].offset;
+		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
+		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
+		if (h_comp == 1)
+			harq_dma_length_in = harq_dma_length_in * 6 / 8;
+	} else {
+		fcw->hcin_size0 = harq_in_length;
+	}
+	harq_layout[harq_index].val = 0;
+	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
+			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
+	fcw->hcout_size0 = harq_in_length;
+	fcw->hcin_decomp_mode = h_comp;
+	fcw->hcout_comp_mode = h_comp;
+	fcw->gain_i = 1;
+	fcw->gain_h = 1;
+
+	/* Set the prefix of descriptor. This could be done at polling */
+	desc->req.word0 = ACC100_DMA_DESC_TYPE;
+	desc->req.word1 = 0; /**< Timestamp could be disabled */
+	desc->req.word2 = 0;
+	desc->req.word3 = 0;
+	desc->req.numCBs = 1;
+
+	/* Null LLR input for Decoder */
+	desc->req.data_ptrs[next_triplet].address =
+			q->lb_in_addr_phys;
+	desc->req.data_ptrs[next_triplet].blen = 2;
+	desc->req.data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
+	desc->req.data_ptrs[next_triplet].last = 0;
+	desc->req.data_ptrs[next_triplet].dma_ext = 0;
+	next_triplet++;
+
+	/* HARQ Combine input from either Memory interface */
+	if (!ddr_mem_in) {
+		next_triplet = acc100_dma_fill_blk_type_out(&desc->req,
+				op->ldpc_dec.harq_combined_input.data,
+				op->ldpc_dec.harq_combined_input.offset,
+				harq_dma_length_in,
+				next_triplet,
+				ACC100_DMA_BLKID_IN_HARQ);
+	} else {
+		desc->req.data_ptrs[next_triplet].address =
+				op->ldpc_dec.harq_combined_input.offset;
+		desc->req.data_ptrs[next_triplet].blen =
+				harq_dma_length_in;
+		desc->req.data_ptrs[next_triplet].blkid =
+				ACC100_DMA_BLKID_IN_HARQ;
+		desc->req.data_ptrs[next_triplet].dma_ext = 1;
+		next_triplet++;
+	}
+	desc->req.data_ptrs[next_triplet - 1].last = 1;
+	desc->req.m2dlen = next_triplet;
+
+	/* Dropped decoder hard output */
+	desc->req.data_ptrs[next_triplet].address =
+			q->lb_out_addr_phys;
+	desc->req.data_ptrs[next_triplet].blen = BYTES_IN_WORD;
+	desc->req.data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_OUT_HARD;
+	desc->req.data_ptrs[next_triplet].last = 0;
+	desc->req.data_ptrs[next_triplet].dma_ext = 0;
+	next_triplet++;
+
+	/* HARQ Combine output to either Memory interface */
+	if (check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE
+			)) {
+		desc->req.data_ptrs[next_triplet].address =
+				op->ldpc_dec.harq_combined_output.offset;
+		desc->req.data_ptrs[next_triplet].blen =
+				harq_dma_length_out;
+		desc->req.data_ptrs[next_triplet].blkid =
+				ACC100_DMA_BLKID_OUT_HARQ;
+		desc->req.data_ptrs[next_triplet].dma_ext = 1;
+		next_triplet++;
+	} else {
+		hq_output_head = op->ldpc_dec.harq_combined_output.data;
+		hq_output = op->ldpc_dec.harq_combined_output.data;
+		next_triplet = acc100_dma_fill_blk_type_out(
+				&desc->req,
+				op->ldpc_dec.harq_combined_output.data,
+				op->ldpc_dec.harq_combined_output.offset,
+				harq_dma_length_out,
+				next_triplet,
+				ACC100_DMA_BLKID_OUT_HARQ);
+		/* HARQ output */
+		mbuf_append(hq_output_head, hq_output, harq_dma_length_out);
+		op->ldpc_dec.harq_combined_output.length =
+				harq_dma_length_out;
+	}
+	desc->req.data_ptrs[next_triplet - 1].last = 1;
+	desc->req.d2mlen = next_triplet - desc->req.m2dlen;
+	desc->req.op_addr = op;
+
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
 /** Enqueue one decode operations for ACC100 device in CB mode */
 static inline int
 enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
 		uint16_t total_enqueued_cbs, bool same_op)
 {
 	int ret;
+	if (unlikely(check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK))) {
+		ret = harq_loopback(q, op, total_enqueued_cbs);
+		return ret;
+	}
 
 	union acc100_dma_desc *desc;
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
-- 
1.8.3.1



* [dpdk-dev] [PATCH v6 09/11] baseband/acc100: add support for 4G processing
  2020-09-23  2:19   ` [dpdk-dev] [PATCH v6 00/11] bbdev PMD ACC100 Nicolas Chautru
                       ` (7 preceding siblings ...)
  2020-09-23  2:19     ` [dpdk-dev] [PATCH v6 08/11] baseband/acc100: add HARQ loopback support Nicolas Chautru
@ 2020-09-23  2:19     ` Nicolas Chautru
  2020-09-23  2:19     ` [dpdk-dev] [PATCH v6 10/11] baseband/acc100: add interrupt support to PMD Nicolas Chautru
  2020-09-23  2:19     ` [dpdk-dev] [PATCH v6 11/11] baseband/acc100: add debug function to validate input Nicolas Chautru
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-23  2:19 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add capability for 4G encode and decode processing

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 1010 ++++++++++++++++++++++++++++--
 1 file changed, 943 insertions(+), 67 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index e484c0a..7d4c3df 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -339,7 +339,6 @@
 	free_base_addresses(base_addrs, i);
 }
 
-
 /* Allocate 64MB memory used for all software rings */
 static int
 acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
@@ -637,6 +636,41 @@
 
 	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
 		{
+			.type = RTE_BBDEV_OP_TURBO_DEC,
+			.cap.turbo_dec = {
+				.capability_flags =
+					RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE |
+					RTE_BBDEV_TURBO_CRC_TYPE_24B |
+					RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
+					RTE_BBDEV_TURBO_EARLY_TERMINATION |
+					RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
+					RTE_BBDEV_TURBO_MAP_DEC |
+					RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP |
+					RTE_BBDEV_TURBO_DEC_SCATTER_GATHER,
+				.max_llr_modulus = INT8_MAX,
+				.num_buffers_src =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+				.num_buffers_hard_out =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+				.num_buffers_soft_out =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+			}
+		},
+		{
+			.type = RTE_BBDEV_OP_TURBO_ENC,
+			.cap.turbo_enc = {
+				.capability_flags =
+					RTE_BBDEV_TURBO_CRC_24B_ATTACH |
+					RTE_BBDEV_TURBO_RV_INDEX_BYPASS |
+					RTE_BBDEV_TURBO_RATE_MATCH |
+					RTE_BBDEV_TURBO_ENC_SCATTER_GATHER,
+				.num_buffers_src =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+				.num_buffers_dst =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+			}
+		},
+		{
 			.type   = RTE_BBDEV_OP_LDPC_ENC,
 			.cap.ldpc_enc = {
 				.capability_flags =
@@ -719,7 +753,6 @@
 #endif
 }
 
-
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.setup_queues = acc100_setup_queues,
 	.close = acc100_dev_close,
@@ -763,6 +796,58 @@
 	return tail;
 }
 
+/* Fill in a frame control word for turbo encoding. */
+static inline void
+acc100_fcw_te_fill(const struct rte_bbdev_enc_op *op, struct acc100_fcw_te *fcw)
+{
+	fcw->code_block_mode = op->turbo_enc.code_block_mode;
+	if (fcw->code_block_mode == 0) { /* For TB mode */
+		fcw->k_neg = op->turbo_enc.tb_params.k_neg;
+		fcw->k_pos = op->turbo_enc.tb_params.k_pos;
+		fcw->c_neg = op->turbo_enc.tb_params.c_neg;
+		fcw->c = op->turbo_enc.tb_params.c;
+		fcw->ncb_neg = op->turbo_enc.tb_params.ncb_neg;
+		fcw->ncb_pos = op->turbo_enc.tb_params.ncb_pos;
+
+		if (check_bit(op->turbo_enc.op_flags,
+				RTE_BBDEV_TURBO_RATE_MATCH)) {
+			fcw->bypass_rm = 0;
+			fcw->cab = op->turbo_enc.tb_params.cab;
+			fcw->ea = op->turbo_enc.tb_params.ea;
+			fcw->eb = op->turbo_enc.tb_params.eb;
+		} else {
+			/* E is set to the encoding output size when RM is
+			 * bypassed.
+			 */
+			fcw->bypass_rm = 1;
+			fcw->cab = fcw->c_neg;
+			fcw->ea = 3 * fcw->k_neg + 12;
+			fcw->eb = 3 * fcw->k_pos + 12;
+		}
+	} else { /* For CB mode */
+		fcw->k_pos = op->turbo_enc.cb_params.k;
+		fcw->ncb_pos = op->turbo_enc.cb_params.ncb;
+
+		if (check_bit(op->turbo_enc.op_flags,
+				RTE_BBDEV_TURBO_RATE_MATCH)) {
+			fcw->bypass_rm = 0;
+			fcw->eb = op->turbo_enc.cb_params.e;
+		} else {
+			/* E is set to the encoding output size when RM is
+			 * bypassed.
+			 */
+			fcw->bypass_rm = 1;
+			fcw->eb = 3 * fcw->k_pos + 12;
+		}
+	}
+
+	fcw->bypass_rv_idx1 = check_bit(op->turbo_enc.op_flags,
+			RTE_BBDEV_TURBO_RV_INDEX_BYPASS);
+	fcw->code_block_crc = check_bit(op->turbo_enc.op_flags,
+			RTE_BBDEV_TURBO_CRC_24B_ATTACH);
+	fcw->rv_idx1 = op->turbo_enc.rv_index;
+}
+
 /* Compute value of k0.
  * Based on 3GPP 38.212 Table 5.4.2.1-2
  * Starting position of different redundancy versions, k0
@@ -813,6 +898,25 @@
 	fcw->mcb_count = num_cb;
 }
 
+/* Fill in a frame control word for turbo decoding. */
+static inline void
+acc100_fcw_td_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_td *fcw)
+{
+	/* Note: Early termination is always enabled for 4GUL */
+	fcw->fcw_ver = 1;
+	if (op->turbo_dec.code_block_mode == 0)
+		fcw->k_pos = op->turbo_dec.tb_params.k_pos;
+	else
+		fcw->k_pos = op->turbo_dec.cb_params.k;
+	fcw->turbo_crc_type = check_bit(op->turbo_dec.op_flags,
+			RTE_BBDEV_TURBO_CRC_TYPE_24B);
+	fcw->bypass_sb_deint = 0;
+	fcw->raw_decoder_input_on = 0;
+	fcw->max_iter = op->turbo_dec.iter_max;
+	fcw->half_iter_on = !check_bit(op->turbo_dec.op_flags,
+			RTE_BBDEV_TURBO_HALF_ITERATION_EVEN);
+}
+
 /* Fill in a frame control word for LDPC decoding. */
 static inline void
 acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
@@ -1042,6 +1146,87 @@
 }
 
 static inline int
+acc100_dma_desc_te_fill(struct rte_bbdev_enc_op *op,
+		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
+		struct rte_mbuf *output, uint32_t *in_offset,
+		uint32_t *out_offset, uint32_t *out_length,
+		uint32_t *mbuf_total_left, uint32_t *seg_total_left, uint8_t r)
+{
+	int next_triplet = 1; /* FCW already done */
+	uint32_t e, ea, eb, length;
+	uint16_t k, k_neg, k_pos;
+	uint8_t cab, c_neg;
+
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /**< Timestamp could be disabled */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+
+	if (op->turbo_enc.code_block_mode == 0) {
+		ea = op->turbo_enc.tb_params.ea;
+		eb = op->turbo_enc.tb_params.eb;
+		cab = op->turbo_enc.tb_params.cab;
+		k_neg = op->turbo_enc.tb_params.k_neg;
+		k_pos = op->turbo_enc.tb_params.k_pos;
+		c_neg = op->turbo_enc.tb_params.c_neg;
+		e = (r < cab) ? ea : eb;
+		k = (r < c_neg) ? k_neg : k_pos;
+	} else {
+		e = op->turbo_enc.cb_params.e;
+		k = op->turbo_enc.cb_params.k;
+	}
+
+	if (check_bit(op->turbo_enc.op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH))
+		length = (k - 24) >> 3;
+	else
+		length = k >> 3;
+
+	if (unlikely((*mbuf_total_left == 0) || (*mbuf_total_left < length))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, length);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
+			length, seg_total_left, next_triplet);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= length;
+
+	/* Set output length */
+	if (check_bit(op->turbo_enc.op_flags, RTE_BBDEV_TURBO_RATE_MATCH))
+		/* Integer round up division by 8 */
+		*out_length = (e + 7) >> 3;
+	else
+		*out_length = (k >> 3) * 3 + 2;
+
+	next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
+			*out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	op->turbo_enc.output.length += *out_length;
+	*out_offset += *out_length;
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
+
+static inline int
 acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
 		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
 		struct rte_mbuf *output, uint32_t *in_offset,
@@ -1110,6 +1295,117 @@
 }
 
 static inline int
+acc100_dma_desc_td_fill(struct rte_bbdev_dec_op *op,
+		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
+		struct rte_mbuf *h_output, struct rte_mbuf *s_output,
+		uint32_t *in_offset, uint32_t *h_out_offset,
+		uint32_t *s_out_offset, uint32_t *h_out_length,
+		uint32_t *s_out_length, uint32_t *mbuf_total_left,
+		uint32_t *seg_total_left, uint8_t r)
+{
+	int next_triplet = 1; /* FCW already done */
+	uint16_t k;
+	uint16_t crc24_overlap = 0;
+	uint32_t e, kw;
+
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /**< Timestamp could be disabled */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+
+	if (op->turbo_dec.code_block_mode == 0) {
+		k = (r < op->turbo_dec.tb_params.c_neg)
+			? op->turbo_dec.tb_params.k_neg
+			: op->turbo_dec.tb_params.k_pos;
+		e = (r < op->turbo_dec.tb_params.cab)
+			? op->turbo_dec.tb_params.ea
+			: op->turbo_dec.tb_params.eb;
+	} else {
+		k = op->turbo_dec.cb_params.k;
+		e = op->turbo_dec.cb_params.e;
+	}
+
+	if ((op->turbo_dec.code_block_mode == 0)
+		&& !check_bit(op->turbo_dec.op_flags,
+		RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP))
+		crc24_overlap = 24;
+
+	/* Calculates circular buffer size.
+	 * According to 3gpp 36.212 section 5.1.4.2
+	 *   Kw = 3 * Kpi,
+	 * where:
+	 *   Kpi = nCol * nRow
+	 * where nCol is 32 and nRow can be calculated from:
+	 *   D =< nCol * nRow
+	 * where D is the size of each output from turbo encoder block (k + 4).
+	 */
+	kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
+
+	if (unlikely((*mbuf_total_left == 0) || (*mbuf_total_left < kw))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, kw);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset, kw,
+			seg_total_left, next_triplet);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= kw;
+
+	next_triplet = acc100_dma_fill_blk_type_out(
+			desc, h_output, *h_out_offset,
+			k >> 3, next_triplet, ACC100_DMA_BLKID_OUT_HARD);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+
+	*h_out_length = ((k - crc24_overlap) >> 3);
+	op->turbo_dec.hard_output.length += *h_out_length;
+	*h_out_offset += *h_out_length;
+
+	/* Soft output */
+	if (check_bit(op->turbo_dec.op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT)) {
+		if (check_bit(op->turbo_dec.op_flags,
+				RTE_BBDEV_TURBO_EQUALIZER))
+			*s_out_length = e;
+		else
+			*s_out_length = (k * 3) + 12;
+
+		next_triplet = acc100_dma_fill_blk_type_out(desc, s_output,
+				*s_out_offset, *s_out_length, next_triplet,
+				ACC100_DMA_BLKID_OUT_SOFT);
+		if (unlikely(next_triplet < 0)) {
+			rte_bbdev_log(ERR,
+					"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+					op);
+			return -1;
+		}
+
+		op->turbo_dec.soft_output.length += *s_out_length;
+		*s_out_offset += *s_out_length;
+	}
+
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
+
+static inline int
 acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
 		struct acc100_dma_req_desc *desc,
 		struct rte_mbuf **input, struct rte_mbuf *h_output,
@@ -1374,6 +1670,57 @@
 
 /* Enqueue one encode operations for ACC100 device in CB mode */
 static inline int
+enqueue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+		uint16_t total_enqueued_cbs)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+		seg_total_left;
+	struct rte_mbuf *input, *output_head, *output;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_te_fill(op, &desc->req.fcw_te);
+
+	input = op->turbo_enc.input.data;
+	output_head = output = op->turbo_enc.output.data;
+	in_offset = op->turbo_enc.input.offset;
+	out_offset = op->turbo_enc.output.offset;
+	out_length = 0;
+	mbuf_total_left = op->turbo_enc.input.length;
+	seg_total_left = rte_pktmbuf_data_len(op->turbo_enc.input.data)
+			- in_offset;
+
+	ret = acc100_dma_desc_te_fill(op, &desc->req, &input, output,
+			&in_offset, &out_offset, &out_length, &mbuf_total_left,
+			&seg_total_left, 0);
+
+	if (unlikely(ret < 0))
+		return ret;
+
+	mbuf_append(output_head, output, out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_te,
+			sizeof(desc->req.fcw_te) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+
+	/* Check if any data left after processing one CB */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left after processing one CB: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+/* Enqueue one encode operations for ACC100 device in CB mode */
+static inline int
 enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
 		uint16_t total_enqueued_cbs, int16_t num)
 {
@@ -1481,78 +1828,235 @@
 	return 1;
 }
 
-static inline int
-harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
-		uint16_t total_enqueued_cbs) {
-	struct acc100_fcw_ld *fcw;
-	union acc100_dma_desc *desc;
-	int next_triplet = 1;
-	struct rte_mbuf *hq_output_head, *hq_output;
-	uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
-	if (harq_in_length == 0) {
-		rte_bbdev_log(ERR, "Loopback of invalid null size\n");
-		return -EINVAL;
-	}
 
-	int h_comp = check_bit(op->ldpc_dec.op_flags,
-			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
-			) ? 1 : 0;
-	if (h_comp == 1)
-		harq_in_length = harq_in_length * 8 / 6;
-	harq_in_length = RTE_ALIGN(harq_in_length, 64);
-	uint16_t harq_dma_length_in = (h_comp == 0) ?
-			harq_in_length :
-			harq_in_length * 6 / 8;
-	uint16_t harq_dma_length_out = harq_dma_length_in;
-	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
-			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
-	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
-	uint16_t harq_index = (ddr_mem_in ?
-			op->ldpc_dec.harq_combined_input.offset :
-			op->ldpc_dec.harq_combined_output.offset)
-			/ ACC100_HARQ_OFFSET;
+/* Enqueue one encode operations for ACC100 device in TB mode. */
+static inline int
+enqueue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint8_t r, c;
+	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+		seg_total_left;
+	struct rte_mbuf *input, *output_head, *output;
+	uint16_t current_enqueued_cbs = 0;
 
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
-	fcw = &desc->req.fcw_ld;
-	/* Set the FCW from loopback into DDR */
-	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
-	fcw->FCWversion = ACC100_FCW_VER;
-	fcw->qm = 2;
-	fcw->Zc = 384;
-	if (harq_in_length < 16 * N_ZC_1)
-		fcw->Zc = 16;
-	fcw->ncb = fcw->Zc * N_ZC_1;
-	fcw->rm_e = 2;
-	fcw->hcin_en = 1;
-	fcw->hcout_en = 1;
+	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+	acc100_fcw_te_fill(op, &desc->req.fcw_te);
 
-	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d\n",
-			ddr_mem_in, harq_index,
-			harq_layout[harq_index].offset, harq_in_length,
-			harq_dma_length_in);
+	input = op->turbo_enc.input.data;
+	output_head = output = op->turbo_enc.output.data;
+	in_offset = op->turbo_enc.input.offset;
+	out_offset = op->turbo_enc.output.offset;
+	out_length = 0;
+	mbuf_total_left = op->turbo_enc.input.length;
 
-	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
-		fcw->hcin_size0 = harq_layout[harq_index].size0;
-		fcw->hcin_offset = harq_layout[harq_index].offset;
-		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
-		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
-		if (h_comp == 1)
-			harq_dma_length_in = harq_dma_length_in * 6 / 8;
-	} else {
-		fcw->hcin_size0 = harq_in_length;
-	}
-	harq_layout[harq_index].val = 0;
-	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
-			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
-	fcw->hcout_size0 = harq_in_length;
-	fcw->hcin_decomp_mode = h_comp;
-	fcw->hcout_comp_mode = h_comp;
-	fcw->gain_i = 1;
-	fcw->gain_h = 1;
+	c = op->turbo_enc.tb_params.c;
+	r = op->turbo_enc.tb_params.r;
 
-	/* Set the prefix of descriptor. This could be done at polling */
+	while (mbuf_total_left > 0 && r < c) {
+		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+		/* Set up DMA descriptor */
+		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+				& q->sw_ring_wrap_mask);
+		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+		desc->req.data_ptrs[0].blen = ACC100_FCW_TE_BLEN;
+
+		ret = acc100_dma_desc_te_fill(op, &desc->req, &input, output,
+				&in_offset, &out_offset, &out_length,
+				&mbuf_total_left, &seg_total_left, r);
+		if (unlikely(ret < 0))
+			return ret;
+		mbuf_append(output_head, output, out_length);
+
+		/* Set total number of CBs in TB */
+		desc->req.cbs_in_tb = cbs_in_tb;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "FCW", &desc->req.fcw_te,
+				sizeof(desc->req.fcw_te) - 8);
+		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+		if (seg_total_left == 0) {
+			/* Go to the next mbuf */
+			input = input->next;
+			in_offset = 0;
+			output = output->next;
+			out_offset = 0;
+		}
+
+		total_enqueued_cbs++;
+		current_enqueued_cbs++;
+		r++;
+	}
+
+	if (unlikely(desc == NULL))
+		return current_enqueued_cbs;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Check if any CBs left for processing */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left for processing: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+
+	/* Set SDone on last CB descriptor for TB mode. */
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	return current_enqueued_cbs;
+}
+
+/** Enqueue one decode operations for ACC100 device in CB mode */
+static inline int
+enqueue_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint32_t in_offset, h_out_offset, s_out_offset, s_out_length,
+		h_out_length, mbuf_total_left, seg_total_left;
+	struct rte_mbuf *input, *h_output_head, *h_output,
+		*s_output_head, *s_output;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_td_fill(op, &desc->req.fcw_td);
+
+	input = op->turbo_dec.input.data;
+	h_output_head = h_output = op->turbo_dec.hard_output.data;
+	s_output_head = s_output = op->turbo_dec.soft_output.data;
+	in_offset = op->turbo_dec.input.offset;
+	h_out_offset = op->turbo_dec.hard_output.offset;
+	s_out_offset = op->turbo_dec.soft_output.offset;
+	h_out_length = s_out_length = 0;
+	mbuf_total_left = op->turbo_dec.input.length;
+	seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(input == NULL)) {
+		rte_bbdev_log(ERR, "Invalid mbuf pointer");
+		return -EFAULT;
+	}
+#endif
+
+	/* Set up DMA descriptor */
+	desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+
+	ret = acc100_dma_desc_td_fill(op, &desc->req, &input, h_output,
+			s_output, &in_offset, &h_out_offset, &s_out_offset,
+			&h_out_length, &s_out_length, &mbuf_total_left,
+			&seg_total_left, 0);
+
+	if (unlikely(ret < 0))
+		return ret;
+
+	/* Hard output */
+	mbuf_append(h_output_head, h_output, h_out_length);
+
+	/* Soft output */
+	if (check_bit(op->turbo_dec.op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT))
+		mbuf_append(s_output_head, s_output, s_out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_td,
+			sizeof(desc->req.fcw_td) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+
+	/* Check if any CBs left for processing */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left after processing one CB: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+static inline int
+harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs) {
+	struct acc100_fcw_ld *fcw;
+	union acc100_dma_desc *desc;
+	int next_triplet = 1;
+	struct rte_mbuf *hq_output_head, *hq_output;
+	uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
+	if (harq_in_length == 0) {
+		rte_bbdev_log(ERR, "Loopback of invalid null size\n");
+		return -EINVAL;
+	}
+
+	int h_comp = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
+			) ? 1 : 0;
+	if (h_comp == 1)
+		harq_in_length = harq_in_length * 8 / 6;
+	harq_in_length = RTE_ALIGN(harq_in_length, 64);
+	uint16_t harq_dma_length_in = (h_comp == 0) ?
+			harq_in_length :
+			harq_in_length * 6 / 8;
+	uint16_t harq_dma_length_out = harq_dma_length_in;
+	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+	uint16_t harq_index = (ddr_mem_in ?
+			op->ldpc_dec.harq_combined_input.offset :
+			op->ldpc_dec.harq_combined_output.offset)
+			/ ACC100_HARQ_OFFSET;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	fcw = &desc->req.fcw_ld;
+	/* Set the FCW from loopback into DDR */
+	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
+	fcw->FCWversion = ACC100_FCW_VER;
+	fcw->qm = 2;
+	fcw->Zc = 384;
+	if (harq_in_length < 16 * N_ZC_1)
+		fcw->Zc = 16;
+	fcw->ncb = fcw->Zc * N_ZC_1;
+	fcw->rm_e = 2;
+	fcw->hcin_en = 1;
+	fcw->hcout_en = 1;
+
+	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d\n",
+			ddr_mem_in, harq_index,
+			harq_layout[harq_index].offset, harq_in_length,
+			harq_dma_length_in);
+
+	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
+		fcw->hcin_size0 = harq_layout[harq_index].size0;
+		fcw->hcin_offset = harq_layout[harq_index].offset;
+		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
+		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
+		if (h_comp == 1)
+			harq_dma_length_in = harq_dma_length_in * 6 / 8;
+	} else {
+		fcw->hcin_size0 = harq_in_length;
+	}
+	harq_layout[harq_index].val = 0;
+	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
+			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
+	fcw->hcout_size0 = harq_in_length;
+	fcw->hcin_decomp_mode = h_comp;
+	fcw->hcout_comp_mode = h_comp;
+	fcw->gain_i = 1;
+	fcw->gain_h = 1;
+
+	/* Set the prefix of descriptor. This could be done at polling */
 	desc->req.word0 = ACC100_DMA_DESC_TYPE;
 	desc->req.word1 = 0; /**< Timestamp could be disabled */
 	desc->req.word2 = 0;
@@ -1816,6 +2320,107 @@
 	return current_enqueued_cbs;
 }
 
+/* Enqueue one decode operations for ACC100 device in TB mode */
+static inline int
+enqueue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint8_t r, c;
+	uint32_t in_offset, h_out_offset, s_out_offset, s_out_length,
+		h_out_length, mbuf_total_left, seg_total_left;
+	struct rte_mbuf *input, *h_output_head, *h_output,
+		*s_output_head, *s_output;
+	uint16_t current_enqueued_cbs = 0;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+	acc100_fcw_td_fill(op, &desc->req.fcw_td);
+
+	input = op->turbo_dec.input.data;
+	h_output_head = h_output = op->turbo_dec.hard_output.data;
+	s_output_head = s_output = op->turbo_dec.soft_output.data;
+	in_offset = op->turbo_dec.input.offset;
+	h_out_offset = op->turbo_dec.hard_output.offset;
+	s_out_offset = op->turbo_dec.soft_output.offset;
+	h_out_length = s_out_length = 0;
+	mbuf_total_left = op->turbo_dec.input.length;
+	c = op->turbo_dec.tb_params.c;
+	r = op->turbo_dec.tb_params.r;
+
+	while (mbuf_total_left > 0 && r < c) {
+
+		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+
+		/* Set up DMA descriptor */
+		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+				& q->sw_ring_wrap_mask);
+		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+		desc->req.data_ptrs[0].blen = ACC100_FCW_TD_BLEN;
+		ret = acc100_dma_desc_td_fill(op, &desc->req, &input,
+				h_output, s_output, &in_offset, &h_out_offset,
+				&s_out_offset, &h_out_length, &s_out_length,
+				&mbuf_total_left, &seg_total_left, r);
+
+		if (unlikely(ret < 0))
+			return ret;
+
+		/* Hard output */
+		mbuf_append(h_output_head, h_output, h_out_length);
+
+		/* Soft output */
+		if (check_bit(op->turbo_dec.op_flags,
+				RTE_BBDEV_TURBO_SOFT_OUTPUT))
+			mbuf_append(s_output_head, s_output, s_out_length);
+
+		/* Set total number of CBs in TB */
+		desc->req.cbs_in_tb = cbs_in_tb;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "FCW", &desc->req.fcw_td,
+				sizeof(desc->req.fcw_td) - 8);
+		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+		if (seg_total_left == 0) {
+			/* Go to the next mbuf */
+			input = input->next;
+			in_offset = 0;
+			h_output = h_output->next;
+			h_out_offset = 0;
+
+			if (check_bit(op->turbo_dec.op_flags,
+					RTE_BBDEV_TURBO_SOFT_OUTPUT)) {
+				s_output = s_output->next;
+				s_out_offset = 0;
+			}
+		}
+
+		total_enqueued_cbs++;
+		current_enqueued_cbs++;
+		r++;
+	}
+
+	if (unlikely(desc == NULL))
+		return current_enqueued_cbs;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Check if any CBs left for processing */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left for processing: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+	/* Set SDone on last CB descriptor for TB mode */
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	return current_enqueued_cbs;
+}
 
 /* Calculates number of CBs in processed encoder TB based on 'r' and input
  * length.
@@ -1893,6 +2498,45 @@
 	return cbs_in_tb;
 }
 
+/* Enqueue encode operations for ACC100 device in CB mode. */
+static uint16_t
+acc100_enqueue_enc_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i;
+	union acc100_dma_desc *desc;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - 1 < 0))
+			break;
+		avail -= 1;
+
+		ret = enqueue_enc_one_op_cb(q, ops[i], i);
+		if (ret < 0)
+			break;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
+			& q->sw_ring_wrap_mask);
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, i, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+	return i;
+}
+
 /* Check we can mux encode operations with common FCW */
 static inline bool
 check_mux(struct rte_bbdev_enc_op **ops, uint16_t num) {
@@ -1960,6 +2604,52 @@
 	return i;
 }
 
+/* Enqueue encode operations for ACC100 device in TB mode. */
+static uint16_t
+acc100_enqueue_enc_tb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i, enqueued_cbs = 0;
+	uint8_t cbs_in_tb;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		cbs_in_tb = get_num_cbs_in_tb_enc(&ops[i]->turbo_enc);
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - cbs_in_tb < 0))
+			break;
+		avail -= cbs_in_tb;
+
+		ret = enqueue_enc_one_op_tb(q, ops[i], enqueued_cbs, cbs_in_tb);
+		if (ret < 0)
+			break;
+		enqueued_cbs += ret;
+	}
+
+	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
+}
+
+/* Enqueue encode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	if (unlikely(num == 0))
+		return 0;
+	if (ops[0]->turbo_enc.code_block_mode == 0)
+		return acc100_enqueue_enc_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_enc_cb(q_data, ops, num);
+}
+
 /* Enqueue LDPC encode operations for ACC100 device. */
 static uint16_t
 acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
@@ -1967,7 +2657,51 @@
 {
 	if (unlikely(num == 0))
 		return 0;
-	return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
+	if (ops[0]->ldpc_enc.code_block_mode == 0)
+		return acc100_enqueue_enc_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
+}
+
+
+/* Enqueue decode operations for ACC100 device in CB mode */
+static uint16_t
+acc100_enqueue_dec_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i;
+	union acc100_dma_desc *desc;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - 1 < 0))
+			break;
+		avail -= 1;
+
+		ret = enqueue_dec_one_op_cb(q, ops[i], i);
+		if (ret < 0)
+			break;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
+			& q->sw_ring_wrap_mask);
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, i, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
 }
 
 /* Check we can mux encode operations with common FCW */
@@ -2065,6 +2799,53 @@
 	return i;
 }
 
+
+/* Enqueue decode operations for ACC100 device in TB mode */
+static uint16_t
+acc100_enqueue_dec_tb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i, enqueued_cbs = 0;
+	uint8_t cbs_in_tb;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		cbs_in_tb = get_num_cbs_in_tb_dec(&ops[i]->turbo_dec);
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - cbs_in_tb < 0))
+			break;
+		avail -= cbs_in_tb;
+
+		ret = enqueue_dec_one_op_tb(q, ops[i], enqueued_cbs, cbs_in_tb);
+		if (ret < 0)
+			break;
+		enqueued_cbs += ret;
+	}
+
+	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
+}
+
+/* Enqueue decode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	if (unlikely(num == 0))
+		return 0;
+	if (ops[0]->turbo_dec.code_block_mode == 0)
+		return acc100_enqueue_dec_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_dec_cb(q_data, ops, num);
+}
+
 /* Enqueue LDPC decode operations for ACC100 device. */
 static uint16_t
 acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
@@ -2388,6 +3169,51 @@
 	return cb_idx;
 }
 
+/* Dequeue encode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint16_t dequeue_num;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t i;
+	uint16_t dequeued_cbs = 0;
+	struct rte_bbdev_enc_op *op;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = (avail < num) ? avail : num;
+
+	for (i = 0; i < dequeue_num; ++i) {
+		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask))->req.op_addr;
+		if (op->turbo_enc.code_block_mode == 0)
+			ret = dequeue_enc_one_op_tb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+		else
+			ret = dequeue_enc_one_op_cb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_cbs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += i;
+
+	return i;
+}
+
 /* Dequeue LDPC encode operations from ACC100 device. */
 static uint16_t
 acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
@@ -2426,6 +3252,52 @@
 	return dequeued_cbs;
 }
 
+
+/* Dequeue decode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint16_t dequeue_num;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t i;
+	uint16_t dequeued_cbs = 0;
+	struct rte_bbdev_dec_op *op;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = (avail < num) ? avail : num;
+
+	for (i = 0; i < dequeue_num; ++i) {
+		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask))->req.op_addr;
+		if (op->turbo_dec.code_block_mode == 0)
+			ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+		else
+			ret = dequeue_dec_one_op_cb(q_data, q, &ops[i],
+					dequeued_cbs, &aq_dequeued);
+
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_cbs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += i;
+
+	return i;
+}
+
 /* Dequeue LDPC decode operations from ACC100 device. */
 static uint16_t
 acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
@@ -2479,6 +3351,10 @@
 	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
 
 	dev->dev_ops = &acc100_bbdev_ops;
+	dev->enqueue_enc_ops = acc100_enqueue_enc;
+	dev->enqueue_dec_ops = acc100_enqueue_dec;
+	dev->dequeue_enc_ops = acc100_dequeue_enc;
+	dev->dequeue_dec_ops = acc100_dequeue_dec;
 	dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
 	dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
 	dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v6 10/11] baseband/acc100: add interrupt support to PMD
  2020-09-23  2:19   ` [dpdk-dev] [PATCH v6 00/11] bbdev PMD ACC100 Nicolas Chautru
                       ` (8 preceding siblings ...)
  2020-09-23  2:19     ` [dpdk-dev] [PATCH v6 09/11] baseband/acc100: add support for 4G processing Nicolas Chautru
@ 2020-09-23  2:19     ` Nicolas Chautru
  2020-09-23  2:19     ` [dpdk-dev] [PATCH v6 11/11] baseband/acc100: add debug function to validate input Nicolas Chautru
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-23  2:19 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add capability and functions to support MSI
interrupts, callbacks and the info ring.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 288 ++++++++++++++++++++++++++++++-
 drivers/baseband/acc100/rte_acc100_pmd.h |  15 ++
 2 files changed, 300 insertions(+), 3 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 7d4c3df..b6d9e7c 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -339,6 +339,213 @@
 	free_base_addresses(base_addrs, i);
 }
 
+/*
+ * Find queue_id of a device queue based on details from the Info Ring.
+ * If a queue isn't found UINT16_MAX is returned.
+ */
+static inline uint16_t
+get_queue_id_from_ring_info(struct rte_bbdev_data *data,
+		const union acc100_info_ring_data ring_data)
+{
+	uint16_t queue_id;
+
+	for (queue_id = 0; queue_id < data->num_queues; ++queue_id) {
+		struct acc100_queue *acc100_q =
+				data->queues[queue_id].queue_private;
+		if (acc100_q != NULL && acc100_q->aq_id == ring_data.aq_id &&
+				acc100_q->qgrp_id == ring_data.qg_id &&
+				acc100_q->vf_id == ring_data.vf_id)
+			return queue_id;
+	}
+
+	return UINT16_MAX;
+}
+
+/* Checks Info Ring to find the interrupt cause and handles it accordingly */
+static inline void
+acc100_check_ir(struct acc100_device *acc100_dev)
+{
+	volatile union acc100_info_ring_data *ring_data;
+	uint16_t info_ring_head = acc100_dev->info_ring_head;
+	if (acc100_dev->info_ring == NULL)
+		return;
+
+	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
+			ACC100_INFO_RING_MASK);
+
+	while (ring_data->valid) {
+		if ((ring_data->int_nb < ACC100_PF_INT_DMA_DL_DESC_IRQ) || (
+				ring_data->int_nb >
+				ACC100_PF_INT_DMA_DL5G_DESC_IRQ))
+			rte_bbdev_log(WARNING, "InfoRing: ITR:%d Info:0x%x",
+				ring_data->int_nb, ring_data->detailed_info);
+		/* Initialize Info Ring entry and move forward */
+		ring_data->val = 0;
+		info_ring_head++;
+		ring_data = acc100_dev->info_ring +
+				(info_ring_head & ACC100_INFO_RING_MASK);
+	}
+}
+
+/* Checks PF Info Ring to find the interrupt cause and handles it accordingly */
+static inline void
+acc100_pf_interrupt_handler(struct rte_bbdev *dev)
+{
+	struct acc100_device *acc100_dev = dev->data->dev_private;
+	volatile union acc100_info_ring_data *ring_data;
+	struct acc100_deq_intr_details deq_intr_det;
+
+	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
+			ACC100_INFO_RING_MASK);
+
+	while (ring_data->valid) {
+
+		rte_bbdev_log_debug(
+				"ACC100 PF Interrupt received, Info Ring data: 0x%x",
+				ring_data->val);
+
+		switch (ring_data->int_nb) {
+		case ACC100_PF_INT_DMA_DL_DESC_IRQ:
+		case ACC100_PF_INT_DMA_UL_DESC_IRQ:
+		case ACC100_PF_INT_DMA_UL5G_DESC_IRQ:
+		case ACC100_PF_INT_DMA_DL5G_DESC_IRQ:
+			deq_intr_det.queue_id = get_queue_id_from_ring_info(
+					dev->data, *ring_data);
+			if (deq_intr_det.queue_id == UINT16_MAX) {
+				rte_bbdev_log(ERR,
+						"Couldn't find queue: aq_id: %u, qg_id: %u, vf_id: %u",
+						ring_data->aq_id,
+						ring_data->qg_id,
+						ring_data->vf_id);
+				return;
+			}
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_DEQUEUE, &deq_intr_det);
+			break;
+		default:
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_ERROR, NULL);
+			break;
+		}
+
+		/* Initialize Info Ring entry and move forward */
+		ring_data->val = 0;
+		++acc100_dev->info_ring_head;
+		ring_data = acc100_dev->info_ring +
+				(acc100_dev->info_ring_head &
+				ACC100_INFO_RING_MASK);
+	}
+}
+
+/* Checks VF Info Ring to find the interrupt cause and handles it accordingly */
+static inline void
+acc100_vf_interrupt_handler(struct rte_bbdev *dev)
+{
+	struct acc100_device *acc100_dev = dev->data->dev_private;
+	volatile union acc100_info_ring_data *ring_data;
+	struct acc100_deq_intr_details deq_intr_det;
+
+	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
+			ACC100_INFO_RING_MASK);
+
+	while (ring_data->valid) {
+
+		rte_bbdev_log_debug(
+				"ACC100 VF Interrupt received, Info Ring data: 0x%x",
+				ring_data->val);
+
+		switch (ring_data->int_nb) {
+		case ACC100_VF_INT_DMA_DL_DESC_IRQ:
+		case ACC100_VF_INT_DMA_UL_DESC_IRQ:
+		case ACC100_VF_INT_DMA_UL5G_DESC_IRQ:
+		case ACC100_VF_INT_DMA_DL5G_DESC_IRQ:
+			/* VFs are not aware of their vf_id - it's set to 0 in
+			 * queue structures.
+			 */
+			ring_data->vf_id = 0;
+			deq_intr_det.queue_id = get_queue_id_from_ring_info(
+					dev->data, *ring_data);
+			if (deq_intr_det.queue_id == UINT16_MAX) {
+				rte_bbdev_log(ERR,
+						"Couldn't find queue: aq_id: %u, qg_id: %u",
+						ring_data->aq_id,
+						ring_data->qg_id);
+				return;
+			}
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_DEQUEUE, &deq_intr_det);
+			break;
+		default:
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_ERROR, NULL);
+			break;
+		}
+
+		/* Initialize Info Ring entry and move forward */
+		ring_data->valid = 0;
+		++acc100_dev->info_ring_head;
+		ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head
+				& ACC100_INFO_RING_MASK);
+	}
+}
+
+/* Interrupt handler triggered by ACC100 dev for handling specific interrupt */
+static void
+acc100_dev_interrupt_handler(void *cb_arg)
+{
+	struct rte_bbdev *dev = cb_arg;
+	struct acc100_device *acc100_dev = dev->data->dev_private;
+
+	/* Read info ring */
+	if (acc100_dev->pf_device)
+		acc100_pf_interrupt_handler(dev);
+	else
+		acc100_vf_interrupt_handler(dev);
+}
+
+/* Allocate and set up the info ring */
+static int
+allocate_inforing(struct rte_bbdev *dev)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	const struct acc100_registry_addr *reg_addr;
+	rte_iova_t info_ring_phys;
+	uint32_t phys_low, phys_high;
+
+	if (d->info_ring != NULL)
+		return 0; /* Already configured */
+
+	/* Choose correct registry addresses for the device type */
+	if (d->pf_device)
+		reg_addr = &pf_reg_addr;
+	else
+		reg_addr = &vf_reg_addr;
+	/* Allocate InfoRing */
+	d->info_ring = rte_zmalloc_socket("Info Ring",
+			ACC100_INFO_RING_NUM_ENTRIES *
+			sizeof(*d->info_ring), RTE_CACHE_LINE_SIZE,
+			dev->data->socket_id);
+	if (d->info_ring == NULL) {
+		rte_bbdev_log(ERR,
+				"Failed to allocate Info Ring for %s:%u",
+				dev->device->driver->name,
+				dev->data->dev_id);
+		return -ENOMEM;
+	}
+	info_ring_phys = rte_malloc_virt2iova(d->info_ring);
+
+	/* Setup Info Ring */
+	phys_high = (uint32_t)(info_ring_phys >> 32);
+	phys_low  = (uint32_t)(info_ring_phys);
+	acc100_reg_write(d, reg_addr->info_ring_hi, phys_high);
+	acc100_reg_write(d, reg_addr->info_ring_lo, phys_low);
+	acc100_reg_write(d, reg_addr->info_ring_en, ACC100_REG_IRQ_EN_ALL);
+	d->info_ring_head = (acc100_reg_read(d, reg_addr->info_ring_ptr) &
+			0xFFF) / sizeof(union acc100_info_ring_data);
+	return 0;
+}
+
+
 /* Allocate 64MB memory used for all software rings */
 static int
 acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
@@ -426,6 +633,7 @@
 	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
 	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
 
+	allocate_inforing(dev);
 	d->harq_layout = rte_zmalloc_socket("HARQ Layout",
 			ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
 			RTE_CACHE_LINE_SIZE, dev->data->socket_id);
@@ -437,13 +645,53 @@
 	return 0;
 }
 
+static int
+acc100_intr_enable(struct rte_bbdev *dev)
+{
+	int ret;
+	struct acc100_device *d = dev->data->dev_private;
+
+	/* Only MSI interrupts are currently supported */
+	if (dev->intr_handle->type == RTE_INTR_HANDLE_VFIO_MSI ||
+			dev->intr_handle->type == RTE_INTR_HANDLE_UIO) {
+
+		allocate_inforing(dev);
+
+		ret = rte_intr_enable(dev->intr_handle);
+		if (ret < 0) {
+			rte_bbdev_log(ERR,
+					"Couldn't enable interrupts for device: %s",
+					dev->data->name);
+			rte_free(d->info_ring);
+			return ret;
+		}
+		ret = rte_intr_callback_register(dev->intr_handle,
+				acc100_dev_interrupt_handler, dev);
+		if (ret < 0) {
+			rte_bbdev_log(ERR,
+					"Couldn't register interrupt callback for device: %s",
+					dev->data->name);
+			rte_free(d->info_ring);
+			return ret;
+		}
+
+		return 0;
+	}
+
+	rte_bbdev_log(ERR, "ACC100 (%s) supports only VFIO MSI interrupts",
+			dev->data->name);
+	return -ENOTSUP;
+}
+
 /* Free 64MB memory used for software rings */
 static int
 acc100_dev_close(struct rte_bbdev *dev)
 {
 	struct acc100_device *d = dev->data->dev_private;
+	acc100_check_ir(d);
 	if (d->sw_rings_base != NULL) {
 		rte_free(d->tail_ptrs);
+		rte_free(d->info_ring);
 		rte_free(d->sw_rings_base);
 		d->sw_rings_base = NULL;
 	}
@@ -643,6 +891,7 @@
 					RTE_BBDEV_TURBO_CRC_TYPE_24B |
 					RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
 					RTE_BBDEV_TURBO_EARLY_TERMINATION |
+					RTE_BBDEV_TURBO_DEC_INTERRUPTS |
 					RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
 					RTE_BBDEV_TURBO_MAP_DEC |
 					RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP |
@@ -663,6 +912,7 @@
 					RTE_BBDEV_TURBO_CRC_24B_ATTACH |
 					RTE_BBDEV_TURBO_RV_INDEX_BYPASS |
 					RTE_BBDEV_TURBO_RATE_MATCH |
+					RTE_BBDEV_TURBO_ENC_INTERRUPTS |
 					RTE_BBDEV_TURBO_ENC_SCATTER_GATHER,
 				.num_buffers_src =
 						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
@@ -676,7 +926,8 @@
 				.capability_flags =
 					RTE_BBDEV_LDPC_RATE_MATCH |
 					RTE_BBDEV_LDPC_CRC_24B_ATTACH |
-					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
+					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS |
+					RTE_BBDEV_LDPC_ENC_INTERRUPTS,
 				.num_buffers_src =
 						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
 				.num_buffers_dst =
@@ -701,7 +952,8 @@
 				RTE_BBDEV_LDPC_DECODE_BYPASS |
 				RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
 				RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
-				RTE_BBDEV_LDPC_LLR_COMPRESSION,
+				RTE_BBDEV_LDPC_LLR_COMPRESSION |
+				RTE_BBDEV_LDPC_DEC_INTERRUPTS,
 			.llr_size = 8,
 			.llr_decimals = 1,
 			.num_buffers_src =
@@ -751,14 +1003,39 @@
 #else
 	dev_info->harq_buffer_size = 0;
 #endif
+	acc100_check_ir(d);
+}
+
+static int
+acc100_queue_intr_enable(struct rte_bbdev *dev, uint16_t queue_id)
+{
+	struct acc100_queue *q = dev->data->queues[queue_id].queue_private;
+
+	if (dev->intr_handle->type != RTE_INTR_HANDLE_VFIO_MSI &&
+			dev->intr_handle->type != RTE_INTR_HANDLE_UIO)
+		return -ENOTSUP;
+
+	q->irq_enable = 1;
+	return 0;
+}
+
+static int
+acc100_queue_intr_disable(struct rte_bbdev *dev, uint16_t queue_id)
+{
+	struct acc100_queue *q = dev->data->queues[queue_id].queue_private;
+	q->irq_enable = 0;
+	return 0;
 }
 
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.setup_queues = acc100_setup_queues,
+	.intr_enable = acc100_intr_enable,
 	.close = acc100_dev_close,
 	.info_get = acc100_dev_info_get,
 	.queue_setup = acc100_queue_setup,
 	.queue_release = acc100_queue_release,
+	.queue_intr_enable = acc100_queue_intr_enable,
+	.queue_intr_disable = acc100_queue_intr_disable
 };
 
 /* ACC100 PCI PF address map */
@@ -3018,8 +3295,10 @@
 			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
 	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
 	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
-	if (op->status != 0)
+	if (op->status != 0) {
 		q_data->queue_stats.dequeue_err_count++;
+		acc100_check_ir(q->d);
+	}
 
 	/* CRC invalid if error exists */
 	if (!op->status)
@@ -3076,6 +3355,9 @@
 		op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
 	op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
 
+	if (op->status & (1 << RTE_BBDEV_DRV_ERROR))
+		acc100_check_ir(q->d);
+
 	/* Check if this is the last desc in batch (Atomic Queue) */
 	if (desc->req.last_desc_in_batch) {
 		(*aq_dequeued)++;
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 78686c1..8980fa5 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -559,7 +559,14 @@ struct acc100_device {
 	/* Virtual address of the info memory routed to this function under
 	 * operation, whether it is PF or VF.
 	 */
+	union acc100_info_ring_data *info_ring;
+
 	union acc100_harq_layout_data *harq_layout;
+	/* Virtual Info Ring head */
+	uint16_t info_ring_head;
+	/* Number of bytes available for each queue in device, depending on
+	 * how many queues are enabled with configure()
+	 */
 	uint32_t sw_ring_size;
 	uint32_t ddr_size; /* Size in kB */
 	uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */
@@ -575,4 +582,12 @@ struct acc100_device {
 	bool configured; /**< True if this ACC100 device is configured */
 };
 
+/**
+ * Structure with details about RTE_BBDEV_EVENT_DEQUEUE event. It's passed to
+ * the callback function.
+ */
+struct acc100_deq_intr_details {
+	uint16_t queue_id;
+};
+
 #endif /* _RTE_ACC100_PMD_H_ */
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v6 11/11] baseband/acc100: add debug function to validate input
  2020-09-23  2:19   ` [dpdk-dev] [PATCH v6 00/11] bbdev PMD ACC100 Nicolas Chautru
                       ` (9 preceding siblings ...)
  2020-09-23  2:19     ` [dpdk-dev] [PATCH v6 10/11] baseband/acc100: add interrupt support to PMD Nicolas Chautru
@ 2020-09-23  2:19     ` Nicolas Chautru
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-23  2:19 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add debug functions to validate the input provided by the user
through the API. They are only enabled in DEBUG mode at build time.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 424 +++++++++++++++++++++++++++++++
 1 file changed, 424 insertions(+)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index b6d9e7c..3589814 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -1945,6 +1945,231 @@
 
 }
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+/* Validates turbo encoder parameters */
+static inline int
+validate_enc_op(struct rte_bbdev_enc_op *op)
+{
+	struct rte_bbdev_op_turbo_enc *turbo_enc = &op->turbo_enc;
+	struct rte_bbdev_op_enc_turbo_cb_params *cb = NULL;
+	struct rte_bbdev_op_enc_turbo_tb_params *tb = NULL;
+	uint16_t kw, kw_neg, kw_pos;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if (turbo_enc->input.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid input pointer");
+		return -1;
+	}
+	if (turbo_enc->output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid output pointer");
+		return -1;
+	}
+	if (turbo_enc->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				turbo_enc->rv_index);
+		return -1;
+	}
+	if (turbo_enc->code_block_mode != 0 &&
+			turbo_enc->code_block_mode != 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				turbo_enc->code_block_mode);
+		return -1;
+	}
+
+	if (turbo_enc->code_block_mode == 0) {
+		tb = &turbo_enc->tb_params;
+		if ((tb->k_neg < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_neg > RTE_BBDEV_TURBO_MAX_CB_SIZE)
+				&& tb->c_neg > 0) {
+			rte_bbdev_log(ERR,
+					"k_neg (%u) is out of range %u <= value <= %u",
+					tb->k_neg, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (tb->k_pos < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_pos > RTE_BBDEV_TURBO_MAX_CB_SIZE) {
+			rte_bbdev_log(ERR,
+					"k_pos (%u) is out of range %u <= value <= %u",
+					tb->k_pos, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (tb->c_neg > (RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1))
+			rte_bbdev_log(ERR,
+					"c_neg (%u) is out of range 0 <= value <= %u",
+					tb->c_neg,
+					RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1);
+		if (tb->c < 1 || tb->c > RTE_BBDEV_TURBO_MAX_CODE_BLOCKS) {
+			rte_bbdev_log(ERR,
+					"c (%u) is out of range 1 <= value <= %u",
+					tb->c, RTE_BBDEV_TURBO_MAX_CODE_BLOCKS);
+			return -1;
+		}
+		if (tb->cab > tb->c) {
+			rte_bbdev_log(ERR,
+					"cab (%u) is greater than c (%u)",
+					tb->cab, tb->c);
+			return -1;
+		}
+		if ((tb->ea < RTE_BBDEV_TURBO_MIN_CB_SIZE || (tb->ea % 2))
+				&& tb->r < tb->cab) {
+			rte_bbdev_log(ERR,
+					"ea (%u) is less than %u or it is not even",
+					tb->ea, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+		if ((tb->eb < RTE_BBDEV_TURBO_MIN_CB_SIZE || (tb->eb % 2))
+				&& tb->c > tb->cab) {
+			rte_bbdev_log(ERR,
+					"eb (%u) is less than %u or it is not even",
+					tb->eb, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+
+		kw_neg = 3 * RTE_ALIGN_CEIL(tb->k_neg + 4,
+					RTE_BBDEV_TURBO_C_SUBBLOCK);
+		if (tb->ncb_neg < tb->k_neg || tb->ncb_neg > kw_neg) {
+			rte_bbdev_log(ERR,
+					"ncb_neg (%u) is out of range (%u) k_neg <= value <= (%u) kw_neg",
+					tb->ncb_neg, tb->k_neg, kw_neg);
+			return -1;
+		}
+
+		kw_pos = 3 * RTE_ALIGN_CEIL(tb->k_pos + 4,
+					RTE_BBDEV_TURBO_C_SUBBLOCK);
+		if (tb->ncb_pos < tb->k_pos || tb->ncb_pos > kw_pos) {
+			rte_bbdev_log(ERR,
+					"ncb_pos (%u) is out of range (%u) k_pos <= value <= (%u) kw_pos",
+					tb->ncb_pos, tb->k_pos, kw_pos);
+			return -1;
+		}
+		if (tb->r > (tb->c - 1)) {
+			rte_bbdev_log(ERR,
+					"r (%u) is greater than c - 1 (%u)",
+					tb->r, tb->c - 1);
+			return -1;
+		}
+	} else {
+		cb = &turbo_enc->cb_params;
+		if (cb->k < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| cb->k > RTE_BBDEV_TURBO_MAX_CB_SIZE) {
+			rte_bbdev_log(ERR,
+					"k (%u) is out of range %u <= value <= %u",
+					cb->k, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+
+		if (cb->e < RTE_BBDEV_TURBO_MIN_CB_SIZE || (cb->e % 2)) {
+			rte_bbdev_log(ERR,
+					"e (%u) is less than %u or it is not even",
+					cb->e, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+
+		kw = RTE_ALIGN_CEIL(cb->k + 4, RTE_BBDEV_TURBO_C_SUBBLOCK) * 3;
+		if (cb->ncb < cb->k || cb->ncb > kw) {
+			rte_bbdev_log(ERR,
+					"ncb (%u) is out of range (%u) k <= value <= (%u) kw",
+					cb->ncb, cb->k, kw);
+			return -1;
+		}
+	}
+
+	return 0;
+}
+/* Validates LDPC encoder parameters */
+static inline int
+validate_ldpc_enc_op(struct rte_bbdev_enc_op *op)
+{
+	struct rte_bbdev_op_ldpc_enc *ldpc_enc = &op->ldpc_enc;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if (ldpc_enc->input.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid input pointer");
+		return -1;
+	}
+	if (ldpc_enc->output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid output pointer");
+		return -1;
+	}
+	if (ldpc_enc->input.length >
+			RTE_BBDEV_LDPC_MAX_CB_SIZE >> 3) {
+		rte_bbdev_log(ERR, "CB size (%u) is too big, max: %d",
+				ldpc_enc->input.length,
+				RTE_BBDEV_LDPC_MAX_CB_SIZE);
+		return -1;
+	}
+	if ((ldpc_enc->basegraph > 2) || (ldpc_enc->basegraph == 0)) {
+		rte_bbdev_log(ERR,
+				"BG (%u) is out of range 1 <= value <= 2",
+				ldpc_enc->basegraph);
+		return -1;
+	}
+	if (ldpc_enc->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				ldpc_enc->rv_index);
+		return -1;
+	}
+	if (ldpc_enc->code_block_mode > 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				ldpc_enc->code_block_mode);
+		return -1;
+	}
+
+	return 0;
+}
+
+/* Validates LDPC decoder parameters */
+static inline int
+validate_ldpc_dec_op(struct rte_bbdev_dec_op *op)
+{
+	struct rte_bbdev_op_ldpc_dec *ldpc_dec = &op->ldpc_dec;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if ((ldpc_dec->basegraph > 2) || (ldpc_dec->basegraph == 0)) {
+		rte_bbdev_log(ERR,
+				"BG (%u) is out of range 1 <= value <= 2",
+				ldpc_dec->basegraph);
+		return -1;
+	}
+	if (ldpc_dec->iter_max == 0) {
+		rte_bbdev_log(ERR,
+				"iter_max (%u) is equal to 0",
+				ldpc_dec->iter_max);
+		return -1;
+	}
+	if (ldpc_dec->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				ldpc_dec->rv_index);
+		return -1;
+	}
+	if (ldpc_dec->code_block_mode > 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				ldpc_dec->code_block_mode);
+		return -1;
+	}
+
+	return 0;
+}
+#endif
+
 /* Enqueue one encode operations for ACC100 device in CB mode */
 static inline int
 enqueue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
@@ -1956,6 +2181,14 @@
 		seg_total_left;
 	struct rte_mbuf *input, *output_head, *output;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_enc_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2008,6 +2241,14 @@
 	uint16_t  in_length_in_bytes;
 	struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_enc_op(ops[0]) == -1) {
+		rte_bbdev_log(ERR, "LDPC encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2065,6 +2306,14 @@
 		seg_total_left;
 	struct rte_mbuf *input, *output_head, *output;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_enc_op(op) == -1) {
+		rte_bbdev_log(ERR, "LDPC encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2119,6 +2368,14 @@
 	struct rte_mbuf *input, *output_head, *output;
 	uint16_t current_enqueued_cbs = 0;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_enc_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2191,6 +2448,142 @@
 	return current_enqueued_cbs;
 }
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+/* Validates turbo decoder parameters */
+static inline int
+validate_dec_op(struct rte_bbdev_dec_op *op)
+{
+	struct rte_bbdev_op_turbo_dec *turbo_dec = &op->turbo_dec;
+	struct rte_bbdev_op_dec_turbo_cb_params *cb = NULL;
+	struct rte_bbdev_op_dec_turbo_tb_params *tb = NULL;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if (turbo_dec->input.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid input pointer");
+		return -1;
+	}
+	if (turbo_dec->hard_output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid hard_output pointer");
+		return -1;
+	}
+	if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT) &&
+			turbo_dec->soft_output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid soft_output pointer");
+		return -1;
+	}
+	if (turbo_dec->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				turbo_dec->rv_index);
+		return -1;
+	}
+	if (turbo_dec->iter_min < 1) {
+		rte_bbdev_log(ERR,
+				"iter_min (%u) is less than 1",
+				turbo_dec->iter_min);
+		return -1;
+	}
+	if (turbo_dec->iter_max <= 2) {
+		rte_bbdev_log(ERR,
+				"iter_max (%u) is less than or equal to 2",
+				turbo_dec->iter_max);
+		return -1;
+	}
+	if (turbo_dec->iter_min > turbo_dec->iter_max) {
+		rte_bbdev_log(ERR,
+				"iter_min (%u) is greater than iter_max (%u)",
+				turbo_dec->iter_min, turbo_dec->iter_max);
+		return -1;
+	}
+	if (turbo_dec->code_block_mode != 0 &&
+			turbo_dec->code_block_mode != 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				turbo_dec->code_block_mode);
+		return -1;
+	}
+
+	if (turbo_dec->code_block_mode == 0) {
+		tb = &turbo_dec->tb_params;
+		if ((tb->k_neg < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_neg > RTE_BBDEV_TURBO_MAX_CB_SIZE)
+				&& tb->c_neg > 0) {
+			rte_bbdev_log(ERR,
+					"k_neg (%u) is out of range %u <= value <= %u",
+					tb->k_neg, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if ((tb->k_pos < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_pos > RTE_BBDEV_TURBO_MAX_CB_SIZE)
+				&& tb->c > tb->c_neg) {
+			rte_bbdev_log(ERR,
+					"k_pos (%u) is out of range %u <= value <= %u",
+					tb->k_pos, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (tb->c_neg > (RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1)) {
+			rte_bbdev_log(ERR,
+					"c_neg (%u) is out of range 0 <= value <= %u",
+					tb->c_neg,
+					RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1);
+			return -1;
+		}
+		if (tb->c < 1 || tb->c > RTE_BBDEV_TURBO_MAX_CODE_BLOCKS) {
+			rte_bbdev_log(ERR,
+					"c (%u) is out of range 1 <= value <= %u",
+					tb->c, RTE_BBDEV_TURBO_MAX_CODE_BLOCKS);
+			return -1;
+		}
+		if (tb->cab > tb->c) {
+			rte_bbdev_log(ERR,
+					"cab (%u) is greater than c (%u)",
+					tb->cab, tb->c);
+			return -1;
+		}
+		if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) &&
+				(tb->ea < RTE_BBDEV_TURBO_MIN_CB_SIZE
+						|| (tb->ea % 2))
+				&& tb->cab > 0) {
+			rte_bbdev_log(ERR,
+					"ea (%u) is less than %u or it is not even",
+					tb->ea, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+		if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) &&
+				(tb->eb < RTE_BBDEV_TURBO_MIN_CB_SIZE
+						|| (tb->eb % 2))
+				&& tb->c > tb->cab) {
+			rte_bbdev_log(ERR,
+					"eb (%u) is less than %u or it is not even",
+					tb->eb, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+	} else {
+		cb = &turbo_dec->cb_params;
+		if (cb->k < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| cb->k > RTE_BBDEV_TURBO_MAX_CB_SIZE) {
+			rte_bbdev_log(ERR,
+					"k (%u) is out of range %u <= value <= %u",
+					cb->k, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) &&
+				(cb->e < RTE_BBDEV_TURBO_MIN_CB_SIZE ||
+				(cb->e % 2))) {
+			rte_bbdev_log(ERR,
+					"e (%u) is less than %u or it is not even",
+					cb->e, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+	}
+
+	return 0;
+}
+#endif
+
 /* Enqueue one decode operation for ACC100 device in CB mode */
 static inline int
 enqueue_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
@@ -2203,6 +2596,14 @@
 	struct rte_mbuf *input, *h_output_head, *h_output,
 		*s_output_head, *s_output;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo decoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2426,6 +2827,13 @@
 		return ret;
 	}
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "LDPC decoder validation failed");
+		return -EINVAL;
+	}
+#endif
 	union acc100_dma_desc *desc;
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
@@ -2521,6 +2929,14 @@
 	struct rte_mbuf *input, *h_output_head, *h_output;
 	uint16_t current_enqueued_cbs = 0;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "LDPC decoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2611,6 +3027,14 @@
 		*s_output_head, *s_output;
 	uint16_t current_enqueued_cbs = 0;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo decoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v4 00/11] bbdev PMD ACC100
  2020-09-22 19:32       ` Akhil Goyal
@ 2020-09-23  2:21         ` Chautru, Nicolas
  0 siblings, 0 replies; 213+ messages in thread
From: Chautru, Nicolas @ 2020-09-23  2:21 UTC (permalink / raw)
  To: Akhil Goyal, dev, Thomas Monjalon
  Cc: Richardson, Bruce, Xu, Rosen, dave.burley, aidan.goddard, Yigit,
	Ferruh, Liu, Tianjiao, Ananyev, Konstantin

Hi Akhil,

> 
> Hi Nicolas,
> 
> >
> > Hi Akhil,
> > Just a heads up on this bbdev PMD which is ready and was reviewed for
> > some time by the community.
> > There is one warning on patchwork but it can be ignored (one ack email
> > sent with bad formatting).
> > Thanks and best regards,
> > Nic
> There are changes in Makefiles, which are not required as all makefiles are
> removed As we have moved to meson build.
> 
> Could you please update the series.
> 
> Thanks,
> Akhil

No problem Akhil. I just rebased on the latest main and sent it as a v6 (ignore the v5, I had left one old Makefile in after the rebase).

Thanks,
Nic



* [dpdk-dev] [PATCH v7 00/11] bbdev PMD ACC100
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 11/11] doc: update bbdev feature table Nicolas Chautru
                     ` (2 preceding siblings ...)
  2020-09-23  2:19   ` [dpdk-dev] [PATCH v6 00/11] bbdev PMD ACC100 Nicolas Chautru
@ 2020-09-23  2:24   ` Nicolas Chautru
  2020-09-23  2:24     ` [dpdk-dev] [PATCH v7 01/11] drivers/baseband: add PMD for ACC100 Nicolas Chautru
                       ` (10 more replies)
  2020-09-28 23:52   ` [dpdk-dev] [PATCH v8 00/10] bbdev PMD ACC100 Nicolas Chautru
                     ` (4 subsequent siblings)
  8 siblings, 11 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-23  2:24 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

v7: Fingers trouble. Previous one sent mid-rebase. My bad. 
v6: removed a legacy makefile no longer required
v5: rebased on the latest main. The legacy makefiles are removed.
v4: an odd compilation error is reported for one CI variant using "gcc latest", which looks to me like a false positive of maybe-undeclared.
http://mails.dpdk.org/archives/test-report/2020-August/148936.html
Still forcing a dummy declaration to remove this CI warning; I will check with ci@dpdk.org in parallel.
v3: missed a change during rebase
v2: includes clean up from latest CI checks.


Nicolas Chautru (11):
  drivers/baseband: add PMD for ACC100
  baseband/acc100: add register definition file
  baseband/acc100: add info get function
  baseband/acc100: add queue configuration
  baseband/acc100: add LDPC processing functions
  baseband/acc100: add HARQ loopback support
  baseband/acc100: add support for 4G processing
  baseband/acc100: add interrupt support to PMD
  baseband/acc100: add debug function to validate input
  baseband/acc100: add configure function
  doc: update bbdev feature table

 app/test-bbdev/meson.build                         |    3 +
 app/test-bbdev/test_bbdev_perf.c                   |   72 +
 doc/guides/bbdevs/acc100.rst                       |  233 +
 doc/guides/bbdevs/features/acc100.ini              |   14 +
 doc/guides/bbdevs/features/mbc.ini                 |   14 -
 doc/guides/bbdevs/index.rst                        |    1 +
 doc/guides/rel_notes/release_20_11.rst             |    6 +
 drivers/baseband/acc100/acc100_pf_enum.h           | 1068 +++++
 drivers/baseband/acc100/acc100_vf_enum.h           |   73 +
 drivers/baseband/acc100/meson.build                |    8 +
 drivers/baseband/acc100/rte_acc100_cfg.h           |  113 +
 drivers/baseband/acc100/rte_acc100_pmd.c           | 4684 ++++++++++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.h           |  593 +++
 .../acc100/rte_pmd_bbdev_acc100_version.map        |   10 +
 drivers/baseband/meson.build                       |    2 +-
 15 files changed, 6879 insertions(+), 15 deletions(-)
 create mode 100644 doc/guides/bbdevs/acc100.rst
 create mode 100644 doc/guides/bbdevs/features/acc100.ini
 delete mode 100644 doc/guides/bbdevs/features/mbc.ini
 create mode 100644 drivers/baseband/acc100/acc100_pf_enum.h
 create mode 100644 drivers/baseband/acc100/acc100_vf_enum.h
 create mode 100644 drivers/baseband/acc100/meson.build
 create mode 100644 drivers/baseband/acc100/rte_acc100_cfg.h
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.c
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.h
 create mode 100644 drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map

-- 
1.8.3.1



* [dpdk-dev] [PATCH v7 01/11] drivers/baseband: add PMD for ACC100
  2020-09-23  2:24   ` [dpdk-dev] [PATCH v7 00/11] bbdev PMD ACC100 Nicolas Chautru
@ 2020-09-23  2:24     ` Nicolas Chautru
  2020-09-23  2:24     ` [dpdk-dev] [PATCH v7 02/11] baseband/acc100: add register definition file Nicolas Chautru
                       ` (9 subsequent siblings)
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-23  2:24 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add stubs for the ACC100 PMD

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 doc/guides/bbdevs/acc100.rst                       | 233 +++++++++++++++++++++
 doc/guides/bbdevs/index.rst                        |   1 +
 doc/guides/rel_notes/release_20_11.rst             |   6 +
 drivers/baseband/acc100/meson.build                |   6 +
 drivers/baseband/acc100/rte_acc100_pmd.c           | 175 ++++++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.h           |  37 ++++
 .../acc100/rte_pmd_bbdev_acc100_version.map        |   3 +
 drivers/baseband/meson.build                       |   2 +-
 8 files changed, 462 insertions(+), 1 deletion(-)
 create mode 100644 doc/guides/bbdevs/acc100.rst
 create mode 100644 drivers/baseband/acc100/meson.build
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.c
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.h
 create mode 100644 drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map

diff --git a/doc/guides/bbdevs/acc100.rst b/doc/guides/bbdevs/acc100.rst
new file mode 100644
index 0000000..f87ee09
--- /dev/null
+++ b/doc/guides/bbdevs/acc100.rst
@@ -0,0 +1,233 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2020 Intel Corporation
+
+Intel(R) ACC100 5G/4G FEC Poll Mode Driver
+==========================================
+
+The BBDEV ACC100 5G/4G FEC poll mode driver (PMD) supports an
+implementation of a VRAN FEC wireless acceleration function.
+This device is also known as Mount Bryce.
+
+Features
+--------
+
+ACC100 5G/4G FEC PMD supports the following features:
+
+- LDPC Encode in the DL (5GNR)
+- LDPC Decode in the UL (5GNR)
+- Turbo Encode in the DL (4G)
+- Turbo Decode in the UL (4G)
+- 16 VFs per PF (physical device)
+- Maximum of 128 queues per VF
+- PCIe Gen-3 x16 Interface
+- MSI
+- SR-IOV
+
+ACC100 5G/4G FEC PMD supports the following BBDEV capabilities:
+
+* For the LDPC encode operation:
+   - ``RTE_BBDEV_LDPC_CRC_24B_ATTACH`` :  set to attach CRC24B to CB(s)
+   - ``RTE_BBDEV_LDPC_RATE_MATCH`` :  if set then do not do Rate Match bypass
+   - ``RTE_BBDEV_LDPC_INTERLEAVER_BYPASS`` : if set then bypass interleaver
+
+* For the LDPC decode operation:
+   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK`` :  check CRC24B from CB(s)
+   - ``RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE`` :  disable early termination
+   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP`` :  drops CRC24B bits appended while decoding
+   - ``RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE`` :  provides an input for HARQ combining
+   - ``RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE`` :  provides an output for HARQ combining
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE`` :  HARQ memory input is internal
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE`` :  HARQ memory output is internal
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK`` :  loopback data to/from HARQ memory
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_FILLERS`` :  HARQ memory includes the fillers bits
+   - ``RTE_BBDEV_LDPC_DEC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
+   - ``RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION`` :  supports compression of the HARQ input/output
+   - ``RTE_BBDEV_LDPC_LLR_COMPRESSION`` :  supports LLR input compression
+
+* For the turbo encode operation:
+   - ``RTE_BBDEV_TURBO_CRC_24B_ATTACH`` :  set to attach CRC24B to CB(s)
+   - ``RTE_BBDEV_TURBO_RATE_MATCH`` :  if set then do not do Rate Match bypass
+   - ``RTE_BBDEV_TURBO_ENC_INTERRUPTS`` :  set for encoder dequeue interrupts
+   - ``RTE_BBDEV_TURBO_RV_INDEX_BYPASS`` :  set to bypass RV index
+   - ``RTE_BBDEV_TURBO_ENC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
+
+* For the turbo decode operation:
+   - ``RTE_BBDEV_TURBO_CRC_TYPE_24B`` :  check CRC24B from CB(s)
+   - ``RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE`` :  perform subblock de-interleave
+   - ``RTE_BBDEV_TURBO_DEC_INTERRUPTS`` :  set for decoder dequeue interrupts
+   - ``RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN`` :  set if negative LLR encoder i/p is supported
+   - ``RTE_BBDEV_TURBO_POS_LLR_1_BIT_IN`` :  set if positive LLR encoder i/p is supported
+   - ``RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP`` :  keep CRC24B bits appended while decoding
+   - ``RTE_BBDEV_TURBO_EARLY_TERMINATION`` :  set the early termination feature
+   - ``RTE_BBDEV_TURBO_DEC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
+   - ``RTE_BBDEV_TURBO_HALF_ITERATION_EVEN`` :  set half iteration granularity
+
+Installation
+------------
+
+Section 3 of the DPDK manual provides instructions on installing and compiling DPDK. The
+default set of bbdev compile flags may be found in config/common_base, where for example
+the flag to build the ACC100 5G/4G FEC device, ``CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100``,
+is already set.
+
+DPDK requires hugepages to be configured as detailed in section 2 of the DPDK manual.
+The bbdev test application has been tested with a configuration of 40 x 1GB hugepages. The
+hugepage configuration of a server may be examined using:
+
+.. code-block:: console
+
+   grep Huge* /proc/meminfo
+
+
+Initialization
+--------------
+
+When the device first powers up, its PCI Physical Functions (PF) can be listed through this command:
+
+.. code-block:: console
+
+  sudo lspci -vd8086:0d5c
+
+The physical and virtual functions are compatible with Linux UIO drivers:
+``vfio`` and ``igb_uio``. However, in order to work, the ACC100 5G/4G
+FEC device first needs to be bound to one of these Linux drivers through DPDK.
+
+
+Bind PF UIO driver(s)
+~~~~~~~~~~~~~~~~~~~~~
+
+Install the DPDK igb_uio driver, bind it with the PF PCI device ID and use
+``lspci`` to confirm the PF device is under use by ``igb_uio`` DPDK UIO driver.
+
+The igb_uio driver may be bound to the PF PCI device using one of three methods:
+
+
+1. PCI functions (physical or virtual, depending on the use case) can be bound to
+the UIO driver by repeating this command for every function.
+
+.. code-block:: console
+
+  cd <dpdk-top-level-directory>
+  insmod ./build/kmod/igb_uio.ko
+  echo "8086 0d5c" > /sys/bus/pci/drivers/igb_uio/new_id
+  lspci -vd8086:0d5c
+
+
+2. Another way to bind PF with DPDK UIO driver is by using the ``dpdk-devbind.py`` tool
+
+.. code-block:: console
+
+  cd <dpdk-top-level-directory>
+  ./usertools/dpdk-devbind.py -b igb_uio 0000:06:00.0
+
+where the PCI device ID (example: 0000:06:00.0) is obtained using ``lspci -vd8086:0d5c``
+
+
+3. A third way to bind is to use ``dpdk-setup.sh`` tool
+
+.. code-block:: console
+
+  cd <dpdk-top-level-directory>
+  ./usertools/dpdk-setup.sh
+
+  select 'Bind Ethernet/Crypto/Baseband device to IGB UIO module'
+  or
+  select 'Bind Ethernet/Crypto/Baseband device to VFIO module' depending on driver required
+  enter PCI device ID
+  select 'Display current Ethernet/Crypto/Baseband device settings' to confirm binding
+
+
+In the same way, the ACC100 5G/4G FEC PF can be bound with vfio, but the vfio driver does
+not support SR-IOV configuration out of the box, so it needs to be patched.
+
+
+Enable Virtual Functions
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Now, it should be visible in the printouts that PCI PF is under igb_uio control
+"``Kernel driver in use: igb_uio``"
+
+To show the number of available VFs on the device, read the ``sriov_totalvfs`` file:
+
+.. code-block:: console
+
+  cat /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_totalvfs
+
+  where 0000\:<b>\:<d>.<f> is the PCI device ID
+
+
+To enable VFs via igb_uio, echo the number of virtual functions to enable
+to the ``max_vfs`` file:
+
+.. code-block:: console
+
+  echo <num-of-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/max_vfs
+
+
+Afterwards, all VFs must be bound to the appropriate UIO drivers as required, in the
+same way as was done with the physical function previously.
+
+Enabling SR-IOV via the vfio driver is much the same, except that the file
+name is different:
+
+.. code-block:: console
+
+  echo <num-of-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_numvfs
+
+
+Configure the VFs through PF
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The PCI virtual functions must be configured before working or getting assigned
+to VMs/Containers. The configuration involves allocating the number of hardware
+queues, priorities, load balance, bandwidth and other settings necessary for the
+device to perform FEC functions.
+
+This configuration needs to be executed at least once after reboot or PCI FLR and can
+be achieved by using the function ``acc100_configure()``, which sets up the
+parameters defined in ``acc100_conf`` structure.
+
+Test Application
+----------------
+
+BBDEV provides a test application, ``test-bbdev.py``, and a range of test data for testing
+the functionality of ACC100 5G/4G FEC encode and decode, depending on the device's
+capabilities. The test application is located under the app/test-bbdev folder and has the
+following options:
+
+.. code-block:: console
+
+  "-p", "--testapp-path": specifies path to the bbdev test app.
+  "-e", "--eal-params"	: EAL arguments which are passed to the test app.
+  "-t", "--timeout"	: Timeout in seconds (default=300).
+  "-c", "--test-cases"	: Defines test cases to run. Run all if not specified.
+  "-v", "--test-vector"	: Test vector path (default=dpdk_path+/app/test-bbdev/test_vectors/bbdev_null.data).
+  "-n", "--num-ops"	: Number of operations to process on device (default=32).
+  "-b", "--burst-size"	: Operations enqueue/dequeue burst size (default=32).
+  "-s", "--snr"		: SNR in dB used when generating LLRs for bler tests.
+  "-s", "--iter_max"	: Number of iterations for LDPC decoder.
+  "-l", "--num-lcores"	: Number of lcores to run (default=16).
+  "-i", "--init-device" : Initialise PF device with default values.
+
+
+To execute the test application tool using simple decode or encode data,
+type one of the following:
+
+.. code-block:: console
+
+  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_dec_default.data
+  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_enc_default.data
+
+
+The test application ``test-bbdev.py`` supports configuring the PF device with
+a default set of values if the ``-i`` or ``--init-device`` option is included. The default
+values are defined in ``test_bbdev_perf.c``.
+
+
+Test Vectors
+~~~~~~~~~~~~
+
+In addition to the simple LDPC decoder and LDPC encoder tests, bbdev also provides
+a range of additional tests under the test_vectors folder, which may be useful. The results
+of these tests will depend on the ACC100 5G/4G FEC capabilities, which may cause some
+test cases to be skipped, but no failure should be reported.
diff --git a/doc/guides/bbdevs/index.rst b/doc/guides/bbdevs/index.rst
index a8092dd..4445cbd 100644
--- a/doc/guides/bbdevs/index.rst
+++ b/doc/guides/bbdevs/index.rst
@@ -13,3 +13,4 @@ Baseband Device Drivers
     turbo_sw
     fpga_lte_fec
     fpga_5gnr_fec
+    acc100
diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
index 73ac08f..20639ea 100644
--- a/doc/guides/rel_notes/release_20_11.rst
+++ b/doc/guides/rel_notes/release_20_11.rst
@@ -55,6 +55,12 @@ New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+* **Added Intel ACC100 bbdev PMD.**
+
+  Added a new ``acc100`` bbdev driver for the Intel\ |reg| ACC100 accelerator
+  also known as Mount Bryce.  See the
+  :doc:`../bbdevs/acc100` BBDEV guide for more details on this new driver.
+
 
 Removed Items
 -------------
diff --git a/drivers/baseband/acc100/meson.build b/drivers/baseband/acc100/meson.build
new file mode 100644
index 0000000..8afafc2
--- /dev/null
+++ b/drivers/baseband/acc100/meson.build
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2020 Intel Corporation
+
+deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci']
+
+sources = files('rte_acc100_pmd.c')
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
new file mode 100644
index 0000000..1b4cd13
--- /dev/null
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -0,0 +1,175 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include <unistd.h>
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_dev.h>
+#include <rte_malloc.h>
+#include <rte_mempool.h>
+#include <rte_byteorder.h>
+#include <rte_errno.h>
+#include <rte_branch_prediction.h>
+#include <rte_hexdump.h>
+#include <rte_pci.h>
+#include <rte_bus_pci.h>
+
+#include <rte_bbdev.h>
+#include <rte_bbdev_pmd.h>
+#include "rte_acc100_pmd.h"
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, DEBUG);
+#else
+RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
+#endif
+
+/* Free 64MB memory used for software rings */
+static int
+acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
+{
+	return 0;
+}
+
+static const struct rte_bbdev_ops acc100_bbdev_ops = {
+	.close = acc100_dev_close,
+};
+
+/* ACC100 PCI PF address map */
+static struct rte_pci_id pci_id_acc100_pf_map[] = {
+	{
+		RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID, RTE_ACC100_PF_DEVICE_ID)
+	},
+	{.device_id = 0},
+};
+
+/* ACC100 PCI VF address map */
+static struct rte_pci_id pci_id_acc100_vf_map[] = {
+	{
+		RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID, RTE_ACC100_VF_DEVICE_ID)
+	},
+	{.device_id = 0},
+};
+
+/* Initialization Function */
+static void
+acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
+{
+	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
+
+	dev->dev_ops = &acc100_bbdev_ops;
+
+	((struct acc100_device *) dev->data->dev_private)->pf_device =
+			!strcmp(drv->driver.name,
+					RTE_STR(ACC100PF_DRIVER_NAME));
+	((struct acc100_device *) dev->data->dev_private)->mmio_base =
+			pci_dev->mem_resource[0].addr;
+
+	rte_bbdev_log_debug("Init device %s [%s] @ vaddr %p paddr %#"PRIx64"",
+			drv->driver.name, dev->data->name,
+			(void *)pci_dev->mem_resource[0].addr,
+			pci_dev->mem_resource[0].phys_addr);
+}
+
+static int acc100_pci_probe(struct rte_pci_driver *pci_drv,
+	struct rte_pci_device *pci_dev)
+{
+	struct rte_bbdev *bbdev = NULL;
+	char dev_name[RTE_BBDEV_NAME_MAX_LEN];
+
+	if (pci_dev == NULL) {
+		rte_bbdev_log(ERR, "NULL PCI device");
+		return -EINVAL;
+	}
+
+	rte_pci_device_name(&pci_dev->addr, dev_name, sizeof(dev_name));
+
+	/* Allocate memory to be used privately by drivers */
+	bbdev = rte_bbdev_allocate(pci_dev->device.name);
+	if (bbdev == NULL)
+		return -ENODEV;
+
+	/* allocate device private memory */
+	bbdev->data->dev_private = rte_zmalloc_socket(dev_name,
+			sizeof(struct acc100_device), RTE_CACHE_LINE_SIZE,
+			pci_dev->device.numa_node);
+
+	if (bbdev->data->dev_private == NULL) {
+		rte_bbdev_log(CRIT,
+				"Allocate of %zu bytes for device \"%s\" failed",
+				sizeof(struct acc100_device), dev_name);
+		rte_bbdev_release(bbdev);
+		return -ENOMEM;
+	}
+
+	/* Fill HW specific part of device structure */
+	bbdev->device = &pci_dev->device;
+	bbdev->intr_handle = &pci_dev->intr_handle;
+	bbdev->data->socket_id = pci_dev->device.numa_node;
+
+	/* Invoke ACC100 device initialization function */
+	acc100_bbdev_init(bbdev, pci_drv);
+
+	rte_bbdev_log_debug("Initialised bbdev %s (id = %u)",
+			dev_name, bbdev->data->dev_id);
+	return 0;
+}
+
+static int acc100_pci_remove(struct rte_pci_device *pci_dev)
+{
+	struct rte_bbdev *bbdev;
+	int ret;
+	uint8_t dev_id;
+
+	if (pci_dev == NULL)
+		return -EINVAL;
+
+	/* Find device */
+	bbdev = rte_bbdev_get_named_dev(pci_dev->device.name);
+	if (bbdev == NULL) {
+		rte_bbdev_log(CRIT,
+				"Couldn't find HW dev \"%s\" to uninitialise it",
+				pci_dev->device.name);
+		return -ENODEV;
+	}
+	dev_id = bbdev->data->dev_id;
+
+	/* free device private memory before close */
+	rte_free(bbdev->data->dev_private);
+
+	/* Close device */
+	ret = rte_bbdev_close(dev_id);
+	if (ret < 0)
+		rte_bbdev_log(ERR,
+				"Device %i failed to close during uninit: %i",
+				dev_id, ret);
+
+	/* release bbdev from library */
+	rte_bbdev_release(bbdev);
+
+	rte_bbdev_log_debug("Destroyed bbdev = %u", dev_id);
+
+	return 0;
+}
+
+static struct rte_pci_driver acc100_pci_pf_driver = {
+		.probe = acc100_pci_probe,
+		.remove = acc100_pci_remove,
+		.id_table = pci_id_acc100_pf_map,
+		.drv_flags = RTE_PCI_DRV_NEED_MAPPING
+};
+
+static struct rte_pci_driver acc100_pci_vf_driver = {
+		.probe = acc100_pci_probe,
+		.remove = acc100_pci_remove,
+		.id_table = pci_id_acc100_vf_map,
+		.drv_flags = RTE_PCI_DRV_NEED_MAPPING
+};
+
+RTE_PMD_REGISTER_PCI(ACC100PF_DRIVER_NAME, acc100_pci_pf_driver);
+RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
+RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
+RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
+
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
new file mode 100644
index 0000000..6f46df0
--- /dev/null
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -0,0 +1,37 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_ACC100_PMD_H_
+#define _RTE_ACC100_PMD_H_
+
+/* Helper macro for logging */
+#define rte_bbdev_log(level, fmt, ...) \
+	rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \
+		##__VA_ARGS__)
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+#define rte_bbdev_log_debug(fmt, ...) \
+		rte_bbdev_log(DEBUG, "acc100_pmd: " fmt, \
+		##__VA_ARGS__)
+#else
+#define rte_bbdev_log_debug(fmt, ...)
+#endif
+
+/* ACC100 PF and VF driver names */
+#define ACC100PF_DRIVER_NAME           intel_acc100_pf
+#define ACC100VF_DRIVER_NAME           intel_acc100_vf
+
+/* ACC100 PCI vendor & device IDs */
+#define RTE_ACC100_VENDOR_ID           (0x8086)
+#define RTE_ACC100_PF_DEVICE_ID        (0x0d5c)
+#define RTE_ACC100_VF_DEVICE_ID        (0x0d5d)
+
+/* Private data structure for each ACC100 device */
+struct acc100_device {
+	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
+	bool pf_device; /**< True if this is a PF ACC100 device */
+	bool configured; /**< True if this ACC100 device is configured */
+};
+
+#endif /* _RTE_ACC100_PMD_H_ */
diff --git a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
new file mode 100644
index 0000000..4a76d1d
--- /dev/null
+++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
@@ -0,0 +1,3 @@
+DPDK_21 {
+	local: *;
+};
diff --git a/drivers/baseband/meson.build b/drivers/baseband/meson.build
index 415b672..72301ce 100644
--- a/drivers/baseband/meson.build
+++ b/drivers/baseband/meson.build
@@ -5,7 +5,7 @@ if is_windows
 	subdir_done()
 endif
 
-drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec']
+drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec', 'acc100']
 
 config_flag_fmt = 'RTE_LIBRTE_PMD_BBDEV_@0@'
 driver_name_fmt = 'rte_pmd_bbdev_@0@'
-- 
1.8.3.1



* [dpdk-dev] [PATCH v7 02/11] baseband/acc100: add register definition file
  2020-09-23  2:24   ` [dpdk-dev] [PATCH v7 00/11] bbdev PMD ACC100 Nicolas Chautru
  2020-09-23  2:24     ` [dpdk-dev] [PATCH v7 01/11] drivers/baseband: add PMD for ACC100 Nicolas Chautru
@ 2020-09-23  2:24     ` Nicolas Chautru
  2020-09-23  2:24     ` [dpdk-dev] [PATCH v7 03/11] baseband/acc100: add info get function Nicolas Chautru
                       ` (8 subsequent siblings)
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-23  2:24 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add in the list of registers for the device and related
HW specs definitions.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 drivers/baseband/acc100/acc100_pf_enum.h | 1068 ++++++++++++++++++++++++++++++
 drivers/baseband/acc100/acc100_vf_enum.h |   73 ++
 drivers/baseband/acc100/rte_acc100_pmd.h |  490 ++++++++++++++
 3 files changed, 1631 insertions(+)
 create mode 100644 drivers/baseband/acc100/acc100_pf_enum.h
 create mode 100644 drivers/baseband/acc100/acc100_vf_enum.h

diff --git a/drivers/baseband/acc100/acc100_pf_enum.h b/drivers/baseband/acc100/acc100_pf_enum.h
new file mode 100644
index 0000000..a1ee416
--- /dev/null
+++ b/drivers/baseband/acc100/acc100_pf_enum.h
@@ -0,0 +1,1068 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017 Intel Corporation
+ */
+
+#ifndef ACC100_PF_ENUM_H
+#define ACC100_PF_ENUM_H
+
+/*
+ * ACC100 register mapping on PF BAR0
+ * This is automatically generated from RDL; the format may change with a new
+ * RDL release.
+ * Variable names are kept as generated.
+ */
+enum {
+	HWPfQmgrEgressQueuesTemplate          =  0x0007FE00,
+	HWPfQmgrIngressAq                     =  0x00080000,
+	HWPfQmgrArbQAvail                     =  0x00A00010,
+	HWPfQmgrArbQBlock                     =  0x00A00014,
+	HWPfQmgrAqueueDropNotifEn             =  0x00A00024,
+	HWPfQmgrAqueueDisableNotifEn          =  0x00A00028,
+	HWPfQmgrSoftReset                     =  0x00A00038,
+	HWPfQmgrInitStatus                    =  0x00A0003C,
+	HWPfQmgrAramWatchdogCount             =  0x00A00040,
+	HWPfQmgrAramWatchdogCounterEn         =  0x00A00044,
+	HWPfQmgrAxiWatchdogCount              =  0x00A00048,
+	HWPfQmgrAxiWatchdogCounterEn          =  0x00A0004C,
+	HWPfQmgrProcessWatchdogCount          =  0x00A00050,
+	HWPfQmgrProcessWatchdogCounterEn      =  0x00A00054,
+	HWPfQmgrProcessUl4GWatchdogCounter    =  0x00A00058,
+	HWPfQmgrProcessDl4GWatchdogCounter    =  0x00A0005C,
+	HWPfQmgrProcessUl5GWatchdogCounter    =  0x00A00060,
+	HWPfQmgrProcessDl5GWatchdogCounter    =  0x00A00064,
+	HWPfQmgrProcessMldWatchdogCounter     =  0x00A00068,
+	HWPfQmgrMsiOverflowUpperVf            =  0x00A00070,
+	HWPfQmgrMsiOverflowLowerVf            =  0x00A00074,
+	HWPfQmgrMsiWatchdogOverflow           =  0x00A00078,
+	HWPfQmgrMsiOverflowEnable             =  0x00A0007C,
+	HWPfQmgrDebugAqPointerMemGrp          =  0x00A00100,
+	HWPfQmgrDebugOutputArbQFifoGrp        =  0x00A00140,
+	HWPfQmgrDebugMsiFifoGrp               =  0x00A00180,
+	HWPfQmgrDebugAxiWdTimeoutMsiFifo      =  0x00A001C0,
+	HWPfQmgrDebugProcessWdTimeoutMsiFifo  =  0x00A001C4,
+	HWPfQmgrDepthLog2Grp                  =  0x00A00200,
+	HWPfQmgrTholdGrp                      =  0x00A00300,
+	HWPfQmgrGrpTmplateReg0Indx            =  0x00A00600,
+	HWPfQmgrGrpTmplateReg1Indx            =  0x00A00680,
+	HWPfQmgrGrpTmplateReg2indx            =  0x00A00700,
+	HWPfQmgrGrpTmplateReg3Indx            =  0x00A00780,
+	HWPfQmgrGrpTmplateReg4Indx            =  0x00A00800,
+	HWPfQmgrVfBaseAddr                    =  0x00A01000,
+	HWPfQmgrUl4GWeightRrVf                =  0x00A02000,
+	HWPfQmgrDl4GWeightRrVf                =  0x00A02100,
+	HWPfQmgrUl5GWeightRrVf                =  0x00A02200,
+	HWPfQmgrDl5GWeightRrVf                =  0x00A02300,
+	HWPfQmgrMldWeightRrVf                 =  0x00A02400,
+	HWPfQmgrArbQDepthGrp                  =  0x00A02F00,
+	HWPfQmgrGrpFunction0                  =  0x00A02F40,
+	HWPfQmgrGrpFunction1                  =  0x00A02F44,
+	HWPfQmgrGrpPriority                   =  0x00A02F48,
+	HWPfQmgrWeightSync                    =  0x00A03000,
+	HWPfQmgrAqEnableVf                    =  0x00A10000,
+	HWPfQmgrAqResetVf                     =  0x00A20000,
+	HWPfQmgrRingSizeVf                    =  0x00A20004,
+	HWPfQmgrGrpDepthLog20Vf               =  0x00A20008,
+	HWPfQmgrGrpDepthLog21Vf               =  0x00A2000C,
+	HWPfQmgrGrpFunction0Vf                =  0x00A20010,
+	HWPfQmgrGrpFunction1Vf                =  0x00A20014,
+	HWPfDmaConfig0Reg                     =  0x00B80000,
+	HWPfDmaConfig1Reg                     =  0x00B80004,
+	HWPfDmaQmgrAddrReg                    =  0x00B80008,
+	HWPfDmaSoftResetReg                   =  0x00B8000C,
+	HWPfDmaAxcacheReg                     =  0x00B80010,
+	HWPfDmaVersionReg                     =  0x00B80014,
+	HWPfDmaFrameThreshold                 =  0x00B80018,
+	HWPfDmaTimestampLo                    =  0x00B8001C,
+	HWPfDmaTimestampHi                    =  0x00B80020,
+	HWPfDmaAxiStatus                      =  0x00B80028,
+	HWPfDmaAxiControl                     =  0x00B8002C,
+	HWPfDmaNoQmgr                         =  0x00B80030,
+	HWPfDmaQosScale                       =  0x00B80034,
+	HWPfDmaQmanen                         =  0x00B80040,
+	HWPfDmaQmgrQosBase                    =  0x00B80060,
+	HWPfDmaFecClkGatingEnable             =  0x00B80080,
+	HWPfDmaPmEnable                       =  0x00B80084,
+	HWPfDmaQosEnable                      =  0x00B80088,
+	HWPfDmaHarqWeightedRrFrameThreshold   =  0x00B800B0,
+	HWPfDmaDataSmallWeightedRrFrameThresh  = 0x00B800B4,
+	HWPfDmaDataLargeWeightedRrFrameThresh  = 0x00B800B8,
+	HWPfDmaInboundCbMaxSize               =  0x00B800BC,
+	HWPfDmaInboundDrainDataSize           =  0x00B800C0,
+	HWPfDmaVfDdrBaseRw                    =  0x00B80400,
+	HWPfDmaCmplTmOutCnt                   =  0x00B80800,
+	HWPfDmaProcTmOutCnt                   =  0x00B80804,
+	HWPfDmaStatusRrespBresp               =  0x00B80810,
+	HWPfDmaCfgRrespBresp                  =  0x00B80814,
+	HWPfDmaStatusMemParErr                =  0x00B80818,
+	HWPfDmaCfgMemParErrEn                 =  0x00B8081C,
+	HWPfDmaStatusDmaHwErr                 =  0x00B80820,
+	HWPfDmaCfgDmaHwErrEn                  =  0x00B80824,
+	HWPfDmaStatusFecCoreErr               =  0x00B80828,
+	HWPfDmaCfgFecCoreErrEn                =  0x00B8082C,
+	HWPfDmaStatusFcwDescrErr              =  0x00B80830,
+	HWPfDmaCfgFcwDescrErrEn               =  0x00B80834,
+	HWPfDmaStatusBlockTransmit            =  0x00B80838,
+	HWPfDmaBlockOnErrEn                   =  0x00B8083C,
+	HWPfDmaStatusFlushDma                 =  0x00B80840,
+	HWPfDmaFlushDmaOnErrEn                =  0x00B80844,
+	HWPfDmaStatusSdoneFifoFull            =  0x00B80848,
+	HWPfDmaStatusDescriptorErrLoVf        =  0x00B8084C,
+	HWPfDmaStatusDescriptorErrHiVf        =  0x00B80850,
+	HWPfDmaStatusFcwErrLoVf               =  0x00B80854,
+	HWPfDmaStatusFcwErrHiVf               =  0x00B80858,
+	HWPfDmaStatusDataErrLoVf              =  0x00B8085C,
+	HWPfDmaStatusDataErrHiVf              =  0x00B80860,
+	HWPfDmaCfgMsiEnSoftwareErr            =  0x00B80864,
+	HWPfDmaDescriptorSignatuture          =  0x00B80868,
+	HWPfDmaFcwSignature                   =  0x00B8086C,
+	HWPfDmaErrorDetectionEn               =  0x00B80870,
+	HWPfDmaErrCntrlFifoDebug              =  0x00B8087C,
+	HWPfDmaStatusToutData                 =  0x00B80880,
+	HWPfDmaStatusToutDesc                 =  0x00B80884,
+	HWPfDmaStatusToutUnexpData            =  0x00B80888,
+	HWPfDmaStatusToutUnexpDesc            =  0x00B8088C,
+	HWPfDmaStatusToutProcess              =  0x00B80890,
+	HWPfDmaConfigCtoutOutDataEn           =  0x00B808A0,
+	HWPfDmaConfigCtoutOutDescrEn          =  0x00B808A4,
+	HWPfDmaConfigUnexpComplDataEn         =  0x00B808A8,
+	HWPfDmaConfigUnexpComplDescrEn        =  0x00B808AC,
+	HWPfDmaConfigPtoutOutEn               =  0x00B808B0,
+	HWPfDmaFec5GulDescBaseLoRegVf         =  0x00B88020,
+	HWPfDmaFec5GulDescBaseHiRegVf         =  0x00B88024,
+	HWPfDmaFec5GulRespPtrLoRegVf          =  0x00B88028,
+	HWPfDmaFec5GulRespPtrHiRegVf          =  0x00B8802C,
+	HWPfDmaFec5GdlDescBaseLoRegVf         =  0x00B88040,
+	HWPfDmaFec5GdlDescBaseHiRegVf         =  0x00B88044,
+	HWPfDmaFec5GdlRespPtrLoRegVf          =  0x00B88048,
+	HWPfDmaFec5GdlRespPtrHiRegVf          =  0x00B8804C,
+	HWPfDmaFec4GulDescBaseLoRegVf         =  0x00B88060,
+	HWPfDmaFec4GulDescBaseHiRegVf         =  0x00B88064,
+	HWPfDmaFec4GulRespPtrLoRegVf          =  0x00B88068,
+	HWPfDmaFec4GulRespPtrHiRegVf          =  0x00B8806C,
+	HWPfDmaFec4GdlDescBaseLoRegVf         =  0x00B88080,
+	HWPfDmaFec4GdlDescBaseHiRegVf         =  0x00B88084,
+	HWPfDmaFec4GdlRespPtrLoRegVf          =  0x00B88088,
+	HWPfDmaFec4GdlRespPtrHiRegVf          =  0x00B8808C,
+	HWPfDmaVfDdrBaseRangeRo               =  0x00B880A0,
+	HWPfQosmonACntrlReg                   =  0x00B90000,
+	HWPfQosmonAEvalOverflow0              =  0x00B90008,
+	HWPfQosmonAEvalOverflow1              =  0x00B9000C,
+	HWPfQosmonADivTerm                    =  0x00B90010,
+	HWPfQosmonATickTerm                   =  0x00B90014,
+	HWPfQosmonAEvalTerm                   =  0x00B90018,
+	HWPfQosmonAAveTerm                    =  0x00B9001C,
+	HWPfQosmonAForceEccErr                =  0x00B90020,
+	HWPfQosmonAEccErrDetect               =  0x00B90024,
+	HWPfQosmonAIterationConfig0Low        =  0x00B90060,
+	HWPfQosmonAIterationConfig0High       =  0x00B90064,
+	HWPfQosmonAIterationConfig1Low        =  0x00B90068,
+	HWPfQosmonAIterationConfig1High       =  0x00B9006C,
+	HWPfQosmonAIterationConfig2Low        =  0x00B90070,
+	HWPfQosmonAIterationConfig2High       =  0x00B90074,
+	HWPfQosmonAIterationConfig3Low        =  0x00B90078,
+	HWPfQosmonAIterationConfig3High       =  0x00B9007C,
+	HWPfQosmonAEvalMemAddr                =  0x00B90080,
+	HWPfQosmonAEvalMemData                =  0x00B90084,
+	HWPfQosmonAXaction                    =  0x00B900C0,
+	HWPfQosmonARemThres1Vf                =  0x00B90400,
+	HWPfQosmonAThres2Vf                   =  0x00B90404,
+	HWPfQosmonAWeiFracVf                  =  0x00B90408,
+	HWPfQosmonARrWeiVf                    =  0x00B9040C,
+	HWPfPermonACntrlRegVf                 =  0x00B98000,
+	HWPfPermonACountVf                    =  0x00B98008,
+	HWPfPermonAKCntLoVf                   =  0x00B98010,
+	HWPfPermonAKCntHiVf                   =  0x00B98014,
+	HWPfPermonADeltaCntLoVf               =  0x00B98020,
+	HWPfPermonADeltaCntHiVf               =  0x00B98024,
+	HWPfPermonAVersionReg                 =  0x00B9C000,
+	HWPfPermonACbControlFec               =  0x00B9C0F0,
+	HWPfPermonADltTimerLoFec              =  0x00B9C0F4,
+	HWPfPermonADltTimerHiFec              =  0x00B9C0F8,
+	HWPfPermonACbCountFec                 =  0x00B9C100,
+	HWPfPermonAAccExecTimerLoFec          =  0x00B9C104,
+	HWPfPermonAAccExecTimerHiFec          =  0x00B9C108,
+	HWPfPermonAExecTimerMinFec            =  0x00B9C200,
+	HWPfPermonAExecTimerMaxFec            =  0x00B9C204,
+	HWPfPermonAControlBusMon              =  0x00B9C400,
+	HWPfPermonAConfigBusMon               =  0x00B9C404,
+	HWPfPermonASkipCountBusMon            =  0x00B9C408,
+	HWPfPermonAMinLatBusMon               =  0x00B9C40C,
+	HWPfPermonAMaxLatBusMon               =  0x00B9C500,
+	HWPfPermonATotalLatLowBusMon          =  0x00B9C504,
+	HWPfPermonATotalLatUpperBusMon        =  0x00B9C508,
+	HWPfPermonATotalReqCntBusMon          =  0x00B9C50C,
+	HWPfQosmonBCntrlReg                   =  0x00BA0000,
+	HWPfQosmonBEvalOverflow0              =  0x00BA0008,
+	HWPfQosmonBEvalOverflow1              =  0x00BA000C,
+	HWPfQosmonBDivTerm                    =  0x00BA0010,
+	HWPfQosmonBTickTerm                   =  0x00BA0014,
+	HWPfQosmonBEvalTerm                   =  0x00BA0018,
+	HWPfQosmonBAveTerm                    =  0x00BA001C,
+	HWPfQosmonBForceEccErr                =  0x00BA0020,
+	HWPfQosmonBEccErrDetect               =  0x00BA0024,
+	HWPfQosmonBIterationConfig0Low        =  0x00BA0060,
+	HWPfQosmonBIterationConfig0High       =  0x00BA0064,
+	HWPfQosmonBIterationConfig1Low        =  0x00BA0068,
+	HWPfQosmonBIterationConfig1High       =  0x00BA006C,
+	HWPfQosmonBIterationConfig2Low        =  0x00BA0070,
+	HWPfQosmonBIterationConfig2High       =  0x00BA0074,
+	HWPfQosmonBIterationConfig3Low        =  0x00BA0078,
+	HWPfQosmonBIterationConfig3High       =  0x00BA007C,
+	HWPfQosmonBEvalMemAddr                =  0x00BA0080,
+	HWPfQosmonBEvalMemData                =  0x00BA0084,
+	HWPfQosmonBXaction                    =  0x00BA00C0,
+	HWPfQosmonBRemThres1Vf                =  0x00BA0400,
+	HWPfQosmonBThres2Vf                   =  0x00BA0404,
+	HWPfQosmonBWeiFracVf                  =  0x00BA0408,
+	HWPfQosmonBRrWeiVf                    =  0x00BA040C,
+	HWPfPermonBCntrlRegVf                 =  0x00BA8000,
+	HWPfPermonBCountVf                    =  0x00BA8008,
+	HWPfPermonBKCntLoVf                   =  0x00BA8010,
+	HWPfPermonBKCntHiVf                   =  0x00BA8014,
+	HWPfPermonBDeltaCntLoVf               =  0x00BA8020,
+	HWPfPermonBDeltaCntHiVf               =  0x00BA8024,
+	HWPfPermonBVersionReg                 =  0x00BAC000,
+	HWPfPermonBCbControlFec               =  0x00BAC0F0,
+	HWPfPermonBDltTimerLoFec              =  0x00BAC0F4,
+	HWPfPermonBDltTimerHiFec              =  0x00BAC0F8,
+	HWPfPermonBCbCountFec                 =  0x00BAC100,
+	HWPfPermonBAccExecTimerLoFec          =  0x00BAC104,
+	HWPfPermonBAccExecTimerHiFec          =  0x00BAC108,
+	HWPfPermonBExecTimerMinFec            =  0x00BAC200,
+	HWPfPermonBExecTimerMaxFec            =  0x00BAC204,
+	HWPfPermonBControlBusMon              =  0x00BAC400,
+	HWPfPermonBConfigBusMon               =  0x00BAC404,
+	HWPfPermonBSkipCountBusMon            =  0x00BAC408,
+	HWPfPermonBMinLatBusMon               =  0x00BAC40C,
+	HWPfPermonBMaxLatBusMon               =  0x00BAC500,
+	HWPfPermonBTotalLatLowBusMon          =  0x00BAC504,
+	HWPfPermonBTotalLatUpperBusMon        =  0x00BAC508,
+	HWPfPermonBTotalReqCntBusMon          =  0x00BAC50C,
+	HWPfFecUl5gCntrlReg                   =  0x00BC0000,
+	HWPfFecUl5gI2MThreshReg               =  0x00BC0004,
+	HWPfFecUl5gVersionReg                 =  0x00BC0100,
+	HWPfFecUl5gFcwStatusReg               =  0x00BC0104,
+	HWPfFecUl5gWarnReg                    =  0x00BC0108,
+	HwPfFecUl5gIbDebugReg                 =  0x00BC0200,
+	HwPfFecUl5gObLlrDebugReg              =  0x00BC0204,
+	HwPfFecUl5gObHarqDebugReg             =  0x00BC0208,
+	HwPfFecUl5g1CntrlReg                  =  0x00BC1000,
+	HwPfFecUl5g1I2MThreshReg              =  0x00BC1004,
+	HwPfFecUl5g1VersionReg                =  0x00BC1100,
+	HwPfFecUl5g1FcwStatusReg              =  0x00BC1104,
+	HwPfFecUl5g1WarnReg                   =  0x00BC1108,
+	HwPfFecUl5g1IbDebugReg                =  0x00BC1200,
+	HwPfFecUl5g1ObLlrDebugReg             =  0x00BC1204,
+	HwPfFecUl5g1ObHarqDebugReg            =  0x00BC1208,
+	HwPfFecUl5g2CntrlReg                  =  0x00BC2000,
+	HwPfFecUl5g2I2MThreshReg              =  0x00BC2004,
+	HwPfFecUl5g2VersionReg                =  0x00BC2100,
+	HwPfFecUl5g2FcwStatusReg              =  0x00BC2104,
+	HwPfFecUl5g2WarnReg                   =  0x00BC2108,
+	HwPfFecUl5g2IbDebugReg                =  0x00BC2200,
+	HwPfFecUl5g2ObLlrDebugReg             =  0x00BC2204,
+	HwPfFecUl5g2ObHarqDebugReg            =  0x00BC2208,
+	HwPfFecUl5g3CntrlReg                  =  0x00BC3000,
+	HwPfFecUl5g3I2MThreshReg              =  0x00BC3004,
+	HwPfFecUl5g3VersionReg                =  0x00BC3100,
+	HwPfFecUl5g3FcwStatusReg              =  0x00BC3104,
+	HwPfFecUl5g3WarnReg                   =  0x00BC3108,
+	HwPfFecUl5g3IbDebugReg                =  0x00BC3200,
+	HwPfFecUl5g3ObLlrDebugReg             =  0x00BC3204,
+	HwPfFecUl5g3ObHarqDebugReg            =  0x00BC3208,
+	HwPfFecUl5g4CntrlReg                  =  0x00BC4000,
+	HwPfFecUl5g4I2MThreshReg              =  0x00BC4004,
+	HwPfFecUl5g4VersionReg                =  0x00BC4100,
+	HwPfFecUl5g4FcwStatusReg              =  0x00BC4104,
+	HwPfFecUl5g4WarnReg                   =  0x00BC4108,
+	HwPfFecUl5g4IbDebugReg                =  0x00BC4200,
+	HwPfFecUl5g4ObLlrDebugReg             =  0x00BC4204,
+	HwPfFecUl5g4ObHarqDebugReg            =  0x00BC4208,
+	HwPfFecUl5g5CntrlReg                  =  0x00BC5000,
+	HwPfFecUl5g5I2MThreshReg              =  0x00BC5004,
+	HwPfFecUl5g5VersionReg                =  0x00BC5100,
+	HwPfFecUl5g5FcwStatusReg              =  0x00BC5104,
+	HwPfFecUl5g5WarnReg                   =  0x00BC5108,
+	HwPfFecUl5g5IbDebugReg                =  0x00BC5200,
+	HwPfFecUl5g5ObLlrDebugReg             =  0x00BC5204,
+	HwPfFecUl5g5ObHarqDebugReg            =  0x00BC5208,
+	HwPfFecUl5g6CntrlReg                  =  0x00BC6000,
+	HwPfFecUl5g6I2MThreshReg              =  0x00BC6004,
+	HwPfFecUl5g6VersionReg                =  0x00BC6100,
+	HwPfFecUl5g6FcwStatusReg              =  0x00BC6104,
+	HwPfFecUl5g6WarnReg                   =  0x00BC6108,
+	HwPfFecUl5g6IbDebugReg                =  0x00BC6200,
+	HwPfFecUl5g6ObLlrDebugReg             =  0x00BC6204,
+	HwPfFecUl5g6ObHarqDebugReg            =  0x00BC6208,
+	HwPfFecUl5g7CntrlReg                  =  0x00BC7000,
+	HwPfFecUl5g7I2MThreshReg              =  0x00BC7004,
+	HwPfFecUl5g7VersionReg                =  0x00BC7100,
+	HwPfFecUl5g7FcwStatusReg              =  0x00BC7104,
+	HwPfFecUl5g7WarnReg                   =  0x00BC7108,
+	HwPfFecUl5g7IbDebugReg                =  0x00BC7200,
+	HwPfFecUl5g7ObLlrDebugReg             =  0x00BC7204,
+	HwPfFecUl5g7ObHarqDebugReg            =  0x00BC7208,
+	HwPfFecUl5g8CntrlReg                  =  0x00BC8000,
+	HwPfFecUl5g8I2MThreshReg              =  0x00BC8004,
+	HwPfFecUl5g8VersionReg                =  0x00BC8100,
+	HwPfFecUl5g8FcwStatusReg              =  0x00BC8104,
+	HwPfFecUl5g8WarnReg                   =  0x00BC8108,
+	HwPfFecUl5g8IbDebugReg                =  0x00BC8200,
+	HwPfFecUl5g8ObLlrDebugReg             =  0x00BC8204,
+	HwPfFecUl5g8ObHarqDebugReg            =  0x00BC8208,
+	HWPfFecDl5gCntrlReg                   =  0x00BCF000,
+	HWPfFecDl5gI2MThreshReg               =  0x00BCF004,
+	HWPfFecDl5gVersionReg                 =  0x00BCF100,
+	HWPfFecDl5gFcwStatusReg               =  0x00BCF104,
+	HWPfFecDl5gWarnReg                    =  0x00BCF108,
+	HWPfFecUlVersionReg                   =  0x00BD0000,
+	HWPfFecUlControlReg                   =  0x00BD0004,
+	HWPfFecUlStatusReg                    =  0x00BD0008,
+	HWPfFecDlVersionReg                   =  0x00BDF000,
+	HWPfFecDlClusterConfigReg             =  0x00BDF004,
+	HWPfFecDlBurstThres                   =  0x00BDF00C,
+	HWPfFecDlClusterStatusReg0            =  0x00BDF040,
+	HWPfFecDlClusterStatusReg1            =  0x00BDF044,
+	HWPfFecDlClusterStatusReg2            =  0x00BDF048,
+	HWPfFecDlClusterStatusReg3            =  0x00BDF04C,
+	HWPfFecDlClusterStatusReg4            =  0x00BDF050,
+	HWPfFecDlClusterStatusReg5            =  0x00BDF054,
+	HWPfChaFabPllPllrst                   =  0x00C40000,
+	HWPfChaFabPllClk0                     =  0x00C40004,
+	HWPfChaFabPllClk1                     =  0x00C40008,
+	HWPfChaFabPllBwadj                    =  0x00C4000C,
+	HWPfChaFabPllLbw                      =  0x00C40010,
+	HWPfChaFabPllResetq                   =  0x00C40014,
+	HWPfChaFabPllPhshft0                  =  0x00C40018,
+	HWPfChaFabPllPhshft1                  =  0x00C4001C,
+	HWPfChaFabPllDivq0                    =  0x00C40020,
+	HWPfChaFabPllDivq1                    =  0x00C40024,
+	HWPfChaFabPllDivq2                    =  0x00C40028,
+	HWPfChaFabPllDivq3                    =  0x00C4002C,
+	HWPfChaFabPllDivq4                    =  0x00C40030,
+	HWPfChaFabPllDivq5                    =  0x00C40034,
+	HWPfChaFabPllDivq6                    =  0x00C40038,
+	HWPfChaFabPllDivq7                    =  0x00C4003C,
+	HWPfChaDl5gPllPllrst                  =  0x00C40080,
+	HWPfChaDl5gPllClk0                    =  0x00C40084,
+	HWPfChaDl5gPllClk1                    =  0x00C40088,
+	HWPfChaDl5gPllBwadj                   =  0x00C4008C,
+	HWPfChaDl5gPllLbw                     =  0x00C40090,
+	HWPfChaDl5gPllResetq                  =  0x00C40094,
+	HWPfChaDl5gPllPhshft0                 =  0x00C40098,
+	HWPfChaDl5gPllPhshft1                 =  0x00C4009C,
+	HWPfChaDl5gPllDivq0                   =  0x00C400A0,
+	HWPfChaDl5gPllDivq1                   =  0x00C400A4,
+	HWPfChaDl5gPllDivq2                   =  0x00C400A8,
+	HWPfChaDl5gPllDivq3                   =  0x00C400AC,
+	HWPfChaDl5gPllDivq4                   =  0x00C400B0,
+	HWPfChaDl5gPllDivq5                   =  0x00C400B4,
+	HWPfChaDl5gPllDivq6                   =  0x00C400B8,
+	HWPfChaDl5gPllDivq7                   =  0x00C400BC,
+	HWPfChaDl4gPllPllrst                  =  0x00C40100,
+	HWPfChaDl4gPllClk0                    =  0x00C40104,
+	HWPfChaDl4gPllClk1                    =  0x00C40108,
+	HWPfChaDl4gPllBwadj                   =  0x00C4010C,
+	HWPfChaDl4gPllLbw                     =  0x00C40110,
+	HWPfChaDl4gPllResetq                  =  0x00C40114,
+	HWPfChaDl4gPllPhshft0                 =  0x00C40118,
+	HWPfChaDl4gPllPhshft1                 =  0x00C4011C,
+	HWPfChaDl4gPllDivq0                   =  0x00C40120,
+	HWPfChaDl4gPllDivq1                   =  0x00C40124,
+	HWPfChaDl4gPllDivq2                   =  0x00C40128,
+	HWPfChaDl4gPllDivq3                   =  0x00C4012C,
+	HWPfChaDl4gPllDivq4                   =  0x00C40130,
+	HWPfChaDl4gPllDivq5                   =  0x00C40134,
+	HWPfChaDl4gPllDivq6                   =  0x00C40138,
+	HWPfChaDl4gPllDivq7                   =  0x00C4013C,
+	HWPfChaUl5gPllPllrst                  =  0x00C40180,
+	HWPfChaUl5gPllClk0                    =  0x00C40184,
+	HWPfChaUl5gPllClk1                    =  0x00C40188,
+	HWPfChaUl5gPllBwadj                   =  0x00C4018C,
+	HWPfChaUl5gPllLbw                     =  0x00C40190,
+	HWPfChaUl5gPllResetq                  =  0x00C40194,
+	HWPfChaUl5gPllPhshft0                 =  0x00C40198,
+	HWPfChaUl5gPllPhshft1                 =  0x00C4019C,
+	HWPfChaUl5gPllDivq0                   =  0x00C401A0,
+	HWPfChaUl5gPllDivq1                   =  0x00C401A4,
+	HWPfChaUl5gPllDivq2                   =  0x00C401A8,
+	HWPfChaUl5gPllDivq3                   =  0x00C401AC,
+	HWPfChaUl5gPllDivq4                   =  0x00C401B0,
+	HWPfChaUl5gPllDivq5                   =  0x00C401B4,
+	HWPfChaUl5gPllDivq6                   =  0x00C401B8,
+	HWPfChaUl5gPllDivq7                   =  0x00C401BC,
+	HWPfChaUl4gPllPllrst                  =  0x00C40200,
+	HWPfChaUl4gPllClk0                    =  0x00C40204,
+	HWPfChaUl4gPllClk1                    =  0x00C40208,
+	HWPfChaUl4gPllBwadj                   =  0x00C4020C,
+	HWPfChaUl4gPllLbw                     =  0x00C40210,
+	HWPfChaUl4gPllResetq                  =  0x00C40214,
+	HWPfChaUl4gPllPhshft0                 =  0x00C40218,
+	HWPfChaUl4gPllPhshft1                 =  0x00C4021C,
+	HWPfChaUl4gPllDivq0                   =  0x00C40220,
+	HWPfChaUl4gPllDivq1                   =  0x00C40224,
+	HWPfChaUl4gPllDivq2                   =  0x00C40228,
+	HWPfChaUl4gPllDivq3                   =  0x00C4022C,
+	HWPfChaUl4gPllDivq4                   =  0x00C40230,
+	HWPfChaUl4gPllDivq5                   =  0x00C40234,
+	HWPfChaUl4gPllDivq6                   =  0x00C40238,
+	HWPfChaUl4gPllDivq7                   =  0x00C4023C,
+	HWPfChaDdrPllPllrst                   =  0x00C40280,
+	HWPfChaDdrPllClk0                     =  0x00C40284,
+	HWPfChaDdrPllClk1                     =  0x00C40288,
+	HWPfChaDdrPllBwadj                    =  0x00C4028C,
+	HWPfChaDdrPllLbw                      =  0x00C40290,
+	HWPfChaDdrPllResetq                   =  0x00C40294,
+	HWPfChaDdrPllPhshft0                  =  0x00C40298,
+	HWPfChaDdrPllPhshft1                  =  0x00C4029C,
+	HWPfChaDdrPllDivq0                    =  0x00C402A0,
+	HWPfChaDdrPllDivq1                    =  0x00C402A4,
+	HWPfChaDdrPllDivq2                    =  0x00C402A8,
+	HWPfChaDdrPllDivq3                    =  0x00C402AC,
+	HWPfChaDdrPllDivq4                    =  0x00C402B0,
+	HWPfChaDdrPllDivq5                    =  0x00C402B4,
+	HWPfChaDdrPllDivq6                    =  0x00C402B8,
+	HWPfChaDdrPllDivq7                    =  0x00C402BC,
+	HWPfChaErrStatus                      =  0x00C40400,
+	HWPfChaErrMask                        =  0x00C40404,
+	HWPfChaDebugPcieMsiFifo               =  0x00C40410,
+	HWPfChaDebugDdrMsiFifo                =  0x00C40414,
+	HWPfChaDebugMiscMsiFifo               =  0x00C40418,
+	HWPfChaPwmSet                         =  0x00C40420,
+	HWPfChaDdrRstStatus                   =  0x00C40430,
+	HWPfChaDdrStDoneStatus                =  0x00C40434,
+	HWPfChaDdrWbRstCfg                    =  0x00C40438,
+	HWPfChaDdrApbRstCfg                   =  0x00C4043C,
+	HWPfChaDdrPhyRstCfg                   =  0x00C40440,
+	HWPfChaDdrCpuRstCfg                   =  0x00C40444,
+	HWPfChaDdrSifRstCfg                   =  0x00C40448,
+	HWPfChaPadcfgPcomp0                   =  0x00C41000,
+	HWPfChaPadcfgNcomp0                   =  0x00C41004,
+	HWPfChaPadcfgOdt0                     =  0x00C41008,
+	HWPfChaPadcfgProtect0                 =  0x00C4100C,
+	HWPfChaPreemphasisProtect0            =  0x00C41010,
+	HWPfChaPreemphasisCompen0             =  0x00C41040,
+	HWPfChaPreemphasisOdten0              =  0x00C41044,
+	HWPfChaPadcfgPcomp1                   =  0x00C41100,
+	HWPfChaPadcfgNcomp1                   =  0x00C41104,
+	HWPfChaPadcfgOdt1                     =  0x00C41108,
+	HWPfChaPadcfgProtect1                 =  0x00C4110C,
+	HWPfChaPreemphasisProtect1            =  0x00C41110,
+	HWPfChaPreemphasisCompen1             =  0x00C41140,
+	HWPfChaPreemphasisOdten1              =  0x00C41144,
+	HWPfChaPadcfgPcomp2                   =  0x00C41200,
+	HWPfChaPadcfgNcomp2                   =  0x00C41204,
+	HWPfChaPadcfgOdt2                     =  0x00C41208,
+	HWPfChaPadcfgProtect2                 =  0x00C4120C,
+	HWPfChaPreemphasisProtect2            =  0x00C41210,
+	HWPfChaPreemphasisCompen2             =  0x00C41240,
+	HWPfChaPreemphasisOdten4              =  0x00C41444,
+	HWPfChaPreemphasisOdten2              =  0x00C41244,
+	HWPfChaPadcfgPcomp3                   =  0x00C41300,
+	HWPfChaPadcfgNcomp3                   =  0x00C41304,
+	HWPfChaPadcfgOdt3                     =  0x00C41308,
+	HWPfChaPadcfgProtect3                 =  0x00C4130C,
+	HWPfChaPreemphasisProtect3            =  0x00C41310,
+	HWPfChaPreemphasisCompen3             =  0x00C41340,
+	HWPfChaPreemphasisOdten3              =  0x00C41344,
+	HWPfChaPadcfgPcomp4                   =  0x00C41400,
+	HWPfChaPadcfgNcomp4                   =  0x00C41404,
+	HWPfChaPadcfgOdt4                     =  0x00C41408,
+	HWPfChaPadcfgProtect4                 =  0x00C4140C,
+	HWPfChaPreemphasisProtect4            =  0x00C41410,
+	HWPfChaPreemphasisCompen4             =  0x00C41440,
+	HWPfHiVfToPfDbellVf                   =  0x00C80000,
+	HWPfHiPfToVfDbellVf                   =  0x00C80008,
+	HWPfHiInfoRingBaseLoVf                =  0x00C80010,
+	HWPfHiInfoRingBaseHiVf                =  0x00C80014,
+	HWPfHiInfoRingPointerVf               =  0x00C80018,
+	HWPfHiInfoRingIntWrEnVf               =  0x00C80020,
+	HWPfHiInfoRingPf2VfWrEnVf             =  0x00C80024,
+	HWPfHiMsixVectorMapperVf              =  0x00C80060,
+	HWPfHiModuleVersionReg                =  0x00C84000,
+	HWPfHiIosf2axiErrLogReg               =  0x00C84004,
+	HWPfHiHardResetReg                    =  0x00C84008,
+	HWPfHi5GHardResetReg                  =  0x00C8400C,
+	HWPfHiInfoRingBaseLoRegPf             =  0x00C84010,
+	HWPfHiInfoRingBaseHiRegPf             =  0x00C84014,
+	HWPfHiInfoRingPointerRegPf            =  0x00C84018,
+	HWPfHiInfoRingIntWrEnRegPf            =  0x00C84020,
+	HWPfHiInfoRingVf2pfLoWrEnReg          =  0x00C84024,
+	HWPfHiInfoRingVf2pfHiWrEnReg          =  0x00C84028,
+	HWPfHiLogParityErrStatusReg           =  0x00C8402C,
+	HWPfHiLogDataParityErrorVfStatusLo    =  0x00C84030,
+	HWPfHiLogDataParityErrorVfStatusHi    =  0x00C84034,
+	HWPfHiBlockTransmitOnErrorEn          =  0x00C84038,
+	HWPfHiCfgMsiIntWrEnRegPf              =  0x00C84040,
+	HWPfHiCfgMsiVf2pfLoWrEnReg            =  0x00C84044,
+	HWPfHiCfgMsiVf2pfHighWrEnReg          =  0x00C84048,
+	HWPfHiMsixVectorMapperPf              =  0x00C84060,
+	HWPfHiApbWrWaitTime                   =  0x00C84100,
+	HWPfHiXCounterMaxValue                =  0x00C84104,
+	HWPfHiPfMode                          =  0x00C84108,
+	HWPfHiClkGateHystReg                  =  0x00C8410C,
+	HWPfHiSnoopBitsReg                    =  0x00C84110,
+	HWPfHiMsiDropEnableReg                =  0x00C84114,
+	HWPfHiMsiStatReg                      =  0x00C84120,
+	HWPfHiFifoOflStatReg                  =  0x00C84124,
+	HWPfHiHiDebugReg                      =  0x00C841F4,
+	HWPfHiDebugMemSnoopMsiFifo            =  0x00C841F8,
+	HWPfHiDebugMemSnoopInputFifo          =  0x00C841FC,
+	HWPfHiMsixMappingConfig               =  0x00C84200,
+	HWPfHiJunkReg                         =  0x00C8FF00,
+	HWPfDdrUmmcVer                        =  0x00D00000,
+	HWPfDdrUmmcCap                        =  0x00D00010,
+	HWPfDdrUmmcCtrl                       =  0x00D00020,
+	HWPfDdrMpcPe                          =  0x00D00080,
+	HWPfDdrMpcPpri3                       =  0x00D00090,
+	HWPfDdrMpcPpri2                       =  0x00D000A0,
+	HWPfDdrMpcPpri1                       =  0x00D000B0,
+	HWPfDdrMpcPpri0                       =  0x00D000C0,
+	HWPfDdrMpcPrwgrpCtrl                  =  0x00D000D0,
+	HWPfDdrMpcPbw7                        =  0x00D000E0,
+	HWPfDdrMpcPbw6                        =  0x00D000F0,
+	HWPfDdrMpcPbw5                        =  0x00D00100,
+	HWPfDdrMpcPbw4                        =  0x00D00110,
+	HWPfDdrMpcPbw3                        =  0x00D00120,
+	HWPfDdrMpcPbw2                        =  0x00D00130,
+	HWPfDdrMpcPbw1                        =  0x00D00140,
+	HWPfDdrMpcPbw0                        =  0x00D00150,
+	HWPfDdrMemoryInit                     =  0x00D00200,
+	HWPfDdrMemoryInitDone                 =  0x00D00210,
+	HWPfDdrMemInitPhyTrng0                =  0x00D00240,
+	HWPfDdrMemInitPhyTrng1                =  0x00D00250,
+	HWPfDdrMemInitPhyTrng2                =  0x00D00260,
+	HWPfDdrMemInitPhyTrng3                =  0x00D00270,
+	HWPfDdrBcDram                         =  0x00D003C0,
+	HWPfDdrBcAddrMap                      =  0x00D003D0,
+	HWPfDdrBcRef                          =  0x00D003E0,
+	HWPfDdrBcTim0                         =  0x00D00400,
+	HWPfDdrBcTim1                         =  0x00D00410,
+	HWPfDdrBcTim2                         =  0x00D00420,
+	HWPfDdrBcTim3                         =  0x00D00430,
+	HWPfDdrBcTim4                         =  0x00D00440,
+	HWPfDdrBcTim5                         =  0x00D00450,
+	HWPfDdrBcTim6                         =  0x00D00460,
+	HWPfDdrBcTim7                         =  0x00D00470,
+	HWPfDdrBcTim8                         =  0x00D00480,
+	HWPfDdrBcTim9                         =  0x00D00490,
+	HWPfDdrBcTim10                        =  0x00D004A0,
+	HWPfDdrBcTim12                        =  0x00D004C0,
+	HWPfDdrDfiInit                        =  0x00D004D0,
+	HWPfDdrDfiInitComplete                =  0x00D004E0,
+	HWPfDdrDfiTim0                        =  0x00D004F0,
+	HWPfDdrDfiTim1                        =  0x00D00500,
+	HWPfDdrDfiPhyUpdEn                    =  0x00D00530,
+	HWPfDdrMemStatus                      =  0x00D00540,
+	HWPfDdrUmmcErrStatus                  =  0x00D00550,
+	HWPfDdrUmmcIntStatus                  =  0x00D00560,
+	HWPfDdrUmmcIntEn                      =  0x00D00570,
+	HWPfDdrPhyRdLatency                   =  0x00D48400,
+	HWPfDdrPhyRdLatencyDbi                =  0x00D48410,
+	HWPfDdrPhyWrLatency                   =  0x00D48420,
+	HWPfDdrPhyTrngType                    =  0x00D48430,
+	HWPfDdrPhyMrsTiming2                  =  0x00D48440,
+	HWPfDdrPhyMrsTiming0                  =  0x00D48450,
+	HWPfDdrPhyMrsTiming1                  =  0x00D48460,
+	HWPfDdrPhyDramTmrd                    =  0x00D48470,
+	HWPfDdrPhyDramTmod                    =  0x00D48480,
+	HWPfDdrPhyDramTwpre                   =  0x00D48490,
+	HWPfDdrPhyDramTrfc                    =  0x00D484A0,
+	HWPfDdrPhyDramTrwtp                   =  0x00D484B0,
+	HWPfDdrPhyMr01Dimm                    =  0x00D484C0,
+	HWPfDdrPhyMr01DimmDbi                 =  0x00D484D0,
+	HWPfDdrPhyMr23Dimm                    =  0x00D484E0,
+	HWPfDdrPhyMr45Dimm                    =  0x00D484F0,
+	HWPfDdrPhyMr67Dimm                    =  0x00D48500,
+	HWPfDdrPhyWrlvlWwRdlvlRr              =  0x00D48510,
+	HWPfDdrPhyOdtEn                       =  0x00D48520,
+	HWPfDdrPhyFastTrng                    =  0x00D48530,
+	HWPfDdrPhyDynTrngGap                  =  0x00D48540,
+	HWPfDdrPhyDynRcalGap                  =  0x00D48550,
+	HWPfDdrPhyIdletimeout                 =  0x00D48560,
+	HWPfDdrPhyRstCkeGap                   =  0x00D48570,
+	HWPfDdrPhyCkeMrsGap                   =  0x00D48580,
+	HWPfDdrPhyMemVrefMidVal               =  0x00D48590,
+	HWPfDdrPhyVrefStep                    =  0x00D485A0,
+	HWPfDdrPhyVrefThreshold               =  0x00D485B0,
+	HWPfDdrPhyPhyVrefMidVal               =  0x00D485C0,
+	HWPfDdrPhyDqsCountMax                 =  0x00D485D0,
+	HWPfDdrPhyDqsCountNum                 =  0x00D485E0,
+	HWPfDdrPhyDramRow                     =  0x00D485F0,
+	HWPfDdrPhyDramCol                     =  0x00D48600,
+	HWPfDdrPhyDramBgBa                    =  0x00D48610,
+	HWPfDdrPhyDynamicUpdreqrel            =  0x00D48620,
+	HWPfDdrPhyVrefLimits                  =  0x00D48630,
+	HWPfDdrPhyIdtmTcStatus                =  0x00D6C020,
+	HWPfDdrPhyIdtmFwVersion               =  0x00D6C410,
+	HWPfDdrPhyRdlvlGateInitDelay          =  0x00D70000,
+	HWPfDdrPhyRdenSmplabc                 =  0x00D70008,
+	HWPfDdrPhyVrefNibble0                 =  0x00D7000C,
+	HWPfDdrPhyVrefNibble1                 =  0x00D70010,
+	HWPfDdrPhyRdlvlGateDqsSmpl0           =  0x00D70014,
+	HWPfDdrPhyRdlvlGateDqsSmpl1           =  0x00D70018,
+	HWPfDdrPhyRdlvlGateDqsSmpl2           =  0x00D7001C,
+	HWPfDdrPhyDqsCount                    =  0x00D70020,
+	HWPfDdrPhyWrlvlRdlvlGateStatus        =  0x00D70024,
+	HWPfDdrPhyErrorFlags                  =  0x00D70028,
+	HWPfDdrPhyPowerDown                   =  0x00D70030,
+	HWPfDdrPhyPrbsSeedByte0               =  0x00D70034,
+	HWPfDdrPhyPrbsSeedByte1               =  0x00D70038,
+	HWPfDdrPhyPcompDq                     =  0x00D70040,
+	HWPfDdrPhyNcompDq                     =  0x00D70044,
+	HWPfDdrPhyPcompDqs                    =  0x00D70048,
+	HWPfDdrPhyNcompDqs                    =  0x00D7004C,
+	HWPfDdrPhyPcompCmd                    =  0x00D70050,
+	HWPfDdrPhyNcompCmd                    =  0x00D70054,
+	HWPfDdrPhyPcompCk                     =  0x00D70058,
+	HWPfDdrPhyNcompCk                     =  0x00D7005C,
+	HWPfDdrPhyRcalOdtDq                   =  0x00D70060,
+	HWPfDdrPhyRcalOdtDqs                  =  0x00D70064,
+	HWPfDdrPhyRcalMask1                   =  0x00D70068,
+	HWPfDdrPhyRcalMask2                   =  0x00D7006C,
+	HWPfDdrPhyRcalCtrl                    =  0x00D70070,
+	HWPfDdrPhyRcalCnt                     =  0x00D70074,
+	HWPfDdrPhyRcalOverride                =  0x00D70078,
+	HWPfDdrPhyRcalGateen                  =  0x00D7007C,
+	HWPfDdrPhyCtrl                        =  0x00D70080,
+	HWPfDdrPhyWrlvlAlg                    =  0x00D70084,
+	HWPfDdrPhyRcalVreftTxcmdOdt           =  0x00D70088,
+	HWPfDdrPhyRdlvlGateParam              =  0x00D7008C,
+	HWPfDdrPhyRdlvlGateParam2             =  0x00D70090,
+	HWPfDdrPhyRcalVreftTxdata             =  0x00D70094,
+	HWPfDdrPhyCmdIntDelay                 =  0x00D700A4,
+	HWPfDdrPhyAlertN                      =  0x00D700A8,
+	HWPfDdrPhyTrngReqWpre2tck             =  0x00D700AC,
+	HWPfDdrPhyCmdPhaseSel                 =  0x00D700B4,
+	HWPfDdrPhyCmdDcdl                     =  0x00D700B8,
+	HWPfDdrPhyCkDcdl                      =  0x00D700BC,
+	HWPfDdrPhySwTrngCtrl1                 =  0x00D700C0,
+	HWPfDdrPhySwTrngCtrl2                 =  0x00D700C4,
+	HWPfDdrPhyRcalPcompRden               =  0x00D700C8,
+	HWPfDdrPhyRcalNcompRden               =  0x00D700CC,
+	HWPfDdrPhyRcalCompen                  =  0x00D700D0,
+	HWPfDdrPhySwTrngRdqs                  =  0x00D700D4,
+	HWPfDdrPhySwTrngWdqs                  =  0x00D700D8,
+	HWPfDdrPhySwTrngRdena                 =  0x00D700DC,
+	HWPfDdrPhySwTrngRdenb                 =  0x00D700E0,
+	HWPfDdrPhySwTrngRdenc                 =  0x00D700E4,
+	HWPfDdrPhySwTrngWdq                   =  0x00D700E8,
+	HWPfDdrPhySwTrngRdq                   =  0x00D700EC,
+	HWPfDdrPhyPcfgHmValue                 =  0x00D700F0,
+	HWPfDdrPhyPcfgTimerValue              =  0x00D700F4,
+	HWPfDdrPhyPcfgSoftwareTraining        =  0x00D700F8,
+	HWPfDdrPhyPcfgMcStatus                =  0x00D700FC,
+	HWPfDdrPhyWrlvlPhRank0                =  0x00D70100,
+	HWPfDdrPhyRdenPhRank0                 =  0x00D70104,
+	HWPfDdrPhyRdenIntRank0                =  0x00D70108,
+	HWPfDdrPhyRdqsDcdlRank0               =  0x00D7010C,
+	HWPfDdrPhyRdqsShadowDcdlRank0         =  0x00D70110,
+	HWPfDdrPhyWdqsDcdlRank0               =  0x00D70114,
+	HWPfDdrPhyWdmDcdlShadowRank0          =  0x00D70118,
+	HWPfDdrPhyWdmDcdlRank0                =  0x00D7011C,
+	HWPfDdrPhyDbiDcdlRank0                =  0x00D70120,
+	HWPfDdrPhyRdenDcdlaRank0              =  0x00D70124,
+	HWPfDdrPhyDbiDcdlShadowRank0          =  0x00D70128,
+	HWPfDdrPhyRdenDcdlbRank0              =  0x00D7012C,
+	HWPfDdrPhyWdqsShadowDcdlRank0         =  0x00D70130,
+	HWPfDdrPhyRdenDcdlcRank0              =  0x00D70134,
+	HWPfDdrPhyRdenShadowDcdlaRank0        =  0x00D70138,
+	HWPfDdrPhyWrlvlIntRank0               =  0x00D7013C,
+	HWPfDdrPhyRdqDcdlBit0Rank0            =  0x00D70200,
+	HWPfDdrPhyRdqDcdlShadowBit0Rank0      =  0x00D70204,
+	HWPfDdrPhyWdqDcdlBit0Rank0            =  0x00D70208,
+	HWPfDdrPhyWdqDcdlShadowBit0Rank0      =  0x00D7020C,
+	HWPfDdrPhyRdqDcdlBit1Rank0            =  0x00D70240,
+	HWPfDdrPhyRdqDcdlShadowBit1Rank0      =  0x00D70244,
+	HWPfDdrPhyWdqDcdlBit1Rank0            =  0x00D70248,
+	HWPfDdrPhyWdqDcdlShadowBit1Rank0      =  0x00D7024C,
+	HWPfDdrPhyRdqDcdlBit2Rank0            =  0x00D70280,
+	HWPfDdrPhyRdqDcdlShadowBit2Rank0      =  0x00D70284,
+	HWPfDdrPhyWdqDcdlBit2Rank0            =  0x00D70288,
+	HWPfDdrPhyWdqDcdlShadowBit2Rank0      =  0x00D7028C,
+	HWPfDdrPhyRdqDcdlBit3Rank0            =  0x00D702C0,
+	HWPfDdrPhyRdqDcdlShadowBit3Rank0      =  0x00D702C4,
+	HWPfDdrPhyWdqDcdlBit3Rank0            =  0x00D702C8,
+	HWPfDdrPhyWdqDcdlShadowBit3Rank0      =  0x00D702CC,
+	HWPfDdrPhyRdqDcdlBit4Rank0            =  0x00D70300,
+	HWPfDdrPhyRdqDcdlShadowBit4Rank0      =  0x00D70304,
+	HWPfDdrPhyWdqDcdlBit4Rank0            =  0x00D70308,
+	HWPfDdrPhyWdqDcdlShadowBit4Rank0      =  0x00D7030C,
+	HWPfDdrPhyRdqDcdlBit5Rank0            =  0x00D70340,
+	HWPfDdrPhyRdqDcdlShadowBit5Rank0      =  0x00D70344,
+	HWPfDdrPhyWdqDcdlBit5Rank0            =  0x00D70348,
+	HWPfDdrPhyWdqDcdlShadowBit5Rank0      =  0x00D7034C,
+	HWPfDdrPhyRdqDcdlBit6Rank0            =  0x00D70380,
+	HWPfDdrPhyRdqDcdlShadowBit6Rank0      =  0x00D70384,
+	HWPfDdrPhyWdqDcdlBit6Rank0            =  0x00D70388,
+	HWPfDdrPhyWdqDcdlShadowBit6Rank0      =  0x00D7038C,
+	HWPfDdrPhyRdqDcdlBit7Rank0            =  0x00D703C0,
+	HWPfDdrPhyRdqDcdlShadowBit7Rank0      =  0x00D703C4,
+	HWPfDdrPhyWdqDcdlBit7Rank0            =  0x00D703C8,
+	HWPfDdrPhyWdqDcdlShadowBit7Rank0      =  0x00D703CC,
+	HWPfDdrPhyIdtmStatus                  =  0x00D740D0,
+	HWPfDdrPhyIdtmError                   =  0x00D74110,
+	HWPfDdrPhyIdtmDebug                   =  0x00D74120,
+	HWPfDdrPhyIdtmDebugInt                =  0x00D74130,
+	HwPfPcieLnAsicCfgovr                  =  0x00D80000,
+	HwPfPcieLnAclkmixer                   =  0x00D80004,
+	HwPfPcieLnTxrampfreq                  =  0x00D80008,
+	HwPfPcieLnLanetest                    =  0x00D8000C,
+	HwPfPcieLnDcctrl                      =  0x00D80010,
+	HwPfPcieLnDccmeas                     =  0x00D80014,
+	HwPfPcieLnDccovrAclk                  =  0x00D80018,
+	HwPfPcieLnDccovrTxa                   =  0x00D8001C,
+	HwPfPcieLnDccovrTxk                   =  0x00D80020,
+	HwPfPcieLnDccovrDclk                  =  0x00D80024,
+	HwPfPcieLnDccovrEclk                  =  0x00D80028,
+	HwPfPcieLnDcctrimAclk                 =  0x00D8002C,
+	HwPfPcieLnDcctrimTx                   =  0x00D80030,
+	HwPfPcieLnDcctrimDclk                 =  0x00D80034,
+	HwPfPcieLnDcctrimEclk                 =  0x00D80038,
+	HwPfPcieLnQuadCtrl                    =  0x00D8003C,
+	HwPfPcieLnQuadCorrIndex               =  0x00D80040,
+	HwPfPcieLnQuadCorrStatus              =  0x00D80044,
+	HwPfPcieLnAsicRxovr1                  =  0x00D80048,
+	HwPfPcieLnAsicRxovr2                  =  0x00D8004C,
+	HwPfPcieLnAsicEqinfovr                =  0x00D80050,
+	HwPfPcieLnRxcsr                       =  0x00D80054,
+	HwPfPcieLnRxfectrl                    =  0x00D80058,
+	HwPfPcieLnRxtest                      =  0x00D8005C,
+	HwPfPcieLnEscount                     =  0x00D80060,
+	HwPfPcieLnCdrctrl                     =  0x00D80064,
+	HwPfPcieLnCdrctrl2                    =  0x00D80068,
+	HwPfPcieLnCdrcfg0Ctrl0                =  0x00D8006C,
+	HwPfPcieLnCdrcfg0Ctrl1                =  0x00D80070,
+	HwPfPcieLnCdrcfg0Ctrl2                =  0x00D80074,
+	HwPfPcieLnCdrcfg1Ctrl0                =  0x00D80078,
+	HwPfPcieLnCdrcfg1Ctrl1                =  0x00D8007C,
+	HwPfPcieLnCdrcfg1Ctrl2                =  0x00D80080,
+	HwPfPcieLnCdrcfg2Ctrl0                =  0x00D80084,
+	HwPfPcieLnCdrcfg2Ctrl1                =  0x00D80088,
+	HwPfPcieLnCdrcfg2Ctrl2                =  0x00D8008C,
+	HwPfPcieLnCdrcfg3Ctrl0                =  0x00D80090,
+	HwPfPcieLnCdrcfg3Ctrl1                =  0x00D80094,
+	HwPfPcieLnCdrcfg3Ctrl2                =  0x00D80098,
+	HwPfPcieLnCdrphase                    =  0x00D8009C,
+	HwPfPcieLnCdrfreq                     =  0x00D800A0,
+	HwPfPcieLnCdrstatusPhase              =  0x00D800A4,
+	HwPfPcieLnCdrstatusFreq               =  0x00D800A8,
+	HwPfPcieLnCdroffset                   =  0x00D800AC,
+	HwPfPcieLnRxvosctl                    =  0x00D800B0,
+	HwPfPcieLnRxvosctl2                   =  0x00D800B4,
+	HwPfPcieLnRxlosctl                    =  0x00D800B8,
+	HwPfPcieLnRxlos                       =  0x00D800BC,
+	HwPfPcieLnRxlosvval                   =  0x00D800C0,
+	HwPfPcieLnRxvosd0                     =  0x00D800C4,
+	HwPfPcieLnRxvosd1                     =  0x00D800C8,
+	HwPfPcieLnRxvosep0                    =  0x00D800CC,
+	HwPfPcieLnRxvosep1                    =  0x00D800D0,
+	HwPfPcieLnRxvosen0                    =  0x00D800D4,
+	HwPfPcieLnRxvosen1                    =  0x00D800D8,
+	HwPfPcieLnRxvosafe                    =  0x00D800DC,
+	HwPfPcieLnRxvosa0                     =  0x00D800E0,
+	HwPfPcieLnRxvosa0Out                  =  0x00D800E4,
+	HwPfPcieLnRxvosa1                     =  0x00D800E8,
+	HwPfPcieLnRxvosa1Out                  =  0x00D800EC,
+	HwPfPcieLnRxmisc                      =  0x00D800F0,
+	HwPfPcieLnRxbeacon                    =  0x00D800F4,
+	HwPfPcieLnRxdssout                    =  0x00D800F8,
+	HwPfPcieLnRxdssout2                   =  0x00D800FC,
+	HwPfPcieLnAlphapctrl                  =  0x00D80100,
+	HwPfPcieLnAlphanctrl                  =  0x00D80104,
+	HwPfPcieLnAdaptctrl                   =  0x00D80108,
+	HwPfPcieLnAdaptctrl1                  =  0x00D8010C,
+	HwPfPcieLnAdaptstatus                 =  0x00D80110,
+	HwPfPcieLnAdaptvga1                   =  0x00D80114,
+	HwPfPcieLnAdaptvga2                   =  0x00D80118,
+	HwPfPcieLnAdaptvga3                   =  0x00D8011C,
+	HwPfPcieLnAdaptvga4                   =  0x00D80120,
+	HwPfPcieLnAdaptboost1                 =  0x00D80124,
+	HwPfPcieLnAdaptboost2                 =  0x00D80128,
+	HwPfPcieLnAdaptboost3                 =  0x00D8012C,
+	HwPfPcieLnAdaptboost4                 =  0x00D80130,
+	HwPfPcieLnAdaptsslms1                 =  0x00D80134,
+	HwPfPcieLnAdaptsslms2                 =  0x00D80138,
+	HwPfPcieLnAdaptvgaStatus              =  0x00D8013C,
+	HwPfPcieLnAdaptboostStatus            =  0x00D80140,
+	HwPfPcieLnAdaptsslmsStatus1           =  0x00D80144,
+	HwPfPcieLnAdaptsslmsStatus2           =  0x00D80148,
+	HwPfPcieLnAfectrl1                    =  0x00D8014C,
+	HwPfPcieLnAfectrl2                    =  0x00D80150,
+	HwPfPcieLnAfectrl3                    =  0x00D80154,
+	HwPfPcieLnAfedefault1                 =  0x00D80158,
+	HwPfPcieLnAfedefault2                 =  0x00D8015C,
+	HwPfPcieLnDfectrl1                    =  0x00D80160,
+	HwPfPcieLnDfectrl2                    =  0x00D80164,
+	HwPfPcieLnDfectrl3                    =  0x00D80168,
+	HwPfPcieLnDfectrl4                    =  0x00D8016C,
+	HwPfPcieLnDfectrl5                    =  0x00D80170,
+	HwPfPcieLnDfectrl6                    =  0x00D80174,
+	HwPfPcieLnAfestatus1                  =  0x00D80178,
+	HwPfPcieLnAfestatus2                  =  0x00D8017C,
+	HwPfPcieLnDfestatus1                  =  0x00D80180,
+	HwPfPcieLnDfestatus2                  =  0x00D80184,
+	HwPfPcieLnDfestatus3                  =  0x00D80188,
+	HwPfPcieLnDfestatus4                  =  0x00D8018C,
+	HwPfPcieLnDfestatus5                  =  0x00D80190,
+	HwPfPcieLnAlphastatus                 =  0x00D80194,
+	HwPfPcieLnFomctrl1                    =  0x00D80198,
+	HwPfPcieLnFomctrl2                    =  0x00D8019C,
+	HwPfPcieLnFomctrl3                    =  0x00D801A0,
+	HwPfPcieLnAclkcalStatus               =  0x00D801A4,
+	HwPfPcieLnOffscorrStatus              =  0x00D801A8,
+	HwPfPcieLnEyewidthStatus              =  0x00D801AC,
+	HwPfPcieLnEyeheightStatus             =  0x00D801B0,
+	HwPfPcieLnAsicTxovr1                  =  0x00D801B4,
+	HwPfPcieLnAsicTxovr2                  =  0x00D801B8,
+	HwPfPcieLnAsicTxovr3                  =  0x00D801BC,
+	HwPfPcieLnTxbiasadjOvr                =  0x00D801C0,
+	HwPfPcieLnTxcsr                       =  0x00D801C4,
+	HwPfPcieLnTxtest                      =  0x00D801C8,
+	HwPfPcieLnTxtestword                  =  0x00D801CC,
+	HwPfPcieLnTxtestwordHigh              =  0x00D801D0,
+	HwPfPcieLnTxdrive                     =  0x00D801D4,
+	HwPfPcieLnMtcsLn                      =  0x00D801D8,
+	HwPfPcieLnStatsumLn                   =  0x00D801DC,
+	HwPfPcieLnRcbusScratch                =  0x00D801E0,
+	HwPfPcieLnRcbusMinorrev               =  0x00D801F0,
+	HwPfPcieLnRcbusMajorrev               =  0x00D801F4,
+	HwPfPcieLnRcbusBlocktype              =  0x00D801F8,
+	HwPfPcieSupPllcsr                     =  0x00D80800,
+	HwPfPcieSupPlldiv                     =  0x00D80804,
+	HwPfPcieSupPllcal                     =  0x00D80808,
+	HwPfPcieSupPllcalsts                  =  0x00D8080C,
+	HwPfPcieSupPllmeas                    =  0x00D80810,
+	HwPfPcieSupPlldactrim                 =  0x00D80814,
+	HwPfPcieSupPllbiastrim                =  0x00D80818,
+	HwPfPcieSupPllbwtrim                  =  0x00D8081C,
+	HwPfPcieSupPllcaldly                  =  0x00D80820,
+	HwPfPcieSupRefclkonpclkctrl           =  0x00D80824,
+	HwPfPcieSupPclkdelay                  =  0x00D80828,
+	HwPfPcieSupPhyconfig                  =  0x00D8082C,
+	HwPfPcieSupRcalIntf                   =  0x00D80830,
+	HwPfPcieSupAuxcsr                     =  0x00D80834,
+	HwPfPcieSupVref                       =  0x00D80838,
+	HwPfPcieSupLinkmode                   =  0x00D8083C,
+	HwPfPcieSupRrefcalctl                 =  0x00D80840,
+	HwPfPcieSupRrefcal                    =  0x00D80844,
+	HwPfPcieSupRrefcaldly                 =  0x00D80848,
+	HwPfPcieSupTximpcalctl                =  0x00D8084C,
+	HwPfPcieSupTximpcal                   =  0x00D80850,
+	HwPfPcieSupTximpoffset                =  0x00D80854,
+	HwPfPcieSupTximpcaldly                =  0x00D80858,
+	HwPfPcieSupRximpcalctl                =  0x00D8085C,
+	HwPfPcieSupRximpcal                   =  0x00D80860,
+	HwPfPcieSupRximpoffset                =  0x00D80864,
+	HwPfPcieSupRximpcaldly                =  0x00D80868,
+	HwPfPcieSupFence                      =  0x00D8086C,
+	HwPfPcieSupMtcs                       =  0x00D80870,
+	HwPfPcieSupStatsum                    =  0x00D809B8,
+	HwPfPciePcsDpStatus0                  =  0x00D81000,
+	HwPfPciePcsDpControl0                 =  0x00D81004,
+	HwPfPciePcsPmaStatusLane0             =  0x00D81008,
+	HwPfPciePcsPipeStatusLane0            =  0x00D8100C,
+	HwPfPciePcsTxdeemph0Lane0             =  0x00D81010,
+	HwPfPciePcsTxdeemph1Lane0             =  0x00D81014,
+	HwPfPciePcsInternalStatusLane0        =  0x00D81018,
+	HwPfPciePcsDpStatus1                  =  0x00D8101C,
+	HwPfPciePcsDpControl1                 =  0x00D81020,
+	HwPfPciePcsPmaStatusLane1             =  0x00D81024,
+	HwPfPciePcsPipeStatusLane1            =  0x00D81028,
+	HwPfPciePcsTxdeemph0Lane1             =  0x00D8102C,
+	HwPfPciePcsTxdeemph1Lane1             =  0x00D81030,
+	HwPfPciePcsInternalStatusLane1        =  0x00D81034,
+	HwPfPciePcsDpStatus2                  =  0x00D81038,
+	HwPfPciePcsDpControl2                 =  0x00D8103C,
+	HwPfPciePcsPmaStatusLane2             =  0x00D81040,
+	HwPfPciePcsPipeStatusLane2            =  0x00D81044,
+	HwPfPciePcsTxdeemph0Lane2             =  0x00D81048,
+	HwPfPciePcsTxdeemph1Lane2             =  0x00D8104C,
+	HwPfPciePcsInternalStatusLane2        =  0x00D81050,
+	HwPfPciePcsDpStatus3                  =  0x00D81054,
+	HwPfPciePcsDpControl3                 =  0x00D81058,
+	HwPfPciePcsPmaStatusLane3             =  0x00D8105C,
+	HwPfPciePcsPipeStatusLane3            =  0x00D81060,
+	HwPfPciePcsTxdeemph0Lane3             =  0x00D81064,
+	HwPfPciePcsTxdeemph1Lane3             =  0x00D81068,
+	HwPfPciePcsInternalStatusLane3        =  0x00D8106C,
+	HwPfPciePcsEbStatus0                  =  0x00D81070,
+	HwPfPciePcsEbStatus1                  =  0x00D81074,
+	HwPfPciePcsEbStatus2                  =  0x00D81078,
+	HwPfPciePcsEbStatus3                  =  0x00D8107C,
+	HwPfPciePcsPllSettingPcieG1           =  0x00D81088,
+	HwPfPciePcsPllSettingPcieG2           =  0x00D8108C,
+	HwPfPciePcsPllSettingPcieG3           =  0x00D81090,
+	HwPfPciePcsControl                    =  0x00D81094,
+	HwPfPciePcsEqControl                  =  0x00D81098,
+	HwPfPciePcsEqTimer                    =  0x00D8109C,
+	HwPfPciePcsEqErrStatus                =  0x00D810A0,
+	HwPfPciePcsEqErrCount                 =  0x00D810A4,
+	HwPfPciePcsStatus                     =  0x00D810A8,
+	HwPfPciePcsMiscRegister               =  0x00D810AC,
+	HwPfPciePcsObsControl                 =  0x00D810B0,
+	HwPfPciePcsPrbsCount0                 =  0x00D81200,
+	HwPfPciePcsBistControl0               =  0x00D81204,
+	HwPfPciePcsBistStaticWord00           =  0x00D81208,
+	HwPfPciePcsBistStaticWord10           =  0x00D8120C,
+	HwPfPciePcsBistStaticWord20           =  0x00D81210,
+	HwPfPciePcsBistStaticWord30           =  0x00D81214,
+	HwPfPciePcsPrbsCount1                 =  0x00D81220,
+	HwPfPciePcsBistControl1               =  0x00D81224,
+	HwPfPciePcsBistStaticWord01           =  0x00D81228,
+	HwPfPciePcsBistStaticWord11           =  0x00D8122C,
+	HwPfPciePcsBistStaticWord21           =  0x00D81230,
+	HwPfPciePcsBistStaticWord31           =  0x00D81234,
+	HwPfPciePcsPrbsCount2                 =  0x00D81240,
+	HwPfPciePcsBistControl2               =  0x00D81244,
+	HwPfPciePcsBistStaticWord02           =  0x00D81248,
+	HwPfPciePcsBistStaticWord12           =  0x00D8124C,
+	HwPfPciePcsBistStaticWord22           =  0x00D81250,
+	HwPfPciePcsBistStaticWord32           =  0x00D81254,
+	HwPfPciePcsPrbsCount3                 =  0x00D81260,
+	HwPfPciePcsBistControl3               =  0x00D81264,
+	HwPfPciePcsBistStaticWord03           =  0x00D81268,
+	HwPfPciePcsBistStaticWord13           =  0x00D8126C,
+	HwPfPciePcsBistStaticWord23           =  0x00D81270,
+	HwPfPciePcsBistStaticWord33           =  0x00D81274,
+	HwPfPcieGpexLtssmStateCntrl           =  0x00D90400,
+	HwPfPcieGpexLtssmStateStatus          =  0x00D90404,
+	HwPfPcieGpexSkipFreqTimer             =  0x00D90408,
+	HwPfPcieGpexLaneSelect                =  0x00D9040C,
+	HwPfPcieGpexLaneDeskew                =  0x00D90410,
+	HwPfPcieGpexRxErrorStatus             =  0x00D90414,
+	HwPfPcieGpexLaneNumControl            =  0x00D90418,
+	HwPfPcieGpexNFstControl               =  0x00D9041C,
+	HwPfPcieGpexLinkStatus                =  0x00D90420,
+	HwPfPcieGpexAckReplayTimeout          =  0x00D90438,
+	HwPfPcieGpexSeqNumberStatus           =  0x00D9043C,
+	HwPfPcieGpexCoreClkRatio              =  0x00D90440,
+	HwPfPcieGpexDllTholdControl           =  0x00D90448,
+	HwPfPcieGpexPmTimer                   =  0x00D90450,
+	HwPfPcieGpexPmeTimeout                =  0x00D90454,
+	HwPfPcieGpexAspmL1Timer               =  0x00D90458,
+	HwPfPcieGpexAspmReqTimer              =  0x00D9045C,
+	HwPfPcieGpexAspmL1Dis                 =  0x00D90460,
+	HwPfPcieGpexAdvisoryErrorControl      =  0x00D90468,
+	HwPfPcieGpexId                        =  0x00D90470,
+	HwPfPcieGpexClasscode                 =  0x00D90474,
+	HwPfPcieGpexSubsystemId               =  0x00D90478,
+	HwPfPcieGpexDeviceCapabilities        =  0x00D9047C,
+	HwPfPcieGpexLinkCapabilities          =  0x00D90480,
+	HwPfPcieGpexFunctionNumber            =  0x00D90484,
+	HwPfPcieGpexPmCapabilities            =  0x00D90488,
+	HwPfPcieGpexFunctionSelect            =  0x00D9048C,
+	HwPfPcieGpexErrorCounter              =  0x00D904AC,
+	HwPfPcieGpexConfigReady               =  0x00D904B0,
+	HwPfPcieGpexFcUpdateTimeout           =  0x00D904B8,
+	HwPfPcieGpexFcUpdateTimer             =  0x00D904BC,
+	HwPfPcieGpexVcBufferLoad              =  0x00D904C8,
+	HwPfPcieGpexVcBufferSizeThold         =  0x00D904CC,
+	HwPfPcieGpexVcBufferSelect            =  0x00D904D0,
+	HwPfPcieGpexBarEnable                 =  0x00D904D4,
+	HwPfPcieGpexBarDwordLower             =  0x00D904D8,
+	HwPfPcieGpexBarDwordUpper             =  0x00D904DC,
+	HwPfPcieGpexBarSelect                 =  0x00D904E0,
+	HwPfPcieGpexCreditCounterSelect       =  0x00D904E4,
+	HwPfPcieGpexCreditCounterStatus       =  0x00D904E8,
+	HwPfPcieGpexTlpHeaderSelect           =  0x00D904EC,
+	HwPfPcieGpexTlpHeaderDword0           =  0x00D904F0,
+	HwPfPcieGpexTlpHeaderDword1           =  0x00D904F4,
+	HwPfPcieGpexTlpHeaderDword2           =  0x00D904F8,
+	HwPfPcieGpexTlpHeaderDword3           =  0x00D904FC,
+	HwPfPcieGpexRelaxOrderControl         =  0x00D90500,
+	HwPfPcieGpexBarPrefetch               =  0x00D90504,
+	HwPfPcieGpexFcCheckControl            =  0x00D90508,
+	HwPfPcieGpexFcUpdateTimerTraffic      =  0x00D90518,
+	HwPfPcieGpexPhyControl0               =  0x00D9053C,
+	HwPfPcieGpexPhyControl1               =  0x00D90544,
+	HwPfPcieGpexPhyControl2               =  0x00D9054C,
+	HwPfPcieGpexUserControl0              =  0x00D9055C,
+	HwPfPcieGpexUncorrErrorStatus         =  0x00D905F0,
+	HwPfPcieGpexRxCplError                =  0x00D90620,
+	HwPfPcieGpexRxCplErrorDword0          =  0x00D90624,
+	HwPfPcieGpexRxCplErrorDword1          =  0x00D90628,
+	HwPfPcieGpexRxCplErrorDword2          =  0x00D9062C,
+	HwPfPcieGpexPabSwResetEn              =  0x00D90630,
+	HwPfPcieGpexGen3Control0              =  0x00D90634,
+	HwPfPcieGpexGen3Control1              =  0x00D90638,
+	HwPfPcieGpexGen3Control2              =  0x00D9063C,
+	HwPfPcieGpexGen2ControlCsr            =  0x00D90640,
+	HwPfPcieGpexTotalVfInitialVf0         =  0x00D90644,
+	HwPfPcieGpexTotalVfInitialVf1         =  0x00D90648,
+	HwPfPcieGpexSriovLinkDevId0           =  0x00D90684,
+	HwPfPcieGpexSriovLinkDevId1           =  0x00D90688,
+	HwPfPcieGpexSriovPageSize0            =  0x00D906C4,
+	HwPfPcieGpexSriovPageSize1            =  0x00D906C8,
+	HwPfPcieGpexIdVersion                 =  0x00D906FC,
+	HwPfPcieGpexSriovVfOffsetStride0      =  0x00D90704,
+	HwPfPcieGpexSriovVfOffsetStride1      =  0x00D90708,
+	HwPfPcieGpexGen3DeskewControl         =  0x00D907B4,
+	HwPfPcieGpexGen3EqControl             =  0x00D907B8,
+	HwPfPcieGpexBridgeVersion             =  0x00D90800,
+	HwPfPcieGpexBridgeCapability          =  0x00D90804,
+	HwPfPcieGpexBridgeControl             =  0x00D90808,
+	HwPfPcieGpexBridgeStatus              =  0x00D9080C,
+	HwPfPcieGpexEngineActivityStatus      =  0x00D9081C,
+	HwPfPcieGpexEngineResetControl        =  0x00D90820,
+	HwPfPcieGpexAxiPioControl             =  0x00D90840,
+	HwPfPcieGpexAxiPioStatus              =  0x00D90844,
+	HwPfPcieGpexAmbaSlaveCmdStatus        =  0x00D90848,
+	HwPfPcieGpexPexPioControl             =  0x00D908C0,
+	HwPfPcieGpexPexPioStatus              =  0x00D908C4,
+	HwPfPcieGpexAmbaMasterStatus          =  0x00D908C8,
+	HwPfPcieGpexCsrSlaveCmdStatus         =  0x00D90920,
+	HwPfPcieGpexMailboxAxiControl         =  0x00D90A50,
+	HwPfPcieGpexMailboxAxiData            =  0x00D90A54,
+	HwPfPcieGpexMailboxPexControl         =  0x00D90A90,
+	HwPfPcieGpexMailboxPexData            =  0x00D90A94,
+	HwPfPcieGpexPexInterruptEnable        =  0x00D90AD0,
+	HwPfPcieGpexPexInterruptStatus        =  0x00D90AD4,
+	HwPfPcieGpexPexInterruptAxiPioVector  =  0x00D90AD8,
+	HwPfPcieGpexPexInterruptPexPioVector  =  0x00D90AE0,
+	HwPfPcieGpexPexInterruptMiscVector    =  0x00D90AF8,
+	HwPfPcieGpexAmbaInterruptPioEnable    =  0x00D90B00,
+	HwPfPcieGpexAmbaInterruptMiscEnable   =  0x00D90B0C,
+	HwPfPcieGpexAmbaInterruptPioStatus    =  0x00D90B10,
+	HwPfPcieGpexAmbaInterruptMiscStatus   =  0x00D90B1C,
+	HwPfPcieGpexPexPmControl              =  0x00D90B80,
+	HwPfPcieGpexSlotMisc                  =  0x00D90B88,
+	HwPfPcieGpexAxiAddrMappingControl     =  0x00D90BA0,
+	HwPfPcieGpexAxiAddrMappingWindowAxiBase     =  0x00D90BA4,
+	HwPfPcieGpexAxiAddrMappingWindowPexBaseLow  =  0x00D90BA8,
+	HwPfPcieGpexAxiAddrMappingWindowPexBaseHigh =  0x00D90BAC,
+	HwPfPcieGpexPexBarAddrFunc0Bar0       =  0x00D91BA0,
+	HwPfPcieGpexPexBarAddrFunc0Bar1       =  0x00D91BA4,
+	HwPfPcieGpexAxiAddrMappingPcieHdrParam =  0x00D95BA0,
+	HwPfPcieGpexExtAxiAddrMappingAxiBase  =  0x00D980A0,
+	HwPfPcieGpexPexExtBarAddrFunc0Bar0    =  0x00D984A0,
+	HwPfPcieGpexPexExtBarAddrFunc0Bar1    =  0x00D984A4,
+	HwPfPcieGpexAmbaInterruptFlrEnable    =  0x00D9B960,
+	HwPfPcieGpexAmbaInterruptFlrStatus    =  0x00D9B9A0,
+	HwPfPcieGpexExtAxiAddrMappingSize     =  0x00D9BAF0,
+	HwPfPcieGpexPexPioAwcacheControl      =  0x00D9C300,
+	HwPfPcieGpexPexPioArcacheControl      =  0x00D9C304,
+	HwPfPcieGpexPabObSizeControlVc0       =  0x00D9C310
+};
+
+/* TIP PF Interrupt numbers */
+enum {
+	ACC100_PF_INT_QMGR_AQ_OVERFLOW = 0,
+	ACC100_PF_INT_DOORBELL_VF_2_PF = 1,
+	ACC100_PF_INT_DMA_DL_DESC_IRQ = 2,
+	ACC100_PF_INT_DMA_UL_DESC_IRQ = 3,
+	ACC100_PF_INT_DMA_MLD_DESC_IRQ = 4,
+	ACC100_PF_INT_DMA_UL5G_DESC_IRQ = 5,
+	ACC100_PF_INT_DMA_DL5G_DESC_IRQ = 6,
+	ACC100_PF_INT_ILLEGAL_FORMAT = 7,
+	ACC100_PF_INT_QMGR_DISABLED_ACCESS = 8,
+	ACC100_PF_INT_QMGR_AQ_OVERTHRESHOLD = 9,
+	ACC100_PF_INT_ARAM_ACCESS_ERR = 10,
+	ACC100_PF_INT_ARAM_ECC_1BIT_ERR = 11,
+	ACC100_PF_INT_PARITY_ERR = 12,
+	ACC100_PF_INT_QMGR_ERR = 13,
+	ACC100_PF_INT_INT_REQ_OVERFLOW = 14,
+	ACC100_PF_INT_APB_TIMEOUT = 15,
+};
+
+#endif /* ACC100_PF_ENUM_H */
diff --git a/drivers/baseband/acc100/acc100_vf_enum.h b/drivers/baseband/acc100/acc100_vf_enum.h
new file mode 100644
index 0000000..b512af3
--- /dev/null
+++ b/drivers/baseband/acc100/acc100_vf_enum.h
@@ -0,0 +1,73 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017 Intel Corporation
+ */
+
+#ifndef ACC100_VF_ENUM_H
+#define ACC100_VF_ENUM_H
+
+/*
+ * ACC100 Register mapping on VF BAR0
+ * This is automatically generated from RDL, format may change with new RDL
+ */
+enum {
+	HWVfQmgrIngressAq             =  0x00000000,
+	HWVfHiVfToPfDbellVf           =  0x00000800,
+	HWVfHiPfToVfDbellVf           =  0x00000808,
+	HWVfHiInfoRingBaseLoVf        =  0x00000810,
+	HWVfHiInfoRingBaseHiVf        =  0x00000814,
+	HWVfHiInfoRingPointerVf       =  0x00000818,
+	HWVfHiInfoRingIntWrEnVf       =  0x00000820,
+	HWVfHiInfoRingPf2VfWrEnVf     =  0x00000824,
+	HWVfHiMsixVectorMapperVf      =  0x00000860,
+	HWVfDmaFec5GulDescBaseLoRegVf =  0x00000920,
+	HWVfDmaFec5GulDescBaseHiRegVf =  0x00000924,
+	HWVfDmaFec5GulRespPtrLoRegVf  =  0x00000928,
+	HWVfDmaFec5GulRespPtrHiRegVf  =  0x0000092C,
+	HWVfDmaFec5GdlDescBaseLoRegVf =  0x00000940,
+	HWVfDmaFec5GdlDescBaseHiRegVf =  0x00000944,
+	HWVfDmaFec5GdlRespPtrLoRegVf  =  0x00000948,
+	HWVfDmaFec5GdlRespPtrHiRegVf  =  0x0000094C,
+	HWVfDmaFec4GulDescBaseLoRegVf =  0x00000960,
+	HWVfDmaFec4GulDescBaseHiRegVf =  0x00000964,
+	HWVfDmaFec4GulRespPtrLoRegVf  =  0x00000968,
+	HWVfDmaFec4GulRespPtrHiRegVf  =  0x0000096C,
+	HWVfDmaFec4GdlDescBaseLoRegVf =  0x00000980,
+	HWVfDmaFec4GdlDescBaseHiRegVf =  0x00000984,
+	HWVfDmaFec4GdlRespPtrLoRegVf  =  0x00000988,
+	HWVfDmaFec4GdlRespPtrHiRegVf  =  0x0000098C,
+	HWVfDmaDdrBaseRangeRoVf       =  0x000009A0,
+	HWVfQmgrAqResetVf             =  0x00000E00,
+	HWVfQmgrRingSizeVf            =  0x00000E04,
+	HWVfQmgrGrpDepthLog20Vf       =  0x00000E08,
+	HWVfQmgrGrpDepthLog21Vf       =  0x00000E0C,
+	HWVfQmgrGrpFunction0Vf        =  0x00000E10,
+	HWVfQmgrGrpFunction1Vf        =  0x00000E14,
+	HWVfPmACntrlRegVf             =  0x00000F40,
+	HWVfPmACountVf                =  0x00000F48,
+	HWVfPmAKCntLoVf               =  0x00000F50,
+	HWVfPmAKCntHiVf               =  0x00000F54,
+	HWVfPmADeltaCntLoVf           =  0x00000F60,
+	HWVfPmADeltaCntHiVf           =  0x00000F64,
+	HWVfPmBCntrlRegVf             =  0x00000F80,
+	HWVfPmBCountVf                =  0x00000F88,
+	HWVfPmBKCntLoVf               =  0x00000F90,
+	HWVfPmBKCntHiVf               =  0x00000F94,
+	HWVfPmBDeltaCntLoVf           =  0x00000FA0,
+	HWVfPmBDeltaCntHiVf           =  0x00000FA4
+};
+
+/* TIP VF Interrupt numbers */
+enum {
+	ACC100_VF_INT_QMGR_AQ_OVERFLOW = 0,
+	ACC100_VF_INT_DOORBELL_VF_2_PF = 1,
+	ACC100_VF_INT_DMA_DL_DESC_IRQ = 2,
+	ACC100_VF_INT_DMA_UL_DESC_IRQ = 3,
+	ACC100_VF_INT_DMA_MLD_DESC_IRQ = 4,
+	ACC100_VF_INT_DMA_UL5G_DESC_IRQ = 5,
+	ACC100_VF_INT_DMA_DL5G_DESC_IRQ = 6,
+	ACC100_VF_INT_ILLEGAL_FORMAT = 7,
+	ACC100_VF_INT_QMGR_DISABLED_ACCESS = 8,
+	ACC100_VF_INT_QMGR_AQ_OVERTHRESHOLD = 9,
+};
+
+#endif /* ACC100_VF_ENUM_H */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 6f46df0..cd77570 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -5,6 +5,9 @@
 #ifndef _RTE_ACC100_PMD_H_
 #define _RTE_ACC100_PMD_H_
 
+#include "acc100_pf_enum.h"
+#include "acc100_vf_enum.h"
+
 /* Helper macro for logging */
 #define rte_bbdev_log(level, fmt, ...) \
 	rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \
@@ -27,6 +30,493 @@
 #define RTE_ACC100_PF_DEVICE_ID        (0x0d5c)
 #define RTE_ACC100_VF_DEVICE_ID        (0x0d5d)
 
+/* Define as 1 to use only a single FEC engine */
+#ifndef RTE_ACC100_SINGLE_FEC
+#define RTE_ACC100_SINGLE_FEC 0
+#endif
+
+/* Values used in filling in descriptors */
+#define ACC100_DMA_DESC_TYPE           2
+#define ACC100_DMA_CODE_BLK_MODE       0
+#define ACC100_DMA_BLKID_FCW           1
+#define ACC100_DMA_BLKID_IN            2
+#define ACC100_DMA_BLKID_OUT_ENC       1
+#define ACC100_DMA_BLKID_OUT_HARD      1
+#define ACC100_DMA_BLKID_OUT_SOFT      2
+#define ACC100_DMA_BLKID_OUT_HARQ      3
+#define ACC100_DMA_BLKID_IN_HARQ       3
+
+/* Values used in filling in decode FCWs */
+#define ACC100_FCW_TD_VER              1
+#define ACC100_FCW_TD_EXT_COLD_REG_EN  1
+#define ACC100_FCW_TD_AUTOMAP          0x0f
+#define ACC100_FCW_TD_RVIDX_0          2
+#define ACC100_FCW_TD_RVIDX_1          26
+#define ACC100_FCW_TD_RVIDX_2          50
+#define ACC100_FCW_TD_RVIDX_3          74
+
+/* Values used in writing to the registers */
+#define ACC100_REG_IRQ_EN_ALL          0x1FF83FF  /* Enable all interrupts */
+
+/* ACC100 Specific Dimensioning */
+#define ACC100_SIZE_64MBYTE            (64*1024*1024)
+/* Number of elements in an Info Ring */
+#define ACC100_INFO_RING_NUM_ENTRIES   1024
+/* Number of elements in HARQ layout memory */
+#define ACC100_HARQ_LAYOUT             (64*1024*1024)
+/* Assumed offset of the HARQ region in memory */
+#define ACC100_HARQ_OFFSET             (32*1024)
+/* Mask used to calculate an index in an Info Ring array (not a byte offset) */
+#define ACC100_INFO_RING_MASK          (ACC100_INFO_RING_NUM_ENTRIES-1)
+/* Number of Virtual Functions ACC100 supports */
+#define ACC100_NUM_VFS                  16
+#define ACC100_NUM_QGRPS                 8
+#define ACC100_NUM_QGRPS_PER_WORD        8
+#define ACC100_NUM_AQS                  16
+#define MAX_ENQ_BATCH_SIZE          255
+/* All ACC100 registers are aligned on 32-bit (4 byte) boundaries */
+#define BYTES_IN_WORD                 4
+#define MAX_E_MBUF                64000
+
+#define GRP_ID_SHIFT    10 /* Queue Index Hierarchy */
+#define VF_ID_SHIFT     4  /* Queue Index Hierarchy */
+#define VF_OFFSET_QOS   16 /* offset in Memory Space specific to QoS Mon */
+#define TMPL_PRI_0      0x03020100
+#define TMPL_PRI_1      0x07060504
+#define TMPL_PRI_2      0x0b0a0908
+#define TMPL_PRI_3      0x0f0e0d0c
+#define QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
+#define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
+
+#define ACC100_NUM_TMPL  32
+/* Mapping of signals for the available engines */
+#define SIG_UL_5G      0
+#define SIG_UL_5G_LAST 7
+#define SIG_DL_5G      13
+#define SIG_DL_5G_LAST 15
+#define SIG_UL_4G      16
+#define SIG_UL_4G_LAST 21
+#define SIG_DL_4G      27
+#define SIG_DL_4G_LAST 31
+
+/* max number of iterations to allocate memory block for all rings */
+#define SW_RING_MEM_ALLOC_ATTEMPTS 5
+#define MAX_QUEUE_DEPTH           1024
+#define ACC100_DMA_MAX_NUM_POINTERS  14
+#define ACC100_DMA_DESC_PADDING      8
+#define ACC100_FCW_PADDING           12
+#define ACC100_DESC_FCW_OFFSET       192
+#define ACC100_DESC_SIZE             256
+#define ACC100_DESC_OFFSET           (ACC100_DESC_SIZE / 64)
+#define ACC100_FCW_TE_BLEN     32
+#define ACC100_FCW_TD_BLEN     24
+#define ACC100_FCW_LE_BLEN     32
+#define ACC100_FCW_LD_BLEN     36
+
+#define ACC100_FCW_VER         2
+#define MUX_5GDL_DESC 6
+#define CMP_ENC_SIZE 20
+#define CMP_DEC_SIZE 24
+#define ENC_OFFSET (32)
+#define DEC_OFFSET (80)
+#define ACC100_EXT_MEM
+#define ACC100_HARQ_OFFSET_THRESHOLD 1024
+
+/* Constants from K0 computation from 3GPP 38.212 Table 5.4.2.1-2 */
+#define N_ZC_1 66 /* N = 66 Zc for BG 1 */
+#define N_ZC_2 50 /* N = 50 Zc for BG 2 */
+#define K0_1_1 17 /* K0 fraction numerator for rv 1 and BG 1 */
+#define K0_1_2 13 /* K0 fraction numerator for rv 1 and BG 2 */
+#define K0_2_1 33 /* K0 fraction numerator for rv 2 and BG 1 */
+#define K0_2_2 25 /* K0 fraction numerator for rv 2 and BG 2 */
+#define K0_3_1 56 /* K0 fraction numerator for rv 3 and BG 1 */
+#define K0_3_2 43 /* K0 fraction numerator for rv 3 and BG 2 */
+
+/* ACC100 Configuration */
+#define ACC100_DDR_ECC_ENABLE
+#define ACC100_CFG_DMA_ERROR 0x3D7
+#define ACC100_CFG_AXI_CACHE 0x11
+#define ACC100_CFG_QMGR_HI_P 0x0F0F
+#define ACC100_CFG_PCI_AXI 0xC003
+#define ACC100_CFG_PCI_BRIDGE 0x40006033
+#define ACC100_ENGINE_OFFSET 0x1000
+#define ACC100_RESET_HI 0x20100
+#define ACC100_RESET_LO 0x20000
+#define ACC100_RESET_HARD 0x1FF
+#define ACC100_ENGINES_MAX 9
+#define LONG_WAIT 1000
+
+/* ACC100 DMA Descriptor triplet */
+struct acc100_dma_triplet {
+	uint64_t address;
+	uint32_t blen:20,
+		res0:4,
+		last:1,
+		dma_ext:1,
+		res1:2,
+		blkid:4;
+} __rte_packed;
+
+/* ACC100 DMA Response Descriptor */
+union acc100_dma_rsp_desc {
+	uint32_t val;
+	struct {
+		uint32_t crc_status:1,
+			synd_ok:1,
+			dma_err:1,
+			neg_stop:1,
+			fcw_err:1,
+			output_err:1,
+			input_err:1,
+			timestampEn:1,
+			iterCountFrac:8,
+			iter_cnt:8,
+			rsrvd3:6,
+			sdone:1,
+			fdone:1;
+		uint32_t add_info_0;
+		uint32_t add_info_1;
+	};
+};
+
+
+/* ACC100 Queue Manager Enqueue PCI Register */
+union acc100_enqueue_reg_fmt {
+	uint32_t val;
+	struct {
+		uint32_t num_elem:8,
+			addr_offset:3,
+			rsrvd:1,
+			req_elem_addr:20;
+	};
+};
+
+/* FEC 4G Uplink Frame Control Word */
+struct __rte_packed acc100_fcw_td {
+	uint8_t fcw_ver:4,
+		num_maps:4; /* Unused */
+	uint8_t filler:6, /* Unused */
+		rsrvd0:1,
+		bypass_sb_deint:1;
+	uint16_t k_pos;
+	uint16_t k_neg; /* Unused */
+	uint8_t c_neg; /* Unused */
+	uint8_t c; /* Unused */
+	uint32_t ea; /* Unused */
+	uint32_t eb; /* Unused */
+	uint8_t cab; /* Unused */
+	uint8_t k0_start_col; /* Unused */
+	uint8_t rsrvd1;
+	uint8_t code_block_mode:1, /* Unused */
+		turbo_crc_type:1,
+		rsrvd2:3,
+		bypass_teq:1, /* Unused */
+		soft_output_en:1, /* Unused */
+		ext_td_cold_reg_en:1;
+	union { /* External Cold register */
+		uint32_t ext_td_cold_reg;
+		struct {
+			uint32_t min_iter:4, /* Unused */
+				max_iter:4,
+				ext_scale:5, /* Unused */
+				rsrvd3:3,
+				early_stop_en:1, /* Unused */
+				sw_soft_out_dis:1, /* Unused */
+				sw_et_cont:1, /* Unused */
+				sw_soft_out_saturation:1, /* Unused */
+				half_iter_on:1, /* Unused */
+				raw_decoder_input_on:1, /* Unused */
+				rsrvd4:10;
+		};
+	};
+};
+
+/* FEC 5GNR Uplink Frame Control Word */
+struct __rte_packed acc100_fcw_ld {
+	uint32_t FCWversion:4,
+		qm:4,
+		nfiller:11,
+		BG:1,
+		Zc:9,
+		res0:1,
+		synd_precoder:1,
+		synd_post:1;
+	uint32_t ncb:16,
+		k0:16;
+	uint32_t rm_e:24,
+		hcin_en:1,
+		hcout_en:1,
+		crc_select:1,
+		bypass_dec:1,
+		bypass_intlv:1,
+		so_en:1,
+		so_bypass_rm:1,
+		so_bypass_intlv:1;
+	uint32_t hcin_offset:16,
+		hcin_size0:16;
+	uint32_t hcin_size1:16,
+		hcin_decomp_mode:3,
+		llr_pack_mode:1,
+		hcout_comp_mode:3,
+		res2:1,
+		dec_convllr:4,
+		hcout_convllr:4;
+	uint32_t itmax:7,
+		itstop:1,
+		so_it:7,
+		res3:1,
+		hcout_offset:16;
+	uint32_t hcout_size0:16,
+		hcout_size1:16;
+	uint32_t gain_i:8,
+		gain_h:8,
+		negstop_th:16;
+	uint32_t negstop_it:7,
+		negstop_en:1,
+		res4:24;
+};
+
+/* FEC 4G Downlink Frame Control Word */
+struct __rte_packed acc100_fcw_te {
+	uint16_t k_neg;
+	uint16_t k_pos;
+	uint8_t c_neg;
+	uint8_t c;
+	uint8_t filler;
+	uint8_t cab;
+	uint32_t ea:17,
+		rsrvd0:15;
+	uint32_t eb:17,
+		rsrvd1:15;
+	uint16_t ncb_neg;
+	uint16_t ncb_pos;
+	uint8_t rv_idx0:2,
+		rsrvd2:2,
+		rv_idx1:2,
+		rsrvd3:2;
+	uint8_t bypass_rv_idx0:1,
+		bypass_rv_idx1:1,
+		bypass_rm:1,
+		rsrvd4:5;
+	uint8_t rsrvd5:1,
+		rsrvd6:3,
+		code_block_crc:1,
+		rsrvd7:3;
+	uint8_t code_block_mode:1,
+		rsrvd8:7;
+	uint64_t rsrvd9;
+};
+
+/* FEC 5GNR Downlink Frame Control Word */
+struct __rte_packed acc100_fcw_le {
+	uint32_t FCWversion:4,
+		qm:4,
+		nfiller:11,
+		BG:1,
+		Zc:9,
+		res0:3;
+	uint32_t ncb:16,
+		k0:16;
+	uint32_t rm_e:24,
+		res1:2,
+		crc_select:1,
+		res2:1,
+		bypass_intlv:1,
+		res3:3;
+	uint32_t res4_a:12,
+		mcb_count:3,
+		res4_b:17;
+	uint32_t res5;
+	uint32_t res6;
+	uint32_t res7;
+	uint32_t res8;
+};
+
+/* ACC100 DMA Request Descriptor */
+struct __rte_packed acc100_dma_req_desc {
+	union {
+		struct{
+			uint32_t type:4,
+				rsrvd0:26,
+				sdone:1,
+				fdone:1;
+			uint32_t rsrvd1;
+			uint32_t rsrvd2;
+			uint32_t pass_param:8,
+				sdone_enable:1,
+				irq_enable:1,
+				timeStampEn:1,
+				res0:5,
+				numCBs:4,
+				res1:4,
+				m2dlen:4,
+				d2mlen:4;
+		};
+		struct{
+			uint32_t word0;
+			uint32_t word1;
+			uint32_t word2;
+			uint32_t word3;
+		};
+	};
+	struct acc100_dma_triplet data_ptrs[ACC100_DMA_MAX_NUM_POINTERS];
+
+	/* Virtual addresses used to retrieve SW context info */
+	union {
+		void *op_addr;
+		uint64_t pad1;  /* pad to 64 bits */
+	};
+	/*
+	 * Stores additional information needed for driver processing:
+	 * - last_desc_in_batch - flag used to mark last descriptor (CB)
+	 *                        in batch
+	 * - cbs_in_tb - stores information about total number of Code Blocks
+	 *               in currently processed Transport Block
+	 */
+	union {
+		struct {
+			union {
+				struct acc100_fcw_ld fcw_ld;
+				struct acc100_fcw_td fcw_td;
+				struct acc100_fcw_le fcw_le;
+				struct acc100_fcw_te fcw_te;
+				uint32_t pad2[ACC100_FCW_PADDING];
+			};
+			uint32_t last_desc_in_batch :8,
+				cbs_in_tb:8,
+				pad4 : 16;
+		};
+		uint64_t pad3[ACC100_DMA_DESC_PADDING]; /* pad to 64 bits */
+	};
+};
+
+/* ACC100 DMA Descriptor */
+union acc100_dma_desc {
+	struct acc100_dma_req_desc req;
+	union acc100_dma_rsp_desc rsp;
+};
+
+
+/* Union describing HARQ layout entry */
+union acc100_harq_layout_data {
+	uint32_t val;
+	struct {
+		uint16_t offset;
+		uint16_t size0;
+	};
+} __rte_packed;
+
+
+/* Union describing Info Ring entry */
+union acc100_info_ring_data {
+	uint32_t val;
+	struct {
+		union {
+			uint16_t detailed_info;
+			struct {
+				uint16_t aq_id: 4;
+				uint16_t qg_id: 4;
+				uint16_t vf_id: 6;
+				uint16_t reserved: 2;
+			};
+		};
+		uint16_t int_nb: 7;
+		uint16_t msi_0: 1;
+		uint16_t vf2pf: 6;
+		uint16_t loop: 1;
+		uint16_t valid: 1;
+	};
+} __rte_packed;
+
+struct acc100_registry_addr {
+	unsigned int dma_ring_dl5g_hi;
+	unsigned int dma_ring_dl5g_lo;
+	unsigned int dma_ring_ul5g_hi;
+	unsigned int dma_ring_ul5g_lo;
+	unsigned int dma_ring_dl4g_hi;
+	unsigned int dma_ring_dl4g_lo;
+	unsigned int dma_ring_ul4g_hi;
+	unsigned int dma_ring_ul4g_lo;
+	unsigned int ring_size;
+	unsigned int info_ring_hi;
+	unsigned int info_ring_lo;
+	unsigned int info_ring_en;
+	unsigned int info_ring_ptr;
+	unsigned int tail_ptrs_dl5g_hi;
+	unsigned int tail_ptrs_dl5g_lo;
+	unsigned int tail_ptrs_ul5g_hi;
+	unsigned int tail_ptrs_ul5g_lo;
+	unsigned int tail_ptrs_dl4g_hi;
+	unsigned int tail_ptrs_dl4g_lo;
+	unsigned int tail_ptrs_ul4g_hi;
+	unsigned int tail_ptrs_ul4g_lo;
+	unsigned int depth_log0_offset;
+	unsigned int depth_log1_offset;
+	unsigned int qman_group_func;
+	unsigned int ddr_range;
+};
+
+/* Structure holding registry addresses for PF */
+static const struct acc100_registry_addr pf_reg_addr = {
+	.dma_ring_dl5g_hi = HWPfDmaFec5GdlDescBaseHiRegVf,
+	.dma_ring_dl5g_lo = HWPfDmaFec5GdlDescBaseLoRegVf,
+	.dma_ring_ul5g_hi = HWPfDmaFec5GulDescBaseHiRegVf,
+	.dma_ring_ul5g_lo = HWPfDmaFec5GulDescBaseLoRegVf,
+	.dma_ring_dl4g_hi = HWPfDmaFec4GdlDescBaseHiRegVf,
+	.dma_ring_dl4g_lo = HWPfDmaFec4GdlDescBaseLoRegVf,
+	.dma_ring_ul4g_hi = HWPfDmaFec4GulDescBaseHiRegVf,
+	.dma_ring_ul4g_lo = HWPfDmaFec4GulDescBaseLoRegVf,
+	.ring_size = HWPfQmgrRingSizeVf,
+	.info_ring_hi = HWPfHiInfoRingBaseHiRegPf,
+	.info_ring_lo = HWPfHiInfoRingBaseLoRegPf,
+	.info_ring_en = HWPfHiInfoRingIntWrEnRegPf,
+	.info_ring_ptr = HWPfHiInfoRingPointerRegPf,
+	.tail_ptrs_dl5g_hi = HWPfDmaFec5GdlRespPtrHiRegVf,
+	.tail_ptrs_dl5g_lo = HWPfDmaFec5GdlRespPtrLoRegVf,
+	.tail_ptrs_ul5g_hi = HWPfDmaFec5GulRespPtrHiRegVf,
+	.tail_ptrs_ul5g_lo = HWPfDmaFec5GulRespPtrLoRegVf,
+	.tail_ptrs_dl4g_hi = HWPfDmaFec4GdlRespPtrHiRegVf,
+	.tail_ptrs_dl4g_lo = HWPfDmaFec4GdlRespPtrLoRegVf,
+	.tail_ptrs_ul4g_hi = HWPfDmaFec4GulRespPtrHiRegVf,
+	.tail_ptrs_ul4g_lo = HWPfDmaFec4GulRespPtrLoRegVf,
+	.depth_log0_offset = HWPfQmgrGrpDepthLog20Vf,
+	.depth_log1_offset = HWPfQmgrGrpDepthLog21Vf,
+	.qman_group_func = HWPfQmgrGrpFunction0,
+	.ddr_range = HWPfDmaVfDdrBaseRw,
+};
+
+/* Structure holding registry addresses for VF */
+static const struct acc100_registry_addr vf_reg_addr = {
+	.dma_ring_dl5g_hi = HWVfDmaFec5GdlDescBaseHiRegVf,
+	.dma_ring_dl5g_lo = HWVfDmaFec5GdlDescBaseLoRegVf,
+	.dma_ring_ul5g_hi = HWVfDmaFec5GulDescBaseHiRegVf,
+	.dma_ring_ul5g_lo = HWVfDmaFec5GulDescBaseLoRegVf,
+	.dma_ring_dl4g_hi = HWVfDmaFec4GdlDescBaseHiRegVf,
+	.dma_ring_dl4g_lo = HWVfDmaFec4GdlDescBaseLoRegVf,
+	.dma_ring_ul4g_hi = HWVfDmaFec4GulDescBaseHiRegVf,
+	.dma_ring_ul4g_lo = HWVfDmaFec4GulDescBaseLoRegVf,
+	.ring_size = HWVfQmgrRingSizeVf,
+	.info_ring_hi = HWVfHiInfoRingBaseHiVf,
+	.info_ring_lo = HWVfHiInfoRingBaseLoVf,
+	.info_ring_en = HWVfHiInfoRingIntWrEnVf,
+	.info_ring_ptr = HWVfHiInfoRingPointerVf,
+	.tail_ptrs_dl5g_hi = HWVfDmaFec5GdlRespPtrHiRegVf,
+	.tail_ptrs_dl5g_lo = HWVfDmaFec5GdlRespPtrLoRegVf,
+	.tail_ptrs_ul5g_hi = HWVfDmaFec5GulRespPtrHiRegVf,
+	.tail_ptrs_ul5g_lo = HWVfDmaFec5GulRespPtrLoRegVf,
+	.tail_ptrs_dl4g_hi = HWVfDmaFec4GdlRespPtrHiRegVf,
+	.tail_ptrs_dl4g_lo = HWVfDmaFec4GdlRespPtrLoRegVf,
+	.tail_ptrs_ul4g_hi = HWVfDmaFec4GulRespPtrHiRegVf,
+	.tail_ptrs_ul4g_lo = HWVfDmaFec4GulRespPtrLoRegVf,
+	.depth_log0_offset = HWVfQmgrGrpDepthLog20Vf,
+	.depth_log1_offset = HWVfQmgrGrpDepthLog21Vf,
+	.qman_group_func = HWVfQmgrGrpFunction0Vf,
+	.ddr_range = HWVfDmaDdrBaseRangeRoVf,
+};
+
 /* Private data structure for each ACC100 device */
 struct acc100_device {
 	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread
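The K0 fraction numerators defined in the header above come straight from the circular-buffer starting positions in 3GPP 38.212 Table 5.4.2.1-2. A minimal standalone sketch of how k0 can be derived from them (the `get_k0` helper is hypothetical and not part of this patch; the constants are mirrored from the header):

```c
#include <stdint.h>

/* Constants mirrored from the patch (3GPP 38.212 Table 5.4.2.1-2) */
#define N_ZC_1 66 /* N = 66 Zc for BG 1 */
#define N_ZC_2 50 /* N = 50 Zc for BG 2 */
#define K0_1_1 17
#define K0_1_2 13
#define K0_2_1 33
#define K0_2_2 25
#define K0_3_1 56
#define K0_3_2 43

/* Numerators indexed by [basegraph - 1][rv_index - 1] */
static const uint16_t k0_num[2][3] = {
	{K0_1_1, K0_2_1, K0_3_1}, /* BG 1 */
	{K0_1_2, K0_2_2, K0_3_2}, /* BG 2 */
};

/* k0 = floor(num * Ncb / (N_zc * Zc)) * Zc, with k0 = 0 for rv_index 0 */
static uint16_t
get_k0(uint16_t ncb, uint16_t zc, uint8_t basegraph, uint8_t rv_index)
{
	uint16_t n_zc;

	if (rv_index == 0)
		return 0;
	n_zc = (basegraph == 1) ? N_ZC_1 : N_ZC_2;
	return ((uint32_t)k0_num[basegraph - 1][rv_index - 1] * ncb /
			(n_zc * zc)) * zc;
}
```

For a full circular buffer (Ncb = N), this reduces to k0 = numerator * Zc, e.g. 17 * Zc for rv 1 on BG 1.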

* [dpdk-dev] [PATCH v7 03/11] baseband/acc100: add info get function
  2020-09-23  2:24   ` [dpdk-dev] [PATCH v7 00/11] bbdev PMD ACC100 Nicolas Chautru
  2020-09-23  2:24     ` [dpdk-dev] [PATCH v7 01/11] drivers/baseband: add PMD for ACC100 Nicolas Chautru
  2020-09-23  2:24     ` [dpdk-dev] [PATCH v7 02/11] baseband/acc100: add register definition file Nicolas Chautru
@ 2020-09-23  2:24     ` Nicolas Chautru
  2020-09-23  2:24     ` [dpdk-dev] [PATCH v7 04/11] baseband/acc100: add queue configuration Nicolas Chautru
                       ` (7 subsequent siblings)
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-23  2:24 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add the "info_get" function to the driver so that the
device can be queried.
No processing capabilities are exposed yet.
Link bbdev-test against the PMD to support it with a null
capability list.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 app/test-bbdev/meson.build               |   3 +
 drivers/baseband/acc100/rte_acc100_cfg.h |  96 +++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.c | 225 +++++++++++++++++++++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.h |   3 +
 4 files changed, 327 insertions(+)
 create mode 100644 drivers/baseband/acc100/rte_acc100_cfg.h

diff --git a/app/test-bbdev/meson.build b/app/test-bbdev/meson.build
index 18ab6a8..fbd8ae3 100644
--- a/app/test-bbdev/meson.build
+++ b/app/test-bbdev/meson.build
@@ -12,3 +12,6 @@ endif
 if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC')
 	deps += ['pmd_bbdev_fpga_5gnr_fec']
 endif
+if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_ACC100')
+	deps += ['pmd_bbdev_acc100']
+endif
\ No newline at end of file
diff --git a/drivers/baseband/acc100/rte_acc100_cfg.h b/drivers/baseband/acc100/rte_acc100_cfg.h
new file mode 100644
index 0000000..73bbe36
--- /dev/null
+++ b/drivers/baseband/acc100/rte_acc100_cfg.h
@@ -0,0 +1,96 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_ACC100_CFG_H_
+#define _RTE_ACC100_CFG_H_
+
+/**
+ * @file rte_acc100_cfg.h
+ *
+ * Functions for configuring ACC100 HW, exposed directly to applications.
+ * Configuration related to encoding/decoding is done through the
+ * librte_bbdev library.
+ *
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ */
+
+#include <stdint.h>
+#include <stdbool.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+/** Number of Virtual Functions ACC100 supports */
+#define RTE_ACC100_NUM_VFS 16
+
+/**
+ * Definition of Queue Topology for ACC100 Configuration
+ * Some level of details is abstracted out to expose a clean interface
+ * given that comprehensive flexibility is not required
+ */
+struct rte_q_topology_t {
+	/** Number of QGroups in incremental order of priority */
+	uint16_t num_qgroups;
+	/**
+	 * All QGroups have the same number of AQs here.
+	 * Note : Could be made a 16-array if more flexibility is really
+	 * required
+	 */
+	uint16_t num_aqs_per_groups;
+	/**
+	 * Depth of the AQs is the same for all QGroups here, expressed as
+	 * log2 (actual depth is 2^N).
+	 * Note : Could be made a 16-array if more flexibility is really
+	 * required
+	 */
+	uint16_t aq_depth_log2;
+	/**
+	 * Index of the first Queue Group - assuming contiguity
+	 * Initialized as -1
+	 */
+	int8_t first_qgroup_index;
+};
+
+/**
+ * Definition of Arbitration related parameters for ACC100 Configuration
+ */
+struct rte_arbitration_t {
+	/** Default Weight for VF Fairness Arbitration */
+	uint16_t round_robin_weight;
+	uint32_t gbr_threshold1; /**< Guaranteed Bitrate Threshold 1 */
+	uint32_t gbr_threshold2; /**< Guaranteed Bitrate Threshold 2 */
+};
+
+/**
+ * Structure to pass ACC100 configuration.
+ * Note: all VF Bundles will have the same configuration.
+ */
+struct acc100_conf {
+	bool pf_mode_en; /**< 1 if PF is used for dataplane, 0 for VFs */
+	/** 1 if input '1' bit is represented by a positive LLR value, 0 if '1'
+	 * bit is represented by a negative value.
+	 */
+	bool input_pos_llr_1_bit;
+	/** 1 if output '1' bit is represented by a positive value, 0 if '1'
+	 * bit is represented by a negative value.
+	 */
+	bool output_pos_llr_1_bit;
+	uint16_t num_vf_bundles; /**< Number of VF bundles to setup */
+	/** Queue topology for each operation type */
+	struct rte_q_topology_t q_ul_4g;
+	struct rte_q_topology_t q_dl_4g;
+	struct rte_q_topology_t q_ul_5g;
+	struct rte_q_topology_t q_dl_5g;
+	/** Arbitration configuration for each operation type */
+	struct rte_arbitration_t arb_ul_4g[RTE_ACC100_NUM_VFS];
+	struct rte_arbitration_t arb_dl_4g[RTE_ACC100_NUM_VFS];
+	struct rte_arbitration_t arb_ul_5g[RTE_ACC100_NUM_VFS];
+	struct rte_arbitration_t arb_dl_5g[RTE_ACC100_NUM_VFS];
+};
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_ACC100_CFG_H_ */
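As a usage note for the topology above: the total number of queues a device exposes is the sum, over the four operation types, of `num_qgroups * num_aqs_per_groups` (this is how `max_num_queues` is derived from the fetched configuration in the PMD code of this same patch). A standalone sketch, with only the relevant fields mirrored locally for illustration:

```c
#include <stdint.h>

/* Local mirror of the fields from struct rte_q_topology_t that matter
 * for the queue count; illustration only.
 */
struct q_top {
	uint16_t num_qgroups;
	uint16_t num_aqs_per_groups;
};

/* Sum of atomic queues across the UL/DL 4G/5G topologies, as reported
 * in rte_bbdev_driver_info.max_num_queues.
 */
static uint32_t
total_num_queues(const struct q_top t[4])
{
	uint32_t n = 0;
	int i;

	for (i = 0; i < 4; i++)
		n += (uint32_t)t[i].num_qgroups * t[i].num_aqs_per_groups;
	return n;
}
```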
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 1b4cd13..7807a30 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -26,6 +26,184 @@
 RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
 #endif
 
+/* Read a register of a ACC100 device */
+static inline uint32_t
+acc100_reg_read(struct acc100_device *d, uint32_t offset)
+{
+	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
+	uint32_t ret = *((volatile uint32_t *)(reg_addr));
+	return rte_le_to_cpu_32(ret);
+}
+
+/* Calculate the offset of the enqueue register */
+static inline uint32_t
+queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
+{
+	if (pf_device)
+		return ((vf_id << 12) + (qgrp_id << 7) + (aq_id << 3) +
+				HWPfQmgrIngressAq);
+	else
+		return ((qgrp_id << 7) + (aq_id << 3) +
+				HWVfQmgrIngressAq);
+}
+
+enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC};
+
+/* Return the queue topology for a given accelerator enum */
+static inline void
+qtopFromAcc(struct rte_q_topology_t **qtop, int acc_enum,
+		struct acc100_conf *acc100_conf)
+{
+	struct rte_q_topology_t *p_qtop;
+	p_qtop = NULL;
+	switch (acc_enum) {
+	case UL_4G:
+		p_qtop = &(acc100_conf->q_ul_4g);
+		break;
+	case UL_5G:
+		p_qtop = &(acc100_conf->q_ul_5g);
+		break;
+	case DL_4G:
+		p_qtop = &(acc100_conf->q_dl_4g);
+		break;
+	case DL_5G:
+		p_qtop = &(acc100_conf->q_dl_5g);
+		break;
+	default:
+		/* NOTREACHED */
+		rte_bbdev_log(ERR, "Unexpected error evaluating qtopFromAcc");
+		break;
+	}
+	*qtop = p_qtop;
+}
+
+static void
+initQTop(struct acc100_conf *acc100_conf)
+{
+	acc100_conf->q_ul_4g.num_aqs_per_groups = 0;
+	acc100_conf->q_ul_4g.num_qgroups = 0;
+	acc100_conf->q_ul_4g.first_qgroup_index = -1;
+	acc100_conf->q_ul_5g.num_aqs_per_groups = 0;
+	acc100_conf->q_ul_5g.num_qgroups = 0;
+	acc100_conf->q_ul_5g.first_qgroup_index = -1;
+	acc100_conf->q_dl_4g.num_aqs_per_groups = 0;
+	acc100_conf->q_dl_4g.num_qgroups = 0;
+	acc100_conf->q_dl_4g.first_qgroup_index = -1;
+	acc100_conf->q_dl_5g.num_aqs_per_groups = 0;
+	acc100_conf->q_dl_5g.num_qgroups = 0;
+	acc100_conf->q_dl_5g.first_qgroup_index = -1;
+}
+
+static inline void
+updateQtop(uint8_t acc, uint8_t qg, struct acc100_conf *acc100_conf,
+		struct acc100_device *d)
+{
+	uint32_t reg;
+	struct rte_q_topology_t *q_top = NULL;
+	qtopFromAcc(&q_top, acc, acc100_conf);
+	if (unlikely(q_top == NULL))
+		return;
+	uint16_t aq;
+	q_top->num_qgroups++;
+	if (q_top->first_qgroup_index == -1) {
+		q_top->first_qgroup_index = qg;
+		/* Can be optimized to assume all are enabled by default */
+		reg = acc100_reg_read(d, queue_offset(d->pf_device,
+				0, qg, ACC100_NUM_AQS - 1));
+		if (reg & QUEUE_ENABLE) {
+			q_top->num_aqs_per_groups = ACC100_NUM_AQS;
+			return;
+		}
+		q_top->num_aqs_per_groups = 0;
+		for (aq = 0; aq < ACC100_NUM_AQS; aq++) {
+			reg = acc100_reg_read(d, queue_offset(d->pf_device,
+					0, qg, aq));
+			if (reg & QUEUE_ENABLE)
+				q_top->num_aqs_per_groups++;
+		}
+	}
+}
+
+/* Fetch configuration enabled for the PF/VF using MMIO Read (slow) */
+static inline void
+fetch_acc100_config(struct rte_bbdev *dev)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	struct acc100_conf *acc100_conf = &d->acc100_conf;
+	const struct acc100_registry_addr *reg_addr;
+	uint8_t acc, qg;
+	uint32_t reg, reg_aq, reg_len0, reg_len1;
+	uint32_t reg_mode;
+
+	/* No need to retrieve the configuration if it is already done */
+	if (d->configured)
+		return;
+
+	/* Choose correct registry addresses for the device type */
+	if (d->pf_device)
+		reg_addr = &pf_reg_addr;
+	else
+		reg_addr = &vf_reg_addr;
+
+	d->ddr_size = (1 + acc100_reg_read(d, reg_addr->ddr_range)) << 10;
+
+	/* Single VF Bundle by VF */
+	acc100_conf->num_vf_bundles = 1;
+	initQTop(acc100_conf);
+
+	struct rte_q_topology_t *q_top = NULL;
+	int qman_func_id[5] = {0, 2, 1, 3, 4};
+	reg = acc100_reg_read(d, reg_addr->qman_group_func);
+	for (qg = 0; qg < ACC100_NUM_QGRPS_PER_WORD; qg++) {
+		reg_aq = acc100_reg_read(d,
+				queue_offset(d->pf_device, 0, qg, 0));
+		if (reg_aq & QUEUE_ENABLE) {
+			acc = qman_func_id[(reg >> (qg * 4)) & 0x7];
+			updateQtop(acc, qg, acc100_conf, d);
+		}
+	}
+
+	/* Check the depth of the AQs */
+	reg_len0 = acc100_reg_read(d, reg_addr->depth_log0_offset);
+	reg_len1 = acc100_reg_read(d, reg_addr->depth_log1_offset);
+	for (acc = 0; acc < NUM_ACC; acc++) {
+		qtopFromAcc(&q_top, acc, acc100_conf);
+		if (q_top->first_qgroup_index < ACC100_NUM_QGRPS_PER_WORD)
+			q_top->aq_depth_log2 = (reg_len0 >>
+					(q_top->first_qgroup_index * 4))
+					& 0xF;
+		else
+			q_top->aq_depth_log2 = (reg_len1 >>
+					((q_top->first_qgroup_index -
+					ACC100_NUM_QGRPS_PER_WORD) * 4))
+					& 0xF;
+	}
+
+	/* Read PF mode */
+	if (d->pf_device) {
+		reg_mode = acc100_reg_read(d, HWPfHiPfMode);
+		acc100_conf->pf_mode_en = (reg_mode == 2) ? 1 : 0;
+	}
+
+	rte_bbdev_log_debug(
+			"%s Config LLR SIGN IN/OUT %s %s QG %u %u %u %u AQ %u %u %u %u Len %u %u %u %u\n",
+			(d->pf_device) ? "PF" : "VF",
+			(acc100_conf->input_pos_llr_1_bit) ? "POS" : "NEG",
+			(acc100_conf->output_pos_llr_1_bit) ? "POS" : "NEG",
+			acc100_conf->q_ul_4g.num_qgroups,
+			acc100_conf->q_dl_4g.num_qgroups,
+			acc100_conf->q_ul_5g.num_qgroups,
+			acc100_conf->q_dl_5g.num_qgroups,
+			acc100_conf->q_ul_4g.num_aqs_per_groups,
+			acc100_conf->q_dl_4g.num_aqs_per_groups,
+			acc100_conf->q_ul_5g.num_aqs_per_groups,
+			acc100_conf->q_dl_5g.num_aqs_per_groups,
+			acc100_conf->q_ul_4g.aq_depth_log2,
+			acc100_conf->q_dl_4g.aq_depth_log2,
+			acc100_conf->q_ul_5g.aq_depth_log2,
+			acc100_conf->q_dl_5g.aq_depth_log2);
+}
+
 /* Free 64MB memory used for software rings */
 static int
 acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
@@ -33,8 +211,55 @@
 	return 0;
 }
 
+/* Get ACC100 device info */
+static void
+acc100_dev_info_get(struct rte_bbdev *dev,
+		struct rte_bbdev_driver_info *dev_info)
+{
+	struct acc100_device *d = dev->data->dev_private;
+
+	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
+		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
+	};
+
+	static struct rte_bbdev_queue_conf default_queue_conf;
+	default_queue_conf.socket = dev->data->socket_id;
+	default_queue_conf.queue_size = MAX_QUEUE_DEPTH;
+
+	dev_info->driver_name = dev->device->driver->name;
+
+	/* Read and save the populated config from ACC100 registers */
+	fetch_acc100_config(dev);
+
+	/* This isn't ideal because it reports the maximum number of queues but
+	 * does not provide info on how many can be uplink/downlink or different
+	 * priorities
+	 */
+	dev_info->max_num_queues =
+			d->acc100_conf.q_dl_5g.num_aqs_per_groups *
+			d->acc100_conf.q_dl_5g.num_qgroups +
+			d->acc100_conf.q_ul_5g.num_aqs_per_groups *
+			d->acc100_conf.q_ul_5g.num_qgroups +
+			d->acc100_conf.q_dl_4g.num_aqs_per_groups *
+			d->acc100_conf.q_dl_4g.num_qgroups +
+			d->acc100_conf.q_ul_4g.num_aqs_per_groups *
+			d->acc100_conf.q_ul_4g.num_qgroups;
+	dev_info->queue_size_lim = MAX_QUEUE_DEPTH;
+	dev_info->hardware_accelerated = true;
+	dev_info->max_dl_queue_priority =
+			d->acc100_conf.q_dl_4g.num_qgroups - 1;
+	dev_info->max_ul_queue_priority =
+			d->acc100_conf.q_ul_4g.num_qgroups - 1;
+	dev_info->default_queue_conf = default_queue_conf;
+	dev_info->cpu_flag_reqs = NULL;
+	dev_info->min_alignment = 64;
+	dev_info->capabilities = bbdev_capabilities;
+	dev_info->harq_buffer_size = d->ddr_size;
+}
+
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.close = acc100_dev_close,
+	.info_get = acc100_dev_info_get,
 };
 
 /* ACC100 PCI PF address map */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index cd77570..662e2c8 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -7,6 +7,7 @@
 
 #include "acc100_pf_enum.h"
 #include "acc100_vf_enum.h"
+#include "rte_acc100_cfg.h"
 
 /* Helper macro for logging */
 #define rte_bbdev_log(level, fmt, ...) \
@@ -520,6 +521,8 @@ struct acc100_registry_addr {
 /* Private data structure for each ACC100 device */
 struct acc100_device {
 	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
+	uint32_t ddr_size; /* Size in kB */
+	struct acc100_conf acc100_conf; /* ACC100 Initial configuration */
 	bool pf_device; /**< True if this is a PF ACC100 device */
 	bool configured; /**< True if this ACC100 device is configured */
 };
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread
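The `queue_offset` helper introduced in this patch encodes the Queue Manager MMIO layout: each VF gets a 4 kB window, each queue group a 128 B sub-window, and each atomic queue an 8 B doorbell slot. A standalone mirror of that arithmetic (the register base values below are placeholders, not the real `HWPfQmgrIngressAq`/`HWVfQmgrIngressAq` enum values):

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder register bases; the real values come from
 * acc100_pf_enum.h and acc100_vf_enum.h.
 */
#define PF_QMGR_INGRESS_AQ 0x100000u
#define VF_QMGR_INGRESS_AQ 0x0u

/* Mirror of the driver's queue_offset(): vf_id selects a 4 kB window,
 * qgrp_id a 128 B sub-window, aq_id an 8 B doorbell slot.
 */
static uint32_t
queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
{
	if (pf_device)
		return (vf_id << 12) + (qgrp_id << 7) + (aq_id << 3) +
				PF_QMGR_INGRESS_AQ;
	return (qgrp_id << 7) + (aq_id << 3) + VF_QMGR_INGRESS_AQ;
}
```

Note this MMIO addressing hierarchy is distinct from the logical queue-index hierarchy built with `GRP_ID_SHIFT`/`VF_ID_SHIFT` in the header file.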

* [dpdk-dev] [PATCH v7 04/11] baseband/acc100: add queue configuration
  2020-09-23  2:24   ` [dpdk-dev] [PATCH v7 00/11] bbdev PMD ACC100 Nicolas Chautru
                       ` (2 preceding siblings ...)
  2020-09-23  2:24     ` [dpdk-dev] [PATCH v7 03/11] baseband/acc100: add info get function Nicolas Chautru
@ 2020-09-23  2:24     ` Nicolas Chautru
  2020-09-23  2:25     ` [dpdk-dev] [PATCH v7 05/11] baseband/acc100: add LDPC processing functions Nicolas Chautru
                       ` (6 subsequent siblings)
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-23  2:24 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add functions to create and configure queues for the
device. Still no processing capability exposed.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 420 ++++++++++++++++++++++++++++++-
 drivers/baseband/acc100/rte_acc100_pmd.h |  45 ++++
 2 files changed, 464 insertions(+), 1 deletion(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 7807a30..7a21c57 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -26,6 +26,22 @@
 RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
 #endif
 
+/* Write to MMIO register address */
+static inline void
+mmio_write(void *addr, uint32_t value)
+{
+	*((volatile uint32_t *)(addr)) = rte_cpu_to_le_32(value);
+}
+
+/* Write a register of a ACC100 device */
+static inline void
+acc100_reg_write(struct acc100_device *d, uint32_t offset, uint32_t payload)
+{
+	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
+	mmio_write(reg_addr, payload);
+	usleep(1000);
+}
+
 /* Read a register of a ACC100 device */
 static inline uint32_t
 acc100_reg_read(struct acc100_device *d, uint32_t offset)
@@ -36,6 +52,22 @@
 	return rte_le_to_cpu_32(ret);
 }
 
+/* Basic implementation of log2 for exact powers of two */
+static inline uint32_t
+log2_basic(uint32_t value)
+{
+	return (value == 0) ? 0 : __builtin_ctz(value);
+}
+
+/* Calculate memory alignment offset assuming alignment is 2^N */
+static inline uint32_t
+calc_mem_alignment_offset(void *unaligned_virt_mem, uint32_t alignment)
+{
+	rte_iova_t unaligned_phy_mem = rte_malloc_virt2iova(unaligned_virt_mem);
+	return (uint32_t)(alignment -
+			(unaligned_phy_mem & (alignment-1)));
+}
+
 /* Calculate the offset of the enqueue register */
 static inline uint32_t
 queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
@@ -204,10 +236,393 @@
 			acc100_conf->q_dl_5g.aq_depth_log2);
 }
 
+static void
+free_base_addresses(void **base_addrs, int size)
+{
+	int i;
+	for (i = 0; i < size; i++)
+		rte_free(base_addrs[i]);
+}
+
+static inline uint32_t
+get_desc_len(void)
+{
+	return sizeof(union acc100_dma_desc);
+}
+
+/* Allocate the 2 * 64MB block for the sw rings */
+static int
+alloc_2x64mb_sw_rings_mem(struct rte_bbdev *dev, struct acc100_device *d,
+		int socket)
+{
+	uint32_t sw_ring_size = ACC100_SIZE_64MBYTE;
+	d->sw_rings_base = rte_zmalloc_socket(dev->device->driver->name,
+			2 * sw_ring_size, RTE_CACHE_LINE_SIZE, socket);
+	if (d->sw_rings_base == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate memory for %s:%u",
+				dev->device->driver->name,
+				dev->data->dev_id);
+		return -ENOMEM;
+	}
+	memset(d->sw_rings_base, 0, ACC100_SIZE_64MBYTE);
+	uint32_t next_64mb_align_offset = calc_mem_alignment_offset(
+			d->sw_rings_base, ACC100_SIZE_64MBYTE);
+	d->sw_rings = RTE_PTR_ADD(d->sw_rings_base, next_64mb_align_offset);
+	d->sw_rings_phys = rte_malloc_virt2iova(d->sw_rings_base) +
+			next_64mb_align_offset;
+	d->sw_ring_size = MAX_QUEUE_DEPTH * get_desc_len();
+	d->sw_ring_max_depth = d->sw_ring_size / get_desc_len();
+
+	return 0;
+}
+
+/* Attempt to allocate minimised memory space for sw rings */
+static void
+alloc_sw_rings_min_mem(struct rte_bbdev *dev, struct acc100_device *d,
+		uint16_t num_queues, int socket)
+{
+	rte_iova_t sw_rings_base_phy, next_64mb_align_addr_phy;
+	uint32_t next_64mb_align_offset;
+	rte_iova_t sw_ring_phys_end_addr;
+	void *base_addrs[SW_RING_MEM_ALLOC_ATTEMPTS];
+	void *sw_rings_base;
+	int i = 0;
+	uint32_t q_sw_ring_size = MAX_QUEUE_DEPTH * get_desc_len();
+	uint32_t dev_sw_ring_size = q_sw_ring_size * num_queues;
+
+	/* Find an aligned block of memory to store sw rings */
+	while (i < SW_RING_MEM_ALLOC_ATTEMPTS) {
+		/*
+		 * sw_ring allocated memory is guaranteed to be aligned to
+		 * q_sw_ring_size at the condition that the requested size is
+		 * less than the page size
+		 */
+		sw_rings_base = rte_zmalloc_socket(
+				dev->device->driver->name,
+				dev_sw_ring_size, q_sw_ring_size, socket);
+
+		if (sw_rings_base == NULL) {
+			rte_bbdev_log(ERR,
+					"Failed to allocate memory for %s:%u",
+					dev->device->driver->name,
+					dev->data->dev_id);
+			break;
+		}
+
+		sw_rings_base_phy = rte_malloc_virt2iova(sw_rings_base);
+		next_64mb_align_offset = calc_mem_alignment_offset(
+				sw_rings_base, ACC100_SIZE_64MBYTE);
+		next_64mb_align_addr_phy = sw_rings_base_phy +
+				next_64mb_align_offset;
+		sw_ring_phys_end_addr = sw_rings_base_phy + dev_sw_ring_size;
+
+		/* Check if the end of the sw ring memory block is before the
+		 * start of next 64MB aligned mem address
+		 */
+		if (sw_ring_phys_end_addr < next_64mb_align_addr_phy) {
+			d->sw_rings_phys = sw_rings_base_phy;
+			d->sw_rings = sw_rings_base;
+			d->sw_rings_base = sw_rings_base;
+			d->sw_ring_size = q_sw_ring_size;
+			d->sw_ring_max_depth = MAX_QUEUE_DEPTH;
+			break;
+		}
+		/* Store the address of the unaligned mem block */
+		base_addrs[i] = sw_rings_base;
+		i++;
+	}
+
+	/* Free all unaligned blocks of mem allocated in the loop */
+	free_base_addresses(base_addrs, i);
+}
+
+
+/* Allocate 64MB of memory used for all software rings */
+static int
+acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
+{
+	uint32_t phys_low, phys_high, payload;
+	struct acc100_device *d = dev->data->dev_private;
+	const struct acc100_registry_addr *reg_addr;
+
+	if (d->pf_device && !d->acc100_conf.pf_mode_en) {
+		rte_bbdev_log(NOTICE,
+				"%s has PF mode disabled. This PF can't be used.",
+				dev->data->name);
+		return -ENODEV;
+	}
+
+	alloc_sw_rings_min_mem(dev, d, num_queues, socket_id);
+
+	/* If minimal memory space approach failed, then allocate
+	 * the 2 * 64MB block for the sw rings
+	 */
+	if (d->sw_rings == NULL)
+		alloc_2x64mb_sw_rings_mem(dev, d, socket_id);
+
+	/* Configure ACC100 with the base address for DMA descriptor rings
+	 * Same descriptor rings used for UL and DL DMA Engines
+	 * Note : Assuming only VF0 bundle is used for PF mode
+	 */
+	phys_high = (uint32_t)(d->sw_rings_phys >> 32);
+	phys_low  = (uint32_t)(d->sw_rings_phys & ~(ACC100_SIZE_64MBYTE-1));
+
+	/* Choose correct registry addresses for the device type */
+	if (d->pf_device)
+		reg_addr = &pf_reg_addr;
+	else
+		reg_addr = &vf_reg_addr;
+
+	/* Read the populated cfg from ACC100 registers */
+	fetch_acc100_config(dev);
+
+	/* Mark as configured properly */
+	d->configured = true;
+
+	/* Release AXI from PF */
+	if (d->pf_device)
+		acc100_reg_write(d, HWPfDmaAxiControl, 1);
+
+	acc100_reg_write(d, reg_addr->dma_ring_ul5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_ul5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->dma_ring_dl5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_dl5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->dma_ring_ul4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_ul4g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->dma_ring_dl4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_dl4g_lo, phys_low);
+
+	/*
+	 * Configure Ring Size to the max queue ring size
+	 * (used for wrapping purposes)
+	 */
+	payload = log2_basic(d->sw_ring_size / 64);
+	acc100_reg_write(d, reg_addr->ring_size, payload);
+
+	/* Configure tail pointer for use when SDONE enabled */
+	d->tail_ptrs = rte_zmalloc_socket(
+			dev->device->driver->name,
+			ACC100_NUM_QGRPS * ACC100_NUM_AQS * sizeof(uint32_t),
+			RTE_CACHE_LINE_SIZE, socket_id);
+	if (d->tail_ptrs == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate tail ptr for %s:%u",
+				dev->device->driver->name,
+				dev->data->dev_id);
+		rte_free(d->sw_rings);
+		return -ENOMEM;
+	}
+	d->tail_ptr_phys = rte_malloc_virt2iova(d->tail_ptrs);
+
+	phys_high = (uint32_t)(d->tail_ptr_phys >> 32);
+	phys_low  = (uint32_t)(d->tail_ptr_phys);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
+
+	d->harq_layout = rte_zmalloc_socket("HARQ Layout",
+			ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
+			RTE_CACHE_LINE_SIZE, dev->data->socket_id);
+
+	rte_bbdev_log_debug(
+			"ACC100 (%s) configured sw_rings = %p, sw_rings_phys = %#"
+			PRIx64, dev->data->name, d->sw_rings, d->sw_rings_phys);
+
+	return 0;
+}
+
 /* Free 64MB memory used for software rings */
 static int
-acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
+acc100_dev_close(struct rte_bbdev *dev)
 {
+	struct acc100_device *d = dev->data->dev_private;
+	if (d->sw_rings_base != NULL) {
+		rte_free(d->tail_ptrs);
+		rte_free(d->sw_rings_base);
+		d->sw_rings_base = NULL;
+	}
+	usleep(1000);
+	return 0;
+}
+
+
+/**
+ * Report an ACC100 queue index which is free.
+ * Returns 0 to 16k for a valid queue_idx or -1 when no queue is available.
+ * Note: Only supporting the VF0 Bundle for PF mode.
+ */
+static int
+acc100_find_free_queue_idx(struct rte_bbdev *dev,
+		const struct rte_bbdev_queue_conf *conf)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	int op_2_acc[5] = {0, UL_4G, DL_4G, UL_5G, DL_5G};
+	int acc = op_2_acc[conf->op_type];
+	struct rte_q_topology_t *qtop = NULL;
+	qtopFromAcc(&qtop, acc, &(d->acc100_conf));
+	if (qtop == NULL)
+		return -1;
+	/* Identify matching QGroup Index which are sorted in priority order */
+	uint16_t group_idx = qtop->first_qgroup_index;
+	group_idx += conf->priority;
+	if (group_idx >= ACC100_NUM_QGRPS ||
+			conf->priority >= qtop->num_qgroups) {
+		rte_bbdev_log(INFO, "Invalid Priority on %s, priority %u",
+				dev->data->name, conf->priority);
+		return -1;
+	}
+	/* Find a free AQ_idx  */
+	uint16_t aq_idx;
+	for (aq_idx = 0; aq_idx < qtop->num_aqs_per_groups; aq_idx++) {
+		if (((d->q_assigned_bit_map[group_idx] >> aq_idx) & 0x1) == 0) {
+			/* Mark the Queue as assigned */
+			d->q_assigned_bit_map[group_idx] |= (1 << aq_idx);
+			/* Report the AQ Index */
+			return (group_idx << GRP_ID_SHIFT) + aq_idx;
+		}
+	}
+	rte_bbdev_log(INFO, "Failed to find free queue on %s, priority %u",
+			dev->data->name, conf->priority);
+	return -1;
+}
+
+/* Setup ACC100 queue */
+static int
+acc100_queue_setup(struct rte_bbdev *dev, uint16_t queue_id,
+		const struct rte_bbdev_queue_conf *conf)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	struct acc100_queue *q;
+	int16_t q_idx;
+
+	/* Allocate the queue data structure. */
+	q = rte_zmalloc_socket(dev->device->driver->name, sizeof(*q),
+			RTE_CACHE_LINE_SIZE, conf->socket);
+	if (q == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate queue memory");
+		return -ENOMEM;
+	}
+
+	q->d = d;
+	q->ring_addr = RTE_PTR_ADD(d->sw_rings, (d->sw_ring_size * queue_id));
+	q->ring_addr_phys = d->sw_rings_phys + (d->sw_ring_size * queue_id);
+
+	/* Prepare the Ring with default descriptor format */
+	union acc100_dma_desc *desc = NULL;
+	unsigned int desc_idx, b_idx;
+	int fcw_len = (conf->op_type == RTE_BBDEV_OP_LDPC_ENC ?
+		ACC100_FCW_LE_BLEN : (conf->op_type == RTE_BBDEV_OP_TURBO_DEC ?
+		ACC100_FCW_TD_BLEN : ACC100_FCW_LD_BLEN));
+
+	for (desc_idx = 0; desc_idx < d->sw_ring_max_depth; desc_idx++) {
+		desc = q->ring_addr + desc_idx;
+		desc->req.word0 = ACC100_DMA_DESC_TYPE;
+		desc->req.word1 = 0; /**< Timestamp */
+		desc->req.word2 = 0;
+		desc->req.word3 = 0;
+		uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+		desc->req.data_ptrs[0].blen = fcw_len;
+		desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
+		desc->req.data_ptrs[0].last = 0;
+		desc->req.data_ptrs[0].dma_ext = 0;
+		for (b_idx = 1; b_idx < ACC100_DMA_MAX_NUM_POINTERS - 1;
+				b_idx++) {
+			desc->req.data_ptrs[b_idx].blkid = ACC100_DMA_BLKID_IN;
+			desc->req.data_ptrs[b_idx].last = 1;
+			desc->req.data_ptrs[b_idx].dma_ext = 0;
+			b_idx++;
+			desc->req.data_ptrs[b_idx].blkid =
+					ACC100_DMA_BLKID_OUT_ENC;
+			desc->req.data_ptrs[b_idx].last = 1;
+			desc->req.data_ptrs[b_idx].dma_ext = 0;
+		}
+		/* Preset some fields of LDPC FCW */
+		desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
+		desc->req.fcw_ld.gain_i = 1;
+		desc->req.fcw_ld.gain_h = 1;
+	}
+
+	q->lb_in = rte_zmalloc_socket(dev->device->driver->name,
+			RTE_CACHE_LINE_SIZE,
+			RTE_CACHE_LINE_SIZE, conf->socket);
+	if (q->lb_in == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate lb_in memory");
+		return -ENOMEM;
+	}
+	q->lb_in_addr_phys = rte_malloc_virt2iova(q->lb_in);
+	q->lb_out = rte_zmalloc_socket(dev->device->driver->name,
+			RTE_CACHE_LINE_SIZE,
+			RTE_CACHE_LINE_SIZE, conf->socket);
+	if (q->lb_out == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate lb_out memory");
+		return -ENOMEM;
+	}
+	q->lb_out_addr_phys = rte_malloc_virt2iova(q->lb_out);
+
+	/*
+	 * Software queue ring wraps synchronously with the HW when it reaches
+	 * the boundary of the maximum allocated queue size, no matter what the
+	 * sw queue size is. This wrapping is guarded by setting the wrap_mask
+	 * to represent the maximum queue size as allocated at the time when
+	 * the device has been setup (in configure()).
+	 *
+	 * The queue depth is set to the queue size value (conf->queue_size).
+	 * This limits the occupancy of the queue at any point of time, so that
+	 * the queue does not get swamped with enqueue requests.
+	 */
+	q->sw_ring_depth = conf->queue_size;
+	q->sw_ring_wrap_mask = d->sw_ring_max_depth - 1;
+
+	q->op_type = conf->op_type;
+
+	q_idx = acc100_find_free_queue_idx(dev, conf);
+	if (q_idx == -1) {
+		rte_free(q);
+		return -1;
+	}
+
+	q->qgrp_id = (q_idx >> GRP_ID_SHIFT) & 0xF;
+	q->vf_id = (q_idx >> VF_ID_SHIFT)  & 0x3F;
+	q->aq_id = q_idx & 0xF;
+	q->aq_depth = (conf->op_type ==  RTE_BBDEV_OP_TURBO_DEC) ?
+			(1 << d->acc100_conf.q_ul_4g.aq_depth_log2) :
+			(1 << d->acc100_conf.q_dl_4g.aq_depth_log2);
+
+	q->mmio_reg_enqueue = RTE_PTR_ADD(d->mmio_base,
+			queue_offset(d->pf_device,
+					q->vf_id, q->qgrp_id, q->aq_id));
+
+	rte_bbdev_log_debug(
+			"Setup dev%u q%u: qgrp_id=%u, vf_id=%u, aq_id=%u, aq_depth=%u, mmio_reg_enqueue=%p",
+			dev->data->dev_id, queue_id, q->qgrp_id, q->vf_id,
+			q->aq_id, q->aq_depth, q->mmio_reg_enqueue);
+
+	dev->data->queues[queue_id].queue_private = q;
+	return 0;
+}
+
+/* Release ACC100 queue */
+static int
+acc100_queue_release(struct rte_bbdev *dev, uint16_t q_id)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	struct acc100_queue *q = dev->data->queues[q_id].queue_private;
+
+	if (q != NULL) {
+		/* Mark the Queue as un-assigned */
+		d->q_assigned_bit_map[q->qgrp_id] &= (0xFFFFFFFF -
+				(1 << q->aq_id));
+		rte_free(q->lb_in);
+		rte_free(q->lb_out);
+		rte_free(q);
+		dev->data->queues[q_id].queue_private = NULL;
+	}
+
 	return 0;
 }
 
@@ -258,8 +673,11 @@
 }
 
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
+	.setup_queues = acc100_setup_queues,
 	.close = acc100_dev_close,
 	.info_get = acc100_dev_info_get,
+	.queue_setup = acc100_queue_setup,
+	.queue_release = acc100_queue_release,
 };
 
 /* ACC100 PCI PF address map */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 662e2c8..0e2b79c 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -518,11 +518,56 @@ struct acc100_registry_addr {
 	.ddr_range = HWVfDmaDdrBaseRangeRoVf,
 };
 
+/* Structure associated with each queue. */
+struct __rte_cache_aligned acc100_queue {
+	union acc100_dma_desc *ring_addr;  /* Virtual address of sw ring */
+	rte_iova_t ring_addr_phys;  /* Physical address of software ring */
+	uint32_t sw_ring_head;  /* software ring head */
+	uint32_t sw_ring_tail;  /* software ring tail */
+	/* software ring size (descriptors, not bytes) */
+	uint32_t sw_ring_depth;
+	/* mask used to wrap enqueued descriptors on the sw ring */
+	uint32_t sw_ring_wrap_mask;
+	/* MMIO register used to enqueue descriptors */
+	void *mmio_reg_enqueue;
+	uint8_t vf_id;  /* VF ID (max = 63) */
+	uint8_t qgrp_id;  /* Queue Group ID */
+	uint16_t aq_id;  /* Atomic Queue ID */
+	uint16_t aq_depth;  /* Depth of atomic queue */
+	uint32_t aq_enqueued;  /* Count how many "batches" have been enqueued */
+	uint32_t aq_dequeued;  /* Count how many "batches" have been dequeued */
+	uint32_t irq_enable;  /* Enable ops dequeue interrupts if set to 1 */
+	struct rte_mempool *fcw_mempool;  /* FCW mempool */
+	enum rte_bbdev_op_type op_type;  /* Type of this Queue: TE or TD */
+	/* Internal Buffers for loopback input */
+	uint8_t *lb_in;
+	uint8_t *lb_out;
+	rte_iova_t lb_in_addr_phys;
+	rte_iova_t lb_out_addr_phys;
+	struct acc100_device *d;
+};
+
 /* Private data structure for each ACC100 device */
 struct acc100_device {
 	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
+	void *sw_rings_base;  /* Base addr of un-aligned memory for sw rings */
+	void *sw_rings;  /* 64MB of 64MB-aligned memory for sw rings */
+	rte_iova_t sw_rings_phys;  /* Physical address of sw_rings */
+	/* Virtual address of the info memory routed to this function under
+	 * operation, whether it is PF or VF.
+	 */
+	union acc100_harq_layout_data *harq_layout;
+	uint32_t sw_ring_size;
 	uint32_t ddr_size; /* Size in kB */
+	uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */
+	rte_iova_t tail_ptr_phys; /* Physical address of tail pointers */
+	/* Max number of entries available for each queue in device, depending
+	 * on how many queues are enabled with configure()
+	 */
+	uint32_t sw_ring_max_depth;
 	struct acc100_conf acc100_conf; /* ACC100 Initial configuration */
+	/* Bitmap capturing which Queues have already been assigned */
+	uint16_t q_assigned_bit_map[ACC100_NUM_QGRPS];
 	bool pf_device; /**< True if this is a PF ACC100 device */
 	bool configured; /**< True if this ACC100 device is configured */
 };
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v7 05/11] baseband/acc100: add LDPC processing functions
  2020-09-23  2:24   ` [dpdk-dev] [PATCH v7 00/11] bbdev PMD ACC100 Nicolas Chautru
                       ` (3 preceding siblings ...)
  2020-09-23  2:24     ` [dpdk-dev] [PATCH v7 04/11] baseband/acc100: add queue configuration Nicolas Chautru
@ 2020-09-23  2:25     ` Nicolas Chautru
  2020-09-23  2:25     ` [dpdk-dev] [PATCH v7 06/11] baseband/acc100: add HARQ loopback support Nicolas Chautru
                       ` (5 subsequent siblings)
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-23  2:25 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Adding LDPC decode and encode processing operations

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
Acked-by: Dave Burley <dave.burley@accelercomm.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 1625 +++++++++++++++++++++++++++++-
 drivers/baseband/acc100/rte_acc100_pmd.h |    3 +
 2 files changed, 1626 insertions(+), 2 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 7a21c57..b223547 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -15,6 +15,9 @@
 #include <rte_hexdump.h>
 #include <rte_pci.h>
 #include <rte_bus_pci.h>
+#ifdef RTE_BBDEV_OFFLOAD_COST
+#include <rte_cycles.h>
+#endif
 
 #include <rte_bbdev.h>
 #include <rte_bbdev_pmd.h>
@@ -449,7 +452,6 @@
 	return 0;
 }
 
-
 /**
  * Report an ACC100 queue index which is free.
  * Returns 0 to 16k for a valid queue_idx or -1 when no queue is available.
@@ -634,6 +636,46 @@
 	struct acc100_device *d = dev->data->dev_private;
 
 	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
+		{
+			.type   = RTE_BBDEV_OP_LDPC_ENC,
+			.cap.ldpc_enc = {
+				.capability_flags =
+					RTE_BBDEV_LDPC_RATE_MATCH |
+					RTE_BBDEV_LDPC_CRC_24B_ATTACH |
+					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
+				.num_buffers_src =
+						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+				.num_buffers_dst =
+						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+			}
+		},
+		{
+			.type   = RTE_BBDEV_OP_LDPC_DEC,
+			.cap.ldpc_dec = {
+			.capability_flags =
+				RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK |
+				RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP |
+				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
+				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
+#ifdef ACC100_EXT_MEM
+				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
+				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
+#endif
+				RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE |
+				RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS |
+				RTE_BBDEV_LDPC_DECODE_BYPASS |
+				RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
+				RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
+				RTE_BBDEV_LDPC_LLR_COMPRESSION,
+			.llr_size = 8,
+			.llr_decimals = 1,
+			.num_buffers_src =
+					RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+			.num_buffers_hard_out =
+					RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+			.num_buffers_soft_out = 0,
+			}
+		},
 		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
 	};
 
@@ -669,9 +711,14 @@
 	dev_info->cpu_flag_reqs = NULL;
 	dev_info->min_alignment = 64;
 	dev_info->capabilities = bbdev_capabilities;
+#ifdef ACC100_EXT_MEM
 	dev_info->harq_buffer_size = d->ddr_size;
+#else
+	dev_info->harq_buffer_size = 0;
+#endif
 }
 
+
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.setup_queues = acc100_setup_queues,
 	.close = acc100_dev_close,
@@ -696,6 +743,1577 @@
 	{.device_id = 0},
 };
 
+/* Read flag value 0/1 from bitmap */
+static inline bool
+check_bit(uint32_t bitmap, uint32_t bitmask)
+{
+	return bitmap & bitmask;
+}
+
+static inline char *
+mbuf_append(struct rte_mbuf *m_head, struct rte_mbuf *m, uint16_t len)
+{
+	if (unlikely(len > rte_pktmbuf_tailroom(m)))
+		return NULL;
+
+	char *tail = (char *)m->buf_addr + m->data_off + m->data_len;
+	m->data_len = (uint16_t)(m->data_len + len);
+	m_head->pkt_len  = (m_head->pkt_len + len);
+	return tail;
+}
+
+/* Compute value of k0.
+ * Based on 3GPP 38.212 Table 5.4.2.1-2:
+ * starting position of different redundancy versions, k0.
+ */
+static inline uint16_t
+get_k0(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv_index)
+{
+	if (rv_index == 0)
+		return 0;
+	uint16_t n = (bg == 1 ? N_ZC_1 : N_ZC_2) * z_c;
+	if (n_cb == n) {
+		if (rv_index == 1)
+			return (bg == 1 ? K0_1_1 : K0_1_2) * z_c;
+		else if (rv_index == 2)
+			return (bg == 1 ? K0_2_1 : K0_2_2) * z_c;
+		else
+			return (bg == 1 ? K0_3_1 : K0_3_2) * z_c;
+	}
+	/* LBRM case - includes a division by N */
+	if (rv_index == 1)
+		return (((bg == 1 ? K0_1_1 : K0_1_2) * n_cb)
+				/ n) * z_c;
+	else if (rv_index == 2)
+		return (((bg == 1 ? K0_2_1 : K0_2_2) * n_cb)
+				/ n) * z_c;
+	else
+		return (((bg == 1 ? K0_3_1 : K0_3_2) * n_cb)
+				/ n) * z_c;
+}
+
+/* Fill in a frame control word for LDPC encoding. */
+static inline void
+acc100_fcw_le_fill(const struct rte_bbdev_enc_op *op,
+		struct acc100_fcw_le *fcw, int num_cb)
+{
+	fcw->qm = op->ldpc_enc.q_m;
+	fcw->nfiller = op->ldpc_enc.n_filler;
+	fcw->BG = (op->ldpc_enc.basegraph - 1);
+	fcw->Zc = op->ldpc_enc.z_c;
+	fcw->ncb = op->ldpc_enc.n_cb;
+	fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_enc.basegraph,
+			op->ldpc_enc.rv_index);
+	fcw->rm_e = op->ldpc_enc.cb_params.e;
+	fcw->crc_select = check_bit(op->ldpc_enc.op_flags,
+			RTE_BBDEV_LDPC_CRC_24B_ATTACH);
+	fcw->bypass_intlv = check_bit(op->ldpc_enc.op_flags,
+			RTE_BBDEV_LDPC_INTERLEAVER_BYPASS);
+	fcw->mcb_count = num_cb;
+}
+
+/* Fill in a frame control word for LDPC decoding. */
+static inline void
+acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
+		union acc100_harq_layout_data *harq_layout)
+{
+	uint16_t harq_out_length, harq_in_length, ncb_p, k0_p, parity_offset;
+	uint16_t harq_index;
+	uint32_t l;
+	bool harq_prun = false;
+
+	fcw->qm = op->ldpc_dec.q_m;
+	fcw->nfiller = op->ldpc_dec.n_filler;
+	fcw->BG = (op->ldpc_dec.basegraph - 1);
+	fcw->Zc = op->ldpc_dec.z_c;
+	fcw->ncb = op->ldpc_dec.n_cb;
+	fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_dec.basegraph,
+			op->ldpc_dec.rv_index);
+	if (op->ldpc_dec.code_block_mode == 1)
+		fcw->rm_e = op->ldpc_dec.cb_params.e;
+	else
+		fcw->rm_e = (op->ldpc_dec.tb_params.r <
+				op->ldpc_dec.tb_params.cab) ?
+						op->ldpc_dec.tb_params.ea :
+						op->ldpc_dec.tb_params.eb;
+
+	fcw->hcin_en = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE);
+	fcw->hcout_en = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE);
+	fcw->crc_select = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK);
+	fcw->bypass_dec = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_DECODE_BYPASS);
+	fcw->bypass_intlv = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS);
+	if (op->ldpc_dec.q_m == 1) {
+		fcw->bypass_intlv = 1;
+		fcw->qm = 2;
+	}
+	fcw->hcin_decomp_mode = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+	fcw->hcout_comp_mode = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+	fcw->llr_pack_mode = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_LLR_COMPRESSION);
+	harq_index = op->ldpc_dec.harq_combined_output.offset /
+			ACC100_HARQ_OFFSET;
+#ifdef ACC100_EXT_MEM
+	/* Limit cases when HARQ pruning is valid */
+	harq_prun = ((op->ldpc_dec.harq_combined_output.offset %
+			ACC100_HARQ_OFFSET) == 0) &&
+			(op->ldpc_dec.harq_combined_output.offset <= UINT16_MAX
+			* ACC100_HARQ_OFFSET);
+#endif
+	if (fcw->hcin_en > 0) {
+		harq_in_length = op->ldpc_dec.harq_combined_input.length;
+		if (fcw->hcin_decomp_mode > 0)
+			harq_in_length = harq_in_length * 8 / 6;
+		harq_in_length = RTE_ALIGN(harq_in_length, 64);
+		if ((harq_layout[harq_index].offset > 0) & harq_prun) {
+			rte_bbdev_log_debug("HARQ IN offset unexpected for now\n");
+			fcw->hcin_size0 = harq_layout[harq_index].size0;
+			fcw->hcin_offset = harq_layout[harq_index].offset;
+			fcw->hcin_size1 = harq_in_length -
+					harq_layout[harq_index].offset;
+		} else {
+			fcw->hcin_size0 = harq_in_length;
+			fcw->hcin_offset = 0;
+			fcw->hcin_size1 = 0;
+		}
+	} else {
+		fcw->hcin_size0 = 0;
+		fcw->hcin_offset = 0;
+		fcw->hcin_size1 = 0;
+	}
+
+	fcw->itmax = op->ldpc_dec.iter_max;
+	fcw->itstop = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE);
+	fcw->synd_precoder = fcw->itstop;
+	/*
+	 * These are all implicitly set
+	 * fcw->synd_post = 0;
+	 * fcw->so_en = 0;
+	 * fcw->so_bypass_rm = 0;
+	 * fcw->so_bypass_intlv = 0;
+	 * fcw->dec_convllr = 0;
+	 * fcw->hcout_convllr = 0;
+	 * fcw->hcout_size1 = 0;
+	 * fcw->so_it = 0;
+	 * fcw->hcout_offset = 0;
+	 * fcw->negstop_th = 0;
+	 * fcw->negstop_it = 0;
+	 * fcw->negstop_en = 0;
+	 * fcw->gain_i = 1;
+	 * fcw->gain_h = 1;
+	 */
+	if (fcw->hcout_en > 0) {
+		parity_offset = (op->ldpc_dec.basegraph == 1 ? 20 : 8)
+			* op->ldpc_dec.z_c - op->ldpc_dec.n_filler;
+		k0_p = (fcw->k0 > parity_offset) ?
+				fcw->k0 - op->ldpc_dec.n_filler : fcw->k0;
+		ncb_p = fcw->ncb - op->ldpc_dec.n_filler;
+		l = k0_p + fcw->rm_e;
+		harq_out_length = (uint16_t) fcw->hcin_size0;
+		harq_out_length = RTE_MIN(RTE_MAX(harq_out_length, l), ncb_p);
+		harq_out_length = (harq_out_length + 0x3F) & 0xFFC0;
+		if ((k0_p > fcw->hcin_size0 + ACC100_HARQ_OFFSET_THRESHOLD) &&
+				harq_prun) {
+			fcw->hcout_size0 = (uint16_t) fcw->hcin_size0;
+			fcw->hcout_offset = k0_p & 0xFFC0;
+			fcw->hcout_size1 = harq_out_length - fcw->hcout_offset;
+		} else {
+			fcw->hcout_size0 = harq_out_length;
+			fcw->hcout_size1 = 0;
+			fcw->hcout_offset = 0;
+		}
+		harq_layout[harq_index].offset = fcw->hcout_offset;
+		harq_layout[harq_index].size0 = fcw->hcout_size0;
+	} else {
+		fcw->hcout_size0 = 0;
+		fcw->hcout_size1 = 0;
+		fcw->hcout_offset = 0;
+	}
+}
+
+/**
+ * Fills descriptor with data pointers of one block type.
+ *
+ * @param desc
+ *   Pointer to DMA descriptor.
+ * @param input
+ *   Pointer to a pointer to the input data to be encoded. It may be updated
+ *   to point to the next segment in the scatter-gather case.
+ * @param offset
+ *   Input offset within the rte_mbuf structure, used to calculate where the
+ *   data starts.
+ * @param cb_len
+ *   Length of the currently processed Code Block.
+ * @param seg_total_left
+ *   Number of bytes still left in the segment (mbuf) for further
+ *   processing.
+ * @param next_triplet
+ *   Index for the ACC100 DMA Descriptor triplet.
+ *
+ * @return
+ *   Index of the next triplet on success, negative value if the lengths of
+ *   the pkt and the processed CB do not match.
+ *
+ */
+static inline int
+acc100_dma_fill_blk_type_in(struct acc100_dma_req_desc *desc,
+		struct rte_mbuf **input, uint32_t *offset, uint32_t cb_len,
+		uint32_t *seg_total_left, int next_triplet)
+{
+	uint32_t part_len;
+	struct rte_mbuf *m = *input;
+
+	part_len = (*seg_total_left < cb_len) ? *seg_total_left : cb_len;
+	cb_len -= part_len;
+	*seg_total_left -= part_len;
+
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(m, *offset);
+	desc->data_ptrs[next_triplet].blen = part_len;
+	desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
+	desc->data_ptrs[next_triplet].last = 0;
+	desc->data_ptrs[next_triplet].dma_ext = 0;
+	*offset += part_len;
+	next_triplet++;
+
+	while (cb_len > 0) {
+		if (next_triplet < ACC100_DMA_MAX_NUM_POINTERS &&
+				m->next != NULL) {
+
+			m = m->next;
+			*seg_total_left = rte_pktmbuf_data_len(m);
+			part_len = (*seg_total_left < cb_len) ?
+					*seg_total_left :
+					cb_len;
+			desc->data_ptrs[next_triplet].address =
+					rte_pktmbuf_iova_offset(m, 0);
+			desc->data_ptrs[next_triplet].blen = part_len;
+			desc->data_ptrs[next_triplet].blkid =
+					ACC100_DMA_BLKID_IN;
+			desc->data_ptrs[next_triplet].last = 0;
+			desc->data_ptrs[next_triplet].dma_ext = 0;
+			cb_len -= part_len;
+			*seg_total_left -= part_len;
+			/* Initializing offset for next segment (mbuf) */
+			*offset = part_len;
+			next_triplet++;
+		} else {
+			rte_bbdev_log(ERR,
+				"Some data still left for processing: "
+				"data_left: %u, next_triplet: %u, next_mbuf: %p",
+				cb_len, next_triplet, m->next);
+			return -EINVAL;
+		}
+	}
+	/* Store the new mbuf as it could have changed in the scatter-gather case */
+	*input = m;
+
+	return next_triplet;
+}
+
+/* Fills descriptor with data pointers of one block type.
+ * Returns index of next triplet on success, other value if lengths of
+ * output data and processed mbuf do not match.
+ */
+static inline int
+acc100_dma_fill_blk_type_out(struct acc100_dma_req_desc *desc,
+		struct rte_mbuf *output, uint32_t out_offset,
+		uint32_t output_len, int next_triplet, int blk_id)
+{
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(output, out_offset);
+	desc->data_ptrs[next_triplet].blen = output_len;
+	desc->data_ptrs[next_triplet].blkid = blk_id;
+	desc->data_ptrs[next_triplet].last = 0;
+	desc->data_ptrs[next_triplet].dma_ext = 0;
+	next_triplet++;
+
+	return next_triplet;
+}
+
+static inline int
+acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
+		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
+		struct rte_mbuf *output, uint32_t *in_offset,
+		uint32_t *out_offset, uint32_t *out_length,
+		uint32_t *mbuf_total_left, uint32_t *seg_total_left)
+{
+	int next_triplet = 1; /* FCW already done */
+	uint16_t K, in_length_in_bits, in_length_in_bytes;
+	struct rte_bbdev_op_ldpc_enc *enc = &op->ldpc_enc;
+
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /**< Timestamp could be disabled */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+
+	K = (enc->basegraph == 1 ? 22 : 10) * enc->z_c;
+	in_length_in_bits = K - enc->n_filler;
+	if ((enc->op_flags & RTE_BBDEV_LDPC_CRC_24A_ATTACH) ||
+			(enc->op_flags & RTE_BBDEV_LDPC_CRC_24B_ATTACH))
+		in_length_in_bits -= 24;
+	in_length_in_bytes = in_length_in_bits >> 3;
+
+	if (unlikely((*mbuf_total_left == 0) ||
+			(*mbuf_total_left < in_length_in_bytes))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, in_length_in_bytes);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
+			in_length_in_bytes,
+			seg_total_left, next_triplet);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= in_length_in_bytes;
+
+	/* Set output length */
+	/* Integer round up division by 8 */
+	*out_length = (enc->cb_params.e + 7) >> 3;
+
+	next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
+			*out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	op->ldpc_enc.output.length += *out_length;
+	*out_offset += *out_length;
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->data_ptrs[next_triplet - 1].dma_ext = 0;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
+
+static inline int
+acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
+		struct acc100_dma_req_desc *desc,
+		struct rte_mbuf **input, struct rte_mbuf *h_output,
+		uint32_t *in_offset, uint32_t *h_out_offset,
+		uint32_t *h_out_length, uint32_t *mbuf_total_left,
+		uint32_t *seg_total_left,
+		struct acc100_fcw_ld *fcw)
+{
+	struct rte_bbdev_op_ldpc_dec *dec = &op->ldpc_dec;
+	int next_triplet = 1; /* FCW already done */
+	uint32_t input_length;
+	uint16_t output_length, crc24_overlap = 0;
+	uint16_t sys_cols, K, h_p_size, h_np_size;
+	bool h_comp = check_bit(dec->op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /**< Timestamp could be disabled */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+
+	if (check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP))
+		crc24_overlap = 24;
+
+	/* Compute some LDPC BG lengths */
+	input_length = dec->cb_params.e;
+	if (check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_LLR_COMPRESSION))
+		input_length = (input_length * 3 + 3) / 4;
+	sys_cols = (dec->basegraph == 1) ? 22 : 10;
+	K = sys_cols * dec->z_c;
+	output_length = K - dec->n_filler - crc24_overlap;
+
+	if (unlikely((*mbuf_total_left == 0) ||
+			(*mbuf_total_left < input_length))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, input_length);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input,
+			in_offset, input_length,
+			seg_total_left, next_triplet);
+
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
+		h_p_size = fcw->hcin_size0 + fcw->hcin_size1;
+		if (h_comp)
+			h_p_size = (h_p_size * 3 + 3) / 4;
+		desc->data_ptrs[next_triplet].address =
+				dec->harq_combined_input.offset;
+		desc->data_ptrs[next_triplet].blen = h_p_size;
+		desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN_HARQ;
+		desc->data_ptrs[next_triplet].dma_ext = 1;
+#ifndef ACC100_EXT_MEM
+		acc100_dma_fill_blk_type_out(
+				desc,
+				op->ldpc_dec.harq_combined_input.data,
+				op->ldpc_dec.harq_combined_input.offset,
+				h_p_size,
+				next_triplet,
+				ACC100_DMA_BLKID_IN_HARQ);
+#endif
+		next_triplet++;
+	}
+
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= input_length;
+
+	next_triplet = acc100_dma_fill_blk_type_out(desc, h_output,
+			*h_out_offset, output_length >> 3, next_triplet,
+			ACC100_DMA_BLKID_OUT_HARD);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
+		/* Pruned size of the HARQ */
+		h_p_size = fcw->hcout_size0 + fcw->hcout_size1;
+		/* Non-Pruned size of the HARQ */
+		h_np_size = fcw->hcout_offset > 0 ?
+				fcw->hcout_offset + fcw->hcout_size1 :
+				h_p_size;
+		if (h_comp) {
+			h_np_size = (h_np_size * 3 + 3) / 4;
+			h_p_size = (h_p_size * 3 + 3) / 4;
+		}
+		dec->harq_combined_output.length = h_np_size;
+		desc->data_ptrs[next_triplet].address =
+				dec->harq_combined_output.offset;
+		desc->data_ptrs[next_triplet].blen = h_p_size;
+		desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_OUT_HARQ;
+		desc->data_ptrs[next_triplet].dma_ext = 1;
+#ifndef ACC100_EXT_MEM
+		acc100_dma_fill_blk_type_out(
+				desc,
+				dec->harq_combined_output.data,
+				dec->harq_combined_output.offset,
+				h_p_size,
+				next_triplet,
+				ACC100_DMA_BLKID_OUT_HARQ);
+#endif
+		next_triplet++;
+	}
+
+	*h_out_length = output_length >> 3;
+	dec->hard_output.length += *h_out_length;
+	*h_out_offset += *h_out_length;
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
+
+static inline void
+acc100_dma_desc_ld_update(struct rte_bbdev_dec_op *op,
+		struct acc100_dma_req_desc *desc,
+		struct rte_mbuf *input, struct rte_mbuf *h_output,
+		uint32_t *in_offset, uint32_t *h_out_offset,
+		uint32_t *h_out_length,
+		union acc100_harq_layout_data *harq_layout)
+{
+	int next_triplet = 1; /* FCW already done */
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(input, *in_offset);
+	next_triplet++;
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
+		struct rte_bbdev_op_data hi = op->ldpc_dec.harq_combined_input;
+		desc->data_ptrs[next_triplet].address = hi.offset;
+#ifndef ACC100_EXT_MEM
+		desc->data_ptrs[next_triplet].address =
+				rte_pktmbuf_iova_offset(hi.data, hi.offset);
+#endif
+		next_triplet++;
+	}
+
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(h_output, *h_out_offset);
+	*h_out_length = desc->data_ptrs[next_triplet].blen;
+	next_triplet++;
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
+		desc->data_ptrs[next_triplet].address =
+				op->ldpc_dec.harq_combined_output.offset;
+		/* Adjust based on previous operation */
+		struct rte_bbdev_dec_op *prev_op = desc->op_addr;
+		op->ldpc_dec.harq_combined_output.length =
+				prev_op->ldpc_dec.harq_combined_output.length;
+		int16_t hq_idx = op->ldpc_dec.harq_combined_output.offset /
+				ACC100_HARQ_OFFSET;
+		int16_t prev_hq_idx =
+				prev_op->ldpc_dec.harq_combined_output.offset
+				/ ACC100_HARQ_OFFSET;
+		harq_layout[hq_idx].val = harq_layout[prev_hq_idx].val;
+#ifndef ACC100_EXT_MEM
+		struct rte_bbdev_op_data ho =
+				op->ldpc_dec.harq_combined_output;
+		desc->data_ptrs[next_triplet].address =
+				rte_pktmbuf_iova_offset(ho.data, ho.offset);
+#endif
+		next_triplet++;
+	}
+
+	op->ldpc_dec.hard_output.length += *h_out_length;
+	desc->op_addr = op;
+}
+
+/* Enqueue a number of operations to HW and update software rings */
+static inline void
+acc100_dma_enqueue(struct acc100_queue *q, uint16_t n,
+		struct rte_bbdev_stats *queue_stats)
+{
+	union acc100_enqueue_reg_fmt enq_req;
+#ifdef RTE_BBDEV_OFFLOAD_COST
+	uint64_t start_time = 0;
+	queue_stats->acc_offload_cycles = 0;
+#else
+	RTE_SET_USED(queue_stats);
+#endif
+
+	enq_req.val = 0;
+	/* Setting offset, 100b for 256 DMA Desc */
+	enq_req.addr_offset = ACC100_DESC_OFFSET;
+
+	/* Split ops into batches */
+	do {
+		union acc100_dma_desc *desc;
+		uint16_t enq_batch_size;
+		uint64_t offset;
+		rte_iova_t req_elem_addr;
+
+		enq_batch_size = RTE_MIN(n, MAX_ENQ_BATCH_SIZE);
+
+		/* Set flag on last descriptor in a batch */
+		desc = q->ring_addr + ((q->sw_ring_head + enq_batch_size - 1) &
+				q->sw_ring_wrap_mask);
+		desc->req.last_desc_in_batch = 1;
+
+		/* Calculate the 1st descriptor's address */
+		offset = ((q->sw_ring_head & q->sw_ring_wrap_mask) *
+				sizeof(union acc100_dma_desc));
+		req_elem_addr = q->ring_addr_phys + offset;
+
+		/* Fill enqueue struct */
+		enq_req.num_elem = enq_batch_size;
+		/* low 6 bits are not needed */
+		enq_req.req_elem_addr = (uint32_t)(req_elem_addr >> 6);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "Req sdone", desc, sizeof(*desc));
+#endif
+		rte_bbdev_log_debug(
+				"Enqueue %u reqs (phys %#"PRIx64") to reg %p",
+				enq_batch_size,
+				req_elem_addr,
+				(void *)q->mmio_reg_enqueue);
+
+		rte_wmb();
+
+#ifdef RTE_BBDEV_OFFLOAD_COST
+		/* Start time measurement for enqueue function offload. */
+		start_time = rte_rdtsc_precise();
+#endif
+		rte_bbdev_log(DEBUG, "MMIO Enqueue");
+		mmio_write(q->mmio_reg_enqueue, enq_req.val);
+
+#ifdef RTE_BBDEV_OFFLOAD_COST
+		queue_stats->acc_offload_cycles +=
+				rte_rdtsc_precise() - start_time;
+#endif
+
+		q->aq_enqueued++;
+		q->sw_ring_head += enq_batch_size;
+		n -= enq_batch_size;
+
+	} while (n);
+}
+
+/* Enqueue a number of encode operations for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
+		uint16_t total_enqueued_cbs, int16_t num)
+{
+	union acc100_dma_desc *desc = NULL;
+	uint32_t out_length;
+	struct rte_mbuf *output_head, *output;
+	int i, next_triplet;
+	uint16_t  in_length_in_bytes;
+	struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_le_fill(ops[0], &desc->req.fcw_le, num);
+
+	/* This could be done at polling */
+	desc->req.word0 = ACC100_DMA_DESC_TYPE;
+	desc->req.word1 = 0; /* Timestamp could be disabled */
+	desc->req.word2 = 0;
+	desc->req.word3 = 0;
+	desc->req.numCBs = num;
+
+	in_length_in_bytes = ops[0]->ldpc_enc.input.data->data_len;
+	out_length = (enc->cb_params.e + 7) >> 3;
+	desc->req.m2dlen = 1 + num;
+	desc->req.d2mlen = num;
+	next_triplet = 1;
+
+	for (i = 0; i < num; i++) {
+		desc->req.data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(ops[i]->ldpc_enc.input.data, 0);
+		desc->req.data_ptrs[next_triplet].blen = in_length_in_bytes;
+		next_triplet++;
+		desc->req.data_ptrs[next_triplet].address =
+				rte_pktmbuf_iova_offset(
+				ops[i]->ldpc_enc.output.data, 0);
+		desc->req.data_ptrs[next_triplet].blen = out_length;
+		next_triplet++;
+		ops[i]->ldpc_enc.output.length = out_length;
+		output_head = output = ops[i]->ldpc_enc.output.data;
+		mbuf_append(output_head, output, out_length);
+		output->data_len = out_length;
+	}
+
+	desc->req.op_addr = ops[0];
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_le,
+			sizeof(desc->req.fcw_le) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+	/* Several CBs (ops) were successfully prepared to enqueue */
+	return num;
+}
+
+/* Enqueue one encode operation for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+		uint16_t total_enqueued_cbs)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+		seg_total_left;
+	struct rte_mbuf *input, *output_head, *output;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_le_fill(op, &desc->req.fcw_le, 1);
+
+	input = op->ldpc_enc.input.data;
+	output_head = output = op->ldpc_enc.output.data;
+	in_offset = op->ldpc_enc.input.offset;
+	out_offset = op->ldpc_enc.output.offset;
+	out_length = 0;
+	mbuf_total_left = op->ldpc_enc.input.length;
+	seg_total_left = rte_pktmbuf_data_len(op->ldpc_enc.input.data)
+			- in_offset;
+
+	ret = acc100_dma_desc_le_fill(op, &desc->req, &input, output,
+			&in_offset, &out_offset, &out_length, &mbuf_total_left,
+			&seg_total_left);
+
+	if (unlikely(ret < 0))
+		return ret;
+
+	mbuf_append(output_head, output, out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_le,
+			sizeof(desc->req.fcw_le) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+
+	/* Check if any data left after processing one CB */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left after processing one CB: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+/* Enqueue one decode operation for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs, bool same_op)
+{
+	int ret;
+
+	union acc100_dma_desc *desc;
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	struct rte_mbuf *input, *h_output_head, *h_output;
+	uint32_t in_offset, h_out_offset, mbuf_total_left, h_out_length = 0;
+	input = op->ldpc_dec.input.data;
+	h_output_head = h_output = op->ldpc_dec.hard_output.data;
+	in_offset = op->ldpc_dec.input.offset;
+	h_out_offset = op->ldpc_dec.hard_output.offset;
+	mbuf_total_left = op->ldpc_dec.input.length;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(input == NULL)) {
+		rte_bbdev_log(ERR, "Invalid mbuf pointer");
+		return -EFAULT;
+	}
+#endif
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+
+	if (same_op) {
+		union acc100_dma_desc *prev_desc;
+		desc_idx = ((q->sw_ring_head + total_enqueued_cbs - 1)
+				& q->sw_ring_wrap_mask);
+		prev_desc = q->ring_addr + desc_idx;
+		uint8_t *prev_ptr = (uint8_t *) prev_desc;
+		uint8_t *new_ptr = (uint8_t *) desc;
+		/* Copy first 4 words and BDESCs */
+		rte_memcpy(new_ptr, prev_ptr, 16);
+		rte_memcpy(new_ptr + 36, prev_ptr + 36, 40);
+		desc->req.op_addr = prev_desc->req.op_addr;
+		/* Copy FCW */
+		rte_memcpy(new_ptr + ACC100_DESC_FCW_OFFSET,
+				prev_ptr + ACC100_DESC_FCW_OFFSET,
+				ACC100_FCW_LD_BLEN);
+		acc100_dma_desc_ld_update(op, &desc->req, input, h_output,
+				&in_offset, &h_out_offset,
+				&h_out_length, harq_layout);
+	} else {
+		struct acc100_fcw_ld *fcw;
+		uint32_t seg_total_left;
+		fcw = &desc->req.fcw_ld;
+		acc100_fcw_ld_fill(op, fcw, harq_layout);
+
+		/* Special handling when overusing mbuf */
+		if (fcw->rm_e < MAX_E_MBUF)
+			seg_total_left = rte_pktmbuf_data_len(input)
+					- in_offset;
+		else
+			seg_total_left = fcw->rm_e;
+
+		ret = acc100_dma_desc_ld_fill(op, &desc->req, &input, h_output,
+				&in_offset, &h_out_offset,
+				&h_out_length, &mbuf_total_left,
+				&seg_total_left, fcw);
+		if (unlikely(ret < 0))
+			return ret;
+	}
+
+	/* Hard output */
+	mbuf_append(h_output_head, h_output, h_out_length);
+#ifndef ACC100_EXT_MEM
+	if (op->ldpc_dec.harq_combined_output.length > 0) {
+		/* Push the HARQ output into host memory */
+		struct rte_mbuf *hq_output_head, *hq_output;
+		hq_output_head = op->ldpc_dec.harq_combined_output.data;
+		hq_output = op->ldpc_dec.harq_combined_output.data;
+		mbuf_append(hq_output_head, hq_output,
+				op->ldpc_dec.harq_combined_output.length);
+	}
+#endif
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
+			sizeof(desc->req.fcw_ld) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+/* Enqueue one decode operation for ACC100 device in TB mode */
+static inline int
+enqueue_ldpc_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint8_t r, c;
+	uint32_t in_offset, h_out_offset,
+		h_out_length, mbuf_total_left, seg_total_left;
+	struct rte_mbuf *input, *h_output_head, *h_output;
+	uint16_t current_enqueued_cbs = 0;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+	acc100_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout);
+
+	input = op->ldpc_dec.input.data;
+	h_output_head = h_output = op->ldpc_dec.hard_output.data;
+	in_offset = op->ldpc_dec.input.offset;
+	h_out_offset = op->ldpc_dec.hard_output.offset;
+	h_out_length = 0;
+	mbuf_total_left = op->ldpc_dec.input.length;
+	c = op->ldpc_dec.tb_params.c;
+	r = op->ldpc_dec.tb_params.r;
+
+	while (mbuf_total_left > 0 && r < c) {
+
+		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+
+		/* Set up DMA descriptor */
+		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+				& q->sw_ring_wrap_mask);
+		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+		desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
+		ret = acc100_dma_desc_ld_fill(op, &desc->req, &input,
+				h_output, &in_offset, &h_out_offset,
+				&h_out_length,
+				&mbuf_total_left, &seg_total_left,
+				&desc->req.fcw_ld);
+
+		if (unlikely(ret < 0))
+			return ret;
+
+		/* Hard output */
+		mbuf_append(h_output_head, h_output, h_out_length);
+
+		/* Set total number of CBs in TB */
+		desc->req.cbs_in_tb = cbs_in_tb;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "FCW", &desc->req.fcw_td,
+				sizeof(desc->req.fcw_td) - 8);
+		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+		if (seg_total_left == 0) {
+			/* Go to the next mbuf */
+			input = input->next;
+			in_offset = 0;
+			h_output = h_output->next;
+			h_out_offset = 0;
+		}
+		total_enqueued_cbs++;
+		current_enqueued_cbs++;
+		r++;
+	}
+
+	if (unlikely(desc == NULL))
+		return current_enqueued_cbs;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Check if any CBs left for processing */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left for processing: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+	/* Set SDone on last CB descriptor for TB mode */
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	return current_enqueued_cbs;
+}
+
+/* Calculates number of CBs in processed encoder TB based on 'r' and input
+ * length.
+ */
+static inline uint8_t
+get_num_cbs_in_tb_enc(struct rte_bbdev_op_turbo_enc *turbo_enc)
+{
+	uint8_t c, c_neg, r, crc24_bits = 0;
+	uint16_t k, k_neg, k_pos;
+	uint8_t cbs_in_tb = 0;
+	int32_t length;
+
+	length = turbo_enc->input.length;
+	r = turbo_enc->tb_params.r;
+	c = turbo_enc->tb_params.c;
+	c_neg = turbo_enc->tb_params.c_neg;
+	k_neg = turbo_enc->tb_params.k_neg;
+	k_pos = turbo_enc->tb_params.k_pos;
+	crc24_bits = 0;
+	if (check_bit(turbo_enc->op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH))
+		crc24_bits = 24;
+	while (length > 0 && r < c) {
+		k = (r < c_neg) ? k_neg : k_pos;
+		length -= (k - crc24_bits) >> 3;
+		r++;
+		cbs_in_tb++;
+	}
+
+	return cbs_in_tb;
+}
+
+/* Calculates number of CBs in processed decoder TB based on 'r' and input
+ * length.
+ */
+static inline uint16_t
+get_num_cbs_in_tb_dec(struct rte_bbdev_op_turbo_dec *turbo_dec)
+{
+	uint8_t c, c_neg, r = 0;
+	uint16_t kw, k, k_neg, k_pos, cbs_in_tb = 0;
+	int32_t length;
+
+	length = turbo_dec->input.length;
+	r = turbo_dec->tb_params.r;
+	c = turbo_dec->tb_params.c;
+	c_neg = turbo_dec->tb_params.c_neg;
+	k_neg = turbo_dec->tb_params.k_neg;
+	k_pos = turbo_dec->tb_params.k_pos;
+	while (length > 0 && r < c) {
+		k = (r < c_neg) ? k_neg : k_pos;
+		kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
+		length -= kw;
+		r++;
+		cbs_in_tb++;
+	}
+
+	return cbs_in_tb;
+}
+
+/* Calculates number of CBs in processed LDPC decoder TB based on 'r' and
+ * input length.
+ */
+static inline uint16_t
+get_num_cbs_in_tb_ldpc_dec(struct rte_bbdev_op_ldpc_dec *ldpc_dec)
+{
+	uint16_t r, cbs_in_tb = 0;
+	int32_t length = ldpc_dec->input.length;
+	r = ldpc_dec->tb_params.r;
+	while (length > 0 && r < ldpc_dec->tb_params.c) {
+		length -=  (r < ldpc_dec->tb_params.cab) ?
+				ldpc_dec->tb_params.ea :
+				ldpc_dec->tb_params.eb;
+		r++;
+		cbs_in_tb++;
+	}
+	return cbs_in_tb;
+}
+
+/* Check we can mux encode operations with common FCW */
+static inline bool
+check_mux(struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	uint16_t i;
+	if (num == 1)
+		return false;
+	for (i = 1; i < num; ++i) {
+		/* Only mux compatible code blocks */
+		if (memcmp((uint8_t *)(&ops[i]->ldpc_enc) + ENC_OFFSET,
+				(uint8_t *)(&ops[0]->ldpc_enc) + ENC_OFFSET,
+				CMP_ENC_SIZE) != 0)
+			return false;
+	}
+	return true;
+}
+
+/* Enqueue encode operations for ACC100 device in CB mode. */
+static inline uint16_t
+acc100_enqueue_ldpc_enc_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i = 0;
+	union acc100_dma_desc *desc;
+	int ret, desc_idx = 0;
+	int16_t enq, left = num;
+
+	while (left > 0) {
+		if (unlikely(avail - 1 < 0))
+			break;
+		avail--;
+		enq = RTE_MIN(left, MUX_5GDL_DESC);
+		if (check_mux(&ops[i], enq)) {
+			ret = enqueue_ldpc_enc_n_op_cb(q, &ops[i],
+					desc_idx, enq);
+			if (ret < 0)
+				break;
+			i += enq;
+		} else {
+			ret = enqueue_ldpc_enc_one_op_cb(q, ops[i], desc_idx);
+			if (ret < 0)
+				break;
+			i++;
+		}
+		desc_idx++;
+		left = num - i;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + desc_idx - 1)
+			& q->sw_ring_wrap_mask);
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, desc_idx, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
+}
+
+/* Enqueue encode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	if (unlikely(num == 0))
+		return 0;
+	return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
+}
+
+/* Check we can mux decode operations with common FCW */
+static inline bool
+cmp_ldpc_dec_op(struct rte_bbdev_dec_op **ops)
+{
+	/* Only mux compatible code blocks */
+	return memcmp((uint8_t *)(&ops[0]->ldpc_dec) + DEC_OFFSET,
+			(uint8_t *)(&ops[1]->ldpc_dec) + DEC_OFFSET,
+			CMP_DEC_SIZE) == 0;
+}
+
+/* Enqueue decode operations for ACC100 device in TB mode */
+static uint16_t
+acc100_enqueue_ldpc_dec_tb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i, enqueued_cbs = 0;
+	uint8_t cbs_in_tb;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		cbs_in_tb = get_num_cbs_in_tb_ldpc_dec(&ops[i]->ldpc_dec);
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - cbs_in_tb < 0))
+			break;
+		avail -= cbs_in_tb;
+
+		ret = enqueue_ldpc_dec_one_op_tb(q, ops[i],
+				enqueued_cbs, cbs_in_tb);
+		if (ret < 0)
+			break;
+		enqueued_cbs += ret;
+	}
+
+	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+	return i;
+}
+
+/* Enqueue decode operations for ACC100 device in CB mode */
+static uint16_t
+acc100_enqueue_ldpc_dec_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i;
+	union acc100_dma_desc *desc;
+	int ret;
+	bool same_op = false;
+	for (i = 0; i < num; ++i) {
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - 1 < 0))
+			break;
+		avail -= 1;
+
+		if (i > 0)
+			same_op = cmp_ldpc_dec_op(&ops[i-1]);
+		rte_bbdev_log_debug("Op %d %d %d %d %d %d %d %d %d %d %d %d",
+			i, ops[i]->ldpc_dec.op_flags, ops[i]->ldpc_dec.rv_index,
+			ops[i]->ldpc_dec.iter_max, ops[i]->ldpc_dec.iter_count,
+			ops[i]->ldpc_dec.basegraph, ops[i]->ldpc_dec.z_c,
+			ops[i]->ldpc_dec.n_cb, ops[i]->ldpc_dec.q_m,
+			ops[i]->ldpc_dec.n_filler, ops[i]->ldpc_dec.cb_params.e,
+			same_op);
+		ret = enqueue_ldpc_dec_one_op_cb(q, ops[i], i, same_op);
+		if (ret < 0)
+			break;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
+			& q->sw_ring_wrap_mask);
+
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, i, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+	return i;
+}
+
+/* Enqueue decode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t aq_avail = q->aq_depth +
+			(q->aq_dequeued - q->aq_enqueued) / 128;
+
+	if (unlikely((aq_avail == 0) || (num == 0)))
+		return 0;
+
+	if (ops[0]->ldpc_dec.code_block_mode == 0)
+		return acc100_enqueue_ldpc_dec_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_ldpc_dec_cb(q_data, ops, num);
+}
+
+/* Dequeue encode operations from ACC100 device in CB mode */
+static inline int
+dequeue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
+		uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_enc_op *op;
+	int i;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	rsp.val = atom_desc.rsp.val;
+	rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+
+	op->status |= ((rsp.input_err)
+			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+	if (desc->req.last_desc_in_batch) {
+		(*aq_dequeued)++;
+		desc->req.last_desc_in_batch = 0;
+	}
+	desc->rsp.val = ACC100_DMA_DESC_TYPE;
+	desc->rsp.add_info_0 = 0; /* Reserved bits */
+	desc->rsp.add_info_1 = 0; /* Reserved bits */
+
+	/* Flag that the muxing causes loss of opaque data */
+	op->opaque_data = (void *)-1;
+	for (i = 0 ; i < desc->req.numCBs; i++)
+		ref_op[i] = op;
+
+	/* One CB (op) was successfully dequeued */
+	return desc->req.numCBs;
+}
+
+/* Dequeue one encode operation from ACC100 device in TB mode */
+static inline int
+dequeue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
+		uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, *last_desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_enc_op *op;
+	uint8_t i = 0;
+	uint16_t current_dequeued_cbs = 0, cbs_in_tb;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	/* Get number of CBs in dequeued TB */
+	cbs_in_tb = desc->req.cbs_in_tb;
+	/* Get last CB */
+	last_desc = q->ring_addr + ((q->sw_ring_tail
+			+ total_dequeued_cbs + cbs_in_tb - 1)
+			& q->sw_ring_wrap_mask);
+	/* Check if last CB in TB is ready to dequeue (and thus
+	 * the whole TB) - checking sdone bit. If not return.
+	 */
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
+			__ATOMIC_RELAXED);
+	if (!(atom_desc.rsp.val & ACC100_SDONE))
+		return -1;
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+
+	while (i < cbs_in_tb) {
+		desc = q->ring_addr + ((q->sw_ring_tail
+				+ total_dequeued_cbs)
+				& q->sw_ring_wrap_mask);
+		atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+				__ATOMIC_RELAXED);
+		rsp.val = atom_desc.rsp.val;
+		rte_bbdev_log_debug("Resp. desc %p: %x", desc,
+				rsp.val);
+
+		op->status |= ((rsp.input_err)
+				? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+		op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+		op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+		if (desc->req.last_desc_in_batch) {
+			(*aq_dequeued)++;
+			desc->req.last_desc_in_batch = 0;
+		}
+		desc->rsp.val = ACC100_DMA_DESC_TYPE;
+		desc->rsp.add_info_0 = 0;
+		desc->rsp.add_info_1 = 0;
+		total_dequeued_cbs++;
+		current_dequeued_cbs++;
+		i++;
+	}
+
+	*ref_op = op;
+
+	return current_dequeued_cbs;
+}
+
+/* Dequeue one decode operation from ACC100 device in CB mode */
+static inline int
+dequeue_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
+		struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_dec_op *op;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	rsp.val = atom_desc.rsp.val;
+	rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+	op->status |= ((rsp.input_err)
+			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+	if (op->status != 0)
+		q_data->queue_stats.dequeue_err_count++;
+
+	/* Report CRC status only when no other error is set */
+	if (!op->status)
+		op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+	op->turbo_dec.iter_count = (uint8_t) (rsp.iter_cnt / 2);
+	/* Check if this is the last desc in batch (Atomic Queue) */
+	if (desc->req.last_desc_in_batch) {
+		(*aq_dequeued)++;
+		desc->req.last_desc_in_batch = 0;
+	}
+	desc->rsp.val = ACC100_DMA_DESC_TYPE;
+	desc->rsp.add_info_0 = 0;
+	desc->rsp.add_info_1 = 0;
+	*ref_op = op;
+
+	/* One CB (op) was successfully dequeued */
+	return 1;
+}
+
+/* Dequeue one decode operation from ACC100 device in CB mode */
+static inline int
+dequeue_ldpc_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
+		struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_dec_op *op;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	rsp.val = atom_desc.rsp.val;
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+	op->status |= rsp.input_err << RTE_BBDEV_DATA_ERROR;
+	op->status |= rsp.dma_err << RTE_BBDEV_DRV_ERROR;
+	op->status |= rsp.fcw_err << RTE_BBDEV_DRV_ERROR;
+	if (op->status != 0)
+		q_data->queue_stats.dequeue_err_count++;
+
+	op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+	if (op->ldpc_dec.hard_output.length > 0 && !rsp.synd_ok)
+		op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
+	op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
+
+	/* Check if this is the last desc in batch (Atomic Queue) */
+	if (desc->req.last_desc_in_batch) {
+		(*aq_dequeued)++;
+		desc->req.last_desc_in_batch = 0;
+	}
+
+	desc->rsp.val = ACC100_DMA_DESC_TYPE;
+	desc->rsp.add_info_0 = 0;
+	desc->rsp.add_info_1 = 0;
+
+	*ref_op = op;
+
+	/* One CB (op) was successfully dequeued */
+	return 1;
+}
+
+/* Dequeue one decode operation from ACC100 device in TB mode. */
+static inline int
+dequeue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, *last_desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_dec_op *op;
+	uint8_t cbs_in_tb = 1, cb_idx = 0;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Get number of CBs in dequeued TB */
+	cbs_in_tb = desc->req.cbs_in_tb;
+	/* Get last CB */
+	last_desc = q->ring_addr + ((q->sw_ring_tail
+			+ dequeued_cbs + cbs_in_tb - 1)
+			& q->sw_ring_wrap_mask);
+	/* Check if last CB in TB is ready to dequeue (and thus
+	 * the whole TB) - checking sdone bit. If not return.
+	 */
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
+			__ATOMIC_RELAXED);
+	if (!(atom_desc.rsp.val & ACC100_SDONE))
+		return -1;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+
+	/* Read remaining CBs if any exist */
+	while (cb_idx < cbs_in_tb) {
+		desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+				& q->sw_ring_wrap_mask);
+		atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+				__ATOMIC_RELAXED);
+		rsp.val = atom_desc.rsp.val;
+		rte_bbdev_log_debug("Resp. desc %p: %x", desc,
+				rsp.val);
+
+		op->status |= ((rsp.input_err)
+				? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+		op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+		op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+		/* Report CRC status only when no other error is set */
+		if (!op->status)
+			op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+		op->turbo_dec.iter_count = RTE_MAX((uint8_t) rsp.iter_cnt,
+				op->turbo_dec.iter_count);
+
+		/* Check if this is the last desc in batch (Atomic Queue) */
+		if (desc->req.last_desc_in_batch) {
+			(*aq_dequeued)++;
+			desc->req.last_desc_in_batch = 0;
+		}
+		desc->rsp.val = ACC100_DMA_DESC_TYPE;
+		desc->rsp.add_info_0 = 0;
+		desc->rsp.add_info_1 = 0;
+		dequeued_cbs++;
+		cb_idx++;
+	}
+
+	*ref_op = op;
+
+	return cb_idx;
+}
+
+/* Dequeue LDPC encode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t dequeue_num, i, dequeued_cbs = 0, dequeued_descs = 0;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = (avail < num) ? avail : num;
+
+	for (i = 0; i < dequeue_num; i++) {
+		ret = dequeue_enc_one_op_cb(q, &ops[dequeued_cbs],
+				dequeued_descs, &aq_dequeued);
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+		dequeued_descs++;
+		if (dequeued_cbs >= num)
+			break;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_descs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += dequeued_cbs;
+
+	return dequeued_cbs;
+}
+
+/* Dequeue decode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint16_t dequeue_num;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t i;
+	uint16_t dequeued_cbs = 0;
+	struct rte_bbdev_dec_op *op;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = (avail < num) ? avail : num;
+
+	for (i = 0; i < dequeue_num; ++i) {
+		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask))->req.op_addr;
+		if (op->ldpc_dec.code_block_mode == 0)
+			ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+		else
+			ret = dequeue_ldpc_dec_one_op_cb(
+					q_data, q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_cbs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += i;
+
+	return i;
+}
+
 /* Initialization Function */
 static void
 acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
@@ -703,6 +2321,10 @@
 	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
 
 	dev->dev_ops = &acc100_bbdev_ops;
+	dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
+	dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
+	dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
+	dev->dequeue_ldpc_dec_ops = acc100_dequeue_ldpc_dec;
 
 	((struct acc100_device *) dev->data->dev_private)->pf_device =
 			!strcmp(drv->driver.name,
@@ -815,4 +2437,3 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
 RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
 RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
 RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
-
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 0e2b79c..78686c1 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -88,6 +88,8 @@
 #define TMPL_PRI_3      0x0f0e0d0c
 #define QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
 #define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
+#define ACC100_FDONE    0x80000000
+#define ACC100_SDONE    0x40000000
 
 #define ACC100_NUM_TMPL  32
 #define VF_OFFSET_QOS 16 /* offset in Memory Space specific to QoS Mon */
@@ -398,6 +400,7 @@ struct __rte_packed acc100_dma_req_desc {
 union acc100_dma_desc {
 	struct acc100_dma_req_desc req;
 	union acc100_dma_rsp_desc rsp;
+	uint64_t atom_hdr;
 };
 
 
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v7 06/11] baseband/acc100: add HARQ loopback support
  2020-09-23  2:24   ` [dpdk-dev] [PATCH v7 00/11] bbdev PMD ACC100 Nicolas Chautru
                       ` (4 preceding siblings ...)
  2020-09-23  2:25     ` [dpdk-dev] [PATCH v7 05/11] baseband/acc100: add LDPC processing functions Nicolas Chautru
@ 2020-09-23  2:25     ` Nicolas Chautru
  2020-09-23  2:25     ` [dpdk-dev] [PATCH v7 07/11] baseband/acc100: add support for 4G processing Nicolas Chautru
                       ` (4 subsequent siblings)
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-23  2:25 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add support for HARQ memory loopback.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 158 +++++++++++++++++++++++++++++++
 1 file changed, 158 insertions(+)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index b223547..e484c0a 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -658,6 +658,7 @@
 				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
 				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
 #ifdef ACC100_EXT_MEM
+				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK |
 				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
 				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
 #endif
@@ -1480,12 +1481,169 @@
 	return 1;
 }
 
+static inline int
+harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs) {
+	struct acc100_fcw_ld *fcw;
+	union acc100_dma_desc *desc;
+	int next_triplet = 1;
+	struct rte_mbuf *hq_output_head, *hq_output;
+	uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
+	if (harq_in_length == 0) {
+		rte_bbdev_log(ERR, "Loopback of invalid null size\n");
+		return -EINVAL;
+	}
+
+	int h_comp = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
+			) ? 1 : 0;
+	if (h_comp == 1)
+		harq_in_length = harq_in_length * 8 / 6;
+	harq_in_length = RTE_ALIGN(harq_in_length, 64);
+	uint16_t harq_dma_length_in = (h_comp == 0) ?
+			harq_in_length :
+			harq_in_length * 6 / 8;
+	uint16_t harq_dma_length_out = harq_dma_length_in;
+	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+	uint16_t harq_index = (ddr_mem_in ?
+			op->ldpc_dec.harq_combined_input.offset :
+			op->ldpc_dec.harq_combined_output.offset)
+			/ ACC100_HARQ_OFFSET;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	fcw = &desc->req.fcw_ld;
+	/* Set the FCW from loopback into DDR */
+	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
+	fcw->FCWversion = ACC100_FCW_VER;
+	fcw->qm = 2;
+	fcw->Zc = 384;
+	if (harq_in_length < 16 * N_ZC_1)
+		fcw->Zc = 16;
+	fcw->ncb = fcw->Zc * N_ZC_1;
+	fcw->rm_e = 2;
+	fcw->hcin_en = 1;
+	fcw->hcout_en = 1;
+
+	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d\n",
+			ddr_mem_in, harq_index,
+			harq_layout[harq_index].offset, harq_in_length,
+			harq_dma_length_in);
+
+	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
+		fcw->hcin_size0 = harq_layout[harq_index].size0;
+		fcw->hcin_offset = harq_layout[harq_index].offset;
+		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
+		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
+		if (h_comp == 1)
+			harq_dma_length_in = harq_dma_length_in * 6 / 8;
+	} else {
+		fcw->hcin_size0 = harq_in_length;
+	}
+	harq_layout[harq_index].val = 0;
+	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
+			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
+	fcw->hcout_size0 = harq_in_length;
+	fcw->hcin_decomp_mode = h_comp;
+	fcw->hcout_comp_mode = h_comp;
+	fcw->gain_i = 1;
+	fcw->gain_h = 1;
+
+	/* Set the descriptor prefix. This could be done at polling time. */
+	desc->req.word0 = ACC100_DMA_DESC_TYPE;
+	desc->req.word1 = 0; /**< Timestamp could be disabled */
+	desc->req.word2 = 0;
+	desc->req.word3 = 0;
+	desc->req.numCBs = 1;
+
+	/* Null LLR input for Decoder */
+	desc->req.data_ptrs[next_triplet].address =
+			q->lb_in_addr_phys;
+	desc->req.data_ptrs[next_triplet].blen = 2;
+	desc->req.data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
+	desc->req.data_ptrs[next_triplet].last = 0;
+	desc->req.data_ptrs[next_triplet].dma_ext = 0;
+	next_triplet++;
+
+	/* HARQ Combine input from either Memory interface */
+	if (!ddr_mem_in) {
+		next_triplet = acc100_dma_fill_blk_type_out(&desc->req,
+				op->ldpc_dec.harq_combined_input.data,
+				op->ldpc_dec.harq_combined_input.offset,
+				harq_dma_length_in,
+				next_triplet,
+				ACC100_DMA_BLKID_IN_HARQ);
+	} else {
+		desc->req.data_ptrs[next_triplet].address =
+				op->ldpc_dec.harq_combined_input.offset;
+		desc->req.data_ptrs[next_triplet].blen =
+				harq_dma_length_in;
+		desc->req.data_ptrs[next_triplet].blkid =
+				ACC100_DMA_BLKID_IN_HARQ;
+		desc->req.data_ptrs[next_triplet].dma_ext = 1;
+		next_triplet++;
+	}
+	desc->req.data_ptrs[next_triplet - 1].last = 1;
+	desc->req.m2dlen = next_triplet;
+
+	/* Dropped decoder hard output */
+	desc->req.data_ptrs[next_triplet].address =
+			q->lb_out_addr_phys;
+	desc->req.data_ptrs[next_triplet].blen = BYTES_IN_WORD;
+	desc->req.data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_OUT_HARD;
+	desc->req.data_ptrs[next_triplet].last = 0;
+	desc->req.data_ptrs[next_triplet].dma_ext = 0;
+	next_triplet++;
+
+	/* HARQ Combine output to either Memory interface */
+	if (check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE
+			)) {
+		desc->req.data_ptrs[next_triplet].address =
+				op->ldpc_dec.harq_combined_output.offset;
+		desc->req.data_ptrs[next_triplet].blen =
+				harq_dma_length_out;
+		desc->req.data_ptrs[next_triplet].blkid =
+				ACC100_DMA_BLKID_OUT_HARQ;
+		desc->req.data_ptrs[next_triplet].dma_ext = 1;
+		next_triplet++;
+	} else {
+		hq_output_head = op->ldpc_dec.harq_combined_output.data;
+		hq_output = op->ldpc_dec.harq_combined_output.data;
+		next_triplet = acc100_dma_fill_blk_type_out(
+				&desc->req,
+				op->ldpc_dec.harq_combined_output.data,
+				op->ldpc_dec.harq_combined_output.offset,
+				harq_dma_length_out,
+				next_triplet,
+				ACC100_DMA_BLKID_OUT_HARQ);
+		/* HARQ output */
+		mbuf_append(hq_output_head, hq_output, harq_dma_length_out);
+		op->ldpc_dec.harq_combined_output.length =
+				harq_dma_length_out;
+	}
+	desc->req.data_ptrs[next_triplet - 1].last = 1;
+	desc->req.d2mlen = next_triplet - desc->req.m2dlen;
+	desc->req.op_addr = op;
+
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
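The length arithmetic at the top of `harq_loopback()` above can be isolated as follows: with `RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION`, each LLR occupies 6 bits instead of 8, so the uncompressed size is the input scaled by 8/6 and aligned up to 64, while the actual DMA transfer length is the compressed 6/8 of that. The helpers below are an editorial sketch of that arithmetic, not part of the driver:

```c
#include <assert.h>
#include <stdint.h>

/* Uncompressed HARQ buffer length: expand 6-bit-compressed input by 8/6,
 * then round up to a multiple of 64 (mirrors RTE_ALIGN(x, 64)). */
static inline uint16_t
harq_uncompressed_len(uint16_t in_len, int h_comp)
{
	if (h_comp)
		in_len = in_len * 8 / 6;
	return (in_len + 63) & ~63;
}

/* DMA transfer length: the compressed 6/8 of the uncompressed size when
 * 6-bit compression is enabled, the full size otherwise. */
static inline uint16_t
harq_dma_len(uint16_t uncompressed_len, int h_comp)
{
	return h_comp ? uncompressed_len * 6 / 8 : uncompressed_len;
}
```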
 /** Enqueue one decode operation for ACC100 device in CB mode */
 static inline int
 enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
 		uint16_t total_enqueued_cbs, bool same_op)
 {
 	int ret;
+	if (unlikely(check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK))) {
+		ret = harq_loopback(q, op, total_enqueued_cbs);
+		return ret;
+	}
 
 	union acc100_dma_desc *desc;
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v7 07/11] baseband/acc100: add support for 4G processing
  2020-09-23  2:24   ` [dpdk-dev] [PATCH v7 00/11] bbdev PMD ACC100 Nicolas Chautru
                       ` (5 preceding siblings ...)
  2020-09-23  2:25     ` [dpdk-dev] [PATCH v7 06/11] baseband/acc100: add HARQ loopback support Nicolas Chautru
@ 2020-09-23  2:25     ` Nicolas Chautru
  2020-09-23  2:25     ` [dpdk-dev] [PATCH v7 08/11] baseband/acc100: add interrupt support to PMD Nicolas Chautru
                       ` (3 subsequent siblings)
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-23  2:25 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add capability for 4G encode and decode processing.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 1010 ++++++++++++++++++++++++++++--
 1 file changed, 943 insertions(+), 67 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index e484c0a..7d4c3df 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -339,7 +339,6 @@
 	free_base_addresses(base_addrs, i);
 }
 
-
 /* Allocate 64MB memory used for all software rings */
 static int
 acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
@@ -637,6 +636,41 @@
 
 	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
 		{
+			.type = RTE_BBDEV_OP_TURBO_DEC,
+			.cap.turbo_dec = {
+				.capability_flags =
+					RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE |
+					RTE_BBDEV_TURBO_CRC_TYPE_24B |
+					RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
+					RTE_BBDEV_TURBO_EARLY_TERMINATION |
+					RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
+					RTE_BBDEV_TURBO_MAP_DEC |
+					RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP |
+					RTE_BBDEV_TURBO_DEC_SCATTER_GATHER,
+				.max_llr_modulus = INT8_MAX,
+				.num_buffers_src =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+				.num_buffers_hard_out =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+				.num_buffers_soft_out =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+			}
+		},
+		{
+			.type = RTE_BBDEV_OP_TURBO_ENC,
+			.cap.turbo_enc = {
+				.capability_flags =
+					RTE_BBDEV_TURBO_CRC_24B_ATTACH |
+					RTE_BBDEV_TURBO_RV_INDEX_BYPASS |
+					RTE_BBDEV_TURBO_RATE_MATCH |
+					RTE_BBDEV_TURBO_ENC_SCATTER_GATHER,
+				.num_buffers_src =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+				.num_buffers_dst =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+			}
+		},
+		{
 			.type   = RTE_BBDEV_OP_LDPC_ENC,
 			.cap.ldpc_enc = {
 				.capability_flags =
@@ -719,7 +753,6 @@
 #endif
 }
 
-
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.setup_queues = acc100_setup_queues,
 	.close = acc100_dev_close,
@@ -763,6 +796,58 @@
 	return tail;
 }
 
+/* Fill in a frame control word for turbo encoding. */
+static inline void
+acc100_fcw_te_fill(const struct rte_bbdev_enc_op *op, struct acc100_fcw_te *fcw)
+{
+	fcw->code_block_mode = op->turbo_enc.code_block_mode;
+	if (fcw->code_block_mode == 0) { /* For TB mode */
+		fcw->k_neg = op->turbo_enc.tb_params.k_neg;
+		fcw->k_pos = op->turbo_enc.tb_params.k_pos;
+		fcw->c_neg = op->turbo_enc.tb_params.c_neg;
+		fcw->c = op->turbo_enc.tb_params.c;
+		fcw->ncb_neg = op->turbo_enc.tb_params.ncb_neg;
+		fcw->ncb_pos = op->turbo_enc.tb_params.ncb_pos;
+
+		if (check_bit(op->turbo_enc.op_flags,
+				RTE_BBDEV_TURBO_RATE_MATCH)) {
+			fcw->bypass_rm = 0;
+			fcw->cab = op->turbo_enc.tb_params.cab;
+			fcw->ea = op->turbo_enc.tb_params.ea;
+			fcw->eb = op->turbo_enc.tb_params.eb;
+		} else {
+			/* E is set to the encoding output size when RM is
+			 * bypassed.
+			 */
+			fcw->bypass_rm = 1;
+			fcw->cab = fcw->c_neg;
+			fcw->ea = 3 * fcw->k_neg + 12;
+			fcw->eb = 3 * fcw->k_pos + 12;
+		}
+	} else { /* For CB mode */
+		fcw->k_pos = op->turbo_enc.cb_params.k;
+		fcw->ncb_pos = op->turbo_enc.cb_params.ncb;
+
+		if (check_bit(op->turbo_enc.op_flags,
+				RTE_BBDEV_TURBO_RATE_MATCH)) {
+			fcw->bypass_rm = 0;
+			fcw->eb = op->turbo_enc.cb_params.e;
+		} else {
+			/* E is set to the encoding output size when RM is
+			 * bypassed.
+			 */
+			fcw->bypass_rm = 1;
+			fcw->eb = 3 * fcw->k_pos + 12;
+		}
+	}
+
+	fcw->bypass_rv_idx1 = check_bit(op->turbo_enc.op_flags,
+			RTE_BBDEV_TURBO_RV_INDEX_BYPASS);
+	fcw->code_block_crc = check_bit(op->turbo_enc.op_flags,
+			RTE_BBDEV_TURBO_CRC_24B_ATTACH);
+	fcw->rv_idx1 = op->turbo_enc.rv_index;
+}
+
 /* Compute value of k0.
  * Based on 3GPP 38.212 Table 5.4.2.1-2
  * Starting position of different redundancy versions, k0
@@ -813,6 +898,25 @@
 	fcw->mcb_count = num_cb;
 }
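The k0 computation mentioned in the comment above follows the starting positions of 3GPP TS 38.212 Table 5.4.2.1-2: for base graph 1 the numerators are {0, 17, 33, 56} over 66·Zc, for base graph 2 they are {0, 13, 25, 43} over 50·Zc. A hedged standalone sketch — the function name `k0_pos` and its exact shape are illustrative; consult the table for the normative values:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative k0 starting position per 3GPP TS 38.212 Table 5.4.2.1-2.
 * n_cb: circular buffer size, z_c: lifting size, bg: base graph (1 or 2),
 * rv_index: redundancy version 0..3. */
static inline uint32_t
k0_pos(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv_index)
{
	static const uint8_t num_bg1[4] = {0, 17, 33, 56};
	static const uint8_t num_bg2[4] = {0, 13, 25, 43};
	uint16_t den = (bg == 1) ? 66 : 50;
	uint8_t num = (bg == 1) ? num_bg1[rv_index & 3]
				: num_bg2[rv_index & 3];

	if (rv_index == 0)
		return 0;
	/* floor(num * Ncb / (den * Zc)) * Zc */
	return ((uint32_t)num * n_cb / ((uint32_t)den * z_c)) * z_c;
}
```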
 
+/* Fill in a frame control word for turbo decoding. */
+static inline void
+acc100_fcw_td_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_td *fcw)
+{
+	/* Note: Early termination is always enabled for 4G UL */
+	fcw->fcw_ver = 1;
+	if (op->turbo_dec.code_block_mode == 0)
+		fcw->k_pos = op->turbo_dec.tb_params.k_pos;
+	else
+		fcw->k_pos = op->turbo_dec.cb_params.k;
+	fcw->turbo_crc_type = check_bit(op->turbo_dec.op_flags,
+			RTE_BBDEV_TURBO_CRC_TYPE_24B);
+	fcw->bypass_sb_deint = 0;
+	fcw->raw_decoder_input_on = 0;
+	fcw->max_iter = op->turbo_dec.iter_max;
+	fcw->half_iter_on = !check_bit(op->turbo_dec.op_flags,
+			RTE_BBDEV_TURBO_HALF_ITERATION_EVEN);
+}
+
 /* Fill in a frame control word for LDPC decoding. */
 static inline void
 acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
@@ -1042,6 +1146,87 @@
 }
 
 static inline int
+acc100_dma_desc_te_fill(struct rte_bbdev_enc_op *op,
+		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
+		struct rte_mbuf *output, uint32_t *in_offset,
+		uint32_t *out_offset, uint32_t *out_length,
+		uint32_t *mbuf_total_left, uint32_t *seg_total_left, uint8_t r)
+{
+	int next_triplet = 1; /* FCW already done */
+	uint32_t e, ea, eb, length;
+	uint16_t k, k_neg, k_pos;
+	uint8_t cab, c_neg;
+
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /**< Timestamp could be disabled */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+
+	if (op->turbo_enc.code_block_mode == 0) {
+		ea = op->turbo_enc.tb_params.ea;
+		eb = op->turbo_enc.tb_params.eb;
+		cab = op->turbo_enc.tb_params.cab;
+		k_neg = op->turbo_enc.tb_params.k_neg;
+		k_pos = op->turbo_enc.tb_params.k_pos;
+		c_neg = op->turbo_enc.tb_params.c_neg;
+		e = (r < cab) ? ea : eb;
+		k = (r < c_neg) ? k_neg : k_pos;
+	} else {
+		e = op->turbo_enc.cb_params.e;
+		k = op->turbo_enc.cb_params.k;
+	}
+
+	if (check_bit(op->turbo_enc.op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH))
+		length = (k - 24) >> 3;
+	else
+		length = k >> 3;
+
+	if (unlikely((*mbuf_total_left == 0) || (*mbuf_total_left < length))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, length);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
+			length, seg_total_left, next_triplet);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= length;
+
+	/* Set output length */
+	if (check_bit(op->turbo_enc.op_flags, RTE_BBDEV_TURBO_RATE_MATCH))
+		/* Integer round up division by 8 */
+		*out_length = (e + 7) >> 3;
+	else
+		*out_length = (k >> 3) * 3 + 2;
+
+	next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
+			*out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	op->turbo_enc.output.length += *out_length;
+	*out_offset += *out_length;
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
+
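The output-length rule at the end of `acc100_dma_desc_te_fill()` above can be restated compactly: with rate matching the E rate-matched bits are rounded up to whole bytes, and with RM bypassed the driver sizes the output as the three K-bit streams plus trailing tail bytes, i.e. (K / 8) * 3 + 2. The helper below is an illustrative restatement, not driver code:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative turbo-encoder output size in bytes.
 * e: rate-matched output size in bits, k: code block size in bits. */
static inline uint32_t
turbo_enc_out_bytes(uint32_t e, uint16_t k, int rate_match)
{
	if (rate_match)
		return (e + 7) >> 3;    /* integer round-up division by 8 */
	/* RM bypassed: three k-bit streams plus tail bytes */
	return ((uint32_t)k >> 3) * 3 + 2;
}
```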
+static inline int
 acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
 		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
 		struct rte_mbuf *output, uint32_t *in_offset,
@@ -1110,6 +1295,117 @@
 }
 
 static inline int
+acc100_dma_desc_td_fill(struct rte_bbdev_dec_op *op,
+		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
+		struct rte_mbuf *h_output, struct rte_mbuf *s_output,
+		uint32_t *in_offset, uint32_t *h_out_offset,
+		uint32_t *s_out_offset, uint32_t *h_out_length,
+		uint32_t *s_out_length, uint32_t *mbuf_total_left,
+		uint32_t *seg_total_left, uint8_t r)
+{
+	int next_triplet = 1; /* FCW already done */
+	uint16_t k;
+	uint16_t crc24_overlap = 0;
+	uint32_t e, kw;
+
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /**< Timestamp could be disabled */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+
+	if (op->turbo_dec.code_block_mode == 0) {
+		k = (r < op->turbo_dec.tb_params.c_neg)
+			? op->turbo_dec.tb_params.k_neg
+			: op->turbo_dec.tb_params.k_pos;
+		e = (r < op->turbo_dec.tb_params.cab)
+			? op->turbo_dec.tb_params.ea
+			: op->turbo_dec.tb_params.eb;
+	} else {
+		k = op->turbo_dec.cb_params.k;
+		e = op->turbo_dec.cb_params.e;
+	}
+
+	if ((op->turbo_dec.code_block_mode == 0)
+		&& !check_bit(op->turbo_dec.op_flags,
+		RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP))
+		crc24_overlap = 24;
+
+	/* Calculate the circular buffer size.
+	 * According to 3GPP TS 36.212 section 5.1.4.2:
+	 *   Kw = 3 * Kpi,
+	 * where:
+	 *   Kpi = nCol * nRow
+	 * where nCol is 32 and nRow can be calculated from:
+	 *   D <= nCol * nRow
+	 * where D is the size of each output from the turbo encoder block
+	 * (k + 4).
+	 */
+	kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
+
+	if (unlikely((*mbuf_total_left == 0) || (*mbuf_total_left < kw))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, kw);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset, kw,
+			seg_total_left, next_triplet);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= kw;
+
+	next_triplet = acc100_dma_fill_blk_type_out(
+			desc, h_output, *h_out_offset,
+			k >> 3, next_triplet, ACC100_DMA_BLKID_OUT_HARD);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+
+	*h_out_length = ((k - crc24_overlap) >> 3);
+	op->turbo_dec.hard_output.length += *h_out_length;
+	*h_out_offset += *h_out_length;
+
+	/* Soft output */
+	if (check_bit(op->turbo_dec.op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT)) {
+		if (check_bit(op->turbo_dec.op_flags,
+				RTE_BBDEV_TURBO_EQUALIZER))
+			*s_out_length = e;
+		else
+			*s_out_length = (k * 3) + 12;
+
+		next_triplet = acc100_dma_fill_blk_type_out(desc, s_output,
+				*s_out_offset, *s_out_length, next_triplet,
+				ACC100_DMA_BLKID_OUT_SOFT);
+		if (unlikely(next_triplet < 0)) {
+			rte_bbdev_log(ERR,
+					"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+					op);
+			return -1;
+		}
+
+		op->turbo_dec.soft_output.length += *s_out_length;
+		*s_out_offset += *s_out_length;
+	}
+
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
+
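The Kw computation in `acc100_dma_desc_td_fill()` above reduces to rounding D = k + 4 up to a multiple of the 32 interleaver columns and multiplying by the three encoder streams, which is exactly what `RTE_ALIGN_CEIL(k + 4, 32) * 3` does. A standalone restatement for illustration only:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative circular buffer size Kw for the turbo decoder input,
 * per 3GPP TS 36.212: Kw = 3 * Kpi, Kpi = 32 * ceil(D / 32), D = k + 4. */
static inline uint32_t
turbo_kw(uint16_t k)
{
	uint32_t d = (uint32_t)k + 4;   /* encoder output per stream */
	uint32_t kpi = (d + 31) & ~31u; /* round up to 32 columns */
	return kpi * 3;                 /* three interleaved streams */
}
```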
+static inline int
 acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
 		struct acc100_dma_req_desc *desc,
 		struct rte_mbuf **input, struct rte_mbuf *h_output,
@@ -1374,6 +1670,57 @@
 
 /* Enqueue one encode operation for ACC100 device in CB mode */
 static inline int
+enqueue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+		uint16_t total_enqueued_cbs)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+		seg_total_left;
+	struct rte_mbuf *input, *output_head, *output;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_te_fill(op, &desc->req.fcw_te);
+
+	input = op->turbo_enc.input.data;
+	output_head = output = op->turbo_enc.output.data;
+	in_offset = op->turbo_enc.input.offset;
+	out_offset = op->turbo_enc.output.offset;
+	out_length = 0;
+	mbuf_total_left = op->turbo_enc.input.length;
+	seg_total_left = rte_pktmbuf_data_len(op->turbo_enc.input.data)
+			- in_offset;
+
+	ret = acc100_dma_desc_te_fill(op, &desc->req, &input, output,
+			&in_offset, &out_offset, &out_length, &mbuf_total_left,
+			&seg_total_left, 0);
+
+	if (unlikely(ret < 0))
+		return ret;
+
+	mbuf_append(output_head, output, out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_te,
+			sizeof(desc->req.fcw_te) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+
+	/* Check if any data left after processing one CB */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left after processing one CB: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+/* Enqueue multiple LDPC encode operations for ACC100 device in CB mode */
+static inline int
 enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
 		uint16_t total_enqueued_cbs, int16_t num)
 {
@@ -1481,78 +1828,235 @@
 	return 1;
 }
 
-static inline int
-harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
-		uint16_t total_enqueued_cbs) {
-	struct acc100_fcw_ld *fcw;
-	union acc100_dma_desc *desc;
-	int next_triplet = 1;
-	struct rte_mbuf *hq_output_head, *hq_output;
-	uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
-	if (harq_in_length == 0) {
-		rte_bbdev_log(ERR, "Loopback of invalid null size\n");
-		return -EINVAL;
-	}
 
-	int h_comp = check_bit(op->ldpc_dec.op_flags,
-			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
-			) ? 1 : 0;
-	if (h_comp == 1)
-		harq_in_length = harq_in_length * 8 / 6;
-	harq_in_length = RTE_ALIGN(harq_in_length, 64);
-	uint16_t harq_dma_length_in = (h_comp == 0) ?
-			harq_in_length :
-			harq_in_length * 6 / 8;
-	uint16_t harq_dma_length_out = harq_dma_length_in;
-	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
-			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
-	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
-	uint16_t harq_index = (ddr_mem_in ?
-			op->ldpc_dec.harq_combined_input.offset :
-			op->ldpc_dec.harq_combined_output.offset)
-			/ ACC100_HARQ_OFFSET;
+/* Enqueue one encode operation for ACC100 device in TB mode. */
+static inline int
+enqueue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint8_t r, c;
+	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+		seg_total_left;
+	struct rte_mbuf *input, *output_head, *output;
+	uint16_t current_enqueued_cbs = 0;
 
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
-	fcw = &desc->req.fcw_ld;
-	/* Set the FCW from loopback into DDR */
-	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
-	fcw->FCWversion = ACC100_FCW_VER;
-	fcw->qm = 2;
-	fcw->Zc = 384;
-	if (harq_in_length < 16 * N_ZC_1)
-		fcw->Zc = 16;
-	fcw->ncb = fcw->Zc * N_ZC_1;
-	fcw->rm_e = 2;
-	fcw->hcin_en = 1;
-	fcw->hcout_en = 1;
+	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+	acc100_fcw_te_fill(op, &desc->req.fcw_te);
 
-	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d\n",
-			ddr_mem_in, harq_index,
-			harq_layout[harq_index].offset, harq_in_length,
-			harq_dma_length_in);
+	input = op->turbo_enc.input.data;
+	output_head = output = op->turbo_enc.output.data;
+	in_offset = op->turbo_enc.input.offset;
+	out_offset = op->turbo_enc.output.offset;
+	out_length = 0;
+	mbuf_total_left = op->turbo_enc.input.length;
 
-	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
-		fcw->hcin_size0 = harq_layout[harq_index].size0;
-		fcw->hcin_offset = harq_layout[harq_index].offset;
-		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
-		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
-		if (h_comp == 1)
-			harq_dma_length_in = harq_dma_length_in * 6 / 8;
-	} else {
-		fcw->hcin_size0 = harq_in_length;
-	}
-	harq_layout[harq_index].val = 0;
-	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
-			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
-	fcw->hcout_size0 = harq_in_length;
-	fcw->hcin_decomp_mode = h_comp;
-	fcw->hcout_comp_mode = h_comp;
-	fcw->gain_i = 1;
-	fcw->gain_h = 1;
+	c = op->turbo_enc.tb_params.c;
+	r = op->turbo_enc.tb_params.r;
 
-	/* Set the prefix of descriptor. This could be done at polling */
+	while (mbuf_total_left > 0 && r < c) {
+		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+		/* Set up DMA descriptor */
+		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+				& q->sw_ring_wrap_mask);
+		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+		desc->req.data_ptrs[0].blen = ACC100_FCW_TE_BLEN;
+
+		ret = acc100_dma_desc_te_fill(op, &desc->req, &input, output,
+				&in_offset, &out_offset, &out_length,
+				&mbuf_total_left, &seg_total_left, r);
+		if (unlikely(ret < 0))
+			return ret;
+		mbuf_append(output_head, output, out_length);
+
+		/* Set total number of CBs in TB */
+		desc->req.cbs_in_tb = cbs_in_tb;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "FCW", &desc->req.fcw_te,
+				sizeof(desc->req.fcw_te) - 8);
+		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+		if (seg_total_left == 0) {
+			/* Go to the next mbuf */
+			input = input->next;
+			in_offset = 0;
+			output = output->next;
+			out_offset = 0;
+		}
+
+		total_enqueued_cbs++;
+		current_enqueued_cbs++;
+		r++;
+	}
+
+	if (unlikely(desc == NULL))
+		return current_enqueued_cbs;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Check if any CBs left for processing */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left for processing: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+
+	/* Set SDone on last CB descriptor for TB mode. */
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	return current_enqueued_cbs;
+}
+
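In TB mode, following 3GPP TS 36.212 segmentation, the first c_neg code blocks have size k_neg and the remainder k_pos, while the first cab blocks use rate-match output ea and the rest eb; the TB enqueue loops above select per-CB parameters on that basis. A minimal sketch of the selection — the parameter names mirror the op structures but these helpers are illustrative:

```c
#include <assert.h>
#include <stdint.h>

/* Code block size for CB index r: k_neg for the first c_neg blocks,
 * k_pos afterwards (illustrative restatement of the driver logic). */
static inline uint16_t
tb_cb_k(uint8_t r, uint8_t c_neg, uint16_t k_neg, uint16_t k_pos)
{
	return (r < c_neg) ? k_neg : k_pos;
}

/* Rate-match output size for CB index r: ea for the first cab blocks,
 * eb afterwards. */
static inline uint32_t
tb_cb_e(uint8_t r, uint8_t cab, uint32_t ea, uint32_t eb)
{
	return (r < cab) ? ea : eb;
}
```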
+/** Enqueue one decode operation for ACC100 device in CB mode */
+static inline int
+enqueue_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint32_t in_offset, h_out_offset, s_out_offset, s_out_length,
+		h_out_length, mbuf_total_left, seg_total_left;
+	struct rte_mbuf *input, *h_output_head, *h_output,
+		*s_output_head, *s_output;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_td_fill(op, &desc->req.fcw_td);
+
+	input = op->turbo_dec.input.data;
+	h_output_head = h_output = op->turbo_dec.hard_output.data;
+	s_output_head = s_output = op->turbo_dec.soft_output.data;
+	in_offset = op->turbo_dec.input.offset;
+	h_out_offset = op->turbo_dec.hard_output.offset;
+	s_out_offset = op->turbo_dec.soft_output.offset;
+	h_out_length = s_out_length = 0;
+	mbuf_total_left = op->turbo_dec.input.length;
+	seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(input == NULL)) {
+		rte_bbdev_log(ERR, "Invalid mbuf pointer");
+		return -EFAULT;
+	}
+#endif
+
+	/* Set up DMA descriptor */
+	desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+
+	ret = acc100_dma_desc_td_fill(op, &desc->req, &input, h_output,
+			s_output, &in_offset, &h_out_offset, &s_out_offset,
+			&h_out_length, &s_out_length, &mbuf_total_left,
+			&seg_total_left, 0);
+
+	if (unlikely(ret < 0))
+		return ret;
+
+	/* Hard output */
+	mbuf_append(h_output_head, h_output, h_out_length);
+
+	/* Soft output */
+	if (check_bit(op->turbo_dec.op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT))
+		mbuf_append(s_output_head, s_output, s_out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_td,
+			sizeof(desc->req.fcw_td) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+
+	/* Check if any CBs left for processing */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left after processing one CB: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+static inline int
+harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs) {
+	struct acc100_fcw_ld *fcw;
+	union acc100_dma_desc *desc;
+	int next_triplet = 1;
+	struct rte_mbuf *hq_output_head, *hq_output;
+	uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
+	if (harq_in_length == 0) {
+		rte_bbdev_log(ERR, "Loopback of invalid null size\n");
+		return -EINVAL;
+	}
+
+	int h_comp = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
+			) ? 1 : 0;
+	if (h_comp == 1)
+		harq_in_length = harq_in_length * 8 / 6;
+	harq_in_length = RTE_ALIGN(harq_in_length, 64);
+	uint16_t harq_dma_length_in = (h_comp == 0) ?
+			harq_in_length :
+			harq_in_length * 6 / 8;
+	uint16_t harq_dma_length_out = harq_dma_length_in;
+	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+	uint16_t harq_index = (ddr_mem_in ?
+			op->ldpc_dec.harq_combined_input.offset :
+			op->ldpc_dec.harq_combined_output.offset)
+			/ ACC100_HARQ_OFFSET;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	fcw = &desc->req.fcw_ld;
+	/* Set the FCW from loopback into DDR */
+	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
+	fcw->FCWversion = ACC100_FCW_VER;
+	fcw->qm = 2;
+	fcw->Zc = 384;
+	if (harq_in_length < 16 * N_ZC_1)
+		fcw->Zc = 16;
+	fcw->ncb = fcw->Zc * N_ZC_1;
+	fcw->rm_e = 2;
+	fcw->hcin_en = 1;
+	fcw->hcout_en = 1;
+
+	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d\n",
+			ddr_mem_in, harq_index,
+			harq_layout[harq_index].offset, harq_in_length,
+			harq_dma_length_in);
+
+	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
+		fcw->hcin_size0 = harq_layout[harq_index].size0;
+		fcw->hcin_offset = harq_layout[harq_index].offset;
+		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
+		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
+		if (h_comp == 1)
+			harq_dma_length_in = harq_dma_length_in * 6 / 8;
+	} else {
+		fcw->hcin_size0 = harq_in_length;
+	}
+	harq_layout[harq_index].val = 0;
+	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
+			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
+	fcw->hcout_size0 = harq_in_length;
+	fcw->hcin_decomp_mode = h_comp;
+	fcw->hcout_comp_mode = h_comp;
+	fcw->gain_i = 1;
+	fcw->gain_h = 1;
+
+	/* Set the descriptor prefix. This could be done at polling time. */
 	desc->req.word0 = ACC100_DMA_DESC_TYPE;
 	desc->req.word1 = 0; /**< Timestamp could be disabled */
 	desc->req.word2 = 0;
@@ -1816,6 +2320,107 @@
 	return current_enqueued_cbs;
 }
 
+/* Enqueue one decode operation for ACC100 device in TB mode */
+static inline int
+enqueue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint8_t r, c;
+	uint32_t in_offset, h_out_offset, s_out_offset, s_out_length,
+		h_out_length, mbuf_total_left, seg_total_left;
+	struct rte_mbuf *input, *h_output_head, *h_output,
+		*s_output_head, *s_output;
+	uint16_t current_enqueued_cbs = 0;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+	acc100_fcw_td_fill(op, &desc->req.fcw_td);
+
+	input = op->turbo_dec.input.data;
+	h_output_head = h_output = op->turbo_dec.hard_output.data;
+	s_output_head = s_output = op->turbo_dec.soft_output.data;
+	in_offset = op->turbo_dec.input.offset;
+	h_out_offset = op->turbo_dec.hard_output.offset;
+	s_out_offset = op->turbo_dec.soft_output.offset;
+	h_out_length = s_out_length = 0;
+	mbuf_total_left = op->turbo_dec.input.length;
+	c = op->turbo_dec.tb_params.c;
+	r = op->turbo_dec.tb_params.r;
+
+	while (mbuf_total_left > 0 && r < c) {
+
+		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+
+		/* Set up DMA descriptor */
+		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+				& q->sw_ring_wrap_mask);
+		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+		desc->req.data_ptrs[0].blen = ACC100_FCW_TD_BLEN;
+		ret = acc100_dma_desc_td_fill(op, &desc->req, &input,
+				h_output, s_output, &in_offset, &h_out_offset,
+				&s_out_offset, &h_out_length, &s_out_length,
+				&mbuf_total_left, &seg_total_left, r);
+
+		if (unlikely(ret < 0))
+			return ret;
+
+		/* Hard output */
+		mbuf_append(h_output_head, h_output, h_out_length);
+
+		/* Soft output */
+		if (check_bit(op->turbo_dec.op_flags,
+				RTE_BBDEV_TURBO_SOFT_OUTPUT))
+			mbuf_append(s_output_head, s_output, s_out_length);
+
+		/* Set total number of CBs in TB */
+		desc->req.cbs_in_tb = cbs_in_tb;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "FCW", &desc->req.fcw_td,
+				sizeof(desc->req.fcw_td) - 8);
+		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+		if (seg_total_left == 0) {
+			/* Go to the next mbuf */
+			input = input->next;
+			in_offset = 0;
+			h_output = h_output->next;
+			h_out_offset = 0;
+
+			if (check_bit(op->turbo_dec.op_flags,
+					RTE_BBDEV_TURBO_SOFT_OUTPUT)) {
+				s_output = s_output->next;
+				s_out_offset = 0;
+			}
+		}
+
+		total_enqueued_cbs++;
+		current_enqueued_cbs++;
+		r++;
+	}
+
+	if (unlikely(desc == NULL))
+		return current_enqueued_cbs;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Check if any CBs left for processing */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left for processing: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+	/* Set SDone on last CB descriptor for TB mode */
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	return current_enqueued_cbs;
+}
 
 /* Calculates number of CBs in processed encoder TB based on 'r' and input
  * length.
@@ -1893,6 +2498,45 @@
 	return cbs_in_tb;
 }
 
+/* Enqueue encode operations for ACC100 device in CB mode. */
+static uint16_t
+acc100_enqueue_enc_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i;
+	union acc100_dma_desc *desc;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - 1 < 0))
+			break;
+		avail -= 1;
+
+		ret = enqueue_enc_one_op_cb(q, ops[i], i);
+		if (ret < 0)
+			break;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
+			& q->sw_ring_wrap_mask);
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, i, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+	return i;
+}
+
 /* Check we can mux encode operations with common FCW */
 static inline bool
 check_mux(struct rte_bbdev_enc_op **ops, uint16_t num) {
@@ -1960,6 +2604,52 @@
 	return i;
 }
 
+/* Enqueue encode operations for ACC100 device in TB mode. */
+static uint16_t
+acc100_enqueue_enc_tb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i, enqueued_cbs = 0;
+	uint8_t cbs_in_tb;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		cbs_in_tb = get_num_cbs_in_tb_enc(&ops[i]->turbo_enc);
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - cbs_in_tb < 0))
+			break;
+		avail -= cbs_in_tb;
+
+		ret = enqueue_enc_one_op_tb(q, ops[i], enqueued_cbs, cbs_in_tb);
+		if (ret < 0)
+			break;
+		enqueued_cbs += ret;
+	}
+
+	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
+}
+
+/* Enqueue encode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	if (unlikely(num == 0))
+		return 0;
+	if (ops[0]->turbo_enc.code_block_mode == 0)
+		return acc100_enqueue_enc_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_enc_cb(q_data, ops, num);
+}
+
 /* Enqueue encode operations for ACC100 device. */
 static uint16_t
 acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
@@ -1967,7 +2657,51 @@
 {
 	if (unlikely(num == 0))
 		return 0;
-	return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
+	if (ops[0]->ldpc_enc.code_block_mode == 0)
+		return acc100_enqueue_enc_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
+}
+
+/* Enqueue decode operations for ACC100 device in CB mode */
+static uint16_t
+acc100_enqueue_dec_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i;
+	union acc100_dma_desc *desc;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - 1 < 0))
+			break;
+		avail -= 1;
+
+		ret = enqueue_dec_one_op_cb(q, ops[i], i);
+		if (ret < 0)
+			break;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
+			& q->sw_ring_wrap_mask);
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, i, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
 }
 
 /* Check we can mux encode operations with common FCW */
@@ -2065,6 +2799,53 @@
 	return i;
 }
 
+
+/* Enqueue decode operations for ACC100 device in TB mode */
+static uint16_t
+acc100_enqueue_dec_tb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i, enqueued_cbs = 0;
+	uint8_t cbs_in_tb;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		cbs_in_tb = get_num_cbs_in_tb_dec(&ops[i]->turbo_dec);
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - cbs_in_tb < 0))
+			break;
+		avail -= cbs_in_tb;
+
+		ret = enqueue_dec_one_op_tb(q, ops[i], enqueued_cbs, cbs_in_tb);
+		if (ret < 0)
+			break;
+		enqueued_cbs += ret;
+	}
+
+	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
+}
+
+/* Enqueue decode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	if (unlikely(num == 0))
+		return 0;
+	if (ops[0]->turbo_dec.code_block_mode == 0)
+		return acc100_enqueue_dec_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_dec_cb(q_data, ops, num);
+}
+
 /* Enqueue decode operations for ACC100 device. */
 static uint16_t
 acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
@@ -2388,6 +3169,51 @@
 	return cb_idx;
 }
 
+/* Dequeue encode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint16_t dequeue_num;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t i;
+	uint16_t dequeued_cbs = 0;
+	struct rte_bbdev_enc_op *op;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = (avail < num) ? avail : num;
+
+	for (i = 0; i < dequeue_num; ++i) {
+		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask))->req.op_addr;
+		if (op->turbo_enc.code_block_mode == 0)
+			ret = dequeue_enc_one_op_tb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+		else
+			ret = dequeue_enc_one_op_cb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_cbs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += i;
+
+	return i;
+}
+
 /* Dequeue LDPC encode operations from ACC100 device. */
 static uint16_t
 acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
@@ -2426,6 +3252,52 @@
 	return dequeued_cbs;
 }
 
+
+/* Dequeue decode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint16_t dequeue_num;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t i;
+	uint16_t dequeued_cbs = 0;
+	struct rte_bbdev_dec_op *op;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = (avail < num) ? avail : num;
+
+	for (i = 0; i < dequeue_num; ++i) {
+		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask))->req.op_addr;
+		if (op->turbo_dec.code_block_mode == 0)
+			ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+		else
+			ret = dequeue_dec_one_op_cb(q_data, q, &ops[i],
+					dequeued_cbs, &aq_dequeued);
+
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_cbs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += i;
+
+	return i;
+}
+
 /* Dequeue decode operations from ACC100 device. */
 static uint16_t
 acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
@@ -2479,6 +3351,10 @@
 	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
 
 	dev->dev_ops = &acc100_bbdev_ops;
+	dev->enqueue_enc_ops = acc100_enqueue_enc;
+	dev->enqueue_dec_ops = acc100_enqueue_dec;
+	dev->dequeue_enc_ops = acc100_dequeue_enc;
+	dev->dequeue_dec_ops = acc100_dequeue_dec;
 	dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
 	dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
 	dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v7 08/11] baseband/acc100: add interrupt support to PMD
  2020-09-23  2:24   ` [dpdk-dev] [PATCH v7 00/11] bbdev PMD ACC100 Nicolas Chautru
                       ` (6 preceding siblings ...)
  2020-09-23  2:25     ` [dpdk-dev] [PATCH v7 07/11] baseband/acc100: add support for 4G processing Nicolas Chautru
@ 2020-09-23  2:25     ` Nicolas Chautru
  2020-09-23  2:25     ` [dpdk-dev] [PATCH v7 09/11] baseband/acc100: add debug function to validate input Nicolas Chautru
                       ` (2 subsequent siblings)
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-23  2:25 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add capability and functions to support MSI
interrupts, callbacks and the Info Ring.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 288 ++++++++++++++++++++++++++++++-
 drivers/baseband/acc100/rte_acc100_pmd.h |  15 ++
 2 files changed, 300 insertions(+), 3 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 7d4c3df..b6d9e7c 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -339,6 +339,213 @@
 	free_base_addresses(base_addrs, i);
 }
 
+/*
+ * Find queue_id of a device queue based on details from the Info Ring.
+ * If a queue isn't found, UINT16_MAX is returned.
+ */
+static inline uint16_t
+get_queue_id_from_ring_info(struct rte_bbdev_data *data,
+		const union acc100_info_ring_data ring_data)
+{
+	uint16_t queue_id;
+
+	for (queue_id = 0; queue_id < data->num_queues; ++queue_id) {
+		struct acc100_queue *acc100_q =
+				data->queues[queue_id].queue_private;
+		if (acc100_q != NULL && acc100_q->aq_id == ring_data.aq_id &&
+				acc100_q->qgrp_id == ring_data.qg_id &&
+				acc100_q->vf_id == ring_data.vf_id)
+			return queue_id;
+	}
+
+	return UINT16_MAX;
+}
+
+/* Checks Info Ring to find the interrupt cause and handles it accordingly */
+static inline void
+acc100_check_ir(struct acc100_device *acc100_dev)
+{
+	volatile union acc100_info_ring_data *ring_data;
+	uint16_t info_ring_head = acc100_dev->info_ring_head;
+
+	if (acc100_dev->info_ring == NULL)
+		return;
+
+	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
+			ACC100_INFO_RING_MASK);
+
+	while (ring_data->valid) {
+		if ((ring_data->int_nb < ACC100_PF_INT_DMA_DL_DESC_IRQ) || (
+				ring_data->int_nb >
+				ACC100_PF_INT_DMA_DL5G_DESC_IRQ))
+			rte_bbdev_log(WARNING, "InfoRing: ITR:%d Info:0x%x",
+				ring_data->int_nb, ring_data->detailed_info);
+		/* Initialize Info Ring entry and move forward */
+		ring_data->val = 0;
+		info_ring_head++;
+		ring_data = acc100_dev->info_ring +
+				(info_ring_head & ACC100_INFO_RING_MASK);
+	}
+}
+
+/* Checks PF Info Ring to find the interrupt cause and handles it accordingly */
+static inline void
+acc100_pf_interrupt_handler(struct rte_bbdev *dev)
+{
+	struct acc100_device *acc100_dev = dev->data->dev_private;
+	volatile union acc100_info_ring_data *ring_data;
+	struct acc100_deq_intr_details deq_intr_det;
+
+	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
+			ACC100_INFO_RING_MASK);
+
+	while (ring_data->valid) {
+
+		rte_bbdev_log_debug(
+				"ACC100 PF Interrupt received, Info Ring data: 0x%x",
+				ring_data->val);
+
+		switch (ring_data->int_nb) {
+		case ACC100_PF_INT_DMA_DL_DESC_IRQ:
+		case ACC100_PF_INT_DMA_UL_DESC_IRQ:
+		case ACC100_PF_INT_DMA_UL5G_DESC_IRQ:
+		case ACC100_PF_INT_DMA_DL5G_DESC_IRQ:
+			deq_intr_det.queue_id = get_queue_id_from_ring_info(
+					dev->data, *ring_data);
+			if (deq_intr_det.queue_id == UINT16_MAX) {
+				rte_bbdev_log(ERR,
+						"Couldn't find queue: aq_id: %u, qg_id: %u, vf_id: %u",
+						ring_data->aq_id,
+						ring_data->qg_id,
+						ring_data->vf_id);
+				return;
+			}
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_DEQUEUE, &deq_intr_det);
+			break;
+		default:
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_ERROR, NULL);
+			break;
+		}
+
+		/* Initialize Info Ring entry and move forward */
+		ring_data->val = 0;
+		++acc100_dev->info_ring_head;
+		ring_data = acc100_dev->info_ring +
+				(acc100_dev->info_ring_head &
+				ACC100_INFO_RING_MASK);
+	}
+}
+
+/* Checks VF Info Ring to find the interrupt cause and handles it accordingly */
+static inline void
+acc100_vf_interrupt_handler(struct rte_bbdev *dev)
+{
+	struct acc100_device *acc100_dev = dev->data->dev_private;
+	volatile union acc100_info_ring_data *ring_data;
+	struct acc100_deq_intr_details deq_intr_det;
+
+	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
+			ACC100_INFO_RING_MASK);
+
+	while (ring_data->valid) {
+
+		rte_bbdev_log_debug(
+				"ACC100 VF Interrupt received, Info Ring data: 0x%x",
+				ring_data->val);
+
+		switch (ring_data->int_nb) {
+		case ACC100_VF_INT_DMA_DL_DESC_IRQ:
+		case ACC100_VF_INT_DMA_UL_DESC_IRQ:
+		case ACC100_VF_INT_DMA_UL5G_DESC_IRQ:
+		case ACC100_VF_INT_DMA_DL5G_DESC_IRQ:
+			/* VFs are not aware of their vf_id - it's set to 0 in
+			 * queue structures.
+			 */
+			ring_data->vf_id = 0;
+			deq_intr_det.queue_id = get_queue_id_from_ring_info(
+					dev->data, *ring_data);
+			if (deq_intr_det.queue_id == UINT16_MAX) {
+				rte_bbdev_log(ERR,
+						"Couldn't find queue: aq_id: %u, qg_id: %u",
+						ring_data->aq_id,
+						ring_data->qg_id);
+				return;
+			}
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_DEQUEUE, &deq_intr_det);
+			break;
+		default:
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_ERROR, NULL);
+			break;
+		}
+
+		/* Initialize Info Ring entry and move forward */
+		ring_data->valid = 0;
+		++acc100_dev->info_ring_head;
+		ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head
+				& ACC100_INFO_RING_MASK);
+	}
+}
+
+/* Interrupt handler triggered by ACC100 dev for handling specific interrupt */
+static void
+acc100_dev_interrupt_handler(void *cb_arg)
+{
+	struct rte_bbdev *dev = cb_arg;
+	struct acc100_device *acc100_dev = dev->data->dev_private;
+
+	/* Read info ring */
+	if (acc100_dev->pf_device)
+		acc100_pf_interrupt_handler(dev);
+	else
+		acc100_vf_interrupt_handler(dev);
+}
+
+/* Allocate and set up the Info Ring */
+static int
+allocate_inforing(struct rte_bbdev *dev)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	const struct acc100_registry_addr *reg_addr;
+	rte_iova_t info_ring_phys;
+	uint32_t phys_low, phys_high;
+
+	if (d->info_ring != NULL)
+		return 0; /* Already configured */
+
+	/* Choose correct register addresses for the device type */
+	if (d->pf_device)
+		reg_addr = &pf_reg_addr;
+	else
+		reg_addr = &vf_reg_addr;
+	/* Allocate InfoRing */
+	d->info_ring = rte_zmalloc_socket("Info Ring",
+			ACC100_INFO_RING_NUM_ENTRIES *
+			sizeof(*d->info_ring), RTE_CACHE_LINE_SIZE,
+			dev->data->socket_id);
+	if (d->info_ring == NULL) {
+		rte_bbdev_log(ERR,
+				"Failed to allocate Info Ring for %s:%u",
+				dev->device->driver->name,
+				dev->data->dev_id);
+		return -ENOMEM;
+	}
+	info_ring_phys = rte_malloc_virt2iova(d->info_ring);
+
+	/* Setup Info Ring */
+	phys_high = (uint32_t)(info_ring_phys >> 32);
+	phys_low  = (uint32_t)(info_ring_phys);
+	acc100_reg_write(d, reg_addr->info_ring_hi, phys_high);
+	acc100_reg_write(d, reg_addr->info_ring_lo, phys_low);
+	acc100_reg_write(d, reg_addr->info_ring_en, ACC100_REG_IRQ_EN_ALL);
+	d->info_ring_head = (acc100_reg_read(d, reg_addr->info_ring_ptr) &
+			0xFFF) / sizeof(union acc100_info_ring_data);
+	return 0;
+}
+
+
 /* Allocate 64MB memory used for all software rings */
 static int
 acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
@@ -426,6 +633,7 @@
 	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
 	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
 
+	allocate_inforing(dev);
 	d->harq_layout = rte_zmalloc_socket("HARQ Layout",
 			ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
 			RTE_CACHE_LINE_SIZE, dev->data->socket_id);
@@ -437,13 +645,53 @@
 	return 0;
 }
 
+static int
+acc100_intr_enable(struct rte_bbdev *dev)
+{
+	int ret;
+	struct acc100_device *d = dev->data->dev_private;
+
+	/* Only MSI interrupts are currently supported */
+	if (dev->intr_handle->type == RTE_INTR_HANDLE_VFIO_MSI ||
+			dev->intr_handle->type == RTE_INTR_HANDLE_UIO) {
+
+		allocate_inforing(dev);
+
+		ret = rte_intr_enable(dev->intr_handle);
+		if (ret < 0) {
+			rte_bbdev_log(ERR,
+					"Couldn't enable interrupts for device: %s",
+					dev->data->name);
+			rte_free(d->info_ring);
+			return ret;
+		}
+		ret = rte_intr_callback_register(dev->intr_handle,
+				acc100_dev_interrupt_handler, dev);
+		if (ret < 0) {
+			rte_bbdev_log(ERR,
+					"Couldn't register interrupt callback for device: %s",
+					dev->data->name);
+			rte_free(d->info_ring);
+			return ret;
+		}
+
+		return 0;
+	}
+
+	rte_bbdev_log(ERR, "ACC100 (%s) supports only VFIO MSI and UIO interrupts",
+			dev->data->name);
+	return -ENOTSUP;
+}
+
 /* Free 64MB memory used for software rings */
 static int
 acc100_dev_close(struct rte_bbdev *dev)
 {
 	struct acc100_device *d = dev->data->dev_private;
+	acc100_check_ir(d);
 	if (d->sw_rings_base != NULL) {
 		rte_free(d->tail_ptrs);
+		rte_free(d->info_ring);
 		rte_free(d->sw_rings_base);
 		d->sw_rings_base = NULL;
 	}
@@ -643,6 +891,7 @@
 					RTE_BBDEV_TURBO_CRC_TYPE_24B |
 					RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
 					RTE_BBDEV_TURBO_EARLY_TERMINATION |
+					RTE_BBDEV_TURBO_DEC_INTERRUPTS |
 					RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
 					RTE_BBDEV_TURBO_MAP_DEC |
 					RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP |
@@ -663,6 +912,7 @@
 					RTE_BBDEV_TURBO_CRC_24B_ATTACH |
 					RTE_BBDEV_TURBO_RV_INDEX_BYPASS |
 					RTE_BBDEV_TURBO_RATE_MATCH |
+					RTE_BBDEV_TURBO_ENC_INTERRUPTS |
 					RTE_BBDEV_TURBO_ENC_SCATTER_GATHER,
 				.num_buffers_src =
 						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
@@ -676,7 +926,8 @@
 				.capability_flags =
 					RTE_BBDEV_LDPC_RATE_MATCH |
 					RTE_BBDEV_LDPC_CRC_24B_ATTACH |
-					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
+					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS |
+					RTE_BBDEV_LDPC_ENC_INTERRUPTS,
 				.num_buffers_src =
 						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
 				.num_buffers_dst =
@@ -701,7 +952,8 @@
 				RTE_BBDEV_LDPC_DECODE_BYPASS |
 				RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
 				RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
-				RTE_BBDEV_LDPC_LLR_COMPRESSION,
+				RTE_BBDEV_LDPC_LLR_COMPRESSION |
+				RTE_BBDEV_LDPC_DEC_INTERRUPTS,
 			.llr_size = 8,
 			.llr_decimals = 1,
 			.num_buffers_src =
@@ -751,14 +1003,39 @@
 #else
 	dev_info->harq_buffer_size = 0;
 #endif
+	acc100_check_ir(d);
+}
+
+static int
+acc100_queue_intr_enable(struct rte_bbdev *dev, uint16_t queue_id)
+{
+	struct acc100_queue *q = dev->data->queues[queue_id].queue_private;
+
+	if (dev->intr_handle->type != RTE_INTR_HANDLE_VFIO_MSI &&
+			dev->intr_handle->type != RTE_INTR_HANDLE_UIO)
+		return -ENOTSUP;
+
+	q->irq_enable = 1;
+	return 0;
+}
+
+static int
+acc100_queue_intr_disable(struct rte_bbdev *dev, uint16_t queue_id)
+{
+	struct acc100_queue *q = dev->data->queues[queue_id].queue_private;
+	q->irq_enable = 0;
+	return 0;
 }
 
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.setup_queues = acc100_setup_queues,
+	.intr_enable = acc100_intr_enable,
 	.close = acc100_dev_close,
 	.info_get = acc100_dev_info_get,
 	.queue_setup = acc100_queue_setup,
 	.queue_release = acc100_queue_release,
+	.queue_intr_enable = acc100_queue_intr_enable,
+	.queue_intr_disable = acc100_queue_intr_disable,
 };
 
 /* ACC100 PCI PF address map */
@@ -3018,8 +3295,10 @@
 			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
 	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
 	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
-	if (op->status != 0)
+	if (op->status != 0) {
 		q_data->queue_stats.dequeue_err_count++;
+		acc100_check_ir(q->d);
+	}
 
 	/* CRC invalid if error exists */
 	if (!op->status)
@@ -3076,6 +3355,9 @@
 		op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
 	op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
 
+	if (op->status & (1 << RTE_BBDEV_DRV_ERROR))
+		acc100_check_ir(q->d);
+
 	/* Check if this is the last desc in batch (Atomic Queue) */
 	if (desc->req.last_desc_in_batch) {
 		(*aq_dequeued)++;
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 78686c1..8980fa5 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -559,7 +559,14 @@ struct acc100_device {
 	/* Virtual address of the info memory routed to the this function under
 	 * operation, whether it is PF or VF.
 	 */
+	union acc100_info_ring_data *info_ring;
+
 	union acc100_harq_layout_data *harq_layout;
+	/* Virtual Info Ring head */
+	uint16_t info_ring_head;
+	/* Number of bytes available for each queue in device, depending on
+	 * how many queues are enabled with configure()
+	 */
 	uint32_t sw_ring_size;
 	uint32_t ddr_size; /* Size in kB */
 	uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */
@@ -575,4 +582,12 @@ struct acc100_device {
 	bool configured; /**< True if this ACC100 device is configured */
 };
 
+/**
+ * Structure with details about RTE_BBDEV_EVENT_DEQUEUE event. It's passed to
+ * the callback function.
+ */
+struct acc100_deq_intr_details {
+	uint16_t queue_id;
+};
+
 #endif /* _RTE_ACC100_PMD_H_ */
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v7 09/11] baseband/acc100: add debug function to validate input
  2020-09-23  2:24   ` [dpdk-dev] [PATCH v7 00/11] bbdev PMD ACC100 Nicolas Chautru
                       ` (7 preceding siblings ...)
  2020-09-23  2:25     ` [dpdk-dev] [PATCH v7 08/11] baseband/acc100: add interrupt support to PMD Nicolas Chautru
@ 2020-09-23  2:25     ` Nicolas Chautru
  2020-09-23  2:25     ` [dpdk-dev] [PATCH v7 10/11] baseband/acc100: add configure function Nicolas Chautru
  2020-09-23  2:25     ` [dpdk-dev] [PATCH v7 11/11] doc: update bbdev feature table Nicolas Chautru
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-23  2:25 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add debug functions to validate the input API parameters from
the user. These are only enabled in DEBUG mode at build time.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 424 +++++++++++++++++++++++++++++++
 1 file changed, 424 insertions(+)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index b6d9e7c..3589814 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -1945,6 +1945,231 @@
 
 }
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+/* Validates turbo encoder parameters */
+static inline int
+validate_enc_op(struct rte_bbdev_enc_op *op)
+{
+	struct rte_bbdev_op_turbo_enc *turbo_enc = &op->turbo_enc;
+	struct rte_bbdev_op_enc_turbo_cb_params *cb = NULL;
+	struct rte_bbdev_op_enc_turbo_tb_params *tb = NULL;
+	uint16_t kw, kw_neg, kw_pos;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if (turbo_enc->input.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid input pointer");
+		return -1;
+	}
+	if (turbo_enc->output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid output pointer");
+		return -1;
+	}
+	if (turbo_enc->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				turbo_enc->rv_index);
+		return -1;
+	}
+	if (turbo_enc->code_block_mode != 0 &&
+			turbo_enc->code_block_mode != 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				turbo_enc->code_block_mode);
+		return -1;
+	}
+
+	if (turbo_enc->code_block_mode == 0) {
+		tb = &turbo_enc->tb_params;
+		if ((tb->k_neg < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_neg > RTE_BBDEV_TURBO_MAX_CB_SIZE)
+				&& tb->c_neg > 0) {
+			rte_bbdev_log(ERR,
+					"k_neg (%u) is out of range %u <= value <= %u",
+					tb->k_neg, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (tb->k_pos < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_pos > RTE_BBDEV_TURBO_MAX_CB_SIZE) {
+			rte_bbdev_log(ERR,
+					"k_pos (%u) is out of range %u <= value <= %u",
+					tb->k_pos, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (tb->c_neg > (RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1))
+			rte_bbdev_log(ERR,
+					"c_neg (%u) is out of range 0 <= value <= %u",
+					tb->c_neg,
+					RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1);
+		if (tb->c < 1 || tb->c > RTE_BBDEV_TURBO_MAX_CODE_BLOCKS) {
+			rte_bbdev_log(ERR,
+					"c (%u) is out of range 1 <= value <= %u",
+					tb->c, RTE_BBDEV_TURBO_MAX_CODE_BLOCKS);
+			return -1;
+		}
+		if (tb->cab > tb->c) {
+			rte_bbdev_log(ERR,
+					"cab (%u) is greater than c (%u)",
+					tb->cab, tb->c);
+			return -1;
+		}
+		if ((tb->ea < RTE_BBDEV_TURBO_MIN_CB_SIZE || (tb->ea % 2))
+				&& tb->r < tb->cab) {
+			rte_bbdev_log(ERR,
+					"ea (%u) is less than %u or it is not even",
+					tb->ea, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+		if ((tb->eb < RTE_BBDEV_TURBO_MIN_CB_SIZE || (tb->eb % 2))
+				&& tb->c > tb->cab) {
+			rte_bbdev_log(ERR,
+					"eb (%u) is less than %u or it is not even",
+					tb->eb, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+
+		kw_neg = 3 * RTE_ALIGN_CEIL(tb->k_neg + 4,
+					RTE_BBDEV_TURBO_C_SUBBLOCK);
+		if (tb->ncb_neg < tb->k_neg || tb->ncb_neg > kw_neg) {
+			rte_bbdev_log(ERR,
+					"ncb_neg (%u) is out of range (%u) k_neg <= value <= (%u) kw_neg",
+					tb->ncb_neg, tb->k_neg, kw_neg);
+			return -1;
+		}
+
+		kw_pos = 3 * RTE_ALIGN_CEIL(tb->k_pos + 4,
+					RTE_BBDEV_TURBO_C_SUBBLOCK);
+		if (tb->ncb_pos < tb->k_pos || tb->ncb_pos > kw_pos) {
+			rte_bbdev_log(ERR,
+					"ncb_pos (%u) is out of range (%u) k_pos <= value <= (%u) kw_pos",
+					tb->ncb_pos, tb->k_pos, kw_pos);
+			return -1;
+		}
+		if (tb->r > (tb->c - 1)) {
+			rte_bbdev_log(ERR,
+					"r (%u) is greater than c - 1 (%u)",
+					tb->r, tb->c - 1);
+			return -1;
+		}
+	} else {
+		cb = &turbo_enc->cb_params;
+		if (cb->k < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| cb->k > RTE_BBDEV_TURBO_MAX_CB_SIZE) {
+			rte_bbdev_log(ERR,
+					"k (%u) is out of range %u <= value <= %u",
+					cb->k, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+
+		if (cb->e < RTE_BBDEV_TURBO_MIN_CB_SIZE || (cb->e % 2)) {
+			rte_bbdev_log(ERR,
+					"e (%u) is less than %u or it is not even",
+					cb->e, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+
+		kw = RTE_ALIGN_CEIL(cb->k + 4, RTE_BBDEV_TURBO_C_SUBBLOCK) * 3;
+		if (cb->ncb < cb->k || cb->ncb > kw) {
+			rte_bbdev_log(ERR,
+					"ncb (%u) is out of range (%u) k <= value <= (%u) kw",
+					cb->ncb, cb->k, kw);
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
+/* Validates LDPC encoder parameters */
+static inline int
+validate_ldpc_enc_op(struct rte_bbdev_enc_op *op)
+{
+	struct rte_bbdev_op_ldpc_enc *ldpc_enc = &op->ldpc_enc;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if (ldpc_enc->input.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid input pointer");
+		return -1;
+	}
+	if (ldpc_enc->output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid output pointer");
+		return -1;
+	}
+	if (ldpc_enc->input.length >
+			RTE_BBDEV_LDPC_MAX_CB_SIZE >> 3) {
+		rte_bbdev_log(ERR, "CB size (%u) is too big, max: %d",
+				ldpc_enc->input.length,
+				RTE_BBDEV_LDPC_MAX_CB_SIZE >> 3);
+		return -1;
+	}
+	if ((ldpc_enc->basegraph > 2) || (ldpc_enc->basegraph == 0)) {
+		rte_bbdev_log(ERR,
+				"BG (%u) is out of range 1 <= value <= 2",
+				ldpc_enc->basegraph);
+		return -1;
+	}
+	if (ldpc_enc->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				ldpc_enc->rv_index);
+		return -1;
+	}
+	if (ldpc_enc->code_block_mode > 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				ldpc_enc->code_block_mode);
+		return -1;
+	}
+
+	return 0;
+}
+
+/* Validates LDPC decoder parameters */
+static inline int
+validate_ldpc_dec_op(struct rte_bbdev_dec_op *op)
+{
+	struct rte_bbdev_op_ldpc_dec *ldpc_dec = &op->ldpc_dec;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if ((ldpc_dec->basegraph > 2) || (ldpc_dec->basegraph == 0)) {
+		rte_bbdev_log(ERR,
+				"BG (%u) is out of range 1 <= value <= 2",
+				ldpc_dec->basegraph);
+		return -1;
+	}
+	if (ldpc_dec->iter_max == 0) {
+		rte_bbdev_log(ERR,
+				"iter_max (%u) is equal to 0",
+				ldpc_dec->iter_max);
+		return -1;
+	}
+	if (ldpc_dec->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				ldpc_dec->rv_index);
+		return -1;
+	}
+	if (ldpc_dec->code_block_mode > 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				ldpc_dec->code_block_mode);
+		return -1;
+	}
+
+	return 0;
+}
+#endif
+
 /* Enqueue one encode operation for ACC100 device in CB mode */
 static inline int
 enqueue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
@@ -1956,6 +2181,14 @@
 		seg_total_left;
 	struct rte_mbuf *input, *output_head, *output;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_enc_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2008,6 +2241,14 @@
 	uint16_t  in_length_in_bytes;
 	struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_enc_op(ops[0]) == -1) {
+		rte_bbdev_log(ERR, "LDPC encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2065,6 +2306,14 @@
 		seg_total_left;
 	struct rte_mbuf *input, *output_head, *output;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_enc_op(op) == -1) {
+		rte_bbdev_log(ERR, "LDPC encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2119,6 +2368,14 @@
 	struct rte_mbuf *input, *output_head, *output;
 	uint16_t current_enqueued_cbs = 0;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_enc_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2191,6 +2448,142 @@
 	return current_enqueued_cbs;
 }
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+/* Validates turbo decoder parameters */
+static inline int
+validate_dec_op(struct rte_bbdev_dec_op *op)
+{
+	struct rte_bbdev_op_turbo_dec *turbo_dec = &op->turbo_dec;
+	struct rte_bbdev_op_dec_turbo_cb_params *cb = NULL;
+	struct rte_bbdev_op_dec_turbo_tb_params *tb = NULL;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if (turbo_dec->input.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid input pointer");
+		return -1;
+	}
+	if (turbo_dec->hard_output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid hard_output pointer");
+		return -1;
+	}
+	if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT) &&
+			turbo_dec->soft_output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid soft_output pointer");
+		return -1;
+	}
+	if (turbo_dec->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				turbo_dec->rv_index);
+		return -1;
+	}
+	if (turbo_dec->iter_min < 1) {
+		rte_bbdev_log(ERR,
+				"iter_min (%u) is less than 1",
+				turbo_dec->iter_min);
+		return -1;
+	}
+	if (turbo_dec->iter_max <= 2) {
+		rte_bbdev_log(ERR,
+				"iter_max (%u) is less than or equal to 2",
+				turbo_dec->iter_max);
+		return -1;
+	}
+	if (turbo_dec->iter_min > turbo_dec->iter_max) {
+		rte_bbdev_log(ERR,
+				"iter_min (%u) is greater than iter_max (%u)",
+				turbo_dec->iter_min, turbo_dec->iter_max);
+		return -1;
+	}
+	if (turbo_dec->code_block_mode != 0 &&
+			turbo_dec->code_block_mode != 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				turbo_dec->code_block_mode);
+		return -1;
+	}
+
+	if (turbo_dec->code_block_mode == 0) {
+		tb = &turbo_dec->tb_params;
+		if ((tb->k_neg < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_neg > RTE_BBDEV_TURBO_MAX_CB_SIZE)
+				&& tb->c_neg > 0) {
+			rte_bbdev_log(ERR,
+					"k_neg (%u) is out of range %u <= value <= %u",
+					tb->k_neg, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if ((tb->k_pos < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_pos > RTE_BBDEV_TURBO_MAX_CB_SIZE)
+				&& tb->c > tb->c_neg) {
+			rte_bbdev_log(ERR,
+					"k_pos (%u) is out of range %u <= value <= %u",
+					tb->k_pos, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (tb->c_neg > (RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1)) {
+			rte_bbdev_log(ERR,
+					"c_neg (%u) is out of range 0 <= value <= %u",
+					tb->c_neg,
+					RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1);
+			return -1;
+		}
+		if (tb->c < 1 || tb->c > RTE_BBDEV_TURBO_MAX_CODE_BLOCKS) {
+			rte_bbdev_log(ERR,
+					"c (%u) is out of range 1 <= value <= %u",
+					tb->c, RTE_BBDEV_TURBO_MAX_CODE_BLOCKS);
+			return -1;
+		}
+		if (tb->cab > tb->c) {
+			rte_bbdev_log(ERR,
+					"cab (%u) is greater than c (%u)",
+					tb->cab, tb->c);
+			return -1;
+		}
+		if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) &&
+				(tb->ea < RTE_BBDEV_TURBO_MIN_CB_SIZE
+						|| (tb->ea % 2))
+				&& tb->cab > 0) {
+			rte_bbdev_log(ERR,
+					"ea (%u) is less than %u or it is not even",
+					tb->ea, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+		if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) &&
+				(tb->eb < RTE_BBDEV_TURBO_MIN_CB_SIZE
+						|| (tb->eb % 2))
+				&& tb->c > tb->cab) {
+			rte_bbdev_log(ERR,
+					"eb (%u) is less than %u or it is not even",
+					tb->eb, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+	} else {
+		cb = &turbo_dec->cb_params;
+		if (cb->k < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| cb->k > RTE_BBDEV_TURBO_MAX_CB_SIZE) {
+			rte_bbdev_log(ERR,
+					"k (%u) is out of range %u <= value <= %u",
+					cb->k, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) &&
+				(cb->e < RTE_BBDEV_TURBO_MIN_CB_SIZE ||
+				(cb->e % 2))) {
+			rte_bbdev_log(ERR,
+					"e (%u) is less than %u or it is not even",
+					cb->e, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+	}
+
+	return 0;
+}
+#endif
+
 /* Enqueue one decode operation for ACC100 device in CB mode */
 static inline int
 enqueue_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
@@ -2203,6 +2596,14 @@
 	struct rte_mbuf *input, *h_output_head, *h_output,
 		*s_output_head, *s_output;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo decoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2426,6 +2827,13 @@
 		return ret;
 	}
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "LDPC decoder validation failed");
+		return -EINVAL;
+	}
+#endif
 	union acc100_dma_desc *desc;
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
@@ -2521,6 +2929,14 @@
 	struct rte_mbuf *input, *h_output_head, *h_output;
 	uint16_t current_enqueued_cbs = 0;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "LDPC decoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2611,6 +3027,14 @@
 		*s_output_head, *s_output;
 	uint16_t current_enqueued_cbs = 0;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo decoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v7 10/11] baseband/acc100: add configure function
  2020-09-23  2:24   ` [dpdk-dev] [PATCH v7 00/11] bbdev PMD ACC100 Nicolas Chautru
                       ` (8 preceding siblings ...)
  2020-09-23  2:25     ` [dpdk-dev] [PATCH v7 09/11] baseband/acc100: add debug function to validate input Nicolas Chautru
@ 2020-09-23  2:25     ` Nicolas Chautru
  2020-09-23  2:25     ` [dpdk-dev] [PATCH v7 11/11] doc: update bbdev feature table Nicolas Chautru
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-23  2:25 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add a configure function to set up the PF from within
bbdev-test itself, without requiring an external
application to configure the device.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 app/test-bbdev/test_bbdev_perf.c                   |  72 +++
 drivers/baseband/acc100/meson.build                |   2 +
 drivers/baseband/acc100/rte_acc100_cfg.h           |  17 +
 drivers/baseband/acc100/rte_acc100_pmd.c           | 505 +++++++++++++++++++++
 .../acc100/rte_pmd_bbdev_acc100_version.map        |   7 +
 5 files changed, 603 insertions(+)

diff --git a/app/test-bbdev/test_bbdev_perf.c b/app/test-bbdev/test_bbdev_perf.c
index 45c0d62..32f23ff 100644
--- a/app/test-bbdev/test_bbdev_perf.c
+++ b/app/test-bbdev/test_bbdev_perf.c
@@ -52,6 +52,18 @@
 #define FLR_5G_TIMEOUT 610
 #endif
 
+#ifdef RTE_LIBRTE_PMD_BBDEV_ACC100
+#include <rte_acc100_cfg.h>
+#define ACC100PF_DRIVER_NAME   ("intel_acc100_pf")
+#define ACC100VF_DRIVER_NAME   ("intel_acc100_vf")
+#define ACC100_QMGR_NUM_AQS 16
+#define ACC100_QMGR_NUM_QGS 2
+#define ACC100_QMGR_AQ_DEPTH 5
+#define ACC100_QMGR_INVALID_IDX -1
+#define ACC100_QMGR_RR 1
+#define ACC100_QOS_GBR 0
+#endif
+
 #define OPS_CACHE_SIZE 256U
 #define OPS_POOL_SIZE_MIN 511U /* 0.5K per queue */
 
@@ -653,6 +665,66 @@ typedef int (test_case_function)(struct active_device *ad,
 				info->dev_name);
 	}
 #endif
+#ifdef RTE_LIBRTE_PMD_BBDEV_ACC100
+	if ((get_init_device() == true) &&
+		(!strcmp(info->drv.driver_name, ACC100PF_DRIVER_NAME))) {
+		struct acc100_conf conf;
+		unsigned int i;
+
+		printf("Configure ACC100 FEC Driver %s with default values\n",
+				info->drv.driver_name);
+
+		/* clear default configuration before initialization */
+		memset(&conf, 0, sizeof(struct acc100_conf));
+
+		/* Always set in PF mode for built-in configuration */
+		conf.pf_mode_en = true;
+		for (i = 0; i < RTE_ACC100_NUM_VFS; ++i) {
+			conf.arb_dl_4g[i].gbr_threshold1 = ACC100_QOS_GBR;
+			conf.arb_dl_4g[i].gbr_threshold2 = ACC100_QOS_GBR;
+			conf.arb_dl_4g[i].round_robin_weight = ACC100_QMGR_RR;
+			conf.arb_ul_4g[i].gbr_threshold1 = ACC100_QOS_GBR;
+			conf.arb_ul_4g[i].gbr_threshold2 = ACC100_QOS_GBR;
+			conf.arb_ul_4g[i].round_robin_weight = ACC100_QMGR_RR;
+			conf.arb_dl_5g[i].gbr_threshold1 = ACC100_QOS_GBR;
+			conf.arb_dl_5g[i].gbr_threshold2 = ACC100_QOS_GBR;
+			conf.arb_dl_5g[i].round_robin_weight = ACC100_QMGR_RR;
+			conf.arb_ul_5g[i].gbr_threshold1 = ACC100_QOS_GBR;
+			conf.arb_ul_5g[i].gbr_threshold2 = ACC100_QOS_GBR;
+			conf.arb_ul_5g[i].round_robin_weight = ACC100_QMGR_RR;
+		}
+
+		conf.input_pos_llr_1_bit = true;
+		conf.output_pos_llr_1_bit = true;
+		conf.num_vf_bundles = 1; /* Number of VF bundles to set up */
+
+		conf.q_ul_4g.num_qgroups = ACC100_QMGR_NUM_QGS;
+		conf.q_ul_4g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+		conf.q_ul_4g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+		conf.q_ul_4g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+		conf.q_dl_4g.num_qgroups = ACC100_QMGR_NUM_QGS;
+		conf.q_dl_4g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+		conf.q_dl_4g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+		conf.q_dl_4g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+		conf.q_ul_5g.num_qgroups = ACC100_QMGR_NUM_QGS;
+		conf.q_ul_5g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+		conf.q_ul_5g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+		conf.q_ul_5g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+		conf.q_dl_5g.num_qgroups = ACC100_QMGR_NUM_QGS;
+		conf.q_dl_5g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+		conf.q_dl_5g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+		conf.q_dl_5g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+
+		/* setup PF with configuration information */
+		ret = acc100_configure(info->dev_name, &conf);
+		TEST_ASSERT_SUCCESS(ret,
+				"Failed to configure ACC100 PF for bbdev %s",
+				info->dev_name);
+		/* Refresh device info now that the PF is configured */
+	}
+	rte_bbdev_info_get(dev_id, info);
+#endif
+
 	nb_queues = RTE_MIN(rte_lcore_count(), info->drv.max_num_queues);
 	nb_queues = RTE_MIN(nb_queues, (unsigned int) MAX_QUEUES);
 
diff --git a/drivers/baseband/acc100/meson.build b/drivers/baseband/acc100/meson.build
index 8afafc2..7ac44dc 100644
--- a/drivers/baseband/acc100/meson.build
+++ b/drivers/baseband/acc100/meson.build
@@ -4,3 +4,5 @@
 deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci']
 
 sources = files('rte_acc100_pmd.c')
+
+install_headers('rte_acc100_cfg.h')
diff --git a/drivers/baseband/acc100/rte_acc100_cfg.h b/drivers/baseband/acc100/rte_acc100_cfg.h
index 73bbe36..7f523bc 100644
--- a/drivers/baseband/acc100/rte_acc100_cfg.h
+++ b/drivers/baseband/acc100/rte_acc100_cfg.h
@@ -89,6 +89,23 @@ struct acc100_conf {
 	struct rte_arbitration_t arb_dl_5g[RTE_ACC100_NUM_VFS];
 };
 
+/**
+ * Configure an ACC100 device
+ *
+ * @param dev_name
+ *   The name of the device. This is the short form of PCI BDF, e.g. 00:01.0.
+ *   It can also be retrieved for a bbdev device from the dev_name field in the
+ *   rte_bbdev_info structure returned by rte_bbdev_info_get().
+ * @param conf
+ *   Configuration to apply to ACC100 HW.
+ *
+ * @return
+ *   Zero on success, negative value on failure.
+ */
+__rte_experimental
+int
+acc100_configure(const char *dev_name, struct acc100_conf *conf);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 3589814..b50dd32 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -85,6 +85,26 @@
 
 enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC};
 
+/* Return the accelerator enum for a Queue Group Index */
+static inline int
+accFromQgid(int qg_idx, const struct acc100_conf *acc100_conf)
+{
+	int accQg[ACC100_NUM_QGRPS];
+	int NumQGroupsPerFn[NUM_ACC];
+	int acc, qgIdx, qgIndex = 0;
+	for (qgIdx = 0; qgIdx < ACC100_NUM_QGRPS; qgIdx++)
+		accQg[qgIdx] = 0;
+	NumQGroupsPerFn[UL_4G] = acc100_conf->q_ul_4g.num_qgroups;
+	NumQGroupsPerFn[UL_5G] = acc100_conf->q_ul_5g.num_qgroups;
+	NumQGroupsPerFn[DL_4G] = acc100_conf->q_dl_4g.num_qgroups;
+	NumQGroupsPerFn[DL_5G] = acc100_conf->q_dl_5g.num_qgroups;
+	for (acc = UL_4G;  acc < NUM_ACC; acc++)
+		for (qgIdx = 0; qgIdx < NumQGroupsPerFn[acc]; qgIdx++)
+			accQg[qgIndex++] = acc;
+	acc = accQg[qg_idx];
+	return acc;
+}
+
 /* Return the queue topology for a Queue Group Index */
 static inline void
 qtopFromAcc(struct rte_q_topology_t **qtop, int acc_enum,
@@ -113,6 +133,30 @@
 	*qtop = p_qtop;
 }
 
+/* Return the AQ depth for a Queue Group Index */
+static inline int
+aqDepth(int qg_idx, struct acc100_conf *acc100_conf)
+{
+	struct rte_q_topology_t *q_top = NULL;
+	int acc_enum = accFromQgid(qg_idx, acc100_conf);
+	qtopFromAcc(&q_top, acc_enum, acc100_conf);
+	if (unlikely(q_top == NULL))
+		return 0;
+	return q_top->aq_depth_log2;
+}
+
+/* Return the number of AQs for a Queue Group Index */
+static inline int
+aqNum(int qg_idx, struct acc100_conf *acc100_conf)
+{
+	struct rte_q_topology_t *q_top = NULL;
+	int acc_enum = accFromQgid(qg_idx, acc100_conf);
+	qtopFromAcc(&q_top, acc_enum, acc100_conf);
+	if (unlikely(q_top == NULL))
+		return 0;
+	return q_top->num_aqs_per_groups;
+}
+
 static void
 initQTop(struct acc100_conf *acc100_conf)
 {
@@ -4177,3 +4221,464 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
 RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
 RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
 RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
+
+/*
+ * Implementation to fix the power on status of some 5GUL engines
+ * This requires DMA permission if ported outside DPDK
+ */
+static void
+poweron_cleanup(struct rte_bbdev *bbdev, struct acc100_device *d,
+		struct acc100_conf *conf)
+{
+	int i, template_idx, qg_idx;
+	uint32_t address, status, payload;
+	printf("Need to clear power-on 5GUL status in internal memory\n");
+	/* Reset LDPC Cores */
+	for (i = 0; i < ACC100_ENGINES_MAX; i++)
+		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+				ACC100_ENGINE_OFFSET * i, ACC100_RESET_HI);
+	usleep(LONG_WAIT);
+	for (i = 0; i < ACC100_ENGINES_MAX; i++)
+		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+				ACC100_ENGINE_OFFSET * i, ACC100_RESET_LO);
+	usleep(LONG_WAIT);
+	/* Prepare dummy workload */
+	alloc_2x64mb_sw_rings_mem(bbdev, d, 0);
+	/* Set base addresses */
+	uint32_t phys_high = (uint32_t)(d->sw_rings_phys >> 32);
+	uint32_t phys_low  = (uint32_t)(d->sw_rings_phys &
+			~(ACC100_SIZE_64MBYTE-1));
+	acc100_reg_write(d, HWPfDmaFec5GulDescBaseHiRegVf, phys_high);
+	acc100_reg_write(d, HWPfDmaFec5GulDescBaseLoRegVf, phys_low);
+
+	/* Descriptor for a dummy 5GUL code block processing */
+	union acc100_dma_desc *desc = NULL;
+	desc = d->sw_rings;
+	desc->req.data_ptrs[0].address = d->sw_rings_phys +
+			ACC100_DESC_FCW_OFFSET;
+	desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
+	desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
+	desc->req.data_ptrs[0].last = 0;
+	desc->req.data_ptrs[0].dma_ext = 0;
+	desc->req.data_ptrs[1].address = d->sw_rings_phys + 512;
+	desc->req.data_ptrs[1].blkid = ACC100_DMA_BLKID_IN;
+	desc->req.data_ptrs[1].last = 1;
+	desc->req.data_ptrs[1].dma_ext = 0;
+	desc->req.data_ptrs[1].blen = 44;
+	desc->req.data_ptrs[2].address = d->sw_rings_phys + 1024;
+	desc->req.data_ptrs[2].blkid = ACC100_DMA_BLKID_OUT_ENC;
+	desc->req.data_ptrs[2].last = 1;
+	desc->req.data_ptrs[2].dma_ext = 0;
+	desc->req.data_ptrs[2].blen = 5;
+	/* Dummy FCW */
+	desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
+	desc->req.fcw_ld.qm = 1;
+	desc->req.fcw_ld.nfiller = 30;
+	desc->req.fcw_ld.BG = 2 - 1;
+	desc->req.fcw_ld.Zc = 7;
+	desc->req.fcw_ld.ncb = 350;
+	desc->req.fcw_ld.rm_e = 4;
+	desc->req.fcw_ld.itmax = 10;
+	desc->req.fcw_ld.gain_i = 1;
+	desc->req.fcw_ld.gain_h = 1;
+
+	int engines_to_restart[SIG_UL_5G_LAST + 1] = {0};
+	int num_failed_engine = 0;
+	/* Detect engines in undefined state */
+	for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
+			template_idx++) {
+		/* Check engine power-on status */
+		address = HwPfFecUl5gIbDebugReg +
+				ACC100_ENGINE_OFFSET * template_idx;
+		status = (acc100_reg_read(d, address) >> 4) & 0xF;
+		if (status == 0) {
+			engines_to_restart[num_failed_engine] = template_idx;
+			num_failed_engine++;
+		}
+	}
+
+	int numQqsAcc = conf->q_ul_4g.num_qgroups;
+	int numQgs = conf->q_ul_5g.num_qgroups;
+	payload = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		payload |= (1 << qg_idx);
+	/* Force each engine which is in unspecified state */
+	for (i = 0; i < num_failed_engine; i++) {
+		int failed_engine = engines_to_restart[i];
+		printf("Force engine %d\n", failed_engine);
+		for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
+				template_idx++) {
+			address = HWPfQmgrGrpTmplateReg4Indx
+					+ BYTES_IN_WORD * template_idx;
+			if (template_idx == failed_engine)
+				acc100_reg_write(d, address, payload);
+			else
+				acc100_reg_write(d, address, 0);
+		}
+		/* Reset descriptor header */
+		desc->req.word0 = ACC100_DMA_DESC_TYPE;
+		desc->req.word1 = 0;
+		desc->req.word2 = 0;
+		desc->req.word3 = 0;
+		desc->req.numCBs = 1;
+		desc->req.m2dlen = 2;
+		desc->req.d2mlen = 1;
+		/* Enqueue the code block for processing */
+		union acc100_enqueue_reg_fmt enq_req;
+		enq_req.val = 0;
+		enq_req.addr_offset = ACC100_DESC_OFFSET;
+		enq_req.num_elem = 1;
+		enq_req.req_elem_addr = 0;
+		rte_wmb();
+		acc100_reg_write(d, HWPfQmgrIngressAq + 0x100, enq_req.val);
+		usleep(LONG_WAIT * 100);
+		if (desc->req.word0 != 2)
+			printf("DMA Response %#"PRIx32"\n", desc->req.word0);
+	}
+
+	/* Reset LDPC Cores */
+	for (i = 0; i < ACC100_ENGINES_MAX; i++)
+		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+				ACC100_ENGINE_OFFSET * i, ACC100_RESET_HI);
+	usleep(LONG_WAIT);
+	for (i = 0; i < ACC100_ENGINES_MAX; i++)
+		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+				ACC100_ENGINE_OFFSET * i, ACC100_RESET_LO);
+	usleep(LONG_WAIT);
+	acc100_reg_write(d, HWPfHi5GHardResetReg, ACC100_RESET_HARD);
+	usleep(LONG_WAIT);
+	int numEngines = 0;
+	/* Check engine power-on status again */
+	for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
+			template_idx++) {
+		address = HwPfFecUl5gIbDebugReg +
+				ACC100_ENGINE_OFFSET * template_idx;
+		status = (acc100_reg_read(d, address) >> 4) & 0xF;
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD * template_idx;
+		if (status == 1) {
+			acc100_reg_write(d, address, payload);
+			numEngines++;
+		} else
+			acc100_reg_write(d, address, 0);
+	}
+	printf("Number of 5GUL engines %d\n", numEngines);
+
+	if (d->sw_rings_base != NULL)
+		rte_free(d->sw_rings_base);
+	usleep(LONG_WAIT);
+}
+
+/* Initial configuration of an ACC100 device prior to running configure() */
+int
+acc100_configure(const char *dev_name, struct acc100_conf *conf)
+{
+	rte_bbdev_log(INFO, "acc100_configure");
+	uint32_t payload, address, status;
+	int qg_idx, template_idx, vf_idx, acc, i;
+	struct rte_bbdev *bbdev = rte_bbdev_get_named_dev(dev_name);
+
+	/* Compile time checks */
+	RTE_BUILD_BUG_ON(sizeof(struct acc100_dma_req_desc) != 256);
+	RTE_BUILD_BUG_ON(sizeof(union acc100_dma_desc) != 256);
+	RTE_BUILD_BUG_ON(sizeof(struct acc100_fcw_td) != 24);
+	RTE_BUILD_BUG_ON(sizeof(struct acc100_fcw_te) != 32);
+
+	if (bbdev == NULL) {
+		rte_bbdev_log(ERR,
+				"Invalid dev_name (%s), or device is not yet initialised",
+				dev_name);
+		return -ENODEV;
+	}
+	struct acc100_device *d = bbdev->data->dev_private;
+
+	/* Store configuration */
+	rte_memcpy(&d->acc100_conf, conf, sizeof(d->acc100_conf));
+
+	/* PCIe Bridge configuration */
+	acc100_reg_write(d, HwPfPcieGpexBridgeControl, ACC100_CFG_PCI_BRIDGE);
+	for (i = 1; i < 17; i++)
+		acc100_reg_write(d,
+				HwPfPcieGpexAxiAddrMappingWindowPexBaseHigh
+				+ i * 16, 0);
+
+	/* PCIe Link Training and Status State Machine */
+	acc100_reg_write(d, HwPfPcieGpexLtssmStateCntrl, 0xDFC00000);
+
+	/* Prevent blocking AXI read on BRESP for AXI Write */
+	address = HwPfPcieGpexAxiPioControl;
+	payload = ACC100_CFG_PCI_AXI;
+	acc100_reg_write(d, address, payload);
+
+	/* 5GDL PLL phase shift */
+	acc100_reg_write(d, HWPfChaDl5gPllPhshft0, 0x1);
+
+	/* Explicitly releasing AXI as this may be stopped after PF FLR/BME */
+	address = HWPfDmaAxiControl;
+	payload = 1;
+	acc100_reg_write(d, address, payload);
+
+	/* DDR Configuration */
+	address = HWPfDdrBcTim6;
+	payload = acc100_reg_read(d, address);
+	payload &= 0xFFFFFFFB; /* Bit 2 */
+#ifdef ACC100_DDR_ECC_ENABLE
+	payload |= 0x4;
+#endif
+	acc100_reg_write(d, address, payload);
+	address = HWPfDdrPhyDqsCountNum;
+#ifdef ACC100_DDR_ECC_ENABLE
+	payload = 9;
+#else
+	payload = 8;
+#endif
+	acc100_reg_write(d, address, payload);
+
+	/* Set default descriptor signature */
+	address = HWPfDmaDescriptorSignatuture;
+	payload = 0;
+	acc100_reg_write(d, address, payload);
+
+	/* Enable the Error Detection in DMA */
+	payload = ACC100_CFG_DMA_ERROR;
+	address = HWPfDmaErrorDetectionEn;
+	acc100_reg_write(d, address, payload);
+
+	/* AXI Cache configuration */
+	payload = ACC100_CFG_AXI_CACHE;
+	address = HWPfDmaAxcacheReg;
+	acc100_reg_write(d, address, payload);
+
+	/* Default DMA Configuration (Qmgr Enabled) */
+	address = HWPfDmaConfig0Reg;
+	payload = 0;
+	acc100_reg_write(d, address, payload);
+	address = HWPfDmaQmanen;
+	payload = 0;
+	acc100_reg_write(d, address, payload);
+
+	/* Default RLIM/ALEN configuration */
+	address = HWPfDmaConfig1Reg;
+	payload = (1 << 31) + (23 << 8) + (1 << 6) + 7;
+	acc100_reg_write(d, address, payload);
+
+	/* Configure DMA Qmanager addresses */
+	address = HWPfDmaQmgrAddrReg;
+	payload = HWPfQmgrEgressQueuesTemplate;
+	acc100_reg_write(d, address, payload);
+
+	/* ===== Qmgr Configuration ===== */
+	/* Configuration of the AQueue Depth QMGR_GRP_0_DEPTH_LOG2 for UL */
+	int totalQgs = conf->q_ul_4g.num_qgroups +
+			conf->q_ul_5g.num_qgroups +
+			conf->q_dl_4g.num_qgroups +
+			conf->q_dl_5g.num_qgroups;
+	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
+		address = HWPfQmgrDepthLog2Grp +
+				BYTES_IN_WORD * qg_idx;
+		payload = aqDepth(qg_idx, conf);
+		acc100_reg_write(d, address, payload);
+		address = HWPfQmgrTholdGrp +
+				BYTES_IN_WORD * qg_idx;
+		payload = (1 << 16) + (1 << (aqDepth(qg_idx, conf) - 1));
+		acc100_reg_write(d, address, payload);
+	}
+
+	/* Template Priority in incremental order */
+	for (template_idx = 0; template_idx < ACC100_NUM_TMPL;
+			template_idx++) {
+		address = HWPfQmgrGrpTmplateReg0Indx +
+				BYTES_IN_WORD * (template_idx % 8);
+		payload = TMPL_PRI_0;
+		acc100_reg_write(d, address, payload);
+		address = HWPfQmgrGrpTmplateReg1Indx +
+				BYTES_IN_WORD * (template_idx % 8);
+		payload = TMPL_PRI_1;
+		acc100_reg_write(d, address, payload);
+		address = HWPfQmgrGrpTmplateReg2indx +
+				BYTES_IN_WORD * (template_idx % 8);
+		payload = TMPL_PRI_2;
+		acc100_reg_write(d, address, payload);
+		address = HWPfQmgrGrpTmplateReg3Indx +
+				BYTES_IN_WORD * (template_idx % 8);
+		payload = TMPL_PRI_3;
+	}
+
+	address = HWPfQmgrGrpPriority;
+	payload = ACC100_CFG_QMGR_HI_P;
+	acc100_reg_write(d, address, payload);
+
+	/* Template Configuration */
+	for (template_idx = 0; template_idx < ACC100_NUM_TMPL; template_idx++) {
+		payload = 0;
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD * template_idx;
+		acc100_reg_write(d, address, payload);
+	}
+	/* 4GUL */
+	int numQgs = conf->q_ul_4g.num_qgroups;
+	int numQqsAcc = 0;
+	payload = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		payload |= (1 << qg_idx);
+	for (template_idx = SIG_UL_4G; template_idx <= SIG_UL_4G_LAST;
+			template_idx++) {
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD*template_idx;
+		acc100_reg_write(d, address, payload);
+	}
+	/* 5GUL */
+	numQqsAcc += numQgs;
+	numQgs	= conf->q_ul_5g.num_qgroups;
+	payload = 0;
+	int numEngines = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		payload |= (1 << qg_idx);
+	for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
+			template_idx++) {
+		/* Check engine power-on status */
+		address = HwPfFecUl5gIbDebugReg +
+				ACC100_ENGINE_OFFSET * template_idx;
+		status = (acc100_reg_read(d, address) >> 4) & 0xF;
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD * template_idx;
+		if (status == 1) {
+			acc100_reg_write(d, address, payload);
+			numEngines++;
+		} else
+			acc100_reg_write(d, address, 0);
+		#if RTE_ACC100_SINGLE_FEC == 1
+		payload = 0;
+		#endif
+	}
+	printf("Number of 5GUL engines %d\n", numEngines);
+	/* 4GDL */
+	numQqsAcc += numQgs;
+	numQgs	= conf->q_dl_4g.num_qgroups;
+	payload = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		payload |= (1 << qg_idx);
+	for (template_idx = SIG_DL_4G; template_idx <= SIG_DL_4G_LAST;
+			template_idx++) {
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD*template_idx;
+		acc100_reg_write(d, address, payload);
+		#if RTE_ACC100_SINGLE_FEC == 1
+			payload = 0;
+		#endif
+	}
+	/* 5GDL */
+	numQqsAcc += numQgs;
+	numQgs	= conf->q_dl_5g.num_qgroups;
+	payload = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		payload |= (1 << qg_idx);
+	for (template_idx = SIG_DL_5G; template_idx <= SIG_DL_5G_LAST;
+			template_idx++) {
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD*template_idx;
+		acc100_reg_write(d, address, payload);
+		#if RTE_ACC100_SINGLE_FEC == 1
+		payload = 0;
+		#endif
+	}
+
+	/* Queue Group Function mapping */
+	int qman_func_id[5] = {0, 2, 1, 3, 4};
+	address = HWPfQmgrGrpFunction0;
+	payload = 0;
+	for (qg_idx = 0; qg_idx < 8; qg_idx++) {
+		acc = accFromQgid(qg_idx, conf);
+		payload |= qman_func_id[acc]<<(qg_idx * 4);
+	}
+	acc100_reg_write(d, address, payload);
+
+	/* Configuration of the Arbitration QGroup depth to 1 */
+	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
+		address = HWPfQmgrArbQDepthGrp +
+				BYTES_IN_WORD * qg_idx;
+		payload = 0;
+		acc100_reg_write(d, address, payload);
+	}
+
+	/* Enabling AQueues through the Queue hierarchy */
+	for (vf_idx = 0; vf_idx < ACC100_NUM_VFS; vf_idx++) {
+		for (qg_idx = 0; qg_idx < ACC100_NUM_QGRPS; qg_idx++) {
+			payload = 0;
+			if (vf_idx < conf->num_vf_bundles &&
+					qg_idx < totalQgs)
+				payload = (1 << aqNum(qg_idx, conf)) - 1;
+			address = HWPfQmgrAqEnableVf
+					+ vf_idx * BYTES_IN_WORD;
+			payload += (qg_idx << 16);
+			acc100_reg_write(d, address, payload);
+		}
+	}
+
+	/* This pointer to ARAM (256kB) is shifted by 2 (4B per register) */
+	uint32_t aram_address = 0;
+	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
+		for (vf_idx = 0; vf_idx < conf->num_vf_bundles; vf_idx++) {
+			address = HWPfQmgrVfBaseAddr + vf_idx
+					* BYTES_IN_WORD + qg_idx
+					* BYTES_IN_WORD * 64;
+			payload = aram_address;
+			acc100_reg_write(d, address, payload);
+			/* Offset ARAM Address for next memory bank
+			 * - increment of 4B
+			 */
+			aram_address += aqNum(qg_idx, conf) *
+					(1 << aqDepth(qg_idx, conf));
+		}
+	}
+
+	if (aram_address > WORDS_IN_ARAM_SIZE) {
+		rte_bbdev_log(ERR, "ARAM Configuration not fitting %d %d\n",
+				aram_address, WORDS_IN_ARAM_SIZE);
+		return -EINVAL;
+	}
+
+	/* ==== HI Configuration ==== */
+
+	/* Prevent Block on Transmit Error */
+	address = HWPfHiBlockTransmitOnErrorEn;
+	payload = 0;
+	acc100_reg_write(d, address, payload);
+	/* Prevent MSI from being dropped */
+	address = HWPfHiMsiDropEnableReg;
+	payload = 0;
+	acc100_reg_write(d, address, payload);
+	/* Set the PF Mode register */
+	address = HWPfHiPfMode;
+	payload = (conf->pf_mode_en) ? 2 : 0;
+	acc100_reg_write(d, address, payload);
+	/* Enable Error Detection in HW */
+	address = HWPfDmaErrorDetectionEn;
+	payload = 0x3D7;
+	acc100_reg_write(d, address, payload);
+
+	/* QoS overflow init */
+	payload = 1;
+	address = HWPfQosmonAEvalOverflow0;
+	acc100_reg_write(d, address, payload);
+	address = HWPfQosmonBEvalOverflow0;
+	acc100_reg_write(d, address, payload);
+
+	/* HARQ DDR Configuration */
+	unsigned int ddrSizeInMb = 512; /* Fixed to 512 MB per VF for now */
+	for (vf_idx = 0; vf_idx < conf->num_vf_bundles; vf_idx++) {
+		address = HWPfDmaVfDdrBaseRw + vf_idx
+				* 0x10;
+		payload = ((vf_idx * (ddrSizeInMb / 64)) << 16) +
+				(ddrSizeInMb - 1);
+		acc100_reg_write(d, address, payload);
+	}
+	usleep(LONG_WAIT);
+
+	if (numEngines < (SIG_UL_5G_LAST + 1))
+		poweron_cleanup(bbdev, d, conf);
+
+	rte_bbdev_log_debug("PF Tip configuration complete for %s", dev_name);
+	return 0;
+}
diff --git a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
index 4a76d1d..91c234d 100644
--- a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
+++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
@@ -1,3 +1,10 @@
 DPDK_21 {
 	local: *;
 };
+
+EXPERIMENTAL {
+	global:
+
+	acc100_configure;
+
+};
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v7 11/11] doc: update bbdev feature table
  2020-09-23  2:24   ` [dpdk-dev] [PATCH v7 00/11] bbdev PMD ACC100 Nicolas Chautru
                       ` (9 preceding siblings ...)
  2020-09-23  2:25     ` [dpdk-dev] [PATCH v7 10/11] baseband/acc100: add configure function Nicolas Chautru
@ 2020-09-23  2:25     ` Nicolas Chautru
  2020-09-28 20:19       ` Akhil Goyal
  10 siblings, 1 reply; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-23  2:25 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Correcting overview matrix to use acc100 name

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 doc/guides/bbdevs/features/acc100.ini | 14 ++++++++++++++
 doc/guides/bbdevs/features/mbc.ini    | 14 --------------
 2 files changed, 14 insertions(+), 14 deletions(-)
 create mode 100644 doc/guides/bbdevs/features/acc100.ini
 delete mode 100644 doc/guides/bbdevs/features/mbc.ini

diff --git a/doc/guides/bbdevs/features/acc100.ini b/doc/guides/bbdevs/features/acc100.ini
new file mode 100644
index 0000000..642cd48
--- /dev/null
+++ b/doc/guides/bbdevs/features/acc100.ini
@@ -0,0 +1,14 @@
+;
+; Supported features of the 'acc100' bbdev driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+Turbo Decoder (4G)     = Y
+Turbo Encoder (4G)     = Y
+LDPC Decoder (5G)      = Y
+LDPC Encoder (5G)      = Y
+LLR/HARQ Compression   = Y
+External DDR Access    = Y
+HW Accelerated         = Y
+BBDEV API              = Y
diff --git a/doc/guides/bbdevs/features/mbc.ini b/doc/guides/bbdevs/features/mbc.ini
deleted file mode 100644
index 78a7b95..0000000
--- a/doc/guides/bbdevs/features/mbc.ini
+++ /dev/null
@@ -1,14 +0,0 @@
-;
-; Supported features of the 'mbc' bbdev driver.
-;
-; Refer to default.ini for the full list of available PMD features.
-;
-[Features]
-Turbo Decoder (4G)     = Y
-Turbo Encoder (4G)     = Y
-LDPC Decoder (5G)      = Y
-LDPC Encoder (5G)      = Y
-LLR/HARQ Compression   = Y
-External DDR Access    = Y
-HW Accelerated         = Y
-BBDEV API              = Y
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v7 11/11] doc: update bbdev feature table
  2020-09-23  2:25     ` [dpdk-dev] [PATCH v7 11/11] doc: update bbdev feature table Nicolas Chautru
@ 2020-09-28 20:19       ` Akhil Goyal
  2020-09-29  0:57         ` Chautru, Nicolas
  0 siblings, 1 reply; 213+ messages in thread
From: Akhil Goyal @ 2020-09-28 20:19 UTC (permalink / raw)
  To: Nicolas Chautru, dev
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu

Hi Nicolas,

> 
> Correcting overview matrix to use acc100 name
> 
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
> ---
>  doc/guides/bbdevs/features/acc100.ini | 14 ++++++++++++++
>  doc/guides/bbdevs/features/mbc.ini    | 14 --------------
>  2 files changed, 14 insertions(+), 14 deletions(-)
>  create mode 100644 doc/guides/bbdevs/features/acc100.ini
>  delete mode 100644 doc/guides/bbdevs/features/mbc.ini
> 
> diff --git a/doc/guides/bbdevs/features/acc100.ini
> b/doc/guides/bbdevs/features/acc100.ini
> new file mode 100644
> index 0000000..642cd48
> --- /dev/null
> +++ b/doc/guides/bbdevs/features/acc100.ini
> @@ -0,0 +1,14 @@
> +;
> +; Supported features of the 'acc100' bbdev driver.
> +;
> +; Refer to default.ini for the full list of available PMD features.
> +;
> +[Features]
> +Turbo Decoder (4G)     = Y
> +Turbo Encoder (4G)     = Y
> +LDPC Decoder (5G)      = Y
> +LDPC Encoder (5G)      = Y
> +LLR/HARQ Compression   = Y
> +External DDR Access    = Y
> +HW Accelerated         = Y
> +BBDEV API              = Y
We normally do not take separate feature set patches for documentation.
These should be split across your patchset, where you are actually adding 
the feature.

Also, the release notes in the first patch are not correct as the PMD is not
complete there. You can add it in the last patch.

> diff --git a/doc/guides/bbdevs/features/mbc.ini
> b/doc/guides/bbdevs/features/mbc.ini
> deleted file mode 100644
> index 78a7b95..0000000
> --- a/doc/guides/bbdevs/features/mbc.ini
> +++ /dev/null
> @@ -1,14 +0,0 @@
> -;
> -; Supported features of the 'mbc' bbdev driver.
> -;
> -; Refer to default.ini for the full list of available PMD features.
> -;
> -[Features]
> -Turbo Decoder (4G)     = Y
> -Turbo Encoder (4G)     = Y
> -LDPC Decoder (5G)      = Y
> -LDPC Encoder (5G)      = Y
> -LLR/HARQ Compression   = Y
> -External DDR Access    = Y
> -HW Accelerated         = Y
> -BBDEV API              = Y

Not sure how it was missed earlier.
Please submit a separate patch for this. It should also be sent for stable backport,
and include a Fixes line.

^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v8 00/10] bbdev PMD ACC100
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 11/11] doc: update bbdev feature table Nicolas Chautru
                     ` (3 preceding siblings ...)
  2020-09-23  2:24   ` [dpdk-dev] [PATCH v7 00/11] bbdev PMD ACC100 Nicolas Chautru
@ 2020-09-28 23:52   ` Nicolas Chautru
  2020-09-28 23:52     ` [dpdk-dev] [PATCH v8 01/10] drivers/baseband: add PMD for ACC100 Nicolas Chautru
                       ` (9 more replies)
  2020-09-29  0:29   ` [dpdk-dev] [PATCH v9 00/10] bbdev PMD ACC100 Nicolas Chautru
                     ` (3 subsequent siblings)
  8 siblings, 10 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-28 23:52 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

v7: integrated the doc feature table in previous commit as suggested. 
v7: Fingers trouble. Previous one sent mid-rebase. My bad. 
v6: removed a legacy makefile no longer required
v5: rebase based on latest on main. The legacy makefiles are removed. 
v4: an odd compilation error is reported for one CI variant using "gcc latest" which looks to me like a false positive of maybe-undeclared. 
http://mails.dpdk.org/archives/test-report/2020-August/148936.html
Still forcing a dummy declare to remove this CI warning; I will check with ci@dpdk.org in parallel.
v3: missed a change during rebase
v2: includes clean up from latest CI checks.


Nicolas Chautru (10):
  drivers/baseband: add PMD for ACC100
  baseband/acc100: add register definition file
  baseband/acc100: add info get function
  baseband/acc100: add queue configuration
  baseband/acc100: add LDPC processing functions
  baseband/acc100: add HARQ loopback support
  baseband/acc100: add support for 4G processing
  baseband/acc100: add interrupt support to PMD
  baseband/acc100: add debug function to validate input
  baseband/acc100: add configure function

 app/test-bbdev/meson.build                         |    3 +
 app/test-bbdev/test_bbdev_perf.c                   |   72 +
 doc/guides/bbdevs/acc100.rst                       |  233 +
 doc/guides/bbdevs/features/acc100.ini              |   14 +
 doc/guides/bbdevs/index.rst                        |    1 +
 doc/guides/rel_notes/release_20_11.rst             |    6 +
 drivers/baseband/acc100/acc100_pf_enum.h           | 1068 +++++
 drivers/baseband/acc100/acc100_vf_enum.h           |   73 +
 drivers/baseband/acc100/meson.build                |    8 +
 drivers/baseband/acc100/rte_acc100_cfg.h           |  113 +
 drivers/baseband/acc100/rte_acc100_pmd.c           | 4684 ++++++++++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.h           |  593 +++
 .../acc100/rte_pmd_bbdev_acc100_version.map        |   10 +
 drivers/baseband/meson.build                       |    2 +-
 14 files changed, 6879 insertions(+), 1 deletion(-)
 create mode 100644 doc/guides/bbdevs/acc100.rst
 create mode 100644 doc/guides/bbdevs/features/acc100.ini
 create mode 100644 drivers/baseband/acc100/acc100_pf_enum.h
 create mode 100644 drivers/baseband/acc100/acc100_vf_enum.h
 create mode 100644 drivers/baseband/acc100/meson.build
 create mode 100644 drivers/baseband/acc100/rte_acc100_cfg.h
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.c
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.h
 create mode 100644 drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v8 01/10] drivers/baseband: add PMD for ACC100
  2020-09-28 23:52   ` [dpdk-dev] [PATCH v8 00/10] bbdev PMD ACC100 Nicolas Chautru
@ 2020-09-28 23:52     ` Nicolas Chautru
  2020-09-28 23:52     ` [dpdk-dev] [PATCH v8 02/10] baseband/acc100: add register definition file Nicolas Chautru
                       ` (8 subsequent siblings)
  9 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-28 23:52 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add stubs for the ACC100 PMD

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 doc/guides/bbdevs/acc100.rst                       | 233 +++++++++++++++++++++
 doc/guides/bbdevs/features/acc100.ini              |  14 ++
 doc/guides/bbdevs/index.rst                        |   1 +
 doc/guides/rel_notes/release_20_11.rst             |   6 +
 drivers/baseband/acc100/meson.build                |   6 +
 drivers/baseband/acc100/rte_acc100_pmd.c           | 175 ++++++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.h           |  37 ++++
 .../acc100/rte_pmd_bbdev_acc100_version.map        |   3 +
 drivers/baseband/meson.build                       |   2 +-
 9 files changed, 476 insertions(+), 1 deletion(-)
 create mode 100644 doc/guides/bbdevs/acc100.rst
 create mode 100644 doc/guides/bbdevs/features/acc100.ini
 create mode 100644 drivers/baseband/acc100/meson.build
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.c
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.h
 create mode 100644 drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map

diff --git a/doc/guides/bbdevs/acc100.rst b/doc/guides/bbdevs/acc100.rst
new file mode 100644
index 0000000..f87ee09
--- /dev/null
+++ b/doc/guides/bbdevs/acc100.rst
@@ -0,0 +1,233 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2020 Intel Corporation
+
+Intel(R) ACC100 5G/4G FEC Poll Mode Driver
+==========================================
+
+The BBDEV ACC100 5G/4G FEC poll mode driver (PMD) supports an
+implementation of a VRAN FEC wireless acceleration function.
+This device is also known as Mount Bryce.
+
+Features
+--------
+
+ACC100 5G/4G FEC PMD supports the following features:
+
+- LDPC Encode in the DL (5GNR)
+- LDPC Decode in the UL (5GNR)
+- Turbo Encode in the DL (4G)
+- Turbo Decode in the UL (4G)
+- 16 VFs per PF (physical device)
+- Maximum of 128 queues per VF
+- PCIe Gen-3 x16 Interface
+- MSI
+- SR-IOV
+
+ACC100 5G/4G FEC PMD supports the following BBDEV capabilities:
+
+* For the LDPC encode operation:
+   - ``RTE_BBDEV_LDPC_CRC_24B_ATTACH`` :  set to attach CRC24B to CB(s)
+   - ``RTE_BBDEV_LDPC_RATE_MATCH`` :  if set, rate matching is performed (not bypassed)
+   - ``RTE_BBDEV_LDPC_INTERLEAVER_BYPASS`` : if set then bypass interleaver
+
+* For the LDPC decode operation:
+   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK`` :  check CRC24B from CB(s)
+   - ``RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE`` :  disable early termination
+   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP`` :  drops CRC24B bits appended while decoding
+   - ``RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE`` :  provides an input for HARQ combining
+   - ``RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE`` :  provides an output for HARQ combining
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE`` :  HARQ memory input is internal
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE`` :  HARQ memory output is internal
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK`` :  loopback data to/from HARQ memory
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_FILLERS`` :  HARQ memory includes the fillers bits
+   - ``RTE_BBDEV_LDPC_DEC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
+   - ``RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION`` :  supports compression of the HARQ input/output
+   - ``RTE_BBDEV_LDPC_LLR_COMPRESSION`` :  supports LLR input compression
+
+* For the turbo encode operation:
+   - ``RTE_BBDEV_TURBO_CRC_24B_ATTACH`` :  set to attach CRC24B to CB(s)
+   - ``RTE_BBDEV_TURBO_RATE_MATCH`` :  if set, rate matching is performed (not bypassed)
+   - ``RTE_BBDEV_TURBO_ENC_INTERRUPTS`` :  set for encoder dequeue interrupts
+   - ``RTE_BBDEV_TURBO_RV_INDEX_BYPASS`` :  set to bypass RV index
+   - ``RTE_BBDEV_TURBO_ENC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
+
+* For the turbo decode operation:
+   - ``RTE_BBDEV_TURBO_CRC_TYPE_24B`` :  check CRC24B from CB(s)
+   - ``RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE`` :  perform subblock de-interleave
+   - ``RTE_BBDEV_TURBO_DEC_INTERRUPTS`` :  set for decoder dequeue interrupts
+   - ``RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN`` :  set if negative LLR decoder input is supported
+   - ``RTE_BBDEV_TURBO_POS_LLR_1_BIT_IN`` :  set if positive LLR decoder input is supported
+   - ``RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP`` :  keep CRC24B bits appended while decoding
+   - ``RTE_BBDEV_TURBO_EARLY_TERMINATION`` :  set to enable the early termination feature
+   - ``RTE_BBDEV_TURBO_DEC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
+   - ``RTE_BBDEV_TURBO_HALF_ITERATION_EVEN`` :  set half iteration granularity
+
+Installation
+------------
+
+Section 3 of the DPDK manual provides instructions on installing and compiling DPDK. The
+default set of bbdev compile flags may be found in config/common_base, where for example
+the flag to build the ACC100 5G/4G FEC device, ``CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100``,
+is already set.
+
+DPDK requires hugepages to be configured as detailed in section 2 of the DPDK manual.
+The bbdev test application has been tested with a configuration of 40 x 1GB hugepages. The
+hugepage configuration of a server may be examined using:
+
+.. code-block:: console
+
+   grep Huge* /proc/meminfo
+
+
+Initialization
+--------------
+
+When the device first powers up, its PCI Physical Function (PF) can be listed through this command:
+
+.. code-block:: console
+
+  sudo lspci -vd8086:0d5c
+
+The physical and virtual functions are compatible with Linux UIO drivers:
+``vfio`` and ``igb_uio``. However, in order to work, the ACC100 5G/4G
+FEC device first needs to be bound to one of these Linux drivers through DPDK.
+
+
+Bind PF UIO driver(s)
+~~~~~~~~~~~~~~~~~~~~~
+
+Install the DPDK igb_uio driver, bind it with the PF PCI device ID and use
+``lspci`` to confirm the PF device is in use by the ``igb_uio`` DPDK UIO driver.
+
+The igb_uio driver may be bound to the PF PCI device using one of three methods:
+
+
+1. PCI functions (physical or virtual, depending on the use case) can be bound to
+the UIO driver by repeating this command for every function.
+
+.. code-block:: console
+
+  cd <dpdk-top-level-directory>
+  insmod ./build/kmod/igb_uio.ko
+  echo "8086 0d5c" > /sys/bus/pci/drivers/igb_uio/new_id
+  lspci -vd8086:0d5c
+
+
+2. Another way to bind PF with DPDK UIO driver is by using the ``dpdk-devbind.py`` tool
+
+.. code-block:: console
+
+  cd <dpdk-top-level-directory>
+  ./usertools/dpdk-devbind.py -b igb_uio 0000:06:00.0
+
+where the PCI device ID (example: 0000:06:00.0) is obtained using ``lspci -vd8086:0d5c``
+
+
+3. A third way to bind is to use ``dpdk-setup.sh`` tool
+
+.. code-block:: console
+
+  cd <dpdk-top-level-directory>
+  ./usertools/dpdk-setup.sh
+
+  select 'Bind Ethernet/Crypto/Baseband device to IGB UIO module'
+  or
+  select 'Bind Ethernet/Crypto/Baseband device to VFIO module' depending on driver required
+  enter PCI device ID
+  select 'Display current Ethernet/Crypto/Baseband device settings' to confirm binding
+
+
+In the same way the ACC100 5G/4G FEC PF can be bound with vfio, but the vfio driver
+does not support SR-IOV configuration out of the box, so it will need to be patched.
+
+
+Enable Virtual Functions
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Now, it should be visible in the printouts that the PCI PF is under igb_uio control
+"``Kernel driver in use: igb_uio``"
+
+To show the number of available VFs on the device, read the ``sriov_totalvfs`` file:
+
+.. code-block:: console
+
+  cat /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_totalvfs
+
+  where 0000\:<b>\:<d>.<f> is the PCI device ID
+
+
+To enable VFs via igb_uio, echo the number of virtual functions to be
+enabled to the ``max_vfs`` file:
+
+.. code-block:: console
+
+  echo <num-of-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/max_vfs
+
+
+Afterwards, all VFs must be bound to the appropriate UIO drivers as required, in the
+same way as was done with the physical function previously.
+
+Enabling SR-IOV via the vfio driver is much the same, except that the file
+name is different:
+
+.. code-block:: console
+
+  echo <num-of-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_numvfs
+
+
+Configure the VFs through PF
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The PCI virtual functions must be configured before use or before being assigned
+to VMs/Containers. The configuration involves allocating the number of hardware
+queues, priorities, load balance, bandwidth and other settings necessary for the
+device to perform FEC functions.
+
+This configuration needs to be executed at least once after reboot or PCI FLR and can
+be achieved by using the function ``acc100_configure()``, which sets up the
+parameters defined in ``acc100_conf`` structure.
+
+Test Application
+----------------
+
+BBDEV provides a test application, ``test-bbdev.py``, and a range of test data for testing
+the functionality of ACC100 5G/4G FEC encode and decode, depending on the device's
+capabilities. The test application is located under the app/test-bbdev folder and has the
+following options:
+
+.. code-block:: console
+
+  "-p", "--testapp-path": specifies path to the bbdev test app.
+  "-e", "--eal-params"	: EAL arguments which are passed to the test app.
+  "-t", "--timeout"	: Timeout in seconds (default=300).
+  "-c", "--test-cases"	: Defines test cases to run. Run all if not specified.
+  "-v", "--test-vector"	: Test vector path (default=dpdk_path+/app/test-bbdev/test_vectors/bbdev_null.data).
+  "-n", "--num-ops"	: Number of operations to process on device (default=32).
+  "-b", "--burst-size"	: Operations enqueue/dequeue burst size (default=32).
+  "-s", "--snr"		: SNR in dB used when generating LLRs for bler tests.
+  "-s", "--iter_max"	: Number of iterations for LDPC decoder.
+  "-l", "--num-lcores"	: Number of lcores to run (default=16).
+  "-i", "--init-device" : Initialise PF device with default values.
+
+
+To execute the test application tool using simple decode or encode data,
+type one of the following:
+
+.. code-block:: console
+
+  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_dec_default.data
+  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_enc_default.data
+
+
+The test application ``test-bbdev.py`` supports the ability to configure the PF device with
+a default set of values, if the "-i" or "--init-device" option is included. The default values
+are defined in test_bbdev_perf.c.
+
+
+Test Vectors
+~~~~~~~~~~~~
+
+In addition to the simple LDPC decoder and LDPC encoder tests, bbdev also provides
+a range of additional tests under the test_vectors folder, which may be useful. The results
+of these tests will depend on the ACC100 5G/4G FEC capabilities, which may cause some
+test cases to be skipped, but no failure should be reported.
diff --git a/doc/guides/bbdevs/features/acc100.ini b/doc/guides/bbdevs/features/acc100.ini
new file mode 100644
index 0000000..c89a4d7
--- /dev/null
+++ b/doc/guides/bbdevs/features/acc100.ini
@@ -0,0 +1,14 @@
+;
+; Supported features of the 'acc100' bbdev driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+Turbo Decoder (4G)     = N
+Turbo Encoder (4G)     = N
+LDPC Decoder (5G)      = N
+LDPC Encoder (5G)      = N
+LLR/HARQ Compression   = N
+External DDR Access    = N
+HW Accelerated         = Y
+BBDEV API              = Y
diff --git a/doc/guides/bbdevs/index.rst b/doc/guides/bbdevs/index.rst
index a8092dd..4445cbd 100644
--- a/doc/guides/bbdevs/index.rst
+++ b/doc/guides/bbdevs/index.rst
@@ -13,3 +13,4 @@ Baseband Device Drivers
     turbo_sw
     fpga_lte_fec
     fpga_5gnr_fec
+    acc100
diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
index 73ac08f..20639ea 100644
--- a/doc/guides/rel_notes/release_20_11.rst
+++ b/doc/guides/rel_notes/release_20_11.rst
@@ -55,6 +55,12 @@ New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+* **Added Intel ACC100 bbdev PMD.**
+
+  Added a new ``acc100`` bbdev driver for the Intel\ |reg| ACC100 accelerator
+  also known as Mount Bryce.  See the
+  :doc:`../bbdevs/acc100` BBDEV guide for more details on this new driver.
+
 
 Removed Items
 -------------
diff --git a/drivers/baseband/acc100/meson.build b/drivers/baseband/acc100/meson.build
new file mode 100644
index 0000000..8afafc2
--- /dev/null
+++ b/drivers/baseband/acc100/meson.build
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2020 Intel Corporation
+
+deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci']
+
+sources = files('rte_acc100_pmd.c')
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
new file mode 100644
index 0000000..1b4cd13
--- /dev/null
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -0,0 +1,175 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include <unistd.h>
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_dev.h>
+#include <rte_malloc.h>
+#include <rte_mempool.h>
+#include <rte_byteorder.h>
+#include <rte_errno.h>
+#include <rte_branch_prediction.h>
+#include <rte_hexdump.h>
+#include <rte_pci.h>
+#include <rte_bus_pci.h>
+
+#include <rte_bbdev.h>
+#include <rte_bbdev_pmd.h>
+#include "rte_acc100_pmd.h"
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, DEBUG);
+#else
+RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
+#endif
+
+/* Free 64MB memory used for software rings */
+static int
+acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
+{
+	return 0;
+}
+
+static const struct rte_bbdev_ops acc100_bbdev_ops = {
+	.close = acc100_dev_close,
+};
+
+/* ACC100 PCI PF address map */
+static struct rte_pci_id pci_id_acc100_pf_map[] = {
+	{
+		RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID, RTE_ACC100_PF_DEVICE_ID)
+	},
+	{.device_id = 0},
+};
+
+/* ACC100 PCI VF address map */
+static struct rte_pci_id pci_id_acc100_vf_map[] = {
+	{
+		RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID, RTE_ACC100_VF_DEVICE_ID)
+	},
+	{.device_id = 0},
+};
+
+/* Initialization Function */
+static void
+acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
+{
+	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
+
+	dev->dev_ops = &acc100_bbdev_ops;
+
+	((struct acc100_device *) dev->data->dev_private)->pf_device =
+			!strcmp(drv->driver.name,
+					RTE_STR(ACC100PF_DRIVER_NAME));
+	((struct acc100_device *) dev->data->dev_private)->mmio_base =
+			pci_dev->mem_resource[0].addr;
+
+	rte_bbdev_log_debug("Init device %s [%s] @ vaddr %p paddr %#"PRIx64"",
+			drv->driver.name, dev->data->name,
+			(void *)pci_dev->mem_resource[0].addr,
+			pci_dev->mem_resource[0].phys_addr);
+}
+
+static int acc100_pci_probe(struct rte_pci_driver *pci_drv,
+	struct rte_pci_device *pci_dev)
+{
+	struct rte_bbdev *bbdev = NULL;
+	char dev_name[RTE_BBDEV_NAME_MAX_LEN];
+
+	if (pci_dev == NULL) {
+		rte_bbdev_log(ERR, "NULL PCI device");
+		return -EINVAL;
+	}
+
+	rte_pci_device_name(&pci_dev->addr, dev_name, sizeof(dev_name));
+
+	/* Allocate memory to be used privately by drivers */
+	bbdev = rte_bbdev_allocate(pci_dev->device.name);
+	if (bbdev == NULL)
+		return -ENODEV;
+
+	/* allocate device private memory */
+	bbdev->data->dev_private = rte_zmalloc_socket(dev_name,
+			sizeof(struct acc100_device), RTE_CACHE_LINE_SIZE,
+			pci_dev->device.numa_node);
+
+	if (bbdev->data->dev_private == NULL) {
+		rte_bbdev_log(CRIT,
+				"Allocate of %zu bytes for device \"%s\" failed",
+				sizeof(struct acc100_device), dev_name);
+				rte_bbdev_release(bbdev);
+			return -ENOMEM;
+	}
+
+	/* Fill HW specific part of device structure */
+	bbdev->device = &pci_dev->device;
+	bbdev->intr_handle = &pci_dev->intr_handle;
+	bbdev->data->socket_id = pci_dev->device.numa_node;
+
+	/* Invoke ACC100 device initialization function */
+	acc100_bbdev_init(bbdev, pci_drv);
+
+	rte_bbdev_log_debug("Initialised bbdev %s (id = %u)",
+			dev_name, bbdev->data->dev_id);
+	return 0;
+}
+
+static int acc100_pci_remove(struct rte_pci_device *pci_dev)
+{
+	struct rte_bbdev *bbdev;
+	int ret;
+	uint8_t dev_id;
+
+	if (pci_dev == NULL)
+		return -EINVAL;
+
+	/* Find device */
+	bbdev = rte_bbdev_get_named_dev(pci_dev->device.name);
+	if (bbdev == NULL) {
+		rte_bbdev_log(CRIT,
+				"Couldn't find HW dev \"%s\" to uninitialise it",
+				pci_dev->device.name);
+		return -ENODEV;
+	}
+	dev_id = bbdev->data->dev_id;
+
+	/* free device private memory before close */
+	rte_free(bbdev->data->dev_private);
+
+	/* Close device */
+	ret = rte_bbdev_close(dev_id);
+	if (ret < 0)
+		rte_bbdev_log(ERR,
+				"Device %i failed to close during uninit: %i",
+				dev_id, ret);
+
+	/* release bbdev from library */
+	rte_bbdev_release(bbdev);
+
+	rte_bbdev_log_debug("Destroyed bbdev = %u", dev_id);
+
+	return 0;
+}
+
+static struct rte_pci_driver acc100_pci_pf_driver = {
+		.probe = acc100_pci_probe,
+		.remove = acc100_pci_remove,
+		.id_table = pci_id_acc100_pf_map,
+		.drv_flags = RTE_PCI_DRV_NEED_MAPPING
+};
+
+static struct rte_pci_driver acc100_pci_vf_driver = {
+		.probe = acc100_pci_probe,
+		.remove = acc100_pci_remove,
+		.id_table = pci_id_acc100_vf_map,
+		.drv_flags = RTE_PCI_DRV_NEED_MAPPING
+};
+
+RTE_PMD_REGISTER_PCI(ACC100PF_DRIVER_NAME, acc100_pci_pf_driver);
+RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
+RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
+RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
+
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
new file mode 100644
index 0000000..6f46df0
--- /dev/null
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -0,0 +1,37 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_ACC100_PMD_H_
+#define _RTE_ACC100_PMD_H_
+
+/* Helper macro for logging */
+#define rte_bbdev_log(level, fmt, ...) \
+	rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \
+		##__VA_ARGS__)
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+#define rte_bbdev_log_debug(fmt, ...) \
+		rte_bbdev_log(DEBUG, "acc100_pmd: " fmt, \
+		##__VA_ARGS__)
+#else
+#define rte_bbdev_log_debug(fmt, ...)
+#endif
+
+/* ACC100 PF and VF driver names */
+#define ACC100PF_DRIVER_NAME           intel_acc100_pf
+#define ACC100VF_DRIVER_NAME           intel_acc100_vf
+
+/* ACC100 PCI vendor & device IDs */
+#define RTE_ACC100_VENDOR_ID           (0x8086)
+#define RTE_ACC100_PF_DEVICE_ID        (0x0d5c)
+#define RTE_ACC100_VF_DEVICE_ID        (0x0d5d)
+
+/* Private data structure for each ACC100 device */
+struct acc100_device {
+	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
+	bool pf_device; /**< True if this is a PF ACC100 device */
+	bool configured; /**< True if this ACC100 device is configured */
+};
+
+#endif /* _RTE_ACC100_PMD_H_ */
diff --git a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
new file mode 100644
index 0000000..4a76d1d
--- /dev/null
+++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
@@ -0,0 +1,3 @@
+DPDK_21 {
+	local: *;
+};
diff --git a/drivers/baseband/meson.build b/drivers/baseband/meson.build
index 415b672..72301ce 100644
--- a/drivers/baseband/meson.build
+++ b/drivers/baseband/meson.build
@@ -5,7 +5,7 @@ if is_windows
 	subdir_done()
 endif
 
-drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec']
+drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec', 'acc100']
 
 config_flag_fmt = 'RTE_LIBRTE_PMD_BBDEV_@0@'
 driver_name_fmt = 'rte_pmd_bbdev_@0@'
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v8 02/10] baseband/acc100: add register definition file
  2020-09-28 23:52   ` [dpdk-dev] [PATCH v8 00/10] bbdev PMD ACC100 Nicolas Chautru
  2020-09-28 23:52     ` [dpdk-dev] [PATCH v8 01/10] drivers/baseband: add PMD for ACC100 Nicolas Chautru
@ 2020-09-28 23:52     ` Nicolas Chautru
  2020-09-28 23:52     ` [dpdk-dev] [PATCH v8 03/10] baseband/acc100: add info get function Nicolas Chautru
                       ` (7 subsequent siblings)
  9 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-28 23:52 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add in the list of registers for the device and related
HW specs definitions.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 drivers/baseband/acc100/acc100_pf_enum.h | 1068 ++++++++++++++++++++++++++++++
 drivers/baseband/acc100/acc100_vf_enum.h |   73 ++
 drivers/baseband/acc100/rte_acc100_pmd.h |  490 ++++++++++++++
 3 files changed, 1631 insertions(+)
 create mode 100644 drivers/baseband/acc100/acc100_pf_enum.h
 create mode 100644 drivers/baseband/acc100/acc100_vf_enum.h

diff --git a/drivers/baseband/acc100/acc100_pf_enum.h b/drivers/baseband/acc100/acc100_pf_enum.h
new file mode 100644
index 0000000..a1ee416
--- /dev/null
+++ b/drivers/baseband/acc100/acc100_pf_enum.h
@@ -0,0 +1,1068 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017 Intel Corporation
+ */
+
+#ifndef ACC100_PF_ENUM_H
+#define ACC100_PF_ENUM_H
+
+/*
+ * ACC100 Register mapping on PF BAR0
+ * This is automatically generated from RDL, format may change with new RDL
+ * Release.
+ * Variable names are as is
+ */
+enum {
+	HWPfQmgrEgressQueuesTemplate          =  0x0007FE00,
+	HWPfQmgrIngressAq                     =  0x00080000,
+	HWPfQmgrArbQAvail                     =  0x00A00010,
+	HWPfQmgrArbQBlock                     =  0x00A00014,
+	HWPfQmgrAqueueDropNotifEn             =  0x00A00024,
+	HWPfQmgrAqueueDisableNotifEn          =  0x00A00028,
+	HWPfQmgrSoftReset                     =  0x00A00038,
+	HWPfQmgrInitStatus                    =  0x00A0003C,
+	HWPfQmgrAramWatchdogCount             =  0x00A00040,
+	HWPfQmgrAramWatchdogCounterEn         =  0x00A00044,
+	HWPfQmgrAxiWatchdogCount              =  0x00A00048,
+	HWPfQmgrAxiWatchdogCounterEn          =  0x00A0004C,
+	HWPfQmgrProcessWatchdogCount          =  0x00A00050,
+	HWPfQmgrProcessWatchdogCounterEn      =  0x00A00054,
+	HWPfQmgrProcessUl4GWatchdogCounter    =  0x00A00058,
+	HWPfQmgrProcessDl4GWatchdogCounter    =  0x00A0005C,
+	HWPfQmgrProcessUl5GWatchdogCounter    =  0x00A00060,
+	HWPfQmgrProcessDl5GWatchdogCounter    =  0x00A00064,
+	HWPfQmgrProcessMldWatchdogCounter     =  0x00A00068,
+	HWPfQmgrMsiOverflowUpperVf            =  0x00A00070,
+	HWPfQmgrMsiOverflowLowerVf            =  0x00A00074,
+	HWPfQmgrMsiWatchdogOverflow           =  0x00A00078,
+	HWPfQmgrMsiOverflowEnable             =  0x00A0007C,
+	HWPfQmgrDebugAqPointerMemGrp          =  0x00A00100,
+	HWPfQmgrDebugOutputArbQFifoGrp        =  0x00A00140,
+	HWPfQmgrDebugMsiFifoGrp               =  0x00A00180,
+	HWPfQmgrDebugAxiWdTimeoutMsiFifo      =  0x00A001C0,
+	HWPfQmgrDebugProcessWdTimeoutMsiFifo  =  0x00A001C4,
+	HWPfQmgrDepthLog2Grp                  =  0x00A00200,
+	HWPfQmgrTholdGrp                      =  0x00A00300,
+	HWPfQmgrGrpTmplateReg0Indx            =  0x00A00600,
+	HWPfQmgrGrpTmplateReg1Indx            =  0x00A00680,
+	HWPfQmgrGrpTmplateReg2indx            =  0x00A00700,
+	HWPfQmgrGrpTmplateReg3Indx            =  0x00A00780,
+	HWPfQmgrGrpTmplateReg4Indx            =  0x00A00800,
+	HWPfQmgrVfBaseAddr                    =  0x00A01000,
+	HWPfQmgrUl4GWeightRrVf                =  0x00A02000,
+	HWPfQmgrDl4GWeightRrVf                =  0x00A02100,
+	HWPfQmgrUl5GWeightRrVf                =  0x00A02200,
+	HWPfQmgrDl5GWeightRrVf                =  0x00A02300,
+	HWPfQmgrMldWeightRrVf                 =  0x00A02400,
+	HWPfQmgrArbQDepthGrp                  =  0x00A02F00,
+	HWPfQmgrGrpFunction0                  =  0x00A02F40,
+	HWPfQmgrGrpFunction1                  =  0x00A02F44,
+	HWPfQmgrGrpPriority                   =  0x00A02F48,
+	HWPfQmgrWeightSync                    =  0x00A03000,
+	HWPfQmgrAqEnableVf                    =  0x00A10000,
+	HWPfQmgrAqResetVf                     =  0x00A20000,
+	HWPfQmgrRingSizeVf                    =  0x00A20004,
+	HWPfQmgrGrpDepthLog20Vf               =  0x00A20008,
+	HWPfQmgrGrpDepthLog21Vf               =  0x00A2000C,
+	HWPfQmgrGrpFunction0Vf                =  0x00A20010,
+	HWPfQmgrGrpFunction1Vf                =  0x00A20014,
+	HWPfDmaConfig0Reg                     =  0x00B80000,
+	HWPfDmaConfig1Reg                     =  0x00B80004,
+	HWPfDmaQmgrAddrReg                    =  0x00B80008,
+	HWPfDmaSoftResetReg                   =  0x00B8000C,
+	HWPfDmaAxcacheReg                     =  0x00B80010,
+	HWPfDmaVersionReg                     =  0x00B80014,
+	HWPfDmaFrameThreshold                 =  0x00B80018,
+	HWPfDmaTimestampLo                    =  0x00B8001C,
+	HWPfDmaTimestampHi                    =  0x00B80020,
+	HWPfDmaAxiStatus                      =  0x00B80028,
+	HWPfDmaAxiControl                     =  0x00B8002C,
+	HWPfDmaNoQmgr                         =  0x00B80030,
+	HWPfDmaQosScale                       =  0x00B80034,
+	HWPfDmaQmanen                         =  0x00B80040,
+	HWPfDmaQmgrQosBase                    =  0x00B80060,
+	HWPfDmaFecClkGatingEnable             =  0x00B80080,
+	HWPfDmaPmEnable                       =  0x00B80084,
+	HWPfDmaQosEnable                      =  0x00B80088,
+	HWPfDmaHarqWeightedRrFrameThreshold   =  0x00B800B0,
+	HWPfDmaDataSmallWeightedRrFrameThresh  = 0x00B800B4,
+	HWPfDmaDataLargeWeightedRrFrameThresh  = 0x00B800B8,
+	HWPfDmaInboundCbMaxSize               =  0x00B800BC,
+	HWPfDmaInboundDrainDataSize           =  0x00B800C0,
+	HWPfDmaVfDdrBaseRw                    =  0x00B80400,
+	HWPfDmaCmplTmOutCnt                   =  0x00B80800,
+	HWPfDmaProcTmOutCnt                   =  0x00B80804,
+	HWPfDmaStatusRrespBresp               =  0x00B80810,
+	HWPfDmaCfgRrespBresp                  =  0x00B80814,
+	HWPfDmaStatusMemParErr                =  0x00B80818,
+	HWPfDmaCfgMemParErrEn                 =  0x00B8081C,
+	HWPfDmaStatusDmaHwErr                 =  0x00B80820,
+	HWPfDmaCfgDmaHwErrEn                  =  0x00B80824,
+	HWPfDmaStatusFecCoreErr               =  0x00B80828,
+	HWPfDmaCfgFecCoreErrEn                =  0x00B8082C,
+	HWPfDmaStatusFcwDescrErr              =  0x00B80830,
+	HWPfDmaCfgFcwDescrErrEn               =  0x00B80834,
+	HWPfDmaStatusBlockTransmit            =  0x00B80838,
+	HWPfDmaBlockOnErrEn                   =  0x00B8083C,
+	HWPfDmaStatusFlushDma                 =  0x00B80840,
+	HWPfDmaFlushDmaOnErrEn                =  0x00B80844,
+	HWPfDmaStatusSdoneFifoFull            =  0x00B80848,
+	HWPfDmaStatusDescriptorErrLoVf        =  0x00B8084C,
+	HWPfDmaStatusDescriptorErrHiVf        =  0x00B80850,
+	HWPfDmaStatusFcwErrLoVf               =  0x00B80854,
+	HWPfDmaStatusFcwErrHiVf               =  0x00B80858,
+	HWPfDmaStatusDataErrLoVf              =  0x00B8085C,
+	HWPfDmaStatusDataErrHiVf              =  0x00B80860,
+	HWPfDmaCfgMsiEnSoftwareErr            =  0x00B80864,
+	HWPfDmaDescriptorSignatuture          =  0x00B80868,
+	HWPfDmaFcwSignature                   =  0x00B8086C,
+	HWPfDmaErrorDetectionEn               =  0x00B80870,
+	HWPfDmaErrCntrlFifoDebug              =  0x00B8087C,
+	HWPfDmaStatusToutData                 =  0x00B80880,
+	HWPfDmaStatusToutDesc                 =  0x00B80884,
+	HWPfDmaStatusToutUnexpData            =  0x00B80888,
+	HWPfDmaStatusToutUnexpDesc            =  0x00B8088C,
+	HWPfDmaStatusToutProcess              =  0x00B80890,
+	HWPfDmaConfigCtoutOutDataEn           =  0x00B808A0,
+	HWPfDmaConfigCtoutOutDescrEn          =  0x00B808A4,
+	HWPfDmaConfigUnexpComplDataEn         =  0x00B808A8,
+	HWPfDmaConfigUnexpComplDescrEn        =  0x00B808AC,
+	HWPfDmaConfigPtoutOutEn               =  0x00B808B0,
+	HWPfDmaFec5GulDescBaseLoRegVf         =  0x00B88020,
+	HWPfDmaFec5GulDescBaseHiRegVf         =  0x00B88024,
+	HWPfDmaFec5GulRespPtrLoRegVf          =  0x00B88028,
+	HWPfDmaFec5GulRespPtrHiRegVf          =  0x00B8802C,
+	HWPfDmaFec5GdlDescBaseLoRegVf         =  0x00B88040,
+	HWPfDmaFec5GdlDescBaseHiRegVf         =  0x00B88044,
+	HWPfDmaFec5GdlRespPtrLoRegVf          =  0x00B88048,
+	HWPfDmaFec5GdlRespPtrHiRegVf          =  0x00B8804C,
+	HWPfDmaFec4GulDescBaseLoRegVf         =  0x00B88060,
+	HWPfDmaFec4GulDescBaseHiRegVf         =  0x00B88064,
+	HWPfDmaFec4GulRespPtrLoRegVf          =  0x00B88068,
+	HWPfDmaFec4GulRespPtrHiRegVf          =  0x00B8806C,
+	HWPfDmaFec4GdlDescBaseLoRegVf         =  0x00B88080,
+	HWPfDmaFec4GdlDescBaseHiRegVf         =  0x00B88084,
+	HWPfDmaFec4GdlRespPtrLoRegVf          =  0x00B88088,
+	HWPfDmaFec4GdlRespPtrHiRegVf          =  0x00B8808C,
+	HWPfDmaVfDdrBaseRangeRo               =  0x00B880A0,
+	HWPfQosmonACntrlReg                   =  0x00B90000,
+	HWPfQosmonAEvalOverflow0              =  0x00B90008,
+	HWPfQosmonAEvalOverflow1              =  0x00B9000C,
+	HWPfQosmonADivTerm                    =  0x00B90010,
+	HWPfQosmonATickTerm                   =  0x00B90014,
+	HWPfQosmonAEvalTerm                   =  0x00B90018,
+	HWPfQosmonAAveTerm                    =  0x00B9001C,
+	HWPfQosmonAForceEccErr                =  0x00B90020,
+	HWPfQosmonAEccErrDetect               =  0x00B90024,
+	HWPfQosmonAIterationConfig0Low        =  0x00B90060,
+	HWPfQosmonAIterationConfig0High       =  0x00B90064,
+	HWPfQosmonAIterationConfig1Low        =  0x00B90068,
+	HWPfQosmonAIterationConfig1High       =  0x00B9006C,
+	HWPfQosmonAIterationConfig2Low        =  0x00B90070,
+	HWPfQosmonAIterationConfig2High       =  0x00B90074,
+	HWPfQosmonAIterationConfig3Low        =  0x00B90078,
+	HWPfQosmonAIterationConfig3High       =  0x00B9007C,
+	HWPfQosmonAEvalMemAddr                =  0x00B90080,
+	HWPfQosmonAEvalMemData                =  0x00B90084,
+	HWPfQosmonAXaction                    =  0x00B900C0,
+	HWPfQosmonARemThres1Vf                =  0x00B90400,
+	HWPfQosmonAThres2Vf                   =  0x00B90404,
+	HWPfQosmonAWeiFracVf                  =  0x00B90408,
+	HWPfQosmonARrWeiVf                    =  0x00B9040C,
+	HWPfPermonACntrlRegVf                 =  0x00B98000,
+	HWPfPermonACountVf                    =  0x00B98008,
+	HWPfPermonAKCntLoVf                   =  0x00B98010,
+	HWPfPermonAKCntHiVf                   =  0x00B98014,
+	HWPfPermonADeltaCntLoVf               =  0x00B98020,
+	HWPfPermonADeltaCntHiVf               =  0x00B98024,
+	HWPfPermonAVersionReg                 =  0x00B9C000,
+	HWPfPermonACbControlFec               =  0x00B9C0F0,
+	HWPfPermonADltTimerLoFec              =  0x00B9C0F4,
+	HWPfPermonADltTimerHiFec              =  0x00B9C0F8,
+	HWPfPermonACbCountFec                 =  0x00B9C100,
+	HWPfPermonAAccExecTimerLoFec          =  0x00B9C104,
+	HWPfPermonAAccExecTimerHiFec          =  0x00B9C108,
+	HWPfPermonAExecTimerMinFec            =  0x00B9C200,
+	HWPfPermonAExecTimerMaxFec            =  0x00B9C204,
+	HWPfPermonAControlBusMon              =  0x00B9C400,
+	HWPfPermonAConfigBusMon               =  0x00B9C404,
+	HWPfPermonASkipCountBusMon            =  0x00B9C408,
+	HWPfPermonAMinLatBusMon               =  0x00B9C40C,
+	HWPfPermonAMaxLatBusMon               =  0x00B9C500,
+	HWPfPermonATotalLatLowBusMon          =  0x00B9C504,
+	HWPfPermonATotalLatUpperBusMon        =  0x00B9C508,
+	HWPfPermonATotalReqCntBusMon          =  0x00B9C50C,
+	HWPfQosmonBCntrlReg                   =  0x00BA0000,
+	HWPfQosmonBEvalOverflow0              =  0x00BA0008,
+	HWPfQosmonBEvalOverflow1              =  0x00BA000C,
+	HWPfQosmonBDivTerm                    =  0x00BA0010,
+	HWPfQosmonBTickTerm                   =  0x00BA0014,
+	HWPfQosmonBEvalTerm                   =  0x00BA0018,
+	HWPfQosmonBAveTerm                    =  0x00BA001C,
+	HWPfQosmonBForceEccErr                =  0x00BA0020,
+	HWPfQosmonBEccErrDetect               =  0x00BA0024,
+	HWPfQosmonBIterationConfig0Low        =  0x00BA0060,
+	HWPfQosmonBIterationConfig0High       =  0x00BA0064,
+	HWPfQosmonBIterationConfig1Low        =  0x00BA0068,
+	HWPfQosmonBIterationConfig1High       =  0x00BA006C,
+	HWPfQosmonBIterationConfig2Low        =  0x00BA0070,
+	HWPfQosmonBIterationConfig2High       =  0x00BA0074,
+	HWPfQosmonBIterationConfig3Low        =  0x00BA0078,
+	HWPfQosmonBIterationConfig3High       =  0x00BA007C,
+	HWPfQosmonBEvalMemAddr                =  0x00BA0080,
+	HWPfQosmonBEvalMemData                =  0x00BA0084,
+	HWPfQosmonBXaction                    =  0x00BA00C0,
+	HWPfQosmonBRemThres1Vf                =  0x00BA0400,
+	HWPfQosmonBThres2Vf                   =  0x00BA0404,
+	HWPfQosmonBWeiFracVf                  =  0x00BA0408,
+	HWPfQosmonBRrWeiVf                    =  0x00BA040C,
+	HWPfPermonBCntrlRegVf                 =  0x00BA8000,
+	HWPfPermonBCountVf                    =  0x00BA8008,
+	HWPfPermonBKCntLoVf                   =  0x00BA8010,
+	HWPfPermonBKCntHiVf                   =  0x00BA8014,
+	HWPfPermonBDeltaCntLoVf               =  0x00BA8020,
+	HWPfPermonBDeltaCntHiVf               =  0x00BA8024,
+	HWPfPermonBVersionReg                 =  0x00BAC000,
+	HWPfPermonBCbControlFec               =  0x00BAC0F0,
+	HWPfPermonBDltTimerLoFec              =  0x00BAC0F4,
+	HWPfPermonBDltTimerHiFec              =  0x00BAC0F8,
+	HWPfPermonBCbCountFec                 =  0x00BAC100,
+	HWPfPermonBAccExecTimerLoFec          =  0x00BAC104,
+	HWPfPermonBAccExecTimerHiFec          =  0x00BAC108,
+	HWPfPermonBExecTimerMinFec            =  0x00BAC200,
+	HWPfPermonBExecTimerMaxFec            =  0x00BAC204,
+	HWPfPermonBControlBusMon              =  0x00BAC400,
+	HWPfPermonBConfigBusMon               =  0x00BAC404,
+	HWPfPermonBSkipCountBusMon            =  0x00BAC408,
+	HWPfPermonBMinLatBusMon               =  0x00BAC40C,
+	HWPfPermonBMaxLatBusMon               =  0x00BAC500,
+	HWPfPermonBTotalLatLowBusMon          =  0x00BAC504,
+	HWPfPermonBTotalLatUpperBusMon        =  0x00BAC508,
+	HWPfPermonBTotalReqCntBusMon          =  0x00BAC50C,
+	HWPfFecUl5gCntrlReg                   =  0x00BC0000,
+	HWPfFecUl5gI2MThreshReg               =  0x00BC0004,
+	HWPfFecUl5gVersionReg                 =  0x00BC0100,
+	HWPfFecUl5gFcwStatusReg               =  0x00BC0104,
+	HWPfFecUl5gWarnReg                    =  0x00BC0108,
+	HwPfFecUl5gIbDebugReg                 =  0x00BC0200,
+	HwPfFecUl5gObLlrDebugReg              =  0x00BC0204,
+	HwPfFecUl5gObHarqDebugReg             =  0x00BC0208,
+	HwPfFecUl5g1CntrlReg                  =  0x00BC1000,
+	HwPfFecUl5g1I2MThreshReg              =  0x00BC1004,
+	HwPfFecUl5g1VersionReg                =  0x00BC1100,
+	HwPfFecUl5g1FcwStatusReg              =  0x00BC1104,
+	HwPfFecUl5g1WarnReg                   =  0x00BC1108,
+	HwPfFecUl5g1IbDebugReg                =  0x00BC1200,
+	HwPfFecUl5g1ObLlrDebugReg             =  0x00BC1204,
+	HwPfFecUl5g1ObHarqDebugReg            =  0x00BC1208,
+	HwPfFecUl5g2CntrlReg                  =  0x00BC2000,
+	HwPfFecUl5g2I2MThreshReg              =  0x00BC2004,
+	HwPfFecUl5g2VersionReg                =  0x00BC2100,
+	HwPfFecUl5g2FcwStatusReg              =  0x00BC2104,
+	HwPfFecUl5g2WarnReg                   =  0x00BC2108,
+	HwPfFecUl5g2IbDebugReg                =  0x00BC2200,
+	HwPfFecUl5g2ObLlrDebugReg             =  0x00BC2204,
+	HwPfFecUl5g2ObHarqDebugReg            =  0x00BC2208,
+	HwPfFecUl5g3CntrlReg                  =  0x00BC3000,
+	HwPfFecUl5g3I2MThreshReg              =  0x00BC3004,
+	HwPfFecUl5g3VersionReg                =  0x00BC3100,
+	HwPfFecUl5g3FcwStatusReg              =  0x00BC3104,
+	HwPfFecUl5g3WarnReg                   =  0x00BC3108,
+	HwPfFecUl5g3IbDebugReg                =  0x00BC3200,
+	HwPfFecUl5g3ObLlrDebugReg             =  0x00BC3204,
+	HwPfFecUl5g3ObHarqDebugReg            =  0x00BC3208,
+	HwPfFecUl5g4CntrlReg                  =  0x00BC4000,
+	HwPfFecUl5g4I2MThreshReg              =  0x00BC4004,
+	HwPfFecUl5g4VersionReg                =  0x00BC4100,
+	HwPfFecUl5g4FcwStatusReg              =  0x00BC4104,
+	HwPfFecUl5g4WarnReg                   =  0x00BC4108,
+	HwPfFecUl5g4IbDebugReg                =  0x00BC4200,
+	HwPfFecUl5g4ObLlrDebugReg             =  0x00BC4204,
+	HwPfFecUl5g4ObHarqDebugReg            =  0x00BC4208,
+	HwPfFecUl5g5CntrlReg                  =  0x00BC5000,
+	HwPfFecUl5g5I2MThreshReg              =  0x00BC5004,
+	HwPfFecUl5g5VersionReg                =  0x00BC5100,
+	HwPfFecUl5g5FcwStatusReg              =  0x00BC5104,
+	HwPfFecUl5g5WarnReg                   =  0x00BC5108,
+	HwPfFecUl5g5IbDebugReg                =  0x00BC5200,
+	HwPfFecUl5g5ObLlrDebugReg             =  0x00BC5204,
+	HwPfFecUl5g5ObHarqDebugReg            =  0x00BC5208,
+	HwPfFecUl5g6CntrlReg                  =  0x00BC6000,
+	HwPfFecUl5g6I2MThreshReg              =  0x00BC6004,
+	HwPfFecUl5g6VersionReg                =  0x00BC6100,
+	HwPfFecUl5g6FcwStatusReg              =  0x00BC6104,
+	HwPfFecUl5g6WarnReg                   =  0x00BC6108,
+	HwPfFecUl5g6IbDebugReg                =  0x00BC6200,
+	HwPfFecUl5g6ObLlrDebugReg             =  0x00BC6204,
+	HwPfFecUl5g6ObHarqDebugReg            =  0x00BC6208,
+	HwPfFecUl5g7CntrlReg                  =  0x00BC7000,
+	HwPfFecUl5g7I2MThreshReg              =  0x00BC7004,
+	HwPfFecUl5g7VersionReg                =  0x00BC7100,
+	HwPfFecUl5g7FcwStatusReg              =  0x00BC7104,
+	HwPfFecUl5g7WarnReg                   =  0x00BC7108,
+	HwPfFecUl5g7IbDebugReg                =  0x00BC7200,
+	HwPfFecUl5g7ObLlrDebugReg             =  0x00BC7204,
+	HwPfFecUl5g7ObHarqDebugReg            =  0x00BC7208,
+	HwPfFecUl5g8CntrlReg                  =  0x00BC8000,
+	HwPfFecUl5g8I2MThreshReg              =  0x00BC8004,
+	HwPfFecUl5g8VersionReg                =  0x00BC8100,
+	HwPfFecUl5g8FcwStatusReg              =  0x00BC8104,
+	HwPfFecUl5g8WarnReg                   =  0x00BC8108,
+	HwPfFecUl5g8IbDebugReg                =  0x00BC8200,
+	HwPfFecUl5g8ObLlrDebugReg             =  0x00BC8204,
+	HwPfFecUl5g8ObHarqDebugReg            =  0x00BC8208,
+	HWPfFecDl5gCntrlReg                   =  0x00BCF000,
+	HWPfFecDl5gI2MThreshReg               =  0x00BCF004,
+	HWPfFecDl5gVersionReg                 =  0x00BCF100,
+	HWPfFecDl5gFcwStatusReg               =  0x00BCF104,
+	HWPfFecDl5gWarnReg                    =  0x00BCF108,
+	HWPfFecUlVersionReg                   =  0x00BD0000,
+	HWPfFecUlControlReg                   =  0x00BD0004,
+	HWPfFecUlStatusReg                    =  0x00BD0008,
+	HWPfFecDlVersionReg                   =  0x00BDF000,
+	HWPfFecDlClusterConfigReg             =  0x00BDF004,
+	HWPfFecDlBurstThres                   =  0x00BDF00C,
+	HWPfFecDlClusterStatusReg0            =  0x00BDF040,
+	HWPfFecDlClusterStatusReg1            =  0x00BDF044,
+	HWPfFecDlClusterStatusReg2            =  0x00BDF048,
+	HWPfFecDlClusterStatusReg3            =  0x00BDF04C,
+	HWPfFecDlClusterStatusReg4            =  0x00BDF050,
+	HWPfFecDlClusterStatusReg5            =  0x00BDF054,
+	HWPfChaFabPllPllrst                   =  0x00C40000,
+	HWPfChaFabPllClk0                     =  0x00C40004,
+	HWPfChaFabPllClk1                     =  0x00C40008,
+	HWPfChaFabPllBwadj                    =  0x00C4000C,
+	HWPfChaFabPllLbw                      =  0x00C40010,
+	HWPfChaFabPllResetq                   =  0x00C40014,
+	HWPfChaFabPllPhshft0                  =  0x00C40018,
+	HWPfChaFabPllPhshft1                  =  0x00C4001C,
+	HWPfChaFabPllDivq0                    =  0x00C40020,
+	HWPfChaFabPllDivq1                    =  0x00C40024,
+	HWPfChaFabPllDivq2                    =  0x00C40028,
+	HWPfChaFabPllDivq3                    =  0x00C4002C,
+	HWPfChaFabPllDivq4                    =  0x00C40030,
+	HWPfChaFabPllDivq5                    =  0x00C40034,
+	HWPfChaFabPllDivq6                    =  0x00C40038,
+	HWPfChaFabPllDivq7                    =  0x00C4003C,
+	HWPfChaDl5gPllPllrst                  =  0x00C40080,
+	HWPfChaDl5gPllClk0                    =  0x00C40084,
+	HWPfChaDl5gPllClk1                    =  0x00C40088,
+	HWPfChaDl5gPllBwadj                   =  0x00C4008C,
+	HWPfChaDl5gPllLbw                     =  0x00C40090,
+	HWPfChaDl5gPllResetq                  =  0x00C40094,
+	HWPfChaDl5gPllPhshft0                 =  0x00C40098,
+	HWPfChaDl5gPllPhshft1                 =  0x00C4009C,
+	HWPfChaDl5gPllDivq0                   =  0x00C400A0,
+	HWPfChaDl5gPllDivq1                   =  0x00C400A4,
+	HWPfChaDl5gPllDivq2                   =  0x00C400A8,
+	HWPfChaDl5gPllDivq3                   =  0x00C400AC,
+	HWPfChaDl5gPllDivq4                   =  0x00C400B0,
+	HWPfChaDl5gPllDivq5                   =  0x00C400B4,
+	HWPfChaDl5gPllDivq6                   =  0x00C400B8,
+	HWPfChaDl5gPllDivq7                   =  0x00C400BC,
+	HWPfChaDl4gPllPllrst                  =  0x00C40100,
+	HWPfChaDl4gPllClk0                    =  0x00C40104,
+	HWPfChaDl4gPllClk1                    =  0x00C40108,
+	HWPfChaDl4gPllBwadj                   =  0x00C4010C,
+	HWPfChaDl4gPllLbw                     =  0x00C40110,
+	HWPfChaDl4gPllResetq                  =  0x00C40114,
+	HWPfChaDl4gPllPhshft0                 =  0x00C40118,
+	HWPfChaDl4gPllPhshft1                 =  0x00C4011C,
+	HWPfChaDl4gPllDivq0                   =  0x00C40120,
+	HWPfChaDl4gPllDivq1                   =  0x00C40124,
+	HWPfChaDl4gPllDivq2                   =  0x00C40128,
+	HWPfChaDl4gPllDivq3                   =  0x00C4012C,
+	HWPfChaDl4gPllDivq4                   =  0x00C40130,
+	HWPfChaDl4gPllDivq5                   =  0x00C40134,
+	HWPfChaDl4gPllDivq6                   =  0x00C40138,
+	HWPfChaDl4gPllDivq7                   =  0x00C4013C,
+	HWPfChaUl5gPllPllrst                  =  0x00C40180,
+	HWPfChaUl5gPllClk0                    =  0x00C40184,
+	HWPfChaUl5gPllClk1                    =  0x00C40188,
+	HWPfChaUl5gPllBwadj                   =  0x00C4018C,
+	HWPfChaUl5gPllLbw                     =  0x00C40190,
+	HWPfChaUl5gPllResetq                  =  0x00C40194,
+	HWPfChaUl5gPllPhshft0                 =  0x00C40198,
+	HWPfChaUl5gPllPhshft1                 =  0x00C4019C,
+	HWPfChaUl5gPllDivq0                   =  0x00C401A0,
+	HWPfChaUl5gPllDivq1                   =  0x00C401A4,
+	HWPfChaUl5gPllDivq2                   =  0x00C401A8,
+	HWPfChaUl5gPllDivq3                   =  0x00C401AC,
+	HWPfChaUl5gPllDivq4                   =  0x00C401B0,
+	HWPfChaUl5gPllDivq5                   =  0x00C401B4,
+	HWPfChaUl5gPllDivq6                   =  0x00C401B8,
+	HWPfChaUl5gPllDivq7                   =  0x00C401BC,
+	HWPfChaUl4gPllPllrst                  =  0x00C40200,
+	HWPfChaUl4gPllClk0                    =  0x00C40204,
+	HWPfChaUl4gPllClk1                    =  0x00C40208,
+	HWPfChaUl4gPllBwadj                   =  0x00C4020C,
+	HWPfChaUl4gPllLbw                     =  0x00C40210,
+	HWPfChaUl4gPllResetq                  =  0x00C40214,
+	HWPfChaUl4gPllPhshft0                 =  0x00C40218,
+	HWPfChaUl4gPllPhshft1                 =  0x00C4021C,
+	HWPfChaUl4gPllDivq0                   =  0x00C40220,
+	HWPfChaUl4gPllDivq1                   =  0x00C40224,
+	HWPfChaUl4gPllDivq2                   =  0x00C40228,
+	HWPfChaUl4gPllDivq3                   =  0x00C4022C,
+	HWPfChaUl4gPllDivq4                   =  0x00C40230,
+	HWPfChaUl4gPllDivq5                   =  0x00C40234,
+	HWPfChaUl4gPllDivq6                   =  0x00C40238,
+	HWPfChaUl4gPllDivq7                   =  0x00C4023C,
+	HWPfChaDdrPllPllrst                   =  0x00C40280,
+	HWPfChaDdrPllClk0                     =  0x00C40284,
+	HWPfChaDdrPllClk1                     =  0x00C40288,
+	HWPfChaDdrPllBwadj                    =  0x00C4028C,
+	HWPfChaDdrPllLbw                      =  0x00C40290,
+	HWPfChaDdrPllResetq                   =  0x00C40294,
+	HWPfChaDdrPllPhshft0                  =  0x00C40298,
+	HWPfChaDdrPllPhshft1                  =  0x00C4029C,
+	HWPfChaDdrPllDivq0                    =  0x00C402A0,
+	HWPfChaDdrPllDivq1                    =  0x00C402A4,
+	HWPfChaDdrPllDivq2                    =  0x00C402A8,
+	HWPfChaDdrPllDivq3                    =  0x00C402AC,
+	HWPfChaDdrPllDivq4                    =  0x00C402B0,
+	HWPfChaDdrPllDivq5                    =  0x00C402B4,
+	HWPfChaDdrPllDivq6                    =  0x00C402B8,
+	HWPfChaDdrPllDivq7                    =  0x00C402BC,
+	HWPfChaErrStatus                      =  0x00C40400,
+	HWPfChaErrMask                        =  0x00C40404,
+	HWPfChaDebugPcieMsiFifo               =  0x00C40410,
+	HWPfChaDebugDdrMsiFifo                =  0x00C40414,
+	HWPfChaDebugMiscMsiFifo               =  0x00C40418,
+	HWPfChaPwmSet                         =  0x00C40420,
+	HWPfChaDdrRstStatus                   =  0x00C40430,
+	HWPfChaDdrStDoneStatus                =  0x00C40434,
+	HWPfChaDdrWbRstCfg                    =  0x00C40438,
+	HWPfChaDdrApbRstCfg                   =  0x00C4043C,
+	HWPfChaDdrPhyRstCfg                   =  0x00C40440,
+	HWPfChaDdrCpuRstCfg                   =  0x00C40444,
+	HWPfChaDdrSifRstCfg                   =  0x00C40448,
+	HWPfChaPadcfgPcomp0                   =  0x00C41000,
+	HWPfChaPadcfgNcomp0                   =  0x00C41004,
+	HWPfChaPadcfgOdt0                     =  0x00C41008,
+	HWPfChaPadcfgProtect0                 =  0x00C4100C,
+	HWPfChaPreemphasisProtect0            =  0x00C41010,
+	HWPfChaPreemphasisCompen0             =  0x00C41040,
+	HWPfChaPreemphasisOdten0              =  0x00C41044,
+	HWPfChaPadcfgPcomp1                   =  0x00C41100,
+	HWPfChaPadcfgNcomp1                   =  0x00C41104,
+	HWPfChaPadcfgOdt1                     =  0x00C41108,
+	HWPfChaPadcfgProtect1                 =  0x00C4110C,
+	HWPfChaPreemphasisProtect1            =  0x00C41110,
+	HWPfChaPreemphasisCompen1             =  0x00C41140,
+	HWPfChaPreemphasisOdten1              =  0x00C41144,
+	HWPfChaPadcfgPcomp2                   =  0x00C41200,
+	HWPfChaPadcfgNcomp2                   =  0x00C41204,
+	HWPfChaPadcfgOdt2                     =  0x00C41208,
+	HWPfChaPadcfgProtect2                 =  0x00C4120C,
+	HWPfChaPreemphasisProtect2            =  0x00C41210,
+	HWPfChaPreemphasisCompen2             =  0x00C41240,
+	HWPfChaPreemphasisOdten4              =  0x00C41444,
+	HWPfChaPreemphasisOdten2              =  0x00C41244,
+	HWPfChaPadcfgPcomp3                   =  0x00C41300,
+	HWPfChaPadcfgNcomp3                   =  0x00C41304,
+	HWPfChaPadcfgOdt3                     =  0x00C41308,
+	HWPfChaPadcfgProtect3                 =  0x00C4130C,
+	HWPfChaPreemphasisProtect3            =  0x00C41310,
+	HWPfChaPreemphasisCompen3             =  0x00C41340,
+	HWPfChaPreemphasisOdten3              =  0x00C41344,
+	HWPfChaPadcfgPcomp4                   =  0x00C41400,
+	HWPfChaPadcfgNcomp4                   =  0x00C41404,
+	HWPfChaPadcfgOdt4                     =  0x00C41408,
+	HWPfChaPadcfgProtect4                 =  0x00C4140C,
+	HWPfChaPreemphasisProtect4            =  0x00C41410,
+	HWPfChaPreemphasisCompen4             =  0x00C41440,
+	HWPfHiVfToPfDbellVf                   =  0x00C80000,
+	HWPfHiPfToVfDbellVf                   =  0x00C80008,
+	HWPfHiInfoRingBaseLoVf                =  0x00C80010,
+	HWPfHiInfoRingBaseHiVf                =  0x00C80014,
+	HWPfHiInfoRingPointerVf               =  0x00C80018,
+	HWPfHiInfoRingIntWrEnVf               =  0x00C80020,
+	HWPfHiInfoRingPf2VfWrEnVf             =  0x00C80024,
+	HWPfHiMsixVectorMapperVf              =  0x00C80060,
+	HWPfHiModuleVersionReg                =  0x00C84000,
+	HWPfHiIosf2axiErrLogReg               =  0x00C84004,
+	HWPfHiHardResetReg                    =  0x00C84008,
+	HWPfHi5GHardResetReg                  =  0x00C8400C,
+	HWPfHiInfoRingBaseLoRegPf             =  0x00C84010,
+	HWPfHiInfoRingBaseHiRegPf             =  0x00C84014,
+	HWPfHiInfoRingPointerRegPf            =  0x00C84018,
+	HWPfHiInfoRingIntWrEnRegPf            =  0x00C84020,
+	HWPfHiInfoRingVf2pfLoWrEnReg          =  0x00C84024,
+	HWPfHiInfoRingVf2pfHiWrEnReg          =  0x00C84028,
+	HWPfHiLogParityErrStatusReg           =  0x00C8402C,
+	HWPfHiLogDataParityErrorVfStatusLo    =  0x00C84030,
+	HWPfHiLogDataParityErrorVfStatusHi    =  0x00C84034,
+	HWPfHiBlockTransmitOnErrorEn          =  0x00C84038,
+	HWPfHiCfgMsiIntWrEnRegPf              =  0x00C84040,
+	HWPfHiCfgMsiVf2pfLoWrEnReg            =  0x00C84044,
+	HWPfHiCfgMsiVf2pfHighWrEnReg          =  0x00C84048,
+	HWPfHiMsixVectorMapperPf              =  0x00C84060,
+	HWPfHiApbWrWaitTime                   =  0x00C84100,
+	HWPfHiXCounterMaxValue                =  0x00C84104,
+	HWPfHiPfMode                          =  0x00C84108,
+	HWPfHiClkGateHystReg                  =  0x00C8410C,
+	HWPfHiSnoopBitsReg                    =  0x00C84110,
+	HWPfHiMsiDropEnableReg                =  0x00C84114,
+	HWPfHiMsiStatReg                      =  0x00C84120,
+	HWPfHiFifoOflStatReg                  =  0x00C84124,
+	HWPfHiHiDebugReg                      =  0x00C841F4,
+	HWPfHiDebugMemSnoopMsiFifo            =  0x00C841F8,
+	HWPfHiDebugMemSnoopInputFifo          =  0x00C841FC,
+	HWPfHiMsixMappingConfig               =  0x00C84200,
+	HWPfHiJunkReg                         =  0x00C8FF00,
+	HWPfDdrUmmcVer                        =  0x00D00000,
+	HWPfDdrUmmcCap                        =  0x00D00010,
+	HWPfDdrUmmcCtrl                       =  0x00D00020,
+	HWPfDdrMpcPe                          =  0x00D00080,
+	HWPfDdrMpcPpri3                       =  0x00D00090,
+	HWPfDdrMpcPpri2                       =  0x00D000A0,
+	HWPfDdrMpcPpri1                       =  0x00D000B0,
+	HWPfDdrMpcPpri0                       =  0x00D000C0,
+	HWPfDdrMpcPrwgrpCtrl                  =  0x00D000D0,
+	HWPfDdrMpcPbw7                        =  0x00D000E0,
+	HWPfDdrMpcPbw6                        =  0x00D000F0,
+	HWPfDdrMpcPbw5                        =  0x00D00100,
+	HWPfDdrMpcPbw4                        =  0x00D00110,
+	HWPfDdrMpcPbw3                        =  0x00D00120,
+	HWPfDdrMpcPbw2                        =  0x00D00130,
+	HWPfDdrMpcPbw1                        =  0x00D00140,
+	HWPfDdrMpcPbw0                        =  0x00D00150,
+	HWPfDdrMemoryInit                     =  0x00D00200,
+	HWPfDdrMemoryInitDone                 =  0x00D00210,
+	HWPfDdrMemInitPhyTrng0                =  0x00D00240,
+	HWPfDdrMemInitPhyTrng1                =  0x00D00250,
+	HWPfDdrMemInitPhyTrng2                =  0x00D00260,
+	HWPfDdrMemInitPhyTrng3                =  0x00D00270,
+	HWPfDdrBcDram                         =  0x00D003C0,
+	HWPfDdrBcAddrMap                      =  0x00D003D0,
+	HWPfDdrBcRef                          =  0x00D003E0,
+	HWPfDdrBcTim0                         =  0x00D00400,
+	HWPfDdrBcTim1                         =  0x00D00410,
+	HWPfDdrBcTim2                         =  0x00D00420,
+	HWPfDdrBcTim3                         =  0x00D00430,
+	HWPfDdrBcTim4                         =  0x00D00440,
+	HWPfDdrBcTim5                         =  0x00D00450,
+	HWPfDdrBcTim6                         =  0x00D00460,
+	HWPfDdrBcTim7                         =  0x00D00470,
+	HWPfDdrBcTim8                         =  0x00D00480,
+	HWPfDdrBcTim9                         =  0x00D00490,
+	HWPfDdrBcTim10                        =  0x00D004A0,
+	HWPfDdrBcTim12                        =  0x00D004C0,
+	HWPfDdrDfiInit                        =  0x00D004D0,
+	HWPfDdrDfiInitComplete                =  0x00D004E0,
+	HWPfDdrDfiTim0                        =  0x00D004F0,
+	HWPfDdrDfiTim1                        =  0x00D00500,
+	HWPfDdrDfiPhyUpdEn                    =  0x00D00530,
+	HWPfDdrMemStatus                      =  0x00D00540,
+	HWPfDdrUmmcErrStatus                  =  0x00D00550,
+	HWPfDdrUmmcIntStatus                  =  0x00D00560,
+	HWPfDdrUmmcIntEn                      =  0x00D00570,
+	HWPfDdrPhyRdLatency                   =  0x00D48400,
+	HWPfDdrPhyRdLatencyDbi                =  0x00D48410,
+	HWPfDdrPhyWrLatency                   =  0x00D48420,
+	HWPfDdrPhyTrngType                    =  0x00D48430,
+	HWPfDdrPhyMrsTiming2                  =  0x00D48440,
+	HWPfDdrPhyMrsTiming0                  =  0x00D48450,
+	HWPfDdrPhyMrsTiming1                  =  0x00D48460,
+	HWPfDdrPhyDramTmrd                    =  0x00D48470,
+	HWPfDdrPhyDramTmod                    =  0x00D48480,
+	HWPfDdrPhyDramTwpre                   =  0x00D48490,
+	HWPfDdrPhyDramTrfc                    =  0x00D484A0,
+	HWPfDdrPhyDramTrwtp                   =  0x00D484B0,
+	HWPfDdrPhyMr01Dimm                    =  0x00D484C0,
+	HWPfDdrPhyMr01DimmDbi                 =  0x00D484D0,
+	HWPfDdrPhyMr23Dimm                    =  0x00D484E0,
+	HWPfDdrPhyMr45Dimm                    =  0x00D484F0,
+	HWPfDdrPhyMr67Dimm                    =  0x00D48500,
+	HWPfDdrPhyWrlvlWwRdlvlRr              =  0x00D48510,
+	HWPfDdrPhyOdtEn                       =  0x00D48520,
+	HWPfDdrPhyFastTrng                    =  0x00D48530,
+	HWPfDdrPhyDynTrngGap                  =  0x00D48540,
+	HWPfDdrPhyDynRcalGap                  =  0x00D48550,
+	HWPfDdrPhyIdletimeout                 =  0x00D48560,
+	HWPfDdrPhyRstCkeGap                   =  0x00D48570,
+	HWPfDdrPhyCkeMrsGap                   =  0x00D48580,
+	HWPfDdrPhyMemVrefMidVal               =  0x00D48590,
+	HWPfDdrPhyVrefStep                    =  0x00D485A0,
+	HWPfDdrPhyVrefThreshold               =  0x00D485B0,
+	HWPfDdrPhyPhyVrefMidVal               =  0x00D485C0,
+	HWPfDdrPhyDqsCountMax                 =  0x00D485D0,
+	HWPfDdrPhyDqsCountNum                 =  0x00D485E0,
+	HWPfDdrPhyDramRow                     =  0x00D485F0,
+	HWPfDdrPhyDramCol                     =  0x00D48600,
+	HWPfDdrPhyDramBgBa                    =  0x00D48610,
+	HWPfDdrPhyDynamicUpdreqrel            =  0x00D48620,
+	HWPfDdrPhyVrefLimits                  =  0x00D48630,
+	HWPfDdrPhyIdtmTcStatus                =  0x00D6C020,
+	HWPfDdrPhyIdtmFwVersion               =  0x00D6C410,
+	HWPfDdrPhyRdlvlGateInitDelay          =  0x00D70000,
+	HWPfDdrPhyRdenSmplabc                 =  0x00D70008,
+	HWPfDdrPhyVrefNibble0                 =  0x00D7000C,
+	HWPfDdrPhyVrefNibble1                 =  0x00D70010,
+	HWPfDdrPhyRdlvlGateDqsSmpl0           =  0x00D70014,
+	HWPfDdrPhyRdlvlGateDqsSmpl1           =  0x00D70018,
+	HWPfDdrPhyRdlvlGateDqsSmpl2           =  0x00D7001C,
+	HWPfDdrPhyDqsCount                    =  0x00D70020,
+	HWPfDdrPhyWrlvlRdlvlGateStatus        =  0x00D70024,
+	HWPfDdrPhyErrorFlags                  =  0x00D70028,
+	HWPfDdrPhyPowerDown                   =  0x00D70030,
+	HWPfDdrPhyPrbsSeedByte0               =  0x00D70034,
+	HWPfDdrPhyPrbsSeedByte1               =  0x00D70038,
+	HWPfDdrPhyPcompDq                     =  0x00D70040,
+	HWPfDdrPhyNcompDq                     =  0x00D70044,
+	HWPfDdrPhyPcompDqs                    =  0x00D70048,
+	HWPfDdrPhyNcompDqs                    =  0x00D7004C,
+	HWPfDdrPhyPcompCmd                    =  0x00D70050,
+	HWPfDdrPhyNcompCmd                    =  0x00D70054,
+	HWPfDdrPhyPcompCk                     =  0x00D70058,
+	HWPfDdrPhyNcompCk                     =  0x00D7005C,
+	HWPfDdrPhyRcalOdtDq                   =  0x00D70060,
+	HWPfDdrPhyRcalOdtDqs                  =  0x00D70064,
+	HWPfDdrPhyRcalMask1                   =  0x00D70068,
+	HWPfDdrPhyRcalMask2                   =  0x00D7006C,
+	HWPfDdrPhyRcalCtrl                    =  0x00D70070,
+	HWPfDdrPhyRcalCnt                     =  0x00D70074,
+	HWPfDdrPhyRcalOverride                =  0x00D70078,
+	HWPfDdrPhyRcalGateen                  =  0x00D7007C,
+	HWPfDdrPhyCtrl                        =  0x00D70080,
+	HWPfDdrPhyWrlvlAlg                    =  0x00D70084,
+	HWPfDdrPhyRcalVreftTxcmdOdt           =  0x00D70088,
+	HWPfDdrPhyRdlvlGateParam              =  0x00D7008C,
+	HWPfDdrPhyRdlvlGateParam2             =  0x00D70090,
+	HWPfDdrPhyRcalVreftTxdata             =  0x00D70094,
+	HWPfDdrPhyCmdIntDelay                 =  0x00D700A4,
+	HWPfDdrPhyAlertN                      =  0x00D700A8,
+	HWPfDdrPhyTrngReqWpre2tck             =  0x00D700AC,
+	HWPfDdrPhyCmdPhaseSel                 =  0x00D700B4,
+	HWPfDdrPhyCmdDcdl                     =  0x00D700B8,
+	HWPfDdrPhyCkDcdl                      =  0x00D700BC,
+	HWPfDdrPhySwTrngCtrl1                 =  0x00D700C0,
+	HWPfDdrPhySwTrngCtrl2                 =  0x00D700C4,
+	HWPfDdrPhyRcalPcompRden               =  0x00D700C8,
+	HWPfDdrPhyRcalNcompRden               =  0x00D700CC,
+	HWPfDdrPhyRcalCompen                  =  0x00D700D0,
+	HWPfDdrPhySwTrngRdqs                  =  0x00D700D4,
+	HWPfDdrPhySwTrngWdqs                  =  0x00D700D8,
+	HWPfDdrPhySwTrngRdena                 =  0x00D700DC,
+	HWPfDdrPhySwTrngRdenb                 =  0x00D700E0,
+	HWPfDdrPhySwTrngRdenc                 =  0x00D700E4,
+	HWPfDdrPhySwTrngWdq                   =  0x00D700E8,
+	HWPfDdrPhySwTrngRdq                   =  0x00D700EC,
+	HWPfDdrPhyPcfgHmValue                 =  0x00D700F0,
+	HWPfDdrPhyPcfgTimerValue              =  0x00D700F4,
+	HWPfDdrPhyPcfgSoftwareTraining        =  0x00D700F8,
+	HWPfDdrPhyPcfgMcStatus                =  0x00D700FC,
+	HWPfDdrPhyWrlvlPhRank0                =  0x00D70100,
+	HWPfDdrPhyRdenPhRank0                 =  0x00D70104,
+	HWPfDdrPhyRdenIntRank0                =  0x00D70108,
+	HWPfDdrPhyRdqsDcdlRank0               =  0x00D7010C,
+	HWPfDdrPhyRdqsShadowDcdlRank0         =  0x00D70110,
+	HWPfDdrPhyWdqsDcdlRank0               =  0x00D70114,
+	HWPfDdrPhyWdmDcdlShadowRank0          =  0x00D70118,
+	HWPfDdrPhyWdmDcdlRank0                =  0x00D7011C,
+	HWPfDdrPhyDbiDcdlRank0                =  0x00D70120,
+	HWPfDdrPhyRdenDcdlaRank0              =  0x00D70124,
+	HWPfDdrPhyDbiDcdlShadowRank0          =  0x00D70128,
+	HWPfDdrPhyRdenDcdlbRank0              =  0x00D7012C,
+	HWPfDdrPhyWdqsShadowDcdlRank0         =  0x00D70130,
+	HWPfDdrPhyRdenDcdlcRank0              =  0x00D70134,
+	HWPfDdrPhyRdenShadowDcdlaRank0        =  0x00D70138,
+	HWPfDdrPhyWrlvlIntRank0               =  0x00D7013C,
+	HWPfDdrPhyRdqDcdlBit0Rank0            =  0x00D70200,
+	HWPfDdrPhyRdqDcdlShadowBit0Rank0      =  0x00D70204,
+	HWPfDdrPhyWdqDcdlBit0Rank0            =  0x00D70208,
+	HWPfDdrPhyWdqDcdlShadowBit0Rank0      =  0x00D7020C,
+	HWPfDdrPhyRdqDcdlBit1Rank0            =  0x00D70240,
+	HWPfDdrPhyRdqDcdlShadowBit1Rank0      =  0x00D70244,
+	HWPfDdrPhyWdqDcdlBit1Rank0            =  0x00D70248,
+	HWPfDdrPhyWdqDcdlShadowBit1Rank0      =  0x00D7024C,
+	HWPfDdrPhyRdqDcdlBit2Rank0            =  0x00D70280,
+	HWPfDdrPhyRdqDcdlShadowBit2Rank0      =  0x00D70284,
+	HWPfDdrPhyWdqDcdlBit2Rank0            =  0x00D70288,
+	HWPfDdrPhyWdqDcdlShadowBit2Rank0      =  0x00D7028C,
+	HWPfDdrPhyRdqDcdlBit3Rank0            =  0x00D702C0,
+	HWPfDdrPhyRdqDcdlShadowBit3Rank0      =  0x00D702C4,
+	HWPfDdrPhyWdqDcdlBit3Rank0            =  0x00D702C8,
+	HWPfDdrPhyWdqDcdlShadowBit3Rank0      =  0x00D702CC,
+	HWPfDdrPhyRdqDcdlBit4Rank0            =  0x00D70300,
+	HWPfDdrPhyRdqDcdlShadowBit4Rank0      =  0x00D70304,
+	HWPfDdrPhyWdqDcdlBit4Rank0            =  0x00D70308,
+	HWPfDdrPhyWdqDcdlShadowBit4Rank0      =  0x00D7030C,
+	HWPfDdrPhyRdqDcdlBit5Rank0            =  0x00D70340,
+	HWPfDdrPhyRdqDcdlShadowBit5Rank0      =  0x00D70344,
+	HWPfDdrPhyWdqDcdlBit5Rank0            =  0x00D70348,
+	HWPfDdrPhyWdqDcdlShadowBit5Rank0      =  0x00D7034C,
+	HWPfDdrPhyRdqDcdlBit6Rank0            =  0x00D70380,
+	HWPfDdrPhyRdqDcdlShadowBit6Rank0      =  0x00D70384,
+	HWPfDdrPhyWdqDcdlBit6Rank0            =  0x00D70388,
+	HWPfDdrPhyWdqDcdlShadowBit6Rank0      =  0x00D7038C,
+	HWPfDdrPhyRdqDcdlBit7Rank0            =  0x00D703C0,
+	HWPfDdrPhyRdqDcdlShadowBit7Rank0      =  0x00D703C4,
+	HWPfDdrPhyWdqDcdlBit7Rank0            =  0x00D703C8,
+	HWPfDdrPhyWdqDcdlShadowBit7Rank0      =  0x00D703CC,
+	HWPfDdrPhyIdtmStatus                  =  0x00D740D0,
+	HWPfDdrPhyIdtmError                   =  0x00D74110,
+	HWPfDdrPhyIdtmDebug                   =  0x00D74120,
+	HWPfDdrPhyIdtmDebugInt                =  0x00D74130,
+	HwPfPcieLnAsicCfgovr                  =  0x00D80000,
+	HwPfPcieLnAclkmixer                   =  0x00D80004,
+	HwPfPcieLnTxrampfreq                  =  0x00D80008,
+	HwPfPcieLnLanetest                    =  0x00D8000C,
+	HwPfPcieLnDcctrl                      =  0x00D80010,
+	HwPfPcieLnDccmeas                     =  0x00D80014,
+	HwPfPcieLnDccovrAclk                  =  0x00D80018,
+	HwPfPcieLnDccovrTxa                   =  0x00D8001C,
+	HwPfPcieLnDccovrTxk                   =  0x00D80020,
+	HwPfPcieLnDccovrDclk                  =  0x00D80024,
+	HwPfPcieLnDccovrEclk                  =  0x00D80028,
+	HwPfPcieLnDcctrimAclk                 =  0x00D8002C,
+	HwPfPcieLnDcctrimTx                   =  0x00D80030,
+	HwPfPcieLnDcctrimDclk                 =  0x00D80034,
+	HwPfPcieLnDcctrimEclk                 =  0x00D80038,
+	HwPfPcieLnQuadCtrl                    =  0x00D8003C,
+	HwPfPcieLnQuadCorrIndex               =  0x00D80040,
+	HwPfPcieLnQuadCorrStatus              =  0x00D80044,
+	HwPfPcieLnAsicRxovr1                  =  0x00D80048,
+	HwPfPcieLnAsicRxovr2                  =  0x00D8004C,
+	HwPfPcieLnAsicEqinfovr                =  0x00D80050,
+	HwPfPcieLnRxcsr                       =  0x00D80054,
+	HwPfPcieLnRxfectrl                    =  0x00D80058,
+	HwPfPcieLnRxtest                      =  0x00D8005C,
+	HwPfPcieLnEscount                     =  0x00D80060,
+	HwPfPcieLnCdrctrl                     =  0x00D80064,
+	HwPfPcieLnCdrctrl2                    =  0x00D80068,
+	HwPfPcieLnCdrcfg0Ctrl0                =  0x00D8006C,
+	HwPfPcieLnCdrcfg0Ctrl1                =  0x00D80070,
+	HwPfPcieLnCdrcfg0Ctrl2                =  0x00D80074,
+	HwPfPcieLnCdrcfg1Ctrl0                =  0x00D80078,
+	HwPfPcieLnCdrcfg1Ctrl1                =  0x00D8007C,
+	HwPfPcieLnCdrcfg1Ctrl2                =  0x00D80080,
+	HwPfPcieLnCdrcfg2Ctrl0                =  0x00D80084,
+	HwPfPcieLnCdrcfg2Ctrl1                =  0x00D80088,
+	HwPfPcieLnCdrcfg2Ctrl2                =  0x00D8008C,
+	HwPfPcieLnCdrcfg3Ctrl0                =  0x00D80090,
+	HwPfPcieLnCdrcfg3Ctrl1                =  0x00D80094,
+	HwPfPcieLnCdrcfg3Ctrl2                =  0x00D80098,
+	HwPfPcieLnCdrphase                    =  0x00D8009C,
+	HwPfPcieLnCdrfreq                     =  0x00D800A0,
+	HwPfPcieLnCdrstatusPhase              =  0x00D800A4,
+	HwPfPcieLnCdrstatusFreq               =  0x00D800A8,
+	HwPfPcieLnCdroffset                   =  0x00D800AC,
+	HwPfPcieLnRxvosctl                    =  0x00D800B0,
+	HwPfPcieLnRxvosctl2                   =  0x00D800B4,
+	HwPfPcieLnRxlosctl                    =  0x00D800B8,
+	HwPfPcieLnRxlos                       =  0x00D800BC,
+	HwPfPcieLnRxlosvval                   =  0x00D800C0,
+	HwPfPcieLnRxvosd0                     =  0x00D800C4,
+	HwPfPcieLnRxvosd1                     =  0x00D800C8,
+	HwPfPcieLnRxvosep0                    =  0x00D800CC,
+	HwPfPcieLnRxvosep1                    =  0x00D800D0,
+	HwPfPcieLnRxvosen0                    =  0x00D800D4,
+	HwPfPcieLnRxvosen1                    =  0x00D800D8,
+	HwPfPcieLnRxvosafe                    =  0x00D800DC,
+	HwPfPcieLnRxvosa0                     =  0x00D800E0,
+	HwPfPcieLnRxvosa0Out                  =  0x00D800E4,
+	HwPfPcieLnRxvosa1                     =  0x00D800E8,
+	HwPfPcieLnRxvosa1Out                  =  0x00D800EC,
+	HwPfPcieLnRxmisc                      =  0x00D800F0,
+	HwPfPcieLnRxbeacon                    =  0x00D800F4,
+	HwPfPcieLnRxdssout                    =  0x00D800F8,
+	HwPfPcieLnRxdssout2                   =  0x00D800FC,
+	HwPfPcieLnAlphapctrl                  =  0x00D80100,
+	HwPfPcieLnAlphanctrl                  =  0x00D80104,
+	HwPfPcieLnAdaptctrl                   =  0x00D80108,
+	HwPfPcieLnAdaptctrl1                  =  0x00D8010C,
+	HwPfPcieLnAdaptstatus                 =  0x00D80110,
+	HwPfPcieLnAdaptvga1                   =  0x00D80114,
+	HwPfPcieLnAdaptvga2                   =  0x00D80118,
+	HwPfPcieLnAdaptvga3                   =  0x00D8011C,
+	HwPfPcieLnAdaptvga4                   =  0x00D80120,
+	HwPfPcieLnAdaptboost1                 =  0x00D80124,
+	HwPfPcieLnAdaptboost2                 =  0x00D80128,
+	HwPfPcieLnAdaptboost3                 =  0x00D8012C,
+	HwPfPcieLnAdaptboost4                 =  0x00D80130,
+	HwPfPcieLnAdaptsslms1                 =  0x00D80134,
+	HwPfPcieLnAdaptsslms2                 =  0x00D80138,
+	HwPfPcieLnAdaptvgaStatus              =  0x00D8013C,
+	HwPfPcieLnAdaptboostStatus            =  0x00D80140,
+	HwPfPcieLnAdaptsslmsStatus1           =  0x00D80144,
+	HwPfPcieLnAdaptsslmsStatus2           =  0x00D80148,
+	HwPfPcieLnAfectrl1                    =  0x00D8014C,
+	HwPfPcieLnAfectrl2                    =  0x00D80150,
+	HwPfPcieLnAfectrl3                    =  0x00D80154,
+	HwPfPcieLnAfedefault1                 =  0x00D80158,
+	HwPfPcieLnAfedefault2                 =  0x00D8015C,
+	HwPfPcieLnDfectrl1                    =  0x00D80160,
+	HwPfPcieLnDfectrl2                    =  0x00D80164,
+	HwPfPcieLnDfectrl3                    =  0x00D80168,
+	HwPfPcieLnDfectrl4                    =  0x00D8016C,
+	HwPfPcieLnDfectrl5                    =  0x00D80170,
+	HwPfPcieLnDfectrl6                    =  0x00D80174,
+	HwPfPcieLnAfestatus1                  =  0x00D80178,
+	HwPfPcieLnAfestatus2                  =  0x00D8017C,
+	HwPfPcieLnDfestatus1                  =  0x00D80180,
+	HwPfPcieLnDfestatus2                  =  0x00D80184,
+	HwPfPcieLnDfestatus3                  =  0x00D80188,
+	HwPfPcieLnDfestatus4                  =  0x00D8018C,
+	HwPfPcieLnDfestatus5                  =  0x00D80190,
+	HwPfPcieLnAlphastatus                 =  0x00D80194,
+	HwPfPcieLnFomctrl1                    =  0x00D80198,
+	HwPfPcieLnFomctrl2                    =  0x00D8019C,
+	HwPfPcieLnFomctrl3                    =  0x00D801A0,
+	HwPfPcieLnAclkcalStatus               =  0x00D801A4,
+	HwPfPcieLnOffscorrStatus              =  0x00D801A8,
+	HwPfPcieLnEyewidthStatus              =  0x00D801AC,
+	HwPfPcieLnEyeheightStatus             =  0x00D801B0,
+	HwPfPcieLnAsicTxovr1                  =  0x00D801B4,
+	HwPfPcieLnAsicTxovr2                  =  0x00D801B8,
+	HwPfPcieLnAsicTxovr3                  =  0x00D801BC,
+	HwPfPcieLnTxbiasadjOvr                =  0x00D801C0,
+	HwPfPcieLnTxcsr                       =  0x00D801C4,
+	HwPfPcieLnTxtest                      =  0x00D801C8,
+	HwPfPcieLnTxtestword                  =  0x00D801CC,
+	HwPfPcieLnTxtestwordHigh              =  0x00D801D0,
+	HwPfPcieLnTxdrive                     =  0x00D801D4,
+	HwPfPcieLnMtcsLn                      =  0x00D801D8,
+	HwPfPcieLnStatsumLn                   =  0x00D801DC,
+	HwPfPcieLnRcbusScratch                =  0x00D801E0,
+	HwPfPcieLnRcbusMinorrev               =  0x00D801F0,
+	HwPfPcieLnRcbusMajorrev               =  0x00D801F4,
+	HwPfPcieLnRcbusBlocktype              =  0x00D801F8,
+	HwPfPcieSupPllcsr                     =  0x00D80800,
+	HwPfPcieSupPlldiv                     =  0x00D80804,
+	HwPfPcieSupPllcal                     =  0x00D80808,
+	HwPfPcieSupPllcalsts                  =  0x00D8080C,
+	HwPfPcieSupPllmeas                    =  0x00D80810,
+	HwPfPcieSupPlldactrim                 =  0x00D80814,
+	HwPfPcieSupPllbiastrim                =  0x00D80818,
+	HwPfPcieSupPllbwtrim                  =  0x00D8081C,
+	HwPfPcieSupPllcaldly                  =  0x00D80820,
+	HwPfPcieSupRefclkonpclkctrl           =  0x00D80824,
+	HwPfPcieSupPclkdelay                  =  0x00D80828,
+	HwPfPcieSupPhyconfig                  =  0x00D8082C,
+	HwPfPcieSupRcalIntf                   =  0x00D80830,
+	HwPfPcieSupAuxcsr                     =  0x00D80834,
+	HwPfPcieSupVref                       =  0x00D80838,
+	HwPfPcieSupLinkmode                   =  0x00D8083C,
+	HwPfPcieSupRrefcalctl                 =  0x00D80840,
+	HwPfPcieSupRrefcal                    =  0x00D80844,
+	HwPfPcieSupRrefcaldly                 =  0x00D80848,
+	HwPfPcieSupTximpcalctl                =  0x00D8084C,
+	HwPfPcieSupTximpcal                   =  0x00D80850,
+	HwPfPcieSupTximpoffset                =  0x00D80854,
+	HwPfPcieSupTximpcaldly                =  0x00D80858,
+	HwPfPcieSupRximpcalctl                =  0x00D8085C,
+	HwPfPcieSupRximpcal                   =  0x00D80860,
+	HwPfPcieSupRximpoffset                =  0x00D80864,
+	HwPfPcieSupRximpcaldly                =  0x00D80868,
+	HwPfPcieSupFence                      =  0x00D8086C,
+	HwPfPcieSupMtcs                       =  0x00D80870,
+	HwPfPcieSupStatsum                    =  0x00D809B8,
+	HwPfPciePcsDpStatus0                  =  0x00D81000,
+	HwPfPciePcsDpControl0                 =  0x00D81004,
+	HwPfPciePcsPmaStatusLane0             =  0x00D81008,
+	HwPfPciePcsPipeStatusLane0            =  0x00D8100C,
+	HwPfPciePcsTxdeemph0Lane0             =  0x00D81010,
+	HwPfPciePcsTxdeemph1Lane0             =  0x00D81014,
+	HwPfPciePcsInternalStatusLane0        =  0x00D81018,
+	HwPfPciePcsDpStatus1                  =  0x00D8101C,
+	HwPfPciePcsDpControl1                 =  0x00D81020,
+	HwPfPciePcsPmaStatusLane1             =  0x00D81024,
+	HwPfPciePcsPipeStatusLane1            =  0x00D81028,
+	HwPfPciePcsTxdeemph0Lane1             =  0x00D8102C,
+	HwPfPciePcsTxdeemph1Lane1             =  0x00D81030,
+	HwPfPciePcsInternalStatusLane1        =  0x00D81034,
+	HwPfPciePcsDpStatus2                  =  0x00D81038,
+	HwPfPciePcsDpControl2                 =  0x00D8103C,
+	HwPfPciePcsPmaStatusLane2             =  0x00D81040,
+	HwPfPciePcsPipeStatusLane2            =  0x00D81044,
+	HwPfPciePcsTxdeemph0Lane2             =  0x00D81048,
+	HwPfPciePcsTxdeemph1Lane2             =  0x00D8104C,
+	HwPfPciePcsInternalStatusLane2        =  0x00D81050,
+	HwPfPciePcsDpStatus3                  =  0x00D81054,
+	HwPfPciePcsDpControl3                 =  0x00D81058,
+	HwPfPciePcsPmaStatusLane3             =  0x00D8105C,
+	HwPfPciePcsPipeStatusLane3            =  0x00D81060,
+	HwPfPciePcsTxdeemph0Lane3             =  0x00D81064,
+	HwPfPciePcsTxdeemph1Lane3             =  0x00D81068,
+	HwPfPciePcsInternalStatusLane3        =  0x00D8106C,
+	HwPfPciePcsEbStatus0                  =  0x00D81070,
+	HwPfPciePcsEbStatus1                  =  0x00D81074,
+	HwPfPciePcsEbStatus2                  =  0x00D81078,
+	HwPfPciePcsEbStatus3                  =  0x00D8107C,
+	HwPfPciePcsPllSettingPcieG1           =  0x00D81088,
+	HwPfPciePcsPllSettingPcieG2           =  0x00D8108C,
+	HwPfPciePcsPllSettingPcieG3           =  0x00D81090,
+	HwPfPciePcsControl                    =  0x00D81094,
+	HwPfPciePcsEqControl                  =  0x00D81098,
+	HwPfPciePcsEqTimer                    =  0x00D8109C,
+	HwPfPciePcsEqErrStatus                =  0x00D810A0,
+	HwPfPciePcsEqErrCount                 =  0x00D810A4,
+	HwPfPciePcsStatus                     =  0x00D810A8,
+	HwPfPciePcsMiscRegister               =  0x00D810AC,
+	HwPfPciePcsObsControl                 =  0x00D810B0,
+	HwPfPciePcsPrbsCount0                 =  0x00D81200,
+	HwPfPciePcsBistControl0               =  0x00D81204,
+	HwPfPciePcsBistStaticWord00           =  0x00D81208,
+	HwPfPciePcsBistStaticWord10           =  0x00D8120C,
+	HwPfPciePcsBistStaticWord20           =  0x00D81210,
+	HwPfPciePcsBistStaticWord30           =  0x00D81214,
+	HwPfPciePcsPrbsCount1                 =  0x00D81220,
+	HwPfPciePcsBistControl1               =  0x00D81224,
+	HwPfPciePcsBistStaticWord01           =  0x00D81228,
+	HwPfPciePcsBistStaticWord11           =  0x00D8122C,
+	HwPfPciePcsBistStaticWord21           =  0x00D81230,
+	HwPfPciePcsBistStaticWord31           =  0x00D81234,
+	HwPfPciePcsPrbsCount2                 =  0x00D81240,
+	HwPfPciePcsBistControl2               =  0x00D81244,
+	HwPfPciePcsBistStaticWord02           =  0x00D81248,
+	HwPfPciePcsBistStaticWord12           =  0x00D8124C,
+	HwPfPciePcsBistStaticWord22           =  0x00D81250,
+	HwPfPciePcsBistStaticWord32           =  0x00D81254,
+	HwPfPciePcsPrbsCount3                 =  0x00D81260,
+	HwPfPciePcsBistControl3               =  0x00D81264,
+	HwPfPciePcsBistStaticWord03           =  0x00D81268,
+	HwPfPciePcsBistStaticWord13           =  0x00D8126C,
+	HwPfPciePcsBistStaticWord23           =  0x00D81270,
+	HwPfPciePcsBistStaticWord33           =  0x00D81274,
+	HwPfPcieGpexLtssmStateCntrl           =  0x00D90400,
+	HwPfPcieGpexLtssmStateStatus          =  0x00D90404,
+	HwPfPcieGpexSkipFreqTimer             =  0x00D90408,
+	HwPfPcieGpexLaneSelect                =  0x00D9040C,
+	HwPfPcieGpexLaneDeskew                =  0x00D90410,
+	HwPfPcieGpexRxErrorStatus             =  0x00D90414,
+	HwPfPcieGpexLaneNumControl            =  0x00D90418,
+	HwPfPcieGpexNFstControl               =  0x00D9041C,
+	HwPfPcieGpexLinkStatus                =  0x00D90420,
+	HwPfPcieGpexAckReplayTimeout          =  0x00D90438,
+	HwPfPcieGpexSeqNumberStatus           =  0x00D9043C,
+	HwPfPcieGpexCoreClkRatio              =  0x00D90440,
+	HwPfPcieGpexDllTholdControl           =  0x00D90448,
+	HwPfPcieGpexPmTimer                   =  0x00D90450,
+	HwPfPcieGpexPmeTimeout                =  0x00D90454,
+	HwPfPcieGpexAspmL1Timer               =  0x00D90458,
+	HwPfPcieGpexAspmReqTimer              =  0x00D9045C,
+	HwPfPcieGpexAspmL1Dis                 =  0x00D90460,
+	HwPfPcieGpexAdvisoryErrorControl      =  0x00D90468,
+	HwPfPcieGpexId                        =  0x00D90470,
+	HwPfPcieGpexClasscode                 =  0x00D90474,
+	HwPfPcieGpexSubsystemId               =  0x00D90478,
+	HwPfPcieGpexDeviceCapabilities        =  0x00D9047C,
+	HwPfPcieGpexLinkCapabilities          =  0x00D90480,
+	HwPfPcieGpexFunctionNumber            =  0x00D90484,
+	HwPfPcieGpexPmCapabilities            =  0x00D90488,
+	HwPfPcieGpexFunctionSelect            =  0x00D9048C,
+	HwPfPcieGpexErrorCounter              =  0x00D904AC,
+	HwPfPcieGpexConfigReady               =  0x00D904B0,
+	HwPfPcieGpexFcUpdateTimeout           =  0x00D904B8,
+	HwPfPcieGpexFcUpdateTimer             =  0x00D904BC,
+	HwPfPcieGpexVcBufferLoad              =  0x00D904C8,
+	HwPfPcieGpexVcBufferSizeThold         =  0x00D904CC,
+	HwPfPcieGpexVcBufferSelect            =  0x00D904D0,
+	HwPfPcieGpexBarEnable                 =  0x00D904D4,
+	HwPfPcieGpexBarDwordLower             =  0x00D904D8,
+	HwPfPcieGpexBarDwordUpper             =  0x00D904DC,
+	HwPfPcieGpexBarSelect                 =  0x00D904E0,
+	HwPfPcieGpexCreditCounterSelect       =  0x00D904E4,
+	HwPfPcieGpexCreditCounterStatus       =  0x00D904E8,
+	HwPfPcieGpexTlpHeaderSelect           =  0x00D904EC,
+	HwPfPcieGpexTlpHeaderDword0           =  0x00D904F0,
+	HwPfPcieGpexTlpHeaderDword1           =  0x00D904F4,
+	HwPfPcieGpexTlpHeaderDword2           =  0x00D904F8,
+	HwPfPcieGpexTlpHeaderDword3           =  0x00D904FC,
+	HwPfPcieGpexRelaxOrderControl         =  0x00D90500,
+	HwPfPcieGpexBarPrefetch               =  0x00D90504,
+	HwPfPcieGpexFcCheckControl            =  0x00D90508,
+	HwPfPcieGpexFcUpdateTimerTraffic      =  0x00D90518,
+	HwPfPcieGpexPhyControl0               =  0x00D9053C,
+	HwPfPcieGpexPhyControl1               =  0x00D90544,
+	HwPfPcieGpexPhyControl2               =  0x00D9054C,
+	HwPfPcieGpexUserControl0              =  0x00D9055C,
+	HwPfPcieGpexUncorrErrorStatus         =  0x00D905F0,
+	HwPfPcieGpexRxCplError                =  0x00D90620,
+	HwPfPcieGpexRxCplErrorDword0          =  0x00D90624,
+	HwPfPcieGpexRxCplErrorDword1          =  0x00D90628,
+	HwPfPcieGpexRxCplErrorDword2          =  0x00D9062C,
+	HwPfPcieGpexPabSwResetEn              =  0x00D90630,
+	HwPfPcieGpexGen3Control0              =  0x00D90634,
+	HwPfPcieGpexGen3Control1              =  0x00D90638,
+	HwPfPcieGpexGen3Control2              =  0x00D9063C,
+	HwPfPcieGpexGen2ControlCsr            =  0x00D90640,
+	HwPfPcieGpexTotalVfInitialVf0         =  0x00D90644,
+	HwPfPcieGpexTotalVfInitialVf1         =  0x00D90648,
+	HwPfPcieGpexSriovLinkDevId0           =  0x00D90684,
+	HwPfPcieGpexSriovLinkDevId1           =  0x00D90688,
+	HwPfPcieGpexSriovPageSize0            =  0x00D906C4,
+	HwPfPcieGpexSriovPageSize1            =  0x00D906C8,
+	HwPfPcieGpexIdVersion                 =  0x00D906FC,
+	HwPfPcieGpexSriovVfOffsetStride0      =  0x00D90704,
+	HwPfPcieGpexSriovVfOffsetStride1      =  0x00D90708,
+	HwPfPcieGpexGen3DeskewControl         =  0x00D907B4,
+	HwPfPcieGpexGen3EqControl             =  0x00D907B8,
+	HwPfPcieGpexBridgeVersion             =  0x00D90800,
+	HwPfPcieGpexBridgeCapability          =  0x00D90804,
+	HwPfPcieGpexBridgeControl             =  0x00D90808,
+	HwPfPcieGpexBridgeStatus              =  0x00D9080C,
+	HwPfPcieGpexEngineActivityStatus      =  0x00D9081C,
+	HwPfPcieGpexEngineResetControl        =  0x00D90820,
+	HwPfPcieGpexAxiPioControl             =  0x00D90840,
+	HwPfPcieGpexAxiPioStatus              =  0x00D90844,
+	HwPfPcieGpexAmbaSlaveCmdStatus        =  0x00D90848,
+	HwPfPcieGpexPexPioControl             =  0x00D908C0,
+	HwPfPcieGpexPexPioStatus              =  0x00D908C4,
+	HwPfPcieGpexAmbaMasterStatus          =  0x00D908C8,
+	HwPfPcieGpexCsrSlaveCmdStatus         =  0x00D90920,
+	HwPfPcieGpexMailboxAxiControl         =  0x00D90A50,
+	HwPfPcieGpexMailboxAxiData            =  0x00D90A54,
+	HwPfPcieGpexMailboxPexControl         =  0x00D90A90,
+	HwPfPcieGpexMailboxPexData            =  0x00D90A94,
+	HwPfPcieGpexPexInterruptEnable        =  0x00D90AD0,
+	HwPfPcieGpexPexInterruptStatus        =  0x00D90AD4,
+	HwPfPcieGpexPexInterruptAxiPioVector  =  0x00D90AD8,
+	HwPfPcieGpexPexInterruptPexPioVector  =  0x00D90AE0,
+	HwPfPcieGpexPexInterruptMiscVector    =  0x00D90AF8,
+	HwPfPcieGpexAmbaInterruptPioEnable    =  0x00D90B00,
+	HwPfPcieGpexAmbaInterruptMiscEnable   =  0x00D90B0C,
+	HwPfPcieGpexAmbaInterruptPioStatus    =  0x00D90B10,
+	HwPfPcieGpexAmbaInterruptMiscStatus   =  0x00D90B1C,
+	HwPfPcieGpexPexPmControl              =  0x00D90B80,
+	HwPfPcieGpexSlotMisc                  =  0x00D90B88,
+	HwPfPcieGpexAxiAddrMappingControl     =  0x00D90BA0,
+	HwPfPcieGpexAxiAddrMappingWindowAxiBase     =  0x00D90BA4,
+	HwPfPcieGpexAxiAddrMappingWindowPexBaseLow  =  0x00D90BA8,
+	HwPfPcieGpexAxiAddrMappingWindowPexBaseHigh =  0x00D90BAC,
+	HwPfPcieGpexPexBarAddrFunc0Bar0       =  0x00D91BA0,
+	HwPfPcieGpexPexBarAddrFunc0Bar1       =  0x00D91BA4,
+	HwPfPcieGpexAxiAddrMappingPcieHdrParam =  0x00D95BA0,
+	HwPfPcieGpexExtAxiAddrMappingAxiBase  =  0x00D980A0,
+	HwPfPcieGpexPexExtBarAddrFunc0Bar0    =  0x00D984A0,
+	HwPfPcieGpexPexExtBarAddrFunc0Bar1    =  0x00D984A4,
+	HwPfPcieGpexAmbaInterruptFlrEnable    =  0x00D9B960,
+	HwPfPcieGpexAmbaInterruptFlrStatus    =  0x00D9B9A0,
+	HwPfPcieGpexExtAxiAddrMappingSize     =  0x00D9BAF0,
+	HwPfPcieGpexPexPioAwcacheControl      =  0x00D9C300,
+	HwPfPcieGpexPexPioArcacheControl      =  0x00D9C304,
+	HwPfPcieGpexPabObSizeControlVc0       =  0x00D9C310
+};
+
+/* TIP PF Interrupt numbers */
+enum {
+	ACC100_PF_INT_QMGR_AQ_OVERFLOW = 0,
+	ACC100_PF_INT_DOORBELL_VF_2_PF = 1,
+	ACC100_PF_INT_DMA_DL_DESC_IRQ = 2,
+	ACC100_PF_INT_DMA_UL_DESC_IRQ = 3,
+	ACC100_PF_INT_DMA_MLD_DESC_IRQ = 4,
+	ACC100_PF_INT_DMA_UL5G_DESC_IRQ = 5,
+	ACC100_PF_INT_DMA_DL5G_DESC_IRQ = 6,
+	ACC100_PF_INT_ILLEGAL_FORMAT = 7,
+	ACC100_PF_INT_QMGR_DISABLED_ACCESS = 8,
+	ACC100_PF_INT_QMGR_AQ_OVERTHRESHOLD = 9,
+	ACC100_PF_INT_ARAM_ACCESS_ERR = 10,
+	ACC100_PF_INT_ARAM_ECC_1BIT_ERR = 11,
+	ACC100_PF_INT_PARITY_ERR = 12,
+	ACC100_PF_INT_QMGR_ERR = 13,
+	ACC100_PF_INT_INT_REQ_OVERFLOW = 14,
+	ACC100_PF_INT_APB_TIMEOUT = 15,
+};
+
+#endif /* ACC100_PF_ENUM_H */
diff --git a/drivers/baseband/acc100/acc100_vf_enum.h b/drivers/baseband/acc100/acc100_vf_enum.h
new file mode 100644
index 0000000..b512af3
--- /dev/null
+++ b/drivers/baseband/acc100/acc100_vf_enum.h
@@ -0,0 +1,73 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017 Intel Corporation
+ */
+
+#ifndef ACC100_VF_ENUM_H
+#define ACC100_VF_ENUM_H
+
+/*
+ * ACC100 Register mapping on VF BAR0
+ * This is automatically generated from RDL, format may change with new RDL
+ */
+enum {
+	HWVfQmgrIngressAq             =  0x00000000,
+	HWVfHiVfToPfDbellVf           =  0x00000800,
+	HWVfHiPfToVfDbellVf           =  0x00000808,
+	HWVfHiInfoRingBaseLoVf        =  0x00000810,
+	HWVfHiInfoRingBaseHiVf        =  0x00000814,
+	HWVfHiInfoRingPointerVf       =  0x00000818,
+	HWVfHiInfoRingIntWrEnVf       =  0x00000820,
+	HWVfHiInfoRingPf2VfWrEnVf     =  0x00000824,
+	HWVfHiMsixVectorMapperVf      =  0x00000860,
+	HWVfDmaFec5GulDescBaseLoRegVf =  0x00000920,
+	HWVfDmaFec5GulDescBaseHiRegVf =  0x00000924,
+	HWVfDmaFec5GulRespPtrLoRegVf  =  0x00000928,
+	HWVfDmaFec5GulRespPtrHiRegVf  =  0x0000092C,
+	HWVfDmaFec5GdlDescBaseLoRegVf =  0x00000940,
+	HWVfDmaFec5GdlDescBaseHiRegVf =  0x00000944,
+	HWVfDmaFec5GdlRespPtrLoRegVf  =  0x00000948,
+	HWVfDmaFec5GdlRespPtrHiRegVf  =  0x0000094C,
+	HWVfDmaFec4GulDescBaseLoRegVf =  0x00000960,
+	HWVfDmaFec4GulDescBaseHiRegVf =  0x00000964,
+	HWVfDmaFec4GulRespPtrLoRegVf  =  0x00000968,
+	HWVfDmaFec4GulRespPtrHiRegVf  =  0x0000096C,
+	HWVfDmaFec4GdlDescBaseLoRegVf =  0x00000980,
+	HWVfDmaFec4GdlDescBaseHiRegVf =  0x00000984,
+	HWVfDmaFec4GdlRespPtrLoRegVf  =  0x00000988,
+	HWVfDmaFec4GdlRespPtrHiRegVf  =  0x0000098C,
+	HWVfDmaDdrBaseRangeRoVf       =  0x000009A0,
+	HWVfQmgrAqResetVf             =  0x00000E00,
+	HWVfQmgrRingSizeVf            =  0x00000E04,
+	HWVfQmgrGrpDepthLog20Vf       =  0x00000E08,
+	HWVfQmgrGrpDepthLog21Vf       =  0x00000E0C,
+	HWVfQmgrGrpFunction0Vf        =  0x00000E10,
+	HWVfQmgrGrpFunction1Vf        =  0x00000E14,
+	HWVfPmACntrlRegVf             =  0x00000F40,
+	HWVfPmACountVf                =  0x00000F48,
+	HWVfPmAKCntLoVf               =  0x00000F50,
+	HWVfPmAKCntHiVf               =  0x00000F54,
+	HWVfPmADeltaCntLoVf           =  0x00000F60,
+	HWVfPmADeltaCntHiVf           =  0x00000F64,
+	HWVfPmBCntrlRegVf             =  0x00000F80,
+	HWVfPmBCountVf                =  0x00000F88,
+	HWVfPmBKCntLoVf               =  0x00000F90,
+	HWVfPmBKCntHiVf               =  0x00000F94,
+	HWVfPmBDeltaCntLoVf           =  0x00000FA0,
+	HWVfPmBDeltaCntHiVf           =  0x00000FA4
+};
+
+/* TIP VF Interrupt numbers */
+enum {
+	ACC100_VF_INT_QMGR_AQ_OVERFLOW = 0,
+	ACC100_VF_INT_DOORBELL_VF_2_PF = 1,
+	ACC100_VF_INT_DMA_DL_DESC_IRQ = 2,
+	ACC100_VF_INT_DMA_UL_DESC_IRQ = 3,
+	ACC100_VF_INT_DMA_MLD_DESC_IRQ = 4,
+	ACC100_VF_INT_DMA_UL5G_DESC_IRQ = 5,
+	ACC100_VF_INT_DMA_DL5G_DESC_IRQ = 6,
+	ACC100_VF_INT_ILLEGAL_FORMAT = 7,
+	ACC100_VF_INT_QMGR_DISABLED_ACCESS = 8,
+	ACC100_VF_INT_QMGR_AQ_OVERTHRESHOLD = 9,
+};
+
+#endif /* ACC100_VF_ENUM_H */
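The VF enum above gives byte offsets into the VF's BAR0 register space, and all ACC100 registers are 32-bit aligned. A minimal sketch of how such an offset could be turned into a register read (function name is illustrative only, not part of the patch; the real PMD wraps this in its own MMIO helpers):

```c
#include <stdint.h>

/* Hedged sketch: read a 32-bit register through a mapped BAR using one
 * of the byte offsets from acc100_vf_enum.h. The enum value is simply
 * added to the BAR virtual address, since every register is 4B aligned.
 */
static inline uint32_t
sketch_reg_read(void *bar0_va, uint32_t reg_offset)
{
	volatile uint32_t *addr =
		(volatile uint32_t *)((uint8_t *)bar0_va + reg_offset);
	return *addr;
}
```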
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 6f46df0..cd77570 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -5,6 +5,9 @@
 #ifndef _RTE_ACC100_PMD_H_
 #define _RTE_ACC100_PMD_H_
 
+#include "acc100_pf_enum.h"
+#include "acc100_vf_enum.h"
+
 /* Helper macro for logging */
 #define rte_bbdev_log(level, fmt, ...) \
 	rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \
@@ -27,6 +30,493 @@
 #define RTE_ACC100_PF_DEVICE_ID        (0x0d5c)
 #define RTE_ACC100_VF_DEVICE_ID        (0x0d5d)
 
+/* Define as 1 to use only a single FEC engine */
+#ifndef RTE_ACC100_SINGLE_FEC
+#define RTE_ACC100_SINGLE_FEC 0
+#endif
+
+/* Values used in filling in descriptors */
+#define ACC100_DMA_DESC_TYPE           2
+#define ACC100_DMA_CODE_BLK_MODE       0
+#define ACC100_DMA_BLKID_FCW           1
+#define ACC100_DMA_BLKID_IN            2
+#define ACC100_DMA_BLKID_OUT_ENC       1
+#define ACC100_DMA_BLKID_OUT_HARD      1
+#define ACC100_DMA_BLKID_OUT_SOFT      2
+#define ACC100_DMA_BLKID_OUT_HARQ      3
+#define ACC100_DMA_BLKID_IN_HARQ       3
+
+/* Values used in filling in decode FCWs */
+#define ACC100_FCW_TD_VER              1
+#define ACC100_FCW_TD_EXT_COLD_REG_EN  1
+#define ACC100_FCW_TD_AUTOMAP          0x0f
+#define ACC100_FCW_TD_RVIDX_0          2
+#define ACC100_FCW_TD_RVIDX_1          26
+#define ACC100_FCW_TD_RVIDX_2          50
+#define ACC100_FCW_TD_RVIDX_3          74
+
+/* Values used in writing to the registers */
+#define ACC100_REG_IRQ_EN_ALL          0x1FF83FF  /* Enable all interrupts */
+
+/* ACC100 Specific Dimensioning */
+#define ACC100_SIZE_64MBYTE            (64*1024*1024)
+/* Number of elements in an Info Ring */
+#define ACC100_INFO_RING_NUM_ENTRIES   1024
+/* Number of elements in HARQ layout memory */
+#define ACC100_HARQ_LAYOUT             (64*1024*1024)
+/* Assumed offset for HARQ in memory */
+#define ACC100_HARQ_OFFSET             (32*1024)
+/* Mask used to calculate an index in an Info Ring array (not a byte offset) */
+#define ACC100_INFO_RING_MASK          (ACC100_INFO_RING_NUM_ENTRIES-1)
+/* Number of Virtual Functions ACC100 supports */
+#define ACC100_NUM_VFS                  16
+#define ACC100_NUM_QGRPS                 8
+#define ACC100_NUM_QGRPS_PER_WORD        8
+#define ACC100_NUM_AQS                  16
+#define MAX_ENQ_BATCH_SIZE          255
+/* All ACC100 registers are 32-bit wide = 4B aligned */
+#define BYTES_IN_WORD                 4
+#define MAX_E_MBUF                64000
+
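Since `ACC100_INFO_RING_NUM_ENTRIES` is a power of two, `ACC100_INFO_RING_MASK` lets a monotonically increasing head counter be wrapped into a ring index with a single AND, as the comment notes (an index mask, not a byte offset). A hedged sketch, with local stand-in names:

```c
#include <stdint.h>

/* Illustrative stand-ins for the ACC100_INFO_RING_* constants above. */
#define SK_INFO_RING_NUM_ENTRIES 1024
#define SK_INFO_RING_MASK (SK_INFO_RING_NUM_ENTRIES - 1)

/* Wrap a free-running head counter into an info-ring array index. */
static inline uint16_t
sketch_info_ring_index(uint32_t head)
{
	return (uint16_t)(head & SK_INFO_RING_MASK);
}
```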
+#define GRP_ID_SHIFT    10 /* Queue Index Hierarchy */
+#define VF_ID_SHIFT     4  /* Queue Index Hierarchy */
+#define VF_OFFSET_QOS   16 /* offset in Memory Space specific to QoS Mon */
+#define TMPL_PRI_0      0x03020100
+#define TMPL_PRI_1      0x07060504
+#define TMPL_PRI_2      0x0b0a0908
+#define TMPL_PRI_3      0x0f0e0d0c
+#define QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
+#define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
+
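The `GRP_ID_SHIFT`/`VF_ID_SHIFT` comments describe a queue-index hierarchy. A hedged sketch of how a flat queue index might be composed from group, VF, and atomic-queue ids under that layout (the exact field packing is inferred from the shift values, not stated in the patch):

```c
#include <stdint.h>

/* Illustrative copies of the shifts defined above. */
#define SK_GRP_ID_SHIFT 10 /* Queue group id field position */
#define SK_VF_ID_SHIFT   4 /* VF id field position; AQ id in low bits */

/* Compose a queue index as group | vf | aq (assumed layout). */
static inline uint32_t
sketch_queue_index(uint32_t qgrp_id, uint32_t vf_id, uint32_t aq_id)
{
	return (qgrp_id << SK_GRP_ID_SHIFT) |
	       (vf_id << SK_VF_ID_SHIFT) | aq_id;
}
```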
+#define ACC100_NUM_TMPL  32
+/* Mapping of signals for the available engines */
+#define SIG_UL_5G      0
+#define SIG_UL_5G_LAST 7
+#define SIG_DL_5G      13
+#define SIG_DL_5G_LAST 15
+#define SIG_UL_4G      16
+#define SIG_UL_4G_LAST 21
+#define SIG_DL_4G      27
+#define SIG_DL_4G_LAST 31
+
+/* Maximum number of attempts to allocate a memory block for all rings */
+#define SW_RING_MEM_ALLOC_ATTEMPTS 5
+#define MAX_QUEUE_DEPTH           1024
+#define ACC100_DMA_MAX_NUM_POINTERS  14
+#define ACC100_DMA_DESC_PADDING      8
+#define ACC100_FCW_PADDING           12
+#define ACC100_DESC_FCW_OFFSET       192
+#define ACC100_DESC_SIZE             256
+#define ACC100_DESC_OFFSET           (ACC100_DESC_SIZE / 64)
+#define ACC100_FCW_TE_BLEN     32
+#define ACC100_FCW_TD_BLEN     24
+#define ACC100_FCW_LE_BLEN     32
+#define ACC100_FCW_LD_BLEN     36
+
+#define ACC100_FCW_VER         2
+#define MUX_5GDL_DESC 6
+#define CMP_ENC_SIZE 20
+#define CMP_DEC_SIZE 24
+#define ENC_OFFSET (32)
+#define DEC_OFFSET (80)
+#define ACC100_EXT_MEM
+#define ACC100_HARQ_OFFSET_THRESHOLD 1024
+
+/* Constants from K0 computation from 3GPP 38.212 Table 5.4.2.1-2 */
+#define N_ZC_1 66 /* N = 66 Zc for BG 1 */
+#define N_ZC_2 50 /* N = 50 Zc for BG 2 */
+#define K0_1_1 17 /* K0 fraction numerator for rv 1 and BG 1 */
+#define K0_1_2 13 /* K0 fraction numerator for rv 1 and BG 2 */
+#define K0_2_1 33 /* K0 fraction numerator for rv 2 and BG 1 */
+#define K0_2_2 25 /* K0 fraction numerator for rv 2 and BG 2 */
+#define K0_3_1 56 /* K0 fraction numerator for rv 3 and BG 1 */
+#define K0_3_2 43 /* K0 fraction numerator for rv 3 and BG 2 */
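The constants above are the k0 fraction numerators from 3GPP TS 38.212 Table 5.4.2.1-2 (starting position of the circular buffer per redundancy version). As a hedged illustration only (helper name and signature are invented, not part of the patch), k0 can be computed from them as:

```c
#include <stdint.h>

/* Sketch of the k0 computation for LDPC rate matching per 3GPP 38.212
 * Table 5.4.2.1-2, using the same numerators as the constants above.
 * ncb is the circular buffer length, z_c the lifting size, bg 1 or 2.
 */
static inline uint32_t
sketch_get_k0(uint32_t ncb, uint32_t z_c, uint8_t bg, uint8_t rv_index)
{
	uint32_t n_zc = (bg == 1) ? 66 : 50; /* N = n_zc * Zc */
	uint32_t numerator;

	if (rv_index == 0)
		return 0;
	switch (rv_index) {
	case 1:
		numerator = (bg == 1) ? 17 : 13;
		break;
	case 2:
		numerator = (bg == 1) ? 33 : 25;
		break;
	default: /* rv 3 */
		numerator = (bg == 1) ? 56 : 43;
		break;
	}
	/* k0 = floor(numerator * Ncb / (n_zc * Zc)) * Zc */
	return (numerator * ncb / (n_zc * z_c)) * z_c;
}
```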
+
+/* ACC100 Configuration */
+#define ACC100_DDR_ECC_ENABLE
+#define ACC100_CFG_DMA_ERROR 0x3D7
+#define ACC100_CFG_AXI_CACHE 0x11
+#define ACC100_CFG_QMGR_HI_P 0x0F0F
+#define ACC100_CFG_PCI_AXI 0xC003
+#define ACC100_CFG_PCI_BRIDGE 0x40006033
+#define ACC100_ENGINE_OFFSET 0x1000
+#define ACC100_RESET_HI 0x20100
+#define ACC100_RESET_LO 0x20000
+#define ACC100_RESET_HARD 0x1FF
+#define ACC100_ENGINES_MAX 9
+#define LONG_WAIT 1000
+
+/* ACC100 DMA Descriptor triplet */
+struct acc100_dma_triplet {
+	uint64_t address;
+	uint32_t blen:20,
+		res0:4,
+		last:1,
+		dma_ext:1,
+		res1:2,
+		blkid:4;
+} __rte_packed;
+
+/* ACC100 DMA Response Descriptor */
+union acc100_dma_rsp_desc {
+	uint32_t val;
+	struct {
+		uint32_t crc_status:1,
+			synd_ok:1,
+			dma_err:1,
+			neg_stop:1,
+			fcw_err:1,
+			output_err:1,
+			input_err:1,
+			timestampEn:1,
+			iterCountFrac:8,
+			iter_cnt:8,
+			rsrvd3:6,
+			sdone:1,
+			fdone:1;
+		uint32_t add_info_0;
+		uint32_t add_info_1;
+	};
+};
+
+/* ACC100 Queue Manager Enqueue PCI Register */
+union acc100_enqueue_reg_fmt {
+	uint32_t val;
+	struct {
+		uint32_t num_elem:8,
+			addr_offset:3,
+			rsrvd:1,
+			req_elem_addr:20;
+	};
+};
+
+/* FEC 4G Uplink Frame Control Word */
+struct __rte_packed acc100_fcw_td {
+	uint8_t fcw_ver:4,
+		num_maps:4; /* Unused */
+	uint8_t filler:6, /* Unused */
+		rsrvd0:1,
+		bypass_sb_deint:1;
+	uint16_t k_pos;
+	uint16_t k_neg; /* Unused */
+	uint8_t c_neg; /* Unused */
+	uint8_t c; /* Unused */
+	uint32_t ea; /* Unused */
+	uint32_t eb; /* Unused */
+	uint8_t cab; /* Unused */
+	uint8_t k0_start_col; /* Unused */
+	uint8_t rsrvd1;
+	uint8_t code_block_mode:1, /* Unused */
+		turbo_crc_type:1,
+		rsrvd2:3,
+		bypass_teq:1, /* Unused */
+		soft_output_en:1, /* Unused */
+		ext_td_cold_reg_en:1;
+	union { /* External Cold register */
+		uint32_t ext_td_cold_reg;
+		struct {
+			uint32_t min_iter:4, /* Unused */
+				max_iter:4,
+				ext_scale:5, /* Unused */
+				rsrvd3:3,
+				early_stop_en:1, /* Unused */
+				sw_soft_out_dis:1, /* Unused */
+				sw_et_cont:1, /* Unused */
+				sw_soft_out_saturation:1, /* Unused */
+				half_iter_on:1, /* Unused */
+				raw_decoder_input_on:1, /* Unused */
+				rsrvd4:10;
+		};
+	};
+};
+
+/* FEC 5GNR Uplink Frame Control Word */
+struct __rte_packed acc100_fcw_ld {
+	uint32_t FCWversion:4,
+		qm:4,
+		nfiller:11,
+		BG:1,
+		Zc:9,
+		res0:1,
+		synd_precoder:1,
+		synd_post:1;
+	uint32_t ncb:16,
+		k0:16;
+	uint32_t rm_e:24,
+		hcin_en:1,
+		hcout_en:1,
+		crc_select:1,
+		bypass_dec:1,
+		bypass_intlv:1,
+		so_en:1,
+		so_bypass_rm:1,
+		so_bypass_intlv:1;
+	uint32_t hcin_offset:16,
+		hcin_size0:16;
+	uint32_t hcin_size1:16,
+		hcin_decomp_mode:3,
+		llr_pack_mode:1,
+		hcout_comp_mode:3,
+		res2:1,
+		dec_convllr:4,
+		hcout_convllr:4;
+	uint32_t itmax:7,
+		itstop:1,
+		so_it:7,
+		res3:1,
+		hcout_offset:16;
+	uint32_t hcout_size0:16,
+		hcout_size1:16;
+	uint32_t gain_i:8,
+		gain_h:8,
+		negstop_th:16;
+	uint32_t negstop_it:7,
+		negstop_en:1,
+		res4:24;
+};
+
+/* FEC 4G Downlink Frame Control Word */
+struct __rte_packed acc100_fcw_te {
+	uint16_t k_neg;
+	uint16_t k_pos;
+	uint8_t c_neg;
+	uint8_t c;
+	uint8_t filler;
+	uint8_t cab;
+	uint32_t ea:17,
+		rsrvd0:15;
+	uint32_t eb:17,
+		rsrvd1:15;
+	uint16_t ncb_neg;
+	uint16_t ncb_pos;
+	uint8_t rv_idx0:2,
+		rsrvd2:2,
+		rv_idx1:2,
+		rsrvd3:2;
+	uint8_t bypass_rv_idx0:1,
+		bypass_rv_idx1:1,
+		bypass_rm:1,
+		rsrvd4:5;
+	uint8_t rsrvd5:1,
+		rsrvd6:3,
+		code_block_crc:1,
+		rsrvd7:3;
+	uint8_t code_block_mode:1,
+		rsrvd8:7;
+	uint64_t rsrvd9;
+};
+
+/* FEC 5GNR Downlink Frame Control Word */
+struct __rte_packed acc100_fcw_le {
+	uint32_t FCWversion:4,
+		qm:4,
+		nfiller:11,
+		BG:1,
+		Zc:9,
+		res0:3;
+	uint32_t ncb:16,
+		k0:16;
+	uint32_t rm_e:24,
+		res1:2,
+		crc_select:1,
+		res2:1,
+		bypass_intlv:1,
+		res3:3;
+	uint32_t res4_a:12,
+		mcb_count:3,
+		res4_b:17;
+	uint32_t res5;
+	uint32_t res6;
+	uint32_t res7;
+	uint32_t res8;
+};
+
+/* ACC100 DMA Request Descriptor */
+struct __rte_packed acc100_dma_req_desc {
+	union {
+		struct{
+			uint32_t type:4,
+				rsrvd0:26,
+				sdone:1,
+				fdone:1;
+			uint32_t rsrvd1;
+			uint32_t rsrvd2;
+			uint32_t pass_param:8,
+				sdone_enable:1,
+				irq_enable:1,
+				timeStampEn:1,
+				res0:5,
+				numCBs:4,
+				res1:4,
+				m2dlen:4,
+				d2mlen:4;
+		};
+		struct{
+			uint32_t word0;
+			uint32_t word1;
+			uint32_t word2;
+			uint32_t word3;
+		};
+	};
+	struct acc100_dma_triplet data_ptrs[ACC100_DMA_MAX_NUM_POINTERS];
+
+	/* Virtual addresses used to retrieve SW context info */
+	union {
+		void *op_addr;
+		uint64_t pad1;  /* pad to 64 bits */
+	};
+	/*
+	 * Stores additional information needed for driver processing:
+	 * - last_desc_in_batch - flag used to mark last descriptor (CB)
+	 *                        in batch
+	 * - cbs_in_tb - stores information about total number of Code Blocks
+	 *               in currently processed Transport Block
+	 */
+	union {
+		struct {
+			union {
+				struct acc100_fcw_ld fcw_ld;
+				struct acc100_fcw_td fcw_td;
+				struct acc100_fcw_le fcw_le;
+				struct acc100_fcw_te fcw_te;
+				uint32_t pad2[ACC100_FCW_PADDING];
+			};
+			uint32_t last_desc_in_batch :8,
+				cbs_in_tb:8,
+				pad4 : 16;
+		};
+		uint64_t pad3[ACC100_DMA_DESC_PADDING]; /* pad to 64 bits */
+	};
+};
+
+/* ACC100 DMA Descriptor */
+union acc100_dma_desc {
+	struct acc100_dma_req_desc req;
+	union acc100_dma_rsp_desc rsp;
+};
+
+/* Union describing HARQ layout entry */
+union acc100_harq_layout_data {
+	uint32_t val;
+	struct {
+		uint16_t offset;
+		uint16_t size0;
+	};
+} __rte_packed;
+
+/* Union describing Info Ring entry */
+union acc100_info_ring_data {
+	uint32_t val;
+	struct {
+		union {
+			uint16_t detailed_info;
+			struct {
+				uint16_t aq_id: 4;
+				uint16_t qg_id: 4;
+				uint16_t vf_id: 6;
+				uint16_t reserved: 2;
+			};
+		};
+		uint16_t int_nb: 7;
+		uint16_t msi_0: 1;
+		uint16_t vf2pf: 6;
+		uint16_t loop: 1;
+		uint16_t valid: 1;
+	};
+} __rte_packed;
+
+struct acc100_registry_addr {
+	unsigned int dma_ring_dl5g_hi;
+	unsigned int dma_ring_dl5g_lo;
+	unsigned int dma_ring_ul5g_hi;
+	unsigned int dma_ring_ul5g_lo;
+	unsigned int dma_ring_dl4g_hi;
+	unsigned int dma_ring_dl4g_lo;
+	unsigned int dma_ring_ul4g_hi;
+	unsigned int dma_ring_ul4g_lo;
+	unsigned int ring_size;
+	unsigned int info_ring_hi;
+	unsigned int info_ring_lo;
+	unsigned int info_ring_en;
+	unsigned int info_ring_ptr;
+	unsigned int tail_ptrs_dl5g_hi;
+	unsigned int tail_ptrs_dl5g_lo;
+	unsigned int tail_ptrs_ul5g_hi;
+	unsigned int tail_ptrs_ul5g_lo;
+	unsigned int tail_ptrs_dl4g_hi;
+	unsigned int tail_ptrs_dl4g_lo;
+	unsigned int tail_ptrs_ul4g_hi;
+	unsigned int tail_ptrs_ul4g_lo;
+	unsigned int depth_log0_offset;
+	unsigned int depth_log1_offset;
+	unsigned int qman_group_func;
+	unsigned int ddr_range;
+};
+
+/* Structure holding registry addresses for PF */
+static const struct acc100_registry_addr pf_reg_addr = {
+	.dma_ring_dl5g_hi = HWPfDmaFec5GdlDescBaseHiRegVf,
+	.dma_ring_dl5g_lo = HWPfDmaFec5GdlDescBaseLoRegVf,
+	.dma_ring_ul5g_hi = HWPfDmaFec5GulDescBaseHiRegVf,
+	.dma_ring_ul5g_lo = HWPfDmaFec5GulDescBaseLoRegVf,
+	.dma_ring_dl4g_hi = HWPfDmaFec4GdlDescBaseHiRegVf,
+	.dma_ring_dl4g_lo = HWPfDmaFec4GdlDescBaseLoRegVf,
+	.dma_ring_ul4g_hi = HWPfDmaFec4GulDescBaseHiRegVf,
+	.dma_ring_ul4g_lo = HWPfDmaFec4GulDescBaseLoRegVf,
+	.ring_size = HWPfQmgrRingSizeVf,
+	.info_ring_hi = HWPfHiInfoRingBaseHiRegPf,
+	.info_ring_lo = HWPfHiInfoRingBaseLoRegPf,
+	.info_ring_en = HWPfHiInfoRingIntWrEnRegPf,
+	.info_ring_ptr = HWPfHiInfoRingPointerRegPf,
+	.tail_ptrs_dl5g_hi = HWPfDmaFec5GdlRespPtrHiRegVf,
+	.tail_ptrs_dl5g_lo = HWPfDmaFec5GdlRespPtrLoRegVf,
+	.tail_ptrs_ul5g_hi = HWPfDmaFec5GulRespPtrHiRegVf,
+	.tail_ptrs_ul5g_lo = HWPfDmaFec5GulRespPtrLoRegVf,
+	.tail_ptrs_dl4g_hi = HWPfDmaFec4GdlRespPtrHiRegVf,
+	.tail_ptrs_dl4g_lo = HWPfDmaFec4GdlRespPtrLoRegVf,
+	.tail_ptrs_ul4g_hi = HWPfDmaFec4GulRespPtrHiRegVf,
+	.tail_ptrs_ul4g_lo = HWPfDmaFec4GulRespPtrLoRegVf,
+	.depth_log0_offset = HWPfQmgrGrpDepthLog20Vf,
+	.depth_log1_offset = HWPfQmgrGrpDepthLog21Vf,
+	.qman_group_func = HWPfQmgrGrpFunction0,
+	.ddr_range = HWPfDmaVfDdrBaseRw,
+};
+
+/* Structure holding registry addresses for VF */
+static const struct acc100_registry_addr vf_reg_addr = {
+	.dma_ring_dl5g_hi = HWVfDmaFec5GdlDescBaseHiRegVf,
+	.dma_ring_dl5g_lo = HWVfDmaFec5GdlDescBaseLoRegVf,
+	.dma_ring_ul5g_hi = HWVfDmaFec5GulDescBaseHiRegVf,
+	.dma_ring_ul5g_lo = HWVfDmaFec5GulDescBaseLoRegVf,
+	.dma_ring_dl4g_hi = HWVfDmaFec4GdlDescBaseHiRegVf,
+	.dma_ring_dl4g_lo = HWVfDmaFec4GdlDescBaseLoRegVf,
+	.dma_ring_ul4g_hi = HWVfDmaFec4GulDescBaseHiRegVf,
+	.dma_ring_ul4g_lo = HWVfDmaFec4GulDescBaseLoRegVf,
+	.ring_size = HWVfQmgrRingSizeVf,
+	.info_ring_hi = HWVfHiInfoRingBaseHiVf,
+	.info_ring_lo = HWVfHiInfoRingBaseLoVf,
+	.info_ring_en = HWVfHiInfoRingIntWrEnVf,
+	.info_ring_ptr = HWVfHiInfoRingPointerVf,
+	.tail_ptrs_dl5g_hi = HWVfDmaFec5GdlRespPtrHiRegVf,
+	.tail_ptrs_dl5g_lo = HWVfDmaFec5GdlRespPtrLoRegVf,
+	.tail_ptrs_ul5g_hi = HWVfDmaFec5GulRespPtrHiRegVf,
+	.tail_ptrs_ul5g_lo = HWVfDmaFec5GulRespPtrLoRegVf,
+	.tail_ptrs_dl4g_hi = HWVfDmaFec4GdlRespPtrHiRegVf,
+	.tail_ptrs_dl4g_lo = HWVfDmaFec4GdlRespPtrLoRegVf,
+	.tail_ptrs_ul4g_hi = HWVfDmaFec4GulRespPtrHiRegVf,
+	.tail_ptrs_ul4g_lo = HWVfDmaFec4GulRespPtrLoRegVf,
+	.depth_log0_offset = HWVfQmgrGrpDepthLog20Vf,
+	.depth_log1_offset = HWVfQmgrGrpDepthLog21Vf,
+	.qman_group_func = HWVfQmgrGrpFunction0Vf,
+	.ddr_range = HWVfDmaDdrBaseRangeRoVf,
+};
+
 /* Private data structure for each ACC100 device */
 struct acc100_device {
 	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v8 03/10] baseband/acc100: add info get function
  2020-09-28 23:52   ` [dpdk-dev] [PATCH v8 00/10] bbdev PMD ACC100 Nicolas Chautru
  2020-09-28 23:52     ` [dpdk-dev] [PATCH v8 01/10] drivers/baseband: add PMD for ACC100 Nicolas Chautru
  2020-09-28 23:52     ` [dpdk-dev] [PATCH v8 02/10] baseband/acc100: add register definition file Nicolas Chautru
@ 2020-09-28 23:52     ` Nicolas Chautru
  2020-09-28 23:52     ` [dpdk-dev] [PATCH v8 04/10] baseband/acc100: add queue configuration Nicolas Chautru
                       ` (6 subsequent siblings)
  9 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-28 23:52 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add the "info_get" function to the driver to allow the device to be
queried.
No processing capabilities are available yet.
bbdev-test is linked to support the PMD with null capability.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 app/test-bbdev/meson.build               |   3 +
 drivers/baseband/acc100/rte_acc100_cfg.h |  96 +++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.c | 225 +++++++++++++++++++++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.h |   3 +
 4 files changed, 327 insertions(+)
 create mode 100644 drivers/baseband/acc100/rte_acc100_cfg.h

diff --git a/app/test-bbdev/meson.build b/app/test-bbdev/meson.build
index 18ab6a8..fbd8ae3 100644
--- a/app/test-bbdev/meson.build
+++ b/app/test-bbdev/meson.build
@@ -12,3 +12,6 @@ endif
 if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC')
 	deps += ['pmd_bbdev_fpga_5gnr_fec']
 endif
+if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_ACC100')
+	deps += ['pmd_bbdev_acc100']
+endif
\ No newline at end of file
diff --git a/drivers/baseband/acc100/rte_acc100_cfg.h b/drivers/baseband/acc100/rte_acc100_cfg.h
new file mode 100644
index 0000000..73bbe36
--- /dev/null
+++ b/drivers/baseband/acc100/rte_acc100_cfg.h
@@ -0,0 +1,96 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_ACC100_CFG_H_
+#define _RTE_ACC100_CFG_H_
+
+/**
+ * @file rte_acc100_cfg.h
+ *
+ * Functions for configuring ACC100 HW, exposed directly to applications.
+ * Configuration related to encoding/decoding is done through the
+ * librte_bbdev library.
+ *
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ */
+
+#include <stdint.h>
+#include <stdbool.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+/** Number of Virtual Functions ACC100 supports */
+#define RTE_ACC100_NUM_VFS 16
+
+/**
+ * Definition of Queue Topology for ACC100 Configuration
+ * Some level of details is abstracted out to expose a clean interface
+ * given that comprehensive flexibility is not required
+ */
+struct rte_q_topology_t {
+	/** Number of QGroups in incremental order of priority */
+	uint16_t num_qgroups;
+	/**
+	 * All QGroups have the same number of AQs here.
+	 * Note : Could be made a 16-array if more flexibility is really
+	 * required
+	 */
+	uint16_t num_aqs_per_groups;
+	/**
+	 * Depth of the AQs is the same for all QGroups here. Log2 Enum : 2^N
+	 * Note : Could be made a 16-array if more flexibility is really
+	 * required
+	 */
+	uint16_t aq_depth_log2;
+	/**
+	 * Index of the first Queue Group Index - assuming contiguity
+	 * Initialized as -1
+	 */
+	int8_t first_qgroup_index;
+};
+
+/**
+ * Definition of Arbitration related parameters for ACC100 Configuration
+ */
+struct rte_arbitration_t {
+	/** Default Weight for VF Fairness Arbitration */
+	uint16_t round_robin_weight;
+	uint32_t gbr_threshold1; /**< Guaranteed Bitrate Threshold 1 */
+	uint32_t gbr_threshold2; /**< Guaranteed Bitrate Threshold 2 */
+};
+
+/**
+ * Structure to pass ACC100 configuration.
+ * Note: all VF Bundles will have the same configuration.
+ */
+struct acc100_conf {
+	bool pf_mode_en; /**< 1 if PF is used for dataplane, 0 for VFs */
+	/** 1 if input '1' bit is represented by a positive LLR value, 0 if '1'
+	 * bit is represented by a negative value.
+	 */
+	bool input_pos_llr_1_bit;
+	/** 1 if output '1' bit is represented by a positive value, 0 if '1'
+	 * bit is represented by a negative value.
+	 */
+	bool output_pos_llr_1_bit;
+	uint16_t num_vf_bundles; /**< Number of VF bundles to setup */
+	/** Queue topology for each operation type */
+	struct rte_q_topology_t q_ul_4g;
+	struct rte_q_topology_t q_dl_4g;
+	struct rte_q_topology_t q_ul_5g;
+	struct rte_q_topology_t q_dl_5g;
+	/** Arbitration configuration for each operation type */
+	struct rte_arbitration_t arb_ul_4g[RTE_ACC100_NUM_VFS];
+	struct rte_arbitration_t arb_dl_4g[RTE_ACC100_NUM_VFS];
+	struct rte_arbitration_t arb_ul_5g[RTE_ACC100_NUM_VFS];
+	struct rte_arbitration_t arb_dl_5g[RTE_ACC100_NUM_VFS];
+};
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_ACC100_CFG_H_ */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 1b4cd13..7807a30 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -26,6 +26,184 @@
 RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
 #endif
 
+/* Read a register of an ACC100 device */
+static inline uint32_t
+acc100_reg_read(struct acc100_device *d, uint32_t offset)
+{
+	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
+	uint32_t ret = *((volatile uint32_t *)(reg_addr));
+	return rte_le_to_cpu_32(ret);
+}
+
+/* Calculate the offset of the enqueue register */
+static inline uint32_t
+queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
+{
+	if (pf_device)
+		return ((vf_id << 12) + (qgrp_id << 7) + (aq_id << 3) +
+				HWPfQmgrIngressAq);
+	else
+		return ((qgrp_id << 7) + (aq_id << 3) +
+				HWVfQmgrIngressAq);
+}
+
+enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC};
+
+/* Return the queue topology for a Queue Group Index */
+static inline void
+qtopFromAcc(struct rte_q_topology_t **qtop, int acc_enum,
+		struct acc100_conf *acc100_conf)
+{
+	struct rte_q_topology_t *p_qtop;
+	p_qtop = NULL;
+	switch (acc_enum) {
+	case UL_4G:
+		p_qtop = &(acc100_conf->q_ul_4g);
+		break;
+	case UL_5G:
+		p_qtop = &(acc100_conf->q_ul_5g);
+		break;
+	case DL_4G:
+		p_qtop = &(acc100_conf->q_dl_4g);
+		break;
+	case DL_5G:
+		p_qtop = &(acc100_conf->q_dl_5g);
+		break;
+	default:
+		/* NOTREACHED */
+		rte_bbdev_log(ERR, "Unexpected error evaluating qtopFromAcc");
+		break;
+	}
+	*qtop = p_qtop;
+}
+
+static void
+initQTop(struct acc100_conf *acc100_conf)
+{
+	acc100_conf->q_ul_4g.num_aqs_per_groups = 0;
+	acc100_conf->q_ul_4g.num_qgroups = 0;
+	acc100_conf->q_ul_4g.first_qgroup_index = -1;
+	acc100_conf->q_ul_5g.num_aqs_per_groups = 0;
+	acc100_conf->q_ul_5g.num_qgroups = 0;
+	acc100_conf->q_ul_5g.first_qgroup_index = -1;
+	acc100_conf->q_dl_4g.num_aqs_per_groups = 0;
+	acc100_conf->q_dl_4g.num_qgroups = 0;
+	acc100_conf->q_dl_4g.first_qgroup_index = -1;
+	acc100_conf->q_dl_5g.num_aqs_per_groups = 0;
+	acc100_conf->q_dl_5g.num_qgroups = 0;
+	acc100_conf->q_dl_5g.first_qgroup_index = -1;
+}
+
+static inline void
+updateQtop(uint8_t acc, uint8_t qg, struct acc100_conf *acc100_conf,
+		struct acc100_device *d) {
+	uint32_t reg;
+	struct rte_q_topology_t *q_top = NULL;
+	qtopFromAcc(&q_top, acc, acc100_conf);
+	if (unlikely(q_top == NULL))
+		return;
+	uint16_t aq;
+	q_top->num_qgroups++;
+	if (q_top->first_qgroup_index == -1) {
+		q_top->first_qgroup_index = qg;
+		/* Can be optimized to assume all are enabled by default */
+		reg = acc100_reg_read(d, queue_offset(d->pf_device,
+				0, qg, ACC100_NUM_AQS - 1));
+		if (reg & QUEUE_ENABLE) {
+			q_top->num_aqs_per_groups = ACC100_NUM_AQS;
+			return;
+		}
+		q_top->num_aqs_per_groups = 0;
+		for (aq = 0; aq < ACC100_NUM_AQS; aq++) {
+			reg = acc100_reg_read(d, queue_offset(d->pf_device,
+					0, qg, aq));
+			if (reg & QUEUE_ENABLE)
+				q_top->num_aqs_per_groups++;
+		}
+	}
+}
+
+/* Fetch configuration enabled for the PF/VF using MMIO Read (slow) */
+static inline void
+fetch_acc100_config(struct rte_bbdev *dev)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	struct acc100_conf *acc100_conf = &d->acc100_conf;
+	const struct acc100_registry_addr *reg_addr;
+	uint8_t acc, qg;
+	uint32_t reg, reg_aq, reg_len0, reg_len1;
+	uint32_t reg_mode;
+
+	/* No need to retrieve the configuration if already done */
+	if (d->configured)
+		return;
+
+	/* Choose correct registry addresses for the device type */
+	if (d->pf_device)
+		reg_addr = &pf_reg_addr;
+	else
+		reg_addr = &vf_reg_addr;
+
+	d->ddr_size = (1 + acc100_reg_read(d, reg_addr->ddr_range)) << 10;
+
+	/* Single VF Bundle by VF */
+	acc100_conf->num_vf_bundles = 1;
+	initQTop(acc100_conf);
+
+	struct rte_q_topology_t *q_top = NULL;
+	int qman_func_id[5] = {0, 2, 1, 3, 4};
+	reg = acc100_reg_read(d, reg_addr->qman_group_func);
+	for (qg = 0; qg < ACC100_NUM_QGRPS_PER_WORD; qg++) {
+		reg_aq = acc100_reg_read(d,
+				queue_offset(d->pf_device, 0, qg, 0));
+		if (reg_aq & QUEUE_ENABLE) {
+			acc = qman_func_id[(reg >> (qg * 4)) & 0x7];
+			updateQtop(acc, qg, acc100_conf, d);
+		}
+	}
+
+	/* Check the depth of the AQs */
+	reg_len0 = acc100_reg_read(d, reg_addr->depth_log0_offset);
+	reg_len1 = acc100_reg_read(d, reg_addr->depth_log1_offset);
+	for (acc = 0; acc < NUM_ACC; acc++) {
+		qtopFromAcc(&q_top, acc, acc100_conf);
+		if (q_top->first_qgroup_index < ACC100_NUM_QGRPS_PER_WORD)
+			q_top->aq_depth_log2 = (reg_len0 >>
+					(q_top->first_qgroup_index * 4))
+					& 0xF;
+		else
+			q_top->aq_depth_log2 = (reg_len1 >>
+					((q_top->first_qgroup_index -
+					ACC100_NUM_QGRPS_PER_WORD) * 4))
+					& 0xF;
+	}
+
+	/* Read PF mode */
+	if (d->pf_device) {
+		reg_mode = acc100_reg_read(d, HWPfHiPfMode);
+		acc100_conf->pf_mode_en = (reg_mode == 2) ? 1 : 0;
+	}
+
+	rte_bbdev_log_debug(
+			"%s Config LLR SIGN IN/OUT %s %s QG %u %u %u %u AQ %u %u %u %u Len %u %u %u %u\n",
+			(d->pf_device) ? "PF" : "VF",
+			(acc100_conf->input_pos_llr_1_bit) ? "POS" : "NEG",
+			(acc100_conf->output_pos_llr_1_bit) ? "POS" : "NEG",
+			acc100_conf->q_ul_4g.num_qgroups,
+			acc100_conf->q_dl_4g.num_qgroups,
+			acc100_conf->q_ul_5g.num_qgroups,
+			acc100_conf->q_dl_5g.num_qgroups,
+			acc100_conf->q_ul_4g.num_aqs_per_groups,
+			acc100_conf->q_dl_4g.num_aqs_per_groups,
+			acc100_conf->q_ul_5g.num_aqs_per_groups,
+			acc100_conf->q_dl_5g.num_aqs_per_groups,
+			acc100_conf->q_ul_4g.aq_depth_log2,
+			acc100_conf->q_dl_4g.aq_depth_log2,
+			acc100_conf->q_ul_5g.aq_depth_log2,
+			acc100_conf->q_dl_5g.aq_depth_log2);
+}
+
 /* Free 64MB memory used for software rings */
 static int
 acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
@@ -33,8 +211,55 @@
 	return 0;
 }
 
+/* Get ACC100 device info */
+static void
+acc100_dev_info_get(struct rte_bbdev *dev,
+		struct rte_bbdev_driver_info *dev_info)
+{
+	struct acc100_device *d = dev->data->dev_private;
+
+	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
+		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
+	};
+
+	static struct rte_bbdev_queue_conf default_queue_conf;
+	default_queue_conf.socket = dev->data->socket_id;
+	default_queue_conf.queue_size = MAX_QUEUE_DEPTH;
+
+	dev_info->driver_name = dev->device->driver->name;
+
+	/* Read and save the populated config from ACC100 registers */
+	fetch_acc100_config(dev);
+
+	/* This isn't ideal because it reports the maximum number of queues but
+	 * does not provide info on how many can be uplink/downlink or of
+	 * different priorities
+	 */
+	dev_info->max_num_queues =
+			d->acc100_conf.q_dl_5g.num_aqs_per_groups *
+			d->acc100_conf.q_dl_5g.num_qgroups +
+			d->acc100_conf.q_ul_5g.num_aqs_per_groups *
+			d->acc100_conf.q_ul_5g.num_qgroups +
+			d->acc100_conf.q_dl_4g.num_aqs_per_groups *
+			d->acc100_conf.q_dl_4g.num_qgroups +
+			d->acc100_conf.q_ul_4g.num_aqs_per_groups *
+			d->acc100_conf.q_ul_4g.num_qgroups;
+	dev_info->queue_size_lim = MAX_QUEUE_DEPTH;
+	dev_info->hardware_accelerated = true;
+	dev_info->max_dl_queue_priority =
+			d->acc100_conf.q_dl_4g.num_qgroups - 1;
+	dev_info->max_ul_queue_priority =
+			d->acc100_conf.q_ul_4g.num_qgroups - 1;
+	dev_info->default_queue_conf = default_queue_conf;
+	dev_info->cpu_flag_reqs = NULL;
+	dev_info->min_alignment = 64;
+	dev_info->capabilities = bbdev_capabilities;
+	dev_info->harq_buffer_size = d->ddr_size;
+}
+
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.close = acc100_dev_close,
+	.info_get = acc100_dev_info_get,
 };
 
 /* ACC100 PCI PF address map */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index cd77570..662e2c8 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -7,6 +7,7 @@
 
 #include "acc100_pf_enum.h"
 #include "acc100_vf_enum.h"
+#include "rte_acc100_cfg.h"
 
 /* Helper macro for logging */
 #define rte_bbdev_log(level, fmt, ...) \
@@ -520,6 +521,8 @@ struct acc100_registry_addr {
 /* Private data structure for each ACC100 device */
 struct acc100_device {
 	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
+	uint32_t ddr_size; /* Size in kB */
+	struct acc100_conf acc100_conf; /* ACC100 Initial configuration */
 	bool pf_device; /**< True if this is a PF ACC100 device */
 	bool configured; /**< True if this ACC100 device is configured */
 };
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v8 04/10] baseband/acc100: add queue configuration
  2020-09-28 23:52   ` [dpdk-dev] [PATCH v8 00/10] bbdev PMD ACC100 Nicolas Chautru
                       ` (2 preceding siblings ...)
  2020-09-28 23:52     ` [dpdk-dev] [PATCH v8 03/10] baseband/acc100: add info get function Nicolas Chautru
@ 2020-09-28 23:52     ` Nicolas Chautru
  2020-09-28 23:52     ` [dpdk-dev] [PATCH v8 05/10] baseband/acc100: add LDPC processing functions Nicolas Chautru
                       ` (5 subsequent siblings)
  9 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-28 23:52 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add functions to create and configure queues for
the device. Still no capabilities are exposed.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 420 ++++++++++++++++++++++++++++++-
 drivers/baseband/acc100/rte_acc100_pmd.h |  45 ++++
 2 files changed, 464 insertions(+), 1 deletion(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 7807a30..7a21c57 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -26,6 +26,22 @@
 RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
 #endif
 
+/* Write to MMIO register address */
+static inline void
+mmio_write(void *addr, uint32_t value)
+{
+	*((volatile uint32_t *)(addr)) = rte_cpu_to_le_32(value);
+}
+
+/* Write a register of an ACC100 device */
+static inline void
+acc100_reg_write(struct acc100_device *d, uint32_t offset, uint32_t payload)
+{
+	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
+	mmio_write(reg_addr, payload);
+	usleep(1000);
+}
+
 /* Read a register of an ACC100 device */
 static inline uint32_t
 acc100_reg_read(struct acc100_device *d, uint32_t offset)
@@ -36,6 +52,22 @@
 	return rte_le_to_cpu_32(ret);
 }
 
+/* Basic Implementation of Log2 for exact 2^N */
+static inline uint32_t
+log2_basic(uint32_t value)
+{
+	return (value == 0) ? 0 : __builtin_ctz(value);
+}
+
+/* Calculate memory alignment offset assuming alignment is 2^N */
+static inline uint32_t
+calc_mem_alignment_offset(void *unaligned_virt_mem, uint32_t alignment)
+{
+	rte_iova_t unaligned_phy_mem = rte_malloc_virt2iova(unaligned_virt_mem);
+	return (uint32_t)(alignment -
+			(unaligned_phy_mem & (alignment-1)));
+}
+
 /* Calculate the offset of the enqueue register */
 static inline uint32_t
 queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
@@ -204,10 +236,393 @@
 			acc100_conf->q_dl_5g.aq_depth_log2);
 }
 
+static void
+free_base_addresses(void **base_addrs, int size)
+{
+	int i;
+	for (i = 0; i < size; i++)
+		rte_free(base_addrs[i]);
+}
+
+static inline uint32_t
+get_desc_len(void)
+{
+	return sizeof(union acc100_dma_desc);
+}
+
+/* Allocate the 2 * 64MB block for the sw rings */
+static int
+alloc_2x64mb_sw_rings_mem(struct rte_bbdev *dev, struct acc100_device *d,
+		int socket)
+{
+	uint32_t sw_ring_size = ACC100_SIZE_64MBYTE;
+	d->sw_rings_base = rte_zmalloc_socket(dev->device->driver->name,
+			2 * sw_ring_size, RTE_CACHE_LINE_SIZE, socket);
+	if (d->sw_rings_base == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate memory for %s:%u",
+				dev->device->driver->name,
+				dev->data->dev_id);
+		return -ENOMEM;
+	}
+	memset(d->sw_rings_base, 0, ACC100_SIZE_64MBYTE);
+	uint32_t next_64mb_align_offset = calc_mem_alignment_offset(
+			d->sw_rings_base, ACC100_SIZE_64MBYTE);
+	d->sw_rings = RTE_PTR_ADD(d->sw_rings_base, next_64mb_align_offset);
+	d->sw_rings_phys = rte_malloc_virt2iova(d->sw_rings_base) +
+			next_64mb_align_offset;
+	d->sw_ring_size = MAX_QUEUE_DEPTH * get_desc_len();
+	d->sw_ring_max_depth = d->sw_ring_size / get_desc_len();
+
+	return 0;
+}
+
+/* Attempt to allocate minimised memory space for sw rings */
+static void
+alloc_sw_rings_min_mem(struct rte_bbdev *dev, struct acc100_device *d,
+		uint16_t num_queues, int socket)
+{
+	rte_iova_t sw_rings_base_phy, next_64mb_align_addr_phy;
+	uint32_t next_64mb_align_offset;
+	rte_iova_t sw_ring_phys_end_addr;
+	void *base_addrs[SW_RING_MEM_ALLOC_ATTEMPTS];
+	void *sw_rings_base;
+	int i = 0;
+	uint32_t q_sw_ring_size = MAX_QUEUE_DEPTH * get_desc_len();
+	uint32_t dev_sw_ring_size = q_sw_ring_size * num_queues;
+
+	/* Find an aligned block of memory to store sw rings */
+	while (i < SW_RING_MEM_ALLOC_ATTEMPTS) {
+		/*
+		 * sw_ring allocated memory is guaranteed to be aligned to
+		 * q_sw_ring_size on the condition that the requested size is
+		 * less than the page size
+		 */
+		sw_rings_base = rte_zmalloc_socket(
+				dev->device->driver->name,
+				dev_sw_ring_size, q_sw_ring_size, socket);
+
+		if (sw_rings_base == NULL) {
+			rte_bbdev_log(ERR,
+					"Failed to allocate memory for %s:%u",
+					dev->device->driver->name,
+					dev->data->dev_id);
+			break;
+		}
+
+		sw_rings_base_phy = rte_malloc_virt2iova(sw_rings_base);
+		next_64mb_align_offset = calc_mem_alignment_offset(
+				sw_rings_base, ACC100_SIZE_64MBYTE);
+		next_64mb_align_addr_phy = sw_rings_base_phy +
+				next_64mb_align_offset;
+		sw_ring_phys_end_addr = sw_rings_base_phy + dev_sw_ring_size;
+
+		/* Check if the end of the sw ring memory block is before the
+		 * start of next 64MB aligned mem address
+		 */
+		if (sw_ring_phys_end_addr < next_64mb_align_addr_phy) {
+			d->sw_rings_phys = sw_rings_base_phy;
+			d->sw_rings = sw_rings_base;
+			d->sw_rings_base = sw_rings_base;
+			d->sw_ring_size = q_sw_ring_size;
+			d->sw_ring_max_depth = MAX_QUEUE_DEPTH;
+			break;
+		}
+		/* Store the address of the unaligned mem block */
+		base_addrs[i] = sw_rings_base;
+		i++;
+	}
+
+	/* Free all unaligned blocks of mem allocated in the loop */
+	free_base_addresses(base_addrs, i);
+}
+
+/* Allocate 64MB memory used for all software rings */
+static int
+acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
+{
+	uint32_t phys_low, phys_high, payload;
+	struct acc100_device *d = dev->data->dev_private;
+	const struct acc100_registry_addr *reg_addr;
+
+	if (d->pf_device && !d->acc100_conf.pf_mode_en) {
+		rte_bbdev_log(NOTICE,
+				"%s has PF mode disabled. This PF can't be used.",
+				dev->data->name);
+		return -ENODEV;
+	}
+
+	alloc_sw_rings_min_mem(dev, d, num_queues, socket_id);
+
+	/* If minimal memory space approach failed, then allocate
+	 * the 2 * 64MB block for the sw rings
+	 */
+	if (d->sw_rings == NULL)
+		alloc_2x64mb_sw_rings_mem(dev, d, socket_id);
+
+	/* Configure ACC100 with the base address for DMA descriptor rings
+	 * Same descriptor rings used for UL and DL DMA Engines
+	 * Note : Assuming only VF0 bundle is used for PF mode
+	 */
+	phys_high = (uint32_t)(d->sw_rings_phys >> 32);
+	phys_low  = (uint32_t)(d->sw_rings_phys & ~(ACC100_SIZE_64MBYTE-1));
+
+	/* Choose correct registry addresses for the device type */
+	if (d->pf_device)
+		reg_addr = &pf_reg_addr;
+	else
+		reg_addr = &vf_reg_addr;
+
+	/* Read the populated cfg from ACC100 registers */
+	fetch_acc100_config(dev);
+
+	/* Mark as configured properly */
+	d->configured = true;
+
+	/* Release AXI from PF */
+	if (d->pf_device)
+		acc100_reg_write(d, HWPfDmaAxiControl, 1);
+
+	acc100_reg_write(d, reg_addr->dma_ring_ul5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_ul5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->dma_ring_dl5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_dl5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->dma_ring_ul4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_ul4g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->dma_ring_dl4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_dl4g_lo, phys_low);
+
+	/*
+	 * Configure Ring Size to the max queue ring size
+	 * (used for wrapping purpose)
+	 */
+	payload = log2_basic(d->sw_ring_size / 64);
+	acc100_reg_write(d, reg_addr->ring_size, payload);
+
+	/* Configure tail pointer for use when SDONE enabled */
+	d->tail_ptrs = rte_zmalloc_socket(
+			dev->device->driver->name,
+			ACC100_NUM_QGRPS * ACC100_NUM_AQS * sizeof(uint32_t),
+			RTE_CACHE_LINE_SIZE, socket_id);
+	if (d->tail_ptrs == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate tail ptr for %s:%u",
+				dev->device->driver->name,
+				dev->data->dev_id);
+		rte_free(d->sw_rings);
+		return -ENOMEM;
+	}
+	d->tail_ptr_phys = rte_malloc_virt2iova(d->tail_ptrs);
+
+	phys_high = (uint32_t)(d->tail_ptr_phys >> 32);
+	phys_low  = (uint32_t)(d->tail_ptr_phys);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
+
+	d->harq_layout = rte_zmalloc_socket("HARQ Layout",
+			ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
+			RTE_CACHE_LINE_SIZE, dev->data->socket_id);
+	if (d->harq_layout == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate harq_layout for %s:%u",
+				dev->device->driver->name,
+				dev->data->dev_id);
+		rte_free(d->tail_ptrs);
+		rte_free(d->sw_rings);
+		return -ENOMEM;
+	}
+
+	rte_bbdev_log_debug(
+			"ACC100 (%s) configured sw_rings = %p, sw_rings_phys = %#"
+			PRIx64, dev->data->name, d->sw_rings, d->sw_rings_phys);
+
+	return 0;
+}
+
 /* Free 64MB memory used for software rings */
 static int
-acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
+acc100_dev_close(struct rte_bbdev *dev)
 {
+	struct acc100_device *d = dev->data->dev_private;
+	if (d->sw_rings_base != NULL) {
+		rte_free(d->tail_ptrs);
+		rte_free(d->sw_rings_base);
+		d->sw_rings_base = NULL;
+	}
+	usleep(1000);
+	return 0;
+}
+
+
+/**
+ * Report an ACC100 queue index which is free
+ * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
+ * Note: Only the VF0 bundle is supported in PF mode
+ */
+static int
+acc100_find_free_queue_idx(struct rte_bbdev *dev,
+		const struct rte_bbdev_queue_conf *conf)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	int op_2_acc[5] = {0, UL_4G, DL_4G, UL_5G, DL_5G};
+	int acc = op_2_acc[conf->op_type];
+	struct rte_q_topology_t *qtop = NULL;
+	qtopFromAcc(&qtop, acc, &(d->acc100_conf));
+	if (qtop == NULL)
+		return -1;
+	/* Identify the matching QGroup index (sorted in priority order) */
+	uint16_t group_idx = qtop->first_qgroup_index;
+	group_idx += conf->priority;
+	if (group_idx >= ACC100_NUM_QGRPS ||
+			conf->priority >= qtop->num_qgroups) {
+		rte_bbdev_log(INFO, "Invalid Priority on %s, priority %u",
+				dev->data->name, conf->priority);
+		return -1;
+	}
+	/* Find a free AQ_idx */
+	uint16_t aq_idx;
+	for (aq_idx = 0; aq_idx < qtop->num_aqs_per_groups; aq_idx++) {
+		if (((d->q_assigned_bit_map[group_idx] >> aq_idx) & 0x1) == 0) {
+			/* Mark the Queue as assigned */
+			d->q_assigned_bit_map[group_idx] |= (1 << aq_idx);
+			/* Report the AQ Index */
+			return (group_idx << GRP_ID_SHIFT) + aq_idx;
+		}
+	}
+	rte_bbdev_log(INFO, "Failed to find free queue on %s, priority %u",
+			dev->data->name, conf->priority);
+	return -1;
+}
+
+/* Setup ACC100 queue */
+static int
+acc100_queue_setup(struct rte_bbdev *dev, uint16_t queue_id,
+		const struct rte_bbdev_queue_conf *conf)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	struct acc100_queue *q;
+	int16_t q_idx;
+
+	/* Allocate the queue data structure. */
+	q = rte_zmalloc_socket(dev->device->driver->name, sizeof(*q),
+			RTE_CACHE_LINE_SIZE, conf->socket);
+	if (q == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate queue memory");
+		return -ENOMEM;
+	}
+
+	q->d = d;
+	q->ring_addr = RTE_PTR_ADD(d->sw_rings, (d->sw_ring_size * queue_id));
+	q->ring_addr_phys = d->sw_rings_phys + (d->sw_ring_size * queue_id);
+
+	/* Prepare the Ring with default descriptor format */
+	union acc100_dma_desc *desc = NULL;
+	unsigned int desc_idx, b_idx;
+	int fcw_len = (conf->op_type == RTE_BBDEV_OP_LDPC_ENC ?
+		ACC100_FCW_LE_BLEN : (conf->op_type == RTE_BBDEV_OP_TURBO_DEC ?
+		ACC100_FCW_TD_BLEN : ACC100_FCW_LD_BLEN));
+
+	for (desc_idx = 0; desc_idx < d->sw_ring_max_depth; desc_idx++) {
+		desc = q->ring_addr + desc_idx;
+		desc->req.word0 = ACC100_DMA_DESC_TYPE;
+		desc->req.word1 = 0; /**< Timestamp */
+		desc->req.word2 = 0;
+		desc->req.word3 = 0;
+		uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+		desc->req.data_ptrs[0].blen = fcw_len;
+		desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
+		desc->req.data_ptrs[0].last = 0;
+		desc->req.data_ptrs[0].dma_ext = 0;
+		for (b_idx = 1; b_idx < ACC100_DMA_MAX_NUM_POINTERS - 1;
+				b_idx++) {
+			desc->req.data_ptrs[b_idx].blkid = ACC100_DMA_BLKID_IN;
+			desc->req.data_ptrs[b_idx].last = 1;
+			desc->req.data_ptrs[b_idx].dma_ext = 0;
+			b_idx++;
+			desc->req.data_ptrs[b_idx].blkid =
+					ACC100_DMA_BLKID_OUT_ENC;
+			desc->req.data_ptrs[b_idx].last = 1;
+			desc->req.data_ptrs[b_idx].dma_ext = 0;
+		}
+		/* Preset some fields of LDPC FCW */
+		desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
+		desc->req.fcw_ld.gain_i = 1;
+		desc->req.fcw_ld.gain_h = 1;
+	}
+
+	q->lb_in = rte_zmalloc_socket(dev->device->driver->name,
+			RTE_CACHE_LINE_SIZE,
+			RTE_CACHE_LINE_SIZE, conf->socket);
+	if (q->lb_in == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate lb_in memory");
+		rte_free(q);
+		return -ENOMEM;
+	}
+	q->lb_in_addr_phys = rte_malloc_virt2iova(q->lb_in);
+	q->lb_out = rte_zmalloc_socket(dev->device->driver->name,
+			RTE_CACHE_LINE_SIZE,
+			RTE_CACHE_LINE_SIZE, conf->socket);
+	if (q->lb_out == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate lb_out memory");
+		rte_free(q->lb_in);
+		rte_free(q);
+		return -ENOMEM;
+	}
+	q->lb_out_addr_phys = rte_malloc_virt2iova(q->lb_out);
+
+	/*
+	 * Software queue ring wraps synchronously with the HW when it reaches
+	 * the boundary of the maximum allocated queue size, no matter what the
+	 * sw queue size is. This wrapping is guarded by setting the wrap_mask
+	 * to represent the maximum queue size as allocated at the time when
+	 * the device has been setup (in configure()).
+	 *
+	 * The queue depth is set to the queue size value (conf->queue_size).
+	 * This limits the occupancy of the queue at any point of time, so that
+	 * the queue does not get swamped with enqueue requests.
+	 */
+	q->sw_ring_depth = conf->queue_size;
+	q->sw_ring_wrap_mask = d->sw_ring_max_depth - 1;
+
+	q->op_type = conf->op_type;
+
+	q_idx = acc100_find_free_queue_idx(dev, conf);
+	if (q_idx == -1) {
+		rte_free(q);
+		return -1;
+	}
+
+	q->qgrp_id = (q_idx >> GRP_ID_SHIFT) & 0xF;
+	q->vf_id = (q_idx >> VF_ID_SHIFT)  & 0x3F;
+	q->aq_id = q_idx & 0xF;
+	q->aq_depth = (conf->op_type ==  RTE_BBDEV_OP_TURBO_DEC) ?
+			(1 << d->acc100_conf.q_ul_4g.aq_depth_log2) :
+			(1 << d->acc100_conf.q_dl_4g.aq_depth_log2);
+
+	q->mmio_reg_enqueue = RTE_PTR_ADD(d->mmio_base,
+			queue_offset(d->pf_device,
+					q->vf_id, q->qgrp_id, q->aq_id));
+
+	rte_bbdev_log_debug(
+			"Setup dev%u q%u: qgrp_id=%u, vf_id=%u, aq_id=%u, aq_depth=%u, mmio_reg_enqueue=%p",
+			dev->data->dev_id, queue_id, q->qgrp_id, q->vf_id,
+			q->aq_id, q->aq_depth, q->mmio_reg_enqueue);
+
+	dev->data->queues[queue_id].queue_private = q;
+	return 0;
+}
+
+/* Release ACC100 queue */
+static int
+acc100_queue_release(struct rte_bbdev *dev, uint16_t q_id)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	struct acc100_queue *q = dev->data->queues[q_id].queue_private;
+
+	if (q != NULL) {
+		/* Mark the Queue as un-assigned */
+		d->q_assigned_bit_map[q->qgrp_id] &= ~(1 << q->aq_id);
+		rte_free(q->lb_in);
+		rte_free(q->lb_out);
+		rte_free(q);
+		dev->data->queues[q_id].queue_private = NULL;
+	}
+
 	return 0;
 }
 
@@ -258,8 +673,11 @@
 }
 
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
+	.setup_queues = acc100_setup_queues,
 	.close = acc100_dev_close,
 	.info_get = acc100_dev_info_get,
+	.queue_setup = acc100_queue_setup,
+	.queue_release = acc100_queue_release,
 };
 
 /* ACC100 PCI PF address map */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 662e2c8..0e2b79c 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -518,11 +518,56 @@ struct acc100_registry_addr {
 	.ddr_range = HWVfDmaDdrBaseRangeRoVf,
 };
 
+/* Structure associated with each queue. */
+struct __rte_cache_aligned acc100_queue {
+	union acc100_dma_desc *ring_addr;  /* Virtual address of sw ring */
+	rte_iova_t ring_addr_phys;  /* Physical address of software ring */
+	uint32_t sw_ring_head;  /* software ring head */
+	uint32_t sw_ring_tail;  /* software ring tail */
+	/* software ring size (descriptors, not bytes) */
+	uint32_t sw_ring_depth;
+	/* mask used to wrap enqueued descriptors on the sw ring */
+	uint32_t sw_ring_wrap_mask;
+	/* MMIO register used to enqueue descriptors */
+	void *mmio_reg_enqueue;
+	uint8_t vf_id;  /* VF ID (max = 63) */
+	uint8_t qgrp_id;  /* Queue Group ID */
+	uint16_t aq_id;  /* Atomic Queue ID */
+	uint16_t aq_depth;  /* Depth of atomic queue */
+	uint32_t aq_enqueued;  /* Count how many "batches" have been enqueued */
+	uint32_t aq_dequeued;  /* Count how many "batches" have been dequeued */
+	uint32_t irq_enable;  /* Enable ops dequeue interrupts if set to 1 */
+	struct rte_mempool *fcw_mempool;  /* FCW mempool */
+	enum rte_bbdev_op_type op_type;  /* Type of this Queue: TE or TD */
+	/* Internal Buffers for loopback input */
+	uint8_t *lb_in;
+	uint8_t *lb_out;
+	rte_iova_t lb_in_addr_phys;
+	rte_iova_t lb_out_addr_phys;
+	struct acc100_device *d;
+};
+
 /* Private data structure for each ACC100 device */
 struct acc100_device {
 	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
+	void *sw_rings_base;  /* Base addr of un-aligned memory for sw rings */
+	void *sw_rings;  /* 64MBs of 64MB aligned memory for sw rings */
+	rte_iova_t sw_rings_phys;  /* Physical address of sw_rings */
+	/* Virtual address of the info memory routed to this function under
+	 * operation, whether it is PF or VF.
+	 */
+	union acc100_harq_layout_data *harq_layout;
+	uint32_t sw_ring_size;
 	uint32_t ddr_size; /* Size in kB */
+	uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */
+	rte_iova_t tail_ptr_phys; /* Physical address of tail pointers */
+	/* Max number of entries available for each queue in device, depending
+	 * on how many queues are enabled with configure()
+	 */
+	uint32_t sw_ring_max_depth;
 	struct acc100_conf acc100_conf; /* ACC100 Initial configuration */
+	/* Bitmap capturing which Queues have already been assigned */
+	uint16_t q_assigned_bit_map[ACC100_NUM_QGRPS];
 	bool pf_device; /**< True if this is a PF ACC100 device */
 	bool configured; /**< True if this ACC100 device is configured */
 };
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v8 05/10] baseband/acc100: add LDPC processing functions
  2020-09-28 23:52   ` [dpdk-dev] [PATCH v8 00/10] bbdev PMD ACC100 Nicolas Chautru
                       ` (3 preceding siblings ...)
  2020-09-28 23:52     ` [dpdk-dev] [PATCH v8 04/10] baseband/acc100: add queue configuration Nicolas Chautru
@ 2020-09-28 23:52     ` Nicolas Chautru
  2020-09-28 23:52     ` [dpdk-dev] [PATCH v8 06/10] baseband/acc100: add HARQ loopback support Nicolas Chautru
                       ` (4 subsequent siblings)
  9 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-28 23:52 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Adding LDPC decode and encode processing operations

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
Acked-by: Dave Burley <dave.burley@accelercomm.com>
---
 doc/guides/bbdevs/features/acc100.ini    |    8 +-
 drivers/baseband/acc100/rte_acc100_pmd.c | 1625 +++++++++++++++++++++++++++++-
 drivers/baseband/acc100/rte_acc100_pmd.h |    3 +
 3 files changed, 1630 insertions(+), 6 deletions(-)

diff --git a/doc/guides/bbdevs/features/acc100.ini b/doc/guides/bbdevs/features/acc100.ini
index c89a4d7..40c7adc 100644
--- a/doc/guides/bbdevs/features/acc100.ini
+++ b/doc/guides/bbdevs/features/acc100.ini
@@ -6,9 +6,9 @@
 [Features]
 Turbo Decoder (4G)     = N
 Turbo Encoder (4G)     = N
-LDPC Decoder (5G)      = N
-LDPC Encoder (5G)      = N
-LLR/HARQ Compression   = N
-External DDR Access    = N
+LDPC Decoder (5G)      = Y
+LDPC Encoder (5G)      = Y
+LLR/HARQ Compression   = Y
+External DDR Access    = Y
 HW Accelerated         = Y
 BBDEV API              = Y
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 7a21c57..b223547 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -15,6 +15,9 @@
 #include <rte_hexdump.h>
 #include <rte_pci.h>
 #include <rte_bus_pci.h>
+#ifdef RTE_BBDEV_OFFLOAD_COST
+#include <rte_cycles.h>
+#endif
 
 #include <rte_bbdev.h>
 #include <rte_bbdev_pmd.h>
@@ -449,7 +452,6 @@
 	return 0;
 }
 
-
 /**
 * Report an ACC100 queue index which is free
  * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
@@ -634,6 +636,46 @@
 	struct acc100_device *d = dev->data->dev_private;
 
 	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
+		{
+			.type   = RTE_BBDEV_OP_LDPC_ENC,
+			.cap.ldpc_enc = {
+				.capability_flags =
+					RTE_BBDEV_LDPC_RATE_MATCH |
+					RTE_BBDEV_LDPC_CRC_24B_ATTACH |
+					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
+				.num_buffers_src =
+						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+				.num_buffers_dst =
+						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+			}
+		},
+		{
+			.type   = RTE_BBDEV_OP_LDPC_DEC,
+			.cap.ldpc_dec = {
+			.capability_flags =
+				RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK |
+				RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP |
+				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
+				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
+#ifdef ACC100_EXT_MEM
+				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
+				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
+#endif
+				RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE |
+				RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS |
+				RTE_BBDEV_LDPC_DECODE_BYPASS |
+				RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
+				RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
+				RTE_BBDEV_LDPC_LLR_COMPRESSION,
+			.llr_size = 8,
+			.llr_decimals = 1,
+			.num_buffers_src =
+					RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+			.num_buffers_hard_out =
+					RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+			.num_buffers_soft_out = 0,
+			}
+		},
 		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
 	};
 
@@ -669,9 +711,14 @@
 	dev_info->cpu_flag_reqs = NULL;
 	dev_info->min_alignment = 64;
 	dev_info->capabilities = bbdev_capabilities;
+#ifdef ACC100_EXT_MEM
 	dev_info->harq_buffer_size = d->ddr_size;
+#else
+	dev_info->harq_buffer_size = 0;
+#endif
 }
 
+
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.setup_queues = acc100_setup_queues,
 	.close = acc100_dev_close,
@@ -696,6 +743,1577 @@
 	{.device_id = 0},
 };
 
+/* Read flag value 0/1 from bitmap */
+static inline bool
+check_bit(uint32_t bitmap, uint32_t bitmask)
+{
+	return bitmap & bitmask;
+}
+
+static inline char *
+mbuf_append(struct rte_mbuf *m_head, struct rte_mbuf *m, uint16_t len)
+{
+	if (unlikely(len > rte_pktmbuf_tailroom(m)))
+		return NULL;
+
+	char *tail = (char *)m->buf_addr + m->data_off + m->data_len;
+	m->data_len = (uint16_t)(m->data_len + len);
+	m_head->pkt_len  = (m_head->pkt_len + len);
+	return tail;
+}
+
+/* Compute value of k0.
+ * Based on 3GPP 38.212 Table 5.4.2.1-2
+ * Starting position of different redundancy versions, k0
+ */
+static inline uint16_t
+get_k0(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv_index)
+{
+	if (rv_index == 0)
+		return 0;
+	uint16_t n = (bg == 1 ? N_ZC_1 : N_ZC_2) * z_c;
+	if (n_cb == n) {
+		if (rv_index == 1)
+			return (bg == 1 ? K0_1_1 : K0_1_2) * z_c;
+		else if (rv_index == 2)
+			return (bg == 1 ? K0_2_1 : K0_2_2) * z_c;
+		else
+			return (bg == 1 ? K0_3_1 : K0_3_2) * z_c;
+	}
+	/* LBRM case - includes a division by N */
+	if (rv_index == 1)
+		return (((bg == 1 ? K0_1_1 : K0_1_2) * n_cb)
+				/ n) * z_c;
+	else if (rv_index == 2)
+		return (((bg == 1 ? K0_2_1 : K0_2_2) * n_cb)
+				/ n) * z_c;
+	else
+		return (((bg == 1 ? K0_3_1 : K0_3_2) * n_cb)
+				/ n) * z_c;
+}
+
+/* Fill in a frame control word for LDPC encoding. */
+static inline void
+acc100_fcw_le_fill(const struct rte_bbdev_enc_op *op,
+		struct acc100_fcw_le *fcw, int num_cb)
+{
+	fcw->qm = op->ldpc_enc.q_m;
+	fcw->nfiller = op->ldpc_enc.n_filler;
+	fcw->BG = (op->ldpc_enc.basegraph - 1);
+	fcw->Zc = op->ldpc_enc.z_c;
+	fcw->ncb = op->ldpc_enc.n_cb;
+	fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_enc.basegraph,
+			op->ldpc_enc.rv_index);
+	fcw->rm_e = op->ldpc_enc.cb_params.e;
+	fcw->crc_select = check_bit(op->ldpc_enc.op_flags,
+			RTE_BBDEV_LDPC_CRC_24B_ATTACH);
+	fcw->bypass_intlv = check_bit(op->ldpc_enc.op_flags,
+			RTE_BBDEV_LDPC_INTERLEAVER_BYPASS);
+	fcw->mcb_count = num_cb;
+}
+
+/* Fill in a frame control word for LDPC decoding. */
+static inline void
+acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
+		union acc100_harq_layout_data *harq_layout)
+{
+	uint16_t harq_out_length, harq_in_length, ncb_p, k0_p, parity_offset;
+	uint16_t harq_index;
+	uint32_t l;
+	bool harq_prun = false;
+
+	fcw->qm = op->ldpc_dec.q_m;
+	fcw->nfiller = op->ldpc_dec.n_filler;
+	fcw->BG = (op->ldpc_dec.basegraph - 1);
+	fcw->Zc = op->ldpc_dec.z_c;
+	fcw->ncb = op->ldpc_dec.n_cb;
+	fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_dec.basegraph,
+			op->ldpc_dec.rv_index);
+	if (op->ldpc_dec.code_block_mode == 1)
+		fcw->rm_e = op->ldpc_dec.cb_params.e;
+	else
+		fcw->rm_e = (op->ldpc_dec.tb_params.r <
+				op->ldpc_dec.tb_params.cab) ?
+						op->ldpc_dec.tb_params.ea :
+						op->ldpc_dec.tb_params.eb;
+
+	fcw->hcin_en = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE);
+	fcw->hcout_en = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE);
+	fcw->crc_select = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK);
+	fcw->bypass_dec = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_DECODE_BYPASS);
+	fcw->bypass_intlv = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS);
+	if (op->ldpc_dec.q_m == 1) {
+		fcw->bypass_intlv = 1;
+		fcw->qm = 2;
+	}
+	fcw->hcin_decomp_mode = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+	fcw->hcout_comp_mode = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+	fcw->llr_pack_mode = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_LLR_COMPRESSION);
+	harq_index = op->ldpc_dec.harq_combined_output.offset /
+			ACC100_HARQ_OFFSET;
+#ifdef ACC100_EXT_MEM
+	/* Limit cases when HARQ pruning is valid */
+	harq_prun = ((op->ldpc_dec.harq_combined_output.offset %
+			ACC100_HARQ_OFFSET) == 0) &&
+			(op->ldpc_dec.harq_combined_output.offset <= UINT16_MAX
+			* ACC100_HARQ_OFFSET);
+#endif
+	if (fcw->hcin_en > 0) {
+		harq_in_length = op->ldpc_dec.harq_combined_input.length;
+		if (fcw->hcin_decomp_mode > 0)
+			harq_in_length = harq_in_length * 8 / 6;
+		harq_in_length = RTE_ALIGN(harq_in_length, 64);
+		if ((harq_layout[harq_index].offset > 0) && harq_prun) {
+			rte_bbdev_log_debug("HARQ IN offset unexpected for now\n");
+			fcw->hcin_size0 = harq_layout[harq_index].size0;
+			fcw->hcin_offset = harq_layout[harq_index].offset;
+			fcw->hcin_size1 = harq_in_length -
+					harq_layout[harq_index].offset;
+		} else {
+			fcw->hcin_size0 = harq_in_length;
+			fcw->hcin_offset = 0;
+			fcw->hcin_size1 = 0;
+		}
+	} else {
+		fcw->hcin_size0 = 0;
+		fcw->hcin_offset = 0;
+		fcw->hcin_size1 = 0;
+	}
+
+	fcw->itmax = op->ldpc_dec.iter_max;
+	fcw->itstop = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE);
+	fcw->synd_precoder = fcw->itstop;
+	/*
+	 * These are all implicitly set
+	 * fcw->synd_post = 0;
+	 * fcw->so_en = 0;
+	 * fcw->so_bypass_rm = 0;
+	 * fcw->so_bypass_intlv = 0;
+	 * fcw->dec_convllr = 0;
+	 * fcw->hcout_convllr = 0;
+	 * fcw->hcout_size1 = 0;
+	 * fcw->so_it = 0;
+	 * fcw->hcout_offset = 0;
+	 * fcw->negstop_th = 0;
+	 * fcw->negstop_it = 0;
+	 * fcw->negstop_en = 0;
+	 * fcw->gain_i = 1;
+	 * fcw->gain_h = 1;
+	 */
+	if (fcw->hcout_en > 0) {
+		parity_offset = (op->ldpc_dec.basegraph == 1 ? 20 : 8)
+			* op->ldpc_dec.z_c - op->ldpc_dec.n_filler;
+		k0_p = (fcw->k0 > parity_offset) ?
+				fcw->k0 - op->ldpc_dec.n_filler : fcw->k0;
+		ncb_p = fcw->ncb - op->ldpc_dec.n_filler;
+		l = k0_p + fcw->rm_e;
+		harq_out_length = (uint16_t) fcw->hcin_size0;
+		harq_out_length = RTE_MIN(RTE_MAX(harq_out_length, l), ncb_p);
+		harq_out_length = (harq_out_length + 0x3F) & 0xFFC0;
+		if ((k0_p > fcw->hcin_size0 + ACC100_HARQ_OFFSET_THRESHOLD) &&
+				harq_prun) {
+			fcw->hcout_size0 = (uint16_t) fcw->hcin_size0;
+			fcw->hcout_offset = k0_p & 0xFFC0;
+			fcw->hcout_size1 = harq_out_length - fcw->hcout_offset;
+		} else {
+			fcw->hcout_size0 = harq_out_length;
+			fcw->hcout_size1 = 0;
+			fcw->hcout_offset = 0;
+		}
+		harq_layout[harq_index].offset = fcw->hcout_offset;
+		harq_layout[harq_index].size0 = fcw->hcout_size0;
+	} else {
+		fcw->hcout_size0 = 0;
+		fcw->hcout_size1 = 0;
+		fcw->hcout_offset = 0;
+	}
+}
+
+/**
+ * Fills descriptor with data pointers of one block type.
+ *
+ * @param desc
+ *   Pointer to DMA descriptor.
+ * @param input
+ *   Pointer to a pointer to the input data to be encoded. It may be advanced
+ *   to the next segment in the scatter-gather case.
+ * @param offset
+ *   Input offset in the rte_mbuf structure, used to calculate where the data
+ *   starts.
+ * @param cb_len
+ *   Length of the currently processed Code Block
+ * @param seg_total_left
+ *   Indicates how many bytes are still left in the segment (mbuf) for further
+ *   processing.
+ * @param next_triplet
+ *   Index for ACC100 DMA Descriptor triplet
+ *
+ * @return
+ *   Returns the index of the next triplet on success, or a negative value if
+ *   the lengths of the packet and the processed CB do not match.
+ *
+ */
+static inline int
+acc100_dma_fill_blk_type_in(struct acc100_dma_req_desc *desc,
+		struct rte_mbuf **input, uint32_t *offset, uint32_t cb_len,
+		uint32_t *seg_total_left, int next_triplet)
+{
+	uint32_t part_len;
+	struct rte_mbuf *m = *input;
+
+	part_len = (*seg_total_left < cb_len) ? *seg_total_left : cb_len;
+	cb_len -= part_len;
+	*seg_total_left -= part_len;
+
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(m, *offset);
+	desc->data_ptrs[next_triplet].blen = part_len;
+	desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
+	desc->data_ptrs[next_triplet].last = 0;
+	desc->data_ptrs[next_triplet].dma_ext = 0;
+	*offset += part_len;
+	next_triplet++;
+
+	while (cb_len > 0) {
+		if (next_triplet < ACC100_DMA_MAX_NUM_POINTERS &&
+				m->next != NULL) {
+
+			m = m->next;
+			*seg_total_left = rte_pktmbuf_data_len(m);
+			part_len = (*seg_total_left < cb_len) ?
+					*seg_total_left :
+					cb_len;
+			desc->data_ptrs[next_triplet].address =
+					rte_pktmbuf_iova_offset(m, 0);
+			desc->data_ptrs[next_triplet].blen = part_len;
+			desc->data_ptrs[next_triplet].blkid =
+					ACC100_DMA_BLKID_IN;
+			desc->data_ptrs[next_triplet].last = 0;
+			desc->data_ptrs[next_triplet].dma_ext = 0;
+			cb_len -= part_len;
+			*seg_total_left -= part_len;
+			/* Initializing offset for next segment (mbuf) */
+			*offset = part_len;
+			next_triplet++;
+		} else {
+			rte_bbdev_log(ERR,
+				"Some data still left for processing: "
+				"data_left: %u, next_triplet: %u, next_mbuf: %p",
+				cb_len, next_triplet, m->next);
+			return -EINVAL;
+		}
+	}
+	/* Store the new mbuf as it could have changed in the scatter-gather case */
+	*input = m;
+
+	return next_triplet;
+}
+
+/* Fills descriptor with data pointers of one block type.
+ * Returns index of next triplet on success, other value if lengths of
+ * output data and processed mbuf do not match.
+ */
+static inline int
+acc100_dma_fill_blk_type_out(struct acc100_dma_req_desc *desc,
+		struct rte_mbuf *output, uint32_t out_offset,
+		uint32_t output_len, int next_triplet, int blk_id)
+{
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(output, out_offset);
+	desc->data_ptrs[next_triplet].blen = output_len;
+	desc->data_ptrs[next_triplet].blkid = blk_id;
+	desc->data_ptrs[next_triplet].last = 0;
+	desc->data_ptrs[next_triplet].dma_ext = 0;
+	next_triplet++;
+
+	return next_triplet;
+}
+
+static inline int
+acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
+		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
+		struct rte_mbuf *output, uint32_t *in_offset,
+		uint32_t *out_offset, uint32_t *out_length,
+		uint32_t *mbuf_total_left, uint32_t *seg_total_left)
+{
+	int next_triplet = 1; /* FCW already done */
+	uint16_t K, in_length_in_bits, in_length_in_bytes;
+	struct rte_bbdev_op_ldpc_enc *enc = &op->ldpc_enc;
+
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /**< Timestamp could be disabled */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+
+	K = (enc->basegraph == 1 ? 22 : 10) * enc->z_c;
+	in_length_in_bits = K - enc->n_filler;
+	if ((enc->op_flags & RTE_BBDEV_LDPC_CRC_24A_ATTACH) ||
+			(enc->op_flags & RTE_BBDEV_LDPC_CRC_24B_ATTACH))
+		in_length_in_bits -= 24;
+	in_length_in_bytes = in_length_in_bits >> 3;
+
+	if (unlikely((*mbuf_total_left == 0) ||
+			(*mbuf_total_left < in_length_in_bytes))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, in_length_in_bytes);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
+			in_length_in_bytes,
+			seg_total_left, next_triplet);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= in_length_in_bytes;
+
+	/* Set output length */
+	/* Integer round up division by 8 */
+	*out_length = (enc->cb_params.e + 7) >> 3;
+
+	next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
+			*out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	op->ldpc_enc.output.length += *out_length;
+	*out_offset += *out_length;
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->data_ptrs[next_triplet - 1].dma_ext = 0;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
+
+static inline int
+acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
+		struct acc100_dma_req_desc *desc,
+		struct rte_mbuf **input, struct rte_mbuf *h_output,
+		uint32_t *in_offset, uint32_t *h_out_offset,
+		uint32_t *h_out_length, uint32_t *mbuf_total_left,
+		uint32_t *seg_total_left,
+		struct acc100_fcw_ld *fcw)
+{
+	struct rte_bbdev_op_ldpc_dec *dec = &op->ldpc_dec;
+	int next_triplet = 1; /* FCW already done */
+	uint32_t input_length;
+	uint16_t output_length, crc24_overlap = 0;
+	uint16_t sys_cols, K, h_p_size, h_np_size;
+	bool h_comp = check_bit(dec->op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /**< Timestamp could be disabled */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+
+	if (check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP))
+		crc24_overlap = 24;
+
+	/* Compute some LDPC BG lengths */
+	input_length = dec->cb_params.e;
+	if (check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_LLR_COMPRESSION))
+		input_length = (input_length * 3 + 3) / 4;
+	sys_cols = (dec->basegraph == 1) ? 22 : 10;
+	K = sys_cols * dec->z_c;
+	output_length = K - dec->n_filler - crc24_overlap;
+
+	if (unlikely((*mbuf_total_left == 0) ||
+			(*mbuf_total_left < input_length))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, input_length);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input,
+			in_offset, input_length,
+			seg_total_left, next_triplet);
+
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
+		h_p_size = fcw->hcin_size0 + fcw->hcin_size1;
+		if (h_comp)
+			h_p_size = (h_p_size * 3 + 3) / 4;
+		desc->data_ptrs[next_triplet].address =
+				dec->harq_combined_input.offset;
+		desc->data_ptrs[next_triplet].blen = h_p_size;
+		desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN_HARQ;
+		desc->data_ptrs[next_triplet].dma_ext = 1;
+#ifndef ACC100_EXT_MEM
+		acc100_dma_fill_blk_type_out(
+				desc,
+				op->ldpc_dec.harq_combined_input.data,
+				op->ldpc_dec.harq_combined_input.offset,
+				h_p_size,
+				next_triplet,
+				ACC100_DMA_BLKID_IN_HARQ);
+#endif
+		next_triplet++;
+	}
+
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= input_length;
+
+	next_triplet = acc100_dma_fill_blk_type_out(desc, h_output,
+			*h_out_offset, output_length >> 3, next_triplet,
+			ACC100_DMA_BLKID_OUT_HARD);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
+		/* Pruned size of the HARQ */
+		h_p_size = fcw->hcout_size0 + fcw->hcout_size1;
+		/* Non-Pruned size of the HARQ */
+		h_np_size = fcw->hcout_offset > 0 ?
+				fcw->hcout_offset + fcw->hcout_size1 :
+				h_p_size;
+		if (h_comp) {
+			h_np_size = (h_np_size * 3 + 3) / 4;
+			h_p_size = (h_p_size * 3 + 3) / 4;
+		}
+		dec->harq_combined_output.length = h_np_size;
+		desc->data_ptrs[next_triplet].address =
+				dec->harq_combined_output.offset;
+		desc->data_ptrs[next_triplet].blen = h_p_size;
+		desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_OUT_HARQ;
+		desc->data_ptrs[next_triplet].dma_ext = 1;
+#ifndef ACC100_EXT_MEM
+		acc100_dma_fill_blk_type_out(
+				desc,
+				dec->harq_combined_output.data,
+				dec->harq_combined_output.offset,
+				h_p_size,
+				next_triplet,
+				ACC100_DMA_BLKID_OUT_HARQ);
+#endif
+		next_triplet++;
+	}
+
+	*h_out_length = output_length >> 3;
+	dec->hard_output.length += *h_out_length;
+	*h_out_offset += *h_out_length;
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
+
+static inline void
+acc100_dma_desc_ld_update(struct rte_bbdev_dec_op *op,
+		struct acc100_dma_req_desc *desc,
+		struct rte_mbuf *input, struct rte_mbuf *h_output,
+		uint32_t *in_offset, uint32_t *h_out_offset,
+		uint32_t *h_out_length,
+		union acc100_harq_layout_data *harq_layout)
+{
+	int next_triplet = 1; /* FCW already done */
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(input, *in_offset);
+	next_triplet++;
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
+		struct rte_bbdev_op_data hi = op->ldpc_dec.harq_combined_input;
+		desc->data_ptrs[next_triplet].address = hi.offset;
+#ifndef ACC100_EXT_MEM
+		desc->data_ptrs[next_triplet].address =
+				rte_pktmbuf_iova_offset(hi.data, hi.offset);
+#endif
+		next_triplet++;
+	}
+
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(h_output, *h_out_offset);
+	*h_out_length = desc->data_ptrs[next_triplet].blen;
+	next_triplet++;
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
+		desc->data_ptrs[next_triplet].address =
+				op->ldpc_dec.harq_combined_output.offset;
+		/* Adjust based on previous operation */
+		struct rte_bbdev_dec_op *prev_op = desc->op_addr;
+		op->ldpc_dec.harq_combined_output.length =
+				prev_op->ldpc_dec.harq_combined_output.length;
+		int16_t hq_idx = op->ldpc_dec.harq_combined_output.offset /
+				ACC100_HARQ_OFFSET;
+		int16_t prev_hq_idx =
+				prev_op->ldpc_dec.harq_combined_output.offset
+				/ ACC100_HARQ_OFFSET;
+		harq_layout[hq_idx].val = harq_layout[prev_hq_idx].val;
+#ifndef ACC100_EXT_MEM
+		struct rte_bbdev_op_data ho =
+				op->ldpc_dec.harq_combined_output;
+		desc->data_ptrs[next_triplet].address =
+				rte_pktmbuf_iova_offset(ho.data, ho.offset);
+#endif
+		next_triplet++;
+	}
+
+	op->ldpc_dec.hard_output.length += *h_out_length;
+	desc->op_addr = op;
+}
+
+/* Enqueue a number of operations to HW and update software rings */
+static inline void
+acc100_dma_enqueue(struct acc100_queue *q, uint16_t n,
+		struct rte_bbdev_stats *queue_stats)
+{
+	union acc100_enqueue_reg_fmt enq_req;
+#ifdef RTE_BBDEV_OFFLOAD_COST
+	uint64_t start_time = 0;
+	queue_stats->acc_offload_cycles = 0;
+#else
+	RTE_SET_USED(queue_stats);
+#endif
+
+	enq_req.val = 0;
+	/* Set the address offset (binary 100) used for 256-byte DMA descriptors */
+	enq_req.addr_offset = ACC100_DESC_OFFSET;
+
+	/* Split ops into batches */
+	do {
+		union acc100_dma_desc *desc;
+		uint16_t enq_batch_size;
+		uint64_t offset;
+		rte_iova_t req_elem_addr;
+
+		enq_batch_size = RTE_MIN(n, MAX_ENQ_BATCH_SIZE);
+
+		/* Set flag on last descriptor in a batch */
+		desc = q->ring_addr + ((q->sw_ring_head + enq_batch_size - 1) &
+				q->sw_ring_wrap_mask);
+		desc->req.last_desc_in_batch = 1;
+
+		/* Calculate the 1st descriptor's address */
+		offset = ((q->sw_ring_head & q->sw_ring_wrap_mask) *
+				sizeof(union acc100_dma_desc));
+		req_elem_addr = q->ring_addr_phys + offset;
+
+		/* Fill enqueue struct */
+		enq_req.num_elem = enq_batch_size;
+		/* low 6 bits are not needed */
+		enq_req.req_elem_addr = (uint32_t)(req_elem_addr >> 6);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "Req sdone", desc, sizeof(*desc));
+#endif
+		rte_bbdev_log_debug(
+				"Enqueue %u reqs (phys %#"PRIx64") to reg %p",
+				enq_batch_size,
+				req_elem_addr,
+				(void *)q->mmio_reg_enqueue);
+
+		rte_wmb();
+
+#ifdef RTE_BBDEV_OFFLOAD_COST
+		/* Start time measurement for enqueue function offload. */
+		start_time = rte_rdtsc_precise();
+#endif
+		rte_bbdev_log(DEBUG, "Debug : MMIO Enqueue");
+		mmio_write(q->mmio_reg_enqueue, enq_req.val);
+
+#ifdef RTE_BBDEV_OFFLOAD_COST
+		queue_stats->acc_offload_cycles +=
+				rte_rdtsc_precise() - start_time;
+#endif
+
+		q->aq_enqueued++;
+		q->sw_ring_head += enq_batch_size;
+		n -= enq_batch_size;
+
+	} while (n);
+}
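The enqueue loop above indexes the software ring with `sw_ring_head & sw_ring_wrap_mask`, which relies on the ring depth being a power of two so that masking is equivalent to a modulo. A minimal sketch of that wrap-around indexing (the depth of 8 below is hypothetical, not the driver's actual ring size):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical ring depth; must be a power of two so that
 * (head & mask) is equivalent to (head % depth).
 */
#define RING_DEPTH 8u
#define RING_WRAP_MASK (RING_DEPTH - 1u)

/* Map a monotonically increasing head counter to a physical slot index. */
static uint16_t ring_slot(uint16_t head)
{
	return head & RING_WRAP_MASK;
}
```

The head counter itself never wraps explicitly; only the derived slot index does, which is why `sw_ring_head - sw_ring_tail` remains a valid occupancy count.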
+
+/* Enqueue a batch of muxed encode operations for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
+		uint16_t total_enqueued_cbs, int16_t num)
+{
+	union acc100_dma_desc *desc = NULL;
+	uint32_t out_length;
+	struct rte_mbuf *output_head, *output;
+	int i, next_triplet;
+	uint16_t in_length_in_bytes;
+	struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_le_fill(ops[0], &desc->req.fcw_le, num);
+
+	/* This could be done at polling */
+	desc->req.word0 = ACC100_DMA_DESC_TYPE;
+	desc->req.word1 = 0; /* Timestamp could be disabled */
+	desc->req.word2 = 0;
+	desc->req.word3 = 0;
+	desc->req.numCBs = num;
+
+	in_length_in_bytes = ops[0]->ldpc_enc.input.data->data_len;
+	out_length = (enc->cb_params.e + 7) >> 3;
+	desc->req.m2dlen = 1 + num;
+	desc->req.d2mlen = num;
+	next_triplet = 1;
+
+	for (i = 0; i < num; i++) {
+		desc->req.data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(ops[i]->ldpc_enc.input.data, 0);
+		desc->req.data_ptrs[next_triplet].blen = in_length_in_bytes;
+		next_triplet++;
+		desc->req.data_ptrs[next_triplet].address =
+				rte_pktmbuf_iova_offset(
+				ops[i]->ldpc_enc.output.data, 0);
+		desc->req.data_ptrs[next_triplet].blen = out_length;
+		next_triplet++;
+		ops[i]->ldpc_enc.output.length = out_length;
+		output_head = output = ops[i]->ldpc_enc.output.data;
+		mbuf_append(output_head, output, out_length);
+		output->data_len = out_length;
+	}
+
+	desc->req.op_addr = ops[0];
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_le,
+			sizeof(desc->req.fcw_le) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+	/* Multiple CBs (one per op) were successfully prepared to enqueue */
+	return num;
+}
+
+/* Enqueue one encode operation for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+		uint16_t total_enqueued_cbs)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+		seg_total_left;
+	struct rte_mbuf *input, *output_head, *output;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_le_fill(op, &desc->req.fcw_le, 1);
+
+	input = op->ldpc_enc.input.data;
+	output_head = output = op->ldpc_enc.output.data;
+	in_offset = op->ldpc_enc.input.offset;
+	out_offset = op->ldpc_enc.output.offset;
+	out_length = 0;
+	mbuf_total_left = op->ldpc_enc.input.length;
+	seg_total_left = rte_pktmbuf_data_len(op->ldpc_enc.input.data)
+			- in_offset;
+
+	ret = acc100_dma_desc_le_fill(op, &desc->req, &input, output,
+			&in_offset, &out_offset, &out_length, &mbuf_total_left,
+			&seg_total_left);
+
+	if (unlikely(ret < 0))
+		return ret;
+
+	mbuf_append(output_head, output, out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_le,
+			sizeof(desc->req.fcw_le) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+
+	/* Check if any data left after processing one CB */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left after processing one CB: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+/* Enqueue one decode operation for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs, bool same_op)
+{
+	int ret;
+
+	union acc100_dma_desc *desc;
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	struct rte_mbuf *input, *h_output_head, *h_output;
+	uint32_t in_offset, h_out_offset, mbuf_total_left, h_out_length = 0;
+	input = op->ldpc_dec.input.data;
+	h_output_head = h_output = op->ldpc_dec.hard_output.data;
+	in_offset = op->ldpc_dec.input.offset;
+	h_out_offset = op->ldpc_dec.hard_output.offset;
+	mbuf_total_left = op->ldpc_dec.input.length;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(input == NULL)) {
+		rte_bbdev_log(ERR, "Invalid mbuf pointer");
+		return -EFAULT;
+	}
+#endif
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+
+	if (same_op) {
+		union acc100_dma_desc *prev_desc;
+		desc_idx = ((q->sw_ring_head + total_enqueued_cbs - 1)
+				& q->sw_ring_wrap_mask);
+		prev_desc = q->ring_addr + desc_idx;
+		uint8_t *prev_ptr = (uint8_t *) prev_desc;
+		uint8_t *new_ptr = (uint8_t *) desc;
+		/* Copy first 4 words and BDESCs */
+		rte_memcpy(new_ptr, prev_ptr, 16);
+		rte_memcpy(new_ptr + 36, prev_ptr + 36, 40);
+		desc->req.op_addr = prev_desc->req.op_addr;
+		/* Copy FCW */
+		rte_memcpy(new_ptr + ACC100_DESC_FCW_OFFSET,
+				prev_ptr + ACC100_DESC_FCW_OFFSET,
+				ACC100_FCW_LD_BLEN);
+		acc100_dma_desc_ld_update(op, &desc->req, input, h_output,
+				&in_offset, &h_out_offset,
+				&h_out_length, harq_layout);
+	} else {
+		struct acc100_fcw_ld *fcw;
+		uint32_t seg_total_left;
+		fcw = &desc->req.fcw_ld;
+		acc100_fcw_ld_fill(op, fcw, harq_layout);
+
+		/* Special handling when overusing mbuf */
+		if (fcw->rm_e < MAX_E_MBUF)
+			seg_total_left = rte_pktmbuf_data_len(input)
+					- in_offset;
+		else
+			seg_total_left = fcw->rm_e;
+
+		ret = acc100_dma_desc_ld_fill(op, &desc->req, &input, h_output,
+				&in_offset, &h_out_offset,
+				&h_out_length, &mbuf_total_left,
+				&seg_total_left, fcw);
+		if (unlikely(ret < 0))
+			return ret;
+	}
+
+	/* Hard output */
+	mbuf_append(h_output_head, h_output, h_out_length);
+#ifndef ACC100_EXT_MEM
+	if (op->ldpc_dec.harq_combined_output.length > 0) {
+		/* Push the HARQ output into host memory */
+		struct rte_mbuf *hq_output_head, *hq_output;
+		hq_output_head = op->ldpc_dec.harq_combined_output.data;
+		hq_output = op->ldpc_dec.harq_combined_output.data;
+		mbuf_append(hq_output_head, hq_output,
+				op->ldpc_dec.harq_combined_output.length);
+	}
+#endif
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
+			sizeof(desc->req.fcw_ld) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+/* Enqueue one decode operation for ACC100 device in TB mode */
+static inline int
+enqueue_ldpc_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint8_t r, c;
+	uint32_t in_offset, h_out_offset,
+		h_out_length, mbuf_total_left, seg_total_left;
+	struct rte_mbuf *input, *h_output_head, *h_output;
+	uint16_t current_enqueued_cbs = 0;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+	acc100_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout);
+
+	input = op->ldpc_dec.input.data;
+	h_output_head = h_output = op->ldpc_dec.hard_output.data;
+	in_offset = op->ldpc_dec.input.offset;
+	h_out_offset = op->ldpc_dec.hard_output.offset;
+	h_out_length = 0;
+	mbuf_total_left = op->ldpc_dec.input.length;
+	c = op->ldpc_dec.tb_params.c;
+	r = op->ldpc_dec.tb_params.r;
+
+	while (mbuf_total_left > 0 && r < c) {
+
+		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+
+		/* Set up DMA descriptor */
+		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+				& q->sw_ring_wrap_mask);
+		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+		desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
+		ret = acc100_dma_desc_ld_fill(op, &desc->req, &input,
+				h_output, &in_offset, &h_out_offset,
+				&h_out_length,
+				&mbuf_total_left, &seg_total_left,
+				&desc->req.fcw_ld);
+
+		if (unlikely(ret < 0))
+			return ret;
+
+		/* Hard output */
+		mbuf_append(h_output_head, h_output, h_out_length);
+
+		/* Set total number of CBs in TB */
+		desc->req.cbs_in_tb = cbs_in_tb;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "FCW", &desc->req.fcw_td,
+				sizeof(desc->req.fcw_td) - 8);
+		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+		if (seg_total_left == 0) {
+			/* Go to the next mbuf */
+			input = input->next;
+			in_offset = 0;
+			h_output = h_output->next;
+			h_out_offset = 0;
+		}
+		total_enqueued_cbs++;
+		current_enqueued_cbs++;
+		r++;
+	}
+
+	if (unlikely(desc == NULL))
+		return current_enqueued_cbs;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Check if any CBs left for processing */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left for processing: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+	/* Set SDone on last CB descriptor for TB mode */
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	return current_enqueued_cbs;
+}
+
+/* Calculates number of CBs in processed encoder TB based on 'r' and input
+ * length.
+ */
+static inline uint8_t
+get_num_cbs_in_tb_enc(struct rte_bbdev_op_turbo_enc *turbo_enc)
+{
+	uint8_t c, c_neg, r, crc24_bits = 0;
+	uint16_t k, k_neg, k_pos;
+	uint8_t cbs_in_tb = 0;
+	int32_t length;
+
+	length = turbo_enc->input.length;
+	r = turbo_enc->tb_params.r;
+	c = turbo_enc->tb_params.c;
+	c_neg = turbo_enc->tb_params.c_neg;
+	k_neg = turbo_enc->tb_params.k_neg;
+	k_pos = turbo_enc->tb_params.k_pos;
+	crc24_bits = 0;
+	if (check_bit(turbo_enc->op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH))
+		crc24_bits = 24;
+	while (length > 0 && r < c) {
+		k = (r < c_neg) ? k_neg : k_pos;
+		length -= (k - crc24_bits) >> 3;
+		r++;
+		cbs_in_tb++;
+	}
+
+	return cbs_in_tb;
+}
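The counting loop above can be checked with concrete numbers. A simplified restatement of the logic follows; the parameter values in the usage below are illustrative only, not taken from any specific 3GPP configuration:

```c
#include <assert.h>
#include <stdint.h>

/* Simplified restatement of get_num_cbs_in_tb_enc(): count how many
 * code blocks a TB of `length` input bytes spans, starting at block r.
 * The first c_neg blocks use size k_neg, the rest k_pos; crc24_bits
 * are attached per CB and therefore not drawn from the input.
 */
static uint8_t
cbs_in_tb(int32_t length, uint8_t r, uint8_t c, uint8_t c_neg,
		uint16_t k_neg, uint16_t k_pos, uint8_t crc24_bits)
{
	uint8_t n = 0;
	uint16_t k;

	while (length > 0 && r < c) {
		k = (r < c_neg) ? k_neg : k_pos;
		length -= (k - crc24_bits) >> 3; /* payload bytes per CB */
		r++;
		n++;
	}
	return n;
}
```

For example, with c = 3, c_neg = 1, k_neg = 2560, k_pos = 3072 and CRC24B attached, one full TB carries (2560-24)/8 + 2*(3072-24)/8 = 1079 bytes, so 1079 input bytes map to 3 CBs.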
+
+/* Calculates number of CBs in processed decoder TB based on 'r' and input
+ * length.
+ */
+static inline uint16_t
+get_num_cbs_in_tb_dec(struct rte_bbdev_op_turbo_dec *turbo_dec)
+{
+	uint8_t c, c_neg, r = 0;
+	uint16_t kw, k, k_neg, k_pos, cbs_in_tb = 0;
+	int32_t length;
+
+	length = turbo_dec->input.length;
+	r = turbo_dec->tb_params.r;
+	c = turbo_dec->tb_params.c;
+	c_neg = turbo_dec->tb_params.c_neg;
+	k_neg = turbo_dec->tb_params.k_neg;
+	k_pos = turbo_dec->tb_params.k_pos;
+	while (length > 0 && r < c) {
+		k = (r < c_neg) ? k_neg : k_pos;
+		kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
+		length -= kw;
+		r++;
+		cbs_in_tb++;
+	}
+
+	return cbs_in_tb;
+}
+
+/* Calculates number of CBs in processed LDPC decoder TB based on 'r' and
+ * input length.
+ */
+static inline uint16_t
+get_num_cbs_in_tb_ldpc_dec(struct rte_bbdev_op_ldpc_dec *ldpc_dec)
+{
+	uint16_t r, cbs_in_tb = 0;
+	int32_t length = ldpc_dec->input.length;
+	r = ldpc_dec->tb_params.r;
+	while (length > 0 && r < ldpc_dec->tb_params.c) {
+		length -=  (r < ldpc_dec->tb_params.cab) ?
+				ldpc_dec->tb_params.ea :
+				ldpc_dec->tb_params.eb;
+		r++;
+		cbs_in_tb++;
+	}
+	return cbs_in_tb;
+}
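The LDPC variant splits the TB differently: the first `cab` code blocks carry `ea` bytes each and the remainder `eb` bytes. A hedged restatement with illustrative parameters (the values in the usage are made up for the example):

```c
#include <assert.h>
#include <stdint.h>

/* Simplified restatement of get_num_cbs_in_tb_ldpc_dec(): the first
 * `cab` code blocks consume `ea` bytes of input each, the rest `eb`.
 */
static uint16_t
ldpc_cbs_in_tb(int32_t length, uint16_t r, uint16_t c, uint16_t cab,
		uint32_t ea, uint32_t eb)
{
	uint16_t n = 0;

	while (length > 0 && r < c) {
		length -= (r < cab) ? ea : eb;
		r++;
		n++;
	}
	return n;
}
```

Note that a partially filled last CB still counts as one CB, since the loop runs while any input remains.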
+
+/* Check whether we can mux encode operations sharing a common FCW */
+static inline bool
+check_mux(struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	uint16_t i;
+	if (num == 1)
+		return false;
+	for (i = 1; i < num; ++i) {
+		/* Only mux compatible code blocks */
+		if (memcmp((uint8_t *)(&ops[i]->ldpc_enc) + ENC_OFFSET,
+				(uint8_t *)(&ops[0]->ldpc_enc) + ENC_OFFSET,
+				CMP_ENC_SIZE) != 0)
+			return false;
+	}
+	return true;
+}
+
+/* Enqueue encode operations for ACC100 device in CB mode. */
+static inline uint16_t
+acc100_enqueue_ldpc_enc_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i = 0;
+	union acc100_dma_desc *desc;
+	int ret, desc_idx = 0;
+	int16_t enq, left = num;
+
+	while (left > 0) {
+		if (unlikely(avail - 1 < 0))
+			break;
+		avail--;
+		enq = RTE_MIN(left, MUX_5GDL_DESC);
+		if (check_mux(&ops[i], enq)) {
+			ret = enqueue_ldpc_enc_n_op_cb(q, &ops[i],
+					desc_idx, enq);
+			if (ret < 0)
+				break;
+			i += enq;
+		} else {
+			ret = enqueue_ldpc_enc_one_op_cb(q, ops[i], desc_idx);
+			if (ret < 0)
+				break;
+			i++;
+		}
+		desc_idx++;
+		left = num - i;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode*/
+	desc = q->ring_addr + ((q->sw_ring_head + desc_idx - 1)
+			& q->sw_ring_wrap_mask);
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, desc_idx, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
+}
+
+/* Enqueue encode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	if (unlikely(num == 0))
+		return 0;
+	return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
+}
+
+/* Check whether consecutive decode operations can reuse a common FCW */
+static inline bool
+cmp_ldpc_dec_op(struct rte_bbdev_dec_op **ops)
+{
+	/* Only mux compatible code blocks */
+	return memcmp((uint8_t *)(&ops[0]->ldpc_dec) + DEC_OFFSET,
+			(uint8_t *)(&ops[1]->ldpc_dec) + DEC_OFFSET,
+			CMP_DEC_SIZE) == 0;
+}
+
+/* Enqueue decode operations for ACC100 device in TB mode */
+static uint16_t
+acc100_enqueue_ldpc_dec_tb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i, enqueued_cbs = 0;
+	uint8_t cbs_in_tb;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		cbs_in_tb = get_num_cbs_in_tb_ldpc_dec(&ops[i]->ldpc_dec);
+		/* Check if there is enough space for further processing */
+		if (unlikely(avail - cbs_in_tb < 0))
+			break;
+		avail -= cbs_in_tb;
+
+		ret = enqueue_ldpc_dec_one_op_tb(q, ops[i],
+				enqueued_cbs, cbs_in_tb);
+		if (ret < 0)
+			break;
+		enqueued_cbs += ret;
+	}
+
+	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+	return i;
+}
+
+/* Enqueue decode operations for ACC100 device in CB mode */
+static uint16_t
+acc100_enqueue_ldpc_dec_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i;
+	union acc100_dma_desc *desc;
+	int ret;
+	bool same_op = false;
+	for (i = 0; i < num; ++i) {
+		/* Check if there is enough space for further processing */
+		if (unlikely(avail - 1 < 0))
+			break;
+		avail -= 1;
+
+		if (i > 0)
+			same_op = cmp_ldpc_dec_op(&ops[i-1]);
+		rte_bbdev_log(INFO, "Op %d %d %d %d %d %d %d %d %d %d %d %d",
+			i, ops[i]->ldpc_dec.op_flags, ops[i]->ldpc_dec.rv_index,
+			ops[i]->ldpc_dec.iter_max, ops[i]->ldpc_dec.iter_count,
+			ops[i]->ldpc_dec.basegraph, ops[i]->ldpc_dec.z_c,
+			ops[i]->ldpc_dec.n_cb, ops[i]->ldpc_dec.q_m,
+			ops[i]->ldpc_dec.n_filler, ops[i]->ldpc_dec.cb_params.e,
+			same_op);
+		ret = enqueue_ldpc_dec_one_op_cb(q, ops[i], i, same_op);
+		if (ret < 0)
+			break;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode*/
+	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
+			& q->sw_ring_wrap_mask);
+
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, i, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+	return i;
+}
+
+/* Enqueue decode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t aq_avail = q->aq_depth +
+			(q->aq_dequeued - q->aq_enqueued) / 128;
+
+	if (unlikely((aq_avail == 0) || (num == 0)))
+		return 0;
+
+	if (ops[0]->ldpc_dec.code_block_mode == 0)
+		return acc100_enqueue_ldpc_dec_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_ldpc_dec_cb(q_data, ops, num);
+}
+
+/* Dequeue one encode operation from ACC100 device in CB mode */
+static inline int
+dequeue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
+		uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_enc_op *op;
+	int i;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	rsp.val = atom_desc.rsp.val;
+	rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+
+	op->status |= ((rsp.input_err)
+			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+	if (desc->req.last_desc_in_batch) {
+		(*aq_dequeued)++;
+		desc->req.last_desc_in_batch = 0;
+	}
+	desc->rsp.val = ACC100_DMA_DESC_TYPE;
+	desc->rsp.add_info_0 = 0; /* Reserved bits */
+	desc->rsp.add_info_1 = 0; /* Reserved bits */
+
+	/* Flag that the muxing causes loss of opaque data */
+	op->opaque_data = (void *)-1;
+	for (i = 0 ; i < desc->req.numCBs; i++)
+		ref_op[i] = op;
+
+	/* One CB (op) was successfully dequeued */
+	return desc->req.numCBs;
+}
+
+/* Dequeue one encode operation from ACC100 device in TB mode */
+static inline int
+dequeue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
+		uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, *last_desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_enc_op *op;
+	uint8_t i = 0;
+	uint16_t current_dequeued_cbs = 0, cbs_in_tb;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	/* Get number of CBs in dequeued TB */
+	cbs_in_tb = desc->req.cbs_in_tb;
+	/* Get last CB */
+	last_desc = q->ring_addr + ((q->sw_ring_tail
+			+ total_dequeued_cbs + cbs_in_tb - 1)
+			& q->sw_ring_wrap_mask);
+	/* Check if last CB in TB is ready to dequeue (and thus
+	 * the whole TB) - checking sdone bit. If not return.
+	 */
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
+			__ATOMIC_RELAXED);
+	if (!(atom_desc.rsp.val & ACC100_SDONE))
+		return -1;
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+
+	while (i < cbs_in_tb) {
+		desc = q->ring_addr + ((q->sw_ring_tail
+				+ total_dequeued_cbs)
+				& q->sw_ring_wrap_mask);
+		atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+				__ATOMIC_RELAXED);
+		rsp.val = atom_desc.rsp.val;
+		rte_bbdev_log_debug("Resp. desc %p: %x", desc,
+				rsp.val);
+
+		op->status |= ((rsp.input_err)
+				? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+		op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+		op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+		if (desc->req.last_desc_in_batch) {
+			(*aq_dequeued)++;
+			desc->req.last_desc_in_batch = 0;
+		}
+		desc->rsp.val = ACC100_DMA_DESC_TYPE;
+		desc->rsp.add_info_0 = 0;
+		desc->rsp.add_info_1 = 0;
+		total_dequeued_cbs++;
+		current_dequeued_cbs++;
+		i++;
+	}
+
+	*ref_op = op;
+
+	return current_dequeued_cbs;
+}
+
+/* Dequeue one decode operation from ACC100 device in CB mode */
+static inline int
+dequeue_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
+		struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_dec_op *op;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	rsp.val = atom_desc.rsp.val;
+	rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+	op->status |= ((rsp.input_err)
+			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+	if (op->status != 0)
+		q_data->queue_stats.dequeue_err_count++;
+
+	/* CRC invalid if error exists */
+	if (!op->status)
+		op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+	op->turbo_dec.iter_count = (uint8_t) rsp.iter_cnt / 2;
+	/* Check if this is the last desc in batch (Atomic Queue) */
+	if (desc->req.last_desc_in_batch) {
+		(*aq_dequeued)++;
+		desc->req.last_desc_in_batch = 0;
+	}
+	desc->rsp.val = ACC100_DMA_DESC_TYPE;
+	desc->rsp.add_info_0 = 0;
+	desc->rsp.add_info_1 = 0;
+	*ref_op = op;
+
+	/* One CB (op) was successfully dequeued */
+	return 1;
+}
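The dequeue paths above fold the per-descriptor response flags into the op's status word, where each rte_bbdev error condition occupies one bit position, and report CRC status only when no other error fired. A minimal sketch of that folding, using hypothetical bit positions in place of the real `RTE_BBDEV_*` enum values:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the rte_bbdev status bit positions. */
enum { DATA_ERROR = 0, DRV_ERROR = 1, CRC_ERROR = 2 };

/* Fold response flags into a status mask, mirroring the CB dequeue
 * path: the CRC result is only recorded when no other error is set.
 */
static uint32_t
fold_status(int input_err, int dma_err, int fcw_err, int crc_bad)
{
	uint32_t status = 0;

	status |= (uint32_t)(input_err ? 1u : 0u) << DATA_ERROR;
	status |= (uint32_t)(dma_err ? 1u : 0u) << DRV_ERROR;
	status |= (uint32_t)(fcw_err ? 1u : 0u) << DRV_ERROR;
	if (status == 0)
		status |= (uint32_t)(crc_bad ? 1u : 0u) << CRC_ERROR;
	return status;
}
```

Note that `dma_err` and `fcw_err` share the same `DRV_ERROR` bit, exactly as in the driver, so the two causes are not distinguishable from the status word alone.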
+
+/* Dequeue one decode operation from ACC100 device in CB mode */
+static inline int
+dequeue_ldpc_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
+		struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_dec_op *op;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	rsp.val = atom_desc.rsp.val;
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+	op->status |= rsp.input_err << RTE_BBDEV_DATA_ERROR;
+	op->status |= rsp.dma_err << RTE_BBDEV_DRV_ERROR;
+	op->status |= rsp.fcw_err << RTE_BBDEV_DRV_ERROR;
+	if (op->status != 0)
+		q_data->queue_stats.dequeue_err_count++;
+
+	op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+	if (op->ldpc_dec.hard_output.length > 0 && !rsp.synd_ok)
+		op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
+	op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
+
+	/* Check if this is the last desc in batch (Atomic Queue) */
+	if (desc->req.last_desc_in_batch) {
+		(*aq_dequeued)++;
+		desc->req.last_desc_in_batch = 0;
+	}
+
+	desc->rsp.val = ACC100_DMA_DESC_TYPE;
+	desc->rsp.add_info_0 = 0;
+	desc->rsp.add_info_1 = 0;
+
+	*ref_op = op;
+
+	/* One CB (op) was successfully dequeued */
+	return 1;
+}
+
+/* Dequeue one decode operation from ACC100 device in TB mode. */
+static inline int
+dequeue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, *last_desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_dec_op *op;
+	uint8_t cbs_in_tb = 1, cb_idx = 0;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Get number of CBs in dequeued TB */
+	cbs_in_tb = desc->req.cbs_in_tb;
+	/* Get last CB */
+	last_desc = q->ring_addr + ((q->sw_ring_tail
+			+ dequeued_cbs + cbs_in_tb - 1)
+			& q->sw_ring_wrap_mask);
+	/* Check if last CB in TB is ready to dequeue (and thus
+	 * the whole TB) - checking sdone bit. If not return.
+	 */
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
+			__ATOMIC_RELAXED);
+	if (!(atom_desc.rsp.val & ACC100_SDONE))
+		return -1;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+
+	/* Read remaining CBs if exists */
+	while (cb_idx < cbs_in_tb) {
+		desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+				& q->sw_ring_wrap_mask);
+		atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+				__ATOMIC_RELAXED);
+		rsp.val = atom_desc.rsp.val;
+		rte_bbdev_log_debug("Resp. desc %p: %x", desc,
+				rsp.val);
+
+		op->status |= ((rsp.input_err)
+				? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+		op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+		op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+		/* CRC invalid if error exists */
+		if (!op->status)
+			op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+		op->turbo_dec.iter_count = RTE_MAX((uint8_t) rsp.iter_cnt,
+				op->turbo_dec.iter_count);
+
+		/* Check if this is the last desc in batch (Atomic Queue) */
+		if (desc->req.last_desc_in_batch) {
+			(*aq_dequeued)++;
+			desc->req.last_desc_in_batch = 0;
+		}
+		desc->rsp.val = ACC100_DMA_DESC_TYPE;
+		desc->rsp.add_info_0 = 0;
+		desc->rsp.add_info_1 = 0;
+		dequeued_cbs++;
+		cb_idx++;
+	}
+
+	*ref_op = op;
+
+	return cb_idx;
+}
+
+/* Dequeue LDPC encode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t dequeue_num, i, dequeued_cbs = 0, dequeued_descs = 0;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = (avail < num) ? avail : num;
+
+	for (i = 0; i < dequeue_num; i++) {
+		ret = dequeue_enc_one_op_cb(q, &ops[dequeued_cbs],
+				dequeued_descs, &aq_dequeued);
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+		dequeued_descs++;
+		if (dequeued_cbs >= num)
+			break;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_descs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += dequeued_cbs;
+
+	return dequeued_cbs;
+}
+
+/* Dequeue decode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint16_t dequeue_num;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t i;
+	uint16_t dequeued_cbs = 0;
+	struct rte_bbdev_dec_op *op;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = (avail < num) ? avail : num;
+
+	for (i = 0; i < dequeue_num; ++i) {
+		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask))->req.op_addr;
+		if (op->ldpc_dec.code_block_mode == 0)
+			ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+		else
+			ret = dequeue_ldpc_dec_one_op_cb(
+					q_data, q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_cbs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += i;
+
+	return i;
+}
+
 /* Initialization Function */
 static void
 acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
@@ -703,6 +2321,10 @@
 	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
 
 	dev->dev_ops = &acc100_bbdev_ops;
+	dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
+	dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
+	dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
+	dev->dequeue_ldpc_dec_ops = acc100_dequeue_ldpc_dec;
 
 	((struct acc100_device *) dev->data->dev_private)->pf_device =
 			!strcmp(drv->driver.name,
@@ -815,4 +2437,3 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
 RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
 RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
 RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
-
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 0e2b79c..78686c1 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -88,6 +88,8 @@
 #define TMPL_PRI_3      0x0f0e0d0c
 #define QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
 #define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
+#define ACC100_FDONE    0x80000000
+#define ACC100_SDONE    0x40000000
 
 #define ACC100_NUM_TMPL  32
 #define VF_OFFSET_QOS 16 /* offset in Memory Space specific to QoS Mon */
@@ -398,6 +400,7 @@ struct __rte_packed acc100_dma_req_desc {
 union acc100_dma_desc {
 	struct acc100_dma_req_desc req;
 	union acc100_dma_rsp_desc rsp;
+	uint64_t atom_hdr;
 };
 
 
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v8 06/10] baseband/acc100: add HARQ loopback support
  2020-09-28 23:52   ` [dpdk-dev] [PATCH v8 00/10] bbdev PMD ACC100 Nicolas Chautru
                       ` (4 preceding siblings ...)
  2020-09-28 23:52     ` [dpdk-dev] [PATCH v8 05/10] baseband/acc100: add LDPC processing functions Nicolas Chautru
@ 2020-09-28 23:52     ` Nicolas Chautru
  2020-09-28 23:52     ` [dpdk-dev] [PATCH v8 07/10] baseband/acc100: add support for 4G processing Nicolas Chautru
                       ` (3 subsequent siblings)
  9 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-28 23:52 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add support for HARQ memory loopback.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 158 +++++++++++++++++++++++++++++++
 1 file changed, 158 insertions(+)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index b223547..e484c0a 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -658,6 +658,7 @@
 				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
 				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
 #ifdef ACC100_EXT_MEM
+				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK |
 				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
 				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
 #endif
@@ -1480,12 +1481,169 @@
 	return 1;
 }
 
+static inline int
+harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs) {
+	struct acc100_fcw_ld *fcw;
+	union acc100_dma_desc *desc;
+	int next_triplet = 1;
+	struct rte_mbuf *hq_output_head, *hq_output;
+	uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
+	if (harq_in_length == 0) {
+		rte_bbdev_log(ERR, "Loopback of invalid null size\n");
+		return -EINVAL;
+	}
+
+	int h_comp = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
+			) ? 1 : 0;
+	if (h_comp == 1)
+		harq_in_length = harq_in_length * 8 / 6;
+	harq_in_length = RTE_ALIGN(harq_in_length, 64);
+	uint16_t harq_dma_length_in = (h_comp == 0) ?
+			harq_in_length :
+			harq_in_length * 6 / 8;
+	uint16_t harq_dma_length_out = harq_dma_length_in;
+	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+	uint16_t harq_index = (ddr_mem_in ?
+			op->ldpc_dec.harq_combined_input.offset :
+			op->ldpc_dec.harq_combined_output.offset)
+			/ ACC100_HARQ_OFFSET;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	fcw = &desc->req.fcw_ld;
+	/* Set the FCW from loopback into DDR */
+	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
+	fcw->FCWversion = ACC100_FCW_VER;
+	fcw->qm = 2;
+	fcw->Zc = 384;
+	if (harq_in_length < 16 * N_ZC_1)
+		fcw->Zc = 16;
+	fcw->ncb = fcw->Zc * N_ZC_1;
+	fcw->rm_e = 2;
+	fcw->hcin_en = 1;
+	fcw->hcout_en = 1;
+
+	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d\n",
+			ddr_mem_in, harq_index,
+			harq_layout[harq_index].offset, harq_in_length,
+			harq_dma_length_in);
+
+	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
+		fcw->hcin_size0 = harq_layout[harq_index].size0;
+		fcw->hcin_offset = harq_layout[harq_index].offset;
+		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
+		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
+		if (h_comp == 1)
+			harq_dma_length_in = harq_dma_length_in * 6 / 8;
+	} else {
+		fcw->hcin_size0 = harq_in_length;
+	}
+	harq_layout[harq_index].val = 0;
+	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
+			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
+	fcw->hcout_size0 = harq_in_length;
+	fcw->hcin_decomp_mode = h_comp;
+	fcw->hcout_comp_mode = h_comp;
+	fcw->gain_i = 1;
+	fcw->gain_h = 1;
+
+	/* Set the prefix of descriptor. This could be done at polling */
+	desc->req.word0 = ACC100_DMA_DESC_TYPE;
+	desc->req.word1 = 0; /**< Timestamp could be disabled */
+	desc->req.word2 = 0;
+	desc->req.word3 = 0;
+	desc->req.numCBs = 1;
+
+	/* Null LLR input for Decoder */
+	desc->req.data_ptrs[next_triplet].address =
+			q->lb_in_addr_phys;
+	desc->req.data_ptrs[next_triplet].blen = 2;
+	desc->req.data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
+	desc->req.data_ptrs[next_triplet].last = 0;
+	desc->req.data_ptrs[next_triplet].dma_ext = 0;
+	next_triplet++;
+
+	/* HARQ Combine input from either Memory interface */
+	if (!ddr_mem_in) {
+		next_triplet = acc100_dma_fill_blk_type_out(&desc->req,
+				op->ldpc_dec.harq_combined_input.data,
+				op->ldpc_dec.harq_combined_input.offset,
+				harq_dma_length_in,
+				next_triplet,
+				ACC100_DMA_BLKID_IN_HARQ);
+	} else {
+		desc->req.data_ptrs[next_triplet].address =
+				op->ldpc_dec.harq_combined_input.offset;
+		desc->req.data_ptrs[next_triplet].blen =
+				harq_dma_length_in;
+		desc->req.data_ptrs[next_triplet].blkid =
+				ACC100_DMA_BLKID_IN_HARQ;
+		desc->req.data_ptrs[next_triplet].dma_ext = 1;
+		next_triplet++;
+	}
+	desc->req.data_ptrs[next_triplet - 1].last = 1;
+	desc->req.m2dlen = next_triplet;
+
+	/* Dropped decoder hard output */
+	desc->req.data_ptrs[next_triplet].address =
+			q->lb_out_addr_phys;
+	desc->req.data_ptrs[next_triplet].blen = BYTES_IN_WORD;
+	desc->req.data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_OUT_HARD;
+	desc->req.data_ptrs[next_triplet].last = 0;
+	desc->req.data_ptrs[next_triplet].dma_ext = 0;
+	next_triplet++;
+
+	/* HARQ Combine output to either Memory interface */
+	if (check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE
+			)) {
+		desc->req.data_ptrs[next_triplet].address =
+				op->ldpc_dec.harq_combined_output.offset;
+		desc->req.data_ptrs[next_triplet].blen =
+				harq_dma_length_out;
+		desc->req.data_ptrs[next_triplet].blkid =
+				ACC100_DMA_BLKID_OUT_HARQ;
+		desc->req.data_ptrs[next_triplet].dma_ext = 1;
+		next_triplet++;
+	} else {
+		hq_output_head = op->ldpc_dec.harq_combined_output.data;
+		hq_output = op->ldpc_dec.harq_combined_output.data;
+		next_triplet = acc100_dma_fill_blk_type_out(
+				&desc->req,
+				op->ldpc_dec.harq_combined_output.data,
+				op->ldpc_dec.harq_combined_output.offset,
+				harq_dma_length_out,
+				next_triplet,
+				ACC100_DMA_BLKID_OUT_HARQ);
+		/* HARQ output */
+		mbuf_append(hq_output_head, hq_output, harq_dma_length_out);
+		op->ldpc_dec.harq_combined_output.length =
+				harq_dma_length_out;
+	}
+	desc->req.data_ptrs[next_triplet - 1].last = 1;
+	desc->req.d2mlen = next_triplet - desc->req.m2dlen;
+	desc->req.op_addr = op;
+
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
 /** Enqueue one decode operation for ACC100 device in CB mode */
 static inline int
 enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
 		uint16_t total_enqueued_cbs, bool same_op)
 {
 	int ret;
+	if (unlikely(check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK))) {
+		ret = harq_loopback(q, op, total_enqueued_cbs);
+		return ret;
+	}
 
 	union acc100_dma_desc *desc;
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v8 07/10] baseband/acc100: add support for 4G processing
  2020-09-28 23:52   ` [dpdk-dev] [PATCH v8 00/10] bbdev PMD ACC100 Nicolas Chautru
                       ` (5 preceding siblings ...)
  2020-09-28 23:52     ` [dpdk-dev] [PATCH v8 06/10] baseband/acc100: add HARQ loopback support Nicolas Chautru
@ 2020-09-28 23:52     ` Nicolas Chautru
  2020-09-28 23:52     ` [dpdk-dev] [PATCH v8 08/10] baseband/acc100: add interrupt support to PMD Nicolas Chautru
                       ` (2 subsequent siblings)
  9 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-28 23:52 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add capability for 4G encode and decode processing.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 doc/guides/bbdevs/features/acc100.ini    |    4 +-
 drivers/baseband/acc100/rte_acc100_pmd.c | 1010 ++++++++++++++++++++++++++++--
 2 files changed, 945 insertions(+), 69 deletions(-)

diff --git a/doc/guides/bbdevs/features/acc100.ini b/doc/guides/bbdevs/features/acc100.ini
index 40c7adc..642cd48 100644
--- a/doc/guides/bbdevs/features/acc100.ini
+++ b/doc/guides/bbdevs/features/acc100.ini
@@ -4,8 +4,8 @@
 ; Refer to default.ini for the full list of available PMD features.
 ;
 [Features]
-Turbo Decoder (4G)     = N
-Turbo Encoder (4G)     = N
+Turbo Decoder (4G)     = Y
+Turbo Encoder (4G)     = Y
 LDPC Decoder (5G)      = Y
 LDPC Encoder (5G)      = Y
 LLR/HARQ Compression   = Y
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index e484c0a..7d4c3df 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -339,7 +339,6 @@
 	free_base_addresses(base_addrs, i);
 }
 
-
 /* Allocate 64MB memory used for all software rings */
 static int
 acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
@@ -637,6 +636,41 @@
 
 	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
 		{
+			.type = RTE_BBDEV_OP_TURBO_DEC,
+			.cap.turbo_dec = {
+				.capability_flags =
+					RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE |
+					RTE_BBDEV_TURBO_CRC_TYPE_24B |
+					RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
+					RTE_BBDEV_TURBO_EARLY_TERMINATION |
+					RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
+					RTE_BBDEV_TURBO_MAP_DEC |
+					RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP |
+					RTE_BBDEV_TURBO_DEC_SCATTER_GATHER,
+				.max_llr_modulus = INT8_MAX,
+				.num_buffers_src =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+				.num_buffers_hard_out =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+				.num_buffers_soft_out =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+			}
+		},
+		{
+			.type = RTE_BBDEV_OP_TURBO_ENC,
+			.cap.turbo_enc = {
+				.capability_flags =
+					RTE_BBDEV_TURBO_CRC_24B_ATTACH |
+					RTE_BBDEV_TURBO_RV_INDEX_BYPASS |
+					RTE_BBDEV_TURBO_RATE_MATCH |
+					RTE_BBDEV_TURBO_ENC_SCATTER_GATHER,
+				.num_buffers_src =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+				.num_buffers_dst =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+			}
+		},
+		{
 			.type   = RTE_BBDEV_OP_LDPC_ENC,
 			.cap.ldpc_enc = {
 				.capability_flags =
@@ -719,7 +753,6 @@
 #endif
 }
 
-
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.setup_queues = acc100_setup_queues,
 	.close = acc100_dev_close,
@@ -763,6 +796,58 @@
 	return tail;
 }
 
+/* Fill in a frame control word for turbo encoding. */
+static inline void
+acc100_fcw_te_fill(const struct rte_bbdev_enc_op *op, struct acc100_fcw_te *fcw)
+{
+	fcw->code_block_mode = op->turbo_enc.code_block_mode;
+	if (fcw->code_block_mode == 0) { /* For TB mode */
+		fcw->k_neg = op->turbo_enc.tb_params.k_neg;
+		fcw->k_pos = op->turbo_enc.tb_params.k_pos;
+		fcw->c_neg = op->turbo_enc.tb_params.c_neg;
+		fcw->c = op->turbo_enc.tb_params.c;
+		fcw->ncb_neg = op->turbo_enc.tb_params.ncb_neg;
+		fcw->ncb_pos = op->turbo_enc.tb_params.ncb_pos;
+
+		if (check_bit(op->turbo_enc.op_flags,
+				RTE_BBDEV_TURBO_RATE_MATCH)) {
+			fcw->bypass_rm = 0;
+			fcw->cab = op->turbo_enc.tb_params.cab;
+			fcw->ea = op->turbo_enc.tb_params.ea;
+			fcw->eb = op->turbo_enc.tb_params.eb;
+		} else {
+			/* E is set to the encoding output size when RM is
+			 * bypassed.
+			 */
+			fcw->bypass_rm = 1;
+			fcw->cab = fcw->c_neg;
+			fcw->ea = 3 * fcw->k_neg + 12;
+			fcw->eb = 3 * fcw->k_pos + 12;
+		}
+	} else { /* For CB mode */
+		fcw->k_pos = op->turbo_enc.cb_params.k;
+		fcw->ncb_pos = op->turbo_enc.cb_params.ncb;
+
+		if (check_bit(op->turbo_enc.op_flags,
+				RTE_BBDEV_TURBO_RATE_MATCH)) {
+			fcw->bypass_rm = 0;
+			fcw->eb = op->turbo_enc.cb_params.e;
+		} else {
+			/* E is set to the encoding output size when RM is
+			 * bypassed.
+			 */
+			fcw->bypass_rm = 1;
+			fcw->eb = 3 * fcw->k_pos + 12;
+		}
+	}
+
+	fcw->bypass_rv_idx1 = check_bit(op->turbo_enc.op_flags,
+			RTE_BBDEV_TURBO_RV_INDEX_BYPASS);
+	fcw->code_block_crc = check_bit(op->turbo_enc.op_flags,
+			RTE_BBDEV_TURBO_CRC_24B_ATTACH);
+	fcw->rv_idx1 = op->turbo_enc.rv_index;
+}
+
 /* Compute value of k0.
  * Based on 3GPP 38.212 Table 5.4.2.1-2
  * Starting position of different redundancy versions, k0
@@ -813,6 +898,25 @@
 	fcw->mcb_count = num_cb;
 }
 
+/* Fill in a frame control word for turbo decoding. */
+static inline void
+acc100_fcw_td_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_td *fcw)
+{
+	/* Note: Early termination is always enabled for 4GUL */
+	fcw->fcw_ver = 1;
+	if (op->turbo_dec.code_block_mode == 0)
+		fcw->k_pos = op->turbo_dec.tb_params.k_pos;
+	else
+		fcw->k_pos = op->turbo_dec.cb_params.k;
+	fcw->turbo_crc_type = check_bit(op->turbo_dec.op_flags,
+			RTE_BBDEV_TURBO_CRC_TYPE_24B);
+	fcw->bypass_sb_deint = 0;
+	fcw->raw_decoder_input_on = 0;
+	fcw->max_iter = op->turbo_dec.iter_max;
+	fcw->half_iter_on = !check_bit(op->turbo_dec.op_flags,
+			RTE_BBDEV_TURBO_HALF_ITERATION_EVEN);
+}
+
 /* Fill in a frame control word for LDPC decoding. */
 static inline void
 acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
@@ -1042,6 +1146,87 @@
 }
 
 static inline int
+acc100_dma_desc_te_fill(struct rte_bbdev_enc_op *op,
+		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
+		struct rte_mbuf *output, uint32_t *in_offset,
+		uint32_t *out_offset, uint32_t *out_length,
+		uint32_t *mbuf_total_left, uint32_t *seg_total_left, uint8_t r)
+{
+	int next_triplet = 1; /* FCW already done */
+	uint32_t e, ea, eb, length;
+	uint16_t k, k_neg, k_pos;
+	uint8_t cab, c_neg;
+
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /**< Timestamp could be disabled */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+
+	if (op->turbo_enc.code_block_mode == 0) {
+		ea = op->turbo_enc.tb_params.ea;
+		eb = op->turbo_enc.tb_params.eb;
+		cab = op->turbo_enc.tb_params.cab;
+		k_neg = op->turbo_enc.tb_params.k_neg;
+		k_pos = op->turbo_enc.tb_params.k_pos;
+		c_neg = op->turbo_enc.tb_params.c_neg;
+		e = (r < cab) ? ea : eb;
+		k = (r < c_neg) ? k_neg : k_pos;
+	} else {
+		e = op->turbo_enc.cb_params.e;
+		k = op->turbo_enc.cb_params.k;
+	}
+
+	if (check_bit(op->turbo_enc.op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH))
+		length = (k - 24) >> 3;
+	else
+		length = k >> 3;
+
+	if (unlikely((*mbuf_total_left == 0) || (*mbuf_total_left < length))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, length);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
+			length, seg_total_left, next_triplet);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= length;
+
+	/* Set output length */
+	if (check_bit(op->turbo_enc.op_flags, RTE_BBDEV_TURBO_RATE_MATCH))
+		/* Integer round up division by 8 */
+		*out_length = (e + 7) >> 3;
+	else
+		*out_length = (k >> 3) * 3 + 2;
+
+	next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
+			*out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	op->turbo_enc.output.length += *out_length;
+	*out_offset += *out_length;
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
+
+static inline int
 acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
 		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
 		struct rte_mbuf *output, uint32_t *in_offset,
@@ -1110,6 +1295,117 @@
 }
 
 static inline int
+acc100_dma_desc_td_fill(struct rte_bbdev_dec_op *op,
+		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
+		struct rte_mbuf *h_output, struct rte_mbuf *s_output,
+		uint32_t *in_offset, uint32_t *h_out_offset,
+		uint32_t *s_out_offset, uint32_t *h_out_length,
+		uint32_t *s_out_length, uint32_t *mbuf_total_left,
+		uint32_t *seg_total_left, uint8_t r)
+{
+	int next_triplet = 1; /* FCW already done */
+	uint16_t k;
+	uint16_t crc24_overlap = 0;
+	uint32_t e, kw;
+
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /**< Timestamp could be disabled */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+
+	if (op->turbo_dec.code_block_mode == 0) {
+		k = (r < op->turbo_dec.tb_params.c_neg)
+			? op->turbo_dec.tb_params.k_neg
+			: op->turbo_dec.tb_params.k_pos;
+		e = (r < op->turbo_dec.tb_params.cab)
+			? op->turbo_dec.tb_params.ea
+			: op->turbo_dec.tb_params.eb;
+	} else {
+		k = op->turbo_dec.cb_params.k;
+		e = op->turbo_dec.cb_params.e;
+	}
+
+	if ((op->turbo_dec.code_block_mode == 0)
+		&& !check_bit(op->turbo_dec.op_flags,
+		RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP))
+		crc24_overlap = 24;
+
+	/* Calculate the circular buffer size.
+	 * According to 3GPP TS 36.212 section 5.1.4.2
+	 *   Kw = 3 * Kpi,
+	 * where:
+	 *   Kpi = nCol * nRow
+	 * where nCol is 32 and nRow can be calculated from:
+	 *   D <= nCol * nRow
+	 * where D is the size of each output from the turbo encoder block (k + 4).
+	 */
+	kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
+
+	if (unlikely((*mbuf_total_left == 0) || (*mbuf_total_left < kw))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, kw);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset, kw,
+			seg_total_left, next_triplet);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= kw;
+
+	next_triplet = acc100_dma_fill_blk_type_out(
+			desc, h_output, *h_out_offset,
+			k >> 3, next_triplet, ACC100_DMA_BLKID_OUT_HARD);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+
+	*h_out_length = ((k - crc24_overlap) >> 3);
+	op->turbo_dec.hard_output.length += *h_out_length;
+	*h_out_offset += *h_out_length;
+
+	/* Soft output */
+	if (check_bit(op->turbo_dec.op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT)) {
+		if (check_bit(op->turbo_dec.op_flags,
+				RTE_BBDEV_TURBO_EQUALIZER))
+			*s_out_length = e;
+		else
+			*s_out_length = (k * 3) + 12;
+
+		next_triplet = acc100_dma_fill_blk_type_out(desc, s_output,
+				*s_out_offset, *s_out_length, next_triplet,
+				ACC100_DMA_BLKID_OUT_SOFT);
+		if (unlikely(next_triplet < 0)) {
+			rte_bbdev_log(ERR,
+					"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+					op);
+			return -1;
+		}
+
+		op->turbo_dec.soft_output.length += *s_out_length;
+		*s_out_offset += *s_out_length;
+	}
+
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
+
+static inline int
 acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
 		struct acc100_dma_req_desc *desc,
 		struct rte_mbuf **input, struct rte_mbuf *h_output,
@@ -1374,6 +1670,57 @@
 
 /* Enqueue one encode operation for ACC100 device in CB mode */
 static inline int
+enqueue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+		uint16_t total_enqueued_cbs)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+		seg_total_left;
+	struct rte_mbuf *input, *output_head, *output;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_te_fill(op, &desc->req.fcw_te);
+
+	input = op->turbo_enc.input.data;
+	output_head = output = op->turbo_enc.output.data;
+	in_offset = op->turbo_enc.input.offset;
+	out_offset = op->turbo_enc.output.offset;
+	out_length = 0;
+	mbuf_total_left = op->turbo_enc.input.length;
+	seg_total_left = rte_pktmbuf_data_len(op->turbo_enc.input.data)
+			- in_offset;
+
+	ret = acc100_dma_desc_te_fill(op, &desc->req, &input, output,
+			&in_offset, &out_offset, &out_length, &mbuf_total_left,
+			&seg_total_left, 0);
+
+	if (unlikely(ret < 0))
+		return ret;
+
+	mbuf_append(output_head, output, out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_te,
+			sizeof(desc->req.fcw_te) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+
+	/* Check if any data left after processing one CB */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left after processing one CB: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+/* Enqueue multiple LDPC encode operations for ACC100 device in CB mode */
+static inline int
 enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
 		uint16_t total_enqueued_cbs, int16_t num)
 {
@@ -1481,78 +1828,235 @@
 	return 1;
 }
 
-static inline int
-harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
-		uint16_t total_enqueued_cbs) {
-	struct acc100_fcw_ld *fcw;
-	union acc100_dma_desc *desc;
-	int next_triplet = 1;
-	struct rte_mbuf *hq_output_head, *hq_output;
-	uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
-	if (harq_in_length == 0) {
-		rte_bbdev_log(ERR, "Loopback of invalid null size\n");
-		return -EINVAL;
-	}
 
-	int h_comp = check_bit(op->ldpc_dec.op_flags,
-			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
-			) ? 1 : 0;
-	if (h_comp == 1)
-		harq_in_length = harq_in_length * 8 / 6;
-	harq_in_length = RTE_ALIGN(harq_in_length, 64);
-	uint16_t harq_dma_length_in = (h_comp == 0) ?
-			harq_in_length :
-			harq_in_length * 6 / 8;
-	uint16_t harq_dma_length_out = harq_dma_length_in;
-	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
-			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
-	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
-	uint16_t harq_index = (ddr_mem_in ?
-			op->ldpc_dec.harq_combined_input.offset :
-			op->ldpc_dec.harq_combined_output.offset)
-			/ ACC100_HARQ_OFFSET;
+/* Enqueue one encode operation for ACC100 device in TB mode. */
+static inline int
+enqueue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint8_t r, c;
+	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+		seg_total_left;
+	struct rte_mbuf *input, *output_head, *output;
+	uint16_t current_enqueued_cbs = 0;
 
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
-	fcw = &desc->req.fcw_ld;
-	/* Set the FCW from loopback into DDR */
-	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
-	fcw->FCWversion = ACC100_FCW_VER;
-	fcw->qm = 2;
-	fcw->Zc = 384;
-	if (harq_in_length < 16 * N_ZC_1)
-		fcw->Zc = 16;
-	fcw->ncb = fcw->Zc * N_ZC_1;
-	fcw->rm_e = 2;
-	fcw->hcin_en = 1;
-	fcw->hcout_en = 1;
+	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+	acc100_fcw_te_fill(op, &desc->req.fcw_te);
 
-	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d\n",
-			ddr_mem_in, harq_index,
-			harq_layout[harq_index].offset, harq_in_length,
-			harq_dma_length_in);
+	input = op->turbo_enc.input.data;
+	output_head = output = op->turbo_enc.output.data;
+	in_offset = op->turbo_enc.input.offset;
+	out_offset = op->turbo_enc.output.offset;
+	out_length = 0;
+	mbuf_total_left = op->turbo_enc.input.length;
 
-	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
-		fcw->hcin_size0 = harq_layout[harq_index].size0;
-		fcw->hcin_offset = harq_layout[harq_index].offset;
-		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
-		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
-		if (h_comp == 1)
-			harq_dma_length_in = harq_dma_length_in * 6 / 8;
-	} else {
-		fcw->hcin_size0 = harq_in_length;
-	}
-	harq_layout[harq_index].val = 0;
-	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
-			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
-	fcw->hcout_size0 = harq_in_length;
-	fcw->hcin_decomp_mode = h_comp;
-	fcw->hcout_comp_mode = h_comp;
-	fcw->gain_i = 1;
-	fcw->gain_h = 1;
+	c = op->turbo_enc.tb_params.c;
+	r = op->turbo_enc.tb_params.r;
 
-	/* Set the prefix of descriptor. This could be done at polling */
+	while (mbuf_total_left > 0 && r < c) {
+		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+		/* Set up DMA descriptor */
+		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+				& q->sw_ring_wrap_mask);
+		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+		desc->req.data_ptrs[0].blen = ACC100_FCW_TE_BLEN;
+
+		ret = acc100_dma_desc_te_fill(op, &desc->req, &input, output,
+				&in_offset, &out_offset, &out_length,
+				&mbuf_total_left, &seg_total_left, r);
+		if (unlikely(ret < 0))
+			return ret;
+		mbuf_append(output_head, output, out_length);
+
+		/* Set total number of CBs in TB */
+		desc->req.cbs_in_tb = cbs_in_tb;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "FCW", &desc->req.fcw_te,
+				sizeof(desc->req.fcw_te) - 8);
+		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+		if (seg_total_left == 0) {
+			/* Go to the next mbuf */
+			input = input->next;
+			in_offset = 0;
+			output = output->next;
+			out_offset = 0;
+		}
+
+		total_enqueued_cbs++;
+		current_enqueued_cbs++;
+		r++;
+	}
+
+	if (unlikely(desc == NULL))
+		return current_enqueued_cbs;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Check if any CBs left for processing */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left for processing: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+
+	/* Set SDone on last CB descriptor for TB mode. */
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	return current_enqueued_cbs;
+}
+
+/** Enqueue one decode operation for ACC100 device in CB mode */
+static inline int
+enqueue_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint32_t in_offset, h_out_offset, s_out_offset, s_out_length,
+		h_out_length, mbuf_total_left, seg_total_left;
+	struct rte_mbuf *input, *h_output_head, *h_output,
+		*s_output_head, *s_output;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_td_fill(op, &desc->req.fcw_td);
+
+	input = op->turbo_dec.input.data;
+	h_output_head = h_output = op->turbo_dec.hard_output.data;
+	s_output_head = s_output = op->turbo_dec.soft_output.data;
+	in_offset = op->turbo_dec.input.offset;
+	h_out_offset = op->turbo_dec.hard_output.offset;
+	s_out_offset = op->turbo_dec.soft_output.offset;
+	h_out_length = s_out_length = 0;
+	mbuf_total_left = op->turbo_dec.input.length;
+	seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(input == NULL)) {
+		rte_bbdev_log(ERR, "Invalid mbuf pointer");
+		return -EFAULT;
+	}
+#endif
+
+	/* Set up DMA descriptor */
+	desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+
+	ret = acc100_dma_desc_td_fill(op, &desc->req, &input, h_output,
+			s_output, &in_offset, &h_out_offset, &s_out_offset,
+			&h_out_length, &s_out_length, &mbuf_total_left,
+			&seg_total_left, 0);
+
+	if (unlikely(ret < 0))
+		return ret;
+
+	/* Hard output */
+	mbuf_append(h_output_head, h_output, h_out_length);
+
+	/* Soft output */
+	if (check_bit(op->turbo_dec.op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT))
+		mbuf_append(s_output_head, s_output, s_out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_td,
+			sizeof(desc->req.fcw_td) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+
+	/* Check if any CBs left for processing */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left after processing one CB: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+static inline int
+harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs) {
+	struct acc100_fcw_ld *fcw;
+	union acc100_dma_desc *desc;
+	int next_triplet = 1;
+	struct rte_mbuf *hq_output_head, *hq_output;
+	uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
+	if (harq_in_length == 0) {
+		rte_bbdev_log(ERR, "Loopback of invalid null size\n");
+		return -EINVAL;
+	}
+
+	int h_comp = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
+			) ? 1 : 0;
+	if (h_comp == 1)
+		harq_in_length = harq_in_length * 8 / 6;
+	harq_in_length = RTE_ALIGN(harq_in_length, 64);
+	uint16_t harq_dma_length_in = (h_comp == 0) ?
+			harq_in_length :
+			harq_in_length * 6 / 8;
+	uint16_t harq_dma_length_out = harq_dma_length_in;
+	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+	uint16_t harq_index = (ddr_mem_in ?
+			op->ldpc_dec.harq_combined_input.offset :
+			op->ldpc_dec.harq_combined_output.offset)
+			/ ACC100_HARQ_OFFSET;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	fcw = &desc->req.fcw_ld;
+	/* Set the FCW from loopback into DDR */
+	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
+	fcw->FCWversion = ACC100_FCW_VER;
+	fcw->qm = 2;
+	fcw->Zc = 384;
+	if (harq_in_length < 16 * N_ZC_1)
+		fcw->Zc = 16;
+	fcw->ncb = fcw->Zc * N_ZC_1;
+	fcw->rm_e = 2;
+	fcw->hcin_en = 1;
+	fcw->hcout_en = 1;
+
+	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d\n",
+			ddr_mem_in, harq_index,
+			harq_layout[harq_index].offset, harq_in_length,
+			harq_dma_length_in);
+
+	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
+		fcw->hcin_size0 = harq_layout[harq_index].size0;
+		fcw->hcin_offset = harq_layout[harq_index].offset;
+		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
+		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
+		if (h_comp == 1)
+			harq_dma_length_in = harq_dma_length_in * 6 / 8;
+	} else {
+		fcw->hcin_size0 = harq_in_length;
+	}
+	harq_layout[harq_index].val = 0;
+	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
+			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
+	fcw->hcout_size0 = harq_in_length;
+	fcw->hcin_decomp_mode = h_comp;
+	fcw->hcout_comp_mode = h_comp;
+	fcw->gain_i = 1;
+	fcw->gain_h = 1;
+
+	/* Set the prefix of descriptor. This could be done at polling */
 	desc->req.word0 = ACC100_DMA_DESC_TYPE;
 	desc->req.word1 = 0; /**< Timestamp could be disabled */
 	desc->req.word2 = 0;
@@ -1816,6 +2320,107 @@
 	return current_enqueued_cbs;
 }
 
+/* Enqueue one decode operation for ACC100 device in TB mode */
+static inline int
+enqueue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint8_t r, c;
+	uint32_t in_offset, h_out_offset, s_out_offset, s_out_length,
+		h_out_length, mbuf_total_left, seg_total_left;
+	struct rte_mbuf *input, *h_output_head, *h_output,
+		*s_output_head, *s_output;
+	uint16_t current_enqueued_cbs = 0;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+	acc100_fcw_td_fill(op, &desc->req.fcw_td);
+
+	input = op->turbo_dec.input.data;
+	h_output_head = h_output = op->turbo_dec.hard_output.data;
+	s_output_head = s_output = op->turbo_dec.soft_output.data;
+	in_offset = op->turbo_dec.input.offset;
+	h_out_offset = op->turbo_dec.hard_output.offset;
+	s_out_offset = op->turbo_dec.soft_output.offset;
+	h_out_length = s_out_length = 0;
+	mbuf_total_left = op->turbo_dec.input.length;
+	c = op->turbo_dec.tb_params.c;
+	r = op->turbo_dec.tb_params.r;
+
+	while (mbuf_total_left > 0 && r < c) {
+
+		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+
+		/* Set up DMA descriptor */
+		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+				& q->sw_ring_wrap_mask);
+		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+		desc->req.data_ptrs[0].blen = ACC100_FCW_TD_BLEN;
+		ret = acc100_dma_desc_td_fill(op, &desc->req, &input,
+				h_output, s_output, &in_offset, &h_out_offset,
+				&s_out_offset, &h_out_length, &s_out_length,
+				&mbuf_total_left, &seg_total_left, r);
+
+		if (unlikely(ret < 0))
+			return ret;
+
+		/* Hard output */
+		mbuf_append(h_output_head, h_output, h_out_length);
+
+		/* Soft output */
+		if (check_bit(op->turbo_dec.op_flags,
+				RTE_BBDEV_TURBO_SOFT_OUTPUT))
+			mbuf_append(s_output_head, s_output, s_out_length);
+
+		/* Set total number of CBs in TB */
+		desc->req.cbs_in_tb = cbs_in_tb;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "FCW", &desc->req.fcw_td,
+				sizeof(desc->req.fcw_td) - 8);
+		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+		if (seg_total_left == 0) {
+			/* Go to the next mbuf */
+			input = input->next;
+			in_offset = 0;
+			h_output = h_output->next;
+			h_out_offset = 0;
+
+			if (check_bit(op->turbo_dec.op_flags,
+					RTE_BBDEV_TURBO_SOFT_OUTPUT)) {
+				s_output = s_output->next;
+				s_out_offset = 0;
+			}
+		}
+
+		total_enqueued_cbs++;
+		current_enqueued_cbs++;
+		r++;
+	}
+
+	if (unlikely(desc == NULL))
+		return current_enqueued_cbs;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Check if any CBs left for processing */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left for processing: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+	/* Set SDone on last CB descriptor for TB mode */
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	return current_enqueued_cbs;
+}
 
 /* Calculates number of CBs in processed encoder TB based on 'r' and input
  * length.
@@ -1893,6 +2498,45 @@
 	return cbs_in_tb;
 }
 
+/* Enqueue encode operations for ACC100 device in CB mode. */
+static uint16_t
+acc100_enqueue_enc_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i;
+	union acc100_dma_desc *desc;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - 1 < 0))
+			break;
+		avail -= 1;
+
+		ret = enqueue_enc_one_op_cb(q, ops[i], i);
+		if (ret < 0)
+			break;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
+			& q->sw_ring_wrap_mask);
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, i, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+	return i;
+}
+
 /* Check we can mux encode operations with common FCW */
 static inline bool
 check_mux(struct rte_bbdev_enc_op **ops, uint16_t num) {
@@ -1960,6 +2604,52 @@
 	return i;
 }
 
+/* Enqueue encode operations for ACC100 device in TB mode. */
+static uint16_t
+acc100_enqueue_enc_tb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i, enqueued_cbs = 0;
+	uint8_t cbs_in_tb;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		cbs_in_tb = get_num_cbs_in_tb_enc(&ops[i]->turbo_enc);
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - cbs_in_tb < 0))
+			break;
+		avail -= cbs_in_tb;
+
+		ret = enqueue_enc_one_op_tb(q, ops[i], enqueued_cbs, cbs_in_tb);
+		if (ret < 0)
+			break;
+		enqueued_cbs += ret;
+	}
+
+	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
+}
+
+/* Enqueue encode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	if (unlikely(num == 0))
+		return 0;
+	if (ops[0]->turbo_enc.code_block_mode == 0)
+		return acc100_enqueue_enc_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_enc_cb(q_data, ops, num);
+}
+
 /* Enqueue encode operations for ACC100 device. */
 static uint16_t
 acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
@@ -1967,7 +2657,51 @@
 {
 	if (unlikely(num == 0))
 		return 0;
-	return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
+	if (ops[0]->ldpc_enc.code_block_mode == 0)
+		return acc100_enqueue_enc_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
+}
+
+
+/* Enqueue decode operations for ACC100 device in CB mode */
+static uint16_t
+acc100_enqueue_dec_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i;
+	union acc100_dma_desc *desc;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - 1 < 0))
+			break;
+		avail -= 1;
+
+		ret = enqueue_dec_one_op_cb(q, ops[i], i);
+		if (ret < 0)
+			break;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
+			& q->sw_ring_wrap_mask);
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, i, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
 }
 
 /* Check we can mux encode operations with common FCW */
@@ -2065,6 +2799,53 @@
 	return i;
 }
 
+
+/* Enqueue decode operations for ACC100 device in TB mode */
+static uint16_t
+acc100_enqueue_dec_tb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i, enqueued_cbs = 0;
+	uint8_t cbs_in_tb;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		cbs_in_tb = get_num_cbs_in_tb_dec(&ops[i]->turbo_dec);
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - cbs_in_tb < 0))
+			break;
+		avail -= cbs_in_tb;
+
+		ret = enqueue_dec_one_op_tb(q, ops[i], enqueued_cbs, cbs_in_tb);
+		if (ret < 0)
+			break;
+		enqueued_cbs += ret;
+	}
+
+	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
+}
+
+/* Enqueue decode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	if (unlikely(num == 0))
+		return 0;
+	if (ops[0]->turbo_dec.code_block_mode == 0)
+		return acc100_enqueue_dec_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_dec_cb(q_data, ops, num);
+}
+
 /* Enqueue decode operations for ACC100 device. */
 static uint16_t
 acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
@@ -2388,6 +3169,51 @@
 	return cb_idx;
 }
 
+/* Dequeue encode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint16_t dequeue_num;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t i;
+	uint16_t dequeued_cbs = 0;
+	struct rte_bbdev_enc_op *op;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = (avail < num) ? avail : num;
+
+	for (i = 0; i < dequeue_num; ++i) {
+		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask))->req.op_addr;
+		if (op->turbo_enc.code_block_mode == 0)
+			ret = dequeue_enc_one_op_tb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+		else
+			ret = dequeue_enc_one_op_cb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_cbs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += i;
+
+	return i;
+}
+
 /* Dequeue LDPC encode operations from ACC100 device. */
 static uint16_t
 acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
@@ -2426,6 +3252,52 @@
 	return dequeued_cbs;
 }
 
+
+/* Dequeue decode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint16_t dequeue_num;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t i;
+	uint16_t dequeued_cbs = 0;
+	struct rte_bbdev_dec_op *op;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = (avail < num) ? avail : num;
+
+	for (i = 0; i < dequeue_num; ++i) {
+		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask))->req.op_addr;
+		if (op->turbo_dec.code_block_mode == 0)
+			ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+		else
+			ret = dequeue_dec_one_op_cb(q_data, q, &ops[i],
+					dequeued_cbs, &aq_dequeued);
+
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_cbs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += i;
+
+	return i;
+}
+
 /* Dequeue decode operations from ACC100 device. */
 static uint16_t
 acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
@@ -2479,6 +3351,10 @@
 	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
 
 	dev->dev_ops = &acc100_bbdev_ops;
+	dev->enqueue_enc_ops = acc100_enqueue_enc;
+	dev->enqueue_dec_ops = acc100_enqueue_dec;
+	dev->dequeue_enc_ops = acc100_dequeue_enc;
+	dev->dequeue_dec_ops = acc100_dequeue_dec;
 	dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
 	dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
 	dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v8 08/10] baseband/acc100: add interrupt support to PMD
  2020-09-28 23:52   ` [dpdk-dev] [PATCH v8 00/10] bbdev PMD ACC100 Nicolas Chautru
                       ` (6 preceding siblings ...)
  2020-09-28 23:52     ` [dpdk-dev] [PATCH v8 07/10] baseband/acc100: add support for 4G processing Nicolas Chautru
@ 2020-09-28 23:52     ` Nicolas Chautru
  2020-09-28 23:52     ` [dpdk-dev] [PATCH v8 09/10] baseband/acc100: add debug function to validate input Nicolas Chautru
  2020-09-28 23:52     ` [dpdk-dev] [PATCH v8 10/10] baseband/acc100: add configure function Nicolas Chautru
  9 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-28 23:52 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add capability and functions to support MSI
interrupts, callbacks and the Info Ring.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 288 ++++++++++++++++++++++++++++++-
 drivers/baseband/acc100/rte_acc100_pmd.h |  15 ++
 2 files changed, 300 insertions(+), 3 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 7d4c3df..b6d9e7c 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -339,6 +339,213 @@
 	free_base_addresses(base_addrs, i);
 }
 
+/*
+ * Find queue_id of a device queue based on details from the Info Ring.
+ * If a queue isn't found UINT16_MAX is returned.
+ */
+static inline uint16_t
+get_queue_id_from_ring_info(struct rte_bbdev_data *data,
+		const union acc100_info_ring_data ring_data)
+{
+	uint16_t queue_id;
+
+	for (queue_id = 0; queue_id < data->num_queues; ++queue_id) {
+		struct acc100_queue *acc100_q =
+				data->queues[queue_id].queue_private;
+		if (acc100_q != NULL && acc100_q->aq_id == ring_data.aq_id &&
+				acc100_q->qgrp_id == ring_data.qg_id &&
+				acc100_q->vf_id == ring_data.vf_id)
+			return queue_id;
+	}
+
+	return UINT16_MAX;
+}
+
+/* Checks the Info Ring to find the interrupt cause and handles it accordingly */
+static inline void
+acc100_check_ir(struct acc100_device *acc100_dev)
+{
+	volatile union acc100_info_ring_data *ring_data;
+	uint16_t info_ring_head = acc100_dev->info_ring_head;
+	if (acc100_dev->info_ring == NULL)
+		return;
+
+	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
+			ACC100_INFO_RING_MASK);
+
+	while (ring_data->valid) {
+		if ((ring_data->int_nb < ACC100_PF_INT_DMA_DL_DESC_IRQ) || (
+				ring_data->int_nb >
+				ACC100_PF_INT_DMA_DL5G_DESC_IRQ))
+			rte_bbdev_log(WARNING, "InfoRing: ITR:%d Info:0x%x",
+				ring_data->int_nb, ring_data->detailed_info);
+		/* Initialize Info Ring entry and move forward */
+		ring_data->val = 0;
+		info_ring_head++;
+		ring_data = acc100_dev->info_ring +
+				(info_ring_head & ACC100_INFO_RING_MASK);
+	}
+}
+
+/* Checks PF Info Ring to find the interrupt cause and handles it accordingly */
+static inline void
+acc100_pf_interrupt_handler(struct rte_bbdev *dev)
+{
+	struct acc100_device *acc100_dev = dev->data->dev_private;
+	volatile union acc100_info_ring_data *ring_data;
+	struct acc100_deq_intr_details deq_intr_det;
+
+	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
+			ACC100_INFO_RING_MASK);
+
+	while (ring_data->valid) {
+
+		rte_bbdev_log_debug(
+				"ACC100 PF Interrupt received, Info Ring data: 0x%x",
+				ring_data->val);
+
+		switch (ring_data->int_nb) {
+		case ACC100_PF_INT_DMA_DL_DESC_IRQ:
+		case ACC100_PF_INT_DMA_UL_DESC_IRQ:
+		case ACC100_PF_INT_DMA_UL5G_DESC_IRQ:
+		case ACC100_PF_INT_DMA_DL5G_DESC_IRQ:
+			deq_intr_det.queue_id = get_queue_id_from_ring_info(
+					dev->data, *ring_data);
+			if (deq_intr_det.queue_id == UINT16_MAX) {
+				rte_bbdev_log(ERR,
+						"Couldn't find queue: aq_id: %u, qg_id: %u, vf_id: %u",
+						ring_data->aq_id,
+						ring_data->qg_id,
+						ring_data->vf_id);
+				return;
+			}
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_DEQUEUE, &deq_intr_det);
+			break;
+		default:
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_ERROR, NULL);
+			break;
+		}
+
+		/* Initialize Info Ring entry and move forward */
+		ring_data->val = 0;
+		++acc100_dev->info_ring_head;
+		ring_data = acc100_dev->info_ring +
+				(acc100_dev->info_ring_head &
+				ACC100_INFO_RING_MASK);
+	}
+}
+
+/* Checks VF Info Ring to find the interrupt cause and handles it accordingly */
+static inline void
+acc100_vf_interrupt_handler(struct rte_bbdev *dev)
+{
+	struct acc100_device *acc100_dev = dev->data->dev_private;
+	volatile union acc100_info_ring_data *ring_data;
+	struct acc100_deq_intr_details deq_intr_det;
+
+	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
+			ACC100_INFO_RING_MASK);
+
+	while (ring_data->valid) {
+
+		rte_bbdev_log_debug(
+				"ACC100 VF Interrupt received, Info Ring data: 0x%x",
+				ring_data->val);
+
+		switch (ring_data->int_nb) {
+		case ACC100_VF_INT_DMA_DL_DESC_IRQ:
+		case ACC100_VF_INT_DMA_UL_DESC_IRQ:
+		case ACC100_VF_INT_DMA_UL5G_DESC_IRQ:
+		case ACC100_VF_INT_DMA_DL5G_DESC_IRQ:
+			/* VFs are not aware of their vf_id - it's set to 0 in
+			 * queue structures.
+			 */
+			ring_data->vf_id = 0;
+			deq_intr_det.queue_id = get_queue_id_from_ring_info(
+					dev->data, *ring_data);
+			if (deq_intr_det.queue_id == UINT16_MAX) {
+				rte_bbdev_log(ERR,
+						"Couldn't find queue: aq_id: %u, qg_id: %u",
+						ring_data->aq_id,
+						ring_data->qg_id);
+				return;
+			}
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_DEQUEUE, &deq_intr_det);
+			break;
+		default:
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_ERROR, NULL);
+			break;
+		}
+
+		/* Initialize Info Ring entry and move forward */
+		ring_data->valid = 0;
+		++acc100_dev->info_ring_head;
+		ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head
+				& ACC100_INFO_RING_MASK);
+	}
+}
+
+/* Interrupt handler triggered by ACC100 dev for handling specific interrupt */
+static void
+acc100_dev_interrupt_handler(void *cb_arg)
+{
+	struct rte_bbdev *dev = cb_arg;
+	struct acc100_device *acc100_dev = dev->data->dev_private;
+
+	/* Read info ring */
+	if (acc100_dev->pf_device)
+		acc100_pf_interrupt_handler(dev);
+	else
+		acc100_vf_interrupt_handler(dev);
+}
+
+/* Allocate and set up the Info Ring */
+static int
+allocate_inforing(struct rte_bbdev *dev)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	const struct acc100_registry_addr *reg_addr;
+	rte_iova_t info_ring_phys;
+	uint32_t phys_low, phys_high;
+
+	if (d->info_ring != NULL)
+		return 0; /* Already configured */
+
+	/* Choose correct registry addresses for the device type */
+	if (d->pf_device)
+		reg_addr = &pf_reg_addr;
+	else
+		reg_addr = &vf_reg_addr;
+	/* Allocate InfoRing */
+	d->info_ring = rte_zmalloc_socket("Info Ring",
+			ACC100_INFO_RING_NUM_ENTRIES *
+			sizeof(*d->info_ring), RTE_CACHE_LINE_SIZE,
+			dev->data->socket_id);
+	if (d->info_ring == NULL) {
+		rte_bbdev_log(ERR,
+				"Failed to allocate Info Ring for %s:%u",
+				dev->device->driver->name,
+				dev->data->dev_id);
+		return -ENOMEM;
+	}
+	info_ring_phys = rte_malloc_virt2iova(d->info_ring);
+
+	/* Setup Info Ring */
+	phys_high = (uint32_t)(info_ring_phys >> 32);
+	phys_low  = (uint32_t)(info_ring_phys);
+	acc100_reg_write(d, reg_addr->info_ring_hi, phys_high);
+	acc100_reg_write(d, reg_addr->info_ring_lo, phys_low);
+	acc100_reg_write(d, reg_addr->info_ring_en, ACC100_REG_IRQ_EN_ALL);
+	d->info_ring_head = (acc100_reg_read(d, reg_addr->info_ring_ptr) &
+			0xFFF) / sizeof(union acc100_info_ring_data);
+	return 0;
+}
+
+
 /* Allocate 64MB memory used for all software rings */
 static int
 acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
@@ -426,6 +633,7 @@
 	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
 	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
 
+	allocate_inforing(dev);
 	d->harq_layout = rte_zmalloc_socket("HARQ Layout",
 			ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
 			RTE_CACHE_LINE_SIZE, dev->data->socket_id);
@@ -437,13 +645,53 @@
 	return 0;
 }
 
+static int
+acc100_intr_enable(struct rte_bbdev *dev)
+{
+	int ret;
+	struct acc100_device *d = dev->data->dev_private;
+
+	/* Only MSI interrupts are currently supported */
+	if (dev->intr_handle->type == RTE_INTR_HANDLE_VFIO_MSI ||
+			dev->intr_handle->type == RTE_INTR_HANDLE_UIO) {
+
+		allocate_inforing(dev);
+
+		ret = rte_intr_enable(dev->intr_handle);
+		if (ret < 0) {
+			rte_bbdev_log(ERR,
+					"Couldn't enable interrupts for device: %s",
+					dev->data->name);
+			rte_free(d->info_ring);
+			return ret;
+		}
+		ret = rte_intr_callback_register(dev->intr_handle,
+				acc100_dev_interrupt_handler, dev);
+		if (ret < 0) {
+			rte_bbdev_log(ERR,
+					"Couldn't register interrupt callback for device: %s",
+					dev->data->name);
+			rte_free(d->info_ring);
+			return ret;
+		}
+
+		return 0;
+	}
+
+	rte_bbdev_log(ERR, "ACC100 (%s) supports only VFIO MSI interrupts",
+			dev->data->name);
+	return -ENOTSUP;
+}
+
 /* Free 64MB memory used for software rings */
 static int
 acc100_dev_close(struct rte_bbdev *dev)
 {
 	struct acc100_device *d = dev->data->dev_private;
+	acc100_check_ir(d);
 	if (d->sw_rings_base != NULL) {
 		rte_free(d->tail_ptrs);
+		rte_free(d->info_ring);
 		rte_free(d->sw_rings_base);
 		d->sw_rings_base = NULL;
 	}
@@ -643,6 +891,7 @@
 					RTE_BBDEV_TURBO_CRC_TYPE_24B |
 					RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
 					RTE_BBDEV_TURBO_EARLY_TERMINATION |
+					RTE_BBDEV_TURBO_DEC_INTERRUPTS |
 					RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
 					RTE_BBDEV_TURBO_MAP_DEC |
 					RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP |
@@ -663,6 +912,7 @@
 					RTE_BBDEV_TURBO_CRC_24B_ATTACH |
 					RTE_BBDEV_TURBO_RV_INDEX_BYPASS |
 					RTE_BBDEV_TURBO_RATE_MATCH |
+					RTE_BBDEV_TURBO_ENC_INTERRUPTS |
 					RTE_BBDEV_TURBO_ENC_SCATTER_GATHER,
 				.num_buffers_src =
 						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
@@ -676,7 +926,8 @@
 				.capability_flags =
 					RTE_BBDEV_LDPC_RATE_MATCH |
 					RTE_BBDEV_LDPC_CRC_24B_ATTACH |
-					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
+					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS |
+					RTE_BBDEV_LDPC_ENC_INTERRUPTS,
 				.num_buffers_src =
 						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
 				.num_buffers_dst =
@@ -701,7 +952,8 @@
 				RTE_BBDEV_LDPC_DECODE_BYPASS |
 				RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
 				RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
-				RTE_BBDEV_LDPC_LLR_COMPRESSION,
+				RTE_BBDEV_LDPC_LLR_COMPRESSION |
+				RTE_BBDEV_LDPC_DEC_INTERRUPTS,
 			.llr_size = 8,
 			.llr_decimals = 1,
 			.num_buffers_src =
@@ -751,14 +1003,39 @@
 #else
 	dev_info->harq_buffer_size = 0;
 #endif
+	acc100_check_ir(d);
+}
+
+static int
+acc100_queue_intr_enable(struct rte_bbdev *dev, uint16_t queue_id)
+{
+	struct acc100_queue *q = dev->data->queues[queue_id].queue_private;
+
+	if (dev->intr_handle->type != RTE_INTR_HANDLE_VFIO_MSI &&
+			dev->intr_handle->type != RTE_INTR_HANDLE_UIO)
+		return -ENOTSUP;
+
+	q->irq_enable = 1;
+	return 0;
+}
+
+static int
+acc100_queue_intr_disable(struct rte_bbdev *dev, uint16_t queue_id)
+{
+	struct acc100_queue *q = dev->data->queues[queue_id].queue_private;
+	q->irq_enable = 0;
+	return 0;
 }
 
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.setup_queues = acc100_setup_queues,
+	.intr_enable = acc100_intr_enable,
 	.close = acc100_dev_close,
 	.info_get = acc100_dev_info_get,
 	.queue_setup = acc100_queue_setup,
 	.queue_release = acc100_queue_release,
+	.queue_intr_enable = acc100_queue_intr_enable,
+	.queue_intr_disable = acc100_queue_intr_disable
 };
 
 /* ACC100 PCI PF address map */
@@ -3018,8 +3295,10 @@
 			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
 	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
 	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
-	if (op->status != 0)
+	if (op->status != 0) {
 		q_data->queue_stats.dequeue_err_count++;
+		acc100_check_ir(q->d);
+	}
 
 	/* CRC invalid if error exists */
 	if (!op->status)
@@ -3076,6 +3355,9 @@
 		op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
 	op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
 
+	if (op->status & (1 << RTE_BBDEV_DRV_ERROR))
+		acc100_check_ir(q->d);
+
 	/* Check if this is the last desc in batch (Atomic Queue) */
 	if (desc->req.last_desc_in_batch) {
 		(*aq_dequeued)++;
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 78686c1..8980fa5 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -559,7 +559,14 @@ struct acc100_device {
 	/* Virtual address of the info memory routed to this function under
 	 * operation, whether it is PF or VF.
 	 */
+	union acc100_info_ring_data *info_ring;
+
 	union acc100_harq_layout_data *harq_layout;
+	/* Virtual Info Ring head */
+	uint16_t info_ring_head;
+	/* Number of bytes available for each queue in device, depending on
+	 * how many queues are enabled with configure()
+	 */
 	uint32_t sw_ring_size;
 	uint32_t ddr_size; /* Size in kB */
 	uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */
@@ -575,4 +582,12 @@ struct acc100_device {
 	bool configured; /**< True if this ACC100 device is configured */
 };
 
+/**
+ * Structure with details about RTE_BBDEV_EVENT_DEQUEUE event. It's passed to
+ * the callback function.
+ */
+struct acc100_deq_intr_details {
+	uint16_t queue_id;
+};
+
 #endif /* _RTE_ACC100_PMD_H_ */
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v8 09/10] baseband/acc100: add debug function to validate input
  2020-09-28 23:52   ` [dpdk-dev] [PATCH v8 00/10] bbdev PMD ACC100 Nicolas Chautru
                       ` (7 preceding siblings ...)
  2020-09-28 23:52     ` [dpdk-dev] [PATCH v8 08/10] baseband/acc100: add interrupt support to PMD Nicolas Chautru
@ 2020-09-28 23:52     ` Nicolas Chautru
  2020-09-28 23:52     ` [dpdk-dev] [PATCH v8 10/10] baseband/acc100: add configure function Nicolas Chautru
  9 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-28 23:52 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add debug functions to validate the input API from the user.
They are only enabled in DEBUG mode at build time.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 424 +++++++++++++++++++++++++++++++
 1 file changed, 424 insertions(+)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index b6d9e7c..3589814 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -1945,6 +1945,231 @@
 
 }
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+/* Validates turbo encoder parameters */
+static inline int
+validate_enc_op(struct rte_bbdev_enc_op *op)
+{
+	struct rte_bbdev_op_turbo_enc *turbo_enc = &op->turbo_enc;
+	struct rte_bbdev_op_enc_turbo_cb_params *cb = NULL;
+	struct rte_bbdev_op_enc_turbo_tb_params *tb = NULL;
+	uint16_t kw, kw_neg, kw_pos;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if (turbo_enc->input.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid input pointer");
+		return -1;
+	}
+	if (turbo_enc->output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid output pointer");
+		return -1;
+	}
+	if (turbo_enc->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				turbo_enc->rv_index);
+		return -1;
+	}
+	if (turbo_enc->code_block_mode != 0 &&
+			turbo_enc->code_block_mode != 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				turbo_enc->code_block_mode);
+		return -1;
+	}
+
+	if (turbo_enc->code_block_mode == 0) {
+		tb = &turbo_enc->tb_params;
+		if ((tb->k_neg < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_neg > RTE_BBDEV_TURBO_MAX_CB_SIZE)
+				&& tb->c_neg > 0) {
+			rte_bbdev_log(ERR,
+					"k_neg (%u) is out of range %u <= value <= %u",
+					tb->k_neg, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (tb->k_pos < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_pos > RTE_BBDEV_TURBO_MAX_CB_SIZE) {
+			rte_bbdev_log(ERR,
+					"k_pos (%u) is out of range %u <= value <= %u",
+					tb->k_pos, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (tb->c_neg > (RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1))
+			rte_bbdev_log(ERR,
+					"c_neg (%u) is out of range 0 <= value <= %u",
+					tb->c_neg,
+					RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1);
+		if (tb->c < 1 || tb->c > RTE_BBDEV_TURBO_MAX_CODE_BLOCKS) {
+			rte_bbdev_log(ERR,
+					"c (%u) is out of range 1 <= value <= %u",
+					tb->c, RTE_BBDEV_TURBO_MAX_CODE_BLOCKS);
+			return -1;
+		}
+		if (tb->cab > tb->c) {
+			rte_bbdev_log(ERR,
+					"cab (%u) is greater than c (%u)",
+					tb->cab, tb->c);
+			return -1;
+		}
+		if ((tb->ea < RTE_BBDEV_TURBO_MIN_CB_SIZE || (tb->ea % 2))
+				&& tb->r < tb->cab) {
+			rte_bbdev_log(ERR,
+					"ea (%u) is less than %u or it is not even",
+					tb->ea, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+		if ((tb->eb < RTE_BBDEV_TURBO_MIN_CB_SIZE || (tb->eb % 2))
+				&& tb->c > tb->cab) {
+			rte_bbdev_log(ERR,
+					"eb (%u) is less than %u or it is not even",
+					tb->eb, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+
+		kw_neg = 3 * RTE_ALIGN_CEIL(tb->k_neg + 4,
+					RTE_BBDEV_TURBO_C_SUBBLOCK);
+		if (tb->ncb_neg < tb->k_neg || tb->ncb_neg > kw_neg) {
+			rte_bbdev_log(ERR,
+					"ncb_neg (%u) is out of range (%u) k_neg <= value <= (%u) kw_neg",
+					tb->ncb_neg, tb->k_neg, kw_neg);
+			return -1;
+		}
+
+		kw_pos = 3 * RTE_ALIGN_CEIL(tb->k_pos + 4,
+					RTE_BBDEV_TURBO_C_SUBBLOCK);
+		if (tb->ncb_pos < tb->k_pos || tb->ncb_pos > kw_pos) {
+			rte_bbdev_log(ERR,
+					"ncb_pos (%u) is out of range (%u) k_pos <= value <= (%u) kw_pos",
+					tb->ncb_pos, tb->k_pos, kw_pos);
+			return -1;
+		}
+		if (tb->r > (tb->c - 1)) {
+			rte_bbdev_log(ERR,
+					"r (%u) is greater than c - 1 (%u)",
+					tb->r, tb->c - 1);
+			return -1;
+		}
+	} else {
+		cb = &turbo_enc->cb_params;
+		if (cb->k < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| cb->k > RTE_BBDEV_TURBO_MAX_CB_SIZE) {
+			rte_bbdev_log(ERR,
+					"k (%u) is out of range %u <= value <= %u",
+					cb->k, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+
+		if (cb->e < RTE_BBDEV_TURBO_MIN_CB_SIZE || (cb->e % 2)) {
+			rte_bbdev_log(ERR,
+					"e (%u) is less than %u or it is not even",
+					cb->e, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+
+		kw = RTE_ALIGN_CEIL(cb->k + 4, RTE_BBDEV_TURBO_C_SUBBLOCK) * 3;
+		if (cb->ncb < cb->k || cb->ncb > kw) {
+			rte_bbdev_log(ERR,
+					"ncb (%u) is out of range (%u) k <= value <= (%u) kw",
+					cb->ncb, cb->k, kw);
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
+/* Validates LDPC encoder parameters */
+static inline int
+validate_ldpc_enc_op(struct rte_bbdev_enc_op *op)
+{
+	struct rte_bbdev_op_ldpc_enc *ldpc_enc = &op->ldpc_enc;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if (ldpc_enc->input.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid input pointer");
+		return -1;
+	}
+	if (ldpc_enc->output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid output pointer");
+		return -1;
+	}
+	if (ldpc_enc->input.length >
+			RTE_BBDEV_LDPC_MAX_CB_SIZE >> 3) {
+		rte_bbdev_log(ERR, "CB size (%u) is too big, max: %d",
+				ldpc_enc->input.length,
+				RTE_BBDEV_LDPC_MAX_CB_SIZE);
+		return -1;
+	}
+	if ((ldpc_enc->basegraph > 2) || (ldpc_enc->basegraph == 0)) {
+		rte_bbdev_log(ERR,
+				"BG (%u) is out of range 1 <= value <= 2",
+				ldpc_enc->basegraph);
+		return -1;
+	}
+	if (ldpc_enc->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				ldpc_enc->rv_index);
+		return -1;
+	}
+	if (ldpc_enc->code_block_mode > 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				ldpc_enc->code_block_mode);
+		return -1;
+	}
+
+	return 0;
+}
+
+/* Validates LDPC decoder parameters */
+static inline int
+validate_ldpc_dec_op(struct rte_bbdev_dec_op *op)
+{
+	struct rte_bbdev_op_ldpc_dec *ldpc_dec = &op->ldpc_dec;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if ((ldpc_dec->basegraph > 2) || (ldpc_dec->basegraph == 0)) {
+		rte_bbdev_log(ERR,
+				"BG (%u) is out of range 1 <= value <= 2",
+				ldpc_dec->basegraph);
+		return -1;
+	}
+	if (ldpc_dec->iter_max == 0) {
+		rte_bbdev_log(ERR,
+				"iter_max (%u) is equal to 0",
+				ldpc_dec->iter_max);
+		return -1;
+	}
+	if (ldpc_dec->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				ldpc_dec->rv_index);
+		return -1;
+	}
+	if (ldpc_dec->code_block_mode > 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				ldpc_dec->code_block_mode);
+		return -1;
+	}
+
+	return 0;
+}
+#endif
+
 /* Enqueue one encode operations for ACC100 device in CB mode */
 static inline int
 enqueue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
@@ -1956,6 +2181,14 @@
 		seg_total_left;
 	struct rte_mbuf *input, *output_head, *output;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_enc_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2008,6 +2241,14 @@
 	uint16_t  in_length_in_bytes;
 	struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_enc_op(ops[0]) == -1) {
+		rte_bbdev_log(ERR, "LDPC encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2065,6 +2306,14 @@
 		seg_total_left;
 	struct rte_mbuf *input, *output_head, *output;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_enc_op(op) == -1) {
+		rte_bbdev_log(ERR, "LDPC encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2119,6 +2368,14 @@
 	struct rte_mbuf *input, *output_head, *output;
 	uint16_t current_enqueued_cbs = 0;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_enc_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2191,6 +2448,142 @@
 	return current_enqueued_cbs;
 }
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+/* Validates turbo decoder parameters */
+static inline int
+validate_dec_op(struct rte_bbdev_dec_op *op)
+{
+	struct rte_bbdev_op_turbo_dec *turbo_dec = &op->turbo_dec;
+	struct rte_bbdev_op_dec_turbo_cb_params *cb = NULL;
+	struct rte_bbdev_op_dec_turbo_tb_params *tb = NULL;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if (turbo_dec->input.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid input pointer");
+		return -1;
+	}
+	if (turbo_dec->hard_output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid hard_output pointer");
+		return -1;
+	}
+	if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT) &&
+			turbo_dec->soft_output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid soft_output pointer");
+		return -1;
+	}
+	if (turbo_dec->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				turbo_dec->rv_index);
+		return -1;
+	}
+	if (turbo_dec->iter_min < 1) {
+		rte_bbdev_log(ERR,
+				"iter_min (%u) is less than 1",
+				turbo_dec->iter_min);
+		return -1;
+	}
+	if (turbo_dec->iter_max <= 2) {
+		rte_bbdev_log(ERR,
+				"iter_max (%u) is less than or equal to 2",
+				turbo_dec->iter_max);
+		return -1;
+	}
+	if (turbo_dec->iter_min > turbo_dec->iter_max) {
+		rte_bbdev_log(ERR,
+				"iter_min (%u) is greater than iter_max (%u)",
+				turbo_dec->iter_min, turbo_dec->iter_max);
+		return -1;
+	}
+	if (turbo_dec->code_block_mode != 0 &&
+			turbo_dec->code_block_mode != 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				turbo_dec->code_block_mode);
+		return -1;
+	}
+
+	if (turbo_dec->code_block_mode == 0) {
+		tb = &turbo_dec->tb_params;
+		if ((tb->k_neg < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_neg > RTE_BBDEV_TURBO_MAX_CB_SIZE)
+				&& tb->c_neg > 0) {
+			rte_bbdev_log(ERR,
+					"k_neg (%u) is out of range %u <= value <= %u",
+					tb->k_neg, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if ((tb->k_pos < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_pos > RTE_BBDEV_TURBO_MAX_CB_SIZE)
+				&& tb->c > tb->c_neg) {
+			rte_bbdev_log(ERR,
+					"k_pos (%u) is out of range %u <= value <= %u",
+					tb->k_pos, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (tb->c_neg > (RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1)) {
+			rte_bbdev_log(ERR,
+					"c_neg (%u) is out of range 0 <= value <= %u",
+					tb->c_neg,
+					RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1);
+			return -1;
+		}
+		if (tb->c < 1 || tb->c > RTE_BBDEV_TURBO_MAX_CODE_BLOCKS) {
+			rte_bbdev_log(ERR,
+					"c (%u) is out of range 1 <= value <= %u",
+					tb->c, RTE_BBDEV_TURBO_MAX_CODE_BLOCKS);
+			return -1;
+		}
+		if (tb->cab > tb->c) {
+			rte_bbdev_log(ERR,
+					"cab (%u) is greater than c (%u)",
+					tb->cab, tb->c);
+			return -1;
+		}
+		if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) &&
+				(tb->ea < RTE_BBDEV_TURBO_MIN_CB_SIZE
+						|| (tb->ea % 2))
+				&& tb->cab > 0) {
+			rte_bbdev_log(ERR,
+					"ea (%u) is less than %u or it is not even",
+					tb->ea, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+		if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) &&
+				(tb->eb < RTE_BBDEV_TURBO_MIN_CB_SIZE
+						|| (tb->eb % 2))
+				&& tb->c > tb->cab) {
+			rte_bbdev_log(ERR,
+					"eb (%u) is less than %u or it is not even",
+					tb->eb, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+	} else {
+		cb = &turbo_dec->cb_params;
+		if (cb->k < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| cb->k > RTE_BBDEV_TURBO_MAX_CB_SIZE) {
+			rte_bbdev_log(ERR,
+					"k (%u) is out of range %u <= value <= %u",
+					cb->k, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) &&
+				(cb->e < RTE_BBDEV_TURBO_MIN_CB_SIZE ||
+				(cb->e % 2))) {
+			rte_bbdev_log(ERR,
+					"e (%u) is less than %u or it is not even",
+					cb->e, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+	}
+
+	return 0;
+}
+#endif
+
 /** Enqueue one decode operations for ACC100 device in CB mode */
 static inline int
 enqueue_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
@@ -2203,6 +2596,14 @@
 	struct rte_mbuf *input, *h_output_head, *h_output,
 		*s_output_head, *s_output;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo decoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2426,6 +2827,13 @@
 		return ret;
 	}
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "LDPC decoder validation failed");
+		return -EINVAL;
+	}
+#endif
 	union acc100_dma_desc *desc;
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
@@ -2521,6 +2929,14 @@
 	struct rte_mbuf *input, *h_output_head, *h_output;
 	uint16_t current_enqueued_cbs = 0;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "LDPC decoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2611,6 +3027,14 @@
 		*s_output_head, *s_output;
 	uint16_t current_enqueued_cbs = 0;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo decoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v8 10/10] baseband/acc100: add configure function
  2020-09-28 23:52   ` [dpdk-dev] [PATCH v8 00/10] bbdev PMD ACC100 Nicolas Chautru
                       ` (8 preceding siblings ...)
  2020-09-28 23:52     ` [dpdk-dev] [PATCH v8 09/10] baseband/acc100: add debug function to validate input Nicolas Chautru
@ 2020-09-28 23:52     ` Nicolas Chautru
  9 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-28 23:52 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add a configure function to set up the PF from within
bbdev-test itself, without requiring an external
application to configure the device.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 app/test-bbdev/test_bbdev_perf.c                   |  72 +++
 drivers/baseband/acc100/meson.build                |   2 +
 drivers/baseband/acc100/rte_acc100_cfg.h           |  17 +
 drivers/baseband/acc100/rte_acc100_pmd.c           | 505 +++++++++++++++++++++
 .../acc100/rte_pmd_bbdev_acc100_version.map        |   7 +
 5 files changed, 603 insertions(+)

diff --git a/app/test-bbdev/test_bbdev_perf.c b/app/test-bbdev/test_bbdev_perf.c
index 45c0d62..32f23ff 100644
--- a/app/test-bbdev/test_bbdev_perf.c
+++ b/app/test-bbdev/test_bbdev_perf.c
@@ -52,6 +52,18 @@
 #define FLR_5G_TIMEOUT 610
 #endif
 
+#ifdef RTE_LIBRTE_PMD_BBDEV_ACC100
+#include <rte_acc100_cfg.h>
+#define ACC100PF_DRIVER_NAME   ("intel_acc100_pf")
+#define ACC100VF_DRIVER_NAME   ("intel_acc100_vf")
+#define ACC100_QMGR_NUM_AQS 16
+#define ACC100_QMGR_NUM_QGS 2
+#define ACC100_QMGR_AQ_DEPTH 5
+#define ACC100_QMGR_INVALID_IDX -1
+#define ACC100_QMGR_RR 1
+#define ACC100_QOS_GBR 0
+#endif
+
 #define OPS_CACHE_SIZE 256U
 #define OPS_POOL_SIZE_MIN 511U /* 0.5K per queue */
 
@@ -653,6 +665,66 @@ typedef int (test_case_function)(struct active_device *ad,
 				info->dev_name);
 	}
 #endif
+#ifdef RTE_LIBRTE_PMD_BBDEV_ACC100
+	if ((get_init_device() == true) &&
+		(!strcmp(info->drv.driver_name, ACC100PF_DRIVER_NAME))) {
+		struct acc100_conf conf;
+		unsigned int i;
+
+		printf("Configure ACC100 FEC Driver %s with default values\n",
+				info->drv.driver_name);
+
+		/* clear default configuration before initialization */
+		memset(&conf, 0, sizeof(struct acc100_conf));
+
+		/* Always set in PF mode for built-in configuration */
+		conf.pf_mode_en = true;
+		for (i = 0; i < RTE_ACC100_NUM_VFS; ++i) {
+			conf.arb_dl_4g[i].gbr_threshold1 = ACC100_QOS_GBR;
+			conf.arb_dl_4g[i].gbr_threshold2 = ACC100_QOS_GBR;
+			conf.arb_dl_4g[i].round_robin_weight = ACC100_QMGR_RR;
+			conf.arb_ul_4g[i].gbr_threshold1 = ACC100_QOS_GBR;
+			conf.arb_ul_4g[i].gbr_threshold2 = ACC100_QOS_GBR;
+			conf.arb_ul_4g[i].round_robin_weight = ACC100_QMGR_RR;
+			conf.arb_dl_5g[i].gbr_threshold1 = ACC100_QOS_GBR;
+			conf.arb_dl_5g[i].gbr_threshold2 = ACC100_QOS_GBR;
+			conf.arb_dl_5g[i].round_robin_weight = ACC100_QMGR_RR;
+			conf.arb_ul_5g[i].gbr_threshold1 = ACC100_QOS_GBR;
+			conf.arb_ul_5g[i].gbr_threshold2 = ACC100_QOS_GBR;
+			conf.arb_ul_5g[i].round_robin_weight = ACC100_QMGR_RR;
+		}
+
+		conf.input_pos_llr_1_bit = true;
+		conf.output_pos_llr_1_bit = true;
+		conf.num_vf_bundles = 1; /* Number of VF bundles to set up */
+
+		conf.q_ul_4g.num_qgroups = ACC100_QMGR_NUM_QGS;
+		conf.q_ul_4g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+		conf.q_ul_4g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+		conf.q_ul_4g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+		conf.q_dl_4g.num_qgroups = ACC100_QMGR_NUM_QGS;
+		conf.q_dl_4g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+		conf.q_dl_4g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+		conf.q_dl_4g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+		conf.q_ul_5g.num_qgroups = ACC100_QMGR_NUM_QGS;
+		conf.q_ul_5g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+		conf.q_ul_5g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+		conf.q_ul_5g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+		conf.q_dl_5g.num_qgroups = ACC100_QMGR_NUM_QGS;
+		conf.q_dl_5g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+		conf.q_dl_5g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+		conf.q_dl_5g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+
+		/* setup PF with configuration information */
+		ret = acc100_configure(info->dev_name, &conf);
+		TEST_ASSERT_SUCCESS(ret,
+				"Failed to configure ACC100 PF for bbdev %s",
+				info->dev_name);
+		/* Refresh the device info now that it is configured */
+	}
+	rte_bbdev_info_get(dev_id, info);
+#endif
+
 	nb_queues = RTE_MIN(rte_lcore_count(), info->drv.max_num_queues);
 	nb_queues = RTE_MIN(nb_queues, (unsigned int) MAX_QUEUES);
 
diff --git a/drivers/baseband/acc100/meson.build b/drivers/baseband/acc100/meson.build
index 8afafc2..7ac44dc 100644
--- a/drivers/baseband/acc100/meson.build
+++ b/drivers/baseband/acc100/meson.build
@@ -4,3 +4,5 @@
 deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci']
 
 sources = files('rte_acc100_pmd.c')
+
+install_headers('rte_acc100_cfg.h')
diff --git a/drivers/baseband/acc100/rte_acc100_cfg.h b/drivers/baseband/acc100/rte_acc100_cfg.h
index 73bbe36..7f523bc 100644
--- a/drivers/baseband/acc100/rte_acc100_cfg.h
+++ b/drivers/baseband/acc100/rte_acc100_cfg.h
@@ -89,6 +89,23 @@ struct acc100_conf {
 	struct rte_arbitration_t arb_dl_5g[RTE_ACC100_NUM_VFS];
 };
 
+/**
+ * Configure an ACC100 device
+ *
+ * @param dev_name
+ *   The name of the device. This is the short form of PCI BDF, e.g. 00:01.0.
+ *   It can also be retrieved for a bbdev device from the dev_name field in the
+ *   rte_bbdev_info structure returned by rte_bbdev_info_get().
+ * @param conf
+ *   Configuration to apply to ACC100 HW.
+ *
+ * @return
+ *   Zero on success, negative value on failure.
+ */
+__rte_experimental
+int
+acc100_configure(const char *dev_name, struct acc100_conf *conf);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 3589814..b50dd32 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -85,6 +85,26 @@
 
 enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC};
 
+/* Return the accelerator enum for a Queue Group Index */
+static inline int
+accFromQgid(int qg_idx, const struct acc100_conf *acc100_conf)
+{
+	int accQg[ACC100_NUM_QGRPS];
+	int NumQGroupsPerFn[NUM_ACC];
+	int acc, qgIdx, qgIndex = 0;
+	for (qgIdx = 0; qgIdx < ACC100_NUM_QGRPS; qgIdx++)
+		accQg[qgIdx] = 0;
+	NumQGroupsPerFn[UL_4G] = acc100_conf->q_ul_4g.num_qgroups;
+	NumQGroupsPerFn[UL_5G] = acc100_conf->q_ul_5g.num_qgroups;
+	NumQGroupsPerFn[DL_4G] = acc100_conf->q_dl_4g.num_qgroups;
+	NumQGroupsPerFn[DL_5G] = acc100_conf->q_dl_5g.num_qgroups;
+	for (acc = UL_4G;  acc < NUM_ACC; acc++)
+		for (qgIdx = 0; qgIdx < NumQGroupsPerFn[acc]; qgIdx++)
+			accQg[qgIndex++] = acc;
+	acc = accQg[qg_idx];
+	return acc;
+}
+
 /* Return the queue topology for a Queue Group Index */
 static inline void
 qtopFromAcc(struct rte_q_topology_t **qtop, int acc_enum,
@@ -113,6 +133,30 @@
 	*qtop = p_qtop;
 }
 
+/* Return the AQ depth for a Queue Group Index */
+static inline int
+aqDepth(int qg_idx, struct acc100_conf *acc100_conf)
+{
+	struct rte_q_topology_t *q_top = NULL;
+	int acc_enum = accFromQgid(qg_idx, acc100_conf);
+	qtopFromAcc(&q_top, acc_enum, acc100_conf);
+	if (unlikely(q_top == NULL))
+		return 0;
+	return q_top->aq_depth_log2;
+}
+
+/* Return the number of AQs for a Queue Group Index */
+static inline int
+aqNum(int qg_idx, struct acc100_conf *acc100_conf)
+{
+	struct rte_q_topology_t *q_top = NULL;
+	int acc_enum = accFromQgid(qg_idx, acc100_conf);
+	qtopFromAcc(&q_top, acc_enum, acc100_conf);
+	if (unlikely(q_top == NULL))
+		return 0;
+	return q_top->num_aqs_per_groups;
+}
+
 static void
 initQTop(struct acc100_conf *acc100_conf)
 {
@@ -4177,3 +4221,464 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
 RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
 RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
 RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
+
+/*
+ * Implementation to fix the power-on status of some 5GUL engines.
+ * This requires DMA permission if ported outside DPDK
+ */
+static void
+poweron_cleanup(struct rte_bbdev *bbdev, struct acc100_device *d,
+		struct acc100_conf *conf)
+{
+	int i, template_idx, qg_idx;
+	uint32_t address, status, payload;
+	printf("Need to clear power-on 5GUL status in internal memory\n");
+	/* Reset LDPC Cores */
+	for (i = 0; i < ACC100_ENGINES_MAX; i++)
+		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+				ACC100_ENGINE_OFFSET * i, ACC100_RESET_HI);
+	usleep(LONG_WAIT);
+	for (i = 0; i < ACC100_ENGINES_MAX; i++)
+		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+				ACC100_ENGINE_OFFSET * i, ACC100_RESET_LO);
+	usleep(LONG_WAIT);
+	/* Prepare dummy workload */
+	alloc_2x64mb_sw_rings_mem(bbdev, d, 0);
+	/* Set base addresses */
+	uint32_t phys_high = (uint32_t)(d->sw_rings_phys >> 32);
+	uint32_t phys_low  = (uint32_t)(d->sw_rings_phys &
+			~(ACC100_SIZE_64MBYTE-1));
+	acc100_reg_write(d, HWPfDmaFec5GulDescBaseHiRegVf, phys_high);
+	acc100_reg_write(d, HWPfDmaFec5GulDescBaseLoRegVf, phys_low);
+
+	/* Descriptor for dummy 5GUL code block processing */
+	union acc100_dma_desc *desc = NULL;
+	desc = d->sw_rings;
+	desc->req.data_ptrs[0].address = d->sw_rings_phys +
+			ACC100_DESC_FCW_OFFSET;
+	desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
+	desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
+	desc->req.data_ptrs[0].last = 0;
+	desc->req.data_ptrs[0].dma_ext = 0;
+	desc->req.data_ptrs[1].address = d->sw_rings_phys + 512;
+	desc->req.data_ptrs[1].blkid = ACC100_DMA_BLKID_IN;
+	desc->req.data_ptrs[1].last = 1;
+	desc->req.data_ptrs[1].dma_ext = 0;
+	desc->req.data_ptrs[1].blen = 44;
+	desc->req.data_ptrs[2].address = d->sw_rings_phys + 1024;
+	desc->req.data_ptrs[2].blkid = ACC100_DMA_BLKID_OUT_ENC;
+	desc->req.data_ptrs[2].last = 1;
+	desc->req.data_ptrs[2].dma_ext = 0;
+	desc->req.data_ptrs[2].blen = 5;
+	/* Dummy FCW */
+	desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
+	desc->req.fcw_ld.qm = 1;
+	desc->req.fcw_ld.nfiller = 30;
+	desc->req.fcw_ld.BG = 2 - 1;
+	desc->req.fcw_ld.Zc = 7;
+	desc->req.fcw_ld.ncb = 350;
+	desc->req.fcw_ld.rm_e = 4;
+	desc->req.fcw_ld.itmax = 10;
+	desc->req.fcw_ld.gain_i = 1;
+	desc->req.fcw_ld.gain_h = 1;
+
+	int engines_to_restart[SIG_UL_5G_LAST + 1] = {0};
+	int num_failed_engine = 0;
+	/* Detect engines in undefined state */
+	for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
+			template_idx++) {
+		/* Check engine power-on status */
+		address = HwPfFecUl5gIbDebugReg +
+				ACC100_ENGINE_OFFSET * template_idx;
+		status = (acc100_reg_read(d, address) >> 4) & 0xF;
+		if (status == 0) {
+			engines_to_restart[num_failed_engine] = template_idx;
+			num_failed_engine++;
+		}
+	}
+
+	int numQqsAcc = conf->q_ul_4g.num_qgroups;
+	int numQgs = conf->q_ul_5g.num_qgroups;
+	payload = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		payload |= (1 << qg_idx);
+	/* Force each engine which is in unspecified state */
+	for (i = 0; i < num_failed_engine; i++) {
+		int failed_engine = engines_to_restart[i];
+		printf("Force engine %d\n", failed_engine);
+		for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
+				template_idx++) {
+			address = HWPfQmgrGrpTmplateReg4Indx
+					+ BYTES_IN_WORD * template_idx;
+			if (template_idx == failed_engine)
+				acc100_reg_write(d, address, payload);
+			else
+				acc100_reg_write(d, address, 0);
+		}
+		/* Reset descriptor header */
+		desc->req.word0 = ACC100_DMA_DESC_TYPE;
+		desc->req.word1 = 0;
+		desc->req.word2 = 0;
+		desc->req.word3 = 0;
+		desc->req.numCBs = 1;
+		desc->req.m2dlen = 2;
+		desc->req.d2mlen = 1;
+		/* Enqueue the code block for processing */
+		union acc100_enqueue_reg_fmt enq_req;
+		enq_req.val = 0;
+		enq_req.addr_offset = ACC100_DESC_OFFSET;
+		enq_req.num_elem = 1;
+		enq_req.req_elem_addr = 0;
+		rte_wmb();
+		acc100_reg_write(d, HWPfQmgrIngressAq + 0x100, enq_req.val);
+		usleep(LONG_WAIT * 100);
+		if (desc->req.word0 != 2)
+			printf("DMA Response %#"PRIx32"\n", desc->req.word0);
+	}
+
+	/* Reset LDPC Cores */
+	for (i = 0; i < ACC100_ENGINES_MAX; i++)
+		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+				ACC100_ENGINE_OFFSET * i, ACC100_RESET_HI);
+	usleep(LONG_WAIT);
+	for (i = 0; i < ACC100_ENGINES_MAX; i++)
+		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+				ACC100_ENGINE_OFFSET * i, ACC100_RESET_LO);
+	usleep(LONG_WAIT);
+	acc100_reg_write(d, HWPfHi5GHardResetReg, ACC100_RESET_HARD);
+	usleep(LONG_WAIT);
+	int numEngines = 0;
+	/* Check engine power-on status again */
+	for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
+			template_idx++) {
+		address = HwPfFecUl5gIbDebugReg +
+				ACC100_ENGINE_OFFSET * template_idx;
+		status = (acc100_reg_read(d, address) >> 4) & 0xF;
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD * template_idx;
+		if (status == 1) {
+			acc100_reg_write(d, address, payload);
+			numEngines++;
+		} else
+			acc100_reg_write(d, address, 0);
+	}
+	printf("Number of 5GUL engines %d\n", numEngines);
+
+	if (d->sw_rings_base != NULL)
+		rte_free(d->sw_rings_base);
+	usleep(LONG_WAIT);
+}
+
+/* Initial configuration of an ACC100 device prior to running configure() */
+int
+acc100_configure(const char *dev_name, struct acc100_conf *conf)
+{
+	rte_bbdev_log(INFO, "acc100_configure");
+	uint32_t payload, address, status;
+	int qg_idx, template_idx, vf_idx, acc, i;
+	struct rte_bbdev *bbdev = rte_bbdev_get_named_dev(dev_name);
+
+	/* Compile time checks */
+	RTE_BUILD_BUG_ON(sizeof(struct acc100_dma_req_desc) != 256);
+	RTE_BUILD_BUG_ON(sizeof(union acc100_dma_desc) != 256);
+	RTE_BUILD_BUG_ON(sizeof(struct acc100_fcw_td) != 24);
+	RTE_BUILD_BUG_ON(sizeof(struct acc100_fcw_te) != 32);
+
+	if (bbdev == NULL) {
+		rte_bbdev_log(ERR,
+				"Invalid dev_name (%s), or device is not yet initialised",
+				dev_name);
+		return -ENODEV;
+	}
+	struct acc100_device *d = bbdev->data->dev_private;
+
+	/* Store configuration */
+	rte_memcpy(&d->acc100_conf, conf, sizeof(d->acc100_conf));
+
+	/* PCIe Bridge configuration */
+	acc100_reg_write(d, HwPfPcieGpexBridgeControl, ACC100_CFG_PCI_BRIDGE);
+	for (i = 1; i < 17; i++)
+		acc100_reg_write(d,
+				HwPfPcieGpexAxiAddrMappingWindowPexBaseHigh
+				+ i * 16, 0);
+
+	/* PCIe Link Training and Status State Machine */
+	acc100_reg_write(d, HwPfPcieGpexLtssmStateCntrl, 0xDFC00000);
+
+	/* Prevent blocking AXI read on BRESP for AXI Write */
+	address = HwPfPcieGpexAxiPioControl;
+	payload = ACC100_CFG_PCI_AXI;
+	acc100_reg_write(d, address, payload);
+
+	/* 5GDL PLL phase shift */
+	acc100_reg_write(d, HWPfChaDl5gPllPhshft0, 0x1);
+
+	/* Explicitly releasing AXI as this may be stopped after PF FLR/BME */
+	address = HWPfDmaAxiControl;
+	payload = 1;
+	acc100_reg_write(d, address, payload);
+
+	/* DDR Configuration */
+	address = HWPfDdrBcTim6;
+	payload = acc100_reg_read(d, address);
+	payload &= 0xFFFFFFFB; /* Bit 2 */
+#ifdef ACC100_DDR_ECC_ENABLE
+	payload |= 0x4;
+#endif
+	acc100_reg_write(d, address, payload);
+	address = HWPfDdrPhyDqsCountNum;
+#ifdef ACC100_DDR_ECC_ENABLE
+	payload = 9;
+#else
+	payload = 8;
+#endif
+	acc100_reg_write(d, address, payload);
+
+	/* Set default descriptor signature */
+	address = HWPfDmaDescriptorSignatuture;
+	payload = 0;
+	acc100_reg_write(d, address, payload);
+
+	/* Enable the Error Detection in DMA */
+	payload = ACC100_CFG_DMA_ERROR;
+	address = HWPfDmaErrorDetectionEn;
+	acc100_reg_write(d, address, payload);
+
+	/* AXI Cache configuration */
+	payload = ACC100_CFG_AXI_CACHE;
+	address = HWPfDmaAxcacheReg;
+	acc100_reg_write(d, address, payload);
+
+	/* Default DMA Configuration (Qmgr Enabled) */
+	address = HWPfDmaConfig0Reg;
+	payload = 0;
+	acc100_reg_write(d, address, payload);
+	address = HWPfDmaQmanen;
+	payload = 0;
+	acc100_reg_write(d, address, payload);
+
+	/* Default RLIM/ALEN configuration */
+	address = HWPfDmaConfig1Reg;
+	payload = (1 << 31) + (23 << 8) + (1 << 6) + 7;
+	acc100_reg_write(d, address, payload);
+
+	/* Configure DMA Qmanager addresses */
+	address = HWPfDmaQmgrAddrReg;
+	payload = HWPfQmgrEgressQueuesTemplate;
+	acc100_reg_write(d, address, payload);
+
+	/* ===== Qmgr Configuration ===== */
+	/* Configuration of the AQueue Depth QMGR_GRP_0_DEPTH_LOG2 for UL */
+	int totalQgs = conf->q_ul_4g.num_qgroups +
+			conf->q_ul_5g.num_qgroups +
+			conf->q_dl_4g.num_qgroups +
+			conf->q_dl_5g.num_qgroups;
+	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
+		address = HWPfQmgrDepthLog2Grp +
+		BYTES_IN_WORD * qg_idx;
+		payload = aqDepth(qg_idx, conf);
+		acc100_reg_write(d, address, payload);
+		address = HWPfQmgrTholdGrp +
+		BYTES_IN_WORD * qg_idx;
+		payload = (1 << 16) + (1 << (aqDepth(qg_idx, conf) - 1));
+		acc100_reg_write(d, address, payload);
+	}
+
+	/* Template Priority in incremental order */
+	for (template_idx = 0; template_idx < ACC100_NUM_TMPL;
+			template_idx++) {
+		address = HWPfQmgrGrpTmplateReg0Indx +
+		BYTES_IN_WORD * (template_idx % 8);
+		payload = TMPL_PRI_0;
+		acc100_reg_write(d, address, payload);
+		address = HWPfQmgrGrpTmplateReg1Indx +
+		BYTES_IN_WORD * (template_idx % 8);
+		payload = TMPL_PRI_1;
+		acc100_reg_write(d, address, payload);
+		address = HWPfQmgrGrpTmplateReg2indx +
+		BYTES_IN_WORD * (template_idx % 8);
+		payload = TMPL_PRI_2;
+		acc100_reg_write(d, address, payload);
+		address = HWPfQmgrGrpTmplateReg3Indx +
+		BYTES_IN_WORD * (template_idx % 8);
+		payload = TMPL_PRI_3;
+		acc100_reg_write(d, address, payload);
+	}
+
+	address = HWPfQmgrGrpPriority;
+	payload = ACC100_CFG_QMGR_HI_P;
+	acc100_reg_write(d, address, payload);
+
+	/* Template Configuration */
+	for (template_idx = 0; template_idx < ACC100_NUM_TMPL; template_idx++) {
+		payload = 0;
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD * template_idx;
+		acc100_reg_write(d, address, payload);
+	}
+	/* 4GUL */
+	int numQgs = conf->q_ul_4g.num_qgroups;
+	int numQqsAcc = 0;
+	payload = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		payload |= (1 << qg_idx);
+	for (template_idx = SIG_UL_4G; template_idx <= SIG_UL_4G_LAST;
+			template_idx++) {
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD*template_idx;
+		acc100_reg_write(d, address, payload);
+	}
+	/* 5GUL */
+	numQqsAcc += numQgs;
+	numQgs	= conf->q_ul_5g.num_qgroups;
+	payload = 0;
+	int numEngines = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		payload |= (1 << qg_idx);
+	for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
+			template_idx++) {
+		/* Check engine power-on status */
+		address = HwPfFecUl5gIbDebugReg +
+				ACC100_ENGINE_OFFSET * template_idx;
+		status = (acc100_reg_read(d, address) >> 4) & 0xF;
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD * template_idx;
+		if (status == 1) {
+			acc100_reg_write(d, address, payload);
+			numEngines++;
+		} else
+			acc100_reg_write(d, address, 0);
+		#if RTE_ACC100_SINGLE_FEC == 1
+		payload = 0;
+		#endif
+	}
+	printf("Number of 5GUL engines %d\n", numEngines);
+	/* 4GDL */
+	numQqsAcc += numQgs;
+	numQgs	= conf->q_dl_4g.num_qgroups;
+	payload = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		payload |= (1 << qg_idx);
+	for (template_idx = SIG_DL_4G; template_idx <= SIG_DL_4G_LAST;
+			template_idx++) {
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD*template_idx;
+		acc100_reg_write(d, address, payload);
+		#if RTE_ACC100_SINGLE_FEC == 1
+			payload = 0;
+		#endif
+	}
+	/* 5GDL */
+	numQqsAcc += numQgs;
+	numQgs	= conf->q_dl_5g.num_qgroups;
+	payload = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		payload |= (1 << qg_idx);
+	for (template_idx = SIG_DL_5G; template_idx <= SIG_DL_5G_LAST;
+			template_idx++) {
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD*template_idx;
+		acc100_reg_write(d, address, payload);
+		#if RTE_ACC100_SINGLE_FEC == 1
+		payload = 0;
+		#endif
+	}
+
+	/* Queue Group Function mapping */
+	int qman_func_id[5] = {0, 2, 1, 3, 4};
+	address = HWPfQmgrGrpFunction0;
+	payload = 0;
+	for (qg_idx = 0; qg_idx < 8; qg_idx++) {
+		acc = accFromQgid(qg_idx, conf);
+		payload |= qman_func_id[acc]<<(qg_idx * 4);
+	}
+	acc100_reg_write(d, address, payload);
+
+	/* Configuration of the Arbitration QGroup depth to 1 */
+	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
+		address = HWPfQmgrArbQDepthGrp +
+		BYTES_IN_WORD * qg_idx;
+		payload = 0;
+		acc100_reg_write(d, address, payload);
+	}
+
+	/* Enabling AQueues through the Queue hierarchy*/
+	for (vf_idx = 0; vf_idx < ACC100_NUM_VFS; vf_idx++) {
+		for (qg_idx = 0; qg_idx < ACC100_NUM_QGRPS; qg_idx++) {
+			payload = 0;
+			if (vf_idx < conf->num_vf_bundles &&
+					qg_idx < totalQgs)
+				payload = (1 << aqNum(qg_idx, conf)) - 1;
+			address = HWPfQmgrAqEnableVf
+					+ vf_idx * BYTES_IN_WORD;
+			payload += (qg_idx << 16);
+			acc100_reg_write(d, address, payload);
+		}
+	}
+
+	/* This pointer to ARAM (256kB) is shifted by 2 (4B per register) */
+	uint32_t aram_address = 0;
+	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
+		for (vf_idx = 0; vf_idx < conf->num_vf_bundles; vf_idx++) {
+			address = HWPfQmgrVfBaseAddr + vf_idx
+					* BYTES_IN_WORD + qg_idx
+					* BYTES_IN_WORD * 64;
+			payload = aram_address;
+			acc100_reg_write(d, address, payload);
+			/* Offset ARAM Address for next memory bank
+			 * - increment of 4B
+			 */
+			aram_address += aqNum(qg_idx, conf) *
+					(1 << aqDepth(qg_idx, conf));
+		}
+	}
+
+	if (aram_address > WORDS_IN_ARAM_SIZE) {
+		rte_bbdev_log(ERR, "ARAM Configuration not fitting %d %d",
+				aram_address, WORDS_IN_ARAM_SIZE);
+		return -EINVAL;
+	}
+
+	/* ==== HI Configuration ==== */
+
+	/* Prevent Block on Transmit Error */
+	address = HWPfHiBlockTransmitOnErrorEn;
+	payload = 0;
+	acc100_reg_write(d, address, payload);
+	/* Prevent MSI from being dropped */
+	address = HWPfHiMsiDropEnableReg;
+	payload = 0;
+	acc100_reg_write(d, address, payload);
+	/* Set the PF Mode register */
+	address = HWPfHiPfMode;
+	payload = (conf->pf_mode_en) ? 2 : 0;
+	acc100_reg_write(d, address, payload);
+	/* Enable Error Detection in HW */
+	address = HWPfDmaErrorDetectionEn;
+	payload = 0x3D7;
+	acc100_reg_write(d, address, payload);
+
+	/* QoS overflow init */
+	payload = 1;
+	address = HWPfQosmonAEvalOverflow0;
+	acc100_reg_write(d, address, payload);
+	address = HWPfQosmonBEvalOverflow0;
+	acc100_reg_write(d, address, payload);
+
+	/* HARQ DDR Configuration */
+	unsigned int ddrSizeInMb = 512; /* Fixed to 512 MB per VF for now */
+	for (vf_idx = 0; vf_idx < conf->num_vf_bundles; vf_idx++) {
+		address = HWPfDmaVfDdrBaseRw + vf_idx
+				* 0x10;
+		payload = ((vf_idx * (ddrSizeInMb / 64)) << 16) +
+				(ddrSizeInMb - 1);
+		acc100_reg_write(d, address, payload);
+	}
+	usleep(LONG_WAIT);
+
+	if (numEngines < (SIG_UL_5G_LAST + 1))
+		poweron_cleanup(bbdev, d, conf);
+
+	rte_bbdev_log_debug("PF Tip configuration complete for %s", dev_name);
+	return 0;
+}
diff --git a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
index 4a76d1d..91c234d 100644
--- a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
+++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
@@ -1,3 +1,10 @@
 DPDK_21 {
 	local: *;
 };
+
+EXPERIMENTAL {
+	global:
+
+	acc100_configure;
+
+};
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v9 00/10] bbdev PMD ACC100
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 11/11] doc: update bbdev feature table Nicolas Chautru
                     ` (4 preceding siblings ...)
  2020-09-28 23:52   ` [dpdk-dev] [PATCH v8 00/10] bbdev PMD ACC100 Nicolas Chautru
@ 2020-09-29  0:29   ` Nicolas Chautru
  2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 01/10] drivers/baseband: add PMD for ACC100 Nicolas Chautru
                       ` (9 more replies)
  2020-10-01  3:14   ` [dpdk-dev] [PATCH v10 00/10] bbdev PMD ACC100 Nicolas Chautru
                     ` (2 subsequent siblings)
  8 siblings, 10 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-29  0:29 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

v9: moved the release notes update to the last commit
v8: integrated the doc feature table in previous commit as suggested. 
v7: Fingers trouble. Previous one sent mid-rebase. My bad. 
v6: removed a legacy makefile no longer required
v5: rebase based on latest on main. The legacy makefiles are removed. 
v4: an odd compilation error is reported for one CI variant using "gcc latest" which looks to me like a false positive of maybe-undeclared. 
http://mails.dpdk.org/archives/test-report/2020-August/148936.html
Still forcing a dummy declare to remove this CI warning I will check with ci@dpdk.org in parallel.  
v3: missed a change during rebase
v2: includes clean up from latest CI checks.


Nicolas Chautru (10):
  drivers/baseband: add PMD for ACC100
  baseband/acc100: add register definition file
  baseband/acc100: add info get function
  baseband/acc100: add queue configuration
  baseband/acc100: add LDPC processing functions
  baseband/acc100: add HARQ loopback support
  baseband/acc100: add support for 4G processing
  baseband/acc100: add interrupt support to PMD
  baseband/acc100: add debug function to validate input
  baseband/acc100: add configure function

 app/test-bbdev/meson.build                         |    3 +
 app/test-bbdev/test_bbdev_perf.c                   |   72 +
 doc/guides/bbdevs/acc100.rst                       |  233 +
 doc/guides/bbdevs/features/acc100.ini              |   14 +
 doc/guides/bbdevs/index.rst                        |    1 +
 doc/guides/rel_notes/release_20_11.rst             |    5 +
 drivers/baseband/acc100/acc100_pf_enum.h           | 1068 +++++
 drivers/baseband/acc100/acc100_vf_enum.h           |   73 +
 drivers/baseband/acc100/meson.build                |    8 +
 drivers/baseband/acc100/rte_acc100_cfg.h           |  113 +
 drivers/baseband/acc100/rte_acc100_pmd.c           | 4684 ++++++++++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.h           |  593 +++
 .../acc100/rte_pmd_bbdev_acc100_version.map        |   10 +
 drivers/baseband/meson.build                       |    2 +-
 14 files changed, 6878 insertions(+), 1 deletion(-)
 create mode 100644 doc/guides/bbdevs/acc100.rst
 create mode 100644 doc/guides/bbdevs/features/acc100.ini
 create mode 100644 drivers/baseband/acc100/acc100_pf_enum.h
 create mode 100644 drivers/baseband/acc100/acc100_vf_enum.h
 create mode 100644 drivers/baseband/acc100/meson.build
 create mode 100644 drivers/baseband/acc100/rte_acc100_cfg.h
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.c
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.h
 create mode 100644 drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v9 01/10] drivers/baseband: add PMD for ACC100
  2020-09-29  0:29   ` [dpdk-dev] [PATCH v9 00/10] bbdev PMD ACC100 Nicolas Chautru
@ 2020-09-29  0:29     ` Nicolas Chautru
  2020-09-29 19:53       ` Tom Rix
  2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 02/10] baseband/acc100: add register definition file Nicolas Chautru
                       ` (8 subsequent siblings)
  9 siblings, 1 reply; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-29  0:29 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add stubs for the ACC100 PMD

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 doc/guides/bbdevs/acc100.rst                       | 233 +++++++++++++++++++++
 doc/guides/bbdevs/features/acc100.ini              |  14 ++
 doc/guides/bbdevs/index.rst                        |   1 +
 drivers/baseband/acc100/meson.build                |   6 +
 drivers/baseband/acc100/rte_acc100_pmd.c           | 175 ++++++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.h           |  37 ++++
 .../acc100/rte_pmd_bbdev_acc100_version.map        |   3 +
 drivers/baseband/meson.build                       |   2 +-
 8 files changed, 470 insertions(+), 1 deletion(-)
 create mode 100644 doc/guides/bbdevs/acc100.rst
 create mode 100644 doc/guides/bbdevs/features/acc100.ini
 create mode 100644 drivers/baseband/acc100/meson.build
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.c
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.h
 create mode 100644 drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map

diff --git a/doc/guides/bbdevs/acc100.rst b/doc/guides/bbdevs/acc100.rst
new file mode 100644
index 0000000..f87ee09
--- /dev/null
+++ b/doc/guides/bbdevs/acc100.rst
@@ -0,0 +1,233 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2020 Intel Corporation
+
+Intel(R) ACC100 5G/4G FEC Poll Mode Driver
+==========================================
+
+The BBDEV ACC100 5G/4G FEC poll mode driver (PMD) supports an
+implementation of a VRAN FEC wireless acceleration function.
+This device is also known as Mount Bryce.
+
+Features
+--------
+
+ACC100 5G/4G FEC PMD supports the following features:
+
+- LDPC Encode in the DL (5GNR)
+- LDPC Decode in the UL (5GNR)
+- Turbo Encode in the DL (4G)
+- Turbo Decode in the UL (4G)
+- 16 VFs per PF (physical device)
+- Maximum of 128 queues per VF
+- PCIe Gen-3 x16 Interface
+- MSI
+- SR-IOV
+
+ACC100 5G/4G FEC PMD supports the following BBDEV capabilities:
+
+* For the LDPC encode operation:
+   - ``RTE_BBDEV_LDPC_CRC_24B_ATTACH`` :  set to attach CRC24B to CB(s)
+   - ``RTE_BBDEV_LDPC_RATE_MATCH`` :  if set then do not do Rate Match bypass
+   - ``RTE_BBDEV_LDPC_INTERLEAVER_BYPASS`` : if set then bypass interleaver
+
+* For the LDPC decode operation:
+   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK`` :  check CRC24B from CB(s)
+   - ``RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE`` :  disable early termination
+   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP`` :  drops CRC24B bits appended while decoding
+   - ``RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE`` :  provides an input for HARQ combining
+   - ``RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE`` :  provides an output for HARQ combining
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE`` :  HARQ memory input is internal
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE`` :  HARQ memory output is internal
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK`` :  loopback data to/from HARQ memory
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_FILLERS`` :  HARQ memory includes the filler bits
+   - ``RTE_BBDEV_LDPC_DEC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
+   - ``RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION`` :  supports compression of the HARQ input/output
+   - ``RTE_BBDEV_LDPC_LLR_COMPRESSION`` :  supports LLR input compression
+
+* For the turbo encode operation:
+   - ``RTE_BBDEV_TURBO_CRC_24B_ATTACH`` :  set to attach CRC24B to CB(s)
+   - ``RTE_BBDEV_TURBO_RATE_MATCH`` :  if set then do not do Rate Match bypass
+   - ``RTE_BBDEV_TURBO_ENC_INTERRUPTS`` :  set for encoder dequeue interrupts
+   - ``RTE_BBDEV_TURBO_RV_INDEX_BYPASS`` :  set to bypass RV index
+   - ``RTE_BBDEV_TURBO_ENC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
+
+* For the turbo decode operation:
+   - ``RTE_BBDEV_TURBO_CRC_TYPE_24B`` :  check CRC24B from CB(s)
+   - ``RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE`` :  perform subblock de-interleave
+   - ``RTE_BBDEV_TURBO_DEC_INTERRUPTS`` :  set for decoder dequeue interrupts
+   - ``RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN`` :  set if negative LLR decoder input is supported
+   - ``RTE_BBDEV_TURBO_POS_LLR_1_BIT_IN`` :  set if positive LLR decoder input is supported
+   - ``RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP`` :  keep CRC24B bits appended while decoding
+   - ``RTE_BBDEV_TURBO_EARLY_TERMINATION`` :  set the early termination feature
+   - ``RTE_BBDEV_TURBO_DEC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
+   - ``RTE_BBDEV_TURBO_HALF_ITERATION_EVEN`` :  set half iteration granularity
+
+Installation
+------------
+
+Section 3 of the DPDK manual provides instructions on installing and compiling DPDK. The
+default set of bbdev compile flags may be found in config/common_base, where for example
+the flag to build the ACC100 5G/4G FEC device, ``CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100``,
+is already set.
+
+DPDK requires hugepages to be configured as detailed in section 2 of the DPDK manual.
+The bbdev test application has been tested with a configuration of 40 x 1GB hugepages. The
+hugepage configuration of a server may be examined using:
+
+.. code-block:: console
+
+   grep Huge* /proc/meminfo
+
+
+Initialization
+--------------
+
+When the device first powers up, its PCI physical function (PF) can be listed with this command:
+
+.. code-block:: console
+
+  sudo lspci -vd8086:0d5c
+
+The physical and virtual functions are compatible with Linux UIO drivers:
+``vfio`` and ``igb_uio``. However, in order to work, the ACC100 5G/4G
+FEC device first needs to be bound to one of these Linux drivers through DPDK.
+
+
+Bind PF UIO driver(s)
+~~~~~~~~~~~~~~~~~~~~~
+
+Install the DPDK igb_uio driver, bind it with the PF PCI device ID and use
+``lspci`` to confirm that the PF device is in use by the ``igb_uio`` DPDK UIO driver.
+
+The igb_uio driver may be bound to the PF PCI device using one of three methods:
+
+
+1. PCI functions (physical or virtual, depending on the use case) can be bound to
+the UIO driver by repeating this command for every function.
+
+.. code-block:: console
+
+  cd <dpdk-top-level-directory>
+  insmod ./build/kmod/igb_uio.ko
+  echo "8086 0d5c" > /sys/bus/pci/drivers/igb_uio/new_id
+  lspci -vd8086:0d5c
+
+
+2. Another way to bind the PF to the DPDK UIO driver is to use the ``dpdk-devbind.py`` tool:
+
+.. code-block:: console
+
+  cd <dpdk-top-level-directory>
+  ./usertools/dpdk-devbind.py -b igb_uio 0000:06:00.0
+
+where the PCI device ID (for example: 0000:06:00.0) is obtained using ``lspci -vd8086:0d5c``.
+
+
+3. A third way to bind is to use the ``dpdk-setup.sh`` tool:
+
+.. code-block:: console
+
+  cd <dpdk-top-level-directory>
+  ./usertools/dpdk-setup.sh
+
+  select 'Bind Ethernet/Crypto/Baseband device to IGB UIO module'
+  or
+  select 'Bind Ethernet/Crypto/Baseband device to VFIO module' depending on driver required
+  enter PCI device ID
+  select 'Display current Ethernet/Crypto/Baseband device settings' to confirm binding
+
+
+In the same way the ACC100 5G/4G FEC PF can be bound with vfio; however, the vfio
+driver does not support SR-IOV configuration out of the box and will need to be patched.
+
+
+Enable Virtual Functions
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+The ``lspci`` printout should now show that the PCI PF is under igb_uio control:
+"``Kernel driver in use: igb_uio``"
+
+To show the number of available VFs on the device, read the ``sriov_totalvfs`` file:
+
+.. code-block:: console
+
+  cat /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_totalvfs
+
+  where 0000\:<b>\:<d>.<f> is the PCI device ID
+
+
+To enable VFs via igb_uio, echo the number of virtual functions to enable
+into the ``max_vfs`` file:
+
+.. code-block:: console
+
+  echo <num-of-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/max_vfs
+
+
+Afterwards, all VFs must be bound to the appropriate UIO drivers as required, in the
+same way as was done with the physical function previously.
+
+Enabling SR-IOV via the vfio driver works in much the same way, except that the
+file name is different:
+
+.. code-block:: console
+
+  echo <num-of-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_numvfs
+
+
+Configure the VFs through PF
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The PCI virtual functions must be configured before use and before being assigned
+to VMs/Containers. The configuration involves allocating the number of hardware
+queues, priorities, load balance, bandwidth and other settings necessary for the
+device to perform FEC functions.
+
+This configuration needs to be executed at least once after reboot or PCI FLR and can
+be achieved by using the function ``acc100_configure()``, which sets up the
+parameters defined in ``acc100_conf`` structure.
+
+Test Application
+----------------
+
+BBDEV provides a test application, ``test-bbdev.py``, and a range of test data for
+testing the functionality of ACC100 5G/4G FEC encode and decode, depending on the
+device's capabilities. The test application is located in the ``app/test-bbdev``
+folder and has the following options:
+
+.. code-block:: console
+
+  "-p", "--testapp-path": specifies path to the bbdev test app.
+  "-e", "--eal-params"	: EAL arguments which are passed to the test app.
+  "-t", "--timeout"	: Timeout in seconds (default=300).
+  "-c", "--test-cases"	: Defines test cases to run. Run all if not specified.
+  "-v", "--test-vector"	: Test vector path (default=dpdk_path+/app/test-bbdev/test_vectors/bbdev_null.data).
+  "-n", "--num-ops"	: Number of operations to process on device (default=32).
+  "-b", "--burst-size"	: Operations enqueue/dequeue burst size (default=32).
+  "-s", "--snr"		: SNR in dB used when generating LLRs for bler tests.
+  "-s", "--iter_max"	: Number of iterations for LDPC decoder.
+  "-l", "--num-lcores"	: Number of lcores to run (default=16).
+  "-i", "--init-device" : Initialise PF device with default values.
+
+
+To execute the test application tool using simple decode or encode data,
+type one of the following:
+
+.. code-block:: console
+
+  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_dec_default.data
+  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_enc_default.data
+
+
+The test application ``test-bbdev.py`` supports the ability to configure the PF device with
+a default set of values if the ``-i`` or ``--init-device`` option is included. The default
+values are defined in ``test_bbdev_perf.c``.
+
+
+Test Vectors
+~~~~~~~~~~~~
+
+In addition to the simple LDPC decoder and LDPC encoder tests, bbdev also provides
+a range of additional tests under the ``test_vectors`` folder, which may be useful. The
+results of these tests will depend on the ACC100 5G/4G FEC capabilities, which may
+cause some test cases to be skipped, but no failure should be reported.
diff --git a/doc/guides/bbdevs/features/acc100.ini b/doc/guides/bbdevs/features/acc100.ini
new file mode 100644
index 0000000..c89a4d7
--- /dev/null
+++ b/doc/guides/bbdevs/features/acc100.ini
@@ -0,0 +1,14 @@
+;
+; Supported features of the 'acc100' bbdev driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+Turbo Decoder (4G)     = N
+Turbo Encoder (4G)     = N
+LDPC Decoder (5G)      = N
+LDPC Encoder (5G)      = N
+LLR/HARQ Compression   = N
+External DDR Access    = N
+HW Accelerated         = Y
+BBDEV API              = Y
diff --git a/doc/guides/bbdevs/index.rst b/doc/guides/bbdevs/index.rst
index a8092dd..4445cbd 100644
--- a/doc/guides/bbdevs/index.rst
+++ b/doc/guides/bbdevs/index.rst
@@ -13,3 +13,4 @@ Baseband Device Drivers
     turbo_sw
     fpga_lte_fec
     fpga_5gnr_fec
+    acc100
diff --git a/drivers/baseband/acc100/meson.build b/drivers/baseband/acc100/meson.build
new file mode 100644
index 0000000..8afafc2
--- /dev/null
+++ b/drivers/baseband/acc100/meson.build
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2020 Intel Corporation
+
+deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci']
+
+sources = files('rte_acc100_pmd.c')
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
new file mode 100644
index 0000000..1b4cd13
--- /dev/null
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -0,0 +1,175 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include <unistd.h>
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_dev.h>
+#include <rte_malloc.h>
+#include <rte_mempool.h>
+#include <rte_byteorder.h>
+#include <rte_errno.h>
+#include <rte_branch_prediction.h>
+#include <rte_hexdump.h>
+#include <rte_pci.h>
+#include <rte_bus_pci.h>
+
+#include <rte_bbdev.h>
+#include <rte_bbdev_pmd.h>
+#include "rte_acc100_pmd.h"
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, DEBUG);
+#else
+RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
+#endif
+
+/* Free 64MB memory used for software rings */
+static int
+acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
+{
+	return 0;
+}
+
+static const struct rte_bbdev_ops acc100_bbdev_ops = {
+	.close = acc100_dev_close,
+};
+
+/* ACC100 PCI PF address map */
+static struct rte_pci_id pci_id_acc100_pf_map[] = {
+	{
+		RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID, RTE_ACC100_PF_DEVICE_ID)
+	},
+	{.device_id = 0},
+};
+
+/* ACC100 PCI VF address map */
+static struct rte_pci_id pci_id_acc100_vf_map[] = {
+	{
+		RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID, RTE_ACC100_VF_DEVICE_ID)
+	},
+	{.device_id = 0},
+};
+
+/* Initialization Function */
+static void
+acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
+{
+	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
+
+	dev->dev_ops = &acc100_bbdev_ops;
+
+	((struct acc100_device *) dev->data->dev_private)->pf_device =
+			!strcmp(drv->driver.name,
+					RTE_STR(ACC100PF_DRIVER_NAME));
+	((struct acc100_device *) dev->data->dev_private)->mmio_base =
+			pci_dev->mem_resource[0].addr;
+
+	rte_bbdev_log_debug("Init device %s [%s] @ vaddr %p paddr %#"PRIx64"",
+			drv->driver.name, dev->data->name,
+			(void *)pci_dev->mem_resource[0].addr,
+			pci_dev->mem_resource[0].phys_addr);
+}
+
+static int acc100_pci_probe(struct rte_pci_driver *pci_drv,
+	struct rte_pci_device *pci_dev)
+{
+	struct rte_bbdev *bbdev = NULL;
+	char dev_name[RTE_BBDEV_NAME_MAX_LEN];
+
+	if (pci_dev == NULL) {
+		rte_bbdev_log(ERR, "NULL PCI device");
+		return -EINVAL;
+	}
+
+	rte_pci_device_name(&pci_dev->addr, dev_name, sizeof(dev_name));
+
+	/* Allocate memory to be used privately by drivers */
+	bbdev = rte_bbdev_allocate(pci_dev->device.name);
+	if (bbdev == NULL)
+		return -ENODEV;
+
+	/* allocate device private memory */
+	bbdev->data->dev_private = rte_zmalloc_socket(dev_name,
+			sizeof(struct acc100_device), RTE_CACHE_LINE_SIZE,
+			pci_dev->device.numa_node);
+
+	if (bbdev->data->dev_private == NULL) {
+		rte_bbdev_log(CRIT,
+				"Allocate of %zu bytes for device \"%s\" failed",
+				sizeof(struct acc100_device), dev_name);
+		rte_bbdev_release(bbdev);
+		return -ENOMEM;
+	}
+
+	/* Fill HW specific part of device structure */
+	bbdev->device = &pci_dev->device;
+	bbdev->intr_handle = &pci_dev->intr_handle;
+	bbdev->data->socket_id = pci_dev->device.numa_node;
+
+	/* Invoke ACC100 device initialization function */
+	acc100_bbdev_init(bbdev, pci_drv);
+
+	rte_bbdev_log_debug("Initialised bbdev %s (id = %u)",
+			dev_name, bbdev->data->dev_id);
+	return 0;
+}
+
+static int acc100_pci_remove(struct rte_pci_device *pci_dev)
+{
+	struct rte_bbdev *bbdev;
+	int ret;
+	uint8_t dev_id;
+
+	if (pci_dev == NULL)
+		return -EINVAL;
+
+	/* Find device */
+	bbdev = rte_bbdev_get_named_dev(pci_dev->device.name);
+	if (bbdev == NULL) {
+		rte_bbdev_log(CRIT,
+				"Couldn't find HW dev \"%s\" to uninitialise it",
+				pci_dev->device.name);
+		return -ENODEV;
+	}
+	dev_id = bbdev->data->dev_id;
+
+	/* free device private memory before close */
+	rte_free(bbdev->data->dev_private);
+
+	/* Close device */
+	ret = rte_bbdev_close(dev_id);
+	if (ret < 0)
+		rte_bbdev_log(ERR,
+				"Device %i failed to close during uninit: %i",
+				dev_id, ret);
+
+	/* release bbdev from library */
+	rte_bbdev_release(bbdev);
+
+	rte_bbdev_log_debug("Destroyed bbdev = %u", dev_id);
+
+	return 0;
+}
+
+static struct rte_pci_driver acc100_pci_pf_driver = {
+		.probe = acc100_pci_probe,
+		.remove = acc100_pci_remove,
+		.id_table = pci_id_acc100_pf_map,
+		.drv_flags = RTE_PCI_DRV_NEED_MAPPING
+};
+
+static struct rte_pci_driver acc100_pci_vf_driver = {
+		.probe = acc100_pci_probe,
+		.remove = acc100_pci_remove,
+		.id_table = pci_id_acc100_vf_map,
+		.drv_flags = RTE_PCI_DRV_NEED_MAPPING
+};
+
+RTE_PMD_REGISTER_PCI(ACC100PF_DRIVER_NAME, acc100_pci_pf_driver);
+RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
+RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
+RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
+
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
new file mode 100644
index 0000000..6f46df0
--- /dev/null
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -0,0 +1,37 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_ACC100_PMD_H_
+#define _RTE_ACC100_PMD_H_
+
+/* Helper macro for logging */
+#define rte_bbdev_log(level, fmt, ...) \
+	rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \
+		##__VA_ARGS__)
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+#define rte_bbdev_log_debug(fmt, ...) \
+		rte_bbdev_log(DEBUG, "acc100_pmd: " fmt, \
+		##__VA_ARGS__)
+#else
+#define rte_bbdev_log_debug(fmt, ...)
+#endif
+
+/* ACC100 PF and VF driver names */
+#define ACC100PF_DRIVER_NAME           intel_acc100_pf
+#define ACC100VF_DRIVER_NAME           intel_acc100_vf
+
+/* ACC100 PCI vendor & device IDs */
+#define RTE_ACC100_VENDOR_ID           (0x8086)
+#define RTE_ACC100_PF_DEVICE_ID        (0x0d5c)
+#define RTE_ACC100_VF_DEVICE_ID        (0x0d5d)
+
+/* Private data structure for each ACC100 device */
+struct acc100_device {
+	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
+	bool pf_device; /**< True if this is a PF ACC100 device */
+	bool configured; /**< True if this ACC100 device is configured */
+};
+
+#endif /* _RTE_ACC100_PMD_H_ */
diff --git a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
new file mode 100644
index 0000000..4a76d1d
--- /dev/null
+++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
@@ -0,0 +1,3 @@
+DPDK_21 {
+	local: *;
+};
diff --git a/drivers/baseband/meson.build b/drivers/baseband/meson.build
index 415b672..72301ce 100644
--- a/drivers/baseband/meson.build
+++ b/drivers/baseband/meson.build
@@ -5,7 +5,7 @@ if is_windows
 	subdir_done()
 endif
 
-drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec']
+drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec', 'acc100']
 
 config_flag_fmt = 'RTE_LIBRTE_PMD_BBDEV_@0@'
 driver_name_fmt = 'rte_pmd_bbdev_@0@'
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v9 02/10] baseband/acc100: add register definition file
  2020-09-29  0:29   ` [dpdk-dev] [PATCH v9 00/10] bbdev PMD ACC100 Nicolas Chautru
  2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 01/10] drivers/baseband: add PMD for ACC100 Nicolas Chautru
@ 2020-09-29  0:29     ` Nicolas Chautru
  2020-09-29 20:34       ` Tom Rix
  2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 03/10] baseband/acc100: add info get function Nicolas Chautru
                       ` (7 subsequent siblings)
  9 siblings, 1 reply; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-29  0:29 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add in the list of registers for the device and related
HW specs definitions.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 drivers/baseband/acc100/acc100_pf_enum.h | 1068 ++++++++++++++++++++++++++++++
 drivers/baseband/acc100/acc100_vf_enum.h |   73 ++
 drivers/baseband/acc100/rte_acc100_pmd.h |  490 ++++++++++++++
 3 files changed, 1631 insertions(+)
 create mode 100644 drivers/baseband/acc100/acc100_pf_enum.h
 create mode 100644 drivers/baseband/acc100/acc100_vf_enum.h

diff --git a/drivers/baseband/acc100/acc100_pf_enum.h b/drivers/baseband/acc100/acc100_pf_enum.h
new file mode 100644
index 0000000..a1ee416
--- /dev/null
+++ b/drivers/baseband/acc100/acc100_pf_enum.h
@@ -0,0 +1,1068 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017 Intel Corporation
+ */
+
+#ifndef ACC100_PF_ENUM_H
+#define ACC100_PF_ENUM_H
+
+/*
+ * ACC100 Register mapping on PF BAR0
+ * This is automatically generated from RDL, format may change with new RDL
+ * Release.
+ * Variable names are as is
+ */
+enum {
+	HWPfQmgrEgressQueuesTemplate          =  0x0007FE00,
+	HWPfQmgrIngressAq                     =  0x00080000,
+	HWPfQmgrArbQAvail                     =  0x00A00010,
+	HWPfQmgrArbQBlock                     =  0x00A00014,
+	HWPfQmgrAqueueDropNotifEn             =  0x00A00024,
+	HWPfQmgrAqueueDisableNotifEn          =  0x00A00028,
+	HWPfQmgrSoftReset                     =  0x00A00038,
+	HWPfQmgrInitStatus                    =  0x00A0003C,
+	HWPfQmgrAramWatchdogCount             =  0x00A00040,
+	HWPfQmgrAramWatchdogCounterEn         =  0x00A00044,
+	HWPfQmgrAxiWatchdogCount              =  0x00A00048,
+	HWPfQmgrAxiWatchdogCounterEn          =  0x00A0004C,
+	HWPfQmgrProcessWatchdogCount          =  0x00A00050,
+	HWPfQmgrProcessWatchdogCounterEn      =  0x00A00054,
+	HWPfQmgrProcessUl4GWatchdogCounter    =  0x00A00058,
+	HWPfQmgrProcessDl4GWatchdogCounter    =  0x00A0005C,
+	HWPfQmgrProcessUl5GWatchdogCounter    =  0x00A00060,
+	HWPfQmgrProcessDl5GWatchdogCounter    =  0x00A00064,
+	HWPfQmgrProcessMldWatchdogCounter     =  0x00A00068,
+	HWPfQmgrMsiOverflowUpperVf            =  0x00A00070,
+	HWPfQmgrMsiOverflowLowerVf            =  0x00A00074,
+	HWPfQmgrMsiWatchdogOverflow           =  0x00A00078,
+	HWPfQmgrMsiOverflowEnable             =  0x00A0007C,
+	HWPfQmgrDebugAqPointerMemGrp          =  0x00A00100,
+	HWPfQmgrDebugOutputArbQFifoGrp        =  0x00A00140,
+	HWPfQmgrDebugMsiFifoGrp               =  0x00A00180,
+	HWPfQmgrDebugAxiWdTimeoutMsiFifo      =  0x00A001C0,
+	HWPfQmgrDebugProcessWdTimeoutMsiFifo  =  0x00A001C4,
+	HWPfQmgrDepthLog2Grp                  =  0x00A00200,
+	HWPfQmgrTholdGrp                      =  0x00A00300,
+	HWPfQmgrGrpTmplateReg0Indx            =  0x00A00600,
+	HWPfQmgrGrpTmplateReg1Indx            =  0x00A00680,
+	HWPfQmgrGrpTmplateReg2indx            =  0x00A00700,
+	HWPfQmgrGrpTmplateReg3Indx            =  0x00A00780,
+	HWPfQmgrGrpTmplateReg4Indx            =  0x00A00800,
+	HWPfQmgrVfBaseAddr                    =  0x00A01000,
+	HWPfQmgrUl4GWeightRrVf                =  0x00A02000,
+	HWPfQmgrDl4GWeightRrVf                =  0x00A02100,
+	HWPfQmgrUl5GWeightRrVf                =  0x00A02200,
+	HWPfQmgrDl5GWeightRrVf                =  0x00A02300,
+	HWPfQmgrMldWeightRrVf                 =  0x00A02400,
+	HWPfQmgrArbQDepthGrp                  =  0x00A02F00,
+	HWPfQmgrGrpFunction0                  =  0x00A02F40,
+	HWPfQmgrGrpFunction1                  =  0x00A02F44,
+	HWPfQmgrGrpPriority                   =  0x00A02F48,
+	HWPfQmgrWeightSync                    =  0x00A03000,
+	HWPfQmgrAqEnableVf                    =  0x00A10000,
+	HWPfQmgrAqResetVf                     =  0x00A20000,
+	HWPfQmgrRingSizeVf                    =  0x00A20004,
+	HWPfQmgrGrpDepthLog20Vf               =  0x00A20008,
+	HWPfQmgrGrpDepthLog21Vf               =  0x00A2000C,
+	HWPfQmgrGrpFunction0Vf                =  0x00A20010,
+	HWPfQmgrGrpFunction1Vf                =  0x00A20014,
+	HWPfDmaConfig0Reg                     =  0x00B80000,
+	HWPfDmaConfig1Reg                     =  0x00B80004,
+	HWPfDmaQmgrAddrReg                    =  0x00B80008,
+	HWPfDmaSoftResetReg                   =  0x00B8000C,
+	HWPfDmaAxcacheReg                     =  0x00B80010,
+	HWPfDmaVersionReg                     =  0x00B80014,
+	HWPfDmaFrameThreshold                 =  0x00B80018,
+	HWPfDmaTimestampLo                    =  0x00B8001C,
+	HWPfDmaTimestampHi                    =  0x00B80020,
+	HWPfDmaAxiStatus                      =  0x00B80028,
+	HWPfDmaAxiControl                     =  0x00B8002C,
+	HWPfDmaNoQmgr                         =  0x00B80030,
+	HWPfDmaQosScale                       =  0x00B80034,
+	HWPfDmaQmanen                         =  0x00B80040,
+	HWPfDmaQmgrQosBase                    =  0x00B80060,
+	HWPfDmaFecClkGatingEnable             =  0x00B80080,
+	HWPfDmaPmEnable                       =  0x00B80084,
+	HWPfDmaQosEnable                      =  0x00B80088,
+	HWPfDmaHarqWeightedRrFrameThreshold   =  0x00B800B0,
+	HWPfDmaDataSmallWeightedRrFrameThresh  = 0x00B800B4,
+	HWPfDmaDataLargeWeightedRrFrameThresh  = 0x00B800B8,
+	HWPfDmaInboundCbMaxSize               =  0x00B800BC,
+	HWPfDmaInboundDrainDataSize           =  0x00B800C0,
+	HWPfDmaVfDdrBaseRw                    =  0x00B80400,
+	HWPfDmaCmplTmOutCnt                   =  0x00B80800,
+	HWPfDmaProcTmOutCnt                   =  0x00B80804,
+	HWPfDmaStatusRrespBresp               =  0x00B80810,
+	HWPfDmaCfgRrespBresp                  =  0x00B80814,
+	HWPfDmaStatusMemParErr                =  0x00B80818,
+	HWPfDmaCfgMemParErrEn                 =  0x00B8081C,
+	HWPfDmaStatusDmaHwErr                 =  0x00B80820,
+	HWPfDmaCfgDmaHwErrEn                  =  0x00B80824,
+	HWPfDmaStatusFecCoreErr               =  0x00B80828,
+	HWPfDmaCfgFecCoreErrEn                =  0x00B8082C,
+	HWPfDmaStatusFcwDescrErr              =  0x00B80830,
+	HWPfDmaCfgFcwDescrErrEn               =  0x00B80834,
+	HWPfDmaStatusBlockTransmit            =  0x00B80838,
+	HWPfDmaBlockOnErrEn                   =  0x00B8083C,
+	HWPfDmaStatusFlushDma                 =  0x00B80840,
+	HWPfDmaFlushDmaOnErrEn                =  0x00B80844,
+	HWPfDmaStatusSdoneFifoFull            =  0x00B80848,
+	HWPfDmaStatusDescriptorErrLoVf        =  0x00B8084C,
+	HWPfDmaStatusDescriptorErrHiVf        =  0x00B80850,
+	HWPfDmaStatusFcwErrLoVf               =  0x00B80854,
+	HWPfDmaStatusFcwErrHiVf               =  0x00B80858,
+	HWPfDmaStatusDataErrLoVf              =  0x00B8085C,
+	HWPfDmaStatusDataErrHiVf              =  0x00B80860,
+	HWPfDmaCfgMsiEnSoftwareErr            =  0x00B80864,
+	HWPfDmaDescriptorSignatuture          =  0x00B80868,
+	HWPfDmaFcwSignature                   =  0x00B8086C,
+	HWPfDmaErrorDetectionEn               =  0x00B80870,
+	HWPfDmaErrCntrlFifoDebug              =  0x00B8087C,
+	HWPfDmaStatusToutData                 =  0x00B80880,
+	HWPfDmaStatusToutDesc                 =  0x00B80884,
+	HWPfDmaStatusToutUnexpData            =  0x00B80888,
+	HWPfDmaStatusToutUnexpDesc            =  0x00B8088C,
+	HWPfDmaStatusToutProcess              =  0x00B80890,
+	HWPfDmaConfigCtoutOutDataEn           =  0x00B808A0,
+	HWPfDmaConfigCtoutOutDescrEn          =  0x00B808A4,
+	HWPfDmaConfigUnexpComplDataEn         =  0x00B808A8,
+	HWPfDmaConfigUnexpComplDescrEn        =  0x00B808AC,
+	HWPfDmaConfigPtoutOutEn               =  0x00B808B0,
+	HWPfDmaFec5GulDescBaseLoRegVf         =  0x00B88020,
+	HWPfDmaFec5GulDescBaseHiRegVf         =  0x00B88024,
+	HWPfDmaFec5GulRespPtrLoRegVf          =  0x00B88028,
+	HWPfDmaFec5GulRespPtrHiRegVf          =  0x00B8802C,
+	HWPfDmaFec5GdlDescBaseLoRegVf         =  0x00B88040,
+	HWPfDmaFec5GdlDescBaseHiRegVf         =  0x00B88044,
+	HWPfDmaFec5GdlRespPtrLoRegVf          =  0x00B88048,
+	HWPfDmaFec5GdlRespPtrHiRegVf          =  0x00B8804C,
+	HWPfDmaFec4GulDescBaseLoRegVf         =  0x00B88060,
+	HWPfDmaFec4GulDescBaseHiRegVf         =  0x00B88064,
+	HWPfDmaFec4GulRespPtrLoRegVf          =  0x00B88068,
+	HWPfDmaFec4GulRespPtrHiRegVf          =  0x00B8806C,
+	HWPfDmaFec4GdlDescBaseLoRegVf         =  0x00B88080,
+	HWPfDmaFec4GdlDescBaseHiRegVf         =  0x00B88084,
+	HWPfDmaFec4GdlRespPtrLoRegVf          =  0x00B88088,
+	HWPfDmaFec4GdlRespPtrHiRegVf          =  0x00B8808C,
+	HWPfDmaVfDdrBaseRangeRo               =  0x00B880A0,
+	HWPfQosmonACntrlReg                   =  0x00B90000,
+	HWPfQosmonAEvalOverflow0              =  0x00B90008,
+	HWPfQosmonAEvalOverflow1              =  0x00B9000C,
+	HWPfQosmonADivTerm                    =  0x00B90010,
+	HWPfQosmonATickTerm                   =  0x00B90014,
+	HWPfQosmonAEvalTerm                   =  0x00B90018,
+	HWPfQosmonAAveTerm                    =  0x00B9001C,
+	HWPfQosmonAForceEccErr                =  0x00B90020,
+	HWPfQosmonAEccErrDetect               =  0x00B90024,
+	HWPfQosmonAIterationConfig0Low        =  0x00B90060,
+	HWPfQosmonAIterationConfig0High       =  0x00B90064,
+	HWPfQosmonAIterationConfig1Low        =  0x00B90068,
+	HWPfQosmonAIterationConfig1High       =  0x00B9006C,
+	HWPfQosmonAIterationConfig2Low        =  0x00B90070,
+	HWPfQosmonAIterationConfig2High       =  0x00B90074,
+	HWPfQosmonAIterationConfig3Low        =  0x00B90078,
+	HWPfQosmonAIterationConfig3High       =  0x00B9007C,
+	HWPfQosmonAEvalMemAddr                =  0x00B90080,
+	HWPfQosmonAEvalMemData                =  0x00B90084,
+	HWPfQosmonAXaction                    =  0x00B900C0,
+	HWPfQosmonARemThres1Vf                =  0x00B90400,
+	HWPfQosmonAThres2Vf                   =  0x00B90404,
+	HWPfQosmonAWeiFracVf                  =  0x00B90408,
+	HWPfQosmonARrWeiVf                    =  0x00B9040C,
+	HWPfPermonACntrlRegVf                 =  0x00B98000,
+	HWPfPermonACountVf                    =  0x00B98008,
+	HWPfPermonAKCntLoVf                   =  0x00B98010,
+	HWPfPermonAKCntHiVf                   =  0x00B98014,
+	HWPfPermonADeltaCntLoVf               =  0x00B98020,
+	HWPfPermonADeltaCntHiVf               =  0x00B98024,
+	HWPfPermonAVersionReg                 =  0x00B9C000,
+	HWPfPermonACbControlFec               =  0x00B9C0F0,
+	HWPfPermonADltTimerLoFec              =  0x00B9C0F4,
+	HWPfPermonADltTimerHiFec              =  0x00B9C0F8,
+	HWPfPermonACbCountFec                 =  0x00B9C100,
+	HWPfPermonAAccExecTimerLoFec          =  0x00B9C104,
+	HWPfPermonAAccExecTimerHiFec          =  0x00B9C108,
+	HWPfPermonAExecTimerMinFec            =  0x00B9C200,
+	HWPfPermonAExecTimerMaxFec            =  0x00B9C204,
+	HWPfPermonAControlBusMon              =  0x00B9C400,
+	HWPfPermonAConfigBusMon               =  0x00B9C404,
+	HWPfPermonASkipCountBusMon            =  0x00B9C408,
+	HWPfPermonAMinLatBusMon               =  0x00B9C40C,
+	HWPfPermonAMaxLatBusMon               =  0x00B9C500,
+	HWPfPermonATotalLatLowBusMon          =  0x00B9C504,
+	HWPfPermonATotalLatUpperBusMon        =  0x00B9C508,
+	HWPfPermonATotalReqCntBusMon          =  0x00B9C50C,
+	HWPfQosmonBCntrlReg                   =  0x00BA0000,
+	HWPfQosmonBEvalOverflow0              =  0x00BA0008,
+	HWPfQosmonBEvalOverflow1              =  0x00BA000C,
+	HWPfQosmonBDivTerm                    =  0x00BA0010,
+	HWPfQosmonBTickTerm                   =  0x00BA0014,
+	HWPfQosmonBEvalTerm                   =  0x00BA0018,
+	HWPfQosmonBAveTerm                    =  0x00BA001C,
+	HWPfQosmonBForceEccErr                =  0x00BA0020,
+	HWPfQosmonBEccErrDetect               =  0x00BA0024,
+	HWPfQosmonBIterationConfig0Low        =  0x00BA0060,
+	HWPfQosmonBIterationConfig0High       =  0x00BA0064,
+	HWPfQosmonBIterationConfig1Low        =  0x00BA0068,
+	HWPfQosmonBIterationConfig1High       =  0x00BA006C,
+	HWPfQosmonBIterationConfig2Low        =  0x00BA0070,
+	HWPfQosmonBIterationConfig2High       =  0x00BA0074,
+	HWPfQosmonBIterationConfig3Low        =  0x00BA0078,
+	HWPfQosmonBIterationConfig3High       =  0x00BA007C,
+	HWPfQosmonBEvalMemAddr                =  0x00BA0080,
+	HWPfQosmonBEvalMemData                =  0x00BA0084,
+	HWPfQosmonBXaction                    =  0x00BA00C0,
+	HWPfQosmonBRemThres1Vf                =  0x00BA0400,
+	HWPfQosmonBThres2Vf                   =  0x00BA0404,
+	HWPfQosmonBWeiFracVf                  =  0x00BA0408,
+	HWPfQosmonBRrWeiVf                    =  0x00BA040C,
+	HWPfPermonBCntrlRegVf                 =  0x00BA8000,
+	HWPfPermonBCountVf                    =  0x00BA8008,
+	HWPfPermonBKCntLoVf                   =  0x00BA8010,
+	HWPfPermonBKCntHiVf                   =  0x00BA8014,
+	HWPfPermonBDeltaCntLoVf               =  0x00BA8020,
+	HWPfPermonBDeltaCntHiVf               =  0x00BA8024,
+	HWPfPermonBVersionReg                 =  0x00BAC000,
+	HWPfPermonBCbControlFec               =  0x00BAC0F0,
+	HWPfPermonBDltTimerLoFec              =  0x00BAC0F4,
+	HWPfPermonBDltTimerHiFec              =  0x00BAC0F8,
+	HWPfPermonBCbCountFec                 =  0x00BAC100,
+	HWPfPermonBAccExecTimerLoFec          =  0x00BAC104,
+	HWPfPermonBAccExecTimerHiFec          =  0x00BAC108,
+	HWPfPermonBExecTimerMinFec            =  0x00BAC200,
+	HWPfPermonBExecTimerMaxFec            =  0x00BAC204,
+	HWPfPermonBControlBusMon              =  0x00BAC400,
+	HWPfPermonBConfigBusMon               =  0x00BAC404,
+	HWPfPermonBSkipCountBusMon            =  0x00BAC408,
+	HWPfPermonBMinLatBusMon               =  0x00BAC40C,
+	HWPfPermonBMaxLatBusMon               =  0x00BAC500,
+	HWPfPermonBTotalLatLowBusMon          =  0x00BAC504,
+	HWPfPermonBTotalLatUpperBusMon        =  0x00BAC508,
+	HWPfPermonBTotalReqCntBusMon          =  0x00BAC50C,
+	HWPfFecUl5gCntrlReg                   =  0x00BC0000,
+	HWPfFecUl5gI2MThreshReg               =  0x00BC0004,
+	HWPfFecUl5gVersionReg                 =  0x00BC0100,
+	HWPfFecUl5gFcwStatusReg               =  0x00BC0104,
+	HWPfFecUl5gWarnReg                    =  0x00BC0108,
+	HwPfFecUl5gIbDebugReg                 =  0x00BC0200,
+	HwPfFecUl5gObLlrDebugReg              =  0x00BC0204,
+	HwPfFecUl5gObHarqDebugReg             =  0x00BC0208,
+	HwPfFecUl5g1CntrlReg                  =  0x00BC1000,
+	HwPfFecUl5g1I2MThreshReg              =  0x00BC1004,
+	HwPfFecUl5g1VersionReg                =  0x00BC1100,
+	HwPfFecUl5g1FcwStatusReg              =  0x00BC1104,
+	HwPfFecUl5g1WarnReg                   =  0x00BC1108,
+	HwPfFecUl5g1IbDebugReg                =  0x00BC1200,
+	HwPfFecUl5g1ObLlrDebugReg             =  0x00BC1204,
+	HwPfFecUl5g1ObHarqDebugReg            =  0x00BC1208,
+	HwPfFecUl5g2CntrlReg                  =  0x00BC2000,
+	HwPfFecUl5g2I2MThreshReg              =  0x00BC2004,
+	HwPfFecUl5g2VersionReg                =  0x00BC2100,
+	HwPfFecUl5g2FcwStatusReg              =  0x00BC2104,
+	HwPfFecUl5g2WarnReg                   =  0x00BC2108,
+	HwPfFecUl5g2IbDebugReg                =  0x00BC2200,
+	HwPfFecUl5g2ObLlrDebugReg             =  0x00BC2204,
+	HwPfFecUl5g2ObHarqDebugReg            =  0x00BC2208,
+	HwPfFecUl5g3CntrlReg                  =  0x00BC3000,
+	HwPfFecUl5g3I2MThreshReg              =  0x00BC3004,
+	HwPfFecUl5g3VersionReg                =  0x00BC3100,
+	HwPfFecUl5g3FcwStatusReg              =  0x00BC3104,
+	HwPfFecUl5g3WarnReg                   =  0x00BC3108,
+	HwPfFecUl5g3IbDebugReg                =  0x00BC3200,
+	HwPfFecUl5g3ObLlrDebugReg             =  0x00BC3204,
+	HwPfFecUl5g3ObHarqDebugReg            =  0x00BC3208,
+	HwPfFecUl5g4CntrlReg                  =  0x00BC4000,
+	HwPfFecUl5g4I2MThreshReg              =  0x00BC4004,
+	HwPfFecUl5g4VersionReg                =  0x00BC4100,
+	HwPfFecUl5g4FcwStatusReg              =  0x00BC4104,
+	HwPfFecUl5g4WarnReg                   =  0x00BC4108,
+	HwPfFecUl5g4IbDebugReg                =  0x00BC4200,
+	HwPfFecUl5g4ObLlrDebugReg             =  0x00BC4204,
+	HwPfFecUl5g4ObHarqDebugReg            =  0x00BC4208,
+	HwPfFecUl5g5CntrlReg                  =  0x00BC5000,
+	HwPfFecUl5g5I2MThreshReg              =  0x00BC5004,
+	HwPfFecUl5g5VersionReg                =  0x00BC5100,
+	HwPfFecUl5g5FcwStatusReg              =  0x00BC5104,
+	HwPfFecUl5g5WarnReg                   =  0x00BC5108,
+	HwPfFecUl5g5IbDebugReg                =  0x00BC5200,
+	HwPfFecUl5g5ObLlrDebugReg             =  0x00BC5204,
+	HwPfFecUl5g5ObHarqDebugReg            =  0x00BC5208,
+	HwPfFecUl5g6CntrlReg                  =  0x00BC6000,
+	HwPfFecUl5g6I2MThreshReg              =  0x00BC6004,
+	HwPfFecUl5g6VersionReg                =  0x00BC6100,
+	HwPfFecUl5g6FcwStatusReg              =  0x00BC6104,
+	HwPfFecUl5g6WarnReg                   =  0x00BC6108,
+	HwPfFecUl5g6IbDebugReg                =  0x00BC6200,
+	HwPfFecUl5g6ObLlrDebugReg             =  0x00BC6204,
+	HwPfFecUl5g6ObHarqDebugReg            =  0x00BC6208,
+	HwPfFecUl5g7CntrlReg                  =  0x00BC7000,
+	HwPfFecUl5g7I2MThreshReg              =  0x00BC7004,
+	HwPfFecUl5g7VersionReg                =  0x00BC7100,
+	HwPfFecUl5g7FcwStatusReg              =  0x00BC7104,
+	HwPfFecUl5g7WarnReg                   =  0x00BC7108,
+	HwPfFecUl5g7IbDebugReg                =  0x00BC7200,
+	HwPfFecUl5g7ObLlrDebugReg             =  0x00BC7204,
+	HwPfFecUl5g7ObHarqDebugReg            =  0x00BC7208,
+	HwPfFecUl5g8CntrlReg                  =  0x00BC8000,
+	HwPfFecUl5g8I2MThreshReg              =  0x00BC8004,
+	HwPfFecUl5g8VersionReg                =  0x00BC8100,
+	HwPfFecUl5g8FcwStatusReg              =  0x00BC8104,
+	HwPfFecUl5g8WarnReg                   =  0x00BC8108,
+	HwPfFecUl5g8IbDebugReg                =  0x00BC8200,
+	HwPfFecUl5g8ObLlrDebugReg             =  0x00BC8204,
+	HwPfFecUl5g8ObHarqDebugReg            =  0x00BC8208,
+	HWPfFecDl5gCntrlReg                   =  0x00BCF000,
+	HWPfFecDl5gI2MThreshReg               =  0x00BCF004,
+	HWPfFecDl5gVersionReg                 =  0x00BCF100,
+	HWPfFecDl5gFcwStatusReg               =  0x00BCF104,
+	HWPfFecDl5gWarnReg                    =  0x00BCF108,
+	HWPfFecUlVersionReg                   =  0x00BD0000,
+	HWPfFecUlControlReg                   =  0x00BD0004,
+	HWPfFecUlStatusReg                    =  0x00BD0008,
+	HWPfFecDlVersionReg                   =  0x00BDF000,
+	HWPfFecDlClusterConfigReg             =  0x00BDF004,
+	HWPfFecDlBurstThres                   =  0x00BDF00C,
+	HWPfFecDlClusterStatusReg0            =  0x00BDF040,
+	HWPfFecDlClusterStatusReg1            =  0x00BDF044,
+	HWPfFecDlClusterStatusReg2            =  0x00BDF048,
+	HWPfFecDlClusterStatusReg3            =  0x00BDF04C,
+	HWPfFecDlClusterStatusReg4            =  0x00BDF050,
+	HWPfFecDlClusterStatusReg5            =  0x00BDF054,
+	HWPfChaFabPllPllrst                   =  0x00C40000,
+	HWPfChaFabPllClk0                     =  0x00C40004,
+	HWPfChaFabPllClk1                     =  0x00C40008,
+	HWPfChaFabPllBwadj                    =  0x00C4000C,
+	HWPfChaFabPllLbw                      =  0x00C40010,
+	HWPfChaFabPllResetq                   =  0x00C40014,
+	HWPfChaFabPllPhshft0                  =  0x00C40018,
+	HWPfChaFabPllPhshft1                  =  0x00C4001C,
+	HWPfChaFabPllDivq0                    =  0x00C40020,
+	HWPfChaFabPllDivq1                    =  0x00C40024,
+	HWPfChaFabPllDivq2                    =  0x00C40028,
+	HWPfChaFabPllDivq3                    =  0x00C4002C,
+	HWPfChaFabPllDivq4                    =  0x00C40030,
+	HWPfChaFabPllDivq5                    =  0x00C40034,
+	HWPfChaFabPllDivq6                    =  0x00C40038,
+	HWPfChaFabPllDivq7                    =  0x00C4003C,
+	HWPfChaDl5gPllPllrst                  =  0x00C40080,
+	HWPfChaDl5gPllClk0                    =  0x00C40084,
+	HWPfChaDl5gPllClk1                    =  0x00C40088,
+	HWPfChaDl5gPllBwadj                   =  0x00C4008C,
+	HWPfChaDl5gPllLbw                     =  0x00C40090,
+	HWPfChaDl5gPllResetq                  =  0x00C40094,
+	HWPfChaDl5gPllPhshft0                 =  0x00C40098,
+	HWPfChaDl5gPllPhshft1                 =  0x00C4009C,
+	HWPfChaDl5gPllDivq0                   =  0x00C400A0,
+	HWPfChaDl5gPllDivq1                   =  0x00C400A4,
+	HWPfChaDl5gPllDivq2                   =  0x00C400A8,
+	HWPfChaDl5gPllDivq3                   =  0x00C400AC,
+	HWPfChaDl5gPllDivq4                   =  0x00C400B0,
+	HWPfChaDl5gPllDivq5                   =  0x00C400B4,
+	HWPfChaDl5gPllDivq6                   =  0x00C400B8,
+	HWPfChaDl5gPllDivq7                   =  0x00C400BC,
+	HWPfChaDl4gPllPllrst                  =  0x00C40100,
+	HWPfChaDl4gPllClk0                    =  0x00C40104,
+	HWPfChaDl4gPllClk1                    =  0x00C40108,
+	HWPfChaDl4gPllBwadj                   =  0x00C4010C,
+	HWPfChaDl4gPllLbw                     =  0x00C40110,
+	HWPfChaDl4gPllResetq                  =  0x00C40114,
+	HWPfChaDl4gPllPhshft0                 =  0x00C40118,
+	HWPfChaDl4gPllPhshft1                 =  0x00C4011C,
+	HWPfChaDl4gPllDivq0                   =  0x00C40120,
+	HWPfChaDl4gPllDivq1                   =  0x00C40124,
+	HWPfChaDl4gPllDivq2                   =  0x00C40128,
+	HWPfChaDl4gPllDivq3                   =  0x00C4012C,
+	HWPfChaDl4gPllDivq4                   =  0x00C40130,
+	HWPfChaDl4gPllDivq5                   =  0x00C40134,
+	HWPfChaDl4gPllDivq6                   =  0x00C40138,
+	HWPfChaDl4gPllDivq7                   =  0x00C4013C,
+	HWPfChaUl5gPllPllrst                  =  0x00C40180,
+	HWPfChaUl5gPllClk0                    =  0x00C40184,
+	HWPfChaUl5gPllClk1                    =  0x00C40188,
+	HWPfChaUl5gPllBwadj                   =  0x00C4018C,
+	HWPfChaUl5gPllLbw                     =  0x00C40190,
+	HWPfChaUl5gPllResetq                  =  0x00C40194,
+	HWPfChaUl5gPllPhshft0                 =  0x00C40198,
+	HWPfChaUl5gPllPhshft1                 =  0x00C4019C,
+	HWPfChaUl5gPllDivq0                   =  0x00C401A0,
+	HWPfChaUl5gPllDivq1                   =  0x00C401A4,
+	HWPfChaUl5gPllDivq2                   =  0x00C401A8,
+	HWPfChaUl5gPllDivq3                   =  0x00C401AC,
+	HWPfChaUl5gPllDivq4                   =  0x00C401B0,
+	HWPfChaUl5gPllDivq5                   =  0x00C401B4,
+	HWPfChaUl5gPllDivq6                   =  0x00C401B8,
+	HWPfChaUl5gPllDivq7                   =  0x00C401BC,
+	HWPfChaUl4gPllPllrst                  =  0x00C40200,
+	HWPfChaUl4gPllClk0                    =  0x00C40204,
+	HWPfChaUl4gPllClk1                    =  0x00C40208,
+	HWPfChaUl4gPllBwadj                   =  0x00C4020C,
+	HWPfChaUl4gPllLbw                     =  0x00C40210,
+	HWPfChaUl4gPllResetq                  =  0x00C40214,
+	HWPfChaUl4gPllPhshft0                 =  0x00C40218,
+	HWPfChaUl4gPllPhshft1                 =  0x00C4021C,
+	HWPfChaUl4gPllDivq0                   =  0x00C40220,
+	HWPfChaUl4gPllDivq1                   =  0x00C40224,
+	HWPfChaUl4gPllDivq2                   =  0x00C40228,
+	HWPfChaUl4gPllDivq3                   =  0x00C4022C,
+	HWPfChaUl4gPllDivq4                   =  0x00C40230,
+	HWPfChaUl4gPllDivq5                   =  0x00C40234,
+	HWPfChaUl4gPllDivq6                   =  0x00C40238,
+	HWPfChaUl4gPllDivq7                   =  0x00C4023C,
+	HWPfChaDdrPllPllrst                   =  0x00C40280,
+	HWPfChaDdrPllClk0                     =  0x00C40284,
+	HWPfChaDdrPllClk1                     =  0x00C40288,
+	HWPfChaDdrPllBwadj                    =  0x00C4028C,
+	HWPfChaDdrPllLbw                      =  0x00C40290,
+	HWPfChaDdrPllResetq                   =  0x00C40294,
+	HWPfChaDdrPllPhshft0                  =  0x00C40298,
+	HWPfChaDdrPllPhshft1                  =  0x00C4029C,
+	HWPfChaDdrPllDivq0                    =  0x00C402A0,
+	HWPfChaDdrPllDivq1                    =  0x00C402A4,
+	HWPfChaDdrPllDivq2                    =  0x00C402A8,
+	HWPfChaDdrPllDivq3                    =  0x00C402AC,
+	HWPfChaDdrPllDivq4                    =  0x00C402B0,
+	HWPfChaDdrPllDivq5                    =  0x00C402B4,
+	HWPfChaDdrPllDivq6                    =  0x00C402B8,
+	HWPfChaDdrPllDivq7                    =  0x00C402BC,
+	HWPfChaErrStatus                      =  0x00C40400,
+	HWPfChaErrMask                        =  0x00C40404,
+	HWPfChaDebugPcieMsiFifo               =  0x00C40410,
+	HWPfChaDebugDdrMsiFifo                =  0x00C40414,
+	HWPfChaDebugMiscMsiFifo               =  0x00C40418,
+	HWPfChaPwmSet                         =  0x00C40420,
+	HWPfChaDdrRstStatus                   =  0x00C40430,
+	HWPfChaDdrStDoneStatus                =  0x00C40434,
+	HWPfChaDdrWbRstCfg                    =  0x00C40438,
+	HWPfChaDdrApbRstCfg                   =  0x00C4043C,
+	HWPfChaDdrPhyRstCfg                   =  0x00C40440,
+	HWPfChaDdrCpuRstCfg                   =  0x00C40444,
+	HWPfChaDdrSifRstCfg                   =  0x00C40448,
+	HWPfChaPadcfgPcomp0                   =  0x00C41000,
+	HWPfChaPadcfgNcomp0                   =  0x00C41004,
+	HWPfChaPadcfgOdt0                     =  0x00C41008,
+	HWPfChaPadcfgProtect0                 =  0x00C4100C,
+	HWPfChaPreemphasisProtect0            =  0x00C41010,
+	HWPfChaPreemphasisCompen0             =  0x00C41040,
+	HWPfChaPreemphasisOdten0              =  0x00C41044,
+	HWPfChaPadcfgPcomp1                   =  0x00C41100,
+	HWPfChaPadcfgNcomp1                   =  0x00C41104,
+	HWPfChaPadcfgOdt1                     =  0x00C41108,
+	HWPfChaPadcfgProtect1                 =  0x00C4110C,
+	HWPfChaPreemphasisProtect1            =  0x00C41110,
+	HWPfChaPreemphasisCompen1             =  0x00C41140,
+	HWPfChaPreemphasisOdten1              =  0x00C41144,
+	HWPfChaPadcfgPcomp2                   =  0x00C41200,
+	HWPfChaPadcfgNcomp2                   =  0x00C41204,
+	HWPfChaPadcfgOdt2                     =  0x00C41208,
+	HWPfChaPadcfgProtect2                 =  0x00C4120C,
+	HWPfChaPreemphasisProtect2            =  0x00C41210,
+	HWPfChaPreemphasisCompen2             =  0x00C41240,
+	HWPfChaPreemphasisOdten2              =  0x00C41244,
+	HWPfChaPadcfgPcomp3                   =  0x00C41300,
+	HWPfChaPadcfgNcomp3                   =  0x00C41304,
+	HWPfChaPadcfgOdt3                     =  0x00C41308,
+	HWPfChaPadcfgProtect3                 =  0x00C4130C,
+	HWPfChaPreemphasisProtect3            =  0x00C41310,
+	HWPfChaPreemphasisCompen3             =  0x00C41340,
+	HWPfChaPreemphasisOdten3              =  0x00C41344,
+	HWPfChaPadcfgPcomp4                   =  0x00C41400,
+	HWPfChaPadcfgNcomp4                   =  0x00C41404,
+	HWPfChaPadcfgOdt4                     =  0x00C41408,
+	HWPfChaPadcfgProtect4                 =  0x00C4140C,
+	HWPfChaPreemphasisProtect4            =  0x00C41410,
+	HWPfChaPreemphasisCompen4             =  0x00C41440,
+	HWPfChaPreemphasisOdten4              =  0x00C41444,
+	HWPfHiVfToPfDbellVf                   =  0x00C80000,
+	HWPfHiPfToVfDbellVf                   =  0x00C80008,
+	HWPfHiInfoRingBaseLoVf                =  0x00C80010,
+	HWPfHiInfoRingBaseHiVf                =  0x00C80014,
+	HWPfHiInfoRingPointerVf               =  0x00C80018,
+	HWPfHiInfoRingIntWrEnVf               =  0x00C80020,
+	HWPfHiInfoRingPf2VfWrEnVf             =  0x00C80024,
+	HWPfHiMsixVectorMapperVf              =  0x00C80060,
+	HWPfHiModuleVersionReg                =  0x00C84000,
+	HWPfHiIosf2axiErrLogReg               =  0x00C84004,
+	HWPfHiHardResetReg                    =  0x00C84008,
+	HWPfHi5GHardResetReg                  =  0x00C8400C,
+	HWPfHiInfoRingBaseLoRegPf             =  0x00C84010,
+	HWPfHiInfoRingBaseHiRegPf             =  0x00C84014,
+	HWPfHiInfoRingPointerRegPf            =  0x00C84018,
+	HWPfHiInfoRingIntWrEnRegPf            =  0x00C84020,
+	HWPfHiInfoRingVf2pfLoWrEnReg          =  0x00C84024,
+	HWPfHiInfoRingVf2pfHiWrEnReg          =  0x00C84028,
+	HWPfHiLogParityErrStatusReg           =  0x00C8402C,
+	HWPfHiLogDataParityErrorVfStatusLo    =  0x00C84030,
+	HWPfHiLogDataParityErrorVfStatusHi    =  0x00C84034,
+	HWPfHiBlockTransmitOnErrorEn          =  0x00C84038,
+	HWPfHiCfgMsiIntWrEnRegPf              =  0x00C84040,
+	HWPfHiCfgMsiVf2pfLoWrEnReg            =  0x00C84044,
+	HWPfHiCfgMsiVf2pfHighWrEnReg          =  0x00C84048,
+	HWPfHiMsixVectorMapperPf              =  0x00C84060,
+	HWPfHiApbWrWaitTime                   =  0x00C84100,
+	HWPfHiXCounterMaxValue                =  0x00C84104,
+	HWPfHiPfMode                          =  0x00C84108,
+	HWPfHiClkGateHystReg                  =  0x00C8410C,
+	HWPfHiSnoopBitsReg                    =  0x00C84110,
+	HWPfHiMsiDropEnableReg                =  0x00C84114,
+	HWPfHiMsiStatReg                      =  0x00C84120,
+	HWPfHiFifoOflStatReg                  =  0x00C84124,
+	HWPfHiHiDebugReg                      =  0x00C841F4,
+	HWPfHiDebugMemSnoopMsiFifo            =  0x00C841F8,
+	HWPfHiDebugMemSnoopInputFifo          =  0x00C841FC,
+	HWPfHiMsixMappingConfig               =  0x00C84200,
+	HWPfHiJunkReg                         =  0x00C8FF00,
+	HWPfDdrUmmcVer                        =  0x00D00000,
+	HWPfDdrUmmcCap                        =  0x00D00010,
+	HWPfDdrUmmcCtrl                       =  0x00D00020,
+	HWPfDdrMpcPe                          =  0x00D00080,
+	HWPfDdrMpcPpri3                       =  0x00D00090,
+	HWPfDdrMpcPpri2                       =  0x00D000A0,
+	HWPfDdrMpcPpri1                       =  0x00D000B0,
+	HWPfDdrMpcPpri0                       =  0x00D000C0,
+	HWPfDdrMpcPrwgrpCtrl                  =  0x00D000D0,
+	HWPfDdrMpcPbw7                        =  0x00D000E0,
+	HWPfDdrMpcPbw6                        =  0x00D000F0,
+	HWPfDdrMpcPbw5                        =  0x00D00100,
+	HWPfDdrMpcPbw4                        =  0x00D00110,
+	HWPfDdrMpcPbw3                        =  0x00D00120,
+	HWPfDdrMpcPbw2                        =  0x00D00130,
+	HWPfDdrMpcPbw1                        =  0x00D00140,
+	HWPfDdrMpcPbw0                        =  0x00D00150,
+	HWPfDdrMemoryInit                     =  0x00D00200,
+	HWPfDdrMemoryInitDone                 =  0x00D00210,
+	HWPfDdrMemInitPhyTrng0                =  0x00D00240,
+	HWPfDdrMemInitPhyTrng1                =  0x00D00250,
+	HWPfDdrMemInitPhyTrng2                =  0x00D00260,
+	HWPfDdrMemInitPhyTrng3                =  0x00D00270,
+	HWPfDdrBcDram                         =  0x00D003C0,
+	HWPfDdrBcAddrMap                      =  0x00D003D0,
+	HWPfDdrBcRef                          =  0x00D003E0,
+	HWPfDdrBcTim0                         =  0x00D00400,
+	HWPfDdrBcTim1                         =  0x00D00410,
+	HWPfDdrBcTim2                         =  0x00D00420,
+	HWPfDdrBcTim3                         =  0x00D00430,
+	HWPfDdrBcTim4                         =  0x00D00440,
+	HWPfDdrBcTim5                         =  0x00D00450,
+	HWPfDdrBcTim6                         =  0x00D00460,
+	HWPfDdrBcTim7                         =  0x00D00470,
+	HWPfDdrBcTim8                         =  0x00D00480,
+	HWPfDdrBcTim9                         =  0x00D00490,
+	HWPfDdrBcTim10                        =  0x00D004A0,
+	HWPfDdrBcTim12                        =  0x00D004C0,
+	HWPfDdrDfiInit                        =  0x00D004D0,
+	HWPfDdrDfiInitComplete                =  0x00D004E0,
+	HWPfDdrDfiTim0                        =  0x00D004F0,
+	HWPfDdrDfiTim1                        =  0x00D00500,
+	HWPfDdrDfiPhyUpdEn                    =  0x00D00530,
+	HWPfDdrMemStatus                      =  0x00D00540,
+	HWPfDdrUmmcErrStatus                  =  0x00D00550,
+	HWPfDdrUmmcIntStatus                  =  0x00D00560,
+	HWPfDdrUmmcIntEn                      =  0x00D00570,
+	HWPfDdrPhyRdLatency                   =  0x00D48400,
+	HWPfDdrPhyRdLatencyDbi                =  0x00D48410,
+	HWPfDdrPhyWrLatency                   =  0x00D48420,
+	HWPfDdrPhyTrngType                    =  0x00D48430,
+	HWPfDdrPhyMrsTiming2                  =  0x00D48440,
+	HWPfDdrPhyMrsTiming0                  =  0x00D48450,
+	HWPfDdrPhyMrsTiming1                  =  0x00D48460,
+	HWPfDdrPhyDramTmrd                    =  0x00D48470,
+	HWPfDdrPhyDramTmod                    =  0x00D48480,
+	HWPfDdrPhyDramTwpre                   =  0x00D48490,
+	HWPfDdrPhyDramTrfc                    =  0x00D484A0,
+	HWPfDdrPhyDramTrwtp                   =  0x00D484B0,
+	HWPfDdrPhyMr01Dimm                    =  0x00D484C0,
+	HWPfDdrPhyMr01DimmDbi                 =  0x00D484D0,
+	HWPfDdrPhyMr23Dimm                    =  0x00D484E0,
+	HWPfDdrPhyMr45Dimm                    =  0x00D484F0,
+	HWPfDdrPhyMr67Dimm                    =  0x00D48500,
+	HWPfDdrPhyWrlvlWwRdlvlRr              =  0x00D48510,
+	HWPfDdrPhyOdtEn                       =  0x00D48520,
+	HWPfDdrPhyFastTrng                    =  0x00D48530,
+	HWPfDdrPhyDynTrngGap                  =  0x00D48540,
+	HWPfDdrPhyDynRcalGap                  =  0x00D48550,
+	HWPfDdrPhyIdletimeout                 =  0x00D48560,
+	HWPfDdrPhyRstCkeGap                   =  0x00D48570,
+	HWPfDdrPhyCkeMrsGap                   =  0x00D48580,
+	HWPfDdrPhyMemVrefMidVal               =  0x00D48590,
+	HWPfDdrPhyVrefStep                    =  0x00D485A0,
+	HWPfDdrPhyVrefThreshold               =  0x00D485B0,
+	HWPfDdrPhyPhyVrefMidVal               =  0x00D485C0,
+	HWPfDdrPhyDqsCountMax                 =  0x00D485D0,
+	HWPfDdrPhyDqsCountNum                 =  0x00D485E0,
+	HWPfDdrPhyDramRow                     =  0x00D485F0,
+	HWPfDdrPhyDramCol                     =  0x00D48600,
+	HWPfDdrPhyDramBgBa                    =  0x00D48610,
+	HWPfDdrPhyDynamicUpdreqrel            =  0x00D48620,
+	HWPfDdrPhyVrefLimits                  =  0x00D48630,
+	HWPfDdrPhyIdtmTcStatus                =  0x00D6C020,
+	HWPfDdrPhyIdtmFwVersion               =  0x00D6C410,
+	HWPfDdrPhyRdlvlGateInitDelay          =  0x00D70000,
+	HWPfDdrPhyRdenSmplabc                 =  0x00D70008,
+	HWPfDdrPhyVrefNibble0                 =  0x00D7000C,
+	HWPfDdrPhyVrefNibble1                 =  0x00D70010,
+	HWPfDdrPhyRdlvlGateDqsSmpl0           =  0x00D70014,
+	HWPfDdrPhyRdlvlGateDqsSmpl1           =  0x00D70018,
+	HWPfDdrPhyRdlvlGateDqsSmpl2           =  0x00D7001C,
+	HWPfDdrPhyDqsCount                    =  0x00D70020,
+	HWPfDdrPhyWrlvlRdlvlGateStatus        =  0x00D70024,
+	HWPfDdrPhyErrorFlags                  =  0x00D70028,
+	HWPfDdrPhyPowerDown                   =  0x00D70030,
+	HWPfDdrPhyPrbsSeedByte0               =  0x00D70034,
+	HWPfDdrPhyPrbsSeedByte1               =  0x00D70038,
+	HWPfDdrPhyPcompDq                     =  0x00D70040,
+	HWPfDdrPhyNcompDq                     =  0x00D70044,
+	HWPfDdrPhyPcompDqs                    =  0x00D70048,
+	HWPfDdrPhyNcompDqs                    =  0x00D7004C,
+	HWPfDdrPhyPcompCmd                    =  0x00D70050,
+	HWPfDdrPhyNcompCmd                    =  0x00D70054,
+	HWPfDdrPhyPcompCk                     =  0x00D70058,
+	HWPfDdrPhyNcompCk                     =  0x00D7005C,
+	HWPfDdrPhyRcalOdtDq                   =  0x00D70060,
+	HWPfDdrPhyRcalOdtDqs                  =  0x00D70064,
+	HWPfDdrPhyRcalMask1                   =  0x00D70068,
+	HWPfDdrPhyRcalMask2                   =  0x00D7006C,
+	HWPfDdrPhyRcalCtrl                    =  0x00D70070,
+	HWPfDdrPhyRcalCnt                     =  0x00D70074,
+	HWPfDdrPhyRcalOverride                =  0x00D70078,
+	HWPfDdrPhyRcalGateen                  =  0x00D7007C,
+	HWPfDdrPhyCtrl                        =  0x00D70080,
+	HWPfDdrPhyWrlvlAlg                    =  0x00D70084,
+	HWPfDdrPhyRcalVreftTxcmdOdt           =  0x00D70088,
+	HWPfDdrPhyRdlvlGateParam              =  0x00D7008C,
+	HWPfDdrPhyRdlvlGateParam2             =  0x00D70090,
+	HWPfDdrPhyRcalVreftTxdata             =  0x00D70094,
+	HWPfDdrPhyCmdIntDelay                 =  0x00D700A4,
+	HWPfDdrPhyAlertN                      =  0x00D700A8,
+	HWPfDdrPhyTrngReqWpre2tck             =  0x00D700AC,
+	HWPfDdrPhyCmdPhaseSel                 =  0x00D700B4,
+	HWPfDdrPhyCmdDcdl                     =  0x00D700B8,
+	HWPfDdrPhyCkDcdl                      =  0x00D700BC,
+	HWPfDdrPhySwTrngCtrl1                 =  0x00D700C0,
+	HWPfDdrPhySwTrngCtrl2                 =  0x00D700C4,
+	HWPfDdrPhyRcalPcompRden               =  0x00D700C8,
+	HWPfDdrPhyRcalNcompRden               =  0x00D700CC,
+	HWPfDdrPhyRcalCompen                  =  0x00D700D0,
+	HWPfDdrPhySwTrngRdqs                  =  0x00D700D4,
+	HWPfDdrPhySwTrngWdqs                  =  0x00D700D8,
+	HWPfDdrPhySwTrngRdena                 =  0x00D700DC,
+	HWPfDdrPhySwTrngRdenb                 =  0x00D700E0,
+	HWPfDdrPhySwTrngRdenc                 =  0x00D700E4,
+	HWPfDdrPhySwTrngWdq                   =  0x00D700E8,
+	HWPfDdrPhySwTrngRdq                   =  0x00D700EC,
+	HWPfDdrPhyPcfgHmValue                 =  0x00D700F0,
+	HWPfDdrPhyPcfgTimerValue              =  0x00D700F4,
+	HWPfDdrPhyPcfgSoftwareTraining        =  0x00D700F8,
+	HWPfDdrPhyPcfgMcStatus                =  0x00D700FC,
+	HWPfDdrPhyWrlvlPhRank0                =  0x00D70100,
+	HWPfDdrPhyRdenPhRank0                 =  0x00D70104,
+	HWPfDdrPhyRdenIntRank0                =  0x00D70108,
+	HWPfDdrPhyRdqsDcdlRank0               =  0x00D7010C,
+	HWPfDdrPhyRdqsShadowDcdlRank0         =  0x00D70110,
+	HWPfDdrPhyWdqsDcdlRank0               =  0x00D70114,
+	HWPfDdrPhyWdmDcdlShadowRank0          =  0x00D70118,
+	HWPfDdrPhyWdmDcdlRank0                =  0x00D7011C,
+	HWPfDdrPhyDbiDcdlRank0                =  0x00D70120,
+	HWPfDdrPhyRdenDcdlaRank0              =  0x00D70124,
+	HWPfDdrPhyDbiDcdlShadowRank0          =  0x00D70128,
+	HWPfDdrPhyRdenDcdlbRank0              =  0x00D7012C,
+	HWPfDdrPhyWdqsShadowDcdlRank0         =  0x00D70130,
+	HWPfDdrPhyRdenDcdlcRank0              =  0x00D70134,
+	HWPfDdrPhyRdenShadowDcdlaRank0        =  0x00D70138,
+	HWPfDdrPhyWrlvlIntRank0               =  0x00D7013C,
+	HWPfDdrPhyRdqDcdlBit0Rank0            =  0x00D70200,
+	HWPfDdrPhyRdqDcdlShadowBit0Rank0      =  0x00D70204,
+	HWPfDdrPhyWdqDcdlBit0Rank0            =  0x00D70208,
+	HWPfDdrPhyWdqDcdlShadowBit0Rank0      =  0x00D7020C,
+	HWPfDdrPhyRdqDcdlBit1Rank0            =  0x00D70240,
+	HWPfDdrPhyRdqDcdlShadowBit1Rank0      =  0x00D70244,
+	HWPfDdrPhyWdqDcdlBit1Rank0            =  0x00D70248,
+	HWPfDdrPhyWdqDcdlShadowBit1Rank0      =  0x00D7024C,
+	HWPfDdrPhyRdqDcdlBit2Rank0            =  0x00D70280,
+	HWPfDdrPhyRdqDcdlShadowBit2Rank0      =  0x00D70284,
+	HWPfDdrPhyWdqDcdlBit2Rank0            =  0x00D70288,
+	HWPfDdrPhyWdqDcdlShadowBit2Rank0      =  0x00D7028C,
+	HWPfDdrPhyRdqDcdlBit3Rank0            =  0x00D702C0,
+	HWPfDdrPhyRdqDcdlShadowBit3Rank0      =  0x00D702C4,
+	HWPfDdrPhyWdqDcdlBit3Rank0            =  0x00D702C8,
+	HWPfDdrPhyWdqDcdlShadowBit3Rank0      =  0x00D702CC,
+	HWPfDdrPhyRdqDcdlBit4Rank0            =  0x00D70300,
+	HWPfDdrPhyRdqDcdlShadowBit4Rank0      =  0x00D70304,
+	HWPfDdrPhyWdqDcdlBit4Rank0            =  0x00D70308,
+	HWPfDdrPhyWdqDcdlShadowBit4Rank0      =  0x00D7030C,
+	HWPfDdrPhyRdqDcdlBit5Rank0            =  0x00D70340,
+	HWPfDdrPhyRdqDcdlShadowBit5Rank0      =  0x00D70344,
+	HWPfDdrPhyWdqDcdlBit5Rank0            =  0x00D70348,
+	HWPfDdrPhyWdqDcdlShadowBit5Rank0      =  0x00D7034C,
+	HWPfDdrPhyRdqDcdlBit6Rank0            =  0x00D70380,
+	HWPfDdrPhyRdqDcdlShadowBit6Rank0      =  0x00D70384,
+	HWPfDdrPhyWdqDcdlBit6Rank0            =  0x00D70388,
+	HWPfDdrPhyWdqDcdlShadowBit6Rank0      =  0x00D7038C,
+	HWPfDdrPhyRdqDcdlBit7Rank0            =  0x00D703C0,
+	HWPfDdrPhyRdqDcdlShadowBit7Rank0      =  0x00D703C4,
+	HWPfDdrPhyWdqDcdlBit7Rank0            =  0x00D703C8,
+	HWPfDdrPhyWdqDcdlShadowBit7Rank0      =  0x00D703CC,
+	HWPfDdrPhyIdtmStatus                  =  0x00D740D0,
+	HWPfDdrPhyIdtmError                   =  0x00D74110,
+	HWPfDdrPhyIdtmDebug                   =  0x00D74120,
+	HWPfDdrPhyIdtmDebugInt                =  0x00D74130,
+	HwPfPcieLnAsicCfgovr                  =  0x00D80000,
+	HwPfPcieLnAclkmixer                   =  0x00D80004,
+	HwPfPcieLnTxrampfreq                  =  0x00D80008,
+	HwPfPcieLnLanetest                    =  0x00D8000C,
+	HwPfPcieLnDcctrl                      =  0x00D80010,
+	HwPfPcieLnDccmeas                     =  0x00D80014,
+	HwPfPcieLnDccovrAclk                  =  0x00D80018,
+	HwPfPcieLnDccovrTxa                   =  0x00D8001C,
+	HwPfPcieLnDccovrTxk                   =  0x00D80020,
+	HwPfPcieLnDccovrDclk                  =  0x00D80024,
+	HwPfPcieLnDccovrEclk                  =  0x00D80028,
+	HwPfPcieLnDcctrimAclk                 =  0x00D8002C,
+	HwPfPcieLnDcctrimTx                   =  0x00D80030,
+	HwPfPcieLnDcctrimDclk                 =  0x00D80034,
+	HwPfPcieLnDcctrimEclk                 =  0x00D80038,
+	HwPfPcieLnQuadCtrl                    =  0x00D8003C,
+	HwPfPcieLnQuadCorrIndex               =  0x00D80040,
+	HwPfPcieLnQuadCorrStatus              =  0x00D80044,
+	HwPfPcieLnAsicRxovr1                  =  0x00D80048,
+	HwPfPcieLnAsicRxovr2                  =  0x00D8004C,
+	HwPfPcieLnAsicEqinfovr                =  0x00D80050,
+	HwPfPcieLnRxcsr                       =  0x00D80054,
+	HwPfPcieLnRxfectrl                    =  0x00D80058,
+	HwPfPcieLnRxtest                      =  0x00D8005C,
+	HwPfPcieLnEscount                     =  0x00D80060,
+	HwPfPcieLnCdrctrl                     =  0x00D80064,
+	HwPfPcieLnCdrctrl2                    =  0x00D80068,
+	HwPfPcieLnCdrcfg0Ctrl0                =  0x00D8006C,
+	HwPfPcieLnCdrcfg0Ctrl1                =  0x00D80070,
+	HwPfPcieLnCdrcfg0Ctrl2                =  0x00D80074,
+	HwPfPcieLnCdrcfg1Ctrl0                =  0x00D80078,
+	HwPfPcieLnCdrcfg1Ctrl1                =  0x00D8007C,
+	HwPfPcieLnCdrcfg1Ctrl2                =  0x00D80080,
+	HwPfPcieLnCdrcfg2Ctrl0                =  0x00D80084,
+	HwPfPcieLnCdrcfg2Ctrl1                =  0x00D80088,
+	HwPfPcieLnCdrcfg2Ctrl2                =  0x00D8008C,
+	HwPfPcieLnCdrcfg3Ctrl0                =  0x00D80090,
+	HwPfPcieLnCdrcfg3Ctrl1                =  0x00D80094,
+	HwPfPcieLnCdrcfg3Ctrl2                =  0x00D80098,
+	HwPfPcieLnCdrphase                    =  0x00D8009C,
+	HwPfPcieLnCdrfreq                     =  0x00D800A0,
+	HwPfPcieLnCdrstatusPhase              =  0x00D800A4,
+	HwPfPcieLnCdrstatusFreq               =  0x00D800A8,
+	HwPfPcieLnCdroffset                   =  0x00D800AC,
+	HwPfPcieLnRxvosctl                    =  0x00D800B0,
+	HwPfPcieLnRxvosctl2                   =  0x00D800B4,
+	HwPfPcieLnRxlosctl                    =  0x00D800B8,
+	HwPfPcieLnRxlos                       =  0x00D800BC,
+	HwPfPcieLnRxlosvval                   =  0x00D800C0,
+	HwPfPcieLnRxvosd0                     =  0x00D800C4,
+	HwPfPcieLnRxvosd1                     =  0x00D800C8,
+	HwPfPcieLnRxvosep0                    =  0x00D800CC,
+	HwPfPcieLnRxvosep1                    =  0x00D800D0,
+	HwPfPcieLnRxvosen0                    =  0x00D800D4,
+	HwPfPcieLnRxvosen1                    =  0x00D800D8,
+	HwPfPcieLnRxvosafe                    =  0x00D800DC,
+	HwPfPcieLnRxvosa0                     =  0x00D800E0,
+	HwPfPcieLnRxvosa0Out                  =  0x00D800E4,
+	HwPfPcieLnRxvosa1                     =  0x00D800E8,
+	HwPfPcieLnRxvosa1Out                  =  0x00D800EC,
+	HwPfPcieLnRxmisc                      =  0x00D800F0,
+	HwPfPcieLnRxbeacon                    =  0x00D800F4,
+	HwPfPcieLnRxdssout                    =  0x00D800F8,
+	HwPfPcieLnRxdssout2                   =  0x00D800FC,
+	HwPfPcieLnAlphapctrl                  =  0x00D80100,
+	HwPfPcieLnAlphanctrl                  =  0x00D80104,
+	HwPfPcieLnAdaptctrl                   =  0x00D80108,
+	HwPfPcieLnAdaptctrl1                  =  0x00D8010C,
+	HwPfPcieLnAdaptstatus                 =  0x00D80110,
+	HwPfPcieLnAdaptvga1                   =  0x00D80114,
+	HwPfPcieLnAdaptvga2                   =  0x00D80118,
+	HwPfPcieLnAdaptvga3                   =  0x00D8011C,
+	HwPfPcieLnAdaptvga4                   =  0x00D80120,
+	HwPfPcieLnAdaptboost1                 =  0x00D80124,
+	HwPfPcieLnAdaptboost2                 =  0x00D80128,
+	HwPfPcieLnAdaptboost3                 =  0x00D8012C,
+	HwPfPcieLnAdaptboost4                 =  0x00D80130,
+	HwPfPcieLnAdaptsslms1                 =  0x00D80134,
+	HwPfPcieLnAdaptsslms2                 =  0x00D80138,
+	HwPfPcieLnAdaptvgaStatus              =  0x00D8013C,
+	HwPfPcieLnAdaptboostStatus            =  0x00D80140,
+	HwPfPcieLnAdaptsslmsStatus1           =  0x00D80144,
+	HwPfPcieLnAdaptsslmsStatus2           =  0x00D80148,
+	HwPfPcieLnAfectrl1                    =  0x00D8014C,
+	HwPfPcieLnAfectrl2                    =  0x00D80150,
+	HwPfPcieLnAfectrl3                    =  0x00D80154,
+	HwPfPcieLnAfedefault1                 =  0x00D80158,
+	HwPfPcieLnAfedefault2                 =  0x00D8015C,
+	HwPfPcieLnDfectrl1                    =  0x00D80160,
+	HwPfPcieLnDfectrl2                    =  0x00D80164,
+	HwPfPcieLnDfectrl3                    =  0x00D80168,
+	HwPfPcieLnDfectrl4                    =  0x00D8016C,
+	HwPfPcieLnDfectrl5                    =  0x00D80170,
+	HwPfPcieLnDfectrl6                    =  0x00D80174,
+	HwPfPcieLnAfestatus1                  =  0x00D80178,
+	HwPfPcieLnAfestatus2                  =  0x00D8017C,
+	HwPfPcieLnDfestatus1                  =  0x00D80180,
+	HwPfPcieLnDfestatus2                  =  0x00D80184,
+	HwPfPcieLnDfestatus3                  =  0x00D80188,
+	HwPfPcieLnDfestatus4                  =  0x00D8018C,
+	HwPfPcieLnDfestatus5                  =  0x00D80190,
+	HwPfPcieLnAlphastatus                 =  0x00D80194,
+	HwPfPcieLnFomctrl1                    =  0x00D80198,
+	HwPfPcieLnFomctrl2                    =  0x00D8019C,
+	HwPfPcieLnFomctrl3                    =  0x00D801A0,
+	HwPfPcieLnAclkcalStatus               =  0x00D801A4,
+	HwPfPcieLnOffscorrStatus              =  0x00D801A8,
+	HwPfPcieLnEyewidthStatus              =  0x00D801AC,
+	HwPfPcieLnEyeheightStatus             =  0x00D801B0,
+	HwPfPcieLnAsicTxovr1                  =  0x00D801B4,
+	HwPfPcieLnAsicTxovr2                  =  0x00D801B8,
+	HwPfPcieLnAsicTxovr3                  =  0x00D801BC,
+	HwPfPcieLnTxbiasadjOvr                =  0x00D801C0,
+	HwPfPcieLnTxcsr                       =  0x00D801C4,
+	HwPfPcieLnTxtest                      =  0x00D801C8,
+	HwPfPcieLnTxtestword                  =  0x00D801CC,
+	HwPfPcieLnTxtestwordHigh              =  0x00D801D0,
+	HwPfPcieLnTxdrive                     =  0x00D801D4,
+	HwPfPcieLnMtcsLn                      =  0x00D801D8,
+	HwPfPcieLnStatsumLn                   =  0x00D801DC,
+	HwPfPcieLnRcbusScratch                =  0x00D801E0,
+	HwPfPcieLnRcbusMinorrev               =  0x00D801F0,
+	HwPfPcieLnRcbusMajorrev               =  0x00D801F4,
+	HwPfPcieLnRcbusBlocktype              =  0x00D801F8,
+	HwPfPcieSupPllcsr                     =  0x00D80800,
+	HwPfPcieSupPlldiv                     =  0x00D80804,
+	HwPfPcieSupPllcal                     =  0x00D80808,
+	HwPfPcieSupPllcalsts                  =  0x00D8080C,
+	HwPfPcieSupPllmeas                    =  0x00D80810,
+	HwPfPcieSupPlldactrim                 =  0x00D80814,
+	HwPfPcieSupPllbiastrim                =  0x00D80818,
+	HwPfPcieSupPllbwtrim                  =  0x00D8081C,
+	HwPfPcieSupPllcaldly                  =  0x00D80820,
+	HwPfPcieSupRefclkonpclkctrl           =  0x00D80824,
+	HwPfPcieSupPclkdelay                  =  0x00D80828,
+	HwPfPcieSupPhyconfig                  =  0x00D8082C,
+	HwPfPcieSupRcalIntf                   =  0x00D80830,
+	HwPfPcieSupAuxcsr                     =  0x00D80834,
+	HwPfPcieSupVref                       =  0x00D80838,
+	HwPfPcieSupLinkmode                   =  0x00D8083C,
+	HwPfPcieSupRrefcalctl                 =  0x00D80840,
+	HwPfPcieSupRrefcal                    =  0x00D80844,
+	HwPfPcieSupRrefcaldly                 =  0x00D80848,
+	HwPfPcieSupTximpcalctl                =  0x00D8084C,
+	HwPfPcieSupTximpcal                   =  0x00D80850,
+	HwPfPcieSupTximpoffset                =  0x00D80854,
+	HwPfPcieSupTximpcaldly                =  0x00D80858,
+	HwPfPcieSupRximpcalctl                =  0x00D8085C,
+	HwPfPcieSupRximpcal                   =  0x00D80860,
+	HwPfPcieSupRximpoffset                =  0x00D80864,
+	HwPfPcieSupRximpcaldly                =  0x00D80868,
+	HwPfPcieSupFence                      =  0x00D8086C,
+	HwPfPcieSupMtcs                       =  0x00D80870,
+	HwPfPcieSupStatsum                    =  0x00D809B8,
+	HwPfPciePcsDpStatus0                  =  0x00D81000,
+	HwPfPciePcsDpControl0                 =  0x00D81004,
+	HwPfPciePcsPmaStatusLane0             =  0x00D81008,
+	HwPfPciePcsPipeStatusLane0            =  0x00D8100C,
+	HwPfPciePcsTxdeemph0Lane0             =  0x00D81010,
+	HwPfPciePcsTxdeemph1Lane0             =  0x00D81014,
+	HwPfPciePcsInternalStatusLane0        =  0x00D81018,
+	HwPfPciePcsDpStatus1                  =  0x00D8101C,
+	HwPfPciePcsDpControl1                 =  0x00D81020,
+	HwPfPciePcsPmaStatusLane1             =  0x00D81024,
+	HwPfPciePcsPipeStatusLane1            =  0x00D81028,
+	HwPfPciePcsTxdeemph0Lane1             =  0x00D8102C,
+	HwPfPciePcsTxdeemph1Lane1             =  0x00D81030,
+	HwPfPciePcsInternalStatusLane1        =  0x00D81034,
+	HwPfPciePcsDpStatus2                  =  0x00D81038,
+	HwPfPciePcsDpControl2                 =  0x00D8103C,
+	HwPfPciePcsPmaStatusLane2             =  0x00D81040,
+	HwPfPciePcsPipeStatusLane2            =  0x00D81044,
+	HwPfPciePcsTxdeemph0Lane2             =  0x00D81048,
+	HwPfPciePcsTxdeemph1Lane2             =  0x00D8104C,
+	HwPfPciePcsInternalStatusLane2        =  0x00D81050,
+	HwPfPciePcsDpStatus3                  =  0x00D81054,
+	HwPfPciePcsDpControl3                 =  0x00D81058,
+	HwPfPciePcsPmaStatusLane3             =  0x00D8105C,
+	HwPfPciePcsPipeStatusLane3            =  0x00D81060,
+	HwPfPciePcsTxdeemph0Lane3             =  0x00D81064,
+	HwPfPciePcsTxdeemph1Lane3             =  0x00D81068,
+	HwPfPciePcsInternalStatusLane3        =  0x00D8106C,
+	HwPfPciePcsEbStatus0                  =  0x00D81070,
+	HwPfPciePcsEbStatus1                  =  0x00D81074,
+	HwPfPciePcsEbStatus2                  =  0x00D81078,
+	HwPfPciePcsEbStatus3                  =  0x00D8107C,
+	HwPfPciePcsPllSettingPcieG1           =  0x00D81088,
+	HwPfPciePcsPllSettingPcieG2           =  0x00D8108C,
+	HwPfPciePcsPllSettingPcieG3           =  0x00D81090,
+	HwPfPciePcsControl                    =  0x00D81094,
+	HwPfPciePcsEqControl                  =  0x00D81098,
+	HwPfPciePcsEqTimer                    =  0x00D8109C,
+	HwPfPciePcsEqErrStatus                =  0x00D810A0,
+	HwPfPciePcsEqErrCount                 =  0x00D810A4,
+	HwPfPciePcsStatus                     =  0x00D810A8,
+	HwPfPciePcsMiscRegister               =  0x00D810AC,
+	HwPfPciePcsObsControl                 =  0x00D810B0,
+	HwPfPciePcsPrbsCount0                 =  0x00D81200,
+	HwPfPciePcsBistControl0               =  0x00D81204,
+	HwPfPciePcsBistStaticWord00           =  0x00D81208,
+	HwPfPciePcsBistStaticWord10           =  0x00D8120C,
+	HwPfPciePcsBistStaticWord20           =  0x00D81210,
+	HwPfPciePcsBistStaticWord30           =  0x00D81214,
+	HwPfPciePcsPrbsCount1                 =  0x00D81220,
+	HwPfPciePcsBistControl1               =  0x00D81224,
+	HwPfPciePcsBistStaticWord01           =  0x00D81228,
+	HwPfPciePcsBistStaticWord11           =  0x00D8122C,
+	HwPfPciePcsBistStaticWord21           =  0x00D81230,
+	HwPfPciePcsBistStaticWord31           =  0x00D81234,
+	HwPfPciePcsPrbsCount2                 =  0x00D81240,
+	HwPfPciePcsBistControl2               =  0x00D81244,
+	HwPfPciePcsBistStaticWord02           =  0x00D81248,
+	HwPfPciePcsBistStaticWord12           =  0x00D8124C,
+	HwPfPciePcsBistStaticWord22           =  0x00D81250,
+	HwPfPciePcsBistStaticWord32           =  0x00D81254,
+	HwPfPciePcsPrbsCount3                 =  0x00D81260,
+	HwPfPciePcsBistControl3               =  0x00D81264,
+	HwPfPciePcsBistStaticWord03           =  0x00D81268,
+	HwPfPciePcsBistStaticWord13           =  0x00D8126C,
+	HwPfPciePcsBistStaticWord23           =  0x00D81270,
+	HwPfPciePcsBistStaticWord33           =  0x00D81274,
+	HwPfPcieGpexLtssmStateCntrl           =  0x00D90400,
+	HwPfPcieGpexLtssmStateStatus          =  0x00D90404,
+	HwPfPcieGpexSkipFreqTimer             =  0x00D90408,
+	HwPfPcieGpexLaneSelect                =  0x00D9040C,
+	HwPfPcieGpexLaneDeskew                =  0x00D90410,
+	HwPfPcieGpexRxErrorStatus             =  0x00D90414,
+	HwPfPcieGpexLaneNumControl            =  0x00D90418,
+	HwPfPcieGpexNFstControl               =  0x00D9041C,
+	HwPfPcieGpexLinkStatus                =  0x00D90420,
+	HwPfPcieGpexAckReplayTimeout          =  0x00D90438,
+	HwPfPcieGpexSeqNumberStatus           =  0x00D9043C,
+	HwPfPcieGpexCoreClkRatio              =  0x00D90440,
+	HwPfPcieGpexDllTholdControl           =  0x00D90448,
+	HwPfPcieGpexPmTimer                   =  0x00D90450,
+	HwPfPcieGpexPmeTimeout                =  0x00D90454,
+	HwPfPcieGpexAspmL1Timer               =  0x00D90458,
+	HwPfPcieGpexAspmReqTimer              =  0x00D9045C,
+	HwPfPcieGpexAspmL1Dis                 =  0x00D90460,
+	HwPfPcieGpexAdvisoryErrorControl      =  0x00D90468,
+	HwPfPcieGpexId                        =  0x00D90470,
+	HwPfPcieGpexClasscode                 =  0x00D90474,
+	HwPfPcieGpexSubsystemId               =  0x00D90478,
+	HwPfPcieGpexDeviceCapabilities        =  0x00D9047C,
+	HwPfPcieGpexLinkCapabilities          =  0x00D90480,
+	HwPfPcieGpexFunctionNumber            =  0x00D90484,
+	HwPfPcieGpexPmCapabilities            =  0x00D90488,
+	HwPfPcieGpexFunctionSelect            =  0x00D9048C,
+	HwPfPcieGpexErrorCounter              =  0x00D904AC,
+	HwPfPcieGpexConfigReady               =  0x00D904B0,
+	HwPfPcieGpexFcUpdateTimeout           =  0x00D904B8,
+	HwPfPcieGpexFcUpdateTimer             =  0x00D904BC,
+	HwPfPcieGpexVcBufferLoad              =  0x00D904C8,
+	HwPfPcieGpexVcBufferSizeThold         =  0x00D904CC,
+	HwPfPcieGpexVcBufferSelect            =  0x00D904D0,
+	HwPfPcieGpexBarEnable                 =  0x00D904D4,
+	HwPfPcieGpexBarDwordLower             =  0x00D904D8,
+	HwPfPcieGpexBarDwordUpper             =  0x00D904DC,
+	HwPfPcieGpexBarSelect                 =  0x00D904E0,
+	HwPfPcieGpexCreditCounterSelect       =  0x00D904E4,
+	HwPfPcieGpexCreditCounterStatus       =  0x00D904E8,
+	HwPfPcieGpexTlpHeaderSelect           =  0x00D904EC,
+	HwPfPcieGpexTlpHeaderDword0           =  0x00D904F0,
+	HwPfPcieGpexTlpHeaderDword1           =  0x00D904F4,
+	HwPfPcieGpexTlpHeaderDword2           =  0x00D904F8,
+	HwPfPcieGpexTlpHeaderDword3           =  0x00D904FC,
+	HwPfPcieGpexRelaxOrderControl         =  0x00D90500,
+	HwPfPcieGpexBarPrefetch               =  0x00D90504,
+	HwPfPcieGpexFcCheckControl            =  0x00D90508,
+	HwPfPcieGpexFcUpdateTimerTraffic      =  0x00D90518,
+	HwPfPcieGpexPhyControl0               =  0x00D9053C,
+	HwPfPcieGpexPhyControl1               =  0x00D90544,
+	HwPfPcieGpexPhyControl2               =  0x00D9054C,
+	HwPfPcieGpexUserControl0              =  0x00D9055C,
+	HwPfPcieGpexUncorrErrorStatus         =  0x00D905F0,
+	HwPfPcieGpexRxCplError                =  0x00D90620,
+	HwPfPcieGpexRxCplErrorDword0          =  0x00D90624,
+	HwPfPcieGpexRxCplErrorDword1          =  0x00D90628,
+	HwPfPcieGpexRxCplErrorDword2          =  0x00D9062C,
+	HwPfPcieGpexPabSwResetEn              =  0x00D90630,
+	HwPfPcieGpexGen3Control0              =  0x00D90634,
+	HwPfPcieGpexGen3Control1              =  0x00D90638,
+	HwPfPcieGpexGen3Control2              =  0x00D9063C,
+	HwPfPcieGpexGen2ControlCsr            =  0x00D90640,
+	HwPfPcieGpexTotalVfInitialVf0         =  0x00D90644,
+	HwPfPcieGpexTotalVfInitialVf1         =  0x00D90648,
+	HwPfPcieGpexSriovLinkDevId0           =  0x00D90684,
+	HwPfPcieGpexSriovLinkDevId1           =  0x00D90688,
+	HwPfPcieGpexSriovPageSize0            =  0x00D906C4,
+	HwPfPcieGpexSriovPageSize1            =  0x00D906C8,
+	HwPfPcieGpexIdVersion                 =  0x00D906FC,
+	HwPfPcieGpexSriovVfOffsetStride0      =  0x00D90704,
+	HwPfPcieGpexSriovVfOffsetStride1      =  0x00D90708,
+	HwPfPcieGpexGen3DeskewControl         =  0x00D907B4,
+	HwPfPcieGpexGen3EqControl             =  0x00D907B8,
+	HwPfPcieGpexBridgeVersion             =  0x00D90800,
+	HwPfPcieGpexBridgeCapability          =  0x00D90804,
+	HwPfPcieGpexBridgeControl             =  0x00D90808,
+	HwPfPcieGpexBridgeStatus              =  0x00D9080C,
+	HwPfPcieGpexEngineActivityStatus      =  0x00D9081C,
+	HwPfPcieGpexEngineResetControl        =  0x00D90820,
+	HwPfPcieGpexAxiPioControl             =  0x00D90840,
+	HwPfPcieGpexAxiPioStatus              =  0x00D90844,
+	HwPfPcieGpexAmbaSlaveCmdStatus        =  0x00D90848,
+	HwPfPcieGpexPexPioControl             =  0x00D908C0,
+	HwPfPcieGpexPexPioStatus              =  0x00D908C4,
+	HwPfPcieGpexAmbaMasterStatus          =  0x00D908C8,
+	HwPfPcieGpexCsrSlaveCmdStatus         =  0x00D90920,
+	HwPfPcieGpexMailboxAxiControl         =  0x00D90A50,
+	HwPfPcieGpexMailboxAxiData            =  0x00D90A54,
+	HwPfPcieGpexMailboxPexControl         =  0x00D90A90,
+	HwPfPcieGpexMailboxPexData            =  0x00D90A94,
+	HwPfPcieGpexPexInterruptEnable        =  0x00D90AD0,
+	HwPfPcieGpexPexInterruptStatus        =  0x00D90AD4,
+	HwPfPcieGpexPexInterruptAxiPioVector  =  0x00D90AD8,
+	HwPfPcieGpexPexInterruptPexPioVector  =  0x00D90AE0,
+	HwPfPcieGpexPexInterruptMiscVector    =  0x00D90AF8,
+	HwPfPcieGpexAmbaInterruptPioEnable    =  0x00D90B00,
+	HwPfPcieGpexAmbaInterruptMiscEnable   =  0x00D90B0C,
+	HwPfPcieGpexAmbaInterruptPioStatus    =  0x00D90B10,
+	HwPfPcieGpexAmbaInterruptMiscStatus   =  0x00D90B1C,
+	HwPfPcieGpexPexPmControl              =  0x00D90B80,
+	HwPfPcieGpexSlotMisc                  =  0x00D90B88,
+	HwPfPcieGpexAxiAddrMappingControl     =  0x00D90BA0,
+	HwPfPcieGpexAxiAddrMappingWindowAxiBase     =  0x00D90BA4,
+	HwPfPcieGpexAxiAddrMappingWindowPexBaseLow  =  0x00D90BA8,
+	HwPfPcieGpexAxiAddrMappingWindowPexBaseHigh =  0x00D90BAC,
+	HwPfPcieGpexPexBarAddrFunc0Bar0       =  0x00D91BA0,
+	HwPfPcieGpexPexBarAddrFunc0Bar1       =  0x00D91BA4,
+	HwPfPcieGpexAxiAddrMappingPcieHdrParam =  0x00D95BA0,
+	HwPfPcieGpexExtAxiAddrMappingAxiBase  =  0x00D980A0,
+	HwPfPcieGpexPexExtBarAddrFunc0Bar0    =  0x00D984A0,
+	HwPfPcieGpexPexExtBarAddrFunc0Bar1    =  0x00D984A4,
+	HwPfPcieGpexAmbaInterruptFlrEnable    =  0x00D9B960,
+	HwPfPcieGpexAmbaInterruptFlrStatus    =  0x00D9B9A0,
+	HwPfPcieGpexExtAxiAddrMappingSize     =  0x00D9BAF0,
+	HwPfPcieGpexPexPioAwcacheControl      =  0x00D9C300,
+	HwPfPcieGpexPexPioArcacheControl      =  0x00D9C304,
+	HwPfPcieGpexPabObSizeControlVc0       =  0x00D9C310
+};
+
+/* TIP PF Interrupt numbers */
+enum {
+	ACC100_PF_INT_QMGR_AQ_OVERFLOW = 0,
+	ACC100_PF_INT_DOORBELL_VF_2_PF = 1,
+	ACC100_PF_INT_DMA_DL_DESC_IRQ = 2,
+	ACC100_PF_INT_DMA_UL_DESC_IRQ = 3,
+	ACC100_PF_INT_DMA_MLD_DESC_IRQ = 4,
+	ACC100_PF_INT_DMA_UL5G_DESC_IRQ = 5,
+	ACC100_PF_INT_DMA_DL5G_DESC_IRQ = 6,
+	ACC100_PF_INT_ILLEGAL_FORMAT = 7,
+	ACC100_PF_INT_QMGR_DISABLED_ACCESS = 8,
+	ACC100_PF_INT_QMGR_AQ_OVERTHRESHOLD = 9,
+	ACC100_PF_INT_ARAM_ACCESS_ERR = 10,
+	ACC100_PF_INT_ARAM_ECC_1BIT_ERR = 11,
+	ACC100_PF_INT_PARITY_ERR = 12,
+	ACC100_PF_INT_QMGR_ERR = 13,
+	ACC100_PF_INT_INT_REQ_OVERFLOW = 14,
+	ACC100_PF_INT_APB_TIMEOUT = 15,
+};
+
+#endif /* ACC100_PF_ENUM_H */
diff --git a/drivers/baseband/acc100/acc100_vf_enum.h b/drivers/baseband/acc100/acc100_vf_enum.h
new file mode 100644
index 0000000..b512af3
--- /dev/null
+++ b/drivers/baseband/acc100/acc100_vf_enum.h
@@ -0,0 +1,73 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017 Intel Corporation
+ */
+
+#ifndef ACC100_VF_ENUM_H
+#define ACC100_VF_ENUM_H
+
+/*
+ * ACC100 Register mapping on VF BAR0
+ * This file is automatically generated from the RDL; the format may change
+ * with a new RDL version
+ */
+enum {
+	HWVfQmgrIngressAq             =  0x00000000,
+	HWVfHiVfToPfDbellVf           =  0x00000800,
+	HWVfHiPfToVfDbellVf           =  0x00000808,
+	HWVfHiInfoRingBaseLoVf        =  0x00000810,
+	HWVfHiInfoRingBaseHiVf        =  0x00000814,
+	HWVfHiInfoRingPointerVf       =  0x00000818,
+	HWVfHiInfoRingIntWrEnVf       =  0x00000820,
+	HWVfHiInfoRingPf2VfWrEnVf     =  0x00000824,
+	HWVfHiMsixVectorMapperVf      =  0x00000860,
+	HWVfDmaFec5GulDescBaseLoRegVf =  0x00000920,
+	HWVfDmaFec5GulDescBaseHiRegVf =  0x00000924,
+	HWVfDmaFec5GulRespPtrLoRegVf  =  0x00000928,
+	HWVfDmaFec5GulRespPtrHiRegVf  =  0x0000092C,
+	HWVfDmaFec5GdlDescBaseLoRegVf =  0x00000940,
+	HWVfDmaFec5GdlDescBaseHiRegVf =  0x00000944,
+	HWVfDmaFec5GdlRespPtrLoRegVf  =  0x00000948,
+	HWVfDmaFec5GdlRespPtrHiRegVf  =  0x0000094C,
+	HWVfDmaFec4GulDescBaseLoRegVf =  0x00000960,
+	HWVfDmaFec4GulDescBaseHiRegVf =  0x00000964,
+	HWVfDmaFec4GulRespPtrLoRegVf  =  0x00000968,
+	HWVfDmaFec4GulRespPtrHiRegVf  =  0x0000096C,
+	HWVfDmaFec4GdlDescBaseLoRegVf =  0x00000980,
+	HWVfDmaFec4GdlDescBaseHiRegVf =  0x00000984,
+	HWVfDmaFec4GdlRespPtrLoRegVf  =  0x00000988,
+	HWVfDmaFec4GdlRespPtrHiRegVf  =  0x0000098C,
+	HWVfDmaDdrBaseRangeRoVf       =  0x000009A0,
+	HWVfQmgrAqResetVf             =  0x00000E00,
+	HWVfQmgrRingSizeVf            =  0x00000E04,
+	HWVfQmgrGrpDepthLog20Vf       =  0x00000E08,
+	HWVfQmgrGrpDepthLog21Vf       =  0x00000E0C,
+	HWVfQmgrGrpFunction0Vf        =  0x00000E10,
+	HWVfQmgrGrpFunction1Vf        =  0x00000E14,
+	HWVfPmACntrlRegVf             =  0x00000F40,
+	HWVfPmACountVf                =  0x00000F48,
+	HWVfPmAKCntLoVf               =  0x00000F50,
+	HWVfPmAKCntHiVf               =  0x00000F54,
+	HWVfPmADeltaCntLoVf           =  0x00000F60,
+	HWVfPmADeltaCntHiVf           =  0x00000F64,
+	HWVfPmBCntrlRegVf             =  0x00000F80,
+	HWVfPmBCountVf                =  0x00000F88,
+	HWVfPmBKCntLoVf               =  0x00000F90,
+	HWVfPmBKCntHiVf               =  0x00000F94,
+	HWVfPmBDeltaCntLoVf           =  0x00000FA0,
+	HWVfPmBDeltaCntHiVf           =  0x00000FA4
+};
+
+/* TIP VF Interrupt numbers */
+enum {
+	ACC100_VF_INT_QMGR_AQ_OVERFLOW = 0,
+	ACC100_VF_INT_DOORBELL_VF_2_PF = 1,
+	ACC100_VF_INT_DMA_DL_DESC_IRQ = 2,
+	ACC100_VF_INT_DMA_UL_DESC_IRQ = 3,
+	ACC100_VF_INT_DMA_MLD_DESC_IRQ = 4,
+	ACC100_VF_INT_DMA_UL5G_DESC_IRQ = 5,
+	ACC100_VF_INT_DMA_DL5G_DESC_IRQ = 6,
+	ACC100_VF_INT_ILLEGAL_FORMAT = 7,
+	ACC100_VF_INT_QMGR_DISABLED_ACCESS = 8,
+	ACC100_VF_INT_QMGR_AQ_OVERTHRESHOLD = 9,
+};
+
+#endif /* ACC100_VF_ENUM_H */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 6f46df0..cd77570 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -5,6 +5,9 @@
 #ifndef _RTE_ACC100_PMD_H_
 #define _RTE_ACC100_PMD_H_
 
+#include "acc100_pf_enum.h"
+#include "acc100_vf_enum.h"
+
 /* Helper macro for logging */
 #define rte_bbdev_log(level, fmt, ...) \
 	rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \
@@ -27,6 +30,493 @@
 #define RTE_ACC100_PF_DEVICE_ID        (0x0d5c)
 #define RTE_ACC100_VF_DEVICE_ID        (0x0d5d)
 
+/* Define as 1 to use only a single FEC engine */
+#ifndef RTE_ACC100_SINGLE_FEC
+#define RTE_ACC100_SINGLE_FEC 0
+#endif
+
+/* Values used in filling in descriptors */
+#define ACC100_DMA_DESC_TYPE           2
+#define ACC100_DMA_CODE_BLK_MODE       0
+#define ACC100_DMA_BLKID_FCW           1
+#define ACC100_DMA_BLKID_IN            2
+#define ACC100_DMA_BLKID_OUT_ENC       1
+#define ACC100_DMA_BLKID_OUT_HARD      1
+#define ACC100_DMA_BLKID_OUT_SOFT      2
+#define ACC100_DMA_BLKID_OUT_HARQ      3
+#define ACC100_DMA_BLKID_IN_HARQ       3
+
+/* Values used in filling in decode FCWs */
+#define ACC100_FCW_TD_VER              1
+#define ACC100_FCW_TD_EXT_COLD_REG_EN  1
+#define ACC100_FCW_TD_AUTOMAP          0x0f
+#define ACC100_FCW_TD_RVIDX_0          2
+#define ACC100_FCW_TD_RVIDX_1          26
+#define ACC100_FCW_TD_RVIDX_2          50
+#define ACC100_FCW_TD_RVIDX_3          74
+
+/* Values used in writing to the registers */
+#define ACC100_REG_IRQ_EN_ALL          0x1FF83FF  /* Enable all interrupts */
+
+/* ACC100 Specific Dimensioning */
+#define ACC100_SIZE_64MBYTE            (64*1024*1024)
+/* Number of elements in an Info Ring */
+#define ACC100_INFO_RING_NUM_ENTRIES   1024
+/* Number of elements in HARQ layout memory */
+#define ACC100_HARQ_LAYOUT             (64*1024*1024)
+/* Assumed offset for HARQ in memory */
+#define ACC100_HARQ_OFFSET             (32*1024)
+/* Mask used to calculate an index in an Info Ring array (not a byte offset) */
+#define ACC100_INFO_RING_MASK          (ACC100_INFO_RING_NUM_ENTRIES-1)
+/* Number of Virtual Functions ACC100 supports */
+#define ACC100_NUM_VFS                  16
+#define ACC100_NUM_QGRPS                 8
+#define ACC100_NUM_QGRPS_PER_WORD        8
+#define ACC100_NUM_AQS                  16
+#define MAX_ENQ_BATCH_SIZE          255
+/* All ACC100 registers are 32 bits (4 bytes) wide */
+#define BYTES_IN_WORD                 4
+#define MAX_E_MBUF                64000
+
+#define GRP_ID_SHIFT    10 /* Queue Index Hierarchy */
+#define VF_ID_SHIFT     4  /* Queue Index Hierarchy */
+#define VF_OFFSET_QOS   16 /* offset in Memory Space specific to QoS Mon */
+#define TMPL_PRI_0      0x03020100
+#define TMPL_PRI_1      0x07060504
+#define TMPL_PRI_2      0x0b0a0908
+#define TMPL_PRI_3      0x0f0e0d0c
+#define QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
+#define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
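As a side note on the queue index hierarchy above: the shift constants suggest a queue index is composed of a group id, a VF id, and an atomic-queue id in descending bit order. The helper below is a hypothetical sketch of that packing (the exact layout is an assumption read off `GRP_ID_SHIFT`/`VF_ID_SHIFT`, not confirmed by this patch):

```c
#include <stdint.h>
#include <assert.h>

/* Mirrored from the patch above */
#define GRP_ID_SHIFT    10 /* Queue Index Hierarchy */
#define VF_ID_SHIFT     4  /* Queue Index Hierarchy */

/* Hypothetical helper: pack group id, VF id and atomic-queue id into a
 * single queue index, assuming the AQ id occupies the bits below
 * VF_ID_SHIFT. */
static inline uint16_t
acc100_queue_index(uint16_t grp_id, uint16_t vf_id, uint16_t aq_id)
{
	return (uint16_t)((grp_id << GRP_ID_SHIFT) |
			(vf_id << VF_ID_SHIFT) | aq_id);
}
```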
+
+#define ACC100_NUM_TMPL  32
+/* Mapping of signals for the available engines */
+#define SIG_UL_5G      0
+#define SIG_UL_5G_LAST 7
+#define SIG_DL_5G      13
+#define SIG_DL_5G_LAST 15
+#define SIG_UL_4G      16
+#define SIG_UL_4G_LAST 21
+#define SIG_DL_4G      27
+#define SIG_DL_4G_LAST 31
+
+/* Maximum number of attempts to allocate a memory block for all rings */
+#define SW_RING_MEM_ALLOC_ATTEMPTS 5
+#define MAX_QUEUE_DEPTH           1024
+#define ACC100_DMA_MAX_NUM_POINTERS  14
+#define ACC100_DMA_DESC_PADDING      8
+#define ACC100_FCW_PADDING           12
+#define ACC100_DESC_FCW_OFFSET       192
+#define ACC100_DESC_SIZE             256
+#define ACC100_DESC_OFFSET           (ACC100_DESC_SIZE / 64)
+#define ACC100_FCW_TE_BLEN     32
+#define ACC100_FCW_TD_BLEN     24
+#define ACC100_FCW_LE_BLEN     32
+#define ACC100_FCW_LD_BLEN     36
+
+#define ACC100_FCW_VER         2
+#define MUX_5GDL_DESC 6
+#define CMP_ENC_SIZE 20
+#define CMP_DEC_SIZE 24
+#define ENC_OFFSET (32)
+#define DEC_OFFSET (80)
+#define ACC100_EXT_MEM
+#define ACC100_HARQ_OFFSET_THRESHOLD 1024
+
+/* Constants from K0 computation from 3GPP 38.212 Table 5.4.2.1-2 */
+#define N_ZC_1 66 /* N = 66 Zc for BG 1 */
+#define N_ZC_2 50 /* N = 50 Zc for BG 2 */
+#define K0_1_1 17 /* K0 fraction numerator for rv 1 and BG 1 */
+#define K0_1_2 13 /* K0 fraction numerator for rv 1 and BG 2 */
+#define K0_2_1 33 /* K0 fraction numerator for rv 2 and BG 1 */
+#define K0_2_2 25 /* K0 fraction numerator for rv 2 and BG 2 */
+#define K0_3_1 56 /* K0 fraction numerator for rv 3 and BG 1 */
+#define K0_3_2 43 /* K0 fraction numerator for rv 3 and BG 2 */
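The K0 numerator constants above come straight from 3GPP TS 38.212 Table 5.4.2.1-2, where the LDPC circular-buffer starting position is k0 = floor(num * Ncb / (Nzc * Zc)) * Zc for a given redundancy version and base graph. A standalone sketch of that computation (an illustration of the table, not the driver's own helper) could look like:

```c
#include <stdint.h>
#include <assert.h>

/* Constants mirrored from the patch (3GPP TS 38.212 Table 5.4.2.1-2) */
#define N_ZC_1 66 /* N = 66 Zc for BG 1 */
#define N_ZC_2 50 /* N = 50 Zc for BG 2 */
#define K0_1_1 17
#define K0_1_2 13
#define K0_2_1 33
#define K0_2_2 25
#define K0_3_1 56
#define K0_3_2 43

/* Sketch: k0 starting position in the LDPC circular buffer of length ncb,
 * lifting size z_c, base graph bg (1 or 2), redundancy version rv_index. */
static inline uint32_t
get_k0(uint32_t ncb, uint32_t z_c, uint8_t bg, uint8_t rv_index)
{
	uint32_t n_zc = (bg == 1) ? N_ZC_1 : N_ZC_2; /* N = n_zc * Zc */
	uint32_t num;

	switch (rv_index) {
	case 0:
		return 0;
	case 1:
		num = (bg == 1) ? K0_1_1 : K0_1_2;
		break;
	case 2:
		num = (bg == 1) ? K0_2_1 : K0_2_2;
		break;
	default:
		num = (bg == 1) ? K0_3_1 : K0_3_2;
		break;
	}
	/* Integer division implements the floor() of the 38.212 formula */
	return (num * ncb / (n_zc * z_c)) * z_c;
}
```

With a full-size buffer (ncb = N), the formula reduces to num * Zc, e.g. 17 * Zc for rv 1 on BG 1.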
+
+/* ACC100 Configuration */
+#define ACC100_DDR_ECC_ENABLE
+#define ACC100_CFG_DMA_ERROR 0x3D7
+#define ACC100_CFG_AXI_CACHE 0x11
+#define ACC100_CFG_QMGR_HI_P 0x0F0F
+#define ACC100_CFG_PCI_AXI 0xC003
+#define ACC100_CFG_PCI_BRIDGE 0x40006033
+#define ACC100_ENGINE_OFFSET 0x1000
+#define ACC100_RESET_HI 0x20100
+#define ACC100_RESET_LO 0x20000
+#define ACC100_RESET_HARD 0x1FF
+#define ACC100_ENGINES_MAX 9
+#define LONG_WAIT 1000
+
+/* ACC100 DMA Descriptor triplet */
+struct acc100_dma_triplet {
+	uint64_t address;
+	uint32_t blen:20,
+		res0:4,
+		last:1,
+		dma_ext:1,
+		res1:2,
+		blkid:4;
+} __rte_packed;
+
+
+
+/* ACC100 DMA Response Descriptor */
+union acc100_dma_rsp_desc {
+	uint32_t val;
+	struct {
+		uint32_t crc_status:1,
+			synd_ok:1,
+			dma_err:1,
+			neg_stop:1,
+			fcw_err:1,
+			output_err:1,
+			input_err:1,
+			timestampEn:1,
+			iterCountFrac:8,
+			iter_cnt:8,
+			rsrvd3:6,
+			sdone:1,
+			fdone:1;
+		uint32_t add_info_0;
+		uint32_t add_info_1;
+	};
+};
+
+
+/* ACC100 Queue Manager Enqueue PCI Register */
+union acc100_enqueue_reg_fmt {
+	uint32_t val;
+	struct {
+		uint32_t num_elem:8,
+			addr_offset:3,
+			rsrvd:1,
+			req_elem_addr:20;
+	};
+};
+
+/* FEC 4G Uplink Frame Control Word */
+struct __rte_packed acc100_fcw_td {
+	uint8_t fcw_ver:4,
+		num_maps:4; /* Unused */
+	uint8_t filler:6, /* Unused */
+		rsrvd0:1,
+		bypass_sb_deint:1;
+	uint16_t k_pos;
+	uint16_t k_neg; /* Unused */
+	uint8_t c_neg; /* Unused */
+	uint8_t c; /* Unused */
+	uint32_t ea; /* Unused */
+	uint32_t eb; /* Unused */
+	uint8_t cab; /* Unused */
+	uint8_t k0_start_col; /* Unused */
+	uint8_t rsrvd1;
+	uint8_t code_block_mode:1, /* Unused */
+		turbo_crc_type:1,
+		rsrvd2:3,
+		bypass_teq:1, /* Unused */
+		soft_output_en:1, /* Unused */
+		ext_td_cold_reg_en:1;
+	union { /* External Cold register */
+		uint32_t ext_td_cold_reg;
+		struct {
+			uint32_t min_iter:4, /* Unused */
+				max_iter:4,
+				ext_scale:5, /* Unused */
+				rsrvd3:3,
+				early_stop_en:1, /* Unused */
+				sw_soft_out_dis:1, /* Unused */
+				sw_et_cont:1, /* Unused */
+				sw_soft_out_saturation:1, /* Unused */
+				half_iter_on:1, /* Unused */
+				raw_decoder_input_on:1, /* Unused */
+				rsrvd4:10;
+		};
+	};
+};
+
+/* FEC 5GNR Uplink Frame Control Word */
+struct __rte_packed acc100_fcw_ld {
+	uint32_t FCWversion:4,
+		qm:4,
+		nfiller:11,
+		BG:1,
+		Zc:9,
+		res0:1,
+		synd_precoder:1,
+		synd_post:1;
+	uint32_t ncb:16,
+		k0:16;
+	uint32_t rm_e:24,
+		hcin_en:1,
+		hcout_en:1,
+		crc_select:1,
+		bypass_dec:1,
+		bypass_intlv:1,
+		so_en:1,
+		so_bypass_rm:1,
+		so_bypass_intlv:1;
+	uint32_t hcin_offset:16,
+		hcin_size0:16;
+	uint32_t hcin_size1:16,
+		hcin_decomp_mode:3,
+		llr_pack_mode:1,
+		hcout_comp_mode:3,
+		res2:1,
+		dec_convllr:4,
+		hcout_convllr:4;
+	uint32_t itmax:7,
+		itstop:1,
+		so_it:7,
+		res3:1,
+		hcout_offset:16;
+	uint32_t hcout_size0:16,
+		hcout_size1:16;
+	uint32_t gain_i:8,
+		gain_h:8,
+		negstop_th:16;
+	uint32_t negstop_it:7,
+		negstop_en:1,
+		res4:24;
+};
+
+/* FEC 4G Downlink Frame Control Word */
+struct __rte_packed acc100_fcw_te {
+	uint16_t k_neg;
+	uint16_t k_pos;
+	uint8_t c_neg;
+	uint8_t c;
+	uint8_t filler;
+	uint8_t cab;
+	uint32_t ea:17,
+		rsrvd0:15;
+	uint32_t eb:17,
+		rsrvd1:15;
+	uint16_t ncb_neg;
+	uint16_t ncb_pos;
+	uint8_t rv_idx0:2,
+		rsrvd2:2,
+		rv_idx1:2,
+		rsrvd3:2;
+	uint8_t bypass_rv_idx0:1,
+		bypass_rv_idx1:1,
+		bypass_rm:1,
+		rsrvd4:5;
+	uint8_t rsrvd5:1,
+		rsrvd6:3,
+		code_block_crc:1,
+		rsrvd7:3;
+	uint8_t code_block_mode:1,
+		rsrvd8:7;
+	uint64_t rsrvd9;
+};
+
+/* FEC 5GNR Downlink Frame Control Word */
+struct __rte_packed acc100_fcw_le {
+	uint32_t FCWversion:4,
+		qm:4,
+		nfiller:11,
+		BG:1,
+		Zc:9,
+		res0:3;
+	uint32_t ncb:16,
+		k0:16;
+	uint32_t rm_e:24,
+		res1:2,
+		crc_select:1,
+		res2:1,
+		bypass_intlv:1,
+		res3:3;
+	uint32_t res4_a:12,
+		mcb_count:3,
+		res4_b:17;
+	uint32_t res5;
+	uint32_t res6;
+	uint32_t res7;
+	uint32_t res8;
+};
+
+/* ACC100 DMA Request Descriptor */
+struct __rte_packed acc100_dma_req_desc {
+	union {
+		struct{
+			uint32_t type:4,
+				rsrvd0:26,
+				sdone:1,
+				fdone:1;
+			uint32_t rsrvd1;
+			uint32_t rsrvd2;
+			uint32_t pass_param:8,
+				sdone_enable:1,
+				irq_enable:1,
+				timeStampEn:1,
+				res0:5,
+				numCBs:4,
+				res1:4,
+				m2dlen:4,
+				d2mlen:4;
+		};
+		struct{
+			uint32_t word0;
+			uint32_t word1;
+			uint32_t word2;
+			uint32_t word3;
+		};
+	};
+	struct acc100_dma_triplet data_ptrs[ACC100_DMA_MAX_NUM_POINTERS];
+
+	/* Virtual address used to retrieve SW context info */
+	union {
+		void *op_addr;
+		uint64_t pad1;  /* pad to 64 bits */
+	};
+	/*
+	 * Stores additional information needed for driver processing:
+	 * - last_desc_in_batch - flag used to mark last descriptor (CB)
+	 *                        in batch
+	 * - cbs_in_tb - stores information about total number of Code Blocks
+	 *               in currently processed Transport Block
+	 */
+	union {
+		struct {
+			union {
+				struct acc100_fcw_ld fcw_ld;
+				struct acc100_fcw_td fcw_td;
+				struct acc100_fcw_le fcw_le;
+				struct acc100_fcw_te fcw_te;
+				uint32_t pad2[ACC100_FCW_PADDING];
+			};
+			uint32_t last_desc_in_batch :8,
+				cbs_in_tb:8,
+				pad4 : 16;
+		};
+		uint64_t pad3[ACC100_DMA_DESC_PADDING]; /* pad to 64 bits */
+	};
+};
+
+/* ACC100 DMA Descriptor */
+union acc100_dma_desc {
+	struct acc100_dma_req_desc req;
+	union acc100_dma_rsp_desc rsp;
+};
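The padding constants appear chosen so that the request descriptor fills exactly ACC100_DESC_SIZE bytes: 16 B of control words, 14 packed 12-byte triplets, the 8-byte op_addr union, and 64 B of FCW/context padding. A quick standalone arithmetic check under that reading (not driver code):

```c
#include <assert.h>

/* Mirrored from the patch above */
#define ACC100_DMA_MAX_NUM_POINTERS  14
#define ACC100_DMA_DESC_PADDING       8
#define ACC100_DESC_SIZE            256

/* Sum the byte counts of the request descriptor's parts: 4 x 32-bit
 * control words, 14 packed 12-byte triplets (8 B address + 4 B of
 * bit-fields), the 8 B op_addr union, and 8 x 64-bit trailing padding. */
static inline int
acc100_req_desc_bytes(void)
{
	return 4 * 4 + ACC100_DMA_MAX_NUM_POINTERS * 12
		+ 8 + ACC100_DMA_DESC_PADDING * 8;
}
```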
+
+
+/* Union describing a HARQ layout entry */
+union acc100_harq_layout_data {
+	uint32_t val;
+	struct {
+		uint16_t offset;
+		uint16_t size0;
+	};
+} __rte_packed;
+
+
+/* Union describing Info Ring entry */
+union acc100_info_ring_data {
+	uint32_t val;
+	struct {
+		union {
+			uint16_t detailed_info;
+			struct {
+				uint16_t aq_id: 4;
+				uint16_t qg_id: 4;
+				uint16_t vf_id: 6;
+				uint16_t reserved: 2;
+			};
+		};
+		uint16_t int_nb: 7;
+		uint16_t msi_0: 1;
+		uint16_t vf2pf: 6;
+		uint16_t loop: 1;
+		uint16_t valid: 1;
+	};
+} __rte_packed;
+
+struct acc100_registry_addr {
+	unsigned int dma_ring_dl5g_hi;
+	unsigned int dma_ring_dl5g_lo;
+	unsigned int dma_ring_ul5g_hi;
+	unsigned int dma_ring_ul5g_lo;
+	unsigned int dma_ring_dl4g_hi;
+	unsigned int dma_ring_dl4g_lo;
+	unsigned int dma_ring_ul4g_hi;
+	unsigned int dma_ring_ul4g_lo;
+	unsigned int ring_size;
+	unsigned int info_ring_hi;
+	unsigned int info_ring_lo;
+	unsigned int info_ring_en;
+	unsigned int info_ring_ptr;
+	unsigned int tail_ptrs_dl5g_hi;
+	unsigned int tail_ptrs_dl5g_lo;
+	unsigned int tail_ptrs_ul5g_hi;
+	unsigned int tail_ptrs_ul5g_lo;
+	unsigned int tail_ptrs_dl4g_hi;
+	unsigned int tail_ptrs_dl4g_lo;
+	unsigned int tail_ptrs_ul4g_hi;
+	unsigned int tail_ptrs_ul4g_lo;
+	unsigned int depth_log0_offset;
+	unsigned int depth_log1_offset;
+	unsigned int qman_group_func;
+	unsigned int ddr_range;
+};
+
+/* Structure holding registry addresses for PF */
+static const struct acc100_registry_addr pf_reg_addr = {
+	.dma_ring_dl5g_hi = HWPfDmaFec5GdlDescBaseHiRegVf,
+	.dma_ring_dl5g_lo = HWPfDmaFec5GdlDescBaseLoRegVf,
+	.dma_ring_ul5g_hi = HWPfDmaFec5GulDescBaseHiRegVf,
+	.dma_ring_ul5g_lo = HWPfDmaFec5GulDescBaseLoRegVf,
+	.dma_ring_dl4g_hi = HWPfDmaFec4GdlDescBaseHiRegVf,
+	.dma_ring_dl4g_lo = HWPfDmaFec4GdlDescBaseLoRegVf,
+	.dma_ring_ul4g_hi = HWPfDmaFec4GulDescBaseHiRegVf,
+	.dma_ring_ul4g_lo = HWPfDmaFec4GulDescBaseLoRegVf,
+	.ring_size = HWPfQmgrRingSizeVf,
+	.info_ring_hi = HWPfHiInfoRingBaseHiRegPf,
+	.info_ring_lo = HWPfHiInfoRingBaseLoRegPf,
+	.info_ring_en = HWPfHiInfoRingIntWrEnRegPf,
+	.info_ring_ptr = HWPfHiInfoRingPointerRegPf,
+	.tail_ptrs_dl5g_hi = HWPfDmaFec5GdlRespPtrHiRegVf,
+	.tail_ptrs_dl5g_lo = HWPfDmaFec5GdlRespPtrLoRegVf,
+	.tail_ptrs_ul5g_hi = HWPfDmaFec5GulRespPtrHiRegVf,
+	.tail_ptrs_ul5g_lo = HWPfDmaFec5GulRespPtrLoRegVf,
+	.tail_ptrs_dl4g_hi = HWPfDmaFec4GdlRespPtrHiRegVf,
+	.tail_ptrs_dl4g_lo = HWPfDmaFec4GdlRespPtrLoRegVf,
+	.tail_ptrs_ul4g_hi = HWPfDmaFec4GulRespPtrHiRegVf,
+	.tail_ptrs_ul4g_lo = HWPfDmaFec4GulRespPtrLoRegVf,
+	.depth_log0_offset = HWPfQmgrGrpDepthLog20Vf,
+	.depth_log1_offset = HWPfQmgrGrpDepthLog21Vf,
+	.qman_group_func = HWPfQmgrGrpFunction0,
+	.ddr_range = HWPfDmaVfDdrBaseRw,
+};
+
+/* Structure holding registry addresses for VF */
+static const struct acc100_registry_addr vf_reg_addr = {
+	.dma_ring_dl5g_hi = HWVfDmaFec5GdlDescBaseHiRegVf,
+	.dma_ring_dl5g_lo = HWVfDmaFec5GdlDescBaseLoRegVf,
+	.dma_ring_ul5g_hi = HWVfDmaFec5GulDescBaseHiRegVf,
+	.dma_ring_ul5g_lo = HWVfDmaFec5GulDescBaseLoRegVf,
+	.dma_ring_dl4g_hi = HWVfDmaFec4GdlDescBaseHiRegVf,
+	.dma_ring_dl4g_lo = HWVfDmaFec4GdlDescBaseLoRegVf,
+	.dma_ring_ul4g_hi = HWVfDmaFec4GulDescBaseHiRegVf,
+	.dma_ring_ul4g_lo = HWVfDmaFec4GulDescBaseLoRegVf,
+	.ring_size = HWVfQmgrRingSizeVf,
+	.info_ring_hi = HWVfHiInfoRingBaseHiVf,
+	.info_ring_lo = HWVfHiInfoRingBaseLoVf,
+	.info_ring_en = HWVfHiInfoRingIntWrEnVf,
+	.info_ring_ptr = HWVfHiInfoRingPointerVf,
+	.tail_ptrs_dl5g_hi = HWVfDmaFec5GdlRespPtrHiRegVf,
+	.tail_ptrs_dl5g_lo = HWVfDmaFec5GdlRespPtrLoRegVf,
+	.tail_ptrs_ul5g_hi = HWVfDmaFec5GulRespPtrHiRegVf,
+	.tail_ptrs_ul5g_lo = HWVfDmaFec5GulRespPtrLoRegVf,
+	.tail_ptrs_dl4g_hi = HWVfDmaFec4GdlRespPtrHiRegVf,
+	.tail_ptrs_dl4g_lo = HWVfDmaFec4GdlRespPtrLoRegVf,
+	.tail_ptrs_ul4g_hi = HWVfDmaFec4GulRespPtrHiRegVf,
+	.tail_ptrs_ul4g_lo = HWVfDmaFec4GulRespPtrLoRegVf,
+	.depth_log0_offset = HWVfQmgrGrpDepthLog20Vf,
+	.depth_log1_offset = HWVfQmgrGrpDepthLog21Vf,
+	.qman_group_func = HWVfQmgrGrpFunction0Vf,
+	.ddr_range = HWVfDmaDdrBaseRangeRoVf,
+};
+
 /* Private data structure for each ACC100 device */
 struct acc100_device {
 	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
-- 
1.8.3.1



* [dpdk-dev] [PATCH v9 03/10] baseband/acc100: add info get function
  2020-09-29  0:29   ` [dpdk-dev] [PATCH v9 00/10] bbdev PMD ACC100 Nicolas Chautru
  2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 01/10] drivers/baseband: add PMD for ACC100 Nicolas Chautru
  2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 02/10] baseband/acc100: add register definition file Nicolas Chautru
@ 2020-09-29  0:29     ` Nicolas Chautru
  2020-09-29 21:13       ` Tom Rix
  2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 04/10] baseband/acc100: add queue configuration Nicolas Chautru
                       ` (6 subsequent siblings)
  9 siblings, 1 reply; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-29  0:29 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add the "info_get" function to the driver to allow the device
to be queried.
No processing capabilities are exposed yet.
Link bbdev-test to support the PMD with a null capability list.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 app/test-bbdev/meson.build               |   3 +
 drivers/baseband/acc100/rte_acc100_cfg.h |  96 +++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.c | 225 +++++++++++++++++++++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.h |   3 +
 4 files changed, 327 insertions(+)
 create mode 100644 drivers/baseband/acc100/rte_acc100_cfg.h

diff --git a/app/test-bbdev/meson.build b/app/test-bbdev/meson.build
index 18ab6a8..fbd8ae3 100644
--- a/app/test-bbdev/meson.build
+++ b/app/test-bbdev/meson.build
@@ -12,3 +12,6 @@ endif
 if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC')
 	deps += ['pmd_bbdev_fpga_5gnr_fec']
 endif
+if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_ACC100')
+	deps += ['pmd_bbdev_acc100']
+endif
\ No newline at end of file
diff --git a/drivers/baseband/acc100/rte_acc100_cfg.h b/drivers/baseband/acc100/rte_acc100_cfg.h
new file mode 100644
index 0000000..73bbe36
--- /dev/null
+++ b/drivers/baseband/acc100/rte_acc100_cfg.h
@@ -0,0 +1,96 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_ACC100_CFG_H_
+#define _RTE_ACC100_CFG_H_
+
+/**
+ * @file rte_acc100_cfg.h
+ *
+ * Functions for configuring ACC100 HW, exposed directly to applications.
+ * Configuration related to encoding/decoding is done through the
+ * librte_bbdev library.
+ *
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ */
+
+#include <stdint.h>
+#include <stdbool.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+/** Number of Virtual Functions ACC100 supports */
+#define RTE_ACC100_NUM_VFS 16
+
+/**
+ * Definition of Queue Topology for ACC100 Configuration
+ * Some level of detail is abstracted out to expose a clean interface,
+ * given that comprehensive flexibility is not required
+ */
+struct rte_q_topology_t {
+	/** Number of QGroups in incremental order of priority */
+	uint16_t num_qgroups;
+	/**
+	 * All QGroups have the same number of AQs here.
+	 * Note : Could be made a 16-array if more flexibility is really
+	 * required
+	 */
+	uint16_t num_aqs_per_groups;
+	/**
+	 * The depth of the AQs is the same for all QGroups here. Log2 Enum : 2^N
+	 * Note : Could be made a 16-array if more flexibility is really
+	 * required
+	 */
+	uint16_t aq_depth_log2;
+	/**
+	 * Index of the first Queue Group - assuming contiguity
+	 * Initialized as -1
+	 */
+	int8_t first_qgroup_index;
+};
+
+/**
+ * Definition of Arbitration related parameters for ACC100 Configuration
+ */
+struct rte_arbitration_t {
+	/** Default Weight for VF Fairness Arbitration */
+	uint16_t round_robin_weight;
+	uint32_t gbr_threshold1; /**< Guaranteed Bitrate Threshold 1 */
+	uint32_t gbr_threshold2; /**< Guaranteed Bitrate Threshold 2 */
+};
+
+/**
+ * Structure to pass ACC100 configuration.
+ * Note: all VF Bundles will have the same configuration.
+ */
+struct acc100_conf {
+	bool pf_mode_en; /**< 1 if PF is used for dataplane, 0 for VFs */
+	/** 1 if input '1' bit is represented by a positive LLR value, 0 if '1'
+	 * bit is represented by a negative value.
+	 */
+	bool input_pos_llr_1_bit;
+	/** 1 if output '1' bit is represented by a positive value, 0 if '1'
+	 * bit is represented by a negative value.
+	 */
+	bool output_pos_llr_1_bit;
+	uint16_t num_vf_bundles; /**< Number of VF bundles to setup */
+	/** Queue topology for each operation type */
+	struct rte_q_topology_t q_ul_4g;
+	struct rte_q_topology_t q_dl_4g;
+	struct rte_q_topology_t q_ul_5g;
+	struct rte_q_topology_t q_dl_5g;
+	/** Arbitration configuration for each operation type */
+	struct rte_arbitration_t arb_ul_4g[RTE_ACC100_NUM_VFS];
+	struct rte_arbitration_t arb_dl_4g[RTE_ACC100_NUM_VFS];
+	struct rte_arbitration_t arb_ul_5g[RTE_ACC100_NUM_VFS];
+	struct rte_arbitration_t arb_dl_5g[RTE_ACC100_NUM_VFS];
+};
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_ACC100_CFG_H_ */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 1b4cd13..7807a30 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -26,6 +26,184 @@
 RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
 #endif
 
+/* Read a register of an ACC100 device */
+static inline uint32_t
+acc100_reg_read(struct acc100_device *d, uint32_t offset)
+{
+	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
+	uint32_t ret = *((volatile uint32_t *)(reg_addr));
+	return rte_le_to_cpu_32(ret);
+}
+
+/* Calculate the offset of the enqueue register */
+static inline uint32_t
+queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
+{
+	if (pf_device)
+		return ((vf_id << 12) + (qgrp_id << 7) + (aq_id << 3) +
+				HWPfQmgrIngressAq);
+	else
+		return ((qgrp_id << 7) + (aq_id << 3) +
+				HWVfQmgrIngressAq);
+}
+
+enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC};
+
+/* Return the queue topology for a Queue Group Index */
+static inline void
+qtopFromAcc(struct rte_q_topology_t **qtop, int acc_enum,
+		struct acc100_conf *acc100_conf)
+{
+	struct rte_q_topology_t *p_qtop;
+	p_qtop = NULL;
+	switch (acc_enum) {
+	case UL_4G:
+		p_qtop = &(acc100_conf->q_ul_4g);
+		break;
+	case UL_5G:
+		p_qtop = &(acc100_conf->q_ul_5g);
+		break;
+	case DL_4G:
+		p_qtop = &(acc100_conf->q_dl_4g);
+		break;
+	case DL_5G:
+		p_qtop = &(acc100_conf->q_dl_5g);
+		break;
+	default:
+		/* NOTREACHED */
+		rte_bbdev_log(ERR, "Unexpected error evaluating qtopFromAcc");
+		break;
+	}
+	*qtop = p_qtop;
+}
+
+static void
+initQTop(struct acc100_conf *acc100_conf)
+{
+	acc100_conf->q_ul_4g.num_aqs_per_groups = 0;
+	acc100_conf->q_ul_4g.num_qgroups = 0;
+	acc100_conf->q_ul_4g.first_qgroup_index = -1;
+	acc100_conf->q_ul_5g.num_aqs_per_groups = 0;
+	acc100_conf->q_ul_5g.num_qgroups = 0;
+	acc100_conf->q_ul_5g.first_qgroup_index = -1;
+	acc100_conf->q_dl_4g.num_aqs_per_groups = 0;
+	acc100_conf->q_dl_4g.num_qgroups = 0;
+	acc100_conf->q_dl_4g.first_qgroup_index = -1;
+	acc100_conf->q_dl_5g.num_aqs_per_groups = 0;
+	acc100_conf->q_dl_5g.num_qgroups = 0;
+	acc100_conf->q_dl_5g.first_qgroup_index = -1;
+}
+
+static inline void
+updateQtop(uint8_t acc, uint8_t qg, struct acc100_conf *acc100_conf,
+		struct acc100_device *d)
+{
+	uint32_t reg;
+	struct rte_q_topology_t *q_top = NULL;
+	qtopFromAcc(&q_top, acc, acc100_conf);
+	if (unlikely(q_top == NULL))
+		return;
+	uint16_t aq;
+	q_top->num_qgroups++;
+	if (q_top->first_qgroup_index == -1) {
+		q_top->first_qgroup_index = qg;
+		/* Can be optimized to assume all are enabled by default */
+		reg = acc100_reg_read(d, queue_offset(d->pf_device,
+				0, qg, ACC100_NUM_AQS - 1));
+		if (reg & QUEUE_ENABLE) {
+			q_top->num_aqs_per_groups = ACC100_NUM_AQS;
+			return;
+		}
+		q_top->num_aqs_per_groups = 0;
+		for (aq = 0; aq < ACC100_NUM_AQS; aq++) {
+			reg = acc100_reg_read(d, queue_offset(d->pf_device,
+					0, qg, aq));
+			if (reg & QUEUE_ENABLE)
+				q_top->num_aqs_per_groups++;
+		}
+	}
+}
+
+/* Fetch configuration enabled for the PF/VF using MMIO Read (slow) */
+static inline void
+fetch_acc100_config(struct rte_bbdev *dev)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	struct acc100_conf *acc100_conf = &d->acc100_conf;
+	const struct acc100_registry_addr *reg_addr;
+	uint8_t acc, qg;
+	uint32_t reg, reg_aq, reg_len0, reg_len1;
+	uint32_t reg_mode;
+
+	/* No need to retrieve the configuration if it is already done */
+	if (d->configured)
+		return;
+
+	/* Choose correct registry addresses for the device type */
+	if (d->pf_device)
+		reg_addr = &pf_reg_addr;
+	else
+		reg_addr = &vf_reg_addr;
+
+	d->ddr_size = (1 + acc100_reg_read(d, reg_addr->ddr_range)) << 10;
+
+	/* Single VF Bundle by VF */
+	acc100_conf->num_vf_bundles = 1;
+	initQTop(acc100_conf);
+
+	struct rte_q_topology_t *q_top = NULL;
+	/* The 3-bit field below can index up to 8 values; pad the map to
+	 * avoid an out-of-bounds read
+	 */
+	int qman_func_id[8] = {0, 2, 1, 3, 4, 0, 0, 0};
+	reg = acc100_reg_read(d, reg_addr->qman_group_func);
+	for (qg = 0; qg < ACC100_NUM_QGRPS_PER_WORD; qg++) {
+		reg_aq = acc100_reg_read(d,
+				queue_offset(d->pf_device, 0, qg, 0));
+		if (reg_aq & QUEUE_ENABLE) {
+			acc = qman_func_id[(reg >> (qg * 4)) & 0x7];
+			updateQtop(acc, qg, acc100_conf, d);
+		}
+	}
+
+	/* Check the depth of the AQs */
+	reg_len0 = acc100_reg_read(d, reg_addr->depth_log0_offset);
+	reg_len1 = acc100_reg_read(d, reg_addr->depth_log1_offset);
+	for (acc = 0; acc < NUM_ACC; acc++) {
+		qtopFromAcc(&q_top, acc, acc100_conf);
+		if (q_top->first_qgroup_index < ACC100_NUM_QGRPS_PER_WORD)
+			q_top->aq_depth_log2 = (reg_len0 >>
+					(q_top->first_qgroup_index * 4))
+					& 0xF;
+		else
+			q_top->aq_depth_log2 = (reg_len1 >>
+					((q_top->first_qgroup_index -
+					ACC100_NUM_QGRPS_PER_WORD) * 4))
+					& 0xF;
+	}
+
+	/* Read PF mode */
+	if (d->pf_device) {
+		reg_mode = acc100_reg_read(d, HWPfHiPfMode);
+		acc100_conf->pf_mode_en = (reg_mode == 2) ? 1 : 0;
+	}
+
+	rte_bbdev_log_debug(
+			"%s Config LLR SIGN IN/OUT %s %s QG %u %u %u %u AQ %u %u %u %u Len %u %u %u %u\n",
+			(d->pf_device) ? "PF" : "VF",
+			(acc100_conf->input_pos_llr_1_bit) ? "POS" : "NEG",
+			(acc100_conf->output_pos_llr_1_bit) ? "POS" : "NEG",
+			acc100_conf->q_ul_4g.num_qgroups,
+			acc100_conf->q_dl_4g.num_qgroups,
+			acc100_conf->q_ul_5g.num_qgroups,
+			acc100_conf->q_dl_5g.num_qgroups,
+			acc100_conf->q_ul_4g.num_aqs_per_groups,
+			acc100_conf->q_dl_4g.num_aqs_per_groups,
+			acc100_conf->q_ul_5g.num_aqs_per_groups,
+			acc100_conf->q_dl_5g.num_aqs_per_groups,
+			acc100_conf->q_ul_4g.aq_depth_log2,
+			acc100_conf->q_dl_4g.aq_depth_log2,
+			acc100_conf->q_ul_5g.aq_depth_log2,
+			acc100_conf->q_dl_5g.aq_depth_log2);
+}
+
 /* Free 64MB memory used for software rings */
 static int
 acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
@@ -33,8 +211,55 @@
 	return 0;
 }
 
+/* Get ACC100 device info */
+static void
+acc100_dev_info_get(struct rte_bbdev *dev,
+		struct rte_bbdev_driver_info *dev_info)
+{
+	struct acc100_device *d = dev->data->dev_private;
+
+	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
+		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
+	};
+
+	static struct rte_bbdev_queue_conf default_queue_conf;
+	default_queue_conf.socket = dev->data->socket_id;
+	default_queue_conf.queue_size = MAX_QUEUE_DEPTH;
+
+	dev_info->driver_name = dev->device->driver->name;
+
+	/* Read and save the populated config from ACC100 registers */
+	fetch_acc100_config(dev);
+
+	/* This isn't ideal because it reports the maximum number of queues but
+	 * does not provide info on how many can be uplink/downlink or at
+	 * different priorities.
+	 */
+	dev_info->max_num_queues =
+			d->acc100_conf.q_dl_5g.num_aqs_per_groups *
+			d->acc100_conf.q_dl_5g.num_qgroups +
+			d->acc100_conf.q_ul_5g.num_aqs_per_groups *
+			d->acc100_conf.q_ul_5g.num_qgroups +
+			d->acc100_conf.q_dl_4g.num_aqs_per_groups *
+			d->acc100_conf.q_dl_4g.num_qgroups +
+			d->acc100_conf.q_ul_4g.num_aqs_per_groups *
+			d->acc100_conf.q_ul_4g.num_qgroups;
+	dev_info->queue_size_lim = MAX_QUEUE_DEPTH;
+	dev_info->hardware_accelerated = true;
+	dev_info->max_dl_queue_priority =
+			d->acc100_conf.q_dl_4g.num_qgroups - 1;
+	dev_info->max_ul_queue_priority =
+			d->acc100_conf.q_ul_4g.num_qgroups - 1;
+	dev_info->default_queue_conf = default_queue_conf;
+	dev_info->cpu_flag_reqs = NULL;
+	dev_info->min_alignment = 64;
+	dev_info->capabilities = bbdev_capabilities;
+	dev_info->harq_buffer_size = d->ddr_size;
+}
+
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.close = acc100_dev_close,
+	.info_get = acc100_dev_info_get,
 };
 
 /* ACC100 PCI PF address map */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index cd77570..662e2c8 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -7,6 +7,7 @@
 
 #include "acc100_pf_enum.h"
 #include "acc100_vf_enum.h"
+#include "rte_acc100_cfg.h"
 
 /* Helper macro for logging */
 #define rte_bbdev_log(level, fmt, ...) \
@@ -520,6 +521,8 @@ struct acc100_registry_addr {
 /* Private data structure for each ACC100 device */
 struct acc100_device {
 	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
+	uint32_t ddr_size; /* Size in kB */
+	struct acc100_conf acc100_conf; /* ACC100 Initial configuration */
 	bool pf_device; /**< True if this is a PF ACC100 device */
 	bool configured; /**< True if this ACC100 device is configured */
 };
-- 
1.8.3.1



* [dpdk-dev] [PATCH v9 04/10] baseband/acc100: add queue configuration
  2020-09-29  0:29   ` [dpdk-dev] [PATCH v9 00/10] bbdev PMD ACC100 Nicolas Chautru
                       ` (2 preceding siblings ...)
  2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 03/10] baseband/acc100: add info get function Nicolas Chautru
@ 2020-09-29  0:29     ` Nicolas Chautru
  2020-09-29 21:46       ` Tom Rix
  2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 05/10] baseband/acc100: add LDPC processing functions Nicolas Chautru
                       ` (5 subsequent siblings)
  9 siblings, 1 reply; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-29  0:29 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add a function to create and configure queues for
the device. Still no processing capability is exposed.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 420 ++++++++++++++++++++++++++++++-
 drivers/baseband/acc100/rte_acc100_pmd.h |  45 ++++
 2 files changed, 464 insertions(+), 1 deletion(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 7807a30..7a21c57 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -26,6 +26,22 @@
 RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
 #endif
 
+/* Write to MMIO register address */
+static inline void
+mmio_write(void *addr, uint32_t value)
+{
+	*((volatile uint32_t *)(addr)) = rte_cpu_to_le_32(value);
+}
+
+/* Write a register of an ACC100 device */
+static inline void
+acc100_reg_write(struct acc100_device *d, uint32_t offset, uint32_t payload)
+{
+	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
+	mmio_write(reg_addr, payload);
+	usleep(1000);
+}
+
 /* Read a register of an ACC100 device */
 static inline uint32_t
 acc100_reg_read(struct acc100_device *d, uint32_t offset)
@@ -36,6 +52,22 @@
 	return rte_le_to_cpu_32(ret);
 }
 
+/* Basic Implementation of Log2 for exact 2^N */
+static inline uint32_t
+log2_basic(uint32_t value)
+{
+	return (value == 0) ? 0 : __builtin_ctz(value);
+}
+
+/* Calculate memory alignment offset assuming alignment is 2^N */
+static inline uint32_t
+calc_mem_alignment_offset(void *unaligned_virt_mem, uint32_t alignment)
+{
+	rte_iova_t unaligned_phy_mem = rte_malloc_virt2iova(unaligned_virt_mem);
+	return (uint32_t)(alignment -
+			(unaligned_phy_mem & (alignment-1)));
+}
+
 /* Calculate the offset of the enqueue register */
 static inline uint32_t
 queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
@@ -204,10 +236,393 @@
 			acc100_conf->q_dl_5g.aq_depth_log2);
 }
 
+static void
+free_base_addresses(void **base_addrs, int size)
+{
+	int i;
+	for (i = 0; i < size; i++)
+		rte_free(base_addrs[i]);
+}
+
+static inline uint32_t
+get_desc_len(void)
+{
+	return sizeof(union acc100_dma_desc);
+}
+
+/* Allocate the 2 * 64MB block for the sw rings */
+static int
+alloc_2x64mb_sw_rings_mem(struct rte_bbdev *dev, struct acc100_device *d,
+		int socket)
+{
+	uint32_t sw_ring_size = ACC100_SIZE_64MBYTE;
+	d->sw_rings_base = rte_zmalloc_socket(dev->device->driver->name,
+			2 * sw_ring_size, RTE_CACHE_LINE_SIZE, socket);
+	if (d->sw_rings_base == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate memory for %s:%u",
+				dev->device->driver->name,
+				dev->data->dev_id);
+		return -ENOMEM;
+	}
+	memset(d->sw_rings_base, 0, ACC100_SIZE_64MBYTE);
+	uint32_t next_64mb_align_offset = calc_mem_alignment_offset(
+			d->sw_rings_base, ACC100_SIZE_64MBYTE);
+	d->sw_rings = RTE_PTR_ADD(d->sw_rings_base, next_64mb_align_offset);
+	d->sw_rings_phys = rte_malloc_virt2iova(d->sw_rings_base) +
+			next_64mb_align_offset;
+	d->sw_ring_size = MAX_QUEUE_DEPTH * get_desc_len();
+	d->sw_ring_max_depth = d->sw_ring_size / get_desc_len();
+
+	return 0;
+}
+
+/* Attempt to allocate minimised memory space for sw rings */
+static void
+alloc_sw_rings_min_mem(struct rte_bbdev *dev, struct acc100_device *d,
+		uint16_t num_queues, int socket)
+{
+	rte_iova_t sw_rings_base_phy, next_64mb_align_addr_phy;
+	uint32_t next_64mb_align_offset;
+	rte_iova_t sw_ring_phys_end_addr;
+	void *base_addrs[SW_RING_MEM_ALLOC_ATTEMPTS];
+	void *sw_rings_base;
+	int i = 0;
+	uint32_t q_sw_ring_size = MAX_QUEUE_DEPTH * get_desc_len();
+	uint32_t dev_sw_ring_size = q_sw_ring_size * num_queues;
+
+	/* Find an aligned block of memory to store sw rings */
+	while (i < SW_RING_MEM_ALLOC_ATTEMPTS) {
+		/*
+		 * sw_ring allocated memory is guaranteed to be aligned to
+		 * q_sw_ring_size on the condition that the requested size is
+		 * less than the page size
+		 */
+		sw_rings_base = rte_zmalloc_socket(
+				dev->device->driver->name,
+				dev_sw_ring_size, q_sw_ring_size, socket);
+
+		if (sw_rings_base == NULL) {
+			rte_bbdev_log(ERR,
+					"Failed to allocate memory for %s:%u",
+					dev->device->driver->name,
+					dev->data->dev_id);
+			break;
+		}
+
+		sw_rings_base_phy = rte_malloc_virt2iova(sw_rings_base);
+		next_64mb_align_offset = calc_mem_alignment_offset(
+				sw_rings_base, ACC100_SIZE_64MBYTE);
+		next_64mb_align_addr_phy = sw_rings_base_phy +
+				next_64mb_align_offset;
+		sw_ring_phys_end_addr = sw_rings_base_phy + dev_sw_ring_size;
+
+		/* Check if the end of the sw ring memory block is before the
+		 * start of next 64MB aligned mem address
+		 */
+		if (sw_ring_phys_end_addr < next_64mb_align_addr_phy) {
+			d->sw_rings_phys = sw_rings_base_phy;
+			d->sw_rings = sw_rings_base;
+			d->sw_rings_base = sw_rings_base;
+			d->sw_ring_size = q_sw_ring_size;
+			d->sw_ring_max_depth = MAX_QUEUE_DEPTH;
+			break;
+		}
+		/* Store the address of the unaligned mem block */
+		base_addrs[i] = sw_rings_base;
+		i++;
+	}
+
+	/* Free all unaligned blocks of mem allocated in the loop */
+	free_base_addresses(base_addrs, i);
+}
+
+/* Allocate 64MB memory used for all software rings */
+static int
+acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
+{
+	uint32_t phys_low, phys_high, payload;
+	struct acc100_device *d = dev->data->dev_private;
+	const struct acc100_registry_addr *reg_addr;
+
+	if (d->pf_device && !d->acc100_conf.pf_mode_en) {
+		rte_bbdev_log(NOTICE,
+				"%s has PF mode disabled. This PF can't be used.",
+				dev->data->name);
+		return -ENODEV;
+	}
+
+	alloc_sw_rings_min_mem(dev, d, num_queues, socket_id);
+
+	/* If the minimal memory space approach failed, then allocate
+	 * the 2 * 64MB block for the sw rings
+	 */
+	if (d->sw_rings == NULL)
+		alloc_2x64mb_sw_rings_mem(dev, d, socket_id);
+
+	/* Configure ACC100 with the base address for DMA descriptor rings.
+	 * The same descriptor rings are used for the UL and DL DMA Engines.
+	 * Note : Assuming only the VF0 bundle is used for PF mode.
+	 */
+	phys_high = (uint32_t)(d->sw_rings_phys >> 32);
+	phys_low  = (uint32_t)(d->sw_rings_phys & ~(ACC100_SIZE_64MBYTE-1));
+
+	/* Choose correct registry addresses for the device type */
+	if (d->pf_device)
+		reg_addr = &pf_reg_addr;
+	else
+		reg_addr = &vf_reg_addr;
+
+	/* Read the populated cfg from ACC100 registers */
+	fetch_acc100_config(dev);
+
+	/* Mark as configured properly */
+	d->configured = true;
+
+	/* Release AXI from PF */
+	if (d->pf_device)
+		acc100_reg_write(d, HWPfDmaAxiControl, 1);
+
+	acc100_reg_write(d, reg_addr->dma_ring_ul5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_ul5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->dma_ring_dl5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_dl5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->dma_ring_ul4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_ul4g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->dma_ring_dl4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_dl4g_lo, phys_low);
+
+	/*
+	 * Configure Ring Size to the max queue ring size
+	 * (used for wrapping purposes)
+	 */
+	payload = log2_basic(d->sw_ring_size / 64);
+	acc100_reg_write(d, reg_addr->ring_size, payload);
+
+	/* Configure tail pointer for use when SDONE enabled */
+	d->tail_ptrs = rte_zmalloc_socket(
+			dev->device->driver->name,
+			ACC100_NUM_QGRPS * ACC100_NUM_AQS * sizeof(uint32_t),
+			RTE_CACHE_LINE_SIZE, socket_id);
+	if (d->tail_ptrs == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate tail ptr for %s:%u",
+				dev->device->driver->name,
+				dev->data->dev_id);
+		rte_free(d->sw_rings_base);
+		return -ENOMEM;
+	}
+	d->tail_ptr_phys = rte_malloc_virt2iova(d->tail_ptrs);
+
+	phys_high = (uint32_t)(d->tail_ptr_phys >> 32);
+	phys_low  = (uint32_t)(d->tail_ptr_phys);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
+
+	d->harq_layout = rte_zmalloc_socket("HARQ Layout",
+			ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
+			RTE_CACHE_LINE_SIZE, dev->data->socket_id);
+
+	rte_bbdev_log_debug(
+			"ACC100 (%s) configured sw_rings = %p, sw_rings_phys = %#"
+			PRIx64, dev->data->name, d->sw_rings, d->sw_rings_phys);
+
+	return 0;
+}
+
 /* Free 64MB memory used for software rings */
 static int
-acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
+acc100_dev_close(struct rte_bbdev *dev)
 {
+	struct acc100_device *d = dev->data->dev_private;
+	if (d->sw_rings_base != NULL) {
+		rte_free(d->tail_ptrs);
+		rte_free(d->sw_rings_base);
+		d->sw_rings_base = NULL;
+	}
+	usleep(1000);
+	return 0;
+}
+
+/**
+ * Report an ACC100 queue index which is free.
+ * Return 0 to 16k for a valid queue_idx or -1 when no queue is available.
+ * Note : Only supporting VF0 Bundle for PF mode
+ */
+static int
+acc100_find_free_queue_idx(struct rte_bbdev *dev,
+		const struct rte_bbdev_queue_conf *conf)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	int op_2_acc[5] = {0, UL_4G, DL_4G, UL_5G, DL_5G};
+	int acc = op_2_acc[conf->op_type];
+	struct rte_q_topology_t *qtop = NULL;
+	qtopFromAcc(&qtop, acc, &(d->acc100_conf));
+	if (qtop == NULL)
+		return -1;
+	/* Identify matching QGroup Index which are sorted in priority order */
+	uint16_t group_idx = qtop->first_qgroup_index;
+	group_idx += conf->priority;
+	if (group_idx >= ACC100_NUM_QGRPS ||
+			conf->priority >= qtop->num_qgroups) {
+		rte_bbdev_log(INFO, "Invalid Priority on %s, priority %u",
+				dev->data->name, conf->priority);
+		return -1;
+	}
+	/* Find a free AQ_idx */
+	uint16_t aq_idx;
+	for (aq_idx = 0; aq_idx < qtop->num_aqs_per_groups; aq_idx++) {
+		if (((d->q_assigned_bit_map[group_idx] >> aq_idx) & 0x1) == 0) {
+			/* Mark the Queue as assigned */
+			d->q_assigned_bit_map[group_idx] |= (1 << aq_idx);
+			/* Report the AQ Index */
+			return (group_idx << GRP_ID_SHIFT) + aq_idx;
+		}
+	}
+	rte_bbdev_log(INFO, "Failed to find free queue on %s, priority %u",
+			dev->data->name, conf->priority);
+	return -1;
+}
+
+/* Setup ACC100 queue */
+static int
+acc100_queue_setup(struct rte_bbdev *dev, uint16_t queue_id,
+		const struct rte_bbdev_queue_conf *conf)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	struct acc100_queue *q;
+	int16_t q_idx;
+
+	/* Allocate the queue data structure. */
+	q = rte_zmalloc_socket(dev->device->driver->name, sizeof(*q),
+			RTE_CACHE_LINE_SIZE, conf->socket);
+	if (q == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate queue memory");
+		return -ENOMEM;
+	}
+
+	q->d = d;
+	q->ring_addr = RTE_PTR_ADD(d->sw_rings, (d->sw_ring_size * queue_id));
+	q->ring_addr_phys = d->sw_rings_phys + (d->sw_ring_size * queue_id);
+
+	/* Prepare the Ring with default descriptor format */
+	union acc100_dma_desc *desc = NULL;
+	unsigned int desc_idx, b_idx;
+	int fcw_len = (conf->op_type == RTE_BBDEV_OP_LDPC_ENC ?
+		ACC100_FCW_LE_BLEN : (conf->op_type == RTE_BBDEV_OP_TURBO_DEC ?
+		ACC100_FCW_TD_BLEN : ACC100_FCW_LD_BLEN));
+
+	for (desc_idx = 0; desc_idx < d->sw_ring_max_depth; desc_idx++) {
+		desc = q->ring_addr + desc_idx;
+		desc->req.word0 = ACC100_DMA_DESC_TYPE;
+		desc->req.word1 = 0; /**< Timestamp */
+		desc->req.word2 = 0;
+		desc->req.word3 = 0;
+		uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+		desc->req.data_ptrs[0].blen = fcw_len;
+		desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
+		desc->req.data_ptrs[0].last = 0;
+		desc->req.data_ptrs[0].dma_ext = 0;
+		for (b_idx = 1; b_idx < ACC100_DMA_MAX_NUM_POINTERS - 1;
+				b_idx++) {
+			desc->req.data_ptrs[b_idx].blkid = ACC100_DMA_BLKID_IN;
+			desc->req.data_ptrs[b_idx].last = 1;
+			desc->req.data_ptrs[b_idx].dma_ext = 0;
+			b_idx++;
+			desc->req.data_ptrs[b_idx].blkid =
+					ACC100_DMA_BLKID_OUT_ENC;
+			desc->req.data_ptrs[b_idx].last = 1;
+			desc->req.data_ptrs[b_idx].dma_ext = 0;
+		}
+		/* Preset some fields of LDPC FCW */
+		desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
+		desc->req.fcw_ld.gain_i = 1;
+		desc->req.fcw_ld.gain_h = 1;
+	}
+
+	q->lb_in = rte_zmalloc_socket(dev->device->driver->name,
+			RTE_CACHE_LINE_SIZE,
+			RTE_CACHE_LINE_SIZE, conf->socket);
+	if (q->lb_in == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate lb_in memory");
+		rte_free(q);
+		return -ENOMEM;
+	}
+	q->lb_in_addr_phys = rte_malloc_virt2iova(q->lb_in);
+	q->lb_out = rte_zmalloc_socket(dev->device->driver->name,
+			RTE_CACHE_LINE_SIZE,
+			RTE_CACHE_LINE_SIZE, conf->socket);
+	if (q->lb_out == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate lb_out memory");
+		rte_free(q->lb_in);
+		rte_free(q);
+		return -ENOMEM;
+	}
+	q->lb_out_addr_phys = rte_malloc_virt2iova(q->lb_out);
+
+	/*
+	 * Software queue ring wraps synchronously with the HW when it reaches
+	 * the boundary of the maximum allocated queue size, no matter what the
+	 * sw queue size is. This wrapping is guarded by setting the wrap_mask
+	 * to represent the maximum queue size as allocated at the time when
+	 * the device has been setup (in configure()).
+	 *
+	 * The queue depth is set to the queue size value (conf->queue_size).
+	 * This limits the occupancy of the queue at any point of time, so that
+	 * the queue does not get swamped with enqueue requests.
+	 */
+	q->sw_ring_depth = conf->queue_size;
+	q->sw_ring_wrap_mask = d->sw_ring_max_depth - 1;
+
+	q->op_type = conf->op_type;
+
+	q_idx = acc100_find_free_queue_idx(dev, conf);
+	if (q_idx == -1) {
+		rte_free(q->lb_in);
+		rte_free(q->lb_out);
+		rte_free(q);
+		return -1;
+	}
+
+	q->qgrp_id = (q_idx >> GRP_ID_SHIFT) & 0xF;
+	q->vf_id = (q_idx >> VF_ID_SHIFT) & 0x3F;
+	q->aq_id = q_idx & 0xF;
+	q->aq_depth = (conf->op_type == RTE_BBDEV_OP_TURBO_DEC) ?
+			(1 << d->acc100_conf.q_ul_4g.aq_depth_log2) :
+			(1 << d->acc100_conf.q_dl_4g.aq_depth_log2);
+
+	q->mmio_reg_enqueue = RTE_PTR_ADD(d->mmio_base,
+			queue_offset(d->pf_device,
+					q->vf_id, q->qgrp_id, q->aq_id));
+
+	rte_bbdev_log_debug(
+			"Setup dev%u q%u: qgrp_id=%u, vf_id=%u, aq_id=%u, aq_depth=%u, mmio_reg_enqueue=%p",
+			dev->data->dev_id, queue_id, q->qgrp_id, q->vf_id,
+			q->aq_id, q->aq_depth, q->mmio_reg_enqueue);
+
+	dev->data->queues[queue_id].queue_private = q;
+	return 0;
+}
+
+/* Release ACC100 queue */
+static int
+acc100_queue_release(struct rte_bbdev *dev, uint16_t q_id)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	struct acc100_queue *q = dev->data->queues[q_id].queue_private;
+
+	if (q != NULL) {
+		/* Mark the queue as unassigned */
+		d->q_assigned_bit_map[q->qgrp_id] &= ~(1 << q->aq_id);
+		rte_free(q->lb_in);
+		rte_free(q->lb_out);
+		rte_free(q);
+		dev->data->queues[q_id].queue_private = NULL;
+	}
+
 	return 0;
 }
 
@@ -258,8 +673,11 @@
 }
 
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
+	.setup_queues = acc100_setup_queues,
 	.close = acc100_dev_close,
 	.info_get = acc100_dev_info_get,
+	.queue_setup = acc100_queue_setup,
+	.queue_release = acc100_queue_release,
 };
 
 /* ACC100 PCI PF address map */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 662e2c8..0e2b79c 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -518,11 +518,56 @@ struct acc100_registry_addr {
 	.ddr_range = HWVfDmaDdrBaseRangeRoVf,
 };
 
+/* Structure associated with each queue. */
+struct __rte_cache_aligned acc100_queue {
+	union acc100_dma_desc *ring_addr;  /* Virtual address of sw ring */
+	rte_iova_t ring_addr_phys;  /* Physical address of software ring */
+	uint32_t sw_ring_head;  /* software ring head */
+	uint32_t sw_ring_tail;  /* software ring tail */
+	/* software ring size (descriptors, not bytes) */
+	uint32_t sw_ring_depth;
+	/* mask used to wrap enqueued descriptors on the sw ring */
+	uint32_t sw_ring_wrap_mask;
+	/* MMIO register used to enqueue descriptors */
+	void *mmio_reg_enqueue;
+	uint8_t vf_id;  /* VF ID (max = 63) */
+	uint8_t qgrp_id;  /* Queue Group ID */
+	uint16_t aq_id;  /* Atomic Queue ID */
+	uint16_t aq_depth;  /* Depth of atomic queue */
+	uint32_t aq_enqueued;  /* Count how many "batches" have been enqueued */
+	uint32_t aq_dequeued;  /* Count how many "batches" have been dequeued */
+	uint32_t irq_enable;  /* Enable ops dequeue interrupts if set to 1 */
+	struct rte_mempool *fcw_mempool;  /* FCW mempool */
+	enum rte_bbdev_op_type op_type;  /* Type of this Queue: TE or TD */
+	/* Internal Buffers for loopback input */
+	uint8_t *lb_in;
+	uint8_t *lb_out;
+	rte_iova_t lb_in_addr_phys;
+	rte_iova_t lb_out_addr_phys;
+	struct acc100_device *d;
+};
+
 /* Private data structure for each ACC100 device */
 struct acc100_device {
 	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
+	void *sw_rings_base;  /* Base addr of unaligned memory for sw rings */
+	void *sw_rings;  /* 64MBs of 64MB aligned memory for sw rings */
+	rte_iova_t sw_rings_phys;  /* Physical address of sw_rings */
+	/* Virtual address of the info memory routed to this function
+	 * under operation, whether it is a PF or a VF.
+	 */
+	union acc100_harq_layout_data *harq_layout;
+	uint32_t sw_ring_size;
 	uint32_t ddr_size; /* Size in kB */
+	uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */
+	rte_iova_t tail_ptr_phys; /* Physical address of tail pointers */
+	/* Max number of entries available for each queue in device, depending
+	 * on how many queues are enabled with configure()
+	 */
+	uint32_t sw_ring_max_depth;
 	struct acc100_conf acc100_conf; /* ACC100 Initial configuration */
+	/* Bitmap capturing which Queues have already been assigned */
+	uint16_t q_assigned_bit_map[ACC100_NUM_QGRPS];
 	bool pf_device; /**< True if this is a PF ACC100 device */
 	bool configured; /**< True if this ACC100 device is configured */
 };
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v9 05/10] baseband/acc100: add LDPC processing functions
  2020-09-29  0:29   ` [dpdk-dev] [PATCH v9 00/10] bbdev PMD ACC100 Nicolas Chautru
                       ` (3 preceding siblings ...)
  2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 04/10] baseband/acc100: add queue configuration Nicolas Chautru
@ 2020-09-29  0:29     ` Nicolas Chautru
  2020-09-30 16:53       ` Tom Rix
  2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 06/10] baseband/acc100: add HARQ loopback support Nicolas Chautru
                       ` (4 subsequent siblings)
  9 siblings, 1 reply; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-29  0:29 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Adding LDPC decode and encode processing operations

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
Acked-by: Dave Burley <dave.burley@accelercomm.com>
---
 doc/guides/bbdevs/features/acc100.ini    |    8 +-
 drivers/baseband/acc100/rte_acc100_pmd.c | 1625 +++++++++++++++++++++++++++++-
 drivers/baseband/acc100/rte_acc100_pmd.h |    3 +
 3 files changed, 1630 insertions(+), 6 deletions(-)

diff --git a/doc/guides/bbdevs/features/acc100.ini b/doc/guides/bbdevs/features/acc100.ini
index c89a4d7..40c7adc 100644
--- a/doc/guides/bbdevs/features/acc100.ini
+++ b/doc/guides/bbdevs/features/acc100.ini
@@ -6,9 +6,9 @@
 [Features]
 Turbo Decoder (4G)     = N
 Turbo Encoder (4G)     = N
-LDPC Decoder (5G)      = N
-LDPC Encoder (5G)      = N
-LLR/HARQ Compression   = N
-External DDR Access    = N
+LDPC Decoder (5G)      = Y
+LDPC Encoder (5G)      = Y
+LLR/HARQ Compression   = Y
+External DDR Access    = Y
 HW Accelerated         = Y
 BBDEV API              = Y
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 7a21c57..b223547 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -15,6 +15,9 @@
 #include <rte_hexdump.h>
 #include <rte_pci.h>
 #include <rte_bus_pci.h>
+#ifdef RTE_BBDEV_OFFLOAD_COST
+#include <rte_cycles.h>
+#endif
 
 #include <rte_bbdev.h>
 #include <rte_bbdev_pmd.h>
@@ -449,7 +452,6 @@
 	return 0;
 }
 
-
 /**
 * Report an ACC100 queue index which is free
  * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
@@ -634,6 +636,46 @@
 	struct acc100_device *d = dev->data->dev_private;
 
 	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
+		{
+			.type   = RTE_BBDEV_OP_LDPC_ENC,
+			.cap.ldpc_enc = {
+				.capability_flags =
+					RTE_BBDEV_LDPC_RATE_MATCH |
+					RTE_BBDEV_LDPC_CRC_24B_ATTACH |
+					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
+				.num_buffers_src =
+						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+				.num_buffers_dst =
+						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+			}
+		},
+		{
+			.type   = RTE_BBDEV_OP_LDPC_DEC,
+			.cap.ldpc_dec = {
+			.capability_flags =
+				RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK |
+				RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP |
+				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
+				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
+#ifdef ACC100_EXT_MEM
+				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
+				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
+#endif
+				RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE |
+				RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS |
+				RTE_BBDEV_LDPC_DECODE_BYPASS |
+				RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
+				RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
+				RTE_BBDEV_LDPC_LLR_COMPRESSION,
+			.llr_size = 8,
+			.llr_decimals = 1,
+			.num_buffers_src =
+					RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+			.num_buffers_hard_out =
+					RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+			.num_buffers_soft_out = 0,
+			}
+		},
 		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
 	};
 
@@ -669,9 +711,14 @@
 	dev_info->cpu_flag_reqs = NULL;
 	dev_info->min_alignment = 64;
 	dev_info->capabilities = bbdev_capabilities;
+#ifdef ACC100_EXT_MEM
 	dev_info->harq_buffer_size = d->ddr_size;
+#else
+	dev_info->harq_buffer_size = 0;
+#endif
 }
 
+
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.setup_queues = acc100_setup_queues,
 	.close = acc100_dev_close,
@@ -696,6 +743,1577 @@
 	{.device_id = 0},
 };
 
+/* Read flag value 0/1 from bitmap */
+static inline bool
+check_bit(uint32_t bitmap, uint32_t bitmask)
+{
+	return bitmap & bitmask;
+}
+
+static inline char *
+mbuf_append(struct rte_mbuf *m_head, struct rte_mbuf *m, uint16_t len)
+{
+	if (unlikely(len > rte_pktmbuf_tailroom(m)))
+		return NULL;
+
+	char *tail = (char *)m->buf_addr + m->data_off + m->data_len;
+	m->data_len = (uint16_t)(m->data_len + len);
+	m_head->pkt_len  = (m_head->pkt_len + len);
+	return tail;
+}
+
+/* Compute value of k0.
+ * Based on 3GPP 38.212 Table 5.4.2.1-2
+ * Starting position of different redundancy versions, k0
+ */
+static inline uint16_t
+get_k0(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv_index)
+{
+	if (rv_index == 0)
+		return 0;
+	uint16_t n = (bg == 1 ? N_ZC_1 : N_ZC_2) * z_c;
+	if (n_cb == n) {
+		if (rv_index == 1)
+			return (bg == 1 ? K0_1_1 : K0_1_2) * z_c;
+		else if (rv_index == 2)
+			return (bg == 1 ? K0_2_1 : K0_2_2) * z_c;
+		else
+			return (bg == 1 ? K0_3_1 : K0_3_2) * z_c;
+	}
+	/* LBRM case - includes a division by N */
+	if (rv_index == 1)
+		return (((bg == 1 ? K0_1_1 : K0_1_2) * n_cb)
+				/ n) * z_c;
+	else if (rv_index == 2)
+		return (((bg == 1 ? K0_2_1 : K0_2_2) * n_cb)
+				/ n) * z_c;
+	else
+		return (((bg == 1 ? K0_3_1 : K0_3_2) * n_cb)
+				/ n) * z_c;
+}
+
+/* Fill in a frame control word for LDPC encoding. */
+static inline void
+acc100_fcw_le_fill(const struct rte_bbdev_enc_op *op,
+		struct acc100_fcw_le *fcw, int num_cb)
+{
+	fcw->qm = op->ldpc_enc.q_m;
+	fcw->nfiller = op->ldpc_enc.n_filler;
+	fcw->BG = (op->ldpc_enc.basegraph - 1);
+	fcw->Zc = op->ldpc_enc.z_c;
+	fcw->ncb = op->ldpc_enc.n_cb;
+	fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_enc.basegraph,
+			op->ldpc_enc.rv_index);
+	fcw->rm_e = op->ldpc_enc.cb_params.e;
+	fcw->crc_select = check_bit(op->ldpc_enc.op_flags,
+			RTE_BBDEV_LDPC_CRC_24B_ATTACH);
+	fcw->bypass_intlv = check_bit(op->ldpc_enc.op_flags,
+			RTE_BBDEV_LDPC_INTERLEAVER_BYPASS);
+	fcw->mcb_count = num_cb;
+}
+
+/* Fill in a frame control word for LDPC decoding. */
+static inline void
+acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
+		union acc100_harq_layout_data *harq_layout)
+{
+	uint16_t harq_out_length, harq_in_length, ncb_p, k0_p, parity_offset;
+	uint16_t harq_index;
+	uint32_t l;
+	bool harq_prun = false;
+
+	fcw->qm = op->ldpc_dec.q_m;
+	fcw->nfiller = op->ldpc_dec.n_filler;
+	fcw->BG = (op->ldpc_dec.basegraph - 1);
+	fcw->Zc = op->ldpc_dec.z_c;
+	fcw->ncb = op->ldpc_dec.n_cb;
+	fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_dec.basegraph,
+			op->ldpc_dec.rv_index);
+	if (op->ldpc_dec.code_block_mode == 1)
+		fcw->rm_e = op->ldpc_dec.cb_params.e;
+	else
+		fcw->rm_e = (op->ldpc_dec.tb_params.r <
+				op->ldpc_dec.tb_params.cab) ?
+						op->ldpc_dec.tb_params.ea :
+						op->ldpc_dec.tb_params.eb;
+
+	fcw->hcin_en = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE);
+	fcw->hcout_en = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE);
+	fcw->crc_select = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK);
+	fcw->bypass_dec = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_DECODE_BYPASS);
+	fcw->bypass_intlv = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS);
+	if (op->ldpc_dec.q_m == 1) {
+		fcw->bypass_intlv = 1;
+		fcw->qm = 2;
+	}
+	fcw->hcin_decomp_mode = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+	fcw->hcout_comp_mode = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+	fcw->llr_pack_mode = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_LLR_COMPRESSION);
+	harq_index = op->ldpc_dec.harq_combined_output.offset /
+			ACC100_HARQ_OFFSET;
+#ifdef ACC100_EXT_MEM
+	/* Limit cases when HARQ pruning is valid */
+	harq_prun = ((op->ldpc_dec.harq_combined_output.offset %
+			ACC100_HARQ_OFFSET) == 0) &&
+			(op->ldpc_dec.harq_combined_output.offset <= UINT16_MAX
+			* ACC100_HARQ_OFFSET);
+#endif
+	if (fcw->hcin_en > 0) {
+		harq_in_length = op->ldpc_dec.harq_combined_input.length;
+		if (fcw->hcin_decomp_mode > 0)
+			harq_in_length = harq_in_length * 8 / 6;
+		harq_in_length = RTE_ALIGN(harq_in_length, 64);
+		if ((harq_layout[harq_index].offset > 0) & harq_prun) {
+			rte_bbdev_log_debug("HARQ IN offset unexpected for now");
+			fcw->hcin_size0 = harq_layout[harq_index].size0;
+			fcw->hcin_offset = harq_layout[harq_index].offset;
+			fcw->hcin_size1 = harq_in_length -
+					harq_layout[harq_index].offset;
+		} else {
+			fcw->hcin_size0 = harq_in_length;
+			fcw->hcin_offset = 0;
+			fcw->hcin_size1 = 0;
+		}
+	} else {
+		fcw->hcin_size0 = 0;
+		fcw->hcin_offset = 0;
+		fcw->hcin_size1 = 0;
+	}
+
+	fcw->itmax = op->ldpc_dec.iter_max;
+	fcw->itstop = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE);
+	fcw->synd_precoder = fcw->itstop;
+	/*
+	 * These are all implicitly set
+	 * fcw->synd_post = 0;
+	 * fcw->so_en = 0;
+	 * fcw->so_bypass_rm = 0;
+	 * fcw->so_bypass_intlv = 0;
+	 * fcw->dec_convllr = 0;
+	 * fcw->hcout_convllr = 0;
+	 * fcw->hcout_size1 = 0;
+	 * fcw->so_it = 0;
+	 * fcw->hcout_offset = 0;
+	 * fcw->negstop_th = 0;
+	 * fcw->negstop_it = 0;
+	 * fcw->negstop_en = 0;
+	 * fcw->gain_i = 1;
+	 * fcw->gain_h = 1;
+	 */
+	if (fcw->hcout_en > 0) {
+		parity_offset = (op->ldpc_dec.basegraph == 1 ? 20 : 8)
+			* op->ldpc_dec.z_c - op->ldpc_dec.n_filler;
+		k0_p = (fcw->k0 > parity_offset) ?
+				fcw->k0 - op->ldpc_dec.n_filler : fcw->k0;
+		ncb_p = fcw->ncb - op->ldpc_dec.n_filler;
+		l = k0_p + fcw->rm_e;
+		harq_out_length = (uint16_t) fcw->hcin_size0;
+		harq_out_length = RTE_MIN(RTE_MAX(harq_out_length, l), ncb_p);
+		harq_out_length = (harq_out_length + 0x3F) & 0xFFC0;
+		if ((k0_p > fcw->hcin_size0 + ACC100_HARQ_OFFSET_THRESHOLD) &&
+				harq_prun) {
+			fcw->hcout_size0 = (uint16_t) fcw->hcin_size0;
+			fcw->hcout_offset = k0_p & 0xFFC0;
+			fcw->hcout_size1 = harq_out_length - fcw->hcout_offset;
+		} else {
+			fcw->hcout_size0 = harq_out_length;
+			fcw->hcout_size1 = 0;
+			fcw->hcout_offset = 0;
+		}
+		harq_layout[harq_index].offset = fcw->hcout_offset;
+		harq_layout[harq_index].size0 = fcw->hcout_size0;
+	} else {
+		fcw->hcout_size0 = 0;
+		fcw->hcout_size1 = 0;
+		fcw->hcout_offset = 0;
+	}
+}
+
+/**
+ * Fills descriptor with data pointers of one block type.
+ *
+ * @param desc
+ *   Pointer to DMA descriptor.
+ * @param input
+ *   Pointer to the pointer to input data which will be encoded. It may be
+ *   changed and point to the next segment in the scatter-gather case.
+ * @param offset
+ *   Input offset in rte_mbuf structure. It is used for calculating the point
+ *   where data is starting.
+ * @param cb_len
+ *   Length of currently processed Code Block
+ * @param seg_total_left
+ *   Indicates how many bytes are still left in the segment (mbuf) for
+ *   further processing.
+ * @param op_flags
+ *   Store information about device capabilities
+ * @param next_triplet
+ *   Index for ACC100 DMA Descriptor triplet
+ *
+ * @return
+ *   Returns the index of the next triplet on success, or a negative value
+ *   if the lengths of the pkt and the processed CB do not match.
+ *
+ */
+static inline int
+acc100_dma_fill_blk_type_in(struct acc100_dma_req_desc *desc,
+		struct rte_mbuf **input, uint32_t *offset, uint32_t cb_len,
+		uint32_t *seg_total_left, int next_triplet)
+{
+	uint32_t part_len;
+	struct rte_mbuf *m = *input;
+
+	part_len = (*seg_total_left < cb_len) ? *seg_total_left : cb_len;
+	cb_len -= part_len;
+	*seg_total_left -= part_len;
+
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(m, *offset);
+	desc->data_ptrs[next_triplet].blen = part_len;
+	desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
+	desc->data_ptrs[next_triplet].last = 0;
+	desc->data_ptrs[next_triplet].dma_ext = 0;
+	*offset += part_len;
+	next_triplet++;
+
+	while (cb_len > 0) {
+		if (next_triplet < ACC100_DMA_MAX_NUM_POINTERS &&
+				m->next != NULL) {
+
+			m = m->next;
+			*seg_total_left = rte_pktmbuf_data_len(m);
+			part_len = (*seg_total_left < cb_len) ?
+					*seg_total_left :
+					cb_len;
+			desc->data_ptrs[next_triplet].address =
+					rte_pktmbuf_iova_offset(m, 0);
+			desc->data_ptrs[next_triplet].blen = part_len;
+			desc->data_ptrs[next_triplet].blkid =
+					ACC100_DMA_BLKID_IN;
+			desc->data_ptrs[next_triplet].last = 0;
+			desc->data_ptrs[next_triplet].dma_ext = 0;
+			cb_len -= part_len;
+			*seg_total_left -= part_len;
+			/* Initializing offset for next segment (mbuf) */
+			*offset = part_len;
+			next_triplet++;
+		} else {
+			rte_bbdev_log(ERR,
+				"Some data still left for processing: "
+				"data_left: %u, next_triplet: %u, next_mbuf: %p",
+				cb_len, next_triplet, m->next);
+			return -EINVAL;
+		}
+	}
+	/* Store the new mbuf as it may have changed in the scatter-gather case */
+	*input = m;
+
+	return next_triplet;
+}
+
+/* Fills descriptor with data pointers of one block type.
+ * Returns index of next triplet on success, other value if lengths of
+ * output data and processed mbuf do not match.
+ */
+static inline int
+acc100_dma_fill_blk_type_out(struct acc100_dma_req_desc *desc,
+		struct rte_mbuf *output, uint32_t out_offset,
+		uint32_t output_len, int next_triplet, int blk_id)
+{
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(output, out_offset);
+	desc->data_ptrs[next_triplet].blen = output_len;
+	desc->data_ptrs[next_triplet].blkid = blk_id;
+	desc->data_ptrs[next_triplet].last = 0;
+	desc->data_ptrs[next_triplet].dma_ext = 0;
+	next_triplet++;
+
+	return next_triplet;
+}
+
+static inline int
+acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
+		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
+		struct rte_mbuf *output, uint32_t *in_offset,
+		uint32_t *out_offset, uint32_t *out_length,
+		uint32_t *mbuf_total_left, uint32_t *seg_total_left)
+{
+	int next_triplet = 1; /* FCW already done */
+	uint16_t K, in_length_in_bits, in_length_in_bytes;
+	struct rte_bbdev_op_ldpc_enc *enc = &op->ldpc_enc;
+
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /**< Timestamp could be disabled */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+
+	K = (enc->basegraph == 1 ? 22 : 10) * enc->z_c;
+	in_length_in_bits = K - enc->n_filler;
+	if ((enc->op_flags & RTE_BBDEV_LDPC_CRC_24A_ATTACH) ||
+			(enc->op_flags & RTE_BBDEV_LDPC_CRC_24B_ATTACH))
+		in_length_in_bits -= 24;
+	in_length_in_bytes = in_length_in_bits >> 3;
+
+	if (unlikely((*mbuf_total_left == 0) ||
+			(*mbuf_total_left < in_length_in_bytes))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, in_length_in_bytes);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
+			in_length_in_bytes,
+			seg_total_left, next_triplet);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= in_length_in_bytes;
+
+	/* Set output length */
+	/* Integer round up division by 8 */
+	*out_length = (enc->cb_params.e + 7) >> 3;
+
+	next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
+			*out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	op->ldpc_enc.output.length += *out_length;
+	*out_offset += *out_length;
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->data_ptrs[next_triplet - 1].dma_ext = 0;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
+
+static inline int
+acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
+		struct acc100_dma_req_desc *desc,
+		struct rte_mbuf **input, struct rte_mbuf *h_output,
+		uint32_t *in_offset, uint32_t *h_out_offset,
+		uint32_t *h_out_length, uint32_t *mbuf_total_left,
+		uint32_t *seg_total_left,
+		struct acc100_fcw_ld *fcw)
+{
+	struct rte_bbdev_op_ldpc_dec *dec = &op->ldpc_dec;
+	int next_triplet = 1; /* FCW already done */
+	uint32_t input_length;
+	uint16_t output_length, crc24_overlap = 0;
+	uint16_t sys_cols, K, h_p_size, h_np_size;
+	bool h_comp = check_bit(dec->op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /**< Timestamp could be disabled */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+
+	if (check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP))
+		crc24_overlap = 24;
+
+	/* Compute some LDPC BG lengths */
+	input_length = dec->cb_params.e;
+	if (check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_LLR_COMPRESSION))
+		input_length = (input_length * 3 + 3) / 4;
+	sys_cols = (dec->basegraph == 1) ? 22 : 10;
+	K = sys_cols * dec->z_c;
+	output_length = K - dec->n_filler - crc24_overlap;
+
+	if (unlikely((*mbuf_total_left == 0) ||
+			(*mbuf_total_left < input_length))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, input_length);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input,
+			in_offset, input_length,
+			seg_total_left, next_triplet);
+
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
+		h_p_size = fcw->hcin_size0 + fcw->hcin_size1;
+		if (h_comp)
+			h_p_size = (h_p_size * 3 + 3) / 4;
+		desc->data_ptrs[next_triplet].address =
+				dec->harq_combined_input.offset;
+		desc->data_ptrs[next_triplet].blen = h_p_size;
+		desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN_HARQ;
+		desc->data_ptrs[next_triplet].dma_ext = 1;
+#ifndef ACC100_EXT_MEM
+		acc100_dma_fill_blk_type_out(
+				desc,
+				op->ldpc_dec.harq_combined_input.data,
+				op->ldpc_dec.harq_combined_input.offset,
+				h_p_size,
+				next_triplet,
+				ACC100_DMA_BLKID_IN_HARQ);
+#endif
+		next_triplet++;
+	}
+
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= input_length;
+
+	next_triplet = acc100_dma_fill_blk_type_out(desc, h_output,
+			*h_out_offset, output_length >> 3, next_triplet,
+			ACC100_DMA_BLKID_OUT_HARD);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
+		/* Pruned size of the HARQ */
+		h_p_size = fcw->hcout_size0 + fcw->hcout_size1;
+		/* Non-Pruned size of the HARQ */
+		h_np_size = fcw->hcout_offset > 0 ?
+				fcw->hcout_offset + fcw->hcout_size1 :
+				h_p_size;
+		if (h_comp) {
+			h_np_size = (h_np_size * 3 + 3) / 4;
+			h_p_size = (h_p_size * 3 + 3) / 4;
+		}
+		dec->harq_combined_output.length = h_np_size;
+		desc->data_ptrs[next_triplet].address =
+				dec->harq_combined_output.offset;
+		desc->data_ptrs[next_triplet].blen = h_p_size;
+		desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_OUT_HARQ;
+		desc->data_ptrs[next_triplet].dma_ext = 1;
+#ifndef ACC100_EXT_MEM
+		acc100_dma_fill_blk_type_out(
+				desc,
+				dec->harq_combined_output.data,
+				dec->harq_combined_output.offset,
+				h_p_size,
+				next_triplet,
+				ACC100_DMA_BLKID_OUT_HARQ);
+#endif
+		next_triplet++;
+	}
+
+	*h_out_length = output_length >> 3;
+	dec->hard_output.length += *h_out_length;
+	*h_out_offset += *h_out_length;
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
+
+static inline void
+acc100_dma_desc_ld_update(struct rte_bbdev_dec_op *op,
+		struct acc100_dma_req_desc *desc,
+		struct rte_mbuf *input, struct rte_mbuf *h_output,
+		uint32_t *in_offset, uint32_t *h_out_offset,
+		uint32_t *h_out_length,
+		union acc100_harq_layout_data *harq_layout)
+{
+	int next_triplet = 1; /* FCW already done */
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(input, *in_offset);
+	next_triplet++;
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
+		struct rte_bbdev_op_data hi = op->ldpc_dec.harq_combined_input;
+		desc->data_ptrs[next_triplet].address = hi.offset;
+#ifndef ACC100_EXT_MEM
+		desc->data_ptrs[next_triplet].address =
+				rte_pktmbuf_iova_offset(hi.data, hi.offset);
+#endif
+		next_triplet++;
+	}
+
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(h_output, *h_out_offset);
+	*h_out_length = desc->data_ptrs[next_triplet].blen;
+	next_triplet++;
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
+		desc->data_ptrs[next_triplet].address =
+				op->ldpc_dec.harq_combined_output.offset;
+		/* Adjust based on previous operation */
+		struct rte_bbdev_dec_op *prev_op = desc->op_addr;
+		op->ldpc_dec.harq_combined_output.length =
+				prev_op->ldpc_dec.harq_combined_output.length;
+		int16_t hq_idx = op->ldpc_dec.harq_combined_output.offset /
+				ACC100_HARQ_OFFSET;
+		int16_t prev_hq_idx =
+				prev_op->ldpc_dec.harq_combined_output.offset
+				/ ACC100_HARQ_OFFSET;
+		harq_layout[hq_idx].val = harq_layout[prev_hq_idx].val;
+#ifndef ACC100_EXT_MEM
+		struct rte_bbdev_op_data ho =
+				op->ldpc_dec.harq_combined_output;
+		desc->data_ptrs[next_triplet].address =
+				rte_pktmbuf_iova_offset(ho.data, ho.offset);
+#endif
+		next_triplet++;
+	}
+
+	op->ldpc_dec.hard_output.length += *h_out_length;
+	desc->op_addr = op;
+}
+
+
+/* Enqueue a number of operations to HW and update software rings */
+static inline void
+acc100_dma_enqueue(struct acc100_queue *q, uint16_t n,
+		struct rte_bbdev_stats *queue_stats)
+{
+	union acc100_enqueue_reg_fmt enq_req;
+#ifdef RTE_BBDEV_OFFLOAD_COST
+	uint64_t start_time = 0;
+	queue_stats->acc_offload_cycles = 0;
+#else
+	RTE_SET_USED(queue_stats);
+#endif
+
+	enq_req.val = 0;
+	/* Setting offset, 100b for 256 DMA Desc */
+	enq_req.addr_offset = ACC100_DESC_OFFSET;
+
+	/* Split ops into batches */
+	do {
+		union acc100_dma_desc *desc;
+		uint16_t enq_batch_size;
+		uint64_t offset;
+		rte_iova_t req_elem_addr;
+
+		enq_batch_size = RTE_MIN(n, MAX_ENQ_BATCH_SIZE);
+
+		/* Set flag on last descriptor in a batch */
+		desc = q->ring_addr + ((q->sw_ring_head + enq_batch_size - 1) &
+				q->sw_ring_wrap_mask);
+		desc->req.last_desc_in_batch = 1;
+
+		/* Calculate the 1st descriptor's address */
+		offset = ((q->sw_ring_head & q->sw_ring_wrap_mask) *
+				sizeof(union acc100_dma_desc));
+		req_elem_addr = q->ring_addr_phys + offset;
+
+		/* Fill enqueue struct */
+		enq_req.num_elem = enq_batch_size;
+		/* low 6 bits are not needed */
+		enq_req.req_elem_addr = (uint32_t)(req_elem_addr >> 6);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "Req sdone", desc, sizeof(*desc));
+#endif
+		rte_bbdev_log_debug(
+				"Enqueue %u reqs (phys %#"PRIx64") to reg %p",
+				enq_batch_size,
+				req_elem_addr,
+				(void *)q->mmio_reg_enqueue);
+
+		rte_wmb();
+
+#ifdef RTE_BBDEV_OFFLOAD_COST
+		/* Start time measurement for enqueue function offload. */
+		start_time = rte_rdtsc_precise();
+#endif
+		rte_bbdev_log(DEBUG, "Debug : MMIO Enqueue");
+		mmio_write(q->mmio_reg_enqueue, enq_req.val);
+
+#ifdef RTE_BBDEV_OFFLOAD_COST
+		queue_stats->acc_offload_cycles +=
+				rte_rdtsc_precise() - start_time;
+#endif
+
+		q->aq_enqueued++;
+		q->sw_ring_head += enq_batch_size;
+		n -= enq_batch_size;
+
+	} while (n);
+}
+
+/* Enqueue a number of encode operations for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
+		uint16_t total_enqueued_cbs, int16_t num)
+{
+	union acc100_dma_desc *desc = NULL;
+	uint32_t out_length;
+	struct rte_mbuf *output_head, *output;
+	int i, next_triplet;
+	uint16_t  in_length_in_bytes;
+	struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_le_fill(ops[0], &desc->req.fcw_le, num);
+
+	/** This could be done at polling */
+	desc->req.word0 = ACC100_DMA_DESC_TYPE;
+	desc->req.word1 = 0; /**< Timestamp could be disabled */
+	desc->req.word2 = 0;
+	desc->req.word3 = 0;
+	desc->req.numCBs = num;
+
+	in_length_in_bytes = ops[0]->ldpc_enc.input.data->data_len;
+	out_length = (enc->cb_params.e + 7) >> 3;
+	desc->req.m2dlen = 1 + num;
+	desc->req.d2mlen = num;
+	next_triplet = 1;
+
+	for (i = 0; i < num; i++) {
+		desc->req.data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(ops[i]->ldpc_enc.input.data, 0);
+		desc->req.data_ptrs[next_triplet].blen = in_length_in_bytes;
+		next_triplet++;
+		desc->req.data_ptrs[next_triplet].address =
+				rte_pktmbuf_iova_offset(
+				ops[i]->ldpc_enc.output.data, 0);
+		desc->req.data_ptrs[next_triplet].blen = out_length;
+		next_triplet++;
+		ops[i]->ldpc_enc.output.length = out_length;
+		output_head = output = ops[i]->ldpc_enc.output.data;
+		mbuf_append(output_head, output, out_length);
+		output->data_len = out_length;
+	}
+
+	desc->req.op_addr = ops[0];
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_le,
+			sizeof(desc->req.fcw_le) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+	/* Multiple CBs (num ops) were successfully prepared to enqueue */
+	return num;
+}
+
+/* Enqueue one encode operation for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+		uint16_t total_enqueued_cbs)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+		seg_total_left;
+	struct rte_mbuf *input, *output_head, *output;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_le_fill(op, &desc->req.fcw_le, 1);
+
+	input = op->ldpc_enc.input.data;
+	output_head = output = op->ldpc_enc.output.data;
+	in_offset = op->ldpc_enc.input.offset;
+	out_offset = op->ldpc_enc.output.offset;
+	out_length = 0;
+	mbuf_total_left = op->ldpc_enc.input.length;
+	seg_total_left = rte_pktmbuf_data_len(op->ldpc_enc.input.data)
+			- in_offset;
+
+	ret = acc100_dma_desc_le_fill(op, &desc->req, &input, output,
+			&in_offset, &out_offset, &out_length, &mbuf_total_left,
+			&seg_total_left);
+
+	if (unlikely(ret < 0))
+		return ret;
+
+	mbuf_append(output_head, output, out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_le,
+			sizeof(desc->req.fcw_le) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+
+	/* Check if any data left after processing one CB */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left after processing one CB: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+/* Enqueue one decode operation for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs, bool same_op)
+{
+	int ret;
+
+	union acc100_dma_desc *desc;
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	struct rte_mbuf *input, *h_output_head, *h_output;
+	uint32_t in_offset, h_out_offset, mbuf_total_left, h_out_length = 0;
+	input = op->ldpc_dec.input.data;
+	h_output_head = h_output = op->ldpc_dec.hard_output.data;
+	in_offset = op->ldpc_dec.input.offset;
+	h_out_offset = op->ldpc_dec.hard_output.offset;
+	mbuf_total_left = op->ldpc_dec.input.length;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(input == NULL)) {
+		rte_bbdev_log(ERR, "Invalid mbuf pointer");
+		return -EFAULT;
+	}
+#endif
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+
+	if (same_op) {
+		union acc100_dma_desc *prev_desc;
+		desc_idx = ((q->sw_ring_head + total_enqueued_cbs - 1)
+				& q->sw_ring_wrap_mask);
+		prev_desc = q->ring_addr + desc_idx;
+		uint8_t *prev_ptr = (uint8_t *) prev_desc;
+		uint8_t *new_ptr = (uint8_t *) desc;
+		/* Copy first 4 words and BDESCs */
+		rte_memcpy(new_ptr, prev_ptr, 16);
+		rte_memcpy(new_ptr + 36, prev_ptr + 36, 40);
+		desc->req.op_addr = prev_desc->req.op_addr;
+		/* Copy FCW */
+		rte_memcpy(new_ptr + ACC100_DESC_FCW_OFFSET,
+				prev_ptr + ACC100_DESC_FCW_OFFSET,
+				ACC100_FCW_LD_BLEN);
+		acc100_dma_desc_ld_update(op, &desc->req, input, h_output,
+				&in_offset, &h_out_offset,
+				&h_out_length, harq_layout);
+	} else {
+		struct acc100_fcw_ld *fcw;
+		uint32_t seg_total_left;
+		fcw = &desc->req.fcw_ld;
+		acc100_fcw_ld_fill(op, fcw, harq_layout);
+
+		/* Special handling when overusing mbuf */
+		if (fcw->rm_e < MAX_E_MBUF)
+			seg_total_left = rte_pktmbuf_data_len(input)
+					- in_offset;
+		else
+			seg_total_left = fcw->rm_e;
+
+		ret = acc100_dma_desc_ld_fill(op, &desc->req, &input, h_output,
+				&in_offset, &h_out_offset,
+				&h_out_length, &mbuf_total_left,
+				&seg_total_left, fcw);
+		if (unlikely(ret < 0))
+			return ret;
+	}
+
+	/* Hard output */
+	mbuf_append(h_output_head, h_output, h_out_length);
+#ifndef ACC100_EXT_MEM
+	if (op->ldpc_dec.harq_combined_output.length > 0) {
+		/* Push the HARQ output into host memory */
+		struct rte_mbuf *hq_output_head, *hq_output;
+		hq_output_head = op->ldpc_dec.harq_combined_output.data;
+		hq_output = op->ldpc_dec.harq_combined_output.data;
+		mbuf_append(hq_output_head, hq_output,
+				op->ldpc_dec.harq_combined_output.length);
+	}
+#endif
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
+			sizeof(desc->req.fcw_ld) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+
+/* Enqueue one decode operation for ACC100 device in TB mode */
+static inline int
+enqueue_ldpc_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint8_t r, c;
+	uint32_t in_offset, h_out_offset,
+		h_out_length, mbuf_total_left, seg_total_left;
+	struct rte_mbuf *input, *h_output_head, *h_output;
+	uint16_t current_enqueued_cbs = 0;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+	acc100_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout);
+
+	input = op->ldpc_dec.input.data;
+	h_output_head = h_output = op->ldpc_dec.hard_output.data;
+	in_offset = op->ldpc_dec.input.offset;
+	h_out_offset = op->ldpc_dec.hard_output.offset;
+	h_out_length = 0;
+	mbuf_total_left = op->ldpc_dec.input.length;
+	c = op->ldpc_dec.tb_params.c;
+	r = op->ldpc_dec.tb_params.r;
+
+	while (mbuf_total_left > 0 && r < c) {
+
+		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+
+		/* Set up DMA descriptor */
+		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+				& q->sw_ring_wrap_mask);
+		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+		desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
+		ret = acc100_dma_desc_ld_fill(op, &desc->req, &input,
+				h_output, &in_offset, &h_out_offset,
+				&h_out_length,
+				&mbuf_total_left, &seg_total_left,
+				&desc->req.fcw_ld);
+
+		if (unlikely(ret < 0))
+			return ret;
+
+		/* Hard output */
+		mbuf_append(h_output_head, h_output, h_out_length);
+
+		/* Set total number of CBs in TB */
+		desc->req.cbs_in_tb = cbs_in_tb;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "FCW", &desc->req.fcw_td,
+				sizeof(desc->req.fcw_td) - 8);
+		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+		if (seg_total_left == 0) {
+			/* Go to the next mbuf */
+			input = input->next;
+			in_offset = 0;
+			h_output = h_output->next;
+			h_out_offset = 0;
+		}
+		total_enqueued_cbs++;
+		current_enqueued_cbs++;
+		r++;
+	}
+
+	if (unlikely(desc == NULL))
+		return current_enqueued_cbs;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Check if any CBs left for processing */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some date still left for processing: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+	/* Set SDone on last CB descriptor for TB mode */
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	return current_enqueued_cbs;
+}
+
+
+/* Calculates number of CBs in processed encoder TB based on 'r' and input
+ * length.
+ */
+static inline uint8_t
+get_num_cbs_in_tb_enc(struct rte_bbdev_op_turbo_enc *turbo_enc)
+{
+	uint8_t c, c_neg, r, crc24_bits = 0;
+	uint16_t k, k_neg, k_pos;
+	uint8_t cbs_in_tb = 0;
+	int32_t length;
+
+	length = turbo_enc->input.length;
+	r = turbo_enc->tb_params.r;
+	c = turbo_enc->tb_params.c;
+	c_neg = turbo_enc->tb_params.c_neg;
+	k_neg = turbo_enc->tb_params.k_neg;
+	k_pos = turbo_enc->tb_params.k_pos;
+	crc24_bits = 0;
+	if (check_bit(turbo_enc->op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH))
+		crc24_bits = 24;
+	while (length > 0 && r < c) {
+		k = (r < c_neg) ? k_neg : k_pos;
+		length -= (k - crc24_bits) >> 3;
+		r++;
+		cbs_in_tb++;
+	}
+
+	return cbs_in_tb;
+}
+
+/* Calculates number of CBs in processed decoder TB based on 'r' and input
+ * length.
+ */
+static inline uint16_t
+get_num_cbs_in_tb_dec(struct rte_bbdev_op_turbo_dec *turbo_dec)
+{
+	uint8_t c, c_neg, r = 0;
+	uint16_t kw, k, k_neg, k_pos, cbs_in_tb = 0;
+	int32_t length;
+
+	length = turbo_dec->input.length;
+	r = turbo_dec->tb_params.r;
+	c = turbo_dec->tb_params.c;
+	c_neg = turbo_dec->tb_params.c_neg;
+	k_neg = turbo_dec->tb_params.k_neg;
+	k_pos = turbo_dec->tb_params.k_pos;
+	while (length > 0 && r < c) {
+		k = (r < c_neg) ? k_neg : k_pos;
+		kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
+		length -= kw;
+		r++;
+		cbs_in_tb++;
+	}
+
+	return cbs_in_tb;
+}
+
+/* Calculates number of CBs in processed LDPC decoder TB based on 'r' and
+ * input length.
+ */
+static inline uint16_t
+get_num_cbs_in_tb_ldpc_dec(struct rte_bbdev_op_ldpc_dec *ldpc_dec)
+{
+	uint16_t r, cbs_in_tb = 0;
+	int32_t length = ldpc_dec->input.length;
+	r = ldpc_dec->tb_params.r;
+	while (length > 0 && r < ldpc_dec->tb_params.c) {
+		length -=  (r < ldpc_dec->tb_params.cab) ?
+				ldpc_dec->tb_params.ea :
+				ldpc_dec->tb_params.eb;
+		r++;
+		cbs_in_tb++;
+	}
+	return cbs_in_tb;
+}
+
+/* Check whether we can mux encode operations with a common FCW */
+static inline bool
+check_mux(struct rte_bbdev_enc_op **ops, uint16_t num) {
+	uint16_t i;
+	if (num == 1)
+		return false;
+	for (i = 1; i < num; ++i) {
+		/* Only mux compatible code blocks */
+		if (memcmp((uint8_t *)(&ops[i]->ldpc_enc) + ENC_OFFSET,
+				(uint8_t *)(&ops[0]->ldpc_enc) + ENC_OFFSET,
+				CMP_ENC_SIZE) != 0)
+			return false;
+	}
+	return true;
+}
+
+/* Enqueue encode operations for ACC100 device in CB mode */
+static inline uint16_t
+acc100_enqueue_ldpc_enc_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i = 0;
+	union acc100_dma_desc *desc;
+	int ret, desc_idx = 0;
+	int16_t enq, left = num;
+
+	while (left > 0) {
+		if (unlikely(avail - 1 < 0))
+			break;
+		avail--;
+		enq = RTE_MIN(left, MUX_5GDL_DESC);
+		if (check_mux(&ops[i], enq)) {
+			ret = enqueue_ldpc_enc_n_op_cb(q, &ops[i],
+					desc_idx, enq);
+			if (ret < 0)
+				break;
+			i += enq;
+		} else {
+			ret = enqueue_ldpc_enc_one_op_cb(q, ops[i], desc_idx);
+			if (ret < 0)
+				break;
+			i++;
+		}
+		desc_idx++;
+		left = num - i;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + desc_idx - 1)
+			& q->sw_ring_wrap_mask);
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, desc_idx, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
+}
+
+/* Enqueue encode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	if (unlikely(num == 0))
+		return 0;
+	return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
+}
+
+/* Check if two consecutive LDPC decode operations can be muxed */
+static inline bool
+cmp_ldpc_dec_op(struct rte_bbdev_dec_op **ops) {
+	/* Only mux compatible code blocks */
+	if (memcmp((uint8_t *)(&ops[0]->ldpc_dec) + DEC_OFFSET,
+			(uint8_t *)(&ops[1]->ldpc_dec) +
+			DEC_OFFSET, CMP_DEC_SIZE) != 0) {
+		return false;
+	} else
+		return true;
+}
+
+
+/* Enqueue decode operations for ACC100 device in TB mode */
+static uint16_t
+acc100_enqueue_ldpc_dec_tb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i, enqueued_cbs = 0;
+	uint8_t cbs_in_tb;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		cbs_in_tb = get_num_cbs_in_tb_ldpc_dec(&ops[i]->ldpc_dec);
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - cbs_in_tb < 0))
+			break;
+		avail -= cbs_in_tb;
+
+		ret = enqueue_ldpc_dec_one_op_tb(q, ops[i],
+				enqueued_cbs, cbs_in_tb);
+		if (ret < 0)
+			break;
+		enqueued_cbs += ret;
+	}
+
+	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+	return i;
+}
+
+/* Enqueue decode operations for ACC100 device in CB mode */
+static uint16_t
+acc100_enqueue_ldpc_dec_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i;
+	union acc100_dma_desc *desc;
+	int ret;
+	bool same_op = false;
+	for (i = 0; i < num; ++i) {
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - 1 < 0))
+			break;
+		avail -= 1;
+
+		if (i > 0)
+			same_op = cmp_ldpc_dec_op(&ops[i-1]);
+		rte_bbdev_log(INFO, "Op %d %d %d %d %d %d %d %d %d %d %d %d\n",
+			i, ops[i]->ldpc_dec.op_flags, ops[i]->ldpc_dec.rv_index,
+			ops[i]->ldpc_dec.iter_max, ops[i]->ldpc_dec.iter_count,
+			ops[i]->ldpc_dec.basegraph, ops[i]->ldpc_dec.z_c,
+			ops[i]->ldpc_dec.n_cb, ops[i]->ldpc_dec.q_m,
+			ops[i]->ldpc_dec.n_filler, ops[i]->ldpc_dec.cb_params.e,
+			same_op);
+		ret = enqueue_ldpc_dec_one_op_cb(q, ops[i], i, same_op);
+		if (ret < 0)
+			break;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
+			& q->sw_ring_wrap_mask);
+
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, i, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+	return i;
+}
+
+/* Enqueue decode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t aq_avail = q->aq_depth +
+			(q->aq_dequeued - q->aq_enqueued) / 128;
+
+	if (unlikely((aq_avail == 0) || (num == 0)))
+		return 0;
+
+	if (ops[0]->ldpc_dec.code_block_mode == 0)
+		return acc100_enqueue_ldpc_dec_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_ldpc_dec_cb(q_data, ops, num);
+}
+
+
+/* Dequeue one encode operation from ACC100 device in CB mode */
+static inline int
+dequeue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
+		uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_enc_op *op;
+	int i;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	rsp.val = atom_desc.rsp.val;
+	rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+
+	op->status |= ((rsp.input_err)
+			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+	if (desc->req.last_desc_in_batch) {
+		(*aq_dequeued)++;
+		desc->req.last_desc_in_batch = 0;
+	}
+	desc->rsp.val = ACC100_DMA_DESC_TYPE;
+	desc->rsp.add_info_0 = 0; /*Reserved bits */
+	desc->rsp.add_info_1 = 0; /*Reserved bits */
+
+	/* Flag that the muxing causes loss of opaque data */
+	op->opaque_data = (void *)-1;
+	for (i = 0 ; i < desc->req.numCBs; i++)
+		ref_op[i] = op;
+
+	/* One op, covering numCBs muxed CBs, was successfully dequeued */
+	return desc->req.numCBs;
+}
+
+/* Dequeue one encode operation from ACC100 device in TB mode */
+static inline int
+dequeue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
+		uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, *last_desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_enc_op *op;
+	uint8_t i = 0;
+	uint16_t current_dequeued_cbs = 0, cbs_in_tb;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	/* Get number of CBs in dequeued TB */
+	cbs_in_tb = desc->req.cbs_in_tb;
+	/* Get last CB */
+	last_desc = q->ring_addr + ((q->sw_ring_tail
+			+ total_dequeued_cbs + cbs_in_tb - 1)
+			& q->sw_ring_wrap_mask);
+	/* Check if last CB in TB is ready to dequeue (and thus
+	 * the whole TB) - checking sdone bit. If not return.
+	 */
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
+			__ATOMIC_RELAXED);
+	if (!(atom_desc.rsp.val & ACC100_SDONE))
+		return -1;
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+
+	while (i < cbs_in_tb) {
+		desc = q->ring_addr + ((q->sw_ring_tail
+				+ total_dequeued_cbs)
+				& q->sw_ring_wrap_mask);
+		atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+				__ATOMIC_RELAXED);
+		rsp.val = atom_desc.rsp.val;
+		rte_bbdev_log_debug("Resp. desc %p: %x", desc,
+				rsp.val);
+
+		op->status |= ((rsp.input_err)
+				? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+		op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+		op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+		if (desc->req.last_desc_in_batch) {
+			(*aq_dequeued)++;
+			desc->req.last_desc_in_batch = 0;
+		}
+		desc->rsp.val = ACC100_DMA_DESC_TYPE;
+		desc->rsp.add_info_0 = 0;
+		desc->rsp.add_info_1 = 0;
+		total_dequeued_cbs++;
+		current_dequeued_cbs++;
+		i++;
+	}
+
+	*ref_op = op;
+
+	return current_dequeued_cbs;
+}
+
+/* Dequeue one decode operation from ACC100 device in CB mode */
+static inline int
+dequeue_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
+		struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_dec_op *op;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	rsp.val = atom_desc.rsp.val;
+	rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+	op->status |= ((rsp.input_err)
+			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+	if (op->status != 0)
+		q_data->queue_stats.dequeue_err_count++;
+
+	/* CRC invalid if error exists */
+	if (!op->status)
+		op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+	op->turbo_dec.iter_count = (uint8_t) rsp.iter_cnt / 2;
+	/* Check if this is the last desc in batch (Atomic Queue) */
+	if (desc->req.last_desc_in_batch) {
+		(*aq_dequeued)++;
+		desc->req.last_desc_in_batch = 0;
+	}
+	desc->rsp.val = ACC100_DMA_DESC_TYPE;
+	desc->rsp.add_info_0 = 0;
+	desc->rsp.add_info_1 = 0;
+	*ref_op = op;
+
+	/* One CB (op) was successfully dequeued */
+	return 1;
+}
+
+/* Dequeue one LDPC decode operation from ACC100 device in CB mode */
+static inline int
+dequeue_ldpc_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
+		struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_dec_op *op;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	rsp.val = atom_desc.rsp.val;
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+	op->status |= rsp.input_err << RTE_BBDEV_DATA_ERROR;
+	op->status |= rsp.dma_err << RTE_BBDEV_DRV_ERROR;
+	op->status |= rsp.fcw_err << RTE_BBDEV_DRV_ERROR;
+	if (op->status != 0)
+		q_data->queue_stats.dequeue_err_count++;
+
+	op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+	if (op->ldpc_dec.hard_output.length > 0 && !rsp.synd_ok)
+		op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
+	op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
+
+	/* Check if this is the last desc in batch (Atomic Queue) */
+	if (desc->req.last_desc_in_batch) {
+		(*aq_dequeued)++;
+		desc->req.last_desc_in_batch = 0;
+	}
+
+	desc->rsp.val = ACC100_DMA_DESC_TYPE;
+	desc->rsp.add_info_0 = 0;
+	desc->rsp.add_info_1 = 0;
+
+	*ref_op = op;
+
+	/* One CB (op) was successfully dequeued */
+	return 1;
+}
+
+/* Dequeue one decode operation from ACC100 device in TB mode. */
+static inline int
+dequeue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, *last_desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_dec_op *op;
+	uint8_t cbs_in_tb = 1, cb_idx = 0;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Get number of CBs in dequeued TB */
+	cbs_in_tb = desc->req.cbs_in_tb;
+	/* Get last CB */
+	last_desc = q->ring_addr + ((q->sw_ring_tail
+			+ dequeued_cbs + cbs_in_tb - 1)
+			& q->sw_ring_wrap_mask);
+	/* Check if last CB in TB is ready to dequeue (and thus
+	 * the whole TB) - checking sdone bit. If not return.
+	 */
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
+			__ATOMIC_RELAXED);
+	if (!(atom_desc.rsp.val & ACC100_SDONE))
+		return -1;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+
+	/* Read remaining CBs if any */
+	while (cb_idx < cbs_in_tb) {
+		desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+				& q->sw_ring_wrap_mask);
+		atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+				__ATOMIC_RELAXED);
+		rsp.val = atom_desc.rsp.val;
+		rte_bbdev_log_debug("Resp. desc %p: %x", desc,
+				rsp.val);
+
+		op->status |= ((rsp.input_err)
+				? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+		op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+		op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+		/* CRC invalid if error exists */
+		if (!op->status)
+			op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+		op->turbo_dec.iter_count = RTE_MAX((uint8_t) rsp.iter_cnt,
+				op->turbo_dec.iter_count);
+
+		/* Check if this is the last desc in batch (Atomic Queue) */
+		if (desc->req.last_desc_in_batch) {
+			(*aq_dequeued)++;
+			desc->req.last_desc_in_batch = 0;
+		}
+		desc->rsp.val = ACC100_DMA_DESC_TYPE;
+		desc->rsp.add_info_0 = 0;
+		desc->rsp.add_info_1 = 0;
+		dequeued_cbs++;
+		cb_idx++;
+	}
+
+	*ref_op = op;
+
+	return cb_idx;
+}
+
+/* Dequeue LDPC encode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t dequeue_num, i, dequeued_cbs = 0, dequeued_descs = 0;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = (avail < num) ? avail : num;
+
+	for (i = 0; i < dequeue_num; i++) {
+		ret = dequeue_enc_one_op_cb(q, &ops[dequeued_cbs],
+				dequeued_descs, &aq_dequeued);
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+		dequeued_descs++;
+		if (dequeued_cbs >= num)
+			break;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_descs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += dequeued_cbs;
+
+	return dequeued_cbs;
+}
+
+/* Dequeue decode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint16_t dequeue_num;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t i;
+	uint16_t dequeued_cbs = 0;
+	struct rte_bbdev_dec_op *op;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = (avail < num) ? avail : num;
+
+	for (i = 0; i < dequeue_num; ++i) {
+		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask))->req.op_addr;
+		if (op->ldpc_dec.code_block_mode == 0)
+			ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+		else
+			ret = dequeue_ldpc_dec_one_op_cb(
+					q_data, q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_cbs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += i;
+
+	return i;
+}
+
 /* Initialization Function */
 static void
 acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
@@ -703,6 +2321,10 @@
 	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
 
 	dev->dev_ops = &acc100_bbdev_ops;
+	dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
+	dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
+	dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
+	dev->dequeue_ldpc_dec_ops = acc100_dequeue_ldpc_dec;
 
 	((struct acc100_device *) dev->data->dev_private)->pf_device =
 			!strcmp(drv->driver.name,
@@ -815,4 +2437,3 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
 RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
 RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
 RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
-
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 0e2b79c..78686c1 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -88,6 +88,8 @@
 #define TMPL_PRI_3      0x0f0e0d0c
 #define QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
 #define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
+#define ACC100_FDONE    0x80000000
+#define ACC100_SDONE    0x40000000
 
 #define ACC100_NUM_TMPL  32
 #define VF_OFFSET_QOS 16 /* offset in Memory Space specific to QoS Mon */
@@ -398,6 +400,7 @@ struct __rte_packed acc100_dma_req_desc {
 union acc100_dma_desc {
 	struct acc100_dma_req_desc req;
 	union acc100_dma_rsp_desc rsp;
+	uint64_t atom_hdr;
 };
 
 
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v9 06/10] baseband/acc100: add HARQ loopback support
  2020-09-29  0:29   ` [dpdk-dev] [PATCH v9 00/10] bbdev PMD ACC100 Nicolas Chautru
                       ` (4 preceding siblings ...)
  2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 05/10] baseband/acc100: add LDPC processing functions Nicolas Chautru
@ 2020-09-29  0:29     ` Nicolas Chautru
  2020-09-30 17:25       ` Tom Rix
  2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 07/10] baseband/acc100: add support for 4G processing Nicolas Chautru
                       ` (3 subsequent siblings)
  9 siblings, 1 reply; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-29  0:29 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Additional support for HARQ memory loopback

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 158 +++++++++++++++++++++++++++++++
 1 file changed, 158 insertions(+)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index b223547..e484c0a 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -658,6 +658,7 @@
 				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
 				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
 #ifdef ACC100_EXT_MEM
+				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK |
 				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
 				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
 #endif
@@ -1480,12 +1481,169 @@
 	return 1;
 }
 
+static inline int
+harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs) {
+	struct acc100_fcw_ld *fcw;
+	union acc100_dma_desc *desc;
+	int next_triplet = 1;
+	struct rte_mbuf *hq_output_head, *hq_output;
+	uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
+	if (harq_in_length == 0) {
+		rte_bbdev_log(ERR, "Loopback of invalid null size\n");
+		return -EINVAL;
+	}
+
+	int h_comp = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
+			) ? 1 : 0;
+	if (h_comp == 1)
+		harq_in_length = harq_in_length * 8 / 6;
+	harq_in_length = RTE_ALIGN(harq_in_length, 64);
+	uint16_t harq_dma_length_in = (h_comp == 0) ?
+			harq_in_length :
+			harq_in_length * 6 / 8;
+	uint16_t harq_dma_length_out = harq_dma_length_in;
+	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+	uint16_t harq_index = (ddr_mem_in ?
+			op->ldpc_dec.harq_combined_input.offset :
+			op->ldpc_dec.harq_combined_output.offset)
+			/ ACC100_HARQ_OFFSET;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	fcw = &desc->req.fcw_ld;
+	/* Set the FCW from loopback into DDR */
+	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
+	fcw->FCWversion = ACC100_FCW_VER;
+	fcw->qm = 2;
+	fcw->Zc = 384;
+	if (harq_in_length < 16 * N_ZC_1)
+		fcw->Zc = 16;
+	fcw->ncb = fcw->Zc * N_ZC_1;
+	fcw->rm_e = 2;
+	fcw->hcin_en = 1;
+	fcw->hcout_en = 1;
+
+	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d\n",
+			ddr_mem_in, harq_index,
+			harq_layout[harq_index].offset, harq_in_length,
+			harq_dma_length_in);
+
+	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
+		fcw->hcin_size0 = harq_layout[harq_index].size0;
+		fcw->hcin_offset = harq_layout[harq_index].offset;
+		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
+		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
+		if (h_comp == 1)
+			harq_dma_length_in = harq_dma_length_in * 6 / 8;
+	} else {
+		fcw->hcin_size0 = harq_in_length;
+	}
+	harq_layout[harq_index].val = 0;
+	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
+			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
+	fcw->hcout_size0 = harq_in_length;
+	fcw->hcin_decomp_mode = h_comp;
+	fcw->hcout_comp_mode = h_comp;
+	fcw->gain_i = 1;
+	fcw->gain_h = 1;
+
+	/* Set the prefix of descriptor. This could be done at polling */
+	desc->req.word0 = ACC100_DMA_DESC_TYPE;
+	desc->req.word1 = 0; /**< Timestamp could be disabled */
+	desc->req.word2 = 0;
+	desc->req.word3 = 0;
+	desc->req.numCBs = 1;
+
+	/* Null LLR input for Decoder */
+	desc->req.data_ptrs[next_triplet].address =
+			q->lb_in_addr_phys;
+	desc->req.data_ptrs[next_triplet].blen = 2;
+	desc->req.data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
+	desc->req.data_ptrs[next_triplet].last = 0;
+	desc->req.data_ptrs[next_triplet].dma_ext = 0;
+	next_triplet++;
+
+	/* HARQ Combine input from either Memory interface */
+	if (!ddr_mem_in) {
+		next_triplet = acc100_dma_fill_blk_type_out(&desc->req,
+				op->ldpc_dec.harq_combined_input.data,
+				op->ldpc_dec.harq_combined_input.offset,
+				harq_dma_length_in,
+				next_triplet,
+				ACC100_DMA_BLKID_IN_HARQ);
+	} else {
+		desc->req.data_ptrs[next_triplet].address =
+				op->ldpc_dec.harq_combined_input.offset;
+		desc->req.data_ptrs[next_triplet].blen =
+				harq_dma_length_in;
+		desc->req.data_ptrs[next_triplet].blkid =
+				ACC100_DMA_BLKID_IN_HARQ;
+		desc->req.data_ptrs[next_triplet].dma_ext = 1;
+		next_triplet++;
+	}
+	desc->req.data_ptrs[next_triplet - 1].last = 1;
+	desc->req.m2dlen = next_triplet;
+
+	/* Dropped decoder hard output */
+	desc->req.data_ptrs[next_triplet].address =
+			q->lb_out_addr_phys;
+	desc->req.data_ptrs[next_triplet].blen = BYTES_IN_WORD;
+	desc->req.data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_OUT_HARD;
+	desc->req.data_ptrs[next_triplet].last = 0;
+	desc->req.data_ptrs[next_triplet].dma_ext = 0;
+	next_triplet++;
+
+	/* HARQ Combine output to either Memory interface */
+	if (check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE
+			)) {
+		desc->req.data_ptrs[next_triplet].address =
+				op->ldpc_dec.harq_combined_output.offset;
+		desc->req.data_ptrs[next_triplet].blen =
+				harq_dma_length_out;
+		desc->req.data_ptrs[next_triplet].blkid =
+				ACC100_DMA_BLKID_OUT_HARQ;
+		desc->req.data_ptrs[next_triplet].dma_ext = 1;
+		next_triplet++;
+	} else {
+		hq_output_head = op->ldpc_dec.harq_combined_output.data;
+		hq_output = op->ldpc_dec.harq_combined_output.data;
+		next_triplet = acc100_dma_fill_blk_type_out(
+				&desc->req,
+				op->ldpc_dec.harq_combined_output.data,
+				op->ldpc_dec.harq_combined_output.offset,
+				harq_dma_length_out,
+				next_triplet,
+				ACC100_DMA_BLKID_OUT_HARQ);
+		/* HARQ output */
+		mbuf_append(hq_output_head, hq_output, harq_dma_length_out);
+		op->ldpc_dec.harq_combined_output.length =
+				harq_dma_length_out;
+	}
+	desc->req.data_ptrs[next_triplet - 1].last = 1;
+	desc->req.d2mlen = next_triplet - desc->req.m2dlen;
+	desc->req.op_addr = op;
+
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
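[Editorial note] The HARQ loopback sizing used above (and set up earlier in `harq_loopback()`) follows a simple rule: with 6-bit compression the logical LLR length is the stored length scaled by 8/6 and rounded up to a 64-byte boundary, while the DMA length is the 6/8-compressed equivalent. A minimal standalone sketch of that arithmetic, not part of the patch (helper names are hypothetical):

```c
#include <stdint.h>

/* Hypothetical helpers mirroring the HARQ loopback sizing:
 * logical length = stored length * 8/6 (if compressed), aligned up to 64B;
 * DMA length     = logical length * 6/8 (if compressed).
 */
static uint16_t harq_logical_len(uint16_t stored_len, int h_comp)
{
	uint32_t len = stored_len;

	if (h_comp)
		len = len * 8 / 6;
	return (uint16_t)((len + 63) & ~63u); /* RTE_ALIGN(len, 64) */
}

static uint16_t harq_dma_len(uint16_t logical_len, int h_comp)
{
	return h_comp ? (uint16_t)(logical_len * 6 / 8) : logical_len;
}
```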
 /** Enqueue one decode operations for ACC100 device in CB mode */
 static inline int
 enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
 		uint16_t total_enqueued_cbs, bool same_op)
 {
 	int ret;
+	if (unlikely(check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK))) {
+		ret = harq_loopback(q, op, total_enqueued_cbs);
+		return ret;
+	}
 
 	union acc100_dma_desc *desc;
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v9 07/10] baseband/acc100: add support for 4G processing
  2020-09-29  0:29   ` [dpdk-dev] [PATCH v9 00/10] bbdev PMD ACC100 Nicolas Chautru
                       ` (5 preceding siblings ...)
  2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 06/10] baseband/acc100: add HARQ loopback support Nicolas Chautru
@ 2020-09-29  0:29     ` Nicolas Chautru
  2020-09-30 18:37       ` Tom Rix
  2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 08/10] baseband/acc100: add interrupt support to PMD Nicolas Chautru
                       ` (2 subsequent siblings)
  9 siblings, 1 reply; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-29  0:29 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Adding capability for 4G encoder and decoder processing

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 doc/guides/bbdevs/features/acc100.ini    |    4 +-
 drivers/baseband/acc100/rte_acc100_pmd.c | 1010 ++++++++++++++++++++++++++++--
 2 files changed, 945 insertions(+), 69 deletions(-)

diff --git a/doc/guides/bbdevs/features/acc100.ini b/doc/guides/bbdevs/features/acc100.ini
index 40c7adc..642cd48 100644
--- a/doc/guides/bbdevs/features/acc100.ini
+++ b/doc/guides/bbdevs/features/acc100.ini
@@ -4,8 +4,8 @@
 ; Refer to default.ini for the full list of available PMD features.
 ;
 [Features]
-Turbo Decoder (4G)     = N
-Turbo Encoder (4G)     = N
+Turbo Decoder (4G)     = Y
+Turbo Encoder (4G)     = Y
 LDPC Decoder (5G)      = Y
 LDPC Encoder (5G)      = Y
 LLR/HARQ Compression   = Y
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index e484c0a..7d4c3df 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -339,7 +339,6 @@
 	free_base_addresses(base_addrs, i);
 }
 
-
 /* Allocate 64MB memory used for all software rings */
 static int
 acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
@@ -637,6 +636,41 @@
 
 	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
 		{
+			.type = RTE_BBDEV_OP_TURBO_DEC,
+			.cap.turbo_dec = {
+				.capability_flags =
+					RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE |
+					RTE_BBDEV_TURBO_CRC_TYPE_24B |
+					RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
+					RTE_BBDEV_TURBO_EARLY_TERMINATION |
+					RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
+					RTE_BBDEV_TURBO_MAP_DEC |
+					RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP |
+					RTE_BBDEV_TURBO_DEC_SCATTER_GATHER,
+				.max_llr_modulus = INT8_MAX,
+				.num_buffers_src =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+				.num_buffers_hard_out =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+				.num_buffers_soft_out =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+			}
+		},
+		{
+			.type = RTE_BBDEV_OP_TURBO_ENC,
+			.cap.turbo_enc = {
+				.capability_flags =
+					RTE_BBDEV_TURBO_CRC_24B_ATTACH |
+					RTE_BBDEV_TURBO_RV_INDEX_BYPASS |
+					RTE_BBDEV_TURBO_RATE_MATCH |
+					RTE_BBDEV_TURBO_ENC_SCATTER_GATHER,
+				.num_buffers_src =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+				.num_buffers_dst =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+			}
+		},
+		{
 			.type   = RTE_BBDEV_OP_LDPC_ENC,
 			.cap.ldpc_enc = {
 				.capability_flags =
@@ -719,7 +753,6 @@
 #endif
 }
 
-
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.setup_queues = acc100_setup_queues,
 	.close = acc100_dev_close,
@@ -763,6 +796,58 @@
 	return tail;
 }
 
+/* Fill in a frame control word for turbo encoding. */
+static inline void
+acc100_fcw_te_fill(const struct rte_bbdev_enc_op *op, struct acc100_fcw_te *fcw)
+{
+	fcw->code_block_mode = op->turbo_enc.code_block_mode;
+	if (fcw->code_block_mode == 0) { /* For TB mode */
+		fcw->k_neg = op->turbo_enc.tb_params.k_neg;
+		fcw->k_pos = op->turbo_enc.tb_params.k_pos;
+		fcw->c_neg = op->turbo_enc.tb_params.c_neg;
+		fcw->c = op->turbo_enc.tb_params.c;
+		fcw->ncb_neg = op->turbo_enc.tb_params.ncb_neg;
+		fcw->ncb_pos = op->turbo_enc.tb_params.ncb_pos;
+
+		if (check_bit(op->turbo_enc.op_flags,
+				RTE_BBDEV_TURBO_RATE_MATCH)) {
+			fcw->bypass_rm = 0;
+			fcw->cab = op->turbo_enc.tb_params.cab;
+			fcw->ea = op->turbo_enc.tb_params.ea;
+			fcw->eb = op->turbo_enc.tb_params.eb;
+		} else {
+			/* E is set to the encoding output size when RM is
+			 * bypassed.
+			 */
+			fcw->bypass_rm = 1;
+			fcw->cab = fcw->c_neg;
+			fcw->ea = 3 * fcw->k_neg + 12;
+			fcw->eb = 3 * fcw->k_pos + 12;
+		}
+	} else { /* For CB mode */
+		fcw->k_pos = op->turbo_enc.cb_params.k;
+		fcw->ncb_pos = op->turbo_enc.cb_params.ncb;
+
+		if (check_bit(op->turbo_enc.op_flags,
+				RTE_BBDEV_TURBO_RATE_MATCH)) {
+			fcw->bypass_rm = 0;
+			fcw->eb = op->turbo_enc.cb_params.e;
+		} else {
+			/* E is set to the encoding output size when RM is
+			 * bypassed.
+			 */
+			fcw->bypass_rm = 1;
+			fcw->eb = 3 * fcw->k_pos + 12;
+		}
+	}
+
+	fcw->bypass_rv_idx1 = check_bit(op->turbo_enc.op_flags,
+			RTE_BBDEV_TURBO_RV_INDEX_BYPASS);
+	fcw->code_block_crc = check_bit(op->turbo_enc.op_flags,
+			RTE_BBDEV_TURBO_CRC_24B_ATTACH);
+	fcw->rv_idx1 = op->turbo_enc.rv_index;
+}
+
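[Editorial note] The bypass-RM branches in `acc100_fcw_te_fill()` above hard-code E to the full rate-1/3 turbo encoder output: three streams of K bits plus 4 tail bits per stream. A one-line sketch of that rule, illustrative only:

```c
#include <stdint.h>

/* Rate-1/3 turbo encoder output when rate matching is bypassed:
 * 3 streams of K bits plus 3 * 4 tail bits => E = 3*K + 12.
 */
static uint32_t turbo_enc_out_bits(uint16_t k)
{
	return 3u * (uint32_t)k + 12u;
}
```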
 /* Compute value of k0.
  * Based on 3GPP 38.212 Table 5.4.2.1-2
  * Starting position of different redundancy versions, k0
@@ -813,6 +898,25 @@
 	fcw->mcb_count = num_cb;
 }
 
+/* Fill in a frame control word for turbo decoding. */
+static inline void
+acc100_fcw_td_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_td *fcw)
+{
+	/* Note: Early termination is always enabled for 4GUL */
+	fcw->fcw_ver = 1;
+	if (op->turbo_dec.code_block_mode == 0)
+		fcw->k_pos = op->turbo_dec.tb_params.k_pos;
+	else
+		fcw->k_pos = op->turbo_dec.cb_params.k;
+	fcw->turbo_crc_type = check_bit(op->turbo_dec.op_flags,
+			RTE_BBDEV_TURBO_CRC_TYPE_24B);
+	fcw->bypass_sb_deint = 0;
+	fcw->raw_decoder_input_on = 0;
+	fcw->max_iter = op->turbo_dec.iter_max;
+	fcw->half_iter_on = !check_bit(op->turbo_dec.op_flags,
+			RTE_BBDEV_TURBO_HALF_ITERATION_EVEN);
+}
+
 /* Fill in a frame control word for LDPC decoding. */
 static inline void
 acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
@@ -1042,6 +1146,87 @@
 }
 
 static inline int
+acc100_dma_desc_te_fill(struct rte_bbdev_enc_op *op,
+		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
+		struct rte_mbuf *output, uint32_t *in_offset,
+		uint32_t *out_offset, uint32_t *out_length,
+		uint32_t *mbuf_total_left, uint32_t *seg_total_left, uint8_t r)
+{
+	int next_triplet = 1; /* FCW already done */
+	uint32_t e, ea, eb, length;
+	uint16_t k, k_neg, k_pos;
+	uint8_t cab, c_neg;
+
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /**< Timestamp could be disabled */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+
+	if (op->turbo_enc.code_block_mode == 0) {
+		ea = op->turbo_enc.tb_params.ea;
+		eb = op->turbo_enc.tb_params.eb;
+		cab = op->turbo_enc.tb_params.cab;
+		k_neg = op->turbo_enc.tb_params.k_neg;
+		k_pos = op->turbo_enc.tb_params.k_pos;
+		c_neg = op->turbo_enc.tb_params.c_neg;
+		e = (r < cab) ? ea : eb;
+		k = (r < c_neg) ? k_neg : k_pos;
+	} else {
+		e = op->turbo_enc.cb_params.e;
+		k = op->turbo_enc.cb_params.k;
+	}
+
+	if (check_bit(op->turbo_enc.op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH))
+		length = (k - 24) >> 3;
+	else
+		length = k >> 3;
+
+	if (unlikely((*mbuf_total_left == 0) || (*mbuf_total_left < length))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, length);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
+			length, seg_total_left, next_triplet);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= length;
+
+	/* Set output length */
+	if (check_bit(op->turbo_enc.op_flags, RTE_BBDEV_TURBO_RATE_MATCH))
+		/* Integer round up division by 8 */
+		*out_length = (e + 7) >> 3;
+	else
+		*out_length = (k >> 3) * 3 + 2;
+
+	next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
+			*out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	op->turbo_enc.output.length += *out_length;
+	*out_offset += *out_length;
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
+
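[Editorial note] The output-length computation in `acc100_dma_desc_te_fill()` above reduces to a byte count: ceil(E/8) when rate matching is on, otherwise the full 3*K+12-bit codeword. A standalone sketch, outside the driver:

```c
#include <stdint.h>

/* Encoder output length in bytes, as computed above:
 * rate matching on : ceil(e / 8)
 * rate matching off: 3*K bits plus 12 tail bits, i.e. (k/8)*3 + 2 bytes
 */
static uint32_t enc_out_len_bytes(uint32_t e, uint16_t k, int rate_match)
{
	if (rate_match)
		return (e + 7) >> 3;
	return ((uint32_t)k >> 3) * 3 + 2;
}
```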
+static inline int
 acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
 		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
 		struct rte_mbuf *output, uint32_t *in_offset,
@@ -1110,6 +1295,117 @@
 }
 
 static inline int
+acc100_dma_desc_td_fill(struct rte_bbdev_dec_op *op,
+		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
+		struct rte_mbuf *h_output, struct rte_mbuf *s_output,
+		uint32_t *in_offset, uint32_t *h_out_offset,
+		uint32_t *s_out_offset, uint32_t *h_out_length,
+		uint32_t *s_out_length, uint32_t *mbuf_total_left,
+		uint32_t *seg_total_left, uint8_t r)
+{
+	int next_triplet = 1; /* FCW already done */
+	uint16_t k;
+	uint16_t crc24_overlap = 0;
+	uint32_t e, kw;
+
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /**< Timestamp could be disabled */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+
+	if (op->turbo_dec.code_block_mode == 0) {
+		k = (r < op->turbo_dec.tb_params.c_neg)
+			? op->turbo_dec.tb_params.k_neg
+			: op->turbo_dec.tb_params.k_pos;
+		e = (r < op->turbo_dec.tb_params.cab)
+			? op->turbo_dec.tb_params.ea
+			: op->turbo_dec.tb_params.eb;
+	} else {
+		k = op->turbo_dec.cb_params.k;
+		e = op->turbo_dec.cb_params.e;
+	}
+
+	if ((op->turbo_dec.code_block_mode == 0)
+		&& !check_bit(op->turbo_dec.op_flags,
+		RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP))
+		crc24_overlap = 24;
+
+	/* Calculates circular buffer size.
+	 * According to 3gpp 36.212 section 5.1.4.2
+	 *   Kw = 3 * Kpi,
+	 * where:
+	 *   Kpi = nCol * nRow
+	 * where nCol is 32 and nRow can be calculated from:
+	 *   D =< nCol * nRow
+	 * where D is the size of each output from turbo encoder block (k + 4).
+	 */
+	kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
+
+	if (unlikely((*mbuf_total_left == 0) || (*mbuf_total_left < kw))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, kw);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset, kw,
+			seg_total_left, next_triplet);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= kw;
+
+	next_triplet = acc100_dma_fill_blk_type_out(
+			desc, h_output, *h_out_offset,
+			k >> 3, next_triplet, ACC100_DMA_BLKID_OUT_HARD);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+
+	*h_out_length = ((k - crc24_overlap) >> 3);
+	op->turbo_dec.hard_output.length += *h_out_length;
+	*h_out_offset += *h_out_length;
+
+	/* Soft output */
+	if (check_bit(op->turbo_dec.op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT)) {
+		if (check_bit(op->turbo_dec.op_flags,
+				RTE_BBDEV_TURBO_EQUALIZER))
+			*s_out_length = e;
+		else
+			*s_out_length = (k * 3) + 12;
+
+		next_triplet = acc100_dma_fill_blk_type_out(desc, s_output,
+				*s_out_offset, *s_out_length, next_triplet,
+				ACC100_DMA_BLKID_OUT_SOFT);
+		if (unlikely(next_triplet < 0)) {
+			rte_bbdev_log(ERR,
+					"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+					op);
+			return -1;
+		}
+
+		op->turbo_dec.soft_output.length += *s_out_length;
+		*s_out_offset += *s_out_length;
+	}
+
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
+
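[Editorial note] The circular-buffer size commented in `acc100_dma_desc_td_fill()` (3GPP 36.212 section 5.1.4.2) can be checked in isolation: nRow is derived from D = k + 4 rounded up to a multiple of nCol = 32. Sketch, not part of the patch:

```c
#include <stdint.h>

/* Kw = 3 * Kpi, with Kpi = nCol * nRow, nCol = 32 and D = k + 4 <= Kpi */
static uint32_t turbo_dec_kw(uint16_t k)
{
	uint32_t d = (uint32_t)k + 4;        /* encoder output per stream */
	uint32_t kpi = (d + 31) & ~31u;      /* RTE_ALIGN_CEIL(d, 32) */
	return kpi * 3;
}
```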
+static inline int
 acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
 		struct acc100_dma_req_desc *desc,
 		struct rte_mbuf **input, struct rte_mbuf *h_output,
@@ -1374,6 +1670,57 @@
 
 /* Enqueue one encode operations for ACC100 device in CB mode */
 static inline int
+enqueue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+		uint16_t total_enqueued_cbs)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+		seg_total_left;
+	struct rte_mbuf *input, *output_head, *output;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_te_fill(op, &desc->req.fcw_te);
+
+	input = op->turbo_enc.input.data;
+	output_head = output = op->turbo_enc.output.data;
+	in_offset = op->turbo_enc.input.offset;
+	out_offset = op->turbo_enc.output.offset;
+	out_length = 0;
+	mbuf_total_left = op->turbo_enc.input.length;
+	seg_total_left = rte_pktmbuf_data_len(op->turbo_enc.input.data)
+			- in_offset;
+
+	ret = acc100_dma_desc_te_fill(op, &desc->req, &input, output,
+			&in_offset, &out_offset, &out_length, &mbuf_total_left,
+			&seg_total_left, 0);
+
+	if (unlikely(ret < 0))
+		return ret;
+
+	mbuf_append(output_head, output, out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_te,
+			sizeof(desc->req.fcw_te) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+
+	/* Check if any data left after processing one CB */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left after processing one CB: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+/* Enqueue a group of encode operations for ACC100 device in CB mode */
+static inline int
 enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
 		uint16_t total_enqueued_cbs, int16_t num)
 {
@@ -1481,78 +1828,235 @@
 	return 1;
 }
 
-static inline int
-harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
-		uint16_t total_enqueued_cbs) {
-	struct acc100_fcw_ld *fcw;
-	union acc100_dma_desc *desc;
-	int next_triplet = 1;
-	struct rte_mbuf *hq_output_head, *hq_output;
-	uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
-	if (harq_in_length == 0) {
-		rte_bbdev_log(ERR, "Loopback of invalid null size\n");
-		return -EINVAL;
-	}
 
-	int h_comp = check_bit(op->ldpc_dec.op_flags,
-			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
-			) ? 1 : 0;
-	if (h_comp == 1)
-		harq_in_length = harq_in_length * 8 / 6;
-	harq_in_length = RTE_ALIGN(harq_in_length, 64);
-	uint16_t harq_dma_length_in = (h_comp == 0) ?
-			harq_in_length :
-			harq_in_length * 6 / 8;
-	uint16_t harq_dma_length_out = harq_dma_length_in;
-	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
-			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
-	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
-	uint16_t harq_index = (ddr_mem_in ?
-			op->ldpc_dec.harq_combined_input.offset :
-			op->ldpc_dec.harq_combined_output.offset)
-			/ ACC100_HARQ_OFFSET;
+/* Enqueue one encode operation for ACC100 device in TB mode. */
+static inline int
+enqueue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint8_t r, c;
+	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+		seg_total_left;
+	struct rte_mbuf *input, *output_head, *output;
+	uint16_t current_enqueued_cbs = 0;
 
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
-	fcw = &desc->req.fcw_ld;
-	/* Set the FCW from loopback into DDR */
-	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
-	fcw->FCWversion = ACC100_FCW_VER;
-	fcw->qm = 2;
-	fcw->Zc = 384;
-	if (harq_in_length < 16 * N_ZC_1)
-		fcw->Zc = 16;
-	fcw->ncb = fcw->Zc * N_ZC_1;
-	fcw->rm_e = 2;
-	fcw->hcin_en = 1;
-	fcw->hcout_en = 1;
+	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+	acc100_fcw_te_fill(op, &desc->req.fcw_te);
 
-	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d\n",
-			ddr_mem_in, harq_index,
-			harq_layout[harq_index].offset, harq_in_length,
-			harq_dma_length_in);
+	input = op->turbo_enc.input.data;
+	output_head = output = op->turbo_enc.output.data;
+	in_offset = op->turbo_enc.input.offset;
+	out_offset = op->turbo_enc.output.offset;
+	out_length = 0;
+	mbuf_total_left = op->turbo_enc.input.length;
 
-	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
-		fcw->hcin_size0 = harq_layout[harq_index].size0;
-		fcw->hcin_offset = harq_layout[harq_index].offset;
-		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
-		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
-		if (h_comp == 1)
-			harq_dma_length_in = harq_dma_length_in * 6 / 8;
-	} else {
-		fcw->hcin_size0 = harq_in_length;
-	}
-	harq_layout[harq_index].val = 0;
-	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
-			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
-	fcw->hcout_size0 = harq_in_length;
-	fcw->hcin_decomp_mode = h_comp;
-	fcw->hcout_comp_mode = h_comp;
-	fcw->gain_i = 1;
-	fcw->gain_h = 1;
+	c = op->turbo_enc.tb_params.c;
+	r = op->turbo_enc.tb_params.r;
 
-	/* Set the prefix of descriptor. This could be done at polling */
+	while (mbuf_total_left > 0 && r < c) {
+		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+		/* Set up DMA descriptor */
+		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+				& q->sw_ring_wrap_mask);
+		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+		desc->req.data_ptrs[0].blen = ACC100_FCW_TE_BLEN;
+
+		ret = acc100_dma_desc_te_fill(op, &desc->req, &input, output,
+				&in_offset, &out_offset, &out_length,
+				&mbuf_total_left, &seg_total_left, r);
+		if (unlikely(ret < 0))
+			return ret;
+		mbuf_append(output_head, output, out_length);
+
+		/* Set total number of CBs in TB */
+		desc->req.cbs_in_tb = cbs_in_tb;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "FCW", &desc->req.fcw_te,
+				sizeof(desc->req.fcw_te) - 8);
+		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+		if (seg_total_left == 0) {
+			/* Go to the next mbuf */
+			input = input->next;
+			in_offset = 0;
+			output = output->next;
+			out_offset = 0;
+		}
+
+		total_enqueued_cbs++;
+		current_enqueued_cbs++;
+		r++;
+	}
+
+	if (unlikely(desc == NULL))
+		return current_enqueued_cbs;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Check if any CBs left for processing */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left for processing: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+
+	/* Set SDone on last CB descriptor for TB mode. */
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	return current_enqueued_cbs;
+}
+
+/** Enqueue one decode operation for ACC100 device in CB mode */
+static inline int
+enqueue_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint32_t in_offset, h_out_offset, s_out_offset, s_out_length,
+		h_out_length, mbuf_total_left, seg_total_left;
+	struct rte_mbuf *input, *h_output_head, *h_output,
+		*s_output_head, *s_output;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_td_fill(op, &desc->req.fcw_td);
+
+	input = op->turbo_dec.input.data;
+	h_output_head = h_output = op->turbo_dec.hard_output.data;
+	s_output_head = s_output = op->turbo_dec.soft_output.data;
+	in_offset = op->turbo_dec.input.offset;
+	h_out_offset = op->turbo_dec.hard_output.offset;
+	s_out_offset = op->turbo_dec.soft_output.offset;
+	h_out_length = s_out_length = 0;
+	mbuf_total_left = op->turbo_dec.input.length;
+	seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(input == NULL)) {
+		rte_bbdev_log(ERR, "Invalid mbuf pointer");
+		return -EFAULT;
+	}
+#endif
+
+	/* Set up DMA descriptor */
+	desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+
+	ret = acc100_dma_desc_td_fill(op, &desc->req, &input, h_output,
+			s_output, &in_offset, &h_out_offset, &s_out_offset,
+			&h_out_length, &s_out_length, &mbuf_total_left,
+			&seg_total_left, 0);
+
+	if (unlikely(ret < 0))
+		return ret;
+
+	/* Hard output */
+	mbuf_append(h_output_head, h_output, h_out_length);
+
+	/* Soft output */
+	if (check_bit(op->turbo_dec.op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT))
+		mbuf_append(s_output_head, s_output, s_out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_td,
+			sizeof(desc->req.fcw_td) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+
+	/* Check if any CBs left for processing */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left after processing one CB: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
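[Editorial note] The descriptor indexing used throughout these enqueue paths relies on the sw ring depth being a power of two, so the wrap is a simple mask. A hypothetical demo, assuming `sw_ring_wrap_mask == depth - 1`:

```c
#include <stdint.h>

/* Ring slot for the Nth enqueued CB; depth must be a power of two */
static uint16_t ring_desc_idx(uint16_t head, uint16_t enqueued, uint16_t depth)
{
	return (uint16_t)((head + enqueued) & (depth - 1));
}
```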
+static inline int
+harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs) {
+	struct acc100_fcw_ld *fcw;
+	union acc100_dma_desc *desc;
+	int next_triplet = 1;
+	struct rte_mbuf *hq_output_head, *hq_output;
+	uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
+	if (harq_in_length == 0) {
+		rte_bbdev_log(ERR, "Loopback of invalid null size\n");
+		return -EINVAL;
+	}
+
+	int h_comp = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
+			) ? 1 : 0;
+	if (h_comp == 1)
+		harq_in_length = harq_in_length * 8 / 6;
+	harq_in_length = RTE_ALIGN(harq_in_length, 64);
+	uint16_t harq_dma_length_in = (h_comp == 0) ?
+			harq_in_length :
+			harq_in_length * 6 / 8;
+	uint16_t harq_dma_length_out = harq_dma_length_in;
+	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+	uint16_t harq_index = (ddr_mem_in ?
+			op->ldpc_dec.harq_combined_input.offset :
+			op->ldpc_dec.harq_combined_output.offset)
+			/ ACC100_HARQ_OFFSET;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	fcw = &desc->req.fcw_ld;
+	/* Set the FCW from loopback into DDR */
+	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
+	fcw->FCWversion = ACC100_FCW_VER;
+	fcw->qm = 2;
+	fcw->Zc = 384;
+	if (harq_in_length < 16 * N_ZC_1)
+		fcw->Zc = 16;
+	fcw->ncb = fcw->Zc * N_ZC_1;
+	fcw->rm_e = 2;
+	fcw->hcin_en = 1;
+	fcw->hcout_en = 1;
+
+	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d\n",
+			ddr_mem_in, harq_index,
+			harq_layout[harq_index].offset, harq_in_length,
+			harq_dma_length_in);
+
+	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
+		fcw->hcin_size0 = harq_layout[harq_index].size0;
+		fcw->hcin_offset = harq_layout[harq_index].offset;
+		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
+		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
+		if (h_comp == 1)
+			harq_dma_length_in = harq_dma_length_in * 6 / 8;
+	} else {
+		fcw->hcin_size0 = harq_in_length;
+	}
+	harq_layout[harq_index].val = 0;
+	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
+			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
+	fcw->hcout_size0 = harq_in_length;
+	fcw->hcin_decomp_mode = h_comp;
+	fcw->hcout_comp_mode = h_comp;
+	fcw->gain_i = 1;
+	fcw->gain_h = 1;
+
+	/* Set the prefix of descriptor. This could be done at polling */
 	desc->req.word0 = ACC100_DMA_DESC_TYPE;
 	desc->req.word1 = 0; /**< Timestamp could be disabled */
 	desc->req.word2 = 0;
@@ -1816,6 +2320,107 @@
 	return current_enqueued_cbs;
 }
 
+/* Enqueue one decode operation for ACC100 device in TB mode */
+static inline int
+enqueue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint8_t r, c;
+	uint32_t in_offset, h_out_offset, s_out_offset, s_out_length,
+		h_out_length, mbuf_total_left, seg_total_left;
+	struct rte_mbuf *input, *h_output_head, *h_output,
+		*s_output_head, *s_output;
+	uint16_t current_enqueued_cbs = 0;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+	acc100_fcw_td_fill(op, &desc->req.fcw_td);
+
+	input = op->turbo_dec.input.data;
+	h_output_head = h_output = op->turbo_dec.hard_output.data;
+	s_output_head = s_output = op->turbo_dec.soft_output.data;
+	in_offset = op->turbo_dec.input.offset;
+	h_out_offset = op->turbo_dec.hard_output.offset;
+	s_out_offset = op->turbo_dec.soft_output.offset;
+	h_out_length = s_out_length = 0;
+	mbuf_total_left = op->turbo_dec.input.length;
+	c = op->turbo_dec.tb_params.c;
+	r = op->turbo_dec.tb_params.r;
+
+	while (mbuf_total_left > 0 && r < c) {
+
+		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+
+		/* Set up DMA descriptor */
+		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+				& q->sw_ring_wrap_mask);
+		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+		desc->req.data_ptrs[0].blen = ACC100_FCW_TD_BLEN;
+		ret = acc100_dma_desc_td_fill(op, &desc->req, &input,
+				h_output, s_output, &in_offset, &h_out_offset,
+				&s_out_offset, &h_out_length, &s_out_length,
+				&mbuf_total_left, &seg_total_left, r);
+
+		if (unlikely(ret < 0))
+			return ret;
+
+		/* Hard output */
+		mbuf_append(h_output_head, h_output, h_out_length);
+
+		/* Soft output */
+		if (check_bit(op->turbo_dec.op_flags,
+				RTE_BBDEV_TURBO_SOFT_OUTPUT))
+			mbuf_append(s_output_head, s_output, s_out_length);
+
+		/* Set total number of CBs in TB */
+		desc->req.cbs_in_tb = cbs_in_tb;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "FCW", &desc->req.fcw_td,
+				sizeof(desc->req.fcw_td) - 8);
+		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+		if (seg_total_left == 0) {
+			/* Go to the next mbuf */
+			input = input->next;
+			in_offset = 0;
+			h_output = h_output->next;
+			h_out_offset = 0;
+
+			if (check_bit(op->turbo_dec.op_flags,
+					RTE_BBDEV_TURBO_SOFT_OUTPUT)) {
+				s_output = s_output->next;
+				s_out_offset = 0;
+			}
+		}
+
+		total_enqueued_cbs++;
+		current_enqueued_cbs++;
+		r++;
+	}
+
+	if (unlikely(desc == NULL))
+		return current_enqueued_cbs;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Check if any CBs left for processing */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left for processing: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+	/* Set SDone on last CB descriptor for TB mode */
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	return current_enqueued_cbs;
+}
 
 /* Calculates number of CBs in processed encoder TB based on 'r' and input
  * length.
@@ -1893,6 +2498,45 @@
 	return cbs_in_tb;
 }
 
+/* Enqueue encode operations for ACC100 device in CB mode. */
+static uint16_t
+acc100_enqueue_enc_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i;
+	union acc100_dma_desc *desc;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - 1 < 0))
+			break;
+		avail -= 1;
+
+		ret = enqueue_enc_one_op_cb(q, ops[i], i);
+		if (ret < 0)
+			break;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
+			& q->sw_ring_wrap_mask);
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, i, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+	return i;
+}
+
 /* Check we can mux encode operations with common FCW */
 static inline bool
 check_mux(struct rte_bbdev_enc_op **ops, uint16_t num) {
@@ -1960,6 +2604,52 @@
 	return i;
 }
 
+/* Enqueue encode operations for ACC100 device in TB mode. */
+static uint16_t
+acc100_enqueue_enc_tb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i, enqueued_cbs = 0;
+	uint8_t cbs_in_tb;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		cbs_in_tb = get_num_cbs_in_tb_enc(&ops[i]->turbo_enc);
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - cbs_in_tb < 0))
+			break;
+		avail -= cbs_in_tb;
+
+		ret = enqueue_enc_one_op_tb(q, ops[i], enqueued_cbs, cbs_in_tb);
+		if (ret < 0)
+			break;
+		enqueued_cbs += ret;
+	}
+
+	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
+}
+
+/* Enqueue encode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	if (unlikely(num == 0))
+		return 0;
+	if (ops[0]->turbo_enc.code_block_mode == 0)
+		return acc100_enqueue_enc_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_enc_cb(q_data, ops, num);
+}
+
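[Editorial note] The `avail` bookkeeping in the bulk enqueue functions above computes the free slots as depth + tail - head, kept signed so a transient negative value simply fails the per-op space check. An illustrative sketch, not part of the patch:

```c
#include <stdint.h>

/* Free descriptors left in the sw ring, as computed in the enqueue
 * functions above; signed so the caller can test "avail - n < 0".
 */
static int32_t ring_avail(uint16_t depth, uint16_t tail, uint16_t head)
{
	return (int32_t)depth + (int32_t)tail - (int32_t)head;
}
```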
 /* Enqueue encode operations for ACC100 device. */
 static uint16_t
 acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
@@ -1967,7 +2657,51 @@
 {
 	if (unlikely(num == 0))
 		return 0;
-	return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
+	if (ops[0]->ldpc_enc.code_block_mode == 0)
+		return acc100_enqueue_enc_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
+}
+
+
+/* Enqueue decode operations for ACC100 device in CB mode */
+static uint16_t
+acc100_enqueue_dec_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i;
+	union acc100_dma_desc *desc;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		/* Check if there is enough free space for further processing */
+		if (unlikely(avail - 1 < 0))
+			break;
+		avail -= 1;
+
+		ret = enqueue_dec_one_op_cb(q, ops[i], i);
+		if (ret < 0)
+			break;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
+			& q->sw_ring_wrap_mask);
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, i, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
 }
 
 /* Check we can mux encode operations with common FCW */
@@ -2065,6 +2799,53 @@
 	return i;
 }
 
+
+/* Enqueue decode operations for ACC100 device in TB mode */
+static uint16_t
+acc100_enqueue_dec_tb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i, enqueued_cbs = 0;
+	uint8_t cbs_in_tb;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		cbs_in_tb = get_num_cbs_in_tb_dec(&ops[i]->turbo_dec);
+		/* Check if there is enough free space for further processing */
+		if (unlikely(avail - cbs_in_tb < 0))
+			break;
+		avail -= cbs_in_tb;
+
+		ret = enqueue_dec_one_op_tb(q, ops[i], enqueued_cbs, cbs_in_tb);
+		if (ret < 0)
+			break;
+		enqueued_cbs += ret;
+	}
+
+	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
+}
+
+/* Enqueue decode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	if (unlikely(num == 0))
+		return 0;
+	if (ops[0]->turbo_dec.code_block_mode == 0)
+		return acc100_enqueue_dec_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_dec_cb(q_data, ops, num);
+}
+
 /* Enqueue decode operations for ACC100 device. */
 static uint16_t
 acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
@@ -2388,6 +3169,51 @@
 	return cb_idx;
 }
 
+/* Dequeue encode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint16_t dequeue_num;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t i;
+	uint16_t dequeued_cbs = 0;
+	struct rte_bbdev_enc_op *op;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = (avail < num) ? avail : num;
+
+	for (i = 0; i < dequeue_num; ++i) {
+		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask))->req.op_addr;
+		if (op->turbo_enc.code_block_mode == 0)
+			ret = dequeue_enc_one_op_tb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+		else
+			ret = dequeue_enc_one_op_cb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_cbs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += i;
+
+	return i;
+}
+
 /* Dequeue LDPC encode operations from ACC100 device. */
 static uint16_t
 acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
@@ -2426,6 +3252,52 @@
 	return dequeued_cbs;
 }
 
+
+/* Dequeue decode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint16_t dequeue_num;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t i;
+	uint16_t dequeued_cbs = 0;
+	struct rte_bbdev_dec_op *op;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = (avail < num) ? avail : num;
+
+	for (i = 0; i < dequeue_num; ++i) {
+		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask))->req.op_addr;
+		if (op->turbo_dec.code_block_mode == 0)
+			ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+		else
+			ret = dequeue_dec_one_op_cb(q_data, q, &ops[i],
+					dequeued_cbs, &aq_dequeued);
+
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_cbs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += i;
+
+	return i;
+}
+
 /* Dequeue decode operations from ACC100 device. */
 static uint16_t
 acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
@@ -2479,6 +3351,10 @@
 	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
 
 	dev->dev_ops = &acc100_bbdev_ops;
+	dev->enqueue_enc_ops = acc100_enqueue_enc;
+	dev->enqueue_dec_ops = acc100_enqueue_dec;
+	dev->dequeue_enc_ops = acc100_dequeue_enc;
+	dev->dequeue_dec_ops = acc100_dequeue_dec;
 	dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
 	dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
 	dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v9 08/10] baseband/acc100: add interrupt support to PMD
  2020-09-29  0:29   ` [dpdk-dev] [PATCH v9 00/10] bbdev PMD ACC100 Nicolas Chautru
                       ` (6 preceding siblings ...)
  2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 07/10] baseband/acc100: add support for 4G processing Nicolas Chautru
@ 2020-09-29  0:29     ` Nicolas Chautru
  2020-09-30 19:03       ` Tom Rix
  2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 09/10] baseband/acc100: add debug function to validate input Nicolas Chautru
  2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 10/10] baseband/acc100: add configure function Nicolas Chautru
  9 siblings, 1 reply; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-29  0:29 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add capability and functions to support MSI
interrupts, callbacks and the Info Ring.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 288 ++++++++++++++++++++++++++++++-
 drivers/baseband/acc100/rte_acc100_pmd.h |  15 ++
 2 files changed, 300 insertions(+), 3 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 7d4c3df..b6d9e7c 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -339,6 +339,213 @@
 	free_base_addresses(base_addrs, i);
 }
 
+/*
+ * Find queue_id of a device queue based on details from the Info Ring.
+ * If a queue isn't found, UINT16_MAX is returned.
+ */
+static inline uint16_t
+get_queue_id_from_ring_info(struct rte_bbdev_data *data,
+		const union acc100_info_ring_data ring_data)
+{
+	uint16_t queue_id;
+
+	for (queue_id = 0; queue_id < data->num_queues; ++queue_id) {
+		struct acc100_queue *acc100_q =
+				data->queues[queue_id].queue_private;
+		if (acc100_q != NULL && acc100_q->aq_id == ring_data.aq_id &&
+				acc100_q->qgrp_id == ring_data.qg_id &&
+				acc100_q->vf_id == ring_data.vf_id)
+			return queue_id;
+	}
+
+	return UINT16_MAX;
+}
+
+/* Checks PF Info Ring to find the interrupt cause and handles it accordingly */
+static inline void
+acc100_check_ir(struct acc100_device *acc100_dev)
+{
+	volatile union acc100_info_ring_data *ring_data;
+	uint16_t info_ring_head = acc100_dev->info_ring_head;
+
+	if (acc100_dev->info_ring == NULL)
+		return;
+
+	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
+			ACC100_INFO_RING_MASK);
+
+	while (ring_data->valid) {
+		if ((ring_data->int_nb < ACC100_PF_INT_DMA_DL_DESC_IRQ) || (
+				ring_data->int_nb >
+				ACC100_PF_INT_DMA_DL5G_DESC_IRQ))
+			rte_bbdev_log(WARNING, "InfoRing: ITR:%d Info:0x%x",
+				ring_data->int_nb, ring_data->detailed_info);
+		/* Initialize Info Ring entry and move forward */
+		ring_data->val = 0;
+		info_ring_head++;
+		ring_data = acc100_dev->info_ring +
+				(info_ring_head & ACC100_INFO_RING_MASK);
+	}
+}
+
+/* Checks PF Info Ring to find the interrupt cause and handles it accordingly */
+static inline void
+acc100_pf_interrupt_handler(struct rte_bbdev *dev)
+{
+	struct acc100_device *acc100_dev = dev->data->dev_private;
+	volatile union acc100_info_ring_data *ring_data;
+	struct acc100_deq_intr_details deq_intr_det;
+
+	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
+			ACC100_INFO_RING_MASK);
+
+	while (ring_data->valid) {
+
+		rte_bbdev_log_debug(
+				"ACC100 PF Interrupt received, Info Ring data: 0x%x",
+				ring_data->val);
+
+		switch (ring_data->int_nb) {
+		case ACC100_PF_INT_DMA_DL_DESC_IRQ:
+		case ACC100_PF_INT_DMA_UL_DESC_IRQ:
+		case ACC100_PF_INT_DMA_UL5G_DESC_IRQ:
+		case ACC100_PF_INT_DMA_DL5G_DESC_IRQ:
+			deq_intr_det.queue_id = get_queue_id_from_ring_info(
+					dev->data, *ring_data);
+			if (deq_intr_det.queue_id == UINT16_MAX) {
+				rte_bbdev_log(ERR,
+						"Couldn't find queue: aq_id: %u, qg_id: %u, vf_id: %u",
+						ring_data->aq_id,
+						ring_data->qg_id,
+						ring_data->vf_id);
+				return;
+			}
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_DEQUEUE, &deq_intr_det);
+			break;
+		default:
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_ERROR, NULL);
+			break;
+		}
+
+		/* Initialize Info Ring entry and move forward */
+		ring_data->val = 0;
+		++acc100_dev->info_ring_head;
+		ring_data = acc100_dev->info_ring +
+				(acc100_dev->info_ring_head &
+				ACC100_INFO_RING_MASK);
+	}
+}
+
+/* Checks VF Info Ring to find the interrupt cause and handles it accordingly */
+static inline void
+acc100_vf_interrupt_handler(struct rte_bbdev *dev)
+{
+	struct acc100_device *acc100_dev = dev->data->dev_private;
+	volatile union acc100_info_ring_data *ring_data;
+	struct acc100_deq_intr_details deq_intr_det;
+
+	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
+			ACC100_INFO_RING_MASK);
+
+	while (ring_data->valid) {
+
+		rte_bbdev_log_debug(
+				"ACC100 VF Interrupt received, Info Ring data: 0x%x",
+				ring_data->val);
+
+		switch (ring_data->int_nb) {
+		case ACC100_VF_INT_DMA_DL_DESC_IRQ:
+		case ACC100_VF_INT_DMA_UL_DESC_IRQ:
+		case ACC100_VF_INT_DMA_UL5G_DESC_IRQ:
+		case ACC100_VF_INT_DMA_DL5G_DESC_IRQ:
+			/* VFs are not aware of their vf_id - it's set to 0 in
+			 * queue structures.
+			 */
+			ring_data->vf_id = 0;
+			deq_intr_det.queue_id = get_queue_id_from_ring_info(
+					dev->data, *ring_data);
+			if (deq_intr_det.queue_id == UINT16_MAX) {
+				rte_bbdev_log(ERR,
+						"Couldn't find queue: aq_id: %u, qg_id: %u",
+						ring_data->aq_id,
+						ring_data->qg_id);
+				return;
+			}
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_DEQUEUE, &deq_intr_det);
+			break;
+		default:
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_ERROR, NULL);
+			break;
+		}
+
+		/* Initialize Info Ring entry and move forward */
+		ring_data->valid = 0;
+		++acc100_dev->info_ring_head;
+		ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head
+				& ACC100_INFO_RING_MASK);
+	}
+}
+
+/* Interrupt handler triggered by ACC100 dev for handling specific interrupt */
+static void
+acc100_dev_interrupt_handler(void *cb_arg)
+{
+	struct rte_bbdev *dev = cb_arg;
+	struct acc100_device *acc100_dev = dev->data->dev_private;
+
+	/* Read info ring */
+	if (acc100_dev->pf_device)
+		acc100_pf_interrupt_handler(dev);
+	else
+		acc100_vf_interrupt_handler(dev);
+}
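The PF and VF handlers above share the same Info Ring walk: consume entries while `valid` is set, zero each one to hand it back to hardware, and advance a free-running head that is masked into the ring. A stripped-down, self-contained model of that loop (entry layout and ring size here are illustrative, not the device's):

```c
#include <assert.h>
#include <stdint.h>

#define IR_ENTRIES 64                 /* illustrative ring size */
#define IR_MASK (IR_ENTRIES - 1)

/* Simplified stand-in for union acc100_info_ring_data. */
struct ir_entry {
	uint32_t valid;
	uint32_t payload;
};

/* Consume valid entries, clearing each and advancing the head;
 * returns how many entries were handled. */
static unsigned int
consume_info_ring(struct ir_entry *ring, uint16_t *head)
{
	unsigned int handled = 0;
	struct ir_entry *e = &ring[*head & IR_MASK];

	while (e->valid) {
		e->valid = 0;      /* return entry to hardware */
		e->payload = 0;
		(*head)++;
		handled++;
		e = &ring[*head & IR_MASK];
	}
	return handled;
}
```

Because the head is never reduced modulo the ring size, the same masking works whether or not the walk crosses the wrap point.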
+
+/* Allocate and set up the Info Ring */
+static int
+allocate_inforing(struct rte_bbdev *dev)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	const struct acc100_registry_addr *reg_addr;
+	rte_iova_t info_ring_phys;
+	uint32_t phys_low, phys_high;
+
+	if (d->info_ring != NULL)
+		return 0; /* Already configured */
+
+	/* Choose correct registry addresses for the device type */
+	if (d->pf_device)
+		reg_addr = &pf_reg_addr;
+	else
+		reg_addr = &vf_reg_addr;
+	/* Allocate InfoRing */
+	d->info_ring = rte_zmalloc_socket("Info Ring",
+			ACC100_INFO_RING_NUM_ENTRIES *
+			sizeof(*d->info_ring), RTE_CACHE_LINE_SIZE,
+			dev->data->socket_id);
+	if (d->info_ring == NULL) {
+		rte_bbdev_log(ERR,
+				"Failed to allocate Info Ring for %s:%u",
+				dev->device->driver->name,
+				dev->data->dev_id);
+		return -ENOMEM;
+	}
+	info_ring_phys = rte_malloc_virt2iova(d->info_ring);
+
+	/* Setup Info Ring */
+	phys_high = (uint32_t)(info_ring_phys >> 32);
+	phys_low  = (uint32_t)(info_ring_phys);
+	acc100_reg_write(d, reg_addr->info_ring_hi, phys_high);
+	acc100_reg_write(d, reg_addr->info_ring_lo, phys_low);
+	acc100_reg_write(d, reg_addr->info_ring_en, ACC100_REG_IRQ_EN_ALL);
+	d->info_ring_head = (acc100_reg_read(d, reg_addr->info_ring_ptr) &
+			0xFFF) / sizeof(union acc100_info_ring_data);
+	return 0;
+}
+
 /* Allocate 64MB memory used for all software rings */
 static int
 acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
@@ -426,6 +633,7 @@
 	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
 	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
 
+	allocate_inforing(dev);
 	d->harq_layout = rte_zmalloc_socket("HARQ Layout",
 			ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
 			RTE_CACHE_LINE_SIZE, dev->data->socket_id);
@@ -437,13 +645,53 @@
 	return 0;
 }
 
+static int
+acc100_intr_enable(struct rte_bbdev *dev)
+{
+	int ret;
+	struct acc100_device *d = dev->data->dev_private;
+
+	/* Only MSI/UIO interrupts are currently supported */
+	if (dev->intr_handle->type == RTE_INTR_HANDLE_VFIO_MSI ||
+			dev->intr_handle->type == RTE_INTR_HANDLE_UIO) {
+
+		allocate_inforing(dev);
+
+		ret = rte_intr_enable(dev->intr_handle);
+		if (ret < 0) {
+			rte_bbdev_log(ERR,
+					"Couldn't enable interrupts for device: %s",
+					dev->data->name);
+			rte_free(d->info_ring);
+			return ret;
+		}
+		ret = rte_intr_callback_register(dev->intr_handle,
+				acc100_dev_interrupt_handler, dev);
+		if (ret < 0) {
+			rte_bbdev_log(ERR,
+					"Couldn't register interrupt callback for device: %s",
+					dev->data->name);
+			rte_free(d->info_ring);
+			return ret;
+		}
+
+		return 0;
+	}
+
+	rte_bbdev_log(ERR, "ACC100 (%s) supports only VFIO MSI/UIO interrupts",
+			dev->data->name);
+	return -ENOTSUP;
+}
+
 /* Free 64MB memory used for software rings */
 static int
 acc100_dev_close(struct rte_bbdev *dev)
 {
 	struct acc100_device *d = dev->data->dev_private;
+	acc100_check_ir(d);
 	if (d->sw_rings_base != NULL) {
 		rte_free(d->tail_ptrs);
+		rte_free(d->info_ring);
 		rte_free(d->sw_rings_base);
 		d->sw_rings_base = NULL;
 	}
@@ -643,6 +891,7 @@
 					RTE_BBDEV_TURBO_CRC_TYPE_24B |
 					RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
 					RTE_BBDEV_TURBO_EARLY_TERMINATION |
+					RTE_BBDEV_TURBO_DEC_INTERRUPTS |
 					RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
 					RTE_BBDEV_TURBO_MAP_DEC |
 					RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP |
@@ -663,6 +912,7 @@
 					RTE_BBDEV_TURBO_CRC_24B_ATTACH |
 					RTE_BBDEV_TURBO_RV_INDEX_BYPASS |
 					RTE_BBDEV_TURBO_RATE_MATCH |
+					RTE_BBDEV_TURBO_ENC_INTERRUPTS |
 					RTE_BBDEV_TURBO_ENC_SCATTER_GATHER,
 				.num_buffers_src =
 						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
@@ -676,7 +926,8 @@
 				.capability_flags =
 					RTE_BBDEV_LDPC_RATE_MATCH |
 					RTE_BBDEV_LDPC_CRC_24B_ATTACH |
-					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
+					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS |
+					RTE_BBDEV_LDPC_ENC_INTERRUPTS,
 				.num_buffers_src =
 						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
 				.num_buffers_dst =
@@ -701,7 +952,8 @@
 				RTE_BBDEV_LDPC_DECODE_BYPASS |
 				RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
 				RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
-				RTE_BBDEV_LDPC_LLR_COMPRESSION,
+				RTE_BBDEV_LDPC_LLR_COMPRESSION |
+				RTE_BBDEV_LDPC_DEC_INTERRUPTS,
 			.llr_size = 8,
 			.llr_decimals = 1,
 			.num_buffers_src =
@@ -751,14 +1003,39 @@
 #else
 	dev_info->harq_buffer_size = 0;
 #endif
+	acc100_check_ir(d);
+}
+
+static int
+acc100_queue_intr_enable(struct rte_bbdev *dev, uint16_t queue_id)
+{
+	struct acc100_queue *q = dev->data->queues[queue_id].queue_private;
+
+	if (dev->intr_handle->type != RTE_INTR_HANDLE_VFIO_MSI &&
+			dev->intr_handle->type != RTE_INTR_HANDLE_UIO)
+		return -ENOTSUP;
+
+	q->irq_enable = 1;
+	return 0;
+}
+
+static int
+acc100_queue_intr_disable(struct rte_bbdev *dev, uint16_t queue_id)
+{
+	struct acc100_queue *q = dev->data->queues[queue_id].queue_private;
+	q->irq_enable = 0;
+	return 0;
 }
 
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.setup_queues = acc100_setup_queues,
+	.intr_enable = acc100_intr_enable,
 	.close = acc100_dev_close,
 	.info_get = acc100_dev_info_get,
 	.queue_setup = acc100_queue_setup,
 	.queue_release = acc100_queue_release,
+	.queue_intr_enable = acc100_queue_intr_enable,
+	.queue_intr_disable = acc100_queue_intr_disable
 };
 
 /* ACC100 PCI PF address map */
@@ -3018,8 +3295,10 @@
 			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
 	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
 	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
-	if (op->status != 0)
+	if (op->status != 0) {
 		q_data->queue_stats.dequeue_err_count++;
+		acc100_check_ir(q->d);
+	}
 
 	/* CRC invalid if error exists */
 	if (!op->status)
@@ -3076,6 +3355,9 @@
 		op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
 	op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
 
+	if (op->status & (1 << RTE_BBDEV_DRV_ERROR))
+		acc100_check_ir(q->d);
+
 	/* Check if this is the last desc in batch (Atomic Queue) */
 	if (desc->req.last_desc_in_batch) {
 		(*aq_dequeued)++;
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 78686c1..8980fa5 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -559,7 +559,14 @@ struct acc100_device {
 	/* Virtual address of the info memory routed to this function under
 	 * operation, whether it is PF or VF.
 	 */
+	union acc100_info_ring_data *info_ring;
+
 	union acc100_harq_layout_data *harq_layout;
+	/* Virtual Info Ring head */
+	uint16_t info_ring_head;
+	/* Number of bytes available for each queue in device, depending on
+	 * how many queues are enabled with configure()
+	 */
 	uint32_t sw_ring_size;
 	uint32_t ddr_size; /* Size in kB */
 	uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */
@@ -575,4 +582,12 @@ struct acc100_device {
 	bool configured; /**< True if this ACC100 device is configured */
 };
 
+/**
+ * Structure with details about RTE_BBDEV_EVENT_DEQUEUE event. It's passed to
+ * the callback function.
+ */
+struct acc100_deq_intr_details {
+	uint16_t queue_id;
+};
+
 #endif /* _RTE_ACC100_PMD_H_ */
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v9 09/10] baseband/acc100: add debug function to validate input
  2020-09-29  0:29   ` [dpdk-dev] [PATCH v9 00/10] bbdev PMD ACC100 Nicolas Chautru
                       ` (7 preceding siblings ...)
  2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 08/10] baseband/acc100: add interrupt support to PMD Nicolas Chautru
@ 2020-09-29  0:29     ` Nicolas Chautru
  2020-09-30 19:16       ` Tom Rix
  2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 10/10] baseband/acc100: add configure function Nicolas Chautru
  9 siblings, 1 reply; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-29  0:29 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add debug functions to validate the input API from the user.
These are only enabled in DEBUG mode at build time.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 424 +++++++++++++++++++++++++++++++
 1 file changed, 424 insertions(+)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index b6d9e7c..3589814 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -1945,6 +1945,231 @@
 
 }
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+/* Validates turbo encoder parameters */
+static inline int
+validate_enc_op(struct rte_bbdev_enc_op *op)
+{
+	struct rte_bbdev_op_turbo_enc *turbo_enc = &op->turbo_enc;
+	struct rte_bbdev_op_enc_turbo_cb_params *cb = NULL;
+	struct rte_bbdev_op_enc_turbo_tb_params *tb = NULL;
+	uint16_t kw, kw_neg, kw_pos;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if (turbo_enc->input.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid input pointer");
+		return -1;
+	}
+	if (turbo_enc->output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid output pointer");
+		return -1;
+	}
+	if (turbo_enc->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				turbo_enc->rv_index);
+		return -1;
+	}
+	if (turbo_enc->code_block_mode != 0 &&
+			turbo_enc->code_block_mode != 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				turbo_enc->code_block_mode);
+		return -1;
+	}
+
+	if (turbo_enc->code_block_mode == 0) {
+		tb = &turbo_enc->tb_params;
+		if ((tb->k_neg < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_neg > RTE_BBDEV_TURBO_MAX_CB_SIZE)
+				&& tb->c_neg > 0) {
+			rte_bbdev_log(ERR,
+					"k_neg (%u) is out of range %u <= value <= %u",
+					tb->k_neg, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (tb->k_pos < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_pos > RTE_BBDEV_TURBO_MAX_CB_SIZE) {
+			rte_bbdev_log(ERR,
+					"k_pos (%u) is out of range %u <= value <= %u",
+					tb->k_pos, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (tb->c_neg > (RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1))
+			rte_bbdev_log(ERR,
+					"c_neg (%u) is out of range 0 <= value <= %u",
+					tb->c_neg,
+					RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1);
+		if (tb->c < 1 || tb->c > RTE_BBDEV_TURBO_MAX_CODE_BLOCKS) {
+			rte_bbdev_log(ERR,
+					"c (%u) is out of range 1 <= value <= %u",
+					tb->c, RTE_BBDEV_TURBO_MAX_CODE_BLOCKS);
+			return -1;
+		}
+		if (tb->cab > tb->c) {
+			rte_bbdev_log(ERR,
+					"cab (%u) is greater than c (%u)",
+					tb->cab, tb->c);
+			return -1;
+		}
+		if ((tb->ea < RTE_BBDEV_TURBO_MIN_CB_SIZE || (tb->ea % 2))
+				&& tb->r < tb->cab) {
+			rte_bbdev_log(ERR,
+					"ea (%u) is less than %u or it is not even",
+					tb->ea, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+		if ((tb->eb < RTE_BBDEV_TURBO_MIN_CB_SIZE || (tb->eb % 2))
+				&& tb->c > tb->cab) {
+			rte_bbdev_log(ERR,
+					"eb (%u) is less than %u or it is not even",
+					tb->eb, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+
+		kw_neg = 3 * RTE_ALIGN_CEIL(tb->k_neg + 4,
+					RTE_BBDEV_TURBO_C_SUBBLOCK);
+		if (tb->ncb_neg < tb->k_neg || tb->ncb_neg > kw_neg) {
+			rte_bbdev_log(ERR,
+					"ncb_neg (%u) is out of range (%u) k_neg <= value <= (%u) kw_neg",
+					tb->ncb_neg, tb->k_neg, kw_neg);
+			return -1;
+		}
+
+		kw_pos = 3 * RTE_ALIGN_CEIL(tb->k_pos + 4,
+					RTE_BBDEV_TURBO_C_SUBBLOCK);
+		if (tb->ncb_pos < tb->k_pos || tb->ncb_pos > kw_pos) {
+			rte_bbdev_log(ERR,
+					"ncb_pos (%u) is out of range (%u) k_pos <= value <= (%u) kw_pos",
+					tb->ncb_pos, tb->k_pos, kw_pos);
+			return -1;
+		}
+		if (tb->r > (tb->c - 1)) {
+			rte_bbdev_log(ERR,
+					"r (%u) is greater than c - 1 (%u)",
+					tb->r, tb->c - 1);
+			return -1;
+		}
+	} else {
+		cb = &turbo_enc->cb_params;
+		if (cb->k < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| cb->k > RTE_BBDEV_TURBO_MAX_CB_SIZE) {
+			rte_bbdev_log(ERR,
+					"k (%u) is out of range %u <= value <= %u",
+					cb->k, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+
+		if (cb->e < RTE_BBDEV_TURBO_MIN_CB_SIZE || (cb->e % 2)) {
+			rte_bbdev_log(ERR,
+					"e (%u) is less than %u or it is not even",
+					cb->e, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+
+		kw = RTE_ALIGN_CEIL(cb->k + 4, RTE_BBDEV_TURBO_C_SUBBLOCK) * 3;
+		if (cb->ncb < cb->k || cb->ncb > kw) {
+			rte_bbdev_log(ERR,
+					"ncb (%u) is out of range (%u) k <= value <= (%u) kw",
+					cb->ncb, cb->k, kw);
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
+/* Validates LDPC encoder parameters */
+static inline int
+validate_ldpc_enc_op(struct rte_bbdev_enc_op *op)
+{
+	struct rte_bbdev_op_ldpc_enc *ldpc_enc = &op->ldpc_enc;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if (ldpc_enc->input.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid input pointer");
+		return -1;
+	}
+	if (ldpc_enc->output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid output pointer");
+		return -1;
+	}
+	if (ldpc_enc->input.length >
+			RTE_BBDEV_LDPC_MAX_CB_SIZE >> 3) {
+		rte_bbdev_log(ERR, "CB size (%u) is too big, max: %d",
+				ldpc_enc->input.length,
+				RTE_BBDEV_LDPC_MAX_CB_SIZE);
+		return -1;
+	}
+	if ((ldpc_enc->basegraph > 2) || (ldpc_enc->basegraph == 0)) {
+		rte_bbdev_log(ERR,
+				"BG (%u) is out of range 1 <= value <= 2",
+				ldpc_enc->basegraph);
+		return -1;
+	}
+	if (ldpc_enc->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				ldpc_enc->rv_index);
+		return -1;
+	}
+	if (ldpc_enc->code_block_mode > 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				ldpc_enc->code_block_mode);
+		return -1;
+	}
+
+	return 0;
+}
+
+/* Validates LDPC decoder parameters */
+static inline int
+validate_ldpc_dec_op(struct rte_bbdev_dec_op *op)
+{
+	struct rte_bbdev_op_ldpc_dec *ldpc_dec = &op->ldpc_dec;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if ((ldpc_dec->basegraph > 2) || (ldpc_dec->basegraph == 0)) {
+		rte_bbdev_log(ERR,
+				"BG (%u) is out of range 1 <= value <= 2",
+				ldpc_dec->basegraph);
+		return -1;
+	}
+	if (ldpc_dec->iter_max == 0) {
+		rte_bbdev_log(ERR,
+				"iter_max (%u) is equal to 0",
+				ldpc_dec->iter_max);
+		return -1;
+	}
+	if (ldpc_dec->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				ldpc_dec->rv_index);
+		return -1;
+	}
+	if (ldpc_dec->code_block_mode > 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				ldpc_dec->code_block_mode);
+		return -1;
+	}
+
+	return 0;
+}
+#endif
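The validators above all follow the same pattern: check each field against its allowed range, log the first violation, and return -1 so the enqueue path can reject the op with -EINVAL. A self-contained reduction of the LDPC checks, useful for seeing the pattern without the bbdev types (struct and function names here are illustrative):

```c
#include <assert.h>
#include <stdint.h>

/* Minimal stand-in for the LDPC fields validated above. */
struct ldpc_params {
	uint8_t basegraph;        /* must be 1 or 2 */
	uint8_t rv_index;         /* redundancy version, 0..3 */
	uint8_t code_block_mode;  /* 0 = TB mode, 1 = CB mode */
};

/* Return 0 if all fields are in range, -1 on the first violation,
 * mirroring the driver's early-return style. */
static int validate_ldpc_params(const struct ldpc_params *p)
{
	if (p->basegraph == 0 || p->basegraph > 2)
		return -1;
	if (p->rv_index > 3)
		return -1;
	if (p->code_block_mode > 1)
		return -1;
	return 0;
}
```

Since the checks are compiled only under RTE_LIBRTE_BBDEV_DEBUG, release builds pay no cost for them on the fast path.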
+
 /* Enqueue one encode operations for ACC100 device in CB mode */
 static inline int
 enqueue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
@@ -1956,6 +2181,14 @@
 		seg_total_left;
 	struct rte_mbuf *input, *output_head, *output;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_enc_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2008,6 +2241,14 @@
 	uint16_t  in_length_in_bytes;
 	struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_enc_op(ops[0]) == -1) {
+		rte_bbdev_log(ERR, "LDPC encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2065,6 +2306,14 @@
 		seg_total_left;
 	struct rte_mbuf *input, *output_head, *output;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_enc_op(op) == -1) {
+		rte_bbdev_log(ERR, "LDPC encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2119,6 +2368,14 @@
 	struct rte_mbuf *input, *output_head, *output;
 	uint16_t current_enqueued_cbs = 0;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_enc_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2191,6 +2448,142 @@
 	return current_enqueued_cbs;
 }
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+/* Validates turbo decoder parameters */
+static inline int
+validate_dec_op(struct rte_bbdev_dec_op *op)
+{
+	struct rte_bbdev_op_turbo_dec *turbo_dec = &op->turbo_dec;
+	struct rte_bbdev_op_dec_turbo_cb_params *cb = NULL;
+	struct rte_bbdev_op_dec_turbo_tb_params *tb = NULL;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if (turbo_dec->input.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid input pointer");
+		return -1;
+	}
+	if (turbo_dec->hard_output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid hard_output pointer");
+		return -1;
+	}
+	if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT) &&
+			turbo_dec->soft_output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid soft_output pointer");
+		return -1;
+	}
+	if (turbo_dec->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				turbo_dec->rv_index);
+		return -1;
+	}
+	if (turbo_dec->iter_min < 1) {
+		rte_bbdev_log(ERR,
+				"iter_min (%u) is less than 1",
+				turbo_dec->iter_min);
+		return -1;
+	}
+	if (turbo_dec->iter_max <= 2) {
+		rte_bbdev_log(ERR,
+				"iter_max (%u) is less than or equal to 2",
+				turbo_dec->iter_max);
+		return -1;
+	}
+	if (turbo_dec->iter_min > turbo_dec->iter_max) {
+		rte_bbdev_log(ERR,
+				"iter_min (%u) is greater than iter_max (%u)",
+				turbo_dec->iter_min, turbo_dec->iter_max);
+		return -1;
+	}
+	if (turbo_dec->code_block_mode != 0 &&
+			turbo_dec->code_block_mode != 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				turbo_dec->code_block_mode);
+		return -1;
+	}
+
+	if (turbo_dec->code_block_mode == 0) {
+		tb = &turbo_dec->tb_params;
+		if ((tb->k_neg < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_neg > RTE_BBDEV_TURBO_MAX_CB_SIZE)
+				&& tb->c_neg > 0) {
+			rte_bbdev_log(ERR,
+					"k_neg (%u) is out of range %u <= value <= %u",
+					tb->k_neg, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if ((tb->k_pos < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_pos > RTE_BBDEV_TURBO_MAX_CB_SIZE)
+				&& tb->c > tb->c_neg) {
+			rte_bbdev_log(ERR,
+					"k_pos (%u) is out of range %u <= value <= %u",
+					tb->k_pos, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (tb->c_neg > (RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1))
+			rte_bbdev_log(ERR,
+					"c_neg (%u) is out of range 0 <= value <= %u",
+					tb->c_neg,
+					RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1);
+		if (tb->c < 1 || tb->c > RTE_BBDEV_TURBO_MAX_CODE_BLOCKS) {
+			rte_bbdev_log(ERR,
+					"c (%u) is out of range 1 <= value <= %u",
+					tb->c, RTE_BBDEV_TURBO_MAX_CODE_BLOCKS);
+			return -1;
+		}
+		if (tb->cab > tb->c) {
+			rte_bbdev_log(ERR,
+					"cab (%u) is greater than c (%u)",
+					tb->cab, tb->c);
+			return -1;
+		}
+		if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) &&
+				(tb->ea < RTE_BBDEV_TURBO_MIN_CB_SIZE
+						|| (tb->ea % 2))
+				&& tb->cab > 0) {
+			rte_bbdev_log(ERR,
+					"ea (%u) is less than %u or it is not even",
+					tb->ea, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+		if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) &&
+				(tb->eb < RTE_BBDEV_TURBO_MIN_CB_SIZE
+						|| (tb->eb % 2))
+				&& tb->c > tb->cab) {
+			rte_bbdev_log(ERR,
+					"eb (%u) is less than %u or it is not even",
+					tb->eb, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+	} else {
+		cb = &turbo_dec->cb_params;
+		if (cb->k < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| cb->k > RTE_BBDEV_TURBO_MAX_CB_SIZE) {
+			rte_bbdev_log(ERR,
+					"k (%u) is out of range %u <= value <= %u",
+					cb->k, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) &&
+				(cb->e < RTE_BBDEV_TURBO_MIN_CB_SIZE ||
+				(cb->e % 2))) {
+			rte_bbdev_log(ERR,
+					"e (%u) is less than %u or it is not even",
+					cb->e, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+	}
+
+	return 0;
+}
+#endif
+
 /** Enqueue one decode operations for ACC100 device in CB mode */
 static inline int
 enqueue_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
@@ -2203,6 +2596,14 @@
 	struct rte_mbuf *input, *h_output_head, *h_output,
 		*s_output_head, *s_output;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo decoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2426,6 +2827,13 @@
 		return ret;
 	}
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "LDPC decoder validation failed");
+		return -EINVAL;
+	}
+#endif
 	union acc100_dma_desc *desc;
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
@@ -2521,6 +2929,14 @@
 	struct rte_mbuf *input, *h_output_head, *h_output;
 	uint16_t current_enqueued_cbs = 0;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "LDPC decoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2611,6 +3027,14 @@
 		*s_output_head, *s_output;
 	uint16_t current_enqueued_cbs = 0;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo decoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v9 10/10] baseband/acc100: add configure function
  2020-09-29  0:29   ` [dpdk-dev] [PATCH v9 00/10] bbdev PMD ACC100 Nicolas Chautru
                       ` (8 preceding siblings ...)
  2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 09/10] baseband/acc100: add debug function to validate input Nicolas Chautru
@ 2020-09-29  0:29     ` Nicolas Chautru
  2020-09-30 19:58       ` Tom Rix
  9 siblings, 1 reply; 213+ messages in thread
From: Nicolas Chautru @ 2020-09-29  0:29 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu, Nicolas Chautru

Add a configure function to configure the PF from within
bbdev-test itself, without requiring an external application
to configure the device.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 app/test-bbdev/test_bbdev_perf.c                   |  72 +++
 doc/guides/rel_notes/release_20_11.rst             |   5 +
 drivers/baseband/acc100/meson.build                |   2 +
 drivers/baseband/acc100/rte_acc100_cfg.h           |  17 +
 drivers/baseband/acc100/rte_acc100_pmd.c           | 505 +++++++++++++++++++++
 .../acc100/rte_pmd_bbdev_acc100_version.map        |   7 +
 6 files changed, 608 insertions(+)

diff --git a/app/test-bbdev/test_bbdev_perf.c b/app/test-bbdev/test_bbdev_perf.c
index 45c0d62..32f23ff 100644
--- a/app/test-bbdev/test_bbdev_perf.c
+++ b/app/test-bbdev/test_bbdev_perf.c
@@ -52,6 +52,18 @@
 #define FLR_5G_TIMEOUT 610
 #endif
 
+#ifdef RTE_LIBRTE_PMD_BBDEV_ACC100
+#include <rte_acc100_cfg.h>
+#define ACC100PF_DRIVER_NAME   ("intel_acc100_pf")
+#define ACC100VF_DRIVER_NAME   ("intel_acc100_vf")
+#define ACC100_QMGR_NUM_AQS 16
+#define ACC100_QMGR_NUM_QGS 2
+#define ACC100_QMGR_AQ_DEPTH 5
+#define ACC100_QMGR_INVALID_IDX -1
+#define ACC100_QMGR_RR 1
+#define ACC100_QOS_GBR 0
+#endif
+
 #define OPS_CACHE_SIZE 256U
 #define OPS_POOL_SIZE_MIN 511U /* 0.5K per queue */
 
@@ -653,6 +665,66 @@ typedef int (test_case_function)(struct active_device *ad,
 				info->dev_name);
 	}
 #endif
+#ifdef RTE_LIBRTE_PMD_BBDEV_ACC100
+	if ((get_init_device() == true) &&
+		(!strcmp(info->drv.driver_name, ACC100PF_DRIVER_NAME))) {
+		struct acc100_conf conf;
+		unsigned int i;
+
+		printf("Configure ACC100 FEC Driver %s with default values\n",
+				info->drv.driver_name);
+
+		/* clear default configuration before initialization */
+		memset(&conf, 0, sizeof(struct acc100_conf));
+
+		/* Always set in PF mode for built-in configuration */
+		conf.pf_mode_en = true;
+		for (i = 0; i < RTE_ACC100_NUM_VFS; ++i) {
+			conf.arb_dl_4g[i].gbr_threshold1 = ACC100_QOS_GBR;
+			conf.arb_dl_4g[i].gbr_threshold2 = ACC100_QOS_GBR;
+			conf.arb_dl_4g[i].round_robin_weight = ACC100_QMGR_RR;
+			conf.arb_ul_4g[i].gbr_threshold1 = ACC100_QOS_GBR;
+			conf.arb_ul_4g[i].gbr_threshold2 = ACC100_QOS_GBR;
+			conf.arb_ul_4g[i].round_robin_weight = ACC100_QMGR_RR;
+			conf.arb_dl_5g[i].gbr_threshold1 = ACC100_QOS_GBR;
+			conf.arb_dl_5g[i].gbr_threshold2 = ACC100_QOS_GBR;
+			conf.arb_dl_5g[i].round_robin_weight = ACC100_QMGR_RR;
+			conf.arb_ul_5g[i].gbr_threshold1 = ACC100_QOS_GBR;
+			conf.arb_ul_5g[i].gbr_threshold2 = ACC100_QOS_GBR;
+			conf.arb_ul_5g[i].round_robin_weight = ACC100_QMGR_RR;
+		}
+
+		conf.input_pos_llr_1_bit = true;
+		conf.output_pos_llr_1_bit = true;
+		conf.num_vf_bundles = 1; /**< Number of VF bundles to setup */
+
+		conf.q_ul_4g.num_qgroups = ACC100_QMGR_NUM_QGS;
+		conf.q_ul_4g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+		conf.q_ul_4g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+		conf.q_ul_4g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+		conf.q_dl_4g.num_qgroups = ACC100_QMGR_NUM_QGS;
+		conf.q_dl_4g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+		conf.q_dl_4g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+		conf.q_dl_4g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+		conf.q_ul_5g.num_qgroups = ACC100_QMGR_NUM_QGS;
+		conf.q_ul_5g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+		conf.q_ul_5g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+		conf.q_ul_5g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+		conf.q_dl_5g.num_qgroups = ACC100_QMGR_NUM_QGS;
+		conf.q_dl_5g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+		conf.q_dl_5g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+		conf.q_dl_5g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+
+		/* setup PF with configuration information */
+		ret = acc100_configure(info->dev_name, &conf);
+		TEST_ASSERT_SUCCESS(ret,
+				"Failed to configure ACC100 PF for bbdev %s",
+				info->dev_name);
+		/* Refresh the device info now that the PF is configured */
+	}
+	rte_bbdev_info_get(dev_id, info);
+#endif
+
 	nb_queues = RTE_MIN(rte_lcore_count(), info->drv.max_num_queues);
 	nb_queues = RTE_MIN(nb_queues, (unsigned int) MAX_QUEUES);
 
diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
index 73ac08f..c8d0586 100644
--- a/doc/guides/rel_notes/release_20_11.rst
+++ b/doc/guides/rel_notes/release_20_11.rst
@@ -55,6 +55,11 @@ New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+* **Added Intel ACC100 bbdev PMD.**
+
+  Added a new ``acc100`` bbdev driver for the Intel\ |reg| ACC100 accelerator
+  also known as Mount Bryce.  See the
+  :doc:`../bbdevs/acc100` BBDEV guide for more details on this new driver.
 
 Removed Items
 -------------
diff --git a/drivers/baseband/acc100/meson.build b/drivers/baseband/acc100/meson.build
index 8afafc2..7ac44dc 100644
--- a/drivers/baseband/acc100/meson.build
+++ b/drivers/baseband/acc100/meson.build
@@ -4,3 +4,5 @@
 deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci']
 
 sources = files('rte_acc100_pmd.c')
+
+install_headers('rte_acc100_cfg.h')
diff --git a/drivers/baseband/acc100/rte_acc100_cfg.h b/drivers/baseband/acc100/rte_acc100_cfg.h
index 73bbe36..7f523bc 100644
--- a/drivers/baseband/acc100/rte_acc100_cfg.h
+++ b/drivers/baseband/acc100/rte_acc100_cfg.h
@@ -89,6 +89,23 @@ struct acc100_conf {
 	struct rte_arbitration_t arb_dl_5g[RTE_ACC100_NUM_VFS];
 };
 
+/**
+ * Configure an ACC100 device
+ *
+ * @param dev_name
+ *   The name of the device. This is the short form of PCI BDF, e.g. 00:01.0.
+ *   It can also be retrieved for a bbdev device from the dev_name field in the
+ *   rte_bbdev_info structure returned by rte_bbdev_info_get().
+ * @param conf
+ *   Configuration to apply to ACC100 HW.
+ *
+ * @return
+ *   Zero on success, negative value on failure.
+ */
+__rte_experimental
+int
+acc100_configure(const char *dev_name, struct acc100_conf *conf);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 3589814..b50dd32 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -85,6 +85,26 @@
 
 enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC};
 
+/* Return the accelerator enum for a Queue Group Index */
+static inline int
+accFromQgid(int qg_idx, const struct acc100_conf *acc100_conf)
+{
+	int accQg[ACC100_NUM_QGRPS];
+	int NumQGroupsPerFn[NUM_ACC];
+	int acc, qgIdx, qgIndex = 0;
+	for (qgIdx = 0; qgIdx < ACC100_NUM_QGRPS; qgIdx++)
+		accQg[qgIdx] = 0;
+	NumQGroupsPerFn[UL_4G] = acc100_conf->q_ul_4g.num_qgroups;
+	NumQGroupsPerFn[UL_5G] = acc100_conf->q_ul_5g.num_qgroups;
+	NumQGroupsPerFn[DL_4G] = acc100_conf->q_dl_4g.num_qgroups;
+	NumQGroupsPerFn[DL_5G] = acc100_conf->q_dl_5g.num_qgroups;
+	for (acc = UL_4G;  acc < NUM_ACC; acc++)
+		for (qgIdx = 0; qgIdx < NumQGroupsPerFn[acc]; qgIdx++)
+			accQg[qgIndex++] = acc;
+	acc = accQg[qg_idx];
+	return acc;
+}
+
 /* Return the queue topology for a Queue Group Index */
 static inline void
 qtopFromAcc(struct rte_q_topology_t **qtop, int acc_enum,
@@ -113,6 +133,30 @@
 	*qtop = p_qtop;
 }
 
+/* Return the AQ depth for a Queue Group Index */
+static inline int
+aqDepth(int qg_idx, struct acc100_conf *acc100_conf)
+{
+	struct rte_q_topology_t *q_top = NULL;
+	int acc_enum = accFromQgid(qg_idx, acc100_conf);
+	qtopFromAcc(&q_top, acc_enum, acc100_conf);
+	if (unlikely(q_top == NULL))
+		return 0;
+	return q_top->aq_depth_log2;
+}
+
+/* Return the number of AQs for a Queue Group Index */
+static inline int
+aqNum(int qg_idx, struct acc100_conf *acc100_conf)
+{
+	struct rte_q_topology_t *q_top = NULL;
+	int acc_enum = accFromQgid(qg_idx, acc100_conf);
+	qtopFromAcc(&q_top, acc_enum, acc100_conf);
+	if (unlikely(q_top == NULL))
+		return 0;
+	return q_top->num_aqs_per_groups;
+}
+
 static void
 initQTop(struct acc100_conf *acc100_conf)
 {
@@ -4177,3 +4221,464 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
 RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
 RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
 RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
+
+/*
+ * Implementation to fix the power on status of some 5GUL engines
+ * This requires DMA permission if ported outside DPDK
+ */
+static void
+poweron_cleanup(struct rte_bbdev *bbdev, struct acc100_device *d,
+		struct acc100_conf *conf)
+{
+	int i, template_idx, qg_idx;
+	uint32_t address, status, payload;
+	printf("Need to clear power-on 5GUL status in internal memory\n");
+	/* Reset LDPC Cores */
+	for (i = 0; i < ACC100_ENGINES_MAX; i++)
+		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+				ACC100_ENGINE_OFFSET * i, ACC100_RESET_HI);
+	usleep(LONG_WAIT);
+	for (i = 0; i < ACC100_ENGINES_MAX; i++)
+		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+				ACC100_ENGINE_OFFSET * i, ACC100_RESET_LO);
+	usleep(LONG_WAIT);
+	/* Prepare dummy workload */
+	alloc_2x64mb_sw_rings_mem(bbdev, d, 0);
+	/* Set base addresses */
+	uint32_t phys_high = (uint32_t)(d->sw_rings_phys >> 32);
+	uint32_t phys_low  = (uint32_t)(d->sw_rings_phys &
+			~(ACC100_SIZE_64MBYTE-1));
+	acc100_reg_write(d, HWPfDmaFec5GulDescBaseHiRegVf, phys_high);
+	acc100_reg_write(d, HWPfDmaFec5GulDescBaseLoRegVf, phys_low);
+
+	/* Descriptor for a dummy 5GUL code block processing */
+	union acc100_dma_desc *desc = NULL;
+	desc = d->sw_rings;
+	desc->req.data_ptrs[0].address = d->sw_rings_phys +
+			ACC100_DESC_FCW_OFFSET;
+	desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
+	desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
+	desc->req.data_ptrs[0].last = 0;
+	desc->req.data_ptrs[0].dma_ext = 0;
+	desc->req.data_ptrs[1].address = d->sw_rings_phys + 512;
+	desc->req.data_ptrs[1].blkid = ACC100_DMA_BLKID_IN;
+	desc->req.data_ptrs[1].last = 1;
+	desc->req.data_ptrs[1].dma_ext = 0;
+	desc->req.data_ptrs[1].blen = 44;
+	desc->req.data_ptrs[2].address = d->sw_rings_phys + 1024;
+	desc->req.data_ptrs[2].blkid = ACC100_DMA_BLKID_OUT_ENC;
+	desc->req.data_ptrs[2].last = 1;
+	desc->req.data_ptrs[2].dma_ext = 0;
+	desc->req.data_ptrs[2].blen = 5;
+	/* Dummy FCW */
+	desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
+	desc->req.fcw_ld.qm = 1;
+	desc->req.fcw_ld.nfiller = 30;
+	desc->req.fcw_ld.BG = 2 - 1;
+	desc->req.fcw_ld.Zc = 7;
+	desc->req.fcw_ld.ncb = 350;
+	desc->req.fcw_ld.rm_e = 4;
+	desc->req.fcw_ld.itmax = 10;
+	desc->req.fcw_ld.gain_i = 1;
+	desc->req.fcw_ld.gain_h = 1;
+
+	int engines_to_restart[SIG_UL_5G_LAST + 1] = {0};
+	int num_failed_engine = 0;
+	/* Detect engines in undefined state */
+	for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
+			template_idx++) {
+		/* Check engine power-on status */
+		address = HwPfFecUl5gIbDebugReg +
+				ACC100_ENGINE_OFFSET * template_idx;
+		status = (acc100_reg_read(d, address) >> 4) & 0xF;
+		if (status == 0) {
+			engines_to_restart[num_failed_engine] = template_idx;
+			num_failed_engine++;
+		}
+	}
+
+	int numQqsAcc = conf->q_ul_5g.num_qgroups;
+	int numQgs = conf->q_ul_5g.num_qgroups;
+	payload = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		payload |= (1 << qg_idx);
+	/* Force each engine which is in unspecified state */
+	for (i = 0; i < num_failed_engine; i++) {
+		int failed_engine = engines_to_restart[i];
+		printf("Force engine %d\n", failed_engine);
+		for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
+				template_idx++) {
+			address = HWPfQmgrGrpTmplateReg4Indx
+					+ BYTES_IN_WORD * template_idx;
+			if (template_idx == failed_engine)
+				acc100_reg_write(d, address, payload);
+			else
+				acc100_reg_write(d, address, 0);
+		}
+		/* Reset descriptor header */
+		desc->req.word0 = ACC100_DMA_DESC_TYPE;
+		desc->req.word1 = 0;
+		desc->req.word2 = 0;
+		desc->req.word3 = 0;
+		desc->req.numCBs = 1;
+		desc->req.m2dlen = 2;
+		desc->req.d2mlen = 1;
+		/* Enqueue the code block for processing */
+		union acc100_enqueue_reg_fmt enq_req;
+		enq_req.val = 0;
+		enq_req.addr_offset = ACC100_DESC_OFFSET;
+		enq_req.num_elem = 1;
+		enq_req.req_elem_addr = 0;
+		rte_wmb();
+		acc100_reg_write(d, HWPfQmgrIngressAq + 0x100, enq_req.val);
+		usleep(LONG_WAIT * 100);
+		if (desc->req.word0 != 2)
+			printf("DMA Response %#"PRIx32"\n", desc->req.word0);
+	}
+
+	/* Reset LDPC Cores */
+	for (i = 0; i < ACC100_ENGINES_MAX; i++)
+		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+				ACC100_ENGINE_OFFSET * i, ACC100_RESET_HI);
+	usleep(LONG_WAIT);
+	for (i = 0; i < ACC100_ENGINES_MAX; i++)
+		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+				ACC100_ENGINE_OFFSET * i, ACC100_RESET_LO);
+	usleep(LONG_WAIT);
+	acc100_reg_write(d, HWPfHi5GHardResetReg, ACC100_RESET_HARD);
+	usleep(LONG_WAIT);
+	int numEngines = 0;
+	/* Check engine power-on status again */
+	for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
+			template_idx++) {
+		address = HwPfFecUl5gIbDebugReg +
+				ACC100_ENGINE_OFFSET * template_idx;
+		status = (acc100_reg_read(d, address) >> 4) & 0xF;
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD * template_idx;
+		if (status == 1) {
+			acc100_reg_write(d, address, payload);
+			numEngines++;
+		} else
+			acc100_reg_write(d, address, 0);
+	}
+	printf("Number of 5GUL engines %d\n", numEngines);
+
+	if (d->sw_rings_base != NULL)
+		rte_free(d->sw_rings_base);
+	usleep(LONG_WAIT);
+}
+
+/* Initial configuration of an ACC100 device prior to running configure() */
+int
+acc100_configure(const char *dev_name, struct acc100_conf *conf)
+{
+	rte_bbdev_log(INFO, "acc100_configure");
+	uint32_t payload, address, status;
+	int qg_idx, template_idx, vf_idx, acc, i;
+	struct rte_bbdev *bbdev = rte_bbdev_get_named_dev(dev_name);
+
+	/* Compile time checks */
+	RTE_BUILD_BUG_ON(sizeof(struct acc100_dma_req_desc) != 256);
+	RTE_BUILD_BUG_ON(sizeof(union acc100_dma_desc) != 256);
+	RTE_BUILD_BUG_ON(sizeof(struct acc100_fcw_td) != 24);
+	RTE_BUILD_BUG_ON(sizeof(struct acc100_fcw_te) != 32);
+
+	if (bbdev == NULL) {
+		rte_bbdev_log(ERR,
+		"Invalid dev_name (%s), or device is not yet initialised",
+		dev_name);
+		return -ENODEV;
+	}
+	struct acc100_device *d = bbdev->data->dev_private;
+
+	/* Store configuration */
+	rte_memcpy(&d->acc100_conf, conf, sizeof(d->acc100_conf));
+
+	/* PCIe Bridge configuration */
+	acc100_reg_write(d, HwPfPcieGpexBridgeControl, ACC100_CFG_PCI_BRIDGE);
+	for (i = 1; i < 17; i++)
+		acc100_reg_write(d,
+				HwPfPcieGpexAxiAddrMappingWindowPexBaseHigh
+				+ i * 16, 0);
+
+	/* PCIe Link Training and Status State Machine */
+	acc100_reg_write(d, HwPfPcieGpexLtssmStateCntrl, 0xDFC00000);
+
+	/* Prevent blocking AXI read on BRESP for AXI Write */
+	address = HwPfPcieGpexAxiPioControl;
+	payload = ACC100_CFG_PCI_AXI;
+	acc100_reg_write(d, address, payload);
+
+	/* 5GDL PLL phase shift */
+	acc100_reg_write(d, HWPfChaDl5gPllPhshft0, 0x1);
+
+	/* Explicitly releasing AXI as this may be stopped after PF FLR/BME */
+	address = HWPfDmaAxiControl;
+	payload = 1;
+	acc100_reg_write(d, address, payload);
+
+	/* DDR Configuration */
+	address = HWPfDdrBcTim6;
+	payload = acc100_reg_read(d, address);
+	payload &= 0xFFFFFFFB; /* Bit 2 */
+#ifdef ACC100_DDR_ECC_ENABLE
+	payload |= 0x4;
+#endif
+	acc100_reg_write(d, address, payload);
+	address = HWPfDdrPhyDqsCountNum;
+#ifdef ACC100_DDR_ECC_ENABLE
+	payload = 9;
+#else
+	payload = 8;
+#endif
+	acc100_reg_write(d, address, payload);
+
+	/* Set default descriptor signature */
+	address = HWPfDmaDescriptorSignatuture;
+	payload = 0;
+	acc100_reg_write(d, address, payload);
+
+	/* Enable the Error Detection in DMA */
+	payload = ACC100_CFG_DMA_ERROR;
+	address = HWPfDmaErrorDetectionEn;
+	acc100_reg_write(d, address, payload);
+
+	/* AXI Cache configuration */
+	payload = ACC100_CFG_AXI_CACHE;
+	address = HWPfDmaAxcacheReg;
+	acc100_reg_write(d, address, payload);
+
+	/* Default DMA Configuration (Qmgr Enabled) */
+	address = HWPfDmaConfig0Reg;
+	payload = 0;
+	acc100_reg_write(d, address, payload);
+	address = HWPfDmaQmanen;
+	payload = 0;
+	acc100_reg_write(d, address, payload);
+
+	/* Default RLIM/ALEN configuration */
+	address = HWPfDmaConfig1Reg;
+	payload = (1 << 31) + (23 << 8) + (1 << 6) + 7;
+	acc100_reg_write(d, address, payload);
+
+	/* Configure DMA Qmanager addresses */
+	address = HWPfDmaQmgrAddrReg;
+	payload = HWPfQmgrEgressQueuesTemplate;
+	acc100_reg_write(d, address, payload);
+
+	/* ===== Qmgr Configuration ===== */
+	/* Configuration of the AQueue Depth QMGR_GRP_0_DEPTH_LOG2 for UL */
+	int totalQgs = conf->q_ul_4g.num_qgroups +
+			conf->q_ul_5g.num_qgroups +
+			conf->q_dl_4g.num_qgroups +
+			conf->q_dl_5g.num_qgroups;
+	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
+		address = HWPfQmgrDepthLog2Grp +
+		BYTES_IN_WORD * qg_idx;
+		payload = aqDepth(qg_idx, conf);
+		acc100_reg_write(d, address, payload);
+		address = HWPfQmgrTholdGrp +
+		BYTES_IN_WORD * qg_idx;
+		payload = (1 << 16) + (1 << (aqDepth(qg_idx, conf) - 1));
+		acc100_reg_write(d, address, payload);
+	}
+
+	/* Template Priority in incremental order */
+	for (template_idx = 0; template_idx < ACC100_NUM_TMPL;
+			template_idx++) {
+		address = HWPfQmgrGrpTmplateReg0Indx +
+		BYTES_IN_WORD * (template_idx % 8);
+		payload = TMPL_PRI_0;
+		acc100_reg_write(d, address, payload);
+		address = HWPfQmgrGrpTmplateReg1Indx +
+		BYTES_IN_WORD * (template_idx % 8);
+		payload = TMPL_PRI_1;
+		acc100_reg_write(d, address, payload);
+		address = HWPfQmgrGrpTmplateReg2indx +
+		BYTES_IN_WORD * (template_idx % 8);
+		payload = TMPL_PRI_2;
+		acc100_reg_write(d, address, payload);
+		address = HWPfQmgrGrpTmplateReg3Indx +
+		BYTES_IN_WORD * (template_idx % 8);
+		payload = TMPL_PRI_3;
+		acc100_reg_write(d, address, payload);
+	}
+
+	address = HWPfQmgrGrpPriority;
+	payload = ACC100_CFG_QMGR_HI_P;
+	acc100_reg_write(d, address, payload);
+
+	/* Template Configuration */
+	for (template_idx = 0; template_idx < ACC100_NUM_TMPL; template_idx++) {
+		payload = 0;
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD * template_idx;
+		acc100_reg_write(d, address, payload);
+	}
+	/* 4GUL */
+	int numQgs = conf->q_ul_4g.num_qgroups;
+	int numQqsAcc = 0;
+	payload = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		payload |= (1 << qg_idx);
+	for (template_idx = SIG_UL_4G; template_idx <= SIG_UL_4G_LAST;
+			template_idx++) {
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD*template_idx;
+		acc100_reg_write(d, address, payload);
+	}
+	/* 5GUL */
+	numQqsAcc += numQgs;
+	numQgs	= conf->q_ul_5g.num_qgroups;
+	payload = 0;
+	int numEngines = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		payload |= (1 << qg_idx);
+	for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
+			template_idx++) {
+		/* Check engine power-on status */
+		address = HwPfFecUl5gIbDebugReg +
+				ACC100_ENGINE_OFFSET * template_idx;
+		status = (acc100_reg_read(d, address) >> 4) & 0xF;
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD * template_idx;
+		if (status == 1) {
+			acc100_reg_write(d, address, payload);
+			numEngines++;
+		} else
+			acc100_reg_write(d, address, 0);
+		#if RTE_ACC100_SINGLE_FEC == 1
+		payload = 0;
+		#endif
+	}
+	printf("Number of 5GUL engines %d\n", numEngines);
+	/* 4GDL */
+	numQqsAcc += numQgs;
+	numQgs	= conf->q_dl_4g.num_qgroups;
+	payload = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		payload |= (1 << qg_idx);
+	for (template_idx = SIG_DL_4G; template_idx <= SIG_DL_4G_LAST;
+			template_idx++) {
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD*template_idx;
+		acc100_reg_write(d, address, payload);
+		#if RTE_ACC100_SINGLE_FEC == 1
+			payload = 0;
+		#endif
+	}
+	/* 5GDL */
+	numQqsAcc += numQgs;
+	numQgs	= conf->q_dl_5g.num_qgroups;
+	payload = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		payload |= (1 << qg_idx);
+	for (template_idx = SIG_DL_5G; template_idx <= SIG_DL_5G_LAST;
+			template_idx++) {
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ BYTES_IN_WORD*template_idx;
+		acc100_reg_write(d, address, payload);
+		#if RTE_ACC100_SINGLE_FEC == 1
+		payload = 0;
+		#endif
+	}
+
+	/* Queue Group Function mapping */
+	int qman_func_id[5] = {0, 2, 1, 3, 4};
+	address = HWPfQmgrGrpFunction0;
+	payload = 0;
+	for (qg_idx = 0; qg_idx < 8; qg_idx++) {
+		acc = accFromQgid(qg_idx, conf);
+		payload |= qman_func_id[acc]<<(qg_idx * 4);
+	}
+	acc100_reg_write(d, address, payload);
+
+	/* Configuration of the Arbitration QGroup depth to 1 */
+	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
+		address = HWPfQmgrArbQDepthGrp +
+		BYTES_IN_WORD * qg_idx;
+		payload = 0;
+		acc100_reg_write(d, address, payload);
+	}
+
+	/* Enabling AQueues through the Queue hierarchy */
+	for (vf_idx = 0; vf_idx < ACC100_NUM_VFS; vf_idx++) {
+		for (qg_idx = 0; qg_idx < ACC100_NUM_QGRPS; qg_idx++) {
+			payload = 0;
+			if (vf_idx < conf->num_vf_bundles &&
+					qg_idx < totalQgs)
+				payload = (1 << aqNum(qg_idx, conf)) - 1;
+			address = HWPfQmgrAqEnableVf
+					+ vf_idx * BYTES_IN_WORD;
+			payload += (qg_idx << 16);
+			acc100_reg_write(d, address, payload);
+		}
+	}
+
+	/* This pointer to ARAM (256kB) is shifted by 2 (4B per register) */
+	uint32_t aram_address = 0;
+	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
+		for (vf_idx = 0; vf_idx < conf->num_vf_bundles; vf_idx++) {
+			address = HWPfQmgrVfBaseAddr + vf_idx
+					* BYTES_IN_WORD + qg_idx
+					* BYTES_IN_WORD * 64;
+			payload = aram_address;
+			acc100_reg_write(d, address, payload);
+			/* Offset ARAM Address for next memory bank
+			 * - increment of 4B
+			 */
+			aram_address += aqNum(qg_idx, conf) *
+					(1 << aqDepth(qg_idx, conf));
+		}
+	}
+
+	if (aram_address > WORDS_IN_ARAM_SIZE) {
+		rte_bbdev_log(ERR, "ARAM Configuration not fitting %d %d\n",
+				aram_address, WORDS_IN_ARAM_SIZE);
+		return -EINVAL;
+	}
+
+	/* ==== HI Configuration ==== */
+
+	/* Prevent Block on Transmit Error */
+	address = HWPfHiBlockTransmitOnErrorEn;
+	payload = 0;
+	acc100_reg_write(d, address, payload);
+	/* Prevent MSI from being dropped */
+	address = HWPfHiMsiDropEnableReg;
+	payload = 0;
+	acc100_reg_write(d, address, payload);
+	/* Set the PF Mode register */
+	address = HWPfHiPfMode;
+	payload = (conf->pf_mode_en) ? 2 : 0;
+	acc100_reg_write(d, address, payload);
+	/* Enable Error Detection in HW */
+	address = HWPfDmaErrorDetectionEn;
+	payload = 0x3D7;
+	acc100_reg_write(d, address, payload);
+
+	/* QoS overflow init */
+	payload = 1;
+	address = HWPfQosmonAEvalOverflow0;
+	acc100_reg_write(d, address, payload);
+	address = HWPfQosmonBEvalOverflow0;
+	acc100_reg_write(d, address, payload);
+
+	/* HARQ DDR Configuration */
+	unsigned int ddrSizeInMb = 512; /* Fixed to 512 MB per VF for now */
+	for (vf_idx = 0; vf_idx < conf->num_vf_bundles; vf_idx++) {
+		address = HWPfDmaVfDdrBaseRw + vf_idx
+				* 0x10;
+		payload = ((vf_idx * (ddrSizeInMb / 64)) << 16) +
+				(ddrSizeInMb - 1);
+		acc100_reg_write(d, address, payload);
+	}
+	usleep(LONG_WAIT);
+
+	if (numEngines < (SIG_UL_5G_LAST + 1))
+		poweron_cleanup(bbdev, d, conf);
+
+	rte_bbdev_log_debug("PF Tip configuration complete for %s", dev_name);
+	return 0;
+}
diff --git a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
index 4a76d1d..91c234d 100644
--- a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
+++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
@@ -1,3 +1,10 @@
 DPDK_21 {
 	local: *;
 };
+
+EXPERIMENTAL {
+	global:
+
+	acc100_configure;
+
+};
-- 
1.8.3.1



* Re: [dpdk-dev] [PATCH v7 11/11] doc: update bbdev feature table
  2020-09-28 20:19       ` Akhil Goyal
@ 2020-09-29  0:57         ` Chautru, Nicolas
  0 siblings, 0 replies; 213+ messages in thread
From: Chautru, Nicolas @ 2020-09-29  0:57 UTC (permalink / raw)
  To: Akhil Goyal, dev
  Cc: Richardson, Bruce, Xu, Rosen, dave.burley, aidan.goddard, Yigit,
	Ferruh, Liu, Tianjiao

Hi Akhil, 

> From: Akhil Goyal <akhil.goyal@nxp.com>
> 
> Hi Nicolas,
> 
> >
> > Correcting overview matrix to use acc100 name
> >
> > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
> > ---
> >  doc/guides/bbdevs/features/acc100.ini | 14 ++++++++++++++
> >  doc/guides/bbdevs/features/mbc.ini    | 14 --------------
> >  2 files changed, 14 insertions(+), 14 deletions(-)  create mode
> > 100644 doc/guides/bbdevs/features/acc100.ini
> >  delete mode 100644 doc/guides/bbdevs/features/mbc.ini
> >
> > diff --git a/doc/guides/bbdevs/features/acc100.ini
> > b/doc/guides/bbdevs/features/acc100.ini
> > new file mode 100644
> > index 0000000..642cd48
> > --- /dev/null
> > +++ b/doc/guides/bbdevs/features/acc100.ini
> > @@ -0,0 +1,14 @@
> > +;
> > +; Supported features of the 'acc100' bbdev driver.
> > +;
> > +; Refer to default.ini for the full list of available PMD features.
> > +;
> > +[Features]
> > +Turbo Decoder (4G)     = Y
> > +Turbo Encoder (4G)     = Y
> > +LDPC Decoder (5G)      = Y
> > +LDPC Encoder (5G)      = Y
> > +LLR/HARQ Compression   = Y
> > +External DDR Access    = Y
> > +HW Accelerated         = Y
> > +BBDEV API              = Y
> We normally do not take separate feature set patches for documentation.
> These should be split across your patchset, where you are actually adding
> the feature.
> 
> Also the release notes in the first patch is not correct as the PMD is not
> Complete there. You can add it in the last patch.

No problem. Updated in the v9 now available. 

> 
> > diff --git a/doc/guides/bbdevs/features/mbc.ini
> > b/doc/guides/bbdevs/features/mbc.ini
> > deleted file mode 100644
> > index 78a7b95..0000000
> > --- a/doc/guides/bbdevs/features/mbc.ini
> > +++ /dev/null
> > @@ -1,14 +0,0 @@
> > -;
> > -; Supported features of the 'mbc' bbdev driver.
> > -;
> > -; Refer to default.ini for the full list of available PMD features.
> > -;
> > -[Features]
> > -Turbo Decoder (4G)     = Y
> > -Turbo Encoder (4G)     = Y
> > -LDPC Decoder (5G)      = Y
> > -LDPC Encoder (5G)      = Y
> > -LLR/HARQ Compression   = Y
> > -External DDR Access    = Y
> > -HW Accelerated         = Y
> > -BBDEV API              = Y
> 
> Not sure how was it missed earlier.
> Please submit a separate patch for this. This should also be sent for stable
> backport as well And add a fixes line.

Yes that was my bad to overlook this. Sent now as well on separate patch. 

More generally the patches related to BBDEV pending to be applied for 20.11 :  
https://patches.dpdk.org/project/dpdk/list/?series=&submitter=chautru&state=&q=&archive=&delegate=

Thanks
Nic


* Re: [dpdk-dev] [PATCH v9 01/10] drivers/baseband: add PMD for ACC100
  2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 01/10] drivers/baseband: add PMD for ACC100 Nicolas Chautru
@ 2020-09-29 19:53       ` Tom Rix
  2020-09-29 23:17         ` Chautru, Nicolas
  0 siblings, 1 reply; 213+ messages in thread
From: Tom Rix @ 2020-09-29 19:53 UTC (permalink / raw)
  To: Nicolas Chautru, dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu


On 9/28/20 5:29 PM, Nicolas Chautru wrote:
> Add stubs for the ACC100 PMD
>
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
> ---
>  doc/guides/bbdevs/acc100.rst                       | 233 +++++++++++++++++++++
>  doc/guides/bbdevs/features/acc100.ini              |  14 ++
>  doc/guides/bbdevs/index.rst                        |   1 +
>  drivers/baseband/acc100/meson.build                |   6 +
>  drivers/baseband/acc100/rte_acc100_pmd.c           | 175 ++++++++++++++++
>  drivers/baseband/acc100/rte_acc100_pmd.h           |  37 ++++
>  .../acc100/rte_pmd_bbdev_acc100_version.map        |   3 +
>  drivers/baseband/meson.build                       |   2 +-
>  8 files changed, 470 insertions(+), 1 deletion(-)
>  create mode 100644 doc/guides/bbdevs/acc100.rst
>  create mode 100644 doc/guides/bbdevs/features/acc100.ini
>  create mode 100644 drivers/baseband/acc100/meson.build
>  create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.c
>  create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.h
>  create mode 100644 drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
>
> diff --git a/doc/guides/bbdevs/acc100.rst b/doc/guides/bbdevs/acc100.rst
> new file mode 100644
> index 0000000..f87ee09
> --- /dev/null
> +++ b/doc/guides/bbdevs/acc100.rst
> @@ -0,0 +1,233 @@
> +..  SPDX-License-Identifier: BSD-3-Clause
> +    Copyright(c) 2020 Intel Corporation
> +
> +Intel(R) ACC100 5G/4G FEC Poll Mode Driver
> +==========================================
> +
> +The BBDEV ACC100 5G/4G FEC poll mode driver (PMD) supports an
> +implementation of a VRAN FEC wireless acceleration function.
> +This device is also known as Mount Bryce.
If this is a code name or general chip name, it should be removed.
> +
> +Features
> +--------
> +
> +ACC100 5G/4G FEC PMD supports the following features:
> +
> +- LDPC Encode in the DL (5GNR)
> +- LDPC Decode in the UL (5GNR)
> +- Turbo Encode in the DL (4G)
> +- Turbo Decode in the UL (4G)
> +- 16 VFs per PF (physical device)
> +- Maximum of 128 queues per VF
> +- PCIe Gen-3 x16 Interface
> +- MSI
> +- SR-IOV
> +
> +ACC100 5G/4G FEC PMD supports the following BBDEV capabilities:
> +
> +* For the LDPC encode operation:
> +   - ``RTE_BBDEV_LDPC_CRC_24B_ATTACH`` :  set to attach CRC24B to CB(s)
> +   - ``RTE_BBDEV_LDPC_RATE_MATCH`` :  if set then do not do Rate Match bypass
> +   - ``RTE_BBDEV_LDPC_INTERLEAVER_BYPASS`` : if set then bypass interleaver
> +
> +* For the LDPC decode operation:
> +   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK`` :  check CRC24B from CB(s)
> +   - ``RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE`` :  disable early termination
> +   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP`` :  drops CRC24B bits appended while decoding
> +   - ``RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE`` :  provides an input for HARQ combining
> +   - ``RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE`` :  provides an output for HARQ combining
> +   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE`` :  HARQ memory input is internal
> +   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE`` :  HARQ memory output is internal
> +   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK`` :  loopback data to/from HARQ memory
> +   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_FILLERS`` :  HARQ memory includes the filler bits
> +   - ``RTE_BBDEV_LDPC_DEC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
> +   - ``RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION`` :  supports compression of the HARQ input/output
> +   - ``RTE_BBDEV_LDPC_LLR_COMPRESSION`` :  supports LLR input compression
> +
> +* For the turbo encode operation:
> +   - ``RTE_BBDEV_TURBO_CRC_24B_ATTACH`` :  set to attach CRC24B to CB(s)
> +   - ``RTE_BBDEV_TURBO_RATE_MATCH`` :  if set then do not do Rate Match bypass
> +   - ``RTE_BBDEV_TURBO_ENC_INTERRUPTS`` :  set for encoder dequeue interrupts
> +   - ``RTE_BBDEV_TURBO_RV_INDEX_BYPASS`` :  set to bypass RV index
> +   - ``RTE_BBDEV_TURBO_ENC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
> +
> +* For the turbo decode operation:
> +   - ``RTE_BBDEV_TURBO_CRC_TYPE_24B`` :  check CRC24B from CB(s)
> +   - ``RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE`` :  perform subblock de-interleave
> +   - ``RTE_BBDEV_TURBO_DEC_INTERRUPTS`` :  set for decoder dequeue interrupts
> +   - ``RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN`` :  set if negative LLR decoder input is supported
> +   - ``RTE_BBDEV_TURBO_POS_LLR_1_BIT_IN`` :  set if positive LLR decoder input is supported
> +   - ``RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP`` :  keep CRC24B bits appended while decoding
> +   - ``RTE_BBDEV_TURBO_EARLY_TERMINATION`` :  set early termination feature
> +   - ``RTE_BBDEV_TURBO_DEC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
> +   - ``RTE_BBDEV_TURBO_HALF_ITERATION_EVEN`` :  set half iteration granularity
> +
> +Installation
> +------------
> +
> +Section 3 of the DPDK manual provides instructions on installing and compiling DPDK. The
> +default set of bbdev compile flags may be found in config/common_base, where for example
> +the flag to build the ACC100 5G/4G FEC device, ``CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100``,
> +is already set.
> +
> +DPDK requires hugepages to be configured as detailed in section 2 of the DPDK manual.
> +The bbdev test application has been tested with a configuration of 40 x 1GB hugepages. The
> +hugepage configuration of a server may be examined using:
> +
> +.. code-block:: console
> +
> +   grep Huge* /proc/meminfo
> +
> +
> +Initialization
> +--------------
> +
> +When the device first powers up, its PCI Physical Functions (PF) can be listed through this command:
> +
> +.. code-block:: console
> +
> +  sudo lspci -vd8086:0d5c
> +
> +The physical and virtual functions are compatible with Linux UIO drivers:
> +``vfio`` and ``igb_uio``. However, in order to work the ACC100 5G/4G
> +FEC device firstly needs to be bound to one of these linux drivers through DPDK.
FEC device first
> +
> +
> +Bind PF UIO driver(s)
> +~~~~~~~~~~~~~~~~~~~~~
> +
> +Install the DPDK igb_uio driver, bind it with the PF PCI device ID and use
> +``lspci`` to confirm the PF device is under use by ``igb_uio`` DPDK UIO driver.
> +
> +The igb_uio driver may be bound to the PF PCI device using one of three methods:
> +
> +
> +1. PCI functions (physical or virtual, depending on the use case) can be bound to
> +the UIO driver by repeating this command for every function.
> +
> +.. code-block:: console
> +
> +  cd <dpdk-top-level-directory>
> +  insmod ./build/kmod/igb_uio.ko
> +  echo "8086 0d5c" > /sys/bus/pci/drivers/igb_uio/new_id
> +  lspci -vd8086:0d5c
> +
> +
> +2. Another way to bind PF with DPDK UIO driver is by using the ``dpdk-devbind.py`` tool
> +
> +.. code-block:: console
> +
> +  cd <dpdk-top-level-directory>
> +  ./usertools/dpdk-devbind.py -b igb_uio 0000:06:00.0
> +
> +where the PCI device ID (example: 0000:06:00.0) is obtained using ``lspci -vd8086:0d5c``
> +
> +
> +3. A third way to bind is to use ``dpdk-setup.sh`` tool
> +
> +.. code-block:: console
> +
> +  cd <dpdk-top-level-directory>
> +  ./usertools/dpdk-setup.sh
> +
> +  select 'Bind Ethernet/Crypto/Baseband device to IGB UIO module'
> +  or
> +  select 'Bind Ethernet/Crypto/Baseband device to VFIO module' depending on driver required
This is the igb_uio section, should defer vfio select to its section.
> +  enter PCI device ID
> +  select 'Display current Ethernet/Crypto/Baseband device settings' to confirm binding
> +
> +
> +In the same way the ACC100 5G/4G FEC PF can be bound with vfio, but vfio driver does not
> +support SR-IOV configuration right out of the box, so it will need to be patched.
Other documentation says this works with kernel 5.7.
> +
> +
> +Enable Virtual Functions
> +~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +The printouts should now show that the PCI PF is under igb_uio control:
> +"``Kernel driver in use: igb_uio``"
> +
> +To show the number of available VFs on the device, read the ``sriov_totalvfs`` file:
> +
> +.. code-block:: console
> +
> +  cat /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_totalvfs
> +
> +  where 0000\:<b>\:<d>.<f> is the PCI device ID
> +
> +
> +To enable VFs via igb_uio, write the number of virtual functions intended to
> +be enabled to the ``max_vfs`` file:
> +
> +.. code-block:: console
> +
> +  echo <num-of-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/max_vfs
> +
> +
> +Afterwards, all VFs must be bound to the appropriate UIO driver, in the same
> +way as the physical function above.
> +
> +Enabling SR-IOV via the vfio driver is similar, except that the file name
> +is different:
> +
> +.. code-block:: console
> +
> +  echo <num-of-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_numvfs
> +
> +
> +Configure the VFs through PF
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +The PCI virtual functions must be configured before use and before being
> +assigned to VMs/containers. The configuration involves allocating the number
> +of hardware queues, priorities, load balancing, bandwidth and other settings
> +necessary for the device to perform FEC functions.
> +
> +This configuration needs to be executed at least once after reboot or PCI FLR and can
> +be achieved by using the function ``acc100_configure()``, which sets up the
> +parameters defined in ``acc100_conf`` structure.
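For illustration, such a one-time setup call could look as below. This is a sketch only: the ``acc100_configure()`` and ``acc100_conf`` names come from the text above, but the field names shown are placeholders, not the actual PMD layout.

```c
/* Hypothetical sketch: the real acc100_conf fields are defined by the
 * PMD; the field names below are illustrative placeholders. */
struct acc100_conf conf = {0};

conf.num_vf_bundles = 16;   /* placeholder: one bundle per VF */
/* ... queue group, priority and bandwidth settings ... */

/* Run once after reboot or PCI FLR, before any VF is put to work. */
ret = acc100_configure(dev_name, &conf);
```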
> +
> +Test Application
> +----------------
> +
> +BBDEV provides a test application, ``test-bbdev.py``, and a range of test data for
> +testing the functionality of ACC100 5G/4G FEC encode and decode, depending on the
> +device's capabilities. The test application is located in the app/test-bbdev folder
> +and has the following options:
> +
> +.. code-block:: console
> +
> +  "-p", "--testapp-path": specifies path to the bbdev test app.
> +  "-e", "--eal-params"	: EAL arguments which are passed to the test app.
> +  "-t", "--timeout"	: Timeout in seconds (default=300).
> +  "-c", "--test-cases"	: Defines test cases to run. Run all if not specified.
> +  "-v", "--test-vector"	: Test vector path (default=dpdk_path+/app/test-bbdev/test_vectors/bbdev_null.data).
> +  "-n", "--num-ops"	: Number of operations to process on device (default=32).
> +  "-b", "--burst-size"	: Operations enqueue/dequeue burst size (default=32).
> +  "-s", "--snr"		: SNR in dB used when generating LLRs for bler tests.
> +  "-s", "--iter_max"	: Number of iterations for LDPC decoder.
> +  "-l", "--num-lcores"	: Number of lcores to run (default=16).
> +  "-i", "--init-device" : Initialise PF device with default values.
> +
> +
> +To execute the test application tool using simple decode or encode data,
> +type one of the following:
> +
> +.. code-block:: console
> +
> +  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_dec_default.data
> +  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_enc_default.data
> +
> +
> +The test application ``test-bbdev.py`` supports configuring the PF device with
> +a default set of values, if the "-i" or "--init-device" option is included. The default
> +values are defined in test_bbdev_perf.c.
> +
> +
> +Test Vectors
> +~~~~~~~~~~~~
> +
> +In addition to the simple LDPC decoder and LDPC encoder tests, bbdev also provides
> +a range of additional tests under the test_vectors folder, which may be useful. The results
> +of these tests will depend on the ACC100 5G/4G FEC capabilities which may cause some
> +testcases to be skipped, but no failure should be reported.

Just

to be skipped.

We should be able to assume skipped tests are not reported as failures.

> diff --git a/doc/guides/bbdevs/features/acc100.ini b/doc/guides/bbdevs/features/acc100.ini
> new file mode 100644
> index 0000000..c89a4d7
> --- /dev/null
> +++ b/doc/guides/bbdevs/features/acc100.ini
> @@ -0,0 +1,14 @@
> +;
> +; Supported features of the 'acc100' bbdev driver.
> +;
> +; Refer to default.ini for the full list of available PMD features.
> +;
> +[Features]
> +Turbo Decoder (4G)     = N
> +Turbo Encoder (4G)     = N
> +LDPC Decoder (5G)      = N
> +LDPC Encoder (5G)      = N
> +LLR/HARQ Compression   = N
> +External DDR Access    = N
> +HW Accelerated         = Y
> +BBDEV API              = Y
> diff --git a/doc/guides/bbdevs/index.rst b/doc/guides/bbdevs/index.rst
> index a8092dd..4445cbd 100644
> --- a/doc/guides/bbdevs/index.rst
> +++ b/doc/guides/bbdevs/index.rst
> @@ -13,3 +13,4 @@ Baseband Device Drivers
>      turbo_sw
>      fpga_lte_fec
>      fpga_5gnr_fec
> +    acc100
> diff --git a/drivers/baseband/acc100/meson.build b/drivers/baseband/acc100/meson.build
> new file mode 100644
> index 0000000..8afafc2
> --- /dev/null
> +++ b/drivers/baseband/acc100/meson.build
> @@ -0,0 +1,6 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright(c) 2020 Intel Corporation
> +
> +deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci']
> +
> +sources = files('rte_acc100_pmd.c')
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
> new file mode 100644
> index 0000000..1b4cd13
> --- /dev/null
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> @@ -0,0 +1,175 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation
> + */
> +
> +#include <unistd.h>
> +
> +#include <rte_common.h>
> +#include <rte_log.h>
> +#include <rte_dev.h>
> +#include <rte_malloc.h>
> +#include <rte_mempool.h>
> +#include <rte_byteorder.h>
> +#include <rte_errno.h>
> +#include <rte_branch_prediction.h>
> +#include <rte_hexdump.h>
> +#include <rte_pci.h>
> +#include <rte_bus_pci.h>
> +
> +#include <rte_bbdev.h>
> +#include <rte_bbdev_pmd.h>
Should these #includes be in alphabetical order?
> +#include "rte_acc100_pmd.h"
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, DEBUG);
> +#else
> +RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
> +#endif
> +
> +/* Free 64MB memory used for software rings */
> +static int
> +acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
> +{
> +	return 0;
> +}
> +
> +static const struct rte_bbdev_ops acc100_bbdev_ops = {
> +	.close = acc100_dev_close,
> +};
> +
> +/* ACC100 PCI PF address map */
> +static struct rte_pci_id pci_id_acc100_pf_map[] = {
> +	{
> +		RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID, RTE_ACC100_PF_DEVICE_ID)
> +	},
> +	{.device_id = 0},
> +};
> +
> +/* ACC100 PCI VF address map */
> +static struct rte_pci_id pci_id_acc100_vf_map[] = {
> +	{
> +		RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID, RTE_ACC100_VF_DEVICE_ID)
> +	},
> +	{.device_id = 0},
> +};
> +
> +/* Initialization Function */
> +static void
> +acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
> +{
> +	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
> +
> +	dev->dev_ops = &acc100_bbdev_ops;
> +
> +	((struct acc100_device *) dev->data->dev_private)->pf_device =
> +			!strcmp(drv->driver.name,
> +					RTE_STR(ACC100PF_DRIVER_NAME));
> +	((struct acc100_device *) dev->data->dev_private)->mmio_base =
> +			pci_dev->mem_resource[0].addr;
> +
> +	rte_bbdev_log_debug("Init device %s [%s] @ vaddr %p paddr %#"PRIx64"",
> +			drv->driver.name, dev->data->name,
> +			(void *)pci_dev->mem_resource[0].addr,
> +			pci_dev->mem_resource[0].phys_addr);
> +}
> +
> +static int acc100_pci_probe(struct rte_pci_driver *pci_drv,
> +	struct rte_pci_device *pci_dev)
> +{
> +	struct rte_bbdev *bbdev = NULL;
> +	char dev_name[RTE_BBDEV_NAME_MAX_LEN];
> +
> +	if (pci_dev == NULL) {
> +		rte_bbdev_log(ERR, "NULL PCI device");
> +		return -EINVAL;
> +	}
> +
> +	rte_pci_device_name(&pci_dev->addr, dev_name, sizeof(dev_name));
> +
> +	/* Allocate memory to be used privately by drivers */
> +	bbdev = rte_bbdev_allocate(pci_dev->device.name);
> +	if (bbdev == NULL)
> +		return -ENODEV;
> +
> +	/* allocate device private memory */
> +	bbdev->data->dev_private = rte_zmalloc_socket(dev_name,
> +			sizeof(struct acc100_device), RTE_CACHE_LINE_SIZE,
> +			pci_dev->device.numa_node);
> +
> +	if (bbdev->data->dev_private == NULL) {
> +		rte_bbdev_log(CRIT,
> +				"Allocate of %zu bytes for device \"%s\" failed",
> +				sizeof(struct acc100_device), dev_name);
> +				rte_bbdev_release(bbdev);
> +			return -ENOMEM;
> +	}
> +
> +	/* Fill HW specific part of device structure */
> +	bbdev->device = &pci_dev->device;
> +	bbdev->intr_handle = &pci_dev->intr_handle;
> +	bbdev->data->socket_id = pci_dev->device.numa_node;
> +
> +	/* Invoke ACC100 device initialization function */
> +	acc100_bbdev_init(bbdev, pci_drv);
> +
> +	rte_bbdev_log_debug("Initialised bbdev %s (id = %u)",
> +			dev_name, bbdev->data->dev_id);
> +	return 0;
> +}
> +
> +static int acc100_pci_remove(struct rte_pci_device *pci_dev)
> +{
> +	struct rte_bbdev *bbdev;
> +	int ret;
> +	uint8_t dev_id;
> +
> +	if (pci_dev == NULL)
> +		return -EINVAL;
> +
> +	/* Find device */
> +	bbdev = rte_bbdev_get_named_dev(pci_dev->device.name);
> +	if (bbdev == NULL) {
> +		rte_bbdev_log(CRIT,
> +				"Couldn't find HW dev \"%s\" to uninitialise it",
> +				pci_dev->device.name);
> +		return -ENODEV;
> +	}
> +	dev_id = bbdev->data->dev_id;
> +
> +	/* free device private memory before close */
> +	rte_free(bbdev->data->dev_private);
> +
> +	/* Close device */
> +	ret = rte_bbdev_close(dev_id);

Do you want to reorder this close before the rte_free so you could recover from the failure?

Tom

> +	if (ret < 0)
> +		rte_bbdev_log(ERR,
> +				"Device %i failed to close during uninit: %i",
> +				dev_id, ret);
> +
> +	/* release bbdev from library */
> +	rte_bbdev_release(bbdev);
> +
> +	rte_bbdev_log_debug("Destroyed bbdev = %u", dev_id);
> +
> +	return 0;
> +}
> +
> +static struct rte_pci_driver acc100_pci_pf_driver = {
> +		.probe = acc100_pci_probe,
> +		.remove = acc100_pci_remove,
> +		.id_table = pci_id_acc100_pf_map,
> +		.drv_flags = RTE_PCI_DRV_NEED_MAPPING
> +};
> +
> +static struct rte_pci_driver acc100_pci_vf_driver = {
> +		.probe = acc100_pci_probe,
> +		.remove = acc100_pci_remove,
> +		.id_table = pci_id_acc100_vf_map,
> +		.drv_flags = RTE_PCI_DRV_NEED_MAPPING
> +};
> +
> +RTE_PMD_REGISTER_PCI(ACC100PF_DRIVER_NAME, acc100_pci_pf_driver);
> +RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
> +RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
> +RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
> +
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
> new file mode 100644
> index 0000000..6f46df0
> --- /dev/null
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> @@ -0,0 +1,37 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation
> + */
> +
> +#ifndef _RTE_ACC100_PMD_H_
> +#define _RTE_ACC100_PMD_H_
> +
> +/* Helper macro for logging */
> +#define rte_bbdev_log(level, fmt, ...) \
> +	rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \
> +		##__VA_ARGS__)
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +#define rte_bbdev_log_debug(fmt, ...) \
> +		rte_bbdev_log(DEBUG, "acc100_pmd: " fmt, \
> +		##__VA_ARGS__)
> +#else
> +#define rte_bbdev_log_debug(fmt, ...)
> +#endif
> +
> +/* ACC100 PF and VF driver names */
> +#define ACC100PF_DRIVER_NAME           intel_acc100_pf
> +#define ACC100VF_DRIVER_NAME           intel_acc100_vf
> +
> +/* ACC100 PCI vendor & device IDs */
> +#define RTE_ACC100_VENDOR_ID           (0x8086)
> +#define RTE_ACC100_PF_DEVICE_ID        (0x0d5c)
> +#define RTE_ACC100_VF_DEVICE_ID        (0x0d5d)
> +
> +/* Private data structure for each ACC100 device */
> +struct acc100_device {
> +	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
> +	bool pf_device; /**< True if this is a PF ACC100 device */
> +	bool configured; /**< True if this ACC100 device is configured */
> +};
> +
> +#endif /* _RTE_ACC100_PMD_H_ */
> diff --git a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> new file mode 100644
> index 0000000..4a76d1d
> --- /dev/null
> +++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> @@ -0,0 +1,3 @@
> +DPDK_21 {
> +	local: *;
> +};
> diff --git a/drivers/baseband/meson.build b/drivers/baseband/meson.build
> index 415b672..72301ce 100644
> --- a/drivers/baseband/meson.build
> +++ b/drivers/baseband/meson.build
> @@ -5,7 +5,7 @@ if is_windows
>  	subdir_done()
>  endif
>  
> -drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec']
> +drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec', 'acc100']
>  
>  config_flag_fmt = 'RTE_LIBRTE_PMD_BBDEV_@0@'
>  driver_name_fmt = 'rte_pmd_bbdev_@0@'


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v9 02/10] baseband/acc100: add register definition file
  2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 02/10] baseband/acc100: add register definition file Nicolas Chautru
@ 2020-09-29 20:34       ` Tom Rix
  2020-09-29 23:30         ` Chautru, Nicolas
  0 siblings, 1 reply; 213+ messages in thread
From: Tom Rix @ 2020-09-29 20:34 UTC (permalink / raw)
  To: Nicolas Chautru, dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu


On 9/28/20 5:29 PM, Nicolas Chautru wrote:
> Add in the list of registers for the device and related
> HW specs definitions.
>
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> Reviewed-by: Rosen Xu <rosen.xu@intel.com>
> Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
> ---
>  drivers/baseband/acc100/acc100_pf_enum.h | 1068 ++++++++++++++++++++++++++++++
>  drivers/baseband/acc100/acc100_vf_enum.h |   73 ++
>  drivers/baseband/acc100/rte_acc100_pmd.h |  490 ++++++++++++++
>  3 files changed, 1631 insertions(+)
>  create mode 100644 drivers/baseband/acc100/acc100_pf_enum.h
>  create mode 100644 drivers/baseband/acc100/acc100_vf_enum.h
>
> diff --git a/drivers/baseband/acc100/acc100_pf_enum.h b/drivers/baseband/acc100/acc100_pf_enum.h
> new file mode 100644
> index 0000000..a1ee416
> --- /dev/null
> +++ b/drivers/baseband/acc100/acc100_pf_enum.h
> @@ -0,0 +1,1068 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2017 Intel Corporation
> + */
> +
> +#ifndef ACC100_PF_ENUM_H
> +#define ACC100_PF_ENUM_H
> +
> +/*
> + * ACC100 Register mapping on PF BAR0
> + * This is automatically generated from RDL, format may change with new RDL
> + * Release.
> + * Variable names are as is
> + */
> +enum {
> +	HWPfQmgrEgressQueuesTemplate          =  0x0007FE00,
> +	HWPfQmgrIngressAq                     =  0x00080000,
> +	HWPfQmgrArbQAvail                     =  0x00A00010,
> +	HWPfQmgrArbQBlock                     =  0x00A00014,
> +	HWPfQmgrAqueueDropNotifEn             =  0x00A00024,
> +	HWPfQmgrAqueueDisableNotifEn          =  0x00A00028,
> +	HWPfQmgrSoftReset                     =  0x00A00038,
> +	HWPfQmgrInitStatus                    =  0x00A0003C,
> +	HWPfQmgrAramWatchdogCount             =  0x00A00040,
> +	HWPfQmgrAramWatchdogCounterEn         =  0x00A00044,
> +	HWPfQmgrAxiWatchdogCount              =  0x00A00048,
> +	HWPfQmgrAxiWatchdogCounterEn          =  0x00A0004C,
> +	HWPfQmgrProcessWatchdogCount          =  0x00A00050,
> +	HWPfQmgrProcessWatchdogCounterEn      =  0x00A00054,
> +	HWPfQmgrProcessUl4GWatchdogCounter    =  0x00A00058,
> +	HWPfQmgrProcessDl4GWatchdogCounter    =  0x00A0005C,
> +	HWPfQmgrProcessUl5GWatchdogCounter    =  0x00A00060,
> +	HWPfQmgrProcessDl5GWatchdogCounter    =  0x00A00064,
> +	HWPfQmgrProcessMldWatchdogCounter     =  0x00A00068,
> +	HWPfQmgrMsiOverflowUpperVf            =  0x00A00070,
> +	HWPfQmgrMsiOverflowLowerVf            =  0x00A00074,
> +	HWPfQmgrMsiWatchdogOverflow           =  0x00A00078,
> +	HWPfQmgrMsiOverflowEnable             =  0x00A0007C,
> +	HWPfQmgrDebugAqPointerMemGrp          =  0x00A00100,
> +	HWPfQmgrDebugOutputArbQFifoGrp        =  0x00A00140,
> +	HWPfQmgrDebugMsiFifoGrp               =  0x00A00180,
> +	HWPfQmgrDebugAxiWdTimeoutMsiFifo      =  0x00A001C0,
> +	HWPfQmgrDebugProcessWdTimeoutMsiFifo  =  0x00A001C4,
> +	HWPfQmgrDepthLog2Grp                  =  0x00A00200,
> +	HWPfQmgrTholdGrp                      =  0x00A00300,
> +	HWPfQmgrGrpTmplateReg0Indx            =  0x00A00600,
> +	HWPfQmgrGrpTmplateReg1Indx            =  0x00A00680,
> +	HWPfQmgrGrpTmplateReg2indx            =  0x00A00700,
> +	HWPfQmgrGrpTmplateReg3Indx            =  0x00A00780,
> +	HWPfQmgrGrpTmplateReg4Indx            =  0x00A00800,
> +	HWPfQmgrVfBaseAddr                    =  0x00A01000,
> +	HWPfQmgrUl4GWeightRrVf                =  0x00A02000,
> +	HWPfQmgrDl4GWeightRrVf                =  0x00A02100,
> +	HWPfQmgrUl5GWeightRrVf                =  0x00A02200,
> +	HWPfQmgrDl5GWeightRrVf                =  0x00A02300,
> +	HWPfQmgrMldWeightRrVf                 =  0x00A02400,
> +	HWPfQmgrArbQDepthGrp                  =  0x00A02F00,
> +	HWPfQmgrGrpFunction0                  =  0x00A02F40,
> +	HWPfQmgrGrpFunction1                  =  0x00A02F44,
> +	HWPfQmgrGrpPriority                   =  0x00A02F48,
> +	HWPfQmgrWeightSync                    =  0x00A03000,
> +	HWPfQmgrAqEnableVf                    =  0x00A10000,
> +	HWPfQmgrAqResetVf                     =  0x00A20000,
> +	HWPfQmgrRingSizeVf                    =  0x00A20004,
> +	HWPfQmgrGrpDepthLog20Vf               =  0x00A20008,
> +	HWPfQmgrGrpDepthLog21Vf               =  0x00A2000C,
> +	HWPfQmgrGrpFunction0Vf                =  0x00A20010,
> +	HWPfQmgrGrpFunction1Vf                =  0x00A20014,
> +	HWPfDmaConfig0Reg                     =  0x00B80000,
> +	HWPfDmaConfig1Reg                     =  0x00B80004,
> +	HWPfDmaQmgrAddrReg                    =  0x00B80008,
> +	HWPfDmaSoftResetReg                   =  0x00B8000C,
> +	HWPfDmaAxcacheReg                     =  0x00B80010,
> +	HWPfDmaVersionReg                     =  0x00B80014,
> +	HWPfDmaFrameThreshold                 =  0x00B80018,
> +	HWPfDmaTimestampLo                    =  0x00B8001C,
> +	HWPfDmaTimestampHi                    =  0x00B80020,
> +	HWPfDmaAxiStatus                      =  0x00B80028,
> +	HWPfDmaAxiControl                     =  0x00B8002C,
> +	HWPfDmaNoQmgr                         =  0x00B80030,
> +	HWPfDmaQosScale                       =  0x00B80034,
> +	HWPfDmaQmanen                         =  0x00B80040,
> +	HWPfDmaQmgrQosBase                    =  0x00B80060,
> +	HWPfDmaFecClkGatingEnable             =  0x00B80080,
> +	HWPfDmaPmEnable                       =  0x00B80084,
> +	HWPfDmaQosEnable                      =  0x00B80088,
> +	HWPfDmaHarqWeightedRrFrameThreshold   =  0x00B800B0,
> +	HWPfDmaDataSmallWeightedRrFrameThresh  = 0x00B800B4,
> +	HWPfDmaDataLargeWeightedRrFrameThresh  = 0x00B800B8,
> +	HWPfDmaInboundCbMaxSize               =  0x00B800BC,
> +	HWPfDmaInboundDrainDataSize           =  0x00B800C0,
> +	HWPfDmaVfDdrBaseRw                    =  0x00B80400,
> +	HWPfDmaCmplTmOutCnt                   =  0x00B80800,
> +	HWPfDmaProcTmOutCnt                   =  0x00B80804,
> +	HWPfDmaStatusRrespBresp               =  0x00B80810,
> +	HWPfDmaCfgRrespBresp                  =  0x00B80814,
> +	HWPfDmaStatusMemParErr                =  0x00B80818,
> +	HWPfDmaCfgMemParErrEn                 =  0x00B8081C,
> +	HWPfDmaStatusDmaHwErr                 =  0x00B80820,
> +	HWPfDmaCfgDmaHwErrEn                  =  0x00B80824,
> +	HWPfDmaStatusFecCoreErr               =  0x00B80828,
> +	HWPfDmaCfgFecCoreErrEn                =  0x00B8082C,
> +	HWPfDmaStatusFcwDescrErr              =  0x00B80830,
> +	HWPfDmaCfgFcwDescrErrEn               =  0x00B80834,
> +	HWPfDmaStatusBlockTransmit            =  0x00B80838,
> +	HWPfDmaBlockOnErrEn                   =  0x00B8083C,
> +	HWPfDmaStatusFlushDma                 =  0x00B80840,
> +	HWPfDmaFlushDmaOnErrEn                =  0x00B80844,
> +	HWPfDmaStatusSdoneFifoFull            =  0x00B80848,
> +	HWPfDmaStatusDescriptorErrLoVf        =  0x00B8084C,
> +	HWPfDmaStatusDescriptorErrHiVf        =  0x00B80850,
> +	HWPfDmaStatusFcwErrLoVf               =  0x00B80854,
> +	HWPfDmaStatusFcwErrHiVf               =  0x00B80858,
> +	HWPfDmaStatusDataErrLoVf              =  0x00B8085C,
> +	HWPfDmaStatusDataErrHiVf              =  0x00B80860,
> +	HWPfDmaCfgMsiEnSoftwareErr            =  0x00B80864,
> +	HWPfDmaDescriptorSignatuture          =  0x00B80868,
> +	HWPfDmaFcwSignature                   =  0x00B8086C,
> +	HWPfDmaErrorDetectionEn               =  0x00B80870,
> +	HWPfDmaErrCntrlFifoDebug              =  0x00B8087C,
> +	HWPfDmaStatusToutData                 =  0x00B80880,
> +	HWPfDmaStatusToutDesc                 =  0x00B80884,
> +	HWPfDmaStatusToutUnexpData            =  0x00B80888,
> +	HWPfDmaStatusToutUnexpDesc            =  0x00B8088C,
> +	HWPfDmaStatusToutProcess              =  0x00B80890,
> +	HWPfDmaConfigCtoutOutDataEn           =  0x00B808A0,
> +	HWPfDmaConfigCtoutOutDescrEn          =  0x00B808A4,
> +	HWPfDmaConfigUnexpComplDataEn         =  0x00B808A8,
> +	HWPfDmaConfigUnexpComplDescrEn        =  0x00B808AC,
> +	HWPfDmaConfigPtoutOutEn               =  0x00B808B0,
> +	HWPfDmaFec5GulDescBaseLoRegVf         =  0x00B88020,
> +	HWPfDmaFec5GulDescBaseHiRegVf         =  0x00B88024,
> +	HWPfDmaFec5GulRespPtrLoRegVf          =  0x00B88028,
> +	HWPfDmaFec5GulRespPtrHiRegVf          =  0x00B8802C,
> +	HWPfDmaFec5GdlDescBaseLoRegVf         =  0x00B88040,
> +	HWPfDmaFec5GdlDescBaseHiRegVf         =  0x00B88044,
> +	HWPfDmaFec5GdlRespPtrLoRegVf          =  0x00B88048,
> +	HWPfDmaFec5GdlRespPtrHiRegVf          =  0x00B8804C,
> +	HWPfDmaFec4GulDescBaseLoRegVf         =  0x00B88060,
> +	HWPfDmaFec4GulDescBaseHiRegVf         =  0x00B88064,
> +	HWPfDmaFec4GulRespPtrLoRegVf          =  0x00B88068,
> +	HWPfDmaFec4GulRespPtrHiRegVf          =  0x00B8806C,
> +	HWPfDmaFec4GdlDescBaseLoRegVf         =  0x00B88080,
> +	HWPfDmaFec4GdlDescBaseHiRegVf         =  0x00B88084,
> +	HWPfDmaFec4GdlRespPtrLoRegVf          =  0x00B88088,
> +	HWPfDmaFec4GdlRespPtrHiRegVf          =  0x00B8808C,
> +	HWPfDmaVfDdrBaseRangeRo               =  0x00B880A0,
> +	HWPfQosmonACntrlReg                   =  0x00B90000,
> +	HWPfQosmonAEvalOverflow0              =  0x00B90008,
> +	HWPfQosmonAEvalOverflow1              =  0x00B9000C,
> +	HWPfQosmonADivTerm                    =  0x00B90010,
> +	HWPfQosmonATickTerm                   =  0x00B90014,
> +	HWPfQosmonAEvalTerm                   =  0x00B90018,
> +	HWPfQosmonAAveTerm                    =  0x00B9001C,
> +	HWPfQosmonAForceEccErr                =  0x00B90020,
> +	HWPfQosmonAEccErrDetect               =  0x00B90024,
> +	HWPfQosmonAIterationConfig0Low        =  0x00B90060,
> +	HWPfQosmonAIterationConfig0High       =  0x00B90064,
> +	HWPfQosmonAIterationConfig1Low        =  0x00B90068,
> +	HWPfQosmonAIterationConfig1High       =  0x00B9006C,
> +	HWPfQosmonAIterationConfig2Low        =  0x00B90070,
> +	HWPfQosmonAIterationConfig2High       =  0x00B90074,
> +	HWPfQosmonAIterationConfig3Low        =  0x00B90078,
> +	HWPfQosmonAIterationConfig3High       =  0x00B9007C,
> +	HWPfQosmonAEvalMemAddr                =  0x00B90080,
> +	HWPfQosmonAEvalMemData                =  0x00B90084,
> +	HWPfQosmonAXaction                    =  0x00B900C0,
> +	HWPfQosmonARemThres1Vf                =  0x00B90400,
> +	HWPfQosmonAThres2Vf                   =  0x00B90404,
> +	HWPfQosmonAWeiFracVf                  =  0x00B90408,
> +	HWPfQosmonARrWeiVf                    =  0x00B9040C,
> +	HWPfPermonACntrlRegVf                 =  0x00B98000,
> +	HWPfPermonACountVf                    =  0x00B98008,
> +	HWPfPermonAKCntLoVf                   =  0x00B98010,
> +	HWPfPermonAKCntHiVf                   =  0x00B98014,
> +	HWPfPermonADeltaCntLoVf               =  0x00B98020,
> +	HWPfPermonADeltaCntHiVf               =  0x00B98024,
> +	HWPfPermonAVersionReg                 =  0x00B9C000,
> +	HWPfPermonACbControlFec               =  0x00B9C0F0,
> +	HWPfPermonADltTimerLoFec              =  0x00B9C0F4,
> +	HWPfPermonADltTimerHiFec              =  0x00B9C0F8,
> +	HWPfPermonACbCountFec                 =  0x00B9C100,
> +	HWPfPermonAAccExecTimerLoFec          =  0x00B9C104,
> +	HWPfPermonAAccExecTimerHiFec          =  0x00B9C108,
> +	HWPfPermonAExecTimerMinFec            =  0x00B9C200,
> +	HWPfPermonAExecTimerMaxFec            =  0x00B9C204,
> +	HWPfPermonAControlBusMon              =  0x00B9C400,
> +	HWPfPermonAConfigBusMon               =  0x00B9C404,
> +	HWPfPermonASkipCountBusMon            =  0x00B9C408,
> +	HWPfPermonAMinLatBusMon               =  0x00B9C40C,
> +	HWPfPermonAMaxLatBusMon               =  0x00B9C500,
> +	HWPfPermonATotalLatLowBusMon          =  0x00B9C504,
> +	HWPfPermonATotalLatUpperBusMon        =  0x00B9C508,
> +	HWPfPermonATotalReqCntBusMon          =  0x00B9C50C,
> +	HWPfQosmonBCntrlReg                   =  0x00BA0000,
> +	HWPfQosmonBEvalOverflow0              =  0x00BA0008,
> +	HWPfQosmonBEvalOverflow1              =  0x00BA000C,
> +	HWPfQosmonBDivTerm                    =  0x00BA0010,
> +	HWPfQosmonBTickTerm                   =  0x00BA0014,
> +	HWPfQosmonBEvalTerm                   =  0x00BA0018,
> +	HWPfQosmonBAveTerm                    =  0x00BA001C,
> +	HWPfQosmonBForceEccErr                =  0x00BA0020,
> +	HWPfQosmonBEccErrDetect               =  0x00BA0024,
> +	HWPfQosmonBIterationConfig0Low        =  0x00BA0060,
> +	HWPfQosmonBIterationConfig0High       =  0x00BA0064,
> +	HWPfQosmonBIterationConfig1Low        =  0x00BA0068,
> +	HWPfQosmonBIterationConfig1High       =  0x00BA006C,
> +	HWPfQosmonBIterationConfig2Low        =  0x00BA0070,
> +	HWPfQosmonBIterationConfig2High       =  0x00BA0074,
> +	HWPfQosmonBIterationConfig3Low        =  0x00BA0078,
> +	HWPfQosmonBIterationConfig3High       =  0x00BA007C,
> +	HWPfQosmonBEvalMemAddr                =  0x00BA0080,
> +	HWPfQosmonBEvalMemData                =  0x00BA0084,
> +	HWPfQosmonBXaction                    =  0x00BA00C0,
> +	HWPfQosmonBRemThres1Vf                =  0x00BA0400,
> +	HWPfQosmonBThres2Vf                   =  0x00BA0404,
> +	HWPfQosmonBWeiFracVf                  =  0x00BA0408,
> +	HWPfQosmonBRrWeiVf                    =  0x00BA040C,
> +	HWPfPermonBCntrlRegVf                 =  0x00BA8000,
> +	HWPfPermonBCountVf                    =  0x00BA8008,
> +	HWPfPermonBKCntLoVf                   =  0x00BA8010,
> +	HWPfPermonBKCntHiVf                   =  0x00BA8014,
> +	HWPfPermonBDeltaCntLoVf               =  0x00BA8020,
> +	HWPfPermonBDeltaCntHiVf               =  0x00BA8024,
> +	HWPfPermonBVersionReg                 =  0x00BAC000,
> +	HWPfPermonBCbControlFec               =  0x00BAC0F0,
> +	HWPfPermonBDltTimerLoFec              =  0x00BAC0F4,
> +	HWPfPermonBDltTimerHiFec              =  0x00BAC0F8,
> +	HWPfPermonBCbCountFec                 =  0x00BAC100,
> +	HWPfPermonBAccExecTimerLoFec          =  0x00BAC104,
> +	HWPfPermonBAccExecTimerHiFec          =  0x00BAC108,
> +	HWPfPermonBExecTimerMinFec            =  0x00BAC200,
> +	HWPfPermonBExecTimerMaxFec            =  0x00BAC204,
> +	HWPfPermonBControlBusMon              =  0x00BAC400,
> +	HWPfPermonBConfigBusMon               =  0x00BAC404,
> +	HWPfPermonBSkipCountBusMon            =  0x00BAC408,
> +	HWPfPermonBMinLatBusMon               =  0x00BAC40C,
> +	HWPfPermonBMaxLatBusMon               =  0x00BAC500,
> +	HWPfPermonBTotalLatLowBusMon          =  0x00BAC504,
> +	HWPfPermonBTotalLatUpperBusMon        =  0x00BAC508,
> +	HWPfPermonBTotalReqCntBusMon          =  0x00BAC50C,
> +	HWPfFecUl5gCntrlReg                   =  0x00BC0000,
> +	HWPfFecUl5gI2MThreshReg               =  0x00BC0004,
> +	HWPfFecUl5gVersionReg                 =  0x00BC0100,
> +	HWPfFecUl5gFcwStatusReg               =  0x00BC0104,
> +	HWPfFecUl5gWarnReg                    =  0x00BC0108,
> +	HwPfFecUl5gIbDebugReg                 =  0x00BC0200,
> +	HwPfFecUl5gObLlrDebugReg              =  0x00BC0204,
> +	HwPfFecUl5gObHarqDebugReg             =  0x00BC0208,
> +	HwPfFecUl5g1CntrlReg                  =  0x00BC1000,
> +	HwPfFecUl5g1I2MThreshReg              =  0x00BC1004,
> +	HwPfFecUl5g1VersionReg                =  0x00BC1100,
> +	HwPfFecUl5g1FcwStatusReg              =  0x00BC1104,
> +	HwPfFecUl5g1WarnReg                   =  0x00BC1108,
> +	HwPfFecUl5g1IbDebugReg                =  0x00BC1200,
> +	HwPfFecUl5g1ObLlrDebugReg             =  0x00BC1204,
> +	HwPfFecUl5g1ObHarqDebugReg            =  0x00BC1208,
> +	HwPfFecUl5g2CntrlReg                  =  0x00BC2000,
> +	HwPfFecUl5g2I2MThreshReg              =  0x00BC2004,
> +	HwPfFecUl5g2VersionReg                =  0x00BC2100,
> +	HwPfFecUl5g2FcwStatusReg              =  0x00BC2104,
> +	HwPfFecUl5g2WarnReg                   =  0x00BC2108,
> +	HwPfFecUl5g2IbDebugReg                =  0x00BC2200,
> +	HwPfFecUl5g2ObLlrDebugReg             =  0x00BC2204,
> +	HwPfFecUl5g2ObHarqDebugReg            =  0x00BC2208,
> +	HwPfFecUl5g3CntrlReg                  =  0x00BC3000,
> +	HwPfFecUl5g3I2MThreshReg              =  0x00BC3004,
> +	HwPfFecUl5g3VersionReg                =  0x00BC3100,
> +	HwPfFecUl5g3FcwStatusReg              =  0x00BC3104,
> +	HwPfFecUl5g3WarnReg                   =  0x00BC3108,
> +	HwPfFecUl5g3IbDebugReg                =  0x00BC3200,
> +	HwPfFecUl5g3ObLlrDebugReg             =  0x00BC3204,
> +	HwPfFecUl5g3ObHarqDebugReg            =  0x00BC3208,
> +	HwPfFecUl5g4CntrlReg                  =  0x00BC4000,
> +	HwPfFecUl5g4I2MThreshReg              =  0x00BC4004,
> +	HwPfFecUl5g4VersionReg                =  0x00BC4100,
> +	HwPfFecUl5g4FcwStatusReg              =  0x00BC4104,
> +	HwPfFecUl5g4WarnReg                   =  0x00BC4108,
> +	HwPfFecUl5g4IbDebugReg                =  0x00BC4200,
> +	HwPfFecUl5g4ObLlrDebugReg             =  0x00BC4204,
> +	HwPfFecUl5g4ObHarqDebugReg            =  0x00BC4208,
> +	HwPfFecUl5g5CntrlReg                  =  0x00BC5000,
> +	HwPfFecUl5g5I2MThreshReg              =  0x00BC5004,
> +	HwPfFecUl5g5VersionReg                =  0x00BC5100,
> +	HwPfFecUl5g5FcwStatusReg              =  0x00BC5104,
> +	HwPfFecUl5g5WarnReg                   =  0x00BC5108,
> +	HwPfFecUl5g5IbDebugReg                =  0x00BC5200,
> +	HwPfFecUl5g5ObLlrDebugReg             =  0x00BC5204,
> +	HwPfFecUl5g5ObHarqDebugReg            =  0x00BC5208,
> +	HwPfFecUl5g6CntrlReg                  =  0x00BC6000,
> +	HwPfFecUl5g6I2MThreshReg              =  0x00BC6004,
> +	HwPfFecUl5g6VersionReg                =  0x00BC6100,
> +	HwPfFecUl5g6FcwStatusReg              =  0x00BC6104,
> +	HwPfFecUl5g6WarnReg                   =  0x00BC6108,
> +	HwPfFecUl5g6IbDebugReg                =  0x00BC6200,
> +	HwPfFecUl5g6ObLlrDebugReg             =  0x00BC6204,
> +	HwPfFecUl5g6ObHarqDebugReg            =  0x00BC6208,
> +	HwPfFecUl5g7CntrlReg                  =  0x00BC7000,
> +	HwPfFecUl5g7I2MThreshReg              =  0x00BC7004,
> +	HwPfFecUl5g7VersionReg                =  0x00BC7100,
> +	HwPfFecUl5g7FcwStatusReg              =  0x00BC7104,
> +	HwPfFecUl5g7WarnReg                   =  0x00BC7108,
> +	HwPfFecUl5g7IbDebugReg                =  0x00BC7200,
> +	HwPfFecUl5g7ObLlrDebugReg             =  0x00BC7204,
> +	HwPfFecUl5g7ObHarqDebugReg            =  0x00BC7208,
> +	HwPfFecUl5g8CntrlReg                  =  0x00BC8000,
> +	HwPfFecUl5g8I2MThreshReg              =  0x00BC8004,
> +	HwPfFecUl5g8VersionReg                =  0x00BC8100,
> +	HwPfFecUl5g8FcwStatusReg              =  0x00BC8104,
> +	HwPfFecUl5g8WarnReg                   =  0x00BC8108,
> +	HwPfFecUl5g8IbDebugReg                =  0x00BC8200,
> +	HwPfFecUl5g8ObLlrDebugReg             =  0x00BC8204,
> +	HwPfFecUl5g8ObHarqDebugReg            =  0x00BC8208,
> +	HWPfFecDl5gCntrlReg                   =  0x00BCF000,
> +	HWPfFecDl5gI2MThreshReg               =  0x00BCF004,
> +	HWPfFecDl5gVersionReg                 =  0x00BCF100,
> +	HWPfFecDl5gFcwStatusReg               =  0x00BCF104,
> +	HWPfFecDl5gWarnReg                    =  0x00BCF108,
> +	HWPfFecUlVersionReg                   =  0x00BD0000,
> +	HWPfFecUlControlReg                   =  0x00BD0004,
> +	HWPfFecUlStatusReg                    =  0x00BD0008,
> +	HWPfFecDlVersionReg                   =  0x00BDF000,
> +	HWPfFecDlClusterConfigReg             =  0x00BDF004,
> +	HWPfFecDlBurstThres                   =  0x00BDF00C,
> +	HWPfFecDlClusterStatusReg0            =  0x00BDF040,
> +	HWPfFecDlClusterStatusReg1            =  0x00BDF044,
> +	HWPfFecDlClusterStatusReg2            =  0x00BDF048,
> +	HWPfFecDlClusterStatusReg3            =  0x00BDF04C,
> +	HWPfFecDlClusterStatusReg4            =  0x00BDF050,
> +	HWPfFecDlClusterStatusReg5            =  0x00BDF054,
> +	HWPfChaFabPllPllrst                   =  0x00C40000,
> +	HWPfChaFabPllClk0                     =  0x00C40004,
> +	HWPfChaFabPllClk1                     =  0x00C40008,
> +	HWPfChaFabPllBwadj                    =  0x00C4000C,
> +	HWPfChaFabPllLbw                      =  0x00C40010,
> +	HWPfChaFabPllResetq                   =  0x00C40014,
> +	HWPfChaFabPllPhshft0                  =  0x00C40018,
> +	HWPfChaFabPllPhshft1                  =  0x00C4001C,
> +	HWPfChaFabPllDivq0                    =  0x00C40020,
> +	HWPfChaFabPllDivq1                    =  0x00C40024,
> +	HWPfChaFabPllDivq2                    =  0x00C40028,
> +	HWPfChaFabPllDivq3                    =  0x00C4002C,
> +	HWPfChaFabPllDivq4                    =  0x00C40030,
> +	HWPfChaFabPllDivq5                    =  0x00C40034,
> +	HWPfChaFabPllDivq6                    =  0x00C40038,
> +	HWPfChaFabPllDivq7                    =  0x00C4003C,
> +	HWPfChaDl5gPllPllrst                  =  0x00C40080,
> +	HWPfChaDl5gPllClk0                    =  0x00C40084,
> +	HWPfChaDl5gPllClk1                    =  0x00C40088,
> +	HWPfChaDl5gPllBwadj                   =  0x00C4008C,
> +	HWPfChaDl5gPllLbw                     =  0x00C40090,
> +	HWPfChaDl5gPllResetq                  =  0x00C40094,
> +	HWPfChaDl5gPllPhshft0                 =  0x00C40098,
> +	HWPfChaDl5gPllPhshft1                 =  0x00C4009C,
> +	HWPfChaDl5gPllDivq0                   =  0x00C400A0,
> +	HWPfChaDl5gPllDivq1                   =  0x00C400A4,
> +	HWPfChaDl5gPllDivq2                   =  0x00C400A8,
> +	HWPfChaDl5gPllDivq3                   =  0x00C400AC,
> +	HWPfChaDl5gPllDivq4                   =  0x00C400B0,
> +	HWPfChaDl5gPllDivq5                   =  0x00C400B4,
> +	HWPfChaDl5gPllDivq6                   =  0x00C400B8,
> +	HWPfChaDl5gPllDivq7                   =  0x00C400BC,
> +	HWPfChaDl4gPllPllrst                  =  0x00C40100,
> +	HWPfChaDl4gPllClk0                    =  0x00C40104,
> +	HWPfChaDl4gPllClk1                    =  0x00C40108,
> +	HWPfChaDl4gPllBwadj                   =  0x00C4010C,
> +	HWPfChaDl4gPllLbw                     =  0x00C40110,
> +	HWPfChaDl4gPllResetq                  =  0x00C40114,
> +	HWPfChaDl4gPllPhshft0                 =  0x00C40118,
> +	HWPfChaDl4gPllPhshft1                 =  0x00C4011C,
> +	HWPfChaDl4gPllDivq0                   =  0x00C40120,
> +	HWPfChaDl4gPllDivq1                   =  0x00C40124,
> +	HWPfChaDl4gPllDivq2                   =  0x00C40128,
> +	HWPfChaDl4gPllDivq3                   =  0x00C4012C,
> +	HWPfChaDl4gPllDivq4                   =  0x00C40130,
> +	HWPfChaDl4gPllDivq5                   =  0x00C40134,
> +	HWPfChaDl4gPllDivq6                   =  0x00C40138,
> +	HWPfChaDl4gPllDivq7                   =  0x00C4013C,
> +	HWPfChaUl5gPllPllrst                  =  0x00C40180,
> +	HWPfChaUl5gPllClk0                    =  0x00C40184,
> +	HWPfChaUl5gPllClk1                    =  0x00C40188,
> +	HWPfChaUl5gPllBwadj                   =  0x00C4018C,
> +	HWPfChaUl5gPllLbw                     =  0x00C40190,
> +	HWPfChaUl5gPllResetq                  =  0x00C40194,
> +	HWPfChaUl5gPllPhshft0                 =  0x00C40198,
> +	HWPfChaUl5gPllPhshft1                 =  0x00C4019C,
> +	HWPfChaUl5gPllDivq0                   =  0x00C401A0,
> +	HWPfChaUl5gPllDivq1                   =  0x00C401A4,
> +	HWPfChaUl5gPllDivq2                   =  0x00C401A8,
> +	HWPfChaUl5gPllDivq3                   =  0x00C401AC,
> +	HWPfChaUl5gPllDivq4                   =  0x00C401B0,
> +	HWPfChaUl5gPllDivq5                   =  0x00C401B4,
> +	HWPfChaUl5gPllDivq6                   =  0x00C401B8,
> +	HWPfChaUl5gPllDivq7                   =  0x00C401BC,
> +	HWPfChaUl4gPllPllrst                  =  0x00C40200,
> +	HWPfChaUl4gPllClk0                    =  0x00C40204,
> +	HWPfChaUl4gPllClk1                    =  0x00C40208,
> +	HWPfChaUl4gPllBwadj                   =  0x00C4020C,
> +	HWPfChaUl4gPllLbw                     =  0x00C40210,
> +	HWPfChaUl4gPllResetq                  =  0x00C40214,
> +	HWPfChaUl4gPllPhshft0                 =  0x00C40218,
> +	HWPfChaUl4gPllPhshft1                 =  0x00C4021C,
> +	HWPfChaUl4gPllDivq0                   =  0x00C40220,
> +	HWPfChaUl4gPllDivq1                   =  0x00C40224,
> +	HWPfChaUl4gPllDivq2                   =  0x00C40228,
> +	HWPfChaUl4gPllDivq3                   =  0x00C4022C,
> +	HWPfChaUl4gPllDivq4                   =  0x00C40230,
> +	HWPfChaUl4gPllDivq5                   =  0x00C40234,
> +	HWPfChaUl4gPllDivq6                   =  0x00C40238,
> +	HWPfChaUl4gPllDivq7                   =  0x00C4023C,
> +	HWPfChaDdrPllPllrst                   =  0x00C40280,
> +	HWPfChaDdrPllClk0                     =  0x00C40284,
> +	HWPfChaDdrPllClk1                     =  0x00C40288,
> +	HWPfChaDdrPllBwadj                    =  0x00C4028C,
> +	HWPfChaDdrPllLbw                      =  0x00C40290,
> +	HWPfChaDdrPllResetq                   =  0x00C40294,
> +	HWPfChaDdrPllPhshft0                  =  0x00C40298,
> +	HWPfChaDdrPllPhshft1                  =  0x00C4029C,
> +	HWPfChaDdrPllDivq0                    =  0x00C402A0,
> +	HWPfChaDdrPllDivq1                    =  0x00C402A4,
> +	HWPfChaDdrPllDivq2                    =  0x00C402A8,
> +	HWPfChaDdrPllDivq3                    =  0x00C402AC,
> +	HWPfChaDdrPllDivq4                    =  0x00C402B0,
> +	HWPfChaDdrPllDivq5                    =  0x00C402B4,
> +	HWPfChaDdrPllDivq6                    =  0x00C402B8,
> +	HWPfChaDdrPllDivq7                    =  0x00C402BC,
> +	HWPfChaErrStatus                      =  0x00C40400,
> +	HWPfChaErrMask                        =  0x00C40404,
> +	HWPfChaDebugPcieMsiFifo               =  0x00C40410,
> +	HWPfChaDebugDdrMsiFifo                =  0x00C40414,
> +	HWPfChaDebugMiscMsiFifo               =  0x00C40418,
> +	HWPfChaPwmSet                         =  0x00C40420,
> +	HWPfChaDdrRstStatus                   =  0x00C40430,
> +	HWPfChaDdrStDoneStatus                =  0x00C40434,
> +	HWPfChaDdrWbRstCfg                    =  0x00C40438,
> +	HWPfChaDdrApbRstCfg                   =  0x00C4043C,
> +	HWPfChaDdrPhyRstCfg                   =  0x00C40440,
> +	HWPfChaDdrCpuRstCfg                   =  0x00C40444,
> +	HWPfChaDdrSifRstCfg                   =  0x00C40448,
> +	HWPfChaPadcfgPcomp0                   =  0x00C41000,
> +	HWPfChaPadcfgNcomp0                   =  0x00C41004,
> +	HWPfChaPadcfgOdt0                     =  0x00C41008,
> +	HWPfChaPadcfgProtect0                 =  0x00C4100C,
> +	HWPfChaPreemphasisProtect0            =  0x00C41010,
> +	HWPfChaPreemphasisCompen0             =  0x00C41040,
> +	HWPfChaPreemphasisOdten0              =  0x00C41044,
> +	HWPfChaPadcfgPcomp1                   =  0x00C41100,
> +	HWPfChaPadcfgNcomp1                   =  0x00C41104,
> +	HWPfChaPadcfgOdt1                     =  0x00C41108,
> +	HWPfChaPadcfgProtect1                 =  0x00C4110C,
> +	HWPfChaPreemphasisProtect1            =  0x00C41110,
> +	HWPfChaPreemphasisCompen1             =  0x00C41140,
> +	HWPfChaPreemphasisOdten1              =  0x00C41144,
> +	HWPfChaPadcfgPcomp2                   =  0x00C41200,
> +	HWPfChaPadcfgNcomp2                   =  0x00C41204,
> +	HWPfChaPadcfgOdt2                     =  0x00C41208,
> +	HWPfChaPadcfgProtect2                 =  0x00C4120C,
> +	HWPfChaPreemphasisProtect2            =  0x00C41210,
> +	HWPfChaPreemphasisCompen2             =  0x00C41240,
> +	HWPfChaPreemphasisOdten2              =  0x00C41244,
> +	HWPfChaPadcfgPcomp3                   =  0x00C41300,
> +	HWPfChaPadcfgNcomp3                   =  0x00C41304,
> +	HWPfChaPadcfgOdt3                     =  0x00C41308,
> +	HWPfChaPadcfgProtect3                 =  0x00C4130C,
> +	HWPfChaPreemphasisProtect3            =  0x00C41310,
> +	HWPfChaPreemphasisCompen3             =  0x00C41340,
> +	HWPfChaPreemphasisOdten3              =  0x00C41344,
> +	HWPfChaPadcfgPcomp4                   =  0x00C41400,
> +	HWPfChaPadcfgNcomp4                   =  0x00C41404,
> +	HWPfChaPadcfgOdt4                     =  0x00C41408,
> +	HWPfChaPadcfgProtect4                 =  0x00C4140C,
> +	HWPfChaPreemphasisProtect4            =  0x00C41410,
> +	HWPfChaPreemphasisCompen4             =  0x00C41440,
> +	HWPfChaPreemphasisOdten4              =  0x00C41444,
> +	HWPfHiVfToPfDbellVf                   =  0x00C80000,
> +	HWPfHiPfToVfDbellVf                   =  0x00C80008,
> +	HWPfHiInfoRingBaseLoVf                =  0x00C80010,
> +	HWPfHiInfoRingBaseHiVf                =  0x00C80014,
> +	HWPfHiInfoRingPointerVf               =  0x00C80018,
> +	HWPfHiInfoRingIntWrEnVf               =  0x00C80020,
> +	HWPfHiInfoRingPf2VfWrEnVf             =  0x00C80024,
> +	HWPfHiMsixVectorMapperVf              =  0x00C80060,
> +	HWPfHiModuleVersionReg                =  0x00C84000,
> +	HWPfHiIosf2axiErrLogReg               =  0x00C84004,
> +	HWPfHiHardResetReg                    =  0x00C84008,
> +	HWPfHi5GHardResetReg                  =  0x00C8400C,
> +	HWPfHiInfoRingBaseLoRegPf             =  0x00C84010,
> +	HWPfHiInfoRingBaseHiRegPf             =  0x00C84014,
> +	HWPfHiInfoRingPointerRegPf            =  0x00C84018,
> +	HWPfHiInfoRingIntWrEnRegPf            =  0x00C84020,
> +	HWPfHiInfoRingVf2pfLoWrEnReg          =  0x00C84024,
> +	HWPfHiInfoRingVf2pfHiWrEnReg          =  0x00C84028,
> +	HWPfHiLogParityErrStatusReg           =  0x00C8402C,
> +	HWPfHiLogDataParityErrorVfStatusLo    =  0x00C84030,
> +	HWPfHiLogDataParityErrorVfStatusHi    =  0x00C84034,
> +	HWPfHiBlockTransmitOnErrorEn          =  0x00C84038,
> +	HWPfHiCfgMsiIntWrEnRegPf              =  0x00C84040,
> +	HWPfHiCfgMsiVf2pfLoWrEnReg            =  0x00C84044,
> +	HWPfHiCfgMsiVf2pfHighWrEnReg          =  0x00C84048,
> +	HWPfHiMsixVectorMapperPf              =  0x00C84060,
> +	HWPfHiApbWrWaitTime                   =  0x00C84100,
> +	HWPfHiXCounterMaxValue                =  0x00C84104,
> +	HWPfHiPfMode                          =  0x00C84108,
> +	HWPfHiClkGateHystReg                  =  0x00C8410C,
> +	HWPfHiSnoopBitsReg                    =  0x00C84110,
> +	HWPfHiMsiDropEnableReg                =  0x00C84114,
> +	HWPfHiMsiStatReg                      =  0x00C84120,
> +	HWPfHiFifoOflStatReg                  =  0x00C84124,
> +	HWPfHiHiDebugReg                      =  0x00C841F4,
> +	HWPfHiDebugMemSnoopMsiFifo            =  0x00C841F8,
> +	HWPfHiDebugMemSnoopInputFifo          =  0x00C841FC,
> +	HWPfHiMsixMappingConfig               =  0x00C84200,
> +	HWPfHiJunkReg                         =  0x00C8FF00,
> +	HWPfDdrUmmcVer                        =  0x00D00000,
> +	HWPfDdrUmmcCap                        =  0x00D00010,
> +	HWPfDdrUmmcCtrl                       =  0x00D00020,
> +	HWPfDdrMpcPe                          =  0x00D00080,
> +	HWPfDdrMpcPpri3                       =  0x00D00090,
> +	HWPfDdrMpcPpri2                       =  0x00D000A0,
> +	HWPfDdrMpcPpri1                       =  0x00D000B0,
> +	HWPfDdrMpcPpri0                       =  0x00D000C0,
> +	HWPfDdrMpcPrwgrpCtrl                  =  0x00D000D0,
> +	HWPfDdrMpcPbw7                        =  0x00D000E0,
> +	HWPfDdrMpcPbw6                        =  0x00D000F0,
> +	HWPfDdrMpcPbw5                        =  0x00D00100,
> +	HWPfDdrMpcPbw4                        =  0x00D00110,
> +	HWPfDdrMpcPbw3                        =  0x00D00120,
> +	HWPfDdrMpcPbw2                        =  0x00D00130,
> +	HWPfDdrMpcPbw1                        =  0x00D00140,
> +	HWPfDdrMpcPbw0                        =  0x00D00150,
> +	HWPfDdrMemoryInit                     =  0x00D00200,
> +	HWPfDdrMemoryInitDone                 =  0x00D00210,
> +	HWPfDdrMemInitPhyTrng0                =  0x00D00240,
> +	HWPfDdrMemInitPhyTrng1                =  0x00D00250,
> +	HWPfDdrMemInitPhyTrng2                =  0x00D00260,
> +	HWPfDdrMemInitPhyTrng3                =  0x00D00270,
> +	HWPfDdrBcDram                         =  0x00D003C0,
> +	HWPfDdrBcAddrMap                      =  0x00D003D0,
> +	HWPfDdrBcRef                          =  0x00D003E0,
> +	HWPfDdrBcTim0                         =  0x00D00400,
> +	HWPfDdrBcTim1                         =  0x00D00410,
> +	HWPfDdrBcTim2                         =  0x00D00420,
> +	HWPfDdrBcTim3                         =  0x00D00430,
> +	HWPfDdrBcTim4                         =  0x00D00440,
> +	HWPfDdrBcTim5                         =  0x00D00450,
> +	HWPfDdrBcTim6                         =  0x00D00460,
> +	HWPfDdrBcTim7                         =  0x00D00470,
> +	HWPfDdrBcTim8                         =  0x00D00480,
> +	HWPfDdrBcTim9                         =  0x00D00490,
> +	HWPfDdrBcTim10                        =  0x00D004A0,
> +	HWPfDdrBcTim12                        =  0x00D004C0,
> +	HWPfDdrDfiInit                        =  0x00D004D0,
> +	HWPfDdrDfiInitComplete                =  0x00D004E0,
> +	HWPfDdrDfiTim0                        =  0x00D004F0,
> +	HWPfDdrDfiTim1                        =  0x00D00500,
> +	HWPfDdrDfiPhyUpdEn                    =  0x00D00530,
> +	HWPfDdrMemStatus                      =  0x00D00540,
> +	HWPfDdrUmmcErrStatus                  =  0x00D00550,
> +	HWPfDdrUmmcIntStatus                  =  0x00D00560,
> +	HWPfDdrUmmcIntEn                      =  0x00D00570,
> +	HWPfDdrPhyRdLatency                   =  0x00D48400,
> +	HWPfDdrPhyRdLatencyDbi                =  0x00D48410,
> +	HWPfDdrPhyWrLatency                   =  0x00D48420,
> +	HWPfDdrPhyTrngType                    =  0x00D48430,
> +	HWPfDdrPhyMrsTiming2                  =  0x00D48440,
> +	HWPfDdrPhyMrsTiming0                  =  0x00D48450,
> +	HWPfDdrPhyMrsTiming1                  =  0x00D48460,
> +	HWPfDdrPhyDramTmrd                    =  0x00D48470,
> +	HWPfDdrPhyDramTmod                    =  0x00D48480,
> +	HWPfDdrPhyDramTwpre                   =  0x00D48490,
> +	HWPfDdrPhyDramTrfc                    =  0x00D484A0,
> +	HWPfDdrPhyDramTrwtp                   =  0x00D484B0,
> +	HWPfDdrPhyMr01Dimm                    =  0x00D484C0,
> +	HWPfDdrPhyMr01DimmDbi                 =  0x00D484D0,
> +	HWPfDdrPhyMr23Dimm                    =  0x00D484E0,
> +	HWPfDdrPhyMr45Dimm                    =  0x00D484F0,
> +	HWPfDdrPhyMr67Dimm                    =  0x00D48500,
> +	HWPfDdrPhyWrlvlWwRdlvlRr              =  0x00D48510,
> +	HWPfDdrPhyOdtEn                       =  0x00D48520,
> +	HWPfDdrPhyFastTrng                    =  0x00D48530,
> +	HWPfDdrPhyDynTrngGap                  =  0x00D48540,
> +	HWPfDdrPhyDynRcalGap                  =  0x00D48550,
> +	HWPfDdrPhyIdletimeout                 =  0x00D48560,
> +	HWPfDdrPhyRstCkeGap                   =  0x00D48570,
> +	HWPfDdrPhyCkeMrsGap                   =  0x00D48580,
> +	HWPfDdrPhyMemVrefMidVal               =  0x00D48590,
> +	HWPfDdrPhyVrefStep                    =  0x00D485A0,
> +	HWPfDdrPhyVrefThreshold               =  0x00D485B0,
> +	HWPfDdrPhyPhyVrefMidVal               =  0x00D485C0,
> +	HWPfDdrPhyDqsCountMax                 =  0x00D485D0,
> +	HWPfDdrPhyDqsCountNum                 =  0x00D485E0,
> +	HWPfDdrPhyDramRow                     =  0x00D485F0,
> +	HWPfDdrPhyDramCol                     =  0x00D48600,
> +	HWPfDdrPhyDramBgBa                    =  0x00D48610,
> +	HWPfDdrPhyDynamicUpdreqrel            =  0x00D48620,
> +	HWPfDdrPhyVrefLimits                  =  0x00D48630,
> +	HWPfDdrPhyIdtmTcStatus                =  0x00D6C020,
> +	HWPfDdrPhyIdtmFwVersion               =  0x00D6C410,
> +	HWPfDdrPhyRdlvlGateInitDelay          =  0x00D70000,
> +	HWPfDdrPhyRdenSmplabc                 =  0x00D70008,
> +	HWPfDdrPhyVrefNibble0                 =  0x00D7000C,
> +	HWPfDdrPhyVrefNibble1                 =  0x00D70010,
> +	HWPfDdrPhyRdlvlGateDqsSmpl0           =  0x00D70014,
> +	HWPfDdrPhyRdlvlGateDqsSmpl1           =  0x00D70018,
> +	HWPfDdrPhyRdlvlGateDqsSmpl2           =  0x00D7001C,
> +	HWPfDdrPhyDqsCount                    =  0x00D70020,
> +	HWPfDdrPhyWrlvlRdlvlGateStatus        =  0x00D70024,
> +	HWPfDdrPhyErrorFlags                  =  0x00D70028,
> +	HWPfDdrPhyPowerDown                   =  0x00D70030,
> +	HWPfDdrPhyPrbsSeedByte0               =  0x00D70034,
> +	HWPfDdrPhyPrbsSeedByte1               =  0x00D70038,
> +	HWPfDdrPhyPcompDq                     =  0x00D70040,
> +	HWPfDdrPhyNcompDq                     =  0x00D70044,
> +	HWPfDdrPhyPcompDqs                    =  0x00D70048,
> +	HWPfDdrPhyNcompDqs                    =  0x00D7004C,
> +	HWPfDdrPhyPcompCmd                    =  0x00D70050,
> +	HWPfDdrPhyNcompCmd                    =  0x00D70054,
> +	HWPfDdrPhyPcompCk                     =  0x00D70058,
> +	HWPfDdrPhyNcompCk                     =  0x00D7005C,
> +	HWPfDdrPhyRcalOdtDq                   =  0x00D70060,
> +	HWPfDdrPhyRcalOdtDqs                  =  0x00D70064,
> +	HWPfDdrPhyRcalMask1                   =  0x00D70068,
> +	HWPfDdrPhyRcalMask2                   =  0x00D7006C,
> +	HWPfDdrPhyRcalCtrl                    =  0x00D70070,
> +	HWPfDdrPhyRcalCnt                     =  0x00D70074,
> +	HWPfDdrPhyRcalOverride                =  0x00D70078,
> +	HWPfDdrPhyRcalGateen                  =  0x00D7007C,
> +	HWPfDdrPhyCtrl                        =  0x00D70080,
> +	HWPfDdrPhyWrlvlAlg                    =  0x00D70084,
> +	HWPfDdrPhyRcalVreftTxcmdOdt           =  0x00D70088,
> +	HWPfDdrPhyRdlvlGateParam              =  0x00D7008C,
> +	HWPfDdrPhyRdlvlGateParam2             =  0x00D70090,
> +	HWPfDdrPhyRcalVreftTxdata             =  0x00D70094,
> +	HWPfDdrPhyCmdIntDelay                 =  0x00D700A4,
> +	HWPfDdrPhyAlertN                      =  0x00D700A8,
> +	HWPfDdrPhyTrngReqWpre2tck             =  0x00D700AC,
> +	HWPfDdrPhyCmdPhaseSel                 =  0x00D700B4,
> +	HWPfDdrPhyCmdDcdl                     =  0x00D700B8,
> +	HWPfDdrPhyCkDcdl                      =  0x00D700BC,
> +	HWPfDdrPhySwTrngCtrl1                 =  0x00D700C0,
> +	HWPfDdrPhySwTrngCtrl2                 =  0x00D700C4,
> +	HWPfDdrPhyRcalPcompRden               =  0x00D700C8,
> +	HWPfDdrPhyRcalNcompRden               =  0x00D700CC,
> +	HWPfDdrPhyRcalCompen                  =  0x00D700D0,
> +	HWPfDdrPhySwTrngRdqs                  =  0x00D700D4,
> +	HWPfDdrPhySwTrngWdqs                  =  0x00D700D8,
> +	HWPfDdrPhySwTrngRdena                 =  0x00D700DC,
> +	HWPfDdrPhySwTrngRdenb                 =  0x00D700E0,
> +	HWPfDdrPhySwTrngRdenc                 =  0x00D700E4,
> +	HWPfDdrPhySwTrngWdq                   =  0x00D700E8,
> +	HWPfDdrPhySwTrngRdq                   =  0x00D700EC,
> +	HWPfDdrPhyPcfgHmValue                 =  0x00D700F0,
> +	HWPfDdrPhyPcfgTimerValue              =  0x00D700F4,
> +	HWPfDdrPhyPcfgSoftwareTraining        =  0x00D700F8,
> +	HWPfDdrPhyPcfgMcStatus                =  0x00D700FC,
> +	HWPfDdrPhyWrlvlPhRank0                =  0x00D70100,
> +	HWPfDdrPhyRdenPhRank0                 =  0x00D70104,
> +	HWPfDdrPhyRdenIntRank0                =  0x00D70108,
> +	HWPfDdrPhyRdqsDcdlRank0               =  0x00D7010C,
> +	HWPfDdrPhyRdqsShadowDcdlRank0         =  0x00D70110,
> +	HWPfDdrPhyWdqsDcdlRank0               =  0x00D70114,
> +	HWPfDdrPhyWdmDcdlShadowRank0          =  0x00D70118,
> +	HWPfDdrPhyWdmDcdlRank0                =  0x00D7011C,
> +	HWPfDdrPhyDbiDcdlRank0                =  0x00D70120,
> +	HWPfDdrPhyRdenDcdlaRank0              =  0x00D70124,
> +	HWPfDdrPhyDbiDcdlShadowRank0          =  0x00D70128,
> +	HWPfDdrPhyRdenDcdlbRank0              =  0x00D7012C,
> +	HWPfDdrPhyWdqsShadowDcdlRank0         =  0x00D70130,
> +	HWPfDdrPhyRdenDcdlcRank0              =  0x00D70134,
> +	HWPfDdrPhyRdenShadowDcdlaRank0        =  0x00D70138,
> +	HWPfDdrPhyWrlvlIntRank0               =  0x00D7013C,
> +	HWPfDdrPhyRdqDcdlBit0Rank0            =  0x00D70200,
> +	HWPfDdrPhyRdqDcdlShadowBit0Rank0      =  0x00D70204,
> +	HWPfDdrPhyWdqDcdlBit0Rank0            =  0x00D70208,
> +	HWPfDdrPhyWdqDcdlShadowBit0Rank0      =  0x00D7020C,
> +	HWPfDdrPhyRdqDcdlBit1Rank0            =  0x00D70240,
> +	HWPfDdrPhyRdqDcdlShadowBit1Rank0      =  0x00D70244,
> +	HWPfDdrPhyWdqDcdlBit1Rank0            =  0x00D70248,
> +	HWPfDdrPhyWdqDcdlShadowBit1Rank0      =  0x00D7024C,
> +	HWPfDdrPhyRdqDcdlBit2Rank0            =  0x00D70280,
> +	HWPfDdrPhyRdqDcdlShadowBit2Rank0      =  0x00D70284,
> +	HWPfDdrPhyWdqDcdlBit2Rank0            =  0x00D70288,
> +	HWPfDdrPhyWdqDcdlShadowBit2Rank0      =  0x00D7028C,
> +	HWPfDdrPhyRdqDcdlBit3Rank0            =  0x00D702C0,
> +	HWPfDdrPhyRdqDcdlShadowBit3Rank0      =  0x00D702C4,
> +	HWPfDdrPhyWdqDcdlBit3Rank0            =  0x00D702C8,
> +	HWPfDdrPhyWdqDcdlShadowBit3Rank0      =  0x00D702CC,
> +	HWPfDdrPhyRdqDcdlBit4Rank0            =  0x00D70300,
> +	HWPfDdrPhyRdqDcdlShadowBit4Rank0      =  0x00D70304,
> +	HWPfDdrPhyWdqDcdlBit4Rank0            =  0x00D70308,
> +	HWPfDdrPhyWdqDcdlShadowBit4Rank0      =  0x00D7030C,
> +	HWPfDdrPhyRdqDcdlBit5Rank0            =  0x00D70340,
> +	HWPfDdrPhyRdqDcdlShadowBit5Rank0      =  0x00D70344,
> +	HWPfDdrPhyWdqDcdlBit5Rank0            =  0x00D70348,
> +	HWPfDdrPhyWdqDcdlShadowBit5Rank0      =  0x00D7034C,
> +	HWPfDdrPhyRdqDcdlBit6Rank0            =  0x00D70380,
> +	HWPfDdrPhyRdqDcdlShadowBit6Rank0      =  0x00D70384,
> +	HWPfDdrPhyWdqDcdlBit6Rank0            =  0x00D70388,
> +	HWPfDdrPhyWdqDcdlShadowBit6Rank0      =  0x00D7038C,
> +	HWPfDdrPhyRdqDcdlBit7Rank0            =  0x00D703C0,
> +	HWPfDdrPhyRdqDcdlShadowBit7Rank0      =  0x00D703C4,
> +	HWPfDdrPhyWdqDcdlBit7Rank0            =  0x00D703C8,
> +	HWPfDdrPhyWdqDcdlShadowBit7Rank0      =  0x00D703CC,
> +	HWPfDdrPhyIdtmStatus                  =  0x00D740D0,
> +	HWPfDdrPhyIdtmError                   =  0x00D74110,
> +	HWPfDdrPhyIdtmDebug                   =  0x00D74120,
> +	HWPfDdrPhyIdtmDebugInt                =  0x00D74130,
> +	HwPfPcieLnAsicCfgovr                  =  0x00D80000,
> +	HwPfPcieLnAclkmixer                   =  0x00D80004,
> +	HwPfPcieLnTxrampfreq                  =  0x00D80008,
> +	HwPfPcieLnLanetest                    =  0x00D8000C,
> +	HwPfPcieLnDcctrl                      =  0x00D80010,
> +	HwPfPcieLnDccmeas                     =  0x00D80014,
> +	HwPfPcieLnDccovrAclk                  =  0x00D80018,
> +	HwPfPcieLnDccovrTxa                   =  0x00D8001C,
> +	HwPfPcieLnDccovrTxk                   =  0x00D80020,
> +	HwPfPcieLnDccovrDclk                  =  0x00D80024,
> +	HwPfPcieLnDccovrEclk                  =  0x00D80028,
> +	HwPfPcieLnDcctrimAclk                 =  0x00D8002C,
> +	HwPfPcieLnDcctrimTx                   =  0x00D80030,
> +	HwPfPcieLnDcctrimDclk                 =  0x00D80034,
> +	HwPfPcieLnDcctrimEclk                 =  0x00D80038,
> +	HwPfPcieLnQuadCtrl                    =  0x00D8003C,
> +	HwPfPcieLnQuadCorrIndex               =  0x00D80040,
> +	HwPfPcieLnQuadCorrStatus              =  0x00D80044,
> +	HwPfPcieLnAsicRxovr1                  =  0x00D80048,
> +	HwPfPcieLnAsicRxovr2                  =  0x00D8004C,
> +	HwPfPcieLnAsicEqinfovr                =  0x00D80050,
> +	HwPfPcieLnRxcsr                       =  0x00D80054,
> +	HwPfPcieLnRxfectrl                    =  0x00D80058,
> +	HwPfPcieLnRxtest                      =  0x00D8005C,
> +	HwPfPcieLnEscount                     =  0x00D80060,
> +	HwPfPcieLnCdrctrl                     =  0x00D80064,
> +	HwPfPcieLnCdrctrl2                    =  0x00D80068,
> +	HwPfPcieLnCdrcfg0Ctrl0                =  0x00D8006C,
> +	HwPfPcieLnCdrcfg0Ctrl1                =  0x00D80070,
> +	HwPfPcieLnCdrcfg0Ctrl2                =  0x00D80074,
> +	HwPfPcieLnCdrcfg1Ctrl0                =  0x00D80078,
> +	HwPfPcieLnCdrcfg1Ctrl1                =  0x00D8007C,
> +	HwPfPcieLnCdrcfg1Ctrl2                =  0x00D80080,
> +	HwPfPcieLnCdrcfg2Ctrl0                =  0x00D80084,
> +	HwPfPcieLnCdrcfg2Ctrl1                =  0x00D80088,
> +	HwPfPcieLnCdrcfg2Ctrl2                =  0x00D8008C,
> +	HwPfPcieLnCdrcfg3Ctrl0                =  0x00D80090,
> +	HwPfPcieLnCdrcfg3Ctrl1                =  0x00D80094,
> +	HwPfPcieLnCdrcfg3Ctrl2                =  0x00D80098,
> +	HwPfPcieLnCdrphase                    =  0x00D8009C,
> +	HwPfPcieLnCdrfreq                     =  0x00D800A0,
> +	HwPfPcieLnCdrstatusPhase              =  0x00D800A4,
> +	HwPfPcieLnCdrstatusFreq               =  0x00D800A8,
> +	HwPfPcieLnCdroffset                   =  0x00D800AC,
> +	HwPfPcieLnRxvosctl                    =  0x00D800B0,
> +	HwPfPcieLnRxvosctl2                   =  0x00D800B4,
> +	HwPfPcieLnRxlosctl                    =  0x00D800B8,
> +	HwPfPcieLnRxlos                       =  0x00D800BC,
> +	HwPfPcieLnRxlosvval                   =  0x00D800C0,
> +	HwPfPcieLnRxvosd0                     =  0x00D800C4,
> +	HwPfPcieLnRxvosd1                     =  0x00D800C8,
> +	HwPfPcieLnRxvosep0                    =  0x00D800CC,
> +	HwPfPcieLnRxvosep1                    =  0x00D800D0,
> +	HwPfPcieLnRxvosen0                    =  0x00D800D4,
> +	HwPfPcieLnRxvosen1                    =  0x00D800D8,
> +	HwPfPcieLnRxvosafe                    =  0x00D800DC,
> +	HwPfPcieLnRxvosa0                     =  0x00D800E0,
> +	HwPfPcieLnRxvosa0Out                  =  0x00D800E4,
> +	HwPfPcieLnRxvosa1                     =  0x00D800E8,
> +	HwPfPcieLnRxvosa1Out                  =  0x00D800EC,
> +	HwPfPcieLnRxmisc                      =  0x00D800F0,
> +	HwPfPcieLnRxbeacon                    =  0x00D800F4,
> +	HwPfPcieLnRxdssout                    =  0x00D800F8,
> +	HwPfPcieLnRxdssout2                   =  0x00D800FC,
> +	HwPfPcieLnAlphapctrl                  =  0x00D80100,
> +	HwPfPcieLnAlphanctrl                  =  0x00D80104,
> +	HwPfPcieLnAdaptctrl                   =  0x00D80108,
> +	HwPfPcieLnAdaptctrl1                  =  0x00D8010C,
> +	HwPfPcieLnAdaptstatus                 =  0x00D80110,
> +	HwPfPcieLnAdaptvga1                   =  0x00D80114,
> +	HwPfPcieLnAdaptvga2                   =  0x00D80118,
> +	HwPfPcieLnAdaptvga3                   =  0x00D8011C,
> +	HwPfPcieLnAdaptvga4                   =  0x00D80120,
> +	HwPfPcieLnAdaptboost1                 =  0x00D80124,
> +	HwPfPcieLnAdaptboost2                 =  0x00D80128,
> +	HwPfPcieLnAdaptboost3                 =  0x00D8012C,
> +	HwPfPcieLnAdaptboost4                 =  0x00D80130,
> +	HwPfPcieLnAdaptsslms1                 =  0x00D80134,
> +	HwPfPcieLnAdaptsslms2                 =  0x00D80138,
> +	HwPfPcieLnAdaptvgaStatus              =  0x00D8013C,
> +	HwPfPcieLnAdaptboostStatus            =  0x00D80140,
> +	HwPfPcieLnAdaptsslmsStatus1           =  0x00D80144,
> +	HwPfPcieLnAdaptsslmsStatus2           =  0x00D80148,
> +	HwPfPcieLnAfectrl1                    =  0x00D8014C,
> +	HwPfPcieLnAfectrl2                    =  0x00D80150,
> +	HwPfPcieLnAfectrl3                    =  0x00D80154,
> +	HwPfPcieLnAfedefault1                 =  0x00D80158,
> +	HwPfPcieLnAfedefault2                 =  0x00D8015C,
> +	HwPfPcieLnDfectrl1                    =  0x00D80160,
> +	HwPfPcieLnDfectrl2                    =  0x00D80164,
> +	HwPfPcieLnDfectrl3                    =  0x00D80168,
> +	HwPfPcieLnDfectrl4                    =  0x00D8016C,
> +	HwPfPcieLnDfectrl5                    =  0x00D80170,
> +	HwPfPcieLnDfectrl6                    =  0x00D80174,
> +	HwPfPcieLnAfestatus1                  =  0x00D80178,
> +	HwPfPcieLnAfestatus2                  =  0x00D8017C,
> +	HwPfPcieLnDfestatus1                  =  0x00D80180,
> +	HwPfPcieLnDfestatus2                  =  0x00D80184,
> +	HwPfPcieLnDfestatus3                  =  0x00D80188,
> +	HwPfPcieLnDfestatus4                  =  0x00D8018C,
> +	HwPfPcieLnDfestatus5                  =  0x00D80190,
> +	HwPfPcieLnAlphastatus                 =  0x00D80194,
> +	HwPfPcieLnFomctrl1                    =  0x00D80198,
> +	HwPfPcieLnFomctrl2                    =  0x00D8019C,
> +	HwPfPcieLnFomctrl3                    =  0x00D801A0,
> +	HwPfPcieLnAclkcalStatus               =  0x00D801A4,
> +	HwPfPcieLnOffscorrStatus              =  0x00D801A8,
> +	HwPfPcieLnEyewidthStatus              =  0x00D801AC,
> +	HwPfPcieLnEyeheightStatus             =  0x00D801B0,
> +	HwPfPcieLnAsicTxovr1                  =  0x00D801B4,
> +	HwPfPcieLnAsicTxovr2                  =  0x00D801B8,
> +	HwPfPcieLnAsicTxovr3                  =  0x00D801BC,
> +	HwPfPcieLnTxbiasadjOvr                =  0x00D801C0,
> +	HwPfPcieLnTxcsr                       =  0x00D801C4,
> +	HwPfPcieLnTxtest                      =  0x00D801C8,
> +	HwPfPcieLnTxtestword                  =  0x00D801CC,
> +	HwPfPcieLnTxtestwordHigh              =  0x00D801D0,
> +	HwPfPcieLnTxdrive                     =  0x00D801D4,
> +	HwPfPcieLnMtcsLn                      =  0x00D801D8,
> +	HwPfPcieLnStatsumLn                   =  0x00D801DC,
> +	HwPfPcieLnRcbusScratch                =  0x00D801E0,
> +	HwPfPcieLnRcbusMinorrev               =  0x00D801F0,
> +	HwPfPcieLnRcbusMajorrev               =  0x00D801F4,
> +	HwPfPcieLnRcbusBlocktype              =  0x00D801F8,
> +	HwPfPcieSupPllcsr                     =  0x00D80800,
> +	HwPfPcieSupPlldiv                     =  0x00D80804,
> +	HwPfPcieSupPllcal                     =  0x00D80808,
> +	HwPfPcieSupPllcalsts                  =  0x00D8080C,
> +	HwPfPcieSupPllmeas                    =  0x00D80810,
> +	HwPfPcieSupPlldactrim                 =  0x00D80814,
> +	HwPfPcieSupPllbiastrim                =  0x00D80818,
> +	HwPfPcieSupPllbwtrim                  =  0x00D8081C,
> +	HwPfPcieSupPllcaldly                  =  0x00D80820,
> +	HwPfPcieSupRefclkonpclkctrl           =  0x00D80824,
> +	HwPfPcieSupPclkdelay                  =  0x00D80828,
> +	HwPfPcieSupPhyconfig                  =  0x00D8082C,
> +	HwPfPcieSupRcalIntf                   =  0x00D80830,
> +	HwPfPcieSupAuxcsr                     =  0x00D80834,
> +	HwPfPcieSupVref                       =  0x00D80838,
> +	HwPfPcieSupLinkmode                   =  0x00D8083C,
> +	HwPfPcieSupRrefcalctl                 =  0x00D80840,
> +	HwPfPcieSupRrefcal                    =  0x00D80844,
> +	HwPfPcieSupRrefcaldly                 =  0x00D80848,
> +	HwPfPcieSupTximpcalctl                =  0x00D8084C,
> +	HwPfPcieSupTximpcal                   =  0x00D80850,
> +	HwPfPcieSupTximpoffset                =  0x00D80854,
> +	HwPfPcieSupTximpcaldly                =  0x00D80858,
> +	HwPfPcieSupRximpcalctl                =  0x00D8085C,
> +	HwPfPcieSupRximpcal                   =  0x00D80860,
> +	HwPfPcieSupRximpoffset                =  0x00D80864,
> +	HwPfPcieSupRximpcaldly                =  0x00D80868,
> +	HwPfPcieSupFence                      =  0x00D8086C,
> +	HwPfPcieSupMtcs                       =  0x00D80870,
> +	HwPfPcieSupStatsum                    =  0x00D809B8,
> +	HwPfPciePcsDpStatus0                  =  0x00D81000,
> +	HwPfPciePcsDpControl0                 =  0x00D81004,
> +	HwPfPciePcsPmaStatusLane0             =  0x00D81008,
> +	HwPfPciePcsPipeStatusLane0            =  0x00D8100C,
> +	HwPfPciePcsTxdeemph0Lane0             =  0x00D81010,
> +	HwPfPciePcsTxdeemph1Lane0             =  0x00D81014,
> +	HwPfPciePcsInternalStatusLane0        =  0x00D81018,
> +	HwPfPciePcsDpStatus1                  =  0x00D8101C,
> +	HwPfPciePcsDpControl1                 =  0x00D81020,
> +	HwPfPciePcsPmaStatusLane1             =  0x00D81024,
> +	HwPfPciePcsPipeStatusLane1            =  0x00D81028,
> +	HwPfPciePcsTxdeemph0Lane1             =  0x00D8102C,
> +	HwPfPciePcsTxdeemph1Lane1             =  0x00D81030,
> +	HwPfPciePcsInternalStatusLane1        =  0x00D81034,
> +	HwPfPciePcsDpStatus2                  =  0x00D81038,
> +	HwPfPciePcsDpControl2                 =  0x00D8103C,
> +	HwPfPciePcsPmaStatusLane2             =  0x00D81040,
> +	HwPfPciePcsPipeStatusLane2            =  0x00D81044,
> +	HwPfPciePcsTxdeemph0Lane2             =  0x00D81048,
> +	HwPfPciePcsTxdeemph1Lane2             =  0x00D8104C,
> +	HwPfPciePcsInternalStatusLane2        =  0x00D81050,
> +	HwPfPciePcsDpStatus3                  =  0x00D81054,
> +	HwPfPciePcsDpControl3                 =  0x00D81058,
> +	HwPfPciePcsPmaStatusLane3             =  0x00D8105C,
> +	HwPfPciePcsPipeStatusLane3            =  0x00D81060,
> +	HwPfPciePcsTxdeemph0Lane3             =  0x00D81064,
> +	HwPfPciePcsTxdeemph1Lane3             =  0x00D81068,
> +	HwPfPciePcsInternalStatusLane3        =  0x00D8106C,
> +	HwPfPciePcsEbStatus0                  =  0x00D81070,
> +	HwPfPciePcsEbStatus1                  =  0x00D81074,
> +	HwPfPciePcsEbStatus2                  =  0x00D81078,
> +	HwPfPciePcsEbStatus3                  =  0x00D8107C,
> +	HwPfPciePcsPllSettingPcieG1           =  0x00D81088,
> +	HwPfPciePcsPllSettingPcieG2           =  0x00D8108C,
> +	HwPfPciePcsPllSettingPcieG3           =  0x00D81090,
> +	HwPfPciePcsControl                    =  0x00D81094,
> +	HwPfPciePcsEqControl                  =  0x00D81098,
> +	HwPfPciePcsEqTimer                    =  0x00D8109C,
> +	HwPfPciePcsEqErrStatus                =  0x00D810A0,
> +	HwPfPciePcsEqErrCount                 =  0x00D810A4,
> +	HwPfPciePcsStatus                     =  0x00D810A8,
> +	HwPfPciePcsMiscRegister               =  0x00D810AC,
> +	HwPfPciePcsObsControl                 =  0x00D810B0,
> +	HwPfPciePcsPrbsCount0                 =  0x00D81200,
> +	HwPfPciePcsBistControl0               =  0x00D81204,
> +	HwPfPciePcsBistStaticWord00           =  0x00D81208,
> +	HwPfPciePcsBistStaticWord10           =  0x00D8120C,
> +	HwPfPciePcsBistStaticWord20           =  0x00D81210,
> +	HwPfPciePcsBistStaticWord30           =  0x00D81214,
> +	HwPfPciePcsPrbsCount1                 =  0x00D81220,
> +	HwPfPciePcsBistControl1               =  0x00D81224,
> +	HwPfPciePcsBistStaticWord01           =  0x00D81228,
> +	HwPfPciePcsBistStaticWord11           =  0x00D8122C,
> +	HwPfPciePcsBistStaticWord21           =  0x00D81230,
> +	HwPfPciePcsBistStaticWord31           =  0x00D81234,
> +	HwPfPciePcsPrbsCount2                 =  0x00D81240,
> +	HwPfPciePcsBistControl2               =  0x00D81244,
> +	HwPfPciePcsBistStaticWord02           =  0x00D81248,
> +	HwPfPciePcsBistStaticWord12           =  0x00D8124C,
> +	HwPfPciePcsBistStaticWord22           =  0x00D81250,
> +	HwPfPciePcsBistStaticWord32           =  0x00D81254,
> +	HwPfPciePcsPrbsCount3                 =  0x00D81260,
> +	HwPfPciePcsBistControl3               =  0x00D81264,
> +	HwPfPciePcsBistStaticWord03           =  0x00D81268,
> +	HwPfPciePcsBistStaticWord13           =  0x00D8126C,
> +	HwPfPciePcsBistStaticWord23           =  0x00D81270,
> +	HwPfPciePcsBistStaticWord33           =  0x00D81274,
> +	HwPfPcieGpexLtssmStateCntrl           =  0x00D90400,
> +	HwPfPcieGpexLtssmStateStatus          =  0x00D90404,
> +	HwPfPcieGpexSkipFreqTimer             =  0x00D90408,
> +	HwPfPcieGpexLaneSelect                =  0x00D9040C,
> +	HwPfPcieGpexLaneDeskew                =  0x00D90410,
> +	HwPfPcieGpexRxErrorStatus             =  0x00D90414,
> +	HwPfPcieGpexLaneNumControl            =  0x00D90418,
> +	HwPfPcieGpexNFstControl               =  0x00D9041C,
> +	HwPfPcieGpexLinkStatus                =  0x00D90420,
> +	HwPfPcieGpexAckReplayTimeout          =  0x00D90438,
> +	HwPfPcieGpexSeqNumberStatus           =  0x00D9043C,
> +	HwPfPcieGpexCoreClkRatio              =  0x00D90440,
> +	HwPfPcieGpexDllTholdControl           =  0x00D90448,
> +	HwPfPcieGpexPmTimer                   =  0x00D90450,
> +	HwPfPcieGpexPmeTimeout                =  0x00D90454,
> +	HwPfPcieGpexAspmL1Timer               =  0x00D90458,
> +	HwPfPcieGpexAspmReqTimer              =  0x00D9045C,
> +	HwPfPcieGpexAspmL1Dis                 =  0x00D90460,
> +	HwPfPcieGpexAdvisoryErrorControl      =  0x00D90468,
> +	HwPfPcieGpexId                        =  0x00D90470,
> +	HwPfPcieGpexClasscode                 =  0x00D90474,
> +	HwPfPcieGpexSubsystemId               =  0x00D90478,
> +	HwPfPcieGpexDeviceCapabilities        =  0x00D9047C,
> +	HwPfPcieGpexLinkCapabilities          =  0x00D90480,
> +	HwPfPcieGpexFunctionNumber            =  0x00D90484,
> +	HwPfPcieGpexPmCapabilities            =  0x00D90488,
> +	HwPfPcieGpexFunctionSelect            =  0x00D9048C,
> +	HwPfPcieGpexErrorCounter              =  0x00D904AC,
> +	HwPfPcieGpexConfigReady               =  0x00D904B0,
> +	HwPfPcieGpexFcUpdateTimeout           =  0x00D904B8,
> +	HwPfPcieGpexFcUpdateTimer             =  0x00D904BC,
> +	HwPfPcieGpexVcBufferLoad              =  0x00D904C8,
> +	HwPfPcieGpexVcBufferSizeThold         =  0x00D904CC,
> +	HwPfPcieGpexVcBufferSelect            =  0x00D904D0,
> +	HwPfPcieGpexBarEnable                 =  0x00D904D4,
> +	HwPfPcieGpexBarDwordLower             =  0x00D904D8,
> +	HwPfPcieGpexBarDwordUpper             =  0x00D904DC,
> +	HwPfPcieGpexBarSelect                 =  0x00D904E0,
> +	HwPfPcieGpexCreditCounterSelect       =  0x00D904E4,
> +	HwPfPcieGpexCreditCounterStatus       =  0x00D904E8,
> +	HwPfPcieGpexTlpHeaderSelect           =  0x00D904EC,
> +	HwPfPcieGpexTlpHeaderDword0           =  0x00D904F0,
> +	HwPfPcieGpexTlpHeaderDword1           =  0x00D904F4,
> +	HwPfPcieGpexTlpHeaderDword2           =  0x00D904F8,
> +	HwPfPcieGpexTlpHeaderDword3           =  0x00D904FC,
> +	HwPfPcieGpexRelaxOrderControl         =  0x00D90500,
> +	HwPfPcieGpexBarPrefetch               =  0x00D90504,
> +	HwPfPcieGpexFcCheckControl            =  0x00D90508,
> +	HwPfPcieGpexFcUpdateTimerTraffic      =  0x00D90518,
> +	HwPfPcieGpexPhyControl0               =  0x00D9053C,
> +	HwPfPcieGpexPhyControl1               =  0x00D90544,
> +	HwPfPcieGpexPhyControl2               =  0x00D9054C,
> +	HwPfPcieGpexUserControl0              =  0x00D9055C,
> +	HwPfPcieGpexUncorrErrorStatus         =  0x00D905F0,
> +	HwPfPcieGpexRxCplError                =  0x00D90620,
> +	HwPfPcieGpexRxCplErrorDword0          =  0x00D90624,
> +	HwPfPcieGpexRxCplErrorDword1          =  0x00D90628,
> +	HwPfPcieGpexRxCplErrorDword2          =  0x00D9062C,
> +	HwPfPcieGpexPabSwResetEn              =  0x00D90630,
> +	HwPfPcieGpexGen3Control0              =  0x00D90634,
> +	HwPfPcieGpexGen3Control1              =  0x00D90638,
> +	HwPfPcieGpexGen3Control2              =  0x00D9063C,
> +	HwPfPcieGpexGen2ControlCsr            =  0x00D90640,
> +	HwPfPcieGpexTotalVfInitialVf0         =  0x00D90644,
> +	HwPfPcieGpexTotalVfInitialVf1         =  0x00D90648,
> +	HwPfPcieGpexSriovLinkDevId0           =  0x00D90684,
> +	HwPfPcieGpexSriovLinkDevId1           =  0x00D90688,
> +	HwPfPcieGpexSriovPageSize0            =  0x00D906C4,
> +	HwPfPcieGpexSriovPageSize1            =  0x00D906C8,
> +	HwPfPcieGpexIdVersion                 =  0x00D906FC,
> +	HwPfPcieGpexSriovVfOffsetStride0      =  0x00D90704,
> +	HwPfPcieGpexSriovVfOffsetStride1      =  0x00D90708,
> +	HwPfPcieGpexGen3DeskewControl         =  0x00D907B4,
> +	HwPfPcieGpexGen3EqControl             =  0x00D907B8,
> +	HwPfPcieGpexBridgeVersion             =  0x00D90800,
> +	HwPfPcieGpexBridgeCapability          =  0x00D90804,
> +	HwPfPcieGpexBridgeControl             =  0x00D90808,
> +	HwPfPcieGpexBridgeStatus              =  0x00D9080C,
> +	HwPfPcieGpexEngineActivityStatus      =  0x00D9081C,
> +	HwPfPcieGpexEngineResetControl        =  0x00D90820,
> +	HwPfPcieGpexAxiPioControl             =  0x00D90840,
> +	HwPfPcieGpexAxiPioStatus              =  0x00D90844,
> +	HwPfPcieGpexAmbaSlaveCmdStatus        =  0x00D90848,
> +	HwPfPcieGpexPexPioControl             =  0x00D908C0,
> +	HwPfPcieGpexPexPioStatus              =  0x00D908C4,
> +	HwPfPcieGpexAmbaMasterStatus          =  0x00D908C8,
> +	HwPfPcieGpexCsrSlaveCmdStatus         =  0x00D90920,
> +	HwPfPcieGpexMailboxAxiControl         =  0x00D90A50,
> +	HwPfPcieGpexMailboxAxiData            =  0x00D90A54,
> +	HwPfPcieGpexMailboxPexControl         =  0x00D90A90,
> +	HwPfPcieGpexMailboxPexData            =  0x00D90A94,
> +	HwPfPcieGpexPexInterruptEnable        =  0x00D90AD0,
> +	HwPfPcieGpexPexInterruptStatus        =  0x00D90AD4,
> +	HwPfPcieGpexPexInterruptAxiPioVector  =  0x00D90AD8,
> +	HwPfPcieGpexPexInterruptPexPioVector  =  0x00D90AE0,
> +	HwPfPcieGpexPexInterruptMiscVector    =  0x00D90AF8,
> +	HwPfPcieGpexAmbaInterruptPioEnable    =  0x00D90B00,
> +	HwPfPcieGpexAmbaInterruptMiscEnable   =  0x00D90B0C,
> +	HwPfPcieGpexAmbaInterruptPioStatus    =  0x00D90B10,
> +	HwPfPcieGpexAmbaInterruptMiscStatus   =  0x00D90B1C,
> +	HwPfPcieGpexPexPmControl              =  0x00D90B80,
> +	HwPfPcieGpexSlotMisc                  =  0x00D90B88,
> +	HwPfPcieGpexAxiAddrMappingControl     =  0x00D90BA0,
> +	HwPfPcieGpexAxiAddrMappingWindowAxiBase     =  0x00D90BA4,
> +	HwPfPcieGpexAxiAddrMappingWindowPexBaseLow  =  0x00D90BA8,
> +	HwPfPcieGpexAxiAddrMappingWindowPexBaseHigh =  0x00D90BAC,
> +	HwPfPcieGpexPexBarAddrFunc0Bar0       =  0x00D91BA0,
> +	HwPfPcieGpexPexBarAddrFunc0Bar1       =  0x00D91BA4,
> +	HwPfPcieGpexAxiAddrMappingPcieHdrParam =  0x00D95BA0,
> +	HwPfPcieGpexExtAxiAddrMappingAxiBase  =  0x00D980A0,
> +	HwPfPcieGpexPexExtBarAddrFunc0Bar0    =  0x00D984A0,
> +	HwPfPcieGpexPexExtBarAddrFunc0Bar1    =  0x00D984A4,
> +	HwPfPcieGpexAmbaInterruptFlrEnable    =  0x00D9B960,
> +	HwPfPcieGpexAmbaInterruptFlrStatus    =  0x00D9B9A0,
> +	HwPfPcieGpexExtAxiAddrMappingSize     =  0x00D9BAF0,
> +	HwPfPcieGpexPexPioAwcacheControl      =  0x00D9C300,
> +	HwPfPcieGpexPexPioArcacheControl      =  0x00D9C304,
> +	HwPfPcieGpexPabObSizeControlVc0       =  0x00D9C310
> +};
> +
> +/* TIP PF Interrupt numbers */
> +enum {
> +	ACC100_PF_INT_QMGR_AQ_OVERFLOW = 0,
> +	ACC100_PF_INT_DOORBELL_VF_2_PF = 1,
> +	ACC100_PF_INT_DMA_DL_DESC_IRQ = 2,
> +	ACC100_PF_INT_DMA_UL_DESC_IRQ = 3,
> +	ACC100_PF_INT_DMA_MLD_DESC_IRQ = 4,
> +	ACC100_PF_INT_DMA_UL5G_DESC_IRQ = 5,
> +	ACC100_PF_INT_DMA_DL5G_DESC_IRQ = 6,
> +	ACC100_PF_INT_ILLEGAL_FORMAT = 7,
> +	ACC100_PF_INT_QMGR_DISABLED_ACCESS = 8,
> +	ACC100_PF_INT_QMGR_AQ_OVERTHRESHOLD = 9,
> +	ACC100_PF_INT_ARAM_ACCESS_ERR = 10,
> +	ACC100_PF_INT_ARAM_ECC_1BIT_ERR = 11,
> +	ACC100_PF_INT_PARITY_ERR = 12,
> +	ACC100_PF_INT_QMGR_ERR = 13,
> +	ACC100_PF_INT_INT_REQ_OVERFLOW = 14,
> +	ACC100_PF_INT_APB_TIMEOUT = 15,
> +};
> +
> +#endif /* ACC100_PF_ENUM_H */
> diff --git a/drivers/baseband/acc100/acc100_vf_enum.h b/drivers/baseband/acc100/acc100_vf_enum.h
> new file mode 100644
> index 0000000..b512af3
> --- /dev/null
> +++ b/drivers/baseband/acc100/acc100_vf_enum.h
> @@ -0,0 +1,73 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2017 Intel Corporation
> + */
> +
> +#ifndef ACC100_VF_ENUM_H
> +#define ACC100_VF_ENUM_H
> +
> +/*
> + * ACC100 Register mapping on VF BAR0
> + * This is automatically generated from RDL, format may change with new RDL
> + */
> +enum {
> +	HWVfQmgrIngressAq             =  0x00000000,
> +	HWVfHiVfToPfDbellVf           =  0x00000800,
> +	HWVfHiPfToVfDbellVf           =  0x00000808,
> +	HWVfHiInfoRingBaseLoVf        =  0x00000810,
> +	HWVfHiInfoRingBaseHiVf        =  0x00000814,
> +	HWVfHiInfoRingPointerVf       =  0x00000818,
> +	HWVfHiInfoRingIntWrEnVf       =  0x00000820,
> +	HWVfHiInfoRingPf2VfWrEnVf     =  0x00000824,
> +	HWVfHiMsixVectorMapperVf      =  0x00000860,
> +	HWVfDmaFec5GulDescBaseLoRegVf =  0x00000920,
> +	HWVfDmaFec5GulDescBaseHiRegVf =  0x00000924,
> +	HWVfDmaFec5GulRespPtrLoRegVf  =  0x00000928,
> +	HWVfDmaFec5GulRespPtrHiRegVf  =  0x0000092C,
> +	HWVfDmaFec5GdlDescBaseLoRegVf =  0x00000940,
> +	HWVfDmaFec5GdlDescBaseHiRegVf =  0x00000944,
> +	HWVfDmaFec5GdlRespPtrLoRegVf  =  0x00000948,
> +	HWVfDmaFec5GdlRespPtrHiRegVf  =  0x0000094C,
> +	HWVfDmaFec4GulDescBaseLoRegVf =  0x00000960,
> +	HWVfDmaFec4GulDescBaseHiRegVf =  0x00000964,
> +	HWVfDmaFec4GulRespPtrLoRegVf  =  0x00000968,
> +	HWVfDmaFec4GulRespPtrHiRegVf  =  0x0000096C,
> +	HWVfDmaFec4GdlDescBaseLoRegVf =  0x00000980,
> +	HWVfDmaFec4GdlDescBaseHiRegVf =  0x00000984,
> +	HWVfDmaFec4GdlRespPtrLoRegVf  =  0x00000988,
> +	HWVfDmaFec4GdlRespPtrHiRegVf  =  0x0000098C,
> +	HWVfDmaDdrBaseRangeRoVf       =  0x000009A0,
> +	HWVfQmgrAqResetVf             =  0x00000E00,
> +	HWVfQmgrRingSizeVf            =  0x00000E04,
> +	HWVfQmgrGrpDepthLog20Vf       =  0x00000E08,
> +	HWVfQmgrGrpDepthLog21Vf       =  0x00000E0C,
> +	HWVfQmgrGrpFunction0Vf        =  0x00000E10,
> +	HWVfQmgrGrpFunction1Vf        =  0x00000E14,
> +	HWVfPmACntrlRegVf             =  0x00000F40,
> +	HWVfPmACountVf                =  0x00000F48,
> +	HWVfPmAKCntLoVf               =  0x00000F50,
> +	HWVfPmAKCntHiVf               =  0x00000F54,
> +	HWVfPmADeltaCntLoVf           =  0x00000F60,
> +	HWVfPmADeltaCntHiVf           =  0x00000F64,
> +	HWVfPmBCntrlRegVf             =  0x00000F80,
> +	HWVfPmBCountVf                =  0x00000F88,
> +	HWVfPmBKCntLoVf               =  0x00000F90,
> +	HWVfPmBKCntHiVf               =  0x00000F94,
> +	HWVfPmBDeltaCntLoVf           =  0x00000FA0,
> +	HWVfPmBDeltaCntHiVf           =  0x00000FA4
> +};
> +
> +/* TIP VF Interrupt numbers */
> +enum {
> +	ACC100_VF_INT_QMGR_AQ_OVERFLOW = 0,
> +	ACC100_VF_INT_DOORBELL_VF_2_PF = 1,
> +	ACC100_VF_INT_DMA_DL_DESC_IRQ = 2,
> +	ACC100_VF_INT_DMA_UL_DESC_IRQ = 3,
> +	ACC100_VF_INT_DMA_MLD_DESC_IRQ = 4,
> +	ACC100_VF_INT_DMA_UL5G_DESC_IRQ = 5,
> +	ACC100_VF_INT_DMA_DL5G_DESC_IRQ = 6,
> +	ACC100_VF_INT_ILLEGAL_FORMAT = 7,
> +	ACC100_VF_INT_QMGR_DISABLED_ACCESS = 8,
> +	ACC100_VF_INT_QMGR_AQ_OVERTHRESHOLD = 9,
> +};
> +
> +#endif /* ACC100_VF_ENUM_H */
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
> index 6f46df0..cd77570 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> @@ -5,6 +5,9 @@
>  #ifndef _RTE_ACC100_PMD_H_
>  #define _RTE_ACC100_PMD_H_
>  
> +#include "acc100_pf_enum.h"
> +#include "acc100_vf_enum.h"
> +
>  /* Helper macro for logging */
>  #define rte_bbdev_log(level, fmt, ...) \
>  	rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \
> @@ -27,6 +30,493 @@
>  #define RTE_ACC100_PF_DEVICE_ID        (0x0d5c)
>  #define RTE_ACC100_VF_DEVICE_ID        (0x0d5d)
>  
> +/* Define as 1 to use only a single FEC engine */
> +#ifndef RTE_ACC100_SINGLE_FEC
> +#define RTE_ACC100_SINGLE_FEC 0
> +#endif
> +
> +/* Values used in filling in descriptors */
> +#define ACC100_DMA_DESC_TYPE           2
> +#define ACC100_DMA_CODE_BLK_MODE       0
> +#define ACC100_DMA_BLKID_FCW           1
> +#define ACC100_DMA_BLKID_IN            2
> +#define ACC100_DMA_BLKID_OUT_ENC       1
> +#define ACC100_DMA_BLKID_OUT_HARD      1
> +#define ACC100_DMA_BLKID_OUT_SOFT      2
> +#define ACC100_DMA_BLKID_OUT_HARQ      3
> +#define ACC100_DMA_BLKID_IN_HARQ       3
> +
> +/* Values used in filling in decode FCWs */
> +#define ACC100_FCW_TD_VER              1
> +#define ACC100_FCW_TD_EXT_COLD_REG_EN  1
> +#define ACC100_FCW_TD_AUTOMAP          0x0f
> +#define ACC100_FCW_TD_RVIDX_0          2
> +#define ACC100_FCW_TD_RVIDX_1          26
> +#define ACC100_FCW_TD_RVIDX_2          50
> +#define ACC100_FCW_TD_RVIDX_3          74
> +
> +/* Values used in writing to the registers */
> +#define ACC100_REG_IRQ_EN_ALL          0x1FF83FF  /* Enable all interrupts */
> +
> +/* ACC100 Specific Dimensioning */
> +#define ACC100_SIZE_64MBYTE            (64*1024*1024)

A better name for this #define would be ACC100_MAX_RING_SIZE.

Similarly, alloc_2x64mb_sw_rings_mem should be renamed to
alloc_max_sw_rings_mem.


> +/* Number of elements in an Info Ring */
> +#define ACC100_INFO_RING_NUM_ENTRIES   1024
> +/* Number of elements in HARQ layout memory */
> +#define ACC100_HARQ_LAYOUT             (64*1024*1024)
> +/* Assume offset for HARQ in memory */
> +#define ACC100_HARQ_OFFSET             (32*1024)
> +/* Mask used to calculate an index in an Info Ring array (not a byte offset) */
> +#define ACC100_INFO_RING_MASK          (ACC100_INFO_RING_NUM_ENTRIES-1)
> +/* Number of Virtual Functions ACC100 supports */
> +#define ACC100_NUM_VFS                  16
> +#define ACC100_NUM_QGRPS                 8
> +#define ACC100_NUM_QGRPS_PER_WORD        8
> +#define ACC100_NUM_AQS                  16
> +#define MAX_ENQ_BATCH_SIZE          255
Minor nit: these #define values should be aligned, at least within each block.
> +/* All ACC100 Registers alignment are 32bits = 4B */
> +#define BYTES_IN_WORD                 4

Common #define names should have an ACC100_ prefix to lower the chance of name conflicts.

Generally a good idea for all of them.

Tom

> +#define MAX_E_MBUF                64000
> +
> +#define GRP_ID_SHIFT    10 /* Queue Index Hierarchy */
> +#define VF_ID_SHIFT     4  /* Queue Index Hierarchy */
> +#define VF_OFFSET_QOS   16 /* offset in Memory Space specific to QoS Mon */
> +#define TMPL_PRI_0      0x03020100
> +#define TMPL_PRI_1      0x07060504
> +#define TMPL_PRI_2      0x0b0a0908
> +#define TMPL_PRI_3      0x0f0e0d0c
> +#define QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
> +#define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
> +
> +#define ACC100_NUM_TMPL  32
> +#define VF_OFFSET_QOS 16 /* offset in Memory Space specific to QoS Mon */
> +/* Mapping of signals for the available engines */
> +#define SIG_UL_5G      0
> +#define SIG_UL_5G_LAST 7
> +#define SIG_DL_5G      13
> +#define SIG_DL_5G_LAST 15
> +#define SIG_UL_4G      16
> +#define SIG_UL_4G_LAST 21
> +#define SIG_DL_4G      27
> +#define SIG_DL_4G_LAST 31
> +
> +/* max number of iterations to allocate memory block for all rings */
> +#define SW_RING_MEM_ALLOC_ATTEMPTS 5
> +#define MAX_QUEUE_DEPTH           1024
> +#define ACC100_DMA_MAX_NUM_POINTERS  14
> +#define ACC100_DMA_DESC_PADDING      8
> +#define ACC100_FCW_PADDING           12
> +#define ACC100_DESC_FCW_OFFSET       192
> +#define ACC100_DESC_SIZE             256
> +#define ACC100_DESC_OFFSET           (ACC100_DESC_SIZE / 64)
> +#define ACC100_FCW_TE_BLEN     32
> +#define ACC100_FCW_TD_BLEN     24
> +#define ACC100_FCW_LE_BLEN     32
> +#define ACC100_FCW_LD_BLEN     36
> +
> +#define ACC100_FCW_VER         2
> +#define MUX_5GDL_DESC 6
> +#define CMP_ENC_SIZE 20
> +#define CMP_DEC_SIZE 24
> +#define ENC_OFFSET (32)
> +#define DEC_OFFSET (80)
> +#define ACC100_EXT_MEM
> +#define ACC100_HARQ_OFFSET_THRESHOLD 1024
> +
> +/* Constants from K0 computation from 3GPP 38.212 Table 5.4.2.1-2 */
> +#define N_ZC_1 66 /* N = 66 Zc for BG 1 */
> +#define N_ZC_2 50 /* N = 50 Zc for BG 2 */
> +#define K0_1_1 17 /* K0 fraction numerator for rv 1 and BG 1 */
> +#define K0_1_2 13 /* K0 fraction numerator for rv 1 and BG 2 */
> +#define K0_2_1 33 /* K0 fraction numerator for rv 2 and BG 1 */
> +#define K0_2_2 25 /* K0 fraction numerator for rv 2 and BG 2 */
> +#define K0_3_1 56 /* K0 fraction numerator for rv 3 and BG 1 */
> +#define K0_3_2 43 /* K0 fraction numerator for rv 3 and BG 2 */
> +
> +/* ACC100 Configuration */
> +#define ACC100_DDR_ECC_ENABLE
> +#define ACC100_CFG_DMA_ERROR 0x3D7
> +#define ACC100_CFG_AXI_CACHE 0x11
> +#define ACC100_CFG_QMGR_HI_P 0x0F0F
> +#define ACC100_CFG_PCI_AXI 0xC003
> +#define ACC100_CFG_PCI_BRIDGE 0x40006033
> +#define ACC100_ENGINE_OFFSET 0x1000
> +#define ACC100_RESET_HI 0x20100
> +#define ACC100_RESET_LO 0x20000
> +#define ACC100_RESET_HARD 0x1FF
> +#define ACC100_ENGINES_MAX 9
> +#define LONG_WAIT 1000
> +
> +/* ACC100 DMA Descriptor triplet */
> +struct acc100_dma_triplet {
> +	uint64_t address;
> +	uint32_t blen:20,
> +		res0:4,
> +		last:1,
> +		dma_ext:1,
> +		res1:2,
> +		blkid:4;
> +} __rte_packed;
> +
> +
> +
> +/* ACC100 DMA Response Descriptor */
> +union acc100_dma_rsp_desc {
> +	uint32_t val;
> +	struct {
> +		uint32_t crc_status:1,
> +			synd_ok:1,
> +			dma_err:1,
> +			neg_stop:1,
> +			fcw_err:1,
> +			output_err:1,
> +			input_err:1,
> +			timestampEn:1,
> +			iterCountFrac:8,
> +			iter_cnt:8,
> +			rsrvd3:6,
> +			sdone:1,
> +			fdone:1;
> +		uint32_t add_info_0;
> +		uint32_t add_info_1;
> +	};
> +};
> +
> +
> +/* ACC100 Queue Manager Enqueue PCI Register */
> +union acc100_enqueue_reg_fmt {
> +	uint32_t val;
> +	struct {
> +		uint32_t num_elem:8,
> +			addr_offset:3,
> +			rsrvd:1,
> +			req_elem_addr:20;
> +	};
> +};
> +
> +/* FEC 4G Uplink Frame Control Word */
> +struct __rte_packed acc100_fcw_td {
> +	uint8_t fcw_ver:4,
> +		num_maps:4; /* Unused */
> +	uint8_t filler:6, /* Unused */
> +		rsrvd0:1,
> +		bypass_sb_deint:1;
> +	uint16_t k_pos;
> +	uint16_t k_neg; /* Unused */
> +	uint8_t c_neg; /* Unused */
> +	uint8_t c; /* Unused */
> +	uint32_t ea; /* Unused */
> +	uint32_t eb; /* Unused */
> +	uint8_t cab; /* Unused */
> +	uint8_t k0_start_col; /* Unused */
> +	uint8_t rsrvd1;
> +	uint8_t code_block_mode:1, /* Unused */
> +		turbo_crc_type:1,
> +		rsrvd2:3,
> +		bypass_teq:1, /* Unused */
> +		soft_output_en:1, /* Unused */
> +		ext_td_cold_reg_en:1;
> +	union { /* External Cold register */
> +		uint32_t ext_td_cold_reg;
> +		struct {
> +			uint32_t min_iter:4, /* Unused */
> +				max_iter:4,
> +				ext_scale:5, /* Unused */
> +				rsrvd3:3,
> +				early_stop_en:1, /* Unused */
> +				sw_soft_out_dis:1, /* Unused */
> +				sw_et_cont:1, /* Unused */
> +				sw_soft_out_saturation:1, /* Unused */
> +				half_iter_on:1, /* Unused */
> +				raw_decoder_input_on:1, /* Unused */
> +				rsrvd4:10;
> +		};
> +	};
> +};
> +
> +/* FEC 5GNR Uplink Frame Control Word */
> +struct __rte_packed acc100_fcw_ld {
> +	uint32_t FCWversion:4,
> +		qm:4,
> +		nfiller:11,
> +		BG:1,
> +		Zc:9,
> +		res0:1,
> +		synd_precoder:1,
> +		synd_post:1;
> +	uint32_t ncb:16,
> +		k0:16;
> +	uint32_t rm_e:24,
> +		hcin_en:1,
> +		hcout_en:1,
> +		crc_select:1,
> +		bypass_dec:1,
> +		bypass_intlv:1,
> +		so_en:1,
> +		so_bypass_rm:1,
> +		so_bypass_intlv:1;
> +	uint32_t hcin_offset:16,
> +		hcin_size0:16;
> +	uint32_t hcin_size1:16,
> +		hcin_decomp_mode:3,
> +		llr_pack_mode:1,
> +		hcout_comp_mode:3,
> +		res2:1,
> +		dec_convllr:4,
> +		hcout_convllr:4;
> +	uint32_t itmax:7,
> +		itstop:1,
> +		so_it:7,
> +		res3:1,
> +		hcout_offset:16;
> +	uint32_t hcout_size0:16,
> +		hcout_size1:16;
> +	uint32_t gain_i:8,
> +		gain_h:8,
> +		negstop_th:16;
> +	uint32_t negstop_it:7,
> +		negstop_en:1,
> +		res4:24;
> +};
> +
> +/* FEC 4G Downlink Frame Control Word */
> +struct __rte_packed acc100_fcw_te {
> +	uint16_t k_neg;
> +	uint16_t k_pos;
> +	uint8_t c_neg;
> +	uint8_t c;
> +	uint8_t filler;
> +	uint8_t cab;
> +	uint32_t ea:17,
> +		rsrvd0:15;
> +	uint32_t eb:17,
> +		rsrvd1:15;
> +	uint16_t ncb_neg;
> +	uint16_t ncb_pos;
> +	uint8_t rv_idx0:2,
> +		rsrvd2:2,
> +		rv_idx1:2,
> +		rsrvd3:2;
> +	uint8_t bypass_rv_idx0:1,
> +		bypass_rv_idx1:1,
> +		bypass_rm:1,
> +		rsrvd4:5;
> +	uint8_t rsrvd5:1,
> +		rsrvd6:3,
> +		code_block_crc:1,
> +		rsrvd7:3;
> +	uint8_t code_block_mode:1,
> +		rsrvd8:7;
> +	uint64_t rsrvd9;
> +};
> +
> +/* FEC 5GNR Downlink Frame Control Word */
> +struct __rte_packed acc100_fcw_le {
> +	uint32_t FCWversion:4,
> +		qm:4,
> +		nfiller:11,
> +		BG:1,
> +		Zc:9,
> +		res0:3;
> +	uint32_t ncb:16,
> +		k0:16;
> +	uint32_t rm_e:24,
> +		res1:2,
> +		crc_select:1,
> +		res2:1,
> +		bypass_intlv:1,
> +		res3:3;
> +	uint32_t res4_a:12,
> +		mcb_count:3,
> +		res4_b:17;
> +	uint32_t res5;
> +	uint32_t res6;
> +	uint32_t res7;
> +	uint32_t res8;
> +};
> +
> +/* ACC100 DMA Request Descriptor */
> +struct __rte_packed acc100_dma_req_desc {
> +	union {
> +		struct{
> +			uint32_t type:4,
> +				rsrvd0:26,
> +				sdone:1,
> +				fdone:1;
> +			uint32_t rsrvd1;
> +			uint32_t rsrvd2;
> +			uint32_t pass_param:8,
> +				sdone_enable:1,
> +				irq_enable:1,
> +				timeStampEn:1,
> +				res0:5,
> +				numCBs:4,
> +				res1:4,
> +				m2dlen:4,
> +				d2mlen:4;
> +		};
> +		struct{
> +			uint32_t word0;
> +			uint32_t word1;
> +			uint32_t word2;
> +			uint32_t word3;
> +		};
> +	};
> +	struct acc100_dma_triplet data_ptrs[ACC100_DMA_MAX_NUM_POINTERS];
> +
> +	/* Virtual addresses used to retrieve SW context info */
> +	union {
> +		void *op_addr;
> +		uint64_t pad1;  /* pad to 64 bits */
> +	};
> +	/*
> +	 * Stores additional information needed for driver processing:
> +	 * - last_desc_in_batch - flag used to mark last descriptor (CB)
> +	 *                        in batch
> +	 * - cbs_in_tb - stores information about total number of Code Blocks
> +	 *               in currently processed Transport Block
> +	 */
> +	union {
> +		struct {
> +			union {
> +				struct acc100_fcw_ld fcw_ld;
> +				struct acc100_fcw_td fcw_td;
> +				struct acc100_fcw_le fcw_le;
> +				struct acc100_fcw_te fcw_te;
> +				uint32_t pad2[ACC100_FCW_PADDING];
> +			};
> +			uint32_t last_desc_in_batch :8,
> +				cbs_in_tb:8,
> +				pad4 : 16;
> +		};
> +		uint64_t pad3[ACC100_DMA_DESC_PADDING]; /* pad to 64 bits */
> +	};
> +};
> +
> +/* ACC100 DMA Descriptor */
> +union acc100_dma_desc {
> +	struct acc100_dma_req_desc req;
> +	union acc100_dma_rsp_desc rsp;
> +};
> +
> +
> +/* Union describing Info Ring entry */
> +union acc100_harq_layout_data {
> +	uint32_t val;
> +	struct {
> +		uint16_t offset;
> +		uint16_t size0;
> +	};
> +} __rte_packed;
> +
> +
> +/* Union describing Info Ring entry */
> +union acc100_info_ring_data {
> +	uint32_t val;
> +	struct {
> +		union {
> +			uint16_t detailed_info;
> +			struct {
> +				uint16_t aq_id: 4;
> +				uint16_t qg_id: 4;
> +				uint16_t vf_id: 6;
> +				uint16_t reserved: 2;
> +			};
> +		};
> +		uint16_t int_nb: 7;
> +		uint16_t msi_0: 1;
> +		uint16_t vf2pf: 6;
> +		uint16_t loop: 1;
> +		uint16_t valid: 1;
> +	};
> +} __rte_packed;
> +
> +struct acc100_registry_addr {
> +	unsigned int dma_ring_dl5g_hi;
> +	unsigned int dma_ring_dl5g_lo;
> +	unsigned int dma_ring_ul5g_hi;
> +	unsigned int dma_ring_ul5g_lo;
> +	unsigned int dma_ring_dl4g_hi;
> +	unsigned int dma_ring_dl4g_lo;
> +	unsigned int dma_ring_ul4g_hi;
> +	unsigned int dma_ring_ul4g_lo;
> +	unsigned int ring_size;
> +	unsigned int info_ring_hi;
> +	unsigned int info_ring_lo;
> +	unsigned int info_ring_en;
> +	unsigned int info_ring_ptr;
> +	unsigned int tail_ptrs_dl5g_hi;
> +	unsigned int tail_ptrs_dl5g_lo;
> +	unsigned int tail_ptrs_ul5g_hi;
> +	unsigned int tail_ptrs_ul5g_lo;
> +	unsigned int tail_ptrs_dl4g_hi;
> +	unsigned int tail_ptrs_dl4g_lo;
> +	unsigned int tail_ptrs_ul4g_hi;
> +	unsigned int tail_ptrs_ul4g_lo;
> +	unsigned int depth_log0_offset;
> +	unsigned int depth_log1_offset;
> +	unsigned int qman_group_func;
> +	unsigned int ddr_range;
> +};
> +
> +/* Structure holding registry addresses for PF */
> +static const struct acc100_registry_addr pf_reg_addr = {
> +	.dma_ring_dl5g_hi = HWPfDmaFec5GdlDescBaseHiRegVf,
> +	.dma_ring_dl5g_lo = HWPfDmaFec5GdlDescBaseLoRegVf,
> +	.dma_ring_ul5g_hi = HWPfDmaFec5GulDescBaseHiRegVf,
> +	.dma_ring_ul5g_lo = HWPfDmaFec5GulDescBaseLoRegVf,
> +	.dma_ring_dl4g_hi = HWPfDmaFec4GdlDescBaseHiRegVf,
> +	.dma_ring_dl4g_lo = HWPfDmaFec4GdlDescBaseLoRegVf,
> +	.dma_ring_ul4g_hi = HWPfDmaFec4GulDescBaseHiRegVf,
> +	.dma_ring_ul4g_lo = HWPfDmaFec4GulDescBaseLoRegVf,
> +	.ring_size = HWPfQmgrRingSizeVf,
> +	.info_ring_hi = HWPfHiInfoRingBaseHiRegPf,
> +	.info_ring_lo = HWPfHiInfoRingBaseLoRegPf,
> +	.info_ring_en = HWPfHiInfoRingIntWrEnRegPf,
> +	.info_ring_ptr = HWPfHiInfoRingPointerRegPf,
> +	.tail_ptrs_dl5g_hi = HWPfDmaFec5GdlRespPtrHiRegVf,
> +	.tail_ptrs_dl5g_lo = HWPfDmaFec5GdlRespPtrLoRegVf,
> +	.tail_ptrs_ul5g_hi = HWPfDmaFec5GulRespPtrHiRegVf,
> +	.tail_ptrs_ul5g_lo = HWPfDmaFec5GulRespPtrLoRegVf,
> +	.tail_ptrs_dl4g_hi = HWPfDmaFec4GdlRespPtrHiRegVf,
> +	.tail_ptrs_dl4g_lo = HWPfDmaFec4GdlRespPtrLoRegVf,
> +	.tail_ptrs_ul4g_hi = HWPfDmaFec4GulRespPtrHiRegVf,
> +	.tail_ptrs_ul4g_lo = HWPfDmaFec4GulRespPtrLoRegVf,
> +	.depth_log0_offset = HWPfQmgrGrpDepthLog20Vf,
> +	.depth_log1_offset = HWPfQmgrGrpDepthLog21Vf,
> +	.qman_group_func = HWPfQmgrGrpFunction0,
> +	.ddr_range = HWPfDmaVfDdrBaseRw,
> +};
> +
> +/* Structure holding registry addresses for VF */
> +static const struct acc100_registry_addr vf_reg_addr = {
> +	.dma_ring_dl5g_hi = HWVfDmaFec5GdlDescBaseHiRegVf,
> +	.dma_ring_dl5g_lo = HWVfDmaFec5GdlDescBaseLoRegVf,
> +	.dma_ring_ul5g_hi = HWVfDmaFec5GulDescBaseHiRegVf,
> +	.dma_ring_ul5g_lo = HWVfDmaFec5GulDescBaseLoRegVf,
> +	.dma_ring_dl4g_hi = HWVfDmaFec4GdlDescBaseHiRegVf,
> +	.dma_ring_dl4g_lo = HWVfDmaFec4GdlDescBaseLoRegVf,
> +	.dma_ring_ul4g_hi = HWVfDmaFec4GulDescBaseHiRegVf,
> +	.dma_ring_ul4g_lo = HWVfDmaFec4GulDescBaseLoRegVf,
> +	.ring_size = HWVfQmgrRingSizeVf,
> +	.info_ring_hi = HWVfHiInfoRingBaseHiVf,
> +	.info_ring_lo = HWVfHiInfoRingBaseLoVf,
> +	.info_ring_en = HWVfHiInfoRingIntWrEnVf,
> +	.info_ring_ptr = HWVfHiInfoRingPointerVf,
> +	.tail_ptrs_dl5g_hi = HWVfDmaFec5GdlRespPtrHiRegVf,
> +	.tail_ptrs_dl5g_lo = HWVfDmaFec5GdlRespPtrLoRegVf,
> +	.tail_ptrs_ul5g_hi = HWVfDmaFec5GulRespPtrHiRegVf,
> +	.tail_ptrs_ul5g_lo = HWVfDmaFec5GulRespPtrLoRegVf,
> +	.tail_ptrs_dl4g_hi = HWVfDmaFec4GdlRespPtrHiRegVf,
> +	.tail_ptrs_dl4g_lo = HWVfDmaFec4GdlRespPtrLoRegVf,
> +	.tail_ptrs_ul4g_hi = HWVfDmaFec4GulRespPtrHiRegVf,
> +	.tail_ptrs_ul4g_lo = HWVfDmaFec4GulRespPtrLoRegVf,
> +	.depth_log0_offset = HWVfQmgrGrpDepthLog20Vf,
> +	.depth_log1_offset = HWVfQmgrGrpDepthLog21Vf,
> +	.qman_group_func = HWVfQmgrGrpFunction0Vf,
> +	.ddr_range = HWVfDmaDdrBaseRangeRoVf,
> +};
> +
>  /* Private data structure for each ACC100 device */
>  struct acc100_device {
>  	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v9 03/10] baseband/acc100: add info get function
  2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 03/10] baseband/acc100: add info get function Nicolas Chautru
@ 2020-09-29 21:13       ` Tom Rix
  2020-09-30  0:25         ` Chautru, Nicolas
  0 siblings, 1 reply; 213+ messages in thread
From: Tom Rix @ 2020-09-29 21:13 UTC (permalink / raw)
  To: Nicolas Chautru, dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu


On 9/28/20 5:29 PM, Nicolas Chautru wrote:
> Add the "info_get" function to the driver, to allow us to query the
> device.
> No processing capabilities are available yet.
> Link bbdev-test to support the PMD with null capability.
>
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
> ---
>  app/test-bbdev/meson.build               |   3 +
>  drivers/baseband/acc100/rte_acc100_cfg.h |  96 +++++++++++++
>  drivers/baseband/acc100/rte_acc100_pmd.c | 225 +++++++++++++++++++++++++++++++
>  drivers/baseband/acc100/rte_acc100_pmd.h |   3 +
>  4 files changed, 327 insertions(+)
>  create mode 100644 drivers/baseband/acc100/rte_acc100_cfg.h
>
> diff --git a/app/test-bbdev/meson.build b/app/test-bbdev/meson.build
> index 18ab6a8..fbd8ae3 100644
> --- a/app/test-bbdev/meson.build
> +++ b/app/test-bbdev/meson.build
> @@ -12,3 +12,6 @@ endif
>  if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC')
>  	deps += ['pmd_bbdev_fpga_5gnr_fec']
>  endif
> +if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_ACC100')
> +	deps += ['pmd_bbdev_acc100']
> +endif
> \ No newline at end of file
> diff --git a/drivers/baseband/acc100/rte_acc100_cfg.h b/drivers/baseband/acc100/rte_acc100_cfg.h
> new file mode 100644
> index 0000000..73bbe36
> --- /dev/null
> +++ b/drivers/baseband/acc100/rte_acc100_cfg.h
> @@ -0,0 +1,96 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation
> + */
> +
> +#ifndef _RTE_ACC100_CFG_H_
> +#define _RTE_ACC100_CFG_H_
> +
> +/**
> + * @file rte_acc100_cfg.h
> + *
> + * Functions for configuring ACC100 HW, exposed directly to applications.
> + * Configuration related to encoding/decoding is done through the
> + * librte_bbdev library.
> + *
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice
When will this experimental tag be removed?
> + */
> +
> +#include <stdint.h>
> +#include <stdbool.h>
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +/**< Number of Virtual Functions ACC100 supports */
> +#define RTE_ACC100_NUM_VFS 16
This is already defined with ACC100_NUM_VFS
> +
> +/**
> + * Definition of Queue Topology for ACC100 Configuration
> + * Some level of detail is abstracted out to expose a clean interface
> + * given that comprehensive flexibility is not required
> + */
> +struct rte_q_topology_t {
> +	/** Number of QGroups in incremental order of priority */
> +	uint16_t num_qgroups;
> +	/**
> +	 * All QGroups have the same number of AQs here.
> +	 * Note : Could be made a 16-array if more flexibility is really
> +	 * required
> +	 */
> +	uint16_t num_aqs_per_groups;
> +	/**
> +	 * Depth of the AQs is the same of all QGroups here. Log2 Enum : 2^N
> +	 * Note : Could be made a 16-array if more flexibility is really
> +	 * required
> +	 */
> +	uint16_t aq_depth_log2;
> +	/**
> +	 * Index of the first Queue Group Index - assuming contiguity
> +	 * Initialized as -1
> +	 */
> +	int8_t first_qgroup_index;
> +};
> +
> +/**
> + * Definition of Arbitration related parameters for ACC100 Configuration
> + */
> +struct rte_arbitration_t {
> +	/** Default Weight for VF Fairness Arbitration */
> +	uint16_t round_robin_weight;
> +	uint32_t gbr_threshold1; /**< Guaranteed Bitrate Threshold 1 */
> +	uint32_t gbr_threshold2; /**< Guaranteed Bitrate Threshold 2 */
> +};
> +
> +/**
> + * Structure to pass ACC100 configuration.
> + * Note: all VF Bundles will have the same configuration.
> + */
> +struct acc100_conf {
> +	bool pf_mode_en; /**< 1 if PF is used for dataplane, 0 for VFs */
> +	/** 1 if input '1' bit is represented by a positive LLR value, 0 if '1'
> +	 * bit is represented by a negative value.
> +	 */
> +	bool input_pos_llr_1_bit;
> +	/** 1 if output '1' bit is represented by a positive value, 0 if '1'
> +	 * bit is represented by a negative value.
> +	 */
> +	bool output_pos_llr_1_bit;
> +	uint16_t num_vf_bundles; /**< Number of VF bundles to setup */
> +	/** Queue topology for each operation type */
> +	struct rte_q_topology_t q_ul_4g;
> +	struct rte_q_topology_t q_dl_4g;
> +	struct rte_q_topology_t q_ul_5g;
> +	struct rte_q_topology_t q_dl_5g;
> +	/** Arbitration configuration for each operation type */
> +	struct rte_arbitration_t arb_ul_4g[RTE_ACC100_NUM_VFS];
> +	struct rte_arbitration_t arb_dl_4g[RTE_ACC100_NUM_VFS];
> +	struct rte_arbitration_t arb_ul_5g[RTE_ACC100_NUM_VFS];
> +	struct rte_arbitration_t arb_dl_5g[RTE_ACC100_NUM_VFS];
> +};
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_ACC100_CFG_H_ */
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
> index 1b4cd13..7807a30 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> @@ -26,6 +26,184 @@
>  RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
>  #endif
>  
> +/* Read a register of a ACC100 device */
> +static inline uint32_t
> +acc100_reg_read(struct acc100_device *d, uint32_t offset)
> +{
> +
> +	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
> +	uint32_t ret = *((volatile uint32_t *)(reg_addr));
> +	return rte_le_to_cpu_32(ret);
> +}
> +
> +/* Calculate the offset of the enqueue register */
> +static inline uint32_t
> +queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
> +{
> +	if (pf_device)
> +		return ((vf_id << 12) + (qgrp_id << 7) + (aq_id << 3) +
> +				HWPfQmgrIngressAq);
> +	else
> +		return ((qgrp_id << 7) + (aq_id << 3) +
> +				HWVfQmgrIngressAq);
Could you add *QmgrIngressAq to the acc100_registry_addr and skip the if (pf_device) check?
> +}
> +
> +enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC};
> +
> +/* Return the queue topology for a Queue Group Index */
> +static inline void
> +qtopFromAcc(struct rte_q_topology_t **qtop, int acc_enum,
> +		struct acc100_conf *acc100_conf)
> +{
> +	struct rte_q_topology_t *p_qtop;
> +	p_qtop = NULL;
> +	switch (acc_enum) {
> +	case UL_4G:
> +		p_qtop = &(acc100_conf->q_ul_4g);
> +		break;
> +	case UL_5G:
> +		p_qtop = &(acc100_conf->q_ul_5g);
> +		break;
> +	case DL_4G:
> +		p_qtop = &(acc100_conf->q_dl_4g);
> +		break;
> +	case DL_5G:
> +		p_qtop = &(acc100_conf->q_dl_5g);
> +		break;
> +	default:
> +		/* NOTREACHED */
> +		rte_bbdev_log(ERR, "Unexpected error evaluating qtopFromAcc");
The caller fetch_acc100_config does not check for NULL.
> +		break;
> +	}
> +	*qtop = p_qtop;
> +}
> +
> +static void
> +initQTop(struct acc100_conf *acc100_conf)
> +{
> +	acc100_conf->q_ul_4g.num_aqs_per_groups = 0;
> +	acc100_conf->q_ul_4g.num_qgroups = 0;
> +	acc100_conf->q_ul_4g.first_qgroup_index = -1;
> +	acc100_conf->q_ul_5g.num_aqs_per_groups = 0;
> +	acc100_conf->q_ul_5g.num_qgroups = 0;
> +	acc100_conf->q_ul_5g.first_qgroup_index = -1;
> +	acc100_conf->q_dl_4g.num_aqs_per_groups = 0;
> +	acc100_conf->q_dl_4g.num_qgroups = 0;
> +	acc100_conf->q_dl_4g.first_qgroup_index = -1;
> +	acc100_conf->q_dl_5g.num_aqs_per_groups = 0;
> +	acc100_conf->q_dl_5g.num_qgroups = 0;
> +	acc100_conf->q_dl_5g.first_qgroup_index = -1;
> +}
> +
> +static inline void
> +updateQtop(uint8_t acc, uint8_t qg, struct acc100_conf *acc100_conf,
> +		struct acc100_device *d) {
> +	uint32_t reg;
> +	struct rte_q_topology_t *q_top = NULL;
> +	qtopFromAcc(&q_top, acc, acc100_conf);
> +	if (unlikely(q_top == NULL))
> +		return;
As above, this error is not handled by the caller fetch_acc100_config.
> +	uint16_t aq;
> +	q_top->num_qgroups++;
> +	if (q_top->first_qgroup_index == -1) {
> +		q_top->first_qgroup_index = qg;
> +		/* Can be optimized to assume all are enabled by default */
> +		reg = acc100_reg_read(d, queue_offset(d->pf_device,
> +				0, qg, ACC100_NUM_AQS - 1));
> +		if (reg & QUEUE_ENABLE) {
> +			q_top->num_aqs_per_groups = ACC100_NUM_AQS;
> +			return;
> +		}
> +		q_top->num_aqs_per_groups = 0;
> +		for (aq = 0; aq < ACC100_NUM_AQS; aq++) {
> +			reg = acc100_reg_read(d, queue_offset(d->pf_device,
> +					0, qg, aq));
> +			if (reg & QUEUE_ENABLE)
> +				q_top->num_aqs_per_groups++;
> +		}
> +	}
> +}
> +
> +/* Fetch configuration enabled for the PF/VF using MMIO Read (slow) */
> +static inline void
> +fetch_acc100_config(struct rte_bbdev *dev)
> +{
> +	struct acc100_device *d = dev->data->dev_private;
> +	struct acc100_conf *acc100_conf = &d->acc100_conf;
> +	const struct acc100_registry_addr *reg_addr;
> +	uint8_t acc, qg;
> +	uint32_t reg, reg_aq, reg_len0, reg_len1;
> +	uint32_t reg_mode;
> +
> +	/* No need to retrieve the configuration if it is already done */
> +	if (d->configured)
> +		return;
Warn?
> +
> +	/* Choose correct registry addresses for the device type */
> +	if (d->pf_device)
> +		reg_addr = &pf_reg_addr;
> +	else
> +		reg_addr = &vf_reg_addr;
> +
> +	d->ddr_size = (1 + acc100_reg_read(d, reg_addr->ddr_range)) << 10;
> +
> +	/* Single VF Bundle by VF */
> +	acc100_conf->num_vf_bundles = 1;
> +	initQTop(acc100_conf);
> +
> +	struct rte_q_topology_t *q_top = NULL;
> +	int qman_func_id[5] = {0, 2, 1, 3, 4};
Do these magic numbers need #defines?
> +	reg = acc100_reg_read(d, reg_addr->qman_group_func);
> +	for (qg = 0; qg < ACC100_NUM_QGRPS_PER_WORD; qg++) {
> +		reg_aq = acc100_reg_read(d,
> +				queue_offset(d->pf_device, 0, qg, 0));
> +		if (reg_aq & QUEUE_ENABLE) {
> +			acc = qman_func_id[(reg >> (qg * 4)) & 0x7];
The mask 0x7 allows indices 0 to 7, but qman_func_id has only 5 entries, so this could read out of bounds.
> +			updateQtop(acc, qg, acc100_conf, d);
> +		}
> +	}
> +
> +	/* Check the depth of the AQs */
> +	reg_len0 = acc100_reg_read(d, reg_addr->depth_log0_offset);
> +	reg_len1 = acc100_reg_read(d, reg_addr->depth_log1_offset);
> +	for (acc = 0; acc < NUM_ACC; acc++) {
> +		qtopFromAcc(&q_top, acc, acc100_conf);
> +		if (q_top->first_qgroup_index < ACC100_NUM_QGRPS_PER_WORD)
> +			q_top->aq_depth_log2 = (reg_len0 >>
> +					(q_top->first_qgroup_index * 4))
> +					& 0xF;
> +		else
> +			q_top->aq_depth_log2 = (reg_len1 >>
> +					((q_top->first_qgroup_index -
> +					ACC100_NUM_QGRPS_PER_WORD) * 4))
> +					& 0xF;
> +	}
> +
> +	/* Read PF mode */
> +	if (d->pf_device) {
> +		reg_mode = acc100_reg_read(d, HWPfHiPfMode);
> +		acc100_conf->pf_mode_en = (reg_mode == 2) ? 1 : 0;

2 is a magic number, consider a #define

Tom

> +	}
> +
> +	rte_bbdev_log_debug(
> +			"%s Config LLR SIGN IN/OUT %s %s QG %u %u %u %u AQ %u %u %u %u Len %u %u %u %u\n",
> +			(d->pf_device) ? "PF" : "VF",
> +			(acc100_conf->input_pos_llr_1_bit) ? "POS" : "NEG",
> +			(acc100_conf->output_pos_llr_1_bit) ? "POS" : "NEG",
> +			acc100_conf->q_ul_4g.num_qgroups,
> +			acc100_conf->q_dl_4g.num_qgroups,
> +			acc100_conf->q_ul_5g.num_qgroups,
> +			acc100_conf->q_dl_5g.num_qgroups,
> +			acc100_conf->q_ul_4g.num_aqs_per_groups,
> +			acc100_conf->q_dl_4g.num_aqs_per_groups,
> +			acc100_conf->q_ul_5g.num_aqs_per_groups,
> +			acc100_conf->q_dl_5g.num_aqs_per_groups,
> +			acc100_conf->q_ul_4g.aq_depth_log2,
> +			acc100_conf->q_dl_4g.aq_depth_log2,
> +			acc100_conf->q_ul_5g.aq_depth_log2,
> +			acc100_conf->q_dl_5g.aq_depth_log2);
> +}
> +
>  /* Free 64MB memory used for software rings */
>  static int
>  acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
> @@ -33,8 +211,55 @@
>  	return 0;
>  }
>  
> +/* Get ACC100 device info */
> +static void
> +acc100_dev_info_get(struct rte_bbdev *dev,
> +		struct rte_bbdev_driver_info *dev_info)
> +{
> +	struct acc100_device *d = dev->data->dev_private;
> +
> +	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
> +		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
> +	};
> +
> +	static struct rte_bbdev_queue_conf default_queue_conf;
> +	default_queue_conf.socket = dev->data->socket_id;
> +	default_queue_conf.queue_size = MAX_QUEUE_DEPTH;
> +
> +	dev_info->driver_name = dev->device->driver->name;
> +
> +	/* Read and save the populated config from ACC100 registers */
> +	fetch_acc100_config(dev);
> +
> +	/* This isn't ideal because it reports the maximum number of queues but
> +	 * does not provide info on how many can be uplink/downlink or different
> +	 * priorities
> +	 */
> +	dev_info->max_num_queues =
> +			d->acc100_conf.q_dl_5g.num_aqs_per_groups *
> +			d->acc100_conf.q_dl_5g.num_qgroups +
> +			d->acc100_conf.q_ul_5g.num_aqs_per_groups *
> +			d->acc100_conf.q_ul_5g.num_qgroups +
> +			d->acc100_conf.q_dl_4g.num_aqs_per_groups *
> +			d->acc100_conf.q_dl_4g.num_qgroups +
> +			d->acc100_conf.q_ul_4g.num_aqs_per_groups *
> +			d->acc100_conf.q_ul_4g.num_qgroups;
> +	dev_info->queue_size_lim = MAX_QUEUE_DEPTH;
> +	dev_info->hardware_accelerated = true;
> +	dev_info->max_dl_queue_priority =
> +			d->acc100_conf.q_dl_4g.num_qgroups - 1;
> +	dev_info->max_ul_queue_priority =
> +			d->acc100_conf.q_ul_4g.num_qgroups - 1;
> +	dev_info->default_queue_conf = default_queue_conf;
> +	dev_info->cpu_flag_reqs = NULL;
> +	dev_info->min_alignment = 64;
> +	dev_info->capabilities = bbdev_capabilities;
> +	dev_info->harq_buffer_size = d->ddr_size;
> +}
> +
>  static const struct rte_bbdev_ops acc100_bbdev_ops = {
>  	.close = acc100_dev_close,
> +	.info_get = acc100_dev_info_get,
>  };
>  
>  /* ACC100 PCI PF address map */
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
> index cd77570..662e2c8 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> @@ -7,6 +7,7 @@
>  
>  #include "acc100_pf_enum.h"
>  #include "acc100_vf_enum.h"
> +#include "rte_acc100_cfg.h"
>  
>  /* Helper macro for logging */
>  #define rte_bbdev_log(level, fmt, ...) \
> @@ -520,6 +521,8 @@ struct acc100_registry_addr {
>  /* Private data structure for each ACC100 device */
>  struct acc100_device {
>  	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
> +	uint32_t ddr_size; /* Size in kB */
> +	struct acc100_conf acc100_conf; /* ACC100 Initial configuration */
>  	bool pf_device; /**< True if this is a PF ACC100 device */
>  	bool configured; /**< True if this ACC100 device is configured */
>  };



* Re: [dpdk-dev] [PATCH v9 04/10] baseband/acc100: add queue configuration
  2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 04/10] baseband/acc100: add queue configuration Nicolas Chautru
@ 2020-09-29 21:46       ` Tom Rix
  2020-09-30  1:03         ` Chautru, Nicolas
  0 siblings, 1 reply; 213+ messages in thread
From: Tom Rix @ 2020-09-29 21:46 UTC (permalink / raw)
  To: Nicolas Chautru, dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu


On 9/28/20 5:29 PM, Nicolas Chautru wrote:
> Adding function to create and configure queues for
> the device. Still no capability.
>
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> Reviewed-by: Rosen Xu <rosen.xu@intel.com>
> Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
> ---
>  drivers/baseband/acc100/rte_acc100_pmd.c | 420 ++++++++++++++++++++++++++++++-
>  drivers/baseband/acc100/rte_acc100_pmd.h |  45 ++++
>  2 files changed, 464 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
> index 7807a30..7a21c57 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> @@ -26,6 +26,22 @@
>  RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
>  #endif
>  
> +/* Write to MMIO register address */
> +static inline void
> +mmio_write(void *addr, uint32_t value)
> +{
> +	*((volatile uint32_t *)(addr)) = rte_cpu_to_le_32(value);
> +}
> +
> +/* Write a register of a ACC100 device */
> +static inline void
> +acc100_reg_write(struct acc100_device *d, uint32_t offset, uint32_t payload)
> +{
> +	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
> +	mmio_write(reg_addr, payload);
> +	usleep(1000);
rte_acc100_pmd.h defines LONG_WAIT; could this #define be used instead?
> +}
> +
>  /* Read a register of a ACC100 device */
>  static inline uint32_t
>  acc100_reg_read(struct acc100_device *d, uint32_t offset)
> @@ -36,6 +52,22 @@
>  	return rte_le_to_cpu_32(ret);
>  }
>  
> +/* Basic Implementation of Log2 for exact 2^N */
> +static inline uint32_t
> +log2_basic(uint32_t value)
This mirrors the function rte_bsf32.
> +{
> +	return (value == 0) ? 0 : __builtin_ctz(value);
> +}
> +
> +/* Calculate memory alignment offset assuming alignment is 2^N */
> +static inline uint32_t
> +calc_mem_alignment_offset(void *unaligned_virt_mem, uint32_t alignment)
> +{
> +	rte_iova_t unaligned_phy_mem = rte_malloc_virt2iova(unaligned_virt_mem);
> +	return (uint32_t)(alignment -
> +			(unaligned_phy_mem & (alignment-1)));
> +}
> +
>  /* Calculate the offset of the enqueue register */
>  static inline uint32_t
>  queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
> @@ -204,10 +236,393 @@
>  			acc100_conf->q_dl_5g.aq_depth_log2);
>  }
>  
> +static void
> +free_base_addresses(void **base_addrs, int size)
> +{
> +	int i;
> +	for (i = 0; i < size; i++)
> +		rte_free(base_addrs[i]);
> +}
> +
> +static inline uint32_t
> +get_desc_len(void)
> +{
> +	return sizeof(union acc100_dma_desc);
> +}
> +
> +/* Allocate the 2 * 64MB block for the sw rings */
> +static int
> +alloc_2x64mb_sw_rings_mem(struct rte_bbdev *dev, struct acc100_device *d,
> +		int socket)
See the earlier comment about the name of this function.
> +{
> +	uint32_t sw_ring_size = ACC100_SIZE_64MBYTE;
> +	d->sw_rings_base = rte_zmalloc_socket(dev->device->driver->name,
> +			2 * sw_ring_size, RTE_CACHE_LINE_SIZE, socket);
> +	if (d->sw_rings_base == NULL) {
> +		rte_bbdev_log(ERR, "Failed to allocate memory for %s:%u",
> +				dev->device->driver->name,
> +				dev->data->dev_id);
> +		return -ENOMEM;
> +	}
> +	memset(d->sw_rings_base, 0, ACC100_SIZE_64MBYTE);
> +	uint32_t next_64mb_align_offset = calc_mem_alignment_offset(
> +			d->sw_rings_base, ACC100_SIZE_64MBYTE);
> +	d->sw_rings = RTE_PTR_ADD(d->sw_rings_base, next_64mb_align_offset);
> +	d->sw_rings_phys = rte_malloc_virt2iova(d->sw_rings_base) +
> +			next_64mb_align_offset;
> +	d->sw_ring_size = MAX_QUEUE_DEPTH * get_desc_len();
> +	d->sw_ring_max_depth = d->sw_ring_size / get_desc_len();
> +
> +	return 0;
> +}
> +
> +/* Attempt to allocate minimised memory space for sw rings */
> +static void
> +alloc_sw_rings_min_mem(struct rte_bbdev *dev, struct acc100_device *d,
> +		uint16_t num_queues, int socket)
> +{
> +	rte_iova_t sw_rings_base_phy, next_64mb_align_addr_phy;
> +	uint32_t next_64mb_align_offset;
> +	rte_iova_t sw_ring_phys_end_addr;
> +	void *base_addrs[SW_RING_MEM_ALLOC_ATTEMPTS];
> +	void *sw_rings_base;
> +	int i = 0;
> +	uint32_t q_sw_ring_size = MAX_QUEUE_DEPTH * get_desc_len();
> +	uint32_t dev_sw_ring_size = q_sw_ring_size * num_queues;
> +
> +	/* Find an aligned block of memory to store sw rings */
> +	while (i < SW_RING_MEM_ALLOC_ATTEMPTS) {
> +		/*
> +		 * sw_ring allocated memory is guaranteed to be aligned to
> +		 * q_sw_ring_size at the condition that the requested size is
> +		 * less than the page size
> +		 */
> +		sw_rings_base = rte_zmalloc_socket(
> +				dev->device->driver->name,
> +				dev_sw_ring_size, q_sw_ring_size, socket);
> +
> +		if (sw_rings_base == NULL) {
> +			rte_bbdev_log(ERR,
> +					"Failed to allocate memory for %s:%u",
> +					dev->device->driver->name,
> +					dev->data->dev_id);
> +			break;
> +		}
> +
> +		sw_rings_base_phy = rte_malloc_virt2iova(sw_rings_base);
> +		next_64mb_align_offset = calc_mem_alignment_offset(
> +				sw_rings_base, ACC100_SIZE_64MBYTE);
> +		next_64mb_align_addr_phy = sw_rings_base_phy +
> +				next_64mb_align_offset;
> +		sw_ring_phys_end_addr = sw_rings_base_phy + dev_sw_ring_size;
> +
> +		/* Check if the end of the sw ring memory block is before the
> +		 * start of next 64MB aligned mem address
> +		 */
> +		if (sw_ring_phys_end_addr < next_64mb_align_addr_phy) {
> +			d->sw_rings_phys = sw_rings_base_phy;
> +			d->sw_rings = sw_rings_base;
> +			d->sw_rings_base = sw_rings_base;
> +			d->sw_ring_size = q_sw_ring_size;
> +			d->sw_ring_max_depth = MAX_QUEUE_DEPTH;
> +			break;
> +		}
> +		/* Store the address of the unaligned mem block */
> +		base_addrs[i] = sw_rings_base;
> +		i++;
> +	}
> +

This looks like a bug: memory that was just allocated is freed.

It looks like it could be part of an error handler for an allocation attempt in the loop failing.

There should be a better way to allocate aligned memory, such as rounding up the size and using an offset to reach the alignment you need.

> +	/* Free all unaligned blocks of mem allocated in the loop */
> +	free_base_addresses(base_addrs, i);
> +}
> +
> +
> +/* Allocate 64MB memory used for all software rings */
> +static int
> +acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
> +{
> +	uint32_t phys_low, phys_high, payload;
> +	struct acc100_device *d = dev->data->dev_private;
> +	const struct acc100_registry_addr *reg_addr;
> +
> +	if (d->pf_device && !d->acc100_conf.pf_mode_en) {
> +		rte_bbdev_log(NOTICE,
> +				"%s has PF mode disabled. This PF can't be used.",
> +				dev->data->name);
> +		return -ENODEV;
> +	}
> +
> +	alloc_sw_rings_min_mem(dev, d, num_queues, socket_id);
> +
> +	/* If minimal memory space approach failed, then allocate
> +	 * the 2 * 64MB block for the sw rings
> +	 */
> +	if (d->sw_rings == NULL)
> +		alloc_2x64mb_sw_rings_mem(dev, d, socket_id);
This can fail as well, but is unhandled.
> +
> +	/* Configure ACC100 with the base address for DMA descriptor rings
> +	 * Same descriptor rings used for UL and DL DMA Engines
> +	 * Note : Assuming only VF0 bundle is used for PF mode
> +	 */
> +	phys_high = (uint32_t)(d->sw_rings_phys >> 32);
> +	phys_low  = (uint32_t)(d->sw_rings_phys & ~(ACC100_SIZE_64MBYTE-1));
> +
> +	/* Choose correct registry addresses for the device type */
> +	if (d->pf_device)
> +		reg_addr = &pf_reg_addr;
> +	else
> +		reg_addr = &vf_reg_addr;
could reg_addr be part of acc100_device struct ?
> +
> +	/* Read the populated cfg from ACC100 registers */
> +	fetch_acc100_config(dev);
> +
> +	/* Mark as configured properly */
> +	d->configured = true;
d->configured should be set at the end, as the function can still fail.
> +
> +	/* Release AXI from PF */
> +	if (d->pf_device)
> +		acc100_reg_write(d, HWPfDmaAxiControl, 1);
> +
> +	acc100_reg_write(d, reg_addr->dma_ring_ul5g_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->dma_ring_ul5g_lo, phys_low);
> +	acc100_reg_write(d, reg_addr->dma_ring_dl5g_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->dma_ring_dl5g_lo, phys_low);
> +	acc100_reg_write(d, reg_addr->dma_ring_ul4g_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->dma_ring_ul4g_lo, phys_low);
> +	acc100_reg_write(d, reg_addr->dma_ring_dl4g_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->dma_ring_dl4g_lo, phys_low);
> +
> +	/*
> +	 * Configure Ring Size to the max queue ring size
> +	 * (used for wrapping purpose)
> +	 */
> +	payload = log2_basic(d->sw_ring_size / 64);
> +	acc100_reg_write(d, reg_addr->ring_size, payload);
> +
> +	/* Configure tail pointer for use when SDONE enabled */
> +	d->tail_ptrs = rte_zmalloc_socket(
> +			dev->device->driver->name,
> +			ACC100_NUM_QGRPS * ACC100_NUM_AQS * sizeof(uint32_t),
> +			RTE_CACHE_LINE_SIZE, socket_id);
> +	if (d->tail_ptrs == NULL) {
> +		rte_bbdev_log(ERR, "Failed to allocate tail ptr for %s:%u",
> +				dev->device->driver->name,
> +				dev->data->dev_id);
> +		rte_free(d->sw_rings);
> +		return -ENOMEM;
> +	}
> +	d->tail_ptr_phys = rte_malloc_virt2iova(d->tail_ptrs);
> +
> +	phys_high = (uint32_t)(d->tail_ptr_phys >> 32);
> +	phys_low  = (uint32_t)(d->tail_ptr_phys);
> +	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_lo, phys_low);
> +	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_lo, phys_low);
> +	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_lo, phys_low);
> +	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
> +
> +	d->harq_layout = rte_zmalloc_socket("HARQ Layout",
> +			ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
> +			RTE_CACHE_LINE_SIZE, dev->data->socket_id);
This allocation is unchecked.
> +
> +	rte_bbdev_log_debug(
> +			"ACC100 (%s) configured  sw_rings = %p, sw_rings_phys = %#"
> +			PRIx64, dev->data->name, d->sw_rings, d->sw_rings_phys);
> +
> +	return 0;
> +}
> +
>  /* Free 64MB memory used for software rings */
>  static int
> -acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
> +acc100_dev_close(struct rte_bbdev *dev)
>  {
> +	struct acc100_device *d = dev->data->dev_private;
> +	if (d->sw_rings_base != NULL) {
> +		rte_free(d->tail_ptrs);
> +		rte_free(d->sw_rings_base);
> +		d->sw_rings_base = NULL;
> +	}
> +	usleep(1000);
Similar to above; LONG_WAIT could be used here.
> +	return 0;
> +}
> +
> +
> +/**
> + * Report a ACC100 queue index which is free
> + * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
> + * Note : Only supporting VF0 Bundle for PF mode
> + */
> +static int
> +acc100_find_free_queue_idx(struct rte_bbdev *dev,
> +		const struct rte_bbdev_queue_conf *conf)
> +{
> +	struct acc100_device *d = dev->data->dev_private;
> +	int op_2_acc[5] = {0, UL_4G, DL_4G, UL_5G, DL_5G};
> +	int acc = op_2_acc[conf->op_type];
> +	struct rte_q_topology_t *qtop = NULL;
> +	qtopFromAcc(&qtop, acc, &(d->acc100_conf));
> +	if (qtop == NULL)
> +		return -1;
> +	/* Identify matching QGroup Index which are sorted in priority order */
> +	uint16_t group_idx = qtop->first_qgroup_index;
> +	group_idx += conf->priority;
> +	if (group_idx >= ACC100_NUM_QGRPS ||
> +			conf->priority >= qtop->num_qgroups) {
> +		rte_bbdev_log(INFO, "Invalid Priority on %s, priority %u",
> +				dev->data->name, conf->priority);
> +		return -1;
> +	}
> +	/* Find a free AQ_idx  */
> +	uint16_t aq_idx;
> +	for (aq_idx = 0; aq_idx < qtop->num_aqs_per_groups; aq_idx++) {
> +		if (((d->q_assigned_bit_map[group_idx] >> aq_idx) & 0x1) == 0) {
> +			/* Mark the Queue as assigned */
> +			d->q_assigned_bit_map[group_idx] |= (1 << aq_idx);
> +			/* Report the AQ Index */
> +			return (group_idx << GRP_ID_SHIFT) + aq_idx;
> +		}
> +	}
> +	rte_bbdev_log(INFO, "Failed to find free queue on %s, priority %u",
> +			dev->data->name, conf->priority);
> +	return -1;
> +}
> +
> +/* Setup ACC100 queue */
> +static int
> +acc100_queue_setup(struct rte_bbdev *dev, uint16_t queue_id,
> +		const struct rte_bbdev_queue_conf *conf)
> +{
> +	struct acc100_device *d = dev->data->dev_private;
> +	struct acc100_queue *q;
> +	int16_t q_idx;
> +
> +	/* Allocate the queue data structure. */
> +	q = rte_zmalloc_socket(dev->device->driver->name, sizeof(*q),
> +			RTE_CACHE_LINE_SIZE, conf->socket);
> +	if (q == NULL) {
> +		rte_bbdev_log(ERR, "Failed to allocate queue memory");
> +		return -ENOMEM;
> +	}
> +
> +	q->d = d;
> +	q->ring_addr = RTE_PTR_ADD(d->sw_rings, (d->sw_ring_size * queue_id));
> +	q->ring_addr_phys = d->sw_rings_phys + (d->sw_ring_size * queue_id);
> +
> +	/* Prepare the Ring with default descriptor format */
> +	union acc100_dma_desc *desc = NULL;
> +	unsigned int desc_idx, b_idx;
> +	int fcw_len = (conf->op_type == RTE_BBDEV_OP_LDPC_ENC ?
> +		ACC100_FCW_LE_BLEN : (conf->op_type == RTE_BBDEV_OP_TURBO_DEC ?
> +		ACC100_FCW_TD_BLEN : ACC100_FCW_LD_BLEN));
> +
> +	for (desc_idx = 0; desc_idx < d->sw_ring_max_depth; desc_idx++) {
> +		desc = q->ring_addr + desc_idx;
> +		desc->req.word0 = ACC100_DMA_DESC_TYPE;
> +		desc->req.word1 = 0; /**< Timestamp */
> +		desc->req.word2 = 0;
> +		desc->req.word3 = 0;
> +		uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
> +		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
> +		desc->req.data_ptrs[0].blen = fcw_len;
> +		desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
> +		desc->req.data_ptrs[0].last = 0;
> +		desc->req.data_ptrs[0].dma_ext = 0;
> +		for (b_idx = 1; b_idx < ACC100_DMA_MAX_NUM_POINTERS - 1;
> +				b_idx++) {
> +			desc->req.data_ptrs[b_idx].blkid = ACC100_DMA_BLKID_IN;
> +			desc->req.data_ptrs[b_idx].last = 1;
> +			desc->req.data_ptrs[b_idx].dma_ext = 0;
> +			b_idx++;

This works, but it would be better to increment the index only in the for loop statement.

The second data set should be accessed as [b_idx + 1], and the loop should increment by 2.

> +			desc->req.data_ptrs[b_idx].blkid =
> +					ACC100_DMA_BLKID_OUT_ENC;
> +			desc->req.data_ptrs[b_idx].last = 1;
> +			desc->req.data_ptrs[b_idx].dma_ext = 0;
> +		}
> +		/* Preset some fields of LDPC FCW */
> +		desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
> +		desc->req.fcw_ld.gain_i = 1;
> +		desc->req.fcw_ld.gain_h = 1;
> +	}
> +
> +	q->lb_in = rte_zmalloc_socket(dev->device->driver->name,
> +			RTE_CACHE_LINE_SIZE,
> +			RTE_CACHE_LINE_SIZE, conf->socket);
> +	if (q->lb_in == NULL) {

q is not freed.

> +		rte_bbdev_log(ERR, "Failed to allocate lb_in memory");
> +		return -ENOMEM;
> +	}
> +	q->lb_in_addr_phys = rte_malloc_virt2iova(q->lb_in);
> +	q->lb_out = rte_zmalloc_socket(dev->device->driver->name,
> +			RTE_CACHE_LINE_SIZE,
> +			RTE_CACHE_LINE_SIZE, conf->socket);
> +	if (q->lb_out == NULL) {
> +		rte_bbdev_log(ERR, "Failed to allocate lb_out memory");
> +		return -ENOMEM;

q->lb_in is not freed

q is not freed

> +	}
> +	q->lb_out_addr_phys = rte_malloc_virt2iova(q->lb_out);
> +
> +	/*
> +	 * Software queue ring wraps synchronously with the HW when it reaches
> +	 * the boundary of the maximum allocated queue size, no matter what the
> +	 * sw queue size is. This wrapping is guarded by setting the wrap_mask
> +	 * to represent the maximum queue size as allocated at the time when
> +	 * the device has been setup (in configure()).
> +	 *
> +	 * The queue depth is set to the queue size value (conf->queue_size).
> +	 * This limits the occupancy of the queue at any point of time, so that
> +	 * the queue does not get swamped with enqueue requests.
> +	 */
> +	q->sw_ring_depth = conf->queue_size;
> +	q->sw_ring_wrap_mask = d->sw_ring_max_depth - 1;
> +
> +	q->op_type = conf->op_type;
> +
> +	q_idx = acc100_find_free_queue_idx(dev, conf);
> +	if (q_idx == -1) {
> +		rte_free(q);

This will leak the other two pointers.

This function needs better error handling.

Tom
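
One way to address all three leaks (a sketch only; plain libc calloc/malloc/free stand in for rte_zmalloc_socket/rte_free, and the struct is reduced to the two loopback buffers) is the usual goto-unwind pattern, so each failure path frees exactly what was allocated before it:

```c
#include <stdlib.h>

/* Reduced queue structure for illustration. */
struct queue {
	void *lb_in;
	void *lb_out;
};

/* fail_lb_out simulates an allocation failure for lb_out. */
static struct queue *
queue_setup(int fail_lb_out)
{
	struct queue *q = calloc(1, sizeof(*q));

	if (q == NULL)
		return NULL;

	q->lb_in = malloc(64);
	if (q->lb_in == NULL)
		goto free_q;

	q->lb_out = fail_lb_out ? NULL : malloc(64);
	if (q->lb_out == NULL)
		goto free_lb_in;

	return q;

free_lb_in:
	free(q->lb_in);
free_q:
	free(q);
	return NULL;	/* nothing leaks on any failure path */
}
```

A later failure (such as no free queue index) would jump to a label that unwinds everything allocated so far, so neither q nor the loopback buffers leak.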

> +		return -1;
> +	}
> +
> +	q->qgrp_id = (q_idx >> GRP_ID_SHIFT) & 0xF;
> +	q->vf_id = (q_idx >> VF_ID_SHIFT)  & 0x3F;
> +	q->aq_id = q_idx & 0xF;
> +	q->aq_depth = (conf->op_type ==  RTE_BBDEV_OP_TURBO_DEC) ?
> +			(1 << d->acc100_conf.q_ul_4g.aq_depth_log2) :
> +			(1 << d->acc100_conf.q_dl_4g.aq_depth_log2);
> +
> +	q->mmio_reg_enqueue = RTE_PTR_ADD(d->mmio_base,
> +			queue_offset(d->pf_device,
> +					q->vf_id, q->qgrp_id, q->aq_id));
> +
> +	rte_bbdev_log_debug(
> +			"Setup dev%u q%u: qgrp_id=%u, vf_id=%u, aq_id=%u, aq_depth=%u, mmio_reg_enqueue=%p",
> +			dev->data->dev_id, queue_id, q->qgrp_id, q->vf_id,
> +			q->aq_id, q->aq_depth, q->mmio_reg_enqueue);
> +
> +	dev->data->queues[queue_id].queue_private = q;
> +	return 0;
> +}
> +
> +/* Release ACC100 queue */
> +static int
> +acc100_queue_release(struct rte_bbdev *dev, uint16_t q_id)
> +{
> +	struct acc100_device *d = dev->data->dev_private;
> +	struct acc100_queue *q = dev->data->queues[q_id].queue_private;
> +
> +	if (q != NULL) {
> +		/* Mark the Queue as un-assigned */
> +		d->q_assigned_bit_map[q->qgrp_id] &= (0xFFFFFFFF -
> +				(1 << q->aq_id));
> +		rte_free(q->lb_in);
> +		rte_free(q->lb_out);
> +		rte_free(q);
> +		dev->data->queues[q_id].queue_private = NULL;
> +	}
> +
>  	return 0;
>  }
>  
> @@ -258,8 +673,11 @@
>  }
>  
>  static const struct rte_bbdev_ops acc100_bbdev_ops = {
> +	.setup_queues = acc100_setup_queues,
>  	.close = acc100_dev_close,
>  	.info_get = acc100_dev_info_get,
> +	.queue_setup = acc100_queue_setup,
> +	.queue_release = acc100_queue_release,
>  };
>  
>  /* ACC100 PCI PF address map */
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
> index 662e2c8..0e2b79c 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> @@ -518,11 +518,56 @@ struct acc100_registry_addr {
>  	.ddr_range = HWVfDmaDdrBaseRangeRoVf,
>  };
>  
> +/* Structure associated with each queue. */
> +struct __rte_cache_aligned acc100_queue {
> +	union acc100_dma_desc *ring_addr;  /* Virtual address of sw ring */
> +	rte_iova_t ring_addr_phys;  /* Physical address of software ring */
> +	uint32_t sw_ring_head;  /* software ring head */
> +	uint32_t sw_ring_tail;  /* software ring tail */
> +	/* software ring size (descriptors, not bytes) */
> +	uint32_t sw_ring_depth;
> +	/* mask used to wrap enqueued descriptors on the sw ring */
> +	uint32_t sw_ring_wrap_mask;
> +	/* MMIO register used to enqueue descriptors */
> +	void *mmio_reg_enqueue;
> +	uint8_t vf_id;  /* VF ID (max = 63) */
> +	uint8_t qgrp_id;  /* Queue Group ID */
> +	uint16_t aq_id;  /* Atomic Queue ID */
> +	uint16_t aq_depth;  /* Depth of atomic queue */
> +	uint32_t aq_enqueued;  /* Count how many "batches" have been enqueued */
> +	uint32_t aq_dequeued;  /* Count how many "batches" have been dequeued */
> +	uint32_t irq_enable;  /* Enable ops dequeue interrupts if set to 1 */
> +	struct rte_mempool *fcw_mempool;  /* FCW mempool */
> +	enum rte_bbdev_op_type op_type;  /* Type of this Queue: TE or TD */
> +	/* Internal Buffers for loopback input */
> +	uint8_t *lb_in;
> +	uint8_t *lb_out;
> +	rte_iova_t lb_in_addr_phys;
> +	rte_iova_t lb_out_addr_phys;
> +	struct acc100_device *d;
> +};
> +
>  /* Private data structure for each ACC100 device */
>  struct acc100_device {
>  	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
> +	void *sw_rings_base;  /* Base addr of un-aligned memory for sw rings */
> +	void *sw_rings;  /* 64MBs of 64MB aligned memory for sw rings */
> +	rte_iova_t sw_rings_phys;  /* Physical address of sw_rings */
> +	/* Virtual address of the info memory routed to the this function under
> +	 * operation, whether it is PF or VF.
> +	 */
> +	union acc100_harq_layout_data *harq_layout;
> +	uint32_t sw_ring_size;
>  	uint32_t ddr_size; /* Size in kB */
> +	uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */
> +	rte_iova_t tail_ptr_phys; /* Physical address of tail pointers */
> +	/* Max number of entries available for each queue in device, depending
> +	 * on how many queues are enabled with configure()
> +	 */
> +	uint32_t sw_ring_max_depth;
>  	struct acc100_conf acc100_conf; /* ACC100 Initial configuration */
> +	/* Bitmap capturing which Queues have already been assigned */
> +	uint16_t q_assigned_bit_map[ACC100_NUM_QGRPS];
>  	bool pf_device; /**< True if this is a PF ACC100 device */
>  	bool configured; /**< True if this ACC100 device is configured */
>  };


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v9 01/10] drivers/baseband: add PMD for ACC100
  2020-09-29 19:53       ` Tom Rix
@ 2020-09-29 23:17         ` Chautru, Nicolas
  2020-09-30 23:06           ` Tom Rix
  0 siblings, 1 reply; 213+ messages in thread
From: Chautru, Nicolas @ 2020-09-29 23:17 UTC (permalink / raw)
  To: Tom Rix, dev, akhil.goyal
  Cc: Richardson, Bruce, Xu, Rosen, dave.burley, aidan.goddard, Yigit,
	Ferruh, Liu, Tianjiao

Hi Tom, 

> -----Original Message-----
> From: Tom Rix <trix@redhat.com>
> Sent: Tuesday, September 29, 2020 12:54 PM
> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org;
> akhil.goyal@nxp.com
> Cc: Richardson, Bruce <bruce.richardson@intel.com>; Xu, Rosen
> <rosen.xu@intel.com>; dave.burley@accelercomm.com;
> aidan.goddard@accelercomm.com; Yigit, Ferruh <ferruh.yigit@intel.com>;
> Liu, Tianjiao <tianjiao.liu@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v9 01/10] drivers/baseband: add PMD for
> ACC100
> 
> 
> On 9/28/20 5:29 PM, Nicolas Chautru wrote:
> > Add stubs for the ACC100 PMD
> >
> > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
> > ---
> >  doc/guides/bbdevs/acc100.rst                       | 233 +++++++++++++++++++++
> >  doc/guides/bbdevs/features/acc100.ini              |  14 ++
> >  doc/guides/bbdevs/index.rst                        |   1 +
> >  drivers/baseband/acc100/meson.build                |   6 +
> >  drivers/baseband/acc100/rte_acc100_pmd.c           | 175
> ++++++++++++++++
> >  drivers/baseband/acc100/rte_acc100_pmd.h           |  37 ++++
> >  .../acc100/rte_pmd_bbdev_acc100_version.map        |   3 +
> >  drivers/baseband/meson.build                       |   2 +-
> >  8 files changed, 470 insertions(+), 1 deletion(-)  create mode 100644
> > doc/guides/bbdevs/acc100.rst  create mode 100644
> > doc/guides/bbdevs/features/acc100.ini
> >  create mode 100644 drivers/baseband/acc100/meson.build
> >  create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.c
> >  create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.h
> >  create mode 100644
> > drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> >
> > diff --git a/doc/guides/bbdevs/acc100.rst
> > b/doc/guides/bbdevs/acc100.rst new file mode 100644 index
> > 0000000..f87ee09
> > --- /dev/null
> > +++ b/doc/guides/bbdevs/acc100.rst
> > @@ -0,0 +1,233 @@
> > +..  SPDX-License-Identifier: BSD-3-Clause
> > +    Copyright(c) 2020 Intel Corporation
> > +
> > +Intel(R) ACC100 5G/4G FEC Poll Mode Driver
> > +==========================================
> > +
> > +The BBDEV ACC100 5G/4G FEC poll mode driver (PMD) supports an
> > +implementation of a VRAN FEC wireless acceleration function.
> > +This device is also known as Mount Bryce.
> If this is code name or general chip name it should be removed.

We have used the general chip name for other PMDs (i.e. Vista Creek). I can remove it, but for my benefit, why should this be removed? This tends to be the most user-friendly name, so it is arguably good to mention in the documentation.


> > +
> > +Features
> > +--------
> > +
> > +ACC100 5G/4G FEC PMD supports the following features:
> > +
> > +- LDPC Encode in the DL (5GNR)
> > +- LDPC Decode in the UL (5GNR)
> > +- Turbo Encode in the DL (4G)
> > +- Turbo Decode in the UL (4G)
> > +- 16 VFs per PF (physical device)
> > +- Maximum of 128 queues per VF
> > +- PCIe Gen-3 x16 Interface
> > +- MSI
> > +- SR-IOV
> > +
> > +ACC100 5G/4G FEC PMD supports the following BBDEV capabilities:
> > +
> > +* For the LDPC encode operation:
> > +   - ``RTE_BBDEV_LDPC_CRC_24B_ATTACH`` :  set to attach CRC24B to
> CB(s)
> > +   - ``RTE_BBDEV_LDPC_RATE_MATCH`` :  if set then do not do Rate Match
> bypass
> > +   - ``RTE_BBDEV_LDPC_INTERLEAVER_BYPASS`` : if set then bypass
> > +interleaver
> > +
> > +* For the LDPC decode operation:
> > +   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK`` :  check CRC24B from
> CB(s)
> > +   - ``RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE`` :  disable early
> termination
> > +   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP`` :  drops CRC24B bits
> appended while decoding
> > +   - ``RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE`` :  provides an input
> for HARQ combining
> > +   - ``RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE`` :  provides an output
> for HARQ combining
> > +   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE`` :  HARQ
> memory input is internal
> > +   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE`` :
> HARQ memory output is internal
> > +   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK`` :
> loopback data to/from HARQ memory
> > +   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_FILLERS`` :  HARQ
> memory includes the fillers bits
> > +   - ``RTE_BBDEV_LDPC_DEC_SCATTER_GATHER`` :  supports scatter-
> gather for input/output data
> > +   - ``RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION`` :  supports
> compression of the HARQ input/output
> > +   - ``RTE_BBDEV_LDPC_LLR_COMPRESSION`` :  supports LLR input
> > +compression
> > +
> > +* For the turbo encode operation:
> > +   - ``RTE_BBDEV_TURBO_CRC_24B_ATTACH`` :  set to attach CRC24B to
> CB(s)
> > +   - ``RTE_BBDEV_TURBO_RATE_MATCH`` :  if set then do not do Rate
> Match bypass
> > +   - ``RTE_BBDEV_TURBO_ENC_INTERRUPTS`` :  set for encoder dequeue
> interrupts
> > +   - ``RTE_BBDEV_TURBO_RV_INDEX_BYPASS`` :  set to bypass RV index
> > +   - ``RTE_BBDEV_TURBO_ENC_SCATTER_GATHER`` :  supports
> > +scatter-gather for input/output data
> > +
> > +* For the turbo decode operation:
> > +   - ``RTE_BBDEV_TURBO_CRC_TYPE_24B`` :  check CRC24B from CB(s)
> > +   - ``RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE`` :  perform subblock
> de-interleave
> > +   - ``RTE_BBDEV_TURBO_DEC_INTERRUPTS`` :  set for decoder dequeue
> interrupts
> > +   - ``RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN`` :  set if negative LLR
> encoder i/p is supported
> > +   - ``RTE_BBDEV_TURBO_POS_LLR_1_BIT_IN`` :  set if positive LLR encoder
> i/p is supported
> > +   - ``RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP`` :  keep CRC24B bits
> appended while decoding
> > +   - ``RTE_BBDEV_TURBO_EARLY_TERMINATION`` :  set early
> termination feature
> > +   - ``RTE_BBDEV_TURBO_DEC_SCATTER_GATHER`` :  supports scatter-
> gather for input/output data
> > +   - ``RTE_BBDEV_TURBO_HALF_ITERATION_EVEN`` :  set half iteration
> > +granularity
> > +
> > +Installation
> > +------------
> > +
> > +Section 3 of the DPDK manual provides instructions on installing and
> > +compiling DPDK. The default set of bbdev compile flags may be found
> > +in config/common_base, where for example the flag to build the ACC100
> > +5G/4G FEC device, ``CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100``,
> > +is already set.
> > +
> > +DPDK requires hugepages to be configured as detailed in section 2 of the
> DPDK manual.
> > +The bbdev test application has been tested with a configuration 40 x
> > +1GB hugepages. The hugepage configuration of a server may be examined
> using:
> > +
> > +.. code-block:: console
> > +
> > +   grep Huge* /proc/meminfo
> > +
> > +
> > +Initialization
> > +--------------
> > +
> > +When the device first powers up, its PCI Physical Functions (PF) can be
> listed through this command:
> > +
> > +.. code-block:: console
> > +
> > +  sudo lspci -vd8086:0d5c
> > +
> > +The physical and virtual functions are compatible with Linux UIO drivers:
> > +``vfio`` and ``igb_uio``. However, in order to work the ACC100 5G/4G
> > +FEC device firstly needs to be bound to one of these linux drivers through
> DPDK.
> FEC device first

ok

> > +
> > +
> > +Bind PF UIO driver(s)
> > +~~~~~~~~~~~~~~~~~~~~~
> > +
> > +Install the DPDK igb_uio driver, bind it with the PF PCI device ID
> > +and use ``lspci`` to confirm the PF device is under use by ``igb_uio`` DPDK
> UIO driver.
> > +
> > +The igb_uio driver may be bound to the PF PCI device using one of three
> methods:
> > +
> > +
> > +1. PCI functions (physical or virtual, depending on the use case) can
> > +be bound to the UIO driver by repeating this command for every function.
> > +
> > +.. code-block:: console
> > +
> > +  cd <dpdk-top-level-directory>
> > +  insmod ./build/kmod/igb_uio.ko
> > +  echo "8086 0d5c" > /sys/bus/pci/drivers/igb_uio/new_id
> > +  lspci -vd8086:0d5c
> > +
> > +
> > +2. Another way to bind PF with DPDK UIO driver is by using the
> > +``dpdk-devbind.py`` tool
> > +
> > +.. code-block:: console
> > +
> > +  cd <dpdk-top-level-directory>
> > +  ./usertools/dpdk-devbind.py -b igb_uio 0000:06:00.0
> > +
> > +where the PCI device ID (example: 0000:06:00.0) is obtained using
> > +lspci -vd8086:0d5c
> > +
> > +
> > +3. A third way to bind is to use ``dpdk-setup.sh`` tool
> > +
> > +.. code-block:: console
> > +
> > +  cd <dpdk-top-level-directory>
> > +  ./usertools/dpdk-setup.sh
> > +
> > +  select 'Bind Ethernet/Crypto/Baseband device to IGB UIO module'
> > +  or
> > +  select 'Bind Ethernet/Crypto/Baseband device to VFIO module'
> > + depending on driver required
> This is the igb_uio section, should defer vfio select to its section.

Ok

> > +  enter PCI device ID
> > +  select 'Display current Ethernet/Crypto/Baseband device settings'
> > + to confirm binding
> > +
> > +
> > +In the same way the ACC100 5G/4G FEC PF can be bound with vfio, but
> > +vfio driver does not support SR-IOV configuration right out of the box, so
> it will need to be patched.
> Other documentation says works with 5.7

Yes, this is a bit historical now. I can remove this part, which is not very informative and not specific to this PMD.

> > +
> > +
> > +Enable Virtual Functions
> > +~~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +Now, it should be visible in the printouts that PCI PF is under
> > +igb_uio control "``Kernel driver in use: igb_uio``"
> > +
> > +To show the number of available VFs on the device, read ``sriov_totalvfs``
> file..
> > +
> > +.. code-block:: console
> > +
> > +  cat /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_totalvfs
> > +
> > +  where 0000\:<b>\:<d>.<f> is the PCI device ID
> > +
> > +
> > +To enable VFs via igb_uio, echo the number of virtual functions
> > +intended to enable to ``max_vfs`` file..
> > +
> > +.. code-block:: console
> > +
> > +  echo <num-of-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/max_vfs
> > +
> > +
> > +Afterwards, all VFs must be bound to appropriate UIO drivers as
> > +required, same way it was done with the physical function previously.
> > +
> > +Enabling SR-IOV via vfio driver is pretty much the same, except that
> > +the file name is different:
> > +
> > +.. code-block:: console
> > +
> > +  echo <num-of-vfs> >
> > + /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_numvfs
> > +
> > +
> > +Configure the VFs through PF
> > +~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > +
> > +The PCI virtual functions must be configured before working or
> > +getting assigned to VMs/Containers. The configuration involves
> > +allocating the number of hardware queues, priorities, load balance,
> > +bandwidth and other settings necessary for the device to perform FEC
> functions.
> > +
> > +This configuration needs to be executed at least once after reboot or
> > +PCI FLR and can be achieved by using the function
> > +``acc100_configure()``, which sets up the parameters defined in
> ``acc100_conf`` structure.
> > +
> > +Test Application
> > +----------------
> > +
> > +BBDEV provides a test application, ``test-bbdev.py`` and range of
> > +test data for testing the functionality of ACC100 5G/4G FEC encode
> > +and decode, depending on the device's capabilities. The test
> > +application is located under app->test-bbdev folder and has the following
> options:
> > +
> > +.. code-block:: console
> > +
> > +  "-p", "--testapp-path": specifies path to the bbdev test app.
> > +  "-e", "--eal-params"	: EAL arguments which are passed to the test
> app.
> > +  "-t", "--timeout"	: Timeout in seconds (default=300).
> > +  "-c", "--test-cases"	: Defines test cases to run. Run all if not specified.
> > +  "-v", "--test-vector"	: Test vector path (default=dpdk_path+/app/test-
> bbdev/test_vectors/bbdev_null.data).
> > +  "-n", "--num-ops"	: Number of operations to process on device
> (default=32).
> > +  "-b", "--burst-size"	: Operations enqueue/dequeue burst size
> (default=32).
> > +  "-s", "--snr"		: SNR in dB used when generating LLRs for bler tests.
> > +  "-s", "--iter_max"	: Number of iterations for LDPC decoder.
> > +  "-l", "--num-lcores"	: Number of lcores to run (default=16).
> > +  "-i", "--init-device" : Initialise PF device with default values.
> > +
> > +
> > +To execute the test application tool using simple decode or encode
> > +data, type one of the following:
> > +
> > +.. code-block:: console
> > +
> > +  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_dec_default.data
> > + ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_enc_default.data
> > +
> > +
> > +The test application ``test-bbdev.py``, supports the ability to
> > +configure the PF device with a default set of values, if the "-i" or
> > +"- -init-device" option is included. The default values are defined in
> test_bbdev_perf.c.
> > +
> > +
> > +Test Vectors
> > +~~~~~~~~~~~~
> > +
> > +In addition to the simple LDPC decoder and LDPC encoder tests, bbdev
> > +also provides a range of additional tests under the test_vectors
> > +folder, which may be useful. The results of these tests will depend
> > +on the ACC100 5G/4G FEC capabilities which may cause some testcases to
> be skipped, but no failure should be reported.
> 
> Just
> 
> to be skipped.
> 
> should be able to assume skipped tests do not get reported as failures.

Not necessarily that obvious, judging from feedback. It doesn't hurt to be explicit, and
this statement is common to all PMDs.


> 
> > diff --git a/doc/guides/bbdevs/features/acc100.ini
> > b/doc/guides/bbdevs/features/acc100.ini
> > new file mode 100644
> > index 0000000..c89a4d7
> > --- /dev/null
> > +++ b/doc/guides/bbdevs/features/acc100.ini
> > @@ -0,0 +1,14 @@
> > +;
> > +; Supported features of the 'acc100' bbdev driver.
> > +;
> > +; Refer to default.ini for the full list of available PMD features.
> > +;
> > +[Features]
> > +Turbo Decoder (4G)     = N
> > +Turbo Encoder (4G)     = N
> > +LDPC Decoder (5G)      = N
> > +LDPC Encoder (5G)      = N
> > +LLR/HARQ Compression   = N
> > +External DDR Access    = N
> > +HW Accelerated         = Y
> > +BBDEV API              = Y
> > diff --git a/doc/guides/bbdevs/index.rst b/doc/guides/bbdevs/index.rst
> > index a8092dd..4445cbd 100644
> > --- a/doc/guides/bbdevs/index.rst
> > +++ b/doc/guides/bbdevs/index.rst
> > @@ -13,3 +13,4 @@ Baseband Device Drivers
> >      turbo_sw
> >      fpga_lte_fec
> >      fpga_5gnr_fec
> > +    acc100
> > diff --git a/drivers/baseband/acc100/meson.build
> > b/drivers/baseband/acc100/meson.build
> > new file mode 100644
> > index 0000000..8afafc2
> > --- /dev/null
> > +++ b/drivers/baseband/acc100/meson.build
> > @@ -0,0 +1,6 @@
> > +# SPDX-License-Identifier: BSD-3-Clause
> > +# Copyright(c) 2020 Intel Corporation
> > +
> > +deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci']
> > +
> > +sources = files('rte_acc100_pmd.c')
> > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> > b/drivers/baseband/acc100/rte_acc100_pmd.c
> > new file mode 100644
> > index 0000000..1b4cd13
> > --- /dev/null
> > +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> > @@ -0,0 +1,175 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2020 Intel Corporation
> > + */
> > +
> > +#include <unistd.h>
> > +
> > +#include <rte_common.h>
> > +#include <rte_log.h>
> > +#include <rte_dev.h>
> > +#include <rte_malloc.h>
> > +#include <rte_mempool.h>
> > +#include <rte_byteorder.h>
> > +#include <rte_errno.h>
> > +#include <rte_branch_prediction.h>
> > +#include <rte_hexdump.h>
> > +#include <rte_pci.h>
> > +#include <rte_bus_pci.h>
> > +
> > +#include <rte_bbdev.h>
> > +#include <rte_bbdev_pmd.h>
> Should these #includes be in alpha order?

Interesting comment. Is this a coding guideline for DPDK or elsewhere?
I have never heard of this personally; what is the rationale?

> > +#include "rte_acc100_pmd.h"
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, DEBUG);
> > +#else
> > +RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
> > +#endif
> > +
> > +/* Free 64MB memory used for software rings */
> > +static int
> > +acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
> > +{
> > +	return 0;
> > +}
> > +
> > +static const struct rte_bbdev_ops acc100_bbdev_ops = {
> > +	.close = acc100_dev_close,
> > +};
> > +
> > +/* ACC100 PCI PF address map */
> > +static struct rte_pci_id pci_id_acc100_pf_map[] = {
> > +	{
> > +		RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID,
> RTE_ACC100_PF_DEVICE_ID)
> > +	},
> > +	{.device_id = 0},
> > +};
> > +
> > +/* ACC100 PCI VF address map */
> > +static struct rte_pci_id pci_id_acc100_vf_map[] = {
> > +	{
> > +		RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID,
> RTE_ACC100_VF_DEVICE_ID)
> > +	},
> > +	{.device_id = 0},
> > +};
> > +
> > +/* Initialization Function */
> > +static void
> > +acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
> > +{
> > +	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
> > +
> > +	dev->dev_ops = &acc100_bbdev_ops;
> > +
> > +	((struct acc100_device *) dev->data->dev_private)->pf_device =
> > +			!strcmp(drv->driver.name,
> > +			RTE_STR(ACC100PF_DRIVER_NAME));
> > +	((struct acc100_device *) dev->data->dev_private)->mmio_base =
> > +			pci_dev->mem_resource[0].addr;
> > +
> > +	rte_bbdev_log_debug("Init device %s [%s] @ vaddr %p paddr
> %#"PRIx64"",
> > +			drv->driver.name, dev->data->name,
> > +			(void *)pci_dev->mem_resource[0].addr,
> > +			pci_dev->mem_resource[0].phys_addr);
> > +}
> > +
> > +static int acc100_pci_probe(struct rte_pci_driver *pci_drv,
> > +	struct rte_pci_device *pci_dev)
> > +{
> > +	struct rte_bbdev *bbdev = NULL;
> > +	char dev_name[RTE_BBDEV_NAME_MAX_LEN];
> > +
> > +	if (pci_dev == NULL) {
> > +		rte_bbdev_log(ERR, "NULL PCI device");
> > +		return -EINVAL;
> > +	}
> > +
> > +	rte_pci_device_name(&pci_dev->addr, dev_name,
> sizeof(dev_name));
> > +
> > +	/* Allocate memory to be used privately by drivers */
> > +	bbdev = rte_bbdev_allocate(pci_dev->device.name);
> > +	if (bbdev == NULL)
> > +		return -ENODEV;
> > +
> > +	/* allocate device private memory */
> > +	bbdev->data->dev_private = rte_zmalloc_socket(dev_name,
> > +			sizeof(struct acc100_device), RTE_CACHE_LINE_SIZE,
> > +			pci_dev->device.numa_node);
> > +
> > +	if (bbdev->data->dev_private == NULL) {
> > +		rte_bbdev_log(CRIT,
> > +				"Allocate of %zu bytes for device \"%s\"
> failed",
> > +				sizeof(struct acc100_device), dev_name);
> > +				rte_bbdev_release(bbdev);
> > +			return -ENOMEM;
> > +	}
> > +
> > +	/* Fill HW specific part of device structure */
> > +	bbdev->device = &pci_dev->device;
> > +	bbdev->intr_handle = &pci_dev->intr_handle;
> > +	bbdev->data->socket_id = pci_dev->device.numa_node;
> > +
> > +	/* Invoke ACC100 device initialization function */
> > +	acc100_bbdev_init(bbdev, pci_drv);
> > +
> > +	rte_bbdev_log_debug("Initialised bbdev %s (id = %u)",
> > +			dev_name, bbdev->data->dev_id);
> > +	return 0;
> > +}
> > +
> > +static int acc100_pci_remove(struct rte_pci_device *pci_dev)
> > +{
> > +	struct rte_bbdev *bbdev;
> > +	int ret;
> > +	uint8_t dev_id;
> > +
> > +	if (pci_dev == NULL)
> > +		return -EINVAL;
> > +
> > +	/* Find device */
> > +	bbdev = rte_bbdev_get_named_dev(pci_dev->device.name);
> > +	if (bbdev == NULL) {
> > +		rte_bbdev_log(CRIT,
> > +				"Couldn't find HW dev \"%s\" to uninitialise
> it",
> > +				pci_dev->device.name);
> > +		return -ENODEV;
> > +	}
> > +	dev_id = bbdev->data->dev_id;
> > +
> > +	/* free device private memory before close */
> > +	rte_free(bbdev->data->dev_private);
> > +
> > +	/* Close device */
> > +	ret = rte_bbdev_close(dev_id);
> 
> Do you want to reorder this close before the rte_free so you could recover
> from the failure ?

Given this is done the same way for other PMDs, I would not change it here, as that would create a discrepancy.
It could be done in principle as another patch covering multiple PMDs, but really I don't see a use case for trying to fall back in case of such a speculative error.
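
The ordering under discussion, as a neutral sketch (the struct and both functions below are libc-based stand-ins, not the actual bbdev API), would close the device before releasing its private memory so a close failure could still be reported with the state intact:

```c
#include <stdlib.h>

/* Stand-in for the bbdev device structure. */
struct bbdev {
	void *dev_private;
};

/* Stand-in for rte_bbdev_close(); returns 0 on success, <0 on error. */
static int dev_close(struct bbdev *b)
{
	(void)b;
	return 0;
}

static int
pci_remove(struct bbdev *b)
{
	/* Close first ... */
	int ret = dev_close(b);

	if (ret < 0)
		return ret;	/* ... so a failure can still bail out here */

	/* Then release the private memory. */
	free(b->dev_private);
	b->dev_private = NULL;
	return 0;
}
```

Whether that recovery path is worth the churn across PMDs is exactly the trade-off debated above.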


> 
> Tom
> 

Thanks
Nic


> > +	if (ret < 0)
> > +		rte_bbdev_log(ERR,
> > +				"Device %i failed to close during uninit: %i",
> > +				dev_id, ret);
> > +
> > +	/* release bbdev from library */
> > +	rte_bbdev_release(bbdev);
> > +
> > +	rte_bbdev_log_debug("Destroyed bbdev = %u", dev_id);
> > +
> > +	return 0;
> > +}
> > +
> > +static struct rte_pci_driver acc100_pci_pf_driver = {
> > +		.probe = acc100_pci_probe,
> > +		.remove = acc100_pci_remove,
> > +		.id_table = pci_id_acc100_pf_map,
> > +		.drv_flags = RTE_PCI_DRV_NEED_MAPPING
> > +};
> > +
> > +static struct rte_pci_driver acc100_pci_vf_driver = {
> > +		.probe = acc100_pci_probe,
> > +		.remove = acc100_pci_remove,
> > +		.id_table = pci_id_acc100_vf_map,
> > +		.drv_flags = RTE_PCI_DRV_NEED_MAPPING
> > +};
> > +
> > +RTE_PMD_REGISTER_PCI(ACC100PF_DRIVER_NAME, acc100_pci_pf_driver);
> > +RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
> > +RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
> > +RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
> > +
> > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h
> > b/drivers/baseband/acc100/rte_acc100_pmd.h
> > new file mode 100644
> > index 0000000..6f46df0
> > --- /dev/null
> > +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> > @@ -0,0 +1,37 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2020 Intel Corporation
> > + */
> > +
> > +#ifndef _RTE_ACC100_PMD_H_
> > +#define _RTE_ACC100_PMD_H_
> > +
> > +/* Helper macro for logging */
> > +#define rte_bbdev_log(level, fmt, ...) \
> > +	rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \
> > +		##__VA_ARGS__)
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +#define rte_bbdev_log_debug(fmt, ...) \
> > +		rte_bbdev_log(DEBUG, "acc100_pmd: " fmt, \
> > +		##__VA_ARGS__)
> > +#else
> > +#define rte_bbdev_log_debug(fmt, ...)
> > +#endif
> > +
> > +/* ACC100 PF and VF driver names */
> > +#define ACC100PF_DRIVER_NAME           intel_acc100_pf
> > +#define ACC100VF_DRIVER_NAME           intel_acc100_vf
> > +
> > +/* ACC100 PCI vendor & device IDs */
> > +#define RTE_ACC100_VENDOR_ID           (0x8086)
> > +#define RTE_ACC100_PF_DEVICE_ID        (0x0d5c)
> > +#define RTE_ACC100_VF_DEVICE_ID        (0x0d5d)
> > +
> > +/* Private data structure for each ACC100 device */
> > +struct acc100_device {
> > +	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
> > +	bool pf_device; /**< True if this is a PF ACC100 device */
> > +	bool configured; /**< True if this ACC100 device is configured */
> > +};
> > +
> > +#endif /* _RTE_ACC100_PMD_H_ */
> > diff --git a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> > b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> > new file mode 100644
> > index 0000000..4a76d1d
> > --- /dev/null
> > +++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> > @@ -0,0 +1,3 @@
> > +DPDK_21 {
> > +	local: *;
> > +};
> > diff --git a/drivers/baseband/meson.build
> > b/drivers/baseband/meson.build index 415b672..72301ce 100644
> > --- a/drivers/baseband/meson.build
> > +++ b/drivers/baseband/meson.build
> > @@ -5,7 +5,7 @@ if is_windows
> >  	subdir_done()
> >  endif
> >
> > -drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec']
> > +drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec',
> > +'acc100']
> >
> >  config_flag_fmt = 'RTE_LIBRTE_PMD_BBDEV_@0@'
> >  driver_name_fmt = 'rte_pmd_bbdev_@0@'


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v9 02/10] baseband/acc100: add register definition file
  2020-09-29 20:34       ` Tom Rix
@ 2020-09-29 23:30         ` Chautru, Nicolas
  2020-09-30 23:11           ` Tom Rix
  0 siblings, 1 reply; 213+ messages in thread
From: Chautru, Nicolas @ 2020-09-29 23:30 UTC (permalink / raw)
  To: Tom Rix, dev, akhil.goyal
  Cc: Richardson, Bruce, Xu, Rosen, dave.burley, aidan.goddard, Yigit,
	Ferruh, Liu, Tianjiao

Hi Tom, 

> From: Tom Rix <trix@redhat.com>
> On 9/28/20 5:29 PM, Nicolas Chautru wrote:
> > Add in the list of registers for the device and related
> > HW specs definitions.
> >
> > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > Reviewed-by: Rosen Xu <rosen.xu@intel.com>
> > Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
> > ---
> >  drivers/baseband/acc100/acc100_pf_enum.h | 1068
> ++++++++++++++++++++++++++++++
> >  drivers/baseband/acc100/acc100_vf_enum.h |   73 ++
> >  drivers/baseband/acc100/rte_acc100_pmd.h |  490 ++++++++++++++
> >  3 files changed, 1631 insertions(+)
> >  create mode 100644 drivers/baseband/acc100/acc100_pf_enum.h
> >  create mode 100644 drivers/baseband/acc100/acc100_vf_enum.h
> >
> > diff --git a/drivers/baseband/acc100/acc100_pf_enum.h
> b/drivers/baseband/acc100/acc100_pf_enum.h
> > new file mode 100644
> > index 0000000..a1ee416
> > --- /dev/null
> > +++ b/drivers/baseband/acc100/acc100_pf_enum.h
> > @@ -0,0 +1,1068 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2017 Intel Corporation
> > + */
> > +
> > +#ifndef ACC100_PF_ENUM_H
> > +#define ACC100_PF_ENUM_H
> > +
> > +/*
> > + * ACC100 Register mapping on PF BAR0
> > + * This is automatically generated from RDL, format may change with new
> RDL
> > + * Release.
> > + * Variable names are as is
> > + */
> > +enum {
> > +	HWPfQmgrEgressQueuesTemplate          =  0x0007FE00,
> > +	HWPfQmgrIngressAq                     =  0x00080000,
> > +	HWPfQmgrArbQAvail                     =  0x00A00010,
> > +	HWPfQmgrArbQBlock                     =  0x00A00014,
> > +	HWPfQmgrAqueueDropNotifEn             =  0x00A00024,
> > +	HWPfQmgrAqueueDisableNotifEn          =  0x00A00028,
> > +	HWPfQmgrSoftReset                     =  0x00A00038,
> > +	HWPfQmgrInitStatus                    =  0x00A0003C,
> > +	HWPfQmgrAramWatchdogCount             =  0x00A00040,
> > +	HWPfQmgrAramWatchdogCounterEn         =  0x00A00044,
> > +	HWPfQmgrAxiWatchdogCount              =  0x00A00048,
> > +	HWPfQmgrAxiWatchdogCounterEn          =  0x00A0004C,
> > +	HWPfQmgrProcessWatchdogCount          =  0x00A00050,
> > +	HWPfQmgrProcessWatchdogCounterEn      =  0x00A00054,
> > +	HWPfQmgrProcessUl4GWatchdogCounter    =  0x00A00058,
> > +	HWPfQmgrProcessDl4GWatchdogCounter    =  0x00A0005C,
> > +	HWPfQmgrProcessUl5GWatchdogCounter    =  0x00A00060,
> > +	HWPfQmgrProcessDl5GWatchdogCounter    =  0x00A00064,
> > +	HWPfQmgrProcessMldWatchdogCounter     =  0x00A00068,
> > +	HWPfQmgrMsiOverflowUpperVf            =  0x00A00070,
> > +	HWPfQmgrMsiOverflowLowerVf            =  0x00A00074,
> > +	HWPfQmgrMsiWatchdogOverflow           =  0x00A00078,
> > +	HWPfQmgrMsiOverflowEnable             =  0x00A0007C,
> > +	HWPfQmgrDebugAqPointerMemGrp          =  0x00A00100,
> > +	HWPfQmgrDebugOutputArbQFifoGrp        =  0x00A00140,
> > +	HWPfQmgrDebugMsiFifoGrp               =  0x00A00180,
> > +	HWPfQmgrDebugAxiWdTimeoutMsiFifo      =  0x00A001C0,
> > +	HWPfQmgrDebugProcessWdTimeoutMsiFifo  =  0x00A001C4,
> > +	HWPfQmgrDepthLog2Grp                  =  0x00A00200,
> > +	HWPfQmgrTholdGrp                      =  0x00A00300,
> > +	HWPfQmgrGrpTmplateReg0Indx            =  0x00A00600,
> > +	HWPfQmgrGrpTmplateReg1Indx            =  0x00A00680,
> > +	HWPfQmgrGrpTmplateReg2indx            =  0x00A00700,
> > +	HWPfQmgrGrpTmplateReg3Indx            =  0x00A00780,
> > +	HWPfQmgrGrpTmplateReg4Indx            =  0x00A00800,
> > +	HWPfQmgrVfBaseAddr                    =  0x00A01000,
> > +	HWPfQmgrUl4GWeightRrVf                =  0x00A02000,
> > +	HWPfQmgrDl4GWeightRrVf                =  0x00A02100,
> > +	HWPfQmgrUl5GWeightRrVf                =  0x00A02200,
> > +	HWPfQmgrDl5GWeightRrVf                =  0x00A02300,
> > +	HWPfQmgrMldWeightRrVf                 =  0x00A02400,
> > +	HWPfQmgrArbQDepthGrp                  =  0x00A02F00,
> > +	HWPfQmgrGrpFunction0                  =  0x00A02F40,
> > +	HWPfQmgrGrpFunction1                  =  0x00A02F44,
> > +	HWPfQmgrGrpPriority                   =  0x00A02F48,
> > +	HWPfQmgrWeightSync                    =  0x00A03000,
> > +	HWPfQmgrAqEnableVf                    =  0x00A10000,
> > +	HWPfQmgrAqResetVf                     =  0x00A20000,
> > +	HWPfQmgrRingSizeVf                    =  0x00A20004,
> > +	HWPfQmgrGrpDepthLog20Vf               =  0x00A20008,
> > +	HWPfQmgrGrpDepthLog21Vf               =  0x00A2000C,
> > +	HWPfQmgrGrpFunction0Vf                =  0x00A20010,
> > +	HWPfQmgrGrpFunction1Vf                =  0x00A20014,
> > +	HWPfDmaConfig0Reg                     =  0x00B80000,
> > +	HWPfDmaConfig1Reg                     =  0x00B80004,
> > +	HWPfDmaQmgrAddrReg                    =  0x00B80008,
> > +	HWPfDmaSoftResetReg                   =  0x00B8000C,
> > +	HWPfDmaAxcacheReg                     =  0x00B80010,
> > +	HWPfDmaVersionReg                     =  0x00B80014,
> > +	HWPfDmaFrameThreshold                 =  0x00B80018,
> > +	HWPfDmaTimestampLo                    =  0x00B8001C,
> > +	HWPfDmaTimestampHi                    =  0x00B80020,
> > +	HWPfDmaAxiStatus                      =  0x00B80028,
> > +	HWPfDmaAxiControl                     =  0x00B8002C,
> > +	HWPfDmaNoQmgr                         =  0x00B80030,
> > +	HWPfDmaQosScale                       =  0x00B80034,
> > +	HWPfDmaQmanen                         =  0x00B80040,
> > +	HWPfDmaQmgrQosBase                    =  0x00B80060,
> > +	HWPfDmaFecClkGatingEnable             =  0x00B80080,
> > +	HWPfDmaPmEnable                       =  0x00B80084,
> > +	HWPfDmaQosEnable                      =  0x00B80088,
> > +	HWPfDmaHarqWeightedRrFrameThreshold   =  0x00B800B0,
> > +	HWPfDmaDataSmallWeightedRrFrameThresh =  0x00B800B4,
> > +	HWPfDmaDataLargeWeightedRrFrameThresh =  0x00B800B8,
> > +	HWPfDmaInboundCbMaxSize               =  0x00B800BC,
> > +	HWPfDmaInboundDrainDataSize           =  0x00B800C0,
> > +	HWPfDmaVfDdrBaseRw                    =  0x00B80400,
> > +	HWPfDmaCmplTmOutCnt                   =  0x00B80800,
> > +	HWPfDmaProcTmOutCnt                   =  0x00B80804,
> > +	HWPfDmaStatusRrespBresp               =  0x00B80810,
> > +	HWPfDmaCfgRrespBresp                  =  0x00B80814,
> > +	HWPfDmaStatusMemParErr                =  0x00B80818,
> > +	HWPfDmaCfgMemParErrEn                 =  0x00B8081C,
> > +	HWPfDmaStatusDmaHwErr                 =  0x00B80820,
> > +	HWPfDmaCfgDmaHwErrEn                  =  0x00B80824,
> > +	HWPfDmaStatusFecCoreErr               =  0x00B80828,
> > +	HWPfDmaCfgFecCoreErrEn                =  0x00B8082C,
> > +	HWPfDmaStatusFcwDescrErr              =  0x00B80830,
> > +	HWPfDmaCfgFcwDescrErrEn               =  0x00B80834,
> > +	HWPfDmaStatusBlockTransmit            =  0x00B80838,
> > +	HWPfDmaBlockOnErrEn                   =  0x00B8083C,
> > +	HWPfDmaStatusFlushDma                 =  0x00B80840,
> > +	HWPfDmaFlushDmaOnErrEn                =  0x00B80844,
> > +	HWPfDmaStatusSdoneFifoFull            =  0x00B80848,
> > +	HWPfDmaStatusDescriptorErrLoVf        =  0x00B8084C,
> > +	HWPfDmaStatusDescriptorErrHiVf        =  0x00B80850,
> > +	HWPfDmaStatusFcwErrLoVf               =  0x00B80854,
> > +	HWPfDmaStatusFcwErrHiVf               =  0x00B80858,
> > +	HWPfDmaStatusDataErrLoVf              =  0x00B8085C,
> > +	HWPfDmaStatusDataErrHiVf              =  0x00B80860,
> > +	HWPfDmaCfgMsiEnSoftwareErr            =  0x00B80864,
> > +	HWPfDmaDescriptorSignatuture          =  0x00B80868,
> > +	HWPfDmaFcwSignature                   =  0x00B8086C,
> > +	HWPfDmaErrorDetectionEn               =  0x00B80870,
> > +	HWPfDmaErrCntrlFifoDebug              =  0x00B8087C,
> > +	HWPfDmaStatusToutData                 =  0x00B80880,
> > +	HWPfDmaStatusToutDesc                 =  0x00B80884,
> > +	HWPfDmaStatusToutUnexpData            =  0x00B80888,
> > +	HWPfDmaStatusToutUnexpDesc            =  0x00B8088C,
> > +	HWPfDmaStatusToutProcess              =  0x00B80890,
> > +	HWPfDmaConfigCtoutOutDataEn           =  0x00B808A0,
> > +	HWPfDmaConfigCtoutOutDescrEn          =  0x00B808A4,
> > +	HWPfDmaConfigUnexpComplDataEn         =  0x00B808A8,
> > +	HWPfDmaConfigUnexpComplDescrEn        =  0x00B808AC,
> > +	HWPfDmaConfigPtoutOutEn               =  0x00B808B0,
> > +	HWPfDmaFec5GulDescBaseLoRegVf         =  0x00B88020,
> > +	HWPfDmaFec5GulDescBaseHiRegVf         =  0x00B88024,
> > +	HWPfDmaFec5GulRespPtrLoRegVf          =  0x00B88028,
> > +	HWPfDmaFec5GulRespPtrHiRegVf          =  0x00B8802C,
> > +	HWPfDmaFec5GdlDescBaseLoRegVf         =  0x00B88040,
> > +	HWPfDmaFec5GdlDescBaseHiRegVf         =  0x00B88044,
> > +	HWPfDmaFec5GdlRespPtrLoRegVf          =  0x00B88048,
> > +	HWPfDmaFec5GdlRespPtrHiRegVf          =  0x00B8804C,
> > +	HWPfDmaFec4GulDescBaseLoRegVf         =  0x00B88060,
> > +	HWPfDmaFec4GulDescBaseHiRegVf         =  0x00B88064,
> > +	HWPfDmaFec4GulRespPtrLoRegVf          =  0x00B88068,
> > +	HWPfDmaFec4GulRespPtrHiRegVf          =  0x00B8806C,
> > +	HWPfDmaFec4GdlDescBaseLoRegVf         =  0x00B88080,
> > +	HWPfDmaFec4GdlDescBaseHiRegVf         =  0x00B88084,
> > +	HWPfDmaFec4GdlRespPtrLoRegVf          =  0x00B88088,
> > +	HWPfDmaFec4GdlRespPtrHiRegVf          =  0x00B8808C,
> > +	HWPfDmaVfDdrBaseRangeRo               =  0x00B880A0,
> > +	HWPfQosmonACntrlReg                   =  0x00B90000,
> > +	HWPfQosmonAEvalOverflow0              =  0x00B90008,
> > +	HWPfQosmonAEvalOverflow1              =  0x00B9000C,
> > +	HWPfQosmonADivTerm                    =  0x00B90010,
> > +	HWPfQosmonATickTerm                   =  0x00B90014,
> > +	HWPfQosmonAEvalTerm                   =  0x00B90018,
> > +	HWPfQosmonAAveTerm                    =  0x00B9001C,
> > +	HWPfQosmonAForceEccErr                =  0x00B90020,
> > +	HWPfQosmonAEccErrDetect               =  0x00B90024,
> > +	HWPfQosmonAIterationConfig0Low        =  0x00B90060,
> > +	HWPfQosmonAIterationConfig0High       =  0x00B90064,
> > +	HWPfQosmonAIterationConfig1Low        =  0x00B90068,
> > +	HWPfQosmonAIterationConfig1High       =  0x00B9006C,
> > +	HWPfQosmonAIterationConfig2Low        =  0x00B90070,
> > +	HWPfQosmonAIterationConfig2High       =  0x00B90074,
> > +	HWPfQosmonAIterationConfig3Low        =  0x00B90078,
> > +	HWPfQosmonAIterationConfig3High       =  0x00B9007C,
> > +	HWPfQosmonAEvalMemAddr                =  0x00B90080,
> > +	HWPfQosmonAEvalMemData                =  0x00B90084,
> > +	HWPfQosmonAXaction                    =  0x00B900C0,
> > +	HWPfQosmonARemThres1Vf                =  0x00B90400,
> > +	HWPfQosmonAThres2Vf                   =  0x00B90404,
> > +	HWPfQosmonAWeiFracVf                  =  0x00B90408,
> > +	HWPfQosmonARrWeiVf                    =  0x00B9040C,
> > +	HWPfPermonACntrlRegVf                 =  0x00B98000,
> > +	HWPfPermonACountVf                    =  0x00B98008,
> > +	HWPfPermonAKCntLoVf                   =  0x00B98010,
> > +	HWPfPermonAKCntHiVf                   =  0x00B98014,
> > +	HWPfPermonADeltaCntLoVf               =  0x00B98020,
> > +	HWPfPermonADeltaCntHiVf               =  0x00B98024,
> > +	HWPfPermonAVersionReg                 =  0x00B9C000,
> > +	HWPfPermonACbControlFec               =  0x00B9C0F0,
> > +	HWPfPermonADltTimerLoFec              =  0x00B9C0F4,
> > +	HWPfPermonADltTimerHiFec              =  0x00B9C0F8,
> > +	HWPfPermonACbCountFec                 =  0x00B9C100,
> > +	HWPfPermonAAccExecTimerLoFec          =  0x00B9C104,
> > +	HWPfPermonAAccExecTimerHiFec          =  0x00B9C108,
> > +	HWPfPermonAExecTimerMinFec            =  0x00B9C200,
> > +	HWPfPermonAExecTimerMaxFec            =  0x00B9C204,
> > +	HWPfPermonAControlBusMon              =  0x00B9C400,
> > +	HWPfPermonAConfigBusMon               =  0x00B9C404,
> > +	HWPfPermonASkipCountBusMon            =  0x00B9C408,
> > +	HWPfPermonAMinLatBusMon               =  0x00B9C40C,
> > +	HWPfPermonAMaxLatBusMon               =  0x00B9C500,
> > +	HWPfPermonATotalLatLowBusMon          =  0x00B9C504,
> > +	HWPfPermonATotalLatUpperBusMon        =  0x00B9C508,
> > +	HWPfPermonATotalReqCntBusMon          =  0x00B9C50C,
> > +	HWPfQosmonBCntrlReg                   =  0x00BA0000,
> > +	HWPfQosmonBEvalOverflow0              =  0x00BA0008,
> > +	HWPfQosmonBEvalOverflow1              =  0x00BA000C,
> > +	HWPfQosmonBDivTerm                    =  0x00BA0010,
> > +	HWPfQosmonBTickTerm                   =  0x00BA0014,
> > +	HWPfQosmonBEvalTerm                   =  0x00BA0018,
> > +	HWPfQosmonBAveTerm                    =  0x00BA001C,
> > +	HWPfQosmonBForceEccErr                =  0x00BA0020,
> > +	HWPfQosmonBEccErrDetect               =  0x00BA0024,
> > +	HWPfQosmonBIterationConfig0Low        =  0x00BA0060,
> > +	HWPfQosmonBIterationConfig0High       =  0x00BA0064,
> > +	HWPfQosmonBIterationConfig1Low        =  0x00BA0068,
> > +	HWPfQosmonBIterationConfig1High       =  0x00BA006C,
> > +	HWPfQosmonBIterationConfig2Low        =  0x00BA0070,
> > +	HWPfQosmonBIterationConfig2High       =  0x00BA0074,
> > +	HWPfQosmonBIterationConfig3Low        =  0x00BA0078,
> > +	HWPfQosmonBIterationConfig3High       =  0x00BA007C,
> > +	HWPfQosmonBEvalMemAddr                =  0x00BA0080,
> > +	HWPfQosmonBEvalMemData                =  0x00BA0084,
> > +	HWPfQosmonBXaction                    =  0x00BA00C0,
> > +	HWPfQosmonBRemThres1Vf                =  0x00BA0400,
> > +	HWPfQosmonBThres2Vf                   =  0x00BA0404,
> > +	HWPfQosmonBWeiFracVf                  =  0x00BA0408,
> > +	HWPfQosmonBRrWeiVf                    =  0x00BA040C,
> > +	HWPfPermonBCntrlRegVf                 =  0x00BA8000,
> > +	HWPfPermonBCountVf                    =  0x00BA8008,
> > +	HWPfPermonBKCntLoVf                   =  0x00BA8010,
> > +	HWPfPermonBKCntHiVf                   =  0x00BA8014,
> > +	HWPfPermonBDeltaCntLoVf               =  0x00BA8020,
> > +	HWPfPermonBDeltaCntHiVf               =  0x00BA8024,
> > +	HWPfPermonBVersionReg                 =  0x00BAC000,
> > +	HWPfPermonBCbControlFec               =  0x00BAC0F0,
> > +	HWPfPermonBDltTimerLoFec              =  0x00BAC0F4,
> > +	HWPfPermonBDltTimerHiFec              =  0x00BAC0F8,
> > +	HWPfPermonBCbCountFec                 =  0x00BAC100,
> > +	HWPfPermonBAccExecTimerLoFec          =  0x00BAC104,
> > +	HWPfPermonBAccExecTimerHiFec          =  0x00BAC108,
> > +	HWPfPermonBExecTimerMinFec            =  0x00BAC200,
> > +	HWPfPermonBExecTimerMaxFec            =  0x00BAC204,
> > +	HWPfPermonBControlBusMon              =  0x00BAC400,
> > +	HWPfPermonBConfigBusMon               =  0x00BAC404,
> > +	HWPfPermonBSkipCountBusMon            =  0x00BAC408,
> > +	HWPfPermonBMinLatBusMon               =  0x00BAC40C,
> > +	HWPfPermonBMaxLatBusMon               =  0x00BAC500,
> > +	HWPfPermonBTotalLatLowBusMon          =  0x00BAC504,
> > +	HWPfPermonBTotalLatUpperBusMon        =  0x00BAC508,
> > +	HWPfPermonBTotalReqCntBusMon          =  0x00BAC50C,
> > +	HWPfFecUl5gCntrlReg                   =  0x00BC0000,
> > +	HWPfFecUl5gI2MThreshReg               =  0x00BC0004,
> > +	HWPfFecUl5gVersionReg                 =  0x00BC0100,
> > +	HWPfFecUl5gFcwStatusReg               =  0x00BC0104,
> > +	HWPfFecUl5gWarnReg                    =  0x00BC0108,
> > +	HwPfFecUl5gIbDebugReg                 =  0x00BC0200,
> > +	HwPfFecUl5gObLlrDebugReg              =  0x00BC0204,
> > +	HwPfFecUl5gObHarqDebugReg             =  0x00BC0208,
> > +	HwPfFecUl5g1CntrlReg                  =  0x00BC1000,
> > +	HwPfFecUl5g1I2MThreshReg              =  0x00BC1004,
> > +	HwPfFecUl5g1VersionReg                =  0x00BC1100,
> > +	HwPfFecUl5g1FcwStatusReg              =  0x00BC1104,
> > +	HwPfFecUl5g1WarnReg                   =  0x00BC1108,
> > +	HwPfFecUl5g1IbDebugReg                =  0x00BC1200,
> > +	HwPfFecUl5g1ObLlrDebugReg             =  0x00BC1204,
> > +	HwPfFecUl5g1ObHarqDebugReg            =  0x00BC1208,
> > +	HwPfFecUl5g2CntrlReg                  =  0x00BC2000,
> > +	HwPfFecUl5g2I2MThreshReg              =  0x00BC2004,
> > +	HwPfFecUl5g2VersionReg                =  0x00BC2100,
> > +	HwPfFecUl5g2FcwStatusReg              =  0x00BC2104,
> > +	HwPfFecUl5g2WarnReg                   =  0x00BC2108,
> > +	HwPfFecUl5g2IbDebugReg                =  0x00BC2200,
> > +	HwPfFecUl5g2ObLlrDebugReg             =  0x00BC2204,
> > +	HwPfFecUl5g2ObHarqDebugReg            =  0x00BC2208,
> > +	HwPfFecUl5g3CntrlReg                  =  0x00BC3000,
> > +	HwPfFecUl5g3I2MThreshReg              =  0x00BC3004,
> > +	HwPfFecUl5g3VersionReg                =  0x00BC3100,
> > +	HwPfFecUl5g3FcwStatusReg              =  0x00BC3104,
> > +	HwPfFecUl5g3WarnReg                   =  0x00BC3108,
> > +	HwPfFecUl5g3IbDebugReg                =  0x00BC3200,
> > +	HwPfFecUl5g3ObLlrDebugReg             =  0x00BC3204,
> > +	HwPfFecUl5g3ObHarqDebugReg            =  0x00BC3208,
> > +	HwPfFecUl5g4CntrlReg                  =  0x00BC4000,
> > +	HwPfFecUl5g4I2MThreshReg              =  0x00BC4004,
> > +	HwPfFecUl5g4VersionReg                =  0x00BC4100,
> > +	HwPfFecUl5g4FcwStatusReg              =  0x00BC4104,
> > +	HwPfFecUl5g4WarnReg                   =  0x00BC4108,
> > +	HwPfFecUl5g4IbDebugReg                =  0x00BC4200,
> > +	HwPfFecUl5g4ObLlrDebugReg             =  0x00BC4204,
> > +	HwPfFecUl5g4ObHarqDebugReg            =  0x00BC4208,
> > +	HwPfFecUl5g5CntrlReg                  =  0x00BC5000,
> > +	HwPfFecUl5g5I2MThreshReg              =  0x00BC5004,
> > +	HwPfFecUl5g5VersionReg                =  0x00BC5100,
> > +	HwPfFecUl5g5FcwStatusReg              =  0x00BC5104,
> > +	HwPfFecUl5g5WarnReg                   =  0x00BC5108,
> > +	HwPfFecUl5g5IbDebugReg                =  0x00BC5200,
> > +	HwPfFecUl5g5ObLlrDebugReg             =  0x00BC5204,
> > +	HwPfFecUl5g5ObHarqDebugReg            =  0x00BC5208,
> > +	HwPfFecUl5g6CntrlReg                  =  0x00BC6000,
> > +	HwPfFecUl5g6I2MThreshReg              =  0x00BC6004,
> > +	HwPfFecUl5g6VersionReg                =  0x00BC6100,
> > +	HwPfFecUl5g6FcwStatusReg              =  0x00BC6104,
> > +	HwPfFecUl5g6WarnReg                   =  0x00BC6108,
> > +	HwPfFecUl5g6IbDebugReg                =  0x00BC6200,
> > +	HwPfFecUl5g6ObLlrDebugReg             =  0x00BC6204,
> > +	HwPfFecUl5g6ObHarqDebugReg            =  0x00BC6208,
> > +	HwPfFecUl5g7CntrlReg                  =  0x00BC7000,
> > +	HwPfFecUl5g7I2MThreshReg              =  0x00BC7004,
> > +	HwPfFecUl5g7VersionReg                =  0x00BC7100,
> > +	HwPfFecUl5g7FcwStatusReg              =  0x00BC7104,
> > +	HwPfFecUl5g7WarnReg                   =  0x00BC7108,
> > +	HwPfFecUl5g7IbDebugReg                =  0x00BC7200,
> > +	HwPfFecUl5g7ObLlrDebugReg             =  0x00BC7204,
> > +	HwPfFecUl5g7ObHarqDebugReg            =  0x00BC7208,
> > +	HwPfFecUl5g8CntrlReg                  =  0x00BC8000,
> > +	HwPfFecUl5g8I2MThreshReg              =  0x00BC8004,
> > +	HwPfFecUl5g8VersionReg                =  0x00BC8100,
> > +	HwPfFecUl5g8FcwStatusReg              =  0x00BC8104,
> > +	HwPfFecUl5g8WarnReg                   =  0x00BC8108,
> > +	HwPfFecUl5g8IbDebugReg                =  0x00BC8200,
> > +	HwPfFecUl5g8ObLlrDebugReg             =  0x00BC8204,
> > +	HwPfFecUl5g8ObHarqDebugReg            =  0x00BC8208,
> > +	HWPfFecDl5gCntrlReg                   =  0x00BCF000,
> > +	HWPfFecDl5gI2MThreshReg               =  0x00BCF004,
> > +	HWPfFecDl5gVersionReg                 =  0x00BCF100,
> > +	HWPfFecDl5gFcwStatusReg               =  0x00BCF104,
> > +	HWPfFecDl5gWarnReg                    =  0x00BCF108,
> > +	HWPfFecUlVersionReg                   =  0x00BD0000,
> > +	HWPfFecUlControlReg                   =  0x00BD0004,
> > +	HWPfFecUlStatusReg                    =  0x00BD0008,
> > +	HWPfFecDlVersionReg                   =  0x00BDF000,
> > +	HWPfFecDlClusterConfigReg             =  0x00BDF004,
> > +	HWPfFecDlBurstThres                   =  0x00BDF00C,
> > +	HWPfFecDlClusterStatusReg0            =  0x00BDF040,
> > +	HWPfFecDlClusterStatusReg1            =  0x00BDF044,
> > +	HWPfFecDlClusterStatusReg2            =  0x00BDF048,
> > +	HWPfFecDlClusterStatusReg3            =  0x00BDF04C,
> > +	HWPfFecDlClusterStatusReg4            =  0x00BDF050,
> > +	HWPfFecDlClusterStatusReg5            =  0x00BDF054,
> > +	HWPfChaFabPllPllrst                   =  0x00C40000,
> > +	HWPfChaFabPllClk0                     =  0x00C40004,
> > +	HWPfChaFabPllClk1                     =  0x00C40008,
> > +	HWPfChaFabPllBwadj                    =  0x00C4000C,
> > +	HWPfChaFabPllLbw                      =  0x00C40010,
> > +	HWPfChaFabPllResetq                   =  0x00C40014,
> > +	HWPfChaFabPllPhshft0                  =  0x00C40018,
> > +	HWPfChaFabPllPhshft1                  =  0x00C4001C,
> > +	HWPfChaFabPllDivq0                    =  0x00C40020,
> > +	HWPfChaFabPllDivq1                    =  0x00C40024,
> > +	HWPfChaFabPllDivq2                    =  0x00C40028,
> > +	HWPfChaFabPllDivq3                    =  0x00C4002C,
> > +	HWPfChaFabPllDivq4                    =  0x00C40030,
> > +	HWPfChaFabPllDivq5                    =  0x00C40034,
> > +	HWPfChaFabPllDivq6                    =  0x00C40038,
> > +	HWPfChaFabPllDivq7                    =  0x00C4003C,
> > +	HWPfChaDl5gPllPllrst                  =  0x00C40080,
> > +	HWPfChaDl5gPllClk0                    =  0x00C40084,
> > +	HWPfChaDl5gPllClk1                    =  0x00C40088,
> > +	HWPfChaDl5gPllBwadj                   =  0x00C4008C,
> > +	HWPfChaDl5gPllLbw                     =  0x00C40090,
> > +	HWPfChaDl5gPllResetq                  =  0x00C40094,
> > +	HWPfChaDl5gPllPhshft0                 =  0x00C40098,
> > +	HWPfChaDl5gPllPhshft1                 =  0x00C4009C,
> > +	HWPfChaDl5gPllDivq0                   =  0x00C400A0,
> > +	HWPfChaDl5gPllDivq1                   =  0x00C400A4,
> > +	HWPfChaDl5gPllDivq2                   =  0x00C400A8,
> > +	HWPfChaDl5gPllDivq3                   =  0x00C400AC,
> > +	HWPfChaDl5gPllDivq4                   =  0x00C400B0,
> > +	HWPfChaDl5gPllDivq5                   =  0x00C400B4,
> > +	HWPfChaDl5gPllDivq6                   =  0x00C400B8,
> > +	HWPfChaDl5gPllDivq7                   =  0x00C400BC,
> > +	HWPfChaDl4gPllPllrst                  =  0x00C40100,
> > +	HWPfChaDl4gPllClk0                    =  0x00C40104,
> > +	HWPfChaDl4gPllClk1                    =  0x00C40108,
> > +	HWPfChaDl4gPllBwadj                   =  0x00C4010C,
> > +	HWPfChaDl4gPllLbw                     =  0x00C40110,
> > +	HWPfChaDl4gPllResetq                  =  0x00C40114,
> > +	HWPfChaDl4gPllPhshft0                 =  0x00C40118,
> > +	HWPfChaDl4gPllPhshft1                 =  0x00C4011C,
> > +	HWPfChaDl4gPllDivq0                   =  0x00C40120,
> > +	HWPfChaDl4gPllDivq1                   =  0x00C40124,
> > +	HWPfChaDl4gPllDivq2                   =  0x00C40128,
> > +	HWPfChaDl4gPllDivq3                   =  0x00C4012C,
> > +	HWPfChaDl4gPllDivq4                   =  0x00C40130,
> > +	HWPfChaDl4gPllDivq5                   =  0x00C40134,
> > +	HWPfChaDl4gPllDivq6                   =  0x00C40138,
> > +	HWPfChaDl4gPllDivq7                   =  0x00C4013C,
> > +	HWPfChaUl5gPllPllrst                  =  0x00C40180,
> > +	HWPfChaUl5gPllClk0                    =  0x00C40184,
> > +	HWPfChaUl5gPllClk1                    =  0x00C40188,
> > +	HWPfChaUl5gPllBwadj                   =  0x00C4018C,
> > +	HWPfChaUl5gPllLbw                     =  0x00C40190,
> > +	HWPfChaUl5gPllResetq                  =  0x00C40194,
> > +	HWPfChaUl5gPllPhshft0                 =  0x00C40198,
> > +	HWPfChaUl5gPllPhshft1                 =  0x00C4019C,
> > +	HWPfChaUl5gPllDivq0                   =  0x00C401A0,
> > +	HWPfChaUl5gPllDivq1                   =  0x00C401A4,
> > +	HWPfChaUl5gPllDivq2                   =  0x00C401A8,
> > +	HWPfChaUl5gPllDivq3                   =  0x00C401AC,
> > +	HWPfChaUl5gPllDivq4                   =  0x00C401B0,
> > +	HWPfChaUl5gPllDivq5                   =  0x00C401B4,
> > +	HWPfChaUl5gPllDivq6                   =  0x00C401B8,
> > +	HWPfChaUl5gPllDivq7                   =  0x00C401BC,
> > +	HWPfChaUl4gPllPllrst                  =  0x00C40200,
> > +	HWPfChaUl4gPllClk0                    =  0x00C40204,
> > +	HWPfChaUl4gPllClk1                    =  0x00C40208,
> > +	HWPfChaUl4gPllBwadj                   =  0x00C4020C,
> > +	HWPfChaUl4gPllLbw                     =  0x00C40210,
> > +	HWPfChaUl4gPllResetq                  =  0x00C40214,
> > +	HWPfChaUl4gPllPhshft0                 =  0x00C40218,
> > +	HWPfChaUl4gPllPhshft1                 =  0x00C4021C,
> > +	HWPfChaUl4gPllDivq0                   =  0x00C40220,
> > +	HWPfChaUl4gPllDivq1                   =  0x00C40224,
> > +	HWPfChaUl4gPllDivq2                   =  0x00C40228,
> > +	HWPfChaUl4gPllDivq3                   =  0x00C4022C,
> > +	HWPfChaUl4gPllDivq4                   =  0x00C40230,
> > +	HWPfChaUl4gPllDivq5                   =  0x00C40234,
> > +	HWPfChaUl4gPllDivq6                   =  0x00C40238,
> > +	HWPfChaUl4gPllDivq7                   =  0x00C4023C,
> > +	HWPfChaDdrPllPllrst                   =  0x00C40280,
> > +	HWPfChaDdrPllClk0                     =  0x00C40284,
> > +	HWPfChaDdrPllClk1                     =  0x00C40288,
> > +	HWPfChaDdrPllBwadj                    =  0x00C4028C,
> > +	HWPfChaDdrPllLbw                      =  0x00C40290,
> > +	HWPfChaDdrPllResetq                   =  0x00C40294,
> > +	HWPfChaDdrPllPhshft0                  =  0x00C40298,
> > +	HWPfChaDdrPllPhshft1                  =  0x00C4029C,
> > +	HWPfChaDdrPllDivq0                    =  0x00C402A0,
> > +	HWPfChaDdrPllDivq1                    =  0x00C402A4,
> > +	HWPfChaDdrPllDivq2                    =  0x00C402A8,
> > +	HWPfChaDdrPllDivq3                    =  0x00C402AC,
> > +	HWPfChaDdrPllDivq4                    =  0x00C402B0,
> > +	HWPfChaDdrPllDivq5                    =  0x00C402B4,
> > +	HWPfChaDdrPllDivq6                    =  0x00C402B8,
> > +	HWPfChaDdrPllDivq7                    =  0x00C402BC,
> > +	HWPfChaErrStatus                      =  0x00C40400,
> > +	HWPfChaErrMask                        =  0x00C40404,
> > +	HWPfChaDebugPcieMsiFifo               =  0x00C40410,
> > +	HWPfChaDebugDdrMsiFifo                =  0x00C40414,
> > +	HWPfChaDebugMiscMsiFifo               =  0x00C40418,
> > +	HWPfChaPwmSet                         =  0x00C40420,
> > +	HWPfChaDdrRstStatus                   =  0x00C40430,
> > +	HWPfChaDdrStDoneStatus                =  0x00C40434,
> > +	HWPfChaDdrWbRstCfg                    =  0x00C40438,
> > +	HWPfChaDdrApbRstCfg                   =  0x00C4043C,
> > +	HWPfChaDdrPhyRstCfg                   =  0x00C40440,
> > +	HWPfChaDdrCpuRstCfg                   =  0x00C40444,
> > +	HWPfChaDdrSifRstCfg                   =  0x00C40448,
> > +	HWPfChaPadcfgPcomp0                   =  0x00C41000,
> > +	HWPfChaPadcfgNcomp0                   =  0x00C41004,
> > +	HWPfChaPadcfgOdt0                     =  0x00C41008,
> > +	HWPfChaPadcfgProtect0                 =  0x00C4100C,
> > +	HWPfChaPreemphasisProtect0            =  0x00C41010,
> > +	HWPfChaPreemphasisCompen0             =  0x00C41040,
> > +	HWPfChaPreemphasisOdten0              =  0x00C41044,
> > +	HWPfChaPadcfgPcomp1                   =  0x00C41100,
> > +	HWPfChaPadcfgNcomp1                   =  0x00C41104,
> > +	HWPfChaPadcfgOdt1                     =  0x00C41108,
> > +	HWPfChaPadcfgProtect1                 =  0x00C4110C,
> > +	HWPfChaPreemphasisProtect1            =  0x00C41110,
> > +	HWPfChaPreemphasisCompen1             =  0x00C41140,
> > +	HWPfChaPreemphasisOdten1              =  0x00C41144,
> > +	HWPfChaPadcfgPcomp2                   =  0x00C41200,
> > +	HWPfChaPadcfgNcomp2                   =  0x00C41204,
> > +	HWPfChaPadcfgOdt2                     =  0x00C41208,
> > +	HWPfChaPadcfgProtect2                 =  0x00C4120C,
> > +	HWPfChaPreemphasisProtect2            =  0x00C41210,
> > +	HWPfChaPreemphasisCompen2             =  0x00C41240,
> > +	HWPfChaPreemphasisOdten2              =  0x00C41244,
> > +	HWPfChaPreemphasisOdten4              =  0x00C41444,
> > +	HWPfChaPadcfgPcomp3                   =  0x00C41300,
> > +	HWPfChaPadcfgNcomp3                   =  0x00C41304,
> > +	HWPfChaPadcfgOdt3                     =  0x00C41308,
> > +	HWPfChaPadcfgProtect3                 =  0x00C4130C,
> > +	HWPfChaPreemphasisProtect3            =  0x00C41310,
> > +	HWPfChaPreemphasisCompen3             =  0x00C41340,
> > +	HWPfChaPreemphasisOdten3              =  0x00C41344,
> > +	HWPfChaPadcfgPcomp4                   =  0x00C41400,
> > +	HWPfChaPadcfgNcomp4                   =  0x00C41404,
> > +	HWPfChaPadcfgOdt4                     =  0x00C41408,
> > +	HWPfChaPadcfgProtect4                 =  0x00C4140C,
> > +	HWPfChaPreemphasisProtect4            =  0x00C41410,
> > +	HWPfChaPreemphasisCompen4             =  0x00C41440,
> > +	HWPfHiVfToPfDbellVf                   =  0x00C80000,
> > +	HWPfHiPfToVfDbellVf                   =  0x00C80008,
> > +	HWPfHiInfoRingBaseLoVf                =  0x00C80010,
> > +	HWPfHiInfoRingBaseHiVf                =  0x00C80014,
> > +	HWPfHiInfoRingPointerVf               =  0x00C80018,
> > +	HWPfHiInfoRingIntWrEnVf               =  0x00C80020,
> > +	HWPfHiInfoRingPf2VfWrEnVf             =  0x00C80024,
> > +	HWPfHiMsixVectorMapperVf              =  0x00C80060,
> > +	HWPfHiModuleVersionReg                =  0x00C84000,
> > +	HWPfHiIosf2axiErrLogReg               =  0x00C84004,
> > +	HWPfHiHardResetReg                    =  0x00C84008,
> > +	HWPfHi5GHardResetReg                  =  0x00C8400C,
> > +	HWPfHiInfoRingBaseLoRegPf             =  0x00C84010,
> > +	HWPfHiInfoRingBaseHiRegPf             =  0x00C84014,
> > +	HWPfHiInfoRingPointerRegPf            =  0x00C84018,
> > +	HWPfHiInfoRingIntWrEnRegPf            =  0x00C84020,
> > +	HWPfHiInfoRingVf2pfLoWrEnReg          =  0x00C84024,
> > +	HWPfHiInfoRingVf2pfHiWrEnReg          =  0x00C84028,
> > +	HWPfHiLogParityErrStatusReg           =  0x00C8402C,
> > +	HWPfHiLogDataParityErrorVfStatusLo    =  0x00C84030,
> > +	HWPfHiLogDataParityErrorVfStatusHi    =  0x00C84034,
> > +	HWPfHiBlockTransmitOnErrorEn          =  0x00C84038,
> > +	HWPfHiCfgMsiIntWrEnRegPf              =  0x00C84040,
> > +	HWPfHiCfgMsiVf2pfLoWrEnReg            =  0x00C84044,
> > +	HWPfHiCfgMsiVf2pfHighWrEnReg          =  0x00C84048,
> > +	HWPfHiMsixVectorMapperPf              =  0x00C84060,
> > +	HWPfHiApbWrWaitTime                   =  0x00C84100,
> > +	HWPfHiXCounterMaxValue                =  0x00C84104,
> > +	HWPfHiPfMode                          =  0x00C84108,
> > +	HWPfHiClkGateHystReg                  =  0x00C8410C,
> > +	HWPfHiSnoopBitsReg                    =  0x00C84110,
> > +	HWPfHiMsiDropEnableReg                =  0x00C84114,
> > +	HWPfHiMsiStatReg                      =  0x00C84120,
> > +	HWPfHiFifoOflStatReg                  =  0x00C84124,
> > +	HWPfHiHiDebugReg                      =  0x00C841F4,
> > +	HWPfHiDebugMemSnoopMsiFifo            =  0x00C841F8,
> > +	HWPfHiDebugMemSnoopInputFifo          =  0x00C841FC,
> > +	HWPfHiMsixMappingConfig               =  0x00C84200,
> > +	HWPfHiJunkReg                         =  0x00C8FF00,
> > +	HWPfDdrUmmcVer                        =  0x00D00000,
> > +	HWPfDdrUmmcCap                        =  0x00D00010,
> > +	HWPfDdrUmmcCtrl                       =  0x00D00020,
> > +	HWPfDdrMpcPe                          =  0x00D00080,
> > +	HWPfDdrMpcPpri3                       =  0x00D00090,
> > +	HWPfDdrMpcPpri2                       =  0x00D000A0,
> > +	HWPfDdrMpcPpri1                       =  0x00D000B0,
> > +	HWPfDdrMpcPpri0                       =  0x00D000C0,
> > +	HWPfDdrMpcPrwgrpCtrl                  =  0x00D000D0,
> > +	HWPfDdrMpcPbw7                        =  0x00D000E0,
> > +	HWPfDdrMpcPbw6                        =  0x00D000F0,
> > +	HWPfDdrMpcPbw5                        =  0x00D00100,
> > +	HWPfDdrMpcPbw4                        =  0x00D00110,
> > +	HWPfDdrMpcPbw3                        =  0x00D00120,
> > +	HWPfDdrMpcPbw2                        =  0x00D00130,
> > +	HWPfDdrMpcPbw1                        =  0x00D00140,
> > +	HWPfDdrMpcPbw0                        =  0x00D00150,
> > +	HWPfDdrMemoryInit                     =  0x00D00200,
> > +	HWPfDdrMemoryInitDone                 =  0x00D00210,
> > +	HWPfDdrMemInitPhyTrng0                =  0x00D00240,
> > +	HWPfDdrMemInitPhyTrng1                =  0x00D00250,
> > +	HWPfDdrMemInitPhyTrng2                =  0x00D00260,
> > +	HWPfDdrMemInitPhyTrng3                =  0x00D00270,
> > +	HWPfDdrBcDram                         =  0x00D003C0,
> > +	HWPfDdrBcAddrMap                      =  0x00D003D0,
> > +	HWPfDdrBcRef                          =  0x00D003E0,
> > +	HWPfDdrBcTim0                         =  0x00D00400,
> > +	HWPfDdrBcTim1                         =  0x00D00410,
> > +	HWPfDdrBcTim2                         =  0x00D00420,
> > +	HWPfDdrBcTim3                         =  0x00D00430,
> > +	HWPfDdrBcTim4                         =  0x00D00440,
> > +	HWPfDdrBcTim5                         =  0x00D00450,
> > +	HWPfDdrBcTim6                         =  0x00D00460,
> > +	HWPfDdrBcTim7                         =  0x00D00470,
> > +	HWPfDdrBcTim8                         =  0x00D00480,
> > +	HWPfDdrBcTim9                         =  0x00D00490,
> > +	HWPfDdrBcTim10                        =  0x00D004A0,
> > +	HWPfDdrBcTim12                        =  0x00D004C0,
> > +	HWPfDdrDfiInit                        =  0x00D004D0,
> > +	HWPfDdrDfiInitComplete                =  0x00D004E0,
> > +	HWPfDdrDfiTim0                        =  0x00D004F0,
> > +	HWPfDdrDfiTim1                        =  0x00D00500,
> > +	HWPfDdrDfiPhyUpdEn                    =  0x00D00530,
> > +	HWPfDdrMemStatus                      =  0x00D00540,
> > +	HWPfDdrUmmcErrStatus                  =  0x00D00550,
> > +	HWPfDdrUmmcIntStatus                  =  0x00D00560,
> > +	HWPfDdrUmmcIntEn                      =  0x00D00570,
> > +	HWPfDdrPhyRdLatency                   =  0x00D48400,
> > +	HWPfDdrPhyRdLatencyDbi                =  0x00D48410,
> > +	HWPfDdrPhyWrLatency                   =  0x00D48420,
> > +	HWPfDdrPhyTrngType                    =  0x00D48430,
> > +	HWPfDdrPhyMrsTiming2                  =  0x00D48440,
> > +	HWPfDdrPhyMrsTiming0                  =  0x00D48450,
> > +	HWPfDdrPhyMrsTiming1                  =  0x00D48460,
> > +	HWPfDdrPhyDramTmrd                    =  0x00D48470,
> > +	HWPfDdrPhyDramTmod                    =  0x00D48480,
> > +	HWPfDdrPhyDramTwpre                   =  0x00D48490,
> > +	HWPfDdrPhyDramTrfc                    =  0x00D484A0,
> > +	HWPfDdrPhyDramTrwtp                   =  0x00D484B0,
> > +	HWPfDdrPhyMr01Dimm                    =  0x00D484C0,
> > +	HWPfDdrPhyMr01DimmDbi                 =  0x00D484D0,
> > +	HWPfDdrPhyMr23Dimm                    =  0x00D484E0,
> > +	HWPfDdrPhyMr45Dimm                    =  0x00D484F0,
> > +	HWPfDdrPhyMr67Dimm                    =  0x00D48500,
> > +	HWPfDdrPhyWrlvlWwRdlvlRr              =  0x00D48510,
> > +	HWPfDdrPhyOdtEn                       =  0x00D48520,
> > +	HWPfDdrPhyFastTrng                    =  0x00D48530,
> > +	HWPfDdrPhyDynTrngGap                  =  0x00D48540,
> > +	HWPfDdrPhyDynRcalGap                  =  0x00D48550,
> > +	HWPfDdrPhyIdletimeout                 =  0x00D48560,
> > +	HWPfDdrPhyRstCkeGap                   =  0x00D48570,
> > +	HWPfDdrPhyCkeMrsGap                   =  0x00D48580,
> > +	HWPfDdrPhyMemVrefMidVal               =  0x00D48590,
> > +	HWPfDdrPhyVrefStep                    =  0x00D485A0,
> > +	HWPfDdrPhyVrefThreshold               =  0x00D485B0,
> > +	HWPfDdrPhyPhyVrefMidVal               =  0x00D485C0,
> > +	HWPfDdrPhyDqsCountMax                 =  0x00D485D0,
> > +	HWPfDdrPhyDqsCountNum                 =  0x00D485E0,
> > +	HWPfDdrPhyDramRow                     =  0x00D485F0,
> > +	HWPfDdrPhyDramCol                     =  0x00D48600,
> > +	HWPfDdrPhyDramBgBa                    =  0x00D48610,
> > +	HWPfDdrPhyDynamicUpdreqrel            =  0x00D48620,
> > +	HWPfDdrPhyVrefLimits                  =  0x00D48630,
> > +	HWPfDdrPhyIdtmTcStatus                =  0x00D6C020,
> > +	HWPfDdrPhyIdtmFwVersion               =  0x00D6C410,
> > +	HWPfDdrPhyRdlvlGateInitDelay          =  0x00D70000,
> > +	HWPfDdrPhyRdenSmplabc                 =  0x00D70008,
> > +	HWPfDdrPhyVrefNibble0                 =  0x00D7000C,
> > +	HWPfDdrPhyVrefNibble1                 =  0x00D70010,
> > +	HWPfDdrPhyRdlvlGateDqsSmpl0           =  0x00D70014,
> > +	HWPfDdrPhyRdlvlGateDqsSmpl1           =  0x00D70018,
> > +	HWPfDdrPhyRdlvlGateDqsSmpl2           =  0x00D7001C,
> > +	HWPfDdrPhyDqsCount                    =  0x00D70020,
> > +	HWPfDdrPhyWrlvlRdlvlGateStatus        =  0x00D70024,
> > +	HWPfDdrPhyErrorFlags                  =  0x00D70028,
> > +	HWPfDdrPhyPowerDown                   =  0x00D70030,
> > +	HWPfDdrPhyPrbsSeedByte0               =  0x00D70034,
> > +	HWPfDdrPhyPrbsSeedByte1               =  0x00D70038,
> > +	HWPfDdrPhyPcompDq                     =  0x00D70040,
> > +	HWPfDdrPhyNcompDq                     =  0x00D70044,
> > +	HWPfDdrPhyPcompDqs                    =  0x00D70048,
> > +	HWPfDdrPhyNcompDqs                    =  0x00D7004C,
> > +	HWPfDdrPhyPcompCmd                    =  0x00D70050,
> > +	HWPfDdrPhyNcompCmd                    =  0x00D70054,
> > +	HWPfDdrPhyPcompCk                     =  0x00D70058,
> > +	HWPfDdrPhyNcompCk                     =  0x00D7005C,
> > +	HWPfDdrPhyRcalOdtDq                   =  0x00D70060,
> > +	HWPfDdrPhyRcalOdtDqs                  =  0x00D70064,
> > +	HWPfDdrPhyRcalMask1                   =  0x00D70068,
> > +	HWPfDdrPhyRcalMask2                   =  0x00D7006C,
> > +	HWPfDdrPhyRcalCtrl                    =  0x00D70070,
> > +	HWPfDdrPhyRcalCnt                     =  0x00D70074,
> > +	HWPfDdrPhyRcalOverride                =  0x00D70078,
> > +	HWPfDdrPhyRcalGateen                  =  0x00D7007C,
> > +	HWPfDdrPhyCtrl                        =  0x00D70080,
> > +	HWPfDdrPhyWrlvlAlg                    =  0x00D70084,
> > +	HWPfDdrPhyRcalVreftTxcmdOdt           =  0x00D70088,
> > +	HWPfDdrPhyRdlvlGateParam              =  0x00D7008C,
> > +	HWPfDdrPhyRdlvlGateParam2             =  0x00D70090,
> > +	HWPfDdrPhyRcalVreftTxdata             =  0x00D70094,
> > +	HWPfDdrPhyCmdIntDelay                 =  0x00D700A4,
> > +	HWPfDdrPhyAlertN                      =  0x00D700A8,
> > +	HWPfDdrPhyTrngReqWpre2tck             =  0x00D700AC,
> > +	HWPfDdrPhyCmdPhaseSel                 =  0x00D700B4,
> > +	HWPfDdrPhyCmdDcdl                     =  0x00D700B8,
> > +	HWPfDdrPhyCkDcdl                      =  0x00D700BC,
> > +	HWPfDdrPhySwTrngCtrl1                 =  0x00D700C0,
> > +	HWPfDdrPhySwTrngCtrl2                 =  0x00D700C4,
> > +	HWPfDdrPhyRcalPcompRden               =  0x00D700C8,
> > +	HWPfDdrPhyRcalNcompRden               =  0x00D700CC,
> > +	HWPfDdrPhyRcalCompen                  =  0x00D700D0,
> > +	HWPfDdrPhySwTrngRdqs                  =  0x00D700D4,
> > +	HWPfDdrPhySwTrngWdqs                  =  0x00D700D8,
> > +	HWPfDdrPhySwTrngRdena                 =  0x00D700DC,
> > +	HWPfDdrPhySwTrngRdenb                 =  0x00D700E0,
> > +	HWPfDdrPhySwTrngRdenc                 =  0x00D700E4,
> > +	HWPfDdrPhySwTrngWdq                   =  0x00D700E8,
> > +	HWPfDdrPhySwTrngRdq                   =  0x00D700EC,
> > +	HWPfDdrPhyPcfgHmValue                 =  0x00D700F0,
> > +	HWPfDdrPhyPcfgTimerValue              =  0x00D700F4,
> > +	HWPfDdrPhyPcfgSoftwareTraining        =  0x00D700F8,
> > +	HWPfDdrPhyPcfgMcStatus                =  0x00D700FC,
> > +	HWPfDdrPhyWrlvlPhRank0                =  0x00D70100,
> > +	HWPfDdrPhyRdenPhRank0                 =  0x00D70104,
> > +	HWPfDdrPhyRdenIntRank0                =  0x00D70108,
> > +	HWPfDdrPhyRdqsDcdlRank0               =  0x00D7010C,
> > +	HWPfDdrPhyRdqsShadowDcdlRank0         =  0x00D70110,
> > +	HWPfDdrPhyWdqsDcdlRank0               =  0x00D70114,
> > +	HWPfDdrPhyWdmDcdlShadowRank0          =  0x00D70118,
> > +	HWPfDdrPhyWdmDcdlRank0                =  0x00D7011C,
> > +	HWPfDdrPhyDbiDcdlRank0                =  0x00D70120,
> > +	HWPfDdrPhyRdenDcdlaRank0              =  0x00D70124,
> > +	HWPfDdrPhyDbiDcdlShadowRank0          =  0x00D70128,
> > +	HWPfDdrPhyRdenDcdlbRank0              =  0x00D7012C,
> > +	HWPfDdrPhyWdqsShadowDcdlRank0         =  0x00D70130,
> > +	HWPfDdrPhyRdenDcdlcRank0              =  0x00D70134,
> > +	HWPfDdrPhyRdenShadowDcdlaRank0        =  0x00D70138,
> > +	HWPfDdrPhyWrlvlIntRank0               =  0x00D7013C,
> > +	HWPfDdrPhyRdqDcdlBit0Rank0            =  0x00D70200,
> > +	HWPfDdrPhyRdqDcdlShadowBit0Rank0      =  0x00D70204,
> > +	HWPfDdrPhyWdqDcdlBit0Rank0            =  0x00D70208,
> > +	HWPfDdrPhyWdqDcdlShadowBit0Rank0      =  0x00D7020C,
> > +	HWPfDdrPhyRdqDcdlBit1Rank0            =  0x00D70240,
> > +	HWPfDdrPhyRdqDcdlShadowBit1Rank0      =  0x00D70244,
> > +	HWPfDdrPhyWdqDcdlBit1Rank0            =  0x00D70248,
> > +	HWPfDdrPhyWdqDcdlShadowBit1Rank0      =  0x00D7024C,
> > +	HWPfDdrPhyRdqDcdlBit2Rank0            =  0x00D70280,
> > +	HWPfDdrPhyRdqDcdlShadowBit2Rank0      =  0x00D70284,
> > +	HWPfDdrPhyWdqDcdlBit2Rank0            =  0x00D70288,
> > +	HWPfDdrPhyWdqDcdlShadowBit2Rank0      =  0x00D7028C,
> > +	HWPfDdrPhyRdqDcdlBit3Rank0            =  0x00D702C0,
> > +	HWPfDdrPhyRdqDcdlShadowBit3Rank0      =  0x00D702C4,
> > +	HWPfDdrPhyWdqDcdlBit3Rank0            =  0x00D702C8,
> > +	HWPfDdrPhyWdqDcdlShadowBit3Rank0      =  0x00D702CC,
> > +	HWPfDdrPhyRdqDcdlBit4Rank0            =  0x00D70300,
> > +	HWPfDdrPhyRdqDcdlShadowBit4Rank0      =  0x00D70304,
> > +	HWPfDdrPhyWdqDcdlBit4Rank0            =  0x00D70308,
> > +	HWPfDdrPhyWdqDcdlShadowBit4Rank0      =  0x00D7030C,
> > +	HWPfDdrPhyRdqDcdlBit5Rank0            =  0x00D70340,
> > +	HWPfDdrPhyRdqDcdlShadowBit5Rank0      =  0x00D70344,
> > +	HWPfDdrPhyWdqDcdlBit5Rank0            =  0x00D70348,
> > +	HWPfDdrPhyWdqDcdlShadowBit5Rank0      =  0x00D7034C,
> > +	HWPfDdrPhyRdqDcdlBit6Rank0            =  0x00D70380,
> > +	HWPfDdrPhyRdqDcdlShadowBit6Rank0      =  0x00D70384,
> > +	HWPfDdrPhyWdqDcdlBit6Rank0            =  0x00D70388,
> > +	HWPfDdrPhyWdqDcdlShadowBit6Rank0      =  0x00D7038C,
> > +	HWPfDdrPhyRdqDcdlBit7Rank0            =  0x00D703C0,
> > +	HWPfDdrPhyRdqDcdlShadowBit7Rank0      =  0x00D703C4,
> > +	HWPfDdrPhyWdqDcdlBit7Rank0            =  0x00D703C8,
> > +	HWPfDdrPhyWdqDcdlShadowBit7Rank0      =  0x00D703CC,
> > +	HWPfDdrPhyIdtmStatus                  =  0x00D740D0,
> > +	HWPfDdrPhyIdtmError                   =  0x00D74110,
> > +	HWPfDdrPhyIdtmDebug                   =  0x00D74120,
> > +	HWPfDdrPhyIdtmDebugInt                =  0x00D74130,
> > +	HwPfPcieLnAsicCfgovr                  =  0x00D80000,
> > +	HwPfPcieLnAclkmixer                   =  0x00D80004,
> > +	HwPfPcieLnTxrampfreq                  =  0x00D80008,
> > +	HwPfPcieLnLanetest                    =  0x00D8000C,
> > +	HwPfPcieLnDcctrl                      =  0x00D80010,
> > +	HwPfPcieLnDccmeas                     =  0x00D80014,
> > +	HwPfPcieLnDccovrAclk                  =  0x00D80018,
> > +	HwPfPcieLnDccovrTxa                   =  0x00D8001C,
> > +	HwPfPcieLnDccovrTxk                   =  0x00D80020,
> > +	HwPfPcieLnDccovrDclk                  =  0x00D80024,
> > +	HwPfPcieLnDccovrEclk                  =  0x00D80028,
> > +	HwPfPcieLnDcctrimAclk                 =  0x00D8002C,
> > +	HwPfPcieLnDcctrimTx                   =  0x00D80030,
> > +	HwPfPcieLnDcctrimDclk                 =  0x00D80034,
> > +	HwPfPcieLnDcctrimEclk                 =  0x00D80038,
> > +	HwPfPcieLnQuadCtrl                    =  0x00D8003C,
> > +	HwPfPcieLnQuadCorrIndex               =  0x00D80040,
> > +	HwPfPcieLnQuadCorrStatus              =  0x00D80044,
> > +	HwPfPcieLnAsicRxovr1                  =  0x00D80048,
> > +	HwPfPcieLnAsicRxovr2                  =  0x00D8004C,
> > +	HwPfPcieLnAsicEqinfovr                =  0x00D80050,
> > +	HwPfPcieLnRxcsr                       =  0x00D80054,
> > +	HwPfPcieLnRxfectrl                    =  0x00D80058,
> > +	HwPfPcieLnRxtest                      =  0x00D8005C,
> > +	HwPfPcieLnEscount                     =  0x00D80060,
> > +	HwPfPcieLnCdrctrl                     =  0x00D80064,
> > +	HwPfPcieLnCdrctrl2                    =  0x00D80068,
> > +	HwPfPcieLnCdrcfg0Ctrl0                =  0x00D8006C,
> > +	HwPfPcieLnCdrcfg0Ctrl1                =  0x00D80070,
> > +	HwPfPcieLnCdrcfg0Ctrl2                =  0x00D80074,
> > +	HwPfPcieLnCdrcfg1Ctrl0                =  0x00D80078,
> > +	HwPfPcieLnCdrcfg1Ctrl1                =  0x00D8007C,
> > +	HwPfPcieLnCdrcfg1Ctrl2                =  0x00D80080,
> > +	HwPfPcieLnCdrcfg2Ctrl0                =  0x00D80084,
> > +	HwPfPcieLnCdrcfg2Ctrl1                =  0x00D80088,
> > +	HwPfPcieLnCdrcfg2Ctrl2                =  0x00D8008C,
> > +	HwPfPcieLnCdrcfg3Ctrl0                =  0x00D80090,
> > +	HwPfPcieLnCdrcfg3Ctrl1                =  0x00D80094,
> > +	HwPfPcieLnCdrcfg3Ctrl2                =  0x00D80098,
> > +	HwPfPcieLnCdrphase                    =  0x00D8009C,
> > +	HwPfPcieLnCdrfreq                     =  0x00D800A0,
> > +	HwPfPcieLnCdrstatusPhase              =  0x00D800A4,
> > +	HwPfPcieLnCdrstatusFreq               =  0x00D800A8,
> > +	HwPfPcieLnCdroffset                   =  0x00D800AC,
> > +	HwPfPcieLnRxvosctl                    =  0x00D800B0,
> > +	HwPfPcieLnRxvosctl2                   =  0x00D800B4,
> > +	HwPfPcieLnRxlosctl                    =  0x00D800B8,
> > +	HwPfPcieLnRxlos                       =  0x00D800BC,
> > +	HwPfPcieLnRxlosvval                   =  0x00D800C0,
> > +	HwPfPcieLnRxvosd0                     =  0x00D800C4,
> > +	HwPfPcieLnRxvosd1                     =  0x00D800C8,
> > +	HwPfPcieLnRxvosep0                    =  0x00D800CC,
> > +	HwPfPcieLnRxvosep1                    =  0x00D800D0,
> > +	HwPfPcieLnRxvosen0                    =  0x00D800D4,
> > +	HwPfPcieLnRxvosen1                    =  0x00D800D8,
> > +	HwPfPcieLnRxvosafe                    =  0x00D800DC,
> > +	HwPfPcieLnRxvosa0                     =  0x00D800E0,
> > +	HwPfPcieLnRxvosa0Out                  =  0x00D800E4,
> > +	HwPfPcieLnRxvosa1                     =  0x00D800E8,
> > +	HwPfPcieLnRxvosa1Out                  =  0x00D800EC,
> > +	HwPfPcieLnRxmisc                      =  0x00D800F0,
> > +	HwPfPcieLnRxbeacon                    =  0x00D800F4,
> > +	HwPfPcieLnRxdssout                    =  0x00D800F8,
> > +	HwPfPcieLnRxdssout2                   =  0x00D800FC,
> > +	HwPfPcieLnAlphapctrl                  =  0x00D80100,
> > +	HwPfPcieLnAlphanctrl                  =  0x00D80104,
> > +	HwPfPcieLnAdaptctrl                   =  0x00D80108,
> > +	HwPfPcieLnAdaptctrl1                  =  0x00D8010C,
> > +	HwPfPcieLnAdaptstatus                 =  0x00D80110,
> > +	HwPfPcieLnAdaptvga1                   =  0x00D80114,
> > +	HwPfPcieLnAdaptvga2                   =  0x00D80118,
> > +	HwPfPcieLnAdaptvga3                   =  0x00D8011C,
> > +	HwPfPcieLnAdaptvga4                   =  0x00D80120,
> > +	HwPfPcieLnAdaptboost1                 =  0x00D80124,
> > +	HwPfPcieLnAdaptboost2                 =  0x00D80128,
> > +	HwPfPcieLnAdaptboost3                 =  0x00D8012C,
> > +	HwPfPcieLnAdaptboost4                 =  0x00D80130,
> > +	HwPfPcieLnAdaptsslms1                 =  0x00D80134,
> > +	HwPfPcieLnAdaptsslms2                 =  0x00D80138,
> > +	HwPfPcieLnAdaptvgaStatus              =  0x00D8013C,
> > +	HwPfPcieLnAdaptboostStatus            =  0x00D80140,
> > +	HwPfPcieLnAdaptsslmsStatus1           =  0x00D80144,
> > +	HwPfPcieLnAdaptsslmsStatus2           =  0x00D80148,
> > +	HwPfPcieLnAfectrl1                    =  0x00D8014C,
> > +	HwPfPcieLnAfectrl2                    =  0x00D80150,
> > +	HwPfPcieLnAfectrl3                    =  0x00D80154,
> > +	HwPfPcieLnAfedefault1                 =  0x00D80158,
> > +	HwPfPcieLnAfedefault2                 =  0x00D8015C,
> > +	HwPfPcieLnDfectrl1                    =  0x00D80160,
> > +	HwPfPcieLnDfectrl2                    =  0x00D80164,
> > +	HwPfPcieLnDfectrl3                    =  0x00D80168,
> > +	HwPfPcieLnDfectrl4                    =  0x00D8016C,
> > +	HwPfPcieLnDfectrl5                    =  0x00D80170,
> > +	HwPfPcieLnDfectrl6                    =  0x00D80174,
> > +	HwPfPcieLnAfestatus1                  =  0x00D80178,
> > +	HwPfPcieLnAfestatus2                  =  0x00D8017C,
> > +	HwPfPcieLnDfestatus1                  =  0x00D80180,
> > +	HwPfPcieLnDfestatus2                  =  0x00D80184,
> > +	HwPfPcieLnDfestatus3                  =  0x00D80188,
> > +	HwPfPcieLnDfestatus4                  =  0x00D8018C,
> > +	HwPfPcieLnDfestatus5                  =  0x00D80190,
> > +	HwPfPcieLnAlphastatus                 =  0x00D80194,
> > +	HwPfPcieLnFomctrl1                    =  0x00D80198,
> > +	HwPfPcieLnFomctrl2                    =  0x00D8019C,
> > +	HwPfPcieLnFomctrl3                    =  0x00D801A0,
> > +	HwPfPcieLnAclkcalStatus               =  0x00D801A4,
> > +	HwPfPcieLnOffscorrStatus              =  0x00D801A8,
> > +	HwPfPcieLnEyewidthStatus              =  0x00D801AC,
> > +	HwPfPcieLnEyeheightStatus             =  0x00D801B0,
> > +	HwPfPcieLnAsicTxovr1                  =  0x00D801B4,
> > +	HwPfPcieLnAsicTxovr2                  =  0x00D801B8,
> > +	HwPfPcieLnAsicTxovr3                  =  0x00D801BC,
> > +	HwPfPcieLnTxbiasadjOvr                =  0x00D801C0,
> > +	HwPfPcieLnTxcsr                       =  0x00D801C4,
> > +	HwPfPcieLnTxtest                      =  0x00D801C8,
> > +	HwPfPcieLnTxtestword                  =  0x00D801CC,
> > +	HwPfPcieLnTxtestwordHigh              =  0x00D801D0,
> > +	HwPfPcieLnTxdrive                     =  0x00D801D4,
> > +	HwPfPcieLnMtcsLn                      =  0x00D801D8,
> > +	HwPfPcieLnStatsumLn                   =  0x00D801DC,
> > +	HwPfPcieLnRcbusScratch                =  0x00D801E0,
> > +	HwPfPcieLnRcbusMinorrev               =  0x00D801F0,
> > +	HwPfPcieLnRcbusMajorrev               =  0x00D801F4,
> > +	HwPfPcieLnRcbusBlocktype              =  0x00D801F8,
> > +	HwPfPcieSupPllcsr                     =  0x00D80800,
> > +	HwPfPcieSupPlldiv                     =  0x00D80804,
> > +	HwPfPcieSupPllcal                     =  0x00D80808,
> > +	HwPfPcieSupPllcalsts                  =  0x00D8080C,
> > +	HwPfPcieSupPllmeas                    =  0x00D80810,
> > +	HwPfPcieSupPlldactrim                 =  0x00D80814,
> > +	HwPfPcieSupPllbiastrim                =  0x00D80818,
> > +	HwPfPcieSupPllbwtrim                  =  0x00D8081C,
> > +	HwPfPcieSupPllcaldly                  =  0x00D80820,
> > +	HwPfPcieSupRefclkonpclkctrl           =  0x00D80824,
> > +	HwPfPcieSupPclkdelay                  =  0x00D80828,
> > +	HwPfPcieSupPhyconfig                  =  0x00D8082C,
> > +	HwPfPcieSupRcalIntf                   =  0x00D80830,
> > +	HwPfPcieSupAuxcsr                     =  0x00D80834,
> > +	HwPfPcieSupVref                       =  0x00D80838,
> > +	HwPfPcieSupLinkmode                   =  0x00D8083C,
> > +	HwPfPcieSupRrefcalctl                 =  0x00D80840,
> > +	HwPfPcieSupRrefcal                    =  0x00D80844,
> > +	HwPfPcieSupRrefcaldly                 =  0x00D80848,
> > +	HwPfPcieSupTximpcalctl                =  0x00D8084C,
> > +	HwPfPcieSupTximpcal                   =  0x00D80850,
> > +	HwPfPcieSupTximpoffset                =  0x00D80854,
> > +	HwPfPcieSupTximpcaldly                =  0x00D80858,
> > +	HwPfPcieSupRximpcalctl                =  0x00D8085C,
> > +	HwPfPcieSupRximpcal                   =  0x00D80860,
> > +	HwPfPcieSupRximpoffset                =  0x00D80864,
> > +	HwPfPcieSupRximpcaldly                =  0x00D80868,
> > +	HwPfPcieSupFence                      =  0x00D8086C,
> > +	HwPfPcieSupMtcs                       =  0x00D80870,
> > +	HwPfPcieSupStatsum                    =  0x00D809B8,
> > +	HwPfPciePcsDpStatus0                  =  0x00D81000,
> > +	HwPfPciePcsDpControl0                 =  0x00D81004,
> > +	HwPfPciePcsPmaStatusLane0             =  0x00D81008,
> > +	HwPfPciePcsPipeStatusLane0            =  0x00D8100C,
> > +	HwPfPciePcsTxdeemph0Lane0             =  0x00D81010,
> > +	HwPfPciePcsTxdeemph1Lane0             =  0x00D81014,
> > +	HwPfPciePcsInternalStatusLane0        =  0x00D81018,
> > +	HwPfPciePcsDpStatus1                  =  0x00D8101C,
> > +	HwPfPciePcsDpControl1                 =  0x00D81020,
> > +	HwPfPciePcsPmaStatusLane1             =  0x00D81024,
> > +	HwPfPciePcsPipeStatusLane1            =  0x00D81028,
> > +	HwPfPciePcsTxdeemph0Lane1             =  0x00D8102C,
> > +	HwPfPciePcsTxdeemph1Lane1             =  0x00D81030,
> > +	HwPfPciePcsInternalStatusLane1        =  0x00D81034,
> > +	HwPfPciePcsDpStatus2                  =  0x00D81038,
> > +	HwPfPciePcsDpControl2                 =  0x00D8103C,
> > +	HwPfPciePcsPmaStatusLane2             =  0x00D81040,
> > +	HwPfPciePcsPipeStatusLane2            =  0x00D81044,
> > +	HwPfPciePcsTxdeemph0Lane2             =  0x00D81048,
> > +	HwPfPciePcsTxdeemph1Lane2             =  0x00D8104C,
> > +	HwPfPciePcsInternalStatusLane2        =  0x00D81050,
> > +	HwPfPciePcsDpStatus3                  =  0x00D81054,
> > +	HwPfPciePcsDpControl3                 =  0x00D81058,
> > +	HwPfPciePcsPmaStatusLane3             =  0x00D8105C,
> > +	HwPfPciePcsPipeStatusLane3            =  0x00D81060,
> > +	HwPfPciePcsTxdeemph0Lane3             =  0x00D81064,
> > +	HwPfPciePcsTxdeemph1Lane3             =  0x00D81068,
> > +	HwPfPciePcsInternalStatusLane3        =  0x00D8106C,
> > +	HwPfPciePcsEbStatus0                  =  0x00D81070,
> > +	HwPfPciePcsEbStatus1                  =  0x00D81074,
> > +	HwPfPciePcsEbStatus2                  =  0x00D81078,
> > +	HwPfPciePcsEbStatus3                  =  0x00D8107C,
> > +	HwPfPciePcsPllSettingPcieG1           =  0x00D81088,
> > +	HwPfPciePcsPllSettingPcieG2           =  0x00D8108C,
> > +	HwPfPciePcsPllSettingPcieG3           =  0x00D81090,
> > +	HwPfPciePcsControl                    =  0x00D81094,
> > +	HwPfPciePcsEqControl                  =  0x00D81098,
> > +	HwPfPciePcsEqTimer                    =  0x00D8109C,
> > +	HwPfPciePcsEqErrStatus                =  0x00D810A0,
> > +	HwPfPciePcsEqErrCount                 =  0x00D810A4,
> > +	HwPfPciePcsStatus                     =  0x00D810A8,
> > +	HwPfPciePcsMiscRegister               =  0x00D810AC,
> > +	HwPfPciePcsObsControl                 =  0x00D810B0,
> > +	HwPfPciePcsPrbsCount0                 =  0x00D81200,
> > +	HwPfPciePcsBistControl0               =  0x00D81204,
> > +	HwPfPciePcsBistStaticWord00           =  0x00D81208,
> > +	HwPfPciePcsBistStaticWord10           =  0x00D8120C,
> > +	HwPfPciePcsBistStaticWord20           =  0x00D81210,
> > +	HwPfPciePcsBistStaticWord30           =  0x00D81214,
> > +	HwPfPciePcsPrbsCount1                 =  0x00D81220,
> > +	HwPfPciePcsBistControl1               =  0x00D81224,
> > +	HwPfPciePcsBistStaticWord01           =  0x00D81228,
> > +	HwPfPciePcsBistStaticWord11           =  0x00D8122C,
> > +	HwPfPciePcsBistStaticWord21           =  0x00D81230,
> > +	HwPfPciePcsBistStaticWord31           =  0x00D81234,
> > +	HwPfPciePcsPrbsCount2                 =  0x00D81240,
> > +	HwPfPciePcsBistControl2               =  0x00D81244,
> > +	HwPfPciePcsBistStaticWord02           =  0x00D81248,
> > +	HwPfPciePcsBistStaticWord12           =  0x00D8124C,
> > +	HwPfPciePcsBistStaticWord22           =  0x00D81250,
> > +	HwPfPciePcsBistStaticWord32           =  0x00D81254,
> > +	HwPfPciePcsPrbsCount3                 =  0x00D81260,
> > +	HwPfPciePcsBistControl3               =  0x00D81264,
> > +	HwPfPciePcsBistStaticWord03           =  0x00D81268,
> > +	HwPfPciePcsBistStaticWord13           =  0x00D8126C,
> > +	HwPfPciePcsBistStaticWord23           =  0x00D81270,
> > +	HwPfPciePcsBistStaticWord33           =  0x00D81274,
> > +	HwPfPcieGpexLtssmStateCntrl           =  0x00D90400,
> > +	HwPfPcieGpexLtssmStateStatus          =  0x00D90404,
> > +	HwPfPcieGpexSkipFreqTimer             =  0x00D90408,
> > +	HwPfPcieGpexLaneSelect                =  0x00D9040C,
> > +	HwPfPcieGpexLaneDeskew                =  0x00D90410,
> > +	HwPfPcieGpexRxErrorStatus             =  0x00D90414,
> > +	HwPfPcieGpexLaneNumControl            =  0x00D90418,
> > +	HwPfPcieGpexNFstControl               =  0x00D9041C,
> > +	HwPfPcieGpexLinkStatus                =  0x00D90420,
> > +	HwPfPcieGpexAckReplayTimeout          =  0x00D90438,
> > +	HwPfPcieGpexSeqNumberStatus           =  0x00D9043C,
> > +	HwPfPcieGpexCoreClkRatio              =  0x00D90440,
> > +	HwPfPcieGpexDllTholdControl           =  0x00D90448,
> > +	HwPfPcieGpexPmTimer                   =  0x00D90450,
> > +	HwPfPcieGpexPmeTimeout                =  0x00D90454,
> > +	HwPfPcieGpexAspmL1Timer               =  0x00D90458,
> > +	HwPfPcieGpexAspmReqTimer              =  0x00D9045C,
> > +	HwPfPcieGpexAspmL1Dis                 =  0x00D90460,
> > +	HwPfPcieGpexAdvisoryErrorControl      =  0x00D90468,
> > +	HwPfPcieGpexId                        =  0x00D90470,
> > +	HwPfPcieGpexClasscode                 =  0x00D90474,
> > +	HwPfPcieGpexSubsystemId               =  0x00D90478,
> > +	HwPfPcieGpexDeviceCapabilities        =  0x00D9047C,
> > +	HwPfPcieGpexLinkCapabilities          =  0x00D90480,
> > +	HwPfPcieGpexFunctionNumber            =  0x00D90484,
> > +	HwPfPcieGpexPmCapabilities            =  0x00D90488,
> > +	HwPfPcieGpexFunctionSelect            =  0x00D9048C,
> > +	HwPfPcieGpexErrorCounter              =  0x00D904AC,
> > +	HwPfPcieGpexConfigReady               =  0x00D904B0,
> > +	HwPfPcieGpexFcUpdateTimeout           =  0x00D904B8,
> > +	HwPfPcieGpexFcUpdateTimer             =  0x00D904BC,
> > +	HwPfPcieGpexVcBufferLoad              =  0x00D904C8,
> > +	HwPfPcieGpexVcBufferSizeThold         =  0x00D904CC,
> > +	HwPfPcieGpexVcBufferSelect            =  0x00D904D0,
> > +	HwPfPcieGpexBarEnable                 =  0x00D904D4,
> > +	HwPfPcieGpexBarDwordLower             =  0x00D904D8,
> > +	HwPfPcieGpexBarDwordUpper             =  0x00D904DC,
> > +	HwPfPcieGpexBarSelect                 =  0x00D904E0,
> > +	HwPfPcieGpexCreditCounterSelect       =  0x00D904E4,
> > +	HwPfPcieGpexCreditCounterStatus       =  0x00D904E8,
> > +	HwPfPcieGpexTlpHeaderSelect           =  0x00D904EC,
> > +	HwPfPcieGpexTlpHeaderDword0           =  0x00D904F0,
> > +	HwPfPcieGpexTlpHeaderDword1           =  0x00D904F4,
> > +	HwPfPcieGpexTlpHeaderDword2           =  0x00D904F8,
> > +	HwPfPcieGpexTlpHeaderDword3           =  0x00D904FC,
> > +	HwPfPcieGpexRelaxOrderControl         =  0x00D90500,
> > +	HwPfPcieGpexBarPrefetch               =  0x00D90504,
> > +	HwPfPcieGpexFcCheckControl            =  0x00D90508,
> > +	HwPfPcieGpexFcUpdateTimerTraffic      =  0x00D90518,
> > +	HwPfPcieGpexPhyControl0               =  0x00D9053C,
> > +	HwPfPcieGpexPhyControl1               =  0x00D90544,
> > +	HwPfPcieGpexPhyControl2               =  0x00D9054C,
> > +	HwPfPcieGpexUserControl0              =  0x00D9055C,
> > +	HwPfPcieGpexUncorrErrorStatus         =  0x00D905F0,
> > +	HwPfPcieGpexRxCplError                =  0x00D90620,
> > +	HwPfPcieGpexRxCplErrorDword0          =  0x00D90624,
> > +	HwPfPcieGpexRxCplErrorDword1          =  0x00D90628,
> > +	HwPfPcieGpexRxCplErrorDword2          =  0x00D9062C,
> > +	HwPfPcieGpexPabSwResetEn              =  0x00D90630,
> > +	HwPfPcieGpexGen3Control0              =  0x00D90634,
> > +	HwPfPcieGpexGen3Control1              =  0x00D90638,
> > +	HwPfPcieGpexGen3Control2              =  0x00D9063C,
> > +	HwPfPcieGpexGen2ControlCsr            =  0x00D90640,
> > +	HwPfPcieGpexTotalVfInitialVf0         =  0x00D90644,
> > +	HwPfPcieGpexTotalVfInitialVf1         =  0x00D90648,
> > +	HwPfPcieGpexSriovLinkDevId0           =  0x00D90684,
> > +	HwPfPcieGpexSriovLinkDevId1           =  0x00D90688,
> > +	HwPfPcieGpexSriovPageSize0            =  0x00D906C4,
> > +	HwPfPcieGpexSriovPageSize1            =  0x00D906C8,
> > +	HwPfPcieGpexIdVersion                 =  0x00D906FC,
> > +	HwPfPcieGpexSriovVfOffsetStride0      =  0x00D90704,
> > +	HwPfPcieGpexSriovVfOffsetStride1      =  0x00D90708,
> > +	HwPfPcieGpexGen3DeskewControl         =  0x00D907B4,
> > +	HwPfPcieGpexGen3EqControl             =  0x00D907B8,
> > +	HwPfPcieGpexBridgeVersion             =  0x00D90800,
> > +	HwPfPcieGpexBridgeCapability          =  0x00D90804,
> > +	HwPfPcieGpexBridgeControl             =  0x00D90808,
> > +	HwPfPcieGpexBridgeStatus              =  0x00D9080C,
> > +	HwPfPcieGpexEngineActivityStatus      =  0x00D9081C,
> > +	HwPfPcieGpexEngineResetControl        =  0x00D90820,
> > +	HwPfPcieGpexAxiPioControl             =  0x00D90840,
> > +	HwPfPcieGpexAxiPioStatus              =  0x00D90844,
> > +	HwPfPcieGpexAmbaSlaveCmdStatus        =  0x00D90848,
> > +	HwPfPcieGpexPexPioControl             =  0x00D908C0,
> > +	HwPfPcieGpexPexPioStatus              =  0x00D908C4,
> > +	HwPfPcieGpexAmbaMasterStatus          =  0x00D908C8,
> > +	HwPfPcieGpexCsrSlaveCmdStatus         =  0x00D90920,
> > +	HwPfPcieGpexMailboxAxiControl         =  0x00D90A50,
> > +	HwPfPcieGpexMailboxAxiData            =  0x00D90A54,
> > +	HwPfPcieGpexMailboxPexControl         =  0x00D90A90,
> > +	HwPfPcieGpexMailboxPexData            =  0x00D90A94,
> > +	HwPfPcieGpexPexInterruptEnable        =  0x00D90AD0,
> > +	HwPfPcieGpexPexInterruptStatus        =  0x00D90AD4,
> > +	HwPfPcieGpexPexInterruptAxiPioVector  =  0x00D90AD8,
> > +	HwPfPcieGpexPexInterruptPexPioVector  =  0x00D90AE0,
> > +	HwPfPcieGpexPexInterruptMiscVector    =  0x00D90AF8,
> > +	HwPfPcieGpexAmbaInterruptPioEnable    =  0x00D90B00,
> > +	HwPfPcieGpexAmbaInterruptMiscEnable   =  0x00D90B0C,
> > +	HwPfPcieGpexAmbaInterruptPioStatus    =  0x00D90B10,
> > +	HwPfPcieGpexAmbaInterruptMiscStatus   =  0x00D90B1C,
> > +	HwPfPcieGpexPexPmControl              =  0x00D90B80,
> > +	HwPfPcieGpexSlotMisc                  =  0x00D90B88,
> > +	HwPfPcieGpexAxiAddrMappingControl     =  0x00D90BA0,
> > +	HwPfPcieGpexAxiAddrMappingWindowAxiBase     =  0x00D90BA4,
> > +	HwPfPcieGpexAxiAddrMappingWindowPexBaseLow  =  0x00D90BA8,
> > +	HwPfPcieGpexAxiAddrMappingWindowPexBaseHigh =  0x00D90BAC,
> > +	HwPfPcieGpexPexBarAddrFunc0Bar0       =  0x00D91BA0,
> > +	HwPfPcieGpexPexBarAddrFunc0Bar1       =  0x00D91BA4,
> > +	HwPfPcieGpexAxiAddrMappingPcieHdrParam =  0x00D95BA0,
> > +	HwPfPcieGpexExtAxiAddrMappingAxiBase  =  0x00D980A0,
> > +	HwPfPcieGpexPexExtBarAddrFunc0Bar0    =  0x00D984A0,
> > +	HwPfPcieGpexPexExtBarAddrFunc0Bar1    =  0x00D984A4,
> > +	HwPfPcieGpexAmbaInterruptFlrEnable    =  0x00D9B960,
> > +	HwPfPcieGpexAmbaInterruptFlrStatus    =  0x00D9B9A0,
> > +	HwPfPcieGpexExtAxiAddrMappingSize     =  0x00D9BAF0,
> > +	HwPfPcieGpexPexPioAwcacheControl      =  0x00D9C300,
> > +	HwPfPcieGpexPexPioArcacheControl      =  0x00D9C304,
> > +	HwPfPcieGpexPabObSizeControlVc0       =  0x00D9C310
> > +};
> > +
> > +/* TIP PF Interrupt numbers */
> > +enum {
> > +	ACC100_PF_INT_QMGR_AQ_OVERFLOW = 0,
> > +	ACC100_PF_INT_DOORBELL_VF_2_PF = 1,
> > +	ACC100_PF_INT_DMA_DL_DESC_IRQ = 2,
> > +	ACC100_PF_INT_DMA_UL_DESC_IRQ = 3,
> > +	ACC100_PF_INT_DMA_MLD_DESC_IRQ = 4,
> > +	ACC100_PF_INT_DMA_UL5G_DESC_IRQ = 5,
> > +	ACC100_PF_INT_DMA_DL5G_DESC_IRQ = 6,
> > +	ACC100_PF_INT_ILLEGAL_FORMAT = 7,
> > +	ACC100_PF_INT_QMGR_DISABLED_ACCESS = 8,
> > +	ACC100_PF_INT_QMGR_AQ_OVERTHRESHOLD = 9,
> > +	ACC100_PF_INT_ARAM_ACCESS_ERR = 10,
> > +	ACC100_PF_INT_ARAM_ECC_1BIT_ERR = 11,
> > +	ACC100_PF_INT_PARITY_ERR = 12,
> > +	ACC100_PF_INT_QMGR_ERR = 13,
> > +	ACC100_PF_INT_INT_REQ_OVERFLOW = 14,
> > +	ACC100_PF_INT_APB_TIMEOUT = 15,
> > +};
> > +
> > +#endif /* ACC100_PF_ENUM_H */
> > diff --git a/drivers/baseband/acc100/acc100_vf_enum.h b/drivers/baseband/acc100/acc100_vf_enum.h
> > new file mode 100644
> > index 0000000..b512af3
> > --- /dev/null
> > +++ b/drivers/baseband/acc100/acc100_vf_enum.h
> > @@ -0,0 +1,73 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2017 Intel Corporation
> > + */
> > +
> > +#ifndef ACC100_VF_ENUM_H
> > +#define ACC100_VF_ENUM_H
> > +
> > +/*
> > + * ACC100 Register mapping on VF BAR0
> > + * This is automatically generated from RDL, format may change with new RDL
> > + */
> > +enum {
> > +	HWVfQmgrIngressAq             =  0x00000000,
> > +	HWVfHiVfToPfDbellVf           =  0x00000800,
> > +	HWVfHiPfToVfDbellVf           =  0x00000808,
> > +	HWVfHiInfoRingBaseLoVf        =  0x00000810,
> > +	HWVfHiInfoRingBaseHiVf        =  0x00000814,
> > +	HWVfHiInfoRingPointerVf       =  0x00000818,
> > +	HWVfHiInfoRingIntWrEnVf       =  0x00000820,
> > +	HWVfHiInfoRingPf2VfWrEnVf     =  0x00000824,
> > +	HWVfHiMsixVectorMapperVf      =  0x00000860,
> > +	HWVfDmaFec5GulDescBaseLoRegVf =  0x00000920,
> > +	HWVfDmaFec5GulDescBaseHiRegVf =  0x00000924,
> > +	HWVfDmaFec5GulRespPtrLoRegVf  =  0x00000928,
> > +	HWVfDmaFec5GulRespPtrHiRegVf  =  0x0000092C,
> > +	HWVfDmaFec5GdlDescBaseLoRegVf =  0x00000940,
> > +	HWVfDmaFec5GdlDescBaseHiRegVf =  0x00000944,
> > +	HWVfDmaFec5GdlRespPtrLoRegVf  =  0x00000948,
> > +	HWVfDmaFec5GdlRespPtrHiRegVf  =  0x0000094C,
> > +	HWVfDmaFec4GulDescBaseLoRegVf =  0x00000960,
> > +	HWVfDmaFec4GulDescBaseHiRegVf =  0x00000964,
> > +	HWVfDmaFec4GulRespPtrLoRegVf  =  0x00000968,
> > +	HWVfDmaFec4GulRespPtrHiRegVf  =  0x0000096C,
> > +	HWVfDmaFec4GdlDescBaseLoRegVf =  0x00000980,
> > +	HWVfDmaFec4GdlDescBaseHiRegVf =  0x00000984,
> > +	HWVfDmaFec4GdlRespPtrLoRegVf  =  0x00000988,
> > +	HWVfDmaFec4GdlRespPtrHiRegVf  =  0x0000098C,
> > +	HWVfDmaDdrBaseRangeRoVf       =  0x000009A0,
> > +	HWVfQmgrAqResetVf             =  0x00000E00,
> > +	HWVfQmgrRingSizeVf            =  0x00000E04,
> > +	HWVfQmgrGrpDepthLog20Vf       =  0x00000E08,
> > +	HWVfQmgrGrpDepthLog21Vf       =  0x00000E0C,
> > +	HWVfQmgrGrpFunction0Vf        =  0x00000E10,
> > +	HWVfQmgrGrpFunction1Vf        =  0x00000E14,
> > +	HWVfPmACntrlRegVf             =  0x00000F40,
> > +	HWVfPmACountVf                =  0x00000F48,
> > +	HWVfPmAKCntLoVf               =  0x00000F50,
> > +	HWVfPmAKCntHiVf               =  0x00000F54,
> > +	HWVfPmADeltaCntLoVf           =  0x00000F60,
> > +	HWVfPmADeltaCntHiVf           =  0x00000F64,
> > +	HWVfPmBCntrlRegVf             =  0x00000F80,
> > +	HWVfPmBCountVf                =  0x00000F88,
> > +	HWVfPmBKCntLoVf               =  0x00000F90,
> > +	HWVfPmBKCntHiVf               =  0x00000F94,
> > +	HWVfPmBDeltaCntLoVf           =  0x00000FA0,
> > +	HWVfPmBDeltaCntHiVf           =  0x00000FA4
> > +};
> > +
> > +/* TIP VF Interrupt numbers */
> > +enum {
> > +	ACC100_VF_INT_QMGR_AQ_OVERFLOW = 0,
> > +	ACC100_VF_INT_DOORBELL_VF_2_PF = 1,
> > +	ACC100_VF_INT_DMA_DL_DESC_IRQ = 2,
> > +	ACC100_VF_INT_DMA_UL_DESC_IRQ = 3,
> > +	ACC100_VF_INT_DMA_MLD_DESC_IRQ = 4,
> > +	ACC100_VF_INT_DMA_UL5G_DESC_IRQ = 5,
> > +	ACC100_VF_INT_DMA_DL5G_DESC_IRQ = 6,
> > +	ACC100_VF_INT_ILLEGAL_FORMAT = 7,
> > +	ACC100_VF_INT_QMGR_DISABLED_ACCESS = 8,
> > +	ACC100_VF_INT_QMGR_AQ_OVERTHRESHOLD = 9,
> > +};
> > +
> > +#endif /* ACC100_VF_ENUM_H */
> > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
> > index 6f46df0..cd77570 100644
> > --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> > +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> > @@ -5,6 +5,9 @@
> >  #ifndef _RTE_ACC100_PMD_H_
> >  #define _RTE_ACC100_PMD_H_
> >
> > +#include "acc100_pf_enum.h"
> > +#include "acc100_vf_enum.h"
> > +
> >  /* Helper macro for logging */
> >  #define rte_bbdev_log(level, fmt, ...) \
> >  	rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \
> > @@ -27,6 +30,493 @@
> >  #define RTE_ACC100_PF_DEVICE_ID        (0x0d5c)
> >  #define RTE_ACC100_VF_DEVICE_ID        (0x0d5d)
> >
> > +/* Define as 1 to use only a single FEC engine */
> > +#ifndef RTE_ACC100_SINGLE_FEC
> > +#define RTE_ACC100_SINGLE_FEC 0
> > +#endif
> > +
> > +/* Values used in filling in descriptors */
> > +#define ACC100_DMA_DESC_TYPE           2
> > +#define ACC100_DMA_CODE_BLK_MODE       0
> > +#define ACC100_DMA_BLKID_FCW           1
> > +#define ACC100_DMA_BLKID_IN            2
> > +#define ACC100_DMA_BLKID_OUT_ENC       1
> > +#define ACC100_DMA_BLKID_OUT_HARD      1
> > +#define ACC100_DMA_BLKID_OUT_SOFT      2
> > +#define ACC100_DMA_BLKID_OUT_HARQ      3
> > +#define ACC100_DMA_BLKID_IN_HARQ       3
> > +
> > +/* Values used in filling in decode FCWs */
> > +#define ACC100_FCW_TD_VER              1
> > +#define ACC100_FCW_TD_EXT_COLD_REG_EN  1
> > +#define ACC100_FCW_TD_AUTOMAP          0x0f
> > +#define ACC100_FCW_TD_RVIDX_0          2
> > +#define ACC100_FCW_TD_RVIDX_1          26
> > +#define ACC100_FCW_TD_RVIDX_2          50
> > +#define ACC100_FCW_TD_RVIDX_3          74
> > +
> > +/* Values used in writing to the registers */
> > +#define ACC100_REG_IRQ_EN_ALL          0x1FF83FF  /* Enable all interrupts */
> > +
> > +/* ACC100 Specific Dimensioning */
> > +#define ACC100_SIZE_64MBYTE            (64*1024*1024)
> 
> A better name for this #define would be ACC100_MAX_RING_SIZE
> 
> Similar for alloc_2x64mb_sw_rings_mem should be
> 
> alloc_max_sw_rings_mem.
> 

I am not convinced. I tend to believe this is actually more descriptive this way, and the
concept of max ring size is something else. 


> 
> > +/* Number of elements in an Info Ring */
> > +#define ACC100_INFO_RING_NUM_ENTRIES   1024
> > +/* Number of elements in HARQ layout memory */
> > +#define ACC100_HARQ_LAYOUT             (64*1024*1024)
> > +/* Assume offset for HARQ in memory */
> > +#define ACC100_HARQ_OFFSET             (32*1024)
> > +/* Mask used to calculate an index in an Info Ring array (not a byte offset) */
> > +#define ACC100_INFO_RING_MASK          (ACC100_INFO_RING_NUM_ENTRIES-1)
> > +/* Number of Virtual Functions ACC100 supports */
> > +#define ACC100_NUM_VFS                  16
> > +#define ACC100_NUM_QGRPS                 8
> > +#define ACC100_NUM_QGRPS_PER_WORD        8
> > +#define ACC100_NUM_AQS                  16
> > +#define MAX_ENQ_BATCH_SIZE          255
> little stuff, these define values should line up at least in the blocks.

ok

> > +/* All ACC100 Registers alignment are 32bits = 4B */
> > +#define BYTES_IN_WORD                 4
> 
> Common #define names should have ACC100_ prefix to lower chance of
> name conflicts.
> 
> Generally a good idea of all of them.

You are right, ok.

> 
> Tom
> 
> > +#define MAX_E_MBUF                64000
> > +
> > +#define GRP_ID_SHIFT    10 /* Queue Index Hierarchy */
> > +#define VF_ID_SHIFT     4  /* Queue Index Hierarchy */
> > +#define VF_OFFSET_QOS   16 /* offset in Memory Space specific to QoS Mon */
> > +#define TMPL_PRI_0      0x03020100
> > +#define TMPL_PRI_1      0x07060504
> > +#define TMPL_PRI_2      0x0b0a0908
> > +#define TMPL_PRI_3      0x0f0e0d0c
> > +#define QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
> > +#define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
> > +
> > +#define ACC100_NUM_TMPL  32
> > +#define VF_OFFSET_QOS 16 /* offset in Memory Space specific to QoS Mon */
> > +/* Mapping of signals for the available engines */
> > +#define SIG_UL_5G      0
> > +#define SIG_UL_5G_LAST 7
> > +#define SIG_DL_5G      13
> > +#define SIG_DL_5G_LAST 15
> > +#define SIG_UL_4G      16
> > +#define SIG_UL_4G_LAST 21
> > +#define SIG_DL_4G      27
> > +#define SIG_DL_4G_LAST 31
> > +
> > +/* max number of iterations to allocate memory block for all rings */
> > +#define SW_RING_MEM_ALLOC_ATTEMPTS 5
> > +#define MAX_QUEUE_DEPTH           1024
> > +#define ACC100_DMA_MAX_NUM_POINTERS  14
> > +#define ACC100_DMA_DESC_PADDING      8
> > +#define ACC100_FCW_PADDING           12
> > +#define ACC100_DESC_FCW_OFFSET       192
> > +#define ACC100_DESC_SIZE             256
> > +#define ACC100_DESC_OFFSET           (ACC100_DESC_SIZE / 64)
> > +#define ACC100_FCW_TE_BLEN     32
> > +#define ACC100_FCW_TD_BLEN     24
> > +#define ACC100_FCW_LE_BLEN     32
> > +#define ACC100_FCW_LD_BLEN     36
> > +
> > +#define ACC100_FCW_VER         2
> > +#define MUX_5GDL_DESC 6
> > +#define CMP_ENC_SIZE 20
> > +#define CMP_DEC_SIZE 24
> > +#define ENC_OFFSET (32)
> > +#define DEC_OFFSET (80)
> > +#define ACC100_EXT_MEM
> > +#define ACC100_HARQ_OFFSET_THRESHOLD 1024
> > +
> > +/* Constants from K0 computation from 3GPP 38.212 Table 5.4.2.1-2 */
> > +#define N_ZC_1 66 /* N = 66 Zc for BG 1 */
> > +#define N_ZC_2 50 /* N = 50 Zc for BG 2 */
> > +#define K0_1_1 17 /* K0 fraction numerator for rv 1 and BG 1 */
> > +#define K0_1_2 13 /* K0 fraction numerator for rv 1 and BG 2 */
> > +#define K0_2_1 33 /* K0 fraction numerator for rv 2 and BG 1 */
> > +#define K0_2_2 25 /* K0 fraction numerator for rv 2 and BG 2 */
> > +#define K0_3_1 56 /* K0 fraction numerator for rv 3 and BG 1 */
> > +#define K0_3_2 43 /* K0 fraction numerator for rv 3 and BG 2 */
> > +
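Side note on the K0 constants just above: they are the rv-dependent numerators of the k0 formula in 3GPP 38.212 Table 5.4.2.1-2. A rough sketch of how they would plug in (illustrative only, not the PMD's actual helper):

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of the 38.212 k0 computation these numerators feed into.
 * bg is 1 or 2, rv_index is 0..3, z_c is the lifting size, n_cb the
 * circular buffer length.  Illustrative only. */
static uint16_t
get_k0_sketch(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv_index)
{
	if (rv_index == 0)
		return 0;
	/* Numerators K0_x_y above; denominators are N = 66*Zc (BG1), 50*Zc (BG2) */
	static const uint8_t num[2][3] = { {17, 33, 56}, {13, 25, 43} };
	uint16_t n_zc = (bg == 1) ? 66 : 50;	/* N_ZC_1 / N_ZC_2 */
	uint32_t k0 = ((uint32_t)num[(bg == 1) ? 0 : 1][rv_index - 1] * n_cb)
			/ ((uint32_t)n_zc * z_c);
	return (uint16_t)(k0 * z_c);
}
```

For the full-buffer case (n_cb == N) this reduces to numerator * Zc, which is why only the numerators need to be kept as constants.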
> > +/* ACC100 Configuration */
> > +#define ACC100_DDR_ECC_ENABLE
> > +#define ACC100_CFG_DMA_ERROR 0x3D7
> > +#define ACC100_CFG_AXI_CACHE 0x11
> > +#define ACC100_CFG_QMGR_HI_P 0x0F0F
> > +#define ACC100_CFG_PCI_AXI 0xC003
> > +#define ACC100_CFG_PCI_BRIDGE 0x40006033
> > +#define ACC100_ENGINE_OFFSET 0x1000
> > +#define ACC100_RESET_HI 0x20100
> > +#define ACC100_RESET_LO 0x20000
> > +#define ACC100_RESET_HARD 0x1FF
> > +#define ACC100_ENGINES_MAX 9
> > +#define LONG_WAIT 1000
> > +
> > +/* ACC100 DMA Descriptor triplet */
> > +struct acc100_dma_triplet {
> > +	uint64_t address;
> > +	uint32_t blen:20,
> > +		res0:4,
> > +		last:1,
> > +		dma_ext:1,
> > +		res1:2,
> > +		blkid:4;
> > +} __rte_packed;
> > +
> > +
> > +
> > +/* ACC100 DMA Response Descriptor */
> > +union acc100_dma_rsp_desc {
> > +	uint32_t val;
> > +	struct {
> > +		uint32_t crc_status:1,
> > +			synd_ok:1,
> > +			dma_err:1,
> > +			neg_stop:1,
> > +			fcw_err:1,
> > +			output_err:1,
> > +			input_err:1,
> > +			timestampEn:1,
> > +			iterCountFrac:8,
> > +			iter_cnt:8,
> > +			rsrvd3:6,
> > +			sdone:1,
> > +			fdone:1;
> > +		uint32_t add_info_0;
> > +		uint32_t add_info_1;
> > +	};
> > +};
> > +
> > +
> > +/* ACC100 Queue Manager Enqueue PCI Register */
> > +union acc100_enqueue_reg_fmt {
> > +	uint32_t val;
> > +	struct {
> > +		uint32_t num_elem:8,
> > +			addr_offset:3,
> > +			rsrvd:1,
> > +			req_elem_addr:20;
> > +	};
> > +};
> > +
> > +/* FEC 4G Uplink Frame Control Word */
> > +struct __rte_packed acc100_fcw_td {
> > +	uint8_t fcw_ver:4,
> > +		num_maps:4; /* Unused */
> > +	uint8_t filler:6, /* Unused */
> > +		rsrvd0:1,
> > +		bypass_sb_deint:1;
> > +	uint16_t k_pos;
> > +	uint16_t k_neg; /* Unused */
> > +	uint8_t c_neg; /* Unused */
> > +	uint8_t c; /* Unused */
> > +	uint32_t ea; /* Unused */
> > +	uint32_t eb; /* Unused */
> > +	uint8_t cab; /* Unused */
> > +	uint8_t k0_start_col; /* Unused */
> > +	uint8_t rsrvd1;
> > +	uint8_t code_block_mode:1, /* Unused */
> > +		turbo_crc_type:1,
> > +		rsrvd2:3,
> > +		bypass_teq:1, /* Unused */
> > +		soft_output_en:1, /* Unused */
> > +		ext_td_cold_reg_en:1;
> > +	union { /* External Cold register */
> > +		uint32_t ext_td_cold_reg;
> > +		struct {
> > +			uint32_t min_iter:4, /* Unused */
> > +				max_iter:4,
> > +				ext_scale:5, /* Unused */
> > +				rsrvd3:3,
> > +				early_stop_en:1, /* Unused */
> > +				sw_soft_out_dis:1, /* Unused */
> > +				sw_et_cont:1, /* Unused */
> > +				sw_soft_out_saturation:1, /* Unused */
> > +				half_iter_on:1, /* Unused */
> > +				raw_decoder_input_on:1, /* Unused */
> > +				rsrvd4:10;
> > +		};
> > +	};
> > +};
> > +
> > +/* FEC 5GNR Uplink Frame Control Word */
> > +struct __rte_packed acc100_fcw_ld {
> > +	uint32_t FCWversion:4,
> > +		qm:4,
> > +		nfiller:11,
> > +		BG:1,
> > +		Zc:9,
> > +		res0:1,
> > +		synd_precoder:1,
> > +		synd_post:1;
> > +	uint32_t ncb:16,
> > +		k0:16;
> > +	uint32_t rm_e:24,
> > +		hcin_en:1,
> > +		hcout_en:1,
> > +		crc_select:1,
> > +		bypass_dec:1,
> > +		bypass_intlv:1,
> > +		so_en:1,
> > +		so_bypass_rm:1,
> > +		so_bypass_intlv:1;
> > +	uint32_t hcin_offset:16,
> > +		hcin_size0:16;
> > +	uint32_t hcin_size1:16,
> > +		hcin_decomp_mode:3,
> > +		llr_pack_mode:1,
> > +		hcout_comp_mode:3,
> > +		res2:1,
> > +		dec_convllr:4,
> > +		hcout_convllr:4;
> > +	uint32_t itmax:7,
> > +		itstop:1,
> > +		so_it:7,
> > +		res3:1,
> > +		hcout_offset:16;
> > +	uint32_t hcout_size0:16,
> > +		hcout_size1:16;
> > +	uint32_t gain_i:8,
> > +		gain_h:8,
> > +		negstop_th:16;
> > +	uint32_t negstop_it:7,
> > +		negstop_en:1,
> > +		res4:24;
> > +};
> > +
> > +/* FEC 4G Downlink Frame Control Word */
> > +struct __rte_packed acc100_fcw_te {
> > +	uint16_t k_neg;
> > +	uint16_t k_pos;
> > +	uint8_t c_neg;
> > +	uint8_t c;
> > +	uint8_t filler;
> > +	uint8_t cab;
> > +	uint32_t ea:17,
> > +		rsrvd0:15;
> > +	uint32_t eb:17,
> > +		rsrvd1:15;
> > +	uint16_t ncb_neg;
> > +	uint16_t ncb_pos;
> > +	uint8_t rv_idx0:2,
> > +		rsrvd2:2,
> > +		rv_idx1:2,
> > +		rsrvd3:2;
> > +	uint8_t bypass_rv_idx0:1,
> > +		bypass_rv_idx1:1,
> > +		bypass_rm:1,
> > +		rsrvd4:5;
> > +	uint8_t rsrvd5:1,
> > +		rsrvd6:3,
> > +		code_block_crc:1,
> > +		rsrvd7:3;
> > +	uint8_t code_block_mode:1,
> > +		rsrvd8:7;
> > +	uint64_t rsrvd9;
> > +};
> > +
> > +/* FEC 5GNR Downlink Frame Control Word */
> > +struct __rte_packed acc100_fcw_le {
> > +	uint32_t FCWversion:4,
> > +		qm:4,
> > +		nfiller:11,
> > +		BG:1,
> > +		Zc:9,
> > +		res0:3;
> > +	uint32_t ncb:16,
> > +		k0:16;
> > +	uint32_t rm_e:24,
> > +		res1:2,
> > +		crc_select:1,
> > +		res2:1,
> > +		bypass_intlv:1,
> > +		res3:3;
> > +	uint32_t res4_a:12,
> > +		mcb_count:3,
> > +		res4_b:17;
> > +	uint32_t res5;
> > +	uint32_t res6;
> > +	uint32_t res7;
> > +	uint32_t res8;
> > +};
> > +
> > +/* ACC100 DMA Request Descriptor */
> > +struct __rte_packed acc100_dma_req_desc {
> > +	union {
> > +		struct{
> > +			uint32_t type:4,
> > +				rsrvd0:26,
> > +				sdone:1,
> > +				fdone:1;
> > +			uint32_t rsrvd1;
> > +			uint32_t rsrvd2;
> > +			uint32_t pass_param:8,
> > +				sdone_enable:1,
> > +				irq_enable:1,
> > +				timeStampEn:1,
> > +				res0:5,
> > +				numCBs:4,
> > +				res1:4,
> > +				m2dlen:4,
> > +				d2mlen:4;
> > +		};
> > +		struct{
> > +			uint32_t word0;
> > +			uint32_t word1;
> > +			uint32_t word2;
> > +			uint32_t word3;
> > +		};
> > +	};
> > +	struct acc100_dma_triplet data_ptrs[ACC100_DMA_MAX_NUM_POINTERS];
> > +
> > +	/* Virtual addresses used to retrieve SW context info */
> > +	union {
> > +		void *op_addr;
> > +		uint64_t pad1;  /* pad to 64 bits */
> > +	};
> > +	/*
> > +	 * Stores additional information needed for driver processing:
> > +	 * - last_desc_in_batch - flag used to mark last descriptor (CB)
> > +	 *                        in batch
> > +	 * - cbs_in_tb - stores information about total number of Code Blocks
> > +	 *               in currently processed Transport Block
> > +	 */
> > +	union {
> > +		struct {
> > +			union {
> > +				struct acc100_fcw_ld fcw_ld;
> > +				struct acc100_fcw_td fcw_td;
> > +				struct acc100_fcw_le fcw_le;
> > +				struct acc100_fcw_te fcw_te;
> > +				uint32_t pad2[ACC100_FCW_PADDING];
> > +			};
> > +			uint32_t last_desc_in_batch :8,
> > +				cbs_in_tb:8,
> > +				pad4 : 16;
> > +		};
> > +		uint64_t pad3[ACC100_DMA_DESC_PADDING]; /* pad to 64 bits */
> > +	};
> > +};
> > +
> > +/* ACC100 DMA Descriptor */
> > +union acc100_dma_desc {
> > +	struct acc100_dma_req_desc req;
> > +	union acc100_dma_rsp_desc rsp;
> > +};
> > +
> > +
> > +/* Union describing Info Ring entry */
> > +union acc100_harq_layout_data {
> > +	uint32_t val;
> > +	struct {
> > +		uint16_t offset;
> > +		uint16_t size0;
> > +	};
> > +} __rte_packed;
> > +
> > +
> > +/* Union describing Info Ring entry */
> > +union acc100_info_ring_data {
> > +	uint32_t val;
> > +	struct {
> > +		union {
> > +			uint16_t detailed_info;
> > +			struct {
> > +				uint16_t aq_id: 4;
> > +				uint16_t qg_id: 4;
> > +				uint16_t vf_id: 6;
> > +				uint16_t reserved: 2;
> > +			};
> > +		};
> > +		uint16_t int_nb: 7;
> > +		uint16_t msi_0: 1;
> > +		uint16_t vf2pf: 6;
> > +		uint16_t loop: 1;
> > +		uint16_t valid: 1;
> > +	};
> > +} __rte_packed;
> > +
> > +struct acc100_registry_addr {
> > +	unsigned int dma_ring_dl5g_hi;
> > +	unsigned int dma_ring_dl5g_lo;
> > +	unsigned int dma_ring_ul5g_hi;
> > +	unsigned int dma_ring_ul5g_lo;
> > +	unsigned int dma_ring_dl4g_hi;
> > +	unsigned int dma_ring_dl4g_lo;
> > +	unsigned int dma_ring_ul4g_hi;
> > +	unsigned int dma_ring_ul4g_lo;
> > +	unsigned int ring_size;
> > +	unsigned int info_ring_hi;
> > +	unsigned int info_ring_lo;
> > +	unsigned int info_ring_en;
> > +	unsigned int info_ring_ptr;
> > +	unsigned int tail_ptrs_dl5g_hi;
> > +	unsigned int tail_ptrs_dl5g_lo;
> > +	unsigned int tail_ptrs_ul5g_hi;
> > +	unsigned int tail_ptrs_ul5g_lo;
> > +	unsigned int tail_ptrs_dl4g_hi;
> > +	unsigned int tail_ptrs_dl4g_lo;
> > +	unsigned int tail_ptrs_ul4g_hi;
> > +	unsigned int tail_ptrs_ul4g_lo;
> > +	unsigned int depth_log0_offset;
> > +	unsigned int depth_log1_offset;
> > +	unsigned int qman_group_func;
> > +	unsigned int ddr_range;
> > +};
> > +
> > +/* Structure holding registry addresses for PF */
> > +static const struct acc100_registry_addr pf_reg_addr = {
> > +	.dma_ring_dl5g_hi = HWPfDmaFec5GdlDescBaseHiRegVf,
> > +	.dma_ring_dl5g_lo = HWPfDmaFec5GdlDescBaseLoRegVf,
> > +	.dma_ring_ul5g_hi = HWPfDmaFec5GulDescBaseHiRegVf,
> > +	.dma_ring_ul5g_lo = HWPfDmaFec5GulDescBaseLoRegVf,
> > +	.dma_ring_dl4g_hi = HWPfDmaFec4GdlDescBaseHiRegVf,
> > +	.dma_ring_dl4g_lo = HWPfDmaFec4GdlDescBaseLoRegVf,
> > +	.dma_ring_ul4g_hi = HWPfDmaFec4GulDescBaseHiRegVf,
> > +	.dma_ring_ul4g_lo = HWPfDmaFec4GulDescBaseLoRegVf,
> > +	.ring_size = HWPfQmgrRingSizeVf,
> > +	.info_ring_hi = HWPfHiInfoRingBaseHiRegPf,
> > +	.info_ring_lo = HWPfHiInfoRingBaseLoRegPf,
> > +	.info_ring_en = HWPfHiInfoRingIntWrEnRegPf,
> > +	.info_ring_ptr = HWPfHiInfoRingPointerRegPf,
> > +	.tail_ptrs_dl5g_hi = HWPfDmaFec5GdlRespPtrHiRegVf,
> > +	.tail_ptrs_dl5g_lo = HWPfDmaFec5GdlRespPtrLoRegVf,
> > +	.tail_ptrs_ul5g_hi = HWPfDmaFec5GulRespPtrHiRegVf,
> > +	.tail_ptrs_ul5g_lo = HWPfDmaFec5GulRespPtrLoRegVf,
> > +	.tail_ptrs_dl4g_hi = HWPfDmaFec4GdlRespPtrHiRegVf,
> > +	.tail_ptrs_dl4g_lo = HWPfDmaFec4GdlRespPtrLoRegVf,
> > +	.tail_ptrs_ul4g_hi = HWPfDmaFec4GulRespPtrHiRegVf,
> > +	.tail_ptrs_ul4g_lo = HWPfDmaFec4GulRespPtrLoRegVf,
> > +	.depth_log0_offset = HWPfQmgrGrpDepthLog20Vf,
> > +	.depth_log1_offset = HWPfQmgrGrpDepthLog21Vf,
> > +	.qman_group_func = HWPfQmgrGrpFunction0,
> > +	.ddr_range = HWPfDmaVfDdrBaseRw,
> > +};
> > +
> > +/* Structure holding registry addresses for VF */
> > +static const struct acc100_registry_addr vf_reg_addr = {
> > +	.dma_ring_dl5g_hi = HWVfDmaFec5GdlDescBaseHiRegVf,
> > +	.dma_ring_dl5g_lo = HWVfDmaFec5GdlDescBaseLoRegVf,
> > +	.dma_ring_ul5g_hi = HWVfDmaFec5GulDescBaseHiRegVf,
> > +	.dma_ring_ul5g_lo = HWVfDmaFec5GulDescBaseLoRegVf,
> > +	.dma_ring_dl4g_hi = HWVfDmaFec4GdlDescBaseHiRegVf,
> > +	.dma_ring_dl4g_lo = HWVfDmaFec4GdlDescBaseLoRegVf,
> > +	.dma_ring_ul4g_hi = HWVfDmaFec4GulDescBaseHiRegVf,
> > +	.dma_ring_ul4g_lo = HWVfDmaFec4GulDescBaseLoRegVf,
> > +	.ring_size = HWVfQmgrRingSizeVf,
> > +	.info_ring_hi = HWVfHiInfoRingBaseHiVf,
> > +	.info_ring_lo = HWVfHiInfoRingBaseLoVf,
> > +	.info_ring_en = HWVfHiInfoRingIntWrEnVf,
> > +	.info_ring_ptr = HWVfHiInfoRingPointerVf,
> > +	.tail_ptrs_dl5g_hi = HWVfDmaFec5GdlRespPtrHiRegVf,
> > +	.tail_ptrs_dl5g_lo = HWVfDmaFec5GdlRespPtrLoRegVf,
> > +	.tail_ptrs_ul5g_hi = HWVfDmaFec5GulRespPtrHiRegVf,
> > +	.tail_ptrs_ul5g_lo = HWVfDmaFec5GulRespPtrLoRegVf,
> > +	.tail_ptrs_dl4g_hi = HWVfDmaFec4GdlRespPtrHiRegVf,
> > +	.tail_ptrs_dl4g_lo = HWVfDmaFec4GdlRespPtrLoRegVf,
> > +	.tail_ptrs_ul4g_hi = HWVfDmaFec4GulRespPtrHiRegVf,
> > +	.tail_ptrs_ul4g_lo = HWVfDmaFec4GulRespPtrLoRegVf,
> > +	.depth_log0_offset = HWVfQmgrGrpDepthLog20Vf,
> > +	.depth_log1_offset = HWVfQmgrGrpDepthLog21Vf,
> > +	.qman_group_func = HWVfQmgrGrpFunction0Vf,
> > +	.ddr_range = HWVfDmaDdrBaseRangeRoVf,
> > +};
> > +
> >  /* Private data structure for each ACC100 device */
> >  struct acc100_device {
> >  	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v9 03/10] baseband/acc100: add info get function
  2020-09-29 21:13       ` Tom Rix
@ 2020-09-30  0:25         ` Chautru, Nicolas
  2020-09-30 23:20           ` Tom Rix
  0 siblings, 1 reply; 213+ messages in thread
From: Chautru, Nicolas @ 2020-09-30  0:25 UTC (permalink / raw)
  To: Tom Rix, dev, akhil.goyal
  Cc: Richardson, Bruce, Xu, Rosen, dave.burley, aidan.goddard, Yigit,
	Ferruh, Liu, Tianjiao

Hi Tom, 

> From: Tom Rix <trix@redhat.com>
> On 9/28/20 5:29 PM, Nicolas Chautru wrote:
> > Add in the "info_get" function to the driver, to allow us to query the
> > device.
> > No processing capabilities are available yet.
> > Linking bbdev-test to support the PMD with null capability.
> >
> > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
> > ---
> >  app/test-bbdev/meson.build               |   3 +
> >  drivers/baseband/acc100/rte_acc100_cfg.h |  96 +++++++++++++
> > drivers/baseband/acc100/rte_acc100_pmd.c | 225 +++++++++++++++++++++++++++++++
> >  drivers/baseband/acc100/rte_acc100_pmd.h |   3 +
> >  4 files changed, 327 insertions(+)
> >  create mode 100644 drivers/baseband/acc100/rte_acc100_cfg.h
> >
> > diff --git a/app/test-bbdev/meson.build b/app/test-bbdev/meson.build
> > index 18ab6a8..fbd8ae3 100644
> > --- a/app/test-bbdev/meson.build
> > +++ b/app/test-bbdev/meson.build
> > @@ -12,3 +12,6 @@ endif
> >  if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC')
> >  	deps += ['pmd_bbdev_fpga_5gnr_fec']
> >  endif
> > +if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_ACC100')
> > +	deps += ['pmd_bbdev_acc100']
> > +endif
> > \ No newline at end of file
> > diff --git a/drivers/baseband/acc100/rte_acc100_cfg.h b/drivers/baseband/acc100/rte_acc100_cfg.h
> > new file mode 100644
> > index 0000000..73bbe36
> > --- /dev/null
> > +++ b/drivers/baseband/acc100/rte_acc100_cfg.h
> > @@ -0,0 +1,96 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2020 Intel Corporation
> > + */
> > +
> > +#ifndef _RTE_ACC100_CFG_H_
> > +#define _RTE_ACC100_CFG_H_
> > +
> > +/**
> > + * @file rte_acc100_cfg.h
> > + *
> > + * Functions for configuring ACC100 HW, exposed directly to applications.
> > + * Configuration related to encoding/decoding is done through the
> > + * librte_bbdev library.
> > + *
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice
> When will this experimental tag be removed ?

I have pushed a patch to remove it. But the feedback from some of the community was to wait a bit more until next year.

> > + */
> > +
> > +#include <stdint.h>
> > +#include <stdbool.h>
> > +
> > +#ifdef __cplusplus
> > +extern "C" {
> > +#endif
> > +/**< Number of Virtual Functions ACC100 supports */
> > +#define RTE_ACC100_NUM_VFS 16
> This is already defined with ACC100_NUM_VFS

Thanks. 

> > +
> > +/**
> > + * Definition of Queue Topology for ACC100 Configuration
> > + * Some level of details is abstracted out to expose a clean interface
> > + * given that comprehensive flexibility is not required
> > + */
> > +struct rte_q_topology_t {
> > +	/** Number of QGroups in incremental order of priority */
> > +	uint16_t num_qgroups;
> > +	/**
> > +	 * All QGroups have the same number of AQs here.
> > +	 * Note : Could be made a 16-array if more flexibility is really
> > +	 * required
> > +	 */
> > +	uint16_t num_aqs_per_groups;
> > +	/**
> > +	 * Depth of the AQs is the same of all QGroups here. Log2 Enum : 2^N
> > +	 * Note : Could be made a 16-array if more flexibility is really
> > +	 * required
> > +	 */
> > +	uint16_t aq_depth_log2;
> > +	/**
> > +	 * Index of the first Queue Group Index - assuming contiguity
> > +	 * Initialized as -1
> > +	 */
> > +	int8_t first_qgroup_index;
> > +};
> > +
> > +/**
> > + * Definition of Arbitration related parameters for ACC100 Configuration
> > + */
> > +struct rte_arbitration_t {
> > +	/** Default Weight for VF Fairness Arbitration */
> > +	uint16_t round_robin_weight;
> > +	uint32_t gbr_threshold1; /**< Guaranteed Bitrate Threshold 1 */
> > +	uint32_t gbr_threshold2; /**< Guaranteed Bitrate Threshold 2 */
> > +};
> > +
> > +/**
> > + * Structure to pass ACC100 configuration.
> > + * Note: all VF Bundles will have the same configuration.
> > + */
> > +struct acc100_conf {
> > +	bool pf_mode_en; /**< 1 if PF is used for dataplane, 0 for VFs */
> > +	/** 1 if input '1' bit is represented by a positive LLR value, 0 if '1'
> > +	 * bit is represented by a negative value.
> > +	 */
> > +	bool input_pos_llr_1_bit;
> > +	/** 1 if output '1' bit is represented by a positive value, 0 if '1'
> > +	 * bit is represented by a negative value.
> > +	 */
> > +	bool output_pos_llr_1_bit;
> > +	uint16_t num_vf_bundles; /**< Number of VF bundles to setup */
> > +	/** Queue topology for each operation type */
> > +	struct rte_q_topology_t q_ul_4g;
> > +	struct rte_q_topology_t q_dl_4g;
> > +	struct rte_q_topology_t q_ul_5g;
> > +	struct rte_q_topology_t q_dl_5g;
> > +	/** Arbitration configuration for each operation type */
> > +	struct rte_arbitration_t arb_ul_4g[RTE_ACC100_NUM_VFS];
> > +	struct rte_arbitration_t arb_dl_4g[RTE_ACC100_NUM_VFS];
> > +	struct rte_arbitration_t arb_ul_5g[RTE_ACC100_NUM_VFS];
> > +	struct rte_arbitration_t arb_dl_5g[RTE_ACC100_NUM_VFS];
> > +};
> > +
> > +#ifdef __cplusplus
> > +}
> > +#endif
> > +
> > +#endif /* _RTE_ACC100_CFG_H_ */
> > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
> > index 1b4cd13..7807a30 100644
> > --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> > +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> > @@ -26,6 +26,184 @@
> >  RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
> >  #endif
> >
> > +/* Read a register of a ACC100 device */
> > +static inline uint32_t
> > +acc100_reg_read(struct acc100_device *d, uint32_t offset)
> > +{
> > +
> > +	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
> > +	uint32_t ret = *((volatile uint32_t *)(reg_addr));
> > +	return rte_le_to_cpu_32(ret);
> > +}
> > +
> > +/* Calculate the offset of the enqueue register */
> > +static inline uint32_t
> > +queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
> > +{
> > +	if (pf_device)
> > +		return ((vf_id << 12) + (qgrp_id << 7) + (aq_id << 3) +
> > +				HWPfQmgrIngressAq);
> > +	else
> > +		return ((qgrp_id << 7) + (aq_id << 3) +
> > +				HWVfQmgrIngressAq);
> Could you add *QmrIngressAq to the acc100_registry_addr and skip the if
> (pf_device) check ?

I am not convinced. That acc100_registry_addr is not kept with the driver; you
still need to check the pf_device flag to know which registers are to be used.
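For reference, the computation under discussion can be sketched like this; the two base offsets below are placeholders, not the real HWPfQmgrIngressAq/HWVfQmgrIngressAq values, and the shifts mirror the quoted queue_offset():

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bases for this sketch only. */
#define PF_QMGR_INGRESS_AQ 0x100000u
#define VF_QMGR_INGRESS_AQ 0x000100u

static inline uint32_t
queue_offset_sketch(bool pf_device, uint8_t vf_id, uint8_t qgrp_id,
		uint16_t aq_id)
{
	/* The PF sees all VFs: vf_id selects a 4 kB window (<< 12); within
	 * it a queue group spans 128 B (<< 7) and an atomic queue 8 B (<< 3). */
	if (pf_device)
		return PF_QMGR_INGRESS_AQ + ((uint32_t)vf_id << 12) +
				((uint32_t)qgrp_id << 7) + ((uint32_t)aq_id << 3);
	/* A VF only sees its own queues, hence no vf_id term. */
	return VF_QMGR_INGRESS_AQ + ((uint32_t)qgrp_id << 7) +
			((uint32_t)aq_id << 3);
}
```

The asymmetry (the vf_id term only exists on the PF side) is why folding both bases into one table entry would not remove the pf_device branch.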


> > +}
> > +
> > +enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC};
> > +
> > +/* Return the queue topology for a Queue Group Index */
> > +static inline void
> > +qtopFromAcc(struct rte_q_topology_t **qtop, int acc_enum,
> > +		struct acc100_conf *acc100_conf)
> > +{
> > +	struct rte_q_topology_t *p_qtop;
> > +	p_qtop = NULL;
> > +	switch (acc_enum) {
> > +	case UL_4G:
> > +		p_qtop = &(acc100_conf->q_ul_4g);
> > +		break;
> > +	case UL_5G:
> > +		p_qtop = &(acc100_conf->q_ul_5g);
> > +		break;
> > +	case DL_4G:
> > +		p_qtop = &(acc100_conf->q_dl_4g);
> > +		break;
> > +	case DL_5G:
> > +		p_qtop = &(acc100_conf->q_dl_5g);
> > +		break;
> > +	default:
> > +		/* NOTREACHED */
> > +		rte_bbdev_log(ERR, "Unexpected error evaluating qtopFromAcc");
> Use in fetch_acc100_config does not check for NULL.

Yes, because it can't be null. This function is called explicitly for supported values.

> > +		break;
> > +	}
> > +	*qtop = p_qtop;
> > +}
> > +
> > +static void
> > +initQTop(struct acc100_conf *acc100_conf)
> > +{
> > +	acc100_conf->q_ul_4g.num_aqs_per_groups = 0;
> > +	acc100_conf->q_ul_4g.num_qgroups = 0;
> > +	acc100_conf->q_ul_4g.first_qgroup_index = -1;
> > +	acc100_conf->q_ul_5g.num_aqs_per_groups = 0;
> > +	acc100_conf->q_ul_5g.num_qgroups = 0;
> > +	acc100_conf->q_ul_5g.first_qgroup_index = -1;
> > +	acc100_conf->q_dl_4g.num_aqs_per_groups = 0;
> > +	acc100_conf->q_dl_4g.num_qgroups = 0;
> > +	acc100_conf->q_dl_4g.first_qgroup_index = -1;
> > +	acc100_conf->q_dl_5g.num_aqs_per_groups = 0;
> > +	acc100_conf->q_dl_5g.num_qgroups = 0;
> > +	acc100_conf->q_dl_5g.first_qgroup_index = -1;
> > +}
> > +
> > +static inline void
> > +updateQtop(uint8_t acc, uint8_t qg, struct acc100_conf *acc100_conf,
> > +		struct acc100_device *d)
> > +{
> > +	uint32_t reg;
> > +	struct rte_q_topology_t *q_top = NULL;
> > +	qtopFromAcc(&q_top, acc, acc100_conf);
> > +	if (unlikely(q_top == NULL))
> > +		return;
> as above, this error is not handled by caller fetch_acc100_config

It cannot really fail for fetch_acc100_config. If you insist I can add it.

> > +	uint16_t aq;
> > +	q_top->num_qgroups++;
> > +	if (q_top->first_qgroup_index == -1) {
> > +		q_top->first_qgroup_index = qg;
> > +		/* Can be optimized to assume all are enabled by default */
> > +		reg = acc100_reg_read(d, queue_offset(d->pf_device,
> > +				0, qg, ACC100_NUM_AQS - 1));
> > +		if (reg & QUEUE_ENABLE) {
> > +			q_top->num_aqs_per_groups = ACC100_NUM_AQS;
> > +			return;
> > +		}
> > +		q_top->num_aqs_per_groups = 0;
> > +		for (aq = 0; aq < ACC100_NUM_AQS; aq++) {
> > +			reg = acc100_reg_read(d, queue_offset(d->pf_device,
> > +					0, qg, aq));
> > +			if (reg & QUEUE_ENABLE)
> > +				q_top->num_aqs_per_groups++;
> > +		}
> > +	}
> > +}
> > +
> > +/* Fetch configuration enabled for the PF/VF using MMIO Read (slow) */
> > +static inline void
> > +fetch_acc100_config(struct rte_bbdev *dev)
> > +{
> > +	struct acc100_device *d = dev->data->dev_private;
> > +	struct acc100_conf *acc100_conf = &d->acc100_conf;
> > +	const struct acc100_registry_addr *reg_addr;
> > +	uint8_t acc, qg;
> > +	uint32_t reg, reg_aq, reg_len0, reg_len1;
> > +	uint32_t reg_mode;
> > +
> > +	/* No need to retrieve the configuration if it is already done */
> > +	if (d->configured)
> > +		return;
> Warn ?

No, this can genuinely happen on a regular basis; there is just no need to fetch it all again.
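To illustrate the pattern (names below are placeholders, not the PMD's actual structures):

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal sketch of the lazy-fetch pattern discussed above: the slow
 * MMIO-based configuration read is skipped once the device is marked
 * configured, so repeated calls are cheap and expected. */
struct dev_state {
	bool configured;
	int fetch_count;	/* counts the expensive reads, for the sketch */
};

static void
fetch_config_sketch(struct dev_state *d)
{
	if (d->configured)
		return;	/* common, legitimate case: nothing to do */
	d->fetch_count++;	/* stands in for the slow register reads */
	d->configured = true;
}
```

Calling it twice performs the expensive read only once, which is why an early return here is not a condition worth warning about.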

> > +
> > +	/* Choose correct registry addresses for the device type */
> > +	if (d->pf_device)
> > +		reg_addr = &pf_reg_addr;
> > +	else
> > +		reg_addr = &vf_reg_addr;
> > +
> > +	d->ddr_size = (1 + acc100_reg_read(d, reg_addr->ddr_range)) << 10;
> > +
> > +	/* Single VF Bundle by VF */
> > +	acc100_conf->num_vf_bundles = 1;
> > +	initQTop(acc100_conf);
> > +
> > +	struct rte_q_topology_t *q_top = NULL;
> > +	int qman_func_id[5] = {0, 2, 1, 3, 4};
> Do these magic numbers need #defines ?

ok. 

> > +	reg = acc100_reg_read(d, reg_addr->qman_group_func);
> > +	for (qg = 0; qg < ACC100_NUM_QGRPS_PER_WORD; qg++) {
> > +		reg_aq = acc100_reg_read(d,
> > +				queue_offset(d->pf_device, 0, qg, 0));
> > +		if (reg_aq & QUEUE_ENABLE) {
> > +			acc = qman_func_id[(reg >> (qg * 4)) & 0x7];
> 0x7 and [5], this could overflow.

ok thanks, I can add exception handling. Not clear to me right now why it did not trigger a tool warning.
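For example, a bounds-checked variant might look like this (a sketch only; the qman_func_id values are copied from the quoted patch):

```c
#include <assert.h>
#include <stdint.h>

/* Mapping of the QGroup function field to an accelerator, as in the patch. */
static const int qman_func_id[] = {0, 2, 1, 3, 4};

/* The 3-bit register field can hold 0..7, but only 0..4 map to an
 * accelerator; out-of-range values return -1 instead of reading past
 * the 5-entry array. */
static int
qman_func_from_reg(uint32_t reg, unsigned int qg)
{
	unsigned int idx = (reg >> (qg * 4)) & 0x7;
	if (idx >= sizeof(qman_func_id) / sizeof(qman_func_id[0]))
		return -1;
	return qman_func_id[idx];
}
```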

> > +			updateQtop(acc, qg, acc100_conf, d);
> > +		}
> > +	}
> > +
> > +	/* Check the depth of the AQs*/
> > +	reg_len0 = acc100_reg_read(d, reg_addr->depth_log0_offset);
> > +	reg_len1 = acc100_reg_read(d, reg_addr->depth_log1_offset);
> > +	for (acc = 0; acc < NUM_ACC; acc++) {
> > +		qtopFromAcc(&q_top, acc, acc100_conf);
> > +		if (q_top->first_qgroup_index < ACC100_NUM_QGRPS_PER_WORD)
> > +			q_top->aq_depth_log2 = (reg_len0 >>
> > +					(q_top->first_qgroup_index * 4))
> > +					& 0xF;
> > +		else
> > +			q_top->aq_depth_log2 = (reg_len1 >>
> > +					((q_top->first_qgroup_index -
> > +					ACC100_NUM_QGRPS_PER_WORD) * 4))
> > +					& 0xF;
> > +	}
> > +
> > +	/* Read PF mode */
> > +	if (d->pf_device) {
> > +		reg_mode = acc100_reg_read(d, HWPfHiPfMode);
> > +		acc100_conf->pf_mode_en = (reg_mode == 2) ? 1 : 0;
> 
> 2 is a magic number, consider a #define
> 

ok

> Tom
> 

Thanks
Nic

> > +	}
> > +
> > +	rte_bbdev_log_debug(
> > +			"%s Config LLR SIGN IN/OUT %s %s QG %u %u %u %u AQ %u %u %u %u Len %u %u %u %u\n",
> > +			(d->pf_device) ? "PF" : "VF",
> > +			(acc100_conf->input_pos_llr_1_bit) ? "POS" : "NEG",
> > +			(acc100_conf->output_pos_llr_1_bit) ? "POS" : "NEG",
> > +			acc100_conf->q_ul_4g.num_qgroups,
> > +			acc100_conf->q_dl_4g.num_qgroups,
> > +			acc100_conf->q_ul_5g.num_qgroups,
> > +			acc100_conf->q_dl_5g.num_qgroups,
> > +			acc100_conf->q_ul_4g.num_aqs_per_groups,
> > +			acc100_conf->q_dl_4g.num_aqs_per_groups,
> > +			acc100_conf->q_ul_5g.num_aqs_per_groups,
> > +			acc100_conf->q_dl_5g.num_aqs_per_groups,
> > +			acc100_conf->q_ul_4g.aq_depth_log2,
> > +			acc100_conf->q_dl_4g.aq_depth_log2,
> > +			acc100_conf->q_ul_5g.aq_depth_log2,
> > +			acc100_conf->q_dl_5g.aq_depth_log2);
> > +}
> > +
> >  /* Free 64MB memory used for software rings */
> >  static int
> >  acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
> > @@ -33,8 +211,55 @@
> >  	return 0;
> >  }
> >
> > +/* Get ACC100 device info */
> > +static void
> > +acc100_dev_info_get(struct rte_bbdev *dev,
> > +		struct rte_bbdev_driver_info *dev_info) {
> > +	struct acc100_device *d = dev->data->dev_private;
> > +
> > +	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
> > +		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
> > +	};
> > +
> > +	static struct rte_bbdev_queue_conf default_queue_conf;
> > +	default_queue_conf.socket = dev->data->socket_id;
> > +	default_queue_conf.queue_size = MAX_QUEUE_DEPTH;
> > +
> > +	dev_info->driver_name = dev->device->driver->name;
> > +
> > +	/* Read and save the populated config from ACC100 registers */
> > +	fetch_acc100_config(dev);
> > +
> > +	/* This isn't ideal because it reports the maximum number of queues but
> > +	 * does not provide info on how many can be uplink/downlink or different
> > +	 * priorities
> > +	 */
> > +	dev_info->max_num_queues =
> > +			d->acc100_conf.q_dl_5g.num_aqs_per_groups *
> > +			d->acc100_conf.q_dl_5g.num_qgroups +
> > +			d->acc100_conf.q_ul_5g.num_aqs_per_groups *
> > +			d->acc100_conf.q_ul_5g.num_qgroups +
> > +			d->acc100_conf.q_dl_4g.num_aqs_per_groups *
> > +			d->acc100_conf.q_dl_4g.num_qgroups +
> > +			d->acc100_conf.q_ul_4g.num_aqs_per_groups *
> > +			d->acc100_conf.q_ul_4g.num_qgroups;
> > +	dev_info->queue_size_lim = MAX_QUEUE_DEPTH;
> > +	dev_info->hardware_accelerated = true;
> > +	dev_info->max_dl_queue_priority =
> > +			d->acc100_conf.q_dl_4g.num_qgroups - 1;
> > +	dev_info->max_ul_queue_priority =
> > +			d->acc100_conf.q_ul_4g.num_qgroups - 1;
> > +	dev_info->default_queue_conf = default_queue_conf;
> > +	dev_info->cpu_flag_reqs = NULL;
> > +	dev_info->min_alignment = 64;
> > +	dev_info->capabilities = bbdev_capabilities;
> > +	dev_info->harq_buffer_size = d->ddr_size;
> > +}
> > +
> >  static const struct rte_bbdev_ops acc100_bbdev_ops = {
> >  	.close = acc100_dev_close,
> > +	.info_get = acc100_dev_info_get,
> >  };
> >
> >  /* ACC100 PCI PF address map */
> > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h
> > b/drivers/baseband/acc100/rte_acc100_pmd.h
> > index cd77570..662e2c8 100644
> > --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> > +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> > @@ -7,6 +7,7 @@
> >
> >  #include "acc100_pf_enum.h"
> >  #include "acc100_vf_enum.h"
> > +#include "rte_acc100_cfg.h"
> >
> >  /* Helper macro for logging */
> >  #define rte_bbdev_log(level, fmt, ...) \
> > @@ -520,6 +521,8 @@ struct acc100_registry_addr {
> >  /* Private data structure for each ACC100 device */
> >  struct acc100_device {
> >  	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
> > +	uint32_t ddr_size; /* Size in kB */
> > +	struct acc100_conf acc100_conf; /* ACC100 Initial configuration */
> >  	bool pf_device; /**< True if this is a PF ACC100 device */
> >  	bool configured; /**< True if this ACC100 device is configured */
> > };


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v9 04/10] baseband/acc100: add queue configuration
  2020-09-29 21:46       ` Tom Rix
@ 2020-09-30  1:03         ` Chautru, Nicolas
  2020-09-30 23:36           ` Tom Rix
  0 siblings, 1 reply; 213+ messages in thread
From: Chautru, Nicolas @ 2020-09-30  1:03 UTC (permalink / raw)
  To: Tom Rix, dev, akhil.goyal
  Cc: Richardson, Bruce, Xu, Rosen, dave.burley, aidan.goddard, Yigit,
	Ferruh, Liu, Tianjiao

Hi Tom, 

> From: Tom Rix <trix@redhat.com>
> On 9/28/20 5:29 PM, Nicolas Chautru wrote:
> > Adding function to create and configure queues for the device. Still
> > no capability.
> >
> > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > Reviewed-by: Rosen Xu <rosen.xu@intel.com>
> > Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
> > ---
> >  drivers/baseband/acc100/rte_acc100_pmd.c | 420
> > ++++++++++++++++++++++++++++++-
> > drivers/baseband/acc100/rte_acc100_pmd.h |  45 ++++
> >  2 files changed, 464 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> > b/drivers/baseband/acc100/rte_acc100_pmd.c
> > index 7807a30..7a21c57 100644
> > --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> > +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> > @@ -26,6 +26,22 @@
> >  RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);  #endif
> >
> > +/* Write to MMIO register address */
> > +static inline void
> > +mmio_write(void *addr, uint32_t value) {
> > +	*((volatile uint32_t *)(addr)) = rte_cpu_to_le_32(value); }
> > +
> > +/* Write a register of a ACC100 device */
> > +static inline void
> > +acc100_reg_write(struct acc100_device *d, uint32_t offset, uint32_t payload)
> > +{
> > +	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
> > +	mmio_write(reg_addr, payload);
> > +	usleep(1000);
> rte_acc100_pmd.h defines LONG_WAIT , could this #define be used instead
> ?

ok

> > +}
> > +
> >  /* Read a register of a ACC100 device */
> >  static inline uint32_t
> >  acc100_reg_read(struct acc100_device *d, uint32_t offset)
> > @@ -36,6 +52,22 @@
> >  	return rte_le_to_cpu_32(ret);
> >  }
> >
> > +/* Basic Implementation of Log2 for exact 2^N */
> > +static inline uint32_t
> > +log2_basic(uint32_t value)
> mirrors the function rte_bsf32

rte_bsf32 is also undefined for zero input, but I can indeed replace __builtin_ctz() with rte_bsf32() here.

> > +{
> > +	return (value == 0) ? 0 : __builtin_ctz(value); }
> > +
> > +/* Calculate memory alignment offset assuming alignment is 2^N */
> > +static inline uint32_t
> > +calc_mem_alignment_offset(void *unaligned_virt_mem, uint32_t alignment)
> > +{
> > +	rte_iova_t unaligned_phy_mem = rte_malloc_virt2iova(unaligned_virt_mem);
> > +	return (uint32_t)(alignment -
> > +			(unaligned_phy_mem & (alignment-1)));
> > +}
> > +
> >  /* Calculate the offset of the enqueue register */
> >  static inline uint32_t
> >  queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
> > @@ -204,10 +236,393 @@
> >  			acc100_conf->q_dl_5g.aq_depth_log2);
> >  }
> >
> > +static void
> > +free_base_addresses(void **base_addrs, int size) {
> > +	int i;
> > +	for (i = 0; i < size; i++)
> > +		rte_free(base_addrs[i]);
> > +}
> > +
> > +static inline uint32_t
> > +get_desc_len(void)
> > +{
> > +	return sizeof(union acc100_dma_desc); }
> > +
> > +/* Allocate the 2 * 64MB block for the sw rings */
> > +static int
> > +alloc_2x64mb_sw_rings_mem(struct rte_bbdev *dev, struct acc100_device *d,
> > +		int socket)
> see earlier comment about name of function.

replied in other patch set

> > +{
> > +	uint32_t sw_ring_size = ACC100_SIZE_64MBYTE;
> > +	d->sw_rings_base = rte_zmalloc_socket(dev->device->driver->name,
> > +			2 * sw_ring_size, RTE_CACHE_LINE_SIZE, socket);
> > +	if (d->sw_rings_base == NULL) {
> > +		rte_bbdev_log(ERR, "Failed to allocate memory for %s:%u",
> > +				dev->device->driver->name,
> > +				dev->data->dev_id);
> > +		return -ENOMEM;
> > +	}
> > +	memset(d->sw_rings_base, 0, ACC100_SIZE_64MBYTE);
> > +	uint32_t next_64mb_align_offset = calc_mem_alignment_offset(
> > +			d->sw_rings_base, ACC100_SIZE_64MBYTE);
> > +	d->sw_rings = RTE_PTR_ADD(d->sw_rings_base, next_64mb_align_offset);
> > +	d->sw_rings_phys = rte_malloc_virt2iova(d->sw_rings_base) +
> > +			next_64mb_align_offset;
> > +	d->sw_ring_size = MAX_QUEUE_DEPTH * get_desc_len();
> > +	d->sw_ring_max_depth = d->sw_ring_size / get_desc_len();
> > +
> > +	return 0;
> > +}
> > +
> > +/* Attempt to allocate minimised memory space for sw rings */
> > +static void
> > +alloc_sw_rings_min_mem(struct rte_bbdev *dev, struct acc100_device *d,
> > +		uint16_t num_queues, int socket)
> > +{
> > +	rte_iova_t sw_rings_base_phy, next_64mb_align_addr_phy;
> > +	uint32_t next_64mb_align_offset;
> > +	rte_iova_t sw_ring_phys_end_addr;
> > +	void *base_addrs[SW_RING_MEM_ALLOC_ATTEMPTS];
> > +	void *sw_rings_base;
> > +	int i = 0;
> > +	uint32_t q_sw_ring_size = MAX_QUEUE_DEPTH * get_desc_len();
> > +	uint32_t dev_sw_ring_size = q_sw_ring_size * num_queues;
> > +
> > +	/* Find an aligned block of memory to store sw rings */
> > +	while (i < SW_RING_MEM_ALLOC_ATTEMPTS) {
> > +		/*
> > +		 * sw_ring allocated memory is guaranteed to be aligned to
> > +		 * q_sw_ring_size at the condition that the requested size is
> > +		 * less than the page size
> > +		 */
> > +		sw_rings_base = rte_zmalloc_socket(
> > +				dev->device->driver->name,
> > +				dev_sw_ring_size, q_sw_ring_size, socket);
> > +
> > +		if (sw_rings_base == NULL) {
> > +			rte_bbdev_log(ERR,
> > +					"Failed to allocate memory for %s:%u",
> > +					dev->device->driver->name,
> > +					dev->data->dev_id);
> > +			break;
> > +		}
> > +
> > +		sw_rings_base_phy = rte_malloc_virt2iova(sw_rings_base);
> > +		next_64mb_align_offset = calc_mem_alignment_offset(
> > +				sw_rings_base, ACC100_SIZE_64MBYTE);
> > +		next_64mb_align_addr_phy = sw_rings_base_phy +
> > +				next_64mb_align_offset;
> > +		sw_ring_phys_end_addr = sw_rings_base_phy + dev_sw_ring_size;
> > +
> > +		/* Check if the end of the sw ring memory block is before the
> > +		 * start of next 64MB aligned mem address
> > +		 */
> > +		if (sw_ring_phys_end_addr < next_64mb_align_addr_phy) {
> > +			d->sw_rings_phys = sw_rings_base_phy;
> > +			d->sw_rings = sw_rings_base;
> > +			d->sw_rings_base = sw_rings_base;
> > +			d->sw_ring_size = q_sw_ring_size;
> > +			d->sw_ring_max_depth = MAX_QUEUE_DEPTH;
> > +			break;
> > +		}
> > +		/* Store the address of the unaligned mem block */
> > +		base_addrs[i] = sw_rings_base;
> > +		i++;
> > +	}
> > +
> 
> This looks like a bug.
> 
> Freeing memory that was just allocated.
> 
> Looks like it could be part of an error handler for memory access in the loop
> failing.

You are not the first person to raise concerns about that piece of code in this series.
I agree it is a bit convoluted, but it is functionally correct.

> 
> There should be a better way to allocate aligned memory like round up the
> size and use an offset to the alignment you need.

This is actually what the fallback option below is for, in case the first iterative option fails (the fallback is more wasteful in memory).
If that really looks too dodgy, we could skip the first attempt and go directly to the second, more wasteful option;
but the code is doing what it is supposed to do, so it is OK to me as it is.
Let me know what you think.

> 
> > +	/* Free all unaligned blocks of mem allocated in the loop */
> > +	free_base_addresses(base_addrs, i);
> > +}
> > +
> > +
> > +/* Allocate 64MB memory used for all software rings */
> > +static int
> > +acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
> > +{
> > +	uint32_t phys_low, phys_high, payload;
> > +	struct acc100_device *d = dev->data->dev_private;
> > +	const struct acc100_registry_addr *reg_addr;
> > +
> > +	if (d->pf_device && !d->acc100_conf.pf_mode_en) {
> > +		rte_bbdev_log(NOTICE,
> > +				"%s has PF mode disabled. This PF can't be used.",
> > +				dev->data->name);
> > +		return -ENODEV;
> > +	}
> > +
> > +	alloc_sw_rings_min_mem(dev, d, num_queues, socket_id);
> > +
> > +	/* If minimal memory space approach failed, then allocate
> > +	 * the 2 * 64MB block for the sw rings
> > +	 */
> > +	if (d->sw_rings == NULL)
> > +		alloc_2x64mb_sw_rings_mem(dev, d, socket_id);
> This can fail as well, but is unhandled.

ok can add. 

> > +
> > +	/* Configure ACC100 with the base address for DMA descriptor rings
> > +	 * Same descriptor rings used for UL and DL DMA Engines
> > +	 * Note : Assuming only VF0 bundle is used for PF mode
> > +	 */
> > +	phys_high = (uint32_t)(d->sw_rings_phys >> 32);
> > +	phys_low  = (uint32_t)(d->sw_rings_phys & ~(ACC100_SIZE_64MBYTE-1));
> > +
> > +	/* Choose correct registry addresses for the device type */
> > +	if (d->pf_device)
> > +		reg_addr = &pf_reg_addr;
> > +	else
> > +		reg_addr = &vf_reg_addr;
> could reg_addr be part of acc100_device struct ?

I don't really see this as useful as part of the device data.

> > +
> > +	/* Read the populated cfg from ACC100 registers */
> > +	fetch_acc100_config(dev);
> > +
> > +	/* Mark as configured properly */
> > +	d->configured = true;
> should set configured at the end, as the function can still fail.

ok

> > +
> > +	/* Release AXI from PF */
> > +	if (d->pf_device)
> > +		acc100_reg_write(d, HWPfDmaAxiControl, 1);
> > +
> > +	acc100_reg_write(d, reg_addr->dma_ring_ul5g_hi, phys_high);
> > +	acc100_reg_write(d, reg_addr->dma_ring_ul5g_lo, phys_low);
> > +	acc100_reg_write(d, reg_addr->dma_ring_dl5g_hi, phys_high);
> > +	acc100_reg_write(d, reg_addr->dma_ring_dl5g_lo, phys_low);
> > +	acc100_reg_write(d, reg_addr->dma_ring_ul4g_hi, phys_high);
> > +	acc100_reg_write(d, reg_addr->dma_ring_ul4g_lo, phys_low);
> > +	acc100_reg_write(d, reg_addr->dma_ring_dl4g_hi, phys_high);
> > +	acc100_reg_write(d, reg_addr->dma_ring_dl4g_lo, phys_low);
> > +
> > +	/*
> > +	 * Configure Ring Size to the max queue ring size
> > +	 * (used for wrapping purpose)
> > +	 */
> > +	payload = log2_basic(d->sw_ring_size / 64);
> > +	acc100_reg_write(d, reg_addr->ring_size, payload);
> > +
> > +	/* Configure tail pointer for use when SDONE enabled */
> > +	d->tail_ptrs = rte_zmalloc_socket(
> > +			dev->device->driver->name,
> > +			ACC100_NUM_QGRPS * ACC100_NUM_AQS * sizeof(uint32_t),
> > +			RTE_CACHE_LINE_SIZE, socket_id);
> > +	if (d->tail_ptrs == NULL) {
> > +		rte_bbdev_log(ERR, "Failed to allocate tail ptr for %s:%u",
> > +				dev->device->driver->name,
> > +				dev->data->dev_id);
> > +		rte_free(d->sw_rings);
> > +		return -ENOMEM;
> > +	}
> > +	d->tail_ptr_phys = rte_malloc_virt2iova(d->tail_ptrs);
> > +
> > +	phys_high = (uint32_t)(d->tail_ptr_phys >> 32);
> > +	phys_low  = (uint32_t)(d->tail_ptr_phys);
> > +	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_hi, phys_high);
> > +	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_lo, phys_low);
> > +	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_hi, phys_high);
> > +	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_lo, phys_low);
> > +	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_hi, phys_high);
> > +	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_lo, phys_low);
> > +	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
> > +	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
> > +
> > +	d->harq_layout = rte_zmalloc_socket("HARQ Layout",
> > +			ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
> > +			RTE_CACHE_LINE_SIZE, dev->data->socket_id);
> unchecked

ok will add. 

> > +
> > +	rte_bbdev_log_debug(
> > +			"ACC100 (%s) configured  sw_rings = %p, sw_rings_phys = %#"
> > +			PRIx64, dev->data->name, d->sw_rings, d->sw_rings_phys);
> > +
> > +	return 0;
> > +}
> > +
> >  /* Free 64MB memory used for software rings */  static int
> > -acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
> > +acc100_dev_close(struct rte_bbdev *dev)
> >  {
> > +	struct acc100_device *d = dev->data->dev_private;
> > +	if (d->sw_rings_base != NULL) {
> > +		rte_free(d->tail_ptrs);
> > +		rte_free(d->sw_rings_base);
> > +		d->sw_rings_base = NULL;
> > +	}
> > +	usleep(1000);
> similar LONG_WAIT

ok

> > +	return 0;
> > +}
> > +
> > +
> > +/**
> > + * Report a ACC100 queue index which is free
> > + * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
> > + * Note : Only supporting VF0 Bundle for PF mode
> > + */
> > +static int
> > +acc100_find_free_queue_idx(struct rte_bbdev *dev,
> > +		const struct rte_bbdev_queue_conf *conf)
> > +{
> > +	struct acc100_device *d = dev->data->dev_private;
> > +	int op_2_acc[5] = {0, UL_4G, DL_4G, UL_5G, DL_5G};
> > +	int acc = op_2_acc[conf->op_type];
> > +	struct rte_q_topology_t *qtop = NULL;
> > +	qtopFromAcc(&qtop, acc, &(d->acc100_conf));
> > +	if (qtop == NULL)
> > +		return -1;
> > +	/* Identify matching QGroup Index which are sorted in priority order */
> > +	uint16_t group_idx = qtop->first_qgroup_index;
> > +	group_idx += conf->priority;
> > +	if (group_idx >= ACC100_NUM_QGRPS ||
> > +			conf->priority >= qtop->num_qgroups) {
> > +		rte_bbdev_log(INFO, "Invalid Priority on %s, priority %u",
> > +				dev->data->name, conf->priority);
> > +		return -1;
> > +	}
> > +	/* Find a free AQ_idx  */
> > +	uint16_t aq_idx;
> > +	for (aq_idx = 0; aq_idx < qtop->num_aqs_per_groups; aq_idx++) {
> > +		if (((d->q_assigned_bit_map[group_idx] >> aq_idx) & 0x1) == 0) {
> > +			/* Mark the Queue as assigned */
> > +			d->q_assigned_bit_map[group_idx] |= (1 << aq_idx);
> > +			/* Report the AQ Index */
> > +			return (group_idx << GRP_ID_SHIFT) + aq_idx;
> > +		}
> > +	}
> > +	rte_bbdev_log(INFO, "Failed to find free queue on %s, priority %u",
> > +			dev->data->name, conf->priority);
> > +	return -1;
> > +}
> > +
> > +/* Setup ACC100 queue */
> > +static int
> > +acc100_queue_setup(struct rte_bbdev *dev, uint16_t queue_id,
> > +		const struct rte_bbdev_queue_conf *conf) {
> > +	struct acc100_device *d = dev->data->dev_private;
> > +	struct acc100_queue *q;
> > +	int16_t q_idx;
> > +
> > +	/* Allocate the queue data structure. */
> > +	q = rte_zmalloc_socket(dev->device->driver->name, sizeof(*q),
> > +			RTE_CACHE_LINE_SIZE, conf->socket);
> > +	if (q == NULL) {
> > +		rte_bbdev_log(ERR, "Failed to allocate queue memory");
> > +		return -ENOMEM;
> > +	}
> > +
> > +	q->d = d;
> > +	q->ring_addr = RTE_PTR_ADD(d->sw_rings, (d->sw_ring_size * queue_id));
> > +	q->ring_addr_phys = d->sw_rings_phys + (d->sw_ring_size * queue_id);
> > +
> > +	/* Prepare the Ring with default descriptor format */
> > +	union acc100_dma_desc *desc = NULL;
> > +	unsigned int desc_idx, b_idx;
> > +	int fcw_len = (conf->op_type == RTE_BBDEV_OP_LDPC_ENC ?
> > +		ACC100_FCW_LE_BLEN : (conf->op_type == RTE_BBDEV_OP_TURBO_DEC ?
> > +		ACC100_FCW_TD_BLEN : ACC100_FCW_LD_BLEN));
> > +
> > +	for (desc_idx = 0; desc_idx < d->sw_ring_max_depth; desc_idx++) {
> > +		desc = q->ring_addr + desc_idx;
> > +		desc->req.word0 = ACC100_DMA_DESC_TYPE;
> > +		desc->req.word1 = 0; /**< Timestamp */
> > +		desc->req.word2 = 0;
> > +		desc->req.word3 = 0;
> > +		uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
> > +		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
> > +		desc->req.data_ptrs[0].blen = fcw_len;
> > +		desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
> > +		desc->req.data_ptrs[0].last = 0;
> > +		desc->req.data_ptrs[0].dma_ext = 0;
> > +		for (b_idx = 1; b_idx < ACC100_DMA_MAX_NUM_POINTERS - 1;
> > +				b_idx++) {
> > +			desc->req.data_ptrs[b_idx].blkid = ACC100_DMA_BLKID_IN;
> > +			desc->req.data_ptrs[b_idx].last = 1;
> > +			desc->req.data_ptrs[b_idx].dma_ext = 0;
> > +			b_idx++;
> 
> This works, but it would be better to only inc the index in the for loop
> statement.
> 
> The second data set should be accessed as [b_idx+1]
> 
> And the loop inc by +2

Matter of preference maybe? 

> 
> > +			desc->req.data_ptrs[b_idx].blkid =
> > +					ACC100_DMA_BLKID_OUT_ENC;
> > +			desc->req.data_ptrs[b_idx].last = 1;
> > +			desc->req.data_ptrs[b_idx].dma_ext = 0;
> > +		}
> > +		/* Preset some fields of LDPC FCW */
> > +		desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
> > +		desc->req.fcw_ld.gain_i = 1;
> > +		desc->req.fcw_ld.gain_h = 1;
> > +	}
> > +
> > +	q->lb_in = rte_zmalloc_socket(dev->device->driver->name,
> > +			RTE_CACHE_LINE_SIZE,
> > +			RTE_CACHE_LINE_SIZE, conf->socket);
> > +	if (q->lb_in == NULL) {
> 
> q is not freed.

ok thanks

> 
> > +		rte_bbdev_log(ERR, "Failed to allocate lb_in memory");
> > +		return -ENOMEM;
> > +	}
> > +	q->lb_in_addr_phys = rte_malloc_virt2iova(q->lb_in);
> > +	q->lb_out = rte_zmalloc_socket(dev->device->driver->name,
> > +			RTE_CACHE_LINE_SIZE,
> > +			RTE_CACHE_LINE_SIZE, conf->socket);
> > +	if (q->lb_out == NULL) {
> > +		rte_bbdev_log(ERR, "Failed to allocate lb_out memory");
> > +		return -ENOMEM;
> 
> q->lb_in is not freed
> 
> q is not freed

ok too thanks

> 
> > +	}
> > +	q->lb_out_addr_phys = rte_malloc_virt2iova(q->lb_out);
> > +
> > +	/*
> > +	 * Software queue ring wraps synchronously with the HW when it reaches
> > +	 * the boundary of the maximum allocated queue size, no matter what the
> > +	 * sw queue size is. This wrapping is guarded by setting the wrap_mask
> > +	 * to represent the maximum queue size as allocated at the time when
> > +	 * the device has been setup (in configure()).
> > +	 *
> > +	 * The queue depth is set to the queue size value (conf->queue_size).
> > +	 * This limits the occupancy of the queue at any point of time, so that
> > +	 * the queue does not get swamped with enqueue requests.
> > +	 */
> > +	q->sw_ring_depth = conf->queue_size;
> > +	q->sw_ring_wrap_mask = d->sw_ring_max_depth - 1;
> > +
> > +	q->op_type = conf->op_type;
> > +
> > +	q_idx = acc100_find_free_queue_idx(dev, conf);
> > +	if (q_idx == -1) {
> > +		rte_free(q);
> 
> This will leak the other two ptr's
> This function needs better error handling.

Yes agreed. Thanks.

> 
> Tom
> 

Thanks for your review, Tom; aiming to push an updated series tomorrow.

Nic



> > +		return -1;
> > +	}
> > +
> > +	q->qgrp_id = (q_idx >> GRP_ID_SHIFT) & 0xF;
> > +	q->vf_id = (q_idx >> VF_ID_SHIFT)  & 0x3F;
> > +	q->aq_id = q_idx & 0xF;
> > +	q->aq_depth = (conf->op_type ==  RTE_BBDEV_OP_TURBO_DEC) ?
> > +			(1 << d->acc100_conf.q_ul_4g.aq_depth_log2) :
> > +			(1 << d->acc100_conf.q_dl_4g.aq_depth_log2);
> > +
> > +	q->mmio_reg_enqueue = RTE_PTR_ADD(d->mmio_base,
> > +			queue_offset(d->pf_device,
> > +					q->vf_id, q->qgrp_id, q->aq_id));
> > +
> > +	rte_bbdev_log_debug(
> > +			"Setup dev%u q%u: qgrp_id=%u, vf_id=%u, aq_id=%u, aq_depth=%u, mmio_reg_enqueue=%p",
> > +			dev->data->dev_id, queue_id, q->qgrp_id, q->vf_id,
> > +			q->aq_id, q->aq_depth, q->mmio_reg_enqueue);
> > +
> > +	dev->data->queues[queue_id].queue_private = q;
> > +	return 0;
> > +}
> > +
> > +/* Release ACC100 queue */
> > +static int
> > +acc100_queue_release(struct rte_bbdev *dev, uint16_t q_id) {
> > +	struct acc100_device *d = dev->data->dev_private;
> > +	struct acc100_queue *q = dev->data->queues[q_id].queue_private;
> > +
> > +	if (q != NULL) {
> > +		/* Mark the Queue as un-assigned */
> > +		d->q_assigned_bit_map[q->qgrp_id] &= (0xFFFFFFFF -
> > +				(1 << q->aq_id));
> > +		rte_free(q->lb_in);
> > +		rte_free(q->lb_out);
> > +		rte_free(q);
> > +		dev->data->queues[q_id].queue_private = NULL;
> > +	}
> > +
> >  	return 0;
> >  }
> >
> > @@ -258,8 +673,11 @@
> >  }
> >
> >  static const struct rte_bbdev_ops acc100_bbdev_ops = {
> > +	.setup_queues = acc100_setup_queues,
> >  	.close = acc100_dev_close,
> >  	.info_get = acc100_dev_info_get,
> > +	.queue_setup = acc100_queue_setup,
> > +	.queue_release = acc100_queue_release,
> >  };
> >
> >  /* ACC100 PCI PF address map */
> > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h
> > b/drivers/baseband/acc100/rte_acc100_pmd.h
> > index 662e2c8..0e2b79c 100644
> > --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> > +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> > @@ -518,11 +518,56 @@ struct acc100_registry_addr {
> >  	.ddr_range = HWVfDmaDdrBaseRangeRoVf,  };
> >
> > +/* Structure associated with each queue. */
> > +struct __rte_cache_aligned acc100_queue {
> > +	union acc100_dma_desc *ring_addr;  /* Virtual address of sw ring */
> > +	rte_iova_t ring_addr_phys;  /* Physical address of software ring */
> > +	uint32_t sw_ring_head;  /* software ring head */
> > +	uint32_t sw_ring_tail;  /* software ring tail */
> > +	/* software ring size (descriptors, not bytes) */
> > +	uint32_t sw_ring_depth;
> > +	/* mask used to wrap enqueued descriptors on the sw ring */
> > +	uint32_t sw_ring_wrap_mask;
> > +	/* MMIO register used to enqueue descriptors */
> > +	void *mmio_reg_enqueue;
> > +	uint8_t vf_id;  /* VF ID (max = 63) */
> > +	uint8_t qgrp_id;  /* Queue Group ID */
> > +	uint16_t aq_id;  /* Atomic Queue ID */
> > +	uint16_t aq_depth;  /* Depth of atomic queue */
> > +	uint32_t aq_enqueued;  /* Count how many "batches" have been enqueued */
> > +	uint32_t aq_dequeued;  /* Count how many "batches" have been dequeued */
> > +	uint32_t irq_enable;  /* Enable ops dequeue interrupts if set to 1 */
> > +	struct rte_mempool *fcw_mempool;  /* FCW mempool */
> > +	enum rte_bbdev_op_type op_type;  /* Type of this Queue: TE or TD */
> > +	/* Internal Buffers for loopback input */
> > +	uint8_t *lb_in;
> > +	uint8_t *lb_out;
> > +	rte_iova_t lb_in_addr_phys;
> > +	rte_iova_t lb_out_addr_phys;
> > +	struct acc100_device *d;
> > +};
> > +
> >  /* Private data structure for each ACC100 device */
> >  struct acc100_device {
> >  	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
> > +	void *sw_rings_base;  /* Base addr of un-aligned memory for sw rings */
> > +	void *sw_rings;  /* 64MBs of 64MB aligned memory for sw rings */
> > +	rte_iova_t sw_rings_phys;  /* Physical address of sw_rings */
> > +	/* Virtual address of the info memory routed to this function under
> > +	 * operation, whether it is PF or VF.
> > +	 */
> > +	union acc100_harq_layout_data *harq_layout;
> > +	uint32_t sw_ring_size;
> >  	uint32_t ddr_size; /* Size in kB */
> > +	uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */
> > +	rte_iova_t tail_ptr_phys; /* Physical address of tail pointers */
> > +	/* Max number of entries available for each queue in device, depending
> > +	 * on how many queues are enabled with configure()
> > +	 */
> > +	uint32_t sw_ring_max_depth;
> >  	struct acc100_conf acc100_conf; /* ACC100 Initial configuration */
> > +	/* Bitmap capturing which Queues have already been assigned */
> > +	uint16_t q_assigned_bit_map[ACC100_NUM_QGRPS];
> >  	bool pf_device; /**< True if this is a PF ACC100 device */
> >  	bool configured; /**< True if this ACC100 device is configured */
> > };



* Re: [dpdk-dev] [PATCH v9 05/10] baseband/acc100: add LDPC processing functions
  2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 05/10] baseband/acc100: add LDPC processing functions Nicolas Chautru
@ 2020-09-30 16:53       ` Tom Rix
  2020-09-30 18:52         ` Chautru, Nicolas
  0 siblings, 1 reply; 213+ messages in thread
From: Tom Rix @ 2020-09-30 16:53 UTC (permalink / raw)
  To: Nicolas Chautru, dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu


On 9/28/20 5:29 PM, Nicolas Chautru wrote:
> Adding LDPC decode and encode processing operations
>
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
> Acked-by: Dave Burley <dave.burley@accelercomm.com>
> ---
>  doc/guides/bbdevs/features/acc100.ini    |    8 +-
>  drivers/baseband/acc100/rte_acc100_pmd.c | 1625 +++++++++++++++++++++++++++++-
>  drivers/baseband/acc100/rte_acc100_pmd.h |    3 +
>  3 files changed, 1630 insertions(+), 6 deletions(-)
>
> diff --git a/doc/guides/bbdevs/features/acc100.ini b/doc/guides/bbdevs/features/acc100.ini
> index c89a4d7..40c7adc 100644
> --- a/doc/guides/bbdevs/features/acc100.ini
> +++ b/doc/guides/bbdevs/features/acc100.ini
> @@ -6,9 +6,9 @@
>  [Features]
>  Turbo Decoder (4G)     = N
>  Turbo Encoder (4G)     = N
> -LDPC Decoder (5G)      = N
> -LDPC Encoder (5G)      = N
> -LLR/HARQ Compression   = N
> -External DDR Access    = N
> +LDPC Decoder (5G)      = Y
> +LDPC Encoder (5G)      = Y
> +LLR/HARQ Compression   = Y
> +External DDR Access    = Y
>  HW Accelerated         = Y
>  BBDEV API              = Y
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
> index 7a21c57..b223547 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> @@ -15,6 +15,9 @@
>  #include <rte_hexdump.h>
>  #include <rte_pci.h>
>  #include <rte_bus_pci.h>
> +#ifdef RTE_BBDEV_OFFLOAD_COST
> +#include <rte_cycles.h>
> +#endif
>  
>  #include <rte_bbdev.h>
>  #include <rte_bbdev_pmd.h>
> @@ -449,7 +452,6 @@
>  	return 0;
>  }
>  
> -
>  /**
>   * Report a ACC100 queue index which is free
>   * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
> @@ -634,6 +636,46 @@
>  	struct acc100_device *d = dev->data->dev_private;
>  
>  	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
> +		{
> +			.type   = RTE_BBDEV_OP_LDPC_ENC,
> +			.cap.ldpc_enc = {
> +				.capability_flags =
> +					RTE_BBDEV_LDPC_RATE_MATCH |
> +					RTE_BBDEV_LDPC_CRC_24B_ATTACH |
> +					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
> +				.num_buffers_src =
> +						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> +				.num_buffers_dst =
> +						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> +			}
> +		},
> +		{
> +			.type   = RTE_BBDEV_OP_LDPC_DEC,
> +			.cap.ldpc_dec = {
> +			.capability_flags =
> +				RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK |
> +				RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP |
> +				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
> +				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
> +#ifdef ACC100_EXT_MEM

This is unconditionally defined in rte_acc100_pmd.h, but it seems
like it could be a hw config. Please add a comment in the *.h

Could also change to

#if ACC100_EXT_MEM

and change the define to #define ACC100_EXT_MEM 1

> +				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
> +				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
> +#endif
> +				RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE |
> +				RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS |
> +				RTE_BBDEV_LDPC_DECODE_BYPASS |
> +				RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
> +				RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
> +				RTE_BBDEV_LDPC_LLR_COMPRESSION,
> +			.llr_size = 8,
> +			.llr_decimals = 1,
> +			.num_buffers_src =
> +					RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> +			.num_buffers_hard_out =
> +					RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> +			.num_buffers_soft_out = 0,
> +			}
> +		},
>  		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
>  	};
>  
> @@ -669,9 +711,14 @@
>  	dev_info->cpu_flag_reqs = NULL;
>  	dev_info->min_alignment = 64;
>  	dev_info->capabilities = bbdev_capabilities;
> +#ifdef ACC100_EXT_MEM
>  	dev_info->harq_buffer_size = d->ddr_size;
> +#else
> +	dev_info->harq_buffer_size = 0;
> +#endif
>  }
>  
> +
>  static const struct rte_bbdev_ops acc100_bbdev_ops = {
>  	.setup_queues = acc100_setup_queues,
>  	.close = acc100_dev_close,
> @@ -696,6 +743,1577 @@
>  	{.device_id = 0},
>  };
>  
> +/* Read flag value 0/1 from bitmap */
> +static inline bool
> +check_bit(uint32_t bitmap, uint32_t bitmask)
> +{
> +	return bitmap & bitmask;
> +}
> +

All the bbdev have this function, its pretty trival but it would be good if common bbdev

functions got moved to a common place.

> +static inline char *
> +mbuf_append(struct rte_mbuf *m_head, struct rte_mbuf *m, uint16_t len)
> +{
> +	if (unlikely(len > rte_pktmbuf_tailroom(m)))
> +		return NULL;
> +
> +	char *tail = (char *)m->buf_addr + m->data_off + m->data_len;
> +	m->data_len = (uint16_t)(m->data_len + len);
> +	m_head->pkt_len  = (m_head->pkt_len + len);
> +	return tail;
> +}
> +
> +/* Compute value of k0.
> + * Based on 3GPP 38.212 Table 5.4.2.1-2
> + * Starting position of different redundancy versions, k0
> + */
> +static inline uint16_t
> +get_k0(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv_index)
> +{
> +	if (rv_index == 0)
> +		return 0;
> +	uint16_t n = (bg == 1 ? N_ZC_1 : N_ZC_2) * z_c;
> +	if (n_cb == n) {
> +		if (rv_index == 1)
> +			return (bg == 1 ? K0_1_1 : K0_1_2) * z_c;
> +		else if (rv_index == 2)
> +			return (bg == 1 ? K0_2_1 : K0_2_2) * z_c;
> +		else
> +			return (bg == 1 ? K0_3_1 : K0_3_2) * z_c;
> +	}
> +	/* LBRM case - includes a division by N */
> +	if (rv_index == 1)
> +		return (((bg == 1 ? K0_1_1 : K0_1_2) * n_cb)
> +				/ n) * z_c;
> +	else if (rv_index == 2)
> +		return (((bg == 1 ? K0_2_1 : K0_2_2) * n_cb)
> +				/ n) * z_c;
> +	else
> +		return (((bg == 1 ? K0_3_1 : K0_3_2) * n_cb)
> +				/ n) * z_c;
> +}
> +
> +/* Fill in a frame control word for LDPC encoding. */
> +static inline void
> +acc100_fcw_le_fill(const struct rte_bbdev_enc_op *op,
> +		struct acc100_fcw_le *fcw, int num_cb)
> +{
> +	fcw->qm = op->ldpc_enc.q_m;
> +	fcw->nfiller = op->ldpc_enc.n_filler;
> +	fcw->BG = (op->ldpc_enc.basegraph - 1);
> +	fcw->Zc = op->ldpc_enc.z_c;
> +	fcw->ncb = op->ldpc_enc.n_cb;
> +	fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_enc.basegraph,
> +			op->ldpc_enc.rv_index);
> +	fcw->rm_e = op->ldpc_enc.cb_params.e;
> +	fcw->crc_select = check_bit(op->ldpc_enc.op_flags,
> +			RTE_BBDEV_LDPC_CRC_24B_ATTACH);
> +	fcw->bypass_intlv = check_bit(op->ldpc_enc.op_flags,
> +			RTE_BBDEV_LDPC_INTERLEAVER_BYPASS);
> +	fcw->mcb_count = num_cb;
> +}
> +
> +/* Fill in a frame control word for LDPC decoding. */
> +static inline void
> +acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
> +		union acc100_harq_layout_data *harq_layout)
> +{
> +	uint16_t harq_out_length, harq_in_length, ncb_p, k0_p, parity_offset;
> +	uint16_t harq_index;
> +	uint32_t l;
> +	bool harq_prun = false;
> +
> +	fcw->qm = op->ldpc_dec.q_m;
> +	fcw->nfiller = op->ldpc_dec.n_filler;
> +	fcw->BG = (op->ldpc_dec.basegraph - 1);
> +	fcw->Zc = op->ldpc_dec.z_c;
> +	fcw->ncb = op->ldpc_dec.n_cb;
> +	fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_dec.basegraph,
> +			op->ldpc_dec.rv_index);
> +	if (op->ldpc_dec.code_block_mode == 1)
1 is magic, consider a #define
> +		fcw->rm_e = op->ldpc_dec.cb_params.e;
> +	else
> +		fcw->rm_e = (op->ldpc_dec.tb_params.r <
> +				op->ldpc_dec.tb_params.cab) ?
> +						op->ldpc_dec.tb_params.ea :
> +						op->ldpc_dec.tb_params.eb;
> +
> +	fcw->hcin_en = check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE);
> +	fcw->hcout_en = check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE);
> +	fcw->crc_select = check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK);
> +	fcw->bypass_dec = check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_DECODE_BYPASS);
> +	fcw->bypass_intlv = check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS);
> +	if (op->ldpc_dec.q_m == 1) {
> +		fcw->bypass_intlv = 1;
> +		fcw->qm = 2;
> +	}
Similar magic numbers here (the q_m == 1 and qm = 2 values); same #define comment applies.
> +	fcw->hcin_decomp_mode = check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> +	fcw->hcout_comp_mode = check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> +	fcw->llr_pack_mode = check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_LLR_COMPRESSION);
> +	harq_index = op->ldpc_dec.harq_combined_output.offset /
> +			ACC100_HARQ_OFFSET;
> +#ifdef ACC100_EXT_MEM
> +	/* Limit cases when HARQ pruning is valid */
> +	harq_prun = ((op->ldpc_dec.harq_combined_output.offset %
> +			ACC100_HARQ_OFFSET) == 0) &&
> +			(op->ldpc_dec.harq_combined_output.offset <= UINT16_MAX
> +			* ACC100_HARQ_OFFSET);
> +#endif
> +	if (fcw->hcin_en > 0) {
> +		harq_in_length = op->ldpc_dec.harq_combined_input.length;
> +		if (fcw->hcin_decomp_mode > 0)
> +			harq_in_length = harq_in_length * 8 / 6;
> +		harq_in_length = RTE_ALIGN(harq_in_length, 64);
> +		if ((harq_layout[harq_index].offset > 0) & harq_prun) {
> +			rte_bbdev_log_debug("HARQ IN offset unexpected for now\n");
> +			fcw->hcin_size0 = harq_layout[harq_index].size0;
> +			fcw->hcin_offset = harq_layout[harq_index].offset;
> +			fcw->hcin_size1 = harq_in_length -
> +					harq_layout[harq_index].offset;
> +		} else {
> +			fcw->hcin_size0 = harq_in_length;
> +			fcw->hcin_offset = 0;
> +			fcw->hcin_size1 = 0;
> +		}
> +	} else {
> +		fcw->hcin_size0 = 0;
> +		fcw->hcin_offset = 0;
> +		fcw->hcin_size1 = 0;
> +	}
> +
> +	fcw->itmax = op->ldpc_dec.iter_max;
> +	fcw->itstop = check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE);
> +	fcw->synd_precoder = fcw->itstop;
> +	/*
> +	 * These are all implicitly set
> +	 * fcw->synd_post = 0;
> +	 * fcw->so_en = 0;
> +	 * fcw->so_bypass_rm = 0;
> +	 * fcw->so_bypass_intlv = 0;
> +	 * fcw->dec_convllr = 0;
> +	 * fcw->hcout_convllr = 0;
> +	 * fcw->hcout_size1 = 0;
> +	 * fcw->so_it = 0;
> +	 * fcw->hcout_offset = 0;
> +	 * fcw->negstop_th = 0;
> +	 * fcw->negstop_it = 0;
> +	 * fcw->negstop_en = 0;
> +	 * fcw->gain_i = 1;
> +	 * fcw->gain_h = 1;
> +	 */
> +	if (fcw->hcout_en > 0) {
> +		parity_offset = (op->ldpc_dec.basegraph == 1 ? 20 : 8)
> +			* op->ldpc_dec.z_c - op->ldpc_dec.n_filler;
> +		k0_p = (fcw->k0 > parity_offset) ?
> +				fcw->k0 - op->ldpc_dec.n_filler : fcw->k0;
> +		ncb_p = fcw->ncb - op->ldpc_dec.n_filler;
> +		l = k0_p + fcw->rm_e;
> +		harq_out_length = (uint16_t) fcw->hcin_size0;
> +		harq_out_length = RTE_MIN(RTE_MAX(harq_out_length, l), ncb_p);
> +		harq_out_length = (harq_out_length + 0x3F) & 0xFFC0;
> +		if ((k0_p > fcw->hcin_size0 + ACC100_HARQ_OFFSET_THRESHOLD) &&
> +				harq_prun) {
> +			fcw->hcout_size0 = (uint16_t) fcw->hcin_size0;
> +			fcw->hcout_offset = k0_p & 0xFFC0;
> +			fcw->hcout_size1 = harq_out_length - fcw->hcout_offset;
> +		} else {
> +			fcw->hcout_size0 = harq_out_length;
> +			fcw->hcout_size1 = 0;
> +			fcw->hcout_offset = 0;
> +		}
> +		harq_layout[harq_index].offset = fcw->hcout_offset;
> +		harq_layout[harq_index].size0 = fcw->hcout_size0;
> +	} else {
> +		fcw->hcout_size0 = 0;
> +		fcw->hcout_size1 = 0;
> +		fcw->hcout_offset = 0;
> +	}
> +}
> +
> +/**
> + * Fills descriptor with data pointers of one block type.
> + *
> + * @param desc
> + *   Pointer to DMA descriptor.
> + * @param input
> + *   Pointer to pointer to input data which will be encoded. It can be changed
> + *   and points to next segment in scatter-gather case.
> + * @param offset
> + *   Input offset in rte_mbuf structure. It is used for calculating the point
> + *   where data is starting.
> + * @param cb_len
> + *   Length of currently processed Code Block
> + * @param seg_total_left
> + *   It indicates how many bytes still left in segment (mbuf) for further
> + *   processing.
> + * @param op_flags
> + *   Store information about device capabilities
> + * @param next_triplet
> + *   Index for ACC100 DMA Descriptor triplet
> + *
> + * @return
> + *   Returns index of next triplet on success, other value if lengths of
> + *   pkt and processed cb do not match.
> + *
> + */
> +static inline int
> +acc100_dma_fill_blk_type_in(struct acc100_dma_req_desc *desc,
> +		struct rte_mbuf **input, uint32_t *offset, uint32_t cb_len,
> +		uint32_t *seg_total_left, int next_triplet)
> +{
> +	uint32_t part_len;
> +	struct rte_mbuf *m = *input;
> +
> +	part_len = (*seg_total_left < cb_len) ? *seg_total_left : cb_len;
> +	cb_len -= part_len;
> +	*seg_total_left -= part_len;
> +
> +	desc->data_ptrs[next_triplet].address =
> +			rte_pktmbuf_iova_offset(m, *offset);
> +	desc->data_ptrs[next_triplet].blen = part_len;
> +	desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
> +	desc->data_ptrs[next_triplet].last = 0;
> +	desc->data_ptrs[next_triplet].dma_ext = 0;
> +	*offset += part_len;
> +	next_triplet++;
> +
> +	while (cb_len > 0) {

Since cb_len is unsigned, a better check would be

while (cb_len != 0)

> +		if (next_triplet < ACC100_DMA_MAX_NUM_POINTERS &&
> +				m->next != NULL) {
> +
> +			m = m->next;
> +			*seg_total_left = rte_pktmbuf_data_len(m);
> +			part_len = (*seg_total_left < cb_len) ?
> +					*seg_total_left :
> +					cb_len;
> +			desc->data_ptrs[next_triplet].address =
> +					rte_pktmbuf_iova_offset(m, 0);
> +			desc->data_ptrs[next_triplet].blen = part_len;
> +			desc->data_ptrs[next_triplet].blkid =
> +					ACC100_DMA_BLKID_IN;
> +			desc->data_ptrs[next_triplet].last = 0;
> +			desc->data_ptrs[next_triplet].dma_ext = 0;
> +			cb_len -= part_len;
> +			*seg_total_left -= part_len;

When *seg_total_left goes to zero here while cb_len is still non-zero, the following iterations do nothing useful.

The loop should stop early.
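A standalone sketch of the suggested loop shape (this simulates splitting
cb_len across an array of segment lengths standing in for the mbuf chain;
it is not the driver code itself), using `!= 0` on the unsigned counter
and exiting as soon as the code block is fully covered:

```c
#include <stdint.h>
#include <stddef.h>

/* Split `cb_len` bytes across segments; returns the number of segments
 * consumed, or -1 if the segments run out before cb_len is covered. */
static int
split_cb(uint32_t cb_len, const uint32_t *seg_len, size_t num_segs)
{
	size_t i;

	/* Unsigned counter: compare against zero, stop as soon as done. */
	for (i = 0; i < num_segs && cb_len != 0; i++) {
		uint32_t part = seg_len[i] < cb_len ? seg_len[i] : cb_len;
		cb_len -= part;
	}
	return (cb_len == 0) ? (int)i : -1;
}
```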

> +			/* Initializing offset for next segment (mbuf) */
> +			*offset = part_len;
> +			next_triplet++;
> +		} else {
> +			rte_bbdev_log(ERR,
> +				"Some data still left for processing: "
> +				"data_left: %u, next_triplet: %u, next_mbuf: %p",
> +				cb_len, next_triplet, m->next);
> +			return -EINVAL;
> +		}
> +	}
> +	/* Storing new mbuf as it could be changed in scatter-gather case*/
> +	*input = m;
> +
> +	return next_triplet;

Callers, after checking the return value, decrement it.

Maybe return next_triplet - 1 here and save the callers from doing it.

> +}
> +
> +/* Fills descriptor with data pointers of one block type.
> + * Returns index of next triplet on success, other value if lengths of
> + * output data and processed mbuf do not match.
> + */
> +static inline int
> +acc100_dma_fill_blk_type_out(struct acc100_dma_req_desc *desc,
> +		struct rte_mbuf *output, uint32_t out_offset,
> +		uint32_t output_len, int next_triplet, int blk_id)
> +{
> +	desc->data_ptrs[next_triplet].address =
> +			rte_pktmbuf_iova_offset(output, out_offset);
> +	desc->data_ptrs[next_triplet].blen = output_len;
> +	desc->data_ptrs[next_triplet].blkid = blk_id;
> +	desc->data_ptrs[next_triplet].last = 0;
> +	desc->data_ptrs[next_triplet].dma_ext = 0;
> +	next_triplet++;

Callers check for a return < 0, as above, but there is no corresponding

logic here to bounds-check next_triplet and return -EINVAL.

Either add that check here or remove the < 0 checks in the callers.
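Something like this sketch (the max-pointers value here is illustrative,
not necessarily the driver's actual ACC100_DMA_MAX_NUM_POINTERS):

```c
#include <errno.h>

#define SKETCH_MAX_NUM_POINTERS 7 /* illustrative bound */

/* Sketch: the same bounds check the input-side helper performs, done
 * before writing the triplet, so the callers' `< 0` checks become
 * meaningful for this helper too. */
static int
fill_blk_checked(int next_triplet)
{
	if (next_triplet >= SKETCH_MAX_NUM_POINTERS)
		return -EINVAL;
	/* ... fill desc->data_ptrs[next_triplet] here ... */
	return next_triplet + 1;
}
```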

> +
> +	return next_triplet;
> +}
> +
> +static inline int
> +acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
> +		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
> +		struct rte_mbuf *output, uint32_t *in_offset,
> +		uint32_t *out_offset, uint32_t *out_length,
> +		uint32_t *mbuf_total_left, uint32_t *seg_total_left)
> +{
> +	int next_triplet = 1; /* FCW already done */
> +	uint16_t K, in_length_in_bits, in_length_in_bytes;
> +	struct rte_bbdev_op_ldpc_enc *enc = &op->ldpc_enc;
> +
> +	desc->word0 = ACC100_DMA_DESC_TYPE;
> +	desc->word1 = 0; /**< Timestamp could be disabled */
> +	desc->word2 = 0;
> +	desc->word3 = 0;
> +	desc->numCBs = 1;
> +
> +	K = (enc->basegraph == 1 ? 22 : 10) * enc->z_c;
> +	in_length_in_bits = K - enc->n_filler;
Can this wrap around, i.e. can enc->n_filler > K here?
> +	if ((enc->op_flags & RTE_BBDEV_LDPC_CRC_24A_ATTACH) ||
> +			(enc->op_flags & RTE_BBDEV_LDPC_CRC_24B_ATTACH))
> +		in_length_in_bits -= 24;
> +	in_length_in_bytes = in_length_in_bits >> 3;
> +
> +	if (unlikely((*mbuf_total_left == 0) ||
The *mbuf_total_left == 0 check is covered by the following < in_length_in_bytes check and can be removed.
> +			(*mbuf_total_left < in_length_in_bytes))) {
> +		rte_bbdev_log(ERR,
> +				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
> +				*mbuf_total_left, in_length_in_bytes);
> +		return -1;
> +	}
> +
> +	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
> +			in_length_in_bytes,
> +			seg_total_left, next_triplet);
> +	if (unlikely(next_triplet < 0)) {
> +		rte_bbdev_log(ERR,
> +				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
> +				op);
> +		return -1;
> +	}
> +	desc->data_ptrs[next_triplet - 1].last = 1;
> +	desc->m2dlen = next_triplet;
> +	*mbuf_total_left -= in_length_in_bytes;

Updating the caller's output pointers should be deferred until the call is known to be successful.

Otherwise a failure leaves the caller in a bad, unknown state.
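A sketch of the defer-commit pattern meant here (toy code, not the driver
function): work on a local copy and only write back through the output
pointer once every step has succeeded.

```c
#include <stdint.h>

/* `fill_ok` stands in for a later fill step that may fail. On any
 * failure, the caller's counter is left untouched. */
static int
consume(uint32_t *mbuf_total_left, uint32_t need, int fill_ok)
{
	uint32_t left = *mbuf_total_left; /* local working copy */

	if (left < need)
		return -1;
	left -= need;
	if (!fill_ok)
		return -1;         /* caller's state unchanged */
	*mbuf_total_left = left;   /* commit only on success */
	return 0;
}

/* Small demo wrapper so the behavior is easy to check. */
static uint32_t
consume_demo(uint32_t start, uint32_t need, int fill_ok)
{
	uint32_t v = start;

	(void)consume(&v, need, fill_ok);
	return v;
}
```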

> +
> +	/* Set output length */
> +	/* Integer round up division by 8 */
> +	*out_length = (enc->cb_params.e + 7) >> 3;
> +
> +	next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
> +			*out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
> +	if (unlikely(next_triplet < 0)) {
> +		rte_bbdev_log(ERR,
> +				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
> +				op);
> +		return -1;
> +	}
> +	op->ldpc_enc.output.length += *out_length;
> +	*out_offset += *out_length;
> +	desc->data_ptrs[next_triplet - 1].last = 1;
> +	desc->data_ptrs[next_triplet - 1].dma_ext = 0;
> +	desc->d2mlen = next_triplet - desc->m2dlen;
> +
> +	desc->op_addr = op;
> +
> +	return 0;
> +}
> +
> +static inline int
> +acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
> +		struct acc100_dma_req_desc *desc,
> +		struct rte_mbuf **input, struct rte_mbuf *h_output,
> +		uint32_t *in_offset, uint32_t *h_out_offset,
> +		uint32_t *h_out_length, uint32_t *mbuf_total_left,
> +		uint32_t *seg_total_left,
> +		struct acc100_fcw_ld *fcw)
> +{
> +	struct rte_bbdev_op_ldpc_dec *dec = &op->ldpc_dec;
> +	int next_triplet = 1; /* FCW already done */
> +	uint32_t input_length;
> +	uint16_t output_length, crc24_overlap = 0;
> +	uint16_t sys_cols, K, h_p_size, h_np_size;
> +	bool h_comp = check_bit(dec->op_flags,
> +			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> +
> +	desc->word0 = ACC100_DMA_DESC_TYPE;
> +	desc->word1 = 0; /**< Timestamp could be disabled */
> +	desc->word2 = 0;
> +	desc->word3 = 0;
> +	desc->numCBs = 1;
This descriptor setup is repeated across the fill functions; consider a macro or inline helper.
> +
> +	if (check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP))
> +		crc24_overlap = 24;
> +
> +	/* Compute some LDPC BG lengths */
> +	input_length = dec->cb_params.e;
> +	if (check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_LLR_COMPRESSION))
> +		input_length = (input_length * 3 + 3) / 4;
> +	sys_cols = (dec->basegraph == 1) ? 22 : 10;
> +	K = sys_cols * dec->z_c;
> +	output_length = K - dec->n_filler - crc24_overlap;
> +
> +	if (unlikely((*mbuf_total_left == 0) ||
Similar to above, the *mbuf_total_left == 0 half of this check can be removed.
> +			(*mbuf_total_left < input_length))) {
> +		rte_bbdev_log(ERR,
> +				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
> +				*mbuf_total_left, input_length);
> +		return -1;
> +	}
> +
> +	next_triplet = acc100_dma_fill_blk_type_in(desc, input,
> +			in_offset, input_length,
> +			seg_total_left, next_triplet);
> +
> +	if (unlikely(next_triplet < 0)) {
> +		rte_bbdev_log(ERR,
> +				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
> +				op);
> +		return -1;
> +	}
> +
> +	if (check_bit(op->ldpc_dec.op_flags,
> +				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
> +		h_p_size = fcw->hcin_size0 + fcw->hcin_size1;
> +		if (h_comp)
> +			h_p_size = (h_p_size * 3 + 3) / 4;
> +		desc->data_ptrs[next_triplet].address =
> +				dec->harq_combined_input.offset;
> +		desc->data_ptrs[next_triplet].blen = h_p_size;
> +		desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN_HARQ;
> +		desc->data_ptrs[next_triplet].dma_ext = 1;
> +#ifndef ACC100_EXT_MEM
> +		acc100_dma_fill_blk_type_out(
> +				desc,
> +				op->ldpc_dec.harq_combined_input.data,
> +				op->ldpc_dec.harq_combined_input.offset,
> +				h_p_size,
> +				next_triplet,
> +				ACC100_DMA_BLKID_IN_HARQ);
> +#endif
> +		next_triplet++;
> +	}
> +
> +	desc->data_ptrs[next_triplet - 1].last = 1;
> +	desc->m2dlen = next_triplet;
> +	*mbuf_total_left -= input_length;
> +
> +	next_triplet = acc100_dma_fill_blk_type_out(desc, h_output,
> +			*h_out_offset, output_length >> 3, next_triplet,
> +			ACC100_DMA_BLKID_OUT_HARD);
> +	if (unlikely(next_triplet < 0)) {
> +		rte_bbdev_log(ERR,
> +				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
> +				op);
> +		return -1;
> +	}
> +
> +	if (check_bit(op->ldpc_dec.op_flags,
> +				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
> +		/* Pruned size of the HARQ */
> +		h_p_size = fcw->hcout_size0 + fcw->hcout_size1;
> +		/* Non-Pruned size of the HARQ */
> +		h_np_size = fcw->hcout_offset > 0 ?
> +				fcw->hcout_offset + fcw->hcout_size1 :
> +				h_p_size;
> +		if (h_comp) {
> +			h_np_size = (h_np_size * 3 + 3) / 4;
> +			h_p_size = (h_p_size * 3 + 3) / 4;

Nit: a `... * 4 - 1) / 4` form may produce better assembly here.
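Whichever spelling ends up reading or compiling best, the result must stay
the round-up in the patch: `(x * 3 + 3) / 4` is ceil(3x/4), i.e. the 6-bit
compressed HARQ size is 3/4 of the uncompressed one, rounded up. A quick
standalone check of that identity:

```c
#include <stdint.h>

/* ceil(3*x / 4), exactly as written in the patch for h_p_size /
 * h_np_size under 6-bit HARQ compression. */
static inline uint16_t
harq_comp_size(uint16_t x)
{
	return (x * 3 + 3) / 4;
}
```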

> +		}
> +		dec->harq_combined_output.length = h_np_size;
> +		desc->data_ptrs[next_triplet].address =
> +				dec->harq_combined_output.offset;
> +		desc->data_ptrs[next_triplet].blen = h_p_size;
> +		desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_OUT_HARQ;
> +		desc->data_ptrs[next_triplet].dma_ext = 1;
> +#ifndef ACC100_EXT_MEM
> +		acc100_dma_fill_blk_type_out(
> +				desc,
> +				dec->harq_combined_output.data,
> +				dec->harq_combined_output.offset,
> +				h_p_size,
> +				next_triplet,
> +				ACC100_DMA_BLKID_OUT_HARQ);
> +#endif
> +		next_triplet++;
> +	}
> +
> +	*h_out_length = output_length >> 3;
> +	dec->hard_output.length += *h_out_length;
> +	*h_out_offset += *h_out_length;
> +	desc->data_ptrs[next_triplet - 1].last = 1;
> +	desc->d2mlen = next_triplet - desc->m2dlen;
> +
> +	desc->op_addr = op;
> +
> +	return 0;
> +}
> +
> +static inline void
> +acc100_dma_desc_ld_update(struct rte_bbdev_dec_op *op,
> +		struct acc100_dma_req_desc *desc,
> +		struct rte_mbuf *input, struct rte_mbuf *h_output,
> +		uint32_t *in_offset, uint32_t *h_out_offset,
> +		uint32_t *h_out_length,
> +		union acc100_harq_layout_data *harq_layout)
> +{
> +	int next_triplet = 1; /* FCW already done */
> +	desc->data_ptrs[next_triplet].address =
> +			rte_pktmbuf_iova_offset(input, *in_offset);
> +	next_triplet++;

There are no bounds checks on next_triplet in this function.

This is a general problem across these fill/update helpers.

> +
> +	if (check_bit(op->ldpc_dec.op_flags,
> +				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
> +		struct rte_bbdev_op_data hi = op->ldpc_dec.harq_combined_input;
> +		desc->data_ptrs[next_triplet].address = hi.offset;
> +#ifndef ACC100_EXT_MEM
> +		desc->data_ptrs[next_triplet].address =
> +				rte_pktmbuf_iova_offset(hi.data, hi.offset);
> +#endif
> +		next_triplet++;
> +	}
> +
> +	desc->data_ptrs[next_triplet].address =
> +			rte_pktmbuf_iova_offset(h_output, *h_out_offset);
> +	*h_out_length = desc->data_ptrs[next_triplet].blen;
> +	next_triplet++;
> +
> +	if (check_bit(op->ldpc_dec.op_flags,
> +				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
> +		desc->data_ptrs[next_triplet].address =
> +				op->ldpc_dec.harq_combined_output.offset;
> +		/* Adjust based on previous operation */
> +		struct rte_bbdev_dec_op *prev_op = desc->op_addr;
> +		op->ldpc_dec.harq_combined_output.length =
> +				prev_op->ldpc_dec.harq_combined_output.length;
> +		int16_t hq_idx = op->ldpc_dec.harq_combined_output.offset /
> +				ACC100_HARQ_OFFSET;
> +		int16_t prev_hq_idx =
> +				prev_op->ldpc_dec.harq_combined_output.offset
> +				/ ACC100_HARQ_OFFSET;
> +		harq_layout[hq_idx].val = harq_layout[prev_hq_idx].val;
> +#ifndef ACC100_EXT_MEM
> +		struct rte_bbdev_op_data ho =
> +				op->ldpc_dec.harq_combined_output;
> +		desc->data_ptrs[next_triplet].address =
> +				rte_pktmbuf_iova_offset(ho.data, ho.offset);
> +#endif
> +		next_triplet++;
> +	}
> +
> +	op->ldpc_dec.hard_output.length += *h_out_length;
> +	desc->op_addr = op;
> +}
> +
> +
> +/* Enqueue a number of operations to HW and update software rings */
> +static inline void
> +acc100_dma_enqueue(struct acc100_queue *q, uint16_t n,
> +		struct rte_bbdev_stats *queue_stats)
> +{
> +	union acc100_enqueue_reg_fmt enq_req;
> +#ifdef RTE_BBDEV_OFFLOAD_COST
> +	uint64_t start_time = 0;
> +	queue_stats->acc_offload_cycles = 0;
> +	RTE_SET_USED(queue_stats);
> +#else
> +	RTE_SET_USED(queue_stats);
> +#endif

RTE_SET_USED(queue_stats) appears in both the #ifdef and #else branches,

so it should be moved out of the conditional.

> +
> +	enq_req.val = 0;
> +	/* Setting offset, 100b for 256 DMA Desc */
> +	enq_req.addr_offset = ACC100_DESC_OFFSET;
> +
Should n != 0 be checked here, before entering the do/while loop?
> +	/* Split ops into batches */
> +	do {
> +		union acc100_dma_desc *desc;
> +		uint16_t enq_batch_size;
> +		uint64_t offset;
> +		rte_iova_t req_elem_addr;
> +
> +		enq_batch_size = RTE_MIN(n, MAX_ENQ_BATCH_SIZE);
> +
> +		/* Set flag on last descriptor in a batch */
> +		desc = q->ring_addr + ((q->sw_ring_head + enq_batch_size - 1) &
> +				q->sw_ring_wrap_mask);
> +		desc->req.last_desc_in_batch = 1;
> +
> +		/* Calculate the 1st descriptor's address */
> +		offset = ((q->sw_ring_head & q->sw_ring_wrap_mask) *
> +				sizeof(union acc100_dma_desc));
> +		req_elem_addr = q->ring_addr_phys + offset;
> +
> +		/* Fill enqueue struct */
> +		enq_req.num_elem = enq_batch_size;
> +		/* low 6 bits are not needed */
> +		enq_req.req_elem_addr = (uint32_t)(req_elem_addr >> 6);
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +		rte_memdump(stderr, "Req sdone", desc, sizeof(*desc));
> +#endif
> +		rte_bbdev_log_debug(
> +				"Enqueue %u reqs (phys %#"PRIx64") to reg %p",
> +				enq_batch_size,
> +				req_elem_addr,
> +				(void *)q->mmio_reg_enqueue);
> +
> +		rte_wmb();
> +
> +#ifdef RTE_BBDEV_OFFLOAD_COST
> +		/* Start time measurement for enqueue function offload. */
> +		start_time = rte_rdtsc_precise();
> +#endif
> +		rte_bbdev_log(DEBUG, "Debug : MMIO Enqueue");

The time spent logging will be counted against the mmio_write,

so the rte_bbdev_log() call should be moved above where start_time is taken.

> +		mmio_write(q->mmio_reg_enqueue, enq_req.val);
> +
> +#ifdef RTE_BBDEV_OFFLOAD_COST
> +		queue_stats->acc_offload_cycles +=
> +				rte_rdtsc_precise() - start_time;
> +#endif
> +
> +		q->aq_enqueued++;
> +		q->sw_ring_head += enq_batch_size;
> +		n -= enq_batch_size;
> +
> +	} while (n);
> +
> +
> +}
> +
> +/* Enqueue one encode operations for ACC100 device in CB mode */
> +static inline int
> +enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
> +		uint16_t total_enqueued_cbs, int16_t num)
> +{
> +	union acc100_dma_desc *desc = NULL;
> +	uint32_t out_length;
> +	struct rte_mbuf *output_head, *output;
> +	int i, next_triplet;
> +	uint16_t  in_length_in_bytes;
> +	struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
> +
> +	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> +			& q->sw_ring_wrap_mask);
> +	desc = q->ring_addr + desc_idx;
> +	acc100_fcw_le_fill(ops[0], &desc->req.fcw_le, num);
> +
> +	/** This could be done at polling */
> +	desc->req.word0 = ACC100_DMA_DESC_TYPE;
> +	desc->req.word1 = 0; /**< Timestamp could be disabled */
> +	desc->req.word2 = 0;
> +	desc->req.word3 = 0;
> +	desc->req.numCBs = num;
> +
> +	in_length_in_bytes = ops[0]->ldpc_enc.input.data->data_len;
> +	out_length = (enc->cb_params.e + 7) >> 3;
> +	desc->req.m2dlen = 1 + num;
> +	desc->req.d2mlen = num;
> +	next_triplet = 1;
> +
> +	for (i = 0; i < num; i++) {
i is redundant here; it can be derived from next_triplet (which advances by two per op) as (next_triplet - 1) / 2.
> +		desc->req.data_ptrs[next_triplet].address =
> +			rte_pktmbuf_iova_offset(ops[i]->ldpc_enc.input.data, 0);
> +		desc->req.data_ptrs[next_triplet].blen = in_length_in_bytes;
> +		next_triplet++;
> +		desc->req.data_ptrs[next_triplet].address =
> +				rte_pktmbuf_iova_offset(
> +				ops[i]->ldpc_enc.output.data, 0);
> +		desc->req.data_ptrs[next_triplet].blen = out_length;
> +		next_triplet++;
> +		ops[i]->ldpc_enc.output.length = out_length;
> +		output_head = output = ops[i]->ldpc_enc.output.data;
> +		mbuf_append(output_head, output, out_length);
> +		output->data_len = out_length;
> +	}
> +
> +	desc->req.op_addr = ops[0];
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +	rte_memdump(stderr, "FCW", &desc->req.fcw_le,
> +			sizeof(desc->req.fcw_le) - 8);
> +	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> +#endif
> +
> +	/* One CB (one op) was successfully prepared to enqueue */
> +	return num;

The caller does not use the returned num, it only checks for < 0,

so this could be changed to return 0 on success.

> +}
> +
> +/* Enqueue one encode operations for ACC100 device in CB mode */
> +static inline int
> +enqueue_ldpc_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
> +		uint16_t total_enqueued_cbs)

rte_fpga_5gnr_fec.c has this same function.  It would be good if common functions could be collected and used to stabilize the internal bbdev interface.

This is a general issue.

> +{
> +	union acc100_dma_desc *desc = NULL;
> +	int ret;
> +	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
> +		seg_total_left;
> +	struct rte_mbuf *input, *output_head, *output;
> +
> +	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> +			& q->sw_ring_wrap_mask);
> +	desc = q->ring_addr + desc_idx;
> +	acc100_fcw_le_fill(op, &desc->req.fcw_le, 1);
> +
> +	input = op->ldpc_enc.input.data;
> +	output_head = output = op->ldpc_enc.output.data;
> +	in_offset = op->ldpc_enc.input.offset;
> +	out_offset = op->ldpc_enc.output.offset;
> +	out_length = 0;
> +	mbuf_total_left = op->ldpc_enc.input.length;
> +	seg_total_left = rte_pktmbuf_data_len(op->ldpc_enc.input.data)
> +			- in_offset;
> +
> +	ret = acc100_dma_desc_le_fill(op, &desc->req, &input, output,
> +			&in_offset, &out_offset, &out_length, &mbuf_total_left,
> +			&seg_total_left);
> +
> +	if (unlikely(ret < 0))
> +		return ret;
> +
> +	mbuf_append(output_head, output, out_length);
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +	rte_memdump(stderr, "FCW", &desc->req.fcw_le,
> +			sizeof(desc->req.fcw_le) - 8);
> +	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> +
> +	/* Check if any data left after processing one CB */
> +	if (mbuf_total_left != 0) {
> +		rte_bbdev_log(ERR,
> +				"Some date still left after processing one CB: mbuf_total_left = %u",
> +				mbuf_total_left);
> +		return -EINVAL;
> +	}
> +#endif
> +	/* One CB (one op) was successfully prepared to enqueue */
> +	return 1;

Another case where the caller only checks for < 0.

Consider changing all similar functions to return 0 on success.

> +}
> +
> +/** Enqueue one decode operations for ACC100 device in CB mode */
> +static inline int
> +enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
> +		uint16_t total_enqueued_cbs, bool same_op)
> +{
> +	int ret;
> +
> +	union acc100_dma_desc *desc;
> +	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> +			& q->sw_ring_wrap_mask);
> +	desc = q->ring_addr + desc_idx;
> +	struct rte_mbuf *input, *h_output_head, *h_output;
> +	uint32_t in_offset, h_out_offset, mbuf_total_left, h_out_length = 0;
> +	input = op->ldpc_dec.input.data;
> +	h_output_head = h_output = op->ldpc_dec.hard_output.data;
> +	in_offset = op->ldpc_dec.input.offset;
> +	h_out_offset = op->ldpc_dec.hard_output.offset;
> +	mbuf_total_left = op->ldpc_dec.input.length;
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +	if (unlikely(input == NULL)) {
> +		rte_bbdev_log(ERR, "Invalid mbuf pointer");
> +		return -EFAULT;
> +	}
> +#endif
> +	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
> +
> +	if (same_op) {
> +		union acc100_dma_desc *prev_desc;
> +		desc_idx = ((q->sw_ring_head + total_enqueued_cbs - 1)
> +				& q->sw_ring_wrap_mask);
> +		prev_desc = q->ring_addr + desc_idx;
> +		uint8_t *prev_ptr = (uint8_t *) prev_desc;
> +		uint8_t *new_ptr = (uint8_t *) desc;
> +		/* Copy first 4 words and BDESCs */
> +		rte_memcpy(new_ptr, prev_ptr, 16);
> +		rte_memcpy(new_ptr + 36, prev_ptr + 36, 40);
These magic numbers (16, 36, 40) should be #defines.
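For example (the macro names below are invented for this sketch; only the
values mirror the literals in the patch):

```c
#include <string.h>
#include <stdint.h>

#define SKETCH_DESC_WORDS_LEN 16 /* first 4 x 32-bit descriptor words */
#define SKETCH_DESC_BDESC_OFF 36 /* offset of the BDESC region */
#define SKETCH_DESC_BDESC_LEN 40 /* length of the BDESC region */

/* Copies the descriptor head and BDESC region, as the patch does with
 * bare 16/36/40 literals. */
static void
copy_desc_head(uint8_t *dst, const uint8_t *src)
{
	memcpy(dst, src, SKETCH_DESC_WORDS_LEN);
	memcpy(dst + SKETCH_DESC_BDESC_OFF, src + SKETCH_DESC_BDESC_OFF,
			SKETCH_DESC_BDESC_LEN);
}

/* Demo: verify the two copied regions and the untouched gap between. */
static int
copy_desc_demo(void)
{
	uint8_t src[128], dst[128] = {0};

	memset(src, 0xAB, sizeof(src));
	copy_desc_head(dst, src);
	return dst[0] == 0xAB && dst[20] == 0 && dst[36] == 0xAB &&
			dst[75] == 0xAB && dst[76] == 0;
}
```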
> +		desc->req.op_addr = prev_desc->req.op_addr;
> +		/* Copy FCW */
> +		rte_memcpy(new_ptr + ACC100_DESC_FCW_OFFSET,
> +				prev_ptr + ACC100_DESC_FCW_OFFSET,
> +				ACC100_FCW_LD_BLEN);
> +		acc100_dma_desc_ld_update(op, &desc->req, input, h_output,
> +				&in_offset, &h_out_offset,
> +				&h_out_length, harq_layout);
> +	} else {
> +		struct acc100_fcw_ld *fcw;
> +		uint32_t seg_total_left;
> +		fcw = &desc->req.fcw_ld;
> +		acc100_fcw_ld_fill(op, fcw, harq_layout);
> +
> +		/* Special handling when overusing mbuf */
> +		if (fcw->rm_e < MAX_E_MBUF)
> +			seg_total_left = rte_pktmbuf_data_len(input)
> +					- in_offset;
> +		else
> +			seg_total_left = fcw->rm_e;
> +
> +		ret = acc100_dma_desc_ld_fill(op, &desc->req, &input, h_output,
> +				&in_offset, &h_out_offset,
> +				&h_out_length, &mbuf_total_left,
> +				&seg_total_left, fcw);
> +		if (unlikely(ret < 0))
> +			return ret;
> +	}
> +
> +	/* Hard output */
> +	mbuf_append(h_output_head, h_output, h_out_length);
> +#ifndef ACC100_EXT_MEM
> +	if (op->ldpc_dec.harq_combined_output.length > 0) {
> +		/* Push the HARQ output into host memory */
> +		struct rte_mbuf *hq_output_head, *hq_output;
> +		hq_output_head = op->ldpc_dec.harq_combined_output.data;
> +		hq_output = op->ldpc_dec.harq_combined_output.data;
> +		mbuf_append(hq_output_head, hq_output,
> +				op->ldpc_dec.harq_combined_output.length);
> +	}
> +#endif
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +	rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
> +			sizeof(desc->req.fcw_ld) - 8);
> +	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> +#endif
> +
> +	/* One CB (one op) was successfully prepared to enqueue */
> +	return 1;
> +}
> +
> +
> +/* Enqueue one decode operations for ACC100 device in TB mode */
> +static inline int
> +enqueue_ldpc_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
> +		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
> +{
> +	union acc100_dma_desc *desc = NULL;
> +	int ret;
> +	uint8_t r, c;
> +	uint32_t in_offset, h_out_offset,
> +		h_out_length, mbuf_total_left, seg_total_left;
> +	struct rte_mbuf *input, *h_output_head, *h_output;
> +	uint16_t current_enqueued_cbs = 0;
> +
> +	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> +			& q->sw_ring_wrap_mask);
> +	desc = q->ring_addr + desc_idx;
> +	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
> +	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
> +	acc100_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout);
> +
> +	input = op->ldpc_dec.input.data;
> +	h_output_head = h_output = op->ldpc_dec.hard_output.data;
> +	in_offset = op->ldpc_dec.input.offset;
> +	h_out_offset = op->ldpc_dec.hard_output.offset;
> +	h_out_length = 0;
> +	mbuf_total_left = op->ldpc_dec.input.length;
> +	c = op->ldpc_dec.tb_params.c;
> +	r = op->ldpc_dec.tb_params.r;
> +
> +	while (mbuf_total_left > 0 && r < c) {
> +
> +		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
> +
> +		/* Set up DMA descriptor */
> +		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
> +				& q->sw_ring_wrap_mask);
> +		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
> +		desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
> +		ret = acc100_dma_desc_ld_fill(op, &desc->req, &input,
> +				h_output, &in_offset, &h_out_offset,
> +				&h_out_length,
> +				&mbuf_total_left, &seg_total_left,
> +				&desc->req.fcw_ld);
> +
> +		if (unlikely(ret < 0))
> +			return ret;
> +
> +		/* Hard output */
> +		mbuf_append(h_output_head, h_output, h_out_length);
> +
> +		/* Set total number of CBs in TB */
> +		desc->req.cbs_in_tb = cbs_in_tb;
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +		rte_memdump(stderr, "FCW", &desc->req.fcw_td,
> +				sizeof(desc->req.fcw_td) - 8);
> +		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> +#endif
> +
> +		if (seg_total_left == 0) {
> +			/* Go to the next mbuf */
> +			input = input->next;
> +			in_offset = 0;
> +			h_output = h_output->next;
> +			h_out_offset = 0;
> +		}
> +		total_enqueued_cbs++;
> +		current_enqueued_cbs++;
> +		r++;
> +	}
> +
> +	if (unlikely(desc == NULL))
How is this possible? desc has already been dereferenced above, so it cannot be NULL here.
> +		return current_enqueued_cbs;
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +	/* Check if any CBs left for processing */
> +	if (mbuf_total_left != 0) {
> +		rte_bbdev_log(ERR,
> +				"Some date still left for processing: mbuf_total_left = %u",
> +				mbuf_total_left);
> +		return -EINVAL;
> +	}
> +#endif
> +	/* Set SDone on last CB descriptor for TB mode */
> +	desc->req.sdone_enable = 1;
> +	desc->req.irq_enable = q->irq_enable;
> +
> +	return current_enqueued_cbs;
> +}
> +
> +
> +/* Calculates number of CBs in processed encoder TB based on 'r' and input
> + * length.
> + */
> +static inline uint8_t
> +get_num_cbs_in_tb_enc(struct rte_bbdev_op_turbo_enc *turbo_enc)
> +{
> +	uint8_t c, c_neg, r, crc24_bits = 0;
> +	uint16_t k, k_neg, k_pos;
> +	uint8_t cbs_in_tb = 0;
> +	int32_t length;
> +
> +	length = turbo_enc->input.length;
> +	r = turbo_enc->tb_params.r;
> +	c = turbo_enc->tb_params.c;
> +	c_neg = turbo_enc->tb_params.c_neg;
> +	k_neg = turbo_enc->tb_params.k_neg;
> +	k_pos = turbo_enc->tb_params.k_pos;
> +	crc24_bits = 0;
> +	if (check_bit(turbo_enc->op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH))
> +		crc24_bits = 24;
> +	while (length > 0 && r < c) {
> +		k = (r < c_neg) ? k_neg : k_pos;
> +		length -= (k - crc24_bits) >> 3;
> +		r++;
> +		cbs_in_tb++;
> +	}
> +
> +	return cbs_in_tb;
> +}
> +
> +/* Calculates number of CBs in processed decoder TB based on 'r' and input
> + * length.
> + */
> +static inline uint16_t
> +get_num_cbs_in_tb_dec(struct rte_bbdev_op_turbo_dec *turbo_dec)
> +{
> +	uint8_t c, c_neg, r = 0;
> +	uint16_t kw, k, k_neg, k_pos, cbs_in_tb = 0;
> +	int32_t length;
> +
> +	length = turbo_dec->input.length;
> +	r = turbo_dec->tb_params.r;
> +	c = turbo_dec->tb_params.c;
> +	c_neg = turbo_dec->tb_params.c_neg;
> +	k_neg = turbo_dec->tb_params.k_neg;
> +	k_pos = turbo_dec->tb_params.k_pos;
> +	while (length > 0 && r < c) {
> +		k = (r < c_neg) ? k_neg : k_pos;
> +		kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
> +		length -= kw;
> +		r++;
> +		cbs_in_tb++;
> +	}
> +
> +	return cbs_in_tb;
> +}
> +
> +/* Calculates number of CBs in processed decoder TB based on 'r' and input
> + * length.
> + */
> +static inline uint16_t
> +get_num_cbs_in_tb_ldpc_dec(struct rte_bbdev_op_ldpc_dec *ldpc_dec)
> +{
> +	uint16_t r, cbs_in_tb = 0;
> +	int32_t length = ldpc_dec->input.length;
> +	r = ldpc_dec->tb_params.r;
> +	while (length > 0 && r < ldpc_dec->tb_params.c) {
> +		length -=  (r < ldpc_dec->tb_params.cab) ?
> +				ldpc_dec->tb_params.ea :
> +				ldpc_dec->tb_params.eb;
> +		r++;
> +		cbs_in_tb++;
> +	}
> +	return cbs_in_tb;
> +}
> +
> +/* Check we can mux encode operations with common FCW */
> +static inline bool
> +check_mux(struct rte_bbdev_enc_op **ops, uint16_t num) {
> +	uint16_t i;
> +	if (num == 1)
> +		return false;
This check should likely be strengthened to num <= 1.
> +	for (i = 1; i < num; ++i) {
> +		/* Only mux compatible code blocks */
> +		if (memcmp((uint8_t *)(&ops[i]->ldpc_enc) + ENC_OFFSET,
> +				(uint8_t *)(&ops[0]->ldpc_enc) + ENC_OFFSET,
The ops[0]->ldpc_enc operand should be hoisted out of the loop, as it is invariant.
> +				CMP_ENC_SIZE) != 0)
> +			return false;
> +	}
> +	return true;
> +}
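Both points above (rejecting num <= 1 and hoisting the invariant ops[0]
operand) in a standalone sketch, with a toy struct standing in for the
ldpc_enc fields being compared:

```c
#include <string.h>
#include <stdint.h>
#include <stdbool.h>

/* Toy op: `hdr` is outside the compared window, `fcw` is inside. */
struct toy_enc { uint32_t hdr; uint32_t fcw[4]; };

static bool
check_mux_sketch(struct toy_enc **ops, uint16_t num)
{
	const uint8_t *ref = (const uint8_t *)ops[0]->fcw; /* hoisted */
	uint16_t i;

	if (num <= 1)
		return false;
	for (i = 1; i < num; i++)
		if (memcmp((const uint8_t *)ops[i]->fcw, ref,
				sizeof(ops[0]->fcw)) != 0)
			return false;
	return true;
}

/* Demo: identical FCWs mux, differing FCWs or a single op do not. */
static int
check_mux_demo(void)
{
	struct toy_enc a = {1, {1, 2, 3, 4}};
	struct toy_enc b = {9, {1, 2, 3, 4}}; /* same FCW, other hdr */
	struct toy_enc c = {1, {1, 2, 3, 5}}; /* different FCW */
	struct toy_enc *same[2] = {&a, &b};
	struct toy_enc *diff[2] = {&a, &c};

	return check_mux_sketch(same, 2) && !check_mux_sketch(diff, 2) &&
			!check_mux_sketch(same, 1);
}
```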
> +
> +/** Enqueue encode operations for ACC100 device in CB mode. */
> +static inline uint16_t
> +acc100_enqueue_ldpc_enc_cb(struct rte_bbdev_queue_data *q_data,
> +		struct rte_bbdev_enc_op **ops, uint16_t num)
> +{
> +	struct acc100_queue *q = q_data->queue_private;
> +	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> +	uint16_t i = 0;
> +	union acc100_dma_desc *desc;
> +	int ret, desc_idx = 0;
> +	int16_t enq, left = num;
> +
> +	while (left > 0) {
> +		if (unlikely(avail - 1 < 0))
> +			break;
> +		avail--;
> +		enq = RTE_MIN(left, MUX_5GDL_DESC);
> +		if (check_mux(&ops[i], enq)) {
> +			ret = enqueue_ldpc_enc_n_op_cb(q, &ops[i],
> +					desc_idx, enq);
> +			if (ret < 0)
> +				break;
> +			i += enq;
> +		} else {
> +			ret = enqueue_ldpc_enc_one_op_cb(q, ops[i], desc_idx);
> +			if (ret < 0)
> +				break;
This failure is not handled well; what happens if this fails when it is one of several operations already prepared in this batch?
> +			i++;
> +		}
> +		desc_idx++;
> +		left = num - i;
> +	}
> +
> +	if (unlikely(i == 0))
> +		return 0; /* Nothing to enqueue */
This early return does not look correct for all cases.
> +
> +	/* Set SDone in last CB in enqueued ops for CB mode*/
> +	desc = q->ring_addr + ((q->sw_ring_head + desc_idx - 1)
> +			& q->sw_ring_wrap_mask);
> +	desc->req.sdone_enable = 1;
> +	desc->req.irq_enable = q->irq_enable;
> +
> +	acc100_dma_enqueue(q, desc_idx, &q_data->queue_stats);
> +
> +	/* Update stats */
> +	q_data->queue_stats.enqueued_count += i;
> +	q_data->queue_stats.enqueue_err_count += num - i;
> +
> +	return i;
> +}
> +
> +/* Enqueue encode operations for ACC100 device. */
> +static uint16_t
> +acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
> +		struct rte_bbdev_enc_op **ops, uint16_t num)
> +{
> +	if (unlikely(num == 0))
> +		return 0;
Handling of num == 0 should be moved into acc100_enqueue_ldpc_enc_cb().
> +	return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
> +}
> +
> +/* Check whether two consecutive decode operations can be muxed (common FCW) */
> +static inline bool
> +cmp_ldpc_dec_op(struct rte_bbdev_dec_op **ops) {
> +	/* Only mux compatible code blocks */
> +	if (memcmp((uint8_t *)(&ops[0]->ldpc_dec) + DEC_OFFSET,
> +			(uint8_t *)(&ops[1]->ldpc_dec) +
> +			DEC_OFFSET, CMP_DEC_SIZE) != 0) {
> +		return false;
> +	} else
The else is not needed here; there are no statements after it.
> +		return true;
> +}
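As noted, the if/else collapses to a single return of the comparison result; a minimal sketch of the equivalent form:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Equivalent single-statement form: return the memcmp result directly
 * instead of branching to separate true/false returns. */
static bool regions_equal(const uint8_t *a, const uint8_t *b, size_t n)
{
	return memcmp(a, b, n) == 0;
}
```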
> +
> +
> +/* Enqueue decode operations for ACC100 device in TB mode */
> +static uint16_t
> +acc100_enqueue_ldpc_dec_tb(struct rte_bbdev_queue_data *q_data,
> +		struct rte_bbdev_dec_op **ops, uint16_t num)
> +{
> +	struct acc100_queue *q = q_data->queue_private;
> +	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> +	uint16_t i, enqueued_cbs = 0;
> +	uint8_t cbs_in_tb;
> +	int ret;
> +
> +	for (i = 0; i < num; ++i) {
> +		cbs_in_tb = get_num_cbs_in_tb_ldpc_dec(&ops[i]->ldpc_dec);
> +		/* Check if there is available space for further processing */
> +		if (unlikely(avail - cbs_in_tb < 0))
> +			break;
> +		avail -= cbs_in_tb;
> +
> +		ret = enqueue_ldpc_dec_one_op_tb(q, ops[i],
> +				enqueued_cbs, cbs_in_tb);
> +		if (ret < 0)
> +			break;
> +		enqueued_cbs += ret;
> +	}
> +
> +	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
> +
> +	/* Update stats */
> +	q_data->queue_stats.enqueued_count += i;
> +	q_data->queue_stats.enqueue_err_count += num - i;
> +	return i;
> +}
> +
> +/* Enqueue decode operations for ACC100 device in CB mode */
> +static uint16_t
> +acc100_enqueue_ldpc_dec_cb(struct rte_bbdev_queue_data *q_data,
> +		struct rte_bbdev_dec_op **ops, uint16_t num)
> +{
> +	struct acc100_queue *q = q_data->queue_private;
> +	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> +	uint16_t i;
> +	union acc100_dma_desc *desc;
> +	int ret;
> +	bool same_op = false;
> +	for (i = 0; i < num; ++i) {
> +		/* Check if there is available space for further processing */
> +		if (unlikely(avail - 1 < 0))

Change this to (avail < 1). This applies generally.

> +			break;
> +		avail -= 1;
> +
> +		if (i > 0)
> +			same_op = cmp_ldpc_dec_op(&ops[i-1]);
> +		rte_bbdev_log(INFO, "Op %d %d %d %d %d %d %d %d %d %d %d %d\n",
> +			i, ops[i]->ldpc_dec.op_flags, ops[i]->ldpc_dec.rv_index,
> +			ops[i]->ldpc_dec.iter_max, ops[i]->ldpc_dec.iter_count,
> +			ops[i]->ldpc_dec.basegraph, ops[i]->ldpc_dec.z_c,
> +			ops[i]->ldpc_dec.n_cb, ops[i]->ldpc_dec.q_m,
> +			ops[i]->ldpc_dec.n_filler, ops[i]->ldpc_dec.cb_params.e,
> +			same_op);
> +		ret = enqueue_ldpc_dec_one_op_cb(q, ops[i], i, same_op);
> +		if (ret < 0)
> +			break;
> +	}
> +
> +	if (unlikely(i == 0))
> +		return 0; /* Nothing to enqueue */
> +
> +	/* Set SDone in last CB in enqueued ops for CB mode*/
> +	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
> +			& q->sw_ring_wrap_mask);
> +
> +	desc->req.sdone_enable = 1;
> +	desc->req.irq_enable = q->irq_enable;
> +
> +	acc100_dma_enqueue(q, i, &q_data->queue_stats);
> +
> +	/* Update stats */
> +	q_data->queue_stats.enqueued_count += i;
> +	q_data->queue_stats.enqueue_err_count += num - i;
> +	return i;
> +}
> +
> +/* Enqueue decode operations for ACC100 device. */
> +static uint16_t
> +acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
> +		struct rte_bbdev_dec_op **ops, uint16_t num)
> +{
> +	struct acc100_queue *q = q_data->queue_private;
> +	int32_t aq_avail = q->aq_depth +
> +			(q->aq_dequeued - q->aq_enqueued) / 128;
> +
> +	if (unlikely((aq_avail == 0) || (num == 0)))
> +		return 0;
> +
> +	if (ops[0]->ldpc_dec.code_block_mode == 0)
> +		return acc100_enqueue_ldpc_dec_tb(q_data, ops, num);
> +	else
> +		return acc100_enqueue_ldpc_dec_cb(q_data, ops, num);
> +}
> +
> +
> +/* Dequeue one encode operations from ACC100 device in CB mode */
> +static inline int
> +dequeue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
> +		uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
> +{
> +	union acc100_dma_desc *desc, atom_desc;
> +	union acc100_dma_rsp_desc rsp;
> +	struct rte_bbdev_enc_op *op;
> +	int i;
> +
> +	desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
> +			& q->sw_ring_wrap_mask);
> +	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +			__ATOMIC_RELAXED);
> +
> +	/* Check fdone bit */
> +	if (!(atom_desc.rsp.val & ACC100_FDONE))
> +		return -1;
> +
> +	rsp.val = atom_desc.rsp.val;
> +	rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
> +
> +	/* Dequeue */
> +	op = desc->req.op_addr;
> +
> +	/* Clearing status, it will be set based on response */
> +	op->status = 0;
> +
> +	op->status |= ((rsp.input_err)
> +			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
The = 0 can be removed if this first |= is changed to =.
> +	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +
> +	if (desc->req.last_desc_in_batch) {
> +		(*aq_dequeued)++;
> +		desc->req.last_desc_in_batch = 0;
> +	}
> +	desc->rsp.val = ACC100_DMA_DESC_TYPE;
> +	desc->rsp.add_info_0 = 0; /*Reserved bits */
> +	desc->rsp.add_info_1 = 0; /*Reserved bits */
> +
> +	/* Flag that the muxing cause loss of opaque data */
> +	op->opaque_data = (void *)-1;
As a pointer, shouldn't opaque_data be poisoned with NULL ('0') instead?
> +	for (i = 0 ; i < desc->req.numCBs; i++)
> +		ref_op[i] = op;
> +
> +	/* One CB (op) was successfully dequeued */
> +	return desc->req.numCBs;
> +}
> +
> +/* Dequeue one encode operations from ACC100 device in TB mode */
> +static inline int
> +dequeue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
> +		uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
> +{
> +	union acc100_dma_desc *desc, *last_desc, atom_desc;
> +	union acc100_dma_rsp_desc rsp;
> +	struct rte_bbdev_enc_op *op;
> +	uint8_t i = 0;
> +	uint16_t current_dequeued_cbs = 0, cbs_in_tb;
> +
> +	desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
> +			& q->sw_ring_wrap_mask);
> +	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +			__ATOMIC_RELAXED);
> +
> +	/* Check fdone bit */
> +	if (!(atom_desc.rsp.val & ACC100_FDONE))
> +		return -1;
> +
> +	/* Get number of CBs in dequeued TB */
> +	cbs_in_tb = desc->req.cbs_in_tb;
> +	/* Get last CB */
> +	last_desc = q->ring_addr + ((q->sw_ring_tail
> +			+ total_dequeued_cbs + cbs_in_tb - 1)
> +			& q->sw_ring_wrap_mask);
> +	/* Check if last CB in TB is ready to dequeue (and thus
> +	 * the whole TB) - checking sdone bit. If not return.
> +	 */
> +	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
> +			__ATOMIC_RELAXED);
> +	if (!(atom_desc.rsp.val & ACC100_SDONE))
> +		return -1;
> +
> +	/* Dequeue */
> +	op = desc->req.op_addr;
> +
> +	/* Clearing status, it will be set based on response */
> +	op->status = 0;
> +
> +	while (i < cbs_in_tb) {
> +		desc = q->ring_addr + ((q->sw_ring_tail
> +				+ total_dequeued_cbs)
> +				& q->sw_ring_wrap_mask);
> +		atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +				__ATOMIC_RELAXED);
> +		rsp.val = atom_desc.rsp.val;
> +		rte_bbdev_log_debug("Resp. desc %p: %x", desc,
> +				rsp.val);
> +
> +		op->status |= ((rsp.input_err)
> +				? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> +		op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +		op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +
> +		if (desc->req.last_desc_in_batch) {
> +			(*aq_dequeued)++;
> +			desc->req.last_desc_in_batch = 0;
> +		}
> +		desc->rsp.val = ACC100_DMA_DESC_TYPE;
> +		desc->rsp.add_info_0 = 0;
> +		desc->rsp.add_info_1 = 0;
> +		total_dequeued_cbs++;
> +		current_dequeued_cbs++;
> +		i++;
> +	}
> +
> +	*ref_op = op;
> +
> +	return current_dequeued_cbs;
> +}
> +
> +/* Dequeue one decode operation from ACC100 device in CB mode */
> +static inline int
> +dequeue_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
> +		struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
> +		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> +{
> +	union acc100_dma_desc *desc, atom_desc;
> +	union acc100_dma_rsp_desc rsp;
> +	struct rte_bbdev_dec_op *op;
> +
> +	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> +			& q->sw_ring_wrap_mask);
> +	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +			__ATOMIC_RELAXED);
> +
> +	/* Check fdone bit */
> +	if (!(atom_desc.rsp.val & ACC100_FDONE))
> +		return -1;
> +
> +	rsp.val = atom_desc.rsp.val;
> +	rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
> +
> +	/* Dequeue */
> +	op = desc->req.op_addr;
> +
> +	/* Clearing status, it will be set based on response */
> +	op->status = 0;
> +	op->status |= ((rsp.input_err)
> +			? (1 << RTE_BBDEV_DATA_ERROR) : 0);

Similar to above, the = 0 can be removed. This is a general issue.

> +	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +	if (op->status != 0)
> +		q_data->queue_stats.dequeue_err_count++;
> +
> +	/* CRC invalid if error exists */
> +	if (!op->status)
> +		op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
> +	op->turbo_dec.iter_count = (uint8_t) rsp.iter_cnt / 2;
> +	/* Check if this is the last desc in batch (Atomic Queue) */
> +	if (desc->req.last_desc_in_batch) {
> +		(*aq_dequeued)++;
> +		desc->req.last_desc_in_batch = 0;
> +	}
> +	desc->rsp.val = ACC100_DMA_DESC_TYPE;
> +	desc->rsp.add_info_0 = 0;
> +	desc->rsp.add_info_1 = 0;
> +	*ref_op = op;
> +
> +	/* One CB (op) was successfully dequeued */
> +	return 1;
> +}
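The review suggestion (drop the explicit zeroing and build the status word with one assignment) can be sketched as follows; the bit positions are hypothetical stand-ins for the rte_bbdev error enums:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions, standing in for RTE_BBDEV_DATA_ERROR
 * and RTE_BBDEV_DRV_ERROR. */
#define DATA_ERROR_BIT 1
#define DRV_ERROR_BIT  2

/* One expression replaces "status = 0" followed by three |= updates;
 * dma_err and fcw_err both map onto the driver-error bit. */
static uint32_t build_status(bool input_err, bool dma_err, bool fcw_err)
{
	return (input_err ? (1u << DATA_ERROR_BIT) : 0) |
	       ((dma_err || fcw_err) ? (1u << DRV_ERROR_BIT) : 0);
}
```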
> +
> +/* Dequeue one decode operations from ACC100 device in CB mode */
> +static inline int
> +dequeue_ldpc_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
> +		struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
> +		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> +{
> +	union acc100_dma_desc *desc, atom_desc;
> +	union acc100_dma_rsp_desc rsp;
> +	struct rte_bbdev_dec_op *op;
> +
> +	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> +			& q->sw_ring_wrap_mask);
> +	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +			__ATOMIC_RELAXED);
> +
> +	/* Check fdone bit */
> +	if (!(atom_desc.rsp.val & ACC100_FDONE))
> +		return -1;
> +
> +	rsp.val = atom_desc.rsp.val;
> +
> +	/* Dequeue */
> +	op = desc->req.op_addr;
> +
> +	/* Clearing status, it will be set based on response */
> +	op->status = 0;
> +	op->status |= rsp.input_err << RTE_BBDEV_DATA_ERROR;
> +	op->status |= rsp.dma_err << RTE_BBDEV_DRV_ERROR;
> +	op->status |= rsp.fcw_err << RTE_BBDEV_DRV_ERROR;
> +	if (op->status != 0)
> +		q_data->queue_stats.dequeue_err_count++;
> +
> +	op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
> +	if (op->ldpc_dec.hard_output.length > 0 && !rsp.synd_ok)
> +		op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
> +	op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
> +
> +	/* Check if this is the last desc in batch (Atomic Queue) */
> +	if (desc->req.last_desc_in_batch) {
> +		(*aq_dequeued)++;
> +		desc->req.last_desc_in_batch = 0;
> +	}
> +
> +	desc->rsp.val = ACC100_DMA_DESC_TYPE;
> +	desc->rsp.add_info_0 = 0;
> +	desc->rsp.add_info_1 = 0;
> +
> +	*ref_op = op;
> +
> +	/* One CB (op) was successfully dequeued */
> +	return 1;
> +}
> +
> +/* Dequeue one decode operations from ACC100 device in TB mode. */
> +static inline int
> +dequeue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
> +		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> +{
This is similar to the corresponding code in fpga_lte_fec.
> +	union acc100_dma_desc *desc, *last_desc, atom_desc;
> +	union acc100_dma_rsp_desc rsp;
> +	struct rte_bbdev_dec_op *op;
> +	uint8_t cbs_in_tb = 1, cb_idx = 0;
> +
> +	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> +			& q->sw_ring_wrap_mask);
> +	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +			__ATOMIC_RELAXED);
> +
> +	/* Check fdone bit */
> +	if (!(atom_desc.rsp.val & ACC100_FDONE))
> +		return -1;
> +
> +	/* Dequeue */
> +	op = desc->req.op_addr;
> +
> +	/* Get number of CBs in dequeued TB */
> +	cbs_in_tb = desc->req.cbs_in_tb;
> +	/* Get last CB */
> +	last_desc = q->ring_addr + ((q->sw_ring_tail
> +			+ dequeued_cbs + cbs_in_tb - 1)
> +			& q->sw_ring_wrap_mask);
> +	/* Check if last CB in TB is ready to dequeue (and thus
> +	 * the whole TB) - checking sdone bit. If not return.
> +	 */
> +	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
> +			__ATOMIC_RELAXED);
> +	if (!(atom_desc.rsp.val & ACC100_SDONE))
> +		return -1;
> +
> +	/* Clearing status, it will be set based on response */
> +	op->status = 0;
> +
> +	/* Read remaining CBs if exists */
> +	while (cb_idx < cbs_in_tb) {
Other similar loops use 'i'; 'cb_idx' is more meaningful, so consider changing the other loops to match.
> +		desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> +				& q->sw_ring_wrap_mask);
> +		atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> +				__ATOMIC_RELAXED);
> +		rsp.val = atom_desc.rsp.val;
> +		rte_bbdev_log_debug("Resp. desc %p: %x", desc,
> +				rsp.val);
> +
> +		op->status |= ((rsp.input_err)
> +				? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> +		op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +		op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> +
> +		/* CRC invalid if error exists */
> +		if (!op->status)
> +			op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
> +		op->turbo_dec.iter_count = RTE_MAX((uint8_t) rsp.iter_cnt,
> +				op->turbo_dec.iter_count);
> +
> +		/* Check if this is the last desc in batch (Atomic Queue) */
> +		if (desc->req.last_desc_in_batch) {
> +			(*aq_dequeued)++;
> +			desc->req.last_desc_in_batch = 0;
> +		}
> +		desc->rsp.val = ACC100_DMA_DESC_TYPE;
> +		desc->rsp.add_info_0 = 0;
> +		desc->rsp.add_info_1 = 0;
> +		dequeued_cbs++;
> +		cb_idx++;
> +	}
> +
> +	*ref_op = op;
> +
> +	return cb_idx;
> +}
> +
> +/* Dequeue LDPC encode operations from ACC100 device. */
> +static uint16_t
> +acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
> +		struct rte_bbdev_enc_op **ops, uint16_t num)
> +{
> +	struct acc100_queue *q = q_data->queue_private;
> +	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
> +	uint32_t aq_dequeued = 0;
> +	uint16_t dequeue_num, i, dequeued_cbs = 0, dequeued_descs = 0;
> +	int ret;
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +	if (unlikely(ops == 0 && q == NULL))
> +		return 0;
> +#endif
> +
> +	dequeue_num = (avail < num) ? avail : num;

This is RTE_MIN; use the macro. This is a general issue.

> +
> +	for (i = 0; i < dequeue_num; i++) {
> +		ret = dequeue_enc_one_op_cb(q, &ops[dequeued_cbs],
> +				dequeued_descs, &aq_dequeued);
> +		if (ret < 0)
> +			break;
> +		dequeued_cbs += ret;
> +		dequeued_descs++;
> +		if (dequeued_cbs >= num)
> +			break;
This condition should be folded into the for-loop condition.
> +	}
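Both dequeue-loop comments (RTE_MIN for the clamp, and folding the dequeued-CBs bound into the loop condition) can be sketched with a simplified model; cbs_per_desc stands in for the per-descriptor return value of the dequeue helper:

```c
#include <assert.h>
#include <stdint.h>

/* Behaves like RTE_MIN from rte_common.h for uint16_t operands. */
#define MIN_U16(a, b) ((uint16_t)((a) < (b) ? (a) : (b)))

/* Model of the dequeue loop with the CB bound folded into the
 * for-loop condition rather than a break inside the body. */
static uint16_t dequeue_model(uint16_t avail, uint16_t num,
		uint16_t cbs_per_desc)
{
	uint16_t dequeue_num = MIN_U16(avail, num);
	uint16_t i, dequeued_cbs = 0;

	for (i = 0; i < dequeue_num && dequeued_cbs < num; i++)
		dequeued_cbs += cbs_per_desc;
	return dequeued_cbs;
}
```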
> +
> +	q->aq_dequeued += aq_dequeued;
> +	q->sw_ring_tail += dequeued_descs;
> +
> +	/* Update enqueue stats */
> +	q_data->queue_stats.dequeued_count += dequeued_cbs;
> +
> +	return dequeued_cbs;
> +}
> +
> +/* Dequeue decode operations from ACC100 device. */
> +static uint16_t
> +acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
> +		struct rte_bbdev_dec_op **ops, uint16_t num)
> +{
> +	struct acc100_queue *q = q_data->queue_private;
> +	uint16_t dequeue_num;
> +	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
> +	uint32_t aq_dequeued = 0;
> +	uint16_t i;
> +	uint16_t dequeued_cbs = 0;
> +	struct rte_bbdev_dec_op *op;
> +	int ret;
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +	if (unlikely(ops == 0 && q == NULL))
> +		return 0;
> +#endif
> +
> +	dequeue_num = (avail < num) ? avail : num;
> +
> +	for (i = 0; i < dequeue_num; ++i) {
> +		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> +			& q->sw_ring_wrap_mask))->req.op_addr;
> +		if (op->ldpc_dec.code_block_mode == 0)

The literal 0 for code_block_mode should be a #define.

Tom

> +			ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
> +					&aq_dequeued);
> +		else
> +			ret = dequeue_ldpc_dec_one_op_cb(
> +					q_data, q, &ops[i], dequeued_cbs,
> +					&aq_dequeued);
> +
> +		if (ret < 0)
> +			break;
> +		dequeued_cbs += ret;
> +	}
> +
> +	q->aq_dequeued += aq_dequeued;
> +	q->sw_ring_tail += dequeued_cbs;
> +
> +	/* Update enqueue stats */
> +	q_data->queue_stats.dequeued_count += i;
> +
> +	return i;
> +}
> +
>  /* Initialization Function */
>  static void
>  acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
> @@ -703,6 +2321,10 @@
>  	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
>  
>  	dev->dev_ops = &acc100_bbdev_ops;
> +	dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
> +	dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
> +	dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
> +	dev->dequeue_ldpc_dec_ops = acc100_dequeue_ldpc_dec;
>  
>  	((struct acc100_device *) dev->data->dev_private)->pf_device =
>  			!strcmp(drv->driver.name,
> @@ -815,4 +2437,3 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
>  RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
>  RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
>  RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
> -
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
> index 0e2b79c..78686c1 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> @@ -88,6 +88,8 @@
>  #define TMPL_PRI_3      0x0f0e0d0c
>  #define QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
>  #define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
> +#define ACC100_FDONE    0x80000000
> +#define ACC100_SDONE    0x40000000
>  
>  #define ACC100_NUM_TMPL  32
>  #define VF_OFFSET_QOS 16 /* offset in Memory Space specific to QoS Mon */
> @@ -398,6 +400,7 @@ struct __rte_packed acc100_dma_req_desc {
>  union acc100_dma_desc {
>  	struct acc100_dma_req_desc req;
>  	union acc100_dma_rsp_desc rsp;
> +	uint64_t atom_hdr;
>  };
>  
>  


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v9 06/10] baseband/acc100: add HARQ loopback support
  2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 06/10] baseband/acc100: add HARQ loopback support Nicolas Chautru
@ 2020-09-30 17:25       ` Tom Rix
  2020-09-30 18:55         ` Chautru, Nicolas
  0 siblings, 1 reply; 213+ messages in thread
From: Tom Rix @ 2020-09-30 17:25 UTC (permalink / raw)
  To: Nicolas Chautru, dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu


On 9/28/20 5:29 PM, Nicolas Chautru wrote:
> Additional support for HARQ memory loopback
>
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
> ---
>  drivers/baseband/acc100/rte_acc100_pmd.c | 158 +++++++++++++++++++++++++++++++
>  1 file changed, 158 insertions(+)
>
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
> index b223547..e484c0a 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> @@ -658,6 +658,7 @@
>  				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
>  				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
>  #ifdef ACC100_EXT_MEM
> +				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK |
>  				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
>  				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
>  #endif
> @@ -1480,12 +1481,169 @@
>  	return 1;
>  }
>  
> +static inline int
> +harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
> +		uint16_t total_enqueued_cbs) {
> +	struct acc100_fcw_ld *fcw;
> +	union acc100_dma_desc *desc;
> +	int next_triplet = 1;
> +	struct rte_mbuf *hq_output_head, *hq_output;
> +	uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
> +	if (harq_in_length == 0) {
> +		rte_bbdev_log(ERR, "Loopback of invalid null size\n");
> +		return -EINVAL;
> +	}
> +
> +	int h_comp = check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
> +			) ? 1 : 0;

This should be a bool.

Tom

> +	if (h_comp == 1)
> +		harq_in_length = harq_in_length * 8 / 6;
> +	harq_in_length = RTE_ALIGN(harq_in_length, 64);
> +	uint16_t harq_dma_length_in = (h_comp == 0) ?
> +			harq_in_length :
> +			harq_in_length * 6 / 8;
> +	uint16_t harq_dma_length_out = harq_dma_length_in;
> +	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
> +	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
> +	uint16_t harq_index = (ddr_mem_in ?
> +			op->ldpc_dec.harq_combined_input.offset :
> +			op->ldpc_dec.harq_combined_output.offset)
> +			/ ACC100_HARQ_OFFSET;
> +
> +	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> +			& q->sw_ring_wrap_mask);
> +	desc = q->ring_addr + desc_idx;
> +	fcw = &desc->req.fcw_ld;
> +	/* Set the FCW from loopback into DDR */
> +	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
> +	fcw->FCWversion = ACC100_FCW_VER;
> +	fcw->qm = 2;
> +	fcw->Zc = 384;
> +	if (harq_in_length < 16 * N_ZC_1)
> +		fcw->Zc = 16;
> +	fcw->ncb = fcw->Zc * N_ZC_1;
> +	fcw->rm_e = 2;
> +	fcw->hcin_en = 1;
> +	fcw->hcout_en = 1;
> +
> +	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d\n",
> +			ddr_mem_in, harq_index,
> +			harq_layout[harq_index].offset, harq_in_length,
> +			harq_dma_length_in);
> +
> +	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
> +		fcw->hcin_size0 = harq_layout[harq_index].size0;
> +		fcw->hcin_offset = harq_layout[harq_index].offset;
> +		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
> +		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
> +		if (h_comp == 1)
> +			harq_dma_length_in = harq_dma_length_in * 6 / 8;
> +	} else {
> +		fcw->hcin_size0 = harq_in_length;
> +	}
> +	harq_layout[harq_index].val = 0;
> +	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
> +			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
> +	fcw->hcout_size0 = harq_in_length;
> +	fcw->hcin_decomp_mode = h_comp;
> +	fcw->hcout_comp_mode = h_comp;
> +	fcw->gain_i = 1;
> +	fcw->gain_h = 1;
> +
> +	/* Set the prefix of descriptor. This could be done at polling */
> +	desc->req.word0 = ACC100_DMA_DESC_TYPE;
> +	desc->req.word1 = 0; /**< Timestamp could be disabled */
> +	desc->req.word2 = 0;
> +	desc->req.word3 = 0;
> +	desc->req.numCBs = 1;
> +
> +	/* Null LLR input for Decoder */
> +	desc->req.data_ptrs[next_triplet].address =
> +			q->lb_in_addr_phys;
> +	desc->req.data_ptrs[next_triplet].blen = 2;
> +	desc->req.data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
> +	desc->req.data_ptrs[next_triplet].last = 0;
> +	desc->req.data_ptrs[next_triplet].dma_ext = 0;
> +	next_triplet++;
> +
> +	/* HARQ Combine input from either Memory interface */
> +	if (!ddr_mem_in) {
> +		next_triplet = acc100_dma_fill_blk_type_out(&desc->req,
> +				op->ldpc_dec.harq_combined_input.data,
> +				op->ldpc_dec.harq_combined_input.offset,
> +				harq_dma_length_in,
> +				next_triplet,
> +				ACC100_DMA_BLKID_IN_HARQ);
> +	} else {
> +		desc->req.data_ptrs[next_triplet].address =
> +				op->ldpc_dec.harq_combined_input.offset;
> +		desc->req.data_ptrs[next_triplet].blen =
> +				harq_dma_length_in;
> +		desc->req.data_ptrs[next_triplet].blkid =
> +				ACC100_DMA_BLKID_IN_HARQ;
> +		desc->req.data_ptrs[next_triplet].dma_ext = 1;
> +		next_triplet++;
> +	}
> +	desc->req.data_ptrs[next_triplet - 1].last = 1;
> +	desc->req.m2dlen = next_triplet;
> +
> +	/* Dropped decoder hard output */
> +	desc->req.data_ptrs[next_triplet].address =
> +			q->lb_out_addr_phys;
> +	desc->req.data_ptrs[next_triplet].blen = BYTES_IN_WORD;
> +	desc->req.data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_OUT_HARD;
> +	desc->req.data_ptrs[next_triplet].last = 0;
> +	desc->req.data_ptrs[next_triplet].dma_ext = 0;
> +	next_triplet++;
> +
> +	/* HARQ Combine output to either Memory interface */
> +	if (check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE
> +			)) {
> +		desc->req.data_ptrs[next_triplet].address =
> +				op->ldpc_dec.harq_combined_output.offset;
> +		desc->req.data_ptrs[next_triplet].blen =
> +				harq_dma_length_out;
> +		desc->req.data_ptrs[next_triplet].blkid =
> +				ACC100_DMA_BLKID_OUT_HARQ;
> +		desc->req.data_ptrs[next_triplet].dma_ext = 1;
> +		next_triplet++;
> +	} else {
> +		hq_output_head = op->ldpc_dec.harq_combined_output.data;
> +		hq_output = op->ldpc_dec.harq_combined_output.data;
> +		next_triplet = acc100_dma_fill_blk_type_out(
> +				&desc->req,
> +				op->ldpc_dec.harq_combined_output.data,
> +				op->ldpc_dec.harq_combined_output.offset,
> +				harq_dma_length_out,
> +				next_triplet,
> +				ACC100_DMA_BLKID_OUT_HARQ);
> +		/* HARQ output */
> +		mbuf_append(hq_output_head, hq_output, harq_dma_length_out);
> +		op->ldpc_dec.harq_combined_output.length =
> +				harq_dma_length_out;
> +	}
> +	desc->req.data_ptrs[next_triplet - 1].last = 1;
> +	desc->req.d2mlen = next_triplet - desc->req.m2dlen;
> +	desc->req.op_addr = op;
> +
> +	/* One CB (one op) was successfully prepared to enqueue */
> +	return 1;
> +}
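The compression arithmetic near the top of harq_loopback() can be checked standalone: with 6-bit compression the stored input expands by 8/6 to its uncompressed size, is aligned to 64, and the DMA length converts back by 6/8. A sketch under those assumptions (not driver code):

```c
#include <assert.h>
#include <stdint.h>

#define ALIGN64(x) (((x) + 63u) & ~63u)

/* Model of the HARQ DMA length derivation: h_comp nonzero means the
 * data is stored with 6-bit compression (6 bytes per 8 LLRs). */
static uint16_t harq_dma_len(uint16_t in_len, int h_comp)
{
	uint32_t len = in_len;

	if (h_comp)
		len = len * 8 / 6; /* expand to uncompressed LLR count */
	len = ALIGN64(len);
	return h_comp ? (uint16_t)(len * 6 / 8) : (uint16_t)len;
}
```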
> +
>  /** Enqueue one decode operations for ACC100 device in CB mode */
>  static inline int
>  enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
>  		uint16_t total_enqueued_cbs, bool same_op)
>  {
>  	int ret;
> +	if (unlikely(check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK))) {
> +		ret = harq_loopback(q, op, total_enqueued_cbs);
> +		return ret;
> +	}
>  
>  	union acc100_dma_desc *desc;
>  	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v9 07/10] baseband/acc100: add support for 4G processing
  2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 07/10] baseband/acc100: add support for 4G processing Nicolas Chautru
@ 2020-09-30 18:37       ` Tom Rix
  2020-09-30 19:10         ` Chautru, Nicolas
  0 siblings, 1 reply; 213+ messages in thread
From: Tom Rix @ 2020-09-30 18:37 UTC (permalink / raw)
  To: Nicolas Chautru, dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu


On 9/28/20 5:29 PM, Nicolas Chautru wrote:
> Adding capability for 4G encode and decoder processing
>
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
> ---
>  doc/guides/bbdevs/features/acc100.ini    |    4 +-
>  drivers/baseband/acc100/rte_acc100_pmd.c | 1010 ++++++++++++++++++++++++++++--
>  2 files changed, 945 insertions(+), 69 deletions(-)
>
> diff --git a/doc/guides/bbdevs/features/acc100.ini b/doc/guides/bbdevs/features/acc100.ini
> index 40c7adc..642cd48 100644
> --- a/doc/guides/bbdevs/features/acc100.ini
> +++ b/doc/guides/bbdevs/features/acc100.ini
> @@ -4,8 +4,8 @@
>  ; Refer to default.ini for the full list of available PMD features.
>  ;
>  [Features]
> -Turbo Decoder (4G)     = N
> -Turbo Encoder (4G)     = N
> +Turbo Decoder (4G)     = Y
> +Turbo Encoder (4G)     = Y
>  LDPC Decoder (5G)      = Y
>  LDPC Encoder (5G)      = Y
>  LLR/HARQ Compression   = Y
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
> index e484c0a..7d4c3df 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> @@ -339,7 +339,6 @@
>  	free_base_addresses(base_addrs, i);
>  }
>  
> -
>  /* Allocate 64MB memory used for all software rings */
>  static int
>  acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
> @@ -637,6 +636,41 @@
>  
>  	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
>  		{
> +			.type = RTE_BBDEV_OP_TURBO_DEC,
> +			.cap.turbo_dec = {
> +				.capability_flags =
> +					RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE |
> +					RTE_BBDEV_TURBO_CRC_TYPE_24B |
> +					RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
> +					RTE_BBDEV_TURBO_EARLY_TERMINATION |
> +					RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
> +					RTE_BBDEV_TURBO_MAP_DEC |
> +					RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP |
> +					RTE_BBDEV_TURBO_DEC_SCATTER_GATHER,
> +				.max_llr_modulus = INT8_MAX,
> +				.num_buffers_src =
> +						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
> +				.num_buffers_hard_out =
> +						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
> +				.num_buffers_soft_out =
> +						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
> +			}
> +		},
> +		{
> +			.type = RTE_BBDEV_OP_TURBO_ENC,
> +			.cap.turbo_enc = {
> +				.capability_flags =
> +					RTE_BBDEV_TURBO_CRC_24B_ATTACH |
> +					RTE_BBDEV_TURBO_RV_INDEX_BYPASS |
> +					RTE_BBDEV_TURBO_RATE_MATCH |
> +					RTE_BBDEV_TURBO_ENC_SCATTER_GATHER,
> +				.num_buffers_src =
> +						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
> +				.num_buffers_dst =
> +						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
> +			}
> +		},
> +		{
>  			.type   = RTE_BBDEV_OP_LDPC_ENC,
>  			.cap.ldpc_enc = {
>  				.capability_flags =
> @@ -719,7 +753,6 @@
>  #endif
>  }
>  
> -
>  static const struct rte_bbdev_ops acc100_bbdev_ops = {
>  	.setup_queues = acc100_setup_queues,
>  	.close = acc100_dev_close,
> @@ -763,6 +796,58 @@
>  	return tail;
>  }
>  
> +/* Fill in a frame control word for turbo encoding. */
> +static inline void
> +acc100_fcw_te_fill(const struct rte_bbdev_enc_op *op, struct acc100_fcw_te *fcw)
> +{
> +	fcw->code_block_mode = op->turbo_enc.code_block_mode;
> +	if (fcw->code_block_mode == 0) { /* For TB mode */
> +		fcw->k_neg = op->turbo_enc.tb_params.k_neg;
> +		fcw->k_pos = op->turbo_enc.tb_params.k_pos;
> +		fcw->c_neg = op->turbo_enc.tb_params.c_neg;
> +		fcw->c = op->turbo_enc.tb_params.c;
> +		fcw->ncb_neg = op->turbo_enc.tb_params.ncb_neg;
> +		fcw->ncb_pos = op->turbo_enc.tb_params.ncb_pos;
> +
> +		if (check_bit(op->turbo_enc.op_flags,
> +				RTE_BBDEV_TURBO_RATE_MATCH)) {
> +			fcw->bypass_rm = 0;
> +			fcw->cab = op->turbo_enc.tb_params.cab;
> +			fcw->ea = op->turbo_enc.tb_params.ea;
> +			fcw->eb = op->turbo_enc.tb_params.eb;
> +		} else {
> +			/* E is set to the encoding output size when RM is
> +			 * bypassed.
> +			 */
> +			fcw->bypass_rm = 1;
> +			fcw->cab = fcw->c_neg;
> +			fcw->ea = 3 * fcw->k_neg + 12;
> +			fcw->eb = 3 * fcw->k_pos + 12;
> +		}
> +	} else { /* For CB mode */
> +		fcw->k_pos = op->turbo_enc.cb_params.k;
> +		fcw->ncb_pos = op->turbo_enc.cb_params.ncb;
> +
> +		if (check_bit(op->turbo_enc.op_flags,
> +				RTE_BBDEV_TURBO_RATE_MATCH)) {
> +			fcw->bypass_rm = 0;
> +			fcw->eb = op->turbo_enc.cb_params.e;
> +		} else {
> +			/* E is set to the encoding output size when RM is
> +			 * bypassed.
> +			 */
> +			fcw->bypass_rm = 1;
> +			fcw->eb = 3 * fcw->k_pos + 12;
> +		}
> +	}
> +
> +	fcw->bypass_rv_idx1 = check_bit(op->turbo_enc.op_flags,
> +			RTE_BBDEV_TURBO_RV_INDEX_BYPASS);
> +	fcw->code_block_crc = check_bit(op->turbo_enc.op_flags,
> +			RTE_BBDEV_TURBO_CRC_24B_ATTACH);
> +	fcw->rv_idx1 = op->turbo_enc.rv_index;
> +}
> +
>  /* Compute value of k0.
>   * Based on 3GPP 38.212 Table 5.4.2.1-2
>   * Starting position of different redundancy versions, k0
> @@ -813,6 +898,25 @@
>  	fcw->mcb_count = num_cb;
>  }
>  
> +/* Fill in a frame control word for turbo decoding. */
> +static inline void
> +acc100_fcw_td_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_td *fcw)
> +{
> +	/* Note : Early termination is always enabled for 4GUL */
> +	fcw->fcw_ver = 1;
> +	if (op->turbo_dec.code_block_mode == 0)
> +		fcw->k_pos = op->turbo_dec.tb_params.k_pos;
> +	else
> +		fcw->k_pos = op->turbo_dec.cb_params.k;
> +	fcw->turbo_crc_type = check_bit(op->turbo_dec.op_flags,
> +			RTE_BBDEV_TURBO_CRC_TYPE_24B);
> +	fcw->bypass_sb_deint = 0;
> +	fcw->raw_decoder_input_on = 0;
> +	fcw->max_iter = op->turbo_dec.iter_max;
> +	fcw->half_iter_on = !check_bit(op->turbo_dec.op_flags,
> +			RTE_BBDEV_TURBO_HALF_ITERATION_EVEN);
> +}
> +
>  /* Fill in a frame control word for LDPC decoding. */
>  static inline void
>  acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
> @@ -1042,6 +1146,87 @@
>  }
>  
>  static inline int
> +acc100_dma_desc_te_fill(struct rte_bbdev_enc_op *op,
> +		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
> +		struct rte_mbuf *output, uint32_t *in_offset,
> +		uint32_t *out_offset, uint32_t *out_length,
> +		uint32_t *mbuf_total_left, uint32_t *seg_total_left, uint8_t r)
> +{
> +	int next_triplet = 1; /* FCW already done */
> +	uint32_t e, ea, eb, length;
> +	uint16_t k, k_neg, k_pos;
> +	uint8_t cab, c_neg;
> +
> +	desc->word0 = ACC100_DMA_DESC_TYPE;
> +	desc->word1 = 0; /**< Timestamp could be disabled */
> +	desc->word2 = 0;
> +	desc->word3 = 0;
> +	desc->numCBs = 1;
> +
> +	if (op->turbo_enc.code_block_mode == 0) {
> +		ea = op->turbo_enc.tb_params.ea;
> +		eb = op->turbo_enc.tb_params.eb;
> +		cab = op->turbo_enc.tb_params.cab;
> +		k_neg = op->turbo_enc.tb_params.k_neg;
> +		k_pos = op->turbo_enc.tb_params.k_pos;
> +		c_neg = op->turbo_enc.tb_params.c_neg;
> +		e = (r < cab) ? ea : eb;
> +		k = (r < c_neg) ? k_neg : k_pos;
> +	} else {
> +		e = op->turbo_enc.cb_params.e;
> +		k = op->turbo_enc.cb_params.k;
> +	}
> +
> +	if (check_bit(op->turbo_enc.op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH))
> +		length = (k - 24) >> 3;
> +	else
> +		length = k >> 3;
> +
> +	if (unlikely((*mbuf_total_left == 0) || (*mbuf_total_left < length))) {

Similar to other patches in this set, the two conditions here can be combined into a single comparison, since a valid code block length is never zero.

Please make the same change throughout the series.

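To be concrete, the two forms agree for every non-zero length — a minimal standalone sketch (names copied from the patch, plain C outside DPDK):

```c
#include <stdbool.h>
#include <stdint.h>

/* Current form of the guard in acc100_dma_desc_te_fill(). */
static bool too_short_orig(uint32_t mbuf_total_left, uint32_t length)
{
	return (mbuf_total_left == 0) || (mbuf_total_left < length);
}

/* Combined form: a single comparison is enough whenever length > 0,
 * which holds here since a turbo code block is never zero bytes. */
static bool too_short_combined(uint32_t mbuf_total_left, uint32_t length)
{
	return mbuf_total_left < length;
}
```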
> +		rte_bbdev_log(ERR,
> +				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
> +				*mbuf_total_left, length);
> +		return -1;
> +	}
> +
> +	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
> +			length, seg_total_left, next_triplet);
> +	if (unlikely(next_triplet < 0)) {
> +		rte_bbdev_log(ERR,
> +				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
> +				op);
> +		return -1;
> +	}
> +	desc->data_ptrs[next_triplet - 1].last = 1;
> +	desc->m2dlen = next_triplet;
> +	*mbuf_total_left -= length;
> +
> +	/* Set output length */
> +	if (check_bit(op->turbo_enc.op_flags, RTE_BBDEV_TURBO_RATE_MATCH))
> +		/* Integer round up division by 8 */
> +		*out_length = (e + 7) >> 3;
> +	else
> +		*out_length = (k >> 3) * 3 + 2;
> +
> +	next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
> +			*out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
> +	if (unlikely(next_triplet < 0)) {
> +		rte_bbdev_log(ERR,
> +				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
> +				op);
> +		return -1;
> +	}
> +	op->turbo_enc.output.length += *out_length;
> +	*out_offset += *out_length;
> +	desc->data_ptrs[next_triplet - 1].last = 1;
> +	desc->d2mlen = next_triplet - desc->m2dlen;
> +
> +	desc->op_addr = op;
> +
> +	return 0;
> +}
> +
> +static inline int
>  acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
>  		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
>  		struct rte_mbuf *output, uint32_t *in_offset,
> @@ -1110,6 +1295,117 @@
>  }
>  
>  static inline int
> +acc100_dma_desc_td_fill(struct rte_bbdev_dec_op *op,
> +		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
> +		struct rte_mbuf *h_output, struct rte_mbuf *s_output,
> +		uint32_t *in_offset, uint32_t *h_out_offset,
> +		uint32_t *s_out_offset, uint32_t *h_out_length,
> +		uint32_t *s_out_length, uint32_t *mbuf_total_left,
> +		uint32_t *seg_total_left, uint8_t r)
> +{
> +	int next_triplet = 1; /* FCW already done */
> +	uint16_t k;
> +	uint16_t crc24_overlap = 0;
> +	uint32_t e, kw;
> +
> +	desc->word0 = ACC100_DMA_DESC_TYPE;
> +	desc->word1 = 0; /**< Timestamp could be disabled */
> +	desc->word2 = 0;
> +	desc->word3 = 0;
> +	desc->numCBs = 1;
> +
> +	if (op->turbo_dec.code_block_mode == 0) {
> +		k = (r < op->turbo_dec.tb_params.c_neg)
> +			? op->turbo_dec.tb_params.k_neg
> +			: op->turbo_dec.tb_params.k_pos;
> +		e = (r < op->turbo_dec.tb_params.cab)
> +			? op->turbo_dec.tb_params.ea
> +			: op->turbo_dec.tb_params.eb;
> +	} else {
> +		k = op->turbo_dec.cb_params.k;
> +		e = op->turbo_dec.cb_params.e;
> +	}
> +
> +	if ((op->turbo_dec.code_block_mode == 0)
> +		&& !check_bit(op->turbo_dec.op_flags,
> +		RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP))
> +		crc24_overlap = 24;
> +
> +	/* Calculates circular buffer size.
> +	 * According to 3gpp 36.212 section 5.1.4.2
> +	 *   Kw = 3 * Kpi,
> +	 * where:
> +	 *   Kpi = nCol * nRow
> +	 * where nCol is 32 and nRow can be calculated from:
> +	 *   D =< nCol * nRow
> +	 * where D is the size of each output from turbo encoder block (k + 4).
> +	 */
> +	kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
> +
> +	if (unlikely((*mbuf_total_left == 0) || (*mbuf_total_left < kw))) {
> +		rte_bbdev_log(ERR,
> +				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
> +				*mbuf_total_left, kw);
> +		return -1;
> +	}
> +
> +	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset, kw,
> +			seg_total_left, next_triplet);
> +	if (unlikely(next_triplet < 0)) {
> +		rte_bbdev_log(ERR,
> +				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
> +				op);
> +		return -1;
> +	}
> +	desc->data_ptrs[next_triplet - 1].last = 1;
> +	desc->m2dlen = next_triplet;
> +	*mbuf_total_left -= kw;
> +
> +	next_triplet = acc100_dma_fill_blk_type_out(
> +			desc, h_output, *h_out_offset,
> +			k >> 3, next_triplet, ACC100_DMA_BLKID_OUT_HARD);
> +	if (unlikely(next_triplet < 0)) {
> +		rte_bbdev_log(ERR,
> +				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
> +				op);
> +		return -1;
> +	}
> +
> +	*h_out_length = ((k - crc24_overlap) >> 3);
> +	op->turbo_dec.hard_output.length += *h_out_length;
> +	*h_out_offset += *h_out_length;
> +
> +	/* Soft output */
> +	if (check_bit(op->turbo_dec.op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT)) {
> +		if (check_bit(op->turbo_dec.op_flags,
> +				RTE_BBDEV_TURBO_EQUALIZER))
> +			*s_out_length = e;
> +		else
> +			*s_out_length = (k * 3) + 12;
> +
> +		next_triplet = acc100_dma_fill_blk_type_out(desc, s_output,
> +				*s_out_offset, *s_out_length, next_triplet,
> +				ACC100_DMA_BLKID_OUT_SOFT);
> +		if (unlikely(next_triplet < 0)) {
> +			rte_bbdev_log(ERR,
> +					"Mismatch between data to process and mbuf data length in bbdev_op: %p",
> +					op);
> +			return -1;
> +		}
> +
> +		op->turbo_dec.soft_output.length += *s_out_length;
> +		*s_out_offset += *s_out_length;
> +	}
> +
> +	desc->data_ptrs[next_triplet - 1].last = 1;
> +	desc->d2mlen = next_triplet - desc->m2dlen;
> +
> +	desc->op_addr = op;
> +
> +	return 0;
> +}
> +
> +static inline int
>  acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
>  		struct acc100_dma_req_desc *desc,
>  		struct rte_mbuf **input, struct rte_mbuf *h_output,
> @@ -1374,6 +1670,57 @@
>  
>  /* Enqueue one encode operations for ACC100 device in CB mode */
>  static inline int
> +enqueue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
> +		uint16_t total_enqueued_cbs)
> +{
> +	union acc100_dma_desc *desc = NULL;
> +	int ret;
> +	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
> +		seg_total_left;
> +	struct rte_mbuf *input, *output_head, *output;
> +
> +	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> +			& q->sw_ring_wrap_mask);
> +	desc = q->ring_addr + desc_idx;
> +	acc100_fcw_te_fill(op, &desc->req.fcw_te);
> +
> +	input = op->turbo_enc.input.data;
> +	output_head = output = op->turbo_enc.output.data;
> +	in_offset = op->turbo_enc.input.offset;
> +	out_offset = op->turbo_enc.output.offset;
> +	out_length = 0;
> +	mbuf_total_left = op->turbo_enc.input.length;
> +	seg_total_left = rte_pktmbuf_data_len(op->turbo_enc.input.data)
> +			- in_offset;
> +
> +	ret = acc100_dma_desc_te_fill(op, &desc->req, &input, output,
> +			&in_offset, &out_offset, &out_length, &mbuf_total_left,
> +			&seg_total_left, 0);
> +
> +	if (unlikely(ret < 0))
> +		return ret;
> +
> +	mbuf_append(output_head, output, out_length);
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +	rte_memdump(stderr, "FCW", &desc->req.fcw_te,
> +			sizeof(desc->req.fcw_te) - 8);
> +	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> +
> +	/* Check if any data left after processing one CB */
> +	if (mbuf_total_left != 0) {
> +		rte_bbdev_log(ERR,
> +				"Some date still left after processing one CB: mbuf_total_left = %u",
> +				mbuf_total_left);
> +		return -EINVAL;
> +	}
> +#endif
> +	/* One CB (one op) was successfully prepared to enqueue */
> +	return 1;
> +}
> +
> +/* Enqueue one encode operations for ACC100 device in CB mode */
> +static inline int
>  enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
>  		uint16_t total_enqueued_cbs, int16_t num)
>  {
> @@ -1481,78 +1828,235 @@
>  	return 1;
>  }
>  
> -static inline int
> -harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
> -		uint16_t total_enqueued_cbs) {
> -	struct acc100_fcw_ld *fcw;
> -	union acc100_dma_desc *desc;
> -	int next_triplet = 1;
> -	struct rte_mbuf *hq_output_head, *hq_output;
> -	uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
> -	if (harq_in_length == 0) {
> -		rte_bbdev_log(ERR, "Loopback of invalid null size\n");
> -		return -EINVAL;
> -	}
>  
> -	int h_comp = check_bit(op->ldpc_dec.op_flags,
> -			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
> -			) ? 1 : 0;
> -	if (h_comp == 1)
> -		harq_in_length = harq_in_length * 8 / 6;
> -	harq_in_length = RTE_ALIGN(harq_in_length, 64);
> -	uint16_t harq_dma_length_in = (h_comp == 0) ?
> -			harq_in_length :
> -			harq_in_length * 6 / 8;
> -	uint16_t harq_dma_length_out = harq_dma_length_in;
> -	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
> -			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
> -	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
> -	uint16_t harq_index = (ddr_mem_in ?
> -			op->ldpc_dec.harq_combined_input.offset :
> -			op->ldpc_dec.harq_combined_output.offset)
> -			/ ACC100_HARQ_OFFSET;
> +/* Enqueue one encode operations for ACC100 device in TB mode. */
> +static inline int
> +enqueue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
> +		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
> +{
> +	union acc100_dma_desc *desc = NULL;
> +	int ret;
> +	uint8_t r, c;
> +	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
> +		seg_total_left;
> +	struct rte_mbuf *input, *output_head, *output;
> +	uint16_t current_enqueued_cbs = 0;
>  
>  	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
>  			& q->sw_ring_wrap_mask);
>  	desc = q->ring_addr + desc_idx;
> -	fcw = &desc->req.fcw_ld;
> -	/* Set the FCW from loopback into DDR */
> -	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
> -	fcw->FCWversion = ACC100_FCW_VER;
> -	fcw->qm = 2;
> -	fcw->Zc = 384;
> -	if (harq_in_length < 16 * N_ZC_1)
> -		fcw->Zc = 16;
> -	fcw->ncb = fcw->Zc * N_ZC_1;
> -	fcw->rm_e = 2;
> -	fcw->hcin_en = 1;
> -	fcw->hcout_en = 1;
> +	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
> +	acc100_fcw_te_fill(op, &desc->req.fcw_te);
>  
> -	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d\n",
> -			ddr_mem_in, harq_index,
> -			harq_layout[harq_index].offset, harq_in_length,
> -			harq_dma_length_in);
> +	input = op->turbo_enc.input.data;
> +	output_head = output = op->turbo_enc.output.data;
> +	in_offset = op->turbo_enc.input.offset;
> +	out_offset = op->turbo_enc.output.offset;
> +	out_length = 0;
> +	mbuf_total_left = op->turbo_enc.input.length;
>  
> -	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
> -		fcw->hcin_size0 = harq_layout[harq_index].size0;
> -		fcw->hcin_offset = harq_layout[harq_index].offset;
> -		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
> -		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
> -		if (h_comp == 1)
> -			harq_dma_length_in = harq_dma_length_in * 6 / 8;
> -	} else {
> -		fcw->hcin_size0 = harq_in_length;
> -	}
> -	harq_layout[harq_index].val = 0;
> -	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
> -			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
> -	fcw->hcout_size0 = harq_in_length;
> -	fcw->hcin_decomp_mode = h_comp;
> -	fcw->hcout_comp_mode = h_comp;
> -	fcw->gain_i = 1;
> -	fcw->gain_h = 1;
> +	c = op->turbo_enc.tb_params.c;
> +	r = op->turbo_enc.tb_params.r;
>  
> -	/* Set the prefix of descriptor. This could be done at polling */
> +	while (mbuf_total_left > 0 && r < c) {
> +		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
> +		/* Set up DMA descriptor */
> +		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
> +				& q->sw_ring_wrap_mask);
> +		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
> +		desc->req.data_ptrs[0].blen = ACC100_FCW_TE_BLEN;
> +
> +		ret = acc100_dma_desc_te_fill(op, &desc->req, &input, output,
> +				&in_offset, &out_offset, &out_length,
> +				&mbuf_total_left, &seg_total_left, r);
> +		if (unlikely(ret < 0))
> +			return ret;
> +		mbuf_append(output_head, output, out_length);
> +
> +		/* Set total number of CBs in TB */
> +		desc->req.cbs_in_tb = cbs_in_tb;
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +		rte_memdump(stderr, "FCW", &desc->req.fcw_te,
> +				sizeof(desc->req.fcw_te) - 8);
> +		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> +#endif
> +
> +		if (seg_total_left == 0) {
> +			/* Go to the next mbuf */
> +			input = input->next;
> +			in_offset = 0;
> +			output = output->next;
> +			out_offset = 0;
> +		}
> +
> +		total_enqueued_cbs++;
> +		current_enqueued_cbs++;
> +		r++;
> +	}
> +
> +	if (unlikely(desc == NULL))
> +		return current_enqueued_cbs;
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +	/* Check if any CBs left for processing */
> +	if (mbuf_total_left != 0) {
> +		rte_bbdev_log(ERR,
> +				"Some date still left for processing: mbuf_total_left = %u",
> +				mbuf_total_left);
> +		return -EINVAL;
> +	}
> +#endif
> +
> +	/* Set SDone on last CB descriptor for TB mode. */
> +	desc->req.sdone_enable = 1;
> +	desc->req.irq_enable = q->irq_enable;
> +
> +	return current_enqueued_cbs;
> +}
> +
> +/** Enqueue one decode operations for ACC100 device in CB mode */
> +static inline int
> +enqueue_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
> +		uint16_t total_enqueued_cbs)
> +{
> +	union acc100_dma_desc *desc = NULL;
> +	int ret;
> +	uint32_t in_offset, h_out_offset, s_out_offset, s_out_length,
> +		h_out_length, mbuf_total_left, seg_total_left;
> +	struct rte_mbuf *input, *h_output_head, *h_output,
> +		*s_output_head, *s_output;
> +
> +	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> +			& q->sw_ring_wrap_mask);
> +	desc = q->ring_addr + desc_idx;
> +	acc100_fcw_td_fill(op, &desc->req.fcw_td);
> +
> +	input = op->turbo_dec.input.data;
> +	h_output_head = h_output = op->turbo_dec.hard_output.data;
> +	s_output_head = s_output = op->turbo_dec.soft_output.data;
> +	in_offset = op->turbo_dec.input.offset;
> +	h_out_offset = op->turbo_dec.hard_output.offset;
> +	s_out_offset = op->turbo_dec.soft_output.offset;
> +	h_out_length = s_out_length = 0;
> +	mbuf_total_left = op->turbo_dec.input.length;
> +	seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +	if (unlikely(input == NULL)) {
> +		rte_bbdev_log(ERR, "Invalid mbuf pointer");
> +		return -EFAULT;
> +	}
> +#endif
> +
> +	/* Set up DMA descriptor */
> +	desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
> +			& q->sw_ring_wrap_mask);
> +
> +	ret = acc100_dma_desc_td_fill(op, &desc->req, &input, h_output,
> +			s_output, &in_offset, &h_out_offset, &s_out_offset,
> +			&h_out_length, &s_out_length, &mbuf_total_left,
> +			&seg_total_left, 0);
> +
> +	if (unlikely(ret < 0))
> +		return ret;
> +
> +	/* Hard output */
> +	mbuf_append(h_output_head, h_output, h_out_length);
> +
> +	/* Soft output */
> +	if (check_bit(op->turbo_dec.op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT))
> +		mbuf_append(s_output_head, s_output, s_out_length);
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +	rte_memdump(stderr, "FCW", &desc->req.fcw_td,
> +			sizeof(desc->req.fcw_td) - 8);
> +	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> +
> +	/* Check if any CBs left for processing */
> +	if (mbuf_total_left != 0) {
> +		rte_bbdev_log(ERR,
> +				"Some date still left after processing one CB: mbuf_total_left = %u",
> +				mbuf_total_left);
> +		return -EINVAL;
> +	}
> +#endif
This debug logic (memdump plus the leftover-length check) is near-identical around each mbuf_append call site; it should be factored into a common function.
> +
> +	/* One CB (one op) was successfully prepared to enqueue */
> +	return 1;
> +}
> +
> +static inline int
> +harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
> +		uint16_t total_enqueued_cbs) {
> +	struct acc100_fcw_ld *fcw;
> +	union acc100_dma_desc *desc;
> +	int next_triplet = 1;
> +	struct rte_mbuf *hq_output_head, *hq_output;
> +	uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
> +	if (harq_in_length == 0) {
> +		rte_bbdev_log(ERR, "Loopback of invalid null size\n");
> +		return -EINVAL;
> +	}
> +
> +	int h_comp = check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
> +			) ? 1 : 0;
> +	if (h_comp == 1)
> +		harq_in_length = harq_in_length * 8 / 6;
> +	harq_in_length = RTE_ALIGN(harq_in_length, 64);
> +	uint16_t harq_dma_length_in = (h_comp == 0) ?
Can these repeated h_comp checks be combined into a single if/else?
> +			harq_in_length :
> +			harq_in_length * 6 / 8;
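One possible shape — a sketch only: ALIGN_UP stands in for DPDK's RTE_ALIGN, and the sketch just returns the DMA length, while the driver would also keep the aligned harq_in_length around for later use:

```c
#include <stdint.h>

/* Stand-in for DPDK's RTE_ALIGN: round v up to a multiple of a. */
#define ALIGN_UP(v, a) ((((v) + (a) - 1) / (a)) * (a))

/* Current shape: three separate h_comp checks. */
static uint16_t dma_len_in_orig(uint16_t harq_in_length, int h_comp)
{
	if (h_comp == 1)
		harq_in_length = harq_in_length * 8 / 6;
	harq_in_length = ALIGN_UP(harq_in_length, 64);
	return (h_comp == 0) ? harq_in_length : harq_in_length * 6 / 8;
}

/* Combined: one if/else covering both the scaling and the DMA length. */
static uint16_t dma_len_in_combined(uint16_t harq_in_length, int h_comp)
{
	if (h_comp) {
		harq_in_length = ALIGN_UP(harq_in_length * 8 / 6, 64);
		return harq_in_length * 6 / 8;
	}
	return ALIGN_UP(harq_in_length, 64);
}
```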
> +	uint16_t harq_dma_length_out = harq_dma_length_in;
> +	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
> +			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
> +	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
> +	uint16_t harq_index = (ddr_mem_in ?
> +			op->ldpc_dec.harq_combined_input.offset :
> +			op->ldpc_dec.harq_combined_output.offset)
> +			/ ACC100_HARQ_OFFSET;
> +
> +	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> +			& q->sw_ring_wrap_mask);
> +	desc = q->ring_addr + desc_idx;
> +	fcw = &desc->req.fcw_ld;
> +	/* Set the FCW from loopback into DDR */
> +	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
> +	fcw->FCWversion = ACC100_FCW_VER;
> +	fcw->qm = 2;
> +	fcw->Zc = 384;
These magic numbers (the qm, Zc and rm_e values) should have #defines.
> +	if (harq_in_length < 16 * N_ZC_1)
> +		fcw->Zc = 16;
> +	fcw->ncb = fcw->Zc * N_ZC_1;
> +	fcw->rm_e = 2;
> +	fcw->hcin_en = 1;
> +	fcw->hcout_en = 1;
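For instance — the names below are purely illustrative (they do not exist in the patch; pick whatever matches the driver's naming convention):

```c
/* Hypothetical #defines for the loopback FCW constants; names are
 * illustrative only and do not appear in the patch. */
#define ACC100_LPBK_QM      2    /* modulation order used for loopback */
#define ACC100_LPBK_ZC_MAX  384  /* default lifting size */
#define ACC100_LPBK_ZC_MIN  16   /* fallback Zc for short HARQ input */
#define ACC100_LPBK_RM_E    2    /* nominal rm_e for loopback */
```

so the setup would read e.g. `fcw->qm = ACC100_LPBK_QM;` and `fcw->Zc = ACC100_LPBK_ZC_MAX;` instead of bare numbers.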
> +
> +	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d\n",
> +			ddr_mem_in, harq_index,
> +			harq_layout[harq_index].offset, harq_in_length,
> +			harq_dma_length_in);
> +
> +	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
> +		fcw->hcin_size0 = harq_layout[harq_index].size0;
> +		fcw->hcin_offset = harq_layout[harq_index].offset;
> +		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
> +		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
> +		if (h_comp == 1)
> +			harq_dma_length_in = harq_dma_length_in * 6 / 8;
> +	} else {
> +		fcw->hcin_size0 = harq_in_length;
> +	}
> +	harq_layout[harq_index].val = 0;
> +	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
> +			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
> +	fcw->hcout_size0 = harq_in_length;
> +	fcw->hcin_decomp_mode = h_comp;
> +	fcw->hcout_comp_mode = h_comp;
> +	fcw->gain_i = 1;
> +	fcw->gain_h = 1;
> +
> +	/* Set the prefix of descriptor. This could be done at polling */
>  	desc->req.word0 = ACC100_DMA_DESC_TYPE;
>  	desc->req.word1 = 0; /**< Timestamp could be disabled */
>  	desc->req.word2 = 0;
> @@ -1816,6 +2320,107 @@
>  	return current_enqueued_cbs;
>  }
>  
> +/* Enqueue one decode operations for ACC100 device in TB mode */
> +static inline int
> +enqueue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
> +		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
> +{
> +	union acc100_dma_desc *desc = NULL;
> +	int ret;
> +	uint8_t r, c;
> +	uint32_t in_offset, h_out_offset, s_out_offset, s_out_length,
> +		h_out_length, mbuf_total_left, seg_total_left;
> +	struct rte_mbuf *input, *h_output_head, *h_output,
> +		*s_output_head, *s_output;
> +	uint16_t current_enqueued_cbs = 0;
> +
> +	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> +			& q->sw_ring_wrap_mask);
> +	desc = q->ring_addr + desc_idx;
> +	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
> +	acc100_fcw_td_fill(op, &desc->req.fcw_td);
> +
> +	input = op->turbo_dec.input.data;
> +	h_output_head = h_output = op->turbo_dec.hard_output.data;
> +	s_output_head = s_output = op->turbo_dec.soft_output.data;
> +	in_offset = op->turbo_dec.input.offset;
> +	h_out_offset = op->turbo_dec.hard_output.offset;
> +	s_out_offset = op->turbo_dec.soft_output.offset;
> +	h_out_length = s_out_length = 0;
> +	mbuf_total_left = op->turbo_dec.input.length;
> +	c = op->turbo_dec.tb_params.c;
> +	r = op->turbo_dec.tb_params.r;
> +
> +	while (mbuf_total_left > 0 && r < c) {
> +
> +		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
> +
> +		/* Set up DMA descriptor */
> +		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
> +				& q->sw_ring_wrap_mask);
> +		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
> +		desc->req.data_ptrs[0].blen = ACC100_FCW_TD_BLEN;
> +		ret = acc100_dma_desc_td_fill(op, &desc->req, &input,
> +				h_output, s_output, &in_offset, &h_out_offset,
> +				&s_out_offset, &h_out_length, &s_out_length,
> +				&mbuf_total_left, &seg_total_left, r);
> +
> +		if (unlikely(ret < 0))
> +			return ret;
> +
> +		/* Hard output */
> +		mbuf_append(h_output_head, h_output, h_out_length);
> +
> +		/* Soft output */
> +		if (check_bit(op->turbo_dec.op_flags,
> +				RTE_BBDEV_TURBO_SOFT_OUTPUT))
> +			mbuf_append(s_output_head, s_output, s_out_length);
> +
> +		/* Set total number of CBs in TB */
> +		desc->req.cbs_in_tb = cbs_in_tb;
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +		rte_memdump(stderr, "FCW", &desc->req.fcw_td,
> +				sizeof(desc->req.fcw_td) - 8);
> +		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> +#endif
> +
> +		if (seg_total_left == 0) {
> +			/* Go to the next mbuf */
> +			input = input->next;
> +			in_offset = 0;
> +			h_output = h_output->next;
> +			h_out_offset = 0;
> +
> +			if (check_bit(op->turbo_dec.op_flags,
> +					RTE_BBDEV_TURBO_SOFT_OUTPUT)) {
> +				s_output = s_output->next;
> +				s_out_offset = 0;
> +			}
> +		}
> +
> +		total_enqueued_cbs++;
> +		current_enqueued_cbs++;
> +		r++;
> +	}
> +
> +	if (unlikely(desc == NULL))
> +		return current_enqueued_cbs;
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +	/* Check if any CBs left for processing */
> +	if (mbuf_total_left != 0) {
> +		rte_bbdev_log(ERR,
> +				"Some date still left for processing: mbuf_total_left = %u",
> +				mbuf_total_left);
> +		return -EINVAL;
> +	}
> +#endif
> +	/* Set SDone on last CB descriptor for TB mode */
> +	desc->req.sdone_enable = 1;
> +	desc->req.irq_enable = q->irq_enable;
> +
> +	return current_enqueued_cbs;
> +}
>  
>  /* Calculates number of CBs in processed encoder TB based on 'r' and input
>   * length.
> @@ -1893,6 +2498,45 @@
>  	return cbs_in_tb;
>  }
>  
> +/* Enqueue encode operations for ACC100 device in CB mode. */
> +static uint16_t
> +acc100_enqueue_enc_cb(struct rte_bbdev_queue_data *q_data,
> +		struct rte_bbdev_enc_op **ops, uint16_t num)
> +{
> +	struct acc100_queue *q = q_data->queue_private;
> +	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> +	uint16_t i;
> +	union acc100_dma_desc *desc;
> +	int ret;
> +
> +	for (i = 0; i < num; ++i) {
> +		/* Check if there are available space for further processing */
> +		if (unlikely(avail - 1 < 0))
> +			break;
> +		avail -= 1;
> +
> +		ret = enqueue_enc_one_op_cb(q, ops[i], i);
> +		if (ret < 0)
> +			break;
> +	}
> +
> +	if (unlikely(i == 0))
> +		return 0; /* Nothing to enqueue */
> +
> +	/* Set SDone in last CB in enqueued ops for CB mode*/
> +	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
> +			& q->sw_ring_wrap_mask);
> +	desc->req.sdone_enable = 1;
> +	desc->req.irq_enable = q->irq_enable;
> +
> +	acc100_dma_enqueue(q, i, &q_data->queue_stats);
> +
> +	/* Update stats */
> +	q_data->queue_stats.enqueued_count += i;
> +	q_data->queue_stats.enqueue_err_count += num - i;
> +	return i;
> +}
> +
>  /* Check we can mux encode operations with common FCW */
>  static inline bool
>  check_mux(struct rte_bbdev_enc_op **ops, uint16_t num) {
> @@ -1960,6 +2604,52 @@
>  	return i;
>  }
>  
> +/* Enqueue encode operations for ACC100 device in TB mode. */
> +static uint16_t
> +acc100_enqueue_enc_tb(struct rte_bbdev_queue_data *q_data,
> +		struct rte_bbdev_enc_op **ops, uint16_t num)
> +{
> +	struct acc100_queue *q = q_data->queue_private;
> +	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> +	uint16_t i, enqueued_cbs = 0;
> +	uint8_t cbs_in_tb;
> +	int ret;
> +
> +	for (i = 0; i < num; ++i) {
> +		cbs_in_tb = get_num_cbs_in_tb_enc(&ops[i]->turbo_enc);
> +		/* Check if there are available space for further processing */
> +		if (unlikely(avail - cbs_in_tb < 0))
> +			break;
> +		avail -= cbs_in_tb;
> +
> +		ret = enqueue_enc_one_op_tb(q, ops[i], enqueued_cbs, cbs_in_tb);
> +		if (ret < 0)
> +			break;
> +		enqueued_cbs += ret;
> +	}
> +
Other similar functions check for (i == 0) here and return early before ringing the doorbell; that check appears to be missing.
> +	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
> +
> +	/* Update stats */
> +	q_data->queue_stats.enqueued_count += i;
> +	q_data->queue_stats.enqueue_err_count += num - i;
> +
> +	return i;
> +}
> +
> +/* Enqueue encode operations for ACC100 device. */
> +static uint16_t
> +acc100_enqueue_enc(struct rte_bbdev_queue_data *q_data,
> +		struct rte_bbdev_enc_op **ops, uint16_t num)
> +{
> +	if (unlikely(num == 0))
> +		return 0;
The num == 0 check should move into the TB/CB functions so each entry point guards itself.
> +	if (ops[0]->turbo_enc.code_block_mode == 0)
> +		return acc100_enqueue_enc_tb(q_data, ops, num);
> +	else
> +		return acc100_enqueue_enc_cb(q_data, ops, num);
> +}
> +
>  /* Enqueue encode operations for ACC100 device. */
>  static uint16_t
>  acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
> @@ -1967,7 +2657,51 @@
>  {
>  	if (unlikely(num == 0))
>  		return 0;
> -	return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
> +	if (ops[0]->ldpc_enc.code_block_mode == 0)
> +		return acc100_enqueue_enc_tb(q_data, ops, num);
> +	else
> +		return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
> +}
> +
> +
> +/* Enqueue decode operations for ACC100 device in CB mode */
> +static uint16_t
> +acc100_enqueue_dec_cb(struct rte_bbdev_queue_data *q_data,
> +		struct rte_bbdev_dec_op **ops, uint16_t num)
> +{

This looks like the tenth variant of essentially the same function; could these be combined into fewer functions?

Maybe by passing in a function pointer to the enqueue_*_one_op worker that does the per-op work?

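Roughly this shape — a hypothetical sketch only, with simplified stand-in types rather than the real acc100_queue/rte_bbdev ones:

```c
#include <stdint.h>
#include <stddef.h>

/* The per-op worker (enqueue_enc_one_op_cb, enqueue_dec_one_op_cb, ...)
 * becomes a callback, so one loop serves all the CB-mode variants. */
typedef int (*enqueue_one_fn)(void *q, void *op, uint16_t enqueued);

static uint16_t
enqueue_cb_common(void *q, void **ops, uint16_t num, int32_t avail,
		enqueue_one_fn enqueue_one)
{
	uint16_t i;

	for (i = 0; i < num; ++i) {
		/* Check if there is available space for further processing */
		if (avail - 1 < 0)
			break;
		avail -= 1;
		if (enqueue_one(q, ops[i], i) < 0)
			break;
	}
	/* Caller sets SDone on the last descriptor and rings the doorbell. */
	return i;
}

/* Trivial worker used only for the sketch's self-test. */
static int demo_enqueue_one(void *q, void *op, uint16_t enqueued)
{
	(void)q; (void)op; (void)enqueued;
	return 1;
}
```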
> +	struct acc100_queue *q = q_data->queue_private;
> +	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> +	uint16_t i;
> +	union acc100_dma_desc *desc;
> +	int ret;
> +
> +	for (i = 0; i < num; ++i) {
> +		/* Check if there are available space for further processing */
> +		if (unlikely(avail - 1 < 0))
> +			break;
> +		avail -= 1;
> +
> +		ret = enqueue_dec_one_op_cb(q, ops[i], i);
> +		if (ret < 0)
> +			break;
> +	}
> +
> +	if (unlikely(i == 0))
> +		return 0; /* Nothing to enqueue */
> +
> +	/* Set SDone in last CB in enqueued ops for CB mode*/
> +	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
> +			& q->sw_ring_wrap_mask);
> +	desc->req.sdone_enable = 1;
> +	desc->req.irq_enable = q->irq_enable;
> +
> +	acc100_dma_enqueue(q, i, &q_data->queue_stats);
> +
> +	/* Update stats */
> +	q_data->queue_stats.enqueued_count += i;
> +	q_data->queue_stats.enqueue_err_count += num - i;
> +
> +	return i;
>  }
>  
>  /* Check we can mux encode operations with common FCW */
> @@ -2065,6 +2799,53 @@
>  	return i;
>  }
>  
> +
> +/* Enqueue decode operations for ACC100 device in TB mode */
> +static uint16_t
> +acc100_enqueue_dec_tb(struct rte_bbdev_queue_data *q_data,
> +		struct rte_bbdev_dec_op **ops, uint16_t num)
> +{
11th ;)
> +	struct acc100_queue *q = q_data->queue_private;
> +	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> +	uint16_t i, enqueued_cbs = 0;
> +	uint8_t cbs_in_tb;
> +	int ret;
> +
> +	for (i = 0; i < num; ++i) {
> +		cbs_in_tb = get_num_cbs_in_tb_dec(&ops[i]->turbo_dec);
> +		/* Check if there are available space for further processing */
> +		if (unlikely(avail - cbs_in_tb < 0))
> +			break;
> +		avail -= cbs_in_tb;
> +
> +		ret = enqueue_dec_one_op_tb(q, ops[i], enqueued_cbs, cbs_in_tb);
> +		if (ret < 0)
> +			break;
> +		enqueued_cbs += ret;
> +	}
> +
> +	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
> +
> +	/* Update stats */
> +	q_data->queue_stats.enqueued_count += i;
> +	q_data->queue_stats.enqueue_err_count += num - i;
> +
> +	return i;
> +}
> +
> +/* Enqueue decode operations for ACC100 device. */
> +static uint16_t
> +acc100_enqueue_dec(struct rte_bbdev_queue_data *q_data,
> +		struct rte_bbdev_dec_op **ops, uint16_t num)
> +{
> +	if (unlikely(num == 0))
> +		return 0;
Similarly, move the num == 0 check into the TB/CB functions.
> +	if (ops[0]->turbo_dec.code_block_mode == 0)
> +		return acc100_enqueue_dec_tb(q_data, ops, num);
> +	else
> +		return acc100_enqueue_dec_cb(q_data, ops, num);
> +}
> +
>  /* Enqueue decode operations for ACC100 device. */
>  static uint16_t
>  acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
> @@ -2388,6 +3169,51 @@
>  	return cb_idx;
>  }
>  
> +/* Dequeue encode operations from ACC100 device. */
> +static uint16_t
> +acc100_dequeue_enc(struct rte_bbdev_queue_data *q_data,
> +		struct rte_bbdev_enc_op **ops, uint16_t num)
> +{
> +	struct acc100_queue *q = q_data->queue_private;
> +	uint16_t dequeue_num;
> +	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
> +	uint32_t aq_dequeued = 0;
> +	uint16_t i;
> +	uint16_t dequeued_cbs = 0;
> +	struct rte_bbdev_enc_op *op;
> +	int ret;
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +	if (unlikely(ops == 0 && q == NULL))

ops is a pointer, so it should be compared against NULL rather than 0.

The && likely needs to be ||: as written, the function only bails out when both arguments are bad, so a NULL ops with a valid q (or vice versa) slips through.

Maybe print out a message so the caller knows something wrong happened.

> +		return 0;
> +#endif
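To make the failure mode concrete (plain C, outside DPDK):

```c
#include <stdbool.h>
#include <stddef.h>

/* As written: rejects only when BOTH arguments are bad, so a NULL ops
 * with a valid queue slips through the guard. */
static bool reject_and(const void *ops, const void *q)
{
	return ops == NULL && q == NULL;
}

/* Suggested: reject when EITHER argument is bad. */
static bool reject_or(const void *ops, const void *q)
{
	return ops == NULL || q == NULL;
}
```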
> +
> +	dequeue_num = (avail < num) ? avail : num;
> +
> +	for (i = 0; i < dequeue_num; ++i) {
> +		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> +			& q->sw_ring_wrap_mask))->req.op_addr;
> +		if (op->turbo_enc.code_block_mode == 0)
> +			ret = dequeue_enc_one_op_tb(q, &ops[i], dequeued_cbs,
> +					&aq_dequeued);
> +		else
> +			ret = dequeue_enc_one_op_cb(q, &ops[i], dequeued_cbs,
> +					&aq_dequeued);
> +
> +		if (ret < 0)
> +			break;
> +		dequeued_cbs += ret;
> +	}
> +
> +	q->aq_dequeued += aq_dequeued;
> +	q->sw_ring_tail += dequeued_cbs;
> +
> +	/* Update enqueue stats */
> +	q_data->queue_stats.dequeued_count += i;
> +
> +	return i;
> +}
> +
>  /* Dequeue LDPC encode operations from ACC100 device. */
>  static uint16_t
>  acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
> @@ -2426,6 +3252,52 @@
>  	return dequeued_cbs;
>  }
>  
> +
> +/* Dequeue decode operations from ACC100 device. */
> +static uint16_t
> +acc100_dequeue_dec(struct rte_bbdev_queue_data *q_data,
> +		struct rte_bbdev_dec_op **ops, uint16_t num)
> +{

Very similar to the enc function above; consider how to combine them into a single function.

Tom
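
One possible shape for the combination, sketched with simplified types: the enc/dec-specific dispatch becomes a callback, and the loop, bookkeeping, and stats update are shared. Names and signatures here are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-op dequeue callback: returns the number of
 * descriptors consumed, or a negative value to stop the loop. */
typedef int (*dequeue_one_fn)(void *q, uint16_t dequeued_cbs);

/* Loop body common to both the enc and dec dequeue paths. */
static uint16_t dequeue_common(void *q, uint16_t num, dequeue_one_fn fn)
{
	uint16_t i, dequeued_cbs = 0;

	for (i = 0; i < num; i++) {
		int ret = fn(q, dequeued_cbs);
		if (ret < 0)
			break;
		dequeued_cbs += ret;
	}
	return i;
}

/* Toy callback: each op consumes exactly one descriptor. */
static int one_cb(void *q, uint16_t dequeued_cbs)
{
	(void)q;
	(void)dequeued_cbs;
	return 1;
}

/* Toy callback that fails after two ops, to exercise early exit. */
static int stop_cb(void *q, uint16_t dequeued_cbs)
{
	(void)q;
	return dequeued_cbs < 2 ? 1 : -1;
}
```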

> +	struct acc100_queue *q = q_data->queue_private;
> +	uint16_t dequeue_num;
> +	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
> +	uint32_t aq_dequeued = 0;
> +	uint16_t i;
> +	uint16_t dequeued_cbs = 0;
> +	struct rte_bbdev_dec_op *op;
> +	int ret;
> +
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +	if (unlikely(ops == 0 && q == NULL))
> +		return 0;
> +#endif
> +
> +	dequeue_num = (avail < num) ? avail : num;
> +
> +	for (i = 0; i < dequeue_num; ++i) {
> +		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> +			& q->sw_ring_wrap_mask))->req.op_addr;
> +		if (op->turbo_dec.code_block_mode == 0)
> +			ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
> +					&aq_dequeued);
> +		else
> +			ret = dequeue_dec_one_op_cb(q_data, q, &ops[i],
> +					dequeued_cbs, &aq_dequeued);
> +
> +		if (ret < 0)
> +			break;
> +		dequeued_cbs += ret;
> +	}
> +
> +	q->aq_dequeued += aq_dequeued;
> +	q->sw_ring_tail += dequeued_cbs;
> +
> +	/* Update dequeue stats */
> +	q_data->queue_stats.dequeued_count += i;
> +
> +	return i;
> +}
> +
>  /* Dequeue decode operations from ACC100 device. */
>  static uint16_t
>  acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
> @@ -2479,6 +3351,10 @@
>  	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
>  
>  	dev->dev_ops = &acc100_bbdev_ops;
> +	dev->enqueue_enc_ops = acc100_enqueue_enc;
> +	dev->enqueue_dec_ops = acc100_enqueue_dec;
> +	dev->dequeue_enc_ops = acc100_dequeue_enc;
> +	dev->dequeue_dec_ops = acc100_dequeue_dec;
>  	dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
>  	dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
>  	dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v9 05/10] baseband/acc100: add LDPC processing functions
  2020-09-30 16:53       ` Tom Rix
@ 2020-09-30 18:52         ` Chautru, Nicolas
  2020-10-01 15:31           ` Tom Rix
  0 siblings, 1 reply; 213+ messages in thread
From: Chautru, Nicolas @ 2020-09-30 18:52 UTC (permalink / raw)
  To: Tom Rix, dev, akhil.goyal
  Cc: Richardson, Bruce, Xu, Rosen, dave.burley, aidan.goddard, Yigit,
	Ferruh, Liu, Tianjiao

Hi Tom, 

> From: Tom Rix <trix@redhat.com>
> On 9/28/20 5:29 PM, Nicolas Chautru wrote:
> > Adding LDPC decode and encode processing operations
> >
> > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
> > Acked-by: Dave Burley <dave.burley@accelercomm.com>
> > ---
> >  doc/guides/bbdevs/features/acc100.ini    |    8 +-
> >  drivers/baseband/acc100/rte_acc100_pmd.c | 1625
> +++++++++++++++++++++++++++++-
> >  drivers/baseband/acc100/rte_acc100_pmd.h |    3 +
> >  3 files changed, 1630 insertions(+), 6 deletions(-)
> >
> > diff --git a/doc/guides/bbdevs/features/acc100.ini
> b/doc/guides/bbdevs/features/acc100.ini
> > index c89a4d7..40c7adc 100644
> > --- a/doc/guides/bbdevs/features/acc100.ini
> > +++ b/doc/guides/bbdevs/features/acc100.ini
> > @@ -6,9 +6,9 @@
> >  [Features]
> >  Turbo Decoder (4G)     = N
> >  Turbo Encoder (4G)     = N
> > -LDPC Decoder (5G)      = N
> > -LDPC Encoder (5G)      = N
> > -LLR/HARQ Compression   = N
> > -External DDR Access    = N
> > +LDPC Decoder (5G)      = Y
> > +LDPC Encoder (5G)      = Y
> > +LLR/HARQ Compression   = Y
> > +External DDR Access    = Y
> >  HW Accelerated         = Y
> >  BBDEV API              = Y
> > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> b/drivers/baseband/acc100/rte_acc100_pmd.c
> > index 7a21c57..b223547 100644
> > --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> > +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> > @@ -15,6 +15,9 @@
> >  #include <rte_hexdump.h>
> >  #include <rte_pci.h>
> >  #include <rte_bus_pci.h>
> > +#ifdef RTE_BBDEV_OFFLOAD_COST
> > +#include <rte_cycles.h>
> > +#endif
> >
> >  #include <rte_bbdev.h>
> >  #include <rte_bbdev_pmd.h>
> > @@ -449,7 +452,6 @@
> >  	return 0;
> >  }
> >
> > -
> >  /**
> >   * Report a ACC100 queue index which is free
> >   * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
> > @@ -634,6 +636,46 @@
> >  	struct acc100_device *d = dev->data->dev_private;
> >
> >  	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
> > +		{
> > +			.type   = RTE_BBDEV_OP_LDPC_ENC,
> > +			.cap.ldpc_enc = {
> > +				.capability_flags =
> > +					RTE_BBDEV_LDPC_RATE_MATCH |
> > +
> 	RTE_BBDEV_LDPC_CRC_24B_ATTACH |
> > +
> 	RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
> > +				.num_buffers_src =
> > +
> 	RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > +				.num_buffers_dst =
> > +
> 	RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > +			}
> > +		},
> > +		{
> > +			.type   = RTE_BBDEV_OP_LDPC_DEC,
> > +			.cap.ldpc_dec = {
> > +			.capability_flags =
> > +				RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK |
> > +				RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP |
> > +
> 	RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
> > +
> 	RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
> > +#ifdef ACC100_EXT_MEM
> 
> This is unconditionally defined in rte_acc100_pmd.h, but it seems
> 
> like it could be a HW config. Please add a comment in the *.h.
> 

It is not really a HW config, just a potential alternate way to run
the device, notably for troubleshooting.
I can add a comment, though.

> Could also change to
> 
> #if ACC100_EXT_MEM
> 
> and change the #define ACC100_EXT_MEM 1

ok
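
The agreed change could look roughly like this (a sketch with a hypothetical accessor wrapped around the conditional, not the actual patch):

```c
#include <assert.h>

/* Define the flag to 1 so "#if" (rather than "#ifdef") can be used;
 * flipping it to 0 then disables the path without removing the define. */
#define ACC100_EXT_MEM 1

/* Hypothetical helper mirroring the harq_buffer_size selection. */
static int harq_buffer_size(int ddr_size)
{
#if ACC100_EXT_MEM
	return ddr_size;
#else
	return 0;
#endif
}
```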

> 
> > +
> 	RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
> > +
> 	RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
> > +#endif
> > +
> 	RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE |
> > +				RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS
> |
> > +				RTE_BBDEV_LDPC_DECODE_BYPASS |
> > +				RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
> > +
> 	RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
> > +				RTE_BBDEV_LDPC_LLR_COMPRESSION,
> > +			.llr_size = 8,
> > +			.llr_decimals = 1,
> > +			.num_buffers_src =
> > +
> 	RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > +			.num_buffers_hard_out =
> > +
> 	RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> > +			.num_buffers_soft_out = 0,
> > +			}
> > +		},
> >  		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
> >  	};
> >
> > @@ -669,9 +711,14 @@
> >  	dev_info->cpu_flag_reqs = NULL;
> >  	dev_info->min_alignment = 64;
> >  	dev_info->capabilities = bbdev_capabilities;
> > +#ifdef ACC100_EXT_MEM
> >  	dev_info->harq_buffer_size = d->ddr_size;
> > +#else
> > +	dev_info->harq_buffer_size = 0;
> > +#endif
> >  }
> >
> > +
> >  static const struct rte_bbdev_ops acc100_bbdev_ops = {
> >  	.setup_queues = acc100_setup_queues,
> >  	.close = acc100_dev_close,
> > @@ -696,6 +743,1577 @@
> >  	{.device_id = 0},
> >  };
> >
> > +/* Read flag value 0/1 from bitmap */
> > +static inline bool
> > +check_bit(uint32_t bitmap, uint32_t bitmask)
> > +{
> > +	return bitmap & bitmask;
> > +}
> > +
> 
> All the bbdev PMDs have this function; it's pretty trivial, but it would
> be good if common bbdev functions got moved to a common place.

Noted for a future change affecting all PMDs, outside of this series.

> 
> > +static inline char *
> > +mbuf_append(struct rte_mbuf *m_head, struct rte_mbuf *m, uint16_t
> len)
> > +{
> > +	if (unlikely(len > rte_pktmbuf_tailroom(m)))
> > +		return NULL;
> > +
> > +	char *tail = (char *)m->buf_addr + m->data_off + m->data_len;
> > +	m->data_len = (uint16_t)(m->data_len + len);
> > +	m_head->pkt_len  = (m_head->pkt_len + len);
> > +	return tail;
> > +}
> > +
> > +/* Compute value of k0.
> > + * Based on 3GPP 38.212 Table 5.4.2.1-2
> > + * Starting position of different redundancy versions, k0
> > + */
> > +static inline uint16_t
> > +get_k0(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv_index)
> > +{
> > +	if (rv_index == 0)
> > +		return 0;
> > +	uint16_t n = (bg == 1 ? N_ZC_1 : N_ZC_2) * z_c;
> > +	if (n_cb == n) {
> > +		if (rv_index == 1)
> > +			return (bg == 1 ? K0_1_1 : K0_1_2) * z_c;
> > +		else if (rv_index == 2)
> > +			return (bg == 1 ? K0_2_1 : K0_2_2) * z_c;
> > +		else
> > +			return (bg == 1 ? K0_3_1 : K0_3_2) * z_c;
> > +	}
> > +	/* LBRM case - includes a division by N */
> > +	if (rv_index == 1)
> > +		return (((bg == 1 ? K0_1_1 : K0_1_2) * n_cb)
> > +				/ n) * z_c;
> > +	else if (rv_index == 2)
> > +		return (((bg == 1 ? K0_2_1 : K0_2_2) * n_cb)
> > +				/ n) * z_c;
> > +	else
> > +		return (((bg == 1 ? K0_3_1 : K0_3_2) * n_cb)
> > +				/ n) * z_c;
> > +}
> > +
> > +/* Fill in a frame control word for LDPC encoding. */
> > +static inline void
> > +acc100_fcw_le_fill(const struct rte_bbdev_enc_op *op,
> > +		struct acc100_fcw_le *fcw, int num_cb)
> > +{
> > +	fcw->qm = op->ldpc_enc.q_m;
> > +	fcw->nfiller = op->ldpc_enc.n_filler;
> > +	fcw->BG = (op->ldpc_enc.basegraph - 1);
> > +	fcw->Zc = op->ldpc_enc.z_c;
> > +	fcw->ncb = op->ldpc_enc.n_cb;
> > +	fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_enc.basegraph,
> > +			op->ldpc_enc.rv_index);
> > +	fcw->rm_e = op->ldpc_enc.cb_params.e;
> > +	fcw->crc_select = check_bit(op->ldpc_enc.op_flags,
> > +			RTE_BBDEV_LDPC_CRC_24B_ATTACH);
> > +	fcw->bypass_intlv = check_bit(op->ldpc_enc.op_flags,
> > +			RTE_BBDEV_LDPC_INTERLEAVER_BYPASS);
> > +	fcw->mcb_count = num_cb;
> > +}
> > +
> > +/* Fill in a frame control word for LDPC decoding. */
> > +static inline void
> > +acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct
> acc100_fcw_ld *fcw,
> > +		union acc100_harq_layout_data *harq_layout)
> > +{
> > +	uint16_t harq_out_length, harq_in_length, ncb_p, k0_p,
> parity_offset;
> > +	uint16_t harq_index;
> > +	uint32_t l;
> > +	bool harq_prun = false;
> > +
> > +	fcw->qm = op->ldpc_dec.q_m;
> > +	fcw->nfiller = op->ldpc_dec.n_filler;
> > +	fcw->BG = (op->ldpc_dec.basegraph - 1);
> > +	fcw->Zc = op->ldpc_dec.z_c;
> > +	fcw->ncb = op->ldpc_dec.n_cb;
> > +	fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_dec.basegraph,
> > +			op->ldpc_dec.rv_index);
> > +	if (op->ldpc_dec.code_block_mode == 1)
> 1 is a magic number; consider a #define.

This would be a change not related to this PMD, but noted and agreed.

> > +		fcw->rm_e = op->ldpc_dec.cb_params.e;
> > +	else
> > +		fcw->rm_e = (op->ldpc_dec.tb_params.r <
> > +				op->ldpc_dec.tb_params.cab) ?
> > +						op->ldpc_dec.tb_params.ea :
> > +						op->ldpc_dec.tb_params.eb;
> > +
> > +	fcw->hcin_en = check_bit(op->ldpc_dec.op_flags,
> > +			RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE);
> > +	fcw->hcout_en = check_bit(op->ldpc_dec.op_flags,
> > +			RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE);
> > +	fcw->crc_select = check_bit(op->ldpc_dec.op_flags,
> > +			RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK);
> > +	fcw->bypass_dec = check_bit(op->ldpc_dec.op_flags,
> > +			RTE_BBDEV_LDPC_DECODE_BYPASS);
> > +	fcw->bypass_intlv = check_bit(op->ldpc_dec.op_flags,
> > +			RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS);
> > +	if (op->ldpc_dec.q_m == 1) {
> > +		fcw->bypass_intlv = 1;
> > +		fcw->qm = 2;
> > +	}
> Similar magic number.

Qm is an integer defined by 3GPP, not a magic number; this literally means qm = 2.

> > +	fcw->hcin_decomp_mode = check_bit(op->ldpc_dec.op_flags,
> > +			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> > +	fcw->hcout_comp_mode = check_bit(op->ldpc_dec.op_flags,
> > +			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> > +	fcw->llr_pack_mode = check_bit(op->ldpc_dec.op_flags,
> > +			RTE_BBDEV_LDPC_LLR_COMPRESSION);
> > +	harq_index = op->ldpc_dec.harq_combined_output.offset /
> > +			ACC100_HARQ_OFFSET;
> > +#ifdef ACC100_EXT_MEM
> > +	/* Limit cases when HARQ pruning is valid */
> > +	harq_prun = ((op->ldpc_dec.harq_combined_output.offset %
> > +			ACC100_HARQ_OFFSET) == 0) &&
> > +			(op->ldpc_dec.harq_combined_output.offset <=
> UINT16_MAX
> > +			* ACC100_HARQ_OFFSET);
> > +#endif
> > +	if (fcw->hcin_en > 0) {
> > +		harq_in_length = op-
> >ldpc_dec.harq_combined_input.length;
> > +		if (fcw->hcin_decomp_mode > 0)
> > +			harq_in_length = harq_in_length * 8 / 6;
> > +		harq_in_length = RTE_ALIGN(harq_in_length, 64);
> > +		if ((harq_layout[harq_index].offset > 0) & harq_prun) {
> > +			rte_bbdev_log_debug("HARQ IN offset unexpected
> for now\n");
> > +			fcw->hcin_size0 = harq_layout[harq_index].size0;
> > +			fcw->hcin_offset = harq_layout[harq_index].offset;
> > +			fcw->hcin_size1 = harq_in_length -
> > +					harq_layout[harq_index].offset;
> > +		} else {
> > +			fcw->hcin_size0 = harq_in_length;
> > +			fcw->hcin_offset = 0;
> > +			fcw->hcin_size1 = 0;
> > +		}
> > +	} else {
> > +		fcw->hcin_size0 = 0;
> > +		fcw->hcin_offset = 0;
> > +		fcw->hcin_size1 = 0;
> > +	}
> > +
> > +	fcw->itmax = op->ldpc_dec.iter_max;
> > +	fcw->itstop = check_bit(op->ldpc_dec.op_flags,
> > +			RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE);
> > +	fcw->synd_precoder = fcw->itstop;
> > +	/*
> > +	 * These are all implicitly set
> > +	 * fcw->synd_post = 0;
> > +	 * fcw->so_en = 0;
> > +	 * fcw->so_bypass_rm = 0;
> > +	 * fcw->so_bypass_intlv = 0;
> > +	 * fcw->dec_convllr = 0;
> > +	 * fcw->hcout_convllr = 0;
> > +	 * fcw->hcout_size1 = 0;
> > +	 * fcw->so_it = 0;
> > +	 * fcw->hcout_offset = 0;
> > +	 * fcw->negstop_th = 0;
> > +	 * fcw->negstop_it = 0;
> > +	 * fcw->negstop_en = 0;
> > +	 * fcw->gain_i = 1;
> > +	 * fcw->gain_h = 1;
> > +	 */
> > +	if (fcw->hcout_en > 0) {
> > +		parity_offset = (op->ldpc_dec.basegraph == 1 ? 20 : 8)
> > +			* op->ldpc_dec.z_c - op->ldpc_dec.n_filler;
> > +		k0_p = (fcw->k0 > parity_offset) ?
> > +				fcw->k0 - op->ldpc_dec.n_filler : fcw->k0;
> > +		ncb_p = fcw->ncb - op->ldpc_dec.n_filler;
> > +		l = k0_p + fcw->rm_e;
> > +		harq_out_length = (uint16_t) fcw->hcin_size0;
> > +		harq_out_length = RTE_MIN(RTE_MAX(harq_out_length, l),
> ncb_p);
> > +		harq_out_length = (harq_out_length + 0x3F) & 0xFFC0;
> > +		if ((k0_p > fcw->hcin_size0 +
> ACC100_HARQ_OFFSET_THRESHOLD) &&
> > +				harq_prun) {
> > +			fcw->hcout_size0 = (uint16_t) fcw->hcin_size0;
> > +			fcw->hcout_offset = k0_p & 0xFFC0;
> > +			fcw->hcout_size1 = harq_out_length - fcw-
> >hcout_offset;
> > +		} else {
> > +			fcw->hcout_size0 = harq_out_length;
> > +			fcw->hcout_size1 = 0;
> > +			fcw->hcout_offset = 0;
> > +		}
> > +		harq_layout[harq_index].offset = fcw->hcout_offset;
> > +		harq_layout[harq_index].size0 = fcw->hcout_size0;
> > +	} else {
> > +		fcw->hcout_size0 = 0;
> > +		fcw->hcout_size1 = 0;
> > +		fcw->hcout_offset = 0;
> > +	}
> > +}
> > +
> > +/**
> > + * Fills descriptor with data pointers of one block type.
> > + *
> > + * @param desc
> > + *   Pointer to DMA descriptor.
> > + * @param input
> > + *   Pointer to pointer to input data which will be encoded. It can be
> changed
> > + *   and points to next segment in scatter-gather case.
> > + * @param offset
> > + *   Input offset in rte_mbuf structure. It is used for calculating the point
> > + *   where data is starting.
> > + * @param cb_len
> > + *   Length of currently processed Code Block
> > + * @param seg_total_left
> > + *   It indicates how many bytes still left in segment (mbuf) for further
> > + *   processing.
> > + * @param op_flags
> > + *   Store information about device capabilities
> > + * @param next_triplet
> > + *   Index for ACC100 DMA Descriptor triplet
> > + *
> > + * @return
> > + *   Returns index of next triplet on success, other value if lengths of
> > + *   pkt and processed cb do not match.
> > + *
> > + */
> > +static inline int
> > +acc100_dma_fill_blk_type_in(struct acc100_dma_req_desc *desc,
> > +		struct rte_mbuf **input, uint32_t *offset, uint32_t cb_len,
> > +		uint32_t *seg_total_left, int next_triplet)
> > +{
> > +	uint32_t part_len;
> > +	struct rte_mbuf *m = *input;
> > +
> > +	part_len = (*seg_total_left < cb_len) ? *seg_total_left : cb_len;
> > +	cb_len -= part_len;
> > +	*seg_total_left -= part_len;
> > +
> > +	desc->data_ptrs[next_triplet].address =
> > +			rte_pktmbuf_iova_offset(m, *offset);
> > +	desc->data_ptrs[next_triplet].blen = part_len;
> > +	desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
> > +	desc->data_ptrs[next_triplet].last = 0;
> > +	desc->data_ptrs[next_triplet].dma_ext = 0;
> > +	*offset += part_len;
> > +	next_triplet++;
> > +
> > +	while (cb_len > 0) {
> 
> Since cb_len is unsigned, a better check would be
> 
> while (cb_len != 0)

Why would this be better?

> 
> > +		if (next_triplet < ACC100_DMA_MAX_NUM_POINTERS &&
> > +				m->next != NULL) {
> > +
> > +			m = m->next;
> > +			*seg_total_left = rte_pktmbuf_data_len(m);
> > +			part_len = (*seg_total_left < cb_len) ?
> > +					*seg_total_left :
> > +					cb_len;
> > +			desc->data_ptrs[next_triplet].address =
> > +					rte_pktmbuf_iova_offset(m, 0);
> > +			desc->data_ptrs[next_triplet].blen = part_len;
> > +			desc->data_ptrs[next_triplet].blkid =
> > +					ACC100_DMA_BLKID_IN;
> > +			desc->data_ptrs[next_triplet].last = 0;
> > +			desc->data_ptrs[next_triplet].dma_ext = 0;
> > +			cb_len -= part_len;
> > +			*seg_total_left -= part_len;
> 
> When *seg_total_left goes to zero here, there will be a lot of
> iterations doing nothing; it should stop early.

Not really; it would pick the next m anyway and keep adding buffer descriptor pointers.

> 
> > +			/* Initializing offset for next segment (mbuf) */
> > +			*offset = part_len;
> > +			next_triplet++;
> > +		} else {
> > +			rte_bbdev_log(ERR,
> > +				"Some data still left for processing: "
> > +				"data_left: %u, next_triplet: %u, next_mbuf:
> %p",
> > +				cb_len, next_triplet, m->next);
> > +			return -EINVAL;
> > +		}
> > +	}
> > +	/* Storing new mbuf as it could be changed in scatter-gather case*/
> > +	*input = m;
> > +
> > +	return next_triplet;
> 
> Callers, after checking, decrement the return value.
> 
> Maybe change the return to next_triplet-- and save the callers from doing it.

I don't follow your point.

> 
> > +}
> > +
> > +/* Fills descriptor with data pointers of one block type.
> > + * Returns index of next triplet on success, other value if lengths of
> > + * output data and processed mbuf do not match.
> > + */
> > +static inline int
> > +acc100_dma_fill_blk_type_out(struct acc100_dma_req_desc *desc,
> > +		struct rte_mbuf *output, uint32_t out_offset,
> > +		uint32_t output_len, int next_triplet, int blk_id)
> > +{
> > +	desc->data_ptrs[next_triplet].address =
> > +			rte_pktmbuf_iova_offset(output, out_offset);
> > +	desc->data_ptrs[next_triplet].blen = output_len;
> > +	desc->data_ptrs[next_triplet].blkid = blk_id;
> > +	desc->data_ptrs[next_triplet].last = 0;
> > +	desc->data_ptrs[next_triplet].dma_ext = 0;
> > +	next_triplet++;
> 
> Callers check whether the return is < 0, like above, but there is no
> 
> similar logic to check the bounds of next_triplet and return -EINVAL,
> 
> so either add that check here or remove the < 0 checks in the callers.
> 

Fair enough, thanks.

> > +
> > +	return next_triplet;
> > +}
> > +
> > +static inline int
> > +acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
> > +		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
> > +		struct rte_mbuf *output, uint32_t *in_offset,
> > +		uint32_t *out_offset, uint32_t *out_length,
> > +		uint32_t *mbuf_total_left, uint32_t *seg_total_left)
> > +{
> > +	int next_triplet = 1; /* FCW already done */
> > +	uint16_t K, in_length_in_bits, in_length_in_bytes;
> > +	struct rte_bbdev_op_ldpc_enc *enc = &op->ldpc_enc;
> > +
> > +	desc->word0 = ACC100_DMA_DESC_TYPE;
> > +	desc->word1 = 0; /**< Timestamp could be disabled */
> > +	desc->word2 = 0;
> > +	desc->word3 = 0;
> > +	desc->numCBs = 1;
> > +
> > +	K = (enc->basegraph == 1 ? 22 : 10) * enc->z_c;
> > +	in_length_in_bits = K - enc->n_filler;
> Can this overflow, i.e. enc->n_filler > K?

I would not add such checks in the time-critical function; for a valid scenario it can't.
It could be added to validate_ldpc_dec_op(), which is only run in debug mode.
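
The kind of bound check that could live in a debug-only validate helper, sketched as a standalone function (name and signature hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Sanity check for the encoder lengths: K = (bg == 1 ? 22 : 10) * z_c
 * must exceed n_filler, otherwise K - n_filler would underflow in the
 * fast path. Returns 1 when the parameters are consistent. */
static int ldpc_enc_len_ok(uint8_t basegraph, uint16_t z_c, uint16_t n_filler)
{
	uint16_t K = (basegraph == 1 ? 22 : 10) * z_c;

	return n_filler < K;
}
```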

> > +	if ((enc->op_flags & RTE_BBDEV_LDPC_CRC_24A_ATTACH) ||
> > +			(enc->op_flags &
> RTE_BBDEV_LDPC_CRC_24B_ATTACH))
> > +		in_length_in_bits -= 24;
> > +	in_length_in_bytes = in_length_in_bits >> 3;
> > +
> > +	if (unlikely((*mbuf_total_left == 0) ||
> This check is covered by the next one and can be removed.

Not necessarily; I would keep it as is.
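
For what it's worth, the two clauses only coincide when the required length is nonzero; in the degenerate case where the required length is itself 0, only the == 0 clause rejects an empty mbuf. A tiny standalone illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 when the input would be rejected. The first clause is
 * independent of the second only when "required" is 0. */
static int reject(uint32_t mbuf_total_left, uint32_t required)
{
	return (mbuf_total_left == 0) || (mbuf_total_left < required);
}
```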

> > +			(*mbuf_total_left < in_length_in_bytes))) {
> > +		rte_bbdev_log(ERR,
> > +				"Mismatch between mbuf length and
> included CB sizes: mbuf len %u, cb len %u",
> > +				*mbuf_total_left, in_length_in_bytes);
> > +		return -1;
> > +	}
> > +
> > +	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
> > +			in_length_in_bytes,
> > +			seg_total_left, next_triplet);
> > +	if (unlikely(next_triplet < 0)) {
> > +		rte_bbdev_log(ERR,
> > +				"Mismatch between data to process and
> mbuf data length in bbdev_op: %p",
> > +				op);
> > +		return -1;
> > +	}
> > +	desc->data_ptrs[next_triplet - 1].last = 1;
> > +	desc->m2dlen = next_triplet;
> > +	*mbuf_total_left -= in_length_in_bytes;
> 
> Updating output pointers should be deferred until the call is known to
> be successful.
> 
> Otherwise the caller is left in a bad, unknown state.

We already had to touch them by that point.

> 
> > +
> > +	/* Set output length */
> > +	/* Integer round up division by 8 */
> > +	*out_length = (enc->cb_params.e + 7) >> 3;
> > +
> > +	next_triplet = acc100_dma_fill_blk_type_out(desc, output,
> *out_offset,
> > +			*out_length, next_triplet,
> ACC100_DMA_BLKID_OUT_ENC);
> > +	if (unlikely(next_triplet < 0)) {
> > +		rte_bbdev_log(ERR,
> > +				"Mismatch between data to process and
> mbuf data length in bbdev_op: %p",
> > +				op);
> > +		return -1;
> > +	}
> > +	op->ldpc_enc.output.length += *out_length;
> > +	*out_offset += *out_length;
> > +	desc->data_ptrs[next_triplet - 1].last = 1;
> > +	desc->data_ptrs[next_triplet - 1].dma_ext = 0;
> > +	desc->d2mlen = next_triplet - desc->m2dlen;
> > +
> > +	desc->op_addr = op;
> > +
> > +	return 0;
> > +}
> > +
> > +static inline int
> > +acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
> > +		struct acc100_dma_req_desc *desc,
> > +		struct rte_mbuf **input, struct rte_mbuf *h_output,
> > +		uint32_t *in_offset, uint32_t *h_out_offset,
> > +		uint32_t *h_out_length, uint32_t *mbuf_total_left,
> > +		uint32_t *seg_total_left,
> > +		struct acc100_fcw_ld *fcw)
> > +{
> > +	struct rte_bbdev_op_ldpc_dec *dec = &op->ldpc_dec;
> > +	int next_triplet = 1; /* FCW already done */
> > +	uint32_t input_length;
> > +	uint16_t output_length, crc24_overlap = 0;
> > +	uint16_t sys_cols, K, h_p_size, h_np_size;
> > +	bool h_comp = check_bit(dec->op_flags,
> > +			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> > +
> > +	desc->word0 = ACC100_DMA_DESC_TYPE;
> > +	desc->word1 = 0; /**< Timestamp could be disabled */
> > +	desc->word2 = 0;
> > +	desc->word3 = 0;
> > +	desc->numCBs = 1;
> This seems to be common setup logic; maybe use a macro or inline
> function.

Fair enough.

> > +
> > +	if (check_bit(op->ldpc_dec.op_flags,
> > +			RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP))
> > +		crc24_overlap = 24;
> > +
> > +	/* Compute some LDPC BG lengths */
> > +	input_length = dec->cb_params.e;
> > +	if (check_bit(op->ldpc_dec.op_flags,
> > +			RTE_BBDEV_LDPC_LLR_COMPRESSION))
> > +		input_length = (input_length * 3 + 3) / 4;
> > +	sys_cols = (dec->basegraph == 1) ? 22 : 10;
> > +	K = sys_cols * dec->z_c;
> > +	output_length = K - dec->n_filler - crc24_overlap;
> > +
> > +	if (unlikely((*mbuf_total_left == 0) ||
> Similar to above, this check can be removed.

Same comment as above.

> > +			(*mbuf_total_left < input_length))) {
> > +		rte_bbdev_log(ERR,
> > +				"Mismatch between mbuf length and
> included CB sizes: mbuf len %u, cb len %u",
> > +				*mbuf_total_left, input_length);
> > +		return -1;
> > +	}
> > +
> > +	next_triplet = acc100_dma_fill_blk_type_in(desc, input,
> > +			in_offset, input_length,
> > +			seg_total_left, next_triplet);
> > +
> > +	if (unlikely(next_triplet < 0)) {
> > +		rte_bbdev_log(ERR,
> > +				"Mismatch between data to process and
> mbuf data length in bbdev_op: %p",
> > +				op);
> > +		return -1;
> > +	}
> > +
> > +	if (check_bit(op->ldpc_dec.op_flags,
> > +
> 	RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
> > +		h_p_size = fcw->hcin_size0 + fcw->hcin_size1;
> > +		if (h_comp)
> > +			h_p_size = (h_p_size * 3 + 3) / 4;
> > +		desc->data_ptrs[next_triplet].address =
> > +				dec->harq_combined_input.offset;
> > +		desc->data_ptrs[next_triplet].blen = h_p_size;
> > +		desc->data_ptrs[next_triplet].blkid =
> ACC100_DMA_BLKID_IN_HARQ;
> > +		desc->data_ptrs[next_triplet].dma_ext = 1;
> > +#ifndef ACC100_EXT_MEM
> > +		acc100_dma_fill_blk_type_out(
> > +				desc,
> > +				op->ldpc_dec.harq_combined_input.data,
> > +				op->ldpc_dec.harq_combined_input.offset,
> > +				h_p_size,
> > +				next_triplet,
> > +				ACC100_DMA_BLKID_IN_HARQ);
> > +#endif
> > +		next_triplet++;
> > +	}
> > +
> > +	desc->data_ptrs[next_triplet - 1].last = 1;
> > +	desc->m2dlen = next_triplet;
> > +	*mbuf_total_left -= input_length;
> > +
> > +	next_triplet = acc100_dma_fill_blk_type_out(desc, h_output,
> > +			*h_out_offset, output_length >> 3, next_triplet,
> > +			ACC100_DMA_BLKID_OUT_HARD);
> > +	if (unlikely(next_triplet < 0)) {
> > +		rte_bbdev_log(ERR,
> > +				"Mismatch between data to process and
> mbuf data length in bbdev_op: %p",
> > +				op);
> > +		return -1;
> > +	}
> > +
> > +	if (check_bit(op->ldpc_dec.op_flags,
> > +
> 	RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
> > +		/* Pruned size of the HARQ */
> > +		h_p_size = fcw->hcout_size0 + fcw->hcout_size1;
> > +		/* Non-Pruned size of the HARQ */
> > +		h_np_size = fcw->hcout_offset > 0 ?
> > +				fcw->hcout_offset + fcw->hcout_size1 :
> > +				h_p_size;
> > +		if (h_comp) {
> > +			h_np_size = (h_np_size * 3 + 3) / 4;
> > +			h_p_size = (h_p_size * 3 + 3) / 4;
> 
> * 4 -1 ) / 4
> 
> may produce better assembly.

That is not the same arithmetic.
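
For reference, the expression performs a rounded-up 3/4 scaling (8-bit LLRs packed into 6 bits), i.e. ceil(3 * h / 4) in integer arithmetic, which is not a plain ceiling division of h. A standalone sketch:

```c
#include <assert.h>
#include <stdint.h>

/* 6-bit compression scales a byte count by 3/4, rounded up:
 * (h * 3 + 3) / 4 == ceil(3 * h / 4) without floating point. */
static uint16_t compressed_size(uint16_t h)
{
	return (h * 3 + 3) / 4;
}
```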

> 
> > +		}
> > +		dec->harq_combined_output.length = h_np_size;
> > +		desc->data_ptrs[next_triplet].address =
> > +				dec->harq_combined_output.offset;
> > +		desc->data_ptrs[next_triplet].blen = h_p_size;
> > +		desc->data_ptrs[next_triplet].blkid =
> ACC100_DMA_BLKID_OUT_HARQ;
> > +		desc->data_ptrs[next_triplet].dma_ext = 1;
> > +#ifndef ACC100_EXT_MEM
> > +		acc100_dma_fill_blk_type_out(
> > +				desc,
> > +				dec->harq_combined_output.data,
> > +				dec->harq_combined_output.offset,
> > +				h_p_size,
> > +				next_triplet,
> > +				ACC100_DMA_BLKID_OUT_HARQ);
> > +#endif
> > +		next_triplet++;
> > +	}
> > +
> > +	*h_out_length = output_length >> 3;
> > +	dec->hard_output.length += *h_out_length;
> > +	*h_out_offset += *h_out_length;
> > +	desc->data_ptrs[next_triplet - 1].last = 1;
> > +	desc->d2mlen = next_triplet - desc->m2dlen;
> > +
> > +	desc->op_addr = op;
> > +
> > +	return 0;
> > +}
> > +
> > +static inline void
> > +acc100_dma_desc_ld_update(struct rte_bbdev_dec_op *op,
> > +		struct acc100_dma_req_desc *desc,
> > +		struct rte_mbuf *input, struct rte_mbuf *h_output,
> > +		uint32_t *in_offset, uint32_t *h_out_offset,
> > +		uint32_t *h_out_length,
> > +		union acc100_harq_layout_data *harq_layout)
> > +{
> > +	int next_triplet = 1; /* FCW already done */
> > +	desc->data_ptrs[next_triplet].address =
> > +			rte_pktmbuf_iova_offset(input, *in_offset);
> > +	next_triplet++;
> 
> There are no overflow checks on next_triplet.
> 
> This is a general problem.

I don't see the overflow risk.

> 
> > +
> > +	if (check_bit(op->ldpc_dec.op_flags,
> > +
> 	RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
> > +		struct rte_bbdev_op_data hi = op-
> >ldpc_dec.harq_combined_input;
> > +		desc->data_ptrs[next_triplet].address = hi.offset;
> > +#ifndef ACC100_EXT_MEM
> > +		desc->data_ptrs[next_triplet].address =
> > +				rte_pktmbuf_iova_offset(hi.data, hi.offset);
> > +#endif
> > +		next_triplet++;
> > +	}
> > +
> > +	desc->data_ptrs[next_triplet].address =
> > +			rte_pktmbuf_iova_offset(h_output, *h_out_offset);
> > +	*h_out_length = desc->data_ptrs[next_triplet].blen;
> > +	next_triplet++;
> > +
> > +	if (check_bit(op->ldpc_dec.op_flags,
> > +
> 	RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
> > +		desc->data_ptrs[next_triplet].address =
> > +				op->ldpc_dec.harq_combined_output.offset;
> > +		/* Adjust based on previous operation */
> > +		struct rte_bbdev_dec_op *prev_op = desc->op_addr;
> > +		op->ldpc_dec.harq_combined_output.length =
> > +				prev_op-
> >ldpc_dec.harq_combined_output.length;
> > +		int16_t hq_idx = op-
> >ldpc_dec.harq_combined_output.offset /
> > +				ACC100_HARQ_OFFSET;
> > +		int16_t prev_hq_idx =
> > +				prev_op-
> >ldpc_dec.harq_combined_output.offset
> > +				/ ACC100_HARQ_OFFSET;
> > +		harq_layout[hq_idx].val = harq_layout[prev_hq_idx].val;
> > +#ifndef ACC100_EXT_MEM
> > +		struct rte_bbdev_op_data ho =
> > +				op->ldpc_dec.harq_combined_output;
> > +		desc->data_ptrs[next_triplet].address =
> > +				rte_pktmbuf_iova_offset(ho.data, ho.offset);
> > +#endif
> > +		next_triplet++;
> > +	}
> > +
> > +	op->ldpc_dec.hard_output.length += *h_out_length;
> > +	desc->op_addr = op;
> > +}
> > +
> > +
> > +/* Enqueue a number of operations to HW and update software rings */
> > +static inline void
> > +acc100_dma_enqueue(struct acc100_queue *q, uint16_t n,
> > +		struct rte_bbdev_stats *queue_stats)
> > +{
> > +	union acc100_enqueue_reg_fmt enq_req;
> > +#ifdef RTE_BBDEV_OFFLOAD_COST
> > +	uint64_t start_time = 0;
> > +	queue_stats->acc_offload_cycles = 0;
> > +	RTE_SET_USED(queue_stats);
> > +#else
> > +	RTE_SET_USED(queue_stats);
> > +#endif
> 
> RTE_SET_USED(...) is common to both branches of the #ifdef/#else,
> 
> so it should be moved out.

ok
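
The hoist could be sketched as follows, with stand-in macro definitions and the logic reduced to a testable helper (names are illustrative, not the driver's):

```c
#include <assert.h>
#include <stdint.h>

#define RTE_SET_USED(x) (void)(x)  /* stand-in for the DPDK macro */
#define RTE_BBDEV_OFFLOAD_COST     /* assume offload-cost measurement on */

/* After hoisting, the statement common to both configurations appears
 * once, outside the #ifdef; only the conditional work stays inside. */
static uint64_t init_offload_cycles(uint64_t acc_offload_cycles)
{
	RTE_SET_USED(acc_offload_cycles);
#ifdef RTE_BBDEV_OFFLOAD_COST
	acc_offload_cycles = 0;
#endif
	return acc_offload_cycles;
}
```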

> 
> > +
> > +	enq_req.val = 0;
> > +	/* Setting offset, 100b for 256 DMA Desc */
> > +	enq_req.addr_offset = ACC100_DESC_OFFSET;
> > +
> should n != 0 be checked here ?

This is all checked before that point. 

> > +	/* Split ops into batches */
> > +	do {
> > +		union acc100_dma_desc *desc;
> > +		uint16_t enq_batch_size;
> > +		uint64_t offset;
> > +		rte_iova_t req_elem_addr;
> > +
> > +		enq_batch_size = RTE_MIN(n, MAX_ENQ_BATCH_SIZE);
> > +
> > +		/* Set flag on last descriptor in a batch */
> > +		desc = q->ring_addr + ((q->sw_ring_head + enq_batch_size -
> 1) &
> > +				q->sw_ring_wrap_mask);
> > +		desc->req.last_desc_in_batch = 1;
> > +
> > +		/* Calculate the 1st descriptor's address */
> > +		offset = ((q->sw_ring_head & q->sw_ring_wrap_mask) *
> > +				sizeof(union acc100_dma_desc));
> > +		req_elem_addr = q->ring_addr_phys + offset;
> > +
> > +		/* Fill enqueue struct */
> > +		enq_req.num_elem = enq_batch_size;
> > +		/* low 6 bits are not needed */
> > +		enq_req.req_elem_addr = (uint32_t)(req_elem_addr >> 6);
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +		rte_memdump(stderr, "Req sdone", desc, sizeof(*desc));
> > +#endif
> > +		rte_bbdev_log_debug(
> > +				"Enqueue %u reqs (phys %#"PRIx64") to reg
> %p",
> > +				enq_batch_size,
> > +				req_elem_addr,
> > +				(void *)q->mmio_reg_enqueue);
> > +
> > +		rte_wmb();
> > +
> > +#ifdef RTE_BBDEV_OFFLOAD_COST
> > +		/* Start time measurement for enqueue function offload. */
> > +		start_time = rte_rdtsc_precise();
> > +#endif
> > +		rte_bbdev_log(DEBUG, "Debug : MMIO Enqueue");
> 
> The logging time will be tracked together with the mmio_write,
> 
> so the logging should be moved above the start_time setting.

Not required. Running with debug traces is expected to make real-time offload measurement irrelevant.

> 
> > +		mmio_write(q->mmio_reg_enqueue, enq_req.val);
> > +
> > +#ifdef RTE_BBDEV_OFFLOAD_COST
> > +		queue_stats->acc_offload_cycles +=
> > +				rte_rdtsc_precise() - start_time;
> > +#endif
> > +
> > +		q->aq_enqueued++;
> > +		q->sw_ring_head += enq_batch_size;
> > +		n -= enq_batch_size;
> > +
> > +	} while (n);
> > +
> > +
> > +}
> > +
> > +/* Enqueue one encode operations for ACC100 device in CB mode */
> > +static inline int
> > +enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct
> rte_bbdev_enc_op **ops,
> > +		uint16_t total_enqueued_cbs, int16_t num)
> > +{
> > +	union acc100_dma_desc *desc = NULL;
> > +	uint32_t out_length;
> > +	struct rte_mbuf *output_head, *output;
> > +	int i, next_triplet;
> > +	uint16_t  in_length_in_bytes;
> > +	struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
> > +
> > +	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> > +			& q->sw_ring_wrap_mask);
> > +	desc = q->ring_addr + desc_idx;
> > +	acc100_fcw_le_fill(ops[0], &desc->req.fcw_le, num);
> > +
> > +	/** This could be done at polling */
> > +	desc->req.word0 = ACC100_DMA_DESC_TYPE;
> > +	desc->req.word1 = 0; /**< Timestamp could be disabled */
> > +	desc->req.word2 = 0;
> > +	desc->req.word3 = 0;
> > +	desc->req.numCBs = num;
> > +
> > +	in_length_in_bytes = ops[0]->ldpc_enc.input.data->data_len;
> > +	out_length = (enc->cb_params.e + 7) >> 3;
> > +	desc->req.m2dlen = 1 + num;
> > +	desc->req.d2mlen = num;
> > +	next_triplet = 1;
> > +
> > +	for (i = 0; i < num; i++) {
> i is not needed here, it is next_triplet - 1

That would impact readability, as these refer to different concepts (code blocks and BDESCs).
Would keep as is.

> > +		desc->req.data_ptrs[next_triplet].address =
> > +			rte_pktmbuf_iova_offset(ops[i]->ldpc_enc.input.data, 0);
> > +		desc->req.data_ptrs[next_triplet].blen = in_length_in_bytes;
> > +		next_triplet++;
> > +		desc->req.data_ptrs[next_triplet].address =
> > +				rte_pktmbuf_iova_offset(
> > +				ops[i]->ldpc_enc.output.data, 0);
> > +		desc->req.data_ptrs[next_triplet].blen = out_length;
> > +		next_triplet++;
> > +		ops[i]->ldpc_enc.output.length = out_length;
> > +		output_head = output = ops[i]->ldpc_enc.output.data;
> > +		mbuf_append(output_head, output, out_length);
> > +		output->data_len = out_length;
> > +	}
> > +
> > +	desc->req.op_addr = ops[0];
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +	rte_memdump(stderr, "FCW", &desc->req.fcw_le,
> > +			sizeof(desc->req.fcw_le) - 8);
> > +	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> > +#endif
> > +
> > +	/* One CB (one op) was successfully prepared to enqueue */
> > +	return num;
> 
> caller does not use num, only check if < 0
> 
> So could change to return 0

would keep as is for debug

> 
> > +}
> > +
> > +/* Enqueue one encode operation for ACC100 device in CB mode */
> > +static inline int
> > +enqueue_ldpc_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
> > +		uint16_t total_enqueued_cbs)
> 
> rte_fpga_5gnr_fec.c has this same function.  It would be good if common
> functions could be collected and used to stabilize the internal bbdev
> interface.
> 
> This is general issue

This is true for some parts of the code, and noted.
In this very case they are distinct implementations with HW specifics,
but agreed to look into such refactoring later on.

> 
> > +{
> > +	union acc100_dma_desc *desc = NULL;
> > +	int ret;
> > +	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
> > +		seg_total_left;
> > +	struct rte_mbuf *input, *output_head, *output;
> > +
> > +	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> > +			& q->sw_ring_wrap_mask);
> > +	desc = q->ring_addr + desc_idx;
> > +	acc100_fcw_le_fill(op, &desc->req.fcw_le, 1);
> > +
> > +	input = op->ldpc_enc.input.data;
> > +	output_head = output = op->ldpc_enc.output.data;
> > +	in_offset = op->ldpc_enc.input.offset;
> > +	out_offset = op->ldpc_enc.output.offset;
> > +	out_length = 0;
> > +	mbuf_total_left = op->ldpc_enc.input.length;
> > +	seg_total_left = rte_pktmbuf_data_len(op->ldpc_enc.input.data)
> > +			- in_offset;
> > +
> > +	ret = acc100_dma_desc_le_fill(op, &desc->req, &input, output,
> > +			&in_offset, &out_offset, &out_length, &mbuf_total_left,
> > +			&seg_total_left);
> > +
> > +	if (unlikely(ret < 0))
> > +		return ret;
> > +
> > +	mbuf_append(output_head, output, out_length);
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +	rte_memdump(stderr, "FCW", &desc->req.fcw_le,
> > +			sizeof(desc->req.fcw_le) - 8);
> > +	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> > +
> > +	/* Check if any data left after processing one CB */
> > +	if (mbuf_total_left != 0) {
> > +		rte_bbdev_log(ERR,
> > +				"Some data still left after processing one CB: mbuf_total_left = %u",
> > +				mbuf_total_left);
> > +		return -EINVAL;
> > +	}
> > +#endif
> > +	/* One CB (one op) was successfully prepared to enqueue */
> > +	return 1;
> 
> Another case where caller only check for < 0
> 
> Consider changes all similar to return 0 on success.

same comment as above, would keep as is. 

> 
> > +}
> > +
> > +/* Enqueue one decode operation for ACC100 device in CB mode */
> > +static inline int
> > +enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
> > +		uint16_t total_enqueued_cbs, bool same_op)
> > +{
> > +	int ret;
> > +
> > +	union acc100_dma_desc *desc;
> > +	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> > +			& q->sw_ring_wrap_mask);
> > +	desc = q->ring_addr + desc_idx;
> > +	struct rte_mbuf *input, *h_output_head, *h_output;
> > +	uint32_t in_offset, h_out_offset, mbuf_total_left, h_out_length = 0;
> > +	input = op->ldpc_dec.input.data;
> > +	h_output_head = h_output = op->ldpc_dec.hard_output.data;
> > +	in_offset = op->ldpc_dec.input.offset;
> > +	h_out_offset = op->ldpc_dec.hard_output.offset;
> > +	mbuf_total_left = op->ldpc_dec.input.length;
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +	if (unlikely(input == NULL)) {
> > +		rte_bbdev_log(ERR, "Invalid mbuf pointer");
> > +		return -EFAULT;
> > +	}
> > +#endif
> > +	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
> > +
> > +	if (same_op) {
> > +		union acc100_dma_desc *prev_desc;
> > +		desc_idx = ((q->sw_ring_head + total_enqueued_cbs - 1)
> > +				& q->sw_ring_wrap_mask);
> > +		prev_desc = q->ring_addr + desc_idx;
> > +		uint8_t *prev_ptr = (uint8_t *) prev_desc;
> > +		uint8_t *new_ptr = (uint8_t *) desc;
> > +		/* Copy first 4 words and BDESCs */
> > +		rte_memcpy(new_ptr, prev_ptr, 16);
> > +		rte_memcpy(new_ptr + 36, prev_ptr + 36, 40);
> These magic numbers should be #defines

yes
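As a sketch of the agreed change, the literals 16, 36 and 40 could become named constants. The names and the byte layout below are assumptions for illustration, not the driver's real definitions:

```c
#include <string.h>
#include <stdint.h>

/* Hypothetical named constants replacing the magic numbers in the
 * same_op descriptor copy. The exact layout is an assumption here. */
#define ACC100_DESC_HDR_LEN	16	/* first 4 words of the descriptor */
#define ACC100_BDESC_OFFSET	36	/* start of the BDESC triplets     */
#define ACC100_BDESC_LEN	40	/* length of the BDESC region      */

static void
copy_desc_template(uint8_t *new_ptr, const uint8_t *prev_ptr)
{
	/* Copy first 4 words and BDESCs, leaving the bytes in between */
	memcpy(new_ptr, prev_ptr, ACC100_DESC_HDR_LEN);
	memcpy(new_ptr + ACC100_BDESC_OFFSET, prev_ptr + ACC100_BDESC_OFFSET,
			ACC100_BDESC_LEN);
}
```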

> > +		desc->req.op_addr = prev_desc->req.op_addr;
> > +		/* Copy FCW */
> > +		rte_memcpy(new_ptr + ACC100_DESC_FCW_OFFSET,
> > +				prev_ptr + ACC100_DESC_FCW_OFFSET,
> > +				ACC100_FCW_LD_BLEN);
> > +		acc100_dma_desc_ld_update(op, &desc->req, input, h_output,
> > +				&in_offset, &h_out_offset,
> > +				&h_out_length, harq_layout);
> > +	} else {
> > +		struct acc100_fcw_ld *fcw;
> > +		uint32_t seg_total_left;
> > +		fcw = &desc->req.fcw_ld;
> > +		acc100_fcw_ld_fill(op, fcw, harq_layout);
> > +
> > +		/* Special handling when overusing mbuf */
> > +		if (fcw->rm_e < MAX_E_MBUF)
> > +			seg_total_left = rte_pktmbuf_data_len(input)
> > +					- in_offset;
> > +		else
> > +			seg_total_left = fcw->rm_e;
> > +
> > +		ret = acc100_dma_desc_ld_fill(op, &desc->req, &input, h_output,
> > +				&in_offset, &h_out_offset,
> > +				&h_out_length, &mbuf_total_left,
> > +				&seg_total_left, fcw);
> > +		if (unlikely(ret < 0))
> > +			return ret;
> > +	}
> > +
> > +	/* Hard output */
> > +	mbuf_append(h_output_head, h_output, h_out_length);
> > +#ifndef ACC100_EXT_MEM
> > +	if (op->ldpc_dec.harq_combined_output.length > 0) {
> > +		/* Push the HARQ output into host memory */
> > +		struct rte_mbuf *hq_output_head, *hq_output;
> > +		hq_output_head = op->ldpc_dec.harq_combined_output.data;
> > +		hq_output = op->ldpc_dec.harq_combined_output.data;
> > +		mbuf_append(hq_output_head, hq_output,
> > +				op->ldpc_dec.harq_combined_output.length);
> > +	}
> > +#endif
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +	rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
> > +			sizeof(desc->req.fcw_ld) - 8);
> > +	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> > +#endif
> > +
> > +	/* One CB (one op) was successfully prepared to enqueue */
> > +	return 1;
> > +}
> > +
> > +
> > +/* Enqueue one decode operation for ACC100 device in TB mode */
> > +static inline int
> > +enqueue_ldpc_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
> > +		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
> > +{
> > +	union acc100_dma_desc *desc = NULL;
> > +	int ret;
> > +	uint8_t r, c;
> > +	uint32_t in_offset, h_out_offset,
> > +		h_out_length, mbuf_total_left, seg_total_left;
> > +	struct rte_mbuf *input, *h_output_head, *h_output;
> > +	uint16_t current_enqueued_cbs = 0;
> > +
> > +	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> > +			& q->sw_ring_wrap_mask);
> > +	desc = q->ring_addr + desc_idx;
> > +	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
> > +	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
> > +	acc100_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout);
> > +
> > +	input = op->ldpc_dec.input.data;
> > +	h_output_head = h_output = op->ldpc_dec.hard_output.data;
> > +	in_offset = op->ldpc_dec.input.offset;
> > +	h_out_offset = op->ldpc_dec.hard_output.offset;
> > +	h_out_length = 0;
> > +	mbuf_total_left = op->ldpc_dec.input.length;
> > +	c = op->ldpc_dec.tb_params.c;
> > +	r = op->ldpc_dec.tb_params.r;
> > +
> > +	while (mbuf_total_left > 0 && r < c) {
> > +
> > +		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
> > +
> > +		/* Set up DMA descriptor */
> > +		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
> > +				& q->sw_ring_wrap_mask);
> > +		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
> > +		desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
> > +		ret = acc100_dma_desc_ld_fill(op, &desc->req, &input,
> > +				h_output, &in_offset, &h_out_offset,
> > +				&h_out_length,
> > +				&mbuf_total_left, &seg_total_left,
> > +				&desc->req.fcw_ld);
> > +
> > +		if (unlikely(ret < 0))
> > +			return ret;
> > +
> > +		/* Hard output */
> > +		mbuf_append(h_output_head, h_output, h_out_length);
> > +
> > +		/* Set total number of CBs in TB */
> > +		desc->req.cbs_in_tb = cbs_in_tb;
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +		rte_memdump(stderr, "FCW", &desc->req.fcw_td,
> > +				sizeof(desc->req.fcw_td) - 8);
> > +		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> > +#endif
> > +
> > +		if (seg_total_left == 0) {
> > +			/* Go to the next mbuf */
> > +			input = input->next;
> > +			in_offset = 0;
> > +			h_output = h_output->next;
> > +			h_out_offset = 0;
> > +		}
> > +		total_enqueued_cbs++;
> > +		current_enqueued_cbs++;
> > +		r++;
> > +	}
> > +
> > +	if (unlikely(desc == NULL))
> How is this possible ? desc has be dereferenced already.

Related to static code analysis; arguably a false alarm.

> > +		return current_enqueued_cbs;
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +	/* Check if any CBs left for processing */
> > +	if (mbuf_total_left != 0) {
> > +		rte_bbdev_log(ERR,
> > +				"Some data still left for processing: mbuf_total_left = %u",
> > +				mbuf_total_left);
> > +		return -EINVAL;
> > +	}
> > +#endif
> > +	/* Set SDone on last CB descriptor for TB mode */
> > +	desc->req.sdone_enable = 1;
> > +	desc->req.irq_enable = q->irq_enable;
> > +
> > +	return current_enqueued_cbs;
> > +}
> > +
> > +
> > +/* Calculates number of CBs in processed encoder TB based on 'r' and input
> > + * length.
> > + */
> > +static inline uint8_t
> > +get_num_cbs_in_tb_enc(struct rte_bbdev_op_turbo_enc *turbo_enc)
> > +{
> > +	uint8_t c, c_neg, r, crc24_bits = 0;
> > +	uint16_t k, k_neg, k_pos;
> > +	uint8_t cbs_in_tb = 0;
> > +	int32_t length;
> > +
> > +	length = turbo_enc->input.length;
> > +	r = turbo_enc->tb_params.r;
> > +	c = turbo_enc->tb_params.c;
> > +	c_neg = turbo_enc->tb_params.c_neg;
> > +	k_neg = turbo_enc->tb_params.k_neg;
> > +	k_pos = turbo_enc->tb_params.k_pos;
> > +	crc24_bits = 0;
> > +	if (check_bit(turbo_enc->op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH))
> > +		crc24_bits = 24;
> > +	while (length > 0 && r < c) {
> > +		k = (r < c_neg) ? k_neg : k_pos;
> > +		length -= (k - crc24_bits) >> 3;
> > +		r++;
> > +		cbs_in_tb++;
> > +	}
> > +
> > +	return cbs_in_tb;
> > +}
> > +
> > +/* Calculates number of CBs in processed decoder TB based on 'r' and input
> > + * length.
> > + */
> > +static inline uint16_t
> > +get_num_cbs_in_tb_dec(struct rte_bbdev_op_turbo_dec *turbo_dec)
> > +{
> > +	uint8_t c, c_neg, r = 0;
> > +	uint16_t kw, k, k_neg, k_pos, cbs_in_tb = 0;
> > +	int32_t length;
> > +
> > +	length = turbo_dec->input.length;
> > +	r = turbo_dec->tb_params.r;
> > +	c = turbo_dec->tb_params.c;
> > +	c_neg = turbo_dec->tb_params.c_neg;
> > +	k_neg = turbo_dec->tb_params.k_neg;
> > +	k_pos = turbo_dec->tb_params.k_pos;
> > +	while (length > 0 && r < c) {
> > +		k = (r < c_neg) ? k_neg : k_pos;
> > +		kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
> > +		length -= kw;
> > +		r++;
> > +		cbs_in_tb++;
> > +	}
> > +
> > +	return cbs_in_tb;
> > +}
> > +
> > +/* Calculates number of CBs in processed decoder TB based on 'r' and input
> > + * length.
> > + */
> > +static inline uint16_t
> > +get_num_cbs_in_tb_ldpc_dec(struct rte_bbdev_op_ldpc_dec *ldpc_dec)
> > +{
> > +	uint16_t r, cbs_in_tb = 0;
> > +	int32_t length = ldpc_dec->input.length;
> > +	r = ldpc_dec->tb_params.r;
> > +	while (length > 0 && r < ldpc_dec->tb_params.c) {
> > +		length -=  (r < ldpc_dec->tb_params.cab) ?
> > +				ldpc_dec->tb_params.ea :
> > +				ldpc_dec->tb_params.eb;
> > +		r++;
> > +		cbs_in_tb++;
> > +	}
> > +	return cbs_in_tb;
> > +}
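The TB-to-CB accounting above is simple enough to exercise in isolation. The struct below is a minimal stand-in for the tb_params fields of rte_bbdev_op_ldpc_dec, not the real bbdev type:

```c
#include <stdint.h>

/* Simplified model of the LDPC decode TB parameters used by the walk:
 * the first 'cab' code blocks consume 'ea' bytes each, the rest 'eb'. */
struct ldpc_tb_params {
	uint16_t r, c, cab;
	uint32_t ea, eb;
};

static uint16_t
get_num_cbs_in_tb(const struct ldpc_tb_params *tb, int32_t length)
{
	uint16_t r = tb->r, cbs_in_tb = 0;

	while (length > 0 && r < tb->c) {
		length -= (r < tb->cab) ? (int32_t)tb->ea : (int32_t)tb->eb;
		r++;
		cbs_in_tb++;
	}
	return cbs_in_tb;
}
```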
> > +
> > +/* Check we can mux encode operations with common FCW */
> > +static inline bool
> > +check_mux(struct rte_bbdev_enc_op **ops, uint16_t num) {
> > +	uint16_t i;
> > +	if (num == 1)
> > +		return false;
> likely should strengthen check to num <= 1

No functional impact, but it doesn't hurt to change. OK.
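A sketch of check_mux() with the strengthened guard. The struct, offsets and sizes are stand-ins for the driver's op layout (the real code compares a window of the ldpc_enc fields starting at ENC_OFFSET):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in op layout: one field that may differ between muxed ops,
 * and a window of fields that must match to share a common FCW. */
struct ldpc_enc_params {
	uint32_t mutable_part;	/* allowed to differ between muxed ops */
	uint32_t fcw_fields[4];	/* must match to share a common FCW    */
};

#define ENC_OFFSET offsetof(struct ldpc_enc_params, fcw_fields)
#define CMP_ENC_SIZE sizeof(((struct ldpc_enc_params *)0)->fcw_fields)

static bool
check_mux(struct ldpc_enc_params **ops, uint16_t num)
{
	uint16_t i;

	if (num <= 1)	/* strengthened guard: 0 or 1 op cannot mux */
		return false;
	for (i = 1; i < num; ++i)
		if (memcmp((uint8_t *)ops[i] + ENC_OFFSET,
				(uint8_t *)ops[0] + ENC_OFFSET,
				CMP_ENC_SIZE) != 0)
			return false;
	return true;
}
```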

> > +	for (i = 1; i < num; ++i) {
> > +		/* Only mux compatible code blocks */
> > +		if (memcmp((uint8_t *)(&ops[i]->ldpc_enc) + ENC_OFFSET,
> > +				(uint8_t *)(&ops[0]->ldpc_enc) + ENC_OFFSET,
> ops[0]->ldpc_enc should be hoisted out of loop as it is invariant.

The compiler takes care of this, I believe.

> > +				CMP_ENC_SIZE) != 0)
> > +			return false;
> > +	}
> > +	return true;
> > +}
> > +
> > +/** Enqueue encode operations for ACC100 device in CB mode. */
> > +static inline uint16_t
> > +acc100_enqueue_ldpc_enc_cb(struct rte_bbdev_queue_data *q_data,
> > +		struct rte_bbdev_enc_op **ops, uint16_t num)
> > +{
> > +	struct acc100_queue *q = q_data->queue_private;
> > +	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> > +	uint16_t i = 0;
> > +	union acc100_dma_desc *desc;
> > +	int ret, desc_idx = 0;
> > +	int16_t enq, left = num;
> > +
> > +	while (left > 0) {
> > +		if (unlikely(avail - 1 < 0))
> > +			break;
> > +		avail--;
> > +		enq = RTE_MIN(left, MUX_5GDL_DESC);
> > +		if (check_mux(&ops[i], enq)) {
> > +			ret = enqueue_ldpc_enc_n_op_cb(q, &ops[i],
> > +					desc_idx, enq);
> > +			if (ret < 0)
> > +				break;
> > +			i += enq;
> > +		} else {
> > +			ret = enqueue_ldpc_enc_one_op_cb(q, ops[i], desc_idx);
> > +			if (ret < 0)
> > +				break;
> failure is not handled well, what happens if this is one of serveral

the aim is to flag the error and move on 


> > +			i++;
> > +		}
> > +		desc_idx++;
> > +		left = num - i;
> > +	}
> > +
> > +	if (unlikely(i == 0))
> > +		return 0; /* Nothing to enqueue */
> this does not look correct for all cases

I'm missing your point here.

> > +
> > +	/* Set SDone in last CB in enqueued ops for CB mode*/
> > +	desc = q->ring_addr + ((q->sw_ring_head + desc_idx - 1)
> > +			& q->sw_ring_wrap_mask);
> > +	desc->req.sdone_enable = 1;
> > +	desc->req.irq_enable = q->irq_enable;
> > +
> > +	acc100_dma_enqueue(q, desc_idx, &q_data->queue_stats);
> > +
> > +	/* Update stats */
> > +	q_data->queue_stats.enqueued_count += i;
> > +	q_data->queue_stats.enqueue_err_count += num - i;
> > +
> > +	return i;
> > +}
> > +
> > +/* Enqueue encode operations for ACC100 device. */
> > +static uint16_t
> > +acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
> > +		struct rte_bbdev_enc_op **ops, uint16_t num)
> > +{
> > +	if (unlikely(num == 0))
> > +		return 0;
> Handling num == 0 should be in acc100_enqueue_ldpc_enc_cb

Why would it be better not to catch this early, at the user API call?

> > +	return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
> > +}
> > +
> > +/* Check we can mux decode operations with common FCW */
> > +static inline bool
> > +cmp_ldpc_dec_op(struct rte_bbdev_dec_op **ops) {
> > +	/* Only mux compatible code blocks */
> > +	if (memcmp((uint8_t *)(&ops[0]->ldpc_dec) + DEC_OFFSET,
> > +			(uint8_t *)(&ops[1]->ldpc_dec) +
> > +			DEC_OFFSET, CMP_DEC_SIZE) != 0) {
> > +		return false;
> > +	} else
> do not need the else, there are no other statements.

Debatable. Not considering a change unless that becomes a DPDK
coding guideline.
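For what it's worth, the if/else can also be avoided entirely by returning the comparison result directly, which sidesteps the style debate. DEC_OFFSET and CMP_DEC_SIZE below are illustrative stand-ins, not the driver's real constants:

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Stand-in values for the compared window of the decode op. */
#define DEC_OFFSET	8
#define CMP_DEC_SIZE	16

struct dec_op {
	uint8_t raw[32];
};

/* Equivalent of cmp_ldpc_dec_op() with the memcmp result returned
 * directly instead of via if/else. */
static bool
cmp_ldpc_dec_op(struct dec_op **ops)
{
	return memcmp((uint8_t *)ops[0] + DEC_OFFSET,
			(uint8_t *)ops[1] + DEC_OFFSET, CMP_DEC_SIZE) == 0;
}
```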

> > +		return true;
> > +}
> > +
> > +
> > +/* Enqueue decode operations for ACC100 device in TB mode */
> > +static uint16_t
> > +acc100_enqueue_ldpc_dec_tb(struct rte_bbdev_queue_data *q_data,
> > +		struct rte_bbdev_dec_op **ops, uint16_t num)
> > +{
> > +	struct acc100_queue *q = q_data->queue_private;
> > +	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> > +	uint16_t i, enqueued_cbs = 0;
> > +	uint8_t cbs_in_tb;
> > +	int ret;
> > +
> > +	for (i = 0; i < num; ++i) {
> > +		cbs_in_tb = get_num_cbs_in_tb_ldpc_dec(&ops[i]->ldpc_dec);
> > +		/* Check if there are available space for further processing */
> > +		if (unlikely(avail - cbs_in_tb < 0))
> > +			break;
> > +		avail -= cbs_in_tb;
> > +
> > +		ret = enqueue_ldpc_dec_one_op_tb(q, ops[i],
> > +				enqueued_cbs, cbs_in_tb);
> > +		if (ret < 0)
> > +			break;
> > +		enqueued_cbs += ret;
> > +	}
> > +
> > +	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
> > +
> > +	/* Update stats */
> > +	q_data->queue_stats.enqueued_count += i;
> > +	q_data->queue_stats.enqueue_err_count += num - i;
> > +	return i;
> > +}
> > +
> > +/* Enqueue decode operations for ACC100 device in CB mode */
> > +static uint16_t
> > +acc100_enqueue_ldpc_dec_cb(struct rte_bbdev_queue_data *q_data,
> > +		struct rte_bbdev_dec_op **ops, uint16_t num)
> > +{
> > +	struct acc100_queue *q = q_data->queue_private;
> > +	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> > +	uint16_t i;
> > +	union acc100_dma_desc *desc;
> > +	int ret;
> > +	bool same_op = false;
> > +	for (i = 0; i < num; ++i) {
> > +		/* Check if there are available space for further processing */
> > +		if (unlikely(avail - 1 < 0))
> 
> change to (avail < 1)
> 
> Generally.

ok

> 
> > +			break;
> > +		avail -= 1;
> > +
> > +		if (i > 0)
> > +			same_op = cmp_ldpc_dec_op(&ops[i-1]);
> > +		rte_bbdev_log(INFO, "Op %d %d %d %d %d %d %d %d %d %d %d %d\n",
> > +			i, ops[i]->ldpc_dec.op_flags, ops[i]->ldpc_dec.rv_index,
> > +			ops[i]->ldpc_dec.iter_max, ops[i]->ldpc_dec.iter_count,
> > +			ops[i]->ldpc_dec.basegraph, ops[i]->ldpc_dec.z_c,
> > +			ops[i]->ldpc_dec.n_cb, ops[i]->ldpc_dec.q_m,
> > +			ops[i]->ldpc_dec.n_filler, ops[i]->ldpc_dec.cb_params.e,
> > +			same_op);
> > +		ret = enqueue_ldpc_dec_one_op_cb(q, ops[i], i, same_op);
> > +		if (ret < 0)
> > +			break;
> > +	}
> > +
> > +	if (unlikely(i == 0))
> > +		return 0; /* Nothing to enqueue */
> > +
> > +	/* Set SDone in last CB in enqueued ops for CB mode*/
> > +	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
> > +			& q->sw_ring_wrap_mask);
> > +
> > +	desc->req.sdone_enable = 1;
> > +	desc->req.irq_enable = q->irq_enable;
> > +
> > +	acc100_dma_enqueue(q, i, &q_data->queue_stats);
> > +
> > +	/* Update stats */
> > +	q_data->queue_stats.enqueued_count += i;
> > +	q_data->queue_stats.enqueue_err_count += num - i;
> > +	return i;
> > +}
> > +
> > +/* Enqueue decode operations for ACC100 device. */
> > +static uint16_t
> > +acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
> > +		struct rte_bbdev_dec_op **ops, uint16_t num)
> > +{
> > +	struct acc100_queue *q = q_data->queue_private;
> > +	int32_t aq_avail = q->aq_depth +
> > +			(q->aq_dequeued - q->aq_enqueued) / 128;
> > +
> > +	if (unlikely((aq_avail == 0) || (num == 0)))
> > +		return 0;
> > +
> > +	if (ops[0]->ldpc_dec.code_block_mode == 0)
> > +		return acc100_enqueue_ldpc_dec_tb(q_data, ops, num);
> > +	else
> > +		return acc100_enqueue_ldpc_dec_cb(q_data, ops, num);
> > +}
> > +
> > +
> > +/* Dequeue one encode operation from ACC100 device in CB mode */
> > +static inline int
> > +dequeue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
> > +		uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
> > +{
> > +	union acc100_dma_desc *desc, atom_desc;
> > +	union acc100_dma_rsp_desc rsp;
> > +	struct rte_bbdev_enc_op *op;
> > +	int i;
> > +
> > +	desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
> > +			& q->sw_ring_wrap_mask);
> > +	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> > +			__ATOMIC_RELAXED);
> > +
> > +	/* Check fdone bit */
> > +	if (!(atom_desc.rsp.val & ACC100_FDONE))
> > +		return -1;
> > +
> > +	rsp.val = atom_desc.rsp.val;
> > +	rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
> > +
> > +	/* Dequeue */
> > +	op = desc->req.op_addr;
> > +
> > +	/* Clearing status, it will be set based on response */
> > +	op->status = 0;
> > +
> > +	op->status |= ((rsp.input_err)
> > +			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> can remove the = 0, if |= is changed to =

yes in principle, but easy to break by mistake, so would keep. 

> > +	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> > +	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> > +
> > +	if (desc->req.last_desc_in_batch) {
> > +		(*aq_dequeued)++;
> > +		desc->req.last_desc_in_batch = 0;
> > +	}
> > +	desc->rsp.val = ACC100_DMA_DESC_TYPE;
> > +	desc->rsp.add_info_0 = 0; /*Reserved bits */
> > +	desc->rsp.add_info_1 = 0; /*Reserved bits */
> > +
> > +	/* Flag that the muxing cause loss of opaque data */
> > +	op->opaque_data = (void *)-1;
> as a ptr, shouldn't opaque_data be poisoned with '0' ?

more obvious this way I think. 

> > +	for (i = 0 ; i < desc->req.numCBs; i++)
> > +		ref_op[i] = op;
> > +
> > +	/* One CB (op) was successfully dequeued */
> > +	return desc->req.numCBs;
> > +}
> > +
> > +/* Dequeue one encode operation from ACC100 device in TB mode */
> > +static inline int
> > +dequeue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
> > +		uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
> > +{
> > +	union acc100_dma_desc *desc, *last_desc, atom_desc;
> > +	union acc100_dma_rsp_desc rsp;
> > +	struct rte_bbdev_enc_op *op;
> > +	uint8_t i = 0;
> > +	uint16_t current_dequeued_cbs = 0, cbs_in_tb;
> > +
> > +	desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
> > +			& q->sw_ring_wrap_mask);
> > +	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> > +			__ATOMIC_RELAXED);
> > +
> > +	/* Check fdone bit */
> > +	if (!(atom_desc.rsp.val & ACC100_FDONE))
> > +		return -1;
> > +
> > +	/* Get number of CBs in dequeued TB */
> > +	cbs_in_tb = desc->req.cbs_in_tb;
> > +	/* Get last CB */
> > +	last_desc = q->ring_addr + ((q->sw_ring_tail
> > +			+ total_dequeued_cbs + cbs_in_tb - 1)
> > +			& q->sw_ring_wrap_mask);
> > +	/* Check if last CB in TB is ready to dequeue (and thus
> > +	 * the whole TB) - checking sdone bit. If not return.
> > +	 */
> > +	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
> > +			__ATOMIC_RELAXED);
> > +	if (!(atom_desc.rsp.val & ACC100_SDONE))
> > +		return -1;
> > +
> > +	/* Dequeue */
> > +	op = desc->req.op_addr;
> > +
> > +	/* Clearing status, it will be set based on response */
> > +	op->status = 0;
> > +
> > +	while (i < cbs_in_tb) {
> > +		desc = q->ring_addr + ((q->sw_ring_tail
> > +				+ total_dequeued_cbs)
> > +				& q->sw_ring_wrap_mask);
> > +		atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> > +				__ATOMIC_RELAXED);
> > +		rsp.val = atom_desc.rsp.val;
> > +		rte_bbdev_log_debug("Resp. desc %p: %x", desc,
> > +				rsp.val);
> > +
> > +		op->status |= ((rsp.input_err)
> > +				? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> > +		op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> > +		op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> > +
> > +		if (desc->req.last_desc_in_batch) {
> > +			(*aq_dequeued)++;
> > +			desc->req.last_desc_in_batch = 0;
> > +		}
> > +		desc->rsp.val = ACC100_DMA_DESC_TYPE;
> > +		desc->rsp.add_info_0 = 0;
> > +		desc->rsp.add_info_1 = 0;
> > +		total_dequeued_cbs++;
> > +		current_dequeued_cbs++;
> > +		i++;
> > +	}
> > +
> > +	*ref_op = op;
> > +
> > +	return current_dequeued_cbs;
> > +}
> > +
> > +/* Dequeue one decode operation from ACC100 device in CB mode */
> > +static inline int
> > +dequeue_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
> > +		struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
> > +		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> > +{
> > +	union acc100_dma_desc *desc, atom_desc;
> > +	union acc100_dma_rsp_desc rsp;
> > +	struct rte_bbdev_dec_op *op;
> > +
> > +	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> > +			& q->sw_ring_wrap_mask);
> > +	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> > +			__ATOMIC_RELAXED);
> > +
> > +	/* Check fdone bit */
> > +	if (!(atom_desc.rsp.val & ACC100_FDONE))
> > +		return -1;
> > +
> > +	rsp.val = atom_desc.rsp.val;
> > +	rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
> > +
> > +	/* Dequeue */
> > +	op = desc->req.op_addr;
> > +
> > +	/* Clearing status, it will be set based on response */
> > +	op->status = 0;
> > +	op->status |= ((rsp.input_err)
> > +			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> 
> similar to above, can remove the = 0
> 
> This is a general issue.

Same comment as above.

> 
> > +	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> > +	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> > +	if (op->status != 0)
> > +		q_data->queue_stats.dequeue_err_count++;
> > +
> > +	/* CRC invalid if error exists */
> > +	if (!op->status)
> > +		op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
> > +	op->turbo_dec.iter_count = (uint8_t) rsp.iter_cnt / 2;
> > +	/* Check if this is the last desc in batch (Atomic Queue) */
> > +	if (desc->req.last_desc_in_batch) {
> > +		(*aq_dequeued)++;
> > +		desc->req.last_desc_in_batch = 0;
> > +	}
> > +	desc->rsp.val = ACC100_DMA_DESC_TYPE;
> > +	desc->rsp.add_info_0 = 0;
> > +	desc->rsp.add_info_1 = 0;
> > +	*ref_op = op;
> > +
> > +	/* One CB (op) was successfully dequeued */
> > +	return 1;
> > +}
> > +
> > +/* Dequeue one decode operation from ACC100 device in CB mode */
> > +static inline int
> > +dequeue_ldpc_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
> > +		struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
> > +		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> > +{
> > +	union acc100_dma_desc *desc, atom_desc;
> > +	union acc100_dma_rsp_desc rsp;
> > +	struct rte_bbdev_dec_op *op;
> > +
> > +	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> > +			& q->sw_ring_wrap_mask);
> > +	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> > +			__ATOMIC_RELAXED);
> > +
> > +	/* Check fdone bit */
> > +	if (!(atom_desc.rsp.val & ACC100_FDONE))
> > +		return -1;
> > +
> > +	rsp.val = atom_desc.rsp.val;
> > +
> > +	/* Dequeue */
> > +	op = desc->req.op_addr;
> > +
> > +	/* Clearing status, it will be set based on response */
> > +	op->status = 0;
> > +	op->status |= rsp.input_err << RTE_BBDEV_DATA_ERROR;
> > +	op->status |= rsp.dma_err << RTE_BBDEV_DRV_ERROR;
> > +	op->status |= rsp.fcw_err << RTE_BBDEV_DRV_ERROR;
> > +	if (op->status != 0)
> > +		q_data->queue_stats.dequeue_err_count++;
> > +
> > +	op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
> > +	if (op->ldpc_dec.hard_output.length > 0 && !rsp.synd_ok)
> > +		op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
> > +	op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
> > +
> > +	/* Check if this is the last desc in batch (Atomic Queue) */
> > +	if (desc->req.last_desc_in_batch) {
> > +		(*aq_dequeued)++;
> > +		desc->req.last_desc_in_batch = 0;
> > +	}
> > +
> > +	desc->rsp.val = ACC100_DMA_DESC_TYPE;
> > +	desc->rsp.add_info_0 = 0;
> > +	desc->rsp.add_info_1 = 0;
> > +
> > +	*ref_op = op;
> > +
> > +	/* One CB (op) was successfully dequeued */
> > +	return 1;
> > +}
> > +
> > +/* Dequeue one decode operation from ACC100 device in TB mode. */
> > +static inline int
> > +dequeue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
> > +		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> > +{
> similar call as fpga_lte_fec

Distinct though, as it is HW-specific.

> > +	union acc100_dma_desc *desc, *last_desc, atom_desc;
> > +	union acc100_dma_rsp_desc rsp;
> > +	struct rte_bbdev_dec_op *op;
> > +	uint8_t cbs_in_tb = 1, cb_idx = 0;
> > +
> > +	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> > +			& q->sw_ring_wrap_mask);
> > +	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> > +			__ATOMIC_RELAXED);
> > +
> > +	/* Check fdone bit */
> > +	if (!(atom_desc.rsp.val & ACC100_FDONE))
> > +		return -1;
> > +
> > +	/* Dequeue */
> > +	op = desc->req.op_addr;
> > +
> > +	/* Get number of CBs in dequeued TB */
> > +	cbs_in_tb = desc->req.cbs_in_tb;
> > +	/* Get last CB */
> > +	last_desc = q->ring_addr + ((q->sw_ring_tail
> > +			+ dequeued_cbs + cbs_in_tb - 1)
> > +			& q->sw_ring_wrap_mask);
> > +	/* Check if last CB in TB is ready to dequeue (and thus
> > +	 * the whole TB) - checking sdone bit. If not return.
> > +	 */
> > +	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
> > +			__ATOMIC_RELAXED);
> > +	if (!(atom_desc.rsp.val & ACC100_SDONE))
> > +		return -1;
> > +
> > +	/* Clearing status, it will be set based on response */
> > +	op->status = 0;
> > +
> > +	/* Read remaining CBs if exists */
> > +	while (cb_idx < cbs_in_tb) {
> Other similar calls use 'i' , 'cb_idx' is more meaningful, consider changing the
> other loops.

More relevant here due to the split of the TB into CBs.

> > +		desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> > +				& q->sw_ring_wrap_mask);
> > +		atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> > +				__ATOMIC_RELAXED);
> > +		rsp.val = atom_desc.rsp.val;
> > +		rte_bbdev_log_debug("Resp. desc %p: %x", desc,
> > +				rsp.val);
> > +
> > +		op->status |= ((rsp.input_err)
> > +				? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> > +		op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> > +		op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> > +
> > +		/* CRC invalid if error exists */
> > +		if (!op->status)
> > +			op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
> > +		op->turbo_dec.iter_count = RTE_MAX((uint8_t) rsp.iter_cnt,
> > +				op->turbo_dec.iter_count);
> > +
> > +		/* Check if this is the last desc in batch (Atomic Queue) */
> > +		if (desc->req.last_desc_in_batch) {
> > +			(*aq_dequeued)++;
> > +			desc->req.last_desc_in_batch = 0;
> > +		}
> > +		desc->rsp.val = ACC100_DMA_DESC_TYPE;
> > +		desc->rsp.add_info_0 = 0;
> > +		desc->rsp.add_info_1 = 0;
> > +		dequeued_cbs++;
> > +		cb_idx++;
> > +	}
> > +
> > +	*ref_op = op;
> > +
> > +	return cb_idx;
> > +}
> > +
> > +/* Dequeue LDPC encode operations from ACC100 device. */
> > +static uint16_t
> > +acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
> > +		struct rte_bbdev_enc_op **ops, uint16_t num)
> > +{
> > +	struct acc100_queue *q = q_data->queue_private;
> > +	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
> > +	uint32_t aq_dequeued = 0;
> > +	uint16_t dequeue_num, i, dequeued_cbs = 0, dequeued_descs = 0;
> > +	int ret;
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +	if (unlikely(ops == 0 && q == NULL))
> > +		return 0;
> > +#endif
> > +
> > +	dequeue_num = (avail < num) ? avail : num;
> 
> Similar to RTE_MIN
> 
> general issue

ok, will check
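The ternary clamp being discussed is just a MIN; a local macro shows what RTE_MIN would do in-tree (MIN_VAL below is a stand-in name, not a DPDK macro):

```c
#include <stdint.h>

/* '(avail < num) ? avail : num' expressed via a MIN helper; DPDK's
 * RTE_MIN macro plays this role in-tree. */
#define MIN_VAL(a, b) ((a) < (b) ? (a) : (b))

static uint16_t
clamp_dequeue_num(uint32_t avail, uint16_t num)
{
	return (uint16_t)MIN_VAL(avail, (uint32_t)num);
}
```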

> 
> > +
> > +	for (i = 0; i < dequeue_num; i++) {
> > +		ret = dequeue_enc_one_op_cb(q, &ops[dequeued_cbs],
> > +				dequeued_descs, &aq_dequeued);
> > +		if (ret < 0)
> > +			break;
> > +		dequeued_cbs += ret;
> > +		dequeued_descs++;
> > +		if (dequeued_cbs >= num)
> > +			break;
> condition should be added to the for-loop

Unsure this would help readability, personally.
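For comparison, a sketch of the suggested restructuring: the early-exit check folded into the for-loop condition rather than a break inside the body. A dummy per-descriptor return of one CB stands in for dequeue_enc_one_op_cb():

```c
#include <stdint.h>

/* Sketch only: counts dequeued CBs with the early-exit condition moved
 * into the loop header, as the reviewer suggests. */
static uint16_t
count_dequeued(uint16_t dequeue_num, uint16_t num)
{
	uint16_t i, dequeued_cbs = 0;

	for (i = 0; i < dequeue_num && dequeued_cbs < num; i++)
		dequeued_cbs += 1; /* one CB per descriptor in this sketch */
	return dequeued_cbs;
}
```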

> > +	}
> > +
> > +	q->aq_dequeued += aq_dequeued;
> > +	q->sw_ring_tail += dequeued_descs;
> > +
> > +	/* Update enqueue stats */
> > +	q_data->queue_stats.dequeued_count += dequeued_cbs;
> > +
> > +	return dequeued_cbs;
> > +}
> > +
> > +/* Dequeue decode operations from ACC100 device. */
> > +static uint16_t
> > +acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
> > +		struct rte_bbdev_dec_op **ops, uint16_t num)
> > +{
> > +	struct acc100_queue *q = q_data->queue_private;
> > +	uint16_t dequeue_num;
> > +	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
> > +	uint32_t aq_dequeued = 0;
> > +	uint16_t i;
> > +	uint16_t dequeued_cbs = 0;
> > +	struct rte_bbdev_dec_op *op;
> > +	int ret;
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +	if (unlikely(ops == 0 && q == NULL))
> > +		return 0;
> > +#endif
> > +
> > +	dequeue_num = (avail < num) ? avail : num;
> > +
> > +	for (i = 0; i < dequeue_num; ++i) {
> > +		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> > +			& q->sw_ring_wrap_mask))->req.op_addr;
> > +		if (op->ldpc_dec.code_block_mode == 0)
> 
> 0 should be a #define

As mentioned in a previous review.

Thanks
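As an illustration of the reviewer's point, a sketch with hypothetical names for the magic code_block_mode values (the actual identifiers would live in the bbdev or driver headers, not these):

```c
/* Hypothetical names for the 0/1 code_block_mode values; illustration
 * only, not the driver's real identifiers. */
#define CODE_BLOCK_MODE_TB 0 /* transport-block mode */
#define CODE_BLOCK_MODE_CB 1 /* code-block mode */

static int
is_tb_mode(int code_block_mode)
{
	return code_block_mode == CODE_BLOCK_MODE_TB;
}
```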

> 
> Tom
> 
> > +			ret = dequeue_dec_one_op_tb(q, &ops[i],
> dequeued_cbs,
> > +					&aq_dequeued);
> > +		else
> > +			ret = dequeue_ldpc_dec_one_op_cb(
> > +					q_data, q, &ops[i], dequeued_cbs,
> > +					&aq_dequeued);
> > +
> > +		if (ret < 0)
> > +			break;
> > +		dequeued_cbs += ret;
> > +	}
> > +
> > +	q->aq_dequeued += aq_dequeued;
> > +	q->sw_ring_tail += dequeued_cbs;
> > +
> > +	/* Update enqueue stats */
> > +	q_data->queue_stats.dequeued_count += i;
> > +
> > +	return i;
> > +}
> > +
> >  /* Initialization Function */
> >  static void
> >  acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
> > @@ -703,6 +2321,10 @@
> >  	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
> >
> >  	dev->dev_ops = &acc100_bbdev_ops;
> > +	dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
> > +	dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
> > +	dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
> > +	dev->dequeue_ldpc_dec_ops = acc100_dequeue_ldpc_dec;
> >
> >  	((struct acc100_device *) dev->data->dev_private)->pf_device =
> >  			!strcmp(drv->driver.name,
> > @@ -815,4 +2437,3 @@ static int acc100_pci_remove(struct
> rte_pci_device *pci_dev)
> >  RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME,
> pci_id_acc100_pf_map);
> >  RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME,
> acc100_pci_vf_driver);
> >  RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME,
> pci_id_acc100_vf_map);
> > -
> > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h
> b/drivers/baseband/acc100/rte_acc100_pmd.h
> > index 0e2b79c..78686c1 100644
> > --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> > +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> > @@ -88,6 +88,8 @@
> >  #define TMPL_PRI_3      0x0f0e0d0c
> >  #define QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
> >  #define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
> > +#define ACC100_FDONE    0x80000000
> > +#define ACC100_SDONE    0x40000000
> >
> >  #define ACC100_NUM_TMPL  32
> >  #define VF_OFFSET_QOS 16 /* offset in Memory Space specific to QoS
> Mon */
> > @@ -398,6 +400,7 @@ struct __rte_packed acc100_dma_req_desc {
> >  union acc100_dma_desc {
> >  	struct acc100_dma_req_desc req;
> >  	union acc100_dma_rsp_desc rsp;
> > +	uint64_t atom_hdr;
> >  };
> >
> >


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v9 06/10] baseband/acc100: add HARQ loopback support
  2020-09-30 17:25       ` Tom Rix
@ 2020-09-30 18:55         ` Chautru, Nicolas
  2020-10-01 15:32           ` Tom Rix
  0 siblings, 1 reply; 213+ messages in thread
From: Chautru, Nicolas @ 2020-09-30 18:55 UTC (permalink / raw)
  To: Tom Rix, dev, akhil.goyal
  Cc: Richardson, Bruce, Xu, Rosen, dave.burley, aidan.goddard, Yigit,
	Ferruh, Liu, Tianjiao

Hi Tom, 


> From: Tom Rix <trix@redhat.com>
> On 9/28/20 5:29 PM, Nicolas Chautru wrote:
> > Additional support for HARQ memory loopback
> >
> > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
> > ---
> >  drivers/baseband/acc100/rte_acc100_pmd.c | 158
> > +++++++++++++++++++++++++++++++
> >  1 file changed, 158 insertions(+)
> >
> > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> > b/drivers/baseband/acc100/rte_acc100_pmd.c
> > index b223547..e484c0a 100644
> > --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> > +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> > @@ -658,6 +658,7 @@
> >
> 	RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
> >
> 	RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |  #ifdef
> ACC100_EXT_MEM
> > +
> 	RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK |
> >
> 	RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
> >
> 	RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
> #endif @@
> > -1480,12 +1481,169 @@
> >  	return 1;
> >  }
> >
> > +static inline int
> > +harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
> > +		uint16_t total_enqueued_cbs) {
> > +	struct acc100_fcw_ld *fcw;
> > +	union acc100_dma_desc *desc;
> > +	int next_triplet = 1;
> > +	struct rte_mbuf *hq_output_head, *hq_output;
> > +	uint16_t harq_in_length = op-
> >ldpc_dec.harq_combined_input.length;
> > +	if (harq_in_length == 0) {
> > +		rte_bbdev_log(ERR, "Loopback of invalid null size\n");
> > +		return -EINVAL;
> > +	}
> > +
> > +	int h_comp = check_bit(op->ldpc_dec.op_flags,
> > +			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
> > +			) ? 1 : 0;
> 
> bool

Not in that case, as this is used explicitly as an integer in the FCW.

Thanks
Nic
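To spell out the length arithmetic discussed above: with 6-bit compression each 8-bit LLR is stored in 6 bits, so the uncompressed HARQ length is the input length scaled by 8/6 and aligned to 64, while the DMA transfer length is scaled back by 6/8. A self-contained sketch (ALIGN_UP is a stand-in for RTE_ALIGN):

```c
#include <stdint.h>

/* Stand-in for DPDK's RTE_ALIGN: round v up to a multiple of align,
 * where align is a power of two. */
#define ALIGN_UP(v, align) (((v) + (align) - 1) & ~((uint32_t)(align) - 1))

/* Uncompressed HARQ length, as in harq_loopback(): expand by 8/6 when
 * 6-bit compression is on, then align to 64. */
static uint16_t
harq_uncompressed_len(uint16_t in_len, int h_comp)
{
	uint32_t len = in_len;

	if (h_comp == 1)
		len = len * 8 / 6;
	return (uint16_t)ALIGN_UP(len, 64);
}

/* DMA transfer length: compressed data moves 6/8 of the uncompressed size. */
static uint16_t
harq_dma_len(uint16_t uncompressed_len, int h_comp)
{
	return (h_comp == 0) ? uncompressed_len :
			(uint16_t)(uncompressed_len * 6 / 8);
}
```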


> 
> Tom
> 
> > +	if (h_comp == 1)
> > +		harq_in_length = harq_in_length * 8 / 6;
> > +	harq_in_length = RTE_ALIGN(harq_in_length, 64);
> > +	uint16_t harq_dma_length_in = (h_comp == 0) ?
> > +			harq_in_length :
> > +			harq_in_length * 6 / 8;
> > +	uint16_t harq_dma_length_out = harq_dma_length_in;
> > +	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
> > +
> 	RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
> > +	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
> > +	uint16_t harq_index = (ddr_mem_in ?
> > +			op->ldpc_dec.harq_combined_input.offset :
> > +			op->ldpc_dec.harq_combined_output.offset)
> > +			/ ACC100_HARQ_OFFSET;
> > +
> > +	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> > +			& q->sw_ring_wrap_mask);
> > +	desc = q->ring_addr + desc_idx;
> > +	fcw = &desc->req.fcw_ld;
> > +	/* Set the FCW from loopback into DDR */
> > +	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
> > +	fcw->FCWversion = ACC100_FCW_VER;
> > +	fcw->qm = 2;
> > +	fcw->Zc = 384;
> > +	if (harq_in_length < 16 * N_ZC_1)
> > +		fcw->Zc = 16;
> > +	fcw->ncb = fcw->Zc * N_ZC_1;
> > +	fcw->rm_e = 2;
> > +	fcw->hcin_en = 1;
> > +	fcw->hcout_en = 1;
> > +
> > +	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length
> %d %d\n",
> > +			ddr_mem_in, harq_index,
> > +			harq_layout[harq_index].offset, harq_in_length,
> > +			harq_dma_length_in);
> > +
> > +	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
> > +		fcw->hcin_size0 = harq_layout[harq_index].size0;
> > +		fcw->hcin_offset = harq_layout[harq_index].offset;
> > +		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
> > +		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
> > +		if (h_comp == 1)
> > +			harq_dma_length_in = harq_dma_length_in * 6 / 8;
> > +	} else {
> > +		fcw->hcin_size0 = harq_in_length;
> > +	}
> > +	harq_layout[harq_index].val = 0;
> > +	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
> > +			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
> > +	fcw->hcout_size0 = harq_in_length;
> > +	fcw->hcin_decomp_mode = h_comp;
> > +	fcw->hcout_comp_mode = h_comp;

see here

> > +	fcw->gain_i = 1;
> > +	fcw->gain_h = 1;
> > +
> > +	/* Set the prefix of descriptor. This could be done at polling */
> > +	desc->req.word0 = ACC100_DMA_DESC_TYPE;
> > +	desc->req.word1 = 0; /**< Timestamp could be disabled */
> > +	desc->req.word2 = 0;
> > +	desc->req.word3 = 0;
> > +	desc->req.numCBs = 1;
> > +
> > +	/* Null LLR input for Decoder */
> > +	desc->req.data_ptrs[next_triplet].address =
> > +			q->lb_in_addr_phys;
> > +	desc->req.data_ptrs[next_triplet].blen = 2;
> > +	desc->req.data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
> > +	desc->req.data_ptrs[next_triplet].last = 0;
> > +	desc->req.data_ptrs[next_triplet].dma_ext = 0;
> > +	next_triplet++;
> > +
> > +	/* HARQ Combine input from either Memory interface */
> > +	if (!ddr_mem_in) {
> > +		next_triplet = acc100_dma_fill_blk_type_out(&desc->req,
> > +				op->ldpc_dec.harq_combined_input.data,
> > +				op->ldpc_dec.harq_combined_input.offset,
> > +				harq_dma_length_in,
> > +				next_triplet,
> > +				ACC100_DMA_BLKID_IN_HARQ);
> > +	} else {
> > +		desc->req.data_ptrs[next_triplet].address =
> > +				op->ldpc_dec.harq_combined_input.offset;
> > +		desc->req.data_ptrs[next_triplet].blen =
> > +				harq_dma_length_in;
> > +		desc->req.data_ptrs[next_triplet].blkid =
> > +				ACC100_DMA_BLKID_IN_HARQ;
> > +		desc->req.data_ptrs[next_triplet].dma_ext = 1;
> > +		next_triplet++;
> > +	}
> > +	desc->req.data_ptrs[next_triplet - 1].last = 1;
> > +	desc->req.m2dlen = next_triplet;
> > +
> > +	/* Dropped decoder hard output */
> > +	desc->req.data_ptrs[next_triplet].address =
> > +			q->lb_out_addr_phys;
> > +	desc->req.data_ptrs[next_triplet].blen = BYTES_IN_WORD;
> > +	desc->req.data_ptrs[next_triplet].blkid =
> ACC100_DMA_BLKID_OUT_HARD;
> > +	desc->req.data_ptrs[next_triplet].last = 0;
> > +	desc->req.data_ptrs[next_triplet].dma_ext = 0;
> > +	next_triplet++;
> > +
> > +	/* HARQ Combine output to either Memory interface */
> > +	if (check_bit(op->ldpc_dec.op_flags,
> > +
> 	RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE
> > +			)) {
> > +		desc->req.data_ptrs[next_triplet].address =
> > +				op->ldpc_dec.harq_combined_output.offset;
> > +		desc->req.data_ptrs[next_triplet].blen =
> > +				harq_dma_length_out;
> > +		desc->req.data_ptrs[next_triplet].blkid =
> > +				ACC100_DMA_BLKID_OUT_HARQ;
> > +		desc->req.data_ptrs[next_triplet].dma_ext = 1;
> > +		next_triplet++;
> > +	} else {
> > +		hq_output_head = op-
> >ldpc_dec.harq_combined_output.data;
> > +		hq_output = op->ldpc_dec.harq_combined_output.data;
> > +		next_triplet = acc100_dma_fill_blk_type_out(
> > +				&desc->req,
> > +				op->ldpc_dec.harq_combined_output.data,
> > +				op->ldpc_dec.harq_combined_output.offset,
> > +				harq_dma_length_out,
> > +				next_triplet,
> > +				ACC100_DMA_BLKID_OUT_HARQ);
> > +		/* HARQ output */
> > +		mbuf_append(hq_output_head, hq_output,
> harq_dma_length_out);
> > +		op->ldpc_dec.harq_combined_output.length =
> > +				harq_dma_length_out;
> > +	}
> > +	desc->req.data_ptrs[next_triplet - 1].last = 1;
> > +	desc->req.d2mlen = next_triplet - desc->req.m2dlen;
> > +	desc->req.op_addr = op;
> > +
> > +	/* One CB (one op) was successfully prepared to enqueue */
> > +	return 1;
> > +}
> > +
> >  /** Enqueue one decode operations for ACC100 device in CB mode */
> > static inline int  enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q,
> > struct rte_bbdev_dec_op *op,
> >  		uint16_t total_enqueued_cbs, bool same_op)  {
> >  	int ret;
> > +	if (unlikely(check_bit(op->ldpc_dec.op_flags,
> > +
> 	RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK))) {
> > +		ret = harq_loopback(q, op, total_enqueued_cbs);
> > +		return ret;
> > +	}
> >
> >  	union acc100_dma_desc *desc;
> >  	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v9 08/10] baseband/acc100: add interrupt support to PMD
  2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 08/10] baseband/acc100: add interrupt support to PMD Nicolas Chautru
@ 2020-09-30 19:03       ` Tom Rix
  2020-09-30 19:45         ` Chautru, Nicolas
  0 siblings, 1 reply; 213+ messages in thread
From: Tom Rix @ 2020-09-30 19:03 UTC (permalink / raw)
  To: Nicolas Chautru, dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu


On 9/28/20 5:29 PM, Nicolas Chautru wrote:
> Adding capability and functions to support MSI
> interrupts, callbacks and the info ring.
>
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
> ---
>  drivers/baseband/acc100/rte_acc100_pmd.c | 288 ++++++++++++++++++++++++++++++-
>  drivers/baseband/acc100/rte_acc100_pmd.h |  15 ++
>  2 files changed, 300 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
> index 7d4c3df..b6d9e7c 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> @@ -339,6 +339,213 @@
>  	free_base_addresses(base_addrs, i);
>  }
>  
> +/*
> + * Find queue_id of a device queue based on details from the Info Ring.
> + * If a queue isn't found UINT16_MAX is returned.
> + */
> +static inline uint16_t
> +get_queue_id_from_ring_info(struct rte_bbdev_data *data,
> +		const union acc100_info_ring_data ring_data)
> +{
> +	uint16_t queue_id;
> +
> +	for (queue_id = 0; queue_id < data->num_queues; ++queue_id) {
> +		struct acc100_queue *acc100_q =
> +				data->queues[queue_id].queue_private;
> +		if (acc100_q != NULL && acc100_q->aq_id == ring_data.aq_id &&
> +				acc100_q->qgrp_id == ring_data.qg_id &&
> +				acc100_q->vf_id == ring_data.vf_id)
> +			return queue_id;

If num_queues is large, this linear search will be slow.

Consider changing the search algorithm.
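One possible shape for that change: a lookup table indexed by (qg_id, vf_id, aq_id), filled once at queue-setup time, so the interrupt path does an O(1) lookup instead of scanning all queues. The table dimensions below are illustrative only; the real limits come from the ACC100 queue topology:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical sizes for illustration; not the driver's real limits. */
#define NUM_QGRPS 8
#define NUM_VFS   16
#define NUM_AQS   16

static uint16_t queue_id_map[NUM_QGRPS][NUM_VFS][NUM_AQS];

/* Fill every slot with UINT16_MAX (0xFFFF), the "not found" marker. */
static void
map_init(void)
{
	memset(queue_id_map, 0xFF, sizeof(queue_id_map));
}

/* Called once per queue at setup time. */
static void
map_insert(uint8_t qg, uint8_t vf, uint8_t aq, uint16_t qid)
{
	queue_id_map[qg][vf][aq] = qid;
}

/* O(1) replacement for the linear scan in get_queue_id_from_ring_info(). */
static uint16_t
map_lookup(uint8_t qg, uint8_t vf, uint8_t aq)
{
	return queue_id_map[qg][vf][aq];
}
```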

> +	}
> +
> +	return UINT16_MAX;
The interrupt handlers that use this function do not do a great job of handling this error.
> +}
> +
> +/* Checks PF Info Ring to find the interrupt cause and handles it accordingly */
> +static inline void
> +acc100_check_ir(struct acc100_device *acc100_dev)
> +{
> +	volatile union acc100_info_ring_data *ring_data;
> +	uint16_t info_ring_head = acc100_dev->info_ring_head;
> +	if (acc100_dev->info_ring == NULL)
> +		return;
> +
> +	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
> +			ACC100_INFO_RING_MASK);
> +
> +	while (ring_data->valid) {
> +		if ((ring_data->int_nb < ACC100_PF_INT_DMA_DL_DESC_IRQ) || (
> +				ring_data->int_nb >
> +				ACC100_PF_INT_DMA_DL5G_DESC_IRQ))
> +			rte_bbdev_log(WARNING, "InfoRing: ITR:%d Info:0x%x",
> +				ring_data->int_nb, ring_data->detailed_info);
> +		/* Initialize Info Ring entry and move forward */
> +		ring_data->val = 0;
> +		info_ring_head++;
> +		ring_data = acc100_dev->info_ring +
> +				(info_ring_head & ACC100_INFO_RING_MASK);
These three statements are common for the ring handling, consider a macro or inline function.
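A sketch of what that helper could look like: clear the current entry, advance the head, and return a pointer to the next entry. A plain uint32_t stands in for union acc100_info_ring_data, and the ring size below is assumed, not taken from the driver:

```c
#include <stdint.h>

/* Assumed ring size for this sketch; must be a power of two. */
#define ACC100_INFO_RING_NUM_ENTRIES 1024
#define ACC100_INFO_RING_MASK (ACC100_INFO_RING_NUM_ENTRIES - 1)

/* Consume the entry at *head, advance the head, and return the next
 * ring slot -- the three statements repeated in the handlers. */
static inline uint32_t *
info_ring_advance(uint32_t *ring, uint16_t *head)
{
	ring[*head & ACC100_INFO_RING_MASK] = 0; /* clear consumed entry */
	(*head)++;
	return &ring[*head & ACC100_INFO_RING_MASK];
}
```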
> +	}
> +}
> +
> +/* Checks PF Info Ring to find the interrupt cause and handles it accordingly */
> +static inline void
> +acc100_pf_interrupt_handler(struct rte_bbdev *dev)
> +{
> +	struct acc100_device *acc100_dev = dev->data->dev_private;
> +	volatile union acc100_info_ring_data *ring_data;
> +	struct acc100_deq_intr_details deq_intr_det;
> +
> +	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
> +			ACC100_INFO_RING_MASK);
> +
> +	while (ring_data->valid) {
> +
> +		rte_bbdev_log_debug(
> +				"ACC100 PF Interrupt received, Info Ring data: 0x%x",
> +				ring_data->val);
> +
> +		switch (ring_data->int_nb) {
> +		case ACC100_PF_INT_DMA_DL_DESC_IRQ:
> +		case ACC100_PF_INT_DMA_UL_DESC_IRQ:
> +		case ACC100_PF_INT_DMA_UL5G_DESC_IRQ:
> +		case ACC100_PF_INT_DMA_DL5G_DESC_IRQ:
> +			deq_intr_det.queue_id = get_queue_id_from_ring_info(
> +					dev->data, *ring_data);
> +			if (deq_intr_det.queue_id == UINT16_MAX) {
> +				rte_bbdev_log(ERR,
> +						"Couldn't find queue: aq_id: %u, qg_id: %u, vf_id: %u",
> +						ring_data->aq_id,
> +						ring_data->qg_id,
> +						ring_data->vf_id);
> +				return;
> +			}
> +			rte_bbdev_pmd_callback_process(dev,
> +					RTE_BBDEV_EVENT_DEQUEUE, &deq_intr_det);
> +			break;
> +		default:
> +			rte_bbdev_pmd_callback_process(dev,
> +					RTE_BBDEV_EVENT_ERROR, NULL);
> +			break;
> +		}
> +
> +		/* Initialize Info Ring entry and move forward */
> +		ring_data->val = 0;
> +		++acc100_dev->info_ring_head;
> +		ring_data = acc100_dev->info_ring +
> +				(acc100_dev->info_ring_head &
> +				ACC100_INFO_RING_MASK);
> +	}
> +}
> +
> +/* Checks VF Info Ring to find the interrupt cause and handles it accordingly */
> +static inline void
> +acc100_vf_interrupt_handler(struct rte_bbdev *dev)
Very similar to the PF case; consider combining.
> +{
> +	struct acc100_device *acc100_dev = dev->data->dev_private;
> +	volatile union acc100_info_ring_data *ring_data;
> +	struct acc100_deq_intr_details deq_intr_det;
> +
> +	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
> +			ACC100_INFO_RING_MASK);
> +
> +	while (ring_data->valid) {
> +
> +		rte_bbdev_log_debug(
> +				"ACC100 VF Interrupt received, Info Ring data: 0x%x",
> +				ring_data->val);
> +
> +		switch (ring_data->int_nb) {
> +		case ACC100_VF_INT_DMA_DL_DESC_IRQ:
> +		case ACC100_VF_INT_DMA_UL_DESC_IRQ:
> +		case ACC100_VF_INT_DMA_UL5G_DESC_IRQ:
> +		case ACC100_VF_INT_DMA_DL5G_DESC_IRQ:
> +			/* VFs are not aware of their vf_id - it's set to 0 in
> +			 * queue structures.
> +			 */
> +			ring_data->vf_id = 0;
> +			deq_intr_det.queue_id = get_queue_id_from_ring_info(
> +					dev->data, *ring_data);
> +			if (deq_intr_det.queue_id == UINT16_MAX) {
> +				rte_bbdev_log(ERR,
> +						"Couldn't find queue: aq_id: %u, qg_id: %u",
> +						ring_data->aq_id,
> +						ring_data->qg_id);
> +				return;
> +			}
> +			rte_bbdev_pmd_callback_process(dev,
> +					RTE_BBDEV_EVENT_DEQUEUE, &deq_intr_det);
> +			break;
> +		default:
> +			rte_bbdev_pmd_callback_process(dev,
> +					RTE_BBDEV_EVENT_ERROR, NULL);
> +			break;
> +		}
> +
> +		/* Initialize Info Ring entry and move forward */
> +		ring_data->valid = 0;
> +		++acc100_dev->info_ring_head;
> +		ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head
> +				& ACC100_INFO_RING_MASK);
> +	}
> +}
> +
> +/* Interrupt handler triggered by ACC100 dev for handling specific interrupt */
> +static void
> +acc100_dev_interrupt_handler(void *cb_arg)
> +{
> +	struct rte_bbdev *dev = cb_arg;
> +	struct acc100_device *acc100_dev = dev->data->dev_private;
> +
> +	/* Read info ring */
> +	if (acc100_dev->pf_device)
> +		acc100_pf_interrupt_handler(dev);

Combined like:

acc100_interrupt_handler(dev, is_pf)

> +	else
> +		acc100_vf_interrupt_handler(dev);
> +}
> +
> +/* Allocate and setup inforing */
> +static int
> +allocate_inforing(struct rte_bbdev *dev)

consider renaming

allocate_info_ring

> +{
> +	struct acc100_device *d = dev->data->dev_private;
> +	const struct acc100_registry_addr *reg_addr;
> +	rte_iova_t info_ring_phys;
> +	uint32_t phys_low, phys_high;
> +
> +	if (d->info_ring != NULL)
> +		return 0; /* Already configured */
> +
> +	/* Choose correct registry addresses for the device type */
> +	if (d->pf_device)
> +		reg_addr = &pf_reg_addr;
> +	else
> +		reg_addr = &vf_reg_addr;
> +	/* Allocate InfoRing */
> +	d->info_ring = rte_zmalloc_socket("Info Ring",
> +			ACC100_INFO_RING_NUM_ENTRIES *
> +			sizeof(*d->info_ring), RTE_CACHE_LINE_SIZE,
> +			dev->data->socket_id);
> +	if (d->info_ring == NULL) {
> +		rte_bbdev_log(ERR,
> +				"Failed to allocate Info Ring for %s:%u",
> +				dev->device->driver->name,
> +				dev->data->dev_id);
The callers do not check whether this fails.
> +		return -ENOMEM;
> +	}
> +	info_ring_phys = rte_malloc_virt2iova(d->info_ring);
> +
> +	/* Setup Info Ring */
> +	phys_high = (uint32_t)(info_ring_phys >> 32);
> +	phys_low  = (uint32_t)(info_ring_phys);
> +	acc100_reg_write(d, reg_addr->info_ring_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->info_ring_lo, phys_low);
> +	acc100_reg_write(d, reg_addr->info_ring_en, ACC100_REG_IRQ_EN_ALL);
> +	d->info_ring_head = (acc100_reg_read(d, reg_addr->info_ring_ptr) &
> +			0xFFF) / sizeof(union acc100_info_ring_data);
> +	return 0;
> +}
> +
> +
>  /* Allocate 64MB memory used for all software rings */
>  static int
>  acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
> @@ -426,6 +633,7 @@
>  	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
>  	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
>  
> +	allocate_inforing(dev);
Need to check the return value here.
>  	d->harq_layout = rte_zmalloc_socket("HARQ Layout",
>  			ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
>  			RTE_CACHE_LINE_SIZE, dev->data->socket_id);
> @@ -437,13 +645,53 @@
>  	return 0;
>  }
>  
> +static int
> +acc100_intr_enable(struct rte_bbdev *dev)
> +{
> +	int ret;
> +	struct acc100_device *d = dev->data->dev_private;
> +
> +	/* Only MSI are currently supported */
> +	if (dev->intr_handle->type == RTE_INTR_HANDLE_VFIO_MSI ||
> +			dev->intr_handle->type == RTE_INTR_HANDLE_UIO) {
> +
> +		allocate_inforing(dev);
Need to check the return value here.
> +
> +		ret = rte_intr_enable(dev->intr_handle);
> +		if (ret < 0) {
> +			rte_bbdev_log(ERR,
> +					"Couldn't enable interrupts for device: %s",
> +					dev->data->name);
> +			rte_free(d->info_ring);
> +			return ret;
> +		}
> +		ret = rte_intr_callback_register(dev->intr_handle,
> +				acc100_dev_interrupt_handler, dev);
> +		if (ret < 0) {
> +			rte_bbdev_log(ERR,
> +					"Couldn't register interrupt callback for device: %s",
> +					dev->data->name);
> +			rte_free(d->info_ring);
Does the interrupt need to be disabled here?
> +			return ret;
> +		}
> +
> +		return 0;
> +	}
> +
> +	rte_bbdev_log(ERR, "ACC100 (%s) supports only VFIO MSI interrupts",
> +			dev->data->name);
> +	return -ENOTSUP;
> +}
> +
>  /* Free 64MB memory used for software rings */
>  static int
>  acc100_dev_close(struct rte_bbdev *dev)
>  {
>  	struct acc100_device *d = dev->data->dev_private;
> +	acc100_check_ir(d);
>  	if (d->sw_rings_base != NULL) {
>  		rte_free(d->tail_ptrs);
> +		rte_free(d->info_ring);
>  		rte_free(d->sw_rings_base);
>  		d->sw_rings_base = NULL;
>  	}
> @@ -643,6 +891,7 @@
>  					RTE_BBDEV_TURBO_CRC_TYPE_24B |
>  					RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
>  					RTE_BBDEV_TURBO_EARLY_TERMINATION |
> +					RTE_BBDEV_TURBO_DEC_INTERRUPTS |
>  					RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
>  					RTE_BBDEV_TURBO_MAP_DEC |
>  					RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP |
> @@ -663,6 +912,7 @@
>  					RTE_BBDEV_TURBO_CRC_24B_ATTACH |
>  					RTE_BBDEV_TURBO_RV_INDEX_BYPASS |
>  					RTE_BBDEV_TURBO_RATE_MATCH |
> +					RTE_BBDEV_TURBO_ENC_INTERRUPTS |
>  					RTE_BBDEV_TURBO_ENC_SCATTER_GATHER,
>  				.num_buffers_src =
>  						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
> @@ -676,7 +926,8 @@
>  				.capability_flags =
>  					RTE_BBDEV_LDPC_RATE_MATCH |
>  					RTE_BBDEV_LDPC_CRC_24B_ATTACH |
> -					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
> +					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS |
> +					RTE_BBDEV_LDPC_ENC_INTERRUPTS,
>  				.num_buffers_src =
>  						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
>  				.num_buffers_dst =
> @@ -701,7 +952,8 @@
>  				RTE_BBDEV_LDPC_DECODE_BYPASS |
>  				RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
>  				RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
> -				RTE_BBDEV_LDPC_LLR_COMPRESSION,
> +				RTE_BBDEV_LDPC_LLR_COMPRESSION |
> +				RTE_BBDEV_LDPC_DEC_INTERRUPTS,
>  			.llr_size = 8,
>  			.llr_decimals = 1,
>  			.num_buffers_src =
> @@ -751,14 +1003,39 @@
>  #else
>  	dev_info->harq_buffer_size = 0;
>  #endif
> +	acc100_check_ir(d);
> +}
> +
> +static int
> +acc100_queue_intr_enable(struct rte_bbdev *dev, uint16_t queue_id)
> +{
> +	struct acc100_queue *q = dev->data->queues[queue_id].queue_private;
> +
> +	if (dev->intr_handle->type != RTE_INTR_HANDLE_VFIO_MSI &&
> +			dev->intr_handle->type != RTE_INTR_HANDLE_UIO)
> +		return -ENOTSUP;
> +
> +	q->irq_enable = 1;
> +	return 0;
> +}
> +
> +static int
> +acc100_queue_intr_disable(struct rte_bbdev *dev, uint16_t queue_id)
> +{
> +	struct acc100_queue *q = dev->data->queues[queue_id].queue_private;
> +	q->irq_enable = 0;
There is an -ENOTSUP check above; a similar check is needed here.
> +	return 0;
>  }
>  
>  static const struct rte_bbdev_ops acc100_bbdev_ops = {
>  	.setup_queues = acc100_setup_queues,
> +	.intr_enable = acc100_intr_enable,
>  	.close = acc100_dev_close,
>  	.info_get = acc100_dev_info_get,
>  	.queue_setup = acc100_queue_setup,
>  	.queue_release = acc100_queue_release,
> +	.queue_intr_enable = acc100_queue_intr_enable,
> +	.queue_intr_disable = acc100_queue_intr_disable
>  };
>  
>  /* ACC100 PCI PF address map */
> @@ -3018,8 +3295,10 @@
>  			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
>  	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
>  	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> -	if (op->status != 0)
> +	if (op->status != 0) {
>  		q_data->queue_stats.dequeue_err_count++;
> +		acc100_check_ir(q->d);
> +	}
>  
>  	/* CRC invalid if error exists */
>  	if (!op->status)
> @@ -3076,6 +3355,9 @@
>  		op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
>  	op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
>  
> +	if (op->status & (1 << RTE_BBDEV_DRV_ERROR))
> +		acc100_check_ir(q->d);
> +
>  	/* Check if this is the last desc in batch (Atomic Queue) */
>  	if (desc->req.last_desc_in_batch) {
>  		(*aq_dequeued)++;
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
> index 78686c1..8980fa5 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> @@ -559,7 +559,14 @@ struct acc100_device {
>  	/* Virtual address of the info memory routed to the this function under
>  	 * operation, whether it is PF or VF.
>  	 */
> +	union acc100_info_ring_data *info_ring;

Is a comment needed noting that this array requires a sentinel?

Tom

> +
>  	union acc100_harq_layout_data *harq_layout;
> +	/* Virtual Info Ring head */
> +	uint16_t info_ring_head;
> +	/* Number of bytes available for each queue in device, depending on
> +	 * how many queues are enabled with configure()
> +	 */
>  	uint32_t sw_ring_size;
>  	uint32_t ddr_size; /* Size in kB */
>  	uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */
> @@ -575,4 +582,12 @@ struct acc100_device {
>  	bool configured; /**< True if this ACC100 device is configured */
>  };
>  
> +/**
> + * Structure with details about RTE_BBDEV_EVENT_DEQUEUE event. It's passed to
> + * the callback function.
> + */
> +struct acc100_deq_intr_details {
> +	uint16_t queue_id;
> +};
> +
>  #endif /* _RTE_ACC100_PMD_H_ */


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v9 07/10] baseband/acc100: add support for 4G processing
  2020-09-30 18:37       ` Tom Rix
@ 2020-09-30 19:10         ` Chautru, Nicolas
  2020-10-01 15:42           ` Tom Rix
  0 siblings, 1 reply; 213+ messages in thread
From: Chautru, Nicolas @ 2020-09-30 19:10 UTC (permalink / raw)
  To: Tom Rix, dev, akhil.goyal
  Cc: Richardson, Bruce, Xu, Rosen, dave.burley, aidan.goddard, Yigit,
	Ferruh, Liu, Tianjiao

Hi Tom, 

> From: Tom Rix <trix@redhat.com>
> On 9/28/20 5:29 PM, Nicolas Chautru wrote:
> > Adding capability for 4G encode and decoder processing
> >
> > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
> > ---
> >  doc/guides/bbdevs/features/acc100.ini    |    4 +-
> >  drivers/baseband/acc100/rte_acc100_pmd.c | 1010
> > ++++++++++++++++++++++++++++--
> >  2 files changed, 945 insertions(+), 69 deletions(-)
> >
> > diff --git a/doc/guides/bbdevs/features/acc100.ini
> > b/doc/guides/bbdevs/features/acc100.ini
> > index 40c7adc..642cd48 100644
> > --- a/doc/guides/bbdevs/features/acc100.ini
> > +++ b/doc/guides/bbdevs/features/acc100.ini
> > @@ -4,8 +4,8 @@
> >  ; Refer to default.ini for the full list of available PMD features.
> >  ;
> >  [Features]
> > -Turbo Decoder (4G)     = N
> > -Turbo Encoder (4G)     = N
> > +Turbo Decoder (4G)     = Y
> > +Turbo Encoder (4G)     = Y
> >  LDPC Decoder (5G)      = Y
> >  LDPC Encoder (5G)      = Y
> >  LLR/HARQ Compression   = Y
> > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> > b/drivers/baseband/acc100/rte_acc100_pmd.c
> > index e484c0a..7d4c3df 100644
> > --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> > +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> > @@ -339,7 +339,6 @@
> >  	free_base_addresses(base_addrs, i);
> >  }
> >
> > -
> >  /* Allocate 64MB memory used for all software rings */  static int
> > acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int
> > socket_id) @@ -637,6 +636,41 @@
> >
> >  	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
> >  		{
> > +			.type = RTE_BBDEV_OP_TURBO_DEC,
> > +			.cap.turbo_dec = {
> > +				.capability_flags =
> > +
> 	RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE |
> > +					RTE_BBDEV_TURBO_CRC_TYPE_24B
> |
> > +
> 	RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
> > +
> 	RTE_BBDEV_TURBO_EARLY_TERMINATION |
> > +
> 	RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
> > +					RTE_BBDEV_TURBO_MAP_DEC |
> > +
> 	RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP |
> > +
> 	RTE_BBDEV_TURBO_DEC_SCATTER_GATHER,
> > +				.max_llr_modulus = INT8_MAX,
> > +				.num_buffers_src =
> > +
> 	RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
> > +				.num_buffers_hard_out =
> > +
> 	RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
> > +				.num_buffers_soft_out =
> > +
> 	RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
> > +			}
> > +		},
> > +		{
> > +			.type = RTE_BBDEV_OP_TURBO_ENC,
> > +			.cap.turbo_enc = {
> > +				.capability_flags =
> > +
> 	RTE_BBDEV_TURBO_CRC_24B_ATTACH |
> > +
> 	RTE_BBDEV_TURBO_RV_INDEX_BYPASS |
> > +					RTE_BBDEV_TURBO_RATE_MATCH |
> > +
> 	RTE_BBDEV_TURBO_ENC_SCATTER_GATHER,
> > +				.num_buffers_src =
> > +
> 	RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
> > +				.num_buffers_dst =
> > +
> 	RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
> > +			}
> > +		},
> > +		{
> >  			.type   = RTE_BBDEV_OP_LDPC_ENC,
> >  			.cap.ldpc_enc = {
> >  				.capability_flags =
> > @@ -719,7 +753,6 @@
> >  #endif
> >  }
> >
> > -
> >  static const struct rte_bbdev_ops acc100_bbdev_ops = {
> >  	.setup_queues = acc100_setup_queues,
> >  	.close = acc100_dev_close,
> > @@ -763,6 +796,58 @@
> >  	return tail;
> >  }
> >
> > +/* Fill in a frame control word for turbo encoding. */ static inline
> > +void acc100_fcw_te_fill(const struct rte_bbdev_enc_op *op, struct
> > +acc100_fcw_te *fcw) {
> > +	fcw->code_block_mode = op->turbo_enc.code_block_mode;
> > +	if (fcw->code_block_mode == 0) { /* For TB mode */
> > +		fcw->k_neg = op->turbo_enc.tb_params.k_neg;
> > +		fcw->k_pos = op->turbo_enc.tb_params.k_pos;
> > +		fcw->c_neg = op->turbo_enc.tb_params.c_neg;
> > +		fcw->c = op->turbo_enc.tb_params.c;
> > +		fcw->ncb_neg = op->turbo_enc.tb_params.ncb_neg;
> > +		fcw->ncb_pos = op->turbo_enc.tb_params.ncb_pos;
> > +
> > +		if (check_bit(op->turbo_enc.op_flags,
> > +				RTE_BBDEV_TURBO_RATE_MATCH)) {
> > +			fcw->bypass_rm = 0;
> > +			fcw->cab = op->turbo_enc.tb_params.cab;
> > +			fcw->ea = op->turbo_enc.tb_params.ea;
> > +			fcw->eb = op->turbo_enc.tb_params.eb;
> > +		} else {
> > +			/* E is set to the encoding output size when RM is
> > +			 * bypassed.
> > +			 */
> > +			fcw->bypass_rm = 1;
> > +			fcw->cab = fcw->c_neg;
> > +			fcw->ea = 3 * fcw->k_neg + 12;
> > +			fcw->eb = 3 * fcw->k_pos + 12;
> > +		}
> > +	} else { /* For CB mode */
> > +		fcw->k_pos = op->turbo_enc.cb_params.k;
> > +		fcw->ncb_pos = op->turbo_enc.cb_params.ncb;
> > +
> > +		if (check_bit(op->turbo_enc.op_flags,
> > +				RTE_BBDEV_TURBO_RATE_MATCH)) {
> > +			fcw->bypass_rm = 0;
> > +			fcw->eb = op->turbo_enc.cb_params.e;
> > +		} else {
> > +			/* E is set to the encoding output size when RM is
> > +			 * bypassed.
> > +			 */
> > +			fcw->bypass_rm = 1;
> > +			fcw->eb = 3 * fcw->k_pos + 12;
> > +		}
> > +	}
> > +
> > +	fcw->bypass_rv_idx1 = check_bit(op->turbo_enc.op_flags,
> > +			RTE_BBDEV_TURBO_RV_INDEX_BYPASS);
> > +	fcw->code_block_crc = check_bit(op->turbo_enc.op_flags,
> > +			RTE_BBDEV_TURBO_CRC_24B_ATTACH);
> > +	fcw->rv_idx1 = op->turbo_enc.rv_index; }
> > +
> >  /* Compute value of k0.
> >   * Based on 3GPP 38.212 Table 5.4.2.1-2
> >   * Starting position of different redundancy versions, k0 @@ -813,6
> > +898,25 @@
> >  	fcw->mcb_count = num_cb;
> >  }
> >
> > +/* Fill in a frame control word for turbo decoding. */ static inline
> > +void acc100_fcw_td_fill(const struct rte_bbdev_dec_op *op, struct
> > +acc100_fcw_td *fcw) {
> > +	/* Note: Early termination is always enabled for 4GUL */
> > +	fcw->fcw_ver = 1;
> > +	if (op->turbo_dec.code_block_mode == 0)
> > +		fcw->k_pos = op->turbo_dec.tb_params.k_pos;
> > +	else
> > +		fcw->k_pos = op->turbo_dec.cb_params.k;
> > +	fcw->turbo_crc_type = check_bit(op->turbo_dec.op_flags,
> > +			RTE_BBDEV_TURBO_CRC_TYPE_24B);
> > +	fcw->bypass_sb_deint = 0;
> > +	fcw->raw_decoder_input_on = 0;
> > +	fcw->max_iter = op->turbo_dec.iter_max;
> > +	fcw->half_iter_on = !check_bit(op->turbo_dec.op_flags,
> > +			RTE_BBDEV_TURBO_HALF_ITERATION_EVEN);
> > +}
> > +
> >  /* Fill in a frame control word for LDPC decoding. */  static inline
> > void  acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct
> > acc100_fcw_ld *fcw, @@ -1042,6 +1146,87 @@  }
> >
> >  static inline int
> > +acc100_dma_desc_te_fill(struct rte_bbdev_enc_op *op,
> > +		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
> > +		struct rte_mbuf *output, uint32_t *in_offset,
> > +		uint32_t *out_offset, uint32_t *out_length,
> > +		uint32_t *mbuf_total_left, uint32_t *seg_total_left, uint8_t
> r) {
> > +	int next_triplet = 1; /* FCW already done */
> > +	uint32_t e, ea, eb, length;
> > +	uint16_t k, k_neg, k_pos;
> > +	uint8_t cab, c_neg;
> > +
> > +	desc->word0 = ACC100_DMA_DESC_TYPE;
> > +	desc->word1 = 0; /**< Timestamp could be disabled */
> > +	desc->word2 = 0;
> > +	desc->word3 = 0;
> > +	desc->numCBs = 1;
> > +
> > +	if (op->turbo_enc.code_block_mode == 0) {
> > +		ea = op->turbo_enc.tb_params.ea;
> > +		eb = op->turbo_enc.tb_params.eb;
> > +		cab = op->turbo_enc.tb_params.cab;
> > +		k_neg = op->turbo_enc.tb_params.k_neg;
> > +		k_pos = op->turbo_enc.tb_params.k_pos;
> > +		c_neg = op->turbo_enc.tb_params.c_neg;
> > +		e = (r < cab) ? ea : eb;
> > +		k = (r < c_neg) ? k_neg : k_pos;
> > +	} else {
> > +		e = op->turbo_enc.cb_params.e;
> > +		k = op->turbo_enc.cb_params.k;
> > +	}
> > +
> > +	if (check_bit(op->turbo_enc.op_flags,
> RTE_BBDEV_TURBO_CRC_24B_ATTACH))
> > +		length = (k - 24) >> 3;
> > +	else
> > +		length = k >> 3;
> > +
> > +	if (unlikely((*mbuf_total_left == 0) || (*mbuf_total_left <
> > +length))) {
> 
> similar to other patches, this check can be combined to <=
> 
> change generally

same comment on other patch

> 
> > +		rte_bbdev_log(ERR,
> > +				"Mismatch between mbuf length and
> included CB sizes: mbuf len %u, cb len %u",
> > +				*mbuf_total_left, length);
> > +		return -1;
> > +	}
> > +
> > +	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
> > +			length, seg_total_left, next_triplet);
> > +	if (unlikely(next_triplet < 0)) {
> > +		rte_bbdev_log(ERR,
> > +				"Mismatch between data to process and
> mbuf data length in bbdev_op: %p",
> > +				op);
> > +		return -1;
> > +	}
> > +	desc->data_ptrs[next_triplet - 1].last = 1;
> > +	desc->m2dlen = next_triplet;
> > +	*mbuf_total_left -= length;
> > +
> > +	/* Set output length */
> > +	if (check_bit(op->turbo_enc.op_flags,
> RTE_BBDEV_TURBO_RATE_MATCH))
> > +		/* Integer round up division by 8 */
> > +		*out_length = (e + 7) >> 3;
> > +	else
> > +		*out_length = (k >> 3) * 3 + 2;
> > +
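The output-length computation above relies on the `(e + 7) >> 3` idiom for integer round-up division of a bit count by 8. A quick sketch to confirm the rounding behaviour (helper name is illustrative only):

```c
#include <stdint.h>

/* Illustrative helper: round a bit length e up to whole bytes,
 * as done by the (e + 7) >> 3 expression in the patch. */
static inline uint32_t bits_to_bytes_ceil(uint32_t e)
{
	return (e + 7) >> 3;
}
```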
> > +	next_triplet = acc100_dma_fill_blk_type_out(desc, output,
> *out_offset,
> > +			*out_length, next_triplet,
> ACC100_DMA_BLKID_OUT_ENC);
> > +	if (unlikely(next_triplet < 0)) {
> > +		rte_bbdev_log(ERR,
> > +				"Mismatch between data to process and
> mbuf data length in bbdev_op: %p",
> > +				op);
> > +		return -1;
> > +	}
> > +	op->turbo_enc.output.length += *out_length;
> > +	*out_offset += *out_length;
> > +	desc->data_ptrs[next_triplet - 1].last = 1;
> > +	desc->d2mlen = next_triplet - desc->m2dlen;
> > +
> > +	desc->op_addr = op;
> > +
> > +	return 0;
> > +}
> > +
> > +static inline int
> >  acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
> >  		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
> >  		struct rte_mbuf *output, uint32_t *in_offset, @@ -1110,6
> +1295,117
> > @@  }
> >
> >  static inline int
> > +acc100_dma_desc_td_fill(struct rte_bbdev_dec_op *op,
> > +		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
> > +		struct rte_mbuf *h_output, struct rte_mbuf *s_output,
> > +		uint32_t *in_offset, uint32_t *h_out_offset,
> > +		uint32_t *s_out_offset, uint32_t *h_out_length,
> > +		uint32_t *s_out_length, uint32_t *mbuf_total_left,
> > +		uint32_t *seg_total_left, uint8_t r) {
> > +	int next_triplet = 1; /* FCW already done */
> > +	uint16_t k;
> > +	uint16_t crc24_overlap = 0;
> > +	uint32_t e, kw;
> > +
> > +	desc->word0 = ACC100_DMA_DESC_TYPE;
> > +	desc->word1 = 0; /**< Timestamp could be disabled */
> > +	desc->word2 = 0;
> > +	desc->word3 = 0;
> > +	desc->numCBs = 1;
> > +
> > +	if (op->turbo_dec.code_block_mode == 0) {
> > +		k = (r < op->turbo_dec.tb_params.c_neg)
> > +			? op->turbo_dec.tb_params.k_neg
> > +			: op->turbo_dec.tb_params.k_pos;
> > +		e = (r < op->turbo_dec.tb_params.cab)
> > +			? op->turbo_dec.tb_params.ea
> > +			: op->turbo_dec.tb_params.eb;
> > +	} else {
> > +		k = op->turbo_dec.cb_params.k;
> > +		e = op->turbo_dec.cb_params.e;
> > +	}
> > +
> > +	if ((op->turbo_dec.code_block_mode == 0)
> > +		&& !check_bit(op->turbo_dec.op_flags,
> > +		RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP))
> > +		crc24_overlap = 24;
> > +
> > +	/* Calculates circular buffer size.
> > +	 * According to 3gpp 36.212 section 5.1.4.2
> > +	 *   Kw = 3 * Kpi,
> > +	 * where:
> > +	 *   Kpi = nCol * nRow
> > +	 * where nCol is 32 and nRow can be calculated from:
> > +	 *   D =< nCol * nRow
> > +	 * where D is the size of each output from turbo encoder block (k +
> 4).
> > +	 */
> > +	kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
> > +
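The Kw expression above can be checked numerically. A sketch assuming `RTE_ALIGN_CEIL(v, a)` rounds v up to a multiple of a (helper names are mine, not part of the patch):

```c
#include <stdint.h>

/* Round v up to the next multiple of a (mirrors RTE_ALIGN_CEIL). */
static inline uint32_t align_ceil_u32(uint32_t v, uint32_t a)
{
	return (v + a - 1) / a * a;
}

/* Circular buffer size per 3GPP TS 36.212 section 5.1.4.2:
 * D = k + 4 bits per stream, nCol = 32, Kpi = 32 * nRow with
 * D <= nCol * nRow, and Kw = 3 * Kpi. */
static inline uint32_t turbo_dec_kw(uint16_t k)
{
	return align_ceil_u32((uint32_t)k + 4, 32) * 3;
}
```

For k = 6144: D = 6148, aligned to 6176, so Kw = 18528.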
> > +	if (unlikely((*mbuf_total_left == 0) || (*mbuf_total_left < kw))) {
> > +		rte_bbdev_log(ERR,
> > +				"Mismatch between mbuf length and
> included CB sizes: mbuf len %u, cb len %u",
> > +				*mbuf_total_left, kw);
> > +		return -1;
> > +	}
> > +
> > +	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
> kw,
> > +			seg_total_left, next_triplet);
> > +	if (unlikely(next_triplet < 0)) {
> > +		rte_bbdev_log(ERR,
> > +				"Mismatch between data to process and
> mbuf data length in bbdev_op: %p",
> > +				op);
> > +		return -1;
> > +	}
> > +	desc->data_ptrs[next_triplet - 1].last = 1;
> > +	desc->m2dlen = next_triplet;
> > +	*mbuf_total_left -= kw;
> > +
> > +	next_triplet = acc100_dma_fill_blk_type_out(
> > +			desc, h_output, *h_out_offset,
> > +			k >> 3, next_triplet,
> ACC100_DMA_BLKID_OUT_HARD);
> > +	if (unlikely(next_triplet < 0)) {
> > +		rte_bbdev_log(ERR,
> > +				"Mismatch between data to process and
> mbuf data length in bbdev_op: %p",
> > +				op);
> > +		return -1;
> > +	}
> > +
> > +	*h_out_length = ((k - crc24_overlap) >> 3);
> > +	op->turbo_dec.hard_output.length += *h_out_length;
> > +	*h_out_offset += *h_out_length;
> > +
> > +	/* Soft output */
> > +	if (check_bit(op->turbo_dec.op_flags,
> RTE_BBDEV_TURBO_SOFT_OUTPUT)) {
> > +		if (check_bit(op->turbo_dec.op_flags,
> > +				RTE_BBDEV_TURBO_EQUALIZER))
> > +			*s_out_length = e;
> > +		else
> > +			*s_out_length = (k * 3) + 12;
> > +
> > +		next_triplet = acc100_dma_fill_blk_type_out(desc, s_output,
> > +				*s_out_offset, *s_out_length, next_triplet,
> > +				ACC100_DMA_BLKID_OUT_SOFT);
> > +		if (unlikely(next_triplet < 0)) {
> > +			rte_bbdev_log(ERR,
> > +					"Mismatch between data to process
> and mbuf data length in bbdev_op: %p",
> > +					op);
> > +			return -1;
> > +		}
> > +
> > +		op->turbo_dec.soft_output.length += *s_out_length;
> > +		*s_out_offset += *s_out_length;
> > +	}
> > +
> > +	desc->data_ptrs[next_triplet - 1].last = 1;
> > +	desc->d2mlen = next_triplet - desc->m2dlen;
> > +
> > +	desc->op_addr = op;
> > +
> > +	return 0;
> > +}
> > +
> > +static inline int
> >  acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
> >  		struct acc100_dma_req_desc *desc,
> >  		struct rte_mbuf **input, struct rte_mbuf *h_output, @@ -
> 1374,6
> > +1670,57 @@
> >
> >  /* Enqueue one encode operations for ACC100 device in CB mode */
> > static inline int
> > +enqueue_enc_one_op_cb(struct acc100_queue *q, struct
> rte_bbdev_enc_op *op,
> > +		uint16_t total_enqueued_cbs)
> > +{
> > +	union acc100_dma_desc *desc = NULL;
> > +	int ret;
> > +	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
> > +		seg_total_left;
> > +	struct rte_mbuf *input, *output_head, *output;
> > +
> > +	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> > +			& q->sw_ring_wrap_mask);
> > +	desc = q->ring_addr + desc_idx;
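For readers following the ring arithmetic: the `& q->sw_ring_wrap_mask` index computation assumes the software ring depth is a power of two, so the mask is `depth - 1` and wrap-around needs no modulo. A sketch (names are illustrative):

```c
#include <stdint.h>

/* Illustrative ring indexing: with a power-of-two depth,
 * (head + offset) & (depth - 1) wraps without a division. */
static inline uint16_t ring_index(uint16_t head, uint16_t offset,
		uint16_t depth /* must be a power of two */)
{
	return (head + offset) & (depth - 1);
}
```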
> > +	acc100_fcw_te_fill(op, &desc->req.fcw_te);
> > +
> > +	input = op->turbo_enc.input.data;
> > +	output_head = output = op->turbo_enc.output.data;
> > +	in_offset = op->turbo_enc.input.offset;
> > +	out_offset = op->turbo_enc.output.offset;
> > +	out_length = 0;
> > +	mbuf_total_left = op->turbo_enc.input.length;
> > +	seg_total_left = rte_pktmbuf_data_len(op->turbo_enc.input.data)
> > +			- in_offset;
> > +
> > +	ret = acc100_dma_desc_te_fill(op, &desc->req, &input, output,
> > +			&in_offset, &out_offset, &out_length,
> &mbuf_total_left,
> > +			&seg_total_left, 0);
> > +
> > +	if (unlikely(ret < 0))
> > +		return ret;
> > +
> > +	mbuf_append(output_head, output, out_length);
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +	rte_memdump(stderr, "FCW", &desc->req.fcw_te,
> > +			sizeof(desc->req.fcw_te) - 8);
> > +	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> > +
> > +	/* Check if any data left after processing one CB */
> > +	if (mbuf_total_left != 0) {
> > +		rte_bbdev_log(ERR,
> > +				"Some data still left after processing one CB:

> mbuf_total_left = %u",
> > +				mbuf_total_left);
> > +		return -EINVAL;
> > +	}
> > +#endif
> > +	/* One CB (one op) was successfully prepared to enqueue */
> > +	return 1;
> > +}
> > +
> > +/* Enqueue multiple LDPC encode operations for ACC100 device in CB mode */
> > +static inline int
> >  enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct
> rte_bbdev_enc_op **ops,
> >  		uint16_t total_enqueued_cbs, int16_t num)  { @@ -1481,78
> +1828,235
> > @@
> >  	return 1;
> >  }
> >
> > -static inline int
> > -harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
> > -		uint16_t total_enqueued_cbs) {
> > -	struct acc100_fcw_ld *fcw;
> > -	union acc100_dma_desc *desc;
> > -	int next_triplet = 1;
> > -	struct rte_mbuf *hq_output_head, *hq_output;
> > -	uint16_t harq_in_length = op-
> >ldpc_dec.harq_combined_input.length;
> > -	if (harq_in_length == 0) {
> > -		rte_bbdev_log(ERR, "Loopback of invalid null size\n");
> > -		return -EINVAL;
> > -	}
> >
> > -	int h_comp = check_bit(op->ldpc_dec.op_flags,
> > -			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
> > -			) ? 1 : 0;
> > -	if (h_comp == 1)
> > -		harq_in_length = harq_in_length * 8 / 6;
> > -	harq_in_length = RTE_ALIGN(harq_in_length, 64);
> > -	uint16_t harq_dma_length_in = (h_comp == 0) ?
> > -			harq_in_length :
> > -			harq_in_length * 6 / 8;
> > -	uint16_t harq_dma_length_out = harq_dma_length_in;
> > -	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
> > -
> 	RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
> > -	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
> > -	uint16_t harq_index = (ddr_mem_in ?
> > -			op->ldpc_dec.harq_combined_input.offset :
> > -			op->ldpc_dec.harq_combined_output.offset)
> > -			/ ACC100_HARQ_OFFSET;
> > +/* Enqueue one encode operation for ACC100 device in TB mode. */
> > +static inline int enqueue_enc_one_op_tb(struct acc100_queue *q,
> > +struct rte_bbdev_enc_op *op,
> > +		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb) {
> > +	union acc100_dma_desc *desc = NULL;
> > +	int ret;
> > +	uint8_t r, c;
> > +	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
> > +		seg_total_left;
> > +	struct rte_mbuf *input, *output_head, *output;
> > +	uint16_t current_enqueued_cbs = 0;
> >
> >  	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> >  			& q->sw_ring_wrap_mask);
> >  	desc = q->ring_addr + desc_idx;
> > -	fcw = &desc->req.fcw_ld;
> > -	/* Set the FCW from loopback into DDR */
> > -	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
> > -	fcw->FCWversion = ACC100_FCW_VER;
> > -	fcw->qm = 2;
> > -	fcw->Zc = 384;
> > -	if (harq_in_length < 16 * N_ZC_1)
> > -		fcw->Zc = 16;
> > -	fcw->ncb = fcw->Zc * N_ZC_1;
> > -	fcw->rm_e = 2;
> > -	fcw->hcin_en = 1;
> > -	fcw->hcout_en = 1;
> > +	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
> > +	acc100_fcw_te_fill(op, &desc->req.fcw_te);
> >
> > -	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length
> %d %d\n",
> > -			ddr_mem_in, harq_index,
> > -			harq_layout[harq_index].offset, harq_in_length,
> > -			harq_dma_length_in);
> > +	input = op->turbo_enc.input.data;
> > +	output_head = output = op->turbo_enc.output.data;
> > +	in_offset = op->turbo_enc.input.offset;
> > +	out_offset = op->turbo_enc.output.offset;
> > +	out_length = 0;
> > +	mbuf_total_left = op->turbo_enc.input.length;
> >
> > -	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
> > -		fcw->hcin_size0 = harq_layout[harq_index].size0;
> > -		fcw->hcin_offset = harq_layout[harq_index].offset;
> > -		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
> > -		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
> > -		if (h_comp == 1)
> > -			harq_dma_length_in = harq_dma_length_in * 6 / 8;
> > -	} else {
> > -		fcw->hcin_size0 = harq_in_length;
> > -	}
> > -	harq_layout[harq_index].val = 0;
> > -	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
> > -			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
> > -	fcw->hcout_size0 = harq_in_length;
> > -	fcw->hcin_decomp_mode = h_comp;
> > -	fcw->hcout_comp_mode = h_comp;
> > -	fcw->gain_i = 1;
> > -	fcw->gain_h = 1;
> > +	c = op->turbo_enc.tb_params.c;
> > +	r = op->turbo_enc.tb_params.r;
> >
> > -	/* Set the prefix of descriptor. This could be done at polling */
> > +	while (mbuf_total_left > 0 && r < c) {
> > +		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
> > +		/* Set up DMA descriptor */
> > +		desc = q->ring_addr + ((q->sw_ring_head +
> total_enqueued_cbs)
> > +				& q->sw_ring_wrap_mask);
> > +		desc->req.data_ptrs[0].address = q->ring_addr_phys +
> fcw_offset;
> > +		desc->req.data_ptrs[0].blen = ACC100_FCW_TE_BLEN;
> > +
> > +		ret = acc100_dma_desc_te_fill(op, &desc->req, &input,
> output,
> > +				&in_offset, &out_offset, &out_length,
> > +				&mbuf_total_left, &seg_total_left, r);
> > +		if (unlikely(ret < 0))
> > +			return ret;
> > +		mbuf_append(output_head, output, out_length);
> > +
> > +		/* Set total number of CBs in TB */
> > +		desc->req.cbs_in_tb = cbs_in_tb;
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +		rte_memdump(stderr, "FCW", &desc->req.fcw_te,
> > +				sizeof(desc->req.fcw_te) - 8);
> > +		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> #endif
> > +
> > +		if (seg_total_left == 0) {
> > +			/* Go to the next mbuf */
> > +			input = input->next;
> > +			in_offset = 0;
> > +			output = output->next;
> > +			out_offset = 0;
> > +		}
> > +
> > +		total_enqueued_cbs++;
> > +		current_enqueued_cbs++;
> > +		r++;
> > +	}
> > +
> > +	if (unlikely(desc == NULL))
> > +		return current_enqueued_cbs;
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +	/* Check if any CBs left for processing */
> > +	if (mbuf_total_left != 0) {
> > +		rte_bbdev_log(ERR,
> > +				"Some data still left for processing:
> mbuf_total_left = %u",
> > +				mbuf_total_left);
> > +		return -EINVAL;
> > +	}
> > +#endif
> > +
> > +	/* Set SDone on last CB descriptor for TB mode. */
> > +	desc->req.sdone_enable = 1;
> > +	desc->req.irq_enable = q->irq_enable;
> > +
> > +	return current_enqueued_cbs;
> > +}
> > +
> > +/* Enqueue one decode operation for ACC100 device in CB mode */
> > +static inline int enqueue_dec_one_op_cb(struct acc100_queue *q,
> > +struct rte_bbdev_dec_op *op,
> > +		uint16_t total_enqueued_cbs)
> > +{
> > +	union acc100_dma_desc *desc = NULL;
> > +	int ret;
> > +	uint32_t in_offset, h_out_offset, s_out_offset, s_out_length,
> > +		h_out_length, mbuf_total_left, seg_total_left;
> > +	struct rte_mbuf *input, *h_output_head, *h_output,
> > +		*s_output_head, *s_output;
> > +
> > +	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> > +			& q->sw_ring_wrap_mask);
> > +	desc = q->ring_addr + desc_idx;
> > +	acc100_fcw_td_fill(op, &desc->req.fcw_td);
> > +
> > +	input = op->turbo_dec.input.data;
> > +	h_output_head = h_output = op->turbo_dec.hard_output.data;
> > +	s_output_head = s_output = op->turbo_dec.soft_output.data;
> > +	in_offset = op->turbo_dec.input.offset;
> > +	h_out_offset = op->turbo_dec.hard_output.offset;
> > +	s_out_offset = op->turbo_dec.soft_output.offset;
> > +	h_out_length = s_out_length = 0;
> > +	mbuf_total_left = op->turbo_dec.input.length;
> > +	seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +	if (unlikely(input == NULL)) {
> > +		rte_bbdev_log(ERR, "Invalid mbuf pointer");
> > +		return -EFAULT;
> > +	}
> > +#endif
> > +
> > +	/* Set up DMA descriptor */
> > +	desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
> > +			& q->sw_ring_wrap_mask);
> > +
> > +	ret = acc100_dma_desc_td_fill(op, &desc->req, &input, h_output,
> > +			s_output, &in_offset, &h_out_offset, &s_out_offset,
> > +			&h_out_length, &s_out_length, &mbuf_total_left,
> > +			&seg_total_left, 0);
> > +
> > +	if (unlikely(ret < 0))
> > +		return ret;
> > +
> > +	/* Hard output */
> > +	mbuf_append(h_output_head, h_output, h_out_length);
> > +
> > +	/* Soft output */
> > +	if (check_bit(op->turbo_dec.op_flags,
> RTE_BBDEV_TURBO_SOFT_OUTPUT))
> > +		mbuf_append(s_output_head, s_output, s_out_length);
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +	rte_memdump(stderr, "FCW", &desc->req.fcw_td,
> > +			sizeof(desc->req.fcw_td) - 8);
> > +	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> > +
> > +	/* Check if any CBs left for processing */
> > +	if (mbuf_total_left != 0) {
> > +		rte_bbdev_log(ERR,
> > +				"Some data still left after processing one CB:
> mbuf_total_left = %u",
> > +				mbuf_total_left);
> > +		return -EINVAL;
> > +	}
> > +#endif
> logic similar to debug in mbuf_append, should be a common function.

Not exactly, unless I'm missing your point.

> > +
> > +	/* One CB (one op) was successfully prepared to enqueue */
> > +	return 1;
> > +}
> > +
> > +static inline int
> > +harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
> > +		uint16_t total_enqueued_cbs) {
> > +	struct acc100_fcw_ld *fcw;
> > +	union acc100_dma_desc *desc;
> > +	int next_triplet = 1;
> > +	struct rte_mbuf *hq_output_head, *hq_output;
> > +	uint16_t harq_in_length = op-
> >ldpc_dec.harq_combined_input.length;
> > +	if (harq_in_length == 0) {
> > +		rte_bbdev_log(ERR, "Loopback of invalid null size\n");
> > +		return -EINVAL;
> > +	}
> > +
> > +	int h_comp = check_bit(op->ldpc_dec.op_flags,
> > +			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
> > +			) ? 1 : 0;
> > +	if (h_comp == 1)
> > +		harq_in_length = harq_in_length * 8 / 6;
> > +	harq_in_length = RTE_ALIGN(harq_in_length, 64);
> > +	uint16_t harq_dma_length_in = (h_comp == 0) ?
> Can these h_comp checks be combined to a single if/else ?

it may be clearer, ok.


> > +			harq_in_length :
> > +			harq_in_length * 6 / 8;
> > +	uint16_t harq_dma_length_out = harq_dma_length_in;
> > +	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
> > +
> 	RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
> > +	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
> > +	uint16_t harq_index = (ddr_mem_in ?
> > +			op->ldpc_dec.harq_combined_input.offset :
> > +			op->ldpc_dec.harq_combined_output.offset)
> > +			/ ACC100_HARQ_OFFSET;
> > +
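The HARQ 6-bit compression sizing above is easy to get wrong, so a numeric check may help: a compressed stored length expands by 8/6 to the sample count, is aligned up to 64, and maps back to DMA bytes by 6/8. A sketch under those assumptions (helper names are mine):

```c
#include <stdint.h>

/* Round v up to a multiple of 64, mirroring RTE_ALIGN(v, 64). */
static inline uint32_t align_64(uint32_t v)
{
	return (v + 63) & ~63u;
}

/* With 6-bit compression, the stored length covers 6 bits per sample,
 * so the sample count is stored_len * 8 / 6, then 64-aligned. */
static inline uint32_t harq_samples(uint32_t stored_len, int h_comp)
{
	uint32_t len = h_comp ? stored_len * 8 / 6 : stored_len;
	return align_64(len);
}

/* DMA length in bytes for a given aligned sample count. */
static inline uint32_t harq_dma_bytes(uint32_t samples, int h_comp)
{
	return h_comp ? samples * 6 / 8 : samples;
}
```

E.g. a stored length of 120 bytes with compression gives 160 samples, aligned to 192, for a DMA length of 144 bytes.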
> > +	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> > +			& q->sw_ring_wrap_mask);
> > +	desc = q->ring_addr + desc_idx;
> > +	fcw = &desc->req.fcw_ld;
> > +	/* Set the FCW from loopback into DDR */
> > +	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
> > +	fcw->FCWversion = ACC100_FCW_VER;
> > +	fcw->qm = 2;
> > +	fcw->Zc = 384;
> these magic numbers should have #defines

These are not magic numbers, but actually 3GPP values

> > +	if (harq_in_length < 16 * N_ZC_1)
> > +		fcw->Zc = 16;
> > +	fcw->ncb = fcw->Zc * N_ZC_1;
> > +	fcw->rm_e = 2;
> > +	fcw->hcin_en = 1;
> > +	fcw->hcout_en = 1;
> > +
> > +	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length
> %d %d\n",
> > +			ddr_mem_in, harq_index,
> > +			harq_layout[harq_index].offset, harq_in_length,
> > +			harq_dma_length_in);
> > +
> > +	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
> > +		fcw->hcin_size0 = harq_layout[harq_index].size0;
> > +		fcw->hcin_offset = harq_layout[harq_index].offset;
> > +		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
> > +		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
> > +		if (h_comp == 1)
> > +			harq_dma_length_in = harq_dma_length_in * 6 / 8;
> > +	} else {
> > +		fcw->hcin_size0 = harq_in_length;
> > +	}
> > +	harq_layout[harq_index].val = 0;
> > +	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
> > +			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
> > +	fcw->hcout_size0 = harq_in_length;
> > +	fcw->hcin_decomp_mode = h_comp;
> > +	fcw->hcout_comp_mode = h_comp;
> > +	fcw->gain_i = 1;
> > +	fcw->gain_h = 1;
> > +
> > +	/* Set the prefix of descriptor. This could be done at polling */
> >  	desc->req.word0 = ACC100_DMA_DESC_TYPE;
> >  	desc->req.word1 = 0; /**< Timestamp could be disabled */
> >  	desc->req.word2 = 0;
> > @@ -1816,6 +2320,107 @@
> >  	return current_enqueued_cbs;
> >  }
> >
> > +/* Enqueue one decode operation for ACC100 device in TB mode */
> > +static inline int enqueue_dec_one_op_tb(struct acc100_queue *q,
> > +struct rte_bbdev_dec_op *op,
> > +		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb) {
> > +	union acc100_dma_desc *desc = NULL;
> > +	int ret;
> > +	uint8_t r, c;
> > +	uint32_t in_offset, h_out_offset, s_out_offset, s_out_length,
> > +		h_out_length, mbuf_total_left, seg_total_left;
> > +	struct rte_mbuf *input, *h_output_head, *h_output,
> > +		*s_output_head, *s_output;
> > +	uint16_t current_enqueued_cbs = 0;
> > +
> > +	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> > +			& q->sw_ring_wrap_mask);
> > +	desc = q->ring_addr + desc_idx;
> > +	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
> > +	acc100_fcw_td_fill(op, &desc->req.fcw_td);
> > +
> > +	input = op->turbo_dec.input.data;
> > +	h_output_head = h_output = op->turbo_dec.hard_output.data;
> > +	s_output_head = s_output = op->turbo_dec.soft_output.data;
> > +	in_offset = op->turbo_dec.input.offset;
> > +	h_out_offset = op->turbo_dec.hard_output.offset;
> > +	s_out_offset = op->turbo_dec.soft_output.offset;
> > +	h_out_length = s_out_length = 0;
> > +	mbuf_total_left = op->turbo_dec.input.length;
> > +	c = op->turbo_dec.tb_params.c;
> > +	r = op->turbo_dec.tb_params.r;
> > +
> > +	while (mbuf_total_left > 0 && r < c) {
> > +
> > +		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
> > +
> > +		/* Set up DMA descriptor */
> > +		desc = q->ring_addr + ((q->sw_ring_head +
> total_enqueued_cbs)
> > +				& q->sw_ring_wrap_mask);
> > +		desc->req.data_ptrs[0].address = q->ring_addr_phys +
> fcw_offset;
> > +		desc->req.data_ptrs[0].blen = ACC100_FCW_TD_BLEN;
> > +		ret = acc100_dma_desc_td_fill(op, &desc->req, &input,
> > +				h_output, s_output, &in_offset,
> &h_out_offset,
> > +				&s_out_offset, &h_out_length,
> &s_out_length,
> > +				&mbuf_total_left, &seg_total_left, r);
> > +
> > +		if (unlikely(ret < 0))
> > +			return ret;
> > +
> > +		/* Hard output */
> > +		mbuf_append(h_output_head, h_output, h_out_length);
> > +
> > +		/* Soft output */
> > +		if (check_bit(op->turbo_dec.op_flags,
> > +				RTE_BBDEV_TURBO_SOFT_OUTPUT))
> > +			mbuf_append(s_output_head, s_output,
> s_out_length);
> > +
> > +		/* Set total number of CBs in TB */
> > +		desc->req.cbs_in_tb = cbs_in_tb;
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +		rte_memdump(stderr, "FCW", &desc->req.fcw_td,
> > +				sizeof(desc->req.fcw_td) - 8);
> > +		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> #endif
> > +
> > +		if (seg_total_left == 0) {
> > +			/* Go to the next mbuf */
> > +			input = input->next;
> > +			in_offset = 0;
> > +			h_output = h_output->next;
> > +			h_out_offset = 0;
> > +
> > +			if (check_bit(op->turbo_dec.op_flags,
> > +					RTE_BBDEV_TURBO_SOFT_OUTPUT))
> {
> > +				s_output = s_output->next;
> > +				s_out_offset = 0;
> > +			}
> > +		}
> > +
> > +		total_enqueued_cbs++;
> > +		current_enqueued_cbs++;
> > +		r++;
> > +	}
> > +
> > +	if (unlikely(desc == NULL))
> > +		return current_enqueued_cbs;
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +	/* Check if any CBs left for processing */
> > +	if (mbuf_total_left != 0) {
> > +		rte_bbdev_log(ERR,
> > +				"Some data still left for processing:
> mbuf_total_left = %u",
> > +				mbuf_total_left);
> > +		return -EINVAL;
> > +	}
> > +#endif
> > +	/* Set SDone on last CB descriptor for TB mode */
> > +	desc->req.sdone_enable = 1;
> > +	desc->req.irq_enable = q->irq_enable;
> > +
> > +	return current_enqueued_cbs;
> > +}
> >
> >  /* Calculates number of CBs in processed encoder TB based on 'r' and
> input
> >   * length.
> > @@ -1893,6 +2498,45 @@
> >  	return cbs_in_tb;
> >  }
> >
> > +/* Enqueue encode operations for ACC100 device in CB mode. */ static
> > +uint16_t acc100_enqueue_enc_cb(struct rte_bbdev_queue_data
> *q_data,
> > +		struct rte_bbdev_enc_op **ops, uint16_t num) {
> > +	struct acc100_queue *q = q_data->queue_private;
> > +	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q-
> >sw_ring_head;
> > +	uint16_t i;
> > +	union acc100_dma_desc *desc;
> > +	int ret;
> > +
> > +	for (i = 0; i < num; ++i) {
> > +		/* Check if there is space available for further processing */
> > +		if (unlikely(avail - 1 < 0))
> > +			break;
> > +		avail -= 1;
> > +
> > +		ret = enqueue_enc_one_op_cb(q, ops[i], i);
> > +		if (ret < 0)
> > +			break;
> > +	}
> > +
> > +	if (unlikely(i == 0))
> > +		return 0; /* Nothing to enqueue */
> > +
> > +	/* Set SDone in last CB in enqueued ops for CB mode */
> > +	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
> > +			& q->sw_ring_wrap_mask);
> > +	desc->req.sdone_enable = 1;
> > +	desc->req.irq_enable = q->irq_enable;
> > +
> > +	acc100_dma_enqueue(q, i, &q_data->queue_stats);
> > +
> > +	/* Update stats */
> > +	q_data->queue_stats.enqueued_count += i;
> > +	q_data->queue_stats.enqueue_err_count += num - i;
> > +	return i;
> > +}
> > +
> >  /* Check we can mux encode operations with common FCW */  static
> > inline bool  check_mux(struct rte_bbdev_enc_op **ops, uint16_t num) {
> > @@ -1960,6 +2604,52 @@
> >  	return i;
> >  }
> >
> > +/* Enqueue encode operations for ACC100 device in TB mode. */ static
> > +uint16_t acc100_enqueue_enc_tb(struct rte_bbdev_queue_data
> *q_data,
> > +		struct rte_bbdev_enc_op **ops, uint16_t num) {
> > +	struct acc100_queue *q = q_data->queue_private;
> > +	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q-
> >sw_ring_head;
> > +	uint16_t i, enqueued_cbs = 0;
> > +	uint8_t cbs_in_tb;
> > +	int ret;
> > +
> > +	for (i = 0; i < num; ++i) {
> > +		cbs_in_tb = get_num_cbs_in_tb_enc(&ops[i]->turbo_enc);
> > +		/* Check if there is space available for further processing */
> > +		if (unlikely(avail - cbs_in_tb < 0))
> > +			break;
> > +		avail -= cbs_in_tb;
> > +
> > +		ret = enqueue_enc_one_op_tb(q, ops[i], enqueued_cbs,
> cbs_in_tb);
> > +		if (ret < 0)
> > +			break;
> > +		enqueued_cbs += ret;
> > +	}
> > +
> other similar functions have a (i == 0) check here.

ok

> > +	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
> > +
> > +	/* Update stats */
> > +	q_data->queue_stats.enqueued_count += i;
> > +	q_data->queue_stats.enqueue_err_count += num - i;
> > +
> > +	return i;
> > +}
> > +
> > +/* Enqueue encode operations for ACC100 device. */ static uint16_t
> > +acc100_enqueue_enc(struct rte_bbdev_queue_data *q_data,
> > +		struct rte_bbdev_enc_op **ops, uint16_t num) {
> > +	if (unlikely(num == 0))
> > +		return 0;
> num == 0 check should move into the tb/cb functions

same comment on other patch, why not catch it early?

> > +	if (ops[0]->turbo_enc.code_block_mode == 0)
> > +		return acc100_enqueue_enc_tb(q_data, ops, num);
> > +	else
> > +		return acc100_enqueue_enc_cb(q_data, ops, num); }
> > +
> >  /* Enqueue encode operations for ACC100 device. */  static uint16_t
> > acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data, @@
> > -1967,7 +2657,51 @@  {
> >  	if (unlikely(num == 0))
> >  		return 0;
> > -	return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
> > +	if (ops[0]->ldpc_enc.code_block_mode == 0)
> > +		return acc100_enqueue_enc_tb(q_data, ops, num);
> > +	else
> > +		return acc100_enqueue_ldpc_enc_cb(q_data, ops, num); }
> > +
> > +
> > +/* Enqueue decode operations for ACC100 device in CB mode */ static
> > +uint16_t acc100_enqueue_dec_cb(struct rte_bbdev_queue_data
> *q_data,
> > +		struct rte_bbdev_dec_op **ops, uint16_t num) {
> 
> Seems like the 10th variant of a similar function could these be combined to
> fewer functions ?
> 
> Maybe by passing in a function pointer to the enqueue_one_dec_one* that
> does the work ?

They have variants tied to the specific operation and its constraints,
so it is not obvious that a refactor would add value. 


> 
> > +	struct acc100_queue *q = q_data->queue_private;
> > +	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q-
> >sw_ring_head;
> > +	uint16_t i;
> > +	union acc100_dma_desc *desc;
> > +	int ret;
> > +
> > +	for (i = 0; i < num; ++i) {
> > +		/* Check if there is space available for further processing */
> > +		if (unlikely(avail - 1 < 0))
> > +			break;
> > +		avail -= 1;
> > +
> > +		ret = enqueue_dec_one_op_cb(q, ops[i], i);
> > +		if (ret < 0)
> > +			break;
> > +	}
> > +
> > +	if (unlikely(i == 0))
> > +		return 0; /* Nothing to enqueue */
> > +
> > +	/* Set SDone in last CB in enqueued ops for CB mode */
> > +	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
> > +			& q->sw_ring_wrap_mask);
> > +	desc->req.sdone_enable = 1;
> > +	desc->req.irq_enable = q->irq_enable;
> > +
> > +	acc100_dma_enqueue(q, i, &q_data->queue_stats);
> > +
> > +	/* Update stats */
> > +	q_data->queue_stats.enqueued_count += i;
> > +	q_data->queue_stats.enqueue_err_count += num - i;
> > +
> > +	return i;
> >  }
> >
> >  /* Check we can mux encode operations with common FCW */ @@ -
> 2065,6
> > +2799,53 @@
> >  	return i;
> >  }
> >
> > +
> > +/* Enqueue decode operations for ACC100 device in TB mode */ static
> > +uint16_t acc100_enqueue_dec_tb(struct rte_bbdev_queue_data
> *q_data,
> > +		struct rte_bbdev_dec_op **ops, uint16_t num) {
> 11th ;)
> > +	struct acc100_queue *q = q_data->queue_private;
> > +	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q-
> >sw_ring_head;
> > +	uint16_t i, enqueued_cbs = 0;
> > +	uint8_t cbs_in_tb;
> > +	int ret;
> > +
> > +	for (i = 0; i < num; ++i) {
> > +		cbs_in_tb = get_num_cbs_in_tb_dec(&ops[i]->turbo_dec);
> > +		/* Check if there is space available for further processing */
> > +		if (unlikely(avail - cbs_in_tb < 0))
> > +			break;
> > +		avail -= cbs_in_tb;
> > +
> > +		ret = enqueue_dec_one_op_tb(q, ops[i], enqueued_cbs,
> cbs_in_tb);
> > +		if (ret < 0)
> > +			break;
> > +		enqueued_cbs += ret;
> > +	}
> > +
> > +	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
> > +
> > +	/* Update stats */
> > +	q_data->queue_stats.enqueued_count += i;
> > +	q_data->queue_stats.enqueue_err_count += num - i;
> > +
> > +	return i;
> > +}
> > +
> > +/* Enqueue decode operations for ACC100 device. */ static uint16_t
> > +acc100_enqueue_dec(struct rte_bbdev_queue_data *q_data,
> > +		struct rte_bbdev_dec_op **ops, uint16_t num) {
> > +	if (unlikely(num == 0))
> > +		return 0;
> similar move the num == 0 check to the tb/cb functions.

same comment

> > +	if (ops[0]->turbo_dec.code_block_mode == 0)
> > +		return acc100_enqueue_dec_tb(q_data, ops, num);
> > +	else
> > +		return acc100_enqueue_dec_cb(q_data, ops, num); }
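The suggested refactor — dropping the wrapper's num == 0 guard into the tb/cb workers so each entry point is safe on its own — can be sketched as follows (the names and bodies are illustrative stand-ins, not the PMD's real code):

```c
#include <stdint.h>

/* Worker for TB mode: the num == 0 guard now lives here. */
static uint16_t enqueue_tb(uint16_t num)
{
	if (num == 0)
		return 0;
	return num;	/* stand-in for the real enqueue loop */
}

/* Worker for CB mode: same guard, same shape. */
static uint16_t enqueue_cb(uint16_t num)
{
	if (num == 0)
		return 0;
	return num;
}

/* The wrapper is reduced to the code_block_mode dispatch. */
static uint16_t enqueue_dec(int code_block_mode, uint16_t num)
{
	if (code_block_mode == 0)
		return enqueue_tb(num);
	return enqueue_cb(num);
}
```

With the guard in the workers, both the dispatch wrapper and any future direct callers of the tb/cb paths behave correctly on an empty burst.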
> > +
> >  /* Enqueue decode operations for ACC100 device. */  static uint16_t
> > acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
> > @@ -2388,6 +3169,51 @@
> >  	return cb_idx;
> >  }
> >
> > +/* Dequeue encode operations from ACC100 device. */ static uint16_t
> > +acc100_dequeue_enc(struct rte_bbdev_queue_data *q_data,
> > +		struct rte_bbdev_enc_op **ops, uint16_t num) {
> > +	struct acc100_queue *q = q_data->queue_private;
> > +	uint16_t dequeue_num;
> > +	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
> > +	uint32_t aq_dequeued = 0;
> > +	uint16_t i;
> > +	uint16_t dequeued_cbs = 0;
> > +	struct rte_bbdev_enc_op *op;
> > +	int ret;
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +	if (unlikely(ops == 0 && q == NULL))
> 
> ops is a pointer, so it should be compared with NULL.
> 
> The && likely needs to be ||.
> 
> Maybe print out a message so the caller knows something went wrong.

ok
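A minimal sketch of the corrected guard — NULL comparison, || instead of && so either bad argument bails out, and a message so the caller knows — with fprintf standing in for rte_bbdev_log:

```c
#include <stddef.h>
#include <stdio.h>

/* Returns 1 when the arguments are usable, 0 otherwise.
 * The pointer types are opaque here; the real code checks
 * struct rte_bbdev_dec_op **ops and struct acc100_queue *q. */
static int check_dequeue_args(const void *ops, const void *q)
{
	if (ops == NULL || q == NULL) {
		fprintf(stderr, "dequeue called with NULL %s\n",
				ops == NULL ? "ops" : "queue");
		return 0;	/* nothing can be dequeued */
	}
	return 1;
}
```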

> 
> > +		return 0;
> > +#endif
> > +
> > +	dequeue_num = (avail < num) ? avail : num;
> > +
> > +	for (i = 0; i < dequeue_num; ++i) {
> > +		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> > +			& q->sw_ring_wrap_mask))->req.op_addr;
> > +		if (op->turbo_enc.code_block_mode == 0)
> > +			ret = dequeue_enc_one_op_tb(q, &ops[i],
> dequeued_cbs,
> > +					&aq_dequeued);
> > +		else
> > +			ret = dequeue_enc_one_op_cb(q, &ops[i],
> dequeued_cbs,
> > +					&aq_dequeued);
> > +
> > +		if (ret < 0)
> > +			break;
> > +		dequeued_cbs += ret;
> > +	}
> > +
> > +	q->aq_dequeued += aq_dequeued;
> > +	q->sw_ring_tail += dequeued_cbs;
> > +
> > +	/* Update dequeue stats */
> > +	q_data->queue_stats.dequeued_count += i;
> > +
> > +	return i;
> > +}
> > +
> >  /* Dequeue LDPC encode operations from ACC100 device. */  static
> > uint16_t  acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data
> *q_data,
> > @@ -2426,6 +3252,52 @@
> >  	return dequeued_cbs;
> >  }
> >
> > +
> > +/* Dequeue decode operations from ACC100 device. */ static uint16_t
> > +acc100_dequeue_dec(struct rte_bbdev_queue_data *q_data,
> > +		struct rte_bbdev_dec_op **ops, uint16_t num) {
> 
> very similar to enc function above, consider how to combine them to a
> single function.
> 
> Tom
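One way the two near-identical enc/dec loops could be folded together is a shared helper that takes the per-op work as a callback, keeping the ring bookkeeping in one place. This sketch reduces the ring state to bare counters and is not the driver's actual structures:

```c
#include <stdint.h>

/* Per-op dispatch supplied by the enc and dec wrappers;
 * returns the number of CBs consumed, or < 0 on error. */
typedef int (*dequeue_one_fn)(uint16_t idx);

/* Shared loop: clamp to what is available, run the per-op
 * callback, stop on the first failure. */
static uint16_t
dequeue_common(uint32_t avail, uint16_t num, dequeue_one_fn dequeue_one)
{
	uint16_t dequeue_num = (avail < num) ? (uint16_t)avail : num;
	uint16_t i;

	for (i = 0; i < dequeue_num; ++i) {
		if (dequeue_one(i) < 0)
			break;
	}
	return i;	/* ops actually dequeued */
}

/* Example callbacks for illustration only. */
static int dequeue_one_ok(uint16_t idx)   { (void)idx; return 1; }
static int dequeue_one_fail(uint16_t idx) { return idx < 2 ? 1 : -1; }
```

The enc and dec wrappers would then differ only in the callback they pass and in how they fetch the op's code_block_mode.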
> 
> > +	struct acc100_queue *q = q_data->queue_private;
> > +	uint16_t dequeue_num;
> > +	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
> > +	uint32_t aq_dequeued = 0;
> > +	uint16_t i;
> > +	uint16_t dequeued_cbs = 0;
> > +	struct rte_bbdev_dec_op *op;
> > +	int ret;
> > +
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +	if (unlikely(ops == 0 && q == NULL))
> > +		return 0;
> > +#endif
> > +
> > +	dequeue_num = (avail < num) ? avail : num;
> > +
> > +	for (i = 0; i < dequeue_num; ++i) {
> > +		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> > +			& q->sw_ring_wrap_mask))->req.op_addr;
> > +		if (op->turbo_dec.code_block_mode == 0)
> > +			ret = dequeue_dec_one_op_tb(q, &ops[i],
> dequeued_cbs,
> > +					&aq_dequeued);
> > +		else
> > +			ret = dequeue_dec_one_op_cb(q_data, q, &ops[i],
> > +					dequeued_cbs, &aq_dequeued);
> > +
> > +		if (ret < 0)
> > +			break;
> > +		dequeued_cbs += ret;
> > +	}
> > +
> > +	q->aq_dequeued += aq_dequeued;
> > +	q->sw_ring_tail += dequeued_cbs;
> > +
> > +	/* Update dequeue stats */
> > +	q_data->queue_stats.dequeued_count += i;
> > +
> > +	return i;
> > +}
> > +
> >  /* Dequeue decode operations from ACC100 device. */  static uint16_t
> > acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
> > @@ -2479,6 +3351,10 @@
> >  	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
> >
> >  	dev->dev_ops = &acc100_bbdev_ops;
> > +	dev->enqueue_enc_ops = acc100_enqueue_enc;
> > +	dev->enqueue_dec_ops = acc100_enqueue_dec;
> > +	dev->dequeue_enc_ops = acc100_dequeue_enc;
> > +	dev->dequeue_dec_ops = acc100_dequeue_dec;
> >  	dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
> >  	dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
> >  	dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v9 09/10] baseband/acc100: add debug function to validate input
  2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 09/10] baseband/acc100: add debug function to validate input Nicolas Chautru
@ 2020-09-30 19:16       ` Tom Rix
  2020-09-30 19:53         ` Chautru, Nicolas
  0 siblings, 1 reply; 213+ messages in thread
From: Tom Rix @ 2020-09-30 19:16 UTC (permalink / raw)
  To: Nicolas Chautru, dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu


On 9/28/20 5:29 PM, Nicolas Chautru wrote:
> Debug functions to validate the input API from user
> Only enabled in DEBUG mode at build time
>
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
> ---
>  drivers/baseband/acc100/rte_acc100_pmd.c | 424 +++++++++++++++++++++++++++++++
>  1 file changed, 424 insertions(+)
>
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
> index b6d9e7c..3589814 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> @@ -1945,6 +1945,231 @@
>  
>  }
>  
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +/* Validates turbo encoder parameters */
> +static inline int
> +validate_enc_op(struct rte_bbdev_enc_op *op)
> +{
> +	struct rte_bbdev_op_turbo_enc *turbo_enc = &op->turbo_enc;
> +	struct rte_bbdev_op_enc_turbo_cb_params *cb = NULL;
> +	struct rte_bbdev_op_enc_turbo_tb_params *tb = NULL;
> +	uint16_t kw, kw_neg, kw_pos;
> +
> +	if (op->mempool == NULL) {
> +		rte_bbdev_log(ERR, "Invalid mempool pointer");
> +		return -1;
> +	}
> +	if (turbo_enc->input.data == NULL) {
> +		rte_bbdev_log(ERR, "Invalid input pointer");
> +		return -1;
> +	}
> +	if (turbo_enc->output.data == NULL) {
> +		rte_bbdev_log(ERR, "Invalid output pointer");
> +		return -1;
> +	}
> +	if (turbo_enc->rv_index > 3) {
> +		rte_bbdev_log(ERR,
> +				"rv_index (%u) is out of range 0 <= value <= 3",
> +				turbo_enc->rv_index);
> +		return -1;
> +	}
> +	if (turbo_enc->code_block_mode != 0 &&
> +			turbo_enc->code_block_mode != 1) {
> +		rte_bbdev_log(ERR,
> +				"code_block_mode (%u) is out of range 0 <= value <= 1",
> +				turbo_enc->code_block_mode);
> +		return -1;
> +	}
> +
> +	if (turbo_enc->code_block_mode == 0) {
> +		tb = &turbo_enc->tb_params;
> +		if ((tb->k_neg < RTE_BBDEV_TURBO_MIN_CB_SIZE
> +				|| tb->k_neg > RTE_BBDEV_TURBO_MAX_CB_SIZE)
> +				&& tb->c_neg > 0) {
> +			rte_bbdev_log(ERR,
> +					"k_neg (%u) is out of range %u <= value <= %u",
> +					tb->k_neg, RTE_BBDEV_TURBO_MIN_CB_SIZE,
> +					RTE_BBDEV_TURBO_MAX_CB_SIZE);
> +			return -1;
> +		}
> +		if (tb->k_pos < RTE_BBDEV_TURBO_MIN_CB_SIZE
> +				|| tb->k_pos > RTE_BBDEV_TURBO_MAX_CB_SIZE) {
> +			rte_bbdev_log(ERR,
> +					"k_pos (%u) is out of range %u <= value <= %u",
> +					tb->k_pos, RTE_BBDEV_TURBO_MIN_CB_SIZE,
> +					RTE_BBDEV_TURBO_MAX_CB_SIZE);
> +			return -1;
> +		}
> +		if (tb->c_neg > (RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1))
> +			rte_bbdev_log(ERR,
> +					"c_neg (%u) is out of range 0 <= value <= %u",
> +					tb->c_neg,
> +					RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1);
> +		if (tb->c < 1 || tb->c > RTE_BBDEV_TURBO_MAX_CODE_BLOCKS) {
> +			rte_bbdev_log(ERR,
> +					"c (%u) is out of range 1 <= value <= %u",
> +					tb->c, RTE_BBDEV_TURBO_MAX_CODE_BLOCKS);
> +			return -1;
> +		}
> +		if (tb->cab > tb->c) {
> +			rte_bbdev_log(ERR,
> +					"cab (%u) is greater than c (%u)",
> +					tb->cab, tb->c);
> +			return -1;
> +		}
> +		if ((tb->ea < RTE_BBDEV_TURBO_MIN_CB_SIZE || (tb->ea % 2))
> +				&& tb->r < tb->cab) {
> +			rte_bbdev_log(ERR,
> +					"ea (%u) is less than %u or it is not even",
> +					tb->ea, RTE_BBDEV_TURBO_MIN_CB_SIZE);
> +			return -1;
> +		}
> +		if ((tb->eb < RTE_BBDEV_TURBO_MIN_CB_SIZE || (tb->eb % 2))
> +				&& tb->c > tb->cab) {
> +			rte_bbdev_log(ERR,
> +					"eb (%u) is less than %u or it is not even",
> +					tb->eb, RTE_BBDEV_TURBO_MIN_CB_SIZE);
> +			return -1;
> +		}
> +
> +		kw_neg = 3 * RTE_ALIGN_CEIL(tb->k_neg + 4,
> +					RTE_BBDEV_TURBO_C_SUBBLOCK);
> +		if (tb->ncb_neg < tb->k_neg || tb->ncb_neg > kw_neg) {
> +			rte_bbdev_log(ERR,
> +					"ncb_neg (%u) is out of range (%u) k_neg <= value <= (%u) kw_neg",
> +					tb->ncb_neg, tb->k_neg, kw_neg);
> +			return -1;
> +		}
> +
> +		kw_pos = 3 * RTE_ALIGN_CEIL(tb->k_pos + 4,
> +					RTE_BBDEV_TURBO_C_SUBBLOCK);
> +		if (tb->ncb_pos < tb->k_pos || tb->ncb_pos > kw_pos) {
> +			rte_bbdev_log(ERR,
> +					"ncb_pos (%u) is out of range (%u) k_pos <= value <= (%u) kw_pos",
> +					tb->ncb_pos, tb->k_pos, kw_pos);
> +			return -1;
> +		}
> +		if (tb->r > (tb->c - 1)) {
> +			rte_bbdev_log(ERR,
> +					"r (%u) is greater than c - 1 (%u)",
> +					tb->r, tb->c - 1);
> +			return -1;
> +		}
> +	} else {
> +		cb = &turbo_enc->cb_params;
> +		if (cb->k < RTE_BBDEV_TURBO_MIN_CB_SIZE
> +				|| cb->k > RTE_BBDEV_TURBO_MAX_CB_SIZE) {
> +			rte_bbdev_log(ERR,
> +					"k (%u) is out of range %u <= value <= %u",
> +					cb->k, RTE_BBDEV_TURBO_MIN_CB_SIZE,
> +					RTE_BBDEV_TURBO_MAX_CB_SIZE);
> +			return -1;
> +		}
> +
> +		if (cb->e < RTE_BBDEV_TURBO_MIN_CB_SIZE || (cb->e % 2)) {
> +			rte_bbdev_log(ERR,
> +					"e (%u) is less than %u or it is not even",
> +					cb->e, RTE_BBDEV_TURBO_MIN_CB_SIZE);
> +			return -1;
> +		}
> +
> +		kw = RTE_ALIGN_CEIL(cb->k + 4, RTE_BBDEV_TURBO_C_SUBBLOCK) * 3;
> +		if (cb->ncb < cb->k || cb->ncb > kw) {
> +			rte_bbdev_log(ERR,
> +					"ncb (%u) is out of range (%u) k <= value <= (%u) kw",
> +					cb->ncb, cb->k, kw);
> +			return -1;
> +		}
> +	}
> +
> +	return 0;
> +}
> +/* Validates LDPC encoder parameters */
> +static inline int
> +validate_ldpc_enc_op(struct rte_bbdev_enc_op *op)
> +{
> +	struct rte_bbdev_op_ldpc_enc *ldpc_enc = &op->ldpc_enc;
> +
> +	if (op->mempool == NULL) {
> +		rte_bbdev_log(ERR, "Invalid mempool pointer");
> +		return -1;
> +	}
> +	if (ldpc_enc->input.data == NULL) {
> +		rte_bbdev_log(ERR, "Invalid input pointer");
> +		return -1;
> +	}
> +	if (ldpc_enc->output.data == NULL) {
> +		rte_bbdev_log(ERR, "Invalid output pointer");
> +		return -1;
> +	}
> +	if (ldpc_enc->input.length >
> +			RTE_BBDEV_LDPC_MAX_CB_SIZE >> 3) {
> +		rte_bbdev_log(ERR, "CB size (%u) is too big, max: %d",
> +				ldpc_enc->input.length,
> +				RTE_BBDEV_LDPC_MAX_CB_SIZE);
> +		return -1;
> +	}
> +	if ((ldpc_enc->basegraph > 2) || (ldpc_enc->basegraph == 0)) {
> +		rte_bbdev_log(ERR,
> +				"BG (%u) is out of range 1 <= value <= 2",
> +				ldpc_enc->basegraph);
> +		return -1;
> +	}
> +	if (ldpc_enc->rv_index > 3) {
> +		rte_bbdev_log(ERR,
> +				"rv_index (%u) is out of range 0 <= value <= 3",
> +				ldpc_enc->rv_index);
> +		return -1;
> +	}
> +	if (ldpc_enc->code_block_mode > 1) {
> +		rte_bbdev_log(ERR,
> +				"code_block_mode (%u) is out of range 0 <= value <= 1",
> +				ldpc_enc->code_block_mode);
> +		return -1;
> +	}
> +
> +	return 0;
> +}
> +
> +/* Validates LDPC decoder parameters */
> +static inline int
> +validate_ldpc_dec_op(struct rte_bbdev_dec_op *op)
> +{
> +	struct rte_bbdev_op_ldpc_dec *ldpc_dec = &op->ldpc_dec;
> +
> +	if (op->mempool == NULL) {
> +		rte_bbdev_log(ERR, "Invalid mempool pointer");
> +		return -1;
> +	}
> +	if ((ldpc_dec->basegraph > 2) || (ldpc_dec->basegraph == 0)) {
> +		rte_bbdev_log(ERR,
> +				"BG (%u) is out of range 1 <= value <= 2",
> +				ldpc_dec->basegraph);
> +		return -1;
> +	}
> +	if (ldpc_dec->iter_max == 0) {
> +		rte_bbdev_log(ERR,
> +				"iter_max (%u) is equal to 0",
> +				ldpc_dec->iter_max);
> +		return -1;
> +	}
> +	if (ldpc_dec->rv_index > 3) {
> +		rte_bbdev_log(ERR,
> +				"rv_index (%u) is out of range 0 <= value <= 3",
> +				ldpc_dec->rv_index);
> +		return -1;
> +	}
> +	if (ldpc_dec->code_block_mode > 1) {
> +		rte_bbdev_log(ERR,
> +				"code_block_mode (%u) is out of range 0 <= value <= 1",
> +				ldpc_dec->code_block_mode);
> +		return -1;
> +	}
> +
> +	return 0;
> +}
> +#endif
Could have an #else with stubs so that callers do not have to bother with #ifdef decorations.
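The suggested pattern, sketched with a stand-in validator: debug builds run the checks, release builds get a no-op stub that always passes, and the enqueue paths call the function unconditionally with no #ifdef at the call site.

```c
/* Illustrative only: 'fail' stands in for an op that would not
 * pass validation. The real function would take the op struct. */
#ifdef RTE_LIBRTE_BBDEV_DEBUG
static inline int validate_enc_op_checked(int fail)
{
	return fail ? -1 : 0;	/* stand-in for the real checks */
}
#else
static inline int validate_enc_op_checked(int fail)
{
	(void)fail;
	return 0;		/* release build: always passes */
}
#endif
```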
> +
>  /* Enqueue one encode operations for ACC100 device in CB mode */
>  static inline int
>  enqueue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
> @@ -1956,6 +2181,14 @@
>  		seg_total_left;
>  	struct rte_mbuf *input, *output_head, *output;
>  
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +	/* Validate op structure */
> +	if (validate_enc_op(op) == -1) {
> +		rte_bbdev_log(ERR, "Turbo encoder validation failed");
> +		return -EINVAL;
> +	}
> +#endif
> +
>  	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
>  			& q->sw_ring_wrap_mask);
>  	desc = q->ring_addr + desc_idx;
> @@ -2008,6 +2241,14 @@
>  	uint16_t  in_length_in_bytes;
>  	struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
>  
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +	/* Validate op structure */
> +	if (validate_ldpc_enc_op(ops[0]) == -1) {
> +		rte_bbdev_log(ERR, "LDPC encoder validation failed");
> +		return -EINVAL;
> +	}
> +#endif
> +
>  	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
>  			& q->sw_ring_wrap_mask);
>  	desc = q->ring_addr + desc_idx;
> @@ -2065,6 +2306,14 @@
>  		seg_total_left;
>  	struct rte_mbuf *input, *output_head, *output;
>  
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +	/* Validate op structure */
> +	if (validate_ldpc_enc_op(op) == -1) {
> +		rte_bbdev_log(ERR, "LDPC encoder validation failed");
> +		return -EINVAL;
> +	}
> +#endif
> +
>  	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
>  			& q->sw_ring_wrap_mask);
>  	desc = q->ring_addr + desc_idx;
> @@ -2119,6 +2368,14 @@
>  	struct rte_mbuf *input, *output_head, *output;
>  	uint16_t current_enqueued_cbs = 0;
>  
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +	/* Validate op structure */
> +	if (validate_enc_op(op) == -1) {
> +		rte_bbdev_log(ERR, "Turbo encoder validation failed");
> +		return -EINVAL;
> +	}
> +#endif
> +
>  	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
>  			& q->sw_ring_wrap_mask);
>  	desc = q->ring_addr + desc_idx;
> @@ -2191,6 +2448,142 @@
>  	return current_enqueued_cbs;
>  }
>  
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +/* Validates turbo decoder parameters */
> +static inline int
> +validate_dec_op(struct rte_bbdev_dec_op *op)
> +{

This and (guessing) the later dec validation share similar code with the enc validation; consider a function for the common parts.

Tom
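A sketch of what a shared helper for the duplicated TB checks might look like. The 40/6144 bounds assume the standard 3GPP turbo code-block limits behind RTE_BBDEV_TURBO_MIN/MAX_CB_SIZE, and the struct is trimmed to the fields the enc and dec validators both check:

```c
#include <stdint.h>
#include <stdio.h>

#define TURBO_MIN_CB_SIZE 40	/* assumed 3GPP turbo K min */
#define TURBO_MAX_CB_SIZE 6144	/* assumed 3GPP turbo K max */

/* Fields shared by the enc and dec TB parameter structs. */
struct tb_common {
	uint16_t k_neg, k_pos;
	uint8_t c, c_neg, cab;
};

/* Common checks; each caller adds its direction-specific ones. */
static int validate_tb_common(const struct tb_common *tb)
{
	if ((tb->k_neg < TURBO_MIN_CB_SIZE ||
			tb->k_neg > TURBO_MAX_CB_SIZE) && tb->c_neg > 0) {
		fprintf(stderr, "k_neg (%u) is out of range\n", tb->k_neg);
		return -1;
	}
	if (tb->cab > tb->c) {
		fprintf(stderr, "cab (%u) is greater than c (%u)\n",
				tb->cab, tb->c);
		return -1;
	}
	return 0;
}
```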

> +	struct rte_bbdev_op_turbo_dec *turbo_dec = &op->turbo_dec;
> +	struct rte_bbdev_op_dec_turbo_cb_params *cb = NULL;
> +	struct rte_bbdev_op_dec_turbo_tb_params *tb = NULL;
> +
> +	if (op->mempool == NULL) {
> +		rte_bbdev_log(ERR, "Invalid mempool pointer");
> +		return -1;
> +	}
> +	if (turbo_dec->input.data == NULL) {
> +		rte_bbdev_log(ERR, "Invalid input pointer");
> +		return -1;
> +	}
> +	if (turbo_dec->hard_output.data == NULL) {
> +		rte_bbdev_log(ERR, "Invalid hard_output pointer");
> +		return -1;
> +	}
> +	if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT) &&
> +			turbo_dec->soft_output.data == NULL) {
> +		rte_bbdev_log(ERR, "Invalid soft_output pointer");
> +		return -1;
> +	}
> +	if (turbo_dec->rv_index > 3) {
> +		rte_bbdev_log(ERR,
> +				"rv_index (%u) is out of range 0 <= value <= 3",
> +				turbo_dec->rv_index);
> +		return -1;
> +	}
> +	if (turbo_dec->iter_min < 1) {
> +		rte_bbdev_log(ERR,
> +				"iter_min (%u) is less than 1",
> +				turbo_dec->iter_min);
> +		return -1;
> +	}
> +	if (turbo_dec->iter_max <= 2) {
> +		rte_bbdev_log(ERR,
> +				"iter_max (%u) is less than or equal to 2",
> +				turbo_dec->iter_max);
> +		return -1;
> +	}
> +	if (turbo_dec->iter_min > turbo_dec->iter_max) {
> +		rte_bbdev_log(ERR,
> +				"iter_min (%u) is greater than iter_max (%u)",
> +				turbo_dec->iter_min, turbo_dec->iter_max);
> +		return -1;
> +	}
> +	if (turbo_dec->code_block_mode != 0 &&
> +			turbo_dec->code_block_mode != 1) {
> +		rte_bbdev_log(ERR,
> +				"code_block_mode (%u) is out of range 0 <= value <= 1",
> +				turbo_dec->code_block_mode);
> +		return -1;
> +	}
> +
> +	if (turbo_dec->code_block_mode == 0) {
> +		tb = &turbo_dec->tb_params;
> +		if ((tb->k_neg < RTE_BBDEV_TURBO_MIN_CB_SIZE
> +				|| tb->k_neg > RTE_BBDEV_TURBO_MAX_CB_SIZE)
> +				&& tb->c_neg > 0) {
> +			rte_bbdev_log(ERR,
> +					"k_neg (%u) is out of range %u <= value <= %u",
> +					tb->k_neg, RTE_BBDEV_TURBO_MIN_CB_SIZE,
> +					RTE_BBDEV_TURBO_MAX_CB_SIZE);
> +			return -1;
> +		}
> +		if ((tb->k_pos < RTE_BBDEV_TURBO_MIN_CB_SIZE
> +				|| tb->k_pos > RTE_BBDEV_TURBO_MAX_CB_SIZE)
> +				&& tb->c > tb->c_neg) {
> +			rte_bbdev_log(ERR,
> +					"k_pos (%u) is out of range %u <= value <= %u",
> +					tb->k_pos, RTE_BBDEV_TURBO_MIN_CB_SIZE,
> +					RTE_BBDEV_TURBO_MAX_CB_SIZE);
> +			return -1;
> +		}
> +		if (tb->c_neg > (RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1))
> +			rte_bbdev_log(ERR,
> +					"c_neg (%u) is out of range 0 <= value <= %u",
> +					tb->c_neg,
> +					RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1);
> +		if (tb->c < 1 || tb->c > RTE_BBDEV_TURBO_MAX_CODE_BLOCKS) {
> +			rte_bbdev_log(ERR,
> +					"c (%u) is out of range 1 <= value <= %u",
> +					tb->c, RTE_BBDEV_TURBO_MAX_CODE_BLOCKS);
> +			return -1;
> +		}
> +		if (tb->cab > tb->c) {
> +			rte_bbdev_log(ERR,
> +					"cab (%u) is greater than c (%u)",
> +					tb->cab, tb->c);
> +			return -1;
> +		}
> +		if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) &&
> +				(tb->ea < RTE_BBDEV_TURBO_MIN_CB_SIZE
> +						|| (tb->ea % 2))
> +				&& tb->cab > 0) {
> +			rte_bbdev_log(ERR,
> +					"ea (%u) is less than %u or it is not even",
> +					tb->ea, RTE_BBDEV_TURBO_MIN_CB_SIZE);
> +			return -1;
> +		}
> +		if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) &&
> +				(tb->eb < RTE_BBDEV_TURBO_MIN_CB_SIZE
> +						|| (tb->eb % 2))
> +				&& tb->c > tb->cab) {
> +			rte_bbdev_log(ERR,
> +					"eb (%u) is less than %u or it is not even",
> +					tb->eb, RTE_BBDEV_TURBO_MIN_CB_SIZE);
> +		}
> +	} else {
> +		cb = &turbo_dec->cb_params;
> +		if (cb->k < RTE_BBDEV_TURBO_MIN_CB_SIZE
> +				|| cb->k > RTE_BBDEV_TURBO_MAX_CB_SIZE) {
> +			rte_bbdev_log(ERR,
> +					"k (%u) is out of range %u <= value <= %u",
> +					cb->k, RTE_BBDEV_TURBO_MIN_CB_SIZE,
> +					RTE_BBDEV_TURBO_MAX_CB_SIZE);
> +			return -1;
> +		}
> +		if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) &&
> +				(cb->e < RTE_BBDEV_TURBO_MIN_CB_SIZE ||
> +				(cb->e % 2))) {
> +			rte_bbdev_log(ERR,
> +					"e (%u) is less than %u or it is not even",
> +					cb->e, RTE_BBDEV_TURBO_MIN_CB_SIZE);
> +			return -1;
> +		}
> +	}
> +
> +	return 0;
> +}
> +#endif
> +
>  /** Enqueue one decode operations for ACC100 device in CB mode */
>  static inline int
>  enqueue_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
> @@ -2203,6 +2596,14 @@
>  	struct rte_mbuf *input, *h_output_head, *h_output,
>  		*s_output_head, *s_output;
>  
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +	/* Validate op structure */
> +	if (validate_dec_op(op) == -1) {
> +		rte_bbdev_log(ERR, "Turbo decoder validation failed");
> +		return -EINVAL;
> +	}
> +#endif
> +
>  	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
>  			& q->sw_ring_wrap_mask);
>  	desc = q->ring_addr + desc_idx;
> @@ -2426,6 +2827,13 @@
>  		return ret;
>  	}
>  
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +	/* Validate op structure */
> +	if (validate_ldpc_dec_op(op) == -1) {
> +		rte_bbdev_log(ERR, "LDPC decoder validation failed");
> +		return -EINVAL;
> +	}
> +#endif
>  	union acc100_dma_desc *desc;
>  	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
>  			& q->sw_ring_wrap_mask);
> @@ -2521,6 +2929,14 @@
>  	struct rte_mbuf *input, *h_output_head, *h_output;
>  	uint16_t current_enqueued_cbs = 0;
>  
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +	/* Validate op structure */
> +	if (validate_ldpc_dec_op(op) == -1) {
> +		rte_bbdev_log(ERR, "LDPC decoder validation failed");
> +		return -EINVAL;
> +	}
> +#endif
> +
>  	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
>  			& q->sw_ring_wrap_mask);
>  	desc = q->ring_addr + desc_idx;
> @@ -2611,6 +3027,14 @@
>  		*s_output_head, *s_output;
>  	uint16_t current_enqueued_cbs = 0;
>  
> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> +	/* Validate op structure */
> +	if (validate_dec_op(op) == -1) {
> +		rte_bbdev_log(ERR, "Turbo decoder validation failed");
> +		return -EINVAL;
> +	}
> +#endif
> +
>  	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
>  			& q->sw_ring_wrap_mask);
>  	desc = q->ring_addr + desc_idx;


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v9 08/10] baseband/acc100: add interrupt support to PMD
  2020-09-30 19:03       ` Tom Rix
@ 2020-09-30 19:45         ` Chautru, Nicolas
  2020-10-01 16:05           ` Tom Rix
  0 siblings, 1 reply; 213+ messages in thread
From: Chautru, Nicolas @ 2020-09-30 19:45 UTC (permalink / raw)
  To: Tom Rix, dev, akhil.goyal
  Cc: Richardson, Bruce, Xu, Rosen, dave.burley, aidan.goddard, Yigit,
	Ferruh, Liu, Tianjiao

Hi Tom, 

> From: Tom Rix <trix@redhat.com>
> On 9/28/20 5:29 PM, Nicolas Chautru wrote:
> > Adding capability and functions to support MSI interrupts, callbacks
> > and info ring.
> >
> > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
> > ---
> >  drivers/baseband/acc100/rte_acc100_pmd.c | 288
> > ++++++++++++++++++++++++++++++-
> > drivers/baseband/acc100/rte_acc100_pmd.h |  15 ++
> >  2 files changed, 300 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> > b/drivers/baseband/acc100/rte_acc100_pmd.c
> > index 7d4c3df..b6d9e7c 100644
> > --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> > +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> > @@ -339,6 +339,213 @@
> >  	free_base_addresses(base_addrs, i);
> >  }
> >
> > +/*
> > + * Find queue_id of a device queue based on details from the Info Ring.
> > + * If a queue isn't found UINT16_MAX is returned.
> > + */
> > +static inline uint16_t
> > +get_queue_id_from_ring_info(struct rte_bbdev_data *data,
> > +		const union acc100_info_ring_data ring_data) {
> > +	uint16_t queue_id;
> > +
> > +	for (queue_id = 0; queue_id < data->num_queues; ++queue_id) {
> > +		struct acc100_queue *acc100_q =
> > +				data->queues[queue_id].queue_private;
> > +		if (acc100_q != NULL && acc100_q->aq_id == ring_data.aq_id
> &&
> > +				acc100_q->qgrp_id == ring_data.qg_id &&
> > +				acc100_q->vf_id == ring_data.vf_id)
> > +			return queue_id;
> 
> If num_queues is large, this linear search will be slow.
> 
> Consider changing the search algorithm.

This is not in the time-critical part of the code.


> 
> > +	}
> > +
> > +	return UINT16_MAX;
> The interrupt handlers that use this function do not do a great job of
> handling this error.

If that error actually happened, there is not much else that can be done except report the unexpected data.

> > +}
> > +
> > +/* Checks PF Info Ring to find the interrupt cause and handles it
> > +accordingly */ static inline void acc100_check_ir(struct
> > +acc100_device *acc100_dev) {
> > +	volatile union acc100_info_ring_data *ring_data;
> > +	uint16_t info_ring_head = acc100_dev->info_ring_head;
> > +	if (acc100_dev->info_ring == NULL)
> > +		return;
> > +
> > +	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
> > +			ACC100_INFO_RING_MASK);
> > +
> > +	while (ring_data->valid) {
> > +		if ((ring_data->int_nb <
> ACC100_PF_INT_DMA_DL_DESC_IRQ) || (
> > +				ring_data->int_nb >
> > +				ACC100_PF_INT_DMA_DL5G_DESC_IRQ))
> > +			rte_bbdev_log(WARNING, "InfoRing: ITR:%d
> Info:0x%x",
> > +				ring_data->int_nb, ring_data-
> >detailed_info);
> > +		/* Initialize Info Ring entry and move forward */
> > +		ring_data->val = 0;
> > +		info_ring_head++;
> > +		ring_data = acc100_dev->info_ring +
> > +				(info_ring_head &
> ACC100_INFO_RING_MASK);
> These three statements are common for the ring handling, consider a macro
> or inline function.

ok
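The repeated clear/advance/recompute sequence could be folded into an inline helper along these lines. The ring size and struct here are illustrative; the real code uses union acc100_info_ring_data and ACC100_INFO_RING_MASK:

```c
#include <stdint.h>

#define INFO_RING_NUM_ENTRIES 16	/* illustrative power of two */
#define INFO_RING_MASK (INFO_RING_NUM_ENTRIES - 1)

struct ring_entry { uint32_t val; };

/* Clear the current entry, bump the head, and return a pointer to
 * the next slot, so the three repeated statements live in one place. */
static inline struct ring_entry *
info_ring_advance(struct ring_entry *ring, uint16_t *head)
{
	ring[*head & INFO_RING_MASK].val = 0;
	(*head)++;
	return &ring[*head & INFO_RING_MASK];
}
```

The while loops in acc100_check_ir() and the interrupt handlers would then reduce to `ring_data = info_ring_advance(...)` at the bottom of each iteration.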

> > +	}
> > +}
> > +
> > +/* Checks PF Info Ring to find the interrupt cause and handles it
> > +accordingly */ static inline void acc100_pf_interrupt_handler(struct
> > +rte_bbdev *dev) {
> > +	struct acc100_device *acc100_dev = dev->data->dev_private;
> > +	volatile union acc100_info_ring_data *ring_data;
> > +	struct acc100_deq_intr_details deq_intr_det;
> > +
> > +	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
> > +			ACC100_INFO_RING_MASK);
> > +
> > +	while (ring_data->valid) {
> > +
> > +		rte_bbdev_log_debug(
> > +				"ACC100 PF Interrupt received, Info Ring
> data: 0x%x",
> > +				ring_data->val);
> > +
> > +		switch (ring_data->int_nb) {
> > +		case ACC100_PF_INT_DMA_DL_DESC_IRQ:
> > +		case ACC100_PF_INT_DMA_UL_DESC_IRQ:
> > +		case ACC100_PF_INT_DMA_UL5G_DESC_IRQ:
> > +		case ACC100_PF_INT_DMA_DL5G_DESC_IRQ:
> > +			deq_intr_det.queue_id =
> get_queue_id_from_ring_info(
> > +					dev->data, *ring_data);
> > +			if (deq_intr_det.queue_id == UINT16_MAX) {
> > +				rte_bbdev_log(ERR,
> > +						"Couldn't find queue: aq_id:
> %u, qg_id: %u, vf_id: %u",
> > +						ring_data->aq_id,
> > +						ring_data->qg_id,
> > +						ring_data->vf_id);
> > +				return;
> > +			}
> > +			rte_bbdev_pmd_callback_process(dev,
> > +					RTE_BBDEV_EVENT_DEQUEUE,
> &deq_intr_det);
> > +			break;
> > +		default:
> > +			rte_bbdev_pmd_callback_process(dev,
> > +					RTE_BBDEV_EVENT_ERROR, NULL);
> > +			break;
> > +		}
> > +
> > +		/* Initialize Info Ring entry and move forward */
> > +		ring_data->val = 0;
> > +		++acc100_dev->info_ring_head;
> > +		ring_data = acc100_dev->info_ring +
> > +				(acc100_dev->info_ring_head &
> > +				ACC100_INFO_RING_MASK);
> > +	}
> > +}
> > +
> > +/* Checks VF Info Ring to find the interrupt cause and handles it
> > +accordingly */ static inline void acc100_vf_interrupt_handler(struct
> > +rte_bbdev *dev)
> very similar to pf case, consider combining.
> > +{
> > +	struct acc100_device *acc100_dev = dev->data->dev_private;
> > +	volatile union acc100_info_ring_data *ring_data;
> > +	struct acc100_deq_intr_details deq_intr_det;
> > +
> > +	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
> > +			ACC100_INFO_RING_MASK);
> > +
> > +	while (ring_data->valid) {
> > +
> > +		rte_bbdev_log_debug(
> > +				"ACC100 VF Interrupt received, Info Ring
> data: 0x%x",
> > +				ring_data->val);
> > +
> > +		switch (ring_data->int_nb) {
> > +		case ACC100_VF_INT_DMA_DL_DESC_IRQ:
> > +		case ACC100_VF_INT_DMA_UL_DESC_IRQ:
> > +		case ACC100_VF_INT_DMA_UL5G_DESC_IRQ:
> > +		case ACC100_VF_INT_DMA_DL5G_DESC_IRQ:
> > +			/* VFs are not aware of their vf_id - it's set to 0 in
> > +			 * queue structures.
> > +			 */
> > +			ring_data->vf_id = 0;
> > +			deq_intr_det.queue_id =
> get_queue_id_from_ring_info(
> > +					dev->data, *ring_data);
> > +			if (deq_intr_det.queue_id == UINT16_MAX) {
> > +				rte_bbdev_log(ERR,
> > +						"Couldn't find queue: aq_id:
> %u, qg_id: %u",
> > +						ring_data->aq_id,
> > +						ring_data->qg_id);
> > +				return;
> > +			}
> > +			rte_bbdev_pmd_callback_process(dev,
> > +					RTE_BBDEV_EVENT_DEQUEUE,
> &deq_intr_det);
> > +			break;
> > +		default:
> > +			rte_bbdev_pmd_callback_process(dev,
> > +					RTE_BBDEV_EVENT_ERROR, NULL);
> > +			break;
> > +		}
> > +
> > +		/* Initialize Info Ring entry and move forward */
> > +		ring_data->valid = 0;
> > +		++acc100_dev->info_ring_head;
> > +		ring_data = acc100_dev->info_ring + (acc100_dev-
> >info_ring_head
> > +				& ACC100_INFO_RING_MASK);
> > +	}
> > +}
> > +
> > +/* Interrupt handler triggered by ACC100 dev for handling specific
> > +interrupt */ static void acc100_dev_interrupt_handler(void *cb_arg) {
> > +	struct rte_bbdev *dev = cb_arg;
> > +	struct acc100_device *acc100_dev = dev->data->dev_private;
> > +
> > +	/* Read info ring */
> > +	if (acc100_dev->pf_device)
> > +		acc100_pf_interrupt_handler(dev);
> 
> combined like ..
> 
> acc100_interrupt_handler(dev, is_pf)

Unsure it will help readability; much of the code would still be distinct.

> 
> > +	else
> > +		acc100_vf_interrupt_handler(dev);
> > +}
> > +
> > +/* Allocate and setup inforing */
> > +static int
> > +allocate_inforing(struct rte_bbdev *dev)
> 
> consider renaming
> 
> allocate_info_ring

ok

> 
> > +{
> > +	struct acc100_device *d = dev->data->dev_private;
> > +	const struct acc100_registry_addr *reg_addr;
> > +	rte_iova_t info_ring_phys;
> > +	uint32_t phys_low, phys_high;
> > +
> > +	if (d->info_ring != NULL)
> > +		return 0; /* Already configured */
> > +
> > +	/* Choose correct registry addresses for the device type */
> > +	if (d->pf_device)
> > +		reg_addr = &pf_reg_addr;
> > +	else
> > +		reg_addr = &vf_reg_addr;
> > +	/* Allocate InfoRing */
> > +	d->info_ring = rte_zmalloc_socket("Info Ring",
> > +			ACC100_INFO_RING_NUM_ENTRIES *
> > +			sizeof(*d->info_ring), RTE_CACHE_LINE_SIZE,
> > +			dev->data->socket_id);
> > +	if (d->info_ring == NULL) {
> > +		rte_bbdev_log(ERR,
> > +				"Failed to allocate Info Ring for %s:%u",
> > +				dev->device->driver->name,
> > +				dev->data->dev_id);
> The callers do not check whether this fails.

Arguably the error would be self-contained if that allocation did fail, but it doesn't hurt to add a check, ok.
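A sketch of the error propagation the review asks for, with integer stand-ins for rte_zmalloc_socket() and the bbdev setup path — the point being that acc100_setup_queues() and acc100_intr_enable() check the return code rather than ignore it:

```c
/* 'alloc_ok' stands in for rte_zmalloc_socket() succeeding. */
static int allocate_info_ring(int alloc_ok)
{
	if (!alloc_ok)
		return -12;	/* -ENOMEM */
	return 0;
}

/* Caller propagates the failure instead of discarding it. */
static int setup_queues(int alloc_ok)
{
	int ret = allocate_info_ring(alloc_ok);
	if (ret < 0)
		return ret;
	/* ... continue with HARQ layout allocation, etc. ... */
	return 0;
}
```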

> > +		return -ENOMEM;
> > +	}
> > +	info_ring_phys = rte_malloc_virt2iova(d->info_ring);
> > +
> > +	/* Setup Info Ring */
> > +	phys_high = (uint32_t)(info_ring_phys >> 32);
> > +	phys_low  = (uint32_t)(info_ring_phys);
> > +	acc100_reg_write(d, reg_addr->info_ring_hi, phys_high);
> > +	acc100_reg_write(d, reg_addr->info_ring_lo, phys_low);
> > +	acc100_reg_write(d, reg_addr->info_ring_en,
> ACC100_REG_IRQ_EN_ALL);
> > +	d->info_ring_head = (acc100_reg_read(d, reg_addr->info_ring_ptr) &
> > +			0xFFF) / sizeof(union acc100_info_ring_data);
> > +	return 0;
> > +}
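The hi/lo register split used above when programming the info ring base can be illustrated in isolation: a 64-bit IOVA is written to the device as two 32-bit halves.

```c
#include <stdint.h>

/* Split a 64-bit IOVA into the two 32-bit halves written to the
 * info_ring_hi / info_ring_lo registers. */
static void split_iova(uint64_t iova, uint32_t *hi, uint32_t *lo)
{
	*hi = (uint32_t)(iova >> 32);
	*lo = (uint32_t)iova;
}
```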
> > +
> > +
> >  /* Allocate 64MB memory used for all software rings */  static int
> > acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int
> > socket_id) @@ -426,6 +633,7 @@
> >  	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
> >  	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
> >
> > +	allocate_inforing(dev);
> need to check here
> >  	d->harq_layout = rte_zmalloc_socket("HARQ Layout",
> >  			ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
> >  			RTE_CACHE_LINE_SIZE, dev->data->socket_id); @@ -
> 437,13 +645,53 @@
> >  	return 0;
> >  }
> >
> > +static int
> > +acc100_intr_enable(struct rte_bbdev *dev) {
> > +	int ret;
> > +	struct acc100_device *d = dev->data->dev_private;
> > +
> > +	/* Only MSI are currently supported */
> > +	if (dev->intr_handle->type == RTE_INTR_HANDLE_VFIO_MSI ||
> > +			dev->intr_handle->type == RTE_INTR_HANDLE_UIO)
> {
> > +
> > +		allocate_inforing(dev);
> need to check here
> > +
> > +		ret = rte_intr_enable(dev->intr_handle);
> > +		if (ret < 0) {
> > +			rte_bbdev_log(ERR,
> > +					"Couldn't enable interrupts for
> device: %s",
> > +					dev->data->name);
> > +			rte_free(d->info_ring);
> > +			return ret;
> > +		}
> > +		ret = rte_intr_callback_register(dev->intr_handle,
> > +				acc100_dev_interrupt_handler, dev);
> > +		if (ret < 0) {
> > +			rte_bbdev_log(ERR,
> > +					"Couldn't register interrupt callback
> for device: %s",
> > +					dev->data->name);
> > +			rte_free(d->info_ring);
> does intr need to be disabled here ?

Well, I don't see a lot of consistency with other drivers; sometimes these are not even checked for failure.
I would rather defer this change to a future patch if required, as the same code is already used by other bbdev drivers (if changed, I would rather change them all the same way).

> > +			return ret;
> > +		}
> > +
> > +		return 0;
> > +	}
> > +
> > +	rte_bbdev_log(ERR, "ACC100 (%s) supports only VFIO MSI
> interrupts",
> > +			dev->data->name);
> > +	return -ENOTSUP;
> > +}
> > +
> >  /* Free 64MB memory used for software rings */  static int
> > acc100_dev_close(struct rte_bbdev *dev)  {
> >  	struct acc100_device *d = dev->data->dev_private;
> > +	acc100_check_ir(d);
> >  	if (d->sw_rings_base != NULL) {
> >  		rte_free(d->tail_ptrs);
> > +		rte_free(d->info_ring);
> >  		rte_free(d->sw_rings_base);
> >  		d->sw_rings_base = NULL;
> >  	}
> > @@ -643,6 +891,7 @@
> >  					RTE_BBDEV_TURBO_CRC_TYPE_24B
> |
> >
> 	RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
> >
> 	RTE_BBDEV_TURBO_EARLY_TERMINATION |
> > +
> 	RTE_BBDEV_TURBO_DEC_INTERRUPTS |
> >
> 	RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
> >  					RTE_BBDEV_TURBO_MAP_DEC |
> >
> 	RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP | @@ -663,6 +912,7
> @@
> >
> 	RTE_BBDEV_TURBO_CRC_24B_ATTACH |
> >
> 	RTE_BBDEV_TURBO_RV_INDEX_BYPASS |
> >  					RTE_BBDEV_TURBO_RATE_MATCH |
> > +
> 	RTE_BBDEV_TURBO_ENC_INTERRUPTS |
> >
> 	RTE_BBDEV_TURBO_ENC_SCATTER_GATHER,
> >  				.num_buffers_src =
> >
> 	RTE_BBDEV_TURBO_MAX_CODE_BLOCKS, @@ -676,7 +926,8 @@
> >  				.capability_flags =
> >  					RTE_BBDEV_LDPC_RATE_MATCH |
> >
> 	RTE_BBDEV_LDPC_CRC_24B_ATTACH |
> > -
> 	RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
> > +
> 	RTE_BBDEV_LDPC_INTERLEAVER_BYPASS |
> > +
> 	RTE_BBDEV_LDPC_ENC_INTERRUPTS,
> >  				.num_buffers_src =
> >
> 	RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> >  				.num_buffers_dst =
> > @@ -701,7 +952,8 @@
> >  				RTE_BBDEV_LDPC_DECODE_BYPASS |
> >  				RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
> >
> 	RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
> > -				RTE_BBDEV_LDPC_LLR_COMPRESSION,
> > +				RTE_BBDEV_LDPC_LLR_COMPRESSION |
> > +				RTE_BBDEV_LDPC_DEC_INTERRUPTS,
> >  			.llr_size = 8,
> >  			.llr_decimals = 1,
> >  			.num_buffers_src =
> > @@ -751,14 +1003,39 @@
> >  #else
> >  	dev_info->harq_buffer_size = 0;
> >  #endif
> > +	acc100_check_ir(d);
> > +}
> > +
> > +static int
> > +acc100_queue_intr_enable(struct rte_bbdev *dev, uint16_t queue_id) {
> > +	struct acc100_queue *q = dev->data-
> >queues[queue_id].queue_private;
> > +
> > +	if (dev->intr_handle->type != RTE_INTR_HANDLE_VFIO_MSI &&
> > +			dev->intr_handle->type != RTE_INTR_HANDLE_UIO)
> > +		return -ENOTSUP;
> > +
> > +	q->irq_enable = 1;
> > +	return 0;
> > +}
> > +
> > +static int
> > +acc100_queue_intr_disable(struct rte_bbdev *dev, uint16_t queue_id) {
> > +	struct acc100_queue *q = dev->data-
> >queues[queue_id].queue_private;
> > +	q->irq_enable = 0;
> -ENOTSUP is returned above; a similar check seems needed here.

How can this fail when we purely disable?

> > +	return 0;
> >  }
> >
> >  static const struct rte_bbdev_ops acc100_bbdev_ops = {
> >  	.setup_queues = acc100_setup_queues,
> > +	.intr_enable = acc100_intr_enable,
> >  	.close = acc100_dev_close,
> >  	.info_get = acc100_dev_info_get,
> >  	.queue_setup = acc100_queue_setup,
> >  	.queue_release = acc100_queue_release,
> > +	.queue_intr_enable = acc100_queue_intr_enable,
> > +	.queue_intr_disable = acc100_queue_intr_disable
> >  };
> >
> >  /* ACC100 PCI PF address map */
> > @@ -3018,8 +3295,10 @@
> >  			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> >  	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> >  	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> > -	if (op->status != 0)
> > +	if (op->status != 0) {
> >  		q_data->queue_stats.dequeue_err_count++;
> > +		acc100_check_ir(q->d);
> > +	}
> >
> >  	/* CRC invalid if error exists */
> >  	if (!op->status)
> > @@ -3076,6 +3355,9 @@
> >  		op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
> >  	op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
> >
> > +	if (op->status & (1 << RTE_BBDEV_DRV_ERROR))
> > +		acc100_check_ir(q->d);
> > +
> >  	/* Check if this is the last desc in batch (Atomic Queue) */
> >  	if (desc->req.last_desc_in_batch) {
> >  		(*aq_dequeued)++;
> > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h
> > b/drivers/baseband/acc100/rte_acc100_pmd.h
> > index 78686c1..8980fa5 100644
> > --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> > +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> > @@ -559,7 +559,14 @@ struct acc100_device {
> >  	/* Virtual address of the info memory routed to the this function
> under
> >  	 * operation, whether it is PF or VF.
> >  	 */
> > +	union acc100_info_ring_data *info_ring;
> 
> Need a comment that this array needs a sentinel ?

I can clarify the expected HW behaviour a bit.

Thanks

> 
> Tom
> 
> > +
> >  	union acc100_harq_layout_data *harq_layout;
> > +	/* Virtual Info Ring head */
> > +	uint16_t info_ring_head;
> > +	/* Number of bytes available for each queue in device, depending
> on
> > +	 * how many queues are enabled with configure()
> > +	 */
> >  	uint32_t sw_ring_size;
> >  	uint32_t ddr_size; /* Size in kB */
> >  	uint32_t *tail_ptrs; /* Base address of response tail pointer buffer
> > */ @@ -575,4 +582,12 @@ struct acc100_device {
> >  	bool configured; /**< True if this ACC100 device is configured */
> > };
> >
> > +/**
> > + * Structure with details about RTE_BBDEV_EVENT_DEQUEUE event. It's
> > +passed to
> > + * the callback function.
> > + */
> > +struct acc100_deq_intr_details {
> > +	uint16_t queue_id;
> > +};
> > +
> >  #endif /* _RTE_ACC100_PMD_H_ */


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v9 09/10] baseband/acc100: add debug function to validate input
  2020-09-30 19:16       ` Tom Rix
@ 2020-09-30 19:53         ` Chautru, Nicolas
  2020-10-01 16:07           ` Tom Rix
  0 siblings, 1 reply; 213+ messages in thread
From: Chautru, Nicolas @ 2020-09-30 19:53 UTC (permalink / raw)
  To: Tom Rix, dev, akhil.goyal
  Cc: Richardson, Bruce, Xu, Rosen, dave.burley, aidan.goddard, Yigit,
	Ferruh, Liu, Tianjiao

Hi Tom, 

> From: Tom Rix <trix@redhat.com>
> On 9/28/20 5:29 PM, Nicolas Chautru wrote:
> > Debug functions to validate the input API from user Only enabled in
> > DEBUG mode at build time
> >
> > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
> > ---
> >  drivers/baseband/acc100/rte_acc100_pmd.c | 424
> > +++++++++++++++++++++++++++++++
> >  1 file changed, 424 insertions(+)
> >
> > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> > b/drivers/baseband/acc100/rte_acc100_pmd.c
> > index b6d9e7c..3589814 100644
> > --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> > +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> > @@ -1945,6 +1945,231 @@
> >
> >  }
> >
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +/* Validates turbo encoder parameters */ static inline int
> > +validate_enc_op(struct rte_bbdev_enc_op *op) {
> > +	struct rte_bbdev_op_turbo_enc *turbo_enc = &op->turbo_enc;
> > +	struct rte_bbdev_op_enc_turbo_cb_params *cb = NULL;
> > +	struct rte_bbdev_op_enc_turbo_tb_params *tb = NULL;
> > +	uint16_t kw, kw_neg, kw_pos;
> > +
> > +	if (op->mempool == NULL) {
> > +		rte_bbdev_log(ERR, "Invalid mempool pointer");
> > +		return -1;
> > +	}
> > +	if (turbo_enc->input.data == NULL) {
> > +		rte_bbdev_log(ERR, "Invalid input pointer");
> > +		return -1;
> > +	}
> > +	if (turbo_enc->output.data == NULL) {
> > +		rte_bbdev_log(ERR, "Invalid output pointer");
> > +		return -1;
> > +	}
> > +	if (turbo_enc->rv_index > 3) {
> > +		rte_bbdev_log(ERR,
> > +				"rv_index (%u) is out of range 0 <= value <=
> 3",
> > +				turbo_enc->rv_index);
> > +		return -1;
> > +	}
> > +	if (turbo_enc->code_block_mode != 0 &&
> > +			turbo_enc->code_block_mode != 1) {
> > +		rte_bbdev_log(ERR,
> > +				"code_block_mode (%u) is out of range 0 <=
> value <= 1",
> > +				turbo_enc->code_block_mode);
> > +		return -1;
> > +	}
> > +
> > +	if (turbo_enc->code_block_mode == 0) {
> > +		tb = &turbo_enc->tb_params;
> > +		if ((tb->k_neg < RTE_BBDEV_TURBO_MIN_CB_SIZE
> > +				|| tb->k_neg >
> RTE_BBDEV_TURBO_MAX_CB_SIZE)
> > +				&& tb->c_neg > 0) {
> > +			rte_bbdev_log(ERR,
> > +					"k_neg (%u) is out of range %u <=
> value <= %u",
> > +					tb->k_neg,
> RTE_BBDEV_TURBO_MIN_CB_SIZE,
> > +					RTE_BBDEV_TURBO_MAX_CB_SIZE);
> > +			return -1;
> > +		}
> > +		if (tb->k_pos < RTE_BBDEV_TURBO_MIN_CB_SIZE
> > +				|| tb->k_pos >
> RTE_BBDEV_TURBO_MAX_CB_SIZE) {
> > +			rte_bbdev_log(ERR,
> > +					"k_pos (%u) is out of range %u <=
> value <= %u",
> > +					tb->k_pos,
> RTE_BBDEV_TURBO_MIN_CB_SIZE,
> > +					RTE_BBDEV_TURBO_MAX_CB_SIZE);
> > +			return -1;
> > +		}
> > +		if (tb->c_neg > (RTE_BBDEV_TURBO_MAX_CODE_BLOCKS -
> 1))
> > +			rte_bbdev_log(ERR,
> > +					"c_neg (%u) is out of range 0 <= value
> <= %u",
> > +					tb->c_neg,
> > +
> 	RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1);
> > +		if (tb->c < 1 || tb->c >
> RTE_BBDEV_TURBO_MAX_CODE_BLOCKS) {
> > +			rte_bbdev_log(ERR,
> > +					"c (%u) is out of range 1 <= value <=
> %u",
> > +					tb->c,
> RTE_BBDEV_TURBO_MAX_CODE_BLOCKS);
> > +			return -1;
> > +		}
> > +		if (tb->cab > tb->c) {
> > +			rte_bbdev_log(ERR,
> > +					"cab (%u) is greater than c (%u)",
> > +					tb->cab, tb->c);
> > +			return -1;
> > +		}
> > +		if ((tb->ea < RTE_BBDEV_TURBO_MIN_CB_SIZE || (tb->ea %
> 2))
> > +				&& tb->r < tb->cab) {
> > +			rte_bbdev_log(ERR,
> > +					"ea (%u) is less than %u or it is not
> even",
> > +					tb->ea,
> RTE_BBDEV_TURBO_MIN_CB_SIZE);
> > +			return -1;
> > +		}
> > +		if ((tb->eb < RTE_BBDEV_TURBO_MIN_CB_SIZE || (tb->eb %
> 2))
> > +				&& tb->c > tb->cab) {
> > +			rte_bbdev_log(ERR,
> > +					"eb (%u) is less than %u or it is not
> even",
> > +					tb->eb,
> RTE_BBDEV_TURBO_MIN_CB_SIZE);
> > +			return -1;
> > +		}
> > +
> > +		kw_neg = 3 * RTE_ALIGN_CEIL(tb->k_neg + 4,
> > +					RTE_BBDEV_TURBO_C_SUBBLOCK);
> > +		if (tb->ncb_neg < tb->k_neg || tb->ncb_neg > kw_neg) {
> > +			rte_bbdev_log(ERR,
> > +					"ncb_neg (%u) is out of range (%u)
> k_neg <= value <= (%u) kw_neg",
> > +					tb->ncb_neg, tb->k_neg, kw_neg);
> > +			return -1;
> > +		}
> > +
> > +		kw_pos = 3 * RTE_ALIGN_CEIL(tb->k_pos + 4,
> > +					RTE_BBDEV_TURBO_C_SUBBLOCK);
> > +		if (tb->ncb_pos < tb->k_pos || tb->ncb_pos > kw_pos) {
> > +			rte_bbdev_log(ERR,
> > +					"ncb_pos (%u) is out of range (%u)
> k_pos <= value <= (%u) kw_pos",
> > +					tb->ncb_pos, tb->k_pos, kw_pos);
> > +			return -1;
> > +		}
> > +		if (tb->r > (tb->c - 1)) {
> > +			rte_bbdev_log(ERR,
> > +					"r (%u) is greater than c - 1 (%u)",
> > +					tb->r, tb->c - 1);
> > +			return -1;
> > +		}
> > +	} else {
> > +		cb = &turbo_enc->cb_params;
> > +		if (cb->k < RTE_BBDEV_TURBO_MIN_CB_SIZE
> > +				|| cb->k >
> RTE_BBDEV_TURBO_MAX_CB_SIZE) {
> > +			rte_bbdev_log(ERR,
> > +					"k (%u) is out of range %u <= value <=
> %u",
> > +					cb->k,
> RTE_BBDEV_TURBO_MIN_CB_SIZE,
> > +					RTE_BBDEV_TURBO_MAX_CB_SIZE);
> > +			return -1;
> > +		}
> > +
> > +		if (cb->e < RTE_BBDEV_TURBO_MIN_CB_SIZE || (cb->e % 2))
> {
> > +			rte_bbdev_log(ERR,
> > +					"e (%u) is less than %u or it is not
> even",
> > +					cb->e,
> RTE_BBDEV_TURBO_MIN_CB_SIZE);
> > +			return -1;
> > +		}
> > +
> > +		kw = RTE_ALIGN_CEIL(cb->k + 4,
> RTE_BBDEV_TURBO_C_SUBBLOCK) * 3;
> > +		if (cb->ncb < cb->k || cb->ncb > kw) {
> > +			rte_bbdev_log(ERR,
> > +					"ncb (%u) is out of range (%u) k <=
> value <= (%u) kw",
> > +					cb->ncb, cb->k, kw);
> > +			return -1;
> > +		}
> > +	}
> > +
> > +	return 0;
> > +}
> > +/* Validates LDPC encoder parameters */ static inline int
> > +validate_ldpc_enc_op(struct rte_bbdev_enc_op *op) {
> > +	struct rte_bbdev_op_ldpc_enc *ldpc_enc = &op->ldpc_enc;
> > +
> > +	if (op->mempool == NULL) {
> > +		rte_bbdev_log(ERR, "Invalid mempool pointer");
> > +		return -1;
> > +	}
> > +	if (ldpc_enc->input.data == NULL) {
> > +		rte_bbdev_log(ERR, "Invalid input pointer");
> > +		return -1;
> > +	}
> > +	if (ldpc_enc->output.data == NULL) {
> > +		rte_bbdev_log(ERR, "Invalid output pointer");
> > +		return -1;
> > +	}
> > +	if (ldpc_enc->input.length >
> > +			RTE_BBDEV_LDPC_MAX_CB_SIZE >> 3) {
> > +		rte_bbdev_log(ERR, "CB size (%u) is too big, max: %d",
> > +				ldpc_enc->input.length,
> > +				RTE_BBDEV_LDPC_MAX_CB_SIZE);
> > +		return -1;
> > +	}
> > +	if ((ldpc_enc->basegraph > 2) || (ldpc_enc->basegraph == 0)) {
> > +		rte_bbdev_log(ERR,
> > +				"BG (%u) is out of range 1 <= value <= 2",
> > +				ldpc_enc->basegraph);
> > +		return -1;
> > +	}
> > +	if (ldpc_enc->rv_index > 3) {
> > +		rte_bbdev_log(ERR,
> > +				"rv_index (%u) is out of range 0 <= value <=
> 3",
> > +				ldpc_enc->rv_index);
> > +		return -1;
> > +	}
> > +	if (ldpc_enc->code_block_mode > 1) {
> > +		rte_bbdev_log(ERR,
> > +				"code_block_mode (%u) is out of range 0 <=
> value <= 1",
> > +				ldpc_enc->code_block_mode);
> > +		return -1;
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +/* Validates LDPC decoder parameters */ static inline int
> > +validate_ldpc_dec_op(struct rte_bbdev_dec_op *op) {
> > +	struct rte_bbdev_op_ldpc_dec *ldpc_dec = &op->ldpc_dec;
> > +
> > +	if (op->mempool == NULL) {
> > +		rte_bbdev_log(ERR, "Invalid mempool pointer");
> > +		return -1;
> > +	}
> > +	if ((ldpc_dec->basegraph > 2) || (ldpc_dec->basegraph == 0)) {
> > +		rte_bbdev_log(ERR,
> > +				"BG (%u) is out of range 1 <= value <= 2",
> > +				ldpc_dec->basegraph);
> > +		return -1;
> > +	}
> > +	if (ldpc_dec->iter_max == 0) {
> > +		rte_bbdev_log(ERR,
> > +				"iter_max (%u) is equal to 0",
> > +				ldpc_dec->iter_max);
> > +		return -1;
> > +	}
> > +	if (ldpc_dec->rv_index > 3) {
> > +		rte_bbdev_log(ERR,
> > +				"rv_index (%u) is out of range 0 <= value <=
> 3",
> > +				ldpc_dec->rv_index);
> > +		return -1;
> > +	}
> > +	if (ldpc_dec->code_block_mode > 1) {
> > +		rte_bbdev_log(ERR,
> > +				"code_block_mode (%u) is out of range 0 <=
> value <= 1",
> > +				ldpc_dec->code_block_mode);
> > +		return -1;
> > +	}
> > +
> > +	return 0;
> > +}
> > +#endif
> Could have an #else with stubs so the users do not have to bother with
> #ifdef decorations

I see what you mean. Debatable. But given this is done the same way for the other
bbdev drivers, I would rather keep consistency. 
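
To illustrate the stub idea being debated (a sketch only, with hypothetical names; the real validators take rte_bbdev op structures, not void pointers): when the debug macro is off, the #else branch compiles the validator down to an always-pass inline, so call sites need no #ifdef decoration.

```c
#include <assert.h>
#include <stddef.h>

#ifdef RTE_LIBRTE_BBDEV_DEBUG
/* The real parameter checks would live here. */
static inline int validate_enc_op_sketch(const void *op)
{
	return op == NULL ? -1 : 0;
}
#else
/* Stub: always passes, so the compiler drops the check entirely. */
static inline int validate_enc_op_sketch(const void *op)
{
	(void)op;
	return 0;
}
#endif

static int enqueue_enc_sketch(const void *op)
{
	/* No #ifdef needed around the call site. */
	if (validate_enc_op_sketch(op) == -1)
		return -22; /* -EINVAL */
	/* ... descriptor setup would follow ... */
	return 0;
}
```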

> > +
> >  /* Enqueue one encode operations for ACC100 device in CB mode */
> > static inline int  enqueue_enc_one_op_cb(struct acc100_queue *q,
> > struct rte_bbdev_enc_op *op, @@ -1956,6 +2181,14 @@
> >  		seg_total_left;
> >  	struct rte_mbuf *input, *output_head, *output;
> >
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +	/* Validate op structure */
> > +	if (validate_enc_op(op) == -1) {
> > +		rte_bbdev_log(ERR, "Turbo encoder validation failed");
> > +		return -EINVAL;
> > +	}
> > +#endif
> > +
> >  	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> >  			& q->sw_ring_wrap_mask);
> >  	desc = q->ring_addr + desc_idx;
> > @@ -2008,6 +2241,14 @@
> >  	uint16_t  in_length_in_bytes;
> >  	struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
> >
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +	/* Validate op structure */
> > +	if (validate_ldpc_enc_op(ops[0]) == -1) {
> > +		rte_bbdev_log(ERR, "LDPC encoder validation failed");
> > +		return -EINVAL;
> > +	}
> > +#endif
> > +
> >  	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> >  			& q->sw_ring_wrap_mask);
> >  	desc = q->ring_addr + desc_idx;
> > @@ -2065,6 +2306,14 @@
> >  		seg_total_left;
> >  	struct rte_mbuf *input, *output_head, *output;
> >
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +	/* Validate op structure */
> > +	if (validate_ldpc_enc_op(op) == -1) {
> > +		rte_bbdev_log(ERR, "LDPC encoder validation failed");
> > +		return -EINVAL;
> > +	}
> > +#endif
> > +
> >  	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> >  			& q->sw_ring_wrap_mask);
> >  	desc = q->ring_addr + desc_idx;
> > @@ -2119,6 +2368,14 @@
> >  	struct rte_mbuf *input, *output_head, *output;
> >  	uint16_t current_enqueued_cbs = 0;
> >
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +	/* Validate op structure */
> > +	if (validate_enc_op(op) == -1) {
> > +		rte_bbdev_log(ERR, "Turbo encoder validation failed");
> > +		return -EINVAL;
> > +	}
> > +#endif
> > +
> >  	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> >  			& q->sw_ring_wrap_mask);
> >  	desc = q->ring_addr + desc_idx;
> > @@ -2191,6 +2448,142 @@
> >  	return current_enqueued_cbs;
> >  }
> >
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +/* Validates turbo decoder parameters */ static inline int
> > +validate_dec_op(struct rte_bbdev_dec_op *op) {
> 
> This and (guessing) the later dec validation share similar code with the enc
> validation; consider a function for the common parts.

They really have different APIs; only a few of the checks share common range validation.
So I am not convinced it would help, personally.
Thanks
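
If the common range checks were ever factored out, a small shared helper is about all the two validators could reuse. A hedged sketch; check_range() and validate_common_sketch() are illustrative names, and the bounds shown are examples rather than the bbdev constants:

```c
#include <assert.h>
#include <stdio.h>

/* Shared "X (%u) is out of range A <= value <= B" check, the one
 * pattern that repeats across the enc/dec validators. */
static inline int check_range(const char *name, unsigned int val,
		unsigned int min, unsigned int max)
{
	if (val < min || val > max) {
		fprintf(stderr,
			"%s (%u) is out of range %u <= value <= %u\n",
			name, val, min, max);
		return -1;
	}
	return 0;
}

/* Example use, mirroring the rv_index and code_block_mode checks. */
static int validate_common_sketch(unsigned int rv_index,
		unsigned int code_block_mode)
{
	if (check_range("rv_index", rv_index, 0, 3) != 0)
		return -1;
	if (check_range("code_block_mode", code_block_mode, 0, 1) != 0)
		return -1;
	return 0;
}
```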

> 
> Tom
> 
> > +	struct rte_bbdev_op_turbo_dec *turbo_dec = &op->turbo_dec;
> > +	struct rte_bbdev_op_dec_turbo_cb_params *cb = NULL;
> > +	struct rte_bbdev_op_dec_turbo_tb_params *tb = NULL;
> > +
> > +	if (op->mempool == NULL) {
> > +		rte_bbdev_log(ERR, "Invalid mempool pointer");
> > +		return -1;
> > +	}
> > +	if (turbo_dec->input.data == NULL) {
> > +		rte_bbdev_log(ERR, "Invalid input pointer");
> > +		return -1;
> > +	}
> > +	if (turbo_dec->hard_output.data == NULL) {
> > +		rte_bbdev_log(ERR, "Invalid hard_output pointer");
> > +		return -1;
> > +	}
> > +	if (check_bit(turbo_dec->op_flags,
> RTE_BBDEV_TURBO_SOFT_OUTPUT) &&
> > +			turbo_dec->soft_output.data == NULL) {
> > +		rte_bbdev_log(ERR, "Invalid soft_output pointer");
> > +		return -1;
> > +	}
> > +	if (turbo_dec->rv_index > 3) {
> > +		rte_bbdev_log(ERR,
> > +				"rv_index (%u) is out of range 0 <= value <=
> 3",
> > +				turbo_dec->rv_index);
> > +		return -1;
> > +	}
> > +	if (turbo_dec->iter_min < 1) {
> > +		rte_bbdev_log(ERR,
> > +				"iter_min (%u) is less than 1",
> > +				turbo_dec->iter_min);
> > +		return -1;
> > +	}
> > +	if (turbo_dec->iter_max <= 2) {
> > +		rte_bbdev_log(ERR,
> > +				"iter_max (%u) is less than or equal to 2",
> > +				turbo_dec->iter_max);
> > +		return -1;
> > +	}
> > +	if (turbo_dec->iter_min > turbo_dec->iter_max) {
> > +		rte_bbdev_log(ERR,
> > +				"iter_min (%u) is greater than iter_max
> (%u)",
> > +				turbo_dec->iter_min, turbo_dec->iter_max);
> > +		return -1;
> > +	}
> > +	if (turbo_dec->code_block_mode != 0 &&
> > +			turbo_dec->code_block_mode != 1) {
> > +		rte_bbdev_log(ERR,
> > +				"code_block_mode (%u) is out of range 0 <=
> value <= 1",
> > +				turbo_dec->code_block_mode);
> > +		return -1;
> > +	}
> > +
> > +	if (turbo_dec->code_block_mode == 0) {
> > +		tb = &turbo_dec->tb_params;
> > +		if ((tb->k_neg < RTE_BBDEV_TURBO_MIN_CB_SIZE
> > +				|| tb->k_neg >
> RTE_BBDEV_TURBO_MAX_CB_SIZE)
> > +				&& tb->c_neg > 0) {
> > +			rte_bbdev_log(ERR,
> > +					"k_neg (%u) is out of range %u <=
> value <= %u",
> > +					tb->k_neg,
> RTE_BBDEV_TURBO_MIN_CB_SIZE,
> > +					RTE_BBDEV_TURBO_MAX_CB_SIZE);
> > +			return -1;
> > +		}
> > +		if ((tb->k_pos < RTE_BBDEV_TURBO_MIN_CB_SIZE
> > +				|| tb->k_pos >
> RTE_BBDEV_TURBO_MAX_CB_SIZE)
> > +				&& tb->c > tb->c_neg) {
> > +			rte_bbdev_log(ERR,
> > +					"k_pos (%u) is out of range %u <=
> value <= %u",
> > +					tb->k_pos,
> RTE_BBDEV_TURBO_MIN_CB_SIZE,
> > +					RTE_BBDEV_TURBO_MAX_CB_SIZE);
> > +			return -1;
> > +		}
> > +		if (tb->c_neg > (RTE_BBDEV_TURBO_MAX_CODE_BLOCKS -
> 1))
> > +			rte_bbdev_log(ERR,
> > +					"c_neg (%u) is out of range 0 <= value
> <= %u",
> > +					tb->c_neg,
> > +
> 	RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1);
> > +		if (tb->c < 1 || tb->c >
> RTE_BBDEV_TURBO_MAX_CODE_BLOCKS) {
> > +			rte_bbdev_log(ERR,
> > +					"c (%u) is out of range 1 <= value <=
> %u",
> > +					tb->c,
> RTE_BBDEV_TURBO_MAX_CODE_BLOCKS);
> > +			return -1;
> > +		}
> > +		if (tb->cab > tb->c) {
> > +			rte_bbdev_log(ERR,
> > +					"cab (%u) is greater than c (%u)",
> > +					tb->cab, tb->c);
> > +			return -1;
> > +		}
> > +		if (check_bit(turbo_dec->op_flags,
> RTE_BBDEV_TURBO_EQUALIZER) &&
> > +				(tb->ea < RTE_BBDEV_TURBO_MIN_CB_SIZE
> > +						|| (tb->ea % 2))
> > +				&& tb->cab > 0) {
> > +			rte_bbdev_log(ERR,
> > +					"ea (%u) is less than %u or it is not
> even",
> > +					tb->ea,
> RTE_BBDEV_TURBO_MIN_CB_SIZE);
> > +			return -1;
> > +		}
> > +		if (check_bit(turbo_dec->op_flags,
> RTE_BBDEV_TURBO_EQUALIZER) &&
> > +				(tb->eb < RTE_BBDEV_TURBO_MIN_CB_SIZE
> > +						|| (tb->eb % 2))
> > +				&& tb->c > tb->cab) {
> > +			rte_bbdev_log(ERR,
> > +					"eb (%u) is less than %u or it is not
> even",
> > +					tb->eb,
> RTE_BBDEV_TURBO_MIN_CB_SIZE);
> > +		}
> > +	} else {
> > +		cb = &turbo_dec->cb_params;
> > +		if (cb->k < RTE_BBDEV_TURBO_MIN_CB_SIZE
> > +				|| cb->k >
> RTE_BBDEV_TURBO_MAX_CB_SIZE) {
> > +			rte_bbdev_log(ERR,
> > +					"k (%u) is out of range %u <= value <=
> %u",
> > +					cb->k,
> RTE_BBDEV_TURBO_MIN_CB_SIZE,
> > +					RTE_BBDEV_TURBO_MAX_CB_SIZE);
> > +			return -1;
> > +		}
> > +		if (check_bit(turbo_dec->op_flags,
> RTE_BBDEV_TURBO_EQUALIZER) &&
> > +				(cb->e < RTE_BBDEV_TURBO_MIN_CB_SIZE
> ||
> > +				(cb->e % 2))) {
> > +			rte_bbdev_log(ERR,
> > +					"e (%u) is less than %u or it is not
> even",
> > +					cb->e,
> RTE_BBDEV_TURBO_MIN_CB_SIZE);
> > +			return -1;
> > +		}
> > +	}
> > +
> > +	return 0;
> > +}
> > +#endif
> > +
> >  /** Enqueue one decode operations for ACC100 device in CB mode */
> > static inline int  enqueue_dec_one_op_cb(struct acc100_queue *q,
> > struct rte_bbdev_dec_op *op, @@ -2203,6 +2596,14 @@
> >  	struct rte_mbuf *input, *h_output_head, *h_output,
> >  		*s_output_head, *s_output;
> >
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +	/* Validate op structure */
> > +	if (validate_dec_op(op) == -1) {
> > +		rte_bbdev_log(ERR, "Turbo decoder validation failed");
> > +		return -EINVAL;
> > +	}
> > +#endif
> > +
> >  	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> >  			& q->sw_ring_wrap_mask);
> >  	desc = q->ring_addr + desc_idx;
> > @@ -2426,6 +2827,13 @@
> >  		return ret;
> >  	}
> >
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +	/* Validate op structure */
> > +	if (validate_ldpc_dec_op(op) == -1) {
> > +		rte_bbdev_log(ERR, "LDPC decoder validation failed");
> > +		return -EINVAL;
> > +	}
> > +#endif
> >  	union acc100_dma_desc *desc;
> >  	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> >  			& q->sw_ring_wrap_mask);
> > @@ -2521,6 +2929,14 @@
> >  	struct rte_mbuf *input, *h_output_head, *h_output;
> >  	uint16_t current_enqueued_cbs = 0;
> >
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +	/* Validate op structure */
> > +	if (validate_ldpc_dec_op(op) == -1) {
> > +		rte_bbdev_log(ERR, "LDPC decoder validation failed");
> > +		return -EINVAL;
> > +	}
> > +#endif
> > +
> >  	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> >  			& q->sw_ring_wrap_mask);
> >  	desc = q->ring_addr + desc_idx;
> > @@ -2611,6 +3027,14 @@
> >  		*s_output_head, *s_output;
> >  	uint16_t current_enqueued_cbs = 0;
> >
> > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> > +	/* Validate op structure */
> > +	if (validate_dec_op(op) == -1) {
> > +		rte_bbdev_log(ERR, "Turbo decoder validation failed");
> > +		return -EINVAL;
> > +	}
> > +#endif
> > +
> >  	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> >  			& q->sw_ring_wrap_mask);
> >  	desc = q->ring_addr + desc_idx;


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v9 10/10] baseband/acc100: add configure function
  2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 10/10] baseband/acc100: add configure function Nicolas Chautru
@ 2020-09-30 19:58       ` Tom Rix
  2020-09-30 22:54         ` Chautru, Nicolas
  0 siblings, 1 reply; 213+ messages in thread
From: Tom Rix @ 2020-09-30 19:58 UTC (permalink / raw)
  To: Nicolas Chautru, dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, dave.burley, aidan.goddard,
	ferruh.yigit, tianjiao.liu


On 9/28/20 5:29 PM, Nicolas Chautru wrote:
> Add configure function to configure the PF from within
> the bbdev-test itself without external application
> configuring the device.
>
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
> ---
>  app/test-bbdev/test_bbdev_perf.c                   |  72 +++
>  doc/guides/rel_notes/release_20_11.rst             |   5 +
>  drivers/baseband/acc100/meson.build                |   2 +
>  drivers/baseband/acc100/rte_acc100_cfg.h           |  17 +
>  drivers/baseband/acc100/rte_acc100_pmd.c           | 505 +++++++++++++++++++++
>  .../acc100/rte_pmd_bbdev_acc100_version.map        |   7 +
>  6 files changed, 608 insertions(+)
>
> diff --git a/app/test-bbdev/test_bbdev_perf.c b/app/test-bbdev/test_bbdev_perf.c
> index 45c0d62..32f23ff 100644
> --- a/app/test-bbdev/test_bbdev_perf.c
> +++ b/app/test-bbdev/test_bbdev_perf.c
> @@ -52,6 +52,18 @@
>  #define FLR_5G_TIMEOUT 610
>  #endif
>  
> +#ifdef RTE_LIBRTE_PMD_BBDEV_ACC100
> +#include <rte_acc100_cfg.h>
> +#define ACC100PF_DRIVER_NAME   ("intel_acc100_pf")
> +#define ACC100VF_DRIVER_NAME   ("intel_acc100_vf")
> +#define ACC100_QMGR_NUM_AQS 16
> +#define ACC100_QMGR_NUM_QGS 2
> +#define ACC100_QMGR_AQ_DEPTH 5
> +#define ACC100_QMGR_INVALID_IDX -1
> +#define ACC100_QMGR_RR 1
> +#define ACC100_QOS_GBR 0
> +#endif
> +
>  #define OPS_CACHE_SIZE 256U
>  #define OPS_POOL_SIZE_MIN 511U /* 0.5K per queue */
>  
> @@ -653,6 +665,66 @@ typedef int (test_case_function)(struct active_device *ad,
>  				info->dev_name);
>  	}
>  #endif
> +#ifdef RTE_LIBRTE_PMD_BBDEV_ACC100
Seems like this function would break if one of the other bbdevs were #defined.
> +	if ((get_init_device() == true) &&
> +		(!strcmp(info->drv.driver_name, ACC100PF_DRIVER_NAME))) {
> +		struct acc100_conf conf;
> +		unsigned int i;
> +
> +		printf("Configure ACC100 FEC Driver %s with default values\n",
> +				info->drv.driver_name);
> +
> +		/* clear default configuration before initialization */
> +		memset(&conf, 0, sizeof(struct acc100_conf));
> +
> +		/* Always set in PF mode for built-in configuration */
> +		conf.pf_mode_en = true;
> +		for (i = 0; i < RTE_ACC100_NUM_VFS; ++i) {
> +			conf.arb_dl_4g[i].gbr_threshold1 = ACC100_QOS_GBR;
> +			conf.arb_dl_4g[i].gbr_threshold1 = ACC100_QOS_GBR;
> +			conf.arb_dl_4g[i].round_robin_weight = ACC100_QMGR_RR;
> +			conf.arb_ul_4g[i].gbr_threshold1 = ACC100_QOS_GBR;
> +			conf.arb_ul_4g[i].gbr_threshold1 = ACC100_QOS_GBR;
> +			conf.arb_ul_4g[i].round_robin_weight = ACC100_QMGR_RR;
> +			conf.arb_dl_5g[i].gbr_threshold1 = ACC100_QOS_GBR;
> +			conf.arb_dl_5g[i].gbr_threshold1 = ACC100_QOS_GBR;
> +			conf.arb_dl_5g[i].round_robin_weight = ACC100_QMGR_RR;
> +			conf.arb_ul_5g[i].gbr_threshold1 = ACC100_QOS_GBR;
> +			conf.arb_ul_5g[i].gbr_threshold1 = ACC100_QOS_GBR;
> +			conf.arb_ul_5g[i].round_robin_weight = ACC100_QMGR_RR;
> +		}
> +
> +		conf.input_pos_llr_1_bit = true;
> +		conf.output_pos_llr_1_bit = true;
> +		conf.num_vf_bundles = 1; /**< Number of VF bundles to setup */
> +
> +		conf.q_ul_4g.num_qgroups = ACC100_QMGR_NUM_QGS;
> +		conf.q_ul_4g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
> +		conf.q_ul_4g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
> +		conf.q_ul_4g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
> +		conf.q_dl_4g.num_qgroups = ACC100_QMGR_NUM_QGS;
> +		conf.q_dl_4g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
> +		conf.q_dl_4g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
> +		conf.q_dl_4g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
> +		conf.q_ul_5g.num_qgroups = ACC100_QMGR_NUM_QGS;
> +		conf.q_ul_5g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
> +		conf.q_ul_5g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
> +		conf.q_ul_5g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
> +		conf.q_dl_5g.num_qgroups = ACC100_QMGR_NUM_QGS;
> +		conf.q_dl_5g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
> +		conf.q_dl_5g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
> +		conf.q_dl_5g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
> +
> +		/* setup PF with configuration information */
> +		ret = acc100_configure(info->dev_name, &conf);
> +		TEST_ASSERT_SUCCESS(ret,
> +				"Failed to configure ACC100 PF for bbdev %s",
> +				info->dev_name);
> +		/* Let's refresh this now this is configured */
> +	}
> +	rte_bbdev_info_get(dev_id, info);
The other bbdevs do not call rte_bbdev_info_get; can this be removed?
> +#endif
> +
>  	nb_queues = RTE_MIN(rte_lcore_count(), info->drv.max_num_queues);
>  	nb_queues = RTE_MIN(nb_queues, (unsigned int) MAX_QUEUES);
>  
> diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
> index 73ac08f..c8d0586 100644
> --- a/doc/guides/rel_notes/release_20_11.rst
> +++ b/doc/guides/rel_notes/release_20_11.rst
> @@ -55,6 +55,11 @@ New Features
>       Also, make sure to start the actual text at the margin.
>       =======================================================
>  
> +* **Added Intel ACC100 bbdev PMD.**
> +
> +  Added a new ``acc100`` bbdev driver for the Intel\ |reg| ACC100 accelerator
> +  also known as Mount Bryce.  See the
> +  :doc:`../bbdevs/acc100` BBDEV guide for more details on this new driver.
>  
>  Removed Items
>  -------------
> diff --git a/drivers/baseband/acc100/meson.build b/drivers/baseband/acc100/meson.build
> index 8afafc2..7ac44dc 100644
> --- a/drivers/baseband/acc100/meson.build
> +++ b/drivers/baseband/acc100/meson.build
> @@ -4,3 +4,5 @@
>  deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci']
>  
>  sources = files('rte_acc100_pmd.c')
> +
> +install_headers('rte_acc100_cfg.h')
> diff --git a/drivers/baseband/acc100/rte_acc100_cfg.h b/drivers/baseband/acc100/rte_acc100_cfg.h
> index 73bbe36..7f523bc 100644
> --- a/drivers/baseband/acc100/rte_acc100_cfg.h
> +++ b/drivers/baseband/acc100/rte_acc100_cfg.h
> @@ -89,6 +89,23 @@ struct acc100_conf {
>  	struct rte_arbitration_t arb_dl_5g[RTE_ACC100_NUM_VFS];
>  };
>  
> +/**
> + * Configure a ACC100 device
> + *
> + * @param dev_name
> + *   The name of the device. This is the short form of PCI BDF, e.g. 00:01.0.
> + *   It can also be retrieved for a bbdev device from the dev_name field in the
> + *   rte_bbdev_info structure returned by rte_bbdev_info_get().
> + * @param conf
> + *   Configuration to apply to ACC100 HW.
> + *
> + * @return
> + *   Zero on success, negative value on failure.
> + */
> +__rte_experimental
> +int
> +acc100_configure(const char *dev_name, struct acc100_conf *conf);
> +
>  #ifdef __cplusplus
>  }
>  #endif
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
> index 3589814..b50dd32 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> @@ -85,6 +85,26 @@
>  
>  enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC};
>  
> +/* Return the accelerator enum for a Queue Group Index */
> +static inline int
> +accFromQgid(int qg_idx, const struct acc100_conf *acc100_conf)
> +{
> +	int accQg[ACC100_NUM_QGRPS];
> +	int NumQGroupsPerFn[NUM_ACC];
> +	int acc, qgIdx, qgIndex = 0;
> +	for (qgIdx = 0; qgIdx < ACC100_NUM_QGRPS; qgIdx++)
> +		accQg[qgIdx] = 0;
> +	NumQGroupsPerFn[UL_4G] = acc100_conf->q_ul_4g.num_qgroups;
> +	NumQGroupsPerFn[UL_5G] = acc100_conf->q_ul_5g.num_qgroups;
> +	NumQGroupsPerFn[DL_4G] = acc100_conf->q_dl_4g.num_qgroups;
> +	NumQGroupsPerFn[DL_5G] = acc100_conf->q_dl_5g.num_qgroups;
> +	for (acc = UL_4G;  acc < NUM_ACC; acc++)
> +		for (qgIdx = 0; qgIdx < NumQGroupsPerFn[acc]; qgIdx++)
> +			accQg[qgIndex++] = acc;

This looks inefficient; is there a way this could be calculated without filling
arrays just to access one value?
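
One array-free alternative is to walk the per-function group counts and subtract until the index lands in a range. A standalone sketch; acc_from_qgid_sketch() and num_qgroups[] are illustrative stand-ins for accFromQgid() and the four acc100_conf q_*.num_qgroups fields:

```c
#include <assert.h>

#define NUM_ACC_SKETCH 4

/* Walk UL_4G..DL_5G in order; the queue group belongs to the first
 * function whose remaining count covers qg_idx. An out-of-range index
 * maps to 0, matching the zero-initialized accQg[] in the original. */
static int acc_from_qgid_sketch(int qg_idx,
		const int num_qgroups[NUM_ACC_SKETCH])
{
	int acc;

	for (acc = 0; acc < NUM_ACC_SKETCH; acc++) {
		if (qg_idx < num_qgroups[acc])
			return acc;
		qg_idx -= num_qgroups[acc];
	}
	return 0;
}
```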

> +	acc = accQg[qg_idx];
> +	return acc;
> +}
> +
>  /* Return the queue topology for a Queue Group Index */
>  static inline void
>  qtopFromAcc(struct rte_q_topology_t **qtop, int acc_enum,
> @@ -113,6 +133,30 @@
>  	*qtop = p_qtop;
>  }
>  
> +/* Return the AQ depth for a Queue Group Index */
> +static inline int
> +aqDepth(int qg_idx, struct acc100_conf *acc100_conf)
> +{
> +	struct rte_q_topology_t *q_top = NULL;
> +	int acc_enum = accFromQgid(qg_idx, acc100_conf);
> +	qtopFromAcc(&q_top, acc_enum, acc100_conf);
> +	if (unlikely(q_top == NULL))
> +		return 0;

This error is not handled well by the callers.

aqNum is similar.

> +	return q_top->aq_depth_log2;
> +}
> +
> +/* Return the AQ depth for a Queue Group Index */
> +static inline int
> +aqNum(int qg_idx, struct acc100_conf *acc100_conf)
> +{
> +	struct rte_q_topology_t *q_top = NULL;
> +	int acc_enum = accFromQgid(qg_idx, acc100_conf);
> +	qtopFromAcc(&q_top, acc_enum, acc100_conf);
> +	if (unlikely(q_top == NULL))
> +		return 0;
> +	return q_top->num_aqs_per_groups;
> +}
> +
>  static void
>  initQTop(struct acc100_conf *acc100_conf)
>  {
> @@ -4177,3 +4221,464 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
>  RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
>  RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
>  RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
> +
> +/*
> + * Implementation to fix the power on status of some 5GUL engines
> + * This requires DMA permission if ported outside DPDK
This sounds like a workaround; can more detail be added here?
> + */
> +static void
> +poweron_cleanup(struct rte_bbdev *bbdev, struct acc100_device *d,
> +		struct acc100_conf *conf)
> +{
> +	int i, template_idx, qg_idx;
> +	uint32_t address, status, payload;
> +	printf("Need to clear power-on 5GUL status in internal memory\n");
> +	/* Reset LDPC Cores */
> +	for (i = 0; i < ACC100_ENGINES_MAX; i++)
> +		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
> +				ACC100_ENGINE_OFFSET * i, ACC100_RESET_HI);
> +	usleep(LONG_WAIT);
> +	for (i = 0; i < ACC100_ENGINES_MAX; i++)
> +		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
> +				ACC100_ENGINE_OFFSET * i, ACC100_RESET_LO);
> +	usleep(LONG_WAIT);
> +	/* Prepare dummy workload */
> +	alloc_2x64mb_sw_rings_mem(bbdev, d, 0);
> +	/* Set base addresses */
> +	uint32_t phys_high = (uint32_t)(d->sw_rings_phys >> 32);
> +	uint32_t phys_low  = (uint32_t)(d->sw_rings_phys &
> +			~(ACC100_SIZE_64MBYTE-1));
> +	acc100_reg_write(d, HWPfDmaFec5GulDescBaseHiRegVf, phys_high);
> +	acc100_reg_write(d, HWPfDmaFec5GulDescBaseLoRegVf, phys_low);
> +
> +	/* Descriptor for a dummy 5GUL code block processing*/
> +	union acc100_dma_desc *desc = NULL;
> +	desc = d->sw_rings;
> +	desc->req.data_ptrs[0].address = d->sw_rings_phys +
> +			ACC100_DESC_FCW_OFFSET;
> +	desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
> +	desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
> +	desc->req.data_ptrs[0].last = 0;
> +	desc->req.data_ptrs[0].dma_ext = 0;
> +	desc->req.data_ptrs[1].address = d->sw_rings_phys + 512;
> +	desc->req.data_ptrs[1].blkid = ACC100_DMA_BLKID_IN;
> +	desc->req.data_ptrs[1].last = 1;
> +	desc->req.data_ptrs[1].dma_ext = 0;
> +	desc->req.data_ptrs[1].blen = 44;
> +	desc->req.data_ptrs[2].address = d->sw_rings_phys + 1024;
> +	desc->req.data_ptrs[2].blkid = ACC100_DMA_BLKID_OUT_ENC;
> +	desc->req.data_ptrs[2].last = 1;
> +	desc->req.data_ptrs[2].dma_ext = 0;
> +	desc->req.data_ptrs[2].blen = 5;
> +	/* Dummy FCW */
> +	desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
> +	desc->req.fcw_ld.qm = 1;
> +	desc->req.fcw_ld.nfiller = 30;
> +	desc->req.fcw_ld.BG = 2 - 1;
> +	desc->req.fcw_ld.Zc = 7;
> +	desc->req.fcw_ld.ncb = 350;
> +	desc->req.fcw_ld.rm_e = 4;
> +	desc->req.fcw_ld.itmax = 10;
> +	desc->req.fcw_ld.gain_i = 1;
> +	desc->req.fcw_ld.gain_h = 1;
> +
> +	int engines_to_restart[SIG_UL_5G_LAST + 1] = {0};
> +	int num_failed_engine = 0;
> +	/* Detect engines in undefined state */
> +	for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
> +			template_idx++) {
> +		/* Check engine power-on status */
> +		address = HwPfFecUl5gIbDebugReg +
> +				ACC100_ENGINE_OFFSET * template_idx;
> +		status = (acc100_reg_read(d, address) >> 4) & 0xF;
> +		if (status == 0) {
> +			engines_to_restart[num_failed_engine] = template_idx;
> +			num_failed_engine++;
> +		}
> +	}
> +
> +	int numQqsAcc = conf->q_ul_5g.num_qgroups;
> +	int numQgs = conf->q_ul_5g.num_qgroups;
> +	payload = 0;
> +	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
> +		payload |= (1 << qg_idx);
> +	/* Force each engine which is in unspecified state */
> +	for (i = 0; i < num_failed_engine; i++) {
> +		int failed_engine = engines_to_restart[i];
> +		printf("Force engine %d\n", failed_engine);
> +		for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
> +				template_idx++) {
> +			address = HWPfQmgrGrpTmplateReg4Indx
> +					+ BYTES_IN_WORD * template_idx;
> +			if (template_idx == failed_engine)
> +				acc100_reg_write(d, address, payload);
> +			else
> +				acc100_reg_write(d, address, 0);
> +		}
> +		/* Reset descriptor header */
> +		desc->req.word0 = ACC100_DMA_DESC_TYPE;
> +		desc->req.word1 = 0;
> +		desc->req.word2 = 0;
> +		desc->req.word3 = 0;
> +		desc->req.numCBs = 1;
> +		desc->req.m2dlen = 2;
> +		desc->req.d2mlen = 1;
> +		/* Enqueue the code block for processing */
> +		union acc100_enqueue_reg_fmt enq_req;
> +		enq_req.val = 0;
> +		enq_req.addr_offset = ACC100_DESC_OFFSET;
> +		enq_req.num_elem = 1;
> +		enq_req.req_elem_addr = 0;
> +		rte_wmb();
> +		acc100_reg_write(d, HWPfQmgrIngressAq + 0x100, enq_req.val);
> +		usleep(LONG_WAIT * 100);
> +		if (desc->req.word0 != 2)
> +			printf("DMA Response %#"PRIx32"\n", desc->req.word0);
> +	}
> +
> +	/* Reset LDPC Cores */
> +	for (i = 0; i < ACC100_ENGINES_MAX; i++)
> +		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
> +				ACC100_ENGINE_OFFSET * i, ACC100_RESET_HI);
> +	usleep(LONG_WAIT);
> +	for (i = 0; i < ACC100_ENGINES_MAX; i++)
> +		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
> +				ACC100_ENGINE_OFFSET * i, ACC100_RESET_LO);
> +	usleep(LONG_WAIT);
> +	acc100_reg_write(d, HWPfHi5GHardResetReg, ACC100_RESET_HARD);
> +	usleep(LONG_WAIT);
> +	int numEngines = 0;
> +	/* Check engine power-on status again */
> +	for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
> +			template_idx++) {
> +		address = HwPfFecUl5gIbDebugReg +
> +				ACC100_ENGINE_OFFSET * template_idx;
> +		status = (acc100_reg_read(d, address) >> 4) & 0xF;
> +		address = HWPfQmgrGrpTmplateReg4Indx
> +				+ BYTES_IN_WORD * template_idx;
> +		if (status == 1) {
> +			acc100_reg_write(d, address, payload);
> +			numEngines++;
> +		} else
> +			acc100_reg_write(d, address, 0);
> +	}
> +	printf("Number of 5GUL engines %d\n", numEngines);
> +
> +	if (d->sw_rings_base != NULL)
> +		rte_free(d->sw_rings_base);
> +	usleep(LONG_WAIT);
> +}
> +
> +/* Initial configuration of an ACC100 device prior to running configure() */
> +int
> +acc100_configure(const char *dev_name, struct acc100_conf *conf)
> +{
> +	rte_bbdev_log(INFO, "acc100_configure");
> +	uint32_t payload, address, status;

Maybe value or data would be a better variable name than payload.

That would mean changing acc100_reg_write as well.

> +	int qg_idx, template_idx, vf_idx, acc, i;
> +	struct rte_bbdev *bbdev = rte_bbdev_get_named_dev(dev_name);
> +
> +	/* Compile time checks */
> +	RTE_BUILD_BUG_ON(sizeof(struct acc100_dma_req_desc) != 256);
> +	RTE_BUILD_BUG_ON(sizeof(union acc100_dma_desc) != 256);
> +	RTE_BUILD_BUG_ON(sizeof(struct acc100_fcw_td) != 24);
> +	RTE_BUILD_BUG_ON(sizeof(struct acc100_fcw_te) != 32);
> +
> +	if (bbdev == NULL) {
> +		rte_bbdev_log(ERR,
> +		"Invalid dev_name (%s), or device is not yet initialised",
> +		dev_name);
> +		return -ENODEV;
> +	}
> +	struct acc100_device *d = bbdev->data->dev_private;
> +
> +	/* Store configuration */
> +	rte_memcpy(&d->acc100_conf, conf, sizeof(d->acc100_conf));
> +
> +	/* PCIe Bridge configuration */
> +	acc100_reg_write(d, HwPfPcieGpexBridgeControl, ACC100_CFG_PCI_BRIDGE);
> +	for (i = 1; i < 17; i++)

17 is a magic number; use a #define.

This is a general issue.

> +		acc100_reg_write(d,
> +				HwPfPcieGpexAxiAddrMappingWindowPexBaseHigh
> +				+ i * 16, 0);
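
A hedged sketch of how the bare 17/16 in this loop could be named: the window count and the 16-byte register stride here are read off the patch, not taken from hardware documentation.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, mirroring the loop in the patch */
#define ACC100_GPEX_AXIMAP_NUM    17 /* windows 1..16 are cleared */
#define ACC100_GPEX_AXIMAP_STRIDE 16 /* bytes between window registers */

/* Compute the register address of one address-mapping window, given a
 * base register offset (stands in for the PexBaseHigh register name).
 */
static uint32_t
aximap_window_addr(uint32_t base, int window)
{
	return base + (uint32_t)window * ACC100_GPEX_AXIMAP_STRIDE;
}
```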
> +
> +	/* PCIe Link Training and Status State Machine */
> +	acc100_reg_write(d, HwPfPcieGpexLtssmStateCntrl, 0xDFC00000);
> +
> +	/* Prevent blocking AXI read on BRESP for AXI Write */
> +	address = HwPfPcieGpexAxiPioControl;
> +	payload = ACC100_CFG_PCI_AXI;
> +	acc100_reg_write(d, address, payload);
> +
> +	/* 5GDL PLL phase shift */
> +	acc100_reg_write(d, HWPfChaDl5gPllPhshft0, 0x1);
> +
> +	/* Explicitly releasing AXI as this may be stopped after PF FLR/BME */
> +	address = HWPfDmaAxiControl;
> +	payload = 1;
> +	acc100_reg_write(d, address, payload);
> +
> +	/* DDR Configuration */
> +	address = HWPfDdrBcTim6;
> +	payload = acc100_reg_read(d, address);
> +	payload &= 0xFFFFFFFB; /* Bit 2 */
> +#ifdef ACC100_DDR_ECC_ENABLE
> +	payload |= 0x4;
> +#endif
> +	acc100_reg_write(d, address, payload);
> +	address = HWPfDdrPhyDqsCountNum;
> +#ifdef ACC100_DDR_ECC_ENABLE
> +	payload = 9;
> +#else
> +	payload = 8;
> +#endif
> +	acc100_reg_write(d, address, payload);
> +
> +	/* Set default descriptor signature */
> +	address = HWPfDmaDescriptorSignatuture;
> +	payload = 0;
> +	acc100_reg_write(d, address, payload);
> +
> +	/* Enable the Error Detection in DMA */
> +	payload = ACC100_CFG_DMA_ERROR;
> +	address = HWPfDmaErrorDetectionEn;
> +	acc100_reg_write(d, address, payload);
> +
> +	/* AXI Cache configuration */
> +	payload = ACC100_CFG_AXI_CACHE;
> +	address = HWPfDmaAxcacheReg;
> +	acc100_reg_write(d, address, payload);
> +
> +	/* Default DMA Configuration (Qmgr Enabled) */
> +	address = HWPfDmaConfig0Reg;
> +	payload = 0;
> +	acc100_reg_write(d, address, payload);
> +	address = HWPfDmaQmanen;
> +	payload = 0;
> +	acc100_reg_write(d, address, payload);
> +
> +	/* Default RLIM/ALEN configuration */
> +	address = HWPfDmaConfig1Reg;
> +	payload = (1 << 31) + (23 << 8) + (1 << 6) + 7;
> +	acc100_reg_write(d, address, payload);
> +
> +	/* Configure DMA Qmanager addresses */
> +	address = HWPfDmaQmgrAddrReg;
> +	payload = HWPfQmgrEgressQueuesTemplate;
> +	acc100_reg_write(d, address, payload);
> +
> +	/* ===== Qmgr Configuration ===== */
> +	/* Configuration of the AQueue Depth QMGR_GRP_0_DEPTH_LOG2 for UL */
> +	int totalQgs = conf->q_ul_4g.num_qgroups +
> +			conf->q_ul_5g.num_qgroups +
> +			conf->q_dl_4g.num_qgroups +
> +			conf->q_dl_5g.num_qgroups;
> +	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
> +		address = HWPfQmgrDepthLog2Grp +
> +		BYTES_IN_WORD * qg_idx;
> +		payload = aqDepth(qg_idx, conf);
> +		acc100_reg_write(d, address, payload);
> +		address = HWPfQmgrTholdGrp +
> +		BYTES_IN_WORD * qg_idx;
> +		payload = (1 << 16) + (1 << (aqDepth(qg_idx, conf) - 1));
> +		acc100_reg_write(d, address, payload);
> +	}
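
For reference, the threshold write above reduces to a small helper. The reading of bit 16 as an enable flag is an assumption inferred from the expression, not a documented fact.

```c
#include <assert.h>
#include <stdint.h>

/* Value written to HWPfQmgrTholdGrp in the patch:
 * (1 << 16) + (1 << (aq_depth_log2 - 1)), i.e. what appears to be a
 * flag at bit 16 (assumption) plus half the AQ depth in the low bits.
 */
static uint32_t
qmgr_thold_payload(int aq_depth_log2)
{
	return (1u << 16) + (1u << (aq_depth_log2 - 1));
}
```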
> +
> +	/* Template Priority in incremental order */
> +	for (template_idx = 0; template_idx < ACC100_NUM_TMPL;
> +			template_idx++) {
> +		address = HWPfQmgrGrpTmplateReg0Indx +
> +		BYTES_IN_WORD * (template_idx % 8);
> +		payload = TMPL_PRI_0;
> +		acc100_reg_write(d, address, payload);
> +		address = HWPfQmgrGrpTmplateReg1Indx +
> +		BYTES_IN_WORD * (template_idx % 8);
> +		payload = TMPL_PRI_1;
> +		acc100_reg_write(d, address, payload);
> +		address = HWPfQmgrGrpTmplateReg2indx +
> +		BYTES_IN_WORD * (template_idx % 8);
> +		payload = TMPL_PRI_2;
> +		acc100_reg_write(d, address, payload);
> +		address = HWPfQmgrGrpTmplateReg3Indx +
> +		BYTES_IN_WORD * (template_idx % 8);
> +		payload = TMPL_PRI_3;
> +		acc100_reg_write(d, address, payload);
> +	}
> +
> +	address = HWPfQmgrGrpPriority;
> +	payload = ACC100_CFG_QMGR_HI_P;
> +	acc100_reg_write(d, address, payload);
> +
> +	/* Template Configuration */
> +	for (template_idx = 0; template_idx < ACC100_NUM_TMPL; template_idx++) {
> +		payload = 0;
> +		address = HWPfQmgrGrpTmplateReg4Indx
> +				+ BYTES_IN_WORD * template_idx;
> +		acc100_reg_write(d, address, payload);
> +	}
> +	/* 4GUL */
> +	int numQgs = conf->q_ul_4g.num_qgroups;
> +	int numQqsAcc = 0;
> +	payload = 0;
> +	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
> +		payload |= (1 << qg_idx);
> +	for (template_idx = SIG_UL_4G; template_idx <= SIG_UL_4G_LAST;
> +			template_idx++) {
> +		address = HWPfQmgrGrpTmplateReg4Indx
> +				+ BYTES_IN_WORD*template_idx;
> +		acc100_reg_write(d, address, payload);
> +	}
> +	/* 5GUL */
> +	numQqsAcc += numQgs;
> +	numQgs	= conf->q_ul_5g.num_qgroups;
> +	payload = 0;
> +	int numEngines = 0;
> +	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
> +		payload |= (1 << qg_idx);
> +	for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
> +			template_idx++) {
> +		/* Check engine power-on status */
> +		address = HwPfFecUl5gIbDebugReg +
> +				ACC100_ENGINE_OFFSET * template_idx;
> +		status = (acc100_reg_read(d, address) >> 4) & 0xF;
> +		address = HWPfQmgrGrpTmplateReg4Indx
> +				+ BYTES_IN_WORD * template_idx;
> +		if (status == 1) {
> +			acc100_reg_write(d, address, payload);
> +			numEngines++;
> +		} else
> +			acc100_reg_write(d, address, 0);
> +		#if RTE_ACC100_SINGLE_FEC == 1
The #if should be at the start of the line.
> +		payload = 0;
> +		#endif
> +	}
> +	printf("Number of 5GUL engines %d\n", numEngines);
> +	/* 4GDL */
> +	numQqsAcc += numQgs;
> +	numQgs	= conf->q_dl_4g.num_qgroups;
> +	payload = 0;
> +	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
> +		payload |= (1 << qg_idx);
> +	for (template_idx = SIG_DL_4G; template_idx <= SIG_DL_4G_LAST;
> +			template_idx++) {
> +		address = HWPfQmgrGrpTmplateReg4Indx
> +				+ BYTES_IN_WORD*template_idx;
> +		acc100_reg_write(d, address, payload);
> +		#if RTE_ACC100_SINGLE_FEC == 1
> +			payload = 0;
> +		#endif
> +	}
> +	/* 5GDL */
> +	numQqsAcc += numQgs;
> +	numQgs	= conf->q_dl_5g.num_qgroups;
> +	payload = 0;
> +	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
> +		payload |= (1 << qg_idx);
> +	for (template_idx = SIG_DL_5G; template_idx <= SIG_DL_5G_LAST;
> +			template_idx++) {
> +		address = HWPfQmgrGrpTmplateReg4Indx
> +				+ BYTES_IN_WORD*template_idx;
> +		acc100_reg_write(d, address, payload);
> +		#if RTE_ACC100_SINGLE_FEC == 1
> +		payload = 0;
> +		#endif
> +	}
> +
> +	/* Queue Group Function mapping */
> +	int qman_func_id[5] = {0, 2, 1, 3, 4};
> +	address = HWPfQmgrGrpFunction0;
> +	payload = 0;
> +	for (qg_idx = 0; qg_idx < 8; qg_idx++) {
> +		acc = accFromQgid(qg_idx, conf);
> +		payload |= qman_func_id[acc]<<(qg_idx * 4);
> +	}
> +	acc100_reg_write(d, address, payload);
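
The nibble packing above can be captured in a short sketch; the helper name is hypothetical, and the function-id table is copied from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Pack per-queue-group function ids into one 32-bit register value:
 * each of up to 8 queue groups occupies a 4-bit nibble, as written to
 * HWPfQmgrGrpFunction0 in the patch.
 */
static uint32_t
pack_qgrp_functions(const int *func_id, int num_qgrps)
{
	uint32_t payload = 0;
	int qg_idx;

	for (qg_idx = 0; qg_idx < num_qgrps; qg_idx++)
		payload |= (uint32_t)func_id[qg_idx] << (qg_idx * 4);
	return payload;
}
```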
> +
> +	/* Configuration of the Arbitration QGroup depth to 1 */
> +	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
> +		address = HWPfQmgrArbQDepthGrp +
> +		BYTES_IN_WORD * qg_idx;
> +		payload = 0;
> +		acc100_reg_write(d, address, payload);
> +	}
> +
> +	/* Enabling AQueues through the Queue hierarchy*/
> +	for (vf_idx = 0; vf_idx < ACC100_NUM_VFS; vf_idx++) {
> +		for (qg_idx = 0; qg_idx < ACC100_NUM_QGRPS; qg_idx++) {
> +			payload = 0;
> +			if (vf_idx < conf->num_vf_bundles &&
> +					qg_idx < totalQgs)
> +				payload = (1 << aqNum(qg_idx, conf)) - 1;
> +			address = HWPfQmgrAqEnableVf
> +					+ vf_idx * BYTES_IN_WORD;
> +			payload += (qg_idx << 16);
> +			acc100_reg_write(d, address, payload);
> +		}
> +	}
> +
> +	/* This pointer to ARAM (256kB) is shifted by 2 (4B per register) */
> +	uint32_t aram_address = 0;
> +	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
> +		for (vf_idx = 0; vf_idx < conf->num_vf_bundles; vf_idx++) {
> +			address = HWPfQmgrVfBaseAddr + vf_idx
> +					* BYTES_IN_WORD + qg_idx
> +					* BYTES_IN_WORD * 64;
> +			payload = aram_address;
> +			acc100_reg_write(d, address, payload);
> +			/* Offset ARAM Address for next memory bank
> +			 * - increment of 4B
> +			 */
> +			aram_address += aqNum(qg_idx, conf) *
> +					(1 << aqDepth(qg_idx, conf));
> +		}
> +	}
> +
> +	if (aram_address > WORDS_IN_ARAM_SIZE) {
> +		rte_bbdev_log(ERR, "ARAM Configuration not fitting %d %d\n",
> +				aram_address, WORDS_IN_ARAM_SIZE);
> +		return -EINVAL;
> +	}
> +
> +	/* ==== HI Configuration ==== */
> +
> +	/* Prevent Block on Transmit Error */
> +	address = HWPfHiBlockTransmitOnErrorEn;
> +	payload = 0;
> +	acc100_reg_write(d, address, payload);
> +	/* Prevents to drop MSI */
> +	address = HWPfHiMsiDropEnableReg;
> +	payload = 0;
> +	acc100_reg_write(d, address, payload);
> +	/* Set the PF Mode register */
> +	address = HWPfHiPfMode;
> +	payload = (conf->pf_mode_en) ? 2 : 0;
> +	acc100_reg_write(d, address, payload);
> +	/* Enable Error Detection in HW */
> +	address = HWPfDmaErrorDetectionEn;
> +	payload = 0x3D7;
> +	acc100_reg_write(d, address, payload);
> +
> +	/* QoS overflow init */
> +	payload = 1;
> +	address = HWPfQosmonAEvalOverflow0;
> +	acc100_reg_write(d, address, payload);
> +	address = HWPfQosmonBEvalOverflow0;
> +	acc100_reg_write(d, address, payload);
> +
> +	/* HARQ DDR Configuration */
> +	unsigned int ddrSizeInMb = 512; /* Fixed to 512 MB per VF for now */
> +	for (vf_idx = 0; vf_idx < conf->num_vf_bundles; vf_idx++) {
> +		address = HWPfDmaVfDdrBaseRw + vf_idx
> +				* 0x10;
> +		payload = ((vf_idx * (ddrSizeInMb / 64)) << 16) +
> +				(ddrSizeInMb - 1);
> +		acc100_reg_write(d, address, payload);
> +	}
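
The per-VF register value above amounts to the following arithmetic (a sketch that simply mirrors the patch's expression): base offset in 64 MB units in the upper 16 bits, size minus one in MB in the lower bits.

```c
#include <assert.h>
#include <stdint.h>

/* Value written to HWPfDmaVfDdrBaseRw for one VF, mirroring the loop
 * in the patch: ((vf_idx * (ddr_size_mb / 64)) << 16) + (ddr_size_mb - 1).
 */
static uint32_t
harq_ddr_reg_value(unsigned int vf_idx, unsigned int ddr_size_mb)
{
	return ((vf_idx * (ddr_size_mb / 64)) << 16) + (ddr_size_mb - 1);
}
```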
> +	usleep(LONG_WAIT);
Is the sleep needed here? The reg_write already has one.
> +

Since this seems like a workaround, add a comment here.

Tom

> +	if (numEngines < (SIG_UL_5G_LAST + 1))
> +		poweron_cleanup(bbdev, d, conf);
> +
> +	rte_bbdev_log_debug("PF Tip configuration complete for %s", dev_name);
> +	return 0;
> +}
> diff --git a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> index 4a76d1d..91c234d 100644
> --- a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> +++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> @@ -1,3 +1,10 @@
>  DPDK_21 {
>  	local: *;
>  };
> +
> +EXPERIMENTAL {
> +	global:
> +
> +	acc100_configure;
> +
> +};


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v9 10/10] baseband/acc100: add configure function
  2020-09-30 19:58       ` Tom Rix
@ 2020-09-30 22:54         ` Chautru, Nicolas
  2020-10-01 16:18           ` Tom Rix
  0 siblings, 1 reply; 213+ messages in thread
From: Chautru, Nicolas @ 2020-09-30 22:54 UTC (permalink / raw)
  To: Tom Rix, dev, akhil.goyal
  Cc: Richardson, Bruce, Xu, Rosen, dave.burley, aidan.goddard, Yigit,
	Ferruh, Liu, Tianjiao

Hi Tom, 

> From: Tom Rix <trix@redhat.com>
> On 9/28/20 5:29 PM, Nicolas Chautru wrote:
> > Add configure function to configure the PF from within the
> > bbdev-test itself without an external application configuring the device.
> >
> > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
> > ---
> >  app/test-bbdev/test_bbdev_perf.c                   |  72 +++
> >  doc/guides/rel_notes/release_20_11.rst             |   5 +
> >  drivers/baseband/acc100/meson.build                |   2 +
> >  drivers/baseband/acc100/rte_acc100_cfg.h           |  17 +
> >  drivers/baseband/acc100/rte_acc100_pmd.c           | 505
> +++++++++++++++++++++
> >  .../acc100/rte_pmd_bbdev_acc100_version.map        |   7 +
> >  6 files changed, 608 insertions(+)
> >
> > diff --git a/app/test-bbdev/test_bbdev_perf.c
> > b/app/test-bbdev/test_bbdev_perf.c
> > index 45c0d62..32f23ff 100644
> > --- a/app/test-bbdev/test_bbdev_perf.c
> > +++ b/app/test-bbdev/test_bbdev_perf.c
> > @@ -52,6 +52,18 @@
> >  #define FLR_5G_TIMEOUT 610
> >  #endif
> >
> > +#ifdef RTE_LIBRTE_PMD_BBDEV_ACC100
> > +#include <rte_acc100_cfg.h>
> > +#define ACC100PF_DRIVER_NAME   ("intel_acc100_pf")
> > +#define ACC100VF_DRIVER_NAME   ("intel_acc100_vf")
> > +#define ACC100_QMGR_NUM_AQS 16
> > +#define ACC100_QMGR_NUM_QGS 2
> > +#define ACC100_QMGR_AQ_DEPTH 5
> > +#define ACC100_QMGR_INVALID_IDX -1
> > +#define ACC100_QMGR_RR 1
> > +#define ACC100_QOS_GBR 0
> > +#endif
> > +
> >  #define OPS_CACHE_SIZE 256U
> >  #define OPS_POOL_SIZE_MIN 511U /* 0.5K per queue */
> >
> > @@ -653,6 +665,66 @@ typedef int (test_case_function)(struct
> active_device *ad,
> >  				info->dev_name);
> >  	}
> >  #endif
> > +#ifdef RTE_LIBRTE_PMD_BBDEV_ACC100
> It seems like this function would break if one of the other bbdevs were
> #defined.

No, these are independent. By default they are all defined.


> > +	if ((get_init_device() == true) &&
> > +		(!strcmp(info->drv.driver_name,
> ACC100PF_DRIVER_NAME))) {
> > +		struct acc100_conf conf;
> > +		unsigned int i;
> > +
> > +		printf("Configure ACC100 FEC Driver %s with default
> values\n",
> > +				info->drv.driver_name);
> > +
> > +		/* clear default configuration before initialization */
> > +		memset(&conf, 0, sizeof(struct acc100_conf));
> > +
> > +		/* Always set in PF mode for built-in configuration */
> > +		conf.pf_mode_en = true;
> > +		for (i = 0; i < RTE_ACC100_NUM_VFS; ++i) {
> > +			conf.arb_dl_4g[i].gbr_threshold1 =
> ACC100_QOS_GBR;
> > +			conf.arb_dl_4g[i].gbr_threshold1 =
> ACC100_QOS_GBR;
> > +			conf.arb_dl_4g[i].round_robin_weight =
> ACC100_QMGR_RR;
> > +			conf.arb_ul_4g[i].gbr_threshold1 =
> ACC100_QOS_GBR;
> > +			conf.arb_ul_4g[i].gbr_threshold1 =
> ACC100_QOS_GBR;
> > +			conf.arb_ul_4g[i].round_robin_weight =
> ACC100_QMGR_RR;
> > +			conf.arb_dl_5g[i].gbr_threshold1 =
> ACC100_QOS_GBR;
> > +			conf.arb_dl_5g[i].gbr_threshold1 =
> ACC100_QOS_GBR;
> > +			conf.arb_dl_5g[i].round_robin_weight =
> ACC100_QMGR_RR;
> > +			conf.arb_ul_5g[i].gbr_threshold1 =
> ACC100_QOS_GBR;
> > +			conf.arb_ul_5g[i].gbr_threshold1 =
> ACC100_QOS_GBR;
> > +			conf.arb_ul_5g[i].round_robin_weight =
> ACC100_QMGR_RR;
> > +		}
> > +
> > +		conf.input_pos_llr_1_bit = true;
> > +		conf.output_pos_llr_1_bit = true;
> > +		conf.num_vf_bundles = 1; /**< Number of VF bundles to
> setup */
> > +
> > +		conf.q_ul_4g.num_qgroups = ACC100_QMGR_NUM_QGS;
> > +		conf.q_ul_4g.first_qgroup_index =
> ACC100_QMGR_INVALID_IDX;
> > +		conf.q_ul_4g.num_aqs_per_groups =
> ACC100_QMGR_NUM_AQS;
> > +		conf.q_ul_4g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
> > +		conf.q_dl_4g.num_qgroups = ACC100_QMGR_NUM_QGS;
> > +		conf.q_dl_4g.first_qgroup_index =
> ACC100_QMGR_INVALID_IDX;
> > +		conf.q_dl_4g.num_aqs_per_groups =
> ACC100_QMGR_NUM_AQS;
> > +		conf.q_dl_4g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
> > +		conf.q_ul_5g.num_qgroups = ACC100_QMGR_NUM_QGS;
> > +		conf.q_ul_5g.first_qgroup_index =
> ACC100_QMGR_INVALID_IDX;
> > +		conf.q_ul_5g.num_aqs_per_groups =
> ACC100_QMGR_NUM_AQS;
> > +		conf.q_ul_5g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
> > +		conf.q_dl_5g.num_qgroups = ACC100_QMGR_NUM_QGS;
> > +		conf.q_dl_5g.first_qgroup_index =
> ACC100_QMGR_INVALID_IDX;
> > +		conf.q_dl_5g.num_aqs_per_groups =
> ACC100_QMGR_NUM_AQS;
> > +		conf.q_dl_5g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
> > +
> > +		/* setup PF with configuration information */
> > +		ret = acc100_configure(info->dev_name, &conf);
> > +		TEST_ASSERT_SUCCESS(ret,
> > +				"Failed to configure ACC100 PF for bbdev
> %s",
> > +				info->dev_name);
> > +		/* Let's refresh this now this is configured */
> > +	}
> > +	rte_bbdev_info_get(dev_id, info);
> The other bbdevs do not call rte_bbdev_info_get; can this be removed?

Actually it should be added outside for all versions post-configuration. Thanks

> > +#endif
> > +
> >  	nb_queues = RTE_MIN(rte_lcore_count(), info->drv.max_num_queues);
> >  	nb_queues = RTE_MIN(nb_queues, (unsigned int) MAX_QUEUES);
> >
> > diff --git a/doc/guides/rel_notes/release_20_11.rst
> > b/doc/guides/rel_notes/release_20_11.rst
> > index 73ac08f..c8d0586 100644
> > --- a/doc/guides/rel_notes/release_20_11.rst
> > +++ b/doc/guides/rel_notes/release_20_11.rst
> > @@ -55,6 +55,11 @@ New Features
> >       Also, make sure to start the actual text at the margin.
> >       =======================================================
> >
> > +* **Added Intel ACC100 bbdev PMD.**
> > +
> > +  Added a new ``acc100`` bbdev driver for the Intel\ |reg| ACC100 
> > + accelerator  also known as Mount Bryce.  See the 
> > + :doc:`../bbdevs/acc100` BBDEV guide for more details on this new driver.
> >
> >  Removed Items
> >  -------------
> > diff --git a/drivers/baseband/acc100/meson.build
> > b/drivers/baseband/acc100/meson.build
> > index 8afafc2..7ac44dc 100644
> > --- a/drivers/baseband/acc100/meson.build
> > +++ b/drivers/baseband/acc100/meson.build
> > @@ -4,3 +4,5 @@
> >  deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci']
> >
> >  sources = files('rte_acc100_pmd.c')
> > +
> > +install_headers('rte_acc100_cfg.h')
> > diff --git a/drivers/baseband/acc100/rte_acc100_cfg.h
> > b/drivers/baseband/acc100/rte_acc100_cfg.h
> > index 73bbe36..7f523bc 100644
> > --- a/drivers/baseband/acc100/rte_acc100_cfg.h
> > +++ b/drivers/baseband/acc100/rte_acc100_cfg.h
> > @@ -89,6 +89,23 @@ struct acc100_conf {
> >  	struct rte_arbitration_t arb_dl_5g[RTE_ACC100_NUM_VFS];  };
> >
> > +/**
> > + * Configure a ACC100 device
> > + *
> > + * @param dev_name
> > + *   The name of the device. This is the short form of PCI BDF, e.g. 00:01.0.
> > + *   It can also be retrieved for a bbdev device from the dev_name field in
> the
> > + *   rte_bbdev_info structure returned by rte_bbdev_info_get().
> > + * @param conf
> > + *   Configuration to apply to ACC100 HW.
> > + *
> > + * @return
> > + *   Zero on success, negative value on failure.
> > + */
> > +__rte_experimental
> > +int
> > +acc100_configure(const char *dev_name, struct acc100_conf *conf);
> > +
> >  #ifdef __cplusplus
> >  }
> >  #endif
> > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> > b/drivers/baseband/acc100/rte_acc100_pmd.c
> > index 3589814..b50dd32 100644
> > --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> > +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> > @@ -85,6 +85,26 @@
> >
> >  enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC};
> >
> > +/* Return the accelerator enum for a Queue Group Index */
> > +static inline int
> > +accFromQgid(int qg_idx, const struct acc100_conf *acc100_conf)
> > +{
> > +	int accQg[ACC100_NUM_QGRPS];
> > +	int NumQGroupsPerFn[NUM_ACC];
> > +	int acc, qgIdx, qgIndex = 0;
> > +	for (qgIdx = 0; qgIdx < ACC100_NUM_QGRPS; qgIdx++)
> > +		accQg[qgIdx] = 0;
> > +	NumQGroupsPerFn[UL_4G] = acc100_conf->q_ul_4g.num_qgroups;
> > +	NumQGroupsPerFn[UL_5G] = acc100_conf->q_ul_5g.num_qgroups;
> > +	NumQGroupsPerFn[DL_4G] = acc100_conf->q_dl_4g.num_qgroups;
> > +	NumQGroupsPerFn[DL_5G] = acc100_conf->q_dl_5g.num_qgroups;
> > +	for (acc = UL_4G;  acc < NUM_ACC; acc++)
> > +		for (qgIdx = 0; qgIdx < NumQGroupsPerFn[acc]; qgIdx++)
> > +			accQg[qgIndex++] = acc;
> 
> This looks inefficient; is there a way this could be calculated
> without filling arrays just to access one value?

That is not time critical, and the same common code is run each time. 

> 
> > +	acc = accQg[qg_idx];
> > +	return acc;
> > +}
> > +
> >  /* Return the queue topology for a Queue Group Index */  static 
> > inline void  qtopFromAcc(struct rte_q_topology_t **qtop, int 
> > acc_enum, @@ -113,6 +133,30 @@
> >  	*qtop = p_qtop;
> >  }
> >
> > +/* Return the AQ depth for a Queue Group Index */
> > +static inline int
> > +aqDepth(int qg_idx, struct acc100_conf *acc100_conf)
> > +{
> > +	struct rte_q_topology_t *q_top = NULL;
> > +	int acc_enum = accFromQgid(qg_idx, acc100_conf);
> > +	qtopFromAcc(&q_top, acc_enum, acc100_conf);
> > +	if (unlikely(q_top == NULL))
> > +		return 0;
> 
> This error is not handled well by the callers.
> 
> aqNum is similar.

This fails on a consistent basis, by having no queue available and handling this as the default case.

> 
> > +	return q_top->aq_depth_log2;
> > +}
> > +
> > +/* Return the AQ depth for a Queue Group Index */
> > +static inline int
> > +aqNum(int qg_idx, struct acc100_conf *acc100_conf)
> > +{
> > +	struct rte_q_topology_t *q_top = NULL;
> > +	int acc_enum = accFromQgid(qg_idx, acc100_conf);
> > +	qtopFromAcc(&q_top, acc_enum, acc100_conf);
> > +	if (unlikely(q_top == NULL))
> > +		return 0;
> > +	return q_top->num_aqs_per_groups;
> > +}
> > +
> >  static void
> >  initQTop(struct acc100_conf *acc100_conf)  { @@ -4177,3 +4221,464 
> > @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev) 
> > RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME,
> > pci_id_acc100_pf_map);
> RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME,
> > acc100_pci_vf_driver);
> > RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME,
> > pci_id_acc100_vf_map);
> > +
> > +/*
> > + * Implementation to fix the power on status of some 5GUL engines
> > + * This requires DMA permission if ported outside DPDK
> This sounds like a workaround; can more detail be added here?

There are comments throughout the code, I believe:
  - /* Detect engines in undefined state */
  - /* Force each engine which is in unspecified state */
  - /* Reset LDPC Cores */
  - /* Check engine power-on status again */

Do you believe this is not explicit enough? The power-on status may be in an undefined state, hence these engines are activated with a dummy payload to make sure they are in a predictable state once the configuration is done.

> > + */
> > +static void
> > +poweron_cleanup(struct rte_bbdev *bbdev, struct acc100_device *d,
> > +		struct acc100_conf *conf)
> > +{
> > +	int i, template_idx, qg_idx;
> > +	uint32_t address, status, payload;
> > +	printf("Need to clear power-on 5GUL status in internal memory\n");
> > +	/* Reset LDPC Cores */
> > +	for (i = 0; i < ACC100_ENGINES_MAX; i++)
> > +		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
> > +				ACC100_ENGINE_OFFSET * i,
> ACC100_RESET_HI);
> > +	usleep(LONG_WAIT);
> > +	for (i = 0; i < ACC100_ENGINES_MAX; i++)
> > +		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
> > +				ACC100_ENGINE_OFFSET * i,
> ACC100_RESET_LO);
> > +	usleep(LONG_WAIT);
> > +	/* Prepare dummy workload */
> > +	alloc_2x64mb_sw_rings_mem(bbdev, d, 0);
> > +	/* Set base addresses */
> > +	uint32_t phys_high = (uint32_t)(d->sw_rings_phys >> 32);
> > +	uint32_t phys_low  = (uint32_t)(d->sw_rings_phys &
> > +			~(ACC100_SIZE_64MBYTE-1));
> > +	acc100_reg_write(d, HWPfDmaFec5GulDescBaseHiRegVf,
> phys_high);
> > +	acc100_reg_write(d, HWPfDmaFec5GulDescBaseLoRegVf, phys_low);
> > +
> > +	/* Descriptor for a dummy 5GUL code block processing*/
> > +	union acc100_dma_desc *desc = NULL;
> > +	desc = d->sw_rings;
> > +	desc->req.data_ptrs[0].address = d->sw_rings_phys +
> > +			ACC100_DESC_FCW_OFFSET;
> > +	desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
> > +	desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
> > +	desc->req.data_ptrs[0].last = 0;
> > +	desc->req.data_ptrs[0].dma_ext = 0;
> > +	desc->req.data_ptrs[1].address = d->sw_rings_phys + 512;
> > +	desc->req.data_ptrs[1].blkid = ACC100_DMA_BLKID_IN;
> > +	desc->req.data_ptrs[1].last = 1;
> > +	desc->req.data_ptrs[1].dma_ext = 0;
> > +	desc->req.data_ptrs[1].blen = 44;
> > +	desc->req.data_ptrs[2].address = d->sw_rings_phys + 1024;
> > +	desc->req.data_ptrs[2].blkid = ACC100_DMA_BLKID_OUT_ENC;
> > +	desc->req.data_ptrs[2].last = 1;
> > +	desc->req.data_ptrs[2].dma_ext = 0;
> > +	desc->req.data_ptrs[2].blen = 5;
> > +	/* Dummy FCW */
> > +	desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
> > +	desc->req.fcw_ld.qm = 1;
> > +	desc->req.fcw_ld.nfiller = 30;
> > +	desc->req.fcw_ld.BG = 2 - 1;
> > +	desc->req.fcw_ld.Zc = 7;
> > +	desc->req.fcw_ld.ncb = 350;
> > +	desc->req.fcw_ld.rm_e = 4;
> > +	desc->req.fcw_ld.itmax = 10;
> > +	desc->req.fcw_ld.gain_i = 1;
> > +	desc->req.fcw_ld.gain_h = 1;
> > +
> > +	int engines_to_restart[SIG_UL_5G_LAST + 1] = {0};
> > +	int num_failed_engine = 0;
> > +	/* Detect engines in undefined state */
> > +	for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
> > +			template_idx++) {
> > +		/* Check engine power-on status */
> > +		address = HwPfFecUl5gIbDebugReg +
> > +				ACC100_ENGINE_OFFSET * template_idx;
> > +		status = (acc100_reg_read(d, address) >> 4) & 0xF;
> > +		if (status == 0) {
> > +			engines_to_restart[num_failed_engine] =
> template_idx;
> > +			num_failed_engine++;
> > +		}
> > +	}
> > +
> > +	int numQqsAcc = conf->q_ul_5g.num_qgroups;
> > +	int numQgs = conf->q_ul_5g.num_qgroups;
> > +	payload = 0;
> > +	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc);
> qg_idx++)
> > +		payload |= (1 << qg_idx);
> > +	/* Force each engine which is in unspecified state */
> > +	for (i = 0; i < num_failed_engine; i++) {
> > +		int failed_engine = engines_to_restart[i];
> > +		printf("Force engine %d\n", failed_engine);
> > +		for (template_idx = SIG_UL_5G; template_idx <=
> SIG_UL_5G_LAST;
> > +				template_idx++) {
> > +			address = HWPfQmgrGrpTmplateReg4Indx
> > +					+ BYTES_IN_WORD * template_idx;
> > +			if (template_idx == failed_engine)
> > +				acc100_reg_write(d, address, payload);
> > +			else
> > +				acc100_reg_write(d, address, 0);
> > +		}
> > +		/* Reset descriptor header */
> > +		desc->req.word0 = ACC100_DMA_DESC_TYPE;
> > +		desc->req.word1 = 0;
> > +		desc->req.word2 = 0;
> > +		desc->req.word3 = 0;
> > +		desc->req.numCBs = 1;
> > +		desc->req.m2dlen = 2;
> > +		desc->req.d2mlen = 1;
> > +		/* Enqueue the code block for processing */
> > +		union acc100_enqueue_reg_fmt enq_req;
> > +		enq_req.val = 0;
> > +		enq_req.addr_offset = ACC100_DESC_OFFSET;
> > +		enq_req.num_elem = 1;
> > +		enq_req.req_elem_addr = 0;
> > +		rte_wmb();
> > +		acc100_reg_write(d, HWPfQmgrIngressAq + 0x100, enq_req.val);
> > +		usleep(LONG_WAIT * 100);
> > +		if (desc->req.word0 != 2)
> > +			printf("DMA Response %#"PRIx32"\n", desc->req.word0);
> > +	}
> > +
> > +	/* Reset LDPC Cores */
> > +	for (i = 0; i < ACC100_ENGINES_MAX; i++)
> > +		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
> > +				ACC100_ENGINE_OFFSET * i, ACC100_RESET_HI);
> > +	usleep(LONG_WAIT);
> > +	for (i = 0; i < ACC100_ENGINES_MAX; i++)
> > +		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
> > +				ACC100_ENGINE_OFFSET * i, ACC100_RESET_LO);
> > +	usleep(LONG_WAIT);
> > +	acc100_reg_write(d, HWPfHi5GHardResetReg, ACC100_RESET_HARD);
> > +	usleep(LONG_WAIT);
> > +	int numEngines = 0;
> > +	/* Check engine power-on status again */
> > +	for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
> > +			template_idx++) {
> > +		address = HwPfFecUl5gIbDebugReg +
> > +				ACC100_ENGINE_OFFSET * template_idx;
> > +		status = (acc100_reg_read(d, address) >> 4) & 0xF;
> > +		address = HWPfQmgrGrpTmplateReg4Indx
> > +				+ BYTES_IN_WORD * template_idx;
> > +		if (status == 1) {
> > +			acc100_reg_write(d, address, payload);
> > +			numEngines++;
> > +		} else
> > +			acc100_reg_write(d, address, 0);
> > +	}
> > +	printf("Number of 5GUL engines %d\n", numEngines);
> > +
> > +	if (d->sw_rings_base != NULL)
> > +		rte_free(d->sw_rings_base);
> > +	usleep(LONG_WAIT);
> > +}
> > +
> > +/* Initial configuration of an ACC100 device prior to running configure() */
> > +int
> > +acc100_configure(const char *dev_name, struct acc100_conf *conf)
> > +{
> > +	rte_bbdev_log(INFO, "acc100_configure");
> > +	uint32_t payload, address, status;
> 
> maybe value or data would be a better variable name than payload.
> 
> would mean changing acc100_reg_write

Transparent to me, but I can change it given DPDK uses the term "value".


> 
> > +	int qg_idx, template_idx, vf_idx, acc, i;
> > +	struct rte_bbdev *bbdev = rte_bbdev_get_named_dev(dev_name);
> > +
> > +	/* Compile time checks */
> > +	RTE_BUILD_BUG_ON(sizeof(struct acc100_dma_req_desc) != 256);
> > +	RTE_BUILD_BUG_ON(sizeof(union acc100_dma_desc) != 256);
> > +	RTE_BUILD_BUG_ON(sizeof(struct acc100_fcw_td) != 24);
> > +	RTE_BUILD_BUG_ON(sizeof(struct acc100_fcw_te) != 32);
> > +
> > +	if (bbdev == NULL) {
> > +		rte_bbdev_log(ERR,
> > +		"Invalid dev_name (%s), or device is not yet initialised",
> > +		dev_name);
> > +		return -ENODEV;
> > +	}
> > +	struct acc100_device *d = bbdev->data->dev_private;
> > +
> > +	/* Store configuration */
> > +	rte_memcpy(&d->acc100_conf, conf, sizeof(d->acc100_conf));
> > +
> > +	/* PCIe Bridge configuration */
> > +	acc100_reg_write(d, HwPfPcieGpexBridgeControl, ACC100_CFG_PCI_BRIDGE);
> > +	for (i = 1; i < 17; i++)
> 
> 17 is a magic number, use a #define
> 
> this is a general issue.

These are only used once but still agreed.

> 
> > +		acc100_reg_write(d,
> > +				HwPfPcieGpexAxiAddrMappingWindowPexBaseHigh
> > +				+ i * 16, 0);
> > +
> > +	/* PCIe Link Training and Status State Machine */
> > +	acc100_reg_write(d, HwPfPcieGpexLtssmStateCntrl, 0xDFC00000);
> > +
> > +	/* Prevent blocking AXI read on BRESP for AXI Write */
> > +	address = HwPfPcieGpexAxiPioControl;
> > +	payload = ACC100_CFG_PCI_AXI;
> > +	acc100_reg_write(d, address, payload);
> > +
> > +	/* 5GDL PLL phase shift */
> > +	acc100_reg_write(d, HWPfChaDl5gPllPhshft0, 0x1);
> > +
> > +	/* Explicitly releasing AXI as this may be stopped after PF FLR/BME */
> > +	address = HWPfDmaAxiControl;
> > +	payload = 1;
> > +	acc100_reg_write(d, address, payload);
> > +
> > +	/* DDR Configuration */
> > +	address = HWPfDdrBcTim6;
> > +	payload = acc100_reg_read(d, address);
> > +	payload &= 0xFFFFFFFB; /* Bit 2 */
> > +#ifdef ACC100_DDR_ECC_ENABLE
> > +	payload |= 0x4;
> > +#endif
> > +	acc100_reg_write(d, address, payload);
> > +	address = HWPfDdrPhyDqsCountNum;
> > +#ifdef ACC100_DDR_ECC_ENABLE
> > +	payload = 9;
> > +#else
> > +	payload = 8;
> > +#endif
> > +	acc100_reg_write(d, address, payload);
> > +
> > +	/* Set default descriptor signature */
> > +	address = HWPfDmaDescriptorSignatuture;
> > +	payload = 0;
> > +	acc100_reg_write(d, address, payload);
> > +
> > +	/* Enable the Error Detection in DMA */
> > +	payload = ACC100_CFG_DMA_ERROR;
> > +	address = HWPfDmaErrorDetectionEn;
> > +	acc100_reg_write(d, address, payload);
> > +
> > +	/* AXI Cache configuration */
> > +	payload = ACC100_CFG_AXI_CACHE;
> > +	address = HWPfDmaAxcacheReg;
> > +	acc100_reg_write(d, address, payload);
> > +
> > +	/* Default DMA Configuration (Qmgr Enabled) */
> > +	address = HWPfDmaConfig0Reg;
> > +	payload = 0;
> > +	acc100_reg_write(d, address, payload);
> > +	address = HWPfDmaQmanen;
> > +	payload = 0;
> > +	acc100_reg_write(d, address, payload);
> > +
> > +	/* Default RLIM/ALEN configuration */
> > +	address = HWPfDmaConfig1Reg;
> > +	payload = (1 << 31) + (23 << 8) + (1 << 6) + 7;
> > +	acc100_reg_write(d, address, payload);
> > +
> > +	/* Configure DMA Qmanager addresses */
> > +	address = HWPfDmaQmgrAddrReg;
> > +	payload = HWPfQmgrEgressQueuesTemplate;
> > +	acc100_reg_write(d, address, payload);
> > +
> > +	/* ===== Qmgr Configuration ===== */
> > +	/* Configuration of the AQueue Depth QMGR_GRP_0_DEPTH_LOG2 for UL */
> > +	int totalQgs = conf->q_ul_4g.num_qgroups +
> > +			conf->q_ul_5g.num_qgroups +
> > +			conf->q_dl_4g.num_qgroups +
> > +			conf->q_dl_5g.num_qgroups;
> > +	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
> > +		address = HWPfQmgrDepthLog2Grp +
> > +		BYTES_IN_WORD * qg_idx;
> > +		payload = aqDepth(qg_idx, conf);
> > +		acc100_reg_write(d, address, payload);
> > +		address = HWPfQmgrTholdGrp +
> > +		BYTES_IN_WORD * qg_idx;
> > +		payload = (1 << 16) + (1 << (aqDepth(qg_idx, conf) - 1));
> > +		acc100_reg_write(d, address, payload);
> > +	}
> > +
> > +	/* Template Priority in incremental order */
> > +	for (template_idx = 0; template_idx < ACC100_NUM_TMPL;
> > +			template_idx++) {
> > +		address = HWPfQmgrGrpTmplateReg0Indx +
> > +		BYTES_IN_WORD * (template_idx % 8);
> > +		payload = TMPL_PRI_0;
> > +		acc100_reg_write(d, address, payload);
> > +		address = HWPfQmgrGrpTmplateReg1Indx +
> > +		BYTES_IN_WORD * (template_idx % 8);
> > +		payload = TMPL_PRI_1;
> > +		acc100_reg_write(d, address, payload);
> > +		address = HWPfQmgrGrpTmplateReg2indx +
> > +		BYTES_IN_WORD * (template_idx % 8);
> > +		payload = TMPL_PRI_2;
> > +		acc100_reg_write(d, address, payload);
> > +		address = HWPfQmgrGrpTmplateReg3Indx +
> > +		BYTES_IN_WORD * (template_idx % 8);
> > +		payload = TMPL_PRI_3;
> > +		acc100_reg_write(d, address, payload);
> > +	}
> > +
> > +	address = HWPfQmgrGrpPriority;
> > +	payload = ACC100_CFG_QMGR_HI_P;
> > +	acc100_reg_write(d, address, payload);
> > +
> > +	/* Template Configuration */
> > +	for (template_idx = 0; template_idx < ACC100_NUM_TMPL;
> > +			template_idx++) {
> > +		payload = 0;
> > +		address = HWPfQmgrGrpTmplateReg4Indx
> > +				+ BYTES_IN_WORD * template_idx;
> > +		acc100_reg_write(d, address, payload);
> > +	}
> > +	/* 4GUL */
> > +	int numQgs = conf->q_ul_4g.num_qgroups;
> > +	int numQqsAcc = 0;
> > +	payload = 0;
> > +	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
> > +		payload |= (1 << qg_idx);
> > +	for (template_idx = SIG_UL_4G; template_idx <= SIG_UL_4G_LAST;
> > +			template_idx++) {
> > +		address = HWPfQmgrGrpTmplateReg4Indx
> > +				+ BYTES_IN_WORD*template_idx;
> > +		acc100_reg_write(d, address, payload);
> > +	}
> > +	/* 5GUL */
> > +	numQqsAcc += numQgs;
> > +	numQgs	= conf->q_ul_5g.num_qgroups;
> > +	payload = 0;
> > +	int numEngines = 0;
> > +	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
> > +		payload |= (1 << qg_idx);
> > +	for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
> > +			template_idx++) {
> > +		/* Check engine power-on status */
> > +		address = HwPfFecUl5gIbDebugReg +
> > +				ACC100_ENGINE_OFFSET * template_idx;
> > +		status = (acc100_reg_read(d, address) >> 4) & 0xF;
> > +		address = HWPfQmgrGrpTmplateReg4Indx
> > +				+ BYTES_IN_WORD * template_idx;
> > +		if (status == 1) {
> > +			acc100_reg_write(d, address, payload);
> > +			numEngines++;
> > +		} else
> > +			acc100_reg_write(d, address, 0);
> > +		#if RTE_ACC100_SINGLE_FEC == 1
> #if should be at start of line

ok

> > +		payload = 0;
> > +		#endif
> > +	}
> > +	printf("Number of 5GUL engines %d\n", numEngines);
> > +	/* 4GDL */
> > +	numQqsAcc += numQgs;
> > +	numQgs	= conf->q_dl_4g.num_qgroups;
> > +	payload = 0;
> > +	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
> > +		payload |= (1 << qg_idx);
> > +	for (template_idx = SIG_DL_4G; template_idx <= SIG_DL_4G_LAST;
> > +			template_idx++) {
> > +		address = HWPfQmgrGrpTmplateReg4Indx
> > +				+ BYTES_IN_WORD*template_idx;
> > +		acc100_reg_write(d, address, payload);
> > +		#if RTE_ACC100_SINGLE_FEC == 1
> > +			payload = 0;
> > +		#endif
> > +	}
> > +	/* 5GDL */
> > +	numQqsAcc += numQgs;
> > +	numQgs	= conf->q_dl_5g.num_qgroups;
> > +	payload = 0;
> > +	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
> > +		payload |= (1 << qg_idx);
> > +	for (template_idx = SIG_DL_5G; template_idx <= SIG_DL_5G_LAST;
> > +			template_idx++) {
> > +		address = HWPfQmgrGrpTmplateReg4Indx
> > +				+ BYTES_IN_WORD*template_idx;
> > +		acc100_reg_write(d, address, payload);
> > +		#if RTE_ACC100_SINGLE_FEC == 1
> > +		payload = 0;
> > +		#endif
> > +	}
> > +
> > +	/* Queue Group Function mapping */
> > +	int qman_func_id[5] = {0, 2, 1, 3, 4};
> > +	address = HWPfQmgrGrpFunction0;
> > +	payload = 0;
> > +	for (qg_idx = 0; qg_idx < 8; qg_idx++) {
> > +		acc = accFromQgid(qg_idx, conf);
> > +		payload |= qman_func_id[acc]<<(qg_idx * 4);
> > +	}
> > +	acc100_reg_write(d, address, payload);
> > +
> > +	/* Configuration of the Arbitration QGroup depth to 1 */
> > +	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
> > +		address = HWPfQmgrArbQDepthGrp +
> > +		BYTES_IN_WORD * qg_idx;
> > +		payload = 0;
> > +		acc100_reg_write(d, address, payload);
> > +	}
> > +
> > +	/* Enabling AQueues through the Queue hierarchy */
> > +	for (vf_idx = 0; vf_idx < ACC100_NUM_VFS; vf_idx++) {
> > +		for (qg_idx = 0; qg_idx < ACC100_NUM_QGRPS; qg_idx++) {
> > +			payload = 0;
> > +			if (vf_idx < conf->num_vf_bundles &&
> > +					qg_idx < totalQgs)
> > +				payload = (1 << aqNum(qg_idx, conf)) - 1;
> > +			address = HWPfQmgrAqEnableVf
> > +					+ vf_idx * BYTES_IN_WORD;
> > +			payload += (qg_idx << 16);
> > +			acc100_reg_write(d, address, payload);
> > +		}
> > +	}
> > +
> > +	/* This pointer to ARAM (256kB) is shifted by 2 (4B per register) */
> > +	uint32_t aram_address = 0;
> > +	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
> > +		for (vf_idx = 0; vf_idx < conf->num_vf_bundles; vf_idx++) {
> > +			address = HWPfQmgrVfBaseAddr + vf_idx
> > +					* BYTES_IN_WORD + qg_idx
> > +					* BYTES_IN_WORD * 64;
> > +			payload = aram_address;
> > +			acc100_reg_write(d, address, payload);
> > +			/* Offset ARAM Address for next memory bank
> > +			 * - increment of 4B
> > +			 */
> > +			aram_address += aqNum(qg_idx, conf) *
> > +					(1 << aqDepth(qg_idx, conf));
> > +		}
> > +	}
> > +
> > +	if (aram_address > WORDS_IN_ARAM_SIZE) {
> > +		rte_bbdev_log(ERR, "ARAM Configuration not fitting %d %d\n",
> > +				aram_address, WORDS_IN_ARAM_SIZE);
> > +		return -EINVAL;
> > +	}
> > +
> > +	/* ==== HI Configuration ==== */
> > +
> > +	/* Prevent Block on Transmit Error */
> > +	address = HWPfHiBlockTransmitOnErrorEn;
> > +	payload = 0;
> > +	acc100_reg_write(d, address, payload);
> > +	/* Prevent MSI from being dropped */
> > +	address = HWPfHiMsiDropEnableReg;
> > +	payload = 0;
> > +	acc100_reg_write(d, address, payload);
> > +	/* Set the PF Mode register */
> > +	address = HWPfHiPfMode;
> > +	payload = (conf->pf_mode_en) ? 2 : 0;
> > +	acc100_reg_write(d, address, payload);
> > +	/* Enable Error Detection in HW */
> > +	address = HWPfDmaErrorDetectionEn;
> > +	payload = 0x3D7;
> > +	acc100_reg_write(d, address, payload);
> > +
> > +	/* QoS overflow init */
> > +	payload = 1;
> > +	address = HWPfQosmonAEvalOverflow0;
> > +	acc100_reg_write(d, address, payload);
> > +	address = HWPfQosmonBEvalOverflow0;
> > +	acc100_reg_write(d, address, payload);
> > +
> > +	/* HARQ DDR Configuration */
> > +	unsigned int ddrSizeInMb = 512; /* Fixed to 512 MB per VF for now */
> > +	for (vf_idx = 0; vf_idx < conf->num_vf_bundles; vf_idx++) {
> > +		address = HWPfDmaVfDdrBaseRw + vf_idx
> > +				* 0x10;
> > +		payload = ((vf_idx * (ddrSizeInMb / 64)) << 16) +
> > +				(ddrSizeInMb - 1);
> > +		acc100_reg_write(d, address, payload);
> > +	}
> > +	usleep(LONG_WAIT);
> Is sleep needed here ? the reg_write has one.

This one is needed on top

> > +
> 
> Since this seems like a workaround, add a comment here.

fair enough, ok, thanks

> 
> Tom
> 
> > +	if (numEngines < (SIG_UL_5G_LAST + 1))
> > +		poweron_cleanup(bbdev, d, conf);
> > +
> > +	rte_bbdev_log_debug("PF Tip configuration complete for %s",
> > +			dev_name);
> > +	return 0;
> > +}
> > diff --git 
> > a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> > b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> > index 4a76d1d..91c234d 100644
> > --- a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> > +++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> > @@ -1,3 +1,10 @@
> >  DPDK_21 {
> >  	local: *;
> >  };
> > +
> > +EXPERIMENTAL {
> > +	global:
> > +
> > +	acc100_configure;
> > +
> > +};



* Re: [dpdk-dev] [PATCH v9 01/10] drivers/baseband: add PMD for ACC100
  2020-09-29 23:17         ` Chautru, Nicolas
@ 2020-09-30 23:06           ` Tom Rix
  2020-09-30 23:30             ` Chautru, Nicolas
  0 siblings, 1 reply; 213+ messages in thread
From: Tom Rix @ 2020-09-30 23:06 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, akhil.goyal
  Cc: Richardson, Bruce, Xu, Rosen, dave.burley, aidan.goddard, Yigit,
	Ferruh, Liu, Tianjiao


On 9/29/20 4:17 PM, Chautru, Nicolas wrote:
> Hi Tom, 
>
>> -----Original Message-----
>> From: Tom Rix <trix@redhat.com>
>> Sent: Tuesday, September 29, 2020 12:54 PM
>> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org;
>> akhil.goyal@nxp.com
>> Cc: Richardson, Bruce <bruce.richardson@intel.com>; Xu, Rosen
>> <rosen.xu@intel.com>; dave.burley@accelercomm.com;
>> aidan.goddard@accelercomm.com; Yigit, Ferruh <ferruh.yigit@intel.com>;
>> Liu, Tianjiao <tianjiao.liu@intel.com>
>> Subject: Re: [dpdk-dev] [PATCH v9 01/10] drivers/baseband: add PMD for
>> ACC100
>>
>>
>> On 9/28/20 5:29 PM, Nicolas Chautru wrote:
>>> Add stubs for the ACC100 PMD
>>>
>>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
>>> Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
>>> ---
>>>  doc/guides/bbdevs/acc100.rst                       | 233 +++++++++++++++++++++
>>>  doc/guides/bbdevs/features/acc100.ini              |  14 ++
>>>  doc/guides/bbdevs/index.rst                        |   1 +
>>>  drivers/baseband/acc100/meson.build                |   6 +
>>>  drivers/baseband/acc100/rte_acc100_pmd.c           | 175 ++++++++++++++++
>>>  drivers/baseband/acc100/rte_acc100_pmd.h           |  37 ++++
>>>  .../acc100/rte_pmd_bbdev_acc100_version.map        |   3 +
>>>  drivers/baseband/meson.build                       |   2 +-
>>>  8 files changed, 470 insertions(+), 1 deletion(-)
>>>  create mode 100644 doc/guides/bbdevs/acc100.rst
>>>  create mode 100644 doc/guides/bbdevs/features/acc100.ini
>>>  create mode 100644 drivers/baseband/acc100/meson.build
>>>  create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.c
>>>  create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.h
>>>  create mode 100644
>>> drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
>>>
>>> diff --git a/doc/guides/bbdevs/acc100.rst b/doc/guides/bbdevs/acc100.rst
>>> new file mode 100644
>>> index 0000000..f87ee09
>>> --- /dev/null
>>> +++ b/doc/guides/bbdevs/acc100.rst
>>> @@ -0,0 +1,233 @@
>>> +..  SPDX-License-Identifier: BSD-3-Clause
>>> +    Copyright(c) 2020 Intel Corporation
>>> +
>>> +Intel(R) ACC100 5G/4G FEC Poll Mode Driver
>>> +==========================================
>>> +
>>> +The BBDEV ACC100 5G/4G FEC poll mode driver (PMD) supports an
>>> +implementation of a VRAN FEC wireless acceleration function.
>>> +This device is also known as Mount Bryce.
>> If this is code name or general chip name it should be removed.
> We have used the general chip name for other PMDs (i.e. Vista Creek). I can
> remove it, but why should this be removed? This tends to be the most
> user-friendly name, so it is arguably good to mention in the documentation.

Vista Creek is the code name; the chip would be Arria 10.

Since Mount Bryce is the chip name, after more than one eASIC this becomes confusing.

Generally public product names should be used because only the early developers will know the development code names.

>
>
>>> +
>>> +Features
>>> +--------
>>> +
>>> +ACC100 5G/4G FEC PMD supports the following features:
>>> +
>>> +- LDPC Encode in the DL (5GNR)
>>> +- LDPC Decode in the UL (5GNR)
>>> +- Turbo Encode in the DL (4G)
>>> +- Turbo Decode in the UL (4G)
>>> +- 16 VFs per PF (physical device)
>>> +- Maximum of 128 queues per VF
>>> +- PCIe Gen-3 x16 Interface
>>> +- MSI
>>> +- SR-IOV
>>> +
>>> +ACC100 5G/4G FEC PMD supports the following BBDEV capabilities:
>>> +
>>> +* For the LDPC encode operation:
>>> +   - ``RTE_BBDEV_LDPC_CRC_24B_ATTACH`` :  set to attach CRC24B to CB(s)
>>> +   - ``RTE_BBDEV_LDPC_RATE_MATCH`` :  if set then do not do Rate Match bypass
>>> +   - ``RTE_BBDEV_LDPC_INTERLEAVER_BYPASS`` : if set then bypass interleaver
>>> +
>>> +* For the LDPC decode operation:
>>> +   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK`` :  check CRC24B from CB(s)
>>> +   - ``RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE`` :  disable early termination
>>> +   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP`` :  drops CRC24B bits appended while decoding
>>> +   - ``RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE`` :  provides an input for HARQ combining
>>> +   - ``RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE`` :  provides an output for HARQ combining
>>> +   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE`` :  HARQ memory input is internal
>>> +   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE`` :  HARQ memory output is internal
>>> +   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK`` :  loopback data to/from HARQ memory
>>> +   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_FILLERS`` :  HARQ memory includes the filler bits
>>> +   - ``RTE_BBDEV_LDPC_DEC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
>>> +   - ``RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION`` :  supports compression of the HARQ input/output
>>> +   - ``RTE_BBDEV_LDPC_LLR_COMPRESSION`` :  supports LLR input compression
>>> +
>>> +* For the turbo encode operation:
>>> +   - ``RTE_BBDEV_TURBO_CRC_24B_ATTACH`` :  set to attach CRC24B to CB(s)
>>> +   - ``RTE_BBDEV_TURBO_RATE_MATCH`` :  if set then do not do Rate Match bypass
>>> +   - ``RTE_BBDEV_TURBO_ENC_INTERRUPTS`` :  set for encoder dequeue interrupts
>>> +   - ``RTE_BBDEV_TURBO_RV_INDEX_BYPASS`` :  set to bypass RV index
>>> +   - ``RTE_BBDEV_TURBO_ENC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
>>> +
>>> +* For the turbo decode operation:
>>> +   - ``RTE_BBDEV_TURBO_CRC_TYPE_24B`` :  check CRC24B from CB(s)
>>> +   - ``RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE`` :  perform subblock de-interleave
>>> +   - ``RTE_BBDEV_TURBO_DEC_INTERRUPTS`` :  set for decoder dequeue interrupts
>>> +   - ``RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN`` :  set if negative LLR encoder i/p is supported
>>> +   - ``RTE_BBDEV_TURBO_POS_LLR_1_BIT_IN`` :  set if positive LLR encoder i/p is supported
>>> +   - ``RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP`` :  keep CRC24B bits appended while decoding
>>> +   - ``RTE_BBDEV_TURBO_EARLY_TERMINATION`` :  set early termination feature
>>> +   - ``RTE_BBDEV_TURBO_DEC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
>>> +   - ``RTE_BBDEV_TURBO_HALF_ITERATION_EVEN`` :  set half iteration granularity
>>> +
>>> +Installation
>>> +------------
>>> +
>>> +Section 3 of the DPDK manual provides instructions on installing and
>>> +compiling DPDK. The default set of bbdev compile flags may be found
>>> +in config/common_base, where for example the flag to build the ACC100
>>> +5G/4G FEC device, ``CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100``,
>>> +is already set.
>>> +
>>> +DPDK requires hugepages to be configured as detailed in section 2 of
>>> +the DPDK manual.
>>> +The bbdev test application has been tested with a configuration of
>>> +40 x 1GB hugepages. The hugepage configuration of a server may be
>>> +examined using:
>>> +
>>> +.. code-block:: console
>>> +
>>> +   grep Huge* /proc/meminfo
>>> +
>>> +
>>> +Initialization
>>> +--------------
>>> +
>>> +When the device first powers up, its PCI Physical Functions (PF) can
>>> +be listed through this command:
>>> +
>>> +.. code-block:: console
>>> +
>>> +  sudo lspci -vd8086:0d5c
>>> +
>>> +The physical and virtual functions are compatible with Linux UIO drivers:
>>> +``vfio`` and ``igb_uio``. However, in order to work the ACC100 5G/4G
>>> +FEC device firstly needs to be bound to one of these linux drivers through
>> DPDK.
>> FEC device first
> ok
>
>>> +
>>> +
>>> +Bind PF UIO driver(s)
>>> +~~~~~~~~~~~~~~~~~~~~~
>>> +
>>> +Install the DPDK igb_uio driver, bind it with the PF PCI device ID
>>> +and use ``lspci`` to confirm the PF device is under use by ``igb_uio``
>>> +DPDK UIO driver.
>>> +
>>> +The igb_uio driver may be bound to the PF PCI device using one of
>>> +three methods:
>>> +
>>> +
>>> +1. PCI functions (physical or virtual, depending on the use case) can
>>> +be bound to the UIO driver by repeating this command for every function.
>>> +
>>> +.. code-block:: console
>>> +
>>> +  cd <dpdk-top-level-directory>
>>> +  insmod ./build/kmod/igb_uio.ko
>>> +  echo "8086 0d5c" > /sys/bus/pci/drivers/igb_uio/new_id
>>> +  lspci -vd8086:0d5c
>>> +
>>> +
>>> +2. Another way to bind PF with DPDK UIO driver is by using the
>>> +``dpdk-devbind.py`` tool
>>> +
>>> +.. code-block:: console
>>> +
>>> +  cd <dpdk-top-level-directory>
>>> +  ./usertools/dpdk-devbind.py -b igb_uio 0000:06:00.0
>>> +
>>> +where the PCI device ID (example: 0000:06:00.0) is obtained using
>>> +lspci -vd8086:0d5c
>>> +
>>> +
>>> +3. A third way to bind is to use ``dpdk-setup.sh`` tool
>>> +
>>> +.. code-block:: console
>>> +
>>> +  cd <dpdk-top-level-directory>
>>> +  ./usertools/dpdk-setup.sh
>>> +
>>> +  select 'Bind Ethernet/Crypto/Baseband device to IGB UIO module'
>>> +  or
>>> +  select 'Bind Ethernet/Crypto/Baseband device to VFIO module'
>>> + depending on driver required
>> This is the igb_uio section, should defer vfio select to its section.
> Ok
>
>>> +  enter PCI device ID
>>> +  select 'Display current Ethernet/Crypto/Baseband device settings'
>>> + to confirm binding
>>> +
>>> +
>>> +In the same way the ACC100 5G/4G FEC PF can be bound with vfio, but
>>> +vfio driver does not support SR-IOV configuration right out of the
>>> +box, so it will need to be patched.
>> Other documentation says works with 5.7
> Yes, this is a bit historical now. I can remove this bit, which is not very informative and not specific to this PMD.
>
>>> +
>>> +
>>> +Enable Virtual Functions
>>> +~~~~~~~~~~~~~~~~~~~~~~~~
>>> +
>>> +Now, it should be visible in the printouts that PCI PF is under
>>> +igb_uio control "``Kernel driver in use: igb_uio``"
>>> +
>>> +To show the number of available VFs on the device, read the
>>> +``sriov_totalvfs`` file:
>>> +
>>> +.. code-block:: console
>>> +
>>> +  cat /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_totalvfs
>>> +
>>> +  where 0000\:<b>\:<d>.<f> is the PCI device ID
>>> +
>>> +
>>> +To enable VFs via igb_uio, echo the number of virtual functions
>>> +intended to be enabled into the ``max_vfs`` file:
>>> +
>>> +.. code-block:: console
>>> +
>>> +  echo <num-of-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/max_vfs
>>> +
>>> +
>>> +Afterwards, all VFs must be bound to appropriate UIO drivers as
>>> +required, same way it was done with the physical function previously.
>>> +
>>> +Enabling SR-IOV via vfio driver is pretty much the same, except that
>>> +the file name is different:
>>> +
>>> +.. code-block:: console
>>> +
>>> +  echo <num-of-vfs> >
>>> + /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_numvfs
>>> +
>>> +
>>> +Configure the VFs through PF
>>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>>> +
>>> +The PCI virtual functions must be configured before working or
>>> +getting assigned to VMs/Containers. The configuration involves
>>> +allocating the number of hardware queues, priorities, load balance,
>>> +bandwidth and other settings necessary for the device to perform FEC
>>> +functions.
>>> +
>>> +This configuration needs to be executed at least once after reboot or
>>> +PCI FLR and can be achieved by using the function
>>> +``acc100_configure()``, which sets up the parameters defined in the
>>> +``acc100_conf`` structure.
>>> +
>>> +Test Application
>>> +----------------
>>> +
>>> +BBDEV provides a test application, ``test-bbdev.py`` and range of
>>> +test data for testing the functionality of ACC100 5G/4G FEC encode
>>> +and decode, depending on the device's capabilities. The test
>>> +application is located under app->test-bbdev folder and has the
>>> +following options:
>>> +
>>> +.. code-block:: console
>>> +
>>> +  "-p", "--testapp-path": specifies path to the bbdev test app.
>>> +  "-e", "--eal-params"	: EAL arguments which are passed to the test app.
>>> +  "-t", "--timeout"	: Timeout in seconds (default=300).
>>> +  "-c", "--test-cases"	: Defines test cases to run. Run all if not specified.
>>> +  "-v", "--test-vector"	: Test vector path (default=dpdk_path+/app/test-bbdev/test_vectors/bbdev_null.data).
>>> +  "-n", "--num-ops"	: Number of operations to process on device (default=32).
>>> +  "-b", "--burst-size"	: Operations enqueue/dequeue burst size (default=32).
>>> +  "-s", "--snr"		: SNR in dB used when generating LLRs for bler tests.
>>> +  "-s", "--iter_max"	: Number of iterations for LDPC decoder.
>>> +  "-l", "--num-lcores"	: Number of lcores to run (default=16).
>>> +  "-i", "--init-device" : Initialise PF device with default values.
>>> +
>>> +
>>> +To execute the test application tool using simple decode or encode
>>> +data, type one of the following:
>>> +
>>> +.. code-block:: console
>>> +
>>> +  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_dec_default.data
>>> + ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_enc_default.data
>>> +
>>> +
>>> +The test application ``test-bbdev.py`` supports the ability to
>>> +configure the PF device with a default set of values, if the "-i" or
>>> +"--init-device" option is included. The default values are defined in
>>> +test_bbdev_perf.c.
>>> +
>>> +
>>> +Test Vectors
>>> +~~~~~~~~~~~~
>>> +
>>> +In addition to the simple LDPC decoder and LDPC encoder tests, bbdev
>>> +also provides a range of additional tests under the test_vectors
>>> +folder, which may be useful. The results of these tests will depend
>>> +on the ACC100 5G/4G FEC capabilities which may cause some testcases
>>> +to be skipped, but no failure should be reported.
>>
>> Just
>>
>> to be skipped.
>>
>> should be able to assume skipped test do not get reported as failures.
> Not necessarily that obvious from feedback. It doesn't hurt to be explicit,
> and this statement is common to all PMDs.
>
ok

>>> diff --git a/doc/guides/bbdevs/features/acc100.ini
>>> b/doc/guides/bbdevs/features/acc100.ini
>>> new file mode 100644
>>> index 0000000..c89a4d7
>>> --- /dev/null
>>> +++ b/doc/guides/bbdevs/features/acc100.ini
>>> @@ -0,0 +1,14 @@
>>> +;
>>> +; Supported features of the 'acc100' bbdev driver.
>>> +;
>>> +; Refer to default.ini for the full list of available PMD features.
>>> +;
>>> +[Features]
>>> +Turbo Decoder (4G)     = N
>>> +Turbo Encoder (4G)     = N
>>> +LDPC Decoder (5G)      = N
>>> +LDPC Encoder (5G)      = N
>>> +LLR/HARQ Compression   = N
>>> +External DDR Access    = N
>>> +HW Accelerated         = Y
>>> +BBDEV API              = Y
>>> diff --git a/doc/guides/bbdevs/index.rst b/doc/guides/bbdevs/index.rst
>>> index a8092dd..4445cbd 100644
>>> --- a/doc/guides/bbdevs/index.rst
>>> +++ b/doc/guides/bbdevs/index.rst
>>> @@ -13,3 +13,4 @@ Baseband Device Drivers
>>>      turbo_sw
>>>      fpga_lte_fec
>>>      fpga_5gnr_fec
>>> +    acc100
>>> diff --git a/drivers/baseband/acc100/meson.build
>>> b/drivers/baseband/acc100/meson.build
>>> new file mode 100644
>>> index 0000000..8afafc2
>>> --- /dev/null
>>> +++ b/drivers/baseband/acc100/meson.build
>>> @@ -0,0 +1,6 @@
>>> +# SPDX-License-Identifier: BSD-3-Clause
>>> +# Copyright(c) 2020 Intel Corporation
>>> +
>>> +deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci']
>>> +
>>> +sources = files('rte_acc100_pmd.c')
>>> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
>>> b/drivers/baseband/acc100/rte_acc100_pmd.c
>>> new file mode 100644
>>> index 0000000..1b4cd13
>>> --- /dev/null
>>> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
>>> @@ -0,0 +1,175 @@
>>> +/* SPDX-License-Identifier: BSD-3-Clause
>>> + * Copyright(c) 2020 Intel Corporation
>>> + */
>>> +
>>> +#include <unistd.h>
>>> +
>>> +#include <rte_common.h>
>>> +#include <rte_log.h>
>>> +#include <rte_dev.h>
>>> +#include <rte_malloc.h>
>>> +#include <rte_mempool.h>
>>> +#include <rte_byteorder.h>
>>> +#include <rte_errno.h>
>>> +#include <rte_branch_prediction.h>
>>> +#include <rte_hexdump.h>
>>> +#include <rte_pci.h>
>>> +#include <rte_bus_pci.h>
>>> +
>>> +#include <rte_bbdev.h>
>>> +#include <rte_bbdev_pmd.h>
>> Should these #includes' be in alpha order ?
> Interesting comment. Is this a coding guideline for DPDK or other projects?
> I have never heard of this personally; what is the rationale?

Not sure if this is DPDK style; I know some other projects do this.

This works for self-consistent headers; no idea if DPDK's are.

don't bother with this.

>>> +#include "rte_acc100_pmd.h"
>>> +
>>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
>>> +RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, DEBUG);
>>> +#else
>>> +RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
>>> +#endif
>>> +
>>> +/* Free 64MB memory used for software rings */
>>> +static int
>>> +acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
>>> +{
>>> +	return 0;
>>> +}
>>> +
>>> +static const struct rte_bbdev_ops acc100_bbdev_ops = {
>>> +	.close = acc100_dev_close,
>>> +};
>>> +
>>> +/* ACC100 PCI PF address map */
>>> +static struct rte_pci_id pci_id_acc100_pf_map[] = {
>>> +	{
>>> +		RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID,
>> RTE_ACC100_PF_DEVICE_ID)
>>> +	},
>>> +	{.device_id = 0},
>>> +};
>>> +
>>> +/* ACC100 PCI VF address map */
>>> +static struct rte_pci_id pci_id_acc100_vf_map[] = {
>>> +	{
>>> +		RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID,
>> RTE_ACC100_VF_DEVICE_ID)
>>> +	},
>>> +	{.device_id = 0},
>>> +};
>>> +
>>> +/* Initialization Function */
>>> +static void
>>> +acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
>>> +{
>>> +	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
>>> +
>>> +	dev->dev_ops = &acc100_bbdev_ops;
>>> +
>>> +	((struct acc100_device *) dev->data->dev_private)->pf_device =
>>> +			!strcmp(drv->driver.name,
>>> +
>> 	RTE_STR(ACC100PF_DRIVER_NAME));
>>> +	((struct acc100_device *) dev->data->dev_private)->mmio_base =
>>> +			pci_dev->mem_resource[0].addr;
>>> +
>>> +	rte_bbdev_log_debug("Init device %s [%s] @ vaddr %p paddr
>> %#"PRIx64"",
>>> +			drv->driver.name, dev->data->name,
>>> +			(void *)pci_dev->mem_resource[0].addr,
>>> +			pci_dev->mem_resource[0].phys_addr);
>>> +}
>>> +
>>> +static int acc100_pci_probe(struct rte_pci_driver *pci_drv,
>>> +	struct rte_pci_device *pci_dev)
>>> +{
>>> +	struct rte_bbdev *bbdev = NULL;
>>> +	char dev_name[RTE_BBDEV_NAME_MAX_LEN];
>>> +
>>> +	if (pci_dev == NULL) {
>>> +		rte_bbdev_log(ERR, "NULL PCI device");
>>> +		return -EINVAL;
>>> +	}
>>> +
>>> +	rte_pci_device_name(&pci_dev->addr, dev_name, sizeof(dev_name));
>>> +
>>> +	/* Allocate memory to be used privately by drivers */
>>> +	bbdev = rte_bbdev_allocate(pci_dev->device.name);
>>> +	if (bbdev == NULL)
>>> +		return -ENODEV;
>>> +
>>> +	/* allocate device private memory */
>>> +	bbdev->data->dev_private = rte_zmalloc_socket(dev_name,
>>> +			sizeof(struct acc100_device), RTE_CACHE_LINE_SIZE,
>>> +			pci_dev->device.numa_node);
>>> +
>>> +	if (bbdev->data->dev_private == NULL) {
>>> +		rte_bbdev_log(CRIT,
>>> +				"Allocate of %zu bytes for device \"%s\" failed",
>>> +				sizeof(struct acc100_device), dev_name);
>>> +		rte_bbdev_release(bbdev);
>>> +		return -ENOMEM;
>>> +	}
>>> +
>>> +	/* Fill HW specific part of device structure */
>>> +	bbdev->device = &pci_dev->device;
>>> +	bbdev->intr_handle = &pci_dev->intr_handle;
>>> +	bbdev->data->socket_id = pci_dev->device.numa_node;
>>> +
>>> +	/* Invoke ACC100 device initialization function */
>>> +	acc100_bbdev_init(bbdev, pci_drv);
>>> +
>>> +	rte_bbdev_log_debug("Initialised bbdev %s (id = %u)",
>>> +			dev_name, bbdev->data->dev_id);
>>> +	return 0;
>>> +}
>>> +
>>> +static int acc100_pci_remove(struct rte_pci_device *pci_dev)
>>> +{
>>> +	struct rte_bbdev *bbdev;
>>> +	int ret;
>>> +	uint8_t dev_id;
>>> +
>>> +	if (pci_dev == NULL)
>>> +		return -EINVAL;
>>> +
>>> +	/* Find device */
>>> +	bbdev = rte_bbdev_get_named_dev(pci_dev->device.name);
>>> +	if (bbdev == NULL) {
>>> +		rte_bbdev_log(CRIT,
>>> +				"Couldn't find HW dev \"%s\" to uninitialise it",
>>> +				pci_dev->device.name);
>>> +		return -ENODEV;
>>> +	}
>>> +	dev_id = bbdev->data->dev_id;
>>> +
>>> +	/* free device private memory before close */
>>> +	rte_free(bbdev->data->dev_private);
>>> +
>>> +	/* Close device */
>>> +	ret = rte_bbdev_close(dev_id);
>> Do you want to reorder this close before the rte_free so you could recover
>> from the failure ?
> Given this is done the same way for other PMDs, I would not change it as it would create a discrepancy.
> It could be done in principle as another patch covering multiple PMDs, but really I don't see a use case for trying to fall back after such a speculative error.
>
fair enough
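The ordering Tom raises (close the device before freeing its private memory, so the caller can recover on failure) can be sketched as follows. The struct and function names are illustrative, not the actual bbdev API:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical device handle and teardown flow, for illustration only. */
struct dev {
	void *priv;              /* driver-private memory */
};

static int dev_close(struct dev *d)
{
	(void)d;                 /* pretend the device quiesced cleanly */
	return 0;                /* a real close could fail and return <0 */
}

/* Close first, free second: if close fails, the private memory is
 * still intact and the caller can retry or inspect state. Freeing
 * first (as the patch does) forfeits that option. */
static int dev_teardown(struct dev *d)
{
	int ret = dev_close(d);
	if (ret < 0) {
		fprintf(stderr, "close failed: %d\n", ret);
		return ret;          /* keep d->priv for possible recovery */
	}
	free(d->priv);
	d->priv = NULL;
	return 0;
}
```

As noted in the thread, the existing PMDs free before closing, so keeping the patch consistent with them is a reasonable call; reordering across all PMDs would be a separate cleanup.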

Tom

>> Tom
>>
> Thanks
> Nic
>
>
>>> +	if (ret < 0)
>>> +		rte_bbdev_log(ERR,
>>> +				"Device %i failed to close during uninit: %i",
>>> +				dev_id, ret);
>>> +
>>> +	/* release bbdev from library */
>>> +	rte_bbdev_release(bbdev);
>>> +
>>> +	rte_bbdev_log_debug("Destroyed bbdev = %u", dev_id);
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static struct rte_pci_driver acc100_pci_pf_driver = {
>>> +		.probe = acc100_pci_probe,
>>> +		.remove = acc100_pci_remove,
>>> +		.id_table = pci_id_acc100_pf_map,
>>> +		.drv_flags = RTE_PCI_DRV_NEED_MAPPING
>>> +};
>>> +
>>> +static struct rte_pci_driver acc100_pci_vf_driver = {
>>> +		.probe = acc100_pci_probe,
>>> +		.remove = acc100_pci_remove,
>>> +		.id_table = pci_id_acc100_vf_map,
>>> +		.drv_flags = RTE_PCI_DRV_NEED_MAPPING
>>> +};
>>> +
>>> +RTE_PMD_REGISTER_PCI(ACC100PF_DRIVER_NAME, acc100_pci_pf_driver);
>>> +RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
>>> +RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
>>> +RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
>>> +
>>> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h
>>> b/drivers/baseband/acc100/rte_acc100_pmd.h
>>> new file mode 100644
>>> index 0000000..6f46df0
>>> --- /dev/null
>>> +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
>>> @@ -0,0 +1,37 @@
>>> +/* SPDX-License-Identifier: BSD-3-Clause
>>> + * Copyright(c) 2020 Intel Corporation
>>> + */
>>> +
>>> +#ifndef _RTE_ACC100_PMD_H_
>>> +#define _RTE_ACC100_PMD_H_
>>> +
>>> +/* Helper macro for logging */
>>> +#define rte_bbdev_log(level, fmt, ...) \
>>> +	rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \
>>> +		##__VA_ARGS__)
>>> +
>>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
>>> +#define rte_bbdev_log_debug(fmt, ...) \
>>> +		rte_bbdev_log(DEBUG, "acc100_pmd: " fmt, \
>>> +		##__VA_ARGS__)
>>> +#else
>>> +#define rte_bbdev_log_debug(fmt, ...)
>>> +#endif
>>> +
>>> +/* ACC100 PF and VF driver names */
>>> +#define ACC100PF_DRIVER_NAME           intel_acc100_pf
>>> +#define ACC100VF_DRIVER_NAME           intel_acc100_vf
>>> +
>>> +/* ACC100 PCI vendor & device IDs */
>>> +#define RTE_ACC100_VENDOR_ID           (0x8086)
>>> +#define RTE_ACC100_PF_DEVICE_ID        (0x0d5c)
>>> +#define RTE_ACC100_VF_DEVICE_ID        (0x0d5d)
>>> +
>>> +/* Private data structure for each ACC100 device */
>>> +struct acc100_device {
>>> +	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
>>> +	bool pf_device; /**< True if this is a PF ACC100 device */
>>> +	bool configured; /**< True if this ACC100 device is configured */
>>> +};
>>> +
>>> +#endif /* _RTE_ACC100_PMD_H_ */
>>> diff --git a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
>>> b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
>>> new file mode 100644
>>> index 0000000..4a76d1d
>>> --- /dev/null
>>> +++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
>>> @@ -0,0 +1,3 @@
>>> +DPDK_21 {
>>> +	local: *;
>>> +};
>>> diff --git a/drivers/baseband/meson.build
>>> b/drivers/baseband/meson.build index 415b672..72301ce 100644
>>> --- a/drivers/baseband/meson.build
>>> +++ b/drivers/baseband/meson.build
>>> @@ -5,7 +5,7 @@ if is_windows
>>>  	subdir_done()
>>>  endif
>>>
>>> -drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec']
>>> +drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec',
>>> +'acc100']
>>>
>>>  config_flag_fmt = 'RTE_LIBRTE_PMD_BBDEV_@0@'
>>>  driver_name_fmt = 'rte_pmd_bbdev_@0@'


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v9 02/10] baseband/acc100: add register definition file
  2020-09-29 23:30         ` Chautru, Nicolas
@ 2020-09-30 23:11           ` Tom Rix
  0 siblings, 0 replies; 213+ messages in thread
From: Tom Rix @ 2020-09-30 23:11 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, akhil.goyal
  Cc: Richardson, Bruce, Xu, Rosen, dave.burley, aidan.goddard, Yigit,
	Ferruh, Liu, Tianjiao


On 9/29/20 4:30 PM, Chautru, Nicolas wrote:
> Hi Tom, 
>
>> From: Tom Rix <trix@redhat.com>
>> On 9/28/20 5:29 PM, Nicolas Chautru wrote:
>>> Add in the list of registers for the device and related
>>> HW specs definitions.
>>>
>>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
>>> Reviewed-by: Rosen Xu <rosen.xu@intel.com>
>>> Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
>>> ---
>>>  drivers/baseband/acc100/acc100_pf_enum.h | 1068 ++++++++++++++++++++++++++++++
>>>  drivers/baseband/acc100/acc100_vf_enum.h |   73 ++
>>>  drivers/baseband/acc100/rte_acc100_pmd.h |  490 ++++++++++++++
>>>  3 files changed, 1631 insertions(+)
>>>  create mode 100644 drivers/baseband/acc100/acc100_pf_enum.h
>>>  create mode 100644 drivers/baseband/acc100/acc100_vf_enum.h
>>>
>>> diff --git a/drivers/baseband/acc100/acc100_pf_enum.h b/drivers/baseband/acc100/acc100_pf_enum.h
>>> new file mode 100644
>>> index 0000000..a1ee416
>>> --- /dev/null
>>> +++ b/drivers/baseband/acc100/acc100_pf_enum.h
>>> @@ -0,0 +1,1068 @@
>>> +/* SPDX-License-Identifier: BSD-3-Clause
>>> + * Copyright(c) 2017 Intel Corporation
>>> + */
>>> +
>>> +#ifndef ACC100_PF_ENUM_H
>>> +#define ACC100_PF_ENUM_H
>>> +
>>> +/*
>>> + * ACC100 Register mapping on PF BAR0
>>> + * This is automatically generated from RDL; format may change with new RDL
>>> + * release.
>>> + * Variable names are as is
>>> + */
>>> +enum {
>>> +	HWPfQmgrEgressQueuesTemplate          =  0x0007FE00,
>>> +	HWPfQmgrIngressAq                     =  0x00080000,
>>> +	HWPfQmgrArbQAvail                     =  0x00A00010,
>>> +	HWPfQmgrArbQBlock                     =  0x00A00014,
>>> +	HWPfQmgrAqueueDropNotifEn             =  0x00A00024,
>>> +	HWPfQmgrAqueueDisableNotifEn          =  0x00A00028,
>>> +	HWPfQmgrSoftReset                     =  0x00A00038,
>>> +	HWPfQmgrInitStatus                    =  0x00A0003C,
>>> +	HWPfQmgrAramWatchdogCount             =  0x00A00040,
>>> +	HWPfQmgrAramWatchdogCounterEn         =  0x00A00044,
>>> +	HWPfQmgrAxiWatchdogCount              =  0x00A00048,
>>> +	HWPfQmgrAxiWatchdogCounterEn          =  0x00A0004C,
>>> +	HWPfQmgrProcessWatchdogCount          =  0x00A00050,
>>> +	HWPfQmgrProcessWatchdogCounterEn      =  0x00A00054,
>>> +	HWPfQmgrProcessUl4GWatchdogCounter    =  0x00A00058,
>>> +	HWPfQmgrProcessDl4GWatchdogCounter    =  0x00A0005C,
>>> +	HWPfQmgrProcessUl5GWatchdogCounter    =  0x00A00060,
>>> +	HWPfQmgrProcessDl5GWatchdogCounter    =  0x00A00064,
>>> +	HWPfQmgrProcessMldWatchdogCounter     =  0x00A00068,
>>> +	HWPfQmgrMsiOverflowUpperVf            =  0x00A00070,
>>> +	HWPfQmgrMsiOverflowLowerVf            =  0x00A00074,
>>> +	HWPfQmgrMsiWatchdogOverflow           =  0x00A00078,
>>> +	HWPfQmgrMsiOverflowEnable             =  0x00A0007C,
>>> +	HWPfQmgrDebugAqPointerMemGrp          =  0x00A00100,
>>> +	HWPfQmgrDebugOutputArbQFifoGrp        =  0x00A00140,
>>> +	HWPfQmgrDebugMsiFifoGrp               =  0x00A00180,
>>> +	HWPfQmgrDebugAxiWdTimeoutMsiFifo      =  0x00A001C0,
>>> +	HWPfQmgrDebugProcessWdTimeoutMsiFifo  =  0x00A001C4,
>>> +	HWPfQmgrDepthLog2Grp                  =  0x00A00200,
>>> +	HWPfQmgrTholdGrp                      =  0x00A00300,
>>> +	HWPfQmgrGrpTmplateReg0Indx            =  0x00A00600,
>>> +	HWPfQmgrGrpTmplateReg1Indx            =  0x00A00680,
>>> +	HWPfQmgrGrpTmplateReg2indx            =  0x00A00700,
>>> +	HWPfQmgrGrpTmplateReg3Indx            =  0x00A00780,
>>> +	HWPfQmgrGrpTmplateReg4Indx            =  0x00A00800,
>>> +	HWPfQmgrVfBaseAddr                    =  0x00A01000,
>>> +	HWPfQmgrUl4GWeightRrVf                =  0x00A02000,
>>> +	HWPfQmgrDl4GWeightRrVf                =  0x00A02100,
>>> +	HWPfQmgrUl5GWeightRrVf                =  0x00A02200,
>>> +	HWPfQmgrDl5GWeightRrVf                =  0x00A02300,
>>> +	HWPfQmgrMldWeightRrVf                 =  0x00A02400,
>>> +	HWPfQmgrArbQDepthGrp                  =  0x00A02F00,
>>> +	HWPfQmgrGrpFunction0                  =  0x00A02F40,
>>> +	HWPfQmgrGrpFunction1                  =  0x00A02F44,
>>> +	HWPfQmgrGrpPriority                   =  0x00A02F48,
>>> +	HWPfQmgrWeightSync                    =  0x00A03000,
>>> +	HWPfQmgrAqEnableVf                    =  0x00A10000,
>>> +	HWPfQmgrAqResetVf                     =  0x00A20000,
>>> +	HWPfQmgrRingSizeVf                    =  0x00A20004,
>>> +	HWPfQmgrGrpDepthLog20Vf               =  0x00A20008,
>>> +	HWPfQmgrGrpDepthLog21Vf               =  0x00A2000C,
>>> +	HWPfQmgrGrpFunction0Vf                =  0x00A20010,
>>> +	HWPfQmgrGrpFunction1Vf                =  0x00A20014,
>>> +	HWPfDmaConfig0Reg                     =  0x00B80000,
>>> +	HWPfDmaConfig1Reg                     =  0x00B80004,
>>> +	HWPfDmaQmgrAddrReg                    =  0x00B80008,
>>> +	HWPfDmaSoftResetReg                   =  0x00B8000C,
>>> +	HWPfDmaAxcacheReg                     =  0x00B80010,
>>> +	HWPfDmaVersionReg                     =  0x00B80014,
>>> +	HWPfDmaFrameThreshold                 =  0x00B80018,
>>> +	HWPfDmaTimestampLo                    =  0x00B8001C,
>>> +	HWPfDmaTimestampHi                    =  0x00B80020,
>>> +	HWPfDmaAxiStatus                      =  0x00B80028,
>>> +	HWPfDmaAxiControl                     =  0x00B8002C,
>>> +	HWPfDmaNoQmgr                         =  0x00B80030,
>>> +	HWPfDmaQosScale                       =  0x00B80034,
>>> +	HWPfDmaQmanen                         =  0x00B80040,
>>> +	HWPfDmaQmgrQosBase                    =  0x00B80060,
>>> +	HWPfDmaFecClkGatingEnable             =  0x00B80080,
>>> +	HWPfDmaPmEnable                       =  0x00B80084,
>>> +	HWPfDmaQosEnable                      =  0x00B80088,
>>> +	HWPfDmaHarqWeightedRrFrameThreshold   =  0x00B800B0,
>>> +	HWPfDmaDataSmallWeightedRrFrameThresh  = 0x00B800B4,
>>> +	HWPfDmaDataLargeWeightedRrFrameThresh  = 0x00B800B8,
>>> +	HWPfDmaInboundCbMaxSize               =  0x00B800BC,
>>> +	HWPfDmaInboundDrainDataSize           =  0x00B800C0,
>>> +	HWPfDmaVfDdrBaseRw                    =  0x00B80400,
>>> +	HWPfDmaCmplTmOutCnt                   =  0x00B80800,
>>> +	HWPfDmaProcTmOutCnt                   =  0x00B80804,
>>> +	HWPfDmaStatusRrespBresp               =  0x00B80810,
>>> +	HWPfDmaCfgRrespBresp                  =  0x00B80814,
>>> +	HWPfDmaStatusMemParErr                =  0x00B80818,
>>> +	HWPfDmaCfgMemParErrEn                 =  0x00B8081C,
>>> +	HWPfDmaStatusDmaHwErr                 =  0x00B80820,
>>> +	HWPfDmaCfgDmaHwErrEn                  =  0x00B80824,
>>> +	HWPfDmaStatusFecCoreErr               =  0x00B80828,
>>> +	HWPfDmaCfgFecCoreErrEn                =  0x00B8082C,
>>> +	HWPfDmaStatusFcwDescrErr              =  0x00B80830,
>>> +	HWPfDmaCfgFcwDescrErrEn               =  0x00B80834,
>>> +	HWPfDmaStatusBlockTransmit            =  0x00B80838,
>>> +	HWPfDmaBlockOnErrEn                   =  0x00B8083C,
>>> +	HWPfDmaStatusFlushDma                 =  0x00B80840,
>>> +	HWPfDmaFlushDmaOnErrEn                =  0x00B80844,
>>> +	HWPfDmaStatusSdoneFifoFull            =  0x00B80848,
>>> +	HWPfDmaStatusDescriptorErrLoVf        =  0x00B8084C,
>>> +	HWPfDmaStatusDescriptorErrHiVf        =  0x00B80850,
>>> +	HWPfDmaStatusFcwErrLoVf               =  0x00B80854,
>>> +	HWPfDmaStatusFcwErrHiVf               =  0x00B80858,
>>> +	HWPfDmaStatusDataErrLoVf              =  0x00B8085C,
>>> +	HWPfDmaStatusDataErrHiVf              =  0x00B80860,
>>> +	HWPfDmaCfgMsiEnSoftwareErr            =  0x00B80864,
>>> +	HWPfDmaDescriptorSignatuture          =  0x00B80868,
>>> +	HWPfDmaFcwSignature                   =  0x00B8086C,
>>> +	HWPfDmaErrorDetectionEn               =  0x00B80870,
>>> +	HWPfDmaErrCntrlFifoDebug              =  0x00B8087C,
>>> +	HWPfDmaStatusToutData                 =  0x00B80880,
>>> +	HWPfDmaStatusToutDesc                 =  0x00B80884,
>>> +	HWPfDmaStatusToutUnexpData            =  0x00B80888,
>>> +	HWPfDmaStatusToutUnexpDesc            =  0x00B8088C,
>>> +	HWPfDmaStatusToutProcess              =  0x00B80890,
>>> +	HWPfDmaConfigCtoutOutDataEn           =  0x00B808A0,
>>> +	HWPfDmaConfigCtoutOutDescrEn          =  0x00B808A4,
>>> +	HWPfDmaConfigUnexpComplDataEn         =  0x00B808A8,
>>> +	HWPfDmaConfigUnexpComplDescrEn        =  0x00B808AC,
>>> +	HWPfDmaConfigPtoutOutEn               =  0x00B808B0,
>>> +	HWPfDmaFec5GulDescBaseLoRegVf         =  0x00B88020,
>>> +	HWPfDmaFec5GulDescBaseHiRegVf         =  0x00B88024,
>>> +	HWPfDmaFec5GulRespPtrLoRegVf          =  0x00B88028,
>>> +	HWPfDmaFec5GulRespPtrHiRegVf          =  0x00B8802C,
>>> +	HWPfDmaFec5GdlDescBaseLoRegVf         =  0x00B88040,
>>> +	HWPfDmaFec5GdlDescBaseHiRegVf         =  0x00B88044,
>>> +	HWPfDmaFec5GdlRespPtrLoRegVf          =  0x00B88048,
>>> +	HWPfDmaFec5GdlRespPtrHiRegVf          =  0x00B8804C,
>>> +	HWPfDmaFec4GulDescBaseLoRegVf         =  0x00B88060,
>>> +	HWPfDmaFec4GulDescBaseHiRegVf         =  0x00B88064,
>>> +	HWPfDmaFec4GulRespPtrLoRegVf          =  0x00B88068,
>>> +	HWPfDmaFec4GulRespPtrHiRegVf          =  0x00B8806C,
>>> +	HWPfDmaFec4GdlDescBaseLoRegVf         =  0x00B88080,
>>> +	HWPfDmaFec4GdlDescBaseHiRegVf         =  0x00B88084,
>>> +	HWPfDmaFec4GdlRespPtrLoRegVf          =  0x00B88088,
>>> +	HWPfDmaFec4GdlRespPtrHiRegVf          =  0x00B8808C,
>>> +	HWPfDmaVfDdrBaseRangeRo               =  0x00B880A0,
>>> +	HWPfQosmonACntrlReg                   =  0x00B90000,
>>> +	HWPfQosmonAEvalOverflow0              =  0x00B90008,
>>> +	HWPfQosmonAEvalOverflow1              =  0x00B9000C,
>>> +	HWPfQosmonADivTerm                    =  0x00B90010,
>>> +	HWPfQosmonATickTerm                   =  0x00B90014,
>>> +	HWPfQosmonAEvalTerm                   =  0x00B90018,
>>> +	HWPfQosmonAAveTerm                    =  0x00B9001C,
>>> +	HWPfQosmonAForceEccErr                =  0x00B90020,
>>> +	HWPfQosmonAEccErrDetect               =  0x00B90024,
>>> +	HWPfQosmonAIterationConfig0Low        =  0x00B90060,
>>> +	HWPfQosmonAIterationConfig0High       =  0x00B90064,
>>> +	HWPfQosmonAIterationConfig1Low        =  0x00B90068,
>>> +	HWPfQosmonAIterationConfig1High       =  0x00B9006C,
>>> +	HWPfQosmonAIterationConfig2Low        =  0x00B90070,
>>> +	HWPfQosmonAIterationConfig2High       =  0x00B90074,
>>> +	HWPfQosmonAIterationConfig3Low        =  0x00B90078,
>>> +	HWPfQosmonAIterationConfig3High       =  0x00B9007C,
>>> +	HWPfQosmonAEvalMemAddr                =  0x00B90080,
>>> +	HWPfQosmonAEvalMemData                =  0x00B90084,
>>> +	HWPfQosmonAXaction                    =  0x00B900C0,
>>> +	HWPfQosmonARemThres1Vf                =  0x00B90400,
>>> +	HWPfQosmonAThres2Vf                   =  0x00B90404,
>>> +	HWPfQosmonAWeiFracVf                  =  0x00B90408,
>>> +	HWPfQosmonARrWeiVf                    =  0x00B9040C,
>>> +	HWPfPermonACntrlRegVf                 =  0x00B98000,
>>> +	HWPfPermonACountVf                    =  0x00B98008,
>>> +	HWPfPermonAKCntLoVf                   =  0x00B98010,
>>> +	HWPfPermonAKCntHiVf                   =  0x00B98014,
>>> +	HWPfPermonADeltaCntLoVf               =  0x00B98020,
>>> +	HWPfPermonADeltaCntHiVf               =  0x00B98024,
>>> +	HWPfPermonAVersionReg                 =  0x00B9C000,
>>> +	HWPfPermonACbControlFec               =  0x00B9C0F0,
>>> +	HWPfPermonADltTimerLoFec              =  0x00B9C0F4,
>>> +	HWPfPermonADltTimerHiFec              =  0x00B9C0F8,
>>> +	HWPfPermonACbCountFec                 =  0x00B9C100,
>>> +	HWPfPermonAAccExecTimerLoFec          =  0x00B9C104,
>>> +	HWPfPermonAAccExecTimerHiFec          =  0x00B9C108,
>>> +	HWPfPermonAExecTimerMinFec            =  0x00B9C200,
>>> +	HWPfPermonAExecTimerMaxFec            =  0x00B9C204,
>>> +	HWPfPermonAControlBusMon              =  0x00B9C400,
>>> +	HWPfPermonAConfigBusMon               =  0x00B9C404,
>>> +	HWPfPermonASkipCountBusMon            =  0x00B9C408,
>>> +	HWPfPermonAMinLatBusMon               =  0x00B9C40C,
>>> +	HWPfPermonAMaxLatBusMon               =  0x00B9C500,
>>> +	HWPfPermonATotalLatLowBusMon          =  0x00B9C504,
>>> +	HWPfPermonATotalLatUpperBusMon        =  0x00B9C508,
>>> +	HWPfPermonATotalReqCntBusMon          =  0x00B9C50C,
>>> +	HWPfQosmonBCntrlReg                   =  0x00BA0000,
>>> +	HWPfQosmonBEvalOverflow0              =  0x00BA0008,
>>> +	HWPfQosmonBEvalOverflow1              =  0x00BA000C,
>>> +	HWPfQosmonBDivTerm                    =  0x00BA0010,
>>> +	HWPfQosmonBTickTerm                   =  0x00BA0014,
>>> +	HWPfQosmonBEvalTerm                   =  0x00BA0018,
>>> +	HWPfQosmonBAveTerm                    =  0x00BA001C,
>>> +	HWPfQosmonBForceEccErr                =  0x00BA0020,
>>> +	HWPfQosmonBEccErrDetect               =  0x00BA0024,
>>> +	HWPfQosmonBIterationConfig0Low        =  0x00BA0060,
>>> +	HWPfQosmonBIterationConfig0High       =  0x00BA0064,
>>> +	HWPfQosmonBIterationConfig1Low        =  0x00BA0068,
>>> +	HWPfQosmonBIterationConfig1High       =  0x00BA006C,
>>> +	HWPfQosmonBIterationConfig2Low        =  0x00BA0070,
>>> +	HWPfQosmonBIterationConfig2High       =  0x00BA0074,
>>> +	HWPfQosmonBIterationConfig3Low        =  0x00BA0078,
>>> +	HWPfQosmonBIterationConfig3High       =  0x00BA007C,
>>> +	HWPfQosmonBEvalMemAddr                =  0x00BA0080,
>>> +	HWPfQosmonBEvalMemData                =  0x00BA0084,
>>> +	HWPfQosmonBXaction                    =  0x00BA00C0,
>>> +	HWPfQosmonBRemThres1Vf                =  0x00BA0400,
>>> +	HWPfQosmonBThres2Vf                   =  0x00BA0404,
>>> +	HWPfQosmonBWeiFracVf                  =  0x00BA0408,
>>> +	HWPfQosmonBRrWeiVf                    =  0x00BA040C,
>>> +	HWPfPermonBCntrlRegVf                 =  0x00BA8000,
>>> +	HWPfPermonBCountVf                    =  0x00BA8008,
>>> +	HWPfPermonBKCntLoVf                   =  0x00BA8010,
>>> +	HWPfPermonBKCntHiVf                   =  0x00BA8014,
>>> +	HWPfPermonBDeltaCntLoVf               =  0x00BA8020,
>>> +	HWPfPermonBDeltaCntHiVf               =  0x00BA8024,
>>> +	HWPfPermonBVersionReg                 =  0x00BAC000,
>>> +	HWPfPermonBCbControlFec               =  0x00BAC0F0,
>>> +	HWPfPermonBDltTimerLoFec              =  0x00BAC0F4,
>>> +	HWPfPermonBDltTimerHiFec              =  0x00BAC0F8,
>>> +	HWPfPermonBCbCountFec                 =  0x00BAC100,
>>> +	HWPfPermonBAccExecTimerLoFec          =  0x00BAC104,
>>> +	HWPfPermonBAccExecTimerHiFec          =  0x00BAC108,
>>> +	HWPfPermonBExecTimerMinFec            =  0x00BAC200,
>>> +	HWPfPermonBExecTimerMaxFec            =  0x00BAC204,
>>> +	HWPfPermonBControlBusMon              =  0x00BAC400,
>>> +	HWPfPermonBConfigBusMon               =  0x00BAC404,
>>> +	HWPfPermonBSkipCountBusMon            =  0x00BAC408,
>>> +	HWPfPermonBMinLatBusMon               =  0x00BAC40C,
>>> +	HWPfPermonBMaxLatBusMon               =  0x00BAC500,
>>> +	HWPfPermonBTotalLatLowBusMon          =  0x00BAC504,
>>> +	HWPfPermonBTotalLatUpperBusMon        =  0x00BAC508,
>>> +	HWPfPermonBTotalReqCntBusMon          =  0x00BAC50C,
>>> +	HWPfFecUl5gCntrlReg                   =  0x00BC0000,
>>> +	HWPfFecUl5gI2MThreshReg               =  0x00BC0004,
>>> +	HWPfFecUl5gVersionReg                 =  0x00BC0100,
>>> +	HWPfFecUl5gFcwStatusReg               =  0x00BC0104,
>>> +	HWPfFecUl5gWarnReg                    =  0x00BC0108,
>>> +	HwPfFecUl5gIbDebugReg                 =  0x00BC0200,
>>> +	HwPfFecUl5gObLlrDebugReg              =  0x00BC0204,
>>> +	HwPfFecUl5gObHarqDebugReg             =  0x00BC0208,
>>> +	HwPfFecUl5g1CntrlReg                  =  0x00BC1000,
>>> +	HwPfFecUl5g1I2MThreshReg              =  0x00BC1004,
>>> +	HwPfFecUl5g1VersionReg                =  0x00BC1100,
>>> +	HwPfFecUl5g1FcwStatusReg              =  0x00BC1104,
>>> +	HwPfFecUl5g1WarnReg                   =  0x00BC1108,
>>> +	HwPfFecUl5g1IbDebugReg                =  0x00BC1200,
>>> +	HwPfFecUl5g1ObLlrDebugReg             =  0x00BC1204,
>>> +	HwPfFecUl5g1ObHarqDebugReg            =  0x00BC1208,
>>> +	HwPfFecUl5g2CntrlReg                  =  0x00BC2000,
>>> +	HwPfFecUl5g2I2MThreshReg              =  0x00BC2004,
>>> +	HwPfFecUl5g2VersionReg                =  0x00BC2100,
>>> +	HwPfFecUl5g2FcwStatusReg              =  0x00BC2104,
>>> +	HwPfFecUl5g2WarnReg                   =  0x00BC2108,
>>> +	HwPfFecUl5g2IbDebugReg                =  0x00BC2200,
>>> +	HwPfFecUl5g2ObLlrDebugReg             =  0x00BC2204,
>>> +	HwPfFecUl5g2ObHarqDebugReg            =  0x00BC2208,
>>> +	HwPfFecUl5g3CntrlReg                  =  0x00BC3000,
>>> +	HwPfFecUl5g3I2MThreshReg              =  0x00BC3004,
>>> +	HwPfFecUl5g3VersionReg                =  0x00BC3100,
>>> +	HwPfFecUl5g3FcwStatusReg              =  0x00BC3104,
>>> +	HwPfFecUl5g3WarnReg                   =  0x00BC3108,
>>> +	HwPfFecUl5g3IbDebugReg                =  0x00BC3200,
>>> +	HwPfFecUl5g3ObLlrDebugReg             =  0x00BC3204,
>>> +	HwPfFecUl5g3ObHarqDebugReg            =  0x00BC3208,
>>> +	HwPfFecUl5g4CntrlReg                  =  0x00BC4000,
>>> +	HwPfFecUl5g4I2MThreshReg              =  0x00BC4004,
>>> +	HwPfFecUl5g4VersionReg                =  0x00BC4100,
>>> +	HwPfFecUl5g4FcwStatusReg              =  0x00BC4104,
>>> +	HwPfFecUl5g4WarnReg                   =  0x00BC4108,
>>> +	HwPfFecUl5g4IbDebugReg                =  0x00BC4200,
>>> +	HwPfFecUl5g4ObLlrDebugReg             =  0x00BC4204,
>>> +	HwPfFecUl5g4ObHarqDebugReg            =  0x00BC4208,
>>> +	HwPfFecUl5g5CntrlReg                  =  0x00BC5000,
>>> +	HwPfFecUl5g5I2MThreshReg              =  0x00BC5004,
>>> +	HwPfFecUl5g5VersionReg                =  0x00BC5100,
>>> +	HwPfFecUl5g5FcwStatusReg              =  0x00BC5104,
>>> +	HwPfFecUl5g5WarnReg                   =  0x00BC5108,
>>> +	HwPfFecUl5g5IbDebugReg                =  0x00BC5200,
>>> +	HwPfFecUl5g5ObLlrDebugReg             =  0x00BC5204,
>>> +	HwPfFecUl5g5ObHarqDebugReg            =  0x00BC5208,
>>> +	HwPfFecUl5g6CntrlReg                  =  0x00BC6000,
>>> +	HwPfFecUl5g6I2MThreshReg              =  0x00BC6004,
>>> +	HwPfFecUl5g6VersionReg                =  0x00BC6100,
>>> +	HwPfFecUl5g6FcwStatusReg              =  0x00BC6104,
>>> +	HwPfFecUl5g6WarnReg                   =  0x00BC6108,
>>> +	HwPfFecUl5g6IbDebugReg                =  0x00BC6200,
>>> +	HwPfFecUl5g6ObLlrDebugReg             =  0x00BC6204,
>>> +	HwPfFecUl5g6ObHarqDebugReg            =  0x00BC6208,
>>> +	HwPfFecUl5g7CntrlReg                  =  0x00BC7000,
>>> +	HwPfFecUl5g7I2MThreshReg              =  0x00BC7004,
>>> +	HwPfFecUl5g7VersionReg                =  0x00BC7100,
>>> +	HwPfFecUl5g7FcwStatusReg              =  0x00BC7104,
>>> +	HwPfFecUl5g7WarnReg                   =  0x00BC7108,
>>> +	HwPfFecUl5g7IbDebugReg                =  0x00BC7200,
>>> +	HwPfFecUl5g7ObLlrDebugReg             =  0x00BC7204,
>>> +	HwPfFecUl5g7ObHarqDebugReg            =  0x00BC7208,
>>> +	HwPfFecUl5g8CntrlReg                  =  0x00BC8000,
>>> +	HwPfFecUl5g8I2MThreshReg              =  0x00BC8004,
>>> +	HwPfFecUl5g8VersionReg                =  0x00BC8100,
>>> +	HwPfFecUl5g8FcwStatusReg              =  0x00BC8104,
>>> +	HwPfFecUl5g8WarnReg                   =  0x00BC8108,
>>> +	HwPfFecUl5g8IbDebugReg                =  0x00BC8200,
>>> +	HwPfFecUl5g8ObLlrDebugReg             =  0x00BC8204,
>>> +	HwPfFecUl5g8ObHarqDebugReg            =  0x00BC8208,
>>> +	HWPfFecDl5gCntrlReg                   =  0x00BCF000,
>>> +	HWPfFecDl5gI2MThreshReg               =  0x00BCF004,
>>> +	HWPfFecDl5gVersionReg                 =  0x00BCF100,
>>> +	HWPfFecDl5gFcwStatusReg               =  0x00BCF104,
>>> +	HWPfFecDl5gWarnReg                    =  0x00BCF108,
>>> +	HWPfFecUlVersionReg                   =  0x00BD0000,
>>> +	HWPfFecUlControlReg                   =  0x00BD0004,
>>> +	HWPfFecUlStatusReg                    =  0x00BD0008,
>>> +	HWPfFecDlVersionReg                   =  0x00BDF000,
>>> +	HWPfFecDlClusterConfigReg             =  0x00BDF004,
>>> +	HWPfFecDlBurstThres                   =  0x00BDF00C,
>>> +	HWPfFecDlClusterStatusReg0            =  0x00BDF040,
>>> +	HWPfFecDlClusterStatusReg1            =  0x00BDF044,
>>> +	HWPfFecDlClusterStatusReg2            =  0x00BDF048,
>>> +	HWPfFecDlClusterStatusReg3            =  0x00BDF04C,
>>> +	HWPfFecDlClusterStatusReg4            =  0x00BDF050,
>>> +	HWPfFecDlClusterStatusReg5            =  0x00BDF054,
>>> +	HWPfChaFabPllPllrst                   =  0x00C40000,
>>> +	HWPfChaFabPllClk0                     =  0x00C40004,
>>> +	HWPfChaFabPllClk1                     =  0x00C40008,
>>> +	HWPfChaFabPllBwadj                    =  0x00C4000C,
>>> +	HWPfChaFabPllLbw                      =  0x00C40010,
>>> +	HWPfChaFabPllResetq                   =  0x00C40014,
>>> +	HWPfChaFabPllPhshft0                  =  0x00C40018,
>>> +	HWPfChaFabPllPhshft1                  =  0x00C4001C,
>>> +	HWPfChaFabPllDivq0                    =  0x00C40020,
>>> +	HWPfChaFabPllDivq1                    =  0x00C40024,
>>> +	HWPfChaFabPllDivq2                    =  0x00C40028,
>>> +	HWPfChaFabPllDivq3                    =  0x00C4002C,
>>> +	HWPfChaFabPllDivq4                    =  0x00C40030,
>>> +	HWPfChaFabPllDivq5                    =  0x00C40034,
>>> +	HWPfChaFabPllDivq6                    =  0x00C40038,
>>> +	HWPfChaFabPllDivq7                    =  0x00C4003C,
>>> +	HWPfChaDl5gPllPllrst                  =  0x00C40080,
>>> +	HWPfChaDl5gPllClk0                    =  0x00C40084,
>>> +	HWPfChaDl5gPllClk1                    =  0x00C40088,
>>> +	HWPfChaDl5gPllBwadj                   =  0x00C4008C,
>>> +	HWPfChaDl5gPllLbw                     =  0x00C40090,
>>> +	HWPfChaDl5gPllResetq                  =  0x00C40094,
>>> +	HWPfChaDl5gPllPhshft0                 =  0x00C40098,
>>> +	HWPfChaDl5gPllPhshft1                 =  0x00C4009C,
>>> +	HWPfChaDl5gPllDivq0                   =  0x00C400A0,
>>> +	HWPfChaDl5gPllDivq1                   =  0x00C400A4,
>>> +	HWPfChaDl5gPllDivq2                   =  0x00C400A8,
>>> +	HWPfChaDl5gPllDivq3                   =  0x00C400AC,
>>> +	HWPfChaDl5gPllDivq4                   =  0x00C400B0,
>>> +	HWPfChaDl5gPllDivq5                   =  0x00C400B4,
>>> +	HWPfChaDl5gPllDivq6                   =  0x00C400B8,
>>> +	HWPfChaDl5gPllDivq7                   =  0x00C400BC,
>>> +	HWPfChaDl4gPllPllrst                  =  0x00C40100,
>>> +	HWPfChaDl4gPllClk0                    =  0x00C40104,
>>> +	HWPfChaDl4gPllClk1                    =  0x00C40108,
>>> +	HWPfChaDl4gPllBwadj                   =  0x00C4010C,
>>> +	HWPfChaDl4gPllLbw                     =  0x00C40110,
>>> +	HWPfChaDl4gPllResetq                  =  0x00C40114,
>>> +	HWPfChaDl4gPllPhshft0                 =  0x00C40118,
>>> +	HWPfChaDl4gPllPhshft1                 =  0x00C4011C,
>>> +	HWPfChaDl4gPllDivq0                   =  0x00C40120,
>>> +	HWPfChaDl4gPllDivq1                   =  0x00C40124,
>>> +	HWPfChaDl4gPllDivq2                   =  0x00C40128,
>>> +	HWPfChaDl4gPllDivq3                   =  0x00C4012C,
>>> +	HWPfChaDl4gPllDivq4                   =  0x00C40130,
>>> +	HWPfChaDl4gPllDivq5                   =  0x00C40134,
>>> +	HWPfChaDl4gPllDivq6                   =  0x00C40138,
>>> +	HWPfChaDl4gPllDivq7                   =  0x00C4013C,
>>> +	HWPfChaUl5gPllPllrst                  =  0x00C40180,
>>> +	HWPfChaUl5gPllClk0                    =  0x00C40184,
>>> +	HWPfChaUl5gPllClk1                    =  0x00C40188,
>>> +	HWPfChaUl5gPllBwadj                   =  0x00C4018C,
>>> +	HWPfChaUl5gPllLbw                     =  0x00C40190,
>>> +	HWPfChaUl5gPllResetq                  =  0x00C40194,
>>> +	HWPfChaUl5gPllPhshft0                 =  0x00C40198,
>>> +	HWPfChaUl5gPllPhshft1                 =  0x00C4019C,
>>> +	HWPfChaUl5gPllDivq0                   =  0x00C401A0,
>>> +	HWPfChaUl5gPllDivq1                   =  0x00C401A4,
>>> +	HWPfChaUl5gPllDivq2                   =  0x00C401A8,
>>> +	HWPfChaUl5gPllDivq3                   =  0x00C401AC,
>>> +	HWPfChaUl5gPllDivq4                   =  0x00C401B0,
>>> +	HWPfChaUl5gPllDivq5                   =  0x00C401B4,
>>> +	HWPfChaUl5gPllDivq6                   =  0x00C401B8,
>>> +	HWPfChaUl5gPllDivq7                   =  0x00C401BC,
>>> +	HWPfChaUl4gPllPllrst                  =  0x00C40200,
>>> +	HWPfChaUl4gPllClk0                    =  0x00C40204,
>>> +	HWPfChaUl4gPllClk1                    =  0x00C40208,
>>> +	HWPfChaUl4gPllBwadj                   =  0x00C4020C,
>>> +	HWPfChaUl4gPllLbw                     =  0x00C40210,
>>> +	HWPfChaUl4gPllResetq                  =  0x00C40214,
>>> +	HWPfChaUl4gPllPhshft0                 =  0x00C40218,
>>> +	HWPfChaUl4gPllPhshft1                 =  0x00C4021C,
>>> +	HWPfChaUl4gPllDivq0                   =  0x00C40220,
>>> +	HWPfChaUl4gPllDivq1                   =  0x00C40224,
>>> +	HWPfChaUl4gPllDivq2                   =  0x00C40228,
>>> +	HWPfChaUl4gPllDivq3                   =  0x00C4022C,
>>> +	HWPfChaUl4gPllDivq4                   =  0x00C40230,
>>> +	HWPfChaUl4gPllDivq5                   =  0x00C40234,
>>> +	HWPfChaUl4gPllDivq6                   =  0x00C40238,
>>> +	HWPfChaUl4gPllDivq7                   =  0x00C4023C,
>>> +	HWPfChaDdrPllPllrst                   =  0x00C40280,
>>> +	HWPfChaDdrPllClk0                     =  0x00C40284,
>>> +	HWPfChaDdrPllClk1                     =  0x00C40288,
>>> +	HWPfChaDdrPllBwadj                    =  0x00C4028C,
>>> +	HWPfChaDdrPllLbw                      =  0x00C40290,
>>> +	HWPfChaDdrPllResetq                   =  0x00C40294,
>>> +	HWPfChaDdrPllPhshft0                  =  0x00C40298,
>>> +	HWPfChaDdrPllPhshft1                  =  0x00C4029C,
>>> +	HWPfChaDdrPllDivq0                    =  0x00C402A0,
>>> +	HWPfChaDdrPllDivq1                    =  0x00C402A4,
>>> +	HWPfChaDdrPllDivq2                    =  0x00C402A8,
>>> +	HWPfChaDdrPllDivq3                    =  0x00C402AC,
>>> +	HWPfChaDdrPllDivq4                    =  0x00C402B0,
>>> +	HWPfChaDdrPllDivq5                    =  0x00C402B4,
>>> +	HWPfChaDdrPllDivq6                    =  0x00C402B8,
>>> +	HWPfChaDdrPllDivq7                    =  0x00C402BC,
>>> +	HWPfChaErrStatus                      =  0x00C40400,
>>> +	HWPfChaErrMask                        =  0x00C40404,
>>> +	HWPfChaDebugPcieMsiFifo               =  0x00C40410,
>>> +	HWPfChaDebugDdrMsiFifo                =  0x00C40414,
>>> +	HWPfChaDebugMiscMsiFifo               =  0x00C40418,
>>> +	HWPfChaPwmSet                         =  0x00C40420,
>>> +	HWPfChaDdrRstStatus                   =  0x00C40430,
>>> +	HWPfChaDdrStDoneStatus                =  0x00C40434,
>>> +	HWPfChaDdrWbRstCfg                    =  0x00C40438,
>>> +	HWPfChaDdrApbRstCfg                   =  0x00C4043C,
>>> +	HWPfChaDdrPhyRstCfg                   =  0x00C40440,
>>> +	HWPfChaDdrCpuRstCfg                   =  0x00C40444,
>>> +	HWPfChaDdrSifRstCfg                   =  0x00C40448,
>>> +	HWPfChaPadcfgPcomp0                   =  0x00C41000,
>>> +	HWPfChaPadcfgNcomp0                   =  0x00C41004,
>>> +	HWPfChaPadcfgOdt0                     =  0x00C41008,
>>> +	HWPfChaPadcfgProtect0                 =  0x00C4100C,
>>> +	HWPfChaPreemphasisProtect0            =  0x00C41010,
>>> +	HWPfChaPreemphasisCompen0             =  0x00C41040,
>>> +	HWPfChaPreemphasisOdten0              =  0x00C41044,
>>> +	HWPfChaPadcfgPcomp1                   =  0x00C41100,
>>> +	HWPfChaPadcfgNcomp1                   =  0x00C41104,
>>> +	HWPfChaPadcfgOdt1                     =  0x00C41108,
>>> +	HWPfChaPadcfgProtect1                 =  0x00C4110C,
>>> +	HWPfChaPreemphasisProtect1            =  0x00C41110,
>>> +	HWPfChaPreemphasisCompen1             =  0x00C41140,
>>> +	HWPfChaPreemphasisOdten1              =  0x00C41144,
>>> +	HWPfChaPadcfgPcomp2                   =  0x00C41200,
>>> +	HWPfChaPadcfgNcomp2                   =  0x00C41204,
>>> +	HWPfChaPadcfgOdt2                     =  0x00C41208,
>>> +	HWPfChaPadcfgProtect2                 =  0x00C4120C,
>>> +	HWPfChaPreemphasisProtect2            =  0x00C41210,
>>> +	HWPfChaPreemphasisCompen2             =  0x00C41240,
>>> +	HWPfChaPreemphasisOdten2              =  0x00C41244,
>>> +	HWPfChaPadcfgPcomp3                   =  0x00C41300,
>>> +	HWPfChaPadcfgNcomp3                   =  0x00C41304,
>>> +	HWPfChaPadcfgOdt3                     =  0x00C41308,
>>> +	HWPfChaPadcfgProtect3                 =  0x00C4130C,
>>> +	HWPfChaPreemphasisProtect3            =  0x00C41310,
>>> +	HWPfChaPreemphasisCompen3             =  0x00C41340,
>>> +	HWPfChaPreemphasisOdten3              =  0x00C41344,
>>> +	HWPfChaPadcfgPcomp4                   =  0x00C41400,
>>> +	HWPfChaPadcfgNcomp4                   =  0x00C41404,
>>> +	HWPfChaPadcfgOdt4                     =  0x00C41408,
>>> +	HWPfChaPadcfgProtect4                 =  0x00C4140C,
>>> +	HWPfChaPreemphasisProtect4            =  0x00C41410,
>>> +	HWPfChaPreemphasisCompen4             =  0x00C41440,
>>> +	HWPfChaPreemphasisOdten4              =  0x00C41444,
>>> +	HWPfHiVfToPfDbellVf                   =  0x00C80000,
>>> +	HWPfHiPfToVfDbellVf                   =  0x00C80008,
>>> +	HWPfHiInfoRingBaseLoVf                =  0x00C80010,
>>> +	HWPfHiInfoRingBaseHiVf                =  0x00C80014,
>>> +	HWPfHiInfoRingPointerVf               =  0x00C80018,
>>> +	HWPfHiInfoRingIntWrEnVf               =  0x00C80020,
>>> +	HWPfHiInfoRingPf2VfWrEnVf             =  0x00C80024,
>>> +	HWPfHiMsixVectorMapperVf              =  0x00C80060,
>>> +	HWPfHiModuleVersionReg                =  0x00C84000,
>>> +	HWPfHiIosf2axiErrLogReg               =  0x00C84004,
>>> +	HWPfHiHardResetReg                    =  0x00C84008,
>>> +	HWPfHi5GHardResetReg                  =  0x00C8400C,
>>> +	HWPfHiInfoRingBaseLoRegPf             =  0x00C84010,
>>> +	HWPfHiInfoRingBaseHiRegPf             =  0x00C84014,
>>> +	HWPfHiInfoRingPointerRegPf            =  0x00C84018,
>>> +	HWPfHiInfoRingIntWrEnRegPf            =  0x00C84020,
>>> +	HWPfHiInfoRingVf2pfLoWrEnReg          =  0x00C84024,
>>> +	HWPfHiInfoRingVf2pfHiWrEnReg          =  0x00C84028,
>>> +	HWPfHiLogParityErrStatusReg           =  0x00C8402C,
>>> +	HWPfHiLogDataParityErrorVfStatusLo    =  0x00C84030,
>>> +	HWPfHiLogDataParityErrorVfStatusHi    =  0x00C84034,
>>> +	HWPfHiBlockTransmitOnErrorEn          =  0x00C84038,
>>> +	HWPfHiCfgMsiIntWrEnRegPf              =  0x00C84040,
>>> +	HWPfHiCfgMsiVf2pfLoWrEnReg            =  0x00C84044,
>>> +	HWPfHiCfgMsiVf2pfHighWrEnReg          =  0x00C84048,
>>> +	HWPfHiMsixVectorMapperPf              =  0x00C84060,
>>> +	HWPfHiApbWrWaitTime                   =  0x00C84100,
>>> +	HWPfHiXCounterMaxValue                =  0x00C84104,
>>> +	HWPfHiPfMode                          =  0x00C84108,
>>> +	HWPfHiClkGateHystReg                  =  0x00C8410C,
>>> +	HWPfHiSnoopBitsReg                    =  0x00C84110,
>>> +	HWPfHiMsiDropEnableReg                =  0x00C84114,
>>> +	HWPfHiMsiStatReg                      =  0x00C84120,
>>> +	HWPfHiFifoOflStatReg                  =  0x00C84124,
>>> +	HWPfHiHiDebugReg                      =  0x00C841F4,
>>> +	HWPfHiDebugMemSnoopMsiFifo            =  0x00C841F8,
>>> +	HWPfHiDebugMemSnoopInputFifo          =  0x00C841FC,
>>> +	HWPfHiMsixMappingConfig               =  0x00C84200,
>>> +	HWPfHiJunkReg                         =  0x00C8FF00,
>>> +	HWPfDdrUmmcVer                        =  0x00D00000,
>>> +	HWPfDdrUmmcCap                        =  0x00D00010,
>>> +	HWPfDdrUmmcCtrl                       =  0x00D00020,
>>> +	HWPfDdrMpcPe                          =  0x00D00080,
>>> +	HWPfDdrMpcPpri3                       =  0x00D00090,
>>> +	HWPfDdrMpcPpri2                       =  0x00D000A0,
>>> +	HWPfDdrMpcPpri1                       =  0x00D000B0,
>>> +	HWPfDdrMpcPpri0                       =  0x00D000C0,
>>> +	HWPfDdrMpcPrwgrpCtrl                  =  0x00D000D0,
>>> +	HWPfDdrMpcPbw7                        =  0x00D000E0,
>>> +	HWPfDdrMpcPbw6                        =  0x00D000F0,
>>> +	HWPfDdrMpcPbw5                        =  0x00D00100,
>>> +	HWPfDdrMpcPbw4                        =  0x00D00110,
>>> +	HWPfDdrMpcPbw3                        =  0x00D00120,
>>> +	HWPfDdrMpcPbw2                        =  0x00D00130,
>>> +	HWPfDdrMpcPbw1                        =  0x00D00140,
>>> +	HWPfDdrMpcPbw0                        =  0x00D00150,
>>> +	HWPfDdrMemoryInit                     =  0x00D00200,
>>> +	HWPfDdrMemoryInitDone                 =  0x00D00210,
>>> +	HWPfDdrMemInitPhyTrng0                =  0x00D00240,
>>> +	HWPfDdrMemInitPhyTrng1                =  0x00D00250,
>>> +	HWPfDdrMemInitPhyTrng2                =  0x00D00260,
>>> +	HWPfDdrMemInitPhyTrng3                =  0x00D00270,
>>> +	HWPfDdrBcDram                         =  0x00D003C0,
>>> +	HWPfDdrBcAddrMap                      =  0x00D003D0,
>>> +	HWPfDdrBcRef                          =  0x00D003E0,
>>> +	HWPfDdrBcTim0                         =  0x00D00400,
>>> +	HWPfDdrBcTim1                         =  0x00D00410,
>>> +	HWPfDdrBcTim2                         =  0x00D00420,
>>> +	HWPfDdrBcTim3                         =  0x00D00430,
>>> +	HWPfDdrBcTim4                         =  0x00D00440,
>>> +	HWPfDdrBcTim5                         =  0x00D00450,
>>> +	HWPfDdrBcTim6                         =  0x00D00460,
>>> +	HWPfDdrBcTim7                         =  0x00D00470,
>>> +	HWPfDdrBcTim8                         =  0x00D00480,
>>> +	HWPfDdrBcTim9                         =  0x00D00490,
>>> +	HWPfDdrBcTim10                        =  0x00D004A0,
>>> +	HWPfDdrBcTim12                        =  0x00D004C0,
>>> +	HWPfDdrDfiInit                        =  0x00D004D0,
>>> +	HWPfDdrDfiInitComplete                =  0x00D004E0,
>>> +	HWPfDdrDfiTim0                        =  0x00D004F0,
>>> +	HWPfDdrDfiTim1                        =  0x00D00500,
>>> +	HWPfDdrDfiPhyUpdEn                    =  0x00D00530,
>>> +	HWPfDdrMemStatus                      =  0x00D00540,
>>> +	HWPfDdrUmmcErrStatus                  =  0x00D00550,
>>> +	HWPfDdrUmmcIntStatus                  =  0x00D00560,
>>> +	HWPfDdrUmmcIntEn                      =  0x00D00570,
>>> +	HWPfDdrPhyRdLatency                   =  0x00D48400,
>>> +	HWPfDdrPhyRdLatencyDbi                =  0x00D48410,
>>> +	HWPfDdrPhyWrLatency                   =  0x00D48420,
>>> +	HWPfDdrPhyTrngType                    =  0x00D48430,
>>> +	HWPfDdrPhyMrsTiming2                  =  0x00D48440,
>>> +	HWPfDdrPhyMrsTiming0                  =  0x00D48450,
>>> +	HWPfDdrPhyMrsTiming1                  =  0x00D48460,
>>> +	HWPfDdrPhyDramTmrd                    =  0x00D48470,
>>> +	HWPfDdrPhyDramTmod                    =  0x00D48480,
>>> +	HWPfDdrPhyDramTwpre                   =  0x00D48490,
>>> +	HWPfDdrPhyDramTrfc                    =  0x00D484A0,
>>> +	HWPfDdrPhyDramTrwtp                   =  0x00D484B0,
>>> +	HWPfDdrPhyMr01Dimm                    =  0x00D484C0,
>>> +	HWPfDdrPhyMr01DimmDbi                 =  0x00D484D0,
>>> +	HWPfDdrPhyMr23Dimm                    =  0x00D484E0,
>>> +	HWPfDdrPhyMr45Dimm                    =  0x00D484F0,
>>> +	HWPfDdrPhyMr67Dimm                    =  0x00D48500,
>>> +	HWPfDdrPhyWrlvlWwRdlvlRr              =  0x00D48510,
>>> +	HWPfDdrPhyOdtEn                       =  0x00D48520,
>>> +	HWPfDdrPhyFastTrng                    =  0x00D48530,
>>> +	HWPfDdrPhyDynTrngGap                  =  0x00D48540,
>>> +	HWPfDdrPhyDynRcalGap                  =  0x00D48550,
>>> +	HWPfDdrPhyIdletimeout                 =  0x00D48560,
>>> +	HWPfDdrPhyRstCkeGap                   =  0x00D48570,
>>> +	HWPfDdrPhyCkeMrsGap                   =  0x00D48580,
>>> +	HWPfDdrPhyMemVrefMidVal               =  0x00D48590,
>>> +	HWPfDdrPhyVrefStep                    =  0x00D485A0,
>>> +	HWPfDdrPhyVrefThreshold               =  0x00D485B0,
>>> +	HWPfDdrPhyPhyVrefMidVal               =  0x00D485C0,
>>> +	HWPfDdrPhyDqsCountMax                 =  0x00D485D0,
>>> +	HWPfDdrPhyDqsCountNum                 =  0x00D485E0,
>>> +	HWPfDdrPhyDramRow                     =  0x00D485F0,
>>> +	HWPfDdrPhyDramCol                     =  0x00D48600,
>>> +	HWPfDdrPhyDramBgBa                    =  0x00D48610,
>>> +	HWPfDdrPhyDynamicUpdreqrel            =  0x00D48620,
>>> +	HWPfDdrPhyVrefLimits                  =  0x00D48630,
>>> +	HWPfDdrPhyIdtmTcStatus                =  0x00D6C020,
>>> +	HWPfDdrPhyIdtmFwVersion               =  0x00D6C410,
>>> +	HWPfDdrPhyRdlvlGateInitDelay          =  0x00D70000,
>>> +	HWPfDdrPhyRdenSmplabc                 =  0x00D70008,
>>> +	HWPfDdrPhyVrefNibble0                 =  0x00D7000C,
>>> +	HWPfDdrPhyVrefNibble1                 =  0x00D70010,
>>> +	HWPfDdrPhyRdlvlGateDqsSmpl0           =  0x00D70014,
>>> +	HWPfDdrPhyRdlvlGateDqsSmpl1           =  0x00D70018,
>>> +	HWPfDdrPhyRdlvlGateDqsSmpl2           =  0x00D7001C,
>>> +	HWPfDdrPhyDqsCount                    =  0x00D70020,
>>> +	HWPfDdrPhyWrlvlRdlvlGateStatus        =  0x00D70024,
>>> +	HWPfDdrPhyErrorFlags                  =  0x00D70028,
>>> +	HWPfDdrPhyPowerDown                   =  0x00D70030,
>>> +	HWPfDdrPhyPrbsSeedByte0               =  0x00D70034,
>>> +	HWPfDdrPhyPrbsSeedByte1               =  0x00D70038,
>>> +	HWPfDdrPhyPcompDq                     =  0x00D70040,
>>> +	HWPfDdrPhyNcompDq                     =  0x00D70044,
>>> +	HWPfDdrPhyPcompDqs                    =  0x00D70048,
>>> +	HWPfDdrPhyNcompDqs                    =  0x00D7004C,
>>> +	HWPfDdrPhyPcompCmd                    =  0x00D70050,
>>> +	HWPfDdrPhyNcompCmd                    =  0x00D70054,
>>> +	HWPfDdrPhyPcompCk                     =  0x00D70058,
>>> +	HWPfDdrPhyNcompCk                     =  0x00D7005C,
>>> +	HWPfDdrPhyRcalOdtDq                   =  0x00D70060,
>>> +	HWPfDdrPhyRcalOdtDqs                  =  0x00D70064,
>>> +	HWPfDdrPhyRcalMask1                   =  0x00D70068,
>>> +	HWPfDdrPhyRcalMask2                   =  0x00D7006C,
>>> +	HWPfDdrPhyRcalCtrl                    =  0x00D70070,
>>> +	HWPfDdrPhyRcalCnt                     =  0x00D70074,
>>> +	HWPfDdrPhyRcalOverride                =  0x00D70078,
>>> +	HWPfDdrPhyRcalGateen                  =  0x00D7007C,
>>> +	HWPfDdrPhyCtrl                        =  0x00D70080,
>>> +	HWPfDdrPhyWrlvlAlg                    =  0x00D70084,
>>> +	HWPfDdrPhyRcalVreftTxcmdOdt           =  0x00D70088,
>>> +	HWPfDdrPhyRdlvlGateParam              =  0x00D7008C,
>>> +	HWPfDdrPhyRdlvlGateParam2             =  0x00D70090,
>>> +	HWPfDdrPhyRcalVreftTxdata             =  0x00D70094,
>>> +	HWPfDdrPhyCmdIntDelay                 =  0x00D700A4,
>>> +	HWPfDdrPhyAlertN                      =  0x00D700A8,
>>> +	HWPfDdrPhyTrngReqWpre2tck             =  0x00D700AC,
>>> +	HWPfDdrPhyCmdPhaseSel                 =  0x00D700B4,
>>> +	HWPfDdrPhyCmdDcdl                     =  0x00D700B8,
>>> +	HWPfDdrPhyCkDcdl                      =  0x00D700BC,
>>> +	HWPfDdrPhySwTrngCtrl1                 =  0x00D700C0,
>>> +	HWPfDdrPhySwTrngCtrl2                 =  0x00D700C4,
>>> +	HWPfDdrPhyRcalPcompRden               =  0x00D700C8,
>>> +	HWPfDdrPhyRcalNcompRden               =  0x00D700CC,
>>> +	HWPfDdrPhyRcalCompen                  =  0x00D700D0,
>>> +	HWPfDdrPhySwTrngRdqs                  =  0x00D700D4,
>>> +	HWPfDdrPhySwTrngWdqs                  =  0x00D700D8,
>>> +	HWPfDdrPhySwTrngRdena                 =  0x00D700DC,
>>> +	HWPfDdrPhySwTrngRdenb                 =  0x00D700E0,
>>> +	HWPfDdrPhySwTrngRdenc                 =  0x00D700E4,
>>> +	HWPfDdrPhySwTrngWdq                   =  0x00D700E8,
>>> +	HWPfDdrPhySwTrngRdq                   =  0x00D700EC,
>>> +	HWPfDdrPhyPcfgHmValue                 =  0x00D700F0,
>>> +	HWPfDdrPhyPcfgTimerValue              =  0x00D700F4,
>>> +	HWPfDdrPhyPcfgSoftwareTraining        =  0x00D700F8,
>>> +	HWPfDdrPhyPcfgMcStatus                =  0x00D700FC,
>>> +	HWPfDdrPhyWrlvlPhRank0                =  0x00D70100,
>>> +	HWPfDdrPhyRdenPhRank0                 =  0x00D70104,
>>> +	HWPfDdrPhyRdenIntRank0                =  0x00D70108,
>>> +	HWPfDdrPhyRdqsDcdlRank0               =  0x00D7010C,
>>> +	HWPfDdrPhyRdqsShadowDcdlRank0         =  0x00D70110,
>>> +	HWPfDdrPhyWdqsDcdlRank0               =  0x00D70114,
>>> +	HWPfDdrPhyWdmDcdlShadowRank0          =  0x00D70118,
>>> +	HWPfDdrPhyWdmDcdlRank0                =  0x00D7011C,
>>> +	HWPfDdrPhyDbiDcdlRank0                =  0x00D70120,
>>> +	HWPfDdrPhyRdenDcdlaRank0              =  0x00D70124,
>>> +	HWPfDdrPhyDbiDcdlShadowRank0          =  0x00D70128,
>>> +	HWPfDdrPhyRdenDcdlbRank0              =  0x00D7012C,
>>> +	HWPfDdrPhyWdqsShadowDcdlRank0         =  0x00D70130,
>>> +	HWPfDdrPhyRdenDcdlcRank0              =  0x00D70134,
>>> +	HWPfDdrPhyRdenShadowDcdlaRank0        =  0x00D70138,
>>> +	HWPfDdrPhyWrlvlIntRank0               =  0x00D7013C,
>>> +	HWPfDdrPhyRdqDcdlBit0Rank0            =  0x00D70200,
>>> +	HWPfDdrPhyRdqDcdlShadowBit0Rank0      =  0x00D70204,
>>> +	HWPfDdrPhyWdqDcdlBit0Rank0            =  0x00D70208,
>>> +	HWPfDdrPhyWdqDcdlShadowBit0Rank0      =  0x00D7020C,
>>> +	HWPfDdrPhyRdqDcdlBit1Rank0            =  0x00D70240,
>>> +	HWPfDdrPhyRdqDcdlShadowBit1Rank0      =  0x00D70244,
>>> +	HWPfDdrPhyWdqDcdlBit1Rank0            =  0x00D70248,
>>> +	HWPfDdrPhyWdqDcdlShadowBit1Rank0      =  0x00D7024C,
>>> +	HWPfDdrPhyRdqDcdlBit2Rank0            =  0x00D70280,
>>> +	HWPfDdrPhyRdqDcdlShadowBit2Rank0      =  0x00D70284,
>>> +	HWPfDdrPhyWdqDcdlBit2Rank0            =  0x00D70288,
>>> +	HWPfDdrPhyWdqDcdlShadowBit2Rank0      =  0x00D7028C,
>>> +	HWPfDdrPhyRdqDcdlBit3Rank0            =  0x00D702C0,
>>> +	HWPfDdrPhyRdqDcdlShadowBit3Rank0      =  0x00D702C4,
>>> +	HWPfDdrPhyWdqDcdlBit3Rank0            =  0x00D702C8,
>>> +	HWPfDdrPhyWdqDcdlShadowBit3Rank0      =  0x00D702CC,
>>> +	HWPfDdrPhyRdqDcdlBit4Rank0            =  0x00D70300,
>>> +	HWPfDdrPhyRdqDcdlShadowBit4Rank0      =  0x00D70304,
>>> +	HWPfDdrPhyWdqDcdlBit4Rank0            =  0x00D70308,
>>> +	HWPfDdrPhyWdqDcdlShadowBit4Rank0      =  0x00D7030C,
>>> +	HWPfDdrPhyRdqDcdlBit5Rank0            =  0x00D70340,
>>> +	HWPfDdrPhyRdqDcdlShadowBit5Rank0      =  0x00D70344,
>>> +	HWPfDdrPhyWdqDcdlBit5Rank0            =  0x00D70348,
>>> +	HWPfDdrPhyWdqDcdlShadowBit5Rank0      =  0x00D7034C,
>>> +	HWPfDdrPhyRdqDcdlBit6Rank0            =  0x00D70380,
>>> +	HWPfDdrPhyRdqDcdlShadowBit6Rank0      =  0x00D70384,
>>> +	HWPfDdrPhyWdqDcdlBit6Rank0            =  0x00D70388,
>>> +	HWPfDdrPhyWdqDcdlShadowBit6Rank0      =  0x00D7038C,
>>> +	HWPfDdrPhyRdqDcdlBit7Rank0            =  0x00D703C0,
>>> +	HWPfDdrPhyRdqDcdlShadowBit7Rank0      =  0x00D703C4,
>>> +	HWPfDdrPhyWdqDcdlBit7Rank0            =  0x00D703C8,
>>> +	HWPfDdrPhyWdqDcdlShadowBit7Rank0      =  0x00D703CC,
>>> +	HWPfDdrPhyIdtmStatus                  =  0x00D740D0,
>>> +	HWPfDdrPhyIdtmError                   =  0x00D74110,
>>> +	HWPfDdrPhyIdtmDebug                   =  0x00D74120,
>>> +	HWPfDdrPhyIdtmDebugInt                =  0x00D74130,
>>> +	HwPfPcieLnAsicCfgovr                  =  0x00D80000,
>>> +	HwPfPcieLnAclkmixer                   =  0x00D80004,
>>> +	HwPfPcieLnTxrampfreq                  =  0x00D80008,
>>> +	HwPfPcieLnLanetest                    =  0x00D8000C,
>>> +	HwPfPcieLnDcctrl                      =  0x00D80010,
>>> +	HwPfPcieLnDccmeas                     =  0x00D80014,
>>> +	HwPfPcieLnDccovrAclk                  =  0x00D80018,
>>> +	HwPfPcieLnDccovrTxa                   =  0x00D8001C,
>>> +	HwPfPcieLnDccovrTxk                   =  0x00D80020,
>>> +	HwPfPcieLnDccovrDclk                  =  0x00D80024,
>>> +	HwPfPcieLnDccovrEclk                  =  0x00D80028,
>>> +	HwPfPcieLnDcctrimAclk                 =  0x00D8002C,
>>> +	HwPfPcieLnDcctrimTx                   =  0x00D80030,
>>> +	HwPfPcieLnDcctrimDclk                 =  0x00D80034,
>>> +	HwPfPcieLnDcctrimEclk                 =  0x00D80038,
>>> +	HwPfPcieLnQuadCtrl                    =  0x00D8003C,
>>> +	HwPfPcieLnQuadCorrIndex               =  0x00D80040,
>>> +	HwPfPcieLnQuadCorrStatus              =  0x00D80044,
>>> +	HwPfPcieLnAsicRxovr1                  =  0x00D80048,
>>> +	HwPfPcieLnAsicRxovr2                  =  0x00D8004C,
>>> +	HwPfPcieLnAsicEqinfovr                =  0x00D80050,
>>> +	HwPfPcieLnRxcsr                       =  0x00D80054,
>>> +	HwPfPcieLnRxfectrl                    =  0x00D80058,
>>> +	HwPfPcieLnRxtest                      =  0x00D8005C,
>>> +	HwPfPcieLnEscount                     =  0x00D80060,
>>> +	HwPfPcieLnCdrctrl                     =  0x00D80064,
>>> +	HwPfPcieLnCdrctrl2                    =  0x00D80068,
>>> +	HwPfPcieLnCdrcfg0Ctrl0                =  0x00D8006C,
>>> +	HwPfPcieLnCdrcfg0Ctrl1                =  0x00D80070,
>>> +	HwPfPcieLnCdrcfg0Ctrl2                =  0x00D80074,
>>> +	HwPfPcieLnCdrcfg1Ctrl0                =  0x00D80078,
>>> +	HwPfPcieLnCdrcfg1Ctrl1                =  0x00D8007C,
>>> +	HwPfPcieLnCdrcfg1Ctrl2                =  0x00D80080,
>>> +	HwPfPcieLnCdrcfg2Ctrl0                =  0x00D80084,
>>> +	HwPfPcieLnCdrcfg2Ctrl1                =  0x00D80088,
>>> +	HwPfPcieLnCdrcfg2Ctrl2                =  0x00D8008C,
>>> +	HwPfPcieLnCdrcfg3Ctrl0                =  0x00D80090,
>>> +	HwPfPcieLnCdrcfg3Ctrl1                =  0x00D80094,
>>> +	HwPfPcieLnCdrcfg3Ctrl2                =  0x00D80098,
>>> +	HwPfPcieLnCdrphase                    =  0x00D8009C,
>>> +	HwPfPcieLnCdrfreq                     =  0x00D800A0,
>>> +	HwPfPcieLnCdrstatusPhase              =  0x00D800A4,
>>> +	HwPfPcieLnCdrstatusFreq               =  0x00D800A8,
>>> +	HwPfPcieLnCdroffset                   =  0x00D800AC,
>>> +	HwPfPcieLnRxvosctl                    =  0x00D800B0,
>>> +	HwPfPcieLnRxvosctl2                   =  0x00D800B4,
>>> +	HwPfPcieLnRxlosctl                    =  0x00D800B8,
>>> +	HwPfPcieLnRxlos                       =  0x00D800BC,
>>> +	HwPfPcieLnRxlosvval                   =  0x00D800C0,
>>> +	HwPfPcieLnRxvosd0                     =  0x00D800C4,
>>> +	HwPfPcieLnRxvosd1                     =  0x00D800C8,
>>> +	HwPfPcieLnRxvosep0                    =  0x00D800CC,
>>> +	HwPfPcieLnRxvosep1                    =  0x00D800D0,
>>> +	HwPfPcieLnRxvosen0                    =  0x00D800D4,
>>> +	HwPfPcieLnRxvosen1                    =  0x00D800D8,
>>> +	HwPfPcieLnRxvosafe                    =  0x00D800DC,
>>> +	HwPfPcieLnRxvosa0                     =  0x00D800E0,
>>> +	HwPfPcieLnRxvosa0Out                  =  0x00D800E4,
>>> +	HwPfPcieLnRxvosa1                     =  0x00D800E8,
>>> +	HwPfPcieLnRxvosa1Out                  =  0x00D800EC,
>>> +	HwPfPcieLnRxmisc                      =  0x00D800F0,
>>> +	HwPfPcieLnRxbeacon                    =  0x00D800F4,
>>> +	HwPfPcieLnRxdssout                    =  0x00D800F8,
>>> +	HwPfPcieLnRxdssout2                   =  0x00D800FC,
>>> +	HwPfPcieLnAlphapctrl                  =  0x00D80100,
>>> +	HwPfPcieLnAlphanctrl                  =  0x00D80104,
>>> +	HwPfPcieLnAdaptctrl                   =  0x00D80108,
>>> +	HwPfPcieLnAdaptctrl1                  =  0x00D8010C,
>>> +	HwPfPcieLnAdaptstatus                 =  0x00D80110,
>>> +	HwPfPcieLnAdaptvga1                   =  0x00D80114,
>>> +	HwPfPcieLnAdaptvga2                   =  0x00D80118,
>>> +	HwPfPcieLnAdaptvga3                   =  0x00D8011C,
>>> +	HwPfPcieLnAdaptvga4                   =  0x00D80120,
>>> +	HwPfPcieLnAdaptboost1                 =  0x00D80124,
>>> +	HwPfPcieLnAdaptboost2                 =  0x00D80128,
>>> +	HwPfPcieLnAdaptboost3                 =  0x00D8012C,
>>> +	HwPfPcieLnAdaptboost4                 =  0x00D80130,
>>> +	HwPfPcieLnAdaptsslms1                 =  0x00D80134,
>>> +	HwPfPcieLnAdaptsslms2                 =  0x00D80138,
>>> +	HwPfPcieLnAdaptvgaStatus              =  0x00D8013C,
>>> +	HwPfPcieLnAdaptboostStatus            =  0x00D80140,
>>> +	HwPfPcieLnAdaptsslmsStatus1           =  0x00D80144,
>>> +	HwPfPcieLnAdaptsslmsStatus2           =  0x00D80148,
>>> +	HwPfPcieLnAfectrl1                    =  0x00D8014C,
>>> +	HwPfPcieLnAfectrl2                    =  0x00D80150,
>>> +	HwPfPcieLnAfectrl3                    =  0x00D80154,
>>> +	HwPfPcieLnAfedefault1                 =  0x00D80158,
>>> +	HwPfPcieLnAfedefault2                 =  0x00D8015C,
>>> +	HwPfPcieLnDfectrl1                    =  0x00D80160,
>>> +	HwPfPcieLnDfectrl2                    =  0x00D80164,
>>> +	HwPfPcieLnDfectrl3                    =  0x00D80168,
>>> +	HwPfPcieLnDfectrl4                    =  0x00D8016C,
>>> +	HwPfPcieLnDfectrl5                    =  0x00D80170,
>>> +	HwPfPcieLnDfectrl6                    =  0x00D80174,
>>> +	HwPfPcieLnAfestatus1                  =  0x00D80178,
>>> +	HwPfPcieLnAfestatus2                  =  0x00D8017C,
>>> +	HwPfPcieLnDfestatus1                  =  0x00D80180,
>>> +	HwPfPcieLnDfestatus2                  =  0x00D80184,
>>> +	HwPfPcieLnDfestatus3                  =  0x00D80188,
>>> +	HwPfPcieLnDfestatus4                  =  0x00D8018C,
>>> +	HwPfPcieLnDfestatus5                  =  0x00D80190,
>>> +	HwPfPcieLnAlphastatus                 =  0x00D80194,
>>> +	HwPfPcieLnFomctrl1                    =  0x00D80198,
>>> +	HwPfPcieLnFomctrl2                    =  0x00D8019C,
>>> +	HwPfPcieLnFomctrl3                    =  0x00D801A0,
>>> +	HwPfPcieLnAclkcalStatus               =  0x00D801A4,
>>> +	HwPfPcieLnOffscorrStatus              =  0x00D801A8,
>>> +	HwPfPcieLnEyewidthStatus              =  0x00D801AC,
>>> +	HwPfPcieLnEyeheightStatus             =  0x00D801B0,
>>> +	HwPfPcieLnAsicTxovr1                  =  0x00D801B4,
>>> +	HwPfPcieLnAsicTxovr2                  =  0x00D801B8,
>>> +	HwPfPcieLnAsicTxovr3                  =  0x00D801BC,
>>> +	HwPfPcieLnTxbiasadjOvr                =  0x00D801C0,
>>> +	HwPfPcieLnTxcsr                       =  0x00D801C4,
>>> +	HwPfPcieLnTxtest                      =  0x00D801C8,
>>> +	HwPfPcieLnTxtestword                  =  0x00D801CC,
>>> +	HwPfPcieLnTxtestwordHigh              =  0x00D801D0,
>>> +	HwPfPcieLnTxdrive                     =  0x00D801D4,
>>> +	HwPfPcieLnMtcsLn                      =  0x00D801D8,
>>> +	HwPfPcieLnStatsumLn                   =  0x00D801DC,
>>> +	HwPfPcieLnRcbusScratch                =  0x00D801E0,
>>> +	HwPfPcieLnRcbusMinorrev               =  0x00D801F0,
>>> +	HwPfPcieLnRcbusMajorrev               =  0x00D801F4,
>>> +	HwPfPcieLnRcbusBlocktype              =  0x00D801F8,
>>> +	HwPfPcieSupPllcsr                     =  0x00D80800,
>>> +	HwPfPcieSupPlldiv                     =  0x00D80804,
>>> +	HwPfPcieSupPllcal                     =  0x00D80808,
>>> +	HwPfPcieSupPllcalsts                  =  0x00D8080C,
>>> +	HwPfPcieSupPllmeas                    =  0x00D80810,
>>> +	HwPfPcieSupPlldactrim                 =  0x00D80814,
>>> +	HwPfPcieSupPllbiastrim                =  0x00D80818,
>>> +	HwPfPcieSupPllbwtrim                  =  0x00D8081C,
>>> +	HwPfPcieSupPllcaldly                  =  0x00D80820,
>>> +	HwPfPcieSupRefclkonpclkctrl           =  0x00D80824,
>>> +	HwPfPcieSupPclkdelay                  =  0x00D80828,
>>> +	HwPfPcieSupPhyconfig                  =  0x00D8082C,
>>> +	HwPfPcieSupRcalIntf                   =  0x00D80830,
>>> +	HwPfPcieSupAuxcsr                     =  0x00D80834,
>>> +	HwPfPcieSupVref                       =  0x00D80838,
>>> +	HwPfPcieSupLinkmode                   =  0x00D8083C,
>>> +	HwPfPcieSupRrefcalctl                 =  0x00D80840,
>>> +	HwPfPcieSupRrefcal                    =  0x00D80844,
>>> +	HwPfPcieSupRrefcaldly                 =  0x00D80848,
>>> +	HwPfPcieSupTximpcalctl                =  0x00D8084C,
>>> +	HwPfPcieSupTximpcal                   =  0x00D80850,
>>> +	HwPfPcieSupTximpoffset                =  0x00D80854,
>>> +	HwPfPcieSupTximpcaldly                =  0x00D80858,
>>> +	HwPfPcieSupRximpcalctl                =  0x00D8085C,
>>> +	HwPfPcieSupRximpcal                   =  0x00D80860,
>>> +	HwPfPcieSupRximpoffset                =  0x00D80864,
>>> +	HwPfPcieSupRximpcaldly                =  0x00D80868,
>>> +	HwPfPcieSupFence                      =  0x00D8086C,
>>> +	HwPfPcieSupMtcs                       =  0x00D80870,
>>> +	HwPfPcieSupStatsum                    =  0x00D809B8,
>>> +	HwPfPciePcsDpStatus0                  =  0x00D81000,
>>> +	HwPfPciePcsDpControl0                 =  0x00D81004,
>>> +	HwPfPciePcsPmaStatusLane0             =  0x00D81008,
>>> +	HwPfPciePcsPipeStatusLane0            =  0x00D8100C,
>>> +	HwPfPciePcsTxdeemph0Lane0             =  0x00D81010,
>>> +	HwPfPciePcsTxdeemph1Lane0             =  0x00D81014,
>>> +	HwPfPciePcsInternalStatusLane0        =  0x00D81018,
>>> +	HwPfPciePcsDpStatus1                  =  0x00D8101C,
>>> +	HwPfPciePcsDpControl1                 =  0x00D81020,
>>> +	HwPfPciePcsPmaStatusLane1             =  0x00D81024,
>>> +	HwPfPciePcsPipeStatusLane1            =  0x00D81028,
>>> +	HwPfPciePcsTxdeemph0Lane1             =  0x00D8102C,
>>> +	HwPfPciePcsTxdeemph1Lane1             =  0x00D81030,
>>> +	HwPfPciePcsInternalStatusLane1        =  0x00D81034,
>>> +	HwPfPciePcsDpStatus2                  =  0x00D81038,
>>> +	HwPfPciePcsDpControl2                 =  0x00D8103C,
>>> +	HwPfPciePcsPmaStatusLane2             =  0x00D81040,
>>> +	HwPfPciePcsPipeStatusLane2            =  0x00D81044,
>>> +	HwPfPciePcsTxdeemph0Lane2             =  0x00D81048,
>>> +	HwPfPciePcsTxdeemph1Lane2             =  0x00D8104C,
>>> +	HwPfPciePcsInternalStatusLane2        =  0x00D81050,
>>> +	HwPfPciePcsDpStatus3                  =  0x00D81054,
>>> +	HwPfPciePcsDpControl3                 =  0x00D81058,
>>> +	HwPfPciePcsPmaStatusLane3             =  0x00D8105C,
>>> +	HwPfPciePcsPipeStatusLane3            =  0x00D81060,
>>> +	HwPfPciePcsTxdeemph0Lane3             =  0x00D81064,
>>> +	HwPfPciePcsTxdeemph1Lane3             =  0x00D81068,
>>> +	HwPfPciePcsInternalStatusLane3        =  0x00D8106C,
>>> +	HwPfPciePcsEbStatus0                  =  0x00D81070,
>>> +	HwPfPciePcsEbStatus1                  =  0x00D81074,
>>> +	HwPfPciePcsEbStatus2                  =  0x00D81078,
>>> +	HwPfPciePcsEbStatus3                  =  0x00D8107C,
>>> +	HwPfPciePcsPllSettingPcieG1           =  0x00D81088,
>>> +	HwPfPciePcsPllSettingPcieG2           =  0x00D8108C,
>>> +	HwPfPciePcsPllSettingPcieG3           =  0x00D81090,
>>> +	HwPfPciePcsControl                    =  0x00D81094,
>>> +	HwPfPciePcsEqControl                  =  0x00D81098,
>>> +	HwPfPciePcsEqTimer                    =  0x00D8109C,
>>> +	HwPfPciePcsEqErrStatus                =  0x00D810A0,
>>> +	HwPfPciePcsEqErrCount                 =  0x00D810A4,
>>> +	HwPfPciePcsStatus                     =  0x00D810A8,
>>> +	HwPfPciePcsMiscRegister               =  0x00D810AC,
>>> +	HwPfPciePcsObsControl                 =  0x00D810B0,
>>> +	HwPfPciePcsPrbsCount0                 =  0x00D81200,
>>> +	HwPfPciePcsBistControl0               =  0x00D81204,
>>> +	HwPfPciePcsBistStaticWord00           =  0x00D81208,
>>> +	HwPfPciePcsBistStaticWord10           =  0x00D8120C,
>>> +	HwPfPciePcsBistStaticWord20           =  0x00D81210,
>>> +	HwPfPciePcsBistStaticWord30           =  0x00D81214,
>>> +	HwPfPciePcsPrbsCount1                 =  0x00D81220,
>>> +	HwPfPciePcsBistControl1               =  0x00D81224,
>>> +	HwPfPciePcsBistStaticWord01           =  0x00D81228,
>>> +	HwPfPciePcsBistStaticWord11           =  0x00D8122C,
>>> +	HwPfPciePcsBistStaticWord21           =  0x00D81230,
>>> +	HwPfPciePcsBistStaticWord31           =  0x00D81234,
>>> +	HwPfPciePcsPrbsCount2                 =  0x00D81240,
>>> +	HwPfPciePcsBistControl2               =  0x00D81244,
>>> +	HwPfPciePcsBistStaticWord02           =  0x00D81248,
>>> +	HwPfPciePcsBistStaticWord12           =  0x00D8124C,
>>> +	HwPfPciePcsBistStaticWord22           =  0x00D81250,
>>> +	HwPfPciePcsBistStaticWord32           =  0x00D81254,
>>> +	HwPfPciePcsPrbsCount3                 =  0x00D81260,
>>> +	HwPfPciePcsBistControl3               =  0x00D81264,
>>> +	HwPfPciePcsBistStaticWord03           =  0x00D81268,
>>> +	HwPfPciePcsBistStaticWord13           =  0x00D8126C,
>>> +	HwPfPciePcsBistStaticWord23           =  0x00D81270,
>>> +	HwPfPciePcsBistStaticWord33           =  0x00D81274,
>>> +	HwPfPcieGpexLtssmStateCntrl           =  0x00D90400,
>>> +	HwPfPcieGpexLtssmStateStatus          =  0x00D90404,
>>> +	HwPfPcieGpexSkipFreqTimer             =  0x00D90408,
>>> +	HwPfPcieGpexLaneSelect                =  0x00D9040C,
>>> +	HwPfPcieGpexLaneDeskew                =  0x00D90410,
>>> +	HwPfPcieGpexRxErrorStatus             =  0x00D90414,
>>> +	HwPfPcieGpexLaneNumControl            =  0x00D90418,
>>> +	HwPfPcieGpexNFstControl               =  0x00D9041C,
>>> +	HwPfPcieGpexLinkStatus                =  0x00D90420,
>>> +	HwPfPcieGpexAckReplayTimeout          =  0x00D90438,
>>> +	HwPfPcieGpexSeqNumberStatus           =  0x00D9043C,
>>> +	HwPfPcieGpexCoreClkRatio              =  0x00D90440,
>>> +	HwPfPcieGpexDllTholdControl           =  0x00D90448,
>>> +	HwPfPcieGpexPmTimer                   =  0x00D90450,
>>> +	HwPfPcieGpexPmeTimeout                =  0x00D90454,
>>> +	HwPfPcieGpexAspmL1Timer               =  0x00D90458,
>>> +	HwPfPcieGpexAspmReqTimer              =  0x00D9045C,
>>> +	HwPfPcieGpexAspmL1Dis                 =  0x00D90460,
>>> +	HwPfPcieGpexAdvisoryErrorControl      =  0x00D90468,
>>> +	HwPfPcieGpexId                        =  0x00D90470,
>>> +	HwPfPcieGpexClasscode                 =  0x00D90474,
>>> +	HwPfPcieGpexSubsystemId               =  0x00D90478,
>>> +	HwPfPcieGpexDeviceCapabilities        =  0x00D9047C,
>>> +	HwPfPcieGpexLinkCapabilities          =  0x00D90480,
>>> +	HwPfPcieGpexFunctionNumber            =  0x00D90484,
>>> +	HwPfPcieGpexPmCapabilities            =  0x00D90488,
>>> +	HwPfPcieGpexFunctionSelect            =  0x00D9048C,
>>> +	HwPfPcieGpexErrorCounter              =  0x00D904AC,
>>> +	HwPfPcieGpexConfigReady               =  0x00D904B0,
>>> +	HwPfPcieGpexFcUpdateTimeout           =  0x00D904B8,
>>> +	HwPfPcieGpexFcUpdateTimer             =  0x00D904BC,
>>> +	HwPfPcieGpexVcBufferLoad              =  0x00D904C8,
>>> +	HwPfPcieGpexVcBufferSizeThold         =  0x00D904CC,
>>> +	HwPfPcieGpexVcBufferSelect            =  0x00D904D0,
>>> +	HwPfPcieGpexBarEnable                 =  0x00D904D4,
>>> +	HwPfPcieGpexBarDwordLower             =  0x00D904D8,
>>> +	HwPfPcieGpexBarDwordUpper             =  0x00D904DC,
>>> +	HwPfPcieGpexBarSelect                 =  0x00D904E0,
>>> +	HwPfPcieGpexCreditCounterSelect       =  0x00D904E4,
>>> +	HwPfPcieGpexCreditCounterStatus       =  0x00D904E8,
>>> +	HwPfPcieGpexTlpHeaderSelect           =  0x00D904EC,
>>> +	HwPfPcieGpexTlpHeaderDword0           =  0x00D904F0,
>>> +	HwPfPcieGpexTlpHeaderDword1           =  0x00D904F4,
>>> +	HwPfPcieGpexTlpHeaderDword2           =  0x00D904F8,
>>> +	HwPfPcieGpexTlpHeaderDword3           =  0x00D904FC,
>>> +	HwPfPcieGpexRelaxOrderControl         =  0x00D90500,
>>> +	HwPfPcieGpexBarPrefetch               =  0x00D90504,
>>> +	HwPfPcieGpexFcCheckControl            =  0x00D90508,
>>> +	HwPfPcieGpexFcUpdateTimerTraffic      =  0x00D90518,
>>> +	HwPfPcieGpexPhyControl0               =  0x00D9053C,
>>> +	HwPfPcieGpexPhyControl1               =  0x00D90544,
>>> +	HwPfPcieGpexPhyControl2               =  0x00D9054C,
>>> +	HwPfPcieGpexUserControl0              =  0x00D9055C,
>>> +	HwPfPcieGpexUncorrErrorStatus         =  0x00D905F0,
>>> +	HwPfPcieGpexRxCplError                =  0x00D90620,
>>> +	HwPfPcieGpexRxCplErrorDword0          =  0x00D90624,
>>> +	HwPfPcieGpexRxCplErrorDword1          =  0x00D90628,
>>> +	HwPfPcieGpexRxCplErrorDword2          =  0x00D9062C,
>>> +	HwPfPcieGpexPabSwResetEn              =  0x00D90630,
>>> +	HwPfPcieGpexGen3Control0              =  0x00D90634,
>>> +	HwPfPcieGpexGen3Control1              =  0x00D90638,
>>> +	HwPfPcieGpexGen3Control2              =  0x00D9063C,
>>> +	HwPfPcieGpexGen2ControlCsr            =  0x00D90640,
>>> +	HwPfPcieGpexTotalVfInitialVf0         =  0x00D90644,
>>> +	HwPfPcieGpexTotalVfInitialVf1         =  0x00D90648,
>>> +	HwPfPcieGpexSriovLinkDevId0           =  0x00D90684,
>>> +	HwPfPcieGpexSriovLinkDevId1           =  0x00D90688,
>>> +	HwPfPcieGpexSriovPageSize0            =  0x00D906C4,
>>> +	HwPfPcieGpexSriovPageSize1            =  0x00D906C8,
>>> +	HwPfPcieGpexIdVersion                 =  0x00D906FC,
>>> +	HwPfPcieGpexSriovVfOffsetStride0      =  0x00D90704,
>>> +	HwPfPcieGpexSriovVfOffsetStride1      =  0x00D90708,
>>> +	HwPfPcieGpexGen3DeskewControl         =  0x00D907B4,
>>> +	HwPfPcieGpexGen3EqControl             =  0x00D907B8,
>>> +	HwPfPcieGpexBridgeVersion             =  0x00D90800,
>>> +	HwPfPcieGpexBridgeCapability          =  0x00D90804,
>>> +	HwPfPcieGpexBridgeControl             =  0x00D90808,
>>> +	HwPfPcieGpexBridgeStatus              =  0x00D9080C,
>>> +	HwPfPcieGpexEngineActivityStatus      =  0x00D9081C,
>>> +	HwPfPcieGpexEngineResetControl        =  0x00D90820,
>>> +	HwPfPcieGpexAxiPioControl             =  0x00D90840,
>>> +	HwPfPcieGpexAxiPioStatus              =  0x00D90844,
>>> +	HwPfPcieGpexAmbaSlaveCmdStatus        =  0x00D90848,
>>> +	HwPfPcieGpexPexPioControl             =  0x00D908C0,
>>> +	HwPfPcieGpexPexPioStatus              =  0x00D908C4,
>>> +	HwPfPcieGpexAmbaMasterStatus          =  0x00D908C8,
>>> +	HwPfPcieGpexCsrSlaveCmdStatus         =  0x00D90920,
>>> +	HwPfPcieGpexMailboxAxiControl         =  0x00D90A50,
>>> +	HwPfPcieGpexMailboxAxiData            =  0x00D90A54,
>>> +	HwPfPcieGpexMailboxPexControl         =  0x00D90A90,
>>> +	HwPfPcieGpexMailboxPexData            =  0x00D90A94,
>>> +	HwPfPcieGpexPexInterruptEnable        =  0x00D90AD0,
>>> +	HwPfPcieGpexPexInterruptStatus        =  0x00D90AD4,
>>> +	HwPfPcieGpexPexInterruptAxiPioVector  =  0x00D90AD8,
>>> +	HwPfPcieGpexPexInterruptPexPioVector  =  0x00D90AE0,
>>> +	HwPfPcieGpexPexInterruptMiscVector    =  0x00D90AF8,
>>> +	HwPfPcieGpexAmbaInterruptPioEnable    =  0x00D90B00,
>>> +	HwPfPcieGpexAmbaInterruptMiscEnable   =  0x00D90B0C,
>>> +	HwPfPcieGpexAmbaInterruptPioStatus    =  0x00D90B10,
>>> +	HwPfPcieGpexAmbaInterruptMiscStatus   =  0x00D90B1C,
>>> +	HwPfPcieGpexPexPmControl              =  0x00D90B80,
>>> +	HwPfPcieGpexSlotMisc                  =  0x00D90B88,
>>> +	HwPfPcieGpexAxiAddrMappingControl     =  0x00D90BA0,
>>> +	HwPfPcieGpexAxiAddrMappingWindowAxiBase     =  0x00D90BA4,
>>> +	HwPfPcieGpexAxiAddrMappingWindowPexBaseLow  =  0x00D90BA8,
>>> +	HwPfPcieGpexAxiAddrMappingWindowPexBaseHigh =  0x00D90BAC,
>>> +	HwPfPcieGpexPexBarAddrFunc0Bar0       =  0x00D91BA0,
>>> +	HwPfPcieGpexPexBarAddrFunc0Bar1       =  0x00D91BA4,
>>> +	HwPfPcieGpexAxiAddrMappingPcieHdrParam =  0x00D95BA0,
>>> +	HwPfPcieGpexExtAxiAddrMappingAxiBase  =  0x00D980A0,
>>> +	HwPfPcieGpexPexExtBarAddrFunc0Bar0    =  0x00D984A0,
>>> +	HwPfPcieGpexPexExtBarAddrFunc0Bar1    =  0x00D984A4,
>>> +	HwPfPcieGpexAmbaInterruptFlrEnable    =  0x00D9B960,
>>> +	HwPfPcieGpexAmbaInterruptFlrStatus    =  0x00D9B9A0,
>>> +	HwPfPcieGpexExtAxiAddrMappingSize     =  0x00D9BAF0,
>>> +	HwPfPcieGpexPexPioAwcacheControl      =  0x00D9C300,
>>> +	HwPfPcieGpexPexPioArcacheControl      =  0x00D9C304,
>>> +	HwPfPcieGpexPabObSizeControlVc0       =  0x00D9C310
>>> +};
>>> +
>>> +/* TIP PF Interrupt numbers */
>>> +enum {
>>> +	ACC100_PF_INT_QMGR_AQ_OVERFLOW = 0,
>>> +	ACC100_PF_INT_DOORBELL_VF_2_PF = 1,
>>> +	ACC100_PF_INT_DMA_DL_DESC_IRQ = 2,
>>> +	ACC100_PF_INT_DMA_UL_DESC_IRQ = 3,
>>> +	ACC100_PF_INT_DMA_MLD_DESC_IRQ = 4,
>>> +	ACC100_PF_INT_DMA_UL5G_DESC_IRQ = 5,
>>> +	ACC100_PF_INT_DMA_DL5G_DESC_IRQ = 6,
>>> +	ACC100_PF_INT_ILLEGAL_FORMAT = 7,
>>> +	ACC100_PF_INT_QMGR_DISABLED_ACCESS = 8,
>>> +	ACC100_PF_INT_QMGR_AQ_OVERTHRESHOLD = 9,
>>> +	ACC100_PF_INT_ARAM_ACCESS_ERR = 10,
>>> +	ACC100_PF_INT_ARAM_ECC_1BIT_ERR = 11,
>>> +	ACC100_PF_INT_PARITY_ERR = 12,
>>> +	ACC100_PF_INT_QMGR_ERR = 13,
>>> +	ACC100_PF_INT_INT_REQ_OVERFLOW = 14,
>>> +	ACC100_PF_INT_APB_TIMEOUT = 15,
>>> +};
>>> +
>>> +#endif /* ACC100_PF_ENUM_H */
>>> diff --git a/drivers/baseband/acc100/acc100_vf_enum.h b/drivers/baseband/acc100/acc100_vf_enum.h
>>> new file mode 100644
>>> index 0000000..b512af3
>>> --- /dev/null
>>> +++ b/drivers/baseband/acc100/acc100_vf_enum.h
>>> @@ -0,0 +1,73 @@
>>> +/* SPDX-License-Identifier: BSD-3-Clause
>>> + * Copyright(c) 2017 Intel Corporation
>>> + */
>>> +
>>> +#ifndef ACC100_VF_ENUM_H
>>> +#define ACC100_VF_ENUM_H
>>> +
>>> +/*
>>> + * ACC100 Register mapping on VF BAR0
>>> + * This is automatically generated from RDL, format may change with new RDL
>>> + */
>>> +enum {
>>> +	HWVfQmgrIngressAq             =  0x00000000,
>>> +	HWVfHiVfToPfDbellVf           =  0x00000800,
>>> +	HWVfHiPfToVfDbellVf           =  0x00000808,
>>> +	HWVfHiInfoRingBaseLoVf        =  0x00000810,
>>> +	HWVfHiInfoRingBaseHiVf        =  0x00000814,
>>> +	HWVfHiInfoRingPointerVf       =  0x00000818,
>>> +	HWVfHiInfoRingIntWrEnVf       =  0x00000820,
>>> +	HWVfHiInfoRingPf2VfWrEnVf     =  0x00000824,
>>> +	HWVfHiMsixVectorMapperVf      =  0x00000860,
>>> +	HWVfDmaFec5GulDescBaseLoRegVf =  0x00000920,
>>> +	HWVfDmaFec5GulDescBaseHiRegVf =  0x00000924,
>>> +	HWVfDmaFec5GulRespPtrLoRegVf  =  0x00000928,
>>> +	HWVfDmaFec5GulRespPtrHiRegVf  =  0x0000092C,
>>> +	HWVfDmaFec5GdlDescBaseLoRegVf =  0x00000940,
>>> +	HWVfDmaFec5GdlDescBaseHiRegVf =  0x00000944,
>>> +	HWVfDmaFec5GdlRespPtrLoRegVf  =  0x00000948,
>>> +	HWVfDmaFec5GdlRespPtrHiRegVf  =  0x0000094C,
>>> +	HWVfDmaFec4GulDescBaseLoRegVf =  0x00000960,
>>> +	HWVfDmaFec4GulDescBaseHiRegVf =  0x00000964,
>>> +	HWVfDmaFec4GulRespPtrLoRegVf  =  0x00000968,
>>> +	HWVfDmaFec4GulRespPtrHiRegVf  =  0x0000096C,
>>> +	HWVfDmaFec4GdlDescBaseLoRegVf =  0x00000980,
>>> +	HWVfDmaFec4GdlDescBaseHiRegVf =  0x00000984,
>>> +	HWVfDmaFec4GdlRespPtrLoRegVf  =  0x00000988,
>>> +	HWVfDmaFec4GdlRespPtrHiRegVf  =  0x0000098C,
>>> +	HWVfDmaDdrBaseRangeRoVf       =  0x000009A0,
>>> +	HWVfQmgrAqResetVf             =  0x00000E00,
>>> +	HWVfQmgrRingSizeVf            =  0x00000E04,
>>> +	HWVfQmgrGrpDepthLog20Vf       =  0x00000E08,
>>> +	HWVfQmgrGrpDepthLog21Vf       =  0x00000E0C,
>>> +	HWVfQmgrGrpFunction0Vf        =  0x00000E10,
>>> +	HWVfQmgrGrpFunction1Vf        =  0x00000E14,
>>> +	HWVfPmACntrlRegVf             =  0x00000F40,
>>> +	HWVfPmACountVf                =  0x00000F48,
>>> +	HWVfPmAKCntLoVf               =  0x00000F50,
>>> +	HWVfPmAKCntHiVf               =  0x00000F54,
>>> +	HWVfPmADeltaCntLoVf           =  0x00000F60,
>>> +	HWVfPmADeltaCntHiVf           =  0x00000F64,
>>> +	HWVfPmBCntrlRegVf             =  0x00000F80,
>>> +	HWVfPmBCountVf                =  0x00000F88,
>>> +	HWVfPmBKCntLoVf               =  0x00000F90,
>>> +	HWVfPmBKCntHiVf               =  0x00000F94,
>>> +	HWVfPmBDeltaCntLoVf           =  0x00000FA0,
>>> +	HWVfPmBDeltaCntHiVf           =  0x00000FA4
>>> +};
>>> +
>>> +/* TIP VF Interrupt numbers */
>>> +enum {
>>> +	ACC100_VF_INT_QMGR_AQ_OVERFLOW = 0,
>>> +	ACC100_VF_INT_DOORBELL_VF_2_PF = 1,
>>> +	ACC100_VF_INT_DMA_DL_DESC_IRQ = 2,
>>> +	ACC100_VF_INT_DMA_UL_DESC_IRQ = 3,
>>> +	ACC100_VF_INT_DMA_MLD_DESC_IRQ = 4,
>>> +	ACC100_VF_INT_DMA_UL5G_DESC_IRQ = 5,
>>> +	ACC100_VF_INT_DMA_DL5G_DESC_IRQ = 6,
>>> +	ACC100_VF_INT_ILLEGAL_FORMAT = 7,
>>> +	ACC100_VF_INT_QMGR_DISABLED_ACCESS = 8,
>>> +	ACC100_VF_INT_QMGR_AQ_OVERTHRESHOLD = 9,
>>> +};
>>> +
>>> +#endif /* ACC100_VF_ENUM_H */
>>> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
>>> index 6f46df0..cd77570 100644
>>> --- a/drivers/baseband/acc100/rte_acc100_pmd.h
>>> +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
>>> @@ -5,6 +5,9 @@
>>>  #ifndef _RTE_ACC100_PMD_H_
>>>  #define _RTE_ACC100_PMD_H_
>>>
>>> +#include "acc100_pf_enum.h"
>>> +#include "acc100_vf_enum.h"
>>> +
>>>  /* Helper macro for logging */
>>>  #define rte_bbdev_log(level, fmt, ...) \
>>>  	rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \
>>> @@ -27,6 +30,493 @@
>>>  #define RTE_ACC100_PF_DEVICE_ID        (0x0d5c)
>>>  #define RTE_ACC100_VF_DEVICE_ID        (0x0d5d)
>>>
>>> +/* Define as 1 to use only a single FEC engine */
>>> +#ifndef RTE_ACC100_SINGLE_FEC
>>> +#define RTE_ACC100_SINGLE_FEC 0
>>> +#endif
>>> +
>>> +/* Values used in filling in descriptors */
>>> +#define ACC100_DMA_DESC_TYPE           2
>>> +#define ACC100_DMA_CODE_BLK_MODE       0
>>> +#define ACC100_DMA_BLKID_FCW           1
>>> +#define ACC100_DMA_BLKID_IN            2
>>> +#define ACC100_DMA_BLKID_OUT_ENC       1
>>> +#define ACC100_DMA_BLKID_OUT_HARD      1
>>> +#define ACC100_DMA_BLKID_OUT_SOFT      2
>>> +#define ACC100_DMA_BLKID_OUT_HARQ      3
>>> +#define ACC100_DMA_BLKID_IN_HARQ       3
>>> +
>>> +/* Values used in filling in decode FCWs */
>>> +#define ACC100_FCW_TD_VER              1
>>> +#define ACC100_FCW_TD_EXT_COLD_REG_EN  1
>>> +#define ACC100_FCW_TD_AUTOMAP          0x0f
>>> +#define ACC100_FCW_TD_RVIDX_0          2
>>> +#define ACC100_FCW_TD_RVIDX_1          26
>>> +#define ACC100_FCW_TD_RVIDX_2          50
>>> +#define ACC100_FCW_TD_RVIDX_3          74
>>> +
>>> +/* Values used in writing to the registers */
>>> +#define ACC100_REG_IRQ_EN_ALL          0x1FF83FF  /* Enable all interrupts */
>>> +
>>> +/* ACC100 Specific Dimensioning */
>>> +#define ACC100_SIZE_64MBYTE            (64*1024*1024)
>> A better name for this #define would be ACC100_MAX_RING_SIZE
>>
>> Similar for alloc_2x64mb_sw_rings_mem should be
>>
>> alloc_max_sw_rings_mem.
>>
> I am not convinced. I tend to believe this is actually more descriptive this way, and the
> concept of max ring size is something else. 

Ok, I see in later patches this is a special-purpose function and not

the general one I assumed it was.

Tom

>
>
>>> +/* Number of elements in an Info Ring */
>>> +#define ACC100_INFO_RING_NUM_ENTRIES   1024
>>> +/* Number of elements in HARQ layout memory */
>>> +#define ACC100_HARQ_LAYOUT             (64*1024*1024)
>>> +/* Assume offset for HARQ in memory */
>>> +#define ACC100_HARQ_OFFSET             (32*1024)
>>> +/* Mask used to calculate an index in an Info Ring array (not a byte offset) */
>>> +#define ACC100_INFO_RING_MASK    (ACC100_INFO_RING_NUM_ENTRIES-1)
>>> +/* Number of Virtual Functions ACC100 supports */
>>> +#define ACC100_NUM_VFS                  16
>>> +#define ACC100_NUM_QGRPS                 8
>>> +#define ACC100_NUM_QGRPS_PER_WORD        8
>>> +#define ACC100_NUM_AQS                  16
>>> +#define MAX_ENQ_BATCH_SIZE          255
>> little stuff, these define values should line up at least in the blocks.
> ok
>
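[Editor's note: as an aside on the ACC100_INFO_RING_MASK define quoted above — because the ring size is a power of two, a free-running ring pointer maps to an array index with a single AND, no modulo. A minimal standalone sketch, with the helper name chosen here for illustration:]

```c
#include <stdint.h>

#define ACC100_INFO_RING_NUM_ENTRIES 1024
#define ACC100_INFO_RING_MASK (ACC100_INFO_RING_NUM_ENTRIES - 1)

/* Turn a free-running ring pointer into a ring index; this only works
 * without a modulo because the entry count is a power of two. */
static inline uint16_t
info_ring_index(uint32_t ring_pointer)
{
	return (uint16_t)(ring_pointer & ACC100_INFO_RING_MASK);
}
```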
>>> +/* All ACC100 Registers alignment are 32bits = 4B */
>>> +#define BYTES_IN_WORD                 4
>> Common #define names should have ACC100_ prefix to lower chance of
>> name conflicts.
>>
>> Generally a good idea of all of them.
> You are right, ok.
>
>> Tom
>>
>>> +#define MAX_E_MBUF                64000
>>> +
>>> +#define GRP_ID_SHIFT    10 /* Queue Index Hierarchy */
>>> +#define VF_ID_SHIFT     4  /* Queue Index Hierarchy */
>>> +#define VF_OFFSET_QOS   16 /* offset in Memory Space specific to QoS Mon */
>>> +#define TMPL_PRI_0      0x03020100
>>> +#define TMPL_PRI_1      0x07060504
>>> +#define TMPL_PRI_2      0x0b0a0908
>>> +#define TMPL_PRI_3      0x0f0e0d0c
>>> +#define QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
>>> +#define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
>>> +
>>> +#define ACC100_NUM_TMPL  32
>>> +#define VF_OFFSET_QOS 16 /* offset in Memory Space specific to QoS Mon */
>>> +/* Mapping of signals for the available engines */
>>> +#define SIG_UL_5G      0
>>> +#define SIG_UL_5G_LAST 7
>>> +#define SIG_DL_5G      13
>>> +#define SIG_DL_5G_LAST 15
>>> +#define SIG_UL_4G      16
>>> +#define SIG_UL_4G_LAST 21
>>> +#define SIG_DL_4G      27
>>> +#define SIG_DL_4G_LAST 31
>>> +
>>> +/* max number of iterations to allocate memory block for all rings */
>>> +#define SW_RING_MEM_ALLOC_ATTEMPTS 5
>>> +#define MAX_QUEUE_DEPTH           1024
>>> +#define ACC100_DMA_MAX_NUM_POINTERS  14
>>> +#define ACC100_DMA_DESC_PADDING      8
>>> +#define ACC100_FCW_PADDING           12
>>> +#define ACC100_DESC_FCW_OFFSET       192
>>> +#define ACC100_DESC_SIZE             256
>>> +#define ACC100_DESC_OFFSET           (ACC100_DESC_SIZE / 64)
>>> +#define ACC100_FCW_TE_BLEN     32
>>> +#define ACC100_FCW_TD_BLEN     24
>>> +#define ACC100_FCW_LE_BLEN     32
>>> +#define ACC100_FCW_LD_BLEN     36
>>> +
>>> +#define ACC100_FCW_VER         2
>>> +#define MUX_5GDL_DESC 6
>>> +#define CMP_ENC_SIZE 20
>>> +#define CMP_DEC_SIZE 24
>>> +#define ENC_OFFSET (32)
>>> +#define DEC_OFFSET (80)
>>> +#define ACC100_EXT_MEM
>>> +#define ACC100_HARQ_OFFSET_THRESHOLD 1024
>>> +
>>> +/* Constants from K0 computation from 3GPP 38.212 Table 5.4.2.1-2 */
>>> +#define N_ZC_1 66 /* N = 66 Zc for BG 1 */
>>> +#define N_ZC_2 50 /* N = 50 Zc for BG 2 */
>>> +#define K0_1_1 17 /* K0 fraction numerator for rv 1 and BG 1 */
>>> +#define K0_1_2 13 /* K0 fraction numerator for rv 1 and BG 2 */
>>> +#define K0_2_1 33 /* K0 fraction numerator for rv 2 and BG 1 */
>>> +#define K0_2_2 25 /* K0 fraction numerator for rv 2 and BG 2 */
>>> +#define K0_3_1 56 /* K0 fraction numerator for rv 3 and BG 1 */
>>> +#define K0_3_2 43 /* K0 fraction numerator for rv 3 and BG 2 */
>>> +
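[Editor's note: the hunk above only carries the K0 fraction numerators. A standalone sketch of how such constants typically feed the K0 computation of 3GPP TS 38.212 Table 5.4.2.1-2 may help; `get_k0` here is an illustrative reconstruction, not the driver's actual helper.]

```c
#include <stdint.h>

#define N_ZC_1 66 /* N = 66 Zc for BG 1 */
#define N_ZC_2 50 /* N = 50 Zc for BG 2 */
#define K0_1_1 17 /* K0 fraction numerator for rv 1 and BG 1 */
#define K0_1_2 13 /* K0 fraction numerator for rv 1 and BG 2 */
#define K0_2_1 33 /* K0 fraction numerator for rv 2 and BG 1 */
#define K0_2_2 25 /* K0 fraction numerator for rv 2 and BG 2 */
#define K0_3_1 56 /* K0 fraction numerator for rv 3 and BG 1 */
#define K0_3_2 43 /* K0 fraction numerator for rv 3 and BG 2 */

/* Illustrative reconstruction (not the driver's helper): starting
 * position K0 of the LDPC circular buffer for a given redundancy
 * version, following 3GPP TS 38.212 Table 5.4.2.1-2. */
static inline uint16_t
get_k0(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv_index)
{
	uint32_t n_zc = (bg == 1) ? N_ZC_1 : N_ZC_2;
	uint32_t num;

	switch (rv_index) {
	case 0:
		return 0;
	case 1:
		num = (bg == 1) ? K0_1_1 : K0_1_2;
		break;
	case 2:
		num = (bg == 1) ? K0_2_1 : K0_2_2;
		break;
	default:
		num = (bg == 1) ? K0_3_1 : K0_3_2;
		break;
	}
	/* k0 = floor(num * Ncb / (N_zc * Zc)) * Zc */
	return (uint16_t)(((num * n_cb) / (n_zc * z_c)) * z_c);
}
```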
>>> +/* ACC100 Configuration */
>>> +#define ACC100_DDR_ECC_ENABLE
>>> +#define ACC100_CFG_DMA_ERROR 0x3D7
>>> +#define ACC100_CFG_AXI_CACHE 0x11
>>> +#define ACC100_CFG_QMGR_HI_P 0x0F0F
>>> +#define ACC100_CFG_PCI_AXI 0xC003
>>> +#define ACC100_CFG_PCI_BRIDGE 0x40006033
>>> +#define ACC100_ENGINE_OFFSET 0x1000
>>> +#define ACC100_RESET_HI 0x20100
>>> +#define ACC100_RESET_LO 0x20000
>>> +#define ACC100_RESET_HARD 0x1FF
>>> +#define ACC100_ENGINES_MAX 9
>>> +#define LONG_WAIT 1000
>>> +
>>> +/* ACC100 DMA Descriptor triplet */
>>> +struct acc100_dma_triplet {
>>> +	uint64_t address;
>>> +	uint32_t blen:20,
>>> +		res0:4,
>>> +		last:1,
>>> +		dma_ext:1,
>>> +		res1:2,
>>> +		blkid:4;
>>> +} __rte_packed;
>>> +
>>> +
>>> +
>>> +/* ACC100 DMA Response Descriptor */
>>> +union acc100_dma_rsp_desc {
>>> +	uint32_t val;
>>> +	struct {
>>> +		uint32_t crc_status:1,
>>> +			synd_ok:1,
>>> +			dma_err:1,
>>> +			neg_stop:1,
>>> +			fcw_err:1,
>>> +			output_err:1,
>>> +			input_err:1,
>>> +			timestampEn:1,
>>> +			iterCountFrac:8,
>>> +			iter_cnt:8,
>>> +			rsrvd3:6,
>>> +			sdone:1,
>>> +			fdone:1;
>>> +		uint32_t add_info_0;
>>> +		uint32_t add_info_1;
>>> +	};
>>> +};
>>> +
>>> +
>>> +/* ACC100 Queue Manager Enqueue PCI Register */
>>> +union acc100_enqueue_reg_fmt {
>>> +	uint32_t val;
>>> +	struct {
>>> +		uint32_t num_elem:8,
>>> +			addr_offset:3,
>>> +			rsrvd:1,
>>> +			req_elem_addr:20;
>>> +	};
>>> +};
>>> +
>>> +/* FEC 4G Uplink Frame Control Word */
>>> +struct __rte_packed acc100_fcw_td {
>>> +	uint8_t fcw_ver:4,
>>> +		num_maps:4; /* Unused */
>>> +	uint8_t filler:6, /* Unused */
>>> +		rsrvd0:1,
>>> +		bypass_sb_deint:1;
>>> +	uint16_t k_pos;
>>> +	uint16_t k_neg; /* Unused */
>>> +	uint8_t c_neg; /* Unused */
>>> +	uint8_t c; /* Unused */
>>> +	uint32_t ea; /* Unused */
>>> +	uint32_t eb; /* Unused */
>>> +	uint8_t cab; /* Unused */
>>> +	uint8_t k0_start_col; /* Unused */
>>> +	uint8_t rsrvd1;
>>> +	uint8_t code_block_mode:1, /* Unused */
>>> +		turbo_crc_type:1,
>>> +		rsrvd2:3,
>>> +		bypass_teq:1, /* Unused */
>>> +		soft_output_en:1, /* Unused */
>>> +		ext_td_cold_reg_en:1;
>>> +	union { /* External Cold register */
>>> +		uint32_t ext_td_cold_reg;
>>> +		struct {
>>> +			uint32_t min_iter:4, /* Unused */
>>> +				max_iter:4,
>>> +				ext_scale:5, /* Unused */
>>> +				rsrvd3:3,
>>> +				early_stop_en:1, /* Unused */
>>> +				sw_soft_out_dis:1, /* Unused */
>>> +				sw_et_cont:1, /* Unused */
>>> +				sw_soft_out_saturation:1, /* Unused */
>>> +				half_iter_on:1, /* Unused */
>>> +				raw_decoder_input_on:1, /* Unused */
>>> +				rsrvd4:10;
>>> +		};
>>> +	};
>>> +};
>>> +
>>> +/* FEC 5GNR Uplink Frame Control Word */
>>> +struct __rte_packed acc100_fcw_ld {
>>> +	uint32_t FCWversion:4,
>>> +		qm:4,
>>> +		nfiller:11,
>>> +		BG:1,
>>> +		Zc:9,
>>> +		res0:1,
>>> +		synd_precoder:1,
>>> +		synd_post:1;
>>> +	uint32_t ncb:16,
>>> +		k0:16;
>>> +	uint32_t rm_e:24,
>>> +		hcin_en:1,
>>> +		hcout_en:1,
>>> +		crc_select:1,
>>> +		bypass_dec:1,
>>> +		bypass_intlv:1,
>>> +		so_en:1,
>>> +		so_bypass_rm:1,
>>> +		so_bypass_intlv:1;
>>> +	uint32_t hcin_offset:16,
>>> +		hcin_size0:16;
>>> +	uint32_t hcin_size1:16,
>>> +		hcin_decomp_mode:3,
>>> +		llr_pack_mode:1,
>>> +		hcout_comp_mode:3,
>>> +		res2:1,
>>> +		dec_convllr:4,
>>> +		hcout_convllr:4;
>>> +	uint32_t itmax:7,
>>> +		itstop:1,
>>> +		so_it:7,
>>> +		res3:1,
>>> +		hcout_offset:16;
>>> +	uint32_t hcout_size0:16,
>>> +		hcout_size1:16;
>>> +	uint32_t gain_i:8,
>>> +		gain_h:8,
>>> +		negstop_th:16;
>>> +	uint32_t negstop_it:7,
>>> +		negstop_en:1,
>>> +		res4:24;
>>> +};
>>> +
>>> +/* FEC 4G Downlink Frame Control Word */
>>> +struct __rte_packed acc100_fcw_te {
>>> +	uint16_t k_neg;
>>> +	uint16_t k_pos;
>>> +	uint8_t c_neg;
>>> +	uint8_t c;
>>> +	uint8_t filler;
>>> +	uint8_t cab;
>>> +	uint32_t ea:17,
>>> +		rsrvd0:15;
>>> +	uint32_t eb:17,
>>> +		rsrvd1:15;
>>> +	uint16_t ncb_neg;
>>> +	uint16_t ncb_pos;
>>> +	uint8_t rv_idx0:2,
>>> +		rsrvd2:2,
>>> +		rv_idx1:2,
>>> +		rsrvd3:2;
>>> +	uint8_t bypass_rv_idx0:1,
>>> +		bypass_rv_idx1:1,
>>> +		bypass_rm:1,
>>> +		rsrvd4:5;
>>> +	uint8_t rsrvd5:1,
>>> +		rsrvd6:3,
>>> +		code_block_crc:1,
>>> +		rsrvd7:3;
>>> +	uint8_t code_block_mode:1,
>>> +		rsrvd8:7;
>>> +	uint64_t rsrvd9;
>>> +};
>>> +
>>> +/* FEC 5GNR Downlink Frame Control Word */
>>> +struct __rte_packed acc100_fcw_le {
>>> +	uint32_t FCWversion:4,
>>> +		qm:4,
>>> +		nfiller:11,
>>> +		BG:1,
>>> +		Zc:9,
>>> +		res0:3;
>>> +	uint32_t ncb:16,
>>> +		k0:16;
>>> +	uint32_t rm_e:24,
>>> +		res1:2,
>>> +		crc_select:1,
>>> +		res2:1,
>>> +		bypass_intlv:1,
>>> +		res3:3;
>>> +	uint32_t res4_a:12,
>>> +		mcb_count:3,
>>> +		res4_b:17;
>>> +	uint32_t res5;
>>> +	uint32_t res6;
>>> +	uint32_t res7;
>>> +	uint32_t res8;
>>> +};
>>> +
>>> +/* ACC100 DMA Request Descriptor */
>>> +struct __rte_packed acc100_dma_req_desc {
>>> +	union {
>>> +		struct{
>>> +			uint32_t type:4,
>>> +				rsrvd0:26,
>>> +				sdone:1,
>>> +				fdone:1;
>>> +			uint32_t rsrvd1;
>>> +			uint32_t rsrvd2;
>>> +			uint32_t pass_param:8,
>>> +				sdone_enable:1,
>>> +				irq_enable:1,
>>> +				timeStampEn:1,
>>> +				res0:5,
>>> +				numCBs:4,
>>> +				res1:4,
>>> +				m2dlen:4,
>>> +				d2mlen:4;
>>> +		};
>>> +		struct{
>>> +			uint32_t word0;
>>> +			uint32_t word1;
>>> +			uint32_t word2;
>>> +			uint32_t word3;
>>> +		};
>>> +	};
>>> +	struct acc100_dma_triplet data_ptrs[ACC100_DMA_MAX_NUM_POINTERS];
>>> +
>>> +	/* Virtual addresses used to retrieve SW context info */
>>> +	union {
>>> +		void *op_addr;
>>> +		uint64_t pad1;  /* pad to 64 bits */
>>> +	};
>>> +	/*
>>> +	 * Stores additional information needed for driver processing:
>>> +	 * - last_desc_in_batch - flag used to mark last descriptor (CB)
>>> +	 *                        in batch
>>> +	 * - cbs_in_tb - stores information about total number of Code Blocks
>>> +	 *               in currently processed Transport Block
>>> +	 */
>>> +	union {
>>> +		struct {
>>> +			union {
>>> +				struct acc100_fcw_ld fcw_ld;
>>> +				struct acc100_fcw_td fcw_td;
>>> +				struct acc100_fcw_le fcw_le;
>>> +				struct acc100_fcw_te fcw_te;
>>> +				uint32_t pad2[ACC100_FCW_PADDING];
>>> +			};
>>> +			uint32_t last_desc_in_batch :8,
>>> +				cbs_in_tb:8,
>>> +				pad4 : 16;
>>> +		};
>>> +		uint64_t pad3[ACC100_DMA_DESC_PADDING]; /* pad to 64 bits */
>>> +	};
>>> +};
>>> +
>>> +/* ACC100 DMA Descriptor */
>>> +union acc100_dma_desc {
>>> +	struct acc100_dma_req_desc req;
>>> +	union acc100_dma_rsp_desc rsp;
>>> +};
>>> +
>>> +
>>> +/* Union describing Info Ring entry */
>>> +union acc100_harq_layout_data {
>>> +	uint32_t val;
>>> +	struct {
>>> +		uint16_t offset;
>>> +		uint16_t size0;
>>> +	};
>>> +} __rte_packed;
>>> +
>>> +
>>> +/* Union describing Info Ring entry */
>>> +union acc100_info_ring_data {
>>> +	uint32_t val;
>>> +	struct {
>>> +		union {
>>> +			uint16_t detailed_info;
>>> +			struct {
>>> +				uint16_t aq_id: 4;
>>> +				uint16_t qg_id: 4;
>>> +				uint16_t vf_id: 6;
>>> +				uint16_t reserved: 2;
>>> +			};
>>> +		};
>>> +		uint16_t int_nb: 7;
>>> +		uint16_t msi_0: 1;
>>> +		uint16_t vf2pf: 6;
>>> +		uint16_t loop: 1;
>>> +		uint16_t valid: 1;
>>> +	};
>>> +} __rte_packed;
>>> +
>>> +struct acc100_registry_addr {
>>> +	unsigned int dma_ring_dl5g_hi;
>>> +	unsigned int dma_ring_dl5g_lo;
>>> +	unsigned int dma_ring_ul5g_hi;
>>> +	unsigned int dma_ring_ul5g_lo;
>>> +	unsigned int dma_ring_dl4g_hi;
>>> +	unsigned int dma_ring_dl4g_lo;
>>> +	unsigned int dma_ring_ul4g_hi;
>>> +	unsigned int dma_ring_ul4g_lo;
>>> +	unsigned int ring_size;
>>> +	unsigned int info_ring_hi;
>>> +	unsigned int info_ring_lo;
>>> +	unsigned int info_ring_en;
>>> +	unsigned int info_ring_ptr;
>>> +	unsigned int tail_ptrs_dl5g_hi;
>>> +	unsigned int tail_ptrs_dl5g_lo;
>>> +	unsigned int tail_ptrs_ul5g_hi;
>>> +	unsigned int tail_ptrs_ul5g_lo;
>>> +	unsigned int tail_ptrs_dl4g_hi;
>>> +	unsigned int tail_ptrs_dl4g_lo;
>>> +	unsigned int tail_ptrs_ul4g_hi;
>>> +	unsigned int tail_ptrs_ul4g_lo;
>>> +	unsigned int depth_log0_offset;
>>> +	unsigned int depth_log1_offset;
>>> +	unsigned int qman_group_func;
>>> +	unsigned int ddr_range;
>>> +};
>>> +
>>> +/* Structure holding registry addresses for PF */
>>> +static const struct acc100_registry_addr pf_reg_addr = {
>>> +	.dma_ring_dl5g_hi = HWPfDmaFec5GdlDescBaseHiRegVf,
>>> +	.dma_ring_dl5g_lo = HWPfDmaFec5GdlDescBaseLoRegVf,
>>> +	.dma_ring_ul5g_hi = HWPfDmaFec5GulDescBaseHiRegVf,
>>> +	.dma_ring_ul5g_lo = HWPfDmaFec5GulDescBaseLoRegVf,
>>> +	.dma_ring_dl4g_hi = HWPfDmaFec4GdlDescBaseHiRegVf,
>>> +	.dma_ring_dl4g_lo = HWPfDmaFec4GdlDescBaseLoRegVf,
>>> +	.dma_ring_ul4g_hi = HWPfDmaFec4GulDescBaseHiRegVf,
>>> +	.dma_ring_ul4g_lo = HWPfDmaFec4GulDescBaseLoRegVf,
>>> +	.ring_size = HWPfQmgrRingSizeVf,
>>> +	.info_ring_hi = HWPfHiInfoRingBaseHiRegPf,
>>> +	.info_ring_lo = HWPfHiInfoRingBaseLoRegPf,
>>> +	.info_ring_en = HWPfHiInfoRingIntWrEnRegPf,
>>> +	.info_ring_ptr = HWPfHiInfoRingPointerRegPf,
>>> +	.tail_ptrs_dl5g_hi = HWPfDmaFec5GdlRespPtrHiRegVf,
>>> +	.tail_ptrs_dl5g_lo = HWPfDmaFec5GdlRespPtrLoRegVf,
>>> +	.tail_ptrs_ul5g_hi = HWPfDmaFec5GulRespPtrHiRegVf,
>>> +	.tail_ptrs_ul5g_lo = HWPfDmaFec5GulRespPtrLoRegVf,
>>> +	.tail_ptrs_dl4g_hi = HWPfDmaFec4GdlRespPtrHiRegVf,
>>> +	.tail_ptrs_dl4g_lo = HWPfDmaFec4GdlRespPtrLoRegVf,
>>> +	.tail_ptrs_ul4g_hi = HWPfDmaFec4GulRespPtrHiRegVf,
>>> +	.tail_ptrs_ul4g_lo = HWPfDmaFec4GulRespPtrLoRegVf,
>>> +	.depth_log0_offset = HWPfQmgrGrpDepthLog20Vf,
>>> +	.depth_log1_offset = HWPfQmgrGrpDepthLog21Vf,
>>> +	.qman_group_func = HWPfQmgrGrpFunction0,
>>> +	.ddr_range = HWPfDmaVfDdrBaseRw,
>>> +};
>>> +
>>> +/* Structure holding registry addresses for VF */
>>> +static const struct acc100_registry_addr vf_reg_addr = {
>>> +	.dma_ring_dl5g_hi = HWVfDmaFec5GdlDescBaseHiRegVf,
>>> +	.dma_ring_dl5g_lo = HWVfDmaFec5GdlDescBaseLoRegVf,
>>> +	.dma_ring_ul5g_hi = HWVfDmaFec5GulDescBaseHiRegVf,
>>> +	.dma_ring_ul5g_lo = HWVfDmaFec5GulDescBaseLoRegVf,
>>> +	.dma_ring_dl4g_hi = HWVfDmaFec4GdlDescBaseHiRegVf,
>>> +	.dma_ring_dl4g_lo = HWVfDmaFec4GdlDescBaseLoRegVf,
>>> +	.dma_ring_ul4g_hi = HWVfDmaFec4GulDescBaseHiRegVf,
>>> +	.dma_ring_ul4g_lo = HWVfDmaFec4GulDescBaseLoRegVf,
>>> +	.ring_size = HWVfQmgrRingSizeVf,
>>> +	.info_ring_hi = HWVfHiInfoRingBaseHiVf,
>>> +	.info_ring_lo = HWVfHiInfoRingBaseLoVf,
>>> +	.info_ring_en = HWVfHiInfoRingIntWrEnVf,
>>> +	.info_ring_ptr = HWVfHiInfoRingPointerVf,
>>> +	.tail_ptrs_dl5g_hi = HWVfDmaFec5GdlRespPtrHiRegVf,
>>> +	.tail_ptrs_dl5g_lo = HWVfDmaFec5GdlRespPtrLoRegVf,
>>> +	.tail_ptrs_ul5g_hi = HWVfDmaFec5GulRespPtrHiRegVf,
>>> +	.tail_ptrs_ul5g_lo = HWVfDmaFec5GulRespPtrLoRegVf,
>>> +	.tail_ptrs_dl4g_hi = HWVfDmaFec4GdlRespPtrHiRegVf,
>>> +	.tail_ptrs_dl4g_lo = HWVfDmaFec4GdlRespPtrLoRegVf,
>>> +	.tail_ptrs_ul4g_hi = HWVfDmaFec4GulRespPtrHiRegVf,
>>> +	.tail_ptrs_ul4g_lo = HWVfDmaFec4GulRespPtrLoRegVf,
>>> +	.depth_log0_offset = HWVfQmgrGrpDepthLog20Vf,
>>> +	.depth_log1_offset = HWVfQmgrGrpDepthLog21Vf,
>>> +	.qman_group_func = HWVfQmgrGrpFunction0Vf,
>>> +	.ddr_range = HWVfDmaDdrBaseRangeRoVf,
>>> +};
>>> +
>>>  /* Private data structure for each ACC100 device */
>>>  struct acc100_device {
>>>  	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v9 03/10] baseband/acc100: add info get function
  2020-09-30  0:25         ` Chautru, Nicolas
@ 2020-09-30 23:20           ` Tom Rix
  0 siblings, 0 replies; 213+ messages in thread
From: Tom Rix @ 2020-09-30 23:20 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, akhil.goyal
  Cc: Richardson, Bruce, Xu, Rosen, dave.burley, aidan.goddard, Yigit,
	Ferruh, Liu, Tianjiao


On 9/29/20 5:25 PM, Chautru, Nicolas wrote:
> Hi Tom, 
>
>> From: Tom Rix <trix@redhat.com>
>> On 9/28/20 5:29 PM, Nicolas Chautru wrote:
>>> Add in the "info_get" function to the driver, to allow us to query the
>>> device.
>>> No processing capabilities are available yet.
>>> Linking bbdev-test to support the PMD with null capability.
>>>
>>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
>>> Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
>>> ---
>>>  app/test-bbdev/meson.build               |   3 +
>>>  drivers/baseband/acc100/rte_acc100_cfg.h |  96 +++++++++++++
>>> drivers/baseband/acc100/rte_acc100_pmd.c | 225 +++++++++++++++++++++++++++++++
>>>  drivers/baseband/acc100/rte_acc100_pmd.h |   3 +
>>>  4 files changed, 327 insertions(+)
>>>  create mode 100644 drivers/baseband/acc100/rte_acc100_cfg.h
>>>
>>> diff --git a/app/test-bbdev/meson.build b/app/test-bbdev/meson.build
>>> index 18ab6a8..fbd8ae3 100644
>>> --- a/app/test-bbdev/meson.build
>>> +++ b/app/test-bbdev/meson.build
>>> @@ -12,3 +12,6 @@ endif
>>>  if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC')
>>>  	deps += ['pmd_bbdev_fpga_5gnr_fec']
>>>  endif
>>> +if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_ACC100')
>>> +	deps += ['pmd_bbdev_acc100']
>>> +endif
>>> \ No newline at end of file
>>> diff --git a/drivers/baseband/acc100/rte_acc100_cfg.h b/drivers/baseband/acc100/rte_acc100_cfg.h
>>> new file mode 100644
>>> index 0000000..73bbe36
>>> --- /dev/null
>>> +++ b/drivers/baseband/acc100/rte_acc100_cfg.h
>>> @@ -0,0 +1,96 @@
>>> +/* SPDX-License-Identifier: BSD-3-Clause
>>> + * Copyright(c) 2020 Intel Corporation
>>> + */
>>> +
>>> +#ifndef _RTE_ACC100_CFG_H_
>>> +#define _RTE_ACC100_CFG_H_
>>> +
>>> +/**
>>> + * @file rte_acc100_cfg.h
>>> + *
>>> + * Functions for configuring ACC100 HW, exposed directly to applications.
>>> + * Configuration related to encoding/decoding is done through the
>>> + * librte_bbdev library.
>>> + *
>>> + * @warning
>>> + * @b EXPERIMENTAL: this API may change without prior notice
>> When will this experimental tag be removed ?
> I have pushed a patch to remove it. But the feedback from some of the community was to wait a bit more until next year.
Ok, I am late to the party. I'll chip in when this issue comes around again.
>
>>> + */
>>> +
>>> +#include <stdint.h>
>>> +#include <stdbool.h>
>>> +
>>> +#ifdef __cplusplus
>>> +extern "C" {
>>> +#endif
>>> +/**< Number of Virtual Functions ACC100 supports */
>>> +#define RTE_ACC100_NUM_VFS 16
>> This is already defined with ACC100_NUM_VFS
> Thanks. 
>
>>> +
>>> +/**
>>> + * Definition of Queue Topology for ACC100 Configuration
>>> + * Some level of details is abstracted out to expose a clean interface
>>> + * given that comprehensive flexibility is not required
>>> + */
>>> +struct rte_q_topology_t {
>>> +	/** Number of QGroups in incremental order of priority */
>>> +	uint16_t num_qgroups;
>>> +	/**
>>> +	 * All QGroups have the same number of AQs here.
>>> +	 * Note : Could be made a 16-array if more flexibility is really
>>> +	 * required
>>> +	 */
>>> +	uint16_t num_aqs_per_groups;
>>> +	/**
>>> +	 * Depth of the AQs is the same of all QGroups here. Log2 Enum : 2^N
>>> +	 * Note : Could be made a 16-array if more flexibility is really
>>> +	 * required
>>> +	 */
>>> +	uint16_t aq_depth_log2;
>>> +	/**
>>> +	 * Index of the first Queue Group Index - assuming contiguity
>>> +	 * Initialized as -1
>>> +	 */
>>> +	int8_t first_qgroup_index;
>>> +};
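[Editor's note: to make the topology fields concrete, here is a small hypothetical helper — not part of the patch — showing how they combine: num_qgroups groups of num_aqs_per_groups atomic queues, each 2^aq_depth_log2 entries deep.]

```c
#include <stdint.h>

/* Hypothetical helper (not in the patch): total descriptor slots
 * implied by one queue topology. */
static inline uint32_t
total_aq_slots(uint16_t num_qgroups, uint16_t num_aqs_per_groups,
		uint16_t aq_depth_log2)
{
	return (uint32_t)num_qgroups * num_aqs_per_groups *
			(1u << aq_depth_log2);
}
```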
>>> +
>>> +/**
>>> + * Definition of Arbitration related parameters for ACC100 Configuration
>>> + */
>>> +struct rte_arbitration_t {
>>> +	/** Default Weight for VF Fairness Arbitration */
>>> +	uint16_t round_robin_weight;
>>> +	uint32_t gbr_threshold1; /**< Guaranteed Bitrate Threshold 1 */
>>> +	uint32_t gbr_threshold2; /**< Guaranteed Bitrate Threshold 2 */
>>> +};
>>> +
>>> +/**
>>> + * Structure to pass ACC100 configuration.
>>> + * Note: all VF Bundles will have the same configuration.
>>> + */
>>> +struct acc100_conf {
>>> +	bool pf_mode_en; /**< 1 if PF is used for dataplane, 0 for VFs */
>>> +	/** 1 if input '1' bit is represented by a positive LLR value, 0 if '1'
>>> +	 * bit is represented by a negative value.
>>> +	 */
>>> +	bool input_pos_llr_1_bit;
>>> +	/** 1 if output '1' bit is represented by a positive value, 0 if '1'
>>> +	 * bit is represented by a negative value.
>>> +	 */
>>> +	bool output_pos_llr_1_bit;
>>> +	uint16_t num_vf_bundles; /**< Number of VF bundles to setup */
>>> +	/** Queue topology for each operation type */
>>> +	struct rte_q_topology_t q_ul_4g;
>>> +	struct rte_q_topology_t q_dl_4g;
>>> +	struct rte_q_topology_t q_ul_5g;
>>> +	struct rte_q_topology_t q_dl_5g;
>>> +	/** Arbitration configuration for each operation type */
>>> +	struct rte_arbitration_t arb_ul_4g[RTE_ACC100_NUM_VFS];
>>> +	struct rte_arbitration_t arb_dl_4g[RTE_ACC100_NUM_VFS];
>>> +	struct rte_arbitration_t arb_ul_5g[RTE_ACC100_NUM_VFS];
>>> +	struct rte_arbitration_t arb_dl_5g[RTE_ACC100_NUM_VFS];
>>> +};
>>> +
>>> +#ifdef __cplusplus
>>> +}
>>> +#endif
>>> +
>>> +#endif /* _RTE_ACC100_CFG_H_ */
>>> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
>>> index 1b4cd13..7807a30 100644
>>> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
>>> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
>>> @@ -26,6 +26,184 @@
>>>  RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
>>>  #endif
>>>
>>> +/* Read a register of a ACC100 device */
>>> +static inline uint32_t
>>> +acc100_reg_read(struct acc100_device *d, uint32_t offset)
>>> +{
>>> +	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
>>> +	uint32_t ret = *((volatile uint32_t *)(reg_addr));
>>> +	return rte_le_to_cpu_32(ret);
>>> +}
>>> +
>>> +/* Calculate the offset of the enqueue register */
>>> +static inline uint32_t
>>> +queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
>>> +{
>>> +	if (pf_device)
>>> +		return ((vf_id << 12) + (qgrp_id << 7) + (aq_id << 3) +
>>> +				HWPfQmgrIngressAq);
>>> +	else
>>> +		return ((qgrp_id << 7) + (aq_id << 3) +
>>> +				HWVfQmgrIngressAq);
>> Could you add *QmgrIngressAq to the acc100_registry_addr and skip the
>> if (pf_device) check?
> I am not convinced. That acc100_registry_addr is not kept with the driver; you
> still need to check the pf_device flag to know which registers are to be used.

ok

The values of the PF and VF regs are tantalizingly close;
my first inclination is to abstract all of them.
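To make the abstraction being weighed here concrete, one hedged sketch (all `_sketch` names are illustrative assumptions, not the driver's actual API) stores the ingress AQ base in the per-device registry table, so the offset helper itself needs no pf_device branch, as long as callers always pass vf_id = 0 on VF devices:

```c
#include <stdint.h>

/* Hypothetical registry entry: initialized once to HWPfQmgrIngressAq on a
 * PF device or HWVfQmgrIngressAq on a VF device. */
struct acc100_registry_addr_sketch {
	uint32_t qmgr_ingress_aq;
};

/* Branch-free variant of queue_offset(): the (vf_id << 12) term is a no-op
 * on VF devices provided vf_id is always passed as 0 there. */
static inline uint32_t
queue_offset_sketch(const struct acc100_registry_addr_sketch *reg_addr,
		uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
{
	return ((uint32_t)vf_id << 12) + ((uint32_t)qgrp_id << 7) +
			((uint32_t)aq_id << 3) + reg_addr->qmgr_ingress_aq;
}
```

Whether this is worth it depends on the point made above: the registry table itself still has to be selected by checking pf_device once at init time.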

>
>
>>> +}
>>> +
>>> +enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC};
>>> +
>>> +/* Return the queue topology for a Queue Group Index */
>>> +static inline void
>>> +qtopFromAcc(struct rte_q_topology_t **qtop, int acc_enum,
>>> +		struct acc100_conf *acc100_conf)
>>> +{
>>> +	struct rte_q_topology_t *p_qtop;
>>> +	p_qtop = NULL;
>>> +	switch (acc_enum) {
>>> +	case UL_4G:
>>> +		p_qtop = &(acc100_conf->q_ul_4g);
>>> +		break;
>>> +	case UL_5G:
>>> +		p_qtop = &(acc100_conf->q_ul_5g);
>>> +		break;
>>> +	case DL_4G:
>>> +		p_qtop = &(acc100_conf->q_dl_4g);
>>> +		break;
>>> +	case DL_5G:
>>> +		p_qtop = &(acc100_conf->q_dl_5g);
>>> +		break;
>>> +	default:
>>> +		/* NOTREACHED */
>>> +		rte_bbdev_log(ERR, "Unexpected error evaluating qtopFromAcc");
>> Use in fetch_acc100_config does not check for NULL.
> Yes, because it can't be NULL. This function is called explicitly for supported values.
ok
>>> +		break;
>>> +	}
>>> +	*qtop = p_qtop;
>>> +}
>>> +
>>> +static void
>>> +initQTop(struct acc100_conf *acc100_conf)
>>> +{
>>> +	acc100_conf->q_ul_4g.num_aqs_per_groups = 0;
>>> +	acc100_conf->q_ul_4g.num_qgroups = 0;
>>> +	acc100_conf->q_ul_4g.first_qgroup_index = -1;
>>> +	acc100_conf->q_ul_5g.num_aqs_per_groups = 0;
>>> +	acc100_conf->q_ul_5g.num_qgroups = 0;
>>> +	acc100_conf->q_ul_5g.first_qgroup_index = -1;
>>> +	acc100_conf->q_dl_4g.num_aqs_per_groups = 0;
>>> +	acc100_conf->q_dl_4g.num_qgroups = 0;
>>> +	acc100_conf->q_dl_4g.first_qgroup_index = -1;
>>> +	acc100_conf->q_dl_5g.num_aqs_per_groups = 0;
>>> +	acc100_conf->q_dl_5g.num_qgroups = 0;
>>> +	acc100_conf->q_dl_5g.first_qgroup_index = -1;
>>> +}
>>> +
>>> +static inline void
>>> +updateQtop(uint8_t acc, uint8_t qg, struct acc100_conf *acc100_conf,
>>> +		struct acc100_device *d)
>>> +{
>>> +	uint32_t reg;
>>> +	struct rte_q_topology_t *q_top = NULL;
>>> +	qtopFromAcc(&q_top, acc, acc100_conf);
>>> +	if (unlikely(q_top == NULL))
>>> +		return;
>> as above, this error is not handled by caller fetch_acc100_config
> It cannot really fail for fetch_acc100_config. If you insist, I can add it.

An alternative to handling it is to print a debug message.

You could do this generally for the other checks I commented on.

>>> +	uint16_t aq;
>>> +	q_top->num_qgroups++;
>>> +	if (q_top->first_qgroup_index == -1) {
>>> +		q_top->first_qgroup_index = qg;
>>> +		/* Can be optimized to assume all are enabled by default */
>>> +		reg = acc100_reg_read(d, queue_offset(d->pf_device,
>>> +				0, qg, ACC100_NUM_AQS - 1));
>>> +		if (reg & QUEUE_ENABLE) {
>>> +			q_top->num_aqs_per_groups = ACC100_NUM_AQS;
>>> +			return;
>>> +		}
>>> +		q_top->num_aqs_per_groups = 0;
>>> +		for (aq = 0; aq < ACC100_NUM_AQS; aq++) {
>>> +			reg = acc100_reg_read(d, queue_offset(d->pf_device,
>>> +					0, qg, aq));
>>> +			if (reg & QUEUE_ENABLE)
>>> +				q_top->num_aqs_per_groups++;
>>> +		}
>>> +	}
>>> +}
>>> +
>>> +/* Fetch configuration enabled for the PF/VF using MMIO Read (slow) */
>>> +static inline void
>>> +fetch_acc100_config(struct rte_bbdev *dev)
>>> +{
>>> +	struct acc100_device *d = dev->data->dev_private;
>>> +	struct acc100_conf *acc100_conf = &d->acc100_conf;
>>> +	const struct acc100_registry_addr *reg_addr;
>>> +	uint8_t acc, qg;
>>> +	uint32_t reg, reg_aq, reg_len0, reg_len1;
>>> +	uint32_t reg_mode;
>>> +
>>> +	/* No need to retrieve the configuration if it is already done */
>>> +	if (d->configured)
>>> +		return;
>> Warn ?
> No, this can genuinely happen on a regular basis; there is just no need to fetch it all again.

ok

Tom

>
>>> +
>>> +	/* Choose correct registry addresses for the device type */
>>> +	if (d->pf_device)
>>> +		reg_addr = &pf_reg_addr;
>>> +	else
>>> +		reg_addr = &vf_reg_addr;
>>> +
>>> +	d->ddr_size = (1 + acc100_reg_read(d, reg_addr->ddr_range)) << 10;
>>> +
>>> +	/* Single VF Bundle by VF */
>>> +	acc100_conf->num_vf_bundles = 1;
>>> +	initQTop(acc100_conf);
>>> +
>>> +	struct rte_q_topology_t *q_top = NULL;
>>> +	int qman_func_id[5] = {0, 2, 1, 3, 4};
>> Do these magic numbers need #defines ?
> ok. 
>
>>> +	reg = acc100_reg_read(d, reg_addr->qman_group_func);
>>> +	for (qg = 0; qg < ACC100_NUM_QGRPS_PER_WORD; qg++) {
>>> +		reg_aq = acc100_reg_read(d,
>>> +				queue_offset(d->pf_device, 0, qg, 0));
>>> +		if (reg_aq & QUEUE_ENABLE) {
>>> +			acc = qman_func_id[(reg >> (qg * 4)) & 0x7];
>> 0x7 and [5], this could overflow.
> ok, thanks, I can add exception handling. Not clear to me right now why it did not trigger a tool warning.
>
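A minimal sketch of the exception handling agreed above: the 3-bit field read from the Qmgr grouping register can take the values 5-7, which would read past the five-entry lookup table, so the index is validated first. The names here are illustrative only, not the patch's final code.

```c
#include <stdint.h>

#define NUM_ACC_SKETCH 5 /* entries in the UL_4G..DL_5G function table */

static const int qman_func_id_sketch[NUM_ACC_SKETCH] = {0, 2, 1, 3, 4};

/* Decode the accelerator enum for queue group qg from the grouping
 * register, returning -1 instead of indexing out of bounds on 5..7. */
static int
qman_func_from_reg_sketch(uint32_t reg, unsigned int qg)
{
	unsigned int idx = (reg >> (qg * 4)) & 0x7;

	if (idx >= NUM_ACC_SKETCH)
		return -1; /* unexpected grouping: skip this queue group */
	return qman_func_id_sketch[idx];
}
```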
>>> +			updateQtop(acc, qg, acc100_conf, d);
>>> +		}
>>> +	}
>>> +
>>> +	/* Check the depth of the AQs */
>>> +	reg_len0 = acc100_reg_read(d, reg_addr->depth_log0_offset);
>>> +	reg_len1 = acc100_reg_read(d, reg_addr->depth_log1_offset);
>>> +	for (acc = 0; acc < NUM_ACC; acc++) {
>>> +		qtopFromAcc(&q_top, acc, acc100_conf);
>>> +		if (q_top->first_qgroup_index < ACC100_NUM_QGRPS_PER_WORD)
>>> +			q_top->aq_depth_log2 = (reg_len0 >>
>>> +					(q_top->first_qgroup_index * 4))
>>> +					& 0xF;
>>> +		else
>>> +			q_top->aq_depth_log2 = (reg_len1 >>
>>> +					((q_top->first_qgroup_index -
>>> +					ACC100_NUM_QGRPS_PER_WORD) * 4))
>>> +					& 0xF;
>>> +	}
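For reference, the depth decode in the loop above packs one 4-bit log2 depth per queue group, eight groups per 32-bit register; it can be sketched in isolation as follows (the constant is an assumed value, not taken from the actual header):

```c
#include <stdint.h>

#define QGRPS_PER_WORD_SKETCH 8 /* assumed ACC100_NUM_QGRPS_PER_WORD */

/* Extract the log2 AQ depth for a queue group from the two depth
 * registers: groups 0..7 live in reg_len0, groups 8..15 in reg_len1. */
static uint8_t
aq_depth_log2_sketch(uint32_t reg_len0, uint32_t reg_len1,
		unsigned int qgroup)
{
	if (qgroup < QGRPS_PER_WORD_SKETCH)
		return (reg_len0 >> (qgroup * 4)) & 0xF;
	return (reg_len1 >> ((qgroup - QGRPS_PER_WORD_SKETCH) * 4)) & 0xF;
}
```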
>>> +
>>> +	/* Read PF mode */
>>> +	if (d->pf_device) {
>>> +		reg_mode = acc100_reg_read(d, HWPfHiPfMode);
>>> +		acc100_conf->pf_mode_en = (reg_mode == 2) ? 1 : 0;
>> 2 is a magic number, consider a #define
>>
> ok
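A possible shape for that #define, assuming the value 2 in HWPfHiPfMode means the PF runs the dataplane; both the macro name and that interpretation are guesses from the quoted code, not the register specification.

```c
#include <stdint.h>

/* Hypothetical name for HWPfHiPfMode value 2 ("PF runs the dataplane"). */
#define ACC100_PF_MODE_DATAPLANE_SKETCH 2

/* Decode pf_mode_en from the raw register value. */
static inline int
pf_mode_en_from_reg_sketch(uint32_t reg_mode)
{
	return (reg_mode == ACC100_PF_MODE_DATAPLANE_SKETCH) ? 1 : 0;
}
```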
>
>> Tom
>>
> Thanks
> Nic
>
>>> +	}
>>> +
>>> +	rte_bbdev_log_debug(
>>> +			"%s Config LLR SIGN IN/OUT %s %s QG %u %u %u %u AQ %u %u %u %u Len %u %u %u %u\n",
>>> +			(d->pf_device) ? "PF" : "VF",
>>> +			(acc100_conf->input_pos_llr_1_bit) ? "POS" : "NEG",
>>> +			(acc100_conf->output_pos_llr_1_bit) ? "POS" : "NEG",
>>> +			acc100_conf->q_ul_4g.num_qgroups,
>>> +			acc100_conf->q_dl_4g.num_qgroups,
>>> +			acc100_conf->q_ul_5g.num_qgroups,
>>> +			acc100_conf->q_dl_5g.num_qgroups,
>>> +			acc100_conf->q_ul_4g.num_aqs_per_groups,
>>> +			acc100_conf->q_dl_4g.num_aqs_per_groups,
>>> +			acc100_conf->q_ul_5g.num_aqs_per_groups,
>>> +			acc100_conf->q_dl_5g.num_aqs_per_groups,
>>> +			acc100_conf->q_ul_4g.aq_depth_log2,
>>> +			acc100_conf->q_dl_4g.aq_depth_log2,
>>> +			acc100_conf->q_ul_5g.aq_depth_log2,
>>> +			acc100_conf->q_dl_5g.aq_depth_log2);
>>> +}
>>> +
>>>  /* Free 64MB memory used for software rings */
>>>  static int
>>>  acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
>>> @@ -33,8 +211,55 @@
>>>  	return 0;
>>>  }
>>>
>>> +/* Get ACC100 device info */
>>> +static void
>>> +acc100_dev_info_get(struct rte_bbdev *dev,
>>> +		struct rte_bbdev_driver_info *dev_info)
>>> +{
>>> +	struct acc100_device *d = dev->data->dev_private;
>>> +
>>> +	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
>>> +		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
>>> +	};
>>> +
>>> +	static struct rte_bbdev_queue_conf default_queue_conf;
>>> +	default_queue_conf.socket = dev->data->socket_id;
>>> +	default_queue_conf.queue_size = MAX_QUEUE_DEPTH;
>>> +
>>> +	dev_info->driver_name = dev->device->driver->name;
>>> +
>>> +	/* Read and save the populated config from ACC100 registers */
>>> +	fetch_acc100_config(dev);
>>> +
>>> +	/* This isn't ideal because it reports the maximum number of queues
>>> +	 * but does not provide info on how many can be uplink/downlink or
>>> +	 * different priorities
>>> +	 */
>>> +	dev_info->max_num_queues =
>>> +			d->acc100_conf.q_dl_5g.num_aqs_per_groups *
>>> +			d->acc100_conf.q_dl_5g.num_qgroups +
>>> +			d->acc100_conf.q_ul_5g.num_aqs_per_groups *
>>> +			d->acc100_conf.q_ul_5g.num_qgroups +
>>> +			d->acc100_conf.q_dl_4g.num_aqs_per_groups *
>>> +			d->acc100_conf.q_dl_4g.num_qgroups +
>>> +			d->acc100_conf.q_ul_4g.num_aqs_per_groups *
>>> +			d->acc100_conf.q_ul_4g.num_qgroups;
>>> +	dev_info->queue_size_lim = MAX_QUEUE_DEPTH;
>>> +	dev_info->hardware_accelerated = true;
>>> +	dev_info->max_dl_queue_priority =
>>> +			d->acc100_conf.q_dl_4g.num_qgroups - 1;
>>> +	dev_info->max_ul_queue_priority =
>>> +			d->acc100_conf.q_ul_4g.num_qgroups - 1;
>>> +	dev_info->default_queue_conf = default_queue_conf;
>>> +	dev_info->cpu_flag_reqs = NULL;
>>> +	dev_info->min_alignment = 64;
>>> +	dev_info->capabilities = bbdev_capabilities;
>>> +	dev_info->harq_buffer_size = d->ddr_size;
>>> +}
>>> +
>>>  static const struct rte_bbdev_ops acc100_bbdev_ops = {
>>>  	.close = acc100_dev_close,
>>> +	.info_get = acc100_dev_info_get,
>>>  };
>>>
>>>  /* ACC100 PCI PF address map */
>>> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
>>> index cd77570..662e2c8 100644
>>> --- a/drivers/baseband/acc100/rte_acc100_pmd.h
>>> +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
>>> @@ -7,6 +7,7 @@
>>>
>>>  #include "acc100_pf_enum.h"
>>>  #include "acc100_vf_enum.h"
>>> +#include "rte_acc100_cfg.h"
>>>
>>>  /* Helper macro for logging */
>>>  #define rte_bbdev_log(level, fmt, ...) \ @@ -520,6 +521,8 @@ struct
>>> acc100_registry_addr {
>>>  /* Private data structure for each ACC100 device */  struct
>>> acc100_device {
>>>  	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
>>> +	uint32_t ddr_size; /* Size in kB */
>>> +	struct acc100_conf acc100_conf; /* ACC100 Initial configuration */
>>>  	bool pf_device; /**< True if this is a PF ACC100 device */
>>>  	bool configured; /**< True if this ACC100 device is configured */
>>> };


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v9 01/10] drivers/baseband: add PMD for ACC100
  2020-09-30 23:06           ` Tom Rix
@ 2020-09-30 23:30             ` Chautru, Nicolas
  0 siblings, 0 replies; 213+ messages in thread
From: Chautru, Nicolas @ 2020-09-30 23:30 UTC (permalink / raw)
  To: Tom Rix, dev, akhil.goyal
  Cc: Richardson, Bruce, Xu, Rosen, dave.burley, aidan.goddard, Yigit,
	Ferruh, Liu, Tianjiao

Hi Tom, 

> From: Tom Rix <trix@redhat.com>
> On 9/29/20 4:17 PM, Chautru, Nicolas wrote:
> > Hi Tom,
> >
> >> -----Original Message-----
> >> From: Tom Rix <trix@redhat.com>
> >> Sent: Tuesday, September 29, 2020 12:54 PM
> >> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org;
> >> akhil.goyal@nxp.com
> >> Cc: Richardson, Bruce <bruce.richardson@intel.com>; Xu, Rosen
> >> <rosen.xu@intel.com>; dave.burley@accelercomm.com;
> >> aidan.goddard@accelercomm.com; Yigit, Ferruh
> >> <ferruh.yigit@intel.com>; Liu, Tianjiao <tianjiao.liu@intel.com>
> >> Subject: Re: [dpdk-dev] [PATCH v9 01/10] drivers/baseband: add PMD
> >> for
> >> ACC100
> >>
> >>
> >> On 9/28/20 5:29 PM, Nicolas Chautru wrote:
> >>> Add stubs for the ACC100 PMD
> >>>
> >>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> >>> Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
> >>> ---
> >>>  doc/guides/bbdevs/acc100.rst                       | 233
> +++++++++++++++++++++
> >>>  doc/guides/bbdevs/features/acc100.ini              |  14 ++
> >>>  doc/guides/bbdevs/index.rst                        |   1 +
> >>>  drivers/baseband/acc100/meson.build                |   6 +
> >>>  drivers/baseband/acc100/rte_acc100_pmd.c           | 175
> >> ++++++++++++++++
> >>>  drivers/baseband/acc100/rte_acc100_pmd.h           |  37 ++++
> >>>  .../acc100/rte_pmd_bbdev_acc100_version.map        |   3 +
> >>>  drivers/baseband/meson.build                       |   2 +-
> >>>  8 files changed, 470 insertions(+), 1 deletion(-)  create mode
> >>> 100644 doc/guides/bbdevs/acc100.rst  create mode 100644
> >>> doc/guides/bbdevs/features/acc100.ini
> >>>  create mode 100644 drivers/baseband/acc100/meson.build
> >>>  create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.c
> >>>  create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.h
> >>>  create mode 100644
> >>> drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> >>>
> >>> diff --git a/doc/guides/bbdevs/acc100.rst b/doc/guides/bbdevs/acc100.rst
> >>> new file mode 100644
> >>> index 0000000..f87ee09
> >>> --- /dev/null
> >>> +++ b/doc/guides/bbdevs/acc100.rst
> >>> @@ -0,0 +1,233 @@
> >>> +..  SPDX-License-Identifier: BSD-3-Clause
> >>> +    Copyright(c) 2020 Intel Corporation
> >>> +
> >>> +Intel(R) ACC100 5G/4G FEC Poll Mode Driver
> >>> +==========================================
> >>> +
> >>> +The BBDEV ACC100 5G/4G FEC poll mode driver (PMD) supports an
> >>> +implementation of a VRAN FEC wireless acceleration function.
> >>> +This device is also known as Mount Bryce.
> >> If this is a code name or general chip name, it should be removed.
> > We have used the general chip name for other PMDs (i.e. Vista Creek). I can
> > remove it, but why should this be removed? This tends to be
> > the most user-friendly name, so it is arguably good to name-drop in the
> > documentation.
> 
> Vista Creek is the code name; the chip would be Arria 10.
> 
> Since Mt Bryce is the chip name, after more than one eASIC this becomes
> confusing.
> 

Actually, Mt Bryce is the 5G personality on top of a given eASIC (DM5); eASIC can be seen as process/fab technology.
Other eASIC chips != Mt Bryce.
ACC100 == Mt Bryce literally; its only usage is 4G/5G FEC as exposed by bbdev.
Vista Creek is the user-friendly name for the N3000 (card + Arria 10); these names tend to stick long after early deployments.
I think it helps to include it in that doc as a one-liner, even if through the rest of the doc and code the device is referred to as ACC100.

> Generally public product names should be used because only the early
> developers will know the development code names.
> 
> >
> >
> >>> +
> >>> +Features
> >>> +--------
> >>> +
> >>> +ACC100 5G/4G FEC PMD supports the following features:
> >>> +
> >>> +- LDPC Encode in the DL (5GNR)
> >>> +- LDPC Decode in the UL (5GNR)
> >>> +- Turbo Encode in the DL (4G)
> >>> +- Turbo Decode in the UL (4G)
> >>> +- 16 VFs per PF (physical device)
> >>> +- Maximum of 128 queues per VF
> >>> +- PCIe Gen-3 x16 Interface
> >>> +- MSI
> >>> +- SR-IOV
> >>> +
> >>> +ACC100 5G/4G FEC PMD supports the following BBDEV capabilities:
> >>> +
> >>> +* For the LDPC encode operation:
> >>> +   - ``RTE_BBDEV_LDPC_CRC_24B_ATTACH`` :  set to attach CRC24B to
> >> CB(s)
> >>> +   - ``RTE_BBDEV_LDPC_RATE_MATCH`` :  if set then do not do Rate
> >>> + Match
> >> bypass
> >>> +   - ``RTE_BBDEV_LDPC_INTERLEAVER_BYPASS`` : if set then bypass
> >>> +interleaver
> >>> +
> >>> +* For the LDPC decode operation:
> >>> +   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK`` :  check CRC24B from
> >> CB(s)
> >>> +   - ``RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE`` :  disable early
> >> termination
> >>> +   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP`` :  drops CRC24B bits
> >> appended while decoding
> >>> +   - ``RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE`` :  provides an
> input
> >> for HARQ combining
> >>> +   - ``RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE`` :  provides an
> >>> +output for HARQ combining
> >>> +   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE`` :
> HARQ
> >> memory input is internal
> >>> +   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE`` :
> >> HARQ memory output is internal
> >>> +   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK`` :
> >> loopback data to/from HARQ memory
> >>> +   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_FILLERS`` :  HARQ
> >> memory includes the fillers bits
> >>> +   - ``RTE_BBDEV_LDPC_DEC_SCATTER_GATHER`` :  supports scatter-
> >> gather for input/output data
> >>> +   - ``RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION`` :  supports
> >> compression of the HARQ input/output
> >>> +   - ``RTE_BBDEV_LDPC_LLR_COMPRESSION`` :  supports LLR input
> >>> +compression
> >>> +
> >>> +* For the turbo encode operation:
> >>> +   - ``RTE_BBDEV_TURBO_CRC_24B_ATTACH`` :  set to attach CRC24B to
> >> CB(s)
> >>> +   - ``RTE_BBDEV_TURBO_RATE_MATCH`` :  if set then do not do Rate
> >> Match bypass
> >>> +   - ``RTE_BBDEV_TURBO_ENC_INTERRUPTS`` :  set for encoder dequeue
> >> interrupts
> >>> +   - ``RTE_BBDEV_TURBO_RV_INDEX_BYPASS`` :  set to bypass RV index
> >>> +   - ``RTE_BBDEV_TURBO_ENC_SCATTER_GATHER`` :  supports
> >>> +scatter-gather for input/output data
> >>> +
> >>> +* For the turbo decode operation:
> >>> +   - ``RTE_BBDEV_TURBO_CRC_TYPE_24B`` :  check CRC24B from CB(s)
> >>> +   - ``RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE`` :  perform
> subblock
> >> de-interleave
> >>> +   - ``RTE_BBDEV_TURBO_DEC_INTERRUPTS`` :  set for decoder dequeue
> >> interrupts
> >>> +   - ``RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN`` :  set if negative LLR
> >> encoder i/p is supported
> >>> +   - ``RTE_BBDEV_TURBO_POS_LLR_1_BIT_IN`` :  set if positive LLR
> >>> + encoder
> >> i/p is supported
> >>> +   - ``RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP`` :  keep CRC24B bits
> >> appended while decoding
> >>> +   - ``RTE_BBDEV_TURBO_EARLY_TERMINATION`` :  set the early
> >> termination feature
> >>> +   - ``RTE_BBDEV_TURBO_DEC_SCATTER_GATHER`` :  supports scatter-
> >> gather for input/output data
> >>> +   - ``RTE_BBDEV_TURBO_HALF_ITERATION_EVEN`` :  set half iteration
> >>> +granularity
> >>> +
> >>> +Installation
> >>> +------------
> >>> +
> >>> +Section 3 of the DPDK manual provides instructions on installing and
> >>> +compiling DPDK. The default set of bbdev compile flags may be found
> >>> +in config/common_base, where for example the flag to build the
> >>> +ACC100 5G/4G FEC device,
> ``CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100``,
> >>> +is already set.
> >>> +
> >>> +DPDK requires hugepages to be configured as detailed in section 2
> >>> +of the
> >> DPDK manual.
> >>> +The bbdev test application has been tested with a configuration of 40
> >>> +x 1GB hugepages. The hugepage configuration of a server may be
> >>> +examined
> >> using:
> >>> +
> >>> +.. code-block:: console
> >>> +
> >>> +   grep Huge* /proc/meminfo
> >>> +
> >>> +
> >>> +Initialization
> >>> +--------------
> >>> +
> >>> +When the device first powers up, its PCI Physical Functions (PF)
> >>> +can be
> >> listed through this command:
> >>> +
> >>> +.. code-block:: console
> >>> +
> >>> +  sudo lspci -vd8086:0d5c
> >>> +
> >>> +The physical and virtual functions are compatible with Linux UIO
> drivers:
> >>> +``vfio`` and ``igb_uio``. However, in order to work the ACC100
> >>> +5G/4G FEC device firstly needs to be bound to one of these linux
> >>> +drivers through
> >> DPDK.
> >> FEC device first
> > ok
> >
> >>> +
> >>> +
> >>> +Bind PF UIO driver(s)
> >>> +~~~~~~~~~~~~~~~~~~~~~
> >>> +
> >>> +Install the DPDK igb_uio driver, bind it with the PF PCI device ID
> >>> +and use ``lspci`` to confirm the PF device is under use by
> >>> +``igb_uio`` DPDK
> >> UIO driver.
> >>> +
> >>> +The igb_uio driver may be bound to the PF PCI device using one of
> >>> +three
> >> methods:
> >>> +
> >>> +
> >>> +1. PCI functions (physical or virtual, depending on the use case)
> >>> +can be bound to the UIO driver by repeating this command for every
> function.
> >>> +
> >>> +.. code-block:: console
> >>> +
> >>> +  cd <dpdk-top-level-directory>
> >>> +  insmod ./build/kmod/igb_uio.ko
> >>> +  echo "8086 0d5c" > /sys/bus/pci/drivers/igb_uio/new_id
> >>> +  lspci -vd8086:0d5c
> >>> +
> >>> +
> >>> +2. Another way to bind PF with DPDK UIO driver is by using the
> >>> +``dpdk-devbind.py`` tool
> >>> +
> >>> +.. code-block:: console
> >>> +
> >>> +  cd <dpdk-top-level-directory>
> >>> +  ./usertools/dpdk-devbind.py -b igb_uio 0000:06:00.0
> >>> +
> >>> +where the PCI device ID (example: 0000:06:00.0) is obtained using
> >>> +lspci -vd8086:0d5c
> >>> +
> >>> +
> >>> +3. A third way to bind is to use ``dpdk-setup.sh`` tool
> >>> +
> >>> +.. code-block:: console
> >>> +
> >>> +  cd <dpdk-top-level-directory>
> >>> +  ./usertools/dpdk-setup.sh
> >>> +
> >>> +  select 'Bind Ethernet/Crypto/Baseband device to IGB UIO module'
> >>> +  or
> >>> +  select 'Bind Ethernet/Crypto/Baseband device to VFIO module'
> >>> + depending on driver required
> >> This is the igb_uio section, should defer vfio select to its section.
> > Ok
> >
> >>> +  enter PCI device ID
> >>> +  select 'Display current Ethernet/Crypto/Baseband device settings'
> >>> + to confirm binding
> >>> +
> >>> +
> >>> +In the same way the ACC100 5G/4G FEC PF can be bound with vfio, but
> >>> +vfio driver does not support SR-IOV configuration right out of the
> >>> +box, so
> >> it will need to be patched.
> >> Other documentation says works with 5.7
> > Yes, this is a bit historical now. I can remove this bit, which is not very
> > informative and not specific to that PMD.
> >
> >>> +
> >>> +
> >>> +Enable Virtual Functions
> >>> +~~~~~~~~~~~~~~~~~~~~~~~~
> >>> +
> >>> +Now, it should be visible in the printouts that PCI PF is under
> >>> +igb_uio control "``Kernel driver in use: igb_uio``"
> >>> +
> >>> +To show the number of available VFs on the device, read the
> >>> +``sriov_totalvfs`` file.
> >>> +
> >>> +.. code-block:: console
> >>> +
> >>> +  cat /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_totalvfs
> >>> +
> >>> +  where 0000\:<b>\:<d>.<f> is the PCI device ID
> >>> +
> >>> +
> >>> +To enable VFs via igb_uio, echo the number of virtual functions
> >>> +intended to enable to the ``max_vfs`` file.
> >>> +
> >>> +.. code-block:: console
> >>> +
> >>> +  echo <num-of-vfs> >
> >>> + /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/max_vfs
> >>> +
> >>> +
> >>> +Afterwards, all VFs must be bound to appropriate UIO drivers as
> >>> +required, same way it was done with the physical function previously.
> >>> +
> >>> +Enabling SR-IOV via vfio driver is pretty much the same, except
> >>> +that the file name is different:
> >>> +
> >>> +.. code-block:: console
> >>> +
> >>> +  echo <num-of-vfs> >
> >>> + /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_numvfs
> >>> +
> >>> +
> >>> +Configure the VFs through PF
> >>> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >>> +
> >>> +The PCI virtual functions must be configured before working or
> >>> +getting assigned to VMs/Containers. The configuration involves
> >>> +allocating the number of hardware queues, priorities, load balance,
> >>> +bandwidth and other settings necessary for the device to perform
> >>> +FEC
> >> functions.
> >>> +
> >>> +This configuration needs to be executed at least once after reboot
> >>> +or PCI FLR and can be achieved by using the function
> >>> +``acc100_configure()``, which sets up the parameters defined in
> >> ``acc100_conf`` structure.
> >>> +
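To make the configuration paragraph above concrete, here is a hedged sketch of filling such a configuration before handing it to ``acc100_configure()``. The struct is a cut-down stand-in for the ``acc100_conf`` structure quoted earlier in the thread, and every value is illustrative, not a recommended setting.

```c
#include <stdbool.h>
#include <stdint.h>

/* Cut-down stand-ins for rte_q_topology_t / acc100_conf (illustrative). */
struct q_topology_sketch {
	uint16_t num_qgroups;
	uint16_t num_aqs_per_groups;
	uint16_t aq_depth_log2;
};

struct acc100_conf_sketch {
	bool pf_mode_en;
	uint16_t num_vf_bundles;
	struct q_topology_sketch q_ul_5g;
	struct q_topology_sketch q_dl_5g;
};

/* Split the 5G queue groups evenly between UL and DL, one bundle per VF. */
static void
fill_conf_sketch(struct acc100_conf_sketch *conf)
{
	conf->pf_mode_en = false;  /* dataplane goes through the VFs */
	conf->num_vf_bundles = 16; /* matches 16 VFs per PF */
	conf->q_ul_5g.num_qgroups = 4;
	conf->q_ul_5g.num_aqs_per_groups = 16;
	conf->q_ul_5g.aq_depth_log2 = 4; /* AQs hold 2^4 = 16 entries */
	conf->q_dl_5g = conf->q_ul_5g;
}
```

The filled structure would then be passed once, after reboot or PCI FLR, to the ``acc100_configure()`` helper the paragraph names; its exact signature is not shown here.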
> >>> +Test Application
> >>> +----------------
> >>> +
> >>> +BBDEV provides a test application, ``test-bbdev.py``, and a range of
> >>> +test data for testing the functionality of ACC100 5G/4G FEC encode
> >>> +and decode, depending on the device's capabilities. The test
> >>> +application is located under app->test-bbdev folder and has the
> >>> +following
> >> options:
> >>> +
> >>> +.. code-block:: console
> >>> +
> >>> +  "-p", "--testapp-path": specifies path to the bbdev test app.
> >>> +  "-e", "--eal-params"	: EAL arguments which are passed to the test
> >> app.
> >>> +  "-t", "--timeout"	: Timeout in seconds (default=300).
> >>> +  "-c", "--test-cases"	: Defines test cases to run. Run all if not
> specified.
> >>> +  "-v", "--test-vector"	: Test vector path
> (default=dpdk_path+/app/test-
> >> bbdev/test_vectors/bbdev_null.data).
> >>> +  "-n", "--num-ops"	: Number of operations to process on device
> >> (default=32).
> >>> +  "-b", "--burst-size"	: Operations enqueue/dequeue burst size
> >> (default=32).
> >>> +  "-s", "--snr"		: SNR in dB used when generating LLRs for
> bler tests.
> >>> +  "-s", "--iter_max"	: Number of iterations for LDPC decoder.
> >>> +  "-l", "--num-lcores"	: Number of lcores to run (default=16).
> >>> +  "-i", "--init-device" : Initialise PF device with default values.
> >>> +
> >>> +
> >>> +To execute the test application tool using simple decode or encode
> >>> +data, type one of the following:
> >>> +
> >>> +.. code-block:: console
> >>> +
> >>> +  ./test-bbdev.py -c validation -n 64 -b 1 -v
> >>> + ./ldpc_dec_default.data ./test-bbdev.py -c validation -n 64 -b 1
> >>> + -v ./ldpc_enc_default.data
> >>> +
> >>> +
> >>> +The test application ``test-bbdev.py`` supports the ability to
> >>> +configure the PF device with a default set of values, if the "-i"
> >>> +or "--init-device" option is included. The default values are
> >>> +defined in test_bbdev_perf.c.
> >>> +
> >>> +
> >>> +Test Vectors
> >>> +~~~~~~~~~~~~
> >>> +
> >>> +In addition to the simple LDPC decoder and LDPC encoder tests,
> >>> +bbdev also provides a range of additional tests under the
> >>> +test_vectors folder, which may be useful. The results of these
> >>> +tests will depend on the ACC100 5G/4G FEC capabilities which may
> >>> +cause some testcases to
> >> be skipped, but no failure should be reported.
> >>
> >> Just
> >>
> >> to be skipped.
> >>
> >> should be able to assume skipped test do not get reported as failures.
> > Not necessarily that obvious from feedback. It doesn't hurt to be
> > explicit, and this statement is common to all PMDs.
> >
> ok
> 
> >>> diff --git a/doc/guides/bbdevs/features/acc100.ini b/doc/guides/bbdevs/features/acc100.ini
> >>> new file mode 100644
> >>> index 0000000..c89a4d7
> >>> --- /dev/null
> >>> +++ b/doc/guides/bbdevs/features/acc100.ini
> >>> @@ -0,0 +1,14 @@
> >>> +;
> >>> +; Supported features of the 'acc100' bbdev driver.
> >>> +;
> >>> +; Refer to default.ini for the full list of available PMD features.
> >>> +;
> >>> +[Features]
> >>> +Turbo Decoder (4G)     = N
> >>> +Turbo Encoder (4G)     = N
> >>> +LDPC Decoder (5G)      = N
> >>> +LDPC Encoder (5G)      = N
> >>> +LLR/HARQ Compression   = N
> >>> +External DDR Access    = N
> >>> +HW Accelerated         = Y
> >>> +BBDEV API              = Y
> >>> diff --git a/doc/guides/bbdevs/index.rst b/doc/guides/bbdevs/index.rst
> >>> index a8092dd..4445cbd 100644
> >>> --- a/doc/guides/bbdevs/index.rst
> >>> +++ b/doc/guides/bbdevs/index.rst
> >>> @@ -13,3 +13,4 @@ Baseband Device Drivers
> >>>      turbo_sw
> >>>      fpga_lte_fec
> >>>      fpga_5gnr_fec
> >>> +    acc100
> >>> diff --git a/drivers/baseband/acc100/meson.build b/drivers/baseband/acc100/meson.build
> >>> new file mode 100644
> >>> index 0000000..8afafc2
> >>> --- /dev/null
> >>> +++ b/drivers/baseband/acc100/meson.build
> >>> @@ -0,0 +1,6 @@
> >>> +# SPDX-License-Identifier: BSD-3-Clause
> >>> +# Copyright(c) 2020 Intel Corporation
> >>> +
> >>> +deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci']
> >>> +
> >>> +sources = files('rte_acc100_pmd.c')
> >>> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
> >>> new file mode 100644
> >>> index 0000000..1b4cd13
> >>> --- /dev/null
> >>> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> >>> @@ -0,0 +1,175 @@
> >>> +/* SPDX-License-Identifier: BSD-3-Clause
> >>> + * Copyright(c) 2020 Intel Corporation
> >>> + */
> >>> +
> >>> +#include <unistd.h>
> >>> +
> >>> +#include <rte_common.h>
> >>> +#include <rte_log.h>
> >>> +#include <rte_dev.h>
> >>> +#include <rte_malloc.h>
> >>> +#include <rte_mempool.h>
> >>> +#include <rte_byteorder.h>
> >>> +#include <rte_errno.h>
> >>> +#include <rte_branch_prediction.h>
> >>> +#include <rte_hexdump.h>
> >>> +#include <rte_pci.h>
> >>> +#include <rte_bus_pci.h>
> >>> +
> >>> +#include <rte_bbdev.h>
> >>> +#include <rte_bbdev_pmd.h>
> >> Should these #includes' be in alpha order ?
> > Interesting comment. Is this a coding guideline for DPDK or others?
> > I have never heard of this personally; what is the rationale?
> 
> Not sure if this is DPDK style; I know some other projects do this.
> 
> This works for self-consistent headers; no idea if DPDK's are.
> 
> Don't bother with this.
> 
> >>> +#include "rte_acc100_pmd.h"
> >>> +
> >>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> >>> +RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, DEBUG);
> >>> +#else
> >>> +RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
> >>> +#endif
> >>> +
> >>> +/* Free 64MB memory used for software rings */
> >>> +static int
> >>> +acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
> >>> +{
> >>> +	return 0;
> >>> +}
> >>> +
> >>> +static const struct rte_bbdev_ops acc100_bbdev_ops = {
> >>> +	.close = acc100_dev_close,
> >>> +};
> >>> +
> >>> +/* ACC100 PCI PF address map */
> >>> +static struct rte_pci_id pci_id_acc100_pf_map[] = {
> >>> +	{
> >>> +		RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID, RTE_ACC100_PF_DEVICE_ID)
> >>> +	},
> >>> +	{.device_id = 0},
> >>> +};
> >>> +
> >>> +/* ACC100 PCI VF address map */
> >>> +static struct rte_pci_id pci_id_acc100_vf_map[] = {
> >>> +	{
> >>> +		RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID, RTE_ACC100_VF_DEVICE_ID)
> >>> +	},
> >>> +	{.device_id = 0},
> >>> +};
> >>> +
> >>> +/* Initialization Function */
> >>> +static void
> >>> +acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
> >>> +{
> >>> +	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
> >>> +
> >>> +	dev->dev_ops = &acc100_bbdev_ops;
> >>> +
> >>> +	((struct acc100_device *) dev->data->dev_private)->pf_device =
> >>> +			!strcmp(drv->driver.name,
> >>> +					RTE_STR(ACC100PF_DRIVER_NAME));
> >>> +	((struct acc100_device *) dev->data->dev_private)->mmio_base =
> >>> +			pci_dev->mem_resource[0].addr;
> >>> +
> >>> +	rte_bbdev_log_debug("Init device %s [%s] @ vaddr %p paddr %#"PRIx64"",
> >>> +			drv->driver.name, dev->data->name,
> >>> +			(void *)pci_dev->mem_resource[0].addr,
> >>> +			pci_dev->mem_resource[0].phys_addr);
> >>> +}
> >>> +
> >>> +static int acc100_pci_probe(struct rte_pci_driver *pci_drv,
> >>> +	struct rte_pci_device *pci_dev)
> >>> +{
> >>> +	struct rte_bbdev *bbdev = NULL;
> >>> +	char dev_name[RTE_BBDEV_NAME_MAX_LEN];
> >>> +
> >>> +	if (pci_dev == NULL) {
> >>> +		rte_bbdev_log(ERR, "NULL PCI device");
> >>> +		return -EINVAL;
> >>> +	}
> >>> +
> >>> +	rte_pci_device_name(&pci_dev->addr, dev_name, sizeof(dev_name));
> >>> +
> >>> +	/* Allocate memory to be used privately by drivers */
> >>> +	bbdev = rte_bbdev_allocate(pci_dev->device.name);
> >>> +	if (bbdev == NULL)
> >>> +		return -ENODEV;
> >>> +
> >>> +	/* allocate device private memory */
> >>> +	bbdev->data->dev_private = rte_zmalloc_socket(dev_name,
> >>> +			sizeof(struct acc100_device), RTE_CACHE_LINE_SIZE,
> >>> +			pci_dev->device.numa_node);
> >>> +
> >>> +	if (bbdev->data->dev_private == NULL) {
> >>> +		rte_bbdev_log(CRIT,
> >>> +				"Allocate of %zu bytes for device \"%s\" failed",
> >>> +				sizeof(struct acc100_device), dev_name);
> >>> +				rte_bbdev_release(bbdev);
> >>> +			return -ENOMEM;
> >>> +	}
> >>> +
> >>> +	/* Fill HW specific part of device structure */
> >>> +	bbdev->device = &pci_dev->device;
> >>> +	bbdev->intr_handle = &pci_dev->intr_handle;
> >>> +	bbdev->data->socket_id = pci_dev->device.numa_node;
> >>> +
> >>> +	/* Invoke ACC100 device initialization function */
> >>> +	acc100_bbdev_init(bbdev, pci_drv);
> >>> +
> >>> +	rte_bbdev_log_debug("Initialised bbdev %s (id = %u)",
> >>> +			dev_name, bbdev->data->dev_id);
> >>> +	return 0;
> >>> +}
> >>> +
> >>> +static int acc100_pci_remove(struct rte_pci_device *pci_dev)
> >>> +{
> >>> +	struct rte_bbdev *bbdev;
> >>> +	int ret;
> >>> +	uint8_t dev_id;
> >>> +
> >>> +	if (pci_dev == NULL)
> >>> +		return -EINVAL;
> >>> +
> >>> +	/* Find device */
> >>> +	bbdev = rte_bbdev_get_named_dev(pci_dev->device.name);
> >>> +	if (bbdev == NULL) {
> >>> +		rte_bbdev_log(CRIT,
> >>> +				"Couldn't find HW dev \"%s\" to uninitialise it",
> >>> +				pci_dev->device.name);
> >>> +		return -ENODEV;
> >>> +	}
> >>> +	dev_id = bbdev->data->dev_id;
> >>> +
> >>> +	/* free device private memory before close */
> >>> +	rte_free(bbdev->data->dev_private);
> >>> +
> >>> +	/* Close device */
> >>> +	ret = rte_bbdev_close(dev_id);
> >> Do you want to reorder this close before the rte_free so you could
> >> recover from the failure ?
> > Given this is done the same way for other PMDs I would not change it, as it
> would create a discrepancy.
> > It could be done in principle as another patch covering multiple PMDs, but
> really I don't see a use case for trying to fall back in case there
> was such a speculative error.
> >
> fair enough
> 
> Tom
> 
> >> Tom
> >>
> > Thanks
> > Nic
> >
> >
> >>> +	if (ret < 0)
> >>> +		rte_bbdev_log(ERR,
> >>> +				"Device %i failed to close during uninit: %i",
> >>> +				dev_id, ret);
> >>> +
> >>> +	/* release bbdev from library */
> >>> +	rte_bbdev_release(bbdev);
> >>> +
> >>> +	rte_bbdev_log_debug("Destroyed bbdev = %u", dev_id);
> >>> +
> >>> +	return 0;
> >>> +}
> >>> +
> >>> +static struct rte_pci_driver acc100_pci_pf_driver = {
> >>> +		.probe = acc100_pci_probe,
> >>> +		.remove = acc100_pci_remove,
> >>> +		.id_table = pci_id_acc100_pf_map,
> >>> +		.drv_flags = RTE_PCI_DRV_NEED_MAPPING };
> >>> +
> >>> +static struct rte_pci_driver acc100_pci_vf_driver = {
> >>> +		.probe = acc100_pci_probe,
> >>> +		.remove = acc100_pci_remove,
> >>> +		.id_table = pci_id_acc100_vf_map,
> >>> +		.drv_flags = RTE_PCI_DRV_NEED_MAPPING };
> >>> +
> >>> +RTE_PMD_REGISTER_PCI(ACC100PF_DRIVER_NAME,
> >> acc100_pci_pf_driver);
> >>> +RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME,
> >>> +pci_id_acc100_pf_map);
> >> RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME,
> >>> +acc100_pci_vf_driver);
> >>> +RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME,
> >>> +pci_id_acc100_vf_map);
> >>> +
> >>> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h
> >>> b/drivers/baseband/acc100/rte_acc100_pmd.h
> >>> new file mode 100644
> >>> index 0000000..6f46df0
> >>> --- /dev/null
> >>> +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> >>> @@ -0,0 +1,37 @@
> >>> +/* SPDX-License-Identifier: BSD-3-Clause
> >>> + * Copyright(c) 2020 Intel Corporation  */
> >>> +
> >>> +#ifndef _RTE_ACC100_PMD_H_
> >>> +#define _RTE_ACC100_PMD_H_
> >>> +
> >>> +/* Helper macro for logging */
> >>> +#define rte_bbdev_log(level, fmt, ...) \
> >>> +	rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \
> >>> +		##__VA_ARGS__)
> >>> +
> >>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> >>> +#define rte_bbdev_log_debug(fmt, ...) \
> >>> +		rte_bbdev_log(DEBUG, "acc100_pmd: " fmt, \
> >>> +		##__VA_ARGS__)
> >>> +#else
> >>> +#define rte_bbdev_log_debug(fmt, ...) #endif
> >>> +
> >>> +/* ACC100 PF and VF driver names */
> >>> +#define ACC100PF_DRIVER_NAME           intel_acc100_pf
> >>> +#define ACC100VF_DRIVER_NAME           intel_acc100_vf
> >>> +
> >>> +/* ACC100 PCI vendor & device IDs */
> >>> +#define RTE_ACC100_VENDOR_ID           (0x8086)
> >>> +#define RTE_ACC100_PF_DEVICE_ID        (0x0d5c)
> >>> +#define RTE_ACC100_VF_DEVICE_ID        (0x0d5d)
> >>> +
> >>> +/* Private data structure for each ACC100 device */ struct
> >>> +acc100_device {
> >>> +	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
> >>> +	bool pf_device; /**< True if this is a PF ACC100 device */
> >>> +	bool configured; /**< True if this ACC100 device is configured */
> >>> +};
> >>> +
> >>> +#endif /* _RTE_ACC100_PMD_H_ */
> >>> diff --git
> >>> a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> >>> b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> >>> new file mode 100644
> >>> index 0000000..4a76d1d
> >>> --- /dev/null
> >>> +++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> >>> @@ -0,0 +1,3 @@
> >>> +DPDK_21 {
> >>> +	local: *;
> >>> +};
> >>> diff --git a/drivers/baseband/meson.build
> >>> b/drivers/baseband/meson.build index 415b672..72301ce 100644
> >>> --- a/drivers/baseband/meson.build
> >>> +++ b/drivers/baseband/meson.build
> >>> @@ -5,7 +5,7 @@ if is_windows
> >>>  	subdir_done()
> >>>  endif
> >>>
> >>> -drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec']
> >>> +drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec',
> >>> +'acc100']
> >>>
> >>>  config_flag_fmt = 'RTE_LIBRTE_PMD_BBDEV_@0@'
> >>>  driver_name_fmt = 'rte_pmd_bbdev_@0@'


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v9 04/10] baseband/acc100: add queue configuration
  2020-09-30  1:03         ` Chautru, Nicolas
@ 2020-09-30 23:36           ` Tom Rix
  0 siblings, 0 replies; 213+ messages in thread
From: Tom Rix @ 2020-09-30 23:36 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, akhil.goyal
  Cc: Richardson, Bruce, Xu, Rosen, dave.burley, aidan.goddard, Yigit,
	Ferruh, Liu, Tianjiao


On 9/29/20 6:03 PM, Chautru, Nicolas wrote:
> Hi Tom, 
>
>> From: Tom Rix <trix@redhat.com>
>> On 9/28/20 5:29 PM, Nicolas Chautru wrote:
>>> Adding function to create and configure queues for the device. Still
>>> no capability.
>>>
>>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
>>> Reviewed-by: Rosen Xu <rosen.xu@intel.com>
>>> Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
>>> ---
>>>  drivers/baseband/acc100/rte_acc100_pmd.c | 420
>>> ++++++++++++++++++++++++++++++-
>>> drivers/baseband/acc100/rte_acc100_pmd.h |  45 ++++
>>>  2 files changed, 464 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
>>> b/drivers/baseband/acc100/rte_acc100_pmd.c
>>> index 7807a30..7a21c57 100644
>>> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
>>> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
>>> @@ -26,6 +26,22 @@
>>>  RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);  #endif
>>>
>>> +/* Write to MMIO register address */
>>> +static inline void
>>> +mmio_write(void *addr, uint32_t value) {
>>> +	*((volatile uint32_t *)(addr)) = rte_cpu_to_le_32(value); }
>>> +
>>> +/* Write a register of a ACC100 device */ static inline void
>>> +acc100_reg_write(struct acc100_device *d, uint32_t offset, uint32_t
>>> +payload) {
>>> +	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
>>> +	mmio_write(reg_addr, payload);
>>> +	usleep(1000);
>> rte_acc100_pmd.h defines LONG_WAIT , could this #define be used instead
>> ?
> ok
>
>>> +}
>>> +
>>>  /* Read a register of a ACC100 device */  static inline uint32_t
>>> acc100_reg_read(struct acc100_device *d, uint32_t offset) @@ -36,6
>>> +52,22 @@
>>>  	return rte_le_to_cpu_32(ret);
>>>  }
>>>
>>> +/* Basic Implementation of Log2 for exact 2^N */ static inline
>>> +uint32_t log2_basic(uint32_t value)
>> mirrors the function rte_bsf32
> rte_bsf32 is also undefined for zero input.
> I could just replace __builtin_ctz() by rte_bsf32() indeed.
>
>>> +{
>>> +	return (value == 0) ? 0 : __builtin_ctz(value); }
>>> +
>>> +/* Calculate memory alignment offset assuming alignment is 2^N */
>>> +static inline uint32_t calc_mem_alignment_offset(void
>>> +*unaligned_virt_mem, uint32_t alignment) {
>>> +	rte_iova_t unaligned_phy_mem =
>> rte_malloc_virt2iova(unaligned_virt_mem);
>>> +	return (uint32_t)(alignment -
>>> +			(unaligned_phy_mem & (alignment-1))); }
>>> +
>>>  /* Calculate the offset of the enqueue register */  static inline
>>> uint32_t  queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id,
>>> uint16_t aq_id) @@ -204,10 +236,393 @@
>>>  			acc100_conf->q_dl_5g.aq_depth_log2);
>>>  }
>>>
>>> +static void
>>> +free_base_addresses(void **base_addrs, int size) {
>>> +	int i;
>>> +	for (i = 0; i < size; i++)
>>> +		rte_free(base_addrs[i]);
>>> +}
>>> +
>>> +static inline uint32_t
>>> +get_desc_len(void)
>>> +{
>>> +	return sizeof(union acc100_dma_desc); }
>>> +
>>> +/* Allocate the 2 * 64MB block for the sw rings */ static int
>>> +alloc_2x64mb_sw_rings_mem(struct rte_bbdev *dev, struct
>> acc100_device *d,
>>> +		int socket)
>> see earlier comment about name of function.
> replied in other patch set
>
>>> +{
>>> +	uint32_t sw_ring_size = ACC100_SIZE_64MBYTE;
>>> +	d->sw_rings_base = rte_zmalloc_socket(dev->device->driver->name,
>>> +			2 * sw_ring_size, RTE_CACHE_LINE_SIZE, socket);
>>> +	if (d->sw_rings_base == NULL) {
>>> +		rte_bbdev_log(ERR, "Failed to allocate memory for %s:%u",
>>> +				dev->device->driver->name,
>>> +				dev->data->dev_id);
>>> +		return -ENOMEM;
>>> +	}
>>> +	memset(d->sw_rings_base, 0, ACC100_SIZE_64MBYTE);
>>> +	uint32_t next_64mb_align_offset = calc_mem_alignment_offset(
>>> +			d->sw_rings_base, ACC100_SIZE_64MBYTE);
>>> +	d->sw_rings = RTE_PTR_ADD(d->sw_rings_base,
>> next_64mb_align_offset);
>>> +	d->sw_rings_phys = rte_malloc_virt2iova(d->sw_rings_base) +
>>> +			next_64mb_align_offset;
>>> +	d->sw_ring_size = MAX_QUEUE_DEPTH * get_desc_len();
>>> +	d->sw_ring_max_depth = d->sw_ring_size / get_desc_len();
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +/* Attempt to allocate minimised memory space for sw rings */ static
>>> +void alloc_sw_rings_min_mem(struct rte_bbdev *dev, struct
>>> +acc100_device *d,
>>> +		uint16_t num_queues, int socket)
>>> +{
>>> +	rte_iova_t sw_rings_base_phy, next_64mb_align_addr_phy;
>>> +	uint32_t next_64mb_align_offset;
>>> +	rte_iova_t sw_ring_phys_end_addr;
>>> +	void *base_addrs[SW_RING_MEM_ALLOC_ATTEMPTS];
>>> +	void *sw_rings_base;
>>> +	int i = 0;
>>> +	uint32_t q_sw_ring_size = MAX_QUEUE_DEPTH * get_desc_len();
>>> +	uint32_t dev_sw_ring_size = q_sw_ring_size * num_queues;
>>> +
>>> +	/* Find an aligned block of memory to store sw rings */
>>> +	while (i < SW_RING_MEM_ALLOC_ATTEMPTS) {
>>> +		/*
>>> +		 * sw_ring allocated memory is guaranteed to be aligned to
>>> +		 * q_sw_ring_size at the condition that the requested size is
>>> +		 * less than the page size
>>> +		 */
>>> +		sw_rings_base = rte_zmalloc_socket(
>>> +				dev->device->driver->name,
>>> +				dev_sw_ring_size, q_sw_ring_size, socket);
>>> +
>>> +		if (sw_rings_base == NULL) {
>>> +			rte_bbdev_log(ERR,
>>> +					"Failed to allocate memory for
>> %s:%u",
>>> +					dev->device->driver->name,
>>> +					dev->data->dev_id);
>>> +			break;
>>> +		}
>>> +
>>> +		sw_rings_base_phy = rte_malloc_virt2iova(sw_rings_base);
>>> +		next_64mb_align_offset = calc_mem_alignment_offset(
>>> +				sw_rings_base, ACC100_SIZE_64MBYTE);
>>> +		next_64mb_align_addr_phy = sw_rings_base_phy +
>>> +				next_64mb_align_offset;
>>> +		sw_ring_phys_end_addr = sw_rings_base_phy +
>> dev_sw_ring_size;
>>> +
>>> +		/* Check if the end of the sw ring memory block is before the
>>> +		 * start of next 64MB aligned mem address
>>> +		 */
>>> +		if (sw_ring_phys_end_addr < next_64mb_align_addr_phy) {
>>> +			d->sw_rings_phys = sw_rings_base_phy;
>>> +			d->sw_rings = sw_rings_base;
>>> +			d->sw_rings_base = sw_rings_base;
>>> +			d->sw_ring_size = q_sw_ring_size;
>>> +			d->sw_ring_max_depth = MAX_QUEUE_DEPTH;
>>> +			break;
>>> +		}
>>> +		/* Store the address of the unaligned mem block */
>>> +		base_addrs[i] = sw_rings_base;
>>> +		i++;
>>> +	}
>>> +
>> This looks like a bug.
>>
>> Freeing memory that was just allocated.
>>
>> Looks like it could be part of an error handler for memory access in the loop
>> failing.
> You are not the first person to raise concerns in this series about that piece of code.
> I agree this is a bit convoluted, but it is functionally correct.
>
>> There should be a better way to allocate aligned memory like round up the
>> size and use an offset to the alignment you need.
> This is actually the fallback option below, in case the first iterative option fails (but it is more wasteful in memory).
> If that really looks too dodgy we could skip the first attempt method and go directly to the second, more wasteful option,
> but it is doing what it is supposed to do, hence OK to me as it is.
> Let me know what you think. 

I like your idea: try the obvious alloc and fall back to wasteful.

>
>>> +	/* Free all unaligned blocks of mem allocated in the loop */
>>> +	free_base_addresses(base_addrs, i);
>>> +}
>>> +
>>> +
>>> +/* Allocate 64MB memory used for all software rings */ static int
>>> +acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int
>>> +socket_id) {
>>> +	uint32_t phys_low, phys_high, payload;
>>> +	struct acc100_device *d = dev->data->dev_private;
>>> +	const struct acc100_registry_addr *reg_addr;
>>> +
>>> +	if (d->pf_device && !d->acc100_conf.pf_mode_en) {
>>> +		rte_bbdev_log(NOTICE,
>>> +				"%s has PF mode disabled. This PF can't be
>> used.",
>>> +				dev->data->name);
>>> +		return -ENODEV;
>>> +	}
>>> +
>>> +	alloc_sw_rings_min_mem(dev, d, num_queues, socket_id);
>>> +
>>> +	/* If minimal memory space approach failed, then allocate
>>> +	 * the 2 * 64MB block for the sw rings
>>> +	 */
>>> +	if (d->sw_rings == NULL)
>>> +		alloc_2x64mb_sw_rings_mem(dev, d, socket_id);
>> This can fail as well, but is unhandled.
> ok can add. 
>
>>> +
>>> +	/* Configure ACC100 with the base address for DMA descriptor rings
>>> +	 * Same descriptor rings used for UL and DL DMA Engines
>>> +	 * Note : Assuming only VF0 bundle is used for PF mode
>>> +	 */
>>> +	phys_high = (uint32_t)(d->sw_rings_phys >> 32);
>>> +	phys_low  = (uint32_t)(d->sw_rings_phys &
>> ~(ACC100_SIZE_64MBYTE-1));
>>> +
>>> +	/* Choose correct registry addresses for the device type */
>>> +	if (d->pf_device)
>>> +		reg_addr = &pf_reg_addr;
>>> +	else
>>> +		reg_addr = &vf_reg_addr;
>> could reg_addr be part of acc100_device struct ?
> I don't see this as useful really as part of the device data in my opinion.
ok, i just saw this bit of code a lot.
>
>>> +
>>> +	/* Read the populated cfg from ACC100 registers */
>>> +	fetch_acc100_config(dev);
>>> +
>>> +	/* Mark as configured properly */
>>> +	d->configured = true;
>> should set configured at the end, as the function can still fail.
> ok
>
>>> +
>>> +	/* Release AXI from PF */
>>> +	if (d->pf_device)
>>> +		acc100_reg_write(d, HWPfDmaAxiControl, 1);
>>> +
>>> +	acc100_reg_write(d, reg_addr->dma_ring_ul5g_hi, phys_high);
>>> +	acc100_reg_write(d, reg_addr->dma_ring_ul5g_lo, phys_low);
>>> +	acc100_reg_write(d, reg_addr->dma_ring_dl5g_hi, phys_high);
>>> +	acc100_reg_write(d, reg_addr->dma_ring_dl5g_lo, phys_low);
>>> +	acc100_reg_write(d, reg_addr->dma_ring_ul4g_hi, phys_high);
>>> +	acc100_reg_write(d, reg_addr->dma_ring_ul4g_lo, phys_low);
>>> +	acc100_reg_write(d, reg_addr->dma_ring_dl4g_hi, phys_high);
>>> +	acc100_reg_write(d, reg_addr->dma_ring_dl4g_lo, phys_low);
>>> +
>>> +	/*
>>> +	 * Configure Ring Size to the max queue ring size
>>> +	 * (used for wrapping purpose)
>>> +	 */
>>> +	payload = log2_basic(d->sw_ring_size / 64);
>>> +	acc100_reg_write(d, reg_addr->ring_size, payload);
>>> +
>>> +	/* Configure tail pointer for use when SDONE enabled */
>>> +	d->tail_ptrs = rte_zmalloc_socket(
>>> +			dev->device->driver->name,
>>> +			ACC100_NUM_QGRPS * ACC100_NUM_AQS *
>> sizeof(uint32_t),
>>> +			RTE_CACHE_LINE_SIZE, socket_id);
>>> +	if (d->tail_ptrs == NULL) {
>>> +		rte_bbdev_log(ERR, "Failed to allocate tail ptr for %s:%u",
>>> +				dev->device->driver->name,
>>> +				dev->data->dev_id);
>>> +		rte_free(d->sw_rings);
>>> +		return -ENOMEM;
>>> +	}
>>> +	d->tail_ptr_phys = rte_malloc_virt2iova(d->tail_ptrs);
>>> +
>>> +	phys_high = (uint32_t)(d->tail_ptr_phys >> 32);
>>> +	phys_low  = (uint32_t)(d->tail_ptr_phys);
>>> +	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_hi, phys_high);
>>> +	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_lo, phys_low);
>>> +	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_hi, phys_high);
>>> +	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_lo, phys_low);
>>> +	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_hi, phys_high);
>>> +	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_lo, phys_low);
>>> +	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
>>> +	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
>>> +
>>> +	d->harq_layout = rte_zmalloc_socket("HARQ Layout",
>>> +			ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
>>> +			RTE_CACHE_LINE_SIZE, dev->data->socket_id);
>> unchecked
> ok will add. 
>
>>> +
>>> +	rte_bbdev_log_debug(
>>> +			"ACC100 (%s) configured  sw_rings = %p,
>> sw_rings_phys = %#"
>>> +			PRIx64, dev->data->name, d->sw_rings, d-
>>> sw_rings_phys);
>>> +
>>> +	return 0;
>>> +}
>>> +
>>>  /* Free 64MB memory used for software rings */  static int
>>> -acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
>>> +acc100_dev_close(struct rte_bbdev *dev)
>>>  {
>>> +	struct acc100_device *d = dev->data->dev_private;
>>> +	if (d->sw_rings_base != NULL) {
>>> +		rte_free(d->tail_ptrs);
>>> +		rte_free(d->sw_rings_base);
>>> +		d->sw_rings_base = NULL;
>>> +	}
>>> +	usleep(1000);
>> similar LONG_WAIT
> ok
>
>>> +	return 0;
>>> +}
>>> +
>>> +
>>> +/**
>>> + * Report a ACC100 queue index which is free
>>> + * Return 0 to 16k for a valid queue_idx or -1 when no queue is
>>> +available
>>> + * Note : Only supporting VF0 Bundle for PF mode  */ static int
>>> +acc100_find_free_queue_idx(struct rte_bbdev *dev,
>>> +		const struct rte_bbdev_queue_conf *conf) {
>>> +	struct acc100_device *d = dev->data->dev_private;
>>> +	int op_2_acc[5] = {0, UL_4G, DL_4G, UL_5G, DL_5G};
>>> +	int acc = op_2_acc[conf->op_type];
>>> +	struct rte_q_topology_t *qtop = NULL;
>>> +	qtopFromAcc(&qtop, acc, &(d->acc100_conf));
>>> +	if (qtop == NULL)
>>> +		return -1;
>>> +	/* Identify matching QGroup Index which are sorted in priority order
>> */
>>> +	uint16_t group_idx = qtop->first_qgroup_index;
>>> +	group_idx += conf->priority;
>>> +	if (group_idx >= ACC100_NUM_QGRPS ||
>>> +			conf->priority >= qtop->num_qgroups) {
>>> +		rte_bbdev_log(INFO, "Invalid Priority on %s, priority %u",
>>> +				dev->data->name, conf->priority);
>>> +		return -1;
>>> +	}
>>> +	/* Find a free AQ_idx  */
>>> +	uint16_t aq_idx;
>>> +	for (aq_idx = 0; aq_idx < qtop->num_aqs_per_groups; aq_idx++) {
>>> +		if (((d->q_assigned_bit_map[group_idx] >> aq_idx) & 0x1) ==
>> 0) {
>>> +			/* Mark the Queue as assigned */
>>> +			d->q_assigned_bit_map[group_idx] |= (1 << aq_idx);
>>> +			/* Report the AQ Index */
>>> +			return (group_idx << GRP_ID_SHIFT) + aq_idx;
>>> +		}
>>> +	}
>>> +	rte_bbdev_log(INFO, "Failed to find free queue on %s, priority %u",
>>> +			dev->data->name, conf->priority);
>>> +	return -1;
>>> +}
>>> +
>>> +/* Setup ACC100 queue */
>>> +static int
>>> +acc100_queue_setup(struct rte_bbdev *dev, uint16_t queue_id,
>>> +		const struct rte_bbdev_queue_conf *conf) {
>>> +	struct acc100_device *d = dev->data->dev_private;
>>> +	struct acc100_queue *q;
>>> +	int16_t q_idx;
>>> +
>>> +	/* Allocate the queue data structure. */
>>> +	q = rte_zmalloc_socket(dev->device->driver->name, sizeof(*q),
>>> +			RTE_CACHE_LINE_SIZE, conf->socket);
>>> +	if (q == NULL) {
>>> +		rte_bbdev_log(ERR, "Failed to allocate queue memory");
>>> +		return -ENOMEM;
>>> +	}
>>> +
>>> +	q->d = d;
>>> +	q->ring_addr = RTE_PTR_ADD(d->sw_rings, (d->sw_ring_size *
>> queue_id));
>>> +	q->ring_addr_phys = d->sw_rings_phys + (d->sw_ring_size *
>> queue_id);
>>> +
>>> +	/* Prepare the Ring with default descriptor format */
>>> +	union acc100_dma_desc *desc = NULL;
>>> +	unsigned int desc_idx, b_idx;
>>> +	int fcw_len = (conf->op_type == RTE_BBDEV_OP_LDPC_ENC ?
>>> +		ACC100_FCW_LE_BLEN : (conf->op_type ==
>> RTE_BBDEV_OP_TURBO_DEC ?
>>> +		ACC100_FCW_TD_BLEN : ACC100_FCW_LD_BLEN));
>>> +
>>> +	for (desc_idx = 0; desc_idx < d->sw_ring_max_depth; desc_idx++) {
>>> +		desc = q->ring_addr + desc_idx;
>>> +		desc->req.word0 = ACC100_DMA_DESC_TYPE;
>>> +		desc->req.word1 = 0; /**< Timestamp */
>>> +		desc->req.word2 = 0;
>>> +		desc->req.word3 = 0;
>>> +		uint64_t fcw_offset = (desc_idx << 8) +
>> ACC100_DESC_FCW_OFFSET;
>>> +		desc->req.data_ptrs[0].address = q->ring_addr_phys +
>> fcw_offset;
>>> +		desc->req.data_ptrs[0].blen = fcw_len;
>>> +		desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
>>> +		desc->req.data_ptrs[0].last = 0;
>>> +		desc->req.data_ptrs[0].dma_ext = 0;
>>> +		for (b_idx = 1; b_idx < ACC100_DMA_MAX_NUM_POINTERS -
>> 1;
>>> +				b_idx++) {
>>> +			desc->req.data_ptrs[b_idx].blkid =
>> ACC100_DMA_BLKID_IN;
>>> +			desc->req.data_ptrs[b_idx].last = 1;
>>> +			desc->req.data_ptrs[b_idx].dma_ext = 0;
>>> +			b_idx++;
>> This works, but it would be better to only inc the index in the for loop
>> statement.
>>
>> The second data set should accessed as [b_idx+1]
>>
>> And the loop inc by +2
> Matter of preference maybe? 

If you feel strongly, ok.


>
>>> +			desc->req.data_ptrs[b_idx].blkid =
>>> +					ACC100_DMA_BLKID_OUT_ENC;
>>> +			desc->req.data_ptrs[b_idx].last = 1;
>>> +			desc->req.data_ptrs[b_idx].dma_ext = 0;
>>> +		}
>>> +		/* Preset some fields of LDPC FCW */
>>> +		desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
>>> +		desc->req.fcw_ld.gain_i = 1;
>>> +		desc->req.fcw_ld.gain_h = 1;
>>> +	}
>>> +
>>> +	q->lb_in = rte_zmalloc_socket(dev->device->driver->name,
>>> +			RTE_CACHE_LINE_SIZE,
>>> +			RTE_CACHE_LINE_SIZE, conf->socket);
>>> +	if (q->lb_in == NULL) {
>> q is not freed.
> ok thanks
>
>>> +		rte_bbdev_log(ERR, "Failed to allocate lb_in memory");
>>> +		return -ENOMEM;
>>> +	}
>>> +	q->lb_in_addr_phys = rte_malloc_virt2iova(q->lb_in);
>>> +	q->lb_out = rte_zmalloc_socket(dev->device->driver->name,
>>> +			RTE_CACHE_LINE_SIZE,
>>> +			RTE_CACHE_LINE_SIZE, conf->socket);
>>> +	if (q->lb_out == NULL) {
>>> +		rte_bbdev_log(ERR, "Failed to allocate lb_out memory");
>>> +		return -ENOMEM;
>> q->lb_in is not freed
>>
>> q is not freed
> ok too thanks
>
>>> +	}
>>> +	q->lb_out_addr_phys = rte_malloc_virt2iova(q->lb_out);
>>> +
>>> +	/*
>>> +	 * Software queue ring wraps synchronously with the HW when it
>> reaches
>>> +	 * the boundary of the maximum allocated queue size, no matter
>> what the
>>> +	 * sw queue size is. This wrapping is guarded by setting the
>> wrap_mask
>>> +	 * to represent the maximum queue size as allocated at the time
>> when
>>> +	 * the device has been setup (in configure()).
>>> +	 *
>>> +	 * The queue depth is set to the queue size value (conf->queue_size).
>>> +	 * This limits the occupancy of the queue at any point of time, so
>> that
>>> +	 * the queue does not get swamped with enqueue requests.
>>> +	 */
>>> +	q->sw_ring_depth = conf->queue_size;
>>> +	q->sw_ring_wrap_mask = d->sw_ring_max_depth - 1;
>>> +
>>> +	q->op_type = conf->op_type;
>>> +
>>> +	q_idx = acc100_find_free_queue_idx(dev, conf);
>>> +	if (q_idx == -1) {
>>> +		rte_free(q);
>> This will leak the other two ptr's
>> This function needs better error handling.
> Yes agreed. Thanks.
>
>> Tom
>>
> Thanks for your review Tom, aiming to push the updated series tomorrow.

Ok, i'll look for them.

Thanks,

Tom

>
> Nic
>
>
>
>>> +		return -1;
>>> +	}
>>> +
>>> +	q->qgrp_id = (q_idx >> GRP_ID_SHIFT) & 0xF;
>>> +	q->vf_id = (q_idx >> VF_ID_SHIFT)  & 0x3F;
>>> +	q->aq_id = q_idx & 0xF;
>>> +	q->aq_depth = (conf->op_type ==  RTE_BBDEV_OP_TURBO_DEC) ?
>>> +			(1 << d->acc100_conf.q_ul_4g.aq_depth_log2) :
>>> +			(1 << d->acc100_conf.q_dl_4g.aq_depth_log2);
>>> +
>>> +	q->mmio_reg_enqueue = RTE_PTR_ADD(d->mmio_base,
>>> +			queue_offset(d->pf_device,
>>> +					q->vf_id, q->qgrp_id, q->aq_id));
>>> +
>>> +	rte_bbdev_log_debug(
>>> +			"Setup dev%u q%u: qgrp_id=%u, vf_id=%u,
>> aq_id=%u, aq_depth=%u, mmio_reg_enqueue=%p",
>>> +			dev->data->dev_id, queue_id, q->qgrp_id, q->vf_id,
>>> +			q->aq_id, q->aq_depth, q->mmio_reg_enqueue);
>>> +
>>> +	dev->data->queues[queue_id].queue_private = q;
>>> +	return 0;
>>> +}
>>> +
>>> +/* Release ACC100 queue */
>>> +static int
>>> +acc100_queue_release(struct rte_bbdev *dev, uint16_t q_id) {
>>> +	struct acc100_device *d = dev->data->dev_private;
>>> +	struct acc100_queue *q = dev->data->queues[q_id].queue_private;
>>> +
>>> +	if (q != NULL) {
>>> +		/* Mark the Queue as un-assigned */
>>> +		d->q_assigned_bit_map[q->qgrp_id] &= (0xFFFFFFFF -
>>> +				(1 << q->aq_id));
>>> +		rte_free(q->lb_in);
>>> +		rte_free(q->lb_out);
>>> +		rte_free(q);
>>> +		dev->data->queues[q_id].queue_private = NULL;
>>> +	}
>>> +
>>>  	return 0;
>>>  }
>>>
>>> @@ -258,8 +673,11 @@
>>>  }
>>>
>>>  static const struct rte_bbdev_ops acc100_bbdev_ops = {
>>> +	.setup_queues = acc100_setup_queues,
>>>  	.close = acc100_dev_close,
>>>  	.info_get = acc100_dev_info_get,
>>> +	.queue_setup = acc100_queue_setup,
>>> +	.queue_release = acc100_queue_release,
>>>  };
>>>
>>>  /* ACC100 PCI PF address map */
>>> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h
>>> b/drivers/baseband/acc100/rte_acc100_pmd.h
>>> index 662e2c8..0e2b79c 100644
>>> --- a/drivers/baseband/acc100/rte_acc100_pmd.h
>>> +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
>>> @@ -518,11 +518,56 @@ struct acc100_registry_addr {
>>>  	.ddr_range = HWVfDmaDdrBaseRangeRoVf,  };
>>>
>>> +/* Structure associated with each queue. */ struct
>>> +__rte_cache_aligned acc100_queue {
>>> +	union acc100_dma_desc *ring_addr;  /* Virtual address of sw ring */
>>> +	rte_iova_t ring_addr_phys;  /* Physical address of software ring */
>>> +	uint32_t sw_ring_head;  /* software ring head */
>>> +	uint32_t sw_ring_tail;  /* software ring tail */
>>> +	/* software ring size (descriptors, not bytes) */
>>> +	uint32_t sw_ring_depth;
>>> +	/* mask used to wrap enqueued descriptors on the sw ring */
>>> +	uint32_t sw_ring_wrap_mask;
>>> +	/* MMIO register used to enqueue descriptors */
>>> +	void *mmio_reg_enqueue;
>>> +	uint8_t vf_id;  /* VF ID (max = 63) */
>>> +	uint8_t qgrp_id;  /* Queue Group ID */
>>> +	uint16_t aq_id;  /* Atomic Queue ID */
>>> +	uint16_t aq_depth;  /* Depth of atomic queue */
>>> +	uint32_t aq_enqueued;  /* Count how many "batches" have been
>> enqueued */
>>> +	uint32_t aq_dequeued;  /* Count how many "batches" have been
>> dequeued */
>>> +	uint32_t irq_enable;  /* Enable ops dequeue interrupts if set to 1 */
>>> +	struct rte_mempool *fcw_mempool;  /* FCW mempool */
>>> +	enum rte_bbdev_op_type op_type;  /* Type of this Queue: TE or TD
>> */
>>> +	/* Internal Buffers for loopback input */
>>> +	uint8_t *lb_in;
>>> +	uint8_t *lb_out;
>>> +	rte_iova_t lb_in_addr_phys;
>>> +	rte_iova_t lb_out_addr_phys;
>>> +	struct acc100_device *d;
>>> +};
>>> +
>>>  /* Private data structure for each ACC100 device */  struct
>>> acc100_device {
>>>  	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
>>> +	void *sw_rings_base;  /* Base addr of un-aligned memory for sw
>> rings */
>>> +	void *sw_rings;  /* 64MBs of 64MB aligned memory for sw rings */
>>> +	rte_iova_t sw_rings_phys;  /* Physical address of sw_rings */
>>> +	/* Virtual address of the info memory routed to the this function
>> under
>>> +	 * operation, whether it is PF or VF.
>>> +	 */
>>> +	union acc100_harq_layout_data *harq_layout;
>>> +	uint32_t sw_ring_size;
>>>  	uint32_t ddr_size; /* Size in kB */
>>> +	uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */
>>> +	rte_iova_t tail_ptr_phys; /* Physical address of tail pointers */
>>> +	/* Max number of entries available for each queue in device,
>> depending
>>> +	 * on how many queues are enabled with configure()
>>> +	 */
>>> +	uint32_t sw_ring_max_depth;
>>>  	struct acc100_conf acc100_conf; /* ACC100 Initial configuration */
>>> +	/* Bitmap capturing which Queues have already been assigned */
>>> +	uint16_t q_assigned_bit_map[ACC100_NUM_QGRPS];
>>>  	bool pf_device; /**< True if this is a PF ACC100 device */
>>>  	bool configured; /**< True if this ACC100 device is configured */
>>> };



* [dpdk-dev] [PATCH v10 00/10] bbdev PMD ACC100
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 11/11] doc: update bbdev feature table Nicolas Chautru
                     ` (5 preceding siblings ...)
  2020-09-29  0:29   ` [dpdk-dev] [PATCH v9 00/10] bbdev PMD ACC100 Nicolas Chautru
@ 2020-10-01  3:14   ` Nicolas Chautru
  2020-10-01  3:14     ` [dpdk-dev] [PATCH v10 01/10] drivers/baseband: add PMD for ACC100 Nicolas Chautru
                       ` (9 more replies)
  2020-10-02  1:01   ` [dpdk-dev] [PATCH v11 00/10] bbdev PMD ACC100 Nicolas Chautru
  2020-10-05 22:12   ` [dpdk-dev] [PATCH v12 00/10] bbdev PMD ACC100 Nicolas Chautru
  8 siblings, 10 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-10-01  3:14 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, trix, ferruh.yigit, tianjiao.liu,
	Nicolas Chautru

v10: Updates based on Tom Rix's valuable review comments. Notably doc clarification, #define name updates, a few magic numbers left, stricter error handling and a few valuable coding suggestions. Thanks
v9: moved the release notes update to the last commit
v8: integrated the doc feature table in previous commit as suggested. 
v7: Finger trouble. Previous one sent mid-rebase. My bad.
v6: removed a legacy makefile no longer required
v5: rebase based on latest on main. The legacy makefiles are removed. 
v4: an odd compilation error is reported for one CI variant using "gcc latest", which looks to me like a false positive of maybe-undeclared.
http://mails.dpdk.org/archives/test-report/2020-August/148936.html
Still forcing a dummy declaration to remove this CI warning; I will check with ci@dpdk.org in parallel.
v3: missed a change during rebase
v2: includes clean up from latest CI checks.

Nicolas Chautru (10):
  drivers/baseband: add PMD for ACC100
  baseband/acc100: add register definition file
  baseband/acc100: add info get function
  baseband/acc100: add queue configuration
  baseband/acc100: add LDPC processing functions
  baseband/acc100: add HARQ loopback support
  baseband/acc100: add support for 4G processing
  baseband/acc100: add interrupt support to PMD
  baseband/acc100: add debug function to validate input
  baseband/acc100: add configure function

 app/test-bbdev/meson.build                         |    3 +
 app/test-bbdev/test_bbdev_perf.c                   |   71 +
 doc/guides/bbdevs/acc100.rst                       |  228 +
 doc/guides/bbdevs/features/acc100.ini              |   14 +
 doc/guides/bbdevs/index.rst                        |    1 +
 doc/guides/rel_notes/release_20_11.rst             |    5 +
 drivers/baseband/acc100/acc100_pf_enum.h           | 1068 +++++
 drivers/baseband/acc100/acc100_vf_enum.h           |   73 +
 drivers/baseband/acc100/meson.build                |    8 +
 drivers/baseband/acc100/rte_acc100_cfg.h           |  113 +
 drivers/baseband/acc100/rte_acc100_pmd.c           | 4731 ++++++++++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.h           |  602 +++
 .../acc100/rte_pmd_bbdev_acc100_version.map        |   10 +
 drivers/baseband/meson.build                       |    2 +-
 14 files changed, 6928 insertions(+), 1 deletion(-)
 create mode 100644 doc/guides/bbdevs/acc100.rst
 create mode 100644 doc/guides/bbdevs/features/acc100.ini
 create mode 100644 drivers/baseband/acc100/acc100_pf_enum.h
 create mode 100644 drivers/baseband/acc100/acc100_vf_enum.h
 create mode 100644 drivers/baseband/acc100/meson.build
 create mode 100644 drivers/baseband/acc100/rte_acc100_cfg.h
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.c
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.h
 create mode 100644 drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map

-- 
1.8.3.1



* [dpdk-dev] [PATCH v10 01/10] drivers/baseband: add PMD for ACC100
  2020-10-01  3:14   ` [dpdk-dev] [PATCH v10 00/10] bbdev PMD ACC100 Nicolas Chautru
@ 2020-10-01  3:14     ` Nicolas Chautru
  2020-10-01  3:14     ` [dpdk-dev] [PATCH v10 02/10] baseband/acc100: add register definition file Nicolas Chautru
                       ` (8 subsequent siblings)
  9 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-10-01  3:14 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, trix, ferruh.yigit, tianjiao.liu,
	Nicolas Chautru

Add stubs for the ACC100 PMD

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 doc/guides/bbdevs/acc100.rst                       | 228 +++++++++++++++++++++
 doc/guides/bbdevs/features/acc100.ini              |  14 ++
 doc/guides/bbdevs/index.rst                        |   1 +
 drivers/baseband/acc100/meson.build                |   6 +
 drivers/baseband/acc100/rte_acc100_pmd.c           | 175 ++++++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.h           |  37 ++++
 .../acc100/rte_pmd_bbdev_acc100_version.map        |   3 +
 drivers/baseband/meson.build                       |   2 +-
 8 files changed, 465 insertions(+), 1 deletion(-)
 create mode 100644 doc/guides/bbdevs/acc100.rst
 create mode 100644 doc/guides/bbdevs/features/acc100.ini
 create mode 100644 drivers/baseband/acc100/meson.build
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.c
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.h
 create mode 100644 drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map

diff --git a/doc/guides/bbdevs/acc100.rst b/doc/guides/bbdevs/acc100.rst
new file mode 100644
index 0000000..d6d56ad
--- /dev/null
+++ b/doc/guides/bbdevs/acc100.rst
@@ -0,0 +1,228 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2020 Intel Corporation
+
+Intel(R) ACC100 5G/4G FEC Poll Mode Driver
+==========================================
+
+The BBDEV ACC100 5G/4G FEC poll mode driver (PMD) supports an
+implementation of a vRAN FEC wireless acceleration function.
+This device is also known as Mount Bryce.
+
+Features
+--------
+
+ACC100 5G/4G FEC PMD supports the following features:
+
+- LDPC Encode in the DL (5GNR)
+- LDPC Decode in the UL (5GNR)
+- Turbo Encode in the DL (4G)
+- Turbo Decode in the UL (4G)
+- 16 VFs per PF (physical device)
+- Maximum of 128 queues per VF
+- PCIe Gen-3 x16 Interface
+- MSI
+- SR-IOV
+
+ACC100 5G/4G FEC PMD supports the following BBDEV capabilities:
+
+* For the LDPC encode operation:
+   - ``RTE_BBDEV_LDPC_CRC_24B_ATTACH`` :  set to attach CRC24B to CB(s)
+   - ``RTE_BBDEV_LDPC_RATE_MATCH`` :  if set, the rate matching stage is performed (not bypassed)
+   - ``RTE_BBDEV_LDPC_INTERLEAVER_BYPASS`` : if set then bypass interleaver
+
+* For the LDPC decode operation:
+   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK`` :  check CRC24B from CB(s)
+   - ``RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE`` :  disable early termination
+   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP`` :  drops CRC24B bits appended while decoding
+   - ``RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE`` :  provides an input for HARQ combining
+   - ``RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE`` :  provides an output for HARQ combining
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE`` :  HARQ memory input is internal
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE`` :  HARQ memory output is internal
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK`` :  loopback data to/from HARQ memory
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_FILLERS`` :  HARQ memory includes the filler bits
+   - ``RTE_BBDEV_LDPC_DEC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
+   - ``RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION`` :  supports compression of the HARQ input/output
+   - ``RTE_BBDEV_LDPC_LLR_COMPRESSION`` :  supports LLR input compression
+
+* For the turbo encode operation:
+   - ``RTE_BBDEV_TURBO_CRC_24B_ATTACH`` :  set to attach CRC24B to CB(s)
+   - ``RTE_BBDEV_TURBO_RATE_MATCH`` :  if set, the rate matching stage is performed (not bypassed)
+   - ``RTE_BBDEV_TURBO_ENC_INTERRUPTS`` :  set for encoder dequeue interrupts
+   - ``RTE_BBDEV_TURBO_RV_INDEX_BYPASS`` :  set to bypass RV index
+   - ``RTE_BBDEV_TURBO_ENC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
+
+* For the turbo decode operation:
+   - ``RTE_BBDEV_TURBO_CRC_TYPE_24B`` :  check CRC24B from CB(s)
+   - ``RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE`` :  perform subblock de-interleave
+   - ``RTE_BBDEV_TURBO_DEC_INTERRUPTS`` :  set for decoder dequeue interrupts
+   - ``RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN`` :  set if negative LLR decoder input is supported
+   - ``RTE_BBDEV_TURBO_POS_LLR_1_BIT_IN`` :  set if positive LLR decoder input is supported
+   - ``RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP`` :  keep CRC24B bits appended while decoding
+   - ``RTE_BBDEV_TURBO_EARLY_TERMINATION`` :  set the early termination feature
+   - ``RTE_BBDEV_TURBO_DEC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
+   - ``RTE_BBDEV_TURBO_HALF_ITERATION_EVEN`` :  set half iteration granularity
+
+Installation
+------------
+
+Section 3 of the DPDK manual provides instructions on installing and compiling DPDK. The
+default set of bbdev compile flags may be found in config/common_base, where for example
+the flag to build the ACC100 5G/4G FEC device, ``CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100``,
+is already set.
+
+DPDK requires hugepages to be configured as detailed in section 2 of the DPDK manual.
+The bbdev test application has been tested with a configuration of 40 x 1GB hugepages. The
+hugepage configuration of a server may be examined using:
+
+.. code-block:: console
+
+   grep Huge* /proc/meminfo
+
+
+Initialization
+--------------
+
+When the device first powers up, its PCI Physical Function (PF) can be listed with this command:
+
+.. code-block:: console
+
+  sudo lspci -vd8086:0d5c
+
+The physical and virtual functions are compatible with Linux UIO drivers:
+``vfio`` and ``igb_uio``. However, in order to work, the ACC100 5G/4G
+FEC device first needs to be bound to one of these Linux drivers through DPDK.
+
+
+Bind PF UIO driver(s)
+~~~~~~~~~~~~~~~~~~~~~
+
+Install the DPDK igb_uio driver, bind it with the PF PCI device ID and use
+``lspci`` to confirm the PF device is in use by the ``igb_uio`` DPDK UIO driver.
+
+The igb_uio driver may be bound to the PF PCI device using one of three methods:
+
+
+1. PCI functions (physical or virtual, depending on the use case) can be bound to
+the UIO driver by repeating this command for every function.
+
+.. code-block:: console
+
+  cd <dpdk-top-level-directory>
+  insmod ./build/kmod/igb_uio.ko
+  echo "8086 0d5c" > /sys/bus/pci/drivers/igb_uio/new_id
+  lspci -vd8086:0d5c
+
+
+2. Another way to bind the PF with the DPDK UIO driver is to use the ``dpdk-devbind.py`` tool
+
+.. code-block:: console
+
+  cd <dpdk-top-level-directory>
+  ./usertools/dpdk-devbind.py -b igb_uio 0000:06:00.0
+
+where the PCI device ID (example: 0000:06:00.0) is obtained using ``lspci -vd8086:0d5c``
+
+
+3. A third way to bind is to use the ``dpdk-setup.sh`` tool
+
+.. code-block:: console
+
+  cd <dpdk-top-level-directory>
+  ./usertools/dpdk-setup.sh
+
+  select 'Bind Ethernet/Crypto/Baseband device to IGB UIO module'
+  enter PCI device ID
+  select 'Display current Ethernet/Crypto/Baseband device settings' to confirm binding
+
+In a similar way the ACC100 5G/4G FEC PF may be bound with ``vfio-pci``, as for any PCIe device.
+
+Enable Virtual Functions
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Now, it should be visible in the printouts that the PCI PF is under ``igb_uio`` control:
+"``Kernel driver in use: igb_uio``"
+
+To show the number of available VFs on the device, read the ``sriov_totalvfs`` file:
+
+.. code-block:: console
+
+  cat /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_totalvfs
+
+where ``0000:<b>:<d>.<f>`` is the PCI device ID
+
+
+To enable VFs via igb_uio, echo the number of virtual functions to enable
+into the ``max_vfs`` file:
+
+.. code-block:: console
+
+  echo <num-of-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/max_vfs
+
+
+Afterwards, all VFs must be bound to the appropriate UIO drivers as required, the
+same way as was done for the physical function previously.
+
+Enabling SR-IOV via the vfio driver is similar, except that the file
+name is different:
+
+.. code-block:: console
+
+  echo <num-of-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_numvfs
+
+
+Configure the VFs through PF
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The PCI virtual functions must be configured before use or before being assigned
+to VMs/containers. The configuration involves allocating the number of hardware
+queues, priorities, load balance, bandwidth and other settings necessary for the
+device to perform FEC functions.
+
+This configuration needs to be executed at least once after reboot or PCI FLR and can
+be achieved by using the function ``acc100_configure()``, which sets up the
+parameters defined in the ``acc100_conf`` structure.
+
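The role of that one-time, post-reboot/FLR configuration step can be sketched in standalone C. Note this is an illustrative assumption: the field names in ``acc100_conf_sketch`` and the helper ``acc100_configure_sketch()`` are invented for the sketch and are not the actual ``rte_acc100_cfg.h`` layout or ``acc100_configure()`` signature from this patch set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative only: field names are assumptions for this sketch,
 * not the actual layout of struct acc100_conf in rte_acc100_cfg.h. */
struct acc100_conf_sketch {
	bool pf_mode_en;             /* run queues from the PF rather than VFs */
	unsigned int num_vf_bundles; /* VFs sharing the queue resources */
	unsigned int ul_4g_qnum;     /* 4G uplink queues per bundle */
	unsigned int dl_5g_qnum;     /* 5G downlink queues per bundle */
};

/* Plays the role acc100_configure() has in the text: a one-time step
 * after reboot or PCI FLR. Only the validation half is sketched here;
 * the real function also programs the queue split into the device. */
static int
acc100_configure_sketch(const struct acc100_conf_sketch *conf)
{
	if (conf == NULL)
		return -1;
	/* the device exposes at most 16 VFs per PF */
	if (!conf->pf_mode_en &&
	    (conf->num_vf_bundles == 0 || conf->num_vf_bundles > 16))
		return -1;
	return 0;
}
```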
+Test Application
+----------------
+
+BBDEV provides a test application, ``test-bbdev.py``, and a range of test data for testing
+the functionality of ACC100 5G/4G FEC encode and decode, depending on the device's
+capabilities. The test application is located under the ``app/test-bbdev`` folder and has the
+following options:
+
+.. code-block:: console
+
+  "-p", "--testapp-path": specifies path to the bbdev test app.
+  "-e", "--eal-params"	: EAL arguments which are passed to the test app.
+  "-t", "--timeout"	: Timeout in seconds (default=300).
+  "-c", "--test-cases"	: Defines test cases to run. Run all if not specified.
+  "-v", "--test-vector"	: Test vector path (default=dpdk_path+/app/test-bbdev/test_vectors/bbdev_null.data).
+  "-n", "--num-ops"	: Number of operations to process on device (default=32).
+  "-b", "--burst-size"	: Operations enqueue/dequeue burst size (default=32).
+  "-s", "--snr"		: SNR in dB used when generating LLRs for bler tests.
+  "-s", "--iter_max"	: Number of iterations for LDPC decoder.
+  "-l", "--num-lcores"	: Number of lcores to run (default=16).
+  "-i", "--init-device" : Initialise PF device with default values.
+
+
+To execute the test application tool using simple decode or encode data,
+type one of the following:
+
+.. code-block:: console
+
+  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_dec_default.data
+  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_enc_default.data
+
+
+The test application ``test-bbdev.py`` supports configuring the PF device with
+a default set of values, if the ``-i`` or ``--init-device`` option is included. The default values
+are defined in ``test_bbdev_perf.c``.
+
+
+Test Vectors
+~~~~~~~~~~~~
+
+In addition to the simple LDPC decoder and LDPC encoder tests, bbdev also provides
+a range of additional tests under the ``test_vectors`` folder, which may be useful. The results
+of these tests will depend on the ACC100 5G/4G FEC capabilities, which may cause some
+test cases to be skipped, but no failure should be reported.
diff --git a/doc/guides/bbdevs/features/acc100.ini b/doc/guides/bbdevs/features/acc100.ini
new file mode 100644
index 0000000..c89a4d7
--- /dev/null
+++ b/doc/guides/bbdevs/features/acc100.ini
@@ -0,0 +1,14 @@
+;
+; Supported features of the 'acc100' bbdev driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+Turbo Decoder (4G)     = N
+Turbo Encoder (4G)     = N
+LDPC Decoder (5G)      = N
+LDPC Encoder (5G)      = N
+LLR/HARQ Compression   = N
+External DDR Access    = N
+HW Accelerated         = Y
+BBDEV API              = Y
diff --git a/doc/guides/bbdevs/index.rst b/doc/guides/bbdevs/index.rst
index a8092dd..4445cbd 100644
--- a/doc/guides/bbdevs/index.rst
+++ b/doc/guides/bbdevs/index.rst
@@ -13,3 +13,4 @@ Baseband Device Drivers
     turbo_sw
     fpga_lte_fec
     fpga_5gnr_fec
+    acc100
diff --git a/drivers/baseband/acc100/meson.build b/drivers/baseband/acc100/meson.build
new file mode 100644
index 0000000..8afafc2
--- /dev/null
+++ b/drivers/baseband/acc100/meson.build
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2020 Intel Corporation
+
+deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci']
+
+sources = files('rte_acc100_pmd.c')
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
new file mode 100644
index 0000000..1b4cd13
--- /dev/null
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -0,0 +1,175 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include <unistd.h>
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_dev.h>
+#include <rte_malloc.h>
+#include <rte_mempool.h>
+#include <rte_byteorder.h>
+#include <rte_errno.h>
+#include <rte_branch_prediction.h>
+#include <rte_hexdump.h>
+#include <rte_pci.h>
+#include <rte_bus_pci.h>
+
+#include <rte_bbdev.h>
+#include <rte_bbdev_pmd.h>
+#include "rte_acc100_pmd.h"
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, DEBUG);
+#else
+RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
+#endif
+
+/* Stub for device close; software ring memory is freed in a later commit */
+static int
+acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
+{
+	return 0;
+}
+
+static const struct rte_bbdev_ops acc100_bbdev_ops = {
+	.close = acc100_dev_close,
+};
+
+/* ACC100 PCI PF address map */
+static struct rte_pci_id pci_id_acc100_pf_map[] = {
+	{
+		RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID, RTE_ACC100_PF_DEVICE_ID)
+	},
+	{.device_id = 0},
+};
+
+/* ACC100 PCI VF address map */
+static struct rte_pci_id pci_id_acc100_vf_map[] = {
+	{
+		RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID, RTE_ACC100_VF_DEVICE_ID)
+	},
+	{.device_id = 0},
+};
+
+/* Initialization Function */
+static void
+acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
+{
+	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
+
+	dev->dev_ops = &acc100_bbdev_ops;
+
+	((struct acc100_device *) dev->data->dev_private)->pf_device =
+			!strcmp(drv->driver.name,
+					RTE_STR(ACC100PF_DRIVER_NAME));
+	((struct acc100_device *) dev->data->dev_private)->mmio_base =
+			pci_dev->mem_resource[0].addr;
+
+	rte_bbdev_log_debug("Init device %s [%s] @ vaddr %p paddr %#"PRIx64"",
+			drv->driver.name, dev->data->name,
+			(void *)pci_dev->mem_resource[0].addr,
+			pci_dev->mem_resource[0].phys_addr);
+}
+
+static int acc100_pci_probe(struct rte_pci_driver *pci_drv,
+	struct rte_pci_device *pci_dev)
+{
+	struct rte_bbdev *bbdev = NULL;
+	char dev_name[RTE_BBDEV_NAME_MAX_LEN];
+
+	if (pci_dev == NULL) {
+		rte_bbdev_log(ERR, "NULL PCI device");
+		return -EINVAL;
+	}
+
+	rte_pci_device_name(&pci_dev->addr, dev_name, sizeof(dev_name));
+
+	/* Allocate memory to be used privately by drivers */
+	bbdev = rte_bbdev_allocate(pci_dev->device.name);
+	if (bbdev == NULL)
+		return -ENODEV;
+
+	/* allocate device private memory */
+	bbdev->data->dev_private = rte_zmalloc_socket(dev_name,
+			sizeof(struct acc100_device), RTE_CACHE_LINE_SIZE,
+			pci_dev->device.numa_node);
+
+	if (bbdev->data->dev_private == NULL) {
+		rte_bbdev_log(CRIT,
+				"Allocation of %zu bytes for device \"%s\" failed",
+				sizeof(struct acc100_device), dev_name);
+		rte_bbdev_release(bbdev);
+		return -ENOMEM;
+	}
+
+	/* Fill HW specific part of device structure */
+	bbdev->device = &pci_dev->device;
+	bbdev->intr_handle = &pci_dev->intr_handle;
+	bbdev->data->socket_id = pci_dev->device.numa_node;
+
+	/* Invoke ACC100 device initialization function */
+	acc100_bbdev_init(bbdev, pci_drv);
+
+	rte_bbdev_log_debug("Initialised bbdev %s (id = %u)",
+			dev_name, bbdev->data->dev_id);
+	return 0;
+}
+
+static int acc100_pci_remove(struct rte_pci_device *pci_dev)
+{
+	struct rte_bbdev *bbdev;
+	int ret;
+	uint8_t dev_id;
+
+	if (pci_dev == NULL)
+		return -EINVAL;
+
+	/* Find device */
+	bbdev = rte_bbdev_get_named_dev(pci_dev->device.name);
+	if (bbdev == NULL) {
+		rte_bbdev_log(CRIT,
+				"Couldn't find HW dev \"%s\" to uninitialise it",
+				pci_dev->device.name);
+		return -ENODEV;
+	}
+	dev_id = bbdev->data->dev_id;
+
+	/* free device private memory before close */
+	rte_free(bbdev->data->dev_private);
+
+	/* Close device */
+	ret = rte_bbdev_close(dev_id);
+	if (ret < 0)
+		rte_bbdev_log(ERR,
+				"Device %i failed to close during uninit: %i",
+				dev_id, ret);
+
+	/* release bbdev from library */
+	rte_bbdev_release(bbdev);
+
+	rte_bbdev_log_debug("Destroyed bbdev = %u", dev_id);
+
+	return 0;
+}
+
+static struct rte_pci_driver acc100_pci_pf_driver = {
+		.probe = acc100_pci_probe,
+		.remove = acc100_pci_remove,
+		.id_table = pci_id_acc100_pf_map,
+		.drv_flags = RTE_PCI_DRV_NEED_MAPPING
+};
+
+static struct rte_pci_driver acc100_pci_vf_driver = {
+		.probe = acc100_pci_probe,
+		.remove = acc100_pci_remove,
+		.id_table = pci_id_acc100_vf_map,
+		.drv_flags = RTE_PCI_DRV_NEED_MAPPING
+};
+
+RTE_PMD_REGISTER_PCI(ACC100PF_DRIVER_NAME, acc100_pci_pf_driver);
+RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
+RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
+RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
+
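Both PCI drivers registered above share a single probe/init path; the PF/VF role is recovered purely from which ``rte_pci_driver`` matched, via the driver-name comparison in ``acc100_bbdev_init()``. A minimal standalone sketch of that check (the string literals are the stringified forms of the ``ACC100PF_DRIVER_NAME``/``ACC100VF_DRIVER_NAME`` tokens):

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Driver names as they appear after RTE_STR() stringification in the PMD. */
static const char pf_name[] = "intel_acc100_pf";
static const char vf_name[] = "intel_acc100_vf";

/* Mirrors the pf_device assignment in acc100_bbdev_init(): the probe
 * path is common to both drivers, so the device role is derived from
 * the name of the driver that matched. */
static bool
acc100_is_pf(const char *matched_driver_name)
{
	return strcmp(matched_driver_name, pf_name) == 0;
}
```

This keeps the probe and remove callbacks identical for PF and VF, with only the PCI ID tables and driver names differing.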
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
new file mode 100644
index 0000000..6f46df0
--- /dev/null
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -0,0 +1,37 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_ACC100_PMD_H_
+#define _RTE_ACC100_PMD_H_
+
+/* Helper macro for logging */
+#define rte_bbdev_log(level, fmt, ...) \
+	rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \
+		##__VA_ARGS__)
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+#define rte_bbdev_log_debug(fmt, ...) \
+		rte_bbdev_log(DEBUG, "acc100_pmd: " fmt, \
+		##__VA_ARGS__)
+#else
+#define rte_bbdev_log_debug(fmt, ...)
+#endif
+
+/* ACC100 PF and VF driver names */
+#define ACC100PF_DRIVER_NAME           intel_acc100_pf
+#define ACC100VF_DRIVER_NAME           intel_acc100_vf
+
+/* ACC100 PCI vendor & device IDs */
+#define RTE_ACC100_VENDOR_ID           (0x8086)
+#define RTE_ACC100_PF_DEVICE_ID        (0x0d5c)
+#define RTE_ACC100_VF_DEVICE_ID        (0x0d5d)
+
+/* Private data structure for each ACC100 device */
+struct acc100_device {
+	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
+	bool pf_device; /**< True if this is a PF ACC100 device */
+	bool configured; /**< True if this ACC100 device is configured */
+};
+
+#endif /* _RTE_ACC100_PMD_H_ */
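The two-level logging macros in the header above follow a common pattern: always-on logging routes through the registered logtype, while debug logging compiles away entirely unless ``RTE_LIBRTE_BBDEV_DEBUG`` is defined. A self-contained sketch of that pattern (``demo_log``/``demo_log_calls`` are invented stand-ins for ``rte_log()``, used here only so the behavior can be observed without DPDK):

```c
#include <assert.h>
#include <stdio.h>

static int demo_log_calls; /* counts emitted log lines, for the example */

/* Stand-in for rte_log(): the real rte_bbdev_log macro likewise appends
 * "\n" to the format and routes through the acc100 logtype. */
#define demo_log(level, fmt, ...) \
	(demo_log_calls++, \
	 fprintf(stderr, #level ": " fmt "\n", ##__VA_ARGS__))

#ifdef RTE_LIBRTE_BBDEV_DEBUG
#define demo_log_debug(fmt, ...) \
	demo_log(DEBUG, "acc100_pmd: " fmt, ##__VA_ARGS__)
#else
/* In non-debug builds the debug macro expands to nothing, so debug
 * logging has zero runtime cost, as with rte_bbdev_log_debug. */
#define demo_log_debug(fmt, ...)
#endif
```

Because the debug variant vanishes at preprocessing time, its arguments are never evaluated in a release build, which is why the PMD can sprinkle verbose debug logs in hot paths.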
diff --git a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
new file mode 100644
index 0000000..4a76d1d
--- /dev/null
+++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
@@ -0,0 +1,3 @@
+DPDK_21 {
+	local: *;
+};
diff --git a/drivers/baseband/meson.build b/drivers/baseband/meson.build
index 415b672..72301ce 100644
--- a/drivers/baseband/meson.build
+++ b/drivers/baseband/meson.build
@@ -5,7 +5,7 @@ if is_windows
 	subdir_done()
 endif
 
-drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec']
+drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec', 'acc100']
 
 config_flag_fmt = 'RTE_LIBRTE_PMD_BBDEV_@0@'
 driver_name_fmt = 'rte_pmd_bbdev_@0@'
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v10 02/10] baseband/acc100: add register definition file
  2020-10-01  3:14   ` [dpdk-dev] [PATCH v10 00/10] bbdev PMD ACC100 Nicolas Chautru
  2020-10-01  3:14     ` [dpdk-dev] [PATCH v10 01/10] drivers/baseband: add PMD for ACC100 Nicolas Chautru
@ 2020-10-01  3:14     ` Nicolas Chautru
  2020-10-01  3:14     ` [dpdk-dev] [PATCH v10 03/10] baseband/acc100: add info get function Nicolas Chautru
                       ` (7 subsequent siblings)
  9 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-10-01  3:14 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, trix, ferruh.yigit, tianjiao.liu,
	Nicolas Chautru

Add the list of registers for the device and related
HW spec definitions.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 drivers/baseband/acc100/acc100_pf_enum.h | 1068 ++++++++++++++++++++++++++++++
 drivers/baseband/acc100/acc100_vf_enum.h |   73 ++
 drivers/baseband/acc100/rte_acc100_pmd.h |  487 ++++++++++++++
 3 files changed, 1628 insertions(+)
 create mode 100644 drivers/baseband/acc100/acc100_pf_enum.h
 create mode 100644 drivers/baseband/acc100/acc100_vf_enum.h

diff --git a/drivers/baseband/acc100/acc100_pf_enum.h b/drivers/baseband/acc100/acc100_pf_enum.h
new file mode 100644
index 0000000..a1ee416
--- /dev/null
+++ b/drivers/baseband/acc100/acc100_pf_enum.h
@@ -0,0 +1,1068 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017 Intel Corporation
+ */
+
+#ifndef ACC100_PF_ENUM_H
+#define ACC100_PF_ENUM_H
+
+/*
+ * ACC100 Register mapping on PF BAR0
+ * This is automatically generated from RDL, format may change with new RDL
+ * Release.
+ * Variable names are as is
+ */
+enum {
+	HWPfQmgrEgressQueuesTemplate          =  0x0007FE00,
+	HWPfQmgrIngressAq                     =  0x00080000,
+	HWPfQmgrArbQAvail                     =  0x00A00010,
+	HWPfQmgrArbQBlock                     =  0x00A00014,
+	HWPfQmgrAqueueDropNotifEn             =  0x00A00024,
+	HWPfQmgrAqueueDisableNotifEn          =  0x00A00028,
+	HWPfQmgrSoftReset                     =  0x00A00038,
+	HWPfQmgrInitStatus                    =  0x00A0003C,
+	HWPfQmgrAramWatchdogCount             =  0x00A00040,
+	HWPfQmgrAramWatchdogCounterEn         =  0x00A00044,
+	HWPfQmgrAxiWatchdogCount              =  0x00A00048,
+	HWPfQmgrAxiWatchdogCounterEn          =  0x00A0004C,
+	HWPfQmgrProcessWatchdogCount          =  0x00A00050,
+	HWPfQmgrProcessWatchdogCounterEn      =  0x00A00054,
+	HWPfQmgrProcessUl4GWatchdogCounter    =  0x00A00058,
+	HWPfQmgrProcessDl4GWatchdogCounter    =  0x00A0005C,
+	HWPfQmgrProcessUl5GWatchdogCounter    =  0x00A00060,
+	HWPfQmgrProcessDl5GWatchdogCounter    =  0x00A00064,
+	HWPfQmgrProcessMldWatchdogCounter     =  0x00A00068,
+	HWPfQmgrMsiOverflowUpperVf            =  0x00A00070,
+	HWPfQmgrMsiOverflowLowerVf            =  0x00A00074,
+	HWPfQmgrMsiWatchdogOverflow           =  0x00A00078,
+	HWPfQmgrMsiOverflowEnable             =  0x00A0007C,
+	HWPfQmgrDebugAqPointerMemGrp          =  0x00A00100,
+	HWPfQmgrDebugOutputArbQFifoGrp        =  0x00A00140,
+	HWPfQmgrDebugMsiFifoGrp               =  0x00A00180,
+	HWPfQmgrDebugAxiWdTimeoutMsiFifo      =  0x00A001C0,
+	HWPfQmgrDebugProcessWdTimeoutMsiFifo  =  0x00A001C4,
+	HWPfQmgrDepthLog2Grp                  =  0x00A00200,
+	HWPfQmgrTholdGrp                      =  0x00A00300,
+	HWPfQmgrGrpTmplateReg0Indx            =  0x00A00600,
+	HWPfQmgrGrpTmplateReg1Indx            =  0x00A00680,
+	HWPfQmgrGrpTmplateReg2indx            =  0x00A00700,
+	HWPfQmgrGrpTmplateReg3Indx            =  0x00A00780,
+	HWPfQmgrGrpTmplateReg4Indx            =  0x00A00800,
+	HWPfQmgrVfBaseAddr                    =  0x00A01000,
+	HWPfQmgrUl4GWeightRrVf                =  0x00A02000,
+	HWPfQmgrDl4GWeightRrVf                =  0x00A02100,
+	HWPfQmgrUl5GWeightRrVf                =  0x00A02200,
+	HWPfQmgrDl5GWeightRrVf                =  0x00A02300,
+	HWPfQmgrMldWeightRrVf                 =  0x00A02400,
+	HWPfQmgrArbQDepthGrp                  =  0x00A02F00,
+	HWPfQmgrGrpFunction0                  =  0x00A02F40,
+	HWPfQmgrGrpFunction1                  =  0x00A02F44,
+	HWPfQmgrGrpPriority                   =  0x00A02F48,
+	HWPfQmgrWeightSync                    =  0x00A03000,
+	HWPfQmgrAqEnableVf                    =  0x00A10000,
+	HWPfQmgrAqResetVf                     =  0x00A20000,
+	HWPfQmgrRingSizeVf                    =  0x00A20004,
+	HWPfQmgrGrpDepthLog20Vf               =  0x00A20008,
+	HWPfQmgrGrpDepthLog21Vf               =  0x00A2000C,
+	HWPfQmgrGrpFunction0Vf                =  0x00A20010,
+	HWPfQmgrGrpFunction1Vf                =  0x00A20014,
+	HWPfDmaConfig0Reg                     =  0x00B80000,
+	HWPfDmaConfig1Reg                     =  0x00B80004,
+	HWPfDmaQmgrAddrReg                    =  0x00B80008,
+	HWPfDmaSoftResetReg                   =  0x00B8000C,
+	HWPfDmaAxcacheReg                     =  0x00B80010,
+	HWPfDmaVersionReg                     =  0x00B80014,
+	HWPfDmaFrameThreshold                 =  0x00B80018,
+	HWPfDmaTimestampLo                    =  0x00B8001C,
+	HWPfDmaTimestampHi                    =  0x00B80020,
+	HWPfDmaAxiStatus                      =  0x00B80028,
+	HWPfDmaAxiControl                     =  0x00B8002C,
+	HWPfDmaNoQmgr                         =  0x00B80030,
+	HWPfDmaQosScale                       =  0x00B80034,
+	HWPfDmaQmanen                         =  0x00B80040,
+	HWPfDmaQmgrQosBase                    =  0x00B80060,
+	HWPfDmaFecClkGatingEnable             =  0x00B80080,
+	HWPfDmaPmEnable                       =  0x00B80084,
+	HWPfDmaQosEnable                      =  0x00B80088,
+	HWPfDmaHarqWeightedRrFrameThreshold   =  0x00B800B0,
+	HWPfDmaDataSmallWeightedRrFrameThresh  = 0x00B800B4,
+	HWPfDmaDataLargeWeightedRrFrameThresh  = 0x00B800B8,
+	HWPfDmaInboundCbMaxSize               =  0x00B800BC,
+	HWPfDmaInboundDrainDataSize           =  0x00B800C0,
+	HWPfDmaVfDdrBaseRw                    =  0x00B80400,
+	HWPfDmaCmplTmOutCnt                   =  0x00B80800,
+	HWPfDmaProcTmOutCnt                   =  0x00B80804,
+	HWPfDmaStatusRrespBresp               =  0x00B80810,
+	HWPfDmaCfgRrespBresp                  =  0x00B80814,
+	HWPfDmaStatusMemParErr                =  0x00B80818,
+	HWPfDmaCfgMemParErrEn                 =  0x00B8081C,
+	HWPfDmaStatusDmaHwErr                 =  0x00B80820,
+	HWPfDmaCfgDmaHwErrEn                  =  0x00B80824,
+	HWPfDmaStatusFecCoreErr               =  0x00B80828,
+	HWPfDmaCfgFecCoreErrEn                =  0x00B8082C,
+	HWPfDmaStatusFcwDescrErr              =  0x00B80830,
+	HWPfDmaCfgFcwDescrErrEn               =  0x00B80834,
+	HWPfDmaStatusBlockTransmit            =  0x00B80838,
+	HWPfDmaBlockOnErrEn                   =  0x00B8083C,
+	HWPfDmaStatusFlushDma                 =  0x00B80840,
+	HWPfDmaFlushDmaOnErrEn                =  0x00B80844,
+	HWPfDmaStatusSdoneFifoFull            =  0x00B80848,
+	HWPfDmaStatusDescriptorErrLoVf        =  0x00B8084C,
+	HWPfDmaStatusDescriptorErrHiVf        =  0x00B80850,
+	HWPfDmaStatusFcwErrLoVf               =  0x00B80854,
+	HWPfDmaStatusFcwErrHiVf               =  0x00B80858,
+	HWPfDmaStatusDataErrLoVf              =  0x00B8085C,
+	HWPfDmaStatusDataErrHiVf              =  0x00B80860,
+	HWPfDmaCfgMsiEnSoftwareErr            =  0x00B80864,
+	HWPfDmaDescriptorSignatuture          =  0x00B80868,
+	HWPfDmaFcwSignature                   =  0x00B8086C,
+	HWPfDmaErrorDetectionEn               =  0x00B80870,
+	HWPfDmaErrCntrlFifoDebug              =  0x00B8087C,
+	HWPfDmaStatusToutData                 =  0x00B80880,
+	HWPfDmaStatusToutDesc                 =  0x00B80884,
+	HWPfDmaStatusToutUnexpData            =  0x00B80888,
+	HWPfDmaStatusToutUnexpDesc            =  0x00B8088C,
+	HWPfDmaStatusToutProcess              =  0x00B80890,
+	HWPfDmaConfigCtoutOutDataEn           =  0x00B808A0,
+	HWPfDmaConfigCtoutOutDescrEn          =  0x00B808A4,
+	HWPfDmaConfigUnexpComplDataEn         =  0x00B808A8,
+	HWPfDmaConfigUnexpComplDescrEn        =  0x00B808AC,
+	HWPfDmaConfigPtoutOutEn               =  0x00B808B0,
+	HWPfDmaFec5GulDescBaseLoRegVf         =  0x00B88020,
+	HWPfDmaFec5GulDescBaseHiRegVf         =  0x00B88024,
+	HWPfDmaFec5GulRespPtrLoRegVf          =  0x00B88028,
+	HWPfDmaFec5GulRespPtrHiRegVf          =  0x00B8802C,
+	HWPfDmaFec5GdlDescBaseLoRegVf         =  0x00B88040,
+	HWPfDmaFec5GdlDescBaseHiRegVf         =  0x00B88044,
+	HWPfDmaFec5GdlRespPtrLoRegVf          =  0x00B88048,
+	HWPfDmaFec5GdlRespPtrHiRegVf          =  0x00B8804C,
+	HWPfDmaFec4GulDescBaseLoRegVf         =  0x00B88060,
+	HWPfDmaFec4GulDescBaseHiRegVf         =  0x00B88064,
+	HWPfDmaFec4GulRespPtrLoRegVf          =  0x00B88068,
+	HWPfDmaFec4GulRespPtrHiRegVf          =  0x00B8806C,
+	HWPfDmaFec4GdlDescBaseLoRegVf         =  0x00B88080,
+	HWPfDmaFec4GdlDescBaseHiRegVf         =  0x00B88084,
+	HWPfDmaFec4GdlRespPtrLoRegVf          =  0x00B88088,
+	HWPfDmaFec4GdlRespPtrHiRegVf          =  0x00B8808C,
+	HWPfDmaVfDdrBaseRangeRo               =  0x00B880A0,
+	HWPfQosmonACntrlReg                   =  0x00B90000,
+	HWPfQosmonAEvalOverflow0              =  0x00B90008,
+	HWPfQosmonAEvalOverflow1              =  0x00B9000C,
+	HWPfQosmonADivTerm                    =  0x00B90010,
+	HWPfQosmonATickTerm                   =  0x00B90014,
+	HWPfQosmonAEvalTerm                   =  0x00B90018,
+	HWPfQosmonAAveTerm                    =  0x00B9001C,
+	HWPfQosmonAForceEccErr                =  0x00B90020,
+	HWPfQosmonAEccErrDetect               =  0x00B90024,
+	HWPfQosmonAIterationConfig0Low        =  0x00B90060,
+	HWPfQosmonAIterationConfig0High       =  0x00B90064,
+	HWPfQosmonAIterationConfig1Low        =  0x00B90068,
+	HWPfQosmonAIterationConfig1High       =  0x00B9006C,
+	HWPfQosmonAIterationConfig2Low        =  0x00B90070,
+	HWPfQosmonAIterationConfig2High       =  0x00B90074,
+	HWPfQosmonAIterationConfig3Low        =  0x00B90078,
+	HWPfQosmonAIterationConfig3High       =  0x00B9007C,
+	HWPfQosmonAEvalMemAddr                =  0x00B90080,
+	HWPfQosmonAEvalMemData                =  0x00B90084,
+	HWPfQosmonAXaction                    =  0x00B900C0,
+	HWPfQosmonARemThres1Vf                =  0x00B90400,
+	HWPfQosmonAThres2Vf                   =  0x00B90404,
+	HWPfQosmonAWeiFracVf                  =  0x00B90408,
+	HWPfQosmonARrWeiVf                    =  0x00B9040C,
+	HWPfPermonACntrlRegVf                 =  0x00B98000,
+	HWPfPermonACountVf                    =  0x00B98008,
+	HWPfPermonAKCntLoVf                   =  0x00B98010,
+	HWPfPermonAKCntHiVf                   =  0x00B98014,
+	HWPfPermonADeltaCntLoVf               =  0x00B98020,
+	HWPfPermonADeltaCntHiVf               =  0x00B98024,
+	HWPfPermonAVersionReg                 =  0x00B9C000,
+	HWPfPermonACbControlFec               =  0x00B9C0F0,
+	HWPfPermonADltTimerLoFec              =  0x00B9C0F4,
+	HWPfPermonADltTimerHiFec              =  0x00B9C0F8,
+	HWPfPermonACbCountFec                 =  0x00B9C100,
+	HWPfPermonAAccExecTimerLoFec          =  0x00B9C104,
+	HWPfPermonAAccExecTimerHiFec          =  0x00B9C108,
+	HWPfPermonAExecTimerMinFec            =  0x00B9C200,
+	HWPfPermonAExecTimerMaxFec            =  0x00B9C204,
+	HWPfPermonAControlBusMon              =  0x00B9C400,
+	HWPfPermonAConfigBusMon               =  0x00B9C404,
+	HWPfPermonASkipCountBusMon            =  0x00B9C408,
+	HWPfPermonAMinLatBusMon               =  0x00B9C40C,
+	HWPfPermonAMaxLatBusMon               =  0x00B9C500,
+	HWPfPermonATotalLatLowBusMon          =  0x00B9C504,
+	HWPfPermonATotalLatUpperBusMon        =  0x00B9C508,
+	HWPfPermonATotalReqCntBusMon          =  0x00B9C50C,
+	HWPfQosmonBCntrlReg                   =  0x00BA0000,
+	HWPfQosmonBEvalOverflow0              =  0x00BA0008,
+	HWPfQosmonBEvalOverflow1              =  0x00BA000C,
+	HWPfQosmonBDivTerm                    =  0x00BA0010,
+	HWPfQosmonBTickTerm                   =  0x00BA0014,
+	HWPfQosmonBEvalTerm                   =  0x00BA0018,
+	HWPfQosmonBAveTerm                    =  0x00BA001C,
+	HWPfQosmonBForceEccErr                =  0x00BA0020,
+	HWPfQosmonBEccErrDetect               =  0x00BA0024,
+	HWPfQosmonBIterationConfig0Low        =  0x00BA0060,
+	HWPfQosmonBIterationConfig0High       =  0x00BA0064,
+	HWPfQosmonBIterationConfig1Low        =  0x00BA0068,
+	HWPfQosmonBIterationConfig1High       =  0x00BA006C,
+	HWPfQosmonBIterationConfig2Low        =  0x00BA0070,
+	HWPfQosmonBIterationConfig2High       =  0x00BA0074,
+	HWPfQosmonBIterationConfig3Low        =  0x00BA0078,
+	HWPfQosmonBIterationConfig3High       =  0x00BA007C,
+	HWPfQosmonBEvalMemAddr                =  0x00BA0080,
+	HWPfQosmonBEvalMemData                =  0x00BA0084,
+	HWPfQosmonBXaction                    =  0x00BA00C0,
+	HWPfQosmonBRemThres1Vf                =  0x00BA0400,
+	HWPfQosmonBThres2Vf                   =  0x00BA0404,
+	HWPfQosmonBWeiFracVf                  =  0x00BA0408,
+	HWPfQosmonBRrWeiVf                    =  0x00BA040C,
+	HWPfPermonBCntrlRegVf                 =  0x00BA8000,
+	HWPfPermonBCountVf                    =  0x00BA8008,
+	HWPfPermonBKCntLoVf                   =  0x00BA8010,
+	HWPfPermonBKCntHiVf                   =  0x00BA8014,
+	HWPfPermonBDeltaCntLoVf               =  0x00BA8020,
+	HWPfPermonBDeltaCntHiVf               =  0x00BA8024,
+	HWPfPermonBVersionReg                 =  0x00BAC000,
+	HWPfPermonBCbControlFec               =  0x00BAC0F0,
+	HWPfPermonBDltTimerLoFec              =  0x00BAC0F4,
+	HWPfPermonBDltTimerHiFec              =  0x00BAC0F8,
+	HWPfPermonBCbCountFec                 =  0x00BAC100,
+	HWPfPermonBAccExecTimerLoFec          =  0x00BAC104,
+	HWPfPermonBAccExecTimerHiFec          =  0x00BAC108,
+	HWPfPermonBExecTimerMinFec            =  0x00BAC200,
+	HWPfPermonBExecTimerMaxFec            =  0x00BAC204,
+	HWPfPermonBControlBusMon              =  0x00BAC400,
+	HWPfPermonBConfigBusMon               =  0x00BAC404,
+	HWPfPermonBSkipCountBusMon            =  0x00BAC408,
+	HWPfPermonBMinLatBusMon               =  0x00BAC40C,
+	HWPfPermonBMaxLatBusMon               =  0x00BAC500,
+	HWPfPermonBTotalLatLowBusMon          =  0x00BAC504,
+	HWPfPermonBTotalLatUpperBusMon        =  0x00BAC508,
+	HWPfPermonBTotalReqCntBusMon          =  0x00BAC50C,
+	/* 5G uplink FEC engines */
+	HWPfFecUl5gCntrlReg                   =  0x00BC0000,
+	HWPfFecUl5gI2MThreshReg               =  0x00BC0004,
+	HWPfFecUl5gVersionReg                 =  0x00BC0100,
+	HWPfFecUl5gFcwStatusReg               =  0x00BC0104,
+	HWPfFecUl5gWarnReg                    =  0x00BC0108,
+	HwPfFecUl5gIbDebugReg                 =  0x00BC0200,
+	HwPfFecUl5gObLlrDebugReg              =  0x00BC0204,
+	HwPfFecUl5gObHarqDebugReg             =  0x00BC0208,
+	HwPfFecUl5g1CntrlReg                  =  0x00BC1000,
+	HwPfFecUl5g1I2MThreshReg              =  0x00BC1004,
+	HwPfFecUl5g1VersionReg                =  0x00BC1100,
+	HwPfFecUl5g1FcwStatusReg              =  0x00BC1104,
+	HwPfFecUl5g1WarnReg                   =  0x00BC1108,
+	HwPfFecUl5g1IbDebugReg                =  0x00BC1200,
+	HwPfFecUl5g1ObLlrDebugReg             =  0x00BC1204,
+	HwPfFecUl5g1ObHarqDebugReg            =  0x00BC1208,
+	HwPfFecUl5g2CntrlReg                  =  0x00BC2000,
+	HwPfFecUl5g2I2MThreshReg              =  0x00BC2004,
+	HwPfFecUl5g2VersionReg                =  0x00BC2100,
+	HwPfFecUl5g2FcwStatusReg              =  0x00BC2104,
+	HwPfFecUl5g2WarnReg                   =  0x00BC2108,
+	HwPfFecUl5g2IbDebugReg                =  0x00BC2200,
+	HwPfFecUl5g2ObLlrDebugReg             =  0x00BC2204,
+	HwPfFecUl5g2ObHarqDebugReg            =  0x00BC2208,
+	HwPfFecUl5g3CntrlReg                  =  0x00BC3000,
+	HwPfFecUl5g3I2MThreshReg              =  0x00BC3004,
+	HwPfFecUl5g3VersionReg                =  0x00BC3100,
+	HwPfFecUl5g3FcwStatusReg              =  0x00BC3104,
+	HwPfFecUl5g3WarnReg                   =  0x00BC3108,
+	HwPfFecUl5g3IbDebugReg                =  0x00BC3200,
+	HwPfFecUl5g3ObLlrDebugReg             =  0x00BC3204,
+	HwPfFecUl5g3ObHarqDebugReg            =  0x00BC3208,
+	HwPfFecUl5g4CntrlReg                  =  0x00BC4000,
+	HwPfFecUl5g4I2MThreshReg              =  0x00BC4004,
+	HwPfFecUl5g4VersionReg                =  0x00BC4100,
+	HwPfFecUl5g4FcwStatusReg              =  0x00BC4104,
+	HwPfFecUl5g4WarnReg                   =  0x00BC4108,
+	HwPfFecUl5g4IbDebugReg                =  0x00BC4200,
+	HwPfFecUl5g4ObLlrDebugReg             =  0x00BC4204,
+	HwPfFecUl5g4ObHarqDebugReg            =  0x00BC4208,
+	HwPfFecUl5g5CntrlReg                  =  0x00BC5000,
+	HwPfFecUl5g5I2MThreshReg              =  0x00BC5004,
+	HwPfFecUl5g5VersionReg                =  0x00BC5100,
+	HwPfFecUl5g5FcwStatusReg              =  0x00BC5104,
+	HwPfFecUl5g5WarnReg                   =  0x00BC5108,
+	HwPfFecUl5g5IbDebugReg                =  0x00BC5200,
+	HwPfFecUl5g5ObLlrDebugReg             =  0x00BC5204,
+	HwPfFecUl5g5ObHarqDebugReg            =  0x00BC5208,
+	HwPfFecUl5g6CntrlReg                  =  0x00BC6000,
+	HwPfFecUl5g6I2MThreshReg              =  0x00BC6004,
+	HwPfFecUl5g6VersionReg                =  0x00BC6100,
+	HwPfFecUl5g6FcwStatusReg              =  0x00BC6104,
+	HwPfFecUl5g6WarnReg                   =  0x00BC6108,
+	HwPfFecUl5g6IbDebugReg                =  0x00BC6200,
+	HwPfFecUl5g6ObLlrDebugReg             =  0x00BC6204,
+	HwPfFecUl5g6ObHarqDebugReg            =  0x00BC6208,
+	HwPfFecUl5g7CntrlReg                  =  0x00BC7000,
+	HwPfFecUl5g7I2MThreshReg              =  0x00BC7004,
+	HwPfFecUl5g7VersionReg                =  0x00BC7100,
+	HwPfFecUl5g7FcwStatusReg              =  0x00BC7104,
+	HwPfFecUl5g7WarnReg                   =  0x00BC7108,
+	HwPfFecUl5g7IbDebugReg                =  0x00BC7200,
+	HwPfFecUl5g7ObLlrDebugReg             =  0x00BC7204,
+	HwPfFecUl5g7ObHarqDebugReg            =  0x00BC7208,
+	HwPfFecUl5g8CntrlReg                  =  0x00BC8000,
+	HwPfFecUl5g8I2MThreshReg              =  0x00BC8004,
+	HwPfFecUl5g8VersionReg                =  0x00BC8100,
+	HwPfFecUl5g8FcwStatusReg              =  0x00BC8104,
+	HwPfFecUl5g8WarnReg                   =  0x00BC8108,
+	HwPfFecUl5g8IbDebugReg                =  0x00BC8200,
+	HwPfFecUl5g8ObLlrDebugReg             =  0x00BC8204,
+	HwPfFecUl5g8ObHarqDebugReg            =  0x00BC8208,
+	/* 5G downlink FEC */
+	HWPfFecDl5gCntrlReg                   =  0x00BCF000,
+	HWPfFecDl5gI2MThreshReg               =  0x00BCF004,
+	HWPfFecDl5gVersionReg                 =  0x00BCF100,
+	HWPfFecDl5gFcwStatusReg               =  0x00BCF104,
+	HWPfFecDl5gWarnReg                    =  0x00BCF108,
+	/* 4G FEC */
+	HWPfFecUlVersionReg                   =  0x00BD0000,
+	HWPfFecUlControlReg                   =  0x00BD0004,
+	HWPfFecUlStatusReg                    =  0x00BD0008,
+	HWPfFecDlVersionReg                   =  0x00BDF000,
+	HWPfFecDlClusterConfigReg             =  0x00BDF004,
+	HWPfFecDlBurstThres                   =  0x00BDF00C,
+	HWPfFecDlClusterStatusReg0            =  0x00BDF040,
+	HWPfFecDlClusterStatusReg1            =  0x00BDF044,
+	HWPfFecDlClusterStatusReg2            =  0x00BDF048,
+	HWPfFecDlClusterStatusReg3            =  0x00BDF04C,
+	HWPfFecDlClusterStatusReg4            =  0x00BDF050,
+	HWPfFecDlClusterStatusReg5            =  0x00BDF054,
+	/* Clock/PLL configuration */
+	HWPfChaFabPllPllrst                   =  0x00C40000,
+	HWPfChaFabPllClk0                     =  0x00C40004,
+	HWPfChaFabPllClk1                     =  0x00C40008,
+	HWPfChaFabPllBwadj                    =  0x00C4000C,
+	HWPfChaFabPllLbw                      =  0x00C40010,
+	HWPfChaFabPllResetq                   =  0x00C40014,
+	HWPfChaFabPllPhshft0                  =  0x00C40018,
+	HWPfChaFabPllPhshft1                  =  0x00C4001C,
+	HWPfChaFabPllDivq0                    =  0x00C40020,
+	HWPfChaFabPllDivq1                    =  0x00C40024,
+	HWPfChaFabPllDivq2                    =  0x00C40028,
+	HWPfChaFabPllDivq3                    =  0x00C4002C,
+	HWPfChaFabPllDivq4                    =  0x00C40030,
+	HWPfChaFabPllDivq5                    =  0x00C40034,
+	HWPfChaFabPllDivq6                    =  0x00C40038,
+	HWPfChaFabPllDivq7                    =  0x00C4003C,
+	HWPfChaDl5gPllPllrst                  =  0x00C40080,
+	HWPfChaDl5gPllClk0                    =  0x00C40084,
+	HWPfChaDl5gPllClk1                    =  0x00C40088,
+	HWPfChaDl5gPllBwadj                   =  0x00C4008C,
+	HWPfChaDl5gPllLbw                     =  0x00C40090,
+	HWPfChaDl5gPllResetq                  =  0x00C40094,
+	HWPfChaDl5gPllPhshft0                 =  0x00C40098,
+	HWPfChaDl5gPllPhshft1                 =  0x00C4009C,
+	HWPfChaDl5gPllDivq0                   =  0x00C400A0,
+	HWPfChaDl5gPllDivq1                   =  0x00C400A4,
+	HWPfChaDl5gPllDivq2                   =  0x00C400A8,
+	HWPfChaDl5gPllDivq3                   =  0x00C400AC,
+	HWPfChaDl5gPllDivq4                   =  0x00C400B0,
+	HWPfChaDl5gPllDivq5                   =  0x00C400B4,
+	HWPfChaDl5gPllDivq6                   =  0x00C400B8,
+	HWPfChaDl5gPllDivq7                   =  0x00C400BC,
+	HWPfChaDl4gPllPllrst                  =  0x00C40100,
+	HWPfChaDl4gPllClk0                    =  0x00C40104,
+	HWPfChaDl4gPllClk1                    =  0x00C40108,
+	HWPfChaDl4gPllBwadj                   =  0x00C4010C,
+	HWPfChaDl4gPllLbw                     =  0x00C40110,
+	HWPfChaDl4gPllResetq                  =  0x00C40114,
+	HWPfChaDl4gPllPhshft0                 =  0x00C40118,
+	HWPfChaDl4gPllPhshft1                 =  0x00C4011C,
+	HWPfChaDl4gPllDivq0                   =  0x00C40120,
+	HWPfChaDl4gPllDivq1                   =  0x00C40124,
+	HWPfChaDl4gPllDivq2                   =  0x00C40128,
+	HWPfChaDl4gPllDivq3                   =  0x00C4012C,
+	HWPfChaDl4gPllDivq4                   =  0x00C40130,
+	HWPfChaDl4gPllDivq5                   =  0x00C40134,
+	HWPfChaDl4gPllDivq6                   =  0x00C40138,
+	HWPfChaDl4gPllDivq7                   =  0x00C4013C,
+	HWPfChaUl5gPllPllrst                  =  0x00C40180,
+	HWPfChaUl5gPllClk0                    =  0x00C40184,
+	HWPfChaUl5gPllClk1                    =  0x00C40188,
+	HWPfChaUl5gPllBwadj                   =  0x00C4018C,
+	HWPfChaUl5gPllLbw                     =  0x00C40190,
+	HWPfChaUl5gPllResetq                  =  0x00C40194,
+	HWPfChaUl5gPllPhshft0                 =  0x00C40198,
+	HWPfChaUl5gPllPhshft1                 =  0x00C4019C,
+	HWPfChaUl5gPllDivq0                   =  0x00C401A0,
+	HWPfChaUl5gPllDivq1                   =  0x00C401A4,
+	HWPfChaUl5gPllDivq2                   =  0x00C401A8,
+	HWPfChaUl5gPllDivq3                   =  0x00C401AC,
+	HWPfChaUl5gPllDivq4                   =  0x00C401B0,
+	HWPfChaUl5gPllDivq5                   =  0x00C401B4,
+	HWPfChaUl5gPllDivq6                   =  0x00C401B8,
+	HWPfChaUl5gPllDivq7                   =  0x00C401BC,
+	HWPfChaUl4gPllPllrst                  =  0x00C40200,
+	HWPfChaUl4gPllClk0                    =  0x00C40204,
+	HWPfChaUl4gPllClk1                    =  0x00C40208,
+	HWPfChaUl4gPllBwadj                   =  0x00C4020C,
+	HWPfChaUl4gPllLbw                     =  0x00C40210,
+	HWPfChaUl4gPllResetq                  =  0x00C40214,
+	HWPfChaUl4gPllPhshft0                 =  0x00C40218,
+	HWPfChaUl4gPllPhshft1                 =  0x00C4021C,
+	HWPfChaUl4gPllDivq0                   =  0x00C40220,
+	HWPfChaUl4gPllDivq1                   =  0x00C40224,
+	HWPfChaUl4gPllDivq2                   =  0x00C40228,
+	HWPfChaUl4gPllDivq3                   =  0x00C4022C,
+	HWPfChaUl4gPllDivq4                   =  0x00C40230,
+	HWPfChaUl4gPllDivq5                   =  0x00C40234,
+	HWPfChaUl4gPllDivq6                   =  0x00C40238,
+	HWPfChaUl4gPllDivq7                   =  0x00C4023C,
+	HWPfChaDdrPllPllrst                   =  0x00C40280,
+	HWPfChaDdrPllClk0                     =  0x00C40284,
+	HWPfChaDdrPllClk1                     =  0x00C40288,
+	HWPfChaDdrPllBwadj                    =  0x00C4028C,
+	HWPfChaDdrPllLbw                      =  0x00C40290,
+	HWPfChaDdrPllResetq                   =  0x00C40294,
+	HWPfChaDdrPllPhshft0                  =  0x00C40298,
+	HWPfChaDdrPllPhshft1                  =  0x00C4029C,
+	HWPfChaDdrPllDivq0                    =  0x00C402A0,
+	HWPfChaDdrPllDivq1                    =  0x00C402A4,
+	HWPfChaDdrPllDivq2                    =  0x00C402A8,
+	HWPfChaDdrPllDivq3                    =  0x00C402AC,
+	HWPfChaDdrPllDivq4                    =  0x00C402B0,
+	HWPfChaDdrPllDivq5                    =  0x00C402B4,
+	HWPfChaDdrPllDivq6                    =  0x00C402B8,
+	HWPfChaDdrPllDivq7                    =  0x00C402BC,
+	HWPfChaErrStatus                      =  0x00C40400,
+	HWPfChaErrMask                        =  0x00C40404,
+	HWPfChaDebugPcieMsiFifo               =  0x00C40410,
+	HWPfChaDebugDdrMsiFifo                =  0x00C40414,
+	HWPfChaDebugMiscMsiFifo               =  0x00C40418,
+	HWPfChaPwmSet                         =  0x00C40420,
+	HWPfChaDdrRstStatus                   =  0x00C40430,
+	HWPfChaDdrStDoneStatus                =  0x00C40434,
+	HWPfChaDdrWbRstCfg                    =  0x00C40438,
+	HWPfChaDdrApbRstCfg                   =  0x00C4043C,
+	HWPfChaDdrPhyRstCfg                   =  0x00C40440,
+	HWPfChaDdrCpuRstCfg                   =  0x00C40444,
+	HWPfChaDdrSifRstCfg                   =  0x00C40448,
+	HWPfChaPadcfgPcomp0                   =  0x00C41000,
+	HWPfChaPadcfgNcomp0                   =  0x00C41004,
+	HWPfChaPadcfgOdt0                     =  0x00C41008,
+	HWPfChaPadcfgProtect0                 =  0x00C4100C,
+	HWPfChaPreemphasisProtect0            =  0x00C41010,
+	HWPfChaPreemphasisCompen0             =  0x00C41040,
+	HWPfChaPreemphasisOdten0              =  0x00C41044,
+	HWPfChaPadcfgPcomp1                   =  0x00C41100,
+	HWPfChaPadcfgNcomp1                   =  0x00C41104,
+	HWPfChaPadcfgOdt1                     =  0x00C41108,
+	HWPfChaPadcfgProtect1                 =  0x00C4110C,
+	HWPfChaPreemphasisProtect1            =  0x00C41110,
+	HWPfChaPreemphasisCompen1             =  0x00C41140,
+	HWPfChaPreemphasisOdten1              =  0x00C41144,
+	HWPfChaPadcfgPcomp2                   =  0x00C41200,
+	HWPfChaPadcfgNcomp2                   =  0x00C41204,
+	HWPfChaPadcfgOdt2                     =  0x00C41208,
+	HWPfChaPadcfgProtect2                 =  0x00C4120C,
+	HWPfChaPreemphasisProtect2            =  0x00C41210,
+	HWPfChaPreemphasisCompen2             =  0x00C41240,
+	HWPfChaPreemphasisOdten2              =  0x00C41244,
+	HWPfChaPadcfgPcomp3                   =  0x00C41300,
+	HWPfChaPadcfgNcomp3                   =  0x00C41304,
+	HWPfChaPadcfgOdt3                     =  0x00C41308,
+	HWPfChaPadcfgProtect3                 =  0x00C4130C,
+	HWPfChaPreemphasisProtect3            =  0x00C41310,
+	HWPfChaPreemphasisCompen3             =  0x00C41340,
+	HWPfChaPreemphasisOdten3              =  0x00C41344,
+	HWPfChaPadcfgPcomp4                   =  0x00C41400,
+	HWPfChaPadcfgNcomp4                   =  0x00C41404,
+	HWPfChaPadcfgOdt4                     =  0x00C41408,
+	HWPfChaPadcfgProtect4                 =  0x00C4140C,
+	HWPfChaPreemphasisProtect4            =  0x00C41410,
+	HWPfChaPreemphasisCompen4             =  0x00C41440,
+	HWPfChaPreemphasisOdten4              =  0x00C41444,
+	/* Host interface: doorbells and info rings */
+	HWPfHiVfToPfDbellVf                   =  0x00C80000,
+	HWPfHiPfToVfDbellVf                   =  0x00C80008,
+	HWPfHiInfoRingBaseLoVf                =  0x00C80010,
+	HWPfHiInfoRingBaseHiVf                =  0x00C80014,
+	HWPfHiInfoRingPointerVf               =  0x00C80018,
+	HWPfHiInfoRingIntWrEnVf               =  0x00C80020,
+	HWPfHiInfoRingPf2VfWrEnVf             =  0x00C80024,
+	HWPfHiMsixVectorMapperVf              =  0x00C80060,
+	HWPfHiModuleVersionReg                =  0x00C84000,
+	HWPfHiIosf2axiErrLogReg               =  0x00C84004,
+	HWPfHiHardResetReg                    =  0x00C84008,
+	HWPfHi5GHardResetReg                  =  0x00C8400C,
+	HWPfHiInfoRingBaseLoRegPf             =  0x00C84010,
+	HWPfHiInfoRingBaseHiRegPf             =  0x00C84014,
+	HWPfHiInfoRingPointerRegPf            =  0x00C84018,
+	HWPfHiInfoRingIntWrEnRegPf            =  0x00C84020,
+	HWPfHiInfoRingVf2pfLoWrEnReg          =  0x00C84024,
+	HWPfHiInfoRingVf2pfHiWrEnReg          =  0x00C84028,
+	HWPfHiLogParityErrStatusReg           =  0x00C8402C,
+	HWPfHiLogDataParityErrorVfStatusLo    =  0x00C84030,
+	HWPfHiLogDataParityErrorVfStatusHi    =  0x00C84034,
+	HWPfHiBlockTransmitOnErrorEn          =  0x00C84038,
+	HWPfHiCfgMsiIntWrEnRegPf              =  0x00C84040,
+	HWPfHiCfgMsiVf2pfLoWrEnReg            =  0x00C84044,
+	HWPfHiCfgMsiVf2pfHighWrEnReg          =  0x00C84048,
+	HWPfHiMsixVectorMapperPf              =  0x00C84060,
+	HWPfHiApbWrWaitTime                   =  0x00C84100,
+	HWPfHiXCounterMaxValue                =  0x00C84104,
+	HWPfHiPfMode                          =  0x00C84108,
+	HWPfHiClkGateHystReg                  =  0x00C8410C,
+	HWPfHiSnoopBitsReg                    =  0x00C84110,
+	HWPfHiMsiDropEnableReg                =  0x00C84114,
+	HWPfHiMsiStatReg                      =  0x00C84120,
+	HWPfHiFifoOflStatReg                  =  0x00C84124,
+	HWPfHiHiDebugReg                      =  0x00C841F4,
+	HWPfHiDebugMemSnoopMsiFifo            =  0x00C841F8,
+	HWPfHiDebugMemSnoopInputFifo          =  0x00C841FC,
+	HWPfHiMsixMappingConfig               =  0x00C84200,
+	HWPfHiJunkReg                         =  0x00C8FF00,
+	/* DDR memory controller */
+	HWPfDdrUmmcVer                        =  0x00D00000,
+	HWPfDdrUmmcCap                        =  0x00D00010,
+	HWPfDdrUmmcCtrl                       =  0x00D00020,
+	HWPfDdrMpcPe                          =  0x00D00080,
+	HWPfDdrMpcPpri3                       =  0x00D00090,
+	HWPfDdrMpcPpri2                       =  0x00D000A0,
+	HWPfDdrMpcPpri1                       =  0x00D000B0,
+	HWPfDdrMpcPpri0                       =  0x00D000C0,
+	HWPfDdrMpcPrwgrpCtrl                  =  0x00D000D0,
+	HWPfDdrMpcPbw7                        =  0x00D000E0,
+	HWPfDdrMpcPbw6                        =  0x00D000F0,
+	HWPfDdrMpcPbw5                        =  0x00D00100,
+	HWPfDdrMpcPbw4                        =  0x00D00110,
+	HWPfDdrMpcPbw3                        =  0x00D00120,
+	HWPfDdrMpcPbw2                        =  0x00D00130,
+	HWPfDdrMpcPbw1                        =  0x00D00140,
+	HWPfDdrMpcPbw0                        =  0x00D00150,
+	HWPfDdrMemoryInit                     =  0x00D00200,
+	HWPfDdrMemoryInitDone                 =  0x00D00210,
+	HWPfDdrMemInitPhyTrng0                =  0x00D00240,
+	HWPfDdrMemInitPhyTrng1                =  0x00D00250,
+	HWPfDdrMemInitPhyTrng2                =  0x00D00260,
+	HWPfDdrMemInitPhyTrng3                =  0x00D00270,
+	HWPfDdrBcDram                         =  0x00D003C0,
+	HWPfDdrBcAddrMap                      =  0x00D003D0,
+	HWPfDdrBcRef                          =  0x00D003E0,
+	HWPfDdrBcTim0                         =  0x00D00400,
+	HWPfDdrBcTim1                         =  0x00D00410,
+	HWPfDdrBcTim2                         =  0x00D00420,
+	HWPfDdrBcTim3                         =  0x00D00430,
+	HWPfDdrBcTim4                         =  0x00D00440,
+	HWPfDdrBcTim5                         =  0x00D00450,
+	HWPfDdrBcTim6                         =  0x00D00460,
+	HWPfDdrBcTim7                         =  0x00D00470,
+	HWPfDdrBcTim8                         =  0x00D00480,
+	HWPfDdrBcTim9                         =  0x00D00490,
+	HWPfDdrBcTim10                        =  0x00D004A0,
+	HWPfDdrBcTim12                        =  0x00D004C0,
+	HWPfDdrDfiInit                        =  0x00D004D0,
+	HWPfDdrDfiInitComplete                =  0x00D004E0,
+	HWPfDdrDfiTim0                        =  0x00D004F0,
+	HWPfDdrDfiTim1                        =  0x00D00500,
+	HWPfDdrDfiPhyUpdEn                    =  0x00D00530,
+	HWPfDdrMemStatus                      =  0x00D00540,
+	HWPfDdrUmmcErrStatus                  =  0x00D00550,
+	HWPfDdrUmmcIntStatus                  =  0x00D00560,
+	HWPfDdrUmmcIntEn                      =  0x00D00570,
+	/* DDR PHY */
+	HWPfDdrPhyRdLatency                   =  0x00D48400,
+	HWPfDdrPhyRdLatencyDbi                =  0x00D48410,
+	HWPfDdrPhyWrLatency                   =  0x00D48420,
+	HWPfDdrPhyTrngType                    =  0x00D48430,
+	HWPfDdrPhyMrsTiming2                  =  0x00D48440,
+	HWPfDdrPhyMrsTiming0                  =  0x00D48450,
+	HWPfDdrPhyMrsTiming1                  =  0x00D48460,
+	HWPfDdrPhyDramTmrd                    =  0x00D48470,
+	HWPfDdrPhyDramTmod                    =  0x00D48480,
+	HWPfDdrPhyDramTwpre                   =  0x00D48490,
+	HWPfDdrPhyDramTrfc                    =  0x00D484A0,
+	HWPfDdrPhyDramTrwtp                   =  0x00D484B0,
+	HWPfDdrPhyMr01Dimm                    =  0x00D484C0,
+	HWPfDdrPhyMr01DimmDbi                 =  0x00D484D0,
+	HWPfDdrPhyMr23Dimm                    =  0x00D484E0,
+	HWPfDdrPhyMr45Dimm                    =  0x00D484F0,
+	HWPfDdrPhyMr67Dimm                    =  0x00D48500,
+	HWPfDdrPhyWrlvlWwRdlvlRr              =  0x00D48510,
+	HWPfDdrPhyOdtEn                       =  0x00D48520,
+	HWPfDdrPhyFastTrng                    =  0x00D48530,
+	HWPfDdrPhyDynTrngGap                  =  0x00D48540,
+	HWPfDdrPhyDynRcalGap                  =  0x00D48550,
+	HWPfDdrPhyIdletimeout                 =  0x00D48560,
+	HWPfDdrPhyRstCkeGap                   =  0x00D48570,
+	HWPfDdrPhyCkeMrsGap                   =  0x00D48580,
+	HWPfDdrPhyMemVrefMidVal               =  0x00D48590,
+	HWPfDdrPhyVrefStep                    =  0x00D485A0,
+	HWPfDdrPhyVrefThreshold               =  0x00D485B0,
+	HWPfDdrPhyPhyVrefMidVal               =  0x00D485C0,
+	HWPfDdrPhyDqsCountMax                 =  0x00D485D0,
+	HWPfDdrPhyDqsCountNum                 =  0x00D485E0,
+	HWPfDdrPhyDramRow                     =  0x00D485F0,
+	HWPfDdrPhyDramCol                     =  0x00D48600,
+	HWPfDdrPhyDramBgBa                    =  0x00D48610,
+	HWPfDdrPhyDynamicUpdreqrel            =  0x00D48620,
+	HWPfDdrPhyVrefLimits                  =  0x00D48630,
+	HWPfDdrPhyIdtmTcStatus                =  0x00D6C020,
+	HWPfDdrPhyIdtmFwVersion               =  0x00D6C410,
+	HWPfDdrPhyRdlvlGateInitDelay          =  0x00D70000,
+	HWPfDdrPhyRdenSmplabc                 =  0x00D70008,
+	HWPfDdrPhyVrefNibble0                 =  0x00D7000C,
+	HWPfDdrPhyVrefNibble1                 =  0x00D70010,
+	HWPfDdrPhyRdlvlGateDqsSmpl0           =  0x00D70014,
+	HWPfDdrPhyRdlvlGateDqsSmpl1           =  0x00D70018,
+	HWPfDdrPhyRdlvlGateDqsSmpl2           =  0x00D7001C,
+	HWPfDdrPhyDqsCount                    =  0x00D70020,
+	HWPfDdrPhyWrlvlRdlvlGateStatus        =  0x00D70024,
+	HWPfDdrPhyErrorFlags                  =  0x00D70028,
+	HWPfDdrPhyPowerDown                   =  0x00D70030,
+	HWPfDdrPhyPrbsSeedByte0               =  0x00D70034,
+	HWPfDdrPhyPrbsSeedByte1               =  0x00D70038,
+	HWPfDdrPhyPcompDq                     =  0x00D70040,
+	HWPfDdrPhyNcompDq                     =  0x00D70044,
+	HWPfDdrPhyPcompDqs                    =  0x00D70048,
+	HWPfDdrPhyNcompDqs                    =  0x00D7004C,
+	HWPfDdrPhyPcompCmd                    =  0x00D70050,
+	HWPfDdrPhyNcompCmd                    =  0x00D70054,
+	HWPfDdrPhyPcompCk                     =  0x00D70058,
+	HWPfDdrPhyNcompCk                     =  0x00D7005C,
+	HWPfDdrPhyRcalOdtDq                   =  0x00D70060,
+	HWPfDdrPhyRcalOdtDqs                  =  0x00D70064,
+	HWPfDdrPhyRcalMask1                   =  0x00D70068,
+	HWPfDdrPhyRcalMask2                   =  0x00D7006C,
+	HWPfDdrPhyRcalCtrl                    =  0x00D70070,
+	HWPfDdrPhyRcalCnt                     =  0x00D70074,
+	HWPfDdrPhyRcalOverride                =  0x00D70078,
+	HWPfDdrPhyRcalGateen                  =  0x00D7007C,
+	HWPfDdrPhyCtrl                        =  0x00D70080,
+	HWPfDdrPhyWrlvlAlg                    =  0x00D70084,
+	HWPfDdrPhyRcalVreftTxcmdOdt           =  0x00D70088,
+	HWPfDdrPhyRdlvlGateParam              =  0x00D7008C,
+	HWPfDdrPhyRdlvlGateParam2             =  0x00D70090,
+	HWPfDdrPhyRcalVreftTxdata             =  0x00D70094,
+	HWPfDdrPhyCmdIntDelay                 =  0x00D700A4,
+	HWPfDdrPhyAlertN                      =  0x00D700A8,
+	HWPfDdrPhyTrngReqWpre2tck             =  0x00D700AC,
+	HWPfDdrPhyCmdPhaseSel                 =  0x00D700B4,
+	HWPfDdrPhyCmdDcdl                     =  0x00D700B8,
+	HWPfDdrPhyCkDcdl                      =  0x00D700BC,
+	HWPfDdrPhySwTrngCtrl1                 =  0x00D700C0,
+	HWPfDdrPhySwTrngCtrl2                 =  0x00D700C4,
+	HWPfDdrPhyRcalPcompRden               =  0x00D700C8,
+	HWPfDdrPhyRcalNcompRden               =  0x00D700CC,
+	HWPfDdrPhyRcalCompen                  =  0x00D700D0,
+	HWPfDdrPhySwTrngRdqs                  =  0x00D700D4,
+	HWPfDdrPhySwTrngWdqs                  =  0x00D700D8,
+	HWPfDdrPhySwTrngRdena                 =  0x00D700DC,
+	HWPfDdrPhySwTrngRdenb                 =  0x00D700E0,
+	HWPfDdrPhySwTrngRdenc                 =  0x00D700E4,
+	HWPfDdrPhySwTrngWdq                   =  0x00D700E8,
+	HWPfDdrPhySwTrngRdq                   =  0x00D700EC,
+	HWPfDdrPhyPcfgHmValue                 =  0x00D700F0,
+	HWPfDdrPhyPcfgTimerValue              =  0x00D700F4,
+	HWPfDdrPhyPcfgSoftwareTraining        =  0x00D700F8,
+	HWPfDdrPhyPcfgMcStatus                =  0x00D700FC,
+	HWPfDdrPhyWrlvlPhRank0                =  0x00D70100,
+	HWPfDdrPhyRdenPhRank0                 =  0x00D70104,
+	HWPfDdrPhyRdenIntRank0                =  0x00D70108,
+	HWPfDdrPhyRdqsDcdlRank0               =  0x00D7010C,
+	HWPfDdrPhyRdqsShadowDcdlRank0         =  0x00D70110,
+	HWPfDdrPhyWdqsDcdlRank0               =  0x00D70114,
+	HWPfDdrPhyWdmDcdlShadowRank0          =  0x00D70118,
+	HWPfDdrPhyWdmDcdlRank0                =  0x00D7011C,
+	HWPfDdrPhyDbiDcdlRank0                =  0x00D70120,
+	HWPfDdrPhyRdenDcdlaRank0              =  0x00D70124,
+	HWPfDdrPhyDbiDcdlShadowRank0          =  0x00D70128,
+	HWPfDdrPhyRdenDcdlbRank0              =  0x00D7012C,
+	HWPfDdrPhyWdqsShadowDcdlRank0         =  0x00D70130,
+	HWPfDdrPhyRdenDcdlcRank0              =  0x00D70134,
+	HWPfDdrPhyRdenShadowDcdlaRank0        =  0x00D70138,
+	HWPfDdrPhyWrlvlIntRank0               =  0x00D7013C,
+	HWPfDdrPhyRdqDcdlBit0Rank0            =  0x00D70200,
+	HWPfDdrPhyRdqDcdlShadowBit0Rank0      =  0x00D70204,
+	HWPfDdrPhyWdqDcdlBit0Rank0            =  0x00D70208,
+	HWPfDdrPhyWdqDcdlShadowBit0Rank0      =  0x00D7020C,
+	HWPfDdrPhyRdqDcdlBit1Rank0            =  0x00D70240,
+	HWPfDdrPhyRdqDcdlShadowBit1Rank0      =  0x00D70244,
+	HWPfDdrPhyWdqDcdlBit1Rank0            =  0x00D70248,
+	HWPfDdrPhyWdqDcdlShadowBit1Rank0      =  0x00D7024C,
+	HWPfDdrPhyRdqDcdlBit2Rank0            =  0x00D70280,
+	HWPfDdrPhyRdqDcdlShadowBit2Rank0      =  0x00D70284,
+	HWPfDdrPhyWdqDcdlBit2Rank0            =  0x00D70288,
+	HWPfDdrPhyWdqDcdlShadowBit2Rank0      =  0x00D7028C,
+	HWPfDdrPhyRdqDcdlBit3Rank0            =  0x00D702C0,
+	HWPfDdrPhyRdqDcdlShadowBit3Rank0      =  0x00D702C4,
+	HWPfDdrPhyWdqDcdlBit3Rank0            =  0x00D702C8,
+	HWPfDdrPhyWdqDcdlShadowBit3Rank0      =  0x00D702CC,
+	HWPfDdrPhyRdqDcdlBit4Rank0            =  0x00D70300,
+	HWPfDdrPhyRdqDcdlShadowBit4Rank0      =  0x00D70304,
+	HWPfDdrPhyWdqDcdlBit4Rank0            =  0x00D70308,
+	HWPfDdrPhyWdqDcdlShadowBit4Rank0      =  0x00D7030C,
+	HWPfDdrPhyRdqDcdlBit5Rank0            =  0x00D70340,
+	HWPfDdrPhyRdqDcdlShadowBit5Rank0      =  0x00D70344,
+	HWPfDdrPhyWdqDcdlBit5Rank0            =  0x00D70348,
+	HWPfDdrPhyWdqDcdlShadowBit5Rank0      =  0x00D7034C,
+	HWPfDdrPhyRdqDcdlBit6Rank0            =  0x00D70380,
+	HWPfDdrPhyRdqDcdlShadowBit6Rank0      =  0x00D70384,
+	HWPfDdrPhyWdqDcdlBit6Rank0            =  0x00D70388,
+	HWPfDdrPhyWdqDcdlShadowBit6Rank0      =  0x00D7038C,
+	HWPfDdrPhyRdqDcdlBit7Rank0            =  0x00D703C0,
+	HWPfDdrPhyRdqDcdlShadowBit7Rank0      =  0x00D703C4,
+	HWPfDdrPhyWdqDcdlBit7Rank0            =  0x00D703C8,
+	HWPfDdrPhyWdqDcdlShadowBit7Rank0      =  0x00D703CC,
+	HWPfDdrPhyIdtmStatus                  =  0x00D740D0,
+	HWPfDdrPhyIdtmError                   =  0x00D74110,
+	HWPfDdrPhyIdtmDebug                   =  0x00D74120,
+	HWPfDdrPhyIdtmDebugInt                =  0x00D74130,
+	/* PCIe lane tuning */
+	HwPfPcieLnAsicCfgovr                  =  0x00D80000,
+	HwPfPcieLnAclkmixer                   =  0x00D80004,
+	HwPfPcieLnTxrampfreq                  =  0x00D80008,
+	HwPfPcieLnLanetest                    =  0x00D8000C,
+	HwPfPcieLnDcctrl                      =  0x00D80010,
+	HwPfPcieLnDccmeas                     =  0x00D80014,
+	HwPfPcieLnDccovrAclk                  =  0x00D80018,
+	HwPfPcieLnDccovrTxa                   =  0x00D8001C,
+	HwPfPcieLnDccovrTxk                   =  0x00D80020,
+	HwPfPcieLnDccovrDclk                  =  0x00D80024,
+	HwPfPcieLnDccovrEclk                  =  0x00D80028,
+	HwPfPcieLnDcctrimAclk                 =  0x00D8002C,
+	HwPfPcieLnDcctrimTx                   =  0x00D80030,
+	HwPfPcieLnDcctrimDclk                 =  0x00D80034,
+	HwPfPcieLnDcctrimEclk                 =  0x00D80038,
+	HwPfPcieLnQuadCtrl                    =  0x00D8003C,
+	HwPfPcieLnQuadCorrIndex               =  0x00D80040,
+	HwPfPcieLnQuadCorrStatus              =  0x00D80044,
+	HwPfPcieLnAsicRxovr1                  =  0x00D80048,
+	HwPfPcieLnAsicRxovr2                  =  0x00D8004C,
+	HwPfPcieLnAsicEqinfovr                =  0x00D80050,
+	HwPfPcieLnRxcsr                       =  0x00D80054,
+	HwPfPcieLnRxfectrl                    =  0x00D80058,
+	HwPfPcieLnRxtest                      =  0x00D8005C,
+	HwPfPcieLnEscount                     =  0x00D80060,
+	HwPfPcieLnCdrctrl                     =  0x00D80064,
+	HwPfPcieLnCdrctrl2                    =  0x00D80068,
+	HwPfPcieLnCdrcfg0Ctrl0                =  0x00D8006C,
+	HwPfPcieLnCdrcfg0Ctrl1                =  0x00D80070,
+	HwPfPcieLnCdrcfg0Ctrl2                =  0x00D80074,
+	HwPfPcieLnCdrcfg1Ctrl0                =  0x00D80078,
+	HwPfPcieLnCdrcfg1Ctrl1                =  0x00D8007C,
+	HwPfPcieLnCdrcfg1Ctrl2                =  0x00D80080,
+	HwPfPcieLnCdrcfg2Ctrl0                =  0x00D80084,
+	HwPfPcieLnCdrcfg2Ctrl1                =  0x00D80088,
+	HwPfPcieLnCdrcfg2Ctrl2                =  0x00D8008C,
+	HwPfPcieLnCdrcfg3Ctrl0                =  0x00D80090,
+	HwPfPcieLnCdrcfg3Ctrl1                =  0x00D80094,
+	HwPfPcieLnCdrcfg3Ctrl2                =  0x00D80098,
+	HwPfPcieLnCdrphase                    =  0x00D8009C,
+	HwPfPcieLnCdrfreq                     =  0x00D800A0,
+	HwPfPcieLnCdrstatusPhase              =  0x00D800A4,
+	HwPfPcieLnCdrstatusFreq               =  0x00D800A8,
+	HwPfPcieLnCdroffset                   =  0x00D800AC,
+	HwPfPcieLnRxvosctl                    =  0x00D800B0,
+	HwPfPcieLnRxvosctl2                   =  0x00D800B4,
+	HwPfPcieLnRxlosctl                    =  0x00D800B8,
+	HwPfPcieLnRxlos                       =  0x00D800BC,
+	HwPfPcieLnRxlosvval                   =  0x00D800C0,
+	HwPfPcieLnRxvosd0                     =  0x00D800C4,
+	HwPfPcieLnRxvosd1                     =  0x00D800C8,
+	HwPfPcieLnRxvosep0                    =  0x00D800CC,
+	HwPfPcieLnRxvosep1                    =  0x00D800D0,
+	HwPfPcieLnRxvosen0                    =  0x00D800D4,
+	HwPfPcieLnRxvosen1                    =  0x00D800D8,
+	HwPfPcieLnRxvosafe                    =  0x00D800DC,
+	HwPfPcieLnRxvosa0                     =  0x00D800E0,
+	HwPfPcieLnRxvosa0Out                  =  0x00D800E4,
+	HwPfPcieLnRxvosa1                     =  0x00D800E8,
+	HwPfPcieLnRxvosa1Out                  =  0x00D800EC,
+	HwPfPcieLnRxmisc                      =  0x00D800F0,
+	HwPfPcieLnRxbeacon                    =  0x00D800F4,
+	HwPfPcieLnRxdssout                    =  0x00D800F8,
+	HwPfPcieLnRxdssout2                   =  0x00D800FC,
+	HwPfPcieLnAlphapctrl                  =  0x00D80100,
+	HwPfPcieLnAlphanctrl                  =  0x00D80104,
+	HwPfPcieLnAdaptctrl                   =  0x00D80108,
+	HwPfPcieLnAdaptctrl1                  =  0x00D8010C,
+	HwPfPcieLnAdaptstatus                 =  0x00D80110,
+	HwPfPcieLnAdaptvga1                   =  0x00D80114,
+	HwPfPcieLnAdaptvga2                   =  0x00D80118,
+	HwPfPcieLnAdaptvga3                   =  0x00D8011C,
+	HwPfPcieLnAdaptvga4                   =  0x00D80120,
+	HwPfPcieLnAdaptboost1                 =  0x00D80124,
+	HwPfPcieLnAdaptboost2                 =  0x00D80128,
+	HwPfPcieLnAdaptboost3                 =  0x00D8012C,
+	HwPfPcieLnAdaptboost4                 =  0x00D80130,
+	HwPfPcieLnAdaptsslms1                 =  0x00D80134,
+	HwPfPcieLnAdaptsslms2                 =  0x00D80138,
+	HwPfPcieLnAdaptvgaStatus              =  0x00D8013C,
+	HwPfPcieLnAdaptboostStatus            =  0x00D80140,
+	HwPfPcieLnAdaptsslmsStatus1           =  0x00D80144,
+	HwPfPcieLnAdaptsslmsStatus2           =  0x00D80148,
+	HwPfPcieLnAfectrl1                    =  0x00D8014C,
+	HwPfPcieLnAfectrl2                    =  0x00D80150,
+	HwPfPcieLnAfectrl3                    =  0x00D80154,
+	HwPfPcieLnAfedefault1                 =  0x00D80158,
+	HwPfPcieLnAfedefault2                 =  0x00D8015C,
+	HwPfPcieLnDfectrl1                    =  0x00D80160,
+	HwPfPcieLnDfectrl2                    =  0x00D80164,
+	HwPfPcieLnDfectrl3                    =  0x00D80168,
+	HwPfPcieLnDfectrl4                    =  0x00D8016C,
+	HwPfPcieLnDfectrl5                    =  0x00D80170,
+	HwPfPcieLnDfectrl6                    =  0x00D80174,
+	HwPfPcieLnAfestatus1                  =  0x00D80178,
+	HwPfPcieLnAfestatus2                  =  0x00D8017C,
+	HwPfPcieLnDfestatus1                  =  0x00D80180,
+	HwPfPcieLnDfestatus2                  =  0x00D80184,
+	HwPfPcieLnDfestatus3                  =  0x00D80188,
+	HwPfPcieLnDfestatus4                  =  0x00D8018C,
+	HwPfPcieLnDfestatus5                  =  0x00D80190,
+	HwPfPcieLnAlphastatus                 =  0x00D80194,
+	HwPfPcieLnFomctrl1                    =  0x00D80198,
+	HwPfPcieLnFomctrl2                    =  0x00D8019C,
+	HwPfPcieLnFomctrl3                    =  0x00D801A0,
+	HwPfPcieLnAclkcalStatus               =  0x00D801A4,
+	HwPfPcieLnOffscorrStatus              =  0x00D801A8,
+	HwPfPcieLnEyewidthStatus              =  0x00D801AC,
+	HwPfPcieLnEyeheightStatus             =  0x00D801B0,
+	HwPfPcieLnAsicTxovr1                  =  0x00D801B4,
+	HwPfPcieLnAsicTxovr2                  =  0x00D801B8,
+	HwPfPcieLnAsicTxovr3                  =  0x00D801BC,
+	HwPfPcieLnTxbiasadjOvr                =  0x00D801C0,
+	HwPfPcieLnTxcsr                       =  0x00D801C4,
+	HwPfPcieLnTxtest                      =  0x00D801C8,
+	HwPfPcieLnTxtestword                  =  0x00D801CC,
+	HwPfPcieLnTxtestwordHigh              =  0x00D801D0,
+	HwPfPcieLnTxdrive                     =  0x00D801D4,
+	HwPfPcieLnMtcsLn                      =  0x00D801D8,
+	HwPfPcieLnStatsumLn                   =  0x00D801DC,
+	HwPfPcieLnRcbusScratch                =  0x00D801E0,
+	HwPfPcieLnRcbusMinorrev               =  0x00D801F0,
+	HwPfPcieLnRcbusMajorrev               =  0x00D801F4,
+	HwPfPcieLnRcbusBlocktype              =  0x00D801F8,
+	HwPfPcieSupPllcsr                     =  0x00D80800,
+	HwPfPcieSupPlldiv                     =  0x00D80804,
+	HwPfPcieSupPllcal                     =  0x00D80808,
+	HwPfPcieSupPllcalsts                  =  0x00D8080C,
+	HwPfPcieSupPllmeas                    =  0x00D80810,
+	HwPfPcieSupPlldactrim                 =  0x00D80814,
+	HwPfPcieSupPllbiastrim                =  0x00D80818,
+	HwPfPcieSupPllbwtrim                  =  0x00D8081C,
+	HwPfPcieSupPllcaldly                  =  0x00D80820,
+	HwPfPcieSupRefclkonpclkctrl           =  0x00D80824,
+	HwPfPcieSupPclkdelay                  =  0x00D80828,
+	HwPfPcieSupPhyconfig                  =  0x00D8082C,
+	HwPfPcieSupRcalIntf                   =  0x00D80830,
+	HwPfPcieSupAuxcsr                     =  0x00D80834,
+	HwPfPcieSupVref                       =  0x00D80838,
+	HwPfPcieSupLinkmode                   =  0x00D8083C,
+	HwPfPcieSupRrefcalctl                 =  0x00D80840,
+	HwPfPcieSupRrefcal                    =  0x00D80844,
+	HwPfPcieSupRrefcaldly                 =  0x00D80848,
+	HwPfPcieSupTximpcalctl                =  0x00D8084C,
+	HwPfPcieSupTximpcal                   =  0x00D80850,
+	HwPfPcieSupTximpoffset                =  0x00D80854,
+	HwPfPcieSupTximpcaldly                =  0x00D80858,
+	HwPfPcieSupRximpcalctl                =  0x00D8085C,
+	HwPfPcieSupRximpcal                   =  0x00D80860,
+	HwPfPcieSupRximpoffset                =  0x00D80864,
+	HwPfPcieSupRximpcaldly                =  0x00D80868,
+	HwPfPcieSupFence                      =  0x00D8086C,
+	HwPfPcieSupMtcs                       =  0x00D80870,
+	HwPfPcieSupStatsum                    =  0x00D809B8,
+	HwPfPciePcsDpStatus0                  =  0x00D81000,
+	HwPfPciePcsDpControl0                 =  0x00D81004,
+	HwPfPciePcsPmaStatusLane0             =  0x00D81008,
+	HwPfPciePcsPipeStatusLane0            =  0x00D8100C,
+	HwPfPciePcsTxdeemph0Lane0             =  0x00D81010,
+	HwPfPciePcsTxdeemph1Lane0             =  0x00D81014,
+	HwPfPciePcsInternalStatusLane0        =  0x00D81018,
+	HwPfPciePcsDpStatus1                  =  0x00D8101C,
+	HwPfPciePcsDpControl1                 =  0x00D81020,
+	HwPfPciePcsPmaStatusLane1             =  0x00D81024,
+	HwPfPciePcsPipeStatusLane1            =  0x00D81028,
+	HwPfPciePcsTxdeemph0Lane1             =  0x00D8102C,
+	HwPfPciePcsTxdeemph1Lane1             =  0x00D81030,
+	HwPfPciePcsInternalStatusLane1        =  0x00D81034,
+	HwPfPciePcsDpStatus2                  =  0x00D81038,
+	HwPfPciePcsDpControl2                 =  0x00D8103C,
+	HwPfPciePcsPmaStatusLane2             =  0x00D81040,
+	HwPfPciePcsPipeStatusLane2            =  0x00D81044,
+	HwPfPciePcsTxdeemph0Lane2             =  0x00D81048,
+	HwPfPciePcsTxdeemph1Lane2             =  0x00D8104C,
+	HwPfPciePcsInternalStatusLane2        =  0x00D81050,
+	HwPfPciePcsDpStatus3                  =  0x00D81054,
+	HwPfPciePcsDpControl3                 =  0x00D81058,
+	HwPfPciePcsPmaStatusLane3             =  0x00D8105C,
+	HwPfPciePcsPipeStatusLane3            =  0x00D81060,
+	HwPfPciePcsTxdeemph0Lane3             =  0x00D81064,
+	HwPfPciePcsTxdeemph1Lane3             =  0x00D81068,
+	HwPfPciePcsInternalStatusLane3        =  0x00D8106C,
+	HwPfPciePcsEbStatus0                  =  0x00D81070,
+	HwPfPciePcsEbStatus1                  =  0x00D81074,
+	HwPfPciePcsEbStatus2                  =  0x00D81078,
+	HwPfPciePcsEbStatus3                  =  0x00D8107C,
+	HwPfPciePcsPllSettingPcieG1           =  0x00D81088,
+	HwPfPciePcsPllSettingPcieG2           =  0x00D8108C,
+	HwPfPciePcsPllSettingPcieG3           =  0x00D81090,
+	HwPfPciePcsControl                    =  0x00D81094,
+	HwPfPciePcsEqControl                  =  0x00D81098,
+	HwPfPciePcsEqTimer                    =  0x00D8109C,
+	HwPfPciePcsEqErrStatus                =  0x00D810A0,
+	HwPfPciePcsEqErrCount                 =  0x00D810A4,
+	HwPfPciePcsStatus                     =  0x00D810A8,
+	HwPfPciePcsMiscRegister               =  0x00D810AC,
+	HwPfPciePcsObsControl                 =  0x00D810B0,
+	HwPfPciePcsPrbsCount0                 =  0x00D81200,
+	HwPfPciePcsBistControl0               =  0x00D81204,
+	HwPfPciePcsBistStaticWord00           =  0x00D81208,
+	HwPfPciePcsBistStaticWord10           =  0x00D8120C,
+	HwPfPciePcsBistStaticWord20           =  0x00D81210,
+	HwPfPciePcsBistStaticWord30           =  0x00D81214,
+	HwPfPciePcsPrbsCount1                 =  0x00D81220,
+	HwPfPciePcsBistControl1               =  0x00D81224,
+	HwPfPciePcsBistStaticWord01           =  0x00D81228,
+	HwPfPciePcsBistStaticWord11           =  0x00D8122C,
+	HwPfPciePcsBistStaticWord21           =  0x00D81230,
+	HwPfPciePcsBistStaticWord31           =  0x00D81234,
+	HwPfPciePcsPrbsCount2                 =  0x00D81240,
+	HwPfPciePcsBistControl2               =  0x00D81244,
+	HwPfPciePcsBistStaticWord02           =  0x00D81248,
+	HwPfPciePcsBistStaticWord12           =  0x00D8124C,
+	HwPfPciePcsBistStaticWord22           =  0x00D81250,
+	HwPfPciePcsBistStaticWord32           =  0x00D81254,
+	HwPfPciePcsPrbsCount3                 =  0x00D81260,
+	HwPfPciePcsBistControl3               =  0x00D81264,
+	HwPfPciePcsBistStaticWord03           =  0x00D81268,
+	HwPfPciePcsBistStaticWord13           =  0x00D8126C,
+	HwPfPciePcsBistStaticWord23           =  0x00D81270,
+	HwPfPciePcsBistStaticWord33           =  0x00D81274,
+	HwPfPcieGpexLtssmStateCntrl           =  0x00D90400,
+	HwPfPcieGpexLtssmStateStatus          =  0x00D90404,
+	HwPfPcieGpexSkipFreqTimer             =  0x00D90408,
+	HwPfPcieGpexLaneSelect                =  0x00D9040C,
+	HwPfPcieGpexLaneDeskew                =  0x00D90410,
+	HwPfPcieGpexRxErrorStatus             =  0x00D90414,
+	HwPfPcieGpexLaneNumControl            =  0x00D90418,
+	HwPfPcieGpexNFstControl               =  0x00D9041C,
+	HwPfPcieGpexLinkStatus                =  0x00D90420,
+	HwPfPcieGpexAckReplayTimeout          =  0x00D90438,
+	HwPfPcieGpexSeqNumberStatus           =  0x00D9043C,
+	HwPfPcieGpexCoreClkRatio              =  0x00D90440,
+	HwPfPcieGpexDllTholdControl           =  0x00D90448,
+	HwPfPcieGpexPmTimer                   =  0x00D90450,
+	HwPfPcieGpexPmeTimeout                =  0x00D90454,
+	HwPfPcieGpexAspmL1Timer               =  0x00D90458,
+	HwPfPcieGpexAspmReqTimer              =  0x00D9045C,
+	HwPfPcieGpexAspmL1Dis                 =  0x00D90460,
+	HwPfPcieGpexAdvisoryErrorControl      =  0x00D90468,
+	HwPfPcieGpexId                        =  0x00D90470,
+	HwPfPcieGpexClasscode                 =  0x00D90474,
+	HwPfPcieGpexSubsystemId               =  0x00D90478,
+	HwPfPcieGpexDeviceCapabilities        =  0x00D9047C,
+	HwPfPcieGpexLinkCapabilities          =  0x00D90480,
+	HwPfPcieGpexFunctionNumber            =  0x00D90484,
+	HwPfPcieGpexPmCapabilities            =  0x00D90488,
+	HwPfPcieGpexFunctionSelect            =  0x00D9048C,
+	HwPfPcieGpexErrorCounter              =  0x00D904AC,
+	HwPfPcieGpexConfigReady               =  0x00D904B0,
+	HwPfPcieGpexFcUpdateTimeout           =  0x00D904B8,
+	HwPfPcieGpexFcUpdateTimer             =  0x00D904BC,
+	HwPfPcieGpexVcBufferLoad              =  0x00D904C8,
+	HwPfPcieGpexVcBufferSizeThold         =  0x00D904CC,
+	HwPfPcieGpexVcBufferSelect            =  0x00D904D0,
+	HwPfPcieGpexBarEnable                 =  0x00D904D4,
+	HwPfPcieGpexBarDwordLower             =  0x00D904D8,
+	HwPfPcieGpexBarDwordUpper             =  0x00D904DC,
+	HwPfPcieGpexBarSelect                 =  0x00D904E0,
+	HwPfPcieGpexCreditCounterSelect       =  0x00D904E4,
+	HwPfPcieGpexCreditCounterStatus       =  0x00D904E8,
+	HwPfPcieGpexTlpHeaderSelect           =  0x00D904EC,
+	HwPfPcieGpexTlpHeaderDword0           =  0x00D904F0,
+	HwPfPcieGpexTlpHeaderDword1           =  0x00D904F4,
+	HwPfPcieGpexTlpHeaderDword2           =  0x00D904F8,
+	HwPfPcieGpexTlpHeaderDword3           =  0x00D904FC,
+	HwPfPcieGpexRelaxOrderControl         =  0x00D90500,
+	HwPfPcieGpexBarPrefetch               =  0x00D90504,
+	HwPfPcieGpexFcCheckControl            =  0x00D90508,
+	HwPfPcieGpexFcUpdateTimerTraffic      =  0x00D90518,
+	HwPfPcieGpexPhyControl0               =  0x00D9053C,
+	HwPfPcieGpexPhyControl1               =  0x00D90544,
+	HwPfPcieGpexPhyControl2               =  0x00D9054C,
+	HwPfPcieGpexUserControl0              =  0x00D9055C,
+	HwPfPcieGpexUncorrErrorStatus         =  0x00D905F0,
+	HwPfPcieGpexRxCplError                =  0x00D90620,
+	HwPfPcieGpexRxCplErrorDword0          =  0x00D90624,
+	HwPfPcieGpexRxCplErrorDword1          =  0x00D90628,
+	HwPfPcieGpexRxCplErrorDword2          =  0x00D9062C,
+	HwPfPcieGpexPabSwResetEn              =  0x00D90630,
+	HwPfPcieGpexGen3Control0              =  0x00D90634,
+	HwPfPcieGpexGen3Control1              =  0x00D90638,
+	HwPfPcieGpexGen3Control2              =  0x00D9063C,
+	HwPfPcieGpexGen2ControlCsr            =  0x00D90640,
+	HwPfPcieGpexTotalVfInitialVf0         =  0x00D90644,
+	HwPfPcieGpexTotalVfInitialVf1         =  0x00D90648,
+	HwPfPcieGpexSriovLinkDevId0           =  0x00D90684,
+	HwPfPcieGpexSriovLinkDevId1           =  0x00D90688,
+	HwPfPcieGpexSriovPageSize0            =  0x00D906C4,
+	HwPfPcieGpexSriovPageSize1            =  0x00D906C8,
+	HwPfPcieGpexIdVersion                 =  0x00D906FC,
+	HwPfPcieGpexSriovVfOffsetStride0      =  0x00D90704,
+	HwPfPcieGpexSriovVfOffsetStride1      =  0x00D90708,
+	HwPfPcieGpexGen3DeskewControl         =  0x00D907B4,
+	HwPfPcieGpexGen3EqControl             =  0x00D907B8,
+	HwPfPcieGpexBridgeVersion             =  0x00D90800,
+	HwPfPcieGpexBridgeCapability          =  0x00D90804,
+	HwPfPcieGpexBridgeControl             =  0x00D90808,
+	HwPfPcieGpexBridgeStatus              =  0x00D9080C,
+	HwPfPcieGpexEngineActivityStatus      =  0x00D9081C,
+	HwPfPcieGpexEngineResetControl        =  0x00D90820,
+	HwPfPcieGpexAxiPioControl             =  0x00D90840,
+	HwPfPcieGpexAxiPioStatus              =  0x00D90844,
+	HwPfPcieGpexAmbaSlaveCmdStatus        =  0x00D90848,
+	HwPfPcieGpexPexPioControl             =  0x00D908C0,
+	HwPfPcieGpexPexPioStatus              =  0x00D908C4,
+	HwPfPcieGpexAmbaMasterStatus          =  0x00D908C8,
+	HwPfPcieGpexCsrSlaveCmdStatus         =  0x00D90920,
+	HwPfPcieGpexMailboxAxiControl         =  0x00D90A50,
+	HwPfPcieGpexMailboxAxiData            =  0x00D90A54,
+	HwPfPcieGpexMailboxPexControl         =  0x00D90A90,
+	HwPfPcieGpexMailboxPexData            =  0x00D90A94,
+	HwPfPcieGpexPexInterruptEnable        =  0x00D90AD0,
+	HwPfPcieGpexPexInterruptStatus        =  0x00D90AD4,
+	HwPfPcieGpexPexInterruptAxiPioVector  =  0x00D90AD8,
+	HwPfPcieGpexPexInterruptPexPioVector  =  0x00D90AE0,
+	HwPfPcieGpexPexInterruptMiscVector    =  0x00D90AF8,
+	HwPfPcieGpexAmbaInterruptPioEnable    =  0x00D90B00,
+	HwPfPcieGpexAmbaInterruptMiscEnable   =  0x00D90B0C,
+	HwPfPcieGpexAmbaInterruptPioStatus    =  0x00D90B10,
+	HwPfPcieGpexAmbaInterruptMiscStatus   =  0x00D90B1C,
+	HwPfPcieGpexPexPmControl              =  0x00D90B80,
+	HwPfPcieGpexSlotMisc                  =  0x00D90B88,
+	HwPfPcieGpexAxiAddrMappingControl     =  0x00D90BA0,
+	HwPfPcieGpexAxiAddrMappingWindowAxiBase     =  0x00D90BA4,
+	HwPfPcieGpexAxiAddrMappingWindowPexBaseLow  =  0x00D90BA8,
+	HwPfPcieGpexAxiAddrMappingWindowPexBaseHigh =  0x00D90BAC,
+	HwPfPcieGpexPexBarAddrFunc0Bar0       =  0x00D91BA0,
+	HwPfPcieGpexPexBarAddrFunc0Bar1       =  0x00D91BA4,
+	HwPfPcieGpexAxiAddrMappingPcieHdrParam =  0x00D95BA0,
+	HwPfPcieGpexExtAxiAddrMappingAxiBase  =  0x00D980A0,
+	HwPfPcieGpexPexExtBarAddrFunc0Bar0    =  0x00D984A0,
+	HwPfPcieGpexPexExtBarAddrFunc0Bar1    =  0x00D984A4,
+	HwPfPcieGpexAmbaInterruptFlrEnable    =  0x00D9B960,
+	HwPfPcieGpexAmbaInterruptFlrStatus    =  0x00D9B9A0,
+	HwPfPcieGpexExtAxiAddrMappingSize     =  0x00D9BAF0,
+	HwPfPcieGpexPexPioAwcacheControl      =  0x00D9C300,
+	HwPfPcieGpexPexPioArcacheControl      =  0x00D9C304,
+	HwPfPcieGpexPabObSizeControlVc0       =  0x00D9C310
+};
+
+/* TIP PF Interrupt numbers */
+enum {
+	ACC100_PF_INT_QMGR_AQ_OVERFLOW = 0,
+	ACC100_PF_INT_DOORBELL_VF_2_PF = 1,
+	ACC100_PF_INT_DMA_DL_DESC_IRQ = 2,
+	ACC100_PF_INT_DMA_UL_DESC_IRQ = 3,
+	ACC100_PF_INT_DMA_MLD_DESC_IRQ = 4,
+	ACC100_PF_INT_DMA_UL5G_DESC_IRQ = 5,
+	ACC100_PF_INT_DMA_DL5G_DESC_IRQ = 6,
+	ACC100_PF_INT_ILLEGAL_FORMAT = 7,
+	ACC100_PF_INT_QMGR_DISABLED_ACCESS = 8,
+	ACC100_PF_INT_QMGR_AQ_OVERTHRESHOLD = 9,
+	ACC100_PF_INT_ARAM_ACCESS_ERR = 10,
+	ACC100_PF_INT_ARAM_ECC_1BIT_ERR = 11,
+	ACC100_PF_INT_PARITY_ERR = 12,
+	ACC100_PF_INT_QMGR_ERR = 13,
+	ACC100_PF_INT_INT_REQ_OVERFLOW = 14,
+	ACC100_PF_INT_APB_TIMEOUT = 15,
+};
+
+#endif /* ACC100_PF_ENUM_H */
diff --git a/drivers/baseband/acc100/acc100_vf_enum.h b/drivers/baseband/acc100/acc100_vf_enum.h
new file mode 100644
index 0000000..b512af3
--- /dev/null
+++ b/drivers/baseband/acc100/acc100_vf_enum.h
@@ -0,0 +1,73 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017 Intel Corporation
+ */
+
+#ifndef ACC100_VF_ENUM_H
+#define ACC100_VF_ENUM_H
+
+/*
+ * ACC100 Register mapping on VF BAR0
+ * This is automatically generated from RDL, format may change with new RDL
+ */
+enum {
+	HWVfQmgrIngressAq             =  0x00000000,
+	HWVfHiVfToPfDbellVf           =  0x00000800,
+	HWVfHiPfToVfDbellVf           =  0x00000808,
+	HWVfHiInfoRingBaseLoVf        =  0x00000810,
+	HWVfHiInfoRingBaseHiVf        =  0x00000814,
+	HWVfHiInfoRingPointerVf       =  0x00000818,
+	HWVfHiInfoRingIntWrEnVf       =  0x00000820,
+	HWVfHiInfoRingPf2VfWrEnVf     =  0x00000824,
+	HWVfHiMsixVectorMapperVf      =  0x00000860,
+	HWVfDmaFec5GulDescBaseLoRegVf =  0x00000920,
+	HWVfDmaFec5GulDescBaseHiRegVf =  0x00000924,
+	HWVfDmaFec5GulRespPtrLoRegVf  =  0x00000928,
+	HWVfDmaFec5GulRespPtrHiRegVf  =  0x0000092C,
+	HWVfDmaFec5GdlDescBaseLoRegVf =  0x00000940,
+	HWVfDmaFec5GdlDescBaseHiRegVf =  0x00000944,
+	HWVfDmaFec5GdlRespPtrLoRegVf  =  0x00000948,
+	HWVfDmaFec5GdlRespPtrHiRegVf  =  0x0000094C,
+	HWVfDmaFec4GulDescBaseLoRegVf =  0x00000960,
+	HWVfDmaFec4GulDescBaseHiRegVf =  0x00000964,
+	HWVfDmaFec4GulRespPtrLoRegVf  =  0x00000968,
+	HWVfDmaFec4GulRespPtrHiRegVf  =  0x0000096C,
+	HWVfDmaFec4GdlDescBaseLoRegVf =  0x00000980,
+	HWVfDmaFec4GdlDescBaseHiRegVf =  0x00000984,
+	HWVfDmaFec4GdlRespPtrLoRegVf  =  0x00000988,
+	HWVfDmaFec4GdlRespPtrHiRegVf  =  0x0000098C,
+	HWVfDmaDdrBaseRangeRoVf       =  0x000009A0,
+	HWVfQmgrAqResetVf             =  0x00000E00,
+	HWVfQmgrRingSizeVf            =  0x00000E04,
+	HWVfQmgrGrpDepthLog20Vf       =  0x00000E08,
+	HWVfQmgrGrpDepthLog21Vf       =  0x00000E0C,
+	HWVfQmgrGrpFunction0Vf        =  0x00000E10,
+	HWVfQmgrGrpFunction1Vf        =  0x00000E14,
+	HWVfPmACntrlRegVf             =  0x00000F40,
+	HWVfPmACountVf                =  0x00000F48,
+	HWVfPmAKCntLoVf               =  0x00000F50,
+	HWVfPmAKCntHiVf               =  0x00000F54,
+	HWVfPmADeltaCntLoVf           =  0x00000F60,
+	HWVfPmADeltaCntHiVf           =  0x00000F64,
+	HWVfPmBCntrlRegVf             =  0x00000F80,
+	HWVfPmBCountVf                =  0x00000F88,
+	HWVfPmBKCntLoVf               =  0x00000F90,
+	HWVfPmBKCntHiVf               =  0x00000F94,
+	HWVfPmBDeltaCntLoVf           =  0x00000FA0,
+	HWVfPmBDeltaCntHiVf           =  0x00000FA4
+};
+
+/* TIP VF Interrupt numbers */
+enum {
+	ACC100_VF_INT_QMGR_AQ_OVERFLOW = 0,
+	ACC100_VF_INT_DOORBELL_VF_2_PF = 1,
+	ACC100_VF_INT_DMA_DL_DESC_IRQ = 2,
+	ACC100_VF_INT_DMA_UL_DESC_IRQ = 3,
+	ACC100_VF_INT_DMA_MLD_DESC_IRQ = 4,
+	ACC100_VF_INT_DMA_UL5G_DESC_IRQ = 5,
+	ACC100_VF_INT_DMA_DL5G_DESC_IRQ = 6,
+	ACC100_VF_INT_ILLEGAL_FORMAT = 7,
+	ACC100_VF_INT_QMGR_DISABLED_ACCESS = 8,
+	ACC100_VF_INT_QMGR_AQ_OVERTHRESHOLD = 9,
+};
+
+#endif /* ACC100_VF_ENUM_H */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 6f46df0..6525d66 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -5,6 +5,9 @@
 #ifndef _RTE_ACC100_PMD_H_
 #define _RTE_ACC100_PMD_H_
 
+#include "acc100_pf_enum.h"
+#include "acc100_vf_enum.h"
+
 /* Helper macro for logging */
 #define rte_bbdev_log(level, fmt, ...) \
 	rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \
@@ -27,6 +30,490 @@
 #define RTE_ACC100_PF_DEVICE_ID        (0x0d5c)
 #define RTE_ACC100_VF_DEVICE_ID        (0x0d5d)
 
+/* Define as 1 to use only a single FEC engine */
+#ifndef RTE_ACC100_SINGLE_FEC
+#define RTE_ACC100_SINGLE_FEC 0
+#endif
+
+/* Values used in filling in descriptors */
+#define ACC100_DMA_DESC_TYPE           2
+#define ACC100_DMA_CODE_BLK_MODE       0
+#define ACC100_DMA_BLKID_FCW           1
+#define ACC100_DMA_BLKID_IN            2
+#define ACC100_DMA_BLKID_OUT_ENC       1
+#define ACC100_DMA_BLKID_OUT_HARD      1
+#define ACC100_DMA_BLKID_OUT_SOFT      2
+#define ACC100_DMA_BLKID_OUT_HARQ      3
+#define ACC100_DMA_BLKID_IN_HARQ       3
+
+/* Values used in filling in decode FCWs */
+#define ACC100_FCW_TD_VER              1
+#define ACC100_FCW_TD_EXT_COLD_REG_EN  1
+#define ACC100_FCW_TD_AUTOMAP          0x0f
+#define ACC100_FCW_TD_RVIDX_0          2
+#define ACC100_FCW_TD_RVIDX_1          26
+#define ACC100_FCW_TD_RVIDX_2          50
+#define ACC100_FCW_TD_RVIDX_3          74
+
+/* Values used in writing to the registers */
+#define ACC100_REG_IRQ_EN_ALL          0x1FF83FF  /* Enable all interrupts */
+
+/* ACC100 Specific Dimensioning */
+#define ACC100_SIZE_64MBYTE            (64*1024*1024)
+/* Number of elements in an Info Ring */
+#define ACC100_INFO_RING_NUM_ENTRIES   1024
+/* Number of elements in HARQ layout memory */
+#define ACC100_HARQ_LAYOUT             (64*1024*1024)
+/* Assumed starting offset for HARQ buffers in memory */
+#define ACC100_HARQ_OFFSET             (32*1024)
+/* Mask used to calculate an index in an Info Ring array (not a byte offset) */
+#define ACC100_INFO_RING_MASK          (ACC100_INFO_RING_NUM_ENTRIES-1)
+/* Number of Virtual Functions ACC100 supports */
+#define ACC100_NUM_VFS                  16
+#define ACC100_NUM_QGRPS                8
+#define ACC100_NUM_QGRPS_PER_WORD       8
+#define ACC100_NUM_AQS                  16
+#define MAX_ENQ_BATCH_SIZE              255
+/* All ACC100 registers are 32 bits wide and aligned on 4-byte boundaries */
+#define ACC100_BYTES_IN_WORD                 4
+#define ACC100_MAX_E_MBUF                64000
+
+#define ACC100_GRP_ID_SHIFT    10 /* Queue Index Hierarchy */
+#define ACC100_VF_ID_SHIFT     4  /* Queue Index Hierarchy */
+#define ACC100_VF_OFFSET_QOS   16 /* offset in Memory specific to QoS Mon */
+#define ACC100_TMPL_PRI_0      0x03020100
+#define ACC100_TMPL_PRI_1      0x07060504
+#define ACC100_TMPL_PRI_2      0x0b0a0908
+#define ACC100_TMPL_PRI_3      0x0f0e0d0c
+#define ACC100_QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
+#define ACC100_WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
+
+#define ACC100_NUM_TMPL       32
+/* Mapping of signals for the available engines */
+#define ACC100_SIG_UL_5G      0
+#define ACC100_SIG_UL_5G_LAST 7
+#define ACC100_SIG_DL_5G      13
+#define ACC100_SIG_DL_5G_LAST 15
+#define ACC100_SIG_UL_4G      16
+#define ACC100_SIG_UL_4G_LAST 21
+#define ACC100_SIG_DL_4G      27
+#define ACC100_SIG_DL_4G_LAST 31
+
+/* Maximum number of attempts to allocate the memory block for all rings */
+#define ACC100_SW_RING_MEM_ALLOC_ATTEMPTS 5
+#define ACC100_MAX_QUEUE_DEPTH            1024
+#define ACC100_DMA_MAX_NUM_POINTERS       14
+#define ACC100_DMA_DESC_PADDING           8
+#define ACC100_FCW_PADDING                12
+#define ACC100_DESC_FCW_OFFSET            192
+#define ACC100_DESC_SIZE                  256
+#define ACC100_DESC_OFFSET                (ACC100_DESC_SIZE / 64)
+#define ACC100_FCW_TE_BLEN                32
+#define ACC100_FCW_TD_BLEN                24
+#define ACC100_FCW_LE_BLEN                32
+#define ACC100_FCW_LD_BLEN                36
+
+#define ACC100_FCW_VER         2
+#define ACC100_MUX_5GDL_DESC   6
+#define ACC100_CMP_ENC_SIZE    20
+#define ACC100_CMP_DEC_SIZE    24
+#define ACC100_ENC_OFFSET     (32)
+#define ACC100_DEC_OFFSET     (80)
+#define ACC100_EXT_MEM /* Default option with memory external to CPU */
+#define ACC100_HARQ_OFFSET_THRESHOLD 1024
+
+/* Constants from K0 computation from 3GPP 38.212 Table 5.4.2.1-2 */
+#define ACC100_N_ZC_1 66 /* N = 66 Zc for BG 1 */
+#define ACC100_N_ZC_2 50 /* N = 50 Zc for BG 2 */
+#define ACC100_K0_1_1 17 /* K0 fraction numerator for rv 1 and BG 1 */
+#define ACC100_K0_1_2 13 /* K0 fraction numerator for rv 1 and BG 2 */
+#define ACC100_K0_2_1 33 /* K0 fraction numerator for rv 2 and BG 1 */
+#define ACC100_K0_2_2 25 /* K0 fraction numerator for rv 2 and BG 2 */
+#define ACC100_K0_3_1 56 /* K0 fraction numerator for rv 3 and BG 1 */
+#define ACC100_K0_3_2 43 /* K0 fraction numerator for rv 3 and BG 2 */
+
+/* ACC100 Configuration */
+#define ACC100_DDR_ECC_ENABLE
+#define ACC100_CFG_DMA_ERROR    0x3D7
+#define ACC100_CFG_AXI_CACHE    0x11
+#define ACC100_CFG_QMGR_HI_P    0x0F0F
+#define ACC100_CFG_PCI_AXI      0xC003
+#define ACC100_CFG_PCI_BRIDGE   0x40006033
+#define ACC100_ENGINE_OFFSET    0x1000
+#define ACC100_RESET_HI         0x20100
+#define ACC100_RESET_LO         0x20000
+#define ACC100_RESET_HARD       0x1FF
+#define ACC100_ENGINES_MAX      9
+#define ACC100_LONG_WAIT        1000
+
+/* ACC100 DMA Descriptor triplet */
+struct acc100_dma_triplet {
+	uint64_t address;
+	uint32_t blen:20,
+		res0:4,
+		last:1,
+		dma_ext:1,
+		res1:2,
+		blkid:4;
+} __rte_packed;
+
+/* ACC100 DMA Response Descriptor */
+union acc100_dma_rsp_desc {
+	uint32_t val;
+	struct {
+		uint32_t crc_status:1,
+			synd_ok:1,
+			dma_err:1,
+			neg_stop:1,
+			fcw_err:1,
+			output_err:1,
+			input_err:1,
+			timestampEn:1,
+			iterCountFrac:8,
+			iter_cnt:8,
+			rsrvd3:6,
+			sdone:1,
+			fdone:1;
+		uint32_t add_info_0;
+		uint32_t add_info_1;
+	};
+};
+
+
+/* ACC100 Queue Manager Enqueue PCI Register */
+union acc100_enqueue_reg_fmt {
+	uint32_t val;
+	struct {
+		uint32_t num_elem:8,
+			addr_offset:3,
+			rsrvd:1,
+			req_elem_addr:20;
+	};
+};
+
+/* FEC 4G Uplink Frame Control Word */
+struct __rte_packed acc100_fcw_td {
+	uint8_t fcw_ver:4,
+		num_maps:4; /* Unused */
+	uint8_t filler:6, /* Unused */
+		rsrvd0:1,
+		bypass_sb_deint:1;
+	uint16_t k_pos;
+	uint16_t k_neg; /* Unused */
+	uint8_t c_neg; /* Unused */
+	uint8_t c; /* Unused */
+	uint32_t ea; /* Unused */
+	uint32_t eb; /* Unused */
+	uint8_t cab; /* Unused */
+	uint8_t k0_start_col; /* Unused */
+	uint8_t rsrvd1;
+	uint8_t code_block_mode:1, /* Unused */
+		turbo_crc_type:1,
+		rsrvd2:3,
+		bypass_teq:1, /* Unused */
+		soft_output_en:1, /* Unused */
+		ext_td_cold_reg_en:1;
+	union { /* External Cold register */
+		uint32_t ext_td_cold_reg;
+		struct {
+			uint32_t min_iter:4, /* Unused */
+				max_iter:4,
+				ext_scale:5, /* Unused */
+				rsrvd3:3,
+				early_stop_en:1, /* Unused */
+				sw_soft_out_dis:1, /* Unused */
+				sw_et_cont:1, /* Unused */
+				sw_soft_out_saturation:1, /* Unused */
+				half_iter_on:1, /* Unused */
+				raw_decoder_input_on:1, /* Unused */
+				rsrvd4:10;
+		};
+	};
+};
+
+/* FEC 5GNR Uplink Frame Control Word */
+struct __rte_packed acc100_fcw_ld {
+	uint32_t FCWversion:4,
+		qm:4,
+		nfiller:11,
+		BG:1,
+		Zc:9,
+		res0:1,
+		synd_precoder:1,
+		synd_post:1;
+	uint32_t ncb:16,
+		k0:16;
+	uint32_t rm_e:24,
+		hcin_en:1,
+		hcout_en:1,
+		crc_select:1,
+		bypass_dec:1,
+		bypass_intlv:1,
+		so_en:1,
+		so_bypass_rm:1,
+		so_bypass_intlv:1;
+	uint32_t hcin_offset:16,
+		hcin_size0:16;
+	uint32_t hcin_size1:16,
+		hcin_decomp_mode:3,
+		llr_pack_mode:1,
+		hcout_comp_mode:3,
+		res2:1,
+		dec_convllr:4,
+		hcout_convllr:4;
+	uint32_t itmax:7,
+		itstop:1,
+		so_it:7,
+		res3:1,
+		hcout_offset:16;
+	uint32_t hcout_size0:16,
+		hcout_size1:16;
+	uint32_t gain_i:8,
+		gain_h:8,
+		negstop_th:16;
+	uint32_t negstop_it:7,
+		negstop_en:1,
+		res4:24;
+};
+
+/* FEC 4G Downlink Frame Control Word */
+struct __rte_packed acc100_fcw_te {
+	uint16_t k_neg;
+	uint16_t k_pos;
+	uint8_t c_neg;
+	uint8_t c;
+	uint8_t filler;
+	uint8_t cab;
+	uint32_t ea:17,
+		rsrvd0:15;
+	uint32_t eb:17,
+		rsrvd1:15;
+	uint16_t ncb_neg;
+	uint16_t ncb_pos;
+	uint8_t rv_idx0:2,
+		rsrvd2:2,
+		rv_idx1:2,
+		rsrvd3:2;
+	uint8_t bypass_rv_idx0:1,
+		bypass_rv_idx1:1,
+		bypass_rm:1,
+		rsrvd4:5;
+	uint8_t rsrvd5:1,
+		rsrvd6:3,
+		code_block_crc:1,
+		rsrvd7:3;
+	uint8_t code_block_mode:1,
+		rsrvd8:7;
+	uint64_t rsrvd9;
+};
+
+/* FEC 5GNR Downlink Frame Control Word */
+struct __rte_packed acc100_fcw_le {
+	uint32_t FCWversion:4,
+		qm:4,
+		nfiller:11,
+		BG:1,
+		Zc:9,
+		res0:3;
+	uint32_t ncb:16,
+		k0:16;
+	uint32_t rm_e:24,
+		res1:2,
+		crc_select:1,
+		res2:1,
+		bypass_intlv:1,
+		res3:3;
+	uint32_t res4_a:12,
+		mcb_count:3,
+		res4_b:17;
+	uint32_t res5;
+	uint32_t res6;
+	uint32_t res7;
+	uint32_t res8;
+};
+
+/* ACC100 DMA Request Descriptor */
+struct __rte_packed acc100_dma_req_desc {
+	union {
+		struct{
+			uint32_t type:4,
+				rsrvd0:26,
+				sdone:1,
+				fdone:1;
+			uint32_t rsrvd1;
+			uint32_t rsrvd2;
+			uint32_t pass_param:8,
+				sdone_enable:1,
+				irq_enable:1,
+				timeStampEn:1,
+				res0:5,
+				numCBs:4,
+				res1:4,
+				m2dlen:4,
+				d2mlen:4;
+		};
+		struct{
+			uint32_t word0;
+			uint32_t word1;
+			uint32_t word2;
+			uint32_t word3;
+		};
+	};
+	struct acc100_dma_triplet data_ptrs[ACC100_DMA_MAX_NUM_POINTERS];
+
+	/* Virtual addresses used to retrieve SW context info */
+	union {
+		void *op_addr;
+		uint64_t pad1;  /* pad to 64 bits */
+	};
+	/*
+	 * Stores additional information needed for driver processing:
+	 * - last_desc_in_batch - flag used to mark last descriptor (CB)
+	 *                        in batch
+	 * - cbs_in_tb - stores information about total number of Code Blocks
+	 *               in currently processed Transport Block
+	 */
+	union {
+		struct {
+			union {
+				struct acc100_fcw_ld fcw_ld;
+				struct acc100_fcw_td fcw_td;
+				struct acc100_fcw_le fcw_le;
+				struct acc100_fcw_te fcw_te;
+				uint32_t pad2[ACC100_FCW_PADDING];
+			};
+			uint32_t last_desc_in_batch :8,
+				cbs_in_tb:8,
+				pad4 : 16;
+		};
+		uint64_t pad3[ACC100_DMA_DESC_PADDING]; /* pad to 64 bits */
+	};
+};
+
+/* ACC100 DMA Descriptor */
+union acc100_dma_desc {
+	struct acc100_dma_req_desc req;
+	union acc100_dma_rsp_desc rsp;
+};
+
+
+/* Union describing HARQ layout entry */
+union acc100_harq_layout_data {
+	uint32_t val;
+	struct {
+		uint16_t offset;
+		uint16_t size0;
+	};
+} __rte_packed;
+
+
+/* Union describing Info Ring entry */
+union acc100_info_ring_data {
+	uint32_t val;
+	struct {
+		union {
+			uint16_t detailed_info;
+			struct {
+				uint16_t aq_id: 4;
+				uint16_t qg_id: 4;
+				uint16_t vf_id: 6;
+				uint16_t reserved: 2;
+			};
+		};
+		uint16_t int_nb: 7;
+		uint16_t msi_0: 1;
+		uint16_t vf2pf: 6;
+		uint16_t loop: 1;
+		uint16_t valid: 1;
+	};
+} __rte_packed;
+
+struct acc100_registry_addr {
+	unsigned int dma_ring_dl5g_hi;
+	unsigned int dma_ring_dl5g_lo;
+	unsigned int dma_ring_ul5g_hi;
+	unsigned int dma_ring_ul5g_lo;
+	unsigned int dma_ring_dl4g_hi;
+	unsigned int dma_ring_dl4g_lo;
+	unsigned int dma_ring_ul4g_hi;
+	unsigned int dma_ring_ul4g_lo;
+	unsigned int ring_size;
+	unsigned int info_ring_hi;
+	unsigned int info_ring_lo;
+	unsigned int info_ring_en;
+	unsigned int info_ring_ptr;
+	unsigned int tail_ptrs_dl5g_hi;
+	unsigned int tail_ptrs_dl5g_lo;
+	unsigned int tail_ptrs_ul5g_hi;
+	unsigned int tail_ptrs_ul5g_lo;
+	unsigned int tail_ptrs_dl4g_hi;
+	unsigned int tail_ptrs_dl4g_lo;
+	unsigned int tail_ptrs_ul4g_hi;
+	unsigned int tail_ptrs_ul4g_lo;
+	unsigned int depth_log0_offset;
+	unsigned int depth_log1_offset;
+	unsigned int qman_group_func;
+	unsigned int ddr_range;
+};
+
+/* Structure holding registry addresses for PF */
+static const struct acc100_registry_addr pf_reg_addr = {
+	.dma_ring_dl5g_hi = HWPfDmaFec5GdlDescBaseHiRegVf,
+	.dma_ring_dl5g_lo = HWPfDmaFec5GdlDescBaseLoRegVf,
+	.dma_ring_ul5g_hi = HWPfDmaFec5GulDescBaseHiRegVf,
+	.dma_ring_ul5g_lo = HWPfDmaFec5GulDescBaseLoRegVf,
+	.dma_ring_dl4g_hi = HWPfDmaFec4GdlDescBaseHiRegVf,
+	.dma_ring_dl4g_lo = HWPfDmaFec4GdlDescBaseLoRegVf,
+	.dma_ring_ul4g_hi = HWPfDmaFec4GulDescBaseHiRegVf,
+	.dma_ring_ul4g_lo = HWPfDmaFec4GulDescBaseLoRegVf,
+	.ring_size = HWPfQmgrRingSizeVf,
+	.info_ring_hi = HWPfHiInfoRingBaseHiRegPf,
+	.info_ring_lo = HWPfHiInfoRingBaseLoRegPf,
+	.info_ring_en = HWPfHiInfoRingIntWrEnRegPf,
+	.info_ring_ptr = HWPfHiInfoRingPointerRegPf,
+	.tail_ptrs_dl5g_hi = HWPfDmaFec5GdlRespPtrHiRegVf,
+	.tail_ptrs_dl5g_lo = HWPfDmaFec5GdlRespPtrLoRegVf,
+	.tail_ptrs_ul5g_hi = HWPfDmaFec5GulRespPtrHiRegVf,
+	.tail_ptrs_ul5g_lo = HWPfDmaFec5GulRespPtrLoRegVf,
+	.tail_ptrs_dl4g_hi = HWPfDmaFec4GdlRespPtrHiRegVf,
+	.tail_ptrs_dl4g_lo = HWPfDmaFec4GdlRespPtrLoRegVf,
+	.tail_ptrs_ul4g_hi = HWPfDmaFec4GulRespPtrHiRegVf,
+	.tail_ptrs_ul4g_lo = HWPfDmaFec4GulRespPtrLoRegVf,
+	.depth_log0_offset = HWPfQmgrGrpDepthLog20Vf,
+	.depth_log1_offset = HWPfQmgrGrpDepthLog21Vf,
+	.qman_group_func = HWPfQmgrGrpFunction0,
+	.ddr_range = HWPfDmaVfDdrBaseRw,
+};
+
+/* Structure holding registry addresses for VF */
+static const struct acc100_registry_addr vf_reg_addr = {
+	.dma_ring_dl5g_hi = HWVfDmaFec5GdlDescBaseHiRegVf,
+	.dma_ring_dl5g_lo = HWVfDmaFec5GdlDescBaseLoRegVf,
+	.dma_ring_ul5g_hi = HWVfDmaFec5GulDescBaseHiRegVf,
+	.dma_ring_ul5g_lo = HWVfDmaFec5GulDescBaseLoRegVf,
+	.dma_ring_dl4g_hi = HWVfDmaFec4GdlDescBaseHiRegVf,
+	.dma_ring_dl4g_lo = HWVfDmaFec4GdlDescBaseLoRegVf,
+	.dma_ring_ul4g_hi = HWVfDmaFec4GulDescBaseHiRegVf,
+	.dma_ring_ul4g_lo = HWVfDmaFec4GulDescBaseLoRegVf,
+	.ring_size = HWVfQmgrRingSizeVf,
+	.info_ring_hi = HWVfHiInfoRingBaseHiVf,
+	.info_ring_lo = HWVfHiInfoRingBaseLoVf,
+	.info_ring_en = HWVfHiInfoRingIntWrEnVf,
+	.info_ring_ptr = HWVfHiInfoRingPointerVf,
+	.tail_ptrs_dl5g_hi = HWVfDmaFec5GdlRespPtrHiRegVf,
+	.tail_ptrs_dl5g_lo = HWVfDmaFec5GdlRespPtrLoRegVf,
+	.tail_ptrs_ul5g_hi = HWVfDmaFec5GulRespPtrHiRegVf,
+	.tail_ptrs_ul5g_lo = HWVfDmaFec5GulRespPtrLoRegVf,
+	.tail_ptrs_dl4g_hi = HWVfDmaFec4GdlRespPtrHiRegVf,
+	.tail_ptrs_dl4g_lo = HWVfDmaFec4GdlRespPtrLoRegVf,
+	.tail_ptrs_ul4g_hi = HWVfDmaFec4GulRespPtrHiRegVf,
+	.tail_ptrs_ul4g_lo = HWVfDmaFec4GulRespPtrLoRegVf,
+	.depth_log0_offset = HWVfQmgrGrpDepthLog20Vf,
+	.depth_log1_offset = HWVfQmgrGrpDepthLog21Vf,
+	.qman_group_func = HWVfQmgrGrpFunction0Vf,
+	.ddr_range = HWVfDmaDdrBaseRangeRoVf,
+};
+
 /* Private data structure for each ACC100 device */
 struct acc100_device {
 	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
-- 
1.8.3.1



* [dpdk-dev] [PATCH v10 03/10] baseband/acc100: add info get function
  2020-10-01  3:14   ` [dpdk-dev] [PATCH v10 00/10] bbdev PMD ACC100 Nicolas Chautru
  2020-10-01  3:14     ` [dpdk-dev] [PATCH v10 01/10] drivers/baseband: add PMD for ACC100 Nicolas Chautru
  2020-10-01  3:14     ` [dpdk-dev] [PATCH v10 02/10] baseband/acc100: add register definition file Nicolas Chautru
@ 2020-10-01  3:14     ` Nicolas Chautru
  2020-10-01 14:34       ` Maxime Coquelin
  2020-10-01  3:14     ` [dpdk-dev] [PATCH v10 04/10] baseband/acc100: add queue configuration Nicolas Chautru
                       ` (6 subsequent siblings)
  9 siblings, 1 reply; 213+ messages in thread
From: Nicolas Chautru @ 2020-10-01  3:14 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, trix, ferruh.yigit, tianjiao.liu,
	Nicolas Chautru

Add the "info_get" function to the driver, to allow the
device to be queried.
No processing capabilities are exposed yet.
Link bbdev-test to support the PMD with null capability.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 app/test-bbdev/meson.build               |   3 +
 drivers/baseband/acc100/rte_acc100_cfg.h |  96 +++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.c | 229 +++++++++++++++++++++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.h |  10 ++
 4 files changed, 338 insertions(+)
 create mode 100644 drivers/baseband/acc100/rte_acc100_cfg.h

diff --git a/app/test-bbdev/meson.build b/app/test-bbdev/meson.build
index 18ab6a8..fbd8ae3 100644
--- a/app/test-bbdev/meson.build
+++ b/app/test-bbdev/meson.build
@@ -12,3 +12,6 @@ endif
 if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC')
 	deps += ['pmd_bbdev_fpga_5gnr_fec']
 endif
+if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_ACC100')
+	deps += ['pmd_bbdev_acc100']
+endif
\ No newline at end of file
diff --git a/drivers/baseband/acc100/rte_acc100_cfg.h b/drivers/baseband/acc100/rte_acc100_cfg.h
new file mode 100644
index 0000000..73bbe36
--- /dev/null
+++ b/drivers/baseband/acc100/rte_acc100_cfg.h
@@ -0,0 +1,96 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_ACC100_CFG_H_
+#define _RTE_ACC100_CFG_H_
+
+/**
+ * @file rte_acc100_cfg.h
+ *
+ * Functions for configuring ACC100 HW, exposed directly to applications.
+ * Configuration related to encoding/decoding is done through the
+ * librte_bbdev library.
+ *
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ */
+
+#include <stdint.h>
+#include <stdbool.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+/** Number of Virtual Functions ACC100 supports */
+#define RTE_ACC100_NUM_VFS 16
+
+/**
+ * Definition of Queue Topology for ACC100 Configuration
+ * Some level of detail is abstracted out to expose a clean interface,
+ * given that comprehensive flexibility is not required
+ */
+struct rte_q_topology_t {
+	/** Number of QGroups in incremental order of priority */
+	uint16_t num_qgroups;
+	/**
+	 * All QGroups have the same number of AQs here.
+	 * Note : Could be made a 16-array if more flexibility is really
+	 * required
+	 */
+	uint16_t num_aqs_per_groups;
+	/**
+	 * Depth of the AQs is the same for all QGroups here. Log2 Enum : 2^N
+	 * Note : Could be made a 16-array if more flexibility is really
+	 * required
+	 */
+	uint16_t aq_depth_log2;
+	/**
+	 * Index of the first Queue Group Index - assuming contiguity
+	 * Initialized as -1
+	 */
+	int8_t first_qgroup_index;
+};
+
+/**
+ * Definition of Arbitration related parameters for ACC100 Configuration
+ */
+struct rte_arbitration_t {
+	/** Default Weight for VF Fairness Arbitration */
+	uint16_t round_robin_weight;
+	uint32_t gbr_threshold1; /**< Guaranteed Bitrate Threshold 1 */
+	uint32_t gbr_threshold2; /**< Guaranteed Bitrate Threshold 2 */
+};
+
+/**
+ * Structure to pass ACC100 configuration.
+ * Note: all VF Bundles will have the same configuration.
+ */
+struct acc100_conf {
+	bool pf_mode_en; /**< 1 if PF is used for dataplane, 0 for VFs */
+	/** 1 if input '1' bit is represented by a positive LLR value, 0 if '1'
+	 * bit is represented by a negative value.
+	 */
+	bool input_pos_llr_1_bit;
+	/** 1 if output '1' bit is represented by a positive value, 0 if '1'
+	 * bit is represented by a negative value.
+	 */
+	bool output_pos_llr_1_bit;
+	uint16_t num_vf_bundles; /**< Number of VF bundles to setup */
+	/** Queue topology for each operation type */
+	struct rte_q_topology_t q_ul_4g;
+	struct rte_q_topology_t q_dl_4g;
+	struct rte_q_topology_t q_ul_5g;
+	struct rte_q_topology_t q_dl_5g;
+	/** Arbitration configuration for each operation type */
+	struct rte_arbitration_t arb_ul_4g[RTE_ACC100_NUM_VFS];
+	struct rte_arbitration_t arb_dl_4g[RTE_ACC100_NUM_VFS];
+	struct rte_arbitration_t arb_ul_5g[RTE_ACC100_NUM_VFS];
+	struct rte_arbitration_t arb_dl_5g[RTE_ACC100_NUM_VFS];
+};
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_ACC100_CFG_H_ */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 1b4cd13..98a17b3 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -26,6 +26,188 @@
 RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
 #endif
 
+/* Read a register of an ACC100 device */
+static inline uint32_t
+acc100_reg_read(struct acc100_device *d, uint32_t offset)
+{
+
+	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
+	uint32_t ret = *((volatile uint32_t *)(reg_addr));
+	return rte_le_to_cpu_32(ret);
+}
+
+/* Calculate the offset of the enqueue register */
+static inline uint32_t
+queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
+{
+	if (pf_device)
+		return ((vf_id << 12) + (qgrp_id << 7) + (aq_id << 3) +
+				HWPfQmgrIngressAq);
+	else
+		return ((qgrp_id << 7) + (aq_id << 3) +
+				HWVfQmgrIngressAq);
+}
+
+enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC};
+
+/* Return the queue topology for a Queue Group Index */
+static inline void
+qtopFromAcc(struct rte_q_topology_t **qtop, int acc_enum,
+		struct acc100_conf *acc100_conf)
+{
+	struct rte_q_topology_t *p_qtop;
+	p_qtop = NULL;
+	switch (acc_enum) {
+	case UL_4G:
+		p_qtop = &(acc100_conf->q_ul_4g);
+		break;
+	case UL_5G:
+		p_qtop = &(acc100_conf->q_ul_5g);
+		break;
+	case DL_4G:
+		p_qtop = &(acc100_conf->q_dl_4g);
+		break;
+	case DL_5G:
+		p_qtop = &(acc100_conf->q_dl_5g);
+		break;
+	default:
+		/* NOTREACHED */
+		rte_bbdev_log(ERR, "Unexpected error evaluating qtopFromAcc");
+		break;
+	}
+	*qtop = p_qtop;
+}
+
+static void
+initQTop(struct acc100_conf *acc100_conf)
+{
+	acc100_conf->q_ul_4g.num_aqs_per_groups = 0;
+	acc100_conf->q_ul_4g.num_qgroups = 0;
+	acc100_conf->q_ul_4g.first_qgroup_index = -1;
+	acc100_conf->q_ul_5g.num_aqs_per_groups = 0;
+	acc100_conf->q_ul_5g.num_qgroups = 0;
+	acc100_conf->q_ul_5g.first_qgroup_index = -1;
+	acc100_conf->q_dl_4g.num_aqs_per_groups = 0;
+	acc100_conf->q_dl_4g.num_qgroups = 0;
+	acc100_conf->q_dl_4g.first_qgroup_index = -1;
+	acc100_conf->q_dl_5g.num_aqs_per_groups = 0;
+	acc100_conf->q_dl_5g.num_qgroups = 0;
+	acc100_conf->q_dl_5g.first_qgroup_index = -1;
+}
+
+static inline void
+updateQtop(uint8_t acc, uint8_t qg, struct acc100_conf *acc100_conf,
+		struct acc100_device *d) {
+	uint32_t reg;
+	struct rte_q_topology_t *q_top = NULL;
+	qtopFromAcc(&q_top, acc, acc100_conf);
+	if (unlikely(q_top == NULL))
+		return;
+	uint16_t aq;
+	q_top->num_qgroups++;
+	if (q_top->first_qgroup_index == -1) {
+		q_top->first_qgroup_index = qg;
+		/* Can be optimized to assume all are enabled by default */
+		reg = acc100_reg_read(d, queue_offset(d->pf_device,
+				0, qg, ACC100_NUM_AQS - 1));
+		if (reg & ACC100_QUEUE_ENABLE) {
+			q_top->num_aqs_per_groups = ACC100_NUM_AQS;
+			return;
+		}
+		q_top->num_aqs_per_groups = 0;
+		for (aq = 0; aq < ACC100_NUM_AQS; aq++) {
+			reg = acc100_reg_read(d, queue_offset(d->pf_device,
+					0, qg, aq));
+			if (reg & ACC100_QUEUE_ENABLE)
+				q_top->num_aqs_per_groups++;
+		}
+	}
+}
+
+/* Fetch configuration enabled for the PF/VF using MMIO Read (slow) */
+static inline void
+fetch_acc100_config(struct rte_bbdev *dev)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	struct acc100_conf *acc100_conf = &d->acc100_conf;
+	const struct acc100_registry_addr *reg_addr;
+	uint8_t acc, qg;
+	uint32_t reg, reg_aq, reg_len0, reg_len1;
+	uint32_t reg_mode;
+
+	/* No need to retrieve the configuration if it is already done */
+	if (d->configured)
+		return;
+
+	/* Choose correct registry addresses for the device type */
+	if (d->pf_device)
+		reg_addr = &pf_reg_addr;
+	else
+		reg_addr = &vf_reg_addr;
+
+	d->ddr_size = (1 + acc100_reg_read(d, reg_addr->ddr_range)) << 10;
+
+	/* Single VF Bundle by VF */
+	acc100_conf->num_vf_bundles = 1;
+	initQTop(acc100_conf);
+
+	struct rte_q_topology_t *q_top = NULL;
+	int qman_func_id[ACC100_NUM_ACCS] = {ACC100_ACCMAP_0, ACC100_ACCMAP_1,
+			ACC100_ACCMAP_2, ACC100_ACCMAP_3, ACC100_ACCMAP_4};
+	reg = acc100_reg_read(d, reg_addr->qman_group_func);
+	for (qg = 0; qg < ACC100_NUM_QGRPS_PER_WORD; qg++) {
+		reg_aq = acc100_reg_read(d,
+				queue_offset(d->pf_device, 0, qg, 0));
+		if (reg_aq & ACC100_QUEUE_ENABLE) {
+			uint32_t idx = (reg >> (qg * 4)) & 0x7;
+			if (idx >= ACC100_NUM_ACCS)
+				break;
+			acc = qman_func_id[idx];
+			updateQtop(acc, qg, acc100_conf, d);
+		}
+	}
+
+	/* Check the depth of the AQs */
+	reg_len0 = acc100_reg_read(d, reg_addr->depth_log0_offset);
+	reg_len1 = acc100_reg_read(d, reg_addr->depth_log1_offset);
+	for (acc = 0; acc < NUM_ACC; acc++) {
+		qtopFromAcc(&q_top, acc, acc100_conf);
+		if (q_top->first_qgroup_index < ACC100_NUM_QGRPS_PER_WORD)
+			q_top->aq_depth_log2 = (reg_len0 >>
+					(q_top->first_qgroup_index * 4))
+					& 0xF;
+		else
+			q_top->aq_depth_log2 = (reg_len1 >>
+					((q_top->first_qgroup_index -
+					ACC100_NUM_QGRPS_PER_WORD) * 4))
+					& 0xF;
+	}
+
+	/* Read PF mode */
+	if (d->pf_device) {
+		reg_mode = acc100_reg_read(d, HWPfHiPfMode);
+		acc100_conf->pf_mode_en = (reg_mode == ACC100_PF_VAL) ? 1 : 0;
+	}
+
+	rte_bbdev_log_debug(
+			"%s Config LLR SIGN IN/OUT %s %s QG %u %u %u %u AQ %u %u %u %u Len %u %u %u %u\n",
+			(d->pf_device) ? "PF" : "VF",
+			(acc100_conf->input_pos_llr_1_bit) ? "POS" : "NEG",
+			(acc100_conf->output_pos_llr_1_bit) ? "POS" : "NEG",
+			acc100_conf->q_ul_4g.num_qgroups,
+			acc100_conf->q_dl_4g.num_qgroups,
+			acc100_conf->q_ul_5g.num_qgroups,
+			acc100_conf->q_dl_5g.num_qgroups,
+			acc100_conf->q_ul_4g.num_aqs_per_groups,
+			acc100_conf->q_dl_4g.num_aqs_per_groups,
+			acc100_conf->q_ul_5g.num_aqs_per_groups,
+			acc100_conf->q_dl_5g.num_aqs_per_groups,
+			acc100_conf->q_ul_4g.aq_depth_log2,
+			acc100_conf->q_dl_4g.aq_depth_log2,
+			acc100_conf->q_ul_5g.aq_depth_log2,
+			acc100_conf->q_dl_5g.aq_depth_log2);
+}
+
 /* Free 64MB memory used for software rings */
 static int
 acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
@@ -33,8 +215,55 @@
 	return 0;
 }
 
+/* Get ACC100 device info */
+static void
+acc100_dev_info_get(struct rte_bbdev *dev,
+		struct rte_bbdev_driver_info *dev_info)
+{
+	struct acc100_device *d = dev->data->dev_private;
+
+	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
+		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
+	};
+
+	static struct rte_bbdev_queue_conf default_queue_conf;
+	default_queue_conf.socket = dev->data->socket_id;
+	default_queue_conf.queue_size = ACC100_MAX_QUEUE_DEPTH;
+
+	dev_info->driver_name = dev->device->driver->name;
+
+	/* Read and save the populated config from ACC100 registers */
+	fetch_acc100_config(dev);
+
+	/* This isn't ideal because it reports the maximum number of queues but
+	 * does not provide info on how many can be uplink/downlink or at
+	 * different priorities
+	 */
+	dev_info->max_num_queues =
+			d->acc100_conf.q_dl_5g.num_aqs_per_groups *
+			d->acc100_conf.q_dl_5g.num_qgroups +
+			d->acc100_conf.q_ul_5g.num_aqs_per_groups *
+			d->acc100_conf.q_ul_5g.num_qgroups +
+			d->acc100_conf.q_dl_4g.num_aqs_per_groups *
+			d->acc100_conf.q_dl_4g.num_qgroups +
+			d->acc100_conf.q_ul_4g.num_aqs_per_groups *
+			d->acc100_conf.q_ul_4g.num_qgroups;
+	dev_info->queue_size_lim = ACC100_MAX_QUEUE_DEPTH;
+	dev_info->hardware_accelerated = true;
+	dev_info->max_dl_queue_priority =
+			d->acc100_conf.q_dl_4g.num_qgroups - 1;
+	dev_info->max_ul_queue_priority =
+			d->acc100_conf.q_ul_4g.num_qgroups - 1;
+	dev_info->default_queue_conf = default_queue_conf;
+	dev_info->cpu_flag_reqs = NULL;
+	dev_info->min_alignment = 64;
+	dev_info->capabilities = bbdev_capabilities;
+	dev_info->harq_buffer_size = d->ddr_size;
+}
+
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.close = acc100_dev_close,
+	.info_get = acc100_dev_info_get,
 };
 
 /* ACC100 PCI PF address map */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 6525d66..de015ca 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -7,6 +7,7 @@
 
 #include "acc100_pf_enum.h"
 #include "acc100_vf_enum.h"
+#include "rte_acc100_cfg.h"
 
 /* Helper macro for logging */
 #define rte_bbdev_log(level, fmt, ...) \
@@ -98,6 +99,13 @@
 #define ACC100_SIG_UL_4G_LAST 21
 #define ACC100_SIG_DL_4G      27
 #define ACC100_SIG_DL_4G_LAST 31
+#define ACC100_NUM_ACCS       5
+#define ACC100_ACCMAP_0       0
+#define ACC100_ACCMAP_1       2
+#define ACC100_ACCMAP_2       1
+#define ACC100_ACCMAP_3       3
+#define ACC100_ACCMAP_4       4
+#define ACC100_PF_VAL         2
 
 /* max number of iterations to allocate memory block for all rings */
 #define ACC100_SW_RING_MEM_ALLOC_ATTEMPTS 5
@@ -517,6 +525,8 @@ struct acc100_registry_addr {
 /* Private data structure for each ACC100 device */
 struct acc100_device {
 	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
+	uint32_t ddr_size; /* Size in kB */
+	struct acc100_conf acc100_conf; /* ACC100 Initial configuration */
 	bool pf_device; /**< True if this is a PF ACC100 device */
 	bool configured; /**< True if this ACC100 device is configured */
 };
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread
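The patch above computes the doorbell address for each queue with queue_offset(), packing the VF, queue-group and atomic-queue indices into fixed bit fields of the register offset. A minimal standalone sketch of that bit layout, assuming a placeholder base offset (the real HWPfQmgrIngressAq / HWVfQmgrIngressAq values come from the device register enum headers and differ between PF and VF):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical base offset; the real per-device values live in
 * acc100_pf_enum.h / acc100_vf_enum.h.
 */
#define QMGR_INGRESS_AQ_BASE 0x0

/* Mirror of the driver's queue_offset(): the VF index lands in bits 12+
 * (PF view only), the queue-group index in bits 7..11 and the atomic-queue
 * index in bits 3..6, so consecutive AQ doorbells are 8 bytes apart, each
 * queue group spans 128 bytes, and each VF bundle spans 4 kB.
 */
static inline uint32_t
queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
{
	uint32_t off = ((uint32_t)qgrp_id << 7) + ((uint32_t)aq_id << 3);

	if (pf_device)
		off += (uint32_t)vf_id << 12;
	return off + QMGR_INGRESS_AQ_BASE;
}
```

This is why updateQtop() in the patch can probe each AQ of a group by stepping aq_id: each step moves the MMIO read by exactly one 8-byte doorbell slot.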

* [dpdk-dev] [PATCH v10 04/10] baseband/acc100: add queue configuration
  2020-10-01  3:14   ` [dpdk-dev] [PATCH v10 00/10] bbdev PMD ACC100 Nicolas Chautru
                       ` (2 preceding siblings ...)
  2020-10-01  3:14     ` [dpdk-dev] [PATCH v10 03/10] baseband/acc100: add info get function Nicolas Chautru
@ 2020-10-01  3:14     ` Nicolas Chautru
  2020-10-01 15:38       ` Maxime Coquelin
  2020-10-01  3:14     ` [dpdk-dev] [PATCH v10 05/10] baseband/acc100: add LDPC processing functions Nicolas Chautru
                       ` (5 subsequent siblings)
  9 siblings, 1 reply; 213+ messages in thread
From: Nicolas Chautru @ 2020-10-01  3:14 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, trix, ferruh.yigit, tianjiao.liu,
	Nicolas Chautru

Add functions to create and configure queues for
the device. Still no processing capability exposed.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 438 ++++++++++++++++++++++++++++++-
 drivers/baseband/acc100/rte_acc100_pmd.h |  45 ++++
 2 files changed, 482 insertions(+), 1 deletion(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 98a17b3..709a7af 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -26,6 +26,22 @@
 RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
 #endif
 
+/* Write to MMIO register address */
+static inline void
+mmio_write(void *addr, uint32_t value)
+{
+	*((volatile uint32_t *)(addr)) = rte_cpu_to_le_32(value);
+}
+
+/* Write a register of an ACC100 device */
+static inline void
+acc100_reg_write(struct acc100_device *d, uint32_t offset, uint32_t payload)
+{
+	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
+	mmio_write(reg_addr, payload);
+	usleep(ACC100_LONG_WAIT);
+}
+
 /* Read a register of an ACC100 device */
 static inline uint32_t
 acc100_reg_read(struct acc100_device *d, uint32_t offset)
@@ -36,6 +52,22 @@
 	return rte_le_to_cpu_32(ret);
 }
 
+/* Basic Implementation of Log2 for exact 2^N */
+static inline uint32_t
+log2_basic(uint32_t value)
+{
+	return (value == 0) ? 0 : rte_bsf32(value);
+}
+
+/* Calculate memory alignment offset assuming alignment is 2^N */
+static inline uint32_t
+calc_mem_alignment_offset(void *unaligned_virt_mem, uint32_t alignment)
+{
+	rte_iova_t unaligned_phy_mem = rte_malloc_virt2iova(unaligned_virt_mem);
+	return (uint32_t)(alignment -
+			(unaligned_phy_mem & (alignment-1)));
+}
+
 /* Calculate the offset of the enqueue register */
 static inline uint32_t
 queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
@@ -208,10 +240,411 @@
 			acc100_conf->q_dl_5g.aq_depth_log2);
 }
 
+static void
+free_base_addresses(void **base_addrs, int size)
+{
+	int i;
+	for (i = 0; i < size; i++)
+		rte_free(base_addrs[i]);
+}
+
+static inline uint32_t
+get_desc_len(void)
+{
+	return sizeof(union acc100_dma_desc);
+}
+
+/* Allocate the 2 * 64MB block for the sw rings */
+static int
+alloc_2x64mb_sw_rings_mem(struct rte_bbdev *dev, struct acc100_device *d,
+		int socket)
+{
+	uint32_t sw_ring_size = ACC100_SIZE_64MBYTE;
+	d->sw_rings_base = rte_zmalloc_socket(dev->device->driver->name,
+			2 * sw_ring_size, RTE_CACHE_LINE_SIZE, socket);
+	if (d->sw_rings_base == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate memory for %s:%u",
+				dev->device->driver->name,
+				dev->data->dev_id);
+		return -ENOMEM;
+	}
+	memset(d->sw_rings_base, 0, ACC100_SIZE_64MBYTE);
+	uint32_t next_64mb_align_offset = calc_mem_alignment_offset(
+			d->sw_rings_base, ACC100_SIZE_64MBYTE);
+	d->sw_rings = RTE_PTR_ADD(d->sw_rings_base, next_64mb_align_offset);
+	d->sw_rings_phys = rte_malloc_virt2iova(d->sw_rings_base) +
+			next_64mb_align_offset;
+	d->sw_ring_size = ACC100_MAX_QUEUE_DEPTH * get_desc_len();
+	d->sw_ring_max_depth = d->sw_ring_size / get_desc_len();
+
+	return 0;
+}
+
+/* Attempt to allocate minimised memory space for sw rings */
+static void
+alloc_sw_rings_min_mem(struct rte_bbdev *dev, struct acc100_device *d,
+		uint16_t num_queues, int socket)
+{
+	rte_iova_t sw_rings_base_phy, next_64mb_align_addr_phy;
+	uint32_t next_64mb_align_offset;
+	rte_iova_t sw_ring_phys_end_addr;
+	void *base_addrs[ACC100_SW_RING_MEM_ALLOC_ATTEMPTS];
+	void *sw_rings_base;
+	int i = 0;
+	uint32_t q_sw_ring_size = ACC100_MAX_QUEUE_DEPTH * get_desc_len();
+	uint32_t dev_sw_ring_size = q_sw_ring_size * num_queues;
+
+	/* Find an aligned block of memory to store sw rings */
+	while (i < ACC100_SW_RING_MEM_ALLOC_ATTEMPTS) {
+		/*
+		 * sw_ring allocated memory is guaranteed to be aligned to
+		 * q_sw_ring_size at the condition that the requested size is
+		 * less than the page size
+		 */
+		sw_rings_base = rte_zmalloc_socket(
+				dev->device->driver->name,
+				dev_sw_ring_size, q_sw_ring_size, socket);
+
+		if (sw_rings_base == NULL) {
+			rte_bbdev_log(ERR,
+					"Failed to allocate memory for %s:%u",
+					dev->device->driver->name,
+					dev->data->dev_id);
+			break;
+		}
+
+		sw_rings_base_phy = rte_malloc_virt2iova(sw_rings_base);
+		next_64mb_align_offset = calc_mem_alignment_offset(
+				sw_rings_base, ACC100_SIZE_64MBYTE);
+		next_64mb_align_addr_phy = sw_rings_base_phy +
+				next_64mb_align_offset;
+		sw_ring_phys_end_addr = sw_rings_base_phy + dev_sw_ring_size;
+
+		/* Check if the end of the sw ring memory block is before the
+		 * start of next 64MB aligned mem address
+		 */
+		if (sw_ring_phys_end_addr < next_64mb_align_addr_phy) {
+			d->sw_rings_phys = sw_rings_base_phy;
+			d->sw_rings = sw_rings_base;
+			d->sw_rings_base = sw_rings_base;
+			d->sw_ring_size = q_sw_ring_size;
+			d->sw_ring_max_depth = ACC100_MAX_QUEUE_DEPTH;
+			break;
+		}
+		/* Store the address of the unaligned mem block */
+		base_addrs[i] = sw_rings_base;
+		i++;
+	}
+
+	/* Free all unaligned blocks of mem allocated in the loop */
+	free_base_addresses(base_addrs, i);
+}
+
+
+/* Allocate 64MB memory used for all software rings */
+static int
+acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
+{
+	uint32_t phys_low, phys_high, payload;
+	struct acc100_device *d = dev->data->dev_private;
+	const struct acc100_registry_addr *reg_addr;
+
+	if (d->pf_device && !d->acc100_conf.pf_mode_en) {
+		rte_bbdev_log(NOTICE,
+				"%s has PF mode disabled. This PF can't be used.",
+				dev->data->name);
+		return -ENODEV;
+	}
+
+	alloc_sw_rings_min_mem(dev, d, num_queues, socket_id);
+
+	/* If minimal memory space approach failed, then allocate
+	 * the 2 * 64MB block for the sw rings
+	 */
+	if (d->sw_rings == NULL)
+		alloc_2x64mb_sw_rings_mem(dev, d, socket_id);
+
+	if (d->sw_rings == NULL) {
+		rte_bbdev_log(NOTICE,
+				"Failure allocating sw_rings memory");
+		return -ENODEV;
+	}
+
+	/* Configure ACC100 with the base address for DMA descriptor rings
+	 * Same descriptor rings used for UL and DL DMA Engines
+	 * Note : Assuming only VF0 bundle is used for PF mode
+	 */
+	phys_high = (uint32_t)(d->sw_rings_phys >> 32);
+	phys_low  = (uint32_t)(d->sw_rings_phys & ~(ACC100_SIZE_64MBYTE-1));
+
+	/* Choose correct registry addresses for the device type */
+	if (d->pf_device)
+		reg_addr = &pf_reg_addr;
+	else
+		reg_addr = &vf_reg_addr;
+
+	/* Read the populated cfg from ACC100 registers */
+	fetch_acc100_config(dev);
+
+	/* Release AXI from PF */
+	if (d->pf_device)
+		acc100_reg_write(d, HWPfDmaAxiControl, 1);
+
+	acc100_reg_write(d, reg_addr->dma_ring_ul5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_ul5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->dma_ring_dl5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_dl5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->dma_ring_ul4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_ul4g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->dma_ring_dl4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_dl4g_lo, phys_low);
+
+	/*
+	 * Configure Ring Size to the max queue ring size
+	 * (used for wrapping purpose)
+	 */
+	payload = log2_basic(d->sw_ring_size / 64);
+	acc100_reg_write(d, reg_addr->ring_size, payload);
+
+	/* Configure tail pointer for use when SDONE enabled */
+	d->tail_ptrs = rte_zmalloc_socket(
+			dev->device->driver->name,
+			ACC100_NUM_QGRPS * ACC100_NUM_AQS * sizeof(uint32_t),
+			RTE_CACHE_LINE_SIZE, socket_id);
+	if (d->tail_ptrs == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate tail ptr for %s:%u",
+				dev->device->driver->name,
+				dev->data->dev_id);
+		rte_free(d->sw_rings);
+		return -ENOMEM;
+	}
+	d->tail_ptr_phys = rte_malloc_virt2iova(d->tail_ptrs);
+
+	phys_high = (uint32_t)(d->tail_ptr_phys >> 32);
+	phys_low  = (uint32_t)(d->tail_ptr_phys);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
+
+	d->harq_layout = rte_zmalloc_socket("HARQ Layout",
+			ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
+			RTE_CACHE_LINE_SIZE, dev->data->socket_id);
+	if (d->harq_layout == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate harq_layout for %s:%u",
+				dev->device->driver->name,
+				dev->data->dev_id);
+		rte_free(d->sw_rings);
+		return -ENOMEM;
+	}
+
+	/* Mark as configured properly */
+	d->configured = true;
+
+	rte_bbdev_log_debug(
			"ACC100 (%s) configured sw_rings = %p, sw_rings_phys = %#"
+			PRIx64, dev->data->name, d->sw_rings, d->sw_rings_phys);
+
+	return 0;
+}
+
 /* Free 64MB memory used for software rings */
 static int
-acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
+acc100_dev_close(struct rte_bbdev *dev)
 {
+	struct acc100_device *d = dev->data->dev_private;
+	if (d->sw_rings_base != NULL) {
+		rte_free(d->tail_ptrs);
+		rte_free(d->sw_rings_base);
+		d->sw_rings_base = NULL;
+	}
+	usleep(ACC100_LONG_WAIT);
+	return 0;
+}
+
+
+/**
+ * Report an ACC100 queue index which is free
+ * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
+ * Note : Only supporting VF0 Bundle for PF mode
+ */
+static int
+acc100_find_free_queue_idx(struct rte_bbdev *dev,
+		const struct rte_bbdev_queue_conf *conf)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	int op_2_acc[5] = {0, UL_4G, DL_4G, UL_5G, DL_5G};
+	int acc = op_2_acc[conf->op_type];
+	struct rte_q_topology_t *qtop = NULL;
+	qtopFromAcc(&qtop, acc, &(d->acc100_conf));
+	if (qtop == NULL)
+		return -1;
+	/* Identify matching QGroup Index which are sorted in priority order */
+	uint16_t group_idx = qtop->first_qgroup_index;
+	group_idx += conf->priority;
+	if (group_idx >= ACC100_NUM_QGRPS ||
+			conf->priority >= qtop->num_qgroups) {
+		rte_bbdev_log(INFO, "Invalid Priority on %s, priority %u",
+				dev->data->name, conf->priority);
+		return -1;
+	}
+	/* Find a free AQ_idx  */
+	uint16_t aq_idx;
+	for (aq_idx = 0; aq_idx < qtop->num_aqs_per_groups; aq_idx++) {
+		if (((d->q_assigned_bit_map[group_idx] >> aq_idx) & 0x1) == 0) {
+			/* Mark the Queue as assigned */
+			d->q_assigned_bit_map[group_idx] |= (1 << aq_idx);
+			/* Report the AQ Index */
+			return (group_idx << ACC100_GRP_ID_SHIFT) + aq_idx;
+		}
+	}
+	rte_bbdev_log(INFO, "Failed to find free queue on %s, priority %u",
+			dev->data->name, conf->priority);
+	return -1;
+}
+
+/* Setup ACC100 queue */
+static int
+acc100_queue_setup(struct rte_bbdev *dev, uint16_t queue_id,
+		const struct rte_bbdev_queue_conf *conf)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	struct acc100_queue *q;
+	int16_t q_idx;
+
+	/* Allocate the queue data structure. */
+	q = rte_zmalloc_socket(dev->device->driver->name, sizeof(*q),
+			RTE_CACHE_LINE_SIZE, conf->socket);
+	if (q == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate queue memory");
+		return -ENOMEM;
+	}
+
+	q->d = d;
+	q->ring_addr = RTE_PTR_ADD(d->sw_rings, (d->sw_ring_size * queue_id));
+	q->ring_addr_phys = d->sw_rings_phys + (d->sw_ring_size * queue_id);
+
+	/* Prepare the Ring with default descriptor format */
+	union acc100_dma_desc *desc = NULL;
+	unsigned int desc_idx, b_idx;
+	int fcw_len = (conf->op_type == RTE_BBDEV_OP_LDPC_ENC ?
+		ACC100_FCW_LE_BLEN : (conf->op_type == RTE_BBDEV_OP_TURBO_DEC ?
+		ACC100_FCW_TD_BLEN : ACC100_FCW_LD_BLEN));
+
+	for (desc_idx = 0; desc_idx < d->sw_ring_max_depth; desc_idx++) {
+		desc = q->ring_addr + desc_idx;
+		desc->req.word0 = ACC100_DMA_DESC_TYPE;
+		desc->req.word1 = 0; /**< Timestamp */
+		desc->req.word2 = 0;
+		desc->req.word3 = 0;
+		uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+		desc->req.data_ptrs[0].blen = fcw_len;
+		desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
+		desc->req.data_ptrs[0].last = 0;
+		desc->req.data_ptrs[0].dma_ext = 0;
+		for (b_idx = 1; b_idx < ACC100_DMA_MAX_NUM_POINTERS - 1;
+				b_idx++) {
+			desc->req.data_ptrs[b_idx].blkid = ACC100_DMA_BLKID_IN;
+			desc->req.data_ptrs[b_idx].last = 1;
+			desc->req.data_ptrs[b_idx].dma_ext = 0;
+			b_idx++;
+			desc->req.data_ptrs[b_idx].blkid =
+					ACC100_DMA_BLKID_OUT_ENC;
+			desc->req.data_ptrs[b_idx].last = 1;
+			desc->req.data_ptrs[b_idx].dma_ext = 0;
+		}
+		/* Preset some fields of LDPC FCW */
+		desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
+		desc->req.fcw_ld.gain_i = 1;
+		desc->req.fcw_ld.gain_h = 1;
+	}
+
+	q->lb_in = rte_zmalloc_socket(dev->device->driver->name,
+			RTE_CACHE_LINE_SIZE,
+			RTE_CACHE_LINE_SIZE, conf->socket);
+	if (q->lb_in == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate lb_in memory");
+		rte_free(q);
+		return -ENOMEM;
+	}
+	q->lb_in_addr_phys = rte_malloc_virt2iova(q->lb_in);
+	q->lb_out = rte_zmalloc_socket(dev->device->driver->name,
+			RTE_CACHE_LINE_SIZE,
+			RTE_CACHE_LINE_SIZE, conf->socket);
+	if (q->lb_out == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate lb_out memory");
+		rte_free(q->lb_in);
+		rte_free(q);
+		return -ENOMEM;
+	}
+	q->lb_out_addr_phys = rte_malloc_virt2iova(q->lb_out);
+
+	/*
+	 * Software queue ring wraps synchronously with the HW when it reaches
+	 * the boundary of the maximum allocated queue size, no matter what the
+	 * sw queue size is. This wrapping is guarded by setting the wrap_mask
+	 * to represent the maximum queue size as allocated at the time when
+	 * the device has been setup (in configure()).
+	 *
+	 * The queue depth is set to the queue size value (conf->queue_size).
+	 * This limits the occupancy of the queue at any point of time, so that
+	 * the queue does not get swamped with enqueue requests.
+	 */
+	q->sw_ring_depth = conf->queue_size;
+	q->sw_ring_wrap_mask = d->sw_ring_max_depth - 1;
+
+	q->op_type = conf->op_type;
+
+	q_idx = acc100_find_free_queue_idx(dev, conf);
+	if (q_idx == -1) {
+		rte_free(q->lb_in);
+		rte_free(q->lb_out);
+		rte_free(q);
+		return -1;
+	}
+
+	q->qgrp_id = (q_idx >> ACC100_GRP_ID_SHIFT) & 0xF;
+	q->vf_id = (q_idx >> ACC100_VF_ID_SHIFT)  & 0x3F;
+	q->aq_id = q_idx & 0xF;
+	q->aq_depth = (conf->op_type ==  RTE_BBDEV_OP_TURBO_DEC) ?
+			(1 << d->acc100_conf.q_ul_4g.aq_depth_log2) :
+			(1 << d->acc100_conf.q_dl_4g.aq_depth_log2);
+
+	q->mmio_reg_enqueue = RTE_PTR_ADD(d->mmio_base,
+			queue_offset(d->pf_device,
+					q->vf_id, q->qgrp_id, q->aq_id));
+
+	rte_bbdev_log_debug(
+			"Setup dev%u q%u: qgrp_id=%u, vf_id=%u, aq_id=%u, aq_depth=%u, mmio_reg_enqueue=%p",
+			dev->data->dev_id, queue_id, q->qgrp_id, q->vf_id,
+			q->aq_id, q->aq_depth, q->mmio_reg_enqueue);
+
+	dev->data->queues[queue_id].queue_private = q;
+	return 0;
+}
+
+/* Release ACC100 queue */
+static int
+acc100_queue_release(struct rte_bbdev *dev, uint16_t q_id)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	struct acc100_queue *q = dev->data->queues[q_id].queue_private;
+
+	if (q != NULL) {
+		/* Mark the Queue as un-assigned */
+		d->q_assigned_bit_map[q->qgrp_id] &= (0xFFFFFFFF -
+				(1 << q->aq_id));
+		rte_free(q->lb_in);
+		rte_free(q->lb_out);
+		rte_free(q);
+		dev->data->queues[q_id].queue_private = NULL;
+	}
+
 	return 0;
 }
 
@@ -262,8 +695,11 @@
 }
 
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
+	.setup_queues = acc100_setup_queues,
 	.close = acc100_dev_close,
 	.info_get = acc100_dev_info_get,
+	.queue_setup = acc100_queue_setup,
+	.queue_release = acc100_queue_release,
 };
 
 /* ACC100 PCI PF address map */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index de015ca..2508385 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -522,11 +522,56 @@ struct acc100_registry_addr {
 	.ddr_range = HWVfDmaDdrBaseRangeRoVf,
 };
 
+/* Structure associated with each queue. */
+struct __rte_cache_aligned acc100_queue {
+	union acc100_dma_desc *ring_addr;  /* Virtual address of sw ring */
+	rte_iova_t ring_addr_phys;  /* Physical address of software ring */
+	uint32_t sw_ring_head;  /* software ring head */
+	uint32_t sw_ring_tail;  /* software ring tail */
+	/* software ring size (descriptors, not bytes) */
+	uint32_t sw_ring_depth;
+	/* mask used to wrap enqueued descriptors on the sw ring */
+	uint32_t sw_ring_wrap_mask;
+	/* MMIO register used to enqueue descriptors */
+	void *mmio_reg_enqueue;
+	uint8_t vf_id;  /* VF ID (max = 63) */
+	uint8_t qgrp_id;  /* Queue Group ID */
+	uint16_t aq_id;  /* Atomic Queue ID */
+	uint16_t aq_depth;  /* Depth of atomic queue */
+	uint32_t aq_enqueued;  /* Count how many "batches" have been enqueued */
+	uint32_t aq_dequeued;  /* Count how many "batches" have been dequeued */
+	uint32_t irq_enable;  /* Enable ops dequeue interrupts if set to 1 */
+	struct rte_mempool *fcw_mempool;  /* FCW mempool */
+	enum rte_bbdev_op_type op_type;  /* Type of this Queue: TE or TD */
+	/* Internal Buffers for loopback input */
+	uint8_t *lb_in;
+	uint8_t *lb_out;
+	rte_iova_t lb_in_addr_phys;
+	rte_iova_t lb_out_addr_phys;
+	struct acc100_device *d;
+};
+
 /* Private data structure for each ACC100 device */
 struct acc100_device {
 	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
+	void *sw_rings_base;  /* Base addr of un-aligned memory for sw rings */
+	void *sw_rings;  /* 64 MB of 64MB-aligned memory for sw rings */
+	rte_iova_t sw_rings_phys;  /* Physical address of sw_rings */
+	/* Virtual address of the info memory routed to this function under
+	 * operation, whether it is PF or VF.
+	 */
+	union acc100_harq_layout_data *harq_layout;
+	uint32_t sw_ring_size;
 	uint32_t ddr_size; /* Size in kB */
+	uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */
+	rte_iova_t tail_ptr_phys; /* Physical address of tail pointers */
+	/* Max number of entries available for each queue in device, depending
+	 * on how many queues are enabled with configure()
+	 */
+	uint32_t sw_ring_max_depth;
 	struct acc100_conf acc100_conf; /* ACC100 Initial configuration */
+	/* Bitmap capturing which Queues have already been assigned */
+	uint16_t q_assigned_bit_map[ACC100_NUM_QGRPS];
 	bool pf_device; /**< True if this is a PF ACC100 device */
 	bool configured; /**< True if this ACC100 device is configured */
 };
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v10 05/10] baseband/acc100: add LDPC processing functions
  2020-10-01  3:14   ` [dpdk-dev] [PATCH v10 00/10] bbdev PMD ACC100 Nicolas Chautru
                       ` (3 preceding siblings ...)
  2020-10-01  3:14     ` [dpdk-dev] [PATCH v10 04/10] baseband/acc100: add queue configuration Nicolas Chautru
@ 2020-10-01  3:14     ` Nicolas Chautru
  2020-10-01  3:14     ` [dpdk-dev] [PATCH v10 06/10] baseband/acc100: add HARQ loopback support Nicolas Chautru
                       ` (4 subsequent siblings)
  9 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-10-01  3:14 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, trix, ferruh.yigit, tianjiao.liu,
	Nicolas Chautru

Adding LDPC decode and encode processing operations

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
Acked-by: Dave Burley <dave.burley@accelercomm.com>
---
 doc/guides/bbdevs/features/acc100.ini    |    8 +-
 drivers/baseband/acc100/rte_acc100_pmd.c | 1616 +++++++++++++++++++++++++++++-
 drivers/baseband/acc100/rte_acc100_pmd.h |    6 +
 3 files changed, 1624 insertions(+), 6 deletions(-)

diff --git a/doc/guides/bbdevs/features/acc100.ini b/doc/guides/bbdevs/features/acc100.ini
index c89a4d7..40c7adc 100644
--- a/doc/guides/bbdevs/features/acc100.ini
+++ b/doc/guides/bbdevs/features/acc100.ini
@@ -6,9 +6,9 @@
 [Features]
 Turbo Decoder (4G)     = N
 Turbo Encoder (4G)     = N
-LDPC Decoder (5G)      = N
-LDPC Encoder (5G)      = N
-LLR/HARQ Compression   = N
-External DDR Access    = N
+LDPC Decoder (5G)      = Y
+LDPC Encoder (5G)      = Y
+LLR/HARQ Compression   = Y
+External DDR Access    = Y
 HW Accelerated         = Y
 BBDEV API              = Y
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 709a7af..ce2ad68 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -15,6 +15,9 @@
 #include <rte_hexdump.h>
 #include <rte_pci.h>
 #include <rte_bus_pci.h>
+#ifdef RTE_BBDEV_OFFLOAD_COST
+#include <rte_cycles.h>
+#endif
 
 #include <rte_bbdev.h>
 #include <rte_bbdev_pmd.h>
@@ -466,7 +469,6 @@
 	return 0;
 }
 
-
 /**
 * Report an ACC100 queue index which is free
  * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
@@ -656,6 +658,46 @@
 	struct acc100_device *d = dev->data->dev_private;
 
 	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
+		{
+			.type   = RTE_BBDEV_OP_LDPC_ENC,
+			.cap.ldpc_enc = {
+				.capability_flags =
+					RTE_BBDEV_LDPC_RATE_MATCH |
+					RTE_BBDEV_LDPC_CRC_24B_ATTACH |
+					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
+				.num_buffers_src =
+						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+				.num_buffers_dst =
+						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+			}
+		},
+		{
+			.type   = RTE_BBDEV_OP_LDPC_DEC,
+			.cap.ldpc_dec = {
+			.capability_flags =
+				RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK |
+				RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP |
+				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
+				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
+#ifdef ACC100_EXT_MEM
+				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
+				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
+#endif
+				RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE |
+				RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS |
+				RTE_BBDEV_LDPC_DECODE_BYPASS |
+				RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
+				RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
+				RTE_BBDEV_LDPC_LLR_COMPRESSION,
+			.llr_size = 8,
+			.llr_decimals = 1,
+			.num_buffers_src =
+					RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+			.num_buffers_hard_out =
+					RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+			.num_buffers_soft_out = 0,
+			}
+		},
 		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
 	};
 
@@ -691,9 +733,14 @@
 	dev_info->cpu_flag_reqs = NULL;
 	dev_info->min_alignment = 64;
 	dev_info->capabilities = bbdev_capabilities;
+#ifdef ACC100_EXT_MEM
 	dev_info->harq_buffer_size = d->ddr_size;
+#else
+	dev_info->harq_buffer_size = 0;
+#endif
 }
 
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.setup_queues = acc100_setup_queues,
 	.close = acc100_dev_close,
@@ -718,6 +765,1568 @@
 	{.device_id = 0},
 };
 
+/* Read flag value 0/1 from bitmap */
+static inline bool
+check_bit(uint32_t bitmap, uint32_t bitmask)
+{
+	return bitmap & bitmask;
+}
+
+static inline char *
+mbuf_append(struct rte_mbuf *m_head, struct rte_mbuf *m, uint16_t len)
+{
+	if (unlikely(len > rte_pktmbuf_tailroom(m)))
+		return NULL;
+
+	char *tail = (char *)m->buf_addr + m->data_off + m->data_len;
+	m->data_len = (uint16_t)(m->data_len + len);
+	m_head->pkt_len  = (m_head->pkt_len + len);
+	return tail;
+}
+
+/* Compute value of k0.
+ * Based on 3GPP 38.212 Table 5.4.2.1-2
+ * Starting position of different redundancy versions, k0
+ */
+static inline uint16_t
+get_k0(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv_index)
+{
+	if (rv_index == 0)
+		return 0;
+	uint16_t n = (bg == 1 ? ACC100_N_ZC_1 : ACC100_N_ZC_2) * z_c;
+	if (n_cb == n) {
+		if (rv_index == 1)
+			return (bg == 1 ? ACC100_K0_1_1 : ACC100_K0_1_2) * z_c;
+		else if (rv_index == 2)
+			return (bg == 1 ? ACC100_K0_2_1 : ACC100_K0_2_2) * z_c;
+		else
+			return (bg == 1 ? ACC100_K0_3_1 : ACC100_K0_3_2) * z_c;
+	}
+	/* LBRM case - includes a division by N */
+	if (rv_index == 1)
+		return (((bg == 1 ? ACC100_K0_1_1 : ACC100_K0_1_2) * n_cb)
+				/ n) * z_c;
+	else if (rv_index == 2)
+		return (((bg == 1 ? ACC100_K0_2_1 : ACC100_K0_2_2) * n_cb)
+				/ n) * z_c;
+	else
+		return (((bg == 1 ? ACC100_K0_3_1 : ACC100_K0_3_2) * n_cb)
+				/ n) * z_c;
+}
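For reference, the redundancy-version starting positions computed by get_k0() above can be restated directly from 3GPP TS 38.212 Table 5.4.2.1-2. The sketch below is standalone and assumes the ACC100_N_ZC_* and ACC100_K0_*_* macros carry the spec constants (66/50 circular-buffer columns; 17/33/56 numerators for BG1 and 13/25/43 for BG2); it is an illustration of the arithmetic, not the PMD code:

```c
#include <stdint.h>

/* Standalone sketch of k0 per 3GPP TS 38.212 Table 5.4.2.1-2.
 * Assumption: the ACC100_* macros in the patch hold these same
 * spec constants.
 */
static uint16_t
k0_sketch(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv)
{
	static const uint16_t num[2][4] = {
		{0, 17, 33, 56},	/* BG1, N = 66 * Zc */
		{0, 13, 25, 43},	/* BG2, N = 50 * Zc */
	};
	uint16_t n = (bg == 1 ? 66 : 50) * z_c;

	if (rv == 0)
		return 0;
	if (n_cb == n)	/* Full circular buffer: no division needed */
		return num[bg - 1][rv] * z_c;
	/* Limited buffer rate matching (LBRM): scale by n_cb / N */
	return (uint16_t)(((uint32_t)num[bg - 1][rv] * n_cb / n) * z_c);
}
```

The full-buffer branch matches the first half of get_k0(); the LBRM branch matches the "includes a division by N" tail.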
+
+/* Fill in a frame control word for LDPC encoding. */
+static inline void
+acc100_fcw_le_fill(const struct rte_bbdev_enc_op *op,
+		struct acc100_fcw_le *fcw, int num_cb)
+{
+	fcw->qm = op->ldpc_enc.q_m;
+	fcw->nfiller = op->ldpc_enc.n_filler;
+	fcw->BG = (op->ldpc_enc.basegraph - 1);
+	fcw->Zc = op->ldpc_enc.z_c;
+	fcw->ncb = op->ldpc_enc.n_cb;
+	fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_enc.basegraph,
+			op->ldpc_enc.rv_index);
+	fcw->rm_e = op->ldpc_enc.cb_params.e;
+	fcw->crc_select = check_bit(op->ldpc_enc.op_flags,
+			RTE_BBDEV_LDPC_CRC_24B_ATTACH);
+	fcw->bypass_intlv = check_bit(op->ldpc_enc.op_flags,
+			RTE_BBDEV_LDPC_INTERLEAVER_BYPASS);
+	fcw->mcb_count = num_cb;
+}
+
+/* Fill in a frame control word for LDPC decoding. */
+static inline void
+acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
+		union acc100_harq_layout_data *harq_layout)
+{
+	uint16_t harq_out_length, harq_in_length, ncb_p, k0_p, parity_offset;
+	uint16_t harq_index;
+	uint32_t l;
+	bool harq_prun = false;
+
+	fcw->qm = op->ldpc_dec.q_m;
+	fcw->nfiller = op->ldpc_dec.n_filler;
+	fcw->BG = (op->ldpc_dec.basegraph - 1);
+	fcw->Zc = op->ldpc_dec.z_c;
+	fcw->ncb = op->ldpc_dec.n_cb;
+	fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_dec.basegraph,
+			op->ldpc_dec.rv_index);
+	if (op->ldpc_dec.code_block_mode == 1)
+		fcw->rm_e = op->ldpc_dec.cb_params.e;
+	else
+		fcw->rm_e = (op->ldpc_dec.tb_params.r <
+				op->ldpc_dec.tb_params.cab) ?
+						op->ldpc_dec.tb_params.ea :
+						op->ldpc_dec.tb_params.eb;
+
+	fcw->hcin_en = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE);
+	fcw->hcout_en = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE);
+	fcw->crc_select = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK);
+	fcw->bypass_dec = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_DECODE_BYPASS);
+	fcw->bypass_intlv = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS);
+	if (op->ldpc_dec.q_m == 1) {
+		fcw->bypass_intlv = 1;
+		fcw->qm = 2;
+	}
+	fcw->hcin_decomp_mode = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+	fcw->hcout_comp_mode = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+	fcw->llr_pack_mode = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_LLR_COMPRESSION);
+	harq_index = op->ldpc_dec.harq_combined_output.offset /
+			ACC100_HARQ_OFFSET;
+#ifdef ACC100_EXT_MEM
+	/* Limit cases when HARQ pruning is valid */
+	harq_prun = ((op->ldpc_dec.harq_combined_output.offset %
+			ACC100_HARQ_OFFSET) == 0) &&
+			(op->ldpc_dec.harq_combined_output.offset <= UINT16_MAX
+			* ACC100_HARQ_OFFSET);
+#endif
+	if (fcw->hcin_en > 0) {
+		harq_in_length = op->ldpc_dec.harq_combined_input.length;
+		if (fcw->hcin_decomp_mode > 0)
+			harq_in_length = harq_in_length * 8 / 6;
+		harq_in_length = RTE_ALIGN(harq_in_length, 64);
+		if ((harq_layout[harq_index].offset > 0) && harq_prun) {
+			rte_bbdev_log_debug("HARQ IN offset unexpected for now\n");
+			fcw->hcin_size0 = harq_layout[harq_index].size0;
+			fcw->hcin_offset = harq_layout[harq_index].offset;
+			fcw->hcin_size1 = harq_in_length -
+					harq_layout[harq_index].offset;
+		} else {
+			fcw->hcin_size0 = harq_in_length;
+			fcw->hcin_offset = 0;
+			fcw->hcin_size1 = 0;
+		}
+	} else {
+		fcw->hcin_size0 = 0;
+		fcw->hcin_offset = 0;
+		fcw->hcin_size1 = 0;
+	}
+
+	fcw->itmax = op->ldpc_dec.iter_max;
+	fcw->itstop = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE);
+	fcw->synd_precoder = fcw->itstop;
+	/*
+	 * These are all implicitly set
+	 * fcw->synd_post = 0;
+	 * fcw->so_en = 0;
+	 * fcw->so_bypass_rm = 0;
+	 * fcw->so_bypass_intlv = 0;
+	 * fcw->dec_convllr = 0;
+	 * fcw->hcout_convllr = 0;
+	 * fcw->hcout_size1 = 0;
+	 * fcw->so_it = 0;
+	 * fcw->hcout_offset = 0;
+	 * fcw->negstop_th = 0;
+	 * fcw->negstop_it = 0;
+	 * fcw->negstop_en = 0;
+	 * fcw->gain_i = 1;
+	 * fcw->gain_h = 1;
+	 */
+	if (fcw->hcout_en > 0) {
+		parity_offset = (op->ldpc_dec.basegraph == 1 ? 20 : 8)
+			* op->ldpc_dec.z_c - op->ldpc_dec.n_filler;
+		k0_p = (fcw->k0 > parity_offset) ?
+				fcw->k0 - op->ldpc_dec.n_filler : fcw->k0;
+		ncb_p = fcw->ncb - op->ldpc_dec.n_filler;
+		l = k0_p + fcw->rm_e;
+		harq_out_length = (uint16_t) fcw->hcin_size0;
+		harq_out_length = RTE_MIN(RTE_MAX(harq_out_length, l), ncb_p);
+		harq_out_length = (harq_out_length + 0x3F) & 0xFFC0;
+		if ((k0_p > fcw->hcin_size0 + ACC100_HARQ_OFFSET_THRESHOLD) &&
+				harq_prun) {
+			fcw->hcout_size0 = (uint16_t) fcw->hcin_size0;
+			fcw->hcout_offset = k0_p & 0xFFC0;
+			fcw->hcout_size1 = harq_out_length - fcw->hcout_offset;
+		} else {
+			fcw->hcout_size0 = harq_out_length;
+			fcw->hcout_size1 = 0;
+			fcw->hcout_offset = 0;
+		}
+		harq_layout[harq_index].offset = fcw->hcout_offset;
+		harq_layout[harq_index].size0 = fcw->hcout_size0;
+	} else {
+		fcw->hcout_size0 = 0;
+		fcw->hcout_size1 = 0;
+		fcw->hcout_offset = 0;
+	}
+}
+
+/**
+ * Fills descriptor with data pointers of one block type.
+ *
+ * @param desc
+ *   Pointer to DMA descriptor.
+ * @param input
+ *   Pointer to pointer to input data which will be encoded. It may be
+ *   advanced to the next segment in the scatter-gather case.
+ * @param offset
+ *   Input offset in rte_mbuf structure. It is used for calculating the point
+ *   where data is starting.
+ * @param cb_len
+ *   Length of currently processed Code Block
+ * @param seg_total_left
+ *   It indicates how many bytes still left in segment (mbuf) for further
+ *   processing.
+ * @param next_triplet
+ *   Index for ACC100 DMA Descriptor triplet
+ *
+ * @return
+ *   Returns the index of the next triplet on success, a negative value
+ *   if the lengths of the mbuf data and the processed CB do not match.
+ *
+ */
+static inline int
+acc100_dma_fill_blk_type_in(struct acc100_dma_req_desc *desc,
+		struct rte_mbuf **input, uint32_t *offset, uint32_t cb_len,
+		uint32_t *seg_total_left, int next_triplet)
+{
+	uint32_t part_len;
+	struct rte_mbuf *m = *input;
+
+	part_len = (*seg_total_left < cb_len) ? *seg_total_left : cb_len;
+	cb_len -= part_len;
+	*seg_total_left -= part_len;
+
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(m, *offset);
+	desc->data_ptrs[next_triplet].blen = part_len;
+	desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
+	desc->data_ptrs[next_triplet].last = 0;
+	desc->data_ptrs[next_triplet].dma_ext = 0;
+	*offset += part_len;
+	next_triplet++;
+
+	while (cb_len > 0) {
+		if (next_triplet < ACC100_DMA_MAX_NUM_POINTERS &&
+				m->next != NULL) {
+
+			m = m->next;
+			*seg_total_left = rte_pktmbuf_data_len(m);
+			part_len = (*seg_total_left < cb_len) ?
+					*seg_total_left :
+					cb_len;
+			desc->data_ptrs[next_triplet].address =
+					rte_pktmbuf_iova_offset(m, 0);
+			desc->data_ptrs[next_triplet].blen = part_len;
+			desc->data_ptrs[next_triplet].blkid =
+					ACC100_DMA_BLKID_IN;
+			desc->data_ptrs[next_triplet].last = 0;
+			desc->data_ptrs[next_triplet].dma_ext = 0;
+			cb_len -= part_len;
+			*seg_total_left -= part_len;
+			/* Initializing offset for next segment (mbuf) */
+			*offset = part_len;
+			next_triplet++;
+		} else {
+			rte_bbdev_log(ERR,
+				"Some data still left for processing: "
+				"data_left: %u, next_triplet: %u, next_mbuf: %p",
+				cb_len, next_triplet, m->next);
+			return -EINVAL;
+		}
+	}
+	/* Store the new mbuf as it may have changed in the scatter-gather case */
+	*input = m;
+
+	return next_triplet;
+}
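The scatter-gather walk in acc100_dma_fill_blk_type_in() is easier to see in isolation. This hypothetical model replaces the mbuf chain with a plain array of segment lengths and only counts how many DMA triplets one code block would consume; it deliberately ignores the ACC100_DMA_MAX_NUM_POINTERS cap but keeps the same failure mode (the -EINVAL path) when the segments cannot cover the CB:

```c
#include <stddef.h>
#include <stdint.h>

/* seg_len[] stands in for the chained mbuf data lengths; first_offset
 * mirrors *offset into the first segment. Returns the number of
 * triplets consumed, or -1 when data is left over.
 */
static int
count_triplets(const uint32_t *seg_len, size_t nsegs,
		uint32_t first_offset, uint32_t cb_len)
{
	uint32_t avail;
	size_t i;
	int triplets = 0;

	for (i = 0; i < nsegs && cb_len > 0; i++) {
		avail = seg_len[i] - (i == 0 ? first_offset : 0);
		cb_len -= avail < cb_len ? avail : cb_len;
		triplets++;
	}
	return cb_len == 0 ? triplets : -1;
}
```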
+
+/* Fills descriptor with data pointers of one block type.
+ * Returns index of next triplet on success, other value if lengths of
+ * output data and processed mbuf do not match.
+ */
+static inline int
+acc100_dma_fill_blk_type_out(struct acc100_dma_req_desc *desc,
+		struct rte_mbuf *output, uint32_t out_offset,
+		uint32_t output_len, int next_triplet, int blk_id)
+{
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(output, out_offset);
+	desc->data_ptrs[next_triplet].blen = output_len;
+	desc->data_ptrs[next_triplet].blkid = blk_id;
+	desc->data_ptrs[next_triplet].last = 0;
+	desc->data_ptrs[next_triplet].dma_ext = 0;
+	next_triplet++;
+
+	return next_triplet;
+}
+
+static inline void acc100_header_init(struct acc100_dma_req_desc *desc)
+{
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /**< Timestamp could be disabled */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+}
+
+static inline int
+acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
+		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
+		struct rte_mbuf *output, uint32_t *in_offset,
+		uint32_t *out_offset, uint32_t *out_length,
+		uint32_t *mbuf_total_left, uint32_t *seg_total_left)
+{
+	int next_triplet = 1; /* FCW already done */
+	uint16_t K, in_length_in_bits, in_length_in_bytes;
+	struct rte_bbdev_op_ldpc_enc *enc = &op->ldpc_enc;
+
+	acc100_header_init(desc);
+
+	K = (enc->basegraph == 1 ? 22 : 10) * enc->z_c;
+	in_length_in_bits = K - enc->n_filler;
+	if ((enc->op_flags & RTE_BBDEV_LDPC_CRC_24A_ATTACH) ||
+			(enc->op_flags & RTE_BBDEV_LDPC_CRC_24B_ATTACH))
+		in_length_in_bits -= 24;
+	in_length_in_bytes = in_length_in_bits >> 3;
+
+	if (unlikely((*mbuf_total_left == 0) ||
+			(*mbuf_total_left < in_length_in_bytes))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, in_length_in_bytes);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
+			in_length_in_bytes,
+			seg_total_left, next_triplet);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= in_length_in_bytes;
+
+	/* Set output length */
+	/* Integer round up division by 8 */
+	*out_length = (enc->cb_params.e + 7) >> 3;
+
+	next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
+			*out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
+	op->ldpc_enc.output.length += *out_length;
+	*out_offset += *out_length;
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->data_ptrs[next_triplet - 1].dma_ext = 0;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
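The rate-matched encoder output set above is E bits rounded up to whole bytes, written (e + 7) >> 3 in acc100_dma_desc_le_fill(). A tiny standalone restatement of that integer round-up division:

```c
#include <stdint.h>

/* ceil(e / 8): number of output bytes for a rate-matched length of
 * e bits, as used for *out_length above.
 */
static uint32_t
rm_out_bytes(uint32_t e)
{
	return (e + 7) >> 3;
}
```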
+
+static inline int
+acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
+		struct acc100_dma_req_desc *desc,
+		struct rte_mbuf **input, struct rte_mbuf *h_output,
+		uint32_t *in_offset, uint32_t *h_out_offset,
+		uint32_t *h_out_length, uint32_t *mbuf_total_left,
+		uint32_t *seg_total_left,
+		struct acc100_fcw_ld *fcw)
+{
+	struct rte_bbdev_op_ldpc_dec *dec = &op->ldpc_dec;
+	int next_triplet = 1; /* FCW already done */
+	uint32_t input_length;
+	uint16_t output_length, crc24_overlap = 0;
+	uint16_t sys_cols, K, h_p_size, h_np_size;
+	bool h_comp = check_bit(dec->op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+
+	acc100_header_init(desc);
+
+	if (check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP))
+		crc24_overlap = 24;
+
+	/* Compute some LDPC BG lengths */
+	input_length = dec->cb_params.e;
+	if (check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_LLR_COMPRESSION))
+		input_length = (input_length * 3 + 3) / 4;
+	sys_cols = (dec->basegraph == 1) ? 22 : 10;
+	K = sys_cols * dec->z_c;
+	output_length = K - dec->n_filler - crc24_overlap;
+
+	if (unlikely((*mbuf_total_left == 0) ||
+			(*mbuf_total_left < input_length))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, input_length);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input,
+			in_offset, input_length,
+			seg_total_left, next_triplet);
+
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
+		h_p_size = fcw->hcin_size0 + fcw->hcin_size1;
+		if (h_comp)
+			h_p_size = (h_p_size * 3 + 3) / 4;
+		desc->data_ptrs[next_triplet].address =
+				dec->harq_combined_input.offset;
+		desc->data_ptrs[next_triplet].blen = h_p_size;
+		desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN_HARQ;
+		desc->data_ptrs[next_triplet].dma_ext = 1;
+#ifndef ACC100_EXT_MEM
+		acc100_dma_fill_blk_type_out(
+				desc,
+				op->ldpc_dec.harq_combined_input.data,
+				op->ldpc_dec.harq_combined_input.offset,
+				h_p_size,
+				next_triplet,
+				ACC100_DMA_BLKID_IN_HARQ);
+#endif
+		next_triplet++;
+	}
+
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= input_length;
+
+	next_triplet = acc100_dma_fill_blk_type_out(desc, h_output,
+			*h_out_offset, output_length >> 3, next_triplet,
+			ACC100_DMA_BLKID_OUT_HARD);
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
+		/* Pruned size of the HARQ */
+		h_p_size = fcw->hcout_size0 + fcw->hcout_size1;
+		/* Non-Pruned size of the HARQ */
+		h_np_size = fcw->hcout_offset > 0 ?
+				fcw->hcout_offset + fcw->hcout_size1 :
+				h_p_size;
+		if (h_comp) {
+			h_np_size = (h_np_size * 3 + 3) / 4;
+			h_p_size = (h_p_size * 3 + 3) / 4;
+		}
+		dec->harq_combined_output.length = h_np_size;
+		desc->data_ptrs[next_triplet].address =
+				dec->harq_combined_output.offset;
+		desc->data_ptrs[next_triplet].blen = h_p_size;
+		desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_OUT_HARQ;
+		desc->data_ptrs[next_triplet].dma_ext = 1;
+#ifndef ACC100_EXT_MEM
+		acc100_dma_fill_blk_type_out(
+				desc,
+				dec->harq_combined_output.data,
+				dec->harq_combined_output.offset,
+				h_p_size,
+				next_triplet,
+				ACC100_DMA_BLKID_OUT_HARQ);
+#endif
+		next_triplet++;
+	}
+
+	*h_out_length = output_length >> 3;
+	dec->hard_output.length += *h_out_length;
+	*h_out_offset += *h_out_length;
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
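The HARQ buffer sizing above repeatedly converts between compressed and uncompressed lengths for RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION: 8-bit LLRs are stored as 6-bit values, so u uncompressed bytes compress to ceil(u * 3/4) bytes, written (u * 3 + 3) / 4 in the PMD, and c compressed bytes expand back to c * 8 / 6. A minimal sketch of just those two conversions:

```c
#include <stdint.h>

/* 8-bit -> 6-bit packing: ceil(u * 3 / 4) compressed bytes */
static uint32_t
harq_compressed_size(uint32_t uncompressed)
{
	return (uncompressed * 3 + 3) / 4;
}

/* 6-bit -> 8-bit expansion, as applied to harq_in_length in the
 * FCW fill when hcin_decomp_mode is set
 */
static uint32_t
harq_uncompressed_size(uint32_t compressed)
{
	return compressed * 8 / 6;
}
```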
+
+static inline void
+acc100_dma_desc_ld_update(struct rte_bbdev_dec_op *op,
+		struct acc100_dma_req_desc *desc,
+		struct rte_mbuf *input, struct rte_mbuf *h_output,
+		uint32_t *in_offset, uint32_t *h_out_offset,
+		uint32_t *h_out_length,
+		union acc100_harq_layout_data *harq_layout)
+{
+	int next_triplet = 1; /* FCW already done */
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(input, *in_offset);
+	next_triplet++;
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
+		struct rte_bbdev_op_data hi = op->ldpc_dec.harq_combined_input;
+		desc->data_ptrs[next_triplet].address = hi.offset;
+#ifndef ACC100_EXT_MEM
+		desc->data_ptrs[next_triplet].address =
+				rte_pktmbuf_iova_offset(hi.data, hi.offset);
+#endif
+		next_triplet++;
+	}
+
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(h_output, *h_out_offset);
+	*h_out_length = desc->data_ptrs[next_triplet].blen;
+	next_triplet++;
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
+		desc->data_ptrs[next_triplet].address =
+				op->ldpc_dec.harq_combined_output.offset;
+		/* Adjust based on previous operation */
+		struct rte_bbdev_dec_op *prev_op = desc->op_addr;
+		op->ldpc_dec.harq_combined_output.length =
+				prev_op->ldpc_dec.harq_combined_output.length;
+		int16_t hq_idx = op->ldpc_dec.harq_combined_output.offset /
+				ACC100_HARQ_OFFSET;
+		int16_t prev_hq_idx =
+				prev_op->ldpc_dec.harq_combined_output.offset
+				/ ACC100_HARQ_OFFSET;
+		harq_layout[hq_idx].val = harq_layout[prev_hq_idx].val;
+#ifndef ACC100_EXT_MEM
+		struct rte_bbdev_op_data ho =
+				op->ldpc_dec.harq_combined_output;
+		desc->data_ptrs[next_triplet].address =
+				rte_pktmbuf_iova_offset(ho.data, ho.offset);
+#endif
+		next_triplet++;
+	}
+
+	op->ldpc_dec.hard_output.length += *h_out_length;
+	desc->op_addr = op;
+}
+
+/* Enqueue a number of operations to HW and update software rings */
+static inline void
+acc100_dma_enqueue(struct acc100_queue *q, uint16_t n,
+		struct rte_bbdev_stats *queue_stats)
+{
+	union acc100_enqueue_reg_fmt enq_req;
+#ifdef RTE_BBDEV_OFFLOAD_COST
+	uint64_t start_time = 0;
+	queue_stats->acc_offload_cycles = 0;
+#else
+	RTE_SET_USED(queue_stats);
+#endif
+
+	enq_req.val = 0;
+	/* Setting offset, 100b for 256 DMA Desc */
+	enq_req.addr_offset = ACC100_DESC_OFFSET;
+
+	/* Split ops into batches */
+	do {
+		union acc100_dma_desc *desc;
+		uint16_t enq_batch_size;
+		uint64_t offset;
+		rte_iova_t req_elem_addr;
+
+		enq_batch_size = RTE_MIN(n, MAX_ENQ_BATCH_SIZE);
+
+		/* Set flag on last descriptor in a batch */
+		desc = q->ring_addr + ((q->sw_ring_head + enq_batch_size - 1) &
+				q->sw_ring_wrap_mask);
+		desc->req.last_desc_in_batch = 1;
+
+		/* Calculate the 1st descriptor's address */
+		offset = ((q->sw_ring_head & q->sw_ring_wrap_mask) *
+				sizeof(union acc100_dma_desc));
+		req_elem_addr = q->ring_addr_phys + offset;
+
+		/* Fill enqueue struct */
+		enq_req.num_elem = enq_batch_size;
+		/* low 6 bits are not needed */
+		enq_req.req_elem_addr = (uint32_t)(req_elem_addr >> 6);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "Req sdone", desc, sizeof(*desc));
+#endif
+		rte_bbdev_log_debug(
+				"Enqueue %u reqs (phys %#"PRIx64") to reg %p",
+				enq_batch_size,
+				req_elem_addr,
+				(void *)q->mmio_reg_enqueue);
+
+		rte_wmb();
+
+#ifdef RTE_BBDEV_OFFLOAD_COST
+		/* Start time measurement for enqueue function offload. */
+		start_time = rte_rdtsc_precise();
+#endif
+		rte_bbdev_log(DEBUG, "Debug : MMIO Enqueue");
+		mmio_write(q->mmio_reg_enqueue, enq_req.val);
+
+#ifdef RTE_BBDEV_OFFLOAD_COST
+		queue_stats->acc_offload_cycles +=
+				rte_rdtsc_precise() - start_time;
+#endif
+
+		q->aq_enqueued++;
+		q->sw_ring_head += enq_batch_size;
+		n -= enq_batch_size;
+
+	} while (n);
+
+}
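The sw-ring arithmetic in acc100_dma_enqueue() relies on a power-of-two ring: the head is a free-running counter and the descriptor slot comes from masking with (depth - 1). The model below illustrates that indexing and the batch-splitting loop; RING_DEPTH and MAX_BATCH are illustrative stand-ins for the configured ring size and MAX_ENQ_BATCH_SIZE, not the driver's values:

```c
#include <stdint.h>

#define RING_DEPTH 16u			/* must be a power of two */
#define WRAP_MASK  (RING_DEPTH - 1)
#define MAX_BATCH  4u

/* Slot index for a free-running head counter */
static uint32_t
slot_of(uint32_t head)
{
	return head & WRAP_MASK;
}

/* Number of MMIO doorbell writes the enqueue loop would issue for
 * n descriptors, splitting them into batches of at most MAX_BATCH
 */
static uint32_t
num_batches(uint32_t n)
{
	uint32_t writes = 0;

	while (n) {
		uint32_t b = n < MAX_BATCH ? n : MAX_BATCH;
		n -= b;
		writes++;
	}
	return writes;
}
```

Because the mask replaces a modulo, the head never needs to be wrapped explicitly, which is why q->sw_ring_head simply keeps incrementing in the driver.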
+
+/* Enqueue a group of encode operations for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
+		uint16_t total_enqueued_cbs, int16_t num)
+{
+	union acc100_dma_desc *desc = NULL;
+	uint32_t out_length;
+	struct rte_mbuf *output_head, *output;
+	int i, next_triplet;
+	uint16_t  in_length_in_bytes;
+	struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_le_fill(ops[0], &desc->req.fcw_le, num);
+
+	/* This could be done at polling time */
+	desc->req.word0 = ACC100_DMA_DESC_TYPE;
+	desc->req.word1 = 0; /**< Timestamp could be disabled */
+	desc->req.word2 = 0;
+	desc->req.word3 = 0;
+	desc->req.numCBs = num;
+
+	in_length_in_bytes = ops[0]->ldpc_enc.input.data->data_len;
+	out_length = (enc->cb_params.e + 7) >> 3;
+	desc->req.m2dlen = 1 + num;
+	desc->req.d2mlen = num;
+	next_triplet = 1;
+
+	for (i = 0; i < num; i++) {
+		desc->req.data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(ops[i]->ldpc_enc.input.data, 0);
+		desc->req.data_ptrs[next_triplet].blen = in_length_in_bytes;
+		next_triplet++;
+		desc->req.data_ptrs[next_triplet].address =
+				rte_pktmbuf_iova_offset(
+				ops[i]->ldpc_enc.output.data, 0);
+		desc->req.data_ptrs[next_triplet].blen = out_length;
+		next_triplet++;
+		ops[i]->ldpc_enc.output.length = out_length;
+		output_head = output = ops[i]->ldpc_enc.output.data;
+		mbuf_append(output_head, output, out_length);
+		output->data_len = out_length;
+	}
+
+	desc->req.op_addr = ops[0];
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_le,
+			sizeof(desc->req.fcw_le) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+	/* Multiple CBs (ops) were successfully prepared to enqueue */
+	return num;
+}
+
+/* Enqueue one encode operation for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+		uint16_t total_enqueued_cbs)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+		seg_total_left;
+	struct rte_mbuf *input, *output_head, *output;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_le_fill(op, &desc->req.fcw_le, 1);
+
+	input = op->ldpc_enc.input.data;
+	output_head = output = op->ldpc_enc.output.data;
+	in_offset = op->ldpc_enc.input.offset;
+	out_offset = op->ldpc_enc.output.offset;
+	out_length = 0;
+	mbuf_total_left = op->ldpc_enc.input.length;
+	seg_total_left = rte_pktmbuf_data_len(op->ldpc_enc.input.data)
+			- in_offset;
+
+	ret = acc100_dma_desc_le_fill(op, &desc->req, &input, output,
+			&in_offset, &out_offset, &out_length, &mbuf_total_left,
+			&seg_total_left);
+
+	if (unlikely(ret < 0))
+		return ret;
+
+	mbuf_append(output_head, output, out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_le,
+			sizeof(desc->req.fcw_le) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+
+	/* Check if any data left after processing one CB */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left after processing one CB: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+/* Enqueue one decode operation for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs, bool same_op)
+{
+	int ret;
+
+	union acc100_dma_desc *desc;
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	struct rte_mbuf *input, *h_output_head, *h_output;
+	uint32_t in_offset, h_out_offset, mbuf_total_left, h_out_length = 0;
+	input = op->ldpc_dec.input.data;
+	h_output_head = h_output = op->ldpc_dec.hard_output.data;
+	in_offset = op->ldpc_dec.input.offset;
+	h_out_offset = op->ldpc_dec.hard_output.offset;
+	mbuf_total_left = op->ldpc_dec.input.length;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(input == NULL)) {
+		rte_bbdev_log(ERR, "Invalid mbuf pointer");
+		return -EFAULT;
+	}
+#endif
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+
+	if (same_op) {
+		union acc100_dma_desc *prev_desc;
+		desc_idx = ((q->sw_ring_head + total_enqueued_cbs - 1)
+				& q->sw_ring_wrap_mask);
+		prev_desc = q->ring_addr + desc_idx;
+		uint8_t *prev_ptr = (uint8_t *) prev_desc;
+		uint8_t *new_ptr = (uint8_t *) desc;
+		/* Copy first 4 words and BDESCs */
+		rte_memcpy(new_ptr, prev_ptr, ACC100_5GUL_SIZE_0);
+		rte_memcpy(new_ptr + ACC100_5GUL_OFFSET_0,
+				prev_ptr + ACC100_5GUL_OFFSET_0,
+				ACC100_5GUL_SIZE_1);
+		desc->req.op_addr = prev_desc->req.op_addr;
+		/* Copy FCW */
+		rte_memcpy(new_ptr + ACC100_DESC_FCW_OFFSET,
+				prev_ptr + ACC100_DESC_FCW_OFFSET,
+				ACC100_FCW_LD_BLEN);
+		acc100_dma_desc_ld_update(op, &desc->req, input, h_output,
+				&in_offset, &h_out_offset,
+				&h_out_length, harq_layout);
+	} else {
+		struct acc100_fcw_ld *fcw;
+		uint32_t seg_total_left;
+		fcw = &desc->req.fcw_ld;
+		acc100_fcw_ld_fill(op, fcw, harq_layout);
+
+		/* Special handling when overusing mbuf */
+		if (fcw->rm_e < ACC100_MAX_E_MBUF)
+			seg_total_left = rte_pktmbuf_data_len(input)
+					- in_offset;
+		else
+			seg_total_left = fcw->rm_e;
+
+		ret = acc100_dma_desc_ld_fill(op, &desc->req, &input, h_output,
+				&in_offset, &h_out_offset,
+				&h_out_length, &mbuf_total_left,
+				&seg_total_left, fcw);
+		if (unlikely(ret < 0))
+			return ret;
+	}
+
+	/* Hard output */
+	mbuf_append(h_output_head, h_output, h_out_length);
+#ifndef ACC100_EXT_MEM
+	if (op->ldpc_dec.harq_combined_output.length > 0) {
+		/* Push the HARQ output into host memory */
+		struct rte_mbuf *hq_output_head, *hq_output;
+		hq_output_head = op->ldpc_dec.harq_combined_output.data;
+		hq_output = op->ldpc_dec.harq_combined_output.data;
+		mbuf_append(hq_output_head, hq_output,
+				op->ldpc_dec.harq_combined_output.length);
+	}
+#endif
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
+			sizeof(desc->req.fcw_ld) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+/* Enqueue one decode operation for ACC100 device in TB mode */
+static inline int
+enqueue_ldpc_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint8_t r, c;
+	uint32_t in_offset, h_out_offset,
+		h_out_length, mbuf_total_left, seg_total_left;
+	struct rte_mbuf *input, *h_output_head, *h_output;
+	uint16_t current_enqueued_cbs = 0;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+	acc100_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout);
+
+	input = op->ldpc_dec.input.data;
+	h_output_head = h_output = op->ldpc_dec.hard_output.data;
+	in_offset = op->ldpc_dec.input.offset;
+	h_out_offset = op->ldpc_dec.hard_output.offset;
+	h_out_length = 0;
+	mbuf_total_left = op->ldpc_dec.input.length;
+	c = op->ldpc_dec.tb_params.c;
+	r = op->ldpc_dec.tb_params.r;
+
+	while (mbuf_total_left > 0 && r < c) {
+		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+
+		/* Set up DMA descriptor */
+		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+				& q->sw_ring_wrap_mask);
+		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+		desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
+		ret = acc100_dma_desc_ld_fill(op, &desc->req, &input,
+				h_output, &in_offset, &h_out_offset,
+				&h_out_length,
+				&mbuf_total_left, &seg_total_left,
+				&desc->req.fcw_ld);
+
+		if (unlikely(ret < 0))
+			return ret;
+
+		/* Hard output */
+		mbuf_append(h_output_head, h_output, h_out_length);
+
+		/* Set total number of CBs in TB */
+		desc->req.cbs_in_tb = cbs_in_tb;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "FCW", &desc->req.fcw_td,
+				sizeof(desc->req.fcw_td) - 8);
+		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+		if (seg_total_left == 0) {
+			/* Go to the next mbuf */
+			input = input->next;
+			in_offset = 0;
+			h_output = h_output->next;
+			h_out_offset = 0;
+		}
+		total_enqueued_cbs++;
+		current_enqueued_cbs++;
+		r++;
+	}
+
+	if (unlikely(desc == NULL))
+		return current_enqueued_cbs;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Check if any CBs left for processing */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left for processing: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+	/* Set SDone on last CB descriptor for TB mode */
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	return current_enqueued_cbs;
+}
+
+/* Calculates number of CBs in processed encoder TB based on 'r' and input
+ * length.
+ */
+static inline uint8_t
+get_num_cbs_in_tb_enc(struct rte_bbdev_op_turbo_enc *turbo_enc)
+{
+	uint8_t c, c_neg, r, crc24_bits = 0;
+	uint16_t k, k_neg, k_pos;
+	uint8_t cbs_in_tb = 0;
+	int32_t length;
+
+	length = turbo_enc->input.length;
+	r = turbo_enc->tb_params.r;
+	c = turbo_enc->tb_params.c;
+	c_neg = turbo_enc->tb_params.c_neg;
+	k_neg = turbo_enc->tb_params.k_neg;
+	k_pos = turbo_enc->tb_params.k_pos;
+	crc24_bits = 0;
+	if (check_bit(turbo_enc->op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH))
+		crc24_bits = 24;
+	while (length > 0 && r < c) {
+		k = (r < c_neg) ? k_neg : k_pos;
+		length -= (k - crc24_bits) >> 3;
+		r++;
+		cbs_in_tb++;
+	}
+
+	return cbs_in_tb;
+}
+
+/* Calculates number of CBs in processed decoder TB based on 'r' and input
+ * length.
+ */
+static inline uint16_t
+get_num_cbs_in_tb_dec(struct rte_bbdev_op_turbo_dec *turbo_dec)
+{
+	uint8_t c, c_neg, r = 0;
+	uint16_t kw, k, k_neg, k_pos, cbs_in_tb = 0;
+	int32_t length;
+
+	length = turbo_dec->input.length;
+	r = turbo_dec->tb_params.r;
+	c = turbo_dec->tb_params.c;
+	c_neg = turbo_dec->tb_params.c_neg;
+	k_neg = turbo_dec->tb_params.k_neg;
+	k_pos = turbo_dec->tb_params.k_pos;
+	while (length > 0 && r < c) {
+		k = (r < c_neg) ? k_neg : k_pos;
+		kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
+		length -= kw;
+		r++;
+		cbs_in_tb++;
+	}
+
+	return cbs_in_tb;
+}
+
+/* Calculates number of CBs in processed LDPC decoder TB based on 'r' and
+ * input length.
+ */
+static inline uint16_t
+get_num_cbs_in_tb_ldpc_dec(struct rte_bbdev_op_ldpc_dec *ldpc_dec)
+{
+	uint16_t r, cbs_in_tb = 0;
+	int32_t length = ldpc_dec->input.length;
+	r = ldpc_dec->tb_params.r;
+	while (length > 0 && r < ldpc_dec->tb_params.c) {
+		length -=  (r < ldpc_dec->tb_params.cab) ?
+				ldpc_dec->tb_params.ea :
+				ldpc_dec->tb_params.eb;
+		r++;
+		cbs_in_tb++;
+	}
+	return cbs_in_tb;
+}
+
+/* Check if we can mux encode operations with a common FCW */
+static inline bool
+check_mux(struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	uint16_t i;
+	if (num <= 1)
+		return false;
+	for (i = 1; i < num; ++i) {
+		/* Only mux compatible code blocks */
+		if (memcmp((uint8_t *)(&ops[i]->ldpc_enc) + ACC100_ENC_OFFSET,
+				(uint8_t *)(&ops[0]->ldpc_enc) +
+				ACC100_ENC_OFFSET,
+				ACC100_CMP_ENC_SIZE) != 0)
+			return false;
+	}
+	return true;
+}
+
+/* Enqueue LDPC encode operations for ACC100 device in CB mode. */
+static inline uint16_t
+acc100_enqueue_ldpc_enc_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i = 0;
+	union acc100_dma_desc *desc;
+	int ret, desc_idx = 0;
+	int16_t enq, left = num;
+
+	while (left > 0) {
+		if (unlikely(avail < 1))
+			break;
+		avail--;
+		enq = RTE_MIN(left, ACC100_MUX_5GDL_DESC);
+		if (check_mux(&ops[i], enq)) {
+			ret = enqueue_ldpc_enc_n_op_cb(q, &ops[i],
+					desc_idx, enq);
+			if (ret < 0)
+				break;
+			i += enq;
+		} else {
+			ret = enqueue_ldpc_enc_one_op_cb(q, ops[i], desc_idx);
+			if (ret < 0)
+				break;
+			i++;
+		}
+		desc_idx++;
+		left = num - i;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + desc_idx - 1)
+			& q->sw_ring_wrap_mask);
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, desc_idx, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
+}
+
+/* Enqueue LDPC encode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	if (unlikely(num == 0))
+		return 0;
+	return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
+}
+
+/* Check if two consecutive LDPC decode operations can be muxed (common FCW) */
+static inline bool
+cmp_ldpc_dec_op(struct rte_bbdev_dec_op **ops)
+{
+	/* Only mux compatible code blocks */
+	return memcmp((uint8_t *)(&ops[0]->ldpc_dec) + ACC100_DEC_OFFSET,
+			(uint8_t *)(&ops[1]->ldpc_dec) +
+			ACC100_DEC_OFFSET, ACC100_CMP_DEC_SIZE) == 0;
+}
+
+/* Enqueue decode operations for ACC100 device in TB mode */
+static uint16_t
+acc100_enqueue_ldpc_dec_tb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i, enqueued_cbs = 0;
+	uint8_t cbs_in_tb;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		cbs_in_tb = get_num_cbs_in_tb_ldpc_dec(&ops[i]->ldpc_dec);
+		/* Check if there is enough space for further processing */
+		if (unlikely(avail - cbs_in_tb < 0))
+			break;
+		avail -= cbs_in_tb;
+
+		ret = enqueue_ldpc_dec_one_op_tb(q, ops[i],
+				enqueued_cbs, cbs_in_tb);
+		if (ret < 0)
+			break;
+		enqueued_cbs += ret;
+	}
+
+	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+	return i;
+}
+
+/* Enqueue decode operations for ACC100 device in CB mode */
+static uint16_t
+acc100_enqueue_ldpc_dec_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i;
+	union acc100_dma_desc *desc;
+	int ret;
+	bool same_op = false;
+	for (i = 0; i < num; ++i) {
+		/* Check if there is enough space for further processing */
+		if (unlikely(avail < 1))
+			break;
+		avail -= 1;
+
+		if (i > 0)
+			same_op = cmp_ldpc_dec_op(&ops[i-1]);
+		rte_bbdev_log(INFO, "Op %d %d %d %d %d %d %d %d %d %d %d %d",
+			i, ops[i]->ldpc_dec.op_flags, ops[i]->ldpc_dec.rv_index,
+			ops[i]->ldpc_dec.iter_max, ops[i]->ldpc_dec.iter_count,
+			ops[i]->ldpc_dec.basegraph, ops[i]->ldpc_dec.z_c,
+			ops[i]->ldpc_dec.n_cb, ops[i]->ldpc_dec.q_m,
+			ops[i]->ldpc_dec.n_filler, ops[i]->ldpc_dec.cb_params.e,
+			same_op);
+		ret = enqueue_ldpc_dec_one_op_cb(q, ops[i], i, same_op);
+		if (ret < 0)
+			break;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
+			& q->sw_ring_wrap_mask);
+
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, i, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+	return i;
+}
+
+/* Enqueue LDPC decode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t aq_avail = q->aq_depth +
+			(q->aq_dequeued - q->aq_enqueued) / 128;
+
+	if (unlikely((aq_avail == 0) || (num == 0)))
+		return 0;
+
+	if (ops[0]->ldpc_dec.code_block_mode == 0)
+		return acc100_enqueue_ldpc_dec_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_ldpc_dec_cb(q_data, ops, num);
+}
+
+/* Dequeue one encode operation from ACC100 device in CB mode */
+static inline int
+dequeue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
+		uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_enc_op *op;
+	int i;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	rsp.val = atom_desc.rsp.val;
+	rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+
+	op->status |= ((rsp.input_err)
+			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+	if (desc->req.last_desc_in_batch) {
+		(*aq_dequeued)++;
+		desc->req.last_desc_in_batch = 0;
+	}
+	desc->rsp.val = ACC100_DMA_DESC_TYPE;
+	desc->rsp.add_info_0 = 0; /* Reserved bits */
+	desc->rsp.add_info_1 = 0; /* Reserved bits */
+
+	/* Flag that the muxing causes loss of opaque data */
+	op->opaque_data = (void *)-1;
+	for (i = 0 ; i < desc->req.numCBs; i++)
+		ref_op[i] = op;
+
+	/* One op was successfully dequeued, covering one or more muxed CBs */
+	return desc->req.numCBs;
+}
+
+/* Dequeue one encode operation from ACC100 device in TB mode */
+static inline int
+dequeue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
+		uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, *last_desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_enc_op *op;
+	uint8_t i = 0;
+	uint16_t current_dequeued_cbs = 0, cbs_in_tb;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	/* Get number of CBs in dequeued TB */
+	cbs_in_tb = desc->req.cbs_in_tb;
+	/* Get last CB */
+	last_desc = q->ring_addr + ((q->sw_ring_tail
+			+ total_dequeued_cbs + cbs_in_tb - 1)
+			& q->sw_ring_wrap_mask);
+	/* Check if the last CB in the TB (and thus the whole TB) is ready
+	 * to dequeue by checking the SDone bit. If not, return.
+	 */
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
+			__ATOMIC_RELAXED);
+	if (!(atom_desc.rsp.val & ACC100_SDONE))
+		return -1;
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+
+	while (i < cbs_in_tb) {
+		desc = q->ring_addr + ((q->sw_ring_tail
+				+ total_dequeued_cbs)
+				& q->sw_ring_wrap_mask);
+		atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+				__ATOMIC_RELAXED);
+		rsp.val = atom_desc.rsp.val;
+		rte_bbdev_log_debug("Resp. desc %p: %x", desc,
+				rsp.val);
+
+		op->status |= ((rsp.input_err)
+				? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+		op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+		op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+		if (desc->req.last_desc_in_batch) {
+			(*aq_dequeued)++;
+			desc->req.last_desc_in_batch = 0;
+		}
+		desc->rsp.val = ACC100_DMA_DESC_TYPE;
+		desc->rsp.add_info_0 = 0;
+		desc->rsp.add_info_1 = 0;
+		total_dequeued_cbs++;
+		current_dequeued_cbs++;
+		i++;
+	}
+
+	*ref_op = op;
+
+	return current_dequeued_cbs;
+}
+
+/* Dequeue one decode operation from ACC100 device in CB mode */
+static inline int
+dequeue_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
+		struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_dec_op *op;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	rsp.val = atom_desc.rsp.val;
+	rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+	op->status |= ((rsp.input_err)
+			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+	if (op->status != 0)
+		q_data->queue_stats.dequeue_err_count++;
+
+	/* CRC invalid if error exists */
+	if (!op->status)
+		op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+	op->turbo_dec.iter_count = (uint8_t) rsp.iter_cnt / 2;
+	/* Check if this is the last desc in batch (Atomic Queue) */
+	if (desc->req.last_desc_in_batch) {
+		(*aq_dequeued)++;
+		desc->req.last_desc_in_batch = 0;
+	}
+	desc->rsp.val = ACC100_DMA_DESC_TYPE;
+	desc->rsp.add_info_0 = 0;
+	desc->rsp.add_info_1 = 0;
+	*ref_op = op;
+
+	/* One CB (op) was successfully dequeued */
+	return 1;
+}
+
+/* Dequeue one LDPC decode operation from ACC100 device in CB mode */
+static inline int
+dequeue_ldpc_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
+		struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_dec_op *op;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	rsp.val = atom_desc.rsp.val;
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+	op->status |= rsp.input_err << RTE_BBDEV_DATA_ERROR;
+	op->status |= rsp.dma_err << RTE_BBDEV_DRV_ERROR;
+	op->status |= rsp.fcw_err << RTE_BBDEV_DRV_ERROR;
+	if (op->status != 0)
+		q_data->queue_stats.dequeue_err_count++;
+
+	op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+	if (op->ldpc_dec.hard_output.length > 0 && !rsp.synd_ok)
+		op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
+	op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
+
+	/* Check if this is the last desc in batch (Atomic Queue) */
+	if (desc->req.last_desc_in_batch) {
+		(*aq_dequeued)++;
+		desc->req.last_desc_in_batch = 0;
+	}
+
+	desc->rsp.val = ACC100_DMA_DESC_TYPE;
+	desc->rsp.add_info_0 = 0;
+	desc->rsp.add_info_1 = 0;
+
+	*ref_op = op;
+
+	/* One CB (op) was successfully dequeued */
+	return 1;
+}
+
+/* Dequeue one decode operation from ACC100 device in TB mode. */
+static inline int
+dequeue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, *last_desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_dec_op *op;
+	uint8_t cbs_in_tb = 1, cb_idx = 0;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Get number of CBs in dequeued TB */
+	cbs_in_tb = desc->req.cbs_in_tb;
+	/* Get last CB */
+	last_desc = q->ring_addr + ((q->sw_ring_tail
+			+ dequeued_cbs + cbs_in_tb - 1)
+			& q->sw_ring_wrap_mask);
+	/* Check if the last CB in the TB (and thus the whole TB) is ready
+	 * to dequeue by checking the SDone bit. If not, return.
+	 */
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
+			__ATOMIC_RELAXED);
+	if (!(atom_desc.rsp.val & ACC100_SDONE))
+		return -1;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+
+	/* Read remaining CBs if any */
+	while (cb_idx < cbs_in_tb) {
+		desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+				& q->sw_ring_wrap_mask);
+		atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+				__ATOMIC_RELAXED);
+		rsp.val = atom_desc.rsp.val;
+		rte_bbdev_log_debug("Resp. desc %p: %x", desc,
+				rsp.val);
+
+		op->status |= ((rsp.input_err)
+				? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+		op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+		op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+		/* CRC invalid if error exists */
+		if (!op->status)
+			op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+		op->turbo_dec.iter_count = RTE_MAX((uint8_t) rsp.iter_cnt,
+				op->turbo_dec.iter_count);
+
+		/* Check if this is the last desc in batch (Atomic Queue) */
+		if (desc->req.last_desc_in_batch) {
+			(*aq_dequeued)++;
+			desc->req.last_desc_in_batch = 0;
+		}
+		desc->rsp.val = ACC100_DMA_DESC_TYPE;
+		desc->rsp.add_info_0 = 0;
+		desc->rsp.add_info_1 = 0;
+		dequeued_cbs++;
+		cb_idx++;
+	}
+
+	*ref_op = op;
+
+	return cb_idx;
+}
+
+/* Dequeue LDPC encode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t dequeue_num, i, dequeued_cbs = 0, dequeued_descs = 0;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = RTE_MIN(avail, num);
+
+	for (i = 0; i < dequeue_num; i++) {
+		ret = dequeue_enc_one_op_cb(q, &ops[dequeued_cbs],
+				dequeued_descs, &aq_dequeued);
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+		dequeued_descs++;
+		if (dequeued_cbs >= num)
+			break;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_descs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += dequeued_cbs;
+
+	return dequeued_cbs;
+}
+
+/* Dequeue LDPC decode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint16_t dequeue_num;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t i;
+	uint16_t dequeued_cbs = 0;
+	struct rte_bbdev_dec_op *op;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = RTE_MIN(avail, num);
+
+	for (i = 0; i < dequeue_num; ++i) {
+		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask))->req.op_addr;
+		if (op->ldpc_dec.code_block_mode == 0)
+			ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+		else
+			ret = dequeue_ldpc_dec_one_op_cb(
+					q_data, q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_cbs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += i;
+
+	return i;
+}
+
 /* Initialization Function */
 static void
 acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
@@ -725,6 +2334,10 @@
 	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
 
 	dev->dev_ops = &acc100_bbdev_ops;
+	dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
+	dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
+	dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
+	dev->dequeue_ldpc_dec_ops = acc100_dequeue_ldpc_dec;
 
 	((struct acc100_device *) dev->data->dev_private)->pf_device =
 			!strcmp(drv->driver.name,
@@ -837,4 +2450,3 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
 RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
 RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
 RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
-
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 2508385..ab41ee5 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -88,6 +88,8 @@
 #define ACC100_TMPL_PRI_3      0x0f0e0d0c
 #define ACC100_QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
 #define ACC100_WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
+#define ACC100_FDONE    0x80000000
+#define ACC100_SDONE    0x40000000
 
 #define ACC100_NUM_TMPL       32
 /* Mapping of signals for the available engines */
@@ -120,6 +122,9 @@
 #define ACC100_FCW_TD_BLEN                24
 #define ACC100_FCW_LE_BLEN                32
 #define ACC100_FCW_LD_BLEN                36
+#define ACC100_5GUL_SIZE_0                16
+#define ACC100_5GUL_SIZE_1                40
+#define ACC100_5GUL_OFFSET_0              36
 
 #define ACC100_FCW_VER         2
 #define ACC100_MUX_5GDL_DESC   6
@@ -402,6 +407,7 @@ struct __rte_packed acc100_dma_req_desc {
 union acc100_dma_desc {
 	struct acc100_dma_req_desc req;
 	union acc100_dma_rsp_desc rsp;
+	uint64_t atom_hdr;
 };
 
 
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v10 06/10] baseband/acc100: add HARQ loopback support
  2020-10-01  3:14   ` [dpdk-dev] [PATCH v10 00/10] bbdev PMD ACC100 Nicolas Chautru
                       ` (4 preceding siblings ...)
  2020-10-01  3:14     ` [dpdk-dev] [PATCH v10 05/10] baseband/acc100: add LDPC processing functions Nicolas Chautru
@ 2020-10-01  3:14     ` Nicolas Chautru
  2020-10-01  3:14     ` [dpdk-dev] [PATCH v10 07/10] baseband/acc100: add support for 4G processing Nicolas Chautru
                       ` (3 subsequent siblings)
  9 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-10-01  3:14 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, trix, ferruh.yigit, tianjiao.liu,
	Nicolas Chautru

Add support for HARQ memory loopback.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 159 ++++++++++++++++++++++++++++++-
 1 file changed, 155 insertions(+), 4 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index ce2ad68..0862691 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -680,6 +680,7 @@
 				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
 				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
 #ifdef ACC100_EXT_MEM
+				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK |
 				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
 				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
 #endif
@@ -1399,10 +1400,7 @@ static inline void acc100_header_init(struct acc100_dma_req_desc *desc)
 	acc100_fcw_le_fill(ops[0], &desc->req.fcw_le, num);
 
 	/** This could be done at polling */
-	desc->req.word0 = ACC100_DMA_DESC_TYPE;
-	desc->req.word1 = 0; /**< Timestamp could be disabled */
-	desc->req.word2 = 0;
-	desc->req.word3 = 0;
+	acc100_header_init(&desc->req);
 	desc->req.numCBs = num;
 
 	in_length_in_bytes = ops[0]->ldpc_enc.input.data->data_len;
@@ -1490,12 +1488,165 @@ static inline void acc100_header_init(struct acc100_dma_req_desc *desc)
 	return 1;
 }
 
+static inline int
+harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs)
+{
+	struct acc100_fcw_ld *fcw;
+	union acc100_dma_desc *desc;
+	int next_triplet = 1;
+	struct rte_mbuf *hq_output_head, *hq_output;
+	uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
+	if (harq_in_length == 0) {
+		rte_bbdev_log(ERR, "Loopback of invalid null size");
+		return -EINVAL;
+	}
+
+	int h_comp = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
+			) ? 1 : 0;
+	if (h_comp == 1)
+		harq_in_length = harq_in_length * 8 / 6;
+	harq_in_length = RTE_ALIGN(harq_in_length, 64);
+	uint16_t harq_dma_length_in = (h_comp == 0) ?
+			harq_in_length :
+			harq_in_length * 6 / 8;
+	uint16_t harq_dma_length_out = harq_dma_length_in;
+	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+	uint16_t harq_index = (ddr_mem_in ?
+			op->ldpc_dec.harq_combined_input.offset :
+			op->ldpc_dec.harq_combined_output.offset)
+			/ ACC100_HARQ_OFFSET;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	fcw = &desc->req.fcw_ld;
+	/* Set the FCW from loopback into DDR */
+	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
+	fcw->FCWversion = ACC100_FCW_VER;
+	fcw->qm = 2;
+	fcw->Zc = 384;
+	if (harq_in_length < 16 * ACC100_N_ZC_1)
+		fcw->Zc = 16;
+	fcw->ncb = fcw->Zc * ACC100_N_ZC_1;
+	fcw->rm_e = 2;
+	fcw->hcin_en = 1;
+	fcw->hcout_en = 1;
+
+	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d",
+			ddr_mem_in, harq_index,
+			harq_layout[harq_index].offset, harq_in_length,
+			harq_dma_length_in);
+
+	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
+		fcw->hcin_size0 = harq_layout[harq_index].size0;
+		fcw->hcin_offset = harq_layout[harq_index].offset;
+		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
+		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
+		if (h_comp == 1)
+			harq_dma_length_in = harq_dma_length_in * 6 / 8;
+	} else {
+		fcw->hcin_size0 = harq_in_length;
+	}
+	harq_layout[harq_index].val = 0;
+	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d",
+			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
+	fcw->hcout_size0 = harq_in_length;
+	fcw->hcin_decomp_mode = h_comp;
+	fcw->hcout_comp_mode = h_comp;
+	fcw->gain_i = 1;
+	fcw->gain_h = 1;
+
+	/* Set the prefix of descriptor. This could be done at polling */
+	acc100_header_init(&desc->req);
+
+	/* Null LLR input for Decoder */
+	desc->req.data_ptrs[next_triplet].address =
+			q->lb_in_addr_phys;
+	desc->req.data_ptrs[next_triplet].blen = 2;
+	desc->req.data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
+	desc->req.data_ptrs[next_triplet].last = 0;
+	desc->req.data_ptrs[next_triplet].dma_ext = 0;
+	next_triplet++;
+
+	/* HARQ Combine input from either Memory interface */
+	if (!ddr_mem_in) {
+		next_triplet = acc100_dma_fill_blk_type_out(&desc->req,
+				op->ldpc_dec.harq_combined_input.data,
+				op->ldpc_dec.harq_combined_input.offset,
+				harq_dma_length_in,
+				next_triplet,
+				ACC100_DMA_BLKID_IN_HARQ);
+	} else {
+		desc->req.data_ptrs[next_triplet].address =
+				op->ldpc_dec.harq_combined_input.offset;
+		desc->req.data_ptrs[next_triplet].blen =
+				harq_dma_length_in;
+		desc->req.data_ptrs[next_triplet].blkid =
+				ACC100_DMA_BLKID_IN_HARQ;
+		desc->req.data_ptrs[next_triplet].dma_ext = 1;
+		next_triplet++;
+	}
+	desc->req.data_ptrs[next_triplet - 1].last = 1;
+	desc->req.m2dlen = next_triplet;
+
+	/* Dropped decoder hard output */
+	desc->req.data_ptrs[next_triplet].address =
+			q->lb_out_addr_phys;
+	desc->req.data_ptrs[next_triplet].blen = ACC100_BYTES_IN_WORD;
+	desc->req.data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_OUT_HARD;
+	desc->req.data_ptrs[next_triplet].last = 0;
+	desc->req.data_ptrs[next_triplet].dma_ext = 0;
+	next_triplet++;
+
+	/* HARQ Combine output to either Memory interface */
+	if (check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE
+			)) {
+		desc->req.data_ptrs[next_triplet].address =
+				op->ldpc_dec.harq_combined_output.offset;
+		desc->req.data_ptrs[next_triplet].blen =
+				harq_dma_length_out;
+		desc->req.data_ptrs[next_triplet].blkid =
+				ACC100_DMA_BLKID_OUT_HARQ;
+		desc->req.data_ptrs[next_triplet].dma_ext = 1;
+		next_triplet++;
+	} else {
+		hq_output_head = op->ldpc_dec.harq_combined_output.data;
+		hq_output = op->ldpc_dec.harq_combined_output.data;
+		next_triplet = acc100_dma_fill_blk_type_out(
+				&desc->req,
+				op->ldpc_dec.harq_combined_output.data,
+				op->ldpc_dec.harq_combined_output.offset,
+				harq_dma_length_out,
+				next_triplet,
+				ACC100_DMA_BLKID_OUT_HARQ);
+		/* HARQ output */
+		mbuf_append(hq_output_head, hq_output, harq_dma_length_out);
+		op->ldpc_dec.harq_combined_output.length =
+				harq_dma_length_out;
+	}
+	desc->req.data_ptrs[next_triplet - 1].last = 1;
+	desc->req.d2mlen = next_triplet - desc->req.m2dlen;
+	desc->req.op_addr = op;
+
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
 /** Enqueue one decode operations for ACC100 device in CB mode */
 static inline int
 enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
 		uint16_t total_enqueued_cbs, bool same_op)
 {
 	int ret;
+	if (unlikely(check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK))) {
+		ret = harq_loopback(q, op, total_enqueued_cbs);
+		return ret;
+	}
 
 	union acc100_dma_desc *desc;
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v10 07/10] baseband/acc100: add support for 4G processing
  2020-10-01  3:14   ` [dpdk-dev] [PATCH v10 00/10] bbdev PMD ACC100 Nicolas Chautru
                       ` (5 preceding siblings ...)
  2020-10-01  3:14     ` [dpdk-dev] [PATCH v10 06/10] baseband/acc100: add HARQ loopback support Nicolas Chautru
@ 2020-10-01  3:14     ` Nicolas Chautru
  2020-10-01  3:14     ` [dpdk-dev] [PATCH v10 08/10] baseband/acc100: add interrupt support to PMD Nicolas Chautru
                       ` (2 subsequent siblings)
  9 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-10-01  3:14 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, trix, ferruh.yigit, tianjiao.liu,
	Nicolas Chautru

Add capability for 4G (Turbo) encode and decode processing.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 doc/guides/bbdevs/features/acc100.ini    |    4 +-
 drivers/baseband/acc100/rte_acc100_pmd.c | 1029 +++++++++++++++++++++++++++---
 2 files changed, 958 insertions(+), 75 deletions(-)

diff --git a/doc/guides/bbdevs/features/acc100.ini b/doc/guides/bbdevs/features/acc100.ini
index 40c7adc..642cd48 100644
--- a/doc/guides/bbdevs/features/acc100.ini
+++ b/doc/guides/bbdevs/features/acc100.ini
@@ -4,8 +4,8 @@
 ; Refer to default.ini for the full list of available PMD features.
 ;
 [Features]
-Turbo Decoder (4G)     = N
-Turbo Encoder (4G)     = N
+Turbo Decoder (4G)     = Y
+Turbo Encoder (4G)     = Y
 LDPC Decoder (5G)      = Y
 LDPC Encoder (5G)      = Y
 LLR/HARQ Compression   = Y
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 0862691..a583630 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -343,7 +343,6 @@
 	free_base_addresses(base_addrs, i);
 }
 
-
 /* Allocate 64MB memory used for all software rings */
 static int
 acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
@@ -659,6 +658,41 @@
 
 	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
 		{
+			.type = RTE_BBDEV_OP_TURBO_DEC,
+			.cap.turbo_dec = {
+				.capability_flags =
+					RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE |
+					RTE_BBDEV_TURBO_CRC_TYPE_24B |
+					RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
+					RTE_BBDEV_TURBO_EARLY_TERMINATION |
+					RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
+					RTE_BBDEV_TURBO_MAP_DEC |
+					RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP |
+					RTE_BBDEV_TURBO_DEC_SCATTER_GATHER,
+				.max_llr_modulus = INT8_MAX,
+				.num_buffers_src =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+				.num_buffers_hard_out =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+				.num_buffers_soft_out =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+			}
+		},
+		{
+			.type = RTE_BBDEV_OP_TURBO_ENC,
+			.cap.turbo_enc = {
+				.capability_flags =
+					RTE_BBDEV_TURBO_CRC_24B_ATTACH |
+					RTE_BBDEV_TURBO_RV_INDEX_BYPASS |
+					RTE_BBDEV_TURBO_RATE_MATCH |
+					RTE_BBDEV_TURBO_ENC_SCATTER_GATHER,
+				.num_buffers_src =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+				.num_buffers_dst =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+			}
+		},
+		{
 			.type   = RTE_BBDEV_OP_LDPC_ENC,
 			.cap.ldpc_enc = {
 				.capability_flags =
@@ -741,7 +775,6 @@
 #endif
 }
 
-
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.setup_queues = acc100_setup_queues,
 	.close = acc100_dev_close,
@@ -785,6 +818,58 @@
 	return tail;
 }
 
+/* Fill in a frame control word for turbo encoding. */
+static inline void
+acc100_fcw_te_fill(const struct rte_bbdev_enc_op *op, struct acc100_fcw_te *fcw)
+{
+	fcw->code_block_mode = op->turbo_enc.code_block_mode;
+	if (fcw->code_block_mode == 0) { /* For TB mode */
+		fcw->k_neg = op->turbo_enc.tb_params.k_neg;
+		fcw->k_pos = op->turbo_enc.tb_params.k_pos;
+		fcw->c_neg = op->turbo_enc.tb_params.c_neg;
+		fcw->c = op->turbo_enc.tb_params.c;
+		fcw->ncb_neg = op->turbo_enc.tb_params.ncb_neg;
+		fcw->ncb_pos = op->turbo_enc.tb_params.ncb_pos;
+
+		if (check_bit(op->turbo_enc.op_flags,
+				RTE_BBDEV_TURBO_RATE_MATCH)) {
+			fcw->bypass_rm = 0;
+			fcw->cab = op->turbo_enc.tb_params.cab;
+			fcw->ea = op->turbo_enc.tb_params.ea;
+			fcw->eb = op->turbo_enc.tb_params.eb;
+		} else {
+			/* E is set to the encoding output size when RM is
+			 * bypassed.
+			 */
+			fcw->bypass_rm = 1;
+			fcw->cab = fcw->c_neg;
+			fcw->ea = 3 * fcw->k_neg + 12;
+			fcw->eb = 3 * fcw->k_pos + 12;
+		}
+	} else { /* For CB mode */
+		fcw->k_pos = op->turbo_enc.cb_params.k;
+		fcw->ncb_pos = op->turbo_enc.cb_params.ncb;
+
+		if (check_bit(op->turbo_enc.op_flags,
+				RTE_BBDEV_TURBO_RATE_MATCH)) {
+			fcw->bypass_rm = 0;
+			fcw->eb = op->turbo_enc.cb_params.e;
+		} else {
+			/* E is set to the encoding output size when RM is
+			 * bypassed.
+			 */
+			fcw->bypass_rm = 1;
+			fcw->eb = 3 * fcw->k_pos + 12;
+		}
+	}
+
+	fcw->bypass_rv_idx1 = check_bit(op->turbo_enc.op_flags,
+			RTE_BBDEV_TURBO_RV_INDEX_BYPASS);
+	fcw->code_block_crc = check_bit(op->turbo_enc.op_flags,
+			RTE_BBDEV_TURBO_CRC_24B_ATTACH);
+	fcw->rv_idx1 = op->turbo_enc.rv_index;
+}
+
 /* Compute value of k0.
  * Based on 3GPP 38.212 Table 5.4.2.1-2
  * Starting position of different redundancy versions, k0
@@ -835,6 +920,25 @@
 	fcw->mcb_count = num_cb;
 }
 
+/* Fill in a frame control word for turbo decoding. */
+static inline void
+acc100_fcw_td_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_td *fcw)
+{
+	/* Note: Early termination is always enabled for 4G UL */
+	fcw->fcw_ver = 1;
+	if (op->turbo_dec.code_block_mode == 0)
+		fcw->k_pos = op->turbo_dec.tb_params.k_pos;
+	else
+		fcw->k_pos = op->turbo_dec.cb_params.k;
+	fcw->turbo_crc_type = check_bit(op->turbo_dec.op_flags,
+			RTE_BBDEV_TURBO_CRC_TYPE_24B);
+	fcw->bypass_sb_deint = 0;
+	fcw->raw_decoder_input_on = 0;
+	fcw->max_iter = op->turbo_dec.iter_max;
+	fcw->half_iter_on = !check_bit(op->turbo_dec.op_flags,
+			RTE_BBDEV_TURBO_HALF_ITERATION_EVEN);
+}
+
 /* Fill in a frame control word for LDPC decoding. */
 static inline void
 acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
@@ -1073,6 +1177,87 @@ static inline void acc100_header_init(struct acc100_dma_req_desc *desc)
 }
 
 static inline int
+acc100_dma_desc_te_fill(struct rte_bbdev_enc_op *op,
+		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
+		struct rte_mbuf *output, uint32_t *in_offset,
+		uint32_t *out_offset, uint32_t *out_length,
+		uint32_t *mbuf_total_left, uint32_t *seg_total_left, uint8_t r)
+{
+	int next_triplet = 1; /* FCW already done */
+	uint32_t e, ea, eb, length;
+	uint16_t k, k_neg, k_pos;
+	uint8_t cab, c_neg;
+
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /**< Timestamp could be disabled */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+
+	if (op->turbo_enc.code_block_mode == 0) {
+		ea = op->turbo_enc.tb_params.ea;
+		eb = op->turbo_enc.tb_params.eb;
+		cab = op->turbo_enc.tb_params.cab;
+		k_neg = op->turbo_enc.tb_params.k_neg;
+		k_pos = op->turbo_enc.tb_params.k_pos;
+		c_neg = op->turbo_enc.tb_params.c_neg;
+		e = (r < cab) ? ea : eb;
+		k = (r < c_neg) ? k_neg : k_pos;
+	} else {
+		e = op->turbo_enc.cb_params.e;
+		k = op->turbo_enc.cb_params.k;
+	}
+
+	if (check_bit(op->turbo_enc.op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH))
+		length = (k - 24) >> 3;
+	else
+		length = k >> 3;
+
+	if (unlikely((*mbuf_total_left == 0) || (*mbuf_total_left < length))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, length);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
+			length, seg_total_left, next_triplet);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= length;
+
+	/* Set output length */
+	if (check_bit(op->turbo_enc.op_flags, RTE_BBDEV_TURBO_RATE_MATCH))
+		/* Integer round up division by 8 */
+		*out_length = (e + 7) >> 3;
+	else
+		*out_length = (k >> 3) * 3 + 2;
+
+	next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
+			*out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	op->turbo_enc.output.length += *out_length;
+	*out_offset += *out_length;
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
+
+static inline int
 acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
 		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
 		struct rte_mbuf *output, uint32_t *in_offset,
@@ -1131,6 +1316,117 @@ static inline void acc100_header_init(struct acc100_dma_req_desc *desc)
 }
 
 static inline int
+acc100_dma_desc_td_fill(struct rte_bbdev_dec_op *op,
+		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
+		struct rte_mbuf *h_output, struct rte_mbuf *s_output,
+		uint32_t *in_offset, uint32_t *h_out_offset,
+		uint32_t *s_out_offset, uint32_t *h_out_length,
+		uint32_t *s_out_length, uint32_t *mbuf_total_left,
+		uint32_t *seg_total_left, uint8_t r)
+{
+	int next_triplet = 1; /* FCW already done */
+	uint16_t k;
+	uint16_t crc24_overlap = 0;
+	uint32_t e, kw;
+
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /**< Timestamp could be disabled */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+
+	if (op->turbo_dec.code_block_mode == 0) {
+		k = (r < op->turbo_dec.tb_params.c_neg)
+			? op->turbo_dec.tb_params.k_neg
+			: op->turbo_dec.tb_params.k_pos;
+		e = (r < op->turbo_dec.tb_params.cab)
+			? op->turbo_dec.tb_params.ea
+			: op->turbo_dec.tb_params.eb;
+	} else {
+		k = op->turbo_dec.cb_params.k;
+		e = op->turbo_dec.cb_params.e;
+	}
+
+	if ((op->turbo_dec.code_block_mode == 0)
+		&& !check_bit(op->turbo_dec.op_flags,
+		RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP))
+		crc24_overlap = 24;
+
+	/* Calculate the circular buffer size.
+	 * According to 3GPP TS 36.212 section 5.1.4.2:
+	 *   Kw = 3 * Kpi,
+	 * where:
+	 *   Kpi = nCol * nRow
+	 * where nCol is 32 and nRow can be calculated from:
+	 *   D <= nCol * nRow
+	 * where D is the size of each output from the turbo encoder block (k + 4).
+	 */
+	kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
+
+	if (unlikely((*mbuf_total_left == 0) || (*mbuf_total_left < kw))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, kw);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset, kw,
+			seg_total_left, next_triplet);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= kw;
+
+	next_triplet = acc100_dma_fill_blk_type_out(
+			desc, h_output, *h_out_offset,
+			k >> 3, next_triplet, ACC100_DMA_BLKID_OUT_HARD);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+
+	*h_out_length = ((k - crc24_overlap) >> 3);
+	op->turbo_dec.hard_output.length += *h_out_length;
+	*h_out_offset += *h_out_length;
+
+	/* Soft output */
+	if (check_bit(op->turbo_dec.op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT)) {
+		if (check_bit(op->turbo_dec.op_flags,
+				RTE_BBDEV_TURBO_EQUALIZER))
+			*s_out_length = e;
+		else
+			*s_out_length = (k * 3) + 12;
+
+		next_triplet = acc100_dma_fill_blk_type_out(desc, s_output,
+				*s_out_offset, *s_out_length, next_triplet,
+				ACC100_DMA_BLKID_OUT_SOFT);
+		if (unlikely(next_triplet < 0)) {
+			rte_bbdev_log(ERR,
+					"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+					op);
+			return -1;
+		}
+
+		op->turbo_dec.soft_output.length += *s_out_length;
+		*s_out_offset += *s_out_length;
+	}
+
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
+
+static inline int
 acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
 		struct acc100_dma_req_desc *desc,
 		struct rte_mbuf **input, struct rte_mbuf *h_output,
@@ -1384,6 +1680,57 @@ static inline void acc100_header_init(struct acc100_dma_req_desc *desc)
 
 /* Enqueue one encode operations for ACC100 device in CB mode */
 static inline int
+enqueue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+		uint16_t total_enqueued_cbs)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+		seg_total_left;
+	struct rte_mbuf *input, *output_head, *output;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_te_fill(op, &desc->req.fcw_te);
+
+	input = op->turbo_enc.input.data;
+	output_head = output = op->turbo_enc.output.data;
+	in_offset = op->turbo_enc.input.offset;
+	out_offset = op->turbo_enc.output.offset;
+	out_length = 0;
+	mbuf_total_left = op->turbo_enc.input.length;
+	seg_total_left = rte_pktmbuf_data_len(op->turbo_enc.input.data)
+			- in_offset;
+
+	ret = acc100_dma_desc_te_fill(op, &desc->req, &input, output,
+			&in_offset, &out_offset, &out_length, &mbuf_total_left,
+			&seg_total_left, 0);
+
+	if (unlikely(ret < 0))
+		return ret;
+
+	mbuf_append(output_head, output, out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_te,
+			sizeof(desc->req.fcw_te) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+
+	/* Check if any data left after processing one CB */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left after processing one CB: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+/* Enqueue a number of LDPC encode operations for ACC100 device in CB mode */
+static inline int
 enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
 		uint16_t total_enqueued_cbs, int16_t num)
 {
@@ -1488,84 +1835,245 @@ static inline void acc100_header_init(struct acc100_dma_req_desc *desc)
 	return 1;
 }
 
-static inline int
-harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
-		uint16_t total_enqueued_cbs) {
-	struct acc100_fcw_ld *fcw;
-	union acc100_dma_desc *desc;
-	int next_triplet = 1;
-	struct rte_mbuf *hq_output_head, *hq_output;
-	uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
-	if (harq_in_length == 0) {
-		rte_bbdev_log(ERR, "Loopback of invalid null size\n");
-		return -EINVAL;
-	}
 
-	int h_comp = check_bit(op->ldpc_dec.op_flags,
-			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
-			) ? 1 : 0;
-	if (h_comp == 1)
-		harq_in_length = harq_in_length * 8 / 6;
-	harq_in_length = RTE_ALIGN(harq_in_length, 64);
-	uint16_t harq_dma_length_in = (h_comp == 0) ?
-			harq_in_length :
-			harq_in_length * 6 / 8;
-	uint16_t harq_dma_length_out = harq_dma_length_in;
-	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
-			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
-	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
-	uint16_t harq_index = (ddr_mem_in ?
-			op->ldpc_dec.harq_combined_input.offset :
-			op->ldpc_dec.harq_combined_output.offset)
-			/ ACC100_HARQ_OFFSET;
+/* Enqueue one encode operation for ACC100 device in TB mode. */
+static inline int
+enqueue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint8_t r, c;
+	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+		seg_total_left;
+	struct rte_mbuf *input, *output_head, *output;
+	uint16_t current_enqueued_cbs = 0;
 
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
-	fcw = &desc->req.fcw_ld;
-	/* Set the FCW from loopback into DDR */
-	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
-	fcw->FCWversion = ACC100_FCW_VER;
-	fcw->qm = 2;
-	fcw->Zc = 384;
-	if (harq_in_length < 16 * ACC100_N_ZC_1)
-		fcw->Zc = 16;
-	fcw->ncb = fcw->Zc * ACC100_N_ZC_1;
-	fcw->rm_e = 2;
-	fcw->hcin_en = 1;
-	fcw->hcout_en = 1;
-
-	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d\n",
-			ddr_mem_in, harq_index,
-			harq_layout[harq_index].offset, harq_in_length,
-			harq_dma_length_in);
+	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+	acc100_fcw_te_fill(op, &desc->req.fcw_te);
 
-	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
-		fcw->hcin_size0 = harq_layout[harq_index].size0;
-		fcw->hcin_offset = harq_layout[harq_index].offset;
-		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
-		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
-		if (h_comp == 1)
-			harq_dma_length_in = harq_dma_length_in * 6 / 8;
-	} else {
-		fcw->hcin_size0 = harq_in_length;
-	}
-	harq_layout[harq_index].val = 0;
-	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
-			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
-	fcw->hcout_size0 = harq_in_length;
-	fcw->hcin_decomp_mode = h_comp;
-	fcw->hcout_comp_mode = h_comp;
-	fcw->gain_i = 1;
-	fcw->gain_h = 1;
+	input = op->turbo_enc.input.data;
+	output_head = output = op->turbo_enc.output.data;
+	in_offset = op->turbo_enc.input.offset;
+	out_offset = op->turbo_enc.output.offset;
+	out_length = 0;
+	mbuf_total_left = op->turbo_enc.input.length;
 
-	/* Set the prefix of descriptor. This could be done at polling */
-	acc100_header_init(&desc->req);
+	c = op->turbo_enc.tb_params.c;
+	r = op->turbo_enc.tb_params.r;
 
-	/* Null LLR input for Decoder */
-	desc->req.data_ptrs[next_triplet].address =
-			q->lb_in_addr_phys;
-	desc->req.data_ptrs[next_triplet].blen = 2;
+	while (mbuf_total_left > 0 && r < c) {
+		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+		/* Set up DMA descriptor */
+		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+				& q->sw_ring_wrap_mask);
+		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+		desc->req.data_ptrs[0].blen = ACC100_FCW_TE_BLEN;
+
+		ret = acc100_dma_desc_te_fill(op, &desc->req, &input, output,
+				&in_offset, &out_offset, &out_length,
+				&mbuf_total_left, &seg_total_left, r);
+		if (unlikely(ret < 0))
+			return ret;
+		mbuf_append(output_head, output, out_length);
+
+		/* Set total number of CBs in TB */
+		desc->req.cbs_in_tb = cbs_in_tb;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "FCW", &desc->req.fcw_te,
+				sizeof(desc->req.fcw_te) - 8);
+		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+		if (seg_total_left == 0) {
+			/* Go to the next mbuf */
+			input = input->next;
+			in_offset = 0;
+			output = output->next;
+			out_offset = 0;
+		}
+
+		total_enqueued_cbs++;
+		current_enqueued_cbs++;
+		r++;
+	}
+
+	if (unlikely(desc == NULL))
+		return current_enqueued_cbs;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Check if any CBs left for processing */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left for processing: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+
+	/* Set SDone on last CB descriptor for TB mode. */
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	return current_enqueued_cbs;
+}
+
+/** Enqueue one decode operation for ACC100 device in CB mode */
+static inline int
+enqueue_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint32_t in_offset, h_out_offset, s_out_offset, s_out_length,
+		h_out_length, mbuf_total_left, seg_total_left;
+	struct rte_mbuf *input, *h_output_head, *h_output,
+		*s_output_head, *s_output;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_td_fill(op, &desc->req.fcw_td);
+
+	input = op->turbo_dec.input.data;
+	h_output_head = h_output = op->turbo_dec.hard_output.data;
+	s_output_head = s_output = op->turbo_dec.soft_output.data;
+	in_offset = op->turbo_dec.input.offset;
+	h_out_offset = op->turbo_dec.hard_output.offset;
+	s_out_offset = op->turbo_dec.soft_output.offset;
+	h_out_length = s_out_length = 0;
+	mbuf_total_left = op->turbo_dec.input.length;
+	seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(input == NULL)) {
+		rte_bbdev_log(ERR, "Invalid mbuf pointer");
+		return -EFAULT;
+	}
+#endif
+
+	/* Set up DMA descriptor */
+	desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+
+	ret = acc100_dma_desc_td_fill(op, &desc->req, &input, h_output,
+			s_output, &in_offset, &h_out_offset, &s_out_offset,
+			&h_out_length, &s_out_length, &mbuf_total_left,
+			&seg_total_left, 0);
+
+	if (unlikely(ret < 0))
+		return ret;
+
+	/* Hard output */
+	mbuf_append(h_output_head, h_output, h_out_length);
+
+	/* Soft output */
+	if (check_bit(op->turbo_dec.op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT))
+		mbuf_append(s_output_head, s_output, s_out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_td,
+			sizeof(desc->req.fcw_td) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+
+	/* Check if any CBs left for processing */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left after processing one CB: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+static inline int
+harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs) {
+	struct acc100_fcw_ld *fcw;
+	union acc100_dma_desc *desc;
+	int next_triplet = 1;
+	struct rte_mbuf *hq_output_head, *hq_output;
+	uint16_t harq_dma_length_in, harq_dma_length_out;
+	uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
+	if (harq_in_length == 0) {
+		rte_bbdev_log(ERR, "Loopback of invalid null size\n");
+		return -EINVAL;
+	}
+
+	int h_comp = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
+			) ? 1 : 0;
+	if (h_comp == 1) {
+		harq_in_length = harq_in_length * 8 / 6;
+		harq_in_length = RTE_ALIGN(harq_in_length, 64);
+		harq_dma_length_in = harq_in_length * 6 / 8;
+	} else {
+		harq_in_length = RTE_ALIGN(harq_in_length, 64);
+		harq_dma_length_in = harq_in_length;
+	}
+	harq_dma_length_out = harq_dma_length_in;
+
+	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+	uint16_t harq_index = (ddr_mem_in ?
+			op->ldpc_dec.harq_combined_input.offset :
+			op->ldpc_dec.harq_combined_output.offset)
+			/ ACC100_HARQ_OFFSET;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	fcw = &desc->req.fcw_ld;
+	/* Set the FCW from loopback into DDR */
+	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
+	fcw->FCWversion = ACC100_FCW_VER;
+	fcw->qm = 2;
+	fcw->Zc = 384;
+	if (harq_in_length < 16 * ACC100_N_ZC_1)
+		fcw->Zc = 16;
+	fcw->ncb = fcw->Zc * ACC100_N_ZC_1;
+	fcw->rm_e = 2;
+	fcw->hcin_en = 1;
+	fcw->hcout_en = 1;
+
+	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d\n",
+			ddr_mem_in, harq_index,
+			harq_layout[harq_index].offset, harq_in_length,
+			harq_dma_length_in);
+
+	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
+		fcw->hcin_size0 = harq_layout[harq_index].size0;
+		fcw->hcin_offset = harq_layout[harq_index].offset;
+		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
+		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
+		if (h_comp == 1)
+			harq_dma_length_in = harq_dma_length_in * 6 / 8;
+	} else {
+		fcw->hcin_size0 = harq_in_length;
+	}
+	harq_layout[harq_index].val = 0;
+	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
+			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
+	fcw->hcout_size0 = harq_in_length;
+	fcw->hcin_decomp_mode = h_comp;
+	fcw->hcout_comp_mode = h_comp;
+	fcw->gain_i = 1;
+	fcw->gain_h = 1;
+
+	/* Set the prefix of the descriptor. This could be done at polling time. */
+	acc100_header_init(&desc->req);
+
+	/* Null LLR input for Decoder */
+	desc->req.data_ptrs[next_triplet].address =
+			q->lb_in_addr_phys;
+	desc->req.data_ptrs[next_triplet].blen = 2;
 	desc->req.data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
 	desc->req.data_ptrs[next_triplet].last = 0;
 	desc->req.data_ptrs[next_triplet].dma_ext = 0;
@@ -1821,6 +2329,107 @@ static inline void acc100_header_init(struct acc100_dma_req_desc *desc)
 	return current_enqueued_cbs;
 }
 
+/* Enqueue one decode operation for ACC100 device in TB mode */
+static inline int
+enqueue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint8_t r, c;
+	uint32_t in_offset, h_out_offset, s_out_offset, s_out_length,
+		h_out_length, mbuf_total_left, seg_total_left;
+	struct rte_mbuf *input, *h_output_head, *h_output,
+		*s_output_head, *s_output;
+	uint16_t current_enqueued_cbs = 0;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+	acc100_fcw_td_fill(op, &desc->req.fcw_td);
+
+	input = op->turbo_dec.input.data;
+	h_output_head = h_output = op->turbo_dec.hard_output.data;
+	s_output_head = s_output = op->turbo_dec.soft_output.data;
+	in_offset = op->turbo_dec.input.offset;
+	h_out_offset = op->turbo_dec.hard_output.offset;
+	s_out_offset = op->turbo_dec.soft_output.offset;
+	h_out_length = s_out_length = 0;
+	mbuf_total_left = op->turbo_dec.input.length;
+	c = op->turbo_dec.tb_params.c;
+	r = op->turbo_dec.tb_params.r;
+
+	while (mbuf_total_left > 0 && r < c) {
+
+		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+
+		/* Set up DMA descriptor */
+		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+				& q->sw_ring_wrap_mask);
+		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
+		desc->req.data_ptrs[0].blen = ACC100_FCW_TD_BLEN;
+		ret = acc100_dma_desc_td_fill(op, &desc->req, &input,
+				h_output, s_output, &in_offset, &h_out_offset,
+				&s_out_offset, &h_out_length, &s_out_length,
+				&mbuf_total_left, &seg_total_left, r);
+
+		if (unlikely(ret < 0))
+			return ret;
+
+		/* Hard output */
+		mbuf_append(h_output_head, h_output, h_out_length);
+
+		/* Soft output */
+		if (check_bit(op->turbo_dec.op_flags,
+				RTE_BBDEV_TURBO_SOFT_OUTPUT))
+			mbuf_append(s_output_head, s_output, s_out_length);
+
+		/* Set total number of CBs in TB */
+		desc->req.cbs_in_tb = cbs_in_tb;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "FCW", &desc->req.fcw_td,
+				sizeof(desc->req.fcw_td) - 8);
+		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+		if (seg_total_left == 0) {
+			/* Go to the next mbuf */
+			input = input->next;
+			in_offset = 0;
+			h_output = h_output->next;
+			h_out_offset = 0;
+
+			if (check_bit(op->turbo_dec.op_flags,
+					RTE_BBDEV_TURBO_SOFT_OUTPUT)) {
+				s_output = s_output->next;
+				s_out_offset = 0;
+			}
+		}
+
+		total_enqueued_cbs++;
+		current_enqueued_cbs++;
+		r++;
+	}
+
+	if (unlikely(desc == NULL))
+		return current_enqueued_cbs;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Check if any CBs left for processing */
+	if (mbuf_total_left != 0) {
+		rte_bbdev_log(ERR,
+				"Some data still left for processing: mbuf_total_left = %u",
+				mbuf_total_left);
+		return -EINVAL;
+	}
+#endif
+	/* Set SDone on last CB descriptor for TB mode */
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	return current_enqueued_cbs;
+}
 
 /* Calculates number of CBs in processed encoder TB based on 'r' and input
  * length.
@@ -1898,6 +2507,45 @@ static inline void acc100_header_init(struct acc100_dma_req_desc *desc)
 	return cbs_in_tb;
 }
 
+/* Enqueue encode operations for ACC100 device in CB mode. */
+static uint16_t
+acc100_enqueue_enc_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i;
+	union acc100_dma_desc *desc;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - 1 < 0))
+			break;
+		avail -= 1;
+
+		ret = enqueue_enc_one_op_cb(q, ops[i], i);
+		if (ret < 0)
+			break;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
+			& q->sw_ring_wrap_mask);
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, i, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+	return i;
+}
+
 /* Check we can mux encode operations with common FCW */
 static inline bool
 check_mux(struct rte_bbdev_enc_op **ops, uint16_t num) {
@@ -1966,6 +2614,54 @@ static inline void acc100_header_init(struct acc100_dma_req_desc *desc)
 	return i;
 }
 
+/* Enqueue encode operations for ACC100 device in TB mode. */
+static uint16_t
+acc100_enqueue_enc_tb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i, enqueued_cbs = 0;
+	uint8_t cbs_in_tb;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		cbs_in_tb = get_num_cbs_in_tb_enc(&ops[i]->turbo_enc);
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - cbs_in_tb < 0))
+			break;
+		avail -= cbs_in_tb;
+
+		ret = enqueue_enc_one_op_tb(q, ops[i], enqueued_cbs, cbs_in_tb);
+		if (ret < 0)
+			break;
+		enqueued_cbs += ret;
+	}
+	if (unlikely(enqueued_cbs == 0))
+		return 0; /* Nothing to enqueue */
+
+	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
+}
+
+/* Enqueue encode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	if (unlikely(num == 0))
+		return 0;
+	if (ops[0]->turbo_enc.code_block_mode == 0)
+		return acc100_enqueue_enc_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_enc_cb(q_data, ops, num);
+}
+
 /* Enqueue encode operations for ACC100 device. */
 static uint16_t
 acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
@@ -1973,7 +2669,51 @@ static inline void acc100_header_init(struct acc100_dma_req_desc *desc)
 {
 	if (unlikely(num == 0))
 		return 0;
-	return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
+	if (ops[0]->ldpc_enc.code_block_mode == 0)
+		return acc100_enqueue_enc_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
+}
+
+
+/* Enqueue decode operations for ACC100 device in CB mode */
+static uint16_t
+acc100_enqueue_dec_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i;
+	union acc100_dma_desc *desc;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - 1 < 0))
+			break;
+		avail -= 1;
+
+		ret = enqueue_dec_one_op_cb(q, ops[i], i);
+		if (ret < 0)
+			break;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
+			& q->sw_ring_wrap_mask);
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, i, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
 }
 
 /* Check we can mux encode operations with common FCW */
@@ -2071,6 +2811,53 @@ static inline void acc100_header_init(struct acc100_dma_req_desc *desc)
 	return i;
 }
 
+
+/* Enqueue decode operations for ACC100 device in TB mode */
+static uint16_t
+acc100_enqueue_dec_tb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i, enqueued_cbs = 0;
+	uint8_t cbs_in_tb;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		cbs_in_tb = get_num_cbs_in_tb_dec(&ops[i]->turbo_dec);
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - cbs_in_tb < 0))
+			break;
+		avail -= cbs_in_tb;
+
+		ret = enqueue_dec_one_op_tb(q, ops[i], enqueued_cbs, cbs_in_tb);
+		if (ret < 0)
+			break;
+		enqueued_cbs += ret;
+	}
+
+	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
+}
+
+/* Enqueue decode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	if (unlikely(num == 0))
+		return 0;
+	if (ops[0]->turbo_dec.code_block_mode == 0)
+		return acc100_enqueue_dec_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_dec_cb(q_data, ops, num);
+}
+
 /* Enqueue decode operations for ACC100 device. */
 static uint16_t
 acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
@@ -2394,6 +3181,52 @@ static inline void acc100_header_init(struct acc100_dma_req_desc *desc)
 	return cb_idx;
 }
 
+/* Dequeue encode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint16_t dequeue_num;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t i, dequeued_cbs = 0;
+	struct rte_bbdev_enc_op *op;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL)) {
+		rte_bbdev_log_debug("Unexpected undefined pointer");
+		return 0;
+	}
+#endif
+
+	dequeue_num = (avail < num) ? avail : num;
+
+	for (i = 0; i < dequeue_num; ++i) {
+		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask))->req.op_addr;
+		if (op->turbo_enc.code_block_mode == 0)
+			ret = dequeue_enc_one_op_tb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+		else
+			ret = dequeue_enc_one_op_cb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_cbs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += i;
+
+	return i;
+}
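
In these dequeue loops a transport-block op spans several code blocks, so each op may consume more than one ring descriptor; the loop stops as soon as the next op would overrun the available descriptors. A hedged sketch of just that accounting (illustrative helper, not a driver function):

```c
#include <assert.h>
#include <stdint.h>

/* Model of the dequeue accounting: each op occupies one or more ring
 * descriptors (one per code block; TB ops span several). Returns the
 * number of ops that fit and reports the descriptors they consumed. */
static uint16_t dequeue_count(const uint8_t *cbs_per_op, uint16_t num_ops,
		uint32_t avail_descs, uint32_t *descs_consumed)
{
	uint16_t i;
	uint32_t used = 0;

	for (i = 0; i < num_ops; i++) {
		if (used + cbs_per_op[i] > avail_descs)
			break;
		used += cbs_per_op[i];
	}
	*descs_consumed = used;
	return i;
}
```

This is why the driver advances `sw_ring_tail` by `dequeued_cbs` (descriptors) while the stats count `i` (operations).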
+
 /* Dequeue LDPC encode operations from ACC100 device. */
 static uint16_t
 acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
@@ -2432,6 +3265,52 @@ static inline void acc100_header_init(struct acc100_dma_req_desc *desc)
 	return dequeued_cbs;
 }
 
+
+/* Dequeue decode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint16_t dequeue_num;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t i;
+	uint16_t dequeued_cbs = 0;
+	struct rte_bbdev_dec_op *op;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = (avail < num) ? avail : num;
+
+	for (i = 0; i < dequeue_num; ++i) {
+		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask))->req.op_addr;
+		if (op->turbo_dec.code_block_mode == 0)
+			ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+		else
+			ret = dequeue_dec_one_op_cb(q_data, q, &ops[i],
+					dequeued_cbs, &aq_dequeued);
+
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_cbs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += i;
+
+	return i;
+}
+
 /* Dequeue decode operations from ACC100 device. */
 static uint16_t
 acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
@@ -2485,6 +3364,10 @@ static inline void acc100_header_init(struct acc100_dma_req_desc *desc)
 	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
 
 	dev->dev_ops = &acc100_bbdev_ops;
+	dev->enqueue_enc_ops = acc100_enqueue_enc;
+	dev->enqueue_dec_ops = acc100_enqueue_dec;
+	dev->dequeue_enc_ops = acc100_dequeue_enc;
+	dev->dequeue_dec_ops = acc100_dequeue_dec;
 	dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
 	dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
 	dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v10 08/10] baseband/acc100: add interrupt support to PMD
  2020-10-01  3:14   ` [dpdk-dev] [PATCH v10 00/10] bbdev PMD ACC100 Nicolas Chautru
                       ` (6 preceding siblings ...)
  2020-10-01  3:14     ` [dpdk-dev] [PATCH v10 07/10] baseband/acc100: add support for 4G processing Nicolas Chautru
@ 2020-10-01  3:14     ` Nicolas Chautru
  2020-10-01  3:14     ` [dpdk-dev] [PATCH v10 09/10] baseband/acc100: add debug function to validate input Nicolas Chautru
  2020-10-01  3:14     ` [dpdk-dev] [PATCH v10 10/10] baseband/acc100: add configure function Nicolas Chautru
  9 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-10-01  3:14 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, trix, ferruh.yigit, tianjiao.liu,
	Nicolas Chautru

Adding capability and functions to support MSI
interrupts, callbacks and the info ring.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 302 ++++++++++++++++++++++++++++++-
 drivers/baseband/acc100/rte_acc100_pmd.h |  16 ++
 2 files changed, 315 insertions(+), 3 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index a583630..1271229 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -343,6 +343,213 @@
 	free_base_addresses(base_addrs, i);
 }
 
+/*
+ * Find queue_id of a device queue based on details from the Info Ring.
+ * If a queue isn't found, UINT16_MAX is returned.
+ */
+static inline uint16_t
+get_queue_id_from_ring_info(struct rte_bbdev_data *data,
+		const union acc100_info_ring_data ring_data)
+{
+	uint16_t queue_id;
+
+	for (queue_id = 0; queue_id < data->num_queues; ++queue_id) {
+		struct acc100_queue *acc100_q =
+				data->queues[queue_id].queue_private;
+		if (acc100_q != NULL && acc100_q->aq_id == ring_data.aq_id &&
+				acc100_q->qgrp_id == ring_data.qg_id &&
+				acc100_q->vf_id == ring_data.vf_id)
+			return queue_id;
+	}
+
+	return UINT16_MAX;
+}
+
+/* Checks Info Ring and logs any unexpected events reported by the device */
+static inline void
+acc100_check_ir(struct acc100_device *acc100_dev)
+{
+	volatile union acc100_info_ring_data *ring_data;
+	uint16_t info_ring_head = acc100_dev->info_ring_head;
+	if (acc100_dev->info_ring == NULL)
+		return;
+
+	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
+			ACC100_INFO_RING_MASK);
+
+	while (ring_data->valid) {
+		if ((ring_data->int_nb < ACC100_PF_INT_DMA_DL_DESC_IRQ) || (
+				ring_data->int_nb >
+				ACC100_PF_INT_DMA_DL5G_DESC_IRQ))
+			rte_bbdev_log(WARNING, "InfoRing: ITR:%d Info:0x%x",
+				ring_data->int_nb, ring_data->detailed_info);
+		/* Initialize Info Ring entry and move forward */
+		ring_data->val = 0;
+		info_ring_head++;
+		ring_data = acc100_dev->info_ring +
+				(info_ring_head & ACC100_INFO_RING_MASK);
+	}
+}
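
The handlers above all drain the Info Ring the same way: hardware marks entries valid, software consumes from a free-running head, clears each entry so the slot can be reused, and stops at the first non-valid entry. A self-contained model of that loop (assumed ring size and field names, not the ACC100 layout):

```c
#include <assert.h>
#include <stdint.h>

#define RING_ENTRIES 16               /* assumed power-of-two ring size */
#define RING_MASK (RING_ENTRIES - 1)

/* Toy Info Ring entry: hardware sets 'valid', software clears it. */
struct ir_entry {
	uint32_t valid;
	uint32_t payload;
};

/* Drain all valid entries starting at 'head' (a free-running counter),
 * clearing each one, and return the new head for the caller to store. */
static uint16_t drain_info_ring(struct ir_entry *ring, uint16_t head)
{
	while (ring[head & RING_MASK].valid) {
		ring[head & RING_MASK].valid = 0;
		head++;
	}
	return head;
}
```

Clearing each entry before advancing is what lets the producer and consumer share the ring without any extra index register.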
+
+/* Checks PF Info Ring to find the interrupt cause and handles it accordingly */
+static inline void
+acc100_pf_interrupt_handler(struct rte_bbdev *dev)
+{
+	struct acc100_device *acc100_dev = dev->data->dev_private;
+	volatile union acc100_info_ring_data *ring_data;
+	struct acc100_deq_intr_details deq_intr_det;
+
+	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
+			ACC100_INFO_RING_MASK);
+
+	while (ring_data->valid) {
+
+		rte_bbdev_log_debug(
+				"ACC100 PF Interrupt received, Info Ring data: 0x%x",
+				ring_data->val);
+
+		switch (ring_data->int_nb) {
+		case ACC100_PF_INT_DMA_DL_DESC_IRQ:
+		case ACC100_PF_INT_DMA_UL_DESC_IRQ:
+		case ACC100_PF_INT_DMA_UL5G_DESC_IRQ:
+		case ACC100_PF_INT_DMA_DL5G_DESC_IRQ:
+			deq_intr_det.queue_id = get_queue_id_from_ring_info(
+					dev->data, *ring_data);
+			if (deq_intr_det.queue_id == UINT16_MAX) {
+				rte_bbdev_log(ERR,
+						"Couldn't find queue: aq_id: %u, qg_id: %u, vf_id: %u",
+						ring_data->aq_id,
+						ring_data->qg_id,
+						ring_data->vf_id);
+				return;
+			}
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_DEQUEUE, &deq_intr_det);
+			break;
+		default:
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_ERROR, NULL);
+			break;
+		}
+
+		/* Initialize Info Ring entry and move forward */
+		ring_data->val = 0;
+		++acc100_dev->info_ring_head;
+		ring_data = acc100_dev->info_ring +
+				(acc100_dev->info_ring_head &
+				ACC100_INFO_RING_MASK);
+	}
+}
+
+/* Checks VF Info Ring to find the interrupt cause and handles it accordingly */
+static inline void
+acc100_vf_interrupt_handler(struct rte_bbdev *dev)
+{
+	struct acc100_device *acc100_dev = dev->data->dev_private;
+	volatile union acc100_info_ring_data *ring_data;
+	struct acc100_deq_intr_details deq_intr_det;
+
+	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
+			ACC100_INFO_RING_MASK);
+
+	while (ring_data->valid) {
+
+		rte_bbdev_log_debug(
+				"ACC100 VF Interrupt received, Info Ring data: 0x%x",
+				ring_data->val);
+
+		switch (ring_data->int_nb) {
+		case ACC100_VF_INT_DMA_DL_DESC_IRQ:
+		case ACC100_VF_INT_DMA_UL_DESC_IRQ:
+		case ACC100_VF_INT_DMA_UL5G_DESC_IRQ:
+		case ACC100_VF_INT_DMA_DL5G_DESC_IRQ:
+			/* VFs are not aware of their vf_id - it's set to 0 in
+			 * queue structures.
+			 */
+			ring_data->vf_id = 0;
+			deq_intr_det.queue_id = get_queue_id_from_ring_info(
+					dev->data, *ring_data);
+			if (deq_intr_det.queue_id == UINT16_MAX) {
+				rte_bbdev_log(ERR,
+						"Couldn't find queue: aq_id: %u, qg_id: %u",
+						ring_data->aq_id,
+						ring_data->qg_id);
+				return;
+			}
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_DEQUEUE, &deq_intr_det);
+			break;
+		default:
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_ERROR, NULL);
+			break;
+		}
+
+		/* Initialize Info Ring entry and move forward */
+		ring_data->valid = 0;
+		++acc100_dev->info_ring_head;
+		ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head
+				& ACC100_INFO_RING_MASK);
+	}
+}
+
+/* Interrupt handler triggered by ACC100 dev for handling specific interrupt */
+static void
+acc100_dev_interrupt_handler(void *cb_arg)
+{
+	struct rte_bbdev *dev = cb_arg;
+	struct acc100_device *acc100_dev = dev->data->dev_private;
+
+	/* Read info ring */
+	if (acc100_dev->pf_device)
+		acc100_pf_interrupt_handler(dev);
+	else
+		acc100_vf_interrupt_handler(dev);
+}
+
+/* Allocate and set up the info ring */
+static int
+allocate_info_ring(struct rte_bbdev *dev)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	const struct acc100_registry_addr *reg_addr;
+	rte_iova_t info_ring_phys;
+	uint32_t phys_low, phys_high;
+
+	if (d->info_ring != NULL)
+		return 0; /* Already configured */
+
+	/* Choose correct registry addresses for the device type */
+	if (d->pf_device)
+		reg_addr = &pf_reg_addr;
+	else
+		reg_addr = &vf_reg_addr;
+	/* Allocate InfoRing */
+	d->info_ring = rte_zmalloc_socket("Info Ring",
+			ACC100_INFO_RING_NUM_ENTRIES *
+			sizeof(*d->info_ring), RTE_CACHE_LINE_SIZE,
+			dev->data->socket_id);
+	if (d->info_ring == NULL) {
+		rte_bbdev_log(ERR,
+				"Failed to allocate Info Ring for %s:%u",
+				dev->device->driver->name,
+				dev->data->dev_id);
+		return -ENOMEM;
+	}
+	info_ring_phys = rte_malloc_virt2iova(d->info_ring);
+
+	/* Setup Info Ring */
+	phys_high = (uint32_t)(info_ring_phys >> 32);
+	phys_low  = (uint32_t)(info_ring_phys);
+	acc100_reg_write(d, reg_addr->info_ring_hi, phys_high);
+	acc100_reg_write(d, reg_addr->info_ring_lo, phys_low);
+	acc100_reg_write(d, reg_addr->info_ring_en, ACC100_REG_IRQ_EN_ALL);
+	d->info_ring_head = (acc100_reg_read(d, reg_addr->info_ring_ptr) &
+			0xFFF) / sizeof(union acc100_info_ring_data);
+	return 0;
+}
+
+
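
`allocate_info_ring()` programs the 64-bit IOVA of the ring through two 32-bit registers (`info_ring_hi`/`info_ring_lo`). The split and the device-side recombination must round-trip exactly; a minimal sketch of that arithmetic:

```c
#include <assert.h>
#include <stdint.h>

/* Split a 64-bit IOVA into the high and low 32-bit halves that are
 * written to a pair of device registers. */
static inline void iova_split(uint64_t iova, uint32_t *hi, uint32_t *lo)
{
	*hi = (uint32_t)(iova >> 32);
	*lo = (uint32_t)iova;
}

/* Recombine the halves, as the device does internally. */
static inline uint64_t iova_join(uint32_t hi, uint32_t lo)
{
	return ((uint64_t)hi << 32) | lo;
}
```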
 /* Allocate 64MB memory used for all software rings */
 static int
 acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
@@ -350,6 +557,7 @@
 	uint32_t phys_low, phys_high, payload;
 	struct acc100_device *d = dev->data->dev_private;
 	const struct acc100_registry_addr *reg_addr;
+	int ret;
 
 	if (d->pf_device && !d->acc100_conf.pf_mode_en) {
 		rte_bbdev_log(NOTICE,
@@ -433,6 +641,14 @@
 	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
 	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
 
+	ret = allocate_info_ring(dev);
+	if (ret < 0) {
+		rte_bbdev_log(ERR, "Failed to allocate info_ring for %s:%u",
+				dev->device->driver->name,
+				dev->data->dev_id);
+		/* Continue without interrupt support */
+	}
+
 	d->harq_layout = rte_zmalloc_socket("HARQ Layout",
 			ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
 			RTE_CACHE_LINE_SIZE, dev->data->socket_id);
@@ -454,13 +670,59 @@
 	return 0;
 }
 
+static int
+acc100_intr_enable(struct rte_bbdev *dev)
+{
+	int ret;
+	struct acc100_device *d = dev->data->dev_private;
+
+	/* Only MSI interrupts are currently supported */
+	if (dev->intr_handle->type == RTE_INTR_HANDLE_VFIO_MSI ||
+			dev->intr_handle->type == RTE_INTR_HANDLE_UIO) {
+
+		ret = allocate_info_ring(dev);
+		if (ret < 0) {
+			rte_bbdev_log(ERR,
+					"Couldn't allocate info ring for device: %s",
+					dev->data->name);
+			return ret;
+		}
+
+		ret = rte_intr_enable(dev->intr_handle);
+		if (ret < 0) {
+			rte_bbdev_log(ERR,
+					"Couldn't enable interrupts for device: %s",
+					dev->data->name);
+			rte_free(d->info_ring);
+			return ret;
+		}
+		ret = rte_intr_callback_register(dev->intr_handle,
+				acc100_dev_interrupt_handler, dev);
+		if (ret < 0) {
+			rte_bbdev_log(ERR,
+					"Couldn't register interrupt callback for device: %s",
+					dev->data->name);
+			rte_free(d->info_ring);
+			return ret;
+		}
+
+		return 0;
+	}
+
+	rte_bbdev_log(ERR, "ACC100 (%s) supports only VFIO MSI/UIO interrupts",
+			dev->data->name);
+	return -ENOTSUP;
+}
+
 /* Free 64MB memory used for software rings */
 static int
 acc100_dev_close(struct rte_bbdev *dev)
 {
 	struct acc100_device *d = dev->data->dev_private;
+	acc100_check_ir(d);
 	if (d->sw_rings_base != NULL) {
 		rte_free(d->tail_ptrs);
+		rte_free(d->info_ring);
 		rte_free(d->sw_rings_base);
 		d->sw_rings_base = NULL;
 	}
@@ -665,6 +927,7 @@
 					RTE_BBDEV_TURBO_CRC_TYPE_24B |
 					RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
 					RTE_BBDEV_TURBO_EARLY_TERMINATION |
+					RTE_BBDEV_TURBO_DEC_INTERRUPTS |
 					RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
 					RTE_BBDEV_TURBO_MAP_DEC |
 					RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP |
@@ -685,6 +948,7 @@
 					RTE_BBDEV_TURBO_CRC_24B_ATTACH |
 					RTE_BBDEV_TURBO_RV_INDEX_BYPASS |
 					RTE_BBDEV_TURBO_RATE_MATCH |
+					RTE_BBDEV_TURBO_ENC_INTERRUPTS |
 					RTE_BBDEV_TURBO_ENC_SCATTER_GATHER,
 				.num_buffers_src =
 						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
@@ -698,7 +962,8 @@
 				.capability_flags =
 					RTE_BBDEV_LDPC_RATE_MATCH |
 					RTE_BBDEV_LDPC_CRC_24B_ATTACH |
-					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
+					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS |
+					RTE_BBDEV_LDPC_ENC_INTERRUPTS,
 				.num_buffers_src =
 						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
 				.num_buffers_dst =
@@ -723,7 +988,8 @@
 				RTE_BBDEV_LDPC_DECODE_BYPASS |
 				RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
 				RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
-				RTE_BBDEV_LDPC_LLR_COMPRESSION,
+				RTE_BBDEV_LDPC_LLR_COMPRESSION |
+				RTE_BBDEV_LDPC_DEC_INTERRUPTS,
 			.llr_size = 8,
 			.llr_decimals = 1,
 			.num_buffers_src =
@@ -773,14 +1039,39 @@
 #else
 	dev_info->harq_buffer_size = 0;
 #endif
+	acc100_check_ir(d);
+}
+
+static int
+acc100_queue_intr_enable(struct rte_bbdev *dev, uint16_t queue_id)
+{
+	struct acc100_queue *q = dev->data->queues[queue_id].queue_private;
+
+	if (dev->intr_handle->type != RTE_INTR_HANDLE_VFIO_MSI &&
+			dev->intr_handle->type != RTE_INTR_HANDLE_UIO)
+		return -ENOTSUP;
+
+	q->irq_enable = 1;
+	return 0;
+}
+
+static int
+acc100_queue_intr_disable(struct rte_bbdev *dev, uint16_t queue_id)
+{
+	struct acc100_queue *q = dev->data->queues[queue_id].queue_private;
+	q->irq_enable = 0;
+	return 0;
 }
 
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.setup_queues = acc100_setup_queues,
+	.intr_enable = acc100_intr_enable,
 	.close = acc100_dev_close,
 	.info_get = acc100_dev_info_get,
 	.queue_setup = acc100_queue_setup,
 	.queue_release = acc100_queue_release,
+	.queue_intr_enable = acc100_queue_intr_enable,
+	.queue_intr_disable = acc100_queue_intr_disable
 };
 
 /* ACC100 PCI PF address map */
@@ -3030,8 +3321,10 @@ static inline void acc100_header_init(struct acc100_dma_req_desc *desc)
 			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
 	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
 	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
-	if (op->status != 0)
+	if (op->status != 0) {
 		q_data->queue_stats.dequeue_err_count++;
+		acc100_check_ir(q->d);
+	}
 
 	/* CRC invalid if error exists */
 	if (!op->status)
@@ -3088,6 +3381,9 @@ static inline void acc100_header_init(struct acc100_dma_req_desc *desc)
 		op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
 	op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
 
+	if (op->status & (1 << RTE_BBDEV_DRV_ERROR))
+		acc100_check_ir(q->d);
+
 	/* Check if this is the last desc in batch (Atomic Queue) */
 	if (desc->req.last_desc_in_batch) {
 		(*aq_dequeued)++;
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index ab41ee5..a61cc71 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -565,8 +565,16 @@ struct acc100_device {
 	rte_iova_t sw_rings_phys;  /* Physical address of sw_rings */
 	/* Virtual address of the info memory routed to this function under
 	 * operation, whether it is PF or VF.
+	 * HW may DMA information data to this location asynchronously.
 	 */
+	union acc100_info_ring_data *info_ring;
+
 	union acc100_harq_layout_data *harq_layout;
+	/* Virtual Info Ring head */
+	uint16_t info_ring_head;
+	/* Number of bytes available for each queue in device, depending on
+	 * how many queues are enabled with configure()
+	 */
 	uint32_t sw_ring_size;
 	uint32_t ddr_size; /* Size in kB */
 	uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */
@@ -582,4 +590,12 @@ struct acc100_device {
 	bool configured; /**< True if this ACC100 device is configured */
 };
 
+/**
+ * Structure with details about RTE_BBDEV_EVENT_DEQUEUE event. It's passed to
+ * the callback function.
+ */
+struct acc100_deq_intr_details {
+	uint16_t queue_id;
+};
+
 #endif /* _RTE_ACC100_PMD_H_ */
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v10 09/10] baseband/acc100: add debug function to validate input
  2020-10-01  3:14   ` [dpdk-dev] [PATCH v10 00/10] bbdev PMD ACC100 Nicolas Chautru
                       ` (7 preceding siblings ...)
  2020-10-01  3:14     ` [dpdk-dev] [PATCH v10 08/10] baseband/acc100: add interrupt support to PMD Nicolas Chautru
@ 2020-10-01  3:14     ` Nicolas Chautru
  2020-10-01  3:14     ` [dpdk-dev] [PATCH v10 10/10] baseband/acc100: add configure function Nicolas Chautru
  9 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-10-01  3:14 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, trix, ferruh.yigit, tianjiao.liu,
	Nicolas Chautru

Debug functions to validate the input API from the user.
Only enabled in DEBUG mode at build time.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 436 +++++++++++++++++++++++++++++++
 1 file changed, 436 insertions(+)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 1271229..4640b6e 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -1969,6 +1969,243 @@ static inline void acc100_header_init(struct acc100_dma_req_desc *desc)
 
 }
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+/* Validates turbo encoder parameters */
+static inline int
+validate_enc_op(struct rte_bbdev_enc_op *op)
+{
+	struct rte_bbdev_op_turbo_enc *turbo_enc = &op->turbo_enc;
+	struct rte_bbdev_op_enc_turbo_cb_params *cb = NULL;
+	struct rte_bbdev_op_enc_turbo_tb_params *tb = NULL;
+	uint16_t kw, kw_neg, kw_pos;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if (turbo_enc->input.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid input pointer");
+		return -1;
+	}
+	if (turbo_enc->output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid output pointer");
+		return -1;
+	}
+	if (turbo_enc->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				turbo_enc->rv_index);
+		return -1;
+	}
+	if (turbo_enc->code_block_mode != 0 &&
+			turbo_enc->code_block_mode != 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				turbo_enc->code_block_mode);
+		return -1;
+	}
+
+	if (turbo_enc->code_block_mode == 0) {
+		tb = &turbo_enc->tb_params;
+		if ((tb->k_neg < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_neg > RTE_BBDEV_TURBO_MAX_CB_SIZE)
+				&& tb->c_neg > 0) {
+			rte_bbdev_log(ERR,
+					"k_neg (%u) is out of range %u <= value <= %u",
+					tb->k_neg, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (tb->k_pos < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_pos > RTE_BBDEV_TURBO_MAX_CB_SIZE) {
+			rte_bbdev_log(ERR,
+					"k_pos (%u) is out of range %u <= value <= %u",
+					tb->k_pos, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (tb->c_neg > (RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1))
+			rte_bbdev_log(ERR,
+					"c_neg (%u) is out of range 0 <= value <= %u",
+					tb->c_neg,
+					RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1);
+		if (tb->c < 1 || tb->c > RTE_BBDEV_TURBO_MAX_CODE_BLOCKS) {
+			rte_bbdev_log(ERR,
+					"c (%u) is out of range 1 <= value <= %u",
+					tb->c, RTE_BBDEV_TURBO_MAX_CODE_BLOCKS);
+			return -1;
+		}
+		if (tb->cab > tb->c) {
+			rte_bbdev_log(ERR,
+					"cab (%u) is greater than c (%u)",
+					tb->cab, tb->c);
+			return -1;
+		}
+		if ((tb->ea < RTE_BBDEV_TURBO_MIN_CB_SIZE || (tb->ea % 2))
+				&& tb->r < tb->cab) {
+			rte_bbdev_log(ERR,
+					"ea (%u) is less than %u or it is not even",
+					tb->ea, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+		if ((tb->eb < RTE_BBDEV_TURBO_MIN_CB_SIZE || (tb->eb % 2))
+				&& tb->c > tb->cab) {
+			rte_bbdev_log(ERR,
+					"eb (%u) is less than %u or it is not even",
+					tb->eb, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+
+		kw_neg = 3 * RTE_ALIGN_CEIL(tb->k_neg + 4,
+					RTE_BBDEV_TURBO_C_SUBBLOCK);
+		if (tb->ncb_neg < tb->k_neg || tb->ncb_neg > kw_neg) {
+			rte_bbdev_log(ERR,
+					"ncb_neg (%u) is out of range (%u) k_neg <= value <= (%u) kw_neg",
+					tb->ncb_neg, tb->k_neg, kw_neg);
+			return -1;
+		}
+
+		kw_pos = 3 * RTE_ALIGN_CEIL(tb->k_pos + 4,
+					RTE_BBDEV_TURBO_C_SUBBLOCK);
+		if (tb->ncb_pos < tb->k_pos || tb->ncb_pos > kw_pos) {
+			rte_bbdev_log(ERR,
+					"ncb_pos (%u) is out of range (%u) k_pos <= value <= (%u) kw_pos",
+					tb->ncb_pos, tb->k_pos, kw_pos);
+			return -1;
+		}
+		if (tb->r > (tb->c - 1)) {
+			rte_bbdev_log(ERR,
+					"r (%u) is greater than c - 1 (%u)",
+					tb->r, tb->c - 1);
+			return -1;
+		}
+	} else {
+		cb = &turbo_enc->cb_params;
+		if (cb->k < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| cb->k > RTE_BBDEV_TURBO_MAX_CB_SIZE) {
+			rte_bbdev_log(ERR,
+					"k (%u) is out of range %u <= value <= %u",
+					cb->k, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+
+		if (cb->e < RTE_BBDEV_TURBO_MIN_CB_SIZE || (cb->e % 2)) {
+			rte_bbdev_log(ERR,
+					"e (%u) is less than %u or it is not even",
+					cb->e, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+
+		kw = RTE_ALIGN_CEIL(cb->k + 4, RTE_BBDEV_TURBO_C_SUBBLOCK) * 3;
+		if (cb->ncb < cb->k || cb->ncb > kw) {
+			rte_bbdev_log(ERR,
+					"ncb (%u) is out of range (%u) k <= value <= (%u) kw",
+					cb->ncb, cb->k, kw);
+			return -1;
+		}
+	}
+
+	return 0;
+}
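
The `ncb` checks above bound the rate-matching circular-buffer size `kw` as three streams of `k + 4` tail bits, each padded up to the sub-block interleaver width. A standalone sketch of that computation (the sub-block width of 32 is an assumption taken from the 3GPP turbo interleaver, matching `RTE_BBDEV_TURBO_C_SUBBLOCK` in spirit):

```c
#include <assert.h>
#include <stdint.h>

#define TURBO_C_SUBBLOCK 32   /* assumed sub-block interleaver width */

/* Round v up to the next multiple of a. */
static inline uint32_t align_ceil(uint32_t v, uint32_t a)
{
	return ((v + a - 1) / a) * a;
}

/* Circular-buffer size kw for a turbo code block of size k:
 * three streams of (k + 4 tail bits), each padded to the
 * sub-block width. Valid ncb then satisfies k <= ncb <= kw. */
static inline uint32_t turbo_kw(uint32_t k)
{
	return 3 * align_ceil(k + 4, TURBO_C_SUBBLOCK);
}
```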
+/* Validates LDPC encoder parameters */
+static inline int
+validate_ldpc_enc_op(struct rte_bbdev_enc_op *op)
+{
+	struct rte_bbdev_op_ldpc_enc *ldpc_enc = &op->ldpc_enc;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if (ldpc_enc->input.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid input pointer");
+		return -1;
+	}
+	if (ldpc_enc->output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid output pointer");
+		return -1;
+	}
+	if (ldpc_enc->input.length >
+			RTE_BBDEV_LDPC_MAX_CB_SIZE >> 3) {
+		rte_bbdev_log(ERR, "CB size (%u) is too big, max: %d",
+				ldpc_enc->input.length,
+				RTE_BBDEV_LDPC_MAX_CB_SIZE);
+		return -1;
+	}
+	if ((ldpc_enc->basegraph > 2) || (ldpc_enc->basegraph == 0)) {
+		rte_bbdev_log(ERR,
+				"BG (%u) is out of range 1 <= value <= 2",
+				ldpc_enc->basegraph);
+		return -1;
+	}
+	if (ldpc_enc->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				ldpc_enc->rv_index);
+		return -1;
+	}
+	if (ldpc_enc->code_block_mode > 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				ldpc_enc->code_block_mode);
+		return -1;
+	}
+	int K = (ldpc_enc->basegraph == 1 ? 22 : 10) * ldpc_enc->z_c;
+	if (ldpc_enc->n_filler >= K) {
+		rte_bbdev_log(ERR,
+				"K and F are not compatible %u %u",
+				K, ldpc_enc->n_filler);
+		return -1;
+	}
+	return 0;
+}
+
+/* Validates LDPC decoder parameters */
+static inline int
+validate_ldpc_dec_op(struct rte_bbdev_dec_op *op)
+{
+	struct rte_bbdev_op_ldpc_dec *ldpc_dec = &op->ldpc_dec;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if ((ldpc_dec->basegraph > 2) || (ldpc_dec->basegraph == 0)) {
+		rte_bbdev_log(ERR,
+				"BG (%u) is out of range 1 <= value <= 2",
+				ldpc_dec->basegraph);
+		return -1;
+	}
+	if (ldpc_dec->iter_max == 0) {
+		rte_bbdev_log(ERR,
+				"iter_max (%u) is equal to 0",
+				ldpc_dec->iter_max);
+		return -1;
+	}
+	if (ldpc_dec->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				ldpc_dec->rv_index);
+		return -1;
+	}
+	if (ldpc_dec->code_block_mode > 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				ldpc_dec->code_block_mode);
+		return -1;
+	}
+	int K = (ldpc_dec->basegraph == 1 ? 22 : 10) * ldpc_dec->z_c;
+	if (ldpc_dec->n_filler >= K) {
+		rte_bbdev_log(ERR,
+				"K and F are not compatible %u %u",
+				K, ldpc_dec->n_filler);
+		return -1;
+	}
+	return 0;
+}
+#endif
+
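
Both LDPC validators above derive the information-block size `K` from the base graph and lifting size, then require the filler count to leave at least one information bit. A small sketch of that check (column counts 22/10 as used in the validation above):

```c
#include <assert.h>
#include <stdbool.h>

/* LDPC information-block size K from base graph and lifting size z_c:
 * BG1 has 22 systematic columns, BG2 has 10. */
static inline int ldpc_k(int basegraph, int z_c)
{
	return (basegraph == 1 ? 22 : 10) * z_c;
}

/* Filler bits must satisfy n_filler < K, i.e. leave at least one
 * real information bit in the code block. */
static inline bool ldpc_filler_ok(int basegraph, int z_c, int n_filler)
{
	return n_filler < ldpc_k(basegraph, z_c);
}
```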
 /* Enqueue one encode operations for ACC100 device in CB mode */
 static inline int
 enqueue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
@@ -1980,6 +2217,14 @@ static inline void acc100_header_init(struct acc100_dma_req_desc *desc)
 		seg_total_left;
 	struct rte_mbuf *input, *output_head, *output;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_enc_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2032,6 +2277,14 @@ static inline void acc100_header_init(struct acc100_dma_req_desc *desc)
 	uint16_t  in_length_in_bytes;
 	struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_enc_op(ops[0]) == -1) {
+		rte_bbdev_log(ERR, "LDPC encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2086,6 +2339,14 @@ static inline void acc100_header_init(struct acc100_dma_req_desc *desc)
 		seg_total_left;
 	struct rte_mbuf *input, *output_head, *output;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_enc_op(op) == -1) {
+		rte_bbdev_log(ERR, "LDPC encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2140,6 +2401,14 @@ static inline void acc100_header_init(struct acc100_dma_req_desc *desc)
 	struct rte_mbuf *input, *output_head, *output;
 	uint16_t current_enqueued_cbs = 0;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_enc_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2212,6 +2481,142 @@ static inline void acc100_header_init(struct acc100_dma_req_desc *desc)
 	return current_enqueued_cbs;
 }
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+/* Validates turbo decoder parameters */
+static inline int
+validate_dec_op(struct rte_bbdev_dec_op *op)
+{
+	struct rte_bbdev_op_turbo_dec *turbo_dec = &op->turbo_dec;
+	struct rte_bbdev_op_dec_turbo_cb_params *cb = NULL;
+	struct rte_bbdev_op_dec_turbo_tb_params *tb = NULL;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if (turbo_dec->input.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid input pointer");
+		return -1;
+	}
+	if (turbo_dec->hard_output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid hard_output pointer");
+		return -1;
+	}
+	if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT) &&
+			turbo_dec->soft_output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid soft_output pointer");
+		return -1;
+	}
+	if (turbo_dec->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				turbo_dec->rv_index);
+		return -1;
+	}
+	if (turbo_dec->iter_min < 1) {
+		rte_bbdev_log(ERR,
+				"iter_min (%u) is less than 1",
+				turbo_dec->iter_min);
+		return -1;
+	}
+	if (turbo_dec->iter_max <= 2) {
+		rte_bbdev_log(ERR,
+				"iter_max (%u) is less than or equal to 2",
+				turbo_dec->iter_max);
+		return -1;
+	}
+	if (turbo_dec->iter_min > turbo_dec->iter_max) {
+		rte_bbdev_log(ERR,
+				"iter_min (%u) is greater than iter_max (%u)",
+				turbo_dec->iter_min, turbo_dec->iter_max);
+		return -1;
+	}
+	if (turbo_dec->code_block_mode != 0 &&
+			turbo_dec->code_block_mode != 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				turbo_dec->code_block_mode);
+		return -1;
+	}
+
+	if (turbo_dec->code_block_mode == 0) {
+		tb = &turbo_dec->tb_params;
+		if ((tb->k_neg < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_neg > RTE_BBDEV_TURBO_MAX_CB_SIZE)
+				&& tb->c_neg > 0) {
+			rte_bbdev_log(ERR,
+					"k_neg (%u) is out of range %u <= value <= %u",
+					tb->k_neg, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if ((tb->k_pos < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_pos > RTE_BBDEV_TURBO_MAX_CB_SIZE)
+				&& tb->c > tb->c_neg) {
+			rte_bbdev_log(ERR,
+					"k_pos (%u) is out of range %u <= value <= %u",
+					tb->k_pos, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (tb->c_neg > (RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1))
+			rte_bbdev_log(ERR,
+					"c_neg (%u) is out of range 0 <= value <= %u",
+					tb->c_neg,
+					RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1);
+		if (tb->c < 1 || tb->c > RTE_BBDEV_TURBO_MAX_CODE_BLOCKS) {
+			rte_bbdev_log(ERR,
+					"c (%u) is out of range 1 <= value <= %u",
+					tb->c, RTE_BBDEV_TURBO_MAX_CODE_BLOCKS);
+			return -1;
+		}
+		if (tb->cab > tb->c) {
+			rte_bbdev_log(ERR,
+					"cab (%u) is greater than c (%u)",
+					tb->cab, tb->c);
+			return -1;
+		}
+		if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) &&
+				(tb->ea < RTE_BBDEV_TURBO_MIN_CB_SIZE
+						|| (tb->ea % 2))
+				&& tb->cab > 0) {
+			rte_bbdev_log(ERR,
+					"ea (%u) is less than %u or it is not even",
+					tb->ea, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+		if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) &&
+				(tb->eb < RTE_BBDEV_TURBO_MIN_CB_SIZE
+						|| (tb->eb % 2))
+				&& tb->c > tb->cab) {
+			rte_bbdev_log(ERR,
+					"eb (%u) is less than %u or it is not even",
+					tb->eb, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+		}
+	} else {
+		cb = &turbo_dec->cb_params;
+		if (cb->k < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| cb->k > RTE_BBDEV_TURBO_MAX_CB_SIZE) {
+			rte_bbdev_log(ERR,
+					"k (%u) is out of range %u <= value <= %u",
+					cb->k, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) &&
+				(cb->e < RTE_BBDEV_TURBO_MIN_CB_SIZE ||
+				(cb->e % 2))) {
+			rte_bbdev_log(ERR,
+					"e (%u) is less than %u or it is not even",
+					cb->e, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+	}
+
+	return 0;
+}
+#endif
+
 /** Enqueue one decode operations for ACC100 device in CB mode */
 static inline int
 enqueue_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
@@ -2224,6 +2629,14 @@ static inline void acc100_header_init(struct acc100_dma_req_desc *desc)
 	struct rte_mbuf *input, *h_output_head, *h_output,
 		*s_output_head, *s_output;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo decoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2447,6 +2860,13 @@ static inline void acc100_header_init(struct acc100_dma_req_desc *desc)
 		return ret;
 	}
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "LDPC decoder validation failed");
+		return -EINVAL;
+	}
+#endif
 	union acc100_dma_desc *desc;
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
@@ -2544,6 +2964,14 @@ static inline void acc100_header_init(struct acc100_dma_req_desc *desc)
 	struct rte_mbuf *input, *h_output_head, *h_output;
 	uint16_t current_enqueued_cbs = 0;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "LDPC decoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2634,6 +3062,14 @@ static inline void acc100_header_init(struct acc100_dma_req_desc *desc)
 		*s_output_head, *s_output;
 	uint16_t current_enqueued_cbs = 0;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo decoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v10 10/10] baseband/acc100: add configure function
  2020-10-01  3:14   ` [dpdk-dev] [PATCH v10 00/10] bbdev PMD ACC100 Nicolas Chautru
                       ` (8 preceding siblings ...)
  2020-10-01  3:14     ` [dpdk-dev] [PATCH v10 09/10] baseband/acc100: add debug function to validate input Nicolas Chautru
@ 2020-10-01  3:14     ` Nicolas Chautru
  2020-10-01 14:11       ` Maxime Coquelin
  9 siblings, 1 reply; 213+ messages in thread
From: Nicolas Chautru @ 2020-10-01  3:14 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, trix, ferruh.yigit, tianjiao.liu,
	Nicolas Chautru

Add configure function to configure the PF from within
the bbdev-test itself, without an external application
configuring the device.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 app/test-bbdev/test_bbdev_perf.c                   |  71 +++
 doc/guides/rel_notes/release_20_11.rst             |   5 +
 drivers/baseband/acc100/meson.build                |   2 +
 drivers/baseband/acc100/rte_acc100_cfg.h           |  17 +
 drivers/baseband/acc100/rte_acc100_pmd.c           | 523 ++++++++++++++++++++-
 drivers/baseband/acc100/rte_acc100_pmd.h           |   1 +
 .../acc100/rte_pmd_bbdev_acc100_version.map        |   7 +
 7 files changed, 621 insertions(+), 5 deletions(-)

diff --git a/app/test-bbdev/test_bbdev_perf.c b/app/test-bbdev/test_bbdev_perf.c
index 45c0d62..77903bd 100644
--- a/app/test-bbdev/test_bbdev_perf.c
+++ b/app/test-bbdev/test_bbdev_perf.c
@@ -52,6 +52,18 @@
 #define FLR_5G_TIMEOUT 610
 #endif
 
+#ifdef RTE_LIBRTE_PMD_BBDEV_ACC100
+#include <rte_acc100_cfg.h>
+#define ACC100PF_DRIVER_NAME   ("intel_acc100_pf")
+#define ACC100VF_DRIVER_NAME   ("intel_acc100_vf")
+#define ACC100_QMGR_NUM_AQS 16
+#define ACC100_QMGR_NUM_QGS 2
+#define ACC100_QMGR_AQ_DEPTH 5
+#define ACC100_QMGR_INVALID_IDX -1
+#define ACC100_QMGR_RR 1
+#define ACC100_QOS_GBR 0
+#endif
+
 #define OPS_CACHE_SIZE 256U
 #define OPS_POOL_SIZE_MIN 511U /* 0.5K per queue */
 
@@ -653,6 +665,65 @@ typedef int (test_case_function)(struct active_device *ad,
 				info->dev_name);
 	}
 #endif
+#ifdef RTE_LIBRTE_PMD_BBDEV_ACC100
+	if ((get_init_device() == true) &&
+		(!strcmp(info->drv.driver_name, ACC100PF_DRIVER_NAME))) {
+		struct acc100_conf conf;
+		unsigned int i;
+
+		printf("Configure ACC100 FEC Driver %s with default values\n",
+				info->drv.driver_name);
+
+		/* clear default configuration before initialization */
+		memset(&conf, 0, sizeof(struct acc100_conf));
+
+		/* Always set in PF mode for built-in configuration */
+		conf.pf_mode_en = true;
+		for (i = 0; i < RTE_ACC100_NUM_VFS; ++i) {
+			conf.arb_dl_4g[i].gbr_threshold1 = ACC100_QOS_GBR;
+			conf.arb_dl_4g[i].gbr_threshold2 = ACC100_QOS_GBR;
+			conf.arb_dl_4g[i].round_robin_weight = ACC100_QMGR_RR;
+			conf.arb_ul_4g[i].gbr_threshold1 = ACC100_QOS_GBR;
+			conf.arb_ul_4g[i].gbr_threshold2 = ACC100_QOS_GBR;
+			conf.arb_ul_4g[i].round_robin_weight = ACC100_QMGR_RR;
+			conf.arb_dl_5g[i].gbr_threshold1 = ACC100_QOS_GBR;
+			conf.arb_dl_5g[i].gbr_threshold2 = ACC100_QOS_GBR;
+			conf.arb_dl_5g[i].round_robin_weight = ACC100_QMGR_RR;
+			conf.arb_ul_5g[i].gbr_threshold1 = ACC100_QOS_GBR;
+			conf.arb_ul_5g[i].gbr_threshold2 = ACC100_QOS_GBR;
+			conf.arb_ul_5g[i].round_robin_weight = ACC100_QMGR_RR;
+		}
+
+		conf.input_pos_llr_1_bit = true;
+		conf.output_pos_llr_1_bit = true;
+		conf.num_vf_bundles = 1; /**< Number of VF bundles to setup */
+
+		conf.q_ul_4g.num_qgroups = ACC100_QMGR_NUM_QGS;
+		conf.q_ul_4g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+		conf.q_ul_4g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+		conf.q_ul_4g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+		conf.q_dl_4g.num_qgroups = ACC100_QMGR_NUM_QGS;
+		conf.q_dl_4g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+		conf.q_dl_4g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+		conf.q_dl_4g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+		conf.q_ul_5g.num_qgroups = ACC100_QMGR_NUM_QGS;
+		conf.q_ul_5g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+		conf.q_ul_5g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+		conf.q_ul_5g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+		conf.q_dl_5g.num_qgroups = ACC100_QMGR_NUM_QGS;
+		conf.q_dl_5g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+		conf.q_dl_5g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+		conf.q_dl_5g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+
+		/* setup PF with configuration information */
+		ret = acc100_configure(info->dev_name, &conf);
+		TEST_ASSERT_SUCCESS(ret,
+				"Failed to configure ACC100 PF for bbdev %s",
+				info->dev_name);
+	}
+#endif
+	/* Refresh device info now that the device is configured */
+	rte_bbdev_info_get(dev_id, info);
 	nb_queues = RTE_MIN(rte_lcore_count(), info->drv.max_num_queues);
 	nb_queues = RTE_MIN(nb_queues, (unsigned int) MAX_QUEUES);
 
diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
index 73ac08f..c8d0586 100644
--- a/doc/guides/rel_notes/release_20_11.rst
+++ b/doc/guides/rel_notes/release_20_11.rst
@@ -55,6 +55,11 @@ New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+* **Added Intel ACC100 bbdev PMD.**
+
+  Added a new ``acc100`` bbdev driver for the Intel\ |reg| ACC100 accelerator,
+  also known as Mount Bryce. See the
+  :doc:`../bbdevs/acc100` BBDEV guide for more details on this new driver.
 
 Removed Items
 -------------
diff --git a/drivers/baseband/acc100/meson.build b/drivers/baseband/acc100/meson.build
index 8afafc2..7ac44dc 100644
--- a/drivers/baseband/acc100/meson.build
+++ b/drivers/baseband/acc100/meson.build
@@ -4,3 +4,5 @@
 deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci']
 
 sources = files('rte_acc100_pmd.c')
+
+install_headers('rte_acc100_cfg.h')
diff --git a/drivers/baseband/acc100/rte_acc100_cfg.h b/drivers/baseband/acc100/rte_acc100_cfg.h
index 73bbe36..7f523bc 100644
--- a/drivers/baseband/acc100/rte_acc100_cfg.h
+++ b/drivers/baseband/acc100/rte_acc100_cfg.h
@@ -89,6 +89,23 @@ struct acc100_conf {
 	struct rte_arbitration_t arb_dl_5g[RTE_ACC100_NUM_VFS];
 };
 
+/**
+ * Configure an ACC100 device
+ *
+ * @param dev_name
+ *   The name of the device. This is the short form of PCI BDF, e.g. 00:01.0.
+ *   It can also be retrieved for a bbdev device from the dev_name field in the
+ *   rte_bbdev_info structure returned by rte_bbdev_info_get().
+ * @param conf
+ *   Configuration to apply to ACC100 HW.
+ *
+ * @return
+ *   Zero on success, negative value on failure.
+ */
+__rte_experimental
+int
+acc100_configure(const char *dev_name, struct acc100_conf *conf);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 4640b6e..8552694 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -38,10 +38,10 @@
 
 /* Write a register of a ACC100 device */
 static inline void
-acc100_reg_write(struct acc100_device *d, uint32_t offset, uint32_t payload)
+acc100_reg_write(struct acc100_device *d, uint32_t offset, uint32_t value)
 {
 	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
-	mmio_write(reg_addr, payload);
+	mmio_write(reg_addr, value);
 	usleep(ACC100_LONG_WAIT);
 }
 
@@ -85,6 +85,26 @@
 
 enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC};
 
+/* Return the accelerator enum for a Queue Group Index */
+static inline int
+accFromQgid(int qg_idx, const struct acc100_conf *acc100_conf)
+{
+	int accQg[ACC100_NUM_QGRPS];
+	int NumQGroupsPerFn[NUM_ACC];
+	int acc, qgIdx, qgIndex = 0;
+	for (qgIdx = 0; qgIdx < ACC100_NUM_QGRPS; qgIdx++)
+		accQg[qgIdx] = 0;
+	NumQGroupsPerFn[UL_4G] = acc100_conf->q_ul_4g.num_qgroups;
+	NumQGroupsPerFn[UL_5G] = acc100_conf->q_ul_5g.num_qgroups;
+	NumQGroupsPerFn[DL_4G] = acc100_conf->q_dl_4g.num_qgroups;
+	NumQGroupsPerFn[DL_5G] = acc100_conf->q_dl_5g.num_qgroups;
+	for (acc = UL_4G;  acc < NUM_ACC; acc++)
+		for (qgIdx = 0; qgIdx < NumQGroupsPerFn[acc]; qgIdx++)
+			accQg[qgIndex++] = acc;
+	acc = accQg[qg_idx];
+	return acc;
+}
+
 /* Return the queue topology for a Queue Group Index */
 static inline void
 qtopFromAcc(struct rte_q_topology_t **qtop, int acc_enum,
@@ -113,6 +133,30 @@
 	*qtop = p_qtop;
 }
 
+/* Return the AQ depth for a Queue Group Index */
+static inline int
+aqDepth(int qg_idx, struct acc100_conf *acc100_conf)
+{
+	struct rte_q_topology_t *q_top = NULL;
+	int acc_enum = accFromQgid(qg_idx, acc100_conf);
+	qtopFromAcc(&q_top, acc_enum, acc100_conf);
+	if (unlikely(q_top == NULL))
+		return 0;
+	return q_top->aq_depth_log2;
+}
+
+/* Return the number of AQs for a Queue Group Index */
+static inline int
+aqNum(int qg_idx, struct acc100_conf *acc100_conf)
+{
+	struct rte_q_topology_t *q_top = NULL;
+	int acc_enum = accFromQgid(qg_idx, acc100_conf);
+	qtopFromAcc(&q_top, acc_enum, acc100_conf);
+	if (unlikely(q_top == NULL))
+		return 0;
+	return q_top->num_aqs_per_groups;
+}
+
 static void
 initQTop(struct acc100_conf *acc100_conf)
 {
@@ -554,7 +598,7 @@
 static int
 acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
 {
-	uint32_t phys_low, phys_high, payload;
+	uint32_t phys_low, phys_high, value;
 	struct acc100_device *d = dev->data->dev_private;
 	const struct acc100_registry_addr *reg_addr;
 	int ret;
@@ -613,8 +657,8 @@
 	 * Configure Ring Size to the max queue ring size
 	 * (used for wrapping purpose)
 	 */
-	payload = log2_basic(d->sw_ring_size / 64);
-	acc100_reg_write(d, reg_addr->ring_size, payload);
+	value = log2_basic(d->sw_ring_size / 64);
+	acc100_reg_write(d, reg_addr->ring_size, value);
 
 	/* Configure tail pointer for use when SDONE enabled */
 	d->tail_ptrs = rte_zmalloc_socket(
@@ -4216,3 +4260,472 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
 RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
 RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
 RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
+
+/*
+ * Implementation to fix the power on status of some 5GUL engines
+ * This requires DMA permission if ported outside DPDK
+ */
+static void
+poweron_cleanup(struct rte_bbdev *bbdev, struct acc100_device *d,
+		struct acc100_conf *conf)
+{
+	int i, template_idx, qg_idx;
+	uint32_t address, status, value;
+	printf("Need to clear power-on 5GUL status in internal memory\n");
+	/* Reset LDPC Cores */
+	for (i = 0; i < ACC100_ENGINES_MAX; i++)
+		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+				ACC100_ENGINE_OFFSET * i, ACC100_RESET_HI);
+	usleep(ACC100_LONG_WAIT);
+	for (i = 0; i < ACC100_ENGINES_MAX; i++)
+		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+				ACC100_ENGINE_OFFSET * i, ACC100_RESET_LO);
+	usleep(ACC100_LONG_WAIT);
+	/* Prepare dummy workload */
+	alloc_2x64mb_sw_rings_mem(bbdev, d, 0);
+	/* Set base addresses */
+	uint32_t phys_high = (uint32_t)(d->sw_rings_phys >> 32);
+	uint32_t phys_low  = (uint32_t)(d->sw_rings_phys &
+			~(ACC100_SIZE_64MBYTE-1));
+	acc100_reg_write(d, HWPfDmaFec5GulDescBaseHiRegVf, phys_high);
+	acc100_reg_write(d, HWPfDmaFec5GulDescBaseLoRegVf, phys_low);
+
+	/* Descriptor for a dummy 5GUL code block processing*/
+	union acc100_dma_desc *desc = NULL;
+	desc = d->sw_rings;
+	desc->req.data_ptrs[0].address = d->sw_rings_phys +
+			ACC100_DESC_FCW_OFFSET;
+	desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
+	desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
+	desc->req.data_ptrs[0].last = 0;
+	desc->req.data_ptrs[0].dma_ext = 0;
+	desc->req.data_ptrs[1].address = d->sw_rings_phys + 512;
+	desc->req.data_ptrs[1].blkid = ACC100_DMA_BLKID_IN;
+	desc->req.data_ptrs[1].last = 1;
+	desc->req.data_ptrs[1].dma_ext = 0;
+	desc->req.data_ptrs[1].blen = 44;
+	desc->req.data_ptrs[2].address = d->sw_rings_phys + 1024;
+	desc->req.data_ptrs[2].blkid = ACC100_DMA_BLKID_OUT_ENC;
+	desc->req.data_ptrs[2].last = 1;
+	desc->req.data_ptrs[2].dma_ext = 0;
+	desc->req.data_ptrs[2].blen = 5;
+	/* Dummy FCW */
+	desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
+	desc->req.fcw_ld.qm = 1;
+	desc->req.fcw_ld.nfiller = 30;
+	desc->req.fcw_ld.BG = 2 - 1;
+	desc->req.fcw_ld.Zc = 7;
+	desc->req.fcw_ld.ncb = 350;
+	desc->req.fcw_ld.rm_e = 4;
+	desc->req.fcw_ld.itmax = 10;
+	desc->req.fcw_ld.gain_i = 1;
+	desc->req.fcw_ld.gain_h = 1;
+
+	int engines_to_restart[ACC100_SIG_UL_5G_LAST + 1] = {0};
+	int num_failed_engine = 0;
+	/* Detect engines in undefined state */
+	for (template_idx = ACC100_SIG_UL_5G;
+			template_idx <= ACC100_SIG_UL_5G_LAST;
+			template_idx++) {
+		/* Check engine power-on status */
+		address = HwPfFecUl5gIbDebugReg +
+				ACC100_ENGINE_OFFSET * template_idx;
+		status = (acc100_reg_read(d, address) >> 4) & 0xF;
+		if (status == 0) {
+			engines_to_restart[num_failed_engine] = template_idx;
+			num_failed_engine++;
+		}
+	}
+
+	int numQqsAcc = conf->q_ul_4g.num_qgroups;
+	int numQgs = conf->q_ul_5g.num_qgroups;
+	value = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		value |= (1 << qg_idx);
+	/* Force each engine which is in unspecified state */
+	for (i = 0; i < num_failed_engine; i++) {
+		int failed_engine = engines_to_restart[i];
+		printf("Force engine %d\n", failed_engine);
+		for (template_idx = ACC100_SIG_UL_5G;
+				template_idx <= ACC100_SIG_UL_5G_LAST;
+				template_idx++) {
+			address = HWPfQmgrGrpTmplateReg4Indx
+					+ ACC100_BYTES_IN_WORD * template_idx;
+			if (template_idx == failed_engine)
+				acc100_reg_write(d, address, value);
+			else
+				acc100_reg_write(d, address, 0);
+		}
+		/* Reset descriptor header */
+		desc->req.word0 = ACC100_DMA_DESC_TYPE;
+		desc->req.word1 = 0;
+		desc->req.word2 = 0;
+		desc->req.word3 = 0;
+		desc->req.numCBs = 1;
+		desc->req.m2dlen = 2;
+		desc->req.d2mlen = 1;
+		/* Enqueue the code block for processing */
+		union acc100_enqueue_reg_fmt enq_req;
+		enq_req.val = 0;
+		enq_req.addr_offset = ACC100_DESC_OFFSET;
+		enq_req.num_elem = 1;
+		enq_req.req_elem_addr = 0;
+		rte_wmb();
+		acc100_reg_write(d, HWPfQmgrIngressAq + 0x100, enq_req.val);
+		usleep(ACC100_LONG_WAIT * 100);
+		if (desc->req.word0 != 2)
+			printf("DMA Response %#"PRIx32"\n", desc->req.word0);
+	}
+
+	/* Reset LDPC Cores */
+	for (i = 0; i < ACC100_ENGINES_MAX; i++)
+		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+				ACC100_ENGINE_OFFSET * i,
+				ACC100_RESET_HI);
+	usleep(ACC100_LONG_WAIT);
+	for (i = 0; i < ACC100_ENGINES_MAX; i++)
+		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+				ACC100_ENGINE_OFFSET * i,
+				ACC100_RESET_LO);
+	usleep(ACC100_LONG_WAIT);
+	acc100_reg_write(d, HWPfHi5GHardResetReg, ACC100_RESET_HARD);
+	usleep(ACC100_LONG_WAIT);
+	int numEngines = 0;
+	/* Check engine power-on status again */
+	for (template_idx = ACC100_SIG_UL_5G;
+			template_idx <= ACC100_SIG_UL_5G_LAST;
+			template_idx++) {
+		address = HwPfFecUl5gIbDebugReg +
+				ACC100_ENGINE_OFFSET * template_idx;
+		status = (acc100_reg_read(d, address) >> 4) & 0xF;
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ ACC100_BYTES_IN_WORD * template_idx;
+		if (status == 1) {
+			acc100_reg_write(d, address, value);
+			numEngines++;
+		} else
+			acc100_reg_write(d, address, 0);
+	}
+	printf("Number of 5GUL engines %d\n", numEngines);
+
+	if (d->sw_rings_base != NULL)
+		rte_free(d->sw_rings_base);
+	usleep(ACC100_LONG_WAIT);
+}
+
+/* Initial configuration of an ACC100 device prior to running configure() */
+int
+acc100_configure(const char *dev_name, struct acc100_conf *conf)
+{
+	rte_bbdev_log(INFO, "acc100_configure");
+	uint32_t value, address, status;
+	int qg_idx, template_idx, vf_idx, acc, i;
+	struct rte_bbdev *bbdev = rte_bbdev_get_named_dev(dev_name);
+
+	/* Compile time checks */
+	RTE_BUILD_BUG_ON(sizeof(struct acc100_dma_req_desc) != 256);
+	RTE_BUILD_BUG_ON(sizeof(union acc100_dma_desc) != 256);
+	RTE_BUILD_BUG_ON(sizeof(struct acc100_fcw_td) != 24);
+	RTE_BUILD_BUG_ON(sizeof(struct acc100_fcw_te) != 32);
+
+	if (bbdev == NULL) {
+		rte_bbdev_log(ERR,
+		"Invalid dev_name (%s), or device is not yet initialised",
+		dev_name);
+		return -ENODEV;
+	}
+	struct acc100_device *d = bbdev->data->dev_private;
+
+	/* Store configuration */
+	rte_memcpy(&d->acc100_conf, conf, sizeof(d->acc100_conf));
+
+	/* PCIe Bridge configuration */
+	acc100_reg_write(d, HwPfPcieGpexBridgeControl, ACC100_CFG_PCI_BRIDGE);
+	for (i = 1; i < ACC100_GPEX_AXIMAP_NUM; i++)
+		acc100_reg_write(d,
+				HwPfPcieGpexAxiAddrMappingWindowPexBaseHigh
+				+ i * 16, 0);
+
+	/* Prevent blocking AXI read on BRESP for AXI Write */
+	address = HwPfPcieGpexAxiPioControl;
+	value = ACC100_CFG_PCI_AXI;
+	acc100_reg_write(d, address, value);
+
+	/* 5GDL PLL phase shift */
+	acc100_reg_write(d, HWPfChaDl5gPllPhshft0, 0x1);
+
+	/* Explicitly releasing AXI as this may be stopped after PF FLR/BME */
+	address = HWPfDmaAxiControl;
+	value = 1;
+	acc100_reg_write(d, address, value);
+
+	/* DDR Configuration */
+	address = HWPfDdrBcTim6;
+	value = acc100_reg_read(d, address);
+	value &= 0xFFFFFFFB; /* Bit 2 */
+#ifdef ACC100_DDR_ECC_ENABLE
+	value |= 0x4;
+#endif
+	acc100_reg_write(d, address, value);
+	address = HWPfDdrPhyDqsCountNum;
+#ifdef ACC100_DDR_ECC_ENABLE
+	value = 9;
+#else
+	value = 8;
+#endif
+	acc100_reg_write(d, address, value);
+
+	/* Set default descriptor signature */
+	address = HWPfDmaDescriptorSignatuture;
+	value = 0;
+	acc100_reg_write(d, address, value);
+
+	/* Enable the Error Detection in DMA */
+	value = ACC100_CFG_DMA_ERROR;
+	address = HWPfDmaErrorDetectionEn;
+	acc100_reg_write(d, address, value);
+
+	/* AXI Cache configuration */
+	value = ACC100_CFG_AXI_CACHE;
+	address = HWPfDmaAxcacheReg;
+	acc100_reg_write(d, address, value);
+
+	/* Default DMA Configuration (Qmgr Enabled) */
+	address = HWPfDmaConfig0Reg;
+	value = 0;
+	acc100_reg_write(d, address, value);
+	address = HWPfDmaQmanen;
+	value = 0;
+	acc100_reg_write(d, address, value);
+
+	/* Default RLIM/ALEN configuration */
+	address = HWPfDmaConfig1Reg;
+	value = (1 << 31) + (23 << 8) + (1 << 6) + 7;
+	acc100_reg_write(d, address, value);
+
+	/* Configure DMA Qmanager addresses */
+	address = HWPfDmaQmgrAddrReg;
+	value = HWPfQmgrEgressQueuesTemplate;
+	acc100_reg_write(d, address, value);
+
+	/* ===== Qmgr Configuration ===== */
+	/* Configuration of the AQueue Depth QMGR_GRP_0_DEPTH_LOG2 for UL */
+	int totalQgs = conf->q_ul_4g.num_qgroups +
+			conf->q_ul_5g.num_qgroups +
+			conf->q_dl_4g.num_qgroups +
+			conf->q_dl_5g.num_qgroups;
+	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
+		address = HWPfQmgrDepthLog2Grp +
+		ACC100_BYTES_IN_WORD * qg_idx;
+		value = aqDepth(qg_idx, conf);
+		acc100_reg_write(d, address, value);
+		address = HWPfQmgrTholdGrp +
+		ACC100_BYTES_IN_WORD * qg_idx;
+		value = (1 << 16) + (1 << (aqDepth(qg_idx, conf) - 1));
+		acc100_reg_write(d, address, value);
+	}
+
+	/* Template Priority in incremental order */
+	for (template_idx = 0; template_idx < ACC100_NUM_TMPL;
+			template_idx++) {
+		address = HWPfQmgrGrpTmplateReg0Indx +
+		ACC100_BYTES_IN_WORD * (template_idx % 8);
+		value = ACC100_TMPL_PRI_0;
+		acc100_reg_write(d, address, value);
+		address = HWPfQmgrGrpTmplateReg1Indx +
+		ACC100_BYTES_IN_WORD * (template_idx % 8);
+		value = ACC100_TMPL_PRI_1;
+		acc100_reg_write(d, address, value);
+		address = HWPfQmgrGrpTmplateReg2indx +
+		ACC100_BYTES_IN_WORD * (template_idx % 8);
+		value = ACC100_TMPL_PRI_2;
+		acc100_reg_write(d, address, value);
+		address = HWPfQmgrGrpTmplateReg3Indx +
+		ACC100_BYTES_IN_WORD * (template_idx % 8);
+		value = ACC100_TMPL_PRI_3;
+		acc100_reg_write(d, address, value);
+	}
+
+	address = HWPfQmgrGrpPriority;
+	value = ACC100_CFG_QMGR_HI_P;
+	acc100_reg_write(d, address, value);
+
+	/* Template Configuration */
+	for (template_idx = 0; template_idx < ACC100_NUM_TMPL;
+			template_idx++) {
+		value = 0;
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ ACC100_BYTES_IN_WORD * template_idx;
+		acc100_reg_write(d, address, value);
+	}
+	/* 4GUL */
+	int numQgs = conf->q_ul_4g.num_qgroups;
+	int numQqsAcc = 0;
+	value = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		value |= (1 << qg_idx);
+	for (template_idx = ACC100_SIG_UL_4G;
+			template_idx <= ACC100_SIG_UL_4G_LAST;
+			template_idx++) {
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ ACC100_BYTES_IN_WORD * template_idx;
+		acc100_reg_write(d, address, value);
+	}
+	/* 5GUL */
+	numQqsAcc += numQgs;
+	numQgs	= conf->q_ul_5g.num_qgroups;
+	value = 0;
+	int numEngines = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		value |= (1 << qg_idx);
+	for (template_idx = ACC100_SIG_UL_5G;
+			template_idx <= ACC100_SIG_UL_5G_LAST;
+			template_idx++) {
+		/* Check engine power-on status */
+		address = HwPfFecUl5gIbDebugReg +
+				ACC100_ENGINE_OFFSET * template_idx;
+		status = (acc100_reg_read(d, address) >> 4) & 0xF;
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ ACC100_BYTES_IN_WORD * template_idx;
+		if (status == 1) {
+			acc100_reg_write(d, address, value);
+			numEngines++;
+		} else
+			acc100_reg_write(d, address, 0);
+#if RTE_ACC100_SINGLE_FEC == 1
+		value = 0;
+#endif
+	}
+	printf("Number of 5GUL engines %d\n", numEngines);
+	/* 4GDL */
+	numQqsAcc += numQgs;
+	numQgs	= conf->q_dl_4g.num_qgroups;
+	value = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		value |= (1 << qg_idx);
+	for (template_idx = ACC100_SIG_DL_4G;
+			template_idx <= ACC100_SIG_DL_4G_LAST;
+			template_idx++) {
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ ACC100_BYTES_IN_WORD * template_idx;
+		acc100_reg_write(d, address, value);
+#if RTE_ACC100_SINGLE_FEC == 1
+			value = 0;
+#endif
+	}
+	/* 5GDL */
+	numQqsAcc += numQgs;
+	numQgs	= conf->q_dl_5g.num_qgroups;
+	value = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		value |= (1 << qg_idx);
+	for (template_idx = ACC100_SIG_DL_5G;
+			template_idx <= ACC100_SIG_DL_5G_LAST;
+			template_idx++) {
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ ACC100_BYTES_IN_WORD * template_idx;
+		acc100_reg_write(d, address, value);
+#if RTE_ACC100_SINGLE_FEC == 1
+		value = 0;
+#endif
+	}
+
+	/* Queue Group Function mapping */
+	int qman_func_id[5] = {0, 2, 1, 3, 4};
+	address = HWPfQmgrGrpFunction0;
+	value = 0;
+	for (qg_idx = 0; qg_idx < 8; qg_idx++) {
+		acc = accFromQgid(qg_idx, conf);
+		value |= qman_func_id[acc]<<(qg_idx * 4);
+	}
+	acc100_reg_write(d, address, value);
+
+	/* Configuration of the Arbitration QGroup depth to 1 */
+	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
+		address = HWPfQmgrArbQDepthGrp +
+		ACC100_BYTES_IN_WORD * qg_idx;
+		value = 0;
+		acc100_reg_write(d, address, value);
+	}
+
+	/* Enable AQueues through the queue hierarchy */
+	for (vf_idx = 0; vf_idx < ACC100_NUM_VFS; vf_idx++) {
+		for (qg_idx = 0; qg_idx < ACC100_NUM_QGRPS; qg_idx++) {
+			value = 0;
+			if (vf_idx < conf->num_vf_bundles &&
+					qg_idx < totalQgs)
+				value = (1 << aqNum(qg_idx, conf)) - 1;
+			address = HWPfQmgrAqEnableVf
+					+ vf_idx * ACC100_BYTES_IN_WORD;
+			value += (qg_idx << 16);
+			acc100_reg_write(d, address, value);
+		}
+	}
+
+	/* This pointer to ARAM (256kB) is shifted by 2 (4B per register) */
+	uint32_t aram_address = 0;
+	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
+		for (vf_idx = 0; vf_idx < conf->num_vf_bundles; vf_idx++) {
+			address = HWPfQmgrVfBaseAddr + vf_idx
+					* ACC100_BYTES_IN_WORD + qg_idx
+					* ACC100_BYTES_IN_WORD * 64;
+			value = aram_address;
+			acc100_reg_write(d, address, value);
+			/* Offset ARAM Address for next memory bank
+			 * - increment of 4B
+			 */
+			aram_address += aqNum(qg_idx, conf) *
+					(1 << aqDepth(qg_idx, conf));
+		}
+	}
+
+	if (aram_address > ACC100_WORDS_IN_ARAM_SIZE) {
+		rte_bbdev_log(ERR, "ARAM Configuration not fitting %d %d\n",
+				aram_address, ACC100_WORDS_IN_ARAM_SIZE);
+		return -EINVAL;
+	}
+
+	/* ==== HI Configuration ==== */
+
+	/* Prevent Block on Transmit Error */
+	address = HWPfHiBlockTransmitOnErrorEn;
+	value = 0;
+	acc100_reg_write(d, address, value);
+	/* Prevent MSI from being dropped */
+	address = HWPfHiMsiDropEnableReg;
+	value = 0;
+	acc100_reg_write(d, address, value);
+	/* Set the PF Mode register */
+	address = HWPfHiPfMode;
+	value = (conf->pf_mode_en) ? ACC100_PF_VAL : 0;
+	acc100_reg_write(d, address, value);
+	/* Enable Error Detection in HW */
+	address = HWPfDmaErrorDetectionEn;
+	value = 0x3D7;
+	acc100_reg_write(d, address, value);
+
+	/* QoS overflow init */
+	value = 1;
+	address = HWPfQosmonAEvalOverflow0;
+	acc100_reg_write(d, address, value);
+	address = HWPfQosmonBEvalOverflow0;
+	acc100_reg_write(d, address, value);
+
+	/* HARQ DDR Configuration */
+	unsigned int ddrSizeInMb = 512; /* Fixed to 512 MB per VF for now */
+	for (vf_idx = 0; vf_idx < conf->num_vf_bundles; vf_idx++) {
+		address = HWPfDmaVfDdrBaseRw + vf_idx
+				* 0x10;
+		value = ((vf_idx * (ddrSizeInMb / 64)) << 16) +
+				(ddrSizeInMb - 1);
+		acc100_reg_write(d, address, value);
+	}
+	usleep(ACC100_LONG_WAIT);
+
+	/* Workaround in case some 5GUL engines are in an unexpected state */
+	if (numEngines < (ACC100_SIG_UL_5G_LAST + 1))
+		poweron_cleanup(bbdev, d, conf);
+
+	rte_bbdev_log_debug("PF Tip configuration complete for %s", dev_name);
+	return 0;
+}
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index a61cc71..2acfd10 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -158,6 +158,7 @@
 #define ACC100_RESET_HARD       0x1FF
 #define ACC100_ENGINES_MAX      9
 #define ACC100_LONG_WAIT        1000
+#define ACC100_GPEX_AXIMAP_NUM  17
 
 /* ACC100 DMA Descriptor triplet */
 struct acc100_dma_triplet {
diff --git a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
index 4a76d1d..91c234d 100644
--- a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
+++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
@@ -1,3 +1,10 @@
 DPDK_21 {
 	local: *;
 };
+
+EXPERIMENTAL {
+	global:
+
+	acc100_configure;
+
+};
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v10 10/10] baseband/acc100: add configure function
  2020-10-01  3:14     ` [dpdk-dev] [PATCH v10 10/10] baseband/acc100: add configure function Nicolas Chautru
@ 2020-10-01 14:11       ` Maxime Coquelin
  2020-10-01 15:36         ` Chautru, Nicolas
  0 siblings, 1 reply; 213+ messages in thread
From: Maxime Coquelin @ 2020-10-01 14:11 UTC (permalink / raw)
  To: Nicolas Chautru, dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, trix, ferruh.yigit, tianjiao.liu

Hi Nicolas,

On 10/1/20 5:14 AM, Nicolas Chautru wrote:
> diff --git a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> index 4a76d1d..91c234d 100644
> --- a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> +++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> @@ -1,3 +1,10 @@
>  DPDK_21 {
>  	local: *;
>  };
> +
> +EXPERIMENTAL {
> +	global:
> +
> +	acc100_configure;
> +
> +};
> -- 

Ideally we should not need to have device-specific APIs, but at least it
should be prefixed with "rte_".

Regards,
Maxime


^ permalink raw reply	[flat|nested] 213+ messages in thread
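Following Maxime's suggestion, the experimental section of the map file would carry a namespaced symbol instead. A sketch of what that would look like (the name `rte_acc100_configure` is an assumption following the suggested `rte_` convention, not the name confirmed in this thread):

```
EXPERIMENTAL {
	global:

	rte_acc100_configure;
};
```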

* Re: [dpdk-dev] [PATCH v10 03/10] baseband/acc100: add info get function
  2020-10-01  3:14     ` [dpdk-dev] [PATCH v10 03/10] baseband/acc100: add info get function Nicolas Chautru
@ 2020-10-01 14:34       ` Maxime Coquelin
  2020-10-01 19:50         ` Chautru, Nicolas
  0 siblings, 1 reply; 213+ messages in thread
From: Maxime Coquelin @ 2020-10-01 14:34 UTC (permalink / raw)
  To: Nicolas Chautru, dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, trix, ferruh.yigit, tianjiao.liu



On 10/1/20 5:14 AM, Nicolas Chautru wrote:
> Add the "info_get" function to the driver, to allow us to query the
> device.
> No processing capabilities are available yet.
> Linking bbdev-test to support the PMD with null capability.
> 
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
> ---
>  app/test-bbdev/meson.build               |   3 +
>  drivers/baseband/acc100/rte_acc100_cfg.h |  96 +++++++++++++
>  drivers/baseband/acc100/rte_acc100_pmd.c | 229 +++++++++++++++++++++++++++++++
>  drivers/baseband/acc100/rte_acc100_pmd.h |  10 ++
>  4 files changed, 338 insertions(+)
>  create mode 100644 drivers/baseband/acc100/rte_acc100_cfg.h
> 
> diff --git a/app/test-bbdev/meson.build b/app/test-bbdev/meson.build
> index 18ab6a8..fbd8ae3 100644
> --- a/app/test-bbdev/meson.build
> +++ b/app/test-bbdev/meson.build
> @@ -12,3 +12,6 @@ endif
>  if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC')
>  	deps += ['pmd_bbdev_fpga_5gnr_fec']
>  endif
> +if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_ACC100')
> +	deps += ['pmd_bbdev_acc100']
> +endif
> \ No newline at end of file
> diff --git a/drivers/baseband/acc100/rte_acc100_cfg.h b/drivers/baseband/acc100/rte_acc100_cfg.h
> new file mode 100644
> index 0000000..73bbe36
> --- /dev/null
> +++ b/drivers/baseband/acc100/rte_acc100_cfg.h
> @@ -0,0 +1,96 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation
> + */
> +
> +#ifndef _RTE_ACC100_CFG_H_
> +#define _RTE_ACC100_CFG_H_
> +
> +/**
> + * @file rte_acc100_cfg.h
> + *
> + * Functions for configuring ACC100 HW, exposed directly to applications.
> + * Configuration related to encoding/decoding is done through the
> + * librte_bbdev library.
> + *
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice
> + */
> +
> +#include <stdint.h>
> +#include <stdbool.h>
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +/**< Number of Virtual Functions ACC100 supports */
> +#define RTE_ACC100_NUM_VFS 16
> +
> +/**
> + * Definition of Queue Topology for ACC100 Configuration
> + * Some level of details is abstracted out to expose a clean interface
> + * given that comprehensive flexibility is not required
> + */
> +struct rte_q_topology_t {

The naming is too generic, it has to contain the driver name.
Also, it should not be postfixed with _t, as it is not a typedef.

"struct rte_acc100_queue_topology"?

> +	/** Number of QGroups in incremental order of priority */
> +	uint16_t num_qgroups;
> +	/**
> +	 * All QGroups have the same number of AQs here.
> +	 * Note : Could be made a 16-array if more flexibility is really
> +	 * required
> +	 */
> +	uint16_t num_aqs_per_groups;
> +	/**
> +	 * Depth of the AQs is the same of all QGroups here. Log2 Enum : 2^N
> +	 * Note : Could be made a 16-array if more flexibility is really
> +	 * required
> +	 */
> +	uint16_t aq_depth_log2;
> +	/**
> +	 * Index of the first Queue Group Index - assuming contiguity
> +	 * Initialized as -1
> +	 */
> +	int8_t first_qgroup_index;
> +};
> +
> +/**
> + * Definition of Arbitration related parameters for ACC100 Configuration
> + */
> +struct rte_arbitration_t {

Same remark here.

> +	/** Default Weight for VF Fairness Arbitration */
> +	uint16_t round_robin_weight;
> +	uint32_t gbr_threshold1; /**< Guaranteed Bitrate Threshold 1 */
> +	uint32_t gbr_threshold2; /**< Guaranteed Bitrate Threshold 2 */
> +};
> +
> +/**
> + * Structure to pass ACC100 configuration.
> + * Note: all VF Bundles will have the same configuration.
> + */
> +struct acc100_conf {

"struct rte_acc100_conf"?

> +	bool pf_mode_en; /**< 1 if PF is used for dataplane, 0 for VFs */
> +	/** 1 if input '1' bit is represented by a positive LLR value, 0 if '1'
> +	 * bit is represented by a negative value.
> +	 */
> +	bool input_pos_llr_1_bit;
> +	/** 1 if output '1' bit is represented by a positive value, 0 if '1'
> +	 * bit is represented by a negative value.
> +	 */
> +	bool output_pos_llr_1_bit;
> +	uint16_t num_vf_bundles; /**< Number of VF bundles to setup */
> +	/** Queue topology for each operation type */
> +	struct rte_q_topology_t q_ul_4g;
> +	struct rte_q_topology_t q_dl_4g;
> +	struct rte_q_topology_t q_ul_5g;
> +	struct rte_q_topology_t q_dl_5g;
> +	/** Arbitration configuration for each operation type */
> +	struct rte_arbitration_t arb_ul_4g[RTE_ACC100_NUM_VFS];
> +	struct rte_arbitration_t arb_dl_4g[RTE_ACC100_NUM_VFS];
> +	struct rte_arbitration_t arb_ul_5g[RTE_ACC100_NUM_VFS];
> +	struct rte_arbitration_t arb_dl_5g[RTE_ACC100_NUM_VFS];
> +};
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_ACC100_CFG_H_ */

Regards,
Maxime


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v9 05/10] baseband/acc100: add LDPC processing functions
  2020-09-30 18:52         ` Chautru, Nicolas
@ 2020-10-01 15:31           ` Tom Rix
  2020-10-01 16:07             ` Chautru, Nicolas
  0 siblings, 1 reply; 213+ messages in thread
From: Tom Rix @ 2020-10-01 15:31 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, akhil.goyal
  Cc: Richardson, Bruce, Xu, Rosen, dave.burley, aidan.goddard, Yigit,
	Ferruh, Liu, Tianjiao


On 9/30/20 11:52 AM, Chautru, Nicolas wrote:
> Hi Tom, 
>
>> From: Tom Rix <trix@redhat.com>
>> On 9/28/20 5:29 PM, Nicolas Chautru wrote:
>>> Adding LDPC decode and encode processing operations
>>>
>>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
>>> Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
>>> Acked-by: Dave Burley <dave.burley@accelercomm.com>
>>> ---
>>>  doc/guides/bbdevs/features/acc100.ini    |    8 +-
>>>  drivers/baseband/acc100/rte_acc100_pmd.c | 1625
>> +++++++++++++++++++++++++++++-
>>>  drivers/baseband/acc100/rte_acc100_pmd.h |    3 +
>>>  3 files changed, 1630 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/doc/guides/bbdevs/features/acc100.ini
>> b/doc/guides/bbdevs/features/acc100.ini
>>> index c89a4d7..40c7adc 100644
>>> --- a/doc/guides/bbdevs/features/acc100.ini
>>> +++ b/doc/guides/bbdevs/features/acc100.ini
>>> @@ -6,9 +6,9 @@
>>>  [Features]
>>>  Turbo Decoder (4G)     = N
>>>  Turbo Encoder (4G)     = N
>>> -LDPC Decoder (5G)      = N
>>> -LDPC Encoder (5G)      = N
>>> -LLR/HARQ Compression   = N
>>> -External DDR Access    = N
>>> +LDPC Decoder (5G)      = Y
>>> +LDPC Encoder (5G)      = Y
>>> +LLR/HARQ Compression   = Y
>>> +External DDR Access    = Y
>>>  HW Accelerated         = Y
>>>  BBDEV API              = Y
>>> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
>> b/drivers/baseband/acc100/rte_acc100_pmd.c
>>> index 7a21c57..b223547 100644
>>> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
>>> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
>>> @@ -15,6 +15,9 @@
>>>  #include <rte_hexdump.h>
>>>  #include <rte_pci.h>
>>>  #include <rte_bus_pci.h>
>>> +#ifdef RTE_BBDEV_OFFLOAD_COST
>>> +#include <rte_cycles.h>
>>> +#endif
>>>
>>>  #include <rte_bbdev.h>
>>>  #include <rte_bbdev_pmd.h>
>>> @@ -449,7 +452,6 @@
>>>  	return 0;
>>>  }
>>>
>>> -
>>>  /**
>>>   * Report a ACC100 queue index which is free
>>>   * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
>>> @@ -634,6 +636,46 @@
>>>  	struct acc100_device *d = dev->data->dev_private;
>>>
>>>  	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
>>> +		{
>>> +			.type   = RTE_BBDEV_OP_LDPC_ENC,
>>> +			.cap.ldpc_enc = {
>>> +				.capability_flags =
>>> +					RTE_BBDEV_LDPC_RATE_MATCH |
>>> +
>> 	RTE_BBDEV_LDPC_CRC_24B_ATTACH |
>>> +
>> 	RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
>>> +				.num_buffers_src =
>>> +
>> 	RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
>>> +				.num_buffers_dst =
>>> +
>> 	RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
>>> +			}
>>> +		},
>>> +		{
>>> +			.type   = RTE_BBDEV_OP_LDPC_DEC,
>>> +			.cap.ldpc_dec = {
>>> +			.capability_flags =
>>> +				RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK |
>>> +				RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP |
>>> +
>> 	RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
>>> +
>> 	RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
>>> +#ifdef ACC100_EXT_MEM
>> This is unconditionally defined in rte_acc100_pmd.h but it seems
>>
>> like it could be a hw config.  Please add a comment in the *.h
>>
> It is not really an HW config, just a potential alternate way to run
> the device notably for troubleshooting.
> I can add a comment though
>
>> Could also change to
>>
>> #if ACC100_EXT_MEM
>>
>> and change the #define ACC100_EXT_MEM 1
> ok
>
>>> +
>> 	RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
>>> +
>> 	RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
>>> +#endif
>>> +
>> 	RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE |
>>> +				RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS
>> |
>>> +				RTE_BBDEV_LDPC_DECODE_BYPASS |
>>> +				RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
>>> +
>> 	RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
>>> +				RTE_BBDEV_LDPC_LLR_COMPRESSION,
>>> +			.llr_size = 8,
>>> +			.llr_decimals = 1,
>>> +			.num_buffers_src =
>>> +
>> 	RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
>>> +			.num_buffers_hard_out =
>>> +
>> 	RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
>>> +			.num_buffers_soft_out = 0,
>>> +			}
>>> +		},
>>>  		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
>>>  	};
>>>
>>> @@ -669,9 +711,14 @@
>>>  	dev_info->cpu_flag_reqs = NULL;
>>>  	dev_info->min_alignment = 64;
>>>  	dev_info->capabilities = bbdev_capabilities;
>>> +#ifdef ACC100_EXT_MEM
>>>  	dev_info->harq_buffer_size = d->ddr_size;
>>> +#else
>>> +	dev_info->harq_buffer_size = 0;
>>> +#endif
>>>  }
>>>
>>> +
>>>  static const struct rte_bbdev_ops acc100_bbdev_ops = {
>>>  	.setup_queues = acc100_setup_queues,
>>>  	.close = acc100_dev_close,
>>> @@ -696,6 +743,1577 @@
>>>  	{.device_id = 0},
>>>  };
>>>
>>> +/* Read flag value 0/1 from bitmap */
>>> +static inline bool
>>> +check_bit(uint32_t bitmap, uint32_t bitmask)
>>> +{
>>> +	return bitmap & bitmask;
>>> +}
>>> +
>> All the bbdev PMDs have this function; it's pretty trivial, but it
>> would be good if common bbdev functions got moved to a common place.
> Noted for future change affecting all PMDs outside of that series. 

ok.
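
For reference, the shared helper could be as small as this (hypothetical
name and placement, sketch only):

```c
#include <stdbool.h>
#include <stdint.h>

/* Candidate for a common bbdev header, e.g. rte_bbdev_op.h
 * (hypothetical placement): read a flag value 0/1 from a bitmap,
 * shared by all bbdev PMDs instead of duplicated per driver. */
static inline bool
rte_bbdev_check_bit(uint32_t bitmap, uint32_t bitmask)
{
	return (bitmap & bitmask) != 0;
}
```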

>
>>> +static inline char *
>>> +mbuf_append(struct rte_mbuf *m_head, struct rte_mbuf *m, uint16_t
>> len)
>>> +{
>>> +	if (unlikely(len > rte_pktmbuf_tailroom(m)))
>>> +		return NULL;
>>> +
>>> +	char *tail = (char *)m->buf_addr + m->data_off + m->data_len;
>>> +	m->data_len = (uint16_t)(m->data_len + len);
>>> +	m_head->pkt_len  = (m_head->pkt_len + len);
>>> +	return tail;
>>> +}
>>> +
>>> +/* Compute value of k0.
>>> + * Based on 3GPP 38.212 Table 5.4.2.1-2
>>> + * Starting position of different redundancy versions, k0
>>> + */
>>> +static inline uint16_t
>>> +get_k0(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv_index)
>>> +{
>>> +	if (rv_index == 0)
>>> +		return 0;
>>> +	uint16_t n = (bg == 1 ? N_ZC_1 : N_ZC_2) * z_c;
>>> +	if (n_cb == n) {
>>> +		if (rv_index == 1)
>>> +			return (bg == 1 ? K0_1_1 : K0_1_2) * z_c;
>>> +		else if (rv_index == 2)
>>> +			return (bg == 1 ? K0_2_1 : K0_2_2) * z_c;
>>> +		else
>>> +			return (bg == 1 ? K0_3_1 : K0_3_2) * z_c;
>>> +	}
>>> +	/* LBRM case - includes a division by N */
>>> +	if (rv_index == 1)
>>> +		return (((bg == 1 ? K0_1_1 : K0_1_2) * n_cb)
>>> +				/ n) * z_c;
>>> +	else if (rv_index == 2)
>>> +		return (((bg == 1 ? K0_2_1 : K0_2_2) * n_cb)
>>> +				/ n) * z_c;
>>> +	else
>>> +		return (((bg == 1 ? K0_3_1 : K0_3_2) * n_cb)
>>> +				/ n) * z_c;
>>> +}
>>> +
>>> +/* Fill in a frame control word for LDPC encoding. */
>>> +static inline void
>>> +acc100_fcw_le_fill(const struct rte_bbdev_enc_op *op,
>>> +		struct acc100_fcw_le *fcw, int num_cb)
>>> +{
>>> +	fcw->qm = op->ldpc_enc.q_m;
>>> +	fcw->nfiller = op->ldpc_enc.n_filler;
>>> +	fcw->BG = (op->ldpc_enc.basegraph - 1);
>>> +	fcw->Zc = op->ldpc_enc.z_c;
>>> +	fcw->ncb = op->ldpc_enc.n_cb;
>>> +	fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_enc.basegraph,
>>> +			op->ldpc_enc.rv_index);
>>> +	fcw->rm_e = op->ldpc_enc.cb_params.e;
>>> +	fcw->crc_select = check_bit(op->ldpc_enc.op_flags,
>>> +			RTE_BBDEV_LDPC_CRC_24B_ATTACH);
>>> +	fcw->bypass_intlv = check_bit(op->ldpc_enc.op_flags,
>>> +			RTE_BBDEV_LDPC_INTERLEAVER_BYPASS);
>>> +	fcw->mcb_count = num_cb;
>>> +}
>>> +
>>> +/* Fill in a frame control word for LDPC decoding. */
>>> +static inline void
>>> +acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct
>> acc100_fcw_ld *fcw,
>>> +		union acc100_harq_layout_data *harq_layout)
>>> +{
>>> +	uint16_t harq_out_length, harq_in_length, ncb_p, k0_p,
>> parity_offset;
>>> +	uint16_t harq_index;
>>> +	uint32_t l;
>>> +	bool harq_prun = false;
>>> +
>>> +	fcw->qm = op->ldpc_dec.q_m;
>>> +	fcw->nfiller = op->ldpc_dec.n_filler;
>>> +	fcw->BG = (op->ldpc_dec.basegraph - 1);
>>> +	fcw->Zc = op->ldpc_dec.z_c;
>>> +	fcw->ncb = op->ldpc_dec.n_cb;
>>> +	fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_dec.basegraph,
>>> +			op->ldpc_dec.rv_index);
>>> +	if (op->ldpc_dec.code_block_mode == 1)
>> 1 is magic, consider a #define
> This would be a change not related to that PMD, but noted and agreed. 
>
>>> +		fcw->rm_e = op->ldpc_dec.cb_params.e;
>>> +	else
>>> +		fcw->rm_e = (op->ldpc_dec.tb_params.r <
>>> +				op->ldpc_dec.tb_params.cab) ?
>>> +						op->ldpc_dec.tb_params.ea :
>>> +						op->ldpc_dec.tb_params.eb;
>>> +
>>> +	fcw->hcin_en = check_bit(op->ldpc_dec.op_flags,
>>> +			RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE);
>>> +	fcw->hcout_en = check_bit(op->ldpc_dec.op_flags,
>>> +			RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE);
>>> +	fcw->crc_select = check_bit(op->ldpc_dec.op_flags,
>>> +			RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK);
>>> +	fcw->bypass_dec = check_bit(op->ldpc_dec.op_flags,
>>> +			RTE_BBDEV_LDPC_DECODE_BYPASS);
>>> +	fcw->bypass_intlv = check_bit(op->ldpc_dec.op_flags,
>>> +			RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS);
>>> +	if (op->ldpc_dec.q_m == 1) {
>>> +		fcw->bypass_intlv = 1;
>>> +		fcw->qm = 2;
>>> +	}
>> similar magic.
> Qm is an integer number defined in 3GPP, not a magic number. This literally means qm = 2.

ok


>>> +	fcw->hcin_decomp_mode = check_bit(op->ldpc_dec.op_flags,
>>> +			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
>>> +	fcw->hcout_comp_mode = check_bit(op->ldpc_dec.op_flags,
>>> +			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
>>> +	fcw->llr_pack_mode = check_bit(op->ldpc_dec.op_flags,
>>> +			RTE_BBDEV_LDPC_LLR_COMPRESSION);
>>> +	harq_index = op->ldpc_dec.harq_combined_output.offset /
>>> +			ACC100_HARQ_OFFSET;
>>> +#ifdef ACC100_EXT_MEM
>>> +	/* Limit cases when HARQ pruning is valid */
>>> +	harq_prun = ((op->ldpc_dec.harq_combined_output.offset %
>>> +			ACC100_HARQ_OFFSET) == 0) &&
>>> +			(op->ldpc_dec.harq_combined_output.offset <=
>> UINT16_MAX
>>> +			* ACC100_HARQ_OFFSET);
>>> +#endif
>>> +	if (fcw->hcin_en > 0) {
>>> +		harq_in_length = op-
>>> ldpc_dec.harq_combined_input.length;
>>> +		if (fcw->hcin_decomp_mode > 0)
>>> +			harq_in_length = harq_in_length * 8 / 6;
>>> +		harq_in_length = RTE_ALIGN(harq_in_length, 64);
>>> +		if ((harq_layout[harq_index].offset > 0) & harq_prun) {
>>> +			rte_bbdev_log_debug("HARQ IN offset unexpected
>> for now\n");
>>> +			fcw->hcin_size0 = harq_layout[harq_index].size0;
>>> +			fcw->hcin_offset = harq_layout[harq_index].offset;
>>> +			fcw->hcin_size1 = harq_in_length -
>>> +					harq_layout[harq_index].offset;
>>> +		} else {
>>> +			fcw->hcin_size0 = harq_in_length;
>>> +			fcw->hcin_offset = 0;
>>> +			fcw->hcin_size1 = 0;
>>> +		}
>>> +	} else {
>>> +		fcw->hcin_size0 = 0;
>>> +		fcw->hcin_offset = 0;
>>> +		fcw->hcin_size1 = 0;
>>> +	}
>>> +
>>> +	fcw->itmax = op->ldpc_dec.iter_max;
>>> +	fcw->itstop = check_bit(op->ldpc_dec.op_flags,
>>> +			RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE);
>>> +	fcw->synd_precoder = fcw->itstop;
>>> +	/*
>>> +	 * These are all implicitly set
>>> +	 * fcw->synd_post = 0;
>>> +	 * fcw->so_en = 0;
>>> +	 * fcw->so_bypass_rm = 0;
>>> +	 * fcw->so_bypass_intlv = 0;
>>> +	 * fcw->dec_convllr = 0;
>>> +	 * fcw->hcout_convllr = 0;
>>> +	 * fcw->hcout_size1 = 0;
>>> +	 * fcw->so_it = 0;
>>> +	 * fcw->hcout_offset = 0;
>>> +	 * fcw->negstop_th = 0;
>>> +	 * fcw->negstop_it = 0;
>>> +	 * fcw->negstop_en = 0;
>>> +	 * fcw->gain_i = 1;
>>> +	 * fcw->gain_h = 1;
>>> +	 */
>>> +	if (fcw->hcout_en > 0) {
>>> +		parity_offset = (op->ldpc_dec.basegraph == 1 ? 20 : 8)
>>> +			* op->ldpc_dec.z_c - op->ldpc_dec.n_filler;
>>> +		k0_p = (fcw->k0 > parity_offset) ?
>>> +				fcw->k0 - op->ldpc_dec.n_filler : fcw->k0;
>>> +		ncb_p = fcw->ncb - op->ldpc_dec.n_filler;
>>> +		l = k0_p + fcw->rm_e;
>>> +		harq_out_length = (uint16_t) fcw->hcin_size0;
>>> +		harq_out_length = RTE_MIN(RTE_MAX(harq_out_length, l),
>> ncb_p);
>>> +		harq_out_length = (harq_out_length + 0x3F) & 0xFFC0;
>>> +		if ((k0_p > fcw->hcin_size0 +
>> ACC100_HARQ_OFFSET_THRESHOLD) &&
>>> +				harq_prun) {
>>> +			fcw->hcout_size0 = (uint16_t) fcw->hcin_size0;
>>> +			fcw->hcout_offset = k0_p & 0xFFC0;
>>> +			fcw->hcout_size1 = harq_out_length - fcw-
>>> hcout_offset;
>>> +		} else {
>>> +			fcw->hcout_size0 = harq_out_length;
>>> +			fcw->hcout_size1 = 0;
>>> +			fcw->hcout_offset = 0;
>>> +		}
>>> +		harq_layout[harq_index].offset = fcw->hcout_offset;
>>> +		harq_layout[harq_index].size0 = fcw->hcout_size0;
>>> +	} else {
>>> +		fcw->hcout_size0 = 0;
>>> +		fcw->hcout_size1 = 0;
>>> +		fcw->hcout_offset = 0;
>>> +	}
>>> +}
>>> +
>>> +/**
>>> + * Fills descriptor with data pointers of one block type.
>>> + *
>>> + * @param desc
>>> + *   Pointer to DMA descriptor.
>>> + * @param input
>>> + *   Pointer to pointer to input data which will be encoded. It can be
>> changed
>>> + *   and points to next segment in scatter-gather case.
>>> + * @param offset
>>> + *   Input offset in rte_mbuf structure. It is used for calculating the point
>>> + *   where data is starting.
>>> + * @param cb_len
>>> + *   Length of currently processed Code Block
>>> + * @param seg_total_left
>>> + *   It indicates how many bytes still left in segment (mbuf) for further
>>> + *   processing.
>>> + * @param op_flags
>>> + *   Store information about device capabilities
>>> + * @param next_triplet
>>> + *   Index for ACC100 DMA Descriptor triplet
>>> + *
>>> + * @return
>>> + *   Returns index of next triplet on success, other value if lengths of
>>> + *   pkt and processed cb do not match.
>>> + *
>>> + */
>>> +static inline int
>>> +acc100_dma_fill_blk_type_in(struct acc100_dma_req_desc *desc,
>>> +		struct rte_mbuf **input, uint32_t *offset, uint32_t cb_len,
>>> +		uint32_t *seg_total_left, int next_triplet)
>>> +{
>>> +	uint32_t part_len;
>>> +	struct rte_mbuf *m = *input;
>>> +
>>> +	part_len = (*seg_total_left < cb_len) ? *seg_total_left : cb_len;
>>> +	cb_len -= part_len;
>>> +	*seg_total_left -= part_len;
>>> +
>>> +	desc->data_ptrs[next_triplet].address =
>>> +			rte_pktmbuf_iova_offset(m, *offset);
>>> +	desc->data_ptrs[next_triplet].blen = part_len;
>>> +	desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
>>> +	desc->data_ptrs[next_triplet].last = 0;
>>> +	desc->data_ptrs[next_triplet].dma_ext = 0;
>>> +	*offset += part_len;
>>> +	next_triplet++;
>>> +
>>> +	while (cb_len > 0) {
>> Since cb_len is unsigned, a better check would be
>>
>> while (cb_len != 0)
> Why would this be better?

It is unsigned, so it will never be < 0.
!= 0 reflects that.
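
As a tiny self-contained illustration (not PMD code): the two predicates
are identical for unsigned types, and an overshoot wraps instead of
going negative:

```c
#include <stdint.h>
#include <stdbool.h>

/* "cb_len > 0" and "cb_len != 0" are the same test for a uint32_t;
 * "!= 0" simply documents that negative values are impossible. */
static bool same_predicate(uint32_t cb_len)
{
	return (cb_len > 0) == (cb_len != 0);
}

/* If a subtraction ever overshoots, the value wraps to a huge number
 * rather than going negative, so a "< 0" mental model is misleading. */
static uint32_t wrapped(void)
{
	uint32_t cb_len = 0;
	cb_len -= 1;
	return cb_len;	/* UINT32_MAX */
}
```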

>
>>> +		if (next_triplet < ACC100_DMA_MAX_NUM_POINTERS &&
>>> +				m->next != NULL) {
>>> +
>>> +			m = m->next;
>>> +			*seg_total_left = rte_pktmbuf_data_len(m);
>>> +			part_len = (*seg_total_left < cb_len) ?
>>> +					*seg_total_left :
>>> +					cb_len;
>>> +			desc->data_ptrs[next_triplet].address =
>>> +					rte_pktmbuf_iova_offset(m, 0);
>>> +			desc->data_ptrs[next_triplet].blen = part_len;
>>> +			desc->data_ptrs[next_triplet].blkid =
>>> +					ACC100_DMA_BLKID_IN;
>>> +			desc->data_ptrs[next_triplet].last = 0;
>>> +			desc->data_ptrs[next_triplet].dma_ext = 0;
>>> +			cb_len -= part_len;
>>> +			*seg_total_left -= part_len;
>> when *seg_total_left goes to zero here, there will be a lot of iterations doing
>> nothing.
>>
>> should stop early.
> Not really, it would pick next m anyway and keep adding buffer descriptor pointer.
ok
>  
>
>>> +			/* Initializing offset for next segment (mbuf) */
>>> +			*offset = part_len;
>>> +			next_triplet++;
>>> +		} else {
>>> +			rte_bbdev_log(ERR,
>>> +				"Some data still left for processing: "
>>> +				"data_left: %u, next_triplet: %u, next_mbuf:
>> %p",
>>> +				cb_len, next_triplet, m->next);
>>> +			return -EINVAL;
>>> +		}
>>> +	}
>>> +	/* Storing new mbuf as it could be changed in scatter-gather case*/
>>> +	*input = m;
>>> +
>>> +	return next_triplet;
>> callers, after checking, decrement the return.
>>
>> Maybe change return to next_triplet-- and save the callers from doing it.
> I miss your point

Looking at how the callers of this function use the return, a fair
number decrement it to get to the current_triplet.  So maybe returning
the current_triplet would be better.

Something to think about, not required.
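
A stubbed sketch of that alternative (illustration only, hypothetical
names, not the PMD code):

```c
#include <stdint.h>

#define MAX_PTRS 4	/* stand-in for ACC100_DMA_MAX_NUM_POINTERS */

struct triplet { uint64_t address; int last; };

/* Variant of the suggestion: return the index of the entry just
 * written (the "current" triplet) rather than the next free slot,
 * so callers set .last without the recurring "- 1". */
static int
fill_one(struct triplet *t, int next, uint64_t addr)
{
	t[next].address = addr;
	t[next].last = 0;
	return next;	/* current triplet; caller uses t[ret] directly */
}
```

The caller then writes `t[cur].last = 1;` instead of
`t[next_triplet - 1].last = 1;`.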

>>> +}
>>> +
>>> +/* Fills descriptor with data pointers of one block type.
>>> + * Returns index of next triplet on success, other value if lengths of
>>> + * output data and processed mbuf do not match.
>>> + */
>>> +static inline int
>>> +acc100_dma_fill_blk_type_out(struct acc100_dma_req_desc *desc,
>>> +		struct rte_mbuf *output, uint32_t out_offset,
>>> +		uint32_t output_len, int next_triplet, int blk_id)
>>> +{
>>> +	desc->data_ptrs[next_triplet].address =
>>> +			rte_pktmbuf_iova_offset(output, out_offset);
>>> +	desc->data_ptrs[next_triplet].blen = output_len;
>>> +	desc->data_ptrs[next_triplet].blkid = blk_id;
>>> +	desc->data_ptrs[next_triplet].last = 0;
>>> +	desc->data_ptrs[next_triplet].dma_ext = 0;
>>> +	next_triplet++;
>> Callers check return is < 0, like above but there is no similar logic to
>>
>> check the bounds of next_triplet to return -EINVAL
>>
>> so add this check here or remove the is < 0 checks by the callers.
>>
> fair enough thanks. 
>
>>> +
>>> +	return next_triplet;
>>> +}
>>> +
>>> +static inline int
>>> +acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
>>> +		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
>>> +		struct rte_mbuf *output, uint32_t *in_offset,
>>> +		uint32_t *out_offset, uint32_t *out_length,
>>> +		uint32_t *mbuf_total_left, uint32_t *seg_total_left)
>>> +{
>>> +	int next_triplet = 1; /* FCW already done */
>>> +	uint16_t K, in_length_in_bits, in_length_in_bytes;
>>> +	struct rte_bbdev_op_ldpc_enc *enc = &op->ldpc_enc;
>>> +
>>> +	desc->word0 = ACC100_DMA_DESC_TYPE;
>>> +	desc->word1 = 0; /**< Timestamp could be disabled */
>>> +	desc->word2 = 0;
>>> +	desc->word3 = 0;
>>> +	desc->numCBs = 1;
>>> +
>>> +	K = (enc->basegraph == 1 ? 22 : 10) * enc->z_c;
>>> +	in_length_in_bits = K - enc->n_filler;
>> can this overflow ? enc->n_filler > K ?
> I would not add such checks in the time-critical function. For valid scenarios it can't.
> It could be added to the validate_ldpc_dec_op() which is only run in debug mode.
>
>>> +	if ((enc->op_flags & RTE_BBDEV_LDPC_CRC_24A_ATTACH) ||
>>> +			(enc->op_flags &
>> RTE_BBDEV_LDPC_CRC_24B_ATTACH))
>>> +		in_length_in_bits -= 24;
>>> +	in_length_in_bytes = in_length_in_bits >> 3;
>>> +
>>> +	if (unlikely((*mbuf_total_left == 0) ||
>> This check is covered by the next and can be removed.
> Not necessarily, would keep as is. 
only if in_length_in_bytes was negative
>
>>> +			(*mbuf_total_left < in_length_in_bytes))) {
>>> +		rte_bbdev_log(ERR,
>>> +				"Mismatch between mbuf length and
>> included CB sizes: mbuf len %u, cb len %u",
>>> +				*mbuf_total_left, in_length_in_bytes);
>>> +		return -1;
>>> +	}
>>> +
>>> +	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
>>> +			in_length_in_bytes,
>>> +			seg_total_left, next_triplet);
>>> +	if (unlikely(next_triplet < 0)) {
>>> +		rte_bbdev_log(ERR,
>>> +				"Mismatch between data to process and
>> mbuf data length in bbdev_op: %p",
>>> +				op);
>>> +		return -1;
>>> +	}
>>> +	desc->data_ptrs[next_triplet - 1].last = 1;
>>> +	desc->m2dlen = next_triplet;
>>> +	*mbuf_total_left -= in_length_in_bytes;
>> Updating output pointers should be deferred until the the call is known to
>> be successful.
>>
>> Otherwise caller is left in a bad, unknown state.
> We already had to touch them by that point.
ugh.
>
>>> +
>>> +	/* Set output length */
>>> +	/* Integer round up division by 8 */
>>> +	*out_length = (enc->cb_params.e + 7) >> 3;
>>> +
>>> +	next_triplet = acc100_dma_fill_blk_type_out(desc, output,
>> *out_offset,
>>> +			*out_length, next_triplet,
>> ACC100_DMA_BLKID_OUT_ENC);
>>> +	if (unlikely(next_triplet < 0)) {
>>> +		rte_bbdev_log(ERR,
>>> +				"Mismatch between data to process and
>> mbuf data length in bbdev_op: %p",
>>> +				op);
>>> +		return -1;
>>> +	}
>>> +	op->ldpc_enc.output.length += *out_length;
>>> +	*out_offset += *out_length;
>>> +	desc->data_ptrs[next_triplet - 1].last = 1;
>>> +	desc->data_ptrs[next_triplet - 1].dma_ext = 0;
>>> +	desc->d2mlen = next_triplet - desc->m2dlen;
>>> +
>>> +	desc->op_addr = op;
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static inline int
>>> +acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
>>> +		struct acc100_dma_req_desc *desc,
>>> +		struct rte_mbuf **input, struct rte_mbuf *h_output,
>>> +		uint32_t *in_offset, uint32_t *h_out_offset,
>>> +		uint32_t *h_out_length, uint32_t *mbuf_total_left,
>>> +		uint32_t *seg_total_left,
>>> +		struct acc100_fcw_ld *fcw)
>>> +{
>>> +	struct rte_bbdev_op_ldpc_dec *dec = &op->ldpc_dec;
>>> +	int next_triplet = 1; /* FCW already done */
>>> +	uint32_t input_length;
>>> +	uint16_t output_length, crc24_overlap = 0;
>>> +	uint16_t sys_cols, K, h_p_size, h_np_size;
>>> +	bool h_comp = check_bit(dec->op_flags,
>>> +			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
>>> +
>>> +	desc->word0 = ACC100_DMA_DESC_TYPE;
>>> +	desc->word1 = 0; /**< Timestamp could be disabled */
>>> +	desc->word2 = 0;
>>> +	desc->word3 = 0;
>>> +	desc->numCBs = 1;
>> This seems to be a common setup logic, maybe use a macro or inline
>> function.
> fair enough
>
>>> +
>>> +	if (check_bit(op->ldpc_dec.op_flags,
>>> +			RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP))
>>> +		crc24_overlap = 24;
>>> +
>>> +	/* Compute some LDPC BG lengths */
>>> +	input_length = dec->cb_params.e;
>>> +	if (check_bit(op->ldpc_dec.op_flags,
>>> +			RTE_BBDEV_LDPC_LLR_COMPRESSION))
>>> +		input_length = (input_length * 3 + 3) / 4;
>>> +	sys_cols = (dec->basegraph == 1) ? 22 : 10;
>>> +	K = sys_cols * dec->z_c;
>>> +	output_length = K - dec->n_filler - crc24_overlap;
>>> +
>>> +	if (unlikely((*mbuf_total_left == 0) ||
>> similar to above, this check can be removed.
> same comment
>
>>> +			(*mbuf_total_left < input_length))) {
>>> +		rte_bbdev_log(ERR,
>>> +				"Mismatch between mbuf length and
>> included CB sizes: mbuf len %u, cb len %u",
>>> +				*mbuf_total_left, input_length);
>>> +		return -1;
>>> +	}
>>> +
>>> +	next_triplet = acc100_dma_fill_blk_type_in(desc, input,
>>> +			in_offset, input_length,
>>> +			seg_total_left, next_triplet);
>>> +
>>> +	if (unlikely(next_triplet < 0)) {
>>> +		rte_bbdev_log(ERR,
>>> +				"Mismatch between data to process and
>> mbuf data length in bbdev_op: %p",
>>> +				op);
>>> +		return -1;
>>> +	}
>>> +
>>> +	if (check_bit(op->ldpc_dec.op_flags,
>>> +
>> 	RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
>>> +		h_p_size = fcw->hcin_size0 + fcw->hcin_size1;
>>> +		if (h_comp)
>>> +			h_p_size = (h_p_size * 3 + 3) / 4;
>>> +		desc->data_ptrs[next_triplet].address =
>>> +				dec->harq_combined_input.offset;
>>> +		desc->data_ptrs[next_triplet].blen = h_p_size;
>>> +		desc->data_ptrs[next_triplet].blkid =
>> ACC100_DMA_BLKID_IN_HARQ;
>>> +		desc->data_ptrs[next_triplet].dma_ext = 1;
>>> +#ifndef ACC100_EXT_MEM
>>> +		acc100_dma_fill_blk_type_out(
>>> +				desc,
>>> +				op->ldpc_dec.harq_combined_input.data,
>>> +				op->ldpc_dec.harq_combined_input.offset,
>>> +				h_p_size,
>>> +				next_triplet,
>>> +				ACC100_DMA_BLKID_IN_HARQ);
>>> +#endif
>>> +		next_triplet++;
>>> +	}
>>> +
>>> +	desc->data_ptrs[next_triplet - 1].last = 1;
>>> +	desc->m2dlen = next_triplet;
>>> +	*mbuf_total_left -= input_length;
>>> +
>>> +	next_triplet = acc100_dma_fill_blk_type_out(desc, h_output,
>>> +			*h_out_offset, output_length >> 3, next_triplet,
>>> +			ACC100_DMA_BLKID_OUT_HARD);
>>> +	if (unlikely(next_triplet < 0)) {
>>> +		rte_bbdev_log(ERR,
>>> +				"Mismatch between data to process and
>> mbuf data length in bbdev_op: %p",
>>> +				op);
>>> +		return -1;
>>> +	}
>>> +
>>> +	if (check_bit(op->ldpc_dec.op_flags,
>>> +
>> 	RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
>>> +		/* Pruned size of the HARQ */
>>> +		h_p_size = fcw->hcout_size0 + fcw->hcout_size1;
>>> +		/* Non-Pruned size of the HARQ */
>>> +		h_np_size = fcw->hcout_offset > 0 ?
>>> +				fcw->hcout_offset + fcw->hcout_size1 :
>>> +				h_p_size;
>>> +		if (h_comp) {
>>> +			h_np_size = (h_np_size * 3 + 3) / 4;
>>> +			h_p_size = (h_p_size * 3 + 3) / 4;
>> * 4 -1 ) / 4
>>
>> may produce better assembly.
> that is not the same arithmetic
?
>>> +		}
>>> +		dec->harq_combined_output.length = h_np_size;
>>> +		desc->data_ptrs[next_triplet].address =
>>> +				dec->harq_combined_output.offset;
>>> +		desc->data_ptrs[next_triplet].blen = h_p_size;
>>> +		desc->data_ptrs[next_triplet].blkid =
>> ACC100_DMA_BLKID_OUT_HARQ;
>>> +		desc->data_ptrs[next_triplet].dma_ext = 1;
>>> +#ifndef ACC100_EXT_MEM
>>> +		acc100_dma_fill_blk_type_out(
>>> +				desc,
>>> +				dec->harq_combined_output.data,
>>> +				dec->harq_combined_output.offset,
>>> +				h_p_size,
>>> +				next_triplet,
>>> +				ACC100_DMA_BLKID_OUT_HARQ);
>>> +#endif
>>> +		next_triplet++;
>>> +	}
>>> +
>>> +	*h_out_length = output_length >> 3;
>>> +	dec->hard_output.length += *h_out_length;
>>> +	*h_out_offset += *h_out_length;
>>> +	desc->data_ptrs[next_triplet - 1].last = 1;
>>> +	desc->d2mlen = next_triplet - desc->m2dlen;
>>> +
>>> +	desc->op_addr = op;
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static inline void
>>> +acc100_dma_desc_ld_update(struct rte_bbdev_dec_op *op,
>>> +		struct acc100_dma_req_desc *desc,
>>> +		struct rte_mbuf *input, struct rte_mbuf *h_output,
>>> +		uint32_t *in_offset, uint32_t *h_out_offset,
>>> +		uint32_t *h_out_length,
>>> +		union acc100_harq_layout_data *harq_layout)
>>> +{
>>> +	int next_triplet = 1; /* FCW already done */
>>> +	desc->data_ptrs[next_triplet].address =
>>> +			rte_pktmbuf_iova_offset(input, *in_offset);
>>> +	next_triplet++;
>> No overflow checks on next_triplet
>>
>> This is a general problem.
> I don't see the overflow risk.

A lot of places increment without checking the bounds.

To me, it seems like we are getting lucky that data_ptrs[] is big enough.
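
For illustration, a self-contained sketch of the kind of guard that
would remove the luck (stubbed types and a stand-in error value, not
the actual PMD code):

```c
#include <stdint.h>

#define ACC100_DMA_MAX_NUM_POINTERS 14	/* stand-in array size */
#define EINVAL_NEG (-22)		/* stand-in for -EINVAL */

struct triplet { uint64_t address; uint32_t blen; };
struct desc { struct triplet data_ptrs[ACC100_DMA_MAX_NUM_POINTERS]; };

/* Bounds-check before every write so an over-long scatter-gather list
 * fails with an error instead of scribbling past data_ptrs[]. */
static int
fill_blk(struct desc *d, int next_triplet, uint64_t addr, uint32_t len)
{
	if (next_triplet < 0 || next_triplet >= ACC100_DMA_MAX_NUM_POINTERS)
		return EINVAL_NEG;
	d->data_ptrs[next_triplet].address = addr;
	d->data_ptrs[next_triplet].blen = len;
	return next_triplet + 1;
}
```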

>
>>> +
>>> +	if (check_bit(op->ldpc_dec.op_flags,
>>> +
>> 	RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
>>> +		struct rte_bbdev_op_data hi = op-
>>> ldpc_dec.harq_combined_input;
>>> +		desc->data_ptrs[next_triplet].address = hi.offset;
>>> +#ifndef ACC100_EXT_MEM
>>> +		desc->data_ptrs[next_triplet].address =
>>> +				rte_pktmbuf_iova_offset(hi.data, hi.offset);
>>> +#endif
>>> +		next_triplet++;
>>> +	}
>>> +
>>> +	desc->data_ptrs[next_triplet].address =
>>> +			rte_pktmbuf_iova_offset(h_output, *h_out_offset);
>>> +	*h_out_length = desc->data_ptrs[next_triplet].blen;
>>> +	next_triplet++;
>>> +
>>> +	if (check_bit(op->ldpc_dec.op_flags,
>>> +
>> 	RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
>>> +		desc->data_ptrs[next_triplet].address =
>>> +				op->ldpc_dec.harq_combined_output.offset;
>>> +		/* Adjust based on previous operation */
>>> +		struct rte_bbdev_dec_op *prev_op = desc->op_addr;
>>> +		op->ldpc_dec.harq_combined_output.length =
>>> +				prev_op-
>>> ldpc_dec.harq_combined_output.length;
>>> +		int16_t hq_idx = op-
>>> ldpc_dec.harq_combined_output.offset /
>>> +				ACC100_HARQ_OFFSET;
>>> +		int16_t prev_hq_idx =
>>> +				prev_op-
>>> ldpc_dec.harq_combined_output.offset
>>> +				/ ACC100_HARQ_OFFSET;
>>> +		harq_layout[hq_idx].val = harq_layout[prev_hq_idx].val;
>>> +#ifndef ACC100_EXT_MEM
>>> +		struct rte_bbdev_op_data ho =
>>> +				op->ldpc_dec.harq_combined_output;
>>> +		desc->data_ptrs[next_triplet].address =
>>> +				rte_pktmbuf_iova_offset(ho.data, ho.offset);
>>> +#endif
>>> +		next_triplet++;
>>> +	}
>>> +
>>> +	op->ldpc_dec.hard_output.length += *h_out_length;
>>> +	desc->op_addr = op;
>>> +}
>>> +
>>> +
>>> +/* Enqueue a number of operations to HW and update software rings */
>>> +static inline void
>>> +acc100_dma_enqueue(struct acc100_queue *q, uint16_t n,
>>> +		struct rte_bbdev_stats *queue_stats)
>>> +{
>>> +	union acc100_enqueue_reg_fmt enq_req;
>>> +#ifdef RTE_BBDEV_OFFLOAD_COST
>>> +	uint64_t start_time = 0;
>>> +	queue_stats->acc_offload_cycles = 0;
>>> +	RTE_SET_USED(queue_stats);
>>> +#else
>>> +	RTE_SET_USED(queue_stats);
>>> +#endif
>> RTE_SET_UNUSED(... is common in the #ifdef/#else
>>
>> so it should be moved out.
> ok
>
>>> +
>>> +	enq_req.val = 0;
>>> +	/* Setting offset, 100b for 256 DMA Desc */
>>> +	enq_req.addr_offset = ACC100_DESC_OFFSET;
>>> +
>> should n != 0 be checked here ?
> This is all checked before that point. 
ok
>
>>> +	/* Split ops into batches */
>>> +	do {
>>> +		union acc100_dma_desc *desc;
>>> +		uint16_t enq_batch_size;
>>> +		uint64_t offset;
>>> +		rte_iova_t req_elem_addr;
>>> +
>>> +		enq_batch_size = RTE_MIN(n, MAX_ENQ_BATCH_SIZE);
>>> +
>>> +		/* Set flag on last descriptor in a batch */
>>> +		desc = q->ring_addr + ((q->sw_ring_head + enq_batch_size - 1) &
>>> +				q->sw_ring_wrap_mask);
>>> +		desc->req.last_desc_in_batch = 1;
>>> +
>>> +		/* Calculate the 1st descriptor's address */
>>> +		offset = ((q->sw_ring_head & q->sw_ring_wrap_mask) *
>>> +				sizeof(union acc100_dma_desc));
>>> +		req_elem_addr = q->ring_addr_phys + offset;
>>> +
>>> +		/* Fill enqueue struct */
>>> +		enq_req.num_elem = enq_batch_size;
>>> +		/* low 6 bits are not needed */
>>> +		enq_req.req_elem_addr = (uint32_t)(req_elem_addr >> 6);
>>> +
>>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
>>> +		rte_memdump(stderr, "Req sdone", desc, sizeof(*desc));
>>> +#endif
>>> +		rte_bbdev_log_debug(
>>> +				"Enqueue %u reqs (phys %#"PRIx64") to reg %p",
>>> +				enq_batch_size,
>>> +				req_elem_addr,
>>> +				(void *)q->mmio_reg_enqueue);
>>> +
>>> +		rte_wmb();
>>> +
>>> +#ifdef RTE_BBDEV_OFFLOAD_COST
>>> +		/* Start time measurement for enqueue function offload. */
>>> +		start_time = rte_rdtsc_precise();
>>> +#endif
>>> +		rte_bbdev_log(DEBUG, "Debug : MMIO Enqueue");
>> logging time will be tracked with the mmio_write
>>
>> so logging should be moved above the start_time setting
> Not required. Running with debug traces is expected to make real-time offload measurement irrelevant.

I disagree. If logging has a page fault or writes to disk even sometimes,
there will be a huge spike in the time that would make the accumulated
acc_offload_cycles meaningless. It would be ok if the write time were of
the same order of magnitude as disk access.

>>> +		mmio_write(q->mmio_reg_enqueue, enq_req.val);
>>> +
>>> +#ifdef RTE_BBDEV_OFFLOAD_COST
>>> +		queue_stats->acc_offload_cycles +=
>>> +				rte_rdtsc_precise() - start_time;
>>> +#endif
>>> +
>>> +		q->aq_enqueued++;
>>> +		q->sw_ring_head += enq_batch_size;
>>> +		n -= enq_batch_size;
>>> +
>>> +	} while (n);
>>> +
>>> +
>>> +}
>>> +
>>> +/* Enqueue one encode operations for ACC100 device in CB mode */
>>> +static inline int
>>> +enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
>>> +		uint16_t total_enqueued_cbs, int16_t num)
>>> +{
>>> +	union acc100_dma_desc *desc = NULL;
>>> +	uint32_t out_length;
>>> +	struct rte_mbuf *output_head, *output;
>>> +	int i, next_triplet;
>>> +	uint16_t  in_length_in_bytes;
>>> +	struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
>>> +
>>> +	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
>>> +			& q->sw_ring_wrap_mask);
>>> +	desc = q->ring_addr + desc_idx;
>>> +	acc100_fcw_le_fill(ops[0], &desc->req.fcw_le, num);
>>> +
>>> +	/** This could be done at polling */
>>> +	desc->req.word0 = ACC100_DMA_DESC_TYPE;
>>> +	desc->req.word1 = 0; /**< Timestamp could be disabled */
>>> +	desc->req.word2 = 0;
>>> +	desc->req.word3 = 0;
>>> +	desc->req.numCBs = num;
>>> +
>>> +	in_length_in_bytes = ops[0]->ldpc_enc.input.data->data_len;
>>> +	out_length = (enc->cb_params.e + 7) >> 3;
>>> +	desc->req.m2dlen = 1 + num;
>>> +	desc->req.d2mlen = num;
>>> +	next_triplet = 1;
>>> +
>>> +	for (i = 0; i < num; i++) {
>> i is not needed here, it is next_triplet - 1
> would impact readability as these refer to different concepts (code blocks and bdescs).
> Would keep as is
ok
>
>>> +		desc->req.data_ptrs[next_triplet].address =
>>> +			rte_pktmbuf_iova_offset(ops[i]->ldpc_enc.input.data, 0);
>>> +		desc->req.data_ptrs[next_triplet].blen = in_length_in_bytes;
>>> +		next_triplet++;
>>> +		desc->req.data_ptrs[next_triplet].address =
>>> +				rte_pktmbuf_iova_offset(
>>> +				ops[i]->ldpc_enc.output.data, 0);
>>> +		desc->req.data_ptrs[next_triplet].blen = out_length;
>>> +		next_triplet++;
>>> +		ops[i]->ldpc_enc.output.length = out_length;
>>> +		output_head = output = ops[i]->ldpc_enc.output.data;
>>> +		mbuf_append(output_head, output, out_length);
>>> +		output->data_len = out_length;
>>> +	}
>>> +
>>> +	desc->req.op_addr = ops[0];
>>> +
>>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
>>> +	rte_memdump(stderr, "FCW", &desc->req.fcw_le,
>>> +			sizeof(desc->req.fcw_le) - 8);
>>> +	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
>>> +#endif
>>> +
>>> +	/* One CB (one op) was successfully prepared to enqueue */
>>> +	return num;
>> caller does not use num, only check if < 0
>>
>> So could change to return 0
> would keep as is for debug
ok
>
>>> +}
>>> +
>>> +/* Enqueue one encode operations for ACC100 device in CB mode */
>>> +static inline int
>>> +enqueue_ldpc_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
>>> +		uint16_t total_enqueued_cbs)
>> rte_fpga_5gnr_fec.c has this same function.  It would be good if common
>> functions could be collected and used to stabilize the internal bbdev
>> interface.
>>
>> This is a general issue.
> This is true for some parts of the code, and noted.
> In this very case they are distinct implementations with HW specifics,
> but agreed to look into such refactoring later on.
ok
>
>>> +{
>>> +	union acc100_dma_desc *desc = NULL;
>>> +	int ret;
>>> +	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
>>> +		seg_total_left;
>>> +	struct rte_mbuf *input, *output_head, *output;
>>> +
>>> +	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
>>> +			& q->sw_ring_wrap_mask);
>>> +	desc = q->ring_addr + desc_idx;
>>> +	acc100_fcw_le_fill(op, &desc->req.fcw_le, 1);
>>> +
>>> +	input = op->ldpc_enc.input.data;
>>> +	output_head = output = op->ldpc_enc.output.data;
>>> +	in_offset = op->ldpc_enc.input.offset;
>>> +	out_offset = op->ldpc_enc.output.offset;
>>> +	out_length = 0;
>>> +	mbuf_total_left = op->ldpc_enc.input.length;
>>> +	seg_total_left = rte_pktmbuf_data_len(op->ldpc_enc.input.data)
>>> +			- in_offset;
>>> +
>>> +	ret = acc100_dma_desc_le_fill(op, &desc->req, &input, output,
>>> +			&in_offset, &out_offset, &out_length, &mbuf_total_left,
>>> +			&seg_total_left);
>>> +
>>> +	if (unlikely(ret < 0))
>>> +		return ret;
>>> +
>>> +	mbuf_append(output_head, output, out_length);
>>> +
>>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
>>> +	rte_memdump(stderr, "FCW", &desc->req.fcw_le,
>>> +			sizeof(desc->req.fcw_le) - 8);
>>> +	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
>>> +
>>> +	/* Check if any data left after processing one CB */
>>> +	if (mbuf_total_left != 0) {
>>> +		rte_bbdev_log(ERR,
>>> +				"Some data still left after processing one CB: mbuf_total_left = %u",
>>> +				mbuf_total_left);
>>> +		return -EINVAL;
>>> +	}
>>> +#endif
>>> +	/* One CB (one op) was successfully prepared to enqueue */
>>> +	return 1;
>> Another case where caller only check for < 0
>>
>> Consider changes all similar to return 0 on success.
> same comment as above, would keep as is. 
>
>>> +}
>>> +
>>> +/** Enqueue one decode operations for ACC100 device in CB mode */
>>> +static inline int
>>> +enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
>>> +		uint16_t total_enqueued_cbs, bool same_op)
>>> +{
>>> +	int ret;
>>> +
>>> +	union acc100_dma_desc *desc;
>>> +	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
>>> +			& q->sw_ring_wrap_mask);
>>> +	desc = q->ring_addr + desc_idx;
>>> +	struct rte_mbuf *input, *h_output_head, *h_output;
>>> +	uint32_t in_offset, h_out_offset, mbuf_total_left, h_out_length = 0;
>>> +	input = op->ldpc_dec.input.data;
>>> +	h_output_head = h_output = op->ldpc_dec.hard_output.data;
>>> +	in_offset = op->ldpc_dec.input.offset;
>>> +	h_out_offset = op->ldpc_dec.hard_output.offset;
>>> +	mbuf_total_left = op->ldpc_dec.input.length;
>>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
>>> +	if (unlikely(input == NULL)) {
>>> +		rte_bbdev_log(ERR, "Invalid mbuf pointer");
>>> +		return -EFAULT;
>>> +	}
>>> +#endif
>>> +	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
>>> +
>>> +	if (same_op) {
>>> +		union acc100_dma_desc *prev_desc;
>>> +		desc_idx = ((q->sw_ring_head + total_enqueued_cbs - 1)
>>> +				& q->sw_ring_wrap_mask);
>>> +		prev_desc = q->ring_addr + desc_idx;
>>> +		uint8_t *prev_ptr = (uint8_t *) prev_desc;
>>> +		uint8_t *new_ptr = (uint8_t *) desc;
>>> +		/* Copy first 4 words and BDESCs */
>>> +		rte_memcpy(new_ptr, prev_ptr, 16);
>>> +		rte_memcpy(new_ptr + 36, prev_ptr + 36, 40);
>> These magic numbers should be #defines
> yes
>
>>> +		desc->req.op_addr = prev_desc->req.op_addr;
>>> +		/* Copy FCW */
>>> +		rte_memcpy(new_ptr + ACC100_DESC_FCW_OFFSET,
>>> +				prev_ptr + ACC100_DESC_FCW_OFFSET,
>>> +				ACC100_FCW_LD_BLEN);
>>> +		acc100_dma_desc_ld_update(op, &desc->req, input, h_output,
>>> +				&in_offset, &h_out_offset,
>>> +				&h_out_length, harq_layout);
>>> +	} else {
>>> +		struct acc100_fcw_ld *fcw;
>>> +		uint32_t seg_total_left;
>>> +		fcw = &desc->req.fcw_ld;
>>> +		acc100_fcw_ld_fill(op, fcw, harq_layout);
>>> +
>>> +		/* Special handling when overusing mbuf */
>>> +		if (fcw->rm_e < MAX_E_MBUF)
>>> +			seg_total_left = rte_pktmbuf_data_len(input)
>>> +					- in_offset;
>>> +		else
>>> +			seg_total_left = fcw->rm_e;
>>> +
>>> +		ret = acc100_dma_desc_ld_fill(op, &desc->req, &input, h_output,
>>> +				&in_offset, &h_out_offset,
>>> +				&h_out_length, &mbuf_total_left,
>>> +				&seg_total_left, fcw);
>>> +		if (unlikely(ret < 0))
>>> +			return ret;
>>> +	}
>>> +
>>> +	/* Hard output */
>>> +	mbuf_append(h_output_head, h_output, h_out_length);
>>> +#ifndef ACC100_EXT_MEM
>>> +	if (op->ldpc_dec.harq_combined_output.length > 0) {
>>> +		/* Push the HARQ output into host memory */
>>> +		struct rte_mbuf *hq_output_head, *hq_output;
>>> +		hq_output_head = op->ldpc_dec.harq_combined_output.data;
>>> +		hq_output = op->ldpc_dec.harq_combined_output.data;
>>> +		mbuf_append(hq_output_head, hq_output,
>>> +				op->ldpc_dec.harq_combined_output.length);
>>> +	}
>>> +#endif
>>> +
>>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
>>> +	rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
>>> +			sizeof(desc->req.fcw_ld) - 8);
>>> +	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
>>> +#endif
>>> +
>>> +	/* One CB (one op) was successfully prepared to enqueue */
>>> +	return 1;
>>> +}
>>> +
>>> +
>>> +/* Enqueue one decode operations for ACC100 device in TB mode */
>>> +static inline int
>>> +enqueue_ldpc_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
>>> +		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
>>> +{
>>> +	union acc100_dma_desc *desc = NULL;
>>> +	int ret;
>>> +	uint8_t r, c;
>>> +	uint32_t in_offset, h_out_offset,
>>> +		h_out_length, mbuf_total_left, seg_total_left;
>>> +	struct rte_mbuf *input, *h_output_head, *h_output;
>>> +	uint16_t current_enqueued_cbs = 0;
>>> +
>>> +	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
>>> +			& q->sw_ring_wrap_mask);
>>> +	desc = q->ring_addr + desc_idx;
>>> +	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
>>> +	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
>>> +	acc100_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout);
>>> +
>>> +	input = op->ldpc_dec.input.data;
>>> +	h_output_head = h_output = op->ldpc_dec.hard_output.data;
>>> +	in_offset = op->ldpc_dec.input.offset;
>>> +	h_out_offset = op->ldpc_dec.hard_output.offset;
>>> +	h_out_length = 0;
>>> +	mbuf_total_left = op->ldpc_dec.input.length;
>>> +	c = op->ldpc_dec.tb_params.c;
>>> +	r = op->ldpc_dec.tb_params.r;
>>> +
>>> +	while (mbuf_total_left > 0 && r < c) {
>>> +
>>> +		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
>>> +
>>> +		/* Set up DMA descriptor */
>>> +		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
>>> +				& q->sw_ring_wrap_mask);
>>> +		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
>>> +		desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
>>> +		ret = acc100_dma_desc_ld_fill(op, &desc->req, &input,
>>> +				h_output, &in_offset, &h_out_offset,
>>> +				&h_out_length,
>>> +				&mbuf_total_left, &seg_total_left,
>>> +				&desc->req.fcw_ld);
>>> +
>>> +		if (unlikely(ret < 0))
>>> +			return ret;
>>> +
>>> +		/* Hard output */
>>> +		mbuf_append(h_output_head, h_output, h_out_length);
>>> +
>>> +		/* Set total number of CBs in TB */
>>> +		desc->req.cbs_in_tb = cbs_in_tb;
>>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
>>> +		rte_memdump(stderr, "FCW", &desc->req.fcw_td,
>>> +				sizeof(desc->req.fcw_td) - 8);
>>> +		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
>>> +#endif
>>> +
>>> +		if (seg_total_left == 0) {
>>> +			/* Go to the next mbuf */
>>> +			input = input->next;
>>> +			in_offset = 0;
>>> +			h_output = h_output->next;
>>> +			h_out_offset = 0;
>>> +		}
>>> +		total_enqueued_cbs++;
>>> +		current_enqueued_cbs++;
>>> +		r++;
>>> +	}
>>> +
>>> +	if (unlikely(desc == NULL))
>> How is this possible? desc has been dereferenced already.
> related to static code analysis, arguably a false alarm
>
>>> +		return current_enqueued_cbs;
>>> +
>>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
>>> +	/* Check if any CBs left for processing */
>>> +	if (mbuf_total_left != 0) {
>>> +		rte_bbdev_log(ERR,
>>> +				"Some data still left for processing: mbuf_total_left = %u",
>>> +				mbuf_total_left);
>>> +		return -EINVAL;
>>> +	}
>>> +#endif
>>> +	/* Set SDone on last CB descriptor for TB mode */
>>> +	desc->req.sdone_enable = 1;
>>> +	desc->req.irq_enable = q->irq_enable;
>>> +
>>> +	return current_enqueued_cbs;
>>> +}
>>> +
>>> +
>>> +/* Calculates number of CBs in processed encoder TB based on 'r' and input
>>> + * length.
>>> + */
>>> +static inline uint8_t
>>> +get_num_cbs_in_tb_enc(struct rte_bbdev_op_turbo_enc *turbo_enc)
>>> +{
>>> +	uint8_t c, c_neg, r, crc24_bits = 0;
>>> +	uint16_t k, k_neg, k_pos;
>>> +	uint8_t cbs_in_tb = 0;
>>> +	int32_t length;
>>> +
>>> +	length = turbo_enc->input.length;
>>> +	r = turbo_enc->tb_params.r;
>>> +	c = turbo_enc->tb_params.c;
>>> +	c_neg = turbo_enc->tb_params.c_neg;
>>> +	k_neg = turbo_enc->tb_params.k_neg;
>>> +	k_pos = turbo_enc->tb_params.k_pos;
>>> +	crc24_bits = 0;
>>> +	if (check_bit(turbo_enc->op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH))
>>> +		crc24_bits = 24;
>>> +	while (length > 0 && r < c) {
>>> +		k = (r < c_neg) ? k_neg : k_pos;
>>> +		length -= (k - crc24_bits) >> 3;
>>> +		r++;
>>> +		cbs_in_tb++;
>>> +	}
>>> +
>>> +	return cbs_in_tb;
>>> +}
>>> +
>>> +/* Calculates number of CBs in processed decoder TB based on 'r' and input
>>> + * length.
>>> + */
>>> +static inline uint16_t
>>> +get_num_cbs_in_tb_dec(struct rte_bbdev_op_turbo_dec *turbo_dec)
>>> +{
>>> +	uint8_t c, c_neg, r = 0;
>>> +	uint16_t kw, k, k_neg, k_pos, cbs_in_tb = 0;
>>> +	int32_t length;
>>> +
>>> +	length = turbo_dec->input.length;
>>> +	r = turbo_dec->tb_params.r;
>>> +	c = turbo_dec->tb_params.c;
>>> +	c_neg = turbo_dec->tb_params.c_neg;
>>> +	k_neg = turbo_dec->tb_params.k_neg;
>>> +	k_pos = turbo_dec->tb_params.k_pos;
>>> +	while (length > 0 && r < c) {
>>> +		k = (r < c_neg) ? k_neg : k_pos;
>>> +		kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
>>> +		length -= kw;
>>> +		r++;
>>> +		cbs_in_tb++;
>>> +	}
>>> +
>>> +	return cbs_in_tb;
>>> +}
>>> +
>>> +/* Calculates number of CBs in processed decoder TB based on 'r' and input
>>> + * length.
>>> + */
>>> +static inline uint16_t
>>> +get_num_cbs_in_tb_ldpc_dec(struct rte_bbdev_op_ldpc_dec *ldpc_dec)
>>> +{
>>> +	uint16_t r, cbs_in_tb = 0;
>>> +	int32_t length = ldpc_dec->input.length;
>>> +	r = ldpc_dec->tb_params.r;
>>> +	while (length > 0 && r < ldpc_dec->tb_params.c) {
>>> +		length -=  (r < ldpc_dec->tb_params.cab) ?
>>> +				ldpc_dec->tb_params.ea :
>>> +				ldpc_dec->tb_params.eb;
>>> +		r++;
>>> +		cbs_in_tb++;
>>> +	}
>>> +	return cbs_in_tb;
>>> +}
>>> +
>>> +/* Check we can mux encode operations with common FCW */
>>> +static inline bool
>>> +check_mux(struct rte_bbdev_enc_op **ops, uint16_t num) {
>>> +	uint16_t i;
>>> +	if (num == 1)
>>> +		return false;
>> likely should strengthen check to num <= 1
> no impact, but it doesn't hurt to change, ok.
>
>>> +	for (i = 1; i < num; ++i) {
>>> +		/* Only mux compatible code blocks */
>>> +		if (memcmp((uint8_t *)(&ops[i]->ldpc_enc) + ENC_OFFSET,
>>> +				(uint8_t *)(&ops[0]->ldpc_enc) + ENC_OFFSET,
>> ops[0]->ldpc_enc should be hoisted out of loop as it is invariant.
> compiler takes care of this I believe
hopefully, yes.
>
>>> +				CMP_ENC_SIZE) != 0)
>>> +			return false;
>>> +	}
>>> +	return true;
>>> +}
>>> +
>>> +/** Enqueue encode operations for ACC100 device in CB mode. */
>>> +static inline uint16_t
>>> +acc100_enqueue_ldpc_enc_cb(struct rte_bbdev_queue_data *q_data,
>>> +		struct rte_bbdev_enc_op **ops, uint16_t num)
>>> +{
>>> +	struct acc100_queue *q = q_data->queue_private;
>>> +	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
>>> +	uint16_t i = 0;
>>> +	union acc100_dma_desc *desc;
>>> +	int ret, desc_idx = 0;
>>> +	int16_t enq, left = num;
>>> +
>>> +	while (left > 0) {
>>> +		if (unlikely(avail - 1 < 0))
>>> +			break;
>>> +		avail--;
>>> +		enq = RTE_MIN(left, MUX_5GDL_DESC);
>>> +		if (check_mux(&ops[i], enq)) {
>>> +			ret = enqueue_ldpc_enc_n_op_cb(q, &ops[i],
>>> +					desc_idx, enq);
>>> +			if (ret < 0)
>>> +				break;
>>> +			i += enq;
>>> +		} else {
>>> +			ret = enqueue_ldpc_enc_one_op_cb(q, ops[i], desc_idx);
>>> +			if (ret < 0)
>>> +				break;
>> failure is not handled well, what happens if this is one of several?
> the aim is to flag the error and move on 
>
>
>>> +			i++;
>>> +		}
>>> +		desc_idx++;
>>> +		left = num - i;
>>> +	}
>>> +
>>> +	if (unlikely(i == 0))
>>> +		return 0; /* Nothing to enqueue */
>> this does not look correct for all cases
> I miss your point

I was thinking this was an error handler and needed beefing up.


>>> +
>>> +	/* Set SDone in last CB in enqueued ops for CB mode*/
>>> +	desc = q->ring_addr + ((q->sw_ring_head + desc_idx - 1)
>>> +			& q->sw_ring_wrap_mask);
>>> +	desc->req.sdone_enable = 1;
>>> +	desc->req.irq_enable = q->irq_enable;
>>> +
>>> +	acc100_dma_enqueue(q, desc_idx, &q_data->queue_stats);
>>> +
>>> +	/* Update stats */
>>> +	q_data->queue_stats.enqueued_count += i;
>>> +	q_data->queue_stats.enqueue_err_count += num - i;
>>> +
>>> +	return i;
>>> +}
>>> +
>>> +/* Enqueue encode operations for ACC100 device. */
>>> +static uint16_t
>>> +acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
>>> +		struct rte_bbdev_enc_op **ops, uint16_t num)
>>> +{
>>> +	if (unlikely(num == 0))
>>> +		return 0;
>> Handling num == 0 should be in acc100_enqueue_ldpc_enc_cb
> Why would it be better not to catch this early, at the user API call?
ok, because it was 'static' i was unsure
>
>>> +	return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
>>> +}
>>> +
>>> +/* Check we can mux encode operations with common FCW */
>>> +static inline bool
>>> +cmp_ldpc_dec_op(struct rte_bbdev_dec_op **ops) {
>>> +	/* Only mux compatible code blocks */
>>> +	if (memcmp((uint8_t *)(&ops[0]->ldpc_dec) + DEC_OFFSET,
>>> +			(uint8_t *)(&ops[1]->ldpc_dec) +
>>> +			DEC_OFFSET, CMP_DEC_SIZE) != 0) {
>>> +		return false;
>>> +	} else
>> do not need the else, there are no other statements.
> debatable. Not considering change except if that becomes a DPDK
> coding guideline. 
fine.
>>> +		return true;
>>> +}
>>> +
>>> +
>>> +/* Enqueue decode operations for ACC100 device in TB mode */
>>> +static uint16_t
>>> +acc100_enqueue_ldpc_dec_tb(struct rte_bbdev_queue_data *q_data,
>>> +		struct rte_bbdev_dec_op **ops, uint16_t num)
>>> +{
>>> +	struct acc100_queue *q = q_data->queue_private;
>>> +	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
>>> +	uint16_t i, enqueued_cbs = 0;
>>> +	uint8_t cbs_in_tb;
>>> +	int ret;
>>> +
>>> +	for (i = 0; i < num; ++i) {
>>> +		cbs_in_tb = get_num_cbs_in_tb_ldpc_dec(&ops[i]->ldpc_dec);
>>> +		/* Check if there is space available for further processing */
>>> +		if (unlikely(avail - cbs_in_tb < 0))
>>> +			break;
>>> +		avail -= cbs_in_tb;
>>> +
>>> +		ret = enqueue_ldpc_dec_one_op_tb(q, ops[i],
>>> +				enqueued_cbs, cbs_in_tb);
>>> +		if (ret < 0)
>>> +			break;
>>> +		enqueued_cbs += ret;
>>> +	}
>>> +
>>> +	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
>>> +
>>> +	/* Update stats */
>>> +	q_data->queue_stats.enqueued_count += i;
>>> +	q_data->queue_stats.enqueue_err_count += num - i;
>>> +	return i;
>>> +}
>>> +
>>> +/* Enqueue decode operations for ACC100 device in CB mode */
>>> +static uint16_t
>>> +acc100_enqueue_ldpc_dec_cb(struct rte_bbdev_queue_data *q_data,
>>> +		struct rte_bbdev_dec_op **ops, uint16_t num)
>>> +{
>>> +	struct acc100_queue *q = q_data->queue_private;
>>> +	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
>>> +	uint16_t i;
>>> +	union acc100_dma_desc *desc;
>>> +	int ret;
>>> +	bool same_op = false;
>>> +	for (i = 0; i < num; ++i) {
>>> +		/* Check if there is space available for further processing */
>>> +		if (unlikely(avail - 1 < 0))
>> change to (avail < 1)
>>
>> Generally.
> ok
>
>>> +			break;
>>> +		avail -= 1;
>>> +
>>> +		if (i > 0)
>>> +			same_op = cmp_ldpc_dec_op(&ops[i-1]);
>>> +		rte_bbdev_log(INFO, "Op %d %d %d %d %d %d %d %d %d %d %d %d\n",
>>> +			i, ops[i]->ldpc_dec.op_flags, ops[i]->ldpc_dec.rv_index,
>>> +			ops[i]->ldpc_dec.iter_max, ops[i]->ldpc_dec.iter_count,
>>> +			ops[i]->ldpc_dec.basegraph, ops[i]->ldpc_dec.z_c,
>>> +			ops[i]->ldpc_dec.n_cb, ops[i]->ldpc_dec.q_m,
>>> +			ops[i]->ldpc_dec.n_filler, ops[i]->ldpc_dec.cb_params.e,
>>> +			same_op);
>>> +		ret = enqueue_ldpc_dec_one_op_cb(q, ops[i], i, same_op);
>>> +		if (ret < 0)
>>> +			break;
>>> +	}
>>> +
>>> +	if (unlikely(i == 0))
>>> +		return 0; /* Nothing to enqueue */
>>> +
>>> +	/* Set SDone in last CB in enqueued ops for CB mode*/
>>> +	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
>>> +			& q->sw_ring_wrap_mask);
>>> +
>>> +	desc->req.sdone_enable = 1;
>>> +	desc->req.irq_enable = q->irq_enable;
>>> +
>>> +	acc100_dma_enqueue(q, i, &q_data->queue_stats);
>>> +
>>> +	/* Update stats */
>>> +	q_data->queue_stats.enqueued_count += i;
>>> +	q_data->queue_stats.enqueue_err_count += num - i;
>>> +	return i;
>>> +}
>>> +
>>> +/* Enqueue decode operations for ACC100 device. */
>>> +static uint16_t
>>> +acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
>>> +		struct rte_bbdev_dec_op **ops, uint16_t num)
>>> +{
>>> +	struct acc100_queue *q = q_data->queue_private;
>>> +	int32_t aq_avail = q->aq_depth +
>>> +			(q->aq_dequeued - q->aq_enqueued) / 128;
>>> +
>>> +	if (unlikely((aq_avail == 0) || (num == 0)))
>>> +		return 0;
>>> +
>>> +	if (ops[0]->ldpc_dec.code_block_mode == 0)
>>> +		return acc100_enqueue_ldpc_dec_tb(q_data, ops, num);
>>> +	else
>>> +		return acc100_enqueue_ldpc_dec_cb(q_data, ops, num);
>>> +}
>>> +
>>> +
>>> +/* Dequeue one encode operations from ACC100 device in CB mode */
>>> +static inline int
>>> +dequeue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
>>> +		uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
>>> +{
>>> +	union acc100_dma_desc *desc, atom_desc;
>>> +	union acc100_dma_rsp_desc rsp;
>>> +	struct rte_bbdev_enc_op *op;
>>> +	int i;
>>> +
>>> +	desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
>>> +			& q->sw_ring_wrap_mask);
>>> +	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
>>> +			__ATOMIC_RELAXED);
>>> +
>>> +	/* Check fdone bit */
>>> +	if (!(atom_desc.rsp.val & ACC100_FDONE))
>>> +		return -1;
>>> +
>>> +	rsp.val = atom_desc.rsp.val;
>>> +	rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
>>> +
>>> +	/* Dequeue */
>>> +	op = desc->req.op_addr;
>>> +
>>> +	/* Clearing status, it will be set based on response */
>>> +	op->status = 0;
>>> +
>>> +	op->status |= ((rsp.input_err)
>>> +			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
>> can remove the = 0, if |= is changed to =
> yes in principle, but easy to break by mistake, so would keep. 
ok
>>> +	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
>>> +	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
>>> +
>>> +	if (desc->req.last_desc_in_batch) {
>>> +		(*aq_dequeued)++;
>>> +		desc->req.last_desc_in_batch = 0;
>>> +	}
>>> +	desc->rsp.val = ACC100_DMA_DESC_TYPE;
>>> +	desc->rsp.add_info_0 = 0; /*Reserved bits */
>>> +	desc->rsp.add_info_1 = 0; /*Reserved bits */
>>> +
>>> +	/* Flag that the muxing cause loss of opaque data */
>>> +	op->opaque_data = (void *)-1;
>> as a ptr, shouldn't opaque_data be poisoned with '0' ?
> more obvious this way I think. 

the idiom (ptr == NULL) would need to be changed.

as a non-standard poison, it is likely that someone will trip over this.

>>> +	for (i = 0 ; i < desc->req.numCBs; i++)
>>> +		ref_op[i] = op;
>>> +
>>> +	/* One CB (op) was successfully dequeued */
>>> +	return desc->req.numCBs;
>>> +}
>>> +
>>> +/* Dequeue one encode operations from ACC100 device in TB mode */
>>> +static inline int
>>> +dequeue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
>>> +		uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
>>> +{
>>> +	union acc100_dma_desc *desc, *last_desc, atom_desc;
>>> +	union acc100_dma_rsp_desc rsp;
>>> +	struct rte_bbdev_enc_op *op;
>>> +	uint8_t i = 0;
>>> +	uint16_t current_dequeued_cbs = 0, cbs_in_tb;
>>> +
>>> +	desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
>>> +			& q->sw_ring_wrap_mask);
>>> +	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
>>> +			__ATOMIC_RELAXED);
>>> +
>>> +	/* Check fdone bit */
>>> +	if (!(atom_desc.rsp.val & ACC100_FDONE))
>>> +		return -1;
>>> +
>>> +	/* Get number of CBs in dequeued TB */
>>> +	cbs_in_tb = desc->req.cbs_in_tb;
>>> +	/* Get last CB */
>>> +	last_desc = q->ring_addr + ((q->sw_ring_tail
>>> +			+ total_dequeued_cbs + cbs_in_tb - 1)
>>> +			& q->sw_ring_wrap_mask);
>>> +	/* Check if last CB in TB is ready to dequeue (and thus
>>> +	 * the whole TB) - checking sdone bit. If not return.
>>> +	 */
>>> +	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
>>> +			__ATOMIC_RELAXED);
>>> +	if (!(atom_desc.rsp.val & ACC100_SDONE))
>>> +		return -1;
>>> +
>>> +	/* Dequeue */
>>> +	op = desc->req.op_addr;
>>> +
>>> +	/* Clearing status, it will be set based on response */
>>> +	op->status = 0;
>>> +
>>> +	while (i < cbs_in_tb) {
>>> +		desc = q->ring_addr + ((q->sw_ring_tail
>>> +				+ total_dequeued_cbs)
>>> +				& q->sw_ring_wrap_mask);
>>> +		atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
>>> +				__ATOMIC_RELAXED);
>>> +		rsp.val = atom_desc.rsp.val;
>>> +		rte_bbdev_log_debug("Resp. desc %p: %x", desc,
>>> +				rsp.val);
>>> +
>>> +		op->status |= ((rsp.input_err)
>>> +				? (1 << RTE_BBDEV_DATA_ERROR) : 0);
>>> +		op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
>>> +		op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
>>> +
>>> +		if (desc->req.last_desc_in_batch) {
>>> +			(*aq_dequeued)++;
>>> +			desc->req.last_desc_in_batch = 0;
>>> +		}
>>> +		desc->rsp.val = ACC100_DMA_DESC_TYPE;
>>> +		desc->rsp.add_info_0 = 0;
>>> +		desc->rsp.add_info_1 = 0;
>>> +		total_dequeued_cbs++;
>>> +		current_dequeued_cbs++;
>>> +		i++;
>>> +	}
>>> +
>>> +	*ref_op = op;
>>> +
>>> +	return current_dequeued_cbs;
>>> +}
>>> +
>>> +/* Dequeue one decode operation from ACC100 device in CB mode */
>>> +static inline int
>>> +dequeue_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
>>> +		struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
>>> +		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
>>> +{
>>> +	union acc100_dma_desc *desc, atom_desc;
>>> +	union acc100_dma_rsp_desc rsp;
>>> +	struct rte_bbdev_dec_op *op;
>>> +
>>> +	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
>>> +			& q->sw_ring_wrap_mask);
>>> +	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
>>> +			__ATOMIC_RELAXED);
>>> +
>>> +	/* Check fdone bit */
>>> +	if (!(atom_desc.rsp.val & ACC100_FDONE))
>>> +		return -1;
>>> +
>>> +	rsp.val = atom_desc.rsp.val;
>>> +	rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
>>> +
>>> +	/* Dequeue */
>>> +	op = desc->req.op_addr;
>>> +
>>> +	/* Clearing status, it will be set based on response */
>>> +	op->status = 0;
>>> +	op->status |= ((rsp.input_err)
>>> +			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
>> similar to above, can remove the = 0
>>
>> This is a general issue.
> same comment above
>
>>> +	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
>>> +	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
>>> +	if (op->status != 0)
>>> +		q_data->queue_stats.dequeue_err_count++;
>>> +
>>> +	/* CRC invalid if error exists */
>>> +	if (!op->status)
>>> +		op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
>>> +	op->turbo_dec.iter_count = (uint8_t) rsp.iter_cnt / 2;
>>> +	/* Check if this is the last desc in batch (Atomic Queue) */
>>> +	if (desc->req.last_desc_in_batch) {
>>> +		(*aq_dequeued)++;
>>> +		desc->req.last_desc_in_batch = 0;
>>> +	}
>>> +	desc->rsp.val = ACC100_DMA_DESC_TYPE;
>>> +	desc->rsp.add_info_0 = 0;
>>> +	desc->rsp.add_info_1 = 0;
>>> +	*ref_op = op;
>>> +
>>> +	/* One CB (op) was successfully dequeued */
>>> +	return 1;
>>> +}
>>> +
>>> +/* Dequeue one decode operations from ACC100 device in CB mode */
>>> +static inline int
>>> +dequeue_ldpc_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
>>> +		struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
>>> +		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
>>> +{
>>> +	union acc100_dma_desc *desc, atom_desc;
>>> +	union acc100_dma_rsp_desc rsp;
>>> +	struct rte_bbdev_dec_op *op;
>>> +
>>> +	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
>>> +			& q->sw_ring_wrap_mask);
>>> +	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
>>> +			__ATOMIC_RELAXED);
>>> +
>>> +	/* Check fdone bit */
>>> +	if (!(atom_desc.rsp.val & ACC100_FDONE))
>>> +		return -1;
>>> +
>>> +	rsp.val = atom_desc.rsp.val;
>>> +
>>> +	/* Dequeue */
>>> +	op = desc->req.op_addr;
>>> +
>>> +	/* Clearing status, it will be set based on response */
>>> +	op->status = 0;
>>> +	op->status |= rsp.input_err << RTE_BBDEV_DATA_ERROR;
>>> +	op->status |= rsp.dma_err << RTE_BBDEV_DRV_ERROR;
>>> +	op->status |= rsp.fcw_err << RTE_BBDEV_DRV_ERROR;
>>> +	if (op->status != 0)
>>> +		q_data->queue_stats.dequeue_err_count++;
>>> +
>>> +	op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
>>> +	if (op->ldpc_dec.hard_output.length > 0 && !rsp.synd_ok)
>>> +		op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
>>> +	op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
>>> +
>>> +	/* Check if this is the last desc in batch (Atomic Queue) */
>>> +	if (desc->req.last_desc_in_batch) {
>>> +		(*aq_dequeued)++;
>>> +		desc->req.last_desc_in_batch = 0;
>>> +	}
>>> +
>>> +	desc->rsp.val = ACC100_DMA_DESC_TYPE;
>>> +	desc->rsp.add_info_0 = 0;
>>> +	desc->rsp.add_info_1 = 0;
>>> +
>>> +	*ref_op = op;
>>> +
>>> +	/* One CB (op) was successfully dequeued */
>>> +	return 1;
>>> +}
>>> +
>>> +/* Dequeue one decode operations from ACC100 device in TB mode. */
>>> +static inline int
>>> +dequeue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
>>> +		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
>>> +{
>> similar call as fpga_lte_fec
> distinct though as HW specific
>
>>> +	union acc100_dma_desc *desc, *last_desc, atom_desc;
>>> +	union acc100_dma_rsp_desc rsp;
>>> +	struct rte_bbdev_dec_op *op;
>>> +	uint8_t cbs_in_tb = 1, cb_idx = 0;
>>> +
>>> +	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
>>> +			& q->sw_ring_wrap_mask);
>>> +	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
>>> +			__ATOMIC_RELAXED);
>>> +
>>> +	/* Check fdone bit */
>>> +	if (!(atom_desc.rsp.val & ACC100_FDONE))
>>> +		return -1;
>>> +
>>> +	/* Dequeue */
>>> +	op = desc->req.op_addr;
>>> +
>>> +	/* Get number of CBs in dequeued TB */
>>> +	cbs_in_tb = desc->req.cbs_in_tb;
>>> +	/* Get last CB */
>>> +	last_desc = q->ring_addr + ((q->sw_ring_tail
>>> +			+ dequeued_cbs + cbs_in_tb - 1)
>>> +			& q->sw_ring_wrap_mask);
>>> +	/* Check if last CB in TB is ready to dequeue (and thus
>>> +	 * the whole TB) - checking sdone bit. If not return.
>>> +	 */
>>> +	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
>>> +			__ATOMIC_RELAXED);
>>> +	if (!(atom_desc.rsp.val & ACC100_SDONE))
>>> +		return -1;
>>> +
>>> +	/* Clearing status, it will be set based on response */
>>> +	op->status = 0;
>>> +
>>> +	/* Read remaining CBs if exists */
>>> +	while (cb_idx < cbs_in_tb) {
>> Other similar calls use 'i'. 'cb_idx' is more meaningful; consider changing
>> the other loops.
> More relevant here due to the split of the TB into CBs.
ok
>>> +		desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
>>> +				& q->sw_ring_wrap_mask);
>>> +		atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
>>> +				__ATOMIC_RELAXED);
>>> +		rsp.val = atom_desc.rsp.val;
>>> +		rte_bbdev_log_debug("Resp. desc %p: %x", desc,
>>> +				rsp.val);
>>> +
>>> +		op->status |= ((rsp.input_err)
>>> +				? (1 << RTE_BBDEV_DATA_ERROR) : 0);
>>> +		op->status |= ((rsp.dma_err) ? (1 <<
>> RTE_BBDEV_DRV_ERROR) : 0);
>>> +		op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR)
>> : 0);
>>> +
>>> +		/* CRC invalid if error exists */
>>> +		if (!op->status)
>>> +			op->status |= rsp.crc_status <<
>> RTE_BBDEV_CRC_ERROR;
>>> +		op->turbo_dec.iter_count = RTE_MAX((uint8_t) rsp.iter_cnt,
>>> +				op->turbo_dec.iter_count);
>>> +
>>> +		/* Check if this is the last desc in batch (Atomic Queue) */
>>> +		if (desc->req.last_desc_in_batch) {
>>> +			(*aq_dequeued)++;
>>> +			desc->req.last_desc_in_batch = 0;
>>> +		}
>>> +		desc->rsp.val = ACC100_DMA_DESC_TYPE;
>>> +		desc->rsp.add_info_0 = 0;
>>> +		desc->rsp.add_info_1 = 0;
>>> +		dequeued_cbs++;
>>> +		cb_idx++;
>>> +	}
>>> +
>>> +	*ref_op = op;
>>> +
>>> +	return cb_idx;
>>> +}
>>> +
>>> +/* Dequeue LDPC encode operations from ACC100 device. */
>>> +static uint16_t
>>> +acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
>>> +		struct rte_bbdev_enc_op **ops, uint16_t num)
>>> +{
>>> +	struct acc100_queue *q = q_data->queue_private;
>>> +	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
>>> +	uint32_t aq_dequeued = 0;
>>> +	uint16_t dequeue_num, i, dequeued_cbs = 0, dequeued_descs = 0;
>>> +	int ret;
>>> +
>>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
>>> +	if (unlikely(ops == 0 && q == NULL))
>>> +		return 0;
>>> +#endif
>>> +
>>> +	dequeue_num = (avail < num) ? avail : num;
>> Similar to RTE_MIN
>>
>> general issue
> ok, will check
>
>>> +
>>> +	for (i = 0; i < dequeue_num; i++) {
>>> +		ret = dequeue_enc_one_op_cb(q, &ops[dequeued_cbs],
>>> +				dequeued_descs, &aq_dequeued);
>>> +		if (ret < 0)
>>> +			break;
>>> +		dequeued_cbs += ret;
>>> +		dequeued_descs++;
>>> +		if (dequeued_cbs >= num)
>>> +			break;
>> condition should be added to the for-loop
> unsure this would help readability personally

ok

Tom

>>> +	}
>>> +
>>> +	q->aq_dequeued += aq_dequeued;
>>> +	q->sw_ring_tail += dequeued_descs;
>>> +
>>> +	/* Update dequeue stats */
>>> +	q_data->queue_stats.dequeued_count += dequeued_cbs;
>>> +
>>> +	return dequeued_cbs;
>>> +}
>>> +
>>> +/* Dequeue decode operations from ACC100 device. */
>>> +static uint16_t
>>> +acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
>>> +		struct rte_bbdev_dec_op **ops, uint16_t num)
>>> +{
>>> +	struct acc100_queue *q = q_data->queue_private;
>>> +	uint16_t dequeue_num;
>>> +	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
>>> +	uint32_t aq_dequeued = 0;
>>> +	uint16_t i;
>>> +	uint16_t dequeued_cbs = 0;
>>> +	struct rte_bbdev_dec_op *op;
>>> +	int ret;
>>> +
>>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
>>> +	if (unlikely(ops == 0 && q == NULL))
>>> +		return 0;
>>> +#endif
>>> +
>>> +	dequeue_num = (avail < num) ? avail : num;
>>> +
>>> +	for (i = 0; i < dequeue_num; ++i) {
>>> +		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
>>> +			& q->sw_ring_wrap_mask))->req.op_addr;
>>> +		if (op->ldpc_dec.code_block_mode == 0)
>> 0 should be a #define
> mentioned in previous review.
>
> Thanks
>
>> Tom
>>
>>> +			ret = dequeue_dec_one_op_tb(q, &ops[i],
>> dequeued_cbs,
>>> +					&aq_dequeued);
>>> +		else
>>> +			ret = dequeue_ldpc_dec_one_op_cb(
>>> +					q_data, q, &ops[i], dequeued_cbs,
>>> +					&aq_dequeued);
>>> +
>>> +		if (ret < 0)
>>> +			break;
>>> +		dequeued_cbs += ret;
>>> +	}
>>> +
>>> +	q->aq_dequeued += aq_dequeued;
>>> +	q->sw_ring_tail += dequeued_cbs;
>>> +
>>> +	/* Update dequeue stats */
>>> +	q_data->queue_stats.dequeued_count += i;
>>> +
>>> +	return i;
>>> +}
>>> +
>>>  /* Initialization Function */
>>>  static void
>>>  acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
>>> @@ -703,6 +2321,10 @@
>>>  	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
>>>
>>>  	dev->dev_ops = &acc100_bbdev_ops;
>>> +	dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
>>> +	dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
>>> +	dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
>>> +	dev->dequeue_ldpc_dec_ops = acc100_dequeue_ldpc_dec;
>>>
>>>  	((struct acc100_device *) dev->data->dev_private)->pf_device =
>>>  			!strcmp(drv->driver.name,
>>> @@ -815,4 +2437,3 @@ static int acc100_pci_remove(struct
>> rte_pci_device *pci_dev)
>>>  RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME,
>> pci_id_acc100_pf_map);
>>>  RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME,
>> acc100_pci_vf_driver);
>>>  RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME,
>> pci_id_acc100_vf_map);
>>> -
>>> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h
>> b/drivers/baseband/acc100/rte_acc100_pmd.h
>>> index 0e2b79c..78686c1 100644
>>> --- a/drivers/baseband/acc100/rte_acc100_pmd.h
>>> +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
>>> @@ -88,6 +88,8 @@
>>>  #define TMPL_PRI_3      0x0f0e0d0c
>>>  #define QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
>>>  #define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
>>> +#define ACC100_FDONE    0x80000000
>>> +#define ACC100_SDONE    0x40000000
>>>
>>>  #define ACC100_NUM_TMPL  32
>>>  #define VF_OFFSET_QOS 16 /* offset in Memory Space specific to QoS
>> Mon */
>>> @@ -398,6 +400,7 @@ struct __rte_packed acc100_dma_req_desc {
>>>  union acc100_dma_desc {
>>>  	struct acc100_dma_req_desc req;
>>>  	union acc100_dma_rsp_desc rsp;
>>> +	uint64_t atom_hdr;
>>>  };
>>>
>>>


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v9 06/10] baseband/acc100: add HARQ loopback support
  2020-09-30 18:55         ` Chautru, Nicolas
@ 2020-10-01 15:32           ` Tom Rix
  0 siblings, 0 replies; 213+ messages in thread
From: Tom Rix @ 2020-10-01 15:32 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, akhil.goyal
  Cc: Richardson, Bruce, Xu, Rosen, dave.burley, aidan.goddard, Yigit,
	Ferruh, Liu, Tianjiao


On 9/30/20 11:55 AM, Chautru, Nicolas wrote:
> Hi Tom, 
>
>
>> From: Tom Rix <trix@redhat.com>
>> On 9/28/20 5:29 PM, Nicolas Chautru wrote:
>>> Additional support for HARQ memory loopback
>>>
>>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
>>> Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
>>> ---
>>>  drivers/baseband/acc100/rte_acc100_pmd.c | 158
>>> +++++++++++++++++++++++++++++++
>>>  1 file changed, 158 insertions(+)
>>>
>>> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
>>> b/drivers/baseband/acc100/rte_acc100_pmd.c
>>> index b223547..e484c0a 100644
>>> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
>>> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
>>> @@ -658,6 +658,7 @@
>>>
>> 	RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
>> 	RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |  #ifdef
>> ACC100_EXT_MEM
>>> +
>> 	RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK |
>> 	RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
>> 	RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
>> #endif @@
>>> -1480,12 +1481,169 @@
>>>  	return 1;
>>>  }
>>>
>>> +static inline int
>>> +harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
>>> +		uint16_t total_enqueued_cbs) {
>>> +	struct acc100_fcw_ld *fcw;
>>> +	union acc100_dma_desc *desc;
>>> +	int next_triplet = 1;
>>> +	struct rte_mbuf *hq_output_head, *hq_output;
>>> +	uint16_t harq_in_length = op-
>>> ldpc_dec.harq_combined_input.length;
>>> +	if (harq_in_length == 0) {
>>> +		rte_bbdev_log(ERR, "Loopback of invalid null size\n");
>>> +		return -EINVAL;
>>> +	}
>>> +
>>> +	int h_comp = check_bit(op->ldpc_dec.op_flags,
>>> +			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
>>> +			) ? 1 : 0;
>> bool
> Not in that case, as this is used explicitly as an integer in the FCW.

ok.

Reviewed-by: Tom Rix <trix@redhat.com>

>
> Thanks
> Nic
>
>
>> Tom
>>
>>> +	if (h_comp == 1)
>>> +		harq_in_length = harq_in_length * 8 / 6;
>>> +	harq_in_length = RTE_ALIGN(harq_in_length, 64);
>>> +	uint16_t harq_dma_length_in = (h_comp == 0) ?
>>> +			harq_in_length :
>>> +			harq_in_length * 6 / 8;
>>> +	uint16_t harq_dma_length_out = harq_dma_length_in;
>>> +	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
>>> +
>> 	RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
>>> +	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
>>> +	uint16_t harq_index = (ddr_mem_in ?
>>> +			op->ldpc_dec.harq_combined_input.offset :
>>> +			op->ldpc_dec.harq_combined_output.offset)
>>> +			/ ACC100_HARQ_OFFSET;
>>> +
>>> +	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
>>> +			& q->sw_ring_wrap_mask);
>>> +	desc = q->ring_addr + desc_idx;
>>> +	fcw = &desc->req.fcw_ld;
>>> +	/* Set the FCW from loopback into DDR */
>>> +	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
>>> +	fcw->FCWversion = ACC100_FCW_VER;
>>> +	fcw->qm = 2;
>>> +	fcw->Zc = 384;
>>> +	if (harq_in_length < 16 * N_ZC_1)
>>> +		fcw->Zc = 16;
>>> +	fcw->ncb = fcw->Zc * N_ZC_1;
>>> +	fcw->rm_e = 2;
>>> +	fcw->hcin_en = 1;
>>> +	fcw->hcout_en = 1;
>>> +
>>> +	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length
>> %d %d\n",
>>> +			ddr_mem_in, harq_index,
>>> +			harq_layout[harq_index].offset, harq_in_length,
>>> +			harq_dma_length_in);
>>> +
>>> +	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
>>> +		fcw->hcin_size0 = harq_layout[harq_index].size0;
>>> +		fcw->hcin_offset = harq_layout[harq_index].offset;
>>> +		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
>>> +		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
>>> +		if (h_comp == 1)
>>> +			harq_dma_length_in = harq_dma_length_in * 6 / 8;
>>> +	} else {
>>> +		fcw->hcin_size0 = harq_in_length;
>>> +	}
>>> +	harq_layout[harq_index].val = 0;
>>> +	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
>>> +			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
>>> +	fcw->hcout_size0 = harq_in_length;
>>> +	fcw->hcin_decomp_mode = h_comp;
>>> +	fcw->hcout_comp_mode = h_comp;
> see here
>
>>> +	fcw->gain_i = 1;
>>> +	fcw->gain_h = 1;
>>> +
>>> +	/* Set the prefix of descriptor. This could be done at polling */
>>> +	desc->req.word0 = ACC100_DMA_DESC_TYPE;
>>> +	desc->req.word1 = 0; /**< Timestamp could be disabled */
>>> +	desc->req.word2 = 0;
>>> +	desc->req.word3 = 0;
>>> +	desc->req.numCBs = 1;
>>> +
>>> +	/* Null LLR input for Decoder */
>>> +	desc->req.data_ptrs[next_triplet].address =
>>> +			q->lb_in_addr_phys;
>>> +	desc->req.data_ptrs[next_triplet].blen = 2;
>>> +	desc->req.data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
>>> +	desc->req.data_ptrs[next_triplet].last = 0;
>>> +	desc->req.data_ptrs[next_triplet].dma_ext = 0;
>>> +	next_triplet++;
>>> +
>>> +	/* HARQ Combine input from either Memory interface */
>>> +	if (!ddr_mem_in) {
>>> +		next_triplet = acc100_dma_fill_blk_type_out(&desc->req,
>>> +				op->ldpc_dec.harq_combined_input.data,
>>> +				op->ldpc_dec.harq_combined_input.offset,
>>> +				harq_dma_length_in,
>>> +				next_triplet,
>>> +				ACC100_DMA_BLKID_IN_HARQ);
>>> +	} else {
>>> +		desc->req.data_ptrs[next_triplet].address =
>>> +				op->ldpc_dec.harq_combined_input.offset;
>>> +		desc->req.data_ptrs[next_triplet].blen =
>>> +				harq_dma_length_in;
>>> +		desc->req.data_ptrs[next_triplet].blkid =
>>> +				ACC100_DMA_BLKID_IN_HARQ;
>>> +		desc->req.data_ptrs[next_triplet].dma_ext = 1;
>>> +		next_triplet++;
>>> +	}
>>> +	desc->req.data_ptrs[next_triplet - 1].last = 1;
>>> +	desc->req.m2dlen = next_triplet;
>>> +
>>> +	/* Dropped decoder hard output */
>>> +	desc->req.data_ptrs[next_triplet].address =
>>> +			q->lb_out_addr_phys;
>>> +	desc->req.data_ptrs[next_triplet].blen = BYTES_IN_WORD;
>>> +	desc->req.data_ptrs[next_triplet].blkid =
>> ACC100_DMA_BLKID_OUT_HARD;
>>> +	desc->req.data_ptrs[next_triplet].last = 0;
>>> +	desc->req.data_ptrs[next_triplet].dma_ext = 0;
>>> +	next_triplet++;
>>> +
>>> +	/* HARQ Combine output to either Memory interface */
>>> +	if (check_bit(op->ldpc_dec.op_flags,
>>> +
>> 	RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE
>>> +			)) {
>>> +		desc->req.data_ptrs[next_triplet].address =
>>> +				op->ldpc_dec.harq_combined_output.offset;
>>> +		desc->req.data_ptrs[next_triplet].blen =
>>> +				harq_dma_length_out;
>>> +		desc->req.data_ptrs[next_triplet].blkid =
>>> +				ACC100_DMA_BLKID_OUT_HARQ;
>>> +		desc->req.data_ptrs[next_triplet].dma_ext = 1;
>>> +		next_triplet++;
>>> +	} else {
>>> +		hq_output_head = op-
>>> ldpc_dec.harq_combined_output.data;
>>> +		hq_output = op->ldpc_dec.harq_combined_output.data;
>>> +		next_triplet = acc100_dma_fill_blk_type_out(
>>> +				&desc->req,
>>> +				op->ldpc_dec.harq_combined_output.data,
>>> +				op->ldpc_dec.harq_combined_output.offset,
>>> +				harq_dma_length_out,
>>> +				next_triplet,
>>> +				ACC100_DMA_BLKID_OUT_HARQ);
>>> +		/* HARQ output */
>>> +		mbuf_append(hq_output_head, hq_output,
>> harq_dma_length_out);
>>> +		op->ldpc_dec.harq_combined_output.length =
>>> +				harq_dma_length_out;
>>> +	}
>>> +	desc->req.data_ptrs[next_triplet - 1].last = 1;
>>> +	desc->req.d2mlen = next_triplet - desc->req.m2dlen;
>>> +	desc->req.op_addr = op;
>>> +
>>> +	/* One CB (one op) was successfully prepared to enqueue */
>>> +	return 1;
>>> +}
>>> +
>>>  /** Enqueue one decode operations for ACC100 device in CB mode */
>>> static inline int  enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q,
>>> struct rte_bbdev_dec_op *op,
>>>  		uint16_t total_enqueued_cbs, bool same_op)  {
>>>  	int ret;
>>> +	if (unlikely(check_bit(op->ldpc_dec.op_flags,
>>> +
>> 	RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK))) {
>>> +		ret = harq_loopback(q, op, total_enqueued_cbs);
>>> +		return ret;
>>> +	}
>>>
>>>  	union acc100_dma_desc *desc;
>>>  	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v10 10/10] baseband/acc100: add configure function
  2020-10-01 14:11       ` Maxime Coquelin
@ 2020-10-01 15:36         ` Chautru, Nicolas
  2020-10-01 15:43           ` Maxime Coquelin
  0 siblings, 1 reply; 213+ messages in thread
From: Chautru, Nicolas @ 2020-10-01 15:36 UTC (permalink / raw)
  To: Maxime Coquelin, dev, akhil.goyal
  Cc: Richardson, Bruce, Xu, Rosen, trix, Yigit, Ferruh, Liu, Tianjiao

Hi Maxime, 

> -----Original Message-----
> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> Sent: Thursday, October 1, 2020 7:11 AM
> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org;
> akhil.goyal@nxp.com
> Cc: Richardson, Bruce <bruce.richardson@intel.com>; Xu, Rosen
> <rosen.xu@intel.com>; trix@redhat.com; Yigit, Ferruh
> <ferruh.yigit@intel.com>; Liu, Tianjiao <tianjiao.liu@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v10 10/10] baseband/acc100: add configure
> function
> 
> Hi Nicolas,
> 
> On 10/1/20 5:14 AM, Nicolas Chautru wrote:
> > diff --git a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> > b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> > index 4a76d1d..91c234d 100644
> > --- a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> > +++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> > @@ -1,3 +1,10 @@
> >  DPDK_21 {
> >  	local: *;
> >  };
> > +
> > +EXPERIMENTAL {
> > +	global:
> > +
> > +	acc100_configure;
> > +
> > +};
> > --
> 
> Ideally we should not need to have device specific APIs, but at least it should
> be prefixed with "rte_".

Currently this is already the case for the other bbdev PMDs.
So I would tend to prefer consistency overall in that context.
You could argue whether this is a PMD function or a companion exposed function, but again, if this should change it should change for all PMDs to avoid discrepancies.
If this is really deemed required it can be pushed as an extra patch covering all PMDs, but probably not for 20.11.
What do you think?
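
For illustration, the prefixed variant being discussed would look like this in the map file (hypothetical; the symbol is acc100_configure in this series):

```
EXPERIMENTAL {
	global:

	rte_acc100_configure;

};
```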

> 
> Regards,
> Maxime


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v10 04/10] baseband/acc100: add queue configuration
  2020-10-01  3:14     ` [dpdk-dev] [PATCH v10 04/10] baseband/acc100: add queue configuration Nicolas Chautru
@ 2020-10-01 15:38       ` Maxime Coquelin
  2020-10-01 19:50         ` Chautru, Nicolas
  0 siblings, 1 reply; 213+ messages in thread
From: Maxime Coquelin @ 2020-10-01 15:38 UTC (permalink / raw)
  To: Nicolas Chautru, dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, trix, ferruh.yigit, tianjiao.liu



On 10/1/20 5:14 AM, Nicolas Chautru wrote:
> Adding function to create and configure queues for
> the device. Still no capability.
> 
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> Reviewed-by: Rosen Xu <rosen.xu@intel.com>
> Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
> ---
>  drivers/baseband/acc100/rte_acc100_pmd.c | 438 ++++++++++++++++++++++++++++++-
>  drivers/baseband/acc100/rte_acc100_pmd.h |  45 ++++
>  2 files changed, 482 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
> index 98a17b3..709a7af 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> @@ -26,6 +26,22 @@
>  RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
>  #endif
>  
> +/* Write to MMIO register address */
> +static inline void
> +mmio_write(void *addr, uint32_t value)
> +{
> +	*((volatile uint32_t *)(addr)) = rte_cpu_to_le_32(value);
> +}
> +
> +/* Write a register of a ACC100 device */
> +static inline void
> +acc100_reg_write(struct acc100_device *d, uint32_t offset, uint32_t payload)
> +{
> +	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
> +	mmio_write(reg_addr, payload);
> +	usleep(ACC100_LONG_WAIT);

Is it really needed to sleep after the MMIO write access?

> +}
> +
>  /* Read a register of a ACC100 device */
>  static inline uint32_t
>  acc100_reg_read(struct acc100_device *d, uint32_t offset)
> @@ -36,6 +52,22 @@
>  	return rte_le_to_cpu_32(ret);
>  }
>  
> +/* Basic Implementation of Log2 for exact 2^N */
> +static inline uint32_t
> +log2_basic(uint32_t value)
> +{
> +	return (value == 0) ? 0 : rte_bsf32(value);
> +}
> +
> +/* Calculate memory alignment offset assuming alignment is 2^N */
> +static inline uint32_t
> +calc_mem_alignment_offset(void *unaligned_virt_mem, uint32_t alignment)
> +{
> +	rte_iova_t unaligned_phy_mem = rte_malloc_virt2iova(unaligned_virt_mem);
> +	return (uint32_t)(alignment -
> +			(unaligned_phy_mem & (alignment-1)));
> +}
> +
>  /* Calculate the offset of the enqueue register */
>  static inline uint32_t
>  queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
> @@ -208,10 +240,411 @@
>  			acc100_conf->q_dl_5g.aq_depth_log2);
>  }
>  
> +static void
> +free_base_addresses(void **base_addrs, int size)
> +{
> +	int i;
> +	for (i = 0; i < size; i++)
> +		rte_free(base_addrs[i]);
> +}
> +
> +static inline uint32_t
> +get_desc_len(void)
> +{
> +	return sizeof(union acc100_dma_desc);
> +}
> +
> +/* Allocate the 2 * 64MB block for the sw rings */
> +static int
> +alloc_2x64mb_sw_rings_mem(struct rte_bbdev *dev, struct acc100_device *d,
> +		int socket)
> +{
> +	uint32_t sw_ring_size = ACC100_SIZE_64MBYTE;
> +	d->sw_rings_base = rte_zmalloc_socket(dev->device->driver->name,
> +			2 * sw_ring_size, RTE_CACHE_LINE_SIZE, socket);
> +	if (d->sw_rings_base == NULL) {
> +		rte_bbdev_log(ERR, "Failed to allocate memory for %s:%u",
> +				dev->device->driver->name,
> +				dev->data->dev_id);
> +		return -ENOMEM;
> +	}
> +	memset(d->sw_rings_base, 0, ACC100_SIZE_64MBYTE);

Having used zmalloc, the memset looks overkill. Also, it does not clear
all the allocated area; I don't know if this is expected.

> +	uint32_t next_64mb_align_offset = calc_mem_alignment_offset(
> +			d->sw_rings_base, ACC100_SIZE_64MBYTE);
> +	d->sw_rings = RTE_PTR_ADD(d->sw_rings_base, next_64mb_align_offset);
> +	d->sw_rings_phys = rte_malloc_virt2iova(d->sw_rings_base) +
> +			next_64mb_align_offset;

sw_rings_phys should be renamed to sw_rings_iova, as it could be a VA if
IOVA_AS_VA mode is used.

> +	d->sw_ring_size = ACC100_MAX_QUEUE_DEPTH * get_desc_len();
> +	d->sw_ring_max_depth = d->sw_ring_size / get_desc_len();

d->sw_ring_max_depth = ACC100_MAX_QUEUE_DEPTH;

> +
> +	return 0;
> +}
> +
> +/* Attempt to allocate minimised memory space for sw rings */
> +static void
> +alloc_sw_rings_min_mem(struct rte_bbdev *dev, struct acc100_device *d,
> +		uint16_t num_queues, int socket)
> +{
> +	rte_iova_t sw_rings_base_phy, next_64mb_align_addr_phy;

Same comment regarding phys vs. iova in this function.

> +	uint32_t next_64mb_align_offset;
> +	rte_iova_t sw_ring_phys_end_addr;
> +	void *base_addrs[ACC100_SW_RING_MEM_ALLOC_ATTEMPTS];
> +	void *sw_rings_base;
> +	int i = 0;
> +	uint32_t q_sw_ring_size = ACC100_MAX_QUEUE_DEPTH * get_desc_len();
> +	uint32_t dev_sw_ring_size = q_sw_ring_size * num_queues;
> +
> +	/* Find an aligned block of memory to store sw rings */
> +	while (i < ACC100_SW_RING_MEM_ALLOC_ATTEMPTS) {
> +		/*
> +		 * sw_ring allocated memory is guaranteed to be aligned to
> +		 * q_sw_ring_size at the condition that the requested size is
> +		 * less than the page size
> +		 */
> +		sw_rings_base = rte_zmalloc_socket(
> +				dev->device->driver->name,
> +				dev_sw_ring_size, q_sw_ring_size, socket);
> +
> +		if (sw_rings_base == NULL) {
> +			rte_bbdev_log(ERR,
> +					"Failed to allocate memory for %s:%u",
> +					dev->device->driver->name,
> +					dev->data->dev_id);
> +			break;
> +		}
> +
> +		sw_rings_base_phy = rte_malloc_virt2iova(sw_rings_base);
> +		next_64mb_align_offset = calc_mem_alignment_offset(
> +				sw_rings_base, ACC100_SIZE_64MBYTE);
> +		next_64mb_align_addr_phy = sw_rings_base_phy +
> +				next_64mb_align_offset;
> +		sw_ring_phys_end_addr = sw_rings_base_phy + dev_sw_ring_size;
> +
> +		/* Check if the end of the sw ring memory block is before the
> +		 * start of next 64MB aligned mem address
> +		 */
> +		if (sw_ring_phys_end_addr < next_64mb_align_addr_phy) {
> +			d->sw_rings_phys = sw_rings_base_phy;
> +			d->sw_rings = sw_rings_base;
> +			d->sw_rings_base = sw_rings_base;
> +			d->sw_ring_size = q_sw_ring_size;
> +			d->sw_ring_max_depth = ACC100_MAX_QUEUE_DEPTH;
> +			break;
> +		}
> +		/* Store the address of the unaligned mem block */
> +		base_addrs[i] = sw_rings_base;
> +		i++;
> +	}
> +
> +	/* Free all unaligned blocks of mem allocated in the loop */
> +	free_base_addresses(base_addrs, i);
> +}
> +
> +
> +/* Allocate 64MB memory used for all software rings */
> +static int
> +acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
> +{
> +	uint32_t phys_low, phys_high, payload;
> +	struct acc100_device *d = dev->data->dev_private;
> +	const struct acc100_registry_addr *reg_addr;
> +
> +	if (d->pf_device && !d->acc100_conf.pf_mode_en) {
> +		rte_bbdev_log(NOTICE,
> +				"%s has PF mode disabled. This PF can't be used.",
> +				dev->data->name);
> +		return -ENODEV;
> +	}
> +
> +	alloc_sw_rings_min_mem(dev, d, num_queues, socket_id);
> +
> +	/* If minimal memory space approach failed, then allocate
> +	 * the 2 * 64MB block for the sw rings
> +	 */
> +	if (d->sw_rings == NULL)
> +		alloc_2x64mb_sw_rings_mem(dev, d, socket_id);
> +
> +	if (d->sw_rings == NULL) {
> +		rte_bbdev_log(NOTICE,
> +				"Failure allocating sw_rings memory");
> +		return -ENODEV;
> +	}
> +
> +	/* Configure ACC100 with the base address for DMA descriptor rings
> +	 * Same descriptor rings used for UL and DL DMA Engines
> +	 * Note : Assuming only VF0 bundle is used for PF mode
> +	 */
> +	phys_high = (uint32_t)(d->sw_rings_phys >> 32);
> +	phys_low  = (uint32_t)(d->sw_rings_phys & ~(ACC100_SIZE_64MBYTE-1));
> +
> +	/* Choose correct registry addresses for the device type */
> +	if (d->pf_device)
> +		reg_addr = &pf_reg_addr;
> +	else
> +		reg_addr = &vf_reg_addr;
> +
> +	/* Read the populated cfg from ACC100 registers */
> +	fetch_acc100_config(dev);
> +
> +	/* Release AXI from PF */
> +	if (d->pf_device)
> +		acc100_reg_write(d, HWPfDmaAxiControl, 1);
> +
> +	acc100_reg_write(d, reg_addr->dma_ring_ul5g_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->dma_ring_ul5g_lo, phys_low);
> +	acc100_reg_write(d, reg_addr->dma_ring_dl5g_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->dma_ring_dl5g_lo, phys_low);
> +	acc100_reg_write(d, reg_addr->dma_ring_ul4g_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->dma_ring_ul4g_lo, phys_low);
> +	acc100_reg_write(d, reg_addr->dma_ring_dl4g_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->dma_ring_dl4g_lo, phys_low);
> +
> +	/*
> +	 * Configure Ring Size to the max queue ring size
> +	 * (used for wrapping purpose)
> +	 */
> +	payload = log2_basic(d->sw_ring_size / 64);
> +	acc100_reg_write(d, reg_addr->ring_size, payload);
> +
> +	/* Configure tail pointer for use when SDONE enabled */
> +	d->tail_ptrs = rte_zmalloc_socket(
> +			dev->device->driver->name,
> +			ACC100_NUM_QGRPS * ACC100_NUM_AQS * sizeof(uint32_t),
> +			RTE_CACHE_LINE_SIZE, socket_id);
> +	if (d->tail_ptrs == NULL) {
> +		rte_bbdev_log(ERR, "Failed to allocate tail ptr for %s:%u",
> +				dev->device->driver->name,
> +				dev->data->dev_id);
> +		rte_free(d->sw_rings);
> +		return -ENOMEM;
> +	}
> +	d->tail_ptr_phys = rte_malloc_virt2iova(d->tail_ptrs);
> +
> +	phys_high = (uint32_t)(d->tail_ptr_phys >> 32);
> +	phys_low  = (uint32_t)(d->tail_ptr_phys);
> +	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_lo, phys_low);
> +	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_lo, phys_low);
> +	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_lo, phys_low);
> +	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
> +
> +	d->harq_layout = rte_zmalloc_socket("HARQ Layout",
> +			ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
> +			RTE_CACHE_LINE_SIZE, dev->data->socket_id);
> +	if (d->harq_layout == NULL) {
> +		rte_bbdev_log(ERR, "Failed to allocate harq_layout for %s:%u",
> +				dev->device->driver->name,
> +				dev->data->dev_id);
> +		rte_free(d->sw_rings);
> +		return -ENOMEM;
> +	}
> +
> +	/* Mark as configured properly */
> +	d->configured = true;
> +
> +	rte_bbdev_log_debug(
> +			"ACC100 (%s) configured  sw_rings = %p, sw_rings_phys = %#"
> +			PRIx64, dev->data->name, d->sw_rings, d->sw_rings_phys);
> +
> +	return 0;
> +}
> +
>  /* Free 64MB memory used for software rings */

The comment says 64MB, but it seems 2 * 64MB are allocated.

>  static int
> -acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
> +acc100_dev_close(struct rte_bbdev *dev)
>  {
> +	struct acc100_device *d = dev->data->dev_private;
> +	if (d->sw_rings_base != NULL) {
> +		rte_free(d->tail_ptrs);
> +		rte_free(d->sw_rings_base);
> +		d->sw_rings_base = NULL;
> +	}
> +	usleep(ACC100_LONG_WAIT);

This sleep looks weird; it would need a comment if it is really needed.

> +	return 0;
> +}
> +
> +
> +/**
> + * Report a ACC100 queue index which is free
> + * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
> + * Note : Only supporting VF0 Bundle for PF mode
> + */
> +static int
> +acc100_find_free_queue_idx(struct rte_bbdev *dev,
> +		const struct rte_bbdev_queue_conf *conf)
> +{
> +	struct acc100_device *d = dev->data->dev_private;
> +	int op_2_acc[5] = {0, UL_4G, DL_4G, UL_5G, DL_5G};
> +	int acc = op_2_acc[conf->op_type];
> +	struct rte_q_topology_t *qtop = NULL;

New line.

> +	qtopFromAcc(&qtop, acc, &(d->acc100_conf));
> +	if (qtop == NULL)
> +		return -1;
> +	/* Identify matching QGroup Index which are sorted in priority order */
> +	uint16_t group_idx = qtop->first_qgroup_index;
> +	group_idx += conf->priority;
> +	if (group_idx >= ACC100_NUM_QGRPS ||
> +			conf->priority >= qtop->num_qgroups) {
> +		rte_bbdev_log(INFO, "Invalid Priority on %s, priority %u",
> +				dev->data->name, conf->priority);
> +		return -1;
> +	}
> +	/* Find a free AQ_idx  */
> +	uint16_t aq_idx;
> +	for (aq_idx = 0; aq_idx < qtop->num_aqs_per_groups; aq_idx++) {
> +		if (((d->q_assigned_bit_map[group_idx] >> aq_idx) & 0x1) == 0) {
> +			/* Mark the Queue as assigned */
> +			d->q_assigned_bit_map[group_idx] |= (1 << aq_idx);
> +			/* Report the AQ Index */
> +			return (group_idx << ACC100_GRP_ID_SHIFT) + aq_idx;
> +		}
> +	}
> +	rte_bbdev_log(INFO, "Failed to find free queue on %s, priority %u",
> +			dev->data->name, conf->priority);
> +	return -1;
> +}
> +
> +/* Setup ACC100 queue */
> +static int
> +acc100_queue_setup(struct rte_bbdev *dev, uint16_t queue_id,
> +		const struct rte_bbdev_queue_conf *conf)
> +{
> +	struct acc100_device *d = dev->data->dev_private;
> +	struct acc100_queue *q;
> +	int16_t q_idx;
> +
> +	/* Allocate the queue data structure. */
> +	q = rte_zmalloc_socket(dev->device->driver->name, sizeof(*q),
> +			RTE_CACHE_LINE_SIZE, conf->socket);
> +	if (q == NULL) {
> +		rte_bbdev_log(ERR, "Failed to allocate queue memory");
> +		return -ENOMEM;
> +	}
> +
> +	q->d = d;
> +	q->ring_addr = RTE_PTR_ADD(d->sw_rings, (d->sw_ring_size * queue_id));

You might want to ensure d is not NULL before dereferencing it.

> +	q->ring_addr_phys = d->sw_rings_phys + (d->sw_ring_size * queue_id);
> +
> +	/* Prepare the Ring with default descriptor format */
> +	union acc100_dma_desc *desc = NULL;
> +	unsigned int desc_idx, b_idx;
> +	int fcw_len = (conf->op_type == RTE_BBDEV_OP_LDPC_ENC ?
> +		ACC100_FCW_LE_BLEN : (conf->op_type == RTE_BBDEV_OP_TURBO_DEC ?
> +		ACC100_FCW_TD_BLEN : ACC100_FCW_LD_BLEN));
> +
> +	for (desc_idx = 0; desc_idx < d->sw_ring_max_depth; desc_idx++) {
> +		desc = q->ring_addr + desc_idx;
> +		desc->req.word0 = ACC100_DMA_DESC_TYPE;
> +		desc->req.word1 = 0; /**< Timestamp */
> +		desc->req.word2 = 0;
> +		desc->req.word3 = 0;
> +		uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
> +		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
> +		desc->req.data_ptrs[0].blen = fcw_len;
> +		desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
> +		desc->req.data_ptrs[0].last = 0;
> +		desc->req.data_ptrs[0].dma_ext = 0;
> +		for (b_idx = 1; b_idx < ACC100_DMA_MAX_NUM_POINTERS - 1;
> +				b_idx++) {
> +			desc->req.data_ptrs[b_idx].blkid = ACC100_DMA_BLKID_IN;
> +			desc->req.data_ptrs[b_idx].last = 1;
> +			desc->req.data_ptrs[b_idx].dma_ext = 0;
> +			b_idx++;
> +			desc->req.data_ptrs[b_idx].blkid =
> +					ACC100_DMA_BLKID_OUT_ENC;
> +			desc->req.data_ptrs[b_idx].last = 1;
> +			desc->req.data_ptrs[b_idx].dma_ext = 0;
> +		}
> +		/* Preset some fields of LDPC FCW */
> +		desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
> +		desc->req.fcw_ld.gain_i = 1;
> +		desc->req.fcw_ld.gain_h = 1;
> +	}
> +
> +	q->lb_in = rte_zmalloc_socket(dev->device->driver->name,
> +			RTE_CACHE_LINE_SIZE,
> +			RTE_CACHE_LINE_SIZE, conf->socket);
> +	if (q->lb_in == NULL) {
> +		rte_bbdev_log(ERR, "Failed to allocate lb_in memory");
> +		rte_free(q);
> +		return -ENOMEM;
> +	}
> +	q->lb_in_addr_phys = rte_malloc_virt2iova(q->lb_in);
> +	q->lb_out = rte_zmalloc_socket(dev->device->driver->name,
> +			RTE_CACHE_LINE_SIZE,
> +			RTE_CACHE_LINE_SIZE, conf->socket);
> +	if (q->lb_out == NULL) {
> +		rte_bbdev_log(ERR, "Failed to allocate lb_out memory");
> +		rte_free(q->lb_in);
> +		rte_free(q);
> +		return -ENOMEM;
> +	}
> +	q->lb_out_addr_phys = rte_malloc_virt2iova(q->lb_out);
> +
> +	/*
> +	 * Software queue ring wraps synchronously with the HW when it reaches
> +	 * the boundary of the maximum allocated queue size, no matter what the
> +	 * sw queue size is. This wrapping is guarded by setting the wrap_mask
> +	 * to represent the maximum queue size as allocated at the time when
> +	 * the device has been setup (in configure()).
> +	 *
> +	 * The queue depth is set to the queue size value (conf->queue_size).
> +	 * This limits the occupancy of the queue at any point of time, so that
> +	 * the queue does not get swamped with enqueue requests.
> +	 */
> +	q->sw_ring_depth = conf->queue_size;
> +	q->sw_ring_wrap_mask = d->sw_ring_max_depth - 1;
> +
> +	q->op_type = conf->op_type;
> +
> +	q_idx = acc100_find_free_queue_idx(dev, conf);
> +	if (q_idx == -1) {
> +		rte_free(q->lb_in);
> +		rte_free(q->lb_out);
> +		rte_free(q);
> +		return -1;
> +	}
> +
> +	q->qgrp_id = (q_idx >> ACC100_GRP_ID_SHIFT) & 0xF;
> +	q->vf_id = (q_idx >> ACC100_VF_ID_SHIFT)  & 0x3F;
> +	q->aq_id = q_idx & 0xF;
> +	q->aq_depth = (conf->op_type ==  RTE_BBDEV_OP_TURBO_DEC) ?
> +			(1 << d->acc100_conf.q_ul_4g.aq_depth_log2) :
> +			(1 << d->acc100_conf.q_dl_4g.aq_depth_log2);
> +
> +	q->mmio_reg_enqueue = RTE_PTR_ADD(d->mmio_base,
> +			queue_offset(d->pf_device,
> +					q->vf_id, q->qgrp_id, q->aq_id));
> +
> +	rte_bbdev_log_debug(
> +			"Setup dev%u q%u: qgrp_id=%u, vf_id=%u, aq_id=%u, aq_depth=%u, mmio_reg_enqueue=%p",
> +			dev->data->dev_id, queue_id, q->qgrp_id, q->vf_id,
> +			q->aq_id, q->aq_depth, q->mmio_reg_enqueue);
> +
> +	dev->data->queues[queue_id].queue_private = q;
> +	return 0;
> +}
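[Editorial note: the sw-ring wrap rule described in the comment in acc100_queue_setup() above can be illustrated with a standalone sketch. Names and values here are illustrative, not the driver's; the real mask comes from the maximum depth allocated in configure(), which is a power of two, while the per-queue depth only bounds occupancy.]

```c
#include <stdint.h>

/* Sketch of the sw-ring wrap rule: descriptor indices always wrap at
 * the maximum ring depth allocated at device setup (a power of two),
 * via a mask, regardless of the smaller per-queue depth.
 */
static inline uint32_t
next_desc_idx(uint32_t sw_ring_head, uint32_t enqueued,
		uint32_t sw_ring_max_depth)
{
	uint32_t sw_ring_wrap_mask = sw_ring_max_depth - 1;

	return (sw_ring_head + enqueued) & sw_ring_wrap_mask;
}
```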
> +
> +/* Release ACC100 queue */
> +static int
> +acc100_queue_release(struct rte_bbdev *dev, uint16_t q_id)
> +{
> +	struct acc100_device *d = dev->data->dev_private;
> +	struct acc100_queue *q = dev->data->queues[q_id].queue_private;
> +
> +	if (q != NULL) {
> +		/* Mark the Queue as un-assigned */
> +		d->q_assigned_bit_map[q->qgrp_id] &= (0xFFFFFFFF -
> +				(1 << q->aq_id));
> +		rte_free(q->lb_in);
> +		rte_free(q->lb_out);
> +		rte_free(q);
> +		dev->data->queues[q_id].queue_private = NULL;
> +	}
> +
>  	return 0;
>  }
>  
> @@ -262,8 +695,11 @@
>  }
>  
>  static const struct rte_bbdev_ops acc100_bbdev_ops = {
> +	.setup_queues = acc100_setup_queues,
>  	.close = acc100_dev_close,
>  	.info_get = acc100_dev_info_get,
> +	.queue_setup = acc100_queue_setup,
> +	.queue_release = acc100_queue_release,
>  };
>  
>  /* ACC100 PCI PF address map */
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
> index de015ca..2508385 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> @@ -522,11 +522,56 @@ struct acc100_registry_addr {
>  	.ddr_range = HWVfDmaDdrBaseRangeRoVf,
>  };
>  
> +/* Structure associated with each queue. */
> +struct __rte_cache_aligned acc100_queue {
> +	union acc100_dma_desc *ring_addr;  /* Virtual address of sw ring */
> +	rte_iova_t ring_addr_phys;  /* Physical address of software ring */
> +	uint32_t sw_ring_head;  /* software ring head */
> +	uint32_t sw_ring_tail;  /* software ring tail */
> +	/* software ring size (descriptors, not bytes) */
> +	uint32_t sw_ring_depth;
> +	/* mask used to wrap enqueued descriptors on the sw ring */
> +	uint32_t sw_ring_wrap_mask;
> +	/* MMIO register used to enqueue descriptors */
> +	void *mmio_reg_enqueue;
> +	uint8_t vf_id;  /* VF ID (max = 63) */
> +	uint8_t qgrp_id;  /* Queue Group ID */
> +	uint16_t aq_id;  /* Atomic Queue ID */
> +	uint16_t aq_depth;  /* Depth of atomic queue */
> +	uint32_t aq_enqueued;  /* Count how many "batches" have been enqueued */
> +	uint32_t aq_dequeued;  /* Count how many "batches" have been dequeued */
> +	uint32_t irq_enable;  /* Enable ops dequeue interrupts if set to 1 */
> +	struct rte_mempool *fcw_mempool;  /* FCW mempool */
> +	enum rte_bbdev_op_type op_type;  /* Type of this Queue: TE or TD */
> +	/* Internal Buffers for loopback input */
> +	uint8_t *lb_in;
> +	uint8_t *lb_out;
> +	rte_iova_t lb_in_addr_phys;
> +	rte_iova_t lb_out_addr_phys;
> +	struct acc100_device *d;
> +};
> +
>  /* Private data structure for each ACC100 device */
>  struct acc100_device {
>  	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
> +	void *sw_rings_base;  /* Base addr of un-aligned memory for sw rings */
> +	void *sw_rings;  /* 64MBs of 64MB aligned memory for sw rings */
> +	rte_iova_t sw_rings_phys;  /* Physical address of sw_rings */
> +	/* Virtual address of the info memory routed to this function under
> +	 * operation, whether it is PF or VF.
> +	 */
> +	union acc100_harq_layout_data *harq_layout;
> +	uint32_t sw_ring_size;
>  	uint32_t ddr_size; /* Size in kB */
> +	uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */
> +	rte_iova_t tail_ptr_phys; /* Physical address of tail pointers */
> +	/* Max number of entries available for each queue in device, depending
> +	 * on how many queues are enabled with configure()
> +	 */
> +	uint32_t sw_ring_max_depth;
>  	struct acc100_conf acc100_conf; /* ACC100 Initial configuration */
> +	/* Bitmap capturing which Queues have already been assigned */
> +	uint16_t q_assigned_bit_map[ACC100_NUM_QGRPS];
>  	bool pf_device; /**< True if this is a PF ACC100 device */
>  	bool configured; /**< True if this ACC100 device is configured */
>  };
> 


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v9 07/10] baseband/acc100: add support for 4G processing
  2020-09-30 19:10         ` Chautru, Nicolas
@ 2020-10-01 15:42           ` Tom Rix
  2020-10-01 21:46             ` Chautru, Nicolas
  0 siblings, 1 reply; 213+ messages in thread
From: Tom Rix @ 2020-10-01 15:42 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, akhil.goyal
  Cc: Richardson, Bruce, Xu, Rosen, dave.burley, aidan.goddard, Yigit,
	Ferruh, Liu, Tianjiao


On 9/30/20 12:10 PM, Chautru, Nicolas wrote:
> Hi Tom, 
>
>> From: Tom Rix <trix@redhat.com>
>> On 9/28/20 5:29 PM, Nicolas Chautru wrote:
>>> Adding capability for 4G encode and decoder processing
>>>
>>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
>>> Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
>>> ---
>>>  doc/guides/bbdevs/features/acc100.ini    |    4 +-
>>>  drivers/baseband/acc100/rte_acc100_pmd.c | 1010
>>> ++++++++++++++++++++++++++++--
>>>  2 files changed, 945 insertions(+), 69 deletions(-)
>>>
>>> diff --git a/doc/guides/bbdevs/features/acc100.ini
>>> b/doc/guides/bbdevs/features/acc100.ini
>>> index 40c7adc..642cd48 100644
>>> --- a/doc/guides/bbdevs/features/acc100.ini
>>> +++ b/doc/guides/bbdevs/features/acc100.ini
>>> @@ -4,8 +4,8 @@
>>>  ; Refer to default.ini for the full list of available PMD features.
>>>  ;
>>>  [Features]
>>> -Turbo Decoder (4G)     = N
>>> -Turbo Encoder (4G)     = N
>>> +Turbo Decoder (4G)     = Y
>>> +Turbo Encoder (4G)     = Y
>>>  LDPC Decoder (5G)      = Y
>>>  LDPC Encoder (5G)      = Y
>>>  LLR/HARQ Compression   = Y
>>> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
>>> b/drivers/baseband/acc100/rte_acc100_pmd.c
>>> index e484c0a..7d4c3df 100644
>>> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
>>> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
>>> @@ -339,7 +339,6 @@
>>>  	free_base_addresses(base_addrs, i);
>>>  }
>>>
>>> -
>>>  /* Allocate 64MB memory used for all software rings */  static int
>>> acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int
>>> socket_id) @@ -637,6 +636,41 @@
>>>
>>>  	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
>>>  		{
>>> +			.type = RTE_BBDEV_OP_TURBO_DEC,
>>> +			.cap.turbo_dec = {
>>> +				.capability_flags =
>>> +					RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE |
>>> +					RTE_BBDEV_TURBO_CRC_TYPE_24B |
>>> +					RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
>>> +					RTE_BBDEV_TURBO_EARLY_TERMINATION |
>>> +					RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
>>> +					RTE_BBDEV_TURBO_MAP_DEC |
>>> +					RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP |
>>> +					RTE_BBDEV_TURBO_DEC_SCATTER_GATHER,
>>> +				.max_llr_modulus = INT8_MAX,
>>> +				.num_buffers_src =
>>> +					RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
>>> +				.num_buffers_hard_out =
>>> +					RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
>>> +				.num_buffers_soft_out =
>>> +					RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
>>> +			}
>>> +		},
>>> +		{
>>> +			.type = RTE_BBDEV_OP_TURBO_ENC,
>>> +			.cap.turbo_enc = {
>>> +				.capability_flags =
>>> +					RTE_BBDEV_TURBO_CRC_24B_ATTACH |
>>> +					RTE_BBDEV_TURBO_RV_INDEX_BYPASS |
>>> +					RTE_BBDEV_TURBO_RATE_MATCH |
>>> +					RTE_BBDEV_TURBO_ENC_SCATTER_GATHER,
>>> +				.num_buffers_src =
>>> +					RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
>>> +				.num_buffers_dst =
>>> +					RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
>>> +			}
>>> +		},
>>> +		{
>>>  			.type   = RTE_BBDEV_OP_LDPC_ENC,
>>>  			.cap.ldpc_enc = {
>>>  				.capability_flags =
>>> @@ -719,7 +753,6 @@
>>>  #endif
>>>  }
>>>
>>> -
>>>  static const struct rte_bbdev_ops acc100_bbdev_ops = {
>>>  	.setup_queues = acc100_setup_queues,
>>>  	.close = acc100_dev_close,
>>> @@ -763,6 +796,58 @@
>>>  	return tail;
>>>  }
>>>
>>> +/* Fill in a frame control word for turbo encoding. */ static inline
>>> +void acc100_fcw_te_fill(const struct rte_bbdev_enc_op *op, struct
>>> +acc100_fcw_te *fcw) {
>>> +	fcw->code_block_mode = op->turbo_enc.code_block_mode;
>>> +	if (fcw->code_block_mode == 0) { /* For TB mode */
>>> +		fcw->k_neg = op->turbo_enc.tb_params.k_neg;
>>> +		fcw->k_pos = op->turbo_enc.tb_params.k_pos;
>>> +		fcw->c_neg = op->turbo_enc.tb_params.c_neg;
>>> +		fcw->c = op->turbo_enc.tb_params.c;
>>> +		fcw->ncb_neg = op->turbo_enc.tb_params.ncb_neg;
>>> +		fcw->ncb_pos = op->turbo_enc.tb_params.ncb_pos;
>>> +
>>> +		if (check_bit(op->turbo_enc.op_flags,
>>> +				RTE_BBDEV_TURBO_RATE_MATCH)) {
>>> +			fcw->bypass_rm = 0;
>>> +			fcw->cab = op->turbo_enc.tb_params.cab;
>>> +			fcw->ea = op->turbo_enc.tb_params.ea;
>>> +			fcw->eb = op->turbo_enc.tb_params.eb;
>>> +		} else {
>>> +			/* E is set to the encoding output size when RM is
>>> +			 * bypassed.
>>> +			 */
>>> +			fcw->bypass_rm = 1;
>>> +			fcw->cab = fcw->c_neg;
>>> +			fcw->ea = 3 * fcw->k_neg + 12;
>>> +			fcw->eb = 3 * fcw->k_pos + 12;
>>> +		}
>>> +	} else { /* For CB mode */
>>> +		fcw->k_pos = op->turbo_enc.cb_params.k;
>>> +		fcw->ncb_pos = op->turbo_enc.cb_params.ncb;
>>> +
>>> +		if (check_bit(op->turbo_enc.op_flags,
>>> +				RTE_BBDEV_TURBO_RATE_MATCH)) {
>>> +			fcw->bypass_rm = 0;
>>> +			fcw->eb = op->turbo_enc.cb_params.e;
>>> +		} else {
>>> +			/* E is set to the encoding output size when RM is
>>> +			 * bypassed.
>>> +			 */
>>> +			fcw->bypass_rm = 1;
>>> +			fcw->eb = 3 * fcw->k_pos + 12;
>>> +		}
>>> +	}
>>> +
>>> +	fcw->bypass_rv_idx1 = check_bit(op->turbo_enc.op_flags,
>>> +			RTE_BBDEV_TURBO_RV_INDEX_BYPASS);
>>> +	fcw->code_block_crc = check_bit(op->turbo_enc.op_flags,
>>> +			RTE_BBDEV_TURBO_CRC_24B_ATTACH);
>>> +	fcw->rv_idx1 = op->turbo_enc.rv_index; }
>>> +
>>>  /* Compute value of k0.
>>>   * Based on 3GPP 38.212 Table 5.4.2.1-2
>>>   * Starting position of different redundancy versions, k0 @@ -813,6
>>> +898,25 @@
>>>  	fcw->mcb_count = num_cb;
>>>  }
>>>
>>> +/* Fill in a frame control word for turbo decoding. */ static inline
>>> +void acc100_fcw_td_fill(const struct rte_bbdev_dec_op *op, struct
>>> +acc100_fcw_td *fcw) {
>>> +	/* Note : Early termination is always enabled for 4GUL */
>>> +	fcw->fcw_ver = 1;
>>> +	if (op->turbo_dec.code_block_mode == 0)
>>> +		fcw->k_pos = op->turbo_dec.tb_params.k_pos;
>>> +	else
>>> +		fcw->k_pos = op->turbo_dec.cb_params.k;
>>> +	fcw->turbo_crc_type = check_bit(op->turbo_dec.op_flags,
>>> +			RTE_BBDEV_TURBO_CRC_TYPE_24B);
>>> +	fcw->bypass_sb_deint = 0;
>>> +	fcw->raw_decoder_input_on = 0;
>>> +	fcw->max_iter = op->turbo_dec.iter_max;
>>> +	fcw->half_iter_on = !check_bit(op->turbo_dec.op_flags,
>>> +			RTE_BBDEV_TURBO_HALF_ITERATION_EVEN);
>>> +}
>>> +
>>>  /* Fill in a frame control word for LDPC decoding. */  static inline
>>> void  acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct
>>> acc100_fcw_ld *fcw, @@ -1042,6 +1146,87 @@  }
>>>
>>>  static inline int
>>> +acc100_dma_desc_te_fill(struct rte_bbdev_enc_op *op,
>>> +		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
>>> +		struct rte_mbuf *output, uint32_t *in_offset,
>>> +		uint32_t *out_offset, uint32_t *out_length,
>>> +		uint32_t *mbuf_total_left, uint32_t *seg_total_left, uint8_t r)
>>> +{
>>> +	int next_triplet = 1; /* FCW already done */
>>> +	uint32_t e, ea, eb, length;
>>> +	uint16_t k, k_neg, k_pos;
>>> +	uint8_t cab, c_neg;
>>> +
>>> +	desc->word0 = ACC100_DMA_DESC_TYPE;
>>> +	desc->word1 = 0; /**< Timestamp could be disabled */
>>> +	desc->word2 = 0;
>>> +	desc->word3 = 0;
>>> +	desc->numCBs = 1;
>>> +
>>> +	if (op->turbo_enc.code_block_mode == 0) {
>>> +		ea = op->turbo_enc.tb_params.ea;
>>> +		eb = op->turbo_enc.tb_params.eb;
>>> +		cab = op->turbo_enc.tb_params.cab;
>>> +		k_neg = op->turbo_enc.tb_params.k_neg;
>>> +		k_pos = op->turbo_enc.tb_params.k_pos;
>>> +		c_neg = op->turbo_enc.tb_params.c_neg;
>>> +		e = (r < cab) ? ea : eb;
>>> +		k = (r < c_neg) ? k_neg : k_pos;
>>> +	} else {
>>> +		e = op->turbo_enc.cb_params.e;
>>> +		k = op->turbo_enc.cb_params.k;
>>> +	}
>>> +
>>> +	if (check_bit(op->turbo_enc.op_flags,
>>> +			RTE_BBDEV_TURBO_CRC_24B_ATTACH))
>>> +		length = (k - 24) >> 3;
>>> +	else
>>> +		length = k >> 3;
>>> +
>>> +	if (unlikely((*mbuf_total_left == 0) || (*mbuf_total_left < length))) {
>> similar to other patches, this check can be combined to <=
>>
>> change generally
> same comment on other patch
>
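[Editorial note: the reviewer's point can be shown in isolation. For unsigned values and a non-zero required length, `(left == 0) || (left < length)` is equivalent to the single comparison `left < length`; the helper below is a hypothetical standalone sketch, not driver code.]

```c
#include <stdbool.h>
#include <stdint.h>

/* For unsigned left and length > 0, left == 0 implies left < length,
 * so the disjunction collapses to one comparison.
 */
static inline bool
mbuf_too_short(uint32_t mbuf_total_left, uint32_t length)
{
	return mbuf_total_left < length;
}
```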
>>> +		rte_bbdev_log(ERR,
>>> +				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
>>> +				*mbuf_total_left, length);
>>> +		return -1;
>>> +	}
>>> +
>>> +	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
>>> +			length, seg_total_left, next_triplet);
>>> +	if (unlikely(next_triplet < 0)) {
>>> +		rte_bbdev_log(ERR,
>>> +				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
>>> +				op);
>>> +		return -1;
>>> +	}
>>> +	desc->data_ptrs[next_triplet - 1].last = 1;
>>> +	desc->m2dlen = next_triplet;
>>> +	*mbuf_total_left -= length;
>>> +
>>> +	/* Set output length */
>>> +	if (check_bit(op->turbo_enc.op_flags,
>>> +			RTE_BBDEV_TURBO_RATE_MATCH))
>>> +		/* Integer round up division by 8 */
>>> +		*out_length = (e + 7) >> 3;
>>> +	else
>>> +		*out_length = (k >> 3) * 3 + 2;
>>> +
>>> +	next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
>>> +			*out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
>>> +	if (unlikely(next_triplet < 0)) {
>>> +		rte_bbdev_log(ERR,
>>> +				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
>>> +				op);
>>> +		return -1;
>>> +	}
>>> +	op->turbo_enc.output.length += *out_length;
>>> +	*out_offset += *out_length;
>>> +	desc->data_ptrs[next_triplet - 1].last = 1;
>>> +	desc->d2mlen = next_triplet - desc->m2dlen;
>>> +
>>> +	desc->op_addr = op;
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static inline int
>>>  acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
>>>  		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
>>>  		struct rte_mbuf *output, uint32_t *in_offset, @@ -1110,6
>> +1295,117
>>> @@  }
>>>
>>>  static inline int
>>> +acc100_dma_desc_td_fill(struct rte_bbdev_dec_op *op,
>>> +		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
>>> +		struct rte_mbuf *h_output, struct rte_mbuf *s_output,
>>> +		uint32_t *in_offset, uint32_t *h_out_offset,
>>> +		uint32_t *s_out_offset, uint32_t *h_out_length,
>>> +		uint32_t *s_out_length, uint32_t *mbuf_total_left,
>>> +		uint32_t *seg_total_left, uint8_t r) {
>>> +	int next_triplet = 1; /* FCW already done */
>>> +	uint16_t k;
>>> +	uint16_t crc24_overlap = 0;
>>> +	uint32_t e, kw;
>>> +
>>> +	desc->word0 = ACC100_DMA_DESC_TYPE;
>>> +	desc->word1 = 0; /**< Timestamp could be disabled */
>>> +	desc->word2 = 0;
>>> +	desc->word3 = 0;
>>> +	desc->numCBs = 1;
>>> +
>>> +	if (op->turbo_dec.code_block_mode == 0) {
>>> +		k = (r < op->turbo_dec.tb_params.c_neg)
>>> +			? op->turbo_dec.tb_params.k_neg
>>> +			: op->turbo_dec.tb_params.k_pos;
>>> +		e = (r < op->turbo_dec.tb_params.cab)
>>> +			? op->turbo_dec.tb_params.ea
>>> +			: op->turbo_dec.tb_params.eb;
>>> +	} else {
>>> +		k = op->turbo_dec.cb_params.k;
>>> +		e = op->turbo_dec.cb_params.e;
>>> +	}
>>> +
>>> +	if ((op->turbo_dec.code_block_mode == 0)
>>> +		&& !check_bit(op->turbo_dec.op_flags,
>>> +		RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP))
>>> +		crc24_overlap = 24;
>>> +
>>> +	/* Calculates circular buffer size.
>>> +	 * According to 3gpp 36.212 section 5.1.4.2
>>> +	 *   Kw = 3 * Kpi,
>>> +	 * where:
>>> +	 *   Kpi = nCol * nRow
>>> +	 * where nCol is 32 and nRow can be calculated from:
>>> +	 *   D =< nCol * nRow
>>> +	 * where D is the size of each output from turbo encoder block (k +
>> 4).
>>> +	 */
>>> +	kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
>>> +
>>> +	if (unlikely((*mbuf_total_left == 0) || (*mbuf_total_left < kw))) {
>>> +		rte_bbdev_log(ERR,
>>> +				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
>>> +				*mbuf_total_left, kw);
>>> +		return -1;
>>> +	}
>>> +
>>> +	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset, kw,
>>> +			seg_total_left, next_triplet);
>>> +	if (unlikely(next_triplet < 0)) {
>>> +		rte_bbdev_log(ERR,
>>> +				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
>>> +				op);
>>> +		return -1;
>>> +	}
>>> +	desc->data_ptrs[next_triplet - 1].last = 1;
>>> +	desc->m2dlen = next_triplet;
>>> +	*mbuf_total_left -= kw;
>>> +
>>> +	next_triplet = acc100_dma_fill_blk_type_out(
>>> +			desc, h_output, *h_out_offset,
>>> +			k >> 3, next_triplet, ACC100_DMA_BLKID_OUT_HARD);
>>> +	if (unlikely(next_triplet < 0)) {
>>> +		rte_bbdev_log(ERR,
>>> +				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
>>> +				op);
>>> +		return -1;
>>> +	}
>>> +
>>> +	*h_out_length = ((k - crc24_overlap) >> 3);
>>> +	op->turbo_dec.hard_output.length += *h_out_length;
>>> +	*h_out_offset += *h_out_length;
>>> +
>>> +	/* Soft output */
>>> +	if (check_bit(op->turbo_dec.op_flags,
>>> +			RTE_BBDEV_TURBO_SOFT_OUTPUT)) {
>>> +		if (check_bit(op->turbo_dec.op_flags,
>>> +				RTE_BBDEV_TURBO_EQUALIZER))
>>> +			*s_out_length = e;
>>> +		else
>>> +			*s_out_length = (k * 3) + 12;
>>> +
>>> +		next_triplet = acc100_dma_fill_blk_type_out(desc, s_output,
>>> +				*s_out_offset, *s_out_length, next_triplet,
>>> +				ACC100_DMA_BLKID_OUT_SOFT);
>>> +		if (unlikely(next_triplet < 0)) {
>>> +			rte_bbdev_log(ERR,
>>> +					"Mismatch between data to process and mbuf data length in bbdev_op: %p",
>>> +					op);
>>> +			return -1;
>>> +		}
>>> +
>>> +		op->turbo_dec.soft_output.length += *s_out_length;
>>> +		*s_out_offset += *s_out_length;
>>> +	}
>>> +
>>> +	desc->data_ptrs[next_triplet - 1].last = 1;
>>> +	desc->d2mlen = next_triplet - desc->m2dlen;
>>> +
>>> +	desc->op_addr = op;
>>> +
>>> +	return 0;
>>> +}
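[Editorial note: the circular-buffer size computed in the function above, `kw = RTE_ALIGN_CEIL(k + 4, 32) * 3`, can be cross-checked against the 3GPP TS 36.212 derivation with a standalone sketch; `align_ceil` stands in for `RTE_ALIGN_CEIL` so the example has no DPDK dependency.]

```c
#include <stdint.h>

/* TS 36.212 5.1.4.2: Kw = 3 * Kpi with Kpi = nCol * nRow, nCol = 32
 * and nRow the smallest value satisfying D = k + 4 <= nCol * nRow;
 * i.e. round D up to a multiple of 32, then triple it.
 */
static inline uint32_t
align_ceil(uint32_t val, uint32_t align)
{
	return (val + align - 1) & ~(align - 1); /* align is a power of two */
}

static inline uint32_t
turbo_dec_kw(uint16_t k)
{
	return align_ceil((uint32_t)k + 4, 32) * 3;
}
```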
>>> +
>>> +static inline int
>>>  acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
>>>  		struct acc100_dma_req_desc *desc,
>>>  		struct rte_mbuf **input, struct rte_mbuf *h_output, @@ -
>> 1374,6
>>> +1670,57 @@
>>>
>>>  /* Enqueue one encode operations for ACC100 device in CB mode */
>>> static inline int
>>> +enqueue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
>>> +		uint16_t total_enqueued_cbs)
>>> +{
>>> +	union acc100_dma_desc *desc = NULL;
>>> +	int ret;
>>> +	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
>>> +		seg_total_left;
>>> +	struct rte_mbuf *input, *output_head, *output;
>>> +
>>> +	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
>>> +			& q->sw_ring_wrap_mask);
>>> +	desc = q->ring_addr + desc_idx;
>>> +	acc100_fcw_te_fill(op, &desc->req.fcw_te);
>>> +
>>> +	input = op->turbo_enc.input.data;
>>> +	output_head = output = op->turbo_enc.output.data;
>>> +	in_offset = op->turbo_enc.input.offset;
>>> +	out_offset = op->turbo_enc.output.offset;
>>> +	out_length = 0;
>>> +	mbuf_total_left = op->turbo_enc.input.length;
>>> +	seg_total_left = rte_pktmbuf_data_len(op->turbo_enc.input.data)
>>> +			- in_offset;
>>> +
>>> +	ret = acc100_dma_desc_te_fill(op, &desc->req, &input, output,
>>> +			&in_offset, &out_offset, &out_length,
>> &mbuf_total_left,
>>> +			&seg_total_left, 0);
>>> +
>>> +	if (unlikely(ret < 0))
>>> +		return ret;
>>> +
>>> +	mbuf_append(output_head, output, out_length);
>>> +
>>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
>>> +	rte_memdump(stderr, "FCW", &desc->req.fcw_te,
>>> +			sizeof(desc->req.fcw_te) - 8);
>>> +	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
>>> +
>>> +	/* Check if any data left after processing one CB */
>>> +	if (mbuf_total_left != 0) {
>>> +		rte_bbdev_log(ERR,
>>> +				"Some data still left after processing one CB: mbuf_total_left = %u",
>>> +				mbuf_total_left);
>>> +		return -EINVAL;
>>> +	}
>>> +#endif
>>> +	/* One CB (one op) was successfully prepared to enqueue */
>>> +	return 1;
>>> +}
>>> +
>>> +/* Enqueue one encode operations for ACC100 device in CB mode */
>>> +static inline int
>>> +enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
>>>  		uint16_t total_enqueued_cbs, int16_t num)  { @@ -1481,78
>> +1828,235
>>> @@
>>>  	return 1;
>>>  }
>>>
>>> -static inline int
>>> -harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
>>> -		uint16_t total_enqueued_cbs) {
>>> -	struct acc100_fcw_ld *fcw;
>>> -	union acc100_dma_desc *desc;
>>> -	int next_triplet = 1;
>>> -	struct rte_mbuf *hq_output_head, *hq_output;
>>> -	uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
>>> -	if (harq_in_length == 0) {
>>> -		rte_bbdev_log(ERR, "Loopback of invalid null size\n");
>>> -		return -EINVAL;
>>> -	}
>>>
>>> -	int h_comp = check_bit(op->ldpc_dec.op_flags,
>>> -			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
>>> -			) ? 1 : 0;
>>> -	if (h_comp == 1)
>>> -		harq_in_length = harq_in_length * 8 / 6;
>>> -	harq_in_length = RTE_ALIGN(harq_in_length, 64);
>>> -	uint16_t harq_dma_length_in = (h_comp == 0) ?
>>> -			harq_in_length :
>>> -			harq_in_length * 6 / 8;
>>> -	uint16_t harq_dma_length_out = harq_dma_length_in;
>>> -	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
>>> -			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
>>> -	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
>>> -	uint16_t harq_index = (ddr_mem_in ?
>>> -			op->ldpc_dec.harq_combined_input.offset :
>>> -			op->ldpc_dec.harq_combined_output.offset)
>>> -			/ ACC100_HARQ_OFFSET;
>>> +/* Enqueue one encode operations for ACC100 device in TB mode. */
>>> +static inline int enqueue_enc_one_op_tb(struct acc100_queue *q,
>>> +struct rte_bbdev_enc_op *op,
>>> +		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb) {
>>> +	union acc100_dma_desc *desc = NULL;
>>> +	int ret;
>>> +	uint8_t r, c;
>>> +	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
>>> +		seg_total_left;
>>> +	struct rte_mbuf *input, *output_head, *output;
>>> +	uint16_t current_enqueued_cbs = 0;
>>>
>>>  	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
>>>  			& q->sw_ring_wrap_mask);
>>>  	desc = q->ring_addr + desc_idx;
>>> -	fcw = &desc->req.fcw_ld;
>>> -	/* Set the FCW from loopback into DDR */
>>> -	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
>>> -	fcw->FCWversion = ACC100_FCW_VER;
>>> -	fcw->qm = 2;
>>> -	fcw->Zc = 384;
>>> -	if (harq_in_length < 16 * N_ZC_1)
>>> -		fcw->Zc = 16;
>>> -	fcw->ncb = fcw->Zc * N_ZC_1;
>>> -	fcw->rm_e = 2;
>>> -	fcw->hcin_en = 1;
>>> -	fcw->hcout_en = 1;
>>> +	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
>>> +	acc100_fcw_te_fill(op, &desc->req.fcw_te);
>>>
>>> -	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d\n",
>>> -			ddr_mem_in, harq_index,
>>> -			harq_layout[harq_index].offset, harq_in_length,
>>> -			harq_dma_length_in);
>>> +	input = op->turbo_enc.input.data;
>>> +	output_head = output = op->turbo_enc.output.data;
>>> +	in_offset = op->turbo_enc.input.offset;
>>> +	out_offset = op->turbo_enc.output.offset;
>>> +	out_length = 0;
>>> +	mbuf_total_left = op->turbo_enc.input.length;
>>>
>>> -	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
>>> -		fcw->hcin_size0 = harq_layout[harq_index].size0;
>>> -		fcw->hcin_offset = harq_layout[harq_index].offset;
>>> -		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
>>> -		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
>>> -		if (h_comp == 1)
>>> -			harq_dma_length_in = harq_dma_length_in * 6 / 8;
>>> -	} else {
>>> -		fcw->hcin_size0 = harq_in_length;
>>> -	}
>>> -	harq_layout[harq_index].val = 0;
>>> -	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
>>> -			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
>>> -	fcw->hcout_size0 = harq_in_length;
>>> -	fcw->hcin_decomp_mode = h_comp;
>>> -	fcw->hcout_comp_mode = h_comp;
>>> -	fcw->gain_i = 1;
>>> -	fcw->gain_h = 1;
>>> +	c = op->turbo_enc.tb_params.c;
>>> +	r = op->turbo_enc.tb_params.r;
>>>
>>> -	/* Set the prefix of descriptor. This could be done at polling */
>>> +	while (mbuf_total_left > 0 && r < c) {
>>> +		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
>>> +		/* Set up DMA descriptor */
>>> +		desc = q->ring_addr + ((q->sw_ring_head +
>> total_enqueued_cbs)
>>> +				& q->sw_ring_wrap_mask);
>>> +		desc->req.data_ptrs[0].address = q->ring_addr_phys +
>> fcw_offset;
>>> +		desc->req.data_ptrs[0].blen = ACC100_FCW_TE_BLEN;
>>> +
>>> +		ret = acc100_dma_desc_te_fill(op, &desc->req, &input,
>> output,
>>> +				&in_offset, &out_offset, &out_length,
>>> +				&mbuf_total_left, &seg_total_left, r);
>>> +		if (unlikely(ret < 0))
>>> +			return ret;
>>> +		mbuf_append(output_head, output, out_length);
>>> +
>>> +		/* Set total number of CBs in TB */
>>> +		desc->req.cbs_in_tb = cbs_in_tb;
>>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
>>> +		rte_memdump(stderr, "FCW", &desc->req.fcw_te,
>>> +				sizeof(desc->req.fcw_te) - 8);
>>> +		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
>> #endif
>>> +
>>> +		if (seg_total_left == 0) {
>>> +			/* Go to the next mbuf */
>>> +			input = input->next;
>>> +			in_offset = 0;
>>> +			output = output->next;
>>> +			out_offset = 0;
>>> +		}
>>> +
>>> +		total_enqueued_cbs++;
>>> +		current_enqueued_cbs++;
>>> +		r++;
>>> +	}
>>> +
>>> +	if (unlikely(desc == NULL))
>>> +		return current_enqueued_cbs;
>>> +
>>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
>>> +	/* Check if any CBs left for processing */
>>> +	if (mbuf_total_left != 0) {
>>> +		rte_bbdev_log(ERR,
>>> +				"Some data still left for processing: mbuf_total_left = %u",
>>> +				mbuf_total_left);
>>> +		return -EINVAL;
>>> +	}
>>> +#endif
>>> +
>>> +	/* Set SDone on last CB descriptor for TB mode. */
>>> +	desc->req.sdone_enable = 1;
>>> +	desc->req.irq_enable = q->irq_enable;
>>> +
>>> +	return current_enqueued_cbs;
>>> +}
>>> +
>>> +/** Enqueue one decode operations for ACC100 device in CB mode */
>>> +static inline int enqueue_dec_one_op_cb(struct acc100_queue *q,
>>> +struct rte_bbdev_dec_op *op,
>>> +		uint16_t total_enqueued_cbs)
>>> +{
>>> +	union acc100_dma_desc *desc = NULL;
>>> +	int ret;
>>> +	uint32_t in_offset, h_out_offset, s_out_offset, s_out_length,
>>> +		h_out_length, mbuf_total_left, seg_total_left;
>>> +	struct rte_mbuf *input, *h_output_head, *h_output,
>>> +		*s_output_head, *s_output;
>>> +
>>> +	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
>>> +			& q->sw_ring_wrap_mask);
>>> +	desc = q->ring_addr + desc_idx;
>>> +	acc100_fcw_td_fill(op, &desc->req.fcw_td);
>>> +
>>> +	input = op->turbo_dec.input.data;
>>> +	h_output_head = h_output = op->turbo_dec.hard_output.data;
>>> +	s_output_head = s_output = op->turbo_dec.soft_output.data;
>>> +	in_offset = op->turbo_dec.input.offset;
>>> +	h_out_offset = op->turbo_dec.hard_output.offset;
>>> +	s_out_offset = op->turbo_dec.soft_output.offset;
>>> +	h_out_length = s_out_length = 0;
>>> +	mbuf_total_left = op->turbo_dec.input.length;
>>> +	seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
>>> +
>>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
>>> +	if (unlikely(input == NULL)) {
>>> +		rte_bbdev_log(ERR, "Invalid mbuf pointer");
>>> +		return -EFAULT;
>>> +	}
>>> +#endif
>>> +
>>> +	/* Set up DMA descriptor */
>>> +	desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
>>> +			& q->sw_ring_wrap_mask);
>>> +
>>> +	ret = acc100_dma_desc_td_fill(op, &desc->req, &input, h_output,
>>> +			s_output, &in_offset, &h_out_offset, &s_out_offset,
>>> +			&h_out_length, &s_out_length, &mbuf_total_left,
>>> +			&seg_total_left, 0);
>>> +
>>> +	if (unlikely(ret < 0))
>>> +		return ret;
>>> +
>>> +	/* Hard output */
>>> +	mbuf_append(h_output_head, h_output, h_out_length);
>>> +
>>> +	/* Soft output */
>>> +	if (check_bit(op->turbo_dec.op_flags,
>>> +			RTE_BBDEV_TURBO_SOFT_OUTPUT))
>>> +		mbuf_append(s_output_head, s_output, s_out_length);
>>> +
>>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
>>> +	rte_memdump(stderr, "FCW", &desc->req.fcw_td,
>>> +			sizeof(desc->req.fcw_td) - 8);
>>> +	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
>>> +
>>> +	/* Check if any CBs left for processing */
>>> +	if (mbuf_total_left != 0) {
>>> +		rte_bbdev_log(ERR,
>>> +				"Some data still left after processing one CB: mbuf_total_left = %u",
>>> +				mbuf_total_left);
>>> +		return -EINVAL;
>>> +	}
>>> +#endif
>> logic similar to debug in mbuf_append, should be a common function.
> Not exactly, unless I am missing your point.

I look for code blocks that look similar and ask you to consider whether they can be combined into a general function or macro. A general function is easier to maintain. In this case, the logging when something is wrong with the mbuf looks similar to an earlier block of code.

Nothing is functionally wrong.
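[Editorial note: one way to realize the reviewer's suggestion is to hoist the repeated leftover-data check and log into a helper shared by the CB and TB paths. This is a sketch only; `acc100_check_mbuf_total_left` is a hypothetical name, not part of the posted patch, and the driver would use rte_bbdev_log(ERR, ...) instead of fprintf.]

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical shared helper: returns 0 when all mbuf data has been
 * consumed, logs and returns -EINVAL otherwise.
 */
static inline int
acc100_check_mbuf_total_left(uint32_t mbuf_total_left)
{
	if (mbuf_total_left == 0)
		return 0;
	fprintf(stderr,
		"Some data still left for processing: mbuf_total_left = %u\n",
		mbuf_total_left);
	return -EINVAL;
}
```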

>
>>> +
>>> +	/* One CB (one op) was successfully prepared to enqueue */
>>> +	return 1;
>>> +}
>>> +
>>> +static inline int
>>> +harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
>>> +		uint16_t total_enqueued_cbs) {
>>> +	struct acc100_fcw_ld *fcw;
>>> +	union acc100_dma_desc *desc;
>>> +	int next_triplet = 1;
>>> +	struct rte_mbuf *hq_output_head, *hq_output;
>>> +	uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
>>> +	if (harq_in_length == 0) {
>>> +		rte_bbdev_log(ERR, "Loopback of invalid null size\n");
>>> +		return -EINVAL;
>>> +	}
>>> +
>>> +	int h_comp = check_bit(op->ldpc_dec.op_flags,
>>> +			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
>>> +			) ? 1 : 0;
>>> +	if (h_comp == 1)
>>> +		harq_in_length = harq_in_length * 8 / 6;
>>> +	harq_in_length = RTE_ALIGN(harq_in_length, 64);
>>> +	uint16_t harq_dma_length_in = (h_comp == 0) ?
>> Can these h_comp checks be combined to a single if/else ?
> it may be clearer, ok.
>
>
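[Editorial note: a sketch of how the scattered `h_comp` tests discussed above could fold into a single branch. This is an illustrative helper, not the patch's code; it assumes 6-bit compression stores the HARQ payload at 6/8 density, so the 8/6 expansion, 64-byte alignment, and DMA length are derived in one place.]

```c
#include <stdint.h>

/* Illustrative fold of the h_comp checks: expand a 6-bit-compressed
 * length by 8/6, align to 64 bytes, and derive the DMA length at 6/8
 * density; uncompressed lengths are only aligned.
 */
static inline void
harq_lengths(uint16_t harq_in_length, int h_comp,
		uint16_t *aligned_len, uint16_t *dma_len)
{
	uint32_t len = harq_in_length;

	if (h_comp) {
		len = len * 8 / 6;
		len = (len + 63) & ~63u; /* RTE_ALIGN(len, 64) */
		*dma_len = len * 6 / 8;
	} else {
		len = (len + 63) & ~63u;
		*dma_len = len;
	}
	*aligned_len = len;
}
```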
>>> +			harq_in_length :
>>> +			harq_in_length * 6 / 8;
>>> +	uint16_t harq_dma_length_out = harq_dma_length_in;
>>> +	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
>>> +			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
>>> +	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
>>> +	uint16_t harq_index = (ddr_mem_in ?
>>> +			op->ldpc_dec.harq_combined_input.offset :
>>> +			op->ldpc_dec.harq_combined_output.offset)
>>> +			/ ACC100_HARQ_OFFSET;
>>> +
>>> +	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
>>> +			& q->sw_ring_wrap_mask);
>>> +	desc = q->ring_addr + desc_idx;
>>> +	fcw = &desc->req.fcw_ld;
>>> +	/* Set the FCW from loopback into DDR */
>>> +	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
>>> +	fcw->FCWversion = ACC100_FCW_VER;
>>> +	fcw->qm = 2;
>>> +	fcw->Zc = 384;
>> these magic numbers should have #defines
> These are not magic numbers, but actually 3GPP values
ok
>
>>> +	if (harq_in_length < 16 * N_ZC_1)
>>> +		fcw->Zc = 16;
>>> +	fcw->ncb = fcw->Zc * N_ZC_1;
>>> +	fcw->rm_e = 2;
>>> +	fcw->hcin_en = 1;
>>> +	fcw->hcout_en = 1;
>>> +
>>> +	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d\n",
>>> +			ddr_mem_in, harq_index,
>>> +			harq_layout[harq_index].offset, harq_in_length,
>>> +			harq_dma_length_in);
>>> +
>>> +	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
>>> +		fcw->hcin_size0 = harq_layout[harq_index].size0;
>>> +		fcw->hcin_offset = harq_layout[harq_index].offset;
>>> +		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
>>> +		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
>>> +		if (h_comp == 1)
>>> +			harq_dma_length_in = harq_dma_length_in * 6 / 8;
>>> +	} else {
>>> +		fcw->hcin_size0 = harq_in_length;
>>> +	}
>>> +	harq_layout[harq_index].val = 0;
>>> +	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
>>> +			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
>>> +	fcw->hcout_size0 = harq_in_length;
>>> +	fcw->hcin_decomp_mode = h_comp;
>>> +	fcw->hcout_comp_mode = h_comp;
>>> +	fcw->gain_i = 1;
>>> +	fcw->gain_h = 1;
>>> +
>>> +	/* Set the prefix of descriptor. This could be done at polling */
>>>  	desc->req.word0 = ACC100_DMA_DESC_TYPE;
>>>  	desc->req.word1 = 0; /**< Timestamp could be disabled */
>>>  	desc->req.word2 = 0;
>>> @@ -1816,6 +2320,107 @@
>>>  	return current_enqueued_cbs;
>>>  }
>>>
>>> +/* Enqueue one decode operation for ACC100 device in TB mode */
>>> +static inline int
>>> +enqueue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
>>> +		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
>>> +{
>>> +	union acc100_dma_desc *desc = NULL;
>>> +	int ret;
>>> +	uint8_t r, c;
>>> +	uint32_t in_offset, h_out_offset, s_out_offset, s_out_length,
>>> +		h_out_length, mbuf_total_left, seg_total_left;
>>> +	struct rte_mbuf *input, *h_output_head, *h_output,
>>> +		*s_output_head, *s_output;
>>> +	uint16_t current_enqueued_cbs = 0;
>>> +
>>> +	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
>>> +			& q->sw_ring_wrap_mask);
>>> +	desc = q->ring_addr + desc_idx;
>>> +	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
>>> +	acc100_fcw_td_fill(op, &desc->req.fcw_td);
>>> +
>>> +	input = op->turbo_dec.input.data;
>>> +	h_output_head = h_output = op->turbo_dec.hard_output.data;
>>> +	s_output_head = s_output = op->turbo_dec.soft_output.data;
>>> +	in_offset = op->turbo_dec.input.offset;
>>> +	h_out_offset = op->turbo_dec.hard_output.offset;
>>> +	s_out_offset = op->turbo_dec.soft_output.offset;
>>> +	h_out_length = s_out_length = 0;
>>> +	mbuf_total_left = op->turbo_dec.input.length;
>>> +	c = op->turbo_dec.tb_params.c;
>>> +	r = op->turbo_dec.tb_params.r;
>>> +
>>> +	while (mbuf_total_left > 0 && r < c) {
>>> +
>>> +		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
>>> +
>>> +		/* Set up DMA descriptor */
>>> +		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
>>> +				& q->sw_ring_wrap_mask);
>>> +		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
>>> +		desc->req.data_ptrs[0].blen = ACC100_FCW_TD_BLEN;
>>> +		ret = acc100_dma_desc_td_fill(op, &desc->req, &input,
>>> +				h_output, s_output, &in_offset, &h_out_offset,
>>> +				&s_out_offset, &h_out_length, &s_out_length,
>>> +				&mbuf_total_left, &seg_total_left, r);
>>> +
>>> +		if (unlikely(ret < 0))
>>> +			return ret;
>>> +
>>> +		/* Hard output */
>>> +		mbuf_append(h_output_head, h_output, h_out_length);
>>> +
>>> +		/* Soft output */
>>> +		if (check_bit(op->turbo_dec.op_flags,
>>> +				RTE_BBDEV_TURBO_SOFT_OUTPUT))
>>> +			mbuf_append(s_output_head, s_output, s_out_length);
>>> +
>>> +		/* Set total number of CBs in TB */
>>> +		desc->req.cbs_in_tb = cbs_in_tb;
>>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
>>> +		rte_memdump(stderr, "FCW", &desc->req.fcw_td,
>>> +				sizeof(desc->req.fcw_td) - 8);
>>> +		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
>>> +#endif
>>> +
>>> +		if (seg_total_left == 0) {
>>> +			/* Go to the next mbuf */
>>> +			input = input->next;
>>> +			in_offset = 0;
>>> +			h_output = h_output->next;
>>> +			h_out_offset = 0;
>>> +
>>> +			if (check_bit(op->turbo_dec.op_flags,
>>> +					RTE_BBDEV_TURBO_SOFT_OUTPUT)) {
>>> +				s_output = s_output->next;
>>> +				s_out_offset = 0;
>>> +			}
>>> +		}
>>> +
>>> +		total_enqueued_cbs++;
>>> +		current_enqueued_cbs++;
>>> +		r++;
>>> +	}
>>> +
>>> +	if (unlikely(desc == NULL))
>>> +		return current_enqueued_cbs;
>>> +
>>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
>>> +	/* Check if any CBs left for processing */
>>> +	if (mbuf_total_left != 0) {
>>> +		rte_bbdev_log(ERR,
>>> +				"Some data still left for processing: mbuf_total_left = %u",
>>> +				mbuf_total_left);
>>> +		return -EINVAL;
>>> +	}
>>> +#endif
>>> +	/* Set SDone on last CB descriptor for TB mode */
>>> +	desc->req.sdone_enable = 1;
>>> +	desc->req.irq_enable = q->irq_enable;
>>> +
>>> +	return current_enqueued_cbs;
>>> +}
>>>
>>>  /* Calculates number of CBs in processed encoder TB based on 'r' and input
>>>   * length.
>>> @@ -1893,6 +2498,45 @@
>>>  	return cbs_in_tb;
>>>  }
>>>
>>> +/* Enqueue encode operations for ACC100 device in CB mode. */
>>> +static uint16_t
>>> +acc100_enqueue_enc_cb(struct rte_bbdev_queue_data *q_data,
>>> +		struct rte_bbdev_enc_op **ops, uint16_t num)
>>> +{
>>> +	struct acc100_queue *q = q_data->queue_private;
>>> +	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
>>> +	uint16_t i;
>>> +	union acc100_dma_desc *desc;
>>> +	int ret;
>>> +
>>> +	for (i = 0; i < num; ++i) {
>>> +		/* Check if there is available space for further processing */
>>> +		if (unlikely(avail - 1 < 0))
>>> +			break;
>>> +		avail -= 1;
>>> +
>>> +		ret = enqueue_enc_one_op_cb(q, ops[i], i);
>>> +		if (ret < 0)
>>> +			break;
>>> +	}
>>> +
>>> +	if (unlikely(i == 0))
>>> +		return 0; /* Nothing to enqueue */
>>> +
>>> +	/* Set SDone in last CB in enqueued ops for CB mode*/
>>> +	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
>>> +			& q->sw_ring_wrap_mask);
>>> +	desc->req.sdone_enable = 1;
>>> +	desc->req.irq_enable = q->irq_enable;
>>> +
>>> +	acc100_dma_enqueue(q, i, &q_data->queue_stats);
>>> +
>>> +	/* Update stats */
>>> +	q_data->queue_stats.enqueued_count += i;
>>> +	q_data->queue_stats.enqueue_err_count += num - i;
>>> +	return i;
>>> +}
>>> +
>>>  /* Check we can mux encode operations with common FCW */
>>>  static inline bool
>>>  check_mux(struct rte_bbdev_enc_op **ops, uint16_t num)
>>>  {
>>> @@ -1960,6 +2604,52 @@
>>>  	return i;
>>>  }
>>>
>>> +/* Enqueue encode operations for ACC100 device in TB mode. */
>>> +static uint16_t
>>> +acc100_enqueue_enc_tb(struct rte_bbdev_queue_data *q_data,
>>> +		struct rte_bbdev_enc_op **ops, uint16_t num)
>>> +{
>>> +	struct acc100_queue *q = q_data->queue_private;
>>> +	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
>>> +	uint16_t i, enqueued_cbs = 0;
>>> +	uint8_t cbs_in_tb;
>>> +	int ret;
>>> +
>>> +	for (i = 0; i < num; ++i) {
>>> +		cbs_in_tb = get_num_cbs_in_tb_enc(&ops[i]->turbo_enc);
>>> +		/* Check if there is available space for further processing */
>>> +		if (unlikely(avail - cbs_in_tb < 0))
>>> +			break;
>>> +		avail -= cbs_in_tb;
>>> +
>>> +		ret = enqueue_enc_one_op_tb(q, ops[i], enqueued_cbs,
>>> +				cbs_in_tb);
>>> +		if (ret < 0)
>>> +			break;
>>> +		enqueued_cbs += ret;
>>> +	}
>>> +
>> other similar functions have a (i == 0) check here.
> ok
>
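A sketch of what that agreed early exit could look like, mirroring the CB-mode enqueue (standalone form with a simplified stats struct standing in for the driver's, so an empty burst skips the doorbell write and stats update):

```c
#include <stdint.h>

struct stats { uint64_t enqueued_count; uint64_t enqueue_err_count; };

/* Sketch: tail of the TB-mode enqueue with the (i == 0) guard added.
 * Returns the number of ops accepted; on zero, nothing is kicked.
 */
static uint16_t
enqueue_finish(uint16_t i, uint16_t num, struct stats *st)
{
	if (i == 0)
		return 0; /* nothing enqueued: no DMA kick, no stats */
	/* acc100_dma_enqueue(q, enqueued_cbs, st) would run here */
	st->enqueued_count += i;
	st->enqueue_err_count += num - i;
	return i;
}
```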
>>> +	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
>>> +
>>> +	/* Update stats */
>>> +	q_data->queue_stats.enqueued_count += i;
>>> +	q_data->queue_stats.enqueue_err_count += num - i;
>>> +
>>> +	return i;
>>> +}
>>> +
>>> +/* Enqueue encode operations for ACC100 device. */
>>> +static uint16_t
>>> +acc100_enqueue_enc(struct rte_bbdev_queue_data *q_data,
>>> +		struct rte_bbdev_enc_op **ops, uint16_t num)
>>> +{
>>> +	if (unlikely(num == 0))
>>> +		return 0;
>> num == 0 check should move into the tb/cb functions
> same comment on other patch, why not catch it early?
>
>>> +	if (ops[0]->turbo_enc.code_block_mode == 0)
>>> +		return acc100_enqueue_enc_tb(q_data, ops, num);
>>> +	else
>>> +		return acc100_enqueue_enc_cb(q_data, ops, num);
>>> +}
>>> +
>>>  /* Enqueue encode operations for ACC100 device. */  static uint16_t
>>> acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data, @@
>>> -1967,7 +2657,51 @@  {
>>>  	if (unlikely(num == 0))
>>>  		return 0;
>>> -	return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
>>> +	if (ops[0]->ldpc_enc.code_block_mode == 0)
>>> +		return acc100_enqueue_enc_tb(q_data, ops, num);
>>> +	else
>>> +		return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
>>> +}
>>> +
>>> +
>>> +/* Enqueue decode operations for ACC100 device in CB mode */
>>> +static uint16_t
>>> +acc100_enqueue_dec_cb(struct rte_bbdev_queue_data *q_data,
>>> +		struct rte_bbdev_dec_op **ops, uint16_t num)
>>> +{
>> Seems like the 10th variant of a similar function could these be combined to
>> fewer functions ?
>>
>> Maybe by passing in a function pointer to the enqueue_one_dec_one* that
>> does the work ?
> They have some variants related to the actual operation and constraints.
> Not obvious to have a valuable refactor. 
>
As above nothing functionally wrong, just something to consider

ok.

Tom
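For what it's worth, the function-pointer idea could be sketched like this (simplified placeholder types, not the driver's own; whether the variants share enough per-op logic to make it worthwhile is the open question):

```c
#include <stddef.h>
#include <stdint.h>

/* Per-op worker: fills one descriptor, returns < 0 on failure */
typedef int (*enq_one_fn)(void *q, void *op, uint16_t desc_idx);

/* Sketch: one generic CB-mode enqueue loop shared by the variants.
 * Returns the count of accepted ops; the caller sets SDone on
 * descriptor i - 1 and rings the doorbell.
 */
static uint16_t
enqueue_cb_generic(void *q, void **ops, uint16_t num, int32_t avail,
		enq_one_fn enq_one)
{
	uint16_t i;

	for (i = 0; i < num; ++i) {
		if (avail - 1 < 0)
			break;		/* ring full */
		avail -= 1;
		if (enq_one(q, ops[i], i) < 0)
			break;		/* per-op setup failed */
	}
	return i;
}

/* Illustrative worker: rejects ops from index 3 on */
static int
enq_one_stub(void *q, void *op, uint16_t desc_idx)
{
	(void)q; (void)op;
	return desc_idx < 3 ? 0 : -1;
}
```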

>>> +	struct acc100_queue *q = q_data->queue_private;
>>> +	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
>>> +	uint16_t i;
>>> +	union acc100_dma_desc *desc;
>>> +	int ret;
>>> +
>>> +	for (i = 0; i < num; ++i) {
>>> +		/* Check if there is available space for further processing */
>>> +		if (unlikely(avail - 1 < 0))
>>> +			break;
>>> +		avail -= 1;
>>> +
>>> +		ret = enqueue_dec_one_op_cb(q, ops[i], i);
>>> +		if (ret < 0)
>>> +			break;
>>> +	}
>>> +
>>> +	if (unlikely(i == 0))
>>> +		return 0; /* Nothing to enqueue */
>>> +
>>> +	/* Set SDone in last CB in enqueued ops for CB mode*/
>>> +	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
>>> +			& q->sw_ring_wrap_mask);
>>> +	desc->req.sdone_enable = 1;
>>> +	desc->req.irq_enable = q->irq_enable;
>>> +
>>> +	acc100_dma_enqueue(q, i, &q_data->queue_stats);
>>> +
>>> +	/* Update stats */
>>> +	q_data->queue_stats.enqueued_count += i;
>>> +	q_data->queue_stats.enqueue_err_count += num - i;
>>> +
>>> +	return i;
>>>  }
>>>
>>>  /* Check we can mux encode operations with common FCW */ @@ -
>> 2065,6
>>> +2799,53 @@
>>>  	return i;
>>>  }
>>>
>>> +
>>> +/* Enqueue decode operations for ACC100 device in TB mode */
>>> +static uint16_t
>>> +acc100_enqueue_dec_tb(struct rte_bbdev_queue_data *q_data,
>>> +		struct rte_bbdev_dec_op **ops, uint16_t num)
>>> +{
>> 11th ;)
>>> +	struct acc100_queue *q = q_data->queue_private;
>>> +	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
>>> +	uint16_t i, enqueued_cbs = 0;
>>> +	uint8_t cbs_in_tb;
>>> +	int ret;
>>> +
>>> +	for (i = 0; i < num; ++i) {
>>> +		cbs_in_tb = get_num_cbs_in_tb_dec(&ops[i]->turbo_dec);
>>> +		/* Check if there is available space for further processing */
>>> +		if (unlikely(avail - cbs_in_tb < 0))
>>> +			break;
>>> +		avail -= cbs_in_tb;
>>> +
>>> +		ret = enqueue_dec_one_op_tb(q, ops[i], enqueued_cbs,
>>> +				cbs_in_tb);
>>> +		if (ret < 0)
>>> +			break;
>>> +		enqueued_cbs += ret;
>>> +	}
>>> +
>>> +	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
>>> +
>>> +	/* Update stats */
>>> +	q_data->queue_stats.enqueued_count += i;
>>> +	q_data->queue_stats.enqueue_err_count += num - i;
>>> +
>>> +	return i;
>>> +}
>>> +
>>> +/* Enqueue decode operations for ACC100 device. */
>>> +static uint16_t
>>> +acc100_enqueue_dec(struct rte_bbdev_queue_data *q_data,
>>> +		struct rte_bbdev_dec_op **ops, uint16_t num)
>>> +{
>>> +	if (unlikely(num == 0))
>>> +		return 0;
>> similar move the num == 0 check to the tb/cb functions.
> same comment
>
>>> +	if (ops[0]->turbo_dec.code_block_mode == 0)
>>> +		return acc100_enqueue_dec_tb(q_data, ops, num);
>>> +	else
>>> +		return acc100_enqueue_dec_cb(q_data, ops, num);
>>> +}
>>> +
>>>  /* Enqueue decode operations for ACC100 device. */  static uint16_t
>>> acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data, @@
>>> -2388,6 +3169,51 @@
>>>  	return cb_idx;
>>>  }
>>>
>>> +/* Dequeue encode operations from ACC100 device. */
>>> +static uint16_t
>>> +acc100_dequeue_enc(struct rte_bbdev_queue_data *q_data,
>>> +		struct rte_bbdev_enc_op **ops, uint16_t num)
>>> +{
>>> +	struct acc100_queue *q = q_data->queue_private;
>>> +	uint16_t dequeue_num;
>>> +	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
>>> +	uint32_t aq_dequeued = 0;
>>> +	uint16_t i;
>>> +	uint16_t dequeued_cbs = 0;
>>> +	struct rte_bbdev_enc_op *op;
>>> +	int ret;
>>> +
>>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
>>> +	if (unlikely(ops == 0 && q == NULL))
>> ops is a pointer so should compare with NULL
>>
>> The && likely needs to be ||
>>
>> Maybe print out a message so caller knows something wrong happened.
> ok
>
>>> +		return 0;
>>> +#endif
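The corrected debug guard discussed above could read like this (untested sketch; the log call is left as a comment since the exact message is the author's call):

```c
#include <stddef.h>

/* Sketch: compare pointers against NULL, use || so either bad
 * argument bails out, and leave room for a log so the caller knows
 * why zero was returned.
 */
static int
dequeue_args_valid(const void *ops, const void *q)
{
	if (ops == NULL || q == NULL) {
		/* rte_bbdev_log(ERR, "Invalid args: ops %p q %p", ops, q); */
		return 0;
	}
	return 1;
}
```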
>>> +
>>> +	dequeue_num = (avail < num) ? avail : num;
>>> +
>>> +	for (i = 0; i < dequeue_num; ++i) {
>>> +		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
>>> +			& q->sw_ring_wrap_mask))->req.op_addr;
>>> +		if (op->turbo_enc.code_block_mode == 0)
>>> +			ret = dequeue_enc_one_op_tb(q, &ops[i], dequeued_cbs,
>>> +					&aq_dequeued);
>>> +		else
>>> +			ret = dequeue_enc_one_op_cb(q, &ops[i], dequeued_cbs,
>>> +					&aq_dequeued);
>>> +
>>> +		if (ret < 0)
>>> +			break;
>>> +		dequeued_cbs += ret;
>>> +	}
>>> +
>>> +	q->aq_dequeued += aq_dequeued;
>>> +	q->sw_ring_tail += dequeued_cbs;
>>> +
>>> +	/* Update dequeue stats */
>>> +	q_data->queue_stats.dequeued_count += i;
>>> +
>>> +	return i;
>>> +}
>>> +
>>>  /* Dequeue LDPC encode operations from ACC100 device. */
>>>  static uint16_t
>>>  acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
>>> @@ -2426,6 +3252,52 @@
>>>  	return dequeued_cbs;
>>>  }
>>>
>>> +
>>> +/* Dequeue decode operations from ACC100 device. */
>>> +static uint16_t
>>> +acc100_dequeue_dec(struct rte_bbdev_queue_data *q_data,
>>> +		struct rte_bbdev_dec_op **ops, uint16_t num)
>>> +{
>> very similar to enc function above, consider how to combine them to a
>> single function.
>>
>> Tom
>>
>>> +	struct acc100_queue *q = q_data->queue_private;
>>> +	uint16_t dequeue_num;
>>> +	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
>>> +	uint32_t aq_dequeued = 0;
>>> +	uint16_t i;
>>> +	uint16_t dequeued_cbs = 0;
>>> +	struct rte_bbdev_dec_op *op;
>>> +	int ret;
>>> +
>>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
>>> +	if (unlikely(ops == 0 && q == NULL))
>>> +		return 0;
>>> +#endif
>>> +
>>> +	dequeue_num = (avail < num) ? avail : num;
>>> +
>>> +	for (i = 0; i < dequeue_num; ++i) {
>>> +		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
>>> +			& q->sw_ring_wrap_mask))->req.op_addr;
>>> +		if (op->turbo_dec.code_block_mode == 0)
>>> +			ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
>>> +					&aq_dequeued);
>>> +		else
>>> +			ret = dequeue_dec_one_op_cb(q_data, q, &ops[i],
>>> +					dequeued_cbs, &aq_dequeued);
>>> +
>>> +		if (ret < 0)
>>> +			break;
>>> +		dequeued_cbs += ret;
>>> +	}
>>> +
>>> +	q->aq_dequeued += aq_dequeued;
>>> +	q->sw_ring_tail += dequeued_cbs;
>>> +
>>> +	/* Update dequeue stats */
>>> +	q_data->queue_stats.dequeued_count += i;
>>> +
>>> +	return i;
>>> +}
>>> +
>>>  /* Dequeue decode operations from ACC100 device. */  static uint16_t
>>> acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data, @@
>>> -2479,6 +3351,10 @@
>>>  	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
>>>
>>>  	dev->dev_ops = &acc100_bbdev_ops;
>>> +	dev->enqueue_enc_ops = acc100_enqueue_enc;
>>> +	dev->enqueue_dec_ops = acc100_enqueue_dec;
>>> +	dev->dequeue_enc_ops = acc100_dequeue_enc;
>>> +	dev->dequeue_dec_ops = acc100_dequeue_dec;
>>>  	dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
>>>  	dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
>>>  	dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v10 10/10] baseband/acc100: add configure function
  2020-10-01 15:36         ` Chautru, Nicolas
@ 2020-10-01 15:43           ` Maxime Coquelin
  2020-10-01 19:50             ` Chautru, Nicolas
  0 siblings, 1 reply; 213+ messages in thread
From: Maxime Coquelin @ 2020-10-01 15:43 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, akhil.goyal
  Cc: Richardson, Bruce, Xu, Rosen, trix, Yigit, Ferruh, Liu, Tianjiao



On 10/1/20 5:36 PM, Chautru, Nicolas wrote:
> Hi Maxime, 
> 
>> -----Original Message-----
>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>> Sent: Thursday, October 1, 2020 7:11 AM
>> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org;
>> akhil.goyal@nxp.com
>> Cc: Richardson, Bruce <bruce.richardson@intel.com>; Xu, Rosen
>> <rosen.xu@intel.com>; trix@redhat.com; Yigit, Ferruh
>> <ferruh.yigit@intel.com>; Liu, Tianjiao <tianjiao.liu@intel.com>
>> Subject: Re: [dpdk-dev] [PATCH v10 10/10] baseband/acc100: add configure
>> function
>>
>> Hi Nicolas,
>>
>> On 10/1/20 5:14 AM, Nicolas Chautru wrote:
>>> diff --git a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
>>> b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
>>> index 4a76d1d..91c234d 100644
>>> --- a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
>>> +++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
>>> @@ -1,3 +1,10 @@
>>>  DPDK_21 {
>>>  	local: *;
>>>  };
>>> +
>>> +EXPERIMENTAL {
>>> +	global:
>>> +
>>> +	acc100_configure;
>>> +
>>> +};
>>> --
>>
>> Ideally we should not need to have device specific APIs, but at least it should
>> be prefixed with "rte_".
> 
> Currently this is already like that for other bbdev PMDs. 
> So I would tend to prefer consistency over all in that context. 
> You could argue or not whether this is a PMD function or a companion exposed function, but again if this should change it should change for all PMDs to avoid discrepancies.
> If really this is deemed required this can be pushed as an extra patch covering all PMD, but probably not for 20.11.
> What do you think?

Better to fix the API now to avoid namespace pollution, including the
other comments I made regarding API on patch 3.
That's not a big change, it can be done in v20.11 in my opinion.

Thanks,
Maxime
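For illustration, assuming the symbol is simply renamed, the version map discussed above would become (GNU linker version-script syntax, as already used in the file; the final name is of course Nicolas's call):

```
DPDK_21 {
	local: *;
};

EXPERIMENTAL {
	global:

	rte_acc100_configure;
};
```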

>>
>> Regards,
>> Maxime
> 


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v9 08/10] baseband/acc100: add interrupt support to PMD
  2020-09-30 19:45         ` Chautru, Nicolas
@ 2020-10-01 16:05           ` Tom Rix
  2020-10-01 21:07             ` Chautru, Nicolas
  0 siblings, 1 reply; 213+ messages in thread
From: Tom Rix @ 2020-10-01 16:05 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, akhil.goyal
  Cc: Richardson, Bruce, Xu, Rosen, dave.burley, aidan.goddard, Yigit,
	Ferruh, Liu, Tianjiao


On 9/30/20 12:45 PM, Chautru, Nicolas wrote:
> Hi Tom, 
>
>> From: Tom Rix <trix@redhat.com>
>> On 9/28/20 5:29 PM, Nicolas Chautru wrote:
>>> Adding capability and functions to support MSI interrupts, callbacks
>>> and the info ring.
>>>
>>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
>>> Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
>>> ---
>>>  drivers/baseband/acc100/rte_acc100_pmd.c | 288
>>> ++++++++++++++++++++++++++++++-
>>> drivers/baseband/acc100/rte_acc100_pmd.h |  15 ++
>>>  2 files changed, 300 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
>>> b/drivers/baseband/acc100/rte_acc100_pmd.c
>>> index 7d4c3df..b6d9e7c 100644
>>> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
>>> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
>>> @@ -339,6 +339,213 @@
>>>  	free_base_addresses(base_addrs, i);
>>>  }
>>>
>>> +/*
>>> + * Find queue_id of a device queue based on details from the Info Ring.
>>> + * If a queue isn't found UINT16_MAX is returned.
>>> + */
>>> +static inline uint16_t
>>> +get_queue_id_from_ring_info(struct rte_bbdev_data *data,
>>> +		const union acc100_info_ring_data ring_data)
>>> +{
>>> +	uint16_t queue_id;
>>> +
>>> +	for (queue_id = 0; queue_id < data->num_queues; ++queue_id) {
>>> +		struct acc100_queue *acc100_q =
>>> +				data->queues[queue_id].queue_private;
>>> +		if (acc100_q != NULL && acc100_q->aq_id == ring_data.aq_id
>> &&
>>> +				acc100_q->qgrp_id == ring_data.qg_id &&
>>> +				acc100_q->vf_id == ring_data.vf_id)
>>> +			return queue_id;
>> If num_queues is large, this linear search will be slow.
>>
>> Consider changing the search algorithm.
> This is not in the time critical part of the code
ok
>
>
>>> +	}
>>> +
>>> +	return UINT16_MAX;
>> the interrupt handlers that use this function do not do a great job of
>> handling this error.
> If that error actually happened, there is not much else that can be done except reporting the unexpected data.
ok
>
>>> +}
>>> +
>>> +/* Checks PF Info Ring to find the interrupt cause and handles it
>>> + * accordingly */
>>> +static inline void
>>> +acc100_check_ir(struct acc100_device *acc100_dev)
>>> +{
>>> +	volatile union acc100_info_ring_data *ring_data;
>>> +	uint16_t info_ring_head = acc100_dev->info_ring_head;
>>> +	if (acc100_dev->info_ring == NULL)
>>> +		return;
>>> +
>>> +	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
>>> +			ACC100_INFO_RING_MASK);
>>> +
>>> +	while (ring_data->valid) {
>>> +		if ((ring_data->int_nb < ACC100_PF_INT_DMA_DL_DESC_IRQ) || (
>>> +				ring_data->int_nb >
>>> +				ACC100_PF_INT_DMA_DL5G_DESC_IRQ))
>>> +			rte_bbdev_log(WARNING, "InfoRing: ITR:%d Info:0x%x",
>>> +				ring_data->int_nb, ring_data->detailed_info);
>>> +		/* Initialize Info Ring entry and move forward */
>>> +		ring_data->val = 0;
>>> +		info_ring_head++;
>>> +		ring_data = acc100_dev->info_ring +
>>> +				(info_ring_head & ACC100_INFO_RING_MASK);
>> These three statements are common for the ring handling, consider a macro
>> or inline function.
> ok
>
>>> +	}
>>> +}
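The shared helper suggested above might look like this (untested sketch with simplified stand-in types and an assumed ring size; the driver's own mask and union would be substituted):

```c
#include <stdint.h>

#define INFO_RING_MASK 7	/* ring of 8 entries, for illustration */

union ir_data { uint32_t val; };

/* Sketch: consume the current Info Ring entry, advance the head,
 * and return a pointer to the next entry to examine.
 */
static volatile union ir_data *
info_ring_advance(volatile union ir_data *ring, uint16_t *head)
{
	ring[*head & INFO_RING_MASK].val = 0;	/* clear consumed entry */
	(*head)++;
	return &ring[*head & INFO_RING_MASK];
}
```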
>>> +
>>> +/* Checks PF Info Ring to find the interrupt cause and handles it
>>> + * accordingly */
>>> +static inline void
>>> +acc100_pf_interrupt_handler(struct rte_bbdev *dev)
>>> +{
>>> +	struct acc100_device *acc100_dev = dev->data->dev_private;
>>> +	volatile union acc100_info_ring_data *ring_data;
>>> +	struct acc100_deq_intr_details deq_intr_det;
>>> +
>>> +	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
>>> +			ACC100_INFO_RING_MASK);
>>> +
>>> +	while (ring_data->valid) {
>>> +
>>> +		rte_bbdev_log_debug(
>>> +				"ACC100 PF Interrupt received, Info Ring
>> data: 0x%x",
>>> +				ring_data->val);
>>> +
>>> +		switch (ring_data->int_nb) {
>>> +		case ACC100_PF_INT_DMA_DL_DESC_IRQ:
>>> +		case ACC100_PF_INT_DMA_UL_DESC_IRQ:
>>> +		case ACC100_PF_INT_DMA_UL5G_DESC_IRQ:
>>> +		case ACC100_PF_INT_DMA_DL5G_DESC_IRQ:
>>> +			deq_intr_det.queue_id =
>> get_queue_id_from_ring_info(
>>> +					dev->data, *ring_data);
>>> +			if (deq_intr_det.queue_id == UINT16_MAX) {
>>> +				rte_bbdev_log(ERR,
>>> +						"Couldn't find queue: aq_id:
>> %u, qg_id: %u, vf_id: %u",
>>> +						ring_data->aq_id,
>>> +						ring_data->qg_id,
>>> +						ring_data->vf_id);
>>> +				return;
>>> +			}
>>> +			rte_bbdev_pmd_callback_process(dev,
>>> +					RTE_BBDEV_EVENT_DEQUEUE,
>> &deq_intr_det);
>>> +			break;
>>> +		default:
>>> +			rte_bbdev_pmd_callback_process(dev,
>>> +					RTE_BBDEV_EVENT_ERROR, NULL);
>>> +			break;
>>> +		}
>>> +
>>> +		/* Initialize Info Ring entry and move forward */
>>> +		ring_data->val = 0;
>>> +		++acc100_dev->info_ring_head;
>>> +		ring_data = acc100_dev->info_ring +
>>> +				(acc100_dev->info_ring_head &
>>> +				ACC100_INFO_RING_MASK);
>>> +	}
>>> +}
>>> +
>>> +/* Checks VF Info Ring to find the interrupt cause and handles it
>>> + * accordingly */
>>> +static inline void
>>> +acc100_vf_interrupt_handler(struct rte_bbdev *dev)
>> very similar to pf case, consider combining.
>>> +{
>>> +	struct acc100_device *acc100_dev = dev->data->dev_private;
>>> +	volatile union acc100_info_ring_data *ring_data;
>>> +	struct acc100_deq_intr_details deq_intr_det;
>>> +
>>> +	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
>>> +			ACC100_INFO_RING_MASK);
>>> +
>>> +	while (ring_data->valid) {
>>> +
>>> +		rte_bbdev_log_debug(
>>> +				"ACC100 VF Interrupt received, Info Ring
>> data: 0x%x",
>>> +				ring_data->val);
>>> +
>>> +		switch (ring_data->int_nb) {
>>> +		case ACC100_VF_INT_DMA_DL_DESC_IRQ:
>>> +		case ACC100_VF_INT_DMA_UL_DESC_IRQ:
>>> +		case ACC100_VF_INT_DMA_UL5G_DESC_IRQ:
>>> +		case ACC100_VF_INT_DMA_DL5G_DESC_IRQ:
>>> +			/* VFs are not aware of their vf_id - it's set to 0 in
>>> +			 * queue structures.
>>> +			 */
>>> +			ring_data->vf_id = 0;
>>> +			deq_intr_det.queue_id =
>> get_queue_id_from_ring_info(
>>> +					dev->data, *ring_data);
>>> +			if (deq_intr_det.queue_id == UINT16_MAX) {
>>> +				rte_bbdev_log(ERR,
>>> +						"Couldn't find queue: aq_id:
>> %u, qg_id: %u",
>>> +						ring_data->aq_id,
>>> +						ring_data->qg_id);
>>> +				return;
>>> +			}
>>> +			rte_bbdev_pmd_callback_process(dev,
>>> +					RTE_BBDEV_EVENT_DEQUEUE,
>> &deq_intr_det);
>>> +			break;
>>> +		default:
>>> +			rte_bbdev_pmd_callback_process(dev,
>>> +					RTE_BBDEV_EVENT_ERROR, NULL);
>>> +			break;
>>> +		}
>>> +
>>> +		/* Initialize Info Ring entry and move forward */
>>> +		ring_data->valid = 0;
>>> +		++acc100_dev->info_ring_head;
>>> +		ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head
>>> +				& ACC100_INFO_RING_MASK);
>>> +	}
>>> +}
>>> +
>>> +/* Interrupt handler triggered by ACC100 dev for handling specific
>>> + * interrupt */
>>> +static void
>>> +acc100_dev_interrupt_handler(void *cb_arg)
>>> +{
>>> +	struct rte_bbdev *dev = cb_arg;
>>> +	struct acc100_device *acc100_dev = dev->data->dev_private;
>>> +
>>> +	/* Read info ring */
>>> +	if (acc100_dev->pf_device)
>>> +		acc100_pf_interrupt_handler(dev);
>> combined like ..
>>
>> acc100_interrupt_handler(dev, is_pf)
> unsure it will help readability. Much of the code would still be distinct
ok
>
>>> +	else
>>> +		acc100_vf_interrupt_handler(dev);
>>> +}
>>> +
>>> +/* Allocate and setup inforing */
>>> +static int
>>> +allocate_inforing(struct rte_bbdev *dev)
>> consider renaming
>>
>> allocate_info_ring
> ok
>
>>> +{
>>> +	struct acc100_device *d = dev->data->dev_private;
>>> +	const struct acc100_registry_addr *reg_addr;
>>> +	rte_iova_t info_ring_phys;
>>> +	uint32_t phys_low, phys_high;
>>> +
>>> +	if (d->info_ring != NULL)
>>> +		return 0; /* Already configured */
>>> +
>>> +	/* Choose correct registry addresses for the device type */
>>> +	if (d->pf_device)
>>> +		reg_addr = &pf_reg_addr;
>>> +	else
>>> +		reg_addr = &vf_reg_addr;
>>> +	/* Allocate InfoRing */
>>> +	d->info_ring = rte_zmalloc_socket("Info Ring",
>>> +			ACC100_INFO_RING_NUM_ENTRIES *
>>> +			sizeof(*d->info_ring), RTE_CACHE_LINE_SIZE,
>>> +			dev->data->socket_id);
>>> +	if (d->info_ring == NULL) {
>>> +		rte_bbdev_log(ERR,
>>> +				"Failed to allocate Info Ring for %s:%u",
>>> +				dev->device->driver->name,
>>> +				dev->data->dev_id);
>> The callers do not check that this fails.
> arguably the error would be self contained if that did fail. But doesn't hurt to add, ok. 
>
>>> +		return -ENOMEM;
>>> +	}
>>> +	info_ring_phys = rte_malloc_virt2iova(d->info_ring);
>>> +
>>> +	/* Setup Info Ring */
>>> +	phys_high = (uint32_t)(info_ring_phys >> 32);
>>> +	phys_low  = (uint32_t)(info_ring_phys);
>>> +	acc100_reg_write(d, reg_addr->info_ring_hi, phys_high);
>>> +	acc100_reg_write(d, reg_addr->info_ring_lo, phys_low);
>>> +	acc100_reg_write(d, reg_addr->info_ring_en,
>> ACC100_REG_IRQ_EN_ALL);
>>> +	d->info_ring_head = (acc100_reg_read(d, reg_addr->info_ring_ptr) &
>>> +			0xFFF) / sizeof(union acc100_info_ring_data);
>>> +	return 0;
>>> +}
>>> +
>>> +
>>>  /* Allocate 64MB memory used for all software rings */  static int
>>> acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int
>>> socket_id) @@ -426,6 +633,7 @@
>>>  	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
>>>  	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
>>>
>>> +	allocate_inforing(dev);
>> need to check here
>>>  	d->harq_layout = rte_zmalloc_socket("HARQ Layout",
>>>  			ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
>>>  			RTE_CACHE_LINE_SIZE, dev->data->socket_id); @@ -
>> 437,13 +645,53 @@
>>>  	return 0;
>>>  }
>>>
>>> +static int
>>> +acc100_intr_enable(struct rte_bbdev *dev)
>>> +{
>>> +	int ret;
>>> +	struct acc100_device *d = dev->data->dev_private;
>>> +
>>> +	/* Only MSI are currently supported */
>>> +	if (dev->intr_handle->type == RTE_INTR_HANDLE_VFIO_MSI ||
>>> +			dev->intr_handle->type == RTE_INTR_HANDLE_UIO)
>> {
>>> +
>>> +		allocate_inforing(dev);
>> need to check here
>>> +
>>> +		ret = rte_intr_enable(dev->intr_handle);
>>> +		if (ret < 0) {
>>> +			rte_bbdev_log(ERR,
>>> +					"Couldn't enable interrupts for
>> device: %s",
>>> +					dev->data->name);
>>> +			rte_free(d->info_ring);
>>> +			return ret;
>>> +		}
>>> +		ret = rte_intr_callback_register(dev->intr_handle,
>>> +				acc100_dev_interrupt_handler, dev);
>>> +		if (ret < 0) {
>>> +			rte_bbdev_log(ERR,
>>> +					"Couldn't register interrupt callback
>> for device: %s",
>>> +					dev->data->name);
>>> +			rte_free(d->info_ring);
>> does intr need to be disabled here ?
> Well I don't see a lot of consistency with other drivers. Sometimes these are not even checked for failure.
> I would rather defer that change to a future patch if required, as this is the same code already used by other bbdev drivers (if changed, I would rather change them all the same way).

ok.
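Just to record the symmetric error path being discussed, a minimal sketch (stub functions simulate the registration failure; the real code would call rte_intr_disable() and rte_free() in the same order):

```c
#include <stdbool.h>

static bool intr_enabled;

static int stub_intr_enable(void) { intr_enabled = true; return 0; }
static void stub_intr_disable(void) { intr_enabled = false; }
static int stub_cb_register(void) { return -1; /* simulated failure */ }

/* Sketch: if callback registration fails after interrupts were
 * enabled, unwind in reverse order before bailing out.
 */
static int
intr_setup(void)
{
	int ret = stub_intr_enable();
	if (ret < 0)
		return ret;
	ret = stub_cb_register();
	if (ret < 0) {
		stub_intr_disable();	/* unwind the enable */
		/* rte_free(d->info_ring) would follow here */
		return ret;
	}
	return 0;
}
```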


>
>>> +			return ret;
>>> +		}
>>> +
>>> +		return 0;
>>> +	}
>>> +
>>> +	rte_bbdev_log(ERR, "ACC100 (%s) supports only VFIO MSI
>> interrupts",
>>> +			dev->data->name);
>>> +	return -ENOTSUP;
>>> +}
>>> +
>>>  /* Free 64MB memory used for software rings */  static int
>>> acc100_dev_close(struct rte_bbdev *dev)  {
>>>  	struct acc100_device *d = dev->data->dev_private;
>>> +	acc100_check_ir(d);
>>>  	if (d->sw_rings_base != NULL) {
>>>  		rte_free(d->tail_ptrs);
>>> +		rte_free(d->info_ring);
>>>  		rte_free(d->sw_rings_base);
>>>  		d->sw_rings_base = NULL;
>>>  	}
>>> @@ -643,6 +891,7 @@
>>>  					RTE_BBDEV_TURBO_CRC_TYPE_24B
>> |
>> 	RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
>> 	RTE_BBDEV_TURBO_EARLY_TERMINATION |
>>> +
>> 	RTE_BBDEV_TURBO_DEC_INTERRUPTS |
>> 	RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
>>>  					RTE_BBDEV_TURBO_MAP_DEC |
>>>
>> 	RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP | @@ -663,6 +912,7
>> @@
>> 	RTE_BBDEV_TURBO_CRC_24B_ATTACH |
>> 	RTE_BBDEV_TURBO_RV_INDEX_BYPASS |
>>>  					RTE_BBDEV_TURBO_RATE_MATCH |
>>> +
>> 	RTE_BBDEV_TURBO_ENC_INTERRUPTS |
>> 	RTE_BBDEV_TURBO_ENC_SCATTER_GATHER,
>>>  				.num_buffers_src =
>>>
>> 	RTE_BBDEV_TURBO_MAX_CODE_BLOCKS, @@ -676,7 +926,8 @@
>>>  				.capability_flags =
>>>  					RTE_BBDEV_LDPC_RATE_MATCH |
>>>
>> 	RTE_BBDEV_LDPC_CRC_24B_ATTACH |
>>> -
>> 	RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
>>> +
>> 	RTE_BBDEV_LDPC_INTERLEAVER_BYPASS |
>>> +
>> 	RTE_BBDEV_LDPC_ENC_INTERRUPTS,
>>>  				.num_buffers_src =
>>>
>> 	RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
>>>  				.num_buffers_dst =
>>> @@ -701,7 +952,8 @@
>>>  				RTE_BBDEV_LDPC_DECODE_BYPASS |
>>>  				RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
>>>
>> 	RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
>>> -				RTE_BBDEV_LDPC_LLR_COMPRESSION,
>>> +				RTE_BBDEV_LDPC_LLR_COMPRESSION |
>>> +				RTE_BBDEV_LDPC_DEC_INTERRUPTS,
>>>  			.llr_size = 8,
>>>  			.llr_decimals = 1,
>>>  			.num_buffers_src =
>>> @@ -751,14 +1003,39 @@
>>>  #else
>>>  	dev_info->harq_buffer_size = 0;
>>>  #endif
>>> +	acc100_check_ir(d);
>>> +}
>>> +
>>> +static int
>>> +acc100_queue_intr_enable(struct rte_bbdev *dev, uint16_t queue_id)
>>> +{
>>> +	struct acc100_queue *q = dev->data->queues[queue_id].queue_private;
>>> +
>>> +	if (dev->intr_handle->type != RTE_INTR_HANDLE_VFIO_MSI &&
>>> +			dev->intr_handle->type != RTE_INTR_HANDLE_UIO)
>>> +		return -ENOTSUP;
>>> +
>>> +	q->irq_enable = 1;
>>> +	return 0;
>>> +}
>>> +
>>> +static int
>>> +acc100_queue_intr_disable(struct rte_bbdev *dev, uint16_t queue_id)
>>> +{
>>> +	struct acc100_queue *q = dev->data->queues[queue_id].queue_private;
>>> +	q->irq_enable = 0;
>> A -ENOTSUP above, should need similar check here.
> How can this fail when we purely disable?

It is for API consistency.

The enable can fail, the disable always succeeds;
that is not consistent.

Tom
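[Editor's note: a standalone sketch of the symmetry Tom is asking for, with mocked-up types rather than the real rte_bbdev/acc100 structures — both paths apply the same interrupt-handle check, so callers see -ENOTSUP consistently:]

```c
#include <assert.h>
#include <errno.h>

/* Mocked sketch, not driver code: enable and disable share one helper
 * so the -ENOTSUP handle-type check cannot diverge between the two. */
enum mock_intr_type { MOCK_INTR_VFIO_MSI, MOCK_INTR_UIO, MOCK_INTR_OTHER };

struct mock_queue {
	int irq_enable;
};

static int
mock_queue_intr_set(enum mock_intr_type type, struct mock_queue *q, int on)
{
	if (type != MOCK_INTR_VFIO_MSI && type != MOCK_INTR_UIO)
		return -ENOTSUP; /* same check on both enable and disable */
	q->irq_enable = on;
	return 0;
}
```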

>
>>> +	return 0;
>>>  }
>>>
>>>  static const struct rte_bbdev_ops acc100_bbdev_ops = {
>>>  	.setup_queues = acc100_setup_queues,
>>> +	.intr_enable = acc100_intr_enable,
>>>  	.close = acc100_dev_close,
>>>  	.info_get = acc100_dev_info_get,
>>>  	.queue_setup = acc100_queue_setup,
>>>  	.queue_release = acc100_queue_release,
>>> +	.queue_intr_enable = acc100_queue_intr_enable,
>>> +	.queue_intr_disable = acc100_queue_intr_disable
>>>  };
>>>
>>>  /* ACC100 PCI PF address map */
>>> @@ -3018,8 +3295,10 @@
>>>  			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
>>>  	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
>>>  	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
>>> -	if (op->status != 0)
>>> +	if (op->status != 0) {
>>>  		q_data->queue_stats.dequeue_err_count++;
>>> +		acc100_check_ir(q->d);
>>> +	}
>>>
>>>  	/* CRC invalid if error exists */
>>>  	if (!op->status)
>>> @@ -3076,6 +3355,9 @@
>>>  		op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
>>>  	op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
>>>
>>> +	if (op->status & (1 << RTE_BBDEV_DRV_ERROR))
>>> +		acc100_check_ir(q->d);
>>> +
>>>  	/* Check if this is the last desc in batch (Atomic Queue) */
>>>  	if (desc->req.last_desc_in_batch) {
>>>  		(*aq_dequeued)++;
>>> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h
>>> b/drivers/baseband/acc100/rte_acc100_pmd.h
>>> index 78686c1..8980fa5 100644
>>> --- a/drivers/baseband/acc100/rte_acc100_pmd.h
>>> +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
>>> @@ -559,7 +559,14 @@ struct acc100_device {
>>>  	/* Virtual address of the info memory routed to the this function
>> under
>>>  	 * operation, whether it is PF or VF.
>>>  	 */
>>> +	union acc100_info_ring_data *info_ring;
>> Need a comment that this array needs a sentinel?
> Can clarify a bit expected HW behaviour
>
> Thanks
>
>> Tom
>>
>>> +
>>>  	union acc100_harq_layout_data *harq_layout;
>>> +	/* Virtual Info Ring head */
>>> +	uint16_t info_ring_head;
>>> +	/* Number of bytes available for each queue in device, depending
>> on
>>> +	 * how many queues are enabled with configure()
>>> +	 */
>>>  	uint32_t sw_ring_size;
>>>  	uint32_t ddr_size; /* Size in kB */
>>>  	uint32_t *tail_ptrs; /* Base address of response tail pointer buffer
>>> */ @@ -575,4 +582,12 @@ struct acc100_device {
>>>  	bool configured; /**< True if this ACC100 device is configured */
>>> };
>>>
>>> +/**
>>> + * Structure with details about RTE_BBDEV_EVENT_DEQUEUE event. It's
>>> +passed to
>>> + * the callback function.
>>> + */
>>> +struct acc100_deq_intr_details {
>>> +	uint16_t queue_id;
>>> +};
>>> +
>>>  #endif /* _RTE_ACC100_PMD_H_ */


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v9 09/10] baseband/acc100: add debug function to validate input
  2020-09-30 19:53         ` Chautru, Nicolas
@ 2020-10-01 16:07           ` Tom Rix
  0 siblings, 0 replies; 213+ messages in thread
From: Tom Rix @ 2020-10-01 16:07 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, akhil.goyal
  Cc: Richardson, Bruce, Xu, Rosen, dave.burley, aidan.goddard, Yigit,
	Ferruh, Liu, Tianjiao


On 9/30/20 12:53 PM, Chautru, Nicolas wrote:
> Hi Tom, 
>
>> From: Tom Rix <trix@redhat.com>
>> On 9/28/20 5:29 PM, Nicolas Chautru wrote:
>>> Debug functions to validate the input API from the user. Only enabled
>>> in DEBUG mode at build time.
>>>
>>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
>>> Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
>>> ---
>>>  drivers/baseband/acc100/rte_acc100_pmd.c | 424
>>> +++++++++++++++++++++++++++++++
>>>  1 file changed, 424 insertions(+)
>>>
>>> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
>>> b/drivers/baseband/acc100/rte_acc100_pmd.c
>>> index b6d9e7c..3589814 100644
>>> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
>>> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
>>> @@ -1945,6 +1945,231 @@
>>>
>>>  }
>>>
>>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
>>> +/* Validates turbo encoder parameters */
>>> +static inline int
>>> +validate_enc_op(struct rte_bbdev_enc_op *op)
>>> +{
>>> +	struct rte_bbdev_op_turbo_enc *turbo_enc = &op->turbo_enc;
>>> +	struct rte_bbdev_op_enc_turbo_cb_params *cb = NULL;
>>> +	struct rte_bbdev_op_enc_turbo_tb_params *tb = NULL;
>>> +	uint16_t kw, kw_neg, kw_pos;
>>> +
>>> +	if (op->mempool == NULL) {
>>> +		rte_bbdev_log(ERR, "Invalid mempool pointer");
>>> +		return -1;
>>> +	}
>>> +	if (turbo_enc->input.data == NULL) {
>>> +		rte_bbdev_log(ERR, "Invalid input pointer");
>>> +		return -1;
>>> +	}
>>> +	if (turbo_enc->output.data == NULL) {
>>> +		rte_bbdev_log(ERR, "Invalid output pointer");
>>> +		return -1;
>>> +	}
>>> +	if (turbo_enc->rv_index > 3) {
>>> +		rte_bbdev_log(ERR,
>>> +				"rv_index (%u) is out of range 0 <= value <=
>> 3",
>>> +				turbo_enc->rv_index);
>>> +		return -1;
>>> +	}
>>> +	if (turbo_enc->code_block_mode != 0 &&
>>> +			turbo_enc->code_block_mode != 1) {
>>> +		rte_bbdev_log(ERR,
>>> +				"code_block_mode (%u) is out of range 0 <=
>> value <= 1",
>>> +				turbo_enc->code_block_mode);
>>> +		return -1;
>>> +	}
>>> +
>>> +	if (turbo_enc->code_block_mode == 0) {
>>> +		tb = &turbo_enc->tb_params;
>>> +		if ((tb->k_neg < RTE_BBDEV_TURBO_MIN_CB_SIZE
>>> +				|| tb->k_neg >
>> RTE_BBDEV_TURBO_MAX_CB_SIZE)
>>> +				&& tb->c_neg > 0) {
>>> +			rte_bbdev_log(ERR,
>>> +					"k_neg (%u) is out of range %u <=
>> value <= %u",
>>> +					tb->k_neg,
>> RTE_BBDEV_TURBO_MIN_CB_SIZE,
>>> +					RTE_BBDEV_TURBO_MAX_CB_SIZE);
>>> +			return -1;
>>> +		}
>>> +		if (tb->k_pos < RTE_BBDEV_TURBO_MIN_CB_SIZE
>>> +				|| tb->k_pos >
>> RTE_BBDEV_TURBO_MAX_CB_SIZE) {
>>> +			rte_bbdev_log(ERR,
>>> +					"k_pos (%u) is out of range %u <=
>> value <= %u",
>>> +					tb->k_pos,
>> RTE_BBDEV_TURBO_MIN_CB_SIZE,
>>> +					RTE_BBDEV_TURBO_MAX_CB_SIZE);
>>> +			return -1;
>>> +		}
>>> +		if (tb->c_neg > (RTE_BBDEV_TURBO_MAX_CODE_BLOCKS -
>> 1))
>>> +			rte_bbdev_log(ERR,
>>> +					"c_neg (%u) is out of range 0 <= value
>> <= %u",
>>> +					tb->c_neg,
>>> +
>> 	RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1);
>>> +		if (tb->c < 1 || tb->c >
>> RTE_BBDEV_TURBO_MAX_CODE_BLOCKS) {
>>> +			rte_bbdev_log(ERR,
>>> +					"c (%u) is out of range 1 <= value <=
>> %u",
>>> +					tb->c,
>> RTE_BBDEV_TURBO_MAX_CODE_BLOCKS);
>>> +			return -1;
>>> +		}
>>> +		if (tb->cab > tb->c) {
>>> +			rte_bbdev_log(ERR,
>>> +					"cab (%u) is greater than c (%u)",
>>> +					tb->cab, tb->c);
>>> +			return -1;
>>> +		}
>>> +		if ((tb->ea < RTE_BBDEV_TURBO_MIN_CB_SIZE || (tb->ea %
>> 2))
>>> +				&& tb->r < tb->cab) {
>>> +			rte_bbdev_log(ERR,
>>> +					"ea (%u) is less than %u or it is not
>> even",
>>> +					tb->ea,
>> RTE_BBDEV_TURBO_MIN_CB_SIZE);
>>> +			return -1;
>>> +		}
>>> +		if ((tb->eb < RTE_BBDEV_TURBO_MIN_CB_SIZE || (tb->eb %
>> 2))
>>> +				&& tb->c > tb->cab) {
>>> +			rte_bbdev_log(ERR,
>>> +					"eb (%u) is less than %u or it is not
>> even",
>>> +					tb->eb,
>> RTE_BBDEV_TURBO_MIN_CB_SIZE);
>>> +			return -1;
>>> +		}
>>> +
>>> +		kw_neg = 3 * RTE_ALIGN_CEIL(tb->k_neg + 4,
>>> +					RTE_BBDEV_TURBO_C_SUBBLOCK);
>>> +		if (tb->ncb_neg < tb->k_neg || tb->ncb_neg > kw_neg) {
>>> +			rte_bbdev_log(ERR,
>>> +					"ncb_neg (%u) is out of range (%u)
>> k_neg <= value <= (%u) kw_neg",
>>> +					tb->ncb_neg, tb->k_neg, kw_neg);
>>> +			return -1;
>>> +		}
>>> +
>>> +		kw_pos = 3 * RTE_ALIGN_CEIL(tb->k_pos + 4,
>>> +					RTE_BBDEV_TURBO_C_SUBBLOCK);
>>> +		if (tb->ncb_pos < tb->k_pos || tb->ncb_pos > kw_pos) {
>>> +			rte_bbdev_log(ERR,
>>> +					"ncb_pos (%u) is out of range (%u)
>> k_pos <= value <= (%u) kw_pos",
>>> +					tb->ncb_pos, tb->k_pos, kw_pos);
>>> +			return -1;
>>> +		}
>>> +		if (tb->r > (tb->c - 1)) {
>>> +			rte_bbdev_log(ERR,
>>> +					"r (%u) is greater than c - 1 (%u)",
>>> +					tb->r, tb->c - 1);
>>> +			return -1;
>>> +		}
>>> +	} else {
>>> +		cb = &turbo_enc->cb_params;
>>> +		if (cb->k < RTE_BBDEV_TURBO_MIN_CB_SIZE
>>> +				|| cb->k >
>> RTE_BBDEV_TURBO_MAX_CB_SIZE) {
>>> +			rte_bbdev_log(ERR,
>>> +					"k (%u) is out of range %u <= value <=
>> %u",
>>> +					cb->k,
>> RTE_BBDEV_TURBO_MIN_CB_SIZE,
>>> +					RTE_BBDEV_TURBO_MAX_CB_SIZE);
>>> +			return -1;
>>> +		}
>>> +
>>> +		if (cb->e < RTE_BBDEV_TURBO_MIN_CB_SIZE || (cb->e % 2))
>> {
>>> +			rte_bbdev_log(ERR,
>>> +					"e (%u) is less than %u or it is not
>> even",
>>> +					cb->e,
>> RTE_BBDEV_TURBO_MIN_CB_SIZE);
>>> +			return -1;
>>> +		}
>>> +
>>> +		kw = RTE_ALIGN_CEIL(cb->k + 4,
>> RTE_BBDEV_TURBO_C_SUBBLOCK) * 3;
>>> +		if (cb->ncb < cb->k || cb->ncb > kw) {
>>> +			rte_bbdev_log(ERR,
>>> +					"ncb (%u) is out of range (%u) k <=
>> value <= (%u) kw",
>>> +					cb->ncb, cb->k, kw);
>>> +			return -1;
>>> +		}
>>> +	}
>>> +
>>> +	return 0;
>>> +}
>>> +/* Validates LDPC encoder parameters */
>>> +static inline int
>>> +validate_ldpc_enc_op(struct rte_bbdev_enc_op *op)
>>> +{
>>> +	struct rte_bbdev_op_ldpc_enc *ldpc_enc = &op->ldpc_enc;
>>> +
>>> +	if (op->mempool == NULL) {
>>> +		rte_bbdev_log(ERR, "Invalid mempool pointer");
>>> +		return -1;
>>> +	}
>>> +	if (ldpc_enc->input.data == NULL) {
>>> +		rte_bbdev_log(ERR, "Invalid input pointer");
>>> +		return -1;
>>> +	}
>>> +	if (ldpc_enc->output.data == NULL) {
>>> +		rte_bbdev_log(ERR, "Invalid output pointer");
>>> +		return -1;
>>> +	}
>>> +	if (ldpc_enc->input.length >
>>> +			RTE_BBDEV_LDPC_MAX_CB_SIZE >> 3) {
>>> +		rte_bbdev_log(ERR, "CB size (%u) is too big, max: %d",
>>> +				ldpc_enc->input.length,
>>> +				RTE_BBDEV_LDPC_MAX_CB_SIZE);
>>> +		return -1;
>>> +	}
>>> +	if ((ldpc_enc->basegraph > 2) || (ldpc_enc->basegraph == 0)) {
>>> +		rte_bbdev_log(ERR,
>>> +				"BG (%u) is out of range 1 <= value <= 2",
>>> +				ldpc_enc->basegraph);
>>> +		return -1;
>>> +	}
>>> +	if (ldpc_enc->rv_index > 3) {
>>> +		rte_bbdev_log(ERR,
>>> +				"rv_index (%u) is out of range 0 <= value <=
>> 3",
>>> +				ldpc_enc->rv_index);
>>> +		return -1;
>>> +	}
>>> +	if (ldpc_enc->code_block_mode > 1) {
>>> +		rte_bbdev_log(ERR,
>>> +				"code_block_mode (%u) is out of range 0 <=
>> value <= 1",
>>> +				ldpc_enc->code_block_mode);
>>> +		return -1;
>>> +	}
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +/* Validates LDPC decoder parameters */
>>> +static inline int
>>> +validate_ldpc_dec_op(struct rte_bbdev_dec_op *op)
>>> +{
>>> +	struct rte_bbdev_op_ldpc_dec *ldpc_dec = &op->ldpc_dec;
>>> +
>>> +	if (op->mempool == NULL) {
>>> +		rte_bbdev_log(ERR, "Invalid mempool pointer");
>>> +		return -1;
>>> +	}
>>> +	if ((ldpc_dec->basegraph > 2) || (ldpc_dec->basegraph == 0)) {
>>> +		rte_bbdev_log(ERR,
>>> +				"BG (%u) is out of range 1 <= value <= 2",
>>> +				ldpc_dec->basegraph);
>>> +		return -1;
>>> +	}
>>> +	if (ldpc_dec->iter_max == 0) {
>>> +		rte_bbdev_log(ERR,
>>> +				"iter_max (%u) is equal to 0",
>>> +				ldpc_dec->iter_max);
>>> +		return -1;
>>> +	}
>>> +	if (ldpc_dec->rv_index > 3) {
>>> +		rte_bbdev_log(ERR,
>>> +				"rv_index (%u) is out of range 0 <= value <=
>> 3",
>>> +				ldpc_dec->rv_index);
>>> +		return -1;
>>> +	}
>>> +	if (ldpc_dec->code_block_mode > 1) {
>>> +		rte_bbdev_log(ERR,
>>> +				"code_block_mode (%u) is out of range 0 <=
>> value <= 1",
>>> +				ldpc_dec->code_block_mode);
>>> +		return -1;
>>> +	}
>>> +
>>> +	return 0;
>>> +}
>>> +#endif
>> Could have an #else with stubs so the users do not have to bother with
>> #ifdef decorations
> I see what you mean. Debatable. But given this is done the same way for the
> other bbdev drivers, I would rather keep consistency.
ok
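[Editor's note: the stub alternative being discussed would look roughly like the sketch below — a placeholder validator, not the driver's actual checks, compiled to a no-op outside debug builds so call sites need no #ifdef of their own:]

```c
#include <assert.h>
#include <stddef.h>

/* Sketch: real validation only in debug builds, no-op stub otherwise. */
#ifdef RTE_LIBRTE_BBDEV_DEBUG
static inline int
validate_enc_op_sketch(const void *op)
{
	return (op == NULL) ? -1 : 0; /* real parameter checks would go here */
}
#else
static inline int
validate_enc_op_sketch(const void *op)
{
	(void)op;
	return 0; /* stub: always passes in non-debug builds */
}
#endif
```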
>
>>> +
>>>  /* Enqueue one encode operations for ACC100 device in CB mode */
>>> static inline int  enqueue_enc_one_op_cb(struct acc100_queue *q,
>>> struct rte_bbdev_enc_op *op, @@ -1956,6 +2181,14 @@
>>>  		seg_total_left;
>>>  	struct rte_mbuf *input, *output_head, *output;
>>>
>>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
>>> +	/* Validate op structure */
>>> +	if (validate_enc_op(op) == -1) {
>>> +		rte_bbdev_log(ERR, "Turbo encoder validation failed");
>>> +		return -EINVAL;
>>> +	}
>>> +#endif
>>> +
>>>  	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
>>>  			& q->sw_ring_wrap_mask);
>>>  	desc = q->ring_addr + desc_idx;
>>> @@ -2008,6 +2241,14 @@
>>>  	uint16_t  in_length_in_bytes;
>>>  	struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
>>>
>>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
>>> +	/* Validate op structure */
>>> +	if (validate_ldpc_enc_op(ops[0]) == -1) {
>>> +		rte_bbdev_log(ERR, "LDPC encoder validation failed");
>>> +		return -EINVAL;
>>> +	}
>>> +#endif
>>> +
>>>  	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
>>>  			& q->sw_ring_wrap_mask);
>>>  	desc = q->ring_addr + desc_idx;
>>> @@ -2065,6 +2306,14 @@
>>>  		seg_total_left;
>>>  	struct rte_mbuf *input, *output_head, *output;
>>>
>>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
>>> +	/* Validate op structure */
>>> +	if (validate_ldpc_enc_op(op) == -1) {
>>> +		rte_bbdev_log(ERR, "LDPC encoder validation failed");
>>> +		return -EINVAL;
>>> +	}
>>> +#endif
>>> +
>>>  	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
>>>  			& q->sw_ring_wrap_mask);
>>>  	desc = q->ring_addr + desc_idx;
>>> @@ -2119,6 +2368,14 @@
>>>  	struct rte_mbuf *input, *output_head, *output;
>>>  	uint16_t current_enqueued_cbs = 0;
>>>
>>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
>>> +	/* Validate op structure */
>>> +	if (validate_enc_op(op) == -1) {
>>> +		rte_bbdev_log(ERR, "Turbo encoder validation failed");
>>> +		return -EINVAL;
>>> +	}
>>> +#endif
>>> +
>>>  	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
>>>  			& q->sw_ring_wrap_mask);
>>>  	desc = q->ring_addr + desc_idx;
>>> @@ -2191,6 +2448,142 @@
>>>  	return current_enqueued_cbs;
>>>  }
>>>
>>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
>>> +/* Validates turbo decoder parameters */
>>> +static inline int
>>> +validate_dec_op(struct rte_bbdev_dec_op *op)
>>> +{
>> This dec validation (guessing) shares similar code with the enc validation;
>> consider a function for the common parts.
> They have different APIs really; only a few range checks are common.
> So I am not convinced it would help personally.
> Thanks

ok
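[Editor's note: the common range checks mentioned above could be factored along these lines — a hypothetical helper, with fprintf standing in for rte_bbdev_log:]

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper for the repeated "out of range" pattern shared by
 * the enc/dec validators; the real driver logs via rte_bbdev_log(ERR, ...). */
static inline int
validate_range(const char *name, uint32_t val, uint32_t min, uint32_t max)
{
	if (val < min || val > max) {
		fprintf(stderr, "%s (%u) is out of range %u <= value <= %u\n",
				name, val, min, max);
		return -1;
	}
	return 0;
}
```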

Reviewed-by: Tom Rix <trix@redhat.com>

>
>> Tom
>>
>>> +	struct rte_bbdev_op_turbo_dec *turbo_dec = &op->turbo_dec;
>>> +	struct rte_bbdev_op_dec_turbo_cb_params *cb = NULL;
>>> +	struct rte_bbdev_op_dec_turbo_tb_params *tb = NULL;
>>> +
>>> +	if (op->mempool == NULL) {
>>> +		rte_bbdev_log(ERR, "Invalid mempool pointer");
>>> +		return -1;
>>> +	}
>>> +	if (turbo_dec->input.data == NULL) {
>>> +		rte_bbdev_log(ERR, "Invalid input pointer");
>>> +		return -1;
>>> +	}
>>> +	if (turbo_dec->hard_output.data == NULL) {
>>> +		rte_bbdev_log(ERR, "Invalid hard_output pointer");
>>> +		return -1;
>>> +	}
>>> +	if (check_bit(turbo_dec->op_flags,
>> RTE_BBDEV_TURBO_SOFT_OUTPUT) &&
>>> +			turbo_dec->soft_output.data == NULL) {
>>> +		rte_bbdev_log(ERR, "Invalid soft_output pointer");
>>> +		return -1;
>>> +	}
>>> +	if (turbo_dec->rv_index > 3) {
>>> +		rte_bbdev_log(ERR,
>>> +				"rv_index (%u) is out of range 0 <= value <=
>> 3",
>>> +				turbo_dec->rv_index);
>>> +		return -1;
>>> +	}
>>> +	if (turbo_dec->iter_min < 1) {
>>> +		rte_bbdev_log(ERR,
>>> +				"iter_min (%u) is less than 1",
>>> +				turbo_dec->iter_min);
>>> +		return -1;
>>> +	}
>>> +	if (turbo_dec->iter_max <= 2) {
>>> +		rte_bbdev_log(ERR,
>>> +				"iter_max (%u) is less than or equal to 2",
>>> +				turbo_dec->iter_max);
>>> +		return -1;
>>> +	}
>>> +	if (turbo_dec->iter_min > turbo_dec->iter_max) {
>>> +		rte_bbdev_log(ERR,
>>> +				"iter_min (%u) is greater than iter_max
>> (%u)",
>>> +				turbo_dec->iter_min, turbo_dec->iter_max);
>>> +		return -1;
>>> +	}
>>> +	if (turbo_dec->code_block_mode != 0 &&
>>> +			turbo_dec->code_block_mode != 1) {
>>> +		rte_bbdev_log(ERR,
>>> +				"code_block_mode (%u) is out of range 0 <=
>> value <= 1",
>>> +				turbo_dec->code_block_mode);
>>> +		return -1;
>>> +	}
>>> +
>>> +	if (turbo_dec->code_block_mode == 0) {
>>> +		tb = &turbo_dec->tb_params;
>>> +		if ((tb->k_neg < RTE_BBDEV_TURBO_MIN_CB_SIZE
>>> +				|| tb->k_neg >
>> RTE_BBDEV_TURBO_MAX_CB_SIZE)
>>> +				&& tb->c_neg > 0) {
>>> +			rte_bbdev_log(ERR,
>>> +					"k_neg (%u) is out of range %u <=
>> value <= %u",
>>> +					tb->k_neg,
>> RTE_BBDEV_TURBO_MIN_CB_SIZE,
>>> +					RTE_BBDEV_TURBO_MAX_CB_SIZE);
>>> +			return -1;
>>> +		}
>>> +		if ((tb->k_pos < RTE_BBDEV_TURBO_MIN_CB_SIZE
>>> +				|| tb->k_pos >
>> RTE_BBDEV_TURBO_MAX_CB_SIZE)
>>> +				&& tb->c > tb->c_neg) {
>>> +			rte_bbdev_log(ERR,
>>> +					"k_pos (%u) is out of range %u <=
>> value <= %u",
>>> +					tb->k_pos,
>> RTE_BBDEV_TURBO_MIN_CB_SIZE,
>>> +					RTE_BBDEV_TURBO_MAX_CB_SIZE);
>>> +			return -1;
>>> +		}
>>> +		if (tb->c_neg > (RTE_BBDEV_TURBO_MAX_CODE_BLOCKS -
>> 1))
>>> +			rte_bbdev_log(ERR,
>>> +					"c_neg (%u) is out of range 0 <= value
>> <= %u",
>>> +					tb->c_neg,
>>> +
>> 	RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1);
>>> +		if (tb->c < 1 || tb->c >
>> RTE_BBDEV_TURBO_MAX_CODE_BLOCKS) {
>>> +			rte_bbdev_log(ERR,
>>> +					"c (%u) is out of range 1 <= value <=
>> %u",
>>> +					tb->c,
>> RTE_BBDEV_TURBO_MAX_CODE_BLOCKS);
>>> +			return -1;
>>> +		}
>>> +		if (tb->cab > tb->c) {
>>> +			rte_bbdev_log(ERR,
>>> +					"cab (%u) is greater than c (%u)",
>>> +					tb->cab, tb->c);
>>> +			return -1;
>>> +		}
>>> +		if (check_bit(turbo_dec->op_flags,
>> RTE_BBDEV_TURBO_EQUALIZER) &&
>>> +				(tb->ea < RTE_BBDEV_TURBO_MIN_CB_SIZE
>>> +						|| (tb->ea % 2))
>>> +				&& tb->cab > 0) {
>>> +			rte_bbdev_log(ERR,
>>> +					"ea (%u) is less than %u or it is not
>> even",
>>> +					tb->ea,
>> RTE_BBDEV_TURBO_MIN_CB_SIZE);
>>> +			return -1;
>>> +		}
>>> +		if (check_bit(turbo_dec->op_flags,
>> RTE_BBDEV_TURBO_EQUALIZER) &&
>>> +				(tb->eb < RTE_BBDEV_TURBO_MIN_CB_SIZE
>>> +						|| (tb->eb % 2))
>>> +				&& tb->c > tb->cab) {
>>> +			rte_bbdev_log(ERR,
>>> +					"eb (%u) is less than %u or it is not
>> even",
>>> +					tb->eb,
>> RTE_BBDEV_TURBO_MIN_CB_SIZE);
>>> +		}
>>> +	} else {
>>> +		cb = &turbo_dec->cb_params;
>>> +		if (cb->k < RTE_BBDEV_TURBO_MIN_CB_SIZE
>>> +				|| cb->k >
>> RTE_BBDEV_TURBO_MAX_CB_SIZE) {
>>> +			rte_bbdev_log(ERR,
>>> +					"k (%u) is out of range %u <= value <=
>> %u",
>>> +					cb->k,
>> RTE_BBDEV_TURBO_MIN_CB_SIZE,
>>> +					RTE_BBDEV_TURBO_MAX_CB_SIZE);
>>> +			return -1;
>>> +		}
>>> +		if (check_bit(turbo_dec->op_flags,
>> RTE_BBDEV_TURBO_EQUALIZER) &&
>>> +				(cb->e < RTE_BBDEV_TURBO_MIN_CB_SIZE
>> ||
>>> +				(cb->e % 2))) {
>>> +			rte_bbdev_log(ERR,
>>> +					"e (%u) is less than %u or it is not
>> even",
>>> +					cb->e,
>> RTE_BBDEV_TURBO_MIN_CB_SIZE);
>>> +			return -1;
>>> +		}
>>> +	}
>>> +
>>> +	return 0;
>>> +}
>>> +#endif
>>> +
>>>  /** Enqueue one decode operations for ACC100 device in CB mode */
>>> static inline int  enqueue_dec_one_op_cb(struct acc100_queue *q,
>>> struct rte_bbdev_dec_op *op, @@ -2203,6 +2596,14 @@
>>>  	struct rte_mbuf *input, *h_output_head, *h_output,
>>>  		*s_output_head, *s_output;
>>>
>>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
>>> +	/* Validate op structure */
>>> +	if (validate_dec_op(op) == -1) {
>>> +		rte_bbdev_log(ERR, "Turbo decoder validation failed");
>>> +		return -EINVAL;
>>> +	}
>>> +#endif
>>> +
>>>  	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
>>>  			& q->sw_ring_wrap_mask);
>>>  	desc = q->ring_addr + desc_idx;
>>> @@ -2426,6 +2827,13 @@
>>>  		return ret;
>>>  	}
>>>
>>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
>>> +	/* Validate op structure */
>>> +	if (validate_ldpc_dec_op(op) == -1) {
>>> +		rte_bbdev_log(ERR, "LDPC decoder validation failed");
>>> +		return -EINVAL;
>>> +	}
>>> +#endif
>>>  	union acc100_dma_desc *desc;
>>>  	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
>>>  			& q->sw_ring_wrap_mask);
>>> @@ -2521,6 +2929,14 @@
>>>  	struct rte_mbuf *input, *h_output_head, *h_output;
>>>  	uint16_t current_enqueued_cbs = 0;
>>>
>>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
>>> +	/* Validate op structure */
>>> +	if (validate_ldpc_dec_op(op) == -1) {
>>> +		rte_bbdev_log(ERR, "LDPC decoder validation failed");
>>> +		return -EINVAL;
>>> +	}
>>> +#endif
>>> +
>>>  	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
>>>  			& q->sw_ring_wrap_mask);
>>>  	desc = q->ring_addr + desc_idx;
>>> @@ -2611,6 +3027,14 @@
>>>  		*s_output_head, *s_output;
>>>  	uint16_t current_enqueued_cbs = 0;
>>>
>>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
>>> +	/* Validate op structure */
>>> +	if (validate_dec_op(op) == -1) {
>>> +		rte_bbdev_log(ERR, "Turbo decoder validation failed");
>>> +		return -EINVAL;
>>> +	}
>>> +#endif
>>> +
>>>  	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
>>>  			& q->sw_ring_wrap_mask);
>>>  	desc = q->ring_addr + desc_idx;


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v9 05/10] baseband/acc100: add LDPC processing functions
  2020-10-01 15:31           ` Tom Rix
@ 2020-10-01 16:07             ` Chautru, Nicolas
  0 siblings, 0 replies; 213+ messages in thread
From: Chautru, Nicolas @ 2020-10-01 16:07 UTC (permalink / raw)
  To: Tom Rix, dev, akhil.goyal
  Cc: Richardson, Bruce, Xu, Rosen, dave.burley, aidan.goddard, Yigit,
	Ferruh, Liu, Tianjiao

Hi Tom, 
Note that there is now a v10 which includes several of the suggested changes deemed valuable. 

> From: Tom Rix <trix@redhat.com>
> On 9/30/20 11:52 AM, Chautru, Nicolas wrote:
> > Hi Tom,
> >
> >> From: Tom Rix <trix@redhat.com>
> >> On 9/28/20 5:29 PM, Nicolas Chautru wrote:
> >>> Adding LDPC decode and encode processing operations
> >>>
> >>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> >>> Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
> >>> Acked-by: Dave Burley <dave.burley@accelercomm.com>
> >>> ---
> >>>  doc/guides/bbdevs/features/acc100.ini    |    8 +-
> >>>  drivers/baseband/acc100/rte_acc100_pmd.c | 1625
> >> +++++++++++++++++++++++++++++-
> >>>  drivers/baseband/acc100/rte_acc100_pmd.h |    3 +
> >>>  3 files changed, 1630 insertions(+), 6 deletions(-)
> >>>
> >>> diff --git a/doc/guides/bbdevs/features/acc100.ini
> >> b/doc/guides/bbdevs/features/acc100.ini
> >>> index c89a4d7..40c7adc 100644
> >>> --- a/doc/guides/bbdevs/features/acc100.ini
> >>> +++ b/doc/guides/bbdevs/features/acc100.ini
> >>> @@ -6,9 +6,9 @@
> >>>  [Features]
> >>>  Turbo Decoder (4G)     = N
> >>>  Turbo Encoder (4G)     = N
> >>> -LDPC Decoder (5G)      = N
> >>> -LDPC Encoder (5G)      = N
> >>> -LLR/HARQ Compression   = N
> >>> -External DDR Access    = N
> >>> +LDPC Decoder (5G)      = Y
> >>> +LDPC Encoder (5G)      = Y
> >>> +LLR/HARQ Compression   = Y
> >>> +External DDR Access    = Y
> >>>  HW Accelerated         = Y
> >>>  BBDEV API              = Y
> >>> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> >> b/drivers/baseband/acc100/rte_acc100_pmd.c
> >>> index 7a21c57..b223547 100644
> >>> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> >>> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> >>> @@ -15,6 +15,9 @@
> >>>  #include <rte_hexdump.h>
> >>>  #include <rte_pci.h>
> >>>  #include <rte_bus_pci.h>
> >>> +#ifdef RTE_BBDEV_OFFLOAD_COST
> >>> +#include <rte_cycles.h>
> >>> +#endif
> >>>
> >>>  #include <rte_bbdev.h>
> >>>  #include <rte_bbdev_pmd.h>
> >>> @@ -449,7 +452,6 @@
> >>>  	return 0;
> >>>  }
> >>>
> >>> -
> >>>  /**
> >>>   * Report a ACC100 queue index which is free
> >>>   * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
> >>> @@ -634,6 +636,46 @@
> >>>  	struct acc100_device *d = dev->data->dev_private;
> >>>
> >>>  	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
> >>> +		{
> >>> +			.type   = RTE_BBDEV_OP_LDPC_ENC,
> >>> +			.cap.ldpc_enc = {
> >>> +				.capability_flags =
> >>> +					RTE_BBDEV_LDPC_RATE_MATCH |
> >>> +
> >> 	RTE_BBDEV_LDPC_CRC_24B_ATTACH |
> >>> +
> >> 	RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
> >>> +				.num_buffers_src =
> >>> +
> >> 	RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> >>> +				.num_buffers_dst =
> >>> +
> >> 	RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> >>> +			}
> >>> +		},
> >>> +		{
> >>> +			.type   = RTE_BBDEV_OP_LDPC_DEC,
> >>> +			.cap.ldpc_dec = {
> >>> +			.capability_flags =
> >>> +				RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK |
> >>> +				RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP |
> >>> +
> >> 	RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
> >>> +
> >> 	RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
> >>> +#ifdef ACC100_EXT_MEM
> >> This is unconditionally defined in rte_acc100_pmd.h but it seems
> >>
> >> like it could be a hw config.  Please add a comment in the *.h
> >>
> > It is not really a HW config, just a potential alternate way to run
> > the device, notably for troubleshooting.
> > I can add a comment though
> >
> >> Could also change to
> >>
> >> #if ACC100_EXT_MEM
> >>
> >> and change the #define ACC100_EXT_MEM 1
> > ok
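[Editor's note: the agreed change is simply to give the macro a value so it can be tested with #if as well as #ifdef, and set to 0 rather than deleted to disable the feature — sketch:]

```c
#include <assert.h>

/* Sketch: a value-defined macro works with both "#if" and "#ifdef". */
#define ACC100_EXT_MEM 1

static inline int
ext_mem_enabled(void)
{
#if ACC100_EXT_MEM
	return 1;
#else
	return 0;
#endif
}
```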
> >
> >>> +
> >> 	RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
> >>> +
> >> 	RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
> >>> +#endif
> >>> +
> >> 	RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE |
> >>> +				RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS
> >> |
> >>> +				RTE_BBDEV_LDPC_DECODE_BYPASS |
> >>> +				RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
> >>> +
> >> 	RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
> >>> +				RTE_BBDEV_LDPC_LLR_COMPRESSION,
> >>> +			.llr_size = 8,
> >>> +			.llr_decimals = 1,
> >>> +			.num_buffers_src =
> >>> +
> >> 	RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> >>> +			.num_buffers_hard_out =
> >>> +
> >> 	RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> >>> +			.num_buffers_soft_out = 0,
> >>> +			}
> >>> +		},
> >>>  		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
> >>>  	};
> >>>
> >>> @@ -669,9 +711,14 @@
> >>>  	dev_info->cpu_flag_reqs = NULL;
> >>>  	dev_info->min_alignment = 64;
> >>>  	dev_info->capabilities = bbdev_capabilities;
> >>> +#ifdef ACC100_EXT_MEM
> >>>  	dev_info->harq_buffer_size = d->ddr_size;
> >>> +#else
> >>> +	dev_info->harq_buffer_size = 0;
> >>> +#endif
> >>>  }
> >>>
> >>> +
> >>>  static const struct rte_bbdev_ops acc100_bbdev_ops = {
> >>>  	.setup_queues = acc100_setup_queues,
> >>>  	.close = acc100_dev_close,
> >>> @@ -696,6 +743,1577 @@
> >>>  	{.device_id = 0},
> >>>  };
> >>>
> >>> +/* Read flag value 0/1 from bitmap */
> >>> +static inline bool
> >>> +check_bit(uint32_t bitmap, uint32_t bitmask)
> >>> +{
> >>> +	return bitmap & bitmask;
> >>> +}
> >>> +
>> All the bbdev PMDs have this function; it's pretty trivial, but it would be
>> good if common bbdev functions got moved to a common place.
> > Noted for future change affecting all PMDs outside of that serie.
> 
> ok.
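[Editor's note: for context, this is the duplicated helper in question; hoisting it into a shared bbdev header would be a one-liner:]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The trivial flag-test helper duplicated across bbdev PMDs;
 * a shared header could carry a single copy. */
static inline bool
check_bit(uint32_t bitmap, uint32_t bitmask)
{
	return bitmap & bitmask;
}
```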
> 
> >
> >>> +static inline char *
> >>> +mbuf_append(struct rte_mbuf *m_head, struct rte_mbuf *m, uint16_t
> >> len)
> >>> +{
> >>> +	if (unlikely(len > rte_pktmbuf_tailroom(m)))
> >>> +		return NULL;
> >>> +
> >>> +	char *tail = (char *)m->buf_addr + m->data_off + m->data_len;
> >>> +	m->data_len = (uint16_t)(m->data_len + len);
> >>> +	m_head->pkt_len  = (m_head->pkt_len + len);
> >>> +	return tail;
> >>> +}
> >>> +
> >>> +/* Compute value of k0.
> >>> + * Based on 3GPP 38.212 Table 5.4.2.1-2
> >>> + * Starting position of different redundancy versions, k0
> >>> + */
> >>> +static inline uint16_t
> >>> +get_k0(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv_index)
> >>> +{
> >>> +	if (rv_index == 0)
> >>> +		return 0;
> >>> +	uint16_t n = (bg == 1 ? N_ZC_1 : N_ZC_2) * z_c;
> >>> +	if (n_cb == n) {
> >>> +		if (rv_index == 1)
> >>> +			return (bg == 1 ? K0_1_1 : K0_1_2) * z_c;
> >>> +		else if (rv_index == 2)
> >>> +			return (bg == 1 ? K0_2_1 : K0_2_2) * z_c;
> >>> +		else
> >>> +			return (bg == 1 ? K0_3_1 : K0_3_2) * z_c;
> >>> +	}
> >>> +	/* LBRM case - includes a division by N */
> >>> +	if (rv_index == 1)
> >>> +		return (((bg == 1 ? K0_1_1 : K0_1_2) * n_cb)
> >>> +				/ n) * z_c;
> >>> +	else if (rv_index == 2)
> >>> +		return (((bg == 1 ? K0_2_1 : K0_2_2) * n_cb)
> >>> +				/ n) * z_c;
> >>> +	else
> >>> +		return (((bg == 1 ? K0_3_1 : K0_3_2) * n_cb)
> >>> +				/ n) * z_c;
> >>> +}
> >>> +
> >>> +/* Fill in a frame control word for LDPC encoding. */
> >>> +static inline void
> >>> +acc100_fcw_le_fill(const struct rte_bbdev_enc_op *op,
> >>> +		struct acc100_fcw_le *fcw, int num_cb)
> >>> +{
> >>> +	fcw->qm = op->ldpc_enc.q_m;
> >>> +	fcw->nfiller = op->ldpc_enc.n_filler;
> >>> +	fcw->BG = (op->ldpc_enc.basegraph - 1);
> >>> +	fcw->Zc = op->ldpc_enc.z_c;
> >>> +	fcw->ncb = op->ldpc_enc.n_cb;
> >>> +	fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_enc.basegraph,
> >>> +			op->ldpc_enc.rv_index);
> >>> +	fcw->rm_e = op->ldpc_enc.cb_params.e;
> >>> +	fcw->crc_select = check_bit(op->ldpc_enc.op_flags,
> >>> +			RTE_BBDEV_LDPC_CRC_24B_ATTACH);
> >>> +	fcw->bypass_intlv = check_bit(op->ldpc_enc.op_flags,
> >>> +			RTE_BBDEV_LDPC_INTERLEAVER_BYPASS);
> >>> +	fcw->mcb_count = num_cb;
> >>> +}
> >>> +
> >>> +/* Fill in a frame control word for LDPC decoding. */
> >>> +static inline void
> >>> +acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct
> >> acc100_fcw_ld *fcw,
> >>> +		union acc100_harq_layout_data *harq_layout)
> >>> +{
> >>> +	uint16_t harq_out_length, harq_in_length, ncb_p, k0_p,
> >> parity_offset;
> >>> +	uint16_t harq_index;
> >>> +	uint32_t l;
> >>> +	bool harq_prun = false;
> >>> +
> >>> +	fcw->qm = op->ldpc_dec.q_m;
> >>> +	fcw->nfiller = op->ldpc_dec.n_filler;
> >>> +	fcw->BG = (op->ldpc_dec.basegraph - 1);
> >>> +	fcw->Zc = op->ldpc_dec.z_c;
> >>> +	fcw->ncb = op->ldpc_dec.n_cb;
> >>> +	fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_dec.basegraph,
> >>> +			op->ldpc_dec.rv_index);
> >>> +	if (op->ldpc_dec.code_block_mode == 1)
> >> 1 is magic, consider a #define
> This would be a change not related to that PMD, but noted and agreed.
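[Editor's note: the #define Tom suggests could read as below; the names are illustrative, not from the bbdev API of this release. In this patch 0 selects transport-block mode and 1 selects code-block mode:]

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative names for the bare 0/1 code_block_mode values. */
#define CB_MODE_TRANSPORT_BLOCK 0
#define CB_MODE_CODE_BLOCK      1

static inline int
is_code_block_mode(uint8_t code_block_mode)
{
	return code_block_mode == CB_MODE_CODE_BLOCK;
}
```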
> >
> >>> +		fcw->rm_e = op->ldpc_dec.cb_params.e;
> >>> +	else
> >>> +		fcw->rm_e = (op->ldpc_dec.tb_params.r <
> >>> +				op->ldpc_dec.tb_params.cab) ?
> >>> +						op->ldpc_dec.tb_params.ea :
> >>> +						op->ldpc_dec.tb_params.eb;
> >>> +
> >>> +	fcw->hcin_en = check_bit(op->ldpc_dec.op_flags,
> >>> +			RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE);
> >>> +	fcw->hcout_en = check_bit(op->ldpc_dec.op_flags,
> >>> +			RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE);
> >>> +	fcw->crc_select = check_bit(op->ldpc_dec.op_flags,
> >>> +			RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK);
> >>> +	fcw->bypass_dec = check_bit(op->ldpc_dec.op_flags,
> >>> +			RTE_BBDEV_LDPC_DECODE_BYPASS);
> >>> +	fcw->bypass_intlv = check_bit(op->ldpc_dec.op_flags,
> >>> +			RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS);
> >>> +	if (op->ldpc_dec.q_m == 1) {
> >>> +		fcw->bypass_intlv = 1;
> >>> +		fcw->qm = 2;
> >>> +	}
> >> similar magic.
> > Qm is an integer number defined in 3GPP, not a magic number. This
> literally means qm = 2.
> 
> ok
> 
> 
> >>> +	fcw->hcin_decomp_mode = check_bit(op->ldpc_dec.op_flags,
> >>> +			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> >>> +	fcw->hcout_comp_mode = check_bit(op->ldpc_dec.op_flags,
> >>> +			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> >>> +	fcw->llr_pack_mode = check_bit(op->ldpc_dec.op_flags,
> >>> +			RTE_BBDEV_LDPC_LLR_COMPRESSION);
> >>> +	harq_index = op->ldpc_dec.harq_combined_output.offset /
> >>> +			ACC100_HARQ_OFFSET;
> >>> +#ifdef ACC100_EXT_MEM
> >>> +	/* Limit cases when HARQ pruning is valid */
> >>> +	harq_prun = ((op->ldpc_dec.harq_combined_output.offset %
> >>> +			ACC100_HARQ_OFFSET) == 0) &&
> >>> +			(op->ldpc_dec.harq_combined_output.offset <=
> >> UINT16_MAX
> >>> +			* ACC100_HARQ_OFFSET);
> >>> +#endif
> >>> +	if (fcw->hcin_en > 0) {
> >>> +		harq_in_length = op-
> >>> ldpc_dec.harq_combined_input.length;
> >>> +		if (fcw->hcin_decomp_mode > 0)
> >>> +			harq_in_length = harq_in_length * 8 / 6;
> >>> +		harq_in_length = RTE_ALIGN(harq_in_length, 64);
> >>> +		if ((harq_layout[harq_index].offset > 0) & harq_prun) {
> >>> +			rte_bbdev_log_debug("HARQ IN offset unexpected
> >> for now\n");
> >>> +			fcw->hcin_size0 = harq_layout[harq_index].size0;
> >>> +			fcw->hcin_offset = harq_layout[harq_index].offset;
> >>> +			fcw->hcin_size1 = harq_in_length -
> >>> +					harq_layout[harq_index].offset;
> >>> +		} else {
> >>> +			fcw->hcin_size0 = harq_in_length;
> >>> +			fcw->hcin_offset = 0;
> >>> +			fcw->hcin_size1 = 0;
> >>> +		}
> >>> +	} else {
> >>> +		fcw->hcin_size0 = 0;
> >>> +		fcw->hcin_offset = 0;
> >>> +		fcw->hcin_size1 = 0;
> >>> +	}
> >>> +
> >>> +	fcw->itmax = op->ldpc_dec.iter_max;
> >>> +	fcw->itstop = check_bit(op->ldpc_dec.op_flags,
> >>> +			RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE);
> >>> +	fcw->synd_precoder = fcw->itstop;
> >>> +	/*
> >>> +	 * These are all implicitly set
> >>> +	 * fcw->synd_post = 0;
> >>> +	 * fcw->so_en = 0;
> >>> +	 * fcw->so_bypass_rm = 0;
> >>> +	 * fcw->so_bypass_intlv = 0;
> >>> +	 * fcw->dec_convllr = 0;
> >>> +	 * fcw->hcout_convllr = 0;
> >>> +	 * fcw->hcout_size1 = 0;
> >>> +	 * fcw->so_it = 0;
> >>> +	 * fcw->hcout_offset = 0;
> >>> +	 * fcw->negstop_th = 0;
> >>> +	 * fcw->negstop_it = 0;
> >>> +	 * fcw->negstop_en = 0;
> >>> +	 * fcw->gain_i = 1;
> >>> +	 * fcw->gain_h = 1;
> >>> +	 */
> >>> +	if (fcw->hcout_en > 0) {
> >>> +		parity_offset = (op->ldpc_dec.basegraph == 1 ? 20 : 8)
> >>> +			* op->ldpc_dec.z_c - op->ldpc_dec.n_filler;
> >>> +		k0_p = (fcw->k0 > parity_offset) ?
> >>> +				fcw->k0 - op->ldpc_dec.n_filler : fcw->k0;
> >>> +		ncb_p = fcw->ncb - op->ldpc_dec.n_filler;
> >>> +		l = k0_p + fcw->rm_e;
> >>> +		harq_out_length = (uint16_t) fcw->hcin_size0;
> >>> +		harq_out_length = RTE_MIN(RTE_MAX(harq_out_length, l),
> >> ncb_p);
> >>> +		harq_out_length = (harq_out_length + 0x3F) & 0xFFC0;
> >>> +		if ((k0_p > fcw->hcin_size0 +
> >> ACC100_HARQ_OFFSET_THRESHOLD) &&
> >>> +				harq_prun) {
> >>> +			fcw->hcout_size0 = (uint16_t) fcw->hcin_size0;
> >>> +			fcw->hcout_offset = k0_p & 0xFFC0;
> >>> +			fcw->hcout_size1 = harq_out_length - fcw-
> >>> hcout_offset;
> >>> +		} else {
> >>> +			fcw->hcout_size0 = harq_out_length;
> >>> +			fcw->hcout_size1 = 0;
> >>> +			fcw->hcout_offset = 0;
> >>> +		}
> >>> +		harq_layout[harq_index].offset = fcw->hcout_offset;
> >>> +		harq_layout[harq_index].size0 = fcw->hcout_size0;
> >>> +	} else {
> >>> +		fcw->hcout_size0 = 0;
> >>> +		fcw->hcout_size1 = 0;
> >>> +		fcw->hcout_offset = 0;
> >>> +	}
> >>> +}
> >>> +
> >>> +/**
> >>> + * Fills descriptor with data pointers of one block type.
> >>> + *
> >>> + * @param desc
> >>> + *   Pointer to DMA descriptor.
> >>> + * @param input
> >>> + *   Pointer to pointer to input data which will be encoded. It can be
> >> changed
> >>> + *   and points to next segment in scatter-gather case.
> >>> + * @param offset
> >>> + *   Input offset in rte_mbuf structure. It is used for calculating the point
> >>> + *   where data is starting.
> >>> + * @param cb_len
> >>> + *   Length of currently processed Code Block
> >>> + * @param seg_total_left
> >>> + *   It indicates how many bytes still left in segment (mbuf) for further
> >>> + *   processing.
> >>> + * @param op_flags
> >>> + *   Store information about device capabilities
> >>> + * @param next_triplet
> >>> + *   Index for ACC100 DMA Descriptor triplet
> >>> + *
> >>> + * @return
> >>> + *   Returns index of next triplet on success, other value if lengths of
> >>> + *   pkt and processed cb do not match.
> >>> + *
> >>> + */
> >>> +static inline int
> >>> +acc100_dma_fill_blk_type_in(struct acc100_dma_req_desc *desc,
> >>> +		struct rte_mbuf **input, uint32_t *offset, uint32_t cb_len,
> >>> +		uint32_t *seg_total_left, int next_triplet)
> >>> +{
> >>> +	uint32_t part_len;
> >>> +	struct rte_mbuf *m = *input;
> >>> +
> >>> +	part_len = (*seg_total_left < cb_len) ? *seg_total_left : cb_len;
> >>> +	cb_len -= part_len;
> >>> +	*seg_total_left -= part_len;
> >>> +
> >>> +	desc->data_ptrs[next_triplet].address =
> >>> +			rte_pktmbuf_iova_offset(m, *offset);
> >>> +	desc->data_ptrs[next_triplet].blen = part_len;
> >>> +	desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
> >>> +	desc->data_ptrs[next_triplet].last = 0;
> >>> +	desc->data_ptrs[next_triplet].dma_ext = 0;
> >>> +	*offset += part_len;
> >>> +	next_triplet++;
> >>> +
> >>> +	while (cb_len > 0) {
> >> Since cb_len is unsigned, a better check would be
> >>
> >> while (cb_len != 0)
> > Why would this be better?
> 
> It is unsigned it will never be < 0.
> 
> != 0 reflects that.
> 

I understand that, but I personally see no valuable difference.

> >
> >>> +		if (next_triplet < ACC100_DMA_MAX_NUM_POINTERS &&
> >>> +				m->next != NULL) {
> >>> +
> >>> +			m = m->next;
> >>> +			*seg_total_left = rte_pktmbuf_data_len(m);
> >>> +			part_len = (*seg_total_left < cb_len) ?
> >>> +					*seg_total_left :
> >>> +					cb_len;
> >>> +			desc->data_ptrs[next_triplet].address =
> >>> +					rte_pktmbuf_iova_offset(m, 0);
> >>> +			desc->data_ptrs[next_triplet].blen = part_len;
> >>> +			desc->data_ptrs[next_triplet].blkid =
> >>> +					ACC100_DMA_BLKID_IN;
> >>> +			desc->data_ptrs[next_triplet].last = 0;
> >>> +			desc->data_ptrs[next_triplet].dma_ext = 0;
> >>> +			cb_len -= part_len;
> >>> +			*seg_total_left -= part_len;
> >> when *sec_total_left goes to zero here, there will be a lot of iterations
> doing
> >> nothing.
> >>
> >> should stop early.
> > Not really, it would pick next m anyway and keep adding buffer descriptor
> pointer.
> ok
> >
> >
> >>> +			/* Initializing offset for next segment (mbuf) */
> >>> +			*offset = part_len;
> >>> +			next_triplet++;
> >>> +		} else {
> >>> +			rte_bbdev_log(ERR,
> >>> +				"Some data still left for processing: "
> >>> +				"data_left: %u, next_triplet: %u, next_mbuf:
> >> %p",
> >>> +				cb_len, next_triplet, m->next);
> >>> +			return -EINVAL;
> >>> +		}
> >>> +	}
> >>> +	/* Storing new mbuf as it could be changed in scatter-gather case*/
> >>> +	*input = m;
> >>> +
> >>> +	return next_triplet;
> >> callers, after checking, dec the return.
> >>
> >> Maybe change return to next_triplet-- and save the callers from doing it.
> > I miss your point
> 
> Looking at how the callers of this function use the return,
> 
> a fair number decrement it to get to the current_triplet.
> 
> So maybe returning the current_triplet would be better.
> 
> Something to think about, not required.

OK I see, you mean referring to the previous descriptor as (next - 1).
I believe it is more consistent this way, since next_triplet literally always means the next triplet and is used as is.

> 
> >>> +}
> >>> +
> >>> +/* Fills descriptor with data pointers of one block type.
> >>> + * Returns index of next triplet on success, other value if lengths of
> >>> + * output data and processed mbuf do not match.
> >>> + */
> >>> +static inline int
> >>> +acc100_dma_fill_blk_type_out(struct acc100_dma_req_desc *desc,
> >>> +		struct rte_mbuf *output, uint32_t out_offset,
> >>> +		uint32_t output_len, int next_triplet, int blk_id)
> >>> +{
> >>> +	desc->data_ptrs[next_triplet].address =
> >>> +			rte_pktmbuf_iova_offset(output, out_offset);
> >>> +	desc->data_ptrs[next_triplet].blen = output_len;
> >>> +	desc->data_ptrs[next_triplet].blkid = blk_id;
> >>> +	desc->data_ptrs[next_triplet].last = 0;
> >>> +	desc->data_ptrs[next_triplet].dma_ext = 0;
> >>> +	next_triplet++;
> >> Callers check return is < 0, like above but there is no similar logic to
> >>
> >> check the bounds of next_triplet to return -EINVAL
> >>
> >> so add this check here or remove the is < 0 checks by the callers.
> >>
> > fair enough thanks.
> >
> >>> +
> >>> +	return next_triplet;
> >>> +}
> >>> +
> >>> +static inline int
> >>> +acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
> >>> +		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
> >>> +		struct rte_mbuf *output, uint32_t *in_offset,
> >>> +		uint32_t *out_offset, uint32_t *out_length,
> >>> +		uint32_t *mbuf_total_left, uint32_t *seg_total_left)
> >>> +{
> >>> +	int next_triplet = 1; /* FCW already done */
> >>> +	uint16_t K, in_length_in_bits, in_length_in_bytes;
> >>> +	struct rte_bbdev_op_ldpc_enc *enc = &op->ldpc_enc;
> >>> +
> >>> +	desc->word0 = ACC100_DMA_DESC_TYPE;
> >>> +	desc->word1 = 0; /**< Timestamp could be disabled */
> >>> +	desc->word2 = 0;
> >>> +	desc->word3 = 0;
> >>> +	desc->numCBs = 1;
> >>> +
> >>> +	K = (enc->basegraph == 1 ? 22 : 10) * enc->z_c;
> >>> +	in_length_in_bits = K - enc->n_filler;
> >> can this overflow ? enc->n_filler > K ?
> > I would not add such checks in the time critical function. For valid scenario
> it can't.
> > It could be added to the validate_ldpc_dec_op() which is only run in debug
> mode.
> >
> >>> +	if ((enc->op_flags & RTE_BBDEV_LDPC_CRC_24A_ATTACH) ||
> >>> +			(enc->op_flags &
> >> RTE_BBDEV_LDPC_CRC_24B_ATTACH))
> >>> +		in_length_in_bits -= 24;
> >>> +	in_length_in_bytes = in_length_in_bits >> 3;
> >>> +
> >>> +	if (unlikely((*mbuf_total_left == 0) ||
> >> This check is covered by the next and can be removed.
> > Not necessarily, would keep as is.
> only if in_length_in_bytes was negative

yes, would keep as is I think. 

> >
> >>> +			(*mbuf_total_left < in_length_in_bytes))) {
> >>> +		rte_bbdev_log(ERR,
> >>> +				"Mismatch between mbuf length and
> >> included CB sizes: mbuf len %u, cb len %u",
> >>> +				*mbuf_total_left, in_length_in_bytes);
> >>> +		return -1;
> >>> +	}
> >>> +
> >>> +	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
> >>> +			in_length_in_bytes,
> >>> +			seg_total_left, next_triplet);
> >>> +	if (unlikely(next_triplet < 0)) {
> >>> +		rte_bbdev_log(ERR,
> >>> +				"Mismatch between data to process and
> >> mbuf data length in bbdev_op: %p",
> >>> +				op);
> >>> +		return -1;
> >>> +	}
> >>> +	desc->data_ptrs[next_triplet - 1].last = 1;
> >>> +	desc->m2dlen = next_triplet;
> >>> +	*c
> >> Updating output pointers should be deferred until the the call is known
> to
> >> be successful.
> >>
> >> Otherwise caller is left in a bad, unknown state.
> > We already had to touch them by that point.
> ugh.

You hurt my feelings. I think it is okay given we have to write on the fly.
At the system level this pinned-down memory cannot be used if no processing was done.

> >
> >>> +
> >>> +	/* Set output length */
> >>> +	/* Integer round up division by 8 */
> >>> +	*out_length = (enc->cb_params.e + 7) >> 3;
> >>> +
> >>> +	next_triplet = acc100_dma_fill_blk_type_out(desc, output,
> >> *out_offset,
> >>> +			*out_length, next_triplet,
> >> ACC100_DMA_BLKID_OUT_ENC);
> >>> +	if (unlikely(next_triplet < 0)) {
> >>> +		rte_bbdev_log(ERR,
> >>> +				"Mismatch between data to process and
> >> mbuf data length in bbdev_op: %p",
> >>> +				op);
> >>> +		return -1;
> >>> +	}
> >>> +	op->ldpc_enc.output.length += *out_length;
> >>> +	*out_offset += *out_length;
> >>> +	desc->data_ptrs[next_triplet - 1].last = 1;
> >>> +	desc->data_ptrs[next_triplet - 1].dma_ext = 0;
> >>> +	desc->d2mlen = next_triplet - desc->m2dlen;
> >>> +
> >>> +	desc->op_addr = op;
> >>> +
> >>> +	return 0;
> >>> +}
> >>> +
> >>> +static inline int
> >>> +acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
> >>> +		struct acc100_dma_req_desc *desc,
> >>> +		struct rte_mbuf **input, struct rte_mbuf *h_output,
> >>> +		uint32_t *in_offset, uint32_t *h_out_offset,
> >>> +		uint32_t *h_out_length, uint32_t *mbuf_total_left,
> >>> +		uint32_t *seg_total_left,
> >>> +		struct acc100_fcw_ld *fcw)
> >>> +{
> >>> +	struct rte_bbdev_op_ldpc_dec *dec = &op->ldpc_dec;
> >>> +	int next_triplet = 1; /* FCW already done */
> >>> +	uint32_t input_length;
> >>> +	uint16_t output_length, crc24_overlap = 0;
> >>> +	uint16_t sys_cols, K, h_p_size, h_np_size;
> >>> +	bool h_comp = check_bit(dec->op_flags,
> >>> +			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
> >>> +
> >>> +	desc->word0 = ACC100_DMA_DESC_TYPE;
> >>> +	desc->word1 = 0; /**< Timestamp could be disabled */
> >>> +	desc->word2 = 0;
> >>> +	desc->word3 = 0;
> >>> +	desc->numCBs = 1;
> >> This seems to be a common setup logic, maybe use a macro or inline
> >> function.
> > fair enough
> >
> >>> +
> >>> +	if (check_bit(op->ldpc_dec.op_flags,
> >>> +			RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP))
> >>> +		crc24_overlap = 24;
> >>> +
> >>> +	/* Compute some LDPC BG lengths */
> >>> +	input_length = dec->cb_params.e;
> >>> +	if (check_bit(op->ldpc_dec.op_flags,
> >>> +			RTE_BBDEV_LDPC_LLR_COMPRESSION))
> >>> +		input_length = (input_length * 3 + 3) / 4;
> >>> +	sys_cols = (dec->basegraph == 1) ? 22 : 10;
> >>> +	K = sys_cols * dec->z_c;
> >>> +	output_length = K - dec->n_filler - crc24_overlap;
> >>> +
> >>> +	if (unlikely((*mbuf_total_left == 0) ||
> >> similar to above, this check can be removed.
> > same comment
> >
> >>> +			(*mbuf_total_left < input_length))) {
> >>> +		rte_bbdev_log(ERR,
> >>> +				"Mismatch between mbuf length and
> >> included CB sizes: mbuf len %u, cb len %u",
> >>> +				*mbuf_total_left, input_length);
> >>> +		return -1;
> >>> +	}
> >>> +
> >>> +	next_triplet = acc100_dma_fill_blk_type_in(desc, input,
> >>> +			in_offset, input_length,
> >>> +			seg_total_left, next_triplet);
> >>> +
> >>> +	if (unlikely(next_triplet < 0)) {
> >>> +		rte_bbdev_log(ERR,
> >>> +				"Mismatch between data to process and
> >> mbuf data length in bbdev_op: %p",
> >>> +				op);
> >>> +		return -1;
> >>> +	}
> >>> +
> >>> +	if (check_bit(op->ldpc_dec.op_flags,
> >>> +
> >> 	RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
> >>> +		h_p_size = fcw->hcin_size0 + fcw->hcin_size1;
> >>> +		if (h_comp)
> >>> +			h_p_size = (h_p_size * 3 + 3) / 4;
> >>> +		desc->data_ptrs[next_triplet].address =
> >>> +				dec->harq_combined_input.offset;
> >>> +		desc->data_ptrs[next_triplet].blen = h_p_size;
> >>> +		desc->data_ptrs[next_triplet].blkid =
> >> ACC100_DMA_BLKID_IN_HARQ;
> >>> +		desc->data_ptrs[next_triplet].dma_ext = 1;
> >>> +#ifndef ACC100_EXT_MEM
> >>> +		acc100_dma_fill_blk_type_out(
> >>> +				desc,
> >>> +				op->ldpc_dec.harq_combined_input.data,
> >>> +				op->ldpc_dec.harq_combined_input.offset,
> >>> +				h_p_size,
> >>> +				next_triplet,
> >>> +				ACC100_DMA_BLKID_IN_HARQ);
> >>> +#endif
> >>> +		next_triplet++;
> >>> +	}
> >>> +
> >>> +	desc->data_ptrs[next_triplet - 1].last = 1;
> >>> +	desc->m2dlen = next_triplet;
> >>> +	*mbuf_total_left -= input_length;
> >>> +
> >>> +	next_triplet = acc100_dma_fill_blk_type_out(desc, h_output,
> >>> +			*h_out_offset, output_length >> 3, next_triplet,
> >>> +			ACC100_DMA_BLKID_OUT_HARD);
> >>> +	if (unlikely(next_triplet < 0)) {
> >>> +		rte_bbdev_log(ERR,
> >>> +				"Mismatch between data to process and
> >> mbuf data length in bbdev_op: %p",
> >>> +				op);
> >>> +		return -1;
> >>> +	}
> >>> +
> >>> +	if (check_bit(op->ldpc_dec.op_flags,
> >>> +
> >> 	RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
> >>> +		/* Pruned size of the HARQ */
> >>> +		h_p_size = fcw->hcout_size0 + fcw->hcout_size1;
> >>> +		/* Non-Pruned size of the HARQ */
> >>> +		h_np_size = fcw->hcout_offset > 0 ?
> >>> +				fcw->hcout_offset + fcw->hcout_size1 :
> >>> +				h_p_size;
> >>> +		if (h_comp) {
> >>> +			h_np_size = (h_np_size * 3 + 3) / 4;
> >>> +			h_p_size = (h_p_size * 3 + 3) / 4;
> >> * 4 -1 ) / 4
> >>
> >> may produce better assembly.
> > that is not the same arithmetic
> ?

(x * 3 + 3) / 4 != (x * 4 - 1) / 4
Let me know if there is any doubt.

> >>> +		}
> >>> +		dec->harq_combined_output.length = h_np_size;
> >>> +		desc->data_ptrs[next_triplet].address =
> >>> +				dec->harq_combined_output.offset;
> >>> +		desc->data_ptrs[next_triplet].blen = h_p_size;
> >>> +		desc->data_ptrs[next_triplet].blkid =
> >> ACC100_DMA_BLKID_OUT_HARQ;
> >>> +		desc->data_ptrs[next_triplet].dma_ext = 1;
> >>> +#ifndef ACC100_EXT_MEM
> >>> +		acc100_dma_fill_blk_type_out(
> >>> +				desc,
> >>> +				dec->harq_combined_output.data,
> >>> +				dec->harq_combined_output.offset,
> >>> +				h_p_size,
> >>> +				next_triplet,
> >>> +				ACC100_DMA_BLKID_OUT_HARQ);
> >>> +#endif
> >>> +		next_triplet++;
> >>> +	}
> >>> +
> >>> +	*h_out_length = output_length >> 3;
> >>> +	dec->hard_output.length += *h_out_length;
> >>> +	*h_out_offset += *h_out_length;
> >>> +	desc->data_ptrs[next_triplet - 1].last = 1;
> >>> +	desc->d2mlen = next_triplet - desc->m2dlen;
> >>> +
> >>> +	desc->op_addr = op;
> >>> +
> >>> +	return 0;
> >>> +}
> >>> +
> >>> +static inline void
> >>> +acc100_dma_desc_ld_update(struct rte_bbdev_dec_op *op,
> >>> +		struct acc100_dma_req_desc *desc,
> >>> +		struct rte_mbuf *input, struct rte_mbuf *h_output,
> >>> +		uint32_t *in_offset, uint32_t *h_out_offset,
> >>> +		uint32_t *h_out_length,
> >>> +		union acc100_harq_layout_data *harq_layout)
> >>> +{
> >>> +	int next_triplet = 1; /* FCW already done */
> >>> +	desc->data_ptrs[next_triplet].address =
> >>> +			rte_pktmbuf_iova_offset(input, *in_offset);
> >>> +	next_triplet++;
> >> No overflow checks on next_triplet
> >>
> >> This is a general problem.
> > I dont see the overflow risk.
> 
> A lot of places increments without checking the bounds.
> 
> To me, it seems like we are getting lucky that data_ptrs[] is big enough.

I don't think so. They are bounded by design; additionally, this would be caught by static analysis tools.

> 
> >
> >>> +
> >>> +	if (check_bit(op->ldpc_dec.op_flags,
> >>> +
> >> 	RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
> >>> +		struct rte_bbdev_op_data hi = op-
> >>> ldpc_dec.harq_combined_input;
> >>> +		desc->data_ptrs[next_triplet].address = hi.offset;
> >>> +#ifndef ACC100_EXT_MEM
> >>> +		desc->data_ptrs[next_triplet].address =
> >>> +				rte_pktmbuf_iova_offset(hi.data, hi.offset);
> >>> +#endif
> >>> +		next_triplet++;
> >>> +	}
> >>> +
> >>> +	desc->data_ptrs[next_triplet].address =
> >>> +			rte_pktmbuf_iova_offset(h_output, *h_out_offset);
> >>> +	*h_out_length = desc->data_ptrs[next_triplet].blen;
> >>> +	next_triplet++;
> >>> +
> >>> +	if (check_bit(op->ldpc_dec.op_flags,
> >>> +
> >> 	RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
> >>> +		desc->data_ptrs[next_triplet].address =
> >>> +				op->ldpc_dec.harq_combined_output.offset;
> >>> +		/* Adjust based on previous operation */
> >>> +		struct rte_bbdev_dec_op *prev_op = desc->op_addr;
> >>> +		op->ldpc_dec.harq_combined_output.length =
> >>> +				prev_op-
> >>> ldpc_dec.harq_combined_output.length;
> >>> +		int16_t hq_idx = op-
> >>> ldpc_dec.harq_combined_output.offset /
> >>> +				ACC100_HARQ_OFFSET;
> >>> +		int16_t prev_hq_idx =
> >>> +				prev_op-
> >>> ldpc_dec.harq_combined_output.offset
> >>> +				/ ACC100_HARQ_OFFSET;
> >>> +		harq_layout[hq_idx].val = harq_layout[prev_hq_idx].val;
> >>> +#ifndef ACC100_EXT_MEM
> >>> +		struct rte_bbdev_op_data ho =
> >>> +				op->ldpc_dec.harq_combined_output;
> >>> +		desc->data_ptrs[next_triplet].address =
> >>> +				rte_pktmbuf_iova_offset(ho.data, ho.offset);
> >>> +#endif
> >>> +		next_triplet++;
> >>> +	}
> >>> +
> >>> +	op->ldpc_dec.hard_output.length += *h_out_length;
> >>> +	desc->op_addr = op;
> >>> +}
> >>> +
> >>> +
> >>> +/* Enqueue a number of operations to HW and update software rings
> */
> >>> +static inline void
> >>> +acc100_dma_enqueue(struct acc100_queue *q, uint16_t n,
> >>> +		struct rte_bbdev_stats *queue_stats)
> >>> +{
> >>> +	union acc100_enqueue_reg_fmt enq_req;
> >>> +#ifdef RTE_BBDEV_OFFLOAD_COST
> >>> +	uint64_t start_time = 0;
> >>> +	queue_stats->acc_offload_cycles = 0;
> >>> +	RTE_SET_USED(queue_stats);
> >>> +#else
> >>> +	RTE_SET_USED(queue_stats);
> >>> +#endif
> >> RTE_SET_UNUSED(... is common in the #ifdef/#else
> >>
> >> so it should be moved out.
> > ok
> >
> >>> +
> >>> +	enq_req.val = 0;
> >>> +	/* Setting offset, 100b for 256 DMA Desc */
> >>> +	enq_req.addr_offset = ACC100_DESC_OFFSET;
> >>> +
> >> should n != 0 be checked here ?
> > This is all checked before that point.
> ok
> >
> >>> +	/* Split ops into batches */
> >>> +	do {
> >>> +		union acc100_dma_desc *desc;
> >>> +		uint16_t enq_batch_size;
> >>> +		uint64_t offset;
> >>> +		rte_iova_t req_elem_addr;
> >>> +
> >>> +		enq_batch_size = RTE_MIN(n, MAX_ENQ_BATCH_SIZE);
> >>> +
> >>> +		/* Set flag on last descriptor in a batch */
> >>> +		desc = q->ring_addr + ((q->sw_ring_head + enq_batch_size -
> >> 1) &
> >>> +				q->sw_ring_wrap_mask);
> >>> +		desc->req.last_desc_in_batch = 1;
> >>> +
> >>> +		/* Calculate the 1st descriptor's address */
> >>> +		offset = ((q->sw_ring_head & q->sw_ring_wrap_mask) *
> >>> +				sizeof(union acc100_dma_desc));
> >>> +		req_elem_addr = q->ring_addr_phys + offset;
> >>> +
> >>> +		/* Fill enqueue struct */
> >>> +		enq_req.num_elem = enq_batch_size;
> >>> +		/* low 6 bits are not needed */
> >>> +		enq_req.req_elem_addr = (uint32_t)(req_elem_addr >> 6);
> >>> +
> >>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> >>> +		rte_memdump(stderr, "Req sdone", desc, sizeof(*desc));
> >>> +#endif
> >>> +		rte_bbdev_log_debug(
> >>> +				"Enqueue %u reqs (phys %#"PRIx64") to reg
> >> %p",
> >>> +				enq_batch_size,
> >>> +				req_elem_addr,
> >>> +				(void *)q->mmio_reg_enqueue);
> >>> +
> >>> +		rte_wmb();
> >>> +
> >>> +#ifdef RTE_BBDEV_OFFLOAD_COST
> >>> +		/* Start time measurement for enqueue function offload. */
> >>> +		start_time = rte_rdtsc_precise();
> >>> +#endif
> >>> +		rte_bbdev_log(DEBUG, "Debug : MMIO Enqueue");
> >> logging time will be tracked with the mmio_write
> >>
> >> so logging should be moved above the start_time setting
> > Not required. Running with debug traces is expected to make real time
> offload measurement irrelevant.
> 
> I disagree, if logging has a pagefault or writes to disk even sometimes
> 
> there will be huge spike in the time that would make the accumulated
> 
> acc_offload_cycles meaningless.  It would be ok if the write time is of
> 
> the same order of magnitude as disk access.

I am just saying that whether this extra time should be part of the processing time or not
is undefined (anyone could argue either way) and not really relevant.

> 
> >>> +		mmio_write(q->mmio_reg_enqueue, enq_req.val);
> >>> +
> >>> +#ifdef RTE_BBDEV_OFFLOAD_COST
> >>> +		queue_stats->acc_offload_cycles +=
> >>> +				rte_rdtsc_precise() - start_time;
> >>> +#endif
> >>> +
> >>> +		q->aq_enqueued++;
> >>> +		q->sw_ring_head += enq_batch_size;
> >>> +		n -= enq_batch_size;
> >>> +
> >>> +	} while (n);
> >>> +
> >>> +
> >>> +}
> >>> +
> >>> +/* Enqueue one encode operations for ACC100 device in CB mode */
> >>> +static inline int
> >>> +enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct
> >> rte_bbdev_enc_op **ops,
> >>> +		uint16_t total_enqueued_cbs, int16_t num)
> >>> +{
> >>> +	union acc100_dma_desc *desc = NULL;
> >>> +	uint32_t out_length;
> >>> +	struct rte_mbuf *output_head, *output;
> >>> +	int i, next_triplet;
> >>> +	uint16_t  in_length_in_bytes;
> >>> +	struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
> >>> +
> >>> +	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> >>> +			& q->sw_ring_wrap_mask);
> >>> +	desc = q->ring_addr + desc_idx;
> >>> +	acc100_fcw_le_fill(ops[0], &desc->req.fcw_le, num);
> >>> +
> >>> +	/** This could be done at polling */
> >>> +	desc->req.word0 = ACC100_DMA_DESC_TYPE;
> >>> +	desc->req.word1 = 0; /**< Timestamp could be disabled */
> >>> +	desc->req.word2 = 0;
> >>> +	desc->req.word3 = 0;
> >>> +	desc->req.numCBs = num;
> >>> +
> >>> +	in_length_in_bytes = ops[0]->ldpc_enc.input.data->data_len;
> >>> +	out_length = (enc->cb_params.e + 7) >> 3;
> >>> +	desc->req.m2dlen = 1 + num;
> >>> +	desc->req.d2mlen = num;
> >>> +	next_triplet = 1;
> >>> +
> >>> +	for (i = 0; i < num; i++) {
> >> i is not needed here, it is next_triplet - 1
> > would impact readability as these refer to different concepts (code blocks
> and bdescs).
> > Would keep as is
> ok
> >
> >>> +		desc->req.data_ptrs[next_triplet].address =
> >>> +			rte_pktmbuf_iova_offset(ops[i]-
> >>> ldpc_enc.input.data, 0);
> >>> +		desc->req.data_ptrs[next_triplet].blen = in_length_in_bytes;
> >>> +		next_triplet++;
> >>> +		desc->req.data_ptrs[next_triplet].address =
> >>> +				rte_pktmbuf_iova_offset(
> >>> +				ops[i]->ldpc_enc.output.data, 0);
> >>> +		desc->req.data_ptrs[next_triplet].blen = out_length;
> >>> +		next_triplet++;
> >>> +		ops[i]->ldpc_enc.output.length = out_length;
> >>> +		output_head = output = ops[i]->ldpc_enc.output.data;
> >>> +		mbuf_append(output_head, output, out_length);
> >>> +		output->data_len = out_length;
> >>> +	}
> >>> +
> >>> +	desc->req.op_addr = ops[0];
> >>> +
> >>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> >>> +	rte_memdump(stderr, "FCW", &desc->req.fcw_le,
> >>> +			sizeof(desc->req.fcw_le) - 8);
> >>> +	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> >>> +#endif
> >>> +
> >>> +	/* One CB (one op) was successfully prepared to enqueue */
> >>> +	return num;
> >> caller does not use num, only check if < 0
> >>
> >> So could change to return 0
> > would keep as is for debug
> ok
> >
> >>> +}
> >>> +
> >>> +/* Enqueue one encode operations for ACC100 device in CB mode */
> >>> +static inline int
> >>> +enqueue_ldpc_enc_one_op_cb(struct acc100_queue *q, struct
> >> rte_bbdev_enc_op *op,
> >>> +		uint16_t total_enqueued_cbs)
> >> rte_fpga_5gnr_fec.c has this same function.  It would be good if common
> >> functions could be collected and used to stabilize the internal bbdev
> >> interface.
> >>
> >> This is general issue
> > This is true for some part of the code and noted.
> > In that very case they are distinct implementations with HW specifics
> > But agreed to look into such refactory later on.
> ok
> >
> >>> +{
> >>> +	union acc100_dma_desc *desc = NULL;
> >>> +	int ret;
> >>> +	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
> >>> +		seg_total_left;
> >>> +	struct rte_mbuf *input, *output_head, *output;
> >>> +
> >>> +	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> >>> +			& q->sw_ring_wrap_mask);
> >>> +	desc = q->ring_addr + desc_idx;
> >>> +	acc100_fcw_le_fill(op, &desc->req.fcw_le, 1);
> >>> +
> >>> +	input = op->ldpc_enc.input.data;
> >>> +	output_head = output = op->ldpc_enc.output.data;
> >>> +	in_offset = op->ldpc_enc.input.offset;
> >>> +	out_offset = op->ldpc_enc.output.offset;
> >>> +	out_length = 0;
> >>> +	mbuf_total_left = op->ldpc_enc.input.length;
> >>> +	seg_total_left = rte_pktmbuf_data_len(op->ldpc_enc.input.data)
> >>> +			- in_offset;
> >>> +
> >>> +	ret = acc100_dma_desc_le_fill(op, &desc->req, &input, output,
> >>> +			&in_offset, &out_offset, &out_length,
> >> &mbuf_total_left,
> >>> +			&seg_total_left);
> >>> +
> >>> +	if (unlikely(ret < 0))
> >>> +		return ret;
> >>> +
> >>> +	mbuf_append(output_head, output, out_length);
> >>> +
> >>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> >>> +	rte_memdump(stderr, "FCW", &desc->req.fcw_le,
> >>> +			sizeof(desc->req.fcw_le) - 8);
> >>> +	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> >>> +
> >>> +	/* Check if any data left after processing one CB */
> >>> +	if (mbuf_total_left != 0) {
> >>> +		rte_bbdev_log(ERR,
> >>> +				"Some data still left after processing one CB:
> >> mbuf_total_left = %u",
> >>> +				mbuf_total_left);
> >>> +		return -EINVAL;
> >>> +	}
> >>> +#endif
> >>> +	/* One CB (one op) was successfully prepared to enqueue */
> >>> +	return 1;
> >> Another case where caller only check for < 0
> >>
> >> Consider changes all similar to return 0 on success.
> > same comment as above, would keep as is.
> >
> >>> +}
> >>> +
> >>> +/** Enqueue one decode operations for ACC100 device in CB mode */
> >>> +static inline int
> >>> +enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct
> >> rte_bbdev_dec_op *op,
> >>> +		uint16_t total_enqueued_cbs, bool same_op)
> >>> +{
> >>> +	int ret;
> >>> +
> >>> +	union acc100_dma_desc *desc;
> >>> +	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> >>> +			& q->sw_ring_wrap_mask);
> >>> +	desc = q->ring_addr + desc_idx;
> >>> +	struct rte_mbuf *input, *h_output_head, *h_output;
> >>> +	uint32_t in_offset, h_out_offset, mbuf_total_left, h_out_length = 0;
> >>> +	input = op->ldpc_dec.input.data;
> >>> +	h_output_head = h_output = op->ldpc_dec.hard_output.data;
> >>> +	in_offset = op->ldpc_dec.input.offset;
> >>> +	h_out_offset = op->ldpc_dec.hard_output.offset;
> >>> +	mbuf_total_left = op->ldpc_dec.input.length;
> >>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> >>> +	if (unlikely(input == NULL)) {
> >>> +		rte_bbdev_log(ERR, "Invalid mbuf pointer");
> >>> +		return -EFAULT;
> >>> +	}
> >>> +#endif
> >>> +	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
> >>> +
> >>> +	if (same_op) {
> >>> +		union acc100_dma_desc *prev_desc;
> >>> +		desc_idx = ((q->sw_ring_head + total_enqueued_cbs - 1)
> >>> +				& q->sw_ring_wrap_mask);
> >>> +		prev_desc = q->ring_addr + desc_idx;
> >>> +		uint8_t *prev_ptr = (uint8_t *) prev_desc;
> >>> +		uint8_t *new_ptr = (uint8_t *) desc;
> >>> +		/* Copy first 4 words and BDESCs */
> >>> +		rte_memcpy(new_ptr, prev_ptr, 16);
> >>> +		rte_memcpy(new_ptr + 36, prev_ptr + 36, 40);
> >> These magic numbers should be #defines
> > yes
> >
> >>> +		desc->req.op_addr = prev_desc->req.op_addr;
> >>> +		/* Copy FCW */
> >>> +		rte_memcpy(new_ptr + ACC100_DESC_FCW_OFFSET,
> >>> +				prev_ptr + ACC100_DESC_FCW_OFFSET,
> >>> +				ACC100_FCW_LD_BLEN);
> >>> +		acc100_dma_desc_ld_update(op, &desc->req, input,
> >> h_output,
> >>> +				&in_offset, &h_out_offset,
> >>> +				&h_out_length, harq_layout);
> >>> +	} else {
> >>> +		struct acc100_fcw_ld *fcw;
> >>> +		uint32_t seg_total_left;
> >>> +		fcw = &desc->req.fcw_ld;
> >>> +		acc100_fcw_ld_fill(op, fcw, harq_layout);
> >>> +
> >>> +		/* Special handling when overusing mbuf */
> >>> +		if (fcw->rm_e < MAX_E_MBUF)
> >>> +			seg_total_left = rte_pktmbuf_data_len(input)
> >>> +					- in_offset;
> >>> +		else
> >>> +			seg_total_left = fcw->rm_e;
> >>> +
> >>> +		ret = acc100_dma_desc_ld_fill(op, &desc->req, &input,
> >> h_output,
> >>> +				&in_offset, &h_out_offset,
> >>> +				&h_out_length, &mbuf_total_left,
> >>> +				&seg_total_left, fcw);
> >>> +		if (unlikely(ret < 0))
> >>> +			return ret;
> >>> +	}
> >>> +
> >>> +	/* Hard output */
> >>> +	mbuf_append(h_output_head, h_output, h_out_length);
> >>> +#ifndef ACC100_EXT_MEM
> >>> +	if (op->ldpc_dec.harq_combined_output.length > 0) {
> >>> +		/* Push the HARQ output into host memory */
> >>> +		struct rte_mbuf *hq_output_head, *hq_output;
> >>> +		hq_output_head = op-
> >>> ldpc_dec.harq_combined_output.data;
> >>> +		hq_output = op->ldpc_dec.harq_combined_output.data;
> >>> +		mbuf_append(hq_output_head, hq_output,
> >>> +				op-
> >>> ldpc_dec.harq_combined_output.length);
> >>> +	}
> >>> +#endif
> >>> +
> >>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> >>> +	rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
> >>> +			sizeof(desc->req.fcw_ld) - 8);
> >>> +	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> >>> +#endif
> >>> +
> >>> +	/* One CB (one op) was successfully prepared to enqueue */
> >>> +	return 1;
> >>> +}
> >>> +
> >>> +
> >>> +/* Enqueue one decode operations for ACC100 device in TB mode */
> >>> +static inline int
> >>> +enqueue_ldpc_dec_one_op_tb(struct acc100_queue *q, struct
> >> rte_bbdev_dec_op *op,
> >>> +		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
> >>> +{
> >>> +	union acc100_dma_desc *desc = NULL;
> >>> +	int ret;
> >>> +	uint8_t r, c;
> >>> +	uint32_t in_offset, h_out_offset,
> >>> +		h_out_length, mbuf_total_left, seg_total_left;
> >>> +	struct rte_mbuf *input, *h_output_head, *h_output;
> >>> +	uint16_t current_enqueued_cbs = 0;
> >>> +
> >>> +	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> >>> +			& q->sw_ring_wrap_mask);
> >>> +	desc = q->ring_addr + desc_idx;
> >>> +	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
> >>> +	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
> >>> +	acc100_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout);
> >>> +
> >>> +	input = op->ldpc_dec.input.data;
> >>> +	h_output_head = h_output = op->ldpc_dec.hard_output.data;
> >>> +	in_offset = op->ldpc_dec.input.offset;
> >>> +	h_out_offset = op->ldpc_dec.hard_output.offset;
> >>> +	h_out_length = 0;
> >>> +	mbuf_total_left = op->ldpc_dec.input.length;
> >>> +	c = op->ldpc_dec.tb_params.c;
> >>> +	r = op->ldpc_dec.tb_params.r;
> >>> +
> >>> +	while (mbuf_total_left > 0 && r < c) {
> >>> +
> >>> +		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
> >>> +
> >>> +		/* Set up DMA descriptor */
> >>> +		desc = q->ring_addr + ((q->sw_ring_head +
> >> total_enqueued_cbs)
> >>> +				& q->sw_ring_wrap_mask);
> >>> +		desc->req.data_ptrs[0].address = q->ring_addr_phys +
> >> fcw_offset;
> >>> +		desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
> >>> +		ret = acc100_dma_desc_ld_fill(op, &desc->req, &input,
> >>> +				h_output, &in_offset, &h_out_offset,
> >>> +				&h_out_length,
> >>> +				&mbuf_total_left, &seg_total_left,
> >>> +				&desc->req.fcw_ld);
> >>> +
> >>> +		if (unlikely(ret < 0))
> >>> +			return ret;
> >>> +
> >>> +		/* Hard output */
> >>> +		mbuf_append(h_output_head, h_output, h_out_length);
> >>> +
> >>> +		/* Set total number of CBs in TB */
> >>> +		desc->req.cbs_in_tb = cbs_in_tb;
> >>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> >>> +		rte_memdump(stderr, "FCW", &desc->req.fcw_td,
> >>> +				sizeof(desc->req.fcw_td) - 8);
> >>> +		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> >>> +#endif
> >>> +
> >>> +		if (seg_total_left == 0) {
> >>> +			/* Go to the next mbuf */
> >>> +			input = input->next;
> >>> +			in_offset = 0;
> >>> +			h_output = h_output->next;
> >>> +			h_out_offset = 0;
> >>> +		}
> >>> +		total_enqueued_cbs++;
> >>> +		current_enqueued_cbs++;
> >>> +		r++;
> >>> +	}
> >>> +
> >>> +	if (unlikely(desc == NULL))
> >> How is this possible? desc has been dereferenced already.
> > related to static code analysis, arguably a false alarm
> >
> >>> +		return current_enqueued_cbs;
> >>> +
> >>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> >>> +	/* Check if any CBs left for processing */
> >>> +	if (mbuf_total_left != 0) {
> >>> +		rte_bbdev_log(ERR,
> >>> +				"Some data still left for processing: mbuf_total_left = %u",
> >>> +				mbuf_total_left);
> >>> +		return -EINVAL;
> >>> +	}
> >>> +#endif
> >>> +	/* Set SDone on last CB descriptor for TB mode */
> >>> +	desc->req.sdone_enable = 1;
> >>> +	desc->req.irq_enable = q->irq_enable;
> >>> +
> >>> +	return current_enqueued_cbs;
> >>> +}
> >>> +
> >>> +
> >>> +/* Calculates number of CBs in processed encoder TB based on 'r' and
> >> input
> >>> + * length.
> >>> + */
> >>> +static inline uint8_t
> >>> +get_num_cbs_in_tb_enc(struct rte_bbdev_op_turbo_enc *turbo_enc)
> >>> +{
> >>> +	uint8_t c, c_neg, r, crc24_bits = 0;
> >>> +	uint16_t k, k_neg, k_pos;
> >>> +	uint8_t cbs_in_tb = 0;
> >>> +	int32_t length;
> >>> +
> >>> +	length = turbo_enc->input.length;
> >>> +	r = turbo_enc->tb_params.r;
> >>> +	c = turbo_enc->tb_params.c;
> >>> +	c_neg = turbo_enc->tb_params.c_neg;
> >>> +	k_neg = turbo_enc->tb_params.k_neg;
> >>> +	k_pos = turbo_enc->tb_params.k_pos;
> >>> +	crc24_bits = 0;
> >>> +	if (check_bit(turbo_enc->op_flags,
> >> RTE_BBDEV_TURBO_CRC_24B_ATTACH))
> >>> +		crc24_bits = 24;
> >>> +	while (length > 0 && r < c) {
> >>> +		k = (r < c_neg) ? k_neg : k_pos;
> >>> +		length -= (k - crc24_bits) >> 3;
> >>> +		r++;
> >>> +		cbs_in_tb++;
> >>> +	}
> >>> +
> >>> +	return cbs_in_tb;
> >>> +}
> >>> +
> >>> +/* Calculates number of CBs in processed decoder TB based on 'r' and
> >> input
> >>> + * length.
> >>> + */
> >>> +static inline uint16_t
> >>> +get_num_cbs_in_tb_dec(struct rte_bbdev_op_turbo_dec *turbo_dec)
> >>> +{
> >>> +	uint8_t c, c_neg, r = 0;
> >>> +	uint16_t kw, k, k_neg, k_pos, cbs_in_tb = 0;
> >>> +	int32_t length;
> >>> +
> >>> +	length = turbo_dec->input.length;
> >>> +	r = turbo_dec->tb_params.r;
> >>> +	c = turbo_dec->tb_params.c;
> >>> +	c_neg = turbo_dec->tb_params.c_neg;
> >>> +	k_neg = turbo_dec->tb_params.k_neg;
> >>> +	k_pos = turbo_dec->tb_params.k_pos;
> >>> +	while (length > 0 && r < c) {
> >>> +		k = (r < c_neg) ? k_neg : k_pos;
> >>> +		kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
> >>> +		length -= kw;
> >>> +		r++;
> >>> +		cbs_in_tb++;
> >>> +	}
> >>> +
> >>> +	return cbs_in_tb;
> >>> +}
> >>> +
> >>> +/* Calculates number of CBs in processed decoder TB based on 'r' and
> >> input
> >>> + * length.
> >>> + */
> >>> +static inline uint16_t
> >>> +get_num_cbs_in_tb_ldpc_dec(struct rte_bbdev_op_ldpc_dec
> *ldpc_dec)
> >>> +{
> >>> +	uint16_t r, cbs_in_tb = 0;
> >>> +	int32_t length = ldpc_dec->input.length;
> >>> +	r = ldpc_dec->tb_params.r;
> >>> +	while (length > 0 && r < ldpc_dec->tb_params.c) {
> >>> +		length -=  (r < ldpc_dec->tb_params.cab) ?
> >>> +				ldpc_dec->tb_params.ea :
> >>> +				ldpc_dec->tb_params.eb;
> >>> +		r++;
> >>> +		cbs_in_tb++;
> >>> +	}
> >>> +	return cbs_in_tb;
> >>> +}
> >>> +
> >>> +/* Check we can mux encode operations with common FCW */
> >>> +static inline bool
> >>> +check_mux(struct rte_bbdev_enc_op **ops, uint16_t num) {
> >>> +	uint16_t i;
> >>> +	if (num == 1)
> >>> +		return false;
> >> likely should strengthen check to num <= 1
> > no impact, but doesn't hurt to change, ok.
> >
> >>> +	for (i = 1; i < num; ++i) {
> >>> +		/* Only mux compatible code blocks */
> >>> +		if (memcmp((uint8_t *)(&ops[i]->ldpc_enc) + ENC_OFFSET,
> >>> +				(uint8_t *)(&ops[0]->ldpc_enc) +
> >> ENC_OFFSET,
> >> ops[0]->ldpc_enc should be hoisted out of loop as it is invariant.
> > compiler takes care of this I believe
> hopefully, yes.
> >
> >>> +				CMP_ENC_SIZE) != 0)
> >>> +			return false;
> >>> +	}
> >>> +	return true;
> >>> +}
> >>> +
> >>> +/** Enqueue encode operations for ACC100 device in CB mode. */
> >>> +static inline uint16_t
> >>> +acc100_enqueue_ldpc_enc_cb(struct rte_bbdev_queue_data *q_data,
> >>> +		struct rte_bbdev_enc_op **ops, uint16_t num)
> >>> +{
> >>> +	struct acc100_queue *q = q_data->queue_private;
> >>> +	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> >>> +	uint16_t i = 0;
> >>> +	union acc100_dma_desc *desc;
> >>> +	int ret, desc_idx = 0;
> >>> +	int16_t enq, left = num;
> >>> +
> >>> +	while (left > 0) {
> >>> +		if (unlikely(avail - 1 < 0))
> >>> +			break;
> >>> +		avail--;
> >>> +		enq = RTE_MIN(left, MUX_5GDL_DESC);
> >>> +		if (check_mux(&ops[i], enq)) {
> >>> +			ret = enqueue_ldpc_enc_n_op_cb(q, &ops[i],
> >>> +					desc_idx, enq);
> >>> +			if (ret < 0)
> >>> +				break;
> >>> +			i += enq;
> >>> +		} else {
> >>> +			ret = enqueue_ldpc_enc_one_op_cb(q, ops[i],
> >> desc_idx);
> >>> +			if (ret < 0)
> >>> +				break;
> >> failure is not handled well, what happens if this is one of several?
> > the aim is to flag the error and move on
> >
> >
> >>> +			i++;
> >>> +		}
> >>> +		desc_idx++;
> >>> +		left = num - i;
> >>> +	}
> >>> +
> >>> +	if (unlikely(i == 0))
> >>> +		return 0; /* Nothing to enqueue */
> >> this does not look correct for all cases
> > I miss your point
> 
> I was thinking this was an error handler and needed beefing up.

The user would know it could not enqueue from the 0 return value.
In case it is for a bad reason (not just lack of space) this will be logged
as part of another exception, I think.

> 
> 
> >>> +
> >>> +	/* Set SDone in last CB in enqueued ops for CB mode*/
> >>> +	desc = q->ring_addr + ((q->sw_ring_head + desc_idx - 1)
> >>> +			& q->sw_ring_wrap_mask);
> >>> +	desc->req.sdone_enable = 1;
> >>> +	desc->req.irq_enable = q->irq_enable;
> >>> +
> >>> +	acc100_dma_enqueue(q, desc_idx, &q_data->queue_stats);
> >>> +
> >>> +	/* Update stats */
> >>> +	q_data->queue_stats.enqueued_count += i;
> >>> +	q_data->queue_stats.enqueue_err_count += num - i;
> >>> +
> >>> +	return i;
> >>> +}
> >>> +
> >>> +/* Enqueue encode operations for ACC100 device. */
> >>> +static uint16_t
> >>> +acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
> >>> +		struct rte_bbdev_enc_op **ops, uint16_t num)
> >>> +{
> >>> +	if (unlikely(num == 0))
> >>> +		return 0;
> >> Handling num == 0 should be in acc100_enqueue_ldpc_enc_cb
> > Why would it be better not to catch this early, from the user API call?
> ok, because it was 'static' i was unsure
> >
> >>> +	return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
> >>> +}
> >>> +
> >>> +/* Check we can mux decode operations with common FCW */
> >>> +static inline bool
> >>> +cmp_ldpc_dec_op(struct rte_bbdev_dec_op **ops) {
> >>> +	/* Only mux compatible code blocks */
> >>> +	if (memcmp((uint8_t *)(&ops[0]->ldpc_dec) + DEC_OFFSET,
> >>> +			(uint8_t *)(&ops[1]->ldpc_dec) +
> >>> +			DEC_OFFSET, CMP_DEC_SIZE) != 0) {
> >>> +		return false;
> >>> +	} else
> >> do not need the else, there are no other statements.
> > debatable. Not considering a change unless that becomes a DPDK
> > coding guideline.
> fine.
> >>> +		return true;
> >>> +}
> >>> +
> >>> +
> >>> +/* Enqueue decode operations for ACC100 device in TB mode */
> >>> +static uint16_t
> >>> +acc100_enqueue_ldpc_dec_tb(struct rte_bbdev_queue_data *q_data,
> >>> +		struct rte_bbdev_dec_op **ops, uint16_t num)
> >>> +{
> >>> +	struct acc100_queue *q = q_data->queue_private;
> >>> +	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> >>> +	uint16_t i, enqueued_cbs = 0;
> >>> +	uint8_t cbs_in_tb;
> >>> +	int ret;
> >>> +
> >>> +	for (i = 0; i < num; ++i) {
> >>> +		cbs_in_tb = get_num_cbs_in_tb_ldpc_dec(&ops[i]->ldpc_dec);
> >>> +		/* Check if there is available space for further processing */
> >>> +		if (unlikely(avail - cbs_in_tb < 0))
> >>> +			break;
> >>> +		avail -= cbs_in_tb;
> >>> +
> >>> +		ret = enqueue_ldpc_dec_one_op_tb(q, ops[i],
> >>> +				enqueued_cbs, cbs_in_tb);
> >>> +		if (ret < 0)
> >>> +			break;
> >>> +		enqueued_cbs += ret;
> >>> +	}
> >>> +
> >>> +	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
> >>> +
> >>> +	/* Update stats */
> >>> +	q_data->queue_stats.enqueued_count += i;
> >>> +	q_data->queue_stats.enqueue_err_count += num - i;
> >>> +	return i;
> >>> +}
> >>> +
> >>> +/* Enqueue decode operations for ACC100 device in CB mode */
> >>> +static uint16_t
> >>> +acc100_enqueue_ldpc_dec_cb(struct rte_bbdev_queue_data *q_data,
> >>> +		struct rte_bbdev_dec_op **ops, uint16_t num)
> >>> +{
> >>> +	struct acc100_queue *q = q_data->queue_private;
> >>> +	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
> >>> +	uint16_t i;
> >>> +	union acc100_dma_desc *desc;
> >>> +	int ret;
> >>> +	bool same_op = false;
> >>> +	for (i = 0; i < num; ++i) {
> >>> +		/* Check if there is available space for further processing */
> >>> +		if (unlikely(avail - 1 < 0))
> >> change to (avail < 1)
> >>
> >> Generally.
> > ok
> >
> >>> +			break;
> >>> +		avail -= 1;
> >>> +
> >>> +		if (i > 0)
> >>> +			same_op = cmp_ldpc_dec_op(&ops[i-1]);
> >>> +		rte_bbdev_log(INFO, "Op %d %d %d %d %d %d %d %d %d %d %d %d\n",
> >>> +			i, ops[i]->ldpc_dec.op_flags, ops[i]->ldpc_dec.rv_index,
> >>> +			ops[i]->ldpc_dec.iter_max, ops[i]->ldpc_dec.iter_count,
> >>> +			ops[i]->ldpc_dec.basegraph, ops[i]->ldpc_dec.z_c,
> >>> +			ops[i]->ldpc_dec.n_cb, ops[i]->ldpc_dec.q_m,
> >>> +			ops[i]->ldpc_dec.n_filler, ops[i]->ldpc_dec.cb_params.e,
> >>> +			same_op);
> >>> +		ret = enqueue_ldpc_dec_one_op_cb(q, ops[i], i, same_op);
> >>> +		if (ret < 0)
> >>> +			break;
> >>> +	}
> >>> +
> >>> +	if (unlikely(i == 0))
> >>> +		return 0; /* Nothing to enqueue */
> >>> +
> >>> +	/* Set SDone in last CB in enqueued ops for CB mode*/
> >>> +	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
> >>> +			& q->sw_ring_wrap_mask);
> >>> +
> >>> +	desc->req.sdone_enable = 1;
> >>> +	desc->req.irq_enable = q->irq_enable;
> >>> +
> >>> +	acc100_dma_enqueue(q, i, &q_data->queue_stats);
> >>> +
> >>> +	/* Update stats */
> >>> +	q_data->queue_stats.enqueued_count += i;
> >>> +	q_data->queue_stats.enqueue_err_count += num - i;
> >>> +	return i;
> >>> +}
> >>> +
> >>> +/* Enqueue decode operations for ACC100 device. */
> >>> +static uint16_t
> >>> +acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
> >>> +		struct rte_bbdev_dec_op **ops, uint16_t num)
> >>> +{
> >>> +	struct acc100_queue *q = q_data->queue_private;
> >>> +	int32_t aq_avail = q->aq_depth +
> >>> +			(q->aq_dequeued - q->aq_enqueued) / 128;
> >>> +
> >>> +	if (unlikely((aq_avail == 0) || (num == 0)))
> >>> +		return 0;
> >>> +
> >>> +	if (ops[0]->ldpc_dec.code_block_mode == 0)
> >>> +		return acc100_enqueue_ldpc_dec_tb(q_data, ops, num);
> >>> +	else
> >>> +		return acc100_enqueue_ldpc_dec_cb(q_data, ops, num);
> >>> +}
> >>> +
> >>> +
> >>> +/* Dequeue one encode operation from ACC100 device in CB mode */
> >>> +static inline int
> >>> +dequeue_enc_one_op_cb(struct acc100_queue *q, struct
> >> rte_bbdev_enc_op **ref_op,
> >>> +		uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
> >>> +{
> >>> +	union acc100_dma_desc *desc, atom_desc;
> >>> +	union acc100_dma_rsp_desc rsp;
> >>> +	struct rte_bbdev_enc_op *op;
> >>> +	int i;
> >>> +
> >>> +	desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
> >>> +			& q->sw_ring_wrap_mask);
> >>> +	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> >>> +			__ATOMIC_RELAXED);
> >>> +
> >>> +	/* Check fdone bit */
> >>> +	if (!(atom_desc.rsp.val & ACC100_FDONE))
> >>> +		return -1;
> >>> +
> >>> +	rsp.val = atom_desc.rsp.val;
> >>> +	rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
> >>> +
> >>> +	/* Dequeue */
> >>> +	op = desc->req.op_addr;
> >>> +
> >>> +	/* Clearing status, it will be set based on response */
> >>> +	op->status = 0;
> >>> +
> >>> +	op->status |= ((rsp.input_err)
> >>> +			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> >> can remove the = 0, if |= is changed to =
> > yes in principle, but easy to break by mistake, so would keep.
> ok
> >>> +	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> >>> +	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> >>> +
> >>> +	if (desc->req.last_desc_in_batch) {
> >>> +		(*aq_dequeued)++;
> >>> +		desc->req.last_desc_in_batch = 0;
> >>> +	}
> >>> +	desc->rsp.val = ACC100_DMA_DESC_TYPE;
> >>> +	desc->rsp.add_info_0 = 0; /*Reserved bits */
> >>> +	desc->rsp.add_info_1 = 0; /*Reserved bits */
> >>> +
> >>> +	/* Flag that the muxing causes loss of opaque data */
> >>> +	op->opaque_data = (void *)-1;
> >> as a ptr, shouldn't opaque_data be poisoned with '0' ?
> > more obvious this way I think.
> 
> the idiom (ptr == NULL) would need to be changed.
> 
> as a non-standard poison, it is likely that someone will trip over this.

It is just that so far this opaque data pointer could also be used to pass data directly, as opposed to an actual pointer (i.e. like a counter).
Arguably a bit odd but like this prior to this PMD introduction.


> 
> >>> +	for (i = 0 ; i < desc->req.numCBs; i++)
> >>> +		ref_op[i] = op;
> >>> +
> >>> +	/* One or several CBs (ops) were successfully dequeued */
> >>> +	return desc->req.numCBs;
> >>> +}
> >>> +
> >>> +/* Dequeue one encode operation from ACC100 device in TB mode */
> >>> +static inline int
> >>> +dequeue_enc_one_op_tb(struct acc100_queue *q, struct
> >> rte_bbdev_enc_op **ref_op,
> >>> +		uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
> >>> +{
> >>> +	union acc100_dma_desc *desc, *last_desc, atom_desc;
> >>> +	union acc100_dma_rsp_desc rsp;
> >>> +	struct rte_bbdev_enc_op *op;
> >>> +	uint8_t i = 0;
> >>> +	uint16_t current_dequeued_cbs = 0, cbs_in_tb;
> >>> +
> >>> +	desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
> >>> +			& q->sw_ring_wrap_mask);
> >>> +	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> >>> +			__ATOMIC_RELAXED);
> >>> +
> >>> +	/* Check fdone bit */
> >>> +	if (!(atom_desc.rsp.val & ACC100_FDONE))
> >>> +		return -1;
> >>> +
> >>> +	/* Get number of CBs in dequeued TB */
> >>> +	cbs_in_tb = desc->req.cbs_in_tb;
> >>> +	/* Get last CB */
> >>> +	last_desc = q->ring_addr + ((q->sw_ring_tail
> >>> +			+ total_dequeued_cbs + cbs_in_tb - 1)
> >>> +			& q->sw_ring_wrap_mask);
> >>> +	/* Check if last CB in TB is ready to dequeue (and thus
> >>> +	 * the whole TB) - checking sdone bit. If not return.
> >>> +	 */
> >>> +	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
> >>> +			__ATOMIC_RELAXED);
> >>> +	if (!(atom_desc.rsp.val & ACC100_SDONE))
> >>> +		return -1;
> >>> +
> >>> +	/* Dequeue */
> >>> +	op = desc->req.op_addr;
> >>> +
> >>> +	/* Clearing status, it will be set based on response */
> >>> +	op->status = 0;
> >>> +
> >>> +	while (i < cbs_in_tb) {
> >>> +		desc = q->ring_addr + ((q->sw_ring_tail
> >>> +				+ total_dequeued_cbs)
> >>> +				& q->sw_ring_wrap_mask);
> >>> +		atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> >>> +				__ATOMIC_RELAXED);
> >>> +		rsp.val = atom_desc.rsp.val;
> >>> +		rte_bbdev_log_debug("Resp. desc %p: %x", desc,
> >>> +				rsp.val);
> >>> +
> >>> +		op->status |= ((rsp.input_err)
> >>> +				? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> >>> +		op->status |= ((rsp.dma_err) ? (1 <<
> >> RTE_BBDEV_DRV_ERROR) : 0);
> >>> +		op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR)
> >> : 0);
> >>> +
> >>> +		if (desc->req.last_desc_in_batch) {
> >>> +			(*aq_dequeued)++;
> >>> +			desc->req.last_desc_in_batch = 0;
> >>> +		}
> >>> +		desc->rsp.val = ACC100_DMA_DESC_TYPE;
> >>> +		desc->rsp.add_info_0 = 0;
> >>> +		desc->rsp.add_info_1 = 0;
> >>> +		total_dequeued_cbs++;
> >>> +		current_dequeued_cbs++;
> >>> +		i++;
> >>> +	}
> >>> +
> >>> +	*ref_op = op;
> >>> +
> >>> +	return current_dequeued_cbs;
> >>> +}
> >>> +
> >>> +/* Dequeue one decode operation from ACC100 device in CB mode */
> >>> +static inline int
> >>> +dequeue_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
> >>> +		struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
> >>> +		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> >>> +{
> >>> +	union acc100_dma_desc *desc, atom_desc;
> >>> +	union acc100_dma_rsp_desc rsp;
> >>> +	struct rte_bbdev_dec_op *op;
> >>> +
> >>> +	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> >>> +			& q->sw_ring_wrap_mask);
> >>> +	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> >>> +			__ATOMIC_RELAXED);
> >>> +
> >>> +	/* Check fdone bit */
> >>> +	if (!(atom_desc.rsp.val & ACC100_FDONE))
> >>> +		return -1;
> >>> +
> >>> +	rsp.val = atom_desc.rsp.val;
> >>> +	rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
> >>> +
> >>> +	/* Dequeue */
> >>> +	op = desc->req.op_addr;
> >>> +
> >>> +	/* Clearing status, it will be set based on response */
> >>> +	op->status = 0;
> >>> +	op->status |= ((rsp.input_err)
> >>> +			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> >> similar to above, can remove the = 0
> >>
> >> This is a general issue.
> > same comment above
> >
> >>> +	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> >>> +	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> >>> +	if (op->status != 0)
> >>> +		q_data->queue_stats.dequeue_err_count++;
> >>> +
> >>> +	/* CRC invalid if error exists */
> >>> +	if (!op->status)
> >>> +		op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
> >>> +	op->turbo_dec.iter_count = (uint8_t) rsp.iter_cnt / 2;
> >>> +	/* Check if this is the last desc in batch (Atomic Queue) */
> >>> +	if (desc->req.last_desc_in_batch) {
> >>> +		(*aq_dequeued)++;
> >>> +		desc->req.last_desc_in_batch = 0;
> >>> +	}
> >>> +	desc->rsp.val = ACC100_DMA_DESC_TYPE;
> >>> +	desc->rsp.add_info_0 = 0;
> >>> +	desc->rsp.add_info_1 = 0;
> >>> +	*ref_op = op;
> >>> +
> >>> +	/* One CB (op) was successfully dequeued */
> >>> +	return 1;
> >>> +}
> >>> +
> >>> +/* Dequeue one decode operation from ACC100 device in CB mode */
> >>> +static inline int
> >>> +dequeue_ldpc_dec_one_op_cb(struct rte_bbdev_queue_data
> *q_data,
> >>> +		struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
> >>> +		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> >>> +{
> >>> +	union acc100_dma_desc *desc, atom_desc;
> >>> +	union acc100_dma_rsp_desc rsp;
> >>> +	struct rte_bbdev_dec_op *op;
> >>> +
> >>> +	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> >>> +			& q->sw_ring_wrap_mask);
> >>> +	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> >>> +			__ATOMIC_RELAXED);
> >>> +
> >>> +	/* Check fdone bit */
> >>> +	if (!(atom_desc.rsp.val & ACC100_FDONE))
> >>> +		return -1;
> >>> +
> >>> +	rsp.val = atom_desc.rsp.val;
> >>> +
> >>> +	/* Dequeue */
> >>> +	op = desc->req.op_addr;
> >>> +
> >>> +	/* Clearing status, it will be set based on response */
> >>> +	op->status = 0;
> >>> +	op->status |= rsp.input_err << RTE_BBDEV_DATA_ERROR;
> >>> +	op->status |= rsp.dma_err << RTE_BBDEV_DRV_ERROR;
> >>> +	op->status |= rsp.fcw_err << RTE_BBDEV_DRV_ERROR;
> >>> +	if (op->status != 0)
> >>> +		q_data->queue_stats.dequeue_err_count++;
> >>> +
> >>> +	op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
> >>> +	if (op->ldpc_dec.hard_output.length > 0 && !rsp.synd_ok)
> >>> +		op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
> >>> +	op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
> >>> +
> >>> +	/* Check if this is the last desc in batch (Atomic Queue) */
> >>> +	if (desc->req.last_desc_in_batch) {
> >>> +		(*aq_dequeued)++;
> >>> +		desc->req.last_desc_in_batch = 0;
> >>> +	}
> >>> +
> >>> +	desc->rsp.val = ACC100_DMA_DESC_TYPE;
> >>> +	desc->rsp.add_info_0 = 0;
> >>> +	desc->rsp.add_info_1 = 0;
> >>> +
> >>> +	*ref_op = op;
> >>> +
> >>> +	/* One CB (op) was successfully dequeued */
> >>> +	return 1;
> >>> +}
> >>> +
> >>> +/* Dequeue one decode operation from ACC100 device in TB mode. */
> >>> +static inline int
> >>> +dequeue_dec_one_op_tb(struct acc100_queue *q, struct
> >> rte_bbdev_dec_op **ref_op,
> >>> +		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
> >>> +{
> >> similar call as fpga_lte_fec
> > distinct though as HW specific
> >
> >>> +	union acc100_dma_desc *desc, *last_desc, atom_desc;
> >>> +	union acc100_dma_rsp_desc rsp;
> >>> +	struct rte_bbdev_dec_op *op;
> >>> +	uint8_t cbs_in_tb = 1, cb_idx = 0;
> >>> +
> >>> +	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> >>> +			& q->sw_ring_wrap_mask);
> >>> +	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> >>> +			__ATOMIC_RELAXED);
> >>> +
> >>> +	/* Check fdone bit */
> >>> +	if (!(atom_desc.rsp.val & ACC100_FDONE))
> >>> +		return -1;
> >>> +
> >>> +	/* Dequeue */
> >>> +	op = desc->req.op_addr;
> >>> +
> >>> +	/* Get number of CBs in dequeued TB */
> >>> +	cbs_in_tb = desc->req.cbs_in_tb;
> >>> +	/* Get last CB */
> >>> +	last_desc = q->ring_addr + ((q->sw_ring_tail
> >>> +			+ dequeued_cbs + cbs_in_tb - 1)
> >>> +			& q->sw_ring_wrap_mask);
> >>> +	/* Check if last CB in TB is ready to dequeue (and thus
> >>> +	 * the whole TB) - checking sdone bit. If not return.
> >>> +	 */
> >>> +	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
> >>> +			__ATOMIC_RELAXED);
> >>> +	if (!(atom_desc.rsp.val & ACC100_SDONE))
> >>> +		return -1;
> >>> +
> >>> +	/* Clearing status, it will be set based on response */
> >>> +	op->status = 0;
> >>> +
> >>> +	/* Read remaining CBs if exists */
> >>> +	while (cb_idx < cbs_in_tb) {
> >> Other similar calls use 'i'; 'cb_idx' is more meaningful, consider changing
> >> the other loops.
> > More relevant here due to split of TB into CBs.
> ok
> >>> +		desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> >>> +				& q->sw_ring_wrap_mask);
> >>> +		atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
> >>> +				__ATOMIC_RELAXED);
> >>> +		rsp.val = atom_desc.rsp.val;
> >>> +		rte_bbdev_log_debug("Resp. desc %p: %x", desc,
> >>> +				rsp.val);
> >>> +
> >>> +		op->status |= ((rsp.input_err)
> >>> +				? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> >>> +		op->status |= ((rsp.dma_err) ? (1 <<
> >> RTE_BBDEV_DRV_ERROR) : 0);
> >>> +		op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR)
> >> : 0);
> >>> +
> >>> +		/* CRC invalid if error exists */
> >>> +		if (!op->status)
> >>> +			op->status |= rsp.crc_status <<
> >> RTE_BBDEV_CRC_ERROR;
> >>> +		op->turbo_dec.iter_count = RTE_MAX((uint8_t) rsp.iter_cnt,
> >>> +				op->turbo_dec.iter_count);
> >>> +
> >>> +		/* Check if this is the last desc in batch (Atomic Queue) */
> >>> +		if (desc->req.last_desc_in_batch) {
> >>> +			(*aq_dequeued)++;
> >>> +			desc->req.last_desc_in_batch = 0;
> >>> +		}
> >>> +		desc->rsp.val = ACC100_DMA_DESC_TYPE;
> >>> +		desc->rsp.add_info_0 = 0;
> >>> +		desc->rsp.add_info_1 = 0;
> >>> +		dequeued_cbs++;
> >>> +		cb_idx++;
> >>> +	}
> >>> +
> >>> +	*ref_op = op;
> >>> +
> >>> +	return cb_idx;
> >>> +}
> >>> +
> >>> +/* Dequeue LDPC encode operations from ACC100 device. */
> >>> +static uint16_t
> >>> +acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
> >>> +		struct rte_bbdev_enc_op **ops, uint16_t num)
> >>> +{
> >>> +	struct acc100_queue *q = q_data->queue_private;
> >>> +	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
> >>> +	uint32_t aq_dequeued = 0;
> >>> +	uint16_t dequeue_num, i, dequeued_cbs = 0, dequeued_descs = 0;
> >>> +	int ret;
> >>> +
> >>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> >>> +	if (unlikely(ops == 0 && q == NULL))
> >>> +		return 0;
> >>> +#endif
> >>> +
> >>> +	dequeue_num = (avail < num) ? avail : num;
> >> Similar to RTE_MIN
> >>
> >> general issue
> > ok, will check
> >
> >>> +
> >>> +	for (i = 0; i < dequeue_num; i++) {
> >>> +		ret = dequeue_enc_one_op_cb(q, &ops[dequeued_cbs],
> >>> +				dequeued_descs, &aq_dequeued);
> >>> +		if (ret < 0)
> >>> +			break;
> >>> +		dequeued_cbs += ret;
> >>> +		dequeued_descs++;
> >>> +		if (dequeued_cbs >= num)
> >>> +			break;
> >> condition should be added to the for-loop
> > unsure this would help readability personally
> 
> ok
> 
> Tom
> 
> >>> +	}
> >>> +
> >>> +	q->aq_dequeued += aq_dequeued;
> >>> +	q->sw_ring_tail += dequeued_descs;
> >>> +
> >>> +	/* Update dequeue stats */
> >>> +	q_data->queue_stats.dequeued_count += dequeued_cbs;
> >>> +
> >>> +	return dequeued_cbs;
> >>> +}
> >>> +
> >>> +/* Dequeue decode operations from ACC100 device. */
> >>> +static uint16_t
> >>> +acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
> >>> +		struct rte_bbdev_dec_op **ops, uint16_t num)
> >>> +{
> >>> +	struct acc100_queue *q = q_data->queue_private;
> >>> +	uint16_t dequeue_num;
> >>> +	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
> >>> +	uint32_t aq_dequeued = 0;
> >>> +	uint16_t i;
> >>> +	uint16_t dequeued_cbs = 0;
> >>> +	struct rte_bbdev_dec_op *op;
> >>> +	int ret;
> >>> +
> >>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> >>> +	if (unlikely(ops == 0 && q == NULL))
> >>> +		return 0;
> >>> +#endif
> >>> +
> >>> +	dequeue_num = (avail < num) ? avail : num;
> >>> +
> >>> +	for (i = 0; i < dequeue_num; ++i) {
> >>> +		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> >>> +			& q->sw_ring_wrap_mask))->req.op_addr;
> >>> +		if (op->ldpc_dec.code_block_mode == 0)
> >> 0 should be a #define
> > mentioned in previous review.
> >
> > Thanks
> >
> >> Tom
> >>
> >>> +			ret = dequeue_dec_one_op_tb(q, &ops[i],
> >> dequeued_cbs,
> >>> +					&aq_dequeued);
> >>> +		else
> >>> +			ret = dequeue_ldpc_dec_one_op_cb(
> >>> +					q_data, q, &ops[i], dequeued_cbs,
> >>> +					&aq_dequeued);
> >>> +
> >>> +		if (ret < 0)
> >>> +			break;
> >>> +		dequeued_cbs += ret;
> >>> +	}
> >>> +
> >>> +	q->aq_dequeued += aq_dequeued;
> >>> +	q->sw_ring_tail += dequeued_cbs;
> >>> +
> >>> +	/* Update dequeue stats */
> >>> +	q_data->queue_stats.dequeued_count += i;
> >>> +
> >>> +	return i;
> >>> +}
> >>> +
> >>>  /* Initialization Function */
> >>>  static void
> >>>  acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
> >>> @@ -703,6 +2321,10 @@
> >>>  	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
> >>>
> >>>  	dev->dev_ops = &acc100_bbdev_ops;
> >>> +	dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
> >>> +	dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
> >>> +	dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
> >>> +	dev->dequeue_ldpc_dec_ops = acc100_dequeue_ldpc_dec;
> >>>
> >>>  	((struct acc100_device *) dev->data->dev_private)->pf_device =
> >>>  			!strcmp(drv->driver.name,
> >>> @@ -815,4 +2437,3 @@ static int acc100_pci_remove(struct
> >> rte_pci_device *pci_dev)
> >>>  RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME,
> >> pci_id_acc100_pf_map);
> >>>  RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME,
> >> acc100_pci_vf_driver);
> >>>  RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME,
> >> pci_id_acc100_vf_map);
> >>> -
> >>> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h
> >> b/drivers/baseband/acc100/rte_acc100_pmd.h
> >>> index 0e2b79c..78686c1 100644
> >>> --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> >>> +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> >>> @@ -88,6 +88,8 @@
> >>>  #define TMPL_PRI_3      0x0f0e0d0c
> >>>  #define QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled
> */
> >>>  #define WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
> >>> +#define ACC100_FDONE    0x80000000
> >>> +#define ACC100_SDONE    0x40000000
> >>>
> >>>  #define ACC100_NUM_TMPL  32
> >>>  #define VF_OFFSET_QOS 16 /* offset in Memory Space specific to QoS
> >> Mon */
> >>> @@ -398,6 +400,7 @@ struct __rte_packed acc100_dma_req_desc {
> >>>  union acc100_dma_desc {
> >>>  	struct acc100_dma_req_desc req;
> >>>  	union acc100_dma_rsp_desc rsp;
> >>> +	uint64_t atom_hdr;
> >>>  };
> >>>
> >>>



* Re: [dpdk-dev] [PATCH v9 10/10] baseband/acc100: add configure function
  2020-09-30 22:54         ` Chautru, Nicolas
@ 2020-10-01 16:18           ` Tom Rix
  2020-10-01 21:11             ` Chautru, Nicolas
  0 siblings, 1 reply; 213+ messages in thread
From: Tom Rix @ 2020-10-01 16:18 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, akhil.goyal
  Cc: Richardson, Bruce, Xu, Rosen, dave.burley, aidan.goddard, Yigit,
	Ferruh, Liu, Tianjiao


On 9/30/20 3:54 PM, Chautru, Nicolas wrote:
> Hi Tom, 
>
>> From: Tom Rix <trix@redhat.com>
>> On 9/28/20 5:29 PM, Nicolas Chautru wrote:
>>> Add configure function to configure the PF from within the 
>>> bbdev-test itself without an external application configuring the device.
>>>
>>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
>>> Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
>>> ---
>>>  app/test-bbdev/test_bbdev_perf.c                   |  72 +++
>>>  doc/guides/rel_notes/release_20_11.rst             |   5 +
>>>  drivers/baseband/acc100/meson.build                |   2 +
>>>  drivers/baseband/acc100/rte_acc100_cfg.h           |  17 +
>>>  drivers/baseband/acc100/rte_acc100_pmd.c           | 505
>> +++++++++++++++++++++
>>>  .../acc100/rte_pmd_bbdev_acc100_version.map        |   7 +
>>>  6 files changed, 608 insertions(+)
>>>
>>> diff --git a/app/test-bbdev/test_bbdev_perf.c
>>> b/app/test-bbdev/test_bbdev_perf.c
>>> index 45c0d62..32f23ff 100644
>>> --- a/app/test-bbdev/test_bbdev_perf.c
>>> +++ b/app/test-bbdev/test_bbdev_perf.c
>>> @@ -52,6 +52,18 @@
>>>  #define FLR_5G_TIMEOUT 610
>>>  #endif
>>>
>>> +#ifdef RTE_LIBRTE_PMD_BBDEV_ACC100
>>> +#include <rte_acc100_cfg.h>
>>> +#define ACC100PF_DRIVER_NAME   ("intel_acc100_pf")
>>> +#define ACC100VF_DRIVER_NAME   ("intel_acc100_vf")
>>> +#define ACC100_QMGR_NUM_AQS 16
>>> +#define ACC100_QMGR_NUM_QGS 2
>>> +#define ACC100_QMGR_AQ_DEPTH 5
>>> +#define ACC100_QMGR_INVALID_IDX -1
>>> +#define ACC100_QMGR_RR 1
>>> +#define ACC100_QOS_GBR 0
>>> +#endif
>>> +
>>>  #define OPS_CACHE_SIZE 256U
>>>  #define OPS_POOL_SIZE_MIN 511U /* 0.5K per queue */
>>>
>>> @@ -653,6 +665,66 @@ typedef int (test_case_function)(struct
>> active_device *ad,
>>>  				info->dev_name);
>>>  	}
>>>  #endif
>>> +#ifdef RTE_LIBRTE_PMD_BBDEV_ACC100
>> seems like this function would break if one of the other bbdevs were
>> #defined.
> No these are independent. By default they are all defined. 
ok
>
>
>>> +	if ((get_init_device() == true) &&
>>> +		(!strcmp(info->drv.driver_name,
>> ACC100PF_DRIVER_NAME))) {
>>> +		struct acc100_conf conf;
>>> +		unsigned int i;
>>> +
>>> +		printf("Configure ACC100 FEC Driver %s with default
>> values\n",
>>> +				info->drv.driver_name);
>>> +
>>> +		/* clear default configuration before initialization */
>>> +		memset(&conf, 0, sizeof(struct acc100_conf));
>>> +
>>> +		/* Always set in PF mode for built-in configuration */
>>> +		conf.pf_mode_en = true;
>>> +		for (i = 0; i < RTE_ACC100_NUM_VFS; ++i) {
>>> +			conf.arb_dl_4g[i].gbr_threshold1 = ACC100_QOS_GBR;
>>> +			conf.arb_dl_4g[i].gbr_threshold2 = ACC100_QOS_GBR;
>>> +			conf.arb_dl_4g[i].round_robin_weight = ACC100_QMGR_RR;
>>> +			conf.arb_ul_4g[i].gbr_threshold1 = ACC100_QOS_GBR;
>>> +			conf.arb_ul_4g[i].gbr_threshold2 = ACC100_QOS_GBR;
>>> +			conf.arb_ul_4g[i].round_robin_weight = ACC100_QMGR_RR;
>>> +			conf.arb_dl_5g[i].gbr_threshold1 = ACC100_QOS_GBR;
>>> +			conf.arb_dl_5g[i].gbr_threshold2 = ACC100_QOS_GBR;
>>> +			conf.arb_dl_5g[i].round_robin_weight = ACC100_QMGR_RR;
>>> +			conf.arb_ul_5g[i].gbr_threshold1 = ACC100_QOS_GBR;
>>> +			conf.arb_ul_5g[i].gbr_threshold2 = ACC100_QOS_GBR;
>>> +			conf.arb_ul_5g[i].round_robin_weight = ACC100_QMGR_RR;
>>> +		}
>>> +
>>> +		conf.input_pos_llr_1_bit = true;
>>> +		conf.output_pos_llr_1_bit = true;
>>> +		conf.num_vf_bundles = 1; /**< Number of VF bundles to setup */
>>> +
>>> +		conf.q_ul_4g.num_qgroups = ACC100_QMGR_NUM_QGS;
>>> +		conf.q_ul_4g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
>>> +		conf.q_ul_4g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
>>> +		conf.q_ul_4g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
>>> +		conf.q_dl_4g.num_qgroups = ACC100_QMGR_NUM_QGS;
>>> +		conf.q_dl_4g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
>>> +		conf.q_dl_4g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
>>> +		conf.q_dl_4g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
>>> +		conf.q_ul_5g.num_qgroups = ACC100_QMGR_NUM_QGS;
>>> +		conf.q_ul_5g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
>>> +		conf.q_ul_5g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
>>> +		conf.q_ul_5g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
>>> +		conf.q_dl_5g.num_qgroups = ACC100_QMGR_NUM_QGS;
>>> +		conf.q_dl_5g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
>>> +		conf.q_dl_5g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
>>> +		conf.q_dl_5g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
>>> +
>>> +		/* setup PF with configuration information */
>>> +		ret = acc100_configure(info->dev_name, &conf);
>>> +		TEST_ASSERT_SUCCESS(ret,
>>> +				"Failed to configure ACC100 PF for bbdev %s",
>>> +				info->dev_name);
>>> +		/* Let's refresh this now this is configured */
>>> +	}
>>> +	rte_bbdev_info_get(dev_id, info);
>> The other bbdev's do not call rte_bbdev_info_get, can this be removed ?
> Actually it should be added outside for all versions post-configuration. Thanks
>
>>> +#endif
>>> +
>>>  	nb_queues = RTE_MIN(rte_lcore_count(), info->drv.max_num_queues);
>>>  	nb_queues = RTE_MIN(nb_queues, (unsigned int) MAX_QUEUES);
>>>
>>> diff --git a/doc/guides/rel_notes/release_20_11.rst
>>> b/doc/guides/rel_notes/release_20_11.rst
>>> index 73ac08f..c8d0586 100644
>>> --- a/doc/guides/rel_notes/release_20_11.rst
>>> +++ b/doc/guides/rel_notes/release_20_11.rst
>>> @@ -55,6 +55,11 @@ New Features
>>>       Also, make sure to start the actual text at the margin.
>>>       =======================================================
>>>
>>> +* **Added Intel ACC100 bbdev PMD.**
>>> +
>>> +  Added a new ``acc100`` bbdev driver for the Intel\ |reg| ACC100
>>> +  accelerator, also known as Mount Bryce. See the
>>> +  :doc:`../bbdevs/acc100` BBDEV guide for more details on this new driver.
>>>
>>>  Removed Items
>>>  -------------
>>> diff --git a/drivers/baseband/acc100/meson.build
>>> b/drivers/baseband/acc100/meson.build
>>> index 8afafc2..7ac44dc 100644
>>> --- a/drivers/baseband/acc100/meson.build
>>> +++ b/drivers/baseband/acc100/meson.build
>>> @@ -4,3 +4,5 @@
>>>  deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci']
>>>
>>>  sources = files('rte_acc100_pmd.c')
>>> +
>>> +install_headers('rte_acc100_cfg.h')
>>> diff --git a/drivers/baseband/acc100/rte_acc100_cfg.h
>>> b/drivers/baseband/acc100/rte_acc100_cfg.h
>>> index 73bbe36..7f523bc 100644
>>> --- a/drivers/baseband/acc100/rte_acc100_cfg.h
>>> +++ b/drivers/baseband/acc100/rte_acc100_cfg.h
>>> @@ -89,6 +89,23 @@ struct acc100_conf {
>>>  	struct rte_arbitration_t arb_dl_5g[RTE_ACC100_NUM_VFS];  };
>>>
>>> +/**
>>> + * Configure a ACC100 device
>>> + *
>>> + * @param dev_name
>>> + *   The name of the device. This is the short form of PCI BDF, e.g. 00:01.0.
>>> + *   It can also be retrieved for a bbdev device from the dev_name field in
>>> + *   the rte_bbdev_info structure returned by rte_bbdev_info_get().
>>> + * @param conf
>>> + *   Configuration to apply to ACC100 HW.
>>> + *
>>> + * @return
>>> + *   Zero on success, negative value on failure.
>>> + */
>>> +__rte_experimental
>>> +int
>>> +acc100_configure(const char *dev_name, struct acc100_conf *conf);
>>> +
>>>  #ifdef __cplusplus
>>>  }
>>>  #endif
>>> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
>>> b/drivers/baseband/acc100/rte_acc100_pmd.c
>>> index 3589814..b50dd32 100644
>>> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
>>> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
>>> @@ -85,6 +85,26 @@
>>>
>>>  enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC};
>>>
>>> +/* Return the accelerator enum for a Queue Group Index */
>>> +static inline int
>>> +accFromQgid(int qg_idx, const struct acc100_conf *acc100_conf)
>>> +{
>>> +	int accQg[ACC100_NUM_QGRPS];
>>> +	int NumQGroupsPerFn[NUM_ACC];
>>> +	int acc, qgIdx, qgIndex = 0;
>>> +	for (qgIdx = 0; qgIdx < ACC100_NUM_QGRPS; qgIdx++)
>>> +		accQg[qgIdx] = 0;
>>> +	NumQGroupsPerFn[UL_4G] = acc100_conf->q_ul_4g.num_qgroups;
>>> +	NumQGroupsPerFn[UL_5G] = acc100_conf->q_ul_5g.num_qgroups;
>>> +	NumQGroupsPerFn[DL_4G] = acc100_conf->q_dl_4g.num_qgroups;
>>> +	NumQGroupsPerFn[DL_5G] = acc100_conf->q_dl_5g.num_qgroups;
>>> +	for (acc = UL_4G;  acc < NUM_ACC; acc++)
>>> +		for (qgIdx = 0; qgIdx < NumQGroupsPerFn[acc]; qgIdx++)
>>> +			accQg[qgIndex++] = acc;
>> This looks inefficient, is there a way this could be calculated 
>> without filling arrays to
>>
>> access 1 value ?
> That is not time critical, and the same common code is run each time. 
ok
>
>>> +	acc = accQg[qg_idx];
>>> +	return acc;
>>> +}
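The reviewer asks whether the mapping above could avoid filling a lookup array just to read one value. It can: the accelerator enum is recoverable by walking the cumulative group counts directly. A minimal sketch, with a hypothetical helper name and a plain counts array standing in for the `acc100_conf` per-function `num_qgroups` fields:

```c
/* Accelerator function enums, mirroring the quoted patch */
enum { UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC };

/* Hypothetical variant: derive the accelerator for a queue-group index
 * by subtracting each function's group count in order, with no
 * intermediate lookup array.
 */
static inline int
acc_from_qgid_direct(int qg_idx, const int num_qgroups[NUM_ACC])
{
	int acc;

	for (acc = UL_4G; acc < NUM_ACC; acc++) {
		if (qg_idx < num_qgroups[acc])
			return acc; /* index falls inside this function's range */
		qg_idx -= num_qgroups[acc]; /* skip past this function's groups */
	}
	return 0; /* out of range: default to the first accelerator */
}
```

With two queue groups per function, indices 0-1 map to UL_4G, 2-3 to UL_5G, and so on, matching the array-filling loop in the patch.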
>>> +
>>>  /* Return the queue topology for a Queue Group Index */
>>>  static inline void
>>>  qtopFromAcc(struct rte_q_topology_t **qtop, int acc_enum,
>>> @@ -113,6 +133,30 @@
>>>  	*qtop = p_qtop;
>>>  }
>>>
>>> +/* Return the AQ depth for a Queue Group Index */
>>> +static inline int
>>> +aqDepth(int qg_idx, struct acc100_conf *acc100_conf)
>>> +{
>>> +	struct rte_q_topology_t *q_top = NULL;
>>> +	int acc_enum = accFromQgid(qg_idx, acc100_conf);
>>> +	qtopFromAcc(&q_top, acc_enum, acc100_conf);
>>> +	if (unlikely(q_top == NULL))
>>> +		return 0;
>> This error is not handled well be the callers.
>>
>> aqNum is similar.
> This fails on a consistent basis, by having no queue available and handling this as the default case.
ok
>
>>> +	return q_top->aq_depth_log2;
>>> +}
>>> +
>>> +/* Return the number of AQs for a Queue Group Index */
>>> +static inline int
>>> +aqNum(int qg_idx, struct acc100_conf *acc100_conf)
>>> +{
>>> +	struct rte_q_topology_t *q_top = NULL;
>>> +	int acc_enum = accFromQgid(qg_idx, acc100_conf);
>>> +	qtopFromAcc(&q_top, acc_enum, acc100_conf);
>>> +	if (unlikely(q_top == NULL))
>>> +		return 0;
>>> +	return q_top->num_aqs_per_groups;
>>> +}
>>> +
>>>  static void
>>>  initQTop(struct acc100_conf *acc100_conf)
>>>  {
>>> @@ -4177,3 +4221,464 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
>>>  RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
>>>  RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
>>>  RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
>>> +
>>> +/*
>>> + * Implementation to fix the power on status of some 5GUL engines
>>> + * This requires DMA permission if ported outside DPDK
>> This sounds like a workaround, can more detail be added here ?
> There are comments through the code I believe:
>   - /* Detect engines in undefined state */
>   - /* Force each engine which is in unspecified state */
>   - /* Reset LDPC Cores */
>   - /* Check engine power-on status again */
> Do you believe this is not explicit enough? Power-on status may be in an undefined state, hence these engines are activated with a dummy payload to make sure they are in a predictable state once configuration is done.

Yes, not explicit enough. They do not say it is a workaround, so someone else would not know that
this is needed or that it likely needs adjusting in the future. Maybe change

/* Check engine power-on status again */ to

/*
 * Power-on status may be in an undefined state.
 * Activate this engine with a dummy payload to make sure the state is defined.
 */

Tom

>>> + */
>>> +static void
>>> +poweron_cleanup(struct rte_bbdev *bbdev, struct acc100_device *d,
>>> +		struct acc100_conf *conf)
>>> +{
>>> +	int i, template_idx, qg_idx;
>>> +	uint32_t address, status, payload;
>>> +	printf("Need to clear power-on 5GUL status in internal memory\n");
>>> +	/* Reset LDPC Cores */
>>> +	for (i = 0; i < ACC100_ENGINES_MAX; i++)
>>> +		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
>>> +				ACC100_ENGINE_OFFSET * i, ACC100_RESET_HI);
>>> +	usleep(LONG_WAIT);
>>> +	for (i = 0; i < ACC100_ENGINES_MAX; i++)
>>> +		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
>>> +				ACC100_ENGINE_OFFSET * i, ACC100_RESET_LO);
>>> +	usleep(LONG_WAIT);
>>> +	/* Prepare dummy workload */
>>> +	alloc_2x64mb_sw_rings_mem(bbdev, d, 0);
>>> +	/* Set base addresses */
>>> +	uint32_t phys_high = (uint32_t)(d->sw_rings_phys >> 32);
>>> +	uint32_t phys_low  = (uint32_t)(d->sw_rings_phys &
>>> +			~(ACC100_SIZE_64MBYTE-1));
>>> +	acc100_reg_write(d, HWPfDmaFec5GulDescBaseHiRegVf,
>> phys_high);
>>> +	acc100_reg_write(d, HWPfDmaFec5GulDescBaseLoRegVf, phys_low);
>>> +
>>> +	/* Descriptor for a dummy 5GUL code block processing*/
>>> +	union acc100_dma_desc *desc = NULL;
>>> +	desc = d->sw_rings;
>>> +	desc->req.data_ptrs[0].address = d->sw_rings_phys +
>>> +			ACC100_DESC_FCW_OFFSET;
>>> +	desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
>>> +	desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
>>> +	desc->req.data_ptrs[0].last = 0;
>>> +	desc->req.data_ptrs[0].dma_ext = 0;
>>> +	desc->req.data_ptrs[1].address = d->sw_rings_phys + 512;
>>> +	desc->req.data_ptrs[1].blkid = ACC100_DMA_BLKID_IN;
>>> +	desc->req.data_ptrs[1].last = 1;
>>> +	desc->req.data_ptrs[1].dma_ext = 0;
>>> +	desc->req.data_ptrs[1].blen = 44;
>>> +	desc->req.data_ptrs[2].address = d->sw_rings_phys + 1024;
>>> +	desc->req.data_ptrs[2].blkid = ACC100_DMA_BLKID_OUT_ENC;
>>> +	desc->req.data_ptrs[2].last = 1;
>>> +	desc->req.data_ptrs[2].dma_ext = 0;
>>> +	desc->req.data_ptrs[2].blen = 5;
>>> +	/* Dummy FCW */
>>> +	desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
>>> +	desc->req.fcw_ld.qm = 1;
>>> +	desc->req.fcw_ld.nfiller = 30;
>>> +	desc->req.fcw_ld.BG = 2 - 1;
>>> +	desc->req.fcw_ld.Zc = 7;
>>> +	desc->req.fcw_ld.ncb = 350;
>>> +	desc->req.fcw_ld.rm_e = 4;
>>> +	desc->req.fcw_ld.itmax = 10;
>>> +	desc->req.fcw_ld.gain_i = 1;
>>> +	desc->req.fcw_ld.gain_h = 1;
>>> +
>>> +	int engines_to_restart[SIG_UL_5G_LAST + 1] = {0};
>>> +	int num_failed_engine = 0;
>>> +	/* Detect engines in undefined state */
>>> +	for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
>>> +			template_idx++) {
>>> +		/* Check engine power-on status */
>>> +		address = HwPfFecUl5gIbDebugReg +
>>> +				ACC100_ENGINE_OFFSET * template_idx;
>>> +		status = (acc100_reg_read(d, address) >> 4) & 0xF;
>>> +		if (status == 0) {
>>> +			engines_to_restart[num_failed_engine] = template_idx;
>>> +			num_failed_engine++;
>>> +		}
>>> +	}
>>> +
>>> +	int numQqsAcc = conf->q_ul_5g.num_qgroups;
>>> +	int numQgs = conf->q_ul_5g.num_qgroups;
>>> +	payload = 0;
>>> +	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
>>> +		payload |= (1 << qg_idx);
>>> +	/* Force each engine which is in unspecified state */
>>> +	for (i = 0; i < num_failed_engine; i++) {
>>> +		int failed_engine = engines_to_restart[i];
>>> +		printf("Force engine %d\n", failed_engine);
>>> +		for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
>>> +				template_idx++) {
>>> +			address = HWPfQmgrGrpTmplateReg4Indx
>>> +					+ BYTES_IN_WORD * template_idx;
>>> +			if (template_idx == failed_engine)
>>> +				acc100_reg_write(d, address, payload);
>>> +			else
>>> +				acc100_reg_write(d, address, 0);
>>> +		}
>>> +		/* Reset descriptor header */
>>> +		desc->req.word0 = ACC100_DMA_DESC_TYPE;
>>> +		desc->req.word1 = 0;
>>> +		desc->req.word2 = 0;
>>> +		desc->req.word3 = 0;
>>> +		desc->req.numCBs = 1;
>>> +		desc->req.m2dlen = 2;
>>> +		desc->req.d2mlen = 1;
>>> +		/* Enqueue the code block for processing */
>>> +		union acc100_enqueue_reg_fmt enq_req;
>>> +		enq_req.val = 0;
>>> +		enq_req.addr_offset = ACC100_DESC_OFFSET;
>>> +		enq_req.num_elem = 1;
>>> +		enq_req.req_elem_addr = 0;
>>> +		rte_wmb();
>>> +		acc100_reg_write(d, HWPfQmgrIngressAq + 0x100, enq_req.val);
>>> +		usleep(LONG_WAIT * 100);
>>> +		if (desc->req.word0 != 2)
>>> +			printf("DMA Response %#"PRIx32"\n", desc->req.word0);
>>> +	}
>>> +
>>> +	/* Reset LDPC Cores */
>>> +	for (i = 0; i < ACC100_ENGINES_MAX; i++)
>>> +		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
>>> +				ACC100_ENGINE_OFFSET * i, ACC100_RESET_HI);
>>> +	usleep(LONG_WAIT);
>>> +	for (i = 0; i < ACC100_ENGINES_MAX; i++)
>>> +		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
>>> +				ACC100_ENGINE_OFFSET * i, ACC100_RESET_LO);
>>> +	usleep(LONG_WAIT);
>>> +	acc100_reg_write(d, HWPfHi5GHardResetReg, ACC100_RESET_HARD);
>>> +	usleep(LONG_WAIT);
>>> +	int numEngines = 0;
>>> +	/* Check engine power-on status again */
>>> +	for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
>>> +			template_idx++) {
>>> +		address = HwPfFecUl5gIbDebugReg +
>>> +				ACC100_ENGINE_OFFSET * template_idx;
>>> +		status = (acc100_reg_read(d, address) >> 4) & 0xF;
>>> +		address = HWPfQmgrGrpTmplateReg4Indx
>>> +				+ BYTES_IN_WORD * template_idx;
>>> +		if (status == 1) {
>>> +			acc100_reg_write(d, address, payload);
>>> +			numEngines++;
>>> +		} else
>>> +			acc100_reg_write(d, address, 0);
>>> +	}
>>> +	printf("Number of 5GUL engines %d\n", numEngines);
>>> +
>>> +	if (d->sw_rings_base != NULL)
>>> +		rte_free(d->sw_rings_base);
>>> +	usleep(LONG_WAIT);
>>> +}
>>> +
>>> +/* Initial configuration of an ACC100 device prior to running configure() */
>>> +int
>>> +acc100_configure(const char *dev_name, struct acc100_conf *conf)
>>> +{
>>> +	rte_bbdev_log(INFO, "acc100_configure");
>>> +	uint32_t payload, address, status;
>> maybe value or data would be a better variable name than payload.
>>
>> would mean changing acc100_reg_write
> transparent to me, but can change given DPDK uses term value. 
>
>
>>> +	int qg_idx, template_idx, vf_idx, acc, i;
>>> +	struct rte_bbdev *bbdev = rte_bbdev_get_named_dev(dev_name);
>>> +
>>> +	/* Compile time checks */
>>> +	RTE_BUILD_BUG_ON(sizeof(struct acc100_dma_req_desc) != 256);
>>> +	RTE_BUILD_BUG_ON(sizeof(union acc100_dma_desc) != 256);
>>> +	RTE_BUILD_BUG_ON(sizeof(struct acc100_fcw_td) != 24);
>>> +	RTE_BUILD_BUG_ON(sizeof(struct acc100_fcw_te) != 32);
>>> +
>>> +	if (bbdev == NULL) {
>>> +		rte_bbdev_log(ERR,
>>> +		"Invalid dev_name (%s), or device is not yet initialised",
>>> +		dev_name);
>>> +		return -ENODEV;
>>> +	}
>>> +	struct acc100_device *d = bbdev->data->dev_private;
>>> +
>>> +	/* Store configuration */
>>> +	rte_memcpy(&d->acc100_conf, conf, sizeof(d->acc100_conf));
>>> +
>>> +	/* PCIe Bridge configuration */
>>> +	acc100_reg_write(d, HwPfPcieGpexBridgeControl, ACC100_CFG_PCI_BRIDGE);
>>> +	for (i = 1; i < 17; i++)
>> 17 is a magic number, use a #define
>>
>> this is a general issue.
> These are only used once but still agreed.
>
>>> +		acc100_reg_write(d,
>>> +			HwPfPcieGpexAxiAddrMappingWindowPexBaseHigh
>>> +				+ i * 16, 0);
>>> +
>>> +	/* PCIe Link Training and Status State Machine */
>>> +	acc100_reg_write(d, HwPfPcieGpexLtssmStateCntrl, 0xDFC00000);
>>> +
>>> +	/* Prevent blocking AXI read on BRESP for AXI Write */
>>> +	address = HwPfPcieGpexAxiPioControl;
>>> +	payload = ACC100_CFG_PCI_AXI;
>>> +	acc100_reg_write(d, address, payload);
>>> +
>>> +	/* 5GDL PLL phase shift */
>>> +	acc100_reg_write(d, HWPfChaDl5gPllPhshft0, 0x1);
>>> +
>>> +	/* Explicitly releasing AXI as this may be stopped after PF FLR/BME */
>>> +	address = HWPfDmaAxiControl;
>>> +	payload = 1;
>>> +	acc100_reg_write(d, address, payload);
>>> +
>>> +	/* DDR Configuration */
>>> +	address = HWPfDdrBcTim6;
>>> +	payload = acc100_reg_read(d, address);
>>> +	payload &= 0xFFFFFFFB; /* Bit 2 */
>>> +#ifdef ACC100_DDR_ECC_ENABLE
>>> +	payload |= 0x4;
>>> +#endif
>>> +	acc100_reg_write(d, address, payload);
>>> +	address = HWPfDdrPhyDqsCountNum;
>>> +#ifdef ACC100_DDR_ECC_ENABLE
>>> +	payload = 9;
>>> +#else
>>> +	payload = 8;
>>> +#endif
>>> +	acc100_reg_write(d, address, payload);
>>> +
>>> +	/* Set default descriptor signature */
>>> +	address = HWPfDmaDescriptorSignatuture;
>>> +	payload = 0;
>>> +	acc100_reg_write(d, address, payload);
>>> +
>>> +	/* Enable the Error Detection in DMA */
>>> +	payload = ACC100_CFG_DMA_ERROR;
>>> +	address = HWPfDmaErrorDetectionEn;
>>> +	acc100_reg_write(d, address, payload);
>>> +
>>> +	/* AXI Cache configuration */
>>> +	payload = ACC100_CFG_AXI_CACHE;
>>> +	address = HWPfDmaAxcacheReg;
>>> +	acc100_reg_write(d, address, payload);
>>> +
>>> +	/* Default DMA Configuration (Qmgr Enabled) */
>>> +	address = HWPfDmaConfig0Reg;
>>> +	payload = 0;
>>> +	acc100_reg_write(d, address, payload);
>>> +	address = HWPfDmaQmanen;
>>> +	payload = 0;
>>> +	acc100_reg_write(d, address, payload);
>>> +
>>> +	/* Default RLIM/ALEN configuration */
>>> +	address = HWPfDmaConfig1Reg;
>>> +	payload = (1 << 31) + (23 << 8) + (1 << 6) + 7;
>>> +	acc100_reg_write(d, address, payload);
>>> +
>>> +	/* Configure DMA Qmanager addresses */
>>> +	address = HWPfDmaQmgrAddrReg;
>>> +	payload = HWPfQmgrEgressQueuesTemplate;
>>> +	acc100_reg_write(d, address, payload);
>>> +
>>> +	/* ===== Qmgr Configuration ===== */
>>> +	/* Configuration of the AQueue Depth QMGR_GRP_0_DEPTH_LOG2 for UL */
>>> +	int totalQgs = conf->q_ul_4g.num_qgroups +
>>> +			conf->q_ul_5g.num_qgroups +
>>> +			conf->q_dl_4g.num_qgroups +
>>> +			conf->q_dl_5g.num_qgroups;
>>> +	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
>>> +		address = HWPfQmgrDepthLog2Grp +
>>> +		BYTES_IN_WORD * qg_idx;
>>> +		payload = aqDepth(qg_idx, conf);
>>> +		acc100_reg_write(d, address, payload);
>>> +		address = HWPfQmgrTholdGrp +
>>> +		BYTES_IN_WORD * qg_idx;
>>> +		payload = (1 << 16) + (1 << (aqDepth(qg_idx, conf) - 1));
>>> +		acc100_reg_write(d, address, payload);
>>> +	}
>>> +
>>> +	/* Template Priority in incremental order */
>>> +	for (template_idx = 0; template_idx < ACC100_NUM_TMPL;
>>> +			template_idx++) {
>>> +		address = HWPfQmgrGrpTmplateReg0Indx +
>>> +		BYTES_IN_WORD * (template_idx % 8);
>>> +		payload = TMPL_PRI_0;
>>> +		acc100_reg_write(d, address, payload);
>>> +		address = HWPfQmgrGrpTmplateReg1Indx +
>>> +		BYTES_IN_WORD * (template_idx % 8);
>>> +		payload = TMPL_PRI_1;
>>> +		acc100_reg_write(d, address, payload);
>>> +		address = HWPfQmgrGrpTmplateReg2indx +
>>> +		BYTES_IN_WORD * (template_idx % 8);
>>> +		payload = TMPL_PRI_2;
>>> +		acc100_reg_write(d, address, payload);
>>> +		address = HWPfQmgrGrpTmplateReg3Indx +
>>> +		BYTES_IN_WORD * (template_idx % 8);
>>> +		payload = TMPL_PRI_3;
>>> +		acc100_reg_write(d, address, payload);
>>> +	}
>>> +
>>> +	address = HWPfQmgrGrpPriority;
>>> +	payload = ACC100_CFG_QMGR_HI_P;
>>> +	acc100_reg_write(d, address, payload);
>>> +
>>> +	/* Template Configuration */
>>> +	for (template_idx = 0; template_idx < ACC100_NUM_TMPL;
>>> +			template_idx++) {
>>> +		payload = 0;
>>> +		address = HWPfQmgrGrpTmplateReg4Indx
>>> +				+ BYTES_IN_WORD * template_idx;
>>> +		acc100_reg_write(d, address, payload);
>>> +	}
>>> +	/* 4GUL */
>>> +	int numQgs = conf->q_ul_4g.num_qgroups;
>>> +	int numQqsAcc = 0;
>>> +	payload = 0;
>>> +	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
>>> +		payload |= (1 << qg_idx);
>>> +	for (template_idx = SIG_UL_4G; template_idx <= SIG_UL_4G_LAST;
>>> +			template_idx++) {
>>> +		address = HWPfQmgrGrpTmplateReg4Indx
>>> +				+ BYTES_IN_WORD*template_idx;
>>> +		acc100_reg_write(d, address, payload);
>>> +	}
>>> +	/* 5GUL */
>>> +	numQqsAcc += numQgs;
>>> +	numQgs	= conf->q_ul_5g.num_qgroups;
>>> +	payload = 0;
>>> +	int numEngines = 0;
>>> +	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
>>> +		payload |= (1 << qg_idx);
>>> +	for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
>>> +			template_idx++) {
>>> +		/* Check engine power-on status */
>>> +		address = HwPfFecUl5gIbDebugReg +
>>> +				ACC100_ENGINE_OFFSET * template_idx;
>>> +		status = (acc100_reg_read(d, address) >> 4) & 0xF;
>>> +		address = HWPfQmgrGrpTmplateReg4Indx
>>> +				+ BYTES_IN_WORD * template_idx;
>>> +		if (status == 1) {
>>> +			acc100_reg_write(d, address, payload);
>>> +			numEngines++;
>>> +		} else
>>> +			acc100_reg_write(d, address, 0);
>>> +		#if RTE_ACC100_SINGLE_FEC == 1
>> #if should be at start of line
> ok
>
>>> +		payload = 0;
>>> +		#endif
>>> +	}
>>> +	printf("Number of 5GUL engines %d\n", numEngines);
>>> +	/* 4GDL */
>>> +	numQqsAcc += numQgs;
>>> +	numQgs	= conf->q_dl_4g.num_qgroups;
>>> +	payload = 0;
>>> +	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
>>> +		payload |= (1 << qg_idx);
>>> +	for (template_idx = SIG_DL_4G; template_idx <= SIG_DL_4G_LAST;
>>> +			template_idx++) {
>>> +		address = HWPfQmgrGrpTmplateReg4Indx
>>> +				+ BYTES_IN_WORD*template_idx;
>>> +		acc100_reg_write(d, address, payload);
>>> +		#if RTE_ACC100_SINGLE_FEC == 1
>>> +			payload = 0;
>>> +		#endif
>>> +	}
>>> +	/* 5GDL */
>>> +	numQqsAcc += numQgs;
>>> +	numQgs	= conf->q_dl_5g.num_qgroups;
>>> +	payload = 0;
>>> +	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
>>> +		payload |= (1 << qg_idx);
>>> +	for (template_idx = SIG_DL_5G; template_idx <= SIG_DL_5G_LAST;
>>> +			template_idx++) {
>>> +		address = HWPfQmgrGrpTmplateReg4Indx
>>> +				+ BYTES_IN_WORD*template_idx;
>>> +		acc100_reg_write(d, address, payload);
>>> +		#if RTE_ACC100_SINGLE_FEC == 1
>>> +		payload = 0;
>>> +		#endif
>>> +	}
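Each of the template-configuration loops above builds the same kind of contiguous run of queue-group enable bits, one bit at a time. The mask can equally be formed in a single expression; a sketch with a hypothetical helper name:

```c
#include <stdint.h>

/* Hypothetical helper: contiguous run of num_qgs enable bits starting
 * at bit position first_qg, as written into the Qmgr template register.
 */
static inline uint32_t
qgrp_bitmask(unsigned int first_qg, unsigned int num_qgs)
{
	/* num_qgs low bits set, then shifted up to start at first_qg */
	return (((uint32_t)1 << num_qgs) - 1) << first_qg;
}
```

For example, two 5GUL queue groups following four 4GUL groups would give the mask `qgrp_bitmask(4, 2)`, i.e. bits 4 and 5 set.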
>>> +
>>> +	/* Queue Group Function mapping */
>>> +	int qman_func_id[5] = {0, 2, 1, 3, 4};
>>> +	address = HWPfQmgrGrpFunction0;
>>> +	payload = 0;
>>> +	for (qg_idx = 0; qg_idx < 8; qg_idx++) {
>>> +		acc = accFromQgid(qg_idx, conf);
>>> +		payload |= qman_func_id[acc]<<(qg_idx * 4);
>>> +	}
>>> +	acc100_reg_write(d, address, payload);
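The queue-group function mapping above packs one 4-bit function id per queue group, eight groups per 32-bit register. A standalone sketch of that packing (helper name hypothetical):

```c
#include <stdint.h>

/* Hypothetical helper: pack eight 4-bit queue-manager function ids into
 * one 32-bit group-function register value, group 0 in the low nibble.
 */
static inline uint32_t
pack_qgrp_functions(const int func_ids[8])
{
	uint32_t reg = 0;
	int qg;

	for (qg = 0; qg < 8; qg++)
		reg |= (uint32_t)(func_ids[qg] & 0xF) << (qg * 4);
	return reg;
}
```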
>>> +
>>> +	/* Configuration of the Arbitration QGroup depth to 1 */
>>> +	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
>>> +		address = HWPfQmgrArbQDepthGrp +
>>> +		BYTES_IN_WORD * qg_idx;
>>> +		payload = 0;
>>> +		acc100_reg_write(d, address, payload);
>>> +	}
>>> +
>>> +	/* Enabling AQueues through the Queue hierarchy*/
>>> +	for (vf_idx = 0; vf_idx < ACC100_NUM_VFS; vf_idx++) {
>>> +		for (qg_idx = 0; qg_idx < ACC100_NUM_QGRPS; qg_idx++) {
>>> +			payload = 0;
>>> +			if (vf_idx < conf->num_vf_bundles &&
>>> +					qg_idx < totalQgs)
>>> +				payload = (1 << aqNum(qg_idx, conf)) - 1;
>>> +			address = HWPfQmgrAqEnableVf
>>> +					+ vf_idx * BYTES_IN_WORD;
>>> +			payload += (qg_idx << 16);
>>> +			acc100_reg_write(d, address, payload);
>>> +		}
>>> +	}
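The AQ-enable write above combines an enable bitmask for the atomic queues in the low 16 bits with the target queue-group index in the upper bits of the same word. A sketch of that encoding (helper name hypothetical):

```c
#include <stdint.h>

/* Hypothetical helper: register value enabling num_aqs atomic queues
 * of queue group qg_idx, matching the payload built in the loop above.
 */
static inline uint32_t
aq_enable_payload(unsigned int qg_idx, unsigned int num_aqs)
{
	/* Low 16 bits: one enable bit per AQ; upper bits: target QGroup */
	return (qg_idx << 16) | (((uint32_t)1 << num_aqs) - 1);
}
```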
>>> +
>>> +	/* This pointer to ARAM (256kB) is shifted by 2 (4B per register) */
>>> +	uint32_t aram_address = 0;
>>> +	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
>>> +		for (vf_idx = 0; vf_idx < conf->num_vf_bundles; vf_idx++) {
>>> +			address = HWPfQmgrVfBaseAddr + vf_idx
>>> +					* BYTES_IN_WORD + qg_idx
>>> +					* BYTES_IN_WORD * 64;
>>> +			payload = aram_address;
>>> +			acc100_reg_write(d, address, payload);
>>> +			/* Offset ARAM Address for next memory bank
>>> +			 * - increment of 4B
>>> +			 */
>>> +			aram_address += aqNum(qg_idx, conf) *
>>> +					(1 << aqDepth(qg_idx, conf));
>>> +		}
>>> +	}
>>> +
>>> +	if (aram_address > WORDS_IN_ARAM_SIZE) {
>>> +		rte_bbdev_log(ERR, "ARAM Configuration not fitting %d %d\n",
>>> +				aram_address, WORDS_IN_ARAM_SIZE);
>>> +		return -EINVAL;
>>> +	}
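The overflow check above compares the running ARAM word address against the available size. For the uniform topology used in this patch (same AQ count and depth for every group), the total footprint reduces to a single product; a sketch under that assumption, with a hypothetical helper name:

```c
#include <stdint.h>

/* Hypothetical helper: total ARAM words consumed when every
 * (queue group, VF bundle) pair gets aqs_per_group queues of depth
 * 2^aq_depth_log2. Assumes a uniform topology across groups.
 */
static uint32_t
aram_words_needed(unsigned int total_qgs, unsigned int num_vf_bundles,
		unsigned int aqs_per_group, unsigned int aq_depth_log2)
{
	/* Each (qgroup, VF) pair consumes num_aqs * 2^depth words */
	return total_qgs * num_vf_bundles * aqs_per_group *
			(1u << aq_depth_log2);
}
```

With the defaults quoted earlier (16 AQs of depth 2^5 per group), 8 queue groups on a single VF bundle consume 8 * 16 * 32 = 4096 words, comfortably within the 256 kB ARAM.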
>>> +
>>> +	/* ==== HI Configuration ==== */
>>> +
>>> +	/* Prevent Block on Transmit Error */
>>> +	address = HWPfHiBlockTransmitOnErrorEn;
>>> +	payload = 0;
>>> +	acc100_reg_write(d, address, payload);
>>> +	/* Prevent MSI from being dropped */
>>> +	address = HWPfHiMsiDropEnableReg;
>>> +	payload = 0;
>>> +	acc100_reg_write(d, address, payload);
>>> +	/* Set the PF Mode register */
>>> +	address = HWPfHiPfMode;
>>> +	payload = (conf->pf_mode_en) ? 2 : 0;
>>> +	acc100_reg_write(d, address, payload);
>>> +	/* Enable Error Detection in HW */
>>> +	address = HWPfDmaErrorDetectionEn;
>>> +	payload = 0x3D7;
>>> +	acc100_reg_write(d, address, payload);
>>> +
>>> +	/* QoS overflow init */
>>> +	payload = 1;
>>> +	address = HWPfQosmonAEvalOverflow0;
>>> +	acc100_reg_write(d, address, payload);
>>> +	address = HWPfQosmonBEvalOverflow0;
>>> +	acc100_reg_write(d, address, payload);
>>> +
>>> +	/* HARQ DDR Configuration */
>>> +	unsigned int ddrSizeInMb = 512; /* Fixed to 512 MB per VF for now */
>>> +	for (vf_idx = 0; vf_idx < conf->num_vf_bundles; vf_idx++) {
>>> +		address = HWPfDmaVfDdrBaseRw + vf_idx * 0x10;
>>> +		payload = ((vf_idx * (ddrSizeInMb / 64)) << 16) +
>>> +				(ddrSizeInMb - 1);
>>> +		acc100_reg_write(d, address, payload);
>>> +	}
>>> +	usleep(LONG_WAIT);
>> Is sleep needed here ? the reg_write has one.
> This one is needed on top
>
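The HARQ DDR register written in the loop above encodes a per-VF base offset (in 64 MB units) in the upper half-word and the region size minus one (in MB) in the lower half-word. A sketch of that encoding (helper name hypothetical):

```c
#include <stdint.h>

/* Hypothetical helper: per-VF HARQ DDR base/size register value, as
 * computed for HWPfDmaVfDdrBaseRw above. ddr_size_mb is the fixed
 * per-VF slice (512 MB in the quoted patch).
 */
static inline uint32_t
harq_ddr_reg(unsigned int vf_idx, unsigned int ddr_size_mb)
{
	/* Upper 16 bits: base offset in 64MB units; lower: size-1 in MB */
	return ((vf_idx * (ddr_size_mb / 64)) << 16) + (ddr_size_mb - 1);
}
```

VF 0 thus gets offset 0 with size 511 in the low half, VF 1 an offset of 8 64-MB units, and so on.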
>>> +
>> Since this seems like a workaround, add a comment here.
> fair enough, ok, thanks
>
>> Tom
>>
>>> +	if (numEngines < (SIG_UL_5G_LAST + 1))
>>> +		poweron_cleanup(bbdev, d, conf);
>>> +
>>> +	rte_bbdev_log_debug("PF Tip configuration complete for %s", dev_name);
>>> +	return 0;
>>> +}
>>> diff --git 
>>> a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
>>> b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
>>> index 4a76d1d..91c234d 100644
>>> --- a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
>>> +++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
>>> @@ -1,3 +1,10 @@
>>>  DPDK_21 {
>>>  	local: *;
>>>  };
>>> +
>>> +EXPERIMENTAL {
>>> +	global:
>>> +
>>> +	acc100_configure;
>>> +
>>> +};


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v10 03/10] baseband/acc100: add info get function
  2020-10-01 14:34       ` Maxime Coquelin
@ 2020-10-01 19:50         ` Chautru, Nicolas
  0 siblings, 0 replies; 213+ messages in thread
From: Chautru, Nicolas @ 2020-10-01 19:50 UTC (permalink / raw)
  To: Maxime Coquelin, dev, akhil.goyal
  Cc: Richardson, Bruce, Xu, Rosen, trix, Yigit, Ferruh, Liu, Tianjiao

Hi Maxime, 
Ok for all. I can rename. 

> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> On 10/1/20 5:14 AM, Nicolas Chautru wrote:
> > Add in the "info_get" function to the driver, to allow us to query the
> > device.
> > No processing capabilities are available yet.
> > Linking bbdev-test to support the PMD with null capability.
> >
> > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
> > ---
> >  app/test-bbdev/meson.build               |   3 +
> >  drivers/baseband/acc100/rte_acc100_cfg.h |  96 +++++++++++++
> >  drivers/baseband/acc100/rte_acc100_pmd.c | 229 +++++++++++++++++++++++++++++++
> >  drivers/baseband/acc100/rte_acc100_pmd.h |  10 ++
> >  4 files changed, 338 insertions(+)
> >  create mode 100644 drivers/baseband/acc100/rte_acc100_cfg.h
> >
> > diff --git a/app/test-bbdev/meson.build b/app/test-bbdev/meson.build
> > index 18ab6a8..fbd8ae3 100644
> > --- a/app/test-bbdev/meson.build
> > +++ b/app/test-bbdev/meson.build
> > @@ -12,3 +12,6 @@ endif
> >  if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC')
> >  	deps += ['pmd_bbdev_fpga_5gnr_fec']
> >  endif
> > +if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_ACC100')
> > +	deps += ['pmd_bbdev_acc100']
> > +endif
> > \ No newline at end of file
> > diff --git a/drivers/baseband/acc100/rte_acc100_cfg.h
> > b/drivers/baseband/acc100/rte_acc100_cfg.h
> > new file mode 100644
> > index 0000000..73bbe36
> > --- /dev/null
> > +++ b/drivers/baseband/acc100/rte_acc100_cfg.h
> > @@ -0,0 +1,96 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2020 Intel Corporation
> > + */
> > +
> > +#ifndef _RTE_ACC100_CFG_H_
> > +#define _RTE_ACC100_CFG_H_
> > +
> > +/**
> > + * @file rte_acc100_cfg.h
> > + *
> > + * Functions for configuring ACC100 HW, exposed directly to applications.
> > + * Configuration related to encoding/decoding is done through the
> > + * librte_bbdev library.
> > + *
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice
> > + */
> > +
> > +#include <stdint.h>
> > +#include <stdbool.h>
> > +
> > +#ifdef __cplusplus
> > +extern "C" {
> > +#endif
> > +
> > +/**< Number of Virtual Functions ACC100 supports */
> > +#define RTE_ACC100_NUM_VFS 16
> > +
> > +/**
> > + * Definition of Queue Topology for ACC100 Configuration
> > + * Some level of detail is abstracted out to expose a clean interface
> > + * given that comprehensive flexibility is not required
> > + */
> > +struct rte_q_topology_t {
> 
> The naming is too generic, it has to contain the driver name.
> Also, it should not be postfixed with _t, as it is not a typedef.
> 
> "struct rte_acc100_queue_topology"?
> 
> > +	/** Number of QGroups in incremental order of priority */
> > +	uint16_t num_qgroups;
> > +	/**
> > +	 * All QGroups have the same number of AQs here.
> > +	 * Note : Could be made a 16-array if more flexibility is really
> > +	 * required
> > +	 */
> > +	uint16_t num_aqs_per_groups;
> > +	/**
> > +	 * Depth of the AQs is the same of all QGroups here. Log2 Enum : 2^N
> > +	 * Note : Could be made a 16-array if more flexibility is really
> > +	 * required
> > +	 */
> > +	uint16_t aq_depth_log2;
> > +	/**
> > +	 * Index of the first Queue Group Index - assuming contiguity
> > +	 * Initialized as -1
> > +	 */
> > +	int8_t first_qgroup_index;
> > +};
> > +
> > +/**
> > + * Definition of Arbitration related parameters for ACC100 Configuration
> > + */
> > +struct rte_arbitration_t {
> 
> Same remark here.
> 
> > +	/** Default Weight for VF Fairness Arbitration */
> > +	uint16_t round_robin_weight;
> > +	uint32_t gbr_threshold1; /**< Guaranteed Bitrate Threshold 1 */
> > +	uint32_t gbr_threshold2; /**< Guaranteed Bitrate Threshold 2 */ };
> > +
> > +/**
> > + * Structure to pass ACC100 configuration.
> > + * Note: all VF Bundles will have the same configuration.
> > + */
> > +struct acc100_conf {
> 
> "struct rte_acc100_conf"?
> 
> > +	bool pf_mode_en; /**< 1 if PF is used for dataplane, 0 for VFs */
> > +	/** 1 if input '1' bit is represented by a positive LLR value, 0 if '1'
> > +	 * bit is represented by a negative value.
> > +	 */
> > +	bool input_pos_llr_1_bit;
> > +	/** 1 if output '1' bit is represented by a positive value, 0 if '1'
> > +	 * bit is represented by a negative value.
> > +	 */
> > +	bool output_pos_llr_1_bit;
> > +	uint16_t num_vf_bundles; /**< Number of VF bundles to setup */
> > +	/** Queue topology for each operation type */
> > +	struct rte_q_topology_t q_ul_4g;
> > +	struct rte_q_topology_t q_dl_4g;
> > +	struct rte_q_topology_t q_ul_5g;
> > +	struct rte_q_topology_t q_dl_5g;
> > +	/** Arbitration configuration for each operation type */
> > +	struct rte_arbitration_t arb_ul_4g[RTE_ACC100_NUM_VFS];
> > +	struct rte_arbitration_t arb_dl_4g[RTE_ACC100_NUM_VFS];
> > +	struct rte_arbitration_t arb_ul_5g[RTE_ACC100_NUM_VFS];
> > +	struct rte_arbitration_t arb_dl_5g[RTE_ACC100_NUM_VFS]; };
> > +
> > +#ifdef __cplusplus
> > +}
> > +#endif
> > +
> > +#endif /* _RTE_ACC100_CFG_H_ */
> 
> Regards,
> Maxime


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v10 04/10] baseband/acc100: add queue configuration
  2020-10-01 15:38       ` Maxime Coquelin
@ 2020-10-01 19:50         ` Chautru, Nicolas
  0 siblings, 0 replies; 213+ messages in thread
From: Chautru, Nicolas @ 2020-10-01 19:50 UTC (permalink / raw)
  To: Maxime Coquelin, dev, akhil.goyal
  Cc: Richardson, Bruce, Xu, Rosen, trix, Yigit, Ferruh, Liu, Tianjiao

Hi Maxime, 

> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> On 10/1/20 5:14 AM, Nicolas Chautru wrote:
> > Adding function to create and configure queues for the device. Still
> > no capability.
> >
> > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > Reviewed-by: Rosen Xu <rosen.xu@intel.com>
> > Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
> > ---
> >  drivers/baseband/acc100/rte_acc100_pmd.c | 438 ++++++++++++++++++++++++++++++-
> > drivers/baseband/acc100/rte_acc100_pmd.h |  45 ++++
> >  2 files changed, 482 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> > b/drivers/baseband/acc100/rte_acc100_pmd.c
> > index 98a17b3..709a7af 100644
> > --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> > +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> > @@ -26,6 +26,22 @@
> >  RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);  #endif
> >
> > +/* Write to MMIO register address */
> > +static inline void
> > +mmio_write(void *addr, uint32_t value) {
> > +	*((volatile uint32_t *)(addr)) = rte_cpu_to_le_32(value); }
> > +
> > +/* Write a register of a ACC100 device */ static inline void
> > +acc100_reg_write(struct acc100_device *d, uint32_t offset, uint32_t
> > +payload) {
> > +	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
> > +	mmio_write(reg_addr, payload);
> > +	usleep(ACC100_LONG_WAIT);
> 
> Is it really needed to sleep after the MMIO write access?

Not necessarily for every single register, but for some; and these MMIO
writes are all used outside of the real-time functions.


> 
> > +}
> > +
> >  /* Read a register of a ACC100 device */  static inline uint32_t
> > acc100_reg_read(struct acc100_device *d, uint32_t offset) @@ -36,6
> > +52,22 @@
> >  	return rte_le_to_cpu_32(ret);
> >  }
> >
> > +/* Basic Implementation of Log2 for exact 2^N */ static inline
> > +uint32_t log2_basic(uint32_t value) {
> > +	return (value == 0) ? 0 : rte_bsf32(value); }
> > +
> > +/* Calculate memory alignment offset assuming alignment is 2^N */
> > +static inline uint32_t calc_mem_alignment_offset(void
> > +*unaligned_virt_mem, uint32_t alignment) {
> > +	rte_iova_t unaligned_phy_mem =
> rte_malloc_virt2iova(unaligned_virt_mem);
> > +	return (uint32_t)(alignment -
> > +			(unaligned_phy_mem & (alignment-1))); }
> > +
> >  /* Calculate the offset of the enqueue register */  static inline
> > uint32_t  queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id,
> > uint16_t aq_id) @@ -208,10 +240,411 @@
> >  			acc100_conf->q_dl_5g.aq_depth_log2);
> >  }
> >
> > +static void
> > +free_base_addresses(void **base_addrs, int size) {
> > +	int i;
> > +	for (i = 0; i < size; i++)
> > +		rte_free(base_addrs[i]);
> > +}
> > +
> > +static inline uint32_t
> > +get_desc_len(void)
> > +{
> > +	return sizeof(union acc100_dma_desc); }
> > +
> > +/* Allocate the 2 * 64MB block for the sw rings */ static int
> > +alloc_2x64mb_sw_rings_mem(struct rte_bbdev *dev, struct
> acc100_device *d,
> > +		int socket)
> > +{
> > +	uint32_t sw_ring_size = ACC100_SIZE_64MBYTE;
> > +	d->sw_rings_base = rte_zmalloc_socket(dev->device->driver->name,
> > +			2 * sw_ring_size, RTE_CACHE_LINE_SIZE, socket);
> > +	if (d->sw_rings_base == NULL) {
> > +		rte_bbdev_log(ERR, "Failed to allocate memory for %s:%u",
> > +				dev->device->driver->name,
> > +				dev->data->dev_id);
> > +		return -ENOMEM;
> > +	}
> > +	memset(d->sw_rings_base, 0, ACC100_SIZE_64MBYTE);
> 
> Having used zmalloc, the memset looks overkill. Also, it does not clear all the
> allocated are, don't know if this is expected.

Agreed thanks.

> 
> > +	uint32_t next_64mb_align_offset = calc_mem_alignment_offset(
> > +			d->sw_rings_base, ACC100_SIZE_64MBYTE);
> > +	d->sw_rings = RTE_PTR_ADD(d->sw_rings_base,
> next_64mb_align_offset);
> > +	d->sw_rings_phys = rte_malloc_virt2iova(d->sw_rings_base) +
> > +			next_64mb_align_offset;
> 
> sw_rings_phys should be renamed to sw_rings_iova, as it could be a VA if
> IOVA_AS_VA more is used.

agreed

> 
> > +	d->sw_ring_size = ACC100_MAX_QUEUE_DEPTH * get_desc_len();
> > +	d->sw_ring_max_depth = d->sw_ring_size / get_desc_len();
> 
> d->sw_ring_max_depth = ACC100_MAX_QUEUE_DEPTH;

sure

> 
> > +
> > +	return 0;
> > +}
> > +
> > +/* Attempt to allocate minimised memory space for sw rings */ static
> > +void alloc_sw_rings_min_mem(struct rte_bbdev *dev, struct
> > +acc100_device *d,
> > +		uint16_t num_queues, int socket)
> > +{
> > +	rte_iova_t sw_rings_base_phy, next_64mb_align_addr_phy;
> 
> Same comment regarding phys vs. iova in this function.

yep general comment

> 
> > +	uint32_t next_64mb_align_offset;
> > +	rte_iova_t sw_ring_phys_end_addr;
> > +	void *base_addrs[ACC100_SW_RING_MEM_ALLOC_ATTEMPTS];
> > +	void *sw_rings_base;
> > +	int i = 0;
> > +	uint32_t q_sw_ring_size = ACC100_MAX_QUEUE_DEPTH *
> get_desc_len();
> > +	uint32_t dev_sw_ring_size = q_sw_ring_size * num_queues;
> > +
> > +	/* Find an aligned block of memory to store sw rings */
> > +	while (i < ACC100_SW_RING_MEM_ALLOC_ATTEMPTS) {
> > +		/*
> > +		 * sw_ring allocated memory is guaranteed to be aligned to
> > +		 * q_sw_ring_size at the condition that the requested size is
> > +		 * less than the page size
> > +		 */
> > +		sw_rings_base = rte_zmalloc_socket(
> > +				dev->device->driver->name,
> > +				dev_sw_ring_size, q_sw_ring_size, socket);
> > +
> > +		if (sw_rings_base == NULL) {
> > +			rte_bbdev_log(ERR,
> > +					"Failed to allocate memory for
> %s:%u",
> > +					dev->device->driver->name,
> > +					dev->data->dev_id);
> > +			break;
> > +		}
> > +
> > +		sw_rings_base_phy = rte_malloc_virt2iova(sw_rings_base);
> > +		next_64mb_align_offset = calc_mem_alignment_offset(
> > +				sw_rings_base, ACC100_SIZE_64MBYTE);
> > +		next_64mb_align_addr_phy = sw_rings_base_phy +
> > +				next_64mb_align_offset;
> > +		sw_ring_phys_end_addr = sw_rings_base_phy +
> dev_sw_ring_size;
> > +
> > +		/* Check if the end of the sw ring memory block is before the
> > +		 * start of next 64MB aligned mem address
> > +		 */
> > +		if (sw_ring_phys_end_addr < next_64mb_align_addr_phy) {
> > +			d->sw_rings_phys = sw_rings_base_phy;
> > +			d->sw_rings = sw_rings_base;
> > +			d->sw_rings_base = sw_rings_base;
> > +			d->sw_ring_size = q_sw_ring_size;
> > +			d->sw_ring_max_depth =
> ACC100_MAX_QUEUE_DEPTH;
> > +			break;
> > +		}
> > +		/* Store the address of the unaligned mem block */
> > +		base_addrs[i] = sw_rings_base;
> > +		i++;
> > +	}
> > +
> > +	/* Free all unaligned blocks of mem allocated in the loop */
> > +	free_base_addresses(base_addrs, i);
> > +}
> > +
> > +
> > +/* Allocate 64MB memory used for all software rings */ static int
> > +acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int
> > +socket_id) {
> > +	uint32_t phys_low, phys_high, payload;
> > +	struct acc100_device *d = dev->data->dev_private;
> > +	const struct acc100_registry_addr *reg_addr;
> > +
> > +	if (d->pf_device && !d->acc100_conf.pf_mode_en) {
> > +		rte_bbdev_log(NOTICE,
> > +				"%s has PF mode disabled. This PF can't be
> used.",
> > +				dev->data->name);
> > +		return -ENODEV;
> > +	}
> > +
> > +	alloc_sw_rings_min_mem(dev, d, num_queues, socket_id);
> > +
> > +	/* If minimal memory space approach failed, then allocate
> > +	 * the 2 * 64MB block for the sw rings
> > +	 */
> > +	if (d->sw_rings == NULL)
> > +		alloc_2x64mb_sw_rings_mem(dev, d, socket_id);
> > +
> > +	if (d->sw_rings == NULL) {
> > +		rte_bbdev_log(NOTICE,
> > +				"Failure allocating sw_rings memory");
> > +		return -ENODEV;
> > +	}
> > +
> > +	/* Configure ACC100 with the base address for DMA descriptor rings
> > +	 * Same descriptor rings used for UL and DL DMA Engines
> > +	 * Note : Assuming only VF0 bundle is used for PF mode
> > +	 */
> > +	phys_high = (uint32_t)(d->sw_rings_phys >> 32);
> > +	phys_low  = (uint32_t)(d->sw_rings_phys &
> ~(ACC100_SIZE_64MBYTE-1));
> > +
> > +	/* Choose correct registry addresses for the device type */
> > +	if (d->pf_device)
> > +		reg_addr = &pf_reg_addr;
> > +	else
> > +		reg_addr = &vf_reg_addr;
> > +
> > +	/* Read the populated cfg from ACC100 registers */
> > +	fetch_acc100_config(dev);
> > +
> > +	/* Release AXI from PF */
> > +	if (d->pf_device)
> > +		acc100_reg_write(d, HWPfDmaAxiControl, 1);
> > +
> > +	acc100_reg_write(d, reg_addr->dma_ring_ul5g_hi, phys_high);
> > +	acc100_reg_write(d, reg_addr->dma_ring_ul5g_lo, phys_low);
> > +	acc100_reg_write(d, reg_addr->dma_ring_dl5g_hi, phys_high);
> > +	acc100_reg_write(d, reg_addr->dma_ring_dl5g_lo, phys_low);
> > +	acc100_reg_write(d, reg_addr->dma_ring_ul4g_hi, phys_high);
> > +	acc100_reg_write(d, reg_addr->dma_ring_ul4g_lo, phys_low);
> > +	acc100_reg_write(d, reg_addr->dma_ring_dl4g_hi, phys_high);
> > +	acc100_reg_write(d, reg_addr->dma_ring_dl4g_lo, phys_low);
> > +
> > +	/*
> > +	 * Configure Ring Size to the max queue ring size
> > +	 * (used for wrapping purpose)
> > +	 */
> > +	payload = log2_basic(d->sw_ring_size / 64);
> > +	acc100_reg_write(d, reg_addr->ring_size, payload);
> > +
> > +	/* Configure tail pointer for use when SDONE enabled */
> > +	d->tail_ptrs = rte_zmalloc_socket(
> > +			dev->device->driver->name,
> > +			ACC100_NUM_QGRPS * ACC100_NUM_AQS *
> sizeof(uint32_t),
> > +			RTE_CACHE_LINE_SIZE, socket_id);
> > +	if (d->tail_ptrs == NULL) {
> > +		rte_bbdev_log(ERR, "Failed to allocate tail ptr for %s:%u",
> > +				dev->device->driver->name,
> > +				dev->data->dev_id);
> > +		rte_free(d->sw_rings);
> > +		return -ENOMEM;
> > +	}
> > +	d->tail_ptr_phys = rte_malloc_virt2iova(d->tail_ptrs);
> > +
> > +	phys_high = (uint32_t)(d->tail_ptr_phys >> 32);
> > +	phys_low  = (uint32_t)(d->tail_ptr_phys);
> > +	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_hi, phys_high);
> > +	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_lo, phys_low);
> > +	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_hi, phys_high);
> > +	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_lo, phys_low);
> > +	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_hi, phys_high);
> > +	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_lo, phys_low);
> > +	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
> > +	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
> > +
> > +	d->harq_layout = rte_zmalloc_socket("HARQ Layout",
> > +			ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
> > +			RTE_CACHE_LINE_SIZE, dev->data->socket_id);
> > +	if (d->harq_layout == NULL) {
> > +		rte_bbdev_log(ERR, "Failed to allocate harq_layout for
> %s:%u",
> > +				dev->device->driver->name,
> > +				dev->data->dev_id);
> > +		rte_free(d->sw_rings);
> > +		return -ENOMEM;
> > +	}
> > +
> > +	/* Mark as configured properly */
> > +	d->configured = true;
> > +
> > +	rte_bbdev_log_debug(
> > +			"ACC100 (%s) configured  sw_rings = %p,
> sw_rings_phys = %#"
> > +			PRIx64, dev->data->name, d->sw_rings, d-
> >sw_rings_phys);
> > +
> > +	return 0;
> > +}
> > +
> >  /* Free 64MB memory used for software rings */
> 
> Seems to be 2x64MB are allocated.

It may be 2x64MB in case the 64MB allocation failed. I can just remove the comment as it is not informative.

> 
> >  static int
> > -acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
> > +acc100_dev_close(struct rte_bbdev *dev)
> >  {
> > +	struct acc100_device *d = dev->data->dev_private;
> > +	if (d->sw_rings_base != NULL) {
> > +		rte_free(d->tail_ptrs);
> > +		rte_free(d->sw_rings_base);
> > +		d->sw_rings_base = NULL;
> > +	}
> > +	usleep(ACC100_LONG_WAIT);
> 
> This sleep looks weird, it would need a comment if it is really needed.

No real-time impact; this is just in case there are in-flight transactions still pending from the HW, to ensure we are quiesced once the device is considered closed.

> 
> > +	return 0;
> > +}
> > +
> > +
> > +/**
> > + * Report a ACC100 queue index which is free
> > + * Return 0 to 16k for a valid queue_idx or -1 when no queue is
> > +available
> > + * Note : Only supporting VF0 Bundle for PF mode  */ static int
> > +acc100_find_free_queue_idx(struct rte_bbdev *dev,
> > +		const struct rte_bbdev_queue_conf *conf) {
> > +	struct acc100_device *d = dev->data->dev_private;
> > +	int op_2_acc[5] = {0, UL_4G, DL_4G, UL_5G, DL_5G};
> > +	int acc = op_2_acc[conf->op_type];
> > +	struct rte_q_topology_t *qtop = NULL;
> 
> New line.

ok

> 
> > +	qtopFromAcc(&qtop, acc, &(d->acc100_conf));
> > +	if (qtop == NULL)
> > +		return -1;
> > +	/* Identify matching QGroup Index which are sorted in priority order
> */
> > +	uint16_t group_idx = qtop->first_qgroup_index;
> > +	group_idx += conf->priority;
> > +	if (group_idx >= ACC100_NUM_QGRPS ||
> > +			conf->priority >= qtop->num_qgroups) {
> > +		rte_bbdev_log(INFO, "Invalid Priority on %s, priority %u",
> > +				dev->data->name, conf->priority);
> > +		return -1;
> > +	}
> > +	/* Find a free AQ_idx  */
> > +	uint16_t aq_idx;
> > +	for (aq_idx = 0; aq_idx < qtop->num_aqs_per_groups; aq_idx++) {
> > +		if (((d->q_assigned_bit_map[group_idx] >> aq_idx) & 0x1) ==
> 0) {
> > +			/* Mark the Queue as assigned */
> > +			d->q_assigned_bit_map[group_idx] |= (1 << aq_idx);
> > +			/* Report the AQ Index */
> > +			return (group_idx << ACC100_GRP_ID_SHIFT) +
> aq_idx;
> > +		}
> > +	}
> > +	rte_bbdev_log(INFO, "Failed to find free queue on %s, priority %u",
> > +			dev->data->name, conf->priority);
> > +	return -1;
> > +}
> > +
> > +/* Setup ACC100 queue */
> > +static int
> > +acc100_queue_setup(struct rte_bbdev *dev, uint16_t queue_id,
> > +		const struct rte_bbdev_queue_conf *conf) {
> > +	struct acc100_device *d = dev->data->dev_private;
> > +	struct acc100_queue *q;
> > +	int16_t q_idx;
> > +
> > +	/* Allocate the queue data structure. */
> > +	q = rte_zmalloc_socket(dev->device->driver->name, sizeof(*q),
> > +			RTE_CACHE_LINE_SIZE, conf->socket);
> > +	if (q == NULL) {
> > +		rte_bbdev_log(ERR, "Failed to allocate queue memory");
> > +		return -ENOMEM;
> > +	}
> > +
> > +	q->d = d;
> > +	q->ring_addr = RTE_PTR_ADD(d->sw_rings, (d->sw_ring_size *
> > +queue_id));
> 
> You might want to ensure d is not NULL before dereferencing it.

doesn't hurt, ok

> 
> > +	q->ring_addr_phys = d->sw_rings_phys + (d->sw_ring_size *
> queue_id);
> > +
> > +	/* Prepare the Ring with default descriptor format */
> > +	union acc100_dma_desc *desc = NULL;
> > +	unsigned int desc_idx, b_idx;
> > +	int fcw_len = (conf->op_type == RTE_BBDEV_OP_LDPC_ENC ?
> > +		ACC100_FCW_LE_BLEN : (conf->op_type ==
> RTE_BBDEV_OP_TURBO_DEC ?
> > +		ACC100_FCW_TD_BLEN : ACC100_FCW_LD_BLEN));
> > +
> > +	for (desc_idx = 0; desc_idx < d->sw_ring_max_depth; desc_idx++) {
> > +		desc = q->ring_addr + desc_idx;
> > +		desc->req.word0 = ACC100_DMA_DESC_TYPE;
> > +		desc->req.word1 = 0; /**< Timestamp */
> > +		desc->req.word2 = 0;
> > +		desc->req.word3 = 0;
> > +		uint64_t fcw_offset = (desc_idx << 8) +
> ACC100_DESC_FCW_OFFSET;
> > +		desc->req.data_ptrs[0].address = q->ring_addr_phys +
> fcw_offset;
> > +		desc->req.data_ptrs[0].blen = fcw_len;
> > +		desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
> > +		desc->req.data_ptrs[0].last = 0;
> > +		desc->req.data_ptrs[0].dma_ext = 0;
> > +		for (b_idx = 1; b_idx < ACC100_DMA_MAX_NUM_POINTERS -
> 1;
> > +				b_idx++) {
> > +			desc->req.data_ptrs[b_idx].blkid =
> ACC100_DMA_BLKID_IN;
> > +			desc->req.data_ptrs[b_idx].last = 1;
> > +			desc->req.data_ptrs[b_idx].dma_ext = 0;
> > +			b_idx++;
> > +			desc->req.data_ptrs[b_idx].blkid =
> > +					ACC100_DMA_BLKID_OUT_ENC;
> > +			desc->req.data_ptrs[b_idx].last = 1;
> > +			desc->req.data_ptrs[b_idx].dma_ext = 0;
> > +		}
> > +		/* Preset some fields of LDPC FCW */
> > +		desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
> > +		desc->req.fcw_ld.gain_i = 1;
> > +		desc->req.fcw_ld.gain_h = 1;
> > +	}
> > +
> > +	q->lb_in = rte_zmalloc_socket(dev->device->driver->name,
> > +			RTE_CACHE_LINE_SIZE,
> > +			RTE_CACHE_LINE_SIZE, conf->socket);
> > +	if (q->lb_in == NULL) {
> > +		rte_bbdev_log(ERR, "Failed to allocate lb_in memory");
> > +		rte_free(q);
> > +		return -ENOMEM;
> > +	}
> > +	q->lb_in_addr_phys = rte_malloc_virt2iova(q->lb_in);
> > +	q->lb_out = rte_zmalloc_socket(dev->device->driver->name,
> > +			RTE_CACHE_LINE_SIZE,
> > +			RTE_CACHE_LINE_SIZE, conf->socket);
> > +	if (q->lb_out == NULL) {
> > +		rte_bbdev_log(ERR, "Failed to allocate lb_out memory");
> > +		rte_free(q->lb_in);
> > +		rte_free(q);
> > +		return -ENOMEM;
> > +	}
> > +	q->lb_out_addr_phys = rte_malloc_virt2iova(q->lb_out);
> > +
> > +	/*
> > +	 * Software queue ring wraps synchronously with the HW when it
> reaches
> > +	 * the boundary of the maximum allocated queue size, no matter
> what the
> > +	 * sw queue size is. This wrapping is guarded by setting the
> wrap_mask
> > +	 * to represent the maximum queue size as allocated at the time
> when
> > +	 * the device has been setup (in configure()).
> > +	 *
> > +	 * The queue depth is set to the queue size value (conf->queue_size).
> > +	 * This limits the occupancy of the queue at any point of time, so
> that
> > +	 * the queue does not get swamped with enqueue requests.
> > +	 */
> > +	q->sw_ring_depth = conf->queue_size;
> > +	q->sw_ring_wrap_mask = d->sw_ring_max_depth - 1;
> > +
> > +	q->op_type = conf->op_type;
> > +
> > +	q_idx = acc100_find_free_queue_idx(dev, conf);
> > +	if (q_idx == -1) {
> > +		rte_free(q->lb_in);
> > +		rte_free(q->lb_out);
> > +		rte_free(q);
> > +		return -1;
> > +	}
> > +
> > +	q->qgrp_id = (q_idx >> ACC100_GRP_ID_SHIFT) & 0xF;
> > +	q->vf_id = (q_idx >> ACC100_VF_ID_SHIFT)  & 0x3F;
> > +	q->aq_id = q_idx & 0xF;
> > +	q->aq_depth = (conf->op_type ==  RTE_BBDEV_OP_TURBO_DEC) ?
> > +			(1 << d->acc100_conf.q_ul_4g.aq_depth_log2) :
> > +			(1 << d->acc100_conf.q_dl_4g.aq_depth_log2);
> > +
> > +	q->mmio_reg_enqueue = RTE_PTR_ADD(d->mmio_base,
> > +			queue_offset(d->pf_device,
> > +					q->vf_id, q->qgrp_id, q->aq_id));
> > +
> > +	rte_bbdev_log_debug(
> > +			"Setup dev%u q%u: qgrp_id=%u, vf_id=%u,
> aq_id=%u, aq_depth=%u, mmio_reg_enqueue=%p",
> > +			dev->data->dev_id, queue_id, q->qgrp_id, q->vf_id,
> > +			q->aq_id, q->aq_depth, q->mmio_reg_enqueue);
> > +
> > +	dev->data->queues[queue_id].queue_private = q;
> > +	return 0;
> > +}
> > +
> > +/* Release ACC100 queue */
> > +static int
> > +acc100_queue_release(struct rte_bbdev *dev, uint16_t q_id) {
> > +	struct acc100_device *d = dev->data->dev_private;
> > +	struct acc100_queue *q = dev->data->queues[q_id].queue_private;
> > +
> > +	if (q != NULL) {
> > +		/* Mark the Queue as un-assigned */
> > +		d->q_assigned_bit_map[q->qgrp_id] &= (0xFFFFFFFF -
> > +				(1 << q->aq_id));
> > +		rte_free(q->lb_in);
> > +		rte_free(q->lb_out);
> > +		rte_free(q);
> > +		dev->data->queues[q_id].queue_private = NULL;
> > +	}
> > +
> >  	return 0;
> >  }
> >
> > @@ -262,8 +695,11 @@
> >  }
> >
> >  static const struct rte_bbdev_ops acc100_bbdev_ops = {
> > +	.setup_queues = acc100_setup_queues,
> >  	.close = acc100_dev_close,
> >  	.info_get = acc100_dev_info_get,
> > +	.queue_setup = acc100_queue_setup,
> > +	.queue_release = acc100_queue_release,
> >  };
> >
> >  /* ACC100 PCI PF address map */
> > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h
> > b/drivers/baseband/acc100/rte_acc100_pmd.h
> > index de015ca..2508385 100644
> > --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> > +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> > @@ -522,11 +522,56 @@ struct acc100_registry_addr {
> >  	.ddr_range = HWVfDmaDdrBaseRangeRoVf,  };
> >
> > +/* Structure associated with each queue. */ struct
> > +__rte_cache_aligned acc100_queue {
> > +	union acc100_dma_desc *ring_addr;  /* Virtual address of sw ring */
> > +	rte_iova_t ring_addr_phys;  /* Physical address of software ring */
> > +	uint32_t sw_ring_head;  /* software ring head */
> > +	uint32_t sw_ring_tail;  /* software ring tail */
> > +	/* software ring size (descriptors, not bytes) */
> > +	uint32_t sw_ring_depth;
> > +	/* mask used to wrap enqueued descriptors on the sw ring */
> > +	uint32_t sw_ring_wrap_mask;
> > +	/* MMIO register used to enqueue descriptors */
> > +	void *mmio_reg_enqueue;
> > +	uint8_t vf_id;  /* VF ID (max = 63) */
> > +	uint8_t qgrp_id;  /* Queue Group ID */
> > +	uint16_t aq_id;  /* Atomic Queue ID */
> > +	uint16_t aq_depth;  /* Depth of atomic queue */
> > +	uint32_t aq_enqueued;  /* Count how many "batches" have been
> enqueued */
> > +	uint32_t aq_dequeued;  /* Count how many "batches" have been
> dequeued */
> > +	uint32_t irq_enable;  /* Enable ops dequeue interrupts if set to 1 */
> > +	struct rte_mempool *fcw_mempool;  /* FCW mempool */
> > +	enum rte_bbdev_op_type op_type;  /* Type of this Queue: TE or TD
> */
> > +	/* Internal Buffers for loopback input */
> > +	uint8_t *lb_in;
> > +	uint8_t *lb_out;
> > +	rte_iova_t lb_in_addr_phys;
> > +	rte_iova_t lb_out_addr_phys;
> > +	struct acc100_device *d;
> > +};
> > +
> >  /* Private data structure for each ACC100 device */  struct
> > acc100_device {
> >  	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
> > +	void *sw_rings_base;  /* Base addr of un-aligned memory for sw
> rings */
> > +	void *sw_rings;  /* 64MBs of 64MB aligned memory for sw rings */
> > +	rte_iova_t sw_rings_phys;  /* Physical address of sw_rings */
> > +	/* Virtual address of the info memory routed to the this function
> under
> > +	 * operation, whether it is PF or VF.
> > +	 */
> > +	union acc100_harq_layout_data *harq_layout;
> > +	uint32_t sw_ring_size;
> >  	uint32_t ddr_size; /* Size in kB */
> > +	uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */
> > +	rte_iova_t tail_ptr_phys; /* Physical address of tail pointers */
> > +	/* Max number of entries available for each queue in device,
> depending
> > +	 * on how many queues are enabled with configure()
> > +	 */
> > +	uint32_t sw_ring_max_depth;
> >  	struct acc100_conf acc100_conf; /* ACC100 Initial configuration */
> > +	/* Bitmap capturing which Queues have already been assigned */
> > +	uint16_t q_assigned_bit_map[ACC100_NUM_QGRPS];
> >  	bool pf_device; /**< True if this is a PF ACC100 device */
> >  	bool configured; /**< True if this ACC100 device is configured */
> > };
> >



* Re: [dpdk-dev] [PATCH v10 10/10] baseband/acc100: add configure function
  2020-10-01 15:43           ` Maxime Coquelin
@ 2020-10-01 19:50             ` Chautru, Nicolas
  2020-10-01 21:44               ` Maxime Coquelin
  0 siblings, 1 reply; 213+ messages in thread
From: Chautru, Nicolas @ 2020-10-01 19:50 UTC (permalink / raw)
  To: Maxime Coquelin, dev, akhil.goyal
  Cc: Richardson, Bruce, Xu, Rosen, trix, Yigit, Ferruh, Liu, Tianjiao

Hi Maxime, 

> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> On 10/1/20 5:36 PM, Chautru, Nicolas wrote:
> > Hi Maxime,
> >> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> >> Hi Nicolas,
> >>
> >> On 10/1/20 5:14 AM, Nicolas Chautru wrote:
> >>> diff --git
> >>> a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> >>> b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> >>> index 4a76d1d..91c234d 100644
> >>> --- a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> >>> +++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> >>> @@ -1,3 +1,10 @@
> >>>  DPDK_21 {
> >>>  	local: *;
> >>>  };
> >>> +
> >>> +EXPERIMENTAL {
> >>> +	global:
> >>> +
> >>> +	acc100_configure;
> >>> +
> >>> +};
> >>> --
> >>
> >> Ideally we should not need to have device specific APIs, but at least
> >> it should be prefixed with "rte_".
> >
> > Currently this is already like that for other bbdev PMDs.
> > So I would tend to prefer consistency over all in that context.
> > You could argue or not whether this is PMD function or a companion
> exposed function, but again if this should change it should change for all
> PMDs to avoid discrepancies.
> > If really this is deemed required this can be pushed as an extra patch
> covering all PMD, but probably not for 20.11.
> > What do you think?
> 
> Better to fix the API now to avoid namespace pollution, including the other
> comments I made regarding API on patch 3.
> That's not a big change, it can be done in v20.11 in my opinion.

ok fair enough, thanks
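For reference, with the agreed "rte_" prefix the version map would look something like the following sketch (assuming the symbol becomes rte_acc100_configure; the exact name is up to the v11 respin):

```
DPDK_21 {
	local: *;
};

EXPERIMENTAL {
	global:

	rte_acc100_configure;

};
```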

> 
> Thanks,
> Maxime
> 
> >>
> >> Regards,
> >> Maxime
> >



* Re: [dpdk-dev] [PATCH v9 08/10] baseband/acc100: add interrupt support to PMD
  2020-10-01 16:05           ` Tom Rix
@ 2020-10-01 21:07             ` Chautru, Nicolas
  0 siblings, 0 replies; 213+ messages in thread
From: Chautru, Nicolas @ 2020-10-01 21:07 UTC (permalink / raw)
  To: Tom Rix, dev, akhil.goyal
  Cc: Richardson, Bruce, Xu, Rosen, dave.burley, aidan.goddard, Yigit,
	Ferruh, Liu, Tianjiao

Hi Tom, 

> From: Tom Rix <trix@redhat.com>
> On 9/30/20 12:45 PM, Chautru, Nicolas wrote:
> > Hi Tom,
> >
> >> From: Tom Rix <trix@redhat.com>
> >> On 9/28/20 5:29 PM, Nicolas Chautru wrote:
> >>> Adding capability and functions to support MSI interrupts,
> >>> callbacks and info ring.
> >>>
> >>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> >>> Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
> >>> ---
> >>>  drivers/baseband/acc100/rte_acc100_pmd.c | 288
> >>> ++++++++++++++++++++++++++++++-
> >>> drivers/baseband/acc100/rte_acc100_pmd.h |  15 ++
> >>>  2 files changed, 300 insertions(+), 3 deletions(-)
> >>>
> >>> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> >>> b/drivers/baseband/acc100/rte_acc100_pmd.c
> >>> index 7d4c3df..b6d9e7c 100644
> >>> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> >>> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> >>> @@ -339,6 +339,213 @@
> >>>  	free_base_addresses(base_addrs, i);  }
> >>>
> >>> +/*
> >>> + * Find queue_id of a device queue based on details from the Info Ring.
> >>> + * If a queue isn't found UINT16_MAX is returned.
> >>> + */
> >>> +static inline uint16_t
> >>> +get_queue_id_from_ring_info(struct rte_bbdev_data *data,
> >>> +		const union acc100_info_ring_data ring_data) {
> >>> +	uint16_t queue_id;
> >>> +
> >>> +	for (queue_id = 0; queue_id < data->num_queues; ++queue_id) {
> >>> +		struct acc100_queue *acc100_q =
> >>> +				data->queues[queue_id].queue_private;
> >>> +		if (acc100_q != NULL && acc100_q->aq_id == ring_data.aq_id
> >> &&
> >>> +				acc100_q->qgrp_id == ring_data.qg_id &&
> >>> +				acc100_q->vf_id == ring_data.vf_id)
> >>> +			return queue_id;
> >> If num_queues is large, this linear search will be slow.
> >>
> >> Consider changing the search algorithm.
> > This is not in the time critical part of the code
> ok
> >
> >
> >>> +	}
> >>> +
> >>> +	return UINT16_MAX;
> >> the interrupt handlers that use this function do not a great job of
> >> handling this error.
> > if that error actually happened then there is not much else that can be
> done except reporting the unexpected data.
> ok
> >
> >>> +}
> >>> +
> >>> +/* Checks PF Info Ring to find the interrupt cause and handles it
> >>> +accordingly */ static inline void acc100_check_ir(struct
> >>> +acc100_device *acc100_dev) {
> >>> +	volatile union acc100_info_ring_data *ring_data;
> >>> +	uint16_t info_ring_head = acc100_dev->info_ring_head;
> >>> +	if (acc100_dev->info_ring == NULL)
> >>> +		return;
> >>> +
> >>> +	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
> >>> +			ACC100_INFO_RING_MASK);
> >>> +
> >>> +	while (ring_data->valid) {
> >>> +		if ((ring_data->int_nb <
> >> ACC100_PF_INT_DMA_DL_DESC_IRQ) || (
> >>> +				ring_data->int_nb >
> >>> +				ACC100_PF_INT_DMA_DL5G_DESC_IRQ))
> >>> +			rte_bbdev_log(WARNING, "InfoRing: ITR:%d
> >> Info:0x%x",
> >>> +				ring_data->int_nb, ring_data-
> >>> detailed_info);
> >>> +		/* Initialize Info Ring entry and move forward */
> >>> +		ring_data->val = 0;
> >>> +		info_ring_head++;
> >>> +		ring_data = acc100_dev->info_ring +
> >>> +				(info_ring_head &
> >> ACC100_INFO_RING_MASK);
> >> These three statements are common for the ring handling, consider a
> >> macro or inline function.
> > ok
> >
> >>> +	}
> >>> +}
> >>> +
> >>> +/* Checks PF Info Ring to find the interrupt cause and handles it
> >>> +accordingly */ static inline void
> >>> +acc100_pf_interrupt_handler(struct
> >>> +rte_bbdev *dev) {
> >>> +	struct acc100_device *acc100_dev = dev->data->dev_private;
> >>> +	volatile union acc100_info_ring_data *ring_data;
> >>> +	struct acc100_deq_intr_details deq_intr_det;
> >>> +
> >>> +	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
> >>> +			ACC100_INFO_RING_MASK);
> >>> +
> >>> +	while (ring_data->valid) {
> >>> +
> >>> +		rte_bbdev_log_debug(
> >>> +				"ACC100 PF Interrupt received, Info Ring
> >> data: 0x%x",
> >>> +				ring_data->val);
> >>> +
> >>> +		switch (ring_data->int_nb) {
> >>> +		case ACC100_PF_INT_DMA_DL_DESC_IRQ:
> >>> +		case ACC100_PF_INT_DMA_UL_DESC_IRQ:
> >>> +		case ACC100_PF_INT_DMA_UL5G_DESC_IRQ:
> >>> +		case ACC100_PF_INT_DMA_DL5G_DESC_IRQ:
> >>> +			deq_intr_det.queue_id =
> >> get_queue_id_from_ring_info(
> >>> +					dev->data, *ring_data);
> >>> +			if (deq_intr_det.queue_id == UINT16_MAX) {
> >>> +				rte_bbdev_log(ERR,
> >>> +						"Couldn't find queue: aq_id:
> >> %u, qg_id: %u, vf_id: %u",
> >>> +						ring_data->aq_id,
> >>> +						ring_data->qg_id,
> >>> +						ring_data->vf_id);
> >>> +				return;
> >>> +			}
> >>> +			rte_bbdev_pmd_callback_process(dev,
> >>> +					RTE_BBDEV_EVENT_DEQUEUE,
> >> &deq_intr_det);
> >>> +			break;
> >>> +		default:
> >>> +			rte_bbdev_pmd_callback_process(dev,
> >>> +					RTE_BBDEV_EVENT_ERROR, NULL);
> >>> +			break;
> >>> +		}
> >>> +
> >>> +		/* Initialize Info Ring entry and move forward */
> >>> +		ring_data->val = 0;
> >>> +		++acc100_dev->info_ring_head;
> >>> +		ring_data = acc100_dev->info_ring +
> >>> +				(acc100_dev->info_ring_head &
> >>> +				ACC100_INFO_RING_MASK);
> >>> +	}
> >>> +}
> >>> +
> >>> +/* Checks VF Info Ring to find the interrupt cause and handles it
> >>> +accordingly */ static inline void
> >>> +acc100_vf_interrupt_handler(struct
> >>> +rte_bbdev *dev)
> >> very similar to pf case, consider combining.
> >>> +{
> >>> +	struct acc100_device *acc100_dev = dev->data->dev_private;
> >>> +	volatile union acc100_info_ring_data *ring_data;
> >>> +	struct acc100_deq_intr_details deq_intr_det;
> >>> +
> >>> +	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
> >>> +			ACC100_INFO_RING_MASK);
> >>> +
> >>> +	while (ring_data->valid) {
> >>> +
> >>> +		rte_bbdev_log_debug(
> >>> +				"ACC100 VF Interrupt received, Info Ring
> >> data: 0x%x",
> >>> +				ring_data->val);
> >>> +
> >>> +		switch (ring_data->int_nb) {
> >>> +		case ACC100_VF_INT_DMA_DL_DESC_IRQ:
> >>> +		case ACC100_VF_INT_DMA_UL_DESC_IRQ:
> >>> +		case ACC100_VF_INT_DMA_UL5G_DESC_IRQ:
> >>> +		case ACC100_VF_INT_DMA_DL5G_DESC_IRQ:
> >>> +			/* VFs are not aware of their vf_id - it's set to 0 in
> >>> +			 * queue structures.
> >>> +			 */
> >>> +			ring_data->vf_id = 0;
> >>> +			deq_intr_det.queue_id =
> >> get_queue_id_from_ring_info(
> >>> +					dev->data, *ring_data);
> >>> +			if (deq_intr_det.queue_id == UINT16_MAX) {
> >>> +				rte_bbdev_log(ERR,
> >>> +						"Couldn't find queue: aq_id:
> >> %u, qg_id: %u",
> >>> +						ring_data->aq_id,
> >>> +						ring_data->qg_id);
> >>> +				return;
> >>> +			}
> >>> +			rte_bbdev_pmd_callback_process(dev,
> >>> +					RTE_BBDEV_EVENT_DEQUEUE,
> >> &deq_intr_det);
> >>> +			break;
> >>> +		default:
> >>> +			rte_bbdev_pmd_callback_process(dev,
> >>> +					RTE_BBDEV_EVENT_ERROR, NULL);
> >>> +			break;
> >>> +		}
> >>> +
> >>> +		/* Initialize Info Ring entry and move forward */
> >>> +		ring_data->valid = 0;
> >>> +		++acc100_dev->info_ring_head;
> >>> +		ring_data = acc100_dev->info_ring + (acc100_dev-
> >>> info_ring_head
> >>> +				& ACC100_INFO_RING_MASK);
> >>> +	}
> >>> +}
> >>> +
> >>> +/* Interrupt handler triggered by ACC100 dev for handling specific
> >>> +interrupt */ static void acc100_dev_interrupt_handler(void *cb_arg) {
> >>> +	struct rte_bbdev *dev = cb_arg;
> >>> +	struct acc100_device *acc100_dev = dev->data->dev_private;
> >>> +
> >>> +	/* Read info ring */
> >>> +	if (acc100_dev->pf_device)
> >>> +		acc100_pf_interrupt_handler(dev);
> >> combined like ..
> >>
> >> acc100_interrupt_handler(dev, is_pf)
> > unsure it will help readability. Much of the code would still be
> > distinct
> ok
> >
> >>> +	else
> >>> +		acc100_vf_interrupt_handler(dev); }
> >>> +
> >>> +/* Allocate and setup inforing */
> >>> +static int
> >>> +allocate_inforing(struct rte_bbdev *dev)
> >> consider renaming
> >>
> >> allocate_info_ring
> > ok
> >
> >>> +{
> >>> +	struct acc100_device *d = dev->data->dev_private;
> >>> +	const struct acc100_registry_addr *reg_addr;
> >>> +	rte_iova_t info_ring_phys;
> >>> +	uint32_t phys_low, phys_high;
> >>> +
> >>> +	if (d->info_ring != NULL)
> >>> +		return 0; /* Already configured */
> >>> +
> >>> +	/* Choose correct registry addresses for the device type */
> >>> +	if (d->pf_device)
> >>> +		reg_addr = &pf_reg_addr;
> >>> +	else
> >>> +		reg_addr = &vf_reg_addr;
> >>> +	/* Allocate InfoRing */
> >>> +	d->info_ring = rte_zmalloc_socket("Info Ring",
> >>> +			ACC100_INFO_RING_NUM_ENTRIES *
> >>> +			sizeof(*d->info_ring), RTE_CACHE_LINE_SIZE,
> >>> +			dev->data->socket_id);
> >>> +	if (d->info_ring == NULL) {
> >>> +		rte_bbdev_log(ERR,
> >>> +				"Failed to allocate Info Ring for %s:%u",
> >>> +				dev->device->driver->name,
> >>> +				dev->data->dev_id);
> >> The callers do not check that this fails.
> > arguably the error would be self contained if that did fail. But doesn't hurt
> to add, ok.
> >
> >>> +		return -ENOMEM;
> >>> +	}
> >>> +	info_ring_phys = rte_malloc_virt2iova(d->info_ring);
> >>> +
> >>> +	/* Setup Info Ring */
> >>> +	phys_high = (uint32_t)(info_ring_phys >> 32);
> >>> +	phys_low  = (uint32_t)(info_ring_phys);
> >>> +	acc100_reg_write(d, reg_addr->info_ring_hi, phys_high);
> >>> +	acc100_reg_write(d, reg_addr->info_ring_lo, phys_low);
> >>> +	acc100_reg_write(d, reg_addr->info_ring_en, ACC100_REG_IRQ_EN_ALL);
> >>> +	d->info_ring_head = (acc100_reg_read(d, reg_addr->info_ring_ptr) &
> >>> +			0xFFF) / sizeof(union acc100_info_ring_data);
> >>> +	return 0;
> >>> +}
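The head computation above recovers the software index from the low 12 bits of the hardware ring-pointer register, which hold a byte offset into the ring. A minimal sketch, assuming a 4-byte entry like the 32-bit `union acc100_info_ring_data` (the mask and field names below are illustrative):

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 4-byte info-ring entry mirroring the driver's 32-bit union. */
union info_ring_data {
	uint32_t val;
	struct {
		uint16_t detailed_info;
		uint16_t int_nb;
	};
};

/* Derive the entry index from the hardware ring pointer register:
 * low 12 bits are a byte offset, so divide by the entry size. */
static uint16_t info_ring_head_from_reg(uint32_t ring_ptr_reg)
{
	return (ring_ptr_reg & 0xFFF) / sizeof(union info_ring_data);
}
```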
> >>> +
> >>> +
> >>>  /* Allocate 64MB memory used for all software rings */
> >>>  static int
> >>>  acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
> >>> @@ -426,6 +633,7 @@
> >>>  	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
> >>>  	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
> >>>
> >>> +	allocate_inforing(dev);
> >> need to check here
> >>>  	d->harq_layout = rte_zmalloc_socket("HARQ Layout",
> >>>  			ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
> >>>  			RTE_CACHE_LINE_SIZE, dev->data->socket_id);
> >>> @@ -437,13 +645,53 @@
> >>>  	return 0;
> >>>  }
> >>>
> >>> +static int
> >>> +acc100_intr_enable(struct rte_bbdev *dev)
> >>> +{
> >>> +	int ret;
> >>> +	struct acc100_device *d = dev->data->dev_private;
> >>> +
> >>> +	/* Only MSI are currently supported */
> >>> +	if (dev->intr_handle->type == RTE_INTR_HANDLE_VFIO_MSI ||
> >>> +			dev->intr_handle->type == RTE_INTR_HANDLE_UIO) {
> >>> +
> >>> +		allocate_inforing(dev);
> >> need to check here
> >>> +
> >>> +		ret = rte_intr_enable(dev->intr_handle);
> >>> +		if (ret < 0) {
> >>> +			rte_bbdev_log(ERR,
> >>> +					"Couldn't enable interrupts for device: %s",
> >>> +					dev->data->name);
> >>> +			rte_free(d->info_ring);
> >>> +			return ret;
> >>> +		}
> >>> +		ret = rte_intr_callback_register(dev->intr_handle,
> >>> +				acc100_dev_interrupt_handler, dev);
> >>> +		if (ret < 0) {
> >>> +			rte_bbdev_log(ERR,
> >>> +					"Couldn't register interrupt callback for device: %s",
> >>> +					dev->data->name);
> >>> +			rte_free(d->info_ring);
> >> does intr need to be disabled here ?
> > Well I don't see a lot of consistency with other drivers. Sometimes these
> are not even check for failure.
> > I would rather defer changing through other future patch if required as this
> is same code on other bbdev drivers already used (if changed I would rather
> all changed the same way).
> 
> ok.
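The unwind order Tom is asking about can be shown generically; the names below are illustrative stubs, not the rte_interrupts API, and the failure is simulated:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Illustrative error path: if callback registration fails after
 * interrupts were enabled, disable them again before freeing the
 * info ring, so nothing fires against freed memory. */
static int intr_enabled;

static int fake_intr_enable(void)   { intr_enabled = 1; return 0; }
static void fake_intr_disable(void) { intr_enabled = 0; }
static int fake_cb_register(void)   { return -1; /* simulated failure */ }

static int setup_interrupts(void **info_ring)
{
	*info_ring = malloc(64); /* stands in for the info ring alloc */
	if (*info_ring == NULL)
		return -1;
	if (fake_intr_enable() < 0)
		goto free_ring;
	if (fake_cb_register() < 0)
		goto disable_intr;
	return 0;

disable_intr:
	fake_intr_disable(); /* undo the enable before freeing */
free_ring:
	free(*info_ring);
	*info_ring = NULL;
	return -1;
}
```

This is only a sketch of the cleanup ordering; as discussed, aligning all bbdev drivers on it is deferred to a later patch.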
> 
> 
> >
> >>> +			return ret;
> >>> +		}
> >>> +
> >>> +		return 0;
> >>> +	}
> >>> +
> >>> +	rte_bbdev_log(ERR, "ACC100 (%s) supports only VFIO MSI interrupts",
> >>> +			dev->data->name);
> >>> +	return -ENOTSUP;
> >>> +}
> >>> +
> >>>  /* Free 64MB memory used for software rings */
> >>>  static int
> >>>  acc100_dev_close(struct rte_bbdev *dev)
> >>>  {
> >>>  	struct acc100_device *d = dev->data->dev_private;
> >>> +	acc100_check_ir(d);
> >>>  	if (d->sw_rings_base != NULL) {
> >>>  		rte_free(d->tail_ptrs);
> >>> +		rte_free(d->info_ring);
> >>>  		rte_free(d->sw_rings_base);
> >>>  		d->sw_rings_base = NULL;
> >>>  	}
> >>> @@ -643,6 +891,7 @@
> >>>  					RTE_BBDEV_TURBO_CRC_TYPE_24B |
> >>>  					RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
> >>>  					RTE_BBDEV_TURBO_EARLY_TERMINATION |
> >>> +					RTE_BBDEV_TURBO_DEC_INTERRUPTS |
> >>>  					RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
> >>>  					RTE_BBDEV_TURBO_MAP_DEC |
> >>>  					RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP |
> >>> @@ -663,6 +912,7 @@
> >>>  					RTE_BBDEV_TURBO_CRC_24B_ATTACH |
> >>>  					RTE_BBDEV_TURBO_RV_INDEX_BYPASS |
> >>>  					RTE_BBDEV_TURBO_RATE_MATCH |
> >>> +					RTE_BBDEV_TURBO_ENC_INTERRUPTS |
> >>>  					RTE_BBDEV_TURBO_ENC_SCATTER_GATHER,
> >>>  				.num_buffers_src =
> >>>  					RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
> >>> @@ -676,7 +926,8 @@
> >>>  				.capability_flags =
> >>>  					RTE_BBDEV_LDPC_RATE_MATCH |
> >>>  					RTE_BBDEV_LDPC_CRC_24B_ATTACH |
> >>> -					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
> >>> +					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS |
> >>> +					RTE_BBDEV_LDPC_ENC_INTERRUPTS,
> >>>  				.num_buffers_src =
> >>>  					RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
> >>>  				.num_buffers_dst =
> >>> @@ -701,7 +952,8 @@
> >>>  				RTE_BBDEV_LDPC_DECODE_BYPASS |
> >>>  				RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
> >>>  				RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
> >>> -				RTE_BBDEV_LDPC_LLR_COMPRESSION,
> >>> +				RTE_BBDEV_LDPC_LLR_COMPRESSION |
> >>> +				RTE_BBDEV_LDPC_DEC_INTERRUPTS,
> >>>  			.llr_size = 8,
> >>>  			.llr_decimals = 1,
> >>>  			.num_buffers_src =
> >>> @@ -751,14 +1003,39 @@
> >>>  #else
> >>>  	dev_info->harq_buffer_size = 0;
> >>>  #endif
> >>> +	acc100_check_ir(d);
> >>> +}
> >>> +
> >>> +static int
> >>> +acc100_queue_intr_enable(struct rte_bbdev *dev, uint16_t queue_id)
> >>> +{
> >>> +	struct acc100_queue *q = dev->data->queues[queue_id].queue_private;
> >>> +
> >>> +	if (dev->intr_handle->type != RTE_INTR_HANDLE_VFIO_MSI &&
> >>> +			dev->intr_handle->type != RTE_INTR_HANDLE_UIO)
> >>> +		return -ENOTSUP;
> >>> +
> >>> +	q->irq_enable = 1;
> >>> +	return 0;
> >>> +}
> >>> +
> >>> +static int
> >>> +acc100_queue_intr_disable(struct rte_bbdev *dev, uint16_t queue_id)
> >>> +{
> >>> +	struct acc100_queue *q = dev->data->queues[queue_id].queue_private;
> >>> +	q->irq_enable = 0;
> >> A -ENOTSUP above, should need similar check here.
> > How can this fail when we purely disable?
> 
> It is for api consistency.
> 
> the enable fails
> 
> the disable succeeds
> 
> that is not consistent.
> 

OK can do. Thanks
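The symmetry Tom asks for can be captured in one helper that both ops share; the enum and struct below are illustrative stand-ins for the DPDK interrupt-handle types, not the bbdev API itself:

```c
#include <assert.h>
#include <errno.h>

/* Illustrative stand-ins for RTE_INTR_HANDLE_VFIO_MSI / _UIO. */
enum intr_type { INTR_VFIO_MSI, INTR_UIO, INTR_OTHER };

struct fake_queue { int irq_enable; };

/* Both enable and disable go through the same type check, so the
 * API rejects unsupported handle types consistently. */
static int queue_intr_set(enum intr_type t, struct fake_queue *q, int on)
{
	if (t != INTR_VFIO_MSI && t != INTR_UIO)
		return -ENOTSUP;
	q->irq_enable = on;
	return 0;
}
```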


> Tom
> 
> >
> >>> +	return 0;
> >>>  }
> >>>
> >>>  static const struct rte_bbdev_ops acc100_bbdev_ops = {
> >>>  	.setup_queues = acc100_setup_queues,
> >>> +	.intr_enable = acc100_intr_enable,
> >>>  	.close = acc100_dev_close,
> >>>  	.info_get = acc100_dev_info_get,
> >>>  	.queue_setup = acc100_queue_setup,
> >>>  	.queue_release = acc100_queue_release,
> >>> +	.queue_intr_enable = acc100_queue_intr_enable,
> >>> +	.queue_intr_disable = acc100_queue_intr_disable
> >>>  };
> >>>
> >>>  /* ACC100 PCI PF address map */
> >>> @@ -3018,8 +3295,10 @@
> >>>  			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
> >>>  	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> >>>  	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
> >>> -	if (op->status != 0)
> >>> +	if (op->status != 0) {
> >>>  		q_data->queue_stats.dequeue_err_count++;
> >>> +		acc100_check_ir(q->d);
> >>> +	}
> >>>
> >>>  	/* CRC invalid if error exists */
> >>>  	if (!op->status)
> >>> @@ -3076,6 +3355,9 @@
> >>>  		op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
> >>>  	op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
> >>>
> >>> +	if (op->status & (1 << RTE_BBDEV_DRV_ERROR))
> >>> +		acc100_check_ir(q->d);
> >>> +
> >>>  	/* Check if this is the last desc in batch (Atomic Queue) */
> >>>  	if (desc->req.last_desc_in_batch) {
> >>>  		(*aq_dequeued)++;
> >>> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h
> >>> b/drivers/baseband/acc100/rte_acc100_pmd.h
> >>> index 78686c1..8980fa5 100644
> >>> --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> >>> +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> >>> @@ -559,7 +559,14 @@ struct acc100_device {
> >>>  	/* Virtual address of the info memory routed to the this function
> >> under
> >>>  	 * operation, whether it is PF or VF.
> >>>  	 */
> >>> +	union acc100_info_ring_data *info_ring;
> >> Need a comment that this array needs a sentinel ?
> > Can clarify a bit expected HW behaviour
> >
> > Thanks
> >
> >> Tom
> >>
> >>> +
> >>>  	union acc100_harq_layout_data *harq_layout;
> >>> +	/* Virtual Info Ring head */
> >>> +	uint16_t info_ring_head;
> >>> +	/* Number of bytes available for each queue in device, depending
> >> on
> >>> +	 * how many queues are enabled with configure()
> >>> +	 */
> >>>  	uint32_t sw_ring_size;
> >>>  	uint32_t ddr_size; /* Size in kB */
> >>>  	uint32_t *tail_ptrs; /* Base address of response tail pointer
> >>> buffer */ @@ -575,4 +582,12 @@ struct acc100_device {
> >>>  	bool configured; /**< True if this ACC100 device is configured */
> >>> };
> >>>
> >>> +/**
> >>> + * Structure with details about RTE_BBDEV_EVENT_DEQUEUE event. It's passed to
> >>> + * the callback function.
> >>> + */
> >>> +struct acc100_deq_intr_details {
> >>> +	uint16_t queue_id;
> >>> +};
> >>> +
> >>>  #endif /* _RTE_ACC100_PMD_H_ */


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v9 10/10] baseband/acc100: add configure function
  2020-10-01 16:18           ` Tom Rix
@ 2020-10-01 21:11             ` Chautru, Nicolas
  0 siblings, 0 replies; 213+ messages in thread
From: Chautru, Nicolas @ 2020-10-01 21:11 UTC (permalink / raw)
  To: Tom Rix, dev, akhil.goyal
  Cc: Richardson, Bruce, Xu, Rosen, dave.burley, aidan.goddard, Yigit,
	Ferruh, Liu, Tianjiao

Hi Tom, 

> From: Tom Rix <trix@redhat.com>
> On 9/30/20 3:54 PM, Chautru, Nicolas wrote:
> > Hi Tom,
> >
> >> From: Tom Rix <trix@redhat.com>
> >> On 9/28/20 5:29 PM, Nicolas Chautru wrote:
> >>> Add configure function to configure the PF from within the
> >>> bbdev-test itself without external application configuration the device.
> >>>
> >>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> >>> Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
> >>> ---
> >>>  app/test-bbdev/test_bbdev_perf.c                   |  72 +++
> >>>  doc/guides/rel_notes/release_20_11.rst             |   5 +
> >>>  drivers/baseband/acc100/meson.build                |   2 +
> >>>  drivers/baseband/acc100/rte_acc100_cfg.h           |  17 +
> >>>  drivers/baseband/acc100/rte_acc100_pmd.c           | 505
> >> +++++++++++++++++++++
> >>>  .../acc100/rte_pmd_bbdev_acc100_version.map        |   7 +
> >>>  6 files changed, 608 insertions(+)
> >>>
> >>> diff --git a/app/test-bbdev/test_bbdev_perf.c
> >>> b/app/test-bbdev/test_bbdev_perf.c
> >>> index 45c0d62..32f23ff 100644
> >>> --- a/app/test-bbdev/test_bbdev_perf.c
> >>> +++ b/app/test-bbdev/test_bbdev_perf.c
> >>> @@ -52,6 +52,18 @@
> >>>  #define FLR_5G_TIMEOUT 610
> >>>  #endif
> >>>
> >>> +#ifdef RTE_LIBRTE_PMD_BBDEV_ACC100
> >>> +#include <rte_acc100_cfg.h>
> >>> +#define ACC100PF_DRIVER_NAME   ("intel_acc100_pf")
> >>> +#define ACC100VF_DRIVER_NAME   ("intel_acc100_vf")
> >>> +#define ACC100_QMGR_NUM_AQS 16
> >>> +#define ACC100_QMGR_NUM_QGS 2
> >>> +#define ACC100_QMGR_AQ_DEPTH 5
> >>> +#define ACC100_QMGR_INVALID_IDX -1
> >>> +#define ACC100_QMGR_RR 1
> >>> +#define ACC100_QOS_GBR 0
> >>> +#endif
> >>> +
> >>>  #define OPS_CACHE_SIZE 256U
> >>>  #define OPS_POOL_SIZE_MIN 511U /* 0.5K per queue */
> >>>
> >>> @@ -653,6 +665,66 @@ typedef int (test_case_function)(struct
> >> active_device *ad,
> >>>  				info->dev_name);
> >>>  	}
> >>>  #endif
> >>> +#ifdef RTE_LIBRTE_PMD_BBDEV_ACC100
> >> seems like this function would break if one of the other bbdev's were
> >> #defined.
> > No these are independent. By default they are all defined.
> ok
> >
> >
> >>> +	if ((get_init_device() == true) &&
> >>> +		(!strcmp(info->drv.driver_name,
> >> ACC100PF_DRIVER_NAME))) {
> >>> +		struct acc100_conf conf;
> >>> +		unsigned int i;
> >>> +
> >>> +		printf("Configure ACC100 FEC Driver %s with default values\n",
> >>> +				info->drv.driver_name);
> >>> +
> >>> +		/* clear default configuration before initialization */
> >>> +		memset(&conf, 0, sizeof(struct acc100_conf));
> >>> +
> >>> +		/* Always set in PF mode for built-in configuration */
> >>> +		conf.pf_mode_en = true;
> >>> +		for (i = 0; i < RTE_ACC100_NUM_VFS; ++i) {
> >>> +			conf.arb_dl_4g[i].gbr_threshold1 = ACC100_QOS_GBR;
> >>> +			conf.arb_dl_4g[i].gbr_threshold1 = ACC100_QOS_GBR;
> >>> +			conf.arb_dl_4g[i].round_robin_weight = ACC100_QMGR_RR;
> >>> +			conf.arb_ul_4g[i].gbr_threshold1 = ACC100_QOS_GBR;
> >>> +			conf.arb_ul_4g[i].gbr_threshold1 = ACC100_QOS_GBR;
> >>> +			conf.arb_ul_4g[i].round_robin_weight = ACC100_QMGR_RR;
> >>> +			conf.arb_dl_5g[i].gbr_threshold1 = ACC100_QOS_GBR;
> >>> +			conf.arb_dl_5g[i].gbr_threshold1 = ACC100_QOS_GBR;
> >>> +			conf.arb_dl_5g[i].round_robin_weight = ACC100_QMGR_RR;
> >>> +			conf.arb_ul_5g[i].gbr_threshold1 = ACC100_QOS_GBR;
> >>> +			conf.arb_ul_5g[i].gbr_threshold1 = ACC100_QOS_GBR;
> >>> +			conf.arb_ul_5g[i].round_robin_weight = ACC100_QMGR_RR;
> >>> +		}
> >>> +
> >>> +		conf.input_pos_llr_1_bit = true;
> >>> +		conf.output_pos_llr_1_bit = true;
> >>> +		conf.num_vf_bundles = 1; /**< Number of VF bundles to setup */
> >>> +
> >>> +		conf.q_ul_4g.num_qgroups = ACC100_QMGR_NUM_QGS;
> >>> +		conf.q_ul_4g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
> >>> +		conf.q_ul_4g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
> >>> +		conf.q_ul_4g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
> >>> +		conf.q_dl_4g.num_qgroups = ACC100_QMGR_NUM_QGS;
> >>> +		conf.q_dl_4g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
> >>> +		conf.q_dl_4g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
> >>> +		conf.q_dl_4g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
> >>> +		conf.q_ul_5g.num_qgroups = ACC100_QMGR_NUM_QGS;
> >>> +		conf.q_ul_5g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
> >>> +		conf.q_ul_5g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
> >>> +		conf.q_ul_5g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
> >>> +		conf.q_dl_5g.num_qgroups = ACC100_QMGR_NUM_QGS;
> >>> +		conf.q_dl_5g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
> >>> +		conf.q_dl_5g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
> >>> +		conf.q_dl_5g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
> >>> +
> >>> +		/* setup PF with configuration information */
> >>> +		ret = acc100_configure(info->dev_name, &conf);
> >>> +		TEST_ASSERT_SUCCESS(ret,
> >>> +				"Failed to configure ACC100 PF for bbdev %s",
> >>> +				info->dev_name);
> >>> +		/* Let's refresh this now this is configured */
> >>> +	}
> >>> +	rte_bbdev_info_get(dev_id, info);
> >> The other bbdev's do not call rte_bbdev_info_get, can this be removed ?
> > Actually it should be added outside for all versions
> > post-configuraion. Thanks
> >
> >>> +#endif
> >>> +
> >>>  	nb_queues = RTE_MIN(rte_lcore_count(), info->drv.max_num_queues);
> >>>  	nb_queues = RTE_MIN(nb_queues, (unsigned int) MAX_QUEUES);
> >>>
> >>> diff --git a/doc/guides/rel_notes/release_20_11.rst
> >>> b/doc/guides/rel_notes/release_20_11.rst
> >>> index 73ac08f..c8d0586 100644
> >>> --- a/doc/guides/rel_notes/release_20_11.rst
> >>> +++ b/doc/guides/rel_notes/release_20_11.rst
> >>> @@ -55,6 +55,11 @@ New Features
> >>>       Also, make sure to start the actual text at the margin.
> >>>       =======================================================
> >>>
> >>> +* **Added Intel ACC100 bbdev PMD.**
> >>> +
> >>> +  Added a new ``acc100`` bbdev driver for the Intel\ |reg| ACC100
> >>> + accelerator  also known as Mount Bryce.  See the
> >>> + :doc:`../bbdevs/acc100` BBDEV guide for more details on this new
> driver.
> >>>
> >>>  Removed Items
> >>>  -------------
> >>> diff --git a/drivers/baseband/acc100/meson.build
> >>> b/drivers/baseband/acc100/meson.build
> >>> index 8afafc2..7ac44dc 100644
> >>> --- a/drivers/baseband/acc100/meson.build
> >>> +++ b/drivers/baseband/acc100/meson.build
> >>> @@ -4,3 +4,5 @@
> >>>  deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci']
> >>>
> >>>  sources = files('rte_acc100_pmd.c')
> >>> +
> >>> +install_headers('rte_acc100_cfg.h')
> >>> diff --git a/drivers/baseband/acc100/rte_acc100_cfg.h
> >>> b/drivers/baseband/acc100/rte_acc100_cfg.h
> >>> index 73bbe36..7f523bc 100644
> >>> --- a/drivers/baseband/acc100/rte_acc100_cfg.h
> >>> +++ b/drivers/baseband/acc100/rte_acc100_cfg.h
> >>> @@ -89,6 +89,23 @@ struct acc100_conf {
> >>>  	struct rte_arbitration_t arb_dl_5g[RTE_ACC100_NUM_VFS];
> >>>  };
> >>>
> >>> +/**
> >>> + * Configure a ACC100 device
> >>> + *
> >>> + * @param dev_name
> >>> + *   The name of the device. This is the short form of PCI BDF, e.g. 00:01.0.
> >>> + *   It can also be retrieved for a bbdev device from the dev_name field in the
> >>> + *   rte_bbdev_info structure returned by rte_bbdev_info_get().
> >>> + * @param conf
> >>> + *   Configuration to apply to ACC100 HW.
> >>> + *
> >>> + * @return
> >>> + *   Zero on success, negative value on failure.
> >>> + */
> >>> +__rte_experimental
> >>> +int
> >>> +acc100_configure(const char *dev_name, struct acc100_conf *conf);
> >>> +
> >>>  #ifdef __cplusplus
> >>>  }
> >>>  #endif
> >>> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> >>> b/drivers/baseband/acc100/rte_acc100_pmd.c
> >>> index 3589814..b50dd32 100644
> >>> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> >>> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> >>> @@ -85,6 +85,26 @@
> >>>
> >>>  enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC};
> >>>
> >>> +/* Return the accelerator enum for a Queue Group Index */
> >>> +static inline int
> >>> +accFromQgid(int qg_idx, const struct acc100_conf *acc100_conf)
> >>> +{
> >>> +	int accQg[ACC100_NUM_QGRPS];
> >>> +	int NumQGroupsPerFn[NUM_ACC];
> >>> +	int acc, qgIdx, qgIndex = 0;
> >>> +	for (qgIdx = 0; qgIdx < ACC100_NUM_QGRPS; qgIdx++)
> >>> +		accQg[qgIdx] = 0;
> >>> +	NumQGroupsPerFn[UL_4G] = acc100_conf->q_ul_4g.num_qgroups;
> >>> +	NumQGroupsPerFn[UL_5G] = acc100_conf->q_ul_5g.num_qgroups;
> >>> +	NumQGroupsPerFn[DL_4G] = acc100_conf->q_dl_4g.num_qgroups;
> >>> +	NumQGroupsPerFn[DL_5G] = acc100_conf->q_dl_5g.num_qgroups;
> >>> +	for (acc = UL_4G;  acc < NUM_ACC; acc++)
> >>> +		for (qgIdx = 0; qgIdx < NumQGroupsPerFn[acc]; qgIdx++)
> >>> +			accQg[qgIndex++] = acc;
> >> This looks inefficient, is there a way this could be calculated
> >> without filling arrays to
> >>
> >> access 1 value ?
> > That is not time critical, and the same common code is run each time.
> ok
> >
> >>> +	acc = accQg[qg_idx];
> >>> +	return acc;
> >>> +}
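Even though it is not time critical, the lookup above can be done without filling the two arrays: walk the per-accelerator group counts and return the accelerator whose range contains `qg_idx`. A sketch with an explicit counts array instead of `struct acc100_conf` (the counts used in the test are illustrative):

```c
#include <assert.h>

enum { UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC };

/* Array-free variant of accFromQgid(): accumulate the number of
 * queue groups per accelerator and stop at the range holding qg_idx. */
static int acc_from_qgid(int qg_idx, const int num_qgroups[NUM_ACC])
{
	int acc, first = 0;

	for (acc = UL_4G; acc < NUM_ACC; acc++) {
		if (qg_idx < first + num_qgroups[acc])
			return acc;
		first += num_qgroups[acc];
	}
	return -1; /* qg_idx beyond all configured groups */
}
```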
> >>> +
> >>>  /* Return the queue topology for a Queue Group Index */
> >>>  static inline void
> >>>  qtopFromAcc(struct rte_q_topology_t **qtop, int acc_enum,
> >>> @@ -113,6 +133,30 @@
> >>>  	*qtop = p_qtop;
> >>>  }
> >>>
> >>> +/* Return the AQ depth for a Queue Group Index */
> >>> +static inline int
> >>> +aqDepth(int qg_idx, struct acc100_conf *acc100_conf)
> >>> +{
> >>> +	struct rte_q_topology_t *q_top = NULL;
> >>> +	int acc_enum = accFromQgid(qg_idx, acc100_conf);
> >>> +	qtopFromAcc(&q_top, acc_enum, acc100_conf);
> >>> +	if (unlikely(q_top == NULL))
> >>> +		return 0;
> >> This error is not handled well be the callers.
> >>
> >> aqNum is similar.
> > This fails in a consistent basis, by having not queue available and handling
> this as the default case.
> ok
> >
> >>> +	return q_top->aq_depth_log2;
> >>> +}
> >>> +
> >>> +/* Return the number of AQs for a Queue Group Index */
> >>> +static inline int
> >>> +aqNum(int qg_idx, struct acc100_conf *acc100_conf)
> >>> +{
> >>> +	struct rte_q_topology_t *q_top = NULL;
> >>> +	int acc_enum = accFromQgid(qg_idx, acc100_conf);
> >>> +	qtopFromAcc(&q_top, acc_enum, acc100_conf);
> >>> +	if (unlikely(q_top == NULL))
> >>> +		return 0;
> >>> +	return q_top->num_aqs_per_groups;
> >>> +}
> >>> +
> >>>  static void
> >>>  initQTop(struct acc100_conf *acc100_conf)
> >>>  {
> >>> @@ -4177,3 +4221,464 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
> >>>  RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
> >>>  RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
> >>>  RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
> >>> +
> >>> +/*
> >>> + * Implementation to fix the power on status of some 5GUL engines
> >>> + * This requires DMA permission if ported outside DPDK
> >> This sounds like a workaround, can more detail be added here ?
> > There are comments through the code I believe:
> >   - /* Detect engines in undefined state */
> >   - /* Force each engine which is in unspecified state */
> >   - /* Reset LDPC Cores */
> >   - /* Check engine power-on status again */
> > Do you believe this is not explicit enough? Power-on status may be in an
> > undefined state, hence these engines are activated with a dummy payload to
> > make sure they are in a predictable state once configuration is done.
> 
> Yes, not explicit enough. They do not say it is a workaround, so someone
> else would not know that this is needed or that it will likely need
> adjusting in the future. Maybe change
> 
> /* Check engine power-on status again */ to
> 
> /*
>  * Power-on status may be in an undefined state.
>  * Activate this engine with a dummy payload to make sure the state is
>  * defined.
>  */
> 

OK I will add a bit more in comments. Thanks


> Tom
> 
> >>> + */
> >>> +static void
> >>> +poweron_cleanup(struct rte_bbdev *bbdev, struct acc100_device *d,
> >>> +		struct acc100_conf *conf)
> >>> +{
> >>> +	int i, template_idx, qg_idx;
> >>> +	uint32_t address, status, payload;
> >>> +	printf("Need to clear power-on 5GUL status in internal memory\n");
> >>> +	/* Reset LDPC Cores */
> >>> +	for (i = 0; i < ACC100_ENGINES_MAX; i++)
> >>> +		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
> >>> +				ACC100_ENGINE_OFFSET * i, ACC100_RESET_HI);
> >>> +	usleep(LONG_WAIT);
> >>> +	for (i = 0; i < ACC100_ENGINES_MAX; i++)
> >>> +		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
> >>> +				ACC100_ENGINE_OFFSET * i, ACC100_RESET_LO);
> >>> +	usleep(LONG_WAIT);
> >>> +	/* Prepare dummy workload */
> >>> +	alloc_2x64mb_sw_rings_mem(bbdev, d, 0);
> >>> +	/* Set base addresses */
> >>> +	uint32_t phys_high = (uint32_t)(d->sw_rings_phys >> 32);
> >>> +	uint32_t phys_low  = (uint32_t)(d->sw_rings_phys &
> >>> +			~(ACC100_SIZE_64MBYTE-1));
> >>> +	acc100_reg_write(d, HWPfDmaFec5GulDescBaseHiRegVf, phys_high);
> >>> +	acc100_reg_write(d, HWPfDmaFec5GulDescBaseLoRegVf, phys_low);
> >>> +
> >>> +	/* Descriptor for a dummy 5GUL code block processing*/
> >>> +	union acc100_dma_desc *desc = NULL;
> >>> +	desc = d->sw_rings;
> >>> +	desc->req.data_ptrs[0].address = d->sw_rings_phys +
> >>> +			ACC100_DESC_FCW_OFFSET;
> >>> +	desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
> >>> +	desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
> >>> +	desc->req.data_ptrs[0].last = 0;
> >>> +	desc->req.data_ptrs[0].dma_ext = 0;
> >>> +	desc->req.data_ptrs[1].address = d->sw_rings_phys + 512;
> >>> +	desc->req.data_ptrs[1].blkid = ACC100_DMA_BLKID_IN;
> >>> +	desc->req.data_ptrs[1].last = 1;
> >>> +	desc->req.data_ptrs[1].dma_ext = 0;
> >>> +	desc->req.data_ptrs[1].blen = 44;
> >>> +	desc->req.data_ptrs[2].address = d->sw_rings_phys + 1024;
> >>> +	desc->req.data_ptrs[2].blkid = ACC100_DMA_BLKID_OUT_ENC;
> >>> +	desc->req.data_ptrs[2].last = 1;
> >>> +	desc->req.data_ptrs[2].dma_ext = 0;
> >>> +	desc->req.data_ptrs[2].blen = 5;
> >>> +	/* Dummy FCW */
> >>> +	desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
> >>> +	desc->req.fcw_ld.qm = 1;
> >>> +	desc->req.fcw_ld.nfiller = 30;
> >>> +	desc->req.fcw_ld.BG = 2 - 1;
> >>> +	desc->req.fcw_ld.Zc = 7;
> >>> +	desc->req.fcw_ld.ncb = 350;
> >>> +	desc->req.fcw_ld.rm_e = 4;
> >>> +	desc->req.fcw_ld.itmax = 10;
> >>> +	desc->req.fcw_ld.gain_i = 1;
> >>> +	desc->req.fcw_ld.gain_h = 1;
> >>> +
> >>> +	int engines_to_restart[SIG_UL_5G_LAST + 1] = {0};
> >>> +	int num_failed_engine = 0;
> >>> +	/* Detect engines in undefined state */
> >>> +	for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
> >>> +			template_idx++) {
> >>> +		/* Check engine power-on status */
> >>> +		address = HwPfFecUl5gIbDebugReg +
> >>> +				ACC100_ENGINE_OFFSET * template_idx;
> >>> +		status = (acc100_reg_read(d, address) >> 4) & 0xF;
> >>> +		if (status == 0) {
> >>> +			engines_to_restart[num_failed_engine] = template_idx;
> >>> +			num_failed_engine++;
> >>> +		}
> >>> +	}
> >>> +
> >>> +	int numQqsAcc = conf->q_ul_5g.num_qgroups;
> >>> +	int numQgs = conf->q_ul_5g.num_qgroups;
> >>> +	payload = 0;
> >>> +	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
> >>> +		payload |= (1 << qg_idx);
> >>> +	/* Force each engine which is in unspecified state */
> >>> +	for (i = 0; i < num_failed_engine; i++) {
> >>> +		int failed_engine = engines_to_restart[i];
> >>> +		printf("Force engine %d\n", failed_engine);
> >>> +		for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
> >>> +				template_idx++) {
> >>> +			address = HWPfQmgrGrpTmplateReg4Indx
> >>> +					+ BYTES_IN_WORD * template_idx;
> >>> +			if (template_idx == failed_engine)
> >>> +				acc100_reg_write(d, address, payload);
> >>> +			else
> >>> +				acc100_reg_write(d, address, 0);
> >>> +		}
> >>> +		/* Reset descriptor header */
> >>> +		desc->req.word0 = ACC100_DMA_DESC_TYPE;
> >>> +		desc->req.word1 = 0;
> >>> +		desc->req.word2 = 0;
> >>> +		desc->req.word3 = 0;
> >>> +		desc->req.numCBs = 1;
> >>> +		desc->req.m2dlen = 2;
> >>> +		desc->req.d2mlen = 1;
> >>> +		/* Enqueue the code block for processing */
> >>> +		union acc100_enqueue_reg_fmt enq_req;
> >>> +		enq_req.val = 0;
> >>> +		enq_req.addr_offset = ACC100_DESC_OFFSET;
> >>> +		enq_req.num_elem = 1;
> >>> +		enq_req.req_elem_addr = 0;
> >>> +		rte_wmb();
> >>> +		acc100_reg_write(d, HWPfQmgrIngressAq + 0x100, enq_req.val);
> >>> +		usleep(LONG_WAIT * 100);
> >>> +		if (desc->req.word0 != 2)
> >>> +			printf("DMA Response %#"PRIx32"\n", desc->req.word0);
> >>> +	}
> >>> +
> >>> +	/* Reset LDPC Cores */
> >>> +	for (i = 0; i < ACC100_ENGINES_MAX; i++)
> >>> +		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
> >>> +				ACC100_ENGINE_OFFSET * i, ACC100_RESET_HI);
> >>> +	usleep(LONG_WAIT);
> >>> +	for (i = 0; i < ACC100_ENGINES_MAX; i++)
> >>> +		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
> >>> +				ACC100_ENGINE_OFFSET * i, ACC100_RESET_LO);
> >>> +	usleep(LONG_WAIT);
> >>> +	acc100_reg_write(d, HWPfHi5GHardResetReg, ACC100_RESET_HARD);
> >>> +	usleep(LONG_WAIT);
> >>> +	int numEngines = 0;
> >>> +	/* Check engine power-on status again */
> >>> +	for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
> >>> +			template_idx++) {
> >>> +		address = HwPfFecUl5gIbDebugReg +
> >>> +				ACC100_ENGINE_OFFSET * template_idx;
> >>> +		status = (acc100_reg_read(d, address) >> 4) & 0xF;
> >>> +		address = HWPfQmgrGrpTmplateReg4Indx
> >>> +				+ BYTES_IN_WORD * template_idx;
> >>> +		if (status == 1) {
> >>> +			acc100_reg_write(d, address, payload);
> >>> +			numEngines++;
> >>> +		} else
> >>> +			acc100_reg_write(d, address, 0);
> >>> +	}
> >>> +	printf("Number of 5GUL engines %d\n", numEngines);
> >>> +
> >>> +	if (d->sw_rings_base != NULL)
> >>> +		rte_free(d->sw_rings_base);
> >>> +	usleep(LONG_WAIT);
> >>> +}
> >>> +
> >>> +/* Initial configuration of an ACC100 device prior to running configure() */
> >>> +int
> >>> +acc100_configure(const char *dev_name, struct acc100_conf *conf)
> >>> +{
> >>> +	rte_bbdev_log(INFO, "acc100_configure");
> >>> +	uint32_t payload, address, status;
> >> maybe value or data would be a better variable name than payload.
> >>
> >> would mean changing acc100_reg_write
> > transparent to me, but can change given DPDK uses term value.
> >
> >
> >>> +	int qg_idx, template_idx, vf_idx, acc, i;
> >>> +	struct rte_bbdev *bbdev = rte_bbdev_get_named_dev(dev_name);
> >>> +
> >>> +	/* Compile time checks */
> >>> +	RTE_BUILD_BUG_ON(sizeof(struct acc100_dma_req_desc) != 256);
> >>> +	RTE_BUILD_BUG_ON(sizeof(union acc100_dma_desc) != 256);
> >>> +	RTE_BUILD_BUG_ON(sizeof(struct acc100_fcw_td) != 24);
> >>> +	RTE_BUILD_BUG_ON(sizeof(struct acc100_fcw_te) != 32);
> >>> +
> >>> +	if (bbdev == NULL) {
> >>> +		rte_bbdev_log(ERR,
> >>> +		"Invalid dev_name (%s), or device is not yet initialised",
> >>> +		dev_name);
> >>> +		return -ENODEV;
> >>> +	}
> >>> +	struct acc100_device *d = bbdev->data->dev_private;
> >>> +
> >>> +	/* Store configuration */
> >>> +	rte_memcpy(&d->acc100_conf, conf, sizeof(d->acc100_conf));
> >>> +
> >>> +	/* PCIe Bridge configuration */
> >>> +	acc100_reg_write(d, HwPfPcieGpexBridgeControl, ACC100_CFG_PCI_BRIDGE);
> >>> +	for (i = 1; i < 17; i++)
> >> 17 is a magic number, use a #define
> >>
> >> this is a general issue.
> > These are only used once but still agreed.
> >
> >>> +		acc100_reg_write(d,
> >>> +			HwPfPcieGpexAxiAddrMappingWindowPexBaseHigh
> >>> +				+ i * 16, 0);
> >>> +
> >>> +	/* PCIe Link Training and Status State Machine */
> >>> +	acc100_reg_write(d, HwPfPcieGpexLtssmStateCntrl, 0xDFC00000);
> >>> +
> >>> +	/* Prevent blocking AXI read on BRESP for AXI Write */
> >>> +	address = HwPfPcieGpexAxiPioControl;
> >>> +	payload = ACC100_CFG_PCI_AXI;
> >>> +	acc100_reg_write(d, address, payload);
> >>> +
> >>> +	/* 5GDL PLL phase shift */
> >>> +	acc100_reg_write(d, HWPfChaDl5gPllPhshft0, 0x1);
> >>> +
> >>> +	/* Explicitly releasing AXI as this may be stopped after PF FLR/BME */
> >>> +	address = HWPfDmaAxiControl;
> >>> +	payload = 1;
> >>> +	acc100_reg_write(d, address, payload);
> >>> +
> >>> +	/* DDR Configuration */
> >>> +	address = HWPfDdrBcTim6;
> >>> +	payload = acc100_reg_read(d, address);
> >>> +	payload &= 0xFFFFFFFB; /* Bit 2 */
> >>> +#ifdef ACC100_DDR_ECC_ENABLE
> >>> +	payload |= 0x4;
> >>> +#endif
> >>> +	acc100_reg_write(d, address, payload);
> >>> +	address = HWPfDdrPhyDqsCountNum;
> >>> +#ifdef ACC100_DDR_ECC_ENABLE
> >>> +	payload = 9;
> >>> +#else
> >>> +	payload = 8;
> >>> +#endif
> >>> +	acc100_reg_write(d, address, payload);
> >>> +
> >>> +	/* Set default descriptor signature */
> >>> +	address = HWPfDmaDescriptorSignatuture;
> >>> +	payload = 0;
> >>> +	acc100_reg_write(d, address, payload);
> >>> +
> >>> +	/* Enable the Error Detection in DMA */
> >>> +	payload = ACC100_CFG_DMA_ERROR;
> >>> +	address = HWPfDmaErrorDetectionEn;
> >>> +	acc100_reg_write(d, address, payload);
> >>> +
> >>> +	/* AXI Cache configuration */
> >>> +	payload = ACC100_CFG_AXI_CACHE;
> >>> +	address = HWPfDmaAxcacheReg;
> >>> +	acc100_reg_write(d, address, payload);
> >>> +
> >>> +	/* Default DMA Configuration (Qmgr Enabled) */
> >>> +	address = HWPfDmaConfig0Reg;
> >>> +	payload = 0;
> >>> +	acc100_reg_write(d, address, payload);
> >>> +	address = HWPfDmaQmanen;
> >>> +	payload = 0;
> >>> +	acc100_reg_write(d, address, payload);
> >>> +
> >>> +	/* Default RLIM/ALEN configuration */
> >>> +	address = HWPfDmaConfig1Reg;
> >>> +	payload = (1 << 31) + (23 << 8) + (1 << 6) + 7;
> >>> +	acc100_reg_write(d, address, payload);
> >>> +
> >>> +	/* Configure DMA Qmanager addresses */
> >>> +	address = HWPfDmaQmgrAddrReg;
> >>> +	payload = HWPfQmgrEgressQueuesTemplate;
> >>> +	acc100_reg_write(d, address, payload);
> >>> +
> >>> +	/* ===== Qmgr Configuration ===== */
> >>> +	/* Configuration of the AQueue Depth QMGR_GRP_0_DEPTH_LOG2 for UL */
> >>> +	int totalQgs = conf->q_ul_4g.num_qgroups +
> >>> +			conf->q_ul_5g.num_qgroups +
> >>> +			conf->q_dl_4g.num_qgroups +
> >>> +			conf->q_dl_5g.num_qgroups;
> >>> +	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
> >>> +		address = HWPfQmgrDepthLog2Grp +
> >>> +		BYTES_IN_WORD * qg_idx;
> >>> +		payload = aqDepth(qg_idx, conf);
> >>> +		acc100_reg_write(d, address, payload);
> >>> +		address = HWPfQmgrTholdGrp +
> >>> +		BYTES_IN_WORD * qg_idx;
> >>> +		payload = (1 << 16) + (1 << (aqDepth(qg_idx, conf) - 1));
> >>> +		acc100_reg_write(d, address, payload);
> >>> +	}
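
For reference, the per-group threshold write packs what appears to be an enable bit (bit 16 — an assumption from the constant) together with a threshold of half the atomic queue depth. A minimal sketch of that packing:

```c
#include <assert.h>
#include <stdint.h>

/* Mirror of the payload computed in the loop above:
 * (1 << 16) + (1 << (aqDepth - 1)). Bit 16 as an "enable" bit is an
 * assumption; the low bits hold half the AQ depth (depth is log2).
 */
static uint32_t qmgr_thold_payload(int aq_depth_log2)
{
	return (1u << 16) + (1u << (aq_depth_log2 - 1));
}
```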
> >>> +
> >>> +	/* Template Priority in incremental order */
> >>> +	for (template_idx = 0; template_idx < ACC100_NUM_TMPL;
> >>> +			template_idx++) {
> >>> +		address = HWPfQmgrGrpTmplateReg0Indx +
> >>> +		BYTES_IN_WORD * (template_idx % 8);
> >>> +		payload = TMPL_PRI_0;
> >>> +		acc100_reg_write(d, address, payload);
> >>> +		address = HWPfQmgrGrpTmplateReg1Indx +
> >>> +		BYTES_IN_WORD * (template_idx % 8);
> >>> +		payload = TMPL_PRI_1;
> >>> +		acc100_reg_write(d, address, payload);
> >>> +		address = HWPfQmgrGrpTmplateReg2indx +
> >>> +		BYTES_IN_WORD * (template_idx % 8);
> >>> +		payload = TMPL_PRI_2;
> >>> +		acc100_reg_write(d, address, payload);
> >>> +		address = HWPfQmgrGrpTmplateReg3Indx +
> >>> +		BYTES_IN_WORD * (template_idx % 8);
> >>> +		payload = TMPL_PRI_3;
> >>> +		acc100_reg_write(d, address, payload);
> >>> +	}
> >>> +
> >>> +	address = HWPfQmgrGrpPriority;
> >>> +	payload = ACC100_CFG_QMGR_HI_P;
> >>> +	acc100_reg_write(d, address, payload);
> >>> +
> >>> +	/* Template Configuration */
> >>> +	for (template_idx = 0; template_idx < ACC100_NUM_TMPL;
> >>> +			template_idx++) {
> >>> +		payload = 0;
> >>> +		address = HWPfQmgrGrpTmplateReg4Indx
> >>> +				+ BYTES_IN_WORD * template_idx;
> >>> +		acc100_reg_write(d, address, payload);
> >>> +	}
> >>> +	/* 4GUL */
> >>> +	int numQgs = conf->q_ul_4g.num_qgroups;
> >>> +	int numQqsAcc = 0;
> >>> +	payload = 0;
> >>> +	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc);
> >>> +			qg_idx++)
> >>> +		payload |= (1 << qg_idx);
> >>> +	for (template_idx = SIG_UL_4G; template_idx <= SIG_UL_4G_LAST;
> >>> +			template_idx++) {
> >>> +		address = HWPfQmgrGrpTmplateReg4Indx
> >>> +				+ BYTES_IN_WORD*template_idx;
> >>> +		acc100_reg_write(d, address, payload);
> >>> +	}
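
The 4G/5G sections here all build the same kind of payload: a contiguous bitmask enabling the queue groups assigned to one accelerator. A standalone sketch of that mask construction:

```c
#include <assert.h>
#include <stdint.h>

/* Build the qgroup enable mask written to HWPfQmgrGrpTmplateReg4Indx:
 * one bit per queue group in [first_qg, first_qg + num_qg).
 */
static uint32_t qgrp_bitmask(int first_qg, int num_qg)
{
	uint32_t payload = 0;
	int qg_idx;

	for (qg_idx = first_qg; qg_idx < first_qg + num_qg; qg_idx++)
		payload |= (1u << qg_idx);
	return payload;
}
```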
> >>> +	/* 5GUL */
> >>> +	numQqsAcc += numQgs;
> >>> +	numQgs	= conf->q_ul_5g.num_qgroups;
> >>> +	payload = 0;
> >>> +	int numEngines = 0;
> >>> +	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc);
> >>> +			qg_idx++)
> >>> +		payload |= (1 << qg_idx);
> >>> +	for (template_idx = SIG_UL_5G; template_idx <= SIG_UL_5G_LAST;
> >>> +			template_idx++) {
> >>> +		/* Check engine power-on status */
> >>> +		address = HwPfFecUl5gIbDebugReg +
> >>> +				ACC100_ENGINE_OFFSET * template_idx;
> >>> +		status = (acc100_reg_read(d, address) >> 4) & 0xF;
> >>> +		address = HWPfQmgrGrpTmplateReg4Indx
> >>> +				+ BYTES_IN_WORD * template_idx;
> >>> +		if (status == 1) {
> >>> +			acc100_reg_write(d, address, payload);
> >>> +			numEngines++;
> >>> +		} else
> >>> +			acc100_reg_write(d, address, 0);
> >>> +		#if RTE_ACC100_SINGLE_FEC == 1
> >> #if should be at start of line
> > ok
> >
> >>> +		payload = 0;
> >>> +		#endif
> >>> +	}
> >>> +	printf("Number of 5GUL engines %d\n", numEngines);
> >>> +	/* 4GDL */
> >>> +	numQqsAcc += numQgs;
> >>> +	numQgs	= conf->q_dl_4g.num_qgroups;
> >>> +	payload = 0;
> >>> +	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc);
> >>> +			qg_idx++)
> >>> +		payload |= (1 << qg_idx);
> >>> +	for (template_idx = SIG_DL_4G; template_idx <= SIG_DL_4G_LAST;
> >>> +			template_idx++) {
> >>> +		address = HWPfQmgrGrpTmplateReg4Indx
> >>> +				+ BYTES_IN_WORD*template_idx;
> >>> +		acc100_reg_write(d, address, payload);
> >>> +		#if RTE_ACC100_SINGLE_FEC == 1
> >>> +			payload = 0;
> >>> +		#endif
> >>> +	}
> >>> +	/* 5GDL */
> >>> +	numQqsAcc += numQgs;
> >>> +	numQgs	= conf->q_dl_5g.num_qgroups;
> >>> +	payload = 0;
> >>> +	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc);
> >>> +			qg_idx++)
> >>> +		payload |= (1 << qg_idx);
> >>> +	for (template_idx = SIG_DL_5G; template_idx <= SIG_DL_5G_LAST;
> >>> +			template_idx++) {
> >>> +		address = HWPfQmgrGrpTmplateReg4Indx
> >>> +				+ BYTES_IN_WORD*template_idx;
> >>> +		acc100_reg_write(d, address, payload);
> >>> +		#if RTE_ACC100_SINGLE_FEC == 1
> >>> +		payload = 0;
> >>> +		#endif
> >>> +	}
> >>> +
> >>> +	/* Queue Group Function mapping */
> >>> +	int qman_func_id[5] = {0, 2, 1, 3, 4};
> >>> +	address = HWPfQmgrGrpFunction0;
> >>> +	payload = 0;
> >>> +	for (qg_idx = 0; qg_idx < 8; qg_idx++) {
> >>> +		acc = accFromQgid(qg_idx, conf);
> >>> +		payload |= qman_func_id[acc]<<(qg_idx * 4);
> >>> +	}
> >>> +	acc100_reg_write(d, address, payload);
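
The function-mapping write above packs one 4-bit function id per queue group into a single 32-bit register; sketched out separately, with the same qman_func_id table:

```c
#include <assert.h>
#include <stdint.h>

/* Pack a 4-bit qman function id per queue group (8 groups per
 * register), mirroring the HWPfQmgrGrpFunction0 loop above.
 */
static uint32_t qgrp_func_payload(const int acc_of_qgrp[8],
		const int qman_func_id[5])
{
	uint32_t payload = 0;
	int qg_idx;

	for (qg_idx = 0; qg_idx < 8; qg_idx++)
		payload |= (uint32_t)qman_func_id[acc_of_qgrp[qg_idx]]
				<< (qg_idx * 4);
	return payload;
}
```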
> >>> +
> >>> +	/* Configuration of the Arbitration QGroup depth to 1 */
> >>> +	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
> >>> +		address = HWPfQmgrArbQDepthGrp +
> >>> +		BYTES_IN_WORD * qg_idx;
> >>> +		payload = 0;
> >>> +		acc100_reg_write(d, address, payload);
> >>> +	}
> >>> +
> >>> +	/* Enabling AQueues through the Queue hierarchy */
> >>> +	for (vf_idx = 0; vf_idx < ACC100_NUM_VFS; vf_idx++) {
> >>> +		for (qg_idx = 0; qg_idx < ACC100_NUM_QGRPS; qg_idx++) {
> >>> +			payload = 0;
> >>> +			if (vf_idx < conf->num_vf_bundles &&
> >>> +					qg_idx < totalQgs)
> >>> +				payload = (1 << aqNum(qg_idx, conf)) - 1;
> >>> +			address = HWPfQmgrAqEnableVf
> >>> +					+ vf_idx * BYTES_IN_WORD;
> >>> +			payload += (qg_idx << 16);
> >>> +			acc100_reg_write(d, address, payload);
> >>> +		}
> >>> +	}
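
For clarity, the AQ-enable payload written in the loop above combines a low-half mask of enabled atomic queues with the queue-group index in the upper half:

```c
#include <assert.h>
#include <stdint.h>

/* Mirror of the HWPfQmgrAqEnableVf payload above:
 * low bits = (1 << aq_num) - 1 (enable aq_num atomic queues),
 * bits 16+ = the queue group index being programmed.
 */
static uint32_t aq_enable_payload(int qg_idx, int aq_num)
{
	return ((1u << aq_num) - 1) + ((uint32_t)qg_idx << 16);
}
```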
> >>> +
> >>> +	/* This pointer to ARAM (256kB) is shifted by 2 (4B per register) */
> >>> +	uint32_t aram_address = 0;
> >>> +	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
> >>> +		for (vf_idx = 0; vf_idx < conf->num_vf_bundles; vf_idx++) {
> >>> +			address = HWPfQmgrVfBaseAddr + vf_idx
> >>> +					* BYTES_IN_WORD + qg_idx
> >>> +					* BYTES_IN_WORD * 64;
> >>> +			payload = aram_address;
> >>> +			acc100_reg_write(d, address, payload);
> >>> +			/* Offset ARAM Address for next memory bank
> >>> +			 * - increment of 4B
> >>> +			 */
> >>> +			aram_address += aqNum(qg_idx, conf) *
> >>> +					(1 << aqDepth(qg_idx, conf));
> >>> +		}
> >>> +	}
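
The ARAM accounting above grows by aqNum * 2^aqDepth words per (qgroup, VF) pair; with uniform settings (a simplifying assumption for illustration) the total can be sketched as:

```c
#include <assert.h>
#include <stdint.h>

#define WORDS_IN_ARAM_SIZE (256 * 1024 / 4) /* 256 kB ARAM, 4 B per word */

/* Total ARAM words consumed when every queue group uses the same
 * aqNum and aqDepth; must stay within WORDS_IN_ARAM_SIZE.
 */
static uint32_t aram_words_needed(int num_qgroups, int num_vfs,
		int aq_num, int aq_depth_log2)
{
	return (uint32_t)num_qgroups * num_vfs * aq_num *
			(1u << aq_depth_log2);
}
```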
> >>> +
> >>> +	if (aram_address > WORDS_IN_ARAM_SIZE) {
> >>> +		rte_bbdev_log(ERR, "ARAM Configuration not fitting %d %d\n",
> >>> +				aram_address, WORDS_IN_ARAM_SIZE);
> >>> +		return -EINVAL;
> >>> +	}
> >>> +
> >>> +	/* ==== HI Configuration ==== */
> >>> +
> >>> +	/* Prevent Block on Transmit Error */
> >>> +	address = HWPfHiBlockTransmitOnErrorEn;
> >>> +	payload = 0;
> >>> +	acc100_reg_write(d, address, payload);
> >>> +	/* Prevent MSI from being dropped */
> >>> +	address = HWPfHiMsiDropEnableReg;
> >>> +	payload = 0;
> >>> +	acc100_reg_write(d, address, payload);
> >>> +	/* Set the PF Mode register */
> >>> +	address = HWPfHiPfMode;
> >>> +	payload = (conf->pf_mode_en) ? 2 : 0;
> >>> +	acc100_reg_write(d, address, payload);
> >>> +	/* Enable Error Detection in HW */
> >>> +	address = HWPfDmaErrorDetectionEn;
> >>> +	payload = 0x3D7;
> >>> +	acc100_reg_write(d, address, payload);
> >>> +
> >>> +	/* QoS overflow init */
> >>> +	payload = 1;
> >>> +	address = HWPfQosmonAEvalOverflow0;
> >>> +	acc100_reg_write(d, address, payload);
> >>> +	address = HWPfQosmonBEvalOverflow0;
> >>> +	acc100_reg_write(d, address, payload);
> >>> +
> >>> +	/* HARQ DDR Configuration */
> >>> +	unsigned int ddrSizeInMb = 512; /* Fixed to 512 MB per VF for now */
> >>> +	for (vf_idx = 0; vf_idx < conf->num_vf_bundles; vf_idx++) {
> >>> +		address = HWPfDmaVfDdrBaseRw + vf_idx * 0x10;
> >>> +		payload = ((vf_idx * (ddrSizeInMb / 64)) << 16) +
> >>> +				(ddrSizeInMb - 1);
> >>> +		acc100_reg_write(d, address, payload);
> >>> +	}
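
Each per-VF write above packs the VF's DDR base block (in 64 MB units) into the upper half-word and the region size minus one into the lower half; as a standalone sketch:

```c
#include <assert.h>
#include <stdint.h>

/* Mirror of the HWPfDmaVfDdrBaseRw payload above: the base is the
 * VF index times the region size in 64 MB blocks, and the low
 * half-word is (size_in_mb - 1).
 */
static uint32_t harq_ddr_payload(unsigned int vf_idx,
		unsigned int ddr_size_mb)
{
	return ((vf_idx * (ddr_size_mb / 64)) << 16) + (ddr_size_mb - 1);
}
```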
> >>> +	usleep(LONG_WAIT);
>> Is sleep needed here? The reg_write has one.
> > This one is needed on top
> >
> >>> +
> >> Since this seems like a workaround, add a comment here.
> > fair enough, ok, thanks
> >
> >> Tom
> >>
> >>> +	if (numEngines < (SIG_UL_5G_LAST + 1))
> >>> +		poweron_cleanup(bbdev, d, conf);
> >>> +
> >>> +	rte_bbdev_log_debug("PF Tip configuration complete for %s",
> >> dev_name);
> >>> +	return 0;
> >>> +}
> >>> diff --git a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> >>> index 4a76d1d..91c234d 100644
> >>> --- a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> >>> +++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> >>> @@ -1,3 +1,10 @@
> >>>  DPDK_21 {
> >>>  	local: *;
> >>>  };
> >>> +
> >>> +EXPERIMENTAL {
> >>> +	global:
> >>> +
> >>> +	acc100_configure;
> >>> +
> >>> +};


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v10 10/10] baseband/acc100: add configure function
  2020-10-01 19:50             ` Chautru, Nicolas
@ 2020-10-01 21:44               ` Maxime Coquelin
  0 siblings, 0 replies; 213+ messages in thread
From: Maxime Coquelin @ 2020-10-01 21:44 UTC (permalink / raw)
  To: Chautru, Nicolas, dev, akhil.goyal
  Cc: Richardson, Bruce, Xu, Rosen, trix, Yigit, Ferruh, Liu, Tianjiao



On 10/1/20 9:50 PM, Chautru, Nicolas wrote:
> Hi Maxime, 
> 
>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>> On 10/1/20 5:36 PM, Chautru, Nicolas wrote:
>>> Hi Maxime,
>>>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>> Hi Nicolas,
>>>>
>>>> On 10/1/20 5:14 AM, Nicolas Chautru wrote:
>>>>> diff --git a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
>>>>> index 4a76d1d..91c234d 100644
>>>>> --- a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
>>>>> +++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
>>>>> @@ -1,3 +1,10 @@
>>>>>  DPDK_21 {
>>>>>  	local: *;
>>>>>  };
>>>>> +
>>>>> +EXPERIMENTAL {
>>>>> +	global:
>>>>> +
>>>>> +	acc100_configure;
>>>>> +
>>>>> +};
>>>>> --
>>>>
>>>> Ideally we should not need to have device specific APIs, but at least
>>>> it should be prefixed with "rte_".
>>>
>>> Currently this is already like that for other bbdev PMDs.
>>> So I would tend to prefer consistency overall in that context.
>>> You could argue or not whether this is PMD function or a companion
>> exposed function, but again if this should change it should change for all
>> PMDs to avoid discrepancies.
>>> If really this is deemed required this can be pushed as an extra patch
>> covering all PMD, but probably not for 20.11.
>>> What do you think?
>>
>> Better to fix the API now to avoid namespace pollution, including the other
>> comments I made regarding API on patch 3.
>> That's not a big change, it can be done in v20.11 in my opinion.
> 
> ok fair enough, thanks

Thanks Nicolas!

I can send a patch tomorrow to fix the other baseband driver APIs; it
should not be an issue given they are experimental.

Maxime

>>
>> Thanks,
>> Maxime
>>
>>>>
>>>> Regards,
>>>> Maxime
>>>
> 


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v9 07/10] baseband/acc100: add support for 4G processing
  2020-10-01 15:42           ` Tom Rix
@ 2020-10-01 21:46             ` Chautru, Nicolas
  0 siblings, 0 replies; 213+ messages in thread
From: Chautru, Nicolas @ 2020-10-01 21:46 UTC (permalink / raw)
  To: Tom Rix, dev, akhil.goyal
  Cc: Richardson, Bruce, Xu, Rosen, dave.burley, aidan.goddard, Yigit,
	Ferruh, Liu, Tianjiao

Hi Tom,

> From: Tom Rix <trix@redhat.com>
> On 9/30/20 12:10 PM, Chautru, Nicolas wrote:
> > Hi Tom,
> >
> >> From: Tom Rix <trix@redhat.com>
> >> On 9/28/20 5:29 PM, Nicolas Chautru wrote:
> >>> Adding capability for 4G encode and decode processing
> >>>
> >>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> >>> Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
> >>> ---
> >>>  doc/guides/bbdevs/features/acc100.ini    |    4 +-
> >>>  drivers/baseband/acc100/rte_acc100_pmd.c | 1010 ++++++++++++++++++++++++++++--
> >>>  2 files changed, 945 insertions(+), 69 deletions(-)
> >>>
> >>> diff --git a/doc/guides/bbdevs/features/acc100.ini
> >>> b/doc/guides/bbdevs/features/acc100.ini
> >>> index 40c7adc..642cd48 100644
> >>> --- a/doc/guides/bbdevs/features/acc100.ini
> >>> +++ b/doc/guides/bbdevs/features/acc100.ini
> >>> @@ -4,8 +4,8 @@
> >>>  ; Refer to default.ini for the full list of available PMD features.
> >>>  ;
> >>>  [Features]
> >>> -Turbo Decoder (4G)     = N
> >>> -Turbo Encoder (4G)     = N
> >>> +Turbo Decoder (4G)     = Y
> >>> +Turbo Encoder (4G)     = Y
> >>>  LDPC Decoder (5G)      = Y
> >>>  LDPC Encoder (5G)      = Y
> >>>  LLR/HARQ Compression   = Y
> >>> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> >>> b/drivers/baseband/acc100/rte_acc100_pmd.c
> >>> index e484c0a..7d4c3df 100644
> >>> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> >>> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> >>> @@ -339,7 +339,6 @@
> >>>  	free_base_addresses(base_addrs, i);  }
> >>>
> >>> -
> >>>  /* Allocate 64MB memory used for all software rings */
> >>>  static int
> >>>  acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
> >>> @@ -637,6 +636,41 @@
> >>>
> >>>  	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
> >>>  		{
> >>> +			.type = RTE_BBDEV_OP_TURBO_DEC,
> >>> +			.cap.turbo_dec = {
> >>> +				.capability_flags =
> >>> +					RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE |
> >>> +					RTE_BBDEV_TURBO_CRC_TYPE_24B |
> >>> +					RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
> >>> +					RTE_BBDEV_TURBO_EARLY_TERMINATION |
> >>> +					RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
> >>> +					RTE_BBDEV_TURBO_MAP_DEC |
> >>> +					RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP |
> >>> +					RTE_BBDEV_TURBO_DEC_SCATTER_GATHER,
> >>> +				.max_llr_modulus = INT8_MAX,
> >>> +				.num_buffers_src =
> >>> +					RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
> >>> +				.num_buffers_hard_out =
> >>> +					RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
> >>> +				.num_buffers_soft_out =
> >>> +					RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
> >>> +			}
> >>> +		},
> >>> +		{
> >>> +			.type = RTE_BBDEV_OP_TURBO_ENC,
> >>> +			.cap.turbo_enc = {
> >>> +				.capability_flags =
> >>> +					RTE_BBDEV_TURBO_CRC_24B_ATTACH |
> >>> +					RTE_BBDEV_TURBO_RV_INDEX_BYPASS |
> >>> +					RTE_BBDEV_TURBO_RATE_MATCH |
> >>> +					RTE_BBDEV_TURBO_ENC_SCATTER_GATHER,
> >>> +				.num_buffers_src =
> >>> +					RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
> >>> +				.num_buffers_dst =
> >>> +					RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
> >>> +			}
> >>> +		},
> >>> +		{
> >>>  			.type   = RTE_BBDEV_OP_LDPC_ENC,
> >>>  			.cap.ldpc_enc = {
> >>>  				.capability_flags =
> >>> @@ -719,7 +753,6 @@
> >>>  #endif
> >>>  }
> >>>
> >>> -
> >>>  static const struct rte_bbdev_ops acc100_bbdev_ops = {
> >>>  	.setup_queues = acc100_setup_queues,
> >>>  	.close = acc100_dev_close,
> >>> @@ -763,6 +796,58 @@
> >>>  	return tail;
> >>>  }
> >>>
> >>> +/* Fill in a frame control word for turbo encoding. */
> >>> +static inline void
> >>> +acc100_fcw_te_fill(const struct rte_bbdev_enc_op *op, struct acc100_fcw_te *fcw)
> >>> +{
> >>> +	fcw->code_block_mode = op->turbo_enc.code_block_mode;
> >>> +	if (fcw->code_block_mode == 0) { /* For TB mode */
> >>> +		fcw->k_neg = op->turbo_enc.tb_params.k_neg;
> >>> +		fcw->k_pos = op->turbo_enc.tb_params.k_pos;
> >>> +		fcw->c_neg = op->turbo_enc.tb_params.c_neg;
> >>> +		fcw->c = op->turbo_enc.tb_params.c;
> >>> +		fcw->ncb_neg = op->turbo_enc.tb_params.ncb_neg;
> >>> +		fcw->ncb_pos = op->turbo_enc.tb_params.ncb_pos;
> >>> +
> >>> +		if (check_bit(op->turbo_enc.op_flags,
> >>> +				RTE_BBDEV_TURBO_RATE_MATCH)) {
> >>> +			fcw->bypass_rm = 0;
> >>> +			fcw->cab = op->turbo_enc.tb_params.cab;
> >>> +			fcw->ea = op->turbo_enc.tb_params.ea;
> >>> +			fcw->eb = op->turbo_enc.tb_params.eb;
> >>> +		} else {
> >>> +			/* E is set to the encoding output size when RM is
> >>> +			 * bypassed.
> >>> +			 */
> >>> +			fcw->bypass_rm = 1;
> >>> +			fcw->cab = fcw->c_neg;
> >>> +			fcw->ea = 3 * fcw->k_neg + 12;
> >>> +			fcw->eb = 3 * fcw->k_pos + 12;
> >>> +		}
> >>> +	} else { /* For CB mode */
> >>> +		fcw->k_pos = op->turbo_enc.cb_params.k;
> >>> +		fcw->ncb_pos = op->turbo_enc.cb_params.ncb;
> >>> +
> >>> +		if (check_bit(op->turbo_enc.op_flags,
> >>> +				RTE_BBDEV_TURBO_RATE_MATCH)) {
> >>> +			fcw->bypass_rm = 0;
> >>> +			fcw->eb = op->turbo_enc.cb_params.e;
> >>> +		} else {
> >>> +			/* E is set to the encoding output size when RM is
> >>> +			 * bypassed.
> >>> +			 */
> >>> +			fcw->bypass_rm = 1;
> >>> +			fcw->eb = 3 * fcw->k_pos + 12;
> >>> +		}
> >>> +	}
> >>> +
> >>> +	fcw->bypass_rv_idx1 = check_bit(op->turbo_enc.op_flags,
> >>> +			RTE_BBDEV_TURBO_RV_INDEX_BYPASS);
> >>> +	fcw->code_block_crc = check_bit(op->turbo_enc.op_flags,
> >>> +			RTE_BBDEV_TURBO_CRC_24B_ATTACH);
> >>> +	fcw->rv_idx1 = op->turbo_enc.rv_index;
> >>> +}
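
When rate matching is bypassed, E in the fill function above is set to the full turbo encoder output: three streams of k bits plus 4 tail bits per stream. A tiny sketch of that arithmetic:

```c
#include <assert.h>

/* Turbo encoder output size with rate matching bypassed:
 * systematic + two parity streams (k bits each) + 3 * 4 tail bits,
 * i.e. the 3 * k + 12 used for fcw->ea and fcw->eb above.
 */
static unsigned int turbo_enc_bypass_e(unsigned int k)
{
	return 3 * k + 12;
}
```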
> >>> +
> >>>  /* Compute value of k0.
> >>>   * Based on 3GPP 38.212 Table 5.4.2.1-2
> >>>   * Starting position of different redundancy versions, k0
> >>> @@ -813,6 +898,25 @@
> >>>  	fcw->mcb_count = num_cb;
> >>>  }
> >>>
> >>> +/* Fill in a frame control word for turbo decoding. */
> >>> +static inline void
> >>> +acc100_fcw_td_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_td *fcw)
> >>> +{
> >>> +	/* Note: Early termination is always enabled for 4GUL */
> >>> +	fcw->fcw_ver = 1;
> >>> +	if (op->turbo_dec.code_block_mode == 0)
> >>> +		fcw->k_pos = op->turbo_dec.tb_params.k_pos;
> >>> +	else
> >>> +		fcw->k_pos = op->turbo_dec.cb_params.k;
> >>> +	fcw->turbo_crc_type = check_bit(op->turbo_dec.op_flags,
> >>> +			RTE_BBDEV_TURBO_CRC_TYPE_24B);
> >>> +	fcw->bypass_sb_deint = 0;
> >>> +	fcw->raw_decoder_input_on = 0;
> >>> +	fcw->max_iter = op->turbo_dec.iter_max;
> >>> +	fcw->half_iter_on = !check_bit(op->turbo_dec.op_flags,
> >>> +			RTE_BBDEV_TURBO_HALF_ITERATION_EVEN);
> >>> +}
> >>> +
> >>>  /* Fill in a frame control word for LDPC decoding. */
> >>>  static inline void
> >>>  acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
> >>> @@ -1042,6 +1146,87 @@
> >>>  }
> >>>
> >>>  static inline int
> >>> +acc100_dma_desc_te_fill(struct rte_bbdev_enc_op *op,
> >>> +		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
> >>> +		struct rte_mbuf *output, uint32_t *in_offset,
> >>> +		uint32_t *out_offset, uint32_t *out_length,
> >>> +		uint32_t *mbuf_total_left, uint32_t *seg_total_left, uint8_t r)
> >>> +{
> >>> +	int next_triplet = 1; /* FCW already done */
> >>> +	uint32_t e, ea, eb, length;
> >>> +	uint16_t k, k_neg, k_pos;
> >>> +	uint8_t cab, c_neg;
> >>> +
> >>> +	desc->word0 = ACC100_DMA_DESC_TYPE;
> >>> +	desc->word1 = 0; /**< Timestamp could be disabled */
> >>> +	desc->word2 = 0;
> >>> +	desc->word3 = 0;
> >>> +	desc->numCBs = 1;
> >>> +
> >>> +	if (op->turbo_enc.code_block_mode == 0) {
> >>> +		ea = op->turbo_enc.tb_params.ea;
> >>> +		eb = op->turbo_enc.tb_params.eb;
> >>> +		cab = op->turbo_enc.tb_params.cab;
> >>> +		k_neg = op->turbo_enc.tb_params.k_neg;
> >>> +		k_pos = op->turbo_enc.tb_params.k_pos;
> >>> +		c_neg = op->turbo_enc.tb_params.c_neg;
> >>> +		e = (r < cab) ? ea : eb;
> >>> +		k = (r < c_neg) ? k_neg : k_pos;
> >>> +	} else {
> >>> +		e = op->turbo_enc.cb_params.e;
> >>> +		k = op->turbo_enc.cb_params.k;
> >>> +	}
> >>> +
> >>> +	if (check_bit(op->turbo_enc.op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH))
> >>> +		length = (k - 24) >> 3;
> >>> +	else
> >>> +		length = k >> 3;
> >>> +
> >>> +	if (unlikely((*mbuf_total_left == 0) || (*mbuf_total_left < length))) {
> >> similar to other patches, this check can be combined to <=
> >>
> >> change generally
> > same comment on other patch
> >
> >>> +		rte_bbdev_log(ERR,
> >>> +				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
> >>> +				*mbuf_total_left, length);
> >>> +		return -1;
> >>> +	}
> >>> +
> >>> +	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
> >>> +			length, seg_total_left, next_triplet);
> >>> +	if (unlikely(next_triplet < 0)) {
> >>> +		rte_bbdev_log(ERR,
> >>> +				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
> >>> +				op);
> >>> +		return -1;
> >>> +	}
> >>> +	desc->data_ptrs[next_triplet - 1].last = 1;
> >>> +	desc->m2dlen = next_triplet;
> >>> +	*mbuf_total_left -= length;
> >>> +
> >>> +	/* Set output length */
> >>> +	if (check_bit(op->turbo_enc.op_flags, RTE_BBDEV_TURBO_RATE_MATCH))
> >>> +		/* Integer round up division by 8 */
> >>> +		*out_length = (e + 7) >> 3;
> >>> +	else
> >>> +		*out_length = (k >> 3) * 3 + 2;
> >>> +
> >>> +	next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
> >>> +			*out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
> >>> +	if (unlikely(next_triplet < 0)) {
> >>> +		rte_bbdev_log(ERR,
> >>> +				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
> >>> +				op);
> >>> +		return -1;
> >>> +	}
> >>> +	op->turbo_enc.output.length += *out_length;
> >>> +	*out_offset += *out_length;
> >>> +	desc->data_ptrs[next_triplet - 1].last = 1;
> >>> +	desc->d2mlen = next_triplet - desc->m2dlen;
> >>> +
> >>> +	desc->op_addr = op;
> >>> +
> >>> +	return 0;
> >>> +}
> >>> +
> >>> +static inline int
> >>>  acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
> >>>  		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
> >>>  		struct rte_mbuf *output, uint32_t *in_offset,
> >>> @@ -1110,6 +1295,117 @@
> >>>  }
> >>>
> >>>  static inline int
> >>> +acc100_dma_desc_td_fill(struct rte_bbdev_dec_op *op,
> >>> +		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
> >>> +		struct rte_mbuf *h_output, struct rte_mbuf *s_output,
> >>> +		uint32_t *in_offset, uint32_t *h_out_offset,
> >>> +		uint32_t *s_out_offset, uint32_t *h_out_length,
> >>> +		uint32_t *s_out_length, uint32_t *mbuf_total_left,
> >>> +		uint32_t *seg_total_left, uint8_t r)
> >>> +{
> >>> +	int next_triplet = 1; /* FCW already done */
> >>> +	uint16_t k;
> >>> +	uint16_t crc24_overlap = 0;
> >>> +	uint32_t e, kw;
> >>> +
> >>> +	desc->word0 = ACC100_DMA_DESC_TYPE;
> >>> +	desc->word1 = 0; /**< Timestamp could be disabled */
> >>> +	desc->word2 = 0;
> >>> +	desc->word3 = 0;
> >>> +	desc->numCBs = 1;
> >>> +
> >>> +	if (op->turbo_dec.code_block_mode == 0) {
> >>> +		k = (r < op->turbo_dec.tb_params.c_neg)
> >>> +			? op->turbo_dec.tb_params.k_neg
> >>> +			: op->turbo_dec.tb_params.k_pos;
> >>> +		e = (r < op->turbo_dec.tb_params.cab)
> >>> +			? op->turbo_dec.tb_params.ea
> >>> +			: op->turbo_dec.tb_params.eb;
> >>> +	} else {
> >>> +		k = op->turbo_dec.cb_params.k;
> >>> +		e = op->turbo_dec.cb_params.e;
> >>> +	}
> >>> +
> >>> +	if ((op->turbo_dec.code_block_mode == 0)
> >>> +		&& !check_bit(op->turbo_dec.op_flags,
> >>> +		RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP))
> >>> +		crc24_overlap = 24;
> >>> +
> >>> +	/* Calculates circular buffer size.
> >>> +	 * According to 3gpp 36.212 section 5.1.4.2
> >>> +	 *   Kw = 3 * Kpi,
> >>> +	 * where:
> >>> +	 *   Kpi = nCol * nRow
> >>> +	 * where nCol is 32 and nRow can be calculated from:
> >>> +	 *   D <= nCol * nRow
> >>> +	 * where D is the size of each output from turbo encoder block (k + 4).
> >>> +	 */
> >>> +	kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
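
The Kw computation can be checked against 3GPP 36.212 directly: round D = k + 4 up to a multiple of the 32 interleaver columns, then triple it for the three streams. A self-contained equivalent of RTE_ALIGN_CEIL(k + 4, 32) * 3:

```c
#include <assert.h>

/* Circular buffer size per 3GPP 36.212 section 5.1.4.2:
 * Kw = 3 * Kpi, Kpi = 32 * nRow, nRow = ceil((k + 4) / 32).
 */
static unsigned int turbo_dec_kw(unsigned int k)
{
	unsigned int n_row = (k + 4 + 31) / 32;

	return 3 * 32 * n_row;
}
```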
> >>> +
> >>> +	if (unlikely((*mbuf_total_left == 0) || (*mbuf_total_left < kw))) {
> >>> +		rte_bbdev_log(ERR,
> >>> +				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
> >>> +				*mbuf_total_left, kw);
> >>> +		return -1;
> >>> +	}
> >>> +
> >>> +	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset, kw,
> >>> +			seg_total_left, next_triplet);
> >>> +	if (unlikely(next_triplet < 0)) {
> >>> +		rte_bbdev_log(ERR,
> >>> +				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
> >>> +				op);
> >>> +		return -1;
> >>> +	}
> >>> +	desc->data_ptrs[next_triplet - 1].last = 1;
> >>> +	desc->m2dlen = next_triplet;
> >>> +	*mbuf_total_left -= kw;
> >>> +
> >>> +	next_triplet = acc100_dma_fill_blk_type_out(
> >>> +			desc, h_output, *h_out_offset,
> >>> +			k >> 3, next_triplet, ACC100_DMA_BLKID_OUT_HARD);
> >>> +	if (unlikely(next_triplet < 0)) {
> >>> +		rte_bbdev_log(ERR,
> >>> +				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
> >>> +				op);
> >>> +		return -1;
> >>> +	}
> >>> +
> >>> +	*h_out_length = ((k - crc24_overlap) >> 3);
> >>> +	op->turbo_dec.hard_output.length += *h_out_length;
> >>> +	*h_out_offset += *h_out_length;
> >>> +
> >>> +	/* Soft output */
> >>> +	if (check_bit(op->turbo_dec.op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT)) {
> >>> +		if (check_bit(op->turbo_dec.op_flags,
> >>> +				RTE_BBDEV_TURBO_EQUALIZER))
> >>> +			*s_out_length = e;
> >>> +		else
> >>> +			*s_out_length = (k * 3) + 12;
> >>> +
> >>> +		next_triplet = acc100_dma_fill_blk_type_out(desc, s_output,
> >>> +				*s_out_offset, *s_out_length, next_triplet,
> >>> +				ACC100_DMA_BLKID_OUT_SOFT);
> >>> +		if (unlikely(next_triplet < 0)) {
> >>> +			rte_bbdev_log(ERR,
> >>> +					"Mismatch between data to process and mbuf data length in bbdev_op: %p",
> >>> +					op);
> >>> +			return -1;
> >>> +		}
> >>> +
> >>> +		op->turbo_dec.soft_output.length += *s_out_length;
> >>> +		*s_out_offset += *s_out_length;
> >>> +	}
> >>> +
> >>> +	desc->data_ptrs[next_triplet - 1].last = 1;
> >>> +	desc->d2mlen = next_triplet - desc->m2dlen;
> >>> +
> >>> +	desc->op_addr = op;
> >>> +
> >>> +	return 0;
> >>> +}
> >>> +
> >>> +static inline int
> >>>  acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
> >>>  		struct acc100_dma_req_desc *desc,
> >>>  		struct rte_mbuf **input, struct rte_mbuf *h_output,
> >>> @@ -1374,6 +1670,57 @@
> >>>
> >>>  /* Enqueue one encode operations for ACC100 device in CB mode */
> >>> static inline int
> >>> +enqueue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
> >>> +		uint16_t total_enqueued_cbs)
> >>> +{
> >>> +	union acc100_dma_desc *desc = NULL;
> >>> +	int ret;
> >>> +	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
> >>> +		seg_total_left;
> >>> +	struct rte_mbuf *input, *output_head, *output;
> >>> +
> >>> +	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> >>> +			& q->sw_ring_wrap_mask);
> >>> +	desc = q->ring_addr + desc_idx;
> >>> +	acc100_fcw_te_fill(op, &desc->req.fcw_te);
> >>> +
> >>> +	input = op->turbo_enc.input.data;
> >>> +	output_head = output = op->turbo_enc.output.data;
> >>> +	in_offset = op->turbo_enc.input.offset;
> >>> +	out_offset = op->turbo_enc.output.offset;
> >>> +	out_length = 0;
> >>> +	mbuf_total_left = op->turbo_enc.input.length;
> >>> +	seg_total_left = rte_pktmbuf_data_len(op->turbo_enc.input.data)
> >>> +			- in_offset;
> >>> +
> >>> +	ret = acc100_dma_desc_te_fill(op, &desc->req, &input, output,
> >>> +			&in_offset, &out_offset, &out_length, &mbuf_total_left,
> >>> +			&seg_total_left, 0);
> >>> +
> >>> +	if (unlikely(ret < 0))
> >>> +		return ret;
> >>> +
> >>> +	mbuf_append(output_head, output, out_length);
> >>> +
> >>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> >>> +	rte_memdump(stderr, "FCW", &desc->req.fcw_te,
> >>> +			sizeof(desc->req.fcw_te) - 8);
> >>> +	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> >>> +
> >>> +	/* Check if any data left after processing one CB */
> >>> +	if (mbuf_total_left != 0) {
> >>> +		rte_bbdev_log(ERR,
> >>> +				"Some data still left after processing one CB: mbuf_total_left = %u",
> >>> +				mbuf_total_left);
> >>> +		return -EINVAL;
> >>> +	}
> >>> +#endif
> >>> +	/* One CB (one op) was successfully prepared to enqueue */
> >>> +	return 1;
> >>> +}
> >>> +
> >>> +/* Enqueue one encode operation for ACC100 device in CB mode */
> >>> +static inline int
> >>>  enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
> >>>  		uint16_t total_enqueued_cbs, int16_t num)
> >>>  {
> >>> @@ -1481,78 +1828,235 @@
> >>>  	return 1;
> >>>  }
> >>>
> >>> -static inline int
> >>> -harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
> >>> -		uint16_t total_enqueued_cbs)
> >>> -{
> >>> -	struct acc100_fcw_ld *fcw;
> >>> -	union acc100_dma_desc *desc;
> >>> -	int next_triplet = 1;
> >>> -	struct rte_mbuf *hq_output_head, *hq_output;
> >>> -	uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
> >>> -	if (harq_in_length == 0) {
> >>> -		rte_bbdev_log(ERR, "Loopback of invalid null size\n");
> >>> -		return -EINVAL;
> >>> -	}
> >>>
> >>> -	int h_comp = check_bit(op->ldpc_dec.op_flags,
> >>> -			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
> >>> -			) ? 1 : 0;
> >>> -	if (h_comp == 1)
> >>> -		harq_in_length = harq_in_length * 8 / 6;
> >>> -	harq_in_length = RTE_ALIGN(harq_in_length, 64);
> >>> -	uint16_t harq_dma_length_in = (h_comp == 0) ?
> >>> -			harq_in_length :
> >>> -			harq_in_length * 6 / 8;
> >>> -	uint16_t harq_dma_length_out = harq_dma_length_in;
> >>> -	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
> >>> -			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
> >>> -	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
> >>> -	uint16_t harq_index = (ddr_mem_in ?
> >>> -			op->ldpc_dec.harq_combined_input.offset :
> >>> -			op->ldpc_dec.harq_combined_output.offset)
> >>> -			/ ACC100_HARQ_OFFSET;
> >>> +/* Enqueue one encode operation for ACC100 device in TB mode. */
> >>> +static inline int
> >>> +enqueue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
> >>> +		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
> >>> +{
> >>> +	union acc100_dma_desc *desc = NULL;
> >>> +	int ret;
> >>> +	uint8_t r, c;
> >>> +	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
> >>> +		seg_total_left;
> >>> +	struct rte_mbuf *input, *output_head, *output;
> >>> +	uint16_t current_enqueued_cbs = 0;
> >>>
> >>>  	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> >>>  			& q->sw_ring_wrap_mask);
> >>>  	desc = q->ring_addr + desc_idx;
> >>> -	fcw = &desc->req.fcw_ld;
> >>> -	/* Set the FCW from loopback into DDR */
> >>> -	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
> >>> -	fcw->FCWversion = ACC100_FCW_VER;
> >>> -	fcw->qm = 2;
> >>> -	fcw->Zc = 384;
> >>> -	if (harq_in_length < 16 * N_ZC_1)
> >>> -		fcw->Zc = 16;
> >>> -	fcw->ncb = fcw->Zc * N_ZC_1;
> >>> -	fcw->rm_e = 2;
> >>> -	fcw->hcin_en = 1;
> >>> -	fcw->hcout_en = 1;
> >>> +	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
> >>> +	acc100_fcw_te_fill(op, &desc->req.fcw_te);
> >>>
> >>> -	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d\n",
> >>> -			ddr_mem_in, harq_index,
> >>> -			harq_layout[harq_index].offset, harq_in_length,
> >>> -			harq_dma_length_in);
> >>> +	input = op->turbo_enc.input.data;
> >>> +	output_head = output = op->turbo_enc.output.data;
> >>> +	in_offset = op->turbo_enc.input.offset;
> >>> +	out_offset = op->turbo_enc.output.offset;
> >>> +	out_length = 0;
> >>> +	mbuf_total_left = op->turbo_enc.input.length;
> >>>
> >>> -	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
> >>> -		fcw->hcin_size0 = harq_layout[harq_index].size0;
> >>> -		fcw->hcin_offset = harq_layout[harq_index].offset;
> >>> -		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
> >>> -		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
> >>> -		if (h_comp == 1)
> >>> -			harq_dma_length_in = harq_dma_length_in * 6 / 8;
> >>> -	} else {
> >>> -		fcw->hcin_size0 = harq_in_length;
> >>> -	}
> >>> -	harq_layout[harq_index].val = 0;
> >>> -	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
> >>> -			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
> >>> -	fcw->hcout_size0 = harq_in_length;
> >>> -	fcw->hcin_decomp_mode = h_comp;
> >>> -	fcw->hcout_comp_mode = h_comp;
> >>> -	fcw->gain_i = 1;
> >>> -	fcw->gain_h = 1;
> >>> +	c = op->turbo_enc.tb_params.c;
> >>> +	r = op->turbo_enc.tb_params.r;
> >>>
> >>> -	/* Set the prefix of descriptor. This could be done at polling */
> >>> +	while (mbuf_total_left > 0 && r < c) {
> >>> +		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
> >>> +		/* Set up DMA descriptor */
> >>> +		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
> >>> +				& q->sw_ring_wrap_mask);
> >>> +		desc->req.data_ptrs[0].address = q->ring_addr_phys + fcw_offset;
> >>> +		desc->req.data_ptrs[0].blen = ACC100_FCW_TE_BLEN;
> >>> +
> >>> +		ret = acc100_dma_desc_te_fill(op, &desc->req, &input, output,
> >>> +				&in_offset, &out_offset, &out_length,
> >>> +				&mbuf_total_left, &seg_total_left, r);
> >>> +		if (unlikely(ret < 0))
> >>> +			return ret;
> >>> +		mbuf_append(output_head, output, out_length);
> >>> +
> >>> +		/* Set total number of CBs in TB */
> >>> +		desc->req.cbs_in_tb = cbs_in_tb;
> >>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> >>> +		rte_memdump(stderr, "FCW", &desc->req.fcw_te,
> >>> +				sizeof(desc->req.fcw_te) - 8);
> >>> +		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> >> #endif
> >>> +
> >>> +		if (seg_total_left == 0) {
> >>> +			/* Go to the next mbuf */
> >>> +			input = input->next;
> >>> +			in_offset = 0;
> >>> +			output = output->next;
> >>> +			out_offset = 0;
> >>> +		}
> >>> +
> >>> +		total_enqueued_cbs++;
> >>> +		current_enqueued_cbs++;
> >>> +		r++;
> >>> +	}
> >>> +
> >>> +	if (unlikely(desc == NULL))
> >>> +		return current_enqueued_cbs;
> >>> +
> >>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> >>> +	/* Check if any CBs left for processing */
> >>> +	if (mbuf_total_left != 0) {
> >>> +		rte_bbdev_log(ERR,
> >>> +				"Some data still left for processing:
> >> mbuf_total_left = %u",
> >>> +				mbuf_total_left);
> >>> +		return -EINVAL;
> >>> +	}
> >>> +#endif
> >>> +
> >>> +	/* Set SDone on last CB descriptor for TB mode. */
> >>> +	desc->req.sdone_enable = 1;
> >>> +	desc->req.irq_enable = q->irq_enable;
> >>> +
> >>> +	return current_enqueued_cbs;
> >>> +}
> >>> +
> >>> +/* Enqueue one decode operation for ACC100 device in CB mode */
> >>> +static inline int enqueue_dec_one_op_cb(struct acc100_queue *q,
> >>> +struct rte_bbdev_dec_op *op,
> >>> +		uint16_t total_enqueued_cbs)
> >>> +{
> >>> +	union acc100_dma_desc *desc = NULL;
> >>> +	int ret;
> >>> +	uint32_t in_offset, h_out_offset, s_out_offset, s_out_length,
> >>> +		h_out_length, mbuf_total_left, seg_total_left;
> >>> +	struct rte_mbuf *input, *h_output_head, *h_output,
> >>> +		*s_output_head, *s_output;
> >>> +
> >>> +	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> >>> +			& q->sw_ring_wrap_mask);
> >>> +	desc = q->ring_addr + desc_idx;
> >>> +	acc100_fcw_td_fill(op, &desc->req.fcw_td);
> >>> +
> >>> +	input = op->turbo_dec.input.data;
> >>> +	h_output_head = h_output = op->turbo_dec.hard_output.data;
> >>> +	s_output_head = s_output = op->turbo_dec.soft_output.data;
> >>> +	in_offset = op->turbo_dec.input.offset;
> >>> +	h_out_offset = op->turbo_dec.hard_output.offset;
> >>> +	s_out_offset = op->turbo_dec.soft_output.offset;
> >>> +	h_out_length = s_out_length = 0;
> >>> +	mbuf_total_left = op->turbo_dec.input.length;
> >>> +	seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
> >>> +
> >>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> >>> +	if (unlikely(input == NULL)) {
> >>> +		rte_bbdev_log(ERR, "Invalid mbuf pointer");
> >>> +		return -EFAULT;
> >>> +	}
> >>> +#endif
> >>> +
> >>> +	/* Set up DMA descriptor */
> >>> +	desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
> >>> +			& q->sw_ring_wrap_mask);
> >>> +
> >>> +	ret = acc100_dma_desc_td_fill(op, &desc->req, &input, h_output,
> >>> +			s_output, &in_offset, &h_out_offset, &s_out_offset,
> >>> +			&h_out_length, &s_out_length, &mbuf_total_left,
> >>> +			&seg_total_left, 0);
> >>> +
> >>> +	if (unlikely(ret < 0))
> >>> +		return ret;
> >>> +
> >>> +	/* Hard output */
> >>> +	mbuf_append(h_output_head, h_output, h_out_length);
> >>> +
> >>> +	/* Soft output */
> >>> +	if (check_bit(op->turbo_dec.op_flags,
> >> RTE_BBDEV_TURBO_SOFT_OUTPUT))
> >>> +		mbuf_append(s_output_head, s_output, s_out_length);
> >>> +
> >>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> >>> +	rte_memdump(stderr, "FCW", &desc->req.fcw_td,
> >>> +			sizeof(desc->req.fcw_td) - 8);
> >>> +	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> >>> +
> >>> +	/* Check if any CBs left for processing */
> >>> +	if (mbuf_total_left != 0) {
> >>> +		rte_bbdev_log(ERR,
> >>> +				"Some data still left after processing one CB:
> >> mbuf_total_left = %u",
> >>> +				mbuf_total_left);
> >>> +		return -EINVAL;
> >>> +	}
> >>> +#endif
> >> logic similar to debug in mbuf_append, should be a common function.
> > Not exactly, unless I am missing your point.
> 
> I look for code blocks that look similar and want you to consider whether they
> can be combined into a general function or macro. A general function is easier
> to maintain. In this case, it seems like the logging that something is wrong
> with the mbuf is similar to an earlier block of code.
> 
> Nothing is functionally wrong.

For that specific example I can add a common error trap.
I am not totally convinced of the value, and it prevents using %s with __func__, but OK.

> 
> >
> >>> +
> >>> +	/* One CB (one op) was successfully prepared to enqueue */
> >>> +	return 1;
> >>> +}
> >>> +
> >>> +static inline int
> >>> +harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
> >>> +		uint16_t total_enqueued_cbs) {
> >>> +	struct acc100_fcw_ld *fcw;
> >>> +	union acc100_dma_desc *desc;
> >>> +	int next_triplet = 1;
> >>> +	struct rte_mbuf *hq_output_head, *hq_output;
> >>> +	uint16_t harq_in_length = op-
> >>> ldpc_dec.harq_combined_input.length;
> >>> +	if (harq_in_length == 0) {
> >>> +		rte_bbdev_log(ERR, "Loopback of invalid null size\n");
> >>> +		return -EINVAL;
> >>> +	}
> >>> +
> >>> +	int h_comp = check_bit(op->ldpc_dec.op_flags,
> >>> +			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
> >>> +			) ? 1 : 0;
> >>> +	if (h_comp == 1)
> >>> +		harq_in_length = harq_in_length * 8 / 6;
> >>> +	harq_in_length = RTE_ALIGN(harq_in_length, 64);
> >>> +	uint16_t harq_dma_length_in = (h_comp == 0) ?
> >> Can these h_comp checks be combined into a single if/else?
> > it may be clearer, ok.
> >
> >
> >>> +			harq_in_length :
> >>> +			harq_in_length * 6 / 8;
> >>> +	uint16_t harq_dma_length_out = harq_dma_length_in;
> >>> +	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
> >>> +
> >> 	RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
> >>> +	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
> >>> +	uint16_t harq_index = (ddr_mem_in ?
> >>> +			op->ldpc_dec.harq_combined_input.offset :
> >>> +			op->ldpc_dec.harq_combined_output.offset)
> >>> +			/ ACC100_HARQ_OFFSET;
> >>> +
> >>> +	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> >>> +			& q->sw_ring_wrap_mask);
> >>> +	desc = q->ring_addr + desc_idx;
> >>> +	fcw = &desc->req.fcw_ld;
> >>> +	/* Set the FCW from loopback into DDR */
> >>> +	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
> >>> +	fcw->FCWversion = ACC100_FCW_VER;
> >>> +	fcw->qm = 2;
> >>> +	fcw->Zc = 384;
> >> these magic numbers should have #defines
> > These are not magic numbers, but actual 3GPP values.
> ok
> >
> >>> +	if (harq_in_length < 16 * N_ZC_1)
> >>> +		fcw->Zc = 16;
> >>> +	fcw->ncb = fcw->Zc * N_ZC_1;
> >>> +	fcw->rm_e = 2;
> >>> +	fcw->hcin_en = 1;
> >>> +	fcw->hcout_en = 1;
> >>> +
> >>> +	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length
> >> %d %d\n",
> >>> +			ddr_mem_in, harq_index,
> >>> +			harq_layout[harq_index].offset, harq_in_length,
> >>> +			harq_dma_length_in);
> >>> +
> >>> +	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
> >>> +		fcw->hcin_size0 = harq_layout[harq_index].size0;
> >>> +		fcw->hcin_offset = harq_layout[harq_index].offset;
> >>> +		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
> >>> +		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
> >>> +		if (h_comp == 1)
> >>> +			harq_dma_length_in = harq_dma_length_in * 6 / 8;
> >>> +	} else {
> >>> +		fcw->hcin_size0 = harq_in_length;
> >>> +	}
> >>> +	harq_layout[harq_index].val = 0;
> >>> +	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
> >>> +			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
> >>> +	fcw->hcout_size0 = harq_in_length;
> >>> +	fcw->hcin_decomp_mode = h_comp;
> >>> +	fcw->hcout_comp_mode = h_comp;
> >>> +	fcw->gain_i = 1;
> >>> +	fcw->gain_h = 1;
> >>> +
> >>> +	/* Set the prefix of descriptor. This could be done at polling */
> >>>  	desc->req.word0 = ACC100_DMA_DESC_TYPE;
> >>>  	desc->req.word1 = 0; /**< Timestamp could be disabled */
> >>>  	desc->req.word2 = 0;
> >>> @@ -1816,6 +2320,107 @@
> >>>  	return current_enqueued_cbs;
> >>>  }
> >>>
> >>> +/* Enqueue one decode operation for ACC100 device in TB mode */
> >>> +static inline int enqueue_dec_one_op_tb(struct acc100_queue *q,
> >>> +struct rte_bbdev_dec_op *op,
> >>> +		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb) {
> >>> +	union acc100_dma_desc *desc = NULL;
> >>> +	int ret;
> >>> +	uint8_t r, c;
> >>> +	uint32_t in_offset, h_out_offset, s_out_offset, s_out_length,
> >>> +		h_out_length, mbuf_total_left, seg_total_left;
> >>> +	struct rte_mbuf *input, *h_output_head, *h_output,
> >>> +		*s_output_head, *s_output;
> >>> +	uint16_t current_enqueued_cbs = 0;
> >>> +
> >>> +	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
> >>> +			& q->sw_ring_wrap_mask);
> >>> +	desc = q->ring_addr + desc_idx;
> >>> +	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
> >>> +	acc100_fcw_td_fill(op, &desc->req.fcw_td);
> >>> +
> >>> +	input = op->turbo_dec.input.data;
> >>> +	h_output_head = h_output = op->turbo_dec.hard_output.data;
> >>> +	s_output_head = s_output = op->turbo_dec.soft_output.data;
> >>> +	in_offset = op->turbo_dec.input.offset;
> >>> +	h_out_offset = op->turbo_dec.hard_output.offset;
> >>> +	s_out_offset = op->turbo_dec.soft_output.offset;
> >>> +	h_out_length = s_out_length = 0;
> >>> +	mbuf_total_left = op->turbo_dec.input.length;
> >>> +	c = op->turbo_dec.tb_params.c;
> >>> +	r = op->turbo_dec.tb_params.r;
> >>> +
> >>> +	while (mbuf_total_left > 0 && r < c) {
> >>> +
> >>> +		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
> >>> +
> >>> +		/* Set up DMA descriptor */
> >>> +		desc = q->ring_addr + ((q->sw_ring_head +
> >> total_enqueued_cbs)
> >>> +				& q->sw_ring_wrap_mask);
> >>> +		desc->req.data_ptrs[0].address = q->ring_addr_phys +
> >> fcw_offset;
> >>> +		desc->req.data_ptrs[0].blen = ACC100_FCW_TD_BLEN;
> >>> +		ret = acc100_dma_desc_td_fill(op, &desc->req, &input,
> >>> +				h_output, s_output, &in_offset,
> >> &h_out_offset,
> >>> +				&s_out_offset, &h_out_length,
> >> &s_out_length,
> >>> +				&mbuf_total_left, &seg_total_left, r);
> >>> +
> >>> +		if (unlikely(ret < 0))
> >>> +			return ret;
> >>> +
> >>> +		/* Hard output */
> >>> +		mbuf_append(h_output_head, h_output, h_out_length);
> >>> +
> >>> +		/* Soft output */
> >>> +		if (check_bit(op->turbo_dec.op_flags,
> >>> +				RTE_BBDEV_TURBO_SOFT_OUTPUT))
> >>> +			mbuf_append(s_output_head, s_output,
> >> s_out_length);
> >>> +
> >>> +		/* Set total number of CBs in TB */
> >>> +		desc->req.cbs_in_tb = cbs_in_tb;
> >>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> >>> +		rte_memdump(stderr, "FCW", &desc->req.fcw_td,
> >>> +				sizeof(desc->req.fcw_td) - 8);
> >>> +		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
> >> #endif
> >>> +
> >>> +		if (seg_total_left == 0) {
> >>> +			/* Go to the next mbuf */
> >>> +			input = input->next;
> >>> +			in_offset = 0;
> >>> +			h_output = h_output->next;
> >>> +			h_out_offset = 0;
> >>> +
> >>> +			if (check_bit(op->turbo_dec.op_flags,
> >>> +					RTE_BBDEV_TURBO_SOFT_OUTPUT))
> >> {
> >>> +				s_output = s_output->next;
> >>> +				s_out_offset = 0;
> >>> +			}
> >>> +		}
> >>> +
> >>> +		total_enqueued_cbs++;
> >>> +		current_enqueued_cbs++;
> >>> +		r++;
> >>> +	}
> >>> +
> >>> +	if (unlikely(desc == NULL))
> >>> +		return current_enqueued_cbs;
> >>> +
> >>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> >>> +	/* Check if any CBs left for processing */
> >>> +	if (mbuf_total_left != 0) {
> >>> +		rte_bbdev_log(ERR,
> >>> +				"Some data still left for processing:
> >> mbuf_total_left = %u",
> >>> +				mbuf_total_left);
> >>> +		return -EINVAL;
> >>> +	}
> >>> +#endif
> >>> +	/* Set SDone on last CB descriptor for TB mode */
> >>> +	desc->req.sdone_enable = 1;
> >>> +	desc->req.irq_enable = q->irq_enable;
> >>> +
> >>> +	return current_enqueued_cbs;
> >>> +}
> >>>
> >>>  /* Calculates number of CBs in processed encoder TB based on 'r'
> >>> and
> >> input
> >>>   * length.
> >>> @@ -1893,6 +2498,45 @@
> >>>  	return cbs_in_tb;
> >>>  }
> >>>
> >>> +/* Enqueue encode operations for ACC100 device in CB mode. */
> >>> +static uint16_t acc100_enqueue_enc_cb(struct rte_bbdev_queue_data
> >> *q_data,
> >>> +		struct rte_bbdev_enc_op **ops, uint16_t num) {
> >>> +	struct acc100_queue *q = q_data->queue_private;
> >>> +	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q-
> >>> sw_ring_head;
> >>> +	uint16_t i;
> >>> +	union acc100_dma_desc *desc;
> >>> +	int ret;
> >>> +
> >>> +	for (i = 0; i < num; ++i) {
> >>> +		/* Check if there is available space for further processing */
> >>> +		if (unlikely(avail - 1 < 0))
> >>> +			break;
> >>> +		avail -= 1;
> >>> +
> >>> +		ret = enqueue_enc_one_op_cb(q, ops[i], i);
> >>> +		if (ret < 0)
> >>> +			break;
> >>> +	}
> >>> +
> >>> +	if (unlikely(i == 0))
> >>> +		return 0; /* Nothing to enqueue */
> >>> +
> >>> +	/* Set SDone in last CB in enqueued ops for CB mode */
> >>> +	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
> >>> +			& q->sw_ring_wrap_mask);
> >>> +	desc->req.sdone_enable = 1;
> >>> +	desc->req.irq_enable = q->irq_enable;
> >>> +
> >>> +	acc100_dma_enqueue(q, i, &q_data->queue_stats);
> >>> +
> >>> +	/* Update stats */
> >>> +	q_data->queue_stats.enqueued_count += i;
> >>> +	q_data->queue_stats.enqueue_err_count += num - i;
> >>> +	return i;
> >>> +}
> >>> +
> >>>  /* Check we can mux encode operations with common FCW */  static
> >>> inline bool  check_mux(struct rte_bbdev_enc_op **ops, uint16_t num)
> >>> { @@ -1960,6 +2604,52 @@
> >>>  	return i;
> >>>  }
> >>>
> >>> +/* Enqueue encode operations for ACC100 device in TB mode. */
> >>> +static uint16_t acc100_enqueue_enc_tb(struct rte_bbdev_queue_data
> >> *q_data,
> >>> +		struct rte_bbdev_enc_op **ops, uint16_t num) {
> >>> +	struct acc100_queue *q = q_data->queue_private;
> >>> +	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q-
> >>> sw_ring_head;
> >>> +	uint16_t i, enqueued_cbs = 0;
> >>> +	uint8_t cbs_in_tb;
> >>> +	int ret;
> >>> +
> >>> +	for (i = 0; i < num; ++i) {
> >>> +		cbs_in_tb = get_num_cbs_in_tb_enc(&ops[i]->turbo_enc);
> >>> +		/* Check if there is available space for further processing */
> >>> +		if (unlikely(avail - cbs_in_tb < 0))
> >>> +			break;
> >>> +		avail -= cbs_in_tb;
> >>> +
> >>> +		ret = enqueue_enc_one_op_tb(q, ops[i], enqueued_cbs,
> >> cbs_in_tb);
> >>> +		if (ret < 0)
> >>> +			break;
> >>> +		enqueued_cbs += ret;
> >>> +	}
> >>> +
> >> Other similar functions have an (i == 0) check here.
> > ok
> >
> >>> +	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
> >>> +
> >>> +	/* Update stats */
> >>> +	q_data->queue_stats.enqueued_count += i;
> >>> +	q_data->queue_stats.enqueue_err_count += num - i;
> >>> +
> >>> +	return i;
> >>> +}
> >>> +
> >>> +/* Enqueue encode operations for ACC100 device. */ static uint16_t
> >>> +acc100_enqueue_enc(struct rte_bbdev_queue_data *q_data,
> >>> +		struct rte_bbdev_enc_op **ops, uint16_t num) {
> >>> +	if (unlikely(num == 0))
> >>> +		return 0;
> >> The num == 0 check should move into the tb/cb functions.
> > Same comment as on the other patch: why not catch it early?
> >
> >>> +	if (ops[0]->turbo_enc.code_block_mode == 0)
> >>> +		return acc100_enqueue_enc_tb(q_data, ops, num);
> >>> +	else
> >>> +		return acc100_enqueue_enc_cb(q_data, ops, num); }
> >>> +
> >>>  /* Enqueue encode operations for ACC100 device. */  static uint16_t
> >>> acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data, @@
> >>> -1967,7 +2657,51 @@  {
> >>>  	if (unlikely(num == 0))
> >>>  		return 0;
> >>> -	return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
> >>> +	if (ops[0]->ldpc_enc.code_block_mode == 0)
> >>> +		return acc100_enqueue_enc_tb(q_data, ops, num);
> >>> +	else
> >>> +		return acc100_enqueue_ldpc_enc_cb(q_data, ops, num); }
> >>> +
> >>> +
> >>> +/* Enqueue decode operations for ACC100 device in CB mode */ static
> >>> +uint16_t acc100_enqueue_dec_cb(struct rte_bbdev_queue_data
> >> *q_data,
> >>> +		struct rte_bbdev_dec_op **ops, uint16_t num) {
> >> Seems like the 10th variant of a similar function; could these be
> >> combined into fewer functions?
> >>
> >> Maybe by passing in a function pointer to the enqueue_one_dec_one*
> >> that does the work ?
> > They have some variants related to the actual operation and its constraints.
> > It is not obvious that a refactor would add value.
> >
> As above, nothing is functionally wrong; just something to consider.
> 
> ok.

I agree in principle, and this was done in a number of places (e.g. acc100_dma_fill_blk_type_out(), etc.).
I think we can look in parallel into that idea you suggested earlier of "common code" and future refactoring around it,
notably to support future generations of devices in that family with feature overlap, and hence avoid or limit code duplication across different PMDs moving forward.
More something for 2021. Thanks

> 
> Tom
> 
> >>> +	struct acc100_queue *q = q_data->queue_private;
> >>> +	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q-
> >>> sw_ring_head;
> >>> +	uint16_t i;
> >>> +	union acc100_dma_desc *desc;
> >>> +	int ret;
> >>> +
> >>> +	for (i = 0; i < num; ++i) {
> >>> +		/* Check if there is available space for further processing */
> >>> +		if (unlikely(avail - 1 < 0))
> >>> +			break;
> >>> +		avail -= 1;
> >>> +
> >>> +		ret = enqueue_dec_one_op_cb(q, ops[i], i);
> >>> +		if (ret < 0)
> >>> +			break;
> >>> +	}
> >>> +
> >>> +	if (unlikely(i == 0))
> >>> +		return 0; /* Nothing to enqueue */
> >>> +
> >>> +	/* Set SDone in last CB in enqueued ops for CB mode */
> >>> +	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
> >>> +			& q->sw_ring_wrap_mask);
> >>> +	desc->req.sdone_enable = 1;
> >>> +	desc->req.irq_enable = q->irq_enable;
> >>> +
> >>> +	acc100_dma_enqueue(q, i, &q_data->queue_stats);
> >>> +
> >>> +	/* Update stats */
> >>> +	q_data->queue_stats.enqueued_count += i;
> >>> +	q_data->queue_stats.enqueue_err_count += num - i;
> >>> +
> >>> +	return i;
> >>>  }
> >>>
> >>>  /* Check we can mux encode operations with common FCW */ @@ -
> >> 2065,6
> >>> +2799,53 @@
> >>>  	return i;
> >>>  }
> >>>
> >>> +
> >>> +/* Enqueue decode operations for ACC100 device in TB mode */ static
> >>> +uint16_t acc100_enqueue_dec_tb(struct rte_bbdev_queue_data
> >> *q_data,
> >>> +		struct rte_bbdev_dec_op **ops, uint16_t num) {
> >> 11th ;)
> >>> +	struct acc100_queue *q = q_data->queue_private;
> >>> +	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q-
> >>> sw_ring_head;
> >>> +	uint16_t i, enqueued_cbs = 0;
> >>> +	uint8_t cbs_in_tb;
> >>> +	int ret;
> >>> +
> >>> +	for (i = 0; i < num; ++i) {
> >>> +		cbs_in_tb = get_num_cbs_in_tb_dec(&ops[i]->turbo_dec);
> >>> +		/* Check if there is available space for further processing */
> >>> +		if (unlikely(avail - cbs_in_tb < 0))
> >>> +			break;
> >>> +		avail -= cbs_in_tb;
> >>> +
> >>> +		ret = enqueue_dec_one_op_tb(q, ops[i], enqueued_cbs,
> >> cbs_in_tb);
> >>> +		if (ret < 0)
> >>> +			break;
> >>> +		enqueued_cbs += ret;
> >>> +	}
> >>> +
> >>> +	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
> >>> +
> >>> +	/* Update stats */
> >>> +	q_data->queue_stats.enqueued_count += i;
> >>> +	q_data->queue_stats.enqueue_err_count += num - i;
> >>> +
> >>> +	return i;
> >>> +}
> >>> +
> >>> +/* Enqueue decode operations for ACC100 device. */ static uint16_t
> >>> +acc100_enqueue_dec(struct rte_bbdev_queue_data *q_data,
> >>> +		struct rte_bbdev_dec_op **ops, uint16_t num) {
> >>> +	if (unlikely(num == 0))
> >>> +		return 0;
> >> Similarly, move the num == 0 check to the tb/cb functions.
> > same comment
> >
> >>> +	if (ops[0]->turbo_dec.code_block_mode == 0)
> >>> +		return acc100_enqueue_dec_tb(q_data, ops, num);
> >>> +	else
> >>> +		return acc100_enqueue_dec_cb(q_data, ops, num); }
> >>> +
> >>>  /* Enqueue decode operations for ACC100 device. */  static uint16_t
> >>> acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data, @@
> >>> -2388,6 +3169,51 @@
> >>>  	return cb_idx;
> >>>  }
> >>>
> >>> +/* Dequeue encode operations from ACC100 device. */ static uint16_t
> >>> +acc100_dequeue_enc(struct rte_bbdev_queue_data *q_data,
> >>> +		struct rte_bbdev_enc_op **ops, uint16_t num) {
> >>> +	struct acc100_queue *q = q_data->queue_private;
> >>> +	uint16_t dequeue_num;
> >>> +	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
> >>> +	uint32_t aq_dequeued = 0;
> >>> +	uint16_t i;
> >>> +	uint16_t dequeued_cbs = 0;
> >>> +	struct rte_bbdev_enc_op *op;
> >>> +	int ret;
> >>> +
> >>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> >>> +	if (unlikely(ops == 0 && q == NULL))
> >> ops is a pointer, so it should be compared with NULL.
> >>
> >> The && likely needs to be ||.
> >>
> >> Maybe print out a message so the caller knows something went wrong.
> > ok
> >
> >>> +		return 0;
> >>> +#endif
> >>> +
> >>> +	dequeue_num = (avail < num) ? avail : num;
> >>> +
> >>> +	for (i = 0; i < dequeue_num; ++i) {
> >>> +		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> >>> +			& q->sw_ring_wrap_mask))->req.op_addr;
> >>> +		if (op->turbo_enc.code_block_mode == 0)
> >>> +			ret = dequeue_enc_one_op_tb(q, &ops[i],
> >> dequeued_cbs,
> >>> +					&aq_dequeued);
> >>> +		else
> >>> +			ret = dequeue_enc_one_op_cb(q, &ops[i],
> >> dequeued_cbs,
> >>> +					&aq_dequeued);
> >>> +
> >>> +		if (ret < 0)
> >>> +			break;
> >>> +		dequeued_cbs += ret;
> >>> +	}
> >>> +
> >>> +	q->aq_dequeued += aq_dequeued;
> >>> +	q->sw_ring_tail += dequeued_cbs;
> >>> +
> >>> +	/* Update dequeue stats */
> >>> +	q_data->queue_stats.dequeued_count += i;
> >>> +
> >>> +	return i;
> >>> +}
> >>> +
> >>>  /* Dequeue LDPC encode operations from ACC100 device. */  static
> >>> uint16_t  acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data
> >> *q_data,
> >>> @@ -2426,6 +3252,52 @@
> >>>  	return dequeued_cbs;
> >>>  }
> >>>
> >>> +
> >>> +/* Dequeue decode operations from ACC100 device. */ static uint16_t
> >>> +acc100_dequeue_dec(struct rte_bbdev_queue_data *q_data,
> >>> +		struct rte_bbdev_dec_op **ops, uint16_t num) {
> >> Very similar to the enc function above; consider how to combine them into a
> >> single function.
> >>
> >> Tom
> >>
> >>> +	struct acc100_queue *q = q_data->queue_private;
> >>> +	uint16_t dequeue_num;
> >>> +	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
> >>> +	uint32_t aq_dequeued = 0;
> >>> +	uint16_t i;
> >>> +	uint16_t dequeued_cbs = 0;
> >>> +	struct rte_bbdev_dec_op *op;
> >>> +	int ret;
> >>> +
> >>> +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> >>> +	if (unlikely(ops == 0 && q == NULL))
> >>> +		return 0;
> >>> +#endif
> >>> +
> >>> +	dequeue_num = (avail < num) ? avail : num;
> >>> +
> >>> +	for (i = 0; i < dequeue_num; ++i) {
> >>> +		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
> >>> +			& q->sw_ring_wrap_mask))->req.op_addr;
> >>> +		if (op->turbo_dec.code_block_mode == 0)
> >>> +			ret = dequeue_dec_one_op_tb(q, &ops[i],
> >> dequeued_cbs,
> >>> +					&aq_dequeued);
> >>> +		else
> >>> +			ret = dequeue_dec_one_op_cb(q_data, q, &ops[i],
> >>> +					dequeued_cbs, &aq_dequeued);
> >>> +
> >>> +		if (ret < 0)
> >>> +			break;
> >>> +		dequeued_cbs += ret;
> >>> +	}
> >>> +
> >>> +	q->aq_dequeued += aq_dequeued;
> >>> +	q->sw_ring_tail += dequeued_cbs;
> >>> +
> >>> +	/* Update dequeue stats */
> >>> +	q_data->queue_stats.dequeued_count += i;
> >>> +
> >>> +	return i;
> >>> +}
> >>> +
> >>>  /* Dequeue decode operations from ACC100 device. */  static
> >>> uint16_t acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data
> >>> *q_data, @@
> >>> -2479,6 +3351,10 @@
> >>>  	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
> >>>
> >>>  	dev->dev_ops = &acc100_bbdev_ops;
> >>> +	dev->enqueue_enc_ops = acc100_enqueue_enc;
> >>> +	dev->enqueue_dec_ops = acc100_enqueue_dec;
> >>> +	dev->dequeue_enc_ops = acc100_dequeue_enc;
> >>> +	dev->dequeue_dec_ops = acc100_dequeue_dec;
> >>>  	dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
> >>>  	dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
> >>>  	dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v11 00/10] bbdev PMD ACC100
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 11/11] doc: update bbdev feature table Nicolas Chautru
                     ` (6 preceding siblings ...)
  2020-10-01  3:14   ` [dpdk-dev] [PATCH v10 00/10] bbdev PMD ACC100 Nicolas Chautru
@ 2020-10-02  1:01   ` Nicolas Chautru
  2020-10-02  1:01     ` [dpdk-dev] [PATCH v11 01/10] drivers/baseband: add PMD for ACC100 Nicolas Chautru
                       ` (9 more replies)
  2020-10-05 22:12   ` [dpdk-dev] [PATCH v12 00/10] bbdev PMD ACC100 Nicolas Chautru
  8 siblings, 10 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-10-02  1:01 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, trix, maxime.coquelin, ferruh.yigit,
	tianjiao.liu, Nicolas Chautru

v11: Additional updates based on Tom's and Maxime's review comments on v9 and v10. Thanks
v10: Updates based on Tom Rix's valuable review comments. Notably doc clarification, #define name updates, fewer magic numbers left, stricter error handling, and a few useful coding suggestions. Thanks
v9: moved the release notes update to the last commit
v8: integrated the doc feature table in previous commit as suggested. 
v7: Finger trouble. Previous one was sent mid-rebase. My bad.
v6: removed a legacy makefile no longer required
v5: rebase based on latest on main. The legacy makefiles are removed. 
v4: An odd compilation error is reported for one CI variant using "gcc latest", which looks to me like a false positive of maybe-undeclared.
http://mails.dpdk.org/archives/test-report/2020-August/148936.html
Still forcing a dummy declaration to remove this CI warning; I will check with ci@dpdk.org in parallel.
v3: missed a change during rebase
v2: includes clean up from latest CI checks.


Nicolas Chautru (10):
  drivers/baseband: add PMD for ACC100
  baseband/acc100: add register definition file
  baseband/acc100: add info get function
  baseband/acc100: add queue configuration
  baseband/acc100: add LDPC processing functions
  baseband/acc100: add HARQ loopback support
  baseband/acc100: add support for 4G processing
  baseband/acc100: add interrupt support to PMD
  baseband/acc100: add debug function to validate input
  baseband/acc100: add configure function

 app/test-bbdev/meson.build                         |    3 +
 app/test-bbdev/test_bbdev_perf.c                   |   71 +
 doc/guides/bbdevs/acc100.rst                       |  228 +
 doc/guides/bbdevs/features/acc100.ini              |   14 +
 doc/guides/bbdevs/index.rst                        |    1 +
 doc/guides/rel_notes/release_20_11.rst             |    5 +
 drivers/baseband/acc100/acc100_pf_enum.h           | 1068 +++++
 drivers/baseband/acc100/acc100_vf_enum.h           |   73 +
 drivers/baseband/acc100/meson.build                |    8 +
 drivers/baseband/acc100/rte_acc100_cfg.h           |  113 +
 drivers/baseband/acc100/rte_acc100_pmd.c           | 4727 ++++++++++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.h           |  602 +++
 .../acc100/rte_pmd_bbdev_acc100_version.map        |   10 +
 drivers/baseband/meson.build                       |    2 +-
 14 files changed, 6924 insertions(+), 1 deletion(-)
 create mode 100644 doc/guides/bbdevs/acc100.rst
 create mode 100644 doc/guides/bbdevs/features/acc100.ini
 create mode 100644 drivers/baseband/acc100/acc100_pf_enum.h
 create mode 100644 drivers/baseband/acc100/acc100_vf_enum.h
 create mode 100644 drivers/baseband/acc100/meson.build
 create mode 100644 drivers/baseband/acc100/rte_acc100_cfg.h
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.c
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.h
 create mode 100644 drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v11 01/10] drivers/baseband: add PMD for ACC100
  2020-10-02  1:01   ` [dpdk-dev] [PATCH v11 00/10] bbdev PMD ACC100 Nicolas Chautru
@ 2020-10-02  1:01     ` Nicolas Chautru
  2020-10-04 15:53       ` Tom Rix
  2020-10-02  1:01     ` [dpdk-dev] [PATCH v11 02/10] baseband/acc100: add register definition file Nicolas Chautru
                       ` (8 subsequent siblings)
  9 siblings, 1 reply; 213+ messages in thread
From: Nicolas Chautru @ 2020-10-02  1:01 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, trix, maxime.coquelin, ferruh.yigit,
	tianjiao.liu, Nicolas Chautru

Add stubs for the ACC100 PMD

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 doc/guides/bbdevs/acc100.rst                       | 228 +++++++++++++++++++++
 doc/guides/bbdevs/features/acc100.ini              |  14 ++
 doc/guides/bbdevs/index.rst                        |   1 +
 drivers/baseband/acc100/meson.build                |   6 +
 drivers/baseband/acc100/rte_acc100_pmd.c           | 175 ++++++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.h           |  37 ++++
 .../acc100/rte_pmd_bbdev_acc100_version.map        |   3 +
 drivers/baseband/meson.build                       |   2 +-
 8 files changed, 465 insertions(+), 1 deletion(-)
 create mode 100644 doc/guides/bbdevs/acc100.rst
 create mode 100644 doc/guides/bbdevs/features/acc100.ini
 create mode 100644 drivers/baseband/acc100/meson.build
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.c
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.h
 create mode 100644 drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map

diff --git a/doc/guides/bbdevs/acc100.rst b/doc/guides/bbdevs/acc100.rst
new file mode 100644
index 0000000..d6d56ad
--- /dev/null
+++ b/doc/guides/bbdevs/acc100.rst
@@ -0,0 +1,228 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2020 Intel Corporation
+
+Intel(R) ACC100 5G/4G FEC Poll Mode Driver
+==========================================
+
+The BBDEV ACC100 5G/4G FEC poll mode driver (PMD) supports an
+implementation of a VRAN FEC wireless acceleration function.
+This device is also known as Mount Bryce.
+
+Features
+--------
+
+ACC100 5G/4G FEC PMD supports the following features:
+
+- LDPC Encode in the DL (5GNR)
+- LDPC Decode in the UL (5GNR)
+- Turbo Encode in the DL (4G)
+- Turbo Decode in the UL (4G)
+- 16 VFs per PF (physical device)
+- Maximum of 128 queues per VF
+- PCIe Gen-3 x16 Interface
+- MSI
+- SR-IOV
+
+ACC100 5G/4G FEC PMD supports the following BBDEV capabilities:
+
+* For the LDPC encode operation:
+   - ``RTE_BBDEV_LDPC_CRC_24B_ATTACH`` :  set to attach CRC24B to CB(s)
+   - ``RTE_BBDEV_LDPC_RATE_MATCH`` :  if set then do not do Rate Match bypass
+   - ``RTE_BBDEV_LDPC_INTERLEAVER_BYPASS`` : if set then bypass interleaver
+
+* For the LDPC decode operation:
+   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK`` :  check CRC24B from CB(s)
+   - ``RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE`` :  disable early termination
+   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP`` :  drops CRC24B bits appended while decoding
+   - ``RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE`` :  provides an input for HARQ combining
+   - ``RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE`` :  provides an output for HARQ combining
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE`` :  HARQ memory input is internal
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE`` :  HARQ memory output is internal
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK`` :  loopback data to/from HARQ memory
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_FILLERS`` :  HARQ memory includes the fillers bits
+   - ``RTE_BBDEV_LDPC_DEC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
+   - ``RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION`` :  supports compression of the HARQ input/output
+   - ``RTE_BBDEV_LDPC_LLR_COMPRESSION`` :  supports LLR input compression
+
+* For the turbo encode operation:
+   - ``RTE_BBDEV_TURBO_CRC_24B_ATTACH`` :  set to attach CRC24B to CB(s)
+   - ``RTE_BBDEV_TURBO_RATE_MATCH`` :  if set then do not do Rate Match bypass
+   - ``RTE_BBDEV_TURBO_ENC_INTERRUPTS`` :  set for encoder dequeue interrupts
+   - ``RTE_BBDEV_TURBO_RV_INDEX_BYPASS`` :  set to bypass RV index
+   - ``RTE_BBDEV_TURBO_ENC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
+
+* For the turbo decode operation:
+   - ``RTE_BBDEV_TURBO_CRC_TYPE_24B`` :  check CRC24B from CB(s)
+   - ``RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE`` :  perform subblock de-interleave
+   - ``RTE_BBDEV_TURBO_DEC_INTERRUPTS`` :  set for decoder dequeue interrupts
+   - ``RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN`` :  set if negative LLR input is supported
+   - ``RTE_BBDEV_TURBO_POS_LLR_1_BIT_IN`` :  set if positive LLR input is supported
+   - ``RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP`` :  keep CRC24B bits appended while decoding
+   - ``RTE_BBDEV_TURBO_EARLY_TERMINATION`` :  set the early termination feature
+   - ``RTE_BBDEV_TURBO_DEC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
+   - ``RTE_BBDEV_TURBO_HALF_ITERATION_EVEN`` :  set half iteration granularity
+
+Installation
+------------
+
+Section 3 of the DPDK manual provides instructions on installing and compiling DPDK. The
+default set of bbdev compile flags may be found in config/common_base, where for example
+the flag to build the ACC100 5G/4G FEC device, ``CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100``,
+is already set.
+
+DPDK requires hugepages to be configured as detailed in section 2 of the DPDK manual.
+The bbdev test application has been tested with a configuration of 40 x 1GB hugepages. The
+hugepage configuration of a server may be examined using:
+
+.. code-block:: console
+
+   grep Huge* /proc/meminfo
+
+
+Initialization
+--------------
+
+When the device first powers up, its PCI Physical Functions (PF) can be listed through this command:
+
+.. code-block:: console
+
+  sudo lspci -vd8086:0d5c
+
+The physical and virtual functions are compatible with the Linux UIO drivers
+``vfio`` and ``igb_uio``. However, in order to work, the ACC100 5G/4G
+FEC device first needs to be bound to one of these Linux drivers through DPDK.
+
+
+Bind PF UIO driver(s)
+~~~~~~~~~~~~~~~~~~~~~
+
+Install the DPDK igb_uio driver, bind it to the PF PCI device ID, and use
+``lspci`` to confirm that the PF device is in use by the ``igb_uio`` DPDK UIO driver.
+
+The igb_uio driver may be bound to the PF PCI device using one of three methods:
+
+
+1. PCI functions (physical or virtual, depending on the use case) can be bound to
+the UIO driver by repeating this command for every function:
+
+.. code-block:: console
+
+  cd <dpdk-top-level-directory>
+  insmod ./build/kmod/igb_uio.ko
+  echo "8086 0d5c" > /sys/bus/pci/drivers/igb_uio/new_id
+  lspci -vd8086:0d5c
+
+
+2. Another way to bind the PF to the DPDK UIO driver is to use the ``dpdk-devbind.py`` tool:
+
+.. code-block:: console
+
+  cd <dpdk-top-level-directory>
+  ./usertools/dpdk-devbind.py -b igb_uio 0000:06:00.0
+
+where the PCI device ID (example: 0000:06:00.0) is obtained using ``lspci -vd8086:0d5c``
+
+
+3. A third way to bind is to use the ``dpdk-setup.sh`` tool:
+
+.. code-block:: console
+
+  cd <dpdk-top-level-directory>
+  ./usertools/dpdk-setup.sh
+
+  select 'Bind Ethernet/Crypto/Baseband device to IGB UIO module'
+  enter PCI device ID
+  select 'Display current Ethernet/Crypto/Baseband device settings' to confirm binding
+
+In a similar way, the ACC100 5G/4G FEC PF may be bound with vfio-pci, as with any PCIe device.
+
+Enable Virtual Functions
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+The printout should now show that the PCI PF is under igb_uio control:
+"``Kernel driver in use: igb_uio``"
+
+To show the number of available VFs on the device, read the ``sriov_totalvfs`` file:
+
+.. code-block:: console
+
+  cat /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_totalvfs
+
+  where 0000\:<b>\:<d>.<f> is the PCI device ID
+
+
+To enable VFs via igb_uio, write the number of virtual functions intended to
+be enabled to the ``max_vfs`` file:
+
+.. code-block:: console
+
+  echo <num-of-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/max_vfs
+
+
+Afterwards, all VFs must be bound to the appropriate UIO drivers as required,
+in the same way as was done for the physical function above.
+
+Enabling SR-IOV via the vfio driver works in much the same way, except that the
+file name differs:
+
+.. code-block:: console
+
+  echo <num-of-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_numvfs
+
+
+Configure the VFs through PF
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The PCI virtual functions must be configured before use or before being assigned
+to VMs/containers. The configuration involves allocating the number of hardware
+queues, priorities, load balance, bandwidth and other settings necessary for the
+device to perform FEC functions.
+
+This configuration needs to be executed at least once after reboot or PCI FLR and can
+be achieved by using the function ``acc100_configure()``, which sets up the
+parameters defined in ``acc100_conf`` structure.
+
+Test Application
+----------------
+
+BBDEV provides a test application, ``test-bbdev.py``, and a range of test data for
+testing the functionality of ACC100 5G/4G FEC encode and decode, depending on the
+device's capabilities. The test application is located under the app/test-bbdev
+folder and has the following options:
+
+.. code-block:: console
+
+  "-p", "--testapp-path": specifies path to the bbdev test app.
+  "-e", "--eal-params"	: EAL arguments which are passed to the test app.
+  "-t", "--timeout"	: Timeout in seconds (default=300).
+  "-c", "--test-cases"	: Defines test cases to run. Run all if not specified.
+  "-v", "--test-vector"	: Test vector path (default=dpdk_path+/app/test-bbdev/test_vectors/bbdev_null.data).
+  "-n", "--num-ops"	: Number of operations to process on device (default=32).
+  "-b", "--burst-size"	: Operations enqueue/dequeue burst size (default=32).
+  "-s", "--snr"		: SNR in dB used when generating LLRs for bler tests.
+  "-s", "--iter_max"	: Number of iterations for LDPC decoder.
+  "-l", "--num-lcores"	: Number of lcores to run (default=16).
+  "-i", "--init-device" : Initialise PF device with default values.
+
+
+To execute the test application tool using simple decode or encode data,
+type one of the following:
+
+.. code-block:: console
+
+  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_dec_default.data
+  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_enc_default.data
+
+
+The test application ``test-bbdev.py`` supports configuring the PF device with
+a default set of values when the ``-i`` or ``--init-device`` option is included.
+The default values are defined in test_bbdev_perf.c.
+
+
+Test Vectors
+~~~~~~~~~~~~
+
+In addition to the simple LDPC decoder and LDPC encoder tests, bbdev also provides
+a range of additional tests under the test_vectors folder, which may be useful. The
+results of these tests will depend on the ACC100 5G/4G FEC capabilities, which may
+cause some test cases to be skipped, but no failure should be reported.
diff --git a/doc/guides/bbdevs/features/acc100.ini b/doc/guides/bbdevs/features/acc100.ini
new file mode 100644
index 0000000..c89a4d7
--- /dev/null
+++ b/doc/guides/bbdevs/features/acc100.ini
@@ -0,0 +1,14 @@
+;
+; Supported features of the 'acc100' bbdev driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+Turbo Decoder (4G)     = N
+Turbo Encoder (4G)     = N
+LDPC Decoder (5G)      = N
+LDPC Encoder (5G)      = N
+LLR/HARQ Compression   = N
+External DDR Access    = N
+HW Accelerated         = Y
+BBDEV API              = Y
diff --git a/doc/guides/bbdevs/index.rst b/doc/guides/bbdevs/index.rst
index a8092dd..4445cbd 100644
--- a/doc/guides/bbdevs/index.rst
+++ b/doc/guides/bbdevs/index.rst
@@ -13,3 +13,4 @@ Baseband Device Drivers
     turbo_sw
     fpga_lte_fec
     fpga_5gnr_fec
+    acc100
diff --git a/drivers/baseband/acc100/meson.build b/drivers/baseband/acc100/meson.build
new file mode 100644
index 0000000..8afafc2
--- /dev/null
+++ b/drivers/baseband/acc100/meson.build
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2020 Intel Corporation
+
+deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci']
+
+sources = files('rte_acc100_pmd.c')
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
new file mode 100644
index 0000000..1b4cd13
--- /dev/null
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -0,0 +1,175 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include <unistd.h>
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_dev.h>
+#include <rte_malloc.h>
+#include <rte_mempool.h>
+#include <rte_byteorder.h>
+#include <rte_errno.h>
+#include <rte_branch_prediction.h>
+#include <rte_hexdump.h>
+#include <rte_pci.h>
+#include <rte_bus_pci.h>
+
+#include <rte_bbdev.h>
+#include <rte_bbdev_pmd.h>
+#include "rte_acc100_pmd.h"
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, DEBUG);
+#else
+RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
+#endif
+
+/* Close device: nothing to free yet; software ring memory is added later */
+static int
+acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
+{
+	return 0;
+}
+
+static const struct rte_bbdev_ops acc100_bbdev_ops = {
+	.close = acc100_dev_close,
+};
+
+/* ACC100 PCI PF address map */
+static struct rte_pci_id pci_id_acc100_pf_map[] = {
+	{
+		RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID, RTE_ACC100_PF_DEVICE_ID)
+	},
+	{.device_id = 0},
+};
+
+/* ACC100 PCI VF address map */
+static struct rte_pci_id pci_id_acc100_vf_map[] = {
+	{
+		RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID, RTE_ACC100_VF_DEVICE_ID)
+	},
+	{.device_id = 0},
+};
+
+/* Initialization Function */
+static void
+acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
+{
+	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
+
+	dev->dev_ops = &acc100_bbdev_ops;
+
+	((struct acc100_device *) dev->data->dev_private)->pf_device =
+			!strcmp(drv->driver.name,
+					RTE_STR(ACC100PF_DRIVER_NAME));
+	((struct acc100_device *) dev->data->dev_private)->mmio_base =
+			pci_dev->mem_resource[0].addr;
+
+	rte_bbdev_log_debug("Init device %s [%s] @ vaddr %p paddr %#"PRIx64"",
+			drv->driver.name, dev->data->name,
+			(void *)pci_dev->mem_resource[0].addr,
+			pci_dev->mem_resource[0].phys_addr);
+}
+
+static int acc100_pci_probe(struct rte_pci_driver *pci_drv,
+	struct rte_pci_device *pci_dev)
+{
+	struct rte_bbdev *bbdev = NULL;
+	char dev_name[RTE_BBDEV_NAME_MAX_LEN];
+
+	if (pci_dev == NULL) {
+		rte_bbdev_log(ERR, "NULL PCI device");
+		return -EINVAL;
+	}
+
+	rte_pci_device_name(&pci_dev->addr, dev_name, sizeof(dev_name));
+
+	/* Allocate memory to be used privately by drivers */
+	bbdev = rte_bbdev_allocate(pci_dev->device.name);
+	if (bbdev == NULL)
+		return -ENODEV;
+
+	/* allocate device private memory */
+	bbdev->data->dev_private = rte_zmalloc_socket(dev_name,
+			sizeof(struct acc100_device), RTE_CACHE_LINE_SIZE,
+			pci_dev->device.numa_node);
+
+	if (bbdev->data->dev_private == NULL) {
+		rte_bbdev_log(CRIT,
+				"Allocate of %zu bytes for device \"%s\" failed",
+				sizeof(struct acc100_device), dev_name);
+		rte_bbdev_release(bbdev);
+		return -ENOMEM;
+	}
+
+	/* Fill HW specific part of device structure */
+	bbdev->device = &pci_dev->device;
+	bbdev->intr_handle = &pci_dev->intr_handle;
+	bbdev->data->socket_id = pci_dev->device.numa_node;
+
+	/* Invoke ACC100 device initialization function */
+	acc100_bbdev_init(bbdev, pci_drv);
+
+	rte_bbdev_log_debug("Initialised bbdev %s (id = %u)",
+			dev_name, bbdev->data->dev_id);
+	return 0;
+}
+
+static int acc100_pci_remove(struct rte_pci_device *pci_dev)
+{
+	struct rte_bbdev *bbdev;
+	int ret;
+	uint8_t dev_id;
+
+	if (pci_dev == NULL)
+		return -EINVAL;
+
+	/* Find device */
+	bbdev = rte_bbdev_get_named_dev(pci_dev->device.name);
+	if (bbdev == NULL) {
+		rte_bbdev_log(CRIT,
+				"Couldn't find HW dev \"%s\" to uninitialise it",
+				pci_dev->device.name);
+		return -ENODEV;
+	}
+	dev_id = bbdev->data->dev_id;
+
+	/* free device private memory before close */
+	rte_free(bbdev->data->dev_private);
+
+	/* Close device */
+	ret = rte_bbdev_close(dev_id);
+	if (ret < 0)
+		rte_bbdev_log(ERR,
+				"Device %i failed to close during uninit: %i",
+				dev_id, ret);
+
+	/* release bbdev from library */
+	rte_bbdev_release(bbdev);
+
+	rte_bbdev_log_debug("Destroyed bbdev = %u", dev_id);
+
+	return 0;
+}
+
+static struct rte_pci_driver acc100_pci_pf_driver = {
+		.probe = acc100_pci_probe,
+		.remove = acc100_pci_remove,
+		.id_table = pci_id_acc100_pf_map,
+		.drv_flags = RTE_PCI_DRV_NEED_MAPPING
+};
+
+static struct rte_pci_driver acc100_pci_vf_driver = {
+		.probe = acc100_pci_probe,
+		.remove = acc100_pci_remove,
+		.id_table = pci_id_acc100_vf_map,
+		.drv_flags = RTE_PCI_DRV_NEED_MAPPING
+};
+
+RTE_PMD_REGISTER_PCI(ACC100PF_DRIVER_NAME, acc100_pci_pf_driver);
+RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
+RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
+RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
+
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
new file mode 100644
index 0000000..6f46df0
--- /dev/null
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -0,0 +1,37 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_ACC100_PMD_H_
+#define _RTE_ACC100_PMD_H_
+
+/* Helper macro for logging */
+#define rte_bbdev_log(level, fmt, ...) \
+	rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \
+		##__VA_ARGS__)
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+#define rte_bbdev_log_debug(fmt, ...) \
+		rte_bbdev_log(DEBUG, "acc100_pmd: " fmt, \
+		##__VA_ARGS__)
+#else
+#define rte_bbdev_log_debug(fmt, ...)
+#endif
+
+/* ACC100 PF and VF driver names */
+#define ACC100PF_DRIVER_NAME           intel_acc100_pf
+#define ACC100VF_DRIVER_NAME           intel_acc100_vf
+
+/* ACC100 PCI vendor & device IDs */
+#define RTE_ACC100_VENDOR_ID           (0x8086)
+#define RTE_ACC100_PF_DEVICE_ID        (0x0d5c)
+#define RTE_ACC100_VF_DEVICE_ID        (0x0d5d)
+
+/* Private data structure for each ACC100 device */
+struct acc100_device {
+	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
+	bool pf_device; /**< True if this is a PF ACC100 device */
+	bool configured; /**< True if this ACC100 device is configured */
+};
+
+#endif /* _RTE_ACC100_PMD_H_ */
diff --git a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
new file mode 100644
index 0000000..4a76d1d
--- /dev/null
+++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
@@ -0,0 +1,3 @@
+DPDK_21 {
+	local: *;
+};
diff --git a/drivers/baseband/meson.build b/drivers/baseband/meson.build
index 415b672..72301ce 100644
--- a/drivers/baseband/meson.build
+++ b/drivers/baseband/meson.build
@@ -5,7 +5,7 @@ if is_windows
 	subdir_done()
 endif
 
-drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec']
+drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec', 'acc100']
 
 config_flag_fmt = 'RTE_LIBRTE_PMD_BBDEV_@0@'
 driver_name_fmt = 'rte_pmd_bbdev_@0@'
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v11 02/10] baseband/acc100: add register definition file
  2020-10-02  1:01   ` [dpdk-dev] [PATCH v11 00/10] bbdev PMD ACC100 Nicolas Chautru
  2020-10-02  1:01     ` [dpdk-dev] [PATCH v11 01/10] drivers/baseband: add PMD for ACC100 Nicolas Chautru
@ 2020-10-02  1:01     ` Nicolas Chautru
  2020-10-04 15:56       ` Tom Rix
  2020-10-02  1:01     ` [dpdk-dev] [PATCH v11 03/10] baseband/acc100: add info get function Nicolas Chautru
                       ` (7 subsequent siblings)
  9 siblings, 1 reply; 213+ messages in thread
From: Nicolas Chautru @ 2020-10-02  1:01 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, trix, maxime.coquelin, ferruh.yigit,
	tianjiao.liu, Nicolas Chautru

Add in the list of registers for the device and related
HW specs definitions.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 drivers/baseband/acc100/acc100_pf_enum.h | 1068 ++++++++++++++++++++++++++++++
 drivers/baseband/acc100/acc100_vf_enum.h |   73 ++
 drivers/baseband/acc100/rte_acc100_pmd.h |  487 ++++++++++++++
 3 files changed, 1628 insertions(+)
 create mode 100644 drivers/baseband/acc100/acc100_pf_enum.h
 create mode 100644 drivers/baseband/acc100/acc100_vf_enum.h

diff --git a/drivers/baseband/acc100/acc100_pf_enum.h b/drivers/baseband/acc100/acc100_pf_enum.h
new file mode 100644
index 0000000..a1ee416
--- /dev/null
+++ b/drivers/baseband/acc100/acc100_pf_enum.h
@@ -0,0 +1,1068 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017 Intel Corporation
+ */
+
+#ifndef ACC100_PF_ENUM_H
+#define ACC100_PF_ENUM_H
+
+/*
+ * ACC100 Register mapping on PF BAR0
+ * This is automatically generated from RDL, format may change with new RDL
+ * Release.
+ * Variable names are as is
+ */
+enum {
+	HWPfQmgrEgressQueuesTemplate          =  0x0007FE00,
+	HWPfQmgrIngressAq                     =  0x00080000,
+	HWPfQmgrArbQAvail                     =  0x00A00010,
+	HWPfQmgrArbQBlock                     =  0x00A00014,
+	HWPfQmgrAqueueDropNotifEn             =  0x00A00024,
+	HWPfQmgrAqueueDisableNotifEn          =  0x00A00028,
+	HWPfQmgrSoftReset                     =  0x00A00038,
+	HWPfQmgrInitStatus                    =  0x00A0003C,
+	HWPfQmgrAramWatchdogCount             =  0x00A00040,
+	HWPfQmgrAramWatchdogCounterEn         =  0x00A00044,
+	HWPfQmgrAxiWatchdogCount              =  0x00A00048,
+	HWPfQmgrAxiWatchdogCounterEn          =  0x00A0004C,
+	HWPfQmgrProcessWatchdogCount          =  0x00A00050,
+	HWPfQmgrProcessWatchdogCounterEn      =  0x00A00054,
+	HWPfQmgrProcessUl4GWatchdogCounter    =  0x00A00058,
+	HWPfQmgrProcessDl4GWatchdogCounter    =  0x00A0005C,
+	HWPfQmgrProcessUl5GWatchdogCounter    =  0x00A00060,
+	HWPfQmgrProcessDl5GWatchdogCounter    =  0x00A00064,
+	HWPfQmgrProcessMldWatchdogCounter     =  0x00A00068,
+	HWPfQmgrMsiOverflowUpperVf            =  0x00A00070,
+	HWPfQmgrMsiOverflowLowerVf            =  0x00A00074,
+	HWPfQmgrMsiWatchdogOverflow           =  0x00A00078,
+	HWPfQmgrMsiOverflowEnable             =  0x00A0007C,
+	HWPfQmgrDebugAqPointerMemGrp          =  0x00A00100,
+	HWPfQmgrDebugOutputArbQFifoGrp        =  0x00A00140,
+	HWPfQmgrDebugMsiFifoGrp               =  0x00A00180,
+	HWPfQmgrDebugAxiWdTimeoutMsiFifo      =  0x00A001C0,
+	HWPfQmgrDebugProcessWdTimeoutMsiFifo  =  0x00A001C4,
+	HWPfQmgrDepthLog2Grp                  =  0x00A00200,
+	HWPfQmgrTholdGrp                      =  0x00A00300,
+	HWPfQmgrGrpTmplateReg0Indx            =  0x00A00600,
+	HWPfQmgrGrpTmplateReg1Indx            =  0x00A00680,
+	HWPfQmgrGrpTmplateReg2indx            =  0x00A00700,
+	HWPfQmgrGrpTmplateReg3Indx            =  0x00A00780,
+	HWPfQmgrGrpTmplateReg4Indx            =  0x00A00800,
+	HWPfQmgrVfBaseAddr                    =  0x00A01000,
+	HWPfQmgrUl4GWeightRrVf                =  0x00A02000,
+	HWPfQmgrDl4GWeightRrVf                =  0x00A02100,
+	HWPfQmgrUl5GWeightRrVf                =  0x00A02200,
+	HWPfQmgrDl5GWeightRrVf                =  0x00A02300,
+	HWPfQmgrMldWeightRrVf                 =  0x00A02400,
+	HWPfQmgrArbQDepthGrp                  =  0x00A02F00,
+	HWPfQmgrGrpFunction0                  =  0x00A02F40,
+	HWPfQmgrGrpFunction1                  =  0x00A02F44,
+	HWPfQmgrGrpPriority                   =  0x00A02F48,
+	HWPfQmgrWeightSync                    =  0x00A03000,
+	HWPfQmgrAqEnableVf                    =  0x00A10000,
+	HWPfQmgrAqResetVf                     =  0x00A20000,
+	HWPfQmgrRingSizeVf                    =  0x00A20004,
+	HWPfQmgrGrpDepthLog20Vf               =  0x00A20008,
+	HWPfQmgrGrpDepthLog21Vf               =  0x00A2000C,
+	HWPfQmgrGrpFunction0Vf                =  0x00A20010,
+	HWPfQmgrGrpFunction1Vf                =  0x00A20014,
+	HWPfDmaConfig0Reg                     =  0x00B80000,
+	HWPfDmaConfig1Reg                     =  0x00B80004,
+	HWPfDmaQmgrAddrReg                    =  0x00B80008,
+	HWPfDmaSoftResetReg                   =  0x00B8000C,
+	HWPfDmaAxcacheReg                     =  0x00B80010,
+	HWPfDmaVersionReg                     =  0x00B80014,
+	HWPfDmaFrameThreshold                 =  0x00B80018,
+	HWPfDmaTimestampLo                    =  0x00B8001C,
+	HWPfDmaTimestampHi                    =  0x00B80020,
+	HWPfDmaAxiStatus                      =  0x00B80028,
+	HWPfDmaAxiControl                     =  0x00B8002C,
+	HWPfDmaNoQmgr                         =  0x00B80030,
+	HWPfDmaQosScale                       =  0x00B80034,
+	HWPfDmaQmanen                         =  0x00B80040,
+	HWPfDmaQmgrQosBase                    =  0x00B80060,
+	HWPfDmaFecClkGatingEnable             =  0x00B80080,
+	HWPfDmaPmEnable                       =  0x00B80084,
+	HWPfDmaQosEnable                      =  0x00B80088,
+	HWPfDmaHarqWeightedRrFrameThreshold   =  0x00B800B0,
+	HWPfDmaDataSmallWeightedRrFrameThresh  = 0x00B800B4,
+	HWPfDmaDataLargeWeightedRrFrameThresh  = 0x00B800B8,
+	HWPfDmaInboundCbMaxSize               =  0x00B800BC,
+	HWPfDmaInboundDrainDataSize           =  0x00B800C0,
+	HWPfDmaVfDdrBaseRw                    =  0x00B80400,
+	HWPfDmaCmplTmOutCnt                   =  0x00B80800,
+	HWPfDmaProcTmOutCnt                   =  0x00B80804,
+	HWPfDmaStatusRrespBresp               =  0x00B80810,
+	HWPfDmaCfgRrespBresp                  =  0x00B80814,
+	HWPfDmaStatusMemParErr                =  0x00B80818,
+	HWPfDmaCfgMemParErrEn                 =  0x00B8081C,
+	HWPfDmaStatusDmaHwErr                 =  0x00B80820,
+	HWPfDmaCfgDmaHwErrEn                  =  0x00B80824,
+	HWPfDmaStatusFecCoreErr               =  0x00B80828,
+	HWPfDmaCfgFecCoreErrEn                =  0x00B8082C,
+	HWPfDmaStatusFcwDescrErr              =  0x00B80830,
+	HWPfDmaCfgFcwDescrErrEn               =  0x00B80834,
+	HWPfDmaStatusBlockTransmit            =  0x00B80838,
+	HWPfDmaBlockOnErrEn                   =  0x00B8083C,
+	HWPfDmaStatusFlushDma                 =  0x00B80840,
+	HWPfDmaFlushDmaOnErrEn                =  0x00B80844,
+	HWPfDmaStatusSdoneFifoFull            =  0x00B80848,
+	HWPfDmaStatusDescriptorErrLoVf        =  0x00B8084C,
+	HWPfDmaStatusDescriptorErrHiVf        =  0x00B80850,
+	HWPfDmaStatusFcwErrLoVf               =  0x00B80854,
+	HWPfDmaStatusFcwErrHiVf               =  0x00B80858,
+	HWPfDmaStatusDataErrLoVf              =  0x00B8085C,
+	HWPfDmaStatusDataErrHiVf              =  0x00B80860,
+	HWPfDmaCfgMsiEnSoftwareErr            =  0x00B80864,
+	HWPfDmaDescriptorSignatuture          =  0x00B80868,
+	HWPfDmaFcwSignature                   =  0x00B8086C,
+	HWPfDmaErrorDetectionEn               =  0x00B80870,
+	HWPfDmaErrCntrlFifoDebug              =  0x00B8087C,
+	HWPfDmaStatusToutData                 =  0x00B80880,
+	HWPfDmaStatusToutDesc                 =  0x00B80884,
+	HWPfDmaStatusToutUnexpData            =  0x00B80888,
+	HWPfDmaStatusToutUnexpDesc            =  0x00B8088C,
+	HWPfDmaStatusToutProcess              =  0x00B80890,
+	HWPfDmaConfigCtoutOutDataEn           =  0x00B808A0,
+	HWPfDmaConfigCtoutOutDescrEn          =  0x00B808A4,
+	HWPfDmaConfigUnexpComplDataEn         =  0x00B808A8,
+	HWPfDmaConfigUnexpComplDescrEn        =  0x00B808AC,
+	HWPfDmaConfigPtoutOutEn               =  0x00B808B0,
+	HWPfDmaFec5GulDescBaseLoRegVf         =  0x00B88020,
+	HWPfDmaFec5GulDescBaseHiRegVf         =  0x00B88024,
+	HWPfDmaFec5GulRespPtrLoRegVf          =  0x00B88028,
+	HWPfDmaFec5GulRespPtrHiRegVf          =  0x00B8802C,
+	HWPfDmaFec5GdlDescBaseLoRegVf         =  0x00B88040,
+	HWPfDmaFec5GdlDescBaseHiRegVf         =  0x00B88044,
+	HWPfDmaFec5GdlRespPtrLoRegVf          =  0x00B88048,
+	HWPfDmaFec5GdlRespPtrHiRegVf          =  0x00B8804C,
+	HWPfDmaFec4GulDescBaseLoRegVf         =  0x00B88060,
+	HWPfDmaFec4GulDescBaseHiRegVf         =  0x00B88064,
+	HWPfDmaFec4GulRespPtrLoRegVf          =  0x00B88068,
+	HWPfDmaFec4GulRespPtrHiRegVf          =  0x00B8806C,
+	HWPfDmaFec4GdlDescBaseLoRegVf         =  0x00B88080,
+	HWPfDmaFec4GdlDescBaseHiRegVf         =  0x00B88084,
+	HWPfDmaFec4GdlRespPtrLoRegVf          =  0x00B88088,
+	HWPfDmaFec4GdlRespPtrHiRegVf          =  0x00B8808C,
+	HWPfDmaVfDdrBaseRangeRo               =  0x00B880A0,
+	HWPfQosmonACntrlReg                   =  0x00B90000,
+	HWPfQosmonAEvalOverflow0              =  0x00B90008,
+	HWPfQosmonAEvalOverflow1              =  0x00B9000C,
+	HWPfQosmonADivTerm                    =  0x00B90010,
+	HWPfQosmonATickTerm                   =  0x00B90014,
+	HWPfQosmonAEvalTerm                   =  0x00B90018,
+	HWPfQosmonAAveTerm                    =  0x00B9001C,
+	HWPfQosmonAForceEccErr                =  0x00B90020,
+	HWPfQosmonAEccErrDetect               =  0x00B90024,
+	HWPfQosmonAIterationConfig0Low        =  0x00B90060,
+	HWPfQosmonAIterationConfig0High       =  0x00B90064,
+	HWPfQosmonAIterationConfig1Low        =  0x00B90068,
+	HWPfQosmonAIterationConfig1High       =  0x00B9006C,
+	HWPfQosmonAIterationConfig2Low        =  0x00B90070,
+	HWPfQosmonAIterationConfig2High       =  0x00B90074,
+	HWPfQosmonAIterationConfig3Low        =  0x00B90078,
+	HWPfQosmonAIterationConfig3High       =  0x00B9007C,
+	HWPfQosmonAEvalMemAddr                =  0x00B90080,
+	HWPfQosmonAEvalMemData                =  0x00B90084,
+	HWPfQosmonAXaction                    =  0x00B900C0,
+	HWPfQosmonARemThres1Vf                =  0x00B90400,
+	HWPfQosmonAThres2Vf                   =  0x00B90404,
+	HWPfQosmonAWeiFracVf                  =  0x00B90408,
+	HWPfQosmonARrWeiVf                    =  0x00B9040C,
+	HWPfPermonACntrlRegVf                 =  0x00B98000,
+	HWPfPermonACountVf                    =  0x00B98008,
+	HWPfPermonAKCntLoVf                   =  0x00B98010,
+	HWPfPermonAKCntHiVf                   =  0x00B98014,
+	HWPfPermonADeltaCntLoVf               =  0x00B98020,
+	HWPfPermonADeltaCntHiVf               =  0x00B98024,
+	HWPfPermonAVersionReg                 =  0x00B9C000,
+	HWPfPermonACbControlFec               =  0x00B9C0F0,
+	HWPfPermonADltTimerLoFec              =  0x00B9C0F4,
+	HWPfPermonADltTimerHiFec              =  0x00B9C0F8,
+	HWPfPermonACbCountFec                 =  0x00B9C100,
+	HWPfPermonAAccExecTimerLoFec          =  0x00B9C104,
+	HWPfPermonAAccExecTimerHiFec          =  0x00B9C108,
+	HWPfPermonAExecTimerMinFec            =  0x00B9C200,
+	HWPfPermonAExecTimerMaxFec            =  0x00B9C204,
+	HWPfPermonAControlBusMon              =  0x00B9C400,
+	HWPfPermonAConfigBusMon               =  0x00B9C404,
+	HWPfPermonASkipCountBusMon            =  0x00B9C408,
+	HWPfPermonAMinLatBusMon               =  0x00B9C40C,
+	HWPfPermonAMaxLatBusMon               =  0x00B9C500,
+	HWPfPermonATotalLatLowBusMon          =  0x00B9C504,
+	HWPfPermonATotalLatUpperBusMon        =  0x00B9C508,
+	HWPfPermonATotalReqCntBusMon          =  0x00B9C50C,
+	HWPfQosmonBCntrlReg                   =  0x00BA0000,
+	HWPfQosmonBEvalOverflow0              =  0x00BA0008,
+	HWPfQosmonBEvalOverflow1              =  0x00BA000C,
+	HWPfQosmonBDivTerm                    =  0x00BA0010,
+	HWPfQosmonBTickTerm                   =  0x00BA0014,
+	HWPfQosmonBEvalTerm                   =  0x00BA0018,
+	HWPfQosmonBAveTerm                    =  0x00BA001C,
+	HWPfQosmonBForceEccErr                =  0x00BA0020,
+	HWPfQosmonBEccErrDetect               =  0x00BA0024,
+	HWPfQosmonBIterationConfig0Low        =  0x00BA0060,
+	HWPfQosmonBIterationConfig0High       =  0x00BA0064,
+	HWPfQosmonBIterationConfig1Low        =  0x00BA0068,
+	HWPfQosmonBIterationConfig1High       =  0x00BA006C,
+	HWPfQosmonBIterationConfig2Low        =  0x00BA0070,
+	HWPfQosmonBIterationConfig2High       =  0x00BA0074,
+	HWPfQosmonBIterationConfig3Low        =  0x00BA0078,
+	HWPfQosmonBIterationConfig3High       =  0x00BA007C,
+	HWPfQosmonBEvalMemAddr                =  0x00BA0080,
+	HWPfQosmonBEvalMemData                =  0x00BA0084,
+	HWPfQosmonBXaction                    =  0x00BA00C0,
+	HWPfQosmonBRemThres1Vf                =  0x00BA0400,
+	HWPfQosmonBThres2Vf                   =  0x00BA0404,
+	HWPfQosmonBWeiFracVf                  =  0x00BA0408,
+	HWPfQosmonBRrWeiVf                    =  0x00BA040C,
+	HWPfPermonBCntrlRegVf                 =  0x00BA8000,
+	HWPfPermonBCountVf                    =  0x00BA8008,
+	HWPfPermonBKCntLoVf                   =  0x00BA8010,
+	HWPfPermonBKCntHiVf                   =  0x00BA8014,
+	HWPfPermonBDeltaCntLoVf               =  0x00BA8020,
+	HWPfPermonBDeltaCntHiVf               =  0x00BA8024,
+	HWPfPermonBVersionReg                 =  0x00BAC000,
+	HWPfPermonBCbControlFec               =  0x00BAC0F0,
+	HWPfPermonBDltTimerLoFec              =  0x00BAC0F4,
+	HWPfPermonBDltTimerHiFec              =  0x00BAC0F8,
+	HWPfPermonBCbCountFec                 =  0x00BAC100,
+	HWPfPermonBAccExecTimerLoFec          =  0x00BAC104,
+	HWPfPermonBAccExecTimerHiFec          =  0x00BAC108,
+	HWPfPermonBExecTimerMinFec            =  0x00BAC200,
+	HWPfPermonBExecTimerMaxFec            =  0x00BAC204,
+	HWPfPermonBControlBusMon              =  0x00BAC400,
+	HWPfPermonBConfigBusMon               =  0x00BAC404,
+	HWPfPermonBSkipCountBusMon            =  0x00BAC408,
+	HWPfPermonBMinLatBusMon               =  0x00BAC40C,
+	HWPfPermonBMaxLatBusMon               =  0x00BAC500,
+	HWPfPermonBTotalLatLowBusMon          =  0x00BAC504,
+	HWPfPermonBTotalLatUpperBusMon        =  0x00BAC508,
+	HWPfPermonBTotalReqCntBusMon          =  0x00BAC50C,
+	HWPfFecUl5gCntrlReg                   =  0x00BC0000,
+	HWPfFecUl5gI2MThreshReg               =  0x00BC0004,
+	HWPfFecUl5gVersionReg                 =  0x00BC0100,
+	HWPfFecUl5gFcwStatusReg               =  0x00BC0104,
+	HWPfFecUl5gWarnReg                    =  0x00BC0108,
+	HwPfFecUl5gIbDebugReg                 =  0x00BC0200,
+	HwPfFecUl5gObLlrDebugReg              =  0x00BC0204,
+	HwPfFecUl5gObHarqDebugReg             =  0x00BC0208,
+	HwPfFecUl5g1CntrlReg                  =  0x00BC1000,
+	HwPfFecUl5g1I2MThreshReg              =  0x00BC1004,
+	HwPfFecUl5g1VersionReg                =  0x00BC1100,
+	HwPfFecUl5g1FcwStatusReg              =  0x00BC1104,
+	HwPfFecUl5g1WarnReg                   =  0x00BC1108,
+	HwPfFecUl5g1IbDebugReg                =  0x00BC1200,
+	HwPfFecUl5g1ObLlrDebugReg             =  0x00BC1204,
+	HwPfFecUl5g1ObHarqDebugReg            =  0x00BC1208,
+	HwPfFecUl5g2CntrlReg                  =  0x00BC2000,
+	HwPfFecUl5g2I2MThreshReg              =  0x00BC2004,
+	HwPfFecUl5g2VersionReg                =  0x00BC2100,
+	HwPfFecUl5g2FcwStatusReg              =  0x00BC2104,
+	HwPfFecUl5g2WarnReg                   =  0x00BC2108,
+	HwPfFecUl5g2IbDebugReg                =  0x00BC2200,
+	HwPfFecUl5g2ObLlrDebugReg             =  0x00BC2204,
+	HwPfFecUl5g2ObHarqDebugReg            =  0x00BC2208,
+	HwPfFecUl5g3CntrlReg                  =  0x00BC3000,
+	HwPfFecUl5g3I2MThreshReg              =  0x00BC3004,
+	HwPfFecUl5g3VersionReg                =  0x00BC3100,
+	HwPfFecUl5g3FcwStatusReg              =  0x00BC3104,
+	HwPfFecUl5g3WarnReg                   =  0x00BC3108,
+	HwPfFecUl5g3IbDebugReg                =  0x00BC3200,
+	HwPfFecUl5g3ObLlrDebugReg             =  0x00BC3204,
+	HwPfFecUl5g3ObHarqDebugReg            =  0x00BC3208,
+	HwPfFecUl5g4CntrlReg                  =  0x00BC4000,
+	HwPfFecUl5g4I2MThreshReg              =  0x00BC4004,
+	HwPfFecUl5g4VersionReg                =  0x00BC4100,
+	HwPfFecUl5g4FcwStatusReg              =  0x00BC4104,
+	HwPfFecUl5g4WarnReg                   =  0x00BC4108,
+	HwPfFecUl5g4IbDebugReg                =  0x00BC4200,
+	HwPfFecUl5g4ObLlrDebugReg             =  0x00BC4204,
+	HwPfFecUl5g4ObHarqDebugReg            =  0x00BC4208,
+	HwPfFecUl5g5CntrlReg                  =  0x00BC5000,
+	HwPfFecUl5g5I2MThreshReg              =  0x00BC5004,
+	HwPfFecUl5g5VersionReg                =  0x00BC5100,
+	HwPfFecUl5g5FcwStatusReg              =  0x00BC5104,
+	HwPfFecUl5g5WarnReg                   =  0x00BC5108,
+	HwPfFecUl5g5IbDebugReg                =  0x00BC5200,
+	HwPfFecUl5g5ObLlrDebugReg             =  0x00BC5204,
+	HwPfFecUl5g5ObHarqDebugReg            =  0x00BC5208,
+	HwPfFecUl5g6CntrlReg                  =  0x00BC6000,
+	HwPfFecUl5g6I2MThreshReg              =  0x00BC6004,
+	HwPfFecUl5g6VersionReg                =  0x00BC6100,
+	HwPfFecUl5g6FcwStatusReg              =  0x00BC6104,
+	HwPfFecUl5g6WarnReg                   =  0x00BC6108,
+	HwPfFecUl5g6IbDebugReg                =  0x00BC6200,
+	HwPfFecUl5g6ObLlrDebugReg             =  0x00BC6204,
+	HwPfFecUl5g6ObHarqDebugReg            =  0x00BC6208,
+	HwPfFecUl5g7CntrlReg                  =  0x00BC7000,
+	HwPfFecUl5g7I2MThreshReg              =  0x00BC7004,
+	HwPfFecUl5g7VersionReg                =  0x00BC7100,
+	HwPfFecUl5g7FcwStatusReg              =  0x00BC7104,
+	HwPfFecUl5g7WarnReg                   =  0x00BC7108,
+	HwPfFecUl5g7IbDebugReg                =  0x00BC7200,
+	HwPfFecUl5g7ObLlrDebugReg             =  0x00BC7204,
+	HwPfFecUl5g7ObHarqDebugReg            =  0x00BC7208,
+	HwPfFecUl5g8CntrlReg                  =  0x00BC8000,
+	HwPfFecUl5g8I2MThreshReg              =  0x00BC8004,
+	HwPfFecUl5g8VersionReg                =  0x00BC8100,
+	HwPfFecUl5g8FcwStatusReg              =  0x00BC8104,
+	HwPfFecUl5g8WarnReg                   =  0x00BC8108,
+	HwPfFecUl5g8IbDebugReg                =  0x00BC8200,
+	HwPfFecUl5g8ObLlrDebugReg             =  0x00BC8204,
+	HwPfFecUl5g8ObHarqDebugReg            =  0x00BC8208,
+	HWPfFecDl5gCntrlReg                   =  0x00BCF000,
+	HWPfFecDl5gI2MThreshReg               =  0x00BCF004,
+	HWPfFecDl5gVersionReg                 =  0x00BCF100,
+	HWPfFecDl5gFcwStatusReg               =  0x00BCF104,
+	HWPfFecDl5gWarnReg                    =  0x00BCF108,
+	HWPfFecUlVersionReg                   =  0x00BD0000,
+	HWPfFecUlControlReg                   =  0x00BD0004,
+	HWPfFecUlStatusReg                    =  0x00BD0008,
+	HWPfFecDlVersionReg                   =  0x00BDF000,
+	HWPfFecDlClusterConfigReg             =  0x00BDF004,
+	HWPfFecDlBurstThres                   =  0x00BDF00C,
+	HWPfFecDlClusterStatusReg0            =  0x00BDF040,
+	HWPfFecDlClusterStatusReg1            =  0x00BDF044,
+	HWPfFecDlClusterStatusReg2            =  0x00BDF048,
+	HWPfFecDlClusterStatusReg3            =  0x00BDF04C,
+	HWPfFecDlClusterStatusReg4            =  0x00BDF050,
+	HWPfFecDlClusterStatusReg5            =  0x00BDF054,
+	HWPfChaFabPllPllrst                   =  0x00C40000,
+	HWPfChaFabPllClk0                     =  0x00C40004,
+	HWPfChaFabPllClk1                     =  0x00C40008,
+	HWPfChaFabPllBwadj                    =  0x00C4000C,
+	HWPfChaFabPllLbw                      =  0x00C40010,
+	HWPfChaFabPllResetq                   =  0x00C40014,
+	HWPfChaFabPllPhshft0                  =  0x00C40018,
+	HWPfChaFabPllPhshft1                  =  0x00C4001C,
+	HWPfChaFabPllDivq0                    =  0x00C40020,
+	HWPfChaFabPllDivq1                    =  0x00C40024,
+	HWPfChaFabPllDivq2                    =  0x00C40028,
+	HWPfChaFabPllDivq3                    =  0x00C4002C,
+	HWPfChaFabPllDivq4                    =  0x00C40030,
+	HWPfChaFabPllDivq5                    =  0x00C40034,
+	HWPfChaFabPllDivq6                    =  0x00C40038,
+	HWPfChaFabPllDivq7                    =  0x00C4003C,
+	HWPfChaDl5gPllPllrst                  =  0x00C40080,
+	HWPfChaDl5gPllClk0                    =  0x00C40084,
+	HWPfChaDl5gPllClk1                    =  0x00C40088,
+	HWPfChaDl5gPllBwadj                   =  0x00C4008C,
+	HWPfChaDl5gPllLbw                     =  0x00C40090,
+	HWPfChaDl5gPllResetq                  =  0x00C40094,
+	HWPfChaDl5gPllPhshft0                 =  0x00C40098,
+	HWPfChaDl5gPllPhshft1                 =  0x00C4009C,
+	HWPfChaDl5gPllDivq0                   =  0x00C400A0,
+	HWPfChaDl5gPllDivq1                   =  0x00C400A4,
+	HWPfChaDl5gPllDivq2                   =  0x00C400A8,
+	HWPfChaDl5gPllDivq3                   =  0x00C400AC,
+	HWPfChaDl5gPllDivq4                   =  0x00C400B0,
+	HWPfChaDl5gPllDivq5                   =  0x00C400B4,
+	HWPfChaDl5gPllDivq6                   =  0x00C400B8,
+	HWPfChaDl5gPllDivq7                   =  0x00C400BC,
+	HWPfChaDl4gPllPllrst                  =  0x00C40100,
+	HWPfChaDl4gPllClk0                    =  0x00C40104,
+	HWPfChaDl4gPllClk1                    =  0x00C40108,
+	HWPfChaDl4gPllBwadj                   =  0x00C4010C,
+	HWPfChaDl4gPllLbw                     =  0x00C40110,
+	HWPfChaDl4gPllResetq                  =  0x00C40114,
+	HWPfChaDl4gPllPhshft0                 =  0x00C40118,
+	HWPfChaDl4gPllPhshft1                 =  0x00C4011C,
+	HWPfChaDl4gPllDivq0                   =  0x00C40120,
+	HWPfChaDl4gPllDivq1                   =  0x00C40124,
+	HWPfChaDl4gPllDivq2                   =  0x00C40128,
+	HWPfChaDl4gPllDivq3                   =  0x00C4012C,
+	HWPfChaDl4gPllDivq4                   =  0x00C40130,
+	HWPfChaDl4gPllDivq5                   =  0x00C40134,
+	HWPfChaDl4gPllDivq6                   =  0x00C40138,
+	HWPfChaDl4gPllDivq7                   =  0x00C4013C,
+	HWPfChaUl5gPllPllrst                  =  0x00C40180,
+	HWPfChaUl5gPllClk0                    =  0x00C40184,
+	HWPfChaUl5gPllClk1                    =  0x00C40188,
+	HWPfChaUl5gPllBwadj                   =  0x00C4018C,
+	HWPfChaUl5gPllLbw                     =  0x00C40190,
+	HWPfChaUl5gPllResetq                  =  0x00C40194,
+	HWPfChaUl5gPllPhshft0                 =  0x00C40198,
+	HWPfChaUl5gPllPhshft1                 =  0x00C4019C,
+	HWPfChaUl5gPllDivq0                   =  0x00C401A0,
+	HWPfChaUl5gPllDivq1                   =  0x00C401A4,
+	HWPfChaUl5gPllDivq2                   =  0x00C401A8,
+	HWPfChaUl5gPllDivq3                   =  0x00C401AC,
+	HWPfChaUl5gPllDivq4                   =  0x00C401B0,
+	HWPfChaUl5gPllDivq5                   =  0x00C401B4,
+	HWPfChaUl5gPllDivq6                   =  0x00C401B8,
+	HWPfChaUl5gPllDivq7                   =  0x00C401BC,
+	HWPfChaUl4gPllPllrst                  =  0x00C40200,
+	HWPfChaUl4gPllClk0                    =  0x00C40204,
+	HWPfChaUl4gPllClk1                    =  0x00C40208,
+	HWPfChaUl4gPllBwadj                   =  0x00C4020C,
+	HWPfChaUl4gPllLbw                     =  0x00C40210,
+	HWPfChaUl4gPllResetq                  =  0x00C40214,
+	HWPfChaUl4gPllPhshft0                 =  0x00C40218,
+	HWPfChaUl4gPllPhshft1                 =  0x00C4021C,
+	HWPfChaUl4gPllDivq0                   =  0x00C40220,
+	HWPfChaUl4gPllDivq1                   =  0x00C40224,
+	HWPfChaUl4gPllDivq2                   =  0x00C40228,
+	HWPfChaUl4gPllDivq3                   =  0x00C4022C,
+	HWPfChaUl4gPllDivq4                   =  0x00C40230,
+	HWPfChaUl4gPllDivq5                   =  0x00C40234,
+	HWPfChaUl4gPllDivq6                   =  0x00C40238,
+	HWPfChaUl4gPllDivq7                   =  0x00C4023C,
+	HWPfChaDdrPllPllrst                   =  0x00C40280,
+	HWPfChaDdrPllClk0                     =  0x00C40284,
+	HWPfChaDdrPllClk1                     =  0x00C40288,
+	HWPfChaDdrPllBwadj                    =  0x00C4028C,
+	HWPfChaDdrPllLbw                      =  0x00C40290,
+	HWPfChaDdrPllResetq                   =  0x00C40294,
+	HWPfChaDdrPllPhshft0                  =  0x00C40298,
+	HWPfChaDdrPllPhshft1                  =  0x00C4029C,
+	HWPfChaDdrPllDivq0                    =  0x00C402A0,
+	HWPfChaDdrPllDivq1                    =  0x00C402A4,
+	HWPfChaDdrPllDivq2                    =  0x00C402A8,
+	HWPfChaDdrPllDivq3                    =  0x00C402AC,
+	HWPfChaDdrPllDivq4                    =  0x00C402B0,
+	HWPfChaDdrPllDivq5                    =  0x00C402B4,
+	HWPfChaDdrPllDivq6                    =  0x00C402B8,
+	HWPfChaDdrPllDivq7                    =  0x00C402BC,
+	HWPfChaErrStatus                      =  0x00C40400,
+	HWPfChaErrMask                        =  0x00C40404,
+	HWPfChaDebugPcieMsiFifo               =  0x00C40410,
+	HWPfChaDebugDdrMsiFifo                =  0x00C40414,
+	HWPfChaDebugMiscMsiFifo               =  0x00C40418,
+	HWPfChaPwmSet                         =  0x00C40420,
+	HWPfChaDdrRstStatus                   =  0x00C40430,
+	HWPfChaDdrStDoneStatus                =  0x00C40434,
+	HWPfChaDdrWbRstCfg                    =  0x00C40438,
+	HWPfChaDdrApbRstCfg                   =  0x00C4043C,
+	HWPfChaDdrPhyRstCfg                   =  0x00C40440,
+	HWPfChaDdrCpuRstCfg                   =  0x00C40444,
+	HWPfChaDdrSifRstCfg                   =  0x00C40448,
+	HWPfChaPadcfgPcomp0                   =  0x00C41000,
+	HWPfChaPadcfgNcomp0                   =  0x00C41004,
+	HWPfChaPadcfgOdt0                     =  0x00C41008,
+	HWPfChaPadcfgProtect0                 =  0x00C4100C,
+	HWPfChaPreemphasisProtect0            =  0x00C41010,
+	HWPfChaPreemphasisCompen0             =  0x00C41040,
+	HWPfChaPreemphasisOdten0              =  0x00C41044,
+	HWPfChaPadcfgPcomp1                   =  0x00C41100,
+	HWPfChaPadcfgNcomp1                   =  0x00C41104,
+	HWPfChaPadcfgOdt1                     =  0x00C41108,
+	HWPfChaPadcfgProtect1                 =  0x00C4110C,
+	HWPfChaPreemphasisProtect1            =  0x00C41110,
+	HWPfChaPreemphasisCompen1             =  0x00C41140,
+	HWPfChaPreemphasisOdten1              =  0x00C41144,
+	HWPfChaPadcfgPcomp2                   =  0x00C41200,
+	HWPfChaPadcfgNcomp2                   =  0x00C41204,
+	HWPfChaPadcfgOdt2                     =  0x00C41208,
+	HWPfChaPadcfgProtect2                 =  0x00C4120C,
+	HWPfChaPreemphasisProtect2            =  0x00C41210,
+	HWPfChaPreemphasisCompen2             =  0x00C41240,
+	HWPfChaPreemphasisOdten2              =  0x00C41244,
+	HWPfChaPadcfgPcomp3                   =  0x00C41300,
+	HWPfChaPadcfgNcomp3                   =  0x00C41304,
+	HWPfChaPadcfgOdt3                     =  0x00C41308,
+	HWPfChaPadcfgProtect3                 =  0x00C4130C,
+	HWPfChaPreemphasisProtect3            =  0x00C41310,
+	HWPfChaPreemphasisCompen3             =  0x00C41340,
+	HWPfChaPreemphasisOdten3              =  0x00C41344,
+	HWPfChaPadcfgPcomp4                   =  0x00C41400,
+	HWPfChaPadcfgNcomp4                   =  0x00C41404,
+	HWPfChaPadcfgOdt4                     =  0x00C41408,
+	HWPfChaPadcfgProtect4                 =  0x00C4140C,
+	HWPfChaPreemphasisProtect4            =  0x00C41410,
+	HWPfChaPreemphasisCompen4             =  0x00C41440,
+	HWPfChaPreemphasisOdten4              =  0x00C41444,
+	HWPfHiVfToPfDbellVf                   =  0x00C80000,
+	HWPfHiPfToVfDbellVf                   =  0x00C80008,
+	HWPfHiInfoRingBaseLoVf                =  0x00C80010,
+	HWPfHiInfoRingBaseHiVf                =  0x00C80014,
+	HWPfHiInfoRingPointerVf               =  0x00C80018,
+	HWPfHiInfoRingIntWrEnVf               =  0x00C80020,
+	HWPfHiInfoRingPf2VfWrEnVf             =  0x00C80024,
+	HWPfHiMsixVectorMapperVf              =  0x00C80060,
+	HWPfHiModuleVersionReg                =  0x00C84000,
+	HWPfHiIosf2axiErrLogReg               =  0x00C84004,
+	HWPfHiHardResetReg                    =  0x00C84008,
+	HWPfHi5GHardResetReg                  =  0x00C8400C,
+	HWPfHiInfoRingBaseLoRegPf             =  0x00C84010,
+	HWPfHiInfoRingBaseHiRegPf             =  0x00C84014,
+	HWPfHiInfoRingPointerRegPf            =  0x00C84018,
+	HWPfHiInfoRingIntWrEnRegPf            =  0x00C84020,
+	HWPfHiInfoRingVf2pfLoWrEnReg          =  0x00C84024,
+	HWPfHiInfoRingVf2pfHiWrEnReg          =  0x00C84028,
+	HWPfHiLogParityErrStatusReg           =  0x00C8402C,
+	HWPfHiLogDataParityErrorVfStatusLo    =  0x00C84030,
+	HWPfHiLogDataParityErrorVfStatusHi    =  0x00C84034,
+	HWPfHiBlockTransmitOnErrorEn          =  0x00C84038,
+	HWPfHiCfgMsiIntWrEnRegPf              =  0x00C84040,
+	HWPfHiCfgMsiVf2pfLoWrEnReg            =  0x00C84044,
+	HWPfHiCfgMsiVf2pfHighWrEnReg          =  0x00C84048,
+	HWPfHiMsixVectorMapperPf              =  0x00C84060,
+	HWPfHiApbWrWaitTime                   =  0x00C84100,
+	HWPfHiXCounterMaxValue                =  0x00C84104,
+	HWPfHiPfMode                          =  0x00C84108,
+	HWPfHiClkGateHystReg                  =  0x00C8410C,
+	HWPfHiSnoopBitsReg                    =  0x00C84110,
+	HWPfHiMsiDropEnableReg                =  0x00C84114,
+	HWPfHiMsiStatReg                      =  0x00C84120,
+	HWPfHiFifoOflStatReg                  =  0x00C84124,
+	HWPfHiHiDebugReg                      =  0x00C841F4,
+	HWPfHiDebugMemSnoopMsiFifo            =  0x00C841F8,
+	HWPfHiDebugMemSnoopInputFifo          =  0x00C841FC,
+	HWPfHiMsixMappingConfig               =  0x00C84200,
+	HWPfHiJunkReg                         =  0x00C8FF00,
+	HWPfDdrUmmcVer                        =  0x00D00000,
+	HWPfDdrUmmcCap                        =  0x00D00010,
+	HWPfDdrUmmcCtrl                       =  0x00D00020,
+	HWPfDdrMpcPe                          =  0x00D00080,
+	HWPfDdrMpcPpri3                       =  0x00D00090,
+	HWPfDdrMpcPpri2                       =  0x00D000A0,
+	HWPfDdrMpcPpri1                       =  0x00D000B0,
+	HWPfDdrMpcPpri0                       =  0x00D000C0,
+	HWPfDdrMpcPrwgrpCtrl                  =  0x00D000D0,
+	HWPfDdrMpcPbw7                        =  0x00D000E0,
+	HWPfDdrMpcPbw6                        =  0x00D000F0,
+	HWPfDdrMpcPbw5                        =  0x00D00100,
+	HWPfDdrMpcPbw4                        =  0x00D00110,
+	HWPfDdrMpcPbw3                        =  0x00D00120,
+	HWPfDdrMpcPbw2                        =  0x00D00130,
+	HWPfDdrMpcPbw1                        =  0x00D00140,
+	HWPfDdrMpcPbw0                        =  0x00D00150,
+	HWPfDdrMemoryInit                     =  0x00D00200,
+	HWPfDdrMemoryInitDone                 =  0x00D00210,
+	HWPfDdrMemInitPhyTrng0                =  0x00D00240,
+	HWPfDdrMemInitPhyTrng1                =  0x00D00250,
+	HWPfDdrMemInitPhyTrng2                =  0x00D00260,
+	HWPfDdrMemInitPhyTrng3                =  0x00D00270,
+	HWPfDdrBcDram                         =  0x00D003C0,
+	HWPfDdrBcAddrMap                      =  0x00D003D0,
+	HWPfDdrBcRef                          =  0x00D003E0,
+	HWPfDdrBcTim0                         =  0x00D00400,
+	HWPfDdrBcTim1                         =  0x00D00410,
+	HWPfDdrBcTim2                         =  0x00D00420,
+	HWPfDdrBcTim3                         =  0x00D00430,
+	HWPfDdrBcTim4                         =  0x00D00440,
+	HWPfDdrBcTim5                         =  0x00D00450,
+	HWPfDdrBcTim6                         =  0x00D00460,
+	HWPfDdrBcTim7                         =  0x00D00470,
+	HWPfDdrBcTim8                         =  0x00D00480,
+	HWPfDdrBcTim9                         =  0x00D00490,
+	HWPfDdrBcTim10                        =  0x00D004A0,
+	HWPfDdrBcTim12                        =  0x00D004C0,
+	HWPfDdrDfiInit                        =  0x00D004D0,
+	HWPfDdrDfiInitComplete                =  0x00D004E0,
+	HWPfDdrDfiTim0                        =  0x00D004F0,
+	HWPfDdrDfiTim1                        =  0x00D00500,
+	HWPfDdrDfiPhyUpdEn                    =  0x00D00530,
+	HWPfDdrMemStatus                      =  0x00D00540,
+	HWPfDdrUmmcErrStatus                  =  0x00D00550,
+	HWPfDdrUmmcIntStatus                  =  0x00D00560,
+	HWPfDdrUmmcIntEn                      =  0x00D00570,
+	HWPfDdrPhyRdLatency                   =  0x00D48400,
+	HWPfDdrPhyRdLatencyDbi                =  0x00D48410,
+	HWPfDdrPhyWrLatency                   =  0x00D48420,
+	HWPfDdrPhyTrngType                    =  0x00D48430,
+	HWPfDdrPhyMrsTiming2                  =  0x00D48440,
+	HWPfDdrPhyMrsTiming0                  =  0x00D48450,
+	HWPfDdrPhyMrsTiming1                  =  0x00D48460,
+	HWPfDdrPhyDramTmrd                    =  0x00D48470,
+	HWPfDdrPhyDramTmod                    =  0x00D48480,
+	HWPfDdrPhyDramTwpre                   =  0x00D48490,
+	HWPfDdrPhyDramTrfc                    =  0x00D484A0,
+	HWPfDdrPhyDramTrwtp                   =  0x00D484B0,
+	HWPfDdrPhyMr01Dimm                    =  0x00D484C0,
+	HWPfDdrPhyMr01DimmDbi                 =  0x00D484D0,
+	HWPfDdrPhyMr23Dimm                    =  0x00D484E0,
+	HWPfDdrPhyMr45Dimm                    =  0x00D484F0,
+	HWPfDdrPhyMr67Dimm                    =  0x00D48500,
+	HWPfDdrPhyWrlvlWwRdlvlRr              =  0x00D48510,
+	HWPfDdrPhyOdtEn                       =  0x00D48520,
+	HWPfDdrPhyFastTrng                    =  0x00D48530,
+	HWPfDdrPhyDynTrngGap                  =  0x00D48540,
+	HWPfDdrPhyDynRcalGap                  =  0x00D48550,
+	HWPfDdrPhyIdletimeout                 =  0x00D48560,
+	HWPfDdrPhyRstCkeGap                   =  0x00D48570,
+	HWPfDdrPhyCkeMrsGap                   =  0x00D48580,
+	HWPfDdrPhyMemVrefMidVal               =  0x00D48590,
+	HWPfDdrPhyVrefStep                    =  0x00D485A0,
+	HWPfDdrPhyVrefThreshold               =  0x00D485B0,
+	HWPfDdrPhyPhyVrefMidVal               =  0x00D485C0,
+	HWPfDdrPhyDqsCountMax                 =  0x00D485D0,
+	HWPfDdrPhyDqsCountNum                 =  0x00D485E0,
+	HWPfDdrPhyDramRow                     =  0x00D485F0,
+	HWPfDdrPhyDramCol                     =  0x00D48600,
+	HWPfDdrPhyDramBgBa                    =  0x00D48610,
+	HWPfDdrPhyDynamicUpdreqrel            =  0x00D48620,
+	HWPfDdrPhyVrefLimits                  =  0x00D48630,
+	HWPfDdrPhyIdtmTcStatus                =  0x00D6C020,
+	HWPfDdrPhyIdtmFwVersion               =  0x00D6C410,
+	HWPfDdrPhyRdlvlGateInitDelay          =  0x00D70000,
+	HWPfDdrPhyRdenSmplabc                 =  0x00D70008,
+	HWPfDdrPhyVrefNibble0                 =  0x00D7000C,
+	HWPfDdrPhyVrefNibble1                 =  0x00D70010,
+	HWPfDdrPhyRdlvlGateDqsSmpl0           =  0x00D70014,
+	HWPfDdrPhyRdlvlGateDqsSmpl1           =  0x00D70018,
+	HWPfDdrPhyRdlvlGateDqsSmpl2           =  0x00D7001C,
+	HWPfDdrPhyDqsCount                    =  0x00D70020,
+	HWPfDdrPhyWrlvlRdlvlGateStatus        =  0x00D70024,
+	HWPfDdrPhyErrorFlags                  =  0x00D70028,
+	HWPfDdrPhyPowerDown                   =  0x00D70030,
+	HWPfDdrPhyPrbsSeedByte0               =  0x00D70034,
+	HWPfDdrPhyPrbsSeedByte1               =  0x00D70038,
+	HWPfDdrPhyPcompDq                     =  0x00D70040,
+	HWPfDdrPhyNcompDq                     =  0x00D70044,
+	HWPfDdrPhyPcompDqs                    =  0x00D70048,
+	HWPfDdrPhyNcompDqs                    =  0x00D7004C,
+	HWPfDdrPhyPcompCmd                    =  0x00D70050,
+	HWPfDdrPhyNcompCmd                    =  0x00D70054,
+	HWPfDdrPhyPcompCk                     =  0x00D70058,
+	HWPfDdrPhyNcompCk                     =  0x00D7005C,
+	HWPfDdrPhyRcalOdtDq                   =  0x00D70060,
+	HWPfDdrPhyRcalOdtDqs                  =  0x00D70064,
+	HWPfDdrPhyRcalMask1                   =  0x00D70068,
+	HWPfDdrPhyRcalMask2                   =  0x00D7006C,
+	HWPfDdrPhyRcalCtrl                    =  0x00D70070,
+	HWPfDdrPhyRcalCnt                     =  0x00D70074,
+	HWPfDdrPhyRcalOverride                =  0x00D70078,
+	HWPfDdrPhyRcalGateen                  =  0x00D7007C,
+	HWPfDdrPhyCtrl                        =  0x00D70080,
+	HWPfDdrPhyWrlvlAlg                    =  0x00D70084,
+	HWPfDdrPhyRcalVreftTxcmdOdt           =  0x00D70088,
+	HWPfDdrPhyRdlvlGateParam              =  0x00D7008C,
+	HWPfDdrPhyRdlvlGateParam2             =  0x00D70090,
+	HWPfDdrPhyRcalVreftTxdata             =  0x00D70094,
+	HWPfDdrPhyCmdIntDelay                 =  0x00D700A4,
+	HWPfDdrPhyAlertN                      =  0x00D700A8,
+	HWPfDdrPhyTrngReqWpre2tck             =  0x00D700AC,
+	HWPfDdrPhyCmdPhaseSel                 =  0x00D700B4,
+	HWPfDdrPhyCmdDcdl                     =  0x00D700B8,
+	HWPfDdrPhyCkDcdl                      =  0x00D700BC,
+	HWPfDdrPhySwTrngCtrl1                 =  0x00D700C0,
+	HWPfDdrPhySwTrngCtrl2                 =  0x00D700C4,
+	HWPfDdrPhyRcalPcompRden               =  0x00D700C8,
+	HWPfDdrPhyRcalNcompRden               =  0x00D700CC,
+	HWPfDdrPhyRcalCompen                  =  0x00D700D0,
+	HWPfDdrPhySwTrngRdqs                  =  0x00D700D4,
+	HWPfDdrPhySwTrngWdqs                  =  0x00D700D8,
+	HWPfDdrPhySwTrngRdena                 =  0x00D700DC,
+	HWPfDdrPhySwTrngRdenb                 =  0x00D700E0,
+	HWPfDdrPhySwTrngRdenc                 =  0x00D700E4,
+	HWPfDdrPhySwTrngWdq                   =  0x00D700E8,
+	HWPfDdrPhySwTrngRdq                   =  0x00D700EC,
+	HWPfDdrPhyPcfgHmValue                 =  0x00D700F0,
+	HWPfDdrPhyPcfgTimerValue              =  0x00D700F4,
+	HWPfDdrPhyPcfgSoftwareTraining        =  0x00D700F8,
+	HWPfDdrPhyPcfgMcStatus                =  0x00D700FC,
+	HWPfDdrPhyWrlvlPhRank0                =  0x00D70100,
+	HWPfDdrPhyRdenPhRank0                 =  0x00D70104,
+	HWPfDdrPhyRdenIntRank0                =  0x00D70108,
+	HWPfDdrPhyRdqsDcdlRank0               =  0x00D7010C,
+	HWPfDdrPhyRdqsShadowDcdlRank0         =  0x00D70110,
+	HWPfDdrPhyWdqsDcdlRank0               =  0x00D70114,
+	HWPfDdrPhyWdmDcdlShadowRank0          =  0x00D70118,
+	HWPfDdrPhyWdmDcdlRank0                =  0x00D7011C,
+	HWPfDdrPhyDbiDcdlRank0                =  0x00D70120,
+	HWPfDdrPhyRdenDcdlaRank0              =  0x00D70124,
+	HWPfDdrPhyDbiDcdlShadowRank0          =  0x00D70128,
+	HWPfDdrPhyRdenDcdlbRank0              =  0x00D7012C,
+	HWPfDdrPhyWdqsShadowDcdlRank0         =  0x00D70130,
+	HWPfDdrPhyRdenDcdlcRank0              =  0x00D70134,
+	HWPfDdrPhyRdenShadowDcdlaRank0        =  0x00D70138,
+	HWPfDdrPhyWrlvlIntRank0               =  0x00D7013C,
+	HWPfDdrPhyRdqDcdlBit0Rank0            =  0x00D70200,
+	HWPfDdrPhyRdqDcdlShadowBit0Rank0      =  0x00D70204,
+	HWPfDdrPhyWdqDcdlBit0Rank0            =  0x00D70208,
+	HWPfDdrPhyWdqDcdlShadowBit0Rank0      =  0x00D7020C,
+	HWPfDdrPhyRdqDcdlBit1Rank0            =  0x00D70240,
+	HWPfDdrPhyRdqDcdlShadowBit1Rank0      =  0x00D70244,
+	HWPfDdrPhyWdqDcdlBit1Rank0            =  0x00D70248,
+	HWPfDdrPhyWdqDcdlShadowBit1Rank0      =  0x00D7024C,
+	HWPfDdrPhyRdqDcdlBit2Rank0            =  0x00D70280,
+	HWPfDdrPhyRdqDcdlShadowBit2Rank0      =  0x00D70284,
+	HWPfDdrPhyWdqDcdlBit2Rank0            =  0x00D70288,
+	HWPfDdrPhyWdqDcdlShadowBit2Rank0      =  0x00D7028C,
+	HWPfDdrPhyRdqDcdlBit3Rank0            =  0x00D702C0,
+	HWPfDdrPhyRdqDcdlShadowBit3Rank0      =  0x00D702C4,
+	HWPfDdrPhyWdqDcdlBit3Rank0            =  0x00D702C8,
+	HWPfDdrPhyWdqDcdlShadowBit3Rank0      =  0x00D702CC,
+	HWPfDdrPhyRdqDcdlBit4Rank0            =  0x00D70300,
+	HWPfDdrPhyRdqDcdlShadowBit4Rank0      =  0x00D70304,
+	HWPfDdrPhyWdqDcdlBit4Rank0            =  0x00D70308,
+	HWPfDdrPhyWdqDcdlShadowBit4Rank0      =  0x00D7030C,
+	HWPfDdrPhyRdqDcdlBit5Rank0            =  0x00D70340,
+	HWPfDdrPhyRdqDcdlShadowBit5Rank0      =  0x00D70344,
+	HWPfDdrPhyWdqDcdlBit5Rank0            =  0x00D70348,
+	HWPfDdrPhyWdqDcdlShadowBit5Rank0      =  0x00D7034C,
+	HWPfDdrPhyRdqDcdlBit6Rank0            =  0x00D70380,
+	HWPfDdrPhyRdqDcdlShadowBit6Rank0      =  0x00D70384,
+	HWPfDdrPhyWdqDcdlBit6Rank0            =  0x00D70388,
+	HWPfDdrPhyWdqDcdlShadowBit6Rank0      =  0x00D7038C,
+	HWPfDdrPhyRdqDcdlBit7Rank0            =  0x00D703C0,
+	HWPfDdrPhyRdqDcdlShadowBit7Rank0      =  0x00D703C4,
+	HWPfDdrPhyWdqDcdlBit7Rank0            =  0x00D703C8,
+	HWPfDdrPhyWdqDcdlShadowBit7Rank0      =  0x00D703CC,
+	HWPfDdrPhyIdtmStatus                  =  0x00D740D0,
+	HWPfDdrPhyIdtmError                   =  0x00D74110,
+	HWPfDdrPhyIdtmDebug                   =  0x00D74120,
+	HWPfDdrPhyIdtmDebugInt                =  0x00D74130,
+	HwPfPcieLnAsicCfgovr                  =  0x00D80000,
+	HwPfPcieLnAclkmixer                   =  0x00D80004,
+	HwPfPcieLnTxrampfreq                  =  0x00D80008,
+	HwPfPcieLnLanetest                    =  0x00D8000C,
+	HwPfPcieLnDcctrl                      =  0x00D80010,
+	HwPfPcieLnDccmeas                     =  0x00D80014,
+	HwPfPcieLnDccovrAclk                  =  0x00D80018,
+	HwPfPcieLnDccovrTxa                   =  0x00D8001C,
+	HwPfPcieLnDccovrTxk                   =  0x00D80020,
+	HwPfPcieLnDccovrDclk                  =  0x00D80024,
+	HwPfPcieLnDccovrEclk                  =  0x00D80028,
+	HwPfPcieLnDcctrimAclk                 =  0x00D8002C,
+	HwPfPcieLnDcctrimTx                   =  0x00D80030,
+	HwPfPcieLnDcctrimDclk                 =  0x00D80034,
+	HwPfPcieLnDcctrimEclk                 =  0x00D80038,
+	HwPfPcieLnQuadCtrl                    =  0x00D8003C,
+	HwPfPcieLnQuadCorrIndex               =  0x00D80040,
+	HwPfPcieLnQuadCorrStatus              =  0x00D80044,
+	HwPfPcieLnAsicRxovr1                  =  0x00D80048,
+	HwPfPcieLnAsicRxovr2                  =  0x00D8004C,
+	HwPfPcieLnAsicEqinfovr                =  0x00D80050,
+	HwPfPcieLnRxcsr                       =  0x00D80054,
+	HwPfPcieLnRxfectrl                    =  0x00D80058,
+	HwPfPcieLnRxtest                      =  0x00D8005C,
+	HwPfPcieLnEscount                     =  0x00D80060,
+	HwPfPcieLnCdrctrl                     =  0x00D80064,
+	HwPfPcieLnCdrctrl2                    =  0x00D80068,
+	HwPfPcieLnCdrcfg0Ctrl0                =  0x00D8006C,
+	HwPfPcieLnCdrcfg0Ctrl1                =  0x00D80070,
+	HwPfPcieLnCdrcfg0Ctrl2                =  0x00D80074,
+	HwPfPcieLnCdrcfg1Ctrl0                =  0x00D80078,
+	HwPfPcieLnCdrcfg1Ctrl1                =  0x00D8007C,
+	HwPfPcieLnCdrcfg1Ctrl2                =  0x00D80080,
+	HwPfPcieLnCdrcfg2Ctrl0                =  0x00D80084,
+	HwPfPcieLnCdrcfg2Ctrl1                =  0x00D80088,
+	HwPfPcieLnCdrcfg2Ctrl2                =  0x00D8008C,
+	HwPfPcieLnCdrcfg3Ctrl0                =  0x00D80090,
+	HwPfPcieLnCdrcfg3Ctrl1                =  0x00D80094,
+	HwPfPcieLnCdrcfg3Ctrl2                =  0x00D80098,
+	HwPfPcieLnCdrphase                    =  0x00D8009C,
+	HwPfPcieLnCdrfreq                     =  0x00D800A0,
+	HwPfPcieLnCdrstatusPhase              =  0x00D800A4,
+	HwPfPcieLnCdrstatusFreq               =  0x00D800A8,
+	HwPfPcieLnCdroffset                   =  0x00D800AC,
+	HwPfPcieLnRxvosctl                    =  0x00D800B0,
+	HwPfPcieLnRxvosctl2                   =  0x00D800B4,
+	HwPfPcieLnRxlosctl                    =  0x00D800B8,
+	HwPfPcieLnRxlos                       =  0x00D800BC,
+	HwPfPcieLnRxlosvval                   =  0x00D800C0,
+	HwPfPcieLnRxvosd0                     =  0x00D800C4,
+	HwPfPcieLnRxvosd1                     =  0x00D800C8,
+	HwPfPcieLnRxvosep0                    =  0x00D800CC,
+	HwPfPcieLnRxvosep1                    =  0x00D800D0,
+	HwPfPcieLnRxvosen0                    =  0x00D800D4,
+	HwPfPcieLnRxvosen1                    =  0x00D800D8,
+	HwPfPcieLnRxvosafe                    =  0x00D800DC,
+	HwPfPcieLnRxvosa0                     =  0x00D800E0,
+	HwPfPcieLnRxvosa0Out                  =  0x00D800E4,
+	HwPfPcieLnRxvosa1                     =  0x00D800E8,
+	HwPfPcieLnRxvosa1Out                  =  0x00D800EC,
+	HwPfPcieLnRxmisc                      =  0x00D800F0,
+	HwPfPcieLnRxbeacon                    =  0x00D800F4,
+	HwPfPcieLnRxdssout                    =  0x00D800F8,
+	HwPfPcieLnRxdssout2                   =  0x00D800FC,
+	HwPfPcieLnAlphapctrl                  =  0x00D80100,
+	HwPfPcieLnAlphanctrl                  =  0x00D80104,
+	HwPfPcieLnAdaptctrl                   =  0x00D80108,
+	HwPfPcieLnAdaptctrl1                  =  0x00D8010C,
+	HwPfPcieLnAdaptstatus                 =  0x00D80110,
+	HwPfPcieLnAdaptvga1                   =  0x00D80114,
+	HwPfPcieLnAdaptvga2                   =  0x00D80118,
+	HwPfPcieLnAdaptvga3                   =  0x00D8011C,
+	HwPfPcieLnAdaptvga4                   =  0x00D80120,
+	HwPfPcieLnAdaptboost1                 =  0x00D80124,
+	HwPfPcieLnAdaptboost2                 =  0x00D80128,
+	HwPfPcieLnAdaptboost3                 =  0x00D8012C,
+	HwPfPcieLnAdaptboost4                 =  0x00D80130,
+	HwPfPcieLnAdaptsslms1                 =  0x00D80134,
+	HwPfPcieLnAdaptsslms2                 =  0x00D80138,
+	HwPfPcieLnAdaptvgaStatus              =  0x00D8013C,
+	HwPfPcieLnAdaptboostStatus            =  0x00D80140,
+	HwPfPcieLnAdaptsslmsStatus1           =  0x00D80144,
+	HwPfPcieLnAdaptsslmsStatus2           =  0x00D80148,
+	HwPfPcieLnAfectrl1                    =  0x00D8014C,
+	HwPfPcieLnAfectrl2                    =  0x00D80150,
+	HwPfPcieLnAfectrl3                    =  0x00D80154,
+	HwPfPcieLnAfedefault1                 =  0x00D80158,
+	HwPfPcieLnAfedefault2                 =  0x00D8015C,
+	HwPfPcieLnDfectrl1                    =  0x00D80160,
+	HwPfPcieLnDfectrl2                    =  0x00D80164,
+	HwPfPcieLnDfectrl3                    =  0x00D80168,
+	HwPfPcieLnDfectrl4                    =  0x00D8016C,
+	HwPfPcieLnDfectrl5                    =  0x00D80170,
+	HwPfPcieLnDfectrl6                    =  0x00D80174,
+	HwPfPcieLnAfestatus1                  =  0x00D80178,
+	HwPfPcieLnAfestatus2                  =  0x00D8017C,
+	HwPfPcieLnDfestatus1                  =  0x00D80180,
+	HwPfPcieLnDfestatus2                  =  0x00D80184,
+	HwPfPcieLnDfestatus3                  =  0x00D80188,
+	HwPfPcieLnDfestatus4                  =  0x00D8018C,
+	HwPfPcieLnDfestatus5                  =  0x00D80190,
+	HwPfPcieLnAlphastatus                 =  0x00D80194,
+	HwPfPcieLnFomctrl1                    =  0x00D80198,
+	HwPfPcieLnFomctrl2                    =  0x00D8019C,
+	HwPfPcieLnFomctrl3                    =  0x00D801A0,
+	HwPfPcieLnAclkcalStatus               =  0x00D801A4,
+	HwPfPcieLnOffscorrStatus              =  0x00D801A8,
+	HwPfPcieLnEyewidthStatus              =  0x00D801AC,
+	HwPfPcieLnEyeheightStatus             =  0x00D801B0,
+	HwPfPcieLnAsicTxovr1                  =  0x00D801B4,
+	HwPfPcieLnAsicTxovr2                  =  0x00D801B8,
+	HwPfPcieLnAsicTxovr3                  =  0x00D801BC,
+	HwPfPcieLnTxbiasadjOvr                =  0x00D801C0,
+	HwPfPcieLnTxcsr                       =  0x00D801C4,
+	HwPfPcieLnTxtest                      =  0x00D801C8,
+	HwPfPcieLnTxtestword                  =  0x00D801CC,
+	HwPfPcieLnTxtestwordHigh              =  0x00D801D0,
+	HwPfPcieLnTxdrive                     =  0x00D801D4,
+	HwPfPcieLnMtcsLn                      =  0x00D801D8,
+	HwPfPcieLnStatsumLn                   =  0x00D801DC,
+	HwPfPcieLnRcbusScratch                =  0x00D801E0,
+	HwPfPcieLnRcbusMinorrev               =  0x00D801F0,
+	HwPfPcieLnRcbusMajorrev               =  0x00D801F4,
+	HwPfPcieLnRcbusBlocktype              =  0x00D801F8,
+	HwPfPcieSupPllcsr                     =  0x00D80800,
+	HwPfPcieSupPlldiv                     =  0x00D80804,
+	HwPfPcieSupPllcal                     =  0x00D80808,
+	HwPfPcieSupPllcalsts                  =  0x00D8080C,
+	HwPfPcieSupPllmeas                    =  0x00D80810,
+	HwPfPcieSupPlldactrim                 =  0x00D80814,
+	HwPfPcieSupPllbiastrim                =  0x00D80818,
+	HwPfPcieSupPllbwtrim                  =  0x00D8081C,
+	HwPfPcieSupPllcaldly                  =  0x00D80820,
+	HwPfPcieSupRefclkonpclkctrl           =  0x00D80824,
+	HwPfPcieSupPclkdelay                  =  0x00D80828,
+	HwPfPcieSupPhyconfig                  =  0x00D8082C,
+	HwPfPcieSupRcalIntf                   =  0x00D80830,
+	HwPfPcieSupAuxcsr                     =  0x00D80834,
+	HwPfPcieSupVref                       =  0x00D80838,
+	HwPfPcieSupLinkmode                   =  0x00D8083C,
+	HwPfPcieSupRrefcalctl                 =  0x00D80840,
+	HwPfPcieSupRrefcal                    =  0x00D80844,
+	HwPfPcieSupRrefcaldly                 =  0x00D80848,
+	HwPfPcieSupTximpcalctl                =  0x00D8084C,
+	HwPfPcieSupTximpcal                   =  0x00D80850,
+	HwPfPcieSupTximpoffset                =  0x00D80854,
+	HwPfPcieSupTximpcaldly                =  0x00D80858,
+	HwPfPcieSupRximpcalctl                =  0x00D8085C,
+	HwPfPcieSupRximpcal                   =  0x00D80860,
+	HwPfPcieSupRximpoffset                =  0x00D80864,
+	HwPfPcieSupRximpcaldly                =  0x00D80868,
+	HwPfPcieSupFence                      =  0x00D8086C,
+	HwPfPcieSupMtcs                       =  0x00D80870,
+	HwPfPcieSupStatsum                    =  0x00D809B8,
+	HwPfPciePcsDpStatus0                  =  0x00D81000,
+	HwPfPciePcsDpControl0                 =  0x00D81004,
+	HwPfPciePcsPmaStatusLane0             =  0x00D81008,
+	HwPfPciePcsPipeStatusLane0            =  0x00D8100C,
+	HwPfPciePcsTxdeemph0Lane0             =  0x00D81010,
+	HwPfPciePcsTxdeemph1Lane0             =  0x00D81014,
+	HwPfPciePcsInternalStatusLane0        =  0x00D81018,
+	HwPfPciePcsDpStatus1                  =  0x00D8101C,
+	HwPfPciePcsDpControl1                 =  0x00D81020,
+	HwPfPciePcsPmaStatusLane1             =  0x00D81024,
+	HwPfPciePcsPipeStatusLane1            =  0x00D81028,
+	HwPfPciePcsTxdeemph0Lane1             =  0x00D8102C,
+	HwPfPciePcsTxdeemph1Lane1             =  0x00D81030,
+	HwPfPciePcsInternalStatusLane1        =  0x00D81034,
+	HwPfPciePcsDpStatus2                  =  0x00D81038,
+	HwPfPciePcsDpControl2                 =  0x00D8103C,
+	HwPfPciePcsPmaStatusLane2             =  0x00D81040,
+	HwPfPciePcsPipeStatusLane2            =  0x00D81044,
+	HwPfPciePcsTxdeemph0Lane2             =  0x00D81048,
+	HwPfPciePcsTxdeemph1Lane2             =  0x00D8104C,
+	HwPfPciePcsInternalStatusLane2        =  0x00D81050,
+	HwPfPciePcsDpStatus3                  =  0x00D81054,
+	HwPfPciePcsDpControl3                 =  0x00D81058,
+	HwPfPciePcsPmaStatusLane3             =  0x00D8105C,
+	HwPfPciePcsPipeStatusLane3            =  0x00D81060,
+	HwPfPciePcsTxdeemph0Lane3             =  0x00D81064,
+	HwPfPciePcsTxdeemph1Lane3             =  0x00D81068,
+	HwPfPciePcsInternalStatusLane3        =  0x00D8106C,
+	HwPfPciePcsEbStatus0                  =  0x00D81070,
+	HwPfPciePcsEbStatus1                  =  0x00D81074,
+	HwPfPciePcsEbStatus2                  =  0x00D81078,
+	HwPfPciePcsEbStatus3                  =  0x00D8107C,
+	HwPfPciePcsPllSettingPcieG1           =  0x00D81088,
+	HwPfPciePcsPllSettingPcieG2           =  0x00D8108C,
+	HwPfPciePcsPllSettingPcieG3           =  0x00D81090,
+	HwPfPciePcsControl                    =  0x00D81094,
+	HwPfPciePcsEqControl                  =  0x00D81098,
+	HwPfPciePcsEqTimer                    =  0x00D8109C,
+	HwPfPciePcsEqErrStatus                =  0x00D810A0,
+	HwPfPciePcsEqErrCount                 =  0x00D810A4,
+	HwPfPciePcsStatus                     =  0x00D810A8,
+	HwPfPciePcsMiscRegister               =  0x00D810AC,
+	HwPfPciePcsObsControl                 =  0x00D810B0,
+	HwPfPciePcsPrbsCount0                 =  0x00D81200,
+	HwPfPciePcsBistControl0               =  0x00D81204,
+	HwPfPciePcsBistStaticWord00           =  0x00D81208,
+	HwPfPciePcsBistStaticWord10           =  0x00D8120C,
+	HwPfPciePcsBistStaticWord20           =  0x00D81210,
+	HwPfPciePcsBistStaticWord30           =  0x00D81214,
+	HwPfPciePcsPrbsCount1                 =  0x00D81220,
+	HwPfPciePcsBistControl1               =  0x00D81224,
+	HwPfPciePcsBistStaticWord01           =  0x00D81228,
+	HwPfPciePcsBistStaticWord11           =  0x00D8122C,
+	HwPfPciePcsBistStaticWord21           =  0x00D81230,
+	HwPfPciePcsBistStaticWord31           =  0x00D81234,
+	HwPfPciePcsPrbsCount2                 =  0x00D81240,
+	HwPfPciePcsBistControl2               =  0x00D81244,
+	HwPfPciePcsBistStaticWord02           =  0x00D81248,
+	HwPfPciePcsBistStaticWord12           =  0x00D8124C,
+	HwPfPciePcsBistStaticWord22           =  0x00D81250,
+	HwPfPciePcsBistStaticWord32           =  0x00D81254,
+	HwPfPciePcsPrbsCount3                 =  0x00D81260,
+	HwPfPciePcsBistControl3               =  0x00D81264,
+	HwPfPciePcsBistStaticWord03           =  0x00D81268,
+	HwPfPciePcsBistStaticWord13           =  0x00D8126C,
+	HwPfPciePcsBistStaticWord23           =  0x00D81270,
+	HwPfPciePcsBistStaticWord33           =  0x00D81274,
+	HwPfPcieGpexLtssmStateCntrl           =  0x00D90400,
+	HwPfPcieGpexLtssmStateStatus          =  0x00D90404,
+	HwPfPcieGpexSkipFreqTimer             =  0x00D90408,
+	HwPfPcieGpexLaneSelect                =  0x00D9040C,
+	HwPfPcieGpexLaneDeskew                =  0x00D90410,
+	HwPfPcieGpexRxErrorStatus             =  0x00D90414,
+	HwPfPcieGpexLaneNumControl            =  0x00D90418,
+	HwPfPcieGpexNFstControl               =  0x00D9041C,
+	HwPfPcieGpexLinkStatus                =  0x00D90420,
+	HwPfPcieGpexAckReplayTimeout          =  0x00D90438,
+	HwPfPcieGpexSeqNumberStatus           =  0x00D9043C,
+	HwPfPcieGpexCoreClkRatio              =  0x00D90440,
+	HwPfPcieGpexDllTholdControl           =  0x00D90448,
+	HwPfPcieGpexPmTimer                   =  0x00D90450,
+	HwPfPcieGpexPmeTimeout                =  0x00D90454,
+	HwPfPcieGpexAspmL1Timer               =  0x00D90458,
+	HwPfPcieGpexAspmReqTimer              =  0x00D9045C,
+	HwPfPcieGpexAspmL1Dis                 =  0x00D90460,
+	HwPfPcieGpexAdvisoryErrorControl      =  0x00D90468,
+	HwPfPcieGpexId                        =  0x00D90470,
+	HwPfPcieGpexClasscode                 =  0x00D90474,
+	HwPfPcieGpexSubsystemId               =  0x00D90478,
+	HwPfPcieGpexDeviceCapabilities        =  0x00D9047C,
+	HwPfPcieGpexLinkCapabilities          =  0x00D90480,
+	HwPfPcieGpexFunctionNumber            =  0x00D90484,
+	HwPfPcieGpexPmCapabilities            =  0x00D90488,
+	HwPfPcieGpexFunctionSelect            =  0x00D9048C,
+	HwPfPcieGpexErrorCounter              =  0x00D904AC,
+	HwPfPcieGpexConfigReady               =  0x00D904B0,
+	HwPfPcieGpexFcUpdateTimeout           =  0x00D904B8,
+	HwPfPcieGpexFcUpdateTimer             =  0x00D904BC,
+	HwPfPcieGpexVcBufferLoad              =  0x00D904C8,
+	HwPfPcieGpexVcBufferSizeThold         =  0x00D904CC,
+	HwPfPcieGpexVcBufferSelect            =  0x00D904D0,
+	HwPfPcieGpexBarEnable                 =  0x00D904D4,
+	HwPfPcieGpexBarDwordLower             =  0x00D904D8,
+	HwPfPcieGpexBarDwordUpper             =  0x00D904DC,
+	HwPfPcieGpexBarSelect                 =  0x00D904E0,
+	HwPfPcieGpexCreditCounterSelect       =  0x00D904E4,
+	HwPfPcieGpexCreditCounterStatus       =  0x00D904E8,
+	HwPfPcieGpexTlpHeaderSelect           =  0x00D904EC,
+	HwPfPcieGpexTlpHeaderDword0           =  0x00D904F0,
+	HwPfPcieGpexTlpHeaderDword1           =  0x00D904F4,
+	HwPfPcieGpexTlpHeaderDword2           =  0x00D904F8,
+	HwPfPcieGpexTlpHeaderDword3           =  0x00D904FC,
+	HwPfPcieGpexRelaxOrderControl         =  0x00D90500,
+	HwPfPcieGpexBarPrefetch               =  0x00D90504,
+	HwPfPcieGpexFcCheckControl            =  0x00D90508,
+	HwPfPcieGpexFcUpdateTimerTraffic      =  0x00D90518,
+	HwPfPcieGpexPhyControl0               =  0x00D9053C,
+	HwPfPcieGpexPhyControl1               =  0x00D90544,
+	HwPfPcieGpexPhyControl2               =  0x00D9054C,
+	HwPfPcieGpexUserControl0              =  0x00D9055C,
+	HwPfPcieGpexUncorrErrorStatus         =  0x00D905F0,
+	HwPfPcieGpexRxCplError                =  0x00D90620,
+	HwPfPcieGpexRxCplErrorDword0          =  0x00D90624,
+	HwPfPcieGpexRxCplErrorDword1          =  0x00D90628,
+	HwPfPcieGpexRxCplErrorDword2          =  0x00D9062C,
+	HwPfPcieGpexPabSwResetEn              =  0x00D90630,
+	HwPfPcieGpexGen3Control0              =  0x00D90634,
+	HwPfPcieGpexGen3Control1              =  0x00D90638,
+	HwPfPcieGpexGen3Control2              =  0x00D9063C,
+	HwPfPcieGpexGen2ControlCsr            =  0x00D90640,
+	HwPfPcieGpexTotalVfInitialVf0         =  0x00D90644,
+	HwPfPcieGpexTotalVfInitialVf1         =  0x00D90648,
+	HwPfPcieGpexSriovLinkDevId0           =  0x00D90684,
+	HwPfPcieGpexSriovLinkDevId1           =  0x00D90688,
+	HwPfPcieGpexSriovPageSize0            =  0x00D906C4,
+	HwPfPcieGpexSriovPageSize1            =  0x00D906C8,
+	HwPfPcieGpexIdVersion                 =  0x00D906FC,
+	HwPfPcieGpexSriovVfOffsetStride0      =  0x00D90704,
+	HwPfPcieGpexSriovVfOffsetStride1      =  0x00D90708,
+	HwPfPcieGpexGen3DeskewControl         =  0x00D907B4,
+	HwPfPcieGpexGen3EqControl             =  0x00D907B8,
+	HwPfPcieGpexBridgeVersion             =  0x00D90800,
+	HwPfPcieGpexBridgeCapability          =  0x00D90804,
+	HwPfPcieGpexBridgeControl             =  0x00D90808,
+	HwPfPcieGpexBridgeStatus              =  0x00D9080C,
+	HwPfPcieGpexEngineActivityStatus      =  0x00D9081C,
+	HwPfPcieGpexEngineResetControl        =  0x00D90820,
+	HwPfPcieGpexAxiPioControl             =  0x00D90840,
+	HwPfPcieGpexAxiPioStatus              =  0x00D90844,
+	HwPfPcieGpexAmbaSlaveCmdStatus        =  0x00D90848,
+	HwPfPcieGpexPexPioControl             =  0x00D908C0,
+	HwPfPcieGpexPexPioStatus              =  0x00D908C4,
+	HwPfPcieGpexAmbaMasterStatus          =  0x00D908C8,
+	HwPfPcieGpexCsrSlaveCmdStatus         =  0x00D90920,
+	HwPfPcieGpexMailboxAxiControl         =  0x00D90A50,
+	HwPfPcieGpexMailboxAxiData            =  0x00D90A54,
+	HwPfPcieGpexMailboxPexControl         =  0x00D90A90,
+	HwPfPcieGpexMailboxPexData            =  0x00D90A94,
+	HwPfPcieGpexPexInterruptEnable        =  0x00D90AD0,
+	HwPfPcieGpexPexInterruptStatus        =  0x00D90AD4,
+	HwPfPcieGpexPexInterruptAxiPioVector  =  0x00D90AD8,
+	HwPfPcieGpexPexInterruptPexPioVector  =  0x00D90AE0,
+	HwPfPcieGpexPexInterruptMiscVector    =  0x00D90AF8,
+	HwPfPcieGpexAmbaInterruptPioEnable    =  0x00D90B00,
+	HwPfPcieGpexAmbaInterruptMiscEnable   =  0x00D90B0C,
+	HwPfPcieGpexAmbaInterruptPioStatus    =  0x00D90B10,
+	HwPfPcieGpexAmbaInterruptMiscStatus   =  0x00D90B1C,
+	HwPfPcieGpexPexPmControl              =  0x00D90B80,
+	HwPfPcieGpexSlotMisc                  =  0x00D90B88,
+	HwPfPcieGpexAxiAddrMappingControl     =  0x00D90BA0,
+	HwPfPcieGpexAxiAddrMappingWindowAxiBase     =  0x00D90BA4,
+	HwPfPcieGpexAxiAddrMappingWindowPexBaseLow  =  0x00D90BA8,
+	HwPfPcieGpexAxiAddrMappingWindowPexBaseHigh =  0x00D90BAC,
+	HwPfPcieGpexPexBarAddrFunc0Bar0       =  0x00D91BA0,
+	HwPfPcieGpexPexBarAddrFunc0Bar1       =  0x00D91BA4,
+	HwPfPcieGpexAxiAddrMappingPcieHdrParam =  0x00D95BA0,
+	HwPfPcieGpexExtAxiAddrMappingAxiBase  =  0x00D980A0,
+	HwPfPcieGpexPexExtBarAddrFunc0Bar0    =  0x00D984A0,
+	HwPfPcieGpexPexExtBarAddrFunc0Bar1    =  0x00D984A4,
+	HwPfPcieGpexAmbaInterruptFlrEnable    =  0x00D9B960,
+	HwPfPcieGpexAmbaInterruptFlrStatus    =  0x00D9B9A0,
+	HwPfPcieGpexExtAxiAddrMappingSize     =  0x00D9BAF0,
+	HwPfPcieGpexPexPioAwcacheControl      =  0x00D9C300,
+	HwPfPcieGpexPexPioArcacheControl      =  0x00D9C304,
+	HwPfPcieGpexPabObSizeControlVc0       =  0x00D9C310
+};
+
+/* TIP PF Interrupt numbers */
+enum {
+	ACC100_PF_INT_QMGR_AQ_OVERFLOW = 0,
+	ACC100_PF_INT_DOORBELL_VF_2_PF = 1,
+	ACC100_PF_INT_DMA_DL_DESC_IRQ = 2,
+	ACC100_PF_INT_DMA_UL_DESC_IRQ = 3,
+	ACC100_PF_INT_DMA_MLD_DESC_IRQ = 4,
+	ACC100_PF_INT_DMA_UL5G_DESC_IRQ = 5,
+	ACC100_PF_INT_DMA_DL5G_DESC_IRQ = 6,
+	ACC100_PF_INT_ILLEGAL_FORMAT = 7,
+	ACC100_PF_INT_QMGR_DISABLED_ACCESS = 8,
+	ACC100_PF_INT_QMGR_AQ_OVERTHRESHOLD = 9,
+	ACC100_PF_INT_ARAM_ACCESS_ERR = 10,
+	ACC100_PF_INT_ARAM_ECC_1BIT_ERR = 11,
+	ACC100_PF_INT_PARITY_ERR = 12,
+	ACC100_PF_INT_QMGR_ERR = 13,
+	ACC100_PF_INT_INT_REQ_OVERFLOW = 14,
+	ACC100_PF_INT_APB_TIMEOUT = 15,
+};
+
+#endif /* ACC100_PF_ENUM_H */
diff --git a/drivers/baseband/acc100/acc100_vf_enum.h b/drivers/baseband/acc100/acc100_vf_enum.h
new file mode 100644
index 0000000..b512af3
--- /dev/null
+++ b/drivers/baseband/acc100/acc100_vf_enum.h
@@ -0,0 +1,73 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017 Intel Corporation
+ */
+
+#ifndef ACC100_VF_ENUM_H
+#define ACC100_VF_ENUM_H
+
+/*
+ * ACC100 Register mapping on VF BAR0
+ * This is automatically generated from RDL; the format may change with new RDL
+ */
+enum {
+	HWVfQmgrIngressAq             =  0x00000000,
+	HWVfHiVfToPfDbellVf           =  0x00000800,
+	HWVfHiPfToVfDbellVf           =  0x00000808,
+	HWVfHiInfoRingBaseLoVf        =  0x00000810,
+	HWVfHiInfoRingBaseHiVf        =  0x00000814,
+	HWVfHiInfoRingPointerVf       =  0x00000818,
+	HWVfHiInfoRingIntWrEnVf       =  0x00000820,
+	HWVfHiInfoRingPf2VfWrEnVf     =  0x00000824,
+	HWVfHiMsixVectorMapperVf      =  0x00000860,
+	HWVfDmaFec5GulDescBaseLoRegVf =  0x00000920,
+	HWVfDmaFec5GulDescBaseHiRegVf =  0x00000924,
+	HWVfDmaFec5GulRespPtrLoRegVf  =  0x00000928,
+	HWVfDmaFec5GulRespPtrHiRegVf  =  0x0000092C,
+	HWVfDmaFec5GdlDescBaseLoRegVf =  0x00000940,
+	HWVfDmaFec5GdlDescBaseHiRegVf =  0x00000944,
+	HWVfDmaFec5GdlRespPtrLoRegVf  =  0x00000948,
+	HWVfDmaFec5GdlRespPtrHiRegVf  =  0x0000094C,
+	HWVfDmaFec4GulDescBaseLoRegVf =  0x00000960,
+	HWVfDmaFec4GulDescBaseHiRegVf =  0x00000964,
+	HWVfDmaFec4GulRespPtrLoRegVf  =  0x00000968,
+	HWVfDmaFec4GulRespPtrHiRegVf  =  0x0000096C,
+	HWVfDmaFec4GdlDescBaseLoRegVf =  0x00000980,
+	HWVfDmaFec4GdlDescBaseHiRegVf =  0x00000984,
+	HWVfDmaFec4GdlRespPtrLoRegVf  =  0x00000988,
+	HWVfDmaFec4GdlRespPtrHiRegVf  =  0x0000098C,
+	HWVfDmaDdrBaseRangeRoVf       =  0x000009A0,
+	HWVfQmgrAqResetVf             =  0x00000E00,
+	HWVfQmgrRingSizeVf            =  0x00000E04,
+	HWVfQmgrGrpDepthLog20Vf       =  0x00000E08,
+	HWVfQmgrGrpDepthLog21Vf       =  0x00000E0C,
+	HWVfQmgrGrpFunction0Vf        =  0x00000E10,
+	HWVfQmgrGrpFunction1Vf        =  0x00000E14,
+	HWVfPmACntrlRegVf             =  0x00000F40,
+	HWVfPmACountVf                =  0x00000F48,
+	HWVfPmAKCntLoVf               =  0x00000F50,
+	HWVfPmAKCntHiVf               =  0x00000F54,
+	HWVfPmADeltaCntLoVf           =  0x00000F60,
+	HWVfPmADeltaCntHiVf           =  0x00000F64,
+	HWVfPmBCntrlRegVf             =  0x00000F80,
+	HWVfPmBCountVf                =  0x00000F88,
+	HWVfPmBKCntLoVf               =  0x00000F90,
+	HWVfPmBKCntHiVf               =  0x00000F94,
+	HWVfPmBDeltaCntLoVf           =  0x00000FA0,
+	HWVfPmBDeltaCntHiVf           =  0x00000FA4
+};
+
+/* TIP VF Interrupt numbers */
+enum {
+	ACC100_VF_INT_QMGR_AQ_OVERFLOW = 0,
+	ACC100_VF_INT_DOORBELL_VF_2_PF = 1,
+	ACC100_VF_INT_DMA_DL_DESC_IRQ = 2,
+	ACC100_VF_INT_DMA_UL_DESC_IRQ = 3,
+	ACC100_VF_INT_DMA_MLD_DESC_IRQ = 4,
+	ACC100_VF_INT_DMA_UL5G_DESC_IRQ = 5,
+	ACC100_VF_INT_DMA_DL5G_DESC_IRQ = 6,
+	ACC100_VF_INT_ILLEGAL_FORMAT = 7,
+	ACC100_VF_INT_QMGR_DISABLED_ACCESS = 8,
+	ACC100_VF_INT_QMGR_AQ_OVERTHRESHOLD = 9,
+};
+
+#endif /* ACC100_VF_ENUM_H */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 6f46df0..6525d66 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -5,6 +5,9 @@
 #ifndef _RTE_ACC100_PMD_H_
 #define _RTE_ACC100_PMD_H_
 
+#include "acc100_pf_enum.h"
+#include "acc100_vf_enum.h"
+
 /* Helper macro for logging */
 #define rte_bbdev_log(level, fmt, ...) \
 	rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \
@@ -27,6 +30,490 @@
 #define RTE_ACC100_PF_DEVICE_ID        (0x0d5c)
 #define RTE_ACC100_VF_DEVICE_ID        (0x0d5d)
 
+/* Define as 1 to use only a single FEC engine */
+#ifndef RTE_ACC100_SINGLE_FEC
+#define RTE_ACC100_SINGLE_FEC 0
+#endif
+
+/* Values used in filling in descriptors */
+#define ACC100_DMA_DESC_TYPE           2
+#define ACC100_DMA_CODE_BLK_MODE       0
+#define ACC100_DMA_BLKID_FCW           1
+#define ACC100_DMA_BLKID_IN            2
+#define ACC100_DMA_BLKID_OUT_ENC       1
+#define ACC100_DMA_BLKID_OUT_HARD      1
+#define ACC100_DMA_BLKID_OUT_SOFT      2
+#define ACC100_DMA_BLKID_OUT_HARQ      3
+#define ACC100_DMA_BLKID_IN_HARQ       3
+
+/* Values used in filling in decode FCWs */
+#define ACC100_FCW_TD_VER              1
+#define ACC100_FCW_TD_EXT_COLD_REG_EN  1
+#define ACC100_FCW_TD_AUTOMAP          0x0f
+#define ACC100_FCW_TD_RVIDX_0          2
+#define ACC100_FCW_TD_RVIDX_1          26
+#define ACC100_FCW_TD_RVIDX_2          50
+#define ACC100_FCW_TD_RVIDX_3          74
+
+/* Values used in writing to the registers */
+#define ACC100_REG_IRQ_EN_ALL          0x1FF83FF  /* Enable all interrupts */
+
+/* ACC100 Specific Dimensioning */
+#define ACC100_SIZE_64MBYTE            (64*1024*1024)
+/* Number of elements in an Info Ring */
+#define ACC100_INFO_RING_NUM_ENTRIES   1024
+/* Number of elements in HARQ layout memory */
+#define ACC100_HARQ_LAYOUT             (64*1024*1024)
+/* Assumed offset for HARQ in memory */
+#define ACC100_HARQ_OFFSET             (32*1024)
+/* Mask used to calculate an index in an Info Ring array (not a byte offset) */
+#define ACC100_INFO_RING_MASK          (ACC100_INFO_RING_NUM_ENTRIES-1)
+/* Number of Virtual Functions ACC100 supports */
+#define ACC100_NUM_VFS                  16
+#define ACC100_NUM_QGRPS                8
+#define ACC100_NUM_QGRPS_PER_WORD       8
+#define ACC100_NUM_AQS                  16
+#define MAX_ENQ_BATCH_SIZE              255
+/* All ACC100 registers are aligned to 32 bits = 4B */
+#define ACC100_BYTES_IN_WORD                 4
+#define ACC100_MAX_E_MBUF                64000
+
+#define ACC100_GRP_ID_SHIFT    10 /* Queue Index Hierarchy */
+#define ACC100_VF_ID_SHIFT     4  /* Queue Index Hierarchy */
+#define ACC100_VF_OFFSET_QOS   16 /* offset in Memory specific to QoS Mon */
+#define ACC100_TMPL_PRI_0      0x03020100
+#define ACC100_TMPL_PRI_1      0x07060504
+#define ACC100_TMPL_PRI_2      0x0b0a0908
+#define ACC100_TMPL_PRI_3      0x0f0e0d0c
+#define ACC100_QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
+#define ACC100_WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
+
+#define ACC100_NUM_TMPL       32
+/* Mapping of signals for the available engines */
+#define ACC100_SIG_UL_5G      0
+#define ACC100_SIG_UL_5G_LAST 7
+#define ACC100_SIG_DL_5G      13
+#define ACC100_SIG_DL_5G_LAST 15
+#define ACC100_SIG_UL_4G      16
+#define ACC100_SIG_UL_4G_LAST 21
+#define ACC100_SIG_DL_4G      27
+#define ACC100_SIG_DL_4G_LAST 31
+
+/* Maximum number of attempts to allocate a memory block for all rings */
+#define ACC100_SW_RING_MEM_ALLOC_ATTEMPTS 5
+#define ACC100_MAX_QUEUE_DEPTH            1024
+#define ACC100_DMA_MAX_NUM_POINTERS       14
+#define ACC100_DMA_DESC_PADDING           8
+#define ACC100_FCW_PADDING                12
+#define ACC100_DESC_FCW_OFFSET            192
+#define ACC100_DESC_SIZE                  256
+#define ACC100_DESC_OFFSET                (ACC100_DESC_SIZE / 64)
+#define ACC100_FCW_TE_BLEN                32
+#define ACC100_FCW_TD_BLEN                24
+#define ACC100_FCW_LE_BLEN                32
+#define ACC100_FCW_LD_BLEN                36
+
+#define ACC100_FCW_VER         2
+#define ACC100_MUX_5GDL_DESC   6
+#define ACC100_CMP_ENC_SIZE    20
+#define ACC100_CMP_DEC_SIZE    24
+#define ACC100_ENC_OFFSET     (32)
+#define ACC100_DEC_OFFSET     (80)
+#define ACC100_EXT_MEM /* Default option with memory external to CPU */
+#define ACC100_HARQ_OFFSET_THRESHOLD 1024
+
+/* Constants from K0 computation from 3GPP 38.212 Table 5.4.2.1-2 */
+#define ACC100_N_ZC_1 66 /* N = 66 Zc for BG 1 */
+#define ACC100_N_ZC_2 50 /* N = 50 Zc for BG 2 */
+#define ACC100_K0_1_1 17 /* K0 fraction numerator for rv 1 and BG 1 */
+#define ACC100_K0_1_2 13 /* K0 fraction numerator for rv 1 and BG 2 */
+#define ACC100_K0_2_1 33 /* K0 fraction numerator for rv 2 and BG 1 */
+#define ACC100_K0_2_2 25 /* K0 fraction numerator for rv 2 and BG 2 */
+#define ACC100_K0_3_1 56 /* K0 fraction numerator for rv 3 and BG 1 */
+#define ACC100_K0_3_2 43 /* K0 fraction numerator for rv 3 and BG 2 */
+
+/* ACC100 Configuration */
+#define ACC100_DDR_ECC_ENABLE
+#define ACC100_CFG_DMA_ERROR    0x3D7
+#define ACC100_CFG_AXI_CACHE    0x11
+#define ACC100_CFG_QMGR_HI_P    0x0F0F
+#define ACC100_CFG_PCI_AXI      0xC003
+#define ACC100_CFG_PCI_BRIDGE   0x40006033
+#define ACC100_ENGINE_OFFSET    0x1000
+#define ACC100_RESET_HI         0x20100
+#define ACC100_RESET_LO         0x20000
+#define ACC100_RESET_HARD       0x1FF
+#define ACC100_ENGINES_MAX      9
+#define ACC100_LONG_WAIT        1000
+
+/* ACC100 DMA Descriptor triplet */
+struct acc100_dma_triplet {
+	uint64_t address;
+	uint32_t blen:20,
+		res0:4,
+		last:1,
+		dma_ext:1,
+		res1:2,
+		blkid:4;
+} __rte_packed;
+
+/* ACC100 DMA Response Descriptor */
+union acc100_dma_rsp_desc {
+	uint32_t val;
+	struct {
+		uint32_t crc_status:1,
+			synd_ok:1,
+			dma_err:1,
+			neg_stop:1,
+			fcw_err:1,
+			output_err:1,
+			input_err:1,
+			timestampEn:1,
+			iterCountFrac:8,
+			iter_cnt:8,
+			rsrvd3:6,
+			sdone:1,
+			fdone:1;
+		uint32_t add_info_0;
+		uint32_t add_info_1;
+	};
+};
+
+
+/* ACC100 Queue Manager Enqueue PCI Register */
+union acc100_enqueue_reg_fmt {
+	uint32_t val;
+	struct {
+		uint32_t num_elem:8,
+			addr_offset:3,
+			rsrvd:1,
+			req_elem_addr:20;
+	};
+};
+
+/* FEC 4G Uplink Frame Control Word */
+struct __rte_packed acc100_fcw_td {
+	uint8_t fcw_ver:4,
+		num_maps:4; /* Unused */
+	uint8_t filler:6, /* Unused */
+		rsrvd0:1,
+		bypass_sb_deint:1;
+	uint16_t k_pos;
+	uint16_t k_neg; /* Unused */
+	uint8_t c_neg; /* Unused */
+	uint8_t c; /* Unused */
+	uint32_t ea; /* Unused */
+	uint32_t eb; /* Unused */
+	uint8_t cab; /* Unused */
+	uint8_t k0_start_col; /* Unused */
+	uint8_t rsrvd1;
+	uint8_t code_block_mode:1, /* Unused */
+		turbo_crc_type:1,
+		rsrvd2:3,
+		bypass_teq:1, /* Unused */
+		soft_output_en:1, /* Unused */
+		ext_td_cold_reg_en:1;
+	union { /* External Cold register */
+		uint32_t ext_td_cold_reg;
+		struct {
+			uint32_t min_iter:4, /* Unused */
+				max_iter:4,
+				ext_scale:5, /* Unused */
+				rsrvd3:3,
+				early_stop_en:1, /* Unused */
+				sw_soft_out_dis:1, /* Unused */
+				sw_et_cont:1, /* Unused */
+				sw_soft_out_saturation:1, /* Unused */
+				half_iter_on:1, /* Unused */
+				raw_decoder_input_on:1, /* Unused */
+				rsrvd4:10;
+		};
+	};
+};
+
+/* FEC 5GNR Uplink Frame Control Word */
+struct __rte_packed acc100_fcw_ld {
+	uint32_t FCWversion:4,
+		qm:4,
+		nfiller:11,
+		BG:1,
+		Zc:9,
+		res0:1,
+		synd_precoder:1,
+		synd_post:1;
+	uint32_t ncb:16,
+		k0:16;
+	uint32_t rm_e:24,
+		hcin_en:1,
+		hcout_en:1,
+		crc_select:1,
+		bypass_dec:1,
+		bypass_intlv:1,
+		so_en:1,
+		so_bypass_rm:1,
+		so_bypass_intlv:1;
+	uint32_t hcin_offset:16,
+		hcin_size0:16;
+	uint32_t hcin_size1:16,
+		hcin_decomp_mode:3,
+		llr_pack_mode:1,
+		hcout_comp_mode:3,
+		res2:1,
+		dec_convllr:4,
+		hcout_convllr:4;
+	uint32_t itmax:7,
+		itstop:1,
+		so_it:7,
+		res3:1,
+		hcout_offset:16;
+	uint32_t hcout_size0:16,
+		hcout_size1:16;
+	uint32_t gain_i:8,
+		gain_h:8,
+		negstop_th:16;
+	uint32_t negstop_it:7,
+		negstop_en:1,
+		res4:24;
+};
+
+/* FEC 4G Downlink Frame Control Word */
+struct __rte_packed acc100_fcw_te {
+	uint16_t k_neg;
+	uint16_t k_pos;
+	uint8_t c_neg;
+	uint8_t c;
+	uint8_t filler;
+	uint8_t cab;
+	uint32_t ea:17,
+		rsrvd0:15;
+	uint32_t eb:17,
+		rsrvd1:15;
+	uint16_t ncb_neg;
+	uint16_t ncb_pos;
+	uint8_t rv_idx0:2,
+		rsrvd2:2,
+		rv_idx1:2,
+		rsrvd3:2;
+	uint8_t bypass_rv_idx0:1,
+		bypass_rv_idx1:1,
+		bypass_rm:1,
+		rsrvd4:5;
+	uint8_t rsrvd5:1,
+		rsrvd6:3,
+		code_block_crc:1,
+		rsrvd7:3;
+	uint8_t code_block_mode:1,
+		rsrvd8:7;
+	uint64_t rsrvd9;
+};
+
+/* FEC 5GNR Downlink Frame Control Word */
+struct __rte_packed acc100_fcw_le {
+	uint32_t FCWversion:4,
+		qm:4,
+		nfiller:11,
+		BG:1,
+		Zc:9,
+		res0:3;
+	uint32_t ncb:16,
+		k0:16;
+	uint32_t rm_e:24,
+		res1:2,
+		crc_select:1,
+		res2:1,
+		bypass_intlv:1,
+		res3:3;
+	uint32_t res4_a:12,
+		mcb_count:3,
+		res4_b:17;
+	uint32_t res5;
+	uint32_t res6;
+	uint32_t res7;
+	uint32_t res8;
+};
+
+/* ACC100 DMA Request Descriptor */
+struct __rte_packed acc100_dma_req_desc {
+	union {
+		struct{
+			uint32_t type:4,
+				rsrvd0:26,
+				sdone:1,
+				fdone:1;
+			uint32_t rsrvd1;
+			uint32_t rsrvd2;
+			uint32_t pass_param:8,
+				sdone_enable:1,
+				irq_enable:1,
+				timeStampEn:1,
+				res0:5,
+				numCBs:4,
+				res1:4,
+				m2dlen:4,
+				d2mlen:4;
+		};
+		struct{
+			uint32_t word0;
+			uint32_t word1;
+			uint32_t word2;
+			uint32_t word3;
+		};
+	};
+	struct acc100_dma_triplet data_ptrs[ACC100_DMA_MAX_NUM_POINTERS];
+
+	/* Virtual addresses used to retrieve SW context info */
+	union {
+		void *op_addr;
+		uint64_t pad1;  /* pad to 64 bits */
+	};
+	/*
+	 * Stores additional information needed for driver processing:
+	 * - last_desc_in_batch - flag used to mark last descriptor (CB)
+	 *                        in batch
+	 * - cbs_in_tb - stores information about total number of Code Blocks
+	 *               in currently processed Transport Block
+	 */
+	union {
+		struct {
+			union {
+				struct acc100_fcw_ld fcw_ld;
+				struct acc100_fcw_td fcw_td;
+				struct acc100_fcw_le fcw_le;
+				struct acc100_fcw_te fcw_te;
+				uint32_t pad2[ACC100_FCW_PADDING];
+			};
+			uint32_t last_desc_in_batch :8,
+				cbs_in_tb:8,
+				pad4 : 16;
+		};
+		uint64_t pad3[ACC100_DMA_DESC_PADDING]; /* pad to 64 bits */
+	};
+};
+
+/* ACC100 DMA Descriptor */
+union acc100_dma_desc {
+	struct acc100_dma_req_desc req;
+	union acc100_dma_rsp_desc rsp;
+};
+
+
+/* Union describing HARQ layout entry */
+union acc100_harq_layout_data {
+	uint32_t val;
+	struct {
+		uint16_t offset;
+		uint16_t size0;
+	};
+} __rte_packed;
+
+
+/* Union describing Info Ring entry */
+union acc100_info_ring_data {
+	uint32_t val;
+	struct {
+		union {
+			uint16_t detailed_info;
+			struct {
+				uint16_t aq_id: 4;
+				uint16_t qg_id: 4;
+				uint16_t vf_id: 6;
+				uint16_t reserved: 2;
+			};
+		};
+		uint16_t int_nb: 7;
+		uint16_t msi_0: 1;
+		uint16_t vf2pf: 6;
+		uint16_t loop: 1;
+		uint16_t valid: 1;
+	};
+} __rte_packed;
+
+struct acc100_registry_addr {
+	unsigned int dma_ring_dl5g_hi;
+	unsigned int dma_ring_dl5g_lo;
+	unsigned int dma_ring_ul5g_hi;
+	unsigned int dma_ring_ul5g_lo;
+	unsigned int dma_ring_dl4g_hi;
+	unsigned int dma_ring_dl4g_lo;
+	unsigned int dma_ring_ul4g_hi;
+	unsigned int dma_ring_ul4g_lo;
+	unsigned int ring_size;
+	unsigned int info_ring_hi;
+	unsigned int info_ring_lo;
+	unsigned int info_ring_en;
+	unsigned int info_ring_ptr;
+	unsigned int tail_ptrs_dl5g_hi;
+	unsigned int tail_ptrs_dl5g_lo;
+	unsigned int tail_ptrs_ul5g_hi;
+	unsigned int tail_ptrs_ul5g_lo;
+	unsigned int tail_ptrs_dl4g_hi;
+	unsigned int tail_ptrs_dl4g_lo;
+	unsigned int tail_ptrs_ul4g_hi;
+	unsigned int tail_ptrs_ul4g_lo;
+	unsigned int depth_log0_offset;
+	unsigned int depth_log1_offset;
+	unsigned int qman_group_func;
+	unsigned int ddr_range;
+};
+
+/* Structure holding registry addresses for PF */
+static const struct acc100_registry_addr pf_reg_addr = {
+	.dma_ring_dl5g_hi = HWPfDmaFec5GdlDescBaseHiRegVf,
+	.dma_ring_dl5g_lo = HWPfDmaFec5GdlDescBaseLoRegVf,
+	.dma_ring_ul5g_hi = HWPfDmaFec5GulDescBaseHiRegVf,
+	.dma_ring_ul5g_lo = HWPfDmaFec5GulDescBaseLoRegVf,
+	.dma_ring_dl4g_hi = HWPfDmaFec4GdlDescBaseHiRegVf,
+	.dma_ring_dl4g_lo = HWPfDmaFec4GdlDescBaseLoRegVf,
+	.dma_ring_ul4g_hi = HWPfDmaFec4GulDescBaseHiRegVf,
+	.dma_ring_ul4g_lo = HWPfDmaFec4GulDescBaseLoRegVf,
+	.ring_size = HWPfQmgrRingSizeVf,
+	.info_ring_hi = HWPfHiInfoRingBaseHiRegPf,
+	.info_ring_lo = HWPfHiInfoRingBaseLoRegPf,
+	.info_ring_en = HWPfHiInfoRingIntWrEnRegPf,
+	.info_ring_ptr = HWPfHiInfoRingPointerRegPf,
+	.tail_ptrs_dl5g_hi = HWPfDmaFec5GdlRespPtrHiRegVf,
+	.tail_ptrs_dl5g_lo = HWPfDmaFec5GdlRespPtrLoRegVf,
+	.tail_ptrs_ul5g_hi = HWPfDmaFec5GulRespPtrHiRegVf,
+	.tail_ptrs_ul5g_lo = HWPfDmaFec5GulRespPtrLoRegVf,
+	.tail_ptrs_dl4g_hi = HWPfDmaFec4GdlRespPtrHiRegVf,
+	.tail_ptrs_dl4g_lo = HWPfDmaFec4GdlRespPtrLoRegVf,
+	.tail_ptrs_ul4g_hi = HWPfDmaFec4GulRespPtrHiRegVf,
+	.tail_ptrs_ul4g_lo = HWPfDmaFec4GulRespPtrLoRegVf,
+	.depth_log0_offset = HWPfQmgrGrpDepthLog20Vf,
+	.depth_log1_offset = HWPfQmgrGrpDepthLog21Vf,
+	.qman_group_func = HWPfQmgrGrpFunction0,
+	.ddr_range = HWPfDmaVfDdrBaseRw,
+};
+
+/* Structure holding registry addresses for VF */
+static const struct acc100_registry_addr vf_reg_addr = {
+	.dma_ring_dl5g_hi = HWVfDmaFec5GdlDescBaseHiRegVf,
+	.dma_ring_dl5g_lo = HWVfDmaFec5GdlDescBaseLoRegVf,
+	.dma_ring_ul5g_hi = HWVfDmaFec5GulDescBaseHiRegVf,
+	.dma_ring_ul5g_lo = HWVfDmaFec5GulDescBaseLoRegVf,
+	.dma_ring_dl4g_hi = HWVfDmaFec4GdlDescBaseHiRegVf,
+	.dma_ring_dl4g_lo = HWVfDmaFec4GdlDescBaseLoRegVf,
+	.dma_ring_ul4g_hi = HWVfDmaFec4GulDescBaseHiRegVf,
+	.dma_ring_ul4g_lo = HWVfDmaFec4GulDescBaseLoRegVf,
+	.ring_size = HWVfQmgrRingSizeVf,
+	.info_ring_hi = HWVfHiInfoRingBaseHiVf,
+	.info_ring_lo = HWVfHiInfoRingBaseLoVf,
+	.info_ring_en = HWVfHiInfoRingIntWrEnVf,
+	.info_ring_ptr = HWVfHiInfoRingPointerVf,
+	.tail_ptrs_dl5g_hi = HWVfDmaFec5GdlRespPtrHiRegVf,
+	.tail_ptrs_dl5g_lo = HWVfDmaFec5GdlRespPtrLoRegVf,
+	.tail_ptrs_ul5g_hi = HWVfDmaFec5GulRespPtrHiRegVf,
+	.tail_ptrs_ul5g_lo = HWVfDmaFec5GulRespPtrLoRegVf,
+	.tail_ptrs_dl4g_hi = HWVfDmaFec4GdlRespPtrHiRegVf,
+	.tail_ptrs_dl4g_lo = HWVfDmaFec4GdlRespPtrLoRegVf,
+	.tail_ptrs_ul4g_hi = HWVfDmaFec4GulRespPtrHiRegVf,
+	.tail_ptrs_ul4g_lo = HWVfDmaFec4GulRespPtrLoRegVf,
+	.depth_log0_offset = HWVfQmgrGrpDepthLog20Vf,
+	.depth_log1_offset = HWVfQmgrGrpDepthLog21Vf,
+	.qman_group_func = HWVfQmgrGrpFunction0Vf,
+	.ddr_range = HWVfDmaDdrBaseRangeRoVf,
+};
+
 /* Private data structure for each ACC100 device */
 struct acc100_device {
 	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v11 03/10] baseband/acc100: add info get function
  2020-10-02  1:01   ` [dpdk-dev] [PATCH v11 00/10] bbdev PMD ACC100 Nicolas Chautru
  2020-10-02  1:01     ` [dpdk-dev] [PATCH v11 01/10] drivers/baseband: add PMD for ACC100 Nicolas Chautru
  2020-10-02  1:01     ` [dpdk-dev] [PATCH v11 02/10] baseband/acc100: add register definition file Nicolas Chautru
@ 2020-10-02  1:01     ` Nicolas Chautru
  2020-10-04 16:09       ` Tom Rix
  2020-10-02  1:01     ` [dpdk-dev] [PATCH v11 04/10] baseband/acc100: add queue configuration Nicolas Chautru
                       ` (6 subsequent siblings)
  9 siblings, 1 reply; 213+ messages in thread
From: Nicolas Chautru @ 2020-10-02  1:01 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, trix, maxime.coquelin, ferruh.yigit,
	tianjiao.liu, Nicolas Chautru

Add the "info_get" function to the driver, allowing the device to be
queried.
No processing capabilities are exposed yet.
Link bbdev-test to support the PMD with null capability.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 app/test-bbdev/meson.build               |   3 +
 drivers/baseband/acc100/rte_acc100_cfg.h |  96 +++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.c | 229 +++++++++++++++++++++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.h |  10 ++
 4 files changed, 338 insertions(+)
 create mode 100644 drivers/baseband/acc100/rte_acc100_cfg.h

diff --git a/app/test-bbdev/meson.build b/app/test-bbdev/meson.build
index 18ab6a8..fbd8ae3 100644
--- a/app/test-bbdev/meson.build
+++ b/app/test-bbdev/meson.build
@@ -12,3 +12,6 @@ endif
 if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC')
 	deps += ['pmd_bbdev_fpga_5gnr_fec']
 endif
+if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_ACC100')
+	deps += ['pmd_bbdev_acc100']
+endif
\ No newline at end of file
diff --git a/drivers/baseband/acc100/rte_acc100_cfg.h b/drivers/baseband/acc100/rte_acc100_cfg.h
new file mode 100644
index 0000000..a1d43ef
--- /dev/null
+++ b/drivers/baseband/acc100/rte_acc100_cfg.h
@@ -0,0 +1,96 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_ACC100_CFG_H_
+#define _RTE_ACC100_CFG_H_
+
+/**
+ * @file rte_acc100_cfg.h
+ *
+ * Functions for configuring ACC100 HW, exposed directly to applications.
+ * Configuration related to encoding/decoding is done through the
+ * librte_bbdev library.
+ *
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ */
+
+#include <stdint.h>
+#include <stdbool.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+/** Number of Virtual Functions ACC100 supports */
+#define RTE_ACC100_NUM_VFS 16
+
+/**
+ * Definition of Queue Topology for ACC100 Configuration
+ * Some level of detail is abstracted out to expose a clean interface,
+ * given that comprehensive flexibility is not required.
+ */
+struct rte_acc100_queue_topology {
+	/** Number of QGroups in increasing order of priority */
+	uint16_t num_qgroups;
+	/**
+	 * All QGroups have the same number of AQs here.
+	 * Note: Could be made a 16-array if more flexibility is really
+	 * required
+	 */
+	uint16_t num_aqs_per_groups;
+	/**
+	 * Depth of the AQs is the same for all QGroups here. Log2 Enum: 2^N
+	 * Note: Could be made a 16-array if more flexibility is really
+	 * required
+	 */
+	uint16_t aq_depth_log2;
+	/**
+	 * Index of the first Queue Group - assuming contiguity
+	 * Initialized as -1
+	 */
+	int8_t first_qgroup_index;
+};
+
+/**
+ * Definition of Arbitration related parameters for ACC100 Configuration
+ */
+struct rte_acc100_arbitration {
+	/** Default Weight for VF Fairness Arbitration */
+	uint16_t round_robin_weight;
+	uint32_t gbr_threshold1; /**< Guaranteed Bitrate Threshold 1 */
+	uint32_t gbr_threshold2; /**< Guaranteed Bitrate Threshold 2 */
+};
+
+/**
+ * Structure to pass ACC100 configuration.
+ * Note: all VF Bundles will have the same configuration.
+ */
+struct rte_acc100_conf {
+	bool pf_mode_en; /**< 1 if PF is used for dataplane, 0 for VFs */
+	/** 1 if input '1' bit is represented by a positive LLR value, 0 if '1'
+	 * bit is represented by a negative value.
+	 */
+	bool input_pos_llr_1_bit;
+	/** 1 if output '1' bit is represented by a positive value, 0 if '1'
+	 * bit is represented by a negative value.
+	 */
+	bool output_pos_llr_1_bit;
+	uint16_t num_vf_bundles; /**< Number of VF bundles to setup */
+	/** Queue topology for each operation type */
+	struct rte_acc100_queue_topology q_ul_4g;
+	struct rte_acc100_queue_topology q_dl_4g;
+	struct rte_acc100_queue_topology q_ul_5g;
+	struct rte_acc100_queue_topology q_dl_5g;
+	/** Arbitration configuration for each operation type */
+	struct rte_acc100_arbitration arb_ul_4g[RTE_ACC100_NUM_VFS];
+	struct rte_acc100_arbitration arb_dl_4g[RTE_ACC100_NUM_VFS];
+	struct rte_acc100_arbitration arb_ul_5g[RTE_ACC100_NUM_VFS];
+	struct rte_acc100_arbitration arb_dl_5g[RTE_ACC100_NUM_VFS];
+};
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_ACC100_CFG_H_ */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 1b4cd13..fcba77e 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -26,6 +26,188 @@
 RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
 #endif
 
+/* Read a register of an ACC100 device */
+static inline uint32_t
+acc100_reg_read(struct acc100_device *d, uint32_t offset)
+{
+
+	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
+	uint32_t ret = *((volatile uint32_t *)(reg_addr));
+	return rte_le_to_cpu_32(ret);
+}
+
+/* Calculate the offset of the enqueue register */
+static inline uint32_t
+queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
+{
+	if (pf_device)
+		return ((vf_id << 12) + (qgrp_id << 7) + (aq_id << 3) +
+				HWPfQmgrIngressAq);
+	else
+		return ((qgrp_id << 7) + (aq_id << 3) +
+				HWVfQmgrIngressAq);
+}
+
+enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC};
+
+/* Return the queue topology for a Queue Group Index */
+static inline void
+qtopFromAcc(struct rte_acc100_queue_topology **qtop, int acc_enum,
+		struct rte_acc100_conf *acc100_conf)
+{
+	struct rte_acc100_queue_topology *p_qtop;
+	p_qtop = NULL;
+	switch (acc_enum) {
+	case UL_4G:
+		p_qtop = &(acc100_conf->q_ul_4g);
+		break;
+	case UL_5G:
+		p_qtop = &(acc100_conf->q_ul_5g);
+		break;
+	case DL_4G:
+		p_qtop = &(acc100_conf->q_dl_4g);
+		break;
+	case DL_5G:
+		p_qtop = &(acc100_conf->q_dl_5g);
+		break;
+	default:
+		/* NOTREACHED */
+		rte_bbdev_log(ERR, "Unexpected error evaluating qtopFromAcc");
+		break;
+	}
+	*qtop = p_qtop;
+}
+
+static void
+initQTop(struct rte_acc100_conf *acc100_conf)
+{
+	acc100_conf->q_ul_4g.num_aqs_per_groups = 0;
+	acc100_conf->q_ul_4g.num_qgroups = 0;
+	acc100_conf->q_ul_4g.first_qgroup_index = -1;
+	acc100_conf->q_ul_5g.num_aqs_per_groups = 0;
+	acc100_conf->q_ul_5g.num_qgroups = 0;
+	acc100_conf->q_ul_5g.first_qgroup_index = -1;
+	acc100_conf->q_dl_4g.num_aqs_per_groups = 0;
+	acc100_conf->q_dl_4g.num_qgroups = 0;
+	acc100_conf->q_dl_4g.first_qgroup_index = -1;
+	acc100_conf->q_dl_5g.num_aqs_per_groups = 0;
+	acc100_conf->q_dl_5g.num_qgroups = 0;
+	acc100_conf->q_dl_5g.first_qgroup_index = -1;
+}
+
+static inline void
+updateQtop(uint8_t acc, uint8_t qg, struct rte_acc100_conf *acc100_conf,
+		struct acc100_device *d) {
+	uint32_t reg;
+	struct rte_acc100_queue_topology *q_top = NULL;
+	qtopFromAcc(&q_top, acc, acc100_conf);
+	if (unlikely(q_top == NULL))
+		return;
+	uint16_t aq;
+	q_top->num_qgroups++;
+	if (q_top->first_qgroup_index == -1) {
+		q_top->first_qgroup_index = qg;
+		/* Can be optimized to assume all are enabled by default */
+		reg = acc100_reg_read(d, queue_offset(d->pf_device,
+				0, qg, ACC100_NUM_AQS - 1));
+		if (reg & ACC100_QUEUE_ENABLE) {
+			q_top->num_aqs_per_groups = ACC100_NUM_AQS;
+			return;
+		}
+		q_top->num_aqs_per_groups = 0;
+		for (aq = 0; aq < ACC100_NUM_AQS; aq++) {
+			reg = acc100_reg_read(d, queue_offset(d->pf_device,
+					0, qg, aq));
+			if (reg & ACC100_QUEUE_ENABLE)
+				q_top->num_aqs_per_groups++;
+		}
+	}
+}
+
+/* Fetch configuration enabled for the PF/VF using MMIO Read (slow) */
+static inline void
+fetch_acc100_config(struct rte_bbdev *dev)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	struct rte_acc100_conf *acc100_conf = &d->acc100_conf;
+	const struct acc100_registry_addr *reg_addr;
+	uint8_t acc, qg;
+	uint32_t reg, reg_aq, reg_len0, reg_len1;
+	uint32_t reg_mode;
+
+	/* No need to retrieve the configuration if it is already done */
+	if (d->configured)
+		return;
+
+	/* Choose correct registry addresses for the device type */
+	if (d->pf_device)
+		reg_addr = &pf_reg_addr;
+	else
+		reg_addr = &vf_reg_addr;
+
+	d->ddr_size = (1 + acc100_reg_read(d, reg_addr->ddr_range)) << 10;
+
+	/* Single VF Bundle by VF */
+	acc100_conf->num_vf_bundles = 1;
+	initQTop(acc100_conf);
+
+	struct rte_acc100_queue_topology *q_top = NULL;
+	int qman_func_id[ACC100_NUM_ACCS] = {ACC100_ACCMAP_0, ACC100_ACCMAP_1,
+			ACC100_ACCMAP_2, ACC100_ACCMAP_3, ACC100_ACCMAP_4};
+	reg = acc100_reg_read(d, reg_addr->qman_group_func);
+	for (qg = 0; qg < ACC100_NUM_QGRPS_PER_WORD; qg++) {
+		reg_aq = acc100_reg_read(d,
+				queue_offset(d->pf_device, 0, qg, 0));
+		if (reg_aq & ACC100_QUEUE_ENABLE) {
+			uint32_t idx = (reg >> (qg * 4)) & 0x7;
+			if (idx >= ACC100_NUM_ACCS)
+				break;
+			acc = qman_func_id[idx];
+			updateQtop(acc, qg, acc100_conf, d);
+		}
+	}
+
+	/* Check the depth of the AQs */
+	reg_len0 = acc100_reg_read(d, reg_addr->depth_log0_offset);
+	reg_len1 = acc100_reg_read(d, reg_addr->depth_log1_offset);
+	for (acc = 0; acc < NUM_ACC; acc++) {
+		qtopFromAcc(&q_top, acc, acc100_conf);
+		if (q_top->first_qgroup_index < ACC100_NUM_QGRPS_PER_WORD)
+			q_top->aq_depth_log2 = (reg_len0 >>
+					(q_top->first_qgroup_index * 4))
+					& 0xF;
+		else
+			q_top->aq_depth_log2 = (reg_len1 >>
+					((q_top->first_qgroup_index -
+					ACC100_NUM_QGRPS_PER_WORD) * 4))
+					& 0xF;
+	}
+
+	/* Read PF mode */
+	if (d->pf_device) {
+		reg_mode = acc100_reg_read(d, HWPfHiPfMode);
+		acc100_conf->pf_mode_en = (reg_mode == ACC100_PF_VAL) ? 1 : 0;
+	}
+
+	rte_bbdev_log_debug(
+			"%s Config LLR SIGN IN/OUT %s %s QG %u %u %u %u AQ %u %u %u %u Len %u %u %u %u\n",
+			(d->pf_device) ? "PF" : "VF",
+			(acc100_conf->input_pos_llr_1_bit) ? "POS" : "NEG",
+			(acc100_conf->output_pos_llr_1_bit) ? "POS" : "NEG",
+			acc100_conf->q_ul_4g.num_qgroups,
+			acc100_conf->q_dl_4g.num_qgroups,
+			acc100_conf->q_ul_5g.num_qgroups,
+			acc100_conf->q_dl_5g.num_qgroups,
+			acc100_conf->q_ul_4g.num_aqs_per_groups,
+			acc100_conf->q_dl_4g.num_aqs_per_groups,
+			acc100_conf->q_ul_5g.num_aqs_per_groups,
+			acc100_conf->q_dl_5g.num_aqs_per_groups,
+			acc100_conf->q_ul_4g.aq_depth_log2,
+			acc100_conf->q_dl_4g.aq_depth_log2,
+			acc100_conf->q_ul_5g.aq_depth_log2,
+			acc100_conf->q_dl_5g.aq_depth_log2);
+}
+
 /* Free 64MB memory used for software rings */
 static int
 acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
@@ -33,8 +215,55 @@
 	return 0;
 }
 
+/* Get ACC100 device info */
+static void
+acc100_dev_info_get(struct rte_bbdev *dev,
+		struct rte_bbdev_driver_info *dev_info)
+{
+	struct acc100_device *d = dev->data->dev_private;
+
+	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
+		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
+	};
+
+	static struct rte_bbdev_queue_conf default_queue_conf;
+	default_queue_conf.socket = dev->data->socket_id;
+	default_queue_conf.queue_size = ACC100_MAX_QUEUE_DEPTH;
+
+	dev_info->driver_name = dev->device->driver->name;
+
+	/* Read and save the populated config from ACC100 registers */
+	fetch_acc100_config(dev);
+
+	/* This isn't ideal because it reports the maximum number of queues but
+	 * does not provide info on how many can be uplink/downlink or at
+	 * different priorities
+	 */
+	dev_info->max_num_queues =
+			d->acc100_conf.q_dl_5g.num_aqs_per_groups *
+			d->acc100_conf.q_dl_5g.num_qgroups +
+			d->acc100_conf.q_ul_5g.num_aqs_per_groups *
+			d->acc100_conf.q_ul_5g.num_qgroups +
+			d->acc100_conf.q_dl_4g.num_aqs_per_groups *
+			d->acc100_conf.q_dl_4g.num_qgroups +
+			d->acc100_conf.q_ul_4g.num_aqs_per_groups *
+			d->acc100_conf.q_ul_4g.num_qgroups;
+	dev_info->queue_size_lim = ACC100_MAX_QUEUE_DEPTH;
+	dev_info->hardware_accelerated = true;
+	dev_info->max_dl_queue_priority =
+			d->acc100_conf.q_dl_4g.num_qgroups - 1;
+	dev_info->max_ul_queue_priority =
+			d->acc100_conf.q_ul_4g.num_qgroups - 1;
+	dev_info->default_queue_conf = default_queue_conf;
+	dev_info->cpu_flag_reqs = NULL;
+	dev_info->min_alignment = 64;
+	dev_info->capabilities = bbdev_capabilities;
+	dev_info->harq_buffer_size = d->ddr_size;
+}
+
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.close = acc100_dev_close,
+	.info_get = acc100_dev_info_get,
 };
 
 /* ACC100 PCI PF address map */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 6525d66..09965c8 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -7,6 +7,7 @@
 
 #include "acc100_pf_enum.h"
 #include "acc100_vf_enum.h"
+#include "rte_acc100_cfg.h"
 
 /* Helper macro for logging */
 #define rte_bbdev_log(level, fmt, ...) \
@@ -98,6 +99,13 @@
 #define ACC100_SIG_UL_4G_LAST 21
 #define ACC100_SIG_DL_4G      27
 #define ACC100_SIG_DL_4G_LAST 31
+#define ACC100_NUM_ACCS       5
+#define ACC100_ACCMAP_0       0
+#define ACC100_ACCMAP_1       2
+#define ACC100_ACCMAP_2       1
+#define ACC100_ACCMAP_3       3
+#define ACC100_ACCMAP_4       4
+#define ACC100_PF_VAL         2
 
 /* max number of iterations to allocate memory block for all rings */
 #define ACC100_SW_RING_MEM_ALLOC_ATTEMPTS 5
@@ -517,6 +525,8 @@ struct acc100_registry_addr {
 /* Private data structure for each ACC100 device */
 struct acc100_device {
 	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
+	uint32_t ddr_size; /* Size in kB */
+	struct rte_acc100_conf acc100_conf; /* ACC100 Initial configuration */
 	bool pf_device; /**< True if this is a PF ACC100 device */
 	bool configured; /**< True if this ACC100 device is configured */
 };
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v11 04/10] baseband/acc100: add queue configuration
  2020-10-02  1:01   ` [dpdk-dev] [PATCH v11 00/10] bbdev PMD ACC100 Nicolas Chautru
                       ` (2 preceding siblings ...)
  2020-10-02  1:01     ` [dpdk-dev] [PATCH v11 03/10] baseband/acc100: add info get function Nicolas Chautru
@ 2020-10-02  1:01     ` Nicolas Chautru
  2020-10-04 16:18       ` Tom Rix
  2020-10-02  1:01     ` [dpdk-dev] [PATCH v11 05/10] baseband/acc100: add LDPC processing functions Nicolas Chautru
                       ` (5 subsequent siblings)
  9 siblings, 1 reply; 213+ messages in thread
From: Nicolas Chautru @ 2020-10-02  1:01 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, trix, maxime.coquelin, ferruh.yigit,
	tianjiao.liu, Nicolas Chautru

Adding functions to create and configure queues for
the device. No capabilities are exposed yet.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 445 ++++++++++++++++++++++++++++++-
 drivers/baseband/acc100/rte_acc100_pmd.h |  45 ++++
 2 files changed, 488 insertions(+), 2 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index fcba77e..f2bf2b5 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -26,6 +26,22 @@
 RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
 #endif
 
+/* Write to MMIO register address */
+static inline void
+mmio_write(void *addr, uint32_t value)
+{
+	*((volatile uint32_t *)(addr)) = rte_cpu_to_le_32(value);
+}
+
+/* Write a register of an ACC100 device */
+static inline void
+acc100_reg_write(struct acc100_device *d, uint32_t offset, uint32_t payload)
+{
+	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
+	mmio_write(reg_addr, payload);
+	usleep(ACC100_LONG_WAIT);
+}
+
 /* Read a register of an ACC100 device */
 static inline uint32_t
 acc100_reg_read(struct acc100_device *d, uint32_t offset)
@@ -36,6 +52,22 @@
 	return rte_le_to_cpu_32(ret);
 }
 
+/* Basic Implementation of Log2 for exact 2^N */
+static inline uint32_t
+log2_basic(uint32_t value)
+{
+	return (value == 0) ? 0 : rte_bsf32(value);
+}
+
+/* Calculate memory alignment offset assuming alignment is 2^N */
+static inline uint32_t
+calc_mem_alignment_offset(void *unaligned_virt_mem, uint32_t alignment)
+{
+	rte_iova_t unaligned_phy_mem = rte_malloc_virt2iova(unaligned_virt_mem);
+	return (uint32_t)(alignment -
+			(unaligned_phy_mem & (alignment-1)));
+}
+
 /* Calculate the offset of the enqueue register */
 static inline uint32_t
 queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
@@ -208,10 +240,416 @@
 			acc100_conf->q_dl_5g.aq_depth_log2);
 }
 
-/* Free 64MB memory used for software rings */
+static void
+free_base_addresses(void **base_addrs, int size)
+{
+	int i;
+	for (i = 0; i < size; i++)
+		rte_free(base_addrs[i]);
+}
+
+static inline uint32_t
+get_desc_len(void)
+{
+	return sizeof(union acc100_dma_desc);
+}
+
+/* Allocate the 2 * 64MB block for the sw rings */
 static int
-acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
+alloc_2x64mb_sw_rings_mem(struct rte_bbdev *dev, struct acc100_device *d,
+		int socket)
 {
+	uint32_t sw_ring_size = ACC100_SIZE_64MBYTE;
+	d->sw_rings_base = rte_zmalloc_socket(dev->device->driver->name,
+			2 * sw_ring_size, RTE_CACHE_LINE_SIZE, socket);
+	if (d->sw_rings_base == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate memory for %s:%u",
+				dev->device->driver->name,
+				dev->data->dev_id);
+		return -ENOMEM;
+	}
+	uint32_t next_64mb_align_offset = calc_mem_alignment_offset(
+			d->sw_rings_base, ACC100_SIZE_64MBYTE);
+	d->sw_rings = RTE_PTR_ADD(d->sw_rings_base, next_64mb_align_offset);
+	d->sw_rings_iova = rte_malloc_virt2iova(d->sw_rings_base) +
+			next_64mb_align_offset;
+	d->sw_ring_size = ACC100_MAX_QUEUE_DEPTH * get_desc_len();
+	d->sw_ring_max_depth = ACC100_MAX_QUEUE_DEPTH;
+
+	return 0;
+}
+
+/* Attempt to allocate minimised memory space for sw rings */
+static void
+alloc_sw_rings_min_mem(struct rte_bbdev *dev, struct acc100_device *d,
+		uint16_t num_queues, int socket)
+{
+	rte_iova_t sw_rings_base_iova, next_64mb_align_addr_iova;
+	uint32_t next_64mb_align_offset;
+	rte_iova_t sw_ring_iova_end_addr;
+	void *base_addrs[ACC100_SW_RING_MEM_ALLOC_ATTEMPTS];
+	void *sw_rings_base;
+	int i = 0;
+	uint32_t q_sw_ring_size = ACC100_MAX_QUEUE_DEPTH * get_desc_len();
+	uint32_t dev_sw_ring_size = q_sw_ring_size * num_queues;
+
+	/* Find an aligned block of memory to store sw rings */
+	while (i < ACC100_SW_RING_MEM_ALLOC_ATTEMPTS) {
+		/*
+		 * sw_ring allocated memory is guaranteed to be aligned to
+		 * q_sw_ring_size on the condition that the requested size is
+		 * less than the page size
+		 */
+		sw_rings_base = rte_zmalloc_socket(
+				dev->device->driver->name,
+				dev_sw_ring_size, q_sw_ring_size, socket);
+
+		if (sw_rings_base == NULL) {
+			rte_bbdev_log(ERR,
+					"Failed to allocate memory for %s:%u",
+					dev->device->driver->name,
+					dev->data->dev_id);
+			break;
+		}
+
+		sw_rings_base_iova = rte_malloc_virt2iova(sw_rings_base);
+		next_64mb_align_offset = calc_mem_alignment_offset(
+				sw_rings_base, ACC100_SIZE_64MBYTE);
+		next_64mb_align_addr_iova = sw_rings_base_iova +
+				next_64mb_align_offset;
+		sw_ring_iova_end_addr = sw_rings_base_iova + dev_sw_ring_size;
+
+		/* Check if the end of the sw ring memory block is before the
+		 * start of next 64MB aligned mem address
+		 */
+		if (sw_ring_iova_end_addr < next_64mb_align_addr_iova) {
+			d->sw_rings_iova = sw_rings_base_iova;
+			d->sw_rings = sw_rings_base;
+			d->sw_rings_base = sw_rings_base;
+			d->sw_ring_size = q_sw_ring_size;
+			d->sw_ring_max_depth = ACC100_MAX_QUEUE_DEPTH;
+			break;
+		}
+		/* Store the address of the unaligned mem block */
+		base_addrs[i] = sw_rings_base;
+		i++;
+	}
+
+	/* Free all unaligned blocks of mem allocated in the loop */
+	free_base_addresses(base_addrs, i);
+}
+
+
+/* Allocate 64MB memory used for all software rings */
+static int
+acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
+{
+	uint32_t phys_low, phys_high, payload;
+	struct acc100_device *d = dev->data->dev_private;
+	const struct acc100_registry_addr *reg_addr;
+
+	if (d->pf_device && !d->acc100_conf.pf_mode_en) {
+		rte_bbdev_log(NOTICE,
+				"%s has PF mode disabled. This PF can't be used.",
+				dev->data->name);
+		return -ENODEV;
+	}
+
+	alloc_sw_rings_min_mem(dev, d, num_queues, socket_id);
+
+	/* If minimal memory space approach failed, then allocate
+	 * the 2 * 64MB block for the sw rings
+	 */
+	if (d->sw_rings == NULL)
+		alloc_2x64mb_sw_rings_mem(dev, d, socket_id);
+
+	if (d->sw_rings == NULL) {
+		rte_bbdev_log(NOTICE,
+				"Failure allocating sw_rings memory");
+		return -ENODEV;
+	}
+
+	/* Configure ACC100 with the base address for DMA descriptor rings
+	 * Same descriptor rings used for UL and DL DMA Engines
+	 * Note : Assuming only VF0 bundle is used for PF mode
+	 */
+	phys_high = (uint32_t)(d->sw_rings_iova >> 32);
+	phys_low  = (uint32_t)(d->sw_rings_iova & ~(ACC100_SIZE_64MBYTE-1));
+
+	/* Choose correct registry addresses for the device type */
+	if (d->pf_device)
+		reg_addr = &pf_reg_addr;
+	else
+		reg_addr = &vf_reg_addr;
+
+	/* Read the populated cfg from ACC100 registers */
+	fetch_acc100_config(dev);
+
+	/* Release AXI from PF */
+	if (d->pf_device)
+		acc100_reg_write(d, HWPfDmaAxiControl, 1);
+
+	acc100_reg_write(d, reg_addr->dma_ring_ul5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_ul5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->dma_ring_dl5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_dl5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->dma_ring_ul4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_ul4g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->dma_ring_dl4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_dl4g_lo, phys_low);
+
+	/*
+	 * Configure Ring Size to the max queue ring size
+	 * (used for wrapping purposes)
+	 */
+	payload = log2_basic(d->sw_ring_size / 64);
+	acc100_reg_write(d, reg_addr->ring_size, payload);
+
+	/* Configure tail pointer for use when SDONE enabled */
+	d->tail_ptrs = rte_zmalloc_socket(
+			dev->device->driver->name,
+			ACC100_NUM_QGRPS * ACC100_NUM_AQS * sizeof(uint32_t),
+			RTE_CACHE_LINE_SIZE, socket_id);
+	if (d->tail_ptrs == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate tail ptr for %s:%u",
+				dev->device->driver->name,
+				dev->data->dev_id);
+		rte_free(d->sw_rings);
+		return -ENOMEM;
+	}
+	d->tail_ptr_iova = rte_malloc_virt2iova(d->tail_ptrs);
+
+	phys_high = (uint32_t)(d->tail_ptr_iova >> 32);
+	phys_low  = (uint32_t)(d->tail_ptr_iova);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
+
+	d->harq_layout = rte_zmalloc_socket("HARQ Layout",
+			ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
+			RTE_CACHE_LINE_SIZE, dev->data->socket_id);
+	if (d->harq_layout == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate harq_layout for %s:%u",
+				dev->device->driver->name,
+				dev->data->dev_id);
+		rte_free(d->sw_rings);
+		return -ENOMEM;
+	}
+
+	/* Mark as configured properly */
+	d->configured = true;
+
+	rte_bbdev_log_debug(
+			"ACC100 (%s) configured sw_rings = %p, sw_rings_iova = %#"
+			PRIx64, dev->data->name, d->sw_rings, d->sw_rings_iova);
+
+	return 0;
+}
+
+/* Free memory used for software rings */
+static int
+acc100_dev_close(struct rte_bbdev *dev)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	if (d->sw_rings_base != NULL) {
+		rte_free(d->tail_ptrs);
+		rte_free(d->sw_rings_base);
+		d->sw_rings_base = NULL;
+	}
+	/* Ensure all in flight HW transactions are completed */
+	usleep(ACC100_LONG_WAIT);
+	return 0;
+}
+
+
+/**
+ * Report an ACC100 queue index which is free
+ * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
+ * Note : Only supporting VF0 Bundle for PF mode
+ */
+static int
+acc100_find_free_queue_idx(struct rte_bbdev *dev,
+		const struct rte_bbdev_queue_conf *conf)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	int op_2_acc[5] = {0, UL_4G, DL_4G, UL_5G, DL_5G};
+	int acc = op_2_acc[conf->op_type];
+	struct rte_acc100_queue_topology *qtop = NULL;
+
+	qtopFromAcc(&qtop, acc, &(d->acc100_conf));
+	if (qtop == NULL)
+		return -1;
+	/* Identify matching QGroup Index which are sorted in priority order */
+	uint16_t group_idx = qtop->first_qgroup_index;
+	group_idx += conf->priority;
+	if (group_idx >= ACC100_NUM_QGRPS ||
+			conf->priority >= qtop->num_qgroups) {
+		rte_bbdev_log(INFO, "Invalid Priority on %s, priority %u",
+				dev->data->name, conf->priority);
+		return -1;
+	}
+	/* Find a free AQ_idx  */
+	uint16_t aq_idx;
+	for (aq_idx = 0; aq_idx < qtop->num_aqs_per_groups; aq_idx++) {
+		if (((d->q_assigned_bit_map[group_idx] >> aq_idx) & 0x1) == 0) {
+			/* Mark the Queue as assigned */
+			d->q_assigned_bit_map[group_idx] |= (1 << aq_idx);
+			/* Report the AQ Index */
+			return (group_idx << ACC100_GRP_ID_SHIFT) + aq_idx;
+		}
+	}
+	rte_bbdev_log(INFO, "Failed to find free queue on %s, priority %u",
+			dev->data->name, conf->priority);
+	return -1;
+}
+
+/* Setup ACC100 queue */
+static int
+acc100_queue_setup(struct rte_bbdev *dev, uint16_t queue_id,
+		const struct rte_bbdev_queue_conf *conf)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	struct acc100_queue *q;
+	int16_t q_idx;
+
+	/* Allocate the queue data structure. */
+	q = rte_zmalloc_socket(dev->device->driver->name, sizeof(*q),
+			RTE_CACHE_LINE_SIZE, conf->socket);
+	if (q == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate queue memory");
+		return -ENOMEM;
+	}
+	if (d == NULL) {
+		rte_bbdev_log(ERR, "Undefined device");
+		rte_free(q);
+		return -ENODEV;
+	}
+
+	q->d = d;
+	q->ring_addr = RTE_PTR_ADD(d->sw_rings, (d->sw_ring_size * queue_id));
+	q->ring_addr_iova = d->sw_rings_iova + (d->sw_ring_size * queue_id);
+
+	/* Prepare the Ring with default descriptor format */
+	union acc100_dma_desc *desc = NULL;
+	unsigned int desc_idx, b_idx;
+	int fcw_len = (conf->op_type == RTE_BBDEV_OP_LDPC_ENC ?
+		ACC100_FCW_LE_BLEN : (conf->op_type == RTE_BBDEV_OP_TURBO_DEC ?
+		ACC100_FCW_TD_BLEN : ACC100_FCW_LD_BLEN));
+
+	for (desc_idx = 0; desc_idx < d->sw_ring_max_depth; desc_idx++) {
+		desc = q->ring_addr + desc_idx;
+		desc->req.word0 = ACC100_DMA_DESC_TYPE;
+		desc->req.word1 = 0; /**< Timestamp */
+		desc->req.word2 = 0;
+		desc->req.word3 = 0;
+		uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+		desc->req.data_ptrs[0].address = q->ring_addr_iova + fcw_offset;
+		desc->req.data_ptrs[0].blen = fcw_len;
+		desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
+		desc->req.data_ptrs[0].last = 0;
+		desc->req.data_ptrs[0].dma_ext = 0;
+		for (b_idx = 1; b_idx < ACC100_DMA_MAX_NUM_POINTERS - 1;
+				b_idx++) {
+			desc->req.data_ptrs[b_idx].blkid = ACC100_DMA_BLKID_IN;
+			desc->req.data_ptrs[b_idx].last = 1;
+			desc->req.data_ptrs[b_idx].dma_ext = 0;
+			b_idx++;
+			desc->req.data_ptrs[b_idx].blkid =
+					ACC100_DMA_BLKID_OUT_ENC;
+			desc->req.data_ptrs[b_idx].last = 1;
+			desc->req.data_ptrs[b_idx].dma_ext = 0;
+		}
+		/* Preset some fields of LDPC FCW */
+		desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
+		desc->req.fcw_ld.gain_i = 1;
+		desc->req.fcw_ld.gain_h = 1;
+	}
+
+	q->lb_in = rte_zmalloc_socket(dev->device->driver->name,
+			RTE_CACHE_LINE_SIZE,
+			RTE_CACHE_LINE_SIZE, conf->socket);
+	if (q->lb_in == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate lb_in memory");
+		rte_free(q);
+		return -ENOMEM;
+	}
+	q->lb_in_addr_iova = rte_malloc_virt2iova(q->lb_in);
+	q->lb_out = rte_zmalloc_socket(dev->device->driver->name,
+			RTE_CACHE_LINE_SIZE,
+			RTE_CACHE_LINE_SIZE, conf->socket);
+	if (q->lb_out == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate lb_out memory");
+		rte_free(q->lb_in);
+		rte_free(q);
+		return -ENOMEM;
+	}
+	q->lb_out_addr_iova = rte_malloc_virt2iova(q->lb_out);
+
+	/*
+	 * Software queue ring wraps synchronously with the HW when it reaches
+	 * the boundary of the maximum allocated queue size, no matter what the
+	 * sw queue size is. This wrapping is guarded by setting the wrap_mask
+	 * to represent the maximum queue size as allocated at the time when
+	 * the device has been setup (in configure()).
+	 *
+	 * The queue depth is set to the queue size value (conf->queue_size).
+	 * This limits the occupancy of the queue at any point of time, so that
+	 * the queue does not get swamped with enqueue requests.
+	 */
+	q->sw_ring_depth = conf->queue_size;
+	q->sw_ring_wrap_mask = d->sw_ring_max_depth - 1;
+
+	q->op_type = conf->op_type;
+
+	q_idx = acc100_find_free_queue_idx(dev, conf);
+	if (q_idx == -1) {
+		rte_free(q->lb_in);
+		rte_free(q->lb_out);
+		rte_free(q);
+		return -1;
+	}
+
+	q->qgrp_id = (q_idx >> ACC100_GRP_ID_SHIFT) & 0xF;
+	q->vf_id = (q_idx >> ACC100_VF_ID_SHIFT)  & 0x3F;
+	q->aq_id = q_idx & 0xF;
+	q->aq_depth = (conf->op_type ==  RTE_BBDEV_OP_TURBO_DEC) ?
+			(1 << d->acc100_conf.q_ul_4g.aq_depth_log2) :
+			(1 << d->acc100_conf.q_dl_4g.aq_depth_log2);
+
+	q->mmio_reg_enqueue = RTE_PTR_ADD(d->mmio_base,
+			queue_offset(d->pf_device,
+					q->vf_id, q->qgrp_id, q->aq_id));
+
+	rte_bbdev_log_debug(
+			"Setup dev%u q%u: qgrp_id=%u, vf_id=%u, aq_id=%u, aq_depth=%u, mmio_reg_enqueue=%p",
+			dev->data->dev_id, queue_id, q->qgrp_id, q->vf_id,
+			q->aq_id, q->aq_depth, q->mmio_reg_enqueue);
+
+	dev->data->queues[queue_id].queue_private = q;
+	return 0;
+}
+
+/* Release ACC100 queue */
+static int
+acc100_queue_release(struct rte_bbdev *dev, uint16_t q_id)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	struct acc100_queue *q = dev->data->queues[q_id].queue_private;
+
+	if (q != NULL) {
+		/* Mark the Queue as un-assigned */
+		d->q_assigned_bit_map[q->qgrp_id] &= (0xFFFFFFFF -
+				(1 << q->aq_id));
+		rte_free(q->lb_in);
+		rte_free(q->lb_out);
+		rte_free(q);
+		dev->data->queues[q_id].queue_private = NULL;
+	}
+
 	return 0;
 }
 
@@ -262,8 +700,11 @@
 }
 
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
+	.setup_queues = acc100_setup_queues,
 	.close = acc100_dev_close,
 	.info_get = acc100_dev_info_get,
+	.queue_setup = acc100_queue_setup,
+	.queue_release = acc100_queue_release,
 };
 
 /* ACC100 PCI PF address map */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 09965c8..5c8dde3 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -522,11 +522,56 @@ struct acc100_registry_addr {
 	.ddr_range = HWVfDmaDdrBaseRangeRoVf,
 };
 
+/* Structure associated with each queue. */
+struct __rte_cache_aligned acc100_queue {
+	union acc100_dma_desc *ring_addr;  /* Virtual address of sw ring */
+	rte_iova_t ring_addr_iova;  /* IOVA address of software ring */
+	uint32_t sw_ring_head;  /* software ring head */
+	uint32_t sw_ring_tail;  /* software ring tail */
+	/* software ring size (descriptors, not bytes) */
+	uint32_t sw_ring_depth;
+	/* mask used to wrap enqueued descriptors on the sw ring */
+	uint32_t sw_ring_wrap_mask;
+	/* MMIO register used to enqueue descriptors */
+	void *mmio_reg_enqueue;
+	uint8_t vf_id;  /* VF ID (max = 63) */
+	uint8_t qgrp_id;  /* Queue Group ID */
+	uint16_t aq_id;  /* Atomic Queue ID */
+	uint16_t aq_depth;  /* Depth of atomic queue */
+	uint32_t aq_enqueued;  /* Count how many "batches" have been enqueued */
+	uint32_t aq_dequeued;  /* Count how many "batches" have been dequeued */
+	uint32_t irq_enable;  /* Enable ops dequeue interrupts if set to 1 */
+	struct rte_mempool *fcw_mempool;  /* FCW mempool */
+	enum rte_bbdev_op_type op_type;  /* Type of this Queue: TE or TD */
+	/* Internal Buffers for loopback input */
+	uint8_t *lb_in;
+	uint8_t *lb_out;
+	rte_iova_t lb_in_addr_iova;
+	rte_iova_t lb_out_addr_iova;
+	struct acc100_device *d;
+};
+
 /* Private data structure for each ACC100 device */
 struct acc100_device {
 	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
+	void *sw_rings_base;  /* Base addr of un-aligned memory for sw rings */
+	void *sw_rings;  /* 64MBs of 64MB aligned memory for sw rings */
+	rte_iova_t sw_rings_iova;  /* IOVA address of sw_rings */
+	/* Virtual address of the info memory routed to this function under
+	 * operation, whether it is PF or VF.
+	 */
+	union acc100_harq_layout_data *harq_layout;
+	uint32_t sw_ring_size;
 	uint32_t ddr_size; /* Size in kB */
+	uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */
+	rte_iova_t tail_ptr_iova; /* IOVA address of tail pointers */
+	/* Max number of entries available for each queue in device, depending
+	 * on how many queues are enabled with configure()
+	 */
+	uint32_t sw_ring_max_depth;
 	struct rte_acc100_conf acc100_conf; /* ACC100 Initial configuration */
+	/* Bitmap capturing which Queues have already been assigned */
+	uint16_t q_assigned_bit_map[ACC100_NUM_QGRPS];
 	bool pf_device; /**< True if this is a PF ACC100 device */
 	bool configured; /**< True if this ACC100 device is configured */
 };
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v11 05/10] baseband/acc100: add LDPC processing functions
  2020-10-02  1:01   ` [dpdk-dev] [PATCH v11 00/10] bbdev PMD ACC100 Nicolas Chautru
                       ` (3 preceding siblings ...)
  2020-10-02  1:01     ` [dpdk-dev] [PATCH v11 04/10] baseband/acc100: add queue configuration Nicolas Chautru
@ 2020-10-02  1:01     ` Nicolas Chautru
  2020-10-02  1:01     ` [dpdk-dev] [PATCH v11 06/10] baseband/acc100: add HARQ loopback support Nicolas Chautru
                       ` (4 subsequent siblings)
  9 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-10-02  1:01 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, trix, maxime.coquelin, ferruh.yigit,
	tianjiao.liu, Nicolas Chautru

Adding LDPC decode and encode processing operations

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
Acked-by: Dave Burley <dave.burley@accelercomm.com>
---
 doc/guides/bbdevs/features/acc100.ini    |    8 +-
 drivers/baseband/acc100/rte_acc100_pmd.c | 1621 +++++++++++++++++++++++++++++-
 drivers/baseband/acc100/rte_acc100_pmd.h |    6 +
 3 files changed, 1629 insertions(+), 6 deletions(-)

diff --git a/doc/guides/bbdevs/features/acc100.ini b/doc/guides/bbdevs/features/acc100.ini
index c89a4d7..40c7adc 100644
--- a/doc/guides/bbdevs/features/acc100.ini
+++ b/doc/guides/bbdevs/features/acc100.ini
@@ -6,9 +6,9 @@
 [Features]
 Turbo Decoder (4G)     = N
 Turbo Encoder (4G)     = N
-LDPC Decoder (5G)      = N
-LDPC Encoder (5G)      = N
-LLR/HARQ Compression   = N
-External DDR Access    = N
+LDPC Decoder (5G)      = Y
+LDPC Encoder (5G)      = Y
+LLR/HARQ Compression   = Y
+External DDR Access    = Y
 HW Accelerated         = Y
 BBDEV API              = Y
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index f2bf2b5..f894438 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -15,6 +15,9 @@
 #include <rte_hexdump.h>
 #include <rte_pci.h>
 #include <rte_bus_pci.h>
+#ifdef RTE_BBDEV_OFFLOAD_COST
+#include <rte_cycles.h>
+#endif
 
 #include <rte_bbdev.h>
 #include <rte_bbdev_pmd.h>
@@ -466,7 +469,6 @@
 	return 0;
 }
 
-
 /**
  * Report an ACC100 queue index which is free
  * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
@@ -661,6 +663,46 @@
 	struct acc100_device *d = dev->data->dev_private;
 
 	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
+		{
+			.type   = RTE_BBDEV_OP_LDPC_ENC,
+			.cap.ldpc_enc = {
+				.capability_flags =
+					RTE_BBDEV_LDPC_RATE_MATCH |
+					RTE_BBDEV_LDPC_CRC_24B_ATTACH |
+					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
+				.num_buffers_src =
+						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+				.num_buffers_dst =
+						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+			}
+		},
+		{
+			.type   = RTE_BBDEV_OP_LDPC_DEC,
+			.cap.ldpc_dec = {
+			.capability_flags =
+				RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK |
+				RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP |
+				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
+				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
+#ifdef ACC100_EXT_MEM
+				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
+				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
+#endif
+				RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE |
+				RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS |
+				RTE_BBDEV_LDPC_DECODE_BYPASS |
+				RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
+				RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
+				RTE_BBDEV_LDPC_LLR_COMPRESSION,
+			.llr_size = 8,
+			.llr_decimals = 1,
+			.num_buffers_src =
+					RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+			.num_buffers_hard_out =
+					RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+			.num_buffers_soft_out = 0,
+			}
+		},
 		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
 	};
 
@@ -696,9 +738,14 @@
 	dev_info->cpu_flag_reqs = NULL;
 	dev_info->min_alignment = 64;
 	dev_info->capabilities = bbdev_capabilities;
+#ifdef ACC100_EXT_MEM
 	dev_info->harq_buffer_size = d->ddr_size;
+#else
+	dev_info->harq_buffer_size = 0;
+#endif
 }
 
+
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.setup_queues = acc100_setup_queues,
 	.close = acc100_dev_close,
@@ -723,6 +770,1573 @@
 	{.device_id = 0},
 };
 
+/* Read flag value 0/1 from bitmap */
+static inline bool
+check_bit(uint32_t bitmap, uint32_t bitmask)
+{
+	return bitmap & bitmask;
+}
+
+static inline char *
+mbuf_append(struct rte_mbuf *m_head, struct rte_mbuf *m, uint16_t len)
+{
+	if (unlikely(len > rte_pktmbuf_tailroom(m)))
+		return NULL;
+
+	char *tail = (char *)m->buf_addr + m->data_off + m->data_len;
+	m->data_len = (uint16_t)(m->data_len + len);
+	m_head->pkt_len  = (m_head->pkt_len + len);
+	return tail;
+}
+
+/* Compute value of k0.
+ * Based on 3GPP 38.212 Table 5.4.2.1-2
+ * Starting position of different redundancy versions, k0
+ */
+static inline uint16_t
+get_k0(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv_index)
+{
+	if (rv_index == 0)
+		return 0;
+	uint16_t n = (bg == 1 ? ACC100_N_ZC_1 : ACC100_N_ZC_2) * z_c;
+	if (n_cb == n) {
+		if (rv_index == 1)
+			return (bg == 1 ? ACC100_K0_1_1 : ACC100_K0_1_2) * z_c;
+		else if (rv_index == 2)
+			return (bg == 1 ? ACC100_K0_2_1 : ACC100_K0_2_2) * z_c;
+		else
+			return (bg == 1 ? ACC100_K0_3_1 : ACC100_K0_3_2) * z_c;
+	}
+	/* LBRM case - includes a division by N */
+	if (rv_index == 1)
+		return (((bg == 1 ? ACC100_K0_1_1 : ACC100_K0_1_2) * n_cb)
+				/ n) * z_c;
+	else if (rv_index == 2)
+		return (((bg == 1 ? ACC100_K0_2_1 : ACC100_K0_2_2) * n_cb)
+				/ n) * z_c;
+	else
+		return (((bg == 1 ? ACC100_K0_3_1 : ACC100_K0_3_2) * n_cb)
+				/ n) * z_c;
+}
+
+/* Fill in a frame control word for LDPC encoding. */
+static inline void
+acc100_fcw_le_fill(const struct rte_bbdev_enc_op *op,
+		struct acc100_fcw_le *fcw, int num_cb)
+{
+	fcw->qm = op->ldpc_enc.q_m;
+	fcw->nfiller = op->ldpc_enc.n_filler;
+	fcw->BG = (op->ldpc_enc.basegraph - 1);
+	fcw->Zc = op->ldpc_enc.z_c;
+	fcw->ncb = op->ldpc_enc.n_cb;
+	fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_enc.basegraph,
+			op->ldpc_enc.rv_index);
+	fcw->rm_e = op->ldpc_enc.cb_params.e;
+	fcw->crc_select = check_bit(op->ldpc_enc.op_flags,
+			RTE_BBDEV_LDPC_CRC_24B_ATTACH);
+	fcw->bypass_intlv = check_bit(op->ldpc_enc.op_flags,
+			RTE_BBDEV_LDPC_INTERLEAVER_BYPASS);
+	fcw->mcb_count = num_cb;
+}
+
+/* Fill in a frame control word for LDPC decoding. */
+static inline void
+acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
+		union acc100_harq_layout_data *harq_layout)
+{
+	uint16_t harq_out_length, harq_in_length, ncb_p, k0_p, parity_offset;
+	uint16_t harq_index;
+	uint32_t l;
+	bool harq_prun = false;
+
+	fcw->qm = op->ldpc_dec.q_m;
+	fcw->nfiller = op->ldpc_dec.n_filler;
+	fcw->BG = (op->ldpc_dec.basegraph - 1);
+	fcw->Zc = op->ldpc_dec.z_c;
+	fcw->ncb = op->ldpc_dec.n_cb;
+	fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_dec.basegraph,
+			op->ldpc_dec.rv_index);
+	if (op->ldpc_dec.code_block_mode == 1)
+		fcw->rm_e = op->ldpc_dec.cb_params.e;
+	else
+		fcw->rm_e = (op->ldpc_dec.tb_params.r <
+				op->ldpc_dec.tb_params.cab) ?
+						op->ldpc_dec.tb_params.ea :
+						op->ldpc_dec.tb_params.eb;
+
+	fcw->hcin_en = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE);
+	fcw->hcout_en = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE);
+	fcw->crc_select = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK);
+	fcw->bypass_dec = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_DECODE_BYPASS);
+	fcw->bypass_intlv = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS);
+	if (op->ldpc_dec.q_m == 1) {
+		fcw->bypass_intlv = 1;
+		fcw->qm = 2;
+	}
+	fcw->hcin_decomp_mode = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+	fcw->hcout_comp_mode = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+	fcw->llr_pack_mode = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_LLR_COMPRESSION);
+	harq_index = op->ldpc_dec.harq_combined_output.offset /
+			ACC100_HARQ_OFFSET;
+#ifdef ACC100_EXT_MEM
+	/* Limit cases when HARQ pruning is valid */
+	harq_prun = ((op->ldpc_dec.harq_combined_output.offset %
+			ACC100_HARQ_OFFSET) == 0) &&
+			(op->ldpc_dec.harq_combined_output.offset <= UINT16_MAX
+			* ACC100_HARQ_OFFSET);
+#endif
+	if (fcw->hcin_en > 0) {
+		harq_in_length = op->ldpc_dec.harq_combined_input.length;
+		if (fcw->hcin_decomp_mode > 0)
+			harq_in_length = harq_in_length * 8 / 6;
+		harq_in_length = RTE_ALIGN(harq_in_length, 64);
+		if ((harq_layout[harq_index].offset > 0) && harq_prun) {
+			rte_bbdev_log_debug("HARQ IN offset unexpected for now\n");
+			fcw->hcin_size0 = harq_layout[harq_index].size0;
+			fcw->hcin_offset = harq_layout[harq_index].offset;
+			fcw->hcin_size1 = harq_in_length -
+					harq_layout[harq_index].offset;
+		} else {
+			fcw->hcin_size0 = harq_in_length;
+			fcw->hcin_offset = 0;
+			fcw->hcin_size1 = 0;
+		}
+	} else {
+		fcw->hcin_size0 = 0;
+		fcw->hcin_offset = 0;
+		fcw->hcin_size1 = 0;
+	}
+
+	fcw->itmax = op->ldpc_dec.iter_max;
+	fcw->itstop = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE);
+	fcw->synd_precoder = fcw->itstop;
+	/*
+	 * These are all implicitly set
+	 * fcw->synd_post = 0;
+	 * fcw->so_en = 0;
+	 * fcw->so_bypass_rm = 0;
+	 * fcw->so_bypass_intlv = 0;
+	 * fcw->dec_convllr = 0;
+	 * fcw->hcout_convllr = 0;
+	 * fcw->hcout_size1 = 0;
+	 * fcw->so_it = 0;
+	 * fcw->hcout_offset = 0;
+	 * fcw->negstop_th = 0;
+	 * fcw->negstop_it = 0;
+	 * fcw->negstop_en = 0;
+	 * fcw->gain_i = 1;
+	 * fcw->gain_h = 1;
+	 */
+	if (fcw->hcout_en > 0) {
+		parity_offset = (op->ldpc_dec.basegraph == 1 ? 20 : 8)
+			* op->ldpc_dec.z_c - op->ldpc_dec.n_filler;
+		k0_p = (fcw->k0 > parity_offset) ?
+				fcw->k0 - op->ldpc_dec.n_filler : fcw->k0;
+		ncb_p = fcw->ncb - op->ldpc_dec.n_filler;
+		l = k0_p + fcw->rm_e;
+		harq_out_length = (uint16_t) fcw->hcin_size0;
+		harq_out_length = RTE_MIN(RTE_MAX(harq_out_length, l), ncb_p);
+		harq_out_length = (harq_out_length + 0x3F) & 0xFFC0;
+		if ((k0_p > fcw->hcin_size0 + ACC100_HARQ_OFFSET_THRESHOLD) &&
+				harq_prun) {
+			fcw->hcout_size0 = (uint16_t) fcw->hcin_size0;
+			fcw->hcout_offset = k0_p & 0xFFC0;
+			fcw->hcout_size1 = harq_out_length - fcw->hcout_offset;
+		} else {
+			fcw->hcout_size0 = harq_out_length;
+			fcw->hcout_size1 = 0;
+			fcw->hcout_offset = 0;
+		}
+		harq_layout[harq_index].offset = fcw->hcout_offset;
+		harq_layout[harq_index].size0 = fcw->hcout_size0;
+	} else {
+		fcw->hcout_size0 = 0;
+		fcw->hcout_size1 = 0;
+		fcw->hcout_offset = 0;
+	}
+}
+
+/**
+ * Fills descriptor with data pointers of one block type.
+ *
+ * @param desc
+ *   Pointer to DMA descriptor.
+ * @param input
+ *   Pointer to pointer to input data which will be encoded. It can be changed
+ *   and points to next segment in scatter-gather case.
+ * @param offset
+ *   Input offset in rte_mbuf structure. It is used for calculating the point
+ *   where data is starting.
+ * @param cb_len
+ *   Length of currently processed Code Block
+ * @param seg_total_left
+ *   It indicates how many bytes still left in segment (mbuf) for further
+ *   processing.
+ * @param op_flags
+ *   Store information about device capabilities
+ * @param next_triplet
+ *   Index for ACC100 DMA Descriptor triplet
+ *
+ * @return
+ *   Returns index of next triplet on success, other value if lengths of
+ *   pkt and processed cb do not match.
+ *
+ */
+static inline int
+acc100_dma_fill_blk_type_in(struct acc100_dma_req_desc *desc,
+		struct rte_mbuf **input, uint32_t *offset, uint32_t cb_len,
+		uint32_t *seg_total_left, int next_triplet)
+{
+	uint32_t part_len;
+	struct rte_mbuf *m = *input;
+
+	part_len = (*seg_total_left < cb_len) ? *seg_total_left : cb_len;
+	cb_len -= part_len;
+	*seg_total_left -= part_len;
+
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(m, *offset);
+	desc->data_ptrs[next_triplet].blen = part_len;
+	desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
+	desc->data_ptrs[next_triplet].last = 0;
+	desc->data_ptrs[next_triplet].dma_ext = 0;
+	*offset += part_len;
+	next_triplet++;
+
+	while (cb_len > 0) {
+		if (next_triplet < ACC100_DMA_MAX_NUM_POINTERS &&
+				m->next != NULL) {
+
+			m = m->next;
+			*seg_total_left = rte_pktmbuf_data_len(m);
+			part_len = (*seg_total_left < cb_len) ?
+					*seg_total_left :
+					cb_len;
+			desc->data_ptrs[next_triplet].address =
+					rte_pktmbuf_iova_offset(m, 0);
+			desc->data_ptrs[next_triplet].blen = part_len;
+			desc->data_ptrs[next_triplet].blkid =
+					ACC100_DMA_BLKID_IN;
+			desc->data_ptrs[next_triplet].last = 0;
+			desc->data_ptrs[next_triplet].dma_ext = 0;
+			cb_len -= part_len;
+			*seg_total_left -= part_len;
+			/* Initializing offset for next segment (mbuf) */
+			*offset = part_len;
+			next_triplet++;
+		} else {
+			rte_bbdev_log(ERR,
+				"Some data still left for processing: "
+				"data_left: %u, next_triplet: %u, next_mbuf: %p",
+				cb_len, next_triplet, m->next);
+			return -EINVAL;
+		}
+	}
+	/* Store the new mbuf as it could have changed in the scatter-gather case */
+	*input = m;
+
+	return next_triplet;
+}
+
+/* Fills descriptor with data pointers of one block type.
+ * Returns index of next triplet on success, other value if lengths of
+ * output data and processed mbuf do not match.
+ */
+static inline int
+acc100_dma_fill_blk_type_out(struct acc100_dma_req_desc *desc,
+		struct rte_mbuf *output, uint32_t out_offset,
+		uint32_t output_len, int next_triplet, int blk_id)
+{
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(output, out_offset);
+	desc->data_ptrs[next_triplet].blen = output_len;
+	desc->data_ptrs[next_triplet].blkid = blk_id;
+	desc->data_ptrs[next_triplet].last = 0;
+	desc->data_ptrs[next_triplet].dma_ext = 0;
+	next_triplet++;
+
+	return next_triplet;
+}
+
+static inline void
+acc100_header_init(struct acc100_dma_req_desc *desc)
+{
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /**< Timestamp could be disabled */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+}
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+/* Check if any input data is unexpectedly left for processing */
+static inline int
+check_mbuf_total_left(uint32_t mbuf_total_left)
+{
+	if (mbuf_total_left == 0)
+		return 0;
+	rte_bbdev_log(ERR,
+		"Some data still left for processing: mbuf_total_left = %u",
+		mbuf_total_left);
+	return -EINVAL;
+}
+#endif
+
+static inline int
+acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
+		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
+		struct rte_mbuf *output, uint32_t *in_offset,
+		uint32_t *out_offset, uint32_t *out_length,
+		uint32_t *mbuf_total_left, uint32_t *seg_total_left)
+{
+	int next_triplet = 1; /* FCW already done */
+	uint16_t K, in_length_in_bits, in_length_in_bytes;
+	struct rte_bbdev_op_ldpc_enc *enc = &op->ldpc_enc;
+
+	acc100_header_init(desc);
+
+	K = (enc->basegraph == 1 ? 22 : 10) * enc->z_c;
+	in_length_in_bits = K - enc->n_filler;
+	if ((enc->op_flags & RTE_BBDEV_LDPC_CRC_24A_ATTACH) ||
+			(enc->op_flags & RTE_BBDEV_LDPC_CRC_24B_ATTACH))
+		in_length_in_bits -= 24;
+	in_length_in_bytes = in_length_in_bits >> 3;
+
+	if (unlikely((*mbuf_total_left == 0) ||
+			(*mbuf_total_left < in_length_in_bytes))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, in_length_in_bytes);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
+			in_length_in_bytes,
+			seg_total_left, next_triplet);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= in_length_in_bytes;
+
+	/* Set output length */
+	/* Integer round up division by 8 */
+	*out_length = (enc->cb_params.e + 7) >> 3;
+
+	next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
+			*out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
+	op->ldpc_enc.output.length += *out_length;
+	*out_offset += *out_length;
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->data_ptrs[next_triplet - 1].dma_ext = 0;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
+
+static inline int
+acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
+		struct acc100_dma_req_desc *desc,
+		struct rte_mbuf **input, struct rte_mbuf *h_output,
+		uint32_t *in_offset, uint32_t *h_out_offset,
+		uint32_t *h_out_length, uint32_t *mbuf_total_left,
+		uint32_t *seg_total_left,
+		struct acc100_fcw_ld *fcw)
+{
+	struct rte_bbdev_op_ldpc_dec *dec = &op->ldpc_dec;
+	int next_triplet = 1; /* FCW already done */
+	uint32_t input_length;
+	uint16_t output_length, crc24_overlap = 0;
+	uint16_t sys_cols, K, h_p_size, h_np_size;
+	bool h_comp = check_bit(dec->op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+
+	acc100_header_init(desc);
+
+	if (check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP))
+		crc24_overlap = 24;
+
+	/* Compute some LDPC BG lengths */
+	input_length = dec->cb_params.e;
+	if (check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_LLR_COMPRESSION))
+		input_length = (input_length * 3 + 3) / 4;
+	sys_cols = (dec->basegraph == 1) ? 22 : 10;
+	K = sys_cols * dec->z_c;
+	output_length = K - dec->n_filler - crc24_overlap;
+
+	if (unlikely((*mbuf_total_left == 0) ||
+			(*mbuf_total_left < input_length))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, input_length);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input,
+			in_offset, input_length,
+			seg_total_left, next_triplet);
+
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
+		h_p_size = fcw->hcin_size0 + fcw->hcin_size1;
+		if (h_comp)
+			h_p_size = (h_p_size * 3 + 3) / 4;
+		desc->data_ptrs[next_triplet].address =
+				dec->harq_combined_input.offset;
+		desc->data_ptrs[next_triplet].blen = h_p_size;
+		desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN_HARQ;
+		desc->data_ptrs[next_triplet].dma_ext = 1;
+#ifndef ACC100_EXT_MEM
+		acc100_dma_fill_blk_type_out(
+				desc,
+				op->ldpc_dec.harq_combined_input.data,
+				op->ldpc_dec.harq_combined_input.offset,
+				h_p_size,
+				next_triplet,
+				ACC100_DMA_BLKID_IN_HARQ);
+#endif
+		next_triplet++;
+	}
+
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= input_length;
+
+	next_triplet = acc100_dma_fill_blk_type_out(desc, h_output,
+			*h_out_offset, output_length >> 3, next_triplet,
+			ACC100_DMA_BLKID_OUT_HARD);
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
+		/* Pruned size of the HARQ */
+		h_p_size = fcw->hcout_size0 + fcw->hcout_size1;
+		/* Non-Pruned size of the HARQ */
+		h_np_size = fcw->hcout_offset > 0 ?
+				fcw->hcout_offset + fcw->hcout_size1 :
+				h_p_size;
+		if (h_comp) {
+			h_np_size = (h_np_size * 3 + 3) / 4;
+			h_p_size = (h_p_size * 3 + 3) / 4;
+		}
+		dec->harq_combined_output.length = h_np_size;
+		desc->data_ptrs[next_triplet].address =
+				dec->harq_combined_output.offset;
+		desc->data_ptrs[next_triplet].blen = h_p_size;
+		desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_OUT_HARQ;
+		desc->data_ptrs[next_triplet].dma_ext = 1;
+#ifndef ACC100_EXT_MEM
+		acc100_dma_fill_blk_type_out(
+				desc,
+				dec->harq_combined_output.data,
+				dec->harq_combined_output.offset,
+				h_p_size,
+				next_triplet,
+				ACC100_DMA_BLKID_OUT_HARQ);
+#endif
+		next_triplet++;
+	}
+
+	*h_out_length = output_length >> 3;
+	dec->hard_output.length += *h_out_length;
+	*h_out_offset += *h_out_length;
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
+
+static inline void
+acc100_dma_desc_ld_update(struct rte_bbdev_dec_op *op,
+		struct acc100_dma_req_desc *desc,
+		struct rte_mbuf *input, struct rte_mbuf *h_output,
+		uint32_t *in_offset, uint32_t *h_out_offset,
+		uint32_t *h_out_length,
+		union acc100_harq_layout_data *harq_layout)
+{
+	int next_triplet = 1; /* FCW already done */
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(input, *in_offset);
+	next_triplet++;
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
+		struct rte_bbdev_op_data hi = op->ldpc_dec.harq_combined_input;
+		desc->data_ptrs[next_triplet].address = hi.offset;
+#ifndef ACC100_EXT_MEM
+		desc->data_ptrs[next_triplet].address =
+				rte_pktmbuf_iova_offset(hi.data, hi.offset);
+#endif
+		next_triplet++;
+	}
+
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(h_output, *h_out_offset);
+	*h_out_length = desc->data_ptrs[next_triplet].blen;
+	next_triplet++;
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
+		desc->data_ptrs[next_triplet].address =
+				op->ldpc_dec.harq_combined_output.offset;
+		/* Adjust based on previous operation */
+		struct rte_bbdev_dec_op *prev_op = desc->op_addr;
+		op->ldpc_dec.harq_combined_output.length =
+				prev_op->ldpc_dec.harq_combined_output.length;
+		int16_t hq_idx = op->ldpc_dec.harq_combined_output.offset /
+				ACC100_HARQ_OFFSET;
+		int16_t prev_hq_idx =
+				prev_op->ldpc_dec.harq_combined_output.offset
+				/ ACC100_HARQ_OFFSET;
+		harq_layout[hq_idx].val = harq_layout[prev_hq_idx].val;
+#ifndef ACC100_EXT_MEM
+		struct rte_bbdev_op_data ho =
+				op->ldpc_dec.harq_combined_output;
+		desc->data_ptrs[next_triplet].address =
+				rte_pktmbuf_iova_offset(ho.data, ho.offset);
+#endif
+		next_triplet++;
+	}
+
+	op->ldpc_dec.hard_output.length += *h_out_length;
+	desc->op_addr = op;
+}
+
+
+/* Enqueue a number of operations to HW and update software rings */
+static inline void
+acc100_dma_enqueue(struct acc100_queue *q, uint16_t n,
+		struct rte_bbdev_stats *queue_stats)
+{
+	union acc100_enqueue_reg_fmt enq_req;
+#ifdef RTE_BBDEV_OFFLOAD_COST
+	uint64_t start_time = 0;
+	queue_stats->acc_offload_cycles = 0;
+#else
+	RTE_SET_USED(queue_stats);
+#endif
+
+	enq_req.val = 0;
+	/* Setting offset, 100b for 256 DMA Desc */
+	enq_req.addr_offset = ACC100_DESC_OFFSET;
+
+	/* Split ops into batches */
+	do {
+		union acc100_dma_desc *desc;
+		uint16_t enq_batch_size;
+		uint64_t offset;
+		rte_iova_t req_elem_addr;
+
+		enq_batch_size = RTE_MIN(n, MAX_ENQ_BATCH_SIZE);
+
+		/* Set flag on last descriptor in a batch */
+		desc = q->ring_addr + ((q->sw_ring_head + enq_batch_size - 1) &
+				q->sw_ring_wrap_mask);
+		desc->req.last_desc_in_batch = 1;
+
+		/* Calculate the 1st descriptor's address */
+		offset = ((q->sw_ring_head & q->sw_ring_wrap_mask) *
+				sizeof(union acc100_dma_desc));
+		req_elem_addr = q->ring_addr_iova + offset;
+
+		/* Fill enqueue struct */
+		enq_req.num_elem = enq_batch_size;
+		/* low 6 bits are not needed */
+		enq_req.req_elem_addr = (uint32_t)(req_elem_addr >> 6);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "Req sdone", desc, sizeof(*desc));
+#endif
+		rte_bbdev_log_debug(
+				"Enqueue %u reqs (phys %#"PRIx64") to reg %p",
+				enq_batch_size,
+				req_elem_addr,
+				(void *)q->mmio_reg_enqueue);
+
+		rte_wmb();
+
+#ifdef RTE_BBDEV_OFFLOAD_COST
+		/* Start time measurement for enqueue function offload. */
+		start_time = rte_rdtsc_precise();
+#endif
+		rte_bbdev_log(DEBUG, "Debug : MMIO Enqueue");
+		mmio_write(q->mmio_reg_enqueue, enq_req.val);
+
+#ifdef RTE_BBDEV_OFFLOAD_COST
+		queue_stats->acc_offload_cycles +=
+				rte_rdtsc_precise() - start_time;
+#endif
+
+		q->aq_enqueued++;
+		q->sw_ring_head += enq_batch_size;
+		n -= enq_batch_size;
+
+	} while (n);
+
+
+}
+
+/* Enqueue a number of encode operations for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
+		uint16_t total_enqueued_cbs, int16_t num)
+{
+	union acc100_dma_desc *desc = NULL;
+	uint32_t out_length;
+	struct rte_mbuf *output_head, *output;
+	int i, next_triplet;
+	uint16_t  in_length_in_bytes;
+	struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_le_fill(ops[0], &desc->req.fcw_le, num);
+
+	/* This could be done at polling */
+	desc->req.word0 = ACC100_DMA_DESC_TYPE;
+	desc->req.word1 = 0; /**< Timestamp could be disabled */
+	desc->req.word2 = 0;
+	desc->req.word3 = 0;
+	desc->req.numCBs = num;
+
+	in_length_in_bytes = ops[0]->ldpc_enc.input.data->data_len;
+	out_length = (enc->cb_params.e + 7) >> 3;
+	desc->req.m2dlen = 1 + num;
+	desc->req.d2mlen = num;
+	next_triplet = 1;
+
+	for (i = 0; i < num; i++) {
+		desc->req.data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(ops[i]->ldpc_enc.input.data, 0);
+		desc->req.data_ptrs[next_triplet].blen = in_length_in_bytes;
+		next_triplet++;
+		desc->req.data_ptrs[next_triplet].address =
+				rte_pktmbuf_iova_offset(
+				ops[i]->ldpc_enc.output.data, 0);
+		desc->req.data_ptrs[next_triplet].blen = out_length;
+		next_triplet++;
+		ops[i]->ldpc_enc.output.length = out_length;
+		output_head = output = ops[i]->ldpc_enc.output.data;
+		mbuf_append(output_head, output, out_length);
+		output->data_len = out_length;
+	}
+
+	desc->req.op_addr = ops[0];
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_le,
+			sizeof(desc->req.fcw_le) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+	/* Multiple CBs (one op per CB) were successfully prepared to enqueue */
+	return num;
+}
+
+/* Enqueue one encode operation for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+		uint16_t total_enqueued_cbs)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+		seg_total_left;
+	struct rte_mbuf *input, *output_head, *output;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_le_fill(op, &desc->req.fcw_le, 1);
+
+	input = op->ldpc_enc.input.data;
+	output_head = output = op->ldpc_enc.output.data;
+	in_offset = op->ldpc_enc.input.offset;
+	out_offset = op->ldpc_enc.output.offset;
+	out_length = 0;
+	mbuf_total_left = op->ldpc_enc.input.length;
+	seg_total_left = rte_pktmbuf_data_len(op->ldpc_enc.input.data)
+			- in_offset;
+
+	ret = acc100_dma_desc_le_fill(op, &desc->req, &input, output,
+			&in_offset, &out_offset, &out_length, &mbuf_total_left,
+			&seg_total_left);
+
+	if (unlikely(ret < 0))
+		return ret;
+
+	mbuf_append(output_head, output, out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_le,
+			sizeof(desc->req.fcw_le) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+
+	if (check_mbuf_total_left(mbuf_total_left) != 0)
+		return -EINVAL;
+#endif
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+/** Enqueue one decode operation for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs, bool same_op)
+{
+	int ret;
+
+	union acc100_dma_desc *desc;
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	struct rte_mbuf *input, *h_output_head, *h_output;
+	uint32_t in_offset, h_out_offset, mbuf_total_left, h_out_length = 0;
+	input = op->ldpc_dec.input.data;
+	h_output_head = h_output = op->ldpc_dec.hard_output.data;
+	in_offset = op->ldpc_dec.input.offset;
+	h_out_offset = op->ldpc_dec.hard_output.offset;
+	mbuf_total_left = op->ldpc_dec.input.length;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(input == NULL)) {
+		rte_bbdev_log(ERR, "Invalid mbuf pointer");
+		return -EFAULT;
+	}
+#endif
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+
+	if (same_op) {
+		union acc100_dma_desc *prev_desc;
+		desc_idx = ((q->sw_ring_head + total_enqueued_cbs - 1)
+				& q->sw_ring_wrap_mask);
+		prev_desc = q->ring_addr + desc_idx;
+		uint8_t *prev_ptr = (uint8_t *) prev_desc;
+		uint8_t *new_ptr = (uint8_t *) desc;
+		/* Copy first 4 words and BDESCs */
+		rte_memcpy(new_ptr, prev_ptr, ACC100_5GUL_SIZE_0);
+		rte_memcpy(new_ptr + ACC100_5GUL_OFFSET_0,
+				prev_ptr + ACC100_5GUL_OFFSET_0,
+				ACC100_5GUL_SIZE_1);
+		desc->req.op_addr = prev_desc->req.op_addr;
+		/* Copy FCW */
+		rte_memcpy(new_ptr + ACC100_DESC_FCW_OFFSET,
+				prev_ptr + ACC100_DESC_FCW_OFFSET,
+				ACC100_FCW_LD_BLEN);
+		acc100_dma_desc_ld_update(op, &desc->req, input, h_output,
+				&in_offset, &h_out_offset,
+				&h_out_length, harq_layout);
+	} else {
+		struct acc100_fcw_ld *fcw;
+		uint32_t seg_total_left;
+		fcw = &desc->req.fcw_ld;
+		acc100_fcw_ld_fill(op, fcw, harq_layout);
+
+		/* Special handling when overusing mbuf */
+		if (fcw->rm_e < ACC100_MAX_E_MBUF)
+			seg_total_left = rte_pktmbuf_data_len(input)
+					- in_offset;
+		else
+			seg_total_left = fcw->rm_e;
+
+		ret = acc100_dma_desc_ld_fill(op, &desc->req, &input, h_output,
+				&in_offset, &h_out_offset,
+				&h_out_length, &mbuf_total_left,
+				&seg_total_left, fcw);
+		if (unlikely(ret < 0))
+			return ret;
+	}
+
+	/* Hard output */
+	mbuf_append(h_output_head, h_output, h_out_length);
+#ifndef ACC100_EXT_MEM
+	if (op->ldpc_dec.harq_combined_output.length > 0) {
+		/* Push the HARQ output into host memory */
+		struct rte_mbuf *hq_output_head, *hq_output;
+		hq_output_head = op->ldpc_dec.harq_combined_output.data;
+		hq_output = op->ldpc_dec.harq_combined_output.data;
+		mbuf_append(hq_output_head, hq_output,
+				op->ldpc_dec.harq_combined_output.length);
+	}
+#endif
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
+			sizeof(desc->req.fcw_ld) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+
+/* Enqueue one decode operation for ACC100 device in TB mode */
+static inline int
+enqueue_ldpc_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint8_t r, c;
+	uint32_t in_offset, h_out_offset,
+		h_out_length, mbuf_total_left, seg_total_left;
+	struct rte_mbuf *input, *h_output_head, *h_output;
+	uint16_t current_enqueued_cbs = 0;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+	acc100_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout);
+
+	input = op->ldpc_dec.input.data;
+	h_output_head = h_output = op->ldpc_dec.hard_output.data;
+	in_offset = op->ldpc_dec.input.offset;
+	h_out_offset = op->ldpc_dec.hard_output.offset;
+	h_out_length = 0;
+	mbuf_total_left = op->ldpc_dec.input.length;
+	c = op->ldpc_dec.tb_params.c;
+	r = op->ldpc_dec.tb_params.r;
+
+	while (mbuf_total_left > 0 && r < c) {
+
+		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+
+		/* Set up DMA descriptor */
+		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+				& q->sw_ring_wrap_mask);
+		desc->req.data_ptrs[0].address = q->ring_addr_iova + fcw_offset;
+		desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
+		ret = acc100_dma_desc_ld_fill(op, &desc->req, &input,
+				h_output, &in_offset, &h_out_offset,
+				&h_out_length,
+				&mbuf_total_left, &seg_total_left,
+				&desc->req.fcw_ld);
+
+		if (unlikely(ret < 0))
+			return ret;
+
+		/* Hard output */
+		mbuf_append(h_output_head, h_output, h_out_length);
+
+		/* Set total number of CBs in TB */
+		desc->req.cbs_in_tb = cbs_in_tb;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "FCW", &desc->req.fcw_td,
+				sizeof(desc->req.fcw_td) - 8);
+		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+		if (seg_total_left == 0) {
+			/* Go to the next mbuf */
+			input = input->next;
+			in_offset = 0;
+			h_output = h_output->next;
+			h_out_offset = 0;
+		}
+		total_enqueued_cbs++;
+		current_enqueued_cbs++;
+		r++;
+	}
+
+	if (unlikely(desc == NULL))
+		return current_enqueued_cbs;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (check_mbuf_total_left(mbuf_total_left) != 0)
+		return -EINVAL;
+#endif
+	/* Set SDone on last CB descriptor for TB mode */
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	return current_enqueued_cbs;
+}
+
+
+/* Calculates number of CBs in processed encoder TB based on 'r' and input
+ * length.
+ */
+static inline uint8_t
+get_num_cbs_in_tb_enc(struct rte_bbdev_op_turbo_enc *turbo_enc)
+{
+	uint8_t c, c_neg, r, crc24_bits = 0;
+	uint16_t k, k_neg, k_pos;
+	uint8_t cbs_in_tb = 0;
+	int32_t length;
+
+	length = turbo_enc->input.length;
+	r = turbo_enc->tb_params.r;
+	c = turbo_enc->tb_params.c;
+	c_neg = turbo_enc->tb_params.c_neg;
+	k_neg = turbo_enc->tb_params.k_neg;
+	k_pos = turbo_enc->tb_params.k_pos;
+	crc24_bits = 0;
+	if (check_bit(turbo_enc->op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH))
+		crc24_bits = 24;
+	while (length > 0 && r < c) {
+		k = (r < c_neg) ? k_neg : k_pos;
+		length -= (k - crc24_bits) >> 3;
+		r++;
+		cbs_in_tb++;
+	}
+
+	return cbs_in_tb;
+}
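For reference, the per-CB byte accounting above can be exercised in isolation. The following is a hypothetical stand-alone mirror of the loop (the function name `cbs_in_tb_enc` and the flattened parameter list are illustrative, not part of the driver), assuming the 3GPP TS 36.212 TB segmentation convention the driver uses:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-alone mirror of get_num_cbs_in_tb_enc(): walk the TB
 * from segment index r, subtracting each code block's payload (k bits minus
 * an optional attached CRC24B, converted to bytes) until the input length
 * is consumed. The first c_neg segments use k_neg, the rest k_pos.
 */
static uint8_t
cbs_in_tb_enc(int32_t length, uint8_t r, uint8_t c, uint8_t c_neg,
		uint16_t k_neg, uint16_t k_pos, int crc24b)
{
	uint8_t cbs_in_tb = 0;
	uint8_t crc24_bits = crc24b ? 24 : 0;

	while (length > 0 && r < c) {
		uint16_t k = (r < c_neg) ? k_neg : k_pos;
		length -= (k - crc24_bits) >> 3; /* bits to bytes */
		r++;
		cbs_in_tb++;
	}
	return cbs_in_tb;
}
```

With k = 6144 and CRC24B attached, each CB carries (6144 - 24) / 8 = 765 input bytes, so a 1530-byte TB segments into two CBs.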
+
+/* Calculates number of CBs in processed turbo decoder TB based on 'r' and
+ * input length.
+ */
+static inline uint16_t
+get_num_cbs_in_tb_dec(struct rte_bbdev_op_turbo_dec *turbo_dec)
+{
+	uint8_t c, c_neg, r = 0;
+	uint16_t kw, k, k_neg, k_pos, cbs_in_tb = 0;
+	int32_t length;
+
+	length = turbo_dec->input.length;
+	r = turbo_dec->tb_params.r;
+	c = turbo_dec->tb_params.c;
+	c_neg = turbo_dec->tb_params.c_neg;
+	k_neg = turbo_dec->tb_params.k_neg;
+	k_pos = turbo_dec->tb_params.k_pos;
+	while (length > 0 && r < c) {
+		k = (r < c_neg) ? k_neg : k_pos;
+		kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
+		length -= kw;
+		r++;
+		cbs_in_tb++;
+	}
+
+	return cbs_in_tb;
+}
+
+/* Calculates number of CBs in processed LDPC decoder TB based on 'r' and
+ * input length.
+ */
+static inline uint16_t
+get_num_cbs_in_tb_ldpc_dec(struct rte_bbdev_op_ldpc_dec *ldpc_dec)
+{
+	uint16_t r, cbs_in_tb = 0;
+	int32_t length = ldpc_dec->input.length;
+	r = ldpc_dec->tb_params.r;
+	while (length > 0 && r < ldpc_dec->tb_params.c) {
+		length -=  (r < ldpc_dec->tb_params.cab) ?
+				ldpc_dec->tb_params.ea :
+				ldpc_dec->tb_params.eb;
+		r++;
+		cbs_in_tb++;
+	}
+	return cbs_in_tb;
+}
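The LDPC variant can be sketched the same way; this is an assumed stand-alone mirror (name and flattened parameters are illustrative) using the 3GPP TS 38.212 notation from the op struct:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-alone mirror of get_num_cbs_in_tb_ldpc_dec(): the
 * first `cab` code blocks each consume `ea` bytes of rate-matched input,
 * the remaining ones consume `eb`. */
static uint16_t
cbs_in_tb_ldpc_dec(int32_t length, uint16_t r, uint16_t c, uint16_t cab,
		uint32_t ea, uint32_t eb)
{
	uint16_t cbs_in_tb = 0;

	while (length > 0 && r < c) {
		length -= (r < cab) ? (int32_t)ea : (int32_t)eb;
		r++;
		cbs_in_tb++;
	}
	return cbs_in_tb;
}
```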
+
+/* Check whether encode operations can be multiplexed with a common FCW */
+static inline bool
+check_mux(struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	uint16_t i;
+
+	if (num <= 1)
+		return false;
+	for (i = 1; i < num; ++i) {
+		/* Only mux compatible code blocks */
+		if (memcmp((uint8_t *)(&ops[i]->ldpc_enc) + ACC100_ENC_OFFSET,
+				(uint8_t *)(&ops[0]->ldpc_enc) +
+				ACC100_ENC_OFFSET,
+				ACC100_CMP_ENC_SIZE) != 0)
+			return false;
+	}
+	return true;
+}
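The compatibility test above relies on a single memcmp() over the tail of the op struct. A minimal sketch of the idea, with a toy struct and local `CMP_OFFSET`/`CMP_SIZE` macros standing in for the driver's ldpc_enc op, ACC100_ENC_OFFSET and ACC100_CMP_ENC_SIZE:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Two ops can share one frame control word iff everything past a fixed
 * offset in their op structs is byte-identical; one memcmp() decides. */
struct enc_params {
	uint32_t per_op;   /* fields allowed to differ between muxed ops */
	uint32_t shared_a; /* fields that must match to share an FCW */
	uint32_t shared_b;
};

#define CMP_OFFSET offsetof(struct enc_params, shared_a)
#define CMP_SIZE (sizeof(struct enc_params) - CMP_OFFSET)

static int
can_mux(const struct enc_params *a, const struct enc_params *b)
{
	return memcmp((const uint8_t *)a + CMP_OFFSET,
			(const uint8_t *)b + CMP_OFFSET, CMP_SIZE) == 0;
}
```

This keeps the hot-path check branch-free over the field list: adding a field that must match only requires it to live inside the compared window.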
+
+/* Enqueue LDPC encode operations for ACC100 device in CB mode. */
+static inline uint16_t
+acc100_enqueue_ldpc_enc_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i = 0;
+	union acc100_dma_desc *desc;
+	int ret, desc_idx = 0;
+	int16_t enq, left = num;
+
+	while (left > 0) {
+		if (unlikely(avail < 1))
+			break;
+		avail--;
+		enq = RTE_MIN(left, ACC100_MUX_5GDL_DESC);
+		if (check_mux(&ops[i], enq)) {
+			ret = enqueue_ldpc_enc_n_op_cb(q, &ops[i],
+					desc_idx, enq);
+			if (ret < 0)
+				break;
+			i += enq;
+		} else {
+			ret = enqueue_ldpc_enc_one_op_cb(q, ops[i], desc_idx);
+			if (ret < 0)
+				break;
+			i++;
+		}
+		desc_idx++;
+		left = num - i;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + desc_idx - 1)
+			& q->sw_ring_wrap_mask);
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, desc_idx, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
+}
+
+/* Enqueue encode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	if (unlikely(num == 0))
+		return 0;
+	return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
+}
+
+/* Check whether two consecutive LDPC decode operations can reuse a common
+ * FCW.
+ */
+static inline bool
+cmp_ldpc_dec_op(struct rte_bbdev_dec_op **ops)
+{
+	/* Only mux compatible code blocks */
+	return memcmp((uint8_t *)(&ops[0]->ldpc_dec) + ACC100_DEC_OFFSET,
+			(uint8_t *)(&ops[1]->ldpc_dec) + ACC100_DEC_OFFSET,
+			ACC100_CMP_DEC_SIZE) == 0;
+}
+
+
+/* Enqueue decode operations for ACC100 device in TB mode */
+static uint16_t
+acc100_enqueue_ldpc_dec_tb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i, enqueued_cbs = 0;
+	uint8_t cbs_in_tb;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		cbs_in_tb = get_num_cbs_in_tb_ldpc_dec(&ops[i]->ldpc_dec);
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - cbs_in_tb < 0))
+			break;
+		avail -= cbs_in_tb;
+
+		ret = enqueue_ldpc_dec_one_op_tb(q, ops[i],
+				enqueued_cbs, cbs_in_tb);
+		if (ret < 0)
+			break;
+		enqueued_cbs += ret;
+	}
+
+	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+	return i;
+}
+
+/* Enqueue decode operations for ACC100 device in CB mode */
+static uint16_t
+acc100_enqueue_ldpc_dec_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i;
+	union acc100_dma_desc *desc;
+	int ret;
+	bool same_op = false;
+
+	for (i = 0; i < num; ++i) {
+		/* Check if there is available space for further processing */
+		if (unlikely(avail < 1))
+			break;
+		avail -= 1;
+
+		if (i > 0)
+			same_op = cmp_ldpc_dec_op(&ops[i-1]);
+		rte_bbdev_log(INFO, "Op %d %d %d %d %d %d %d %d %d %d %d %d",
+			i, ops[i]->ldpc_dec.op_flags, ops[i]->ldpc_dec.rv_index,
+			ops[i]->ldpc_dec.iter_max, ops[i]->ldpc_dec.iter_count,
+			ops[i]->ldpc_dec.basegraph, ops[i]->ldpc_dec.z_c,
+			ops[i]->ldpc_dec.n_cb, ops[i]->ldpc_dec.q_m,
+			ops[i]->ldpc_dec.n_filler, ops[i]->ldpc_dec.cb_params.e,
+			same_op);
+		ret = enqueue_ldpc_dec_one_op_cb(q, ops[i], i, same_op);
+		if (ret < 0)
+			break;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
+			& q->sw_ring_wrap_mask);
+
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, i, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+	return i;
+}
+
+/* Enqueue decode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t aq_avail = q->aq_depth +
+			(q->aq_dequeued - q->aq_enqueued) / 128;
+
+	if (unlikely((aq_avail == 0) || (num == 0)))
+		return 0;
+
+	if (ops[0]->ldpc_dec.code_block_mode == 0)
+		return acc100_enqueue_ldpc_dec_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_ldpc_dec_cb(q_data, ops, num);
+}
+
+
+/* Dequeue one encode operation from ACC100 device in CB mode */
+static inline int
+dequeue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
+		uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_enc_op *op;
+	int i;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	rsp.val = atom_desc.rsp.val;
+	rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+
+	op->status |= ((rsp.input_err)
+			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+	if (desc->req.last_desc_in_batch) {
+		(*aq_dequeued)++;
+		desc->req.last_desc_in_batch = 0;
+	}
+	desc->rsp.val = ACC100_DMA_DESC_TYPE;
+	desc->rsp.add_info_0 = 0; /* Reserved bits */
+	desc->rsp.add_info_1 = 0; /* Reserved bits */
+
+	/* Flag that the muxing causes loss of opaque data */
+	op->opaque_data = (void *)-1;
+	for (i = 0; i < desc->req.numCBs; i++)
+		ref_op[i] = op;
+
+	/* Return the number of CBs dequeued for this (possibly muxed) op */
+	return desc->req.numCBs;
+}
+
+/* Dequeue one encode operation from ACC100 device in TB mode */
+static inline int
+dequeue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
+		uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, *last_desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_enc_op *op;
+	uint8_t i = 0;
+	uint16_t current_dequeued_cbs = 0, cbs_in_tb;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	/* Get number of CBs in dequeued TB */
+	cbs_in_tb = desc->req.cbs_in_tb;
+	/* Get last CB */
+	last_desc = q->ring_addr + ((q->sw_ring_tail
+			+ total_dequeued_cbs + cbs_in_tb - 1)
+			& q->sw_ring_wrap_mask);
+	/* Check if last CB in TB is ready to dequeue (and thus
+	 * the whole TB) - checking sdone bit. If not return.
+	 */
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
+			__ATOMIC_RELAXED);
+	if (!(atom_desc.rsp.val & ACC100_SDONE))
+		return -1;
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+
+	while (i < cbs_in_tb) {
+		desc = q->ring_addr + ((q->sw_ring_tail
+				+ total_dequeued_cbs)
+				& q->sw_ring_wrap_mask);
+		atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+				__ATOMIC_RELAXED);
+		rsp.val = atom_desc.rsp.val;
+		rte_bbdev_log_debug("Resp. desc %p: %x", desc,
+				rsp.val);
+
+		op->status |= ((rsp.input_err)
+				? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+		op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+		op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+		if (desc->req.last_desc_in_batch) {
+			(*aq_dequeued)++;
+			desc->req.last_desc_in_batch = 0;
+		}
+		desc->rsp.val = ACC100_DMA_DESC_TYPE;
+		desc->rsp.add_info_0 = 0;
+		desc->rsp.add_info_1 = 0;
+		total_dequeued_cbs++;
+		current_dequeued_cbs++;
+		i++;
+	}
+
+	*ref_op = op;
+
+	return current_dequeued_cbs;
+}
+
+/* Dequeue one decode operation from ACC100 device in CB mode */
+static inline int
+dequeue_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
+		struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_dec_op *op;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	rsp.val = atom_desc.rsp.val;
+	rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+	op->status |= ((rsp.input_err)
+			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+	if (op->status != 0)
+		q_data->queue_stats.dequeue_err_count++;
+
+	/* CRC status is only reported when no other error is set */
+	if (!op->status)
+		op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+	op->turbo_dec.iter_count = (uint8_t) rsp.iter_cnt / 2;
+	/* Check if this is the last desc in batch (Atomic Queue) */
+	if (desc->req.last_desc_in_batch) {
+		(*aq_dequeued)++;
+		desc->req.last_desc_in_batch = 0;
+	}
+	desc->rsp.val = ACC100_DMA_DESC_TYPE;
+	desc->rsp.add_info_0 = 0;
+	desc->rsp.add_info_1 = 0;
+	*ref_op = op;
+
+	/* One CB (op) was successfully dequeued */
+	return 1;
+}
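The response-to-status mapping repeated across the dequeue paths can be summarized in one pure function. The bit positions below are local stand-ins mimicking the rte_bbdev_op_status enum (RTE_BBDEV_DATA_ERROR, RTE_BBDEV_CRC_ERROR, RTE_BBDEV_DRV_ERROR), not the library's actual values:

```c
#include <assert.h>
#include <stdint.h>

enum { DATA_ERROR_BIT = 0, CRC_ERROR_BIT = 1, DRV_ERROR_BIT = 4 };

/* Each error flag in the DMA response sets one bbdev-style status bit;
 * the CRC result is only folded in when no transport error occurred. */
static uint32_t
rsp_to_status(int input_err, int dma_err, int fcw_err, int crc_fail)
{
	uint32_t status = 0;

	status |= input_err ? (1u << DATA_ERROR_BIT) : 0;
	status |= dma_err ? (1u << DRV_ERROR_BIT) : 0;
	status |= fcw_err ? (1u << DRV_ERROR_BIT) : 0;
	/* CRC result is only meaningful when no other error is raised */
	if (status == 0)
		status |= (uint32_t)crc_fail << CRC_ERROR_BIT;
	return status;
}
```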
+
+/* Dequeue one LDPC decode operation from ACC100 device in CB mode */
+static inline int
+dequeue_ldpc_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
+		struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_dec_op *op;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	rsp.val = atom_desc.rsp.val;
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+	op->status |= rsp.input_err << RTE_BBDEV_DATA_ERROR;
+	op->status |= rsp.dma_err << RTE_BBDEV_DRV_ERROR;
+	op->status |= rsp.fcw_err << RTE_BBDEV_DRV_ERROR;
+	if (op->status != 0)
+		q_data->queue_stats.dequeue_err_count++;
+
+	op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+	if (op->ldpc_dec.hard_output.length > 0 && !rsp.synd_ok)
+		op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
+	op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
+
+	/* Check if this is the last desc in batch (Atomic Queue) */
+	if (desc->req.last_desc_in_batch) {
+		(*aq_dequeued)++;
+		desc->req.last_desc_in_batch = 0;
+	}
+
+	desc->rsp.val = ACC100_DMA_DESC_TYPE;
+	desc->rsp.add_info_0 = 0;
+	desc->rsp.add_info_1 = 0;
+
+	*ref_op = op;
+
+	/* One CB (op) was successfully dequeued */
+	return 1;
+}
+
+/* Dequeue one decode operation from ACC100 device in TB mode. */
+static inline int
+dequeue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, *last_desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_dec_op *op;
+	uint8_t cbs_in_tb = 1, cb_idx = 0;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Get number of CBs in dequeued TB */
+	cbs_in_tb = desc->req.cbs_in_tb;
+	/* Get last CB */
+	last_desc = q->ring_addr + ((q->sw_ring_tail
+			+ dequeued_cbs + cbs_in_tb - 1)
+			& q->sw_ring_wrap_mask);
+	/* Check if last CB in TB is ready to dequeue (and thus
+	 * the whole TB) - checking sdone bit. If not return.
+	 */
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
+			__ATOMIC_RELAXED);
+	if (!(atom_desc.rsp.val & ACC100_SDONE))
+		return -1;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+
+	/* Read remaining CBs if exists */
+	while (cb_idx < cbs_in_tb) {
+		desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+				& q->sw_ring_wrap_mask);
+		atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+				__ATOMIC_RELAXED);
+		rsp.val = atom_desc.rsp.val;
+		rte_bbdev_log_debug("Resp. desc %p: %x", desc,
+				rsp.val);
+
+		op->status |= ((rsp.input_err)
+				? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+		op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+		op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+		/* CRC status is only reported when no other error is set */
+		if (!op->status)
+			op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+		op->turbo_dec.iter_count = RTE_MAX((uint8_t) rsp.iter_cnt,
+				op->turbo_dec.iter_count);
+
+		/* Check if this is the last desc in batch (Atomic Queue) */
+		if (desc->req.last_desc_in_batch) {
+			(*aq_dequeued)++;
+			desc->req.last_desc_in_batch = 0;
+		}
+		desc->rsp.val = ACC100_DMA_DESC_TYPE;
+		desc->rsp.add_info_0 = 0;
+		desc->rsp.add_info_1 = 0;
+		dequeued_cbs++;
+		cb_idx++;
+	}
+
+	*ref_op = op;
+
+	return cb_idx;
+}
+
+/* Dequeue LDPC encode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t dequeue_num, i, dequeued_cbs = 0, dequeued_descs = 0;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = RTE_MIN(avail, num);
+
+	for (i = 0; i < dequeue_num; i++) {
+		ret = dequeue_enc_one_op_cb(q, &ops[dequeued_cbs],
+				dequeued_descs, &aq_dequeued);
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+		dequeued_descs++;
+		if (dequeued_cbs >= num)
+			break;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_descs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += dequeued_cbs;
+
+	return dequeued_cbs;
+}
+
+/* Dequeue decode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint16_t dequeue_num;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t i;
+	uint16_t dequeued_cbs = 0;
+	struct rte_bbdev_dec_op *op;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = RTE_MIN(avail, num);
+
+	for (i = 0; i < dequeue_num; ++i) {
+		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask))->req.op_addr;
+		if (op->ldpc_dec.code_block_mode == 0)
+			ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+		else
+			ret = dequeue_ldpc_dec_one_op_cb(
+					q_data, q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_cbs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += i;
+
+	return i;
+}
+
 /* Initialization Function */
 static void
 acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
@@ -730,6 +2344,10 @@
 	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
 
 	dev->dev_ops = &acc100_bbdev_ops;
+	dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
+	dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
+	dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
+	dev->dequeue_ldpc_dec_ops = acc100_dequeue_ldpc_dec;
 
 	((struct acc100_device *) dev->data->dev_private)->pf_device =
 			!strcmp(drv->driver.name,
@@ -842,4 +2460,3 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
 RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
 RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
 RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
-
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 5c8dde3..38818f4 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -88,6 +88,8 @@
 #define ACC100_TMPL_PRI_3      0x0f0e0d0c
 #define ACC100_QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
 #define ACC100_WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
+#define ACC100_FDONE    0x80000000
+#define ACC100_SDONE    0x40000000
 
 #define ACC100_NUM_TMPL       32
 /* Mapping of signals for the available engines */
@@ -120,6 +122,9 @@
 #define ACC100_FCW_TD_BLEN                24
 #define ACC100_FCW_LE_BLEN                32
 #define ACC100_FCW_LD_BLEN                36
+#define ACC100_5GUL_SIZE_0                16
+#define ACC100_5GUL_SIZE_1                40
+#define ACC100_5GUL_OFFSET_0              36
 
 #define ACC100_FCW_VER         2
 #define ACC100_MUX_5GDL_DESC   6
@@ -402,6 +407,7 @@ struct __rte_packed acc100_dma_req_desc {
 union acc100_dma_desc {
 	struct acc100_dma_req_desc req;
 	union acc100_dma_rsp_desc rsp;
+	uint64_t atom_hdr;
 };
 
 
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v11 06/10] baseband/acc100: add HARQ loopback support
  2020-10-02  1:01   ` [dpdk-dev] [PATCH v11 00/10] bbdev PMD ACC100 Nicolas Chautru
                       ` (4 preceding siblings ...)
  2020-10-02  1:01     ` [dpdk-dev] [PATCH v11 05/10] baseband/acc100: add LDPC processing functions Nicolas Chautru
@ 2020-10-02  1:01     ` Nicolas Chautru
  2020-10-02  1:01     ` [dpdk-dev] [PATCH v11 07/10] baseband/acc100: add support for 4G processing Nicolas Chautru
                       ` (3 subsequent siblings)
  9 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-10-02  1:01 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, trix, maxime.coquelin, ferruh.yigit,
	tianjiao.liu, Nicolas Chautru

Additional support for HARQ memory loopback

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
Reviewed-by: Tom Rix <trix@redhat.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 159 ++++++++++++++++++++++++++++++-
 1 file changed, 155 insertions(+), 4 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index f894438..4bb9b61 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -685,6 +685,7 @@
 				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
 				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
 #ifdef ACC100_EXT_MEM
+				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK |
 				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
 				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
 #endif
@@ -1419,10 +1420,7 @@
 	acc100_fcw_le_fill(ops[0], &desc->req.fcw_le, num);
 
 	/** This could be done at polling */
-	desc->req.word0 = ACC100_DMA_DESC_TYPE;
-	desc->req.word1 = 0; /**< Timestamp could be disabled */
-	desc->req.word2 = 0;
-	desc->req.word3 = 0;
+	acc100_header_init(&desc->req);
 	desc->req.numCBs = num;
 
 	in_length_in_bytes = ops[0]->ldpc_enc.input.data->data_len;
@@ -1505,12 +1503,165 @@
 	return 1;
 }
 
+static inline int
+harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs)
+{
+	struct acc100_fcw_ld *fcw;
+	union acc100_dma_desc *desc;
+	int next_triplet = 1;
+	struct rte_mbuf *hq_output_head, *hq_output;
+	uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
+	if (harq_in_length == 0) {
+		rte_bbdev_log(ERR, "Loopback of invalid null size");
+		return -EINVAL;
+	}
+
+	int h_comp = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
+			) ? 1 : 0;
+	if (h_comp == 1)
+		harq_in_length = harq_in_length * 8 / 6;
+	harq_in_length = RTE_ALIGN(harq_in_length, 64);
+	uint16_t harq_dma_length_in = (h_comp == 0) ?
+			harq_in_length :
+			harq_in_length * 6 / 8;
+	uint16_t harq_dma_length_out = harq_dma_length_in;
+	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+	uint16_t harq_index = (ddr_mem_in ?
+			op->ldpc_dec.harq_combined_input.offset :
+			op->ldpc_dec.harq_combined_output.offset)
+			/ ACC100_HARQ_OFFSET;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	fcw = &desc->req.fcw_ld;
+	/* Set the FCW from loopback into DDR */
+	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
+	fcw->FCWversion = ACC100_FCW_VER;
+	fcw->qm = 2;
+	fcw->Zc = 384;
+	if (harq_in_length < 16 * ACC100_N_ZC_1)
+		fcw->Zc = 16;
+	fcw->ncb = fcw->Zc * ACC100_N_ZC_1;
+	fcw->rm_e = 2;
+	fcw->hcin_en = 1;
+	fcw->hcout_en = 1;
+
+	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d",
+			ddr_mem_in, harq_index,
+			harq_layout[harq_index].offset, harq_in_length,
+			harq_dma_length_in);
+
+	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
+		fcw->hcin_size0 = harq_layout[harq_index].size0;
+		fcw->hcin_offset = harq_layout[harq_index].offset;
+		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
+		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
+		if (h_comp == 1)
+			harq_dma_length_in = harq_dma_length_in * 6 / 8;
+	} else {
+		fcw->hcin_size0 = harq_in_length;
+	}
+	harq_layout[harq_index].val = 0;
+	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d",
+			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
+	fcw->hcout_size0 = harq_in_length;
+	fcw->hcin_decomp_mode = h_comp;
+	fcw->hcout_comp_mode = h_comp;
+	fcw->gain_i = 1;
+	fcw->gain_h = 1;
+
+	/* Set the prefix of descriptor. This could be done at polling */
+	acc100_header_init(&desc->req);
+
+	/* Null LLR input for Decoder */
+	desc->req.data_ptrs[next_triplet].address =
+			q->lb_in_addr_iova;
+	desc->req.data_ptrs[next_triplet].blen = 2;
+	desc->req.data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
+	desc->req.data_ptrs[next_triplet].last = 0;
+	desc->req.data_ptrs[next_triplet].dma_ext = 0;
+	next_triplet++;
+
+	/* HARQ Combine input from either Memory interface */
+	if (!ddr_mem_in) {
+		next_triplet = acc100_dma_fill_blk_type_out(&desc->req,
+				op->ldpc_dec.harq_combined_input.data,
+				op->ldpc_dec.harq_combined_input.offset,
+				harq_dma_length_in,
+				next_triplet,
+				ACC100_DMA_BLKID_IN_HARQ);
+	} else {
+		desc->req.data_ptrs[next_triplet].address =
+				op->ldpc_dec.harq_combined_input.offset;
+		desc->req.data_ptrs[next_triplet].blen =
+				harq_dma_length_in;
+		desc->req.data_ptrs[next_triplet].blkid =
+				ACC100_DMA_BLKID_IN_HARQ;
+		desc->req.data_ptrs[next_triplet].dma_ext = 1;
+		next_triplet++;
+	}
+	desc->req.data_ptrs[next_triplet - 1].last = 1;
+	desc->req.m2dlen = next_triplet;
+
+	/* Dropped decoder hard output */
+	desc->req.data_ptrs[next_triplet].address =
+			q->lb_out_addr_iova;
+	desc->req.data_ptrs[next_triplet].blen = ACC100_BYTES_IN_WORD;
+	desc->req.data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_OUT_HARD;
+	desc->req.data_ptrs[next_triplet].last = 0;
+	desc->req.data_ptrs[next_triplet].dma_ext = 0;
+	next_triplet++;
+
+	/* HARQ Combine output to either Memory interface */
+	if (check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE
+			)) {
+		desc->req.data_ptrs[next_triplet].address =
+				op->ldpc_dec.harq_combined_output.offset;
+		desc->req.data_ptrs[next_triplet].blen =
+				harq_dma_length_out;
+		desc->req.data_ptrs[next_triplet].blkid =
+				ACC100_DMA_BLKID_OUT_HARQ;
+		desc->req.data_ptrs[next_triplet].dma_ext = 1;
+		next_triplet++;
+	} else {
+		hq_output_head = op->ldpc_dec.harq_combined_output.data;
+		hq_output = op->ldpc_dec.harq_combined_output.data;
+		next_triplet = acc100_dma_fill_blk_type_out(
+				&desc->req,
+				op->ldpc_dec.harq_combined_output.data,
+				op->ldpc_dec.harq_combined_output.offset,
+				harq_dma_length_out,
+				next_triplet,
+				ACC100_DMA_BLKID_OUT_HARQ);
+		/* HARQ output */
+		mbuf_append(hq_output_head, hq_output, harq_dma_length_out);
+		op->ldpc_dec.harq_combined_output.length =
+				harq_dma_length_out;
+	}
+	desc->req.data_ptrs[next_triplet - 1].last = 1;
+	desc->req.d2mlen = next_triplet - desc->req.m2dlen;
+	desc->req.op_addr = op;
+
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
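The HARQ size arithmetic at the top of the loopback path is easy to get wrong, so here is an assumed stand-alone sketch of just that conversion (function names are illustrative): with 6-bit compression the DMA moves 6 bits per 8-bit LLR, so the logical length is the input length scaled by 8/6 and rounded up to a 64-byte multiple (what RTE_ALIGN does), and the DMA length converts back by scaling 6/8:

```c
#include <assert.h>
#include <stdint.h>

/* Logical (uncompressed) HARQ length from the op's input length. */
static uint16_t
harq_logical_len(uint16_t harq_in_length, int h_comp)
{
	if (h_comp)
		harq_in_length = harq_in_length * 8 / 6;
	/* Equivalent of RTE_ALIGN(x, 64): round up to 64-byte multiple */
	return (uint16_t)((harq_in_length + 63) & ~63);
}

/* On-the-wire DMA length for a given logical length. */
static uint16_t
harq_dma_len(uint16_t logical_len, int h_comp)
{
	return h_comp ? (uint16_t)(logical_len * 6 / 8) : logical_len;
}
```

For example, 96 compressed bytes expand to 128 logical bytes (already 64-byte aligned), and the matching DMA length converts back to 96.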
+
+/* Enqueue one decode operation for ACC100 device in CB mode */
 static inline int
 enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
 		uint16_t total_enqueued_cbs, bool same_op)
 {
 	int ret;
+	if (unlikely(check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK))) {
+		ret = harq_loopback(q, op, total_enqueued_cbs);
+		return ret;
+	}
 
 	union acc100_dma_desc *desc;
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v11 07/10] baseband/acc100: add support for 4G processing
  2020-10-02  1:01   ` [dpdk-dev] [PATCH v11 00/10] bbdev PMD ACC100 Nicolas Chautru
                       ` (5 preceding siblings ...)
  2020-10-02  1:01     ` [dpdk-dev] [PATCH v11 06/10] baseband/acc100: add HARQ loopback support Nicolas Chautru
@ 2020-10-02  1:01     ` Nicolas Chautru
  2020-10-02  1:01     ` [dpdk-dev] [PATCH v11 08/10] baseband/acc100: add interrupt support to PMD Nicolas Chautru
                       ` (2 subsequent siblings)
  9 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-10-02  1:01 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, trix, maxime.coquelin, ferruh.yigit,
	tianjiao.liu, Nicolas Chautru

Adding capability for 4G encode and decode processing

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 doc/guides/bbdevs/features/acc100.ini    |    4 +-
 drivers/baseband/acc100/rte_acc100_pmd.c | 1007 +++++++++++++++++++++++++++---
 2 files changed, 936 insertions(+), 75 deletions(-)

diff --git a/doc/guides/bbdevs/features/acc100.ini b/doc/guides/bbdevs/features/acc100.ini
index 40c7adc..642cd48 100644
--- a/doc/guides/bbdevs/features/acc100.ini
+++ b/doc/guides/bbdevs/features/acc100.ini
@@ -4,8 +4,8 @@
 ; Refer to default.ini for the full list of available PMD features.
 ;
 [Features]
-Turbo Decoder (4G)     = N
-Turbo Encoder (4G)     = N
+Turbo Decoder (4G)     = Y
+Turbo Encoder (4G)     = Y
 LDPC Decoder (5G)      = Y
 LDPC Encoder (5G)      = Y
 LLR/HARQ Compression   = Y
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 4bb9b61..8e721be 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -342,7 +342,6 @@
 	free_base_addresses(base_addrs, i);
 }
 
-
 /* Allocate 64MB memory used for all software rings */
 static int
 acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
@@ -664,6 +663,41 @@
 
 	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
 		{
+			.type = RTE_BBDEV_OP_TURBO_DEC,
+			.cap.turbo_dec = {
+				.capability_flags =
+					RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE |
+					RTE_BBDEV_TURBO_CRC_TYPE_24B |
+					RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
+					RTE_BBDEV_TURBO_EARLY_TERMINATION |
+					RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
+					RTE_BBDEV_TURBO_MAP_DEC |
+					RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP |
+					RTE_BBDEV_TURBO_DEC_SCATTER_GATHER,
+				.max_llr_modulus = INT8_MAX,
+				.num_buffers_src =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+				.num_buffers_hard_out =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+				.num_buffers_soft_out =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+			}
+		},
+		{
+			.type = RTE_BBDEV_OP_TURBO_ENC,
+			.cap.turbo_enc = {
+				.capability_flags =
+					RTE_BBDEV_TURBO_CRC_24B_ATTACH |
+					RTE_BBDEV_TURBO_RV_INDEX_BYPASS |
+					RTE_BBDEV_TURBO_RATE_MATCH |
+					RTE_BBDEV_TURBO_ENC_SCATTER_GATHER,
+				.num_buffers_src =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+				.num_buffers_dst =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+			}
+		},
+		{
 			.type   = RTE_BBDEV_OP_LDPC_ENC,
 			.cap.ldpc_enc = {
 				.capability_flags =
@@ -746,7 +780,6 @@
 #endif
 }
 
-
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.setup_queues = acc100_setup_queues,
 	.close = acc100_dev_close,
@@ -790,6 +823,58 @@
 	return tail;
 }
 
+/* Fill in a frame control word for turbo encoding. */
+static inline void
+acc100_fcw_te_fill(const struct rte_bbdev_enc_op *op, struct acc100_fcw_te *fcw)
+{
+	fcw->code_block_mode = op->turbo_enc.code_block_mode;
+	if (fcw->code_block_mode == 0) { /* For TB mode */
+		fcw->k_neg = op->turbo_enc.tb_params.k_neg;
+		fcw->k_pos = op->turbo_enc.tb_params.k_pos;
+		fcw->c_neg = op->turbo_enc.tb_params.c_neg;
+		fcw->c = op->turbo_enc.tb_params.c;
+		fcw->ncb_neg = op->turbo_enc.tb_params.ncb_neg;
+		fcw->ncb_pos = op->turbo_enc.tb_params.ncb_pos;
+
+		if (check_bit(op->turbo_enc.op_flags,
+				RTE_BBDEV_TURBO_RATE_MATCH)) {
+			fcw->bypass_rm = 0;
+			fcw->cab = op->turbo_enc.tb_params.cab;
+			fcw->ea = op->turbo_enc.tb_params.ea;
+			fcw->eb = op->turbo_enc.tb_params.eb;
+		} else {
+			/* E is set to the encoding output size when RM is
+			 * bypassed.
+			 */
+			fcw->bypass_rm = 1;
+			fcw->cab = fcw->c_neg;
+			fcw->ea = 3 * fcw->k_neg + 12;
+			fcw->eb = 3 * fcw->k_pos + 12;
+		}
+	} else { /* For CB mode */
+		fcw->k_pos = op->turbo_enc.cb_params.k;
+		fcw->ncb_pos = op->turbo_enc.cb_params.ncb;
+
+		if (check_bit(op->turbo_enc.op_flags,
+				RTE_BBDEV_TURBO_RATE_MATCH)) {
+			fcw->bypass_rm = 0;
+			fcw->eb = op->turbo_enc.cb_params.e;
+		} else {
+			/* E is set to the encoding output size when RM is
+			 * bypassed.
+			 */
+			fcw->bypass_rm = 1;
+			fcw->eb = 3 * fcw->k_pos + 12;
+		}
+	}
+
+	fcw->bypass_rv_idx1 = check_bit(op->turbo_enc.op_flags,
+			RTE_BBDEV_TURBO_RV_INDEX_BYPASS);
+	fcw->code_block_crc = check_bit(op->turbo_enc.op_flags,
+			RTE_BBDEV_TURBO_CRC_24B_ATTACH);
+	fcw->rv_idx1 = op->turbo_enc.rv_index;
+}
+
 /* Compute value of k0.
  * Based on 3GPP 38.212 Table 5.4.2.1-2
  * Starting position of different redundancy versions, k0
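[The rate-match bypass fallback used in acc100_fcw_te_fill() above sets E to the full turbo-encoder output, i.e. three constituent streams of K bits plus 12 tail bits. A standalone sketch for cross-checking; the helper name is illustrative, not part of the driver:]

```c
#include <assert.h>
#include <stdint.h>

/* Turbo encoder output size in bits when rate matching is bypassed:
 * three streams of K bits plus 12 tail bits (3GPP TS 36.212).
 */
static inline uint32_t
turbo_enc_bypass_e(uint16_t k)
{
	return 3 * (uint32_t)k + 12;
}
```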
@@ -840,6 +925,25 @@
 	fcw->mcb_count = num_cb;
 }
 
+/* Fill in a frame control word for turbo decoding. */
+static inline void
+acc100_fcw_td_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_td *fcw)
+{
+	/* Note: Early termination is always enabled for 4G UL. */
+	fcw->fcw_ver = 1;
+	if (op->turbo_dec.code_block_mode == 0)
+		fcw->k_pos = op->turbo_dec.tb_params.k_pos;
+	else
+		fcw->k_pos = op->turbo_dec.cb_params.k;
+	fcw->turbo_crc_type = check_bit(op->turbo_dec.op_flags,
+			RTE_BBDEV_TURBO_CRC_TYPE_24B);
+	fcw->bypass_sb_deint = 0;
+	fcw->raw_decoder_input_on = 0;
+	fcw->max_iter = op->turbo_dec.iter_max;
+	fcw->half_iter_on = !check_bit(op->turbo_dec.op_flags,
+			RTE_BBDEV_TURBO_HALF_ITERATION_EVEN);
+}
+
 /* Fill in a frame control word for LDPC decoding. */
 static inline void
 acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
@@ -1093,6 +1197,87 @@
 #endif
 
 static inline int
+acc100_dma_desc_te_fill(struct rte_bbdev_enc_op *op,
+		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
+		struct rte_mbuf *output, uint32_t *in_offset,
+		uint32_t *out_offset, uint32_t *out_length,
+		uint32_t *mbuf_total_left, uint32_t *seg_total_left, uint8_t r)
+{
+	int next_triplet = 1; /* FCW already done */
+	uint32_t e, ea, eb, length;
+	uint16_t k, k_neg, k_pos;
+	uint8_t cab, c_neg;
+
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /* Timestamp can be disabled */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+
+	if (op->turbo_enc.code_block_mode == 0) {
+		ea = op->turbo_enc.tb_params.ea;
+		eb = op->turbo_enc.tb_params.eb;
+		cab = op->turbo_enc.tb_params.cab;
+		k_neg = op->turbo_enc.tb_params.k_neg;
+		k_pos = op->turbo_enc.tb_params.k_pos;
+		c_neg = op->turbo_enc.tb_params.c_neg;
+		e = (r < cab) ? ea : eb;
+		k = (r < c_neg) ? k_neg : k_pos;
+	} else {
+		e = op->turbo_enc.cb_params.e;
+		k = op->turbo_enc.cb_params.k;
+	}
+
+	if (check_bit(op->turbo_enc.op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH))
+		length = (k - 24) >> 3;
+	else
+		length = k >> 3;
+
+	if (unlikely((*mbuf_total_left == 0) || (*mbuf_total_left < length))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, length);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
+			length, seg_total_left, next_triplet);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= length;
+
+	/* Set output length */
+	if (check_bit(op->turbo_enc.op_flags, RTE_BBDEV_TURBO_RATE_MATCH))
+		/* Integer round up division by 8 */
+		*out_length = (e + 7) >> 3;
+	else
+		*out_length = (k >> 3) * 3 + 2;
+
+	next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
+			*out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	op->turbo_enc.output.length += *out_length;
+	*out_offset += *out_length;
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
+
+static inline int
 acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
 		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
 		struct rte_mbuf *output, uint32_t *in_offset,
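[The output-length rule in acc100_dma_desc_te_fill() above rounds the E rate-matched bits up to bytes, or emits the three full K-bit streams plus 2 tail bytes when rate matching is bypassed. A hedged standalone sketch, with an illustrative helper name:]

```c
#include <assert.h>
#include <stdint.h>

/* Encoder output length in bytes, mirroring acc100_dma_desc_te_fill():
 * with rate matching, the E bits are rounded up to whole bytes;
 * without it, the output is 3 streams of K bits (K/8 bytes each)
 * plus 12 tail bits, which round up to 2 extra bytes.
 */
static inline uint32_t
turbo_enc_out_len(uint16_t k, uint32_t e, int rate_match)
{
	if (rate_match)
		return (e + 7) >> 3; /* integer round-up division by 8 */
	return ((uint32_t)k >> 3) * 3 + 2;
}
```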
@@ -1151,6 +1336,117 @@
 }
 
 static inline int
+acc100_dma_desc_td_fill(struct rte_bbdev_dec_op *op,
+		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
+		struct rte_mbuf *h_output, struct rte_mbuf *s_output,
+		uint32_t *in_offset, uint32_t *h_out_offset,
+		uint32_t *s_out_offset, uint32_t *h_out_length,
+		uint32_t *s_out_length, uint32_t *mbuf_total_left,
+		uint32_t *seg_total_left, uint8_t r)
+{
+	int next_triplet = 1; /* FCW already done */
+	uint16_t k;
+	uint16_t crc24_overlap = 0;
+	uint32_t e, kw;
+
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /* Timestamp can be disabled */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+
+	if (op->turbo_dec.code_block_mode == 0) {
+		k = (r < op->turbo_dec.tb_params.c_neg)
+			? op->turbo_dec.tb_params.k_neg
+			: op->turbo_dec.tb_params.k_pos;
+		e = (r < op->turbo_dec.tb_params.cab)
+			? op->turbo_dec.tb_params.ea
+			: op->turbo_dec.tb_params.eb;
+	} else {
+		k = op->turbo_dec.cb_params.k;
+		e = op->turbo_dec.cb_params.e;
+	}
+
+	if ((op->turbo_dec.code_block_mode == 0)
+		&& !check_bit(op->turbo_dec.op_flags,
+		RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP))
+		crc24_overlap = 24;
+
+	/* Calculate the circular buffer size.
+	 * According to 3GPP TS 36.212 section 5.1.4.2:
+	 *   Kw = 3 * Kpi,
+	 * where:
+	 *   Kpi = nCol * nRow
+	 * where nCol is 32 and nRow can be calculated from:
+	 *   D <= nCol * nRow
+	 * where D is the size of each output from the turbo encoder block
+	 * (k + 4).
+	 */
+	kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
+
+	if (unlikely((*mbuf_total_left == 0) || (*mbuf_total_left < kw))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, kw);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset, kw,
+			seg_total_left, next_triplet);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= kw;
+
+	next_triplet = acc100_dma_fill_blk_type_out(
+			desc, h_output, *h_out_offset,
+			k >> 3, next_triplet, ACC100_DMA_BLKID_OUT_HARD);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+
+	*h_out_length = ((k - crc24_overlap) >> 3);
+	op->turbo_dec.hard_output.length += *h_out_length;
+	*h_out_offset += *h_out_length;
+
+	/* Soft output */
+	if (check_bit(op->turbo_dec.op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT)) {
+		if (check_bit(op->turbo_dec.op_flags,
+				RTE_BBDEV_TURBO_EQUALIZER))
+			*s_out_length = e;
+		else
+			*s_out_length = (k * 3) + 12;
+
+		next_triplet = acc100_dma_fill_blk_type_out(desc, s_output,
+				*s_out_offset, *s_out_length, next_triplet,
+				ACC100_DMA_BLKID_OUT_SOFT);
+		if (unlikely(next_triplet < 0)) {
+			rte_bbdev_log(ERR,
+					"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+					op);
+			return -1;
+		}
+
+		op->turbo_dec.soft_output.length += *s_out_length;
+		*s_out_offset += *s_out_length;
+	}
+
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
+
+static inline int
 acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
 		struct acc100_dma_req_desc *desc,
 		struct rte_mbuf **input, struct rte_mbuf *h_output,
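[The Kw derivation in acc100_dma_desc_td_fill() above, written as RTE_ALIGN_CEIL(k + 4, 32) * 3, can be spelled out from the 3GPP definition; this is an illustrative sketch only:]

```c
#include <assert.h>
#include <stdint.h>

#define NCOL 32 /* sub-block interleaver columns, 3GPP TS 36.212 5.1.4.1 */

/* Circular buffer size Kw = 3 * Kpi, where Kpi = nCol * nRow and
 * nRow is the smallest row count satisfying D <= nCol * nRow,
 * with D = K + 4 (each encoder output block).  Equivalent to
 * RTE_ALIGN_CEIL(k + 4, 32) * 3.
 */
static inline uint32_t
turbo_dec_kw(uint16_t k)
{
	uint32_t d = (uint32_t)k + 4;
	uint32_t kpi = ((d + NCOL - 1) / NCOL) * NCOL; /* round up to nCol */

	return 3 * kpi;
}
```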
@@ -1404,6 +1700,51 @@
 
 /* Enqueue one encode operation for ACC100 device in CB mode */
 static inline int
+enqueue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+		uint16_t total_enqueued_cbs)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+		seg_total_left;
+	struct rte_mbuf *input, *output_head, *output;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_te_fill(op, &desc->req.fcw_te);
+
+	input = op->turbo_enc.input.data;
+	output_head = output = op->turbo_enc.output.data;
+	in_offset = op->turbo_enc.input.offset;
+	out_offset = op->turbo_enc.output.offset;
+	out_length = 0;
+	mbuf_total_left = op->turbo_enc.input.length;
+	seg_total_left = rte_pktmbuf_data_len(op->turbo_enc.input.data)
+			- in_offset;
+
+	ret = acc100_dma_desc_te_fill(op, &desc->req, &input, output,
+			&in_offset, &out_offset, &out_length, &mbuf_total_left,
+			&seg_total_left, 0);
+
+	if (unlikely(ret < 0))
+		return ret;
+
+	mbuf_append(output_head, output, out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_te,
+			sizeof(desc->req.fcw_te) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+	if (check_mbuf_total_left(mbuf_total_left) != 0)
+		return -EINVAL;
+#endif
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+/* Enqueue LDPC encode operations for ACC100 device in CB mode. */
+static inline int
 enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
 		uint16_t total_enqueued_cbs, int16_t num)
 {
@@ -1503,85 +1844,235 @@
 	return 1;
 }
 
-static inline int
-harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
-		uint16_t total_enqueued_cbs) {
-	struct acc100_fcw_ld *fcw;
-	union acc100_dma_desc *desc;
-	int next_triplet = 1;
-	struct rte_mbuf *hq_output_head, *hq_output;
-	uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
-	if (harq_in_length == 0) {
-		rte_bbdev_log(ERR, "Loopback of invalid null size\n");
-		return -EINVAL;
-	}
 
-	int h_comp = check_bit(op->ldpc_dec.op_flags,
-			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
-			) ? 1 : 0;
-	if (h_comp == 1)
-		harq_in_length = harq_in_length * 8 / 6;
-	harq_in_length = RTE_ALIGN(harq_in_length, 64);
-	uint16_t harq_dma_length_in = (h_comp == 0) ?
-			harq_in_length :
-			harq_in_length * 6 / 8;
-	uint16_t harq_dma_length_out = harq_dma_length_in;
-	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
-			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
-	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
-	uint16_t harq_index = (ddr_mem_in ?
-			op->ldpc_dec.harq_combined_input.offset :
-			op->ldpc_dec.harq_combined_output.offset)
-			/ ACC100_HARQ_OFFSET;
+/* Enqueue one encode operation for ACC100 device in TB mode. */
+static inline int
+enqueue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint8_t r, c;
+	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+		seg_total_left;
+	struct rte_mbuf *input, *output_head, *output;
+	uint16_t current_enqueued_cbs = 0;
 
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
-	fcw = &desc->req.fcw_ld;
-	/* Set the FCW from loopback into DDR */
-	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
-	fcw->FCWversion = ACC100_FCW_VER;
-	fcw->qm = 2;
-	fcw->Zc = 384;
-	if (harq_in_length < 16 * ACC100_N_ZC_1)
-		fcw->Zc = 16;
-	fcw->ncb = fcw->Zc * ACC100_N_ZC_1;
-	fcw->rm_e = 2;
-	fcw->hcin_en = 1;
-	fcw->hcout_en = 1;
+	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+	acc100_fcw_te_fill(op, &desc->req.fcw_te);
 
-	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d\n",
-			ddr_mem_in, harq_index,
-			harq_layout[harq_index].offset, harq_in_length,
-			harq_dma_length_in);
+	input = op->turbo_enc.input.data;
+	output_head = output = op->turbo_enc.output.data;
+	in_offset = op->turbo_enc.input.offset;
+	out_offset = op->turbo_enc.output.offset;
+	out_length = 0;
+	mbuf_total_left = op->turbo_enc.input.length;
 
-	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
-		fcw->hcin_size0 = harq_layout[harq_index].size0;
-		fcw->hcin_offset = harq_layout[harq_index].offset;
-		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
-		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
-		if (h_comp == 1)
-			harq_dma_length_in = harq_dma_length_in * 6 / 8;
-	} else {
-		fcw->hcin_size0 = harq_in_length;
-	}
-	harq_layout[harq_index].val = 0;
-	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
-			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
-	fcw->hcout_size0 = harq_in_length;
-	fcw->hcin_decomp_mode = h_comp;
-	fcw->hcout_comp_mode = h_comp;
-	fcw->gain_i = 1;
-	fcw->gain_h = 1;
+	c = op->turbo_enc.tb_params.c;
+	r = op->turbo_enc.tb_params.r;
 
-	/* Set the prefix of descriptor. This could be done at polling */
-	acc100_header_init(&desc->req);
+	while (mbuf_total_left > 0 && r < c) {
+		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+		/* Set up DMA descriptor */
+		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+				& q->sw_ring_wrap_mask);
+		desc->req.data_ptrs[0].address = q->ring_addr_iova + fcw_offset;
+		desc->req.data_ptrs[0].blen = ACC100_FCW_TE_BLEN;
 
-	/* Null LLR input for Decoder */
-	desc->req.data_ptrs[next_triplet].address =
-			q->lb_in_addr_iova;
-	desc->req.data_ptrs[next_triplet].blen = 2;
-	desc->req.data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
+		ret = acc100_dma_desc_te_fill(op, &desc->req, &input, output,
+				&in_offset, &out_offset, &out_length,
+				&mbuf_total_left, &seg_total_left, r);
+		if (unlikely(ret < 0))
+			return ret;
+		mbuf_append(output_head, output, out_length);
+
+		/* Set total number of CBs in TB */
+		desc->req.cbs_in_tb = cbs_in_tb;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "FCW", &desc->req.fcw_te,
+				sizeof(desc->req.fcw_te) - 8);
+		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+		if (seg_total_left == 0) {
+			/* Go to the next mbuf */
+			input = input->next;
+			in_offset = 0;
+			output = output->next;
+			out_offset = 0;
+		}
+
+		total_enqueued_cbs++;
+		current_enqueued_cbs++;
+		r++;
+	}
+
+	if (unlikely(desc == NULL))
+		return current_enqueued_cbs;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (check_mbuf_total_left(mbuf_total_left) != 0)
+		return -EINVAL;
+#endif
+
+	/* Set SDone on last CB descriptor for TB mode. */
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	return current_enqueued_cbs;
+}
+
+/* Enqueue one decode operation for ACC100 device in CB mode */
+static inline int
+enqueue_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint32_t in_offset, h_out_offset, s_out_offset, s_out_length,
+		h_out_length, mbuf_total_left, seg_total_left;
+	struct rte_mbuf *input, *h_output_head, *h_output,
+		*s_output_head, *s_output;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_td_fill(op, &desc->req.fcw_td);
+
+	input = op->turbo_dec.input.data;
+	h_output_head = h_output = op->turbo_dec.hard_output.data;
+	s_output_head = s_output = op->turbo_dec.soft_output.data;
+	in_offset = op->turbo_dec.input.offset;
+	h_out_offset = op->turbo_dec.hard_output.offset;
+	s_out_offset = op->turbo_dec.soft_output.offset;
+	h_out_length = s_out_length = 0;
+	mbuf_total_left = op->turbo_dec.input.length;
+	seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(input == NULL)) {
+		rte_bbdev_log(ERR, "Invalid mbuf pointer");
+		return -EFAULT;
+	}
+#endif
+
+	/* Set up DMA descriptor */
+	desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+
+	ret = acc100_dma_desc_td_fill(op, &desc->req, &input, h_output,
+			s_output, &in_offset, &h_out_offset, &s_out_offset,
+			&h_out_length, &s_out_length, &mbuf_total_left,
+			&seg_total_left, 0);
+
+	if (unlikely(ret < 0))
+		return ret;
+
+	/* Hard output */
+	mbuf_append(h_output_head, h_output, h_out_length);
+
+	/* Soft output */
+	if (check_bit(op->turbo_dec.op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT))
+		mbuf_append(s_output_head, s_output, s_out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_td,
+			sizeof(desc->req.fcw_td) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+	if (check_mbuf_total_left(mbuf_total_left) != 0)
+		return -EINVAL;
+#endif
+
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+static inline int
+harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs) {
+	struct acc100_fcw_ld *fcw;
+	union acc100_dma_desc *desc;
+	int next_triplet = 1;
+	struct rte_mbuf *hq_output_head, *hq_output;
+	uint16_t harq_dma_length_in, harq_dma_length_out;
+	uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
+	if (harq_in_length == 0) {
+		rte_bbdev_log(ERR, "Loopback input of invalid null size");
+		return -EINVAL;
+	}
+
+	int h_comp = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
+			) ? 1 : 0;
+	if (h_comp == 1) {
+		harq_in_length = harq_in_length * 8 / 6;
+		harq_in_length = RTE_ALIGN(harq_in_length, 64);
+		harq_dma_length_in = harq_in_length * 6 / 8;
+	} else {
+		harq_in_length = RTE_ALIGN(harq_in_length, 64);
+		harq_dma_length_in = harq_in_length;
+	}
+	harq_dma_length_out = harq_dma_length_in;
+
+	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+	uint16_t harq_index = (ddr_mem_in ?
+			op->ldpc_dec.harq_combined_input.offset :
+			op->ldpc_dec.harq_combined_output.offset)
+			/ ACC100_HARQ_OFFSET;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	fcw = &desc->req.fcw_ld;
+	/* Set the FCW from loopback into DDR */
+	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
+	fcw->FCWversion = ACC100_FCW_VER;
+	fcw->qm = 2;
+	fcw->Zc = 384;
+	if (harq_in_length < 16 * ACC100_N_ZC_1)
+		fcw->Zc = 16;
+	fcw->ncb = fcw->Zc * ACC100_N_ZC_1;
+	fcw->rm_e = 2;
+	fcw->hcin_en = 1;
+	fcw->hcout_en = 1;
+
+	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d",
+			ddr_mem_in, harq_index,
+			harq_layout[harq_index].offset, harq_in_length,
+			harq_dma_length_in);
+
+	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
+		fcw->hcin_size0 = harq_layout[harq_index].size0;
+		fcw->hcin_offset = harq_layout[harq_index].offset;
+		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
+		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
+		if (h_comp == 1)
+			harq_dma_length_in = harq_dma_length_in * 6 / 8;
+	} else {
+		fcw->hcin_size0 = harq_in_length;
+	}
+	harq_layout[harq_index].val = 0;
+	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d",
+			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
+	fcw->hcout_size0 = harq_in_length;
+	fcw->hcin_decomp_mode = h_comp;
+	fcw->hcout_comp_mode = h_comp;
+	fcw->gain_i = 1;
+	fcw->gain_h = 1;
+
+	/* Set the prefix of descriptor. This could be done at polling */
+	acc100_header_init(&desc->req);
+
+	/* Null LLR input for Decoder */
+	desc->req.data_ptrs[next_triplet].address =
+			q->lb_in_addr_iova;
+	desc->req.data_ptrs[next_triplet].blen = 2;
+	desc->req.data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
 	desc->req.data_ptrs[next_triplet].last = 0;
 	desc->req.data_ptrs[next_triplet].dma_ext = 0;
 	next_triplet++;
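[As a side note on the HARQ length arithmetic in harq_loopback() above: with 6-bit compression the logical length is first expanded by 8/6, aligned to 64 bytes, then the DMA length is shrunk back by 6/8. A standalone sketch under those assumptions; helper names are illustrative:]

```c
#include <assert.h>
#include <stdint.h>

/* HARQ length accounting with optional 6-bit compression, following
 * the harq_loopback() logic: the logical (uncompressed) length is
 * aligned to 64 bytes, and the DMA length is 6/8 of it when the
 * payload is compressed.
 */
static inline void
harq_lengths(uint16_t in_len, int h_comp,
		uint16_t *logical_len, uint16_t *dma_len)
{
	uint32_t len = in_len;

	if (h_comp)
		len = len * 8 / 6;       /* expand to uncompressed size */
	len = (len + 63) & ~63u;         /* RTE_ALIGN(len, 64) */
	*logical_len = (uint16_t)len;
	*dma_len = (uint16_t)(h_comp ? len * 6 / 8 : len);
}
```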
@@ -1831,6 +2322,102 @@
 	return current_enqueued_cbs;
 }
 
+/* Enqueue one decode operation for ACC100 device in TB mode */
+static inline int
+enqueue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint8_t r, c;
+	uint32_t in_offset, h_out_offset, s_out_offset, s_out_length,
+		h_out_length, mbuf_total_left, seg_total_left;
+	struct rte_mbuf *input, *h_output_head, *h_output,
+		*s_output_head, *s_output;
+	uint16_t current_enqueued_cbs = 0;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+	acc100_fcw_td_fill(op, &desc->req.fcw_td);
+
+	input = op->turbo_dec.input.data;
+	h_output_head = h_output = op->turbo_dec.hard_output.data;
+	s_output_head = s_output = op->turbo_dec.soft_output.data;
+	in_offset = op->turbo_dec.input.offset;
+	h_out_offset = op->turbo_dec.hard_output.offset;
+	s_out_offset = op->turbo_dec.soft_output.offset;
+	h_out_length = s_out_length = 0;
+	mbuf_total_left = op->turbo_dec.input.length;
+	c = op->turbo_dec.tb_params.c;
+	r = op->turbo_dec.tb_params.r;
+
+	while (mbuf_total_left > 0 && r < c) {
+
+		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+
+		/* Set up DMA descriptor */
+		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+				& q->sw_ring_wrap_mask);
+		desc->req.data_ptrs[0].address = q->ring_addr_iova + fcw_offset;
+		desc->req.data_ptrs[0].blen = ACC100_FCW_TD_BLEN;
+		ret = acc100_dma_desc_td_fill(op, &desc->req, &input,
+				h_output, s_output, &in_offset, &h_out_offset,
+				&s_out_offset, &h_out_length, &s_out_length,
+				&mbuf_total_left, &seg_total_left, r);
+
+		if (unlikely(ret < 0))
+			return ret;
+
+		/* Hard output */
+		mbuf_append(h_output_head, h_output, h_out_length);
+
+		/* Soft output */
+		if (check_bit(op->turbo_dec.op_flags,
+				RTE_BBDEV_TURBO_SOFT_OUTPUT))
+			mbuf_append(s_output_head, s_output, s_out_length);
+
+		/* Set total number of CBs in TB */
+		desc->req.cbs_in_tb = cbs_in_tb;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "FCW", &desc->req.fcw_td,
+				sizeof(desc->req.fcw_td) - 8);
+		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+		if (seg_total_left == 0) {
+			/* Go to the next mbuf */
+			input = input->next;
+			in_offset = 0;
+			h_output = h_output->next;
+			h_out_offset = 0;
+
+			if (check_bit(op->turbo_dec.op_flags,
+					RTE_BBDEV_TURBO_SOFT_OUTPUT)) {
+				s_output = s_output->next;
+				s_out_offset = 0;
+			}
+		}
+
+		total_enqueued_cbs++;
+		current_enqueued_cbs++;
+		r++;
+	}
+
+	if (unlikely(desc == NULL))
+		return current_enqueued_cbs;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (check_mbuf_total_left(mbuf_total_left) != 0)
+		return -EINVAL;
+#endif
+	/* Set SDone on last CB descriptor for TB mode */
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	return current_enqueued_cbs;
+}
 
 /* Calculates number of CBs in processed encoder TB based on 'r' and input
  * length.
@@ -1908,6 +2495,45 @@
 	return cbs_in_tb;
 }
 
+/* Enqueue encode operations for ACC100 device in CB mode. */
+static uint16_t
+acc100_enqueue_enc_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i;
+	union acc100_dma_desc *desc;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		/* Check if there is enough space for further processing */
+		if (unlikely(avail - 1 < 0))
+			break;
+		avail -= 1;
+
+		ret = enqueue_enc_one_op_cb(q, ops[i], i);
+		if (ret < 0)
+			break;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
+			& q->sw_ring_wrap_mask);
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, i, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+	return i;
+}
+
 /* Check we can mux encode operations with common FCW */
 static inline bool
 check_mux(struct rte_bbdev_enc_op **ops, uint16_t num) {
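[The avail bookkeeping in acc100_enqueue_enc_cb() above relies on free-running head/tail counters that are masked (sw_ring_wrap_mask) only when addressing the ring; with a power-of-two depth, unsigned subtraction stays wrap-safe. A minimal sketch, assuming 16-bit counters:]

```c
#include <assert.h>
#include <stdint.h>

/* Free-running ring accounting: head/tail are never reset, so the
 * in-flight count is head - tail modulo 2^16, and the free space is
 * depth minus that.  Counter widths are an assumption for illustration.
 */
static inline uint32_t
ring_used(uint16_t head, uint16_t tail)
{
	return (uint16_t)(head - tail);
}

static inline int32_t
ring_free(uint16_t depth, uint16_t head, uint16_t tail)
{
	return (int32_t)depth - (int32_t)ring_used(head, tail);
}
```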
@@ -1976,6 +2602,54 @@
 	return i;
 }
 
+/* Enqueue encode operations for ACC100 device in TB mode. */
+static uint16_t
+acc100_enqueue_enc_tb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i, enqueued_cbs = 0;
+	uint8_t cbs_in_tb;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		cbs_in_tb = get_num_cbs_in_tb_enc(&ops[i]->turbo_enc);
+		/* Check if there is enough space for further processing */
+		if (unlikely(avail - cbs_in_tb < 0))
+			break;
+		avail -= cbs_in_tb;
+
+		ret = enqueue_enc_one_op_tb(q, ops[i], enqueued_cbs, cbs_in_tb);
+		if (ret < 0)
+			break;
+		enqueued_cbs += ret;
+	}
+	if (unlikely(enqueued_cbs == 0))
+		return 0; /* Nothing to enqueue */
+
+	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
+}
+
+/* Enqueue encode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	if (unlikely(num == 0))
+		return 0;
+	if (ops[0]->turbo_enc.code_block_mode == 0)
+		return acc100_enqueue_enc_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_enc_cb(q_data, ops, num);
+}
+
 /* Enqueue encode operations for ACC100 device. */
 static uint16_t
 acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
@@ -1983,7 +2657,51 @@
 {
 	if (unlikely(num == 0))
 		return 0;
-	return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
+	if (ops[0]->ldpc_enc.code_block_mode == 0)
+		return acc100_enqueue_enc_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
+}
+
+/* Enqueue decode operations for ACC100 device in CB mode */
+static uint16_t
+acc100_enqueue_dec_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i;
+	union acc100_dma_desc *desc;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		/* Check if there is enough space for further processing */
+		if (unlikely(avail - 1 < 0))
+			break;
+		avail -= 1;
+
+		ret = enqueue_dec_one_op_cb(q, ops[i], i);
+		if (ret < 0)
+			break;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
+			& q->sw_ring_wrap_mask);
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, i, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
 }
 
 /* Check we can mux decode operations with common FCW */
@@ -2081,6 +2799,53 @@
 	return i;
 }
 
+/* Enqueue decode operations for ACC100 device in TB mode */
+static uint16_t
+acc100_enqueue_dec_tb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i, enqueued_cbs = 0;
+	uint8_t cbs_in_tb;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		cbs_in_tb = get_num_cbs_in_tb_dec(&ops[i]->turbo_dec);
+		/* Check if there is enough space for further processing */
+		if (unlikely(avail - cbs_in_tb < 0))
+			break;
+		avail -= cbs_in_tb;
+
+		ret = enqueue_dec_one_op_tb(q, ops[i], enqueued_cbs, cbs_in_tb);
+		if (ret < 0)
+			break;
+		enqueued_cbs += ret;
+	}
+
+	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
+}
+
+/* Enqueue decode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	if (unlikely(num == 0))
+		return 0;
+	if (ops[0]->turbo_dec.code_block_mode == 0)
+		return acc100_enqueue_dec_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_dec_cb(q_data, ops, num);
+}
+
 /* Enqueue decode operations for ACC100 device. */
 static uint16_t
 acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
@@ -2404,6 +3169,52 @@
 	return cb_idx;
 }
 
+/* Dequeue encode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint16_t dequeue_num;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t i, dequeued_cbs = 0;
+	struct rte_bbdev_enc_op *op;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL)) {
+		rte_bbdev_log_debug("Unexpected NULL pointer");
+		return 0;
+	}
+#endif
+
+	dequeue_num = (avail < num) ? avail : num;
+
+	for (i = 0; i < dequeue_num; ++i) {
+		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask))->req.op_addr;
+		if (op->turbo_enc.code_block_mode == 0)
+			ret = dequeue_enc_one_op_tb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+		else
+			ret = dequeue_enc_one_op_cb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_cbs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += i;
+
+	return i;
+}
+
 /* Dequeue LDPC encode operations from ACC100 device. */
 static uint16_t
 acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
@@ -2442,6 +3253,52 @@
 	return dequeued_cbs;
 }
 
+/* Dequeue decode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint16_t dequeue_num;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t i;
+	uint16_t dequeued_cbs = 0;
+	struct rte_bbdev_dec_op *op;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = (avail < num) ? avail : num;
+
+	for (i = 0; i < dequeue_num; ++i) {
+		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask))->req.op_addr;
+		if (op->turbo_dec.code_block_mode == 0)
+			ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+		else
+			ret = dequeue_dec_one_op_cb(q_data, q, &ops[i],
+					dequeued_cbs, &aq_dequeued);
+
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_cbs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += i;
+
+	return i;
+}
+
 /* Dequeue decode operations from ACC100 device. */
 static uint16_t
 acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
@@ -2495,6 +3352,10 @@
 	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
 
 	dev->dev_ops = &acc100_bbdev_ops;
+	dev->enqueue_enc_ops = acc100_enqueue_enc;
+	dev->enqueue_dec_ops = acc100_enqueue_dec;
+	dev->dequeue_enc_ops = acc100_dequeue_enc;
+	dev->dequeue_dec_ops = acc100_dequeue_dec;
 	dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
 	dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
 	dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v11 08/10] baseband/acc100: add interrupt support to PMD
  2020-10-02  1:01   ` [dpdk-dev] [PATCH v11 00/10] bbdev PMD ACC100 Nicolas Chautru
                       ` (6 preceding siblings ...)
  2020-10-02  1:01     ` [dpdk-dev] [PATCH v11 07/10] baseband/acc100: add support for 4G processing Nicolas Chautru
@ 2020-10-02  1:01     ` Nicolas Chautru
  2020-10-02  1:01     ` [dpdk-dev] [PATCH v11 09/10] baseband/acc100: add debug function to validate input Nicolas Chautru
  2020-10-02  1:01     ` [dpdk-dev] [PATCH v11 10/10] baseband/acc100: add configure function Nicolas Chautru
  9 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-10-02  1:01 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, trix, maxime.coquelin, ferruh.yigit,
	tianjiao.liu, Nicolas Chautru

Add capability flags and functions to support MSI
interrupts, event callbacks and the Info Ring.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 307 ++++++++++++++++++++++++++++++-
 drivers/baseband/acc100/rte_acc100_pmd.h |  16 ++
 2 files changed, 320 insertions(+), 3 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 8e721be..7fcaaf2 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -342,6 +342,213 @@
 	free_base_addresses(base_addrs, i);
 }
 
+/*
+ * Find queue_id of a device queue based on details from the Info Ring.
+ * If a queue isn't found UINT16_MAX is returned.
+ */
+static inline uint16_t
+get_queue_id_from_ring_info(struct rte_bbdev_data *data,
+		const union acc100_info_ring_data ring_data)
+{
+	uint16_t queue_id;
+
+	for (queue_id = 0; queue_id < data->num_queues; ++queue_id) {
+		struct acc100_queue *acc100_q =
+				data->queues[queue_id].queue_private;
+		if (acc100_q != NULL && acc100_q->aq_id == ring_data.aq_id &&
+				acc100_q->qgrp_id == ring_data.qg_id &&
+				acc100_q->vf_id == ring_data.vf_id)
+			return queue_id;
+	}
+
+	return UINT16_MAX;
+}
+
+/* Checks the Info Ring for pending interrupt causes and logs unexpected ones */
+static inline void
+acc100_check_ir(struct acc100_device *acc100_dev)
+{
+	volatile union acc100_info_ring_data *ring_data;
+	uint16_t info_ring_head = acc100_dev->info_ring_head;
+
+	if (acc100_dev->info_ring == NULL)
+		return;
+
+	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
+			ACC100_INFO_RING_MASK);
+
+	while (ring_data->valid) {
+		if ((ring_data->int_nb < ACC100_PF_INT_DMA_DL_DESC_IRQ) || (
+				ring_data->int_nb >
+				ACC100_PF_INT_DMA_DL5G_DESC_IRQ))
+			rte_bbdev_log(WARNING, "InfoRing: ITR:%d Info:0x%x",
+				ring_data->int_nb, ring_data->detailed_info);
+		/* Initialize Info Ring entry and move forward */
+		ring_data->val = 0;
+		info_ring_head++;
+		ring_data = acc100_dev->info_ring +
+				(info_ring_head & ACC100_INFO_RING_MASK);
+	}
+}
+
+/* Checks PF Info Ring to find the interrupt cause and handles it accordingly */
+static inline void
+acc100_pf_interrupt_handler(struct rte_bbdev *dev)
+{
+	struct acc100_device *acc100_dev = dev->data->dev_private;
+	volatile union acc100_info_ring_data *ring_data;
+	struct acc100_deq_intr_details deq_intr_det;
+
+	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
+			ACC100_INFO_RING_MASK);
+
+	while (ring_data->valid) {
+
+		rte_bbdev_log_debug(
+				"ACC100 PF Interrupt received, Info Ring data: 0x%x",
+				ring_data->val);
+
+		switch (ring_data->int_nb) {
+		case ACC100_PF_INT_DMA_DL_DESC_IRQ:
+		case ACC100_PF_INT_DMA_UL_DESC_IRQ:
+		case ACC100_PF_INT_DMA_UL5G_DESC_IRQ:
+		case ACC100_PF_INT_DMA_DL5G_DESC_IRQ:
+			deq_intr_det.queue_id = get_queue_id_from_ring_info(
+					dev->data, *ring_data);
+			if (deq_intr_det.queue_id == UINT16_MAX) {
+				rte_bbdev_log(ERR,
+						"Couldn't find queue: aq_id: %u, qg_id: %u, vf_id: %u",
+						ring_data->aq_id,
+						ring_data->qg_id,
+						ring_data->vf_id);
+				return;
+			}
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_DEQUEUE, &deq_intr_det);
+			break;
+		default:
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_ERROR, NULL);
+			break;
+		}
+
+		/* Initialize Info Ring entry and move forward */
+		ring_data->val = 0;
+		++acc100_dev->info_ring_head;
+		ring_data = acc100_dev->info_ring +
+				(acc100_dev->info_ring_head &
+				ACC100_INFO_RING_MASK);
+	}
+}
+
+/* Checks VF Info Ring to find the interrupt cause and handles it accordingly */
+static inline void
+acc100_vf_interrupt_handler(struct rte_bbdev *dev)
+{
+	struct acc100_device *acc100_dev = dev->data->dev_private;
+	volatile union acc100_info_ring_data *ring_data;
+	struct acc100_deq_intr_details deq_intr_det;
+
+	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
+			ACC100_INFO_RING_MASK);
+
+	while (ring_data->valid) {
+
+		rte_bbdev_log_debug(
+				"ACC100 VF Interrupt received, Info Ring data: 0x%x",
+				ring_data->val);
+
+		switch (ring_data->int_nb) {
+		case ACC100_VF_INT_DMA_DL_DESC_IRQ:
+		case ACC100_VF_INT_DMA_UL_DESC_IRQ:
+		case ACC100_VF_INT_DMA_UL5G_DESC_IRQ:
+		case ACC100_VF_INT_DMA_DL5G_DESC_IRQ:
+			/* VFs are not aware of their vf_id - it's set to 0 in
+			 * queue structures.
+			 */
+			ring_data->vf_id = 0;
+			deq_intr_det.queue_id = get_queue_id_from_ring_info(
+					dev->data, *ring_data);
+			if (deq_intr_det.queue_id == UINT16_MAX) {
+				rte_bbdev_log(ERR,
+						"Couldn't find queue: aq_id: %u, qg_id: %u",
+						ring_data->aq_id,
+						ring_data->qg_id);
+				return;
+			}
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_DEQUEUE, &deq_intr_det);
+			break;
+		default:
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_ERROR, NULL);
+			break;
+		}
+
+		/* Initialize Info Ring entry and move forward */
+		ring_data->valid = 0;
+		++acc100_dev->info_ring_head;
+		ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head
+				& ACC100_INFO_RING_MASK);
+	}
+}
+
+/* Interrupt handler triggered by ACC100 dev for handling specific interrupt */
+static void
+acc100_dev_interrupt_handler(void *cb_arg)
+{
+	struct rte_bbdev *dev = cb_arg;
+	struct acc100_device *acc100_dev = dev->data->dev_private;
+
+	/* Read info ring */
+	if (acc100_dev->pf_device)
+		acc100_pf_interrupt_handler(dev);
+	else
+		acc100_vf_interrupt_handler(dev);
+}
+
+/* Allocate and set up the Info Ring */
+static int
+allocate_info_ring(struct rte_bbdev *dev)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	const struct acc100_registry_addr *reg_addr;
+	rte_iova_t info_ring_iova;
+	uint32_t phys_low, phys_high;
+
+	if (d->info_ring != NULL)
+		return 0; /* Already configured */
+
+	/* Choose correct registry addresses for the device type */
+	if (d->pf_device)
+		reg_addr = &pf_reg_addr;
+	else
+		reg_addr = &vf_reg_addr;
+	/* Allocate InfoRing */
+	d->info_ring = rte_zmalloc_socket("Info Ring",
+			ACC100_INFO_RING_NUM_ENTRIES *
+			sizeof(*d->info_ring), RTE_CACHE_LINE_SIZE,
+			dev->data->socket_id);
+	if (d->info_ring == NULL) {
+		rte_bbdev_log(ERR,
+				"Failed to allocate Info Ring for %s:%u",
+				dev->device->driver->name,
+				dev->data->dev_id);
+		return -ENOMEM;
+	}
+	info_ring_iova = rte_malloc_virt2iova(d->info_ring);
+
+	/* Setup Info Ring */
+	phys_high = (uint32_t)(info_ring_iova >> 32);
+	phys_low  = (uint32_t)(info_ring_iova);
+	acc100_reg_write(d, reg_addr->info_ring_hi, phys_high);
+	acc100_reg_write(d, reg_addr->info_ring_lo, phys_low);
+	acc100_reg_write(d, reg_addr->info_ring_en, ACC100_REG_IRQ_EN_ALL);
+	d->info_ring_head = (acc100_reg_read(d, reg_addr->info_ring_ptr) &
+			0xFFF) / sizeof(union acc100_info_ring_data);
+	return 0;
+}
+
+
 /* Allocate 64MB memory used for all software rings */
 static int
 acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
@@ -349,6 +556,7 @@
 	uint32_t phys_low, phys_high, payload;
 	struct acc100_device *d = dev->data->dev_private;
 	const struct acc100_registry_addr *reg_addr;
+	int ret;
 
 	if (d->pf_device && !d->acc100_conf.pf_mode_en) {
 		rte_bbdev_log(NOTICE,
@@ -432,6 +640,14 @@
 	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
 	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
 
+	ret = allocate_info_ring(dev);
+	if (ret < 0) {
+		rte_bbdev_log(ERR, "Failed to allocate info_ring for %s:%u",
+				dev->device->driver->name,
+				dev->data->dev_id);
+		/* Continue */
+	}
+
 	d->harq_layout = rte_zmalloc_socket("HARQ Layout",
 			ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
 			RTE_CACHE_LINE_SIZE, dev->data->socket_id);
@@ -453,13 +669,59 @@
 	return 0;
 }
 
+static int
+acc100_intr_enable(struct rte_bbdev *dev)
+{
+	int ret;
+	struct acc100_device *d = dev->data->dev_private;
+
+	/* Only VFIO MSI and UIO interrupts are currently supported */
+	if (dev->intr_handle->type == RTE_INTR_HANDLE_VFIO_MSI ||
+			dev->intr_handle->type == RTE_INTR_HANDLE_UIO) {
+
+		ret = allocate_info_ring(dev);
+		if (ret < 0) {
+			rte_bbdev_log(ERR,
+					"Couldn't allocate info ring for device: %s",
+					dev->data->name);
+			return ret;
+		}
+
+		ret = rte_intr_enable(dev->intr_handle);
+		if (ret < 0) {
+			rte_bbdev_log(ERR,
+					"Couldn't enable interrupts for device: %s",
+					dev->data->name);
+			rte_free(d->info_ring);
+			return ret;
+		}
+		ret = rte_intr_callback_register(dev->intr_handle,
+				acc100_dev_interrupt_handler, dev);
+		if (ret < 0) {
+			rte_bbdev_log(ERR,
+					"Couldn't register interrupt callback for device: %s",
+					dev->data->name);
+			rte_free(d->info_ring);
+			return ret;
+		}
+
+		return 0;
+	}
+
+	rte_bbdev_log(ERR, "ACC100 (%s) supports only VFIO MSI and UIO interrupts",
+			dev->data->name);
+	return -ENOTSUP;
+}
+
 /* Free memory used for software rings */
 static int
 acc100_dev_close(struct rte_bbdev *dev)
 {
 	struct acc100_device *d = dev->data->dev_private;
+	acc100_check_ir(d);
 	if (d->sw_rings_base != NULL) {
 		rte_free(d->tail_ptrs);
+		rte_free(d->info_ring);
 		rte_free(d->sw_rings_base);
 		d->sw_rings_base = NULL;
 	}
@@ -670,6 +932,7 @@
 					RTE_BBDEV_TURBO_CRC_TYPE_24B |
 					RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
 					RTE_BBDEV_TURBO_EARLY_TERMINATION |
+					RTE_BBDEV_TURBO_DEC_INTERRUPTS |
 					RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
 					RTE_BBDEV_TURBO_MAP_DEC |
 					RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP |
@@ -690,6 +953,7 @@
 					RTE_BBDEV_TURBO_CRC_24B_ATTACH |
 					RTE_BBDEV_TURBO_RV_INDEX_BYPASS |
 					RTE_BBDEV_TURBO_RATE_MATCH |
+					RTE_BBDEV_TURBO_ENC_INTERRUPTS |
 					RTE_BBDEV_TURBO_ENC_SCATTER_GATHER,
 				.num_buffers_src =
 						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
@@ -703,7 +967,8 @@
 				.capability_flags =
 					RTE_BBDEV_LDPC_RATE_MATCH |
 					RTE_BBDEV_LDPC_CRC_24B_ATTACH |
-					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
+					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS |
+					RTE_BBDEV_LDPC_ENC_INTERRUPTS,
 				.num_buffers_src =
 						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
 				.num_buffers_dst =
@@ -728,7 +993,8 @@
 				RTE_BBDEV_LDPC_DECODE_BYPASS |
 				RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
 				RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
-				RTE_BBDEV_LDPC_LLR_COMPRESSION,
+				RTE_BBDEV_LDPC_LLR_COMPRESSION |
+				RTE_BBDEV_LDPC_DEC_INTERRUPTS,
 			.llr_size = 8,
 			.llr_decimals = 1,
 			.num_buffers_src =
@@ -778,14 +1044,44 @@
 #else
 	dev_info->harq_buffer_size = 0;
 #endif
+	acc100_check_ir(d);
+}
+
+static int
+acc100_queue_intr_enable(struct rte_bbdev *dev, uint16_t queue_id)
+{
+	struct acc100_queue *q = dev->data->queues[queue_id].queue_private;
+
+	if (dev->intr_handle->type != RTE_INTR_HANDLE_VFIO_MSI &&
+			dev->intr_handle->type != RTE_INTR_HANDLE_UIO)
+		return -ENOTSUP;
+
+	q->irq_enable = 1;
+	return 0;
+}
+
+static int
+acc100_queue_intr_disable(struct rte_bbdev *dev, uint16_t queue_id)
+{
+	struct acc100_queue *q = dev->data->queues[queue_id].queue_private;
+
+	if (dev->intr_handle->type != RTE_INTR_HANDLE_VFIO_MSI &&
+			dev->intr_handle->type != RTE_INTR_HANDLE_UIO)
+		return -ENOTSUP;
+
+	q->irq_enable = 0;
+	return 0;
 }
 
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.setup_queues = acc100_setup_queues,
+	.intr_enable = acc100_intr_enable,
 	.close = acc100_dev_close,
 	.info_get = acc100_dev_info_get,
 	.queue_setup = acc100_queue_setup,
 	.queue_release = acc100_queue_release,
+	.queue_intr_enable = acc100_queue_intr_enable,
+	.queue_intr_disable = acc100_queue_intr_disable
 };
 
 /* ACC100 PCI PF address map */
@@ -3018,8 +3314,10 @@
 			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
 	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
 	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
-	if (op->status != 0)
+	if (op->status != 0) {
 		q_data->queue_stats.dequeue_err_count++;
+		acc100_check_ir(q->d);
+	}
 
 	/* CRC invalid if error exists */
 	if (!op->status)
@@ -3076,6 +3374,9 @@
 		op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
 	op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
 
+	if (op->status & (1 << RTE_BBDEV_DRV_ERROR))
+		acc100_check_ir(q->d);
+
 	/* Check if this is the last desc in batch (Atomic Queue) */
 	if (desc->req.last_desc_in_batch) {
 		(*aq_dequeued)++;
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 38818f4..1fbd96e 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -565,8 +565,16 @@ struct acc100_device {
 	rte_iova_t sw_rings_iova;  /* IOVA address of sw_rings */
 	/* Virtual address of the info memory routed to this function under
 	 * operation, whether it is PF or VF.
+	 * HW may DMA information data at this location asynchronously
 	 */
+	union acc100_info_ring_data *info_ring;
+
 	union acc100_harq_layout_data *harq_layout;
+	/* Virtual Info Ring head */
+	uint16_t info_ring_head;
+	/* Number of bytes available for each queue in device, depending on
+	 * how many queues are enabled with configure()
+	 */
 	uint32_t sw_ring_size;
 	uint32_t ddr_size; /* Size in kB */
 	uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */
@@ -582,4 +590,12 @@ struct acc100_device {
 	bool configured; /**< True if this ACC100 device is configured */
 };
 
+/**
+ * Structure with details about RTE_BBDEV_EVENT_DEQUEUE event. It's passed to
+ * the callback function.
+ */
+struct acc100_deq_intr_details {
+	uint16_t queue_id;
+};
+
 #endif /* _RTE_ACC100_PMD_H_ */
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v11 09/10] baseband/acc100: add debug function to validate input
  2020-10-02  1:01   ` [dpdk-dev] [PATCH v11 00/10] bbdev PMD ACC100 Nicolas Chautru
                       ` (7 preceding siblings ...)
  2020-10-02  1:01     ` [dpdk-dev] [PATCH v11 08/10] baseband/acc100: add interrupt support to PMD Nicolas Chautru
@ 2020-10-02  1:01     ` Nicolas Chautru
  2020-10-02  1:01     ` [dpdk-dev] [PATCH v11 10/10] baseband/acc100: add configure function Nicolas Chautru
  9 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-10-02  1:01 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, trix, maxime.coquelin, ferruh.yigit,
	tianjiao.liu, Nicolas Chautru

Add debug functions to validate the input provided by the user
through the API. These checks are only enabled in DEBUG mode at
build time.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
Reviewed-by: Tom Rix <trix@redhat.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 436 +++++++++++++++++++++++++++++++
 1 file changed, 436 insertions(+)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 7fcaaf2..1e3c077 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -1994,6 +1994,243 @@
 
 }
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+/* Validates turbo encoder parameters */
+static inline int
+validate_enc_op(struct rte_bbdev_enc_op *op)
+{
+	struct rte_bbdev_op_turbo_enc *turbo_enc = &op->turbo_enc;
+	struct rte_bbdev_op_enc_turbo_cb_params *cb = NULL;
+	struct rte_bbdev_op_enc_turbo_tb_params *tb = NULL;
+	uint16_t kw, kw_neg, kw_pos;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if (turbo_enc->input.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid input pointer");
+		return -1;
+	}
+	if (turbo_enc->output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid output pointer");
+		return -1;
+	}
+	if (turbo_enc->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				turbo_enc->rv_index);
+		return -1;
+	}
+	if (turbo_enc->code_block_mode != 0 &&
+			turbo_enc->code_block_mode != 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				turbo_enc->code_block_mode);
+		return -1;
+	}
+
+	if (turbo_enc->code_block_mode == 0) {
+		tb = &turbo_enc->tb_params;
+		if ((tb->k_neg < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_neg > RTE_BBDEV_TURBO_MAX_CB_SIZE)
+				&& tb->c_neg > 0) {
+			rte_bbdev_log(ERR,
+					"k_neg (%u) is out of range %u <= value <= %u",
+					tb->k_neg, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (tb->k_pos < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_pos > RTE_BBDEV_TURBO_MAX_CB_SIZE) {
+			rte_bbdev_log(ERR,
+					"k_pos (%u) is out of range %u <= value <= %u",
+					tb->k_pos, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (tb->c_neg > (RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1))
+			rte_bbdev_log(ERR,
+					"c_neg (%u) is out of range 0 <= value <= %u",
+					tb->c_neg,
+					RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1);
+		if (tb->c < 1 || tb->c > RTE_BBDEV_TURBO_MAX_CODE_BLOCKS) {
+			rte_bbdev_log(ERR,
+					"c (%u) is out of range 1 <= value <= %u",
+					tb->c, RTE_BBDEV_TURBO_MAX_CODE_BLOCKS);
+			return -1;
+		}
+		if (tb->cab > tb->c) {
+			rte_bbdev_log(ERR,
+					"cab (%u) is greater than c (%u)",
+					tb->cab, tb->c);
+			return -1;
+		}
+		if ((tb->ea < RTE_BBDEV_TURBO_MIN_CB_SIZE || (tb->ea % 2))
+				&& tb->r < tb->cab) {
+			rte_bbdev_log(ERR,
+					"ea (%u) is less than %u or it is not even",
+					tb->ea, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+		if ((tb->eb < RTE_BBDEV_TURBO_MIN_CB_SIZE || (tb->eb % 2))
+				&& tb->c > tb->cab) {
+			rte_bbdev_log(ERR,
+					"eb (%u) is less than %u or it is not even",
+					tb->eb, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+
+		kw_neg = 3 * RTE_ALIGN_CEIL(tb->k_neg + 4,
+					RTE_BBDEV_TURBO_C_SUBBLOCK);
+		if (tb->ncb_neg < tb->k_neg || tb->ncb_neg > kw_neg) {
+			rte_bbdev_log(ERR,
+					"ncb_neg (%u) is out of range (%u) k_neg <= value <= (%u) kw_neg",
+					tb->ncb_neg, tb->k_neg, kw_neg);
+			return -1;
+		}
+
+		kw_pos = 3 * RTE_ALIGN_CEIL(tb->k_pos + 4,
+					RTE_BBDEV_TURBO_C_SUBBLOCK);
+		if (tb->ncb_pos < tb->k_pos || tb->ncb_pos > kw_pos) {
+			rte_bbdev_log(ERR,
+					"ncb_pos (%u) is out of range (%u) k_pos <= value <= (%u) kw_pos",
+					tb->ncb_pos, tb->k_pos, kw_pos);
+			return -1;
+		}
+		if (tb->r > (tb->c - 1)) {
+			rte_bbdev_log(ERR,
+					"r (%u) is greater than c - 1 (%u)",
+					tb->r, tb->c - 1);
+			return -1;
+		}
+	} else {
+		cb = &turbo_enc->cb_params;
+		if (cb->k < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| cb->k > RTE_BBDEV_TURBO_MAX_CB_SIZE) {
+			rte_bbdev_log(ERR,
+					"k (%u) is out of range %u <= value <= %u",
+					cb->k, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+
+		if (cb->e < RTE_BBDEV_TURBO_MIN_CB_SIZE || (cb->e % 2)) {
+			rte_bbdev_log(ERR,
+					"e (%u) is less than %u or it is not even",
+					cb->e, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+
+		kw = RTE_ALIGN_CEIL(cb->k + 4, RTE_BBDEV_TURBO_C_SUBBLOCK) * 3;
+		if (cb->ncb < cb->k || cb->ncb > kw) {
+			rte_bbdev_log(ERR,
+					"ncb (%u) is out of range (%u) k <= value <= (%u) kw",
+					cb->ncb, cb->k, kw);
+			return -1;
+		}
+	}
+
+	return 0;
+}
+/* Validates LDPC encoder parameters */
+static inline int
+validate_ldpc_enc_op(struct rte_bbdev_enc_op *op)
+{
+	struct rte_bbdev_op_ldpc_enc *ldpc_enc = &op->ldpc_enc;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if (ldpc_enc->input.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid input pointer");
+		return -1;
+	}
+	if (ldpc_enc->output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid output pointer");
+		return -1;
+	}
+	if (ldpc_enc->input.length >
+			RTE_BBDEV_LDPC_MAX_CB_SIZE >> 3) {
+		rte_bbdev_log(ERR, "CB size (%u) is too big, max: %d",
+				ldpc_enc->input.length,
+				RTE_BBDEV_LDPC_MAX_CB_SIZE);
+		return -1;
+	}
+	if ((ldpc_enc->basegraph > 2) || (ldpc_enc->basegraph == 0)) {
+		rte_bbdev_log(ERR,
+				"BG (%u) is out of range 1 <= value <= 2",
+				ldpc_enc->basegraph);
+		return -1;
+	}
+	if (ldpc_enc->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				ldpc_enc->rv_index);
+		return -1;
+	}
+	if (ldpc_enc->code_block_mode > 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				ldpc_enc->code_block_mode);
+		return -1;
+	}
+	int K = (ldpc_enc->basegraph == 1 ? 22 : 10) * ldpc_enc->z_c;
+	if (ldpc_enc->n_filler >= K) {
+		rte_bbdev_log(ERR,
+				"K and F are not compatible %u %u",
+				K, ldpc_enc->n_filler);
+		return -1;
+	}
+	return 0;
+}
+
+/* Validates LDPC decoder parameters */
+static inline int
+validate_ldpc_dec_op(struct rte_bbdev_dec_op *op)
+{
+	struct rte_bbdev_op_ldpc_dec *ldpc_dec = &op->ldpc_dec;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if ((ldpc_dec->basegraph > 2) || (ldpc_dec->basegraph == 0)) {
+		rte_bbdev_log(ERR,
+				"BG (%u) is out of range 1 <= value <= 2",
+				ldpc_dec->basegraph);
+		return -1;
+	}
+	if (ldpc_dec->iter_max == 0) {
+		rte_bbdev_log(ERR,
+				"iter_max (%u) is equal to 0",
+				ldpc_dec->iter_max);
+		return -1;
+	}
+	if (ldpc_dec->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				ldpc_dec->rv_index);
+		return -1;
+	}
+	if (ldpc_dec->code_block_mode > 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				ldpc_dec->code_block_mode);
+		return -1;
+	}
+	int K = (ldpc_dec->basegraph == 1 ? 22 : 10) * ldpc_dec->z_c;
+	if (ldpc_dec->n_filler >= K) {
+		rte_bbdev_log(ERR,
+				"K and F are not compatible %u %u",
+				K, ldpc_dec->n_filler);
+		return -1;
+	}
+	return 0;
+}
+#endif
+
 /* Enqueue one encode operations for ACC100 device in CB mode */
 static inline int
 enqueue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
@@ -2005,6 +2242,14 @@
 		seg_total_left;
 	struct rte_mbuf *input, *output_head, *output;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_enc_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2051,6 +2296,14 @@
 	uint16_t  in_length_in_bytes;
 	struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_enc_op(ops[0]) == -1) {
+		rte_bbdev_log(ERR, "LDPC encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2105,6 +2358,14 @@
 		seg_total_left;
 	struct rte_mbuf *input, *output_head, *output;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_enc_op(op) == -1) {
+		rte_bbdev_log(ERR, "LDPC encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2154,6 +2415,14 @@
 	struct rte_mbuf *input, *output_head, *output;
 	uint16_t current_enqueued_cbs = 0;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_enc_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2221,6 +2490,142 @@
 	return current_enqueued_cbs;
 }
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+/* Validates turbo decoder parameters */
+static inline int
+validate_dec_op(struct rte_bbdev_dec_op *op)
+{
+	struct rte_bbdev_op_turbo_dec *turbo_dec = &op->turbo_dec;
+	struct rte_bbdev_op_dec_turbo_cb_params *cb = NULL;
+	struct rte_bbdev_op_dec_turbo_tb_params *tb = NULL;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if (turbo_dec->input.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid input pointer");
+		return -1;
+	}
+	if (turbo_dec->hard_output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid hard_output pointer");
+		return -1;
+	}
+	if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT) &&
+			turbo_dec->soft_output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid soft_output pointer");
+		return -1;
+	}
+	if (turbo_dec->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				turbo_dec->rv_index);
+		return -1;
+	}
+	if (turbo_dec->iter_min < 1) {
+		rte_bbdev_log(ERR,
+				"iter_min (%u) is less than 1",
+				turbo_dec->iter_min);
+		return -1;
+	}
+	if (turbo_dec->iter_max <= 2) {
+		rte_bbdev_log(ERR,
+				"iter_max (%u) is less than or equal to 2",
+				turbo_dec->iter_max);
+		return -1;
+	}
+	if (turbo_dec->iter_min > turbo_dec->iter_max) {
+		rte_bbdev_log(ERR,
+				"iter_min (%u) is greater than iter_max (%u)",
+				turbo_dec->iter_min, turbo_dec->iter_max);
+		return -1;
+	}
+	if (turbo_dec->code_block_mode != 0 &&
+			turbo_dec->code_block_mode != 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				turbo_dec->code_block_mode);
+		return -1;
+	}
+
+	if (turbo_dec->code_block_mode == 0) {
+		tb = &turbo_dec->tb_params;
+		if ((tb->k_neg < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_neg > RTE_BBDEV_TURBO_MAX_CB_SIZE)
+				&& tb->c_neg > 0) {
+			rte_bbdev_log(ERR,
+					"k_neg (%u) is out of range %u <= value <= %u",
+					tb->k_neg, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if ((tb->k_pos < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_pos > RTE_BBDEV_TURBO_MAX_CB_SIZE)
+				&& tb->c > tb->c_neg) {
+			rte_bbdev_log(ERR,
+					"k_pos (%u) is out of range %u <= value <= %u",
+					tb->k_pos, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (tb->c_neg > (RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1))
+			rte_bbdev_log(ERR,
+					"c_neg (%u) is out of range 0 <= value <= %u",
+					tb->c_neg,
+					RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1);
+		if (tb->c < 1 || tb->c > RTE_BBDEV_TURBO_MAX_CODE_BLOCKS) {
+			rte_bbdev_log(ERR,
+					"c (%u) is out of range 1 <= value <= %u",
+					tb->c, RTE_BBDEV_TURBO_MAX_CODE_BLOCKS);
+			return -1;
+		}
+		if (tb->cab > tb->c) {
+			rte_bbdev_log(ERR,
+					"cab (%u) is greater than c (%u)",
+					tb->cab, tb->c);
+			return -1;
+		}
+		if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) &&
+				(tb->ea < RTE_BBDEV_TURBO_MIN_CB_SIZE
+						|| (tb->ea % 2))
+				&& tb->cab > 0) {
+			rte_bbdev_log(ERR,
+					"ea (%u) is less than %u or it is not even",
+					tb->ea, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+		if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) &&
+				(tb->eb < RTE_BBDEV_TURBO_MIN_CB_SIZE
+						|| (tb->eb % 2))
+				&& tb->c > tb->cab) {
+			rte_bbdev_log(ERR,
+					"eb (%u) is less than %u or it is not even",
+					tb->eb, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+		}
+	} else {
+		cb = &turbo_dec->cb_params;
+		if (cb->k < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| cb->k > RTE_BBDEV_TURBO_MAX_CB_SIZE) {
+			rte_bbdev_log(ERR,
+					"k (%u) is out of range %u <= value <= %u",
+					cb->k, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) &&
+				(cb->e < RTE_BBDEV_TURBO_MIN_CB_SIZE ||
+				(cb->e % 2))) {
+			rte_bbdev_log(ERR,
+					"e (%u) is less than %u or it is not even",
+					cb->e, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+	}
+
+	return 0;
+}
+#endif
+
 /** Enqueue one decode operations for ACC100 device in CB mode */
 static inline int
 enqueue_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
@@ -2233,6 +2638,14 @@
 	struct rte_mbuf *input, *h_output_head, *h_output,
 		*s_output_head, *s_output;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo decoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2450,6 +2863,13 @@
 		return ret;
 	}
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "LDPC decoder validation failed");
+		return -EINVAL;
+	}
+#endif
 	union acc100_dma_desc *desc;
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
@@ -2547,6 +2967,14 @@
 	struct rte_mbuf *input, *h_output_head, *h_output;
 	uint16_t current_enqueued_cbs = 0;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "LDPC decoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2632,6 +3060,14 @@
 		*s_output_head, *s_output;
 	uint16_t current_enqueued_cbs = 0;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo decoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v11 10/10] baseband/acc100: add configure function
  2020-10-02  1:01   ` [dpdk-dev] [PATCH v11 00/10] bbdev PMD ACC100 Nicolas Chautru
                       ` (8 preceding siblings ...)
  2020-10-02  1:01     ` [dpdk-dev] [PATCH v11 09/10] baseband/acc100: add debug function to validate input Nicolas Chautru
@ 2020-10-02  1:01     ` Nicolas Chautru
  9 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-10-02  1:01 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, trix, maxime.coquelin, ferruh.yigit,
	tianjiao.liu, Nicolas Chautru

Add a configure function so that bbdev-test can configure the PF
from within the test application itself, without requiring an
external application to configure the device.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 app/test-bbdev/test_bbdev_perf.c                   |  71 +++
 doc/guides/rel_notes/release_20_11.rst             |   5 +
 drivers/baseband/acc100/meson.build                |   2 +
 drivers/baseband/acc100/rte_acc100_cfg.h           |  17 +
 drivers/baseband/acc100/rte_acc100_pmd.c           | 526 ++++++++++++++++++++-
 drivers/baseband/acc100/rte_acc100_pmd.h           |   1 +
 .../acc100/rte_pmd_bbdev_acc100_version.map        |   7 +
 7 files changed, 624 insertions(+), 5 deletions(-)

diff --git a/app/test-bbdev/test_bbdev_perf.c b/app/test-bbdev/test_bbdev_perf.c
index 45c0d62..6ddf012 100644
--- a/app/test-bbdev/test_bbdev_perf.c
+++ b/app/test-bbdev/test_bbdev_perf.c
@@ -52,6 +52,18 @@
 #define FLR_5G_TIMEOUT 610
 #endif
 
+#ifdef RTE_LIBRTE_PMD_BBDEV_ACC100
+#include <rte_acc100_cfg.h>
+#define ACC100PF_DRIVER_NAME   ("intel_acc100_pf")
+#define ACC100VF_DRIVER_NAME   ("intel_acc100_vf")
+#define ACC100_QMGR_NUM_AQS 16
+#define ACC100_QMGR_NUM_QGS 2
+#define ACC100_QMGR_AQ_DEPTH 5
+#define ACC100_QMGR_INVALID_IDX -1
+#define ACC100_QMGR_RR 1
+#define ACC100_QOS_GBR 0
+#endif
+
 #define OPS_CACHE_SIZE 256U
 #define OPS_POOL_SIZE_MIN 511U /* 0.5K per queue */
 
@@ -653,6 +665,65 @@ typedef int (test_case_function)(struct active_device *ad,
 				info->dev_name);
 	}
 #endif
+#ifdef RTE_LIBRTE_PMD_BBDEV_ACC100
+	if ((get_init_device() == true) &&
+		(!strcmp(info->drv.driver_name, ACC100PF_DRIVER_NAME))) {
+		struct rte_acc100_conf conf;
+		unsigned int i;
+
+		printf("Configure ACC100 FEC Driver %s with default values\n",
+				info->drv.driver_name);
+
+		/* clear default configuration before initialization */
+		memset(&conf, 0, sizeof(struct rte_acc100_conf));
+
+		/* Always set in PF mode for built-in configuration */
+		conf.pf_mode_en = true;
+		for (i = 0; i < RTE_ACC100_NUM_VFS; ++i) {
+			conf.arb_dl_4g[i].gbr_threshold1 = ACC100_QOS_GBR;
+			conf.arb_dl_4g[i].gbr_threshold2 = ACC100_QOS_GBR;
+			conf.arb_dl_4g[i].round_robin_weight = ACC100_QMGR_RR;
+			conf.arb_ul_4g[i].gbr_threshold1 = ACC100_QOS_GBR;
+			conf.arb_ul_4g[i].gbr_threshold2 = ACC100_QOS_GBR;
+			conf.arb_ul_4g[i].round_robin_weight = ACC100_QMGR_RR;
+			conf.arb_dl_5g[i].gbr_threshold1 = ACC100_QOS_GBR;
+			conf.arb_dl_5g[i].gbr_threshold2 = ACC100_QOS_GBR;
+			conf.arb_dl_5g[i].round_robin_weight = ACC100_QMGR_RR;
+			conf.arb_ul_5g[i].gbr_threshold1 = ACC100_QOS_GBR;
+			conf.arb_ul_5g[i].gbr_threshold2 = ACC100_QOS_GBR;
+			conf.arb_ul_5g[i].round_robin_weight = ACC100_QMGR_RR;
+		}
+
+		conf.input_pos_llr_1_bit = true;
+		conf.output_pos_llr_1_bit = true;
+		conf.num_vf_bundles = 1; /* Number of VF bundles to set up */
+
+		conf.q_ul_4g.num_qgroups = ACC100_QMGR_NUM_QGS;
+		conf.q_ul_4g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+		conf.q_ul_4g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+		conf.q_ul_4g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+		conf.q_dl_4g.num_qgroups = ACC100_QMGR_NUM_QGS;
+		conf.q_dl_4g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+		conf.q_dl_4g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+		conf.q_dl_4g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+		conf.q_ul_5g.num_qgroups = ACC100_QMGR_NUM_QGS;
+		conf.q_ul_5g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+		conf.q_ul_5g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+		conf.q_ul_5g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+		conf.q_dl_5g.num_qgroups = ACC100_QMGR_NUM_QGS;
+		conf.q_dl_5g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+		conf.q_dl_5g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+		conf.q_dl_5g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+
+		/* setup PF with configuration information */
+		ret = rte_acc100_configure(info->dev_name, &conf);
+		TEST_ASSERT_SUCCESS(ret,
+				"Failed to configure ACC100 PF for bbdev %s",
+				info->dev_name);
+	}
+#endif
+	/* Refresh device info now that the device is configured */
+	rte_bbdev_info_get(dev_id, info);
 	nb_queues = RTE_MIN(rte_lcore_count(), info->drv.max_num_queues);
 	nb_queues = RTE_MIN(nb_queues, (unsigned int) MAX_QUEUES);
 
diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
index 73ac08f..c8d0586 100644
--- a/doc/guides/rel_notes/release_20_11.rst
+++ b/doc/guides/rel_notes/release_20_11.rst
@@ -55,6 +55,11 @@ New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+* **Added Intel ACC100 bbdev PMD.**
+
+  Added a new ``acc100`` bbdev driver for the Intel\ |reg| ACC100 accelerator,
+  also known as Mount Bryce. See the
+  :doc:`../bbdevs/acc100` BBDEV guide for more details on this new driver.
 
 Removed Items
 -------------
diff --git a/drivers/baseband/acc100/meson.build b/drivers/baseband/acc100/meson.build
index 8afafc2..7ac44dc 100644
--- a/drivers/baseband/acc100/meson.build
+++ b/drivers/baseband/acc100/meson.build
@@ -4,3 +4,5 @@
 deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci']
 
 sources = files('rte_acc100_pmd.c')
+
+install_headers('rte_acc100_cfg.h')
diff --git a/drivers/baseband/acc100/rte_acc100_cfg.h b/drivers/baseband/acc100/rte_acc100_cfg.h
index a1d43ef..d233e42 100644
--- a/drivers/baseband/acc100/rte_acc100_cfg.h
+++ b/drivers/baseband/acc100/rte_acc100_cfg.h
@@ -89,6 +89,23 @@ struct rte_acc100_conf {
 	struct rte_acc100_arbitration arb_dl_5g[RTE_ACC100_NUM_VFS];
 };
 
+/**
+ * Configure an ACC100 device
+ *
+ * @param dev_name
+ *   The name of the device. This is the short form of PCI BDF, e.g. 00:01.0.
+ *   It can also be retrieved for a bbdev device from the dev_name field in the
+ *   rte_bbdev_info structure returned by rte_bbdev_info_get().
+ * @param conf
+ *   Configuration to apply to ACC100 HW.
+ *
+ * @return
+ *   Zero on success, negative value on failure.
+ */
+__rte_experimental
+int
+rte_acc100_configure(const char *dev_name, struct rte_acc100_conf *conf);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 1e3c077..dc349b0 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -38,10 +38,10 @@
 
 /* Write a register of an ACC100 device */
 static inline void
-acc100_reg_write(struct acc100_device *d, uint32_t offset, uint32_t payload)
+acc100_reg_write(struct acc100_device *d, uint32_t offset, uint32_t value)
 {
 	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
-	mmio_write(reg_addr, payload);
+	mmio_write(reg_addr, value);
 	usleep(ACC100_LONG_WAIT);
 }
 
@@ -85,6 +85,26 @@
 
 enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC};
 
+/* Return the accelerator enum for a Queue Group Index */
+static inline int
+accFromQgid(int qg_idx, const struct rte_acc100_conf *acc100_conf)
+{
+	int accQg[ACC100_NUM_QGRPS];
+	int NumQGroupsPerFn[NUM_ACC];
+	int acc, qgIdx, qgIndex = 0;
+	for (qgIdx = 0; qgIdx < ACC100_NUM_QGRPS; qgIdx++)
+		accQg[qgIdx] = 0;
+	NumQGroupsPerFn[UL_4G] = acc100_conf->q_ul_4g.num_qgroups;
+	NumQGroupsPerFn[UL_5G] = acc100_conf->q_ul_5g.num_qgroups;
+	NumQGroupsPerFn[DL_4G] = acc100_conf->q_dl_4g.num_qgroups;
+	NumQGroupsPerFn[DL_5G] = acc100_conf->q_dl_5g.num_qgroups;
+	for (acc = UL_4G;  acc < NUM_ACC; acc++)
+		for (qgIdx = 0; qgIdx < NumQGroupsPerFn[acc]; qgIdx++)
+			accQg[qgIndex++] = acc;
+	acc = accQg[qg_idx];
+	return acc;
+}
+
 /* Return the queue topology for a Queue Group Index */
 static inline void
 qtopFromAcc(struct rte_acc100_queue_topology **qtop, int acc_enum,
@@ -113,6 +133,30 @@
 	*qtop = p_qtop;
 }
 
+/* Return the AQ depth for a Queue Group Index */
+static inline int
+aqDepth(int qg_idx, struct rte_acc100_conf *acc100_conf)
+{
+	struct rte_acc100_queue_topology *q_top = NULL;
+	int acc_enum = accFromQgid(qg_idx, acc100_conf);
+	qtopFromAcc(&q_top, acc_enum, acc100_conf);
+	if (unlikely(q_top == NULL))
+		return 0;
+	return q_top->aq_depth_log2;
+}
+
+/* Return the number of AQs for a Queue Group Index */
+static inline int
+aqNum(int qg_idx, struct rte_acc100_conf *acc100_conf)
+{
+	struct rte_acc100_queue_topology *q_top = NULL;
+	int acc_enum = accFromQgid(qg_idx, acc100_conf);
+	qtopFromAcc(&q_top, acc_enum, acc100_conf);
+	if (unlikely(q_top == NULL))
+		return 0;
+	return q_top->num_aqs_per_groups;
+}
+
 static void
 initQTop(struct rte_acc100_conf *acc100_conf)
 {
@@ -553,7 +597,7 @@
 static int
 acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
 {
-	uint32_t phys_low, phys_high, payload;
+	uint32_t phys_low, phys_high, value;
 	struct acc100_device *d = dev->data->dev_private;
 	const struct acc100_registry_addr *reg_addr;
 	int ret;
@@ -612,8 +656,8 @@
 	 * Configure Ring Size to the max queue ring size
 	 * (used for wrapping purpose)
 	 */
-	payload = log2_basic(d->sw_ring_size / 64);
-	acc100_reg_write(d, reg_addr->ring_size, payload);
+	value = log2_basic(d->sw_ring_size / 64);
+	acc100_reg_write(d, reg_addr->ring_size, value);
 
 	/* Configure tail pointer for use when SDONE enabled */
 	d->tail_ptrs = rte_zmalloc_socket(
@@ -4209,3 +4253,475 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
 RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
 RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
 RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
+
+/*
+ * Workaround implementation to fix the power-on status of some 5GUL engines.
+ * This requires DMA permission if ported outside DPDK.
+ * It resolves the state of these engines by running a dummy operation and
+ * resetting them so that their state is reliably defined.
+ */
+static void
+poweron_cleanup(struct rte_bbdev *bbdev, struct acc100_device *d,
+		struct rte_acc100_conf *conf)
+{
+	int i, template_idx, qg_idx;
+	uint32_t address, status, value;
+	printf("Need to clear power-on 5GUL status in internal memory\n");
+	/* Reset LDPC Cores */
+	for (i = 0; i < ACC100_ENGINES_MAX; i++)
+		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+				ACC100_ENGINE_OFFSET * i, ACC100_RESET_HI);
+	usleep(ACC100_LONG_WAIT);
+	for (i = 0; i < ACC100_ENGINES_MAX; i++)
+		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+				ACC100_ENGINE_OFFSET * i, ACC100_RESET_LO);
+	usleep(ACC100_LONG_WAIT);
+	/* Prepare dummy workload */
+	alloc_2x64mb_sw_rings_mem(bbdev, d, 0);
+	/* Set base addresses */
+	uint32_t phys_high = (uint32_t)(d->sw_rings_iova >> 32);
+	uint32_t phys_low  = (uint32_t)(d->sw_rings_iova &
+			~(ACC100_SIZE_64MBYTE-1));
+	acc100_reg_write(d, HWPfDmaFec5GulDescBaseHiRegVf, phys_high);
+	acc100_reg_write(d, HWPfDmaFec5GulDescBaseLoRegVf, phys_low);
+
+	/* Descriptor for a dummy 5GUL code block processing */
+	union acc100_dma_desc *desc = NULL;
+	desc = d->sw_rings;
+	desc->req.data_ptrs[0].address = d->sw_rings_iova +
+			ACC100_DESC_FCW_OFFSET;
+	desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
+	desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
+	desc->req.data_ptrs[0].last = 0;
+	desc->req.data_ptrs[0].dma_ext = 0;
+	desc->req.data_ptrs[1].address = d->sw_rings_iova + 512;
+	desc->req.data_ptrs[1].blkid = ACC100_DMA_BLKID_IN;
+	desc->req.data_ptrs[1].last = 1;
+	desc->req.data_ptrs[1].dma_ext = 0;
+	desc->req.data_ptrs[1].blen = 44;
+	desc->req.data_ptrs[2].address = d->sw_rings_iova + 1024;
+	desc->req.data_ptrs[2].blkid = ACC100_DMA_BLKID_OUT_ENC;
+	desc->req.data_ptrs[2].last = 1;
+	desc->req.data_ptrs[2].dma_ext = 0;
+	desc->req.data_ptrs[2].blen = 5;
+	/* Dummy FCW */
+	desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
+	desc->req.fcw_ld.qm = 1;
+	desc->req.fcw_ld.nfiller = 30;
+	desc->req.fcw_ld.BG = 2 - 1;
+	desc->req.fcw_ld.Zc = 7;
+	desc->req.fcw_ld.ncb = 350;
+	desc->req.fcw_ld.rm_e = 4;
+	desc->req.fcw_ld.itmax = 10;
+	desc->req.fcw_ld.gain_i = 1;
+	desc->req.fcw_ld.gain_h = 1;
+
+	int engines_to_restart[ACC100_SIG_UL_5G_LAST + 1] = {0};
+	int num_failed_engine = 0;
+	/* Detect engines in undefined state */
+	for (template_idx = ACC100_SIG_UL_5G;
+			template_idx <= ACC100_SIG_UL_5G_LAST;
+			template_idx++) {
+		/* Check engine power-on status */
+		address = HwPfFecUl5gIbDebugReg +
+				ACC100_ENGINE_OFFSET * template_idx;
+		status = (acc100_reg_read(d, address) >> 4) & 0xF;
+		if (status == 0) {
+			engines_to_restart[num_failed_engine] = template_idx;
+			num_failed_engine++;
+		}
+	}
+
+	int numQqsAcc = conf->q_ul_4g.num_qgroups;
+	int numQgs = conf->q_ul_5g.num_qgroups;
+	value = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		value |= (1 << qg_idx);
+	/* Force each engine which is in unspecified state */
+	for (i = 0; i < num_failed_engine; i++) {
+		int failed_engine = engines_to_restart[i];
+		printf("Force engine %d\n", failed_engine);
+		for (template_idx = ACC100_SIG_UL_5G;
+				template_idx <= ACC100_SIG_UL_5G_LAST;
+				template_idx++) {
+			address = HWPfQmgrGrpTmplateReg4Indx
+					+ ACC100_BYTES_IN_WORD * template_idx;
+			if (template_idx == failed_engine)
+				acc100_reg_write(d, address, value);
+			else
+				acc100_reg_write(d, address, 0);
+		}
+		/* Reset descriptor header */
+		desc->req.word0 = ACC100_DMA_DESC_TYPE;
+		desc->req.word1 = 0;
+		desc->req.word2 = 0;
+		desc->req.word3 = 0;
+		desc->req.numCBs = 1;
+		desc->req.m2dlen = 2;
+		desc->req.d2mlen = 1;
+		/* Enqueue the code block for processing */
+		union acc100_enqueue_reg_fmt enq_req;
+		enq_req.val = 0;
+		enq_req.addr_offset = ACC100_DESC_OFFSET;
+		enq_req.num_elem = 1;
+		enq_req.req_elem_addr = 0;
+		rte_wmb();
+		acc100_reg_write(d, HWPfQmgrIngressAq + 0x100, enq_req.val);
+		usleep(ACC100_LONG_WAIT * 100);
+		if (desc->req.word0 != 2)
+			printf("DMA Response %#"PRIx32"\n", desc->req.word0);
+	}
+
+	/* Reset LDPC Cores */
+	for (i = 0; i < ACC100_ENGINES_MAX; i++)
+		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+				ACC100_ENGINE_OFFSET * i,
+				ACC100_RESET_HI);
+	usleep(ACC100_LONG_WAIT);
+	for (i = 0; i < ACC100_ENGINES_MAX; i++)
+		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+				ACC100_ENGINE_OFFSET * i,
+				ACC100_RESET_LO);
+	usleep(ACC100_LONG_WAIT);
+	acc100_reg_write(d, HWPfHi5GHardResetReg, ACC100_RESET_HARD);
+	usleep(ACC100_LONG_WAIT);
+	int numEngines = 0;
+	/* Check engine power-on status again */
+	for (template_idx = ACC100_SIG_UL_5G;
+			template_idx <= ACC100_SIG_UL_5G_LAST;
+			template_idx++) {
+		address = HwPfFecUl5gIbDebugReg +
+				ACC100_ENGINE_OFFSET * template_idx;
+		status = (acc100_reg_read(d, address) >> 4) & 0xF;
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ ACC100_BYTES_IN_WORD * template_idx;
+		if (status == 1) {
+			acc100_reg_write(d, address, value);
+			numEngines++;
+		} else
+			acc100_reg_write(d, address, 0);
+	}
+	printf("Number of 5GUL engines %d\n", numEngines);
+
+	if (d->sw_rings_base != NULL)
+		rte_free(d->sw_rings_base);
+	usleep(ACC100_LONG_WAIT);
+}
+
+/* Initial configuration of an ACC100 device prior to running configure() */
+int
+rte_acc100_configure(const char *dev_name, struct rte_acc100_conf *conf)
+{
+	rte_bbdev_log(INFO, "rte_acc100_configure");
+	uint32_t value, address, status;
+	int qg_idx, template_idx, vf_idx, acc, i;
+	struct rte_bbdev *bbdev = rte_bbdev_get_named_dev(dev_name);
+
+	/* Compile time checks */
+	RTE_BUILD_BUG_ON(sizeof(struct acc100_dma_req_desc) != 256);
+	RTE_BUILD_BUG_ON(sizeof(union acc100_dma_desc) != 256);
+	RTE_BUILD_BUG_ON(sizeof(struct acc100_fcw_td) != 24);
+	RTE_BUILD_BUG_ON(sizeof(struct acc100_fcw_te) != 32);
+
+	if (bbdev == NULL) {
+		rte_bbdev_log(ERR,
+		"Invalid dev_name (%s), or device is not yet initialised",
+		dev_name);
+		return -ENODEV;
+	}
+	struct acc100_device *d = bbdev->data->dev_private;
+
+	/* Store configuration */
+	rte_memcpy(&d->acc100_conf, conf, sizeof(d->acc100_conf));
+
+	/* PCIe Bridge configuration */
+	acc100_reg_write(d, HwPfPcieGpexBridgeControl, ACC100_CFG_PCI_BRIDGE);
+	for (i = 1; i < ACC100_GPEX_AXIMAP_NUM; i++)
+		acc100_reg_write(d,
+				HwPfPcieGpexAxiAddrMappingWindowPexBaseHigh
+				+ i * 16, 0);
+
+	/* Prevent blocking AXI read on BRESP for AXI Write */
+	address = HwPfPcieGpexAxiPioControl;
+	value = ACC100_CFG_PCI_AXI;
+	acc100_reg_write(d, address, value);
+
+	/* 5GDL PLL phase shift */
+	acc100_reg_write(d, HWPfChaDl5gPllPhshft0, 0x1);
+
+	/* Explicitly releasing AXI as this may be stopped after PF FLR/BME */
+	address = HWPfDmaAxiControl;
+	value = 1;
+	acc100_reg_write(d, address, value);
+
+	/* DDR Configuration */
+	address = HWPfDdrBcTim6;
+	value = acc100_reg_read(d, address);
+	value &= 0xFFFFFFFB; /* Bit 2 */
+#ifdef ACC100_DDR_ECC_ENABLE
+	value |= 0x4;
+#endif
+	acc100_reg_write(d, address, value);
+	address = HWPfDdrPhyDqsCountNum;
+#ifdef ACC100_DDR_ECC_ENABLE
+	value = 9;
+#else
+	value = 8;
+#endif
+	acc100_reg_write(d, address, value);
+
+	/* Set default descriptor signature */
+	address = HWPfDmaDescriptorSignatuture;
+	value = 0;
+	acc100_reg_write(d, address, value);
+
+	/* Enable the Error Detection in DMA */
+	value = ACC100_CFG_DMA_ERROR;
+	address = HWPfDmaErrorDetectionEn;
+	acc100_reg_write(d, address, value);
+
+	/* AXI Cache configuration */
+	value = ACC100_CFG_AXI_CACHE;
+	address = HWPfDmaAxcacheReg;
+	acc100_reg_write(d, address, value);
+
+	/* Default DMA Configuration (Qmgr Enabled) */
+	address = HWPfDmaConfig0Reg;
+	value = 0;
+	acc100_reg_write(d, address, value);
+	address = HWPfDmaQmanen;
+	value = 0;
+	acc100_reg_write(d, address, value);
+
+	/* Default RLIM/ALEN configuration */
+	address = HWPfDmaConfig1Reg;
+	value = (1 << 31) + (23 << 8) + (1 << 6) + 7;
+	acc100_reg_write(d, address, value);
+
+	/* Configure DMA Qmanager addresses */
+	address = HWPfDmaQmgrAddrReg;
+	value = HWPfQmgrEgressQueuesTemplate;
+	acc100_reg_write(d, address, value);
+
+	/* ===== Qmgr Configuration ===== */
+	/* Configuration of the AQueue Depth QMGR_GRP_0_DEPTH_LOG2 for UL */
+	int totalQgs = conf->q_ul_4g.num_qgroups +
+			conf->q_ul_5g.num_qgroups +
+			conf->q_dl_4g.num_qgroups +
+			conf->q_dl_5g.num_qgroups;
+	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
+		address = HWPfQmgrDepthLog2Grp +
+		ACC100_BYTES_IN_WORD * qg_idx;
+		value = aqDepth(qg_idx, conf);
+		acc100_reg_write(d, address, value);
+		address = HWPfQmgrTholdGrp +
+		ACC100_BYTES_IN_WORD * qg_idx;
+		value = (1 << 16) + (1 << (aqDepth(qg_idx, conf) - 1));
+		acc100_reg_write(d, address, value);
+	}
+
+	/* Template Priority in incremental order */
+	for (template_idx = 0; template_idx < ACC100_NUM_TMPL;
+			template_idx++) {
+		address = HWPfQmgrGrpTmplateReg0Indx +
+		ACC100_BYTES_IN_WORD * (template_idx % 8);
+		value = ACC100_TMPL_PRI_0;
+		acc100_reg_write(d, address, value);
+		address = HWPfQmgrGrpTmplateReg1Indx +
+		ACC100_BYTES_IN_WORD * (template_idx % 8);
+		value = ACC100_TMPL_PRI_1;
+		acc100_reg_write(d, address, value);
+		address = HWPfQmgrGrpTmplateReg2indx +
+		ACC100_BYTES_IN_WORD * (template_idx % 8);
+		value = ACC100_TMPL_PRI_2;
+		acc100_reg_write(d, address, value);
+		address = HWPfQmgrGrpTmplateReg3Indx +
+		ACC100_BYTES_IN_WORD * (template_idx % 8);
+		value = ACC100_TMPL_PRI_3;
+		acc100_reg_write(d, address, value);
+	}
+
+	address = HWPfQmgrGrpPriority;
+	value = ACC100_CFG_QMGR_HI_P;
+	acc100_reg_write(d, address, value);
+
+	/* Template Configuration */
+	for (template_idx = 0; template_idx < ACC100_NUM_TMPL;
+			template_idx++) {
+		value = 0;
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ ACC100_BYTES_IN_WORD * template_idx;
+		acc100_reg_write(d, address, value);
+	}
+	/* 4GUL */
+	int numQgs = conf->q_ul_4g.num_qgroups;
+	int numQqsAcc = 0;
+	value = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		value |= (1 << qg_idx);
+	for (template_idx = ACC100_SIG_UL_4G;
+			template_idx <= ACC100_SIG_UL_4G_LAST;
+			template_idx++) {
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ ACC100_BYTES_IN_WORD * template_idx;
+		acc100_reg_write(d, address, value);
+	}
+	/* 5GUL */
+	numQqsAcc += numQgs;
+	numQgs	= conf->q_ul_5g.num_qgroups;
+	value = 0;
+	int numEngines = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		value |= (1 << qg_idx);
+	for (template_idx = ACC100_SIG_UL_5G;
+			template_idx <= ACC100_SIG_UL_5G_LAST;
+			template_idx++) {
+		/* Check engine power-on status */
+		address = HwPfFecUl5gIbDebugReg +
+				ACC100_ENGINE_OFFSET * template_idx;
+		status = (acc100_reg_read(d, address) >> 4) & 0xF;
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ ACC100_BYTES_IN_WORD * template_idx;
+		if (status == 1) {
+			acc100_reg_write(d, address, value);
+			numEngines++;
+		} else
+			acc100_reg_write(d, address, 0);
+#if RTE_ACC100_SINGLE_FEC == 1
+		value = 0;
+#endif
+	}
+	printf("Number of 5GUL engines %d\n", numEngines);
+	/* 4GDL */
+	numQqsAcc += numQgs;
+	numQgs	= conf->q_dl_4g.num_qgroups;
+	value = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		value |= (1 << qg_idx);
+	for (template_idx = ACC100_SIG_DL_4G;
+			template_idx <= ACC100_SIG_DL_4G_LAST;
+			template_idx++) {
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ ACC100_BYTES_IN_WORD * template_idx;
+		acc100_reg_write(d, address, value);
+#if RTE_ACC100_SINGLE_FEC == 1
+			value = 0;
+#endif
+	}
+	/* 5GDL */
+	numQqsAcc += numQgs;
+	numQgs	= conf->q_dl_5g.num_qgroups;
+	value = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		value |= (1 << qg_idx);
+	for (template_idx = ACC100_SIG_DL_5G;
+			template_idx <= ACC100_SIG_DL_5G_LAST;
+			template_idx++) {
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ ACC100_BYTES_IN_WORD * template_idx;
+		acc100_reg_write(d, address, value);
+#if RTE_ACC100_SINGLE_FEC == 1
+		value = 0;
+#endif
+	}
+
+	/* Queue Group Function mapping */
+	int qman_func_id[5] = {0, 2, 1, 3, 4};
+	address = HWPfQmgrGrpFunction0;
+	value = 0;
+	for (qg_idx = 0; qg_idx < 8; qg_idx++) {
+		acc = accFromQgid(qg_idx, conf);
+		value |= qman_func_id[acc]<<(qg_idx * 4);
+	}
+	acc100_reg_write(d, address, value);
+
+	/* Configuration of the Arbitration QGroup depth to 1 */
+	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
+		address = HWPfQmgrArbQDepthGrp +
+		ACC100_BYTES_IN_WORD * qg_idx;
+		value = 0;
+		acc100_reg_write(d, address, value);
+	}
+
+	/* Enable AQueues through the queue hierarchy */
+	for (vf_idx = 0; vf_idx < ACC100_NUM_VFS; vf_idx++) {
+		for (qg_idx = 0; qg_idx < ACC100_NUM_QGRPS; qg_idx++) {
+			value = 0;
+			if (vf_idx < conf->num_vf_bundles &&
+					qg_idx < totalQgs)
+				value = (1 << aqNum(qg_idx, conf)) - 1;
+			address = HWPfQmgrAqEnableVf
+					+ vf_idx * ACC100_BYTES_IN_WORD;
+			value += (qg_idx << 16);
+			acc100_reg_write(d, address, value);
+		}
+	}
+
+	/* This pointer to ARAM (256kB) is shifted by 2 (4B per register) */
+	uint32_t aram_address = 0;
+	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
+		for (vf_idx = 0; vf_idx < conf->num_vf_bundles; vf_idx++) {
+			address = HWPfQmgrVfBaseAddr + vf_idx
+					* ACC100_BYTES_IN_WORD + qg_idx
+					* ACC100_BYTES_IN_WORD * 64;
+			value = aram_address;
+			acc100_reg_write(d, address, value);
+			/* Offset ARAM Address for next memory bank
+			 * - increment of 4B
+			 */
+			aram_address += aqNum(qg_idx, conf) *
+					(1 << aqDepth(qg_idx, conf));
+		}
+	}
+
+	if (aram_address > ACC100_WORDS_IN_ARAM_SIZE) {
+		rte_bbdev_log(ERR, "ARAM Configuration not fitting %d %d\n",
+				aram_address, ACC100_WORDS_IN_ARAM_SIZE);
+		return -EINVAL;
+	}
+
+	/* ==== HI Configuration ==== */
+
+	/* Prevent Block on Transmit Error */
+	address = HWPfHiBlockTransmitOnErrorEn;
+	value = 0;
+	acc100_reg_write(d, address, value);
+	/* Prevent MSI from being dropped */
+	address = HWPfHiMsiDropEnableReg;
+	value = 0;
+	acc100_reg_write(d, address, value);
+	/* Set the PF Mode register */
+	address = HWPfHiPfMode;
+	value = (conf->pf_mode_en) ? ACC100_PF_VAL : 0;
+	acc100_reg_write(d, address, value);
+	/* Enable Error Detection in HW */
+	address = HWPfDmaErrorDetectionEn;
+	value = 0x3D7;
+	acc100_reg_write(d, address, value);
+
+	/* QoS overflow init */
+	value = 1;
+	address = HWPfQosmonAEvalOverflow0;
+	acc100_reg_write(d, address, value);
+	address = HWPfQosmonBEvalOverflow0;
+	acc100_reg_write(d, address, value);
+
+	/* HARQ DDR Configuration */
+	unsigned int ddrSizeInMb = 512; /* Fixed to 512 MB per VF for now */
+	for (vf_idx = 0; vf_idx < conf->num_vf_bundles; vf_idx++) {
+		address = HWPfDmaVfDdrBaseRw + vf_idx
+				* 0x10;
+		value = ((vf_idx * (ddrSizeInMb / 64)) << 16) +
+				(ddrSizeInMb - 1);
+		acc100_reg_write(d, address, value);
+	}
+	usleep(ACC100_LONG_WAIT);
+
+	/* Workaround in case some 5GUL engines are in an unexpected state */
+	if (numEngines < (ACC100_SIG_UL_5G_LAST + 1))
+		poweron_cleanup(bbdev, d, conf);
+
+	rte_bbdev_log_debug("PF Tip configuration complete for %s", dev_name);
+	return 0;
+}
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 1fbd96e..03ed0b3 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -158,6 +158,7 @@
 #define ACC100_RESET_HARD       0x1FF
 #define ACC100_ENGINES_MAX      9
 #define ACC100_LONG_WAIT        1000
+#define ACC100_GPEX_AXIMAP_NUM  17
 
 /* ACC100 DMA Descriptor triplet */
 struct acc100_dma_triplet {
diff --git a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
index 4a76d1d..47a23b8 100644
--- a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
+++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
@@ -1,3 +1,10 @@
 DPDK_21 {
 	local: *;
 };
+
+EXPERIMENTAL {
+	global:
+
+	rte_acc100_configure;
+
+};
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v11 01/10] drivers/baseband: add PMD for ACC100
  2020-10-02  1:01     ` [dpdk-dev] [PATCH v11 01/10] drivers/baseband: add PMD for ACC100 Nicolas Chautru
@ 2020-10-04 15:53       ` Tom Rix
  0 siblings, 0 replies; 213+ messages in thread
From: Tom Rix @ 2020-10-04 15:53 UTC (permalink / raw)
  To: Nicolas Chautru, dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, maxime.coquelin, ferruh.yigit, tianjiao.liu


On 10/1/20 6:01 PM, Nicolas Chautru wrote:
> Add stubs for the ACC100 PMD
>
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
> ---
>  doc/guides/bbdevs/acc100.rst                       | 228 +++++++++++++++++++++
>  doc/guides/bbdevs/features/acc100.ini              |  14 ++
>  doc/guides/bbdevs/index.rst                        |   1 +
>  drivers/baseband/acc100/meson.build                |   6 +
>  drivers/baseband/acc100/rte_acc100_pmd.c           | 175 ++++++++++++++++
>  drivers/baseband/acc100/rte_acc100_pmd.h           |  37 ++++
>  .../acc100/rte_pmd_bbdev_acc100_version.map        |   3 +
>  drivers/baseband/meson.build                       |   2 +-
>  8 files changed, 465 insertions(+), 1 deletion(-)
>  create mode 100644 doc/guides/bbdevs/acc100.rst
>  create mode 100644 doc/guides/bbdevs/features/acc100.ini
>  create mode 100644 drivers/baseband/acc100/meson.build
>  create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.c
>  create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.h
>  create mode 100644 drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map

Looks fine.

Reviewed-by: Tom Rix <trix@redhat.com>



^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v11 02/10] baseband/acc100: add register definition file
  2020-10-02  1:01     ` [dpdk-dev] [PATCH v11 02/10] baseband/acc100: add register definition file Nicolas Chautru
@ 2020-10-04 15:56       ` Tom Rix
  0 siblings, 0 replies; 213+ messages in thread
From: Tom Rix @ 2020-10-04 15:56 UTC (permalink / raw)
  To: Nicolas Chautru, dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, maxime.coquelin, ferruh.yigit, tianjiao.liu


On 10/1/20 6:01 PM, Nicolas Chautru wrote:
> Add the list of registers for the device and related
> HW spec definitions.
>
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> Reviewed-by: Rosen Xu <rosen.xu@intel.com>
> Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
> ---
>  drivers/baseband/acc100/acc100_pf_enum.h | 1068 ++++++++++++++++++++++++++++++
>  drivers/baseband/acc100/acc100_vf_enum.h |   73 ++
>  drivers/baseband/acc100/rte_acc100_pmd.h |  487 ++++++++++++++
>  3 files changed, 1628 insertions(+)
>  create mode 100644 drivers/baseband/acc100/acc100_pf_enum.h
>  create mode 100644 drivers/baseband/acc100/acc100_vf_enum.h

Thanks for the changes.

Reviewed-by: Tom Rix <trix@redhat.com>



^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v11 03/10] baseband/acc100: add info get function
  2020-10-02  1:01     ` [dpdk-dev] [PATCH v11 03/10] baseband/acc100: add info get function Nicolas Chautru
@ 2020-10-04 16:09       ` Tom Rix
  2020-10-05 16:38         ` Chautru, Nicolas
  0 siblings, 1 reply; 213+ messages in thread
From: Tom Rix @ 2020-10-04 16:09 UTC (permalink / raw)
  To: Nicolas Chautru, dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, maxime.coquelin, ferruh.yigit, tianjiao.liu


On 10/1/20 6:01 PM, Nicolas Chautru wrote:
> Add in the "info_get" function to the driver, to allow us to query the
> device.
> No processing capabilities are available yet.
> Linking bbdev-test to support the PMD with null capability.
>
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
> ---
>  app/test-bbdev/meson.build               |   3 +
>  drivers/baseband/acc100/rte_acc100_cfg.h |  96 +++++++++++++
>  drivers/baseband/acc100/rte_acc100_pmd.c | 229 +++++++++++++++++++++++++++++++
>  drivers/baseband/acc100/rte_acc100_pmd.h |  10 ++
>  4 files changed, 338 insertions(+)
>  create mode 100644 drivers/baseband/acc100/rte_acc100_cfg.h
>
> diff --git a/app/test-bbdev/meson.build b/app/test-bbdev/meson.build
> index 18ab6a8..fbd8ae3 100644
> --- a/app/test-bbdev/meson.build
> +++ b/app/test-bbdev/meson.build
> @@ -12,3 +12,6 @@ endif
>  if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC')
>  	deps += ['pmd_bbdev_fpga_5gnr_fec']
>  endif
> +if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_ACC100')
> +	deps += ['pmd_bbdev_acc100']
> +endif
> \ No newline at end of file
> diff --git a/drivers/baseband/acc100/rte_acc100_cfg.h b/drivers/baseband/acc100/rte_acc100_cfg.h
> new file mode 100644
> index 0000000..a1d43ef
> --- /dev/null
> +++ b/drivers/baseband/acc100/rte_acc100_cfg.h
> @@ -0,0 +1,96 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2020 Intel Corporation
> + */
> +
> +#ifndef _RTE_ACC100_CFG_H_
> +#define _RTE_ACC100_CFG_H_
> +
> +/**
> + * @file rte_acc100_cfg.h
> + *
> + * Functions for configuring ACC100 HW, exposed directly to applications.
> + * Configuration related to encoding/decoding is done through the
> + * librte_bbdev library.
> + *
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice
> + */
> +
> +#include <stdint.h>
> +#include <stdbool.h>
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +/**< Number of Virtual Functions ACC100 supports */
> +#define RTE_ACC100_NUM_VFS 16

I was expecting the definition of RTE_ACC100_NUM_VFS to be removed.

And its uses replaced with ACC100_NUM_VFS.

or

#define RTE_ACC100_NUM_VFS ACC100_NUM_VFS

> +
> +/**
> + * Definition of Queue Topology for ACC100 Configuration
> + * Some level of details is abstracted out to expose a clean interface
> + * given that comprehensive flexibility is not required
> + */
> +struct rte_acc100_queue_topology {
> +	/** Number of QGroups in incremental order of priority */
> +	uint16_t num_qgroups;
> +	/**
> +	 * All QGroups have the same number of AQs here.
> +	 * Note : Could be made a 16-array if more flexibility is really
> +	 * required
> +	 */
> +	uint16_t num_aqs_per_groups;
> +	/**
> +	 * Depth of the AQs is the same of all QGroups here. Log2 Enum : 2^N
> +	 * Note : Could be made a 16-array if more flexibility is really
> +	 * required
> +	 */
> +	uint16_t aq_depth_log2;
> +	/**
> +	 * Index of the first Queue Group Index - assuming contiguity
> +	 * Initialized as -1
> +	 */
> +	int8_t first_qgroup_index;
> +};
> +
> +/**
> + * Definition of Arbitration related parameters for ACC100 Configuration
> + */
> +struct rte_acc100_arbitration {
> +	/** Default Weight for VF Fairness Arbitration */
> +	uint16_t round_robin_weight;
> +	uint32_t gbr_threshold1; /**< Guaranteed Bitrate Threshold 1 */
> +	uint32_t gbr_threshold2; /**< Guaranteed Bitrate Threshold 2 */
> +};
> +
> +/**
> + * Structure to pass ACC100 configuration.
> + * Note: all VF Bundles will have the same configuration.
> + */
> +struct rte_acc100_conf {
> +	bool pf_mode_en; /**< 1 if PF is used for dataplane, 0 for VFs */
> +	/** 1 if input '1' bit is represented by a positive LLR value, 0 if '1'
> +	 * bit is represented by a negative value.
> +	 */
> +	bool input_pos_llr_1_bit;
> +	/** 1 if output '1' bit is represented by a positive value, 0 if '1'
> +	 * bit is represented by a negative value.
> +	 */
> +	bool output_pos_llr_1_bit;
> +	uint16_t num_vf_bundles; /**< Number of VF bundles to setup */
> +	/** Queue topology for each operation type */
> +	struct rte_acc100_queue_topology q_ul_4g;
> +	struct rte_acc100_queue_topology q_dl_4g;
> +	struct rte_acc100_queue_topology q_ul_5g;
> +	struct rte_acc100_queue_topology q_dl_5g;
> +	/** Arbitration configuration for each operation type */
> +	struct rte_acc100_arbitration arb_ul_4g[RTE_ACC100_NUM_VFS];
> +	struct rte_acc100_arbitration arb_dl_4g[RTE_ACC100_NUM_VFS];
> +	struct rte_acc100_arbitration arb_ul_5g[RTE_ACC100_NUM_VFS];
> +	struct rte_acc100_arbitration arb_dl_5g[RTE_ACC100_NUM_VFS];
> +};
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_ACC100_CFG_H_ */
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
> index 1b4cd13..fcba77e 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> @@ -26,6 +26,188 @@
>  RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
>  #endif
>  
> +/* Read a register of a ACC100 device */
> +static inline uint32_t
> +acc100_reg_read(struct acc100_device *d, uint32_t offset)
> +{
> +
> +	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
> +	uint32_t ret = *((volatile uint32_t *)(reg_addr));
> +	return rte_le_to_cpu_32(ret);
> +}
> +
> +/* Calculate the offset of the enqueue register */
> +static inline uint32_t
> +queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
> +{
> +	if (pf_device)
> +		return ((vf_id << 12) + (qgrp_id << 7) + (aq_id << 3) +
> +				HWPfQmgrIngressAq);
> +	else
> +		return ((qgrp_id << 7) + (aq_id << 3) +
> +				HWVfQmgrIngressAq);
> +}
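For reference, the bit packing done by queue_offset() above can be exercised stand-alone; the base offsets below are placeholders, not the real HWPfQmgrIngressAq/HWVfQmgrIngressAq values:

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder ingress-queue base offsets (not the real enum values). */
#define DEMO_PF_INGRESS_AQ 0x10000u
#define DEMO_VF_INGRESS_AQ 0x1000u

/* The PF view packs vf_id above qgrp_id above aq_id; a VF only
 * addresses its own bundle, so vf_id drops out of the offset.
 */
static inline uint32_t
demo_queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id,
		uint16_t aq_id)
{
	if (pf_device)
		return (vf_id << 12) + (qgrp_id << 7) + (aq_id << 3) +
				DEMO_PF_INGRESS_AQ;
	return (qgrp_id << 7) + (aq_id << 3) + DEMO_VF_INGRESS_AQ;
}
```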
> +
> +enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC};
> +
> +/* Return the queue topology for a Queue Group Index */
> +static inline void
> +qtopFromAcc(struct rte_acc100_queue_topology **qtop, int acc_enum,
> +		struct rte_acc100_conf *acc100_conf)
> +{
> +	struct rte_acc100_queue_topology *p_qtop;
> +	p_qtop = NULL;
> +	switch (acc_enum) {
> +	case UL_4G:
> +		p_qtop = &(acc100_conf->q_ul_4g);
> +		break;
> +	case UL_5G:
> +		p_qtop = &(acc100_conf->q_ul_5g);
> +		break;
> +	case DL_4G:
> +		p_qtop = &(acc100_conf->q_dl_4g);
> +		break;
> +	case DL_5G:
> +		p_qtop = &(acc100_conf->q_dl_5g);
> +		break;
> +	default:
> +		/* NOTREACHED */
> +		rte_bbdev_log(ERR, "Unexpected error evaluating qtopFromAcc");
> +		break;
> +	}
> +	*qtop = p_qtop;
> +}
> +
> +static void
> +initQTop(struct rte_acc100_conf *acc100_conf)
> +{
> +	acc100_conf->q_ul_4g.num_aqs_per_groups = 0;
> +	acc100_conf->q_ul_4g.num_qgroups = 0;
> +	acc100_conf->q_ul_4g.first_qgroup_index = -1;
> +	acc100_conf->q_ul_5g.num_aqs_per_groups = 0;
> +	acc100_conf->q_ul_5g.num_qgroups = 0;
> +	acc100_conf->q_ul_5g.first_qgroup_index = -1;
> +	acc100_conf->q_dl_4g.num_aqs_per_groups = 0;
> +	acc100_conf->q_dl_4g.num_qgroups = 0;
> +	acc100_conf->q_dl_4g.first_qgroup_index = -1;
> +	acc100_conf->q_dl_5g.num_aqs_per_groups = 0;
> +	acc100_conf->q_dl_5g.num_qgroups = 0;
> +	acc100_conf->q_dl_5g.first_qgroup_index = -1;
> +}
> +
> +static inline void
> +updateQtop(uint8_t acc, uint8_t qg, struct rte_acc100_conf *acc100_conf,
> +		struct acc100_device *d) {
> +	uint32_t reg;
> +	struct rte_acc100_queue_topology *q_top = NULL;
> +	qtopFromAcc(&q_top, acc, acc100_conf);
> +	if (unlikely(q_top == NULL))
> +		return;
> +	uint16_t aq;
> +	q_top->num_qgroups++;
> +	if (q_top->first_qgroup_index == -1) {
> +		q_top->first_qgroup_index = qg;
> +		/* Can be optimized to assume all are enabled by default */
> +		reg = acc100_reg_read(d, queue_offset(d->pf_device,
> +				0, qg, ACC100_NUM_AQS - 1));
> +		if (reg & ACC100_QUEUE_ENABLE) {
> +			q_top->num_aqs_per_groups = ACC100_NUM_AQS;
> +			return;
> +		}
> +		q_top->num_aqs_per_groups = 0;
> +		for (aq = 0; aq < ACC100_NUM_AQS; aq++) {
> +			reg = acc100_reg_read(d, queue_offset(d->pf_device,
> +					0, qg, aq));
> +			if (reg & ACC100_QUEUE_ENABLE)
> +				q_top->num_aqs_per_groups++;
> +		}
> +	}
> +}
> +
> +/* Fetch configuration enabled for the PF/VF using MMIO Read (slow) */
> +static inline void
> +fetch_acc100_config(struct rte_bbdev *dev)
> +{
> +	struct acc100_device *d = dev->data->dev_private;
> +	struct rte_acc100_conf *acc100_conf = &d->acc100_conf;
> +	const struct acc100_registry_addr *reg_addr;
> +	uint8_t acc, qg;
> +	uint32_t reg, reg_aq, reg_len0, reg_len1;
> +	uint32_t reg_mode;
> +
> +	/* No need to retrieve the configuration if it is already done */
> +	if (d->configured)
> +		return;
> +
> +	/* Choose correct registry addresses for the device type */
> +	if (d->pf_device)
> +		reg_addr = &pf_reg_addr;
> +	else
> +		reg_addr = &vf_reg_addr;
> +
> +	d->ddr_size = (1 + acc100_reg_read(d, reg_addr->ddr_range)) << 10;
> +
> +	/* Single VF Bundle by VF */
> +	acc100_conf->num_vf_bundles = 1;
> +	initQTop(acc100_conf);
> +
> +	struct rte_acc100_queue_topology *q_top = NULL;
> +	int qman_func_id[ACC100_NUM_ACCS] = {ACC100_ACCMAP_0, ACC100_ACCMAP_1,
> +			ACC100_ACCMAP_2, ACC100_ACCMAP_3, ACC100_ACCMAP_4};
> +	reg = acc100_reg_read(d, reg_addr->qman_group_func);
> +	for (qg = 0; qg < ACC100_NUM_QGRPS_PER_WORD; qg++) {
> +		reg_aq = acc100_reg_read(d,
> +				queue_offset(d->pf_device, 0, qg, 0));
> +		if (reg_aq & ACC100_QUEUE_ENABLE) {
> +			uint32_t idx = (reg >> (qg * 4)) & 0x7;
> +			if (idx >= ACC100_NUM_ACCS)
> +				break;

a 'continue' would be better here, since the 'break' also skips all the
remaining queue groups in the word

or reverse the check

if (idx < ACC100_NUM_ACCS) {
    acc = qman_func_id ..
}

Tom

> +			acc = qman_func_id[idx];
> +			updateQtop(acc, qg, acc100_conf, d);
> +		}
> +	}
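The break/continue distinction is easy to check in isolation: the qman_group_func word holds one 3-bit accelerator index per queue group, and a 'break' abandons every group after the first out-of-range index, while skipping only scans past it. A sketch with mock values (8 groups per word and 5 accelerators, as in the patch):

```c
#include <stdint.h>

#define DEMO_NUM_ACCS 5
#define DEMO_QGRPS_PER_WORD 8

/* Count queue groups mapped to a valid accelerator, skipping (not
 * stopping at) out-of-range 3-bit indices in the packed word.
 */
static int
demo_count_valid_qgroups(uint32_t qman_group_func)
{
	int qg, valid = 0;
	for (qg = 0; qg < DEMO_QGRPS_PER_WORD; qg++) {
		uint32_t idx = (qman_group_func >> (qg * 4)) & 0x7;
		if (idx < DEMO_NUM_ACCS)	/* reversed check: no break */
			valid++;
	}
	return valid;
}
```

With the word 0x77777271 (groups 0 and 2 valid, group 1 invalid), a 'break' at group 1 would report only one valid group; skipping reports both.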
> +
> +	/* Check the depth of the AQs */
> +	reg_len0 = acc100_reg_read(d, reg_addr->depth_log0_offset);
> +	reg_len1 = acc100_reg_read(d, reg_addr->depth_log1_offset);
> +	for (acc = 0; acc < NUM_ACC; acc++) {
> +		qtopFromAcc(&q_top, acc, acc100_conf);
> +		if (q_top->first_qgroup_index < ACC100_NUM_QGRPS_PER_WORD)
> +			q_top->aq_depth_log2 = (reg_len0 >>
> +					(q_top->first_qgroup_index * 4))
> +					& 0xF;
> +		else
> +			q_top->aq_depth_log2 = (reg_len1 >>
> +					((q_top->first_qgroup_index -
> +					ACC100_NUM_QGRPS_PER_WORD) * 4))
> +					& 0xF;
> +	}
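The depth decode above relies on each register packing one 4-bit log2 depth per queue group, eight groups per 32-bit word, with groups 8..15 spilling into the second register; the AQ depth itself is then 2^N. A minimal sketch (the 8-per-word constant is assumed from ACC100_NUM_QGRPS_PER_WORD):

```c
#include <stdint.h>

#define DEMO_QGRPS_PER_WORD 8

/* Extract the log2 AQ depth for one queue group from the pair of
 * packed depth registers (4 bits per group, low groups in reg_len0).
 */
static uint16_t
demo_aq_depth_log2(uint32_t reg_len0, uint32_t reg_len1, int qgroup)
{
	if (qgroup < DEMO_QGRPS_PER_WORD)
		return (reg_len0 >> (qgroup * 4)) & 0xF;
	return (reg_len1 >> ((qgroup - DEMO_QGRPS_PER_WORD) * 4)) & 0xF;
}
```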
> +
> +	/* Read PF mode */
> +	if (d->pf_device) {
> +		reg_mode = acc100_reg_read(d, HWPfHiPfMode);
> +		acc100_conf->pf_mode_en = (reg_mode == ACC100_PF_VAL) ? 1 : 0;
> +	}
> +
> +	rte_bbdev_log_debug(
> +			"%s Config LLR SIGN IN/OUT %s %s QG %u %u %u %u AQ %u %u %u %u Len %u %u %u %u\n",
> +			(d->pf_device) ? "PF" : "VF",
> +			(acc100_conf->input_pos_llr_1_bit) ? "POS" : "NEG",
> +			(acc100_conf->output_pos_llr_1_bit) ? "POS" : "NEG",
> +			acc100_conf->q_ul_4g.num_qgroups,
> +			acc100_conf->q_dl_4g.num_qgroups,
> +			acc100_conf->q_ul_5g.num_qgroups,
> +			acc100_conf->q_dl_5g.num_qgroups,
> +			acc100_conf->q_ul_4g.num_aqs_per_groups,
> +			acc100_conf->q_dl_4g.num_aqs_per_groups,
> +			acc100_conf->q_ul_5g.num_aqs_per_groups,
> +			acc100_conf->q_dl_5g.num_aqs_per_groups,
> +			acc100_conf->q_ul_4g.aq_depth_log2,
> +			acc100_conf->q_dl_4g.aq_depth_log2,
> +			acc100_conf->q_ul_5g.aq_depth_log2,
> +			acc100_conf->q_dl_5g.aq_depth_log2);
> +}
> +
>  /* Free 64MB memory used for software rings */
>  static int
>  acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
> @@ -33,8 +215,55 @@
>  	return 0;
>  }
>  
> +/* Get ACC100 device info */
> +static void
> +acc100_dev_info_get(struct rte_bbdev *dev,
> +		struct rte_bbdev_driver_info *dev_info)
> +{
> +	struct acc100_device *d = dev->data->dev_private;
> +
> +	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
> +		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
> +	};
> +
> +	static struct rte_bbdev_queue_conf default_queue_conf;
> +	default_queue_conf.socket = dev->data->socket_id;
> +	default_queue_conf.queue_size = ACC100_MAX_QUEUE_DEPTH;
> +
> +	dev_info->driver_name = dev->device->driver->name;
> +
> +	/* Read and save the populated config from ACC100 registers */
> +	fetch_acc100_config(dev);
> +
> +	/* This isn't ideal because it reports the maximum number of queues but
> +	 * does not provide info on how many can be uplink/downlink or different
> +	 * priorities
> +	 */
> +	dev_info->max_num_queues =
> +			d->acc100_conf.q_dl_5g.num_aqs_per_groups *
> +			d->acc100_conf.q_dl_5g.num_qgroups +
> +			d->acc100_conf.q_ul_5g.num_aqs_per_groups *
> +			d->acc100_conf.q_ul_5g.num_qgroups +
> +			d->acc100_conf.q_dl_4g.num_aqs_per_groups *
> +			d->acc100_conf.q_dl_4g.num_qgroups +
> +			d->acc100_conf.q_ul_4g.num_aqs_per_groups *
> +			d->acc100_conf.q_ul_4g.num_qgroups;
> +	dev_info->queue_size_lim = ACC100_MAX_QUEUE_DEPTH;
> +	dev_info->hardware_accelerated = true;
> +	dev_info->max_dl_queue_priority =
> +			d->acc100_conf.q_dl_4g.num_qgroups - 1;
> +	dev_info->max_ul_queue_priority =
> +			d->acc100_conf.q_ul_4g.num_qgroups - 1;
> +	dev_info->default_queue_conf = default_queue_conf;
> +	dev_info->cpu_flag_reqs = NULL;
> +	dev_info->min_alignment = 64;
> +	dev_info->capabilities = bbdev_capabilities;
> +	dev_info->harq_buffer_size = d->ddr_size;
> +}
> +
>  static const struct rte_bbdev_ops acc100_bbdev_ops = {
>  	.close = acc100_dev_close,
> +	.info_get = acc100_dev_info_get,
>  };
>  
>  /* ACC100 PCI PF address map */
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
> index 6525d66..09965c8 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> @@ -7,6 +7,7 @@
>  
>  #include "acc100_pf_enum.h"
>  #include "acc100_vf_enum.h"
> +#include "rte_acc100_cfg.h"
>  
>  /* Helper macro for logging */
>  #define rte_bbdev_log(level, fmt, ...) \
> @@ -98,6 +99,13 @@
>  #define ACC100_SIG_UL_4G_LAST 21
>  #define ACC100_SIG_DL_4G      27
>  #define ACC100_SIG_DL_4G_LAST 31
> +#define ACC100_NUM_ACCS       5
> +#define ACC100_ACCMAP_0       0
> +#define ACC100_ACCMAP_1       2
> +#define ACC100_ACCMAP_2       1
> +#define ACC100_ACCMAP_3       3
> +#define ACC100_ACCMAP_4       4
> +#define ACC100_PF_VAL         2
>  
>  /* max number of iterations to allocate memory block for all rings */
>  #define ACC100_SW_RING_MEM_ALLOC_ATTEMPTS 5
> @@ -517,6 +525,8 @@ struct acc100_registry_addr {
>  /* Private data structure for each ACC100 device */
>  struct acc100_device {
>  	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
> +	uint32_t ddr_size; /* Size in kB */
> +	struct rte_acc100_conf acc100_conf; /* ACC100 Initial configuration */
>  	bool pf_device; /**< True if this is a PF ACC100 device */
>  	bool configured; /**< True if this ACC100 device is configured */
>  };


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v11 04/10] baseband/acc100: add queue configuration
  2020-10-02  1:01     ` [dpdk-dev] [PATCH v11 04/10] baseband/acc100: add queue configuration Nicolas Chautru
@ 2020-10-04 16:18       ` Tom Rix
  2020-10-05 16:42         ` Chautru, Nicolas
  0 siblings, 1 reply; 213+ messages in thread
From: Tom Rix @ 2020-10-04 16:18 UTC (permalink / raw)
  To: Nicolas Chautru, dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, maxime.coquelin, ferruh.yigit, tianjiao.liu


On 10/1/20 6:01 PM, Nicolas Chautru wrote:
> Adding function to create and configure queues for
> the device. Still no capability.
>
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> Reviewed-by: Rosen Xu <rosen.xu@intel.com>
> Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
> ---
>  drivers/baseband/acc100/rte_acc100_pmd.c | 445 ++++++++++++++++++++++++++++++-
>  drivers/baseband/acc100/rte_acc100_pmd.h |  45 ++++
>  2 files changed, 488 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
> index fcba77e..f2bf2b5 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> @@ -26,6 +26,22 @@
>  RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
>  #endif
>  
> +/* Write to MMIO register address */
> +static inline void
> +mmio_write(void *addr, uint32_t value)
> +{
> +	*((volatile uint32_t *)(addr)) = rte_cpu_to_le_32(value);
> +}
> +
> +/* Write a register of a ACC100 device */
> +static inline void
> +acc100_reg_write(struct acc100_device *d, uint32_t offset, uint32_t payload)
> +{
> +	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
> +	mmio_write(reg_addr, payload);
> +	usleep(ACC100_LONG_WAIT);
> +}
> +
>  /* Read a register of a ACC100 device */
>  static inline uint32_t
>  acc100_reg_read(struct acc100_device *d, uint32_t offset)
> @@ -36,6 +52,22 @@
>  	return rte_le_to_cpu_32(ret);
>  }
>  
> +/* Basic Implementation of Log2 for exact 2^N */
> +static inline uint32_t
> +log2_basic(uint32_t value)
> +{
> +	return (value == 0) ? 0 : rte_bsf32(value);
> +}
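Since the helper is only defined for exact powers of two, the lowest-set-bit position rte_bsf32() returns is the log2; the same behaviour with a compiler builtin:

```c
#include <stdint.h>

/* log2 for exact 2^N: the index of the single set bit (0 maps to 0,
 * matching the guard in the patch).
 */
static inline uint32_t
demo_log2_basic(uint32_t value)
{
	return (value == 0) ? 0 : (uint32_t)__builtin_ctz(value);
}
```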
> +
> +/* Calculate memory alignment offset assuming alignment is 2^N */
> +static inline uint32_t
> +calc_mem_alignment_offset(void *unaligned_virt_mem, uint32_t alignment)
> +{
> +	rte_iova_t unaligned_phy_mem = rte_malloc_virt2iova(unaligned_virt_mem);
> +	return (uint32_t)(alignment -
> +			(unaligned_phy_mem & (alignment-1)));
> +}
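Stripped of the IOVA lookup, the arithmetic above is plain power-of-two rounding. One subtlety worth noting: it returns the full alignment, not 0, for an already-aligned address, which callers absorb by over-allocating one alignment unit:

```c
#include <stdint.h>

/* Bytes to advance from addr to the next 'alignment' boundary;
 * alignment must be a power of two. An already-aligned address
 * yields 'alignment' rather than 0, exactly like the patch.
 */
static uint32_t
demo_mem_alignment_offset(uint64_t addr, uint32_t alignment)
{
	return (uint32_t)(alignment - (addr & (alignment - 1)));
}
```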
> +
>  /* Calculate the offset of the enqueue register */
>  static inline uint32_t
>  queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
> @@ -208,10 +240,416 @@
>  			acc100_conf->q_dl_5g.aq_depth_log2);
>  }
>  
> -/* Free 64MB memory used for software rings */
> +static void
> +free_base_addresses(void **base_addrs, int size)
> +{
> +	int i;
> +	for (i = 0; i < size; i++)
> +		rte_free(base_addrs[i]);
> +}
> +
> +static inline uint32_t
> +get_desc_len(void)
> +{
> +	return sizeof(union acc100_dma_desc);
> +}
> +
> +/* Allocate the 2 * 64MB block for the sw rings */
>  static int
> -acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
> +alloc_2x64mb_sw_rings_mem(struct rte_bbdev *dev, struct acc100_device *d,
> +		int socket)
>  {
> +	uint32_t sw_ring_size = ACC100_SIZE_64MBYTE;
> +	d->sw_rings_base = rte_zmalloc_socket(dev->device->driver->name,
> +			2 * sw_ring_size, RTE_CACHE_LINE_SIZE, socket);
> +	if (d->sw_rings_base == NULL) {
> +		rte_bbdev_log(ERR, "Failed to allocate memory for %s:%u",
> +				dev->device->driver->name,
> +				dev->data->dev_id);
> +		return -ENOMEM;
> +	}
> +	uint32_t next_64mb_align_offset = calc_mem_alignment_offset(
> +			d->sw_rings_base, ACC100_SIZE_64MBYTE);
> +	d->sw_rings = RTE_PTR_ADD(d->sw_rings_base, next_64mb_align_offset);
> +	d->sw_rings_iova = rte_malloc_virt2iova(d->sw_rings_base) +
> +			next_64mb_align_offset;
> +	d->sw_ring_size = ACC100_MAX_QUEUE_DEPTH * get_desc_len();
> +	d->sw_ring_max_depth = ACC100_MAX_QUEUE_DEPTH;
> +
> +	return 0;
> +}
> +
> +/* Attempt to allocate minimised memory space for sw rings */
> +static void
> +alloc_sw_rings_min_mem(struct rte_bbdev *dev, struct acc100_device *d,
> +		uint16_t num_queues, int socket)
> +{
> +	rte_iova_t sw_rings_base_iova, next_64mb_align_addr_iova;
> +	uint32_t next_64mb_align_offset;
> +	rte_iova_t sw_ring_iova_end_addr;
> +	void *base_addrs[ACC100_SW_RING_MEM_ALLOC_ATTEMPTS];
> +	void *sw_rings_base;
> +	int i = 0;
> +	uint32_t q_sw_ring_size = ACC100_MAX_QUEUE_DEPTH * get_desc_len();
> +	uint32_t dev_sw_ring_size = q_sw_ring_size * num_queues;
> +
> +	/* Find an aligned block of memory to store sw rings */
> +	while (i < ACC100_SW_RING_MEM_ALLOC_ATTEMPTS) {
> +		/*
> +		 * sw_ring allocated memory is guaranteed to be aligned to
> +		 * q_sw_ring_size at the condition that the requested size is
> +		 * less than the page size
> +		 */
> +		sw_rings_base = rte_zmalloc_socket(
> +				dev->device->driver->name,
> +				dev_sw_ring_size, q_sw_ring_size, socket);
> +
> +		if (sw_rings_base == NULL) {
> +			rte_bbdev_log(ERR,
> +					"Failed to allocate memory for %s:%u",
> +					dev->device->driver->name,
> +					dev->data->dev_id);
> +			break;
> +		}
> +
> +		sw_rings_base_iova = rte_malloc_virt2iova(sw_rings_base);
> +		next_64mb_align_offset = calc_mem_alignment_offset(
> +				sw_rings_base, ACC100_SIZE_64MBYTE);
> +		next_64mb_align_addr_iova = sw_rings_base_iova +
> +				next_64mb_align_offset;
> +		sw_ring_iova_end_addr = sw_rings_base_iova + dev_sw_ring_size;
> +
> +		/* Check if the end of the sw ring memory block is before the
> +		 * start of next 64MB aligned mem address
> +		 */
> +		if (sw_ring_iova_end_addr < next_64mb_align_addr_iova) {
> +			d->sw_rings_iova = sw_rings_base_iova;
> +			d->sw_rings = sw_rings_base;
> +			d->sw_rings_base = sw_rings_base;
> +			d->sw_ring_size = q_sw_ring_size;
> +			d->sw_ring_max_depth = ACC100_MAX_QUEUE_DEPTH;
> +			break;
> +		}
> +		/* Store the address of the unaligned mem block */
> +		base_addrs[i] = sw_rings_base;
> +		i++;
> +	}
> +
> +	/* Free all unaligned blocks of mem allocated in the loop */
> +	free_base_addresses(base_addrs, i);

Was the alloc fallback changing here?

> +}
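The retry loop above keeps each rejected block allocated (so the next attempt lands at a different address) and accepts a block only when it ends before the next 64MB boundary. That acceptance test reduces to:

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_SIZE_64MB (64UL * 1024 * 1024)

/* A block is usable without full 64MB alignment only if it does not
 * cross the next 64MB boundary after its base IOVA.
 */
static bool
demo_block_fits_before_boundary(uint64_t base_iova, uint32_t total_size)
{
	uint64_t next_boundary = base_iova + (DEMO_SIZE_64MB -
			(base_iova & (DEMO_SIZE_64MB - 1)));
	return base_iova + total_size < next_boundary;
}
```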
> +
> +
> +/* Allocate 64MB memory used for all software rings */
> +static int
> +acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
> +{
> +	uint32_t phys_low, phys_high, payload;
> +	struct acc100_device *d = dev->data->dev_private;
> +	const struct acc100_registry_addr *reg_addr;
> +
> +	if (d->pf_device && !d->acc100_conf.pf_mode_en) {
> +		rte_bbdev_log(NOTICE,
> +				"%s has PF mode disabled. This PF can't be used.",
> +				dev->data->name);
> +		return -ENODEV;
> +	}
> +
> +	alloc_sw_rings_min_mem(dev, d, num_queues, socket_id);
> +
> +	/* If minimal memory space approach failed, then allocate
> +	 * the 2 * 64MB block for the sw rings
> +	 */
> +	if (d->sw_rings == NULL)
> +		alloc_2x64mb_sw_rings_mem(dev, d, socket_id);

Since this also did not change, I guess my review came too late for v11.

I'll stop here.

v11 looks good so far.

Tom

> +
> +	if (d->sw_rings == NULL) {
> +		rte_bbdev_log(NOTICE,
> +				"Failure allocating sw_rings memory");
> +		return -ENODEV;
> +	}
> +
> +	/* Configure ACC100 with the base address for DMA descriptor rings
> +	 * Same descriptor rings used for UL and DL DMA Engines
> +	 * Note : Assuming only VF0 bundle is used for PF mode
> +	 */
> +	phys_high = (uint32_t)(d->sw_rings_iova >> 32);
> +	phys_low  = (uint32_t)(d->sw_rings_iova & ~(ACC100_SIZE_64MBYTE-1));
> +
> +	/* Choose correct registry addresses for the device type */
> +	if (d->pf_device)
> +		reg_addr = &pf_reg_addr;
> +	else
> +		reg_addr = &vf_reg_addr;
> +
> +	/* Read the populated cfg from ACC100 registers */
> +	fetch_acc100_config(dev);
> +
> +	/* Release AXI from PF */
> +	if (d->pf_device)
> +		acc100_reg_write(d, HWPfDmaAxiControl, 1);
> +
> +	acc100_reg_write(d, reg_addr->dma_ring_ul5g_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->dma_ring_ul5g_lo, phys_low);
> +	acc100_reg_write(d, reg_addr->dma_ring_dl5g_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->dma_ring_dl5g_lo, phys_low);
> +	acc100_reg_write(d, reg_addr->dma_ring_ul4g_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->dma_ring_ul4g_lo, phys_low);
> +	acc100_reg_write(d, reg_addr->dma_ring_dl4g_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->dma_ring_dl4g_lo, phys_low);
> +
> +	/*
> +	 * Configure Ring Size to the max queue ring size
> +	 * (used for wrapping purpose)
> +	 */
> +	payload = log2_basic(d->sw_ring_size / 64);
> +	acc100_reg_write(d, reg_addr->ring_size, payload);
> +
> +	/* Configure tail pointer for use when SDONE enabled */
> +	d->tail_ptrs = rte_zmalloc_socket(
> +			dev->device->driver->name,
> +			ACC100_NUM_QGRPS * ACC100_NUM_AQS * sizeof(uint32_t),
> +			RTE_CACHE_LINE_SIZE, socket_id);
> +	if (d->tail_ptrs == NULL) {
> +		rte_bbdev_log(ERR, "Failed to allocate tail ptr for %s:%u",
> +				dev->device->driver->name,
> +				dev->data->dev_id);
> +		rte_free(d->sw_rings);
> +		return -ENOMEM;
> +	}
> +	d->tail_ptr_iova = rte_malloc_virt2iova(d->tail_ptrs);
> +
> +	phys_high = (uint32_t)(d->tail_ptr_iova >> 32);
> +	phys_low  = (uint32_t)(d->tail_ptr_iova);
> +	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_lo, phys_low);
> +	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_lo, phys_low);
> +	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_lo, phys_low);
> +	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
> +	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
> +
> +	d->harq_layout = rte_zmalloc_socket("HARQ Layout",
> +			ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
> +			RTE_CACHE_LINE_SIZE, dev->data->socket_id);
> +	if (d->harq_layout == NULL) {
> +		rte_bbdev_log(ERR, "Failed to allocate harq_layout for %s:%u",
> +				dev->device->driver->name,
> +				dev->data->dev_id);
> +		rte_free(d->sw_rings);
> +		return -ENOMEM;
> +	}
> +
> +	/* Mark as configured properly */
> +	d->configured = true;
> +
> +	rte_bbdev_log_debug(
> +			"ACC100 (%s) configured  sw_rings = %p, sw_rings_iova = %#"
> +			PRIx64, dev->data->name, d->sw_rings, d->sw_rings_iova);
> +
> +	return 0;
> +}
> +
> +/* Free memory used for software rings */
> +static int
> +acc100_dev_close(struct rte_bbdev *dev)
> +{
> +	struct acc100_device *d = dev->data->dev_private;
> +	if (d->sw_rings_base != NULL) {
> +		rte_free(d->tail_ptrs);
> +		rte_free(d->sw_rings_base);
> +		d->sw_rings_base = NULL;
> +	}
> +	/* Ensure all in flight HW transactions are completed */
> +	usleep(ACC100_LONG_WAIT);
> +	return 0;
> +}
> +
> +
> +/**
> + * Report a ACC100 queue index which is free
> + * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
> + * Note : Only supporting VF0 Bundle for PF mode
> + */
> +static int
> +acc100_find_free_queue_idx(struct rte_bbdev *dev,
> +		const struct rte_bbdev_queue_conf *conf)
> +{
> +	struct acc100_device *d = dev->data->dev_private;
> +	int op_2_acc[5] = {0, UL_4G, DL_4G, UL_5G, DL_5G};
> +	int acc = op_2_acc[conf->op_type];
> +	struct rte_acc100_queue_topology *qtop = NULL;
> +
> +	qtopFromAcc(&qtop, acc, &(d->acc100_conf));
> +	if (qtop == NULL)
> +		return -1;
> +	/* Identify matching QGroup Index which are sorted in priority order */
> +	uint16_t group_idx = qtop->first_qgroup_index;
> +	group_idx += conf->priority;
> +	if (group_idx >= ACC100_NUM_QGRPS ||
> +			conf->priority >= qtop->num_qgroups) {
> +		rte_bbdev_log(INFO, "Invalid Priority on %s, priority %u",
> +				dev->data->name, conf->priority);
> +		return -1;
> +	}
> +	/* Find a free AQ_idx  */
> +	uint16_t aq_idx;
> +	for (aq_idx = 0; aq_idx < qtop->num_aqs_per_groups; aq_idx++) {
> +		if (((d->q_assigned_bit_map[group_idx] >> aq_idx) & 0x1) == 0) {
> +			/* Mark the Queue as assigned */
> +			d->q_assigned_bit_map[group_idx] |= (1 << aq_idx);
> +			/* Report the AQ Index */
> +			return (group_idx << ACC100_GRP_ID_SHIFT) + aq_idx;
> +		}
> +	}
> +	rte_bbdev_log(INFO, "Failed to find free queue on %s, priority %u",
> +			dev->data->name, conf->priority);
> +	return -1;
> +}
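The bitmap claim above (lowest clear bit in the group's word, then pack the group id above the AQ id) can be modelled stand-alone; the shift value here is an assumption, the real one comes from ACC100_GRP_ID_SHIFT:

```c
#include <stdint.h>

#define DEMO_GRP_ID_SHIFT 10  /* assumed; the PMD uses ACC100_GRP_ID_SHIFT */

/* Find the lowest clear bit in the group's AQ bitmap, mark it used,
 * and return (group << shift) + aq, or -1 when the group is full.
 */
static int
demo_claim_free_aq(uint16_t *bitmap, uint16_t group_idx, uint16_t num_aqs)
{
	uint16_t aq;
	for (aq = 0; aq < num_aqs; aq++) {
		if (((bitmap[group_idx] >> aq) & 0x1) == 0) {
			bitmap[group_idx] |= (uint16_t)(1 << aq);
			return (group_idx << DEMO_GRP_ID_SHIFT) + aq;
		}
	}
	return -1;
}
```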
> +
> +/* Setup ACC100 queue */
> +static int
> +acc100_queue_setup(struct rte_bbdev *dev, uint16_t queue_id,
> +		const struct rte_bbdev_queue_conf *conf)
> +{
> +	struct acc100_device *d = dev->data->dev_private;
> +	struct acc100_queue *q;
> +	int16_t q_idx;
> +
> +	/* Allocate the queue data structure. */
> +	q = rte_zmalloc_socket(dev->device->driver->name, sizeof(*q),
> +			RTE_CACHE_LINE_SIZE, conf->socket);
> +	if (q == NULL) {
> +		rte_bbdev_log(ERR, "Failed to allocate queue memory");
> +		return -ENOMEM;
> +	}
> +	if (d == NULL) {
> +		rte_bbdev_log(ERR, "Undefined device");
> +		return -ENODEV;
> +	}
> +
> +	q->d = d;
> +	q->ring_addr = RTE_PTR_ADD(d->sw_rings, (d->sw_ring_size * queue_id));
> +	q->ring_addr_iova = d->sw_rings_iova + (d->sw_ring_size * queue_id);
> +
> +	/* Prepare the Ring with default descriptor format */
> +	union acc100_dma_desc *desc = NULL;
> +	unsigned int desc_idx, b_idx;
> +	int fcw_len = (conf->op_type == RTE_BBDEV_OP_LDPC_ENC ?
> +		ACC100_FCW_LE_BLEN : (conf->op_type == RTE_BBDEV_OP_TURBO_DEC ?
> +		ACC100_FCW_TD_BLEN : ACC100_FCW_LD_BLEN));
> +
> +	for (desc_idx = 0; desc_idx < d->sw_ring_max_depth; desc_idx++) {
> +		desc = q->ring_addr + desc_idx;
> +		desc->req.word0 = ACC100_DMA_DESC_TYPE;
> +		desc->req.word1 = 0; /**< Timestamp */
> +		desc->req.word2 = 0;
> +		desc->req.word3 = 0;
> +		uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
> +		desc->req.data_ptrs[0].address = q->ring_addr_iova + fcw_offset;
> +		desc->req.data_ptrs[0].blen = fcw_len;
> +		desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
> +		desc->req.data_ptrs[0].last = 0;
> +		desc->req.data_ptrs[0].dma_ext = 0;
> +		for (b_idx = 1; b_idx < ACC100_DMA_MAX_NUM_POINTERS - 1;
> +				b_idx++) {
> +			desc->req.data_ptrs[b_idx].blkid = ACC100_DMA_BLKID_IN;
> +			desc->req.data_ptrs[b_idx].last = 1;
> +			desc->req.data_ptrs[b_idx].dma_ext = 0;
> +			b_idx++;
> +			desc->req.data_ptrs[b_idx].blkid =
> +					ACC100_DMA_BLKID_OUT_ENC;
> +			desc->req.data_ptrs[b_idx].last = 1;
> +			desc->req.data_ptrs[b_idx].dma_ext = 0;
> +		}
> +		/* Preset some fields of LDPC FCW */
> +		desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
> +		desc->req.fcw_ld.gain_i = 1;
> +		desc->req.fcw_ld.gain_h = 1;
> +	}
> +
> +	q->lb_in = rte_zmalloc_socket(dev->device->driver->name,
> +			RTE_CACHE_LINE_SIZE,
> +			RTE_CACHE_LINE_SIZE, conf->socket);
> +	if (q->lb_in == NULL) {
> +		rte_bbdev_log(ERR, "Failed to allocate lb_in memory");
> +		rte_free(q);
> +		return -ENOMEM;
> +	}
> +	q->lb_in_addr_iova = rte_malloc_virt2iova(q->lb_in);
> +	q->lb_out = rte_zmalloc_socket(dev->device->driver->name,
> +			RTE_CACHE_LINE_SIZE,
> +			RTE_CACHE_LINE_SIZE, conf->socket);
> +	if (q->lb_out == NULL) {
> +		rte_bbdev_log(ERR, "Failed to allocate lb_out memory");
> +		rte_free(q->lb_in);
> +		rte_free(q);
> +		return -ENOMEM;
> +	}
> +	q->lb_out_addr_iova = rte_malloc_virt2iova(q->lb_out);
> +
> +	/*
> +	 * Software queue ring wraps synchronously with the HW when it reaches
> +	 * the boundary of the maximum allocated queue size, no matter what the
> +	 * sw queue size is. This wrapping is guarded by setting the wrap_mask
> +	 * to represent the maximum queue size as allocated at the time when
> +	 * the device has been setup (in configure()).
> +	 *
> +	 * The queue depth is set to the queue size value (conf->queue_size).
> +	 * This limits the occupancy of the queue at any point of time, so that
> +	 * the queue does not get swamped with enqueue requests.
> +	 */
> +	q->sw_ring_depth = conf->queue_size;
> +	q->sw_ring_wrap_mask = d->sw_ring_max_depth - 1;
> +
> +	q->op_type = conf->op_type;
> +
> +	q_idx = acc100_find_free_queue_idx(dev, conf);
> +	if (q_idx == -1) {
> +		rte_free(q->lb_in);
> +		rte_free(q->lb_out);
> +		rte_free(q);
> +		return -1;
> +	}
> +
> +	q->qgrp_id = (q_idx >> ACC100_GRP_ID_SHIFT) & 0xF;
> +	q->vf_id = (q_idx >> ACC100_VF_ID_SHIFT)  & 0x3F;
> +	q->aq_id = q_idx & 0xF;
> +	q->aq_depth = (conf->op_type ==  RTE_BBDEV_OP_TURBO_DEC) ?
> +			(1 << d->acc100_conf.q_ul_4g.aq_depth_log2) :
> +			(1 << d->acc100_conf.q_dl_4g.aq_depth_log2);
> +
> +	q->mmio_reg_enqueue = RTE_PTR_ADD(d->mmio_base,
> +			queue_offset(d->pf_device,
> +					q->vf_id, q->qgrp_id, q->aq_id));
> +
> +	rte_bbdev_log_debug(
> +			"Setup dev%u q%u: qgrp_id=%u, vf_id=%u, aq_id=%u, aq_depth=%u, mmio_reg_enqueue=%p",
> +			dev->data->dev_id, queue_id, q->qgrp_id, q->vf_id,
> +			q->aq_id, q->aq_depth, q->mmio_reg_enqueue);
> +
> +	dev->data->queues[queue_id].queue_private = q;
> +	return 0;
> +}
> +
> +/* Release ACC100 queue */
> +static int
> +acc100_queue_release(struct rte_bbdev *dev, uint16_t q_id)
> +{
> +	struct acc100_device *d = dev->data->dev_private;
> +	struct acc100_queue *q = dev->data->queues[q_id].queue_private;
> +
> +	if (q != NULL) {
> +		/* Mark the Queue as un-assigned */
> +		d->q_assigned_bit_map[q->qgrp_id] &= (0xFFFFFFFF -
> +				(1 << q->aq_id));
> +		rte_free(q->lb_in);
> +		rte_free(q->lb_out);
> +		rte_free(q);
> +		dev->data->queues[q_id].queue_private = NULL;
> +	}
> +
>  	return 0;
>  }
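The release path clears the assignment bit with `0xFFFFFFFF - (1 << aq_id)`; since exactly one bit is set in the subtrahend, that is the same mask as the more conventional `~(1 << aq_id)`:

```c
#include <stdint.h>

/* Clear one AQ's bit in the group bitmap; identical to the patch's
 * subtraction form because (1 << aq_id) has a single bit set.
 */
static uint16_t
demo_release_aq(uint16_t bitmap, uint16_t aq_id)
{
	return bitmap & (uint16_t)~(1u << aq_id);
}
```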
>  
> @@ -262,8 +700,11 @@
>  }
>  
>  static const struct rte_bbdev_ops acc100_bbdev_ops = {
> +	.setup_queues = acc100_setup_queues,
>  	.close = acc100_dev_close,
>  	.info_get = acc100_dev_info_get,
> +	.queue_setup = acc100_queue_setup,
> +	.queue_release = acc100_queue_release,
>  };
>  
>  /* ACC100 PCI PF address map */
> diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
> index 09965c8..5c8dde3 100644
> --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> @@ -522,11 +522,56 @@ struct acc100_registry_addr {
>  	.ddr_range = HWVfDmaDdrBaseRangeRoVf,
>  };
>  
> +/* Structure associated with each queue. */
> +struct __rte_cache_aligned acc100_queue {
> +	union acc100_dma_desc *ring_addr;  /* Virtual address of sw ring */
> +	rte_iova_t ring_addr_iova;  /* IOVA address of software ring */
> +	uint32_t sw_ring_head;  /* software ring head */
> +	uint32_t sw_ring_tail;  /* software ring tail */
> +	/* software ring size (descriptors, not bytes) */
> +	uint32_t sw_ring_depth;
> +	/* mask used to wrap enqueued descriptors on the sw ring */
> +	uint32_t sw_ring_wrap_mask;
> +	/* MMIO register used to enqueue descriptors */
> +	void *mmio_reg_enqueue;
> +	uint8_t vf_id;  /* VF ID (max = 63) */
> +	uint8_t qgrp_id;  /* Queue Group ID */
> +	uint16_t aq_id;  /* Atomic Queue ID */
> +	uint16_t aq_depth;  /* Depth of atomic queue */
> +	uint32_t aq_enqueued;  /* Count how many "batches" have been enqueued */
> +	uint32_t aq_dequeued;  /* Count how many "batches" have been dequeued */
> +	uint32_t irq_enable;  /* Enable ops dequeue interrupts if set to 1 */
> +	struct rte_mempool *fcw_mempool;  /* FCW mempool */
> +	enum rte_bbdev_op_type op_type;  /* Type of this Queue: TE or TD */
> +	/* Internal Buffers for loopback input */
> +	uint8_t *lb_in;
> +	uint8_t *lb_out;
> +	rte_iova_t lb_in_addr_iova;
> +	rte_iova_t lb_out_addr_iova;
> +	struct acc100_device *d;
> +};
> +
>  /* Private data structure for each ACC100 device */
>  struct acc100_device {
>  	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
> +	void *sw_rings_base;  /* Base addr of un-aligned memory for sw rings */
> +	void *sw_rings;  /* 64MBs of 64MB aligned memory for sw rings */
> +	rte_iova_t sw_rings_iova;  /* IOVA address of sw_rings */
> +	/* Virtual address of the info memory routed to the this function under
> +	 * operation, whether it is PF or VF.
> +	 */
> +	union acc100_harq_layout_data *harq_layout;
> +	uint32_t sw_ring_size;
>  	uint32_t ddr_size; /* Size in kB */
> +	uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */
> +	rte_iova_t tail_ptr_iova; /* IOVA address of tail pointers */
> +	/* Max number of entries available for each queue in device, depending
> +	 * on how many queues are enabled with configure()
> +	 */
> +	uint32_t sw_ring_max_depth;
>  	struct rte_acc100_conf acc100_conf; /* ACC100 Initial configuration */
> +	/* Bitmap capturing which Queues have already been assigned */
> +	uint16_t q_assigned_bit_map[ACC100_NUM_QGRPS];
>  	bool pf_device; /**< True if this is a PF ACC100 device */
>  	bool configured; /**< True if this ACC100 device is configured */
>  };



* Re: [dpdk-dev] [PATCH v11 03/10] baseband/acc100: add info get function
  2020-10-04 16:09       ` Tom Rix
@ 2020-10-05 16:38         ` Chautru, Nicolas
  2020-10-05 22:19           ` Chautru, Nicolas
  0 siblings, 1 reply; 213+ messages in thread
From: Chautru, Nicolas @ 2020-10-05 16:38 UTC (permalink / raw)
  To: Tom Rix, dev, akhil.goyal
  Cc: Richardson, Bruce, Xu, Rosen, maxime.coquelin, Yigit, Ferruh,
	Liu,  Tianjiao

Hi Tom

> From: Tom Rix <trix@redhat.com>
> 
> On 10/1/20 6:01 PM, Nicolas Chautru wrote:
> > Add in the "info_get" function to the driver, to allow us to query the
> > device.
> > No processing capabilities are available yet.
> > Linking bbdev-test to support the PMD with null capability.
> >
> > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
> > ---
> >  app/test-bbdev/meson.build               |   3 +
> >  drivers/baseband/acc100/rte_acc100_cfg.h |  96 +++++++++++++
> > drivers/baseband/acc100/rte_acc100_pmd.c | 229
> > +++++++++++++++++++++++++++++++
> > drivers/baseband/acc100/rte_acc100_pmd.h |  10 ++
> >  4 files changed, 338 insertions(+)
> >  create mode 100644 drivers/baseband/acc100/rte_acc100_cfg.h
> >
> > diff --git a/app/test-bbdev/meson.build b/app/test-bbdev/meson.build
> > index 18ab6a8..fbd8ae3 100644
> > --- a/app/test-bbdev/meson.build
> > +++ b/app/test-bbdev/meson.build
> > @@ -12,3 +12,6 @@ endif
> >  if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC')
> >  	deps += ['pmd_bbdev_fpga_5gnr_fec']
> >  endif
> > +if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_ACC100')
> > +	deps += ['pmd_bbdev_acc100']
> > +endif
> > \ No newline at end of file
> > diff --git a/drivers/baseband/acc100/rte_acc100_cfg.h
> > b/drivers/baseband/acc100/rte_acc100_cfg.h
> > new file mode 100644
> > index 0000000..a1d43ef
> > --- /dev/null
> > +++ b/drivers/baseband/acc100/rte_acc100_cfg.h
> > @@ -0,0 +1,96 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2020 Intel Corporation  */
> > +
> > +#ifndef _RTE_ACC100_CFG_H_
> > +#define _RTE_ACC100_CFG_H_
> > +
> > +/**
> > + * @file rte_acc100_cfg.h
> > + *
> > + * Functions for configuring ACC100 HW, exposed directly to applications.
> > + * Configuration related to encoding/decoding is done through the
> > + * librte_bbdev library.
> > + *
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice  */
> > +
> > +#include <stdint.h>
> > +#include <stdbool.h>
> > +
> > +#ifdef __cplusplus
> > +extern "C" {
> > +#endif
> > +/**< Number of Virtual Functions ACC100 supports */
> > +#define RTE_ACC100_NUM_VFS 16
> 
> I was expecting the definition of RTE_ACC100_NUM_VFS to be removed.
> 
> And its uses replaced with ACC100_NUM_VFS.
> 
> or
> 
> #define RTE_ACC100_NUM_VFS ACC100_NUM_VFS
> 

Yes, that was actually on purpose, to keep that piece of code portable outside of DPDK if required.
One define relates to the generic PMD function; the other one is used by the configuration function only.
If you feel strongly about this I can change it.

> > +
> > +/**
> > + * Definition of Queue Topology for ACC100 Configuration
> > + * Some level of details is abstracted out to expose a clean interface
> > + * given that comprehensive flexibility is not required
> > + */
> > +struct rte_acc100_queue_topology {
> > +	/** Number of QGroups in incremental order of priority */
> > +	uint16_t num_qgroups;
> > +	/**
> > +	 * All QGroups have the same number of AQs here.
> > +	 * Note : Could be made a 16-array if more flexibility is really
> > +	 * required
> > +	 */
> > +	uint16_t num_aqs_per_groups;
> > +	/**
> > +	 * Depth of the AQs is the same of all QGroups here. Log2 Enum : 2^N
> > +	 * Note : Could be made a 16-array if more flexibility is really
> > +	 * required
> > +	 */
> > +	uint16_t aq_depth_log2;
> > +	/**
> > +	 * Index of the first Queue Group Index - assuming contiguity
> > +	 * Initialized as -1
> > +	 */
> > +	int8_t first_qgroup_index;
> > +};
> > +
> > +/**
> > + * Definition of Arbitration related parameters for ACC100 Configuration
> > + */
> > +struct rte_acc100_arbitration {
> > +	/** Default Weight for VF Fairness Arbitration */
> > +	uint16_t round_robin_weight;
> > +	uint32_t gbr_threshold1; /**< Guaranteed Bitrate Threshold 1 */
> > +	uint32_t gbr_threshold2; /**< Guaranteed Bitrate Threshold 2 */ };
> > +
> > +/**
> > + * Structure to pass ACC100 configuration.
> > + * Note: all VF Bundles will have the same configuration.
> > + */
> > +struct rte_acc100_conf {
> > +	bool pf_mode_en; /**< 1 if PF is used for dataplane, 0 for VFs */
> > +	/** 1 if input '1' bit is represented by a positive LLR value, 0 if '1'
> > +	 * bit is represented by a negative value.
> > +	 */
> > +	bool input_pos_llr_1_bit;
> > +	/** 1 if output '1' bit is represented by a positive value, 0 if '1'
> > +	 * bit is represented by a negative value.
> > +	 */
> > +	bool output_pos_llr_1_bit;
> > +	uint16_t num_vf_bundles; /**< Number of VF bundles to setup */
> > +	/** Queue topology for each operation type */
> > +	struct rte_acc100_queue_topology q_ul_4g;
> > +	struct rte_acc100_queue_topology q_dl_4g;
> > +	struct rte_acc100_queue_topology q_ul_5g;
> > +	struct rte_acc100_queue_topology q_dl_5g;
> > +	/** Arbitration configuration for each operation type */
> > +	struct rte_acc100_arbitration arb_ul_4g[RTE_ACC100_NUM_VFS];
> > +	struct rte_acc100_arbitration arb_dl_4g[RTE_ACC100_NUM_VFS];
> > +	struct rte_acc100_arbitration arb_ul_5g[RTE_ACC100_NUM_VFS];
> > +	struct rte_acc100_arbitration arb_dl_5g[RTE_ACC100_NUM_VFS]; };
> > +
> > +#ifdef __cplusplus
> > +}
> > +#endif
> > +
> > +#endif /* _RTE_ACC100_CFG_H_ */
> > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> > b/drivers/baseband/acc100/rte_acc100_pmd.c
> > index 1b4cd13..fcba77e 100644
> > --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> > +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> > @@ -26,6 +26,188 @@
> >  RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);  #endif
> >
> > +/* Read a register of a ACC100 device */ static inline uint32_t
> > +acc100_reg_read(struct acc100_device *d, uint32_t offset) {
> > +
> > +	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
> > +	uint32_t ret = *((volatile uint32_t *)(reg_addr));
> > +	return rte_le_to_cpu_32(ret);
> > +}
> > +
> > +/* Calculate the offset of the enqueue register */ static inline
> > +uint32_t queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id,
> > +uint16_t aq_id) {
> > +	if (pf_device)
> > +		return ((vf_id << 12) + (qgrp_id << 7) + (aq_id << 3) +
> > +				HWPfQmgrIngressAq);
> > +	else
> > +		return ((qgrp_id << 7) + (aq_id << 3) +
> > +				HWVfQmgrIngressAq);
> > +}
> > +
> > +enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC};
> > +
> > +/* Return the queue topology for a Queue Group Index */ static inline
> > +void qtopFromAcc(struct rte_acc100_queue_topology **qtop, int
> > +acc_enum,
> > +		struct rte_acc100_conf *acc100_conf) {
> > +	struct rte_acc100_queue_topology *p_qtop;
> > +	p_qtop = NULL;
> > +	switch (acc_enum) {
> > +	case UL_4G:
> > +		p_qtop = &(acc100_conf->q_ul_4g);
> > +		break;
> > +	case UL_5G:
> > +		p_qtop = &(acc100_conf->q_ul_5g);
> > +		break;
> > +	case DL_4G:
> > +		p_qtop = &(acc100_conf->q_dl_4g);
> > +		break;
> > +	case DL_5G:
> > +		p_qtop = &(acc100_conf->q_dl_5g);
> > +		break;
> > +	default:
> > +		/* NOTREACHED */
> > +		rte_bbdev_log(ERR, "Unexpected error evaluating qtopFromAcc");
> > +		break;
> > +	}
> > +	*qtop = p_qtop;
> > +}
> > +
> > +static void
> > +initQTop(struct rte_acc100_conf *acc100_conf) {
> > +	acc100_conf->q_ul_4g.num_aqs_per_groups = 0;
> > +	acc100_conf->q_ul_4g.num_qgroups = 0;
> > +	acc100_conf->q_ul_4g.first_qgroup_index = -1;
> > +	acc100_conf->q_ul_5g.num_aqs_per_groups = 0;
> > +	acc100_conf->q_ul_5g.num_qgroups = 0;
> > +	acc100_conf->q_ul_5g.first_qgroup_index = -1;
> > +	acc100_conf->q_dl_4g.num_aqs_per_groups = 0;
> > +	acc100_conf->q_dl_4g.num_qgroups = 0;
> > +	acc100_conf->q_dl_4g.first_qgroup_index = -1;
> > +	acc100_conf->q_dl_5g.num_aqs_per_groups = 0;
> > +	acc100_conf->q_dl_5g.num_qgroups = 0;
> > +	acc100_conf->q_dl_5g.first_qgroup_index = -1; }
> > +
> > +static inline void
> > +updateQtop(uint8_t acc, uint8_t qg, struct rte_acc100_conf
> *acc100_conf,
> > +		struct acc100_device *d) {
> > +	uint32_t reg;
> > +	struct rte_acc100_queue_topology *q_top = NULL;
> > +	qtopFromAcc(&q_top, acc, acc100_conf);
> > +	if (unlikely(q_top == NULL))
> > +		return;
> > +	uint16_t aq;
> > +	q_top->num_qgroups++;
> > +	if (q_top->first_qgroup_index == -1) {
> > +		q_top->first_qgroup_index = qg;
> > +		/* Can be optimized to assume all are enabled by default */
> > +		reg = acc100_reg_read(d, queue_offset(d->pf_device,
> > +				0, qg, ACC100_NUM_AQS - 1));
> > +		if (reg & ACC100_QUEUE_ENABLE) {
> > +			q_top->num_aqs_per_groups = ACC100_NUM_AQS;
> > +			return;
> > +		}
> > +		q_top->num_aqs_per_groups = 0;
> > +		for (aq = 0; aq < ACC100_NUM_AQS; aq++) {
> > +			reg = acc100_reg_read(d, queue_offset(d->pf_device,
> > +					0, qg, aq));
> > +			if (reg & ACC100_QUEUE_ENABLE)
> > +				q_top->num_aqs_per_groups++;
> > +		}
> > +	}
> > +}
> > +
> > +/* Fetch configuration enabled for the PF/VF using MMIO Read (slow)
> > +*/ static inline void fetch_acc100_config(struct rte_bbdev *dev) {
> > +	struct acc100_device *d = dev->data->dev_private;
> > +	struct rte_acc100_conf *acc100_conf = &d->acc100_conf;
> > +	const struct acc100_registry_addr *reg_addr;
> > +	uint8_t acc, qg;
> > +	uint32_t reg, reg_aq, reg_len0, reg_len1;
> > +	uint32_t reg_mode;
> > +
> > +	/* No need to retrieve the configuration if it is already done */
> > +	if (d->configured)
> > +		return;
> > +
> > +	/* Choose correct registry addresses for the device type */
> > +	if (d->pf_device)
> > +		reg_addr = &pf_reg_addr;
> > +	else
> > +		reg_addr = &vf_reg_addr;
> > +
> > +	d->ddr_size = (1 + acc100_reg_read(d, reg_addr->ddr_range)) << 10;
> > +
> > +	/* Single VF Bundle by VF */
> > +	acc100_conf->num_vf_bundles = 1;
> > +	initQTop(acc100_conf);
> > +
> > +	struct rte_acc100_queue_topology *q_top = NULL;
> > +	int qman_func_id[ACC100_NUM_ACCS] = {ACC100_ACCMAP_0, ACC100_ACCMAP_1,
> > +			ACC100_ACCMAP_2, ACC100_ACCMAP_3, ACC100_ACCMAP_4};
> > +	reg = acc100_reg_read(d, reg_addr->qman_group_func);
> > +	for (qg = 0; qg < ACC100_NUM_QGRPS_PER_WORD; qg++) {
> > +		reg_aq = acc100_reg_read(d,
> > +				queue_offset(d->pf_device, 0, qg, 0));
> > +		if (reg_aq & ACC100_QUEUE_ENABLE) {
> > +			uint32_t idx = (reg >> (qg * 4)) & 0x7;
> > +			if (idx >= ACC100_NUM_ACCS)
> > +				break;
> 
> a 'continue' would be better
> 
> or reverse the check
> 
> if (idx < ACC100_NUM_ACCS) {
> 
>     acc = qman_func_id ..
> 
> }

I can change that.
Please also confirm your view on the previous point; I would like to get this final by tomorrow at the latest.

Thanks
Nic


> 
> Tom
> 
> > +			acc = qman_func_id[idx];
> > +			updateQtop(acc, qg, acc100_conf, d);
> > +		}
> > +	}
> > +
> > +	/* Check the depth of the AQs*/
> > +	reg_len0 = acc100_reg_read(d, reg_addr->depth_log0_offset);
> > +	reg_len1 = acc100_reg_read(d, reg_addr->depth_log1_offset);
> > +	for (acc = 0; acc < NUM_ACC; acc++) {
> > +		qtopFromAcc(&q_top, acc, acc100_conf);
> > +		if (q_top->first_qgroup_index < ACC100_NUM_QGRPS_PER_WORD)
> > +			q_top->aq_depth_log2 = (reg_len0 >>
> > +					(q_top->first_qgroup_index * 4))
> > +					& 0xF;
> > +		else
> > +			q_top->aq_depth_log2 = (reg_len1 >>
> > +					((q_top->first_qgroup_index -
> > +					ACC100_NUM_QGRPS_PER_WORD) * 4))
> > +					& 0xF;
> > +	}
> > +
> > +	/* Read PF mode */
> > +	if (d->pf_device) {
> > +		reg_mode = acc100_reg_read(d, HWPfHiPfMode);
> > +		acc100_conf->pf_mode_en = (reg_mode == ACC100_PF_VAL) ? 1 : 0;
> > +	}
> > +
> > +	rte_bbdev_log_debug(
> > +			"%s Config LLR SIGN IN/OUT %s %s QG %u %u %u %u AQ %u %u %u %u Len %u %u %u %u\n",
> > +			(d->pf_device) ? "PF" : "VF",
> > +			(acc100_conf->input_pos_llr_1_bit) ? "POS" : "NEG",
> > +			(acc100_conf->output_pos_llr_1_bit) ? "POS" : "NEG",
> > +			acc100_conf->q_ul_4g.num_qgroups,
> > +			acc100_conf->q_dl_4g.num_qgroups,
> > +			acc100_conf->q_ul_5g.num_qgroups,
> > +			acc100_conf->q_dl_5g.num_qgroups,
> > +			acc100_conf->q_ul_4g.num_aqs_per_groups,
> > +			acc100_conf->q_dl_4g.num_aqs_per_groups,
> > +			acc100_conf->q_ul_5g.num_aqs_per_groups,
> > +			acc100_conf->q_dl_5g.num_aqs_per_groups,
> > +			acc100_conf->q_ul_4g.aq_depth_log2,
> > +			acc100_conf->q_dl_4g.aq_depth_log2,
> > +			acc100_conf->q_ul_5g.aq_depth_log2,
> > +			acc100_conf->q_dl_5g.aq_depth_log2);
> > +}
> > +
> >  /* Free 64MB memory used for software rings */  static int
> > acc100_dev_close(struct rte_bbdev *dev  __rte_unused) @@ -33,8
> +215,55
> > @@
> >  	return 0;
> >  }
> >
> > +/* Get ACC100 device info */
> > +static void
> > +acc100_dev_info_get(struct rte_bbdev *dev,
> > +		struct rte_bbdev_driver_info *dev_info) {
> > +	struct acc100_device *d = dev->data->dev_private;
> > +
> > +	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
> > +		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
> > +	};
> > +
> > +	static struct rte_bbdev_queue_conf default_queue_conf;
> > +	default_queue_conf.socket = dev->data->socket_id;
> > +	default_queue_conf.queue_size = ACC100_MAX_QUEUE_DEPTH;
> > +
> > +	dev_info->driver_name = dev->device->driver->name;
> > +
> > +	/* Read and save the populated config from ACC100 registers */
> > +	fetch_acc100_config(dev);
> > +
> > +	/* This isn't ideal because it reports the maximum number of queues but
> > +	 * does not provide info on how many can be uplink/downlink or different
> > +	 * priorities
> > +	 */
> > +	dev_info->max_num_queues =
> > +			d->acc100_conf.q_dl_5g.num_aqs_per_groups *
> > +			d->acc100_conf.q_dl_5g.num_qgroups +
> > +			d->acc100_conf.q_ul_5g.num_aqs_per_groups *
> > +			d->acc100_conf.q_ul_5g.num_qgroups +
> > +			d->acc100_conf.q_dl_4g.num_aqs_per_groups *
> > +			d->acc100_conf.q_dl_4g.num_qgroups +
> > +			d->acc100_conf.q_ul_4g.num_aqs_per_groups *
> > +			d->acc100_conf.q_ul_4g.num_qgroups;
> > +	dev_info->queue_size_lim = ACC100_MAX_QUEUE_DEPTH;
> > +	dev_info->hardware_accelerated = true;
> > +	dev_info->max_dl_queue_priority =
> > +			d->acc100_conf.q_dl_4g.num_qgroups - 1;
> > +	dev_info->max_ul_queue_priority =
> > +			d->acc100_conf.q_ul_4g.num_qgroups - 1;
> > +	dev_info->default_queue_conf = default_queue_conf;
> > +	dev_info->cpu_flag_reqs = NULL;
> > +	dev_info->min_alignment = 64;
> > +	dev_info->capabilities = bbdev_capabilities;
> > +	dev_info->harq_buffer_size = d->ddr_size; }
> > +
> >  static const struct rte_bbdev_ops acc100_bbdev_ops = {
> >  	.close = acc100_dev_close,
> > +	.info_get = acc100_dev_info_get,
> >  };
> >
> >  /* ACC100 PCI PF address map */
> > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h
> > b/drivers/baseband/acc100/rte_acc100_pmd.h
> > index 6525d66..09965c8 100644
> > --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> > +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> > @@ -7,6 +7,7 @@
> >
> >  #include "acc100_pf_enum.h"
> >  #include "acc100_vf_enum.h"
> > +#include "rte_acc100_cfg.h"
> >
> >  /* Helper macro for logging */
> >  #define rte_bbdev_log(level, fmt, ...) \ @@ -98,6 +99,13 @@  #define
> > ACC100_SIG_UL_4G_LAST 21
> >  #define ACC100_SIG_DL_4G      27
> >  #define ACC100_SIG_DL_4G_LAST 31
> > +#define ACC100_NUM_ACCS       5
> > +#define ACC100_ACCMAP_0       0
> > +#define ACC100_ACCMAP_1       2
> > +#define ACC100_ACCMAP_2       1
> > +#define ACC100_ACCMAP_3       3
> > +#define ACC100_ACCMAP_4       4
> > +#define ACC100_PF_VAL         2
> >
> >  /* max number of iterations to allocate memory block for all rings */
> > #define ACC100_SW_RING_MEM_ALLOC_ATTEMPTS 5 @@ -517,6 +525,8
> @@ struct
> > acc100_registry_addr {
> >  /* Private data structure for each ACC100 device */  struct
> > acc100_device {
> >  	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
> > +	uint32_t ddr_size; /* Size in kB */
> > +	struct rte_acc100_conf acc100_conf; /* ACC100 Initial configuration */
> >  	bool pf_device; /**< True if this is a PF ACC100 device */
> >  	bool configured; /**< True if this ACC100 device is configured */
> > };


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v11 04/10] baseband/acc100: add queue configuration
  2020-10-04 16:18       ` Tom Rix
@ 2020-10-05 16:42         ` Chautru, Nicolas
  0 siblings, 0 replies; 213+ messages in thread
From: Chautru, Nicolas @ 2020-10-05 16:42 UTC (permalink / raw)
  To: Tom Rix, dev, akhil.goyal
  Cc: Richardson, Bruce, Xu, Rosen, maxime.coquelin, Yigit, Ferruh,
	Liu,  Tianjiao

Hi Tom, 

> From: Tom Rix <trix@redhat.com>
> On 10/1/20 6:01 PM, Nicolas Chautru wrote:
> > Adding function to create and configure queues for the device. Still
> > no capability.
> >
> > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > Reviewed-by: Rosen Xu <rosen.xu@intel.com>
> > Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
> > ---
> >  drivers/baseband/acc100/rte_acc100_pmd.c | 445
> > ++++++++++++++++++++++++++++++-
> > drivers/baseband/acc100/rte_acc100_pmd.h |  45 ++++
> >  2 files changed, 488 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> > b/drivers/baseband/acc100/rte_acc100_pmd.c
> > index fcba77e..f2bf2b5 100644
> > --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> > +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> > @@ -26,6 +26,22 @@
> >  RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);  #endif
> >
> > +/* Write to MMIO register address */
> > +static inline void
> > +mmio_write(void *addr, uint32_t value) {
> > +	*((volatile uint32_t *)(addr)) = rte_cpu_to_le_32(value); }
> > +
> > +/* Write a register of a ACC100 device */ static inline void
> > +acc100_reg_write(struct acc100_device *d, uint32_t offset, uint32_t
> > +payload) {
> > +	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
> > +	mmio_write(reg_addr, payload);
> > +	usleep(ACC100_LONG_WAIT);
> > +}
> > +
> >  /* Read a register of a ACC100 device */  static inline uint32_t
> > acc100_reg_read(struct acc100_device *d, uint32_t offset) @@ -36,6
> > +52,22 @@
> >  	return rte_le_to_cpu_32(ret);
> >  }
> >
> > +/* Basic Implementation of Log2 for exact 2^N */ static inline
> > +uint32_t log2_basic(uint32_t value) {
> > +	return (value == 0) ? 0 : rte_bsf32(value); }
> > +
> > +/* Calculate memory alignment offset assuming alignment is 2^N */
> > +static inline uint32_t calc_mem_alignment_offset(void
> > +*unaligned_virt_mem, uint32_t alignment) {
> > +	rte_iova_t unaligned_phy_mem =
> rte_malloc_virt2iova(unaligned_virt_mem);
> > +	return (uint32_t)(alignment -
> > +			(unaligned_phy_mem & (alignment-1))); }
> > +
> >  /* Calculate the offset of the enqueue register */  static inline
> > uint32_t  queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id,
> > uint16_t aq_id) @@ -208,10 +240,416 @@
> >  			acc100_conf->q_dl_5g.aq_depth_log2);
> >  }
> >
> > -/* Free 64MB memory used for software rings */
> > +static void
> > +free_base_addresses(void **base_addrs, int size) {
> > +	int i;
> > +	for (i = 0; i < size; i++)
> > +		rte_free(base_addrs[i]);
> > +}
> > +
> > +static inline uint32_t
> > +get_desc_len(void)
> > +{
> > +	return sizeof(union acc100_dma_desc); }
> > +
> > +/* Allocate the 2 * 64MB block for the sw rings */
> >  static int
> > -acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
> > +alloc_2x64mb_sw_rings_mem(struct rte_bbdev *dev, struct
> acc100_device *d,
> > +		int socket)
> >  {
> > +	uint32_t sw_ring_size = ACC100_SIZE_64MBYTE;
> > +	d->sw_rings_base = rte_zmalloc_socket(dev->device->driver->name,
> > +			2 * sw_ring_size, RTE_CACHE_LINE_SIZE, socket);
> > +	if (d->sw_rings_base == NULL) {
> > +		rte_bbdev_log(ERR, "Failed to allocate memory for %s:%u",
> > +				dev->device->driver->name,
> > +				dev->data->dev_id);
> > +		return -ENOMEM;
> > +	}
> > +	uint32_t next_64mb_align_offset = calc_mem_alignment_offset(
> > +			d->sw_rings_base, ACC100_SIZE_64MBYTE);
> > +	d->sw_rings = RTE_PTR_ADD(d->sw_rings_base,
> next_64mb_align_offset);
> > +	d->sw_rings_iova = rte_malloc_virt2iova(d->sw_rings_base) +
> > +			next_64mb_align_offset;
> > +	d->sw_ring_size = ACC100_MAX_QUEUE_DEPTH * get_desc_len();
> > +	d->sw_ring_max_depth = ACC100_MAX_QUEUE_DEPTH;
> > +
> > +	return 0;
> > +}
> > +
> > +/* Attempt to allocate minimised memory space for sw rings */ static
> > +void alloc_sw_rings_min_mem(struct rte_bbdev *dev, struct
> > +acc100_device *d,
> > +		uint16_t num_queues, int socket)
> > +{
> > +	rte_iova_t sw_rings_base_iova, next_64mb_align_addr_iova;
> > +	uint32_t next_64mb_align_offset;
> > +	rte_iova_t sw_ring_iova_end_addr;
> > +	void *base_addrs[ACC100_SW_RING_MEM_ALLOC_ATTEMPTS];
> > +	void *sw_rings_base;
> > +	int i = 0;
> > +	uint32_t q_sw_ring_size = ACC100_MAX_QUEUE_DEPTH *
> get_desc_len();
> > +	uint32_t dev_sw_ring_size = q_sw_ring_size * num_queues;
> > +
> > +	/* Find an aligned block of memory to store sw rings */
> > +	while (i < ACC100_SW_RING_MEM_ALLOC_ATTEMPTS) {
> > +		/*
> > +		 * sw_ring allocated memory is guaranteed to be aligned to
> > +		 * q_sw_ring_size at the condition that the requested size is
> > +		 * less than the page size
> > +		 */
> > +		sw_rings_base = rte_zmalloc_socket(
> > +				dev->device->driver->name,
> > +				dev_sw_ring_size, q_sw_ring_size, socket);
> > +
> > +		if (sw_rings_base == NULL) {
> > +			rte_bbdev_log(ERR,
> > +					"Failed to allocate memory for
> %s:%u",
> > +					dev->device->driver->name,
> > +					dev->data->dev_id);
> > +			break;
> > +		}
> > +
> > +		sw_rings_base_iova = rte_malloc_virt2iova(sw_rings_base);
> > +		next_64mb_align_offset = calc_mem_alignment_offset(
> > +				sw_rings_base, ACC100_SIZE_64MBYTE);
> > +		next_64mb_align_addr_iova = sw_rings_base_iova +
> > +				next_64mb_align_offset;
> > +		sw_ring_iova_end_addr = sw_rings_base_iova +
> dev_sw_ring_size;
> > +
> > +		/* Check if the end of the sw ring memory block is before the
> > +		 * start of next 64MB aligned mem address
> > +		 */
> > +		if (sw_ring_iova_end_addr < next_64mb_align_addr_iova) {
> > +			d->sw_rings_iova = sw_rings_base_iova;
> > +			d->sw_rings = sw_rings_base;
> > +			d->sw_rings_base = sw_rings_base;
> > +			d->sw_ring_size = q_sw_ring_size;
> > +			d->sw_ring_max_depth =
> ACC100_MAX_QUEUE_DEPTH;
> > +			break;
> > +		}
> > +		/* Store the address of the unaligned mem block */
> > +		base_addrs[i] = sw_rings_base;
> > +		i++;
> > +	}
> > +
> > +	/* Free all unaligned blocks of mem allocated in the loop */
> > +	free_base_addresses(base_addrs, i);
> Was the alloc fallback changed here?
> > +}
> > +
> > +
> > +/* Allocate 64MB memory used for all software rings */ static int
> > +acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int
> > +socket_id) {
> > +	uint32_t phys_low, phys_high, payload;
> > +	struct acc100_device *d = dev->data->dev_private;
> > +	const struct acc100_registry_addr *reg_addr;
> > +
> > +	if (d->pf_device && !d->acc100_conf.pf_mode_en) {
> > +		rte_bbdev_log(NOTICE,
> > +				"%s has PF mode disabled. This PF can't be
> used.",
> > +				dev->data->name);
> > +		return -ENODEV;
> > +	}
> > +
> > +	alloc_sw_rings_min_mem(dev, d, num_queues, socket_id);
> > +
> > +	/* If minimal memory space approach failed, then allocate
> > +	 * the 2 * 64MB block for the sw rings
> > +	 */
> > +	if (d->sw_rings == NULL)
> > +		alloc_2x64mb_sw_rings_mem(dev, d, socket_id);
> 
> Since this also did not change, i guess my review came too late for v11.
> 
> I'll stop here.

I actually looked into all your review comments. Many resulted in changes (see below the extra check on d->sw_rings); some did not warrant changes, based on rationales captured in the conversation on the series. Let me know if anything is unclear.

> 
> v11 looks good so far.

Great. Thanks again for your reviews. 


> 
> Tom
> 
> > +
> > +	if (d->sw_rings == NULL) {
> > +		rte_bbdev_log(NOTICE,
> > +				"Failure allocating sw_rings memory");
> > +		return -ENODEV;
> > +	}
> > +
> > +	/* Configure ACC100 with the base address for DMA descriptor rings
> > +	 * Same descriptor rings used for UL and DL DMA Engines
> > +	 * Note : Assuming only VF0 bundle is used for PF mode
> > +	 */
> > +	phys_high = (uint32_t)(d->sw_rings_iova >> 32);
> > +	phys_low  = (uint32_t)(d->sw_rings_iova &
> ~(ACC100_SIZE_64MBYTE-1));
> > +
> > +	/* Choose correct registry addresses for the device type */
> > +	if (d->pf_device)
> > +		reg_addr = &pf_reg_addr;
> > +	else
> > +		reg_addr = &vf_reg_addr;
> > +
> > +	/* Read the populated cfg from ACC100 registers */
> > +	fetch_acc100_config(dev);
> > +
> > +	/* Release AXI from PF */
> > +	if (d->pf_device)
> > +		acc100_reg_write(d, HWPfDmaAxiControl, 1);
> > +
> > +	acc100_reg_write(d, reg_addr->dma_ring_ul5g_hi, phys_high);
> > +	acc100_reg_write(d, reg_addr->dma_ring_ul5g_lo, phys_low);
> > +	acc100_reg_write(d, reg_addr->dma_ring_dl5g_hi, phys_high);
> > +	acc100_reg_write(d, reg_addr->dma_ring_dl5g_lo, phys_low);
> > +	acc100_reg_write(d, reg_addr->dma_ring_ul4g_hi, phys_high);
> > +	acc100_reg_write(d, reg_addr->dma_ring_ul4g_lo, phys_low);
> > +	acc100_reg_write(d, reg_addr->dma_ring_dl4g_hi, phys_high);
> > +	acc100_reg_write(d, reg_addr->dma_ring_dl4g_lo, phys_low);
> > +
> > +	/*
> > +	 * Configure Ring Size to the max queue ring size
> > +	 * (used for wrapping purpose)
> > +	 */
> > +	payload = log2_basic(d->sw_ring_size / 64);
> > +	acc100_reg_write(d, reg_addr->ring_size, payload);
> > +
> > +	/* Configure tail pointer for use when SDONE enabled */
> > +	d->tail_ptrs = rte_zmalloc_socket(
> > +			dev->device->driver->name,
> > +			ACC100_NUM_QGRPS * ACC100_NUM_AQS *
> sizeof(uint32_t),
> > +			RTE_CACHE_LINE_SIZE, socket_id);
> > +	if (d->tail_ptrs == NULL) {
> > +		rte_bbdev_log(ERR, "Failed to allocate tail ptr for %s:%u",
> > +				dev->device->driver->name,
> > +				dev->data->dev_id);
> > +		rte_free(d->sw_rings);
> > +		return -ENOMEM;
> > +	}
> > +	d->tail_ptr_iova = rte_malloc_virt2iova(d->tail_ptrs);
> > +
> > +	phys_high = (uint32_t)(d->tail_ptr_iova >> 32);
> > +	phys_low  = (uint32_t)(d->tail_ptr_iova);
> > +	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_hi, phys_high);
> > +	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_lo, phys_low);
> > +	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_hi, phys_high);
> > +	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_lo, phys_low);
> > +	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_hi, phys_high);
> > +	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_lo, phys_low);
> > +	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
> > +	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
> > +
> > +	d->harq_layout = rte_zmalloc_socket("HARQ Layout",
> > +			ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
> > +			RTE_CACHE_LINE_SIZE, dev->data->socket_id);
> > +	if (d->harq_layout == NULL) {
> > +		rte_bbdev_log(ERR, "Failed to allocate harq_layout for
> %s:%u",
> > +				dev->device->driver->name,
> > +				dev->data->dev_id);
> > +		rte_free(d->sw_rings);
> > +		return -ENOMEM;
> > +	}
> > +
> > +	/* Mark as configured properly */
> > +	d->configured = true;
> > +
> > +	rte_bbdev_log_debug(
> > +			"ACC100 (%s) configured  sw_rings = %p, sw_rings_iova = %#"
> > +			PRIx64, dev->data->name, d->sw_rings, d->sw_rings_iova);
> > +
> > +	return 0;
> > +}
> > +
> > +/* Free memory used for software rings */ static int
> > +acc100_dev_close(struct rte_bbdev *dev) {
> > +	struct acc100_device *d = dev->data->dev_private;
> > +	if (d->sw_rings_base != NULL) {
> > +		rte_free(d->tail_ptrs);
> > +		rte_free(d->sw_rings_base);
> > +		d->sw_rings_base = NULL;
> > +	}
> > +	/* Ensure all in flight HW transactions are completed */
> > +	usleep(ACC100_LONG_WAIT);
> > +	return 0;
> > +}
> > +
> > +
> > +/**
> > + * Report a ACC100 queue index which is free
> > + * Return 0 to 16k for a valid queue_idx or -1 when no queue is
> > +available
> > + * Note : Only supporting VF0 Bundle for PF mode  */ static int
> > +acc100_find_free_queue_idx(struct rte_bbdev *dev,
> > +		const struct rte_bbdev_queue_conf *conf) {
> > +	struct acc100_device *d = dev->data->dev_private;
> > +	int op_2_acc[5] = {0, UL_4G, DL_4G, UL_5G, DL_5G};
> > +	int acc = op_2_acc[conf->op_type];
> > +	struct rte_acc100_queue_topology *qtop = NULL;
> > +
> > +	qtopFromAcc(&qtop, acc, &(d->acc100_conf));
> > +	if (qtop == NULL)
> > +		return -1;
> > +	/* Identify matching QGroup Index which are sorted in priority order
> */
> > +	uint16_t group_idx = qtop->first_qgroup_index;
> > +	group_idx += conf->priority;
> > +	if (group_idx >= ACC100_NUM_QGRPS ||
> > +			conf->priority >= qtop->num_qgroups) {
> > +		rte_bbdev_log(INFO, "Invalid Priority on %s, priority %u",
> > +				dev->data->name, conf->priority);
> > +		return -1;
> > +	}
> > +	/* Find a free AQ_idx  */
> > +	uint16_t aq_idx;
> > +	for (aq_idx = 0; aq_idx < qtop->num_aqs_per_groups; aq_idx++) {
> > +		if (((d->q_assigned_bit_map[group_idx] >> aq_idx) & 0x1) ==
> 0) {
> > +			/* Mark the Queue as assigned */
> > +			d->q_assigned_bit_map[group_idx] |= (1 << aq_idx);
> > +			/* Report the AQ Index */
> > +			return (group_idx << ACC100_GRP_ID_SHIFT) +
> aq_idx;
> > +		}
> > +	}
> > +	rte_bbdev_log(INFO, "Failed to find free queue on %s, priority %u",
> > +			dev->data->name, conf->priority);
> > +	return -1;
> > +}
> > +
> > +/* Setup ACC100 queue */
> > +static int
> > +acc100_queue_setup(struct rte_bbdev *dev, uint16_t queue_id,
> > +		const struct rte_bbdev_queue_conf *conf) {
> > +	struct acc100_device *d = dev->data->dev_private;
> > +	struct acc100_queue *q;
> > +	int16_t q_idx;
> > +
> > +	/* Allocate the queue data structure. */
> > +	q = rte_zmalloc_socket(dev->device->driver->name, sizeof(*q),
> > +			RTE_CACHE_LINE_SIZE, conf->socket);
> > +	if (q == NULL) {
> > +		rte_bbdev_log(ERR, "Failed to allocate queue memory");
> > +		return -ENOMEM;
> > +	}
> > +	if (d == NULL) {
> > +		rte_bbdev_log(ERR, "Undefined device");
> > +		return -ENODEV;
> > +	}
> > +
> > +	q->d = d;
> > +	q->ring_addr = RTE_PTR_ADD(d->sw_rings, (d->sw_ring_size *
> queue_id));
> > +	q->ring_addr_iova = d->sw_rings_iova + (d->sw_ring_size *
> queue_id);
> > +
> > +	/* Prepare the Ring with default descriptor format */
> > +	union acc100_dma_desc *desc = NULL;
> > +	unsigned int desc_idx, b_idx;
> > +	int fcw_len = (conf->op_type == RTE_BBDEV_OP_LDPC_ENC ?
> > +		ACC100_FCW_LE_BLEN : (conf->op_type ==
> RTE_BBDEV_OP_TURBO_DEC ?
> > +		ACC100_FCW_TD_BLEN : ACC100_FCW_LD_BLEN));
> > +
> > +	for (desc_idx = 0; desc_idx < d->sw_ring_max_depth; desc_idx++) {
> > +		desc = q->ring_addr + desc_idx;
> > +		desc->req.word0 = ACC100_DMA_DESC_TYPE;
> > +		desc->req.word1 = 0; /**< Timestamp */
> > +		desc->req.word2 = 0;
> > +		desc->req.word3 = 0;
> > +		uint64_t fcw_offset = (desc_idx << 8) +
> ACC100_DESC_FCW_OFFSET;
> > +		desc->req.data_ptrs[0].address = q->ring_addr_iova +
> fcw_offset;
> > +		desc->req.data_ptrs[0].blen = fcw_len;
> > +		desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
> > +		desc->req.data_ptrs[0].last = 0;
> > +		desc->req.data_ptrs[0].dma_ext = 0;
> > +		for (b_idx = 1; b_idx < ACC100_DMA_MAX_NUM_POINTERS -
> 1;
> > +				b_idx++) {
> > +			desc->req.data_ptrs[b_idx].blkid =
> ACC100_DMA_BLKID_IN;
> > +			desc->req.data_ptrs[b_idx].last = 1;
> > +			desc->req.data_ptrs[b_idx].dma_ext = 0;
> > +			b_idx++;
> > +			desc->req.data_ptrs[b_idx].blkid =
> > +					ACC100_DMA_BLKID_OUT_ENC;
> > +			desc->req.data_ptrs[b_idx].last = 1;
> > +			desc->req.data_ptrs[b_idx].dma_ext = 0;
> > +		}
> > +		/* Preset some fields of LDPC FCW */
> > +		desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
> > +		desc->req.fcw_ld.gain_i = 1;
> > +		desc->req.fcw_ld.gain_h = 1;
> > +	}
> > +
> > +	q->lb_in = rte_zmalloc_socket(dev->device->driver->name,
> > +			RTE_CACHE_LINE_SIZE,
> > +			RTE_CACHE_LINE_SIZE, conf->socket);
> > +	if (q->lb_in == NULL) {
> > +		rte_bbdev_log(ERR, "Failed to allocate lb_in memory");
> > +		rte_free(q);
> > +		return -ENOMEM;
> > +	}
> > +	q->lb_in_addr_iova = rte_malloc_virt2iova(q->lb_in);
> > +	q->lb_out = rte_zmalloc_socket(dev->device->driver->name,
> > +			RTE_CACHE_LINE_SIZE,
> > +			RTE_CACHE_LINE_SIZE, conf->socket);
> > +	if (q->lb_out == NULL) {
> > +		rte_bbdev_log(ERR, "Failed to allocate lb_out memory");
> > +		rte_free(q->lb_in);
> > +		rte_free(q);
> > +		return -ENOMEM;
> > +	}
> > +	q->lb_out_addr_iova = rte_malloc_virt2iova(q->lb_out);
> > +
> > +	/*
> > +	 * Software queue ring wraps synchronously with the HW when it reaches
> > +	 * the boundary of the maximum allocated queue size, no matter what the
> > +	 * sw queue size is. This wrapping is guarded by setting the wrap_mask
> > +	 * to represent the maximum queue size as allocated at the time when
> > +	 * the device has been setup (in configure()).
> > +	 *
> > +	 * The queue depth is set to the queue size value (conf->queue_size).
> > +	 * This limits the occupancy of the queue at any point of time, so that
> > +	 * the queue does not get swamped with enqueue requests.
> > +	 */
> > +	q->sw_ring_depth = conf->queue_size;
> > +	q->sw_ring_wrap_mask = d->sw_ring_max_depth - 1;
> > +
> > +	q->op_type = conf->op_type;
> > +
> > +	q_idx = acc100_find_free_queue_idx(dev, conf);
> > +	if (q_idx == -1) {
> > +		rte_free(q->lb_in);
> > +		rte_free(q->lb_out);
> > +		rte_free(q);
> > +		return -1;
> > +	}
> > +
> > +	q->qgrp_id = (q_idx >> ACC100_GRP_ID_SHIFT) & 0xF;
> > +	q->vf_id = (q_idx >> ACC100_VF_ID_SHIFT)  & 0x3F;
> > +	q->aq_id = q_idx & 0xF;
> > +	q->aq_depth = (conf->op_type ==  RTE_BBDEV_OP_TURBO_DEC) ?
> > +			(1 << d->acc100_conf.q_ul_4g.aq_depth_log2) :
> > +			(1 << d->acc100_conf.q_dl_4g.aq_depth_log2);
> > +
> > +	q->mmio_reg_enqueue = RTE_PTR_ADD(d->mmio_base,
> > +			queue_offset(d->pf_device,
> > +					q->vf_id, q->qgrp_id, q->aq_id));
> > +
> > +	rte_bbdev_log_debug(
> > +			"Setup dev%u q%u: qgrp_id=%u, vf_id=%u,
> aq_id=%u, aq_depth=%u, mmio_reg_enqueue=%p",
> > +			dev->data->dev_id, queue_id, q->qgrp_id, q->vf_id,
> > +			q->aq_id, q->aq_depth, q->mmio_reg_enqueue);
> > +
> > +	dev->data->queues[queue_id].queue_private = q;
> > +	return 0;
> > +}
> > +
> > +/* Release ACC100 queue */
> > +static int
> > +acc100_queue_release(struct rte_bbdev *dev, uint16_t q_id) {
> > +	struct acc100_device *d = dev->data->dev_private;
> > +	struct acc100_queue *q = dev->data->queues[q_id].queue_private;
> > +
> > +	if (q != NULL) {
> > +		/* Mark the Queue as un-assigned */
> > +		d->q_assigned_bit_map[q->qgrp_id] &= (0xFFFFFFFF -
> > +				(1 << q->aq_id));
> > +		rte_free(q->lb_in);
> > +		rte_free(q->lb_out);
> > +		rte_free(q);
> > +		dev->data->queues[q_id].queue_private = NULL;
> > +	}
> > +
> >  	return 0;
> >  }
> >
> > @@ -262,8 +700,11 @@
> >  }
> >
> >  static const struct rte_bbdev_ops acc100_bbdev_ops = {
> > +	.setup_queues = acc100_setup_queues,
> >  	.close = acc100_dev_close,
> >  	.info_get = acc100_dev_info_get,
> > +	.queue_setup = acc100_queue_setup,
> > +	.queue_release = acc100_queue_release,
> >  };
> >
> >  /* ACC100 PCI PF address map */
> > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h
> > b/drivers/baseband/acc100/rte_acc100_pmd.h
> > index 09965c8..5c8dde3 100644
> > --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> > +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> > @@ -522,11 +522,56 @@ struct acc100_registry_addr {
> >  	.ddr_range = HWVfDmaDdrBaseRangeRoVf,  };
> >
> > +/* Structure associated with each queue. */ struct
> > +__rte_cache_aligned acc100_queue {
> > +	union acc100_dma_desc *ring_addr;  /* Virtual address of sw ring */
> > +	rte_iova_t ring_addr_iova;  /* IOVA address of software ring */
> > +	uint32_t sw_ring_head;  /* software ring head */
> > +	uint32_t sw_ring_tail;  /* software ring tail */
> > +	/* software ring size (descriptors, not bytes) */
> > +	uint32_t sw_ring_depth;
> > +	/* mask used to wrap enqueued descriptors on the sw ring */
> > +	uint32_t sw_ring_wrap_mask;
> > +	/* MMIO register used to enqueue descriptors */
> > +	void *mmio_reg_enqueue;
> > +	uint8_t vf_id;  /* VF ID (max = 63) */
> > +	uint8_t qgrp_id;  /* Queue Group ID */
> > +	uint16_t aq_id;  /* Atomic Queue ID */
> > +	uint16_t aq_depth;  /* Depth of atomic queue */
> > +	uint32_t aq_enqueued;  /* Count how many "batches" have been
> enqueued */
> > +	uint32_t aq_dequeued;  /* Count how many "batches" have been
> dequeued */
> > +	uint32_t irq_enable;  /* Enable ops dequeue interrupts if set to 1 */
> > +	struct rte_mempool *fcw_mempool;  /* FCW mempool */
> > +	enum rte_bbdev_op_type op_type;  /* Type of this Queue: TE or TD
> */
> > +	/* Internal Buffers for loopback input */
> > +	uint8_t *lb_in;
> > +	uint8_t *lb_out;
> > +	rte_iova_t lb_in_addr_iova;
> > +	rte_iova_t lb_out_addr_iova;
> > +	struct acc100_device *d;
> > +};
> > +
> >  /* Private data structure for each ACC100 device */  struct
> > acc100_device {
> >  	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
> > +	void *sw_rings_base;  /* Base addr of un-aligned memory for sw
> rings */
> > +	void *sw_rings;  /* 64MBs of 64MB aligned memory for sw rings */
> > +	rte_iova_t sw_rings_iova;  /* IOVA address of sw_rings */
> > +	/* Virtual address of the info memory routed to this function
> under
> > +	 * operation, whether it is PF or VF.
> > +	 */
> > +	union acc100_harq_layout_data *harq_layout;
> > +	uint32_t sw_ring_size;
> >  	uint32_t ddr_size; /* Size in kB */
> > +	uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */
> > +	rte_iova_t tail_ptr_iova; /* IOVA address of tail pointers */
> > +	/* Max number of entries available for each queue in device,
> depending
> > +	 * on how many queues are enabled with configure()
> > +	 */
> > +	uint32_t sw_ring_max_depth;
> >  	struct rte_acc100_conf acc100_conf; /* ACC100 Initial configuration
> > */
> > +	/* Bitmap capturing which Queues have already been assigned */
> > +	uint16_t q_assigned_bit_map[ACC100_NUM_QGRPS];
> >  	bool pf_device; /**< True if this is a PF ACC100 device */
> >  	bool configured; /**< True if this ACC100 device is configured */
> > };


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v12 00/10] bbdev PMD ACC100
  2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 11/11] doc: update bbdev feature table Nicolas Chautru
                     ` (7 preceding siblings ...)
  2020-10-02  1:01   ` [dpdk-dev] [PATCH v11 00/10] bbdev PMD ACC100 Nicolas Chautru
@ 2020-10-05 22:12   ` Nicolas Chautru
  2020-10-05 22:12     ` [dpdk-dev] [PATCH v12 01/10] drivers/baseband: add PMD for ACC100 Nicolas Chautru
                       ` (10 more replies)
  8 siblings, 11 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-10-05 22:12 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, trix, maxime.coquelin, ferruh.yigit,
	tianjiao.liu, Nicolas Chautru

v12: Correcting 1 spelling error and 1 code clean up.
v11: Further updates based on Tom and Maxime's review comments on v9 and v10. Variable renaming.
v10: Updates based on Tom Rix's valuable review comments. Notably doc clarification, #define name updates, a few remaining magic numbers, stricter error handling and a few valuable coding suggestions. Thanks
v9: moved the release notes update to the last commit
v8: integrated the doc feature table in previous commit as suggested. 
v7: Fingers trouble. Previous one sent mid-rebase. My bad. 
v6: removed a legacy makefile no longer required
v5: rebase based on latest on main. The legacy makefiles are removed. 
v4: an odd compilation error is reported for one CI variant using "gcc latest", which looks to me like a maybe-undeclared false positive. 
http://mails.dpdk.org/archives/test-report/2020-August/148936.html
Still forcing a dummy declaration to remove this CI warning; I will check with ci@dpdk.org in parallel.  
v3: missed a change during rebase
v2: includes clean up from latest CI checks.

Nicolas Chautru (10):
  drivers/baseband: add PMD for ACC100
  baseband/acc100: add register definition file
  baseband/acc100: add info get function
  baseband/acc100: add queue configuration
  baseband/acc100: add LDPC processing functions
  baseband/acc100: add HARQ loopback support
  baseband/acc100: add support for 4G processing
  baseband/acc100: add interrupt support to PMD
  baseband/acc100: add debug function to validate input
  baseband/acc100: add configure function

 app/test-bbdev/meson.build                         |    3 +
 app/test-bbdev/test_bbdev_perf.c                   |   71 +
 doc/guides/bbdevs/acc100.rst                       |  228 +
 doc/guides/bbdevs/features/acc100.ini              |   14 +
 doc/guides/bbdevs/index.rst                        |    1 +
 doc/guides/rel_notes/release_20_11.rst             |    5 +
 drivers/baseband/acc100/acc100_pf_enum.h           | 1068 +++++
 drivers/baseband/acc100/acc100_vf_enum.h           |   73 +
 drivers/baseband/acc100/meson.build                |    8 +
 drivers/baseband/acc100/rte_acc100_cfg.h           |  113 +
 drivers/baseband/acc100/rte_acc100_pmd.c           | 4727 ++++++++++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.h           |  602 +++
 .../acc100/rte_pmd_bbdev_acc100_version.map        |   10 +
 drivers/baseband/meson.build                       |    2 +-
 14 files changed, 6924 insertions(+), 1 deletion(-)
 create mode 100644 doc/guides/bbdevs/acc100.rst
 create mode 100644 doc/guides/bbdevs/features/acc100.ini
 create mode 100644 drivers/baseband/acc100/acc100_pf_enum.h
 create mode 100644 drivers/baseband/acc100/acc100_vf_enum.h
 create mode 100644 drivers/baseband/acc100/meson.build
 create mode 100644 drivers/baseband/acc100/rte_acc100_cfg.h
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.c
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.h
 create mode 100644 drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v12 01/10] drivers/baseband: add PMD for ACC100
  2020-10-05 22:12   ` [dpdk-dev] [PATCH v12 00/10] bbdev PMD ACC100 Nicolas Chautru
@ 2020-10-05 22:12     ` Nicolas Chautru
  2020-11-02  9:25       ` Ferruh Yigit
  2020-10-05 22:12     ` [dpdk-dev] [PATCH v12 02/10] baseband/acc100: add register definition file Nicolas Chautru
                       ` (9 subsequent siblings)
  10 siblings, 1 reply; 213+ messages in thread
From: Nicolas Chautru @ 2020-10-05 22:12 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, trix, maxime.coquelin, ferruh.yigit,
	tianjiao.liu, Nicolas Chautru

Add stubs for the ACC100 PMD

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Reviewed-by: Tom Rix <trix@redhat.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 doc/guides/bbdevs/acc100.rst                       | 228 +++++++++++++++++++++
 doc/guides/bbdevs/features/acc100.ini              |  14 ++
 doc/guides/bbdevs/index.rst                        |   1 +
 drivers/baseband/acc100/meson.build                |   6 +
 drivers/baseband/acc100/rte_acc100_pmd.c           | 175 ++++++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.h           |  37 ++++
 .../acc100/rte_pmd_bbdev_acc100_version.map        |   3 +
 drivers/baseband/meson.build                       |   2 +-
 8 files changed, 465 insertions(+), 1 deletion(-)
 create mode 100644 doc/guides/bbdevs/acc100.rst
 create mode 100644 doc/guides/bbdevs/features/acc100.ini
 create mode 100644 drivers/baseband/acc100/meson.build
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.c
 create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.h
 create mode 100644 drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map

diff --git a/doc/guides/bbdevs/acc100.rst b/doc/guides/bbdevs/acc100.rst
new file mode 100644
index 0000000..d6d56ad
--- /dev/null
+++ b/doc/guides/bbdevs/acc100.rst
@@ -0,0 +1,228 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2020 Intel Corporation
+
+Intel(R) ACC100 5G/4G FEC Poll Mode Driver
+==========================================
+
+The BBDEV ACC100 5G/4G FEC poll mode driver (PMD) supports an
+implementation of a VRAN FEC wireless acceleration function.
+This device is also known as Mount Bryce.
+
+Features
+--------
+
+ACC100 5G/4G FEC PMD supports the following features:
+
+- LDPC Encode in the DL (5GNR)
+- LDPC Decode in the UL (5GNR)
+- Turbo Encode in the DL (4G)
+- Turbo Decode in the UL (4G)
+- 16 VFs per PF (physical device)
+- Maximum of 128 queues per VF
+- PCIe Gen-3 x16 Interface
+- MSI
+- SR-IOV
+
+ACC100 5G/4G FEC PMD supports the following BBDEV capabilities:
+
+* For the LDPC encode operation:
+   - ``RTE_BBDEV_LDPC_CRC_24B_ATTACH`` :  set to attach CRC24B to CB(s)
+   - ``RTE_BBDEV_LDPC_RATE_MATCH`` :  if set then do not do Rate Match bypass
+   - ``RTE_BBDEV_LDPC_INTERLEAVER_BYPASS`` : if set then bypass interleaver
+
+* For the LDPC decode operation:
+   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK`` :  check CRC24B from CB(s)
+   - ``RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE`` :  disable early termination
+   - ``RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP`` :  drops CRC24B bits appended while decoding
+   - ``RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE`` :  provides an input for HARQ combining
+   - ``RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE`` :  provides an output for HARQ combining
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE`` :  HARQ memory input is internal
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE`` :  HARQ memory output is internal
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK`` :  loopback data to/from HARQ memory
+   - ``RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_FILLERS`` :  HARQ memory includes the filler bits
+   - ``RTE_BBDEV_LDPC_DEC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
+   - ``RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION`` :  supports compression of the HARQ input/output
+   - ``RTE_BBDEV_LDPC_LLR_COMPRESSION`` :  supports LLR input compression
+
+* For the turbo encode operation:
+   - ``RTE_BBDEV_TURBO_CRC_24B_ATTACH`` :  set to attach CRC24B to CB(s)
+   - ``RTE_BBDEV_TURBO_RATE_MATCH`` :  if set then do not do Rate Match bypass
+   - ``RTE_BBDEV_TURBO_ENC_INTERRUPTS`` :  set for encoder dequeue interrupts
+   - ``RTE_BBDEV_TURBO_RV_INDEX_BYPASS`` :  set to bypass RV index
+   - ``RTE_BBDEV_TURBO_ENC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
+
+* For the turbo decode operation:
+   - ``RTE_BBDEV_TURBO_CRC_TYPE_24B`` :  check CRC24B from CB(s)
+   - ``RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE`` :  perform subblock de-interleave
+   - ``RTE_BBDEV_TURBO_DEC_INTERRUPTS`` :  set for decoder dequeue interrupts
+   - ``RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN`` :  set if negative LLR decoder input is supported
+   - ``RTE_BBDEV_TURBO_POS_LLR_1_BIT_IN`` :  set if positive LLR decoder input is supported
+   - ``RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP`` :  keep CRC24B bits appended while decoding
+   - ``RTE_BBDEV_TURBO_EARLY_TERMINATION`` :  set the early termination feature
+   - ``RTE_BBDEV_TURBO_DEC_SCATTER_GATHER`` :  supports scatter-gather for input/output data
+   - ``RTE_BBDEV_TURBO_HALF_ITERATION_EVEN`` :  set half iteration granularity
+
+Installation
+------------
+
+Section 3 of the DPDK manual provides instructions on installing and compiling DPDK. The
+default set of bbdev compile flags may be found in config/common_base, where for example
+the flag to build the ACC100 5G/4G FEC device, ``CONFIG_RTE_LIBRTE_PMD_BBDEV_ACC100``,
+is already set.
+
+DPDK requires hugepages to be configured as detailed in section 2 of the DPDK manual.
+The bbdev test application has been tested with a configuration of 40 x 1GB hugepages. The
+hugepage configuration of a server may be examined using:
+
+.. code-block:: console
+
+   grep Huge* /proc/meminfo
+
+
+Initialization
+--------------
+
+When the device first powers up, its PCI Physical Function (PF) can be listed with this command:
+
+.. code-block:: console
+
+  sudo lspci -vd8086:0d5c
+
+The physical and virtual functions are compatible with Linux UIO drivers:
+``vfio`` and ``igb_uio``. However, before it can be used, the ACC100 5G/4G
+FEC device must first be bound to one of these Linux drivers through DPDK.
+
+
+Bind PF UIO driver(s)
+~~~~~~~~~~~~~~~~~~~~~
+
+Install the DPDK igb_uio driver, bind it with the PF PCI device ID and use
+``lspci`` to confirm the PF device is under use by ``igb_uio`` DPDK UIO driver.
+
+The igb_uio driver may be bound to the PF PCI device using one of three methods:
+
+
+1. PCI functions (physical or virtual, depending on the use case) can be bound to
+the UIO driver by repeating this command for every function.
+
+.. code-block:: console
+
+  cd <dpdk-top-level-directory>
+  insmod ./build/kmod/igb_uio.ko
+  echo "8086 0d5c" > /sys/bus/pci/drivers/igb_uio/new_id
+  lspci -vd8086:0d5c
+
+
+2. Another way to bind PF with DPDK UIO driver is by using the ``dpdk-devbind.py`` tool
+
+.. code-block:: console
+
+  cd <dpdk-top-level-directory>
+  ./usertools/dpdk-devbind.py -b igb_uio 0000:06:00.0
+
+where the PCI device ID (example: 0000:06:00.0) is obtained using lspci -vd8086:0d5c
+
+
+3. A third way to bind is to use ``dpdk-setup.sh`` tool
+
+.. code-block:: console
+
+  cd <dpdk-top-level-directory>
+  ./usertools/dpdk-setup.sh
+
+  select 'Bind Ethernet/Crypto/Baseband device to IGB UIO module'
+  enter PCI device ID
+  select 'Display current Ethernet/Crypto/Baseband device settings' to confirm binding
+
+In a similar way the ACC100 5G/4G FEC PF may be bound with vfio-pci as any PCIe device.
+
+Enable Virtual Functions
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Now, it should be visible in the printouts that the PCI PF is under igb_uio control:
+"``Kernel driver in use: igb_uio``"
+
+To show the number of available VFs on the device, read the ``sriov_totalvfs`` file:
+
+.. code-block:: console
+
+  cat /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_totalvfs
+
+  where 0000\:<b>\:<d>.<f> is the PCI device ID
+
+
+To enable VFs via igb_uio, echo the number of virtual functions to be
+enabled to the ``max_vfs`` file:
+
+.. code-block:: console
+
+  echo <num-of-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/max_vfs
+
+
+Afterwards, all VFs must be bound to appropriate UIO drivers as required, in the
+same way as was done with the physical function previously.
+
+Enabling SR-IOV via the vfio driver is much the same, except that the file
+name is different:
+
+.. code-block:: console
+
+  echo <num-of-vfs> > /sys/bus/pci/devices/0000\:<b>\:<d>.<f>/sriov_numvfs
+
+
+Configure the VFs through PF
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The PCI virtual functions must be configured before working or getting assigned
+to VMs/Containers. The configuration involves allocating the number of hardware
+queues, priorities, load balance, bandwidth and other settings necessary for the
+device to perform FEC functions.
+
+This configuration needs to be executed at least once after reboot or PCI FLR and can
+be achieved by using the function ``acc100_configure()``, which sets up the
+parameters defined in ``acc100_conf`` structure.
+
+Test Application
+----------------
+
+BBDEV provides a test application, ``test-bbdev.py``, and a range of test data for testing
+the functionality of ACC100 5G/4G FEC encode and decode, depending on the device's
+capabilities. The test application is located in the app/test-bbdev folder and has the
+following options:
+
+.. code-block:: console
+
+  "-p", "--testapp-path": specifies path to the bbdev test app.
+  "-e", "--eal-params"	: EAL arguments which are passed to the test app.
+  "-t", "--timeout"	: Timeout in seconds (default=300).
+  "-c", "--test-cases"	: Defines test cases to run. Run all if not specified.
+  "-v", "--test-vector"	: Test vector path (default=dpdk_path+/app/test-bbdev/test_vectors/bbdev_null.data).
+  "-n", "--num-ops"	: Number of operations to process on device (default=32).
+  "-b", "--burst-size"	: Operations enqueue/dequeue burst size (default=32).
+  "-s", "--snr"		: SNR in dB used when generating LLRs for bler tests.
+  "-s", "--iter_max"	: Number of iterations for LDPC decoder.
+  "-l", "--num-lcores"	: Number of lcores to run (default=16).
+  "-i", "--init-device" : Initialise PF device with default values.
+
+
+To execute the test application tool using simple decode or encode data,
+type one of the following:
+
+.. code-block:: console
+
+  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_dec_default.data
+  ./test-bbdev.py -c validation -n 64 -b 1 -v ./ldpc_enc_default.data
+
+
+The test application ``test-bbdev.py`` supports configuring the PF device with
+a default set of values, if the "-i" or "--init-device" option is included. The default values
+are defined in test_bbdev_perf.c.
+
+
+Test Vectors
+~~~~~~~~~~~~
+
+In addition to the simple LDPC decoder and LDPC encoder tests, bbdev also provides
+a range of additional tests under the test_vectors folder, which may be useful. The results
+of these tests will depend on the ACC100 5G/4G FEC capabilities, which may cause some
+test cases to be skipped, but no failure should be reported.
diff --git a/doc/guides/bbdevs/features/acc100.ini b/doc/guides/bbdevs/features/acc100.ini
new file mode 100644
index 0000000..c89a4d7
--- /dev/null
+++ b/doc/guides/bbdevs/features/acc100.ini
@@ -0,0 +1,14 @@
+;
+; Supported features of the 'acc100' bbdev driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+Turbo Decoder (4G)     = N
+Turbo Encoder (4G)     = N
+LDPC Decoder (5G)      = N
+LDPC Encoder (5G)      = N
+LLR/HARQ Compression   = N
+External DDR Access    = N
+HW Accelerated         = Y
+BBDEV API              = Y
diff --git a/doc/guides/bbdevs/index.rst b/doc/guides/bbdevs/index.rst
index a8092dd..4445cbd 100644
--- a/doc/guides/bbdevs/index.rst
+++ b/doc/guides/bbdevs/index.rst
@@ -13,3 +13,4 @@ Baseband Device Drivers
     turbo_sw
     fpga_lte_fec
     fpga_5gnr_fec
+    acc100
diff --git a/drivers/baseband/acc100/meson.build b/drivers/baseband/acc100/meson.build
new file mode 100644
index 0000000..8afafc2
--- /dev/null
+++ b/drivers/baseband/acc100/meson.build
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2020 Intel Corporation
+
+deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci']
+
+sources = files('rte_acc100_pmd.c')
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
new file mode 100644
index 0000000..1b4cd13
--- /dev/null
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -0,0 +1,175 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#include <unistd.h>
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_dev.h>
+#include <rte_malloc.h>
+#include <rte_mempool.h>
+#include <rte_byteorder.h>
+#include <rte_errno.h>
+#include <rte_branch_prediction.h>
+#include <rte_hexdump.h>
+#include <rte_pci.h>
+#include <rte_bus_pci.h>
+
+#include <rte_bbdev.h>
+#include <rte_bbdev_pmd.h>
+#include "rte_acc100_pmd.h"
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, DEBUG);
+#else
+RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
+#endif
+
+/* Free memory used for software rings (stub at this stage) */
+static int
+acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
+{
+	return 0;
+}
+
+static const struct rte_bbdev_ops acc100_bbdev_ops = {
+	.close = acc100_dev_close,
+};
+
+/* ACC100 PCI PF address map */
+static struct rte_pci_id pci_id_acc100_pf_map[] = {
+	{
+		RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID, RTE_ACC100_PF_DEVICE_ID)
+	},
+	{.device_id = 0},
+};
+
+/* ACC100 PCI VF address map */
+static struct rte_pci_id pci_id_acc100_vf_map[] = {
+	{
+		RTE_PCI_DEVICE(RTE_ACC100_VENDOR_ID, RTE_ACC100_VF_DEVICE_ID)
+	},
+	{.device_id = 0},
+};
+
+/* Initialization Function */
+static void
+acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
+{
+	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
+
+	dev->dev_ops = &acc100_bbdev_ops;
+
+	((struct acc100_device *) dev->data->dev_private)->pf_device =
+			!strcmp(drv->driver.name,
+					RTE_STR(ACC100PF_DRIVER_NAME));
+	((struct acc100_device *) dev->data->dev_private)->mmio_base =
+			pci_dev->mem_resource[0].addr;
+
+	rte_bbdev_log_debug("Init device %s [%s] @ vaddr %p paddr %#"PRIx64"",
+			drv->driver.name, dev->data->name,
+			(void *)pci_dev->mem_resource[0].addr,
+			pci_dev->mem_resource[0].phys_addr);
+}
+
+static int acc100_pci_probe(struct rte_pci_driver *pci_drv,
+	struct rte_pci_device *pci_dev)
+{
+	struct rte_bbdev *bbdev = NULL;
+	char dev_name[RTE_BBDEV_NAME_MAX_LEN];
+
+	if (pci_dev == NULL) {
+		rte_bbdev_log(ERR, "NULL PCI device");
+		return -EINVAL;
+	}
+
+	rte_pci_device_name(&pci_dev->addr, dev_name, sizeof(dev_name));
+
+	/* Allocate memory to be used privately by drivers */
+	bbdev = rte_bbdev_allocate(pci_dev->device.name);
+	if (bbdev == NULL)
+		return -ENODEV;
+
+	/* allocate device private memory */
+	bbdev->data->dev_private = rte_zmalloc_socket(dev_name,
+			sizeof(struct acc100_device), RTE_CACHE_LINE_SIZE,
+			pci_dev->device.numa_node);
+
+	if (bbdev->data->dev_private == NULL) {
+		rte_bbdev_log(CRIT,
+				"Allocation of %zu bytes for device \"%s\" failed",
+				sizeof(struct acc100_device), dev_name);
+		rte_bbdev_release(bbdev);
+		return -ENOMEM;
+	}
+
+	/* Fill HW specific part of device structure */
+	bbdev->device = &pci_dev->device;
+	bbdev->intr_handle = &pci_dev->intr_handle;
+	bbdev->data->socket_id = pci_dev->device.numa_node;
+
+	/* Invoke ACC100 device initialization function */
+	acc100_bbdev_init(bbdev, pci_drv);
+
+	rte_bbdev_log_debug("Initialised bbdev %s (id = %u)",
+			dev_name, bbdev->data->dev_id);
+	return 0;
+}
+
+static int acc100_pci_remove(struct rte_pci_device *pci_dev)
+{
+	struct rte_bbdev *bbdev;
+	int ret;
+	uint8_t dev_id;
+
+	if (pci_dev == NULL)
+		return -EINVAL;
+
+	/* Find device */
+	bbdev = rte_bbdev_get_named_dev(pci_dev->device.name);
+	if (bbdev == NULL) {
+		rte_bbdev_log(CRIT,
+				"Couldn't find HW dev \"%s\" to uninitialise it",
+				pci_dev->device.name);
+		return -ENODEV;
+	}
+	dev_id = bbdev->data->dev_id;
+
+	/* free device private memory before close */
+	rte_free(bbdev->data->dev_private);
+
+	/* Close device */
+	ret = rte_bbdev_close(dev_id);
+	if (ret < 0)
+		rte_bbdev_log(ERR,
+				"Device %i failed to close during uninit: %i",
+				dev_id, ret);
+
+	/* release bbdev from library */
+	rte_bbdev_release(bbdev);
+
+	rte_bbdev_log_debug("Destroyed bbdev = %u", dev_id);
+
+	return 0;
+}
+
+static struct rte_pci_driver acc100_pci_pf_driver = {
+		.probe = acc100_pci_probe,
+		.remove = acc100_pci_remove,
+		.id_table = pci_id_acc100_pf_map,
+		.drv_flags = RTE_PCI_DRV_NEED_MAPPING
+};
+
+static struct rte_pci_driver acc100_pci_vf_driver = {
+		.probe = acc100_pci_probe,
+		.remove = acc100_pci_remove,
+		.id_table = pci_id_acc100_vf_map,
+		.drv_flags = RTE_PCI_DRV_NEED_MAPPING
+};
+
+RTE_PMD_REGISTER_PCI(ACC100PF_DRIVER_NAME, acc100_pci_pf_driver);
+RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
+RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
+RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
+
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
new file mode 100644
index 0000000..6f46df0
--- /dev/null
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -0,0 +1,37 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_ACC100_PMD_H_
+#define _RTE_ACC100_PMD_H_
+
+/* Helper macro for logging */
+#define rte_bbdev_log(level, fmt, ...) \
+	rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \
+		##__VA_ARGS__)
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+#define rte_bbdev_log_debug(fmt, ...) \
+		rte_bbdev_log(DEBUG, "acc100_pmd: " fmt, \
+		##__VA_ARGS__)
+#else
+#define rte_bbdev_log_debug(fmt, ...)
+#endif
+
+/* ACC100 PF and VF driver names */
+#define ACC100PF_DRIVER_NAME           intel_acc100_pf
+#define ACC100VF_DRIVER_NAME           intel_acc100_vf
+
+/* ACC100 PCI vendor & device IDs */
+#define RTE_ACC100_VENDOR_ID           (0x8086)
+#define RTE_ACC100_PF_DEVICE_ID        (0x0d5c)
+#define RTE_ACC100_VF_DEVICE_ID        (0x0d5d)
+
+/* Private data structure for each ACC100 device */
+struct acc100_device {
+	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
+	bool pf_device; /**< True if this is a PF ACC100 device */
+	bool configured; /**< True if this ACC100 device is configured */
+};
+
+#endif /* _RTE_ACC100_PMD_H_ */
diff --git a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
new file mode 100644
index 0000000..4a76d1d
--- /dev/null
+++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
@@ -0,0 +1,3 @@
+DPDK_21 {
+	local: *;
+};
diff --git a/drivers/baseband/meson.build b/drivers/baseband/meson.build
index 415b672..72301ce 100644
--- a/drivers/baseband/meson.build
+++ b/drivers/baseband/meson.build
@@ -5,7 +5,7 @@ if is_windows
 	subdir_done()
 endif
 
-drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec']
+drivers = ['null', 'turbo_sw', 'fpga_lte_fec', 'fpga_5gnr_fec', 'acc100']
 
 config_flag_fmt = 'RTE_LIBRTE_PMD_BBDEV_@0@'
 driver_name_fmt = 'rte_pmd_bbdev_@0@'
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v12 02/10] baseband/acc100: add register definition file
  2020-10-05 22:12   ` [dpdk-dev] [PATCH v12 00/10] bbdev PMD ACC100 Nicolas Chautru
  2020-10-05 22:12     ` [dpdk-dev] [PATCH v12 01/10] drivers/baseband: add PMD for ACC100 Nicolas Chautru
@ 2020-10-05 22:12     ` Nicolas Chautru
  2020-10-05 22:12     ` [dpdk-dev] [PATCH v12 03/10] baseband/acc100: add info get function Nicolas Chautru
                       ` (8 subsequent siblings)
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-10-05 22:12 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, trix, maxime.coquelin, ferruh.yigit,
	tianjiao.liu, Nicolas Chautru

Add in the list of registers for the device and related
HW specs definitions.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Reviewed-by: Tom Rix <trix@redhat.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 drivers/baseband/acc100/acc100_pf_enum.h | 1068 ++++++++++++++++++++++++++++++
 drivers/baseband/acc100/acc100_vf_enum.h |   73 ++
 drivers/baseband/acc100/rte_acc100_pmd.h |  487 ++++++++++++++
 3 files changed, 1628 insertions(+)
 create mode 100644 drivers/baseband/acc100/acc100_pf_enum.h
 create mode 100644 drivers/baseband/acc100/acc100_vf_enum.h

diff --git a/drivers/baseband/acc100/acc100_pf_enum.h b/drivers/baseband/acc100/acc100_pf_enum.h
new file mode 100644
index 0000000..a1ee416
--- /dev/null
+++ b/drivers/baseband/acc100/acc100_pf_enum.h
@@ -0,0 +1,1068 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017 Intel Corporation
+ */
+
+#ifndef ACC100_PF_ENUM_H
+#define ACC100_PF_ENUM_H
+
+/*
+ * ACC100 Register mapping on PF BAR0
+ * This is automatically generated from RDL, format may change with new RDL
+ * Release.
+ * Variable names are as is
+ */
+enum {
+	HWPfQmgrEgressQueuesTemplate          =  0x0007FE00,
+	HWPfQmgrIngressAq                     =  0x00080000,
+	HWPfQmgrArbQAvail                     =  0x00A00010,
+	HWPfQmgrArbQBlock                     =  0x00A00014,
+	HWPfQmgrAqueueDropNotifEn             =  0x00A00024,
+	HWPfQmgrAqueueDisableNotifEn          =  0x00A00028,
+	HWPfQmgrSoftReset                     =  0x00A00038,
+	HWPfQmgrInitStatus                    =  0x00A0003C,
+	HWPfQmgrAramWatchdogCount             =  0x00A00040,
+	HWPfQmgrAramWatchdogCounterEn         =  0x00A00044,
+	HWPfQmgrAxiWatchdogCount              =  0x00A00048,
+	HWPfQmgrAxiWatchdogCounterEn          =  0x00A0004C,
+	HWPfQmgrProcessWatchdogCount          =  0x00A00050,
+	HWPfQmgrProcessWatchdogCounterEn      =  0x00A00054,
+	HWPfQmgrProcessUl4GWatchdogCounter    =  0x00A00058,
+	HWPfQmgrProcessDl4GWatchdogCounter    =  0x00A0005C,
+	HWPfQmgrProcessUl5GWatchdogCounter    =  0x00A00060,
+	HWPfQmgrProcessDl5GWatchdogCounter    =  0x00A00064,
+	HWPfQmgrProcessMldWatchdogCounter     =  0x00A00068,
+	HWPfQmgrMsiOverflowUpperVf            =  0x00A00070,
+	HWPfQmgrMsiOverflowLowerVf            =  0x00A00074,
+	HWPfQmgrMsiWatchdogOverflow           =  0x00A00078,
+	HWPfQmgrMsiOverflowEnable             =  0x00A0007C,
+	HWPfQmgrDebugAqPointerMemGrp          =  0x00A00100,
+	HWPfQmgrDebugOutputArbQFifoGrp        =  0x00A00140,
+	HWPfQmgrDebugMsiFifoGrp               =  0x00A00180,
+	HWPfQmgrDebugAxiWdTimeoutMsiFifo      =  0x00A001C0,
+	HWPfQmgrDebugProcessWdTimeoutMsiFifo  =  0x00A001C4,
+	HWPfQmgrDepthLog2Grp                  =  0x00A00200,
+	HWPfQmgrTholdGrp                      =  0x00A00300,
+	HWPfQmgrGrpTmplateReg0Indx            =  0x00A00600,
+	HWPfQmgrGrpTmplateReg1Indx            =  0x00A00680,
+	HWPfQmgrGrpTmplateReg2indx            =  0x00A00700,
+	HWPfQmgrGrpTmplateReg3Indx            =  0x00A00780,
+	HWPfQmgrGrpTmplateReg4Indx            =  0x00A00800,
+	HWPfQmgrVfBaseAddr                    =  0x00A01000,
+	HWPfQmgrUl4GWeightRrVf                =  0x00A02000,
+	HWPfQmgrDl4GWeightRrVf                =  0x00A02100,
+	HWPfQmgrUl5GWeightRrVf                =  0x00A02200,
+	HWPfQmgrDl5GWeightRrVf                =  0x00A02300,
+	HWPfQmgrMldWeightRrVf                 =  0x00A02400,
+	HWPfQmgrArbQDepthGrp                  =  0x00A02F00,
+	HWPfQmgrGrpFunction0                  =  0x00A02F40,
+	HWPfQmgrGrpFunction1                  =  0x00A02F44,
+	HWPfQmgrGrpPriority                   =  0x00A02F48,
+	HWPfQmgrWeightSync                    =  0x00A03000,
+	HWPfQmgrAqEnableVf                    =  0x00A10000,
+	HWPfQmgrAqResetVf                     =  0x00A20000,
+	HWPfQmgrRingSizeVf                    =  0x00A20004,
+	HWPfQmgrGrpDepthLog20Vf               =  0x00A20008,
+	HWPfQmgrGrpDepthLog21Vf               =  0x00A2000C,
+	HWPfQmgrGrpFunction0Vf                =  0x00A20010,
+	HWPfQmgrGrpFunction1Vf                =  0x00A20014,
+	HWPfDmaConfig0Reg                     =  0x00B80000,
+	HWPfDmaConfig1Reg                     =  0x00B80004,
+	HWPfDmaQmgrAddrReg                    =  0x00B80008,
+	HWPfDmaSoftResetReg                   =  0x00B8000C,
+	HWPfDmaAxcacheReg                     =  0x00B80010,
+	HWPfDmaVersionReg                     =  0x00B80014,
+	HWPfDmaFrameThreshold                 =  0x00B80018,
+	HWPfDmaTimestampLo                    =  0x00B8001C,
+	HWPfDmaTimestampHi                    =  0x00B80020,
+	HWPfDmaAxiStatus                      =  0x00B80028,
+	HWPfDmaAxiControl                     =  0x00B8002C,
+	HWPfDmaNoQmgr                         =  0x00B80030,
+	HWPfDmaQosScale                       =  0x00B80034,
+	HWPfDmaQmanen                         =  0x00B80040,
+	HWPfDmaQmgrQosBase                    =  0x00B80060,
+	HWPfDmaFecClkGatingEnable             =  0x00B80080,
+	HWPfDmaPmEnable                       =  0x00B80084,
+	HWPfDmaQosEnable                      =  0x00B80088,
+	HWPfDmaHarqWeightedRrFrameThreshold   =  0x00B800B0,
+	HWPfDmaDataSmallWeightedRrFrameThresh =  0x00B800B4,
+	HWPfDmaDataLargeWeightedRrFrameThresh =  0x00B800B8,
+	HWPfDmaInboundCbMaxSize               =  0x00B800BC,
+	HWPfDmaInboundDrainDataSize           =  0x00B800C0,
+	HWPfDmaVfDdrBaseRw                    =  0x00B80400,
+	HWPfDmaCmplTmOutCnt                   =  0x00B80800,
+	HWPfDmaProcTmOutCnt                   =  0x00B80804,
+	HWPfDmaStatusRrespBresp               =  0x00B80810,
+	HWPfDmaCfgRrespBresp                  =  0x00B80814,
+	HWPfDmaStatusMemParErr                =  0x00B80818,
+	HWPfDmaCfgMemParErrEn                 =  0x00B8081C,
+	HWPfDmaStatusDmaHwErr                 =  0x00B80820,
+	HWPfDmaCfgDmaHwErrEn                  =  0x00B80824,
+	HWPfDmaStatusFecCoreErr               =  0x00B80828,
+	HWPfDmaCfgFecCoreErrEn                =  0x00B8082C,
+	HWPfDmaStatusFcwDescrErr              =  0x00B80830,
+	HWPfDmaCfgFcwDescrErrEn               =  0x00B80834,
+	HWPfDmaStatusBlockTransmit            =  0x00B80838,
+	HWPfDmaBlockOnErrEn                   =  0x00B8083C,
+	HWPfDmaStatusFlushDma                 =  0x00B80840,
+	HWPfDmaFlushDmaOnErrEn                =  0x00B80844,
+	HWPfDmaStatusSdoneFifoFull            =  0x00B80848,
+	HWPfDmaStatusDescriptorErrLoVf        =  0x00B8084C,
+	HWPfDmaStatusDescriptorErrHiVf        =  0x00B80850,
+	HWPfDmaStatusFcwErrLoVf               =  0x00B80854,
+	HWPfDmaStatusFcwErrHiVf               =  0x00B80858,
+	HWPfDmaStatusDataErrLoVf              =  0x00B8085C,
+	HWPfDmaStatusDataErrHiVf              =  0x00B80860,
+	HWPfDmaCfgMsiEnSoftwareErr            =  0x00B80864,
+	HWPfDmaDescriptorSignatuture          =  0x00B80868,
+	HWPfDmaFcwSignature                   =  0x00B8086C,
+	HWPfDmaErrorDetectionEn               =  0x00B80870,
+	HWPfDmaErrCntrlFifoDebug              =  0x00B8087C,
+	HWPfDmaStatusToutData                 =  0x00B80880,
+	HWPfDmaStatusToutDesc                 =  0x00B80884,
+	HWPfDmaStatusToutUnexpData            =  0x00B80888,
+	HWPfDmaStatusToutUnexpDesc            =  0x00B8088C,
+	HWPfDmaStatusToutProcess              =  0x00B80890,
+	HWPfDmaConfigCtoutOutDataEn           =  0x00B808A0,
+	HWPfDmaConfigCtoutOutDescrEn          =  0x00B808A4,
+	HWPfDmaConfigUnexpComplDataEn         =  0x00B808A8,
+	HWPfDmaConfigUnexpComplDescrEn        =  0x00B808AC,
+	HWPfDmaConfigPtoutOutEn               =  0x00B808B0,
+	HWPfDmaFec5GulDescBaseLoRegVf         =  0x00B88020,
+	HWPfDmaFec5GulDescBaseHiRegVf         =  0x00B88024,
+	HWPfDmaFec5GulRespPtrLoRegVf          =  0x00B88028,
+	HWPfDmaFec5GulRespPtrHiRegVf          =  0x00B8802C,
+	HWPfDmaFec5GdlDescBaseLoRegVf         =  0x00B88040,
+	HWPfDmaFec5GdlDescBaseHiRegVf         =  0x00B88044,
+	HWPfDmaFec5GdlRespPtrLoRegVf          =  0x00B88048,
+	HWPfDmaFec5GdlRespPtrHiRegVf          =  0x00B8804C,
+	HWPfDmaFec4GulDescBaseLoRegVf         =  0x00B88060,
+	HWPfDmaFec4GulDescBaseHiRegVf         =  0x00B88064,
+	HWPfDmaFec4GulRespPtrLoRegVf          =  0x00B88068,
+	HWPfDmaFec4GulRespPtrHiRegVf          =  0x00B8806C,
+	HWPfDmaFec4GdlDescBaseLoRegVf         =  0x00B88080,
+	HWPfDmaFec4GdlDescBaseHiRegVf         =  0x00B88084,
+	HWPfDmaFec4GdlRespPtrLoRegVf          =  0x00B88088,
+	HWPfDmaFec4GdlRespPtrHiRegVf          =  0x00B8808C,
+	HWPfDmaVfDdrBaseRangeRo               =  0x00B880A0,
+	HWPfQosmonACntrlReg                   =  0x00B90000,
+	HWPfQosmonAEvalOverflow0              =  0x00B90008,
+	HWPfQosmonAEvalOverflow1              =  0x00B9000C,
+	HWPfQosmonADivTerm                    =  0x00B90010,
+	HWPfQosmonATickTerm                   =  0x00B90014,
+	HWPfQosmonAEvalTerm                   =  0x00B90018,
+	HWPfQosmonAAveTerm                    =  0x00B9001C,
+	HWPfQosmonAForceEccErr                =  0x00B90020,
+	HWPfQosmonAEccErrDetect               =  0x00B90024,
+	HWPfQosmonAIterationConfig0Low        =  0x00B90060,
+	HWPfQosmonAIterationConfig0High       =  0x00B90064,
+	HWPfQosmonAIterationConfig1Low        =  0x00B90068,
+	HWPfQosmonAIterationConfig1High       =  0x00B9006C,
+	HWPfQosmonAIterationConfig2Low        =  0x00B90070,
+	HWPfQosmonAIterationConfig2High       =  0x00B90074,
+	HWPfQosmonAIterationConfig3Low        =  0x00B90078,
+	HWPfQosmonAIterationConfig3High       =  0x00B9007C,
+	HWPfQosmonAEvalMemAddr                =  0x00B90080,
+	HWPfQosmonAEvalMemData                =  0x00B90084,
+	HWPfQosmonAXaction                    =  0x00B900C0,
+	HWPfQosmonARemThres1Vf                =  0x00B90400,
+	HWPfQosmonAThres2Vf                   =  0x00B90404,
+	HWPfQosmonAWeiFracVf                  =  0x00B90408,
+	HWPfQosmonARrWeiVf                    =  0x00B9040C,
+	HWPfPermonACntrlRegVf                 =  0x00B98000,
+	HWPfPermonACountVf                    =  0x00B98008,
+	HWPfPermonAKCntLoVf                   =  0x00B98010,
+	HWPfPermonAKCntHiVf                   =  0x00B98014,
+	HWPfPermonADeltaCntLoVf               =  0x00B98020,
+	HWPfPermonADeltaCntHiVf               =  0x00B98024,
+	HWPfPermonAVersionReg                 =  0x00B9C000,
+	HWPfPermonACbControlFec               =  0x00B9C0F0,
+	HWPfPermonADltTimerLoFec              =  0x00B9C0F4,
+	HWPfPermonADltTimerHiFec              =  0x00B9C0F8,
+	HWPfPermonACbCountFec                 =  0x00B9C100,
+	HWPfPermonAAccExecTimerLoFec          =  0x00B9C104,
+	HWPfPermonAAccExecTimerHiFec          =  0x00B9C108,
+	HWPfPermonAExecTimerMinFec            =  0x00B9C200,
+	HWPfPermonAExecTimerMaxFec            =  0x00B9C204,
+	HWPfPermonAControlBusMon              =  0x00B9C400,
+	HWPfPermonAConfigBusMon               =  0x00B9C404,
+	HWPfPermonASkipCountBusMon            =  0x00B9C408,
+	HWPfPermonAMinLatBusMon               =  0x00B9C40C,
+	HWPfPermonAMaxLatBusMon               =  0x00B9C500,
+	HWPfPermonATotalLatLowBusMon          =  0x00B9C504,
+	HWPfPermonATotalLatUpperBusMon        =  0x00B9C508,
+	HWPfPermonATotalReqCntBusMon          =  0x00B9C50C,
+	HWPfQosmonBCntrlReg                   =  0x00BA0000,
+	HWPfQosmonBEvalOverflow0              =  0x00BA0008,
+	HWPfQosmonBEvalOverflow1              =  0x00BA000C,
+	HWPfQosmonBDivTerm                    =  0x00BA0010,
+	HWPfQosmonBTickTerm                   =  0x00BA0014,
+	HWPfQosmonBEvalTerm                   =  0x00BA0018,
+	HWPfQosmonBAveTerm                    =  0x00BA001C,
+	HWPfQosmonBForceEccErr                =  0x00BA0020,
+	HWPfQosmonBEccErrDetect               =  0x00BA0024,
+	HWPfQosmonBIterationConfig0Low        =  0x00BA0060,
+	HWPfQosmonBIterationConfig0High       =  0x00BA0064,
+	HWPfQosmonBIterationConfig1Low        =  0x00BA0068,
+	HWPfQosmonBIterationConfig1High       =  0x00BA006C,
+	HWPfQosmonBIterationConfig2Low        =  0x00BA0070,
+	HWPfQosmonBIterationConfig2High       =  0x00BA0074,
+	HWPfQosmonBIterationConfig3Low        =  0x00BA0078,
+	HWPfQosmonBIterationConfig3High       =  0x00BA007C,
+	HWPfQosmonBEvalMemAddr                =  0x00BA0080,
+	HWPfQosmonBEvalMemData                =  0x00BA0084,
+	HWPfQosmonBXaction                    =  0x00BA00C0,
+	HWPfQosmonBRemThres1Vf                =  0x00BA0400,
+	HWPfQosmonBThres2Vf                   =  0x00BA0404,
+	HWPfQosmonBWeiFracVf                  =  0x00BA0408,
+	HWPfQosmonBRrWeiVf                    =  0x00BA040C,
+	HWPfPermonBCntrlRegVf                 =  0x00BA8000,
+	HWPfPermonBCountVf                    =  0x00BA8008,
+	HWPfPermonBKCntLoVf                   =  0x00BA8010,
+	HWPfPermonBKCntHiVf                   =  0x00BA8014,
+	HWPfPermonBDeltaCntLoVf               =  0x00BA8020,
+	HWPfPermonBDeltaCntHiVf               =  0x00BA8024,
+	HWPfPermonBVersionReg                 =  0x00BAC000,
+	HWPfPermonBCbControlFec               =  0x00BAC0F0,
+	HWPfPermonBDltTimerLoFec              =  0x00BAC0F4,
+	HWPfPermonBDltTimerHiFec              =  0x00BAC0F8,
+	HWPfPermonBCbCountFec                 =  0x00BAC100,
+	HWPfPermonBAccExecTimerLoFec          =  0x00BAC104,
+	HWPfPermonBAccExecTimerHiFec          =  0x00BAC108,
+	HWPfPermonBExecTimerMinFec            =  0x00BAC200,
+	HWPfPermonBExecTimerMaxFec            =  0x00BAC204,
+	HWPfPermonBControlBusMon              =  0x00BAC400,
+	HWPfPermonBConfigBusMon               =  0x00BAC404,
+	HWPfPermonBSkipCountBusMon            =  0x00BAC408,
+	HWPfPermonBMinLatBusMon               =  0x00BAC40C,
+	HWPfPermonBMaxLatBusMon               =  0x00BAC500,
+	HWPfPermonBTotalLatLowBusMon          =  0x00BAC504,
+	HWPfPermonBTotalLatUpperBusMon        =  0x00BAC508,
+	HWPfPermonBTotalReqCntBusMon          =  0x00BAC50C,
+	HWPfFecUl5gCntrlReg                   =  0x00BC0000,
+	HWPfFecUl5gI2MThreshReg               =  0x00BC0004,
+	HWPfFecUl5gVersionReg                 =  0x00BC0100,
+	HWPfFecUl5gFcwStatusReg               =  0x00BC0104,
+	HWPfFecUl5gWarnReg                    =  0x00BC0108,
+	HwPfFecUl5gIbDebugReg                 =  0x00BC0200,
+	HwPfFecUl5gObLlrDebugReg              =  0x00BC0204,
+	HwPfFecUl5gObHarqDebugReg             =  0x00BC0208,
+	HwPfFecUl5g1CntrlReg                  =  0x00BC1000,
+	HwPfFecUl5g1I2MThreshReg              =  0x00BC1004,
+	HwPfFecUl5g1VersionReg                =  0x00BC1100,
+	HwPfFecUl5g1FcwStatusReg              =  0x00BC1104,
+	HwPfFecUl5g1WarnReg                   =  0x00BC1108,
+	HwPfFecUl5g1IbDebugReg                =  0x00BC1200,
+	HwPfFecUl5g1ObLlrDebugReg             =  0x00BC1204,
+	HwPfFecUl5g1ObHarqDebugReg            =  0x00BC1208,
+	HwPfFecUl5g2CntrlReg                  =  0x00BC2000,
+	HwPfFecUl5g2I2MThreshReg              =  0x00BC2004,
+	HwPfFecUl5g2VersionReg                =  0x00BC2100,
+	HwPfFecUl5g2FcwStatusReg              =  0x00BC2104,
+	HwPfFecUl5g2WarnReg                   =  0x00BC2108,
+	HwPfFecUl5g2IbDebugReg                =  0x00BC2200,
+	HwPfFecUl5g2ObLlrDebugReg             =  0x00BC2204,
+	HwPfFecUl5g2ObHarqDebugReg            =  0x00BC2208,
+	HwPfFecUl5g3CntrlReg                  =  0x00BC3000,
+	HwPfFecUl5g3I2MThreshReg              =  0x00BC3004,
+	HwPfFecUl5g3VersionReg                =  0x00BC3100,
+	HwPfFecUl5g3FcwStatusReg              =  0x00BC3104,
+	HwPfFecUl5g3WarnReg                   =  0x00BC3108,
+	HwPfFecUl5g3IbDebugReg                =  0x00BC3200,
+	HwPfFecUl5g3ObLlrDebugReg             =  0x00BC3204,
+	HwPfFecUl5g3ObHarqDebugReg            =  0x00BC3208,
+	HwPfFecUl5g4CntrlReg                  =  0x00BC4000,
+	HwPfFecUl5g4I2MThreshReg              =  0x00BC4004,
+	HwPfFecUl5g4VersionReg                =  0x00BC4100,
+	HwPfFecUl5g4FcwStatusReg              =  0x00BC4104,
+	HwPfFecUl5g4WarnReg                   =  0x00BC4108,
+	HwPfFecUl5g4IbDebugReg                =  0x00BC4200,
+	HwPfFecUl5g4ObLlrDebugReg             =  0x00BC4204,
+	HwPfFecUl5g4ObHarqDebugReg            =  0x00BC4208,
+	HwPfFecUl5g5CntrlReg                  =  0x00BC5000,
+	HwPfFecUl5g5I2MThreshReg              =  0x00BC5004,
+	HwPfFecUl5g5VersionReg                =  0x00BC5100,
+	HwPfFecUl5g5FcwStatusReg              =  0x00BC5104,
+	HwPfFecUl5g5WarnReg                   =  0x00BC5108,
+	HwPfFecUl5g5IbDebugReg                =  0x00BC5200,
+	HwPfFecUl5g5ObLlrDebugReg             =  0x00BC5204,
+	HwPfFecUl5g5ObHarqDebugReg            =  0x00BC5208,
+	HwPfFecUl5g6CntrlReg                  =  0x00BC6000,
+	HwPfFecUl5g6I2MThreshReg              =  0x00BC6004,
+	HwPfFecUl5g6VersionReg                =  0x00BC6100,
+	HwPfFecUl5g6FcwStatusReg              =  0x00BC6104,
+	HwPfFecUl5g6WarnReg                   =  0x00BC6108,
+	HwPfFecUl5g6IbDebugReg                =  0x00BC6200,
+	HwPfFecUl5g6ObLlrDebugReg             =  0x00BC6204,
+	HwPfFecUl5g6ObHarqDebugReg            =  0x00BC6208,
+	HwPfFecUl5g7CntrlReg                  =  0x00BC7000,
+	HwPfFecUl5g7I2MThreshReg              =  0x00BC7004,
+	HwPfFecUl5g7VersionReg                =  0x00BC7100,
+	HwPfFecUl5g7FcwStatusReg              =  0x00BC7104,
+	HwPfFecUl5g7WarnReg                   =  0x00BC7108,
+	HwPfFecUl5g7IbDebugReg                =  0x00BC7200,
+	HwPfFecUl5g7ObLlrDebugReg             =  0x00BC7204,
+	HwPfFecUl5g7ObHarqDebugReg            =  0x00BC7208,
+	HwPfFecUl5g8CntrlReg                  =  0x00BC8000,
+	HwPfFecUl5g8I2MThreshReg              =  0x00BC8004,
+	HwPfFecUl5g8VersionReg                =  0x00BC8100,
+	HwPfFecUl5g8FcwStatusReg              =  0x00BC8104,
+	HwPfFecUl5g8WarnReg                   =  0x00BC8108,
+	HwPfFecUl5g8IbDebugReg                =  0x00BC8200,
+	HwPfFecUl5g8ObLlrDebugReg             =  0x00BC8204,
+	HwPfFecUl5g8ObHarqDebugReg            =  0x00BC8208,
+	HWPfFecDl5gCntrlReg                   =  0x00BCF000,
+	HWPfFecDl5gI2MThreshReg               =  0x00BCF004,
+	HWPfFecDl5gVersionReg                 =  0x00BCF100,
+	HWPfFecDl5gFcwStatusReg               =  0x00BCF104,
+	HWPfFecDl5gWarnReg                    =  0x00BCF108,
+	HWPfFecUlVersionReg                   =  0x00BD0000,
+	HWPfFecUlControlReg                   =  0x00BD0004,
+	HWPfFecUlStatusReg                    =  0x00BD0008,
+	HWPfFecDlVersionReg                   =  0x00BDF000,
+	HWPfFecDlClusterConfigReg             =  0x00BDF004,
+	HWPfFecDlBurstThres                   =  0x00BDF00C,
+	HWPfFecDlClusterStatusReg0            =  0x00BDF040,
+	HWPfFecDlClusterStatusReg1            =  0x00BDF044,
+	HWPfFecDlClusterStatusReg2            =  0x00BDF048,
+	HWPfFecDlClusterStatusReg3            =  0x00BDF04C,
+	HWPfFecDlClusterStatusReg4            =  0x00BDF050,
+	HWPfFecDlClusterStatusReg5            =  0x00BDF054,
+	HWPfChaFabPllPllrst                   =  0x00C40000,
+	HWPfChaFabPllClk0                     =  0x00C40004,
+	HWPfChaFabPllClk1                     =  0x00C40008,
+	HWPfChaFabPllBwadj                    =  0x00C4000C,
+	HWPfChaFabPllLbw                      =  0x00C40010,
+	HWPfChaFabPllResetq                   =  0x00C40014,
+	HWPfChaFabPllPhshft0                  =  0x00C40018,
+	HWPfChaFabPllPhshft1                  =  0x00C4001C,
+	HWPfChaFabPllDivq0                    =  0x00C40020,
+	HWPfChaFabPllDivq1                    =  0x00C40024,
+	HWPfChaFabPllDivq2                    =  0x00C40028,
+	HWPfChaFabPllDivq3                    =  0x00C4002C,
+	HWPfChaFabPllDivq4                    =  0x00C40030,
+	HWPfChaFabPllDivq5                    =  0x00C40034,
+	HWPfChaFabPllDivq6                    =  0x00C40038,
+	HWPfChaFabPllDivq7                    =  0x00C4003C,
+	HWPfChaDl5gPllPllrst                  =  0x00C40080,
+	HWPfChaDl5gPllClk0                    =  0x00C40084,
+	HWPfChaDl5gPllClk1                    =  0x00C40088,
+	HWPfChaDl5gPllBwadj                   =  0x00C4008C,
+	HWPfChaDl5gPllLbw                     =  0x00C40090,
+	HWPfChaDl5gPllResetq                  =  0x00C40094,
+	HWPfChaDl5gPllPhshft0                 =  0x00C40098,
+	HWPfChaDl5gPllPhshft1                 =  0x00C4009C,
+	HWPfChaDl5gPllDivq0                   =  0x00C400A0,
+	HWPfChaDl5gPllDivq1                   =  0x00C400A4,
+	HWPfChaDl5gPllDivq2                   =  0x00C400A8,
+	HWPfChaDl5gPllDivq3                   =  0x00C400AC,
+	HWPfChaDl5gPllDivq4                   =  0x00C400B0,
+	HWPfChaDl5gPllDivq5                   =  0x00C400B4,
+	HWPfChaDl5gPllDivq6                   =  0x00C400B8,
+	HWPfChaDl5gPllDivq7                   =  0x00C400BC,
+	HWPfChaDl4gPllPllrst                  =  0x00C40100,
+	HWPfChaDl4gPllClk0                    =  0x00C40104,
+	HWPfChaDl4gPllClk1                    =  0x00C40108,
+	HWPfChaDl4gPllBwadj                   =  0x00C4010C,
+	HWPfChaDl4gPllLbw                     =  0x00C40110,
+	HWPfChaDl4gPllResetq                  =  0x00C40114,
+	HWPfChaDl4gPllPhshft0                 =  0x00C40118,
+	HWPfChaDl4gPllPhshft1                 =  0x00C4011C,
+	HWPfChaDl4gPllDivq0                   =  0x00C40120,
+	HWPfChaDl4gPllDivq1                   =  0x00C40124,
+	HWPfChaDl4gPllDivq2                   =  0x00C40128,
+	HWPfChaDl4gPllDivq3                   =  0x00C4012C,
+	HWPfChaDl4gPllDivq4                   =  0x00C40130,
+	HWPfChaDl4gPllDivq5                   =  0x00C40134,
+	HWPfChaDl4gPllDivq6                   =  0x00C40138,
+	HWPfChaDl4gPllDivq7                   =  0x00C4013C,
+	HWPfChaUl5gPllPllrst                  =  0x00C40180,
+	HWPfChaUl5gPllClk0                    =  0x00C40184,
+	HWPfChaUl5gPllClk1                    =  0x00C40188,
+	HWPfChaUl5gPllBwadj                   =  0x00C4018C,
+	HWPfChaUl5gPllLbw                     =  0x00C40190,
+	HWPfChaUl5gPllResetq                  =  0x00C40194,
+	HWPfChaUl5gPllPhshft0                 =  0x00C40198,
+	HWPfChaUl5gPllPhshft1                 =  0x00C4019C,
+	HWPfChaUl5gPllDivq0                   =  0x00C401A0,
+	HWPfChaUl5gPllDivq1                   =  0x00C401A4,
+	HWPfChaUl5gPllDivq2                   =  0x00C401A8,
+	HWPfChaUl5gPllDivq3                   =  0x00C401AC,
+	HWPfChaUl5gPllDivq4                   =  0x00C401B0,
+	HWPfChaUl5gPllDivq5                   =  0x00C401B4,
+	HWPfChaUl5gPllDivq6                   =  0x00C401B8,
+	HWPfChaUl5gPllDivq7                   =  0x00C401BC,
+	HWPfChaUl4gPllPllrst                  =  0x00C40200,
+	HWPfChaUl4gPllClk0                    =  0x00C40204,
+	HWPfChaUl4gPllClk1                    =  0x00C40208,
+	HWPfChaUl4gPllBwadj                   =  0x00C4020C,
+	HWPfChaUl4gPllLbw                     =  0x00C40210,
+	HWPfChaUl4gPllResetq                  =  0x00C40214,
+	HWPfChaUl4gPllPhshft0                 =  0x00C40218,
+	HWPfChaUl4gPllPhshft1                 =  0x00C4021C,
+	HWPfChaUl4gPllDivq0                   =  0x00C40220,
+	HWPfChaUl4gPllDivq1                   =  0x00C40224,
+	HWPfChaUl4gPllDivq2                   =  0x00C40228,
+	HWPfChaUl4gPllDivq3                   =  0x00C4022C,
+	HWPfChaUl4gPllDivq4                   =  0x00C40230,
+	HWPfChaUl4gPllDivq5                   =  0x00C40234,
+	HWPfChaUl4gPllDivq6                   =  0x00C40238,
+	HWPfChaUl4gPllDivq7                   =  0x00C4023C,
+	HWPfChaDdrPllPllrst                   =  0x00C40280,
+	HWPfChaDdrPllClk0                     =  0x00C40284,
+	HWPfChaDdrPllClk1                     =  0x00C40288,
+	HWPfChaDdrPllBwadj                    =  0x00C4028C,
+	HWPfChaDdrPllLbw                      =  0x00C40290,
+	HWPfChaDdrPllResetq                   =  0x00C40294,
+	HWPfChaDdrPllPhshft0                  =  0x00C40298,
+	HWPfChaDdrPllPhshft1                  =  0x00C4029C,
+	HWPfChaDdrPllDivq0                    =  0x00C402A0,
+	HWPfChaDdrPllDivq1                    =  0x00C402A4,
+	HWPfChaDdrPllDivq2                    =  0x00C402A8,
+	HWPfChaDdrPllDivq3                    =  0x00C402AC,
+	HWPfChaDdrPllDivq4                    =  0x00C402B0,
+	HWPfChaDdrPllDivq5                    =  0x00C402B4,
+	HWPfChaDdrPllDivq6                    =  0x00C402B8,
+	HWPfChaDdrPllDivq7                    =  0x00C402BC,
+	HWPfChaErrStatus                      =  0x00C40400,
+	HWPfChaErrMask                        =  0x00C40404,
+	HWPfChaDebugPcieMsiFifo               =  0x00C40410,
+	HWPfChaDebugDdrMsiFifo                =  0x00C40414,
+	HWPfChaDebugMiscMsiFifo               =  0x00C40418,
+	HWPfChaPwmSet                         =  0x00C40420,
+	HWPfChaDdrRstStatus                   =  0x00C40430,
+	HWPfChaDdrStDoneStatus                =  0x00C40434,
+	HWPfChaDdrWbRstCfg                    =  0x00C40438,
+	HWPfChaDdrApbRstCfg                   =  0x00C4043C,
+	HWPfChaDdrPhyRstCfg                   =  0x00C40440,
+	HWPfChaDdrCpuRstCfg                   =  0x00C40444,
+	HWPfChaDdrSifRstCfg                   =  0x00C40448,
+	HWPfChaPadcfgPcomp0                   =  0x00C41000,
+	HWPfChaPadcfgNcomp0                   =  0x00C41004,
+	HWPfChaPadcfgOdt0                     =  0x00C41008,
+	HWPfChaPadcfgProtect0                 =  0x00C4100C,
+	HWPfChaPreemphasisProtect0            =  0x00C41010,
+	HWPfChaPreemphasisCompen0             =  0x00C41040,
+	HWPfChaPreemphasisOdten0              =  0x00C41044,
+	HWPfChaPadcfgPcomp1                   =  0x00C41100,
+	HWPfChaPadcfgNcomp1                   =  0x00C41104,
+	HWPfChaPadcfgOdt1                     =  0x00C41108,
+	HWPfChaPadcfgProtect1                 =  0x00C4110C,
+	HWPfChaPreemphasisProtect1            =  0x00C41110,
+	HWPfChaPreemphasisCompen1             =  0x00C41140,
+	HWPfChaPreemphasisOdten1              =  0x00C41144,
+	HWPfChaPadcfgPcomp2                   =  0x00C41200,
+	HWPfChaPadcfgNcomp2                   =  0x00C41204,
+	HWPfChaPadcfgOdt2                     =  0x00C41208,
+	HWPfChaPadcfgProtect2                 =  0x00C4120C,
+	HWPfChaPreemphasisProtect2            =  0x00C41210,
+	HWPfChaPreemphasisCompen2             =  0x00C41240,
+	HWPfChaPreemphasisOdten2              =  0x00C41244,
+	HWPfChaPadcfgPcomp3                   =  0x00C41300,
+	HWPfChaPadcfgNcomp3                   =  0x00C41304,
+	HWPfChaPadcfgOdt3                     =  0x00C41308,
+	HWPfChaPadcfgProtect3                 =  0x00C4130C,
+	HWPfChaPreemphasisProtect3            =  0x00C41310,
+	HWPfChaPreemphasisCompen3             =  0x00C41340,
+	HWPfChaPreemphasisOdten3              =  0x00C41344,
+	HWPfChaPadcfgPcomp4                   =  0x00C41400,
+	HWPfChaPadcfgNcomp4                   =  0x00C41404,
+	HWPfChaPadcfgOdt4                     =  0x00C41408,
+	HWPfChaPadcfgProtect4                 =  0x00C4140C,
+	HWPfChaPreemphasisProtect4            =  0x00C41410,
+	HWPfChaPreemphasisCompen4             =  0x00C41440,
+	HWPfChaPreemphasisOdten4              =  0x00C41444,
+	HWPfHiVfToPfDbellVf                   =  0x00C80000,
+	HWPfHiPfToVfDbellVf                   =  0x00C80008,
+	HWPfHiInfoRingBaseLoVf                =  0x00C80010,
+	HWPfHiInfoRingBaseHiVf                =  0x00C80014,
+	HWPfHiInfoRingPointerVf               =  0x00C80018,
+	HWPfHiInfoRingIntWrEnVf               =  0x00C80020,
+	HWPfHiInfoRingPf2VfWrEnVf             =  0x00C80024,
+	HWPfHiMsixVectorMapperVf              =  0x00C80060,
+	HWPfHiModuleVersionReg                =  0x00C84000,
+	HWPfHiIosf2axiErrLogReg               =  0x00C84004,
+	HWPfHiHardResetReg                    =  0x00C84008,
+	HWPfHi5GHardResetReg                  =  0x00C8400C,
+	HWPfHiInfoRingBaseLoRegPf             =  0x00C84010,
+	HWPfHiInfoRingBaseHiRegPf             =  0x00C84014,
+	HWPfHiInfoRingPointerRegPf            =  0x00C84018,
+	HWPfHiInfoRingIntWrEnRegPf            =  0x00C84020,
+	HWPfHiInfoRingVf2pfLoWrEnReg          =  0x00C84024,
+	HWPfHiInfoRingVf2pfHiWrEnReg          =  0x00C84028,
+	HWPfHiLogParityErrStatusReg           =  0x00C8402C,
+	HWPfHiLogDataParityErrorVfStatusLo    =  0x00C84030,
+	HWPfHiLogDataParityErrorVfStatusHi    =  0x00C84034,
+	HWPfHiBlockTransmitOnErrorEn          =  0x00C84038,
+	HWPfHiCfgMsiIntWrEnRegPf              =  0x00C84040,
+	HWPfHiCfgMsiVf2pfLoWrEnReg            =  0x00C84044,
+	HWPfHiCfgMsiVf2pfHighWrEnReg          =  0x00C84048,
+	HWPfHiMsixVectorMapperPf              =  0x00C84060,
+	HWPfHiApbWrWaitTime                   =  0x00C84100,
+	HWPfHiXCounterMaxValue                =  0x00C84104,
+	HWPfHiPfMode                          =  0x00C84108,
+	HWPfHiClkGateHystReg                  =  0x00C8410C,
+	HWPfHiSnoopBitsReg                    =  0x00C84110,
+	HWPfHiMsiDropEnableReg                =  0x00C84114,
+	HWPfHiMsiStatReg                      =  0x00C84120,
+	HWPfHiFifoOflStatReg                  =  0x00C84124,
+	HWPfHiHiDebugReg                      =  0x00C841F4,
+	HWPfHiDebugMemSnoopMsiFifo            =  0x00C841F8,
+	HWPfHiDebugMemSnoopInputFifo          =  0x00C841FC,
+	HWPfHiMsixMappingConfig               =  0x00C84200,
+	HWPfHiJunkReg                         =  0x00C8FF00,
+	HWPfDdrUmmcVer                        =  0x00D00000,
+	HWPfDdrUmmcCap                        =  0x00D00010,
+	HWPfDdrUmmcCtrl                       =  0x00D00020,
+	HWPfDdrMpcPe                          =  0x00D00080,
+	HWPfDdrMpcPpri3                       =  0x00D00090,
+	HWPfDdrMpcPpri2                       =  0x00D000A0,
+	HWPfDdrMpcPpri1                       =  0x00D000B0,
+	HWPfDdrMpcPpri0                       =  0x00D000C0,
+	HWPfDdrMpcPrwgrpCtrl                  =  0x00D000D0,
+	HWPfDdrMpcPbw7                        =  0x00D000E0,
+	HWPfDdrMpcPbw6                        =  0x00D000F0,
+	HWPfDdrMpcPbw5                        =  0x00D00100,
+	HWPfDdrMpcPbw4                        =  0x00D00110,
+	HWPfDdrMpcPbw3                        =  0x00D00120,
+	HWPfDdrMpcPbw2                        =  0x00D00130,
+	HWPfDdrMpcPbw1                        =  0x00D00140,
+	HWPfDdrMpcPbw0                        =  0x00D00150,
+	HWPfDdrMemoryInit                     =  0x00D00200,
+	HWPfDdrMemoryInitDone                 =  0x00D00210,
+	HWPfDdrMemInitPhyTrng0                =  0x00D00240,
+	HWPfDdrMemInitPhyTrng1                =  0x00D00250,
+	HWPfDdrMemInitPhyTrng2                =  0x00D00260,
+	HWPfDdrMemInitPhyTrng3                =  0x00D00270,
+	HWPfDdrBcDram                         =  0x00D003C0,
+	HWPfDdrBcAddrMap                      =  0x00D003D0,
+	HWPfDdrBcRef                          =  0x00D003E0,
+	HWPfDdrBcTim0                         =  0x00D00400,
+	HWPfDdrBcTim1                         =  0x00D00410,
+	HWPfDdrBcTim2                         =  0x00D00420,
+	HWPfDdrBcTim3                         =  0x00D00430,
+	HWPfDdrBcTim4                         =  0x00D00440,
+	HWPfDdrBcTim5                         =  0x00D00450,
+	HWPfDdrBcTim6                         =  0x00D00460,
+	HWPfDdrBcTim7                         =  0x00D00470,
+	HWPfDdrBcTim8                         =  0x00D00480,
+	HWPfDdrBcTim9                         =  0x00D00490,
+	HWPfDdrBcTim10                        =  0x00D004A0,
+	HWPfDdrBcTim12                        =  0x00D004C0,
+	HWPfDdrDfiInit                        =  0x00D004D0,
+	HWPfDdrDfiInitComplete                =  0x00D004E0,
+	HWPfDdrDfiTim0                        =  0x00D004F0,
+	HWPfDdrDfiTim1                        =  0x00D00500,
+	HWPfDdrDfiPhyUpdEn                    =  0x00D00530,
+	HWPfDdrMemStatus                      =  0x00D00540,
+	HWPfDdrUmmcErrStatus                  =  0x00D00550,
+	HWPfDdrUmmcIntStatus                  =  0x00D00560,
+	HWPfDdrUmmcIntEn                      =  0x00D00570,
+	HWPfDdrPhyRdLatency                   =  0x00D48400,
+	HWPfDdrPhyRdLatencyDbi                =  0x00D48410,
+	HWPfDdrPhyWrLatency                   =  0x00D48420,
+	HWPfDdrPhyTrngType                    =  0x00D48430,
+	HWPfDdrPhyMrsTiming2                  =  0x00D48440,
+	HWPfDdrPhyMrsTiming0                  =  0x00D48450,
+	HWPfDdrPhyMrsTiming1                  =  0x00D48460,
+	HWPfDdrPhyDramTmrd                    =  0x00D48470,
+	HWPfDdrPhyDramTmod                    =  0x00D48480,
+	HWPfDdrPhyDramTwpre                   =  0x00D48490,
+	HWPfDdrPhyDramTrfc                    =  0x00D484A0,
+	HWPfDdrPhyDramTrwtp                   =  0x00D484B0,
+	HWPfDdrPhyMr01Dimm                    =  0x00D484C0,
+	HWPfDdrPhyMr01DimmDbi                 =  0x00D484D0,
+	HWPfDdrPhyMr23Dimm                    =  0x00D484E0,
+	HWPfDdrPhyMr45Dimm                    =  0x00D484F0,
+	HWPfDdrPhyMr67Dimm                    =  0x00D48500,
+	HWPfDdrPhyWrlvlWwRdlvlRr              =  0x00D48510,
+	HWPfDdrPhyOdtEn                       =  0x00D48520,
+	HWPfDdrPhyFastTrng                    =  0x00D48530,
+	HWPfDdrPhyDynTrngGap                  =  0x00D48540,
+	HWPfDdrPhyDynRcalGap                  =  0x00D48550,
+	HWPfDdrPhyIdletimeout                 =  0x00D48560,
+	HWPfDdrPhyRstCkeGap                   =  0x00D48570,
+	HWPfDdrPhyCkeMrsGap                   =  0x00D48580,
+	HWPfDdrPhyMemVrefMidVal               =  0x00D48590,
+	HWPfDdrPhyVrefStep                    =  0x00D485A0,
+	HWPfDdrPhyVrefThreshold               =  0x00D485B0,
+	HWPfDdrPhyPhyVrefMidVal               =  0x00D485C0,
+	HWPfDdrPhyDqsCountMax                 =  0x00D485D0,
+	HWPfDdrPhyDqsCountNum                 =  0x00D485E0,
+	HWPfDdrPhyDramRow                     =  0x00D485F0,
+	HWPfDdrPhyDramCol                     =  0x00D48600,
+	HWPfDdrPhyDramBgBa                    =  0x00D48610,
+	HWPfDdrPhyDynamicUpdreqrel            =  0x00D48620,
+	HWPfDdrPhyVrefLimits                  =  0x00D48630,
+	HWPfDdrPhyIdtmTcStatus                =  0x00D6C020,
+	HWPfDdrPhyIdtmFwVersion               =  0x00D6C410,
+	HWPfDdrPhyRdlvlGateInitDelay          =  0x00D70000,
+	HWPfDdrPhyRdenSmplabc                 =  0x00D70008,
+	HWPfDdrPhyVrefNibble0                 =  0x00D7000C,
+	HWPfDdrPhyVrefNibble1                 =  0x00D70010,
+	HWPfDdrPhyRdlvlGateDqsSmpl0           =  0x00D70014,
+	HWPfDdrPhyRdlvlGateDqsSmpl1           =  0x00D70018,
+	HWPfDdrPhyRdlvlGateDqsSmpl2           =  0x00D7001C,
+	HWPfDdrPhyDqsCount                    =  0x00D70020,
+	HWPfDdrPhyWrlvlRdlvlGateStatus        =  0x00D70024,
+	HWPfDdrPhyErrorFlags                  =  0x00D70028,
+	HWPfDdrPhyPowerDown                   =  0x00D70030,
+	HWPfDdrPhyPrbsSeedByte0               =  0x00D70034,
+	HWPfDdrPhyPrbsSeedByte1               =  0x00D70038,
+	HWPfDdrPhyPcompDq                     =  0x00D70040,
+	HWPfDdrPhyNcompDq                     =  0x00D70044,
+	HWPfDdrPhyPcompDqs                    =  0x00D70048,
+	HWPfDdrPhyNcompDqs                    =  0x00D7004C,
+	HWPfDdrPhyPcompCmd                    =  0x00D70050,
+	HWPfDdrPhyNcompCmd                    =  0x00D70054,
+	HWPfDdrPhyPcompCk                     =  0x00D70058,
+	HWPfDdrPhyNcompCk                     =  0x00D7005C,
+	HWPfDdrPhyRcalOdtDq                   =  0x00D70060,
+	HWPfDdrPhyRcalOdtDqs                  =  0x00D70064,
+	HWPfDdrPhyRcalMask1                   =  0x00D70068,
+	HWPfDdrPhyRcalMask2                   =  0x00D7006C,
+	HWPfDdrPhyRcalCtrl                    =  0x00D70070,
+	HWPfDdrPhyRcalCnt                     =  0x00D70074,
+	HWPfDdrPhyRcalOverride                =  0x00D70078,
+	HWPfDdrPhyRcalGateen                  =  0x00D7007C,
+	HWPfDdrPhyCtrl                        =  0x00D70080,
+	HWPfDdrPhyWrlvlAlg                    =  0x00D70084,
+	HWPfDdrPhyRcalVreftTxcmdOdt           =  0x00D70088,
+	HWPfDdrPhyRdlvlGateParam              =  0x00D7008C,
+	HWPfDdrPhyRdlvlGateParam2             =  0x00D70090,
+	HWPfDdrPhyRcalVreftTxdata             =  0x00D70094,
+	HWPfDdrPhyCmdIntDelay                 =  0x00D700A4,
+	HWPfDdrPhyAlertN                      =  0x00D700A8,
+	HWPfDdrPhyTrngReqWpre2tck             =  0x00D700AC,
+	HWPfDdrPhyCmdPhaseSel                 =  0x00D700B4,
+	HWPfDdrPhyCmdDcdl                     =  0x00D700B8,
+	HWPfDdrPhyCkDcdl                      =  0x00D700BC,
+	HWPfDdrPhySwTrngCtrl1                 =  0x00D700C0,
+	HWPfDdrPhySwTrngCtrl2                 =  0x00D700C4,
+	HWPfDdrPhyRcalPcompRden               =  0x00D700C8,
+	HWPfDdrPhyRcalNcompRden               =  0x00D700CC,
+	HWPfDdrPhyRcalCompen                  =  0x00D700D0,
+	HWPfDdrPhySwTrngRdqs                  =  0x00D700D4,
+	HWPfDdrPhySwTrngWdqs                  =  0x00D700D8,
+	HWPfDdrPhySwTrngRdena                 =  0x00D700DC,
+	HWPfDdrPhySwTrngRdenb                 =  0x00D700E0,
+	HWPfDdrPhySwTrngRdenc                 =  0x00D700E4,
+	HWPfDdrPhySwTrngWdq                   =  0x00D700E8,
+	HWPfDdrPhySwTrngRdq                   =  0x00D700EC,
+	HWPfDdrPhyPcfgHmValue                 =  0x00D700F0,
+	HWPfDdrPhyPcfgTimerValue              =  0x00D700F4,
+	HWPfDdrPhyPcfgSoftwareTraining        =  0x00D700F8,
+	HWPfDdrPhyPcfgMcStatus                =  0x00D700FC,
+	HWPfDdrPhyWrlvlPhRank0                =  0x00D70100,
+	HWPfDdrPhyRdenPhRank0                 =  0x00D70104,
+	HWPfDdrPhyRdenIntRank0                =  0x00D70108,
+	HWPfDdrPhyRdqsDcdlRank0               =  0x00D7010C,
+	HWPfDdrPhyRdqsShadowDcdlRank0         =  0x00D70110,
+	HWPfDdrPhyWdqsDcdlRank0               =  0x00D70114,
+	HWPfDdrPhyWdmDcdlShadowRank0          =  0x00D70118,
+	HWPfDdrPhyWdmDcdlRank0                =  0x00D7011C,
+	HWPfDdrPhyDbiDcdlRank0                =  0x00D70120,
+	HWPfDdrPhyRdenDcdlaRank0              =  0x00D70124,
+	HWPfDdrPhyDbiDcdlShadowRank0          =  0x00D70128,
+	HWPfDdrPhyRdenDcdlbRank0              =  0x00D7012C,
+	HWPfDdrPhyWdqsShadowDcdlRank0         =  0x00D70130,
+	HWPfDdrPhyRdenDcdlcRank0              =  0x00D70134,
+	HWPfDdrPhyRdenShadowDcdlaRank0        =  0x00D70138,
+	HWPfDdrPhyWrlvlIntRank0               =  0x00D7013C,
+	HWPfDdrPhyRdqDcdlBit0Rank0            =  0x00D70200,
+	HWPfDdrPhyRdqDcdlShadowBit0Rank0      =  0x00D70204,
+	HWPfDdrPhyWdqDcdlBit0Rank0            =  0x00D70208,
+	HWPfDdrPhyWdqDcdlShadowBit0Rank0      =  0x00D7020C,
+	HWPfDdrPhyRdqDcdlBit1Rank0            =  0x00D70240,
+	HWPfDdrPhyRdqDcdlShadowBit1Rank0      =  0x00D70244,
+	HWPfDdrPhyWdqDcdlBit1Rank0            =  0x00D70248,
+	HWPfDdrPhyWdqDcdlShadowBit1Rank0      =  0x00D7024C,
+	HWPfDdrPhyRdqDcdlBit2Rank0            =  0x00D70280,
+	HWPfDdrPhyRdqDcdlShadowBit2Rank0      =  0x00D70284,
+	HWPfDdrPhyWdqDcdlBit2Rank0            =  0x00D70288,
+	HWPfDdrPhyWdqDcdlShadowBit2Rank0      =  0x00D7028C,
+	HWPfDdrPhyRdqDcdlBit3Rank0            =  0x00D702C0,
+	HWPfDdrPhyRdqDcdlShadowBit3Rank0      =  0x00D702C4,
+	HWPfDdrPhyWdqDcdlBit3Rank0            =  0x00D702C8,
+	HWPfDdrPhyWdqDcdlShadowBit3Rank0      =  0x00D702CC,
+	HWPfDdrPhyRdqDcdlBit4Rank0            =  0x00D70300,
+	HWPfDdrPhyRdqDcdlShadowBit4Rank0      =  0x00D70304,
+	HWPfDdrPhyWdqDcdlBit4Rank0            =  0x00D70308,
+	HWPfDdrPhyWdqDcdlShadowBit4Rank0      =  0x00D7030C,
+	HWPfDdrPhyRdqDcdlBit5Rank0            =  0x00D70340,
+	HWPfDdrPhyRdqDcdlShadowBit5Rank0      =  0x00D70344,
+	HWPfDdrPhyWdqDcdlBit5Rank0            =  0x00D70348,
+	HWPfDdrPhyWdqDcdlShadowBit5Rank0      =  0x00D7034C,
+	HWPfDdrPhyRdqDcdlBit6Rank0            =  0x00D70380,
+	HWPfDdrPhyRdqDcdlShadowBit6Rank0      =  0x00D70384,
+	HWPfDdrPhyWdqDcdlBit6Rank0            =  0x00D70388,
+	HWPfDdrPhyWdqDcdlShadowBit6Rank0      =  0x00D7038C,
+	HWPfDdrPhyRdqDcdlBit7Rank0            =  0x00D703C0,
+	HWPfDdrPhyRdqDcdlShadowBit7Rank0      =  0x00D703C4,
+	HWPfDdrPhyWdqDcdlBit7Rank0            =  0x00D703C8,
+	HWPfDdrPhyWdqDcdlShadowBit7Rank0      =  0x00D703CC,
+	HWPfDdrPhyIdtmStatus                  =  0x00D740D0,
+	HWPfDdrPhyIdtmError                   =  0x00D74110,
+	HWPfDdrPhyIdtmDebug                   =  0x00D74120,
+	HWPfDdrPhyIdtmDebugInt                =  0x00D74130,
+	HwPfPcieLnAsicCfgovr                  =  0x00D80000,
+	HwPfPcieLnAclkmixer                   =  0x00D80004,
+	HwPfPcieLnTxrampfreq                  =  0x00D80008,
+	HwPfPcieLnLanetest                    =  0x00D8000C,
+	HwPfPcieLnDcctrl                      =  0x00D80010,
+	HwPfPcieLnDccmeas                     =  0x00D80014,
+	HwPfPcieLnDccovrAclk                  =  0x00D80018,
+	HwPfPcieLnDccovrTxa                   =  0x00D8001C,
+	HwPfPcieLnDccovrTxk                   =  0x00D80020,
+	HwPfPcieLnDccovrDclk                  =  0x00D80024,
+	HwPfPcieLnDccovrEclk                  =  0x00D80028,
+	HwPfPcieLnDcctrimAclk                 =  0x00D8002C,
+	HwPfPcieLnDcctrimTx                   =  0x00D80030,
+	HwPfPcieLnDcctrimDclk                 =  0x00D80034,
+	HwPfPcieLnDcctrimEclk                 =  0x00D80038,
+	HwPfPcieLnQuadCtrl                    =  0x00D8003C,
+	HwPfPcieLnQuadCorrIndex               =  0x00D80040,
+	HwPfPcieLnQuadCorrStatus              =  0x00D80044,
+	HwPfPcieLnAsicRxovr1                  =  0x00D80048,
+	HwPfPcieLnAsicRxovr2                  =  0x00D8004C,
+	HwPfPcieLnAsicEqinfovr                =  0x00D80050,
+	HwPfPcieLnRxcsr                       =  0x00D80054,
+	HwPfPcieLnRxfectrl                    =  0x00D80058,
+	HwPfPcieLnRxtest                      =  0x00D8005C,
+	HwPfPcieLnEscount                     =  0x00D80060,
+	HwPfPcieLnCdrctrl                     =  0x00D80064,
+	HwPfPcieLnCdrctrl2                    =  0x00D80068,
+	HwPfPcieLnCdrcfg0Ctrl0                =  0x00D8006C,
+	HwPfPcieLnCdrcfg0Ctrl1                =  0x00D80070,
+	HwPfPcieLnCdrcfg0Ctrl2                =  0x00D80074,
+	HwPfPcieLnCdrcfg1Ctrl0                =  0x00D80078,
+	HwPfPcieLnCdrcfg1Ctrl1                =  0x00D8007C,
+	HwPfPcieLnCdrcfg1Ctrl2                =  0x00D80080,
+	HwPfPcieLnCdrcfg2Ctrl0                =  0x00D80084,
+	HwPfPcieLnCdrcfg2Ctrl1                =  0x00D80088,
+	HwPfPcieLnCdrcfg2Ctrl2                =  0x00D8008C,
+	HwPfPcieLnCdrcfg3Ctrl0                =  0x00D80090,
+	HwPfPcieLnCdrcfg3Ctrl1                =  0x00D80094,
+	HwPfPcieLnCdrcfg3Ctrl2                =  0x00D80098,
+	HwPfPcieLnCdrphase                    =  0x00D8009C,
+	HwPfPcieLnCdrfreq                     =  0x00D800A0,
+	HwPfPcieLnCdrstatusPhase              =  0x00D800A4,
+	HwPfPcieLnCdrstatusFreq               =  0x00D800A8,
+	HwPfPcieLnCdroffset                   =  0x00D800AC,
+	HwPfPcieLnRxvosctl                    =  0x00D800B0,
+	HwPfPcieLnRxvosctl2                   =  0x00D800B4,
+	HwPfPcieLnRxlosctl                    =  0x00D800B8,
+	HwPfPcieLnRxlos                       =  0x00D800BC,
+	HwPfPcieLnRxlosvval                   =  0x00D800C0,
+	HwPfPcieLnRxvosd0                     =  0x00D800C4,
+	HwPfPcieLnRxvosd1                     =  0x00D800C8,
+	HwPfPcieLnRxvosep0                    =  0x00D800CC,
+	HwPfPcieLnRxvosep1                    =  0x00D800D0,
+	HwPfPcieLnRxvosen0                    =  0x00D800D4,
+	HwPfPcieLnRxvosen1                    =  0x00D800D8,
+	HwPfPcieLnRxvosafe                    =  0x00D800DC,
+	HwPfPcieLnRxvosa0                     =  0x00D800E0,
+	HwPfPcieLnRxvosa0Out                  =  0x00D800E4,
+	HwPfPcieLnRxvosa1                     =  0x00D800E8,
+	HwPfPcieLnRxvosa1Out                  =  0x00D800EC,
+	HwPfPcieLnRxmisc                      =  0x00D800F0,
+	HwPfPcieLnRxbeacon                    =  0x00D800F4,
+	HwPfPcieLnRxdssout                    =  0x00D800F8,
+	HwPfPcieLnRxdssout2                   =  0x00D800FC,
+	HwPfPcieLnAlphapctrl                  =  0x00D80100,
+	HwPfPcieLnAlphanctrl                  =  0x00D80104,
+	HwPfPcieLnAdaptctrl                   =  0x00D80108,
+	HwPfPcieLnAdaptctrl1                  =  0x00D8010C,
+	HwPfPcieLnAdaptstatus                 =  0x00D80110,
+	HwPfPcieLnAdaptvga1                   =  0x00D80114,
+	HwPfPcieLnAdaptvga2                   =  0x00D80118,
+	HwPfPcieLnAdaptvga3                   =  0x00D8011C,
+	HwPfPcieLnAdaptvga4                   =  0x00D80120,
+	HwPfPcieLnAdaptboost1                 =  0x00D80124,
+	HwPfPcieLnAdaptboost2                 =  0x00D80128,
+	HwPfPcieLnAdaptboost3                 =  0x00D8012C,
+	HwPfPcieLnAdaptboost4                 =  0x00D80130,
+	HwPfPcieLnAdaptsslms1                 =  0x00D80134,
+	HwPfPcieLnAdaptsslms2                 =  0x00D80138,
+	HwPfPcieLnAdaptvgaStatus              =  0x00D8013C,
+	HwPfPcieLnAdaptboostStatus            =  0x00D80140,
+	HwPfPcieLnAdaptsslmsStatus1           =  0x00D80144,
+	HwPfPcieLnAdaptsslmsStatus2           =  0x00D80148,
+	HwPfPcieLnAfectrl1                    =  0x00D8014C,
+	HwPfPcieLnAfectrl2                    =  0x00D80150,
+	HwPfPcieLnAfectrl3                    =  0x00D80154,
+	HwPfPcieLnAfedefault1                 =  0x00D80158,
+	HwPfPcieLnAfedefault2                 =  0x00D8015C,
+	HwPfPcieLnDfectrl1                    =  0x00D80160,
+	HwPfPcieLnDfectrl2                    =  0x00D80164,
+	HwPfPcieLnDfectrl3                    =  0x00D80168,
+	HwPfPcieLnDfectrl4                    =  0x00D8016C,
+	HwPfPcieLnDfectrl5                    =  0x00D80170,
+	HwPfPcieLnDfectrl6                    =  0x00D80174,
+	HwPfPcieLnAfestatus1                  =  0x00D80178,
+	HwPfPcieLnAfestatus2                  =  0x00D8017C,
+	HwPfPcieLnDfestatus1                  =  0x00D80180,
+	HwPfPcieLnDfestatus2                  =  0x00D80184,
+	HwPfPcieLnDfestatus3                  =  0x00D80188,
+	HwPfPcieLnDfestatus4                  =  0x00D8018C,
+	HwPfPcieLnDfestatus5                  =  0x00D80190,
+	HwPfPcieLnAlphastatus                 =  0x00D80194,
+	HwPfPcieLnFomctrl1                    =  0x00D80198,
+	HwPfPcieLnFomctrl2                    =  0x00D8019C,
+	HwPfPcieLnFomctrl3                    =  0x00D801A0,
+	HwPfPcieLnAclkcalStatus               =  0x00D801A4,
+	HwPfPcieLnOffscorrStatus              =  0x00D801A8,
+	HwPfPcieLnEyewidthStatus              =  0x00D801AC,
+	HwPfPcieLnEyeheightStatus             =  0x00D801B0,
+	HwPfPcieLnAsicTxovr1                  =  0x00D801B4,
+	HwPfPcieLnAsicTxovr2                  =  0x00D801B8,
+	HwPfPcieLnAsicTxovr3                  =  0x00D801BC,
+	HwPfPcieLnTxbiasadjOvr                =  0x00D801C0,
+	HwPfPcieLnTxcsr                       =  0x00D801C4,
+	HwPfPcieLnTxtest                      =  0x00D801C8,
+	HwPfPcieLnTxtestword                  =  0x00D801CC,
+	HwPfPcieLnTxtestwordHigh              =  0x00D801D0,
+	HwPfPcieLnTxdrive                     =  0x00D801D4,
+	HwPfPcieLnMtcsLn                      =  0x00D801D8,
+	HwPfPcieLnStatsumLn                   =  0x00D801DC,
+	HwPfPcieLnRcbusScratch                =  0x00D801E0,
+	HwPfPcieLnRcbusMinorrev               =  0x00D801F0,
+	HwPfPcieLnRcbusMajorrev               =  0x00D801F4,
+	HwPfPcieLnRcbusBlocktype              =  0x00D801F8,
+	HwPfPcieSupPllcsr                     =  0x00D80800,
+	HwPfPcieSupPlldiv                     =  0x00D80804,
+	HwPfPcieSupPllcal                     =  0x00D80808,
+	HwPfPcieSupPllcalsts                  =  0x00D8080C,
+	HwPfPcieSupPllmeas                    =  0x00D80810,
+	HwPfPcieSupPlldactrim                 =  0x00D80814,
+	HwPfPcieSupPllbiastrim                =  0x00D80818,
+	HwPfPcieSupPllbwtrim                  =  0x00D8081C,
+	HwPfPcieSupPllcaldly                  =  0x00D80820,
+	HwPfPcieSupRefclkonpclkctrl           =  0x00D80824,
+	HwPfPcieSupPclkdelay                  =  0x00D80828,
+	HwPfPcieSupPhyconfig                  =  0x00D8082C,
+	HwPfPcieSupRcalIntf                   =  0x00D80830,
+	HwPfPcieSupAuxcsr                     =  0x00D80834,
+	HwPfPcieSupVref                       =  0x00D80838,
+	HwPfPcieSupLinkmode                   =  0x00D8083C,
+	HwPfPcieSupRrefcalctl                 =  0x00D80840,
+	HwPfPcieSupRrefcal                    =  0x00D80844,
+	HwPfPcieSupRrefcaldly                 =  0x00D80848,
+	HwPfPcieSupTximpcalctl                =  0x00D8084C,
+	HwPfPcieSupTximpcal                   =  0x00D80850,
+	HwPfPcieSupTximpoffset                =  0x00D80854,
+	HwPfPcieSupTximpcaldly                =  0x00D80858,
+	HwPfPcieSupRximpcalctl                =  0x00D8085C,
+	HwPfPcieSupRximpcal                   =  0x00D80860,
+	HwPfPcieSupRximpoffset                =  0x00D80864,
+	HwPfPcieSupRximpcaldly                =  0x00D80868,
+	HwPfPcieSupFence                      =  0x00D8086C,
+	HwPfPcieSupMtcs                       =  0x00D80870,
+	HwPfPcieSupStatsum                    =  0x00D809B8,
+	HwPfPciePcsDpStatus0                  =  0x00D81000,
+	HwPfPciePcsDpControl0                 =  0x00D81004,
+	HwPfPciePcsPmaStatusLane0             =  0x00D81008,
+	HwPfPciePcsPipeStatusLane0            =  0x00D8100C,
+	HwPfPciePcsTxdeemph0Lane0             =  0x00D81010,
+	HwPfPciePcsTxdeemph1Lane0             =  0x00D81014,
+	HwPfPciePcsInternalStatusLane0        =  0x00D81018,
+	HwPfPciePcsDpStatus1                  =  0x00D8101C,
+	HwPfPciePcsDpControl1                 =  0x00D81020,
+	HwPfPciePcsPmaStatusLane1             =  0x00D81024,
+	HwPfPciePcsPipeStatusLane1            =  0x00D81028,
+	HwPfPciePcsTxdeemph0Lane1             =  0x00D8102C,
+	HwPfPciePcsTxdeemph1Lane1             =  0x00D81030,
+	HwPfPciePcsInternalStatusLane1        =  0x00D81034,
+	HwPfPciePcsDpStatus2                  =  0x00D81038,
+	HwPfPciePcsDpControl2                 =  0x00D8103C,
+	HwPfPciePcsPmaStatusLane2             =  0x00D81040,
+	HwPfPciePcsPipeStatusLane2            =  0x00D81044,
+	HwPfPciePcsTxdeemph0Lane2             =  0x00D81048,
+	HwPfPciePcsTxdeemph1Lane2             =  0x00D8104C,
+	HwPfPciePcsInternalStatusLane2        =  0x00D81050,
+	HwPfPciePcsDpStatus3                  =  0x00D81054,
+	HwPfPciePcsDpControl3                 =  0x00D81058,
+	HwPfPciePcsPmaStatusLane3             =  0x00D8105C,
+	HwPfPciePcsPipeStatusLane3            =  0x00D81060,
+	HwPfPciePcsTxdeemph0Lane3             =  0x00D81064,
+	HwPfPciePcsTxdeemph1Lane3             =  0x00D81068,
+	HwPfPciePcsInternalStatusLane3        =  0x00D8106C,
+	HwPfPciePcsEbStatus0                  =  0x00D81070,
+	HwPfPciePcsEbStatus1                  =  0x00D81074,
+	HwPfPciePcsEbStatus2                  =  0x00D81078,
+	HwPfPciePcsEbStatus3                  =  0x00D8107C,
+	HwPfPciePcsPllSettingPcieG1           =  0x00D81088,
+	HwPfPciePcsPllSettingPcieG2           =  0x00D8108C,
+	HwPfPciePcsPllSettingPcieG3           =  0x00D81090,
+	HwPfPciePcsControl                    =  0x00D81094,
+	HwPfPciePcsEqControl                  =  0x00D81098,
+	HwPfPciePcsEqTimer                    =  0x00D8109C,
+	HwPfPciePcsEqErrStatus                =  0x00D810A0,
+	HwPfPciePcsEqErrCount                 =  0x00D810A4,
+	HwPfPciePcsStatus                     =  0x00D810A8,
+	HwPfPciePcsMiscRegister               =  0x00D810AC,
+	HwPfPciePcsObsControl                 =  0x00D810B0,
+	HwPfPciePcsPrbsCount0                 =  0x00D81200,
+	HwPfPciePcsBistControl0               =  0x00D81204,
+	HwPfPciePcsBistStaticWord00           =  0x00D81208,
+	HwPfPciePcsBistStaticWord10           =  0x00D8120C,
+	HwPfPciePcsBistStaticWord20           =  0x00D81210,
+	HwPfPciePcsBistStaticWord30           =  0x00D81214,
+	HwPfPciePcsPrbsCount1                 =  0x00D81220,
+	HwPfPciePcsBistControl1               =  0x00D81224,
+	HwPfPciePcsBistStaticWord01           =  0x00D81228,
+	HwPfPciePcsBistStaticWord11           =  0x00D8122C,
+	HwPfPciePcsBistStaticWord21           =  0x00D81230,
+	HwPfPciePcsBistStaticWord31           =  0x00D81234,
+	HwPfPciePcsPrbsCount2                 =  0x00D81240,
+	HwPfPciePcsBistControl2               =  0x00D81244,
+	HwPfPciePcsBistStaticWord02           =  0x00D81248,
+	HwPfPciePcsBistStaticWord12           =  0x00D8124C,
+	HwPfPciePcsBistStaticWord22           =  0x00D81250,
+	HwPfPciePcsBistStaticWord32           =  0x00D81254,
+	HwPfPciePcsPrbsCount3                 =  0x00D81260,
+	HwPfPciePcsBistControl3               =  0x00D81264,
+	HwPfPciePcsBistStaticWord03           =  0x00D81268,
+	HwPfPciePcsBistStaticWord13           =  0x00D8126C,
+	HwPfPciePcsBistStaticWord23           =  0x00D81270,
+	HwPfPciePcsBistStaticWord33           =  0x00D81274,
+	HwPfPcieGpexLtssmStateCntrl           =  0x00D90400,
+	HwPfPcieGpexLtssmStateStatus          =  0x00D90404,
+	HwPfPcieGpexSkipFreqTimer             =  0x00D90408,
+	HwPfPcieGpexLaneSelect                =  0x00D9040C,
+	HwPfPcieGpexLaneDeskew                =  0x00D90410,
+	HwPfPcieGpexRxErrorStatus             =  0x00D90414,
+	HwPfPcieGpexLaneNumControl            =  0x00D90418,
+	HwPfPcieGpexNFstControl               =  0x00D9041C,
+	HwPfPcieGpexLinkStatus                =  0x00D90420,
+	HwPfPcieGpexAckReplayTimeout          =  0x00D90438,
+	HwPfPcieGpexSeqNumberStatus           =  0x00D9043C,
+	HwPfPcieGpexCoreClkRatio              =  0x00D90440,
+	HwPfPcieGpexDllTholdControl           =  0x00D90448,
+	HwPfPcieGpexPmTimer                   =  0x00D90450,
+	HwPfPcieGpexPmeTimeout                =  0x00D90454,
+	HwPfPcieGpexAspmL1Timer               =  0x00D90458,
+	HwPfPcieGpexAspmReqTimer              =  0x00D9045C,
+	HwPfPcieGpexAspmL1Dis                 =  0x00D90460,
+	HwPfPcieGpexAdvisoryErrorControl      =  0x00D90468,
+	HwPfPcieGpexId                        =  0x00D90470,
+	HwPfPcieGpexClasscode                 =  0x00D90474,
+	HwPfPcieGpexSubsystemId               =  0x00D90478,
+	HwPfPcieGpexDeviceCapabilities        =  0x00D9047C,
+	HwPfPcieGpexLinkCapabilities          =  0x00D90480,
+	HwPfPcieGpexFunctionNumber            =  0x00D90484,
+	HwPfPcieGpexPmCapabilities            =  0x00D90488,
+	HwPfPcieGpexFunctionSelect            =  0x00D9048C,
+	HwPfPcieGpexErrorCounter              =  0x00D904AC,
+	HwPfPcieGpexConfigReady               =  0x00D904B0,
+	HwPfPcieGpexFcUpdateTimeout           =  0x00D904B8,
+	HwPfPcieGpexFcUpdateTimer             =  0x00D904BC,
+	HwPfPcieGpexVcBufferLoad              =  0x00D904C8,
+	HwPfPcieGpexVcBufferSizeThold         =  0x00D904CC,
+	HwPfPcieGpexVcBufferSelect            =  0x00D904D0,
+	HwPfPcieGpexBarEnable                 =  0x00D904D4,
+	HwPfPcieGpexBarDwordLower             =  0x00D904D8,
+	HwPfPcieGpexBarDwordUpper             =  0x00D904DC,
+	HwPfPcieGpexBarSelect                 =  0x00D904E0,
+	HwPfPcieGpexCreditCounterSelect       =  0x00D904E4,
+	HwPfPcieGpexCreditCounterStatus       =  0x00D904E8,
+	HwPfPcieGpexTlpHeaderSelect           =  0x00D904EC,
+	HwPfPcieGpexTlpHeaderDword0           =  0x00D904F0,
+	HwPfPcieGpexTlpHeaderDword1           =  0x00D904F4,
+	HwPfPcieGpexTlpHeaderDword2           =  0x00D904F8,
+	HwPfPcieGpexTlpHeaderDword3           =  0x00D904FC,
+	HwPfPcieGpexRelaxOrderControl         =  0x00D90500,
+	HwPfPcieGpexBarPrefetch               =  0x00D90504,
+	HwPfPcieGpexFcCheckControl            =  0x00D90508,
+	HwPfPcieGpexFcUpdateTimerTraffic      =  0x00D90518,
+	HwPfPcieGpexPhyControl0               =  0x00D9053C,
+	HwPfPcieGpexPhyControl1               =  0x00D90544,
+	HwPfPcieGpexPhyControl2               =  0x00D9054C,
+	HwPfPcieGpexUserControl0              =  0x00D9055C,
+	HwPfPcieGpexUncorrErrorStatus         =  0x00D905F0,
+	HwPfPcieGpexRxCplError                =  0x00D90620,
+	HwPfPcieGpexRxCplErrorDword0          =  0x00D90624,
+	HwPfPcieGpexRxCplErrorDword1          =  0x00D90628,
+	HwPfPcieGpexRxCplErrorDword2          =  0x00D9062C,
+	HwPfPcieGpexPabSwResetEn              =  0x00D90630,
+	HwPfPcieGpexGen3Control0              =  0x00D90634,
+	HwPfPcieGpexGen3Control1              =  0x00D90638,
+	HwPfPcieGpexGen3Control2              =  0x00D9063C,
+	HwPfPcieGpexGen2ControlCsr            =  0x00D90640,
+	HwPfPcieGpexTotalVfInitialVf0         =  0x00D90644,
+	HwPfPcieGpexTotalVfInitialVf1         =  0x00D90648,
+	HwPfPcieGpexSriovLinkDevId0           =  0x00D90684,
+	HwPfPcieGpexSriovLinkDevId1           =  0x00D90688,
+	HwPfPcieGpexSriovPageSize0            =  0x00D906C4,
+	HwPfPcieGpexSriovPageSize1            =  0x00D906C8,
+	HwPfPcieGpexIdVersion                 =  0x00D906FC,
+	HwPfPcieGpexSriovVfOffsetStride0      =  0x00D90704,
+	HwPfPcieGpexSriovVfOffsetStride1      =  0x00D90708,
+	HwPfPcieGpexGen3DeskewControl         =  0x00D907B4,
+	HwPfPcieGpexGen3EqControl             =  0x00D907B8,
+	HwPfPcieGpexBridgeVersion             =  0x00D90800,
+	HwPfPcieGpexBridgeCapability          =  0x00D90804,
+	HwPfPcieGpexBridgeControl             =  0x00D90808,
+	HwPfPcieGpexBridgeStatus              =  0x00D9080C,
+	HwPfPcieGpexEngineActivityStatus      =  0x00D9081C,
+	HwPfPcieGpexEngineResetControl        =  0x00D90820,
+	HwPfPcieGpexAxiPioControl             =  0x00D90840,
+	HwPfPcieGpexAxiPioStatus              =  0x00D90844,
+	HwPfPcieGpexAmbaSlaveCmdStatus        =  0x00D90848,
+	HwPfPcieGpexPexPioControl             =  0x00D908C0,
+	HwPfPcieGpexPexPioStatus              =  0x00D908C4,
+	HwPfPcieGpexAmbaMasterStatus          =  0x00D908C8,
+	HwPfPcieGpexCsrSlaveCmdStatus         =  0x00D90920,
+	HwPfPcieGpexMailboxAxiControl         =  0x00D90A50,
+	HwPfPcieGpexMailboxAxiData            =  0x00D90A54,
+	HwPfPcieGpexMailboxPexControl         =  0x00D90A90,
+	HwPfPcieGpexMailboxPexData            =  0x00D90A94,
+	HwPfPcieGpexPexInterruptEnable        =  0x00D90AD0,
+	HwPfPcieGpexPexInterruptStatus        =  0x00D90AD4,
+	HwPfPcieGpexPexInterruptAxiPioVector  =  0x00D90AD8,
+	HwPfPcieGpexPexInterruptPexPioVector  =  0x00D90AE0,
+	HwPfPcieGpexPexInterruptMiscVector    =  0x00D90AF8,
+	HwPfPcieGpexAmbaInterruptPioEnable    =  0x00D90B00,
+	HwPfPcieGpexAmbaInterruptMiscEnable   =  0x00D90B0C,
+	HwPfPcieGpexAmbaInterruptPioStatus    =  0x00D90B10,
+	HwPfPcieGpexAmbaInterruptMiscStatus   =  0x00D90B1C,
+	HwPfPcieGpexPexPmControl              =  0x00D90B80,
+	HwPfPcieGpexSlotMisc                  =  0x00D90B88,
+	HwPfPcieGpexAxiAddrMappingControl     =  0x00D90BA0,
+	HwPfPcieGpexAxiAddrMappingWindowAxiBase     =  0x00D90BA4,
+	HwPfPcieGpexAxiAddrMappingWindowPexBaseLow  =  0x00D90BA8,
+	HwPfPcieGpexAxiAddrMappingWindowPexBaseHigh =  0x00D90BAC,
+	HwPfPcieGpexPexBarAddrFunc0Bar0       =  0x00D91BA0,
+	HwPfPcieGpexPexBarAddrFunc0Bar1       =  0x00D91BA4,
+	HwPfPcieGpexAxiAddrMappingPcieHdrParam =  0x00D95BA0,
+	HwPfPcieGpexExtAxiAddrMappingAxiBase  =  0x00D980A0,
+	HwPfPcieGpexPexExtBarAddrFunc0Bar0    =  0x00D984A0,
+	HwPfPcieGpexPexExtBarAddrFunc0Bar1    =  0x00D984A4,
+	HwPfPcieGpexAmbaInterruptFlrEnable    =  0x00D9B960,
+	HwPfPcieGpexAmbaInterruptFlrStatus    =  0x00D9B9A0,
+	HwPfPcieGpexExtAxiAddrMappingSize     =  0x00D9BAF0,
+	HwPfPcieGpexPexPioAwcacheControl      =  0x00D9C300,
+	HwPfPcieGpexPexPioArcacheControl      =  0x00D9C304,
+	HwPfPcieGpexPabObSizeControlVc0       =  0x00D9C310
+};
+
+/* TIP PF Interrupt numbers */
+enum {
+	ACC100_PF_INT_QMGR_AQ_OVERFLOW = 0,
+	ACC100_PF_INT_DOORBELL_VF_2_PF = 1,
+	ACC100_PF_INT_DMA_DL_DESC_IRQ = 2,
+	ACC100_PF_INT_DMA_UL_DESC_IRQ = 3,
+	ACC100_PF_INT_DMA_MLD_DESC_IRQ = 4,
+	ACC100_PF_INT_DMA_UL5G_DESC_IRQ = 5,
+	ACC100_PF_INT_DMA_DL5G_DESC_IRQ = 6,
+	ACC100_PF_INT_ILLEGAL_FORMAT = 7,
+	ACC100_PF_INT_QMGR_DISABLED_ACCESS = 8,
+	ACC100_PF_INT_QMGR_AQ_OVERTHRESHOLD = 9,
+	ACC100_PF_INT_ARAM_ACCESS_ERR = 10,
+	ACC100_PF_INT_ARAM_ECC_1BIT_ERR = 11,
+	ACC100_PF_INT_PARITY_ERR = 12,
+	ACC100_PF_INT_QMGR_ERR = 13,
+	ACC100_PF_INT_INT_REQ_OVERFLOW = 14,
+	ACC100_PF_INT_APB_TIMEOUT = 15,
+};
+
+#endif /* ACC100_PF_ENUM_H */
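As an aside on the TIP PF interrupt numbers above: values 2 through 6 are routine descriptor-completion interrupts, while the rest signal error conditions. A minimal host-side sketch of that split (the `acc100_pf_int_is_completion()` helper is illustrative and not part of the driver; the enum values mirror acc100_pf_enum.h):

```c
#include <assert.h>

/* Interrupt numbers as defined in acc100_pf_enum.h */
enum {
	ACC100_PF_INT_QMGR_AQ_OVERFLOW = 0,
	ACC100_PF_INT_DOORBELL_VF_2_PF = 1,
	ACC100_PF_INT_DMA_DL_DESC_IRQ = 2,
	ACC100_PF_INT_DMA_UL_DESC_IRQ = 3,
	ACC100_PF_INT_DMA_MLD_DESC_IRQ = 4,
	ACC100_PF_INT_DMA_UL5G_DESC_IRQ = 5,
	ACC100_PF_INT_DMA_DL5G_DESC_IRQ = 6,
	ACC100_PF_INT_ILLEGAL_FORMAT = 7,
	ACC100_PF_INT_QMGR_DISABLED_ACCESS = 8,
	ACC100_PF_INT_QMGR_AQ_OVERTHRESHOLD = 9,
	ACC100_PF_INT_ARAM_ACCESS_ERR = 10,
	ACC100_PF_INT_ARAM_ECC_1BIT_ERR = 11,
	ACC100_PF_INT_PARITY_ERR = 12,
	ACC100_PF_INT_QMGR_ERR = 13,
	ACC100_PF_INT_INT_REQ_OVERFLOW = 14,
	ACC100_PF_INT_APB_TIMEOUT = 15,
};

/* Illustrative helper: 1 for a routine DMA descriptor-completion
 * interrupt (numbers 2..6), 0 for doorbell or error conditions. */
static int acc100_pf_int_is_completion(int int_nb)
{
	return int_nb >= ACC100_PF_INT_DMA_DL_DESC_IRQ &&
	       int_nb <= ACC100_PF_INT_DMA_DL5G_DESC_IRQ;
}
```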
diff --git a/drivers/baseband/acc100/acc100_vf_enum.h b/drivers/baseband/acc100/acc100_vf_enum.h
new file mode 100644
index 0000000..b512af3
--- /dev/null
+++ b/drivers/baseband/acc100/acc100_vf_enum.h
@@ -0,0 +1,73 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017 Intel Corporation
+ */
+
+#ifndef ACC100_VF_ENUM_H
+#define ACC100_VF_ENUM_H
+
+/*
+ * ACC100 Register mapping on VF BAR0
+ * This is automatically generated from RDL, format may change with new RDL
+ */
+enum {
+	HWVfQmgrIngressAq             =  0x00000000,
+	HWVfHiVfToPfDbellVf           =  0x00000800,
+	HWVfHiPfToVfDbellVf           =  0x00000808,
+	HWVfHiInfoRingBaseLoVf        =  0x00000810,
+	HWVfHiInfoRingBaseHiVf        =  0x00000814,
+	HWVfHiInfoRingPointerVf       =  0x00000818,
+	HWVfHiInfoRingIntWrEnVf       =  0x00000820,
+	HWVfHiInfoRingPf2VfWrEnVf     =  0x00000824,
+	HWVfHiMsixVectorMapperVf      =  0x00000860,
+	HWVfDmaFec5GulDescBaseLoRegVf =  0x00000920,
+	HWVfDmaFec5GulDescBaseHiRegVf =  0x00000924,
+	HWVfDmaFec5GulRespPtrLoRegVf  =  0x00000928,
+	HWVfDmaFec5GulRespPtrHiRegVf  =  0x0000092C,
+	HWVfDmaFec5GdlDescBaseLoRegVf =  0x00000940,
+	HWVfDmaFec5GdlDescBaseHiRegVf =  0x00000944,
+	HWVfDmaFec5GdlRespPtrLoRegVf  =  0x00000948,
+	HWVfDmaFec5GdlRespPtrHiRegVf  =  0x0000094C,
+	HWVfDmaFec4GulDescBaseLoRegVf =  0x00000960,
+	HWVfDmaFec4GulDescBaseHiRegVf =  0x00000964,
+	HWVfDmaFec4GulRespPtrLoRegVf  =  0x00000968,
+	HWVfDmaFec4GulRespPtrHiRegVf  =  0x0000096C,
+	HWVfDmaFec4GdlDescBaseLoRegVf =  0x00000980,
+	HWVfDmaFec4GdlDescBaseHiRegVf =  0x00000984,
+	HWVfDmaFec4GdlRespPtrLoRegVf  =  0x00000988,
+	HWVfDmaFec4GdlRespPtrHiRegVf  =  0x0000098C,
+	HWVfDmaDdrBaseRangeRoVf       =  0x000009A0,
+	HWVfQmgrAqResetVf             =  0x00000E00,
+	HWVfQmgrRingSizeVf            =  0x00000E04,
+	HWVfQmgrGrpDepthLog20Vf       =  0x00000E08,
+	HWVfQmgrGrpDepthLog21Vf       =  0x00000E0C,
+	HWVfQmgrGrpFunction0Vf        =  0x00000E10,
+	HWVfQmgrGrpFunction1Vf        =  0x00000E14,
+	HWVfPmACntrlRegVf             =  0x00000F40,
+	HWVfPmACountVf                =  0x00000F48,
+	HWVfPmAKCntLoVf               =  0x00000F50,
+	HWVfPmAKCntHiVf               =  0x00000F54,
+	HWVfPmADeltaCntLoVf           =  0x00000F60,
+	HWVfPmADeltaCntHiVf           =  0x00000F64,
+	HWVfPmBCntrlRegVf             =  0x00000F80,
+	HWVfPmBCountVf                =  0x00000F88,
+	HWVfPmBKCntLoVf               =  0x00000F90,
+	HWVfPmBKCntHiVf               =  0x00000F94,
+	HWVfPmBDeltaCntLoVf           =  0x00000FA0,
+	HWVfPmBDeltaCntHiVf           =  0x00000FA4
+};
+
+/* TIP VF Interrupt numbers */
+enum {
+	ACC100_VF_INT_QMGR_AQ_OVERFLOW = 0,
+	ACC100_VF_INT_DOORBELL_VF_2_PF = 1,
+	ACC100_VF_INT_DMA_DL_DESC_IRQ = 2,
+	ACC100_VF_INT_DMA_UL_DESC_IRQ = 3,
+	ACC100_VF_INT_DMA_MLD_DESC_IRQ = 4,
+	ACC100_VF_INT_DMA_UL5G_DESC_IRQ = 5,
+	ACC100_VF_INT_DMA_DL5G_DESC_IRQ = 6,
+	ACC100_VF_INT_ILLEGAL_FORMAT = 7,
+	ACC100_VF_INT_QMGR_DISABLED_ACCESS = 8,
+	ACC100_VF_INT_QMGR_AQ_OVERTHRESHOLD = 9,
+};
+
+#endif /* ACC100_VF_ENUM_H */
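The HWVfHiInfoRingBaseLo/HiVf and HWVfHiInfoRingPointerVf registers above describe a fixed-size info ring; rte_acc100_pmd.h dimensions it at 1024 entries and derives a power-of-two mask from that. A minimal sketch of the mask-based index wrap, assuming a free-running software tail counter (the `info_ring_index()` helper is illustrative, not driver code):

```c
#include <assert.h>
#include <stdint.h>

/* Dimensioning from rte_acc100_pmd.h: mask converts a counter into
 * an index in the Info Ring array (not a byte offset). */
#define ACC100_INFO_RING_NUM_ENTRIES 1024
#define ACC100_INFO_RING_MASK (ACC100_INFO_RING_NUM_ENTRIES - 1)

/* Illustrative helper: wrap a free-running tail counter into a
 * valid ring index; works because the ring size is a power of two. */
static uint16_t info_ring_index(uint32_t tail)
{
	return (uint16_t)(tail & ACC100_INFO_RING_MASK);
}
```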
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 6f46df0..6525d66 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -5,6 +5,9 @@
 #ifndef _RTE_ACC100_PMD_H_
 #define _RTE_ACC100_PMD_H_
 
+#include "acc100_pf_enum.h"
+#include "acc100_vf_enum.h"
+
 /* Helper macro for logging */
 #define rte_bbdev_log(level, fmt, ...) \
 	rte_log(RTE_LOG_ ## level, acc100_logtype, fmt "\n", \
@@ -27,6 +30,490 @@
 #define RTE_ACC100_PF_DEVICE_ID        (0x0d5c)
 #define RTE_ACC100_VF_DEVICE_ID        (0x0d5d)
 
+/* Define as 1 to use only a single FEC engine */
+#ifndef RTE_ACC100_SINGLE_FEC
+#define RTE_ACC100_SINGLE_FEC 0
+#endif
+
+/* Values used in filling in descriptors */
+#define ACC100_DMA_DESC_TYPE           2
+#define ACC100_DMA_CODE_BLK_MODE       0
+#define ACC100_DMA_BLKID_FCW           1
+#define ACC100_DMA_BLKID_IN            2
+#define ACC100_DMA_BLKID_OUT_ENC       1
+#define ACC100_DMA_BLKID_OUT_HARD      1
+#define ACC100_DMA_BLKID_OUT_SOFT      2
+#define ACC100_DMA_BLKID_OUT_HARQ      3
+#define ACC100_DMA_BLKID_IN_HARQ       3
+
+/* Values used in filling in decode FCWs */
+#define ACC100_FCW_TD_VER              1
+#define ACC100_FCW_TD_EXT_COLD_REG_EN  1
+#define ACC100_FCW_TD_AUTOMAP          0x0f
+#define ACC100_FCW_TD_RVIDX_0          2
+#define ACC100_FCW_TD_RVIDX_1          26
+#define ACC100_FCW_TD_RVIDX_2          50
+#define ACC100_FCW_TD_RVIDX_3          74
+
+/* Values used in writing to the registers */
+#define ACC100_REG_IRQ_EN_ALL          0x1FF83FF  /* Enable all interrupts */
+
+/* ACC100 Specific Dimensioning */
+#define ACC100_SIZE_64MBYTE            (64*1024*1024)
+/* Number of elements in an Info Ring */
+#define ACC100_INFO_RING_NUM_ENTRIES   1024
+/* Number of elements in HARQ layout memory */
+#define ACC100_HARQ_LAYOUT             (64*1024*1024)
+/* Assumed starting offset for HARQ in memory */
+#define ACC100_HARQ_OFFSET             (32*1024)
+/* Mask used to calculate an index in an Info Ring array (not a byte offset) */
+#define ACC100_INFO_RING_MASK          (ACC100_INFO_RING_NUM_ENTRIES-1)
+/* Number of Virtual Functions ACC100 supports */
+#define ACC100_NUM_VFS                  16
+#define ACC100_NUM_QGRPS                8
+#define ACC100_NUM_QGRPS_PER_WORD       8
+#define ACC100_NUM_AQS                  16
+#define MAX_ENQ_BATCH_SIZE              255
+/* All ACC100 registers are 32-bit (4-byte) aligned */
+#define ACC100_BYTES_IN_WORD                 4
+#define ACC100_MAX_E_MBUF                64000
+
+#define ACC100_GRP_ID_SHIFT    10 /* Queue Index Hierarchy */
+#define ACC100_VF_ID_SHIFT     4  /* Queue Index Hierarchy */
+#define ACC100_VF_OFFSET_QOS   16 /* Offset in memory specific to QoS monitoring */
+#define ACC100_TMPL_PRI_0      0x03020100
+#define ACC100_TMPL_PRI_1      0x07060504
+#define ACC100_TMPL_PRI_2      0x0b0a0908
+#define ACC100_TMPL_PRI_3      0x0f0e0d0c
+#define ACC100_QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
+#define ACC100_WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
+
+#define ACC100_NUM_TMPL       32
+/* Mapping of signals for the available engines */
+#define ACC100_SIG_UL_5G      0
+#define ACC100_SIG_UL_5G_LAST 7
+#define ACC100_SIG_DL_5G      13
+#define ACC100_SIG_DL_5G_LAST 15
+#define ACC100_SIG_UL_4G      16
+#define ACC100_SIG_UL_4G_LAST 21
+#define ACC100_SIG_DL_4G      27
+#define ACC100_SIG_DL_4G_LAST 31
+
+/* Maximum number of attempts to allocate the memory block for all rings */
+#define ACC100_SW_RING_MEM_ALLOC_ATTEMPTS 5
+#define ACC100_MAX_QUEUE_DEPTH            1024
+#define ACC100_DMA_MAX_NUM_POINTERS       14
+#define ACC100_DMA_DESC_PADDING           8
+#define ACC100_FCW_PADDING                12
+#define ACC100_DESC_FCW_OFFSET            192
+#define ACC100_DESC_SIZE                  256
+#define ACC100_DESC_OFFSET                (ACC100_DESC_SIZE / 64)
+#define ACC100_FCW_TE_BLEN                32
+#define ACC100_FCW_TD_BLEN                24
+#define ACC100_FCW_LE_BLEN                32
+#define ACC100_FCW_LD_BLEN                36
+
+#define ACC100_FCW_VER         2
+#define ACC100_MUX_5GDL_DESC   6
+#define ACC100_CMP_ENC_SIZE    20
+#define ACC100_CMP_DEC_SIZE    24
+#define ACC100_ENC_OFFSET     (32)
+#define ACC100_DEC_OFFSET     (80)
+#define ACC100_EXT_MEM /* Default option with memory external to CPU */
+#define ACC100_HARQ_OFFSET_THRESHOLD 1024
+
+/* Constants from K0 computation from 3GPP 38.212 Table 5.4.2.1-2 */
+#define ACC100_N_ZC_1 66 /* N = 66 Zc for BG 1 */
+#define ACC100_N_ZC_2 50 /* N = 50 Zc for BG 2 */
+#define ACC100_K0_1_1 17 /* K0 fraction numerator for rv 1 and BG 1 */
+#define ACC100_K0_1_2 13 /* K0 fraction numerator for rv 1 and BG 2 */
+#define ACC100_K0_2_1 33 /* K0 fraction numerator for rv 2 and BG 1 */
+#define ACC100_K0_2_2 25 /* K0 fraction numerator for rv 2 and BG 2 */
+#define ACC100_K0_3_1 56 /* K0 fraction numerator for rv 3 and BG 1 */
+#define ACC100_K0_3_2 43 /* K0 fraction numerator for rv 3 and BG 2 */
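
[Editor's note: the K0 numerators above come from 3GPP 38.212 Table 5.4.2.1-2,
where the rate-matching starting position is k0 = floor(num * Ncb / N) * Zc,
with N = 66 * Zc for BG1 and 50 * Zc for BG2. A standalone sketch of how a
driver could derive k0 from these constants; the helper name and signature are
illustrative, not part of this patch:]

```c
#include <assert.h>
#include <stdint.h>

/* Constants duplicated from this header for a self-contained example */
#define ACC100_N_ZC_1 66
#define ACC100_N_ZC_2 50
#define ACC100_K0_1_1 17
#define ACC100_K0_1_2 13
#define ACC100_K0_2_1 33
#define ACC100_K0_2_2 25
#define ACC100_K0_3_1 56
#define ACC100_K0_3_2 43

/* k0 = floor(num * Ncb / N) * Zc per 38.212 Table 5.4.2.1-2 (sketch) */
static inline uint16_t
get_k0(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv_index)
{
	uint32_t n, num;

	if (rv_index == 0)
		return 0;
	n = (bg == 1 ? ACC100_N_ZC_1 : ACC100_N_ZC_2) * (uint32_t)z_c;
	if (rv_index == 1)
		num = (bg == 1) ? ACC100_K0_1_1 : ACC100_K0_1_2;
	else if (rv_index == 2)
		num = (bg == 1) ? ACC100_K0_2_1 : ACC100_K0_2_2;
	else
		num = (bg == 1) ? ACC100_K0_3_1 : ACC100_K0_3_2;
	/* integer division gives the floor; Ncb < N for limited buffers */
	return (uint16_t)((num * n_cb / n) * z_c);
}
```

When the circular buffer is not limited (Ncb == N) this collapses to
num * Zc, which is why the table entries are stored as bare numerators.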
+
+/* ACC100 Configuration */
+#define ACC100_DDR_ECC_ENABLE
+#define ACC100_CFG_DMA_ERROR    0x3D7
+#define ACC100_CFG_AXI_CACHE    0x11
+#define ACC100_CFG_QMGR_HI_P    0x0F0F
+#define ACC100_CFG_PCI_AXI      0xC003
+#define ACC100_CFG_PCI_BRIDGE   0x40006033
+#define ACC100_ENGINE_OFFSET    0x1000
+#define ACC100_RESET_HI         0x20100
+#define ACC100_RESET_LO         0x20000
+#define ACC100_RESET_HARD       0x1FF
+#define ACC100_ENGINES_MAX      9
+#define ACC100_LONG_WAIT        1000
+
+/* ACC100 DMA Descriptor triplet */
+struct acc100_dma_triplet {
+	uint64_t address;
+	uint32_t blen:20,
+		res0:4,
+		last:1,
+		dma_ext:1,
+		res1:2,
+		blkid:4;
+} __rte_packed;
+
+/* ACC100 DMA Response Descriptor */
+union acc100_dma_rsp_desc {
+	uint32_t val;
+	struct {
+		uint32_t crc_status:1,
+			synd_ok:1,
+			dma_err:1,
+			neg_stop:1,
+			fcw_err:1,
+			output_err:1,
+			input_err:1,
+			timestampEn:1,
+			iterCountFrac:8,
+			iter_cnt:8,
+			rsrvd3:6,
+			sdone:1,
+			fdone:1;
+		uint32_t add_info_0;
+		uint32_t add_info_1;
+	};
+};
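
[Editor's note: software that polls this response word typically reads the
raw 32-bit `val` and tests bits with masks. The bit positions below assume
the usual GCC little-endian bitfield layout (first-declared field in the
least-significant bits); the mask names are illustrative, not from this
patch:]

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mask view of the response word above (LSB-first layout) */
#define ACC100_RSP_CRC_OK   (1u << 0)   /* crc_status */
#define ACC100_RSP_SYND_OK  (1u << 1)   /* synd_ok */
#define ACC100_RSP_SDONE    (1u << 30)  /* sdone */
#define ACC100_RSP_FDONE    (1u << 31)  /* fdone */

/* iter_cnt field occupies bits 16..23 in this layout */
static inline uint8_t
rsp_iter_cnt(uint32_t val)
{
	return (val >> 16) & 0xFF;
}

/* Operation fully complete with CRC and syndrome checks passed */
static inline int
rsp_op_ok(uint32_t val)
{
	const uint32_t ok = ACC100_RSP_FDONE | ACC100_RSP_SDONE |
			ACC100_RSP_CRC_OK | ACC100_RSP_SYND_OK;
	return (val & ok) == ok;
}
```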
+
+
+/* ACC100 Queue Manager Enqueue PCI Register */
+union acc100_enqueue_reg_fmt {
+	uint32_t val;
+	struct {
+		uint32_t num_elem:8,
+			addr_offset:3,
+			rsrvd:1,
+			req_elem_addr:20;
+	};
+};
+
+/* FEC 4G Uplink Frame Control Word */
+struct __rte_packed acc100_fcw_td {
+	uint8_t fcw_ver:4,
+		num_maps:4; /* Unused */
+	uint8_t filler:6, /* Unused */
+		rsrvd0:1,
+		bypass_sb_deint:1;
+	uint16_t k_pos;
+	uint16_t k_neg; /* Unused */
+	uint8_t c_neg; /* Unused */
+	uint8_t c; /* Unused */
+	uint32_t ea; /* Unused */
+	uint32_t eb; /* Unused */
+	uint8_t cab; /* Unused */
+	uint8_t k0_start_col; /* Unused */
+	uint8_t rsrvd1;
+	uint8_t code_block_mode:1, /* Unused */
+		turbo_crc_type:1,
+		rsrvd2:3,
+		bypass_teq:1, /* Unused */
+		soft_output_en:1, /* Unused */
+		ext_td_cold_reg_en:1;
+	union { /* External Cold register */
+		uint32_t ext_td_cold_reg;
+		struct {
+			uint32_t min_iter:4, /* Unused */
+				max_iter:4,
+				ext_scale:5, /* Unused */
+				rsrvd3:3,
+				early_stop_en:1, /* Unused */
+				sw_soft_out_dis:1, /* Unused */
+				sw_et_cont:1, /* Unused */
+				sw_soft_out_saturation:1, /* Unused */
+				half_iter_on:1, /* Unused */
+				raw_decoder_input_on:1, /* Unused */
+				rsrvd4:10;
+		};
+	};
+};
+
+/* FEC 5GNR Uplink Frame Control Word */
+struct __rte_packed acc100_fcw_ld {
+	uint32_t FCWversion:4,
+		qm:4,
+		nfiller:11,
+		BG:1,
+		Zc:9,
+		res0:1,
+		synd_precoder:1,
+		synd_post:1;
+	uint32_t ncb:16,
+		k0:16;
+	uint32_t rm_e:24,
+		hcin_en:1,
+		hcout_en:1,
+		crc_select:1,
+		bypass_dec:1,
+		bypass_intlv:1,
+		so_en:1,
+		so_bypass_rm:1,
+		so_bypass_intlv:1;
+	uint32_t hcin_offset:16,
+		hcin_size0:16;
+	uint32_t hcin_size1:16,
+		hcin_decomp_mode:3,
+		llr_pack_mode:1,
+		hcout_comp_mode:3,
+		res2:1,
+		dec_convllr:4,
+		hcout_convllr:4;
+	uint32_t itmax:7,
+		itstop:1,
+		so_it:7,
+		res3:1,
+		hcout_offset:16;
+	uint32_t hcout_size0:16,
+		hcout_size1:16;
+	uint32_t gain_i:8,
+		gain_h:8,
+		negstop_th:16;
+	uint32_t negstop_it:7,
+		negstop_en:1,
+		res4:24;
+};
+
+/* FEC 4G Downlink Frame Control Word */
+struct __rte_packed acc100_fcw_te {
+	uint16_t k_neg;
+	uint16_t k_pos;
+	uint8_t c_neg;
+	uint8_t c;
+	uint8_t filler;
+	uint8_t cab;
+	uint32_t ea:17,
+		rsrvd0:15;
+	uint32_t eb:17,
+		rsrvd1:15;
+	uint16_t ncb_neg;
+	uint16_t ncb_pos;
+	uint8_t rv_idx0:2,
+		rsrvd2:2,
+		rv_idx1:2,
+		rsrvd3:2;
+	uint8_t bypass_rv_idx0:1,
+		bypass_rv_idx1:1,
+		bypass_rm:1,
+		rsrvd4:5;
+	uint8_t rsrvd5:1,
+		rsrvd6:3,
+		code_block_crc:1,
+		rsrvd7:3;
+	uint8_t code_block_mode:1,
+		rsrvd8:7;
+	uint64_t rsrvd9;
+};
+
+/* FEC 5GNR Downlink Frame Control Word */
+struct __rte_packed acc100_fcw_le {
+	uint32_t FCWversion:4,
+		qm:4,
+		nfiller:11,
+		BG:1,
+		Zc:9,
+		res0:3;
+	uint32_t ncb:16,
+		k0:16;
+	uint32_t rm_e:24,
+		res1:2,
+		crc_select:1,
+		res2:1,
+		bypass_intlv:1,
+		res3:3;
+	uint32_t res4_a:12,
+		mcb_count:3,
+		res4_b:17;
+	uint32_t res5;
+	uint32_t res6;
+	uint32_t res7;
+	uint32_t res8;
+};
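
[Editor's note: these FCWs are copied into the tail of the 256-byte DMA
descriptor starting at ACC100_DESC_FCW_OFFSET (192), so every *_BLEN value
defined earlier must fit in the remaining 64 bytes. A compile-time sanity
sketch, with the constants duplicated from this header to stay
self-contained:]

```c
#include <assert.h>

/* Constants duplicated from this header for illustration */
#define ACC100_DESC_SIZE       256
#define ACC100_DESC_FCW_OFFSET 192
#define ACC100_FCW_TE_BLEN     32
#define ACC100_FCW_TD_BLEN     24
#define ACC100_FCW_LE_BLEN     32
#define ACC100_FCW_LD_BLEN     36

/* The largest FCW (LDPC decode, 9 x 32-bit words = 36 bytes) must fit
 * between the FCW offset and the end of the descriptor.
 */
_Static_assert(ACC100_DESC_FCW_OFFSET + ACC100_FCW_LD_BLEN <=
		ACC100_DESC_SIZE, "LDPC decode FCW overflows descriptor");
_Static_assert(ACC100_DESC_FCW_OFFSET + ACC100_FCW_TE_BLEN <=
		ACC100_DESC_SIZE, "turbo encode FCW overflows descriptor");
```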
+
+/* ACC100 DMA Request Descriptor */
+struct __rte_packed acc100_dma_req_desc {
+	union {
+		struct{
+			uint32_t type:4,
+				rsrvd0:26,
+				sdone:1,
+				fdone:1;
+			uint32_t rsrvd1;
+			uint32_t rsrvd2;
+			uint32_t pass_param:8,
+				sdone_enable:1,
+				irq_enable:1,
+				timeStampEn:1,
+				res0:5,
+				numCBs:4,
+				res1:4,
+				m2dlen:4,
+				d2mlen:4;
+		};
+		struct{
+			uint32_t word0;
+			uint32_t word1;
+			uint32_t word2;
+			uint32_t word3;
+		};
+	};
+	struct acc100_dma_triplet data_ptrs[ACC100_DMA_MAX_NUM_POINTERS];
+
+	/* Virtual addresses used to retrieve SW context info */
+	union {
+		void *op_addr;
+		uint64_t pad1;  /* pad to 64 bits */
+	};
+	/*
+	 * Stores additional information needed for driver processing:
+	 * - last_desc_in_batch - flag used to mark last descriptor (CB)
+	 *                        in batch
+	 * - cbs_in_tb - stores information about total number of Code Blocks
+	 *               in currently processed Transport Block
+	 */
+	union {
+		struct {
+			union {
+				struct acc100_fcw_ld fcw_ld;
+				struct acc100_fcw_td fcw_td;
+				struct acc100_fcw_le fcw_le;
+				struct acc100_fcw_te fcw_te;
+				uint32_t pad2[ACC100_FCW_PADDING];
+			};
+			uint32_t last_desc_in_batch :8,
+				cbs_in_tb:8,
+				pad4 : 16;
+		};
+		uint64_t pad3[ACC100_DMA_DESC_PADDING]; /* pad to 64 bits */
+	};
+};
+
+/* ACC100 DMA Descriptor */
+union acc100_dma_desc {
+	struct acc100_dma_req_desc req;
+	union acc100_dma_rsp_desc rsp;
+};
+
+
+/* Union describing HARQ layout entry */
+union acc100_harq_layout_data {
+	uint32_t val;
+	struct {
+		uint16_t offset;
+		uint16_t size0;
+	};
+} __rte_packed;
+
+
+/* Union describing Info Ring entry */
+union acc100_info_ring_data {
+	uint32_t val;
+	struct {
+		union {
+			uint16_t detailed_info;
+			struct {
+				uint16_t aq_id: 4;
+				uint16_t qg_id: 4;
+				uint16_t vf_id: 6;
+				uint16_t reserved: 2;
+			};
+		};
+		uint16_t int_nb: 7;
+		uint16_t msi_0: 1;
+		uint16_t vf2pf: 6;
+		uint16_t loop: 1;
+		uint16_t valid: 1;
+	};
+} __rte_packed;
+
+struct acc100_registry_addr {
+	unsigned int dma_ring_dl5g_hi;
+	unsigned int dma_ring_dl5g_lo;
+	unsigned int dma_ring_ul5g_hi;
+	unsigned int dma_ring_ul5g_lo;
+	unsigned int dma_ring_dl4g_hi;
+	unsigned int dma_ring_dl4g_lo;
+	unsigned int dma_ring_ul4g_hi;
+	unsigned int dma_ring_ul4g_lo;
+	unsigned int ring_size;
+	unsigned int info_ring_hi;
+	unsigned int info_ring_lo;
+	unsigned int info_ring_en;
+	unsigned int info_ring_ptr;
+	unsigned int tail_ptrs_dl5g_hi;
+	unsigned int tail_ptrs_dl5g_lo;
+	unsigned int tail_ptrs_ul5g_hi;
+	unsigned int tail_ptrs_ul5g_lo;
+	unsigned int tail_ptrs_dl4g_hi;
+	unsigned int tail_ptrs_dl4g_lo;
+	unsigned int tail_ptrs_ul4g_hi;
+	unsigned int tail_ptrs_ul4g_lo;
+	unsigned int depth_log0_offset;
+	unsigned int depth_log1_offset;
+	unsigned int qman_group_func;
+	unsigned int ddr_range;
+};
+
+/* Structure holding registry addresses for PF */
+static const struct acc100_registry_addr pf_reg_addr = {
+	.dma_ring_dl5g_hi = HWPfDmaFec5GdlDescBaseHiRegVf,
+	.dma_ring_dl5g_lo = HWPfDmaFec5GdlDescBaseLoRegVf,
+	.dma_ring_ul5g_hi = HWPfDmaFec5GulDescBaseHiRegVf,
+	.dma_ring_ul5g_lo = HWPfDmaFec5GulDescBaseLoRegVf,
+	.dma_ring_dl4g_hi = HWPfDmaFec4GdlDescBaseHiRegVf,
+	.dma_ring_dl4g_lo = HWPfDmaFec4GdlDescBaseLoRegVf,
+	.dma_ring_ul4g_hi = HWPfDmaFec4GulDescBaseHiRegVf,
+	.dma_ring_ul4g_lo = HWPfDmaFec4GulDescBaseLoRegVf,
+	.ring_size = HWPfQmgrRingSizeVf,
+	.info_ring_hi = HWPfHiInfoRingBaseHiRegPf,
+	.info_ring_lo = HWPfHiInfoRingBaseLoRegPf,
+	.info_ring_en = HWPfHiInfoRingIntWrEnRegPf,
+	.info_ring_ptr = HWPfHiInfoRingPointerRegPf,
+	.tail_ptrs_dl5g_hi = HWPfDmaFec5GdlRespPtrHiRegVf,
+	.tail_ptrs_dl5g_lo = HWPfDmaFec5GdlRespPtrLoRegVf,
+	.tail_ptrs_ul5g_hi = HWPfDmaFec5GulRespPtrHiRegVf,
+	.tail_ptrs_ul5g_lo = HWPfDmaFec5GulRespPtrLoRegVf,
+	.tail_ptrs_dl4g_hi = HWPfDmaFec4GdlRespPtrHiRegVf,
+	.tail_ptrs_dl4g_lo = HWPfDmaFec4GdlRespPtrLoRegVf,
+	.tail_ptrs_ul4g_hi = HWPfDmaFec4GulRespPtrHiRegVf,
+	.tail_ptrs_ul4g_lo = HWPfDmaFec4GulRespPtrLoRegVf,
+	.depth_log0_offset = HWPfQmgrGrpDepthLog20Vf,
+	.depth_log1_offset = HWPfQmgrGrpDepthLog21Vf,
+	.qman_group_func = HWPfQmgrGrpFunction0,
+	.ddr_range = HWPfDmaVfDdrBaseRw,
+};
+
+/* Structure holding registry addresses for VF */
+static const struct acc100_registry_addr vf_reg_addr = {
+	.dma_ring_dl5g_hi = HWVfDmaFec5GdlDescBaseHiRegVf,
+	.dma_ring_dl5g_lo = HWVfDmaFec5GdlDescBaseLoRegVf,
+	.dma_ring_ul5g_hi = HWVfDmaFec5GulDescBaseHiRegVf,
+	.dma_ring_ul5g_lo = HWVfDmaFec5GulDescBaseLoRegVf,
+	.dma_ring_dl4g_hi = HWVfDmaFec4GdlDescBaseHiRegVf,
+	.dma_ring_dl4g_lo = HWVfDmaFec4GdlDescBaseLoRegVf,
+	.dma_ring_ul4g_hi = HWVfDmaFec4GulDescBaseHiRegVf,
+	.dma_ring_ul4g_lo = HWVfDmaFec4GulDescBaseLoRegVf,
+	.ring_size = HWVfQmgrRingSizeVf,
+	.info_ring_hi = HWVfHiInfoRingBaseHiVf,
+	.info_ring_lo = HWVfHiInfoRingBaseLoVf,
+	.info_ring_en = HWVfHiInfoRingIntWrEnVf,
+	.info_ring_ptr = HWVfHiInfoRingPointerVf,
+	.tail_ptrs_dl5g_hi = HWVfDmaFec5GdlRespPtrHiRegVf,
+	.tail_ptrs_dl5g_lo = HWVfDmaFec5GdlRespPtrLoRegVf,
+	.tail_ptrs_ul5g_hi = HWVfDmaFec5GulRespPtrHiRegVf,
+	.tail_ptrs_ul5g_lo = HWVfDmaFec5GulRespPtrLoRegVf,
+	.tail_ptrs_dl4g_hi = HWVfDmaFec4GdlRespPtrHiRegVf,
+	.tail_ptrs_dl4g_lo = HWVfDmaFec4GdlRespPtrLoRegVf,
+	.tail_ptrs_ul4g_hi = HWVfDmaFec4GulRespPtrHiRegVf,
+	.tail_ptrs_ul4g_lo = HWVfDmaFec4GulRespPtrLoRegVf,
+	.depth_log0_offset = HWVfQmgrGrpDepthLog20Vf,
+	.depth_log1_offset = HWVfQmgrGrpDepthLog21Vf,
+	.qman_group_func = HWVfQmgrGrpFunction0Vf,
+	.ddr_range = HWVfDmaDdrBaseRangeRoVf,
+};
+
 /* Private data structure for each ACC100 device */
 struct acc100_device {
 	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
-- 
1.8.3.1



* [dpdk-dev] [PATCH v12 03/10] baseband/acc100: add info get function
  2020-10-05 22:12   ` [dpdk-dev] [PATCH v12 00/10] bbdev PMD ACC100 Nicolas Chautru
  2020-10-05 22:12     ` [dpdk-dev] [PATCH v12 01/10] drivers/baseband: add PMD for ACC100 Nicolas Chautru
  2020-10-05 22:12     ` [dpdk-dev] [PATCH v12 02/10] baseband/acc100: add register definition file Nicolas Chautru
@ 2020-10-05 22:12     ` Nicolas Chautru
  2020-10-05 22:12     ` [dpdk-dev] [PATCH v12 04/10] baseband/acc100: add queue configuration Nicolas Chautru
                       ` (7 subsequent siblings)
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-10-05 22:12 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, trix, maxime.coquelin, ferruh.yigit,
	tianjiao.liu, Nicolas Chautru

Add the "info_get" function to the driver, to allow the
device to be queried.
No processing capabilities are available yet.
Link bbdev-test to support the PMD with null capability.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 app/test-bbdev/meson.build               |   3 +
 drivers/baseband/acc100/rte_acc100_cfg.h |  96 +++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.c | 229 +++++++++++++++++++++++++++++++
 drivers/baseband/acc100/rte_acc100_pmd.h |  10 ++
 4 files changed, 338 insertions(+)
 create mode 100644 drivers/baseband/acc100/rte_acc100_cfg.h

diff --git a/app/test-bbdev/meson.build b/app/test-bbdev/meson.build
index 18ab6a8..fbd8ae3 100644
--- a/app/test-bbdev/meson.build
+++ b/app/test-bbdev/meson.build
@@ -12,3 +12,6 @@ endif
 if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC')
 	deps += ['pmd_bbdev_fpga_5gnr_fec']
 endif
+if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_ACC100')
+	deps += ['pmd_bbdev_acc100']
+endif
\ No newline at end of file
diff --git a/drivers/baseband/acc100/rte_acc100_cfg.h b/drivers/baseband/acc100/rte_acc100_cfg.h
new file mode 100644
index 0000000..a1d43ef
--- /dev/null
+++ b/drivers/baseband/acc100/rte_acc100_cfg.h
@@ -0,0 +1,96 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2020 Intel Corporation
+ */
+
+#ifndef _RTE_ACC100_CFG_H_
+#define _RTE_ACC100_CFG_H_
+
+/**
+ * @file rte_acc100_cfg.h
+ *
+ * Functions for configuring ACC100 HW, exposed directly to applications.
+ * Configuration related to encoding/decoding is done through the
+ * librte_bbdev library.
+ *
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ */
+
+#include <stdint.h>
+#include <stdbool.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+/** Number of Virtual Functions ACC100 supports */
+#define RTE_ACC100_NUM_VFS 16
+
+/**
+ * Definition of Queue Topology for ACC100 Configuration
+ * Some level of detail is abstracted out to expose a clean interface
+ * given that comprehensive flexibility is not required
+ */
+struct rte_acc100_queue_topology {
+	/** Number of QGroups in incremental order of priority */
+	uint16_t num_qgroups;
+	/**
+	 * All QGroups have the same number of AQs here.
+	 * Note : Could be made a 16-array if more flexibility is really
+	 * required
+	 */
+	uint16_t num_aqs_per_groups;
+	/**
+	 * Depth of the AQs is the same for all QGroups here. Log2 Enum : 2^N
+	 * Note : Could be made a 16-array if more flexibility is really
+	 * required
+	 */
+	uint16_t aq_depth_log2;
+	/**
+	 * Index of the first Queue Group Index - assuming contiguity
+	 * Initialized as -1
+	 */
+	int8_t first_qgroup_index;
+};
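
[Editor's note: the total number of hardware queues implied by one topology
is num_qgroups x num_aqs_per_groups, and the per-AQ depth in ring entries is
2^aq_depth_log2. A standalone sketch with the topology fields re-declared
minimally for illustration:]

```c
#include <assert.h>
#include <stdint.h>

/* Minimal mirror of rte_acc100_queue_topology for this example */
struct queue_topology {
	uint16_t num_qgroups;
	uint16_t num_aqs_per_groups;
	uint16_t aq_depth_log2;
	int8_t first_qgroup_index;
};

/* Number of hardware queues exposed by one topology */
static inline uint32_t
topology_num_queues(const struct queue_topology *t)
{
	return (uint32_t)t->num_qgroups * t->num_aqs_per_groups;
}

/* AQ depth in ring entries (stored as a log2, i.e. 2^N) */
static inline uint32_t
topology_aq_depth(const struct queue_topology *t)
{
	return 1u << t->aq_depth_log2;
}
```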
+
+/**
+ * Definition of Arbitration related parameters for ACC100 Configuration
+ */
+struct rte_acc100_arbitration {
+	/** Default Weight for VF Fairness Arbitration */
+	uint16_t round_robin_weight;
+	uint32_t gbr_threshold1; /**< Guaranteed Bitrate Threshold 1 */
+	uint32_t gbr_threshold2; /**< Guaranteed Bitrate Threshold 2 */
+};
+
+/**
+ * Structure to pass ACC100 configuration.
+ * Note: all VF Bundles will have the same configuration.
+ */
+struct rte_acc100_conf {
+	bool pf_mode_en; /**< 1 if PF is used for dataplane, 0 for VFs */
+	/** 1 if input '1' bit is represented by a positive LLR value, 0 if '1'
+	 * bit is represented by a negative value.
+	 */
+	bool input_pos_llr_1_bit;
+	/** 1 if output '1' bit is represented by a positive value, 0 if '1'
+	 * bit is represented by a negative value.
+	 */
+	bool output_pos_llr_1_bit;
+	uint16_t num_vf_bundles; /**< Number of VF bundles to setup */
+	/** Queue topology for each operation type */
+	struct rte_acc100_queue_topology q_ul_4g;
+	struct rte_acc100_queue_topology q_dl_4g;
+	struct rte_acc100_queue_topology q_ul_5g;
+	struct rte_acc100_queue_topology q_dl_5g;
+	/** Arbitration configuration for each operation type */
+	struct rte_acc100_arbitration arb_ul_4g[RTE_ACC100_NUM_VFS];
+	struct rte_acc100_arbitration arb_dl_4g[RTE_ACC100_NUM_VFS];
+	struct rte_acc100_arbitration arb_ul_5g[RTE_ACC100_NUM_VFS];
+	struct rte_acc100_arbitration arb_dl_5g[RTE_ACC100_NUM_VFS];
+};
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_ACC100_CFG_H_ */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 1b4cd13..7291bd6 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -26,6 +26,188 @@
 RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
 #endif
 
+/* Read a register of an ACC100 device */
+static inline uint32_t
+acc100_reg_read(struct acc100_device *d, uint32_t offset)
+{
+
+	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
+	uint32_t ret = *((volatile uint32_t *)(reg_addr));
+	return rte_le_to_cpu_32(ret);
+}
+
+/* Calculate the offset of the enqueue register */
+static inline uint32_t
+queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
+{
+	if (pf_device)
+		return ((vf_id << 12) + (qgrp_id << 7) + (aq_id << 3) +
+				HWPfQmgrIngressAq);
+	else
+		return ((qgrp_id << 7) + (aq_id << 3) +
+				HWVfQmgrIngressAq);
+}
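
[Editor's note: the shifts encode the register map stride: on the PF each VF
occupies a 4 KB window (bit 12), each queue group 128 B (bit 7) and each
atomic queue 8 B (bit 3). A standalone sketch with hypothetical base values;
the real bases are HWPfQmgrIngressAq / HWVfQmgrIngressAq from the enum
headers:]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register bases for illustration only */
#define PF_INGRESS_AQ_BASE 0x100000
#define VF_INGRESS_AQ_BASE 0x20000

/* Offset of the enqueue register for a given VF / queue group / AQ */
static inline uint32_t
enqueue_reg_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id,
		uint16_t aq_id)
{
	if (pf_device)
		return (vf_id << 12) + (qgrp_id << 7) + (aq_id << 3) +
				PF_INGRESS_AQ_BASE;
	/* A VF only sees its own window, so no vf_id term */
	return (qgrp_id << 7) + (aq_id << 3) + VF_INGRESS_AQ_BASE;
}
```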
+
+enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC};
+
+/* Return the queue topology for a Queue Group Index */
+static inline void
+qtopFromAcc(struct rte_acc100_queue_topology **qtop, int acc_enum,
+		struct rte_acc100_conf *acc100_conf)
+{
+	struct rte_acc100_queue_topology *p_qtop;
+	p_qtop = NULL;
+	switch (acc_enum) {
+	case UL_4G:
+		p_qtop = &(acc100_conf->q_ul_4g);
+		break;
+	case UL_5G:
+		p_qtop = &(acc100_conf->q_ul_5g);
+		break;
+	case DL_4G:
+		p_qtop = &(acc100_conf->q_dl_4g);
+		break;
+	case DL_5G:
+		p_qtop = &(acc100_conf->q_dl_5g);
+		break;
+	default:
+		/* NOTREACHED */
+		rte_bbdev_log(ERR, "Unexpected error evaluating qtopFromAcc");
+		break;
+	}
+	*qtop = p_qtop;
+}
+
+static void
+initQTop(struct rte_acc100_conf *acc100_conf)
+{
+	acc100_conf->q_ul_4g.num_aqs_per_groups = 0;
+	acc100_conf->q_ul_4g.num_qgroups = 0;
+	acc100_conf->q_ul_4g.first_qgroup_index = -1;
+	acc100_conf->q_ul_5g.num_aqs_per_groups = 0;
+	acc100_conf->q_ul_5g.num_qgroups = 0;
+	acc100_conf->q_ul_5g.first_qgroup_index = -1;
+	acc100_conf->q_dl_4g.num_aqs_per_groups = 0;
+	acc100_conf->q_dl_4g.num_qgroups = 0;
+	acc100_conf->q_dl_4g.first_qgroup_index = -1;
+	acc100_conf->q_dl_5g.num_aqs_per_groups = 0;
+	acc100_conf->q_dl_5g.num_qgroups = 0;
+	acc100_conf->q_dl_5g.first_qgroup_index = -1;
+}
+
+static inline void
+updateQtop(uint8_t acc, uint8_t qg, struct rte_acc100_conf *acc100_conf,
+		struct acc100_device *d) {
+	uint32_t reg;
+	struct rte_acc100_queue_topology *q_top = NULL;
+	qtopFromAcc(&q_top, acc, acc100_conf);
+	if (unlikely(q_top == NULL))
+		return;
+	uint16_t aq;
+	q_top->num_qgroups++;
+	if (q_top->first_qgroup_index == -1) {
+		q_top->first_qgroup_index = qg;
+		/* Can be optimized to assume all are enabled by default */
+		reg = acc100_reg_read(d, queue_offset(d->pf_device,
+				0, qg, ACC100_NUM_AQS - 1));
+		if (reg & ACC100_QUEUE_ENABLE) {
+			q_top->num_aqs_per_groups = ACC100_NUM_AQS;
+			return;
+		}
+		q_top->num_aqs_per_groups = 0;
+		for (aq = 0; aq < ACC100_NUM_AQS; aq++) {
+			reg = acc100_reg_read(d, queue_offset(d->pf_device,
+					0, qg, aq));
+			if (reg & ACC100_QUEUE_ENABLE)
+				q_top->num_aqs_per_groups++;
+		}
+	}
+}
+
+/* Fetch configuration enabled for the PF/VF using MMIO Read (slow) */
+static inline void
+fetch_acc100_config(struct rte_bbdev *dev)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	struct rte_acc100_conf *acc100_conf = &d->acc100_conf;
+	const struct acc100_registry_addr *reg_addr;
+	uint8_t acc, qg;
+	uint32_t reg, reg_aq, reg_len0, reg_len1;
+	uint32_t reg_mode;
+
+	/* No need to retrieve the configuration if it is already done */
+	if (d->configured)
+		return;
+
+	/* Choose correct registry addresses for the device type */
+	if (d->pf_device)
+		reg_addr = &pf_reg_addr;
+	else
+		reg_addr = &vf_reg_addr;
+
+	d->ddr_size = (1 + acc100_reg_read(d, reg_addr->ddr_range)) << 10;
+
+	/* Single VF Bundle by VF */
+	acc100_conf->num_vf_bundles = 1;
+	initQTop(acc100_conf);
+
+	struct rte_acc100_queue_topology *q_top = NULL;
+	int qman_func_id[ACC100_NUM_ACCS] = {ACC100_ACCMAP_0, ACC100_ACCMAP_1,
+			ACC100_ACCMAP_2, ACC100_ACCMAP_3, ACC100_ACCMAP_4};
+	reg = acc100_reg_read(d, reg_addr->qman_group_func);
+	for (qg = 0; qg < ACC100_NUM_QGRPS_PER_WORD; qg++) {
+		reg_aq = acc100_reg_read(d,
+				queue_offset(d->pf_device, 0, qg, 0));
+		if (reg_aq & ACC100_QUEUE_ENABLE) {
+			uint32_t idx = (reg >> (qg * 4)) & 0x7;
+			if (idx < ACC100_NUM_ACCS) {
+				acc = qman_func_id[idx];
+				updateQtop(acc, qg, acc100_conf, d);
+			}
+		}
+	}
+
+	/* Check the depth of the AQs */
+	reg_len0 = acc100_reg_read(d, reg_addr->depth_log0_offset);
+	reg_len1 = acc100_reg_read(d, reg_addr->depth_log1_offset);
+	for (acc = 0; acc < NUM_ACC; acc++) {
+		qtopFromAcc(&q_top, acc, acc100_conf);
+		if (q_top->first_qgroup_index < ACC100_NUM_QGRPS_PER_WORD)
+			q_top->aq_depth_log2 = (reg_len0 >>
+					(q_top->first_qgroup_index * 4))
+					& 0xF;
+		else
+			q_top->aq_depth_log2 = (reg_len1 >>
+					((q_top->first_qgroup_index -
+					ACC100_NUM_QGRPS_PER_WORD) * 4))
+					& 0xF;
+	}
+
+	/* Read PF mode */
+	if (d->pf_device) {
+		reg_mode = acc100_reg_read(d, HWPfHiPfMode);
+		acc100_conf->pf_mode_en = (reg_mode == ACC100_PF_VAL) ? 1 : 0;
+	}
+
+	rte_bbdev_log_debug(
+			"%s Config LLR SIGN IN/OUT %s %s QG %u %u %u %u AQ %u %u %u %u Len %u %u %u %u\n",
+			(d->pf_device) ? "PF" : "VF",
+			(acc100_conf->input_pos_llr_1_bit) ? "POS" : "NEG",
+			(acc100_conf->output_pos_llr_1_bit) ? "POS" : "NEG",
+			acc100_conf->q_ul_4g.num_qgroups,
+			acc100_conf->q_dl_4g.num_qgroups,
+			acc100_conf->q_ul_5g.num_qgroups,
+			acc100_conf->q_dl_5g.num_qgroups,
+			acc100_conf->q_ul_4g.num_aqs_per_groups,
+			acc100_conf->q_dl_4g.num_aqs_per_groups,
+			acc100_conf->q_ul_5g.num_aqs_per_groups,
+			acc100_conf->q_dl_5g.num_aqs_per_groups,
+			acc100_conf->q_ul_4g.aq_depth_log2,
+			acc100_conf->q_dl_4g.aq_depth_log2,
+			acc100_conf->q_ul_5g.aq_depth_log2,
+			acc100_conf->q_dl_5g.aq_depth_log2);
+}
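
[Editor's note: fetch_acc100_config() reads registers that pack one 4-bit
field per queue group: the accelerator index (3 valid bits) from
qman_group_func, and the AQ depth log2 split across two depth_log registers
with 8 groups per 32-bit word. A standalone sketch of the decode:]

```c
#include <assert.h>
#include <stdint.h>

#define NUM_QGRPS_PER_WORD 8

/* Accelerator index for queue group qg: 4 bits per group, 3 bits used */
static inline uint32_t
qg_acc_idx(uint32_t qman_group_func_reg, unsigned int qg)
{
	return (qman_group_func_reg >> (qg * 4)) & 0x7;
}

/* AQ depth log2 for queue group qg, split across two 32-bit registers */
static inline uint32_t
qg_depth_log2(uint32_t reg_len0, uint32_t reg_len1, unsigned int qg)
{
	if (qg < NUM_QGRPS_PER_WORD)
		return (reg_len0 >> (qg * 4)) & 0xF;
	return (reg_len1 >> ((qg - NUM_QGRPS_PER_WORD) * 4)) & 0xF;
}
```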
+
 /* Free 64MB memory used for software rings */
 static int
 acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
@@ -33,8 +215,55 @@
 	return 0;
 }
 
+/* Get ACC100 device info */
+static void
+acc100_dev_info_get(struct rte_bbdev *dev,
+		struct rte_bbdev_driver_info *dev_info)
+{
+	struct acc100_device *d = dev->data->dev_private;
+
+	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
+		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
+	};
+
+	static struct rte_bbdev_queue_conf default_queue_conf;
+	default_queue_conf.socket = dev->data->socket_id;
+	default_queue_conf.queue_size = ACC100_MAX_QUEUE_DEPTH;
+
+	dev_info->driver_name = dev->device->driver->name;
+
+	/* Read and save the populated config from ACC100 registers */
+	fetch_acc100_config(dev);
+
+	/* This isn't ideal because it reports the maximum number of queues but
+	 * does not provide info on how many can be uplink/downlink or at
+	 * different priorities
+	 */
+	dev_info->max_num_queues =
+			d->acc100_conf.q_dl_5g.num_aqs_per_groups *
+			d->acc100_conf.q_dl_5g.num_qgroups +
+			d->acc100_conf.q_ul_5g.num_aqs_per_groups *
+			d->acc100_conf.q_ul_5g.num_qgroups +
+			d->acc100_conf.q_dl_4g.num_aqs_per_groups *
+			d->acc100_conf.q_dl_4g.num_qgroups +
+			d->acc100_conf.q_ul_4g.num_aqs_per_groups *
+			d->acc100_conf.q_ul_4g.num_qgroups;
+	dev_info->queue_size_lim = ACC100_MAX_QUEUE_DEPTH;
+	dev_info->hardware_accelerated = true;
+	dev_info->max_dl_queue_priority =
+			d->acc100_conf.q_dl_4g.num_qgroups - 1;
+	dev_info->max_ul_queue_priority =
+			d->acc100_conf.q_ul_4g.num_qgroups - 1;
+	dev_info->default_queue_conf = default_queue_conf;
+	dev_info->cpu_flag_reqs = NULL;
+	dev_info->min_alignment = 64;
+	dev_info->capabilities = bbdev_capabilities;
+	dev_info->harq_buffer_size = d->ddr_size;
+}
+
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.close = acc100_dev_close,
+	.info_get = acc100_dev_info_get,
 };
 
 /* ACC100 PCI PF address map */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 6525d66..09965c8 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -7,6 +7,7 @@
 
 #include "acc100_pf_enum.h"
 #include "acc100_vf_enum.h"
+#include "rte_acc100_cfg.h"
 
 /* Helper macro for logging */
 #define rte_bbdev_log(level, fmt, ...) \
@@ -98,6 +99,13 @@
 #define ACC100_SIG_UL_4G_LAST 21
 #define ACC100_SIG_DL_4G      27
 #define ACC100_SIG_DL_4G_LAST 31
+#define ACC100_NUM_ACCS       5
+#define ACC100_ACCMAP_0       0
+#define ACC100_ACCMAP_1       2
+#define ACC100_ACCMAP_2       1
+#define ACC100_ACCMAP_3       3
+#define ACC100_ACCMAP_4       4
+#define ACC100_PF_VAL         2
 
 /* max number of iterations to allocate memory block for all rings */
 #define ACC100_SW_RING_MEM_ALLOC_ATTEMPTS 5
@@ -517,6 +525,8 @@ struct acc100_registry_addr {
 /* Private data structure for each ACC100 device */
 struct acc100_device {
 	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
+	uint32_t ddr_size; /* Size in kB */
+	struct rte_acc100_conf acc100_conf; /* ACC100 Initial configuration */
 	bool pf_device; /**< True if this is a PF ACC100 device */
 	bool configured; /**< True if this ACC100 device is configured */
 };
-- 
1.8.3.1



* [dpdk-dev] [PATCH v12 04/10] baseband/acc100: add queue configuration
  2020-10-05 22:12   ` [dpdk-dev] [PATCH v12 00/10] bbdev PMD ACC100 Nicolas Chautru
                       ` (2 preceding siblings ...)
  2020-10-05 22:12     ` [dpdk-dev] [PATCH v12 03/10] baseband/acc100: add info get function Nicolas Chautru
@ 2020-10-05 22:12     ` Nicolas Chautru
  2020-10-05 22:12     ` [dpdk-dev] [PATCH v12 05/10] baseband/acc100: add LDPC processing functions Nicolas Chautru
                       ` (6 subsequent siblings)
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-10-05 22:12 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, trix, maxime.coquelin, ferruh.yigit,
	tianjiao.liu, Nicolas Chautru

Add a function to create and configure queues for
the device. Still no capabilities are exposed.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 445 ++++++++++++++++++++++++++++++-
 drivers/baseband/acc100/rte_acc100_pmd.h |  45 ++++
 2 files changed, 488 insertions(+), 2 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 7291bd6..203ee38 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -26,6 +26,22 @@
 RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);
 #endif
 
+/* Write to MMIO register address */
+static inline void
+mmio_write(void *addr, uint32_t value)
+{
+	*((volatile uint32_t *)(addr)) = rte_cpu_to_le_32(value);
+}
+
+/* Write a register of an ACC100 device */
+static inline void
+acc100_reg_write(struct acc100_device *d, uint32_t offset, uint32_t payload)
+{
+	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
+	mmio_write(reg_addr, payload);
+	usleep(ACC100_LONG_WAIT);
+}
+
 /* Read a register of an ACC100 device */
 static inline uint32_t
 acc100_reg_read(struct acc100_device *d, uint32_t offset)
@@ -36,6 +52,22 @@
 	return rte_le_to_cpu_32(ret);
 }
 
+/* Basic Implementation of Log2 for exact 2^N */
+static inline uint32_t
+log2_basic(uint32_t value)
+{
+	return (value == 0) ? 0 : rte_bsf32(value);
+}
+
+/* Calculate memory alignment offset assuming alignment is 2^N */
+static inline uint32_t
+calc_mem_alignment_offset(void *unaligned_virt_mem, uint32_t alignment)
+{
+	rte_iova_t unaligned_phy_mem = rte_malloc_virt2iova(unaligned_virt_mem);
+	return (uint32_t)(alignment -
+			(unaligned_phy_mem & (alignment-1)));
+}
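
[Editor's note: the offset returned is alignment - (phys mod alignment), so
adding it always advances to the *next* 2^N boundary; for an already-aligned
address it returns the full alignment rather than 0, which is harmless for
the ring-placement use here. A standalone sketch taking a raw physical
address instead of calling rte_malloc_virt2iova():]

```c
#include <assert.h>
#include <stdint.h>

/* Offset from phys to the next 2^N boundary strictly above phys */
static inline uint32_t
align_offset_to_next(uint64_t phys, uint64_t alignment)
{
	return (uint32_t)(alignment - (phys & (alignment - 1)));
}
```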
+
 /* Calculate the offset of the enqueue register */
 static inline uint32_t
 queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
@@ -208,10 +240,416 @@
 			acc100_conf->q_dl_5g.aq_depth_log2);
 }
 
-/* Free 64MB memory used for software rings */
+static void
+free_base_addresses(void **base_addrs, int size)
+{
+	int i;
+	for (i = 0; i < size; i++)
+		rte_free(base_addrs[i]);
+}
+
+static inline uint32_t
+get_desc_len(void)
+{
+	return sizeof(union acc100_dma_desc);
+}
+
+/* Allocate the 2 * 64MB block for the sw rings */
 static int
-acc100_dev_close(struct rte_bbdev *dev  __rte_unused)
+alloc_2x64mb_sw_rings_mem(struct rte_bbdev *dev, struct acc100_device *d,
+		int socket)
 {
+	uint32_t sw_ring_size = ACC100_SIZE_64MBYTE;
+	d->sw_rings_base = rte_zmalloc_socket(dev->device->driver->name,
+			2 * sw_ring_size, RTE_CACHE_LINE_SIZE, socket);
+	if (d->sw_rings_base == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate memory for %s:%u",
+				dev->device->driver->name,
+				dev->data->dev_id);
+		return -ENOMEM;
+	}
+	uint32_t next_64mb_align_offset = calc_mem_alignment_offset(
+			d->sw_rings_base, ACC100_SIZE_64MBYTE);
+	d->sw_rings = RTE_PTR_ADD(d->sw_rings_base, next_64mb_align_offset);
+	d->sw_rings_iova = rte_malloc_virt2iova(d->sw_rings_base) +
+			next_64mb_align_offset;
+	d->sw_ring_size = ACC100_MAX_QUEUE_DEPTH * get_desc_len();
+	d->sw_ring_max_depth = ACC100_MAX_QUEUE_DEPTH;
+
+	return 0;
+}
+
+/* Attempt to allocate minimised memory space for sw rings */
+static void
+alloc_sw_rings_min_mem(struct rte_bbdev *dev, struct acc100_device *d,
+		uint16_t num_queues, int socket)
+{
+	rte_iova_t sw_rings_base_iova, next_64mb_align_addr_iova;
+	uint32_t next_64mb_align_offset;
+	rte_iova_t sw_ring_iova_end_addr;
+	void *base_addrs[ACC100_SW_RING_MEM_ALLOC_ATTEMPTS];
+	void *sw_rings_base;
+	int i = 0;
+	uint32_t q_sw_ring_size = ACC100_MAX_QUEUE_DEPTH * get_desc_len();
+	uint32_t dev_sw_ring_size = q_sw_ring_size * num_queues;
+
+	/* Find an aligned block of memory to store sw rings */
+	while (i < ACC100_SW_RING_MEM_ALLOC_ATTEMPTS) {
+		/*
+		 * sw_ring allocated memory is guaranteed to be aligned to
+		 * q_sw_ring_size at the condition that the requested size is
+		 * less than the page size
+		 */
+		sw_rings_base = rte_zmalloc_socket(
+				dev->device->driver->name,
+				dev_sw_ring_size, q_sw_ring_size, socket);
+
+		if (sw_rings_base == NULL) {
+			rte_bbdev_log(ERR,
+					"Failed to allocate memory for %s:%u",
+					dev->device->driver->name,
+					dev->data->dev_id);
+			break;
+		}
+
+		sw_rings_base_iova = rte_malloc_virt2iova(sw_rings_base);
+		next_64mb_align_offset = calc_mem_alignment_offset(
+				sw_rings_base, ACC100_SIZE_64MBYTE);
+		next_64mb_align_addr_iova = sw_rings_base_iova +
+				next_64mb_align_offset;
+		sw_ring_iova_end_addr = sw_rings_base_iova + dev_sw_ring_size;
+
+		/* Check if the end of the sw ring memory block is before the
+		 * start of next 64MB aligned mem address
+		 */
+		if (sw_ring_iova_end_addr < next_64mb_align_addr_iova) {
+			d->sw_rings_iova = sw_rings_base_iova;
+			d->sw_rings = sw_rings_base;
+			d->sw_rings_base = sw_rings_base;
+			d->sw_ring_size = q_sw_ring_size;
+			d->sw_ring_max_depth = ACC100_MAX_QUEUE_DEPTH;
+			break;
+		}
+		/* Store the address of the unaligned mem block */
+		base_addrs[i] = sw_rings_base;
+		i++;
+	}
+
+	/* Free all unaligned blocks of mem allocated in the loop */
+	free_base_addresses(base_addrs, i);
+}
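
[Editor's note: the retry loop keeps an allocation only when the whole ring
block ends before the next 64 MB boundary, because acc100_setup_queues()
programs the ring base as a 64 MB-masked address and the block must not
straddle two regions. The acceptance test reduces to pure address
arithmetic; a standalone sketch, mirroring the driver's strict comparison:]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SIZE_64MBYTE (64ULL * 1024 * 1024)

/* True if [iova, iova + len) ends before the next 64 MB boundary,
 * i.e. the block never crosses into the following 64 MB region.
 */
static inline bool
fits_in_64mb_region(uint64_t iova, uint64_t len)
{
	uint64_t next_boundary = (iova | (SIZE_64MBYTE - 1)) + 1;
	return iova + len < next_boundary;
}
```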
+
+
+/* Allocate 64MB memory used for all software rings */
+static int
+acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
+{
+	uint32_t phys_low, phys_high, payload;
+	struct acc100_device *d = dev->data->dev_private;
+	const struct acc100_registry_addr *reg_addr;
+
+	if (d->pf_device && !d->acc100_conf.pf_mode_en) {
+		rte_bbdev_log(NOTICE,
+				"%s has PF mode disabled. This PF can't be used.",
+				dev->data->name);
+		return -ENODEV;
+	}
+
+	alloc_sw_rings_min_mem(dev, d, num_queues, socket_id);
+
+	/* If minimal memory space approach failed, then allocate
+	 * the 2 * 64MB block for the sw rings
+	 */
+	if (d->sw_rings == NULL)
+		alloc_2x64mb_sw_rings_mem(dev, d, socket_id);
+
+	if (d->sw_rings == NULL) {
+		rte_bbdev_log(NOTICE,
+				"Failure allocating sw_rings memory");
+		return -ENODEV;
+	}
+
+	/* Configure ACC100 with the base address for DMA descriptor rings
+	 * Same descriptor rings used for UL and DL DMA Engines
+	 * Note : Assuming only VF0 bundle is used for PF mode
+	 */
+	phys_high = (uint32_t)(d->sw_rings_iova >> 32);
+	phys_low  = (uint32_t)(d->sw_rings_iova & ~(ACC100_SIZE_64MBYTE-1));
+
+	/* Choose correct registry addresses for the device type */
+	if (d->pf_device)
+		reg_addr = &pf_reg_addr;
+	else
+		reg_addr = &vf_reg_addr;
+
+	/* Read the populated cfg from ACC100 registers */
+	fetch_acc100_config(dev);
+
+	/* Release AXI from PF */
+	if (d->pf_device)
+		acc100_reg_write(d, HWPfDmaAxiControl, 1);
+
+	acc100_reg_write(d, reg_addr->dma_ring_ul5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_ul5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->dma_ring_dl5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_dl5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->dma_ring_ul4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_ul4g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->dma_ring_dl4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->dma_ring_dl4g_lo, phys_low);
+
+	/*
+	 * Configure Ring Size to the max queue ring size
+	 * (used for wrapping purpose)
+	 */
+	payload = log2_basic(d->sw_ring_size / 64);
+	acc100_reg_write(d, reg_addr->ring_size, payload);
+
+	/* Configure tail pointer for use when SDONE enabled */
+	d->tail_ptrs = rte_zmalloc_socket(
+			dev->device->driver->name,
+			ACC100_NUM_QGRPS * ACC100_NUM_AQS * sizeof(uint32_t),
+			RTE_CACHE_LINE_SIZE, socket_id);
+	if (d->tail_ptrs == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate tail ptr for %s:%u",
+				dev->device->driver->name,
+				dev->data->dev_id);
+		rte_free(d->sw_rings_base);
+		return -ENOMEM;
+	}
+	d->tail_ptr_iova = rte_malloc_virt2iova(d->tail_ptrs);
+
+	phys_high = (uint32_t)(d->tail_ptr_iova >> 32);
+	phys_low  = (uint32_t)(d->tail_ptr_iova);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl5g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_ul4g_lo, phys_low);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
+	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
+
+	d->harq_layout = rte_zmalloc_socket("HARQ Layout",
+			ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
+			RTE_CACHE_LINE_SIZE, dev->data->socket_id);
+	if (d->harq_layout == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate harq_layout for %s:%u",
+				dev->device->driver->name,
+				dev->data->dev_id);
+		rte_free(d->tail_ptrs);
+		rte_free(d->sw_rings_base);
+		return -ENOMEM;
+	}
+
+	/* Mark as configured properly */
+	d->configured = true;
+
+	rte_bbdev_log_debug(
			"ACC100 (%s) configured sw_rings = %p, sw_rings_iova = %#"
+			PRIx64, dev->data->name, d->sw_rings, d->sw_rings_iova);
+
+	return 0;
+}
+
+/* Free memory used for software rings */
+static int
+acc100_dev_close(struct rte_bbdev *dev)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	if (d->sw_rings_base != NULL) {
+		rte_free(d->tail_ptrs);
+		rte_free(d->sw_rings_base);
+		d->sw_rings_base = NULL;
+	}
+	/* Ensure all in flight HW transactions are completed */
+	usleep(ACC100_LONG_WAIT);
+	return 0;
+}
+
+
+/**
+ * Report an ACC100 queue index which is free
+ * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
+ * Note: Only supporting VF0 Bundle for PF mode
+ */
+static int
+acc100_find_free_queue_idx(struct rte_bbdev *dev,
+		const struct rte_bbdev_queue_conf *conf)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	int op_2_acc[5] = {0, UL_4G, DL_4G, UL_5G, DL_5G};
+	int acc = op_2_acc[conf->op_type];
+	struct rte_acc100_queue_topology *qtop = NULL;
+
+	qtopFromAcc(&qtop, acc, &(d->acc100_conf));
+	if (qtop == NULL)
+		return -1;
+	/* Identify matching QGroup Index which are sorted in priority order */
+	uint16_t group_idx = qtop->first_qgroup_index;
+	group_idx += conf->priority;
+	if (group_idx >= ACC100_NUM_QGRPS ||
+			conf->priority >= qtop->num_qgroups) {
+		rte_bbdev_log(INFO, "Invalid Priority on %s, priority %u",
+				dev->data->name, conf->priority);
+		return -1;
+	}
+	/* Find a free AQ_idx  */
+	uint16_t aq_idx;
+	for (aq_idx = 0; aq_idx < qtop->num_aqs_per_groups; aq_idx++) {
+		if (((d->q_assigned_bit_map[group_idx] >> aq_idx) & 0x1) == 0) {
+			/* Mark the Queue as assigned */
+			d->q_assigned_bit_map[group_idx] |= (1 << aq_idx);
+			/* Report the AQ Index */
+			return (group_idx << ACC100_GRP_ID_SHIFT) + aq_idx;
+		}
+	}
+	rte_bbdev_log(INFO, "Failed to find free queue on %s, priority %u",
+			dev->data->name, conf->priority);
+	return -1;
+}
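
For illustration, the bitmap scan and index packing above can be reproduced standalone. This is a sketch only: `GRP_ID_SHIFT` and `NUM_AQS` here are illustrative stand-ins for the driver's `ACC100_GRP_ID_SHIFT` and per-group AQ count, which come from `rte_acc100_pmd.h`.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants; the real values live in rte_acc100_pmd.h */
#define GRP_ID_SHIFT 10
#define NUM_AQS      16

/* Scan a per-group bitmap for a free atomic queue, mark it assigned,
 * and pack the result as (group << GRP_ID_SHIFT) + aq, as the driver does.
 */
static int
find_free_aq(uint16_t *bit_map, uint16_t group_idx, uint16_t num_aqs)
{
	uint16_t aq_idx;

	for (aq_idx = 0; aq_idx < num_aqs; aq_idx++) {
		if (((bit_map[group_idx] >> aq_idx) & 0x1) == 0) {
			/* Mark the queue as assigned */
			bit_map[group_idx] |= (1 << aq_idx);
			return (group_idx << GRP_ID_SHIFT) + aq_idx;
		}
	}
	return -1; /* group exhausted */
}
```

`acc100_queue_setup` later unpacks the same index with `(q_idx >> ACC100_GRP_ID_SHIFT) & 0xF` and `q_idx & 0xF`.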
+
+/* Setup ACC100 queue */
+static int
+acc100_queue_setup(struct rte_bbdev *dev, uint16_t queue_id,
+		const struct rte_bbdev_queue_conf *conf)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	struct acc100_queue *q;
+	int16_t q_idx;
+
+	if (d == NULL) {
+		rte_bbdev_log(ERR, "Undefined device");
+		return -ENODEV;
+	}
+
+	/* Allocate the queue data structure. */
+	q = rte_zmalloc_socket(dev->device->driver->name, sizeof(*q),
+			RTE_CACHE_LINE_SIZE, conf->socket);
+	if (q == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate queue memory");
+		return -ENOMEM;
+	}
+
+	q->d = d;
+	q->ring_addr = RTE_PTR_ADD(d->sw_rings, (d->sw_ring_size * queue_id));
+	q->ring_addr_iova = d->sw_rings_iova + (d->sw_ring_size * queue_id);
+
+	/* Prepare the Ring with default descriptor format */
+	union acc100_dma_desc *desc = NULL;
+	unsigned int desc_idx, b_idx;
+	int fcw_len = (conf->op_type == RTE_BBDEV_OP_LDPC_ENC ?
+		ACC100_FCW_LE_BLEN : (conf->op_type == RTE_BBDEV_OP_TURBO_DEC ?
+		ACC100_FCW_TD_BLEN : ACC100_FCW_LD_BLEN));
+
+	for (desc_idx = 0; desc_idx < d->sw_ring_max_depth; desc_idx++) {
+		desc = q->ring_addr + desc_idx;
+		desc->req.word0 = ACC100_DMA_DESC_TYPE;
+		desc->req.word1 = 0; /**< Timestamp */
+		desc->req.word2 = 0;
+		desc->req.word3 = 0;
+		uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+		desc->req.data_ptrs[0].address = q->ring_addr_iova + fcw_offset;
+		desc->req.data_ptrs[0].blen = fcw_len;
+		desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
+		desc->req.data_ptrs[0].last = 0;
+		desc->req.data_ptrs[0].dma_ext = 0;
+		for (b_idx = 1; b_idx < ACC100_DMA_MAX_NUM_POINTERS - 1;
+				b_idx++) {
+			desc->req.data_ptrs[b_idx].blkid = ACC100_DMA_BLKID_IN;
+			desc->req.data_ptrs[b_idx].last = 1;
+			desc->req.data_ptrs[b_idx].dma_ext = 0;
+			b_idx++;
+			desc->req.data_ptrs[b_idx].blkid =
+					ACC100_DMA_BLKID_OUT_ENC;
+			desc->req.data_ptrs[b_idx].last = 1;
+			desc->req.data_ptrs[b_idx].dma_ext = 0;
+		}
+		/* Preset some fields of LDPC FCW */
+		desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
+		desc->req.fcw_ld.gain_i = 1;
+		desc->req.fcw_ld.gain_h = 1;
+	}
+
+	q->lb_in = rte_zmalloc_socket(dev->device->driver->name,
+			RTE_CACHE_LINE_SIZE,
+			RTE_CACHE_LINE_SIZE, conf->socket);
+	if (q->lb_in == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate lb_in memory");
+		rte_free(q);
+		return -ENOMEM;
+	}
+	q->lb_in_addr_iova = rte_malloc_virt2iova(q->lb_in);
+	q->lb_out = rte_zmalloc_socket(dev->device->driver->name,
+			RTE_CACHE_LINE_SIZE,
+			RTE_CACHE_LINE_SIZE, conf->socket);
+	if (q->lb_out == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate lb_out memory");
+		rte_free(q->lb_in);
+		rte_free(q);
+		return -ENOMEM;
+	}
+	q->lb_out_addr_iova = rte_malloc_virt2iova(q->lb_out);
+
+	/*
+	 * Software queue ring wraps synchronously with the HW when it reaches
+	 * the boundary of the maximum allocated queue size, no matter what the
+	 * sw queue size is. This wrapping is guarded by setting the wrap_mask
+	 * to represent the maximum queue size as allocated at the time when
+	 * the device has been setup (in configure()).
+	 *
+	 * The queue depth is set to the queue size value (conf->queue_size).
+	 * This limits the occupancy of the queue at any point of time, so that
+	 * the queue does not get swamped with enqueue requests.
+	 */
+	q->sw_ring_depth = conf->queue_size;
+	q->sw_ring_wrap_mask = d->sw_ring_max_depth - 1;
+
+	q->op_type = conf->op_type;
+
+	q_idx = acc100_find_free_queue_idx(dev, conf);
+	if (q_idx == -1) {
+		rte_free(q->lb_in);
+		rte_free(q->lb_out);
+		rte_free(q);
+		return -1;
+	}
+
+	q->qgrp_id = (q_idx >> ACC100_GRP_ID_SHIFT) & 0xF;
+	q->vf_id = (q_idx >> ACC100_VF_ID_SHIFT)  & 0x3F;
+	q->aq_id = q_idx & 0xF;
+	q->aq_depth = (conf->op_type ==  RTE_BBDEV_OP_TURBO_DEC) ?
+			(1 << d->acc100_conf.q_ul_4g.aq_depth_log2) :
+			(1 << d->acc100_conf.q_dl_4g.aq_depth_log2);
+
+	q->mmio_reg_enqueue = RTE_PTR_ADD(d->mmio_base,
+			queue_offset(d->pf_device,
+					q->vf_id, q->qgrp_id, q->aq_id));
+
+	rte_bbdev_log_debug(
+			"Setup dev%u q%u: qgrp_id=%u, vf_id=%u, aq_id=%u, aq_depth=%u, mmio_reg_enqueue=%p",
+			dev->data->dev_id, queue_id, q->qgrp_id, q->vf_id,
+			q->aq_id, q->aq_depth, q->mmio_reg_enqueue);
+
+	dev->data->queues[queue_id].queue_private = q;
+	return 0;
+}
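
The wrapping scheme described in the comment above (the ring wraps at the maximum allocated depth via `sw_ring_wrap_mask`, while `sw_ring_depth` limits occupancy) can be sketched as masked index arithmetic, under the assumption that the maximum depth is a power of two. The depth values below are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Map a free-running descriptor index onto a ring whose allocated depth
 * is a power of two, as sw_ring_wrap_mask does in acc100_queue_setup. */
static uint32_t
ring_slot(uint32_t head, uint32_t max_depth)
{
	uint32_t wrap_mask = max_depth - 1; /* requires max_depth == 2^n */

	return head & wrap_mask;
}
```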
+
+/* Release ACC100 queue */
+static int
+acc100_queue_release(struct rte_bbdev *dev, uint16_t q_id)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	struct acc100_queue *q = dev->data->queues[q_id].queue_private;
+
+	if (q != NULL) {
+		/* Mark the Queue as un-assigned */
+		d->q_assigned_bit_map[q->qgrp_id] &= (0xFFFFFFFF -
+				(1 << q->aq_id));
+		rte_free(q->lb_in);
+		rte_free(q->lb_out);
+		rte_free(q);
+		dev->data->queues[q_id].queue_private = NULL;
+	}
+
 	return 0;
 }
 
@@ -262,8 +700,11 @@
 }
 
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
+	.setup_queues = acc100_setup_queues,
 	.close = acc100_dev_close,
 	.info_get = acc100_dev_info_get,
+	.queue_setup = acc100_queue_setup,
+	.queue_release = acc100_queue_release,
 };
 
 /* ACC100 PCI PF address map */
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 09965c8..5c8dde3 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -522,11 +522,56 @@ struct acc100_registry_addr {
 	.ddr_range = HWVfDmaDdrBaseRangeRoVf,
 };
 
+/* Structure associated with each queue. */
+struct __rte_cache_aligned acc100_queue {
+	union acc100_dma_desc *ring_addr;  /* Virtual address of sw ring */
+	rte_iova_t ring_addr_iova;  /* IOVA address of software ring */
+	uint32_t sw_ring_head;  /* software ring head */
+	uint32_t sw_ring_tail;  /* software ring tail */
+	/* software ring size (descriptors, not bytes) */
+	uint32_t sw_ring_depth;
+	/* mask used to wrap enqueued descriptors on the sw ring */
+	uint32_t sw_ring_wrap_mask;
+	/* MMIO register used to enqueue descriptors */
+	void *mmio_reg_enqueue;
+	uint8_t vf_id;  /* VF ID (max = 63) */
+	uint8_t qgrp_id;  /* Queue Group ID */
+	uint16_t aq_id;  /* Atomic Queue ID */
+	uint16_t aq_depth;  /* Depth of atomic queue */
+	uint32_t aq_enqueued;  /* Count how many "batches" have been enqueued */
+	uint32_t aq_dequeued;  /* Count how many "batches" have been dequeued */
+	uint32_t irq_enable;  /* Enable ops dequeue interrupts if set to 1 */
+	struct rte_mempool *fcw_mempool;  /* FCW mempool */
+	enum rte_bbdev_op_type op_type;  /* Type of this Queue: TE or TD */
+	/* Internal Buffers for loopback input */
+	uint8_t *lb_in;
+	uint8_t *lb_out;
+	rte_iova_t lb_in_addr_iova;
+	rte_iova_t lb_out_addr_iova;
+	struct acc100_device *d;
+};
+
 /* Private data structure for each ACC100 device */
 struct acc100_device {
 	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
+	void *sw_rings_base;  /* Base addr of un-aligned memory for sw rings */
+	void *sw_rings;  /* 64MB of 64MB-aligned memory for sw rings */
+	rte_iova_t sw_rings_iova;  /* IOVA address of sw_rings */
+	/* Virtual address of the info memory routed to this function,
+	 * whether it is operating as PF or VF.
+	 */
+	union acc100_harq_layout_data *harq_layout;
+	uint32_t sw_ring_size;
 	uint32_t ddr_size; /* Size in kB */
+	uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */
+	rte_iova_t tail_ptr_iova; /* IOVA address of tail pointers */
+	/* Max number of entries available for each queue in device, depending
+	 * on how many queues are enabled with configure()
+	 */
+	uint32_t sw_ring_max_depth;
 	struct rte_acc100_conf acc100_conf; /* ACC100 Initial configuration */
+	/* Bitmap capturing which Queues have already been assigned */
+	uint16_t q_assigned_bit_map[ACC100_NUM_QGRPS];
 	bool pf_device; /**< True if this is a PF ACC100 device */
 	bool configured; /**< True if this ACC100 device is configured */
 };
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v12 05/10] baseband/acc100: add LDPC processing functions
  2020-10-05 22:12   ` [dpdk-dev] [PATCH v12 00/10] bbdev PMD ACC100 Nicolas Chautru
                       ` (3 preceding siblings ...)
  2020-10-05 22:12     ` [dpdk-dev] [PATCH v12 04/10] baseband/acc100: add queue configuration Nicolas Chautru
@ 2020-10-05 22:12     ` Nicolas Chautru
  2020-10-05 22:12     ` [dpdk-dev] [PATCH v12 06/10] baseband/acc100: add HARQ loopback support Nicolas Chautru
                       ` (5 subsequent siblings)
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-10-05 22:12 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, trix, maxime.coquelin, ferruh.yigit,
	tianjiao.liu, Nicolas Chautru

Adding LDPC decode and encode processing operations

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
Acked-by: Dave Burley <dave.burley@accelercomm.com>
---
 doc/guides/bbdevs/features/acc100.ini    |    8 +-
 drivers/baseband/acc100/rte_acc100_pmd.c | 1621 +++++++++++++++++++++++++++++-
 drivers/baseband/acc100/rte_acc100_pmd.h |    6 +
 3 files changed, 1629 insertions(+), 6 deletions(-)

diff --git a/doc/guides/bbdevs/features/acc100.ini b/doc/guides/bbdevs/features/acc100.ini
index c89a4d7..40c7adc 100644
--- a/doc/guides/bbdevs/features/acc100.ini
+++ b/doc/guides/bbdevs/features/acc100.ini
@@ -6,9 +6,9 @@
 [Features]
 Turbo Decoder (4G)     = N
 Turbo Encoder (4G)     = N
-LDPC Decoder (5G)      = N
-LDPC Encoder (5G)      = N
-LLR/HARQ Compression   = N
-External DDR Access    = N
+LDPC Decoder (5G)      = Y
+LDPC Encoder (5G)      = Y
+LLR/HARQ Compression   = Y
+External DDR Access    = Y
 HW Accelerated         = Y
 BBDEV API              = Y
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 203ee38..05f6f5e 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -15,6 +15,9 @@
 #include <rte_hexdump.h>
 #include <rte_pci.h>
 #include <rte_bus_pci.h>
+#ifdef RTE_BBDEV_OFFLOAD_COST
+#include <rte_cycles.h>
+#endif
 
 #include <rte_bbdev.h>
 #include <rte_bbdev_pmd.h>
@@ -466,7 +469,6 @@
 	return 0;
 }
 
-
 /**
 * Report an ACC100 queue index which is free
  * Return 0 to 16k for a valid queue_idx or -1 when no queue is available
@@ -661,6 +663,46 @@
 	struct acc100_device *d = dev->data->dev_private;
 
 	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
+		{
+			.type   = RTE_BBDEV_OP_LDPC_ENC,
+			.cap.ldpc_enc = {
+				.capability_flags =
+					RTE_BBDEV_LDPC_RATE_MATCH |
+					RTE_BBDEV_LDPC_CRC_24B_ATTACH |
+					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
+				.num_buffers_src =
+						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+				.num_buffers_dst =
+						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+			}
+		},
+		{
+			.type   = RTE_BBDEV_OP_LDPC_DEC,
+			.cap.ldpc_dec = {
+			.capability_flags =
+				RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK |
+				RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP |
+				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
+				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
+#ifdef ACC100_EXT_MEM
+				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
+				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
+#endif
+				RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE |
+				RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS |
+				RTE_BBDEV_LDPC_DECODE_BYPASS |
+				RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
+				RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
+				RTE_BBDEV_LDPC_LLR_COMPRESSION,
+			.llr_size = 8,
+			.llr_decimals = 1,
+			.num_buffers_src =
+					RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+			.num_buffers_hard_out =
+					RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+			.num_buffers_soft_out = 0,
+			}
+		},
 		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
 	};
 
@@ -696,9 +738,14 @@
 	dev_info->cpu_flag_reqs = NULL;
 	dev_info->min_alignment = 64;
 	dev_info->capabilities = bbdev_capabilities;
+#ifdef ACC100_EXT_MEM
 	dev_info->harq_buffer_size = d->ddr_size;
+#else
+	dev_info->harq_buffer_size = 0;
+#endif
 }
 
+
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.setup_queues = acc100_setup_queues,
 	.close = acc100_dev_close,
@@ -723,6 +770,1573 @@
 	{.device_id = 0},
 };
 
+/* Read flag value 0/1 from bitmap */
+static inline bool
+check_bit(uint32_t bitmap, uint32_t bitmask)
+{
+	return bitmap & bitmask;
+}
+
+static inline char *
+mbuf_append(struct rte_mbuf *m_head, struct rte_mbuf *m, uint16_t len)
+{
+	if (unlikely(len > rte_pktmbuf_tailroom(m)))
+		return NULL;
+
+	char *tail = (char *)m->buf_addr + m->data_off + m->data_len;
+	m->data_len = (uint16_t)(m->data_len + len);
+	m_head->pkt_len  = (m_head->pkt_len + len);
+	return tail;
+}
+
+/* Compute value of k0.
+ * Based on 3GPP 38.212 Table 5.4.2.1-2
+ * Starting position of different redundancy versions, k0
+ */
+static inline uint16_t
+get_k0(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv_index)
+{
+	if (rv_index == 0)
+		return 0;
+	uint16_t n = (bg == 1 ? ACC100_N_ZC_1 : ACC100_N_ZC_2) * z_c;
+	if (n_cb == n) {
+		if (rv_index == 1)
+			return (bg == 1 ? ACC100_K0_1_1 : ACC100_K0_1_2) * z_c;
+		else if (rv_index == 2)
+			return (bg == 1 ? ACC100_K0_2_1 : ACC100_K0_2_2) * z_c;
+		else
+			return (bg == 1 ? ACC100_K0_3_1 : ACC100_K0_3_2) * z_c;
+	}
+	/* LBRM case - includes a division by N */
+	if (rv_index == 1)
+		return (((bg == 1 ? ACC100_K0_1_1 : ACC100_K0_1_2) * n_cb)
+				/ n) * z_c;
+	else if (rv_index == 2)
+		return (((bg == 1 ? ACC100_K0_2_1 : ACC100_K0_2_2) * n_cb)
+				/ n) * z_c;
+	else
+		return (((bg == 1 ? ACC100_K0_3_1 : ACC100_K0_3_2) * n_cb)
+				/ n) * z_c;
+}
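
`get_k0` implements the starting positions of 3GPP TS 38.212 Table 5.4.2.1-2. A hedged standalone sketch, assuming the spec's numerators (17/33/56 for BG1, 13/25/43 for BG2) and circular-buffer column counts (66/50) are the values behind the driver's `ACC100_K0_*` and `ACC100_N_ZC_*` macros; the full-buffer and LBRM branches above both reduce to the single integer-division form below.

```c
#include <assert.h>
#include <stdint.h>

/* k0 per 3GPP TS 38.212 Table 5.4.2.1-2. Numerators and column counts
 * are the spec values assumed to match the ACC100_* macros. */
static uint16_t
k0_38212(uint16_t n_cb, uint16_t z_c, uint8_t bg, uint8_t rv)
{
	static const uint16_t num[2][4] = {
		{0, 17, 33, 56},	/* BG1, denominator 66 */
		{0, 13, 25, 43},	/* BG2, denominator 50 */
	};
	uint16_t cols = (bg == 1) ? 66 : 50;

	if (rv == 0)
		return 0;
	/* Integer division by (cols * z_c), then re-scale by z_c; this
	 * covers both n_cb == cols * z_c (full buffer) and LBRM. */
	return (uint16_t)(((uint32_t)num[bg - 1][rv] * n_cb /
			((uint32_t)cols * z_c)) * z_c);
}
```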
+
+/* Fill in a frame control word for LDPC encoding. */
+static inline void
+acc100_fcw_le_fill(const struct rte_bbdev_enc_op *op,
+		struct acc100_fcw_le *fcw, int num_cb)
+{
+	fcw->qm = op->ldpc_enc.q_m;
+	fcw->nfiller = op->ldpc_enc.n_filler;
+	fcw->BG = (op->ldpc_enc.basegraph - 1);
+	fcw->Zc = op->ldpc_enc.z_c;
+	fcw->ncb = op->ldpc_enc.n_cb;
+	fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_enc.basegraph,
+			op->ldpc_enc.rv_index);
+	fcw->rm_e = op->ldpc_enc.cb_params.e;
+	fcw->crc_select = check_bit(op->ldpc_enc.op_flags,
+			RTE_BBDEV_LDPC_CRC_24B_ATTACH);
+	fcw->bypass_intlv = check_bit(op->ldpc_enc.op_flags,
+			RTE_BBDEV_LDPC_INTERLEAVER_BYPASS);
+	fcw->mcb_count = num_cb;
+}
+
+/* Fill in a frame control word for LDPC decoding. */
+static inline void
+acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
+		union acc100_harq_layout_data *harq_layout)
+{
+	uint16_t harq_out_length, harq_in_length, ncb_p, k0_p, parity_offset;
+	uint16_t harq_index;
+	uint32_t l;
+	bool harq_prun = false;
+
+	fcw->qm = op->ldpc_dec.q_m;
+	fcw->nfiller = op->ldpc_dec.n_filler;
+	fcw->BG = (op->ldpc_dec.basegraph - 1);
+	fcw->Zc = op->ldpc_dec.z_c;
+	fcw->ncb = op->ldpc_dec.n_cb;
+	fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_dec.basegraph,
+			op->ldpc_dec.rv_index);
+	if (op->ldpc_dec.code_block_mode == 1)
+		fcw->rm_e = op->ldpc_dec.cb_params.e;
+	else
+		fcw->rm_e = (op->ldpc_dec.tb_params.r <
+				op->ldpc_dec.tb_params.cab) ?
+						op->ldpc_dec.tb_params.ea :
+						op->ldpc_dec.tb_params.eb;
+
+	fcw->hcin_en = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE);
+	fcw->hcout_en = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE);
+	fcw->crc_select = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK);
+	fcw->bypass_dec = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_DECODE_BYPASS);
+	fcw->bypass_intlv = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS);
+	if (op->ldpc_dec.q_m == 1) {
+		fcw->bypass_intlv = 1;
+		fcw->qm = 2;
+	}
+	fcw->hcin_decomp_mode = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+	fcw->hcout_comp_mode = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+	fcw->llr_pack_mode = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_LLR_COMPRESSION);
+	harq_index = op->ldpc_dec.harq_combined_output.offset /
+			ACC100_HARQ_OFFSET;
+#ifdef ACC100_EXT_MEM
+	/* Limit cases when HARQ pruning is valid */
+	harq_prun = ((op->ldpc_dec.harq_combined_output.offset %
+			ACC100_HARQ_OFFSET) == 0) &&
+			(op->ldpc_dec.harq_combined_output.offset <= UINT16_MAX
+			* ACC100_HARQ_OFFSET);
+#endif
+	if (fcw->hcin_en > 0) {
+		harq_in_length = op->ldpc_dec.harq_combined_input.length;
+		if (fcw->hcin_decomp_mode > 0)
+			harq_in_length = harq_in_length * 8 / 6;
+		harq_in_length = RTE_ALIGN(harq_in_length, 64);
+		if ((harq_layout[harq_index].offset > 0) && harq_prun) {
+			rte_bbdev_log_debug("HARQ IN offset unexpected for now\n");
+			fcw->hcin_size0 = harq_layout[harq_index].size0;
+			fcw->hcin_offset = harq_layout[harq_index].offset;
+			fcw->hcin_size1 = harq_in_length -
+					harq_layout[harq_index].offset;
+		} else {
+			fcw->hcin_size0 = harq_in_length;
+			fcw->hcin_offset = 0;
+			fcw->hcin_size1 = 0;
+		}
+	} else {
+		fcw->hcin_size0 = 0;
+		fcw->hcin_offset = 0;
+		fcw->hcin_size1 = 0;
+	}
+
+	fcw->itmax = op->ldpc_dec.iter_max;
+	fcw->itstop = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE);
+	fcw->synd_precoder = fcw->itstop;
+	/*
+	 * These are all implicitly set
+	 * fcw->synd_post = 0;
+	 * fcw->so_en = 0;
+	 * fcw->so_bypass_rm = 0;
+	 * fcw->so_bypass_intlv = 0;
+	 * fcw->dec_convllr = 0;
+	 * fcw->hcout_convllr = 0;
+	 * fcw->hcout_size1 = 0;
+	 * fcw->so_it = 0;
+	 * fcw->hcout_offset = 0;
+	 * fcw->negstop_th = 0;
+	 * fcw->negstop_it = 0;
+	 * fcw->negstop_en = 0;
+	 * fcw->gain_i = 1;
+	 * fcw->gain_h = 1;
+	 */
+	if (fcw->hcout_en > 0) {
+		parity_offset = (op->ldpc_dec.basegraph == 1 ? 20 : 8)
+			* op->ldpc_dec.z_c - op->ldpc_dec.n_filler;
+		k0_p = (fcw->k0 > parity_offset) ?
+				fcw->k0 - op->ldpc_dec.n_filler : fcw->k0;
+		ncb_p = fcw->ncb - op->ldpc_dec.n_filler;
+		l = k0_p + fcw->rm_e;
+		harq_out_length = (uint16_t) fcw->hcin_size0;
+		harq_out_length = RTE_MIN(RTE_MAX(harq_out_length, l), ncb_p);
+		harq_out_length = (harq_out_length + 0x3F) & 0xFFC0;
+		if ((k0_p > fcw->hcin_size0 + ACC100_HARQ_OFFSET_THRESHOLD) &&
+				harq_prun) {
+			fcw->hcout_size0 = (uint16_t) fcw->hcin_size0;
+			fcw->hcout_offset = k0_p & 0xFFC0;
+			fcw->hcout_size1 = harq_out_length - fcw->hcout_offset;
+		} else {
+			fcw->hcout_size0 = harq_out_length;
+			fcw->hcout_size1 = 0;
+			fcw->hcout_offset = 0;
+		}
+		harq_layout[harq_index].offset = fcw->hcout_offset;
+		harq_layout[harq_index].size0 = fcw->hcout_size0;
+	} else {
+		fcw->hcout_size0 = 0;
+		fcw->hcout_size1 = 0;
+		fcw->hcout_offset = 0;
+	}
+}
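
The HARQ combined-output sizing above clamps the length between the rate-matched extent `l` and the pruned circular-buffer size `ncb_p`, then rounds up to 64-byte granularity with `(x + 0x3F) & 0xFFC0`. A standalone sketch of that arithmetic (note the rounding can push the result slightly past `ncb_p`, faithful to the code above):

```c
#include <assert.h>
#include <stdint.h>

/* Clamp then 64-byte-align the HARQ output length, mirroring the
 * computation in acc100_fcw_ld_fill. */
static uint16_t
harq_out_len(uint16_t hcin_size0, uint32_t l, uint16_t ncb_p)
{
	uint32_t len = hcin_size0;

	if (l > len)
		len = l;		/* at least the rate-matched extent */
	if (len > ncb_p)
		len = ncb_p;		/* at most the pruned buffer size */
	return (uint16_t)((len + 0x3F) & 0xFFC0); /* round up to 64B */
}
```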
+
+/**
+ * Fills descriptor with data pointers of one block type.
+ *
+ * @param desc
+ *   Pointer to DMA descriptor.
+ * @param input
+ *   Pointer to pointer to input data which will be encoded. It can be changed
+ *   and points to next segment in scatter-gather case.
+ * @param offset
+ *   Input offset in rte_mbuf structure. It is used for calculating the point
+ *   where data is starting.
+ * @param cb_len
+ *   Length of currently processed Code Block
+ * @param seg_total_left
+ *   It indicates how many bytes still left in segment (mbuf) for further
+ *   processing.
+ * @param op_flags
+ *   Store information about device capabilities
+ * @param next_triplet
+ *   Index for ACC100 DMA Descriptor triplet
+ *
+ * @return
+ *   Returns index of next triplet on success, other value if lengths of
+ *   pkt and processed cb do not match.
+ *
+ */
+static inline int
+acc100_dma_fill_blk_type_in(struct acc100_dma_req_desc *desc,
+		struct rte_mbuf **input, uint32_t *offset, uint32_t cb_len,
+		uint32_t *seg_total_left, int next_triplet)
+{
+	uint32_t part_len;
+	struct rte_mbuf *m = *input;
+
+	part_len = (*seg_total_left < cb_len) ? *seg_total_left : cb_len;
+	cb_len -= part_len;
+	*seg_total_left -= part_len;
+
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(m, *offset);
+	desc->data_ptrs[next_triplet].blen = part_len;
+	desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
+	desc->data_ptrs[next_triplet].last = 0;
+	desc->data_ptrs[next_triplet].dma_ext = 0;
+	*offset += part_len;
+	next_triplet++;
+
+	while (cb_len > 0) {
+		if (next_triplet < ACC100_DMA_MAX_NUM_POINTERS &&
+				m->next != NULL) {
+
+			m = m->next;
+			*seg_total_left = rte_pktmbuf_data_len(m);
+			part_len = (*seg_total_left < cb_len) ?
+					*seg_total_left :
+					cb_len;
+			desc->data_ptrs[next_triplet].address =
+					rte_pktmbuf_iova_offset(m, 0);
+			desc->data_ptrs[next_triplet].blen = part_len;
+			desc->data_ptrs[next_triplet].blkid =
+					ACC100_DMA_BLKID_IN;
+			desc->data_ptrs[next_triplet].last = 0;
+			desc->data_ptrs[next_triplet].dma_ext = 0;
+			cb_len -= part_len;
+			*seg_total_left -= part_len;
+			/* Initializing offset for next segment (mbuf) */
+			*offset = part_len;
+			next_triplet++;
+		} else {
+			rte_bbdev_log(ERR,
+				"Some data still left for processing: "
+				"data_left: %u, next_triplet: %u, next_mbuf: %p",
+				cb_len, next_triplet, m->next);
+			return -EINVAL;
+		}
+	}
+	/* Storing new mbuf as it could be changed in scatter-gather case */
+	*input = m;
+
+	return next_triplet;
+}
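
The segment-splitting loop above can be illustrated without mbufs: one code block of `cb_len` bytes is spread across chained segments, one pointer triplet per segment, failing when the chain runs out. A sketch, with plain length arrays standing in for the mbuf chain:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Split cb_len bytes across segments the way acc100_dma_fill_blk_type_in
 * builds triplets. Writes each part length to parts[]; returns the number
 * of parts, or -1 when the segment chain is exhausted. */
static int
split_cb(const uint32_t *seg_len, size_t num_segs, uint32_t seg_total_left,
		uint32_t cb_len, uint32_t *parts)
{
	size_t seg = 0;
	int n = 0;
	uint32_t part = seg_total_left < cb_len ? seg_total_left : cb_len;

	parts[n++] = part;
	cb_len -= part;
	while (cb_len > 0) {
		if (++seg >= num_segs)
			return -1; /* data still left but no next segment */
		seg_total_left = seg_len[seg];
		part = seg_total_left < cb_len ? seg_total_left : cb_len;
		parts[n++] = part;
		cb_len -= part;
	}
	return n;
}
```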
+
+/* Fills descriptor with data pointers of one block type.
+ * Returns index of next triplet on success, other value if lengths of
+ * output data and processed mbuf do not match.
+ */
+static inline int
+acc100_dma_fill_blk_type_out(struct acc100_dma_req_desc *desc,
+		struct rte_mbuf *output, uint32_t out_offset,
+		uint32_t output_len, int next_triplet, int blk_id)
+{
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(output, out_offset);
+	desc->data_ptrs[next_triplet].blen = output_len;
+	desc->data_ptrs[next_triplet].blkid = blk_id;
+	desc->data_ptrs[next_triplet].last = 0;
+	desc->data_ptrs[next_triplet].dma_ext = 0;
+	next_triplet++;
+
+	return next_triplet;
+}
+
+static inline void
+acc100_header_init(struct acc100_dma_req_desc *desc)
+{
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /**< Timestamp could be disabled */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+}
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+/* Check if any input data is unexpectedly left for processing */
+static inline int
+check_mbuf_total_left(uint32_t mbuf_total_left)
+{
+	if (mbuf_total_left == 0)
+		return 0;
+	rte_bbdev_log(ERR,
		"Some data still left for processing: mbuf_total_left = %u",
+		mbuf_total_left);
+	return -EINVAL;
+}
+#endif
+
+static inline int
+acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
+		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
+		struct rte_mbuf *output, uint32_t *in_offset,
+		uint32_t *out_offset, uint32_t *out_length,
+		uint32_t *mbuf_total_left, uint32_t *seg_total_left)
+{
+	int next_triplet = 1; /* FCW already done */
+	uint16_t K, in_length_in_bits, in_length_in_bytes;
+	struct rte_bbdev_op_ldpc_enc *enc = &op->ldpc_enc;
+
+	acc100_header_init(desc);
+
+	K = (enc->basegraph == 1 ? 22 : 10) * enc->z_c;
+	in_length_in_bits = K - enc->n_filler;
+	if ((enc->op_flags & RTE_BBDEV_LDPC_CRC_24A_ATTACH) ||
+			(enc->op_flags & RTE_BBDEV_LDPC_CRC_24B_ATTACH))
+		in_length_in_bits -= 24;
+	in_length_in_bytes = in_length_in_bits >> 3;
+
+	if (unlikely((*mbuf_total_left == 0) ||
+			(*mbuf_total_left < in_length_in_bytes))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, in_length_in_bytes);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
+			in_length_in_bytes,
+			seg_total_left, next_triplet);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= in_length_in_bytes;
+
+	/* Set output length */
+	/* Integer round up division by 8 */
+	*out_length = (enc->cb_params.e + 7) >> 3;
+
+	next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
+			*out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
+	op->ldpc_enc.output.length += *out_length;
+	*out_offset += *out_length;
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->data_ptrs[next_triplet - 1].dma_ext = 0;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
+
+static inline int
+acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
+		struct acc100_dma_req_desc *desc,
+		struct rte_mbuf **input, struct rte_mbuf *h_output,
+		uint32_t *in_offset, uint32_t *h_out_offset,
+		uint32_t *h_out_length, uint32_t *mbuf_total_left,
+		uint32_t *seg_total_left,
+		struct acc100_fcw_ld *fcw)
+{
+	struct rte_bbdev_op_ldpc_dec *dec = &op->ldpc_dec;
+	int next_triplet = 1; /* FCW already done */
+	uint32_t input_length;
+	uint16_t output_length, crc24_overlap = 0;
+	uint16_t sys_cols, K, h_p_size, h_np_size;
+	bool h_comp = check_bit(dec->op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION);
+
+	acc100_header_init(desc);
+
+	if (check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP))
+		crc24_overlap = 24;
+
+	/* Compute some LDPC BG lengths */
+	input_length = dec->cb_params.e;
+	if (check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_LLR_COMPRESSION))
+		input_length = (input_length * 3 + 3) / 4;
+	sys_cols = (dec->basegraph == 1) ? 22 : 10;
+	K = sys_cols * dec->z_c;
+	output_length = K - dec->n_filler - crc24_overlap;
+
+	if (unlikely((*mbuf_total_left == 0) ||
+			(*mbuf_total_left < input_length))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, input_length);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input,
+			in_offset, input_length,
+			seg_total_left, next_triplet);
+
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
+		h_p_size = fcw->hcin_size0 + fcw->hcin_size1;
+		if (h_comp)
+			h_p_size = (h_p_size * 3 + 3) / 4;
+		desc->data_ptrs[next_triplet].address =
+				dec->harq_combined_input.offset;
+		desc->data_ptrs[next_triplet].blen = h_p_size;
+		desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN_HARQ;
+		desc->data_ptrs[next_triplet].dma_ext = 1;
+#ifndef ACC100_EXT_MEM
+		acc100_dma_fill_blk_type_out(
+				desc,
+				op->ldpc_dec.harq_combined_input.data,
+				op->ldpc_dec.harq_combined_input.offset,
+				h_p_size,
+				next_triplet,
+				ACC100_DMA_BLKID_IN_HARQ);
+#endif
+		next_triplet++;
+	}
+
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= input_length;
+
+	next_triplet = acc100_dma_fill_blk_type_out(desc, h_output,
+			*h_out_offset, output_length >> 3, next_triplet,
+			ACC100_DMA_BLKID_OUT_HARD);
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
+		/* Pruned size of the HARQ */
+		h_p_size = fcw->hcout_size0 + fcw->hcout_size1;
+		/* Non-Pruned size of the HARQ */
+		h_np_size = fcw->hcout_offset > 0 ?
+				fcw->hcout_offset + fcw->hcout_size1 :
+				h_p_size;
+		if (h_comp) {
+			h_np_size = (h_np_size * 3 + 3) / 4;
+			h_p_size = (h_p_size * 3 + 3) / 4;
+		}
+		dec->harq_combined_output.length = h_np_size;
+		desc->data_ptrs[next_triplet].address =
+				dec->harq_combined_output.offset;
+		desc->data_ptrs[next_triplet].blen = h_p_size;
+		desc->data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_OUT_HARQ;
+		desc->data_ptrs[next_triplet].dma_ext = 1;
+#ifndef ACC100_EXT_MEM
+		acc100_dma_fill_blk_type_out(
+				desc,
+				dec->harq_combined_output.data,
+				dec->harq_combined_output.offset,
+				h_p_size,
+				next_triplet,
+				ACC100_DMA_BLKID_OUT_HARQ);
+#endif
+		next_triplet++;
+	}
+
+	*h_out_length = output_length >> 3;
+	dec->hard_output.length += *h_out_length;
+	*h_out_offset += *h_out_length;
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
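
The 6-bit HARQ compression sizing used above packs 8-bit LLRs into 3/4 of the byte count, rounded up with `(h * 3 + 3) / 4`; the input path (`harq_in_length * 8 / 6` in `acc100_fcw_ld_fill`) applies the inverse expansion. A minimal sketch of both directions:

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed after 6-bit compression: 3/4 of the size, rounded up. */
static uint16_t
harq_compressed_size(uint16_t h_size)
{
	return (uint16_t)((h_size * 3 + 3) / 4);
}

/* Byte count after expanding 6-bit LLRs back to 8 bits. */
static uint32_t
harq_decompressed_len(uint32_t in_len)
{
	return in_len * 8 / 6;
}
```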
+
+static inline void
+acc100_dma_desc_ld_update(struct rte_bbdev_dec_op *op,
+		struct acc100_dma_req_desc *desc,
+		struct rte_mbuf *input, struct rte_mbuf *h_output,
+		uint32_t *in_offset, uint32_t *h_out_offset,
+		uint32_t *h_out_length,
+		union acc100_harq_layout_data *harq_layout)
+{
+	int next_triplet = 1; /* FCW already done */
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(input, *in_offset);
+	next_triplet++;
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
+		struct rte_bbdev_op_data hi = op->ldpc_dec.harq_combined_input;
+		desc->data_ptrs[next_triplet].address = hi.offset;
+#ifndef ACC100_EXT_MEM
+		desc->data_ptrs[next_triplet].address =
+				rte_pktmbuf_iova_offset(hi.data, hi.offset);
+#endif
+		next_triplet++;
+	}
+
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(h_output, *h_out_offset);
+	*h_out_length = desc->data_ptrs[next_triplet].blen;
+	next_triplet++;
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
+		desc->data_ptrs[next_triplet].address =
+				op->ldpc_dec.harq_combined_output.offset;
+		/* Adjust based on previous operation */
+		struct rte_bbdev_dec_op *prev_op = desc->op_addr;
+		op->ldpc_dec.harq_combined_output.length =
+				prev_op->ldpc_dec.harq_combined_output.length;
+		int16_t hq_idx = op->ldpc_dec.harq_combined_output.offset /
+				ACC100_HARQ_OFFSET;
+		int16_t prev_hq_idx =
+				prev_op->ldpc_dec.harq_combined_output.offset
+				/ ACC100_HARQ_OFFSET;
+		harq_layout[hq_idx].val = harq_layout[prev_hq_idx].val;
+#ifndef ACC100_EXT_MEM
+		struct rte_bbdev_op_data ho =
+				op->ldpc_dec.harq_combined_output;
+		desc->data_ptrs[next_triplet].address =
+				rte_pktmbuf_iova_offset(ho.data, ho.offset);
+#endif
+		next_triplet++;
+	}
+
+	op->ldpc_dec.hard_output.length += *h_out_length;
+	desc->op_addr = op;
+}
+
+
+/* Enqueue a number of operations to HW and update software rings */
+static inline void
+acc100_dma_enqueue(struct acc100_queue *q, uint16_t n,
+		struct rte_bbdev_stats *queue_stats)
+{
+	union acc100_enqueue_reg_fmt enq_req;
+#ifdef RTE_BBDEV_OFFLOAD_COST
+	uint64_t start_time = 0;
+	queue_stats->acc_offload_cycles = 0;
+#else
+	RTE_SET_USED(queue_stats);
+#endif
+
+	enq_req.val = 0;
+	/* Setting offset, 100b for 256 DMA Desc */
+	enq_req.addr_offset = ACC100_DESC_OFFSET;
+
+	/* Split ops into batches */
+	do {
+		union acc100_dma_desc *desc;
+		uint16_t enq_batch_size;
+		uint64_t offset;
+		rte_iova_t req_elem_addr;
+
+		enq_batch_size = RTE_MIN(n, MAX_ENQ_BATCH_SIZE);
+
+		/* Set flag on last descriptor in a batch */
+		desc = q->ring_addr + ((q->sw_ring_head + enq_batch_size - 1) &
+				q->sw_ring_wrap_mask);
+		desc->req.last_desc_in_batch = 1;
+
+		/* Calculate the 1st descriptor's address */
+		offset = ((q->sw_ring_head & q->sw_ring_wrap_mask) *
+				sizeof(union acc100_dma_desc));
+		req_elem_addr = q->ring_addr_iova + offset;
+
+		/* Fill enqueue struct */
+		enq_req.num_elem = enq_batch_size;
+		/* low 6 bits are not needed */
+		enq_req.req_elem_addr = (uint32_t)(req_elem_addr >> 6);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "Req sdone", desc, sizeof(*desc));
+#endif
+		rte_bbdev_log_debug(
+				"Enqueue %u reqs (phys %#"PRIx64") to reg %p",
+				enq_batch_size,
+				req_elem_addr,
+				(void *)q->mmio_reg_enqueue);
+
+		rte_wmb();
+
+#ifdef RTE_BBDEV_OFFLOAD_COST
+		/* Start time measurement for enqueue function offload. */
+		start_time = rte_rdtsc_precise();
+#endif
+		rte_bbdev_log(DEBUG, "Debug : MMIO Enqueue");
+		mmio_write(q->mmio_reg_enqueue, enq_req.val);
+
+#ifdef RTE_BBDEV_OFFLOAD_COST
+		queue_stats->acc_offload_cycles +=
+				rte_rdtsc_precise() - start_time;
+#endif
+
+		q->aq_enqueued++;
+		q->sw_ring_head += enq_batch_size;
+		n -= enq_batch_size;
+
+	} while (n);
+}
+
+/* Enqueue a number of encode operations for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
+		uint16_t total_enqueued_cbs, int16_t num)
+{
+	union acc100_dma_desc *desc = NULL;
+	uint32_t out_length;
+	struct rte_mbuf *output_head, *output;
+	int i, next_triplet;
+	uint16_t  in_length_in_bytes;
+	struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_le_fill(ops[0], &desc->req.fcw_le, num);
+
+	/** This could be done at polling */
+	desc->req.word0 = ACC100_DMA_DESC_TYPE;
+	desc->req.word1 = 0; /**< Timestamp could be disabled */
+	desc->req.word2 = 0;
+	desc->req.word3 = 0;
+	desc->req.numCBs = num;
+
+	in_length_in_bytes = ops[0]->ldpc_enc.input.data->data_len;
+	out_length = (enc->cb_params.e + 7) >> 3;
+	desc->req.m2dlen = 1 + num;
+	desc->req.d2mlen = num;
+	next_triplet = 1;
+
+	for (i = 0; i < num; i++) {
+		desc->req.data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(ops[i]->ldpc_enc.input.data, 0);
+		desc->req.data_ptrs[next_triplet].blen = in_length_in_bytes;
+		next_triplet++;
+		desc->req.data_ptrs[next_triplet].address =
+				rte_pktmbuf_iova_offset(
+				ops[i]->ldpc_enc.output.data, 0);
+		desc->req.data_ptrs[next_triplet].blen = out_length;
+		next_triplet++;
+		ops[i]->ldpc_enc.output.length = out_length;
+		output_head = output = ops[i]->ldpc_enc.output.data;
+		mbuf_append(output_head, output, out_length);
+		output->data_len = out_length;
+	}
+
+	desc->req.op_addr = ops[0];
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_le,
+			sizeof(desc->req.fcw_le) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+	/* Number of compatible CBs/ops successfully prepared to enqueue */
+	return num;
+}
+
+/* Enqueue one encode operation for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+		uint16_t total_enqueued_cbs)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+		seg_total_left;
+	struct rte_mbuf *input, *output_head, *output;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_le_fill(op, &desc->req.fcw_le, 1);
+
+	input = op->ldpc_enc.input.data;
+	output_head = output = op->ldpc_enc.output.data;
+	in_offset = op->ldpc_enc.input.offset;
+	out_offset = op->ldpc_enc.output.offset;
+	out_length = 0;
+	mbuf_total_left = op->ldpc_enc.input.length;
+	seg_total_left = rte_pktmbuf_data_len(op->ldpc_enc.input.data)
+			- in_offset;
+
+	ret = acc100_dma_desc_le_fill(op, &desc->req, &input, output,
+			&in_offset, &out_offset, &out_length, &mbuf_total_left,
+			&seg_total_left);
+
+	if (unlikely(ret < 0))
+		return ret;
+
+	mbuf_append(output_head, output, out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_le,
+			sizeof(desc->req.fcw_le) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+
+	if (check_mbuf_total_left(mbuf_total_left) != 0)
+		return -EINVAL;
+#endif
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+/* Enqueue one decode operation for ACC100 device in CB mode */
+static inline int
+enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs, bool same_op)
+{
+	int ret;
+
+	union acc100_dma_desc *desc;
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	struct rte_mbuf *input, *h_output_head, *h_output;
+	uint32_t in_offset, h_out_offset, mbuf_total_left, h_out_length = 0;
+	input = op->ldpc_dec.input.data;
+	h_output_head = h_output = op->ldpc_dec.hard_output.data;
+	in_offset = op->ldpc_dec.input.offset;
+	h_out_offset = op->ldpc_dec.hard_output.offset;
+	mbuf_total_left = op->ldpc_dec.input.length;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(input == NULL)) {
+		rte_bbdev_log(ERR, "Invalid mbuf pointer");
+		return -EFAULT;
+	}
+#endif
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+
+	if (same_op) {
+		union acc100_dma_desc *prev_desc;
+		desc_idx = ((q->sw_ring_head + total_enqueued_cbs - 1)
+				& q->sw_ring_wrap_mask);
+		prev_desc = q->ring_addr + desc_idx;
+		uint8_t *prev_ptr = (uint8_t *) prev_desc;
+		uint8_t *new_ptr = (uint8_t *) desc;
+		/* Copy first 4 words and BDESCs */
+		rte_memcpy(new_ptr, prev_ptr, ACC100_5GUL_SIZE_0);
+		rte_memcpy(new_ptr + ACC100_5GUL_OFFSET_0,
+				prev_ptr + ACC100_5GUL_OFFSET_0,
+				ACC100_5GUL_SIZE_1);
+		desc->req.op_addr = prev_desc->req.op_addr;
+		/* Copy FCW */
+		rte_memcpy(new_ptr + ACC100_DESC_FCW_OFFSET,
+				prev_ptr + ACC100_DESC_FCW_OFFSET,
+				ACC100_FCW_LD_BLEN);
+		acc100_dma_desc_ld_update(op, &desc->req, input, h_output,
+				&in_offset, &h_out_offset,
+				&h_out_length, harq_layout);
+	} else {
+		struct acc100_fcw_ld *fcw;
+		uint32_t seg_total_left;
+		fcw = &desc->req.fcw_ld;
+		acc100_fcw_ld_fill(op, fcw, harq_layout);
+
+		/* Special handling when overusing mbuf */
+		if (fcw->rm_e < ACC100_MAX_E_MBUF)
+			seg_total_left = rte_pktmbuf_data_len(input)
+					- in_offset;
+		else
+			seg_total_left = fcw->rm_e;
+
+		ret = acc100_dma_desc_ld_fill(op, &desc->req, &input, h_output,
+				&in_offset, &h_out_offset,
+				&h_out_length, &mbuf_total_left,
+				&seg_total_left, fcw);
+		if (unlikely(ret < 0))
+			return ret;
+	}
+
+	/* Hard output */
+	mbuf_append(h_output_head, h_output, h_out_length);
+#ifndef ACC100_EXT_MEM
+	if (op->ldpc_dec.harq_combined_output.length > 0) {
+		/* Push the HARQ output into host memory */
+		struct rte_mbuf *hq_output_head, *hq_output;
+		hq_output_head = op->ldpc_dec.harq_combined_output.data;
+		hq_output = op->ldpc_dec.harq_combined_output.data;
+		mbuf_append(hq_output_head, hq_output,
+				op->ldpc_dec.harq_combined_output.length);
+	}
+#endif
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_ld,
+			sizeof(desc->req.fcw_ld) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+
+/* Enqueue one decode operation for ACC100 device in TB mode */
+static inline int
+enqueue_ldpc_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint8_t r, c;
+	uint32_t in_offset, h_out_offset,
+		h_out_length, mbuf_total_left, seg_total_left;
+	struct rte_mbuf *input, *h_output_head, *h_output;
+	uint16_t current_enqueued_cbs = 0;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+	acc100_fcw_ld_fill(op, &desc->req.fcw_ld, harq_layout);
+
+	input = op->ldpc_dec.input.data;
+	h_output_head = h_output = op->ldpc_dec.hard_output.data;
+	in_offset = op->ldpc_dec.input.offset;
+	h_out_offset = op->ldpc_dec.hard_output.offset;
+	h_out_length = 0;
+	mbuf_total_left = op->ldpc_dec.input.length;
+	c = op->ldpc_dec.tb_params.c;
+	r = op->ldpc_dec.tb_params.r;
+
+	while (mbuf_total_left > 0 && r < c) {
+
+		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+
+		/* Set up DMA descriptor */
+		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+				& q->sw_ring_wrap_mask);
+		desc->req.data_ptrs[0].address = q->ring_addr_iova + fcw_offset;
+		desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
+		ret = acc100_dma_desc_ld_fill(op, &desc->req, &input,
+				h_output, &in_offset, &h_out_offset,
+				&h_out_length,
+				&mbuf_total_left, &seg_total_left,
+				&desc->req.fcw_ld);
+
+		if (unlikely(ret < 0))
+			return ret;
+
+		/* Hard output */
+		mbuf_append(h_output_head, h_output, h_out_length);
+
+		/* Set total number of CBs in TB */
+		desc->req.cbs_in_tb = cbs_in_tb;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "FCW", &desc->req.fcw_td,
+				sizeof(desc->req.fcw_td) - 8);
+		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+		if (seg_total_left == 0) {
+			/* Go to the next mbuf */
+			input = input->next;
+			in_offset = 0;
+			h_output = h_output->next;
+			h_out_offset = 0;
+		}
+		total_enqueued_cbs++;
+		current_enqueued_cbs++;
+		r++;
+	}
+
+	if (unlikely(desc == NULL))
+		return current_enqueued_cbs;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (check_mbuf_total_left(mbuf_total_left) != 0)
+		return -EINVAL;
+#endif
+	/* Set SDone on last CB descriptor for TB mode */
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	return current_enqueued_cbs;
+}
+
+
+/* Calculates number of CBs in processed encoder TB based on 'r' and input
+ * length.
+ */
+static inline uint8_t
+get_num_cbs_in_tb_enc(struct rte_bbdev_op_turbo_enc *turbo_enc)
+{
+	uint8_t c, c_neg, r, crc24_bits = 0;
+	uint16_t k, k_neg, k_pos;
+	uint8_t cbs_in_tb = 0;
+	int32_t length;
+
+	length = turbo_enc->input.length;
+	r = turbo_enc->tb_params.r;
+	c = turbo_enc->tb_params.c;
+	c_neg = turbo_enc->tb_params.c_neg;
+	k_neg = turbo_enc->tb_params.k_neg;
+	k_pos = turbo_enc->tb_params.k_pos;
+	crc24_bits = 0;
+	if (check_bit(turbo_enc->op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH))
+		crc24_bits = 24;
+	while (length > 0 && r < c) {
+		k = (r < c_neg) ? k_neg : k_pos;
+		length -= (k - crc24_bits) >> 3;
+		r++;
+		cbs_in_tb++;
+	}
+
+	return cbs_in_tb;
+}
+
+/* Calculates number of CBs in processed decoder TB based on 'r' and input
+ * length.
+ */
+static inline uint16_t
+get_num_cbs_in_tb_dec(struct rte_bbdev_op_turbo_dec *turbo_dec)
+{
+	uint8_t c, c_neg, r = 0;
+	uint16_t kw, k, k_neg, k_pos, cbs_in_tb = 0;
+	int32_t length;
+
+	length = turbo_dec->input.length;
+	r = turbo_dec->tb_params.r;
+	c = turbo_dec->tb_params.c;
+	c_neg = turbo_dec->tb_params.c_neg;
+	k_neg = turbo_dec->tb_params.k_neg;
+	k_pos = turbo_dec->tb_params.k_pos;
+	while (length > 0 && r < c) {
+		k = (r < c_neg) ? k_neg : k_pos;
+		kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
+		length -= kw;
+		r++;
+		cbs_in_tb++;
+	}
+
+	return cbs_in_tb;
+}
+
+/* Calculates number of CBs in processed decoder TB based on 'r' and input
+ * length.
+ */
+static inline uint16_t
+get_num_cbs_in_tb_ldpc_dec(struct rte_bbdev_op_ldpc_dec *ldpc_dec)
+{
+	uint16_t r, cbs_in_tb = 0;
+	int32_t length = ldpc_dec->input.length;
+	r = ldpc_dec->tb_params.r;
+	while (length > 0 && r < ldpc_dec->tb_params.c) {
+		length -=  (r < ldpc_dec->tb_params.cab) ?
+				ldpc_dec->tb_params.ea :
+				ldpc_dec->tb_params.eb;
+		r++;
+		cbs_in_tb++;
+	}
+	return cbs_in_tb;
+}
+
+/* Check we can mux encode operations with common FCW */
+static inline bool
+check_mux(struct rte_bbdev_enc_op **ops, uint16_t num) {
+	uint16_t i;
+	if (num <= 1)
+		return false;
+	for (i = 1; i < num; ++i) {
+		/* Only mux compatible code blocks */
+		if (memcmp((uint8_t *)(&ops[i]->ldpc_enc) + ACC100_ENC_OFFSET,
+				(uint8_t *)(&ops[0]->ldpc_enc) +
+				ACC100_ENC_OFFSET,
+				ACC100_CMP_ENC_SIZE) != 0)
+			return false;
+	}
+	return true;
+}
+
+/* Enqueue encode operations for ACC100 device in CB mode. */
+static inline uint16_t
+acc100_enqueue_ldpc_enc_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i = 0;
+	union acc100_dma_desc *desc;
+	int ret, desc_idx = 0;
+	int16_t enq, left = num;
+
+	while (left > 0) {
+		if (unlikely(avail < 1))
+			break;
+		avail--;
+		enq = RTE_MIN(left, ACC100_MUX_5GDL_DESC);
+		if (check_mux(&ops[i], enq)) {
+			ret = enqueue_ldpc_enc_n_op_cb(q, &ops[i],
+					desc_idx, enq);
+			if (ret < 0)
+				break;
+			i += enq;
+		} else {
+			ret = enqueue_ldpc_enc_one_op_cb(q, ops[i], desc_idx);
+			if (ret < 0)
+				break;
+			i++;
+		}
+		desc_idx++;
+		left = num - i;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode*/
+	desc = q->ring_addr + ((q->sw_ring_head + desc_idx - 1)
+			& q->sw_ring_wrap_mask);
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, desc_idx, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
+}
+
+/* Enqueue encode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	if (unlikely(num == 0))
+		return 0;
+	return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
+}
+
+/* Check we can mux decode operations with common FCW */
+static inline bool
+cmp_ldpc_dec_op(struct rte_bbdev_dec_op **ops) {
+	/* Only mux compatible code blocks */
+	if (memcmp((uint8_t *)(&ops[0]->ldpc_dec) + ACC100_DEC_OFFSET,
+			(uint8_t *)(&ops[1]->ldpc_dec) +
+			ACC100_DEC_OFFSET, ACC100_CMP_DEC_SIZE) != 0) {
+		return false;
+	} else
+		return true;
+}
+
+
+/* Enqueue decode operations for ACC100 device in TB mode */
+static uint16_t
+acc100_enqueue_ldpc_dec_tb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i, enqueued_cbs = 0;
+	uint8_t cbs_in_tb;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		cbs_in_tb = get_num_cbs_in_tb_ldpc_dec(&ops[i]->ldpc_dec);
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - cbs_in_tb < 0))
+			break;
+		avail -= cbs_in_tb;
+
+		ret = enqueue_ldpc_dec_one_op_tb(q, ops[i],
+				enqueued_cbs, cbs_in_tb);
+		if (ret < 0)
+			break;
+		enqueued_cbs += ret;
+	}
+
+	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+	return i;
+}
+
+/* Enqueue decode operations for ACC100 device in CB mode */
+static uint16_t
+acc100_enqueue_ldpc_dec_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i;
+	union acc100_dma_desc *desc;
+	int ret;
+	bool same_op = false;
+	for (i = 0; i < num; ++i) {
+		/* Check if there is available space for further processing */
+		if (unlikely(avail < 1))
+			break;
+		avail -= 1;
+
+		if (i > 0)
+			same_op = cmp_ldpc_dec_op(&ops[i-1]);
+		rte_bbdev_log(INFO, "Op %d %d %d %d %d %d %d %d %d %d %d %d",
+			i, ops[i]->ldpc_dec.op_flags, ops[i]->ldpc_dec.rv_index,
+			ops[i]->ldpc_dec.iter_max, ops[i]->ldpc_dec.iter_count,
+			ops[i]->ldpc_dec.basegraph, ops[i]->ldpc_dec.z_c,
+			ops[i]->ldpc_dec.n_cb, ops[i]->ldpc_dec.q_m,
+			ops[i]->ldpc_dec.n_filler, ops[i]->ldpc_dec.cb_params.e,
+			same_op);
+		ret = enqueue_ldpc_dec_one_op_cb(q, ops[i], i, same_op);
+		if (ret < 0)
+			break;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode*/
+	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
+			& q->sw_ring_wrap_mask);
+
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, i, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+	return i;
+}
+
+/* Enqueue decode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t aq_avail = q->aq_depth +
+			(q->aq_dequeued - q->aq_enqueued) / 128;
+
+	if (unlikely((aq_avail == 0) || (num == 0)))
+		return 0;
+
+	if (ops[0]->ldpc_dec.code_block_mode == 0)
+		return acc100_enqueue_ldpc_dec_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_ldpc_dec_cb(q_data, ops, num);
+}
+
+
+/* Dequeue one encode operation from ACC100 device in CB mode */
+static inline int
+dequeue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
+		uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_enc_op *op;
+	int i;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	rsp.val = atom_desc.rsp.val;
+	rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+
+	op->status |= ((rsp.input_err)
+			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+	if (desc->req.last_desc_in_batch) {
+		(*aq_dequeued)++;
+		desc->req.last_desc_in_batch = 0;
+	}
+	desc->rsp.val = ACC100_DMA_DESC_TYPE;
+	desc->rsp.add_info_0 = 0; /* Reserved bits */
+	desc->rsp.add_info_1 = 0; /* Reserved bits */
+
+	/* Flag that the muxing causes loss of opaque data */
+	op->opaque_data = (void *)-1;
+	for (i = 0 ; i < desc->req.numCBs; i++)
+		ref_op[i] = op;
+
+	/* Number of CBs (ops) successfully dequeued */
+	return desc->req.numCBs;
+}
+
+/* Dequeue one encode operation from ACC100 device in TB mode */
+static inline int
+dequeue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op **ref_op,
+		uint16_t total_dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, *last_desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_enc_op *op;
+	uint8_t i = 0;
+	uint16_t current_dequeued_cbs = 0, cbs_in_tb;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + total_dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	/* Get number of CBs in dequeued TB */
+	cbs_in_tb = desc->req.cbs_in_tb;
+	/* Get last CB */
+	last_desc = q->ring_addr + ((q->sw_ring_tail
+			+ total_dequeued_cbs + cbs_in_tb - 1)
+			& q->sw_ring_wrap_mask);
+	/* Check if last CB in TB is ready to dequeue (and thus
+	 * the whole TB) - checking sdone bit. If not return.
+	 */
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
+			__ATOMIC_RELAXED);
+	if (!(atom_desc.rsp.val & ACC100_SDONE))
+		return -1;
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+
+	while (i < cbs_in_tb) {
+		desc = q->ring_addr + ((q->sw_ring_tail
+				+ total_dequeued_cbs)
+				& q->sw_ring_wrap_mask);
+		atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+				__ATOMIC_RELAXED);
+		rsp.val = atom_desc.rsp.val;
+		rte_bbdev_log_debug("Resp. desc %p: %x", desc,
+				rsp.val);
+
+		op->status |= ((rsp.input_err)
+				? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+		op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+		op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+		if (desc->req.last_desc_in_batch) {
+			(*aq_dequeued)++;
+			desc->req.last_desc_in_batch = 0;
+		}
+		desc->rsp.val = ACC100_DMA_DESC_TYPE;
+		desc->rsp.add_info_0 = 0;
+		desc->rsp.add_info_1 = 0;
+		total_dequeued_cbs++;
+		current_dequeued_cbs++;
+		i++;
+	}
+
+	*ref_op = op;
+
+	return current_dequeued_cbs;
+}
+
+/* Dequeue one decode operation from ACC100 device in CB mode */
+static inline int
+dequeue_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
+		struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_dec_op *op;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	rsp.val = atom_desc.rsp.val;
+	rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+	op->status |= ((rsp.input_err)
+			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+	if (op->status != 0)
+		q_data->queue_stats.dequeue_err_count++;
+
+	/* CRC invalid if error exists */
+	if (!op->status)
+		op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+	op->turbo_dec.iter_count = (uint8_t) rsp.iter_cnt / 2;
+	/* Check if this is the last desc in batch (Atomic Queue) */
+	if (desc->req.last_desc_in_batch) {
+		(*aq_dequeued)++;
+		desc->req.last_desc_in_batch = 0;
+	}
+	desc->rsp.val = ACC100_DMA_DESC_TYPE;
+	desc->rsp.add_info_0 = 0;
+	desc->rsp.add_info_1 = 0;
+	*ref_op = op;
+
+	/* One CB (op) was successfully dequeued */
+	return 1;
+}
+
+/* Dequeue one decode operation from ACC100 device in CB mode */
+static inline int
+dequeue_ldpc_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
+		struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_dec_op *op;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	rsp.val = atom_desc.rsp.val;
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+	op->status |= rsp.input_err << RTE_BBDEV_DATA_ERROR;
+	op->status |= rsp.dma_err << RTE_BBDEV_DRV_ERROR;
+	op->status |= rsp.fcw_err << RTE_BBDEV_DRV_ERROR;
+	if (op->status != 0)
+		q_data->queue_stats.dequeue_err_count++;
+
+	op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+	if (op->ldpc_dec.hard_output.length > 0 && !rsp.synd_ok)
+		op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
+	op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
+
+	/* Check if this is the last desc in batch (Atomic Queue) */
+	if (desc->req.last_desc_in_batch) {
+		(*aq_dequeued)++;
+		desc->req.last_desc_in_batch = 0;
+	}
+
+	desc->rsp.val = ACC100_DMA_DESC_TYPE;
+	desc->rsp.add_info_0 = 0;
+	desc->rsp.add_info_1 = 0;
+
+	*ref_op = op;
+
+	/* One CB (op) was successfully dequeued */
+	return 1;
+}
+
+/* Dequeue one decode operation from ACC100 device in TB mode. */
+static inline int
+dequeue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op **ref_op,
+		uint16_t dequeued_cbs, uint32_t *aq_dequeued)
+{
+	union acc100_dma_desc *desc, *last_desc, atom_desc;
+	union acc100_dma_rsp_desc rsp;
+	struct rte_bbdev_dec_op *op;
+	uint8_t cbs_in_tb = 1, cb_idx = 0;
+
+	desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+			__ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC100_FDONE))
+		return -1;
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Get number of CBs in dequeued TB */
+	cbs_in_tb = desc->req.cbs_in_tb;
+	/* Get last CB */
+	last_desc = q->ring_addr + ((q->sw_ring_tail
+			+ dequeued_cbs + cbs_in_tb - 1)
+			& q->sw_ring_wrap_mask);
+	/* Check if last CB in TB is ready to dequeue (and thus
+	 * the whole TB) - checking sdone bit. If not return.
+	 */
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
+			__ATOMIC_RELAXED);
+	if (!(atom_desc.rsp.val & ACC100_SDONE))
+		return -1;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+
+	/* Read remaining CBs if any */
+	while (cb_idx < cbs_in_tb) {
+		desc = q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+				& q->sw_ring_wrap_mask);
+		atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc,
+				__ATOMIC_RELAXED);
+		rsp.val = atom_desc.rsp.val;
+		rte_bbdev_log_debug("Resp. desc %p: %x", desc,
+				rsp.val);
+
+		op->status |= ((rsp.input_err)
+				? (1 << RTE_BBDEV_DATA_ERROR) : 0);
+		op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+		op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+
+		/* CRC invalid if error exists */
+		if (!op->status)
+			op->status |= rsp.crc_status << RTE_BBDEV_CRC_ERROR;
+		op->turbo_dec.iter_count = RTE_MAX((uint8_t) rsp.iter_cnt,
+				op->turbo_dec.iter_count);
+
+		/* Check if this is the last desc in batch (Atomic Queue) */
+		if (desc->req.last_desc_in_batch) {
+			(*aq_dequeued)++;
+			desc->req.last_desc_in_batch = 0;
+		}
+		desc->rsp.val = ACC100_DMA_DESC_TYPE;
+		desc->rsp.add_info_0 = 0;
+		desc->rsp.add_info_1 = 0;
+		dequeued_cbs++;
+		cb_idx++;
+	}
+
+	*ref_op = op;
+
+	return cb_idx;
+}
+
+/* Dequeue LDPC encode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t dequeue_num, i, dequeued_cbs = 0, dequeued_descs = 0;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = RTE_MIN(avail, num);
+
+	for (i = 0; i < dequeue_num; i++) {
+		ret = dequeue_enc_one_op_cb(q, &ops[dequeued_cbs],
+				dequeued_descs, &aq_dequeued);
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+		dequeued_descs++;
+		if (dequeued_cbs >= num)
+			break;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_descs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += dequeued_cbs;
+
+	return dequeued_cbs;
+}
+
+/* Dequeue decode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint16_t dequeue_num;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t i;
+	uint16_t dequeued_cbs = 0;
+	struct rte_bbdev_dec_op *op;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = RTE_MIN(avail, num);
+
+	for (i = 0; i < dequeue_num; ++i) {
+		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask))->req.op_addr;
+		if (op->ldpc_dec.code_block_mode == 0)
+			ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+		else
+			ret = dequeue_ldpc_dec_one_op_cb(
+					q_data, q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_cbs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += i;
+
+	return i;
+}
+
 /* Initialization Function */
 static void
 acc100_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
@@ -730,6 +2344,10 @@
 	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
 
 	dev->dev_ops = &acc100_bbdev_ops;
+	dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
+	dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
+	dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
+	dev->dequeue_ldpc_dec_ops = acc100_dequeue_ldpc_dec;
 
 	((struct acc100_device *) dev->data->dev_private)->pf_device =
 			!strcmp(drv->driver.name,
@@ -842,4 +2460,3 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
 RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
 RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
 RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
-
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 5c8dde3..38818f4 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -88,6 +88,8 @@
 #define ACC100_TMPL_PRI_3      0x0f0e0d0c
 #define ACC100_QUEUE_ENABLE    0x80000000  /* Bit to mark Queue as Enabled */
 #define ACC100_WORDS_IN_ARAM_SIZE (128 * 1024 / 4)
+#define ACC100_FDONE    0x80000000
+#define ACC100_SDONE    0x40000000
 
 #define ACC100_NUM_TMPL       32
 /* Mapping of signals for the available engines */
@@ -120,6 +122,9 @@
 #define ACC100_FCW_TD_BLEN                24
 #define ACC100_FCW_LE_BLEN                32
 #define ACC100_FCW_LD_BLEN                36
+#define ACC100_5GUL_SIZE_0                16
+#define ACC100_5GUL_SIZE_1                40
+#define ACC100_5GUL_OFFSET_0              36
 
 #define ACC100_FCW_VER         2
 #define ACC100_MUX_5GDL_DESC   6
@@ -402,6 +407,7 @@ struct __rte_packed acc100_dma_req_desc {
 union acc100_dma_desc {
 	struct acc100_dma_req_desc req;
 	union acc100_dma_rsp_desc rsp;
+	uint64_t atom_hdr;
 };
 
 
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v12 06/10] baseband/acc100: add HARQ loopback support
  2020-10-05 22:12   ` [dpdk-dev] [PATCH v12 00/10] bbdev PMD ACC100 Nicolas Chautru
                       ` (4 preceding siblings ...)
  2020-10-05 22:12     ` [dpdk-dev] [PATCH v12 05/10] baseband/acc100: add LDPC processing functions Nicolas Chautru
@ 2020-10-05 22:12     ` Nicolas Chautru
  2020-10-05 22:12     ` [dpdk-dev] [PATCH v12 07/10] baseband/acc100: add support for 4G processing Nicolas Chautru
                       ` (4 subsequent siblings)
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-10-05 22:12 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, trix, maxime.coquelin, ferruh.yigit,
	tianjiao.liu, Nicolas Chautru

Add support for HARQ memory loopback.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
Reviewed-by: Tom Rix <trix@redhat.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 159 ++++++++++++++++++++++++++++++-
 1 file changed, 155 insertions(+), 4 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 05f6f5e..0541046 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -685,6 +685,7 @@
 				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
 				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
 #ifdef ACC100_EXT_MEM
+				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK |
 				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE |
 				RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE |
 #endif
@@ -1419,10 +1420,7 @@
 	acc100_fcw_le_fill(ops[0], &desc->req.fcw_le, num);
 
 	/** This could be done at polling */
-	desc->req.word0 = ACC100_DMA_DESC_TYPE;
-	desc->req.word1 = 0; /**< Timestamp could be disabled */
-	desc->req.word2 = 0;
-	desc->req.word3 = 0;
+	acc100_header_init(&desc->req);
 	desc->req.numCBs = num;
 
 	in_length_in_bytes = ops[0]->ldpc_enc.input.data->data_len;
@@ -1505,12 +1503,165 @@
 	return 1;
 }
 
+static inline int
+harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs) {
+	struct acc100_fcw_ld *fcw;
+	union acc100_dma_desc *desc;
+	int next_triplet = 1;
+	struct rte_mbuf *hq_output_head, *hq_output;
+	uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
+	if (harq_in_length == 0) {
+		rte_bbdev_log(ERR, "Loopback of invalid null size\n");
+		return -EINVAL;
+	}
+
+	int h_comp = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
+			) ? 1 : 0;
+	if (h_comp == 1)
+		harq_in_length = harq_in_length * 8 / 6;
+	harq_in_length = RTE_ALIGN(harq_in_length, 64);
+	uint16_t harq_dma_length_in = (h_comp == 0) ?
+			harq_in_length :
+			harq_in_length * 6 / 8;
+	uint16_t harq_dma_length_out = harq_dma_length_in;
+	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+	uint16_t harq_index = (ddr_mem_in ?
+			op->ldpc_dec.harq_combined_input.offset :
+			op->ldpc_dec.harq_combined_output.offset)
+			/ ACC100_HARQ_OFFSET;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	fcw = &desc->req.fcw_ld;
+	/* Set the FCW from loopback into DDR */
+	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
+	fcw->FCWversion = ACC100_FCW_VER;
+	fcw->qm = 2;
+	fcw->Zc = 384;
+	if (harq_in_length < 16 * ACC100_N_ZC_1)
+		fcw->Zc = 16;
+	fcw->ncb = fcw->Zc * ACC100_N_ZC_1;
+	fcw->rm_e = 2;
+	fcw->hcin_en = 1;
+	fcw->hcout_en = 1;
+
+	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d\n",
+			ddr_mem_in, harq_index,
+			harq_layout[harq_index].offset, harq_in_length,
+			harq_dma_length_in);
+
+	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
+		fcw->hcin_size0 = harq_layout[harq_index].size0;
+		fcw->hcin_offset = harq_layout[harq_index].offset;
+		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
+		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
+		if (h_comp == 1)
+			harq_dma_length_in = harq_dma_length_in * 6 / 8;
+	} else {
+		fcw->hcin_size0 = harq_in_length;
+	}
+	harq_layout[harq_index].val = 0;
+	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
+			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
+	fcw->hcout_size0 = harq_in_length;
+	fcw->hcin_decomp_mode = h_comp;
+	fcw->hcout_comp_mode = h_comp;
+	fcw->gain_i = 1;
+	fcw->gain_h = 1;
+
+	/* Set the prefix of descriptor. This could be done at polling */
+	acc100_header_init(&desc->req);
+
+	/* Null LLR input for Decoder */
+	desc->req.data_ptrs[next_triplet].address =
+			q->lb_in_addr_iova;
+	desc->req.data_ptrs[next_triplet].blen = 2;
+	desc->req.data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
+	desc->req.data_ptrs[next_triplet].last = 0;
+	desc->req.data_ptrs[next_triplet].dma_ext = 0;
+	next_triplet++;
+
+	/* HARQ Combine input from either Memory interface */
+	if (!ddr_mem_in) {
+		next_triplet = acc100_dma_fill_blk_type_out(&desc->req,
+				op->ldpc_dec.harq_combined_input.data,
+				op->ldpc_dec.harq_combined_input.offset,
+				harq_dma_length_in,
+				next_triplet,
+				ACC100_DMA_BLKID_IN_HARQ);
+	} else {
+		desc->req.data_ptrs[next_triplet].address =
+				op->ldpc_dec.harq_combined_input.offset;
+		desc->req.data_ptrs[next_triplet].blen =
+				harq_dma_length_in;
+		desc->req.data_ptrs[next_triplet].blkid =
+				ACC100_DMA_BLKID_IN_HARQ;
+		desc->req.data_ptrs[next_triplet].dma_ext = 1;
+		next_triplet++;
+	}
+	desc->req.data_ptrs[next_triplet - 1].last = 1;
+	desc->req.m2dlen = next_triplet;
+
+	/* Dropped decoder hard output */
+	desc->req.data_ptrs[next_triplet].address =
+			q->lb_out_addr_iova;
+	desc->req.data_ptrs[next_triplet].blen = ACC100_BYTES_IN_WORD;
+	desc->req.data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_OUT_HARD;
+	desc->req.data_ptrs[next_triplet].last = 0;
+	desc->req.data_ptrs[next_triplet].dma_ext = 0;
+	next_triplet++;
+
+	/* HARQ Combine output to either Memory interface */
+	if (check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_OUT_ENABLE
+			)) {
+		desc->req.data_ptrs[next_triplet].address =
+				op->ldpc_dec.harq_combined_output.offset;
+		desc->req.data_ptrs[next_triplet].blen =
+				harq_dma_length_out;
+		desc->req.data_ptrs[next_triplet].blkid =
+				ACC100_DMA_BLKID_OUT_HARQ;
+		desc->req.data_ptrs[next_triplet].dma_ext = 1;
+		next_triplet++;
+	} else {
+		hq_output_head = op->ldpc_dec.harq_combined_output.data;
+		hq_output = op->ldpc_dec.harq_combined_output.data;
+		next_triplet = acc100_dma_fill_blk_type_out(
+				&desc->req,
+				op->ldpc_dec.harq_combined_output.data,
+				op->ldpc_dec.harq_combined_output.offset,
+				harq_dma_length_out,
+				next_triplet,
+				ACC100_DMA_BLKID_OUT_HARQ);
+		/* HARQ output */
+		mbuf_append(hq_output_head, hq_output, harq_dma_length_out);
+		op->ldpc_dec.harq_combined_output.length =
+				harq_dma_length_out;
+	}
+	desc->req.data_ptrs[next_triplet - 1].last = 1;
+	desc->req.d2mlen = next_triplet - desc->req.m2dlen;
+	desc->req.op_addr = op;
+
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
 /** Enqueue one decode operations for ACC100 device in CB mode */
 static inline int
 enqueue_ldpc_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
 		uint16_t total_enqueued_cbs, bool same_op)
 {
 	int ret;
+	if (unlikely(check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_LOOPBACK))) {
+		ret = harq_loopback(q, op, total_enqueued_cbs);
+		return ret;
+	}
 
 	union acc100_dma_desc *desc;
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v12 07/10] baseband/acc100: add support for 4G processing
  2020-10-05 22:12   ` [dpdk-dev] [PATCH v12 00/10] bbdev PMD ACC100 Nicolas Chautru
                       ` (5 preceding siblings ...)
  2020-10-05 22:12     ` [dpdk-dev] [PATCH v12 06/10] baseband/acc100: add HARQ loopback support Nicolas Chautru
@ 2020-10-05 22:12     ` Nicolas Chautru
  2020-10-05 22:12     ` [dpdk-dev] [PATCH v12 08/10] baseband/acc100: add interrupt support to PMD Nicolas Chautru
                       ` (3 subsequent siblings)
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-10-05 22:12 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, trix, maxime.coquelin, ferruh.yigit,
	tianjiao.liu, Nicolas Chautru

Add capability for 4G encode and decode processing.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 doc/guides/bbdevs/features/acc100.ini    |    4 +-
 drivers/baseband/acc100/rte_acc100_pmd.c | 1007 +++++++++++++++++++++++++++---
 2 files changed, 936 insertions(+), 75 deletions(-)

diff --git a/doc/guides/bbdevs/features/acc100.ini b/doc/guides/bbdevs/features/acc100.ini
index 40c7adc..642cd48 100644
--- a/doc/guides/bbdevs/features/acc100.ini
+++ b/doc/guides/bbdevs/features/acc100.ini
@@ -4,8 +4,8 @@
 ; Refer to default.ini for the full list of available PMD features.
 ;
 [Features]
-Turbo Decoder (4G)     = N
-Turbo Encoder (4G)     = N
+Turbo Decoder (4G)     = Y
+Turbo Encoder (4G)     = Y
 LDPC Decoder (5G)      = Y
 LDPC Encoder (5G)      = Y
 LLR/HARQ Compression   = Y
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 0541046..e9aa07d 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -342,7 +342,6 @@
 	free_base_addresses(base_addrs, i);
 }
 
-
 /* Allocate 64MB memory used for all software rings */
 static int
 acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
@@ -664,6 +663,41 @@
 
 	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
 		{
+			.type = RTE_BBDEV_OP_TURBO_DEC,
+			.cap.turbo_dec = {
+				.capability_flags =
+					RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE |
+					RTE_BBDEV_TURBO_CRC_TYPE_24B |
+					RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
+					RTE_BBDEV_TURBO_EARLY_TERMINATION |
+					RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
+					RTE_BBDEV_TURBO_MAP_DEC |
+					RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP |
+					RTE_BBDEV_TURBO_DEC_SCATTER_GATHER,
+				.max_llr_modulus = INT8_MAX,
+				.num_buffers_src =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+				.num_buffers_hard_out =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+				.num_buffers_soft_out =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+			}
+		},
+		{
+			.type = RTE_BBDEV_OP_TURBO_ENC,
+			.cap.turbo_enc = {
+				.capability_flags =
+					RTE_BBDEV_TURBO_CRC_24B_ATTACH |
+					RTE_BBDEV_TURBO_RV_INDEX_BYPASS |
+					RTE_BBDEV_TURBO_RATE_MATCH |
+					RTE_BBDEV_TURBO_ENC_SCATTER_GATHER,
+				.num_buffers_src =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+				.num_buffers_dst =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+			}
+		},
+		{
 			.type   = RTE_BBDEV_OP_LDPC_ENC,
 			.cap.ldpc_enc = {
 				.capability_flags =
@@ -746,7 +780,6 @@
 #endif
 }
 
-
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.setup_queues = acc100_setup_queues,
 	.close = acc100_dev_close,
@@ -790,6 +823,58 @@
 	return tail;
 }
 
+/* Fill in a frame control word for turbo encoding. */
+static inline void
+acc100_fcw_te_fill(const struct rte_bbdev_enc_op *op, struct acc100_fcw_te *fcw)
+{
+	fcw->code_block_mode = op->turbo_enc.code_block_mode;
+	if (fcw->code_block_mode == 0) { /* For TB mode */
+		fcw->k_neg = op->turbo_enc.tb_params.k_neg;
+		fcw->k_pos = op->turbo_enc.tb_params.k_pos;
+		fcw->c_neg = op->turbo_enc.tb_params.c_neg;
+		fcw->c = op->turbo_enc.tb_params.c;
+		fcw->ncb_neg = op->turbo_enc.tb_params.ncb_neg;
+		fcw->ncb_pos = op->turbo_enc.tb_params.ncb_pos;
+
+		if (check_bit(op->turbo_enc.op_flags,
+				RTE_BBDEV_TURBO_RATE_MATCH)) {
+			fcw->bypass_rm = 0;
+			fcw->cab = op->turbo_enc.tb_params.cab;
+			fcw->ea = op->turbo_enc.tb_params.ea;
+			fcw->eb = op->turbo_enc.tb_params.eb;
+		} else {
+			/* E is set to the encoding output size when RM is
+			 * bypassed.
+			 */
+			fcw->bypass_rm = 1;
+			fcw->cab = fcw->c_neg;
+			fcw->ea = 3 * fcw->k_neg + 12;
+			fcw->eb = 3 * fcw->k_pos + 12;
+		}
+	} else { /* For CB mode */
+		fcw->k_pos = op->turbo_enc.cb_params.k;
+		fcw->ncb_pos = op->turbo_enc.cb_params.ncb;
+
+		if (check_bit(op->turbo_enc.op_flags,
+				RTE_BBDEV_TURBO_RATE_MATCH)) {
+			fcw->bypass_rm = 0;
+			fcw->eb = op->turbo_enc.cb_params.e;
+		} else {
+			/* E is set to the encoding output size when RM is
+			 * bypassed.
+			 */
+			fcw->bypass_rm = 1;
+			fcw->eb = 3 * fcw->k_pos + 12;
+		}
+	}
+
+	fcw->bypass_rv_idx1 = check_bit(op->turbo_enc.op_flags,
+			RTE_BBDEV_TURBO_RV_INDEX_BYPASS);
+	fcw->code_block_crc = check_bit(op->turbo_enc.op_flags,
+			RTE_BBDEV_TURBO_CRC_24B_ATTACH);
+	fcw->rv_idx1 = op->turbo_enc.rv_index;
+}
+
 /* Compute value of k0.
  * Based on 3GPP 38.212 Table 5.4.2.1-2
  * Starting position of different redundancy versions, k0
@@ -840,6 +925,25 @@
 	fcw->mcb_count = num_cb;
 }
 
+/* Fill in a frame control word for turbo decoding. */
+static inline void
+acc100_fcw_td_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_td *fcw)
+{
+	/* Note: Early termination is always enabled for 4GUL */
+	fcw->fcw_ver = 1;
+	if (op->turbo_dec.code_block_mode == 0)
+		fcw->k_pos = op->turbo_dec.tb_params.k_pos;
+	else
+		fcw->k_pos = op->turbo_dec.cb_params.k;
+	fcw->turbo_crc_type = check_bit(op->turbo_dec.op_flags,
+			RTE_BBDEV_TURBO_CRC_TYPE_24B);
+	fcw->bypass_sb_deint = 0;
+	fcw->raw_decoder_input_on = 0;
+	fcw->max_iter = op->turbo_dec.iter_max;
+	fcw->half_iter_on = !check_bit(op->turbo_dec.op_flags,
+			RTE_BBDEV_TURBO_HALF_ITERATION_EVEN);
+}
+
 /* Fill in a frame control word for LDPC decoding. */
 static inline void
 acc100_fcw_ld_fill(const struct rte_bbdev_dec_op *op, struct acc100_fcw_ld *fcw,
@@ -1093,6 +1197,87 @@
 #endif
 
 static inline int
+acc100_dma_desc_te_fill(struct rte_bbdev_enc_op *op,
+		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
+		struct rte_mbuf *output, uint32_t *in_offset,
+		uint32_t *out_offset, uint32_t *out_length,
+		uint32_t *mbuf_total_left, uint32_t *seg_total_left, uint8_t r)
+{
+	int next_triplet = 1; /* FCW already done */
+	uint32_t e, ea, eb, length;
+	uint16_t k, k_neg, k_pos;
+	uint8_t cab, c_neg;
+
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /**< Timestamp could be disabled */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+
+	if (op->turbo_enc.code_block_mode == 0) {
+		ea = op->turbo_enc.tb_params.ea;
+		eb = op->turbo_enc.tb_params.eb;
+		cab = op->turbo_enc.tb_params.cab;
+		k_neg = op->turbo_enc.tb_params.k_neg;
+		k_pos = op->turbo_enc.tb_params.k_pos;
+		c_neg = op->turbo_enc.tb_params.c_neg;
+		e = (r < cab) ? ea : eb;
+		k = (r < c_neg) ? k_neg : k_pos;
+	} else {
+		e = op->turbo_enc.cb_params.e;
+		k = op->turbo_enc.cb_params.k;
+	}
+
+	if (check_bit(op->turbo_enc.op_flags, RTE_BBDEV_TURBO_CRC_24B_ATTACH))
+		length = (k - 24) >> 3;
+	else
+		length = k >> 3;
+
+	if (unlikely((*mbuf_total_left == 0) || (*mbuf_total_left < length))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, length);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset,
+			length, seg_total_left, next_triplet);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= length;
+
+	/* Set output length */
+	if (check_bit(op->turbo_enc.op_flags, RTE_BBDEV_TURBO_RATE_MATCH))
+		/* Integer round up division by 8 */
+		*out_length = (e + 7) >> 3;
+	else
+		*out_length = (k >> 3) * 3 + 2;
+
+	next_triplet = acc100_dma_fill_blk_type_out(desc, output, *out_offset,
+			*out_length, next_triplet, ACC100_DMA_BLKID_OUT_ENC);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	op->turbo_enc.output.length += *out_length;
+	*out_offset += *out_length;
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
+
+static inline int
 acc100_dma_desc_le_fill(struct rte_bbdev_enc_op *op,
 		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
 		struct rte_mbuf *output, uint32_t *in_offset,
@@ -1151,6 +1336,117 @@
 }
 
 static inline int
+acc100_dma_desc_td_fill(struct rte_bbdev_dec_op *op,
+		struct acc100_dma_req_desc *desc, struct rte_mbuf **input,
+		struct rte_mbuf *h_output, struct rte_mbuf *s_output,
+		uint32_t *in_offset, uint32_t *h_out_offset,
+		uint32_t *s_out_offset, uint32_t *h_out_length,
+		uint32_t *s_out_length, uint32_t *mbuf_total_left,
+		uint32_t *seg_total_left, uint8_t r)
+{
+	int next_triplet = 1; /* FCW already done */
+	uint16_t k;
+	uint16_t crc24_overlap = 0;
+	uint32_t e, kw;
+
+	desc->word0 = ACC100_DMA_DESC_TYPE;
+	desc->word1 = 0; /**< Timestamp could be disabled */
+	desc->word2 = 0;
+	desc->word3 = 0;
+	desc->numCBs = 1;
+
+	if (op->turbo_dec.code_block_mode == 0) {
+		k = (r < op->turbo_dec.tb_params.c_neg)
+			? op->turbo_dec.tb_params.k_neg
+			: op->turbo_dec.tb_params.k_pos;
+		e = (r < op->turbo_dec.tb_params.cab)
+			? op->turbo_dec.tb_params.ea
+			: op->turbo_dec.tb_params.eb;
+	} else {
+		k = op->turbo_dec.cb_params.k;
+		e = op->turbo_dec.cb_params.e;
+	}
+
+	if ((op->turbo_dec.code_block_mode == 0)
+		&& !check_bit(op->turbo_dec.op_flags,
+		RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP))
+		crc24_overlap = 24;
+
+	/* Calculates circular buffer size.
+	 * According to 3gpp 36.212 section 5.1.4.2
+	 *   Kw = 3 * Kpi,
+	 * where:
+	 *   Kpi = nCol * nRow
+	 * where nCol is 32 and nRow can be calculated from:
+	 *   D <= nCol * nRow
+	 * where D is the size of each output from turbo encoder block (k + 4).
+	 */
+	kw = RTE_ALIGN_CEIL(k + 4, 32) * 3;
+
+	if (unlikely((*mbuf_total_left == 0) || (*mbuf_total_left < kw))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, kw);
+		return -1;
+	}
+
+	next_triplet = acc100_dma_fill_blk_type_in(desc, input, in_offset, kw,
+			seg_total_left, next_triplet);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= kw;
+
+	next_triplet = acc100_dma_fill_blk_type_out(
+			desc, h_output, *h_out_offset,
+			k >> 3, next_triplet, ACC100_DMA_BLKID_OUT_HARD);
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+
+	*h_out_length = ((k - crc24_overlap) >> 3);
+	op->turbo_dec.hard_output.length += *h_out_length;
+	*h_out_offset += *h_out_length;
+
+	/* Soft output */
+	if (check_bit(op->turbo_dec.op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT)) {
+		if (check_bit(op->turbo_dec.op_flags,
+				RTE_BBDEV_TURBO_EQUALIZER))
+			*s_out_length = e;
+		else
+			*s_out_length = (k * 3) + 12;
+
+		next_triplet = acc100_dma_fill_blk_type_out(desc, s_output,
+				*s_out_offset, *s_out_length, next_triplet,
+				ACC100_DMA_BLKID_OUT_SOFT);
+		if (unlikely(next_triplet < 0)) {
+			rte_bbdev_log(ERR,
+					"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+					op);
+			return -1;
+		}
+
+		op->turbo_dec.soft_output.length += *s_out_length;
+		*s_out_offset += *s_out_length;
+	}
+
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
+
+static inline int
 acc100_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
 		struct acc100_dma_req_desc *desc,
 		struct rte_mbuf **input, struct rte_mbuf *h_output,
@@ -1404,6 +1700,51 @@
 
 /* Enqueue one encode operations for ACC100 device in CB mode */
 static inline int
+enqueue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+		uint16_t total_enqueued_cbs)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+		seg_total_left;
+	struct rte_mbuf *input, *output_head, *output;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_te_fill(op, &desc->req.fcw_te);
+
+	input = op->turbo_enc.input.data;
+	output_head = output = op->turbo_enc.output.data;
+	in_offset = op->turbo_enc.input.offset;
+	out_offset = op->turbo_enc.output.offset;
+	out_length = 0;
+	mbuf_total_left = op->turbo_enc.input.length;
+	seg_total_left = rte_pktmbuf_data_len(op->turbo_enc.input.data)
+			- in_offset;
+
+	ret = acc100_dma_desc_te_fill(op, &desc->req, &input, output,
+			&in_offset, &out_offset, &out_length, &mbuf_total_left,
+			&seg_total_left, 0);
+
+	if (unlikely(ret < 0))
+		return ret;
+
+	mbuf_append(output_head, output, out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_te,
+			sizeof(desc->req.fcw_te) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+	if (check_mbuf_total_left(mbuf_total_left) != 0)
+		return -EINVAL;
+#endif
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+/* Enqueue multiple encode operations for ACC100 device in CB mode */
+static inline int
 enqueue_ldpc_enc_n_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op **ops,
 		uint16_t total_enqueued_cbs, int16_t num)
 {
@@ -1503,85 +1844,235 @@
 	return 1;
 }
 
-static inline int
-harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
-		uint16_t total_enqueued_cbs) {
-	struct acc100_fcw_ld *fcw;
-	union acc100_dma_desc *desc;
-	int next_triplet = 1;
-	struct rte_mbuf *hq_output_head, *hq_output;
-	uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
-	if (harq_in_length == 0) {
-		rte_bbdev_log(ERR, "Loopback of invalid null size\n");
-		return -EINVAL;
-	}
 
-	int h_comp = check_bit(op->ldpc_dec.op_flags,
-			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
-			) ? 1 : 0;
-	if (h_comp == 1)
-		harq_in_length = harq_in_length * 8 / 6;
-	harq_in_length = RTE_ALIGN(harq_in_length, 64);
-	uint16_t harq_dma_length_in = (h_comp == 0) ?
-			harq_in_length :
-			harq_in_length * 6 / 8;
-	uint16_t harq_dma_length_out = harq_dma_length_in;
-	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
-			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
-	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
-	uint16_t harq_index = (ddr_mem_in ?
-			op->ldpc_dec.harq_combined_input.offset :
-			op->ldpc_dec.harq_combined_output.offset)
-			/ ACC100_HARQ_OFFSET;
+/* Enqueue one encode operations for ACC100 device in TB mode. */
+static inline int
+enqueue_enc_one_op_tb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
+		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint8_t r, c;
+	uint32_t in_offset, out_offset, out_length, mbuf_total_left,
+		seg_total_left;
+	struct rte_mbuf *input, *output_head, *output;
+	uint16_t current_enqueued_cbs = 0;
 
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
-	fcw = &desc->req.fcw_ld;
-	/* Set the FCW from loopback into DDR */
-	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
-	fcw->FCWversion = ACC100_FCW_VER;
-	fcw->qm = 2;
-	fcw->Zc = 384;
-	if (harq_in_length < 16 * ACC100_N_ZC_1)
-		fcw->Zc = 16;
-	fcw->ncb = fcw->Zc * ACC100_N_ZC_1;
-	fcw->rm_e = 2;
-	fcw->hcin_en = 1;
-	fcw->hcout_en = 1;
+	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+	acc100_fcw_te_fill(op, &desc->req.fcw_te);
 
-	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d\n",
-			ddr_mem_in, harq_index,
-			harq_layout[harq_index].offset, harq_in_length,
-			harq_dma_length_in);
+	input = op->turbo_enc.input.data;
+	output_head = output = op->turbo_enc.output.data;
+	in_offset = op->turbo_enc.input.offset;
+	out_offset = op->turbo_enc.output.offset;
+	out_length = 0;
+	mbuf_total_left = op->turbo_enc.input.length;
 
-	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
-		fcw->hcin_size0 = harq_layout[harq_index].size0;
-		fcw->hcin_offset = harq_layout[harq_index].offset;
-		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
-		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
-		if (h_comp == 1)
-			harq_dma_length_in = harq_dma_length_in * 6 / 8;
-	} else {
-		fcw->hcin_size0 = harq_in_length;
-	}
-	harq_layout[harq_index].val = 0;
-	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
-			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
-	fcw->hcout_size0 = harq_in_length;
-	fcw->hcin_decomp_mode = h_comp;
-	fcw->hcout_comp_mode = h_comp;
-	fcw->gain_i = 1;
-	fcw->gain_h = 1;
+	c = op->turbo_enc.tb_params.c;
+	r = op->turbo_enc.tb_params.r;
 
-	/* Set the prefix of descriptor. This could be done at polling */
-	acc100_header_init(&desc->req);
+	while (mbuf_total_left > 0 && r < c) {
+		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+		/* Set up DMA descriptor */
+		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+				& q->sw_ring_wrap_mask);
+		desc->req.data_ptrs[0].address = q->ring_addr_iova + fcw_offset;
+		desc->req.data_ptrs[0].blen = ACC100_FCW_TE_BLEN;
 
-	/* Null LLR input for Decoder */
-	desc->req.data_ptrs[next_triplet].address =
-			q->lb_in_addr_iova;
-	desc->req.data_ptrs[next_triplet].blen = 2;
-	desc->req.data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
+		ret = acc100_dma_desc_te_fill(op, &desc->req, &input, output,
+				&in_offset, &out_offset, &out_length,
+				&mbuf_total_left, &seg_total_left, r);
+		if (unlikely(ret < 0))
+			return ret;
+		mbuf_append(output_head, output, out_length);
+
+		/* Set total number of CBs in TB */
+		desc->req.cbs_in_tb = cbs_in_tb;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "FCW", &desc->req.fcw_te,
+				sizeof(desc->req.fcw_te) - 8);
+		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+		if (seg_total_left == 0) {
+			/* Go to the next mbuf */
+			input = input->next;
+			in_offset = 0;
+			output = output->next;
+			out_offset = 0;
+		}
+
+		total_enqueued_cbs++;
+		current_enqueued_cbs++;
+		r++;
+	}
+
+	if (unlikely(desc == NULL))
+		return current_enqueued_cbs;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (check_mbuf_total_left(mbuf_total_left) != 0)
+		return -EINVAL;
+#endif
+
+	/* Set SDone on last CB descriptor for TB mode. */
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	return current_enqueued_cbs;
+}
+
+/* Enqueue one decode operation for ACC100 device in CB mode */
+static inline int
+enqueue_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint32_t in_offset, h_out_offset, s_out_offset, s_out_length,
+		h_out_length, mbuf_total_left, seg_total_left;
+	struct rte_mbuf *input, *h_output_head, *h_output,
+		*s_output_head, *s_output;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	acc100_fcw_td_fill(op, &desc->req.fcw_td);
+
+	input = op->turbo_dec.input.data;
+	h_output_head = h_output = op->turbo_dec.hard_output.data;
+	s_output_head = s_output = op->turbo_dec.soft_output.data;
+	in_offset = op->turbo_dec.input.offset;
+	h_out_offset = op->turbo_dec.hard_output.offset;
+	s_out_offset = op->turbo_dec.soft_output.offset;
+	h_out_length = s_out_length = 0;
+	mbuf_total_left = op->turbo_dec.input.length;
+	seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(input == NULL)) {
+		rte_bbdev_log(ERR, "Invalid mbuf pointer");
+		return -EFAULT;
+	}
+#endif
+
+	/* Set up DMA descriptor */
+	desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+
+	ret = acc100_dma_desc_td_fill(op, &desc->req, &input, h_output,
+			s_output, &in_offset, &h_out_offset, &s_out_offset,
+			&h_out_length, &s_out_length, &mbuf_total_left,
+			&seg_total_left, 0);
+
+	if (unlikely(ret < 0))
+		return ret;
+
+	/* Hard output */
+	mbuf_append(h_output_head, h_output, h_out_length);
+
+	/* Soft output */
+	if (check_bit(op->turbo_dec.op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT))
+		mbuf_append(s_output_head, s_output, s_out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	rte_memdump(stderr, "FCW", &desc->req.fcw_td,
+			sizeof(desc->req.fcw_td) - 8);
+	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+	if (check_mbuf_total_left(mbuf_total_left) != 0)
+		return -EINVAL;
+#endif
+
+	/* One CB (one op) was successfully prepared to enqueue */
+	return 1;
+}
+
+static inline int
+harq_loopback(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs) {
+	struct acc100_fcw_ld *fcw;
+	union acc100_dma_desc *desc;
+	int next_triplet = 1;
+	struct rte_mbuf *hq_output_head, *hq_output;
+	uint16_t harq_dma_length_in, harq_dma_length_out;
+	uint16_t harq_in_length = op->ldpc_dec.harq_combined_input.length;
+	if (harq_in_length == 0) {
+		rte_bbdev_log(ERR, "Loopback of invalid null size\n");
+		return -EINVAL;
+	}
+
+	int h_comp = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION
+			) ? 1 : 0;
+	if (h_comp == 1) {
+		harq_in_length = harq_in_length * 8 / 6;
+		harq_in_length = RTE_ALIGN(harq_in_length, 64);
+		harq_dma_length_in = harq_in_length * 6 / 8;
+	} else {
+		harq_in_length = RTE_ALIGN(harq_in_length, 64);
+		harq_dma_length_in = harq_in_length;
+	}
+	harq_dma_length_out = harq_dma_length_in;
+
+	bool ddr_mem_in = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_INTERNAL_HARQ_MEMORY_IN_ENABLE);
+	union acc100_harq_layout_data *harq_layout = q->d->harq_layout;
+	uint16_t harq_index = (ddr_mem_in ?
+			op->ldpc_dec.harq_combined_input.offset :
+			op->ldpc_dec.harq_combined_output.offset)
+			/ ACC100_HARQ_OFFSET;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	fcw = &desc->req.fcw_ld;
+	/* Set the FCW from loopback into DDR */
+	memset(fcw, 0, sizeof(struct acc100_fcw_ld));
+	fcw->FCWversion = ACC100_FCW_VER;
+	fcw->qm = 2;
+	fcw->Zc = 384;
+	if (harq_in_length < 16 * ACC100_N_ZC_1)
+		fcw->Zc = 16;
+	fcw->ncb = fcw->Zc * ACC100_N_ZC_1;
+	fcw->rm_e = 2;
+	fcw->hcin_en = 1;
+	fcw->hcout_en = 1;
+
+	rte_bbdev_log(DEBUG, "Loopback IN %d Index %d offset %d length %d %d\n",
+			ddr_mem_in, harq_index,
+			harq_layout[harq_index].offset, harq_in_length,
+			harq_dma_length_in);
+
+	if (ddr_mem_in && (harq_layout[harq_index].offset > 0)) {
+		fcw->hcin_size0 = harq_layout[harq_index].size0;
+		fcw->hcin_offset = harq_layout[harq_index].offset;
+		fcw->hcin_size1 = harq_in_length - fcw->hcin_offset;
+		harq_dma_length_in = (fcw->hcin_size0 + fcw->hcin_size1);
+		if (h_comp == 1)
+			harq_dma_length_in = harq_dma_length_in * 6 / 8;
+	} else {
+		fcw->hcin_size0 = harq_in_length;
+	}
+	harq_layout[harq_index].val = 0;
+	rte_bbdev_log(DEBUG, "Loopback FCW Config %d %d %d\n",
+			fcw->hcin_size0, fcw->hcin_offset, fcw->hcin_size1);
+	fcw->hcout_size0 = harq_in_length;
+	fcw->hcin_decomp_mode = h_comp;
+	fcw->hcout_comp_mode = h_comp;
+	fcw->gain_i = 1;
+	fcw->gain_h = 1;
+
+	/* Set the descriptor prefix. This could be done at polling time */
+	acc100_header_init(&desc->req);
+
+	/* Null LLR input for Decoder */
+	desc->req.data_ptrs[next_triplet].address =
+			q->lb_in_addr_iova;
+	desc->req.data_ptrs[next_triplet].blen = 2;
+	desc->req.data_ptrs[next_triplet].blkid = ACC100_DMA_BLKID_IN;
 	desc->req.data_ptrs[next_triplet].last = 0;
 	desc->req.data_ptrs[next_triplet].dma_ext = 0;
 	next_triplet++;
@@ -1831,6 +2322,102 @@
 	return current_enqueued_cbs;
 }
 
+/* Enqueue one decode operation for ACC100 device in TB mode */
+static inline int
+enqueue_dec_one_op_tb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
+		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
+{
+	union acc100_dma_desc *desc = NULL;
+	int ret;
+	uint8_t r, c;
+	uint32_t in_offset, h_out_offset, s_out_offset, s_out_length,
+		h_out_length, mbuf_total_left, seg_total_left;
+	struct rte_mbuf *input, *h_output_head, *h_output,
+		*s_output_head, *s_output;
+	uint16_t current_enqueued_cbs = 0;
+
+	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
+			& q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	uint64_t fcw_offset = (desc_idx << 8) + ACC100_DESC_FCW_OFFSET;
+	acc100_fcw_td_fill(op, &desc->req.fcw_td);
+
+	input = op->turbo_dec.input.data;
+	h_output_head = h_output = op->turbo_dec.hard_output.data;
+	s_output_head = s_output = op->turbo_dec.soft_output.data;
+	in_offset = op->turbo_dec.input.offset;
+	h_out_offset = op->turbo_dec.hard_output.offset;
+	s_out_offset = op->turbo_dec.soft_output.offset;
+	h_out_length = s_out_length = 0;
+	mbuf_total_left = op->turbo_dec.input.length;
+	c = op->turbo_dec.tb_params.c;
+	r = op->turbo_dec.tb_params.r;
+
+	while (mbuf_total_left > 0 && r < c) {
+
+		seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
+
+		/* Set up DMA descriptor */
+		desc = q->ring_addr + ((q->sw_ring_head + total_enqueued_cbs)
+				& q->sw_ring_wrap_mask);
+		desc->req.data_ptrs[0].address = q->ring_addr_iova + fcw_offset;
+		desc->req.data_ptrs[0].blen = ACC100_FCW_TD_BLEN;
+		ret = acc100_dma_desc_td_fill(op, &desc->req, &input,
+				h_output, s_output, &in_offset, &h_out_offset,
+				&s_out_offset, &h_out_length, &s_out_length,
+				&mbuf_total_left, &seg_total_left, r);
+
+		if (unlikely(ret < 0))
+			return ret;
+
+		/* Hard output */
+		mbuf_append(h_output_head, h_output, h_out_length);
+
+		/* Soft output */
+		if (check_bit(op->turbo_dec.op_flags,
+				RTE_BBDEV_TURBO_SOFT_OUTPUT))
+			mbuf_append(s_output_head, s_output, s_out_length);
+
+		/* Set total number of CBs in TB */
+		desc->req.cbs_in_tb = cbs_in_tb;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		rte_memdump(stderr, "FCW", &desc->req.fcw_td,
+				sizeof(desc->req.fcw_td) - 8);
+		rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+#endif
+
+		if (seg_total_left == 0) {
+			/* Go to the next mbuf */
+			input = input->next;
+			in_offset = 0;
+			h_output = h_output->next;
+			h_out_offset = 0;
+
+			if (check_bit(op->turbo_dec.op_flags,
+					RTE_BBDEV_TURBO_SOFT_OUTPUT)) {
+				s_output = s_output->next;
+				s_out_offset = 0;
+			}
+		}
+
+		total_enqueued_cbs++;
+		current_enqueued_cbs++;
+		r++;
+	}
+
+	if (unlikely(desc == NULL))
+		return current_enqueued_cbs;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (check_mbuf_total_left(mbuf_total_left) != 0)
+		return -EINVAL;
+#endif
+	/* Set SDone on last CB descriptor for TB mode */
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	return current_enqueued_cbs;
+}
 
 /* Calculates number of CBs in processed encoder TB based on 'r' and input
  * length.
@@ -1908,6 +2495,45 @@
 	return cbs_in_tb;
 }
 
+/* Enqueue encode operations for ACC100 device in CB mode. */
+static uint16_t
+acc100_enqueue_enc_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i;
+	union acc100_dma_desc *desc;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - 1 < 0))
+			break;
+		avail -= 1;
+
+		ret = enqueue_enc_one_op_cb(q, ops[i], i);
+		if (ret < 0)
+			break;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
+			& q->sw_ring_wrap_mask);
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, i, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+	return i;
+}
+
 /* Check we can mux encode operations with common FCW */
 static inline bool
 check_mux(struct rte_bbdev_enc_op **ops, uint16_t num) {
@@ -1976,6 +2602,54 @@
 	return i;
 }
 
+/* Enqueue encode operations for ACC100 device in TB mode. */
+static uint16_t
+acc100_enqueue_enc_tb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i, enqueued_cbs = 0;
+	uint8_t cbs_in_tb;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		cbs_in_tb = get_num_cbs_in_tb_enc(&ops[i]->turbo_enc);
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - cbs_in_tb < 0))
+			break;
+		avail -= cbs_in_tb;
+
+		ret = enqueue_enc_one_op_tb(q, ops[i], enqueued_cbs, cbs_in_tb);
+		if (ret < 0)
+			break;
+		enqueued_cbs += ret;
+	}
+	if (unlikely(enqueued_cbs == 0))
+		return 0; /* Nothing to enqueue */
+
+	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
+}
+
+/* Enqueue encode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	if (unlikely(num == 0))
+		return 0;
+	if (ops[0]->turbo_enc.code_block_mode == 0)
+		return acc100_enqueue_enc_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_enc_cb(q_data, ops, num);
+}
+
 /* Enqueue LDPC encode operations for ACC100 device. */
 static uint16_t
 acc100_enqueue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
@@ -1983,7 +2657,51 @@
 {
 	if (unlikely(num == 0))
 		return 0;
-	return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
+	if (ops[0]->ldpc_enc.code_block_mode == 0)
+		return acc100_enqueue_enc_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_ldpc_enc_cb(q_data, ops, num);
+}
+
+/* Enqueue decode operations for ACC100 device in CB mode */
+static uint16_t
+acc100_enqueue_dec_cb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i;
+	union acc100_dma_desc *desc;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - 1 < 0))
+			break;
+		avail -= 1;
+
+		ret = enqueue_dec_one_op_cb(q, ops[i], i);
+		if (ret < 0)
+			break;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue */
+
+	/* Set SDone in last CB in enqueued ops for CB mode */
+	desc = q->ring_addr + ((q->sw_ring_head + i - 1)
+			& q->sw_ring_wrap_mask);
+	desc->req.sdone_enable = 1;
+	desc->req.irq_enable = q->irq_enable;
+
+	acc100_dma_enqueue(q, i, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
 }
 
 /* Check we can mux encode operations with common FCW */
@@ -2081,6 +2799,53 @@
 	return i;
 }
 
+
+/* Enqueue decode operations for ACC100 device in TB mode */
+static uint16_t
+acc100_enqueue_dec_tb(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	int32_t avail = q->sw_ring_depth + q->sw_ring_tail - q->sw_ring_head;
+	uint16_t i, enqueued_cbs = 0;
+	uint8_t cbs_in_tb;
+	int ret;
+
+	for (i = 0; i < num; ++i) {
+		cbs_in_tb = get_num_cbs_in_tb_dec(&ops[i]->turbo_dec);
+		/* Check if there is available space for further processing */
+		if (unlikely(avail - cbs_in_tb < 0))
+			break;
+		avail -= cbs_in_tb;
+
+		ret = enqueue_dec_one_op_tb(q, ops[i], enqueued_cbs, cbs_in_tb);
+		if (ret < 0)
+			break;
+		enqueued_cbs += ret;
+	}
+	if (unlikely(enqueued_cbs == 0))
+		return 0; /* Nothing to enqueue */
+
+	acc100_dma_enqueue(q, enqueued_cbs, &q_data->queue_stats);
+
+	/* Update stats */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+
+	return i;
+}
+
+/* Enqueue decode operations for ACC100 device. */
+static uint16_t
+acc100_enqueue_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	if (unlikely(num == 0))
+		return 0;
+	if (ops[0]->turbo_dec.code_block_mode == 0)
+		return acc100_enqueue_dec_tb(q_data, ops, num);
+	else
+		return acc100_enqueue_dec_cb(q_data, ops, num);
+}
+
 /* Enqueue LDPC decode operations for ACC100 device. */
 static uint16_t
 acc100_enqueue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
@@ -2404,6 +3169,52 @@
 	return cb_idx;
 }
 
+/* Dequeue encode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_enc(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_enc_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint16_t dequeue_num;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t i, dequeued_cbs = 0;
+	struct rte_bbdev_enc_op *op;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL)) {
+		rte_bbdev_log_debug("Unexpected undefined pointer");
+		return 0;
+	}
+#endif
+
+	dequeue_num = (avail < num) ? avail : num;
+
+	for (i = 0; i < dequeue_num; ++i) {
+		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask))->req.op_addr;
+		if (op->turbo_enc.code_block_mode == 0)
+			ret = dequeue_enc_one_op_tb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+		else
+			ret = dequeue_enc_one_op_cb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_cbs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += i;
+
+	return i;
+}
+
 /* Dequeue LDPC encode operations from ACC100 device. */
 static uint16_t
 acc100_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
@@ -2442,6 +3253,52 @@
 	return dequeued_cbs;
 }
 
+
+/* Dequeue decode operations from ACC100 device. */
+static uint16_t
+acc100_dequeue_dec(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_dec_op **ops, uint16_t num)
+{
+	struct acc100_queue *q = q_data->queue_private;
+	uint16_t dequeue_num;
+	uint32_t avail = q->sw_ring_head - q->sw_ring_tail;
+	uint32_t aq_dequeued = 0;
+	uint16_t i;
+	uint16_t dequeued_cbs = 0;
+	struct rte_bbdev_dec_op *op;
+	int ret;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	if (unlikely(ops == NULL || q == NULL))
+		return 0;
+#endif
+
+	dequeue_num = (avail < num) ? avail : num;
+
+	for (i = 0; i < dequeue_num; ++i) {
+		op = (q->ring_addr + ((q->sw_ring_tail + dequeued_cbs)
+			& q->sw_ring_wrap_mask))->req.op_addr;
+		if (op->turbo_dec.code_block_mode == 0)
+			ret = dequeue_dec_one_op_tb(q, &ops[i], dequeued_cbs,
+					&aq_dequeued);
+		else
+			ret = dequeue_dec_one_op_cb(q_data, q, &ops[i],
+					dequeued_cbs, &aq_dequeued);
+
+		if (ret < 0)
+			break;
+		dequeued_cbs += ret;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_cbs;
+
+	/* Update dequeue stats */
+	q_data->queue_stats.dequeued_count += i;
+
+	return i;
+}
+
 /* Dequeue LDPC decode operations from ACC100 device. */
 static uint16_t
 acc100_dequeue_ldpc_dec(struct rte_bbdev_queue_data *q_data,
@@ -2495,6 +3352,10 @@
 	struct rte_pci_device *pci_dev = RTE_DEV_TO_PCI(dev->device);
 
 	dev->dev_ops = &acc100_bbdev_ops;
+	dev->enqueue_enc_ops = acc100_enqueue_enc;
+	dev->enqueue_dec_ops = acc100_enqueue_dec;
+	dev->dequeue_enc_ops = acc100_dequeue_enc;
+	dev->dequeue_dec_ops = acc100_dequeue_dec;
 	dev->enqueue_ldpc_enc_ops = acc100_enqueue_ldpc_enc;
 	dev->enqueue_ldpc_dec_ops = acc100_enqueue_ldpc_dec;
 	dev->dequeue_ldpc_enc_ops = acc100_dequeue_ldpc_enc;
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v12 08/10] baseband/acc100: add interrupt support to PMD
  2020-10-05 22:12   ` [dpdk-dev] [PATCH v12 00/10] bbdev PMD ACC100 Nicolas Chautru
                       ` (6 preceding siblings ...)
  2020-10-05 22:12     ` [dpdk-dev] [PATCH v12 07/10] baseband/acc100: add support for 4G processing Nicolas Chautru
@ 2020-10-05 22:12     ` Nicolas Chautru
  2020-10-05 22:12     ` [dpdk-dev] [PATCH v12 09/10] baseband/acc100: add debug function to validate input Nicolas Chautru
                       ` (2 subsequent siblings)
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-10-05 22:12 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, trix, maxime.coquelin, ferruh.yigit,
	tianjiao.liu, Nicolas Chautru

Adding capability and functions to support MSI
interrupts, callbacks and the Info Ring.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 307 ++++++++++++++++++++++++++++++-
 drivers/baseband/acc100/rte_acc100_pmd.h |  16 ++
 2 files changed, 320 insertions(+), 3 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index e9aa07d..408548e 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -342,6 +342,213 @@
 	free_base_addresses(base_addrs, i);
 }
 
+/*
+ * Find queue_id of a device queue based on details from the Info Ring.
+ * If a queue isn't found UINT16_MAX is returned.
+ */
+static inline uint16_t
+get_queue_id_from_ring_info(struct rte_bbdev_data *data,
+		const union acc100_info_ring_data ring_data)
+{
+	uint16_t queue_id;
+
+	for (queue_id = 0; queue_id < data->num_queues; ++queue_id) {
+		struct acc100_queue *acc100_q =
+				data->queues[queue_id].queue_private;
+		if (acc100_q != NULL && acc100_q->aq_id == ring_data.aq_id &&
+				acc100_q->qgrp_id == ring_data.qg_id &&
+				acc100_q->vf_id == ring_data.vf_id)
+			return queue_id;
+	}
+
+	return UINT16_MAX;
+}
+
+/* Check the device Info Ring and log any unexpected entries */
+static inline void
+acc100_check_ir(struct acc100_device *acc100_dev)
+{
+	volatile union acc100_info_ring_data *ring_data;
+	uint16_t info_ring_head = acc100_dev->info_ring_head;
+	if (acc100_dev->info_ring == NULL)
+		return;
+
+	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
+			ACC100_INFO_RING_MASK);
+
+	while (ring_data->valid) {
+		if ((ring_data->int_nb < ACC100_PF_INT_DMA_DL_DESC_IRQ) || (
+				ring_data->int_nb >
+				ACC100_PF_INT_DMA_DL5G_DESC_IRQ))
+			rte_bbdev_log(WARNING, "InfoRing: ITR:%d Info:0x%x",
+				ring_data->int_nb, ring_data->detailed_info);
+		/* Initialize Info Ring entry and move forward */
+		ring_data->val = 0;
+		info_ring_head++;
+		ring_data = acc100_dev->info_ring +
+				(info_ring_head & ACC100_INFO_RING_MASK);
+	}
+}
+
+/* Checks PF Info Ring to find the interrupt cause and handles it accordingly */
+static inline void
+acc100_pf_interrupt_handler(struct rte_bbdev *dev)
+{
+	struct acc100_device *acc100_dev = dev->data->dev_private;
+	volatile union acc100_info_ring_data *ring_data;
+	struct acc100_deq_intr_details deq_intr_det;
+
+	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
+			ACC100_INFO_RING_MASK);
+
+	while (ring_data->valid) {
+
+		rte_bbdev_log_debug(
+				"ACC100 PF Interrupt received, Info Ring data: 0x%x",
+				ring_data->val);
+
+		switch (ring_data->int_nb) {
+		case ACC100_PF_INT_DMA_DL_DESC_IRQ:
+		case ACC100_PF_INT_DMA_UL_DESC_IRQ:
+		case ACC100_PF_INT_DMA_UL5G_DESC_IRQ:
+		case ACC100_PF_INT_DMA_DL5G_DESC_IRQ:
+			deq_intr_det.queue_id = get_queue_id_from_ring_info(
+					dev->data, *ring_data);
+			if (deq_intr_det.queue_id == UINT16_MAX) {
+				rte_bbdev_log(ERR,
+						"Couldn't find queue: aq_id: %u, qg_id: %u, vf_id: %u",
+						ring_data->aq_id,
+						ring_data->qg_id,
+						ring_data->vf_id);
+				return;
+			}
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_DEQUEUE, &deq_intr_det);
+			break;
+		default:
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_ERROR, NULL);
+			break;
+		}
+
+		/* Initialize Info Ring entry and move forward */
+		ring_data->val = 0;
+		++acc100_dev->info_ring_head;
+		ring_data = acc100_dev->info_ring +
+				(acc100_dev->info_ring_head &
+				ACC100_INFO_RING_MASK);
+	}
+}
+
+/* Checks VF Info Ring to find the interrupt cause and handles it accordingly */
+static inline void
+acc100_vf_interrupt_handler(struct rte_bbdev *dev)
+{
+	struct acc100_device *acc100_dev = dev->data->dev_private;
+	volatile union acc100_info_ring_data *ring_data;
+	struct acc100_deq_intr_details deq_intr_det;
+
+	ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head &
+			ACC100_INFO_RING_MASK);
+
+	while (ring_data->valid) {
+
+		rte_bbdev_log_debug(
+				"ACC100 VF Interrupt received, Info Ring data: 0x%x",
+				ring_data->val);
+
+		switch (ring_data->int_nb) {
+		case ACC100_VF_INT_DMA_DL_DESC_IRQ:
+		case ACC100_VF_INT_DMA_UL_DESC_IRQ:
+		case ACC100_VF_INT_DMA_UL5G_DESC_IRQ:
+		case ACC100_VF_INT_DMA_DL5G_DESC_IRQ:
+			/* VFs are not aware of their vf_id - it's set to 0 in
+			 * queue structures.
+			 */
+			ring_data->vf_id = 0;
+			deq_intr_det.queue_id = get_queue_id_from_ring_info(
+					dev->data, *ring_data);
+			if (deq_intr_det.queue_id == UINT16_MAX) {
+				rte_bbdev_log(ERR,
+						"Couldn't find queue: aq_id: %u, qg_id: %u",
+						ring_data->aq_id,
+						ring_data->qg_id);
+				return;
+			}
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_DEQUEUE, &deq_intr_det);
+			break;
+		default:
+			rte_bbdev_pmd_callback_process(dev,
+					RTE_BBDEV_EVENT_ERROR, NULL);
+			break;
+		}
+
+		/* Initialize Info Ring entry and move forward */
+		ring_data->valid = 0;
+		++acc100_dev->info_ring_head;
+		ring_data = acc100_dev->info_ring + (acc100_dev->info_ring_head
+				& ACC100_INFO_RING_MASK);
+	}
+}
+
+/* Interrupt handler triggered by ACC100 dev for handling specific interrupt */
+static void
+acc100_dev_interrupt_handler(void *cb_arg)
+{
+	struct rte_bbdev *dev = cb_arg;
+	struct acc100_device *acc100_dev = dev->data->dev_private;
+
+	/* Read info ring */
+	if (acc100_dev->pf_device)
+		acc100_pf_interrupt_handler(dev);
+	else
+		acc100_vf_interrupt_handler(dev);
+}
+
+/* Allocate and set up the Info Ring */
+static int
+allocate_info_ring(struct rte_bbdev *dev)
+{
+	struct acc100_device *d = dev->data->dev_private;
+	const struct acc100_registry_addr *reg_addr;
+	rte_iova_t info_ring_iova;
+	uint32_t phys_low, phys_high;
+
+	if (d->info_ring != NULL)
+		return 0; /* Already configured */
+
+	/* Choose correct registry addresses for the device type */
+	if (d->pf_device)
+		reg_addr = &pf_reg_addr;
+	else
+		reg_addr = &vf_reg_addr;
+	/* Allocate InfoRing */
+	d->info_ring = rte_zmalloc_socket("Info Ring",
+			ACC100_INFO_RING_NUM_ENTRIES *
+			sizeof(*d->info_ring), RTE_CACHE_LINE_SIZE,
+			dev->data->socket_id);
+	if (d->info_ring == NULL) {
+		rte_bbdev_log(ERR,
+				"Failed to allocate Info Ring for %s:%u",
+				dev->device->driver->name,
+				dev->data->dev_id);
+		return -ENOMEM;
+	}
+	info_ring_iova = rte_malloc_virt2iova(d->info_ring);
+
+	/* Setup Info Ring */
+	phys_high = (uint32_t)(info_ring_iova >> 32);
+	phys_low  = (uint32_t)(info_ring_iova);
+	acc100_reg_write(d, reg_addr->info_ring_hi, phys_high);
+	acc100_reg_write(d, reg_addr->info_ring_lo, phys_low);
+	acc100_reg_write(d, reg_addr->info_ring_en, ACC100_REG_IRQ_EN_ALL);
+	d->info_ring_head = (acc100_reg_read(d, reg_addr->info_ring_ptr) &
+			0xFFF) / sizeof(union acc100_info_ring_data);
+	return 0;
+}
+
+
 /* Allocate 64MB memory used for all software rings */
 static int
 acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
@@ -349,6 +556,7 @@
 	uint32_t phys_low, phys_high, payload;
 	struct acc100_device *d = dev->data->dev_private;
 	const struct acc100_registry_addr *reg_addr;
+	int ret;
 
 	if (d->pf_device && !d->acc100_conf.pf_mode_en) {
 		rte_bbdev_log(NOTICE,
@@ -432,6 +640,14 @@
 	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_hi, phys_high);
 	acc100_reg_write(d, reg_addr->tail_ptrs_dl4g_lo, phys_low);
 
+	ret = allocate_info_ring(dev);
+	if (ret < 0) {
+		rte_bbdev_log(ERR, "Failed to allocate info_ring for %s:%u",
+				dev->device->driver->name,
+				dev->data->dev_id);
+		/* Continue */
+	}
+
 	d->harq_layout = rte_zmalloc_socket("HARQ Layout",
 			ACC100_HARQ_LAYOUT * sizeof(*d->harq_layout),
 			RTE_CACHE_LINE_SIZE, dev->data->socket_id);
@@ -453,13 +669,59 @@
 	return 0;
 }
 
+static int
+acc100_intr_enable(struct rte_bbdev *dev)
+{
+	int ret;
+	struct acc100_device *d = dev->data->dev_private;
+
+	/* Only MSI interrupts (via VFIO or UIO) are currently supported */
+	if (dev->intr_handle->type == RTE_INTR_HANDLE_VFIO_MSI ||
+			dev->intr_handle->type == RTE_INTR_HANDLE_UIO) {
+
+		ret = allocate_info_ring(dev);
+		if (ret < 0) {
+			rte_bbdev_log(ERR,
+					"Couldn't allocate info ring for device: %s",
+					dev->data->name);
+			return ret;
+		}
+
+		ret = rte_intr_enable(dev->intr_handle);
+		if (ret < 0) {
+			rte_bbdev_log(ERR,
+					"Couldn't enable interrupts for device: %s",
+					dev->data->name);
+			rte_free(d->info_ring);
+			return ret;
+		}
+		ret = rte_intr_callback_register(dev->intr_handle,
+				acc100_dev_interrupt_handler, dev);
+		if (ret < 0) {
+			rte_bbdev_log(ERR,
+					"Couldn't register interrupt callback for device: %s",
+					dev->data->name);
+			rte_free(d->info_ring);
+			return ret;
+		}
+
+		return 0;
+	}
+
+	rte_bbdev_log(ERR, "ACC100 (%s) supports only VFIO MSI and UIO interrupts",
+			dev->data->name);
+	return -ENOTSUP;
+}
+
 /* Free memory used for software rings */
 static int
 acc100_dev_close(struct rte_bbdev *dev)
 {
 	struct acc100_device *d = dev->data->dev_private;
+	acc100_check_ir(d);
 	if (d->sw_rings_base != NULL) {
 		rte_free(d->tail_ptrs);
+		rte_free(d->info_ring);
 		rte_free(d->sw_rings_base);
 		d->sw_rings_base = NULL;
 	}
@@ -670,6 +932,7 @@
 					RTE_BBDEV_TURBO_CRC_TYPE_24B |
 					RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
 					RTE_BBDEV_TURBO_EARLY_TERMINATION |
+					RTE_BBDEV_TURBO_DEC_INTERRUPTS |
 					RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
 					RTE_BBDEV_TURBO_MAP_DEC |
 					RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP |
@@ -690,6 +953,7 @@
 					RTE_BBDEV_TURBO_CRC_24B_ATTACH |
 					RTE_BBDEV_TURBO_RV_INDEX_BYPASS |
 					RTE_BBDEV_TURBO_RATE_MATCH |
+					RTE_BBDEV_TURBO_ENC_INTERRUPTS |
 					RTE_BBDEV_TURBO_ENC_SCATTER_GATHER,
 				.num_buffers_src =
 						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
@@ -703,7 +967,8 @@
 				.capability_flags =
 					RTE_BBDEV_LDPC_RATE_MATCH |
 					RTE_BBDEV_LDPC_CRC_24B_ATTACH |
-					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS,
+					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS |
+					RTE_BBDEV_LDPC_ENC_INTERRUPTS,
 				.num_buffers_src =
 						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
 				.num_buffers_dst =
@@ -728,7 +993,8 @@
 				RTE_BBDEV_LDPC_DECODE_BYPASS |
 				RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
 				RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
-				RTE_BBDEV_LDPC_LLR_COMPRESSION,
+				RTE_BBDEV_LDPC_LLR_COMPRESSION |
+				RTE_BBDEV_LDPC_DEC_INTERRUPTS,
 			.llr_size = 8,
 			.llr_decimals = 1,
 			.num_buffers_src =
@@ -778,14 +1044,44 @@
 #else
 	dev_info->harq_buffer_size = 0;
 #endif
+	acc100_check_ir(d);
+}
+
+static int
+acc100_queue_intr_enable(struct rte_bbdev *dev, uint16_t queue_id)
+{
+	struct acc100_queue *q = dev->data->queues[queue_id].queue_private;
+
+	if (dev->intr_handle->type != RTE_INTR_HANDLE_VFIO_MSI &&
+			dev->intr_handle->type != RTE_INTR_HANDLE_UIO)
+		return -ENOTSUP;
+
+	q->irq_enable = 1;
+	return 0;
+}
+
+static int
+acc100_queue_intr_disable(struct rte_bbdev *dev, uint16_t queue_id)
+{
+	struct acc100_queue *q = dev->data->queues[queue_id].queue_private;
+
+	if (dev->intr_handle->type != RTE_INTR_HANDLE_VFIO_MSI &&
+			dev->intr_handle->type != RTE_INTR_HANDLE_UIO)
+		return -ENOTSUP;
+
+	q->irq_enable = 0;
+	return 0;
 }
 
 static const struct rte_bbdev_ops acc100_bbdev_ops = {
 	.setup_queues = acc100_setup_queues,
+	.intr_enable = acc100_intr_enable,
 	.close = acc100_dev_close,
 	.info_get = acc100_dev_info_get,
 	.queue_setup = acc100_queue_setup,
 	.queue_release = acc100_queue_release,
+	.queue_intr_enable = acc100_queue_intr_enable,
+	.queue_intr_disable = acc100_queue_intr_disable
 };
 
 /* ACC100 PCI PF address map */
@@ -3018,8 +3314,10 @@
 			? (1 << RTE_BBDEV_DATA_ERROR) : 0);
 	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
 	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
-	if (op->status != 0)
+	if (op->status != 0) {
 		q_data->queue_stats.dequeue_err_count++;
+		acc100_check_ir(q->d);
+	}
 
 	/* CRC invalid if error exists */
 	if (!op->status)
@@ -3076,6 +3374,9 @@
 		op->status |= 1 << RTE_BBDEV_SYNDROME_ERROR;
 	op->ldpc_dec.iter_count = (uint8_t) rsp.iter_cnt;
 
+	if (op->status & (1 << RTE_BBDEV_DRV_ERROR))
+		acc100_check_ir(q->d);
+
 	/* Check if this is the last desc in batch (Atomic Queue) */
 	if (desc->req.last_desc_in_batch) {
 		(*aq_dequeued)++;
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 38818f4..1fbd96e 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -565,8 +565,16 @@ struct acc100_device {
 	rte_iova_t sw_rings_iova;  /* IOVA address of sw_rings */
 	/* Virtual address of the info memory routed to this function under
 	 * operation, whether it is PF or VF.
+	 * HW may DMA information data at this location asynchronously
 	 */
+	union acc100_info_ring_data *info_ring;
+
 	union acc100_harq_layout_data *harq_layout;
+	/* Virtual Info Ring head */
+	uint16_t info_ring_head;
+	/* Number of bytes available for each queue in device, depending on
+	 * how many queues are enabled with configure()
+	 */
 	uint32_t sw_ring_size;
 	uint32_t ddr_size; /* Size in kB */
 	uint32_t *tail_ptrs; /* Base address of response tail pointer buffer */
@@ -582,4 +590,12 @@ struct acc100_device {
 	bool configured; /**< True if this ACC100 device is configured */
 };
 
+/**
+ * Structure with details about RTE_BBDEV_EVENT_DEQUEUE event. It's passed to
+ * the callback function.
+ */
+struct acc100_deq_intr_details {
+	uint16_t queue_id;
+};
+
 #endif /* _RTE_ACC100_PMD_H_ */
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v12 09/10] baseband/acc100: add debug function to validate input
  2020-10-05 22:12   ` [dpdk-dev] [PATCH v12 00/10] bbdev PMD ACC100 Nicolas Chautru
                       ` (7 preceding siblings ...)
  2020-10-05 22:12     ` [dpdk-dev] [PATCH v12 08/10] baseband/acc100: add interrupt support to PMD Nicolas Chautru
@ 2020-10-05 22:12     ` Nicolas Chautru
  2020-10-05 22:12     ` [dpdk-dev] [PATCH v12 10/10] baseband/acc100: add configure function Nicolas Chautru
  2020-10-06 13:20     ` [dpdk-dev] [PATCH v12 00/10] bbdev PMD ACC100 Maxime Coquelin
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-10-05 22:12 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, trix, maxime.coquelin, ferruh.yigit,
	tianjiao.liu, Nicolas Chautru

Add debug functions to validate the input API parameters from
the user. Only enabled in DEBUG mode at build time.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
Reviewed-by: Tom Rix <trix@redhat.com>
---
 drivers/baseband/acc100/rte_acc100_pmd.c | 436 +++++++++++++++++++++++++++++++
 1 file changed, 436 insertions(+)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 408548e..8447540 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -1994,6 +1994,243 @@
 
 }
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+/* Validates turbo encoder parameters */
+static inline int
+validate_enc_op(struct rte_bbdev_enc_op *op)
+{
+	struct rte_bbdev_op_turbo_enc *turbo_enc = &op->turbo_enc;
+	struct rte_bbdev_op_enc_turbo_cb_params *cb = NULL;
+	struct rte_bbdev_op_enc_turbo_tb_params *tb = NULL;
+	uint16_t kw, kw_neg, kw_pos;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if (turbo_enc->input.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid input pointer");
+		return -1;
+	}
+	if (turbo_enc->output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid output pointer");
+		return -1;
+	}
+	if (turbo_enc->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				turbo_enc->rv_index);
+		return -1;
+	}
+	if (turbo_enc->code_block_mode != 0 &&
+			turbo_enc->code_block_mode != 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				turbo_enc->code_block_mode);
+		return -1;
+	}
+
+	if (turbo_enc->code_block_mode == 0) {
+		tb = &turbo_enc->tb_params;
+		if ((tb->k_neg < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_neg > RTE_BBDEV_TURBO_MAX_CB_SIZE)
+				&& tb->c_neg > 0) {
+			rte_bbdev_log(ERR,
+					"k_neg (%u) is out of range %u <= value <= %u",
+					tb->k_neg, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (tb->k_pos < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_pos > RTE_BBDEV_TURBO_MAX_CB_SIZE) {
+			rte_bbdev_log(ERR,
+					"k_pos (%u) is out of range %u <= value <= %u",
+					tb->k_pos, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (tb->c_neg > (RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1))
+			rte_bbdev_log(ERR,
+					"c_neg (%u) is out of range 0 <= value <= %u",
+					tb->c_neg,
+					RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1);
+		if (tb->c < 1 || tb->c > RTE_BBDEV_TURBO_MAX_CODE_BLOCKS) {
+			rte_bbdev_log(ERR,
+					"c (%u) is out of range 1 <= value <= %u",
+					tb->c, RTE_BBDEV_TURBO_MAX_CODE_BLOCKS);
+			return -1;
+		}
+		if (tb->cab > tb->c) {
+			rte_bbdev_log(ERR,
+					"cab (%u) is greater than c (%u)",
+					tb->cab, tb->c);
+			return -1;
+		}
+		if ((tb->ea < RTE_BBDEV_TURBO_MIN_CB_SIZE || (tb->ea % 2))
+				&& tb->r < tb->cab) {
+			rte_bbdev_log(ERR,
+					"ea (%u) is less than %u or it is not even",
+					tb->ea, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+		if ((tb->eb < RTE_BBDEV_TURBO_MIN_CB_SIZE || (tb->eb % 2))
+				&& tb->c > tb->cab) {
+			rte_bbdev_log(ERR,
+					"eb (%u) is less than %u or it is not even",
+					tb->eb, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+
+		kw_neg = 3 * RTE_ALIGN_CEIL(tb->k_neg + 4,
+					RTE_BBDEV_TURBO_C_SUBBLOCK);
+		if (tb->ncb_neg < tb->k_neg || tb->ncb_neg > kw_neg) {
+			rte_bbdev_log(ERR,
+					"ncb_neg (%u) is out of range (%u) k_neg <= value <= (%u) kw_neg",
+					tb->ncb_neg, tb->k_neg, kw_neg);
+			return -1;
+		}
+
+		kw_pos = 3 * RTE_ALIGN_CEIL(tb->k_pos + 4,
+					RTE_BBDEV_TURBO_C_SUBBLOCK);
+		if (tb->ncb_pos < tb->k_pos || tb->ncb_pos > kw_pos) {
+			rte_bbdev_log(ERR,
+					"ncb_pos (%u) is out of range (%u) k_pos <= value <= (%u) kw_pos",
+					tb->ncb_pos, tb->k_pos, kw_pos);
+			return -1;
+		}
+		if (tb->r > (tb->c - 1)) {
+			rte_bbdev_log(ERR,
+					"r (%u) is greater than c - 1 (%u)",
+					tb->r, tb->c - 1);
+			return -1;
+		}
+	} else {
+		cb = &turbo_enc->cb_params;
+		if (cb->k < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| cb->k > RTE_BBDEV_TURBO_MAX_CB_SIZE) {
+			rte_bbdev_log(ERR,
+					"k (%u) is out of range %u <= value <= %u",
+					cb->k, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+
+		if (cb->e < RTE_BBDEV_TURBO_MIN_CB_SIZE || (cb->e % 2)) {
+			rte_bbdev_log(ERR,
+					"e (%u) is less than %u or it is not even",
+					cb->e, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+
+		kw = RTE_ALIGN_CEIL(cb->k + 4, RTE_BBDEV_TURBO_C_SUBBLOCK) * 3;
+		if (cb->ncb < cb->k || cb->ncb > kw) {
+			rte_bbdev_log(ERR,
+					"ncb (%u) is out of range (%u) k <= value <= (%u) kw",
+					cb->ncb, cb->k, kw);
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
+/* Validates LDPC encoder parameters */
+static inline int
+validate_ldpc_enc_op(struct rte_bbdev_enc_op *op)
+{
+	struct rte_bbdev_op_ldpc_enc *ldpc_enc = &op->ldpc_enc;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if (ldpc_enc->input.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid input pointer");
+		return -1;
+	}
+	if (ldpc_enc->output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid output pointer");
+		return -1;
+	}
+	if (ldpc_enc->input.length >
+			RTE_BBDEV_LDPC_MAX_CB_SIZE >> 3) {
+		rte_bbdev_log(ERR, "CB size (%u) is too big, max: %d",
+				ldpc_enc->input.length,
+				RTE_BBDEV_LDPC_MAX_CB_SIZE);
+		return -1;
+	}
+	if ((ldpc_enc->basegraph > 2) || (ldpc_enc->basegraph == 0)) {
+		rte_bbdev_log(ERR,
+				"BG (%u) is out of range 1 <= value <= 2",
+				ldpc_enc->basegraph);
+		return -1;
+	}
+	if (ldpc_enc->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				ldpc_enc->rv_index);
+		return -1;
+	}
+	if (ldpc_enc->code_block_mode > 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				ldpc_enc->code_block_mode);
+		return -1;
+	}
+	int K = (ldpc_enc->basegraph == 1 ? 22 : 10) * ldpc_enc->z_c;
+	if (ldpc_enc->n_filler >= K) {
+		rte_bbdev_log(ERR,
+				"K and F are not compatible %u %u",
+				K, ldpc_enc->n_filler);
+		return -1;
+	}
+	return 0;
+}
+
+/* Validates LDPC decoder parameters */
+static inline int
+validate_ldpc_dec_op(struct rte_bbdev_dec_op *op)
+{
+	struct rte_bbdev_op_ldpc_dec *ldpc_dec = &op->ldpc_dec;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if ((ldpc_dec->basegraph > 2) || (ldpc_dec->basegraph == 0)) {
+		rte_bbdev_log(ERR,
+				"BG (%u) is out of range 1 <= value <= 2",
+				ldpc_dec->basegraph);
+		return -1;
+	}
+	if (ldpc_dec->iter_max == 0) {
+		rte_bbdev_log(ERR,
+				"iter_max (%u) is equal to 0",
+				ldpc_dec->iter_max);
+		return -1;
+	}
+	if (ldpc_dec->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				ldpc_dec->rv_index);
+		return -1;
+	}
+	if (ldpc_dec->code_block_mode > 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				ldpc_dec->code_block_mode);
+		return -1;
+	}
+	int K = (ldpc_dec->basegraph == 1 ? 22 : 10) * ldpc_dec->z_c;
+	if (ldpc_dec->n_filler >= K) {
+		rte_bbdev_log(ERR,
+				"K and F are not compatible %u %u",
+				K, ldpc_dec->n_filler);
+		return -1;
+	}
+	return 0;
+}
+#endif
+
 /* Enqueue one encode operations for ACC100 device in CB mode */
 static inline int
 enqueue_enc_one_op_cb(struct acc100_queue *q, struct rte_bbdev_enc_op *op,
@@ -2005,6 +2242,14 @@
 		seg_total_left;
 	struct rte_mbuf *input, *output_head, *output;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_enc_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2051,6 +2296,14 @@
 	uint16_t  in_length_in_bytes;
 	struct rte_bbdev_op_ldpc_enc *enc = &ops[0]->ldpc_enc;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_enc_op(ops[0]) == -1) {
+		rte_bbdev_log(ERR, "LDPC encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2105,6 +2358,14 @@
 		seg_total_left;
 	struct rte_mbuf *input, *output_head, *output;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_enc_op(op) == -1) {
+		rte_bbdev_log(ERR, "LDPC encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2154,6 +2415,14 @@
 	struct rte_mbuf *input, *output_head, *output;
 	uint16_t current_enqueued_cbs = 0;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_enc_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo encoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2221,6 +2490,142 @@
 	return current_enqueued_cbs;
 }
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+/* Validates turbo decoder parameters */
+static inline int
+validate_dec_op(struct rte_bbdev_dec_op *op)
+{
+	struct rte_bbdev_op_turbo_dec *turbo_dec = &op->turbo_dec;
+	struct rte_bbdev_op_dec_turbo_cb_params *cb = NULL;
+	struct rte_bbdev_op_dec_turbo_tb_params *tb = NULL;
+
+	if (op->mempool == NULL) {
+		rte_bbdev_log(ERR, "Invalid mempool pointer");
+		return -1;
+	}
+	if (turbo_dec->input.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid input pointer");
+		return -1;
+	}
+	if (turbo_dec->hard_output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid hard_output pointer");
+		return -1;
+	}
+	if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_SOFT_OUTPUT) &&
+			turbo_dec->soft_output.data == NULL) {
+		rte_bbdev_log(ERR, "Invalid soft_output pointer");
+		return -1;
+	}
+	if (turbo_dec->rv_index > 3) {
+		rte_bbdev_log(ERR,
+				"rv_index (%u) is out of range 0 <= value <= 3",
+				turbo_dec->rv_index);
+		return -1;
+	}
+	if (turbo_dec->iter_min < 1) {
+		rte_bbdev_log(ERR,
+				"iter_min (%u) is less than 1",
+				turbo_dec->iter_min);
+		return -1;
+	}
+	if (turbo_dec->iter_max <= 2) {
+		rte_bbdev_log(ERR,
+				"iter_max (%u) is less than or equal to 2",
+				turbo_dec->iter_max);
+		return -1;
+	}
+	if (turbo_dec->iter_min > turbo_dec->iter_max) {
+		rte_bbdev_log(ERR,
+				"iter_min (%u) is greater than iter_max (%u)",
+				turbo_dec->iter_min, turbo_dec->iter_max);
+		return -1;
+	}
+	if (turbo_dec->code_block_mode != 0 &&
+			turbo_dec->code_block_mode != 1) {
+		rte_bbdev_log(ERR,
+				"code_block_mode (%u) is out of range 0 <= value <= 1",
+				turbo_dec->code_block_mode);
+		return -1;
+	}
+
+	if (turbo_dec->code_block_mode == 0) {
+		tb = &turbo_dec->tb_params;
+		if ((tb->k_neg < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_neg > RTE_BBDEV_TURBO_MAX_CB_SIZE)
+				&& tb->c_neg > 0) {
+			rte_bbdev_log(ERR,
+					"k_neg (%u) is out of range %u <= value <= %u",
+					tb->k_neg, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if ((tb->k_pos < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| tb->k_pos > RTE_BBDEV_TURBO_MAX_CB_SIZE)
+				&& tb->c > tb->c_neg) {
+			rte_bbdev_log(ERR,
+					"k_pos (%u) is out of range %u <= value <= %u",
+					tb->k_pos, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (tb->c_neg > (RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1))
+			rte_bbdev_log(ERR,
+					"c_neg (%u) is out of range 0 <= value <= %u",
+					tb->c_neg,
+					RTE_BBDEV_TURBO_MAX_CODE_BLOCKS - 1);
+		if (tb->c < 1 || tb->c > RTE_BBDEV_TURBO_MAX_CODE_BLOCKS) {
+			rte_bbdev_log(ERR,
+					"c (%u) is out of range 1 <= value <= %u",
+					tb->c, RTE_BBDEV_TURBO_MAX_CODE_BLOCKS);
+			return -1;
+		}
+		if (tb->cab > tb->c) {
+			rte_bbdev_log(ERR,
+					"cab (%u) is greater than c (%u)",
+					tb->cab, tb->c);
+			return -1;
+		}
+		if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) &&
+				(tb->ea < RTE_BBDEV_TURBO_MIN_CB_SIZE
+						|| (tb->ea % 2))
+				&& tb->cab > 0) {
+			rte_bbdev_log(ERR,
+					"ea (%u) is less than %u or it is not even",
+					tb->ea, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+		if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) &&
+				(tb->eb < RTE_BBDEV_TURBO_MIN_CB_SIZE
+						|| (tb->eb % 2))
+				&& tb->c > tb->cab) {
+			rte_bbdev_log(ERR,
+					"eb (%u) is less than %u or it is not even",
+					tb->eb, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+	} else {
+		cb = &turbo_dec->cb_params;
+		if (cb->k < RTE_BBDEV_TURBO_MIN_CB_SIZE
+				|| cb->k > RTE_BBDEV_TURBO_MAX_CB_SIZE) {
+			rte_bbdev_log(ERR,
+					"k (%u) is out of range %u <= value <= %u",
+					cb->k, RTE_BBDEV_TURBO_MIN_CB_SIZE,
+					RTE_BBDEV_TURBO_MAX_CB_SIZE);
+			return -1;
+		}
+		if (check_bit(turbo_dec->op_flags, RTE_BBDEV_TURBO_EQUALIZER) &&
+				(cb->e < RTE_BBDEV_TURBO_MIN_CB_SIZE ||
+				(cb->e % 2))) {
+			rte_bbdev_log(ERR,
+					"e (%u) is less than %u or it is not even",
+					cb->e, RTE_BBDEV_TURBO_MIN_CB_SIZE);
+			return -1;
+		}
+	}
+
+	return 0;
+}
+#endif
+
 /** Enqueue one decode operations for ACC100 device in CB mode */
 static inline int
 enqueue_dec_one_op_cb(struct acc100_queue *q, struct rte_bbdev_dec_op *op,
@@ -2233,6 +2638,14 @@
 	struct rte_mbuf *input, *h_output_head, *h_output,
 		*s_output_head, *s_output;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo decoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2450,6 +2863,13 @@
 		return ret;
 	}
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "LDPC decoder validation failed");
+		return -EINVAL;
+	}
+#endif
 	union acc100_dma_desc *desc;
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
@@ -2547,6 +2967,14 @@
 	struct rte_mbuf *input, *h_output_head, *h_output;
 	uint16_t current_enqueued_cbs = 0;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_ldpc_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "LDPC decoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
@@ -2632,6 +3060,14 @@
 		*s_output_head, *s_output;
 	uint16_t current_enqueued_cbs = 0;
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	/* Validate op structure */
+	if (validate_dec_op(op) == -1) {
+		rte_bbdev_log(ERR, "Turbo decoder validation failed");
+		return -EINVAL;
+	}
+#endif
+
 	uint16_t desc_idx = ((q->sw_ring_head + total_enqueued_cbs)
 			& q->sw_ring_wrap_mask);
 	desc = q->ring_addr + desc_idx;
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* [dpdk-dev] [PATCH v12 10/10] baseband/acc100: add configure function
  2020-10-05 22:12   ` [dpdk-dev] [PATCH v12 00/10] bbdev PMD ACC100 Nicolas Chautru
                       ` (8 preceding siblings ...)
  2020-10-05 22:12     ` [dpdk-dev] [PATCH v12 09/10] baseband/acc100: add debug function to validate input Nicolas Chautru
@ 2020-10-05 22:12     ` Nicolas Chautru
  2020-10-06 13:20     ` [dpdk-dev] [PATCH v12 00/10] bbdev PMD ACC100 Maxime Coquelin
  10 siblings, 0 replies; 213+ messages in thread
From: Nicolas Chautru @ 2020-10-05 22:12 UTC (permalink / raw)
  To: dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, trix, maxime.coquelin, ferruh.yigit,
	tianjiao.liu, Nicolas Chautru

Add a configure function to set up the PF from within
bbdev-test itself, without requiring an external
application to configure the device.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
---
 app/test-bbdev/test_bbdev_perf.c                   |  71 +++
 doc/guides/rel_notes/release_20_11.rst             |   5 +
 drivers/baseband/acc100/meson.build                |   2 +
 drivers/baseband/acc100/rte_acc100_cfg.h           |  17 +
 drivers/baseband/acc100/rte_acc100_pmd.c           | 526 ++++++++++++++++++++-
 drivers/baseband/acc100/rte_acc100_pmd.h           |   1 +
 .../acc100/rte_pmd_bbdev_acc100_version.map        |   7 +
 7 files changed, 624 insertions(+), 5 deletions(-)

diff --git a/app/test-bbdev/test_bbdev_perf.c b/app/test-bbdev/test_bbdev_perf.c
index 45c0d62..6ddf012 100644
--- a/app/test-bbdev/test_bbdev_perf.c
+++ b/app/test-bbdev/test_bbdev_perf.c
@@ -52,6 +52,18 @@
 #define FLR_5G_TIMEOUT 610
 #endif
 
+#ifdef RTE_LIBRTE_PMD_BBDEV_ACC100
+#include <rte_acc100_cfg.h>
+#define ACC100PF_DRIVER_NAME   ("intel_acc100_pf")
+#define ACC100VF_DRIVER_NAME   ("intel_acc100_vf")
+#define ACC100_QMGR_NUM_AQS 16
+#define ACC100_QMGR_NUM_QGS 2
+#define ACC100_QMGR_AQ_DEPTH 5
+#define ACC100_QMGR_INVALID_IDX -1
+#define ACC100_QMGR_RR 1
+#define ACC100_QOS_GBR 0
+#endif
+
 #define OPS_CACHE_SIZE 256U
 #define OPS_POOL_SIZE_MIN 511U /* 0.5K per queue */
 
@@ -653,6 +665,65 @@ typedef int (test_case_function)(struct active_device *ad,
 				info->dev_name);
 	}
 #endif
+#ifdef RTE_LIBRTE_PMD_BBDEV_ACC100
+	if ((get_init_device() == true) &&
+		(!strcmp(info->drv.driver_name, ACC100PF_DRIVER_NAME))) {
+		struct rte_acc100_conf conf;
+		unsigned int i;
+
+		printf("Configure ACC100 FEC Driver %s with default values\n",
+				info->drv.driver_name);
+
+		/* clear default configuration before initialization */
+		memset(&conf, 0, sizeof(struct rte_acc100_conf));
+
+		/* Always set in PF mode for built-in configuration */
+		conf.pf_mode_en = true;
+		for (i = 0; i < RTE_ACC100_NUM_VFS; ++i) {
+			conf.arb_dl_4g[i].gbr_threshold1 = ACC100_QOS_GBR;
+			conf.arb_dl_4g[i].gbr_threshold2 = ACC100_QOS_GBR;
+			conf.arb_dl_4g[i].round_robin_weight = ACC100_QMGR_RR;
+			conf.arb_ul_4g[i].gbr_threshold1 = ACC100_QOS_GBR;
+			conf.arb_ul_4g[i].gbr_threshold2 = ACC100_QOS_GBR;
+			conf.arb_ul_4g[i].round_robin_weight = ACC100_QMGR_RR;
+			conf.arb_dl_5g[i].gbr_threshold1 = ACC100_QOS_GBR;
+			conf.arb_dl_5g[i].gbr_threshold2 = ACC100_QOS_GBR;
+			conf.arb_dl_5g[i].round_robin_weight = ACC100_QMGR_RR;
+			conf.arb_ul_5g[i].gbr_threshold1 = ACC100_QOS_GBR;
+			conf.arb_ul_5g[i].gbr_threshold2 = ACC100_QOS_GBR;
+			conf.arb_ul_5g[i].round_robin_weight = ACC100_QMGR_RR;
+		}
+
+		conf.input_pos_llr_1_bit = true;
+		conf.output_pos_llr_1_bit = true;
+		conf.num_vf_bundles = 1; /**< Number of VF bundles to setup */
+
+		conf.q_ul_4g.num_qgroups = ACC100_QMGR_NUM_QGS;
+		conf.q_ul_4g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+		conf.q_ul_4g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+		conf.q_ul_4g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+		conf.q_dl_4g.num_qgroups = ACC100_QMGR_NUM_QGS;
+		conf.q_dl_4g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+		conf.q_dl_4g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+		conf.q_dl_4g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+		conf.q_ul_5g.num_qgroups = ACC100_QMGR_NUM_QGS;
+		conf.q_ul_5g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+		conf.q_ul_5g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+		conf.q_ul_5g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+		conf.q_dl_5g.num_qgroups = ACC100_QMGR_NUM_QGS;
+		conf.q_dl_5g.first_qgroup_index = ACC100_QMGR_INVALID_IDX;
+		conf.q_dl_5g.num_aqs_per_groups = ACC100_QMGR_NUM_AQS;
+		conf.q_dl_5g.aq_depth_log2 = ACC100_QMGR_AQ_DEPTH;
+
+		/* setup PF with configuration information */
+		ret = rte_acc100_configure(info->dev_name, &conf);
+		TEST_ASSERT_SUCCESS(ret,
+				"Failed to configure ACC100 PF for bbdev %s",
+				info->dev_name);
+	}
+#endif
+	/* Refresh the info now that the device is configured */
+	rte_bbdev_info_get(dev_id, info);
 	nb_queues = RTE_MIN(rte_lcore_count(), info->drv.max_num_queues);
 	nb_queues = RTE_MIN(nb_queues, (unsigned int) MAX_QUEUES);
 
diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
index 73ac08f..c8d0586 100644
--- a/doc/guides/rel_notes/release_20_11.rst
+++ b/doc/guides/rel_notes/release_20_11.rst
@@ -55,6 +55,11 @@ New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+* **Added Intel ACC100 bbdev PMD.**
+
+  Added a new ``acc100`` bbdev driver for the Intel\ |reg| ACC100 accelerator,
+  also known as Mount Bryce. See the
+  :doc:`../bbdevs/acc100` BBDEV guide for more details on this new driver.
 
 Removed Items
 -------------
diff --git a/drivers/baseband/acc100/meson.build b/drivers/baseband/acc100/meson.build
index 8afafc2..7ac44dc 100644
--- a/drivers/baseband/acc100/meson.build
+++ b/drivers/baseband/acc100/meson.build
@@ -4,3 +4,5 @@
 deps += ['bbdev', 'bus_vdev', 'ring', 'pci', 'bus_pci']
 
 sources = files('rte_acc100_pmd.c')
+
+install_headers('rte_acc100_cfg.h')
diff --git a/drivers/baseband/acc100/rte_acc100_cfg.h b/drivers/baseband/acc100/rte_acc100_cfg.h
index a1d43ef..d233e42 100644
--- a/drivers/baseband/acc100/rte_acc100_cfg.h
+++ b/drivers/baseband/acc100/rte_acc100_cfg.h
@@ -89,6 +89,23 @@ struct rte_acc100_conf {
 	struct rte_acc100_arbitration arb_dl_5g[RTE_ACC100_NUM_VFS];
 };
 
+/**
+ * Configure an ACC100 device
+ *
+ * @param dev_name
+ *   The name of the device. This is the short form of PCI BDF, e.g. 00:01.0.
+ *   It can also be retrieved for a bbdev device from the dev_name field in the
+ *   rte_bbdev_info structure returned by rte_bbdev_info_get().
+ * @param conf
+ *   Configuration to apply to ACC100 HW.
+ *
+ * @return
+ *   Zero on success, negative value on failure.
+ */
+__rte_experimental
+int
+rte_acc100_configure(const char *dev_name, struct rte_acc100_conf *conf);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index 8447540..47ddbae 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -38,10 +38,10 @@
 
 /* Write a register of a ACC100 device */
 static inline void
-acc100_reg_write(struct acc100_device *d, uint32_t offset, uint32_t payload)
+acc100_reg_write(struct acc100_device *d, uint32_t offset, uint32_t value)
 {
 	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
-	mmio_write(reg_addr, payload);
+	mmio_write(reg_addr, value);
 	usleep(ACC100_LONG_WAIT);
 }
 
@@ -85,6 +85,26 @@
 
 enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC};
 
+/* Return the accelerator enum for a Queue Group Index */
+static inline int
+accFromQgid(int qg_idx, const struct rte_acc100_conf *acc100_conf)
+{
+	int accQg[ACC100_NUM_QGRPS];
+	int NumQGroupsPerFn[NUM_ACC];
+	int acc, qgIdx, qgIndex = 0;
+	for (qgIdx = 0; qgIdx < ACC100_NUM_QGRPS; qgIdx++)
+		accQg[qgIdx] = 0;
+	NumQGroupsPerFn[UL_4G] = acc100_conf->q_ul_4g.num_qgroups;
+	NumQGroupsPerFn[UL_5G] = acc100_conf->q_ul_5g.num_qgroups;
+	NumQGroupsPerFn[DL_4G] = acc100_conf->q_dl_4g.num_qgroups;
+	NumQGroupsPerFn[DL_5G] = acc100_conf->q_dl_5g.num_qgroups;
+	for (acc = UL_4G;  acc < NUM_ACC; acc++)
+		for (qgIdx = 0; qgIdx < NumQGroupsPerFn[acc]; qgIdx++)
+			accQg[qgIndex++] = acc;
+	acc = accQg[qg_idx];
+	return acc;
+}
+
 /* Return the queue topology for a Queue Group Index */
 static inline void
 qtopFromAcc(struct rte_acc100_queue_topology **qtop, int acc_enum,
@@ -113,6 +133,30 @@
 	*qtop = p_qtop;
 }
 
+/* Return the AQ depth for a Queue Group Index */
+static inline int
+aqDepth(int qg_idx, struct rte_acc100_conf *acc100_conf)
+{
+	struct rte_acc100_queue_topology *q_top = NULL;
+	int acc_enum = accFromQgid(qg_idx, acc100_conf);
+	qtopFromAcc(&q_top, acc_enum, acc100_conf);
+	if (unlikely(q_top == NULL))
+		return 0;
+	return q_top->aq_depth_log2;
+}
+
+/* Return the AQ depth for a Queue Group Index */
+static inline int
+aqNum(int qg_idx, struct rte_acc100_conf *acc100_conf)
+{
+	struct rte_acc100_queue_topology *q_top = NULL;
+	int acc_enum = accFromQgid(qg_idx, acc100_conf);
+	qtopFromAcc(&q_top, acc_enum, acc100_conf);
+	if (unlikely(q_top == NULL))
+		return 0;
+	return q_top->num_aqs_per_groups;
+}
+
 static void
 initQTop(struct rte_acc100_conf *acc100_conf)
 {
@@ -553,7 +597,7 @@
 static int
 acc100_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
 {
-	uint32_t phys_low, phys_high, payload;
+	uint32_t phys_low, phys_high, value;
 	struct acc100_device *d = dev->data->dev_private;
 	const struct acc100_registry_addr *reg_addr;
 	int ret;
@@ -612,8 +656,8 @@
 	 * Configure Ring Size to the max queue ring size
 	 * (used for wrapping purpose)
 	 */
-	payload = log2_basic(d->sw_ring_size / 64);
-	acc100_reg_write(d, reg_addr->ring_size, payload);
+	value = log2_basic(d->sw_ring_size / 64);
+	acc100_reg_write(d, reg_addr->ring_size, value);
 
 	/* Configure tail pointer for use when SDONE enabled */
 	d->tail_ptrs = rte_zmalloc_socket(
@@ -4209,3 +4253,475 @@ static int acc100_pci_remove(struct rte_pci_device *pci_dev)
 RTE_PMD_REGISTER_PCI_TABLE(ACC100PF_DRIVER_NAME, pci_id_acc100_pf_map);
 RTE_PMD_REGISTER_PCI(ACC100VF_DRIVER_NAME, acc100_pci_vf_driver);
 RTE_PMD_REGISTER_PCI_TABLE(ACC100VF_DRIVER_NAME, pci_id_acc100_vf_map);
+
+/*
+ * Workaround implementation to fix the power-on status of some 5GUL engines.
+ * This requires DMA permission if ported outside DPDK.
+ * It consists of resolving the state of these engines by running a
+ * dummy operation, then resetting the engines so that their state is
+ * reliably defined.
+ */
+static void
+poweron_cleanup(struct rte_bbdev *bbdev, struct acc100_device *d,
+		struct rte_acc100_conf *conf)
+{
+	int i, template_idx, qg_idx;
+	uint32_t address, status, value;
+	printf("Need to clear power-on 5GUL status in internal memory\n");
+	/* Reset LDPC Cores */
+	for (i = 0; i < ACC100_ENGINES_MAX; i++)
+		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+				ACC100_ENGINE_OFFSET * i, ACC100_RESET_HI);
+	usleep(ACC100_LONG_WAIT);
+	for (i = 0; i < ACC100_ENGINES_MAX; i++)
+		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+				ACC100_ENGINE_OFFSET * i, ACC100_RESET_LO);
+	usleep(ACC100_LONG_WAIT);
+	/* Prepare dummy workload */
+	alloc_2x64mb_sw_rings_mem(bbdev, d, 0);
+	/* Set base addresses */
+	uint32_t phys_high = (uint32_t)(d->sw_rings_iova >> 32);
+	uint32_t phys_low  = (uint32_t)(d->sw_rings_iova &
+			~(ACC100_SIZE_64MBYTE-1));
+	acc100_reg_write(d, HWPfDmaFec5GulDescBaseHiRegVf, phys_high);
+	acc100_reg_write(d, HWPfDmaFec5GulDescBaseLoRegVf, phys_low);
+
+	/* Descriptor for dummy 5GUL code block processing */
+	union acc100_dma_desc *desc = NULL;
+	desc = d->sw_rings;
+	desc->req.data_ptrs[0].address = d->sw_rings_iova +
+			ACC100_DESC_FCW_OFFSET;
+	desc->req.data_ptrs[0].blen = ACC100_FCW_LD_BLEN;
+	desc->req.data_ptrs[0].blkid = ACC100_DMA_BLKID_FCW;
+	desc->req.data_ptrs[0].last = 0;
+	desc->req.data_ptrs[0].dma_ext = 0;
+	desc->req.data_ptrs[1].address = d->sw_rings_iova + 512;
+	desc->req.data_ptrs[1].blkid = ACC100_DMA_BLKID_IN;
+	desc->req.data_ptrs[1].last = 1;
+	desc->req.data_ptrs[1].dma_ext = 0;
+	desc->req.data_ptrs[1].blen = 44;
+	desc->req.data_ptrs[2].address = d->sw_rings_iova + 1024;
+	desc->req.data_ptrs[2].blkid = ACC100_DMA_BLKID_OUT_ENC;
+	desc->req.data_ptrs[2].last = 1;
+	desc->req.data_ptrs[2].dma_ext = 0;
+	desc->req.data_ptrs[2].blen = 5;
+	/* Dummy FCW */
+	desc->req.fcw_ld.FCWversion = ACC100_FCW_VER;
+	desc->req.fcw_ld.qm = 1;
+	desc->req.fcw_ld.nfiller = 30;
+	desc->req.fcw_ld.BG = 2 - 1;
+	desc->req.fcw_ld.Zc = 7;
+	desc->req.fcw_ld.ncb = 350;
+	desc->req.fcw_ld.rm_e = 4;
+	desc->req.fcw_ld.itmax = 10;
+	desc->req.fcw_ld.gain_i = 1;
+	desc->req.fcw_ld.gain_h = 1;
+
+	int engines_to_restart[ACC100_SIG_UL_5G_LAST + 1] = {0};
+	int num_failed_engine = 0;
+	/* Detect engines in undefined state */
+	for (template_idx = ACC100_SIG_UL_5G;
+			template_idx <= ACC100_SIG_UL_5G_LAST;
+			template_idx++) {
+		/* Check engine power-on status */
+		address = HwPfFecUl5gIbDebugReg +
+				ACC100_ENGINE_OFFSET * template_idx;
+		status = (acc100_reg_read(d, address) >> 4) & 0xF;
+		if (status == 0) {
+			engines_to_restart[num_failed_engine] = template_idx;
+			num_failed_engine++;
+		}
+	}
+
+	int numQqsAcc = conf->q_ul_4g.num_qgroups; /* 5GUL groups follow 4GUL */
+	int numQgs = conf->q_ul_5g.num_qgroups;
+	value = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		value |= (1 << qg_idx);
+	/* Force each engine which is in unspecified state */
+	for (i = 0; i < num_failed_engine; i++) {
+		int failed_engine = engines_to_restart[i];
+		printf("Force engine %d\n", failed_engine);
+		for (template_idx = ACC100_SIG_UL_5G;
+				template_idx <= ACC100_SIG_UL_5G_LAST;
+				template_idx++) {
+			address = HWPfQmgrGrpTmplateReg4Indx
+					+ ACC100_BYTES_IN_WORD * template_idx;
+			if (template_idx == failed_engine)
+				acc100_reg_write(d, address, value);
+			else
+				acc100_reg_write(d, address, 0);
+		}
+		/* Reset descriptor header */
+		desc->req.word0 = ACC100_DMA_DESC_TYPE;
+		desc->req.word1 = 0;
+		desc->req.word2 = 0;
+		desc->req.word3 = 0;
+		desc->req.numCBs = 1;
+		desc->req.m2dlen = 2;
+		desc->req.d2mlen = 1;
+		/* Enqueue the code block for processing */
+		union acc100_enqueue_reg_fmt enq_req;
+		enq_req.val = 0;
+		enq_req.addr_offset = ACC100_DESC_OFFSET;
+		enq_req.num_elem = 1;
+		enq_req.req_elem_addr = 0;
+		rte_wmb();
+		acc100_reg_write(d, HWPfQmgrIngressAq + 0x100, enq_req.val);
+		usleep(ACC100_LONG_WAIT * 100);
+		if (desc->req.word0 != 2)
+			printf("DMA Response %#"PRIx32"\n", desc->req.word0);
+	}
+
+	/* Reset LDPC Cores */
+	for (i = 0; i < ACC100_ENGINES_MAX; i++)
+		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+				ACC100_ENGINE_OFFSET * i,
+				ACC100_RESET_HI);
+	usleep(ACC100_LONG_WAIT);
+	for (i = 0; i < ACC100_ENGINES_MAX; i++)
+		acc100_reg_write(d, HWPfFecUl5gCntrlReg +
+				ACC100_ENGINE_OFFSET * i,
+				ACC100_RESET_LO);
+	usleep(ACC100_LONG_WAIT);
+	acc100_reg_write(d, HWPfHi5GHardResetReg, ACC100_RESET_HARD);
+	usleep(ACC100_LONG_WAIT);
+	int numEngines = 0;
+	/* Check engine power-on status again */
+	for (template_idx = ACC100_SIG_UL_5G;
+			template_idx <= ACC100_SIG_UL_5G_LAST;
+			template_idx++) {
+		address = HwPfFecUl5gIbDebugReg +
+				ACC100_ENGINE_OFFSET * template_idx;
+		status = (acc100_reg_read(d, address) >> 4) & 0xF;
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ ACC100_BYTES_IN_WORD * template_idx;
+		if (status == 1) {
+			acc100_reg_write(d, address, value);
+			numEngines++;
+		} else
+			acc100_reg_write(d, address, 0);
+	}
+	printf("Number of 5GUL engines %d\n", numEngines);
+
+	if (d->sw_rings_base != NULL)
+		rte_free(d->sw_rings_base);
+	usleep(ACC100_LONG_WAIT);
+}
+
+/* Initial configuration of an ACC100 device prior to running configure() */
+int
+rte_acc100_configure(const char *dev_name, struct rte_acc100_conf *conf)
+{
+	rte_bbdev_log(INFO, "rte_acc100_configure");
+	uint32_t value, address, status;
+	int qg_idx, template_idx, vf_idx, acc, i;
+	struct rte_bbdev *bbdev = rte_bbdev_get_named_dev(dev_name);
+
+	/* Compile time checks */
+	RTE_BUILD_BUG_ON(sizeof(struct acc100_dma_req_desc) != 256);
+	RTE_BUILD_BUG_ON(sizeof(union acc100_dma_desc) != 256);
+	RTE_BUILD_BUG_ON(sizeof(struct acc100_fcw_td) != 24);
+	RTE_BUILD_BUG_ON(sizeof(struct acc100_fcw_te) != 32);
+
+	if (bbdev == NULL) {
+		rte_bbdev_log(ERR,
+		"Invalid dev_name (%s), or device is not yet initialised",
+		dev_name);
+		return -ENODEV;
+	}
+	struct acc100_device *d = bbdev->data->dev_private;
+
+	/* Store configuration */
+	rte_memcpy(&d->acc100_conf, conf, sizeof(d->acc100_conf));
+
+	/* PCIe Bridge configuration */
+	acc100_reg_write(d, HwPfPcieGpexBridgeControl, ACC100_CFG_PCI_BRIDGE);
+	for (i = 1; i < ACC100_GPEX_AXIMAP_NUM; i++)
+		acc100_reg_write(d,
+				HwPfPcieGpexAxiAddrMappingWindowPexBaseHigh
+				+ i * 16, 0);
+
+	/* Prevent blocking AXI read on BRESP for AXI Write */
+	address = HwPfPcieGpexAxiPioControl;
+	value = ACC100_CFG_PCI_AXI;
+	acc100_reg_write(d, address, value);
+
+	/* 5GDL PLL phase shift */
+	acc100_reg_write(d, HWPfChaDl5gPllPhshft0, 0x1);
+
+	/* Explicitly releasing AXI as this may be stopped after PF FLR/BME */
+	address = HWPfDmaAxiControl;
+	value = 1;
+	acc100_reg_write(d, address, value);
+
+	/* DDR Configuration */
+	address = HWPfDdrBcTim6;
+	value = acc100_reg_read(d, address);
+	value &= 0xFFFFFFFB; /* Bit 2 */
+#ifdef ACC100_DDR_ECC_ENABLE
+	value |= 0x4;
+#endif
+	acc100_reg_write(d, address, value);
+	address = HWPfDdrPhyDqsCountNum;
+#ifdef ACC100_DDR_ECC_ENABLE
+	value = 9;
+#else
+	value = 8;
+#endif
+	acc100_reg_write(d, address, value);
+
+	/* Set default descriptor signature */
+	address = HWPfDmaDescriptorSignatuture;
+	value = 0;
+	acc100_reg_write(d, address, value);
+
+	/* Enable the Error Detection in DMA */
+	value = ACC100_CFG_DMA_ERROR;
+	address = HWPfDmaErrorDetectionEn;
+	acc100_reg_write(d, address, value);
+
+	/* AXI Cache configuration */
+	value = ACC100_CFG_AXI_CACHE;
+	address = HWPfDmaAxcacheReg;
+	acc100_reg_write(d, address, value);
+
+	/* Default DMA Configuration (Qmgr Enabled) */
+	address = HWPfDmaConfig0Reg;
+	value = 0;
+	acc100_reg_write(d, address, value);
+	address = HWPfDmaQmanen;
+	value = 0;
+	acc100_reg_write(d, address, value);
+
+	/* Default RLIM/ALEN configuration */
+	address = HWPfDmaConfig1Reg;
+	value = (1 << 31) + (23 << 8) + (1 << 6) + 7;
+	acc100_reg_write(d, address, value);
+
+	/* Configure DMA Qmanager addresses */
+	address = HWPfDmaQmgrAddrReg;
+	value = HWPfQmgrEgressQueuesTemplate;
+	acc100_reg_write(d, address, value);
+
+	/* ===== Qmgr Configuration ===== */
+	/* Configuration of the AQueue Depth QMGR_GRP_0_DEPTH_LOG2 for UL */
+	int totalQgs = conf->q_ul_4g.num_qgroups +
+			conf->q_ul_5g.num_qgroups +
+			conf->q_dl_4g.num_qgroups +
+			conf->q_dl_5g.num_qgroups;
+	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
+		address = HWPfQmgrDepthLog2Grp +
+		ACC100_BYTES_IN_WORD * qg_idx;
+		value = aqDepth(qg_idx, conf);
+		acc100_reg_write(d, address, value);
+		address = HWPfQmgrTholdGrp +
+		ACC100_BYTES_IN_WORD * qg_idx;
+		value = (1 << 16) + (1 << (aqDepth(qg_idx, conf) - 1));
+		acc100_reg_write(d, address, value);
+	}
+
+	/* Template Priority in incremental order */
+	for (template_idx = 0; template_idx < ACC100_NUM_TMPL;
+			template_idx++) {
+		address = HWPfQmgrGrpTmplateReg0Indx +
+		ACC100_BYTES_IN_WORD * (template_idx % 8);
+		value = ACC100_TMPL_PRI_0;
+		acc100_reg_write(d, address, value);
+		address = HWPfQmgrGrpTmplateReg1Indx +
+		ACC100_BYTES_IN_WORD * (template_idx % 8);
+		value = ACC100_TMPL_PRI_1;
+		acc100_reg_write(d, address, value);
+		address = HWPfQmgrGrpTmplateReg2indx +
+		ACC100_BYTES_IN_WORD * (template_idx % 8);
+		value = ACC100_TMPL_PRI_2;
+		acc100_reg_write(d, address, value);
+		address = HWPfQmgrGrpTmplateReg3Indx +
+		ACC100_BYTES_IN_WORD * (template_idx % 8);
+		value = ACC100_TMPL_PRI_3;
+		acc100_reg_write(d, address, value);
+	}
+
+	address = HWPfQmgrGrpPriority;
+	value = ACC100_CFG_QMGR_HI_P;
+	acc100_reg_write(d, address, value);
+
+	/* Template Configuration */
+	for (template_idx = 0; template_idx < ACC100_NUM_TMPL;
+			template_idx++) {
+		value = 0;
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ ACC100_BYTES_IN_WORD * template_idx;
+		acc100_reg_write(d, address, value);
+	}
+	/* 4GUL */
+	int numQgs = conf->q_ul_4g.num_qgroups;
+	int numQqsAcc = 0;
+	value = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		value |= (1 << qg_idx);
+	for (template_idx = ACC100_SIG_UL_4G;
+			template_idx <= ACC100_SIG_UL_4G_LAST;
+			template_idx++) {
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ ACC100_BYTES_IN_WORD * template_idx;
+		acc100_reg_write(d, address, value);
+	}
+	/* 5GUL */
+	numQqsAcc += numQgs;
+	numQgs	= conf->q_ul_5g.num_qgroups;
+	value = 0;
+	int numEngines = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		value |= (1 << qg_idx);
+	for (template_idx = ACC100_SIG_UL_5G;
+			template_idx <= ACC100_SIG_UL_5G_LAST;
+			template_idx++) {
+		/* Check engine power-on status */
+		address = HwPfFecUl5gIbDebugReg +
+				ACC100_ENGINE_OFFSET * template_idx;
+		status = (acc100_reg_read(d, address) >> 4) & 0xF;
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ ACC100_BYTES_IN_WORD * template_idx;
+		if (status == 1) {
+			acc100_reg_write(d, address, value);
+			numEngines++;
+		} else
+			acc100_reg_write(d, address, 0);
+#if RTE_ACC100_SINGLE_FEC == 1
+		value = 0;
+#endif
+	}
+	printf("Number of 5GUL engines %d\n", numEngines);
+	/* 4GDL */
+	numQqsAcc += numQgs;
+	numQgs	= conf->q_dl_4g.num_qgroups;
+	value = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		value |= (1 << qg_idx);
+	for (template_idx = ACC100_SIG_DL_4G;
+			template_idx <= ACC100_SIG_DL_4G_LAST;
+			template_idx++) {
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ ACC100_BYTES_IN_WORD * template_idx;
+		acc100_reg_write(d, address, value);
+#if RTE_ACC100_SINGLE_FEC == 1
+		value = 0;
+#endif
+	}
+	/* 5GDL */
+	numQqsAcc += numQgs;
+	numQgs	= conf->q_dl_5g.num_qgroups;
+	value = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		value |= (1 << qg_idx);
+	for (template_idx = ACC100_SIG_DL_5G;
+			template_idx <= ACC100_SIG_DL_5G_LAST;
+			template_idx++) {
+		address = HWPfQmgrGrpTmplateReg4Indx
+				+ ACC100_BYTES_IN_WORD * template_idx;
+		acc100_reg_write(d, address, value);
+#if RTE_ACC100_SINGLE_FEC == 1
+		value = 0;
+#endif
+	}
+
+	/* Queue Group Function mapping */
+	int qman_func_id[5] = {0, 2, 1, 3, 4};
+	address = HWPfQmgrGrpFunction0;
+	value = 0;
+	for (qg_idx = 0; qg_idx < 8; qg_idx++) {
+		acc = accFromQgid(qg_idx, conf);
+		value |= qman_func_id[acc]<<(qg_idx * 4);
+	}
+	acc100_reg_write(d, address, value);
+
+	/* Configuration of the Arbitration QGroup depth to 1 */
+	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
+		address = HWPfQmgrArbQDepthGrp +
+		ACC100_BYTES_IN_WORD * qg_idx;
+		value = 0;
+		acc100_reg_write(d, address, value);
+	}
+
+	/* Enabling AQueues through the Queue hierarchy */
+	for (vf_idx = 0; vf_idx < ACC100_NUM_VFS; vf_idx++) {
+		for (qg_idx = 0; qg_idx < ACC100_NUM_QGRPS; qg_idx++) {
+			value = 0;
+			if (vf_idx < conf->num_vf_bundles &&
+					qg_idx < totalQgs)
+				value = (1 << aqNum(qg_idx, conf)) - 1;
+			address = HWPfQmgrAqEnableVf
+					+ vf_idx * ACC100_BYTES_IN_WORD;
+			value += (qg_idx << 16);
+			acc100_reg_write(d, address, value);
+		}
+	}
+
+	/* This pointer to ARAM (256kB) is shifted by 2 (4B per register) */
+	uint32_t aram_address = 0;
+	for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
+		for (vf_idx = 0; vf_idx < conf->num_vf_bundles; vf_idx++) {
+			address = HWPfQmgrVfBaseAddr + vf_idx
+					* ACC100_BYTES_IN_WORD + qg_idx
+					* ACC100_BYTES_IN_WORD * 64;
+			value = aram_address;
+			acc100_reg_write(d, address, value);
+			/* Offset ARAM Address for next memory bank
+			 * - increment of 4B
+			 */
+			aram_address += aqNum(qg_idx, conf) *
+					(1 << aqDepth(qg_idx, conf));
+		}
+	}
+
+	if (aram_address > ACC100_WORDS_IN_ARAM_SIZE) {
+		rte_bbdev_log(ERR, "ARAM Configuration not fitting %d %d\n",
+				aram_address, ACC100_WORDS_IN_ARAM_SIZE);
+		return -EINVAL;
+	}
+
+	/* ==== HI Configuration ==== */
+
+	/* Prevent Block on Transmit Error */
+	address = HWPfHiBlockTransmitOnErrorEn;
+	value = 0;
+	acc100_reg_write(d, address, value);
+	/* Prevent MSI from being dropped */
+	address = HWPfHiMsiDropEnableReg;
+	value = 0;
+	acc100_reg_write(d, address, value);
+	/* Set the PF Mode register */
+	address = HWPfHiPfMode;
+	value = (conf->pf_mode_en) ? ACC100_PF_VAL : 0;
+	acc100_reg_write(d, address, value);
+	/* Enable Error Detection in HW */
+	address = HWPfDmaErrorDetectionEn;
+	value = 0x3D7;
+	acc100_reg_write(d, address, value);
+
+	/* QoS overflow init */
+	value = 1;
+	address = HWPfQosmonAEvalOverflow0;
+	acc100_reg_write(d, address, value);
+	address = HWPfQosmonBEvalOverflow0;
+	acc100_reg_write(d, address, value);
+
+	/* HARQ DDR Configuration */
+	unsigned int ddrSizeInMb = 512; /* Fixed to 512 MB per VF for now */
+	for (vf_idx = 0; vf_idx < conf->num_vf_bundles; vf_idx++) {
+		address = HWPfDmaVfDdrBaseRw + vf_idx
+				* 0x10;
+		value = ((vf_idx * (ddrSizeInMb / 64)) << 16) +
+				(ddrSizeInMb - 1);
+		acc100_reg_write(d, address, value);
+	}
+	usleep(ACC100_LONG_WAIT);
+
+	/* Workaround in case some 5GUL engines are in an unexpected state */
+	if (numEngines < (ACC100_SIG_UL_5G_LAST + 1))
+		poweron_cleanup(bbdev, d, conf);
+
+	rte_bbdev_log_debug("PF Tip configuration complete for %s", dev_name);
+	return 0;
+}
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h b/drivers/baseband/acc100/rte_acc100_pmd.h
index 1fbd96e..03ed0b3 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.h
+++ b/drivers/baseband/acc100/rte_acc100_pmd.h
@@ -158,6 +158,7 @@
 #define ACC100_RESET_HARD       0x1FF
 #define ACC100_ENGINES_MAX      9
 #define ACC100_LONG_WAIT        1000
+#define ACC100_GPEX_AXIMAP_NUM  17
 
 /* ACC100 DMA Descriptor triplet */
 struct acc100_dma_triplet {
diff --git a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
index 4a76d1d..47a23b8 100644
--- a/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
+++ b/drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
@@ -1,3 +1,10 @@
 DPDK_21 {
 	local: *;
 };
+
+EXPERIMENTAL {
+	global:
+
+	rte_acc100_configure;
+
+};
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 213+ messages in thread

* Re: [dpdk-dev] [PATCH v11 03/10] baseband/acc100: add info get function
  2020-10-05 16:38         ` Chautru, Nicolas
@ 2020-10-05 22:19           ` Chautru, Nicolas
  0 siblings, 0 replies; 213+ messages in thread
From: Chautru, Nicolas @ 2020-10-05 22:19 UTC (permalink / raw)
  To: Tom Rix, dev, akhil.goyal
  Cc: Richardson, Bruce, Xu, Rosen, maxime.coquelin, Yigit, Ferruh,
	Liu,  Tianjiao

Hi Tom, 

> -----Original Message-----
> From: Chautru, Nicolas
> Sent: Monday, October 5, 2020 9:39 AM
> To: 'Tom Rix' <trix@redhat.com>; dev@dpdk.org; akhil.goyal@nxp.com
> Cc: Richardson, Bruce <bruce.richardson@intel.com>; Xu, Rosen
> <rosen.xu@intel.com>; maxime.coquelin@redhat.com; Yigit, Ferruh
> <ferruh.yigit@intel.com>; Liu, Tianjiao <Tianjiao.Liu@intel.com>
> Subject: RE: [PATCH v11 03/10] baseband/acc100: add info get function
> 
> Hi Tom
> 
> > From: Tom Rix <trix@redhat.com>>
> >
> > On 10/1/20 6:01 PM, Nicolas Chautru wrote:
> > > Add in the "info_get" function to the driver, to allow us to query
> > > the device.
> > > No processing capabilities are available yet.
> > > Linking bbdev-test to support the PMD with null capability.
> > >
> > > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > > Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
> > > ---
> > >  app/test-bbdev/meson.build               |   3 +
> > >  drivers/baseband/acc100/rte_acc100_cfg.h |  96 +++++++++++++
> > > drivers/baseband/acc100/rte_acc100_pmd.c | 229
> > > +++++++++++++++++++++++++++++++
> > > drivers/baseband/acc100/rte_acc100_pmd.h |  10 ++
> > >  4 files changed, 338 insertions(+)
> > >  create mode 100644 drivers/baseband/acc100/rte_acc100_cfg.h
> > >
> > > diff --git a/app/test-bbdev/meson.build b/app/test-bbdev/meson.build
> > > index 18ab6a8..fbd8ae3 100644
> > > --- a/app/test-bbdev/meson.build
> > > +++ b/app/test-bbdev/meson.build
> > > @@ -12,3 +12,6 @@ endif
> > >  if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_FPGA_5GNR_FEC')
> > >  	deps += ['pmd_bbdev_fpga_5gnr_fec']  endif
> > > +if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_ACC100')
> > > +	deps += ['pmd_bbdev_acc100']
> > > +endif
> > > \ No newline at end of file
> > > diff --git a/drivers/baseband/acc100/rte_acc100_cfg.h
> > > b/drivers/baseband/acc100/rte_acc100_cfg.h
> > > new file mode 100644
> > > index 0000000..a1d43ef
> > > --- /dev/null
> > > +++ b/drivers/baseband/acc100/rte_acc100_cfg.h
> > > @@ -0,0 +1,96 @@
> > > +/* SPDX-License-Identifier: BSD-3-Clause
> > > + * Copyright(c) 2020 Intel Corporation  */
> > > +
> > > +#ifndef _RTE_ACC100_CFG_H_
> > > +#define _RTE_ACC100_CFG_H_
> > > +
> > > +/**
> > > + * @file rte_acc100_cfg.h
> > > + *
> > > + * Functions for configuring ACC100 HW, exposed directly to applications.
> > > + * Configuration related to encoding/decoding is done through the
> > > + * librte_bbdev library.
> > > + *
> > > + * @warning
> > > + * @b EXPERIMENTAL: this API may change without prior notice  */
> > > +
> > > +#include <stdint.h>
> > > +#include <stdbool.h>
> > > +
> > > +#ifdef __cplusplus
> > > +extern "C" {
> > > +#endif
> > > +/**< Number of Virtual Functions ACC100 supports */ #define
> > > +RTE_ACC100_NUM_VFS 16
> >
> > I was expecting the definition of RTE_ACC100_NUM_VFS to be removed.
> >
> > And its uses replaced with ACC100_NUM_VFS.
> >
> > or
> >
> > #define RTE_ACC100_NUM_VFS ACC100_NUM_VFS
> >
> 
> Yes it was actually on purpose to keep that piece of code portable outside of
> DPDK if required.
> One is related to the PMD generic function, the other one is used for the
> configuration function only.
> If you feel strongly about this, I could change it.

I have pushed the v12 now. Note that this is done the same way for other bbdev devices. 
One is internal to the PMD and the other is only required to perform the configuration, either in bbdev-test or in another external piece of code. Hence it is kept as is in v12 (the only change in v12 is inverting the logic to remove the break, as you had suggested).
Hopefully v12 is the last one!
Thanks again
Nic 


> 
> > > +
> > > +/**
> > > + * Definition of Queue Topology for ACC100 Configuration
> > > + * Some level of details is abstracted out to expose a clean
> > > +interface
> > > + * given that comprehensive flexibility is not required  */ struct
> > > +rte_acc100_queue_topology {
> > > +	/** Number of QGroups in incremental order of priority */
> > > +	uint16_t num_qgroups;
> > > +	/**
> > > +	 * All QGroups have the same number of AQs here.
> > > +	 * Note : Could be made a 16-array if more flexibility is really
> > > +	 * required
> > > +	 */
> > > +	uint16_t num_aqs_per_groups;
> > > +	/**
> > > +	 * Depth of the AQs is the same of all QGroups here. Log2 Enum : 2^N
> > > +	 * Note : Could be made a 16-array if more flexibility is really
> > > +	 * required
> > > +	 */
> > > +	uint16_t aq_depth_log2;
> > > +	/**
> > > +	 * Index of the first Queue Group Index - assuming contiguity
> > > +	 * Initialized as -1
> > > +	 */
> > > +	int8_t first_qgroup_index;
> > > +};
> > > +
> > > +/**
> > > + * Definition of Arbitration related parameters for ACC100
> > > +Configuration  */ struct rte_acc100_arbitration {
> > > +	/** Default Weight for VF Fairness Arbitration */
> > > +	uint16_t round_robin_weight;
> > > +	uint32_t gbr_threshold1; /**< Guaranteed Bitrate Threshold 1 */
> > > +	uint32_t gbr_threshold2; /**< Guaranteed Bitrate Threshold 2 */ };
> > > +
> > > +/**
> > > + * Structure to pass ACC100 configuration.
> > > + * Note: all VF Bundles will have the same configuration.
> > > + */
> > > +struct rte_acc100_conf {
> > > +	bool pf_mode_en; /**< 1 if PF is used for dataplane, 0 for VFs */
> > > +	/** 1 if input '1' bit is represented by a positive LLR value, 0 if '1'
> > > +	 * bit is represented by a negative value.
> > > +	 */
> > > +	bool input_pos_llr_1_bit;
> > > +	/** 1 if output '1' bit is represented by a positive value, 0 if '1'
> > > +	 * bit is represented by a negative value.
> > > +	 */
> > > +	bool output_pos_llr_1_bit;
> > > +	uint16_t num_vf_bundles; /**< Number of VF bundles to setup */
> > > +	/** Queue topology for each operation type */
> > > +	struct rte_acc100_queue_topology q_ul_4g;
> > > +	struct rte_acc100_queue_topology q_dl_4g;
> > > +	struct rte_acc100_queue_topology q_ul_5g;
> > > +	struct rte_acc100_queue_topology q_dl_5g;
> > > +	/** Arbitration configuration for each operation type */
> > > +	struct rte_acc100_arbitration arb_ul_4g[RTE_ACC100_NUM_VFS];
> > > +	struct rte_acc100_arbitration arb_dl_4g[RTE_ACC100_NUM_VFS];
> > > +	struct rte_acc100_arbitration arb_ul_5g[RTE_ACC100_NUM_VFS];
> > > +	struct rte_acc100_arbitration arb_dl_5g[RTE_ACC100_NUM_VFS]; };
> > > +
> > > +#ifdef __cplusplus
> > > +}
> > > +#endif
> > > +
> > > +#endif /* _RTE_ACC100_CFG_H_ */
> > > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c
> > > b/drivers/baseband/acc100/rte_acc100_pmd.c
> > > index 1b4cd13..fcba77e 100644
> > > --- a/drivers/baseband/acc100/rte_acc100_pmd.c
> > > +++ b/drivers/baseband/acc100/rte_acc100_pmd.c
> > > @@ -26,6 +26,188 @@
> > >  RTE_LOG_REGISTER(acc100_logtype, pmd.bb.acc100, NOTICE);  #endif
> > >
> > > +/* Read a register of a ACC100 device */ static inline uint32_t
> > > +acc100_reg_read(struct acc100_device *d, uint32_t offset) {
> > > +
> > > +	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
> > > +	uint32_t ret = *((volatile uint32_t *)(reg_addr));
> > > +	return rte_le_to_cpu_32(ret);
> > > +}
> > > +
> > > +/* Calculate the offset of the enqueue register */ static inline
> > > +uint32_t queue_offset(bool pf_device, uint8_t vf_id, uint8_t
> > > +qgrp_id, uint16_t aq_id) {
> > > +	if (pf_device)
> > > +		return ((vf_id << 12) + (qgrp_id << 7) + (aq_id << 3) +
> > > +				HWPfQmgrIngressAq);
> > > +	else
> > > +		return ((qgrp_id << 7) + (aq_id << 3) +
> > > +				HWVfQmgrIngressAq);
> > > +}
> > > +
> > > +enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, NUM_ACC};
> > > +
> > > +/* Return the queue topology for a Queue Group Index */ static
> > > +inline void qtopFromAcc(struct rte_acc100_queue_topology **qtop,
> > > +int acc_enum,
> > > +		struct rte_acc100_conf *acc100_conf) {
> > > +	struct rte_acc100_queue_topology *p_qtop;
> > > +	p_qtop = NULL;
> > > +	switch (acc_enum) {
> > > +	case UL_4G:
> > > +		p_qtop = &(acc100_conf->q_ul_4g);
> > > +		break;
> > > +	case UL_5G:
> > > +		p_qtop = &(acc100_conf->q_ul_5g);
> > > +		break;
> > > +	case DL_4G:
> > > +		p_qtop = &(acc100_conf->q_dl_4g);
> > > +		break;
> > > +	case DL_5G:
> > > +		p_qtop = &(acc100_conf->q_dl_5g);
> > > +		break;
> > > +	default:
> > > +		/* NOTREACHED */
> > > +		rte_bbdev_log(ERR, "Unexpected error evaluating
> > qtopFromAcc");
> > > +		break;
> > > +	}
> > > +	*qtop = p_qtop;
> > > +}
> > > +
> > > +static void
> > > +initQTop(struct rte_acc100_conf *acc100_conf) {
> > > +	acc100_conf->q_ul_4g.num_aqs_per_groups = 0;
> > > +	acc100_conf->q_ul_4g.num_qgroups = 0;
> > > +	acc100_conf->q_ul_4g.first_qgroup_index = -1;
> > > +	acc100_conf->q_ul_5g.num_aqs_per_groups = 0;
> > > +	acc100_conf->q_ul_5g.num_qgroups = 0;
> > > +	acc100_conf->q_ul_5g.first_qgroup_index = -1;
> > > +	acc100_conf->q_dl_4g.num_aqs_per_groups = 0;
> > > +	acc100_conf->q_dl_4g.num_qgroups = 0;
> > > +	acc100_conf->q_dl_4g.first_qgroup_index = -1;
> > > +	acc100_conf->q_dl_5g.num_aqs_per_groups = 0;
> > > +	acc100_conf->q_dl_5g.num_qgroups = 0;
> > > +	acc100_conf->q_dl_5g.first_qgroup_index = -1; }
> > > +
> > > +static inline void
> > > +updateQtop(uint8_t acc, uint8_t qg, struct rte_acc100_conf
> > *acc100_conf,
> > > +		struct acc100_device *d) {
> > > +	uint32_t reg;
> > > +	struct rte_acc100_queue_topology *q_top = NULL;
> > > +	qtopFromAcc(&q_top, acc, acc100_conf);
> > > +	if (unlikely(q_top == NULL))
> > > +		return;
> > > +	uint16_t aq;
> > > +	q_top->num_qgroups++;
> > > +	if (q_top->first_qgroup_index == -1) {
> > > +		q_top->first_qgroup_index = qg;
> > > +		/* Can be optimized to assume all are enabled by default */
> > > +		reg = acc100_reg_read(d, queue_offset(d->pf_device,
> > > +				0, qg, ACC100_NUM_AQS - 1));
> > > +		if (reg & ACC100_QUEUE_ENABLE) {
> > > +			q_top->num_aqs_per_groups = ACC100_NUM_AQS;
> > > +			return;
> > > +		}
> > > +		q_top->num_aqs_per_groups = 0;
> > > +		for (aq = 0; aq < ACC100_NUM_AQS; aq++) {
> > > +			reg = acc100_reg_read(d, queue_offset(d-
> > >pf_device,
> > > +					0, qg, aq));
> > > +			if (reg & ACC100_QUEUE_ENABLE)
> > > +				q_top->num_aqs_per_groups++;
> > > +		}
> > > +	}
> > > +}
> > > +
> > > +/* Fetch configuration enabled for the PF/VF using MMIO Read (slow)
> > > +*/ static inline void fetch_acc100_config(struct rte_bbdev *dev) {
> > > +	struct acc100_device *d = dev->data->dev_private;
> > > +	struct rte_acc100_conf *acc100_conf = &d->acc100_conf;
> > > +	const struct acc100_registry_addr *reg_addr;
> > > +	uint8_t acc, qg;
> > > +	uint32_t reg, reg_aq, reg_len0, reg_len1;
> > > +	uint32_t reg_mode;
> > > +
> > > +	/* No need to retrieve the configuration if it is already done */
> > > +	if (d->configured)
> > > +		return;
> > > +
> > > +	/* Choose correct registry addresses for the device type */
> > > +	if (d->pf_device)
> > > +		reg_addr = &pf_reg_addr;
> > > +	else
> > > +		reg_addr = &vf_reg_addr;
> > > +
> > > +	d->ddr_size = (1 + acc100_reg_read(d, reg_addr->ddr_range)) << 10;
> > > +
> > > +	/* Single VF Bundle by VF */
> > > +	acc100_conf->num_vf_bundles = 1;
> > > +	initQTop(acc100_conf);
> > > +
> > > +	struct rte_acc100_queue_topology *q_top = NULL;
> > > +	int qman_func_id[ACC100_NUM_ACCS] = {ACC100_ACCMAP_0,
> > ACC100_ACCMAP_1,
> > > +			ACC100_ACCMAP_2, ACC100_ACCMAP_3,
> > ACC100_ACCMAP_4};
> > > +	reg = acc100_reg_read(d, reg_addr->qman_group_func);
> > > +	for (qg = 0; qg < ACC100_NUM_QGRPS_PER_WORD; qg++) {
> > > +		reg_aq = acc100_reg_read(d,
> > > +				queue_offset(d->pf_device, 0, qg, 0));
> > > +		if (reg_aq & ACC100_QUEUE_ENABLE) {
> > > +			uint32_t idx = (reg >> (qg * 4)) & 0x7;
> > > +			if (idx >= ACC100_NUM_ACCS)
> > > +				break;
> >
> > a 'continue' would be better
> >
> > or reverse the check
> >
> > if (idx < ACC100_NUM_ACCS) {
> >
> >     acc = qman_func_id ..
> >
> > }
> 
> I can change.
> Please confirm what you think on the previous one. I would like to get this
> final by tomorrow at the latest.
> 
> Thanks
> Nic
> 
> 
> >
> > Tom
> >
> > > +			acc = qman_func_id[idx];
> > > +			updateQtop(acc, qg, acc100_conf, d);
> > > +		}
> > > +	}
> > > +
> > > +	/* Check the depth of the AQs*/
> > > +	reg_len0 = acc100_reg_read(d, reg_addr->depth_log0_offset);
> > > +	reg_len1 = acc100_reg_read(d, reg_addr->depth_log1_offset);
> > > +	for (acc = 0; acc < NUM_ACC; acc++) {
> > > +		qtopFromAcc(&q_top, acc, acc100_conf);
> > > +		if (q_top->first_qgroup_index <
> > ACC100_NUM_QGRPS_PER_WORD)
> > > +			q_top->aq_depth_log2 = (reg_len0 >>
> > > +					(q_top->first_qgroup_index * 4))
> > > +					& 0xF;
> > > +		else
> > > +			q_top->aq_depth_log2 = (reg_len1 >>
> > > +					((q_top->first_qgroup_index -
> > > +					ACC100_NUM_QGRPS_PER_WORD) *
> > 4))
> > > +					& 0xF;
> > > +	}
> > > +
> > > +	/* Read PF mode */
> > > +	if (d->pf_device) {
> > > +		reg_mode = acc100_reg_read(d, HWPfHiPfMode);
> > > +		acc100_conf->pf_mode_en = (reg_mode == ACC100_PF_VAL)
> > ? 1 : 0;
> > > +	}
> > > +
> > > +	rte_bbdev_log_debug(
> > > +			"%s Config LLR SIGN IN/OUT %s %s QG %u %u %u %u
> > AQ %u %u %u %u Len %u %u %u %u\n",
> > > +			(d->pf_device) ? "PF" : "VF",
> > > +			(acc100_conf->input_pos_llr_1_bit) ? "POS" : "NEG",
> > > +			(acc100_conf->output_pos_llr_1_bit) ? "POS" :
> > "NEG",
> > > +			acc100_conf->q_ul_4g.num_qgroups,
> > > +			acc100_conf->q_dl_4g.num_qgroups,
> > > +			acc100_conf->q_ul_5g.num_qgroups,
> > > +			acc100_conf->q_dl_5g.num_qgroups,
> > > +			acc100_conf->q_ul_4g.num_aqs_per_groups,
> > > +			acc100_conf->q_dl_4g.num_aqs_per_groups,
> > > +			acc100_conf->q_ul_5g.num_aqs_per_groups,
> > > +			acc100_conf->q_dl_5g.num_aqs_per_groups,
> > > +			acc100_conf->q_ul_4g.aq_depth_log2,
> > > +			acc100_conf->q_dl_4g.aq_depth_log2,
> > > +			acc100_conf->q_ul_5g.aq_depth_log2,
> > > +			acc100_conf->q_dl_5g.aq_depth_log2);
> > > +}
> > > +
> > >  /* Free 64MB memory used for software rings */  static int
> > > acc100_dev_close(struct rte_bbdev *dev  __rte_unused) @@ -33,8
> > +215,55
> > > @@
> > >  	return 0;
> > >  }
> > >
> > > +/* Get ACC100 device info */
> > > +static void
> > > +acc100_dev_info_get(struct rte_bbdev *dev,
> > > +		struct rte_bbdev_driver_info *dev_info) {
> > > +	struct acc100_device *d = dev->data->dev_private;
> > > +
> > > +	static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
> > > +		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
> > > +	};
> > > +
> > > +	static struct rte_bbdev_queue_conf default_queue_conf;
> > > +	default_queue_conf.socket = dev->data->socket_id;
> > > +	default_queue_conf.queue_size = ACC100_MAX_QUEUE_DEPTH;
> > > +
> > > +	dev_info->driver_name = dev->device->driver->name;
> > > +
> > > +	/* Read and save the populated config from ACC100 registers */
> > > +	fetch_acc100_config(dev);
> > > +
> > > +	/* This isn't ideal because it reports the maximum number of
> > > +queues
> > but
> > > +	 * does not provide info on how many can be uplink/downlink or
> > different
> > > +	 * priorities
> > > +	 */
> > > +	dev_info->max_num_queues =
> > > +			d->acc100_conf.q_dl_5g.num_aqs_per_groups *
> > > +			d->acc100_conf.q_dl_5g.num_qgroups +
> > > +			d->acc100_conf.q_ul_5g.num_aqs_per_groups *
> > > +			d->acc100_conf.q_ul_5g.num_qgroups +
> > > +			d->acc100_conf.q_dl_4g.num_aqs_per_groups *
> > > +			d->acc100_conf.q_dl_4g.num_qgroups +
> > > +			d->acc100_conf.q_ul_4g.num_aqs_per_groups *
> > > +			d->acc100_conf.q_ul_4g.num_qgroups;
> > > +	dev_info->queue_size_lim = ACC100_MAX_QUEUE_DEPTH;
> > > +	dev_info->hardware_accelerated = true;
> > > +	dev_info->max_dl_queue_priority =
> > > +			d->acc100_conf.q_dl_4g.num_qgroups - 1;
> > > +	dev_info->max_ul_queue_priority =
> > > +			d->acc100_conf.q_ul_4g.num_qgroups - 1;
> > > +	dev_info->default_queue_conf = default_queue_conf;
> > > +	dev_info->cpu_flag_reqs = NULL;
> > > +	dev_info->min_alignment = 64;
> > > +	dev_info->capabilities = bbdev_capabilities;
> > > +	dev_info->harq_buffer_size = d->ddr_size; }
> > > +
> > >  static const struct rte_bbdev_ops acc100_bbdev_ops = {
> > >  	.close = acc100_dev_close,
> > > +	.info_get = acc100_dev_info_get,
> > >  };
> > >
> > >  /* ACC100 PCI PF address map */
> > > diff --git a/drivers/baseband/acc100/rte_acc100_pmd.h
> > > b/drivers/baseband/acc100/rte_acc100_pmd.h
> > > index 6525d66..09965c8 100644
> > > --- a/drivers/baseband/acc100/rte_acc100_pmd.h
> > > +++ b/drivers/baseband/acc100/rte_acc100_pmd.h
> > > @@ -7,6 +7,7 @@
> > >
> > >  #include "acc100_pf_enum.h"
> > >  #include "acc100_vf_enum.h"
> > > +#include "rte_acc100_cfg.h"
> > >
> > >  /* Helper macro for logging */
> > >  #define rte_bbdev_log(level, fmt, ...) \ @@ -98,6 +99,13 @@
> > > #define ACC100_SIG_UL_4G_LAST 21
> > >  #define ACC100_SIG_DL_4G      27
> > >  #define ACC100_SIG_DL_4G_LAST 31
> > > +#define ACC100_NUM_ACCS       5
> > > +#define ACC100_ACCMAP_0       0
> > > +#define ACC100_ACCMAP_1       2
> > > +#define ACC100_ACCMAP_2       1
> > > +#define ACC100_ACCMAP_3       3
> > > +#define ACC100_ACCMAP_4       4
> > > +#define ACC100_PF_VAL         2
> > >
> > >  /* max number of iterations to allocate memory block for all rings
> > > */ #define ACC100_SW_RING_MEM_ALLOC_ATTEMPTS 5 @@ -517,6
> +525,8
> > @@ struct
> > > acc100_registry_addr {
> > >  /* Private data structure for each ACC100 device */  struct
> > > acc100_device {
> > >  	void *mmio_base;  /**< Base address of MMIO registers (BAR0) */
> > > +	uint32_t ddr_size; /* Size in kB */
> > > +	struct rte_acc100_conf acc100_conf; /* ACC100 Initial
> > > +configuration */
> > >  	bool pf_device; /**< True if this is a PF ACC100 device */
> > >  	bool configured; /**< True if this ACC100 device is configured */
> > > };



* Re: [dpdk-dev] [PATCH v12 00/10] bbdev PMD ACC100
  2020-10-05 22:12   ` [dpdk-dev] [PATCH v12 00/10] bbdev PMD ACC100 Nicolas Chautru
                       ` (9 preceding siblings ...)
  2020-10-05 22:12     ` [dpdk-dev] [PATCH v12 10/10] baseband/acc100: add configure function Nicolas Chautru
@ 2020-10-06 13:20     ` Maxime Coquelin
  2020-10-06 19:43       ` Akhil Goyal
  10 siblings, 1 reply; 213+ messages in thread
From: Maxime Coquelin @ 2020-10-06 13:20 UTC (permalink / raw)
  To: Nicolas Chautru, dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, trix, ferruh.yigit, tianjiao.liu

Hi Nicolas,

The series looks overall good to me now. Thanks for implementing the
suggested changes.

For what it's worth:

Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>

Thanks,
Maxime

On 10/6/20 12:12 AM, Nicolas Chautru wrote:
> v12: Correcting 1 spelling error and 1 code clean up.
> v11: Further updates based on Tom + Maxime review comments on v9 and v10.  Variable renaming
> v10: Updates based on Tom Rix's valuable review comments. Notably doc clarification, #define name updates, a few magic numbers left, stricter error handling and a few valuable coding suggestions. Thanks
> v9: moved the release notes update to the last commit
> v8: integrated the doc feature table in previous commit as suggested. 
> v7: Fingers trouble. Previous one sent mid-rebase. My bad. 
> v6: removed a legacy makefile no longer required
> v5: rebase based on latest on main. The legacy makefiles are removed. 
> v4: an odd compilation error is reported for one CI variant using "gcc latest" which looks to me like a false positive of maybe-undeclared. 
> http://mails.dpdk.org/archives/test-report/2020-August/148936.html
> Still forcing a dummy declare to remove this CI warning; I will check with ci@dpdk.org in parallel.
> v3: missed a change during rebase
> v2: includes clean up from latest CI checks.
> 
> Nicolas Chautru (10):
>   drivers/baseband: add PMD for ACC100
>   baseband/acc100: add register definition file
>   baseband/acc100: add info get function
>   baseband/acc100: add queue configuration
>   baseband/acc100: add LDPC processing functions
>   baseband/acc100: add HARQ loopback support
>   baseband/acc100: add support for 4G processing
>   baseband/acc100: add interrupt support to PMD
>   baseband/acc100: add debug function to validate input
>   baseband/acc100: add configure function
> 
>  app/test-bbdev/meson.build                         |    3 +
>  app/test-bbdev/test_bbdev_perf.c                   |   71 +
>  doc/guides/bbdevs/acc100.rst                       |  228 +
>  doc/guides/bbdevs/features/acc100.ini              |   14 +
>  doc/guides/bbdevs/index.rst                        |    1 +
>  doc/guides/rel_notes/release_20_11.rst             |    5 +
>  drivers/baseband/acc100/acc100_pf_enum.h           | 1068 +++++
>  drivers/baseband/acc100/acc100_vf_enum.h           |   73 +
>  drivers/baseband/acc100/meson.build                |    8 +
>  drivers/baseband/acc100/rte_acc100_cfg.h           |  113 +
>  drivers/baseband/acc100/rte_acc100_pmd.c           | 4727 ++++++++++++++++++++
>  drivers/baseband/acc100/rte_acc100_pmd.h           |  602 +++
>  .../acc100/rte_pmd_bbdev_acc100_version.map        |   10 +
>  drivers/baseband/meson.build                       |    2 +-
>  14 files changed, 6924 insertions(+), 1 deletion(-)
>  create mode 100644 doc/guides/bbdevs/acc100.rst
>  create mode 100644 doc/guides/bbdevs/features/acc100.ini
>  create mode 100644 drivers/baseband/acc100/acc100_pf_enum.h
>  create mode 100644 drivers/baseband/acc100/acc100_vf_enum.h
>  create mode 100644 drivers/baseband/acc100/meson.build
>  create mode 100644 drivers/baseband/acc100/rte_acc100_cfg.h
>  create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.c
>  create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.h
>  create mode 100644 drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> 



* Re: [dpdk-dev] [PATCH v12 00/10] bbdev PMD ACC100
  2020-10-06 13:20     ` [dpdk-dev] [PATCH v12 00/10] bbdev PMD ACC100 Maxime Coquelin
@ 2020-10-06 19:43       ` Akhil Goyal
  0 siblings, 0 replies; 213+ messages in thread
From: Akhil Goyal @ 2020-10-06 19:43 UTC (permalink / raw)
  To: Maxime Coquelin, Nicolas Chautru, dev
  Cc: bruce.richardson, rosen.xu, trix, ferruh.yigit, tianjiao.liu


> 
> Hi Nicolas,
> 
> The series looks overall good to me now. Thanks for implementing the
> suggested changes.
> 
> For what it's worth:
> 
> Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> 

Series applied to dpdk-next-crypto

Thanks.



* Re: [dpdk-dev] [PATCH v12 01/10] drivers/baseband: add PMD for ACC100
  2020-10-05 22:12     ` [dpdk-dev] [PATCH v12 01/10] drivers/baseband: add PMD for ACC100 Nicolas Chautru
@ 2020-11-02  9:25       ` Ferruh Yigit
  2020-11-02 11:16         ` Ferruh Yigit
  0 siblings, 1 reply; 213+ messages in thread
From: Ferruh Yigit @ 2020-11-02  9:25 UTC (permalink / raw)
  To: Nicolas Chautru, dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, trix, maxime.coquelin, tianjiao.liu

On 10/5/2020 11:12 PM, Nicolas Chautru wrote:
> Add stubs for the ACC100 PMD
> 
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> Reviewed-by: Tom Rix <trix@redhat.com>
> Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
> ---
>   doc/guides/bbdevs/acc100.rst                       | 228 +++++++++++++++++++++
>   doc/guides/bbdevs/features/acc100.ini              |  14 ++
>   doc/guides/bbdevs/index.rst                        |   1 +
>   drivers/baseband/acc100/meson.build                |   6 +
>   drivers/baseband/acc100/rte_acc100_pmd.c           | 175 ++++++++++++++++
>   drivers/baseband/acc100/rte_acc100_pmd.h           |  37 ++++
>   .../acc100/rte_pmd_bbdev_acc100_version.map        |   3 +
>   drivers/baseband/meson.build                       |   2 +-
>   8 files changed, 465 insertions(+), 1 deletion(-)
>   create mode 100644 doc/guides/bbdevs/acc100.rst
>   create mode 100644 doc/guides/bbdevs/features/acc100.ini
>   create mode 100644 drivers/baseband/acc100/meson.build
>   create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.c
>   create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.h
>   create mode 100644 drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
> 

Hi Nicolas, Akhil,

Should MAINTAINERS file also needs to updated for this new PMD?


* Re: [dpdk-dev] [PATCH v12 01/10] drivers/baseband: add PMD for ACC100
  2020-11-02  9:25       ` Ferruh Yigit
@ 2020-11-02 11:16         ` Ferruh Yigit
  0 siblings, 0 replies; 213+ messages in thread
From: Ferruh Yigit @ 2020-11-02 11:16 UTC (permalink / raw)
  To: Nicolas Chautru, dev, akhil.goyal
  Cc: bruce.richardson, rosen.xu, trix, maxime.coquelin, tianjiao.liu

On 11/2/2020 9:25 AM, Ferruh Yigit wrote:
> On 10/5/2020 11:12 PM, Nicolas Chautru wrote:
>> Add stubs for the ACC100 PMD
>>
>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
>> Reviewed-by: Tom Rix <trix@redhat.com>
>> Acked-by: Liu Tianjiao <Tianjiao.liu@intel.com>
>> ---
>>   doc/guides/bbdevs/acc100.rst                       | 228 +++++++++++++++++++++
>>   doc/guides/bbdevs/features/acc100.ini              |  14 ++
>>   doc/guides/bbdevs/index.rst                        |   1 +
>>   drivers/baseband/acc100/meson.build                |   6 +
>>   drivers/baseband/acc100/rte_acc100_pmd.c           | 175 ++++++++++++++++
>>   drivers/baseband/acc100/rte_acc100_pmd.h           |  37 ++++
>>   .../acc100/rte_pmd_bbdev_acc100_version.map        |   3 +
>>   drivers/baseband/meson.build                       |   2 +-
>>   8 files changed, 465 insertions(+), 1 deletion(-)
>>   create mode 100644 doc/guides/bbdevs/acc100.rst
>>   create mode 100644 doc/guides/bbdevs/features/acc100.ini
>>   create mode 100644 drivers/baseband/acc100/meson.build
>>   create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.c
>>   create mode 100644 drivers/baseband/acc100/rte_acc100_pmd.h
>>   create mode 100644 drivers/baseband/acc100/rte_pmd_bbdev_acc100_version.map
>>
> 
> Hi Nicolas, Akhil,
> 
> Should the MAINTAINERS file also be updated for this new PMD?
 >

Ahh, it seems that is OK; all baseband drivers are maintained altogether by Nicolas 
:), so they are not added individually

      Baseband API - EXPERIMENTAL
      M: Nicolas Chautru <nicolas.chautru@intel.com>
      T: git://dpdk.org/next/dpdk-next-crypto
      F: lib/librte_bbdev/
      F: doc/guides/prog_guide/bbdev.rst
*    F: drivers/baseband/

^ permalink raw reply	[flat|nested] 213+ messages in thread

end of thread, other threads:[~2020-11-02 11:16 UTC | newest]

Thread overview: 213+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-19  0:25 [dpdk-dev] [PATCH v3 00/11] bbdev PMD ACC100 Nicolas Chautru
2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 01/11] drivers/baseband: add PMD for ACC100 Nicolas Chautru
2020-08-29  9:44   ` Xu, Rosen
2020-09-04 16:44     ` Chautru, Nicolas
2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 02/11] baseband/acc100: add register definition file Nicolas Chautru
2020-08-29  9:55   ` Xu, Rosen
2020-08-29 17:39     ` Chautru, Nicolas
2020-09-03  2:15       ` Xu, Rosen
2020-09-03  9:17         ` Ferruh Yigit
2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 03/11] baseband/acc100: add info get function Nicolas Chautru
2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 04/11] baseband/acc100: add queue configuration Nicolas Chautru
2020-08-29 10:39   ` Xu, Rosen
2020-08-29 17:48     ` Chautru, Nicolas
2020-09-03  2:30       ` Xu, Rosen
2020-09-03 22:48         ` Chautru, Nicolas
2020-09-04  2:01           ` Xu, Rosen
2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 05/11] baseband/acc100: add LDPC processing functions Nicolas Chautru
2020-08-20 14:38   ` Dave Burley
2020-08-20 14:52     ` Chautru, Nicolas
2020-08-20 14:57       ` Dave Burley
2020-08-20 21:05         ` Chautru, Nicolas
2020-09-03  8:06           ` Dave Burley
2020-08-29 11:10   ` Xu, Rosen
2020-08-29 18:01     ` Chautru, Nicolas
2020-09-03  2:34       ` Xu, Rosen
2020-09-03  9:09         ` Ananyev, Konstantin
2020-09-03 20:45           ` Chautru, Nicolas
2020-09-15  1:45             ` Chautru, Nicolas
2020-09-15 10:21             ` Ananyev, Konstantin
2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 06/11] baseband/acc100: add HARQ loopback support Nicolas Chautru
2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 07/11] baseband/acc100: add support for 4G processing Nicolas Chautru
2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 08/11] baseband/acc100: add interrupt support to PMD Nicolas Chautru
2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 09/11] baseband/acc100: add debug function to validate input Nicolas Chautru
2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 10/11] baseband/acc100: add configure function Nicolas Chautru
2020-09-03 10:06   ` Aidan Goddard
2020-09-03 18:53     ` Chautru, Nicolas
2020-08-19  0:25 ` [dpdk-dev] [PATCH v3 11/11] doc: update bbdev feature table Nicolas Chautru
2020-09-04 17:53   ` [dpdk-dev] [PATCH v4 00/11] bbdev PMD ACC100 Nicolas Chautru
2020-09-04 17:53     ` [dpdk-dev] [PATCH v4 01/11] drivers/baseband: add PMD for ACC100 Nicolas Chautru
2020-09-08  3:10       ` Liu, Tianjiao
2020-09-04 17:53     ` [dpdk-dev] [PATCH v4 02/11] baseband/acc100: add register definition file Nicolas Chautru
2020-09-15  2:31       ` Xu, Rosen
2020-09-18  2:39       ` Liu, Tianjiao
2020-09-04 17:53     ` [dpdk-dev] [PATCH v4 03/11] baseband/acc100: add info get function Nicolas Chautru
2020-09-18  2:47       ` Liu, Tianjiao
2020-09-04 17:54     ` [dpdk-dev] [PATCH v4 04/11] baseband/acc100: add queue configuration Nicolas Chautru
2020-09-15  2:31       ` Xu, Rosen
2020-09-18  3:01       ` Liu, Tianjiao
2020-09-04 17:54     ` [dpdk-dev] [PATCH v4 05/11] baseband/acc100: add LDPC processing functions Nicolas Chautru
2020-09-21  1:40       ` Liu, Tianjiao
2020-09-04 17:54     ` [dpdk-dev] [PATCH v4 06/11] baseband/acc100: add HARQ loopback support Nicolas Chautru
2020-09-21  1:41       ` Liu, Tianjiao
2020-09-04 17:54     ` [dpdk-dev] [PATCH v4 07/11] baseband/acc100: add support for 4G processing Nicolas Chautru
2020-09-21  1:43       ` Liu, Tianjiao
2020-09-04 17:54     ` [dpdk-dev] [PATCH v4 08/11] baseband/acc100: add interrupt support to PMD Nicolas Chautru
2020-09-21  1:45       ` Liu, Tianjiao
2020-09-04 17:54     ` [dpdk-dev] [PATCH v4 09/11] baseband/acc100: add debug function to validate input Nicolas Chautru
2020-09-21  1:46       ` Liu, Tianjiao
2020-09-04 17:54     ` [dpdk-dev] [PATCH v4 10/11] baseband/acc100: add configure function Nicolas Chautru
2020-09-21  1:48       ` Liu, Tianjiao
2020-09-04 17:54     ` [dpdk-dev] [PATCH v4 11/11] doc: update bbdev feature table Nicolas Chautru
2020-09-21  1:50       ` Liu, Tianjiao
2020-09-21 14:36     ` [dpdk-dev] [PATCH v4 00/11] bbdev PMD ACC100 Chautru, Nicolas
2020-09-22 19:32       ` Akhil Goyal
2020-09-23  2:21         ` Chautru, Nicolas
2020-09-23  2:12   ` [dpdk-dev] [PATCH v5 " Nicolas Chautru
2020-09-23  2:12     ` [dpdk-dev] [PATCH v5 01/11] drivers/baseband: add PMD for ACC100 Nicolas Chautru
2020-09-23  2:12     ` [dpdk-dev] [PATCH v5 02/11] baseband/acc100: add register definition file Nicolas Chautru
2020-09-23  2:12     ` [dpdk-dev] [PATCH v5 03/11] baseband/acc100: add info get function Nicolas Chautru
2020-09-23  2:12     ` [dpdk-dev] [PATCH v5 04/11] baseband/acc100: add queue configuration Nicolas Chautru
2020-09-23  2:12     ` [dpdk-dev] [PATCH v5 05/11] baseband/acc100: add LDPC processing functions Nicolas Chautru
2020-09-23  2:12     ` [dpdk-dev] [PATCH v5 06/11] baseband/acc100: add HARQ loopback support Nicolas Chautru
2020-09-23  2:12     ` [dpdk-dev] [PATCH v5 07/11] baseband/acc100: add support for 4G processing Nicolas Chautru
2020-09-23  2:12     ` [dpdk-dev] [PATCH v5 08/11] baseband/acc100: add interrupt support to PMD Nicolas Chautru
2020-09-23  2:12     ` [dpdk-dev] [PATCH v5 09/11] baseband/acc100: add debug function to validate input Nicolas Chautru
2020-09-23  2:12     ` [dpdk-dev] [PATCH v5 10/11] baseband/acc100: add configure function Nicolas Chautru
2020-09-23  2:12     ` [dpdk-dev] [PATCH v5 11/11] doc: update bbdev feature table Nicolas Chautru
2020-09-23  2:19   ` [dpdk-dev] [PATCH v6 00/11] bbdev PMD ACC100 Nicolas Chautru
2020-09-23  2:19     ` [dpdk-dev] [PATCH v6 01/11] service: retrieve lcore active state Nicolas Chautru
2020-09-23  2:19     ` [dpdk-dev] [PATCH v6 02/11] test/service: fix race condition on stopping lcore Nicolas Chautru
2020-09-23  2:19     ` [dpdk-dev] [PATCH v6 03/11] drivers/baseband: add PMD for ACC100 Nicolas Chautru
2020-09-23  2:19     ` [dpdk-dev] [PATCH v6 04/11] baseband/acc100: add register definition file Nicolas Chautru
2020-09-23  2:19     ` [dpdk-dev] [PATCH v6 05/11] baseband/acc100: add info get function Nicolas Chautru
2020-09-23  2:19     ` [dpdk-dev] [PATCH v6 06/11] baseband/acc100: add queue configuration Nicolas Chautru
2020-09-23  2:19     ` [dpdk-dev] [PATCH v6 07/11] baseband/acc100: add LDPC processing functions Nicolas Chautru
2020-09-23  2:19     ` [dpdk-dev] [PATCH v6 08/11] baseband/acc100: add HARQ loopback support Nicolas Chautru
2020-09-23  2:19     ` [dpdk-dev] [PATCH v6 09/11] baseband/acc100: add support for 4G processing Nicolas Chautru
2020-09-23  2:19     ` [dpdk-dev] [PATCH v6 10/11] baseband/acc100: add interrupt support to PMD Nicolas Chautru
2020-09-23  2:19     ` [dpdk-dev] [PATCH v6 11/11] baseband/acc100: add debug function to validate input Nicolas Chautru
2020-09-23  2:24   ` [dpdk-dev] [PATCH v7 00/11] bbdev PMD ACC100 Nicolas Chautru
2020-09-23  2:24     ` [dpdk-dev] [PATCH v7 01/11] drivers/baseband: add PMD for ACC100 Nicolas Chautru
2020-09-23  2:24     ` [dpdk-dev] [PATCH v7 02/11] baseband/acc100: add register definition file Nicolas Chautru
2020-09-23  2:24     ` [dpdk-dev] [PATCH v7 03/11] baseband/acc100: add info get function Nicolas Chautru
2020-09-23  2:24     ` [dpdk-dev] [PATCH v7 04/11] baseband/acc100: add queue configuration Nicolas Chautru
2020-09-23  2:25     ` [dpdk-dev] [PATCH v7 05/11] baseband/acc100: add LDPC processing functions Nicolas Chautru
2020-09-23  2:25     ` [dpdk-dev] [PATCH v7 06/11] baseband/acc100: add HARQ loopback support Nicolas Chautru
2020-09-23  2:25     ` [dpdk-dev] [PATCH v7 07/11] baseband/acc100: add support for 4G processing Nicolas Chautru
2020-09-23  2:25     ` [dpdk-dev] [PATCH v7 08/11] baseband/acc100: add interrupt support to PMD Nicolas Chautru
2020-09-23  2:25     ` [dpdk-dev] [PATCH v7 09/11] baseband/acc100: add debug function to validate input Nicolas Chautru
2020-09-23  2:25     ` [dpdk-dev] [PATCH v7 10/11] baseband/acc100: add configure function Nicolas Chautru
2020-09-23  2:25     ` [dpdk-dev] [PATCH v7 11/11] doc: update bbdev feature table Nicolas Chautru
2020-09-28 20:19       ` Akhil Goyal
2020-09-29  0:57         ` Chautru, Nicolas
2020-09-28 23:52   ` [dpdk-dev] [PATCH v8 00/10] bbdev PMD ACC100 Nicolas Chautru
2020-09-28 23:52     ` [dpdk-dev] [PATCH v8 01/10] drivers/baseband: add PMD for ACC100 Nicolas Chautru
2020-09-28 23:52     ` [dpdk-dev] [PATCH v8 02/10] baseband/acc100: add register definition file Nicolas Chautru
2020-09-28 23:52     ` [dpdk-dev] [PATCH v8 03/10] baseband/acc100: add info get function Nicolas Chautru
2020-09-28 23:52     ` [dpdk-dev] [PATCH v8 04/10] baseband/acc100: add queue configuration Nicolas Chautru
2020-09-28 23:52     ` [dpdk-dev] [PATCH v8 05/10] baseband/acc100: add LDPC processing functions Nicolas Chautru
2020-09-28 23:52     ` [dpdk-dev] [PATCH v8 06/10] baseband/acc100: add HARQ loopback support Nicolas Chautru
2020-09-28 23:52     ` [dpdk-dev] [PATCH v8 07/10] baseband/acc100: add support for 4G processing Nicolas Chautru
2020-09-28 23:52     ` [dpdk-dev] [PATCH v8 08/10] baseband/acc100: add interrupt support to PMD Nicolas Chautru
2020-09-28 23:52     ` [dpdk-dev] [PATCH v8 09/10] baseband/acc100: add debug function to validate input Nicolas Chautru
2020-09-28 23:52     ` [dpdk-dev] [PATCH v8 10/10] baseband/acc100: add configure function Nicolas Chautru
2020-09-29  0:29   ` [dpdk-dev] [PATCH v9 00/10] bbdev PMD ACC100 Nicolas Chautru
2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 01/10] drivers/baseband: add PMD for ACC100 Nicolas Chautru
2020-09-29 19:53       ` Tom Rix
2020-09-29 23:17         ` Chautru, Nicolas
2020-09-30 23:06           ` Tom Rix
2020-09-30 23:30             ` Chautru, Nicolas
2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 02/10] baseband/acc100: add register definition file Nicolas Chautru
2020-09-29 20:34       ` Tom Rix
2020-09-29 23:30         ` Chautru, Nicolas
2020-09-30 23:11           ` Tom Rix
2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 03/10] baseband/acc100: add info get function Nicolas Chautru
2020-09-29 21:13       ` Tom Rix
2020-09-30  0:25         ` Chautru, Nicolas
2020-09-30 23:20           ` Tom Rix
2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 04/10] baseband/acc100: add queue configuration Nicolas Chautru
2020-09-29 21:46       ` Tom Rix
2020-09-30  1:03         ` Chautru, Nicolas
2020-09-30 23:36           ` Tom Rix
2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 05/10] baseband/acc100: add LDPC processing functions Nicolas Chautru
2020-09-30 16:53       ` Tom Rix
2020-09-30 18:52         ` Chautru, Nicolas
2020-10-01 15:31           ` Tom Rix
2020-10-01 16:07             ` Chautru, Nicolas
2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 06/10] baseband/acc100: add HARQ loopback support Nicolas Chautru
2020-09-30 17:25       ` Tom Rix
2020-09-30 18:55         ` Chautru, Nicolas
2020-10-01 15:32           ` Tom Rix
2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 07/10] baseband/acc100: add support for 4G processing Nicolas Chautru
2020-09-30 18:37       ` Tom Rix
2020-09-30 19:10         ` Chautru, Nicolas
2020-10-01 15:42           ` Tom Rix
2020-10-01 21:46             ` Chautru, Nicolas
2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 08/10] baseband/acc100: add interrupt support to PMD Nicolas Chautru
2020-09-30 19:03       ` Tom Rix
2020-09-30 19:45         ` Chautru, Nicolas
2020-10-01 16:05           ` Tom Rix
2020-10-01 21:07             ` Chautru, Nicolas
2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 09/10] baseband/acc100: add debug function to validate input Nicolas Chautru
2020-09-30 19:16       ` Tom Rix
2020-09-30 19:53         ` Chautru, Nicolas
2020-10-01 16:07           ` Tom Rix
2020-09-29  0:29     ` [dpdk-dev] [PATCH v9 10/10] baseband/acc100: add configure function Nicolas Chautru
2020-09-30 19:58       ` Tom Rix
2020-09-30 22:54         ` Chautru, Nicolas
2020-10-01 16:18           ` Tom Rix
2020-10-01 21:11             ` Chautru, Nicolas
2020-10-01  3:14   ` [dpdk-dev] [PATCH v10 00/10] bbdev PMD ACC100 Nicolas Chautru
2020-10-01  3:14     ` [dpdk-dev] [PATCH v10 01/10] drivers/baseband: add PMD for ACC100 Nicolas Chautru
2020-10-01  3:14     ` [dpdk-dev] [PATCH v10 02/10] baseband/acc100: add register definition file Nicolas Chautru
2020-10-01  3:14     ` [dpdk-dev] [PATCH v10 03/10] baseband/acc100: add info get function Nicolas Chautru
2020-10-01 14:34       ` Maxime Coquelin
2020-10-01 19:50         ` Chautru, Nicolas
2020-10-01  3:14     ` [dpdk-dev] [PATCH v10 04/10] baseband/acc100: add queue configuration Nicolas Chautru
2020-10-01 15:38       ` Maxime Coquelin
2020-10-01 19:50         ` Chautru, Nicolas
2020-10-01  3:14     ` [dpdk-dev] [PATCH v10 05/10] baseband/acc100: add LDPC processing functions Nicolas Chautru
2020-10-01  3:14     ` [dpdk-dev] [PATCH v10 06/10] baseband/acc100: add HARQ loopback support Nicolas Chautru
2020-10-01  3:14     ` [dpdk-dev] [PATCH v10 07/10] baseband/acc100: add support for 4G processing Nicolas Chautru
2020-10-01  3:14     ` [dpdk-dev] [PATCH v10 08/10] baseband/acc100: add interrupt support to PMD Nicolas Chautru
2020-10-01  3:14     ` [dpdk-dev] [PATCH v10 09/10] baseband/acc100: add debug function to validate input Nicolas Chautru
2020-10-01  3:14     ` [dpdk-dev] [PATCH v10 10/10] baseband/acc100: add configure function Nicolas Chautru
2020-10-01 14:11       ` Maxime Coquelin
2020-10-01 15:36         ` Chautru, Nicolas
2020-10-01 15:43           ` Maxime Coquelin
2020-10-01 19:50             ` Chautru, Nicolas
2020-10-01 21:44               ` Maxime Coquelin
2020-10-02  1:01   ` [dpdk-dev] [PATCH v11 00/10] bbdev PMD ACC100 Nicolas Chautru
2020-10-02  1:01     ` [dpdk-dev] [PATCH v11 01/10] drivers/baseband: add PMD for ACC100 Nicolas Chautru
2020-10-04 15:53       ` Tom Rix
2020-10-02  1:01     ` [dpdk-dev] [PATCH v11 02/10] baseband/acc100: add register definition file Nicolas Chautru
2020-10-04 15:56       ` Tom Rix
2020-10-02  1:01     ` [dpdk-dev] [PATCH v11 03/10] baseband/acc100: add info get function Nicolas Chautru
2020-10-04 16:09       ` Tom Rix
2020-10-05 16:38         ` Chautru, Nicolas
2020-10-05 22:19           ` Chautru, Nicolas
2020-10-02  1:01     ` [dpdk-dev] [PATCH v11 04/10] baseband/acc100: add queue configuration Nicolas Chautru
2020-10-04 16:18       ` Tom Rix
2020-10-05 16:42         ` Chautru, Nicolas
2020-10-02  1:01     ` [dpdk-dev] [PATCH v11 05/10] baseband/acc100: add LDPC processing functions Nicolas Chautru
2020-10-02  1:01     ` [dpdk-dev] [PATCH v11 06/10] baseband/acc100: add HARQ loopback support Nicolas Chautru
2020-10-02  1:01     ` [dpdk-dev] [PATCH v11 07/10] baseband/acc100: add support for 4G processing Nicolas Chautru
2020-10-02  1:01     ` [dpdk-dev] [PATCH v11 08/10] baseband/acc100: add interrupt support to PMD Nicolas Chautru
2020-10-02  1:01     ` [dpdk-dev] [PATCH v11 09/10] baseband/acc100: add debug function to validate input Nicolas Chautru
2020-10-02  1:01     ` [dpdk-dev] [PATCH v11 10/10] baseband/acc100: add configure function Nicolas Chautru
2020-10-05 22:12   ` [dpdk-dev] [PATCH v12 00/10] bbdev PMD ACC100 Nicolas Chautru
2020-10-05 22:12     ` [dpdk-dev] [PATCH v12 01/10] drivers/baseband: add PMD for ACC100 Nicolas Chautru
2020-11-02  9:25       ` Ferruh Yigit
2020-11-02 11:16         ` Ferruh Yigit
2020-10-05 22:12     ` [dpdk-dev] [PATCH v12 02/10] baseband/acc100: add register definition file Nicolas Chautru
2020-10-05 22:12     ` [dpdk-dev] [PATCH v12 03/10] baseband/acc100: add info get function Nicolas Chautru
2020-10-05 22:12     ` [dpdk-dev] [PATCH v12 04/10] baseband/acc100: add queue configuration Nicolas Chautru
2020-10-05 22:12     ` [dpdk-dev] [PATCH v12 05/10] baseband/acc100: add LDPC processing functions Nicolas Chautru
2020-10-05 22:12     ` [dpdk-dev] [PATCH v12 06/10] baseband/acc100: add HARQ loopback support Nicolas Chautru
2020-10-05 22:12     ` [dpdk-dev] [PATCH v12 07/10] baseband/acc100: add support for 4G processing Nicolas Chautru
2020-10-05 22:12     ` [dpdk-dev] [PATCH v12 08/10] baseband/acc100: add interrupt support to PMD Nicolas Chautru
2020-10-05 22:12     ` [dpdk-dev] [PATCH v12 09/10] baseband/acc100: add debug function to validate input Nicolas Chautru
2020-10-05 22:12     ` [dpdk-dev] [PATCH v12 10/10] baseband/acc100: add configure function Nicolas Chautru
2020-10-06 13:20     ` [dpdk-dev] [PATCH v12 00/10] bbdev PMD ACC100 Maxime Coquelin
2020-10-06 19:43       ` Akhil Goyal
