[PATCH v1 0/7] VRB2 BBDEV PMD introduction

DPDK patches and discussions

* [PATCH v1 0/7] VRB2 BBDEV PMD introduction
@ 2023-09-19  1:21 Nicolas Chautru
  2023-09-19  1:21 ` [PATCH v1 1/7] bbdev: add FFT version member in driver info Nicolas Chautru
                   ` (7 more replies)
  0 siblings, 8 replies; 24+ messages in thread
From: Nicolas Chautru @ 2023-09-19  1:21 UTC (permalink / raw)
  To: dev, maxime.coquelin
  Cc: hemant.agrawal, david.marchand, hernan.vargas, Nicolas Chautru

This series includes changes to the VRB BBDEV PMD for 23.11.
It relies on the previous series that Maxime is about to apply
(https://patches.dpdk.org/project/dpdk/list/?series=28544).
I did not include documentation yet, but I will in the next revision.

This allows the VRB unified driver to support the new VRB2
implementation variant on GNR-D.

This also includes a minor change to the dev_info to expose the FFT
version, so that the application can tell what is configured
dynamically on the device.


Nicolas Chautru (7):
  bbdev: add FFT version member in driver info
  baseband/acc: add FFT version in the VRM PMD
  baseband/acc: remove the 4G SO capability for VRB1
  baseband/acc: allocate FCW memory separately
  baseband/acc: add support for MLD operation
  baseband/acc: introduce the new VRB2 variant
  baseband/acc: add configure helper for VRB2

 drivers/baseband/acc/acc100_pmd.h     |    2 +
 drivers/baseband/acc/acc_common.h     |   97 +-
 drivers/baseband/acc/rte_acc100_pmd.c |   10 +-
 drivers/baseband/acc/rte_vrb_pmd.c    | 1967 ++++++++++++++++++++++---
 drivers/baseband/acc/vrb1_pf_enum.h   |   17 +-
 drivers/baseband/acc/vrb2_pf_enum.h   |  124 ++
 drivers/baseband/acc/vrb2_vf_enum.h   |  121 ++
 drivers/baseband/acc/vrb_cfg.h        |   16 +
 drivers/baseband/acc/vrb_pmd.h        |  173 ++-
 lib/bbdev/rte_bbdev.h                 |    2 +
 10 files changed, 2297 insertions(+), 232 deletions(-)
 create mode 100644 drivers/baseband/acc/vrb2_pf_enum.h
 create mode 100644 drivers/baseband/acc/vrb2_vf_enum.h

-- 
2.34.1


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v1 1/7] bbdev: add FFT version member in driver info
  2023-09-19  1:21 [PATCH v1 0/7] VRB2 BBDEV PMD introduction Nicolas Chautru
@ 2023-09-19  1:21 ` Nicolas Chautru
  2023-09-19  9:55   ` Maxime Coquelin
  2023-09-19  1:21 ` [PATCH v1 2/7] baseband/acc: add FFT version in the VRM PMD Nicolas Chautru
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 24+ messages in thread
From: Nicolas Chautru @ 2023-09-19  1:21 UTC (permalink / raw)
  To: dev, maxime.coquelin
  Cc: hemant.agrawal, david.marchand, hernan.vargas, Nicolas Chautru

This can be used to distinguish different versions of the
flexible pointwise windowing applied to the FFT and to expose
this to the application.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
 lib/bbdev/rte_bbdev.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
index a5bcc09f10..d6e54ee9a4 100644
--- a/lib/bbdev/rte_bbdev.h
+++ b/lib/bbdev/rte_bbdev.h
@@ -349,6 +349,8 @@ struct rte_bbdev_driver_info {
 	const struct rte_bbdev_op_cap *capabilities;
 	/** Device cpu_flag requirements */
 	const enum rte_cpu_flag_t *cpu_flag_reqs;
+	/** Versioning number for the FFT operation type. */
+	uint16_t fft_version;
 };
 
 /** Macro used at end of bbdev PMD list */
-- 
2.34.1


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v1 2/7] baseband/acc: add FFT version in the VRM PMD
  2023-09-19  1:21 [PATCH v1 0/7] VRB2 BBDEV PMD introduction Nicolas Chautru
  2023-09-19  1:21 ` [PATCH v1 1/7] bbdev: add FFT version member in driver info Nicolas Chautru
@ 2023-09-19  1:21 ` Nicolas Chautru
  2023-09-19  1:21 ` [PATCH v1 3/7] baseband/acc: remove the 4G SO capability for VRB1 Nicolas Chautru
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 24+ messages in thread
From: Nicolas Chautru @ 2023-09-19  1:21 UTC (permalink / raw)
  To: dev, maxime.coquelin
  Cc: hemant.agrawal, david.marchand, hernan.vargas, Nicolas Chautru

This exposes the version of the flexible pointwise operation
that is dynamically configured on the device.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
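Note for readers of the thread: the VF-side request in this patch is a
bounded poll of the PF-to-VF doorbell. A minimal sketch of that pattern
follows, assuming a register read supplied as a callback. All `demo_`
names, the "no status" value and the retry bound are hypothetical; the
driver uses its own register accessors and constants and sleeps between
reads.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NOSTATUS  0u   /* hypothetical "no answer yet" value */
#define DEMO_MAX_POLLS 100u /* hypothetical retry bound */

typedef uint32_t (*demo_reg_read_t)(void *ctx);

/* Re-read the doorbell until it holds an answer or the retry bound is hit. */
static inline uint32_t
demo_poll_doorbell(demo_reg_read_t read_reg, void *ctx)
{
	uint32_t polls = 0;
	uint32_t reg = read_reg(ctx);

	while (polls < DEMO_MAX_POLLS && reg == DEMO_NOSTATUS) {
		/* A real driver would sleep here between reads. */
		reg = read_reg(ctx);
		polls++;
	}

	/* On timeout this is still DEMO_NOSTATUS. */
	return reg;
}

/* Fake register for illustration: answers with `value` after `delay` reads. */
struct demo_fake_reg {
	uint32_t delay;
	uint32_t value;
};

static inline uint32_t
demo_fake_read(void *ctx)
{
	struct demo_fake_reg *r = ctx;

	if (r->delay > 0) {
		r->delay--;
		return DEMO_NOSTATUS;
	}
	return r->value;
}
```

With this shape the caller cannot tell a timeout from a genuine "no
status" answer, which may or may not matter to the application.
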
 drivers/baseband/acc/acc_common.h  |  1 +
 drivers/baseband/acc/rte_vrb_pmd.c | 22 ++++++++++++++++++++++
 2 files changed, 23 insertions(+)

diff --git a/drivers/baseband/acc/acc_common.h b/drivers/baseband/acc/acc_common.h
index 5bb00746c3..df18506e75 100644
--- a/drivers/baseband/acc/acc_common.h
+++ b/drivers/baseband/acc/acc_common.h
@@ -512,6 +512,7 @@ struct acc_deq_intr_details {
 enum {
 	ACC_VF2PF_STATUS_REQUEST = 1,
 	ACC_VF2PF_USING_VF = 2,
+	ACC_VF2PF_LUT_VER_REQUEST = 3,
 };
 
 
diff --git a/drivers/baseband/acc/rte_vrb_pmd.c b/drivers/baseband/acc/rte_vrb_pmd.c
index 9e5a73c9c7..3c8f3409ed 100644
--- a/drivers/baseband/acc/rte_vrb_pmd.c
+++ b/drivers/baseband/acc/rte_vrb_pmd.c
@@ -298,6 +298,27 @@ vrb_device_status(struct rte_bbdev *dev)
 	return reg;
 }
 
+/* Request device FFT related version information. */
+static inline uint32_t
+vrb_device_fft_ver(struct rte_bbdev *dev)
+{
+	struct acc_device *d = dev->data->dev_private;
+	uint32_t reg, time_out = 0;
+
+	if (d->pf_device)
+		return 0;
+
+	vrb_vf2pf(d, ACC_VF2PF_LUT_VER_REQUEST);
+	reg = acc_reg_read(d, d->reg_addr->pf2vf_doorbell);
+	while ((time_out < ACC_STATUS_TO) && (reg == RTE_BBDEV_DEV_NOSTATUS)) {
+		usleep(ACC_STATUS_WAIT); /*< Wait for VF->PF->VF Comms */
+		reg = acc_reg_read(d, d->reg_addr->pf2vf_doorbell);
+		time_out++;
+	}
+
+	return reg;
+}
+
 /* Checks PF Info Ring to find the interrupt cause and handles it accordingly. */
 static inline void
 vrb_check_ir(struct acc_device *acc_dev)
@@ -1100,6 +1121,7 @@ vrb_dev_info_get(struct rte_bbdev *dev, struct rte_bbdev_driver_info *dev_info)
 	fetch_acc_config(dev);
 	/* Check the status of device. */
 	dev_info->device_status = vrb_device_status(dev);
+	dev_info->fft_version = vrb_device_fft_ver(dev);
 
 	/* Exposed number of queues. */
 	dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
-- 
2.34.1



* [PATCH v1 3/7] baseband/acc: remove the 4G SO capability for VRB1
  2023-09-19  1:21 [PATCH v1 0/7] VRB2 BBDEV PMD introduction Nicolas Chautru
  2023-09-19  1:21 ` [PATCH v1 1/7] bbdev: add FFT version member in driver info Nicolas Chautru
  2023-09-19  1:21 ` [PATCH v1 2/7] baseband/acc: add FFT version in the VRM PMD Nicolas Chautru
@ 2023-09-19  1:21 ` Nicolas Chautru
  2023-09-19 15:20   ` David Marchand
  2023-09-19  1:21 ` [PATCH v1 4/7] baseband/acc: allocate FCW memory separately Nicolas Chautru
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 24+ messages in thread
From: Nicolas Chautru @ 2023-09-19  1:21 UTC (permalink / raw)
  To: dev, maxime.coquelin
  Cc: hemant.agrawal, david.marchand, hernan.vargas, Nicolas Chautru

This removes the capability for, and support of, the LTE Decoder
Soft Output option on the VRB1 PMD.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
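Note for readers of the thread: besides dropping the capability bits,
the patch masks the soft-output flag off each operation at enqueue. A
minimal sketch of that masking, with hypothetical flag values (the real
RTE_BBDEV_TURBO_* bit positions are defined by bbdev and are not
reproduced here):

```c
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. */
#define DEMO_FLAG_SOFT_OUTPUT (1u << 1)
#define DEMO_FLAG_EARLY_TERM  (1u << 2)

/* Clear the soft-output request and leave every other flag untouched. */
static inline uint32_t
demo_strip_soft_output(uint32_t op_flags)
{
	return op_flags & ~DEMO_FLAG_SOFT_OUTPUT;
}
```
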
 drivers/baseband/acc/rte_vrb_pmd.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/baseband/acc/rte_vrb_pmd.c b/drivers/baseband/acc/rte_vrb_pmd.c
index 3c8f3409ed..e0f50460bd 100644
--- a/drivers/baseband/acc/rte_vrb_pmd.c
+++ b/drivers/baseband/acc/rte_vrb_pmd.c
@@ -1019,14 +1019,11 @@ vrb_dev_info_get(struct rte_bbdev *dev, struct rte_bbdev_driver_info *dev_info)
 					RTE_BBDEV_TURBO_CRC_TYPE_24B |
 					RTE_BBDEV_TURBO_DEC_CRC_24B_DROP |
 					RTE_BBDEV_TURBO_EQUALIZER |
-					RTE_BBDEV_TURBO_SOFT_OUT_SATURATE |
 					RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
 					RTE_BBDEV_TURBO_CONTINUE_CRC_MATCH |
-					RTE_BBDEV_TURBO_SOFT_OUTPUT |
 					RTE_BBDEV_TURBO_EARLY_TERMINATION |
 					RTE_BBDEV_TURBO_DEC_INTERRUPTS |
 					RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
-					RTE_BBDEV_TURBO_NEG_LLR_1_BIT_SOFT_OUT |
 					RTE_BBDEV_TURBO_MAP_DEC |
 					RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP |
 					RTE_BBDEV_TURBO_DEC_SCATTER_GATHER,
@@ -1975,6 +1972,9 @@ enqueue_dec_one_op_cb(struct acc_queue *q, struct rte_bbdev_dec_op *op,
 	struct rte_mbuf *input, *h_output_head, *h_output,
 		*s_output_head, *s_output;
 
+	/* Explicitly disable SO for VRB1. */
+	op->turbo_dec.op_flags &= ~RTE_BBDEV_TURBO_SOFT_OUTPUT;
+
 	desc = acc_desc(q, total_enqueued_cbs);
 	vrb_fcw_td_fill(op, &desc->req.fcw_td);
 
-- 
2.34.1



* [PATCH v1 4/7] baseband/acc: allocate FCW memory separately
  2023-09-19  1:21 [PATCH v1 0/7] VRB2 BBDEV PMD introduction Nicolas Chautru
                   ` (2 preceding siblings ...)
  2023-09-19  1:21 ` [PATCH v1 3/7] baseband/acc: remove the 4G SO capability for VRB1 Nicolas Chautru
@ 2023-09-19  1:21 ` Nicolas Chautru
  2023-09-19  1:21 ` [PATCH v1 5/7] baseband/acc: add support for MLD operation Nicolas Chautru
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 24+ messages in thread
From: Nicolas Chautru @ 2023-09-19  1:21 UTC (permalink / raw)
  To: dev, maxime.coquelin
  Cc: hemant.agrawal, david.marchand, hernan.vargas, Nicolas Chautru

This allows more flexibility in the FCW size for the
unified driver. No functional change.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
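Note for readers of the thread: with the FCW stored in its own ring,
each descriptor slot maps to a fixed-size FCW slot at the same index. A
minimal sketch of the addressing used at enqueue time follows, assuming
the ring depth is a power of two so that the wrap mask is depth - 1.
`demo_fcw_offset` is a hypothetical helper; the slot size mirrors
ACC_MAX_FCW_SIZE from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_FCW_SIZE 128u /* mirrors ACC_MAX_FCW_SIZE */

/*
 * Byte offset, inside the FCW ring, of the slot paired with descriptor
 * (head + enqueued), wrapping on the ring depth.
 */
static inline size_t
demo_fcw_offset(uint32_t head, uint32_t enqueued, uint32_t ring_depth)
{
	uint32_t wrap_mask = ring_depth - 1;

	/* The mask trick only works for a power-of-two depth. */
	assert(ring_depth != 0 && (ring_depth & wrap_mask) == 0);
	return (size_t)((head + enqueued) & wrap_mask) * DEMO_MAX_FCW_SIZE;
}
```

For example, with a depth of 1024, head 1023 and one operation already
enqueued, the index wraps to slot 0.
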
 drivers/baseband/acc/acc_common.h  |  4 +++-
 drivers/baseband/acc/rte_vrb_pmd.c | 25 ++++++++++++++++++++++++-
 2 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/drivers/baseband/acc/acc_common.h b/drivers/baseband/acc/acc_common.h
index df18506e75..b5ee113faf 100644
--- a/drivers/baseband/acc/acc_common.h
+++ b/drivers/baseband/acc/acc_common.h
@@ -101,6 +101,7 @@
 #define ACC_NUM_QGRPS_PER_WORD         8
 #define ACC_MAX_NUM_QGRPS              32
 #define ACC_RING_SIZE_GRANULARITY      64
+#define ACC_MAX_FCW_SIZE              128
 
 /* Constants from K0 computation from 3GPP 38.212 Table 5.4.2.1-2 */
 #define ACC_N_ZC_1 66 /* N = 66 Zc for BG 1 */
@@ -582,13 +583,14 @@ struct __rte_cache_aligned acc_queue {
 	uint32_t aq_enqueued;  /* Count how many "batches" have been enqueued */
 	uint32_t aq_dequeued;  /* Count how many "batches" have been dequeued */
 	uint32_t irq_enable;  /* Enable ops dequeue interrupts if set to 1 */
-	struct rte_mempool *fcw_mempool;  /* FCW mempool */
 	enum rte_bbdev_op_type op_type;  /* Type of this Queue: TE or TD */
 	/* Internal Buffers for loopback input */
 	uint8_t *lb_in;
 	uint8_t *lb_out;
+	uint8_t *fcw_ring;
 	rte_iova_t lb_in_addr_iova;
 	rte_iova_t lb_out_addr_iova;
+	rte_iova_t fcw_ring_addr_iova;
 	int8_t *derm_buffer; /* interim buffer for de-rm in SDK */
 	struct acc_device *d;
 };
diff --git a/drivers/baseband/acc/rte_vrb_pmd.c b/drivers/baseband/acc/rte_vrb_pmd.c
index e0f50460bd..78f465b25b 100644
--- a/drivers/baseband/acc/rte_vrb_pmd.c
+++ b/drivers/baseband/acc/rte_vrb_pmd.c
@@ -883,6 +883,25 @@ vrb_queue_setup(struct rte_bbdev *dev, uint16_t queue_id,
 		goto free_companion_ring_addr;
 	}
 
+	q->fcw_ring = rte_zmalloc_socket(dev->device->driver->name,
+			ACC_MAX_FCW_SIZE * d->sw_ring_max_depth,
+			RTE_CACHE_LINE_SIZE, conf->socket);
+	if (q->fcw_ring == NULL) {
+		rte_bbdev_log(ERR, "Failed to allocate fcw_ring memory");
+		ret = -ENOMEM;
+		goto free_companion_ring_addr;
+	}
+	q->fcw_ring_addr_iova = rte_malloc_virt2iova(q->fcw_ring);
+
+	/* For FFT we need to store the FCW separately */
+	if (conf->op_type == RTE_BBDEV_OP_FFT) {
+		for (desc_idx = 0; desc_idx < d->sw_ring_max_depth; desc_idx++) {
+			desc = q->ring_addr + desc_idx;
+			desc->req.data_ptrs[0].address = q->fcw_ring_addr_iova +
+					desc_idx * ACC_MAX_FCW_SIZE;
+		}
+	}
+
 	q->qgrp_id = (q_idx >> VRB1_GRP_ID_SHIFT) & 0xF;
 	q->vf_id = (q_idx >> VRB1_VF_ID_SHIFT)  & 0x3F;
 	q->aq_id = q_idx & 0xF;
@@ -994,6 +1013,7 @@ vrb_queue_release(struct rte_bbdev *dev, uint16_t q_id)
 	if (q != NULL) {
 		/* Mark the Queue as un-assigned. */
 		d->q_assigned_bit_map[q->qgrp_id] &= (~0ULL - (1 << (uint64_t) q->aq_id));
+		rte_free(q->fcw_ring);
 		rte_free(q->companion_ring_addr);
 		rte_free(q->lb_in);
 		rte_free(q->lb_out);
@@ -3225,7 +3245,10 @@ vrb_enqueue_fft_one_op(struct acc_queue *q, struct rte_bbdev_fft_op *op,
 	output = op->fft.base_output.data;
 	in_offset = op->fft.base_input.offset;
 	out_offset = op->fft.base_output.offset;
-	fcw = &desc->req.fcw_fft;
+
+	fcw = (struct acc_fcw_fft *) (q->fcw_ring +
+			((q->sw_ring_head + total_enqueued_cbs) & q->sw_ring_wrap_mask)
+			* ACC_MAX_FCW_SIZE);
 
 	vrb1_fcw_fft_fill(op, fcw);
 	vrb1_dma_desc_fft_fill(op, &desc->req, input, output, &in_offset, &out_offset);
-- 
2.34.1



* [PATCH v1 5/7] baseband/acc: add support for MLD operation
  2023-09-19  1:21 [PATCH v1 0/7] VRB2 BBDEV PMD introduction Nicolas Chautru
                   ` (3 preceding siblings ...)
  2023-09-19  1:21 ` [PATCH v1 4/7] baseband/acc: allocate FCW memory separately Nicolas Chautru
@ 2023-09-19  1:21 ` Nicolas Chautru
  2023-09-19  1:21 ` [PATCH v1 6/7] baseband/acc: introduce the new VRB2 variant Nicolas Chautru
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 24+ messages in thread
From: Nicolas Chautru @ 2023-09-19  1:21 UTC (permalink / raw)
  To: dev, maxime.coquelin
  Cc: hemant.agrawal, david.marchand, hernan.vargas, Nicolas Chautru

This adds no functionality for the MLD operation yet, but it
allows the unified PMD to support the operation once it is
added.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
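Note for readers of the thread: the FCW length selection in
vrb_queue_setup() becomes a four-level nested ternary with this patch.
A minimal sketch of the same lookup written as a switch follows, as an
illustration rather than a proposed change. The enum and function names
are hypothetical; the lengths are the ACC_FCW_*_BLEN values visible in
the acc_common.h hunk. The Turbo decoder length is left out because its
value does not appear in that hunk, and unlike the ternary this sketch
rejects unlisted types instead of falling through to the last value.

```c
#include <stdint.h>

/* Hypothetical operation types, for illustration only. */
enum demo_op_type {
	DEMO_OP_LDPC_DEC,
	DEMO_OP_LDPC_ENC,
	DEMO_OP_FFT,
	DEMO_OP_MLDTS,
	DEMO_OP_OTHER,
};

/* Return the FCW length in bytes, or -1 for a type not listed here. */
static inline int
demo_fcw_len(enum demo_op_type type)
{
	switch (type) {
	case DEMO_OP_LDPC_ENC:
		return 32; /* ACC_FCW_LE_BLEN */
	case DEMO_OP_LDPC_DEC:
		return 36; /* ACC_FCW_LD_BLEN */
	case DEMO_OP_FFT:
		return 28; /* ACC_FCW_FFT_BLEN */
	case DEMO_OP_MLDTS:
		return 32; /* ACC_FCW_MLDTS_BLEN */
	default:
		return -1;
	}
}
```
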
 drivers/baseband/acc/acc_common.h  |  1 +
 drivers/baseband/acc/rte_vrb_pmd.c | 39 ++++++++++++++++++++++++------
 drivers/baseband/acc/vrb_pmd.h     | 12 +++++++++
 3 files changed, 45 insertions(+), 7 deletions(-)

diff --git a/drivers/baseband/acc/acc_common.h b/drivers/baseband/acc/acc_common.h
index b5ee113faf..5de58dbe36 100644
--- a/drivers/baseband/acc/acc_common.h
+++ b/drivers/baseband/acc/acc_common.h
@@ -87,6 +87,7 @@
 #define ACC_FCW_LE_BLEN                32
 #define ACC_FCW_LD_BLEN                36
 #define ACC_FCW_FFT_BLEN               28
+#define ACC_FCW_MLDTS_BLEN             32
 #define ACC_5GUL_SIZE_0                16
 #define ACC_5GUL_SIZE_1                40
 #define ACC_5GUL_OFFSET_0              36
diff --git a/drivers/baseband/acc/rte_vrb_pmd.c b/drivers/baseband/acc/rte_vrb_pmd.c
index 78f465b25b..0a634d62f6 100644
--- a/drivers/baseband/acc/rte_vrb_pmd.c
+++ b/drivers/baseband/acc/rte_vrb_pmd.c
@@ -37,7 +37,7 @@ vrb1_queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id
 		return ((qgrp_id << 7) + (aq_id << 3) + VRB1_VfQmgrIngressAq);
 }
 
-enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, FFT, NUM_ACC};
+enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, FFT, MLD, NUM_ACC};
 
 /* Return the accelerator enum for a Queue Group Index. */
 static inline int
@@ -53,6 +53,7 @@ accFromQgid(int qg_idx, const struct rte_acc_conf *acc_conf)
 	NumQGroupsPerFn[DL_4G] = acc_conf->q_dl_4g.num_qgroups;
 	NumQGroupsPerFn[DL_5G] = acc_conf->q_dl_5g.num_qgroups;
 	NumQGroupsPerFn[FFT] = acc_conf->q_fft.num_qgroups;
+	NumQGroupsPerFn[MLD] = acc_conf->q_mld.num_qgroups;
 	for (acc = UL_4G;  acc < NUM_ACC; acc++)
 		for (qgIdx = 0; qgIdx < NumQGroupsPerFn[acc]; qgIdx++)
 			accQg[qgIndex++] = acc;
@@ -83,6 +84,9 @@ qtopFromAcc(struct rte_acc_queue_topology **qtop, int acc_enum, struct rte_acc_c
 	case FFT:
 		p_qtop = &(acc_conf->q_fft);
 		break;
+	case MLD:
+		p_qtop = &(acc_conf->q_mld);
+		break;
 	default:
 		/* NOTREACHED. */
 		rte_bbdev_log(ERR, "Unexpected error evaluating %s using %d", __func__, acc_enum);
@@ -139,6 +143,9 @@ initQTop(struct rte_acc_conf *acc_conf)
 	acc_conf->q_fft.num_aqs_per_groups = 0;
 	acc_conf->q_fft.num_qgroups = 0;
 	acc_conf->q_fft.first_qgroup_index = -1;
+	acc_conf->q_mld.num_aqs_per_groups = 0;
+	acc_conf->q_mld.num_qgroups = 0;
+	acc_conf->q_mld.first_qgroup_index = -1;
 }
 
 static inline void
@@ -250,7 +257,7 @@ fetch_acc_config(struct rte_bbdev *dev)
 	}
 
 	rte_bbdev_log_debug(
-			"%s Config LLR SIGN IN/OUT %s %s QG %u %u %u %u %u AQ %u %u %u %u %u Len %u %u %u %u %u\n",
+			"%s Config LLR SIGN IN/OUT %s %s QG %u %u %u %u %u %u AQ %u %u %u %u %u %u Len %u %u %u %u %u %u\n",
 			(d->pf_device) ? "PF" : "VF",
 			(acc_conf->input_pos_llr_1_bit) ? "POS" : "NEG",
 			(acc_conf->output_pos_llr_1_bit) ? "POS" : "NEG",
@@ -259,16 +266,19 @@ fetch_acc_config(struct rte_bbdev *dev)
 			acc_conf->q_ul_5g.num_qgroups,
 			acc_conf->q_dl_5g.num_qgroups,
 			acc_conf->q_fft.num_qgroups,
+			acc_conf->q_mld.num_qgroups,
 			acc_conf->q_ul_4g.num_aqs_per_groups,
 			acc_conf->q_dl_4g.num_aqs_per_groups,
 			acc_conf->q_ul_5g.num_aqs_per_groups,
 			acc_conf->q_dl_5g.num_aqs_per_groups,
 			acc_conf->q_fft.num_aqs_per_groups,
+			acc_conf->q_mld.num_aqs_per_groups,
 			acc_conf->q_ul_4g.aq_depth_log2,
 			acc_conf->q_dl_4g.aq_depth_log2,
 			acc_conf->q_ul_5g.aq_depth_log2,
 			acc_conf->q_dl_5g.aq_depth_log2,
-			acc_conf->q_fft.aq_depth_log2);
+			acc_conf->q_fft.aq_depth_log2,
+			acc_conf->q_mld.aq_depth_log2);
 }
 
 static inline void
@@ -332,7 +342,7 @@ vrb_check_ir(struct acc_device *acc_dev)
 
 	while (ring_data->valid) {
 		if ((ring_data->int_nb < ACC_PF_INT_DMA_DL_DESC_IRQ) || (
-				ring_data->int_nb > ACC_PF_INT_DMA_DL5G_DESC_IRQ)) {
+				ring_data->int_nb > ACC_PF_INT_DMA_MLD_DESC_IRQ)) {
 			rte_bbdev_log(WARNING, "InfoRing: ITR:%d Info:0x%x",
 					ring_data->int_nb, ring_data->detailed_info);
 			/* Initialize Info Ring entry and move forward. */
@@ -366,6 +376,7 @@ vrb_dev_interrupt_handler(void *cb_arg)
 			case ACC_PF_INT_DMA_FFT_DESC_IRQ:
 			case ACC_PF_INT_DMA_UL5G_DESC_IRQ:
 			case ACC_PF_INT_DMA_DL5G_DESC_IRQ:
+			case ACC_PF_INT_DMA_MLD_DESC_IRQ:
 				deq_intr_det.queue_id = get_queue_id_from_ring_info(
 						dev->data, *ring_data);
 				if (deq_intr_det.queue_id == UINT16_MAX) {
@@ -393,6 +404,7 @@ vrb_dev_interrupt_handler(void *cb_arg)
 			case ACC_VF_INT_DMA_FFT_DESC_IRQ:
 			case ACC_VF_INT_DMA_UL5G_DESC_IRQ:
 			case ACC_VF_INT_DMA_DL5G_DESC_IRQ:
+			case ACC_VF_INT_DMA_MLD_DESC_IRQ:
 				/* VFs are not aware of their vf_id - it's set to 0.  */
 				ring_data->vf_id = 0;
 				deq_intr_det.queue_id = get_queue_id_from_ring_info(
@@ -741,7 +753,7 @@ vrb_find_free_queue_idx(struct rte_bbdev *dev,
 		const struct rte_bbdev_queue_conf *conf)
 {
 	struct acc_device *d = dev->data->dev_private;
-	int op_2_acc[6] = {0, UL_4G, DL_4G, UL_5G, DL_5G, FFT};
+	int op_2_acc[7] = {0, UL_4G, DL_4G, UL_5G, DL_5G, FFT, MLD};
 	int acc = op_2_acc[conf->op_type];
 	struct rte_acc_queue_topology *qtop = NULL;
 	uint16_t group_idx;
@@ -804,7 +816,8 @@ vrb_queue_setup(struct rte_bbdev *dev, uint16_t queue_id,
 	int fcw_len = (conf->op_type == RTE_BBDEV_OP_LDPC_ENC ?
 			ACC_FCW_LE_BLEN : (conf->op_type == RTE_BBDEV_OP_TURBO_DEC ?
 			ACC_FCW_TD_BLEN : (conf->op_type == RTE_BBDEV_OP_LDPC_DEC ?
-			ACC_FCW_LD_BLEN : ACC_FCW_FFT_BLEN)));
+			ACC_FCW_LD_BLEN : (conf->op_type == RTE_BBDEV_OP_FFT ?
+			ACC_FCW_FFT_BLEN : ACC_FCW_MLDTS_BLEN))));
 
 	for (desc_idx = 0; desc_idx < d->sw_ring_max_depth; desc_idx++) {
 		desc = q->ring_addr + desc_idx;
@@ -916,6 +929,8 @@ vrb_queue_setup(struct rte_bbdev *dev, uint16_t queue_id,
 		q->aq_depth = (1 << d->acc_conf.q_dl_5g.aq_depth_log2);
 	else if (conf->op_type ==  RTE_BBDEV_OP_FFT)
 		q->aq_depth = (1 << d->acc_conf.q_fft.aq_depth_log2);
+	else if (conf->op_type ==  RTE_BBDEV_OP_MLDTS)
+		q->aq_depth = (1 << d->acc_conf.q_mld.aq_depth_log2);
 
 	q->mmio_reg_enqueue = RTE_PTR_ADD(d->mmio_base,
 			d->queue_offset(d->pf_device, q->vf_id, q->qgrp_id, q->aq_id));
@@ -972,6 +987,13 @@ vrb_print_op(struct rte_bbdev_dec_op *op, enum rte_bbdev_op_type op_type,
 			op_dl->ldpc_enc.n_filler, op_dl->ldpc_enc.cb_params.e,
 			op_dl->ldpc_enc.op_flags, op_dl->ldpc_enc.rv_index
 			);
+	} else if (op_type == RTE_BBDEV_OP_MLDTS) {
+		struct rte_bbdev_mldts_op *op_mldts = (struct rte_bbdev_mldts_op *) op;
+		rte_bbdev_log(INFO, "  Op MLD %d RBs %d NL %d Rp %d %d %x\n",
+				index,
+				op_mldts->mldts.num_rbs, op_mldts->mldts.num_layers,
+				op_mldts->mldts.r_rep,
+				op_mldts->mldts.c_rep, op_mldts->mldts.op_flags);
 	}
 }
 
@@ -1152,13 +1174,16 @@ vrb_dev_info_get(struct rte_bbdev *dev, struct rte_bbdev_driver_info *dev_info)
 			d->acc_conf.q_dl_5g.num_qgroups;
 	dev_info->num_queues[RTE_BBDEV_OP_FFT] = d->acc_conf.q_fft.num_aqs_per_groups *
 			d->acc_conf.q_fft.num_qgroups;
+	dev_info->num_queues[RTE_BBDEV_OP_MLDTS] = d->acc_conf.q_mld.num_aqs_per_groups *
+			d->acc_conf.q_mld.num_qgroups;
 	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_DEC] = d->acc_conf.q_ul_4g.num_qgroups;
 	dev_info->queue_priority[RTE_BBDEV_OP_TURBO_ENC] = d->acc_conf.q_dl_4g.num_qgroups;
 	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = d->acc_conf.q_ul_5g.num_qgroups;
 	dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = d->acc_conf.q_dl_5g.num_qgroups;
 	dev_info->queue_priority[RTE_BBDEV_OP_FFT] = d->acc_conf.q_fft.num_qgroups;
+	dev_info->queue_priority[RTE_BBDEV_OP_MLDTS] = d->acc_conf.q_mld.num_qgroups;
 	dev_info->max_num_queues = 0;
-	for (i = RTE_BBDEV_OP_NONE; i <= RTE_BBDEV_OP_FFT; i++)
+	for (i = RTE_BBDEV_OP_NONE; i <= RTE_BBDEV_OP_MLDTS; i++)
 		dev_info->max_num_queues += dev_info->num_queues[i];
 	dev_info->queue_size_lim = ACC_MAX_QUEUE_DEPTH;
 	dev_info->hardware_accelerated = true;
diff --git a/drivers/baseband/acc/vrb_pmd.h b/drivers/baseband/acc/vrb_pmd.h
index 01028273e7..1cabc0b7f4 100644
--- a/drivers/baseband/acc/vrb_pmd.h
+++ b/drivers/baseband/acc/vrb_pmd.h
@@ -101,6 +101,8 @@ struct acc_registry_addr {
 	unsigned int dma_ring_ul4g_lo;
 	unsigned int dma_ring_fft_hi;
 	unsigned int dma_ring_fft_lo;
+	unsigned int dma_ring_mld_hi;
+	unsigned int dma_ring_mld_lo;
 	unsigned int ring_size;
 	unsigned int info_ring_hi;
 	unsigned int info_ring_lo;
@@ -116,6 +118,8 @@ struct acc_registry_addr {
 	unsigned int tail_ptrs_ul4g_lo;
 	unsigned int tail_ptrs_fft_hi;
 	unsigned int tail_ptrs_fft_lo;
+	unsigned int tail_ptrs_mld_hi;
+	unsigned int tail_ptrs_mld_lo;
 	unsigned int depth_log0_offset;
 	unsigned int depth_log1_offset;
 	unsigned int qman_group_func;
@@ -140,6 +144,8 @@ static const struct acc_registry_addr vrb1_pf_reg_addr = {
 	.dma_ring_ul4g_lo = VRB1_PfDmaFec4GulDescBaseLoRegVf,
 	.dma_ring_fft_hi = VRB1_PfDmaFftDescBaseHiRegVf,
 	.dma_ring_fft_lo = VRB1_PfDmaFftDescBaseLoRegVf,
+	.dma_ring_mld_hi = 0,
+	.dma_ring_mld_lo = 0,
 	.ring_size =      VRB1_PfQmgrRingSizeVf,
 	.info_ring_hi = VRB1_PfHiInfoRingBaseHiRegPf,
 	.info_ring_lo = VRB1_PfHiInfoRingBaseLoRegPf,
@@ -155,6 +161,8 @@ static const struct acc_registry_addr vrb1_pf_reg_addr = {
 	.tail_ptrs_ul4g_lo = VRB1_PfDmaFec4GulRespPtrLoRegVf,
 	.tail_ptrs_fft_hi = VRB1_PfDmaFftRespPtrHiRegVf,
 	.tail_ptrs_fft_lo = VRB1_PfDmaFftRespPtrLoRegVf,
+	.tail_ptrs_mld_hi = 0,
+	.tail_ptrs_mld_lo = 0,
 	.depth_log0_offset = VRB1_PfQmgrGrpDepthLog20Vf,
 	.depth_log1_offset = VRB1_PfQmgrGrpDepthLog21Vf,
 	.qman_group_func = VRB1_PfQmgrGrpFunction0,
@@ -179,6 +187,8 @@ static const struct acc_registry_addr vrb1_vf_reg_addr = {
 	.dma_ring_ul4g_lo = VRB1_VfDmaFec4GulDescBaseLoRegVf,
 	.dma_ring_fft_hi = VRB1_VfDmaFftDescBaseHiRegVf,
 	.dma_ring_fft_lo = VRB1_VfDmaFftDescBaseLoRegVf,
+	.dma_ring_mld_hi = 0,
+	.dma_ring_mld_lo = 0,
 	.ring_size = VRB1_VfQmgrRingSizeVf,
 	.info_ring_hi = VRB1_VfHiInfoRingBaseHiVf,
 	.info_ring_lo = VRB1_VfHiInfoRingBaseLoVf,
@@ -194,6 +204,8 @@ static const struct acc_registry_addr vrb1_vf_reg_addr = {
 	.tail_ptrs_ul4g_lo = VRB1_VfDmaFec4GulRespPtrLoRegVf,
 	.tail_ptrs_fft_hi = VRB1_VfDmaFftRespPtrHiRegVf,
 	.tail_ptrs_fft_lo = VRB1_VfDmaFftRespPtrLoRegVf,
+	.tail_ptrs_mld_hi = 0,
+	.tail_ptrs_mld_lo = 0,
 	.depth_log0_offset = VRB1_VfQmgrGrpDepthLog20Vf,
 	.depth_log1_offset = VRB1_VfQmgrGrpDepthLog21Vf,
 	.qman_group_func = VRB1_VfQmgrGrpFunction0Vf,
-- 
2.34.1



* [PATCH v1 6/7] baseband/acc: introduce the new VRB2 variant
  2023-09-19  1:21 [PATCH v1 0/7] VRB2 BBDEV PMD introduction Nicolas Chautru
                   ` (4 preceding siblings ...)
  2023-09-19  1:21 ` [PATCH v1 5/7] baseband/acc: add support for MLD operation Nicolas Chautru
@ 2023-09-19  1:21 ` Nicolas Chautru
  2023-09-19  1:21 ` [PATCH v1 7/7] baseband/acc: add configure helper for VRB2 Nicolas Chautru
  2023-09-21  7:25 ` [PATCH v1 0/7] VRB2 BBDEV PMD introduction David Marchand
  7 siblings, 0 replies; 24+ messages in thread
From: Nicolas Chautru @ 2023-09-19  1:21 UTC (permalink / raw)
  To: dev, maxime.coquelin
  Cc: hemant.agrawal, david.marchand, hernan.vargas, Nicolas Chautru

This extends the unified driver to support both the
VRB1 and VRB2 implementations of Intel vRAN Boost.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
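Note for readers of the thread: two pieces of bit arithmetic recur in
the VRB2 paths, and a minimal sketch of both follows. The queue offset
uses the shifts from vrb2_queue_offset() in this patch, with the
register base passed in as a parameter because the VRB2_*QmgrIngressAq
values are not reproduced here. The group-function decode assumes that
the four registers read at +0, +4, +8 and +12 each hold eight 4-bit
entries, of which the low three bits are used. That generalises the one
branch quoted in the diff and is an assumption. All `demo_` names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_QGRPS_PER_WORD 8u /* mirrors ACC_NUM_QGRPS_PER_WORD */
#define DEMO_NUM_WORDS      4u /* four function registers are read */

/* Queue register offset; only the PF view includes the VF index. */
static inline uint32_t
demo_vrb2_queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id,
		uint16_t aq_id, uint32_t base)
{
	uint32_t off = ((uint32_t)qgrp_id << 9) + ((uint32_t)aq_id << 3) + base;

	if (pf_device)
		off += (uint32_t)vf_id << 14;
	return off;
}

/* Extract the 3-bit function index of queue group qg from the register set. */
static inline uint32_t
demo_qgroup_func(const uint32_t regs[DEMO_NUM_WORDS], uint8_t qg)
{
	assert(qg < DEMO_QGRPS_PER_WORD * DEMO_NUM_WORDS);
	return (regs[qg / DEMO_QGRPS_PER_WORD] >>
			((qg % DEMO_QGRPS_PER_WORD) * 4)) & 0x7;
}
```
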
 drivers/baseband/acc/acc_common.h     |   84 +-
 drivers/baseband/acc/rte_acc100_pmd.c |    4 +-
 drivers/baseband/acc/rte_vrb_pmd.c    | 1441 ++++++++++++++++++++++---
 drivers/baseband/acc/vrb1_pf_enum.h   |   17 +-
 drivers/baseband/acc/vrb2_pf_enum.h   |  124 +++
 drivers/baseband/acc/vrb2_vf_enum.h   |  121 +++
 drivers/baseband/acc/vrb_pmd.h        |  161 ++-
 7 files changed, 1789 insertions(+), 163 deletions(-)
 create mode 100644 drivers/baseband/acc/vrb2_pf_enum.h
 create mode 100644 drivers/baseband/acc/vrb2_vf_enum.h

diff --git a/drivers/baseband/acc/acc_common.h b/drivers/baseband/acc/acc_common.h
index 5de58dbe36..56578c43ba 100644
--- a/drivers/baseband/acc/acc_common.h
+++ b/drivers/baseband/acc/acc_common.h
@@ -18,6 +18,7 @@
 #define ACC_DMA_BLKID_OUT_HARQ      3
 #define ACC_DMA_BLKID_IN_HARQ       3
 #define ACC_DMA_BLKID_IN_MLD_R      3
+#define ACC_DMA_BLKID_DEWIN_IN      3
 
 /* Values used in filling in decode FCWs */
 #define ACC_FCW_TD_VER              1
@@ -103,6 +104,9 @@
 #define ACC_MAX_NUM_QGRPS              32
 #define ACC_RING_SIZE_GRANULARITY      64
 #define ACC_MAX_FCW_SIZE              128
+#define ACC_IQ_SIZE                    4
+
+#define ACC_FCW_FFT_BLEN_3             28
 
 /* Constants from K0 computation from 3GPP 38.212 Table 5.4.2.1-2 */
 #define ACC_N_ZC_1 66 /* N = 66 Zc for BG 1 */
@@ -132,6 +136,11 @@
 #define ACC_LIM_21 14 /* 0.21 */
 #define ACC_LIM_31 20 /* 0.31 */
 #define ACC_MAX_E (128 * 1024 - 2)
+#define ACC_MAX_CS 12
+
+#define ACC100_VARIANT          0
+#define VRB1_VARIANT		2
+#define VRB2_VARIANT		3
 
 /* Helper macro for logging */
 #define rte_acc_log(level, fmt, ...) \
@@ -332,6 +341,37 @@ struct __rte_packed acc_fcw_fft {
 		res:19;
 };
 
+/* FFT Frame Control Word */
+struct __rte_packed acc_fcw_fft_3 {
+	uint32_t in_frame_size:16,
+		leading_pad_size:16;
+	uint32_t out_frame_size:16,
+		leading_depad_size:16;
+	uint32_t cs_window_sel;
+	uint32_t cs_window_sel2:16,
+		cs_enable_bmap:16;
+	uint32_t num_antennas:8,
+		idft_size:8,
+		dft_size:8,
+		cs_offset:8;
+	uint32_t idft_shift:8,
+		dft_shift:8,
+		cs_multiplier:16;
+	uint32_t bypass:2,
+		fp16_in:1, /* Not supported in VRB1 */
+		fp16_out:1,
+		exp_adj:4,
+		power_shift:4,
+		power_en:1,
+		enable_dewin:1,
+		freq_resample_mode:2,
+		depad_ouput_size:16;
+	uint16_t cs_theta_0[ACC_MAX_CS];
+	uint32_t cs_theta_d[ACC_MAX_CS];
+	int8_t cs_time_offset[ACC_MAX_CS];
+};
+
+
 /* MLD-TS Frame Control Word */
 struct __rte_packed acc_fcw_mldts {
 	uint32_t fcw_version:4,
@@ -473,14 +513,14 @@ union acc_info_ring_data {
 		uint16_t valid: 1;
 	};
 	struct {
-		uint32_t aq_id_3: 6;
-		uint32_t qg_id_3: 5;
-		uint32_t vf_id_3: 6;
-		uint32_t int_nb_3: 6;
-		uint32_t msi_0_3: 1;
-		uint32_t vf2pf_3: 6;
-		uint32_t loop_3: 1;
-		uint32_t valid_3: 1;
+		uint32_t aq_id_vrb2: 6;
+		uint32_t qg_id_vrb2: 5;
+		uint32_t vf_id_vrb2: 6;
+		uint32_t int_nb_vrb2: 6;
+		uint32_t msi_0_vrb2: 1;
+		uint32_t vf2pf_vrb2: 6;
+		uint32_t loop_vrb2: 1;
+		uint32_t valid_vrb2: 1;
 	};
 } __rte_packed;
 
@@ -765,16 +805,20 @@ alloc_sw_rings_min_mem(struct rte_bbdev *dev, struct acc_device *d,
  */
 static inline uint16_t
 get_queue_id_from_ring_info(struct rte_bbdev_data *data,
-		const union acc_info_ring_data ring_data)
+		const union acc_info_ring_data ring_data, uint16_t device_variant)
 {
 	uint16_t queue_id;
+	struct acc_queue *acc_q;
+	uint16_t vf_id = (device_variant == VRB2_VARIANT) ? ring_data.vf_id_vrb2 : ring_data.vf_id;
+	uint16_t aq_id = (device_variant == VRB2_VARIANT) ? ring_data.aq_id_vrb2 : ring_data.aq_id;
+	uint16_t qg_id = (device_variant == VRB2_VARIANT) ? ring_data.qg_id_vrb2 : ring_data.qg_id;
 
 	for (queue_id = 0; queue_id < data->num_queues; ++queue_id) {
-		struct acc_queue *acc_q =
-				data->queues[queue_id].queue_private;
-		if (acc_q != NULL && acc_q->aq_id == ring_data.aq_id &&
-				acc_q->qgrp_id == ring_data.qg_id &&
-				acc_q->vf_id == ring_data.vf_id)
+		acc_q = data->queues[queue_id].queue_private;
+
+		if (acc_q != NULL && acc_q->aq_id == aq_id &&
+				acc_q->qgrp_id == qg_id &&
+				acc_q->vf_id == vf_id)
 			return queue_id;
 	}
 
@@ -1436,4 +1480,16 @@ get_num_cbs_in_tb_ldpc_enc(struct rte_bbdev_op_ldpc_enc *ldpc_enc)
 	return cbs_in_tb;
 }
 
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+static inline void
+acc_memdump(const char *string, void *buf, uint16_t bytes)
+{
+	printf("%s\n", string);
+	uint32_t *data = buf;
+	uint16_t i;
+	for (i = 0; i < bytes / 4; i++)
+		printf("0x%08X\n", data[i]);
+}
+#endif
+
 #endif /* _ACC_COMMON_H_ */
diff --git a/drivers/baseband/acc/rte_acc100_pmd.c b/drivers/baseband/acc/rte_acc100_pmd.c
index 5362d39c30..7f8d05b5a9 100644
--- a/drivers/baseband/acc/rte_acc100_pmd.c
+++ b/drivers/baseband/acc/rte_acc100_pmd.c
@@ -294,7 +294,7 @@ acc100_pf_interrupt_handler(struct rte_bbdev *dev)
 		case ACC100_PF_INT_DMA_UL5G_DESC_IRQ:
 		case ACC100_PF_INT_DMA_DL5G_DESC_IRQ:
 			deq_intr_det.queue_id = get_queue_id_from_ring_info(
-					dev->data, *ring_data);
+					dev->data, *ring_data, acc100_dev->device_variant);
 			if (deq_intr_det.queue_id == UINT16_MAX) {
 				rte_bbdev_log(ERR,
 						"Couldn't find queue: aq_id: %u, qg_id: %u, vf_id: %u",
@@ -348,7 +348,7 @@ acc100_vf_interrupt_handler(struct rte_bbdev *dev)
 			 */
 			ring_data->vf_id = 0;
 			deq_intr_det.queue_id = get_queue_id_from_ring_info(
-					dev->data, *ring_data);
+					dev->data, *ring_data, acc100_dev->device_variant);
 			if (deq_intr_det.queue_id == UINT16_MAX) {
 				rte_bbdev_log(ERR,
 						"Couldn't find queue: aq_id: %u, qg_id: %u",
diff --git a/drivers/baseband/acc/rte_vrb_pmd.c b/drivers/baseband/acc/rte_vrb_pmd.c
index 0a634d62f6..36d2c8173d 100644
--- a/drivers/baseband/acc/rte_vrb_pmd.c
+++ b/drivers/baseband/acc/rte_vrb_pmd.c
@@ -37,6 +37,15 @@ vrb1_queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id
 		return ((qgrp_id << 7) + (aq_id << 3) + VRB1_VfQmgrIngressAq);
 }
 
+static inline uint32_t
+vrb2_queue_offset(bool pf_device, uint8_t vf_id, uint8_t qgrp_id, uint16_t aq_id)
+{
+	if (pf_device)
+		return ((vf_id << 14) + (qgrp_id << 9) + (aq_id << 3) + VRB2_PfQmgrIngressAq);
+	else
+		return ((qgrp_id << 9) + (aq_id << 3) + VRB2_VfQmgrIngressAq);
+}
+
 enum {UL_4G = 0, UL_5G, DL_4G, DL_5G, FFT, MLD, NUM_ACC};
 
 /* Return the accelerator enum for a Queue Group Index. */
@@ -197,7 +206,7 @@ fetch_acc_config(struct rte_bbdev *dev)
 	struct acc_device *d = dev->data->dev_private;
 	struct rte_acc_conf *acc_conf = &d->acc_conf;
 	uint8_t acc, qg;
-	uint32_t reg_aq, reg_len0, reg_len1, reg0, reg1;
+	uint32_t reg_aq, reg_len0, reg_len1, reg_len2, reg_len3, reg0, reg1, reg2, reg3;
 	uint32_t reg_mode, idx;
 	struct rte_acc_queue_topology *q_top = NULL;
 	int qman_func_id[VRB_NUM_ACCS] = {ACC_ACCMAP_0, ACC_ACCMAP_1,
@@ -219,32 +228,81 @@ fetch_acc_config(struct rte_bbdev *dev)
 	acc_conf->num_vf_bundles = 1;
 	initQTop(acc_conf);
 
-	reg0 = acc_reg_read(d, d->reg_addr->qman_group_func);
-	reg1 = acc_reg_read(d, d->reg_addr->qman_group_func + 4);
-	for (qg = 0; qg < d->num_qgroups; qg++) {
-		reg_aq = acc_reg_read(d, d->queue_offset(d->pf_device, 0, qg, 0));
-		if (reg_aq & ACC_QUEUE_ENABLE) {
-			if (qg < ACC_NUM_QGRPS_PER_WORD)
-				idx = (reg0 >> (qg * 4)) & 0x7;
+	if (d->device_variant == VRB1_VARIANT) {
+		reg0 = acc_reg_read(d, d->reg_addr->qman_group_func);
+		reg1 = acc_reg_read(d, d->reg_addr->qman_group_func + 4);
+		for (qg = 0; qg < d->num_qgroups; qg++) {
+			reg_aq = acc_reg_read(d, d->queue_offset(d->pf_device, 0, qg, 0));
+			if (reg_aq & ACC_QUEUE_ENABLE) {
+				if (qg < ACC_NUM_QGRPS_PER_WORD)
+					idx = (reg0 >> (qg * 4)) & 0x7;
+				else
+					idx = (reg1 >> ((qg - ACC_NUM_QGRPS_PER_WORD) * 4)) & 0x7;
+				if (idx < VRB1_NUM_ACCS) {
+					acc = qman_func_id[idx];
+					updateQtop(acc, qg, acc_conf, d);
+				}
+			}
+		}
+
+		/* Check the depth of the AQs. */
+		reg_len0 = acc_reg_read(d, d->reg_addr->depth_log0_offset);
+		reg_len1 = acc_reg_read(d, d->reg_addr->depth_log1_offset);
+		for (acc = 0; acc < NUM_ACC; acc++) {
+			qtopFromAcc(&q_top, acc, acc_conf);
+			if (q_top->first_qgroup_index < ACC_NUM_QGRPS_PER_WORD)
+				q_top->aq_depth_log2 =
+						(reg_len0 >> (q_top->first_qgroup_index * 4)) & 0xF;
 			else
-				idx = (reg1 >> ((qg - ACC_NUM_QGRPS_PER_WORD) * 4)) & 0x7;
-			if (idx < VRB1_NUM_ACCS) {
-				acc = qman_func_id[idx];
-				updateQtop(acc, qg, acc_conf, d);
+				q_top->aq_depth_log2 = (reg_len1 >> ((q_top->first_qgroup_index -
+						ACC_NUM_QGRPS_PER_WORD) * 4)) & 0xF;
+		}
+	} else {
+		reg0 = acc_reg_read(d, d->reg_addr->qman_group_func);
+		reg1 = acc_reg_read(d, d->reg_addr->qman_group_func + 4);
+		reg2 = acc_reg_read(d, d->reg_addr->qman_group_func + 8);
+		reg3 = acc_reg_read(d, d->reg_addr->qman_group_func + 12);
+		/* Each register holds one 4-bit function field per queue group. */
+		for (qg = 0; qg < VRB2_NUM_QGRPS; qg++) {
+			reg_aq = acc_reg_read(d, vrb2_queue_offset(d->pf_device, 0, qg, 0));
+			if (reg_aq & ACC_QUEUE_ENABLE) {
+				/* Select the register word and 4-bit field for this queue group. */
+				if (qg / ACC_NUM_QGRPS_PER_WORD == 0)
+					idx = (reg0 >> ((qg % ACC_NUM_QGRPS_PER_WORD) * 4)) & 0x7;
+				else if (qg / ACC_NUM_QGRPS_PER_WORD == 1)
+					idx = (reg1 >> ((qg % ACC_NUM_QGRPS_PER_WORD) * 4)) & 0x7;
+				else if (qg / ACC_NUM_QGRPS_PER_WORD == 2)
+					idx = (reg2 >> ((qg % ACC_NUM_QGRPS_PER_WORD) * 4)) & 0x7;
+				else
+					idx = (reg3 >> ((qg % ACC_NUM_QGRPS_PER_WORD) * 4)) & 0x7;
+				if (idx < VRB_NUM_ACCS) {
+					acc = qman_func_id[idx];
+					updateQtop(acc, qg, acc_conf, d);
+				}
 			}
 		}
-	}
 
-	/* Check the depth of the AQs. */
-	reg_len0 = acc_reg_read(d, d->reg_addr->depth_log0_offset);
-	reg_len1 = acc_reg_read(d, d->reg_addr->depth_log1_offset);
-	for (acc = 0; acc < NUM_ACC; acc++) {
-		qtopFromAcc(&q_top, acc, acc_conf);
-		if (q_top->first_qgroup_index < ACC_NUM_QGRPS_PER_WORD)
-			q_top->aq_depth_log2 = (reg_len0 >> (q_top->first_qgroup_index * 4)) & 0xF;
-		else
-			q_top->aq_depth_log2 = (reg_len1 >> ((q_top->first_qgroup_index -
-					ACC_NUM_QGRPS_PER_WORD) * 4)) & 0xF;
+		/* Check the depth of the AQs. */
+		reg_len0 = acc_reg_read(d, d->reg_addr->depth_log0_offset);
+		reg_len1 = acc_reg_read(d, d->reg_addr->depth_log0_offset + 4);
+		reg_len2 = acc_reg_read(d, d->reg_addr->depth_log0_offset + 8);
+		reg_len3 = acc_reg_read(d, d->reg_addr->depth_log0_offset + 12);
+
+		for (acc = 0; acc < NUM_ACC; acc++) {
+			qtopFromAcc(&q_top, acc, acc_conf);
+			if (q_top->first_qgroup_index / ACC_NUM_QGRPS_PER_WORD == 0)
+				q_top->aq_depth_log2 = (reg_len0 >> ((q_top->first_qgroup_index %
+						ACC_NUM_QGRPS_PER_WORD) * 4)) & 0xF;
+			else if (q_top->first_qgroup_index / ACC_NUM_QGRPS_PER_WORD == 1)
+				q_top->aq_depth_log2 = (reg_len1 >> ((q_top->first_qgroup_index %
+						ACC_NUM_QGRPS_PER_WORD) * 4)) & 0xF;
+			else if (q_top->first_qgroup_index / ACC_NUM_QGRPS_PER_WORD == 2)
+				q_top->aq_depth_log2 = (reg_len2 >> ((q_top->first_qgroup_index %
+						ACC_NUM_QGRPS_PER_WORD) * 4)) & 0xF;
+			else
+				q_top->aq_depth_log2 = (reg_len3 >> ((q_top->first_qgroup_index %
+						ACC_NUM_QGRPS_PER_WORD) * 4)) & 0xF;
+		}
 	}
 
 	/* Read PF mode. */
@@ -341,18 +399,29 @@ vrb_check_ir(struct acc_device *acc_dev)
 	ring_data = acc_dev->info_ring + (acc_dev->info_ring_head & ACC_INFO_RING_MASK);
 
 	while (ring_data->valid) {
-		if ((ring_data->int_nb < ACC_PF_INT_DMA_DL_DESC_IRQ) || (
-				ring_data->int_nb > ACC_PF_INT_DMA_MLD_DESC_IRQ)) {
-			rte_bbdev_log(WARNING, "InfoRing: ITR:%d Info:0x%x",
-					ring_data->int_nb, ring_data->detailed_info);
-			/* Initialize Info Ring entry and move forward. */
-			ring_data->val = 0;
+		if (acc_dev->device_variant == VRB1_VARIANT) {
+			if ((ring_data->int_nb < ACC_PF_INT_DMA_DL_DESC_IRQ) || (
+					ring_data->int_nb > ACC_PF_INT_DMA_MLD_DESC_IRQ)) {
+				rte_bbdev_log(WARNING, "InfoRing: ITR:%d Info:0x%x",
+						ring_data->int_nb, ring_data->detailed_info);
+				/* Initialize Info Ring entry and move forward. */
+				ring_data->val = 0;
+			}
+		} else { /* VRB2_VARIANT */
+			if ((ring_data->int_nb_vrb2 < ACC_PF_INT_DMA_DL_DESC_IRQ) || (
+					ring_data->int_nb_vrb2 > ACC_PF_INT_DMA_MLD_DESC_IRQ)) {
+				rte_bbdev_log(WARNING, "InfoRing: ITR:%d Info:0x%x",
+						ring_data->int_nb_vrb2, ring_data->val);
+				/* Initialize Info Ring entry and move forward. */
+				ring_data->val = 0;
+			}
 		}
 		info_ring_head++;
 		ring_data = acc_dev->info_ring + (info_ring_head & ACC_INFO_RING_MASK);
 	}
 }
 
+
 /* Interrupt handler triggered by dev for handling specific interrupt. */
 static void
 vrb_dev_interrupt_handler(void *cb_arg)
@@ -361,16 +430,22 @@ vrb_dev_interrupt_handler(void *cb_arg)
 	struct acc_device *acc_dev = dev->data->dev_private;
 	volatile union acc_info_ring_data *ring_data;
 	struct acc_deq_intr_details deq_intr_det;
+	uint16_t vf_id, aq_id, qg_id, int_nb;
+	bool isVrb1 = (acc_dev->device_variant == VRB1_VARIANT);
 
 	ring_data = acc_dev->info_ring + (acc_dev->info_ring_head & ACC_INFO_RING_MASK);
 
 	while (ring_data->valid) {
+		vf_id = isVrb1 ? ring_data->vf_id : ring_data->vf_id_vrb2;
+		aq_id = isVrb1 ? ring_data->aq_id : ring_data->aq_id_vrb2;
+		qg_id = isVrb1 ? ring_data->qg_id : ring_data->qg_id_vrb2;
+		int_nb = isVrb1 ? ring_data->int_nb : ring_data->int_nb_vrb2;
 		if (acc_dev->pf_device) {
 			rte_bbdev_log_debug(
-					"VRB1 PF Interrupt received, Info Ring data: 0x%x -> %d",
-					ring_data->val, ring_data->int_nb);
+					"PF Interrupt received, Info Ring data: 0x%x -> %d",
+					ring_data->val, int_nb);
 
-			switch (ring_data->int_nb) {
+			switch (int_nb) {
 			case ACC_PF_INT_DMA_DL_DESC_IRQ:
 			case ACC_PF_INT_DMA_UL_DESC_IRQ:
 			case ACC_PF_INT_DMA_FFT_DESC_IRQ:
@@ -378,13 +453,11 @@ vrb_dev_interrupt_handler(void *cb_arg)
 			case ACC_PF_INT_DMA_DL5G_DESC_IRQ:
 			case ACC_PF_INT_DMA_MLD_DESC_IRQ:
 				deq_intr_det.queue_id = get_queue_id_from_ring_info(
-						dev->data, *ring_data);
+						dev->data, *ring_data, acc_dev->device_variant);
 				if (deq_intr_det.queue_id == UINT16_MAX) {
 					rte_bbdev_log(ERR,
 							"Couldn't find queue: aq_id: %u, qg_id: %u, vf_id: %u",
-							ring_data->aq_id,
-							ring_data->qg_id,
-							ring_data->vf_id);
+							aq_id, qg_id, vf_id);
 					return;
 				}
 				rte_bbdev_pmd_callback_process(dev,
@@ -396,9 +469,9 @@ vrb_dev_interrupt_handler(void *cb_arg)
 			}
 		} else {
 			rte_bbdev_log_debug(
-					"VRB1 VF Interrupt received, Info Ring data: 0x%x\n",
+					"VRB VF Interrupt received, Info Ring data: 0x%x\n",
 					ring_data->val);
-			switch (ring_data->int_nb) {
+			switch (int_nb) {
 			case ACC_VF_INT_DMA_DL_DESC_IRQ:
 			case ACC_VF_INT_DMA_UL_DESC_IRQ:
 			case ACC_VF_INT_DMA_FFT_DESC_IRQ:
@@ -406,14 +479,16 @@ vrb_dev_interrupt_handler(void *cb_arg)
 			case ACC_VF_INT_DMA_DL5G_DESC_IRQ:
 			case ACC_VF_INT_DMA_MLD_DESC_IRQ:
 				/* VFs are not aware of their vf_id - it's set to 0.  */
-				ring_data->vf_id = 0;
+				if (acc_dev->device_variant == VRB1_VARIANT)
+					ring_data->vf_id = 0;
+				else
+					ring_data->vf_id_vrb2 = 0;
 				deq_intr_det.queue_id = get_queue_id_from_ring_info(
-						dev->data, *ring_data);
+						dev->data, *ring_data, acc_dev->device_variant);
 				if (deq_intr_det.queue_id == UINT16_MAX) {
 					rte_bbdev_log(ERR,
 							"Couldn't find queue: aq_id: %u, qg_id: %u",
-							ring_data->aq_id,
-							ring_data->qg_id);
+							aq_id, qg_id);
 					return;
 				}
 				rte_bbdev_pmd_callback_process(dev,
@@ -428,8 +503,7 @@ vrb_dev_interrupt_handler(void *cb_arg)
 		/* Initialize Info Ring entry and move forward. */
 		ring_data->val = 0;
 		++acc_dev->info_ring_head;
-		ring_data = acc_dev->info_ring +
-				(acc_dev->info_ring_head & ACC_INFO_RING_MASK);
+		ring_data = acc_dev->info_ring + (acc_dev->info_ring_head & ACC_INFO_RING_MASK);
 	}
 }
 
@@ -461,7 +535,10 @@ allocate_info_ring(struct rte_bbdev *dev)
 	phys_low  = (uint32_t)(info_ring_iova);
 	acc_reg_write(d, d->reg_addr->info_ring_hi, phys_high);
 	acc_reg_write(d, d->reg_addr->info_ring_lo, phys_low);
-	acc_reg_write(d, d->reg_addr->info_ring_en, VRB1_REG_IRQ_EN_ALL);
+	if (d->device_variant == VRB1_VARIANT)
+		acc_reg_write(d, d->reg_addr->info_ring_en, VRB1_REG_IRQ_EN_ALL);
+	else
+		acc_reg_write(d, d->reg_addr->info_ring_en, VRB2_REG_IRQ_EN_ALL);
 	d->info_ring_head = (acc_reg_read(d, d->reg_addr->info_ring_ptr) &
 			0xFFF) / sizeof(union acc_info_ring_data);
 	return 0;
@@ -516,6 +593,7 @@ vrb_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
 	phys_high = (uint32_t)(d->sw_rings_iova >> 32);
 	phys_low  = (uint32_t)(d->sw_rings_iova & ~(ACC_SIZE_64MBYTE-1));
 
+
 	/* Read the populated cfg from device registers. */
 	fetch_acc_config(dev);
 
@@ -540,6 +618,10 @@ vrb_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
 	acc_reg_write(d, d->reg_addr->dma_ring_dl4g_lo, phys_low);
 	acc_reg_write(d, d->reg_addr->dma_ring_fft_hi, phys_high);
 	acc_reg_write(d, d->reg_addr->dma_ring_fft_lo, phys_low);
+	if (d->device_variant == VRB2_VARIANT) {
+		acc_reg_write(d, d->reg_addr->dma_ring_mld_hi, phys_high);
+		acc_reg_write(d, d->reg_addr->dma_ring_mld_lo, phys_low);
+	}
 	/*
 	 * Configure Ring Size to the max queue ring size
 	 * (used for wrapping purpose).
@@ -549,8 +631,7 @@ vrb_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
 
 	/* Configure tail pointer for use when SDONE enabled. */
 	if (d->tail_ptrs == NULL)
-		d->tail_ptrs = rte_zmalloc_socket(
-				dev->device->driver->name,
+		d->tail_ptrs = rte_zmalloc_socket(dev->device->driver->name,
 				VRB_MAX_QGRPS * VRB_MAX_AQS * sizeof(uint32_t),
 				RTE_CACHE_LINE_SIZE, socket_id);
 	if (d->tail_ptrs == NULL) {
@@ -574,6 +655,10 @@ vrb_setup_queues(struct rte_bbdev *dev, uint16_t num_queues, int socket_id)
 	acc_reg_write(d, d->reg_addr->tail_ptrs_dl4g_lo, phys_low);
 	acc_reg_write(d, d->reg_addr->tail_ptrs_fft_hi, phys_high);
 	acc_reg_write(d, d->reg_addr->tail_ptrs_fft_lo, phys_low);
+	if (d->device_variant == VRB2_VARIANT) {
+		acc_reg_write(d, d->reg_addr->tail_ptrs_mld_hi, phys_high);
+		acc_reg_write(d, d->reg_addr->tail_ptrs_mld_lo, phys_low);
+	}
 
 	ret = allocate_info_ring(dev);
 	if (ret < 0) {
@@ -671,10 +756,17 @@ vrb_intr_enable(struct rte_bbdev *dev)
 			return ret;
 		}
 
-		if (acc_dev->pf_device)
-			max_queues = VRB1_MAX_PF_MSIX;
-		else
-			max_queues = VRB1_MAX_VF_MSIX;
+		if (d->device_variant == VRB1_VARIANT) {
+			if (acc_dev->pf_device)
+				max_queues = VRB1_MAX_PF_MSIX;
+			else
+				max_queues = VRB1_MAX_VF_MSIX;
+		} else {
+			if (acc_dev->pf_device)
+				max_queues = VRB2_MAX_PF_MSIX;
+			else
+				max_queues = VRB2_MAX_VF_MSIX;
+		}
 
 		if (rte_intr_efd_enable(dev->intr_handle, max_queues)) {
 			rte_bbdev_log(ERR, "Failed to create fds for %u queues",
@@ -776,7 +868,10 @@ vrb_find_free_queue_idx(struct rte_bbdev *dev,
 			/* Mark the Queue as assigned. */
 			d->q_assigned_bit_map[group_idx] |= (1ULL << aq_idx);
 			/* Report the AQ Index. */
-			return (group_idx << VRB1_GRP_ID_SHIFT) + aq_idx;
+			if (d->device_variant == VRB1_VARIANT)
+				return (group_idx << VRB1_GRP_ID_SHIFT) + aq_idx;
+			else
+				return (group_idx << VRB2_GRP_ID_SHIFT) + aq_idx;
 		}
 	}
 	rte_bbdev_log(INFO, "Failed to find free queue on %s, priority %u",
@@ -819,6 +914,9 @@ vrb_queue_setup(struct rte_bbdev *dev, uint16_t queue_id,
 			ACC_FCW_LD_BLEN : (conf->op_type == RTE_BBDEV_OP_FFT ?
 			ACC_FCW_FFT_BLEN : ACC_FCW_MLDTS_BLEN))));
 
+	if ((q->d->device_variant == VRB2_VARIANT) && (conf->op_type == RTE_BBDEV_OP_FFT))
+		fcw_len = ACC_FCW_FFT_BLEN_3;
+
 	for (desc_idx = 0; desc_idx < d->sw_ring_max_depth; desc_idx++) {
 		desc = q->ring_addr + desc_idx;
 		desc->req.word0 = ACC_DMA_DESC_TYPE;
@@ -915,9 +1013,16 @@ vrb_queue_setup(struct rte_bbdev *dev, uint16_t queue_id,
 		}
 	}
 
-	q->qgrp_id = (q_idx >> VRB1_GRP_ID_SHIFT) & 0xF;
-	q->vf_id = (q_idx >> VRB1_VF_ID_SHIFT)  & 0x3F;
-	q->aq_id = q_idx & 0xF;
+	if (d->device_variant == VRB1_VARIANT) {
+		q->qgrp_id = (q_idx >> VRB1_GRP_ID_SHIFT) & 0xF;
+		q->vf_id = (q_idx >> VRB1_VF_ID_SHIFT) & 0x3F;
+		q->aq_id = q_idx & 0xF;
+	} else {
+		q->qgrp_id = (q_idx >> VRB2_GRP_ID_SHIFT) & 0x1F;
+		q->vf_id = (q_idx >> VRB2_VF_ID_SHIFT) & 0x3F;
+		q->aq_id = q_idx & 0x3F;
+	}
+
 	q->aq_depth = 0;
 	if (conf->op_type ==  RTE_BBDEV_OP_TURBO_DEC)
 		q->aq_depth = (1 << d->acc_conf.q_ul_4g.aq_depth_log2);
@@ -1150,6 +1255,127 @@ vrb_dev_info_get(struct rte_bbdev *dev, struct rte_bbdev_driver_info *dev_info)
 		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
 	};
 
+	static const struct rte_bbdev_op_cap vrb2_bbdev_capabilities[] = {
+		{
+			.type = RTE_BBDEV_OP_TURBO_DEC,
+			.cap.turbo_dec = {
+				.capability_flags =
+					RTE_BBDEV_TURBO_SUBBLOCK_DEINTERLEAVE |
+					RTE_BBDEV_TURBO_CRC_TYPE_24B |
+					RTE_BBDEV_TURBO_DEC_CRC_24B_DROP |
+					RTE_BBDEV_TURBO_EQUALIZER |
+					RTE_BBDEV_TURBO_SOFT_OUT_SATURATE |
+					RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
+					RTE_BBDEV_TURBO_CONTINUE_CRC_MATCH |
+					RTE_BBDEV_TURBO_SOFT_OUTPUT |
+					RTE_BBDEV_TURBO_EARLY_TERMINATION |
+					RTE_BBDEV_TURBO_DEC_INTERRUPTS |
+					RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
+					RTE_BBDEV_TURBO_NEG_LLR_1_BIT_SOFT_OUT |
+					RTE_BBDEV_TURBO_MAP_DEC |
+					RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP |
+					RTE_BBDEV_TURBO_DEC_SCATTER_GATHER,
+				.max_llr_modulus = INT8_MAX,
+				.num_buffers_src =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+				.num_buffers_hard_out =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+				.num_buffers_soft_out =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+			}
+		},
+		{
+			.type = RTE_BBDEV_OP_TURBO_ENC,
+			.cap.turbo_enc = {
+				.capability_flags =
+					RTE_BBDEV_TURBO_CRC_24B_ATTACH |
+					RTE_BBDEV_TURBO_RV_INDEX_BYPASS |
+					RTE_BBDEV_TURBO_RATE_MATCH |
+					RTE_BBDEV_TURBO_ENC_INTERRUPTS |
+					RTE_BBDEV_TURBO_ENC_SCATTER_GATHER,
+				.num_buffers_src =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+				.num_buffers_dst =
+						RTE_BBDEV_TURBO_MAX_CODE_BLOCKS,
+			}
+		},
+		{
+			.type   = RTE_BBDEV_OP_LDPC_ENC,
+			.cap.ldpc_enc = {
+				.capability_flags =
+					RTE_BBDEV_LDPC_RATE_MATCH |
+					RTE_BBDEV_LDPC_CRC_24B_ATTACH |
+					RTE_BBDEV_LDPC_INTERLEAVER_BYPASS |
+					RTE_BBDEV_LDPC_ENC_INTERRUPTS |
+					RTE_BBDEV_LDPC_ENC_SCATTER_GATHER |
+					RTE_BBDEV_LDPC_ENC_CONCATENATION,
+				.num_buffers_src =
+						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+				.num_buffers_dst =
+						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+			}
+		},
+		{
+			.type   = RTE_BBDEV_OP_LDPC_DEC,
+			.cap.ldpc_dec = {
+			.capability_flags =
+				RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK |
+				RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP |
+				RTE_BBDEV_LDPC_CRC_TYPE_24A_CHECK |
+				RTE_BBDEV_LDPC_CRC_TYPE_16_CHECK |
+				RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE |
+				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE |
+				RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE |
+				RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS |
+				RTE_BBDEV_LDPC_DEC_SCATTER_GATHER |
+				RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION |
+				RTE_BBDEV_LDPC_HARQ_4BIT_COMPRESSION |
+				RTE_BBDEV_LDPC_LLR_COMPRESSION |
+				RTE_BBDEV_LDPC_SOFT_OUT_ENABLE |
+				RTE_BBDEV_LDPC_SOFT_OUT_RM_BYPASS |
+				RTE_BBDEV_LDPC_SOFT_OUT_DEINTERLEAVER_BYPASS |
+				RTE_BBDEV_LDPC_DEC_INTERRUPTS,
+			.llr_size = 8,
+			.llr_decimals = 2,
+			.num_buffers_src =
+					RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+			.num_buffers_hard_out =
+					RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+			.num_buffers_soft_out = 0,
+			}
+		},
+		{
+			.type	= RTE_BBDEV_OP_FFT,
+			.cap.fft = {
+				.capability_flags =
+						RTE_BBDEV_FFT_WINDOWING |
+						RTE_BBDEV_FFT_CS_ADJUSTMENT |
+						RTE_BBDEV_FFT_DFT_BYPASS |
+						RTE_BBDEV_FFT_IDFT_BYPASS |
+						RTE_BBDEV_FFT_FP16_INPUT |
+						RTE_BBDEV_FFT_FP16_OUTPUT |
+						RTE_BBDEV_FFT_POWER_MEAS |
+						RTE_BBDEV_FFT_WINDOWING_BYPASS,
+				.num_buffers_src =
+						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+				.num_buffers_dst =
+						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+			}
+		},
+		{
+			.type	= RTE_BBDEV_OP_MLDTS,
+			.cap.mld = {
+				.capability_flags =
+						RTE_BBDEV_MLDTS_REP,
+				.num_buffers_src =
+						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+				.num_buffers_dst =
+						RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+			}
+		},
+		RTE_BBDEV_END_OF_CAPABILITIES_LIST()
+	};
+
 	static struct rte_bbdev_queue_conf default_queue_conf;
 	default_queue_conf.socket = dev->data->socket_id;
 	default_queue_conf.queue_size = ACC_MAX_QUEUE_DEPTH;
@@ -1194,7 +1420,10 @@ vrb_dev_info_get(struct rte_bbdev *dev, struct rte_bbdev_driver_info *dev_info)
 	dev_info->default_queue_conf = default_queue_conf;
 	dev_info->cpu_flag_reqs = NULL;
 	dev_info->min_alignment = 1;
-	dev_info->capabilities = vrb1_bbdev_capabilities;
+	if (d->device_variant == VRB1_VARIANT)
+		dev_info->capabilities = vrb1_bbdev_capabilities;
+	else
+		dev_info->capabilities = vrb2_bbdev_capabilities;
 	dev_info->harq_buffer_size = 0;
 
 	vrb_check_ir(d);
@@ -1243,6 +1472,9 @@ static struct rte_pci_id pci_id_vrb_pf_map[] = {
 	{
 		RTE_PCI_DEVICE(RTE_VRB1_VENDOR_ID, RTE_VRB1_PF_DEVICE_ID)
 	},
+	{
+		RTE_PCI_DEVICE(RTE_VRB2_VENDOR_ID, RTE_VRB2_PF_DEVICE_ID)
+	},
 	{.device_id = 0},
 };
 
@@ -1251,6 +1483,9 @@ static struct rte_pci_id pci_id_vrb_vf_map[] = {
 	{
 		RTE_PCI_DEVICE(RTE_VRB1_VENDOR_ID, RTE_VRB1_VF_DEVICE_ID)
 	},
+	{
+		RTE_PCI_DEVICE(RTE_VRB2_VENDOR_ID, RTE_VRB2_VF_DEVICE_ID)
+	},
 	{.device_id = 0},
 };
 
@@ -1287,6 +1522,7 @@ vrb_fcw_td_fill(const struct rte_bbdev_dec_op *op, struct acc_fcw_td *fcw)
 				fcw->ea = op->turbo_dec.cb_params.e;
 				fcw->eb = op->turbo_dec.cb_params.e;
 			}
+
 			if (op->turbo_dec.rv_index == 0)
 				fcw->k0_start_col = ACC_FCW_TD_RVIDX_0;
 			else if (op->turbo_dec.rv_index == 1)
@@ -1305,7 +1541,7 @@ vrb_fcw_td_fill(const struct rte_bbdev_dec_op *op, struct acc_fcw_td *fcw)
 		fcw->bypass_teq = 0;
 	}
 
-	fcw->code_block_mode = 1; /* FIXME */
+	fcw->code_block_mode = 1;
 	fcw->turbo_crc_type = check_bit(op->turbo_dec.op_flags,
 			RTE_BBDEV_TURBO_CRC_TYPE_24B);
 
@@ -1465,8 +1701,8 @@ vrb_dma_desc_td_fill(struct rte_bbdev_dec_op *op,
 	if (op->turbo_dec.code_block_mode == RTE_BBDEV_TRANSPORT_BLOCK) {
 		k = op->turbo_dec.tb_params.k_pos;
 		e = (r < op->turbo_dec.tb_params.cab)
-			? op->turbo_dec.tb_params.ea
-			: op->turbo_dec.tb_params.eb;
+				? op->turbo_dec.tb_params.ea
+				: op->turbo_dec.tb_params.eb;
 	} else {
 		k = op->turbo_dec.cb_params.k;
 		e = op->turbo_dec.cb_params.e;
@@ -1677,61 +1913,329 @@ vrb1_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
 	return 0;
 }
 
+/* Fill in a frame control word for LDPC decoding. */
 static inline void
-vrb_dma_desc_ld_update(struct rte_bbdev_dec_op *op,
-		struct acc_dma_req_desc *desc,
-		struct rte_mbuf *input, struct rte_mbuf *h_output,
-		uint32_t *in_offset, uint32_t *h_out_offset,
-		uint32_t *h_out_length,
+vrb2_fcw_ld_fill(struct rte_bbdev_dec_op *op, struct acc_fcw_ld *fcw,
 		union acc_harq_layout_data *harq_layout)
 {
-	int next_triplet = 1; /* FCW already done. */
-	desc->data_ptrs[next_triplet].address = rte_pktmbuf_iova_offset(input, *in_offset);
-	next_triplet++;
+	uint16_t harq_out_length, harq_in_length, ncb_p, k0_p, parity_offset;
+	uint32_t harq_index;
+	uint32_t l;
 
-	if (check_bit(op->ldpc_dec.op_flags, RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
-		struct rte_bbdev_op_data hi = op->ldpc_dec.harq_combined_input;
-		desc->data_ptrs[next_triplet].address =
-				rte_pktmbuf_iova_offset(hi.data, hi.offset);
-		next_triplet++;
+	fcw->qm = op->ldpc_dec.q_m;
+	fcw->nfiller = op->ldpc_dec.n_filler;
+	fcw->BG = (op->ldpc_dec.basegraph - 1);
+	fcw->Zc = op->ldpc_dec.z_c;
+	fcw->ncb = op->ldpc_dec.n_cb;
+	fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_dec.basegraph,
+			op->ldpc_dec.rv_index);
+	if (op->ldpc_dec.code_block_mode == RTE_BBDEV_CODE_BLOCK)
+		fcw->rm_e = op->ldpc_dec.cb_params.e;
+	else
+		fcw->rm_e = (op->ldpc_dec.tb_params.r <
+				op->ldpc_dec.tb_params.cab) ?
+						op->ldpc_dec.tb_params.ea :
+						op->ldpc_dec.tb_params.eb;
+
+	if (unlikely(check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE) &&
+			(op->ldpc_dec.harq_combined_input.length == 0))) {
+		rte_bbdev_log(WARNING, "Null HARQ input size provided");
+		/* Disable HARQ input in that case to carry forward. */
+		op->ldpc_dec.op_flags ^= RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE;
+	}
+	if (unlikely(fcw->rm_e == 0)) {
+		rte_bbdev_log(WARNING, "Null E input provided");
+		fcw->rm_e = 2;
 	}
 
-	desc->data_ptrs[next_triplet].address =
-			rte_pktmbuf_iova_offset(h_output, *h_out_offset);
-	*h_out_length = desc->data_ptrs[next_triplet].blen;
-	next_triplet++;
+	fcw->hcin_en = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE);
+	fcw->hcout_en = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE);
+	fcw->crc_select = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK);
+	fcw->so_en = check_bit(op->ldpc_dec.op_flags, RTE_BBDEV_LDPC_SOFT_OUT_ENABLE);
+	fcw->so_bypass_intlv = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_SOFT_OUT_DEINTERLEAVER_BYPASS);
+	fcw->so_bypass_rm = check_bit(op->ldpc_dec.op_flags, RTE_BBDEV_LDPC_SOFT_OUT_RM_BYPASS);
+	fcw->bypass_dec = 0;
+	fcw->bypass_intlv = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_DEINTERLEAVER_BYPASS);
+	if (op->ldpc_dec.q_m == 1) {
+		fcw->bypass_intlv = 1;
+		fcw->qm = 2;
+	}
+	if (check_bit(op->ldpc_dec.op_flags, RTE_BBDEV_LDPC_HARQ_6BIT_COMPRESSION)) {
+		fcw->hcin_decomp_mode = 1;
+		fcw->hcout_comp_mode = 1;
+	} else if (check_bit(op->ldpc_dec.op_flags, RTE_BBDEV_LDPC_HARQ_4BIT_COMPRESSION)) {
+		fcw->hcin_decomp_mode = 4;
+		fcw->hcout_comp_mode = 4;
+	} else {
+		fcw->hcin_decomp_mode = 0;
+		fcw->hcout_comp_mode = 0;
+	}
 
-	if (check_bit(op->ldpc_dec.op_flags,
-				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
-		/* Adjust based on previous operation. */
-		struct rte_bbdev_dec_op *prev_op = desc->op_addr;
-		op->ldpc_dec.harq_combined_output.length =
-				prev_op->ldpc_dec.harq_combined_output.length;
-		uint32_t harq_idx = hq_index(op->ldpc_dec.harq_combined_output.offset);
-		uint32_t prev_harq_idx = hq_index(prev_op->ldpc_dec.harq_combined_output.offset);
-		harq_layout[harq_idx].val = harq_layout[prev_harq_idx].val;
-		struct rte_bbdev_op_data ho = op->ldpc_dec.harq_combined_output;
-		desc->data_ptrs[next_triplet].address =
-				rte_pktmbuf_iova_offset(ho.data, ho.offset);
-		next_triplet++;
+	fcw->llr_pack_mode = check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_LLR_COMPRESSION);
+	harq_index = hq_index(op->ldpc_dec.harq_combined_output.offset);
+	if (fcw->hcin_en > 0) {
+		harq_in_length = op->ldpc_dec.harq_combined_input.length;
+		if (fcw->hcin_decomp_mode == 1)
+			harq_in_length = harq_in_length * 8 / 6;
+		else if (fcw->hcin_decomp_mode == 4)
+			harq_in_length = harq_in_length * 2;
+		harq_in_length = RTE_MIN(harq_in_length, op->ldpc_dec.n_cb
+				- op->ldpc_dec.n_filler);
+		harq_in_length = RTE_ALIGN_CEIL(harq_in_length, 64);
+		fcw->hcin_size0 = harq_in_length;
+		fcw->hcin_offset = 0;
+		fcw->hcin_size1 = 0;
+	} else {
+		fcw->hcin_size0 = 0;
+		fcw->hcin_offset = 0;
+		fcw->hcin_size1 = 0;
 	}
 
-	op->ldpc_dec.hard_output.length += *h_out_length;
-	desc->op_addr = op;
-}
+	fcw->itmax = op->ldpc_dec.iter_max;
+	fcw->so_it = op->ldpc_dec.iter_max;
+	fcw->itstop = check_bit(op->ldpc_dec.op_flags, RTE_BBDEV_LDPC_ITERATION_STOP_ENABLE);
+	fcw->cnu_algo = ACC_ALGO_MSA;
+	fcw->synd_precoder = fcw->itstop;
 
-/* Enqueue one encode operations for device in CB mode */
-static inline int
-enqueue_enc_one_op_cb(struct acc_queue *q, struct rte_bbdev_enc_op *op,
-		uint16_t total_enqueued_cbs)
-{
-	union acc_dma_desc *desc = NULL;
-	int ret;
-	uint32_t in_offset, out_offset, out_length, mbuf_total_left, seg_total_left;
-	struct rte_mbuf *input, *output_head, *output;
+	fcw->minsum_offset = 1;
+	fcw->dec_llrclip   = 2;
 
-	desc = acc_desc(q, total_enqueued_cbs);
-	acc_fcw_te_fill(op, &desc->req.fcw_te);
+	/*
+	 * These are all implicitly set
+	 * fcw->synd_post = 0;
+	 * fcw->dec_convllr = 0;
+	 * fcw->hcout_convllr = 0;
+	 * fcw->hcout_size1 = 0;
+	 * fcw->hcout_offset = 0;
+	 * fcw->negstop_th = 0;
+	 * fcw->negstop_it = 0;
+	 * fcw->negstop_en = 0;
+	 * fcw->gain_i = 1;
+	 * fcw->gain_h = 1;
+	 */
+	if (fcw->hcout_en > 0) {
+		parity_offset = (op->ldpc_dec.basegraph == 1 ? 20 : 8)
+			* op->ldpc_dec.z_c - op->ldpc_dec.n_filler;
+		k0_p = (fcw->k0 > parity_offset) ?
+				fcw->k0 - op->ldpc_dec.n_filler : fcw->k0;
+		ncb_p = fcw->ncb - op->ldpc_dec.n_filler;
+		l = k0_p + fcw->rm_e;
+		harq_out_length = (uint16_t) fcw->hcin_size0;
+		harq_out_length = RTE_MIN(RTE_MAX(harq_out_length, l), ncb_p);
+		harq_out_length = RTE_ALIGN_CEIL(harq_out_length, 64);
+		fcw->hcout_size0 = harq_out_length;
+		fcw->hcout_size1 = 0;
+		fcw->hcout_offset = 0;
+		harq_layout[harq_index].offset = fcw->hcout_offset;
+		harq_layout[harq_index].size0 = fcw->hcout_size0;
+	} else {
+		fcw->hcout_size0 = 0;
+		fcw->hcout_size1 = 0;
+		fcw->hcout_offset = 0;
+	}
+
+	fcw->tb_crc_select = 0;
+	if (check_bit(op->ldpc_dec.op_flags, RTE_BBDEV_LDPC_CRC_TYPE_24A_CHECK))
+		fcw->tb_crc_select = 2;
+	if (check_bit(op->ldpc_dec.op_flags, RTE_BBDEV_LDPC_CRC_TYPE_16_CHECK))
+		fcw->tb_crc_select = 1;
+}
+
+static inline int
+vrb2_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
+		struct acc_dma_req_desc *desc,
+		struct rte_mbuf **input, struct rte_mbuf *h_output,
+		uint32_t *in_offset, uint32_t *h_out_offset,
+		uint32_t *h_out_length, uint32_t *mbuf_total_left,
+		uint32_t *seg_total_left, struct acc_fcw_ld *fcw)
+{
+	struct rte_bbdev_op_ldpc_dec *dec = &op->ldpc_dec;
+	int next_triplet = 1; /* FCW already done. */
+	uint32_t input_length;
+	uint16_t output_length, crc24_overlap = 0;
+	uint16_t sys_cols, K, h_p_size, h_np_size;
+
+	acc_header_init(desc);
+
+	if (check_bit(op->ldpc_dec.op_flags, RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP))
+		crc24_overlap = 24;
+
+	/* Compute some LDPC BG lengths. */
+	input_length = fcw->rm_e;
+	if (check_bit(op->ldpc_dec.op_flags, RTE_BBDEV_LDPC_LLR_COMPRESSION))
+		input_length = (input_length * 3 + 3) / 4;
+	sys_cols = (dec->basegraph == 1) ? 22 : 10;
+	K = sys_cols * dec->z_c;
+	output_length = K - dec->n_filler - crc24_overlap;
+
+	if (unlikely((*mbuf_total_left == 0) || (*mbuf_total_left < input_length))) {
+		rte_bbdev_log(ERR,
+				"Mismatch between mbuf length and included CB sizes: mbuf len %u, cb len %u",
+				*mbuf_total_left, input_length);
+		return -1;
+	}
+
+	next_triplet = acc_dma_fill_blk_type_in(desc, input,
+			in_offset, input_length,
+			seg_total_left, next_triplet,
+			check_bit(op->ldpc_dec.op_flags,
+			RTE_BBDEV_LDPC_DEC_SCATTER_GATHER));
+
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+
+	if (check_bit(op->ldpc_dec.op_flags, RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
+		if (op->ldpc_dec.harq_combined_input.data == 0) {
+			rte_bbdev_log(ERR, "HARQ input is not defined");
+			return -1;
+		}
+		h_p_size = fcw->hcin_size0 + fcw->hcin_size1;
+		if (fcw->hcin_decomp_mode == 1)
+			h_p_size = (h_p_size * 3 + 3) / 4;
+		else if (fcw->hcin_decomp_mode == 4)
+			h_p_size = h_p_size / 2;
+		if (op->ldpc_dec.harq_combined_input.data == 0) {
+			rte_bbdev_log(ERR, "HARQ input is not defined");
+			return -1;
+		}
+		acc_dma_fill_blk_type(
+				desc,
+				op->ldpc_dec.harq_combined_input.data,
+				op->ldpc_dec.harq_combined_input.offset,
+				h_p_size,
+				next_triplet,
+				ACC_DMA_BLKID_IN_HARQ);
+		next_triplet++;
+	}
+
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->m2dlen = next_triplet;
+	*mbuf_total_left -= input_length;
+
+	next_triplet = acc_dma_fill_blk_type(desc, h_output,
+			*h_out_offset, output_length >> 3, next_triplet,
+			ACC_DMA_BLKID_OUT_HARD);
+
+	if (check_bit(op->ldpc_dec.op_flags, RTE_BBDEV_LDPC_SOFT_OUT_ENABLE)) {
+		if (op->ldpc_dec.soft_output.data == 0) {
+			rte_bbdev_log(ERR, "Soft output is not defined");
+			return -1;
+		}
+		dec->soft_output.length = fcw->rm_e;
+		acc_dma_fill_blk_type(desc, dec->soft_output.data, dec->soft_output.offset,
+				fcw->rm_e, next_triplet, ACC_DMA_BLKID_OUT_SOFT);
+		next_triplet++;
+	}
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
+		if (op->ldpc_dec.harq_combined_output.data == 0) {
+			rte_bbdev_log(ERR, "HARQ output is not defined");
+			return -1;
+		}
+
+		/* Pruned size of the HARQ */
+		h_p_size = fcw->hcout_size0 + fcw->hcout_size1;
+		/* Non-Pruned size of the HARQ */
+		h_np_size = fcw->hcout_offset > 0 ?
+				fcw->hcout_offset + fcw->hcout_size1 :
+				h_p_size;
+		if (fcw->hcin_decomp_mode == 1) {
+			h_np_size = (h_np_size * 3 + 3) / 4;
+			h_p_size = (h_p_size * 3 + 3) / 4;
+		} else if (fcw->hcin_decomp_mode == 4) {
+			h_np_size = h_np_size / 2;
+			h_p_size = h_p_size / 2;
+		}
+		dec->harq_combined_output.length = h_np_size;
+		acc_dma_fill_blk_type(
+				desc,
+				dec->harq_combined_output.data,
+				dec->harq_combined_output.offset,
+				h_p_size,
+				next_triplet,
+				ACC_DMA_BLKID_OUT_HARQ);
+
+		next_triplet++;
+	}
+
+	*h_out_length = output_length >> 3;
+	dec->hard_output.length += *h_out_length;
+	*h_out_offset += *h_out_length;
+	desc->data_ptrs[next_triplet - 1].last = 1;
+	desc->d2mlen = next_triplet - desc->m2dlen;
+
+	desc->op_addr = op;
+
+	return 0;
+}
+
+static inline void
+vrb_dma_desc_ld_update(struct rte_bbdev_dec_op *op,
+		struct acc_dma_req_desc *desc,
+		struct rte_mbuf *input, struct rte_mbuf *h_output,
+		uint32_t *in_offset, uint32_t *h_out_offset,
+		uint32_t *h_out_length,
+		union acc_harq_layout_data *harq_layout)
+{
+	int next_triplet = 1; /* FCW already done. */
+	desc->data_ptrs[next_triplet].address = rte_pktmbuf_iova_offset(input, *in_offset);
+	next_triplet++;
+
+	if (check_bit(op->ldpc_dec.op_flags, RTE_BBDEV_LDPC_HQ_COMBINE_IN_ENABLE)) {
+		struct rte_bbdev_op_data hi = op->ldpc_dec.harq_combined_input;
+		desc->data_ptrs[next_triplet].address =
+				rte_pktmbuf_iova_offset(hi.data, hi.offset);
+		next_triplet++;
+	}
+
+	desc->data_ptrs[next_triplet].address =
+			rte_pktmbuf_iova_offset(h_output, *h_out_offset);
+	*h_out_length = desc->data_ptrs[next_triplet].blen;
+	next_triplet++;
+
+	if (check_bit(op->ldpc_dec.op_flags,
+				RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
+		/* Adjust based on previous operation. */
+		struct rte_bbdev_dec_op *prev_op = desc->op_addr;
+		op->ldpc_dec.harq_combined_output.length =
+				prev_op->ldpc_dec.harq_combined_output.length;
+		uint32_t harq_idx = hq_index(op->ldpc_dec.harq_combined_output.offset);
+		uint32_t prev_harq_idx = hq_index(prev_op->ldpc_dec.harq_combined_output.offset);
+		harq_layout[harq_idx].val = harq_layout[prev_harq_idx].val;
+		struct rte_bbdev_op_data ho = op->ldpc_dec.harq_combined_output;
+		desc->data_ptrs[next_triplet].address =
+				rte_pktmbuf_iova_offset(ho.data, ho.offset);
+		next_triplet++;
+	}
+
+	op->ldpc_dec.hard_output.length += *h_out_length;
+	desc->op_addr = op;
+}
+
+/* Enqueue one encode operation for device in CB mode. */
+static inline int
+enqueue_enc_one_op_cb(struct acc_queue *q, struct rte_bbdev_enc_op *op,
+		uint16_t total_enqueued_cbs)
+{
+	union acc_dma_desc *desc = NULL;
+	int ret;
+	uint32_t in_offset, out_offset, out_length, mbuf_total_left, seg_total_left;
+	struct rte_mbuf *input, *output_head, *output;
+
+	desc = acc_desc(q, total_enqueued_cbs);
+	acc_fcw_te_fill(op, &desc->req.fcw_te);
 
 	input = op->turbo_enc.input.data;
 	output_head = output = op->turbo_enc.output.data;
@@ -1780,6 +2284,7 @@ enqueue_ldpc_enc_n_op_cb(struct acc_queue *q, struct rte_bbdev_enc_op **ops,
 	/** This could be done at polling. */
 	acc_header_init(&desc->req);
 	desc->req.numCBs = num;
+	desc->req.dltb = 0;
 
 	in_length_in_bytes = ops[0]->ldpc_enc.input.data->data_len;
 	out_length = (enc->cb_params.e + 7) >> 3;
@@ -1818,7 +2323,7 @@ enqueue_ldpc_enc_n_op_cb(struct acc_queue *q, struct rte_bbdev_enc_op **ops,
 	return num;
 }
 
-/* Enqueue one encode operations for device for a partial TB
+/* Enqueue one encode operation for VRB1 device for a partial TB
  * all codes blocks have same configuration multiplexed on the same descriptor.
  */
 static inline void
@@ -2005,6 +2510,105 @@ vrb1_enqueue_ldpc_enc_one_op_tb(struct acc_queue *q, struct rte_bbdev_enc_op *op
 	return return_descs;
 }
 
+/* Fill in a frame control word for LDPC encoding. */
+static inline void
+vrb2_fcw_letb_fill(const struct rte_bbdev_enc_op *op, struct acc_fcw_le *fcw)
+{
+	fcw->qm = op->ldpc_enc.q_m;
+	fcw->nfiller = op->ldpc_enc.n_filler;
+	fcw->BG = (op->ldpc_enc.basegraph - 1);
+	fcw->Zc = op->ldpc_enc.z_c;
+	fcw->ncb = op->ldpc_enc.n_cb;
+	fcw->k0 = get_k0(fcw->ncb, fcw->Zc, op->ldpc_enc.basegraph,
+			op->ldpc_enc.rv_index);
+	fcw->rm_e = op->ldpc_enc.tb_params.ea;
+	fcw->rm_e_b = op->ldpc_enc.tb_params.eb;
+	fcw->crc_select = check_bit(op->ldpc_enc.op_flags,
+			RTE_BBDEV_LDPC_CRC_24B_ATTACH);
+	fcw->bypass_intlv = 0;
+	if (op->ldpc_enc.tb_params.c > 1) {
+		fcw->mcb_count = 0;
+		fcw->C = op->ldpc_enc.tb_params.c;
+		fcw->Cab = op->ldpc_enc.tb_params.cab;
+	} else {
+		fcw->mcb_count = 1;
+		fcw->C = 0;
+	}
+}
+
+/* Enqueue one encode operation for device in TB mode.
+ * returns the number of descs used.
+ */
+static inline int
+vrb2_enqueue_ldpc_enc_one_op_tb(struct acc_queue *q, struct rte_bbdev_enc_op *op,
+		uint16_t enq_descs)
+{
+	union acc_dma_desc *desc = NULL;
+	uint32_t in_offset, out_offset, out_length, seg_total_left;
+	struct rte_mbuf *input, *output_head, *output;
+
+	uint16_t desc_idx = ((q->sw_ring_head + enq_descs) & q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	vrb2_fcw_letb_fill(op, &desc->req.fcw_le);
+	struct rte_bbdev_op_ldpc_enc *enc = &op->ldpc_enc;
+	int next_triplet = 1; /* FCW already done */
+	uint32_t in_length_in_bytes;
+	uint16_t K, in_length_in_bits;
+
+	input = enc->input.data;
+	output_head = output = enc->output.data;
+	in_offset = enc->input.offset;
+	out_offset = enc->output.offset;
+	seg_total_left = rte_pktmbuf_data_len(enc->input.data) - in_offset;
+
+	acc_header_init(&desc->req);
+	K = (enc->basegraph == 1 ? 22 : 10) * enc->z_c;
+	in_length_in_bits = K - enc->n_filler;
+	if ((enc->op_flags & RTE_BBDEV_LDPC_CRC_24A_ATTACH) ||
+			(enc->op_flags & RTE_BBDEV_LDPC_CRC_24B_ATTACH))
+		in_length_in_bits -= 24;
+	in_length_in_bytes = (in_length_in_bits >> 3) * enc->tb_params.c;
+
+	next_triplet = acc_dma_fill_blk_type_in(&desc->req, &input, &in_offset,
+			in_length_in_bytes, &seg_total_left, next_triplet,
+			check_bit(enc->op_flags, RTE_BBDEV_LDPC_ENC_SCATTER_GATHER));
+	if (unlikely(next_triplet < 0)) {
+		rte_bbdev_log(ERR,
+				"Mismatch between data to process and mbuf data length in bbdev_op: %p",
+				op);
+		return -1;
+	}
+	desc->req.data_ptrs[next_triplet - 1].last = 1;
+	desc->req.m2dlen = next_triplet;
+
+	/* Set output length */
+	/* Integer round up division by 8 */
+	out_length = (enc->tb_params.ea * enc->tb_params.cab +
+			enc->tb_params.eb * (enc->tb_params.c - enc->tb_params.cab)  + 7) >> 3;
+
+	next_triplet = acc_dma_fill_blk_type(&desc->req, output, out_offset,
+			out_length, next_triplet, ACC_DMA_BLKID_OUT_ENC);
+	enc->output.length = out_length;
+	out_offset += out_length;
+	desc->req.data_ptrs[next_triplet - 1].last = 1;
+	desc->req.data_ptrs[next_triplet - 1].dma_ext = 0;
+	desc->req.d2mlen = next_triplet - desc->req.m2dlen;
+	desc->req.numCBs = enc->tb_params.c;
+	if (desc->req.numCBs > 1)
+		desc->req.dltb = 1;
+	desc->req.op_addr = op;
+
+	if (out_length < ACC_MAX_E_MBUF)
+		mbuf_append(output_head, output, out_length);
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	acc_memdump("FCW", &desc->req.fcw_le, sizeof(desc->req.fcw_le));
+	acc_memdump("Req Desc.", desc, sizeof(*desc));
+#endif
+	/* One descriptor (one TB op) was successfully prepared to enqueue. */
+	return 1;
+}
+
 /** Enqueue one decode operations for device in CB mode. */
 static inline int
 enqueue_dec_one_op_cb(struct acc_queue *q, struct rte_bbdev_dec_op *op,
@@ -2017,8 +2621,10 @@ enqueue_dec_one_op_cb(struct acc_queue *q, struct rte_bbdev_dec_op *op,
 	struct rte_mbuf *input, *h_output_head, *h_output,
 		*s_output_head, *s_output;
 
-	/* Disable explictly SO for VRB 1. */
-	op->turbo_dec.op_flags &= ~RTE_BBDEV_TURBO_SOFT_OUTPUT;
+	if (q->d->device_variant == VRB1_VARIANT) {
+		/* Explicitly disable SO for VRB1. */
+		op->turbo_dec.op_flags &= ~RTE_BBDEV_TURBO_SOFT_OUTPUT;
+	}
 
 	desc = acc_desc(q, total_enqueued_cbs);
 	vrb_fcw_td_fill(op, &desc->req.fcw_td);
@@ -2061,7 +2667,7 @@ enqueue_dec_one_op_cb(struct acc_queue *q, struct rte_bbdev_dec_op *op,
 	return 1;
 }
 
-/** Enqueue one decode operations for device in CB mode */
+/** Enqueue one decode operation for device in CB mode. */
 static inline int
 vrb_enqueue_ldpc_dec_one_op_cb(struct acc_queue *q, struct rte_bbdev_dec_op *op,
 		uint16_t total_enqueued_cbs, bool same_op)
@@ -2114,11 +2720,16 @@ vrb_enqueue_ldpc_dec_one_op_cb(struct acc_queue *q, struct rte_bbdev_dec_op *op,
 			seg_total_left = rte_pktmbuf_data_len(input) - in_offset;
 		else
 			seg_total_left = fcw->rm_e;
-
-		ret = vrb1_dma_desc_ld_fill(op, &desc->req, &input, h_output,
-				&in_offset, &h_out_offset,
-				&h_out_length, &mbuf_total_left,
-				&seg_total_left, fcw);
+		if (q->d->device_variant == VRB1_VARIANT)
+			ret = vrb1_dma_desc_ld_fill(op, &desc->req, &input, h_output,
+					&in_offset, &h_out_offset,
+					&h_out_length, &mbuf_total_left,
+					&seg_total_left, fcw);
+		else
+			ret = vrb2_dma_desc_ld_fill(op, &desc->req, &input, h_output,
+					&in_offset, &h_out_offset,
+					&h_out_length, &mbuf_total_left,
+					&seg_total_left, fcw);
 		if (unlikely(ret < 0))
 			return ret;
 	}
@@ -2207,12 +2818,18 @@ vrb_enqueue_ldpc_dec_one_op_tb(struct acc_queue *q, struct rte_bbdev_dec_op *op,
 		desc->req.data_ptrs[0].blen = ACC_FCW_LD_BLEN;
 		rte_memcpy(&desc->req.fcw_ld, &desc_first->req.fcw_ld, ACC_FCW_LD_BLEN);
 		desc->req.fcw_ld.tb_trailer_size = (c - r - 1) * trail_len;
-
-		ret = vrb1_dma_desc_ld_fill(op, &desc->req, &input,
-				h_output, &in_offset, &h_out_offset,
-				&h_out_length,
-				&mbuf_total_left, &seg_total_left,
-				&desc->req.fcw_ld);
+		if (q->d->device_variant == VRB1_VARIANT)
+			ret = vrb1_dma_desc_ld_fill(op, &desc->req, &input,
+					h_output, &in_offset, &h_out_offset,
+					&h_out_length,
+					&mbuf_total_left, &seg_total_left,
+					&desc->req.fcw_ld);
+		else
+			ret = vrb2_dma_desc_ld_fill(op, &desc->req, &input,
+					h_output, &in_offset, &h_out_offset,
+					&h_out_length,
+					&mbuf_total_left, &seg_total_left,
+					&desc->req.fcw_ld);
 
 		if (unlikely(ret < 0))
 			return ret;
@@ -2254,7 +2871,7 @@ vrb_enqueue_ldpc_dec_one_op_tb(struct acc_queue *q, struct rte_bbdev_dec_op *op,
 	return current_enqueued_cbs;
 }
 
-/* Enqueue one decode operations for device in TB mode */
+/* Enqueue one decode operation for device in TB mode. */
 static inline int
 enqueue_dec_one_op_tb(struct acc_queue *q, struct rte_bbdev_dec_op *op,
 		uint16_t total_enqueued_cbs, uint8_t cbs_in_tb)
@@ -2476,14 +3093,23 @@ vrb_enqueue_ldpc_enc_tb(struct rte_bbdev_queue_data *q_data,
 	int descs_used;
 
 	for (i = 0; i < num; ++i) {
-		cbs_in_tb = get_num_cbs_in_tb_ldpc_enc(&ops[i]->ldpc_enc);
-		/* Check if there are available space for further processing. */
-		if (unlikely((avail - cbs_in_tb < 0) || (cbs_in_tb == 0))) {
-			acc_enqueue_ring_full(q_data);
-			break;
+		if (q->d->device_variant == VRB1_VARIANT) {
+			cbs_in_tb = get_num_cbs_in_tb_ldpc_enc(&ops[i]->ldpc_enc);
+			/* Check if there is available space for further processing. */
+			if (unlikely((avail - cbs_in_tb < 0) || (cbs_in_tb == 0))) {
+				acc_enqueue_ring_full(q_data);
+				break;
+			}
+			descs_used = vrb1_enqueue_ldpc_enc_one_op_tb(q, ops[i],
+					enqueued_descs, cbs_in_tb);
+		} else {
+			if (unlikely(avail < 1)) {
+				acc_enqueue_ring_full(q_data);
+				break;
+			}
+			descs_used = vrb2_enqueue_ldpc_enc_one_op_tb(q, ops[i], enqueued_descs);
 		}
 
-		descs_used = vrb1_enqueue_ldpc_enc_one_op_tb(q, ops[i], enqueued_descs, cbs_in_tb);
 		if (descs_used < 0) {
 			acc_enqueue_invalid(q_data);
 			break;
@@ -2617,7 +3243,6 @@ vrb_enqueue_ldpc_dec_cb(struct rte_bbdev_queue_data *q_data,
 			break;
 		}
 		avail -= 1;
-
 		rte_bbdev_log(INFO, "Op %d %d %d %d %d %d %d %d %d %d %d %d\n",
 			i, ops[i]->ldpc_dec.op_flags, ops[i]->ldpc_dec.rv_index,
 			ops[i]->ldpc_dec.iter_max, ops[i]->ldpc_dec.iter_count,
@@ -2745,6 +3370,7 @@ vrb_dequeue_enc_one_op_cb(struct acc_queue *q, struct rte_bbdev_enc_op **ref_op,
 	op->status |= ((rsp.input_err) ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
 	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
 	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+	op->status |= ((rsp.engine_hung) ? (1 << RTE_BBDEV_ENGINE_ERROR) : 0);
 
 	if (desc->req.last_desc_in_batch) {
 		(*aq_dequeued)++;
@@ -2765,7 +3391,56 @@ vrb_dequeue_enc_one_op_cb(struct acc_queue *q, struct rte_bbdev_enc_op **ref_op,
 	return desc->req.numCBs;
 }
 
-/* Dequeue one LDPC encode operations from device in TB mode.
+
+/* Dequeue one LDPC encode operation from VRB2 device in TB mode.
+ */
+static inline int
+vrb2_dequeue_ldpc_enc_one_op_tb(struct acc_queue *q, struct rte_bbdev_enc_op **ref_op,
+		uint16_t *dequeued_ops, uint32_t *aq_dequeued,
+		uint16_t *dequeued_descs)
+{
+	union acc_dma_desc *desc, atom_desc;
+	union acc_dma_rsp_desc rsp;
+	struct rte_bbdev_enc_op *op;
+	int desc_idx = ((q->sw_ring_tail + *dequeued_descs) & q->sw_ring_wrap_mask);
+
+	desc = q->ring_addr + desc_idx;
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc, __ATOMIC_RELAXED);
+
+	/* Check fdone bit */
+	if (!(atom_desc.rsp.val & ACC_FDONE))
+		return -1;
+
+	rsp.val = atom_desc.rsp.val;
+	rte_bbdev_log_debug("Resp. desc %p: %x", desc, rsp.val);
+
+	/* Dequeue */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response */
+	op->status = 0;
+	op->status |= rsp.input_err << RTE_BBDEV_DATA_ERROR;
+	op->status |= rsp.dma_err << RTE_BBDEV_DRV_ERROR;
+	op->status |= rsp.fcw_err << RTE_BBDEV_DRV_ERROR;
+	op->status |= rsp.engine_hung << RTE_BBDEV_ENGINE_ERROR;
+
+	if (desc->req.last_desc_in_batch) {
+		(*aq_dequeued)++;
+		desc->req.last_desc_in_batch = 0;
+	}
+	desc->rsp.val = ACC_DMA_DESC_TYPE;
+	desc->rsp.add_info_0 = 0; /* Reserved bits. */
+	desc->rsp.add_info_1 = 0; /* Reserved bits. */
+
+	/* One op was successfully dequeued */
+	ref_op[0] = op;
+	(*dequeued_descs)++;
+	(*dequeued_ops)++;
+	return 1;
+}
+
+
+/* Dequeue one encode operation from device in TB mode.
  * That operation may cover multiple descriptors.
  */
 static inline int
@@ -2815,6 +3490,7 @@ vrb_dequeue_enc_one_op_tb(struct acc_queue *q, struct rte_bbdev_enc_op **ref_op,
 		op->status |= ((rsp.input_err) ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
 		op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
 		op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+		op->status |= ((rsp.engine_hung) ? (1 << RTE_BBDEV_ENGINE_ERROR) : 0);
 
 		if (desc->req.last_desc_in_batch) {
 			(*aq_dequeued)++;
@@ -2861,6 +3537,8 @@ vrb_dequeue_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
 	op->status |= ((rsp.input_err) ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
 	op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
 	op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+	op->status |= rsp.engine_hung << RTE_BBDEV_ENGINE_ERROR;
+
 	if (op->status != 0) {
 		/* These errors are not expected. */
 		q_data->queue_stats.dequeue_err_count++;
@@ -2914,6 +3592,7 @@ vrb_dequeue_ldpc_dec_one_op_cb(struct rte_bbdev_queue_data *q_data,
 	op->status |= rsp.input_err << RTE_BBDEV_DATA_ERROR;
 	op->status |= rsp.dma_err << RTE_BBDEV_DRV_ERROR;
 	op->status |= rsp.fcw_err << RTE_BBDEV_DRV_ERROR;
+	op->status |= rsp.engine_hung << RTE_BBDEV_ENGINE_ERROR;
 	if (op->status != 0)
 		q_data->queue_stats.dequeue_err_count++;
 
@@ -2995,6 +3674,7 @@ vrb_dequeue_dec_one_op_tb(struct acc_queue *q, struct rte_bbdev_dec_op **ref_op,
 		op->status |= ((rsp.input_err) ? (1 << RTE_BBDEV_DATA_ERROR) : 0);
 		op->status |= ((rsp.dma_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
 		op->status |= ((rsp.fcw_err) ? (1 << RTE_BBDEV_DRV_ERROR) : 0);
+		op->status |= ((rsp.engine_hung) ? (1 << RTE_BBDEV_ENGINE_ERROR) : 0);
 
 		if (check_bit(op->ldpc_dec.op_flags, RTE_BBDEV_LDPC_CRC_TYPE_24A_CHECK))
 			tb_crc_check ^= desc->rsp.add_info_1;
@@ -3046,7 +3726,6 @@ vrb_dequeue_enc(struct rte_bbdev_queue_data *q_data,
 	if (avail == 0)
 		return 0;
 	op = acc_op_tail(q, 0);
-
 	cbm = op->turbo_enc.code_block_mode;
 
 	for (i = 0; i < avail; i++) {
@@ -3089,9 +3768,14 @@ vrb_dequeue_ldpc_enc(struct rte_bbdev_queue_data *q_data,
 
 	for (i = 0; i < avail; i++) {
 		if (cbm == RTE_BBDEV_TRANSPORT_BLOCK)
-			ret = vrb_dequeue_enc_one_op_tb(q, &ops[dequeued_ops],
-					&dequeued_ops, &aq_dequeued,
-					&dequeued_descs, num);
+			if (q->d->device_variant == VRB1_VARIANT)
+				ret = vrb_dequeue_enc_one_op_tb(q, &ops[dequeued_ops],
+						&dequeued_ops, &aq_dequeued,
+						&dequeued_descs, num);
+			else
+				ret = vrb2_dequeue_ldpc_enc_one_op_tb(q, &ops[dequeued_ops],
+						&dequeued_ops, &aq_dequeued,
+						&dequeued_descs);
 		else
 			ret = vrb_dequeue_enc_one_op_cb(q, &ops[dequeued_ops],
 					&dequeued_ops, &aq_dequeued,
@@ -3221,6 +3905,47 @@ vrb1_fcw_fft_fill(struct rte_bbdev_fft_op *op, struct acc_fcw_fft *fcw)
 		fcw->bypass = 0;
 }
 
+/* Fill in a frame control word for FFT processing. */
+static inline void
+vrb2_fcw_fft_fill(struct rte_bbdev_fft_op *op, struct acc_fcw_fft_3 *fcw)
+{
+	fcw->in_frame_size = op->fft.input_sequence_size;
+	fcw->leading_pad_size = op->fft.input_leading_padding;
+	fcw->out_frame_size = op->fft.output_sequence_size;
+	fcw->leading_depad_size = op->fft.output_leading_depadding;
+	fcw->cs_window_sel = op->fft.window_index[0] +
+			(op->fft.window_index[1] << 8) +
+			(op->fft.window_index[2] << 16) +
+			(op->fft.window_index[3] << 24);
+	fcw->cs_window_sel2 = op->fft.window_index[4] +
+			(op->fft.window_index[5] << 8);
+	fcw->cs_enable_bmap = op->fft.cs_bitmap;
+	fcw->num_antennas = op->fft.num_antennas_log2;
+	fcw->idft_size = op->fft.idft_log2;
+	fcw->dft_size = op->fft.dft_log2;
+	fcw->cs_offset = op->fft.cs_time_adjustment;
+	fcw->idft_shift = op->fft.idft_shift;
+	fcw->dft_shift = op->fft.dft_shift;
+	fcw->cs_multiplier = op->fft.ncs_reciprocal;
+	fcw->power_shift = op->fft.power_shift;
+	fcw->exp_adj = op->fft.fp16_exp_adjust;
+	fcw->fp16_in = check_bit(op->fft.op_flags, RTE_BBDEV_FFT_FP16_INPUT);
+	fcw->fp16_out = check_bit(op->fft.op_flags, RTE_BBDEV_FFT_FP16_OUTPUT);
+	fcw->power_en = check_bit(op->fft.op_flags, RTE_BBDEV_FFT_POWER_MEAS);
+	if (check_bit(op->fft.op_flags,
+			RTE_BBDEV_FFT_IDFT_BYPASS)) {
+		if (check_bit(op->fft.op_flags,
+				RTE_BBDEV_FFT_WINDOWING_BYPASS))
+			fcw->bypass = 2;
+		else
+			fcw->bypass = 1;
+	} else if (check_bit(op->fft.op_flags,
+			RTE_BBDEV_FFT_DFT_BYPASS))
+		fcw->bypass = 3;
+	else
+		fcw->bypass = 0;
+}
+
 static inline int
 vrb1_dma_desc_fft_fill(struct rte_bbdev_fft_op *op,
 		struct acc_dma_req_desc *desc,
@@ -3254,6 +3979,58 @@ vrb1_dma_desc_fft_fill(struct rte_bbdev_fft_op *op,
 	return 0;
 }
 
+static inline int
+vrb2_dma_desc_fft_fill(struct rte_bbdev_fft_op *op,
+		struct acc_dma_req_desc *desc,
+		struct rte_mbuf *input, struct rte_mbuf *output, struct rte_mbuf *win_input,
+		struct rte_mbuf *pwr, uint32_t *in_offset, uint32_t *out_offset,
+		uint32_t *win_offset, uint32_t *pwr_offset)
+{
+	bool pwr_en = check_bit(op->fft.op_flags, RTE_BBDEV_FFT_POWER_MEAS);
+	bool win_en = check_bit(op->fft.op_flags, RTE_BBDEV_FFT_DEWINDOWING);
+	int num_cs = 0, i, bd_idx = 1;
+
+	/* FCW already done */
+	acc_header_init(desc);
+
+	RTE_SET_USED(win_input);
+	RTE_SET_USED(win_offset);
+
+	desc->data_ptrs[bd_idx].address = rte_pktmbuf_iova_offset(input, *in_offset);
+	desc->data_ptrs[bd_idx].blen = op->fft.input_sequence_size * ACC_IQ_SIZE;
+	desc->data_ptrs[bd_idx].blkid = ACC_DMA_BLKID_IN;
+	desc->data_ptrs[bd_idx].last = 1;
+	desc->data_ptrs[bd_idx].dma_ext = 0;
+	bd_idx++;
+
+	desc->data_ptrs[bd_idx].address = rte_pktmbuf_iova_offset(output, *out_offset);
+	desc->data_ptrs[bd_idx].blen = op->fft.output_sequence_size * ACC_IQ_SIZE;
+	desc->data_ptrs[bd_idx].blkid = ACC_DMA_BLKID_OUT_HARD;
+	desc->data_ptrs[bd_idx].last = pwr_en ? 0 : 1;
+	desc->data_ptrs[bd_idx].dma_ext = 0;
+	desc->m2dlen = win_en ? 3 : 2;
+	desc->d2mlen = pwr_en ? 2 : 1;
+	desc->ib_ant_offset = op->fft.input_sequence_size;
+	desc->num_ant = op->fft.num_antennas_log2 - 3;
+
+	for (i = 0; i < RTE_BBDEV_MAX_CS; i++)
+		if (check_bit(op->fft.cs_bitmap, 1 << i))
+			num_cs++;
+	desc->num_cs = num_cs;
+
+	if (pwr_en && pwr) {
+		bd_idx++;
+		desc->data_ptrs[bd_idx].address = rte_pktmbuf_iova_offset(pwr, *pwr_offset);
+		desc->data_ptrs[bd_idx].blen = num_cs * (1 << op->fft.num_antennas_log2) * 4;
+		desc->data_ptrs[bd_idx].blkid = ACC_DMA_BLKID_OUT_SOFT;
+		desc->data_ptrs[bd_idx].last = 1;
+		desc->data_ptrs[bd_idx].dma_ext = 0;
+	}
+	desc->ob_cyc_offset = op->fft.output_sequence_size;
+	desc->ob_ant_offset = op->fft.output_sequence_size * num_cs;
+	desc->op_addr = op;
+	return 0;
+}
 
 /** Enqueue one FFT operation for device. */
 static inline int
@@ -3261,26 +4038,35 @@ vrb_enqueue_fft_one_op(struct acc_queue *q, struct rte_bbdev_fft_op *op,
 		uint16_t total_enqueued_cbs)
 {
 	union acc_dma_desc *desc;
-	struct rte_mbuf *input, *output;
-	uint32_t in_offset, out_offset;
+	struct rte_mbuf *input, *output, *pwr, *win;
+	uint32_t in_offset, out_offset, pwr_offset, win_offset;
 	struct acc_fcw_fft *fcw;
 
 	desc = acc_desc(q, total_enqueued_cbs);
 	input = op->fft.base_input.data;
 	output = op->fft.base_output.data;
+	pwr = op->fft.power_meas_output.data;
+	win = op->fft.dewindowing_input.data;
 	in_offset = op->fft.base_input.offset;
 	out_offset = op->fft.base_output.offset;
+	pwr_offset = op->fft.power_meas_output.offset;
+	win_offset = op->fft.dewindowing_input.offset;
 
 	fcw = (struct acc_fcw_fft *) (q->fcw_ring +
 			((q->sw_ring_head + total_enqueued_cbs) & q->sw_ring_wrap_mask)
 			* ACC_MAX_FCW_SIZE);
 
-	vrb1_fcw_fft_fill(op, fcw);
-	vrb1_dma_desc_fft_fill(op, &desc->req, input, output, &in_offset, &out_offset);
+	if (q->d->device_variant == VRB1_VARIANT) {
+		vrb1_fcw_fft_fill(op, fcw);
+		vrb1_dma_desc_fft_fill(op, &desc->req, input, output, &in_offset, &out_offset);
+	} else {
+		vrb2_fcw_fft_fill(op, (struct acc_fcw_fft_3 *) fcw);
+		vrb2_dma_desc_fft_fill(op, &desc->req, input, output, win, pwr,
+				&in_offset, &out_offset, &win_offset, &pwr_offset);
+	}
 #ifdef RTE_LIBRTE_BBDEV_DEBUG
-	rte_memdump(stderr, "FCW", &desc->req.fcw_fft,
-			sizeof(desc->req.fcw_fft));
-	rte_memdump(stderr, "Req Desc.", desc, sizeof(*desc));
+	acc_memdump("FCW", fcw, 128);
+	acc_memdump("Req Desc.", desc, 128);
 #endif
 	return 1;
 }
@@ -3353,6 +4139,7 @@ vrb_dequeue_fft_one_op(struct rte_bbdev_queue_data *q_data,
 	op->status |= rsp.input_err << RTE_BBDEV_DATA_ERROR;
 	op->status |= rsp.dma_err << RTE_BBDEV_DRV_ERROR;
 	op->status |= rsp.fcw_err << RTE_BBDEV_DRV_ERROR;
+	op->status |= rsp.engine_hung << RTE_BBDEV_ENGINE_ERROR;
 	if (op->status != 0)
 		q_data->queue_stats.dequeue_err_count++;
 
@@ -3383,6 +4170,7 @@ vrb_dequeue_fft(struct rte_bbdev_queue_data *q_data,
 	uint32_t aq_dequeued = 0;
 	int ret;
 
+
 	dequeue_num = RTE_MIN(avail, num);
 
 	for (i = 0; i < dequeue_num; ++i) {
@@ -3399,6 +4187,371 @@ vrb_dequeue_fft(struct rte_bbdev_queue_data *q_data,
 	return i;
 }
 
+/* Fill in a frame control word for MLD-TS processing. */
+static inline void
+vrb2_fcw_mldts_fill(struct rte_bbdev_mldts_op *op, struct acc_fcw_mldts *fcw)
+{
+	fcw->nrb = op->mldts.num_rbs;
+	fcw->NLayers = op->mldts.num_layers - 1;
+	fcw->Qmod0 = (op->mldts.q_m[0] >> 1) - 1;
+	fcw->Qmod1 = (op->mldts.q_m[1] >> 1) - 1;
+	fcw->Qmod2 = (op->mldts.q_m[2] >> 1) - 1;
+	fcw->Qmod3 = (op->mldts.q_m[3] >> 1) - 1;
+	/* Mark some layers as disabled */
+	if (op->mldts.num_layers == 2) {
+		fcw->Qmod2 = 3;
+		fcw->Qmod3 = 3;
+	}
+	if (op->mldts.num_layers == 3)
+		fcw->Qmod3 = 3;
+	fcw->Rrep = op->mldts.r_rep;
+	fcw->Crep = op->mldts.c_rep;
+}
+
+/* Fill in descriptor for one MLD-TS processing operation. */
+static inline int
+vrb2_dma_desc_mldts_fill(struct rte_bbdev_mldts_op *op,
+		struct acc_dma_req_desc *desc,
+		struct rte_mbuf *input_q, struct rte_mbuf *input_r,
+		struct rte_mbuf *output,
+		uint32_t *in_offset, uint32_t *out_offset)
+{
+	uint16_t qsize_per_re[VRB2_MLD_LAY_SIZE] = {8, 12, 16}; /* Layer 2 to 4. */
+	uint16_t rsize_per_re[VRB2_MLD_LAY_SIZE] = {14, 26, 42};
+	uint16_t sc_factor_per_rrep[VRB2_MLD_RREP_SIZE] = {12, 6, 4, 3, 0, 2};
+	uint16_t i, outsize_per_re = 0;
+	uint32_t sc_num, r_num, q_size, r_size, out_size;
+
+	/* Prevent out of range access. */
+	if (op->mldts.r_rep > 5)
+		op->mldts.r_rep = 5;
+	if (op->mldts.num_layers < 2)
+		op->mldts.num_layers = 2;
+	if (op->mldts.num_layers > 4)
+		op->mldts.num_layers = 4;
+	for (i = 0; i < op->mldts.num_layers; i++)
+		outsize_per_re += op->mldts.q_m[i];
+	sc_num = op->mldts.num_rbs * RTE_BBDEV_SCPERRB * (op->mldts.c_rep + 1);
+	r_num = op->mldts.num_rbs * sc_factor_per_rrep[op->mldts.r_rep];
+	q_size = qsize_per_re[op->mldts.num_layers - 2] * sc_num;
+	r_size = rsize_per_re[op->mldts.num_layers - 2] * r_num;
+	out_size =  sc_num * outsize_per_re;
+	rte_bbdev_log_debug("Sc %u R num %u Size %u %u %u", sc_num, r_num, q_size, r_size, out_size);
+
+	/* FCW already done. */
+	acc_header_init(desc);
+	desc->data_ptrs[1].address = rte_pktmbuf_iova_offset(input_q, *in_offset);
+	desc->data_ptrs[1].blen = q_size;
+	desc->data_ptrs[1].blkid = ACC_DMA_BLKID_IN;
+	desc->data_ptrs[1].last = 0;
+	desc->data_ptrs[1].dma_ext = 0;
+	desc->data_ptrs[2].address = rte_pktmbuf_iova_offset(input_r, *in_offset);
+	desc->data_ptrs[2].blen = r_size;
+	desc->data_ptrs[2].blkid = ACC_DMA_BLKID_IN_MLD_R;
+	desc->data_ptrs[2].last = 1;
+	desc->data_ptrs[2].dma_ext = 0;
+	desc->data_ptrs[3].address = rte_pktmbuf_iova_offset(output, *out_offset);
+	desc->data_ptrs[3].blen = out_size;
+	desc->data_ptrs[3].blkid = ACC_DMA_BLKID_OUT_HARD;
+	desc->data_ptrs[3].last = 1;
+	desc->data_ptrs[3].dma_ext = 0;
+	desc->m2dlen = 3;
+	desc->d2mlen = 1;
+	desc->op_addr = op;
+	desc->cbs_in_tb = 1;
+
+	return 0;
+}
+
+/* Check whether the MLD operation can be processed as a single operation. */
+static inline bool
+vrb2_check_mld_r_constraint(struct rte_bbdev_mldts_op *op) {
+	uint8_t layer_idx, rrep_idx;
+	uint16_t max_rb[VRB2_MLD_LAY_SIZE][VRB2_MLD_RREP_SIZE] = {
+			{188, 275, 275, 275, 0, 275},
+			{101, 202, 275, 275, 0, 275},
+			{62, 124, 186, 248, 0, 275} };
+
+	if (op->mldts.c_rep == 0)
+		return true;
+
+	layer_idx = RTE_MIN(op->mldts.num_layers - VRB2_MLD_MIN_LAYER,
+			VRB2_MLD_MAX_LAYER - VRB2_MLD_MIN_LAYER);
+	rrep_idx = RTE_MIN(op->mldts.r_rep, VRB2_MLD_MAX_RREP);
+	rte_bbdev_log_debug("RB %d index %d %d max %d", op->mldts.num_rbs, layer_idx, rrep_idx,
+			max_rb[layer_idx][rrep_idx]);
+
+	return (op->mldts.num_rbs <= max_rb[layer_idx][rrep_idx]);
+}
+
+/** Enqueue MLDTS operation split across symbols. */
+static inline int
+enqueue_mldts_split_op(struct acc_queue *q, struct rte_bbdev_mldts_op *op,
+		uint16_t total_enqueued_descs)
+{
+	uint16_t qsize_per_re[VRB2_MLD_LAY_SIZE] = {8, 12, 16}; /* Layer 2 to 4. */
+	uint16_t rsize_per_re[VRB2_MLD_LAY_SIZE] = {14, 26, 42};
+	uint16_t sc_factor_per_rrep[VRB2_MLD_RREP_SIZE] = {12, 6, 4, 3, 0, 2};
+	uint32_t i, outsize_per_re = 0, sc_num, r_num, q_size, r_size, out_size, num_syms;
+	union acc_dma_desc *desc, *first_desc;
+	uint16_t desc_idx, symb;
+	struct rte_mbuf *input_q, *input_r, *output;
+	uint32_t in_offset, out_offset;
+	struct acc_fcw_mldts *fcw;
+
+	desc_idx = ((q->sw_ring_head + total_enqueued_descs) & q->sw_ring_wrap_mask);
+	first_desc = q->ring_addr + desc_idx;
+	input_q = op->mldts.qhy_input.data;
+	input_r = op->mldts.r_input.data;
+	output = op->mldts.output.data;
+	in_offset = op->mldts.qhy_input.offset;
+	out_offset = op->mldts.output.offset;
+	num_syms = op->mldts.c_rep + 1;
+	fcw = &first_desc->req.fcw_mldts;
+	vrb2_fcw_mldts_fill(op, fcw);
+	fcw->Crep = 0; /* C rep forced to zero. */
+
+	/* Prevent out of range access. */
+	if (op->mldts.r_rep > 5)
+		op->mldts.r_rep = 5;
+	if (op->mldts.num_layers < 2)
+		op->mldts.num_layers = 2;
+	if (op->mldts.num_layers > 4)
+		op->mldts.num_layers = 4;
+
+	for (i = 0; i < op->mldts.num_layers; i++)
+		outsize_per_re += op->mldts.q_m[i];
+	sc_num = op->mldts.num_rbs * RTE_BBDEV_SCPERRB; /* C rep forced to zero. */
+	r_num = op->mldts.num_rbs * sc_factor_per_rrep[op->mldts.r_rep];
+	q_size = qsize_per_re[op->mldts.num_layers - 2] * sc_num;
+	r_size = rsize_per_re[op->mldts.num_layers - 2] * r_num;
+	out_size =  sc_num * outsize_per_re;
+
+	for (symb = 0; symb < num_syms; symb++) {
+		desc_idx = ((q->sw_ring_head + total_enqueued_descs + symb) & q->sw_ring_wrap_mask);
+		desc = q->ring_addr + desc_idx;
+		acc_header_init(&desc->req);
+		if (symb == 0)
+			desc->req.cbs_in_tb = num_syms;
+		else
+			rte_memcpy(&desc->req.fcw_mldts, fcw, ACC_FCW_MLDTS_BLEN);
+		desc->req.data_ptrs[1].address = rte_pktmbuf_iova_offset(input_q, in_offset);
+		desc->req.data_ptrs[1].blen = q_size;
+		in_offset += q_size;
+		desc->req.data_ptrs[1].blkid = ACC_DMA_BLKID_IN;
+		desc->req.data_ptrs[1].last = 0;
+		desc->req.data_ptrs[1].dma_ext = 0;
+		desc->req.data_ptrs[2].address = rte_pktmbuf_iova_offset(input_r, 0);
+		desc->req.data_ptrs[2].blen = r_size;
+		desc->req.data_ptrs[2].blkid = ACC_DMA_BLKID_IN_MLD_R;
+		desc->req.data_ptrs[2].last = 1;
+		desc->req.data_ptrs[2].dma_ext = 0;
+		desc->req.data_ptrs[3].address = rte_pktmbuf_iova_offset(output, out_offset);
+		desc->req.data_ptrs[3].blen = out_size;
+		out_offset += out_size;
+		desc->req.data_ptrs[3].blkid = ACC_DMA_BLKID_OUT_HARD;
+		desc->req.data_ptrs[3].last = 1;
+		desc->req.data_ptrs[3].dma_ext = 0;
+		desc->req.m2dlen = VRB2_MLD_M2DLEN;
+		desc->req.d2mlen = 1;
+		desc->req.op_addr = op;
+
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		acc_memdump("FCW", &desc->req.fcw_mldts, sizeof(desc->req.fcw_mldts));
+		acc_memdump("Req Desc.", desc, sizeof(*desc));
+#endif
+	}
+	desc->req.sdone_enable = 0;
+
+	return num_syms;
+}
+
+/** Enqueue one MLDTS operation. */
+static inline int
+enqueue_mldts_one_op(struct acc_queue *q, struct rte_bbdev_mldts_op *op,
+		uint16_t total_enqueued_descs)
+{
+	union acc_dma_desc *desc;
+	uint16_t desc_idx;
+	struct rte_mbuf *input_q, *input_r, *output;
+	uint32_t in_offset, out_offset;
+	struct acc_fcw_mldts *fcw;
+
+	desc_idx = ((q->sw_ring_head + total_enqueued_descs) & q->sw_ring_wrap_mask);
+	desc = q->ring_addr + desc_idx;
+	input_q = op->mldts.qhy_input.data;
+	input_r = op->mldts.r_input.data;
+	output = op->mldts.output.data;
+	in_offset = op->mldts.qhy_input.offset;
+	out_offset = op->mldts.output.offset;
+	fcw = &desc->req.fcw_mldts;
+	vrb2_fcw_mldts_fill(op, fcw);
+	vrb2_dma_desc_mldts_fill(op, &desc->req, input_q, input_r, output,
+			&in_offset, &out_offset);
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	acc_memdump("FCW", &desc->req.fcw_mldts, sizeof(desc->req.fcw_mldts));
+	acc_memdump("Req Desc.", desc, sizeof(*desc));
+#endif
+	return 1;
+}
+
+/* Enqueue MLDTS operations. */
+static uint16_t
+vrb2_enqueue_mldts(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_mldts_op **ops, uint16_t num)
+{
+	int32_t aq_avail, avail;
+	struct acc_queue *q = q_data->queue_private;
+	uint16_t i, enqueued_descs = 0, descs_in_op;
+	int ret;
+	bool as_one_op;
+
+	aq_avail = acc_aq_avail(q_data, num);
+	if (unlikely((aq_avail <= 0) || (num == 0)))
+		return 0;
+	avail = acc_ring_avail_enq(q);
+
+	for (i = 0; i < num; ++i) {
+		as_one_op = vrb2_check_mld_r_constraint(ops[i]);
+		descs_in_op = as_one_op ? 1 : ops[i]->mldts.c_rep + 1;
+
+		/* Check if there is available space for further processing. */
+		if (unlikely(avail < descs_in_op)) {
+			acc_enqueue_ring_full(q_data);
+			break;
+		}
+		avail -= descs_in_op;
+
+		if (as_one_op)
+			ret = enqueue_mldts_one_op(q, ops[i], enqueued_descs);
+		else
+			ret = enqueue_mldts_split_op(q, ops[i], enqueued_descs);
+
+		if (ret < 0) {
+			acc_enqueue_invalid(q_data);
+			break;
+		}
+
+		enqueued_descs += ret;
+	}
+
+	if (unlikely(i == 0))
+		return 0; /* Nothing to enqueue. */
+
+	acc_dma_enqueue(q, enqueued_descs, &q_data->queue_stats);
+
+	/* Update stats. */
+	q_data->queue_stats.enqueued_count += i;
+	q_data->queue_stats.enqueue_err_count += num - i;
+	return i;
+}
+
+/*
+ * Dequeue one MLDTS operation.
+ * This may have been split over multiple descriptors.
+ */
+static inline int
+dequeue_mldts_one_op(struct rte_bbdev_queue_data *q_data,
+		struct acc_queue *q, struct rte_bbdev_mldts_op **ref_op,
+		uint16_t dequeued_ops, uint32_t *aq_dequeued)
+{
+	union acc_dma_desc *desc, atom_desc, *last_desc;
+	union acc_dma_rsp_desc rsp;
+	struct rte_bbdev_mldts_op *op;
+	uint8_t descs_in_op, i;
+
+	desc = acc_desc_tail(q, dequeued_ops);
+	atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc, __ATOMIC_RELAXED);
+
+	/* Check fdone bit. */
+	if (!(atom_desc.rsp.val & ACC_FDONE))
+		return -1;
+
+	descs_in_op = desc->req.cbs_in_tb;
+	if (descs_in_op > 1) {
+		/* Get last CB. */
+		last_desc = q->ring_addr + ((q->sw_ring_tail + dequeued_ops + descs_in_op - 1)
+				& q->sw_ring_wrap_mask);
+		/* Check if last op is ready to dequeue by checking fdone bit. If not exit. */
+		atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc, __ATOMIC_RELAXED);
+		if (!(atom_desc.rsp.val & ACC_FDONE))
+			return -1;
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+		acc_memdump("Last Resp", &last_desc->rsp.val, sizeof(desc->rsp.val));
+#endif
+		/* Check each operation iteratively using fdone. */
+		for (i = 1; i < descs_in_op - 1; i++) {
+			last_desc = q->ring_addr + ((q->sw_ring_tail + dequeued_ops + i)
+					& q->sw_ring_wrap_mask);
+			atom_desc.atom_hdr = __atomic_load_n((uint64_t *)last_desc,
+					__ATOMIC_RELAXED);
+			if (!(atom_desc.rsp.val & ACC_FDONE))
+				return -1;
+		}
+	}
+#ifdef RTE_LIBRTE_BBDEV_DEBUG
+	acc_memdump("Resp", &desc->rsp.val, sizeof(desc->rsp.val));
+#endif
+	/* Dequeue. */
+	op = desc->req.op_addr;
+
+	/* Clearing status, it will be set based on response. */
+	op->status = 0;
+
+	for (i = 0; i < descs_in_op; i++) {
+		desc = q->ring_addr + ((q->sw_ring_tail + dequeued_ops + i) & q->sw_ring_wrap_mask);
+		atom_desc.atom_hdr = __atomic_load_n((uint64_t *)desc, __ATOMIC_RELAXED);
+		rsp.val = atom_desc.rsp.val;
+		op->status |= rsp.input_err << RTE_BBDEV_DATA_ERROR;
+		op->status |= rsp.dma_err << RTE_BBDEV_DRV_ERROR;
+		op->status |= rsp.fcw_err << RTE_BBDEV_DRV_ERROR;
+		op->status |= rsp.engine_hung << RTE_BBDEV_ENGINE_ERROR;
+	}
+
+	if (op->status != 0)
+		q_data->queue_stats.dequeue_err_count++;
+	if (op->status & (1 << RTE_BBDEV_DRV_ERROR))
+		vrb_check_ir(q->d);
+
+	/* Check if this is the last desc in batch (Atomic Queue). */
+	if (desc->req.last_desc_in_batch) {
+		(*aq_dequeued)++;
+		desc->req.last_desc_in_batch = 0;
+	}
+	desc->rsp.val = ACC_DMA_DESC_TYPE;
+	desc->rsp.add_info_0 = 0;
+	*ref_op = op;
+
+	return descs_in_op;
+}
+
+/* Dequeue MLDTS operations from VRB2 device. */
+static uint16_t
+vrb2_dequeue_mldts(struct rte_bbdev_queue_data *q_data,
+		struct rte_bbdev_mldts_op **ops, uint16_t num)
+{
+	struct acc_queue *q = q_data->queue_private;
+	uint16_t dequeue_num, i, dequeued_cbs = 0;
+	uint32_t avail = acc_ring_avail_deq(q);
+	uint32_t aq_dequeued = 0;
+	int ret;
+
+	dequeue_num = RTE_MIN(avail, num);
+
+	for (i = 0; i < dequeue_num; ++i) {
+		ret = dequeue_mldts_one_op(q_data, q, &ops[i], dequeued_cbs, &aq_dequeued);
+		if (ret <= 0)
+			break;
+		dequeued_cbs += ret;
+	}
+
+	q->aq_dequeued += aq_dequeued;
+	q->sw_ring_tail += dequeued_cbs;
+	/* Update dequeue stats. */
+	q_data->queue_stats.dequeued_count += i;
+	return i;
+}
+
 /* Initialization Function */
 static void
 vrb_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
@@ -3417,6 +4570,8 @@ vrb_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
 	dev->dequeue_ldpc_dec_ops = vrb_dequeue_ldpc_dec;
 	dev->enqueue_fft_ops = vrb_enqueue_fft;
 	dev->dequeue_fft_ops = vrb_dequeue_fft;
+	dev->enqueue_mldts_ops = vrb2_enqueue_mldts;
+	dev->dequeue_mldts_ops = vrb2_dequeue_mldts;
 
 	d->pf_device = !strcmp(drv->driver.name, RTE_STR(VRB_PF_DRIVER_NAME));
 	d->mmio_base = pci_dev->mem_resource[0].addr;
@@ -3433,6 +4588,16 @@ vrb_bbdev_init(struct rte_bbdev *dev, struct rte_pci_driver *drv)
 			d->reg_addr = &vrb1_pf_reg_addr;
 		else
 			d->reg_addr = &vrb1_vf_reg_addr;
+	} else {
+		d->device_variant = VRB2_VARIANT;
+		d->queue_offset = vrb2_queue_offset;
+		d->fcw_ld_fill = vrb2_fcw_ld_fill;
+		d->num_qgroups = VRB2_NUM_QGRPS;
+		d->num_aqs = VRB2_NUM_AQS;
+		if (d->pf_device)
+			d->reg_addr = &vrb2_pf_reg_addr;
+		else
+			d->reg_addr = &vrb2_vf_reg_addr;
 	}
 
 	rte_bbdev_log_debug("Init device %s [%s] @ vaddr %p paddr %#"PRIx64"",
diff --git a/drivers/baseband/acc/vrb1_pf_enum.h b/drivers/baseband/acc/vrb1_pf_enum.h
index 82a36685e9..6dc359800f 100644
--- a/drivers/baseband/acc/vrb1_pf_enum.h
+++ b/drivers/baseband/acc/vrb1_pf_enum.h
@@ -98,11 +98,18 @@ enum {
 	ACC_PF_INT_DMA_UL5G_DESC_IRQ = 8,
 	ACC_PF_INT_DMA_DL5G_DESC_IRQ = 9,
 	ACC_PF_INT_DMA_MLD_DESC_IRQ = 10,
-	ACC_PF_INT_ARAM_ECC_1BIT_ERR = 11,
-	ACC_PF_INT_PARITY_ERR = 12,
-	ACC_PF_INT_QMGR_ERR = 13,
-	ACC_PF_INT_INT_REQ_OVERFLOW = 14,
-	ACC_PF_INT_APB_TIMEOUT = 15,
+	ACC_PF_INT_ARAM_ACCESS_ERR = 11,
+	ACC_PF_INT_ARAM_ECC_1BIT_ERR = 12,
+	ACC_PF_INT_PARITY_ERR = 13,
+	ACC_PF_INT_QMGR_OVERFLOW = 14,
+	ACC_PF_INT_QMGR_ERR = 15,
+	ACC_PF_INT_ATS_ERR = 22,
+	ACC_PF_INT_ARAM_FUUL = 23,
+	ACC_PF_INT_EXTRA_READ = 24,
+	ACC_PF_INT_COMPLETION_TIMEOUT = 25,
+	ACC_PF_INT_CORE_HANG = 26,
+	ACC_PF_INT_DS_HANG = 27,
+	ACC_PF_INT_DMA_HANG = 28,
 };
 
 #endif /* VRB1_PF_ENUM_H */
diff --git a/drivers/baseband/acc/vrb2_pf_enum.h b/drivers/baseband/acc/vrb2_pf_enum.h
new file mode 100644
index 0000000000..e2801e2d7d
--- /dev/null
+++ b/drivers/baseband/acc/vrb2_pf_enum.h
@@ -0,0 +1,124 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2021 Intel Corporation
+ */
+
+#ifndef VRB2_PF_ENUM_H
+#define VRB2_PF_ENUM_H
+
+/*
+ * VRB2 Register mapping on PF BAR0
+ * This is automatically generated from RDL; the format may change with a new
+ * RDL release.
+ * Variable names are kept as is.
+ */
+enum {
+	VRB2_PfQmgrEgressQueuesTemplate             = 0x0007FC00,
+	VRB2_PfQmgrIngressAq                        = 0x00100000,
+	VRB2_PfQmgrSoftReset                        = 0x00A00034,
+	VRB2_PfQmgrAramAllocEn                      = 0x00A000a0,
+	VRB2_PfQmgrAramAllocSetupN0                 = 0x00A000b0,
+	VRB2_PfQmgrAramAllocSetupN1                 = 0x00A000b4,
+	VRB2_PfQmgrAramAllocSetupN2                 = 0x00A000b8,
+	VRB2_PfQmgrAramAllocSetupN3                 = 0x00A000bc,
+	VRB2_PfQmgrDepthLog2Grp                     = 0x00A00200,
+	VRB2_PfQmgrTholdGrp                         = 0x00A00300,
+	VRB2_PfQmgrGrpTmplateReg0Indx               = 0x00A00600,
+	VRB2_PfQmgrGrpTmplateReg1Indx               = 0x00A00700,
+	VRB2_PfQmgrGrpTmplateReg2Indx               = 0x00A00800,
+	VRB2_PfQmgrGrpTmplateReg3Indx               = 0x00A00900,
+	VRB2_PfQmgrGrpTmplateReg4Indx               = 0x00A00A00,
+	VRB2_PfQmgrGrpTmplateReg5Indx               = 0x00A00B00,
+	VRB2_PfQmgrGrpTmplateReg6Indx               = 0x00A00C00,
+	VRB2_PfQmgrGrpTmplateReg7Indx               = 0x00A00D00,
+	VRB2_PfQmgrGrpTmplateEnRegIndx              = 0x00A00E00,
+	VRB2_PfQmgrArbQDepthGrp                     = 0x00A02F00,
+	VRB2_PfQmgrGrpFunction0                     = 0x00A02F80,
+	VRB2_PfQmgrGrpPriority                      = 0x00A02FC0,
+	VRB2_PfQmgrVfBaseAddr                       = 0x00A08000,
+	VRB2_PfQmgrAqEnableVf                       = 0x00A10000,
+	VRB2_PfQmgrRingSizeVf                       = 0x00A20010,
+	VRB2_PfQmgrGrpDepthLog20Vf                  = 0x00A20020,
+	VRB2_PfQmgrGrpDepthLog21Vf                  = 0x00A20024,
+	VRB2_PfFabricM2iBufferReg                   = 0x00B30000,
+	VRB2_PfFecUl5gIbDebug0Reg                   = 0x00B401FC,
+	VRB2_PfFftConfig0                           = 0x00B58004,
+	VRB2_PfFftParityMask8                       = 0x00B5803C,
+	VRB2_PfDmaConfig0Reg                        = 0x00B80000,
+	VRB2_PfDmaConfig1Reg                        = 0x00B80004,
+	VRB2_PfDmaQmgrAddrReg                       = 0x00B80008,
+	VRB2_PfDmaAxcacheReg                        = 0x00B80010,
+	VRB2_PfDmaAxiControl                        = 0x00B8002C,
+	VRB2_PfDmaQmanen                            = 0x00B80040,
+	VRB2_PfDmaQmanenSelect                      = 0x00B80044,
+	VRB2_PfDmaCfgRrespBresp                     = 0x00B80814,
+	VRB2_PfDmaDescriptorSignature               = 0x00B80868,
+	VRB2_PfDmaErrorDetectionEn                  = 0x00B80870,
+	VRB2_PfDmaFec5GulDescBaseLoRegVf            = 0x00B88020,
+	VRB2_PfDmaFec5GulDescBaseHiRegVf            = 0x00B88024,
+	VRB2_PfDmaFec5GulRespPtrLoRegVf             = 0x00B88028,
+	VRB2_PfDmaFec5GulRespPtrHiRegVf             = 0x00B8802C,
+	VRB2_PfDmaFec5GdlDescBaseLoRegVf            = 0x00B88040,
+	VRB2_PfDmaFec5GdlDescBaseHiRegVf            = 0x00B88044,
+	VRB2_PfDmaFec5GdlRespPtrLoRegVf             = 0x00B88048,
+	VRB2_PfDmaFec5GdlRespPtrHiRegVf             = 0x00B8804C,
+	VRB2_PfDmaFec4GulDescBaseLoRegVf            = 0x00B88060,
+	VRB2_PfDmaFec4GulDescBaseHiRegVf            = 0x00B88064,
+	VRB2_PfDmaFec4GulRespPtrLoRegVf             = 0x00B88068,
+	VRB2_PfDmaFec4GulRespPtrHiRegVf             = 0x00B8806C,
+	VRB2_PfDmaFec4GdlDescBaseLoRegVf            = 0x00B88080,
+	VRB2_PfDmaFec4GdlDescBaseHiRegVf            = 0x00B88084,
+	VRB2_PfDmaFec4GdlRespPtrLoRegVf             = 0x00B88088,
+	VRB2_PfDmaFec4GdlRespPtrHiRegVf             = 0x00B8808C,
+	VRB2_PfDmaFftDescBaseLoRegVf                = 0x00B880A0,
+	VRB2_PfDmaFftDescBaseHiRegVf                = 0x00B880A4,
+	VRB2_PfDmaFftRespPtrLoRegVf                 = 0x00B880A8,
+	VRB2_PfDmaFftRespPtrHiRegVf                 = 0x00B880AC,
+	VRB2_PfDmaMldDescBaseLoRegVf                = 0x00B880C0,
+	VRB2_PfDmaMldDescBaseHiRegVf                = 0x00B880C4,
+	VRB2_PfQosmonAEvalOverflow0                 = 0x00B90008,
+	VRB2_PfPermonACntrlRegVf                    = 0x00B98000,
+	VRB2_PfQosmonBEvalOverflow0                 = 0x00BA0008,
+	VRB2_PfPermonBCntrlRegVf                    = 0x00BA8000,
+	VRB2_PfPermonCCntrlRegVf                    = 0x00BB8000,
+	VRB2_PfHiInfoRingBaseLoRegPf                = 0x00C84014,
+	VRB2_PfHiInfoRingBaseHiRegPf                = 0x00C84018,
+	VRB2_PfHiInfoRingPointerRegPf               = 0x00C8401C,
+	VRB2_PfHiInfoRingIntWrEnRegPf               = 0x00C84020,
+	VRB2_PfHiBlockTransmitOnErrorEn             = 0x00C84038,
+	VRB2_PfHiCfgMsiIntWrEnRegPf                 = 0x00C84040,
+	VRB2_PfHiMsixVectorMapperPf                 = 0x00C84060,
+	VRB2_PfHiPfMode                             = 0x00C84108,
+	VRB2_PfHiClkGateHystReg                     = 0x00C8410C,
+	VRB2_PfHiMsiDropEnableReg                   = 0x00C84114,
+	VRB2_PfHiSectionPowerGatingReq              = 0x00C84128,
+	VRB2_PfHiSectionPowerGatingAck              = 0x00C8412C,
+};
+
+/* TIP PF Interrupt numbers */
+enum {
+	VRB2_PF_INT_QMGR_AQ_OVERFLOW = 0,
+	VRB2_PF_INT_DOORBELL_VF_2_PF = 1,
+	VRB2_PF_INT_ILLEGAL_FORMAT = 2,
+	VRB2_PF_INT_QMGR_DISABLED_ACCESS = 3,
+	VRB2_PF_INT_QMGR_AQ_OVERTHRESHOLD = 4,
+	VRB2_PF_INT_DMA_DL_DESC_IRQ = 5,
+	VRB2_PF_INT_DMA_UL_DESC_IRQ = 6,
+	VRB2_PF_INT_DMA_FFT_DESC_IRQ = 7,
+	VRB2_PF_INT_DMA_UL5G_DESC_IRQ = 8,
+	VRB2_PF_INT_DMA_DL5G_DESC_IRQ = 9,
+	VRB2_PF_INT_DMA_MLD_DESC_IRQ = 10,
+	VRB2_PF_INT_ARAM_ACCESS_ERR = 11,
+	VRB2_PF_INT_ARAM_ECC_1BIT_ERR = 12,
+	VRB2_PF_INT_PARITY_ERR = 13,
+	VRB2_PF_INT_QMGR_OVERFLOW = 14,
+	VRB2_PF_INT_QMGR_ERR = 15,
+	VRB2_PF_INT_ATS_ERR = 22,
+	VRB2_PF_INT_ARAM_FUUL = 23,
+	VRB2_PF_INT_EXTRA_READ = 24,
+	VRB2_PF_INT_COMPLETION_TIMEOUT = 25,
+	VRB2_PF_INT_CORE_HANG = 26,
+	VRB2_PF_INT_DS_HANG = 27,
+	VRB2_PF_INT_DMA_HANG = 28,
+};
+
+#endif /* VRB2_PF_ENUM_H */
diff --git a/drivers/baseband/acc/vrb2_vf_enum.h b/drivers/baseband/acc/vrb2_vf_enum.h
new file mode 100644
index 0000000000..69debc9116
--- /dev/null
+++ b/drivers/baseband/acc/vrb2_vf_enum.h
@@ -0,0 +1,121 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2021 Intel Corporation
+ */
+
+#ifndef VRB2_VF_ENUM_H
+#define VRB2_VF_ENUM_H
+
+/*
+ * VRB2 Register mapping on VF BAR0
+ * Auto-generated from RDL; the format may change with a new RDL release.
+ */
+enum {
+	VRB2_VfHiVfToPfDbellVf           = 0x00000000,
+	VRB2_VfHiPfToVfDbellVf           = 0x00000008,
+	VRB2_VfHiInfoRingBaseLoVf        = 0x00000010,
+	VRB2_VfHiInfoRingBaseHiVf        = 0x00000014,
+	VRB2_VfHiInfoRingPointerVf       = 0x00000018,
+	VRB2_VfHiInfoRingIntWrEnVf       = 0x00000020,
+	VRB2_VfHiInfoRingPf2VfWrEnVf     = 0x00000024,
+	VRB2_VfHiMsixVectorMapperVf      = 0x00000060,
+	VRB2_VfHiDeviceStatus            = 0x00000068,
+	VRB2_VfHiInterruptSrc            = 0x00000070,
+	VRB2_VfDmaFec5GulDescBaseLoRegVf = 0x00000120,
+	VRB2_VfDmaFec5GulDescBaseHiRegVf = 0x00000124,
+	VRB2_VfDmaFec5GulRespPtrLoRegVf  = 0x00000128,
+	VRB2_VfDmaFec5GulRespPtrHiRegVf  = 0x0000012C,
+	VRB2_VfDmaFec5GdlDescBaseLoRegVf = 0x00000140,
+	VRB2_VfDmaFec5GdlDescBaseHiRegVf = 0x00000144,
+	VRB2_VfDmaFec5GdlRespPtrLoRegVf  = 0x00000148,
+	VRB2_VfDmaFec5GdlRespPtrHiRegVf  = 0x0000014C,
+	VRB2_VfDmaFec4GulDescBaseLoRegVf = 0x00000160,
+	VRB2_VfDmaFec4GulDescBaseHiRegVf = 0x00000164,
+	VRB2_VfDmaFec4GulRespPtrLoRegVf  = 0x00000168,
+	VRB2_VfDmaFec4GulRespPtrHiRegVf  = 0x0000016C,
+	VRB2_VfDmaFec4GdlDescBaseLoRegVf = 0x00000180,
+	VRB2_VfDmaFec4GdlDescBaseHiRegVf = 0x00000184,
+	VRB2_VfDmaFec4GdlRespPtrLoRegVf  = 0x00000188,
+	VRB2_VfDmaFec4GdlRespPtrHiRegVf  = 0x0000018C,
+	VRB2_VfDmaFftDescBaseLoRegVf     = 0x000001A0,
+	VRB2_VfDmaFftDescBaseHiRegVf     = 0x000001A4,
+	VRB2_VfDmaFftRespPtrLoRegVf      = 0x000001A8,
+	VRB2_VfDmaFftRespPtrHiRegVf      = 0x000001AC,
+	VRB2_VfDmaMldDescBaseLoRegVf     = 0x000001C0,
+	VRB2_VfDmaMldDescBaseHiRegVf     = 0x000001C4,
+	VRB2_VfDmaMldRespPtrLoRegVf      = 0x000001C8,
+	VRB2_VfDmaMldRespPtrHiRegVf      = 0x000001CC,
+	VRB2_VfPmACntrlRegVf             = 0x00000200,
+	VRB2_VfPmACountVf                = 0x00000208,
+	VRB2_VfPmAKCntLoVf               = 0x00000210,
+	VRB2_VfPmAKCntHiVf               = 0x00000214,
+	VRB2_VfPmADeltaCntLoVf           = 0x00000220,
+	VRB2_VfPmADeltaCntHiVf           = 0x00000224,
+	VRB2_VfPmBCntrlRegVf             = 0x00000240,
+	VRB2_VfPmBCountVf                = 0x00000248,
+	VRB2_VfPmBKCntLoVf               = 0x00000250,
+	VRB2_VfPmBKCntHiVf               = 0x00000254,
+	VRB2_VfPmBDeltaCntLoVf           = 0x00000260,
+	VRB2_VfPmBDeltaCntHiVf           = 0x00000264,
+	VRB2_VfPmCCntrlRegVf             = 0x00000280,
+	VRB2_VfPmCCountVf                = 0x00000288,
+	VRB2_VfPmCKCntLoVf               = 0x00000290,
+	VRB2_VfPmCKCntHiVf               = 0x00000294,
+	VRB2_VfPmCDeltaCntLoVf           = 0x000002A0,
+	VRB2_VfPmCDeltaCntHiVf           = 0x000002A4,
+	VRB2_VfPmDCntrlRegVf             = 0x000002C0,
+	VRB2_VfPmDCountVf                = 0x000002C8,
+	VRB2_VfPmDKCntLoVf               = 0x000002D0,
+	VRB2_VfPmDKCntHiVf               = 0x000002D4,
+	VRB2_VfPmDDeltaCntLoVf           = 0x000002E0,
+	VRB2_VfPmDDeltaCntHiVf           = 0x000002E4,
+	VRB2_VfPmECntrlRegVf             = 0x00000300,
+	VRB2_VfPmECountVf                = 0x00000308,
+	VRB2_VfPmEKCntLoVf               = 0x00000310,
+	VRB2_VfPmEKCntHiVf               = 0x00000314,
+	VRB2_VfPmEDeltaCntLoVf           = 0x00000320,
+	VRB2_VfPmEDeltaCntHiVf           = 0x00000324,
+	VRB2_VfPmFCntrlRegVf             = 0x00000340,
+	VRB2_VfPmFCountVf                = 0x00000348,
+	VRB2_VfPmFKCntLoVf               = 0x00000350,
+	VRB2_VfPmFKCntHiVf               = 0x00000354,
+	VRB2_VfPmFDeltaCntLoVf           = 0x00000360,
+	VRB2_VfPmFDeltaCntHiVf           = 0x00000364,
+	VRB2_VfQmgrAqReset0              = 0x00000600,
+	VRB2_VfQmgrAqReset1              = 0x00000604,
+	VRB2_VfQmgrAqReset2              = 0x00000608,
+	VRB2_VfQmgrAqReset3              = 0x0000060C,
+	VRB2_VfQmgrRingSizeVf            = 0x00000610,
+	VRB2_VfQmgrGrpDepthLog20Vf       = 0x00000620,
+	VRB2_VfQmgrGrpDepthLog21Vf       = 0x00000624,
+	VRB2_VfQmgrGrpDepthLog22Vf       = 0x00000628,
+	VRB2_VfQmgrGrpDepthLog23Vf       = 0x0000062C,
+	VRB2_VfQmgrGrpFunction0Vf        = 0x00000630,
+	VRB2_VfQmgrGrpFunction1Vf        = 0x00000634,
+	VRB2_VfQmgrAramUsageN0           = 0x00000640,
+	VRB2_VfQmgrAramUsageN1           = 0x00000644,
+	VRB2_VfQmgrAramUsageN2           = 0x00000648,
+	VRB2_VfQmgrAramUsageN3           = 0x0000064C,
+	VRB2_VfHiMSIXBaseLoRegVf         = 0x00001000,
+	VRB2_VfHiMSIXBaseHiRegVf         = 0x00001004,
+	VRB2_VfHiMSIXBaseDataRegVf       = 0x00001008,
+	VRB2_VfHiMSIXBaseMaskRegVf       = 0x0000100C,
+	VRB2_VfHiMSIXPBABaseLoRegVf      = 0x00003000,
+	VRB2_VfQmgrIngressAq             = 0x00004000,
+};
+
+/* TIP VF Interrupt numbers */
+enum {
+	VRB2_VF_INT_QMGR_AQ_OVERFLOW = 0,
+	VRB2_VF_INT_DOORBELL_PF_2_VF = 1,
+	VRB2_VF_INT_ILLEGAL_FORMAT = 2,
+	VRB2_VF_INT_QMGR_DISABLED_ACCESS = 3,
+	VRB2_VF_INT_QMGR_AQ_OVERTHRESHOLD = 4,
+	VRB2_VF_INT_DMA_DL_DESC_IRQ = 5,
+	VRB2_VF_INT_DMA_UL_DESC_IRQ = 6,
+	VRB2_VF_INT_DMA_FFT_DESC_IRQ = 7,
+	VRB2_VF_INT_DMA_UL5G_DESC_IRQ = 8,
+	VRB2_VF_INT_DMA_DL5G_DESC_IRQ = 9,
+	VRB2_VF_INT_DMA_MLD_DESC_IRQ = 10,
+};
+
+#endif /* VRB2_VF_ENUM_H */
diff --git a/drivers/baseband/acc/vrb_pmd.h b/drivers/baseband/acc/vrb_pmd.h
index 1cabc0b7f4..def8ceaf93 100644
--- a/drivers/baseband/acc/vrb_pmd.h
+++ b/drivers/baseband/acc/vrb_pmd.h
@@ -8,6 +8,8 @@
 #include "acc_common.h"
 #include "vrb1_pf_enum.h"
 #include "vrb1_vf_enum.h"
+#include "vrb2_pf_enum.h"
+#include "vrb2_vf_enum.h"
 #include "vrb_cfg.h"
 
 /* Helper macro for logging */
@@ -31,12 +33,13 @@
 #define RTE_VRB1_VENDOR_ID           (0x8086)
 #define RTE_VRB1_PF_DEVICE_ID        (0x57C0)
 #define RTE_VRB1_VF_DEVICE_ID        (0x57C1)
-
-#define VRB1_VARIANT               2
+#define RTE_VRB2_VENDOR_ID           (0x8086)
+#define RTE_VRB2_PF_DEVICE_ID        (0x57C2)
+#define RTE_VRB2_VF_DEVICE_ID        (0x57C3)
 
 #define VRB_NUM_ACCS                 6
 #define VRB_MAX_QGRPS                32
-#define VRB_MAX_AQS                  32
+#define VRB_MAX_AQS                  64
 
 #define ACC_STATUS_WAIT      10
 #define ACC_STATUS_TO        100
@@ -61,7 +64,6 @@
 #define VRB1_SIG_DL_4G_LAST 23
 #define VRB1_SIG_FFT        24
 #define VRB1_SIG_FFT_LAST   24
-
 #define VRB1_NUM_ACCS       5
 
 /* VRB1 Configuration */
@@ -90,6 +92,69 @@
 #define VRB1_MAX_PF_MSIX            (256+32)
 #define VRB1_MAX_VF_MSIX            (256+7)
 
+/* VRB2 specific flags */
+
+#define VRB2_NUM_VFS        64
+#define VRB2_NUM_QGRPS      32
+#define VRB2_NUM_AQS        64
+#define VRB2_GRP_ID_SHIFT    12 /* Queue Index Hierarchy */
+#define VRB2_VF_ID_SHIFT     6  /* Queue Index Hierarchy */
+#define VRB2_WORDS_IN_ARAM_SIZE (512 * 1024 / 4)
+#define VRB2_NUM_ACCS        6
+#define VRB2_AQ_REG_NUM      4
+
+/* VRB2 Mapping of signals for the available engines */
+#define VRB2_SIG_UL_5G       0
+#define VRB2_SIG_UL_5G_LAST  5
+#define VRB2_SIG_DL_5G       9
+#define VRB2_SIG_DL_5G_LAST 11
+#define VRB2_SIG_UL_4G      12
+#define VRB2_SIG_UL_4G_LAST 16
+#define VRB2_SIG_DL_4G      21
+#define VRB2_SIG_DL_4G_LAST 23
+#define VRB2_SIG_FFT        24
+#define VRB2_SIG_FFT_LAST   26
+#define VRB2_SIG_MLD        30
+#define VRB2_SIG_MLD_LAST   31
+#define VRB2_FFT_NUM        3
+
+#define VRB2_FCW_MLDTS_BLEN 32
+#define VRB2_MLD_MIN_LAYER   2
+#define VRB2_MLD_MAX_LAYER   4
+#define VRB2_MLD_MAX_RREP    5
+#define VRB2_MLD_LAY_SIZE    3
+#define VRB2_MLD_RREP_SIZE   6
+#define VRB2_MLD_M2DLEN      3
+
+#define VRB2_MAX_PF_MSIX      (256+32)
+#define VRB2_MAX_VF_MSIX      (64+7)
+#define VRB2_REG_IRQ_EN_ALL   0xFFFFFFFF  /* Enable all interrupts */
+#define VRB2_FABRIC_MODE      0x8000103
+#define VRB2_CFG_DMA_ERROR    0x7DF
+#define VRB2_CFG_AXI_CACHE    0x11
+#define VRB2_CFG_QMGR_HI_P    0x0F0F
+#define VRB2_RESET_HARD       0x1FF
+#define VRB2_ENGINES_MAX      9
+#define VRB2_GPEX_AXIMAP_NUM  17
+#define VRB2_CLOCK_GATING_EN  0x30000
+#define VRB2_FFT_CFG_0        0x2001
+#define VRB2_FFT_ECC          0x60
+#define VRB2_FFT_RAM_EN       0x80008000
+#define VRB2_FFT_RAM_DIS      0x0
+#define VRB2_FFT_RAM_SIZE     512
+#define VRB2_CLK_EN           0x00010A01
+#define VRB2_CLK_DIS          0x01F10A01
+#define VRB2_PG_MASK_0        0x1F
+#define VRB2_PG_MASK_1        0xF
+#define VRB2_PG_MASK_2        0x1
+#define VRB2_PG_MASK_3        0x0
+#define VRB2_PG_MASK_FFT      1
+#define VRB2_PG_MASK_4GUL     4
+#define VRB2_PG_MASK_5GUL     8
+#define VRB2_PF_PM_REG_OFFSET 0x10000
+#define VRB2_VF_PM_REG_OFFSET 0x40
+#define VRB2_PM_START         0x2
+
 struct acc_registry_addr {
 	unsigned int dma_ring_dl5g_hi;
 	unsigned int dma_ring_dl5g_lo;
@@ -218,4 +283,92 @@ static const struct acc_registry_addr vrb1_vf_reg_addr = {
 	.pf2vf_doorbell = VRB1_VfHiPfToVfDbellVf,
 };
 
+
+/* Structure holding registry addresses for PF */
+static const struct acc_registry_addr vrb2_pf_reg_addr = {
+	.dma_ring_dl5g_hi =  VRB2_PfDmaFec5GdlDescBaseHiRegVf,
+	.dma_ring_dl5g_lo =  VRB2_PfDmaFec5GdlDescBaseLoRegVf,
+	.dma_ring_ul5g_hi =  VRB2_PfDmaFec5GulDescBaseHiRegVf,
+	.dma_ring_ul5g_lo =  VRB2_PfDmaFec5GulDescBaseLoRegVf,
+	.dma_ring_dl4g_hi =  VRB2_PfDmaFec4GdlDescBaseHiRegVf,
+	.dma_ring_dl4g_lo =  VRB2_PfDmaFec4GdlDescBaseLoRegVf,
+	.dma_ring_ul4g_hi =  VRB2_PfDmaFec4GulDescBaseHiRegVf,
+	.dma_ring_ul4g_lo =  VRB2_PfDmaFec4GulDescBaseLoRegVf,
+	.dma_ring_fft_hi =   VRB2_PfDmaFftDescBaseHiRegVf,
+	.dma_ring_fft_lo =   VRB2_PfDmaFftDescBaseLoRegVf,
+	.dma_ring_mld_hi =   VRB2_PfDmaMldDescBaseHiRegVf,
+	.dma_ring_mld_lo =   VRB2_PfDmaMldDescBaseLoRegVf,
+	.ring_size =         VRB2_PfQmgrRingSizeVf,
+	.info_ring_hi =      VRB2_PfHiInfoRingBaseHiRegPf,
+	.info_ring_lo =      VRB2_PfHiInfoRingBaseLoRegPf,
+	.info_ring_en =      VRB2_PfHiInfoRingIntWrEnRegPf,
+	.info_ring_ptr =     VRB2_PfHiInfoRingPointerRegPf,
+	.tail_ptrs_dl5g_hi = VRB2_PfDmaFec5GdlRespPtrHiRegVf,
+	.tail_ptrs_dl5g_lo = VRB2_PfDmaFec5GdlRespPtrLoRegVf,
+	.tail_ptrs_ul5g_hi = VRB2_PfDmaFec5GulRespPtrHiRegVf,
+	.tail_ptrs_ul5g_lo = VRB2_PfDmaFec5GulRespPtrLoRegVf,
+	.tail_ptrs_dl4g_hi = VRB2_PfDmaFec4GdlRespPtrHiRegVf,
+	.tail_ptrs_dl4g_lo = VRB2_PfDmaFec4GdlRespPtrLoRegVf,
+	.tail_ptrs_ul4g_hi = VRB2_PfDmaFec4GulRespPtrHiRegVf,
+	.tail_ptrs_ul4g_lo = VRB2_PfDmaFec4GulRespPtrLoRegVf,
+	.tail_ptrs_fft_hi =  VRB2_PfDmaFftRespPtrHiRegVf,
+	.tail_ptrs_fft_lo =  VRB2_PfDmaFftRespPtrLoRegVf,
+	.tail_ptrs_mld_hi =  VRB2_PfDmaFftRespPtrHiRegVf,
+	.tail_ptrs_mld_lo =  VRB2_PfDmaFftRespPtrLoRegVf,
+	.depth_log0_offset = VRB2_PfQmgrGrpDepthLog20Vf,
+	.depth_log1_offset = VRB2_PfQmgrGrpDepthLog21Vf,
+	.qman_group_func =   VRB2_PfQmgrGrpFunction0,
+	.hi_mode =           VRB2_PfHiMsixVectorMapperPf,
+	.pf_mode =           VRB2_PfHiPfMode,
+	.pmon_ctrl_a =       VRB2_PfPermonACntrlRegVf,
+	.pmon_ctrl_b =       VRB2_PfPermonBCntrlRegVf,
+	.pmon_ctrl_c =       VRB2_PfPermonCCntrlRegVf,
+	.vf2pf_doorbell =    0,
+	.pf2vf_doorbell =    0,
+};
+
+/* Structure holding registry addresses for VF */
+static const struct acc_registry_addr vrb2_vf_reg_addr = {
+	.dma_ring_dl5g_hi =  VRB2_VfDmaFec5GdlDescBaseHiRegVf,
+	.dma_ring_dl5g_lo =  VRB2_VfDmaFec5GdlDescBaseLoRegVf,
+	.dma_ring_ul5g_hi =  VRB2_VfDmaFec5GulDescBaseHiRegVf,
+	.dma_ring_ul5g_lo =  VRB2_VfDmaFec5GulDescBaseLoRegVf,
+	.dma_ring_dl4g_hi =  VRB2_VfDmaFec4GdlDescBaseHiRegVf,
+	.dma_ring_dl4g_lo =  VRB2_VfDmaFec4GdlDescBaseLoRegVf,
+	.dma_ring_ul4g_hi =  VRB2_VfDmaFec4GulDescBaseHiRegVf,
+	.dma_ring_ul4g_lo =  VRB2_VfDmaFec4GulDescBaseLoRegVf,
+	.dma_ring_fft_hi =   VRB2_VfDmaFftDescBaseHiRegVf,
+	.dma_ring_fft_lo =   VRB2_VfDmaFftDescBaseLoRegVf,
+	.dma_ring_mld_hi =   VRB2_VfDmaMldDescBaseHiRegVf,
+	.dma_ring_mld_lo =   VRB2_VfDmaMldDescBaseLoRegVf,
+	.ring_size =         VRB2_VfQmgrRingSizeVf,
+	.info_ring_hi =      VRB2_VfHiInfoRingBaseHiVf,
+	.info_ring_lo =      VRB2_VfHiInfoRingBaseLoVf,
+	.info_ring_en =      VRB2_VfHiInfoRingIntWrEnVf,
+	.info_ring_ptr =     VRB2_VfHiInfoRingPointerVf,
+	.tail_ptrs_dl5g_hi = VRB2_VfDmaFec5GdlRespPtrHiRegVf,
+	.tail_ptrs_dl5g_lo = VRB2_VfDmaFec5GdlRespPtrLoRegVf,
+	.tail_ptrs_ul5g_hi = VRB2_VfDmaFec5GulRespPtrHiRegVf,
+	.tail_ptrs_ul5g_lo = VRB2_VfDmaFec5GulRespPtrLoRegVf,
+	.tail_ptrs_dl4g_hi = VRB2_VfDmaFec4GdlRespPtrHiRegVf,
+	.tail_ptrs_dl4g_lo = VRB2_VfDmaFec4GdlRespPtrLoRegVf,
+	.tail_ptrs_ul4g_hi = VRB2_VfDmaFec4GulRespPtrHiRegVf,
+	.tail_ptrs_ul4g_lo = VRB2_VfDmaFec4GulRespPtrLoRegVf,
+	.tail_ptrs_fft_hi =  VRB2_VfDmaFftRespPtrHiRegVf,
+	.tail_ptrs_fft_lo =  VRB2_VfDmaFftRespPtrLoRegVf,
+	.tail_ptrs_mld_hi =  VRB2_VfDmaMldRespPtrHiRegVf,
+	.tail_ptrs_mld_lo =  VRB2_VfDmaMldRespPtrLoRegVf,
+	.depth_log0_offset = VRB2_VfQmgrGrpDepthLog20Vf,
+	.depth_log1_offset = VRB2_VfQmgrGrpDepthLog21Vf,
+	.qman_group_func =   VRB2_VfQmgrGrpFunction0Vf,
+	.hi_mode =           VRB2_VfHiMsixVectorMapperVf,
+	.pf_mode =           0,
+	.pmon_ctrl_a =       VRB2_VfPmACntrlRegVf,
+	.pmon_ctrl_b =       VRB2_VfPmBCntrlRegVf,
+	.pmon_ctrl_c =       VRB2_VfPmCCntrlRegVf,
+	.vf2pf_doorbell =    VRB2_VfHiVfToPfDbellVf,
+	.pf2vf_doorbell =    VRB2_VfHiPfToVfDbellVf,
+};
+
+
 #endif /* _VRB_PMD_H_ */
-- 
2.34.1



* [PATCH v1 7/7] baseband/acc: add configure helper for VRB2
  2023-09-19  1:21 [PATCH v1 0/7] VRB2 BBDEV PMD introduction Nicolas Chautru
                   ` (5 preceding siblings ...)
  2023-09-19  1:21 ` [PATCH v1 6/7] baseband/acc: introduce the new VRB2 variant Nicolas Chautru
@ 2023-09-19  1:21 ` Nicolas Chautru
  2023-09-21  7:25 ` [PATCH v1 0/7] VRB2 BBDEV PMD introduction David Marchand
  7 siblings, 0 replies; 24+ messages in thread
From: Nicolas Chautru @ 2023-09-19  1:21 UTC (permalink / raw)
  To: dev, maxime.coquelin
  Cc: hemant.agrawal, david.marchand, hernan.vargas, Nicolas Chautru

This allows configuring the VRB2 device through a
companion configuration function within the DPDK
bbdev-test environment.

Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
---
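Note on the template enable registers written by vrb2_configure(): each
accelerator type owns a consecutive range of queue groups, and the value
written per template is a bitmask with one bit per group in that range. A
minimal sketch of that mask, assuming at most 32 queue groups; the function
name is hypothetical and not part of the driver:

```c
#include <assert.h>
#include <stdint.h>

/* Bitmask with one bit set per queue group in [first_qg, first_qg + num_qgs). */
static inline uint32_t
sketch_qgroup_mask(unsigned int first_qg, unsigned int num_qgs)
{
	uint32_t mask = 0;
	unsigned int qg;

	/* A 32-bit register can describe at most 32 queue groups. */
	assert(first_qg + num_qgs <= 32);
	for (qg = first_qg; qg < first_qg + num_qgs; qg++)
		mask |= UINT32_C(1) << qg; /* unsigned shift, safe for bit 31 */
	return mask;
}
```

For example, with 2 groups for 4GUL followed by 3 groups for 5GUL, the two
ranges start at 0 and 2 and the masks do not overlap.
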
 drivers/baseband/acc/acc100_pmd.h     |   2 +
 drivers/baseband/acc/acc_common.h     |   7 +
 drivers/baseband/acc/rte_acc100_pmd.c |   6 +-
 drivers/baseband/acc/rte_vrb_pmd.c    | 322 ++++++++++++++++++++++++++
 drivers/baseband/acc/vrb_cfg.h        |  16 ++
 5 files changed, 352 insertions(+), 1 deletion(-)
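
Note on the rte_acc_configure() change: the helper now selects the
per-device configure function from the PCI device ID and returns -ENXIO when
the ID is not recognised, instead of falling through to VRB1. A reduced
sketch of that dispatch, where the sketch_ functions are hypothetical
stand-ins for the real configure helpers and only two IDs are shown:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SKETCH_VRB1_PF_DEVICE_ID 0x57C0
#define SKETCH_VRB2_PF_DEVICE_ID 0x57C2

/* Hypothetical stand-ins; they only report which branch was taken. */
static inline int sketch_vrb1_configure(const char *dev_name)
{
	(void)dev_name;
	return 1;
}

static inline int sketch_vrb2_configure(const char *dev_name)
{
	(void)dev_name;
	return 2;
}

/* Dispatch on the PCI device ID; unknown devices are rejected explicitly. */
static inline int
sketch_configure(uint16_t device_id, const char *dev_name)
{
	if (device_id == SKETCH_VRB1_PF_DEVICE_ID)
		return sketch_vrb1_configure(dev_name);
	else if (device_id == SKETCH_VRB2_PF_DEVICE_ID)
		return sketch_vrb2_configure(dev_name);

	return -ENXIO;
}
```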

diff --git a/drivers/baseband/acc/acc100_pmd.h b/drivers/baseband/acc/acc100_pmd.h
index a48298650c..5a8965fa53 100644
--- a/drivers/baseband/acc/acc100_pmd.h
+++ b/drivers/baseband/acc/acc100_pmd.h
@@ -34,6 +34,8 @@
 #define ACC100_VENDOR_ID           (0x8086)
 #define ACC100_PF_DEVICE_ID        (0x0d5c)
 #define ACC100_VF_DEVICE_ID        (0x0d5d)
+#define VRB1_PF_DEVICE_ID          (0x57C0)
+#define VRB2_PF_DEVICE_ID          (0x57C2)
 
 /* Values used in writing to the registers */
 #define ACC100_REG_IRQ_EN_ALL          0x1FF83FF  /* Enable all interrupts */
diff --git a/drivers/baseband/acc/acc_common.h b/drivers/baseband/acc/acc_common.h
index 56578c43ba..8c2f1db262 100644
--- a/drivers/baseband/acc/acc_common.h
+++ b/drivers/baseband/acc/acc_common.h
@@ -1480,6 +1480,13 @@ get_num_cbs_in_tb_ldpc_enc(struct rte_bbdev_op_ldpc_enc *ldpc_enc)
 	return cbs_in_tb;
 }
 
+static inline void
+acc_reg_fast_write(struct acc_device *d, uint32_t offset, uint32_t value)
+{
+	void *reg_addr = RTE_PTR_ADD(d->mmio_base, offset);
+	mmio_write(reg_addr, value);
+}
+
 #ifdef RTE_LIBRTE_BBDEV_DEBUG
 static inline void
 acc_memdump(const char *string, void *buf, uint16_t bytes)
diff --git a/drivers/baseband/acc/rte_acc100_pmd.c b/drivers/baseband/acc/rte_acc100_pmd.c
index 7f8d05b5a9..699a227d13 100644
--- a/drivers/baseband/acc/rte_acc100_pmd.c
+++ b/drivers/baseband/acc/rte_acc100_pmd.c
@@ -5187,6 +5187,10 @@ rte_acc_configure(const char *dev_name, struct rte_acc_conf *conf)
 		return acc100_configure(dev_name, conf);
 	else if (pci_dev->id.device_id == ACC101_PF_DEVICE_ID)
 		return acc101_configure(dev_name, conf);
-	else
+	else if (pci_dev->id.device_id == VRB1_PF_DEVICE_ID)
 		return vrb1_configure(dev_name, conf);
+	else if (pci_dev->id.device_id == VRB2_PF_DEVICE_ID)
+		return vrb2_configure(dev_name, conf);
+
+	return -ENXIO;
 }
diff --git a/drivers/baseband/acc/rte_vrb_pmd.c b/drivers/baseband/acc/rte_vrb_pmd.c
index 36d2c8173d..76efc8faf1 100644
--- a/drivers/baseband/acc/rte_vrb_pmd.c
+++ b/drivers/baseband/acc/rte_vrb_pmd.c
@@ -5073,3 +5073,325 @@ vrb1_configure(const char *dev_name, struct rte_acc_conf *conf)
 	rte_bbdev_log_debug("PF Tip configuration complete for %s", dev_name);
 	return 0;
 }
+
+
+/* Initial configuration of a VRB2 device prior to running configure(). */
+int
+vrb2_configure(const char *dev_name, struct rte_acc_conf *conf)
+{
+	rte_bbdev_log(INFO, "vrb2_configure");
+	uint32_t value, address, status;
+	int qg_idx, template_idx, vf_idx, acc, i, aq_reg, static_allocation, numEngines;
+	int numQgs, numQqsAcc, totalQgs;
+	int qman_func_id[8] = {0, 2, 1, 3, 4, 5, 0, 0};
+	struct rte_bbdev *bbdev = rte_bbdev_get_named_dev(dev_name);
+	int rlim, alen, timestamp;
+
+	/* Compile time checks */
+	RTE_BUILD_BUG_ON(sizeof(struct acc_dma_req_desc) != 256);
+	RTE_BUILD_BUG_ON(sizeof(union acc_dma_desc) != 256);
+	RTE_BUILD_BUG_ON(sizeof(struct acc_fcw_td) != 24);
+	RTE_BUILD_BUG_ON(sizeof(struct acc_fcw_te) != 32);
+
+	if (bbdev == NULL) {
+		rte_bbdev_log(ERR,
+		"Invalid dev_name (%s), or device is not yet initialised",
+		dev_name);
+		return -ENODEV;
+	}
+	struct acc_device *d = bbdev->data->dev_private;
+
+	/* Store configuration */
+	rte_memcpy(&d->acc_conf, conf, sizeof(d->acc_conf));
+
+	/* Explicitly releasing AXI as this may be stopped after PF FLR/BME */
+	address = VRB2_PfDmaAxiControl;
+	value = 1;
+	acc_reg_write(d, address, value);
+
+	/* Set the fabric mode */
+	address = VRB2_PfFabricM2iBufferReg;
+	value = VRB2_FABRIC_MODE;
+	acc_reg_write(d, address, value);
+
+	/* Set default descriptor signature */
+	address = VRB2_PfDmaDescriptorSignature;
+	value = 0;
+	acc_reg_write(d, address, value);
+
+	/* Enable the Error Detection in DMA */
+	value = VRB2_CFG_DMA_ERROR;
+	address = VRB2_PfDmaErrorDetectionEn;
+	acc_reg_write(d, address, value);
+
+	/* AXI Cache configuration */
+	value = VRB2_CFG_AXI_CACHE;
+	address = VRB2_PfDmaAxcacheReg;
+	acc_reg_write(d, address, value);
+
+	/* AXI Response configuration */
+	acc_reg_write(d, VRB2_PfDmaCfgRrespBresp, 0x0);
+
+	/* Default DMA Configuration (Qmgr Enabled) */
+	acc_reg_write(d, VRB2_PfDmaConfig0Reg, 0);
+	acc_reg_write(d, VRB2_PfDmaQmanenSelect, 0xFFFFFFFF);
+	acc_reg_write(d, VRB2_PfDmaQmanen, 0);
+
+	/* Default RLIM/ALEN configuration */
+	rlim = 0;
+	alen = 3;
+	timestamp = 0;
+	address = VRB2_PfDmaConfig1Reg;
+	value = (1U << 31) + (rlim << 8) + (timestamp << 6) + alen;
+	acc_reg_write(d, address, value);
+
+	/* Default FFT configuration */
+	for (template_idx = 0; template_idx < VRB2_FFT_NUM; template_idx++) {
+		acc_reg_write(d, VRB2_PfFftConfig0 + template_idx * 0x1000, VRB2_FFT_CFG_0);
+		acc_reg_write(d, VRB2_PfFftParityMask8 + template_idx * 0x1000, VRB2_FFT_ECC);
+	}
+
+	/* Configure DMA Qmanager addresses */
+	address = VRB2_PfDmaQmgrAddrReg;
+	value = VRB2_PfQmgrEgressQueuesTemplate;
+	acc_reg_write(d, address, value);
+
+	/* ===== Qmgr Configuration ===== */
+	/* Configuration of the AQueue Depth QMGR_GRP_0_DEPTH_LOG2 for UL */
+	totalQgs = conf->q_ul_4g.num_qgroups + conf->q_ul_5g.num_qgroups +
+			conf->q_dl_4g.num_qgroups + conf->q_dl_5g.num_qgroups +
+			conf->q_fft.num_qgroups + conf->q_mld.num_qgroups;
+	for (qg_idx = 0; qg_idx < VRB2_NUM_QGRPS; qg_idx++) {
+		address = VRB2_PfQmgrDepthLog2Grp + ACC_BYTES_IN_WORD * qg_idx;
+		value = aqDepth(qg_idx, conf);
+		acc_reg_write(d, address, value);
+		address = VRB2_PfQmgrTholdGrp + ACC_BYTES_IN_WORD * qg_idx;
+		value = (1 << 16) + (1 << (aqDepth(qg_idx, conf) - 1));
+		acc_reg_write(d, address, value);
+	}
+
+	/* Template Priority in incremental order */
+	for (template_idx = 0; template_idx < ACC_NUM_TMPL; template_idx++) {
+		address = VRB2_PfQmgrGrpTmplateReg0Indx + ACC_BYTES_IN_WORD * template_idx;
+		value = ACC_TMPL_PRI_0;
+		acc_reg_write(d, address, value);
+		address = VRB2_PfQmgrGrpTmplateReg1Indx + ACC_BYTES_IN_WORD * template_idx;
+		value = ACC_TMPL_PRI_1;
+		acc_reg_write(d, address, value);
+		address = VRB2_PfQmgrGrpTmplateReg2Indx + ACC_BYTES_IN_WORD * template_idx;
+		value = ACC_TMPL_PRI_2;
+		acc_reg_write(d, address, value);
+		address = VRB2_PfQmgrGrpTmplateReg3Indx + ACC_BYTES_IN_WORD * template_idx;
+		value = ACC_TMPL_PRI_3;
+		acc_reg_write(d, address, value);
+		address = VRB2_PfQmgrGrpTmplateReg4Indx + ACC_BYTES_IN_WORD * template_idx;
+		value = ACC_TMPL_PRI_4;
+		acc_reg_write(d, address, value);
+		address = VRB2_PfQmgrGrpTmplateReg5Indx + ACC_BYTES_IN_WORD * template_idx;
+		value = ACC_TMPL_PRI_5;
+		acc_reg_write(d, address, value);
+		address = VRB2_PfQmgrGrpTmplateReg6Indx + ACC_BYTES_IN_WORD * template_idx;
+		value = ACC_TMPL_PRI_6;
+		acc_reg_write(d, address, value);
+		address = VRB2_PfQmgrGrpTmplateReg7Indx + ACC_BYTES_IN_WORD * template_idx;
+		value = ACC_TMPL_PRI_7;
+		acc_reg_write(d, address, value);
+	}
+
+	address = VRB2_PfQmgrGrpPriority;
+	value = VRB2_CFG_QMGR_HI_P;
+	acc_reg_write(d, address, value);
+
+	/* Template Configuration */
+	for (template_idx = 0; template_idx < ACC_NUM_TMPL; template_idx++) {
+		value = 0;
+		address = VRB2_PfQmgrGrpTmplateEnRegIndx + ACC_BYTES_IN_WORD * template_idx;
+		acc_reg_write(d, address, value);
+	}
+	/* 4GUL */
+	numQgs = conf->q_ul_4g.num_qgroups;
+	numQqsAcc = 0;
+	value = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		value |= (1U << qg_idx);
+	for (template_idx = VRB2_SIG_UL_4G; template_idx <= VRB2_SIG_UL_4G_LAST;
+			template_idx++) {
+		address = VRB2_PfQmgrGrpTmplateEnRegIndx + ACC_BYTES_IN_WORD * template_idx;
+		acc_reg_write(d, address, value);
+	}
+	/* 5GUL */
+	numQqsAcc += numQgs;
+	numQgs = conf->q_ul_5g.num_qgroups;
+	value = 0;
+	numEngines = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		value |= (1U << qg_idx);
+	for (template_idx = VRB2_SIG_UL_5G; template_idx <= VRB2_SIG_UL_5G_LAST;
+			template_idx++) {
+		/* Check engine power-on status */
+		address = VRB2_PfFecUl5gIbDebug0Reg + ACC_ENGINE_OFFSET * template_idx;
+		status = (acc_reg_read(d, address) >> 4) & 0x7;
+		address = VRB2_PfQmgrGrpTmplateEnRegIndx + ACC_BYTES_IN_WORD * template_idx;
+		if (status == 1) {
+			acc_reg_write(d, address, value);
+			numEngines++;
+		} else
+			acc_reg_write(d, address, 0);
+	}
+	rte_bbdev_log(INFO, "Number of 5GUL engines %d", numEngines);
+	/* 4GDL */
+	numQqsAcc += numQgs;
+	numQgs	= conf->q_dl_4g.num_qgroups;
+	value = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		value |= (1U << qg_idx);
+	for (template_idx = VRB2_SIG_DL_4G; template_idx <= VRB2_SIG_DL_4G_LAST;
+			template_idx++) {
+		address = VRB2_PfQmgrGrpTmplateEnRegIndx + ACC_BYTES_IN_WORD * template_idx;
+		acc_reg_write(d, address, value);
+	}
+	/* 5GDL */
+	numQqsAcc += numQgs;
+	numQgs	= conf->q_dl_5g.num_qgroups;
+	value = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		value |= (1 << qg_idx);
+	for (template_idx = VRB2_SIG_DL_5G; template_idx <= VRB2_SIG_DL_5G_LAST;
+			template_idx++) {
+		address = VRB2_PfQmgrGrpTmplateEnRegIndx + ACC_BYTES_IN_WORD * template_idx;
+		acc_reg_write(d, address, value);
+	}
+	/* FFT */
+	numQqsAcc += numQgs;
+	numQgs	= conf->q_fft.num_qgroups;
+	value = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		value |= (1 << qg_idx);
+	for (template_idx = VRB2_SIG_FFT; template_idx <= VRB2_SIG_FFT_LAST;
+			template_idx++) {
+		address = VRB2_PfQmgrGrpTmplateEnRegIndx + ACC_BYTES_IN_WORD * template_idx;
+		acc_reg_write(d, address, value);
+	}
+	/* MLD */
+	numQqsAcc += numQgs;
+	numQgs	= conf->q_mld.num_qgroups;
+	value = 0;
+	for (qg_idx = numQqsAcc; qg_idx < (numQgs + numQqsAcc); qg_idx++)
+		value |= (1 << qg_idx);
+	for (template_idx = VRB2_SIG_MLD; template_idx <= VRB2_SIG_MLD_LAST;
+			template_idx++) {
+		address = VRB2_PfQmgrGrpTmplateEnRegIndx
+				+ ACC_BYTES_IN_WORD * template_idx;
+		acc_reg_write(d, address, value);
+	}
+
+	/* Queue Group Function mapping */
+	for (i = 0; i < 4; i++) {
+		value = 0;
+		for (qg_idx = 0; qg_idx < ACC_NUM_QGRPS_PER_WORD; qg_idx++) {
+			acc = accFromQgid(qg_idx + i * ACC_NUM_QGRPS_PER_WORD, conf);
+			value |= qman_func_id[acc] << (qg_idx * 4);
+		}
+		acc_reg_write(d, VRB2_PfQmgrGrpFunction0 + i * ACC_BYTES_IN_WORD, value);
+	}
+
+	/* Configuration of the Arbitration QGroup depth to 1 */
+	for (qg_idx = 0; qg_idx < VRB2_NUM_QGRPS; qg_idx++) {
+		address = VRB2_PfQmgrArbQDepthGrp + ACC_BYTES_IN_WORD * qg_idx;
+		value = 0;
+		acc_reg_write(d, address, value);
+	}
+
+	static_allocation = 1;
+	if (static_allocation == 1) {
+		/* This pointer to ARAM (512kB) is shifted by 2 (4B per register) */
+		uint32_t aram_address = 0;
+		for (qg_idx = 0; qg_idx < totalQgs; qg_idx++) {
+			for (vf_idx = 0; vf_idx < conf->num_vf_bundles; vf_idx++) {
+				address = VRB2_PfQmgrVfBaseAddr + vf_idx
+						* ACC_BYTES_IN_WORD + qg_idx
+						* ACC_BYTES_IN_WORD * 64;
+				value = aram_address;
+				acc_reg_fast_write(d, address, value);
+				/* Offset ARAM Address for next memory bank  - increment of 4B. */
+				aram_address += aqNum(qg_idx, conf) *
+						(1 << aqDepth(qg_idx, conf));
+			}
+		}
+		if (aram_address > VRB2_WORDS_IN_ARAM_SIZE) {
+			rte_bbdev_log(ERR, "ARAM Configuration not fitting %d %d\n",
+					aram_address, VRB2_WORDS_IN_ARAM_SIZE);
+			return -EINVAL;
+		}
+	} else {
+		/* Dynamic Qmgr allocation */
+		acc_reg_write(d, VRB2_PfQmgrAramAllocEn, 1);
+		acc_reg_write(d, VRB2_PfQmgrAramAllocSetupN0, 0x1000);
+		acc_reg_write(d, VRB2_PfQmgrAramAllocSetupN1, 0);
+		acc_reg_write(d, VRB2_PfQmgrAramAllocSetupN2, 0);
+		acc_reg_write(d, VRB2_PfQmgrAramAllocSetupN3, 0);
+		acc_reg_write(d, VRB2_PfQmgrSoftReset, 1);
+		acc_reg_write(d, VRB2_PfQmgrSoftReset, 0);
+	}
+
+	/* ==== HI Configuration ==== */
+
+	/* No Info Ring/MSI by default */
+	address = VRB2_PfHiInfoRingIntWrEnRegPf;
+	value = 0;
+	acc_reg_write(d, address, value);
+	address = VRB2_PfHiCfgMsiIntWrEnRegPf;
+	value = 0xFFFFFFFF;
+	acc_reg_write(d, address, value);
+	/* Prevent Block on Transmit Error */
+	address = VRB2_PfHiBlockTransmitOnErrorEn;
+	value = 0;
+	acc_reg_write(d, address, value);
+	/* Prevent MSI from being dropped */
+	address = VRB2_PfHiMsiDropEnableReg;
+	value = 0;
+	acc_reg_write(d, address, value);
+	/* Set the PF Mode register */
+	address = VRB2_PfHiPfMode;
+	value = ((conf->pf_mode_en) ? ACC_PF_VAL : 0) | 0x1F07F0;
+	acc_reg_write(d, address, value);
+	/* Explicitly releasing AXI after PF Mode */
+	acc_reg_write(d, VRB2_PfDmaAxiControl, 1);
+
+	/* QoS overflow init */
+	value = 1;
+	address = VRB2_PfQosmonAEvalOverflow0;
+	acc_reg_write(d, address, value);
+	address = VRB2_PfQosmonBEvalOverflow0;
+	acc_reg_write(d, address, value);
+
+	/* Enabling AQueues through the Queue hierarchy */
+	unsigned int  en_bitmask[VRB2_AQ_REG_NUM];
+	for (vf_idx = 0; vf_idx < VRB2_NUM_VFS; vf_idx++) {
+		for (qg_idx = 0; qg_idx < VRB2_NUM_QGRPS; qg_idx++) {
+			for (aq_reg = 0;  aq_reg < VRB2_AQ_REG_NUM; aq_reg++)
+				en_bitmask[aq_reg] = 0;
+			if (vf_idx < conf->num_vf_bundles && qg_idx < totalQgs) {
+				for (aq_reg = 0;  aq_reg < VRB2_AQ_REG_NUM; aq_reg++) {
+					if (aqNum(qg_idx, conf) >= 16 * (aq_reg + 1))
+						en_bitmask[aq_reg] = 0xFFFF;
+					else if (aqNum(qg_idx, conf) <= 16 * aq_reg)
+						en_bitmask[aq_reg] = 0x0;
+					else
+						en_bitmask[aq_reg] = (1 << (aqNum(qg_idx,
+								conf) - aq_reg * 16)) - 1;
+				}
+			}
+			for (aq_reg = 0; aq_reg < VRB2_AQ_REG_NUM; aq_reg++) {
+				address = VRB2_PfQmgrAqEnableVf + vf_idx * 16 + aq_reg * 4;
+				value = (qg_idx << 16) + en_bitmask[aq_reg];
+				acc_reg_fast_write(d, address, value);
+			}
+		}
+	}
+
+	rte_bbdev_log(INFO,
+			"VRB2 basic config complete for %s - pf_bb_config should ideally be used instead",
+			dev_name);
+	return 0;
+}
diff --git a/drivers/baseband/acc/vrb_cfg.h b/drivers/baseband/acc/vrb_cfg.h
index e3c8902b46..79487c4e04 100644
--- a/drivers/baseband/acc/vrb_cfg.h
+++ b/drivers/baseband/acc/vrb_cfg.h
@@ -29,4 +29,20 @@
 int
 vrb1_configure(const char *dev_name, struct rte_acc_conf *conf);
 
+/**
+ * Configure a VRB2 device.
+ *
+ * @param dev_name
+ *   The name of the device. This is the short form of PCI BDF, e.g. 00:01.0.
+ *   It can also be retrieved for a bbdev device from the dev_name field in the
+ *   rte_bbdev_info structure returned by rte_bbdev_info_get().
+ * @param conf
+ *   Configuration to apply to VRB2 HW.
+ *
+ * @return
+ *   Zero on success, negative value on failure.
+ */
+int
+vrb2_configure(const char *dev_name, struct rte_acc_conf *conf);
+
 #endif /* _VRB_CFG_H_ */
-- 
2.34.1
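
For readers following the AQ enable loop at the end of vrb2_configure() above: each queue group's atomic queues are enabled through VRB2_AQ_REG_NUM registers, and the loop fills 16 enable bits per register. The standalone sketch below isolates that mask computation. aq_enable_mask() is a hypothetical helper name that does not exist in the patch, and the 16-bits-per-register width is simply read off the loop, not taken from a hardware specification.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: returns the 16-bit enable
 * mask for register index aq_reg when num_aqs atomic queues are enabled
 * in a queue group. It mirrors the three-way branch in vrb2_configure().
 */
static inline uint32_t
aq_enable_mask(unsigned int num_aqs, unsigned int aq_reg)
{
	if (num_aqs >= 16 * (aq_reg + 1))
		return 0xFFFF;	/* every queue covered by this register is enabled */
	if (num_aqs <= 16 * aq_reg)
		return 0x0;	/* no enabled queue reaches this register */
	/* Partial register: enable only the low (num_aqs - 16 * aq_reg) bits. */
	return (1u << (num_aqs - 16 * aq_reg)) - 1;
}
```

With 20 queues per group, for instance, register 0 gets 0xFFFF and register 1 gets 0xF.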



* Re: [PATCH v1 1/7] bbdev: add FFT version member in driver info
  2023-09-19  1:21 ` [PATCH v1 1/7] bbdev: add FFT version member in driver info Nicolas Chautru
@ 2023-09-19  9:55   ` Maxime Coquelin
  2023-09-19 20:51     ` Chautru, Nicolas
  0 siblings, 1 reply; 24+ messages in thread
From: Maxime Coquelin @ 2023-09-19  9:55 UTC (permalink / raw)
  To: Nicolas Chautru, dev; +Cc: hemant.agrawal, david.marchand, hernan.vargas



On 9/19/23 03:21, Nicolas Chautru wrote:
> This can be used to distinguish different version of the
> flexible pointwise windowing applied to the FFT and expose
> this to the application.

Does this version relate to a standard, or is it specific to the
implementation of your VRB devices?

> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
>   lib/bbdev/rte_bbdev.h | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
> index a5bcc09f10..d6e54ee9a4 100644
> --- a/lib/bbdev/rte_bbdev.h
> +++ b/lib/bbdev/rte_bbdev.h
> @@ -349,6 +349,8 @@ struct rte_bbdev_driver_info {
>   	const struct rte_bbdev_op_cap *capabilities;
>   	/** Device cpu_flag requirements */
>   	const enum rte_cpu_flag_t *cpu_flag_reqs;
> +	/** Versioning number for the FFT operation type. */
> +	uint16_t fft_version;
>   };
>   
>   /** Macro used at end of bbdev PMD list */



* Re: [PATCH v1 3/7] baseband/acc: remove the 4G SO capability for VRB1
  2023-09-19  1:21 ` [PATCH v1 3/7] baseband/acc: remove the 4G SO capability for VRB1 Nicolas Chautru
@ 2023-09-19 15:20   ` David Marchand
  2023-09-19 20:32     ` Chautru, Nicolas
  0 siblings, 1 reply; 24+ messages in thread
From: David Marchand @ 2023-09-19 15:20 UTC (permalink / raw)
  To: Nicolas Chautru; +Cc: dev, maxime.coquelin, hemant.agrawal, hernan.vargas

On Tue, Sep 19, 2023 at 3:25 AM Nicolas Chautru
<nicolas.chautru@intel.com> wrote:
>
> This removes the specific capability and support of LTE Decoder
> Soft Output option on the VRB1 PMD.

Please explain why such support is removed for this hw.


>
> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> ---
>  drivers/baseband/acc/rte_vrb_pmd.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/baseband/acc/rte_vrb_pmd.c b/drivers/baseband/acc/rte_vrb_pmd.c
> index 3c8f3409ed..e0f50460bd 100644
> --- a/drivers/baseband/acc/rte_vrb_pmd.c
> +++ b/drivers/baseband/acc/rte_vrb_pmd.c
> @@ -1019,14 +1019,11 @@ vrb_dev_info_get(struct rte_bbdev *dev, struct rte_bbdev_driver_info *dev_info)
>                                         RTE_BBDEV_TURBO_CRC_TYPE_24B |
>                                         RTE_BBDEV_TURBO_DEC_CRC_24B_DROP |
>                                         RTE_BBDEV_TURBO_EQUALIZER |
> -                                       RTE_BBDEV_TURBO_SOFT_OUT_SATURATE |
>                                         RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
>                                         RTE_BBDEV_TURBO_CONTINUE_CRC_MATCH |
> -                                       RTE_BBDEV_TURBO_SOFT_OUTPUT |
>                                         RTE_BBDEV_TURBO_EARLY_TERMINATION |
>                                         RTE_BBDEV_TURBO_DEC_INTERRUPTS |
>                                         RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
> -                                       RTE_BBDEV_TURBO_NEG_LLR_1_BIT_SOFT_OUT |
>                                         RTE_BBDEV_TURBO_MAP_DEC |
>                                         RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP |
>                                         RTE_BBDEV_TURBO_DEC_SCATTER_GATHER,
> @@ -1975,6 +1972,9 @@ enqueue_dec_one_op_cb(struct acc_queue *q, struct rte_bbdev_dec_op *op,
>         struct rte_mbuf *input, *h_output_head, *h_output,
>                 *s_output_head, *s_output;
>
> +       /* Disable explictly SO for VRB 1. */
> +       op->turbo_dec.op_flags &= ~RTE_BBDEV_TURBO_SOFT_OUTPUT;

Can you explain why it is needed to filter this out?

I did not find a clear description in the bbdev API.
It would help if there were explicit references in doxygen to which
capability is necessary for using each flag/API.


I was expecting that asking for RTE_BBDEV_TURBO_SOFT_OUTPUT to a
driver is only allowed if rte_bbdev_op_cap contains it.
With this assumption, it would be invalid for an application to
request RTE_BBDEV_TURBO_SOFT_OUTPUT through rte_bbdev_enqueue_dec_ops.
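
The assumption above can be sketched from the application side as a check done before enqueue: an operation flag is only legal if the device advertised the matching capability. This is an illustration only. The types and flag values below are simplified stand-ins (all demo_/DEMO_ names are hypothetical) and not the real bbdev definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in bit values for illustration; the real flags live in the bbdev headers. */
#define DEMO_TURBO_SOFT_OUTPUT       (1u << 0)
#define DEMO_TURBO_EARLY_TERMINATION (1u << 1)

/* Simplified stand-in for a turbo decoder capability entry. */
struct demo_dec_cap {
	uint32_t capability_flags;
};

/*
 * Returns true only if every requested operation flag is advertised
 * by the device capability entry.
 */
static bool
demo_op_flags_allowed(const struct demo_dec_cap *cap, uint32_t op_flags)
{
	return (op_flags & ~cap->capability_flags) == 0;
}
```

An application following this pattern would never set the soft output flag on a device that does not list it.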


> +
>         desc = acc_desc(q, total_enqueued_cbs);
>         vrb_fcw_td_fill(op, &desc->req.fcw_td);
>
> --
> 2.34.1
>

At this point of the series, the documentation still references
RTE_BBDEV_TURBO_SOFT_OUTPUT as something supported by the vrb1 driver.


-- 
David Marchand



* RE: [PATCH v1 3/7] baseband/acc: remove the 4G SO capability for VRB1
  2023-09-19 15:20   ` David Marchand
@ 2023-09-19 20:32     ` Chautru, Nicolas
  2023-09-21  7:13       ` David Marchand
  0 siblings, 1 reply; 24+ messages in thread
From: Chautru, Nicolas @ 2023-09-19 20:32 UTC (permalink / raw)
  To: David Marchand; +Cc: dev, maxime.coquelin, hemant.agrawal, Vargas, Hernan

Hi David, 

> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Tuesday, September 19, 2023 8:20 AM
> To: Chautru, Nicolas <nicolas.chautru@intel.com>
> Cc: dev@dpdk.org; maxime.coquelin@redhat.com;
> hemant.agrawal@nxp.com; Vargas, Hernan <hernan.vargas@intel.com>
> Subject: Re: [PATCH v1 3/7] baseband/acc: remove the 4G SO capability for
> VRB1
> 
> On Tue, Sep 19, 2023 at 3:25 AM Nicolas Chautru
> <nicolas.chautru@intel.com> wrote:
> >
> > This removes the specific capability and support of LTE Decoder Soft
> > Output option on the VRB1 PMD.
> 
> Please explain why such support is removed for this hw.

The decision was made to defeature this optional capability because, under certain race conditions, enabling it may cause reliability issues, which would not be acceptable.
Note that this is optional additional output information (soft output) that is independent of the actual decoding operation.
More details below next to your other comments. 

> 
> 
> >
> > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > ---
> >  drivers/baseband/acc/rte_vrb_pmd.c | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/baseband/acc/rte_vrb_pmd.c
> > b/drivers/baseband/acc/rte_vrb_pmd.c
> > index 3c8f3409ed..e0f50460bd 100644
> > --- a/drivers/baseband/acc/rte_vrb_pmd.c
> > +++ b/drivers/baseband/acc/rte_vrb_pmd.c
> > @@ -1019,14 +1019,11 @@ vrb_dev_info_get(struct rte_bbdev *dev, struct
> rte_bbdev_driver_info *dev_info)
> >                                         RTE_BBDEV_TURBO_CRC_TYPE_24B |
> >                                         RTE_BBDEV_TURBO_DEC_CRC_24B_DROP |
> >                                         RTE_BBDEV_TURBO_EQUALIZER |
> > -                                       RTE_BBDEV_TURBO_SOFT_OUT_SATURATE |
> >                                         RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
> >                                         RTE_BBDEV_TURBO_CONTINUE_CRC_MATCH |
> > -                                       RTE_BBDEV_TURBO_SOFT_OUTPUT |
> >                                         RTE_BBDEV_TURBO_EARLY_TERMINATION |
> >                                         RTE_BBDEV_TURBO_DEC_INTERRUPTS |
> >                                         RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
> > -                                       RTE_BBDEV_TURBO_NEG_LLR_1_BIT_SOFT_OUT |
> >                                         RTE_BBDEV_TURBO_MAP_DEC |
> >                                         RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP |
> >
> > RTE_BBDEV_TURBO_DEC_SCATTER_GATHER,
> > @@ -1975,6 +1972,9 @@ enqueue_dec_one_op_cb(struct acc_queue *q,
> struct rte_bbdev_dec_op *op,
> >         struct rte_mbuf *input, *h_output_head, *h_output,
> >                 *s_output_head, *s_output;
> >
> > +       /* Disable explictly SO for VRB 1. */
> > +       op->turbo_dec.op_flags &= ~RTE_BBDEV_TURBO_SOFT_OUTPUT;
> 
> Can you explain why it is needed to filter this out?
> 
> I did not find a clear description in the bbdev API.
> It would help if there were explicits references in doxygen of which capability
> is necessary for using flags/API.
> 
> 
> I was expecting that asking for RTE_BBDEV_TURBO_SOFT_OUTPUT to a driver
> is only allowed if rte_bbdev_op_cap contains it.
> With this assumption, it would be invalid for an application to request
> RTE_BBDEV_TURBO_SOFT_OUTPUT through rte_bbdev_enqueue_dec_ops.

You may arguably expect this from a well-behaved user application, but nothing enforces it explicitly, notably under negative scenario conditions, which we still need to handle gracefully.
Here we want to make sure that if the optional operation flag is set, we fall back to the default mode when using the VRB1 variant.
Keep in mind that the unified driver can support multiple HW variants (see the rest of the series) and may support this option for other variants using the same code.

In terms of documentation, I believe that this capability/flag (note that the enum maps to a capability when retrieved from info_get, and to an operation flag when provided to the bbdev API) has already been captured explicitly for many generations. Basically this is an optional output of the LTE decoding processing, providing APP LLR which can potentially be useful to the user application (separate optional mbuf). It may or may not be supported by a bb device, and it may or may not be requested through the API. Typically it is not enabled.

In this commit we are defeaturing this optional capability for VRB1: we no longer expose it to the application, and if the application requests it anyway, we ignore it (as we do for any other flag that is not supported; such flags become don't-care flags which are ignored).
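
The "ignore unsupported flags" behaviour described here can be sketched on the driver side as a mask applied at enqueue time. Note that the patch itself only clears RTE_BBDEV_TURBO_SOFT_OUTPUT; the general mask below is an illustrative extension, and all demo_/DEMO_ names and bit values are hypothetical stand-ins, not bbdev definitions.

```c
#include <stdint.h>

/* Stand-in bit values for illustration only. */
#define DEMO_TURBO_SOFT_OUTPUT  (1u << 0)
#define DEMO_TURBO_CRC_TYPE_24B (1u << 1)

/* Simplified stand-in for a decode operation. */
struct demo_dec_op {
	uint32_t op_flags;
};

/*
 * Clear every requested flag the device variant does not support, so the
 * operation falls back to default behaviour instead of reaching the HW
 * with an unsupported option set.
 */
static void
demo_sanitize_op(struct demo_dec_op *op, uint32_t supported_flags)
{
	op->op_flags &= supported_flags;
}
```

The trade-off discussed in this thread is visible in the sketch: the request is dropped silently, so the application gets no error for asking for something the device does not advertise.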

Kindly let me know if still unclear.
Thanks
Nic

> 
> 
> > +
> >         desc = acc_desc(q, total_enqueued_cbs);
> >         vrb_fcw_td_fill(op, &desc->req.fcw_td);
> >
> > --
> > 2.34.1
> >
> 
> At this point of the series, the documentation still references
> RTE_BBDEV_TURBO_SOFT_OUTPUT as something supported by the vrb1
> driver.
> 
> 
> --
> David Marchand


* RE: [PATCH v1 1/7] bbdev: add FFT version member in driver info
  2023-09-19  9:55   ` Maxime Coquelin
@ 2023-09-19 20:51     ` Chautru, Nicolas
  2023-09-22  8:14       ` Maxime Coquelin
  0 siblings, 1 reply; 24+ messages in thread
From: Chautru, Nicolas @ 2023-09-19 20:51 UTC (permalink / raw)
  To: Maxime Coquelin, dev; +Cc: hemant.agrawal, david.marchand, Vargas, Hernan

Hi Maxime, 

This is neither part of 3GPP per se, nor specific to VRB devices. Let me provide more context.
The SRS processing chain (https://doc.dpdk.org/guides/prog_guide/bbdev.html#bbdev-fft-operation) includes a pointwise multiplication by a time window.
The generic API includes some control of this windowing function, but the actual shape still needs to be programmed onto the device (e.g. rectangular, tapered, sinc, different width or offset, or any arbitrary shape defined as an array of scalars). These degrees of freedom cannot be exposed through a generic API (the information is multiple kB, i.e. the data itself) and can be user specific (external to the HW IP itself or outside of Intel's control).
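
To make the windowing step concrete, here is a minimal sketch of a pointwise multiplication of a sample vector by an int16 window array. It assumes Q15 fixed-point coefficients (32768 representing 1.0) purely for illustration; the actual device format is not described in this thread, and demo_apply_window() is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Multiply each input sample by the matching window coefficient.
 * Coefficients are assumed to be Q15 (an assumption for this sketch),
 * so the 32-bit product is scaled back down by 32768 and saturated.
 */
static void
demo_apply_window(const int16_t *in, const int16_t *window,
		int16_t *out, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		int32_t prod = ((int32_t)in[i] * (int32_t)window[i]) / 32768;

		/* Only (-32768 * -32768) can exceed the int16 range. */
		if (prod > INT16_MAX)
			prod = INT16_MAX;
		out[i] = (int16_t)prod;
	}
}
```

A rectangular window would use the same coefficient everywhere, while a tapered one would decrease towards the edges; only the contents of the array change.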
As an illustration, for VRB devices pf_bb_config gives the user an option to supply such windowing data as an input ("FFT LUT bin file"); more generally, at platform level and for any bb device, this big look-up table or array can be configured on the host during platform initialization for a given deployment or vendor.
What is required here is for the user application to know which version of that array is in use on the given platform, as this information is relevant to processing done outside of bbdev (notably for noise estimation). Through this mechanism, the user can now determine through the API which file was used, and act accordingly.
The content itself is not specified. For VRB we just use the md5sum of that binary file (which is just a big array of int16 for pointwise multiplication), so that knowledge can be shared between the initialized platform configuration and the run-time assumptions of the user application.
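
On the application side, using this value could look like the sketch below: a table of known window variants keyed by the 16-bit identifier reported by the device. Everything here is hypothetical (struct, function, identifiers and parameter values), and how the md5sum is reduced to 16 bits is not specified in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-variant data the application keeps outside of bbdev. */
struct demo_fft_lut_profile {
	uint16_t fft_version;	/* identifier reported in the driver info */
	const char *description;
	float noise_scaling;	/* made-up parameter for noise estimation */
};

/* Made-up identifiers and values, for illustration only. */
static const struct demo_fft_lut_profile demo_profiles[] = {
	{ 0x1a2b, "rectangular window", 1.0f },
	{ 0x3c4d, "tapered window", 0.75f },
};

/* Returns the profile matching fft_version, or NULL if it is unknown. */
static const struct demo_fft_lut_profile *
demo_find_profile(uint16_t fft_version)
{
	size_t n = sizeof(demo_profiles) / sizeof(demo_profiles[0]);

	for (size_t i = 0; i < n; i++) {
		if (demo_profiles[i].fft_version == fft_version)
			return &demo_profiles[i];
	}
	return NULL;
}
```

An unknown identifier returns NULL, which lets the application refuse to run with a window it has no matching assumptions for.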
It is also important to understand that the user/vendor may use any array or shape (based on their algorithm) regardless of Intel or the IP, and still be able to share the mapping between what is configured on the platform (multiple versions possible) and what the application enumerates.

I can indeed add more details in the documentation, but the above should arguably make sense. The fft_version name may be quite vague; it really refers to the FFT pointwise windowing array variant assumed on the platform. I did not want to require it to be an md5sum, hence the vagueness, as it could be any hash shared between the device programming and the user application that relates to the semi-static FFT processing configuration.

Let me know if this is unclear or if you have any other thoughts.
Thanks
Nic

> -----Original Message-----
> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> Sent: Tuesday, September 19, 2023 2:56 AM
> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org
> Cc: hemant.agrawal@nxp.com; david.marchand@redhat.com; Vargas, Hernan
> <hernan.vargas@intel.com>
> Subject: Re: [PATCH v1 1/7] bbdev: add FFT version member in driver info
> 
> 
> 
> On 9/19/23 03:21, Nicolas Chautru wrote:
> > This can be used to distinguish different version of the flexible
> > pointwise windowing applied to the FFT and expose this to the
> > application.
> 
> Does this version relates to a standard, or is this specific to the
> implementation of your VRB devices?
> 
> > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > ---
> >   lib/bbdev/rte_bbdev.h | 2 ++
> >   1 file changed, 2 insertions(+)
> >
> > diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h index
> > a5bcc09f10..d6e54ee9a4 100644
> > --- a/lib/bbdev/rte_bbdev.h
> > +++ b/lib/bbdev/rte_bbdev.h
> > @@ -349,6 +349,8 @@ struct rte_bbdev_driver_info {
> >   	const struct rte_bbdev_op_cap *capabilities;
> >   	/** Device cpu_flag requirements */
> >   	const enum rte_cpu_flag_t *cpu_flag_reqs;
> > +	/** Versioning number for the FFT operation type. */
> > +	uint16_t fft_version;
> >   };
> >
> >   /** Macro used at end of bbdev PMD list */


* Re: [PATCH v1 3/7] baseband/acc: remove the 4G SO capability for VRB1
  2023-09-19 20:32     ` Chautru, Nicolas
@ 2023-09-21  7:13       ` David Marchand
  2023-09-21 17:18         ` Chautru, Nicolas
  0 siblings, 1 reply; 24+ messages in thread
From: David Marchand @ 2023-09-21  7:13 UTC (permalink / raw)
  To: Chautru, Nicolas
  Cc: dev, maxime.coquelin, hemant.agrawal, Vargas, Hernan, Thomas Monjalon

On Tue, Sep 19, 2023 at 10:32 PM Chautru, Nicolas
<nicolas.chautru@intel.com> wrote:
>
> Hi David,
>
> > -----Original Message-----
> > From: David Marchand <david.marchand@redhat.com>
> > Sent: Tuesday, September 19, 2023 8:20 AM
> > To: Chautru, Nicolas <nicolas.chautru@intel.com>
> > Cc: dev@dpdk.org; maxime.coquelin@redhat.com;
> > hemant.agrawal@nxp.com; Vargas, Hernan <hernan.vargas@intel.com>
> > Subject: Re: [PATCH v1 3/7] baseband/acc: remove the 4G SO capability for
> > VRB1
> >
> > On Tue, Sep 19, 2023 at 3:25 AM Nicolas Chautru
> > <nicolas.chautru@intel.com> wrote:
> > >
> > > This removes the specific capability and support of LTE Decoder Soft
> > > Output option on the VRB1 PMD.
> >
> > Please explain why such support is removed for this hw.
>
> The decision is made to defeature this optional capability as under certain race conditions enabling this may potentially cause reliability issues which would not be acceptable.
> Note that this is an optional additional output information  (soft output information) independent of the actual decoding operation.
> More details below next to your other comments.

This must be explained in the commitlog.

>
> >
> >
> > >
> > > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > > ---
> > >  drivers/baseband/acc/rte_vrb_pmd.c | 6 +++---
> > >  1 file changed, 3 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/baseband/acc/rte_vrb_pmd.c
> > > b/drivers/baseband/acc/rte_vrb_pmd.c
> > > index 3c8f3409ed..e0f50460bd 100644
> > > --- a/drivers/baseband/acc/rte_vrb_pmd.c
> > > +++ b/drivers/baseband/acc/rte_vrb_pmd.c
> > > @@ -1019,14 +1019,11 @@ vrb_dev_info_get(struct rte_bbdev *dev, struct
> > rte_bbdev_driver_info *dev_info)
> > >                                         RTE_BBDEV_TURBO_CRC_TYPE_24B |
> > >                                         RTE_BBDEV_TURBO_DEC_CRC_24B_DROP |
> > >                                         RTE_BBDEV_TURBO_EQUALIZER |
> > > -                                       RTE_BBDEV_TURBO_SOFT_OUT_SATURATE |
> > >                                         RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
> > >                                         RTE_BBDEV_TURBO_CONTINUE_CRC_MATCH |
> > > -                                       RTE_BBDEV_TURBO_SOFT_OUTPUT |
> > >                                         RTE_BBDEV_TURBO_EARLY_TERMINATION |
> > >                                         RTE_BBDEV_TURBO_DEC_INTERRUPTS |
> > >                                         RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
> > > -                                       RTE_BBDEV_TURBO_NEG_LLR_1_BIT_SOFT_OUT |
> > >                                         RTE_BBDEV_TURBO_MAP_DEC |
> > >                                         RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP |
> > >
> > > RTE_BBDEV_TURBO_DEC_SCATTER_GATHER,
> > > @@ -1975,6 +1972,9 @@ enqueue_dec_one_op_cb(struct acc_queue *q,
> > struct rte_bbdev_dec_op *op,
> > >         struct rte_mbuf *input, *h_output_head, *h_output,
> > >                 *s_output_head, *s_output;
> > >
> > > +       /* Disable explictly SO for VRB 1. */
> > > +       op->turbo_dec.op_flags &= ~RTE_BBDEV_TURBO_SOFT_OUTPUT;
> >
> > Can you explain why it is needed to filter this out?
> >
> > I did not find a clear description in the bbdev API.
> > It would help if there were explicits references in doxygen of which capability
> > is necessary for using flags/API.
> >
> >
> > I was expecting that asking for RTE_BBDEV_TURBO_SOFT_OUTPUT to a driver
> > is only allowed if rte_bbdev_op_cap contains it.
> > With this assumption, it would be invalid for an application to request
> > RTE_BBDEV_TURBO_SOFT_OUTPUT through rte_bbdev_enqueue_dec_ops.
>
> You may arguably expect this from a well behaved user application but still there is nothing that enforces it explicitly, ie. notably under negative scenario conditions which we still need to manage gracefully.

If your application is buggy (not reading / complying with the device
capabilities), fix it.


> Here we want to make sure that in case the optional operational flag is included, we fall back to default mode when using the VRB1 variant.
> Keep in mind that the unified driver can support multiple HW variant (see rest of the serie) and may support this option for other variants using same code.

Whatever the HW variant, the API should be respected: exposing
capabilities is done on a per device basis.


>
> In term of documentation, I believe that capability/flag (ie. note that the enum maps to a capability when retrieved from info_get, and to an operation flag when provided to the bbdev api) is already captured explicitly for many generations. Basically this an optional output of the LTE decoding processing, to provide APP LLR which can be potentially be useful for the user application (separate optional mbuf). It may or may not be supported by a bb device, and it may or may not be requested to be provided through the API. Typically this is not enabled.

Being optional does not mean that a driver can ignore it.
Otherwise, there is no point in exposing a capability.



Thanks.

-- 
David Marchand



* Re: [PATCH v1 0/7] VRB2 BBDEV PMD introduction
  2023-09-19  1:21 [PATCH v1 0/7] VRB2 BBDEV PMD introduction Nicolas Chautru
                   ` (6 preceding siblings ...)
  2023-09-19  1:21 ` [PATCH v1 7/7] baseband/acc: add configure helper for VRB2 Nicolas Chautru
@ 2023-09-21  7:25 ` David Marchand
  7 siblings, 0 replies; 24+ messages in thread
From: David Marchand @ 2023-09-21  7:25 UTC (permalink / raw)
  To: Nicolas Chautru; +Cc: dev, maxime.coquelin, hemant.agrawal, hernan.vargas

On Tue, Sep 19, 2023 at 3:25 AM Nicolas Chautru
<nicolas.chautru@intel.com> wrote:
>
> This serie includes includes changes to the VRB BBDEV PMD for 23.11.
> This relies on the previous serie that Maxime is about to apply
> (https://patches.dpdk.org/project/dpdk/list/?series=28544).

Fyi, the CI people started to implement series dependencies.
Currently, only the ovsrobot supports it, but the UNH lab will support
it soon, too (and other CIs will probably follow later).
http://inbox.dpdk.org/dts/f7ty1hkobj2.fsf@redhat.com/T/#m7e86d61319ab09aedc123defea34b6e3699d34fa

For example here, that would translate to adding the following tag to
this cover letter (or the first patch of your series):

Depends-on: series-28544 ("bbdev: API extension for 23.11")

-- 
David Marchand


* RE: [PATCH v1 3/7] baseband/acc: remove the 4G SO capability for VRB1
  2023-09-21  7:13       ` David Marchand
@ 2023-09-21 17:18         ` Chautru, Nicolas
  2023-09-27  7:08           ` Maxime Coquelin
  0 siblings, 1 reply; 24+ messages in thread
From: Chautru, Nicolas @ 2023-09-21 17:18 UTC (permalink / raw)
  To: David Marchand
  Cc: dev, maxime.coquelin, hemant.agrawal, Vargas, Hernan, Thomas Monjalon

Hi David, 

> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Thursday, September 21, 2023 12:13 AM
> To: Chautru, Nicolas <nicolas.chautru@intel.com>
> Cc: dev@dpdk.org; maxime.coquelin@redhat.com;
> hemant.agrawal@nxp.com; Vargas, Hernan <hernan.vargas@intel.com>;
> Thomas Monjalon <thomas@monjalon.net>
> Subject: Re: [PATCH v1 3/7] baseband/acc: remove the 4G SO capability for
> VRB1
> 
> On Tue, Sep 19, 2023 at 10:32 PM Chautru, Nicolas
> <nicolas.chautru@intel.com> wrote:
> >
> > Hi David,
> >
> > > -----Original Message-----
> > > From: David Marchand <david.marchand@redhat.com>
> > > Sent: Tuesday, September 19, 2023 8:20 AM
> > > To: Chautru, Nicolas <nicolas.chautru@intel.com>
> > > Cc: dev@dpdk.org; maxime.coquelin@redhat.com;
> > > hemant.agrawal@nxp.com; Vargas, Hernan <hernan.vargas@intel.com>
> > > Subject: Re: [PATCH v1 3/7] baseband/acc: remove the 4G SO
> > > capability for
> > > VRB1
> > >
> > > On Tue, Sep 19, 2023 at 3:25 AM Nicolas Chautru
> > > <nicolas.chautru@intel.com> wrote:
> > > >
> > > > This removes the specific capability and support of LTE Decoder
> > > > Soft Output option on the VRB1 PMD.
> > >
> > > Please explain why such support is removed for this hw.
> >
> > The decision is made to defeature this optional capability as under certain
> race conditions enabling this may potentially cause reliability issues which
> would not be acceptable.
> > Note that this is an optional additional output information  (soft output
> information) independent of the actual decoding operation.
> > More details below next to your other comments.
> 
> This must be explained in the commitlog.

OK will add now. 

> 
> >
> > >
> > >
> > > >
> > > > Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> > > > ---
> > > >  drivers/baseband/acc/rte_vrb_pmd.c | 6 +++---
> > > >  1 file changed, 3 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/drivers/baseband/acc/rte_vrb_pmd.c
> > > > b/drivers/baseband/acc/rte_vrb_pmd.c
> > > > index 3c8f3409ed..e0f50460bd 100644
> > > > --- a/drivers/baseband/acc/rte_vrb_pmd.c
> > > > +++ b/drivers/baseband/acc/rte_vrb_pmd.c
> > > > @@ -1019,14 +1019,11 @@ vrb_dev_info_get(struct rte_bbdev *dev,
> > > > struct
> > > rte_bbdev_driver_info *dev_info)
> > > >                                         RTE_BBDEV_TURBO_CRC_TYPE_24B |
> > > >                                         RTE_BBDEV_TURBO_DEC_CRC_24B_DROP |
> > > >                                         RTE_BBDEV_TURBO_EQUALIZER |
> > > > -                                       RTE_BBDEV_TURBO_SOFT_OUT_SATURATE |
> > > >                                         RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
> > > >                                         RTE_BBDEV_TURBO_CONTINUE_CRC_MATCH |
> > > > -                                       RTE_BBDEV_TURBO_SOFT_OUTPUT |
> > > >                                         RTE_BBDEV_TURBO_EARLY_TERMINATION |
> > > >                                         RTE_BBDEV_TURBO_DEC_INTERRUPTS |
> > > >                                         RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
> > > > -                                       RTE_BBDEV_TURBO_NEG_LLR_1_BIT_SOFT_OUT |
> > > >                                         RTE_BBDEV_TURBO_MAP_DEC |
> > > >
> > > > RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP |
> > > >
> > > > RTE_BBDEV_TURBO_DEC_SCATTER_GATHER,
> > > > @@ -1975,6 +1972,9 @@ enqueue_dec_one_op_cb(struct acc_queue
> *q,
> > > struct rte_bbdev_dec_op *op,
> > > >         struct rte_mbuf *input, *h_output_head, *h_output,
> > > >                 *s_output_head, *s_output;
> > > >
> > > > +       /* Disable explictly SO for VRB 1. */
> > > > +       op->turbo_dec.op_flags &= ~RTE_BBDEV_TURBO_SOFT_OUTPUT;
> > >
> > > Can you explain why it is needed to filter this out?
> > >
> > > I did not find a clear description in the bbdev API.
> > > It would help if there were explicits references in doxygen of which
> > > capability is necessary for using flags/API.
> > >
> > >
> > > I was expecting that asking for RTE_BBDEV_TURBO_SOFT_OUTPUT to a
> > > driver is only allowed if rte_bbdev_op_cap contains it.
> > > With this assumption, it would be invalid for an application to
> > > request RTE_BBDEV_TURBO_SOFT_OUTPUT through
> rte_bbdev_enqueue_dec_ops.
> >
> > You may arguably expect this from a well behaved user application but still
> there is nothing that enforces it explicitly, ie. notably under negative scenario
> conditions which we still need to manage gracefully.
> 
> If your application is buggy (not reading / complying with the device
> capabilities), fix it.

Handling negative scenarios is within the scope of the PMD: whatever the application throws at us must not cause any HW issue.
Fixing application issues is obviously outside of DPDK control.

> 
> 
> > Here we want to make sure that in case the optional operational flag is
> included, we fall back to default mode when using the VRB1 variant.
> > Keep in mind that the unified driver can support multiple HW variant (see
> rest of the serie) and may support this option for other variants using same
> code.
> 
> Whatever the HW variant, the API should be respected: exposing capabilities
> is done on a per device basis.
> 

Ideally it should be, but in practice, if this is not done for whatever reason (negative scenario, bug in the user application),
we still want the PMD to avoid misbehaving.

> 
> >
> > In term of documentation, I believe that capability/flag (ie. note that the
> enum maps to a capability when retrieved from info_get, and to an operation
> flag when provided to the bbdev api) is already captured explicitly for many
> generations. Basically this an optional output of the LTE decoding processing,
> to provide APP LLR which can be potentially be useful for the user application
> (separate optional mbuf). It may or may not be supported by a bb device, and
> it may or may not be requested to be provided through the API. Typically this
> is not enabled.
> 
> Being optional does not mean that a driver can ignore it.
> Otherwise, there is no point in exposing a capability.

I am not sure I follow your concern. Capabilities are critical for the application to enumerate what the underlying device can do.
Here we are only stating that it is valuable to harden the PMD so that it can operate even if an unexpected request is provided through the API, notably to guarantee the unified code is not used in an unintended manner.
Note that no PMD to my knowledge explicitly checks that the op_flag matches the capability (like a bitmask check),
and I don't really think we have to; these other flags are just meant to have no effect since they are not supported.

> 
> 
> 
> Thanks.
> 
> --
> David Marchand



* Re: [PATCH v1 1/7] bbdev: add FFT version member in driver info
  2023-09-19 20:51     ` Chautru, Nicolas
@ 2023-09-22  8:14       ` Maxime Coquelin
  2023-09-22 16:41         ` Chautru, Nicolas
  0 siblings, 1 reply; 24+ messages in thread
From: Maxime Coquelin @ 2023-09-22  8:14 UTC (permalink / raw)
  To: Chautru, Nicolas, hemant.agrawal, dev; +Cc: david.marchand, Vargas, Hernan

Hi Nicolas,

On 9/19/23 22:51, Chautru, Nicolas wrote:
> Hi Maxime,
> 
> This is neither part of 3GPP per se, nor specific to VRB device. Let me provide more context.
> The SRS processing chain (https://doc.dpdk.org/guides/prog_guide/bbdev.html#bbdev-fft-operation) includes a pointwise multiplication by time window.
> The generic API includes some control of these windowing functions, but the actual shape still needs to be programmed onto any device (i.e. rectangular, tapered, sinc, different width or offset, any arbitrary shape defined as an array of scalars). These degrees of freedom cannot be exposed through a generic API (the information is multi-kB, i.e. the data itself) and can be user specific (external to the HW IP itself or outside of Intel control).

Thanks for the explanations. I also did my homework as my FFT knowledge
was buried quite deep in my memory. :)

So this is a vendor-specific way to express generic parameters.

Regarding VRB device, is this table per device or per VF?
Could it be configured by the application directly, or has it to be done
through the PF?

> As an illustration for VRB device pf_bb_config provides to user an option to include such windowing data as an input ("FFT LUT bin file"), but more generally at platform level for any bb device this big Look-Up Table or big array can be configured on the host during platform initialization for a given deployment or vendor.
> What is required here is for the user application to have knowledge of what version of such array is being used on the given platform, as this information would be relevant to processing done outside of bbdev (notably for noise estimate). Through that mechanism, the user can now map through that API which possible file was being used, and act accordingly.
> The content itself is not specified, for VRB we just use the md5sum of that binary file (which is just a big array of int16 for point wise multiplication) so that this can be used to share knowledge between initialized platform configuration and at run-time user application assumption.
> It is also important to understand that the user/vendor may use any array or shape (based on their algorithm) regardless of Intel or IP, and still be able to share information mapping between what is configured on the platform (multiple versions possible) and what the application enumerates.
> 
> I can add more details in the documentation indeed but above should arguably make sense. The name FFT_version naming may be quite vague, this is more related to the FFT pointwise windowing array variant assumed on the platform. I did not want to impose for it to be an md5sum necessarily, hence the vagueness, as it could be any hash shared between the device programming and the user application related to the semi-static FFT processing programming.
> 
> Let me know if unclear or if any other thought,

I think this is clear now to me.

In my opinion, it is not good to have this as part of the BBDEV API, as
every vendor will have their own way to represent this.

Another alternative is to have a vendor-specific API. This is far from
ideal and should be avoided as much as possible, but in this case the
application has to know anyway which device it is driving. It would
at least be clear that the field has to be interpreted in a vendor-specific way.

@Hemant, I would be interested in your opinion. (I don't know if NXP has
or plans to have FFT accelerator IP)

Regards,
Maxime

> Thanks
> Nic
> 
>> -----Original Message-----
>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>> Sent: Tuesday, September 19, 2023 2:56 AM
>> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org
>> Cc: hemant.agrawal@nxp.com; david.marchand@redhat.com; Vargas, Hernan
>> <hernan.vargas@intel.com>
>> Subject: Re: [PATCH v1 1/7] bbdev: add FFT version member in driver info
>>
>>
>>
>> On 9/19/23 03:21, Nicolas Chautru wrote:
>>> This can be used to distinguish different version of the flexible
>>> pointwise windowing applied to the FFT and expose this to the
>>> application.
>>
>> Does this version relates to a standard, or is this specific to the
>> implementation of your VRB devices?
>>
>>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
>>> ---
>>>    lib/bbdev/rte_bbdev.h | 2 ++
>>>    1 file changed, 2 insertions(+)
>>>
>>> diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h index
>>> a5bcc09f10..d6e54ee9a4 100644
>>> --- a/lib/bbdev/rte_bbdev.h
>>> +++ b/lib/bbdev/rte_bbdev.h
>>> @@ -349,6 +349,8 @@ struct rte_bbdev_driver_info {
>>>    	const struct rte_bbdev_op_cap *capabilities;
>>>    	/** Device cpu_flag requirements */
>>>    	const enum rte_cpu_flag_t *cpu_flag_reqs;
>>> +	/** Versioning number for the FFT operation type. */
>>> +	uint16_t fft_version;
>>>    };
>>>
>>>    /** Macro used at end of bbdev PMD list */
> 



* RE: [PATCH v1 1/7] bbdev: add FFT version member in driver info
  2023-09-22  8:14       ` Maxime Coquelin
@ 2023-09-22 16:41         ` Chautru, Nicolas
  2023-09-26 10:00           ` Maxime Coquelin
  0 siblings, 1 reply; 24+ messages in thread
From: Chautru, Nicolas @ 2023-09-22 16:41 UTC (permalink / raw)
  To: Maxime Coquelin, hemant.agrawal, dev; +Cc: david.marchand, Vargas, Hernan

Hi Maxime, 

> -----Original Message-----
> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> Sent: Friday, September 22, 2023 1:15 AM
> To: Chautru, Nicolas <nicolas.chautru@intel.com>;
> hemant.agrawal@nxp.com; dev@dpdk.org
> Cc: david.marchand@redhat.com; Vargas, Hernan
> <hernan.vargas@intel.com>
> Subject: Re: [PATCH v1 1/7] bbdev: add FFT version member in driver info
> 
> Hi Nicolas,
> 
> On 9/19/23 22:51, Chautru, Nicolas wrote:
> > Hi Maxime,
> >
> > This is neither part of 3GPP per se, nor specific to VRB device. Let me provide
> more context.
> > The SRS processing chain
> (https://doc.dpdk.org/guides/prog_guide/bbdev.html#bbdev-fft-operation)
> includes a pointwise multiplication by time window.
> > The generic API include some control of these windowing function but still
> the actual shape need to be programmed onto any device (ie. rectangular,
> taped, sinc, different width or offset, any abritraty shape defined as an array
> of scalars). These degrees of liberties cannot be exposed through a generic API
> (information is multi-kB, ie the data itself) and can be user specific (external to
> the HW IP itself or outside of Intel control).
> 
> Thanks for the explanations. I also did my homework as my FFT knowledge
> was buried quite deep in my memory. :)
> 
> So this is a vendor-specific way to express generic paramaters.

I am unsure this is that vendor specific. At least the interface lets the application know a hash of the table being loaded (which is really just pointwise data, in a non-proprietary format). I did not state the content is a simple md5sum of the bin file being loaded from Linux.
 
> Regarding VRB device, is this table per device or per VF?
> Could it be configured by the application directly, or has it to be done through
> the PF?

This is configured for the device at platform level, i.e. by the operator. It is common to all applications/devices and captures the window shape assumptions.
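
As a rough illustration of how an application might consume this value (a minimal sketch, not part of the patch: the profile table, its values and lookup_window_profile() are hypothetical, and only the uint16_t fft_version field comes from the series), the reported version could be mapped to the processing assumptions used outside of bbdev:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical application-side record: one entry per window table the
 * operator may have loaded on the platform. */
struct window_profile {
	uint16_t fft_version;   /* value reported in the driver info */
	const char *name;       /* label used by the application only */
	float noise_scale;      /* illustrative noise-estimate correction */
};

/* Illustrative values only; a real deployment would define its own. */
static const struct window_profile known_profiles[] = {
	{ 0x0001, "rectangular", 1.00f },
	{ 0x0002, "tapered",     0.85f },
};

/* Return the profile matching the reported version, or NULL if unknown. */
static const struct window_profile *
lookup_window_profile(uint16_t fft_version)
{
	size_t i;

	for (i = 0; i < sizeof(known_profiles) / sizeof(known_profiles[0]); i++) {
		if (known_profiles[i].fft_version == fft_version)
			return &known_profiles[i];
	}
	return NULL;
}
```

An unknown version returns NULL, so the application can refuse to start or fall back to a conservative noise estimate.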

> 
> > As an illustration for VRB device pf_bb_config provides to user an option to
> include such windowing data as an input ("FFT LUT bin file"), but more
> generally at platform level for any bb device this big Look-Up Table or big
> array can be configured on the host during platform initialization for a given
> deployment or vendor.
> > What is required here is for the user application to have knowledge of what
> version of such array is being used on the given platform, as this information
> would be relevant to processing done outside of bbdev (notably for noise
> estimate). Through that mechanism, the user can now map through that API
> which possible file was being used, and act accordingly.
> > The content itself is not specified, for VRB we just use the md5sum of that
> binary file (which is just a big array of int16 for point wise multiplication) so
> that this can be used to share knowledge between initialized platform
> configuration and at run-time user application assumption.
> > It is also important to under that the user/vendor may use any array or
> shape (based on their algorithm) regardless of Intel or IP, and still be able to
> share information mapping between what is configured on the platform
> (multiple versions possible) and what the application enumerates.
> >
> > I can add more details in the documentation indeed but above should
> arguably make sense. The name FFT_version naming may be quite vague, this
> is more related to the FFT pointwise windowing array variant assumed on the
> platform. I did not want to impose for it to be an md5sum necessarily, hence
> the vagueness, as it could be any hash shared between the device
> programming and the user application related to the semi-static FFT
> processing programming.
> >
> > Let me know if unclear or if any other thought,
> 
> I think this is clear now to me.
> 
> In my opinion, this is not good to have this part of the BBDEV API, as every
> vendor will have their own way to represent this.
> 
> Other alternative is to have a vendor specific API. This is far from ideal and
> should be avoided as much as possible, but in this case the application has to
> know anyways which device it is driving. It would be at least clear the field has
> to be interpreted in a vendor-specific way.
> 
> @Hemant, I would be interested in your opinion. (I don't know if NXP has or
> plans to have FFT accelerator IP)

Yes, looking forward to it.


> 
> Regards,
> Maxime
> 
> > Thanks
> > Nic
> >
> >> -----Original Message-----
> >> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> >> Sent: Tuesday, September 19, 2023 2:56 AM
> >> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org
> >> Cc: hemant.agrawal@nxp.com; david.marchand@redhat.com; Vargas,
> Hernan
> >> <hernan.vargas@intel.com>
> >> Subject: Re: [PATCH v1 1/7] bbdev: add FFT version member in driver
> >> info
> >>
> >>
> >>
> >> On 9/19/23 03:21, Nicolas Chautru wrote:
> >>> This can be used to distinguish different version of the flexible
> >>> pointwise windowing applied to the FFT and expose this to the
> >>> application.
> >>
> >> Does this version relates to a standard, or is this specific to the
> >> implementation of your VRB devices?
> >>
> >>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> >>> ---
> >>>    lib/bbdev/rte_bbdev.h | 2 ++
> >>>    1 file changed, 2 insertions(+)
> >>>
> >>> diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h index
> >>> a5bcc09f10..d6e54ee9a4 100644
> >>> --- a/lib/bbdev/rte_bbdev.h
> >>> +++ b/lib/bbdev/rte_bbdev.h
> >>> @@ -349,6 +349,8 @@ struct rte_bbdev_driver_info {
> >>>    	const struct rte_bbdev_op_cap *capabilities;
> >>>    	/** Device cpu_flag requirements */
> >>>    	const enum rte_cpu_flag_t *cpu_flag_reqs;
> >>> +	/** Versioning number for the FFT operation type. */
> >>> +	uint16_t fft_version;
> >>>    };
> >>>
> >>>    /** Macro used at end of bbdev PMD list */
> >



* Re: [PATCH v1 1/7] bbdev: add FFT version member in driver info
  2023-09-22 16:41         ` Chautru, Nicolas
@ 2023-09-26 10:00           ` Maxime Coquelin
  2023-09-27 23:50             ` Chautru, Nicolas
  0 siblings, 1 reply; 24+ messages in thread
From: Maxime Coquelin @ 2023-09-26 10:00 UTC (permalink / raw)
  To: Chautru, Nicolas, hemant.agrawal, dev; +Cc: david.marchand, Vargas, Hernan



On 9/22/23 18:41, Chautru, Nicolas wrote:
> Hi Maxime,
> 
>> -----Original Message-----
>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>> Sent: Friday, September 22, 2023 1:15 AM
>> To: Chautru, Nicolas <nicolas.chautru@intel.com>;
>> hemant.agrawal@nxp.com; dev@dpdk.org
>> Cc: david.marchand@redhat.com; Vargas, Hernan
>> <hernan.vargas@intel.com>
>> Subject: Re: [PATCH v1 1/7] bbdev: add FFT version member in driver info
>>
>> Hi Nicolas,
>>
>> On 9/19/23 22:51, Chautru, Nicolas wrote:
>>> Hi Maxime,
>>>
>>> This is neither part of 3GPP per se, nor specific to VRB device. Let me provide
>> more context.
>>> The SRS processing chain
>> (https://doc.dpdk.org/guides/prog_guide/bbdev.html#bbdev-fft-operation)
>> includes a pointwise multiplication by time window.
>>> The generic API include some control of these windowing function but still
>> the actual shape need to be programmed onto any device (ie. rectangular,
>> taped, sinc, different width or offset, any abritraty shape defined as an array
>> of scalars). These degrees of liberties cannot be exposed through a generic API
>> (information is multi-kB, ie the data itself) and can be user specific (external to
>> the HW IP itself or outside of Intel control).
>>
>> Thanks for the explanations. I also did my homework as my FFT knowledge
>> was buried quite deep in my memory. :)
>>
>> So this is a vendor-specific way to express generic paramaters.
> 
> Unsure this is that vendor specific. At least the interface allows to know a hash of the table being loaded (which is just pointwise data really, non-proprietary format). I did not state the content is a simple md5sum of the bin file being loaded from linux.

OK, I think it would be better to provide an API to get the table
directly, and have the format described in the documentation.

With that, we can also provide the hash as you'd like, but the method to 
calculate the hash should also be provided. Or the application can
perform the hash itself if it needs it.
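
To illustrate that last option (a sketch under assumptions: the table is taken to be an array of int16 as described earlier in the thread, FNV-1a folded to 16 bits is only a stand-in for whichever hash is agreed on and is not the md5sum used for VRB, and window_table_hash16() is a hypothetical name), the application could hash the retrieved table itself:

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit FNV-1a hash of the table into 16 bits. FNV-1a is a
 * stand-in here, not the md5sum mentioned for VRB. */
static uint16_t
window_table_hash16(const int16_t *table, size_t n)
{
	uint32_t h = 2166136261u; /* FNV-1a 32-bit offset basis */
	size_t i;

	for (i = 0; i < n; i++) {
		uint16_t v = (uint16_t)table[i];

		/* Hash low byte then high byte so the result does not
		 * depend on host endianness. */
		h = (h ^ (v & 0xffu)) * 16777619u;
		h = (h ^ (uint32_t)(v >> 8)) * 16777619u;
	}
	/* XOR-fold the upper and lower halves into a 16-bit value. */
	return (uint16_t)((h >> 16) ^ (h & 0xffffu));
}
```

The 16-bit result fits the proposed fft_version field; whatever method is chosen, it would need to be documented so that the platform and the application compute the same value.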

The fact that it is several KB is not an issue, as this information
would only be queried once at init time if really needed.

A non-DPDK alternative could be to pass such information to the pod via
the device plugin (as a mounted file for instance, or a variable).

>> Regarding VRB device, is this table per device or per VF?
>> Could it be configured by the application directly, or has it to be done through
>> the PF?
> 
> This is configured for the device at platform level, ie. through operator. Common to all application/devices. This captures the windows shape assumptions.

Thanks for the information!

>>
>>> As an illustration for VRB device pf_bb_config provides to user an option to
>> include such windowing data as an input ("FFT LUT bin file"), but more
>> generally at platform level for any bb device this big Look-Up Table or big
>> array can be configured on the host during platform initialization for a given
>> deployment or vendor.
>>> What is required here is for the user application to have knowledge of what
>> version of such array is being used on the given platform, as this information
>> would be relevant to processing done outside of bbdev (notably for noise
>> estimate). Through that mechanism, the user can now map through that API
>> which possible file was being used, and act accordingly.
>>> The content itself is not specified, for VRB we just use the md5sum of that
>> binary file (which is just a big array of int16 for point wise multiplication) so
>> that this can be used to share knowledge between initialized platform
>> configuration and at run-time user application assumption.
>>> It is also important to under that the user/vendor may use any array or
>> shape (based on their algorithm) regardless of Intel or IP, and still be able to
>> share information mapping between what is configured on the platform
>> (multiple versions possible) and what the application enumerates.
>>>
>>> I can add more details in the documentation indeed but above should
>> arguably make sense. The name FFT_version naming may be quite vague, this
>> is more related to the FFT pointwise windowing array variant assumed on the
>> platform. I did not want to impose for it to be an md5sum necessarily, hence
>> the vagueness, as it could be any hash shared between the device
>> programming and the user application related to the semi-static FFT
>> processing programming.
>>>
>>> Let me know if unclear or if any other thought,
>>
>> I think this is clear now to me.
>>
>> In my opinion, this is not good to have this part of the BBDEV API, as every
>> vendor will have their own way to represent this.
>>
>> Other alternative is to have a vendor specific API. This is far from ideal and
>> should be avoided as much as possible, but in this case the application has to
>> know anyways which device it is driving. It would be at least clear the field has
>> to be interpreted in a vendor-specific way.
>>
>> @Hemant, I would be interested in your opinion. (I don't know if NXP has or
>> plans to have FFT accelerator IP)
> 
> Yes looking forward to it.

Thanks,
Maxime

> 
>>
>> Regards,
>> Maxime
>>
>>> Thanks
>>> Nic
>>>
>>>> -----Original Message-----
>>>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>> Sent: Tuesday, September 19, 2023 2:56 AM
>>>> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org
>>>> Cc: hemant.agrawal@nxp.com; david.marchand@redhat.com; Vargas,
>> Hernan
>>>> <hernan.vargas@intel.com>
>>>> Subject: Re: [PATCH v1 1/7] bbdev: add FFT version member in driver
>>>> info
>>>>
>>>>
>>>>
>>>> On 9/19/23 03:21, Nicolas Chautru wrote:
>>>>> This can be used to distinguish different version of the flexible
>>>>> pointwise windowing applied to the FFT and expose this to the
>>>>> application.
>>>>
>>>> Does this version relates to a standard, or is this specific to the
>>>> implementation of your VRB devices?
>>>>
>>>>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
>>>>> ---
>>>>>     lib/bbdev/rte_bbdev.h | 2 ++
>>>>>     1 file changed, 2 insertions(+)
>>>>>
>>>>> diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h index
>>>>> a5bcc09f10..d6e54ee9a4 100644
>>>>> --- a/lib/bbdev/rte_bbdev.h
>>>>> +++ b/lib/bbdev/rte_bbdev.h
>>>>> @@ -349,6 +349,8 @@ struct rte_bbdev_driver_info {
>>>>>     	const struct rte_bbdev_op_cap *capabilities;
>>>>>     	/** Device cpu_flag requirements */
>>>>>     	const enum rte_cpu_flag_t *cpu_flag_reqs;
>>>>> +	/** Versioning number for the FFT operation type. */
>>>>> +	uint16_t fft_version;
>>>>>     };
>>>>>
>>>>>     /** Macro used at end of bbdev PMD list */
>>>
> 



* Re: [PATCH v1 3/7] baseband/acc: remove the 4G SO capability for VRB1
  2023-09-21 17:18         ` Chautru, Nicolas
@ 2023-09-27  7:08           ` Maxime Coquelin
  2023-09-27  7:32             ` Maxime Coquelin
  0 siblings, 1 reply; 24+ messages in thread
From: Maxime Coquelin @ 2023-09-27  7:08 UTC (permalink / raw)
  To: Chautru, Nicolas, David Marchand
  Cc: dev, hemant.agrawal, Vargas, Hernan, Thomas Monjalon



On 9/21/23 19:18, Chautru, Nicolas wrote:
> Hi David,
> 
>> -----Original Message-----
>> From: David Marchand <david.marchand@redhat.com>
>> Sent: Thursday, September 21, 2023 12:13 AM
>> To: Chautru, Nicolas <nicolas.chautru@intel.com>
>> Cc: dev@dpdk.org; maxime.coquelin@redhat.com;
>> hemant.agrawal@nxp.com; Vargas, Hernan <hernan.vargas@intel.com>;
>> Thomas Monjalon <thomas@monjalon.net>
>> Subject: Re: [PATCH v1 3/7] baseband/acc: remove the 4G SO capability for
>> VRB1
>>
>> On Tue, Sep 19, 2023 at 10:32 PM Chautru, Nicolas
>> <nicolas.chautru@intel.com> wrote:
>>>
>>> Hi David,
>>>
>>>> -----Original Message-----
>>>> From: David Marchand <david.marchand@redhat.com>
>>>> Sent: Tuesday, September 19, 2023 8:20 AM
>>>> To: Chautru, Nicolas <nicolas.chautru@intel.com>
>>>> Cc: dev@dpdk.org; maxime.coquelin@redhat.com;
>>>> hemant.agrawal@nxp.com; Vargas, Hernan <hernan.vargas@intel.com>
>>>> Subject: Re: [PATCH v1 3/7] baseband/acc: remove the 4G SO
>>>> capability for
>>>> VRB1
>>>>
>>>> On Tue, Sep 19, 2023 at 3:25 AM Nicolas Chautru
>>>> <nicolas.chautru@intel.com> wrote:
>>>>>
>>>>> This removes the specific capability and support of LTE Decoder
>>>>> Soft Output option on the VRB1 PMD.
>>>>
>>>> Please explain why such support is removed for this hw.
>>>
>>> The decision is made to defeature this optional capability as under certain
>> race conditions enabling this may potentially cause reliability issues which
>> would not be acceptable.
>>> Note that this is an optional additional output information  (soft output
>> information) independent of the actual decoding operation.
>>> More details below next to your other comments.
>>
>> This must be explained in the commitlog.
> 
> OK will add now.
> 
>>
>>>
>>>>
>>>>
>>>>>
>>>>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
>>>>> ---
>>>>>   drivers/baseband/acc/rte_vrb_pmd.c | 6 +++---
>>>>>   1 file changed, 3 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/drivers/baseband/acc/rte_vrb_pmd.c
>>>>> b/drivers/baseband/acc/rte_vrb_pmd.c
>>>>> index 3c8f3409ed..e0f50460bd 100644
>>>>> --- a/drivers/baseband/acc/rte_vrb_pmd.c
>>>>> +++ b/drivers/baseband/acc/rte_vrb_pmd.c
>>>>> @@ -1019,14 +1019,11 @@ vrb_dev_info_get(struct rte_bbdev *dev,
>>>>> struct
>>>> rte_bbdev_driver_info *dev_info)
>>>>>                                          RTE_BBDEV_TURBO_CRC_TYPE_24B |
>>>>>                                          RTE_BBDEV_TURBO_DEC_CRC_24B_DROP |
>>>>>                                          RTE_BBDEV_TURBO_EQUALIZER |
>>>>> -                                       RTE_BBDEV_TURBO_SOFT_OUT_SATURATE |
>>>>>                                          RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
>>>>>                                          RTE_BBDEV_TURBO_CONTINUE_CRC_MATCH |
>>>>> -                                       RTE_BBDEV_TURBO_SOFT_OUTPUT |
>>>>>                                          RTE_BBDEV_TURBO_EARLY_TERMINATION |
>>>>>                                          RTE_BBDEV_TURBO_DEC_INTERRUPTS |
>>>>>                                          RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
>>>>> -                                       RTE_BBDEV_TURBO_NEG_LLR_1_BIT_SOFT_OUT |
>>>>>                                          RTE_BBDEV_TURBO_MAP_DEC |
>>>>>
>>>>> RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP |
>>>>>
>>>>> RTE_BBDEV_TURBO_DEC_SCATTER_GATHER,
>>>>> @@ -1975,6 +1972,9 @@ enqueue_dec_one_op_cb(struct acc_queue
>> *q,
>>>> struct rte_bbdev_dec_op *op,
>>>>>          struct rte_mbuf *input, *h_output_head, *h_output,
>>>>>                  *s_output_head, *s_output;
>>>>>
>>>>> +       /* Disable explictly SO for VRB 1. */
>>>>> +       op->turbo_dec.op_flags &= ~RTE_BBDEV_TURBO_SOFT_OUTPUT;
>>>>
>>>> Can you explain why it is needed to filter this out?
>>>>
>>>> I did not find a clear description in the bbdev API.
>>>> It would help if there were explicits references in doxygen of which
>>>> capability is necessary for using flags/API.
>>>>
>>>>
>>>> I was expecting that asking for RTE_BBDEV_TURBO_SOFT_OUTPUT to a
>>>> driver is only allowed if rte_bbdev_op_cap contains it.
>>>> With this assumption, it would be invalid for an application to
>>>> request RTE_BBDEV_TURBO_SOFT_OUTPUT through
>> rte_bbdev_enqueue_dec_ops.
>>>
>>> You may arguably expect this from a well behaved user application but still
>> there is nothing that enforces it explicitly, ie. notably under negative scenario
>> conditions which we still need to manage gracefully.
>>
>> If your application is buggy (not reading / complying with the device
>> capabilities), fix it.
> 
> Supporting negative scenario is within the scope of the PMD, whatever the application throws at us in cannot cause any HW issue.
> Fixing application issues is outside of DPDK control obviously.

I don't think it is the role of the PMD to work around application
bugs.

The PMD reports capabilities for a given device variant. If the
application ignores that and forces the capability, the PMD
should fail. It is better for the application that the PMD lets it know
it is misbehaving rather than trying to hide it.
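
A minimal sketch of such a check (the EX_* flag values and check_op_flags() are hypothetical stand-ins, not existing bbdev or PMD symbols): any requested op flag that is not in the advertised capability mask causes the operation to be rejected.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical flag values standing in for RTE_BBDEV_TURBO_* bits. */
#define EX_TURBO_SOFT_OUTPUT       (1u << 0)
#define EX_TURBO_EARLY_TERMINATION (1u << 1)

/* Return 0 when every requested flag is advertised by the device,
 * -EINVAL when at least one requested flag is not supported. */
static int
check_op_flags(uint32_t op_flags, uint32_t capability_flags)
{
	if ((op_flags & ~capability_flags) != 0)
		return -EINVAL;
	return 0;
}
```

In a PMD, a check like this would run at enqueue time against the capability flags reported for the device, so the application gets an error instead of a silently altered operation.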

> 
>>
>>
>>> Here we want to make sure that in case the optional operational flag is
>> included, we fall back to default mode when using the VRB1 variant.
>>> Keep in mind that the unified driver can support multiple HW variant (see
>> rest of the serie) and may support this option for other variants using same
>> code.
>>
>> Whatever the HW variant, the API should be respected: exposing capabilities
>> is done on a per device basis.
>>
> 
> It should be ideally, but in practice in case this is not done for whatever reason (negative scenario, bug in user application)
> then we want the PMD to still avoid misbehaving.
> 
>>
>>>
>>> In term of documentation, I believe that capability/flag (ie. note that the
>> enum maps to a capability when retrieved from info_get, and to an operation
>> flag when provided to the bbdev api) is already captured explicitly for many
>> generations. Basically this an optional output of the LTE decoding processing,
>> to provide APP LLR which can be potentially be useful for the user application
>> (separate optional mbuf). It may or may not be supported by a bb device, and
>> it may or may not be requested to be provided through the API. Typically this
>> is not enabled.
>>
>> Being optional does not mean that a driver can ignore it.
>> Otherwise, there is no point in exposing a capability.
> 
> I am not sure I follow your concern. Capability are critical for application to enumerate what the underlying device can do.
> Here we are only stating that this is valuable to harden the PMD so that it can operate even if an unexpected API is provided, notably to guarantee the unified code is not used in an unintended manner.
> Note that no PMD to my knowledge enforces checking explicitly the op_flag matches with the capability (like a bitmask check),
> and I don’t really think we have to, these other flags are just meant to have effect since not supported.
> 
>>
>>
>>
>> Thanks.
>>
>> --
>> David Marchand
> 



* Re: [PATCH v1 3/7] baseband/acc: remove the 4G SO capability for VRB1
  2023-09-27  7:08           ` Maxime Coquelin
@ 2023-09-27  7:32             ` Maxime Coquelin
  0 siblings, 0 replies; 24+ messages in thread
From: Maxime Coquelin @ 2023-09-27  7:32 UTC (permalink / raw)
  To: Chautru, Nicolas, David Marchand
  Cc: dev, hemant.agrawal, Vargas, Hernan, Thomas Monjalon



On 9/27/23 09:08, Maxime Coquelin wrote:
> 
> 
> On 9/21/23 19:18, Chautru, Nicolas wrote:
>> Hi David,
>>
>>> -----Original Message-----
>>> From: David Marchand <david.marchand@redhat.com>
>>> Sent: Thursday, September 21, 2023 12:13 AM
>>> To: Chautru, Nicolas <nicolas.chautru@intel.com>
>>> Cc: dev@dpdk.org; maxime.coquelin@redhat.com;
>>> hemant.agrawal@nxp.com; Vargas, Hernan <hernan.vargas@intel.com>;
>>> Thomas Monjalon <thomas@monjalon.net>
>>> Subject: Re: [PATCH v1 3/7] baseband/acc: remove the 4G SO capability 
>>> for
>>> VRB1
>>>
>>> On Tue, Sep 19, 2023 at 10:32 PM Chautru, Nicolas
>>> <nicolas.chautru@intel.com> wrote:
>>>>
>>>> Hi David,
>>>>
>>>>> -----Original Message-----
>>>>> From: David Marchand <david.marchand@redhat.com>
>>>>> Sent: Tuesday, September 19, 2023 8:20 AM
>>>>> To: Chautru, Nicolas <nicolas.chautru@intel.com>
>>>>> Cc: dev@dpdk.org; maxime.coquelin@redhat.com;
>>>>> hemant.agrawal@nxp.com; Vargas, Hernan <hernan.vargas@intel.com>
>>>>> Subject: Re: [PATCH v1 3/7] baseband/acc: remove the 4G SO
>>>>> capability for
>>>>> VRB1
>>>>>
>>>>> On Tue, Sep 19, 2023 at 3:25 AM Nicolas Chautru
>>>>> <nicolas.chautru@intel.com> wrote:
>>>>>>
>>>>>> This removes the specific capability and support of LTE Decoder
>>>>>> Soft Output option on the VRB1 PMD.
>>>>>
>>>>> Please explain why such support is removed for this hw.
>>>>
>>>> The decision is made to defeature this optional capability as under 
>>>> certain
>>> race conditions enabling this may potentially cause reliability 
>>> issues which
>>> would not be acceptable.
>>>> Note that this is an optional additional output information  (soft 
>>>> output
>>> information) independent of the actual decoding operation.
>>>> More details below next to your other comments.
>>>
>>> This must be explained in the commitlog.
>>
>> OK will add now.
>>
>>>
>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
>>>>>> ---
>>>>>>   drivers/baseband/acc/rte_vrb_pmd.c | 6 +++---
>>>>>>   1 file changed, 3 insertions(+), 3 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/baseband/acc/rte_vrb_pmd.c
>>>>>> b/drivers/baseband/acc/rte_vrb_pmd.c
>>>>>> index 3c8f3409ed..e0f50460bd 100644
>>>>>> --- a/drivers/baseband/acc/rte_vrb_pmd.c
>>>>>> +++ b/drivers/baseband/acc/rte_vrb_pmd.c
>>>>>> @@ -1019,14 +1019,11 @@ vrb_dev_info_get(struct rte_bbdev *dev, struct rte_bbdev_driver_info *dev_info)
>>>>>>                                          RTE_BBDEV_TURBO_CRC_TYPE_24B |
>>>>>>                                          RTE_BBDEV_TURBO_DEC_CRC_24B_DROP |
>>>>>>                                          RTE_BBDEV_TURBO_EQUALIZER |
>>>>>> -                                       RTE_BBDEV_TURBO_SOFT_OUT_SATURATE |
>>>>>>                                          RTE_BBDEV_TURBO_HALF_ITERATION_EVEN |
>>>>>>                                          RTE_BBDEV_TURBO_CONTINUE_CRC_MATCH |
>>>>>> -                                       RTE_BBDEV_TURBO_SOFT_OUTPUT |
>>>>>>                                          RTE_BBDEV_TURBO_EARLY_TERMINATION |
>>>>>>                                          RTE_BBDEV_TURBO_DEC_INTERRUPTS |
>>>>>>                                          RTE_BBDEV_TURBO_NEG_LLR_1_BIT_IN |
>>>>>> -                                       RTE_BBDEV_TURBO_NEG_LLR_1_BIT_SOFT_OUT |
>>>>>>                                          RTE_BBDEV_TURBO_MAP_DEC |
>>>>>>                                          RTE_BBDEV_TURBO_DEC_TB_CRC_24B_KEEP |
>>>>>>                                          RTE_BBDEV_TURBO_DEC_SCATTER_GATHER,
>>>>>> @@ -1975,6 +1972,9 @@ enqueue_dec_one_op_cb(struct acc_queue *q, struct rte_bbdev_dec_op *op,
>>>>>>          struct rte_mbuf *input, *h_output_head, *h_output,
>>>>>>                  *s_output_head, *s_output;
>>>>>>
>>>>>> +       /* Disable explictly SO for VRB 1. */
>>>>>> +       op->turbo_dec.op_flags &= ~RTE_BBDEV_TURBO_SOFT_OUTPUT;
>>>>>
>>>>> Can you explain why it is needed to filter this out?
>>>>>
>>>>> I did not find a clear description in the bbdev API.
>>>>> It would help if there were explicits references in doxygen of which
>>>>> capability is necessary for using flags/API.
>>>>>
>>>>>
>>>>> I was expecting that asking for RTE_BBDEV_TURBO_SOFT_OUTPUT to a
>>>>> driver is only allowed if rte_bbdev_op_cap contains it.
>>>>> With this assumption, it would be invalid for an application to
>>>>> request RTE_BBDEV_TURBO_SOFT_OUTPUT through
>>> rte_bbdev_enqueue_dec_ops.
>>>>
>>>> You may arguably expect this from a well behaved user application but
>>>> still there is nothing that enforces it explicitly, ie. notably under
>>>> negative scenario conditions which we still need to manage gracefully.
>>>
>>> If your application is buggy (not reading / complying with the device
>>> capabilities), fix it.
>>
>> Supporting negative scenarios is within the scope of the PMD: whatever
>> the application throws at us, it cannot cause any HW issue.
>> Fixing application issues is outside of DPDK control obviously.
> 
> I don't think it is not the role of the PMD to workaround application
> bugs.

Of course I meant:
I don't think it is the role of the PMD to work around application bugs.

> 
> The PMD driver reports capabilities for a given device variant. The
> application ignores that and forces the capability, the PMD driver
> should fail. It is better for the application the PMD driver let it know
> it is misbehaving than trying to hide it.
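
As an illustration of the check being discussed, here is a minimal sketch, assuming the requested op_flags are compared as a bitmask against the capability flags the device reported through info_get. The DEMO_* bit values and the demo_* function names are hypothetical stand-ins made up for this example; they are not the bbdev definitions, and no existing PMD is claimed to do this.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in bits, NOT the real RTE_BBDEV_TURBO_* values. */
#define DEMO_TURBO_SOFT_OUTPUT       (UINT32_C(1) << 0)
#define DEMO_TURBO_EARLY_TERMINATION (UINT32_C(1) << 1)
#define DEMO_TURBO_CRC_TYPE_24B      (UINT32_C(1) << 2)

/* Return the requested flags that the device did not advertise. */
static inline uint32_t
demo_unsupported_flags(uint32_t op_flags, uint32_t capability_flags)
{
	return op_flags & ~capability_flags;
}

/* Fail the request rather than silently clearing unsupported bits. */
static inline int
demo_validate_op_flags(uint32_t op_flags, uint32_t capability_flags)
{
	return demo_unsupported_flags(op_flags, capability_flags) == 0 ? 0 : -1;
}

/* Example: a device variant that does not advertise soft output. */
static inline void
demo_usage(void)
{
	const uint32_t cap = DEMO_TURBO_EARLY_TERMINATION | DEMO_TURBO_CRC_TYPE_24B;

	assert(demo_validate_op_flags(DEMO_TURBO_CRC_TYPE_24B, cap) == 0);
	assert(demo_validate_op_flags(DEMO_TURBO_SOFT_OUTPUT, cap) == -1);
}
```

Returning an error makes the misuse visible to the application, whereas masking the flag (as the patch does) hides it; which of the two is preferable is exactly what this thread is debating.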
> 
>>
>>>
>>>
>>>> Here we want to make sure that in case the optional operational flag is
>>>> included, we fall back to default mode when using the VRB1 variant.
>>>> Keep in mind that the unified driver can support multiple HW variants
>>>> (see rest of the series) and may support this option for other variants
>>>> using the same code.
>>>
>>> Whatever the HW variant, the API should be respected: exposing
>>> capabilities is done on a per device basis.
>>>
>>
>> It should be ideally, but in practice in case this is not done for 
>> whatever reason (negative scenario, bug in user application)
>> then we want the PMD to still avoid misbehaving.
>>
>>>
>>>>
>>>> In terms of documentation, I believe that the capability/flag (ie. note
>>>> that the enum maps to a capability when retrieved from info_get, and to
>>>> an operation flag when provided to the bbdev API) is already captured
>>>> explicitly for many generations. Basically this is an optional output of
>>>> the LTE decoding processing, to provide APP LLR which can potentially be
>>>> useful for the user application (separate optional mbuf). It may or may
>>>> not be supported by a bb device, and it may or may not be requested to
>>>> be provided through the API. Typically this is not enabled.
>>>
>>> Being optional does not mean that a driver can ignore it.
>>> Otherwise, there is no point in exposing a capability.
>>
>> I am not sure I follow your concern. Capabilities are critical for the
>> application to enumerate what the underlying device can do.
>> Here we are only stating that it is valuable to harden the PMD so
>> that it can operate even if an unexpected API is provided, notably to
>> guarantee the unified code is not used in an unintended manner.
>> Note that no PMD to my knowledge explicitly enforces that the
>> op_flag matches the capability (like a bitmask check),
>> and I don’t really think we have to; these other flags are just meant
>> to have no effect since they are not supported.
>>
>>>
>>>
>>>
>>> Thanks.
>>>
>>> -- 
>>> David Marchand
>>


^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [PATCH v1 1/7] bbdev: add FFT version member in driver info
  2023-09-26 10:00           ` Maxime Coquelin
@ 2023-09-27 23:50             ` Chautru, Nicolas
  2023-09-28  8:27               ` Maxime Coquelin
  0 siblings, 1 reply; 24+ messages in thread
From: Chautru, Nicolas @ 2023-09-27 23:50 UTC (permalink / raw)
  To: Maxime Coquelin, hemant.agrawal, dev; +Cc: david.marchand, Vargas, Hernan

Hi Maxime, Hemant, 

I initially wanted to keep it fairly open, hence a hash of the table for the window profiles, but it is also possible to expose something more descriptive; that would actually work as well.
I.e.:

+	/** FFT windowing width for 2048 FFT. */
+	uint16_t fft_window_width[RTE_BBDEV_MAX_FFT_WIN];

This provides the width of each window shape, which is enough to distinguish the major variants and to estimate the noise factor.

Let me know your opinion.
Thanks
Nic

> -----Original Message-----
> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> Sent: Tuesday, September 26, 2023 3:00 AM
> To: Chautru, Nicolas <nicolas.chautru@intel.com>;
> hemant.agrawal@nxp.com; dev@dpdk.org
> Cc: david.marchand@redhat.com; Vargas, Hernan
> <hernan.vargas@intel.com>
> Subject: Re: [PATCH v1 1/7] bbdev: add FFT version member in driver info
> 
> 
> 
> On 9/22/23 18:41, Chautru, Nicolas wrote:
> > Hi Maxime,
> >
> >> -----Original Message-----
> >> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> >> Sent: Friday, September 22, 2023 1:15 AM
> >> To: Chautru, Nicolas <nicolas.chautru@intel.com>;
> >> hemant.agrawal@nxp.com; dev@dpdk.org
> >> Cc: david.marchand@redhat.com; Vargas, Hernan
> >> <hernan.vargas@intel.com>
> >> Subject: Re: [PATCH v1 1/7] bbdev: add FFT version member in driver
> >> info
> >>
> >> Hi Nicolas,
> >>
> >> On 9/19/23 22:51, Chautru, Nicolas wrote:
> >>> Hi Maxime,
> >>>
> >>> This is neither part of 3GPP per se, nor specific to VRB device. Let
> >>> me provide
> >> more context.
> >>> The SRS processing chain
> >> (https://doc.dpdk.org/guides/prog_guide/bbdev.html#bbdev-fft-operatio
> >> n) includes a pointwise multiplication by time window.
> >>> The generic API include some control of these windowing function but
> >>> still
> >> the actual shape need to be programmed onto any device (ie.
> >> rectangular, taped, sinc, different width or offset, any abritraty
> >> shape defined as an array of scalars). These degrees of liberties
> >> cannot be exposed through a generic API (information is multi-kB, ie
> >> the data itself) and can be user specific (external to the HW IP itself or
> outside of Intel control).
> >>
> >> Thanks for the explanations. I also did my homework as my FFT
> >> knowledge was buried quite deep in my memory. :)
> >>
> >> So this is a vendor-specific way to express generic paramaters.
> >
> > Unsure this is that vendor specific. At least the interface allows to know a
> hash of the table being loaded (which is just pointwise data really, non-
> proprietary format). I did not state the content is a simple md5sum of the bin
> file being loaded from linux.
> 
> Ok, I think it would be better to provide an API to get the table directly, and
> have the format being described in the documentation.
> 
> With that, we can also provide the hash as you'd like, but the method to
> calculate the hash should also be provided. Or the application can perform
> the hash itself if it needs it.
> 
> The fact that it is several KB is not an issue, as this information would only be
> queried once at init time if really needed.
> 
> An non-DPDK alternative could be to pass such information to the pod via the
> device plugin (as a mounted file for instance, or variable).
> 
> >> Regarding VRB device, is this table per device or per VF?
> >> Could it be configured by the application directly, or has it to be
> >> done through the PF?
> >
> > This is configured for the device at platform level, ie. through operator.
> Common to all application/devices. This captures the windows shape
> assumptions.
> 
> Thanks for the information!
> 
> >>
> >>> As an illustration for VRB device pf_bb_config provides to user an
> >>> option to
> >> include such windowing data as an input ("FFT LUT bin file"), but
> >> more generally at platform level for any bb device this big Look-Up
> >> Table or big array can be configured on the host during platform
> >> initialization for a given deployment or vendor.
> >>> What is required here is for the user application to have knowledge
> >>> of what
> >> version of such array is being used on the given platform, as this
> >> information would be relevant to processing done outside of bbdev
> >> (notably for noise estimate). Through that mechanism, the user can
> >> now map through that API which possible file was being used, and act
> accordingly.
> >>> The content itself is not specified, for VRB we just use the md5sum
> >>> of that
> >> binary file (which is just a big array of int16 for point wise
> >> multiplication) so that this can be used to share knowledge between
> >> initialized platform configuration and at run-time user application
> assumption.
> >>> It is also important to under that the user/vendor may use any array
> >>> or
> >> shape (based on their algorithm) regardless of Intel or IP, and still
> >> be able to share information mapping between what is configured on
> >> the platform (multiple versions possible) and what the application
> enumerates.
> >>>
> >>> I can add more details in the documentation indeed but above should
> >> arguably make sense. The name FFT_version naming may be quite vague,
> >> this is more related to the FFT pointwise windowing array variant
> >> assumed on the platform. I did not want to impose for it to be an
> >> md5sum necessarily, hence the vagueness, as it could be any hash
> >> shared between the device programming and the user application
> >> related to the semi-static FFT processing programming.
> >>>
> >>> Let me know if unclear or if any other thought,
> >>
> >> I think this is clear now to me.
> >>
> >> In my opinion, this is not good to have this part of the BBDEV API,
> >> as every vendor will have their own way to represent this.
> >>
> >> Other alternative is to have a vendor specific API. This is far from
> >> ideal and should be avoided as much as possible, but in this case the
> >> application has to know anyways which device it is driving. It would
> >> be at least clear the field has to be interpreted in a vendor-specific way.
> >>
> >> @Hemant, I would be interested in your opinion. (I don't know if NXP
> >> has or plans to have FFT accelerator IP)
> >
> > Yes looking forward to it.
> 
> Thanks,
> Maxime
> 
> >
> >>
> >> Regards,
> >> Maxime
> >>
> >>> Thanks
> >>> Nic
> >>>
> >>>> -----Original Message-----
> >>>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> >>>> Sent: Tuesday, September 19, 2023 2:56 AM
> >>>> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org
> >>>> Cc: hemant.agrawal@nxp.com; david.marchand@redhat.com; Vargas,
> >> Hernan
> >>>> <hernan.vargas@intel.com>
> >>>> Subject: Re: [PATCH v1 1/7] bbdev: add FFT version member in driver
> >>>> info
> >>>>
> >>>>
> >>>>
> >>>> On 9/19/23 03:21, Nicolas Chautru wrote:
> >>>>> This can be used to distinguish different version of the flexible
> >>>>> pointwise windowing applied to the FFT and expose this to the
> >>>>> application.
> >>>>
> >>>> Does this version relates to a standard, or is this specific to the
> >>>> implementation of your VRB devices?
> >>>>
> >>>>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> >>>>> ---
> >>>>>     lib/bbdev/rte_bbdev.h | 2 ++
> >>>>>     1 file changed, 2 insertions(+)
> >>>>>
> >>>>> diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h index
> >>>>> a5bcc09f10..d6e54ee9a4 100644
> >>>>> --- a/lib/bbdev/rte_bbdev.h
> >>>>> +++ b/lib/bbdev/rte_bbdev.h
> >>>>> @@ -349,6 +349,8 @@ struct rte_bbdev_driver_info {
> >>>>>     	const struct rte_bbdev_op_cap *capabilities;
> >>>>>     	/** Device cpu_flag requirements */
> >>>>>     	const enum rte_cpu_flag_t *cpu_flag_reqs;
> >>>>> +	/** Versioning number for the FFT operation type. */
> >>>>> +	uint16_t fft_version;
> >>>>>     };
> >>>>>
> >>>>>     /** Macro used at end of bbdev PMD list */
> >>>
> >


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v1 1/7] bbdev: add FFT version member in driver info
  2023-09-27 23:50             ` Chautru, Nicolas
@ 2023-09-28  8:27               ` Maxime Coquelin
  2023-09-28 16:33                 ` Chautru, Nicolas
  0 siblings, 1 reply; 24+ messages in thread
From: Maxime Coquelin @ 2023-09-28  8:27 UTC (permalink / raw)
  To: Chautru, Nicolas, hemant.agrawal, dev; +Cc: david.marchand, Vargas, Hernan

Hi Nicolas,

On 9/28/23 01:50, Chautru, Nicolas wrote:
> Hi Maxime, Hemant,
> 
> I wanted initially to keep it fairly open hence a hash table for the windows profiles, but it is also possible to expose something more descriptive, that would work as well actually.
> Ie.
> 
> +	/** FFT windowing width for 2048 FFT. */
> +	uint16_t fft_window_width[RTE_BBDEV_MAX_FFT_WIN];
> 
> The provides the width of each windows shape which is enough to distinguish major variants and to estimate noise factor.

That sounds much better IMHO.

Regarding the array and value sizes:
1. Should it cover only the 2048-point FFT? I see some references to
4096-point FFTs for 5G and satellite communications.
2. Is uint16_t enough for all the use cases?

Since this array is quite big, could it be exposed to the application
via dedicated APIs instead of a field? One API to query the length of the
array so that the application can allocate the required memory, and one API
to copy the data into the allocated memory?

Maybe overkill, but I feel different FFT sizes could be supported in the
future, so that would be both future-proof and more memory-efficient for
apps that don't need this.
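
To make the two-call idea concrete, here is a minimal sketch, assuming one call that returns the number of entries and a second call that copies them into memory owned by the application. Everything named demo_* is hypothetical and exists only for this illustration; nothing like it is part of the bbdev API today.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical device state; stands in for whatever a PMD keeps privately. */
struct demo_fft_dev {
	uint16_t num_windows;
	const uint16_t *window_width;
};

/* Call 1: number of entries the application has to allocate. */
static inline uint16_t
demo_fft_window_count(const struct demo_fft_dev *dev)
{
	return dev->num_windows;
}

/* Call 2: copy at most 'capacity' entries; returns the number copied,
 * or -1 on invalid arguments.
 */
static inline int
demo_fft_window_get(const struct demo_fft_dev *dev, uint16_t *out,
		uint16_t capacity)
{
	uint16_t n;

	if (dev == NULL || out == NULL)
		return -1;
	n = dev->num_windows < capacity ? dev->num_windows : capacity;
	if (n > 0)
		memcpy(out, dev->window_width, n * sizeof(*out));
	return n;
}
```

An application would call demo_fft_window_count() once at init time, allocate that many uint16_t entries, then call demo_fft_window_get(); applications that do not need the data never pay for the storage.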

> Let me know of opinion.

Thanks for suggesting this,
Maxime

> Thanks
> Nic
> 
>> -----Original Message-----
>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>> Sent: Tuesday, September 26, 2023 3:00 AM
>> To: Chautru, Nicolas <nicolas.chautru@intel.com>;
>> hemant.agrawal@nxp.com; dev@dpdk.org
>> Cc: david.marchand@redhat.com; Vargas, Hernan
>> <hernan.vargas@intel.com>
>> Subject: Re: [PATCH v1 1/7] bbdev: add FFT version member in driver info
>>
>>
>>
>> On 9/22/23 18:41, Chautru, Nicolas wrote:
>>> Hi Maxime,
>>>
>>>> -----Original Message-----
>>>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>> Sent: Friday, September 22, 2023 1:15 AM
>>>> To: Chautru, Nicolas <nicolas.chautru@intel.com>;
>>>> hemant.agrawal@nxp.com; dev@dpdk.org
>>>> Cc: david.marchand@redhat.com; Vargas, Hernan
>>>> <hernan.vargas@intel.com>
>>>> Subject: Re: [PATCH v1 1/7] bbdev: add FFT version member in driver
>>>> info
>>>>
>>>> Hi Nicolas,
>>>>
>>>> On 9/19/23 22:51, Chautru, Nicolas wrote:
>>>>> Hi Maxime,
>>>>>
>>>>> This is neither part of 3GPP per se, nor specific to VRB device. Let
>>>>> me provide
>>>> more context.
>>>>> The SRS processing chain
>>>> (https://doc.dpdk.org/guides/prog_guide/bbdev.html#bbdev-fft-operatio
>>>> n) includes a pointwise multiplication by time window.
>>>>> The generic API include some control of these windowing function but
>>>>> still
>>>> the actual shape need to be programmed onto any device (ie.
>>>> rectangular, taped, sinc, different width or offset, any abritraty
>>>> shape defined as an array of scalars). These degrees of liberties
>>>> cannot be exposed through a generic API (information is multi-kB, ie
>>>> the data itself) and can be user specific (external to the HW IP itself or
>> outside of Intel control).
>>>>
>>>> Thanks for the explanations. I also did my homework as my FFT
>>>> knowledge was buried quite deep in my memory. :)
>>>>
>>>> So this is a vendor-specific way to express generic paramaters.
>>>
>>> Unsure this is that vendor specific. At least the interface allows to know a
>> hash of the table being loaded (which is just pointwise data really, non-
>> proprietary format). I did not state the content is a simple md5sum of the bin
>> file being loaded from linux.
>>
>> Ok, I think it would be better to provide an API to get the table directly, and
>> have the format being described in the documentation.
>>
>> With that, we can also provide the hash as you'd like, but the method to
>> calculate the hash should also be provided. Or the application can perform
>> the hash itself if it needs it.
>>
>> The fact that it is several KB is not an issue, as this information would only be
>> queried once at init time if really needed.
>>
>> An non-DPDK alternative could be to pass such information to the pod via the
>> device plugin (as a mounted file for instance, or variable).
>>
>>>> Regarding VRB device, is this table per device or per VF?
>>>> Could it be configured by the application directly, or has it to be
>>>> done through the PF?
>>>
>>> This is configured for the device at platform level, ie. through operator.
>> Common to all application/devices. This captures the windows shape
>> assumptions.
>>
>> Thanks for the information!
>>
>>>>
>>>>> As an illustration for VRB device pf_bb_config provides to user an
>>>>> option to
>>>> include such windowing data as an input ("FFT LUT bin file"), but
>>>> more generally at platform level for any bb device this big Look-Up
>>>> Table or big array can be configured on the host during platform
>>>> initialization for a given deployment or vendor.
>>>>> What is required here is for the user application to have knowledge
>>>>> of what
>>>> version of such array is being used on the given platform, as this
>>>> information would be relevant to processing done outside of bbdev
>>>> (notably for noise estimate). Through that mechanism, the user can
>>>> now map through that API which possible file was being used, and act
>> accordingly.
>>>>> The content itself is not specified, for VRB we just use the md5sum
>>>>> of that
>>>> binary file (which is just a big array of int16 for point wise
>>>> multiplication) so that this can be used to share knowledge between
>>>> initialized platform configuration and at run-time user application
>> assumption.
>>>>> It is also important to under that the user/vendor may use any array
>>>>> or
>>>> shape (based on their algorithm) regardless of Intel or IP, and still
>>>> be able to share information mapping between what is configured on
>>>> the platform (multiple versions possible) and what the application
>> enumerates.
>>>>>
>>>>> I can add more details in the documentation indeed but above should
>>>> arguably make sense. The name FFT_version naming may be quite vague,
>>>> this is more related to the FFT pointwise windowing array variant
>>>> assumed on the platform. I did not want to impose for it to be an
>>>> md5sum necessarily, hence the vagueness, as it could be any hash
>>>> shared between the device programming and the user application
>>>> related to the semi-static FFT processing programming.
>>>>>
>>>>> Let me know if unclear or if any other thought,
>>>>
>>>> I think this is clear now to me.
>>>>
>>>> In my opinion, this is not good to have this part of the BBDEV API,
>>>> as every vendor will have their own way to represent this.
>>>>
>>>> Other alternative is to have a vendor specific API. This is far from
>>>> ideal and should be avoided as much as possible, but in this case the
>>>> application has to know anyways which device it is driving. It would
>>>> be at least clear the field has to be interpreted in a vendor-specific way.
>>>>
>>>> @Hemant, I would be interested in your opinion. (I don't know if NXP
>>>> has or plans to have FFT accelerator IP)
>>>
>>> Yes looking forward to it.
>>
>> Thanks,
>> Maxime
>>
>>>
>>>>
>>>> Regards,
>>>> Maxime
>>>>
>>>>> Thanks
>>>>> Nic
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>>>> Sent: Tuesday, September 19, 2023 2:56 AM
>>>>>> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org
>>>>>> Cc: hemant.agrawal@nxp.com; david.marchand@redhat.com; Vargas,
>>>> Hernan
>>>>>> <hernan.vargas@intel.com>
>>>>>> Subject: Re: [PATCH v1 1/7] bbdev: add FFT version member in driver
>>>>>> info
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 9/19/23 03:21, Nicolas Chautru wrote:
>>>>>>> This can be used to distinguish different version of the flexible
>>>>>>> pointwise windowing applied to the FFT and expose this to the
>>>>>>> application.
>>>>>>
>>>>>> Does this version relates to a standard, or is this specific to the
>>>>>> implementation of your VRB devices?
>>>>>>
>>>>>>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
>>>>>>> ---
>>>>>>>      lib/bbdev/rte_bbdev.h | 2 ++
>>>>>>>      1 file changed, 2 insertions(+)
>>>>>>>
>>>>>>> diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h index
>>>>>>> a5bcc09f10..d6e54ee9a4 100644
>>>>>>> --- a/lib/bbdev/rte_bbdev.h
>>>>>>> +++ b/lib/bbdev/rte_bbdev.h
>>>>>>> @@ -349,6 +349,8 @@ struct rte_bbdev_driver_info {
>>>>>>>      	const struct rte_bbdev_op_cap *capabilities;
>>>>>>>      	/** Device cpu_flag requirements */
>>>>>>>      	const enum rte_cpu_flag_t *cpu_flag_reqs;
>>>>>>> +	/** Versioning number for the FFT operation type. */
>>>>>>> +	uint16_t fft_version;
>>>>>>>      };
>>>>>>>
>>>>>>>      /** Macro used at end of bbdev PMD list */
>>>>>
>>>
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* RE: [PATCH v1 1/7] bbdev: add FFT version member in driver info
  2023-09-28  8:27               ` Maxime Coquelin
@ 2023-09-28 16:33                 ` Chautru, Nicolas
  2023-10-03 11:46                   ` Maxime Coquelin
  0 siblings, 1 reply; 24+ messages in thread
From: Chautru, Nicolas @ 2023-09-28 16:33 UTC (permalink / raw)
  To: Maxime Coquelin, hemant.agrawal, dev; +Cc: david.marchand, Vargas, Hernan

Hi Maxime,


> -----Original Message-----
> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> Sent: Thursday, September 28, 2023 1:27 AM
> To: Chautru, Nicolas <nicolas.chautru@intel.com>; hemant.agrawal@nxp.com;
> dev@dpdk.org
> Cc: david.marchand@redhat.com; Vargas, Hernan <hernan.vargas@intel.com>
> Subject: Re: [PATCH v1 1/7] bbdev: add FFT version member in driver info
> 
> Hi Nicolas,
> 
> On 9/28/23 01:50, Chautru, Nicolas wrote:
> > Hi Maxime, Hemant,
> >
> > I wanted initially to keep it fairly open hence a hash table for the windows
> profiles, but it is also possible to expose something more descriptive, that
> would work as well actually.
> > Ie.
> >
> > +	/** FFT windowing width for 2048 FFT. */
> > +	uint16_t fft_window_width[RTE_BBDEV_MAX_FFT_WIN];
> >
> > The provides the width of each windows shape which is enough to
> distinguish major variants and to estimate noise factor.
> 
> That sounds much better IMHO.
> 
> Regarding the array and values sizes:
> 1. Should it only covers 2048 points FFT? I see some references about
> 4096 FFT for 5G and satellites  communications
> 2. Is uint16_t enough for all the usecases?

That is a misunderstanding, probably because I did not include the definition and value of RTE_BBDEV_MAX_FFT_WIN in the snippet above.
The dimension of the array is purely the number of windows to choose from, i.e. 16.
That is NOT an array matching the size of the FFT. In effect, the width value is given with a 2048-point FFT as the reference, and the actual width would be scaled down when a smaller FFT is set, or up for a bigger FFT. So this makes no assumption about the maximum FFT size; it just gives a portion using the reference resolution of a 2048-point FFT.
uint16_t is more than enough: the width cannot be more than 1024 based on the reference above.
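
The scaling described above can be sketched as follows, assuming it is linear in the FFT size and rounds down; both of those are assumptions of this example, as are the DEMO_* names (only the values 16 and 2048 come from this thread).

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAX_FFT_WIN  16    /* number of selectable windows */
#define DEMO_REF_FFT_SIZE 2048  /* FFT size the reported widths refer to */

/* Scale a width reported for the 2048-point reference to the FFT size in
 * use. Linear scaling with truncation is an assumption of this sketch.
 */
static inline uint32_t
demo_scale_window_width(uint16_t ref_width, uint32_t fft_size)
{
	assert(fft_size > 0);
	return (uint32_t)(((uint64_t)ref_width * fft_size) / DEMO_REF_FFT_SIZE);
}
```

For instance, a reference width of 1024 would map to 512 for a 1024-point FFT and to 2048 for a 4096-point FFT under these assumptions, which is why the field itself does not need to grow with the FFT size.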

> 
> Since this array is quite big, could it be exposed to the application via dedicated
> APIs instead of a field? An API to query the length of the array so that the
> application can allocate required meory, and an API to copy the data in the
> allocated mem?
> 
> Maybe overkill, but I feel different FFT size could be supported in the future, so
> that would be both future proof and more memory efficient for apps that
> don't need this.

See my note above; let me know if it is unclear.

> 
> > Let me know of opinion.
> 
> Thanks for suggesting this,
> Maxime
> 
> > Thanks
> > Nic
> >
> >> -----Original Message-----
> >> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> >> Sent: Tuesday, September 26, 2023 3:00 AM
> >> To: Chautru, Nicolas <nicolas.chautru@intel.com>;
> >> hemant.agrawal@nxp.com; dev@dpdk.org
> >> Cc: david.marchand@redhat.com; Vargas, Hernan
> >> <hernan.vargas@intel.com>
> >> Subject: Re: [PATCH v1 1/7] bbdev: add FFT version member in driver
> >> info
> >>
> >>
> >>
> >> On 9/22/23 18:41, Chautru, Nicolas wrote:
> >>> Hi Maxime,
> >>>
> >>>> -----Original Message-----
> >>>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> >>>> Sent: Friday, September 22, 2023 1:15 AM
> >>>> To: Chautru, Nicolas <nicolas.chautru@intel.com>;
> >>>> hemant.agrawal@nxp.com; dev@dpdk.org
> >>>> Cc: david.marchand@redhat.com; Vargas, Hernan
> >>>> <hernan.vargas@intel.com>
> >>>> Subject: Re: [PATCH v1 1/7] bbdev: add FFT version member in driver
> >>>> info
> >>>>
> >>>> Hi Nicolas,
> >>>>
> >>>> On 9/19/23 22:51, Chautru, Nicolas wrote:
> >>>>> Hi Maxime,
> >>>>>
> >>>>> This is neither part of 3GPP per se, nor specific to VRB device.
> >>>>> Let me provide
> >>>> more context.
> >>>>> The SRS processing chain
> >>>> (https://doc.dpdk.org/guides/prog_guide/bbdev.html#bbdev-fft-operat
> >>>> io
> >>>> n) includes a pointwise multiplication by time window.
> >>>>> The generic API include some control of these windowing function
> >>>>> but still
> >>>> the actual shape need to be programmed onto any device (ie.
> >>>> rectangular, taped, sinc, different width or offset, any abritraty
> >>>> shape defined as an array of scalars). These degrees of liberties
> >>>> cannot be exposed through a generic API (information is multi-kB,
> >>>> ie the data itself) and can be user specific (external to the HW IP
> >>>> itself or
> >> outside of Intel control).
> >>>>
> >>>> Thanks for the explanations. I also did my homework as my FFT
> >>>> knowledge was buried quite deep in my memory. :)
> >>>>
> >>>> So this is a vendor-specific way to express generic paramaters.
> >>>
> >>> Unsure this is that vendor specific. At least the interface allows
> >>> to know a
> >> hash of the table being loaded (which is just pointwise data really,
> >> non- proprietary format). I did not state the content is a simple
> >> md5sum of the bin file being loaded from linux.
> >>
> >> Ok, I think it would be better to provide an API to get the table
> >> directly, and have the format being described in the documentation.
> >>
> >> With that, we can also provide the hash as you'd like, but the method
> >> to calculate the hash should also be provided. Or the application can
> >> perform the hash itself if it needs it.
> >>
> >> The fact that it is several KB is not an issue, as this information
> >> would only be queried once at init time if really needed.
> >>
> >> An non-DPDK alternative could be to pass such information to the pod
> >> via the device plugin (as a mounted file for instance, or variable).
> >>
> >>>> Regarding VRB device, is this table per device or per VF?
> >>>> Could it be configured by the application directly, or has it to be
> >>>> done through the PF?
> >>>
> >>> This is configured for the device at platform level, ie. through operator.
> >> Common to all application/devices. This captures the windows shape
> >> assumptions.
> >>
> >> Thanks for the information!
> >>
> >>>>
> >>>>> As an illustration for VRB device pf_bb_config provides to user an
> >>>>> option to
> >>>> include such windowing data as an input ("FFT LUT bin file"), but
> >>>> more generally at platform level for any bb device this big Look-Up
> >>>> Table or big array can be configured on the host during platform
> >>>> initialization for a given deployment or vendor.
> >>>>> What is required here is for the user application to have
> >>>>> knowledge of what
> >>>> version of such array is being used on the given platform, as this
> >>>> information would be relevant to processing done outside of bbdev
> >>>> (notably for noise estimate). Through that mechanism, the user can
> >>>> now map through that API which possible file was being used, and
> >>>> act
> >> accordingly.
> >>>>> The content itself is not specified, for VRB we just use the
> >>>>> md5sum of that
> >>>> binary file (which is just a big array of int16 for point wise
> >>>> multiplication) so that this can be used to share knowledge between
> >>>> initialized platform configuration and at run-time user application
> >> assumption.
> >>>>> It is also important to under that the user/vendor may use any
> >>>>> array or
> >>>> shape (based on their algorithm) regardless of Intel or IP, and
> >>>> still be able to share information mapping between what is
> >>>> configured on the platform (multiple versions possible) and what
> >>>> the application
> >> enumerates.
> >>>>>
> >>>>> I can add more details in the documentation indeed but above
> >>>>> should
> >>>> arguably make sense. The name FFT_version naming may be quite
> >>>> vague, this is more related to the FFT pointwise windowing array
> >>>> variant assumed on the platform. I did not want to impose for it to
> >>>> be an md5sum necessarily, hence the vagueness, as it could be any
> >>>> hash shared between the device programming and the user application
> >>>> related to the semi-static FFT processing programming.
> >>>>>
> >>>>> Let me know if unclear or if any other thought,
> >>>>
> >>>> I think this is clear now to me.
> >>>>
> >>>> In my opinion, this is not good to have this part of the BBDEV API,
> >>>> as every vendor will have their own way to represent this.
> >>>>
> >>>> Other alternative is to have a vendor specific API. This is far
> >>>> from ideal and should be avoided as much as possible, but in this
> >>>> case the application has to know anyways which device it is
> >>>> driving. It would be at least clear the field has to be interpreted in a
> vendor-specific way.
> >>>>
> >>>> @Hemant, I would be interested in your opinion. (I don't know if
> >>>> NXP has or plans to have FFT accelerator IP)
> >>>
> >>> Yes looking forward to it.
> >>
> >> Thanks,
> >> Maxime
> >>
> >>>
> >>>>
> >>>> Regards,
> >>>> Maxime
> >>>>
> >>>>> Thanks
> >>>>> Nic
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> >>>>>> Sent: Tuesday, September 19, 2023 2:56 AM
> >>>>>> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org
> >>>>>> Cc: hemant.agrawal@nxp.com; david.marchand@redhat.com; Vargas,
> >>>> Hernan
> >>>>>> <hernan.vargas@intel.com>
> >>>>>> Subject: Re: [PATCH v1 1/7] bbdev: add FFT version member in
> >>>>>> driver info
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On 9/19/23 03:21, Nicolas Chautru wrote:
> >>>>>>> This can be used to distinguish different versions of the
> >>>>>>> flexible pointwise windowing applied to the FFT and expose this
> >>>>>>> to the application.
> >>>>>>
> >>>>>> Does this version relate to a standard, or is it specific to
> >>>>>> the implementation of your VRB devices?
> >>>>>>
> >>>>>>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
> >>>>>>> ---
> >>>>>>>      lib/bbdev/rte_bbdev.h | 2 ++
> >>>>>>>      1 file changed, 2 insertions(+)
> >>>>>>>
> >>>>>>> diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
> >>>>>>> index a5bcc09f10..d6e54ee9a4 100644
> >>>>>>> --- a/lib/bbdev/rte_bbdev.h
> >>>>>>> +++ b/lib/bbdev/rte_bbdev.h
> >>>>>>> @@ -349,6 +349,8 @@ struct rte_bbdev_driver_info {
> >>>>>>>      	const struct rte_bbdev_op_cap *capabilities;
> >>>>>>>      	/** Device cpu_flag requirements */
> >>>>>>>      	const enum rte_cpu_flag_t *cpu_flag_reqs;
> >>>>>>> +	/** Versioning number for the FFT operation type. */
> >>>>>>> +	uint16_t fft_version;
> >>>>>>>      };
> >>>>>>>
> >>>>>>>      /** Macro used at end of bbdev PMD list */
> >>>>>
> >>>
> >



* Re: [PATCH v1 1/7] bbdev: add FFT version member in driver info
  2023-09-28 16:33                 ` Chautru, Nicolas
@ 2023-10-03 11:46                   ` Maxime Coquelin
  0 siblings, 0 replies; 24+ messages in thread
From: Maxime Coquelin @ 2023-10-03 11:46 UTC (permalink / raw)
  To: Chautru, Nicolas, hemant.agrawal, dev; +Cc: david.marchand, Vargas, Hernan



On 9/28/23 18:33, Chautru, Nicolas wrote:
> Hi Maxime,
> 
> 
>> -----Original Message-----
>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>> Sent: Thursday, September 28, 2023 1:27 AM
>> To: Chautru, Nicolas <nicolas.chautru@intel.com>; hemant.agrawal@nxp.com;
>> dev@dpdk.org
>> Cc: david.marchand@redhat.com; Vargas, Hernan <hernan.vargas@intel.com>
>> Subject: Re: [PATCH v1 1/7] bbdev: add FFT version member in driver info
>>
>> Hi Nicolas,
>>
>> On 9/28/23 01:50, Chautru, Nicolas wrote:
>>> Hi Maxime, Hemant,
>>>
>>> I initially wanted to keep it fairly open, hence a hash for the
>>> window profiles, but it is also possible to expose something more
>>> descriptive; that would actually work as well.
>>> I.e.
>>>
>>> +	/** FFT windowing width for 2048 FFT. */
>>> +	uint16_t fft_window_width[RTE_BBDEV_MAX_FFT_WIN];
>>>
>>> This provides the width of each window shape, which is enough to
>>> distinguish major variants and to estimate the noise factor.
>>
>> That sounds much better IMHO.
>>
>> Regarding the array and values sizes:
>> 1. Should it only cover the 2048-point FFT? I see some references to
>>    a 4096-point FFT for 5G and satellite communications.
>> 2. Is uint16_t enough for all the use cases?
> 
> That is a misunderstanding, probably because I did not include the
> definition and value of RTE_BBDEV_MAX_FFT_WIN in the snippet above.
> The dimension of the array is purely the number of windows to choose
> from, i.e. 16.

Ok, where is the number of windows made available?

> That is NOT an array matching the size of the FFT. The width value is
> given for a reference 2048-point FFT; the actual width is scaled down
> when a smaller FFT is configured and scaled up for a bigger one. So
> this does not make any assumption about the maximum FFT size; it only
> describes a given portion using the 2048-point FFT as the reference
> resolution.

Ok, if you could document this with example that would be great.
Maybe you have some existing links explaining that?
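
For illustration only, the scaling described in the quoted paragraph
could look like the sketch below. The helper name is hypothetical, and
linear scaling with integer truncation is an assumption; the thread does
not specify the exact rule or rounding. The only facts taken from the
thread are the 2048-point reference and the 1024 upper bound on the
reference width.

```c
#include <assert.h>
#include <stdint.h>

/* Reference resolution used in the thread for the window widths. */
#define REF_FFT_SIZE 2048u

/*
 * Hypothetical helper (not part of bbdev): convert a window width given
 * at the 2048-point reference into the width for the configured FFT
 * size. Assumes plain linear scaling, truncated to an integer.
 */
static uint32_t
scale_window_width(uint16_t ref_width, uint32_t fft_size)
{
	/* The thread states the reference width cannot exceed 1024. */
	assert(ref_width <= REF_FFT_SIZE / 2);

	return ((uint32_t)ref_width * fft_size) / REF_FFT_SIZE;
}
```

Under that assumption, a reference width of 512 would map to 256 for a
1024-point FFT and to 1024 for a 4096-point FFT.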

> uint16_t is more than enough; the width cannot be more than 1024
> given the 2048-point reference above.
> 
>>
>> Since this array is quite big, could it be exposed to the application
>> via dedicated APIs instead of a field? One API to query the length of
>> the array so that the application can allocate the required memory,
>> and one API to copy the data into the allocated memory?
>>
>> Maybe overkill, but I feel different FFT sizes could be supported in
>> the future, so that would be both future-proof and more memory
>> efficient for apps that don't need this.
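
A minimal sketch of the query-then-copy pattern suggested in the quoted
text, for illustration only. Both function names are hypothetical and do
not exist in bbdev, and the width table is mock data standing in for
whatever a real driver would hold; only the count of 16 windows comes
from the thread.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The thread mentions 16 selectable windows. */
#define MOCK_NUM_WINDOWS 16u

/* Mock values only; a real driver would report its configured widths. */
static const uint16_t mock_widths[MOCK_NUM_WINDOWS] = {
	1024, 768, 512, 384, 256, 192, 128, 96,
	64, 48, 32, 24, 16, 12, 8, 4
};

/* Hypothetical: number of window widths the device exposes. */
static unsigned int
fft_win_width_count(void)
{
	return MOCK_NUM_WINDOWS;
}

/* Hypothetical: copy up to len widths into dst, return how many. */
static unsigned int
fft_win_width_copy(uint16_t *dst, unsigned int len)
{
	unsigned int n = len < MOCK_NUM_WINDOWS ? len : MOCK_NUM_WINDOWS;

	assert(dst != NULL || n == 0);
	if (n > 0)
		memcpy(dst, mock_widths, n * sizeof(dst[0]));
	return n;
}
```

An application would call the count function once at init time, allocate
a buffer of that many uint16_t entries, and then call the copy function;
applications that do not need the widths never pay for the storage.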
> 
> See my note above; let me know if it is unclear.

It is not clear to me whether this representation is generic or specific
to your device.

Thanks,
Maxime

> 
>>
>>> Let me know your opinion.
>>
>> Thanks for suggesting this,
>> Maxime
>>
>>> Thanks
>>> Nic
>>>
>>>> -----Original Message-----
>>>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>> Sent: Tuesday, September 26, 2023 3:00 AM
>>>> To: Chautru, Nicolas <nicolas.chautru@intel.com>;
>>>> hemant.agrawal@nxp.com; dev@dpdk.org
>>>> Cc: david.marchand@redhat.com; Vargas, Hernan
>>>> <hernan.vargas@intel.com>
>>>> Subject: Re: [PATCH v1 1/7] bbdev: add FFT version member in driver
>>>> info
>>>>
>>>>
>>>>
>>>> On 9/22/23 18:41, Chautru, Nicolas wrote:
>>>>> Hi Maxime,
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>>>> Sent: Friday, September 22, 2023 1:15 AM
>>>>>> To: Chautru, Nicolas <nicolas.chautru@intel.com>;
>>>>>> hemant.agrawal@nxp.com; dev@dpdk.org
>>>>>> Cc: david.marchand@redhat.com; Vargas, Hernan
>>>>>> <hernan.vargas@intel.com>
>>>>>> Subject: Re: [PATCH v1 1/7] bbdev: add FFT version member in driver
>>>>>> info
>>>>>>
>>>>>> Hi Nicolas,
>>>>>>
>>>>>> On 9/19/23 22:51, Chautru, Nicolas wrote:
>>>>>>> Hi Maxime,
>>>>>>>
>>>>>>> This is neither part of 3GPP per se, nor specific to the VRB
>>>>>>> device. Let me provide more context.
>>>>>>> The SRS processing chain
>>>>>>> (https://doc.dpdk.org/guides/prog_guide/bbdev.html#bbdev-fft-operation)
>>>>>>> includes a pointwise multiplication by a time window.
>>>>>>> The generic API includes some control of these windowing
>>>>>>> functions, but the actual shape still needs to be programmed onto
>>>>>>> any device (i.e. rectangular, tapered, sinc, different width or
>>>>>>> offset, or any arbitrary shape defined as an array of scalars).
>>>>>>> These degrees of freedom cannot be exposed through a generic API
>>>>>>> (the information is multi-kB, i.e. the data itself) and can be
>>>>>>> user specific (external to the HW IP itself or outside of Intel
>>>>>>> control).
>>>>>>
>>>>>> Thanks for the explanations. I also did my homework as my FFT
>>>>>> knowledge was buried quite deep in my memory. :)
>>>>>>
>>>>>> So this is a vendor-specific way to express generic parameters.
>>>>>
>>>>> Unsure this is that vendor specific. At least the interface makes
>>>>> it possible to know a hash of the table being loaded (which is
>>>>> really just pointwise data, in a non-proprietary format). I did
>>>>> not state the content is a simple md5sum of the bin file being
>>>>> loaded from Linux.
>>>>
>>>> Ok, I think it would be better to provide an API to get the table
>>>> directly, and have the format described in the documentation.
>>>>
>>>> With that, we can also provide the hash as you'd like, but the method
>>>> to calculate the hash should also be provided. Or the application can
>>>> perform the hash itself if it needs it.
>>>>
>>>> The fact that it is several KB is not an issue, as this information
>>>> would only be queried once at init time if really needed.
>>>>
>>>> A non-DPDK alternative could be to pass such information to the pod
>>>> via the device plugin (as a mounted file for instance, or variable).
>>>>
>>>>>> Regarding VRB device, is this table per device or per VF?
>>>>>> Could it be configured by the application directly, or has it to be
>>>>>> done through the PF?
>>>>>
>>>>> This is configured for the device at platform level, i.e. by the
>>>>> operator. It is common to all applications/devices and captures
>>>>> the window shape assumptions.
>>>>
>>>> Thanks for the information!
>>>>
>>>>>>
>>>>>>> As an illustration, for the VRB device pf_bb_config gives the
>>>>>>> user an option to include such windowing data as an input ("FFT
>>>>>>> LUT bin file"), but more generally, at platform level for any bb
>>>>>>> device, this big look-up table or big array can be configured on
>>>>>>> the host during platform initialization for a given deployment
>>>>>>> or vendor.
>>>>>>> What is required here is for the user application to know which
>>>>>>> version of such an array is being used on the given platform, as
>>>>>>> this information is relevant to processing done outside of bbdev
>>>>>>> (notably for noise estimation). Through that mechanism, the user
>>>>>>> can map through that API which possible file was used, and act
>>>>>>> accordingly.
>>>>>>> The content itself is not specified; for VRB we just use the
>>>>>>> md5sum of that binary file (which is just a big array of int16
>>>>>>> for pointwise multiplication) so that this can be used to share
>>>>>>> knowledge between the initialized platform configuration and the
>>>>>>> run-time user application assumptions.
>>>>>>> It is also important to understand that the user/vendor may use
>>>>>>> any array or shape (based on their algorithm) regardless of
>>>>>>> Intel or the IP, and still be able to share the mapping between
>>>>>>> what is configured on the platform (multiple versions possible)
>>>>>>> and what the application enumerates.
>>>>>>>
>>>>>>> I can indeed add more details in the documentation, but the
>>>>>>> above should arguably make sense. The FFT_version name may be
>>>>>>> quite vague; it relates to the FFT pointwise windowing array
>>>>>>> variant assumed on the platform. I did not want to require it to
>>>>>>> be an md5sum, hence the vagueness: it could be any hash shared
>>>>>>> between the device programming and the user application, related
>>>>>>> to the semi-static FFT processing programming.
>>>>>>>
>>>>>>> Let me know if unclear or if any other thought,
>>>>>>
>>>>>> I think this is clear now to me.
>>>>>>
>>>>>> In my opinion, it is not good to have this as part of the BBDEV
>>>>>> API, as every vendor will have their own way to represent this.
>>>>>>
>>>>>> Another alternative is to have a vendor-specific API. This is far
>>>>>> from ideal and should be avoided as much as possible, but in this
>>>>>> case the application has to know anyway which device it is
>>>>>> driving. It would at least be clear that the field has to be
>>>>>> interpreted in a vendor-specific way.
>>>>>>
>>>>>> @Hemant, I would be interested in your opinion. (I don't know if
>>>>>> NXP has or plans to have FFT accelerator IP)
>>>>>
>>>>> Yes looking forward to it.
>>>>
>>>> Thanks,
>>>> Maxime
>>>>
>>>>>
>>>>>>
>>>>>> Regards,
>>>>>> Maxime
>>>>>>
>>>>>>> Thanks
>>>>>>> Nic
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>>>>>> Sent: Tuesday, September 19, 2023 2:56 AM
>>>>>>>> To: Chautru, Nicolas <nicolas.chautru@intel.com>; dev@dpdk.org
>>>>>>>> Cc: hemant.agrawal@nxp.com; david.marchand@redhat.com; Vargas,
>>>>>>>> Hernan <hernan.vargas@intel.com>
>>>>>>>> Subject: Re: [PATCH v1 1/7] bbdev: add FFT version member in
>>>>>>>> driver info
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 9/19/23 03:21, Nicolas Chautru wrote:
>>>>>>>>> This can be used to distinguish different versions of the
>>>>>>>>> flexible pointwise windowing applied to the FFT and expose this
>>>>>>>>> to the application.
>>>>>>>>
>>>>>>>> Does this version relate to a standard, or is it specific to
>>>>>>>> the implementation of your VRB devices?
>>>>>>>>
>>>>>>>>> Signed-off-by: Nicolas Chautru <nicolas.chautru@intel.com>
>>>>>>>>> ---
>>>>>>>>>       lib/bbdev/rte_bbdev.h | 2 ++
>>>>>>>>>       1 file changed, 2 insertions(+)
>>>>>>>>>
>>>>>>>>> diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
>>>>>>>>> index a5bcc09f10..d6e54ee9a4 100644
>>>>>>>>> --- a/lib/bbdev/rte_bbdev.h
>>>>>>>>> +++ b/lib/bbdev/rte_bbdev.h
>>>>>>>>> @@ -349,6 +349,8 @@ struct rte_bbdev_driver_info {
>>>>>>>>>       	const struct rte_bbdev_op_cap *capabilities;
>>>>>>>>>       	/** Device cpu_flag requirements */
>>>>>>>>>       	const enum rte_cpu_flag_t *cpu_flag_reqs;
>>>>>>>>> +	/** Versioning number for the FFT operation type. */
>>>>>>>>> +	uint16_t fft_version;
>>>>>>>>>       };
>>>>>>>>>
>>>>>>>>>       /** Macro used at end of bbdev PMD list */
>>>>>>>
>>>>>
>>>
> 



end of thread, other threads:[~2023-10-03 11:46 UTC | newest]

Thread overview: 24+ messages
2023-09-19  1:21 [PATCH v1 0/7] VRB2 BBDEV PMD introduction Nicolas Chautru
2023-09-19  1:21 ` [PATCH v1 1/7] bbdev: add FFT version member in driver info Nicolas Chautru
2023-09-19  9:55   ` Maxime Coquelin
2023-09-19 20:51     ` Chautru, Nicolas
2023-09-22  8:14       ` Maxime Coquelin
2023-09-22 16:41         ` Chautru, Nicolas
2023-09-26 10:00           ` Maxime Coquelin
2023-09-27 23:50             ` Chautru, Nicolas
2023-09-28  8:27               ` Maxime Coquelin
2023-09-28 16:33                 ` Chautru, Nicolas
2023-10-03 11:46                   ` Maxime Coquelin
2023-09-19  1:21 ` [PATCH v1 2/7] baseband/acc: add FFT version in the VRM PMD Nicolas Chautru
2023-09-19  1:21 ` [PATCH v1 3/7] baseband/acc: remove the 4G SO capability for VRB1 Nicolas Chautru
2023-09-19 15:20   ` David Marchand
2023-09-19 20:32     ` Chautru, Nicolas
2023-09-21  7:13       ` David Marchand
2023-09-21 17:18         ` Chautru, Nicolas
2023-09-27  7:08           ` Maxime Coquelin
2023-09-27  7:32             ` Maxime Coquelin
2023-09-19  1:21 ` [PATCH v1 4/7] baseband/acc: allocate FCW memory separately Nicolas Chautru
2023-09-19  1:21 ` [PATCH v1 5/7] baseband/acc: add support for MLD operation Nicolas Chautru
2023-09-19  1:21 ` [PATCH v1 6/7] baseband/acc: introduce the new VRB2 variant Nicolas Chautru
2023-09-19  1:21 ` [PATCH v1 7/7] baseband/acc: add configure helper for VRB2 Nicolas Chautru
2023-09-21  7:25 ` [PATCH v1 0/7] VRB2 BBDEV PMD introduction David Marchand
